Model Averaging of Partially Linear Multinomial Logit Model for Multi‐Categorical Data
Jialei Liu & Jing Lv
Abstract
Model averaging for categorical response variables has attracted considerable attention in recent years. To further improve prediction accuracy, we propose a partially linear multinomial logit model averaging (PLMLMA) technique. Each candidate model takes one continuous covariate in turn as the index variable of the non‐parametric function, which avoids both the subjective selection of index variables and the curse of dimensionality in estimation. The model averaging weights are chosen by minimising the Kullback‐Leibler (KL) loss. We establish asymptotic optimality by showing that the KL loss between the true model and the model whose log‐odds ratio is estimated by the ‘working’ log‐odds ratio is asymptotically equivalent to that of the best but infeasible model averaging estimator. Furthermore, we derive the convergence rate of the weight estimator without assuming that the true model is among the candidate models. In simulations, the proposed method attains lower mean squared error (MSE) and a higher hit rate (HR) than various competitors. We also apply the method to wheat variety classification to illustrate the merits of PLMLMA.
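To make the weighting step concrete, the following is a minimal sketch, not the authors' implementation. It assumes the candidate models have already been fitted and that each one supplies an array of estimated log-odds (reference category fixed at zero). The averaged log-odds is a convex combination of the candidates, and the weights are chosen on the simplex to minimise the empirical multinomial negative log-likelihood, which stands in here for the KL-type criterion; the exact criterion used in the paper is not given in the abstract. The names `fit_weights`, `average_log_odds`, `neg_log_lik` and `hit_rate` are hypothetical, and the exponentiated-gradient solver is a choice made for this illustration.

```python
import numpy as np


def softmax(z):
    """Row-wise softmax, shifted for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def average_log_odds(log_odds, w):
    """Combine candidate log-odds of shape (M, n, K) with weights w of shape (M,)."""
    return np.tensordot(w, log_odds, axes=1)  # result has shape (n, K)


def neg_log_lik(log_odds, y, w):
    """Mean multinomial negative log-likelihood of the weighted model."""
    p = softmax(average_log_odds(log_odds, w))
    return -np.mean(np.log(p[np.arange(len(y)), y]))


def fit_weights(log_odds, y, n_iter=500, lr=0.5):
    """Exponentiated-gradient descent on the simplex (an assumed solver).

    log_odds : (M, n, K) array of candidate log-odds estimates
    y        : (n,) integer class labels in {0, ..., K-1}
    """
    M, n, K = log_odds.shape
    onehot = np.eye(K)[y]
    w = np.full(M, 1.0 / M)  # start from equal weights
    for _ in range(n_iter):
        p = softmax(average_log_odds(log_odds, w))
        # d(loss)/d(w_m) = mean_i sum_k (p_ik - y_ik) * log_odds[m, i, k]
        grad = np.einsum("mnk,nk->m", log_odds, p - onehot) / n
        w = w * np.exp(-lr * grad)  # multiplicative update keeps w >= 0
        w /= w.sum()                # renormalise so the weights sum to one
    return w


def hit_rate(log_odds, y, w):
    """Share of observations whose most probable class equals the label."""
    p = softmax(average_log_odds(log_odds, w))
    return np.mean(p.argmax(axis=1) == y)
```

Because the averaged log-odds is linear in the weights and the multinomial negative log-likelihood is convex in the log-odds, the objective is convex in the weights, so a simple first-order method on the simplex is adequate for a sketch of this kind. In practice the fit would be evaluated on held-out data rather than on the sample used to choose the weights.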