Forecasting Corporate Default Risk Across Multiple Horizons With Interpretable Machine Learning
Qingli Dong & Li Li
Abstract
Accurate and transparent default prediction is central to credit‐risk operations and policy. This paper proposes ELG, an interpretable two‐stage machine‐learning framework for decision support in corporate default prediction. Stage I uses an explainable, sparsity‐controlled selector (E‐LassoNet) to produce a compact, auditable feature set; Stage II fits a gradient‐boosting model (GBDT) on the retained drivers to capture residual nonlinearity and output multi‐horizon default probabilities. Using a large firm‐year panel of Chinese listed companies and evaluating six forecast horizons, ELG achieves AUC values of 0.98–0.99 and recall values of 0.94–0.99 with only 27–28 predictors, and it outperforms standard classifiers on AUC and G_mean at most horizons. Post hoc analysis shows that financial signals dominate at short horizons, while governance and other non‐financial indicators grow in importance at longer horizons. By enforcing interpretability ex ante and evaluating with decision‐aligned metrics, ELG links transparent feature selection to actionable, horizon‐aware forecasts, improving deployability and governance for credit‐risk management in emerging markets.
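The abstract reports recall and G_mean alongside AUC but does not define them. As a minimal sketch, assuming the standard definition of G_mean as the geometric mean of recall (sensitivity) and specificity, the two metrics can be computed from binary labels as below. The function name `recall_and_g_mean` and the toy labels are hypothetical illustrations, not the paper's code or data.

```python
import math


def recall_and_g_mean(y_true, y_pred):
    """Return (recall, G-mean) for binary labels, where 1 marks a default.

    Assumes the standard definition: G-mean = sqrt(recall * specificity).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # defaults caught
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # defaults missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # healthy firms cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms

    # Guard against empty classes so the function never divides by zero.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return recall, math.sqrt(recall * specificity)


# Hypothetical toy sample: 2 defaulting firms among 10.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]

recall, g_mean = recall_and_g_mean(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"accuracy={accuracy:.2f} recall={recall:.2f} g_mean={g_mean:.3f}")
```

In this toy sample, accuracy looks acceptable even though half of the defaults are missed, while recall and G_mean expose the gap. That is one plausible reason such metrics suit imbalanced default data, where defaulting firms are a small minority.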