The determinants of health expenditure: a machine learning approach

Nicola Caravaggio et al.

Empirical Economics · 2026 · https://doi.org/10.1007/s00181-025-02854-6 · article

AJG 2 · ABDC A

Weight

0.50

What the paper says

Accurate prediction of healthcare costs is essential for making decisions, shaping policies, preparing finances, and managing resources effectively, but traditional econometric models fall short in addressing this policy challenge adequately. This paper uses machine learning (ML) to predict healthcare expenditure in systems with heterogeneous regional needs. The Italian NHS is used as a case study, with administrative data spanning the years 1996 to 2019. The empirical analysis implements four ML algorithms (Elastic-Net, Gradient Boosting, Random Forest, and Support Vector Regression) and a multivariate regression as a baseline. Gradient Boosting emerges as the superior algorithm in out-of-sample prediction performance; even when applied to 2019 data, the models trained up to 2018 demonstrate robust forecasting abilities. Important predictors of expenditure include temporal factors and technological progress, average family size and share of public expenditure over the total, regional area, population and share of foreign residents, GDP per capita and labour activity, and share of elderly population (75 years old and over). The remarkable effectiveness of the model demonstrates that ML can be efficiently employed to predict and then distribute national healthcare funds to areas with heterogeneous needs.
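The evaluation design described above (four learners plus a linear baseline, trained on years up to 2018 and scored on 2019) can be sketched roughly as follows. This is only an illustration under stated assumptions: the data are synthetic, the five predictors are placeholders, and the hyperparameters are arbitrary choices rather than the paper's settings. It uses scikit-learn and makes no claim about which model wins.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import ElasticNet, LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(0)

# Synthetic region-year panel (NOT the paper's data): 20 regions, 1996-2019.
years = np.repeat(np.arange(1996, 2020), 20)
X = rng.normal(size=(years.size, 5))  # five placeholder predictors
trend = 0.05 * (years - 1996)         # stand-in for temporal effects
noise = rng.normal(scale=0.2, size=years.size)
y = 2.0 + trend + X[:, 0] - 0.5 * X[:, 1] ** 2 + noise
features = np.column_stack([years, X])

# Temporal hold-out: fit on 1996-2018, evaluate on 2019 only.
train, test = years <= 2018, years == 2019

# Hyperparameters below are illustrative assumptions, not tuned values.
models = {
    "linear_baseline": make_pipeline(StandardScaler(), LinearRegression()),
    "elastic_net": make_pipeline(StandardScaler(), ElasticNet(alpha=0.01)),
    "gradient_boosting": GradientBoostingRegressor(random_state=0),
    "random_forest": RandomForestRegressor(n_estimators=200, random_state=0),
    "svr": make_pipeline(StandardScaler(), SVR(C=10.0)),
}

rmse = {}
for name, model in models.items():
    model.fit(features[train], y[train])
    pred = model.predict(features[test])
    rmse[name] = float(np.sqrt(mean_squared_error(y[test], pred)))

# Report models from lowest to highest hold-out error.
for name, score in sorted(rmse.items(), key=lambda kv: kv[1]):
    print(f"{name:18s} RMSE on 2019 hold-out: {score:.3f}")
```

Splitting by year rather than at random keeps future observations out of the training set, which is the point of a forecasting check like the 2019 one mentioned in the abstract.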

Evidence weight

0.50

Balanced mode · F 0.40 / M 0.15 / V 0.05 / R 0.40

F · citation impact	0.50 × 0.40 = 0.20
M · momentum	0.50 × 0.15 = 0.075
V · venue signal	0.50 × 0.05 = 0.025
R · text relevance †	0.50 × 0.40 = 0.20

† Text relevance is estimated at 0.50 on the detail page — for your query’s actual relevance score, open this paper from a search result.
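
The breakdown above reads as a weighted sum of four component scores. A minimal sketch of that arithmetic follows, assuming the evidence weight is exactly the sum of the four rows. The page does not state this outright, although the rows do add up to 0.50. The function name `evidence_weight` and the dictionary layout are hypothetical.

```python
# Component weights for "Balanced mode", as listed on the page.
BALANCED = {"F": 0.40, "M": 0.15, "V": 0.05, "R": 0.40}


def evidence_weight(scores, weights=BALANCED):
    """Weighted sum of component scores (assumed aggregation rule)."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(weights[k] * scores[k] for k in weights)


# Component scores shown for this paper: all four are 0.50.
scores = {"F": 0.50, "M": 0.50, "V": 0.50, "R": 0.50}

# Per-component contributions, i.e. the right-hand side of each row.
contributions = {k: BALANCED[k] * scores[k] for k in BALANCED}
total = evidence_weight(scores)

print(contributions)
print(round(total, 2))
```

Because the weights sum to 1, a paper whose four component scores are all equal gets that same value as its overall weight, which is why every component at 0.50 yields 0.50 here.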