Warranty Provisions: Machine-Learning Versus Human Estimates

Martin Becker & Simon Schölzel

European Accounting Review · 2025 · article
https://doi.org/10.1080/09638180.2024.2444521

AJG 3 · ABDC A*

Abstract

This study employs machine learning to shed light on the accuracy of discretionary accounting estimates and the causes of human estimation errors. Using proprietary data from a large European manufacturing firm, we implement a set of prediction models to gauge a pervasive and economically relevant accounting estimate: the warranty provision. We find that machine learning models consistently outperform human experts when compared on the basis of individual warranty obligations. This gap widens when estimates are aggregated across homogeneous classes of products, as the machine makes relatively fewer and less severe overstatements. Applying model interpretability techniques and conducting a series of semi-structured interviews, we identify misspecifications of the managerial estimation model, specifically aggregation bias and anchoring to historical cost, as the primary causes of the larger human errors. Moreover, the interview evidence suggests that various firm-level factors, such as learning frictions, auditors’ preferences for process continuity, and strategic considerations, are important determinants of the design and continued use of misspecified estimation models in practice.

5 citations


Cite this paper

https://doi.org/10.1080/09638180.2024.2444521

Or copy a formatted citation

@article{becker2025warranty,
  title   = {{Warranty Provisions: Machine-Learning Versus Human Estimates}},
  author  = {Becker, Martin and Sch{\"o}lzel, Simon},
  journal = {European Accounting Review},
  year    = {2025},
  doi     = {10.1080/09638180.2024.2444521},
}

Paste directly into BibTeX, Zotero, or your reference manager.



Evidence weight

0.48

Balanced mode · F 0.40 / M 0.15 / V 0.05 / R 0.40

F · citation impact    0.41 × 0.40 = 0.16
M · momentum           0.63 × 0.15 = 0.09
V · venue signal       0.50 × 0.05 = 0.03
R · text relevance †   0.50 × 0.40 = 0.20

† Text relevance is estimated at 0.50 on the detail page — for your query’s actual relevance score, open this paper from a search result.
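The breakdown above is a straightforward weighted sum of the four component scores. A minimal sketch of that arithmetic follows; the function name, dictionary layout, and per-term rounding are illustrative assumptions, not Arbiter's actual implementation.

```python
# Component scores and Balanced-mode weights, copied from the breakdown above.
# F = citation impact, M = momentum, V = venue signal, R = text relevance.
SCORES = {"F": 0.41, "M": 0.63, "V": 0.50, "R": 0.50}
WEIGHTS = {"F": 0.40, "M": 0.15, "V": 0.05, "R": 0.40}

def evidence_weight(scores, weights):
    # Round each contribution to two decimals, matching the per-line
    # figures shown (e.g. F: 0.41 x 0.40 = 0.16), then sum and round.
    # This rounding scheme is an assumption about how the display works.
    return round(sum(round(scores[k] * weights[k], 2) for k in scores), 2)

print(evidence_weight(SCORES, WEIGHTS))  # 0.48
```

Note that the rounding order barely matters here: summing the unrounded products gives 0.4835, which also rounds to the displayed 0.48.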