Utilizing large language models (LLMs) for quantitative reasoning-intensive tasks within the (re)insurance sector

Yilin Hao et al.

Annals of Actuarial Science · 2025 · https://doi.org/10.1017/s1748499525100079 · article
AJG 1 · ABDC A
Weight
0.50

Abstract

The rise of large language models (LLMs) has marked a substantial leap toward artificial general intelligence. However, applying LLMs in the (re)insurance sector remains challenging because of the gap between their general capabilities and domain-specific requirements. Two prevalent methods for domain specialization of LLMs are prompt engineering and fine-tuning. In this study, we evaluate the efficacy of LLMs, enhanced with prompt engineering and fine-tuning techniques, on quantitative reasoning tasks within the (re)insurance domain. We find that (1) compared to prompt engineering, fine-tuning with a task-specific calculation dataset provides a remarkable leap in performance, even exceeding the performance of larger pre-trained LLMs; (2) when the available task-specific calculation data are limited, supplementing LLMs with a domain-specific knowledge dataset is an effective alternative; and (3) enhanced reasoning capabilities, rather than mere computational skills, should be the primary focus for LLMs when tackling quantitative tasks. Moreover, the fine-tuned models retain a consistent aptitude for common-sense reasoning and factual knowledge, as evidenced by their performance on public benchmarks. Overall, this study demonstrates the potential of LLMs to serve as AI assistants and solve quantitative reasoning tasks in the (re)insurance sector.

Cite this paper

https://doi.org/10.1017/s1748499525100079

Or copy a formatted citation

@article{yilin2025,
  title        = {{Utilizing large language models (LLMs) for quantitative reasoning-intensive tasks within the (re)insurance sector}},
  author       = {Hao, Yilin and others},
  journal      = {Annals of Actuarial Science},
  year         = {2025},
  doi          = {10.1017/s1748499525100079},
}

Evidence weight

0.50

Balanced mode · F 0.40 / M 0.15 / V 0.05 / R 0.40

F · citation impact: 0.50 × 0.40 = 0.20
M · momentum: 0.50 × 0.15 = 0.075
V · venue signal: 0.50 × 0.05 = 0.025
R · text relevance †: 0.50 × 0.40 = 0.20

† Text relevance is estimated at 0.50 on the detail page — for your query’s actual relevance score, open this paper from a search result.
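
The breakdown above reads as a weighted sum of four component scores. The page does not state the formula explicitly, so the following is only a minimal sketch under that assumption; the names `WEIGHTS`, `SCORES`, and `evidence_weight` are hypothetical and not taken from the site.

```python
# Balanced-mode weights and component scores as displayed on this page.
WEIGHTS = {"F": 0.40, "M": 0.15, "V": 0.05, "R": 0.40}
SCORES = {"F": 0.50, "M": 0.50, "V": 0.50, "R": 0.50}


def evidence_weight(scores, weights):
    """Return the weighted sum of component scores.

    Assumption: the overall evidence weight is the sum of
    score * weight over the four components (F, M, V, R).
    """
    return sum(scores[key] * weights[key] for key in weights)


# Per-component contributions, matching the rows of the breakdown.
contributions = {key: SCORES[key] * WEIGHTS[key] for key in WEIGHTS}
total = evidence_weight(SCORES, WEIGHTS)

for key, value in contributions.items():
    print(f"{key}: {value:.3f}")
print(f"total: {total:.2f}")
```

Under this reading, the four contributions add up to the displayed evidence weight of 0.50, and because the weights sum to 1, a paper whose components all score 0.50 will always total 0.50 regardless of mode.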