Textual analysis of insurance claims with large language models

Dongchen Li et al.

Journal of Risk and Insurance · 2025 · https://doi.org/10.1111/jori.70004 · article
AJG 3 · ABDC A
Weight
0.48

Abstract

This study proposes a comprehensive and general framework for examining discrepancies in textual content using large language models (LLMs), broadening application scenarios in the insurtech and risk management fields, and conducting empirical research based on actual needs and real‐world data. Our framework integrates OpenAI's interface to embed texts and project them into external categories while utilizing distance metrics to evaluate discrepancies. To identify significant disparities, we design prompts to analyze three types of relationships: identical information, logical relationships and potential relationships. Our empirical analysis shows that 22.1% of samples exhibit substantial semantic discrepancies, and 38.1% of the samples with significant differences contain at least one of the identified relationships. The average processing time for each sample does not exceed 4 s, and all processes can be adjusted based on actual needs. Backtesting results and comparisons with traditional NLP methods further demonstrate that our proposed method is both effective and robust.
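The abstract describes embedding texts, projecting them into external categories, and scoring discrepancies with a distance metric. The following is a minimal sketch of that idea, not the authors' implementation. It assumes the embeddings have already been obtained (for example from an embedding API) and are passed in as plain lists of floats. It also assumes that "projection" means cosine similarity against one embedding per category, and that discrepancy is the Euclidean distance between two such projections. The function names and toy vectors are hypothetical.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)


def project_onto_categories(embedding, category_embeddings):
    """Assumption: a projection is the cosine similarity to each category."""
    return [1.0 - cosine_distance(embedding, c) for c in category_embeddings]


def discrepancy(emb_a, emb_b, category_embeddings):
    """Assumption: discrepancy is the Euclidean distance between projections."""
    proj_a = project_onto_categories(emb_a, category_embeddings)
    proj_b = project_onto_categories(emb_b, category_embeddings)
    return math.dist(proj_a, proj_b)


# Toy 3-dimensional "embeddings" (hypothetical; real ones have many more dimensions).
categories = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
claim_text = [1.0, 0.0, 0.0]
report_text = [0.0, 1.0, 0.0]

same = discrepancy(claim_text, claim_text, categories)
different = discrepancy(claim_text, report_text, categories)
```

In this sketch, a pair of texts would be flagged for further prompt-based analysis when its discrepancy exceeds a chosen cutoff. The source does not state such a cutoff, so any threshold used here would have to be tuned on real data.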

5 citations


Cite this paper

https://doi.org/10.1111/jori.70004

Or copy a formatted citation

@article{dongchen2025,
  title        = {{Textual analysis of insurance claims with large language models}},
  author       = {Li, Dongchen and others},
  journal      = {Journal of Risk and Insurance},
  year         = {2025},
  doi          = {10.1111/jori.70004},
}

Paste directly into BibTeX, Zotero, or your reference manager.


Evidence weight

0.48

Balanced mode · F 0.40 / M 0.15 / V 0.05 / R 0.40

F · citation impact: 0.41 × 0.40 = 0.16
M · momentum: 0.63 × 0.15 = 0.09
V · venue signal: 0.50 × 0.05 = 0.03
R · text relevance †: 0.50 × 0.40 = 0.20

† Text relevance is estimated at 0.50 on the detail page — for your query’s actual relevance score, open this paper from a search result.
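
The breakdown above is consistent with a simple weighted sum of the four component scores. The sketch below reproduces that arithmetic. The page does not state the formula explicitly, so the weighted-sum form is inferred from the displayed numbers, and the names `WEIGHTS`, `SCORES`, and `evidence_weight` are hypothetical.

```python
# Balanced-mode weights and component scores as displayed on the page.
WEIGHTS = {"F": 0.40, "M": 0.15, "V": 0.05, "R": 0.40}
SCORES = {"F": 0.41, "M": 0.63, "V": 0.50, "R": 0.50}


def evidence_weight(scores, weights):
    """Assumption: the evidence weight is the weighted sum of the components."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights are expected to sum to 1")
    return sum(scores[key] * weights[key] for key in weights)


total = evidence_weight(SCORES, WEIGHTS)
print(f"{total:.2f}")  # prints 0.48
```

The unrounded sum is 0.4835, which rounds to the displayed 0.48. Adding the four rounded contributions (0.16 + 0.09 + 0.03 + 0.20) gives the same figure.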