Hedging targeted risks with reinforcement learning: application to life insurance contracts with embedded guarantees

Carlos Octavio Pérez-Mendoza & Frédéric Godin

ASTIN Bulletin · 2026 · https://doi.org/10.1017/asb.2026.10084 · article
AJG 2 · ABDC A*
Weight
0.50

Abstract

We propose a deep reinforcement learning (RL) framework designed to optimize the hedging of specific, user-defined risk factors (referred to as targeted risks) in financial instruments affected by multiple sources of uncertainty. Our methodology uses Shapley value decompositions to establish the contribution of each group of risk sources to the projected contract cash flows, providing a clear attribution of the profit and loss to distinct risk categories. Leveraging this decomposition, we apply deep RL to hedge only the targeted risks, while leaving non-targeted risks mostly unaffected. In addition, we introduce a joint neural network architecture in which the agent network uses risk estimates from a risk measurement neural network to stabilize the hedging strategy, taking into account local risk dynamics. Numerical experiments show that our approach outperforms traditional methods, such as delta hedging and traditional deep hedging, significantly reducing targeted risks in variable annuities while maintaining flexibility for broader applications.
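
As a rough illustration of the Shapley attribution idea mentioned in the abstract, the sketch below computes exact Shapley values for a toy profit-and-loss function over three risk groups. It is not the authors' implementation: the group names, shock sizes, and the `toy_pnl` function are hypothetical placeholders, and the paper's setting (projected cash flows under stochastic risk factors) is considerably richer.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values of the set function `value` over `players`."""
    n = len(players)
    phi = {p: 0.0 for p in players}
    for p in players:
        others = [q for q in players if q != p]
        for k in range(n):
            # Weight of a coalition of size k that does not contain p.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of p when joining coalition s.
                phi[p] += weight * (value(s | {p}) - value(s))
    return phi


# Hypothetical baseline and shocked values for three risk groups.
BASE = {"equity": 0.0, "rates": 0.0, "mortality": 0.0}
SHOCK = {"equity": -0.20, "rates": -0.01, "mortality": 0.05}


def toy_pnl(active):
    """Toy P&L: groups in `active` are shocked, the rest stay at baseline."""
    x = {k: (SHOCK[k] if k in active else BASE[k]) for k in BASE}
    return (
        100.0 * x["equity"]
        + 500.0 * x["rates"]
        - 40.0 * x["mortality"]
        + 200.0 * x["equity"] * x["mortality"]  # equity/mortality interaction
    )


phi = shapley_values(list(BASE), toy_pnl)
total = toy_pnl(frozenset(BASE)) - toy_pnl(frozenset())
```

In this toy setup the attributions sum to the total P&L change (the efficiency property), and the interaction term is split equally between the equity and mortality groups. A hedger targeting only one group would then work with that group's share rather than the full P&L.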


Cite this paper

https://doi.org/10.1017/asb.2026.10084

Or copy a formatted citation

@article{carlos2026,
  title        = {{Hedging targeted risks with reinforcement learning: application to life insurance contracts with embedded guarantees}},
  author       = {Pérez-Mendoza, Carlos Octavio and Godin, Frédéric},
  journal      = {ASTIN Bulletin},
  year         = {2026},
  doi          = {10.1017/asb.2026.10084},
}



Evidence weight

0.50

Balanced mode · F 0.40 / M 0.15 / V 0.05 / R 0.40

F · citation impact: 0.50 × 0.4 = 0.20
M · momentum: 0.50 × 0.15 = 0.07
V · venue signal: 0.50 × 0.05 = 0.03
R · text relevance †: 0.50 × 0.4 = 0.20

† Text relevance is estimated at 0.50 on the detail page — for your query’s actual relevance score, open this paper from a search result.