Reinforcement Learning for Jump‐Diffusions, With Financial Applications

Xuefeng Gao et al.

Mathematical Finance · 2026 · https://doi.org/10.1111/mafi.70027 · article
AJG 3 · ABDC A
Weight
0.50

Abstract

We study continuous‐time reinforcement learning (RL) for stochastic control in which system dynamics are governed by jump‐diffusion processes. We formulate an entropy‐regularized exploratory control problem with stochastic policies to capture the exploration–exploitation balance essential for RL. Unlike the pure diffusion case initially studied by Wang et al., the derivation of the exploratory dynamics under jump‐diffusions calls for a careful formulation of the jump part. Through a theoretical analysis, we find that one can simply use the same policy evaluation and q‐learning algorithms in Jia and Zhou, originally developed for controlled diffusions, without needing to check a priori whether the underlying data come from a pure diffusion or a jump‐diffusion. We investigate as an application the mean–variance portfolio selection problem with stock price modelled as a jump‐diffusion, and show that both RL algorithms and parameterizations are invariant with respect to jumps. Finally, we present a detailed study on applying the general theory to option hedging.

Cite this paper

https://doi.org/10.1111/mafi.70027

Or copy a formatted citation

@article{xuefeng2026,
  title        = {{Reinforcement Learning for Jump‐Diffusions, With Financial Applications}},
  author       = {Gao, Xuefeng and others},
  journal      = {Mathematical Finance},
  year         = {2026},
  doi          = {10.1111/mafi.70027},
}

Paste directly into BibTeX, Zotero, or your reference manager.


Evidence weight

0.50

Balanced mode · F 0.40 / M 0.15 / V 0.05 / R 0.40

F · citation impact: 0.50 × 0.40 = 0.20
M · momentum: 0.50 × 0.15 = 0.07
V · venue signal: 0.50 × 0.05 = 0.03
R · text relevance †: 0.50 × 0.40 = 0.20

† Text relevance is estimated at 0.50 on the detail page. For your query’s actual relevance score, open this paper from a search result.
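
The breakdown above reads as a weighted sum of four component scores. The sketch below reproduces that arithmetic, assuming the evidence weight is simply the sum of score × weight over F, M, V and R; the page does not state the formula, so this is an inference from the listed terms. The function name `evidence_weight` and the dictionary layout are hypothetical, chosen only for illustration. The per-term values on the page appear to be shown to two decimals (0.50 × 0.15 is 0.075 exactly), so the code works with unrounded products.

```python
# Component scores and Balanced-mode weights as listed on the page.
scores = {"F": 0.50, "M": 0.50, "V": 0.50, "R": 0.50}
weights = {"F": 0.40, "M": 0.15, "V": 0.05, "R": 0.40}


def evidence_weight(scores, weights):
    """Weighted sum of component scores (assumed formula, not confirmed by the page)."""
    # The mode weights are expected to sum to 1 so the result stays in [0, 1].
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(scores[k] * weights[k] for k in weights)


total = evidence_weight(scores, weights)
print(f"{total:.2f}")
```

Because every component score is 0.50 here, any set of weights summing to 1 gives the same total; the weights only matter once the components differ, for example when the text relevance score comes from an actual query.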