Deep Learning for High-Dimensional Continuous-Time Stochastic Optimal Control Without Explicit Solution

Jean-Loup Dupret & Donatien Hainaut

Operations Research · 2026 · article
https://doi.org/10.1287/opre.2024.1102
FT50 · UTD24 · AJG 4* · ABDC A*
Weight
0.50

Abstract

Multiasset Optimal Execution via Deep Learning for High-Dimensional Continuous-Time Stochastic Control. In “Deep Learning for High-Dimensional Continuous-Time Stochastic Optimal Control Without Explicit Solution,” Dupret and Hainaut introduce the generalized policy iteration physics-informed neural network (GPI-PINN), a novel deep learning algorithm for solving high-dimensional continuous-time stochastic optimal control problems even when the optimal control does not admit an explicit solution. The method combines physics-informed neural networks with an actor-critic structure based on generalized policy iteration, using separate networks to approximate the value function and the multidimensional optimal control. This approach provides a global approximation of the solution across time and space, enabling fast online evaluation. Theoretical guarantees on convergence and optimality are provided, and the method's accuracy and efficacy are empirically validated on two important numerical examples from operations research. The authors thereby generalize the Almgren–Chriss framework arising from optimal execution in finance by allowing both temporary and permanent price impacts to be fully nonlinear and by considering a multidimensional setting with multiple cointegrated assets.
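To make the generalized policy iteration backbone of the method concrete, here is a minimal toy sketch on a scalar discounted linear-quadratic problem, where both the policy-evaluation (critic) and policy-improvement (actor) steps have closed forms. This is an illustrative analogue under simplifying assumptions, not the authors' GPI-PINN algorithm; all parameter values (`a`, `b`, `r`, `gamma`) are made up for the example.

```python
# Toy generalized policy iteration on a scalar discounted LQ problem:
#   dynamics x' = a*x + b*u, stage cost x^2 + r*u^2, discount factor gamma.
# The value of a linear policy u = -k*x is quadratic, V(x) = p*x^2, so
# evaluation and improvement reduce to scalar formulas.
a, b, r, gamma = 1.0, 1.0, 1.0, 0.9

def evaluate_policy(k):
    # Critic step: solve p = (1 + r*k^2) + gamma*p*(a - b*k)^2 for p.
    return (1.0 + r * k * k) / (1.0 - gamma * (a - b * k) ** 2)

def improve_policy(p):
    # Actor step: minimize the one-step lookahead cost given V(x) = p*x^2.
    return gamma * a * b * p / (r + gamma * b * b * p)

k = 0.0  # start from the zero policy (stable here since gamma * a^2 < 1)
for _ in range(50):
    p = evaluate_policy(k)
    k = improve_policy(p)

# At the fixed point, p solves the discounted algebraic Riccati equation.
residual = abs(p - (1.0 + gamma * a * a * p
                    - (gamma * a * b * p) ** 2 / (r + gamma * b * b * p)))
print(round(k, 4), residual < 1e-8)
```

In the paper's setting the two closed-form steps above are replaced by two neural networks, with the critic trained against a physics-informed (HJB) residual, which is what allows the scheme to scale to high dimensions.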


Cite this paper

https://doi.org/10.1287/opre.2024.1102

Or copy a formatted citation

@article{dupret2026,
  title        = {{Deep Learning for High-Dimensional Continuous-Time Stochastic Optimal Control Without Explicit Solution}},
  author       = {Dupret, Jean-Loup and Hainaut, Donatien},
  journal      = {Operations Research},
  year         = {2026},
  doi          = {10.1287/opre.2024.1102},
}

Paste directly into BibTeX, Zotero, or your reference manager.



Evidence weight

0.50

Balanced mode · F 0.40 / M 0.15 / V 0.05 / R 0.40

F · citation impact: 0.50 × 0.40 = 0.20
M · momentum: 0.50 × 0.15 = 0.07
V · venue signal: 0.50 × 0.05 = 0.03
R · text relevance †: 0.50 × 0.40 = 0.20

† Text relevance is estimated at 0.50 on the detail page — for your query’s actual relevance score, open this paper from a search result.
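The breakdown above suggests the evidence weight is a weighted sum of the four component scores under the balanced-mode weights. The following sketch reproduces that arithmetic; the scoring function and its inputs are inferred from the displayed breakdown, not taken from any published Arbiter specification.

```python
# Balanced-mode weights as shown on the page (they sum to 1.0).
WEIGHTS = {"F": 0.40, "M": 0.15, "V": 0.05, "R": 0.40}

def evidence_weight(scores, weights=WEIGHTS):
    # Weighted sum of component scores: F (citation impact),
    # M (momentum), V (venue signal), R (text relevance).
    return sum(scores[key] * weights[key] for key in weights)

# Every component sits at 0.50 for this paper, so the total is
# 0.50 * (0.40 + 0.15 + 0.05 + 0.40) = 0.50.
scores = {"F": 0.50, "M": 0.50, "V": 0.50, "R": 0.50}
print(round(evidence_weight(scores), 2))
```

Note that the per-component products shown on the page (0.20, 0.07, 0.03, 0.20) are rounded for display; the exact contributions sum to 0.50.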