Duality and policy gradient methods for stochastic control problems with controlled diffusions
Author(s)
Davey, Ashley
Type
Thesis or dissertation
Abstract
In this thesis we develop numerical algorithms for stochastic control problems where the state process is an Itô process that depends on the control process through both the drift and diffusion functions.
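As a point of reference, a minimal sketch of this class of problems, with generic drift b, diffusion \sigma, running cost f and terminal cost g assumed purely for illustration (the notation is not taken from the thesis itself):

dX_t = b(t, X_t, \alpha_t)\,dt + \sigma(t, X_t, \alpha_t)\,dW_t, \qquad X_0 = x_0,

with the value function obtained by minimising the expected cost over admissible controls \alpha,

v(x_0) = \inf_{\alpha}\, \mathbb{E}\Big[\int_0^T f(t, X_t, \alpha_t)\,dt + g(X_T)\Big].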
We derive sufficient conditions for convergence of a proximal policy gradient method (PPGM) for the stochastic linear quadratic control problem, whose optimal control is uniquely determined by solving a Riccati equation, and we prove that the control induced by the policy gradient method converges to this optimal control. Turning to convergence analysis in the general case, we study the underlying backward stochastic differential equations using Malliavin calculus and determine conditions under which convergence of the control process can be established.
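For orientation, one standard formulation of the stochastic linear quadratic problem with controlled diffusion, written with coefficient matrices A, B, C, D and cost weights Q, R, G that are assumptions for illustration rather than the thesis's notation:

dX_t = (A X_t + B\alpha_t)\,dt + (C X_t + D\alpha_t)\,dW_t,
\qquad
J(\alpha) = \mathbb{E}\Big[\int_0^T \big(X_t^\top Q X_t + \alpha_t^\top R\, \alpha_t\big)\,dt + X_T^\top G\, X_T\Big].

In this setting the optimal control is the linear feedback
\alpha_t^* = -(R + D^\top P_t D)^{-1}(B^\top P_t + D^\top P_t C)\,X_t,
where P solves the matrix Riccati equation
\dot P_t + A^\top P_t + P_t A + C^\top P_t C + Q - (P_t B + C^\top P_t D)(R + D^\top P_t D)^{-1}(B^\top P_t + D^\top P_t C) = 0, \qquad P_T = G.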
To implement the PPGM we draw on machine learning techniques and extend our algorithms using duality theory. With deep learning methods, the resulting algorithms scale to high-dimensional problems with reasonable runtime. Duality allows us to solve an auxiliary control problem alongside the original primal problem, to exploit the primal-dual relations between the associated processes, and to form tight bounds on the value function. In certain cases, solving the dual problem directly bypasses difficulties in the primal problem. We further use duality to implement algorithms that can be applied to non-Markovian control problems.
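The record itself contains no code; purely to illustrate the kind of deep-learning proximal policy gradient step described in the abstract, a hypothetical PyTorch-style sketch follows, in which the network architecture, the model coefficients and the proximal weight rho are all assumptions rather than the thesis's implementation.

import copy
import torch

# Hypothetical sketch (not the thesis's code): one proximal policy-gradient
# update for a neural feedback control a = policy(t, x) on simulated
# controlled Ito dynamics, with quadratic running and terminal costs.
torch.manual_seed(0)
N, T, batch = 50, 1.0, 256        # time steps, horizon, Monte Carlo batch (assumed)
rho, lr = 1.0, 1e-2               # proximal penalty weight and learning rate (assumed)
dt = T / N

policy = torch.nn.Sequential(
    torch.nn.Linear(2, 32), torch.nn.Tanh(), torch.nn.Linear(32, 1)
)
policy_prev = copy.deepcopy(policy)          # previous iterate, held fixed
for p in policy_prev.parameters():
    p.requires_grad_(False)
opt = torch.optim.Adam(policy.parameters(), lr=lr)

x = torch.ones(batch, 1)                     # initial state X_0 = 1
cost = torch.zeros(batch, 1)
prox = torch.zeros(batch, 1)
for n in range(N):
    t = torch.full((batch, 1), n * dt)
    inp = torch.cat([t, x], dim=1)
    a = policy(inp)
    a_prev = policy_prev(inp)
    cost = cost + (x ** 2 + a ** 2) * dt     # running quadratic cost
    prox = prox + (a - a_prev) ** 2 * dt     # proximal distance to previous policy
    dW = torch.randn(batch, 1) * dt ** 0.5
    # control enters both drift and diffusion (coefficients assumed for illustration)
    x = x + (0.1 * x + a) * dt + (0.2 * x + 0.3 * a) * dW
cost = cost + x ** 2                         # terminal cost

loss = (cost + rho * prox).mean()            # proximal policy-gradient objective
opt.zero_grad()
loss.backward()
opt.step()
print(f"Monte Carlo cost estimate for the current policy: {cost.mean().item():.4f}")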
Version
Open Access
Date Issued
2023-03
Date Awarded
2023-08
Copyright Statement
Creative Commons Attribution NonCommercial Licence
Advisor
Zheng, Harry
Publisher Department
Mathematics
Publisher Institution
Imperial College London
Qualification Level
Doctoral
Qualification Name
Doctor of Philosophy (PhD)