Speaker
Description
The synthesis of feedback laws for infinite horizon optimal control problems via machine learning methods, rather than classical methods, has been a theme of interest in recent years, since these methods have the potential to mitigate the curse of dimensionality. This talk studies two such methods.
The first consists of searching for a feedback law in a finite-dimensional function space (for example, polynomials, neural networks, or an RKHS) that minimizes the cost functional of the control problem averaged over a set of initial conditions. The second is a regression method which minimizes the $L^2$ distance in the space of controls. For the first method we provide a convergence result which relies on the existence of a sequence of smooth approximating optimal feedback laws. The existence of such a sequence is in turn proved using the Hölder continuity of the value function and the existence of a Lyapunov-type function.
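The first method can be illustrated with a minimal sketch. The toy problem below (one-dimensional dynamics $\dot x = x + u$, discounted running cost $x^2 + u^2$, a cubic polynomial feedback class, and a grid of initial conditions) is an illustrative assumption, not the setting of the talk; it only shows the structure of minimizing the averaged cost over a parametrized feedback class.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical 1-D example: dynamics x' = x + u, running cost x^2 + u^2,
# feedback parametrized as u(x) = theta[0]*x + theta[1]*x^3.
# The problem data are illustrative stand-ins, not the talk's actual setting.

def simulate_cost(theta, x0, dt=0.01, T=5.0, gamma=1.0):
    """Discounted closed-loop cost from initial state x0 (explicit Euler)."""
    x, cost = x0, 0.0
    for k in range(int(T / dt)):
        u = theta[0] * x + theta[1] * x**3
        cost += np.exp(-gamma * k * dt) * (x**2 + u**2) * dt
        x = x + (x + u) * dt
    return cost

def averaged_cost(theta, initial_conditions):
    """Cost functional averaged over the set of initial conditions."""
    return np.mean([simulate_cost(theta, x0) for x0 in initial_conditions])

x0s = np.linspace(-1.0, 1.0, 11)      # training set of initial states
res = minimize(averaged_cost, x0=np.zeros(2), args=(x0s,),
               method="Nelder-Mead")
theta_star = res.x                    # learned feedback parameters
```

In practice the function class would be richer (e.g. a neural network) and the minimization would use a gradient-based optimizer; the derivative-free Nelder-Mead search here is only for brevity.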
For the regression method, on the other hand, we prove convergence under the assumption that the value function is smooth. Additionally, we present a family of infinite horizon optimal control problems for which the degree of smoothness of the value function depends on a penalty parameter: the value function is $C^2$ when the penalty parameter is close to 0 and non-smooth but Lipschitz when it is large. This family allows us to compare the behavior of the two methods as the smoothness of the value function varies, by means of numerical experiments for both approaches.
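The regression method reduces, in its simplest form, to a least-squares fit of the parametrized feedback to samples of the optimal control. The sketch below is a hedged illustration: the "optimal" data are synthetic (the LQR law $u^* = -(1+\sqrt{2})\,x$ for $\dot x = x + u$ with cost $x^2 + u^2$), standing in for whatever reference controls the method would actually regress on.

```python
import numpy as np

# Samples (x_i, u*_i) of a reference optimal control; here the synthetic
# LQR law u* = -(1 + sqrt(2)) x, used purely as an illustrative stand-in.
xs = np.linspace(-1.0, 1.0, 101)
u_star = -(1.0 + np.sqrt(2.0)) * xs

# Empirical L^2 (least-squares) fit of u(x) = theta[0]*x + theta[1]*x^3.
# The ansatz is linear in the parameters, so the minimization reduces to
# a linear least-squares solve.
A = np.column_stack([xs, xs**3])
theta, *_ = np.linalg.lstsq(A, u_star, rcond=None)
```

Since the reference law is exactly linear here, the fit recovers the linear coefficient and a vanishing cubic term; with nonsmooth value functions the regression target would be correspondingly rougher, which is precisely the regime the numerical comparison in the talk probes.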