Speaker
Description
System-theoretic dissipativity notions introduced by Jan C. Willems play a fundamental role in the analysis of optimal control problems. They enable the understanding of infinite-horizon asymptotics and turnpike properties. This talk introduces a dissipative formulation for training deep Residual Neural Networks (ResNets) in classification problems. To this end, we formulate the training of ResNets with a constant width as an optimal control problem and investigate its dissipativity properties when introducing a stage cost based on a variant of the cross entropy loss function, the classic loss function for classification tasks.
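For orientation, the formulation described above can be sketched in the standard discrete-time form. The notation here is an assumption and not taken from the talk: $x_k$ is the data at layer $k$, $u_k = (W_k, b_k)$ are the layer parameters, $h$ is a step size, $\sigma$ is the activation, $y$ is the label, and $R$ is a hypothetical regularization term. The exact cross-entropy variant used as stage cost is not specified in the abstract.

```latex
% ResNet of constant width read as a discrete-time control system
\[
  x_{k+1} = f(x_k, u_k) := x_k + h\,\sigma(W_k x_k + b_k),
  \qquad k = 0, \dots, N-1, \qquad x_0 = \text{input data}.
\]
% Training as an optimal control problem with a stage cost on every layer
\[
  \min_{u_0, \dots, u_{N-1}} \; \sum_{k=0}^{N-1} \ell(x_k, u_k),
  \qquad
  \ell(x, u) = \mathrm{CE}(x, y) + R(u).
\]
% Strict dissipativity at a steady state (x^e, u^e) with storage function \lambda
\[
  \lambda\bigl(f(x, u)\bigr) - \lambda(x)
  \;\le\; \ell(x, u) - \ell(x^e, u^e) - \alpha\bigl(\lVert x - x^e \rVert\bigr),
  \qquad \alpha \in \mathcal{K}_\infty .
\]
```

Under an inequality of this kind, optimal trajectories on long horizons typically spend most layers close to the steady state $x^e$, which is the turnpike behavior referred to in the abstract.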
We illustrate the dissipative formulation by training on the MNIST dataset. The trained network exhibits the turnpike phenomenon: the data remains unchanged throughout several layers. These layers can then be removed without changing the transformation learned by the network. This technique can be used to obtain shallow neural networks for a given classification task with simplified hyperparameter tuning.
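The layer-removal idea can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the procedure used in the talk: the residual layer form `x + h * tanh(W @ x + b)`, the relative-norm criterion, the tolerance `tol`, and the function names `forward` and `prune_turnpike_layers` are all hypothetical choices made for this example.

```python
import numpy as np


def forward(x, layers, h=1.0):
    """Propagate data x (shape: width x n_samples) through residual layers."""
    for W, b in layers:
        # Assumed residual layer: identity plus a scaled tanh update.
        x = x + h * np.tanh(W @ x + b)
    return x


def prune_turnpike_layers(x0, layers, h=1.0, tol=1e-6):
    """Keep only layers whose update changes the data by more than tol.

    The criterion (relative Frobenius norm of the update, evaluated on
    the sample batch x0) is an assumption made for illustration.
    """
    kept = []
    x = x0
    for W, b in layers:
        update = h * np.tanh(W @ x + b)
        rel_change = np.linalg.norm(update) / max(np.linalg.norm(x), 1e-12)
        if rel_change > tol:
            kept.append((W, b))
        # Continue with the full network's state so every layer is
        # judged on the data it actually sees.
        x = x + update
    return kept


# Toy example: three active layers around three layers that leave the
# data unchanged (zero weights and biases give a zero update).
rng = np.random.default_rng(0)
width = 4
idle = (np.zeros((width, width)), np.zeros((width, 1)))
layers = [
    (rng.normal(size=(width, width)), rng.normal(size=(width, 1))),
    (rng.normal(size=(width, width)), rng.normal(size=(width, 1))),
    idle,
    idle,
    idle,
    (rng.normal(size=(width, width)), rng.normal(size=(width, 1))),
]
x0 = rng.normal(size=(width, 5))
shallow = prune_turnpike_layers(x0, layers)
```

In this toy setup the idle layers are exact identities, so the shallow network reproduces the deep one exactly. With a trained network the turnpike layers would only be approximately idle, so the choice of `tol` trades depth against a small change in the output.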