Industrial and Applied Math
Date: April 26, 2021
Time: 5:00PM - 6:00PM
Location: Zoom
Speaker: Dr. Boris Hanin, Princeton University
Title: Optimization and Generalization in Overparameterized Models
Abstract: Modern machine learning models, such as neural networks, have a number of theoretically puzzling but empirically robust properties. Chief among them are: (a) neural networks are trained on datasets that are much smaller than the total number of model parameters; (b) training proceeds by empirical risk minimization via a first-order method from a random starting point and, despite the non-convexity of the risk, typically returns a global minimizer; (c) this minimizer of the risk not only interpolates the data precisely but also performs well on unseen data (i.e., generalizes). The purpose of this talk is to introduce these fascinating properties and give some basic intuitions for why they might be possible. The emphasis will be on heuristics rather than on precise theorems.