Event Series
Event Type
Seminar
Wednesday, April 22, 2020 12:00 PM
Andrea Montanari (Stanford)

In a canonical supervised learning setting, we are given n data samples, each comprising a feature vector and a label, or response variable. We are asked to learn a function f that can predict the the label associated to a new --unseen-- feature vector. How is it possible that the model learnt from observed data generalizes to new points? Classical learning theory assumes that data points are drawn i.i.d. from a common distribution and argue that this phenomenon is a consequence of uniform convergence: the training error is close to its expectation uniformly over all models in a certain class. Modern deep learning systems appear to defy this viewpoint: they achieve training error that is significantly smaller than the test error, and yet generalize well to new data. I will present a sequence of high-dimensional examples in which this phenomenon can be understood in detail. [Based on joint work with Song Mei, Feng Ruan, Youngtak Sohn, Jun Yan]

Zoom Seminar Link