Events
October 10, 2024, 14:30
Probability and Mathematical Statistics Section
Large deviations of one-hidden-layer neural networks
Christian Hirsch, Aarhus University
Seminar Room (Aula Seminari), 3rd floor. Zoom link: polimi-it.zoom.us/j/94501257503
Abstract
In this talk, I will present large deviation results for stochastic gradient descent in one-hidden-layer neural networks with quadratic loss. I will explain how to derive quenched and annealed large deviation principles for the empirical weight evolution during training as the number of neurons and the number of training iterations simultaneously tend to infinity. The weight evolution is treated as an interacting dynamic particle system. The distinctive aspect compared to prior work on interacting particle systems is that the particles are updated in discrete steps while the number of particles grows at the same time. This talk is based on joint work with Daniel Willhalm.
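As a rough illustration of the particle-system viewpoint mentioned in the abstract, the sketch below runs stochastic gradient descent on a scalar one-hidden-layer network with quadratic loss and treats each neuron's weights as one particle. Everything specific here is an assumption made for illustration and is not taken from the talk: the 1/N mean-field scaling, the tanh activation, the target function sin(x), the step size, and the names sgd_particles and empirical_mean.

```python
import math
import random


def sgd_particles(n_neurons=50, n_steps=200, lr=0.1, seed=0):
    """SGD on f(x) = (1/N) * sum_i c_i * tanh(w_i * x) with quadratic loss.

    Returns the final particles as a list of (c_i, w_i) pairs.
    All modelling choices are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Each neuron is one particle (c_i, w_i), initialised i.i.d. standard normal.
    particles = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n_neurons)]
    for _ in range(n_steps):
        x = rng.uniform(-1, 1)   # fresh training sample
        y = math.sin(x)          # hypothetical target function
        hidden = [math.tanh(w * x) for _, w in particles]
        pred = sum(c * h for (c, _), h in zip(particles, hidden)) / n_neurons
        err = pred - y           # derivative of 0.5 * (pred - y)**2 w.r.t. pred
        # Assumed mean-field scaling: the 1/N in the network output is
        # cancelled by a step size of order N, so each particle moves by O(lr).
        # All particles are updated simultaneously from the old state.
        particles = [
            (c - lr * err * h, w - lr * err * c * (1 - h * h) * x)
            for (c, w), h in zip(particles, hidden)
        ]
    return particles


def empirical_mean(particles, phi):
    """Integrate a test function phi(c, w) against the empirical weight measure."""
    return sum(phi(c, w) for c, w in particles) / len(particles)
```

For example, empirical_mean(sgd_particles(), lambda c, w: c * w) evaluates one statistic of the empirical weight measure after training. Large deviation principles of the kind described in the abstract concern how unlikely it is for such empirical measures to deviate from their limiting behaviour.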
Mathematical Seminars at
Politecnico di Milano