Dresden 2026 – scientific programme
DY: Dynamics and Statistical Physics Division
DY 45: Focus Session: Physics of AI – Part I (joint session SOE/DY)
DY 45.5: Talk
Thursday, March 12, 2026, 10:45–11:00, GÖR/0226
Generalization performance of narrow one-hidden layer networks in the teacher-student setting — Rodrigo Pérez Ortiz¹, •Gibbs Nwemadji², Jean Barbier³, Federica Gerace¹, Alessandro Ingrosso⁴, Clarissa Lauditi⁵, and Enrico Malatesta⁶ — ¹Alma Mater Studiorum - Università di Bologna (Unibo), Bologna, Italy — ²International School of Advanced Studies (SISSA), Trieste, Italy — ³The Abdus Salam International Centre for Theoretical Physics, Trieste, Italy — ⁴Radboud University, Nijmegen, The Netherlands — ⁵Harvard University, Cambridge, US — ⁶Bocconi University, Milano, Italy
Generalization on simple input-output distributions is best studied in the teacher-student setting, but fully connected one-hidden-layer networks with generic activations still lack a complete theory. We develop such a framework for networks with a large but finite number of hidden neurons, using statistical-physics tools to obtain closed-form predictions for both Bayesian and ERM estimators through a few summary statistics. We also identify a specialization transition when the sample size matches the number of parameters. The resulting theory accurately predicts generalization errors for networks trained with Langevin dynamics or standard full-batch gradient descent.
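For readers unfamiliar with the teacher-student setting, the following is a minimal illustrative sketch and not the authors' method or code. A fixed "teacher" network labels random inputs, and a "student" of the same architecture is trained on those labels with full-batch gradient descent. The abstract does not specify the architecture details, so everything concrete here is an assumption: the tanh activation, the fixed averaging readout, the Gaussian inputs and weights, the noiseless labels, the sizes d, k, n, and all function names. The generalization error is estimated on fresh samples rather than computed from any closed-form theory.

```python
import math
import random


def init_weights(d, k, rng):
    """Draw a k x d hidden-layer weight matrix with i.i.d. standard Gaussian entries."""
    return [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(k)]


def preactivations(W, x):
    """Hidden pre-activations w_j . x / sqrt(d), one per hidden neuron."""
    scale = math.sqrt(len(x))
    return [sum(wi * xi for wi, xi in zip(w, x)) / scale for w in W]


def forward(W, x):
    """Output = mean of tanh over hidden units (fixed readout is an assumption)."""
    return sum(math.tanh(p) for p in preactivations(W, x)) / len(W)


def sample_data(W_teacher, n, rng):
    """Gaussian inputs labelled by the teacher, without label noise."""
    d = len(W_teacher[0])
    X = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
    y = [forward(W_teacher, x) for x in X]
    return X, y


def mse(W, X, y):
    """Mean squared error of network W on the dataset (X, y)."""
    return sum((forward(W, x) - t) ** 2 for x, t in zip(X, y)) / len(X)


def train_full_batch_gd(W_init, X, y, steps, lr):
    """Full-batch gradient descent on half the mean squared error."""
    W = [row[:] for row in W_init]  # copy so the initial weights stay intact
    k, d, n = len(W), len(W[0]), len(X)
    for _ in range(steps):
        grad = [[0.0] * d for _ in range(k)]
        for x, t in zip(X, y):
            acts = [math.tanh(p) for p in preactivations(W, x)]
            err = sum(acts) / k - t
            for j in range(k):
                # Chain rule through the readout average and tanh.
                coef = err * (1.0 - acts[j] ** 2) / (k * math.sqrt(d))
                for i in range(d):
                    grad[j][i] += coef * x[i] / n
        for j in range(k):
            for i in range(d):
                W[j][i] -= lr * grad[j][i]
    return W


if __name__ == "__main__":
    rng = random.Random(0)
    d, k = 10, 2        # input dimension and (narrow) hidden width
    n = k * d           # sample size equal to the number of hidden weights
    teacher = init_weights(d, k, rng)
    X_train, y_train = sample_data(teacher, n, rng)
    X_test, y_test = sample_data(teacher, 500, rng)
    student = train_full_batch_gd(init_weights(d, k, rng), X_train, y_train,
                                  steps=200, lr=0.5)
    print("train error:", mse(student, X_train, y_train))
    print("estimated generalization error:", mse(student, X_test, y_test))
```

Sweeping n at fixed d and k and plotting the estimated generalization error is the usual way to produce a learning curve from such a toy experiment. Whether this small setup shows the transition described in the abstract is not something the sketch establishes.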
Keywords: Machine Learning; Disordered Systems and Neural Networks; Statistical Mechanics; Information Theory; Computational Physics
