SKM 2023 – Scientific Programme
DY 11: Focus Session: Physics Meets ML II – Understanding Machine Learning as Complex Interacting Systems (joint session DY/TT)
DY 11.3: Invited Talk
Monday, 27 March 2023, 16:00–16:30, ZEU 250
Adaptive Kernel Approaches to Feature Learning in Deep Neural Networks — •Zohar Ringel — Racah Institute of Physics, Hebrew University in Jerusalem
Given the ever-increasing role of deep neural networks (DNNs) in our world, a better theoretical understanding of these complex artificial objects is desirable. Some progress in this direction has recently been made in the realm of infinitely overparameterized DNNs. The outputs of such trained DNNs behave essentially as multivariate Gaussians governed by a certain covariance matrix called the kernel. While such infinite DNNs share many similarities with the finite ones used in practice, various important discrepancies exist. Most notably, the fixed kernels of such DNNs stand in contrast to the feature learning effects observed in finite DNNs. Such effects are crucial, as they are the key to understanding how DNNs process data. To accommodate such effects within the Gaussian/kernel viewpoint, various ideas have been put forward. Here I will provide a short overview of those efforts and then discuss in some detail a general set of equations we developed for feature learning in fully trained/equilibrated DNNs. Interestingly, our approach shows that DNNs accommodate strong feature learning via mean-field effects while having decoupled layers and decoupled neurons within a layer. Furthermore, learning is achieved not by compressing information but by increasing neuron variance along label-relevant directions in function space.
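As an illustration of the fixed-kernel picture mentioned in the abstract (not of the speaker's adaptive-kernel equations), the following is a minimal sketch of Gaussian-process regression with a kernel that does not change with the data. It assumes a single hidden ReLU layer with standard normal weights, for which the kernel is the degree-1 arc-cosine kernel divided by two. The function names, the toy data, and the noise level are hypothetical choices made for this example only.

```python
import numpy as np

def relu_nngp_kernel(X, Y):
    """E_w[relu(w.x) * relu(w.y)] for w ~ N(0, I).

    This is the degree-1 arc-cosine kernel divided by 2.
    Assumes no input row is the zero vector (it would divide by zero).
    """
    nx = np.linalg.norm(X, axis=1)[:, None]   # shape (n, 1)
    ny = np.linalg.norm(Y, axis=1)[None, :]   # shape (1, m)
    cos = np.clip(X @ Y.T / (nx * ny), -1.0, 1.0)
    theta = np.arccos(cos)
    return nx * ny * (np.sin(theta) + (np.pi - theta) * cos) / (2.0 * np.pi)

def gp_posterior_mean(X_train, y_train, X_test, noise=1e-3):
    """Posterior mean of GP regression: K_* (K + noise * I)^{-1} y."""
    K = relu_nngp_kernel(X_train, X_train)
    K_star = relu_nngp_kernel(X_test, X_train)
    alpha = np.linalg.solve(K + noise * np.eye(len(X_train)), y_train)
    return K_star @ alpha

# Hypothetical toy data: the label depends only on the first coordinate.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(20, 5))
y_train = np.sign(X_train[:, 0])
X_test = rng.normal(size=(4, 5))

K = relu_nngp_kernel(X_train, X_train)
pred = gp_posterior_mean(X_train, y_train, X_test)
```

The kernel here is determined by the architecture alone and treats all input directions alike, even though only the first coordinate carries label information. That is the sense in which a fixed kernel cannot express feature learning.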