Dresden 2026 – Scientific Program
DY: Dynamics and Statistical Physics Division
DY 58: Focus Session: Physics of AI – Part II (joint session SOE/DY)
DY 58.4: Talk
Friday, March 13, 2026, 10:30–10:45, GÖR/0226
Power-Law Correlations in Language: Criticality vs. Hierarchical Generative Structure — •Marcel Kühn¹,², Max Staats¹,², and Bernd Rosenow² — ¹ScaDS.AI Dresden/Leipzig, Germany — ²Institute for Theoretical Physics, University of Leipzig, 04103 Leipzig, Germany
Natural language shows power laws beyond Zipf's law: the mutual information between words as a function of their separation, a two-point correlation, decays approximately as a power law, which constrains predictive language models. In autoregressive architectures such as transformers, the softmax temperature of the output layer controls how sharply the next-word probabilities concentrate, acting as a thermodynamic knob that might tune correlations. Since phase transitions are a well-known mechanism for generating such scale-free correlations, we ask whether the observed power-law mutual information requires tuning to a critical softmax temperature. Analyzing a Markov (bigram) model, we show that, in a large-system limit, power-law mutual information emerges only at a fine-tuned critical temperature, below which correlations decay exponentially. Motivated by the fact that faithful language models must go beyond bigrams, and that hierarchical generative processes introducing long-range interactions are more representative, we analyze an autoregressive model that perfectly emulates a specific probabilistic context-free grammar. We demonstrate that simple versions of this model preserve power-law mutual information without temperature fine-tuning, and we discuss the generality of this result for variants of the model in which deviations from the grammatical rules may occur.
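To make the two central quantities of the abstract concrete, the following minimal Python sketch (not part of the authors' code) shows a plug-in estimator of the distance-dependent mutual information I(d) between tokens separated by d positions, and a temperature-scaled softmax whose parameter T sharpens or flattens the next-token distribution. Function names, the toy corpus, and the example logits are illustrative assumptions.

    import numpy as np
    from collections import Counter

    def mutual_information_at_distance(tokens, d):
        """Plug-in estimate of the two-point mutual information between tokens
        separated by d positions: I(d) = sum_{x,y} p_d(x,y) log[p_d(x,y) / (p(x) p(y))]."""
        pairs = list(zip(tokens[:-d], tokens[d:]))
        n = len(pairs)
        joint = Counter(pairs)
        left = Counter(x for x, _ in pairs)
        right = Counter(y for _, y in pairs)
        return sum(
            (c / n) * np.log((c / n) / ((left[x] / n) * (right[y] / n)))
            for (x, y), c in joint.items()
        )

    def temperature_softmax(logits, T):
        """Next-token distribution at softmax temperature T: small T concentrates
        probability on the largest logit, large T flattens it toward uniform."""
        z = np.asarray(logits, dtype=float) / T
        z -= z.max()  # numerical stability
        p = np.exp(z)
        return p / p.sum()

    if __name__ == "__main__":
        # Toy corpus; on real text one would check whether I(d) ~ d^(-alpha)
        # over a wide range of separations d.
        corpus = "the cat sat on the mat and the dog sat on the rug".split() * 200
        for d in (1, 2, 4, 8, 16):
            print(f"I(d={d:2d}) = {mutual_information_at_distance(corpus, d):.4f}")
        print(temperature_softmax([2.0, 1.0, 0.1], T=0.5))  # sharper
        print(temperature_softmax([2.0, 1.0, 0.1], T=2.0))  # flatter

On a periodic toy corpus such as this one, I(d) does not decay as a power law; the estimator is shown only to fix notation for the observable whose power-law decay in natural language motivates the work.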
Keywords: Autoregressive Language Models; Power-Laws; Phase Transitions; Natural Language Statistics; Hierarchical Structure
