18.177. Topics in Stochastic Processes (fall 2018).
In the study of large (often high-dimensional) stochastic systems it is often important to be able to quantify the probabilities of rare events, or large deviations.
This course will cover the theory of large deviations, as well as recent applications and extensions in the field of probability and statistical physics.
The first part of the semester will cover general foundations. The second part will cover some recent research papers in probability theory.
The format will be similar to that of a class taught by David Aldous at Berkeley in 2008, although the topics of the second part will differ.
18.175 is a prerequisite. The lectures will also assume basic familiarity with point-set topology.
Please include "18.177" in the subject line of all emails.
Instructor: Nike Sun (nsun at mit dot edu).
Office hours: Mondays 3-4 and Wednesdays 11-12 in 2-432.
REFERENCES. References marked * are available electronically through libraries.mit.edu.
[DZ] *A. Dembo and O. Zeitouni, Large Deviations Techniques and Applications (2010 corrected reprint).
[RS] *F. Rassoul-Agha and T. Seppäläinen, A Course on Large Deviations (2015).
[MM] M. Mézard and A. Montanari, Information, Physics, and Computation (2009).
*P. Billingsley, Convergence of Probability Measures (2nd edition 2008).
*S. Chatterjee, Large Deviations for Random Graphs (2015).
F. den Hollander, Large Deviations (2000).
J. R. Munkres, Topology (2nd edition 2015).
*D. Panchenko, The Sherrington-Kirkpatrick model (2013).
F. Rezakhanlou, lecture notes.
R. T. Rockafellar, Convex Analysis (1970).
- Lecture 1 (09/05) Introductory examples: gaussian and binomial large deviations; review of binary entropy and relative entropy; an example with conditioning.
Week 1 office hours Thursday (09/06) 4-5:30.
- Lecture 2 (09/10) Formal definition of LDP; lower semicontinuity (lsc) for the rate function.
RS 2.1-2.2 / DZ Ch. 1.
- Lecture 3 (09/12) Cramér's theorem; weak LDP; exponential tightness; change of measure.
RS 2.3-2.4 / DZ 2.2.
Week 2 office hours Monday (09/10) 3-4 and Wednesday (09/12) 11-12.
- Lecture 4 (09/17) More on change of measure; intuition and interpretations. Explicit calculations for discrete measures on R.
- Lecture 5 (09/18) A Cramér-type weighting in the random k-SAT problem. References:
- D. Achlioptas, C. Moore. "The asymptotic order of the random k-SAT threshold." Proc. 43rd FOCS: 779-788. IEEE, 2002.
- D. Achlioptas, Y. Peres. "The threshold for random k-SAT is 2^k log 2 - O(k)." J. Am. Math. Soc. 17:4 (2004): 947-973.
- Lecture 6 (09/24) Contraction principle; Varadhan's lemma; Curie--Weiss example; Legendre duality in general vector spaces; relative entropy and its dual.
RS 3.1-3.3, 4.1, 5.1.
- Lecture 7 (09/26) General weak LDP upper bound by convex conjugate; proof of Sanov's theorem.
RS 4.2, 5.2. Prohorov's theorem and Ulam's theorem in Billingsley (see above).
- Lectures 8 and 9 (10/01 and 10/03)
Guest lectures from Yufei Zhao on large deviations in random graphs. References:
- S. Chatterjee, S.R.S. Varadhan. "The large deviation principle for the Erdős-Rényi random graph." Eur. J. Combin. 32:7 (2011): 1000-1017.
- E. Lubetzky, Y. Zhao. "On replica symmetry of large deviations in random graphs." Random Struct. Algor. 47:1 (2015): 109-146.
- Lecture 10 (10/10) Introduction to spin glasses: binary synchronization problem; Sherrington--Kirkpatrick (SK) model; random energy model (REM); condensation phenomenon and Poisson--Dirichlet statistics.
Reading: MM Ch. 5, Panchenko Ch. 2. Further references:
- B. Derrida. "Random-energy model: Limit of a family of disordered models." Phys. Rev. Lett. 45.2 (1980): 79.
- B. Derrida. "Random-energy model: An exactly solvable model of disordered systems." Phys. Rev. B 24.5 (1981): 2613.
- Lecture 11 (10/15) Introduction to variational methods for factor models: Gibbs variational principle; naive mean-field approximation (e.g. Curie--Weiss); Bethe approximation; belief propagation (BP) equations.
Reading: MM Ch. 14. Further references:
- J. S. Yedidia, W. T. Freeman, Y. Weiss. "Constructing free-energy approximations and generalized belief propagation algorithms." IEEE Trans. Inform. Theory 51.7 (2005): 2282-2312.
- A. Montanari. "Inference in graphical models" (lecture notes). Available online.
- M. J. Wainwright, M. I. Jordan. "Graphical models, exponential families, and variational inference." Foundations and Trends in Machine Learning 1.1-2 (2008): 1-305. Available online.
- Lecture 12 (10/17) Bethe variational principle: constrained optimization on the local polytope; and an alternate view via random n-lifts.
- Lecture 13 (10/22) Binary synchronization problem: Bethe formula (replica symmetric formula) for optimal MSE.
Some references for the optimal (Bayes) MSE:
- T. Lesieur, F. Krzakala, L. Zdeborová. "Phase transitions in sparse PCA."
- Y. Deshpande, E. Abbe, A. Montanari. "Asymptotic mutual information for the binary stochastic block model."
- J. Barbier, M. Dia, N. Macris, F. Krzakala, T. Lesieur, L. Zdeborová.
"Mutual information for symmetric rank-one matrix estimation: A proof of the replica formula."
- M. Lelarge, L. Miolane. "Fundamental limits of symmetric low-rank matrix estimation."
PTRF (2016): 1-71.
- A. El Alaoui, F. Krzakala. "Estimation in the spiked Wigner model: a short proof of the replica formula."
- A. El Alaoui, F. Krzakala, M. I. Jordan. "Fundamental limits of detection in the spiked Wigner model."
Some references for the spectral threshold:
- J. Baik, G. Ben Arous, S. Péché. "Phase transition of the largest eigenvalue for nonnull complex sample covariance matrices."
AOP 33.5 (2005): 1643-1697.
- D. Féral, S. Péché. "The largest eigenvalue of rank one deformation of large Wigner matrices."
CMP 272.1 (2007): 185-228.
- F. Benaych-Georges, R. R. Nadakuditi. "The eigenvalues and eigenvectors of finite, low rank perturbations of large random matrices."
Adv. Math. 227.1 (2011): 494-521.
- Lecture 14 (10/24) Binary synchronization problem: interpretation of Bayes inference as gaussian scalar channel.
Some proof ideas: the partition function encodes the MMSE ("Nishimori identity" & gaussian integration by parts);
an interpolation of the Hamiltonian can be used to bound the partition function.
Some references for the Guerra interpolation:
- F. Guerra, F. L. Toninelli. "The thermodynamic limit in mean field spin glass models."
CMP 230.1 (2002): 71-79.
- F. Guerra. "Broken replica symmetry bounds in the mean field spin glass model."
CMP 233.1 (2003): 1-12.
- D. Panchenko, M. Talagrand. "Bounds for diluted mean-fields spin glass models."
PTRF 130.3 (2004): 319-336.
- M. Bayati, D. Gamarnik, P. Tetali. "Combinatorial approach to the interpolation method and scaling limits in sparse random graphs."
AOP 41.6 (2013): 4080-4115.
- D. Gamarnik. "Right-convergence of sparse random graphs."
PTRF 160.1-2 (2014): 253-278.
- Lecture 15 (10/29) Sherrington--Kirkpatrick (SK) model with external field: derivation of the TAP equations, and the "state evolution" scaling limit of the TAP (AMP) iteration. References:
- D. J. Thouless, P. W. Anderson, R. G. Palmer. "Solution of 'solvable model of a spin glass'."
Philos. Mag. 35.3 (1977): 593-601.
- J. R. L. de Almeida, D. J. Thouless. "Stability of the SK solution of a spin glass model."
J. Phys. A, 11.5 (1978): 983.
- M. Bayati, A. Montanari. "The dynamics of message passing on dense graphs, with applications to compressed sensing."
IEEE Trans. Inform. Theory 57.2 (2011): 764-785.
- E. Bolthausen. "An iterative construction of solutions of the TAP equations for the SK model."
CMP 325.1 (2014): 333-366.
- Lecture 16 (10/31) TAP equations and replica symmetric prediction for the Ising perceptron. References:
- W. Krauth, M. Mézard. "Learning algorithms with optimal stability in neural networks."
J. Phys. A 20.11 (1987): L745.
- M. Mézard. "The space of interactions in neural networks: Gardner's computation with the cavity method."
J. Phys. A 22.12 (1989): 2181.
- J. Ding, N. Sun. "Capacity lower bound for the Ising perceptron."
- E. Bolthausen. "A Morita type proof of the replica-symmetric formula for SK."
- Lecture 17 (11/05) Uniqueness, extremality (nonreconstruction), replica symmetry. Decomposition into Bethe states (clusters) and the 1RSB equations. Reading: MM Chapter 21. References:
- A. Montanari, D. Shah. "Counting good truth assignments of random k-SAT formulae."
- M. Mézard, A. Montanari. "Reconstruction on trees and spin glass transition."
J. Stat. Phys. 124.6 (2006): 1317-1350.
- F. Krzakala, A. Montanari, F. Ricci-Tersenghi, G. Semerjian, L. Zdeborová.
"Gibbs states and the set of solutions of random constraint satisfaction problems."
PNAS 104.25 (2007): 10318-10323.
- Lecture 18 (11/07) Random regular NAESAT for large k: combinatorial view of replica symmetry breaking. References:
- L. Zdeborová, F. Krzakala. "Phase transitions in the coloring of random graphs."
Phys. Rev. E 76.3 (2007): 031131.
- A. Montanari, F. Ricci-Tersenghi, G. Semerjian. "Clusters of solutions and replica symmetry breaking in random k-satisfiability."
J. Stat. Mech. 2008.04 (2008): P04004.
- A. Sly, N. Sun, Y. Zhang. "The number of solutions for random regular NAE-SAT."
- Lecture 19 (11/14) Gaussian comparison inequalities (Slepian's inequality, Gordon's inequality); capacity of the spherical perceptron. References:
- Section 3.3 in M. Ledoux, M. Talagrand. Probability in Banach Spaces: isoperimetry and processes.
- M. Stojnic. "Another look at the Gardner problem."
- Lecture 20 (11/19) low-complexity Gibbs measures. Main reference:
- T. Austin. "Measures of correlation and mixtures of product measures."
- T. Austin. "The structure of low-complexity Gibbs measures on product spaces."
Other references (some will be presented in December):
- S. Chatterjee, A. Dembo. "Nonlinear large deviations."
Adv. Math. 299 (2016): 396-450.
- R. Eldan. "Gaussian-width gradient complexity, reverse log-Sobolev inequalities and nonlinear large deviations."
- R. Eldan, R. Gross. "Decomposition of mean-field Gibbs distributions into product measures."
EJP 23 (2018).
- Lecture 21 (11/21) Generalized random energy model, Ruelle probability cascades, and the Parisi functional. Panchenko Ch.2.
- Lecture 22 (11/26) Some ingredients in the proof of the Parisi formula. Panchenko Ch. 3.
- Lecture 23 (11/28) Guest lecture from Andrej Risteski on variational methods and convex programming hierarchies. References:
A. Risteski. "How to calculate partition functions using convex programming hierarchies: provable bounds for variational methods." COLT 2016.
A. Risteski, Y. Li. "Approximate maximum entropy principles via Goemans--Williamson with applications to provable variational methods." NIPS 2016.
V. Jain, F. Koehler, A. Risteski. "Mean-field approximation, convex hierarchies, and the optimality of correlation rounding: a unified perspective." arXiv:1808.07226 (2018).
- Presentations (12/03):
- Sergei: T. Seppäläinen. "Large deviations for increasing sequences on the plane." PTRF 112.2 (1998): 221-244.
- Yuzhou: R. Basu, S. Ganguly, A. Sly. "Upper tail large deviations in first passage percolation." arXiv:1712.01255 (2017).
- Presentations (12/05):
- Roger: H. Cohn, R. Kenyon, J. Propp. "A variational principle for domino tilings." JAMS 14.2 (2001): 297-346.
- Jie Jun: E. Bolthausen, J.-D. Deuschel, G. Giacomin. "Entropic repulsion and the maximum of the two-dimensional harmonic crystal." AOP 29.4 (2001): 1670-1692.
- Yixiang: E. Lubetzky, F. Martinelli, A. Sly. "Harmonic pinnacles in the discrete gaussian model." CMP 344.3 (2016): 673-717.
- Presentations (12/10):
- Ashwin: G. Ben Arous, A. Guionnet. "Large deviations for Wigner's law and Voiculescu's non-commutative entropy." PTRF 108.4 (1997): 517-542.
- Suhas: N. Gantert, S. Kim, K. Ramanan. "Large deviations for random projections of ell^p balls." arXiv:1512.04988 (2015).
- Presentations (12/12):
- Sinho: R. Eldan. "Gaussian-width gradient complexity, reverse log-Sobolev inequalities and nonlinear large deviations." arXiv:1612.04346 (2016);
R. Eldan, R. Gross. "Decomposition of mean-field Gibbs distributions into product measures." EJP 23 (2018).
- Jonathan: F. Augeri. "Nonlinear large deviation bounds with applications to traces of Wigner matrices and cycles counts in Erdős-Rényi graphs." arXiv:1810.01558 (2018).