		Revision as of 16:03, 13 February 2024
Goal:  
So far we have discussed the equilibrium properties of disordered systems, which are encoded in their partition function and free energy. In this set of problems, we characterize the energy landscape of the spherical <math>p</math>-spin model by determining the number of its stationary points.
Techniques:  differential geometry (just a tiny bit!), random matrix theory.
Dynamics, optimization, trapping local minima
 
  Convex and rugged energy landscapes.
- Thermodynamics and dynamics. Recall: a system equilibrates dynamically at temperature <math>T</math> (under Langevin or Monte Carlo dynamics) whenever, at sufficiently large timescales, it visits configurations with the frequency predicted by the Boltzmann-Gibbs distribution at temperature <math>T</math>. There are systems whose equilibration timescales are extremely large, so that the system dynamics is out of equilibrium: to understand this dynamics, knowing the Boltzmann-Gibbs measure is not enough.
- Rugged landscapes. Consider the spherical <math>p</math>-spin model for concreteness: <math>E(\vec{S})</math> is an energy landscape. It is a random function on configuration space (in our case, the surface <math>\mathcal{S}_N</math> of the sphere in dimension <math>N</math>). This landscape has its global minimum (or minima) at the ground state configuration(s): the energy density of the ground state(s) can be obtained by studying the partition function <math>Z</math> in the limit <math>\beta \to \infty</math>. Besides the ground state(s), the energy landscape can have other local minima; fully-connected models of glasses are characterized by the fact that there are plenty of these local minima: the energy landscape is rugged, see the sketch.
- Optimization by gradient descent. Suppose that we are interested in finding the configurations of minimal energy, starting from an arbitrary configuration <math>\vec{S}_0</math>: we can implement a dynamics in which we progressively update the configuration moving towards lower and lower values of the energy, hoping to eventually converge to the ground state(s). The simplest dynamics of this sort is gradient descent,
<math>\frac{d\vec{S}(t)}{dt} = -\nabla_\perp E(\vec{S}),</math>
where <math>\nabla_\perp E(\vec{S})</math> is the gradient of the landscape restricted to the sphere. The dynamics stops when it reaches a stationary point, a configuration where <math>\nabla_\perp E(\vec{S}) = 0</math>. If the landscape has a convex structure, this will be the ground state; if the energy landscape is very non-convex, as in glasses, the end point of this algorithm will be a local minimum at energies much higher than the ground state (see sketch).
 
- Stationary points and complexity. To guess where gradient descent dynamics (or its variants, such as Langevin dynamics) is expected to converge, it is useful to understand the distribution of the stationary points, i.e. the number <math>\mathcal{N}(\epsilon)</math> of such configurations having a given energy density <math>\epsilon</math>. In fully-connected models, this quantity has an exponential scaling, <math>\mathcal{N}(\epsilon) \sim e^{N \Sigma(\epsilon)}</math>, where <math>\Sigma(\epsilon)</math> is the landscape's complexity. [*] Stationary points can be stable (local minima) or unstable (saddles or local maxima): their stability is encoded in the spectrum of the Hessian matrix <math>\nabla_\perp^2 E(\vec{S})</math>: when all the eigenvalues of the Hessian are positive, the point is a local minimum (otherwise it is a saddle or a local maximum).
- [*] This quantity looks similar to the entropy <math>S(\epsilon)</math> we computed for the REM in Problem 1.1. However, while the entropy counts all configurations at a given energy density, the complexity <math>\Sigma(\epsilon)</math> accounts only for the stationary points.
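The gradient descent dynamics described above can be tried on a small instance. The following is a minimal sketch, not a reference implementation: it assumes <math>p=3</math>, unsymmetrized Gaussian couplings normalized so that the energy covariance is <math>(N/2)\,q^3</math>, and a simple "step, then renormalize" discretization of the flow on the sphere. The function names (`make_couplings`, `tangent_gradient`, `gradient_descent`) and the values of the step size and of <math>N</math> are illustrative choices.

```python
import numpy as np


def make_couplings(n, rng):
    """Draw i.i.d. Gaussian couplings J_ijk for a p = 3 energy.

    Assumed normalization: Var[J_ijk] = 1 / (2 n^2), so that
    E(S) = sum_ijk J_ijk S_i S_j S_k has covariance (n / 2) q^3,
    with q = S.S' / n, on the sphere |S|^2 = n.
    """
    return rng.normal(0.0, np.sqrt(0.5) / n, size=(n, n, n))


def random_point_on_sphere(n, rng):
    """Uniformly distributed configuration with |S|^2 = n."""
    s = rng.normal(size=n)
    return s * np.sqrt(n) / np.linalg.norm(s)


def energy(J, s):
    return float(np.einsum("ijk,i,j,k->", J, s, s, s))


def tangent_gradient(J, s):
    """Gradient of E projected on the plane orthogonal to S."""
    g = (np.einsum("ijk,j,k->i", J, s, s)
         + np.einsum("ijk,i,k->j", J, s, s)
         + np.einsum("ijk,i,j->k", J, s, s))
    return g - (g @ s) / s.size * s  # remove the radial component


def gradient_descent(J, s0, step=0.02, n_steps=1000, tol=1e-8):
    """Discretized gradient descent, renormalizing S after every step."""
    n = s0.size
    s = s0.copy()
    energies = [energy(J, s)]
    for _ in range(n_steps):
        g = tangent_gradient(J, s)
        if np.linalg.norm(g) < tol:
            break  # stationary point reached within tolerance
        s = s - step * g
        s *= np.sqrt(n) / np.linalg.norm(s)  # back onto the sphere
        energies.append(energy(J, s))
    return s, np.array(energies)


rng = np.random.default_rng(0)
n = 30
J = make_couplings(n, rng)
s_final, energies = gradient_descent(J, random_point_on_sphere(n, rng))
print("initial energy density:", energies[0] / n)
print("final energy density:  ", energies[-1] / n)
```

Repeating the run from different initial configurations and comparing the final energy densities gives a first feeling for how many distinct end points the descent can reach. At such small <math>N</math> finite-size effects are large, so the numbers should be read qualitatively.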
 
Problems
In these problems, we discuss the computation of the annealed complexity of the spherical <math>p</math>-spin model, which is defined by
<math>\Sigma_{\text{a}}(\epsilon) = \lim_{N \to \infty} \frac{1}{N} \log \overline{\mathcal{N}(\epsilon)}.</math>
 
Problem 5.1: the Kac-Rice method and the complexity
First, a few notions of geometry: we denote by <math>\hat{\Pi}(\vec{S})</math> the projector on the tangent plane to the sphere at <math>\vec{S}</math>: this is the plane orthogonal to the vector <math>\vec{S}</math>. The gradient <math>\nabla_\perp E(\vec{S})</math> is an <math>(N-1)</math>-dimensional vector that is obtained by projecting the gradient <math>[\nabla E(\vec{S})]_i = \partial E/\partial S_i</math> on the tangent plane, <math>\nabla_\perp E(\vec{S}) = \hat{\Pi}(\vec{S}) \, \nabla E(\vec{S})</math>. The Hessian <math>\nabla_\perp^2 E(\vec{S})</math> is an <math>(N-1) \times (N-1)</math>-dimensional matrix that is obtained from the Hessian <math>[\nabla^2 E(\vec{S})]_{ij} = \partial^2 E/\partial S_i \partial S_j</math> as
<math>\nabla_\perp^2 E(\vec{S}) = \hat{\Pi}(\vec{S}) \left( \nabla^2 E(\vec{S}) - \frac{1}{N} \left( \nabla E(\vec{S}) \cdot \vec{S} \right) \mathbb{I} \right) \hat{\Pi}(\vec{S}),</math>
where <math>\mathbb{I}</math> is the identity matrix.
- The Kac-Rice formula. Consider first a random function of one variable <math>f(x)</math> defined on an interval <math>[a,b]</math>, and let <math>\mathcal{N}</math> be the number of points <math>x</math> such that <math>f(x)=0</math>. Justify why
<math>\overline{\mathcal{N}} = \int_a^b dx \, p_x(0) \, \overline{|f'(x)|} \Big|_{f(x)=0}</math>
where <math>p_x(0)</math> is the probability density that <math>x</math> is a zero of the function. In particular, why is the derivative of the function appearing in this formula? Consider now the number of stationary points <math>\mathcal{N}(\epsilon)</math> of the <math>p</math>-spin energy landscape, which satisfy <math>\nabla_\perp E(\vec{S}) = 0</math>. Justify why the generalization of the formula above gives
<math>\overline{\mathcal{N}(\epsilon)} = \int_{\mathcal{S}_N} d\vec{S} \, p_{\vec{S}}(\epsilon, \vec{0}) \, \overline{|\text{det} \, \nabla_\perp^2 E(\vec{S})|} \Big|_{E(\vec{S}) = N\epsilon, \; \nabla_\perp E(\vec{S}) = 0}</math>
where <math>p_{\vec{S}}(\epsilon, \vec{0})</math> is the probability density that <math>\vec{S}</math> is a stationary point of energy density <math>\epsilon</math>, and <math>\nabla_\perp^2 E(\vec{S})</math> is the Hessian matrix of the function <math>E(\vec{S})</math> restricted to the sphere.
 
- Statistical rotational invariance. Recall the expression of the correlations of the energy landscape of the <math>p</math>-spin computed in Problem 2.1: in which sense is the correlation function rotationally invariant? Justify why rotational invariance implies that
<math>\overline{\mathcal{N}(\epsilon)} = (2 \pi e)^{\frac{N}{2}} \, p_{\vec{1}}(\epsilon, \vec{0}) \, \overline{|\text{det} \, \nabla_\perp^2 E(\vec{1})|} \Big|_{E(\vec{1}) = N\epsilon, \; \nabla_\perp E(\vec{1}) = 0}</math>
where <math>\vec{1}</math> is one fixed vector belonging to the surface of the sphere. Where does the prefactor arise from?
 
- Gaussianity and correlations. Determine the distribution of the quantity <math>E(\vec{1})</math>. Show that the components of the vector <math>\nabla_\perp E(\vec{1})</math> are Gaussian random variables with zero mean and covariances
<math>\overline{[\nabla_\perp E(\vec{1})]_\alpha \, [\nabla_\perp E(\vec{1})]_\beta} = \frac{p}{2} \, \delta_{\alpha \beta}.</math>
The quantity <math>E(\vec{1})</math> can be shown to be uncorrelated with <math>\nabla_\perp E(\vec{1})</math>. The entries of the matrix <math>\nabla_\perp^2 E(\vec{1})</math> are also Gaussian variables. Computing their correlations, one finds that the matrix, conditioned on <math>E(\vec{1}) = N \epsilon</math>, can be written as
<math>[\nabla_\perp^2 E(\vec{1})]_{\alpha \beta} = \left[ \hat{\Pi} \nabla^2 E(\vec{1}) \hat{\Pi} - N^{-1} p \, E(\vec{1}) \mathbb{I} \right]_{\alpha \beta} = M_{\alpha \beta} - p \epsilon \, \delta_{\alpha \beta},</math>
where the matrix <math>M</math> has random entries with zero average and correlations
<math>\overline{M_{\alpha \beta} \, M_{\gamma \delta}} = \frac{p(p-1)}{2N} \left( \delta_{\alpha \gamma} \delta_{\beta \delta} + \delta_{\alpha \delta} \delta_{\beta \gamma} \right).</math>
Combining everything, show that this implies
<math>\overline{\mathcal{N}(\epsilon)} = (2 \pi e)^{\frac{N}{2}} \, \frac{1}{(\pi p)^{\frac{N-1}{2}}} \, \sqrt{\frac{N}{\pi}} \, e^{-N \epsilon^2} \; \overline{|\text{det}\left( M - p \epsilon \mathbb{I} \right)|}.</math>
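The one-dimensional Kac-Rice formula of this problem can be tested on a toy example. The sketch below assumes a random trigonometric polynomial <math>f(x) = \sum_{k=1}^{K} (a_k \cos kx + b_k \sin kx)</math> with independent standard Gaussian coefficients, which is not part of the problem set and is chosen only because <math>f(x)</math> and <math>f'(x)</math> are then independent Gaussians with variances <math>K</math> and <math>\sum_k k^2</math>. Under this assumption the formula predicts <math>2\sqrt{\sum_k k^2 / K}</math> zeros on <math>[0, 2\pi)</math>, which is compared with a direct count of sign changes on a grid.

```python
import math
import random


def kac_rice_prediction(K):
    """Mean number of zeros on [0, 2*pi) from the Kac-Rice formula.

    With v0 = Var[f] = K and v1 = Var[f'] = sum_k k^2, one has
    p_x(0) = 1 / sqrt(2 pi v0) and mean|f'| = sqrt(2 v1 / pi),
    so the density of zeros is sqrt(v1 / v0) / pi.
    """
    v0 = K
    v1 = sum(k * k for k in range(1, K + 1))
    density_of_zeros = math.sqrt(v1 / v0) / math.pi
    return 2.0 * math.pi * density_of_zeros


def monte_carlo_zero_count(K, n_samples, n_grid=400, seed=0):
    """Average number of sign changes of f on a periodic grid."""
    rng = random.Random(seed)
    xs = [2.0 * math.pi * j / n_grid for j in range(n_grid)]
    cos_table = [[math.cos(k * x) for x in xs] for k in range(1, K + 1)]
    sin_table = [[math.sin(k * x) for x in xs] for k in range(1, K + 1)]
    total = 0
    for _ in range(n_samples):
        a = [rng.gauss(0.0, 1.0) for _ in range(K)]
        b = [rng.gauss(0.0, 1.0) for _ in range(K)]
        f = [sum(a[k] * cos_table[k][j] + b[k] * sin_table[k][j]
                 for k in range(K))
             for j in range(n_grid)]
        # f[j - 1] with j = 0 wraps around, closing the circle
        total += sum(1 for j in range(n_grid) if f[j] * f[j - 1] < 0.0)
    return total / n_samples


K = 5
print("Kac-Rice prediction: ", kac_rice_prediction(K))
print("Monte Carlo estimate:", monte_carlo_zero_count(K, n_samples=300))
```

The grid count misses pairs of zeros closer than the grid spacing, so it slightly underestimates the true number; refining `n_grid` reduces this bias.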
 
Problem 5.2: the Hessian and random matrix theory
To get the complexity, it remains to compute the expectation value of the determinant of the Hessian matrix: this is the goal of this problem. We will do this exploiting results from random matrix theory.
- Gaussian random matrices. Show that the matrix <math>M</math> is a GOE matrix, i.e. a matrix taken from the Gaussian Orthogonal Ensemble, meaning that it is a symmetric matrix with distribution <math>P_N(M) = Z_N^{-1} e^{-\frac{N}{4 \sigma^2} \text{Tr} M^2}</math>. What is the value of <math>\sigma^2</math>?
- Eigenvalue density and concentration. Let <math>\lambda_\alpha</math> be the eigenvalues of the matrix <math>M</math>. Show that the following identity holds:
<math>\overline{|\text{det}\left( M - p \epsilon \mathbb{I} \right)|} = \overline{\text{exp}\left[ (N-1) \left( \int d\lambda \, \rho_N(\lambda) \, \log |\lambda - p \epsilon| \right) \right]}, \quad \quad \rho_N(\lambda) = \frac{1}{N-1} \sum_{\alpha=1}^{N-1} \delta(\lambda - \lambda_\alpha)</math>
where <math>\rho_N(\lambda)</math> is the empirical eigenvalue density. It can be shown that if <math>M</math> is a GOE matrix, the distribution of the empirical density has a large deviation form (recall TD1) with speed <math>N^2</math>, meaning that <math>P_N[\rho] = e^{-N^2 \, g[\rho]}</math> where now <math>g[\cdot]</math> is a functional (a function of a function). Using a saddle point argument, show that this implies
<math>\overline{\text{exp}\left[ (N-1) \left( \int d\lambda \, \rho_N(\lambda) \, \log |\lambda - p \epsilon| \right) \right]} = \text{exp}\left[ N \left( \int d\lambda \, \rho_{\text{ty}}(\lambda + p \epsilon) \, \log |\lambda| \right) + o(N) \right]</math>
where <math>\rho_{\text{ty}}(\lambda)</math> is the typical value of the eigenvalue density, which satisfies <math>g[\rho_{\text{ty}}] = 0</math>.

- The semicircle, the threshold and the ground state. The eigenvalue density of GOE matrices is self-averaging, and it equals
<math>\lim_{N \to \infty} \rho_N(\lambda) = \rho_{\text{ty}}(\lambda) = \frac{1}{2 \pi \sigma^2} \sqrt{4 \sigma^2 - \lambda^2}, \quad \quad |\lambda| \leq 2 \sigma.</math>
 
- Check this numerically: generate matrices for various values of <math>N</math>, plot their empirical eigenvalue density and compare with the asymptotic curve. Is the convergence faster in the bulk, or at the edges of the eigenvalue density, where it vanishes?
- Sketch <math>\rho_{\text{ty}}(\lambda + p \epsilon)</math> for different values of <math>\epsilon</math>; recalling that the Hessian encodes the stability of the stationary points, show that there is a transition in the stability of the stationary points at a critical value of the energy density
<math>\epsilon_{\text{th}} = -\sqrt{\frac{2(p-1)}{p}}.</math>
When are the critical points stable local minima? When are they saddles? Why are the stationary points at <math>\epsilon = \epsilon_{\text{th}}</math> called marginally stable?
 
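- Combining all the results, show that the annealed complexity is
<math>\Sigma_{\text{a}}(\epsilon) = \frac{1}{2} \log[4e(p-1)] - \epsilon^2 + I_p(\epsilon), \quad \quad I_p(\epsilon) = \frac{2}{\pi} \int dx \sqrt{1 - \left( x - \frac{\epsilon}{\epsilon_{\text{th}}} \right)^2} \, \log |x|, \quad \quad \epsilon_{\text{th}} = -\sqrt{\frac{2(p-1)}{p}}.</math>
The integral <math>I_p(\epsilon)</math> can be computed explicitly, and one finds:
<math>I_p(\epsilon) = \begin{cases} \frac{\epsilon^2}{\epsilon_{\text{th}}^2} - \frac{1}{2} - \frac{\epsilon}{\epsilon_{\text{th}}} \sqrt{\frac{\epsilon^2}{\epsilon_{\text{th}}^2} - 1} + \log\left( \frac{\epsilon}{\epsilon_{\text{th}}} + \sqrt{\frac{\epsilon^2}{\epsilon_{\text{th}}^2} - 1} \right) - \log 2 & \text{if} \quad \epsilon \leq \epsilon_{\text{th}} \\ \frac{\epsilon^2}{\epsilon_{\text{th}}^2} - \frac{1}{2} - \log 2 & \text{if} \quad \epsilon_{\text{th}} < \epsilon \leq -\epsilon_{\text{th}} \end{cases}</math>
Plot the annealed complexity, and determine numerically where it vanishes: why is this a lower bound for the ground state energy density?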
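For the numerical check of the semicircle law, one possible starting point is sketched below. It assumes the convention in which a GOE matrix of size <math>N</math> has off-diagonal variance <math>\sigma^2/N</math> and diagonal variance <math>2\sigma^2/N</math>, so that the asymptotic density is supported on <math>[-2\sigma, 2\sigma]</math>. The helper names (`sample_goe`, `semicircle`, `empirical_density`) and the binning are illustrative choices, and the plotting itself is left out.

```python
import numpy as np


def sample_goe(n, sigma, rng):
    """Symmetric Gaussian matrix with off-diagonal variance sigma^2 / n
    and diagonal variance 2 sigma^2 / n (assumed GOE convention)."""
    a = rng.normal(0.0, sigma / np.sqrt(2.0 * n), size=(n, n))
    return a + a.T


def semicircle(lam, sigma):
    """Asymptotic eigenvalue density, zero outside [-2 sigma, 2 sigma]."""
    lam = np.asarray(lam, dtype=float)
    inside = np.clip(4.0 * sigma**2 - lam**2, 0.0, None)
    return np.sqrt(inside) / (2.0 * np.pi * sigma**2)


def empirical_density(n, sigma, n_matrices, n_bins, rng):
    """Histogram of the eigenvalues of several independent matrices."""
    eig = np.concatenate([np.linalg.eigvalsh(sample_goe(n, sigma, rng))
                          for _ in range(n_matrices)])
    hist, edges = np.histogram(eig, bins=n_bins,
                               range=(-2.5 * sigma, 2.5 * sigma),
                               density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, hist


rng = np.random.default_rng(0)
sigma = 1.0
for n in (20, 80, 320):
    centers, hist = empirical_density(n, sigma, n_matrices=20,
                                      n_bins=25, rng=rng)
    deviation = np.abs(hist - semicircle(centers, sigma))
    bulk = np.abs(centers) < 1.5 * sigma
    print(n, "bulk:", deviation[bulk].max(), "edges:", deviation[~bulk].max())
```

Passing `centers` and `hist` to any plotting library, together with `semicircle` evaluated on a fine grid, gives the comparison asked for in the text. The printed deviations separate the bulk from the region near the edges, which is one way to look at where convergence is slower.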
 
 
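The last step, locating the zero of the annealed complexity, can be sketched as follows. The closed form used for <math>I_p(\epsilon)</math> is the standard logarithmic integral of the semicircle law; it is stated here as an assumption and cross-checked against a direct numerical evaluation of the defining integral, after the substitution <math>x = \epsilon/\epsilon_{\text{th}} + \sin\theta</math>. The bisection assumes <math>p \geq 3</math> and searches between <math>2\epsilon_{\text{th}}</math> and <math>\epsilon_{\text{th}}</math>, an interval chosen for illustration; all function names are hypothetical.

```python
import math


def eps_threshold(p):
    return -math.sqrt(2.0 * (p - 1) / p)


def I_p_closed_form(eps, p):
    """Assumed closed form of I_p, with y = |eps / eps_th|."""
    y = abs(eps / eps_threshold(p))
    value = y * y - 0.5 - math.log(2.0)
    if y > 1.0:
        r = math.sqrt(y * y - 1.0)
        value += -y * r + math.log(y + r)
    return value


def I_p_numeric(eps, p, n=20000):
    """Midpoint rule for (2/pi) * int cos^2(theta) log|y + sin(theta)|."""
    y = eps / eps_threshold(p)
    h = math.pi / n
    total = 0.0
    for j in range(n):
        theta = -0.5 * math.pi + (j + 0.5) * h
        z = abs(y + math.sin(theta))
        if z > 0.0:  # skip an exact hit of the integrable log singularity
            total += math.cos(theta) ** 2 * math.log(z)
    return 2.0 / math.pi * total * h


def annealed_complexity(eps, p):
    return (0.5 * math.log(4.0 * math.e * (p - 1))
            - eps**2 + I_p_closed_form(eps, p))


def zero_of_complexity(p, tol=1e-12):
    """Bisection on [2 eps_th, eps_th], where the complexity changes sign."""
    lo, hi = 2.0 * eps_threshold(p), eps_threshold(p)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if annealed_complexity(mid, p) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


p = 3
print("threshold energy density:", eps_threshold(p))
print("zero of the complexity:  ", zero_of_complexity(p))
print("closed form vs numeric:  ",
      I_p_closed_form(-1.3, p), I_p_numeric(-1.3, p))
```

Evaluating `annealed_complexity` on a grid of energy densities between the zero and the threshold gives the curve to plot.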
Check out: key concepts
Gradient descent, rugged landscapes, metastable states, Hessian matrices, random matrix theory, landscape’s complexity.