T-5: Difference between revisions
| (118 intermediate revisions by the same user not shown) | |||
| Line 1: | Line 1: | ||
| <strong>Goal: </strong>   | <strong>Goal: </strong>   | ||
| In this set of problems, we characterize the energy landscape of the spherical <math>p</math>-spin by studying its metastable states (local minima). | |||
| <br> | |||
| <strong>Techniques: </strong> Kac-Rice formula, conditional probabilities. | |||
| <br> | <br> | ||
| == Dynamics, optimization, trapping local minima == | |||
| [[File:Landscapes-GDD.png|thumb|right|x200px|Non-rugged vs rugged energy landscapes.]] | |||
| So far we have discussed the equilibrium properties of disordered systems, that are encoded in their partition function/free energy. When a system (following Langevin, Monte Carlo dynamics) equilibrates at sufficiently large times, its long-time properties are captured by these equilibrium calculations. In glassy systems the equilibration timescales are extremely large: for very large timescales the system  does not visit equilibrium configurations, but rather metastable states. | |||
| <ul> | |||
| <li> '''Rugged landscapes.''' Consider the spherical <math>p</math>-spin model: <math>E(\vec{\sigma})</math> is an <ins> energy landscape </ins>. It is a random function on configuration space (the surface <math> \mathcal{S}_N </math> of the sphere). This landscape has its global minimum(a) at the ground state configuration(s): the energy density of the ground state(s) can be obtained studying the partition function <math> Z </math> in the limit <math> \beta \to \infty </math>. Besides the ground state(s), the energy landscape can have other local minima; fully-connected models of glasses are characterized by the fact that there are plenty of these local minima: the energy landscape is <ins> rugged</ins>, see the sketch.  | |||
| </li> | |||
| <br> | <br> | ||
| <li>  | |||
| <li> '''Optimization by gradient descent.''' Suppose that we are interested in finding the configurations of minimal energy, starting from an arbitrary configuration <math>\vec{\sigma}_0</math>: we can implement a dynamics in which we progressively update the configuration moving towards lower and lower values of the energy, hoping to eventually converge to the ground state(s). The simplest dynamics of this sort is <ins>gradient descent</ins>, | |||
| <center> <math> | <center> <math> | ||
| \frac{d \vec{\sigma}(t)}{dt}=- \nabla_{\perp} E(\vec{\sigma})   | \frac{d \vec{\sigma}(t)}{dt}=- \nabla_{\perp} E(\vec{\sigma})   | ||
| </math> </center> | </math> </center> | ||
| where  | where <math>\nabla_{\perp} E(\vec{\sigma})</math> is the gradient of the landscape restricted to the sphere. The dynamics stops when it reaches a  <ins> stationary point </ins>, a configuration where <math>  \nabla_\perp E(\vec{\sigma})=0</math>. If the landscape has a convex structure, this will be the ground state; if the energy landscape is very non-convex like in glasses, the end point of this algorithm will be a local minimum at energies much higher than the ground state (see sketch). | ||
| </li> | </li> | ||
| <br> | <br> | ||
| <li>  | <li> '''Stationary points and complexity.''' To guess where gradient descent dynamics (or <ins> Langevin dynamics </ins>) are expected to converge, it is useful to understand the distribution of the stationary points, i.e. the number <math> \mathcal{N}(\epsilon)</math> of such configuration having a given energy density <math> \epsilon </math>. In fully-connected models, this quantity has an exponential scaling, <math> \mathcal{N}(\epsilon) \sim \text{exp}\left(N \Sigma(\epsilon) \right)</math>, where  <math>  \Sigma(\epsilon)</math> is the landscape’s <ins>complexity</ins>. <sup>[[#Notes|[*] ]]</sup>. Stationary points can be stable (local minima), or unstable (saddles or local maxima): their stability is encoded in the spectrum of the <ins> Hessian matrix </ins> <math>\nabla_{\perp}^2 E(\vec{\sigma})</math>: when all the eigenvalues of the Hessian are positive, the point is a local minimum (and a saddle otherwise). | ||
| </li> | </li> | ||
| </ul> | |||
| <br> | |||
| < | <div style="font-size:89%"> | ||
| : <small>[*]</small> - This quantity looks similar to the entropy <math> S(\epsilon) </math> we computed for the REM in Problem 1. However, while the entropy counts all configurations at a given energy density, the complexity <math> \Sigma(\epsilon) </math> accounts only for the stationary points. | |||
| </div> | |||
| </math>  | |||
| </ | |||
| <br> | <br> | ||
| == | == Problems == | ||
| In this  | In this and in the following problem, we discuss the computation of the annealed complexity of the spherical <math>p</math>-spin model, which is defined by   | ||
| <center> <math> | <center> <math> | ||
| \Sigma_{\text{a}}(\epsilon)= \lim_{N \to \infty}\frac{1}{N}\log \overline{\mathcal{N}(\epsilon)} , \quad \quad \mathcal{N}(\epsilon)= \left\{ \text{number stat. points of energy density }  \epsilon\right\} | \Sigma_{\text{a}}(\epsilon)= \lim_{N \to \infty}\frac{1}{N}\log \overline{\mathcal{N}(\epsilon)} , \quad \quad \mathcal{N}(\epsilon)= \left\{ \text{number stat. points of energy density }  \epsilon\right\} | ||
| </math> </center> | </math> </center> | ||
| === Problem 5: the Kac-Rice formula and the complexity === | |||
| Line 49: | Line 56: | ||
| </math> | </math> | ||
| </center> | </center> | ||
| where <math>  p_0(x) </math> is the probability that <math> x </math> is a zero of the function. | where <math>  p_0(x) </math> is the probability density that <math> x </math> is a zero of the function. | ||
| In particular, why is the derivative of the function appearing in this formula? Consider now the number of stationary points <math> \mathcal{N}(\epsilon)</math> of the <math>p</math>-spin energy landscape, which satisfy <math> \nabla_\perp E(\vec{\sigma})=0</math>. Justify why the generalization of the formula above gives   | In particular, why is the derivative of the function appearing in this formula? Consider now the number of stationary points <math> \mathcal{N}(\epsilon)</math> of the <math>p</math>-spin energy landscape, which satisfy <math> \nabla_\perp E(\vec{\sigma})=0</math>. Justify why the generalization of the formula above gives   | ||
| <center> | <center> | ||
| <math> | <math> | ||
| \overline{\mathcal{N}(\epsilon)}= \int_{ | \overline{\mathcal{N}(\epsilon)}= \int_{\mathcal{S}_N} d \vec{\sigma} \,p_{\epsilon}(\vec{\sigma})  , \quad \quad p_{\epsilon}(\vec{\sigma})=\overline{|\text{det} \nabla_\perp^2 E (\vec{\sigma})|\,\, \delta(\nabla_\perp E(\vec{\sigma})=0) \, \,\delta(E(\vec{\sigma})- N \epsilon)} | ||
| </math> | </math> | ||
| </center> | </center> | ||
| where <math> p_{\epsilon}(\vec{\sigma})</math> is the probability that <math> \vec \sigma </math> is a stationary point, <math> \nabla_\perp^2 E (\vec{\sigma}) </math> is the Hessian matrix of the function  <math> E (\vec{\sigma}) </math> restricted to the sphere. | where <math> p_{\epsilon}(\vec{\sigma})</math> is the probability density that <math> \vec \sigma</math> is a stationary point of energy density <math> \epsilon </math>, and <math> \nabla_\perp^2 E (\vec{\sigma}) </math> is the Hessian matrix of the function  <math> E (\vec{\sigma}) </math> restricted to the sphere. | ||
| </li> | </li> | ||
| </ol> | </ol> | ||
| Line 62: | Line 69: | ||
| <ol start="2"> | <ol start="2"> | ||
| <li><em> Statistical rotational invariance.</em> Recall the expression of the correlations of the energy landscape of the <math>p</math>-spin computed in Problem  | <li><em> Statistical rotational invariance.</em> Recall the expression of the correlations of the energy landscape of the <math>p</math>-spin computed in Problem 3.1: in which sense the correlation function is rotationally invariant? Justify why rotational invariance implies that   | ||
| <center> | <center> | ||
| <math> | <math> | ||
| \overline{\mathcal{N}(\epsilon)}= (2 \pi e)^{\frac{N}{2}} \, | \overline{\mathcal{N}(\epsilon)}= (2 \pi e)^{\frac{N}{2}} \, p_{\epsilon}(\vec{1}) | ||
| </math> | </math> | ||
| </center> | </center> | ||
| where <math> \vec{1}=(1,1,1, \cdots, 1) </math>. Where does the prefactor arise from? | where <math> \vec{1}=(1,1,1, \cdots, 1) </math> is one fixed vector belonging to the surface of the sphere. Where does the prefactor arise from? | ||
| </li> | </li> | ||
| </ol> | </ol> | ||
| Line 74: | Line 81: | ||
| <ol start="3"> | <ol start="3"> | ||
| <li><em> Gaussianity and correlations.</em>  | <li><em> Gaussianity and correlations.</em>   | ||
| <ul> | |||
| < | <li> Determine the distribution of the quantity <math> E (\vec{1})</math>. </li> | ||
| <li> The entries of <math>\nabla_\perp E (\vec{1}), \nabla^2_\perp E (\vec{1})</math> are Gaussian variables. One can show that the <math> N-1 </math> components of <math> \nabla_\perp E (\vec{1})</math> are uncorrelated to <math> E (\vec{1}), \nabla^2_\perp E (\vec{1})</math>; they have zero mean and covariances | |||
| <math> | <math> | ||
| \overline{ | \overline{(\nabla_\perp E)_\alpha  \, (\nabla_\perp E)_\beta}=  p \, \delta_{\alpha \beta}+O\left(\frac{1}{N} \right). | ||
| </math> | </math> | ||
| </ | Compute the probability density that <math> \nabla_\perp E (\vec{1})=0</math>. </li> | ||
| <li> The <math>(N-1)\times (N-1) </math> matrix <math> \nabla_\perp^2 E (\vec{\sigma}) </math> conditioned to the fact that <math> E(\vec 1)=N \epsilon </math> can be written as   | |||
| </li> | |||
| < | |||
| <center> | <center> | ||
| <math> | <math> | ||
| [\nabla_\perp^2 E(\vec{1})]_{\alpha \beta}=  | [\nabla_\perp^2 E(\vec{1})]_{\alpha \beta}=  M_{\alpha \beta}- p  \epsilon\, \delta_{\alpha \beta}, | ||
| </math> | </math> | ||
| </center> | </center> | ||
| where the matrix <math>  | where the matrix <math> M </math> has random entries with zero average and correlations | ||
| <math> | <math> | ||
| \overline{{ | \overline{{M}_{\alpha \beta} \, {M}_{\gamma \delta}}= \frac{p (p-1)}{ N} \left( \delta_{\alpha \gamma} \delta_{\beta \delta}+ \delta_{\alpha \delta} \delta_{\beta \gamma}\right) | ||
| </math> | </math> | ||
| Combining this with the results above, show that | |||
| <center> | <center> | ||
| <math> | <math> | ||
| \overline{\mathcal{N}(\epsilon)}= (2 \pi e)^{\frac{N}{2}} \,\frac{1}{(2 \pi \, p)^{\frac{N-1}{2}}}\; \sqrt{\frac{N}{2 \pi}} e^{-\frac{N \epsilon^2}{2}}\;\overline{|\text{det} \left( M- p  \epsilon \mathbb{I} \right)|} | |||
| </math> | </math> | ||
| </center> | </center> | ||
| </ul> | |||
| </li> | </li> | ||
| </ol> | </ol> | ||
| <br> | <br> | ||
| == Check out: key concepts == | |||
| Gradient descent, rugged landscapes, metastable states, landscape’s complexity. | |||
| < | <!----------OLD--------- | ||
| :  | First, a few notions of geometry: we define with <math> \hat \Pi(\vec{\sigma}) </math> the projector on  the tangent plane to the sphere at <math> \vec{\sigma}</math>: this is the plane orthogonal to the vector <math> \vec{\sigma}</math>. The gradient <math> \nabla_\perp E(\vec{\sigma}) </math> is a <math> (N-1)</math>-dimensional vector that is obtained projecting the gradient <math> [\nabla E(\vec{\sigma})]_i=\partial E/\partial \sigma_i </math> on the tangent plane, <math> \nabla_\perp E(\vec{\sigma})=\hat \Pi \nabla E(\vec{\sigma})</math>. The Hessian <math> \nabla^2_\perp E(\vec{\sigma}) </math> is a <math> (N-1) \times (N-1)</math>-dimensional matrix that is obtained from the Hessian <math> [\nabla^2 E(\vec{\sigma})]_{ij}=\partial^2 E/\partial \sigma_i \partial \sigma_j </math> as <math> \nabla^2_\perp E(\vec{\sigma})= \hat \Pi(\vec{\sigma}) \, \nabla^2 E(\vec{\sigma}) \, \hat\Pi(\vec{\sigma}) - N^{-1}\nabla E(\vec{\sigma}) \cdot \vec{\sigma} \mathbb{I}</math> where <math> \mathbb{I}</math> is the identity matrix. <br>--> | ||
| < | |||
Latest revision as of 11:00, 12 March 2025
Goal:  
In this set of problems, we characterize the energy landscape of the spherical -spin by studying its metastable states (local minima).
Techniques:  Kac-Rice formula, conditional probabilities.
Dynamics, optimization, trapping local minima
So far we have discussed the equilibrium properties of disordered systems, that are encoded in their partition function/free energy. When a system (following Langevin, Monte Carlo dynamics) equilibrates at sufficiently large times, its long-time properties are captured by these equilibrium calculations. In glassy systems the equilibration timescales are extremely large: for very large timescales the system does not visit equilibrium configurations, but rather metastable states.
- Rugged landscapes. Consider the spherical -spin model: is an energy landscape . It is a random function on configuration space (the surface of the sphere). This landscape has its global minimum(a) at the ground state configuration(s): the energy density of the ground state(s) can be obtained studying the partition function in the limit . Besides the ground state(s), the energy landscape can have other local minima; fully-connected models of glasses are characterized by the fact that there are plenty of these local minima: the energy landscape is rugged, see the sketch.
-  Optimization by gradient descent. Suppose that we are interested in finding the configurations of minimal energy, starting from an arbitrary configuration : we can implement a dynamics in which we progressively update the configuration moving towards lower and lower values of the energy, hoping to eventually converge to the ground state(s). The simplest dynamics of this sort is gradient descent,
where is the gradient of the landscape restricted to the sphere. The dynamics stops when it reaches a stationary point , a configuration where Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \nabla_\perp E(\vec{\sigma})=0} . If the landscape has a convex structure, this will be the ground state; if the energy landscape is very non-convex like in glasses, the end point of this algorithm will be a local minimum at energies much higher than the ground state (see sketch). 
- Stationary points and complexity. To guess where gradient descent dynamics (or Langevin dynamics ) are expected to converge, it is useful to understand the distribution of the stationary points, i.e. the number Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \mathcal{N}(\epsilon)} of such configuration having a given energy density Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \epsilon } . In fully-connected models, this quantity has an exponential scaling, Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \mathcal{N}(\epsilon) \sim \text{exp}\left(N \Sigma(\epsilon) \right)} , where Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \Sigma(\epsilon)} is the landscape’s complexity. [*] . Stationary points can be stable (local minima), or unstable (saddles or local maxima): their stability is encoded in the spectrum of the Hessian matrix : when all the eigenvalues of the Hessian are positive, the point is a local minimum (and a saddle otherwise).
- [*] - This quantity looks similar to the entropy we computed for the REM in Problem 1. However, while the entropy counts all configurations at a given energy density, the complexity accounts only for the stationary points.
Problems
In this and in the following problem, we discuss the computation of the annealed complexity of the spherical -spin model, which is defined by
Problem 5: the Kac-Rice formula and the complexity
-   The Kac-Rice formula. Consider first a random function of one variable  defined on an interval , and let  be the number of points  such that Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle  f(x)=0}
. Justify why
Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \overline{\mathcal{N}}= \int_a^b dx \,p_0(x) , \quad \quad p_0(x)=\overline{\delta(f(x)) |f'(x)|} } where Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle p_0(x) } is the probability density that Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle x } is a zero of the function. In particular, why is the derivative of the function appearing in this formula? Consider now the number of stationary points Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \mathcal{N}(\epsilon)} of the Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle p} -spin energy landscape, which satisfy Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \nabla_\perp E(\vec{\sigma})=0} . Justify why the generalization of the formula above gives Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \overline{\mathcal{N}(\epsilon)}= \int_{\mathcal{S}_N} d \vec{\sigma} \,p_{\epsilon}(\vec{\sigma}) , \quad \quad p_{\epsilon}(\vec{\sigma})=\overline{|\text{det} \nabla_\perp^2 E (\vec{\sigma})|\,\, \delta(\nabla_\perp E(\vec{\sigma})=0) \, \,\delta(E(\vec{\sigma})- N \epsilon)} } where Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle p_{\epsilon}(\vec{\sigma})} is the probability density that Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \vec \sigma} is a stationary point of energy density Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \epsilon } , and Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \nabla_\perp^2 E (\vec{\sigma}) } is the Hessian matrix of the function Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle E (\vec{\sigma}) } restricted to the sphere. 
-  Statistical rotational invariance. Recall the expression of the correlations of the energy landscape of the Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle p}
-spin computed in Problem 3.1: in which sense the correlation function is rotationally invariant? Justify why rotational invariance implies that 
Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \overline{\mathcal{N}(\epsilon)}= (2 \pi e)^{\frac{N}{2}} \, p_{\epsilon}(\vec{1}) } where Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \vec{1}=(1,1,1, \cdots, 1) } is one fixed vector belonging to the surface of the sphere. Where does the prefactor arise from? 
-  Gaussianity and correlations. 
- Determine the distribution of the quantity Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle E (\vec{1})} .
- The entries of Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \nabla_\perp E (\vec{1}), \nabla^2_\perp E (\vec{1})} are Gaussian variables. One can show that the Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle N-1 } components of Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \nabla_\perp E (\vec{1})} are uncorrelated to Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle E (\vec{1}), \nabla^2_\perp E (\vec{1})} ; they have zero mean and covariances Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \overline{(\nabla_\perp E)_\alpha \, (\nabla_\perp E)_\beta}= p \, \delta_{\alpha \beta}+O\left(\frac{1}{N} \right). } Compute the probability density that Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \nabla_\perp E (\vec{1})=0} .
-  The Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle (N-1)\times (N-1) }
 matrix Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle  \nabla_\perp^2 E (\vec{\sigma}) }
 conditioned to the fact that Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle  E(\vec 1)=N \epsilon }
 can be written as 
Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle [\nabla_\perp^2 E(\vec{1})]_{\alpha \beta}= M_{\alpha \beta}- p \epsilon\, \delta_{\alpha \beta}, } where the matrix has random entries with zero average and correlations Combining this with the results above, show that 
 
Check out: key concepts
Gradient descent, rugged landscapes, metastable states, landscape’s complexity.
