T-5
Revision as of 18:37, 29 December 2023
Goal:
So far we have discussed the equilibrium properties of disordered systems, which are encoded in their partition function and free energy. In this set of problems, we characterize the energy landscape of a prototypical model, the spherical $p$-spin.
Key concepts: Langevin dynamics, gradient descent, out-of-equilibrium dynamics, metastable states, Hessian matrices, random matrix theory.
Dynamics, optimization, trapping in local minima
- Energy landscapes. Consider the spherical $p$-spin model. The function $E[\vec{\sigma}]$ defines the energy landscape of the model: this is a random function on configuration space, which for this model is the sphere of configurations satisfying $\sum_{i=1}^N \sigma_i^2 = N$. This landscape has its global minima in the ground state configurations: the energy density of the ground states can be obtained by studying the partition function in the limit $T \to 0$. Besides the ground state(s), the energy landscape can have other local minima; models of glasses are characterized by the fact that there are plenty of these local minima, see the sketch.
- Gradient descent and stationary points. Suppose that we are interested in finding the configurations of minimal energy of some model with energy landscape $E[\vec{\sigma}]$, starting from an arbitrary initial configuration $\vec{\sigma}(t=0)$: we can think of a dynamics in which we progressively update the configuration of the system, moving towards lower and lower values of the energy, hoping to eventually converge to the ground state(s). The simplest dynamics of this sort is gradient descent, where the configuration changes in time moving in the direction opposite to the gradient of the energy landscape:
\[ \frac{d \vec{\sigma}(t)}{dt} = -\nabla E[\vec{\sigma}(t)]. \]
Under this dynamics, the system descends in the energy landscape towards configurations of lower and lower energy, until it reaches a stationary point, i.e. a configuration where $\nabla E[\vec{\sigma}] = 0$: at that point, the dynamics stops. If the energy landscape has a simple, convex structure, the stationary point will be the ground state one is seeking; however, if the energy landscape is very non-convex, as in glasses, the end point of this algorithm will likely be a local minimum at energies much higher than the ground state, see the sketch.
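The gradient-descent dynamics described above is easy to sketch numerically. Below is a minimal, illustrative Python implementation for a spherical 3-spin landscape; the coupling normalization, system size, step size, and the projection used to enforce the spherical constraint are all choices of this sketch, not prescribed by the text.

```python
import numpy as np

# Illustrative gradient descent on a spherical 3-spin energy landscape.
# The normalization of the couplings and the projection step are
# conventions of this sketch.
rng = np.random.default_rng(0)
N = 40
# Random Gaussian couplings; the 1/N standard deviation is one possible
# normalization, chosen so the discretized dynamics is well-behaved.
J = rng.normal(0.0, 1.0 / N, size=(N, N, N))

def energy(s):
    # E[s] = - sum_{ijk} J_ijk s_i s_j s_k  (unsymmetrized tensor)
    return -np.einsum('ijk,i,j,k->', J, s, s, s)

def gradient(s):
    # Differentiate E with respect to each of the three occurrences of s.
    return -(np.einsum('ijk,j,k->i', J, s, s)
             + np.einsum('ijk,i,k->j', J, s, s)
             + np.einsum('ijk,i,j->k', J, s, s))

# Random initial configuration on the sphere |s|^2 = N.
s = rng.normal(size=N)
s *= np.sqrt(N) / np.linalg.norm(s)

dt = 0.01
energies = [energy(s) / N]
for _ in range(1000):
    s = s - dt * gradient(s)             # move against the gradient
    s *= np.sqrt(N) / np.linalg.norm(s)  # project back onto the sphere
    energies.append(energy(s) / N)

# The energy density decreases until the dynamics stalls near a
# stationary point, typically a local minimum rather than the ground state.
print(energies[0], energies[-1])
```

At zero noise the dynamics can only go downhill, so the final energy density lies below the initial one; whether it is anywhere near the ground state depends entirely on the structure of the landscape.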
- Noise, Langevin dynamics and activation. How can one modify the dynamics to escape from a given local minimum and explore other regions of the energy landscape? One possibility is to add some stochasticity (or noise), i.e. some random terms $\vec{\eta}(t)$ that kick the system in random directions in configuration space, towards which the energy may increase instead of decreasing:
\[ \frac{d \vec{\sigma}(t)}{dt} = -\nabla E[\vec{\sigma}(t)] + \vec{\eta}(t). \]
The simplest choice is to take $\vec{\eta}(t)$ to be a Gaussian vector at each time $t$, uncorrelated from the vectors at other times $t' \neq t$, with zero average and some constant variance, $\langle \eta_i(t) \eta_j(t') \rangle = 2 T \delta_{ij} \delta(t - t')$. This variance, which measures the strength of the noisy kicks, can be interpreted as a temperature $T$: the resulting dynamics is known as Langevin dynamics.
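To see activation at work in the simplest possible setting, one can integrate the discretized Langevin equation in a one-dimensional double well. In this sketch the potential $V(x) = (x^2 - 1)^2$, the temperature, and the time step are all arbitrary illustrative choices; the noise increment $\sqrt{2 T\, dt}\,\xi$ with $\xi \sim \mathcal{N}(0,1)$ implements the zero-mean, temperature-proportional variance described above.

```python
import numpy as np

# Langevin dynamics in the double well V(x) = (x^2 - 1)^2.
# Discretized update: x <- x - V'(x) dt + sqrt(2 T dt) * xi, xi ~ N(0,1).
# Potential, temperature and time step are illustrative choices.
rng = np.random.default_rng(1)

def V_prime(x):
    return 4.0 * x * (x**2 - 1.0)

def simulate(T, x0=-1.0, dt=1e-3, n_steps=200_000):
    """Return True if the trajectory ever crosses into the right well."""
    x = x0
    crossed = False
    for _ in range(n_steps):
        x += -V_prime(x) * dt + np.sqrt(2.0 * T * dt) * rng.normal()
        if x > 0.5:
            crossed = True
    return crossed

# At T = 0 the dynamics is pure gradient descent and stays trapped in the
# left minimum; at finite T the noise activates barrier crossing.
stuck = simulate(T=0.0)
escaped = simulate(T=0.5)
print(stuck, escaped)
```

The escape time grows exponentially with the ratio of barrier height to temperature (Arrhenius behavior), which is why activated dynamics becomes extremely slow in glassy landscapes where barriers are large.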
Problem 5.1: the Kac-Rice method and the complexity
- The Kac-Rice formula I. Consider first a function $f(x)$ of one variable defined on an interval $[a, b]$, and let $\mathcal{N}$ be the number of points $x^*$ such that $f(x^*) = 0$. One has
\[ \mathcal{N} = \int_a^b dx \, \delta(f(x)) \, |f'(x)|. \]
Why is the derivative of the function appearing in this formula? Justify why, if $f$ is a random function, the average of this number can be written as
\[ \langle \mathcal{N} \rangle = \int_a^b dx \, p_x(0) \, \mathbb{E}\left[ |f'(x)| \,\big|\, f(x) = 0 \right], \]
where $p_x(0)$ is the probability density of $f(x)$ computed at zero, while $\mathbb{E}[A \,|\, B]$ is the expectation value of a random variable $A$ conditioned to the fact that the event $B$ is true.
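The one-dimensional counting formula can be checked numerically by smoothing the Dirac delta into a narrow Gaussian. The test function, interval, and regularization width below are arbitrary choices of this sketch; $f(x) = \sin(x)$ on $[0.5, 10]$ has exactly three zeros ($\pi$, $2\pi$, $3\pi$), and the regularized Kac-Rice integral should count them.

```python
import numpy as np

# Numerical check of N = \int_a^b dx delta(f(x)) |f'(x)| for f(x) = sin(x)
# on [0.5, 10], which contains exactly 3 zeros (pi, 2*pi, 3*pi).
# The Dirac delta is regularized as a narrow Gaussian of width eps.
x = np.linspace(0.5, 10.0, 1_000_001)
dx = x[1] - x[0]
f = np.sin(x)
fprime = np.cos(x)

eps = 1e-3  # regularization width; must be well resolved by the grid
delta_eps = np.exp(-f**2 / (2.0 * eps**2)) / np.sqrt(2.0 * np.pi * eps**2)

count = float(np.sum(delta_eps * np.abs(fprime)) * dx)
print(count)  # close to 3
```

Each zero contributes $\int \delta_\epsilon(f)\,|f'|\,dx \approx 1$ after the change of variables $u = f(x)$, which is exactly the mechanism the problem asks to justify: the $|f'|$ factor compensates the Jacobian of that change of variables.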
- The Kac-Rice formula II. Consider now the number $\mathcal{N}_N$ of stationary points of the $p$-spin energy landscape, which satisfy $\nabla E[\vec{\sigma}] = 0$. Justify why the generalization of the formula above gives
\[ \langle \mathcal{N}_N \rangle = \int d\vec{\sigma} \; p_{\vec{\sigma}}(0) \, \mathbb{E}\left[ \left| \det \nabla^2 E[\vec{\sigma}] \right| \,\big|\, \nabla E[\vec{\sigma}] = 0 \right], \]
where $p_{\vec{\sigma}}(0)$ is the probability density that the gradient vanishes at $\vec{\sigma}$, and $\nabla^2 E$ is the Hessian matrix of the energy landscape.
- The annealed complexity. We now use this formula to compute the annealed complexity $\Sigma$, which is defined by $\Sigma = \lim_{N \to \infty} \frac{1}{N} \log \langle \mathcal{N}_N \rangle$. We do the calculation in three steps.
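As a roadmap (a schematic decomposition, not the full computation), writing $\Omega_N$ for the volume of configuration space and evaluating the integrand at a fixed reference configuration $\vec{\sigma}^0$, the three steps correspond to the three factors of the averaged Kac-Rice formula:

```latex
% Schematic decomposition of the averaged Kac-Rice formula into the
% three steps of the calculation (notation: \Omega_N is the volume of
% configuration space, \sigma^0 a fixed reference configuration).
\[
  \langle \mathcal{N}_N \rangle \;=\;
  \underbrace{\Omega_N}_{\text{Step 1: rotational invariance}}
  \times
  \underbrace{p_{\vec{\sigma}^0}(0)}_{\text{Step 2: Gaussianity}}
  \times
  \underbrace{\mathbb{E}\left[\, \left| \det \nabla^2 E[\vec{\sigma}^0] \right| \;\middle|\; \nabla E[\vec{\sigma}^0] = 0 \,\right]}_{\text{Step 3: random determinant}}
\]
```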
- Step 1: use rotational invariance. Recall the expression of the correlations of the energy landscape of the $p$-spin computed in Problem 2.1: in which sense is the correlation function rotationally invariant? Justify why rotational invariance implies that we can write
\[ \langle \mathcal{N}_N \rangle = \Omega_N \; p_{\vec{\sigma}^0}(0) \, \mathbb{E}\left[ \left| \det \nabla^2 E[\vec{\sigma}^0] \right| \,\big|\, \nabla E[\vec{\sigma}^0] = 0 \right], \]
where $\vec{\sigma}^0$ is one fixed configuration. Where does the prefactor $\Omega_N$ arise from?
- Step 2: use Gaussianity.
- Step 3: random determinants and eigenvalue density.
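For the spherical $p$-spin, Step 3 rests on a random-matrix fact: the Hessian conditioned on being at a stationary point is, up to a shift proportional to the identity, a GOE matrix, so that $\frac{1}{N} \log |\det \nabla^2 E|$ concentrates onto $\int d\lambda \, \rho(\lambda) \log|\lambda|$, with $\rho$ the semicircle eigenvalue density. The following sketch checks the underlying identity numerically for a plain GOE matrix; the size $N$, the seed, and the evaluation point $x = 3$ (outside the spectrum) are choices of this example.

```python
import numpy as np

# For a large GOE matrix M with spectrum filling [-2, 2],
#   (1/N) log |det(M - x)| = mean_i log |lambda_i - x|
# concentrates on  int rho_sc(l) log|l - x| dl  (semicircle density).
# For |x| > 2 this integral has the closed form used below.
rng = np.random.default_rng(2)
N = 1000
A = rng.normal(size=(N, N))
M = (A + A.T) / np.sqrt(2.0 * N)   # GOE normalization: spectral edges at +/- 2
eigs = np.linalg.eigvalsh(M)

x = 3.0  # evaluation point outside the spectrum
lhs = float(np.mean(np.log(np.abs(eigs - x))))

# Logarithmic potential of the semicircle law, valid for |x| > 2.
sq = np.sqrt(x**2 - 4.0)
rhs = x**2 / 4.0 - x * sq / 4.0 + np.log((x + sq) / 2.0) - 0.5

print(lhs, rhs)  # the two values agree up to finite-size corrections
```

In the complexity calculation the same integral is evaluated with the spectrum shifted by an energy-dependent amount, which is what couples the counting of stationary points to their energy density and to the stability of the Hessian.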