Extreme Statistics

Generically, finding the distribution of the maximum of a set of random variables is a non-trivial problem, which appears in many contexts, ranging from the maximal height of water in a river to fluctuations in stock markets. We consider $N$ independent random variables $(x_{1}, \dots, x_{N})$ drawn from the same distribution $p (x)$. We denote their maximum by

y_{N} = \max (x_{1}, . . ., x_{N})

It is useful to use the following notations for the cumulative distributions

P^{<} (x) = \int_{- \infty}^{x} d x^{'} p (x^{'}) , \qquad P^{>} (x) = \int_{x}^{+ \infty} d x^{'} p (x^{'})

Let us denote by $q_{N} (y)$ the distribution of $y_{N}$ and by $Q_{N} (y) = Prob (y_{N} < y)$ its cumulative distribution.

Write $Q_{N} (y)$ in terms of $P^{<} (y)$. (Hint: start by writing this relation for $N = 2, 3, \dots$)

This is the fundamental relation of extreme statistics. We analyze its consequences in the large $N$ limit where, analogously to the central limit theorem, extreme statistics display universal features.

In particular, show that in the large $N$ limit we can write

Q_{N} (y) \sim \exp (- N P^{>} (y))
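The large $N$ form above can be checked numerically before deriving it. The sketch below assumes the exponential density with $λ = 1$, so that $P^{>} (y) = e^{- y}$, and uses NumPy. The sample sizes, the grid and the helper name `empirical_cdf_of_max` are illustrative choices, not part of the exercise.

```python
import numpy as np

rng = np.random.default_rng(0)

def empirical_cdf_of_max(n_vars, n_samples, y_grid):
    """Empirical Prob(y_N < y) from n_samples maxima of n_vars rate-1 exponentials."""
    maxima = rng.exponential(1.0, size=(n_samples, n_vars)).max(axis=1)
    return np.array([(maxima < y).mean() for y in y_grid])

N = 200
y_grid = np.linspace(3.0, 9.0, 7)
empirical = empirical_cdf_of_max(N, 20000, y_grid)

# Asymptotic form exp(-N P^>(y)), with P^>(y) = e^{-y} for the rate-1 exponential
asymptotic = np.exp(-N * np.exp(-y_grid))

max_gap = np.abs(empirical - asymptotic).max()
print(max_gap)
```

The printed gap combines the sampling noise with the finite $N$ correction, so it should shrink when both `N` and the number of samples grow.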

In the present exercise, we first study the case of the exponential distribution. In a second step we generalize our results to a larger class of distributions.

Exponential distribution

The exponential distribution is one of the fundamental continuous distributions, and for this reason alone it is worth studying. Among many other places, it appears in the Poisson process. Its density reads:

p (x) = λ \exp (- λ x)

where both $λ$ and $x$ are positive numbers.

Preliminaries: the central limit

Compute the mean value and the variance of this distribution.
Consider $X_{N}$, the sum of $N$ independent, exponentially distributed random variables. How is $X_{N}$ distributed?

We write $X_{N}$ in a more convenient way

X_{N} = a_{N} + b_{N} z

where $a_{N}$ is the location and $b_{N}$ is the width of the distribution of $X_{N}$. Both numbers depend on $N$. Finally, $z$ is a random number whose distribution $π (z)$ becomes independent of $N$ in the large $N$ limit. In other words, the distribution of $X_{N}$ is significantly different from zero when the value of $X_{N}$ is around $a_{N}$, in a region of size $b_{N}$.

According to the central limit theorem, what is the natural choice for $a_{N}$ and $b_{N}$? Write the distribution $π (z)$.
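A quick simulation can confirm a candidate choice of $a_{N}$ and $b_{N}$. The sketch below assumes the usual central-limit scaling, $a_{N} = N / λ$ (sum of the means) and $b_{N} = \sqrt{N} / λ$ (square root of the summed variances). The values of `lam`, `N` and `n_samples` are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)
lam, N, n_samples = 2.0, 200, 20000

# X_N = sum of N independent exponentials of rate lam (NumPy takes scale = 1/lam)
X = rng.exponential(1.0 / lam, size=(n_samples, N)).sum(axis=1)

# Assumed central-limit scaling: location N/lam, width sqrt(N)/lam
a_N = N / lam
b_N = np.sqrt(N) / lam
z = (X - a_N) / b_N

z_mean, z_var = z.mean(), z.var()
print(z_mean, z_var)   # should be close to 0 and 1 if the scaling is right
```

If the scaling is correct, a histogram of `z` should also approach the same bell-shaped curve for every large `N`.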

The Maxima

Consider now the case $λ = 1$.

Write $P^{>} (x)$ and $P^{<} (x)$ . (Remember that $x$ is a positive number.)
Write $Q_{N} (y)$ and $q_{N} (y)$ .
Plot $q_{N} (y)$ for different values of N.
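As a starting point for the plot, the sketch below tabulates $q_{N} (y)$ on a grid. It assumes the product form $Q_{N} (y) = (1 - e^{- y})^{N}$ for independent variables with $λ = 1$, and checks the normalisation and the position of the peak. The grid and the values of `N` are illustrative.

```python
import numpy as np

def q_N(y, N):
    """Density of the maximum of N rate-1 exponentials: d/dy (1 - e^{-y})^N."""
    return N * np.exp(-y) * (1.0 - np.exp(-y)) ** (N - 1)

y = np.linspace(0.0, 30.0, 30001)
dy = y[1] - y[0]

for N in (10, 100, 1000):
    q = q_N(y, N)
    norm = q.sum() * dy        # Riemann sum of the density, should be close to 1
    mode = y[np.argmax(q)]     # grid point where the density peaks
    print(N, round(norm, 4), round(mode, 3), round(np.log(N), 3))
```

Passing `y` and `q_N(y, N)` to a plotting routine such as `matplotlib.pyplot.plot` gives the curves. The peak moves to the right as `N` grows while the shape stays the same.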

We now want to give a natural definition for the numbers $a_{N}$ and $b_{N}$.

Consider $\tilde{y}$ such that $P^{>} (\tilde{y}) = \frac{1}{2}$. If you draw $N$ independent exponential variables, how many variables (on average) will be greater than $\tilde{y}$? Repeat the same exercise with $\tilde{\tilde{y}}$ such that $P^{>} (\tilde{\tilde{y}}) = \frac{2}{3}$.

Justify that $a_{N}$ can be estimated from

P^{>} (a_{N}) = \frac{1}{N}
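The counting argument behind this estimate can be illustrated by simulation: on average, $N P^{>} (y)$ of the $N$ draws exceed a threshold $y$. The sketch below assumes $λ = 1$, so the thresholds are $\log 2$ and $\log N$. The helper `mean_count_above` and the sample sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
N, n_trials = 1000, 5000

def mean_count_above(threshold):
    """Average number of the N rate-1 exponential draws that exceed `threshold`."""
    draws = rng.exponential(1.0, size=(n_trials, N))
    return (draws > threshold).sum(axis=1).mean()

count_half = mean_count_above(np.log(2.0))   # P^>(y) = 1/2, expect about N/2
count_one = mean_count_above(np.log(N))      # P^>(y) = 1/N, expect about 1
print(count_half, count_one)
```

A threshold exceeded by about one variable out of $N$ is a natural estimate of where the maximum sits.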

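Compute $a_{N}$ for the exponential distribution and justify that one can take $b_{N} = 1$, that is, write $y = a_{N} + z$ and study

Q_{N} (y = a_{N} + z)

as a function of $z$. In the large $N$ limit, the distribution $π (z)$ of $z$ becomes independent of $N$.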

Show that in this limit its cumulative takes the form

Π (z) = e^{- e^{- z}}

This is the cumulative distribution of the famous Gumbel distribution.
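The convergence to the Gumbel form can be seen without any sampling. The sketch below assumes $λ = 1$, $a_{N} = \log N$ and the product form $Q_{N} (y) = (1 - e^{- y})^{N}$, so that $Q_{N} (a_{N} + z) = (1 - e^{- z} / N)^{N}$. The grid in `z` is an arbitrary choice.

```python
import numpy as np

z = np.linspace(-2.0, 6.0, 81)
gumbel_cdf = np.exp(-np.exp(-z))

errors = {}
for N in (10, 100, 1000, 10000):
    # Q_N(log N + z) = (1 - e^{-z}/N)^N for the rate-1 exponential
    finite_N = (1.0 - np.exp(-z) / N) ** N
    errors[N] = np.abs(finite_N - gumbel_cdf).max()
    print(N, errors[N])
```

The largest deviation on the grid decreases roughly like $1 / N$.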

Let us remark that the precise definition of $a_{N}$ and $b_{N}$ fixes the mean and the variance of the rescaled distribution $π (z)$. In contrast with the central limit case, the mean will be different from zero and the variance different from one.

Compute the mean, the variance and the asymptotic behavior of the Gumbel distribution. Draw the distribution. Explain why $z = 0$ is a special point.
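An analytical result for the mean and variance can be cross-checked by sampling. The sketch below draws Gumbel variables by inverting the cumulative $Π (z) = e^{- e^{- z}}$ and compares the sample moments with the known values for the standard Gumbel distribution: the Euler-Mascheroni constant for the mean and $π^{2} / 6$ for the variance. The sample size is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(3)

# Inverse-transform sampling: if u is uniform on (0, 1), then
# z = -log(-log(u)) has cumulative distribution exp(-exp(-z)).
u = rng.uniform(size=1_000_000)
z = -np.log(-np.log(u))

sample_mean, sample_var = z.mean(), z.var()
print(sample_mean, np.euler_gamma)     # mean vs Euler-Mascheroni constant
print(sample_var, np.pi ** 2 / 6)      # variance vs pi^2 / 6
```

A histogram of `z` also shows the asymmetry of the distribution: the left tail falls off much faster than the right one.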

Generic case: Universality of the Gumbel distribution

The Gumbel distribution is the limit distribution of the maxima of a large class of distributions. We can say that the Gumbel distribution plays, for extreme statistics, the same role as the Gaussian distribution for the central limit theorem.

By contrast, the behavior of $a_{N}$ and $b_{N}$ as a function of $N$ depends strongly on the particular distribution $p (x)$. We discuss here a family of distributions characterized by a fast decay for large $x$:

p (x) \sim c e^{- x^{α}}

where $α > 0$. The key point is to be able to determine $A (x)$ such that

P^{>} (x) = \exp (- A (x))

For $p (x) = e^{- x}$, show that $A (x) = x$.

Otherwise, $A (x)$ should be determined asymptotically for large $x$.

Show that $A (x) = x^{α} + (α - 1) \log x + . . .$
Show that in general $A (a_{N}) = \log N + . . .$ and compute $a_{N}$ as a function of $α$ for large $N$ .
Show that the distribution of the maximum takes the form

\lim_{N \to \infty} Q_{N} (y = a_{N} + \frac{z}{A^{'} (a_{N})}) = e^{- e^{- z}}

that is, $y_{N} = a_{N} + z / A^{'} (a_{N})$ with $z$ Gumbel distributed.

Identify $b_{N}$ and discuss its behavior as a function of $α$.
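For a concrete case, the sketch below takes $α = 2$ with $p (x) = \frac{2}{\sqrt{π}} e^{- x^{2}}$ on $x > 0$. This specific density is an assumption made here because its tail is the complementary error function, $P^{>} (x) = \mathrm{erfc} (x)$. The code solves $P^{>} (a_{N}) = 1 / N$ by bisection and compares the result with the leading-order estimate $(\log N)^{1 / α}$. The helper name `solve_a_N` and the bracketing interval are illustrative.

```python
import math

def solve_a_N(N, lo=0.0, hi=10.0, iters=100):
    """Bisection for erfc(a) = 1/N, i.e. P^>(a_N) = 1/N for p(x) proportional to exp(-x^2), x > 0."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if math.erfc(mid) > 1.0 / N:   # tail probability still above 1/N: move right
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

for N in (10**2, 10**4, 10**8):
    a_N = solve_a_N(N)
    leading = math.sqrt(math.log(N))   # leading order (log N)^(1/alpha) with alpha = 2
    b_N = 1.0 / (2.0 * a_N)            # 1/A'(a_N), keeping only A'(x) ~ alpha x^(alpha-1)
    print(N, a_N, leading, b_N)
```

The exact $a_{N}$ approaches the leading-order estimate only slowly, because the corrections are logarithmic in $\log N$. For $α = 2$ the width $b_{N}$ shrinks as $N$ grows.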

If the distribution $p (x)$ is defined on the entire real axis and is characterized by the same fast decay on both sides, it is easy to generalize this result to the distribution of the minima.

Write the Gumbel distribution for the minima.

Minimum of exponential random numbers

The Gumbel distribution is not the only limit distribution for the extremes. Consider the simple case of the minima of the exponential distribution.

Show analytically that the minimum $x = \min (x_{1}, \dots, x_{N})$ of $N$ independent exponential random numbers with parameters $λ_{1}, \dots, λ_{N}$ is again an exponential random number with parameter $λ_{1} + \dots + λ_{N}$:

$π (x) = (λ_{1} + \dots + λ_{N}) \exp (- (λ_{1} + \dots + λ_{N}) x)$
Program this in Python, produce a histogram and compare with the exact result.
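One possible way to set up this comparison is sketched below, using NumPy only. The rates in `rates`, the number of samples and the binning are hypothetical choices. The histogram is normalised by the total number of samples so that it can be compared directly with $π (x)$ at the bin centres.

```python
import numpy as np

rng = np.random.default_rng(4)
rates = np.array([0.5, 1.0, 2.5])      # hypothetical lambda_1, ..., lambda_N
n_samples = 200_000

# One column per variable; NumPy's exponential takes the scale 1/lambda.
draws = rng.exponential(1.0 / rates, size=(n_samples, rates.size))
minima = draws.min(axis=1)

total_rate = rates.sum()
hist, edges = np.histogram(minima, bins=50, range=(0.0, 2.0))
centers = 0.5 * (edges[:-1] + edges[1:])
width = edges[1] - edges[0]
hist_density = hist / (n_samples * width)            # normalised by all samples
exact = total_rate * np.exp(-total_rate * centers)   # pi(x) at the bin centres

max_gap = np.abs(hist_density - exact).max()
print(minima.mean(), 1.0 / total_rate)
print(max_gap)
```

For a visual comparison, `hist_density` and `exact` can be plotted against `centers`, for example with `matplotlib.pyplot`.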

Look up which limit distributions are possible for the extremes of independent and identically distributed variables.

T-II-3
