So, what’s ergodic theory all about?

The idea is that we are given a system together with some operation {T} on it. For instance, {T} could be a homeomorphism of a topological space, i.e. a discrete dynamical system. We are interested in studying the iterates of this process. In many cases, averaging over the iterates yields in the limit something that is actually invariant under the process.

For instance, suppose {T} is a measure-preserving transformation of a measure space {(X, \mu)} (which means if {E} is measurable, then so is {T^{-1}(E)} and {\mu(T^{-1}(E)) = \mu(E)}). How might one arise? Well, suppose {M} is a compact symplectic manifold, {\xi} a Hamiltonian vector field on {M}, and {dV} the volume form induced by the symplectic form. Then the flow maps {\phi_t: M \rightarrow M} of {\xi} leave the volume form {dV} invariant (Liouville's theorem), so each diffeomorphism {\phi_t} is a measure-preserving transformation for the measure induced by the volume form. Anyway, back to the general story. The transformation {T} acts on a function {f} by

\displaystyle Tf(x) := f(T(x)).

The Birkhoff ergodic theorem states that the averages

\displaystyle \frac{1}{N} \sum_{i=0}^{N-1} T^i f

converge a.e. to a function invariant under {T}, provided {f \in L^1(\mu)}. In many interesting cases, the invariant limit will actually be constant a.e. For instance, this is guaranteed if the transformation {T} is ergodic, i.e. has no nontrivial invariant sets: every measurable {E} with {T^{-1}(E) = E} has either measure zero or complement of measure zero. There are many spectacular applications to number theory of this result, e.g. the existence of Khintchine’s constant. Cf. also this post of Harrison Brown.
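To get a feel for these averages, here is a minimal numerical sketch. The choices are mine and purely illustrative: {T} is the rotation {x \mapsto x + \alpha \pmod 1} of the circle {[0,1)} with {\alpha = \sqrt{2} - 1} (an irrational rotation, which is ergodic for Lebesgue measure), and {f(x) = \cos(2\pi x)}, whose integral over {[0,1)} is zero. The helper names `birkhoff_average`, `rotation`, and `observable` are hypothetical.

```python
import math

def birkhoff_average(f, T, x0, n):
    """Return (1/n) * sum_{i=0}^{n-1} f(T^i(x0))."""
    total = 0.0
    x = x0
    for _ in range(n):
        total += f(x)
        x = T(x)
    return total / n

# Illustrative assumption: an irrational rotation angle.
ALPHA = math.sqrt(2) - 1

def rotation(x):
    """The map T: rotate the circle [0, 1) by ALPHA."""
    return (x + ALPHA) % 1.0

def observable(x):
    """The function f; its integral over [0, 1) is 0."""
    return math.cos(2 * math.pi * x)

# The averages should drift towards the space average 0 as n grows.
for n in (10, 1000, 100000):
    print(n, birkhoff_average(observable, rotation, 0.1, n))
```

The printed averages should shrink towards zero, the integral of {f}, as the number of iterates grows. Of course, a finite computation only suggests the limit; it proves nothing.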

Today, I’d like to talk about a simpler result in functional analysis. The idea is to replace {f \in L^1(\mu)} with {f \in L^2(\mu)}, and a.e. convergence with {L^2} convergence. The payoff of this change is that we can work more generally with operators on Hilbert spaces.

This is the von Neumann ergodic theorem:

Theorem 1

Let {U} be a unitary operator on a Hilbert space {H}. Consider the closed subspace {F} of vectors of {H} fixed by {U}, and let {P: H \rightarrow F} be the orthogonal projection. Then for all {h \in H}, with convergence in the norm of {H},

\displaystyle \lim_{N \rightarrow \infty} \frac{1}{N} \sum_{i=0}^{N-1} U^i h = Ph .
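Before the proof, here is a small finite-dimensional sanity check, under assumptions of my own choosing: {U} is the rotation of three-dimensional space about the {z}-axis by an angle `THETA` (its matrix is unitary on {\mathbb{C}^3}, and we only feed it real vectors), so {F} is the {z}-axis and {P} keeps only the {z}-coordinate. The names `apply_U`, `project_P`, `ergodic_average`, and `distance` are hypothetical helpers for this sketch.

```python
import math

# Illustrative assumption: rotation angle in radians, not a multiple of 2*pi.
THETA = 1.0

def apply_U(v):
    """Rotate v = (x, y, z) about the z-axis by THETA."""
    x, y, z = v
    c, s = math.cos(THETA), math.sin(THETA)
    return (c * x - s * y, s * x + c * y, z)

def project_P(v):
    """Orthogonal projection onto the fixed space F (the z-axis)."""
    return (0.0, 0.0, v[2])

def ergodic_average(h, n):
    """Return (1/n) * sum_{i=0}^{n-1} U^i h."""
    total = [0.0, 0.0, 0.0]
    v = h
    for _ in range(n):
        total = [t + c for t, c in zip(total, v)]
        v = apply_U(v)
    return tuple(t / n for t in total)

def distance(u, v):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

h = (1.0, 2.0, 3.0)
# The distance between the N-th average and Ph should decay roughly like 1/N.
for n in (1, 10, 100, 10000):
    print(n, distance(ergodic_average(h, n), project_P(h)))
```

The component of {h} along the {z}-axis is untouched by the averaging, while the component in the {xy}-plane is averaged away.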



The proof of this result is actually extremely simple: decompose {H} into two parts, and verify the claim for each!

One of these components is {F}. The other is the closed span {W} of {\{Ug - g\}_{g \in H}}. I claim that {H = F \oplus W}; once this is known, it will be easy to verify the theorem separately for {h \in F} and {h \in W}. To prove the claim, suppose first that {f \in F}. Then for all {g \in H}:

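\displaystyle ( Ug - g, f) = ( Ug, f) - (g, f) = (g,U^{-1}f)-(g,f) =(g,f)-(g,f)=0,

where we used that the adjoint of the unitary {U} is {U^{-1}}, and that {Uf = f} implies {U^{-1}f = f}. Since the vectors {Ug - g} span a dense subspace of {W}, this shows that {W \subset F^{\perp}}. Conversely, if {x \in W^{\perp}}, then for any {g \in H}

\displaystyle ( U^{-1}x - x, g) =(x, Ug - g) = 0

by assumption, so {U^{-1}x = x}, i.e. {Ux = x}, proving {x \in F}. Thus {W^{\perp} = F}, and since {W} is closed, {H = W \oplus W^{\perp} = F \oplus W}.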

Now we will prove the equality for both subspaces {F,W}.

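First, let {h \in W}. If {h = Ug - g} for some {g \in H}, then the sum telescopes: {\frac{1}{N} \sum_{i=0}^{N-1} U^i h = \frac{1}{N}(U^N g - g)}, which has norm at most {2\|g\|/N} and so tends to zero. The same holds for finite linear combinations of such vectors, and then for all of {W} by approximation, since each averaging operator {\frac{1}{N} \sum_{i=0}^{N-1} U^i} has norm at most {1}. Also, {Ph =0 }, because {F} and {W} are orthogonal.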

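Now suppose {h \in F}. Then each average on the left equals {h} since {Uh = h}, and {Ph = h} because {P} restricts to the identity on {F}. Since {H = F \oplus W} and both sides are linear in {h}, the theorem follows.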