Let $(X, \mu)$ be a probability space and $T: X \to X$ a measure-preserving transformation. In many cases, it turns out that the averages of a function $f$ given by
$$\frac{1}{N} \sum_{i=0}^{N-1} f \circ T^i$$
actually converge a.e. to a constant.
This is the case if $T$ is ergodic, which we define as follows: $T$ is ergodic if for all measurable $E \subset X$ with $T^{-1}E = E$, either $\mu(E) = 0$ or $\mu(E) = 1$. This is a form of irreducibility; the system has no smaller subsystem (disregarding measure zero sets). It is easy to see that this is equivalent to the statement: $f$ measurable (one could assume measurable and bounded if one prefers) and $T$-invariant implies $f$ constant a.e. (One first shows that if $T$ is ergodic, then $\mu(T^{-1}E \,\Delta\, E) = 0$ implies $\mu(E) = 0$ or $1$, by constructing something close to $E$ that is $T$-invariant.)
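In this case, therefore, the ergodic theorem takes the following form. Let $f: X \to \mathbb{C}$ be integrable. Then almost everywhere,
$$\frac{1}{N} \sum_{i=0}^{N-1} f(T^i x) \to \int_X f \, d\mu \quad \text{as } N \to \infty.$$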
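In other words, when $T$ is ergodic, the time average of $f$ along almost every orbit equals its space average. This is a very useful fact, and it has many applications.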
Example: rotations of the circle
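Consider the unit circle $S^1$ (with normalized arc-length measure) and the rotation $T: S^1 \to S^1$, $z \mapsto e^{2\pi i \alpha} z$, where $\alpha$ is irrational. I claim that it is ergodic. Indeed, suppose $f \in L^2(S^1)$ is invariant under the rotation, and suppose its Fourier expansion is $f(z) = \sum_n a_n z^n$. Comparing the coefficients of $f$ and $f \circ T$, we find $a_n = e^{2\pi i n \alpha} a_n$ for all $n$. Since $\alpha$ is irrational, $e^{2\pi i n \alpha} \neq 1$ for $n \neq 0$, so $a_n = 0$ for $n \neq 0$ and $f$ is constant a.e. In the same vein, it can be shown that a rotation by $g$ of a compact abelian group $G$ (with respect to Haar measure) is ergodic iff the powers of $g$ are dense.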
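As a numerical sanity check (a sketch, not a proof), one can identify the circle with $[0,1)$ via $z = e^{2\pi i x}$, so that the rotation becomes $x \mapsto x + \alpha \bmod 1$, and compare Birkhoff averages for an irrational and a rational angle. The test function $\cos(2\pi \cdot 4x)$, the starting points, the angle $\sqrt{2}$, and the helper name `birkhoff_average` are all choices made here for illustration.

```python
import math

def birkhoff_average(f, alpha, x0, n_terms):
    """Average f over the orbit points x0 + k*alpha (mod 1) for k = 0, ..., n_terms - 1."""
    total = 0.0
    for k in range(n_terms):
        total += f((x0 + k * alpha) % 1.0)
    return total / n_terms

def f(x):
    # Test function with integral 0 over [0, 1); it is invariant under rotation by 1/4.
    return math.cos(2 * math.pi * 4 * x)

# Irrational angle: the time average should approach the space average, which is 0.
irrational_avg = birkhoff_average(f, math.sqrt(2), 0.1, 100_000)

# Rational angle 1/4: f is constant along the orbit of 0, so the average stays at f(0) = 1.
rational_avg = birkhoff_average(f, 0.25, 0.0, 100_000)

print(irrational_avg, rational_avg)
```

The rational case illustrates the failure of ergodicity: $\cos(2\pi \cdot 4x)$ is a nonconstant function invariant under rotation by $1/4$, so its time averages need not equal its integral.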
An averaging interpretation of ergodicity
We now prove:
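Proposition 1 $T$ is ergodic iff for all measurable $E, F \subset X$,
$$\frac{1}{N} \sum_{i=0}^{N-1} \mu(T^{-i}E \cap F) \to \mu(E)\mu(F) \quad \text{as } N \to \infty.$$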
The proof is an easy application of the ergodic theorem, but let’s see what it means intuitively. If $E, F$ are independent sets (independent in the sense of probability theory), then $\mu(E \cap F) = \mu(E)\mu(F)$. Now $\mu(T^{-i}E) = \mu(E)$, so the proposition says that ergodicity is equivalent to the statement that for any $E, F$, the sets $T^{-i}E$ and $F$ are asymptotically independent of each other in a Cesàro summability sense. This in turn leads to the stronger notions of weak and strong mixing given below.
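Suppose first $T$ is ergodic. Then we have
$$\frac{1}{N} \sum_{i=0}^{N-1} \chi_E(T^i x) \to \mu(E)$$
almost everywhere as $N \to \infty$, by the Birkhoff theorem. If we multiply by $\chi_F(x)$ and integrate (recall the dominated convergence theorem; the averages are bounded by $1$), we get the claim as in the proposition, since $\int_X \chi_E(T^i x) \chi_F(x) \, d\mu = \mu(T^{-i}E \cap F)$.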
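Now suppose the limit exists as stated for any $E, F$, and we prove ergodicity. Suppose $T^{-1}E = E$, and take $F = E$. Then $\mu(T^{-i}E \cap E) = \mu(E)$ for every $i$, so the averages are all equal to $\mu(E)$; it follows that $\mu(E) = \mu(E)^2$, so $\mu(E) = 0$ or $1$.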
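The Cesàro averages in the proposition can be checked numerically for the irrational rotation $x \mapsto x + \alpha \bmod 1$ of $[0,1)$, taking $E$ and $F$ to be arcs. The following is a minimal sketch under assumptions chosen here: the arcs $E = [0, 0.3)$ and $F = [0.5, 0.7)$, the angle $\sqrt{2}$, and the helper names are illustrative, and $T^{-i}E$ is computed as the arc $E$ rotated back by $i\alpha$.

```python
import math

def arc_overlap(a, len_a, b, len_b):
    """Length of the intersection of the arcs [a, a+len_a) and [b, b+len_b) on R/Z.

    Assumes 0 <= len_a, len_b <= 1. The second arc is compared against the
    first after shifting it by -1, 0 and 1 to account for wrap-around.
    """
    a %= 1.0
    b %= 1.0
    total = 0.0
    for shift in (-1.0, 0.0, 1.0):
        lo = max(a, b + shift)
        hi = min(a + len_a, b + shift + len_b)
        total += max(0.0, hi - lo)
    return total

def cesaro_mean(alpha, e_start, e_len, f_start, f_len, n_terms):
    """Average of mu(T^{-i}E ∩ F) over i = 0, ..., n_terms - 1 for rotation by alpha."""
    total = 0.0
    for i in range(n_terms):
        # T^{-i}E = {x : x + i*alpha in E} is the arc E rotated back by i*alpha.
        total += arc_overlap(e_start - i * alpha, e_len, f_start, f_len)
    return total / n_terms

mean = cesaro_mean(math.sqrt(2), 0.0, 0.3, 0.5, 0.2, 20_000)
product = 0.3 * 0.2  # mu(E) * mu(F)
print(mean, product)
```

With these choices the printed average should be close to $\mu(E)\mu(F) = 0.06$, even though the individual terms $\mu(T^{-i}E \cap F)$ keep oscillating between $0$ and $0.2$.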
Weak and strong mixing
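Say that $T$ is weak-mixing if for all measurable $E, F \subset X$,
$$\frac{1}{N} \sum_{i=0}^{N-1} \left| \mu(T^{-i}E \cap F) - \mu(E)\mu(F) \right| \to 0$$
as $N \to \infty$. This is clearly a strengthening of ergodicity. Say that $T$ is strong-mixing if for all $E, F \subset X$ measurable,
$$\mu(T^{-n}E \cap F) \to \mu(E)\mu(F) \quad \text{as } n \to \infty.$$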
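To see that weak mixing really is stronger than ergodicity, one can look again at the irrational rotation $x \mapsto x + \alpha \bmod 1$ with $E = F = [0, 1/2)$. An elementary computation gives $\mu(T^{-i}E \cap E) = 1/2 - \|i\alpha\|$, where $\|t\|$ denotes the distance from $t$ to the nearest integer. The sketch below (with the angle $\sqrt{2}$ and the helper names chosen here for illustration) compares the plain Cesàro mean with the mean absolute deviation from $\mu(E)^2 = 1/4$.

```python
import math

def dist_to_nearest_integer(t):
    """Distance from the real number t to the nearest integer."""
    frac = t % 1.0
    return min(frac, 1.0 - frac)

def mixing_statistics(alpha, n_terms):
    """Return (Cesaro mean, mean absolute deviation) of mu(T^{-i}E ∩ E), E = [0, 1/2)."""
    target = 0.25  # mu(E) * mu(E)
    sum_terms = 0.0
    sum_abs_dev = 0.0
    for i in range(n_terms):
        # Closed form for the overlap of E = [0, 1/2) with its rotate by i*alpha.
        overlap = 0.5 - dist_to_nearest_integer(i * alpha)
        sum_terms += overlap
        sum_abs_dev += abs(overlap - target)
    return sum_terms / n_terms, sum_abs_dev / n_terms

cesaro_mean, mean_abs_dev = mixing_statistics(math.sqrt(2), 50_000)
print(cesaro_mean, mean_abs_dev)
```

Since the orbit $i\alpha$ equidistributes, the first number should approach $1/4$ (consistent with ergodicity), while the second should approach $\int_0^1 |1/4 - \|t\|| \, dt = 1/8$ rather than $0$. This suggests that the rotation is ergodic but not weak-mixing.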
In these conditions, it is often only necessary to check them on some subset of all measurable sets. If any measurable set can be arbitrarily approximated by an element of some class $\mathcal{C}$ (which is to say that if $E$ is measurable and $\epsilon > 0$, there is $C \in \mathcal{C}$ with $\mu(E \,\Delta\, C) < \epsilon$), then one only needs to check these conditions on $\mathcal{C}$. This can be seen by a standard approximation argument.
As an example, consider the space $X = \{0,1\}^{\mathbb{Z}}$ with the product measure, where $\{0,1\}$ is the discrete measure space such that each point has measure $1/2$. Then one has a shift $T: X \to X$ that shifts the coordinates by one. It is easy to check that the strong-mixing hypothesis holds when $E, F$ are sets that depend on only finitely many coordinates, so $T$ is strong-mixing more generally.
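The check on sets depending on finitely many coordinates can be made concrete. The sketch below assumes the fair-coin shift on two symbols with $(Tx)_k = x_{k+1}$, and it only handles elementary cylinder sets, represented as dictionaries mapping a coordinate to its prescribed value (a representation chosen here for illustration). Such a set with $m$ constrained coordinates has measure $2^{-m}$, and $T^{-n}$ moves each constraint from coordinate $j$ to $j+n$.

```python
from fractions import Fraction

def measure(cylinder):
    """Measure of an elementary cylinder: each fixed coordinate contributes a factor 1/2."""
    return Fraction(1, 2 ** len(cylinder))

def shift_preimage(cylinder, n):
    """T^{-n} of a cylinder: x lies in T^{-n}E iff x_{j+n} = v for each constraint (j, v) of E."""
    return {j + n: v for j, v in cylinder.items()}

def intersect(c1, c2):
    """Merge two constraint sets; return None if they conflict (empty intersection)."""
    merged = dict(c1)
    for j, v in c2.items():
        if merged.get(j, v) != v:
            return None
        merged[j] = v
    return merged

def measure_of_intersection(c1, c2):
    merged = intersect(c1, c2)
    return Fraction(0) if merged is None else measure(merged)

E = {0: 1, 1: 0}  # x_0 = 1 and x_1 = 0
F = {0: 1, 2: 1}  # x_0 = 1 and x_2 = 1

values = [measure_of_intersection(shift_preimage(E, n), F) for n in range(6)]
print(values)                   # mu(T^{-n}E ∩ F) for n = 0, ..., 5
print(measure(E) * measure(F))  # mu(E) * mu(F)
```

Once $n$ is large enough that the shifted constraints of $E$ no longer touch the coordinates of $F$ (here $n \geq 3$), the intersection has measure exactly $\mu(E)\mu(F)$, which is the strong-mixing condition for these sets. General finite-coordinate sets are finite disjoint unions of such cylinders, so the same conclusion follows by additivity.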