Let {X} be a measure space with measure {\mu}; let {T: X \rightarrow X} be a measure-preserving transformation. Last time we looked at how the averages

\displaystyle A_N := \frac{1}{N} \sum_{i=0}^{N-1} f \circ T^i

behave in {L^2}. Now we want pointwise convergence.

The pointwise ergodic theorem

We consider the pointwise ergodic theorem of George David Birkhoff:

Theorem 1 (Birkhoff) Let {f \in L^1(\mu)}. Then the averages {A_N} converge almost everywhere to a function {f^* \in L^1(\mu)} with {f^* \circ T = f^*} a.e.
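To see Theorem 1 in action, here is a minimal Python sketch. It is only an illustration under assumptions not made above: the space is the circle {[0,1)} with Lebesgue measure, {T} is rotation by an irrational angle (which preserves Lebesgue measure and is ergodic, so {f^*} should be the constant {\int f}), and the names `birkhoff_average`, `rotate`, `THETA` and the test function `f` are illustrative choices.

```python
import math

def birkhoff_average(f, T, x, n):
    """Return A_n f(x) = (1/n) * sum_{i=0}^{n-1} f(T^i x)."""
    total = 0.0
    for _ in range(n):
        total += f(x)
        x = T(x)
    return total / n

# Assumed example: rotation of [0, 1) by an irrational angle.
THETA = math.sqrt(2) - 1

def rotate(x):
    return (x + THETA) % 1.0

def f(x):
    # cos^2(2*pi*x) has integral 1/2 over [0, 1).
    return math.cos(2 * math.pi * x) ** 2

# Averages along the orbit of one starting point, for increasing n.
averages = {n: birkhoff_average(f, rotate, 0.1, n) for n in (10, 100, 1000, 10000)}
for n, value in averages.items():
    print(n, value)
```

For this smooth {f} the averages should approach {1/2} from any starting point; for a general {f \in L^1} the theorem only promises convergence almost everywhere.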


There is something of an analogy between Birkhoff’s theorem and the well-known fact from real analysis that a function in {L^1} on a Euclidean space can be recovered from its averages over balls, i.e., for almost all {x \in \mathbb{R}^n},

\displaystyle f(x) = \lim_{r \rightarrow 0} \frac{1}{m(B(x,r))} \int_{B(x,r)} f

where {m} is Lebesgue measure. The proof of this latter theorem usually proceeds by associating to a locally integrable function {f} on {\mathbb{R}^n} the Hardy-Littlewood maximal function

\displaystyle Mf(x) := \sup_{r>0} \frac{1}{m(B(x,r))} \int_{B(x,r)} |f|

and proving that it defines a bounded sublinear operator from {L^1} to weak {L^1}. Then, one uses an approximation argument since the continuous functions with compact support are dense in {L^1}.

The maximal ergodic theorem

In proving the Birkhoff ergodic theorem, we will define the maximal operator

\displaystyle M_Tf(x) = \sup_{N \in \mathbb{Z}_{\geq 0}} \frac{1}{N} \sum_{i=0}^{N-1} f(T^i(x)).

(When {N=0}, that expression is set to be zero, so {M_Tf \geq 0} everywhere.) There is a similar weak-type inequality for this, which we will prove from the maximal ergodic theorem:

Theorem 2 (Maximal ergodic theorem) For {f \in L^1(\mu)},

\displaystyle \boxed{ \int_{M_T f > 0 } f d \mu \geq 0. }



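To prove this, we use the abbreviation {U_T g := g \circ T}; since {T} is measure-preserving, {U_T} maps {L^1} into itself, preserves positivity, and has norm at most 1. It is convenient to work with unnormalized partial sums: for {n \geq 0}, let

\displaystyle S_n f := \max_{0 \leq N \leq n} \sum_{i=0}^{N-1} f \circ T^i,

where the empty sum is zero, so {S_n f \geq 0} and {S_n f \in L^1}. A partial sum is positive exactly when the corresponding average is, so the sets {\{S_n f > 0\}} increase to {\{M_T f > 0\}} as {n \rightarrow \infty}. For {1 \leq N \leq n}, peeling off the first term gives {\sum_{i=0}^{N-1} f \circ T^i = f + U_T \sum_{i=0}^{N-2} f \circ T^i \leq f + U_T S_n f}, so when {S_n f > 0}, we have

\displaystyle U_T S_n f + f \geq S_n f.

In particular,

\displaystyle \int_{S_n f > 0 } f d \mu \geq \int_{S_n f > 0} S_n f d \mu - \int_{S_n f > 0} U_T S_n f d \mu

which is at least {||S_n f||_1 - ||U_T S_n f||_1 \geq 0} since

\displaystyle \int_{S_n f > 0} S_n f d \mu = \int S_n f d \mu,

{U_T S_n f \geq 0}, and {U_T} is bounded by 1. Letting {n \rightarrow \infty} and applying dominated convergence (the integrands are bounded by {|f|}) completes the proof.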
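As a sanity check on Theorem 2, here is a small Python sketch on a finite system. The setup is an assumption made purely for illustration: {X} is a set of {n} points with counting measure, {T} is a random permutation (hence measure-preserving), and {f} takes integer values so that all arithmetic is exact. On such a system a point lies in {\{M_T f > 0\}} exactly when one of its first {n} partial sums is positive: if the total of {f} over the point's cycle is positive, one full trip around the cycle already gives a positive sum, and otherwise further trips cannot increase the partial sums. The names `positive_set`, `rng`, `E` are hypothetical.

```python
import random

def positive_set(f, T):
    """Points x with sum_{i<N} f(T^i x) > 0 for some 1 <= N <= n, i.e. {M_T f > 0}."""
    n = len(f)
    points = []
    for x in range(n):
        partial, y = 0, x
        for _ in range(n):
            partial += f[y]   # add f(T^i x)
            y = T[y]          # move one step along the orbit
            if partial > 0:
                points.append(x)
                break
    return points

rng = random.Random(0)
n = 50
T = list(range(n))
rng.shuffle(T)  # T[x] is the image of x; a permutation preserves counting measure
f = [rng.randint(-5, 3) for _ in range(n)]  # integer values, mostly negative

E = positive_set(f, T)
integral_over_E = sum(f[x] for x in E)  # integral of f over {M_T f > 0}
print(len(E), integral_over_E)
```

The theorem predicts `integral_over_E >= 0` even though {f} itself is negative on average here.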

It isn’t quite clear how the maximal ergodic theorem is a weak-type inequality. To see this, we fix {\alpha > 0} and note that

\displaystyle M_T f > \alpha \quad \mathrm{iff} \quad M_T(f-\alpha) > 0.

In particular, by the maximal theorem applied to {f - \alpha} (which is in {L^1} when {\mu(X) < \infty}),

\displaystyle \boxed{ \int_{M_T f > \alpha} f d \mu \geq \alpha \mu( \{ M_T f > \alpha \} ) }

which implies {\mu( \{ M_T f > \alpha \} ) \leq \frac{1}{\alpha} ||f||_1}, a weak-type bound. What we will actually use, however, is the boxed statement above, or rather a variant of it. If {E \subset X} with {T^{-1}E = E}, then

\displaystyle \boxed{ \int_{\{M_T f > \alpha\} \cap E} f d \mu \geq \alpha \mu( E \cap \{ M_T f > \alpha \} ) }

which follows by doing all this with {E, T|_E} replacing {X, T}.
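The boxed inequality and the weak-type bound can also be checked numerically. The following Python sketch again assumes a toy setting that is not part of the argument above: a finite set with counting measure, a random permutation {T}, and integer values for {f} and {\alpha} so that comparisons are exact. The helper name `superlevel_set` is hypothetical; it uses the equivalence {M_T f > \alpha} iff {M_T(f - \alpha) > 0}, and on {n} points it is enough to test the first {n} averages.

```python
import random

def superlevel_set(f, T, alpha):
    """Points x where some average (1/N) sum_{i<N} f(T^i x), 1 <= N <= n, exceeds alpha."""
    n = len(f)
    points = []
    for x in range(n):
        partial, y = 0, x
        for N in range(1, n + 1):
            partial += f[y]
            y = T[y]
            if partial > alpha * N:  # average > alpha, written without division
                points.append(x)
                break
    return points

rng = random.Random(1)
n = 60
T = list(range(n))
rng.shuffle(T)  # a permutation preserves counting measure
f = [rng.randint(-4, 4) for _ in range(n)]
l1_norm = sum(abs(v) for v in f)

checks = []
for alpha in (1, 2, 3):
    E = superlevel_set(f, T, alpha)
    integral = sum(f[x] for x in E)
    # Expect alpha * mu(E) <= integral of f over E <= ||f||_1.
    checks.append((alpha * len(E), integral, l1_norm))
    print(alpha, len(E), integral, l1_norm)
```

Each triple in `checks` should be nondecreasing from left to right, which is the boxed inequality followed by the trivial bound {\int_E f d\mu \leq ||f||_1}.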

Proof of the ergodic theorem

Given {f \in L^1(\mu)} and {\alpha, \beta \in \mathbb{R}}, consider the sets

\displaystyle U_{\alpha} := \left\{ x : \limsup_{N \rightarrow \infty} \frac{1}{N} \sum_{i=0}^{N-1} U_T^i f(x) > \alpha \right\}


\displaystyle L_{\beta} := \left\{ x : \liminf_{N \rightarrow \infty} \frac{1}{N} \sum_{i=0}^{N-1} U_T^i f(x) < \beta \right\}.

I will show that when {\alpha > \beta}, {\mu(U_{\alpha} \cap L_{\beta}) = 0}. Taking the union of these intersections over {\alpha, \beta \in \mathbb{Q}} with {\alpha > \beta}, one gets a set of measure zero outside of which the limit of the averages exists. So it is enough to prove {\mu(U_{\alpha} \cap L_{\beta}) = 0}. Now {T^{-1}(U_{\alpha} \cap L_{\beta}) = U_{\alpha} \cap L_{\beta}}, as is easily seen. Also, at each point of {U_{\alpha} \cap L_{\beta}}, we have {M_T f > \alpha}, so by the last boxed statement,

\displaystyle \alpha \mu(U_{\alpha} \cap L_{\beta}) \leq \int_{U_{\alpha} \cap L_{\beta}} f d \mu .

Now we can do the exact same thing with {\beta}, since {U_{\alpha} \cap L_{\beta}} is the same thing as {L_{-\alpha} \cap U_{-\beta}} for {-f}, which implies

\displaystyle -\beta \mu( U_{\alpha} \cap L_{\beta} ) \leq \int_{U_{\alpha} \cap L_{\beta} } - f d \mu

and putting this all together gives {\alpha \mu(U_{\alpha} \cap L_{\beta}) \leq \beta \mu (U_{\alpha} \cap L_{\beta})}, which, since {\alpha > \beta}, is possible only if {\mu( U_{\alpha} \cap L_{\beta}) = 0}.