Up until now, we have concentrated on a transformation {T} of a fixed measure space. We now take a different approach: {T} is fixed, and we look for appropriate measures (on a fixed {\sigma}-algebra) that {T} preserves. First, we will show that the space of such invariant measures is nonempty. Then we will characterize ergodicity in terms of the extreme points of this space.

This is the first theorem we seek to prove:

Theorem 1 Let {T: X \rightarrow X} be a continuous transformation of the compact metric space {X}. Then there exists a probability Borel measure {\mu} on {X} with respect to which {T} is measure-preserving.

 

Consider the Banach space {C(X)} of continuous {f: X \rightarrow \mathbb{C}} and the dual {C(X)^*}, which, by the Riesz representation theorem, is identified with the space of (complex) Borel measures on {X}. The positive measures of total mass one form a compact convex subset {P} of {C(X)^*} in the weak* topology by Alaoglu’s theorem. Now, {T} induces a transformation of {C(X)}: {f \mapsto f \circ T}. The adjoint transformation {T^*} of {C(X)^*} is given by {\mu \mapsto T^{-1}(\mu)}, where for a measure {\mu}, {T^{-1}(\mu)(E) := \mu(T^{-1}E)}. It is weak* continuous and maps {P} into itself. A fixed point of {T^*} in {P} is precisely a probability measure preserved by {T}, so it suffices to show that {T^*} has a fixed point on {P}; then we will have proved the theorem.
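To make the adjoint concrete, here is a minimal sketch on a finite set {X = \{0,1,2,3\}} (a compact metric space in the discrete metric, on which every map is continuous). Measures are lists of point masses, and the check is that integrating {f \circ T} against {\mu} agrees with integrating {f} against the pushed-forward measure. The helper names `pushforward` and `integrate` and the example data are illustrative assumptions, not part of the original argument.

```python
from fractions import Fraction

def pushforward(T, mu):
    """Return the measure E -> mu(T^{-1}E) on the finite set {0, ..., len(mu)-1}."""
    nu = [Fraction(0)] * len(mu)
    for i, weight in enumerate(mu):
        nu[T[i]] += weight  # the mass sitting at i is carried to T(i)
    return nu

def integrate(f, mu):
    """Integral of f against mu, i.e. the sum of f(x) * mu({x})."""
    return sum(fx * w for fx, w in zip(f, mu))

# Hypothetical example data: a map T, a probability measure mu, a function f.
T = [1, 2, 0, 0]  # T(0)=1, T(1)=2, T(2)=0, T(3)=0
mu = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8), Fraction(1, 8)]
f = [Fraction(3), Fraction(-1), Fraction(5), Fraction(2)]

f_after_T = [f[T[x]] for x in range(len(T))]  # the function f o T
lhs = integrate(f_after_T, mu)                # integral of f o T against mu
rhs = integrate(f, pushforward(T, mu))        # integral of f against the pushforward
print(lhs == rhs)
```

The pushforward of a probability measure is again a probability measure, which is the finite analogue of the adjoint mapping {P} into itself.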

There are fancier methods in functional analysis one could use, but to finish the proof we will appeal to the following simple lemma.

Lemma 2 Let {C} be a compact convex subset of a locally convex space {X}, and let {T: C \rightarrow C} be the restriction of a continuous linear map on {X}. Then {T} has a fixed point in {C}.

 

Pick {c \in C} and define

\displaystyle c_n = \frac{1}{n} \sum_{i=0}^{n-1} T^i c \in C.

I claim that as {n \rightarrow \infty}, {Tc_n - c_n \rightarrow 0}. Indeed, to say that {X} is locally convex means that it is topologized by a family of seminorms. Pick any seminorm {p: X \rightarrow \mathbb{R}_{\geq 0}} from this family. Since {T} is linear, the sum defining {Tc_n - c_n} telescopes to {\frac{1}{n}(T^n c - c)}. Then

\displaystyle p(Tc_n -c_n) = \frac{1}{n} p( T^n c - c ) \leq \frac{M}{n}

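where {M := 2 \sup_{x \in C} p(x)}, which is finite because {C} is compact and {p} is continuous. This tends to zero as {n \rightarrow \infty}, and since {p} was arbitrary we get the claim. Now let {c^*} be any limit point of the sequence {c_n} (at least one exists by compactness). Since {T} is continuous and {Tc_n - c_n \rightarrow 0}, we get {p(Tc^* - c^*) = 0} for every seminorm {p}, hence {Tc^* = c^*}. This proves the lemma.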
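The averaging argument can be watched numerically. The following is a sketch under assumptions of my own choosing: {C} is the simplex of probability vectors on a four-point set, the linear map is the pushforward by a hypothetical map `T`, and the seminorm {p} is the {\ell^1} norm, for which {\sup_C p = 1} and hence {M = 2}.

```python
from fractions import Fraction

def pushforward(T, mu):
    """Linear map on measures: carry the mass at i to T(i)."""
    nu = [Fraction(0)] * len(mu)
    for i, weight in enumerate(mu):
        nu[T[i]] += weight
    return nu

def cesaro_average(T, c, n):
    """c_n = (1/n) * (c + Tc + ... + T^{n-1}c), with T acting by pushforward."""
    total = [Fraction(0)] * len(c)
    current = list(c)
    for _ in range(n):
        total = [t + w for t, w in zip(total, current)]
        current = pushforward(T, current)
    return [t / n for t in total]

def l1_norm(v):
    """The seminorm p used in this sketch."""
    return sum(abs(x) for x in v)

# Hypothetical example: 0 -> 1 -> 2 -> 0 is a cycle, and 3 is sent into it.
T = [1, 2, 0, 0]
c = [Fraction(0), Fraction(0), Fraction(0), Fraction(1)]  # point mass at 3

gaps = {}
for n in (1, 10, 100, 1000):
    c_n = cesaro_average(T, c, n)
    diff = [a - b for a, b in zip(pushforward(T, c_n), c_n)]
    gaps[n] = l1_norm(diff)  # p(Tc_n - c_n), bounded by M/n = 2/n
    print(n, float(gaps[n]), [float(w) for w in c_n])
```

In this example the averages approach the uniform measure on the cycle {\{0,1,2\}}, which is indeed fixed by the pushforward.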

Interpretation of ergodicity

Given {T: X \rightarrow X}, let {M(X,T)} denote the compact convex set of probability Borel measures on {X} with respect to which {T} is measure-preserving. We have shown {M(X,T)} is nonempty. The next result gives an interpretation of ergodicity.

 

Proposition 3 {\mu \in M(X,T)} is ergodic (i.e. {T} is ergodic w.r.t. {\mu}) if and only if {\mu} is an extreme point of {M(X,T)}.

 

Recall that an extreme point {p} of a convex set {C} in some vector space is one such that if {p', p'' \in C} and {p=tp' + (1-t)p''} for some {t \in (0,1)}, then {p'=p''=p}. That is, {p} does not lie in the interior of any line segment contained in {C}. Extreme points are interesting because of the Krein-Milman theorem, which states that a compact convex set in a locally convex space is the closed convex hull of its extreme points.

To prove the proposition, suppose {\mu} is not ergodic, and let {E} be a {T}-invariant set with {0 < \mu(E) < 1}. Then {E^c := X-E} is also {T}-invariant, and for every Borel set {F} we have

\displaystyle \mu(F) = \mu(F \cap E) + \mu(F \cap E^c) = \mu(E) \left( \frac{ \mu(F \cap E)}{\mu(E)}\right) + (1-\mu(E)) \left( \frac{ \mu(F \cap E^c)}{\mu(E^c)}\right).

In this way, we have expressed {\mu} as a nontrivial convex combination of two probability measures on {X}, supported respectively on {E} and {E^c} (so in particular distinct), and each of which is {T}-invariant because {E}, {E^c}, and {\mu} are. So {\mu} is not an extreme point.
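Here is a small finite illustration of this decomposition, offered as a sketch rather than as part of the proof. The map `T`, the measure `mu`, and the helper names are hypothetical choices: `T` is a permutation of five points with two cycles, so the uniform measure is invariant but not ergodic.

```python
from fractions import Fraction

def pushforward(T, mu):
    """Return the measure F -> mu(T^{-1}F) on a finite set."""
    nu = [Fraction(0)] * len(mu)
    for i, weight in enumerate(mu):
        nu[T[i]] += weight
    return nu

def restrict_and_normalize(mu, E):
    """The conditional measure F -> mu(F & E) / mu(E)."""
    mass = sum(mu[x] for x in E)
    return [mu[x] / mass if x in E else Fraction(0) for x in range(len(mu))]

# Hypothetical example: T cycles 0 -> 1 -> 2 -> 0 and swaps 3 and 4.
T = [1, 2, 0, 4, 3]
mu = [Fraction(1, 5)] * 5   # uniform measure; invariant since T is a bijection
E = {0, 1, 2}               # a T-invariant set with 0 < mu(E) < 1
E_c = {3, 4}

mu_E = restrict_and_normalize(mu, E)
mu_Ec = restrict_and_normalize(mu, E_c)
t = sum(mu[x] for x in E)   # the weight mu(E)

# The convex combination mu(E) * mu_E + (1 - mu(E)) * mu_Ec recovers mu.
combo = [t * a + (1 - t) * b for a, b in zip(mu_E, mu_Ec)]
print(combo == mu, pushforward(T, mu_E) == mu_E, pushforward(T, mu_Ec) == mu_Ec)
```

Both pieces are invariant probability measures different from `mu`, so `mu` lies strictly inside a segment of {M(X,T)}.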

Now suppose {\mu} is ergodic and we can write

\displaystyle \mu = t \mu_1 + (1-t) \mu_2, \quad \mu_1, \mu_2 \in M(X,T), t \in (0,1).

Then {\mu_1} and {\mu_2} are absolutely continuous with respect to {\mu} (since {t \mu_1 \leq \mu} and {(1-t) \mu_2 \leq \mu}), so by the Radon-Nikodym theorem there are {f_1, f_2 \in L^1(\mu)} with

\displaystyle \mu_1(F) = \int_F f_1 d \mu, \quad \mu_2(F) = \int_F f_2 d \mu, \quad \forall F.

I will show that {f_1 \equiv 1} almost everywhere.

Consider a constant {\lambda} and {E = \{ x \in X: f_1(x) < \lambda \}}. I claim that {E} is {T}-invariant up to a {\mu}-null set, i.e. {\mu(E \Delta T^{-1}E) = 0}. Indeed,

\displaystyle \mu_1(E) = \int_{E - T^{-1} E} f_1 d \mu + \int_{T^{-1}E \cap E} f_1 d \mu

and similarly

\displaystyle \mu_1(E) = \mu_1(T^{-1}E) = \int_{T^{-1} E - E} f_1 d \mu + \int_{T^{-1}E \cap E} f_1 d \mu

which one can put together to get

\displaystyle \int_{E - T^{-1} E} f_1 d \mu = \int_{T^{-1} E - E} f_1 d \mu .

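But {E-T^{-1}E} and {T^{-1}E - E} have the same {\mu}-measure (indeed, when added to {T^{-1}E \cap E} they give the sets {E} and {T^{-1}E}, which have the same measure). Call this common measure {m}. If {m > 0}, then since {f_1 < \lambda} on {E - T^{-1}E} and {f_1 \geq \lambda} on {T^{-1}E - E}, the left-hand integral would be strictly less than {\lambda m} and the right-hand one at least {\lambda m}, a contradiction. So {m = 0}, that is, {\mu(E \Delta T^{-1}E) = 0}. By ergodicity, {\mu(E) \in \{0,1\}}. Since this is true for every {\lambda}, {f_1} is constant a.e. {[\mu]}, and the constant must be {1} because {\int_X f_1 d\mu = \mu_1(X) = 1}. Hence {\mu_1=\mu}. The same argument applies to {\mu_2}.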
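The invariance of the sublevel sets of the density can be checked by hand in a finite example. The sketch below uses hypothetical data of my own: a permutation `T` with two cycles, the uniform measure `mu`, and an invariant measure `mu_1` concentrated on one cycle. Every point has positive {\mu}-mass here, so "almost everywhere" statements become exact. Because this {\mu} is not ergodic, a sublevel set of intermediate measure appears, which is exactly what ergodicity rules out in the proof.

```python
from fractions import Fraction

def preimage(T, E):
    """T^{-1}E for a map T on {0, ..., len(T)-1}."""
    return {x for x in range(len(T)) if T[x] in E}

# Hypothetical example: T cycles 0 -> 1 -> 2 -> 0 and swaps 3 and 4.
T = [1, 2, 0, 4, 3]
mu = [Fraction(1, 5)] * 5                        # invariant, not ergodic
mu_1 = [Fraction(1, 3)] * 3 + [Fraction(0)] * 2  # invariant, absolutely continuous w.r.t. mu

# Radon-Nikodym density f_1 = d(mu_1)/d(mu): 5/3 on {0,1,2} and 0 on {3,4}.
f_1 = [a / b for a, b in zip(mu_1, mu)]

sublevel_sets = {}
for lam in (Fraction(0), Fraction(1), Fraction(2)):
    E = {x for x in range(len(T)) if f_1[x] < lam}  # E = {f_1 < lambda}
    sublevel_sets[lam] = E
    measure = sum(mu[x] for x in E)
    print(lam, sorted(E), preimage(T, E) == E, measure)
```

For {\lambda = 1} the sublevel set is {\{3,4\}}, which is invariant with measure strictly between 0 and 1, and correspondingly `f_1` is not constant.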