Up until now, we have concentrated on a transformation ${T}$ of a fixed measure space. We now take a different approach: ${T}$ is fixed, and we look for measures (on a fixed ${\sigma}$-algebra) that ${T}$ preserves. First, we will show that the set of such measures is nonempty. Then we will characterize ergodicity in terms of the extreme points of this set.

This is the first theorem we seek to prove:

Theorem 1 Let ${T: X \rightarrow X}$ be a continuous transformation of the compact metric space ${X}$. Then there exists a probability Borel measure ${\mu}$ on ${X}$ with respect to which ${T}$ is measure-preserving.

Consider the Banach space ${C(X)}$ of continuous ${f: X \rightarrow \mathbb{C}}$ and its dual ${C(X)^*}$, which, by the Riesz representation theorem, is identified with the space of (complex) Borel measures on ${X}$. The positive measures of total mass one form a compact convex subset ${P}$ of ${C(X)^*}$ in the weak* topology by Alaoglu’s theorem. Now, ${T}$ induces a transformation of ${C(X)}$: ${f \mapsto f \circ T}$. The adjoint transformation ${T^*}$ of ${C(X)^*}$ is given by ${\mu \mapsto T^{-1}(\mu)}$, where for a measure ${\mu}$, ${T^{-1}(\mu)(E) := \mu(T^{-1}E)}$. It maps ${P}$ into itself, and ${\mu \in P}$ is preserved by ${T}$ precisely when ${T^* \mu = \mu}$. We want to show that ${T^*}$ has a fixed point on ${P}$; then we will have proved the theorem.
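To make the adjoint concrete, here is a minimal sketch on a finite set, where a measure is just a list of point masses. The three-point space, the map `T`, the weights, and the test function `f` are all hypothetical choices of mine, not part of the argument; the point is only to check the duality ${\int f \circ T \, d\mu = \int f \, d(T^{-1}(\mu))}$ in a case small enough to verify by hand.

```python
from fractions import Fraction

# Hypothetical finite model (an assumption for illustration only):
# X = {0, 1, 2} and T is an arbitrary self-map of X.
T = {0: 1, 1: 2, 2: 2}

def pushforward(mu, T):
    """Return the measure E -> mu(T^{-1} E), stored as point masses."""
    nu = {x: Fraction(0) for x in mu}
    for x, mass in mu.items():
        nu[T[x]] += mass  # the mass sitting at x is moved to T(x)
    return nu

def integrate(f, mu):
    """Integral of f against a measure given by point masses."""
    return sum(f(x) * mass for x, mass in mu.items())

mu = {0: Fraction(1, 2), 1: Fraction(1, 3), 2: Fraction(1, 6)}
f = lambda x: x * x + 1          # any function on X will do

nu = pushforward(mu, T)          # this plays the role of T^{-1}(mu)
lhs = integrate(lambda x: f(T[x]), mu)   # integral of f∘T against mu
rhs = integrate(f, nu)                   # integral of f against T^{-1}(mu)
print(lhs == rhs, sum(nu.values()))
```

Note that `pushforward` keeps the total mass equal to one, which is the finite version of the statement that the adjoint maps ${P}$ into itself.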

There are fancier methods in functional analysis one could use, but to finish the proof we will appeal to the following simple lemma.

Lemma 2 Let ${C}$ be a compact convex subset of a locally convex space ${X}$, and let ${T: C \rightarrow C}$ be the restriction of a continuous linear map on ${X}$. Then ${T}$ has a fixed point in ${C}$.

To prove the lemma, pick ${c \in C}$ and define $\displaystyle c_n = \frac{1}{n} \sum_{i=0}^{n-1} T^i c.$ Each ${c_n}$ lies in ${C}$ because ${C}$ is convex and ${T}$ maps ${C}$ into itself.

I claim that as ${n \rightarrow \infty}$, ${Tc_n - c_n \rightarrow 0}$. Indeed, to say that ${X}$ is locally convex means that it is topologized by a family of seminorms. Pick any such seminorm ${p: X \rightarrow \mathbb{R}_{\geq 0}}$. By linearity the sum telescopes, ${Tc_n - c_n = \frac{1}{n}(T^n c - c)}$, so $\displaystyle p(Tc_n -c_n) = \frac{1}{n} p( T^n c - c ) \leq \frac{M}{n}$

where ${M := 2 \sup_{x \in C} p(x)}$, which is finite because ${C}$ is compact and ${p}$ is continuous. This tends to zero as ${n \rightarrow \infty}$, and since ${p}$ was arbitrary we get the claim. Now let ${c'}$ be any limit point of the sequence ${(c_n)}$ (at least one exists by compactness). Since ${T}$ is continuous, ${Tc' - c'}$ is a limit point of ${(Tc_n - c_n)}$, which converges to zero; hence ${Tc' = c'}$. This proves the lemma.
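Here is a small numerical sketch of the averaging argument, under assumptions of my own choosing: ${C}$ is the simplex of probability vectors on three points, ${T}$ is the cyclic shift of coordinates, ${c}$ is a point mass, and ${p}$ is the ${\ell^1}$ norm (so ${\sup_C p = 1}$ and ${M = 2}$). The plain iterates ${T^n c}$ cycle forever and never converge, which is exactly why the lemma averages them.

```python
from fractions import Fraction

def apply_T(c):
    """Cyclic shift: the mass at coordinate i moves to coordinate i+1 (mod len)."""
    n = len(c)
    return [c[(i - 1) % n] for i in range(n)]

def cesaro_average(c, n):
    """Return c_n = (1/n) * sum_{i=0}^{n-1} T^i c."""
    total = [Fraction(0)] * len(c)
    term = list(c)
    for _ in range(n):
        total = [a + b for a, b in zip(total, term)]
        term = apply_T(term)
    return [a / n for a in total]

def l1(v):
    """The seminorm p used here (an assumed choice): the l^1 norm."""
    return sum(abs(a) for a in v)

def defect(c, n):
    """p(T c_n - c_n), which the proof bounds by M / n with M = 2."""
    c_n = cesaro_average(c, n)
    return l1([a - b for a, b in zip(apply_T(c_n), c_n)])

c = [Fraction(1), Fraction(0), Fraction(0)]   # point mass at the first point
for n in (1, 10, 100):
    print(n, defect(c, n), Fraction(2, n))    # defect next to the bound M/n
```

In this toy case the averages approach the uniform vector, which is fixed by the shift; in general the lemma only guarantees that some limit point of the averages is fixed.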

Interpretation of ergodicity

Given ${T: X \rightarrow X}$, let ${M(X,T)}$ denote the compact convex set of probability Borel measures on ${X}$ with respect to which ${T}$ is measure-preserving. We have shown ${M(X,T)}$ is nonempty. The next result gives an interpretation of ergodicity.

Proposition 3 ${\mu \in M(X,T)}$ is ergodic (i.e. ${T}$ is ergodic w.r.t. ${\mu}$) if and only if ${\mu}$ is an extreme point of ${M(X,T)}$.

Recall that an extreme point ${p}$ of a convex set ${C}$ in some vector space is one such that if ${p', p'' \in C}$ and ${p=tp' + (1-t)p''}$ for some ${t \in (0,1)}$, then ${p'=p''=p}$. That is, ${p}$ does not lie in the interior of any line segment contained in ${C}$. Extreme points are interesting because of the theorem of Krein-Milman, which states that a compact convex set in a locally convex space is the closed convex hull of its extreme points.

To prove the proposition, suppose ${\mu}$ is not ergodic, and let ${E}$ be a ${T}$-invariant set with ${0 < \mu(E) < 1}$. Then ${E^c := X-E}$ is also ${T}$-invariant, and for every Borel set ${F}$ we have $\displaystyle \mu(F) = \mu(F \cap E) + \mu(F \cap E^c) = \mu(E) \left( \frac{ \mu(F \cap E)}{\mu(E)}\right) + (1-\mu(E)) \left( \frac{ \mu(F \cap E^c)}{\mu(E^c)}\right).$

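In this way, we have expressed ${\mu}$ as a convex combination of two probability measures on ${X}$, supported respectively on ${E, E^c}$. Each of them is ${T}$-invariant, because ${T^{-1}E = E}$ gives ${T^{-1}F \cap E = T^{-1}(F \cap E)}$, and similarly for ${E^c}$. The two measures are distinct since they assign different mass to ${E}$, so ${\mu}$ is not an extreme point.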
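The decomposition above can be checked by hand on a finite example. The sketch below is an illustration under assumptions of mine: ${X}$ has five points, ${T}$ is a permutation with the two cycles ${(0\,1)}$ and ${(2\,3\,4)}$, ${\mu}$ is the uniform measure, and ${E = \{0,1\}}$. The helper names (`preimage`, `conditional`, and so on) are hypothetical.

```python
from fractions import Fraction

# Hypothetical example: a permutation of {0,...,4} with cycles (0 1) and (2 3 4).
T = {0: 1, 1: 0, 2: 3, 3: 4, 4: 2}
X = sorted(T)

def measure(mu, F):
    """mu(F) for a measure given by point masses."""
    return sum(mu[x] for x in F)

def preimage(F):
    """T^{-1} F."""
    return {x for x in X if T[x] in F}

def is_invariant(mu):
    """On a finite set it is enough to test mu(T^{-1}{y}) = mu({y}) for each point y."""
    return all(measure(mu, preimage({y})) == mu[y] for y in X)

def conditional(mu, E):
    """The probability measure F -> mu(F ∩ E) / mu(E)."""
    mass_E = measure(mu, E)
    return {x: (mu[x] / mass_E if x in E else Fraction(0)) for x in X}

mu = {x: Fraction(1, 5) for x in X}   # uniform measure, invariant under T
E = {0, 1}                            # a T-invariant set with 0 < mu(E) < 1
Ec = set(X) - E

mu_E, mu_Ec = conditional(mu, E), conditional(mu, Ec)
t = measure(mu, E)
combo = {x: t * mu_E[x] + (1 - t) * mu_Ec[x] for x in X}

print(is_invariant(mu_E), is_invariant(mu_Ec), combo == mu)
```

Here the uniform measure is a proper convex combination of the uniform measures on the two cycles, so it is neither ergodic nor extreme.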

Now suppose ${\mu}$ is ergodic and we can write $\displaystyle \mu = t \mu_1 + (1-t) \mu_2, \quad \mu_1, \mu_2 \in M(X,T), t \in (0,1).$

Then ${\mu_1}$ and ${\mu_2}$ are absolutely continuous with respect to ${\mu}$, so by the Radon-Nikodym theorem there are ${f_1, f_2 \in L^1(\mu)}$ with $\displaystyle \mu_1(F) = \int_F f_1 d \mu, \quad \mu_2(F) = \int_F f_2 d \mu, \quad \forall F.$

I will show that ${f_1 \equiv 1}$ almost everywhere.

Consider a constant ${\lambda}$ and ${E = \{ x \in X: f_1(x) < \lambda \}}$. I claim that ${E}$ is ${T}$-invariant up to a ${\mu}$-null set, i.e. ${\mu(E \Delta T^{-1}E) = 0}$. Indeed, $\displaystyle \mu_1(E) = \int_{E - T^{-1} E} f_1 d \mu + \int_{T^{-1}E \cap E} f_1 d \mu$

and similarly $\displaystyle \mu_1(E) = \mu_1(T^{-1}E) = \int_{T^{-1} E - E} f_1 d \mu + \int_{T^{-1}E \cap E} f_1 d \mu$

which one can put together to get $\displaystyle \int_{E - T^{-1} E} f_1 d \mu = \int_{T^{-1} E - E} f_1 d \mu .$

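But ${E-T^{-1}E}$ and ${T^{-1}E - E}$ have the same ${\mu}$-measure (indeed, when added to ${T^{-1}E \cap E}$ they give the sets ${E, T^{-1}E}$, which have the same measure). Call this common measure ${m}$. Since ${f_1 < \lambda}$ on the first set and ${f_1 \geq \lambda}$ on the second, if ${m > 0}$ then the left integral in the last equation would be strictly less than ${\lambda m}$ and the right one at least ${\lambda m}$, a contradiction. So ${m = 0}$, i.e. ${\mu(E \Delta T^{-1}E) = 0}$. By ergodicity, ${\mu(E) \in \{0,1\}}$. Since this is true for every ${\lambda}$, ${f_1}$ is constant a.e. ${[\mu]}$; the constant must be ${1}$ because ${\int_X f_1 d\mu = \mu_1(X) = 1}$, and so ${\mu_1=\mu}$. The same argument applies to ${\mu_2}$, so ${\mu}$ is an extreme point.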