In the theory of dynamical systems, it is of interest to have invariants to tell us when two dynamical systems are qualitatively “different.” Today, I want to talk about one particularly important one: topological entropy.

We will be in the setting of discrete dynamical systems: here a discrete dynamical system is just a pair ${(T,X)}$ for ${X}$ a compact metric space and ${T: X \rightarrow X}$ a continuous map.

Recall that two such pairs ${(T,X), (S,Y)}$ are called topologically conjugate if there is a homeomorphism ${h: X \rightarrow Y}$ such that ${T = h^{-1}Sh}$. This is a natural enough definition, and it is clearly an equivalence relation. For instance, it follows that there is a one-to-one correspondence between the orbits of ${T}$ and those of ${S}$. In particular, if ${T}$ has a fixed point, so does ${S}$. Admittedly this necessary criterion for determining whether ${T,S}$ are topologically conjugate is rather trivial.
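As a concrete illustration, the tent map and the logistic map ${x \mapsto 4x(1-x)}$ on ${[0,1]}$ are a classical topologically conjugate pair, via the homeomorphism ${h(x) = \sin^2(\pi x/2)}$. A quick numerical sketch checking the conjugacy relation ${h \circ T = S \circ h}$ on a grid (the maps and ${h}$ are standard; the grid check itself is just for illustration):

```python
import math

def tent(x):
    """Tent map T on [0, 1]."""
    return 2 * x if x <= 0.5 else 2 - 2 * x

def logistic(x):
    """Logistic map S(x) = 4x(1 - x) on [0, 1]."""
    return 4 * x * (1 - x)

def h(x):
    """Homeomorphism h(x) = sin^2(pi x / 2) of [0, 1] conjugating T to S."""
    return math.sin(math.pi * x / 2) ** 2

# Verify h(T(x)) == S(h(x)) on a grid: h carries orbits of T to orbits of S.
for k in range(1, 1000):
    x = k / 1000
    assert abs(h(tent(x)) - logistic(h(x))) < 1e-12
print("conjugacy relation verified on the grid")
```

In particular, the tent map's kink at ${x = 1/2}$ shows why only a homeomorphism, not a diffeomorphism, can be expected here.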

Note incidentally that topological conjugacy needs to be considered even when one is studying smooth dynamical systems—in many cases, one can construct a homeomorphism ${h}$ as above but not a diffeomorphism. This is the case in the Hartman-Grobman theorem, which states that if ${f: M \rightarrow M}$ is a smooth map with a fixed point at which the derivative is a hyperbolic linear map of the tangent space (i.e. one with no eigenvalues on the unit circle), then near that fixed point ${f}$ is locally topologically conjugate to the derivative (that is, to the corresponding linear map).

1. Definition of topological entropy

Anyway, we need new invariants. One extremely important one is topological entropy, which measures in some sense the “complexity” of ${T}$. Consider the following problem. For a natural number ${n}$, consider segments ${x, Tx, \dots, T^{n-1}x}$ for all ${x \in X}$. How many of them are there?

Clearly, the answer will be infinite in general. But we can count how many of them are needed to approximate every such segment to within ${\epsilon}$. To be precise, for ${n \in \mathbb{N}}$ and ${\epsilon>0}$, define the number ${S(n, \epsilon,T)}$ to be the minimal natural number ${m}$ for which there exist points ${y_1, \dots, y_m}$ with the property that for every ${x \in X}$, there is some ${j}$ with

$\displaystyle d(T^ix , T^i y_j) < \epsilon, \ \ 0 \leq i \leq n-1.$

Here ${d}$ is the metric on ${X}$. The topological entropy ${h_{top}(T)}$ is defined as

$\displaystyle \boxed{ h_{top}(T)} = \lim_{\epsilon \rightarrow 0} \limsup_{n \rightarrow \infty} \frac{1}{n} \log S(n, \epsilon, T).$
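To get a feel for the quantities involved, here is a small numerical sketch (illustrative only: the map, sample, and parameters are arbitrary choices, and the greedy procedure only estimates ${S(n,\epsilon,T)}$). For the doubling map on the circle ${\mathbb{R}/\mathbb{Z}}$, whose topological entropy is known to be ${\log 2}$, we estimate ${S(n, \epsilon, T)}$ from a finite sample by greedily extracting a maximal ${\epsilon}$-separated set in the metric ${d_n(x,y) = \max_{0 \leq i < n} d(T^i x, T^i y)}$; such a set is automatically ${\epsilon}$-dense in the sample, so its size estimates ${S(n,\epsilon,T)}$:

```python
import math

def circle_dist(x, y):
    """Arc-length metric on the circle R/Z."""
    return min(abs(x - y), 1 - abs(x - y))

def d_n(T, x, y, n):
    """The metric d_n(x, y) = max_{0 <= i < n} d(T^i x, T^i y)."""
    best = 0.0
    for _ in range(n):
        best = max(best, circle_dist(x, y))
        x, y = T(x), T(y)
    return best

def spanning_estimate(T, points, n, eps):
    """Size of a greedily built maximal eps-separated subset of `points`
    in the d_n metric; this is eps-dense in the sample, so it estimates
    S(n, eps, T)."""
    centers = []
    for x in points:
        if all(d_n(T, x, c, n) >= eps for c in centers):
            centers.append(x)
    return len(centers)

T = lambda x: (2 * x) % 1.0           # doubling map: entropy log 2
sample = [k / 1024 for k in range(1024)]
for n in (2, 4, 6):
    S = spanning_estimate(T, sample, n, 0.2)
    # (1/n) log S tends to log 2 ~ 0.693 as n grows and eps shrinks
    print(n, S, math.log(S) / n)
```

The growth rate of ${S(n,\epsilon,T)}$ in ${n}$, not its actual size, is what the entropy measures.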

This is a rather involved definition, so it will be useful to pause and think it over. One way to do so is to introduce a new metric on ${X}$.

Namely, we define the metric ${d_n}$ via ${d_n(x,y) = \max_{0 \leq i < n} d(T^ix, T^iy)}$. Then, in any metric space ${(A, \delta)}$, we can call a subset ${B \subset A}$ ${\epsilon}$-dense if every point of ${A}$ is of distance ${<\epsilon}$ from some point of ${B}$. The selection of points ${y_1, \dots, y_m}$ as above was made so that ${\{y_1, \dots, y_m\}}$ is an ${\epsilon}$-dense set—indeed, a smallest such—in ${X}$ endowed with the metric ${d_n}$. This provides some motivation for the definition.

There is a variation on the idea of ${\epsilon}$-dense: namely, ${\epsilon}$-separated. A subset is called ${\epsilon}$-separated if any two distinct points in it have distance ${\geq \epsilon}$. The problem of finding a maximal ${\epsilon}$-separated set (“packing points so that they are far from each other”) is closely related to the problem of finding a minimal ${\epsilon}$-dense set. Namely, a maximal ${\epsilon}$-separated set is automatically ${\epsilon}$-dense (any point at distance ${\geq \epsilon}$ from it could be added, contradicting maximality), and any ${\epsilon}$-dense set has at least as many points as any ${2\epsilon}$-separated set (each point of the separated set lies within ${\epsilon}$ of some point of the dense set, and by the triangle inequality distinct separated points get distinct such points).
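These assertions are easy to check on a computer as well. The following sketch (with an arbitrary random sample and ${\epsilon}$) greedily builds a maximal ${\epsilon}$-separated subset of points in the unit square and verifies that it is ${\epsilon}$-dense in the sample:

```python
import math
import random

def dist(p, q):
    """Euclidean distance in the plane."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def max_separated(points, eps):
    """Greedily build a maximal eps-separated subset of `points`:
    each point is kept iff it is >= eps from everything kept so far."""
    chosen = []
    for p in points:
        if all(dist(p, q) >= eps for q in chosen):
            chosen.append(p)
    return chosen

random.seed(0)
pts = [(random.random(), random.random()) for _ in range(2000)]
eps = 0.1
net = max_separated(pts, eps)

# maximality forces eps-density: every sample point is within eps of the net
assert all(min(dist(p, q) for q in net) < eps for p in pts)
print(len(net), "points in the maximal eps-separated set")
```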

This provides another way of thinking of topological entropy. Let ${S'(n,\epsilon,T)}$ denote the maximal cardinality of an ${\epsilon}$-separated subset of ${(X, d_n)}$. Then

$\displaystyle h_{top}(T) = \lim_{\epsilon \rightarrow 0} \limsup_{n \rightarrow \infty} \frac{1}{n} \log S'(n, \epsilon, T).$

2. A more natural definition

I personally find this definition a little strange. For one thing, it appears superficially to depend on the metric ${d}$, while we supposedly care only about the topological structure. In addition, the formula is rather complicated; in fact, we have not yet even shown that it is invariant under topological conjugacy.

The original definition of Adler, Konheim, and McAndrew is simpler and seems more natural to me; it is phrased very explicitly in terms of open coverings and does not even use the metric structure of ${X}$. So, fix a compact space ${X}$, and let ${T: X \rightarrow X}$ be a continuous map, as before. An open covering will be denoted ${\mathfrak{A}}$. The common refinement ${\mathfrak{A} \vee \mathfrak{B}}$ of two open coverings ${\mathfrak{A}, \mathfrak{B}}$ is the covering ${\{ U \cap V: U \in \mathfrak{A}, V \in \mathfrak{B}\}}$. We define the size ${\mathcal{N}(\mathfrak{A})}$ of the cover ${\mathfrak{A}}$ to be the minimal cardinality of a subcover (finite, by compactness); clearly ${\mathcal{N}(\mathfrak{A} \vee \mathfrak{B}) \leq \mathcal{N}(\mathfrak{A}) \mathcal{N}(\mathfrak{B})}$.
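The join and the submultiplicativity of ${\mathcal{N}}$ are easy to play with on a toy finite “space” (a sketch: the sets below are arbitrary, and the brute-force minimal-subcover search is only feasible for tiny examples):

```python
from itertools import combinations

def join(A, B):
    """The common refinement A v B: all nonempty pairwise intersections."""
    return [U & V for U in A for V in B if U & V]

def N(space, cover):
    """Minimal cardinality of a subcover, by brute force over subsets."""
    for k in range(1, len(cover) + 1):
        for sub in combinations(cover, k):
            if set().union(*sub) == space:
                return k
    raise ValueError("not a cover of the space")

X = frozenset(range(6))
A = [frozenset({0, 1, 2, 3}), frozenset({2, 3, 4, 5}), frozenset({0, 5})]
B = [frozenset({0, 1}), frozenset({1, 2, 3, 4}), frozenset({4, 5})]

# submultiplicativity: N(A v B) <= N(A) * N(B)
print(N(X, A), N(X, B), N(X, join(A, B)))  # prints: 2 3 3
```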

Given an open cover ${\mathfrak{A}}$, we define the inverse image ${T^{-1}(\mathfrak{A})}$ as ${\{ T^{-1}(U), U \in \mathfrak{A}\}}$; it is clear that ${\mathcal{N}(T^{-1}(\mathfrak{A})) \leq \mathcal{N}(\mathfrak{A})}$. The following theorem gives another characterization of topological entropy (which is how Walters introduces it):

Theorem 1 The topological entropy ${h_{top}(T)}$ is the supremum of

$\displaystyle \lim_{n \rightarrow \infty} \frac{1}{n} \log \mathcal{N}( \mathfrak{A} \vee T ^{-1}( \mathfrak{A}) \vee \dots \vee T^{-(n-1)}(\mathfrak{A}))$

over all open covers ${\mathfrak{A}}$.

This result actually follows rather easily from the definitions. Note first that the limit actually exists: if ${c_n = \log \mathcal{N}( \mathfrak{A} \vee T ^{-1}( \mathfrak{A}) \vee \dots \vee T^{-(n-1)}(\mathfrak{A}))}$, then the two properties of ${\mathcal{N}}$ just mentioned imply ${c_{n+m} \leq c_n +c_m}$, from which it is a straightforward exercise in analysis (Fekete's subadditive lemma) that ${\lim \frac{c_n}{n}}$ exists and equals ${\inf_n \frac{c_n}{n}}$.
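The analysis exercise just cited can be spelled out; here is a sketch in the notation above, with the convention ${c_0 = 0}$:

```latex
\textbf{Lemma (subadditivity).} If $c_{n+m} \leq c_n + c_m$ for all
$n, m \geq 0$, then $\lim_{n \to \infty} c_n/n$ exists and equals
$\inf_{m \geq 1} c_m/m$.

\emph{Sketch.} Fix $m \geq 1$ and write $n = qm + r$ with $0 \leq r < m$.
Subadditivity gives $c_n \leq q\,c_m + c_r$, so
\[
  \frac{c_n}{n} \leq \frac{q\,c_m + c_r}{qm + r}
  \longrightarrow \frac{c_m}{m} \qquad (n \to \infty),
\]
since $c_r$ ranges over finitely many values. Hence
$\limsup_n c_n/n \leq c_m/m$ for every $m$, and therefore
\[
  \limsup_n \frac{c_n}{n} \;\leq\; \inf_{m \geq 1} \frac{c_m}{m}
  \;\leq\; \liminf_n \frac{c_n}{n}. \qquad \square
\]
```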

Indeed, suppose ${\mathfrak{A}}$ is the cover by all ${\epsilon}$-balls, and take the metric ${d_n}$ as above. Any set ${\bigcap_{i=0}^{n-1} T^{-i}(U_i)}$ for ${U_i \in \mathfrak{A}}$ has ${d_n}$-diameter at most ${2\epsilon}$. So choosing one point from each set in a minimal subcover of ${\mathfrak{A} \vee T^{-1}(\mathfrak{A}) \vee \dots \vee T^{-(n-1)}(\mathfrak{A})}$ gives a ${2\epsilon}$-dense set for ${d_n}$, whence

$\displaystyle S(n, 2\epsilon, T) \leq \mathcal{N}(\mathfrak{A} \vee T^{-1} (\mathfrak{A}) \vee \dots \vee T^{-(n-1)}(\mathfrak{A})) .$

Conversely, if ${y_1, \dots, y_m}$ is a minimal ${\epsilon}$-dense set for ${d_n}$, then the ${d_n}$-balls ${B_{d_n}(y_j, \epsilon) = \bigcap_{i=0}^{n-1} T^{-i}( B(T^i y_j, \epsilon))}$ cover ${X}$, and each of them belongs to ${\mathfrak{A} \vee T^{-1}(\mathfrak{A}) \vee \dots \vee T^{-(n-1)}(\mathfrak{A})}$; hence

$\displaystyle \mathcal{N}(\mathfrak{A} \vee T^{-1} (\mathfrak{A}) \vee \dots \vee T^{-(n-1)}(\mathfrak{A})) \leq S(n, \epsilon, T).$

It follows that

$\displaystyle \limsup_{n \rightarrow \infty} \frac{1}{n} \log S(n, 2\epsilon, T) \leq \lim_{n \rightarrow \infty} \frac{1}{n} \log \mathcal{N}( \mathfrak{A} \vee T ^{-1}( \mathfrak{A}) \vee \dots \vee T^{-(n-1)}(\mathfrak{A})) \leq \limsup_{n \rightarrow \infty} \frac{1}{n} \log S(n, \epsilon, T).$

Letting ${\epsilon \rightarrow 0}$, both outer quantities tend to ${h_{top}(T)}$. So the topological entropy equals the limit, as ${\epsilon \rightarrow 0}$, of

$\displaystyle \lim_{n \rightarrow \infty} \frac{1}{n} \log \mathcal{N}( \mathfrak{A} \vee T ^{-1}( \mathfrak{A}) \vee \dots \vee T^{-(n-1)}(\mathfrak{A}))$

for ${\mathfrak{A}}$ the cover by ${\epsilon}$-balls; in particular, ${h_{top}(T)}$ is at most the supremum in the theorem.

Now, if ${\mathfrak{A}}$ is any open cover, I claim that the limit is at most ${h_{top}(T)}$; this will prove the theorem. Indeed, choose ${\epsilon > 0}$ with ${2\epsilon}$ smaller than a Lebesgue number of ${\mathfrak{A}}$; then every ${\epsilon}$-ball is contained in some member of ${\mathfrak{A}}$, i.e. the cover by ${\epsilon}$-balls refines ${\mathfrak{A}}$. Consequently the limit for ${\mathfrak{A}}$ is at most the corresponding limit for the cover by ${\epsilon}$-balls, which by the above is at most ${\limsup_n \frac{1}{n} \log S(n, \epsilon, T) \leq h_{top}(T)}$ (note that ${S(n, \epsilon, T)}$ only increases as ${\epsilon}$ decreases). So we are reduced to the case already handled above.

Incidentally, it clearly suffices to take the supremum over finite covers ${\mathfrak{A}}$: by compactness every open cover has a finite subcover, and passing to a subcover (or, more generally, to a refinement) only increases the limit. The characterization in the theorem is evidently invariant under topological conjugacy, so it implies:

Corollary 2 Topological entropy is invariant under topological conjugacy.

Next time, we’re going to compute the topological entropy in some explicit examples, to see what this actually means, and prove a few more elementary properties.