3 Ergodic theorems
This chapter develops the three classical ergodic theorems that drive the multiplicative ergodic theorem: the maximal ergodic inequality of Hopf and Garsia, the pointwise (Birkhoff) ergodic theorem, and Kingman’s subadditive ergodic theorem. The first is the analytic gateway; the second turns the gateway into almost-everywhere convergence of Birkhoff averages; the third is the genuine engine of Oseledets, turning a subadditive sequence of log-norms into an almost-everywhere limit.
Throughout, \(T : X \to X\) is a map on a measurable space carrying a measure \(\mu \), birkhoffSum \(T\, g\, n\, x = \sum _{k{\lt}n} g(T^{[k]}x)\) is the \(n\)-th Birkhoff partial sum, and birkhoffAverage \(\mathbb {R}\, T\, g\, n\, x = n^{-1}\, \texttt{birkhoffSum}\, T\, g\, n\, x\) is the Birkhoff average. The arguments are due to Garsia, Katznelson–Weiss, and Karlsson; the formalization keeps every quantity in \(\mathbb {R}\) (or, where \(-\infty \) cannot be excluded a priori, in EReal) to avoid junk values.
3.1 Maximal ergodic inequality
For a measurable \(T\), a function \(g : X \to \mathbb {R}\), \(N : \mathbb {N}\) and \(x\), the maximal function is
\[ \texttt{maxBirkhoff}\, T\, g\, N\, x = \max _{k \in \texttt{range}\, (N+1)} \texttt{birkhoffSum}\, T\, g\, k\, x, \]
the nonempty Finset.sup' of the Birkhoff partial sums over \(k \in \texttt{range}\, (N+1)\). Since the \(k = 0\) term equals \(\texttt{birkhoffSum}\, T\, g\, 0\, x = 0\), the maximal function is nonnegative and hence equal to its own positive part.
For all \(g, N, x\) one has \(0 \le \texttt{maxBirkhoff}\, T\, g\, N\, x\).
The index \(k = 0\) lies in \(\texttt{range}\, (N+1)\), and the corresponding term is \(\texttt{birkhoffSum}\, T\, g\, 0\, x = 0\). The supremum over a nonempty finite set dominates each of its terms, so the maximal function is at least \(0\).
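A small numerical sketch can make the definition concrete. It is not part of the formalization: birkhoff_sum and max_birkhoff are hypothetical Python analogues of birkhoffSum and maxBirkhoff, and the five-point rotation with the listed values of \(g\) is an assumed toy system.

```python
def birkhoff_sum(T, g, n, x):
    """Sum of g over the first n points of the orbit of x (0 when n == 0)."""
    total = 0
    for _ in range(n):
        total += g(x)
        x = T(x)
    return total


def max_birkhoff(T, g, N, x):
    """Maximum of the partial sums with index k = 0, ..., N."""
    return max(birkhoff_sum(T, g, k, x) for k in range(N + 1))


# Toy data (assumed for illustration): rotation on Z/5Z with integer values.
VALUES = [-3, 1, -2, 4, -1]
T = lambda x: (x + 1) % 5
g = lambda x: VALUES[x]

print([max_birkhoff(T, g, 3, x) for x in range(5)])  # [0, 3, 2, 4, 0]
```

Every entry is nonnegative, as the lemma predicts, because the empty sum at \(k = 0\) always competes in the maximum.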
For all \(g, N, x\),
\[ \texttt{maxBirkhoff}\, T\, g\, (N+1)\, x = \max \bigl(\texttt{birkhoffSum}\, T\, g\, (N+1)\, x,\ \texttt{maxBirkhoff}\, T\, g\, N\, x\bigr). \]
This is the identity \(\texttt{range}\, (N+2) = \texttt{insert}\, (N+1)\, (\texttt{range}\, (N+1))\) applied to the supremum: splitting off the top index gives the join of the new term with the previous maximum. The two inequalities that make up the equality follow from the universal/existential characterizations of Finset.sup'. The recursion is what makes integrability provable by induction on \(N\).
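The recursion can be checked by brute force on a toy system. This is only a sanity check under assumed data (a five-point rotation and hypothetical helper names), not a substitute for the Finset argument.

```python
def birkhoff_sum(T, g, n, x):
    """Sum of g over the first n points of the orbit of x (0 when n == 0)."""
    total = 0
    for _ in range(n):
        total += g(x)
        x = T(x)
    return total


def max_birkhoff(T, g, N, x):
    """Maximum of the partial sums with index k = 0, ..., N."""
    return max(birkhoff_sum(T, g, k, x) for k in range(N + 1))


def recursion_holds(T, g, N, x):
    """Level N + 1 equals the join of the new top term with level N."""
    top_term = birkhoff_sum(T, g, N + 1, x)
    return max_birkhoff(T, g, N + 1, x) == max(top_term, max_birkhoff(T, g, N, x))


# Toy data (assumed for illustration): rotation on Z/5Z with integer values.
VALUES = [-3, 1, -2, 4, -1]
T = lambda x: (x + 1) % 5
g = lambda x: VALUES[x]

print(all(recursion_holds(T, g, N, x) for N in range(10) for x in range(5)))  # True
```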
For every \(x\) at which the maximal function is positive, that is, with \(0 {\lt} \texttt{maxBirkhoff}\, T\, g\, N\, x\),
\[ \texttt{maxBirkhoff}\, T\, g\, N\, x \le g\, x + \texttt{maxBirkhoff}\, T\, g\, N\, (Tx). \]
Pull the constant \(g\, x\) through the supremum: by the additive shift of a nonempty sup' and the recursion \(\texttt{birkhoffSum}\, T\, g\, (k+1)\, x = g\, x + \texttt{birkhoffSum}\, T\, g\, k\, (Tx)\), the right-hand side equals \(\max _{0 \le k \le N} \texttt{birkhoffSum}\, T\, g\, (k+1)\, x\), the maximum of the shifted partial sums. The maximum defining the left-hand side is attained at some index \(k_0\); positivity forces \(k_0 \ge 1\) (the index \(0\) gives the value \(0\)), so \(\texttt{birkhoffSum}\, T\, g\, k_0\, x\) is one of the shifted sums and is therefore dominated by their maximum.
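The role of the positivity hypothesis shows up in a toy computation. The sketch below assumes the inequality in the form read off from the proof (the maximal function at \(x\) is at most \(g\, x\) plus the maximal function at \(Tx\)); the rotation, the values of \(g\) and the helper names are hypothetical.

```python
def birkhoff_sum(T, g, n, x):
    """Sum of g over the first n points of the orbit of x (0 when n == 0)."""
    total = 0
    for _ in range(n):
        total += g(x)
        x = T(x)
    return total


def max_birkhoff(T, g, N, x):
    """Maximum of the partial sums with index k = 0, ..., N."""
    return max(birkhoff_sum(T, g, k, x) for k in range(N + 1))


def shifted_bound(T, g, N, x):
    """Right-hand side: g(x) plus the maximal function at T(x)."""
    return g(x) + max_birkhoff(T, g, N, T(x))


# Toy data (assumed for illustration): rotation on Z/5Z with integer values.
VALUES = [-3, 1, -2, 4, -1]
T = lambda x: (x + 1) % 5
g = lambda x: VALUES[x]

N = 3
for x in range(5):
    m = max_birkhoff(T, g, N, x)
    # Columns: point, maximal function, right-hand side, whether the bound holds.
    print(x, m, shifted_bound(T, g, N, x), m <= shifted_bound(T, g, N, x))
```

At \(x = 4\) the maximal function is \(0\) while the right-hand side is \(-1\), so the bound fails there; it holds at every point where the maximal function is positive.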
If \(T\) is measure-preserving and \(g\) is integrable, then each Birkhoff partial sum and the maximal function \(\texttt{maxBirkhoff}\, T\, g\, N\) are integrable; if moreover \(g\) and \(T\) are measurable, then \(\texttt{maxBirkhoff}\, T\, g\, N\) is measurable.
Each summand \(g \circ T^{[j]}\) is integrable because \(T^{[j]}\) is measure-preserving, so the finite Birkhoff sum is integrable; measurability is likewise a finite sum of measurable compositions. Integrability of the maximal function follows by induction on \(N\) using the recursion (lemma 3.3) and the fact that a join of two integrable functions is integrable; measurability is the measurability of a finite sup'.
For a measure-preserving \(T\) and a measurable integrable \(g\), and every \(N : \mathbb {N}\),
\[ 0 \le \int _{\{ x \, :\, 0 {\lt} \texttt{maxBirkhoff}\, T\, g\, N\, x\} } g \, d\mu . \]
Write \(E\) for the level set and \(M = \texttt{maxBirkhoff}\, T\, g\, N\). Integrating the pointwise inequality (lemma 3.4) over \(E\) gives \(\int _E (M - M \circ T) \le \int _E g\). Now \(\int _E M \circ T \le \int _X M \circ T = \int _X M\) by nonnegativity and measure-preservation (via integral_map and \(\texttt{map}\, T\, \mu = \mu \), avoiding any embedding hypothesis), and \(\int _X M = \int _E M\) since \(M\) vanishes off \(E\). Chaining these cancels the two maximal-function integrals and leaves \(0 \le \int _E (M - M\circ T) \le \int _E g\).
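On a finite set with counting measure and a permutation \(T\) the statement becomes a finite sum, which can be tested directly. The following sketch uses assumed toy data and hypothetical helper names; it illustrates the inequality and does not mirror the Lean proof.

```python
def birkhoff_sum(T, g, n, x):
    """Sum of g over the first n points of the orbit of x (0 when n == 0)."""
    total = 0
    for _ in range(n):
        total += g(x)
        x = T(x)
    return total


def max_birkhoff(T, g, N, x):
    """Maximum of the partial sums with index k = 0, ..., N."""
    return max(birkhoff_sum(T, g, k, x) for k in range(N + 1))


def level_set(T, g, N, points):
    """Points where the maximal function at level N is strictly positive."""
    return [x for x in points if max_birkhoff(T, g, N, x) > 0]


def level_set_integral(T, g, N, points):
    """Integral of g over the level set, for counting measure."""
    return sum(g(x) for x in level_set(T, g, N, points))


# Toy data (assumed): rotation on Z/5Z, which preserves counting measure.
VALUES = [-3, 1, -2, 4, -1]
T = lambda x: (x + 1) % 5
g = lambda x: VALUES[x]

print(level_set(T, g, 3, range(5)), level_set_integral(T, g, 3, range(5)))  # [1, 2, 3] 3
```

The total integral of \(g\) in this example is \(-1\), yet the integral over the level set is \(3\): the inequality is a statement about the level set, not about \(g\) as a whole.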
The level sets \(\{ x : 0 {\lt} \texttt{maxBirkhoff}\, T\, g\, N\, x\} \) are monotone in \(N\), and their union over all \(N\) is exactly \(\{ x : \exists n,\ 0 {\lt} \texttt{birkhoffSum}\, T\, g\, (n+1)\, x\} \).
Monotonicity is the monotonicity of Finset.sup' over the nested ranges. For the union: \(0 {\lt} \texttt{maxBirkhoff}\, T\, g\, N\, x\) means some \(\texttt{birkhoffSum}\, T\, g\, k\, x {\gt} 0\) with \(k \le N\); the index \(0\) gives \(0\) and is excluded, so \(k = n+1\) for some \(n\). Conversely a positive partial sum at index \(n+1\) makes the maximal function positive at level \(N = n+1\).
For a measure-preserving \(T\) and an integrable \(f : X \to \mathbb {R}\),
\[ 0 \le \int _{\{ x \, :\, \exists n,\ 0 {\lt} \texttt{birkhoffSum}\, T\, f\, (n+1)\, x\} } f \, d\mu . \]
First treat a measurable integrable \(g\): by proposition 3.6 each level-set integral is nonnegative, and by lemma 3.7 the level sets increase to the target set, so these integrals converge to the integral over the union (convergence of set integrals along an increasing union); hence the limit is nonnegative. For general integrable \(f\), pass to a measurable representative \(g =^{\text{a.e.}} f\); the Birkhoff sums of \(f\) and \(g\) agree a.e., so the two target sets agree a.e. (the one for \(g\) being measurable, the one for \(f\) null-measurable), and the integrands agree a.e., which transfers the inequality from \(g\) to \(f\).
3.2 Birkhoff
For a finite measure, a measure-preserving measurable \(T\) and integrable \(g\), the conditional expectation onto the \(\sigma \)-algebra \(\mathcal I = \texttt{invariants}\, T\) of \(T\)-invariant sets is a.e. \(T\)-invariant:
\[ \mu [g \mid \mathcal I] \circ T =^{\text{a.e.}} \mu [g \mid \mathcal I]. \]
For an integrable \(h\) and a measurable invariant set \(s\), one has \(\int _s h \circ T\, d\mu = \int _s h\, d\mu \): since \(T^{-1}s = s\), this is the change of variables given by \(\texttt{map}\, T\, \mu = \mu \). Hence \(\mu [g \circ T \mid \mathcal I]\) and \(\mu [g \mid \mathcal I]\) have equal integrals over every invariant set, so by uniqueness of the conditional expectation they agree a.e.; combining this with the commutation identity \(\mu [g\circ T \mid \mathcal I] =^{\text{a.e.}} (\mu [g\mid \mathcal I])\circ T\) (again by uniqueness, using that \(T\) is \((\mathcal I,\mathcal I)\)-measurable) yields the invariance.
For a measure-preserving \(T\) and integrable \(g\), the orbital tail \(n^{-1}\, g(T^{[n]}x)\) tends to \(0\) for \(\mu \)-a.e. \(x\).
A Borel–Cantelli argument. For each threshold \(\delta = 1/(k+1)\) the series \(\sum _n \mu \{ x : (n+1)\delta \le \left\lvert g\, x \right\rvert \} \) is finite: the pointwise count of crossed thresholds is at most \(\left\lvert g\, x \right\rvert /\delta \), and integrating (Tonelli) bounds the series by \(\delta ^{-1}\int \left\lvert g \right\rvert \). Measure-preservation transfers finiteness to the shifted orbit \(g \circ T^{[n]}\), so a.e. only finitely many \(n\) cross any fixed threshold; choosing \(k\) large makes the tail eventually smaller than any \(\varepsilon \).
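The counting step is elementary arithmetic and can be checked on one value. In the sketch below, \(t\) plays the role of \(\left\lvert g\, x \right\rvert \); the function name and the sample numbers are assumptions for illustration.

```python
from fractions import Fraction


def crossed_thresholds(t, delta):
    """Number of indices n >= 0 with (n + 1) * delta <= t, for t >= 0 and delta > 0."""
    count = 0
    # The crossing indices form an initial segment 0, 1, ..., count - 1.
    while (count + 1) * delta <= t:
        count += 1
    return count


t = Fraction(7, 2)        # plays the role of |g x|
delta = Fraction(1, 3)    # threshold 1 / (k + 1) with k = 2
print(crossed_thresholds(t, delta), t / delta)  # 10 21/2
```

The count is the integer part of \(t/\delta \), so summing the indicator functions over \(n\) and integrating gives the bound \(\delta ^{-1}\int \left\lvert g \right\rvert \) used above.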
For a finite measure, a measure-preserving \(T\) and integrable \(g\), the Birkhoff averages \(n \mapsto \texttt{birkhoffAverage}\, \mathbb {R}\, T\, g\, (n+1)\, x\) are a.e. bounded above (and, applied to \(-g\), a.e. bounded below).
The maximal ergodic inequality applied to \(g - c\) yields, for the maximal set \(B_c = \{ x : \exists n,\ c {\lt} \texttt{birkhoffAverage}\, \mathbb {R}\, T\, g\, (n+1)\, x\} \), the estimate \(c\, \mu B_c \le \int _{B_c} g \le \int \left\lvert g \right\rvert \). Taking \(c = k \in \mathbb {N}\) gives \(\mu B_k \le \int \left\lvert g \right\rvert /k \to 0\), so the intersection \(\bigcap _k B_k\) is null. Off this null set the range of Birkhoff averages is bounded above.
The pointwise limsup \(x \mapsto \limsup _n \texttt{birkhoffAverage}\, \mathbb {R}\, T\, g\, n\, x\) is a.e. \(T\)-invariant.
Write \(A_n(g)\) for \(\texttt{birkhoffAverage}\, \mathbb {R}\, T\, g\, n\). The difference \(A_n(g)(Tx) - A_n(g)(x) = n^{-1}(g(T^{[n]}x) - g\, x)\) tends to \(0\) a.e.: the first term by the tail estimate (lemma 3.10), the second trivially. Two bounded sequences differing by a null sequence have equal limsup (proved via limsup_add_const and an \(\varepsilon \)-argument), and boundedness holds a.e. at both \(x\) and \(Tx\) by lemma 3.11; hence the limsup is a.e. \(T\)-invariant.
For a finite measure, measure-preserving \(T\), integrable \(g\) and \(\varepsilon {\gt} 0\), the superlevel set where the limsup of the Birkhoff averages exceeds \(\mu [g\mid \mathcal I] + \varepsilon \) is null.
Write \(L = \mu [g\mid \mathcal I]\) and \(Ls = \limsup A_\bullet (g)\). The set \(E = \{ L + \varepsilon {\lt} Ls\} \) is a.e. \(T\)-invariant (theorem 3.9, lemma 3.12), hence a.e. equal to a genuinely invariant measurable set \(E'\). Feed \(\varphi = \mathbf1_{E'}(g - L - \varepsilon )\) to the maximal ergodic inequality (theorem 3.8); on \(E'\) the partial sums of \(\varphi \) telescope (using orbit-constancy of \(L\)), and the maximal set equals \(E'\). Thus \(0 \le \int _{E'}(g - L - \varepsilon ) = -\varepsilon \, \mu E'\) (using \(\int _{E'} g = \int _{E'} L\)), forcing \(\mu E' = 0\).
For a finite measure, a measure-preserving \(T\) and an integrable \(g\), the Birkhoff averages converge \(\mu \)-a.e. to the conditional expectation of \(g\) onto the invariant \(\sigma \)-algebra:
\[ \lim _{n \to \infty } \texttt{birkhoffAverage}\, \mathbb {R}\, T\, g\, n\, x = \mu [g \mid \mathcal I]\, (x) \quad \text{for } \mu \text{-a.e. } x. \]
Taking the union of the null superlevel sets of proposition 3.13 over \(\varepsilon = 1/(k+1)\) gives \(\limsup A_\bullet (g) \le \mu [g\mid \mathcal I]\) a.e.; applying the same to \(-g\) (and using \(\limsup (-a) = -\liminf a\)) gives \(\mu [g\mid \mathcal I] \le \liminf A_\bullet (g)\) a.e. With a.e. boundedness (lemma 3.11), the sandwich \(\limsup \le \mu [g\mid \mathcal I] \le \liminf \) forces convergence to \(\mu [g\mid \mathcal I]\).
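A finite toy model shows what the limit looks like when \(T\) is not ergodic. For a permutation of a finite set with uniform measure, the invariant sets are unions of cycles, so the conditional expectation onto \(\mathcal I\) is the mean of \(g\) over the cycle through \(x\). The permutation, the values and the helper names below are assumptions for illustration.

```python
from fractions import Fraction

PERM = [1, 2, 0, 4, 3]          # two cycles: (0 1 2) and (3 4)
VALUES = [3, -1, 4, 10, 20]


def T(x):
    return PERM[x]


def g(x):
    return VALUES[x]


def birkhoff_average(T, g, n, x):
    """Exact average of g over the first n points of the orbit of x (n >= 1)."""
    total = 0
    for _ in range(n):
        total += g(x)
        x = T(x)
    return Fraction(total, n)


def orbit_mean(T, g, x):
    """Mean of g over the cycle through x: the conditional expectation in this model."""
    orbit = [x]
    y = T(x)
    while y != x:
        orbit.append(y)
        y = T(y)
    return Fraction(sum(g(p) for p in orbit), len(orbit))


print(birkhoff_average(T, g, 6, 0), orbit_mean(T, g, 0))  # 2 2
print(birkhoff_average(T, g, 6, 3), orbit_mean(T, g, 3))  # 15 15
```

The two cycles have different limits (2 and 15), neither equal to the global mean \(36/5\); when \(T\) is ergodic there is a single cycle and the limit is the space average.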
For an ergodic \(T\) on a probability space and integrable \(g\), the Birkhoff averages converge \(\mu \)-a.e. to the space average \(\int g\, d\mu \).
By theorem 3.14 the limit is \(\mu [g\mid \mathcal I]\), which is a.e. \(T\)-invariant (theorem 3.9); ergodicity forces it to be a.e. constant. Its integral equals \(\int g\) (conditional expectation preserves the integral) and equals the constant times \(\mu (X) = 1\), so the constant is \(\int g\).
3.3 Kingman
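A sequence \(g : \mathbb {N}\to X \to \mathbb {R}\) is a subadditive cocycle over \(T\) when
\[ g\, (m+n)\, x \le g\, m\, x + g\, n\, (T^{[m]}x) \qquad \text{for all } m, n : \mathbb {N} \text{ and all } x. \]
For \(g_n = \log \left\lVert A^{(n)} \right\rVert \) this is submultiplicativity of the operator norm combined with the cocycle identity.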
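The log-norm example can be tested on a small matrix cocycle. The sketch assumes the convention \(g\, (m+n)\, x \le g\, m\, x + g\, n\, (T^{[m]}x)\) (read off from how the inequality is used later) and uses the maximum absolute row sum as operator norm; the three matrices over \(\mathbb {Z}/3\mathbb {Z}\) and all helper names are hypothetical.

```python
import math

# Hypothetical data: one integer matrix per point of Z/3Z, with T(x) = x + 1 mod 3.
MATS = [((2, 1), (1, 1)), ((1, 1), (0, 1)), ((1, 0), (3, 1))]


def T(x):
    return (x + 1) % 3


def matmul(A, B):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def norm(A):
    """Maximum absolute row sum: the operator norm induced by the sup norm."""
    return max(sum(abs(entry) for entry in row) for row in A)


def g(n, x):
    """Log-norm of the n-step product A(T^(n-1) x) ... A(T x) A(x)."""
    P = ((1, 0), (0, 1))
    for _ in range(n):
        P = matmul(MATS[x], P)
        x = T(x)
    return math.log(norm(P))


def subadditive_at(m, n, x):
    """Check g(m + n)(x) <= g(m)(x) + g(n)(T^m x), up to rounding."""
    x_m = x
    for _ in range(m):
        x_m = T(x_m)
    return g(m + n, x) <= g(m, x) + g(n, x_m) + 1e-9


print(all(subadditive_at(m, n, x) for m in range(6) for n in range(6) for x in range(3)))
```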
For a subadditive cocycle and \(n : \mathbb {N}\), one has \(g\, (n+1)\, x \le \texttt{birkhoffSum}\, T\, (g\, 1)\, (n+1)\, x\); more generally, for any decomposition of \([0,N)\) into \(k+1\) consecutive blocks of lengths \(\ell _0, \dots , \ell _k\), the cocycle is dominated by the sum of the block values along the orbit at the frontiers \(T^{[\sum _{j{\lt}i}\ell _j]}x\).
Both are inductions that peel off the last block and apply the defining subadditivity at the split point. The statement is restricted to nonempty decompositions (and to index \(n+1\)) because subadditivity at \((0,0)\) only forces \(0 \le g\, 0\, x\), the wrong sign for a one-sided bound at \(0\).
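The block bound can be tried out on the same kind of toy cocycle. In this sketch the matrices, the maximum-row-sum norm and the helper names are assumptions; block_bound sums the block values \(g\, \ell _i\) at the frontier points of the decomposition.

```python
import math

# Hypothetical data: one integer matrix per point of Z/3Z, with T(x) = x + 1 mod 3.
MATS = [((2, 1), (1, 1)), ((1, 1), (0, 1)), ((1, 0), (3, 1))]


def T(x):
    return (x + 1) % 3


def matmul(A, B):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def norm(A):
    """Maximum absolute row sum: the operator norm induced by the sup norm."""
    return max(sum(abs(entry) for entry in row) for row in A)


def g(n, x):
    """Log-norm of the n-step product A(T^(n-1) x) ... A(T x) A(x)."""
    P = ((1, 0), (0, 1))
    for _ in range(n):
        P = matmul(MATS[x], P)
        x = T(x)
    return math.log(norm(P))


def block_bound(lengths, x):
    """Sum of g(l_i) evaluated at the frontier T^[l_0 + ... + l_(i-1)] x."""
    total = 0.0
    for length in lengths:
        total += g(length, x)
        for _ in range(length):
            x = T(x)
    return total


lengths = [2, 1, 3, 1]                      # blocks covering [0, 7)
print(g(sum(lengths), 0), block_bound(lengths, 0))
```

Taking all blocks of length one recovers the first bound: block_bound([1] * (n + 1), x) is the Birkhoff sum of \(g\, 1\) over \(n+1\) steps.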
The normalized cocycle is \(\texttt{cdiv}\, g\, n\, x = g\, (n+1)\, x / (n+1)\), the sequence whose a.e. limit is the content of Kingman’s theorem; its EReal coercion is \(\texttt{ecdiv}\, g\, n\, x\), used where \(-\infty \) cannot be excluded a priori.
For a measure-preserving \(T\), an integrable subadditive cocycle \(g\) whose normalized integrals are bounded below, the sequence \((\int g\, (n+1)\, d\mu )/(n+1)\) converges to the Fekete constant \(\gamma \).
Integrating the cocycle inequality and using measure-preservation (\(\int g\, n\, (T^{[m]}\cdot ) = \int g\, n\)) shows the integral sequence \(a_n = \int g\, n\) is subadditive in Fekete’s sense. The \((n+1)\)-indexed lower bound is bridged by hand to a lower bound on \(a_n/n\) (the \(n=0\) term is \(0\)), and Fekete’s lemma delivers convergence of \(a_n/n\), hence of the shifted sequence, to \(\gamma = \inf _n a_n/n\).
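Fekete's lemma itself is easy to see on an explicit sequence. The sequence below is a hypothetical example, not the integral sequence of the text; it is subadditive but \(a_n/n\) is not monotone, and the ratios still approach the infimum.

```python
from fractions import Fraction


def a(n):
    """Hypothetical subadditive sequence: n, plus 1 when n is odd."""
    return n + (n % 2)


def is_subadditive(seq, bound):
    """Check seq(m + n) <= seq(m) + seq(n) for 1 <= m, n < bound."""
    return all(
        seq(m + n) <= seq(m) + seq(n)
        for m in range(1, bound)
        for n in range(1, bound)
    )


ratios = [Fraction(a(n), n) for n in range(1, 9)]
print(is_subadditive(a, 50))        # True
print([str(r) for r in ratios])     # ['2', '1', '4/3', '1', '6/5', '1', '8/7', '1']
```

Here \(\inf _n a_n/n = 1\) is attained at every even index, and the odd-index ratios \((n+1)/n\) decrease to the same value.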
For a finite measure, measure-preserving \(T\) and an a.e. measurable \(F\) with \(F\, x \le F\, (Tx)\) for a.e. \(x\), one has \(F \circ T =^{\text{a.e.}} F\).
For each rational \(c\) the upper level set \(\{ c \le F\} \) is null-measurable (via a measurable representative) and a.e. contained in its preimage \(T^{-1}\{ c \le F\} \), which has equal finite measure; an a.e. subset of equal finite measure is a.e. equal. Ranging over all rational \(c\) and collecting the a.e. statements gives the invariance.
For a finite measure, measure-preserving \(T\), an integrable subadditive cocycle with normalized integrals bounded below, the limsup envelope \(f_+(x) = \limsup _n \texttt{cdiv}\, g\, n\, x\) (and likewise the liminf envelope \(f_-\)) is a.e. \(T\)-invariant.
The subadditive bound \(\texttt{cdiv}\, g\, n\, x \le g\, 1\, x/(n+1) + g\, n\, (Tx)/(n+1)\) differs from \(\texttt{cdiv}\, g\, n\, (Tx)\) by a null sequence, so the vanishing-perturbation lemma for limsup gives the pointwise comparison \(f_+(x) \le f_+(Tx)\) at every \(x\) for which the normalized cocycle is bounded (above and below) at both \(x\) and \(Tx\), which is the case for a.e. \(x\). Feeding this into lemma 3.20 yields a.e. invariance.
Under the same hypotheses, the limsup envelope \(f_+\) is integrable.
The nonnegative Fatou defect \(d_n(x) = \texttt{birkhoffAverage}\, \mathbb {R}\, T\, (g\, 1)\, (n+1)\, x - \texttt{cdiv}\, g\, n\, x \ge 0\) (singleton subadditivity) controls the envelope. The ENNReal Fatou inequality \(\int ^- \liminf u_n \le \liminf \int ^- u_n\) applied to \(u_n = \texttt{ofReal}\, d_n\), together with \(\int d_n = \int g\, 1 - a_{n+1}/(n+1) \to \int g\, 1 - \gamma {\lt} \infty \) (lemma 3.19) and Birkhoff convergence of \(A_{n+1}(g\, 1)\) to a loose envelope \(B\) (theorem 3.14), shows \(B - f_+ \ge 0\) has finite lower integral, hence is integrable; therefore \(f_+ = B - (B - f_+)\) is integrable.
Under the same hypotheses, for a.e. \(x\) the EReal limsup of the normalized cocycle is dominated by its liminf.
This is the stopping-time / greedy block argument of Katznelson–Weiss and Karlsson. After a shift that reduces, without loss of generality, to the nonpositive process \(\tilde g\, (n+1) \le 0\), fix \(\varepsilon , M {\gt} 0\) and set \(h = \mu [\max (f_-, -M) \mid \mathcal I]\). A greedy two-type partition of \([0,n)\) into “good” blocks (where the stopping time \(\tau \le L\) realizing \(\tilde g\, \tau \le \tau (h+\varepsilon )\) exists) and short singletons (bad/overrun) bounds, via block subadditivity (lemma 3.17), \(\tilde g\, n\, x/n \le (h\, x+\varepsilon )(1 - (L-1)/n - \texttt{birkhoffAverage}\, \mathbf1_{B_L})\). Letting \(n \to \infty \) (Birkhoff, theorem 3.14), then \(L \to \infty \) (the bad sets \(B_L\) shrink to a null set), \(M \to \infty \) and \(\varepsilon \to 0\) yields \(f_+ \le f_-\) a.e. The whole argument is carried out in EReal to keep the bookkeeping clean near \(-\infty \).
For a finite measure, measure-preserving measurable \(T\), an integrable subadditive cocycle with normalized integrals bounded below, there is an integrable \(G\) with \(\texttt{cdiv}\, g\, n\, x \to G\, x\) for \(\mu \)-a.e. \(x\).
Take \(G = f_+\), integrable by lemma 3.22. On the a.e. good set the EReal limsup \(e\) satisfies \(\bot {\lt} e \le B {\lt} \top \) (finiteness from the loose envelope and the Fatou step), and \(\liminf = \limsup = e\) (proposition 3.23); in a complete linear order equal liminf and limsup force convergence to \(e\). Transferring to \(\mathbb {R}\) gives \(\texttt{cdiv}\, g\, n\, x \to e^{\text{toReal}} = f_+\, x\).
For a finite measure, a measure-preserving \(T\), an integrable subadditive cocycle \(g\) whose normalized integrals are bounded below, there is a \(T\)-invariant integrable \(G\) with
\[ \lim _{n \to \infty } n^{-1}\, g\, n\, x = G\, x \quad \text{for } \mu \text{-a.e. } x. \]
Take \(G = f_-\). On the a.e. set where the normalized cocycle is bounded, \(\liminf \le \limsup \) is trivial and \(\limsup \le \liminf \) is the hard direction inside theorem 3.24, so \(f_- =^{\text{a.e.}} f_+\); the sandwich \(\limsup \le f_- \le \liminf \) yields pointwise convergence to \(f_-\), and reindexing removes the \(n=0\) term. Invariance is lemma 3.21 (liminf variant), and integrability follows from \(f_- =^{\text{a.e.}} f_+\) with lemma 3.22.
For an ergodic \(T\) on a probability space, an integrable subadditive cocycle with normalized integrals bounded below, there is a constant \(c\) with \(n^{-1}\, g\, n\, x \to c\) for \(\mu \)-a.e. \(x\).
Kingman’s theorem (theorem 3.25) gives a \(T\)-invariant integrable limit \(G\); ergodicity forces an a.e. \(T\)-invariant integrable function to be a.e. constant, so \(G =^{\text{a.e.}} c\) and the limit is \(c\). (That constant is the Fekete infimum \(\gamma \); only a.e.-constancy is asserted, as this is what the multiplicative ergodic theorem consumes.)
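The ergodic case can be watched numerically on a periodic toy cocycle. The sketch assumes a two-point system \(T(x) = 1 - x\) with uniform measure (a single cycle, hence ergodic), two hypothetical integer matrices, and the maximum-row-sum operator norm. For such a periodic cocycle the limit is half the logarithm of the spectral radius of one full period, by Gelfand's formula; this closed form is specific to the toy model.

```python
import math

# Hypothetical data: one integer matrix per point of the two-point system.
MATS = [((2, 1), (1, 1)), ((1, 1), (0, 1))]


def T(x):
    return 1 - x


def matmul(A, B):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def norm(A):
    """Maximum absolute row sum: the operator norm induced by the sup norm."""
    return max(sum(abs(entry) for entry in row) for row in A)


def g(n, x):
    """Log-norm of the n-step product, computed with exact integer entries."""
    P = ((1, 0), (0, 1))
    for _ in range(n):
        P = matmul(MATS[x], P)
        x = T(x)
    return math.log(norm(P))


def normalized(n, x):
    return g(n, x) / n


# One full period from x = 0 is A(1) A(0) = [[3, 2], [1, 1]], with trace 4 and
# determinant 1, so its spectral radius is 2 + sqrt(3).
LIMIT = math.log(2 + math.sqrt(3)) / 2

for n in (10, 100, 1000):
    print(n, normalized(n, 0), normalized(n, 1), LIMIT)
```

Both starting points approach the same constant, which is the a.e.-constancy asserted above.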