The Oseledets Multiplicative Ergodic Theorem in Lean 4

2 The linear cocycle and Furstenberg–Kesten

2.1 The measure-preserving system and the linear cocycle

The Oseledets theorem studies the long-term growth of products of matrices driven by a dynamical system. The data are a measure-preserving self-map \(T : X \to X\) of a probability space \((X,\mu )\) and a matrix-valued generator \(A : X \to \operatorname {Matrix}(\operatorname {Fin}d)(\operatorname {Fin}d)\, \mathbb {R}\). The fundamental object is the iterated linear cocycle: the product of the generator evaluated along the orbit of \(x\). We use the convention that the newest factor sits on the left.

Definition 2.1 The entrywise measurable structure on matrices
#

A matrix is a function \(m \to n \to \alpha \), and we equip \(\operatorname {Matrix}\, m\, n\, \alpha \) with the Pi (product) \(\sigma \)-algebra induced from the entry type \(\alpha \). For finitely many entries over a Borel \(\alpha \) this agrees with the Borel \(\sigma \)-algebra. Mathlib does not register this automatically because \(\operatorname {Matrix}\) is a def rather than reducibly the underlying Pi type, so the ambient Pi instance does not transfer; we install it explicitly.

Definition 2.2 The iterated linear cocycle
#

Given \(A : X \to \operatorname {Matrix}(\operatorname {Fin}d)(\operatorname {Fin}d)\, \mathbb {R}\) and \(T : X \to X\), define \(A^{(n)} = \texttt{cocycle}\, A\, T\, n\) by recursion on \(n\):

\[ A^{(0)}(x) = 1, \qquad A^{(n+1)}(x) = A^{(n)}(Tx)\cdot A(x). \]

Unfolding, \(A^{(n)}(x) = A(T^{n-1}x)\cdots A(Tx)\, A(x)\), with the newest factor on the left. The matrix norm throughout is the scoped L2 operator norm, which is submultiplicative; vectors live in \(\mathrm{EuclideanSpace}\; \mathbb {R}\, (\operatorname {Fin}d)\) and the action is via \(\texttt{toEuclideanCLM}\).

Theorem 2.3 The cocycle identity
#

For all \(m, n \in \mathbb {N}\) and \(x \in X\),

\[ A^{(m+n)}(x) = A^{(m)}(T^{n}x)\cdot A^{(n)}(x). \]
Proof

Induction on \(n\) with \(x\) generalized. For \(n = 0\) both sides equal \(A^{(m)}(x)\). For the step, reassociate \(m + (n+1)\) and apply the defining recursion \(A^{(k+1)}(x) = A^{(k)}(Tx)\cdot A(x)\) on both sides, then use the inductive hypothesis at \(Tx\) together with \(T^{n+1}x = T^{n}(Tx)\) and associativity of matrix multiplication.

Definition 2.4 One-sided log-integrability of the generator
#

The hypothesis \(\texttt{IntegrableLogNorm}\, A\, \mu \) asserts that the positive part of the log-norm of the generator is integrable: \(\log ^{+}\left\lVert A(\cdot ) \right\rVert \in L^{1}(\mu )\), where \(\log ^{+} t = \max (\log t, 0)\). This is the standard integrability assumption of the Furstenberg–Kesten and Oseledets theorems; combined with the same hypothesis for the inverse generator \(A^{-1}\) it pins both extremal Lyapunov exponents in \(\mathbb {R}\).

Theorem 2.5 Measurability of the cocycle iterates
#

If \(A\) and \(T\) are measurable then for each \(n\) the iterate \(x \mapsto A^{(n)}(x)\) is measurable for the entrywise structure 2.1.

Proof

Induction on \(n\). The base case is a constant map. For the step, the recursion writes \(A^{(n+1)}\) as the product \((A^{(n)}\circ T)\cdot A\); matrix multiplication is jointly measurable on the Pi structure because each entry of a product is the finite sum \(\sum _k M_{ik}N_{kj}\) of products of measurable coordinate projections, so the result follows by composing the inductive hypothesis with \(T\) and multiplying by \(A\).

2.2 Measurability of the operator norm and the inverse

To feed the cocycle into the ergodic theory we must know that the operator norm and the matrix inverse are measurable on the entrywise structure. The subtlety is that Mathlib’s \(\texttt{Measurable.norm}\) is stated for a BorelSpace, whereas our matrix \(\sigma \)-algebra is the Pi structure; the two coincide here because the L2 operator-norm topology is definitionally the Pi product topology.

Lemma 2.6 The Pi structure is an opens-measurable space
#

The entrywise (Pi) measurable structure on \(\operatorname {Matrix}(\operatorname {Fin}d)(\operatorname {Fin}d)\, \mathbb {R}\) is an OpensMeasurableSpace for the L2 operator-norm topology, since that topology is installed (via replaceTopology) to be definitionally the Pi product topology, of which the Pi \(\sigma \)-algebra is exactly the Borel structure.

Theorem 2.7 Measurability of the L2 operator norm
#

The map \(M \mapsto \left\lVert M \right\rVert \) on \(\operatorname {Matrix}(\operatorname {Fin}d)(\operatorname {Fin}d)\, \mathbb {R}\) is measurable.

Proof

The norm is continuous for the operator-norm topology, and by 2.6 that topology’s Borel structure is the entrywise Pi structure; continuous maps into a Borel codomain are measurable.

Theorem 2.8 Measurability of the determinant
#

The determinant \(M \mapsto \det M\) is measurable.

Proof

By the Leibniz formula \(\det M = \sum _{\sigma }\operatorname {sgn}(\sigma )\prod _i M_{i,\sigma (i)}\), the determinant is a finite sum of finite products of measurable coordinate projections, hence measurable.

Theorem 2.9 Measurability of the matrix inverse
#

The inverse \(M \mapsto M^{-1}\) is measurable on the entrywise structure.

Proof

Writing \(M^{-1} = (\det M)^{-1}\cdot \operatorname {adj}(M)\), each entry is a ratio of polynomials in the entries. The adjugate is measurable (each entry is a determinant of a row update, again a polynomial in the entries), and \((\det M)^{-1}\) is measurable by 2.8 and measurability of inversion on \(\mathbb {R}\); the product is therefore measurable entrywise.

2.3 Positivity and submultiplicativity of the log-norm

To take logarithms cleanly we need the iterate norms to be strictly positive, which forces the generator to be everywhere invertible (\(\det A \neq 0\)) and the dimension to be nonzero. These hypotheses also make the \(\log \) of a product split as a genuine sum, which is what yields subadditivity rather than a mere inequality with junk values.

Lemma 2.10 Invertibility of the iterates
#

If \(\det A(x) \neq 0\) for every \(x\), then \(\det A^{(n)}(x) \neq 0\) for all \(n, x\).

Proof

Induction on \(n\). The base case is \(\det 1 = 1\). For the step, \(\det (A^{(n)}(Tx)\cdot A(x)) = \det A^{(n)}(Tx)\cdot \det A(x)\), and both factors are nonzero by the inductive hypothesis and the assumption on \(A\).

Lemma 2.11 The unit matrix has norm one
#

When \(d \neq 0\), \(\left\lVert (1 : \operatorname {Matrix}(\operatorname {Fin}d)(\operatorname {Fin}d)\, \mathbb {R}) \right\rVert = 1\) for the L2 operator norm.

Proof

For \(d \neq 0\) the space \(\mathrm{EuclideanSpace}\; \mathbb {R}\, (\operatorname {Fin}d)\) is nontrivial. The star-algebra equivalence \(\texttt{toEuclideanCLM}\) sends the identity matrix to the identity operator and preserves the norm, and the operator norm of the identity on a nontrivial space is \(1\). There is no NormOneClass instance for the matrix operator norm, so this must be proved by hand; note that at \(d = 0\) the statement is false (\(\left\lVert 1 \right\rVert = 0\)), which is why \(d \neq 0\) is required.

Lemma 2.12 Positivity of the iterate norms
#

Assume \(\det A(x) \neq 0\) for all \(x\) and \(d \neq 0\). Then \(0 {\lt} \left\lVert A^{(n)}(x) \right\rVert \) for every \(n, x\). The analogous statement \(\texttt{norm\_ inv\_ cocycle\_ pos}\) holds for the inverse iterates \(\left\lVert (A^{(n)}(x))^{-1} \right\rVert \).

Proof

If \(\left\lVert A^{(n)}(x) \right\rVert = 0\) then \(A^{(n)}(x)\) is the zero matrix, whose determinant vanishes (as \(d \neq 0\)), contradicting 2.10. The inverse case is identical using \(\det \bigl((A^{(n)}(x))^{-1}\bigr) = (\det A^{(n)}(x))^{-1} \neq 0\).

Theorem 2.13 Subadditivity of the log-norm cocycle
#

Assume \(\det A \neq 0\) everywhere and \(d \neq 0\). Then \(g_n(x) = \log \left\lVert A^{(n)}(x) \right\rVert \) is a subadditive cocycle over \(T\):

\[ g_{m+n}(x) \le g_m(x) + g_n(T^{m}x). \]
Proof

Rewrite \(m+n\) as \(n+m\) and apply the cocycle identity 2.3 to get \(A^{(m+n)}(x) = A^{(n)}(T^{m}x)\cdot A^{(m)}(x)\). Submultiplicativity of the operator norm and monotonicity of \(\log \) give \(\log \left\lVert A^{(m+n)}(x) \right\rVert \le \log \bigl(\left\lVert A^{(n)}(T^{m}x) \right\rVert \cdot \left\lVert A^{(m)}(x) \right\rVert \bigr)\); both factors are strictly positive by 2.12, so \(\log \) of the product splits as the sum \(\log \left\lVert A^{(n)}(T^{m}x) \right\rVert + \log \left\lVert A^{(m)}(x) \right\rVert \), which is the required bound after commuting the summands.

Theorem 2.14 Subadditivity of the inverse log-norm cocycle

Under the same hypotheses, \(g_n(x) = \log \left\lVert (A^{(n)}(x))^{-1} \right\rVert \) is a subadditive cocycle over \(T\).

Proof

As above, after \(\ref{cocycle_add}\) the inverse of the product reverses the order: \((A^{(m+n)}(x))^{-1} = (A^{(m)}(x))^{-1}\cdot (A^{(n)}(T^{m}x))^{-1}\). Submultiplicativity, monotone \(\log \), strict positivity of each inverse norm (2.12), and \(\log \) of a product splitting as a sum give the subadditive inequality.

2.4 Birkhoff-sum sandwich bounds

The two-sided integrability hypothesis controls the log-norm by Birkhoff sums of \(\log ^{+}\left\lVert A \right\rVert \) and \(\log ^{+}\left\lVert A^{-1} \right\rVert \) on both sides. These bounds drive both the integrability of each level and the bounded-below proviso of Kingman’s theorem.

Lemma 2.15 Upper Fekete bound
#

For \(\det A \neq 0\) everywhere and \(d \neq 0\),

\[ \log \left\lVert A^{(n)}(x) \right\rVert \le \sum _{k{\lt}n}\log ^{+}\left\lVert A(T^{k}x) \right\rVert . \]
Proof

Induction on \(n\). The base case is \(0 \le 0\). For the step, the recursion gives \(A^{(n+1)}(x) = A^{(n)}(Tx)\cdot A(x)\); submultiplicativity and \(\log \)-splitting bound \(\log \left\lVert A^{(n+1)}(x) \right\rVert \) by \(\log \left\lVert A(x) \right\rVert + \log \left\lVert A^{(n)}(Tx) \right\rVert \), and then \(\log \left\lVert A(x) \right\rVert \le \log ^{+}\left\lVert A(x) \right\rVert \) together with the inductive hypothesis at \(Tx\) peels off one Birkhoff term.

Lemma 2.16 Lower bound via the inverse Birkhoff sum

Under the same hypotheses,

\[ -\sum _{k{\lt}n}\log ^{+}\left\lVert (A(T^{k}x))^{-1} \right\rVert \le \log \left\lVert A^{(n)}(x) \right\rVert . \]
Proof

From \(A^{(n)}(x)\cdot (A^{(n)}(x))^{-1} = 1\) and submultiplicativity, \(1 = \left\lVert 1 \right\rVert \le \left\lVert A^{(n)}(x) \right\rVert \cdot \left\lVert (A^{(n)}(x))^{-1} \right\rVert \) (using 2.11 and 2.10), so taking logs gives \(0 \le \log \left\lVert A^{(n)}(x) \right\rVert + \log \left\lVert (A^{(n)}(x))^{-1} \right\rVert \). Bounding the inverse log-norm above by its Birkhoff sum 2.17 and rearranging yields the claim.

Lemma 2.17 Upper Fekete bound for the inverse cocycle

Under the same hypotheses,

\[ \log \left\lVert (A^{(n)}(x))^{-1} \right\rVert \le \sum _{k{\lt}n}\log ^{+}\left\lVert (A(T^{k}x))^{-1} \right\rVert . \]
Proof

Induction on \(n\), exactly as for the forward bound but with the inverse generator: the recursion and \(\texttt{mul\_ inv\_ rev}\) give \((A^{(n+1)}(x))^{-1} = (A(x))^{-1}\cdot (A^{(n)}(Tx))^{-1}\), and submultiplicativity plus \(\log \)-splitting and \(\log \left\lVert (A(x))^{-1} \right\rVert \le \log ^{+}\left\lVert (A(x))^{-1} \right\rVert \) peel off one Birkhoff term.

2.5 Integrability of each level

Lemma 2.18 Integral of a Birkhoff sum
#

For measure-preserving \(T\) and integrable \(f\), \(\int \sum _{k{\lt}n} f(T^{k}x)\, d\mu = n\int f\, d\mu \).

Proof

Each composition \(f\circ T^{k}\) is integrable and integral-preserving because \(T^{k}\) is measure-preserving, so \(\int f\circ T^{k}\, d\mu = \int f\, d\mu \). Summing the \(n\) equal terms over the finite range gives \(n\int f\, d\mu \).

Theorem 2.19 Integrability of the log-norm levels
#

Let \(T\) be measure-preserving for a finite measure \(\mu \), \(A\) measurable and everywhere invertible, \(d \neq 0\), with both \(\log ^{+}\left\lVert A \right\rVert \) and \(\log ^{+}\left\lVert A^{-1} \right\rVert \) integrable. Then each \(x \mapsto \log \left\lVert A^{(n)}(x) \right\rVert \) is integrable. The companion \(\texttt{integrable\_ logNorm\_ inv\_ cocycle}\) gives the same for the inverse iterates.

Proof

The level \(g_n\) is sandwiched between \(-B_n^{-}\) and \(B_n^{+}\), where \(B_n^{\pm }\) are the Birkhoff sums of \(\log ^{+}\left\lVert A \right\rVert \) and \(\log ^{+}\left\lVert A^{-1} \right\rVert \) (Lemmas 2.15, 2.16). Both \(B_n^{\pm }\) are nonnegative and integrable (finite sums of integrable, m.p. precompositions), so \(\left\lvert g_n \right\rvert \le B_n^{+} + B_n^{-}\) pointwise. Since \(g_n\) is measurable (2.7 composed with the cocycle), domination by the integrable \(B_n^{+}+B_n^{-}\) gives integrability.

2.6 The Furstenberg–Kesten theorems

Feeding the subadditive cocycles into Kingman’s ergodic theorem produces the two extremal Lyapunov exponents as a.e.-constant limits. The bounded-below proviso of Kingman is supplied by the cross integrability hypothesis: for the top exponent the lower bound comes from \(\log ^{+}\left\lVert A^{-1} \right\rVert \), and vice versa. This is precisely where the second integrability hypothesis keeps the limit finite in \(\mathbb {R}\) rather than \(-\infty \).

Theorem 2.20 Furstenberg–Kesten, top exponent
#

Let \(T\) be ergodic for a probability measure \(\mu \), let \(A\) be measurable and everywhere invertible with \(\log ^{+}\left\lVert A \right\rVert , \log ^{+}\left\lVert A^{-1} \right\rVert \in L^{1}(\mu )\). Then there is a constant \(\lambda _1 \in \mathbb {R}\) (the top Lyapunov exponent) with

\[ \frac{1}{n}\log \left\lVert A^{(n)}(x) \right\rVert \xrightarrow {n\to \infty } \lambda _1 \qquad \text{for } \mu \text{-a.e. } x. \]
Proof

If \(d = 0\) the matrix algebra is trivial, every norm is \(0\), and the limit is the constant \(0\). Otherwise set \(d \neq 0\) and let \(g_n = \log \left\lVert A^{(n)} \right\rVert \). It is a subadditive cocycle (2.13) with each level integrable (2.19). For the bounded-below proviso, the lower bound 2.16 and the Birkhoff integral identity 2.18 give \((\int g_{n+1})/(n+1) \ge -\int \log ^{+}\left\lVert A^{-1} \right\rVert \, d\mu \), a constant lower bound. The ergodic Kingman theorem 3.26 then returns the a.e. constant limit, which is the top exponent \(\lambda _1\) (see 4.8).

Theorem 2.21 Furstenberg–Kesten, bottom exponent
#

Under the same hypotheses there is a constant \(\lambda \in \mathbb {R}\) with

\[ \frac{1}{n}\log \left\lVert (A^{(n)}(x))^{-1} \right\rVert \xrightarrow {n\to \infty } \lambda \qquad \text{for } \mu \text{-a.e. } x, \]

so the bottom Lyapunov exponent \(\lambda _k = -\lambda \) exists and is finite.

Proof

Identical to the top case with the inverse subadditive cocycle \(g_n = \log \left\lVert (A^{(n)})^{-1} \right\rVert \) (2.14). The roles of the two integrability hypotheses swap: the bounded-below proviso now uses the lower bound \(-\sum _{k{\lt}n}\log ^{+}\left\lVert A(T^{k}x) \right\rVert \le \log \left\lVert (A^{(n)}(x))^{-1} \right\rVert \), obtained from \(\left\lVert 1 \right\rVert \le \left\lVert A^{(n)} \right\rVert \cdot \left\lVert (A^{(n)})^{-1} \right\rVert \) (2.11) and the forward Fekete bound 2.15, together with the integral identity 2.18. Ergodic Kingman 3.26 delivers the a.e. constant limit.