Documentation

Oseledets.Krieger.SMBSharp

The sharp Shannon–McMillan–Breiman theorem (Breiman/Bruin telescoping) #

This file develops the sharp Shannon–McMillan–Breiman (SMB) theorem, sharpening the crude name-count upper bound limsup (1/n)·iₙ ≤ log (card ι) of Oseledets.Krieger.SMB to the Kolmogorov–Sinai entropy rate h(P,T), for an ergodic measure-preserving transformation T and a finite measurable partition P.

The route is the Breiman/Bruin telescoping identity (Einsiedler–Lindenstrauss–Ward, Entropy in Ergodic Theory, Ch. 2; Bruin, Ergodic Theory I, Lecture 15). The conditional information weight of the P-cell i₀ given the past coding g, condInfoWeight, is the surprise -log (μ(B-cell ∩ Pᵢ₀)/μ(B-cell)) of learning a point's P-cell is i₀ once its B-coding (the future itinerary) is known. The chain rule I_{P∨Q} = I_Q + I_{P|Q} becomes, at the level of these weights, the one-step factorization infoWeight_{n+1}(cons i₀ g) = infoWeight_n(g) + condInfoWeight i₀ g, which telescopes to iₙ(x) = ∑_{j<n} g_{n-j}(Tʲx).

Results proved here (sorry-free) #

The remaining pointwise residual (precise) #

What is not yet proved is the pointwise a.e. convergence (1/n)·iₙ(x) → h(P,T) for μ-a.e. x under Ergodic T μ. By the integral-level identity above the target h is correct; the gap is the a.e. statement, which decomposes into (with gₖ the conditional information function):

References #

noncomputable def Oseledets.Krieger.condInfoWeight {α : Type u_1} {ι : Type u_2} [ : MeasurableSpace α] [Fintype ι] {μ : MeasureTheory.Measure α} {β : Type u_3} [Fintype β] (B : Entropy.MeasurePartition μ β) (P : Entropy.MeasurePartition μ ι) (i : ι) (b : β) :

The conditional information weight of the P-cell index i given the B-cell index b: the surprise -log (μ(Bᵦ ∩ Pᵢ)/μ(Bᵦ)) of learning that a point of Bᵦ also lies in Pᵢ, i.e. the value of the conditional information function I_{P | σ(B)} on the cell Bᵦ for a point whose P-cell is Pᵢ. Here B may have a different (finite) index type than P.

Equations
Instances For
    theorem Oseledets.Krieger.condInfoWeight_nonneg {α : Type u_1} {ι : Type u_2} [ : MeasurableSpace α] [Fintype ι] {μ : MeasureTheory.Measure α} [MeasureTheory.IsProbabilityMeasure μ] {β : Type u_3} [Fintype β] (B : Entropy.MeasurePartition μ β) (P : Entropy.MeasurePartition μ ι) (i : ι) (b : β) :

    The conditional information weight is nonnegative: the conditional probability μ(Bᵦ ∩ Pᵢ)/μ(Bᵦ) lies in [0,1], so its log is nonpositive.

    theorem Oseledets.Krieger.ksJoinCells_cons {α : Type u_1} {ι : Type u_2} [ : MeasurableSpace α] [Fintype ι] {μ : MeasureTheory.Measure α} {T : αα} (hT : MeasureTheory.MeasurePreserving T μ μ) (P : Entropy.MeasurePartition μ ι) (n : ) (i₀ : ι) (g : Fin nι) :
    (Entropy.ksJoin hT P (n + 1)).cells (Fin.cons i₀ g) = P.cells i₀ T ⁻¹' (Entropy.ksJoin hT P n).cells g

    Cons-factorization of the flat-join cell. The cell of the (n+1)-fold iterated join at the code Fin.cons i₀ g is the P-cell i₀ (the 0-th symbol) intersected with the T-pullback of the n-fold join cell at g (the future symbols). This is the cell-level form of ⋁₀ⁿ T⁻ᵏP = P ∨ T⁻¹(⋁₀ⁿ⁻¹ T⁻ᵏP).

    theorem Oseledets.Krieger.infoWeight_succ_eq {α : Type u_1} {ι : Type u_2} [ : MeasurableSpace α] [Fintype ι] {μ : MeasureTheory.Measure α} [MeasureTheory.IsProbabilityMeasure μ] {T : αα} (hT : MeasureTheory.MeasurePreserving T μ μ) (P : Entropy.MeasurePartition μ ι) (n : ) (i₀ : ι) (g : Fin nι) (hpos : μ (P.cells i₀ T ⁻¹' (Entropy.ksJoin hT P n).cells g) 0) :

    One-step telescoping factorization of the information weight. Peeling the first symbol of the (n+1)-step coding splits the iterated-join information weight into the n-step weight of the future coding g plus the conditional information weight of the first symbol i₀ given the T-pullback of the n-step past: infoWeight_{n+1}(cons i₀ g) = infoWeight_n(g) + condInfoWeight (T⁻¹(⋁₀ⁿ⁻¹)) P i₀ g.

    This is the pure measure-algebra core of the Breiman identity iₙ = ∑_{j<n} g_{n-j}∘Tʲ. It rests on the cell factorization ksJoinCells_cons, the measure-preservation of T (so the past cell has the same measure before and after pulling back along T), and the additivity of -log on the product μ(Pᵢ₀ ∩ T⁻¹C) = (μ(T⁻¹C ∩ Pᵢ₀)/μ(T⁻¹C)) · μ(T⁻¹C).

    The positivity hypothesis hpos : μ(Pᵢ₀ ∩ T⁻¹C) ≠ 0 is load-bearing: at a code whose (n+1)-cell is null but whose n-past cell is not, the -logs do not split additively (-log 0 = 0 is a junk value), so the identity is false there. This is exactly why the telescoping identity for the information functions holds only μ-a.e.: the codes realised by points all have positive-measure cells off a null set.

    Integrated (entropy-level) telescoping chain rule. The (n+1)-step join entropy splits as the n-step join entropy of the future plus the conditional entropy of P given the T-pullback B := T⁻¹(⋁₀ⁿ⁻¹ T⁻ᵏP) of the n-step past, additive over the cells of B: H(⋁₀ⁿ T⁻ᵏP) = H(⋁₀ⁿ⁻¹ T⁻ᵏP) + Hₐdd(P | T⁻¹(⋁₀ⁿ⁻¹ T⁻ᵏP)).

    This is the μ-integral of the pointwise one-step factorization infoWeight_succ_eq, obtained cleanly from the absolute chain rule entropy_join_eq_add_condEntropyGivenPartition applied to the join B ∨ P (which, reindexed by g' ↦ (tail g', head g'), is the (n+1)-fold join) together with the T-invariance of join entropy (entropy_pullback).

    Integrated telescoping chain rule, σ-algebra form. Bridging the additive-over-cells conditional entropy of ksEntropySeq_succ_eq_add_condEntropyGivenPartition to the σ-algebra conditional entropy via condEntropyGivenPartition_eq_condEntropy_generated, the (n+1)-step join entropy splits as H(⋁₀ⁿ T⁻ᵏP) = H(⋁₀ⁿ⁻¹ T⁻ᵏP) + H(P | σ(T⁻¹(⋁₀ⁿ⁻¹ T⁻ᵏP))), where the conditioning σ-algebra is the one generated by the cells of the T-pullback of the n-step past join. This is the entropy-level Breiman telescoping with the conditioning expressed in the form consumed by the fixed-partition Lévy theorem condEntropy_tendsto_iSup.

    Telescoped Breiman sum (entropy level). Iterating the σ-algebra chain rule, the n-step join entropy is the sum of the conditional entropies of P given the T-pullback of the k-step past, for k = 0, …, n-1: H(⋁₀ⁿ⁻¹ T⁻ᵏP) = ∑_{k<n} H(P | σ(T⁻¹(⋁₀ᵏ⁻¹ T⁻ʲP))).

    Each summand H(P | σ(T⁻¹(⋁₀ᵏ⁻¹ T⁻ʲP))) is condEntropy μ (σ(T⁻¹ ksJoin P k)) P.cells, the conditional entropy of P given the k-step future past — a decreasing sequence in k (more conditioning), converging to the relative entropy rate h(P,T) by condEntropy_tendsto_iSup. This is the integral-level statement of the Breiman telescoping iₙ = ∑_{j<n} g_{n-j}∘Tʲ (its μ-mean), and combined with the Fekete limit it identifies h(P,T) = lim_k H(P | σ(T⁻¹(⋁₀ᵏ⁻¹))) (a Cesàro mean of a convergent sequence converges to the same limit).

    The generated σ-algebra of the (Fekete) pullback partition coincides with that of the (factor) pulledBack partition: both have the cells i ↦ T⁻¹' R.cells i, and the generated σ-algebra depends only on the range of the cells.

    The conditioning σ-algebras σ(T⁻¹(⋁₀ᵏ⁻¹ T⁻ʲP)) of the Breiman telescoping form an increasing sequence in k: the (k+1)-step past join refines the k-step one, and T-pullback preserves refinement.

    The sharp Kolmogorov–Sinai rate as a conditional entropy (integral-level SMB rate). The Kolmogorov–Sinai entropy h(P,T) equals the conditional Shannon entropy of P given the σ-algebra of the strict future ⨆ₖ σ(T⁻¹(⋁₀ᵏ⁻¹ T⁻ʲP)) = σ(⋁_{j≥1} T⁻ʲP): h(P,T) = H(P | ⋁_{j≥1} T⁻ʲP).

    This is the integral-level statement of the sharp SMB rate (the μ-mean of the pointwise Breiman identity iₙ/n → h), and the precise identity ∫ g = h that the pointwise a.e. theorem converges to. It is assembled, unconditionally (no ergodicity, no maximal-function domination), from the telescoped Breiman sum ksEntropySeq_eq_sum_condEntropy, the fixed-partition Lévy theorem condEntropy_tendsto_iSup applied to the increasing conditioning family (generatedSigmaAlgebra_pullback_ksJoin_mono), and the Cesàro convergence Filter.Tendsto.cesaro — the average of a convergent sequence has the same limit — matched against the Fekete limit tendsto_ksEntropySeq.