Documentation

Oseledets.Krieger.SMBPointwise

Pointwise Shannon–McMillan–Breiman: the main (Birkhoff) term #

This file carries the pointwise Shannon–McMillan–Breiman theorem past the integral-level rate identity ksEntropyPartition_eq_condEntropy_iSup (proved in SMBSharp) towards the a.e. limit (1/n)·iₙ(x) → h(P,T) for an ergodic measure-preserving T.

The Breiman split of the information function iₙ(x) = ∑_{j<n} g_{n-j}(Tʲx) is compared with the Birkhoff sum of the limit conditional information function g∞(x) = ∑ᵢ 𝟙_{Pᵢ}(x) · (-log μ⟦Pᵢ | 𝒞∞⟧(x)), where 𝒞∞ = ⨆ₖ σ(T⁻¹(⋁₀ᵏ⁻¹ T⁻ʲP)) is the strict-future σ-algebra (the conditioning σ-algebra of the sharp rate identity). This file establishes the R3/R4 a.e. main term:

condInfoFun — the conditional information function g𝒜(x) = ∑ᵢ 𝟙_{Pᵢ}(x)·(-log μ⟦Pᵢ|𝒜⟧(x)), with its measurability (measurable_condInfoFun), nonnegativity (condInfoFun_nonneg), and the keystone integral identity integral_condInfoFun_eq_condEntropy : ∫ g𝒜 = H(P | 𝒜).
integrable_condInfoFun — g𝒜 ∈ L¹(μ) (its integral is the finite H(P|𝒜), and it is ≥ 0).
integral_condInfoFun_iSup_eq — ∫ g∞ = h(P,T), identifying the Birkhoff target as the sharp KS rate (via ksEntropyPartition_eq_condEntropy_iSup).
ae_tendsto_birkhoffAverage_condInfoFun_iSup — R4: for ergodic T, the Birkhoff averages of g∞ converge a.e. to ∫ g∞ = h(P,T).

The keystone integral identity is the per-cell pull-out ∫ 𝟙_{Pᵢ}·(-log pᵢ) = ∫ negMulLog pᵢ where pᵢ = μ⟦Pᵢ | 𝒜⟧: since -log pᵢ is 𝒜-measurable (a function of the conditional kernel), replacing 𝟙_{Pᵢ} by its 𝒜-conditional expectation pᵢ leaves the integral unchanged (condExp_stronglyMeasurable_mul_of_bound), and negMulLog pᵢ = pᵢ·(-log pᵢ). The unboundedness of -log at pᵢ = 0 is handled by a monotone truncation hₘ = min(-log pᵢ, M): ∫ 𝟙_{Pᵢ}·hₘ = ∫ pᵢ·hₘ for each M, and both sides increase to the (finite, by negMulLog ≤ e⁻¹) limit by monotone convergence in lintegral.

References #

L. Breiman, The individual ergodic theorem of information theory, Ann. Math. Statist. 28 (1957), 809–811; correction 31 (1960), 809–810.
K. L. Chung, A note on the ergodic theorem of information theory, Ann. Math. Statist. 32 (1961), 612–614.
M. Einsiedler, E. Lindenstrauss, T. Ward, Entropy in Ergodic Theory and Topological Dynamics, Ch. 2 (SMB).

noncomputable def Oseledets.Krieger.condInfoFun {α : Type u_1} {ι : Type u_2} {𝒜 : MeasurableSpace α} [mα : MeasurableSpace α] [StandardBorelSpace α] [Fintype ι] {μ : MeasureTheory.Measure α} [MeasureTheory.IsProbabilityMeasure μ] (P : Entropy.MeasurePartition μ ι) (x : α) :

The conditional information function of the finite partition P given the sub-σ-algebra 𝒜: g𝒜(x) = ∑ᵢ 𝟙_{Pᵢ}(x)·(-log μ⟦Pᵢ | 𝒜⟧(x)), the surprise of learning a point's P-cell once the information in 𝒜 is known. Here μ⟦Pᵢ | 𝒜⟧(x) is realized by the regular conditional probability (condExpKernel μ 𝒜 x) (Pᵢ). Exactly one indicator survives at each x (the one for its own cell), so g𝒜 is the pointwise limit, as 𝒜 ↑ 𝒞∞, of the per-step information functions in the Breiman telescoping.

Equations

Oseledets.Krieger.condInfoFun P x = ∑ i : ι, (P.cells i).indicator (fun (y : α) => -Real.log (((ProbabilityTheory.condExpKernel μ 𝒜) y) (P.cells i)).toReal) x

Instances For

theorem Oseledets.Krieger.measurable_condInfoFun {α : Type u_1} {ι : Type u_2} {𝒜 : MeasurableSpace α} [mα : MeasurableSpace α] [StandardBorelSpace α] [Fintype ι] {μ : MeasureTheory.Measure α} [MeasureTheory.IsProbabilityMeasure μ] (h𝒜 : 𝒜 ≤ mα) (P : Entropy.MeasurePartition μ ι) :

Measurable (condInfoFun P)

The conditional information function is measurable: a finite sum of indicators of measurable cells, each weighted by the measurable function x ↦ -log(condExpKernel μ 𝒜 x (Pᵢ)).toReal.

theorem Oseledets.Krieger.condInfoFun_nonneg {α : Type u_1} {ι : Type u_2} {𝒜 : MeasurableSpace α} [mα : MeasurableSpace α] [StandardBorelSpace α] [Fintype ι] {μ : MeasureTheory.Measure α} [MeasureTheory.IsProbabilityMeasure μ] (P : Entropy.MeasurePartition μ ι) (x : α) :

0 ≤ condInfoFun P x

The conditional information function is nonnegative: each indicator term is 𝟙_{Pᵢ}·(-log pᵢ) with pᵢ ∈ [0,1], so -log pᵢ ≥ 0.

theorem Oseledets.Krieger.integral_condInfoFun_eq_condEntropy {α : Type u_1} {ι : Type u_2} {𝒜 : MeasurableSpace α} [mα : MeasurableSpace α] [StandardBorelSpace α] [Fintype ι] {μ : MeasureTheory.Measure α} [MeasureTheory.IsProbabilityMeasure μ] (h𝒜 : 𝒜 ≤ mα) (P : Entropy.MeasurePartition μ ι) :

∫ (x : α), condInfoFun P x ∂μ = Entropy.condEntropy μ 𝒜 P.cells

The keystone integral identity. The conditional information function of P given 𝒜 integrates to the conditional Shannon entropy H(P | 𝒜): ∫ condInfoFun 𝒜 P = condEntropy μ 𝒜 P.cells.

Summing the per-cell pull-out integral_indicator_neg_log_eq_integral_negMulLog over the finite index recovers exactly the condEntropy integrand ∑ᵢ negMulLog(condExpKernel μ 𝒜 · Pᵢ).toReal.

theorem Oseledets.Krieger.integrable_condInfoFun {α : Type u_1} {ι : Type u_2} {𝒜 : MeasurableSpace α} [mα : MeasurableSpace α] [StandardBorelSpace α] [Fintype ι] {μ : MeasureTheory.Measure α} [MeasureTheory.IsProbabilityMeasure μ] (h𝒜 : 𝒜 ≤ mα) (P : Entropy.MeasurePartition μ ι) :

MeasureTheory.Integrable (condInfoFun P) μ

The conditional information function is μ-integrable: a finite sum of the integrable per-cell weighted indicators.

@[reducible]

noncomputable def Oseledets.Krieger.futureSigma {α : Type u_1} {ι : Type u_2} [mα : MeasurableSpace α] [Fintype ι] {μ : MeasureTheory.Measure α} {T : α → α} (hT : MeasureTheory.MeasurePreserving T μ μ) (P : Entropy.MeasurePartition μ ι) :

MeasurableSpace α

The strict-future conditioning σ-algebra 𝒞∞ = ⨆ₖ σ(T⁻¹(⋁₀ᵏ⁻¹ T⁻ʲP)) of the sharp SMB rate.

Equations

Oseledets.Krieger.futureSigma hT P = ⨆ (k : ℕ), Oseledets.Entropy.generatedSigmaAlgebra μ (Oseledets.Entropy.MeasurePartition.pullback hT (Oseledets.Entropy.ksJoin hT P k))

Instances For

theorem Oseledets.Krieger.futureSigma_le {α : Type u_1} {ι : Type u_2} [mα : MeasurableSpace α] [Fintype ι] {μ : MeasureTheory.Measure α} {T : α → α} (hT : MeasureTheory.MeasurePreserving T μ μ) (P : Entropy.MeasurePartition μ ι) :

futureSigma hT P ≤ mα

theorem Oseledets.Krieger.integral_condInfoFun_futureSigma_eq {α : Type u_1} {ι : Type u_2} [mα : MeasurableSpace α] [StandardBorelSpace α] [Fintype ι] {μ : MeasureTheory.Measure α} [MeasureTheory.IsProbabilityMeasure μ] {T : α → α} [Nonempty ι] (hT : MeasureTheory.MeasurePreserving T μ μ) (P : Entropy.MeasurePartition μ ι) :

∫ (x : α), condInfoFun P x ∂μ = Entropy.ksEntropyPartition hT P

The Birkhoff target equals the sharp KS rate. The limit conditional information function g∞ = condInfoFun 𝒞∞ P integrates to the Kolmogorov–Sinai entropy h(P,T): ∫ g∞ = condEntropy μ 𝒞∞ P.cells = ksEntropyPartition hT P. Combines the keystone identity with ksEntropyPartition_eq_condEntropy_iSup.

theorem Oseledets.Krieger.ae_tendsto_birkhoffAverage_condInfoFun_futureSigma {α : Type u_1} {ι : Type u_2} [mα : MeasurableSpace α] [StandardBorelSpace α] [Fintype ι] {μ : MeasureTheory.Measure α} [MeasureTheory.IsProbabilityMeasure μ] {T : α → α} [Nonempty ι] (hT : Ergodic T μ) (P : Entropy.MeasurePartition μ ι) :

∀ᵐ (x : α) ∂μ, Filter.Tendsto (fun (n : ℕ) => birkhoffAverage ℝ T (condInfoFun P) n x) Filter.atTop (nhds (Entropy.ksEntropyPartition ⋯ P))

R4: the Birkhoff main term converges a.e. to h(P,T). For an ergodic measure-preserving T, the Birkhoff averages of the limit conditional information function g∞ = condInfoFun 𝒞∞ P converge μ-a.e. to ∫ g∞ = ksEntropyPartition hT P. This is the pointwise ergodic theorem (tendsto_birkhoffAverage_ae_integral) applied to the integrable g∞ (integrable_condInfoFun), with the integral value supplied by integral_condInfoFun_futureSigma_eq.

R5: Chung's `L¹` maximal domination #

The Cesàro tail (1/n)∑_{j<n}(g_{n-j} − g∞)(Tʲx) → 0 of the Breiman split is killed by the Chung maximal function g* = ⨆ₖ gₖ, where gₖ = condInfoFun (𝒞ₖ) P and 𝒞ₖ = σ(T⁻¹(⋁₀ᵏ⁻¹ T⁻ʲP)) is the increasing conditioning family. The genuinely analytic content is g* ∈ L¹(μ), which follows from the per-cell stopping-time tail estimate μ{x ∈ Pᵢ : g* x > λ} ≤ e^{−λ} (Chung 1961) and the layer-cake formula, giving ∫ g* ≤ H(P) + 1.

This section delivers:

condLevelSigma, condInfoMaxFun — the conditioning family and the (ℝ≥0∞-valued) maximal function, with measurability (measurable_condInfoMaxFun).
lintegral_min_meas_exp_le — the per-cell layer-cake estimate ∫⁻_{(0,∞)} min(μ Pᵢ, e^{−t}) dt ≤ ofReal(negMulLog (μ Pᵢ).toReal) + μ Pᵢ.
lintegral_condInfoMaxFun_le_of_layer — the R5 L¹ bound, sorry-free given the per-cell tail hypothesis chungTail: ∫⁻ g* ≤ ofReal (entropy μ P.cells) + 1.

The two named residual leaves (the precise missing Mathlib pieces) are:

chungTail — the Doob stopping-time tail μ{x ∈ Pᵢ : λ < g* x} ≤ ofReal e^{−λ} (a Markov bound on the conditional-probability martingale pₖ = μ⟦Pᵢ | 𝒞ₖ⟧, on the stopping time τ = inf{k : pₖ < e^{−λ}}); and
the Maker/Breiman dominated-Cesàro step (1/n)∑_{j<n}(g_{n-j} − g∞)(Tʲ·) → 0 a.e. from g* ∈ L¹ and gₖ → g∞ a.e. (not in Mathlib).

@[reducible]

noncomputable def Oseledets.Krieger.condLevelSigma {α : Type u_1} {ι : Type u_2} [mα : MeasurableSpace α] [Fintype ι] {μ : MeasureTheory.Measure α} {T : α → α} (hT : MeasureTheory.MeasurePreserving T μ μ) (P : Entropy.MeasurePartition μ ι) (k : ℕ) :

MeasurableSpace α

The k-th conditioning σ-algebra 𝒞ₖ = σ(T⁻¹(⋁₀ᵏ⁻¹ T⁻ʲP)) of the Breiman telescoping.

Equations

Oseledets.Krieger.condLevelSigma hT P k = Oseledets.Entropy.generatedSigmaAlgebra μ (Oseledets.Entropy.MeasurePartition.pullback hT (Oseledets.Entropy.ksJoin hT P k))

Instances For

theorem Oseledets.Krieger.condLevelSigma_le {α : Type u_1} {ι : Type u_2} [mα : MeasurableSpace α] [Fintype ι] {μ : MeasureTheory.Measure α} {T : α → α} (hT : MeasureTheory.MeasurePreserving T μ μ) (P : Entropy.MeasurePartition μ ι) (k : ℕ) :

condLevelSigma hT P k ≤ mα

noncomputable def Oseledets.Krieger.condInfoMaxFun {α : Type u_1} {ι : Type u_2} [mα : MeasurableSpace α] [StandardBorelSpace α] [Fintype ι] {μ : MeasureTheory.Measure α} [MeasureTheory.IsProbabilityMeasure μ] {T : α → α} (hT : MeasureTheory.MeasurePreserving T μ μ) (P : Entropy.MeasurePartition μ ι) (x : α) :

The Chung maximal information function g* x = ⨆ₖ ofReal (gₖ x) (in ℝ≥0∞), where gₖ = condInfoFun (𝒞ₖ) P. Working in ℝ≥0∞ makes the supremum total (it may be ∞) and feeds the layer-cake formula directly.

Equations

Oseledets.Krieger.condInfoMaxFun hT P x = ⨆ (k : ℕ), ENNReal.ofReal (Oseledets.Krieger.condInfoFun P x)

Instances For

theorem Oseledets.Krieger.measurable_condInfoMaxFun {α : Type u_1} {ι : Type u_2} [mα : MeasurableSpace α] [StandardBorelSpace α] [Fintype ι] {μ : MeasureTheory.Measure α} [MeasureTheory.IsProbabilityMeasure μ] {T : α → α} (hT : MeasureTheory.MeasurePreserving T μ μ) (P : Entropy.MeasurePartition μ ι) :

Measurable (condInfoMaxFun hT P)

The maximal information function is measurable: a countable supremum of the measurable ofReal ∘ gₖ.

theorem Oseledets.Krieger.lintegral_min_meas_exp_le (a : ENNReal) (ha : a ≤ 1) :

∫⁻ (t : ℝ) in Set.Ioi 0, min a (ENNReal.ofReal (Real.exp (-t))) ≤ ENNReal.ofReal a.toReal.negMulLog + a

Per-cell layer-cake estimate. For a measure a ∈ [0,1], the layer-cake integrand min(a, e^{−t}) over (0,∞) integrates to at most negMulLog a + a (equality for a ∈ (0,1]). This is the 1-D real-analysis core of the Chung bound ∫ g* ≤ H(P) + 1: split (0,∞) at c = −log a, where min = a below c (contributing a·(−log a) = negMulLog a) and min = e^{−t} above (contributing e^{−c} = a).

theorem Oseledets.Krieger.lintegral_tail_sum_le {α : Type u_1} {ι : Type u_2} [mα : MeasurableSpace α] [Fintype ι] {μ : MeasureTheory.Measure α} [MeasureTheory.IsProbabilityMeasure μ] (P : Entropy.MeasurePartition μ ι) :

∫⁻ (t : ℝ) in Set.Ioi 0, ∑ i : ι, min (μ (P.cells i)) (ENNReal.ofReal (Real.exp (-t))) ≤ ENNReal.ofReal (Entropy.entropy μ P.cells) + 1

The Chung tail integral is H(P) + 1. Summing the per-cell layer-cake estimate over the finite partition, the layer-cake tail ∫⁻_{(0,∞)} ∑ᵢ min(μ Pᵢ, e^{−t}) dt is at most ofReal(entropy μ P.cells) + 1: each cell contributes negMulLog(μ Pᵢ) + μ Pᵢ, and the masses sum to μ(univ) = 1 while the negMulLog terms sum to the Shannon entropy of P.

theorem Oseledets.Krieger.lintegral_condInfoMaxFun_le_of_layer {α : Type u_1} {ι : Type u_2} [mα : MeasurableSpace α] [StandardBorelSpace α] [Fintype ι] {μ : MeasureTheory.Measure α} [MeasureTheory.IsProbabilityMeasure μ] {T : α → α} (hT : MeasureTheory.MeasurePreserving T μ μ) (P : Entropy.MeasurePartition μ ι) (hlayer : ∫⁻ (x : α), condInfoMaxFun hT P x ∂μ ≤ ∫⁻ (t : ℝ) in Set.Ioi 0, ∑ i : ι, min (μ (P.cells i)) (ENNReal.ofReal (Real.exp (-t)))) :

∫⁻ (x : α), condInfoMaxFun hT P x ∂μ ≤ ENNReal.ofReal (Entropy.entropy μ P.cells) + 1

R5: the Chung L¹ maximal bound ∫ g* ≤ H(P) + 1, reduced to the layer-cake tail leaf.

Given the layer-cake tail hypothesis hlayer ∫⁻ g* ≤ ∫⁻_{(0,∞)} ∑ᵢ min(μ Pᵢ, e^{−t}) dt, the maximal information function g* = condInfoMaxFun hT P has ∫⁻ g* ≤ ofReal(H(P)) + 1 < ∞, hence is in L¹. The bound is closed sorry-free by lintegral_tail_sum_le.

The hypothesis hlayer is exactly what the per-cell Chung stopping-time tail μ{x ∈ Pᵢ : λ < g* x} ≤ e^{−λ} delivers through the layer-cake formula (MeasureTheory.lintegral_eq_lintegral_meas_le) and the union bound over the finitely many cells: μ{t ≤ g*} = ∑ᵢ μ{x ∈ Pᵢ : t ≤ g*} ≤ ∑ᵢ min(μ Pᵢ, e^{−t}) (each cell-tail is ≤ μ Pᵢ trivially and ≤ e^{−t} by Chung). Proving hlayer from the partition structure is the one genuinely missing Mathlib piece (the Doob/Markov bound on the conditional-probability martingale pₖ = μ⟦Pᵢ | 𝒞ₖ⟧ along the stopping time τ = inf{k : pₖ < e^{−λ}}).

The pointwise Shannon–McMillan–Breiman theorem: full structure and remaining leaves #

This is the assembly of the pointwise SMB theorem (1/n)·iₙ(x) → h(P,T) from the proved pieces and the two precisely-isolated remaining leaves. It records the dependency structure as an honest theorem taking the two leaves as hypotheses, so the reduction is machine-checked even though the leaves are not yet discharged.

The Breiman telescoping (SMBSharp.infoWeight_succ_eq) gives iₙ(x) = ∑_{j<n} g_{n−j}(Tʲx) a.e., so (1/n)·iₙ(x) = A_n(g∞)(x) + (1/n)∑_{j<n}(g_{n−j} − g∞)(Tʲx), where:

the main term A_n(g∞)(x) → h(P,T) a.e. is R4, proved: ae_tendsto_birkhoffAverage_condInfoFun_futureSigma;
the Cesàro tail → 0 a.e. is the content of the two leaves below.

theorem Oseledets.Krieger.ae_tendsto_div_infoFun_of_tail {α : Type u_1} {ι : Type u_2} [mα : MeasurableSpace α] [StandardBorelSpace α] [Fintype ι] {μ : MeasureTheory.Measure α} [MeasureTheory.IsProbabilityMeasure μ] {T : α → α} [Nonempty ι] (hT : Ergodic T μ) (P : Entropy.MeasurePartition μ ι) (infoFun_n : ℕ → α → ℝ) (hTail : ∀ᵐ (x : α) ∂μ, Filter.Tendsto (fun (n : ℕ) => infoFun_n n x / ↑n - birkhoffAverage ℝ T (condInfoFun P) n x) Filter.atTop (nhds 0)) :

∀ᵐ (x : α) ∂μ, Filter.Tendsto (fun (n : ℕ) => infoFun_n n x / ↑n) Filter.atTop (nhds (Entropy.ksEntropyPartition ⋯ P))

Pointwise SMB, assembled from the single remaining (Maker/Chung) leaf. For ergodic T, the information-function averages (1/n)·iₙ(x) converge μ-a.e. to h(P,T) = ksEntropyPartition hT P, given the one residual leaf

hTail — the Maker/Breiman dominated-Cesàro vanishing of the Cesàro tail iₙ(x)/n − A_n(g∞)(x) → 0 a.e., whose L¹ domination is the Chung bound lintegral_condInfoMaxFun_le_of_layer.

The proof adds the R4 main-term limit (ae_tendsto_birkhoffAverage_condInfoFun_futureSigma, A_n(g∞)(x) → ∫ g∞ = h(P,T)) to the vanishing tail and rewrites iₙ/n = A_n(g∞) + tail. Here infoFun_n n plays the role of the information function iₙ (Breiman's iₙ(x) = ∑_{j<n} g_{n−j}(Tʲx), SMBSharp.infoWeight_succ_eq); the statement is iₙ-agnostic since the only property used is the tail decomposition. Everything but hTail (condInfoFun, its integral = h, R4, the Chung L¹ bound reduced to its tail leaf) is proved sorry-free above.