Conditional Shannon entropy of a finite measurable partition #

This file extends the measure-theoretic foundation for Kolmogorov–Sinai entropy started in Oseledets.Entropy.Partition to the conditional setting. Given a probability space (α, μ) that is standard Borel, a sub-σ-algebra 𝒜 ≤ mα, and a finite family of cells s : ι → Set α, the conditional Shannon entropy is the μ-average of the pointwise entropy computed with respect to the regular conditional probability κ = condExpKernel μ 𝒜: H(s | 𝒜) = ∫ ω, ∑ᵢ negMulLog (κ(ω) (s i)).toReal ∂μ.

This is the information about the partition that remains after conditioning on the information in 𝒜. We record its three basic properties — nonnegativity, the log k upper bound for a genuine partition, and the reduction to ordinary entropy when 𝒜 = ⊥ — together with the fundamental fact that conditioning does not increase entropy (condEntropy_le), which is Jensen's inequality for the concave function negMulLog applied to the disintegration of μ.

Main definitions #

Oseledets.Entropy.condEntropy: the conditional Shannon entropy ∫ ω, ∑ i, negMulLog (condExpKernel μ 𝒜 ω (s i)).toReal ∂μ of a finite family of cells.

Main results #

Oseledets.Entropy.condEntropy_nonneg: conditional entropy is nonnegative.
Oseledets.Entropy.condEntropy_le_log_card: a partition into k cells has conditional entropy at most log k.
Oseledets.Entropy.condEntropy_bot: conditioning on the trivial σ-algebra ⊥ recovers the ordinary entropy.
Oseledets.Entropy.condEntropy_le: conditioning does not increase entropy.

References #

François Le Maître, Notes on the Kolmogorov–Sinai theorem (2017), §1.
Peter Walters, An Introduction to Ergodic Theory, Springer GTM 79, Chapter 4.

source

noncomputable def Oseledets.Entropy.condEntropy {α : Type u_1} {ι : Type u_2} {mΩ : MeasurableSpace α} [Fintype ι] [StandardBorelSpace α] (μ : MeasureTheory.Measure α) [MeasureTheory.IsFiniteMeasure μ] (𝒜 : MeasurableSpace α) (s : ι → Set α) :

ℝ

The conditional Shannon entropy ∫ ω, ∑ i, negMulLog (condExpKernel μ 𝒜 ω (s i)).toReal ∂μ of a finite family of cells s : ι → Set α given a sub-σ-algebra 𝒜. It is the μ-average of the pointwise entropy of s computed against the regular conditional probability condExpKernel μ 𝒜 ω, i.e. the information about the partition remaining after conditioning on 𝒜.

Equations

Oseledets.Entropy.condEntropy μ 𝒜 s = ∫ (ω : α), ∑ i : ι, (((ProbabilityTheory.condExpKernel μ 𝒜) ω) (s i)).toReal.negMulLog ∂μ

Instances For

source

@[simp]

theorem Oseledets.Entropy.condEntropy_def {α : Type u_1} {ι : Type u_2} {mΩ : MeasurableSpace α} [Fintype ι] [StandardBorelSpace α] (μ : MeasureTheory.Measure α) [MeasureTheory.IsFiniteMeasure μ] (𝒜 : MeasurableSpace α) (s : ι → Set α) :

condEntropy μ 𝒜 s = ∫ (ω : α), ∑ i : ι, (((ProbabilityTheory.condExpKernel μ 𝒜) ω) (s i)).toReal.negMulLog ∂μ

source

theorem Oseledets.Entropy.toReal_condExpKernel_le_one {α : Type u_1} {𝒜 : MeasurableSpace α} [mα : MeasurableSpace α] [StandardBorelSpace α] {μ : MeasureTheory.Measure α} [MeasureTheory.IsFiniteMeasure μ] (A : Set α) (ω : α) :

(((ProbabilityTheory.condExpKernel μ 𝒜) ω) A).toReal ≤ 1

For every ω, the conditional probability condExpKernel μ 𝒜 ω A of a set A has real value in [0, 1], because condExpKernel μ 𝒜 ω is a probability measure.

source

theorem Oseledets.Entropy.negMulLog_condExpKernel_sum_nonneg {α : Type u_1} {ι : Type u_2} {𝒜 : MeasurableSpace α} [mα : MeasurableSpace α] [StandardBorelSpace α] [Fintype ι] {μ : MeasureTheory.Measure α} [MeasureTheory.IsFiniteMeasure μ] (s : ι → Set α) (ω : α) :

0 ≤ ∑ i : ι, (((ProbabilityTheory.condExpKernel μ 𝒜) ω) (s i)).toReal.negMulLog

The pointwise integrand of condEntropy is nonnegative: for every ω each cell has conditional probability in [0, 1] (the kernel is Markov), where negMulLog is nonnegative.

source

theorem Oseledets.Entropy.condEntropy_nonneg {α : Type u_1} {ι : Type u_2} {𝒜 : MeasurableSpace α} [mα : MeasurableSpace α] [StandardBorelSpace α] [Fintype ι] {μ : MeasureTheory.Measure α} [MeasureTheory.IsFiniteMeasure μ] {s : ι → Set α} :

0 ≤ condEntropy μ 𝒜 s

Conditional Shannon entropy is nonnegative: the integrand is pointwise nonnegative because each conditional cell probability lies in [0, 1].

source

theorem Oseledets.Entropy.integrable_negMulLog_condExpKernel {α : Type u_1} {𝒜 : MeasurableSpace α} [mα : MeasurableSpace α] [StandardBorelSpace α] {μ : MeasureTheory.Measure α} [MeasureTheory.IsFiniteMeasure μ] (h𝒜 : 𝒜 ≤ mα) {A : Set α} (hA : MeasurableSet A) :

MeasureTheory.Integrable (fun (ω : α) => (((ProbabilityTheory.condExpKernel μ 𝒜) ω) A).toReal.negMulLog) μ

Integrability of the single term ω ↦ negMulLog (condExpKernel μ 𝒜 ω A).toReal: the function is measurable (composition of the continuous negMulLog with the measurable conditional probability) and bounded by 1, hence integrable on the finite measure μ.

source

theorem Oseledets.Entropy.integrable_condEntropy_integrand {α : Type u_1} {ι : Type u_2} {𝒜 : MeasurableSpace α} [mα : MeasurableSpace α] [StandardBorelSpace α] [Fintype ι] {μ : MeasureTheory.Measure α} [MeasureTheory.IsFiniteMeasure μ] (h𝒜 : 𝒜 ≤ mα) (s : ι → Set α) (hs : ∀ (i : ι), MeasurableSet (s i)) :

MeasureTheory.Integrable (fun (ω : α) => ∑ i : ι, (((ProbabilityTheory.condExpKernel μ 𝒜) ω) (s i)).toReal.negMulLog) μ

The integrand ω ↦ ∑ i, negMulLog (condExpKernel μ 𝒜 ω (s i)).toReal of condEntropy is μ-integrable, as a finite sum of integrable single terms.

source

theorem Oseledets.Entropy.condExpKernel_sum_toReal_measure_eq_one {α : Type u_1} {ι : Type u_2} {𝒜 : MeasurableSpace α} [mα : MeasurableSpace α] [StandardBorelSpace α] [Fintype ι] {μ : MeasureTheory.Measure α} [MeasureTheory.IsProbabilityMeasure μ] (h𝒜 : 𝒜 ≤ mα) (P : MeasurePartition μ ι) :

∀ᵐ (ω : α) ∂μ, ∑ i : ι, (((ProbabilityTheory.condExpKernel μ 𝒜) ω) (P.cells i)).toReal = 1

For a finite measurable partition P of the probability space and μ-almost every ω, the conditional probabilities of the cells under condExpKernel μ 𝒜 ω sum to 1.

The cells are measurable, cover the whole space, and are μ-a.e. disjoint; the last fact transfers to the conditional kernel via the disintegration condExpKernel μ 𝒜 ∘ₘ μ.trim h𝒜 = μ, so for a.e. ω the cells are condExpKernel μ 𝒜 ω-a.e. disjoint and the kernel — being Markov — distributes its total mass 1 over them.

source

theorem Oseledets.Entropy.condEntropy_le_log_card {α : Type u_1} {ι : Type u_2} {𝒜 : MeasurableSpace α} [mα : MeasurableSpace α] [StandardBorelSpace α] [Fintype ι] [Nonempty ι] {μ : MeasureTheory.Measure α} [MeasureTheory.IsProbabilityMeasure μ] (h𝒜 : 𝒜 ≤ mα) (P : MeasurePartition μ ι) :

condEntropy μ 𝒜 P.cells ≤ Real.log ↑(Fintype.card ι)

Conditional version of Proposition 1 (Le Maître). A finite measurable partition of a probability space into k cells has conditional Shannon entropy at most log k.

For μ-almost every ω the cell probabilities under condExpKernel μ 𝒜 ω sum to 1, so the pointwise entropy is bounded by log k via the existing Jensen bound entropy_le_log_card; integrating this constant bound over the probability measure μ yields the claim.

source

theorem Oseledets.Entropy.condEntropy_bot {α : Type u_1} {ι : Type u_2} [mα : MeasurableSpace α] [StandardBorelSpace α] [Fintype ι] {μ : MeasureTheory.Measure α} [MeasureTheory.IsProbabilityMeasure μ] {s : ι → Set α} (hs : ∀ (i : ι), MeasurableSet (s i)) :

condEntropy μ ⊥ s = entropy μ s

Conditioning on the trivial σ-algebra ⊥ recovers the ordinary Shannon entropy: the conditional probability condExpKernel μ ⊥ ω (s i) equals the unconditional probability μ (s i) for μ-almost every ω, so the integrand is a.e. the constant entropy μ s.

source

theorem Oseledets.Entropy.integral_condExpKernel_toReal {α : Type u_1} {𝒜 : MeasurableSpace α} [mα : MeasurableSpace α] [StandardBorelSpace α] {μ : MeasureTheory.Measure α} [MeasureTheory.IsProbabilityMeasure μ] (h𝒜 : 𝒜 ≤ mα) {A : Set α} (hA : MeasurableSet A) :

∫ (ω : α), (((ProbabilityTheory.condExpKernel μ 𝒜) ω) A).toReal ∂μ = (μ A).toReal

The disintegration identity ∫ ω, (condExpKernel μ 𝒜 ω A).toReal ∂μ = (μ A).toReal: the μ-average of the conditional probability of a measurable set recovers its unconditional probability, by the law of total probability (condExpKernel_ae_eq_condExp and integral_condExp applied to the indicator of the set).

source

theorem Oseledets.Entropy.condEntropy_le {α : Type u_1} {ι : Type u_2} {𝒜 : MeasurableSpace α} [mα : MeasurableSpace α] [StandardBorelSpace α] [Fintype ι] {μ : MeasureTheory.Measure α} [MeasureTheory.IsProbabilityMeasure μ] (h𝒜 : 𝒜 ≤ mα) (P : MeasurePartition μ ι) :

condEntropy μ 𝒜 P.cells ≤ entropy μ P.cells

Conditioning does not increase entropy. For any finite measurable partition P of the probability space and any sub-σ-algebra 𝒜, the conditional entropy is at most the unconditional entropy.

This is Jensen's inequality (ConcaveOn.le_map_integral) for the concave function negMulLog, applied term by term: for each cell Pᵢ, the average of negMulLog (κ(ω) Pᵢ).toReal is at most negMulLog of the average ∫ ω, (κ(ω) Pᵢ).toReal ∂μ, which equals negMulLog (μ Pᵢ).toReal by the disintegration integral_condExpKernel_toReal. Summing over i gives the bound.

Documentation

Oseledets.Entropy.CondPartition

Conditional Shannon entropy of a finite measurable partition #

Main definitions #

Main results #

References #