The conditional chain rule (analytic heart of Abramov–Rokhlin, issue #13) #

This module proves the entropy chain rule H(P ∨ Q) = H(P) + H(Q | P), the EQUALITY refining entropy_join_le, both in its absolute form and in the kernel-level conditional form H(P ∨ Q | 𝒜) = H(P | 𝒜) + H(Q | 𝒜 ⊔ σ(P)).

Step 1: the pure finite (scalar) chain rule. #

For a "joint distribution" p : ι × κ → ℝ (nonneg, summing to 1) with ι-marginal a i = ∑ⱼ p (i,j), the joint entropy splits as the marginal entropy of a plus the average over i of the conditional entropies of the rows j ↦ p (i,j) / a i. This is a pure negMulLog_mul identity; no measure theory.

source

theorem Oseledets.Entropy.sum_negMulLog_row {κ : Type u_3} [Fintype κ] (r : κ → ℝ) (a : ℝ) (ha : a ≠ 0) (hrow : ∑ j : κ, r j = a) :

∑ j : κ, (r j).negMulLog = a.negMulLog + a * ∑ j : κ, (r j / a).negMulLog

Scalar chain rule, single-row form. For a nonnegative row r : κ → ℝ with sum a ≥ 0, ∑ⱼ negMulLog (r j) = negMulLog a + a · (∑ⱼ negMulLog (r j / a)), provided a ≠ 0 (or trivially both sides are about a = 0). The identity is negMulLog (a · (rⱼ/a)) expanded by negMulLog_mul, summed over j, using ∑ⱼ rⱼ/a = 1.

source

noncomputable def Oseledets.Entropy.condEntropyOnCell {α : Type u_1} {κ : Type u_3} [mα : MeasurableSpace α] [Fintype κ] (μ : MeasureTheory.Measure α) (sᵢ : Set α) (t : κ → Set α) :

ℝ

Per-cell conditional entropy of a partition Q given the cell s i of P. This is the Shannon entropy of the conditional distribution j ↦ μ(s i ∩ t j) / μ(s i) (the law of Q restricted to and renormalized on the cell s i). When μ (s i) = 0 the division-by-zero convention makes the conditional measures 0, so the entropy is 0.

Equations

Oseledets.Entropy.condEntropyOnCell μ sᵢ t = ∑ j : κ, ((μ (sᵢ ∩ t j)).toReal / (μ sᵢ).toReal).negMulLog

Instances For

source

noncomputable def Oseledets.Entropy.condEntropyGivenPartition {α : Type u_1} {ι : Type u_2} {κ : Type u_3} [mα : MeasurableSpace α] [Fintype ι] [Fintype κ] (μ : MeasureTheory.Measure α) (s : ι → Set α) (t : κ → Set α) :

ℝ

Conditional entropy of Q given the finite partition P (the additive-over-cells form H(Q | P) = ∑ᵢ μ(Pᵢ) · H(Q | Pᵢ)). This is the second summand of the chain rule H(P ∨ Q) = H(P) + H(Q | P), and is exactly the 𝒜 = ⊥ instance of the conditional entropy H(Q | 𝒜 ⊔ σ(P)) written without reference to the refined kernel condExpKernel μ (𝒜 ⊔ σ(P)).

Equations

Oseledets.Entropy.condEntropyGivenPartition μ s t = ∑ i : ι, (μ (s i)).toReal * Oseledets.Entropy.condEntropyOnCell μ (s i) t

Instances For

Step 2: the absolute chain rule `H(P ∨ Q) = H(P) + H(Q | P)`. #

source

theorem Oseledets.Entropy.entropy_join_eq_add_condEntropyGivenPartition {α : Type u_1} {ι : Type u_2} {κ : Type u_3} [mα : MeasurableSpace α] [Fintype ι] [Fintype κ] {μ : MeasureTheory.Measure α} [MeasureTheory.IsProbabilityMeasure μ] (P : MeasurePartition μ ι) (Q : MeasurePartition μ κ) :

entropy μ (joinCells P.cells Q.cells) = entropy μ P.cells + condEntropyGivenPartition μ P.cells Q.cells

The absolute entropy chain rule. For two finite measurable partitions P and Q of a probability space, the entropy of the join equals the entropy of P plus the conditional entropy of Q given P: H(P ∨ Q) = H(P) + H(Q | P).

This is the EQUALITY refining entropy_join_le. It is the pure finite identity obtained by applying the single-row scalar chain rule sum_negMulLog_row to each row r j = μ(Pᵢ ∩ Qⱼ) with marginal a = μ(Pᵢ), then summing over i. The a = 0 rows contribute 0 on both sides (every μ(Pᵢ ∩ Qⱼ) ≤ μ(Pᵢ) = 0).

Step 3: the kernel-level (general `𝒜`) conditional chain rule. #

The absolute identity, applied pointwise against the Markov kernel condExpKernel μ 𝒜 ω (which is a.e. a probability measure for which P, Q are still partitions) and integrated over μ. This is the EQUALITY refining condEntropy_join_le. The refined-conditioning term H(Q | 𝒜 ⊔ σ(P)) is expressed in the additive-over-cells form, avoiding the refined kernel condExpKernel μ (𝒜 ⊔ σ(P)) entirely.

source

noncomputable def Oseledets.Entropy.condEntropyGivenPartitionCond {α : Type u_1} {ι : Type u_2} {κ : Type u_3} [mα : MeasurableSpace α] [StandardBorelSpace α] [Fintype ι] [Fintype κ] (μ : MeasureTheory.Measure α) [MeasureTheory.IsFiniteMeasure μ] (𝒜 : MeasurableSpace α) (s : ι → Set α) (t : κ → Set α) :

ℝ

Conditional version of condEntropyGivenPartition, evaluated against the regular conditional probability condExpKernel μ 𝒜 ω and averaged over μ. This is H(Q | 𝒜 ⊔ σ(P)) written in the additive-over-cells form, i.e. the second summand of the conditional chain rule.

Uses the ambient mα for the kernel (exactly as condEntropy/condEntropy_join_le); the sub-σ-algebra 𝒜 is a plain explicit argument, never an instance.

Equations

Oseledets.Entropy.condEntropyGivenPartitionCond μ 𝒜 s t = ∫ (ω : α), Oseledets.Entropy.condEntropyGivenPartition ((ProbabilityTheory.condExpKernel μ 𝒜) ω) s t ∂μ

Instances For

source

theorem Oseledets.Entropy.condEntropy_join_eq {α : Type u_1} {ι : Type u_2} {κ : Type u_3} {𝒜 : MeasurableSpace α} [mα : MeasurableSpace α] [StandardBorelSpace α] [Fintype ι] [Fintype κ] {μ : MeasureTheory.Measure α} [MeasureTheory.IsProbabilityMeasure μ] (h𝒜 : 𝒜 ≤ mα) (P : MeasurePartition μ ι) (Q : MeasurePartition μ κ) :

condEntropy μ 𝒜 (joinCells P.cells Q.cells) = condEntropy μ 𝒜 P.cells + condEntropyGivenPartitionCond μ 𝒜 P.cells Q.cells

The conditional chain rule (kernel-level form). H(P ∨ Q | 𝒜) = H(P | 𝒜) + H(Q | 𝒜 ⊔ σ(P)), with the last term in additive-over-cells form ∫ ω, ∑ᵢ κ(ω)(Pᵢ) · H_{κ(ω)|Pᵢ}(Q) ∂μ. The EQUALITY refining condEntropy_join_le.

Documentation

Oseledets.Entropy.CondChainRule

The conditional chain rule (analytic heart of Abramov–Rokhlin, issue #13) #

Step 1: the pure finite (scalar) chain rule. #

Step 2: the absolute chain rule `H(P ∨ Q) = H(P) + H(Q | P)`. #

Step 3: the kernel-level (general `𝒜`) conditional chain rule. #

The conditional chain rule (analytic heart of Abramov–Rokhlin, issue #13) #

Step 1: the pure finite (scalar) chain rule. #

Step 2: the absolute chain rule H(P ∨ Q) = H(P) + H(Q | P). #

Step 3: the kernel-level (general 𝒜) conditional chain rule. #

Step 2: the absolute chain rule `H(P ∨ Q) = H(P) + H(Q | P)`. #

Step 3: the kernel-level (general `𝒜`) conditional chain rule. #