The conditional chain rule (analytic heart of Abramov–Rokhlin, issue #13) #
This module proves the entropy chain rule H(P ∨ Q) = H(P) + H(Q | P), the EQUALITY refining
entropy_join_le, both in its absolute form and in the kernel-level conditional form
H(P ∨ Q | 𝒜) = H(P | 𝒜) + H(Q | 𝒜 ⊔ σ(P)).
Step 1: the pure finite (scalar) chain rule. #
For a "joint distribution" p : ι × κ → ℝ (nonneg, summing to 1) with ι-marginal
a i = ∑ⱼ p (i,j), the joint entropy splits as the marginal entropy of a plus the average
over i of the conditional entropies of the rows j ↦ p (i,j) / a i. This is a pure
negMulLog_mul identity; no measure theory.
Scalar chain rule, single-row form. For a nonnegative row r : κ → ℝ with sum a ≥ 0,
∑ⱼ negMulLog (r j) = negMulLog a + a · (∑ⱼ negMulLog (r j / a)), provided a ≠ 0
(or trivially both sides are about a = 0). The identity is negMulLog (a · (rⱼ/a)) expanded by
negMulLog_mul, summed over j, using ∑ⱼ rⱼ/a = 1.
Per-cell conditional entropy of a partition Q given the cell s i of P.
This is the Shannon entropy of the conditional distribution j ↦ μ(s i ∩ t j) / μ(s i)
(the law of Q restricted to and renormalized on the cell s i). When μ (s i) = 0 the
division-by-zero convention makes the conditional measures 0, so the entropy is 0.
Equations
Instances For
Conditional entropy of Q given the finite partition P (the additive-over-cells form
H(Q | P) = ∑ᵢ μ(Pᵢ) · H(Q | Pᵢ)). This is the second summand of the chain rule
H(P ∨ Q) = H(P) + H(Q | P), and is exactly the 𝒜 = ⊥ instance of the conditional entropy
H(Q | 𝒜 ⊔ σ(P)) written without reference to the refined kernel
condExpKernel μ (𝒜 ⊔ σ(P)).
Equations
- Oseledets.Entropy.condEntropyGivenPartition μ s t = ∑ i : ι, (μ (s i)).toReal * Oseledets.Entropy.condEntropyOnCell μ (s i) t
Instances For
Step 2: the absolute chain rule H(P ∨ Q) = H(P) + H(Q | P). #
The absolute entropy chain rule. For two finite measurable partitions P and Q of a
probability space, the entropy of the join equals the entropy of P plus the conditional entropy
of Q given P:
H(P ∨ Q) = H(P) + H(Q | P).
This is the EQUALITY refining entropy_join_le. It is the pure finite identity obtained by
applying the single-row scalar chain rule sum_negMulLog_row to each row
r j = μ(Pᵢ ∩ Qⱼ) with marginal a = μ(Pᵢ), then summing over i. The a = 0 rows
contribute 0 on both sides (every μ(Pᵢ ∩ Qⱼ) ≤ μ(Pᵢ) = 0).
Step 3: the kernel-level (general 𝒜) conditional chain rule. #
The absolute identity, applied pointwise against the Markov kernel condExpKernel μ 𝒜 ω
(which is a.e. a probability measure for which P, Q are still partitions) and integrated
over μ. This is the EQUALITY refining condEntropy_join_le. The refined-conditioning term
H(Q | 𝒜 ⊔ σ(P)) is expressed in the additive-over-cells form, avoiding the refined kernel
condExpKernel μ (𝒜 ⊔ σ(P)) entirely.
Conditional version of condEntropyGivenPartition, evaluated against the regular
conditional probability condExpKernel μ 𝒜 ω and averaged over μ. This is H(Q | 𝒜 ⊔ σ(P))
written in the additive-over-cells form, i.e. the second summand of the conditional chain rule.
Uses the ambient mα for the kernel (exactly as condEntropy/condEntropy_join_le); the
sub-σ-algebra 𝒜 is a plain explicit argument, never an instance.
Equations
- Oseledets.Entropy.condEntropyGivenPartitionCond μ 𝒜 s t = ∫ (ω : α), Oseledets.Entropy.condEntropyGivenPartition ((ProbabilityTheory.condExpKernel μ 𝒜) ω) s t ∂μ
Instances For
The conditional chain rule (kernel-level form).
H(P ∨ Q | 𝒜) = H(P | 𝒜) + H(Q | 𝒜 ⊔ σ(P)), with the last term in additive-over-cells form
∫ ω, ∑ᵢ κ(ω)(Pᵢ) · H_{κ(ω)|Pᵢ}(Q) ∂μ. The EQUALITY refining condEntropy_join_le.