Documentation

def «termKL[_;_#_;_]».delab :

Pretty printer defined by notation3 command.

Equations

One or more equations did not get rendered due to their size.

Instances For

def «termKL[_;_#_;_]» :

If X, Y are two G-valued random variables, the Kullback--Leibler divergence is defined as KL(X ‖ Y) := ∑ₓ 𝐏(X = x) log(𝐏(X = x) / 𝐏(Y = x)).

Equations

One or more equations did not get rendered due to their size.

Instances For

def «termKL[_#_]» :

If X, Y are two G-valued random variables, the Kullback--Leibler divergence is defined as KL(X ‖ Y) := ∑ₓ 𝐏(X = x) log(𝐏(X = x) / 𝐏(Y = x)).

Equations

One or more equations did not get rendered due to their size.

Instances For

def «termKL[_#_]».delab :

Pretty printer defined by notation3 command.

Equations

One or more equations did not get rendered due to their size.

Instances For

theorem KL_div_eq_of_equiv {Ω : Type u_1} {Ω' : Type u_2} {Ω'' : Type u_3} {Ω''' : Type u_4} {G : Type u_5} [mΩ : MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [mΩ' : MeasurableSpace Ω'] {μ' : MeasureTheory.Measure Ω'} [mΩ'' : MeasurableSpace Ω''] {μ'' : MeasureTheory.Measure Ω''} [mΩ''' : MeasurableSpace Ω'''] {μ''' : MeasureTheory.Measure Ω'''} [hG : MeasurableSpace G] {X : Ω → G} {Y : Ω' → G} (X' : Ω'' → G) (Y' : Ω''' → G) (hX : ProbabilityTheory.IdentDistrib X X' μ μ'') (hY : ProbabilityTheory.IdentDistrib Y Y' μ' μ''') :

KL[X ; μ # Y ; μ'] = KL[X' ; μ'' # Y' ; μ''']

If X' is a copy of X, and Y' is a copy of Y, then KL(X' ‖ Y') = KL(X ‖ Y).

theorem KL_div_nonneg {Ω : Type u_1} {Ω' : Type u_2} {G : Type u_5} [mΩ : MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [mΩ' : MeasurableSpace Ω'] {μ' : MeasureTheory.Measure Ω'} [hG : MeasurableSpace G] {X : Ω → G} {Y : Ω' → G} :

KL[X ; μ # Y ; μ'] ≥ 0

KL(X ‖ Y) ≥ 0.

theorem KL_div_eq_zero_iff_identDistrib {Ω : Type u_1} {Ω' : Type u_2} {G : Type u_5} [mΩ : MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [mΩ' : MeasurableSpace Ω'] {μ' : MeasureTheory.Measure Ω'} [hG : MeasurableSpace G] {X : Ω → G} {Y : Ω' → G} :

KL[X ; μ # Y ; μ'] = 0 ↔ ProbabilityTheory.IdentDistrib X Y μ μ'

KL(X ‖ Y) = 0 if and only if Y is a copy of X.

theorem KL_div_of_convex {Ω : Type u_1} {Ω' : Type u_2} {Ω'' : Type u_3} {G : Type u_5} [mΩ : MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [mΩ' : MeasurableSpace Ω'] {μ' : MeasureTheory.Measure Ω'} [mΩ'' : MeasurableSpace Ω''] {μ'' : MeasureTheory.Measure Ω''} [hG : MeasurableSpace G] {X : Ω → G} {Y : Ω' → G} {I : Type u_6} {S : Finset I} {w : I → ℝ} (hw : ∀ (s : I), w s ≥ 0) (hsum : (S.sum fun (s : I) => w s) = 1) (X' : I → Ω'' → G) (hconvex : ∀ (x : G), ((MeasureTheory.Measure.map X μ) {x}).toReal = S.sum fun (s : I) => w s * ((MeasureTheory.Measure.map (X' s) μ'') {x}).toReal) :

KL[X ; μ # Y ; μ'] ≤ S.sum fun (s : I) => w s * KL[X' s ; μ'' # Y ; μ']

If $S$ is a finite set, $\sum_{s \in S} w_s = 1$ for some non-negative $w_s$, and ${\bf P}(X=x) = \sum_{s\in S} w_s {\bf P}(X_s=x)$, ${\bf P}(Y=x) = \sum_{s\in S} w_s {\bf P}(Y_s=x)$ for all $x$, then $$D_{KL}(X\Vert Y) \le \sum_{s\in S} w_s D_{KL}(X_s\Vert Y_s).$$

theorem KL_div_of_comp_inj {Ω : Type u_1} {Ω' : Type u_2} {G : Type u_5} [mΩ : MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [mΩ' : MeasurableSpace Ω'] {μ' : MeasureTheory.Measure Ω'} [hG : MeasurableSpace G] {X : Ω → G} {Y : Ω' → G} {H : Type u_6} [MeasurableSpace H] [MeasurableSingletonClass H] {f : G → H} (hf : Function.Injective f) :

KL[X ; μ # Y ; μ'] = KL[f ∘ X ; μ # f ∘ Y ; μ']

If $f:G \to H$ is an injection, then $D_{KL}(f(X)\Vert f(Y)) = D_{KL}(X\Vert Y)$.

theorem KL_div_add_le_KL_div_of_indep {Ω : Type u_1} {G : Type u_5} [mΩ : MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [hG : MeasurableSpace G] [AddCommGroup G] [MeasurableSub₂ G] [MeasurableAdd₂ G] {X : Ω → G} {Y : Ω → G} {Z : Ω → G} [FiniteRange X] [FiniteRange Y] [FiniteRange Z] (hindep : ProbabilityTheory.iIndepFun (fun (x : Fin (Nat.succ 0).succ.succ) => hG) ![X, Y, Z] μ) :

KL[X + Z ; μ # Y + Z ; μ] ≤ KL[X ; μ # Y ; μ]

If $X, Y, Z$ are independent $G$-valued random variables, then $$D_{KL}(X+Z\Vert Y+Z) \leq D_{KL}(X\Vert Y).$$

noncomputable def condKL_div {Ω : Type u_1} {Ω' : Type u_2} {G : Type u_5} [mΩ : MeasurableSpace Ω] [mΩ' : MeasurableSpace Ω'] [hG : MeasurableSpace G] {S : Type u_6} (X : Ω → G) (Y : Ω' → G) (Z : Ω → S) (μ : autoParam (MeasureTheory.Measure Ω) _auto✝) (μ' : autoParam (MeasureTheory.Measure Ω') _auto✝) :

ℝ

If $X,Y,Z$ are random variables, with $X,Z$ defined on the same sample space, we define $$ D_{KL}(X|Z \Vert Y) := \sum_z \mathbf{P}(Z=z) D_{KL}( (X|Z=z) \Vert Y).$$

Equations

KL[X | Z ; μ # Y ; μ'] = ∑' (z : S), (μ (Z ⁻¹' {z})).toReal * KL[X ; ProbabilityTheory.cond μ (Z ⁻¹' {z}) # Y ; μ']

Instances For

def «termKL[_|_;_#_;_]» :

If $X,Y,Z$ are random variables, with $X,Z$ defined on the same sample space, we define $$ D_{KL}(X|Z \Vert Y) := \sum_z \mathbf{P}(Z=z) D_{KL}( (X|Z=z) \Vert Y).$$

Equations

One or more equations did not get rendered due to their size.

Instances For

def «termKL[_|_;_#_;_]».delab :

Pretty printer defined by notation3 command.

Equations

One or more equations did not get rendered due to their size.

Instances For

def «termKL[_|_#_]» :

If $X,Y,Z$ are random variables, with $X,Z$ defined on the same sample space, we define $$ D_{KL}(X|Z \Vert Y) := \sum_z \mathbf{P}(Z=z) D_{KL}( (X|Z=z) \Vert Y).$$

Equations

One or more equations did not get rendered due to their size.

Instances For

def «termKL[_|_#_]».delab :

Pretty printer defined by notation3 command.

Equations

One or more equations did not get rendered due to their size.

Instances For

theorem condKL_div_eq {Ω : Type u_1} {Ω' : Type u_2} {G : Type u_5} [mΩ : MeasurableSpace Ω] [mΩ' : MeasurableSpace Ω'] [hG : MeasurableSpace G] {S : Type u_6} [MeasurableSpace S] (X : Ω → G) (Y : Ω' → G) (Z : Ω → S) (μ : autoParam (MeasureTheory.Measure Ω) _auto✝) (μ' : autoParam (MeasureTheory.Measure Ω') _auto✝) :

KL[X | Z ; μ # Y ; μ'] = KL[X ; μ # Y ; μ'] + H[X ; μ] - H[X | Z ; μ]

If $X, Y$ are independent $G$-valued random variables, and $Z$ is another random variable defined on the same sample space as $X$, then $$D_{KL}((X|Z)\Vert Y) = D_{KL}(X\Vert Y) + \bbH[X] - \bbH[X|Z].$$