Documentation

PFR.ForMathlib.Entropy.Basic

Entropy and conditional entropy #

Main definitions #

Main statements #

Notations #

All notations have variants where we can specify the measure (which is otherwise supposed to be volume). For example H[X ; μ] and I[X : Y ; μ] instead of H[X] and I[X : Y] respectively.

noncomputable def ProbabilityTheory.entropy {Ω : Type uΩ} {S : Type uS} [mΩ : MeasurableSpace Ω] [MeasurableSpace S] (X : ΩS) (μ : autoParam (MeasureTheory.Measure Ω) _auto✝) :

Entropy of a random variable with values in a finite measurable space.

Equations
Instances For

    Pretty printer defined by notation3 command.

    Equations
    • One or more equations did not get rendered due to their size.
    Instances For

      Entropy of a random variable with values in a finite measurable space.

      Equations
      • One or more equations did not get rendered due to their size.
      Instances For

        Pretty printer defined by notation3 command.

        Equations
        • One or more equations did not get rendered due to their size.
        Instances For

          Entropy of a random variable with values in a finite measurable space.

          Equations
          • One or more equations did not get rendered due to their size.
          Instances For

            Entropy of a random variable with values in a finite measurable space.

            Equations
            • One or more equations did not get rendered due to their size.
            Instances For

              Pretty printer defined by notation3 command.

              Equations
              • One or more equations did not get rendered due to their size.
              Instances For

                Pretty printer defined by notation3 command.

                Equations
                • One or more equations did not get rendered due to their size.
                Instances For

                  Entropy of a random variable with values in a finite measurable space.

                  Equations
                  • One or more equations did not get rendered due to their size.
                  Instances For
                    theorem ProbabilityTheory.entropy_def {Ω : Type uΩ} {S : Type uS} [mΩ : MeasurableSpace Ω] [MeasurableSpace S] (X : ΩS) (μ : MeasureTheory.Measure Ω) :

                    Entropy of a random variable agrees with entropy of its distribution.

                    Entropy of a random variable is also the kernel entropy of the distribution over a Dirac mass.

                    @[simp]
                    theorem ProbabilityTheory.entropy_zero_measure {Ω : Type uΩ} {S : Type uS} [mΩ : MeasurableSpace Ω] [MeasurableSpace S] (X : ΩS) :
                    H[X ; 0] = 0

                    Any variable on a zero measure space has zero entropy.

                    theorem ProbabilityTheory.entropy_congr {Ω : Type uΩ} {S : Type uS} [mΩ : MeasurableSpace Ω] [MeasurableSpace S] {μ : MeasureTheory.Measure Ω} {X : ΩS} {X' : ΩS} (h : μ.ae.EventuallyEq X X') :
                    H[X ; μ] = H[X' ; μ]

                    Two variables that agree almost everywhere, have the same entropy.

                    theorem ProbabilityTheory.entropy_nonneg {Ω : Type uΩ} {S : Type uS} [mΩ : MeasurableSpace Ω] [MeasurableSpace S] (X : ΩS) (μ : MeasureTheory.Measure Ω) :
                    0 H[X ; μ]

                    Entropy is always non-negative.

                    theorem ProbabilityTheory.IdentDistrib.entropy_eq {Ω : Type uΩ} {S : Type uS} [mΩ : MeasurableSpace Ω] [MeasurableSpace S] {X : ΩS} {μ : MeasureTheory.Measure Ω} {Ω' : Type u_1} [MeasurableSpace Ω'] {μ' : MeasureTheory.Measure Ω'} {X' : Ω'S} (h : ProbabilityTheory.IdentDistrib X X' μ μ') :
                    H[X ; μ] = H[X' ; μ']

                    Two variables that have the same distribution, have the same entropy.

                    theorem ProbabilityTheory.entropy_le_log_card {Ω : Type uΩ} {S : Type uS} [mΩ : MeasurableSpace Ω] [MeasurableSpace S] [MeasurableSingletonClass S] [Fintype S] (X : ΩS) (μ : MeasureTheory.Measure Ω) :
                    H[X ; μ] ((Fintype.card S)).log

                    Entropy is at most the logarithm of the cardinality of the range.

                    theorem ProbabilityTheory.entropy_le_log_card_of_mem {Ω : Type uΩ} {S : Type uS} [mΩ : MeasurableSpace Ω] [Countable S] [MeasurableSpace S] [MeasurableSingletonClass S] {A : Finset S} {μ : MeasureTheory.Measure Ω} {X : ΩS} (hX : Measurable X) (h : ∀ᵐ (ω : Ω) ∂μ, X ω A) :
                    H[X ; μ] ((Nat.card { x : S // x A })).log

                    Entropy is at most the logarithm of the cardinality of a set in which X almost surely takes values in.

                    theorem ProbabilityTheory.entropy_eq_sum {Ω : Type uΩ} {S : Type uS} [mΩ : MeasurableSpace Ω] [MeasurableSpace S] {X : ΩS} (hX : Measurable X) (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] :
                    H[X ; μ] = ∑' (x : S), ((MeasureTheory.Measure.map X μ) {x}).toReal.negMulLog

                    H[X] = ∑ₛ P[X=s] log 1 / P[X=s].

                    theorem ProbabilityTheory.entropy_eq_sum' {Ω : Type uΩ} {S : Type uS} [mΩ : MeasurableSpace Ω] [MeasurableSpace S] {X : ΩS} (hX : Measurable X) (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] :
                    H[X ; μ] = ∑' (x : S), ((MeasureTheory.Measure.map X μ).real {x}).negMulLog
                    theorem ProbabilityTheory.entropy_eq_sum_finset {Ω : Type uΩ} {S : Type uS} [mΩ : MeasurableSpace Ω] [MeasurableSpace S] {X : ΩS} (hX : Measurable X) {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] {A : Finset S} (hA : (MeasureTheory.Measure.map X μ) (A) = 0) :
                    H[X ; μ] = A.sum fun (x : S) => ((MeasureTheory.Measure.map X μ) {x}).toReal.negMulLog
                    theorem ProbabilityTheory.entropy_eq_sum_finset' {Ω : Type uΩ} {S : Type uS} [mΩ : MeasurableSpace Ω] [MeasurableSpace S] {X : ΩS} (hX : Measurable X) {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] {A : Finset S} (hA : (MeasureTheory.Measure.map X μ) (A) = 0) :
                    H[X ; μ] = A.sum fun (x : S) => ((MeasureTheory.Measure.map X μ).real {x}).negMulLog
                    theorem ProbabilityTheory.entropy_eq_sum_finiteRange {Ω : Type uΩ} {S : Type uS} [mΩ : MeasurableSpace Ω] [MeasurableSpace S] [MeasurableSingletonClass S] {X : ΩS} (hX : Measurable X) {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] [FiniteRange X] :
                    H[X ; μ] = (FiniteRange.toFinset X).sum fun (x : S) => ((MeasureTheory.Measure.map X μ) {x}).toReal.negMulLog
                    theorem ProbabilityTheory.entropy_eq_sum_finiteRange' {Ω : Type uΩ} {S : Type uS} [mΩ : MeasurableSpace Ω] [MeasurableSpace S] [MeasurableSingletonClass S] {X : ΩS} (hX : Measurable X) {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] [FiniteRange X] :
                    H[X ; μ] = (FiniteRange.toFinset X).sum fun (x : S) => ((MeasureTheory.Measure.map X μ).real {x}).negMulLog
                    theorem ProbabilityTheory.entropy_cond_eq_sum {Ω : Type uΩ} {S : Type uS} {T : Type uT} [mΩ : MeasurableSpace Ω] [MeasurableSpace S] {X : ΩS} {Y : ΩT} (hX : Measurable X) (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] (y : T) :
                    H[X | Yy ; μ] = ∑' (x : S), ((MeasureTheory.Measure.map X (ProbabilityTheory.cond μ (Y ⁻¹' {y}))) {x}).toReal.negMulLog

                    H[X | Y=y] = ∑_s P[X=s | Y=y] log 1/(P[X=s | Y=y]).

                    theorem ProbabilityTheory.entropy_cond_eq_sum_finiteRange {Ω : Type uΩ} {S : Type uS} {T : Type uT} [mΩ : MeasurableSpace Ω] [MeasurableSpace S] [MeasurableSingletonClass S] {X : ΩS} {Y : ΩT} (hX : Measurable X) (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] (y : T) [FiniteRange X] :
                    H[X | Yy ; μ] = (FiniteRange.toFinset X).sum fun (x : S) => ((MeasureTheory.Measure.map X (ProbabilityTheory.cond μ (Y ⁻¹' {y}))) {x}).toReal.negMulLog
                    theorem ProbabilityTheory.entropy_comp_of_injective {Ω : Type uΩ} {S : Type uS} {T : Type uT} [mΩ : MeasurableSpace Ω] [Countable S] [MeasurableSpace S] [MeasurableSpace T] [MeasurableSingletonClass S] [MeasurableSingletonClass T] {X : ΩS} (μ : MeasureTheory.Measure Ω) (hX : Measurable X) (f : ST) (hf : Function.Injective f) :
                    H[f X ; μ] = H[X ; μ]

                    If X, Y are S-valued and T-valued random variables, and Y = f(X) for some injection f : S \to T, then H[Y] = H[X]. One can also use entropy_of_comp_eq_of_comp as an alternative if verifying injectivity is fiddly. For the upper bound only, see entropy_comp_le.

                    @[simp]
                    theorem ProbabilityTheory.entropy_const {Ω : Type uΩ} {S : Type uS} [mΩ : MeasurableSpace Ω] [MeasurableSpace S] [MeasurableSingletonClass S] {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] (c : S) :
                    H[fun (x : Ω) => c ; μ] = 0

                    The entropy of any constant is zero.

                    theorem ProbabilityTheory.IsUniform.entropy_eq {Ω : Type uΩ} {S : Type uS} [mΩ : MeasurableSpace Ω] [Countable S] [MeasurableSpace S] [MeasurableSingletonClass S] (H : Finset S) (X : ΩS) {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] (hX : ProbabilityTheory.IsUniform (H) X μ) (hX' : Measurable X) :
                    H[X ; μ] = ((Nat.card { x : S // x H })).log

                    If X is uniformly distributed on H, then H[X] = log |H|.

                    theorem ProbabilityTheory.IsUniform.entropy_eq' {Ω : Type uΩ} {S : Type uS} [mΩ : MeasurableSpace Ω] [Countable S] [MeasurableSpace S] [MeasurableSingletonClass S] {H : Set S} [Finite H] {X : ΩS} {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] (hX : ProbabilityTheory.IsUniform H X μ) (hX' : Measurable X) :
                    H[X ; μ] = ((Nat.card H)).log

                    Variant of IsUniform.entropy_eq where H is a finite Set rather than Finset.

                    theorem ProbabilityTheory.entropy_eq_log_card {Ω : Type uΩ} {S : Type uS} [mΩ : MeasurableSpace Ω] [MeasurableSpace S] [MeasurableSingletonClass S] {X : ΩS} [Fintype S] (hX : Measurable X) (μ : MeasureTheory.Measure Ω) [hμ : NeZero μ] [MeasureTheory.IsFiniteMeasure μ] :
                    H[X ; μ] = ((Fintype.card S)).log ∀ (s : S), (MeasureTheory.Measure.map X μ) {s} = μ Set.univ / (Fintype.card S)

                    If X is S-valued random variable, then H[X] = log |S| if and only if X is uniformly distributed.

                    theorem ProbabilityTheory.prob_ge_exp_neg_entropy {Ω : Type uΩ} {S : Type uS} [mΩ : MeasurableSpace Ω] [Nonempty S] [MeasurableSpace S] [MeasurableSingletonClass S] (X : ΩS) (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] (hX : Measurable X) [hX' : FiniteRange X] :
                    ∃ (s : S), (MeasureTheory.Measure.map X μ) {s} μ Set.univ * (-H[X ; μ]).exp.toNNReal

                    If X is an S-valued random variable, then there exists s ∈ S such that P[X = s] ≥ \exp(- H[X]). TODO: remove the probability measure hypothesis, which is unncessary here.

                    theorem ProbabilityTheory.prob_ge_exp_neg_entropy' {S : Type uS} [Nonempty S] [MeasurableSpace S] [MeasurableSingletonClass S] {Ω : Type u_1} [MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] (X : ΩS) (hX : Measurable X) [FiniteRange X] :
                    ∃ (s : S), (-H[X ; μ]).exp μ.real (X ⁻¹' {s})

                    If X is an S-valued random variable, then there exists s ∈ S such that P[X=s] ≥ \exp(-H[X]).

                    theorem ProbabilityTheory.const_of_nonpos_entropy {S : Type uS} [Nonempty S] [MeasurableSpace S] [MeasurableSingletonClass S] {Ω : Type u_1} [MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] {X : ΩS} (hX : Measurable X) [FiniteRange X] (hent : H[X ; μ] 0) :
                    ∃ (s : S), μ.real (X ⁻¹' {s}) = 1

                    If X is an S-valued random variable of non-positive entropy, then X is almost surely constant.

                    theorem ProbabilityTheory.entropy_comm {Ω : Type uΩ} {S : Type uS} {T : Type uT} [mΩ : MeasurableSpace Ω] [Countable S] [Countable T] [MeasurableSpace S] [MeasurableSpace T] [MeasurableSingletonClass S] [MeasurableSingletonClass T] {X : ΩS} {Y : ΩT} (hX : Measurable X) (hY : Measurable Y) (μ : MeasureTheory.Measure Ω) :
                    H[X, Y ; μ] = H[Y, X ; μ]

                    H[X, Y] = H[Y, X].

                    theorem ProbabilityTheory.entropy_assoc {Ω : Type uΩ} {S : Type uS} {T : Type uT} {U : Type uU} [mΩ : MeasurableSpace Ω] [Countable S] [Countable T] [Countable U] [MeasurableSpace S] [MeasurableSpace T] [MeasurableSpace U] [MeasurableSingletonClass S] [MeasurableSingletonClass T] [MeasurableSingletonClass U] {X : ΩS} {Y : ΩT} {Z : ΩU} (hX : Measurable X) (hY : Measurable Y) (hZ : Measurable Z) (μ : MeasureTheory.Measure Ω) :
                    H[X, Y, Z ; μ] = H[X, Y, Z ; μ]

                    H[(X, Y), Z] = H[X, (Y, Z)].

                    @[simp]
                    theorem ProbabilityTheory.entropy_prod_comp {Ω : Type uΩ} {S : Type uS} {T : Type uT} [mΩ : MeasurableSpace Ω] [Countable S] [MeasurableSpace S] [MeasurableSpace T] [MeasurableSingletonClass S] [MeasurableSingletonClass T] {X : ΩS} (hX : Measurable X) (μ : MeasureTheory.Measure Ω) (f : ST) :
                    H[X, f X ; μ] = H[X ; μ]

                    H[X, f(X)] = H[X].

                    noncomputable def ProbabilityTheory.condEntropy {Ω : Type uΩ} {S : Type uS} {T : Type uT} [mΩ : MeasurableSpace Ω] [MeasurableSpace S] [MeasurableSpace T] (X : ΩS) (Y : ΩT) (μ : autoParam (MeasureTheory.Measure Ω) _auto✝) :

                    Conditional entropy of a random variable w.r.t. another. This is the expectation under the law of Y of the entropy of the law of X conditioned on the event Y = y.

                    Equations
                    Instances For
                      theorem ProbabilityTheory.condEntropy_def {Ω : Type uΩ} {S : Type uS} {T : Type uT} [mΩ : MeasurableSpace Ω] [MeasurableSpace S] [MeasurableSpace T] (X : ΩS) (Y : ΩT) (μ : MeasureTheory.Measure Ω) :
                      H[X | Y ; μ] = ∫ (x : T), (fun (y : T) => H[X | Yy ; μ]) xMeasureTheory.Measure.map Y μ

                      Pretty printer defined by notation3 command.

                      Equations
                      • One or more equations did not get rendered due to their size.
                      Instances For

                        Conditional entropy of a random variable w.r.t. another. This is the expectation under the law of Y of the entropy of the law of X conditioned on the event Y = y.

                        Equations
                        • One or more equations did not get rendered due to their size.
                        Instances For

                          Pretty printer defined by notation3 command.

                          Equations
                          • One or more equations did not get rendered due to their size.
                          Instances For

                            Conditional entropy of a random variable w.r.t. another. This is the expectation under the law of Y of the entropy of the law of X conditioned on the event Y = y.

                            Equations
                            • One or more equations did not get rendered due to their size.
                            Instances For
                              theorem ProbabilityTheory.condEntropy_eq_zero {Ω : Type uΩ} {S : Type uS} {T : Type uT} [mΩ : MeasurableSpace Ω] [MeasurableSpace S] [MeasurableSpace T] [MeasurableSingletonClass T] {X : ΩS} {Y : ΩT} (hY : Measurable Y) (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsFiniteMeasure μ] (t : T) (ht : ((MeasureTheory.Measure.map Y μ) {t}).toReal = 0) :
                              H[X | Yt ; μ] = 0

                              Conditional entropy of a random variable is equal to the entropy of its conditional kernel.

                              theorem ProbabilityTheory.map_prod_comap_swap {Ω : Type uΩ} {S : Type uS} {U : Type uU} [mΩ : MeasurableSpace Ω] [MeasurableSpace S] [MeasurableSpace U] {Z : ΩU} {X : ΩS} (hX : Measurable X) (hZ : Measurable Z) (μ : MeasureTheory.Measure Ω) :
                              MeasureTheory.Measure.comap Prod.swap (MeasureTheory.Measure.map (fun (ω : Ω) => (X ω, Z ω)) μ) = MeasureTheory.Measure.map (fun (ω : Ω) => (Z ω, X ω)) μ

                              The law of (X, Z) is the image of the law of (Z, X).

                              theorem ProbabilityTheory.condEntropy_two_eq_kernel_entropy {Ω : Type uΩ} {S : Type uS} {T : Type uT} {U : Type uU} [mΩ : MeasurableSpace Ω] [Countable S] [Countable T] [Countable U] [Nonempty S] [Nonempty T] [MeasurableSpace S] [MeasurableSpace T] [MeasurableSpace U] [MeasurableSingletonClass S] [MeasurableSingletonClass T] [MeasurableSingletonClass U] {Z : ΩU} {X : ΩS} {Y : ΩT} (hX : Measurable X) (hY : Measurable Y) (hZ : Measurable Z) (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] [FiniteRange Y] [FiniteRange Z] :
                              H[X | Y, Z ; μ] = Hk[ProbabilityTheory.kernel.condKernel (ProbabilityTheory.condDistrib (fun (a : Ω) => (Y a, X a)) Z μ) , (MeasureTheory.Measure.map Z μ).compProd (ProbabilityTheory.kernel.fst (ProbabilityTheory.condDistrib (fun (a : Ω) => (Y a, X a)) Z μ))]
                              @[simp]
                              theorem ProbabilityTheory.condEntropy_zero_measure {Ω : Type uΩ} {S : Type uS} {T : Type uT} [mΩ : MeasurableSpace Ω] [MeasurableSpace S] [MeasurableSpace T] (X : ΩS) (Y : ΩT) :
                              H[X | Y ; 0] = 0

                              Any random variable on a zero measure space has zero conditional entropy.

                              theorem ProbabilityTheory.condEntropy_nonneg {Ω : Type uΩ} {S : Type uS} {T : Type uT} [mΩ : MeasurableSpace Ω] [MeasurableSpace S] [MeasurableSpace T] (X : ΩS) (Y : ΩT) (μ : MeasureTheory.Measure Ω) :
                              0 H[X | Y ; μ]

                              Conditional entropy is non-negative.

                              theorem ProbabilityTheory.condEntropy_le_log_card {Ω : Type uΩ} {S : Type uS} {T : Type uT} [mΩ : MeasurableSpace Ω] [MeasurableSpace S] [MeasurableSpace T] [MeasurableSingletonClass S] [Fintype S] (X : ΩS) (Y : ΩT) (hY : Measurable Y) (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] :
                              H[X | Y ; μ] ((Fintype.card S)).log

                              Conditional entropy is at most the logarithm of the cardinality of the range.

                              theorem ProbabilityTheory.condEntropy_eq_sum {Ω : Type uΩ} {S : Type uS} {T : Type uT} [mΩ : MeasurableSpace Ω] [MeasurableSpace S] [MeasurableSpace T] [MeasurableSingletonClass T] (X : ΩS) (Y : ΩT) (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsFiniteMeasure μ] (hY : Measurable Y) [FiniteRange Y] :
                              H[X | Y ; μ] = (FiniteRange.toFinset Y).sum fun (y : T) => ((MeasureTheory.Measure.map Y μ) {y}).toReal * H[X | Yy ; μ]

                              H[X|Y] = ∑_y P[Y=y] H[X|Y=y].

                              theorem ProbabilityTheory.condEntropy_eq_sum_fintype {Ω : Type uΩ} {S : Type uS} {T : Type uT} [mΩ : MeasurableSpace Ω] [MeasurableSpace S] [MeasurableSpace T] [MeasurableSingletonClass T] (X : ΩS) (Y : ΩT) (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsFiniteMeasure μ] (hY : Measurable Y) [Fintype T] :
                              H[X | Y ; μ] = Finset.univ.sum fun (y : T) => (μ (Y ⁻¹' {y})).toReal * H[X | Yy ; μ]

                              H[X|Y] = ∑_y P[Y=y] H[X|Y=y]$.

                              theorem ProbabilityTheory.condEntropy_prod_eq_sum {Ω : Type uΩ} {S : Type uS} {T : Type uT} [mΩ : MeasurableSpace Ω] [MeasurableSpace S] [MeasurableSpace T] [MeasurableSingletonClass T] {T' : Type u_1} {X : ΩS} {Y : ΩT} {Z : ΩT'} [MeasurableSpace T'] [MeasurableSingletonClass T'] (μ : MeasureTheory.Measure Ω) (hY : Measurable Y) (hZ : Measurable Z) [MeasureTheory.IsFiniteMeasure μ] [Fintype T] [Fintype T'] :
                              H[X | Y, Z ; μ] = Finset.univ.sum fun (z : T') => (μ (Z ⁻¹' {z})).toReal * H[X | Y ; ProbabilityTheory.cond μ (Z ⁻¹' {z})]
                              theorem ProbabilityTheory.condEntropy_eq_sum_sum {Ω : Type uΩ} {S : Type uS} {T : Type uT} [mΩ : MeasurableSpace Ω] [MeasurableSpace S] [MeasurableSpace T] [MeasurableSingletonClass S] {X : ΩS} [MeasurableSingletonClass T] (hX : Measurable X) {Y : ΩT} (hY : Measurable Y) (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] [FiniteRange X] [FiniteRange Y] :
                              H[X | Y ; μ] = (FiniteRange.toFinset Y).sum fun (y : T) => (FiniteRange.toFinset X).sum fun (x : S) => ((MeasureTheory.Measure.map Y μ) {y}).toReal * ((MeasureTheory.Measure.map X (ProbabilityTheory.cond μ (Y ⁻¹' {y}))) {x}).toReal.negMulLog

                              H[X|Y] = ∑_y ∑_x P[Y=y] P[X=x|Y=y] log ⧸(P[X=x|Y=y])$.

                              theorem ProbabilityTheory.condEntropy_eq_sum_prod {Ω : Type uΩ} {S : Type uS} {T : Type uT} [mΩ : MeasurableSpace Ω] [MeasurableSpace S] [MeasurableSpace T] [MeasurableSingletonClass S] {X : ΩS} [MeasurableSingletonClass T] (hX : Measurable X) {Y : ΩT} (hY : Measurable Y) (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] [FiniteRange X] [FiniteRange Y] :
                              H[X | Y ; μ] = (FiniteRange.toFinset X ×ˢ FiniteRange.toFinset Y).sum fun (p : S × T) => ((MeasureTheory.Measure.map Y μ) {p.2}).toReal * ((MeasureTheory.Measure.map X (ProbabilityTheory.cond μ (Y ⁻¹' {p.2}))) {p.1}).toReal.negMulLog

                              Same as previous lemma, but with a sum over a product space rather than a double sum.

                              theorem ProbabilityTheory.condEntropy_of_injective {Ω : Type uΩ} {S : Type uS} {T : Type uT} {U : Type uU} [mΩ : MeasurableSpace Ω] [Countable S] [MeasurableSpace S] [MeasurableSpace T] [MeasurableSpace U] [MeasurableSingletonClass S] [MeasurableSingletonClass T] {X : ΩS} {Y : ΩT} [MeasurableSingletonClass U] (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsFiniteMeasure μ] (hX : Measurable X) (hY : Measurable Y) (f : TSU) (hf : ∀ (t : T), Function.Injective (f t)) [FiniteRange Y] :
                              H[fun (ω : Ω) => f (Y ω) (X ω) | Y ; μ] = H[X | Y ; μ]

                              If X : Ω → S, Y : Ω → T are random variables, and f : T × S → U is injective for each fixed t ∈ T, then H[f(Y, X) | Y] = H[X | Y]. Thus for instance H[X-Y|Y] = H[X|Y].

                              theorem ProbabilityTheory.condEntropy_comp_of_injective {Ω : Type uΩ} {S : Type uS} {T : Type uT} {U : Type uU} [mΩ : MeasurableSpace Ω] [Countable S] [MeasurableSpace S] [MeasurableSpace T] [MeasurableSpace U] {X : ΩS} {Y : ΩT} [MeasurableSingletonClass S] [MeasurableSingletonClass U] (μ : MeasureTheory.Measure Ω) (hX : Measurable X) (f : SU) (hf : Function.Injective f) :
                              H[f X | Y ; μ] = H[X | Y ; μ]

                              A weaker version of the above lemma in which f is independent of Y.

                              theorem ProbabilityTheory.condEntropy_comm {Ω : Type uΩ} {S : Type uS} {T : Type uT} {U : Type uU} [mΩ : MeasurableSpace Ω] [Countable S] [Countable T] [MeasurableSpace S] [MeasurableSpace T] [MeasurableSpace U] [MeasurableSingletonClass S] [MeasurableSingletonClass T] {X : ΩS} {Y : ΩT} {Z : ΩU} (hX : Measurable X) (hY : Measurable Y) (μ : MeasureTheory.Measure Ω) :
                              H[X, Y | Z ; μ] = H[Y, X | Z ; μ]

                              H[X, Y| Z] = H[Y, X| Z].

                              theorem ProbabilityTheory.chain_rule' {Ω : Type uΩ} {S : Type uS} {T : Type uT} [mΩ : MeasurableSpace Ω] [Countable S] [Countable T] [Nonempty S] [Nonempty T] [MeasurableSpace S] [MeasurableSpace T] [MeasurableSingletonClass S] [MeasurableSingletonClass T] {X : ΩS} {Y : ΩT} (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] (hX : Measurable X) (hY : Measurable Y) [FiniteRange X] [FiniteRange Y] :
                              H[X, Y ; μ] = H[X ; μ] + H[Y | X ; μ]

                              One form of the chain rule : H[X, Y] = H[X] + H[Y | X].

                              theorem ProbabilityTheory.chain_rule {Ω : Type uΩ} {S : Type uS} {T : Type uT} [mΩ : MeasurableSpace Ω] [Countable S] [Countable T] [Nonempty S] [Nonempty T] [MeasurableSpace S] [MeasurableSpace T] [MeasurableSingletonClass S] [MeasurableSingletonClass T] {X : ΩS} {Y : ΩT} (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] (hX : Measurable X) (hY : Measurable Y) [FiniteRange X] [FiniteRange Y] :
                              H[X, Y ; μ] = H[Y ; μ] + H[X | Y ; μ]

                              Another form of the chain rule : H[X, Y] = H[Y] + H[X | Y].

                              theorem ProbabilityTheory.chain_rule'' {Ω : Type uΩ} {S : Type uS} {T : Type uT} [mΩ : MeasurableSpace Ω] [Countable S] [Countable T] [Nonempty S] [Nonempty T] [MeasurableSpace S] [MeasurableSpace T] [MeasurableSingletonClass S] [MeasurableSingletonClass T] {X : ΩS} {Y : ΩT} (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] (hX : Measurable X) (hY : Measurable Y) [FiniteRange X] [FiniteRange Y] :
                              H[X | Y ; μ] = H[X, Y ; μ] - H[Y ; μ]

                              Another form of the chain rule : H[X | Y] = H[X, Y] - H[Y].

                              theorem ProbabilityTheory.IdentDistrib.condEntropy_eq {Ω : Type uΩ} {S : Type uS} {T : Type uT} [mΩ : MeasurableSpace Ω] [Countable S] [Countable T] [Nonempty S] [Nonempty T] [MeasurableSpace S] [MeasurableSpace T] [MeasurableSingletonClass S] [MeasurableSingletonClass T] {μ : MeasureTheory.Measure Ω} {Ω' : Type u_1} [MeasurableSpace Ω'] {X : ΩS} {Y : ΩT} {μ' : MeasureTheory.Measure Ω'} {X' : Ω'S} {Y' : Ω'T} [MeasureTheory.IsProbabilityMeasure μ] [MeasureTheory.IsProbabilityMeasure μ'] (hX : Measurable X) (hY : Measurable Y) (hX' : Measurable X') (hY' : Measurable Y') (h : ProbabilityTheory.IdentDistrib (X, Y) (X', Y') μ μ') [FiniteRange X] [FiniteRange Y] [FiniteRange X'] [FiniteRange Y'] :
                              H[X | Y ; μ] = H[X' | Y' ; μ']

                              Two pairs of variables that have the same joint distribution, have the same conditional entropy.

                              theorem ProbabilityTheory.condEntropy_of_injective' {Ω : Type uΩ} {S : Type uS} {T : Type uT} {U : Type uU} [mΩ : MeasurableSpace Ω] [Countable S] [Countable T] [Countable U] [Nonempty S] [Nonempty T] [Nonempty U] [MeasurableSpace S] [MeasurableSpace T] [MeasurableSpace U] [MeasurableSingletonClass T] [MeasurableSingletonClass U] {X : ΩS} {Y : ΩT} [MeasurableSingletonClass S] (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] (hX : Measurable X) (hY : Measurable Y) (f : TU) (hf : Function.Injective f) (hfY : Measurable (f Y)) [FiniteRange X] [FiniteRange Y] :
                              H[X | f Y ; μ] = H[X | Y ; μ]

                              If X : Ω → S and Y : Ω → T are random variables, and f : T → U is an injection then H[X | f(Y)] = H[X | Y].

                              theorem ProbabilityTheory.condEntropy_comp_self {Ω : Type uΩ} {S : Type uS} {U : Type uU} [mΩ : MeasurableSpace Ω] [Countable S] [Countable U] [Nonempty S] [Nonempty U] [MeasurableSpace S] [MeasurableSpace U] [MeasurableSingletonClass S] [MeasurableSingletonClass U] {X : ΩS} {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] (hX : Measurable X) {f : SU} (hf : Measurable f) [FiniteRange X] :
                              H[X | f X ; μ] = H[X ; μ] - H[f X ; μ]

                              H[X | f(X)] = H[X] - H[f(X)].

                              theorem ProbabilityTheory.cond_chain_rule' {Ω : Type uΩ} {S : Type uS} {T : Type uT} {U : Type uU} [mΩ : MeasurableSpace Ω] [Countable S] [Countable T] [Countable U] [Nonempty S] [Nonempty T] [MeasurableSpace S] [MeasurableSpace T] [MeasurableSpace U] [MeasurableSingletonClass S] [MeasurableSingletonClass T] [MeasurableSingletonClass U] {X : ΩS} {Y : ΩT} {Z : ΩU} (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] (hX : Measurable X) (hY : Measurable Y) (hZ : Measurable Z) [FiniteRange X] [FiniteRange Y] [FiniteRange Z] :
                              H[X, Y | Z ; μ] = H[X | Z ; μ] + H[Y | X, Z ; μ]

                              If X : Ω → S, Y : Ω → T, Z : Ω → U are random variables, then H[X, Y | Z] = H[X | Z] + H[Y|X, Z].

                              theorem ProbabilityTheory.cond_chain_rule {Ω : Type uΩ} {S : Type uS} {T : Type uT} {U : Type uU} [mΩ : MeasurableSpace Ω] [Countable S] [Countable T] [Countable U] [Nonempty S] [Nonempty T] [MeasurableSpace S] [MeasurableSpace T] [MeasurableSpace U] [MeasurableSingletonClass S] [MeasurableSingletonClass T] [MeasurableSingletonClass U] {X : ΩS} {Y : ΩT} {Z : ΩU} (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] (hX : Measurable X) (hY : Measurable Y) (hZ : Measurable Z) [FiniteRange X] [FiniteRange Y] [FiniteRange Z] :
                              H[X, Y | Z ; μ] = H[Y | Z ; μ] + H[X | Y, Z ; μ]

                              H[X, Y | Z] = H[Y | Z] + H[X | Y, Z].

                              theorem ProbabilityTheory.entropy_comp_le {Ω : Type uΩ} {S : Type uS} {U : Type uU} [mΩ : MeasurableSpace Ω] [Countable S] [Countable U] [Nonempty S] [Nonempty U] [MeasurableSpace S] [MeasurableSpace U] [MeasurableSingletonClass S] [MeasurableSingletonClass U] {X : ΩS} (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] (hX : Measurable X) (f : SU) [FiniteRange X] :
                              H[f X ; μ] H[X ; μ]

                              Data-processing inequality for the entropy: H[f(X)] ≤ H[X]. To upgrade this to equality, see entropy_of_comp_eq_of_comp or entropy_comp_of_injective.

                              theorem ProbabilityTheory.entropy_of_comp_eq_of_comp {Ω : Type uΩ} {S : Type uS} {T : Type uT} [mΩ : MeasurableSpace Ω] [Countable S] [Countable T] [Nonempty S] [Nonempty T] [MeasurableSpace S] [MeasurableSpace T] [MeasurableSingletonClass S] [MeasurableSingletonClass T] {X : ΩS} {Y : ΩT} (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] (hX : Measurable X) (hY : Measurable Y) (f : ST) (g : TS) (h1 : Y = f X) (h2 : X = g Y) [FiniteRange X] [FiniteRange Y] :
                              H[X ; μ] = H[Y ; μ]

                              A Schroder-Bernstein type theorem for entropy : if two random variables are functions of each other, then they have the same entropy. Can be used as a substitute for entropy_comp_of_injective if one doesn't want to establish the injectivity.

                              noncomputable def ProbabilityTheory.mutualInfo {Ω : Type uΩ} {S : Type uS} {T : Type uT} [mΩ : MeasurableSpace Ω] [MeasurableSpace S] [MeasurableSpace T] (X : ΩS) (Y : ΩT) (μ : autoParam (MeasureTheory.Measure Ω) _auto✝) :

                              The mutual information I[X : Y] of two random variables is defined to be H[X] + H[Y] - H[X ; Y].

                              Equations
                              • I[X : Y ; μ] = H[X ; μ] + H[Y ; μ] - H[X, Y ; μ]
                              Instances For

                                The mutual information I[X : Y] of two random variables is defined to be H[X] + H[Y] - H[X ; Y].

                                Equations
                                • One or more equations did not get rendered due to their size.
                                Instances For

                                  Pretty printer defined by notation3 command.

                                  Equations
                                  • One or more equations did not get rendered due to their size.
                                  Instances For

                                    Pretty printer defined by notation3 command.

                                    Equations
                                    • One or more equations did not get rendered due to their size.
                                    Instances For

                                      The mutual information I[X : Y] of two random variables is defined to be H[X] + H[Y] - H[X ; Y].

                                      Equations
                                      • One or more equations did not get rendered due to their size.
                                      Instances For
                                        theorem ProbabilityTheory.mutualInfo_def {Ω : Type uΩ} {S : Type uS} {T : Type uT} [mΩ : MeasurableSpace Ω] [MeasurableSpace S] [MeasurableSpace T] (X : ΩS) (Y : ΩT) (μ : MeasureTheory.Measure Ω) :
                                        I[X : Y ; μ] = H[X ; μ] + H[Y ; μ] - H[X, Y ; μ]
                                        theorem ProbabilityTheory.mutualInfo_comm {Ω : Type uΩ} {S : Type uS} {T : Type uT} [mΩ : MeasurableSpace Ω] [Countable S] [Countable T] [MeasurableSpace S] [MeasurableSpace T] [MeasurableSingletonClass S] [MeasurableSingletonClass T] {X : ΩS} {Y : ΩT} (hX : Measurable X) (hY : Measurable Y) (μ : MeasureTheory.Measure Ω) :
                                        I[X : Y ; μ] = I[Y : X ; μ]

                                        I[X : Y] = I[Y : X].

                                        theorem ProbabilityTheory.entropy_add_entropy_sub_mutualInfo {Ω : Type uΩ} {S : Type uS} {T : Type uT} [mΩ : MeasurableSpace Ω] [MeasurableSpace S] [MeasurableSpace T] (X : ΩS) (Y : ΩT) (μ : MeasureTheory.Measure Ω) :
                                        H[X ; μ] + H[Y ; μ] - I[X : Y ; μ] = H[X, Y ; μ]
                                        theorem ProbabilityTheory.mutualInfo_eq_entropy_sub_condEntropy {Ω : Type uΩ} {S : Type uS} {T : Type uT} [mΩ : MeasurableSpace Ω] [Countable S] [Countable T] [Nonempty S] [Nonempty T] [MeasurableSpace S] [MeasurableSpace T] [MeasurableSingletonClass S] [MeasurableSingletonClass T] {X : ΩS} {Y : ΩT} (hX : Measurable X) (hY : Measurable Y) (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] [FiniteRange X] [FiniteRange Y] :
                                        I[X : Y ; μ] = H[X ; μ] - H[X | Y ; μ]

                                        I[X : Y] = H[X] - H[X|Y].

                                        theorem ProbabilityTheory.mutualInfo_eq_entropy_sub_condEntropy' {Ω : Type uΩ} {S : Type uS} {T : Type uT} [mΩ : MeasurableSpace Ω] [Countable S] [Countable T] [Nonempty S] [Nonempty T] [MeasurableSpace S] [MeasurableSpace T] [MeasurableSingletonClass S] [MeasurableSingletonClass T] {X : ΩS} {Y : ΩT} (hX : Measurable X) (hY : Measurable Y) (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] [FiniteRange X] [FiniteRange Y] :
                                        I[X : Y ; μ] = H[Y ; μ] - H[Y | X ; μ]

                                        I[X : Y] = H[Y] - H[Y | X].

                                        theorem ProbabilityTheory.entropy_sub_mutualInfo_eq_condEntropy {Ω : Type uΩ} {S : Type uS} {T : Type uT} [mΩ : MeasurableSpace Ω] [Countable S] [Countable T] [Nonempty S] [Nonempty T] [MeasurableSpace S] [MeasurableSpace T] [MeasurableSingletonClass S] [MeasurableSingletonClass T] {X : ΩS} {Y : ΩT} (hX : Measurable X) (hY : Measurable Y) (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] [FiniteRange X] [FiniteRange Y] :
                                        H[X ; μ] - I[X : Y ; μ] = H[X | Y ; μ]

                                        H[X] - I[X : Y] = H[X | Y].

                                        theorem ProbabilityTheory.entropy_sub_mutualInfo_eq_condEntropy' {Ω : Type uΩ} {S : Type uS} {T : Type uT} [mΩ : MeasurableSpace Ω] [Countable S] [Countable T] [Nonempty S] [Nonempty T] [MeasurableSpace S] [MeasurableSpace T] [MeasurableSingletonClass S] [MeasurableSingletonClass T] {X : ΩS} {Y : ΩT} (hX : Measurable X) (hY : Measurable Y) (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] [FiniteRange X] [FiniteRange Y] :
                                        H[Y ; μ] - I[X : Y ; μ] = H[Y | X ; μ]

                                        H[Y] - I[X : Y] = H[Y | X].

                                        theorem ProbabilityTheory.mutualInfo_nonneg {Ω : Type uΩ} {S : Type uS} {T : Type uT} [mΩ : MeasurableSpace Ω] [MeasurableSpace S] [MeasurableSpace T] [MeasurableSingletonClass S] [MeasurableSingletonClass T] {X : ΩS} {Y : ΩT} (hX : Measurable X) (hY : Measurable Y) (μ : MeasureTheory.Measure Ω) [FiniteRange X] [FiniteRange Y] :
                                        0 I[X : Y ; μ]

                                        Mutual information is non-negative.

                                        theorem ProbabilityTheory.IdentDistrib.mutualInfo_eq {Ω : Type uΩ} {S : Type uS} {T : Type uT} [mΩ : MeasurableSpace Ω] [MeasurableSpace S] [MeasurableSpace T] {X : ΩS} {Y : ΩT} {μ : MeasureTheory.Measure Ω} {Ω' : Type u_1} [MeasurableSpace Ω'] {μ' : MeasureTheory.Measure Ω'} {X' : Ω'S} {Y' : Ω'T} (hXY : ProbabilityTheory.IdentDistrib (X, Y) (X', Y') μ μ') :
                                        I[X : Y ; μ] = I[X' : Y' ; μ']

                                        Substituting variables for ones with the same distributions doesn't change the mutual information.

                                        theorem ProbabilityTheory.entropy_pair_le_add {Ω : Type uΩ} {S : Type uS} {T : Type uT} [mΩ : MeasurableSpace Ω] [MeasurableSpace S] [MeasurableSpace T] [MeasurableSingletonClass S] [MeasurableSingletonClass T] {X : ΩS} {Y : ΩT} (hX : Measurable X) (hY : Measurable Y) (μ : MeasureTheory.Measure Ω) [FiniteRange X] [FiniteRange Y] :
                                        H[X, Y ; μ] H[X ; μ] + H[Y ; μ]

                                        Subadditivity of entropy.

                                        I[X : Y] = 0 iff X, Y are independent.

                                        Alias of the reverse direction of ProbabilityTheory.mutualInfo_eq_zero.


                                        I[X : Y] = 0 iff X, Y are independent.

                                        theorem ProbabilityTheory.mutualInfo_const {Ω : Type uΩ} {S : Type uS} {T : Type uT} [mΩ : MeasurableSpace Ω] [MeasurableSpace S] [MeasurableSpace T] [MeasurableSingletonClass S] [MeasurableSingletonClass T] {X : ΩS} (hX : Measurable X) (c : T) {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] [FiniteRange X] :
                                        I[X : fun (x : Ω) => c ; μ] = 0

                                        The mutual information with a constant is always zero.

                                        theorem ProbabilityTheory.entropy_pair_eq_add {Ω : Type uΩ} {S : Type uS} {T : Type uT} [mΩ : MeasurableSpace Ω] [MeasurableSpace S] [MeasurableSpace T] [MeasurableSingletonClass S] [MeasurableSingletonClass T] {X : ΩS} {Y : ΩT} (hX : Measurable X) (hY : Measurable Y) {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] [FiniteRange X] [FiniteRange Y] :
                                        H[X, Y ; μ] = H[X ; μ] + H[Y ; μ] ProbabilityTheory.IndepFun X Y μ

                                        H[X, Y] = H[X] + H[Y] if and only if X, Y are independent.

                                        theorem ProbabilityTheory.IndepFun.entropy_pair_eq_add {Ω : Type uΩ} {S : Type uS} {T : Type uT} [mΩ : MeasurableSpace Ω] [MeasurableSpace S] [MeasurableSpace T] [MeasurableSingletonClass S] [MeasurableSingletonClass T] {X : ΩS} {Y : ΩT} (hX : Measurable X) (hY : Measurable Y) {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] [FiniteRange X] [FiniteRange Y] :
                                        ProbabilityTheory.IndepFun X Y μH[X, Y ; μ] = H[X ; μ] + H[Y ; μ]

                                        If X, Y are independent, then H[X, Y] = H[X] + H[Y].

                                        noncomputable def ProbabilityTheory.condMutualInfo {Ω : Type uΩ} {S : Type uS} {T : Type uT} {U : Type uU} [mΩ : MeasurableSpace Ω] [MeasurableSpace S] [MeasurableSpace T] [MeasurableSpace U] (X : ΩS) (Y : ΩT) (Z : ΩU) (μ : autoParam (MeasureTheory.Measure Ω) _auto✝) :

                                        The conditional mutual information I[X : Y| Z] is the mutual information of X| Z=z and Y| Z=z, integrated over z.

                                        Equations
                                        • I[X : Y|Z;μ] = ∫ (x : U), (fun (z : U) => H[X | Zz ; μ] + H[Y | Zz ; μ] - H[X, Y | Zz ; μ]) xMeasureTheory.Measure.map Z μ
                                        Instances For
                                          theorem ProbabilityTheory.condMutualInfo_def {Ω : Type uΩ} {S : Type uS} {T : Type uT} {U : Type uU} [mΩ : MeasurableSpace Ω] [MeasurableSpace S] [MeasurableSpace T] [MeasurableSpace U] (X : ΩS) (Y : ΩT) (Z : ΩU) (μ : MeasureTheory.Measure Ω) :
                                          I[X : Y|Z;μ] = ∫ (x : U), (fun (z : U) => H[X | Zz ; μ] + H[Y | Zz ; μ] - H[X, Y | Zz ; μ]) xMeasureTheory.Measure.map Z μ

                                          Pretty printer defined by notation3 command.

                                          Equations
                                          • One or more equations did not get rendered due to their size.
                                          Instances For

                                            The conditional mutual information I[X : Y| Z] is the mutual information of X| Z=z and Y| Z=z, integrated over z.

                                            Equations
                                            • One or more equations did not get rendered due to their size.
                                            Instances For

                                              Pretty printer defined by notation3 command.

                                              Equations
                                              • One or more equations did not get rendered due to their size.
                                              Instances For

                                                The conditional mutual information I[X : Y| Z] is the mutual information of X| Z=z and Y| Z=z, integrated over z.

                                                Equations
                                                • One or more equations did not get rendered due to their size.
                                                Instances For

                                                  The conditional mutual information agrees with the information of the conditional kernel.

                                                  theorem ProbabilityTheory.condMutualInfo_eq_integral_mutualInfo {Ω : Type uΩ} {S : Type uS} {T : Type uT} {U : Type uU} [mΩ : MeasurableSpace Ω] [MeasurableSpace S] [MeasurableSpace T] [MeasurableSpace U] {X : ΩS} {Y : ΩT} {Z : ΩU} {μ : MeasureTheory.Measure Ω} :
                                                  I[X : Y|Z;μ] = ∫ (x : U), (fun (z : U) => I[X : Y ; ProbabilityTheory.cond μ (Z ⁻¹' {z})]) xMeasureTheory.Measure.map Z μ
                                                  theorem ProbabilityTheory.condMutualInfo_eq_sum {Ω : Type uΩ} {S : Type uS} {T : Type uT} {U : Type uU} [mΩ : MeasurableSpace Ω] [MeasurableSpace S] [MeasurableSpace T] [MeasurableSpace U] [MeasurableSingletonClass U] {X : ΩS} {Y : ΩT} {Z : ΩU} {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsFiniteMeasure μ] (hZ : Measurable Z) [FiniteRange Z] :
                                                  I[X : Y|Z;μ] = (FiniteRange.toFinset Z).sum fun (z : U) => (μ (Z ⁻¹' {z})).toReal * I[X : Y ; ProbabilityTheory.cond μ (Z ⁻¹' {z})]
                                                  theorem ProbabilityTheory.condMutualInfo_eq_sum' {Ω : Type uΩ} {S : Type uS} {T : Type uT} {U : Type uU} [mΩ : MeasurableSpace Ω] [MeasurableSpace S] [MeasurableSpace T] [MeasurableSpace U] [MeasurableSingletonClass U] {X : ΩS} {Y : ΩT} {Z : ΩU} {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsFiniteMeasure μ] (hZ : Measurable Z) [Fintype U] :
                                                  I[X : Y|Z;μ] = Finset.univ.sum fun (z : U) => (μ (Z ⁻¹' {z})).toReal * I[X : Y ; ProbabilityTheory.cond μ (Z ⁻¹' {z})]

                                                  A variant of condMutualInfo_eq_sum when Z has finite codomain.

                                                  theorem ProbabilityTheory.condMutualInfo_comm {Ω : Type uΩ} {S : Type uS} {T : Type uT} {U : Type uU} [mΩ : MeasurableSpace Ω] [Countable S] [Countable T] [MeasurableSpace S] [MeasurableSpace T] [MeasurableSpace U] [MeasurableSingletonClass S] [MeasurableSingletonClass T] {X : ΩS} {Y : ΩT} (hX : Measurable X) (hY : Measurable Y) (Z : ΩU) (μ : MeasureTheory.Measure Ω) :
                                                  I[X : Y|Z;μ] = I[Y : X|Z;μ]

                                                  I[X : Y | Z] = I[Y : X | Z].

                                                  theorem ProbabilityTheory.condMutualInfo_nonneg {Ω : Type uΩ} {S : Type uS} {T : Type uT} {U : Type uU} [mΩ : MeasurableSpace Ω] [MeasurableSpace S] [MeasurableSpace T] [MeasurableSpace U] [MeasurableSingletonClass S] [MeasurableSingletonClass T] {X : ΩS} {Y : ΩT} (hX : Measurable X) (hY : Measurable Y) (Z : ΩU) (μ : MeasureTheory.Measure Ω) [FiniteRange X] [FiniteRange Y] :
                                                  0 I[X : Y|Z;μ]

                                                  Conditional information is non-nonegative.

                                                  theorem ProbabilityTheory.condMutualInfo_eq {Ω : Type uΩ} {S : Type uS} {T : Type uT} {U : Type uU} [mΩ : MeasurableSpace Ω] [Countable S] [Countable T] [Countable U] [Nonempty S] [Nonempty T] [MeasurableSpace S] [MeasurableSpace T] [MeasurableSpace U] [MeasurableSingletonClass S] [MeasurableSingletonClass T] [MeasurableSingletonClass U] {X : ΩS} {Y : ΩT} {Z : ΩU} (hX : Measurable X) (hY : Measurable Y) (hZ : Measurable Z) (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] [FiniteRange Z] :
                                                  I[X : Y|Z;μ] = H[X | Z ; μ] + H[Y | Z ; μ] - H[X, Y | Z ; μ]

                                                  I[X : Y| Z] = H[X| Z] + H[Y| Z] - H[X, Y| Z].

                                                  theorem ProbabilityTheory.condMutualInfo_eq' {Ω : Type uΩ} {S : Type uS} {T : Type uT} {U : Type uU} [mΩ : MeasurableSpace Ω] [Countable S] [Countable T] [Countable U] [Nonempty S] [Nonempty T] [MeasurableSpace S] [MeasurableSpace T] [MeasurableSpace U] [MeasurableSingletonClass S] [MeasurableSingletonClass T] [MeasurableSingletonClass U] {X : ΩS} {Y : ΩT} {Z : ΩU} (hX : Measurable X) (hY : Measurable Y) (hZ : Measurable Z) (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] [FiniteRange X] [FiniteRange Y] [FiniteRange Z] :
                                                  I[X : Y|Z;μ] = H[X | Z ; μ] - H[X | Y, Z ; μ]

                                                  I[X : Y| Z] = H[X| Z] - H[X|Y, Z].

                                                  theorem ProbabilityTheory.condMutualInfo_of_inj_map {Ω : Type uΩ} {S : Type uS} {T : Type uT} {U : Type uU} [mΩ : MeasurableSpace Ω] [Countable S] [Countable T] [Countable U] [Nonempty S] [Nonempty T] [MeasurableSpace S] [MeasurableSpace T] [MeasurableSpace U] [MeasurableSingletonClass S] [MeasurableSingletonClass T] [MeasurableSingletonClass U] {X : ΩS} {Y : ΩT} {Z : ΩU} {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] (hX : Measurable X) (hY : Measurable Y) (hZ : Measurable Z) {V : Type u_1} [Nonempty V] [MeasurableSpace V] [MeasurableSingletonClass V] [Countable V] (f : USV) (hf : ∀ (t : U), Function.Injective (f t)) [FiniteRange Z] :
                                                  I[fun (ω : Ω) => f (Z ω) (X ω) : Y|Z;μ] = I[X : Y|Z;μ]

                                                  If f(Z, X) is injective for each fixed Z, then I[f(Z, X) : Y| Z] = I[X : Y| Z].

                                                  theorem ProbabilityTheory.condEntropy_prod_eq_of_indepFun {Ω : Type uΩ} {S : Type uS} {T : Type uT} {U : Type uU} [mΩ : MeasurableSpace Ω] [Countable S] [Countable T] [Nonempty S] [Nonempty T] [MeasurableSpace S] [MeasurableSpace T] [MeasurableSpace U] [MeasurableSingletonClass S] [MeasurableSingletonClass T] [MeasurableSingletonClass U] {X : ΩS} {Y : ΩT} {Z : ΩU} {μ : MeasureTheory.Measure Ω} [Fintype T] [Fintype U] [MeasureTheory.IsProbabilityMeasure μ] (hX : Measurable X) (hY : Measurable Y) (hZ : Measurable Z) [FiniteRange X] (h : ProbabilityTheory.IndepFun (X, Y) Z μ) :
                                                  H[X | Y, Z ; μ] = H[X | Y ; μ]
                                                  theorem ProbabilityTheory.entropy_sub_condEntropy {Ω : Type uΩ} {S : Type uS} {T : Type uT} [mΩ : MeasurableSpace Ω] [Countable S] [Countable T] [Nonempty S] [Nonempty T] [MeasurableSpace S] [MeasurableSpace T] {X : ΩS} {Y : ΩT} (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] [MeasurableSingletonClass S] [MeasurableSingletonClass T] (hX : Measurable X) (hY : Measurable Y) [FiniteRange X] [FiniteRange Y] :
                                                  H[X ; μ] - H[X | Y ; μ] = I[X : Y ; μ]

                                                  H[X] - H[X|Y] = I[X : Y]

                                                  theorem ProbabilityTheory.condEntropy_le_entropy {Ω : Type uΩ} {S : Type uS} {T : Type uT} [mΩ : MeasurableSpace Ω] [Countable S] [Countable T] [Nonempty S] [Nonempty T] [MeasurableSpace S] [MeasurableSpace T] {X : ΩS} {Y : ΩT} (μ : MeasureTheory.Measure Ω) [MeasurableSingletonClass S] [MeasurableSingletonClass T] (hX : Measurable X) (hY : Measurable Y) [MeasureTheory.IsProbabilityMeasure μ] [FiniteRange X] [FiniteRange Y] :
                                                  H[X | Y ; μ] H[X ; μ]

                                                  H[X | Y] ≤ H[X].

                                                  theorem ProbabilityTheory.entropy_submodular {Ω : Type uΩ} {S : Type uS} {T : Type uT} {U : Type uU} [mΩ : MeasurableSpace Ω] [Countable S] [Countable T] [Countable U] [Nonempty S] [Nonempty T] [MeasurableSpace S] [MeasurableSpace T] [MeasurableSpace U] [MeasurableSingletonClass U] {X : ΩS} {Y : ΩT} {Z : ΩU} (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] [MeasurableSingletonClass S] [MeasurableSingletonClass T] (hX : Measurable X) (hY : Measurable Y) (hZ : Measurable Z) [FiniteRange X] [FiniteRange Y] [FiniteRange Z] :
                                                  H[X | Y, Z ; μ] H[X | Z ; μ]

                                                  H[X | Y, Z] ≤ H[X | Z].

                                                  theorem ProbabilityTheory.condEntropy_comp_ge {Ω : Type uΩ} {S : Type uS} {T : Type uT} {U : Type uU} [mΩ : MeasurableSpace Ω] [Countable S] [Countable T] [Countable U] [Nonempty S] [Nonempty T] [Nonempty U] [MeasurableSpace S] [MeasurableSpace T] [MeasurableSpace U] [MeasurableSingletonClass U] {X : ΩS} {Y : ΩT} [MeasurableSingletonClass S] [MeasurableSingletonClass T] [FiniteRange X] [FiniteRange Y] (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] (hX : Measurable X) (hY : Measurable Y) (f : SU) :
                                                  H[Y | f X ; μ] H[Y | X ; μ]

                                                  Data-processing inequality for the conditional entropy: H[Y|f(X)] ≥ H[Y|X] To upgrade this to equality, see condEntropy_of_injective'

                                                  theorem ProbabilityTheory.entropy_triple_add_entropy_le {Ω : Type uΩ} {S : Type uS} {T : Type uT} {U : Type uU} [mΩ : MeasurableSpace Ω] [Countable S] [Countable T] [Countable U] [Nonempty S] [Nonempty T] [Nonempty U] [MeasurableSpace S] [MeasurableSpace T] [MeasurableSpace U] [MeasurableSingletonClass U] {X : ΩS} {Y : ΩT} {Z : ΩU} (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] [MeasurableSingletonClass S] [MeasurableSingletonClass T] (hX : Measurable X) (hY : Measurable Y) (hZ : Measurable Z) [FiniteRange X] [FiniteRange Y] [FiniteRange Z] :
                                                  H[X, Y, Z ; μ] + H[Z ; μ] H[X, Z ; μ] + H[Y, Z ; μ]

                                                  The submodularity inequality: H[X, Y, Z] + H[Z] ≤ H[X, Z] + H[Y, Z].

                                                  theorem ProbabilityTheory.condMutualInfo_eq_zero {Ω : Type uΩ} {S : Type uS} {T : Type uT} {U : Type uU} [mΩ : MeasurableSpace Ω] [Countable U] [MeasurableSpace S] [MeasurableSpace T] [MeasurableSpace U] [MeasurableSingletonClass U] {X : ΩS} {Y : ΩT} {Z : ΩU} [MeasurableSingletonClass S] [MeasurableSingletonClass T] {μ : MeasureTheory.Measure Ω} (hX : Measurable X) (hY : Measurable Y) (hZ : Measurable Z) [MeasureTheory.IsProbabilityMeasure μ] [FiniteRange X] [FiniteRange Y] [FiniteRange Z] :
                                                  I[X : Y|Z;μ] = 0 ProbabilityTheory.CondIndepFun X Y Z μ

                                                  I[X : Y| Z]=0 iff X, Y are conditionally independent over Z.

                                                  theorem ProbabilityTheory.ent_of_cond_indep {Ω : Type uΩ} {S : Type uS} {T : Type uT} {U : Type uU} [mΩ : MeasurableSpace Ω] [Countable S] [Countable T] [Countable U] [Nonempty S] [Nonempty T] [Nonempty U] [MeasurableSpace S] [MeasurableSpace T] [MeasurableSpace U] [MeasurableSingletonClass U] {X : ΩS} {Y : ΩT} {Z : ΩU} [MeasurableSingletonClass S] [MeasurableSingletonClass T] {μ : MeasureTheory.Measure Ω} (hX : Measurable X) (hY : Measurable Y) (hZ : Measurable Z) (h : ProbabilityTheory.CondIndepFun X Y Z μ) [MeasureTheory.IsProbabilityMeasure μ] [FiniteRange X] [FiniteRange Y] [FiniteRange Z] :
                                                  H[X, Y, Z ; μ] = H[X, Z ; μ] + H[Y, Z ; μ] - H[Z ; μ]

                                                  If X, Y are conditionally independent over Z, then H[X, Y, Z] = H[X, Z] + H[Y, Z] - H[Z].