13 Further improvement to exponent
13.1 Kullback–Leibler divergence
In the definitions below, is a set.
Definition
13.1
Kullback–Leibler divergence
If are two -valued random variables, the Kullback–Leibler divergence is defined as
Lemma
13.2
Kullback–Leibler divergence of copy
If is a copy of , and is a copy of , then .
Lemma
13.3
Gibbs inequality
Lemma
13.4
Converse Gibbs inequality
Lemma
13.5
Convexity of Kullback–Leibler
If is a finite set, for some non-negative , and , for all , then
Proof
▶
For each , replace in the definition with for each , and apply Lemma 1.4.
Lemma
13.6
Kullback–Leibler and injections
If is an injection, then .
Now let be an additive group.
Lemma
13.7
Kullback–Leibler and sums
If are independent -valued random variables, then
Definition
13.8
Conditional Kullback–Leibler divergence
If are random variables, with defined on the same sample space, we define
Lemma
13.9
Kullback–Leibler and conditioning
If are independent -valued random variables, and is another random variable defined on the same sample space as , then
Proof
▶
Compare the terms correspond to each on both sides.
Lemma
13.10
Conditional Gibbs inequality
Proof
▶
Clear from Definition 13.8 and Lemma 13.3.
13.2 Rho functionals
Let be an additive group, and let be a non-empty subset of .
Definition
13.11
Rho minus
For any -valued random variable , we define to be the infimum of , where is uniform on and ranges over -valued random variables independent of .
Definition
13.12
Rho plus
For any -valued random variable , we define .
Lemma
13.13
Rho minus non-negative
Lemma
13.14
Rho minus of subgroup
If is a finite subgroup of , then .
Proof
▶
For every -valued random variable that is independent of ,
by Lemma 1.4. Then observe that
This proves .
To get the equality, let and observe that
Corollary
13.15
Rho plus of subgroup
If is a finite subgroup of , then .
Definition
13.16
Rho functional
Proof
▶
by the choice . The claim then follows from Lemma 13.13.
Lemma
13.18
Rho of subgroup
If is a finite subgroup of , and , then there exists such that , and .
Lemma
13.19
Rho invariant
Lemma
13.20
Rho continuous
depends continuously on the distribution of .
Lemma
13.21
Rho and sums
If are independent, one has
and
Proof
▶
The first inequality follows from Lemma 13.7. The second and third inequalities are direct corollaries of the first.
Definition
13.22
Conditional Rho functional
Lemma
13.23
Conditional rho and translation
Lemma
13.24
Conditional rho and relabeling
Proof
▶
Clear from the definition.
Lemma
13.25
Rho and conditioning
If are defined on the same space, one has
and
Proof
▶
The first inequality follows from Lemma 13.9. The second and third inequalities are direct corollaries of the first.
The following lemmas hold for .
Lemma
13.26
Rho and sums, symmetrized
Proof
▶
Apply Lemma 13.21 for and and take their average.
Lemma
13.27
Rho and conditioning, symmetrized
Proof
▶
First apply Lemma 13.25 to get , and . Then apply Lemma 13.19 to get and take the average of the two inequalities.
13.3 Studying a minimizer
Set . In this section, consider .
Definition
13.28
Given -valued random variables , define
and define a -minimizer to be a pair of random variables which minimizes .
Lemma
13.29
-minimizers exist
There exists a -minimizer.
Let be a -minimizer, and be independent copies of respectively. Similar to the original proof we define
First we need the -minimizer variants of Lemma 6.12 and Lemma 6.16.
Proof
▶
Similar to Lemma 6.12: get upper bounds for by and , and then apply Lemma 6.8 to get an upper bound for .
Proof
▶
First of all, by , , and the fibring identity obtained by applying Corollary 5.3 on , we have . Then apply Lemma 13.31 to get , and rearrange.
Next we need some inequalities for the endgame.
Lemma
13.33
If -valued random variables satisfy , then
Proof
▶
Conditioned on every , by Definition 13.28. Then take the weighted average with weight and then apply Lemma 3.23 to bound the RHS.
Lemma
13.34
If -valued random variables satisfy , then
Proof
▶
Take the average of Lemma 13.33 over all permutations of .
Lemma
13.35
For independent random variables over , define , , . Then
Proof
▶
Let , . First note that
by Lemma 13.25, Lemma 13.27, Lemma 13.26 respectively. On the other hand, observe that
by Lemma 13.24, Lemma 13.26, Lemma 13.27 respectively. By replacing with in the above inequalities, one has
and
Finally, take the sum of all four inequalities, apply Corollary 5.3 on and to rewrite the sum of last terms in the four inequalities, and divide the result by .
Lemma
13.36
For independent random variables over , define and . Then
Proof
▶
Apply Lemma 13.35 on for , and take the sum.
Proposition
13.37
If is a -minimizer, then .
Proof
▶
Consider , and . Note that . First apply Lemma 13.34 on when conditioned on to get
Then apply Lemma 13.36 on and get
by Lemma 13.31. Plug in the inequality above to (1), we get
By Lemma 13.32 we can conclude that
Finally by Lemma 13.30 and we get that the second term is , and thus . By the choice and the non-negativity of we have .
Proposition
13.38
For any random variables , there exist a subgroup such that
Corollary
13.39
If , then there exists a subgroup and such that , and .
Corollary
13.40
If , then there exist a subgroup and a subset of with , such that and .
Theorem
13.41
PFR with
If is finite non-empty with , then there exists a subgroup of with such that can be covered by at most translates of .