PFR

13 Further improvement to exponent

13.1 Kullback–Leibler divergence

In the definitions below, G is a set.

Definition 13.1 Kullback–Leibler divergence
#

If X,Y are two G-valued random variables, the Kullback–Leibler divergence is defined as

DKL(XY):=xP(X=x)logP(X=x)P(Y=x).
Lemma 13.2 Kullback–Leibler divergence of copy
#

If X is a copy of X, and Y is a copy of Y, then DKL(XY)=DKL(XY).

Proof
Lemma 13.3 Gibbs inequality
#

DKL(XY)0.

Proof
Lemma 13.4 Converse Gibbs inequality
#

If DKL(XY)=0, then Y is a copy of X.

Proof
Lemma 13.5 Convexity of Kullback–Leibler
#

If S is a finite set, sSws=1 for some non-negative ws, and P(X=x)=sSwsP(Xs=x), P(Y=x)=sSwsP(Ys=x) for all x, then

DKL(XY)sSwsDKL(XsYs).
Proof
Lemma 13.6 Kullback–Leibler and injections
#

If f:GH is an injection, then DKL(f(X)f(Y))=DKL(XY).

Proof

Now let G be an additive group.

Lemma 13.7 Kullback–Leibler and sums
#

If X,Y,Z are independent G-valued random variables, then

DKL(X+ZY+Z)DKL(XY).
Proof
Definition 13.8 Conditional Kullback–Leibler divergence

If X,Y,Z are random variables, with X,Z defined on the same sample space, we define

DKL(X|ZY):=zP(Z=z)DKL((X|Z=z)Y).
Lemma 13.9 Kullback–Leibler and conditioning
#

If X,Y are independent G-valued random variables, and Z is another random variable defined on the same sample space as X, then

DKL((X|Z)Y)=DKL(XY)+H[X]H[X|Z].
Proof
Lemma 13.10 Conditional Gibbs inequality
#

DKL((X|W)Y)0.

Proof

13.2 Rho functionals

Let G be an additive group, and let A be a non-empty subset of G.

Definition 13.11 Rho minus
#

For any G-valued random variable X, we define ρ(X) to be the infimum of DKL(XUA+T), where UA is uniform on A and T ranges over G-valued random variables independent of UA.

Definition 13.12 Rho plus
#

For any G-valued random variable X, we define ρ+(X):=ρ(X)+H(X)H(UA).

Lemma 13.13 Rho minus non-negative

We have ρ(X)0.

Proof
Lemma 13.14 Rho minus of subgroup
#

If H is a finite subgroup of G, then ρ(UH)=log|A|logmaxt|A(H+t)|.

Proof
Corollary 13.15 Rho plus of subgroup
#

If H is a finite subgroup of G, then ρ+(UH)=log|H|logmaxt|A(H+t)|.

Proof
Definition 13.16 Rho functional
#

We define ρ(X):=(ρ+(X)+ρ(X))/2.

Lemma 13.17

We have ρ(UA)=0.

Proof
Lemma 13.18 Rho of subgroup

If H is a finite subgroup of G, and ρ(UH)r, then there exists t such that |A(H+t)|er|A||H|, and |H|/|A|[e2r,e2r].

Proof
Lemma 13.19 Rho invariant
#

For any sG, ρ(X+s)=ρ(X).

Proof
Lemma 13.20 Rho continuous
#

ρ(X) depends continuously on the distribution of X.

Proof
Lemma 13.21 Rho and sums
#

If X,Y are independent, one has

ρ(X+Y)ρ(X)
ρ+(X+Y)ρ+(X)+H[X+Y]H[X]

and

ρ(X+Y)ρ(X)+12(H[X+Y]H[X]).
Proof
Definition 13.22 Conditional Rho functional
#

We define ρ(X|Y):=yP(Y=y)ρ(X|Y=y).

Lemma 13.23 Conditional rho and translation
#

For any sG, ρ(X+s|Y)=ρ(X|Y).

Proof
Lemma 13.24 Conditional rho and relabeling
#

If f is injective, then ρ(X|f(Y))=ρ(X|Y).

Proof
Lemma 13.25 Rho and conditioning
#

If X,Z are defined on the same space, one has

ρ(X|Z)ρ(X)+H[X]H[X|Z]
ρ+(X|Z)ρ+(X)

and

ρ(X|Z)ρ(X)+12(H[X]H[X|Z]).
Proof

The following lemmas hold for G=F2n.

Lemma 13.26 Rho and sums, symmetrized
#

If X,Y are independent, then

ρ(X+Y)12(ρ(X)+ρ(Y)+d[X;Y]).
Proof
Lemma 13.27 Rho and conditioning, symmetrized
#

If X,Y are independent, then

ρ(X|X+Y)12(ρ(X)+ρ(Y)+d[X;Y]).
Proof

13.3 Studying a minimizer

Set η<1/8. In this section, consider G=F2n.

Definition 13.28
#

Given G-valued random variables X,Y, define

ϕ[X;Y]:=d[X;Y]+η(ρ(X)+ρ(Y))

and define a ϕ-minimizer to be a pair of random variables X,Y which minimizes ϕ[X;Y].

Lemma 13.29 ϕ-minimizers exist
#

There exists a ϕ-minimizer.

Proof

Let (X1,X2) be a ϕ-minimizer, and X~1,X~2 be independent copies of X1,X2 respectively. Similar to the original proof we define

I1:=I[X1+X2:X~1+X2|X1+X2+X~1+X~2],I2:=I[X1+X2:X1+X~1|X1+X2+X~1+X~2].

First we need the ϕ-minimizer variants of Lemma 6.12 and Lemma 6.16.

Lemma 13.30
#

I12ηd[X1;X2]

Proof
Lemma 13.31
#

d[X1;X1]+d[X2;X2]=2d[X1;X2]+(I2I1).

Proof
Lemma 13.32
#

I22ηd[X1;X2]+η1η(2ηd[X1;X2]I1).

Proof

Next we need some inequalities for the endgame.

Lemma 13.33
#

If G-valued random variables T1,T2,T3 satisfy T1+T2+T3=0, then

d[X1;X2]3I[T1:T2]+(2H[T3]H[T1]H[T2])+η(ρ(T1|T3)+ρ(T2|T3)ρ(X1)ρ(X2)).
Proof
Lemma 13.34
#

If G-valued random variables T1,T2,T3 satisfy T1+T2+T3=0, then

d[X1;X2]1i<j3I[Ti:Tj]+η31i<j3(ρ(Ti|Tj)+ρ(Tj|Ti)ρ(X1)ρ(X2))
Proof
Lemma 13.35
#

For independent random variables Y1,Y2,Y3,Y4 over G, define S:=Y1+Y2+Y3+Y4, T1:=Y1+Y2, T2:=Y1+Y3. Then

ρ(T1|T2,S)+ρ(T2|T1,S)12iρ(Yi)12(d[Y1;Y2]+d[Y3;Y4]+d[Y1;Y3]+d[Y2;Y4]).
Proof
Lemma 13.36
#

For independent random variables Y1,Y2,Y3,Y4 over G, define T1:=Y1+Y2,T2:=Y1+Y3,T3:=Y2+Y3 and S:=Y1+Y2+Y3+Y4. Then

1i<j3(ρ(Ti|Tj,S)+ρ(Tj|Ti,S)12iρ(Yi))1i<j4d[Yi;Yj]
Proof
Proposition 13.37
#

If X1,X2 is a ϕ-minimizer, then d[X1;X2]=0.

Proof
Proposition 13.38
#

For any random variables Y1,Y2, there exist a subgroup H such that

2ρ(UH)ρ(Y1)+ρ(Y2)+8d[Y1;Y2].
Proof
Corollary 13.39
#

If |A+A|K|A|, then there exists a subgroup H and tG such that |A(H+t)|K4|A||H|, and |H|/|A|[K8,K8].

Proof
Corollary 13.40
#

If |A+A|K|A|, then there exist a subgroup H and a subset c of G with Ac+H, such that |c|K5|A|1/2/|H|1/2 and |H|/|A|[K8,K8].

Proof
Theorem 13.41 PFR with C=9
#

If AF2n is finite non-empty with |A+A|K|A|, then there exists a subgroup H of F2n with |H||A| such that A can be covered by at most 2K9 translates of H.

Proof