
6 Entropy version of PFR

Definition 6.1

\(\eta := 1/9\).

Throughout this chapter, \(G = \mathbb {F}_2^n\), and \(X^0_1, X^0_2\) are \(G\)-valued random variables.

Definition 6.2 \(\tau \) functional

If \(X_1,X_2\) are two \(G\)-valued random variables, then

\[ \tau [X_1; X_2] := d[X_1; X_2] + \eta d[X^0_1; X_1] + \eta d[X^0_2; X_2]. \]
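For instance, since the Ruzsa distance is symmetric, evaluating the functional at the pair \((X^0_2, X^0_1)\) gives

\[ \tau [X^0_2; X^0_1] = d[X^0_2;X^0_1] + \eta d[X^0_1;X^0_2] + \eta d[X^0_2;X^0_1] = (1+2\eta ) d[X^0_1;X^0_2], \]

a computation that reappears in the proof of Theorem 6.24.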
Lemma 6.3 \(\tau \) depends only on distribution

If \(X'_1, X'_2\) are copies of \(X_1,X_2\), then \(\tau [X'_1;X'_2] = \tau [X_1;X_2]\).

Proof

Immediate from Lemma 3.6.

Definition 6.4 \(\tau \)-minimizer

A pair of \(G\)-valued random variables \(X_1, X_2\) is said to be a \(\tau \)-minimizer if one has

\[ \tau [X_1;X_2] \leq \tau [X'_1;X'_2] \]

for all \(G\)-valued random variables \(X'_1, X'_2\).

Proposition 6.5 \(\tau \) has minimum

There exists a pair \(X_1, X_2\) of \(G\)-valued random variables that is a \(\tau \)-minimizer.

Proof

By Lemma 6.3, \(\tau \) depends only on the probability distributions of \(X_1, X_2\). Since \(G\) is finite, the pair of distributions ranges over a compact space, and \(\tau \) is a continuous function of this pair, so \(\tau \) attains a minimum.

6.1 Basic facts about minimizers

In this section we assume that \(X_1,X_2\) are \(\tau \)-minimizers. We also write \(k := d[X_1;X_2]\).

Lemma 6.6 Distance lower bound

For any \(G\)-valued random variables \(X'_1,X'_2\), one has

\[ d[X'_1;X'_2] \geq k - \eta (d[X^0_1;X'_1] - d[X^0_1;X_1] ) - \eta (d[X^0_2;X'_2] - d[X^0_2;X_2] ). \]
Proof

Immediate from Definition 6.2 and Definition 6.4.
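Spelled out: since \(X_1, X_2\) is a \(\tau \)-minimizer, Definition 6.4 gives \(\tau [X_1;X_2] \leq \tau [X'_1;X'_2]\), which by Definition 6.2 reads

\[ k + \eta d[X^0_1;X_1] + \eta d[X^0_2;X_2] \leq d[X'_1;X'_2] + \eta d[X^0_1;X'_1] + \eta d[X^0_2;X'_2], \]

and rearranging gives the claim.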

Lemma 6.7 Conditional distance lower bound

For any \(G\)-valued random variables \(X'_1,X'_2\) and random variables \(Z,W\), one has

\[ d[X'_1|Z;X'_2|W] \geq k - \eta (d[X^0_1;X'_1|Z] - d[X^0_1;X_1] ) - \eta (d[X^0_2;X'_2|W] - d[X^0_2;X_2] ). \]
Proof

Apply Lemma 6.6 to conditioned random variables and then average.
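In more detail (assuming, as in Chapter 3, that the conditional Ruzsa distances here are the weighted averages over the conditioning values): applying Lemma 6.6 to each pair \(((X'_1|Z=z), (X'_2|W=w))\) and averaging with weights \(\mathbb {P}[Z=z] \mathbb {P}[W=w]\) gives

\begin{align*} d[X'_1|Z;X'_2|W] \geq k & - \eta \sum _z \mathbb {P}[Z=z] \bigl(d[X^0_1;(X'_1|Z=z)] - d[X^0_1;X_1]\bigr) \\ & - \eta \sum _w \mathbb {P}[W=w] \bigl(d[X^0_2;(X'_2|W=w)] - d[X^0_2;X_2]\bigr), \end{align*}

which is the stated inequality.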

6.2 First estimate

We continue the assumptions from the preceding section.

Let \(X_1, X_2, \tilde X_1, \tilde X_2\) be independent random variables, with \(X_1, \tilde X_1\) both copies of the minimizer \(X_1\) and \(X_2, \tilde X_2\) both copies of the minimizer \(X_2\). (This is possible thanks to Lemma 3.7.)

We also define the quantity

\[ I_1 := \mathbb {I}[X_1+X_2 : \tilde X_1 + X_2 | X_1+X_2+\tilde X_1+\tilde X_2]. \]
Lemma 6.8 Fibring identity for first estimate

We have

\begin{align*} & d[X_1+\tilde X_2;X_2+\tilde X_1] + d[X_1|X_1+\tilde X_2; X_2|X_2+\tilde X_1] \\ & \quad + \mathbb {I}[X_1+ X_2 : \tilde X_1 + X_2 \, |\, X_1 + X_2 + \tilde X_1 + \tilde X_2] = 2k. \end{align*}
Proof

Immediate from Corollary 5.3.

Lemma 6.9 Lower bound on distances

We have

\begin{align*} d[X_1+\tilde X_2; X_2+\tilde X_1] \geq k & - \eta (d[X^0_1; X_1+\tilde X_2] - d[X^0_1; X_1]) \\ & \qquad - \eta (d[X^0_2; X_2+\tilde X_1] - d[X^0_2; X_2]) \end{align*}
Proof

Immediate from Lemma 6.6.

Lemma 6.10 Lower bound on conditional distances

We have

\begin{align*} & d[X_1|X_1+\tilde X_2; X_2|X_2+\tilde X_1] \\ & \qquad \quad \geq k - \eta (d[X^0_1; X_1 | X_1 + \tilde X_2] - d[X^0_1; X_1]) \\ & \qquad \qquad \qquad \qquad - \eta (d[X^0_2; X_2 | X_2 + \tilde X_1] - d[X^0_2; X_2]). \end{align*}
Proof

Immediate from Lemma 6.7.

Lemma 6.11 Upper bound on distance differences

We have

\begin{align*} d[X^0_1; X_1+\tilde X_2] - d[X^0_1; X_1] & \leq \tfrac {1}{2} k + \tfrac {1}{4} \mathbb {H}[X_2] - \tfrac {1}{4} \mathbb {H}[X_1]\\ d[X^0_2;X_2+\tilde X_1] - d[X^0_2; X_2] & \leq \tfrac {1}{2} k + \tfrac {1}{4} \mathbb {H}[X_1] - \tfrac {1}{4} \mathbb {H}[X_2], \\ d[X_1^0;X_1|X_1+\tilde X_2] - d[X_1^0;X_1] & \leq \tfrac {1}{2} k + \tfrac {1}{4} \mathbb {H}[X_1] - \tfrac {1}{4} \mathbb {H}[X_2] \\ d[X_2^0; X_2|X_2+\tilde X_1] - d[X_2^0; X_2] & \leq \tfrac {1}{2}k + \tfrac {1}{4} \mathbb {H}[X_2] - \tfrac {1}{4} \mathbb {H}[X_1]. \end{align*}
Proof

Immediate from Lemma 3.25 (and recalling that \(k\) is defined to be \(d[X_1;X_2]\)).
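For instance, for the first inequality: assuming Lemma 3.25 supplies the bound \(d[X^0_1; X_1+\tilde X_2] - d[X^0_1; X_1] \leq \tfrac {1}{2}(\mathbb {H}[X_1+\tilde X_2] - \mathbb {H}[X_1])\), it remains to note that by Lemma 3.11 and independence

\[ \mathbb {H}[X_1+\tilde X_2] = d[X_1;\tilde X_2] + \tfrac {1}{2} \mathbb {H}[X_1] + \tfrac {1}{2} \mathbb {H}[X_2] = k + \tfrac {1}{2} \mathbb {H}[X_1] + \tfrac {1}{2} \mathbb {H}[X_2], \]

so that \(\tfrac {1}{2}(\mathbb {H}[X_1+\tilde X_2] - \mathbb {H}[X_1]) = \tfrac {1}{2} k + \tfrac {1}{4} \mathbb {H}[X_2] - \tfrac {1}{4} \mathbb {H}[X_1]\). The other three inequalities are obtained analogously.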

Lemma 6.12 First estimate

We have \(I_1 \leq 2 \eta k\).

Proof

Take a suitable linear combination of Lemma 6.8, Lemma 6.9, Lemma 6.10, and Lemma 6.11.
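Concretely: adding the bounds of Lemma 6.9 and Lemma 6.10 and subtracting the result from the identity of Lemma 6.8 yields

\begin{align*} I_1 \leq \; & \eta (d[X^0_1; X_1+\tilde X_2] - d[X^0_1;X_1]) + \eta (d[X^0_2; X_2+\tilde X_1] - d[X^0_2;X_2]) \\ & + \eta (d[X^0_1; X_1 | X_1 + \tilde X_2] - d[X^0_1;X_1]) + \eta (d[X^0_2; X_2 | X_2+\tilde X_1] - d[X^0_2;X_2]), \end{align*}

and the four bounds of Lemma 6.11 sum to \(2k\) (the entropy terms cancel in pairs), giving \(I_1 \leq 2 \eta k\).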

One can also extract the following useful inequality from the proof of the above lemma.

Lemma 6.13 Entropy bound on quadruple sum

With the same notation, we have

\begin{equation} \label{HS-bound} \mathbb {H}[X_1+X_2+\tilde X_1+\tilde X_2] \le \tfrac {1}{2} \mathbb {H}[X_1]+\tfrac {1}{2} \mathbb {H}[X_2] + (2 + \eta ) k - I_1. \end{equation}

Proof

Subtracting Lemma 6.10 from Lemma 6.8, and combining the resulting inequality with Lemma 6.11 gives the bound

\[ d[X_1+\tilde X_2;X_2+\tilde X_1] \le (1 + \eta ) k - I_1, \]

and the claim follows from Lemma 3.11 and the definition of \(k\).
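To spell out the last step: \(X_1+\tilde X_2\) and \(X_2+\tilde X_1\) are independent, so Lemma 3.11, together with \(\mathbb {H}[X_1+\tilde X_2] = \mathbb {H}[X_2+\tilde X_1] = k + \tfrac {1}{2} \mathbb {H}[X_1] + \tfrac {1}{2} \mathbb {H}[X_2]\), gives

\[ d[X_1+\tilde X_2;X_2+\tilde X_1] = \mathbb {H}[X_1+X_2+\tilde X_1+\tilde X_2] - k - \tfrac {1}{2} \mathbb {H}[X_1] - \tfrac {1}{2} \mathbb {H}[X_2], \]

and combining this with the bound \(d[X_1+\tilde X_2;X_2+\tilde X_1] \le (1 + \eta ) k - I_1\) above yields \eqref{HS-bound}.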

6.3 Second estimate

We continue the assumptions from the preceding section. We introduce the quantity

\[ I_2 := \mathbb {I}[X_1+X_2 : X_1 + \tilde X_1 | X_1+X_2+\tilde X_1+\tilde X_2]. \]
Lemma 6.14 Distance between sums

We have

\[ d[X_1+\tilde X_1; X_2+\tilde X_2] \geq k - \frac{\eta }{2} ( d[X_1; X_1] + d[X_2;X_2] ). \]
Proof

From Lemma 6.6 one has

\begin{align*} d[X_1+\tilde X_1; X_2+\tilde X_2] \geq k & - \eta (d[X^0_1;X_1+\tilde X_1] - d[X^0_1;X_1]) \\ & - \eta (d[X^0_2;X_2+\tilde X_2] - d[X^0_2;X_2]). \end{align*}

Now Lemma 3.25 gives

\[ d[X^0_1;X_1+\tilde X_1] - d[X^0_1;X_1] \leq \tfrac {1}{2} d[X_1;X_1] \]

and

\[ d[X^0_2;X_2+\tilde X_2] - d[X^0_2;X_2] \leq \tfrac {1}{2} d[X_2;X_2], \]

and the claim follows.

Lemma 6.15

We have

\[ d[X_1;X_1] + d[X_2;X_2] \leq 2 k + \frac{2(2 \eta k - I_1)}{1-\eta }. \]
Proof

We may use Lemma 3.11 to expand

\begin{align*} & d[X_1+\tilde X_1;X_2+\tilde X_2] \\ & = \mathbb {H}[X_1+\tilde X_1 + X_2 + \tilde X_2] - \tfrac {1}{2} \mathbb {H}[X_1+\tilde X_1] - \tfrac {1}{2} \mathbb {H}[X_2+\tilde X_2] \\ & = \mathbb {H}[X_1+\tilde X_1 + X_2 + \tilde X_2] - \tfrac {1}{2} \mathbb {H}[X_1] - \tfrac {1}{2} \mathbb {H}[X_2] \\ & \qquad \qquad \qquad - \tfrac {1}{2} \left( d[X_1;X_1] + d[X_2; X_2] \right), \end{align*}

and hence by Lemma 6.13

\[ d[X_1+\tilde X_1; X_2+\tilde X_2] \leq (2+\eta ) k - \tfrac {1}{2} \left( d[X_1;X_1] + d[X_2;X_2] \right) - I_1. \]

Combining this bound with Lemma 6.14 we obtain the result.
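Explicitly, comparing the last display with Lemma 6.14 gives

\[ k - \frac{\eta }{2} \left( d[X_1;X_1] + d[X_2;X_2] \right) \leq (2+\eta ) k - \tfrac {1}{2} \left( d[X_1;X_1] + d[X_2;X_2] \right) - I_1, \]

hence \(\tfrac {1-\eta }{2} \left( d[X_1;X_1] + d[X_2;X_2] \right) \leq (1+\eta ) k - I_1 = (1-\eta ) k + (2 \eta k - I_1)\), and dividing by \(\tfrac {1-\eta }{2}\) gives the claim.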

Lemma 6.16 Second estimate

We have

\[ I_2 \leq 2 \eta k + \frac{2 \eta (2 \eta k - I_1)}{1 - \eta }. \]
Proof

We apply Corollary 5.3, but now with the choice

\[ (Y_1,Y_2,Y_3,Y_4) := (X_2, X_1, \tilde X_2, \tilde X_1). \]

Now Corollary 5.3 can be rewritten as

\begin{align*} & d[X_1+\tilde X_1;X_2+\tilde X_2] + d[X_1|X_1+\tilde X_1; X_2|X_2+\tilde X_2] \\ & \quad + \mathbb {I}[X_1+X_2 : X_1 + \tilde X_1 \, |\, X_1+X_2+\tilde X_1+\tilde X_2] = 2k, \end{align*}

recalling once again that \(k := d[X_1;X_2]\). From Lemma 6.7 one has

\begin{align*} d[X_1|X_1+\tilde X_1; X_2|X_2+\tilde X_2] \geq k & - \eta (d[X^0_1;X_1|X_1+\tilde X_1] - d[X^0_1;X_1]) \\ & - \eta (d[X^0_2;X_2|X_2+\tilde X_2] - d[X^0_2;X_2]) . \end{align*}

while from Lemma 3.25 we have

\[ d[X^0_1;X_1|X_1+\tilde X_1] - d[X^0_1;X_1] \leq \tfrac {1}{2} d[X_1;X_1], \]

and

\[ d[X^0_2;X_2|X_2+\tilde X_2] - d[X^0_2;X_2] \leq \tfrac {1}{2} d[X_2;X_2]. \]

Combining all these inequalities with Lemma 6.14, we have

\begin{equation} \label{combined} \mathbb {I}[X_1+X_2 : X_1 + \tilde X_1 | X_1+X_2+\tilde X_1+\tilde X_2] \leq \eta ( d[X_1; X_1] + d[X_2; X_2] ). \end{equation}

Together with Lemma 6.15, this gives the conclusion.
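Explicitly, inserting the bound of Lemma 6.15 into \eqref{combined} gives

\[ I_2 \leq \eta ( d[X_1;X_1] + d[X_2;X_2] ) \leq \eta \Bigl( 2k + \frac{2(2 \eta k - I_1)}{1-\eta } \Bigr) = 2 \eta k + \frac{2 \eta (2 \eta k - I_1)}{1 - \eta }. \]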

6.4 Endgame

Let \(X_1,X_2,\tilde X_1,\tilde X_2\) be as before, and introduce the random variables

\[ U := X_1 + X_2, \qquad V := \tilde X_1 + X_2, \qquad W := X_1 + \tilde X_1 \]

and

\[ S := X_1 + X_2 + \tilde X_1 + \tilde X_2. \]
Lemma 6.17 Symmetry identity

We have

\[ \mathbb {I}[U : W | S] = \mathbb {I}[V : W | S]. \]
Proof

This should follow from Lemma 3.6, Lemma 2.26, and Lemma 2.13: swapping \(X_1\) and \(\tilde X_1\) (which are independent copies of the same variable) exchanges \(U\) and \(V\) while leaving \(W\) and \(S\) unchanged, so \((U,W,S)\) and \((V,W,S)\) have the same joint distribution.

Lemma 6.18 Bound on conditional mutual informations

We have

\[ \mathbb {I}[U : V \, | \, S] + \mathbb {I}[V : W \, | \, S] + \mathbb {I}[W : U \, | \, S] \leq 6 \eta k - \frac{1 - 5 \eta }{1-\eta } (2 \eta k - I_1). \]
Proof

From the definitions of \(I_1,I_2\) and Lemma 6.17, we see that

\[ I_1 = \mathbb {I}[U : V \, | \, S], \qquad I_2 = \mathbb {I}[W : U \, | \, S], \qquad I_2 = \mathbb {I}[V : W \, | \, S]. \]

Applying Lemma 6.12 and Lemma 6.16 we have the inequalities

\[ I_1 \leq 2 \eta k \qquad \text{and} \qquad I_2 \leq 2 \eta k + \frac{2 \eta (2 \eta k - I_1)}{1 - \eta } . \]

We conclude that

\[ I_1 + I_2 + I_2 \leq I_1+4\eta k+ \frac{4\eta (2 \eta k - I_1)}{1-\eta } \]

and the claim follows from a short calculation, spelled out below.
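The calculation: writing \(I_1 = 2 \eta k - (2 \eta k - I_1)\), we have

\begin{align*} I_1+4\eta k+ \frac{4\eta (2 \eta k - I_1)}{1-\eta } & = 6 \eta k - (2 \eta k - I_1) + \frac{4\eta (2 \eta k - I_1)}{1-\eta } \\ & = 6 \eta k - \frac{(1-\eta ) - 4 \eta }{1-\eta } (2 \eta k - I_1) = 6 \eta k - \frac{1 - 5 \eta }{1-\eta } (2 \eta k - I_1). \end{align*}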

Lemma 6.19 Bound on distance increments

We have

\begin{align*} \sum _{i=1}^2 \sum _{A\in \{ U,V,W\} } \big(d[X^0_i;A|S] & - d[X^0_i;X_i]\big) \\ & \leq (6 - 3\eta ) k + 3(2 \eta k - I_1). \end{align*}
Proof

By Lemma 3.26 (taking \(X = X_1^0\), \(Y = X_1\), \(Z = X_2\) and \(Z' = \tilde X_1 + \tilde X_2\), so that \(Y + Z = U\) and \(Y + Z + Z' = S\)) we have, noting that \(\mathbb {H}[Y+ Z] = \mathbb {H}[Z']\),

\[ d[X^0_1;U|S] - d[X^0_1;X_1] \leq \tfrac {1}{2} (\mathbb {H}[S] - \mathbb {H}[X_1]). \]

Further applications of Lemma 3.26 give

\begin{align*} d[X^0_2;U|S] - d[X^0_2; X_2] & \leq \tfrac {1}{2} (\mathbb {H}[S] - \mathbb {H}[X_2]) \\ d[X^0_1;V|S] - d[X^0_1;X_1] & \leq \tfrac {1}{2} (\mathbb {H}[S] - \mathbb {H}[X_1])\\ d[X^0_2;V|S] - d[X^0_2;X_2] & \leq \tfrac {1}{2} (\mathbb {H}[S] - \mathbb {H}[X_2]) \end{align*}

and

\[ d[X^0_1;W|S] - d[X^0_1;X_1] \leq \tfrac {1}{2} (\mathbb {H}[S] + \mathbb {H}[W] - \mathbb {H}[X_1] - \mathbb {H}[W']), \]

where \(W' := X_2 + \tilde X_2\). To treat \(d[X^0_2;W|S]\), first note that this equals \(d[X^0_2;W'|S]\), since for a fixed choice \(s\) of \(S\) we have \(W' = W + s\) (here we need some helper lemma about Ruzsa distance). Now we may apply Lemma 3.26 to obtain

\[ d[X^0_2;W'|S] - d[X^0_2;X_2] \leq \tfrac {1}{2} (\mathbb {H}[S] + \mathbb {H}[W'] - \mathbb {H}[X_2] - \mathbb {H}[W]). \]

Summing these six estimates and using Lemma 6.13, we conclude that

\begin{align*} \sum _{i=1}^2 \sum _{A\in \{ U,V,W\} } \big(d[X^0_i;A|S] & - d[X^0_i;X_i]\big) \\ & \leq 3 \mathbb {H}[S] - \tfrac {3}{2} \mathbb {H}[X_1] - \tfrac {3}{2} \mathbb {H}[X_2]\\ & \leq (6 - 3\eta ) k + 3(2 \eta k - I_1) \end{align*}

as required.
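For the record, the final inequality above is one more application of Lemma 6.13:

\[ 3 \mathbb {H}[S] - \tfrac {3}{2} \mathbb {H}[X_1] - \tfrac {3}{2} \mathbb {H}[X_2] \leq 3 \bigl( (2+\eta ) k - I_1 \bigr) = (6 - 3\eta ) k + 3(2 \eta k - I_1). \]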

Lemma 6.20 Key identity

We have \(U+V+W=0\).

Proof

Each of \(X_1, X_2, \tilde X_1\) appears exactly twice in \(U+V+W\), and we are working in characteristic two.

For the next two lemmas, let \((T_1,T_2,T_3)\) be a \(G^3\)-valued random variable such that \(T_1+T_2+T_3=0\) holds identically. Set

\begin{equation} \label{delta-t1t2t3-def} \delta := \sum _{1 \leq i {\lt} j \leq 3} \mathbb {I}[T_i;T_j]. \end{equation}

Lemma 6.21 Constructing good variables, I

One has

\begin{align*} k \leq \delta + \eta (& d[X^0_1;T_1]-d[X^0_1;X_1]) + \eta (d[X^0_2;T_2]-d[X^0_2;X_2]) \\ & + \tfrac 12 \eta \mathbb {I}[T_1:T_3] + \tfrac 12 \eta \mathbb {I}[T_2:T_3]. \end{align*}

(Note: in the paper, this lemma was phrased in a more intuitive formulation that is basically the contrapositive of the one here. Similarly for the next two lemmas.)

Proof

We apply Lemma 3.23 with \((A,B) = (T_1, T_2)\) there. Since \(T_1 + T_2 = T_3\), the conclusion is that

\begin{align} \nonumber \sum _{t_3} \mathbb {P}[T_3 = t_3] & d[(T_1 | T_3 = t_3); (T_2 | T_3 = t_3)] \\ & \leq 3 \mathbb {I}[T_1 : T_2] + 2 \mathbb {H}[T_3] - \mathbb {H}[T_1] - \mathbb {H}[T_2].\label{bsg-t1t2} \end{align}

The right-hand side of \eqref{bsg-t1t2} can be rearranged as

\begin{align*} & 2( \mathbb {H}[T_1] + \mathbb {H}[T_2] + \mathbb {H}[T_3]) - 3 \mathbb {H}[T_1,T_2] \\ & = 2(\mathbb {H}[T_1] + \mathbb {H}[T_2] + \mathbb {H}[T_3]) - \mathbb {H}[T_1,T_2] - \mathbb {H}[T_2,T_3] - \mathbb {H}[T_1, T_3] = \delta ,\end{align*}

using the fact (from Lemma 2.2) that all three terms \(\mathbb {H}[T_i,T_j]\) are equal to \(\mathbb {H}[T_1,T_2,T_3]\) and hence to each other. We also have

\begin{align*} & \sum _{t_3} \mathbb {P}[T_3 = t_3] \bigl(d[X^0_1; (T_1 | T_3=t_3)] - d[X^0_1;X_1]\bigr) \\ & \quad = d[X^0_1; T_1 | T_3] - d[X^0_1;X_1] \leq d[X^0_1;T_1] - d[X^0_1;X_1] + \tfrac {1}{2} \mathbb {I}[T_1 : T_3] \end{align*}

by Lemma 3.24, and similarly

\begin{align*} & \sum _{t_3} \mathbb {P}[T_3 = t_3] (d[X^0_2;(T_2 | T_3=t_3)] - d[X^0_2; X_2]) \\ & \quad \quad \quad \quad \quad \quad \leq d[X^0_2;T_2] - d[X^0_2;X_2] + \tfrac {1}{2} \mathbb {I}[T_2 : T_3]. \end{align*}

Putting the above observations together, we have

\begin{align*} \sum _{t_3} \mathbb {P}[T_3=t_3] \psi [(T_1 | T_3=t_3); (T_2 | T_3=t_3)] \leq \delta + \eta (d[X^0_1;T_1]-d[X^0_1;X_1]) \\ + \eta (d[X^0_2;T_2]-d[X^0_2;X_2]) + \tfrac 12 \eta \mathbb {I}[T_1:T_3] + \tfrac 12 \eta \mathbb {I}[T_2:T_3] \end{align*}

where we introduce the notation

\[ \psi [Y_1; Y_2] := d[Y_1;Y_2] + \eta (d[X_1^0;Y_1] - d[X_1^0;X_1]) + \eta (d[X_2^0;Y_2] - d[X_2^0;X_2]). \]

On the other hand, from Lemma 6.6 we have \(k \leq \psi [Y_1;Y_2]\) for all \(G\)-valued random variables \(Y_1, Y_2\); applying this with \(Y_i = (T_i | T_3=t_3)\) and averaging over \(t_3\), the claim follows.

Lemma 6.22 Constructing good variables, II

One has

\begin{align*} k & \leq \delta + \frac{\eta }{3} \biggl( \delta + \sum _{i=1}^2 \sum _{j = 1}^3 (d[X^0_i;T_j] - d[X^0_i; X_i]) \biggr). \end{align*}
Proof

Average Lemma 6.21 over all six permutations of \(T_1,T_2,T_3\).
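In this average, \(\delta \) and the terms \(d[X^0_i;X_i]\) are unchanged, each term \(\eta \, d[X^0_i;T_{\sigma (i)}]\) averages to \(\tfrac{\eta }{3} \sum _{j=1}^3 d[X^0_i;T_j]\), and the mutual information terms contribute

\[ \frac{1}{6} \sum _{\sigma } \Bigl( \tfrac 12 \eta \mathbb {I}[T_{\sigma (1)}:T_{\sigma (3)}] + \tfrac 12 \eta \mathbb {I}[T_{\sigma (2)}:T_{\sigma (3)}] \Bigr) = \frac{\eta }{3} \delta , \]

since each unordered pair \(\{ T_i, T_j\} \) appears equally often among the six permutations \(\sigma \).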

Theorem 6.23 \(\tau \)-decrement

Let \(X_1, X_2\) be \(\tau \)-minimizers. Then \(d[X_1;X_2] = 0\).

Proof

Set \(k := d[X_1;X_2]\). Applying Lemma 6.22 with any random variables \((T_1,T_2,T_3)\) such that \(T_1+T_2+T_3=0\) holds identically, we deduce that

\[ k \leq \delta + \frac{\eta }{3} \biggl( \delta + \sum _{i=1}^2 \sum _{j = 1}^3 (d[X^0_i;T_j] - d[X^0_i;X_i]) \biggr). \]

Note that \(\delta \) is still defined by \eqref{delta-t1t2t3-def} and thus depends on \(T_1,T_2,T_3\). In particular we may apply this for

\[ T_1 = (U | S = s),\qquad T_2 = (V | S = s), \qquad T_3 = (W | S = s) \]

for \(s\) in the range of \(S\) (which is a valid choice by Lemma 6.20) and then average over \(s\) with weights \(\mathbb {P}[S=s]\), to obtain

\[ k \leq \tilde\delta + \frac{\eta }{3} \biggl( \tilde\delta + \sum _{i=1}^2 \sum _{A\in \{ U,V,W\} } \bigl( d[X^0_i;A|S] - d[X^0_i;X_i]\bigr) \biggr), \]

where

\[ \tilde\delta := \mathbb {I}[U : V | S] + \mathbb {I}[V : W | S] + \mathbb {I}[W : U | S]. \]

Putting this together with Lemma 6.18 and Lemma 6.19, we conclude that

\begin{align*} k & \leq \Bigl(1+\frac{\eta }{3}\Bigr)\Bigl(6\eta k-\frac{1-5\eta }{1-\eta }(2\eta k-I_1)\Bigr)+\frac{\eta }{3}\Bigl((6-3\eta )k+3(2\eta k-I_1)\Bigr)\\ & = (8\eta + \eta ^2) k - \biggl( \frac{1 - 5 \eta }{1-\eta }\Bigl(1 + \frac{\eta }{3}\Bigr) - \eta \biggr)(2 \eta k - I_1)\\ & \leq (8 \eta + \eta ^2) k \end{align*}

since the quantity \(2 \eta k - I_1\) is non-negative (by Lemma 6.12), and its coefficient in the above expression is non-positive provided that \(\eta (2\eta + 17) \le 3\), which is certainly the case with Definition 6.1. Moreover, from Definition 6.1 we have \(8 \eta + \eta ^2 {\lt} 1\). It follows that \(k=0\), as desired.
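For the record, with \(\eta = 1/9\) one has

\[ \eta (2\eta + 17) = \frac{1}{9} \Bigl( \frac{2}{9} + 17 \Bigr) = \frac{155}{81} \le 3 \qquad \text{and} \qquad 8 \eta + \eta ^2 = \frac{72}{81} + \frac{1}{81} = \frac{73}{81} {\lt} 1. \]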

6.5 Conclusion

Theorem 6.24 Entropy version of PFR

Let \(G = \mathbb {F}_2^n\), and suppose that \(X^0_1, X^0_2\) are \(G\)-valued random variables. Then there is some subgroup \(H \leq G\) such that

\[ d[X^0_1;U_H] + d[X^0_2;U_H] \le 11 d[X^0_1;X^0_2], \]

where \(U_H\) is uniformly distributed on \(H\). Furthermore, both \(d[X^0_1;U_H]\) and \(d[X^0_2;U_H]\) are at most \(6 d[X^0_1;X^0_2]\).

Proof

Let \(X_1, X_2\) be the \(\tau \)-minimizer from Proposition 6.5. From Theorem 6.23, \(d[X_1;X_2]=0\). From Corollary 4.6, there is a subgroup \(H \leq G\) with \(d[X_1;U_H] = d[X_2; U_H] = 0\). Also from \(\tau \)-minimization we have \(\tau [X_1;X_2] \leq \tau [X^0_2;X^0_1]\). Using this and the Ruzsa triangle inequality we can conclude.
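One way to carry out the final step: by the Ruzsa triangle inequality, \(d[X^0_i;U_H] \leq d[X^0_i;X_i] + d[X_i;U_H] = d[X^0_i;X_i]\) for \(i=1,2\), while the minimizing property (applied to the pair \((X^0_2, X^0_1)\)), together with \(d[X_1;X_2]=0\) and the symmetry of Ruzsa distance, gives

\[ \eta \, d[X^0_1;X_1] + \eta \, d[X^0_2;X_2] = \tau [X_1;X_2] \leq \tau [X^0_2;X^0_1] = (1+2\eta ) d[X^0_1;X^0_2], \]

so that \(d[X^0_1;U_H] + d[X^0_2;U_H] \leq (\eta ^{-1} + 2) d[X^0_1;X^0_2] = 11 d[X^0_1;X^0_2]\). For the final claim, one further triangle inequality \(d[X^0_1;U_H] \leq d[X^0_1;X^0_2] + d[X^0_2;U_H]\) gives \(2 d[X^0_1;U_H] \leq 12 d[X^0_1;X^0_2]\), and symmetrically for \(d[X^0_2;U_H]\).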

Note: a “stretch goal” for this project would be to obtain a ‘decidable’ analogue of this result (see the remark at the end of Section 2 for some related discussion).