6 Entropy version of PFR
Throughout this chapter, \(G = \mathbb {F}_2^n\), and \(X^0_1, X^0_2\) are \(G\)-valued random variables.
Definition
6.2
\(\tau \) functional
If \(X_1,X_2\) are two \(G\)-valued random variables, then
\[ \tau [X_1; X_2] := d[X_1; X_2] + \eta d[X^0_1; X_1] + \eta d[X^0_2; X_2]. \]
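For readers following the accompanying formalization, the functional could be transcribed into Lean roughly as follows. This is only an illustrative sketch: the identifiers `rdist`, `eta`, `X₀₁`, `X₀₂` are placeholders, not the project's actual declarations.

```lean
-- Illustrative sketch only; `rdist` stands for the Ruzsa distance
-- d[· ; ·] and `eta` for the constant fixed in Definition 6.1.
-- All names here are placeholders, not the project's actual API.
variable {Ω G : Type*} (rdist : (Ω → G) → (Ω → G) → ℝ)
variable (eta : ℝ) (X₀₁ X₀₂ : Ω → G)

noncomputable def tau (X₁ X₂ : Ω → G) : ℝ :=
  rdist X₁ X₂ + eta * rdist X₀₁ X₁ + eta * rdist X₀₂ X₂
```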
Lemma
6.3
\(\tau \) depends only on distribution
If \(X'_1, X'_2\) are copies of \(X_1,X_2\), then \(\tau [X'_1;X'_2] = \tau [X_1;X_2]\).
Definition
6.4
\(\tau \)-minimizer
A pair of \(G\)-valued random variables \(X_1, X_2\) are said to be a \(\tau \)-minimizer if one has
\[ \tau [X_1;X_2] \leq \tau [X'_1;X'_2] \]
for all \(G\)-valued random variables \(X'_1, X'_2\).
Proposition
6.5
\(\tau \) has minimum
A pair \(X_1, X_2\) of \(\tau \)-minimizers exists.
Proof
By Lemma 6.3, \(\tau \) depends only on the probability distributions of \(X_1, X_2\). These distributions range over a compact space (the simplex of probability measures on the finite group \(G\)), and \(\tau \) is a continuous function of them, so \(\tau \) attains a minimum.
6.1 Basic facts about minimizers
In this section we assume that \(X_1,X_2\) are \(\tau \)-minimizers. We also write \(k := d[X_1;X_2]\).
Lemma
6.6
Distance lower bound
For any \(G\)-valued random variables \(X'_1,X'_2\), one has
\[ d[X'_1;X'_2] \geq k - \eta (d[X^0_1;X'_1] - d[X^0_1;X_1] ) - \eta (d[X^0_2;X'_2] - d[X^0_2;X_2] ). \]
Lemma
6.7
Conditional distance lower bound
For any \(G\)-valued random variables \(X'_1,X'_2\) and random variables \(Z,W\), one has
\[ d[X'_1|Z;X'_2|W] \geq k - \eta (d[X^0_1;X'_1|Z] - d[X^0_1;X_1] ) - \eta (d[X^0_2;X'_2|W] - d[X^0_2;X_2] ). \]
Proof
Apply Lemma 6.6 to conditioned random variables and then average.
6.2 First estimate
We continue the assumptions from the preceding section.
Let \(X_1, X_2, \tilde X_1, \tilde X_2\) be independent random variables, with \(\tilde X_1\) a copy of \(X_1\) and \(\tilde X_2\) a copy of \(X_2\). (This is possible thanks to Lemma 3.7.)
We also define the quantity
\[ I_1 := \mathbb {I}[X_1+X_2 : \tilde X_1 + X_2 \, |\, X_1+X_2+\tilde X_1+\tilde X_2]. \]
Lemma
6.8
Fibring identity for first estimate
We have
\begin{align*} & d[X_1+\tilde X_2;X_2+\tilde X_1] + d[X_1|X_1+\tilde X_2; X_2|X_2+\tilde X_1] \\ & \quad + \mathbb {I}[X_1+ X_2 : \tilde X_1 + X_2 \, |\, X_1 + X_2 + \tilde X_1 + \tilde X_2] = 2k. \end{align*}
Lemma
6.9
Lower bound on distances
We have
\begin{align*} d[X_1+\tilde X_2; X_2+\tilde X_1] \geq k & - \eta (d[X^0_1; X_1+\tilde X_2] - d[X^0_1; X_1]) \\ & \qquad - \eta (d[X^0_2; X_2+\tilde X_1] - d[X^0_2; X_2]). \end{align*}
Lemma
6.10
Lower bound on conditional distances
We have
\begin{align*} & d[X_1|X_1+\tilde X_2; X_2|X_2+\tilde X_1] \\ & \qquad \quad \geq k - \eta (d[X^0_1; X_1 | X_1 + \tilde X_2] - d[X^0_1; X_1]) \\ & \qquad \qquad \qquad \qquad - \eta (d[X^0_2; X_2 | X_2 + \tilde X_1] - d[X^0_2; X_2]). \end{align*}
Lemma
6.11
Upper bound on distance differences
We have
\begin{align*} d[X^0_1; X_1+\tilde X_2] - d[X^0_1; X_1] & \leq \tfrac {1}{2} k + \tfrac {1}{4} \mathbb {H}[X_2] - \tfrac {1}{4} \mathbb {H}[X_1]\\ d[X^0_2;X_2+\tilde X_1] - d[X^0_2; X_2] & \leq \tfrac {1}{2} k + \tfrac {1}{4} \mathbb {H}[X_1] - \tfrac {1}{4} \mathbb {H}[X_2], \\ d[X_1^0;X_1|X_1+\tilde X_2] - d[X_1^0;X_1] & \leq \tfrac {1}{2} k + \tfrac {1}{4} \mathbb {H}[X_1] - \tfrac {1}{4} \mathbb {H}[X_2] \\ d[X_2^0; X_2|X_2+\tilde X_1] - d[X_2^0; X_2] & \leq \tfrac {1}{2}k + \tfrac {1}{4} \mathbb {H}[X_2] - \tfrac {1}{4} \mathbb {H}[X_1]. \end{align*}
Proof
Immediate from Lemma 3.25 (and recalling that \(k\) is defined to be \(d[X_1;X_2]\)).
Lemma
6.12
First estimate
We have \(I_1 \leq 2 \eta k\).
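The bound is a direct combination of the preceding three lemmas; a sketch: by Lemma 6.8,
\[ I_1 = 2k - d[X_1+\tilde X_2;X_2+\tilde X_1] - d[X_1|X_1+\tilde X_2; X_2|X_2+\tilde X_1], \]
so the lower bounds of Lemma 6.9 and Lemma 6.10 give
\begin{align*} I_1 \leq \; & \eta (d[X^0_1; X_1+\tilde X_2] - d[X^0_1; X_1]) + \eta (d[X^0_2; X_2+\tilde X_1] - d[X^0_2; X_2]) \\ & + \eta (d[X^0_1; X_1 | X_1 + \tilde X_2] - d[X^0_1; X_1]) + \eta (d[X^0_2; X_2 | X_2 + \tilde X_1] - d[X^0_2; X_2]), \end{align*}
and summing the four estimates of Lemma 6.11 (the entropy terms cancel in pairs) bounds the right-hand side by \(2 \eta k\).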
One can also extract the following useful inequality from the proof of the above lemma.
Lemma
6.13
Entropy bound on quadruple sum
With the same notation, we have
\begin{equation} \label{HS-bound} \mathbb {H}[X_1+X_2+\tilde X_1+\tilde X_2] \le \tfrac {1}{2} \mathbb {H}[X_1]+\tfrac {1}{2} \mathbb {H}[X_2] + (2 + \eta ) k - I_1. \end{equation}
Proof
Subtracting Lemma 6.10 from Lemma 6.8, and combining the resulting inequality with Lemma 6.11 gives the bound
\[ d[X_1+\tilde X_2;X_2+\tilde X_1] \le (1 + \eta ) k - I_1, \]
and the claim follows from Lemma 3.11 and the definition of \(k\).
6.3 Second estimate
We continue the assumptions from the preceding section. We introduce the quantity
\[ I_2 := \mathbb {I}[X_1+X_2 : X_1 + \tilde X_1 | X_1+X_2+\tilde X_1+\tilde X_2]. \]
Lemma
6.14
Distance between sums
We have
\[ d[X_1+\tilde X_1; X_2+\tilde X_2] \geq k - \frac{\eta }{2} ( d[X_1; X_1] + d[X_2;X_2] ). \]
Proof
From Lemma 6.6 one has
\begin{align*} d[X_1+\tilde X_1; X_2+\tilde X_2] \geq k & - \eta (d[X^0_1;X_1+\tilde X_1] - d[X^0_1;X_1]) \\ & - \eta (d[X^0_2;X_2+\tilde X_2] - d[X^0_2;X_2]). \end{align*}
Now Lemma 3.25 gives
\[ d[X^0_1;X_1+\tilde X_1] - d[X^0_1;X_1] \leq \tfrac {1}{2} d[X_1;X_1] \]
and
\[ d[X^0_2;X_2+\tilde X_2] - d[X^0_2;X_2] \leq \tfrac {1}{2} d[X_2;X_2], \]
and the claim follows.
Lemma
6.15
We have
\[ d[X_1;X_1] + d[X_2;X_2] \leq 2 k + \frac{2(2 \eta k - I_1)}{1-\eta }. \]
Proof
We may use Lemma 3.11 to expand
\begin{align*} & d[X_1+\tilde X_1;X_2+\tilde X_2] \\ & = \mathbb {H}[X_1+\tilde X_1 + X_2 + \tilde X_2] - \tfrac {1}{2} \mathbb {H}[X_1+\tilde X_1] - \tfrac {1}{2} \mathbb {H}[X_2+\tilde X_2] \\ & = \mathbb {H}[X_1+\tilde X_1 + X_2 + \tilde X_2] - \tfrac {1}{2} \mathbb {H}[X_1] - \tfrac {1}{2} \mathbb {H}[X_2] \\ & \qquad \qquad \qquad - \tfrac {1}{2} \left( d[X_1;X_1] + d[X_2; X_2] \right), \end{align*}
and hence by Lemma 6.13
\[ d[X_1+\tilde X_1; X_2+\tilde X_2] \leq (2+\eta ) k - \tfrac {1}{2} \left( d[X_1;X_1] + d[X_2;X_2] \right) - I_1. \]
Combining this bound with Lemma 6.14 we obtain the result.
Lemma
6.16
Second estimate
We have
\[ I_2 \leq 2 \eta k + \frac{2 \eta (2 \eta k - I_1)}{1 - \eta }. \]
Proof
We apply Corollary 5.3, but now with the choice
\[ (Y_1,Y_2,Y_3,Y_4) := (X_2, X_1, \tilde X_2, \tilde X_1). \]
Now Corollary 5.3 can be rewritten as
\begin{align*} & d[X_1+\tilde X_1;X_2+\tilde X_2] + d[X_1|X_1+\tilde X_1; X_2|X_2+\tilde X_2] \\ & \quad + \mathbb {I}[X_1+X_2 : X_1 + \tilde X_1 \, |\, X_1+X_2+\tilde X_1+\tilde X_2] = 2k, \end{align*}
recalling once again that \(k := d[X_1;X_2]\). From Lemma 6.7 one has
\begin{align*} d[X_1|X_1+\tilde X_1; X_2|X_2+\tilde X_2] \geq k & - \eta (d[X^0_1;X_1|X_1+\tilde X_1] - d[X^0_1;X_1]) \\ & - \eta (d[X^0_2;X_2|X_2+\tilde X_2] - d[X^0_2;X_2]), \end{align*}
while from Lemma 3.25 we have
\[ d[X^0_1;X_1|X_1+\tilde X_1] - d[X^0_1;X_1] \leq \tfrac {1}{2} d[X_1;X_1], \]
and
\[ d[X^0_2;X_2|X_2+\tilde X_2] - d[X^0_2;X_2] \leq \tfrac {1}{2} d[X_2;X_2]. \]
Combining all these inequalities with Lemma 6.14, we have
\begin{equation} \label{combined} \mathbb {I}[X_1+X_2 : X_1 + \tilde X_1 | X_1+X_2+\tilde X_1+\tilde X_2] \leq \eta ( d[X_1; X_1] + d[X_2; X_2] ). \end{equation}
Together with Lemma 6.15, this gives the conclusion.
6.4 Endgame
Let \(X_1,X_2,\tilde X_1,\tilde X_2\) be as before, and introduce the random variables
\[ U := X_1 + X_2, \qquad V := \tilde X_1 + X_2, \qquad W := X_1 + \tilde X_1 \]
and
\[ S := X_1 + X_2 + \tilde X_1 + \tilde X_2. \]
Lemma
6.17
Symmetry identity
We have
\[ \mathbb {I}[U : W \, | \, S] = \mathbb {I}[V : W \, | \, S]. \]
Lemma
6.18
Bound on conditional mutual informations
We have
\[ \mathbb {I}[U : V \, | \, S] + \mathbb {I}[V : W \, | \, S] + \mathbb {I}[W : U \, | \, S] \leq 6 \eta k - \frac{1 - 5 \eta }{1-\eta } (2 \eta k - I_1). \]
Proof
From the definitions of \(I_1,I_2\) and Lemma 6.17, we see that
\[ I_1 = \mathbb {I}[U : V \, | \, S], \qquad I_2 = \mathbb {I}[W : U \, | \, S], \qquad I_2 = \mathbb {I}[V : W \, | \, S]. \]
Applying Lemma 6.16 we have the inequality
\[ I_2 \leq 2 \eta k + \frac{2\eta (2 \eta k - I_1)}{1-\eta } . \]
We conclude that
\[ I_1 + I_2 + I_2 \leq I_1+4\eta k+ \frac{4\eta (2 \eta k - I_1)}{1-\eta }, \]
and since \(I_1 + 4 \eta k = 6 \eta k - (2 \eta k - I_1)\), the right-hand side equals
\[ 6 \eta k - \Bigl( 1 - \frac{4\eta }{1-\eta } \Bigr) (2 \eta k - I_1) = 6 \eta k - \frac{1 - 5 \eta }{1-\eta } (2 \eta k - I_1), \]
which is the claim.
Lemma
6.19
Bound on distance increments
We have
\begin{align*} \sum _{i=1}^2 \sum _{A\in \{ U,V,W\} } \big(d[X^0_i;A|S] & - d[X^0_i;X_i]\big) \\ & \leq (6 - 3\eta ) k + 3(2 \eta k - I_1). \end{align*}
Proof
By Lemma 3.26 (taking \(X = X_1^0\), \(Y = X_1\), \(Z = X_2\) and \(Z' = \tilde X_1 + \tilde X_2\), so that \(Y + Z = U\) and \(Y + Z + Z' = S\)) we have, noting that \(\mathbb {H}[Y+ Z] = \mathbb {H}[Z']\),
\[ d[X^0_1;U|S] - d[X^0_1;X_1] \leq \tfrac {1}{2} (\mathbb {H}[S] - \mathbb {H}[X_1]). \]
Further applications of Lemma 3.26 give
\begin{align*} d[X^0_2;U|S] - d[X^0_2; X_2] & \leq \tfrac {1}{2} (\mathbb {H}[S] - \mathbb {H}[X_2]) \\ d[X^0_1;V|S] - d[X^0_1;X_1] & \leq \tfrac {1}{2} (\mathbb {H}[S] - \mathbb {H}[X_1])\\ d[X^0_2;V|S] - d[X^0_2;X_2] & \leq \tfrac {1}{2} (\mathbb {H}[S] - \mathbb {H}[X_2]) \end{align*}
and
\[ d[X^0_1;W|S] - d[X^0_1;X_1] \leq \tfrac {1}{2} (\mathbb {H}[S] + \mathbb {H}[W] - \mathbb {H}[X_1] - \mathbb {H}[W']), \]
where \(W' := X_2 + \tilde X_2\). To treat \(d[X^0_2;W|S]\), first note that this equals \(d[X^0_2;W'|S]\), since for a fixed choice \(s\) of \(S\) we have \(W' = W + s\) (here we need some helper lemma about Ruzsa distance). Now we may apply Lemma 3.26 to obtain
\[ d[X^0_2;W'|S] - d[X^0_2;X_2] \leq \tfrac {1}{2} (\mathbb {H}[S] + \mathbb {H}[W'] - \mathbb {H}[X_2] - \mathbb {H}[W]). \]
Summing these six estimates and using Lemma 6.13, we conclude that
\begin{align*} \sum _{i=1}^2 \sum _{A\in \{ U,V,W\} } \big(d[X^0_i;A|S] & - d[X^0_i;X_i]\big) \\ & \leq 3 \mathbb {H}[S] - \tfrac {3}{2} \mathbb {H}[X_1] - \tfrac {3}{2} \mathbb {H}[X_2]\\ & \leq (6 - 3\eta ) k + 3(2 \eta k - I_1) \end{align*}
as required.
Lemma
6.20
Sum of \(U\), \(V\), \(W\)
We have \(U + V + W = 0\).
Proof
Obvious because we are in characteristic two.
For the next two lemmas, let \((T_1,T_2,T_3)\) be a \(G^3\)-valued random variable such that \(T_1+T_2+T_3=0\) holds identically. Set
\begin{equation} \label{delta-t1t2t3-def} \delta := \sum _{1 \leq i < j \leq 3} \mathbb {I}[T_i : T_j]. \end{equation}
Lemma
6.21
Constructing good variables, I
One has
\begin{align*} k \leq \delta + \eta (& d[X^0_1;T_1]-d[X^0_1;X_1]) + \eta (d[X^0_2;T_2]-d[X^0_2;X_2]) \\ & + \tfrac 12 \eta \mathbb {I}[T_1:T_3] + \tfrac 12 \eta \mathbb {I}[T_2:T_3]. \end{align*}
(Note: in the paper, this lemma was phrased in a more intuitive formulation that is basically the contrapositive of the one here. Similarly for the next two lemmas.)
Proof
We apply Lemma 3.23 with \((A,B) = (T_1, T_2)\) there. Since \(T_1 + T_2 = T_3\), the conclusion is that
\begin{align} \nonumber \sum _{t_3} \mathbb {P}[T_3 = t_3] & d[(T_1 | T_3 = t_3); (T_2 | T_3 = t_3)] \\ & \leq 3 \mathbb {I}[T_1 : T_2] + 2 \mathbb {H}[T_3] - \mathbb {H}[T_1] - \mathbb {H}[T_2].\label{bsg-t1t2} \end{align}
The right-hand side of \eqref{bsg-t1t2} can be rearranged as
\begin{align*} & 2( \mathbb {H}[T_1] + \mathbb {H}[T_2] + \mathbb {H}[T_3]) - 3 \mathbb {H}[T_1,T_2] \\ & = 2(\mathbb {H}[T_1] + \mathbb {H}[T_2] + \mathbb {H}[T_3]) - \mathbb {H}[T_1,T_2] - \mathbb {H}[T_2,T_3] - \mathbb {H}[T_1, T_3] = \delta ,\end{align*}
using the fact (from Lemma 2.2) that all three terms \(\mathbb {H}[T_i,T_j]\) are equal to \(\mathbb {H}[T_1,T_2,T_3]\) and hence to each other. We also have
\begin{align*} & \sum _{t_3} \mathbb {P}[T_3 = t_3] \bigl(d[X^0_1; (T_1 | T_3=t_3)] - d[X^0_1;X_1]\bigr) \\ & \quad = d[X^0_1; T_1 | T_3] - d[X^0_1;X_1] \leq d[X^0_1;T_1] - d[X^0_1;X_1] + \tfrac {1}{2} \mathbb {I}[T_1 : T_3] \end{align*}
by Lemma 3.24, and similarly
\begin{align*} & \sum _{t_3} \mathbb {P}[T_3 = t_3] (d[X^0_2;(T_2 | T_3=t_3)] - d[X^0_2; X_2]) \\ & \quad \quad \quad \quad \quad \quad \leq d[X^0_2;T_2] - d[X^0_2;X_2] + \tfrac {1}{2} \mathbb {I}[T_2 : T_3]. \end{align*}
Putting the above observations together, we have
\begin{align*} \sum _{t_3} \mathbb {P}[T_3=t_3] \psi [(T_1 | T_3=t_3); (T_2 | T_3=t_3)] \leq \delta + \eta (d[X^0_1;T_1]-d[X^0_1;X_1]) \\ + \eta (d[X^0_2;T_2]-d[X^0_2;X_2]) + \tfrac 12 \eta \mathbb {I}[T_1:T_3] + \tfrac 12 \eta \mathbb {I}[T_2:T_3] \end{align*}
where we introduce the notation
\[ \psi [Y_1; Y_2] := d[Y_1;Y_2] + \eta (d[X_1^0;Y_1] - d[X_1^0;X_1]) + \eta (d[X_2^0;Y_2] - d[X_2^0;X_2]). \]
On the other hand, from Lemma 6.6 we have \(k \leq \psi [Y_1;Y_2]\), and the claim follows.
Lemma
6.22
Constructing good variables, II
One has
\begin{align*} k & \leq \delta + \frac{\eta }{3} \biggl( \delta + \sum _{i=1}^2 \sum _{j = 1}^3 (d[X^0_i;T_j] - d[X^0_i; X_i]) \biggr). \end{align*}
Proof
Average Lemma 6.21 over all six permutations of \(T_1,T_2,T_3\).
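To see that the averaging produces the stated coefficients: \(\delta \) and the constraint \(T_1+T_2+T_3=0\) are symmetric in \(T_1,T_2,T_3\); each \(T_j\) occupies the first (resp. second) slot in two of the six permutations, which yields the factor \(\frac{\eta }{3}\) in front of each distance sum; and for the mutual-information terms one computes
\[ \frac{1}{6} \sum _{\sigma \in S_3} \tfrac {1}{2} \eta \bigl( \mathbb {I}[T_{\sigma (1)}:T_{\sigma (3)}] + \mathbb {I}[T_{\sigma (2)}:T_{\sigma (3)}] \bigr) = \frac{\eta }{12} \cdot 4 \sum _{1 \leq i < j \leq 3} \mathbb {I}[T_i : T_j] = \frac{\eta }{3} \delta , \]
since each unordered pair \(\{ i,j\} \) occurs four times among the twelve summands.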
Theorem
6.23
\(\tau \)-decrement
Let \(X_1, X_2\) be \(\tau \)-minimizers. Then \(d[X_1;X_2] = 0\).
Proof
Set \(k := d[X_1;X_2]\). Applying Lemma 6.22 with any random variables \((T_1,T_2,T_3)\) such that \(T_1+T_2+T_3=0\) holds identically, we deduce that
\[ k \leq \delta + \frac{\eta }{3} \biggl( \delta + \sum _{i=1}^2 \sum _{j = 1}^3 (d[X^0_i;T_j] - d[X^0_i;X_i]) \biggr). \]
Note that \(\delta \) is still defined by \eqref{delta-t1t2t3-def} and thus depends on \(T_1,T_2,T_3\). In particular we may apply this for
\[ T_1 = (U | S = s),\qquad T_2 = (V | S = s), \qquad T_3 = (W | S = s) \]
for \(s\) in the range of \(S\) (which is a valid choice by Lemma 6.20) and then average over \(s\) with weights \(\mathbb {P}[S=s]\), to obtain
\[ k \leq \tilde\delta + \frac{\eta }{3} \biggl( \tilde\delta + \sum _{i=1}^2 \sum _{A\in \{ U,V,W\} } \bigl( d[X^0_i;A|S] - d[X^0_i;X_i]\bigr) \biggr), \]
where
\[ \tilde\delta := \mathbb {I}[U : V | S] + \mathbb {I}[V : W | S] + \mathbb {I}[W : U | S]. \]
Putting this together with Lemma 6.18 and Lemma 6.19, we conclude that
\begin{align*} k & \leq \Bigl(1+\frac{\eta }{3}\Bigr)\Bigl(6\eta k-\frac{1-5\eta }{1-\eta }(2\eta k-I_1)\Bigr)+\frac{\eta }{3}\Bigl((6-3\eta )k+3(2\eta k-I_1)\Bigr)\\ & = (8\eta + \eta ^2) k - \biggl( \frac{1 - 5 \eta }{1-\eta }\Bigl(1 + \frac{\eta }{3}\Bigr) - \eta \biggr)(2 \eta k - I_1)\\ & \leq (8 \eta + \eta ^2) k \end{align*}
since the quantity \(2 \eta k - I_1\) is non-negative (by Lemma 6.12), and its coefficient in the above expression is non-positive provided that \(\eta (2\eta + 17) \le 3\), which is certainly the case with Definition 6.1. Moreover, from Definition 6.1 we have \(8 \eta + \eta ^2 < 1\). It follows that \(k=0\), as desired.
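As a quick numerical sanity check of these conditions (assuming, as the constant \(11\) in Theorem 6.24 below suggests, that Definition 6.1 sets \(\eta = 1/9\)):

```python
from fractions import Fraction

# eta = 1/9 is assumed here to be the constant of Definition 6.1
# (not reproduced in this chapter excerpt).
eta = Fraction(1, 9)

# The coefficient of (2*eta*k - I_1) is non-positive iff eta*(2*eta + 17) <= 3.
assert eta * (2 * eta + 17) <= 3          # 155/81 <= 3

# k <= (8*eta + eta^2) * k forces k = 0, since 8*eta + eta^2 < 1.
assert 8 * eta + eta ** 2 < 1             # 73/81 < 1
```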
6.5 Conclusion
Theorem
6.24
Entropy version of PFR
Let \(G = \mathbb {F}_2^n\), and suppose that \(X^0_1, X^0_2\) are \(G\)-valued random variables. Then there is some subgroup \(H \leq G\) such that
\[ d[X^0_1;U_H] + d[X^0_2;U_H] \le 11 d[X^0_1;X^0_2], \]
where \(U_H\) is uniformly distributed on \(H\). Furthermore, both \(d[X^0_1;U_H]\) and \(d[X^0_2;U_H]\) are at most \(6 d[X^0_1;X^0_2]\).
Proof
Let \(X_1, X_2\) be a \(\tau \)-minimizer, as provided by Proposition 6.5. From Theorem 6.23, \(d[X_1;X_2]=0\), and then from Corollary 4.6 there is a subgroup \(H \leq G\) with \(d[X_1;U_H] = d[X_2; U_H] = 0\). Also, from \(\tau \)-minimization we have \(\tau [X_1;X_2] \leq \tau [X^0_2;X^0_1]\). Using this and the Ruzsa triangle inequality we can conclude.
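To spell out the last step (a sketch, assuming the value \(\eta = 1/9\) fixed in Definition 6.1): write \(d := d[X^0_1;X^0_2]\). Since \(d[X_1;U_H] = d[X_2;U_H] = 0\), the Ruzsa triangle inequality gives \(d[X^0_i;U_H] \leq d[X^0_i;X_i]\) for \(i=1,2\), while \(\tau \)-minimization and \(d[X_1;X_2]=0\) give
\[ \eta \bigl( d[X^0_1;X_1] + d[X^0_2;X_2] \bigr) = \tau [X_1;X_2] \leq \tau [X^0_2;X^0_1] = (1+2\eta ) d, \]
using the symmetry of Ruzsa distance to evaluate \(\tau [X^0_2;X^0_1]\). Hence \(d[X^0_1;U_H] + d[X^0_2;U_H] \leq (\eta ^{-1}+2) d = 11 d\). For the individual bounds, the triangle inequality also gives \(d[X^0_1;U_H] - d[X^0_2;U_H] \leq d\); adding this to the sum bound yields \(2\, d[X^0_1;U_H] \leq 12 d\), i.e. \(d[X^0_1;U_H] \leq 6 d\), and symmetrically for \(X^0_2\).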
Note: a “stretch goal” for this project would be to obtain a ‘decidable’ analogue of this result (see the remark at the end of Section 2 for some related discussion).