8 Improving the exponents

The arguments here are due to Jyun-Jie Liao.

Definition 8.1 New definition of $η$

✓

$η$ is a real parameter with $η > 0$ .

Previously in Definition 6.1 we had set $η = 1 / 9$ . To implement this chapter, one should refactor the previous arguments so that $η$ is now free to be a positive number, though the specific hypothesis $η = 1 / 9$ would now need to be added to Theorem 6.23.

Let $X_{1}^{0}, X_{2}^{0}$ be $G$ -valued random variables, and let $X_{1}, X_{2}$ be $τ$ -minimizers as defined in Definition 6.4.

For the next two lemmas, let $(T_{1}, T_{2}, T_{3})$ be a $G^{3}$ -valued random variable such that $T_{1} + T_{2} + T_{3} = 0$ holds identically. Let $δ$ be the quantity in (3).

We have the following variant of Lemma 6.21:

Lemma 8.2 Constructing good variables, I’

✓

One has

\begin{aligned} k \leq δ + η ( & d [X_{1}^{0}; T_{1} | T_{3}] - d [X_{1}^{0}; X_{1}]) + η (d [X_{2}^{0}; T_{2} | T_{3}] - d [X_{2}^{0}; X_{2}]) . \end{aligned}

Proof ▶

We apply Lemma 3.23 with $(A, B) = (T_{1}, T_{2})$ there. Since $T_{1} + T_{2} = T_{3}$ , the conclusion is that

\begin{aligned} \sum_{t_{3}} P [T_{3} = t_{3}] & d [(T_{1} | T_{3} = t_{3}); (T_{2} | T_{3} = t_{3})] \\ \leq 3 I [T_{1} : T_{2}] + 2 H [T_{3}] - H [T_{1}] - H [T_{2}] . \end{aligned}

The right-hand side of this inequality can be rearranged as

\begin{aligned} 2 (H [T_{1}] + H [T_{2}] + H [T_{3}]) - 3 H [T_{1}, T_{2}] \\ = 2 (H [T_{1}] + H [T_{2}] + H [T_{3}]) - H [T_{1}, T_{2}] - H [T_{2}, T_{3}] - H [T_{1}, T_{3}] = δ, \end{aligned}

using the fact (from Lemma 2.2) that all three terms $H [T_{i}, T_{j}]$ are equal to $H [T_{1}, T_{2}, T_{3}]$ and hence to each other. We also have

\begin{aligned} \sum_{t_{3}} P [T_{3} = t_{3}] (d [X_{1}^{0}; (T_{1} | T_{3} = t_{3})] - d [X_{1}^{0}; X_{1}]) \\ = d [X_{1}^{0}; T_{1} | T_{3}] - d [X_{1}^{0}; X_{1}] \end{aligned}

and similarly

\begin{aligned} \sum_{t_{3}} P [T_{3} = t_{3}] (d [X_{2}^{0}; (T_{2} | T_{3} = t_{3})] - d [X_{2}^{0}; X_{2}]) \\ = d [X_{2}^{0}; T_{2} | T_{3}] - d [X_{2}^{0}; X_{2}] . \end{aligned}

Putting the above observations together, we have

\begin{array}{r} \sum_{t_{3}} P [T_{3} = t_{3}] ψ [(T_{1} | T_{3} = t_{3}); (T_{2} | T_{3} = t_{3})] \leq δ + η (d [X_{1}^{0}; T_{1} | T_{3}] - d [X_{1}^{0}; X_{1}]) \\ + η (d [X_{2}^{0}; T_{2} | T_{3}] - d [X_{2}^{0}; X_{2}]) \end{array}

where we introduce the notation

ψ [Y_{1}; Y_{2}] := d [Y_{1}; Y_{2}] + η (d [X_{1}^{0}; Y_{1}] - d [X_{1}^{0}; X_{1}]) + η (d [X_{2}^{0}; Y_{2}] - d [X_{2}^{0}; X_{2}]) .

On the other hand, from Lemma 6.6 we have $k \leq ψ [Y_{1}; Y_{2}]$ , and the claim follows.

(One could in fact refactor Lemma 6.21 to follow from Lemma 8.2 and Lemma 3.24.)

Lemma 8.3 Constructing good variables, II’

✓

One has

\begin{aligned} k & \leq δ + \frac{η}{6} \sum_{i = 1}^{2} \sum_{1 \leq j, l \leq 3; j \neq l} (d [X_{i}^{0}; T_{j} | T_{l}] - d [X_{i}^{0}; X_{i}]) \end{aligned}

Proof ▶

Average Lemma 8.2 over all six permutations of $T_{1}, T_{2}, T_{3}$ .
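The averaging can be spelled out as follows. This is a bookkeeping sketch only; the symbol $σ$ for a permutation of $\{1, 2, 3\}$ and $S_{3}$ for the set of all six such permutations are notation introduced here, not taken from the text above. Since $δ$ is symmetric in $T_{1}, T_{2}, T_{3}$, and since each of the ordered pairs $(σ (1), σ (3))$ and $(σ (2), σ (3))$ takes every value $(j, l)$ with $j \neq l$ exactly once as $σ$ varies, applying Lemma 8.2 to $(T_{σ (1)}, T_{σ (2)}, T_{σ (3)})$ and averaging should give

```latex
\begin{aligned}
k & \leq \frac{1}{6} \sum_{σ \in S_{3}} \Bigl( δ + η (d [X_{1}^{0}; T_{σ (1)} | T_{σ (3)}] - d [X_{1}^{0}; X_{1}]) + η (d [X_{2}^{0}; T_{σ (2)} | T_{σ (3)}] - d [X_{2}^{0}; X_{2}]) \Bigr) \\
& = δ + \frac{η}{6} \sum_{i = 1}^{2} \sum_{1 \leq j, l \leq 3; j \neq l} (d [X_{i}^{0}; T_{j} | T_{l}] - d [X_{i}^{0}; X_{i}]) .
\end{aligned}
```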

Now let $X_{1}, X_{2}, {\tilde{X}}_{1}, {\tilde{X}}_{2}$ be independent copies of $X_{1}, X_{2}, X_{1}, X_{2}$ , and set

U := X_{1} + X_{2}, V := {\tilde{X}}_{1} + X_{2}, W := X_{1} + {\tilde{X}}_{1}

and

S := X_{1} + X_{2} + {\tilde{X}}_{1} + {\tilde{X}}_{2}

and introduce the quantities

k = d [X_{1}; X_{2}]

and

I_{1} = I (U : V | S) .

Lemma 8.4 Constructing good variables, III’

✓

One has

\begin{aligned} k & \leq I (U : V | S) + I (V : W | S) + I (W : U | S) + \frac{η}{6} \sum_{i = 1}^{2} \sum_{A, B \in \{U, V, W\} : A \neq B} (d [X_{i}^{0}; A | B, S] - d [X_{i}^{0}; X_{i}]) . \end{aligned}

Proof ▶

For each $s$ in the range of $S$ , apply Lemma 8.3 with $T_{1}, T_{2}, T_{3}$ equal to $(U | S = s)$ , $(V | S = s)$ , $(W | S = s)$ respectively (which works thanks to Lemma 6.20), multiply by $P [S = s]$ , and sum in $s$ to conclude.
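To make the final summation explicit, write $δ_{s}$ for the quantity $δ$ attached to the conditioned triple $((U | S = s), (V | S = s), (W | S = s))$; this notation is introduced here for illustration only. Using that $δ$ is the sum of the three pairwise mutual informations (which is the form it takes in the proof of Lemma 8.2), the two averages that appear should be

```latex
\begin{aligned}
\sum_{s} P [S = s] \, δ_{s} & = I (U : V | S) + I (V : W | S) + I (W : U | S), \\
\sum_{s} P [S = s] \, d [X_{i}^{0}; (A | S = s) | (B | S = s)] & = d [X_{i}^{0}; A | B, S] .
\end{aligned}
```

The terms $k$ and $d [X_{i}^{0}; X_{i}]$ do not depend on $s$, so they are unchanged by the averaging because $\sum_{s} P [S = s] = 1$.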

To control the expressions in the right-hand side of this lemma we need a general entropy inequality.

Lemma 8.5 General inequality

✓

Let $X_{1}, X_{2}, X_{3}, X_{4}$ be independent $G$ -valued random variables, and let $Y$ be another $G$ -valued random variable. Set $S := X_{1} + X_{2} + X_{3} + X_{4}$ . Then

\begin{aligned} d [Y; X_{1} + X_{2} | X_{1} + X_{3}, S] - d [Y; X_{1}] \\ \leq \frac{1}{4} (d [X_{1}; X_{2}] + 2 d [X_{1}; X_{3}] + d [X_{2}; X_{4}]) \\ + \frac{1}{4} (d [X_{1} | X_{1} + X_{3}; X_{2} | X_{2} + X_{4}] - d [X_{3} | X_{3} + X_{4}; X_{1} | X_{1} + X_{2}]) \\ + \frac{1}{8} (H [X_{1} + X_{2}] - H [X_{3} + X_{4}] + H [X_{2}] - H [X_{3}] \\ + H [X_{2} | X_{2} + X_{4}] - H [X_{1} | X_{1} + X_{3}]) . \end{aligned}

Proof ▶

On the one hand, by Lemma 3.24 and two applications of Lemma 3.25 we have

\begin{aligned} d [Y; X_{1} + X_{2} | X_{1} + X_{3}, S] \\ \leq d [Y; X_{1} + X_{2} | S] + \frac{1}{2} I [X_{1} + X_{2} : X_{1} + X_{3} | S] \\ \leq d [Y; X_{1} + X_{2}] \\ + \frac{1}{2} (d [X_{1} + X_{2}; X_{3} + X_{4}] + I [X_{1} + X_{2} : X_{1} + X_{3} | S]) \\ + \frac{1}{4} (H [X_{1} + X_{2}] - H [X_{3} + X_{4}]) \\ \leq d [Y; X_{1}] \\ + \frac{1}{2} (d [X_{1}; X_{2}] + d [X_{1} + X_{2}; X_{3} + X_{4}] + I [X_{1} + X_{2} : X_{1} + X_{3} | S]) \\ + \frac{1}{4} (H [X_{1} + X_{2}] - H [X_{3} + X_{4}] + H [X_{2}] - H [X_{1}]) . \end{aligned}

From Corollary 5.3 (with $Y_{1}, Y_{2}, Y_{3}, Y_{4}$ set equal to $X_{3}, X_{1}, X_{4}, X_{2}$ respectively) one has

\begin{aligned} d [X_{3} + X_{4}; X_{1} + X_{2}] + d [X_{3} | X_{3} + X_{4}; X_{1} | X_{1} + X_{2}] \\ + I [X_{3} + X_{1} : X_{1} + X_{2} | S] = d [X_{3}; X_{1}] + d [X_{4}; X_{2}] . \end{aligned}

Rearranging the mutual information and Ruzsa distances slightly, we conclude that

\begin{aligned} d [Y; X_{1} + X_{2} | X_{1} + X_{3}, S] \\ \leq d [Y; X_{1}] \\ + \frac{1}{2} (d [X_{1}; X_{2}] + d [X_{1}; X_{3}] + d [X_{2}; X_{4}] - d [X_{3} | X_{3} + X_{4}; X_{1} | X_{1} + X_{2}]) \\ + \frac{1}{4} (H [X_{1} + X_{2}] - H [X_{3} + X_{4}] + H [X_{2}] - H [X_{1}]) . \end{aligned}

On the other hand, $(X_{1} + X_{2} | X_{1} + X_{3}, S)$ has an identical distribution to the independent sum of $(X_{1} | X_{1} + X_{3})$ and $(X_{2} | X_{2} + X_{4})$ . We may therefore apply Lemma 3.25 to the conditioned variables $(X_{1} | X_{1} + X_{3} = s)$ and $(X_{2} | X_{2} + X_{4} = t)$ and average in $s, t$ to obtain the alternative bound

\begin{aligned} d [Y; X_{1} + X_{2} | X_{1} + X_{3}, S] \\ \leq d [Y; X_{1} | X_{1} + X_{3}] + \frac{1}{2} d [X_{1} | X_{1} + X_{3}; X_{2} | X_{2} + X_{4}] \\ + \frac{1}{4} (H [X_{2} | X_{2} + X_{4}] - H [X_{1} | X_{1} + X_{3}]) \\ \leq d [Y; X_{1}] \\ + \frac{1}{2} (d [X_{1}; X_{3}] + d [X_{1} | X_{1} + X_{3}; X_{2} | X_{2} + X_{4}]) \\ + \frac{1}{4} (H [X_{2} | X_{2} + X_{4}] - H [X_{1} | X_{1} + X_{3}] + H [X_{1}] - H [X_{3}]) . \end{aligned}

If one takes the arithmetic mean of these two bounds and simplifies using Corollary 5.3, one obtains the claim.
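As a check on the coefficients, here is how the unconditioned terms of the two bounds combine under the arithmetic mean (a worked sketch; the conditional distance terms and the remaining entropy terms simply carry over with their coefficients halved):

```latex
\begin{aligned}
& \frac{1}{2} \cdot \frac{1}{2} (d [X_{1}; X_{2}] + d [X_{1}; X_{3}] + d [X_{2}; X_{4}]) + \frac{1}{2} \cdot \frac{1}{2} d [X_{1}; X_{3}] \\
& \qquad = \frac{1}{4} (d [X_{1}; X_{2}] + 2 d [X_{1}; X_{3}] + d [X_{2}; X_{4}]), \\
& \frac{1}{2} \cdot \frac{1}{4} \bigl( (H [X_{2}] - H [X_{1}]) + (H [X_{1}] - H [X_{3}]) \bigr) = \frac{1}{8} (H [X_{2}] - H [X_{3}]) .
\end{aligned}
```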

Returning to our specific situation, we now have

Lemma 8.6 Bound on distance differences

✓

We have

\begin{aligned} \sum_{i = 1}^{2} \sum_{A, B \in \{U, V, W\} : A \neq B} (d [X_{i}^{0}; A | B, S] - d [X_{i}^{0}; X_{i}]) \\ \leq 12 k + \frac{4 (2 η k - I_{1})}{1 - η} . \end{aligned}

Proof ▶

If we apply Lemma 8.5 with $X_{1} := X_{1}$ , $Y := X_{1}^{0}$ and $(X_{2}, X_{3}, X_{4})$ equal to each of the $3!$ permutations of $(X_{2}, {\tilde{X}}_{1}, {\tilde{X}}_{2})$ , and sum (using the symmetry $H [X | X + Y] = H [Y | X + Y]$ , which follows from Lemma 2.12), we can bound

\sum_{A, B \in \{U, V, W\} : A \neq B} (d [X_{1}^{0}; A | B, S] - d [X_{1}^{0}; X_{1}])

by

\begin{aligned} \frac{1}{4} (6 d [X_{1}; X_{2}] + 6 d [X_{1}; {\tilde{X}}_{2}] \\ + 6 d [X_{1}; {\tilde{X}}_{1}] + 2 d [{\tilde{X}}_{1}; {\tilde{X}}_{2}] + 2 d [{\tilde{X}}_{1}; X_{2}] + 2 d [X_{2}; {\tilde{X}}_{2}]) \\ + \frac{1}{8} (2 H [X_{1} + X_{2}] + 2 H [X_{1} + {\tilde{X}}_{1}] + 2 H [X_{1} + {\tilde{X}}_{2}] \\ - 2 H [{\tilde{X}}_{1} + X_{2}] - 2 H [X_{2} + {\tilde{X}}_{2}] - 2 H [{\tilde{X}}_{1} + {\tilde{X}}_{2}]) \\ + \frac{1}{4} (H [X_{2} | X_{2} + {\tilde{X}}_{2}] + H [{\tilde{X}}_{1} | {\tilde{X}}_{1} + {\tilde{X}}_{2}] + H [{\tilde{X}}_{1} | {\tilde{X}}_{1} + X_{2}] \\ - H [X_{1} | X_{1} + {\tilde{X}}_{1}] - H [X_{1} | X_{1} + X_{2}] - H [X_{1} | X_{1} + {\tilde{X}}_{2}]), \end{aligned}

which simplifies to

\begin{aligned} \frac{1}{4} (16 k + 6 d [X_{1}; X_{1}] + 2 d [X_{2}; X_{2}]) \\ + \frac{1}{4} (H [X_{1} + {\tilde{X}}_{1}] - H [X_{2} + {\tilde{X}}_{2}] + H [X_{2} | X_{2} + {\tilde{X}}_{2}] - H [X_{1} | X_{1} + {\tilde{X}}_{1}]) . \end{aligned}

A symmetric argument also bounds

\sum_{A, B \in \{U, V, W\} : A \neq B} (d [X_{2}^{0}; A | B, S] - d [X_{2}^{0}; X_{2}])

by

\begin{aligned} \frac{1}{4} (16 k + 6 d [X_{2}; X_{2}] + 2 d [X_{1}; X_{1}]) \\ + \frac{1}{4} (H [X_{2} + {\tilde{X}}_{2}] - H [X_{1} + {\tilde{X}}_{1}] + H [X_{1} | X_{1} + {\tilde{X}}_{1}] - H [X_{2} | X_{2} + {\tilde{X}}_{2}]) . \end{aligned}

On the other hand, from Lemma 6.15 one has

d [X_{1}; X_{1}] + d [X_{2}; X_{2}] \leq 2 k + \frac{2 (2 η k - I_{1})}{1 - η} .

Summing the previous three estimates, we obtain the claim.
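For concreteness, the summation can be carried out as follows (a worked sketch). The second bracketed groups in the two estimates for $i = 1$ and $i = 2$ are negatives of each other and cancel, which leaves

```latex
\begin{aligned}
& \frac{1}{4} (16 k + 6 d [X_{1}; X_{1}] + 2 d [X_{2}; X_{2}]) + \frac{1}{4} (16 k + 6 d [X_{2}; X_{2}] + 2 d [X_{1}; X_{1}]) \\
& \qquad = 8 k + 2 (d [X_{1}; X_{1}] + d [X_{2}; X_{2}]) \\
& \qquad \leq 8 k + 2 \Bigl( 2 k + \frac{2 (2 η k - I_{1})}{1 - η} \Bigr) \\
& \qquad = 12 k + \frac{4 (2 η k - I_{1})}{1 - η} .
\end{aligned}
```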

Theorem 8.7 Improved $τ$-decrement

✓

Suppose $0 < η < 1 / 8$ . Let $X_{1}, X_{2}$ be $τ$ -minimizers. Then $d [X_{1}; X_{2}] = 0$ .

Proof ▶

From Lemma 8.4, Lemma 8.6, and Lemma 6.18 one has

k \leq 8 η k - \frac{(1 - 5 η - \frac{4}{6} η) (2 η k - I_{1})}{(1 - η)} .

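For any $0 < η < 1 / 8$ , we see from Lemma 6.12 that the expression $\frac{(1 - 5 η - \frac{4}{6} η) (2 η k - I_{1})}{(1 - η)}$ is nonnegative. Hence $k \leq 8 η k$ , that is, $(1 - 8 η) k \leq 0$ ; since $1 - 8 η > 0$ and $k \geq 0$ , we conclude that $k = 0$ as required.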
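The displayed bound on $k$ in this proof can be assembled as follows. This is a sketch under an assumption: Lemma 6.18 is not restated in this chapter, and we take it to provide the estimate $I (U : V | S) + I (V : W | S) + I (W : U | S) \leq 6 η k - \frac{(1 - 5 η) (2 η k - I_{1})}{1 - η}$ , which is the form needed to match the display. Inserting this and Lemma 8.6 into Lemma 8.4 would then give

```latex
\begin{aligned}
k & \leq \Bigl( 6 η k - \frac{(1 - 5 η) (2 η k - I_{1})}{1 - η} \Bigr) + \frac{η}{6} \Bigl( 12 k + \frac{4 (2 η k - I_{1})}{1 - η} \Bigr) \\
& = 8 η k - \frac{(1 - 5 η - \frac{4}{6} η) (2 η k - I_{1})}{1 - η} .
\end{aligned}
```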

Theorem 8.8 Limiting improved $τ$-decrement

✓

For $η = 1 / 8$ , there exist $τ$ -minimizers $X_{1}, X_{2}$ satisfying $d [X_{1}; X_{2}] = 0$ .

Proof ▶

For each $η < 1 / 8$ , consider minimizers $X_{1}^{η}$ and $X_{2}^{η}$ from Proposition 6.5. By Theorem 8.7, they satisfy $d [X_{1}^{η}; X_{2}^{η}] = 0$ . By compactness of the space of probability measures on $G$ , one may extract a convergent subsequence of the distributions of $X_{1}^{η}$ and $X_{2}^{η}$ as $η \to 1 / 8$ . By continuity of all the quantities involved, the limit is a pair of $τ$ -minimizers for $η = 1 / 8$ that additionally satisfies $d [X_{1}; X_{2}] = 0$ .

Theorem 8.9 Improved entropy version of PFR

✓

Let $G = F_{2}^{n}$ , and suppose that $X_{1}^{0}, X_{2}^{0}$ are $G$ -valued random variables. Then there is some subgroup $H \leq G$ such that

d [X_{1}^{0}; U_{H}] + d [X_{2}^{0}; U_{H}] \leq 10 d [X_{1}^{0}; X_{2}^{0}],

where $U_{H}$ is uniformly distributed on $H$ . Furthermore, both $d [X_{1}^{0}; U_{H}]$ and $d [X_{2}^{0}; U_{H}]$ are at most $6 d [X_{1}^{0}; X_{2}^{0}]$ .

Proof ▶

Let $X_{1}, X_{2}$ be the $τ$ -minimizers provided by Theorem 8.8, with $η = 1 / 8$ . By construction, $d [X_{1}; X_{2}] = 0$ . From Corollary 4.6, there is a subgroup $H \leq G$ such that $d [X_{1}; U_{H}] = d [X_{2}; U_{H}] = 0$ . Also, from $τ$ -minimization we have $τ [X_{1}; X_{2}] \leq τ [X_{2}^{0}; X_{1}^{0}]$ . Combining this with the Ruzsa triangle inequality yields both claims.
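The constants can be traced as follows. This is a sketch that assumes the functional $τ$ of Chapter 6 has the form $τ [Y_{1}; Y_{2}] = d [Y_{1}; Y_{2}] + η d [X_{1}^{0}; Y_{1}] + η d [X_{2}^{0}; Y_{2}]$ , consistent with the definition of $ψ$ in the proof of Lemma 8.2, and that uses $d [X_{i}^{0}; U_{H}] \leq d [X_{i}^{0}; X_{i}] + d [X_{i}; U_{H}] = d [X_{i}^{0}; X_{i}]$ from the Ruzsa triangle inequality. With $η = 1 / 8$ and $d [X_{1}; X_{2}] = 0$ one would obtain

```latex
\begin{aligned}
η (d [X_{1}^{0}; X_{1}] + d [X_{2}^{0}; X_{2}]) = τ [X_{1}; X_{2}] & \leq τ [X_{2}^{0}; X_{1}^{0}] = (1 + 2 η) d [X_{1}^{0}; X_{2}^{0}], \\
d [X_{1}^{0}; U_{H}] + d [X_{2}^{0}; U_{H}] \leq d [X_{1}^{0}; X_{1}] + d [X_{2}^{0}; X_{2}] & \leq \Bigl( \frac{1}{η} + 2 \Bigr) d [X_{1}^{0}; X_{2}^{0}] = 10 d [X_{1}^{0}; X_{2}^{0}], \\
2 d [X_{1}^{0}; U_{H}] \leq d [X_{1}^{0}; U_{H}] + d [X_{2}^{0}; U_{H}] + d [X_{1}^{0}; X_{2}^{0}] & \leq 11 d [X_{1}^{0}; X_{2}^{0}] .
\end{aligned}
```

The last line gives $d [X_{1}^{0}; U_{H}] \leq \frac{11}{2} d [X_{1}^{0}; X_{2}^{0}] \leq 6 d [X_{1}^{0}; X_{2}^{0}]$ , and the same argument with the roles of $X_{1}^{0}$ and $X_{2}^{0}$ exchanged bounds $d [X_{2}^{0}; U_{H}]$ .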

One can then replace Lemma 7.2 with

Lemma 8.10

✓

If $A \subset F_{2}^{n}$ is non-empty and $| A + A | \leq K | A |$ , then $A$ can be covered by at most $K^{6} | A |^{1 / 2} / | H |^{1 / 2}$ translates of a subspace $H$ of $F_{2}^{n}$ with

| H | / | A | \in [K^{- 10}, K^{10}] .

Proof ▶

By repeating the proof of Lemma 7.2 and using Theorem 8.9 one can obtain the claim with $13 / 2$ replaced by $6$ and $11$ replaced by $10$ .

This implies the following improved version of Theorem 7.3:

Theorem 8.11 Improved PFR

✓

If $A \subset F_{2}^{n}$ is non-empty and $| A + A | \leq K | A |$ , then $A$ can be covered by at most $2 K^{11}$ translates of a subspace $H$ of $F_{2}^{n}$ with $| H | \leq | A |$ .

Proof ▶

By repeating the proof of Theorem 7.3 and using Lemma 8.10 one can obtain the claim with $12$ replaced by $11$ .
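The exponent can be checked from Lemma 8.10 as follows. This is a sketch under an assumption about how the proof of Theorem 7.3 proceeds, namely that when $| H | > | A |$ one passes to a subspace $H' \leq H$ with $| A | / 2 \leq | H' | \leq | A |$ (the name $H'$ is introduced here for illustration), so that $H$ is covered by $| H | / | H' |$ translates of $H'$ . Using $| H | / | A | \in [K^{- 10}, K^{10}]$ , the two cases would read

```latex
\begin{aligned}
| H | \leq | A | : \quad & K^{6} \frac{| A |^{1 / 2}}{| H |^{1 / 2}} \leq K^{6} \cdot K^{5} = K^{11}, \\
| H | > | A | : \quad & K^{6} \frac{| A |^{1 / 2}}{| H |^{1 / 2}} \cdot \frac{| H |}{| H' |} \leq 2 K^{6} \frac{| H |^{1 / 2}}{| A |^{1 / 2}} \leq 2 K^{6} \cdot K^{5} = 2 K^{11} .
\end{aligned}
```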

Of course, by replacing Theorem 7.3 with Theorem 8.11 we may also improve constants in downstream theorems in a straightforward manner.