12 The -torsion case
12.1 Data processing inequality
Lemma 12.1 (Data processing for a single variable).
Let be a random variable. Then for any function on the range of , one has .
Lemma 12.2 (One-sided unconditional data processing inequality).
Let be random variables. For any function on the range of , we have .
Lemma 12.3 (Unconditional data processing inequality).
Let be random variables. For any functions on the ranges of respectively, we have .
Lemma 12.4 (Data processing inequality).
Let . For any functions on the ranges of respectively, we have .
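The displayed inequalities of Lemmas 12.1–12.4 did not survive extraction. As a hedged reconstruction (the symbols f, g, X, Y, Z are placeholders, not from the original text), the standard data-processing facts of this shape read:

```latex
% Hedged reconstruction; f, g, X, Y, Z are placeholder symbols.
\mathbf{H}[f(X)] \le \mathbf{H}[X]                            % cf. Lemma 12.1
\mathbf{I}[f(X) : Y] \le \mathbf{I}[X : Y]                    % cf. Lemma 12.2
\mathbf{I}[f(X) : g(Y)] \le \mathbf{I}[X : Y]                 % cf. Lemma 12.3
\mathbf{I}[f(X) : g(Y) \mid Z] \le \mathbf{I}[X : Y \mid Z]   % cf. Lemma 12.4
```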
12.2 More Ruzsa distance estimates
Let be an additive group.
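The estimates below are phrased in terms of the entropic Ruzsa distance. As a reminder (this is the standard definition, supplied here for the reader's convenience rather than recovered from the original text):

```latex
d[X ; Y] := \mathbf{H}[X' - Y'] - \tfrac{1}{2}\mathbf{H}[X] - \tfrac{1}{2}\mathbf{H}[Y],
\quad\text{where } X', Y' \text{ are independent with } X' \sim X,\ Y' \sim Y.
```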
Lemma 12.5 (Flipping a sign).
Proof.
Without loss of generality (using Lemma 3.10 and Lemma 3.7) we may take to be independent. Let be copies of that are conditionally independent over (these exist thanks to Lemma 3.22). By Lemma 3.7, we can also find another copy of that is independent of . From Corollary 2.21, one has
From Lemma 3.11, Lemma 3.1, Lemma 3.10 we have
Since is a function of , we see from Lemma 2.2 and Corollary 2.24 that
Because , we have
and thus by Lemma 2.2
and hence by Corollary 2.18
Since are independent, we see from Lemma 3.11, Lemma 3.10 that
Similarly
We conclude that
Finally, from Lemma 12.1 we have
From Corollary 2.24 followed by Corollary 2.30, we have
and thus by Lemma 3.11, Lemma 3.10, Lemma 2.2, Corollary 2.24
Applying all of these estimates, the claim now follows from linear arithmetic.
Lemma 12.6 (Kaimanovich–Vershik–Madiman inequality).
If and are jointly independent -valued random variables, then
Proof.
This is trivial for , while the case is Lemma 3.21. Now suppose inductively that , and that the claim has already been proven for . By a further application of Lemma 3.21 one has
By induction hypothesis one has
Summing the two inequalities, we obtain the claim.
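For orientation, the two-summand base case invoked here (Lemma 3.21) is the classical Kaimanovich–Vershik submodularity inequality; in standard notation, for jointly independent G-valued X, Y, Z:

```latex
\mathbf{H}[X + Y + Z] + \mathbf{H}[Y] \;\le\; \mathbf{H}[X + Y] + \mathbf{H}[Y + Z].
```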
Lemma 12.7 (Kaimanovich–Vershik–Madiman inequality, II).
If and are jointly independent -valued random variables, then
Proof.
Applying Lemma 12.6 with all the replaced by , and using Lemma 3.1 and Lemma 3.11, we obtain after some rearranging
From Corollary 3.5 we have
for all ; subtracting and averaging, we conclude that
and thus
From Lemma 3.13 we have
Since , the claim follows.
Lemma 12.8 (Kaimanovich–Vershik–Madiman inequality, III).
If and are jointly independent -valued random variables, then
Lemma 12.9 (Comparing sums).
Let and be tuples of jointly independent random variables (so the ’s and ’s are also independent of each other), and let be a function. Then
Lemma 12.10 (Sums of dilates I).
Let be independent -valued random variables, with a copy of , and let be an integer. Then
and
Proof.
From Lemma 3.17 we have
which gives the first inequality. Similarly from Lemma 3.17 we have
which (when combined with Lemma 3.1) gives the second inequality.
Lemma 12.11 (Sums of dilates II).
Let be independent -valued random variables, and let be an integer. Then
We remark that in the paper [GGMT2024] the variant estimate
is also proven by a similar method. This variant is superior for (or ), but we will not need this estimate here.
12.3 Multidistance
We continue to let be an abelian group.
Definition 12.12 (Multidistance).
Let be a positive integer, and let be an -tuple of -valued random variables . Then we define
where the are independent copies of the .
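The defining formula did not survive extraction; following [GGMT2024], the multidistance of a tuple of random variables indexed by a finite set I is presumably

```latex
D[(X_i)_{i \in I}] := \mathbf{H}\Big[\sum_{i \in I} \tilde X_i\Big]
  - \frac{1}{|I|} \sum_{i \in I} \mathbf{H}[\tilde X_i],
```

where the tilde variables are independent copies of the originals (note that each copy has the same entropy as the original).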
Lemma 12.13 (Multidistance of copy).
If and are such that and have the same distribution for each , then .
Lemma 12.14 (Multidistance of independent variables).
If are jointly independent, then .
Lemma 12.15 (Nonnegativity).
For any such tuple, we have .
Proof.
From Corollary 3.5 one has
for each . Averaging over , we obtain the claim.
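Concretely, under the definition of multidistance via independent copies, the computation is presumably the following (a hedged sketch; the inequality for independent summands is a standard entropy fact):

```latex
% For jointly independent summands in a group, the entropy of the sum
% dominates the entropy of each individual summand:
\mathbf{H}\Big[\sum_{i \in I} \tilde X_i\Big] \;\ge\; \mathbf{H}[\tilde X_j]
  \quad \text{for each } j \in I,
% so averaging over j and subtracting the mean entropy yields
D[(X_i)_{i \in I}] \;\ge\; 0.
```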
Lemma 12.16 (Relabeling).
If is a bijection, then .
Lemma 12.17 (Multidistance and Ruzsa distance, I).
Let , and let be a tuple of -valued random variables. Then
Lemma 12.18 (Multidistance and Ruzsa distance, II).
Let , and let be a tuple of -valued random variables. Then
Proof.
From Lemma 3.18 we have , and applying this to every summand in Lemma 12.17, we obtain the claim.
Lemma 12.19 (Multidistance and Ruzsa distance, III).
Let , and let be a tuple of -valued random variables. If the all have the same distribution, then for any .
Lemma 12.20 (Multidistance and Ruzsa distance, IV).
Let , and let be a tuple of independent -valued random variables. Let . Then
Proof.
Take to be further independent copies of (which exist by Lemma 3.7), and write . Fix any distinct .
From Lemma 3.21 one has
and also
Combining this with 1 and then applying Corollary 3.5 we have
Averaging this over all choices of gives , and the claim follows from Lemma 3.11.
Proposition 12.21 (Vanishing).
If , then for each there is a finite subgroup such that .
With more effort one can show that is independent of , but we will not need to do so here.
12.4 The tau functional
Fix , and a reference variable in .
Definition 12.23 (τ-functional).
If is a tuple, we define its τ-functional
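The formula itself was stripped in extraction; by analogy with the corresponding functional in the two-torsion case of the argument, it presumably takes the shape below, where η is the fixed constant and the reference variable appears in each Ruzsa distance (treat this as an assumption, not the original display):

```latex
\tau[(X_i)_{i \in I}] := D[(X_i)_{i \in I}] + \eta \sum_{i \in I} d[X_i ; X^\circ].
```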
Definition 12.24 (τ-minimizer).
A τ-minimizer is a tuple that minimizes the τ-functional among all tuples of -valued random variables.
Proposition 12.25 (Existence of τ-minimizer).
If is finite, then a τ-minimizer exists.
Proposition 12.26 (Minimizer close to reference variables).
If is a τ-minimizer, then .
Lemma 12.27 (Lower bound on multidistance).
If is a τ-minimizer, and , then for any other tuple , one has
Definition 12.28 (Conditional multidistance).
If and are tuples of random variables, with the being -valued (but the need not be), then we define
where each ranges over the support of for .
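Consistent with the conditional entropy conventions used elsewhere in the text, the definition presumably averages the unconditional multidistance over the values of the conditioning variables (a hedged reconstruction with placeholder symbols):

```latex
D\big[(X_i)_{i \in I} \,\big|\, (Y_i)_{i \in I}\big]
  := \sum_{(y_i)_{i \in I}} \Big(\prod_{i \in I} \mathbf{P}[Y_i = y_i]\Big)\,
     D\big[(X_i \mid Y_i = y_i)_{i \in I}\big].
```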
Lemma 12.29 (Alternate form of conditional multidistance).
Lemma 12.30 (Conditional multidistance nonnegative).
If and are tuples of random variables, then .
Lemma 12.31 (Lower bound on conditional multidistance).
If is a τ-minimizer, and , then for any other tuples and with the -valued, one has
Corollary 12.32 (Lower bound on conditional multidistance, II).
With the notation of the previous lemma, we have
for any permutation .
12.5 The multidistance chain rule
Lemma 12.33 (Multidistance chain rule).
Let be a homomorphism of abelian groups and let be a tuple of jointly independent -valued random variables. Then is equal to
where .
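The displayed identity was lost in extraction; in [GGMT2024] the multidistance chain rule for a homomorphism π takes the following shape (reconstructed with placeholder notation, so the exact form of the conditioning should be treated as an assumption):

```latex
D[(X_i)_{i \in I}] = D[(\pi(X_i))_{i \in I}]
  + D\big[(X_i)_{i \in I} \,\big|\, (\pi(X_i))_{i \in I}\big]
  + \mathbf{I}\Big[\sum_{i \in I} X_i \,:\, (\pi(X_i))_{i \in I}
      \,\Big|\, \pi\Big(\sum_{i \in I} X_i\Big)\Big].
```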
Proof.
For notational brevity during this proof, write .
From Lemma 2.26 and Lemma 2.2, noting that is determined both by and by , we have
and by Lemma 2.13 the right-hand side is equal to
Therefore,
From a further application of Lemma 2.13 and Lemma 2.2 we have
for all . Averaging 7 in and subtracting this from 6, we obtain the claim from Definition 12.12.
We will need to iterate the multidistance chain rule, so it is convenient to observe a conditional version of this rule, as follows.
Lemma 12.34 (Conditional multidistance chain rule).
Let be a homomorphism of abelian groups. Let be a finite index set and let be a tuple of -valued random variables. Let be another tuple of random variables (not necessarily -valued). Suppose that the pairs are jointly independent of one another (but need not be independent of ). Then
Proof.
For each in the support of , apply Lemma 12.33 with replaced by the conditioned random variable , and the claim 8 follows by averaging 5 in the using the weights .
We can iterate the above lemma as follows.
Lemma 12.35.
Let be a positive integer. Suppose one has a sequence
of homomorphisms between abelian groups , and for each , let be the homomorphism from to arising from this sequence by composition (so for instance is the identity homomorphism and is the zero homomorphism). Let be a jointly independent tuple of -valued random variables. Then
In particular, by Lemma 2.27,
Proof.
From Lemma 12.34 (taking and there, and noting that determines ) we have
for . The claim follows by a telescoping argument, noting that and that (and also ).
In our application we will need the following special case of the above lemma.
Corollary 12.36.
Let be an abelian group and let . Suppose that , , are independent -valued random variables. Then
where all the multidistances here involve the indexing set .
Proof.
In Lemma 12.35 we take with the maps for defined by
with . Since can be obtained from by applying a homomorphism, we obtain a sequence of the form 9.
Now we apply Lemma 12.35 with and . Using joint independence and Corollary 2.24, we find that
On the other hand, for , we see that once is fixed, is determined by and vice versa, so
Since the are jointly independent, we may further simplify:
Putting all this into the conclusion of Lemma 12.35, we obtain
and the claim follows by rearranging.
12.6 Bounding the mutual information
As before, is an abelian group, and . We let be a τ-minimizer.
Proposition 12.37 (Bounding mutual information).
Suppose that , , are jointly independent -valued random variables, such that for each , the random variables coincide in distribution with some permutation of . Write
Then
Proof.
For each we call the tuple a column and for each we call the tuple a row. Hence, by hypothesis, each column is a permutation of .
From Corollary 12.36 we have
where
and
We first consider the , for fixed . By Lemma 12.16 and our hypothesis on columns, we have
Let be a permutation such that , and write and . By Corollary 12.32, we have
We similarly consider . By Lemma 12.16 applied to the -th column,
For , denote the sum of row by
if we apply Corollary 12.32 again, now with , , and with the variable being trivial, we obtain
It remains to bound the distances appearing in 16 and 17 further using Ruzsa calculus. For and , by Lemma 3.25 we have
For each , summing over gives
On the other hand, by Lemma 12.8 (since appears in the sum ) we have
Combining 15, 16 and 17 with 18 and 19 (the latter two summed over ), we get
By Lemma 12.9 (with taking each to the index such that is a copy of ) we obtain the bound
Finally, summing over and using gives
where in the second step we used the permutation hypothesis. Combining this with 20 gives the required bound.
The claim 14 is now immediate from Lemma 12.18.
12.7 Endgame
Now let , let be an -torsion abelian group, and let be a -minimizer.
Definition 12.38 (Additional random variables).
By a slight abuse of notation, we identify and in the obvious way, and let be an independent copy of for . Then also define:
and
The addition takes place over . Note that, because we are assuming is -torsion, it is well-defined to multiply elements of by elements of . We will also define for the variables
Proposition 12.40 (Mutual information bound).
Proof.
We analyze these variables via Proposition 12.37 in several different ways. In the first application, take . Note that each column is indeed a permutation of ; in fact, the trivial permutation. Note also that for each , the row sum is
and for each , the column sum is
Finally note that . From Proposition 12.37 we then have
with as in 23. Since is a function of by 21, and similarly is a function of , it follows immediately from Lemma 12.4 that
In the second application of Proposition 12.37, we instead consider . Again, for each fixed , the tuple is a permutation of . This time the row sums for are
Similarly, the column sums for are
As before, . Hence, using 21 and Lemma 12.4 again, Proposition 12.37 tells us
In the third application of Proposition 12.37, take . The column and row sums are respectively
and
Hence, Proposition 12.37 and Lemma 12.4 give
which completes the proof.
Proof.
Without loss of generality, we may take to be independent. Write . Note that for each , the sum from 21 above has the same distribution as . By Lemma 12.6 we have
By Lemma 12.20, we have
and hence
From Definition 12.12 we have
and the claim follows.
Lemma 12.43 (Mutual information bound).
Lemma 12.44 (Distance bound).
Proof.
For each , using Lemma 12.8 (noting the sum contains as a summand) we have
and using Lemma 3.24 we have
Combining with 26 and Lemma 12.43 gives
Summing over and applying Lemma 12.42 gives
Finally, applying Lemma 12.18 (and dropping some lower order terms) gives the claim.
Lemma 12.45 (Application of BSG).
Let be an abelian group, let be a -valued random variable such that holds identically, and write
Let be some further -valued random variables and let be a constant. Then there exists a random variable such that
Proof.
We apply Lemma 3.23 with and . Since , we find that
where the last line follows from Lemma 2.2 by observing
since any two of determine the third.
By 28 and the triangle inequality,
and by Lemma 3.25, for each ,
Hence,
and the result follows by setting for some such that the quantity in parentheses on the left-hand side is at most the weighted average value.
Proposition 12.46 (Vanishing entropy).
Proof.
For each value , apply Lemma 12.45 (and Lemma 12.39) to
with and . Write
for this choice, and note that
by Proposition 12.40. Write for the random variable guaranteed to exist by Lemma 12.45, so that 27 gives
Let denote the tuple consisting of the same variable repeated times. By Lemma 12.19
On the other hand, from Lemma 12.27 one has
Combining 30, 31 and 32 and averaging over (with weight ), and recalling the value , gives
since the terms cancel by our choice of . Substituting in Lemma 12.44 and 29, and using the fact that , we have
From Definition 12.22 we have
and hence . The claim now follows from Lemma 12.15.
12.8 Wrapping up
Theorem 12.47 (Entropy form of PFR).
Suppose that is a finite abelian group of torsion . Suppose that is a -valued random variable. Then there exists a subgroup such that
Lemma 12.48.
Suppose that is a finite abelian group of torsion . If is non-empty and , then can be covered by at most translates of a subspace of with
Theorem 12.49 (PFR).
Suppose that is a finite abelian group of torsion . If is non-empty and , then can be covered by at most translates of a subspace of with .