3 Entropic Ruzsa calculus
In this section \(G\) will be a finite additive group. (May eventually want to generalize to infinite \(G\).)
If \(X\) is \(G\)-valued, then \(\mathbb {H}[-X]=\mathbb {H}[X]\).
Immediate from Lemma 2.2.
If \(X,Y\) are \(G\)-valued, then \(\mathbb {H}[X \pm Y | Y]=\mathbb {H}[X|Y]\) and \(\mathbb {H}[X \pm Y, Y] = \mathbb {H}[X, Y]\).
If \(X,Y\) are \(G\)-valued random variables on \(\Omega \), we have
\[ \max (\mathbb {H}[X], \mathbb {H}[Y]) - \mathbb {I}[X:Y] \leq \mathbb {H}[X \pm Y]. \]
If \(X,Y\) are \(G\)-valued random variables on \(\Omega \) and \(Z\) is another random variable on \(\Omega \) then
\[ \max (\mathbb {H}[X | Z], \mathbb {H}[Y | Z]) - \mathbb {I}[X:Y | Z] \leq \mathbb {H}[X \pm Y | Z]. \]
This follows from Lemma 3.3 by conditioning on \(Z = z\) and summing over \(z\) (weighted by \( \mathbb {P}[Z=z]\)).
If \(X,Y\) are independent \(G\)-valued random variables, then
\[ \max (\mathbb {H}[X], \mathbb {H}[Y]) \leq \mathbb {H}[X \pm Y]. \]
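For instance, if \(X\) is uniform on \(G\) and independent of \(Y\), then \(X+Y\) is again uniform on \(G\): for each \(g \in G\),
\[ \mathbb {P}[X+Y = g] = \sum _{y} \mathbb {P}[Y=y]\, \mathbb {P}[X = g-y] = \frac{1}{|G|}. \]
Hence \(\mathbb {H}[X+Y] = \log |G| = \mathbb {H}[X] = \max (\mathbb {H}[X], \mathbb {H}[Y])\), so the lower bound is attained with equality in this case.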
One random variable is said to be a copy of another if they have the same distribution.
If \(X'\) is a copy of \(X\) then \(\mathbb {H}[X'] = \mathbb {H}[X]\).
Immediate from Definition 2.1.
Let \(X_i : \Omega _i \to S_i\) be random variables for \(i=1,\dots ,k\). If one gives \(\prod _{i=1}^k S_i\) the product measure of the laws of the \(X_i\), then the coordinate functions \((x_j)_{j=1}^k \mapsto x_i\) are jointly independent random variables which are copies of \(X_1,\dots ,X_k\).
Explicit computation.
Let \(X,Y\) be \(G\)-valued random variables (not necessarily on the same sample space). The Ruzsa distance \(d[X ;Y]\) between \(X\) and \(Y\) is defined to be
\[ d[X ;Y] := \mathbb {H}[X' - Y'] - \tfrac{1}{2}\mathbb {H}[X'] - \tfrac{1}{2}\mathbb {H}[Y'], \]
where \(X',Y'\) are (the canonical) independent copies of \(X,Y\) from Lemma 3.7.
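As a basic example, if \(X\) and \(Y\) are both uniform on a finite subgroup \(H \leq G\), then for each \(h \in H\)
\[ \mathbb {P}[X' - Y' = h] = \sum _{g \in H} \mathbb {P}[X' = h+g]\, \mathbb {P}[Y' = g] = \frac{1}{|H|}, \]
so \(X' - Y'\) is uniform on \(H\) and
\[ d[X ;Y] = \mathbb {H}[X'-Y'] - \tfrac{1}{2}\mathbb {H}[X'] - \tfrac{1}{2}\mathbb {H}[Y'] = \log |H| - \tfrac{1}{2}\log |H| - \tfrac{1}{2}\log |H| = 0. \]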
If \(X\) is a \(G\)-valued random variable and \(0\) is the random variable taking the value \(0\) everywhere then
\[ d[X ;0] = \tfrac{1}{2}\mathbb {H}[X]. \]
This is an immediate consequence of the definitions, \(X-0\equiv X\), and \(\mathbb {H}[0]=0\).
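For instance, if \(X\) is uniform on \(G\) then
\[ d[X ;0] = \tfrac{1}{2}\mathbb {H}[X] = \tfrac{1}{2}\log |G|, \]
which is the largest possible value of the distance to \(0\), since \(\mathbb {H}[X] \leq \log |G|\) for any \(G\)-valued \(X\).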
If \(X',Y'\) are copies of \(X,Y\) respectively then \(d[X';Y']=d[X ;Y]\).
If \(X,Y\) are independent \(G\)-valued random variables, then
\[ d[X ;Y] = \mathbb {H}[X - Y] - \tfrac{1}{2}\mathbb {H}[X] - \tfrac{1}{2}\mathbb {H}[Y]. \]
If \(X,Y\) are \(G\)-valued random variables, then \(d[X ;Y] = d[Y ;X]\).
If \(X,Y\) are \(G\)-valued random variables, then
\[ |\mathbb {H}[X] - \mathbb {H}[Y]| \leq 2 d[X ;Y]. \]
If \(X,Y\) are independent \(G\)-valued random variables, then
\[ \mathbb {H}[X-Y] - \mathbb {H}[X],\ \mathbb {H}[X-Y] - \mathbb {H}[Y] \leq 2 d[X ;Y]. \]
If \(X,Y\) are \(G\)-valued random variables, then \(d[X ;Y] \geq 0\).
Immediate from Lemma 3.13.
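It is worth noting that \(d\) is nevertheless not a metric, since \(d[X;X]\) may be strictly positive. For example, take \(G = \mathbb {Z}/5\mathbb {Z}\) and \(X\) uniform on \(\{0,1\}\); then for independent copies \(X', X''\), the difference \(X' - X''\) takes the values \(-1, 0, 1\) with probabilities \(\tfrac{1}{4}, \tfrac{1}{2}, \tfrac{1}{4}\), so
\[ d[X ;X] = \mathbb {H}[X' - X''] - \mathbb {H}[X] = \tfrac{3}{2}\log 2 - \log 2 = \tfrac{1}{2}\log 2 > 0. \]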
If \(G\) is an additive group, \(X\) is a \(G\)-valued random variable, and \(H\leq G\) is a finite subgroup then, with \(\pi :G\to G/H\) the natural homomorphism and \(U_H\) uniform on \(H\), we have
\[ \mathbb {H}[\pi (X)] \leq 2 d[X ;U_H]. \]
Without loss of generality we may take \(X\), \(U_H\) to be independent (Lemma 3.7). Now by Lemmas 2.20, 3.2, 2.3, since \(\pi (X) = \pi (X - U_H)\) is determined by \(X - U_H\),
\[ \mathbb {H}[X - U_H] = \mathbb {H}[X - U_H, \pi (X)] = \mathbb {H}[\pi (X)] + \mathbb {H}[X - U_H | \pi (X)]. \]
By Lemma 2.13
\[ \mathbb {H}[X - U_H | \pi (X)] \geq \mathbb {H}[X - U_H | X] = \mathbb {H}[U_H], \]
and therefore
\[ \mathbb {H}[X - U_H] \geq \mathbb {H}[\pi (X)] + \mathbb {H}[U_H]. \]
Furthermore, by Lemma 3.13,
\[ \mathbb {H}[X - U_H] \geq \mathbb {H}[X]. \]
Adding these inequalities gives the result.
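As a consistency check, take \(X = U_H\) (independent of the copy of \(U_H\) in the statement): the difference of two independent uniform samples from \(H\) is again uniform on \(H\), so \(d[U_H ;U_H] = \log |H| - \tfrac{1}{2}\log |H| - \tfrac{1}{2}\log |H| = 0\), while \(\pi (U_H)\) is constant. Thus
\[ \mathbb {H}[\pi (U_H)] = 0 = 2\, d[U_H ;U_H], \]
and the inequality holds with equality in this case.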
If \(X,Y,Z\) are \(G\)-valued random variables on \(\Omega \) with \((X,Y)\) independent of \(Z\), then
\[ \mathbb {H}[X - Y] \leq \mathbb {H}[X - Z] + \mathbb {H}[Z - Y] - \mathbb {H}[Z]. \quad (1) \]
This is an improvement over the usual Ruzsa triangle inequality because \(X,Y\) are not assumed to be independent. However we will not utilize this improvement here.
Apply Corollary 2.21 to obtain
Using
(from Lemma 2.2), and
(from Lemma 2.2 and Lemma 2.24) and rearranging, we indeed obtain (1).
If \(X,Y,Z\) are \(G\)-valued random variables, then
\[ d[X ;Z] \leq d[X ;Y] + d[Y ;Z]. \]
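Granting the entropic inequality of the preceding lemma in the form \(\mathbb {H}[X - Z] \leq \mathbb {H}[X - Y] + \mathbb {H}[Y - Z] - \mathbb {H}[Y]\), the triangle inequality follows by a standard unpacking: take jointly independent copies \(X', Y', Z'\) (Lemma 3.7), apply the inequality to them, and subtract \(\tfrac{1}{2}\mathbb {H}[X'] + \tfrac{1}{2}\mathbb {H}[Z']\) from both sides to get
\[ d[X ;Z] \leq \big (\mathbb {H}[X'-Y'] - \tfrac{1}{2}\mathbb {H}[X'] - \tfrac{1}{2}\mathbb {H}[Y']\big ) + \big (\mathbb {H}[Y'-Z'] - \tfrac{1}{2}\mathbb {H}[Y'] - \tfrac{1}{2}\mathbb {H}[Z']\big ) = d[X ;Y] + d[Y ;Z]. \]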
If \((X, Z)\) and \((Y, W)\) are random variables (where \(X\) and \(Y\) are \(G\)-valued) we define
\[ d[X | Z; Y | W] := \sum _{z,w} \mathbb {P}[Z=z]\, \mathbb {P}[W=w]\, d[(X | Z=z); (Y | W=w)], \]
and similarly
\[ d[X | Z; Y] := \sum _{z} \mathbb {P}[Z=z]\, d[(X | Z=z); Y]. \]
The expression \(d[X | Z;Y | W]\) is unchanged if \((X,Z)\) or \((Y,W)\) is replaced by a copy. Furthermore, if \((X,Z)\) and \((Y,W)\) are independent, then
\[ d[X | Z; Y | W] = \mathbb {H}[X - Y | Z, W] - \tfrac{1}{2}\mathbb {H}[X | Z] - \tfrac{1}{2}\mathbb {H}[Y | W], \]
and similarly, if \((X,Z)\) is independent of \(Y\),
\[ d[X | Z; Y] = \mathbb {H}[X - Y | Z] - \tfrac{1}{2}\mathbb {H}[X | Z] - \tfrac{1}{2}\mathbb {H}[Y]. \]
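For instance, if \(Z\) and \(W\) are deterministic then each conditioned variable \((X | Z=z)\) has the law of \(X\), and likewise for \(Y\), so the averaged definition collapses to
\[ d[X | Z; Y | W] = d[X ;Y]; \]
conditional Ruzsa distance therefore extends Definition 3.8.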
Suppose that \(X, Y, Z\) are independent \(G\)-valued random variables. Then
\[ \mathbb {H}[X + Y + Z] - \mathbb {H}[X + Y] \leq \mathbb {H}[Y + Z] - \mathbb {H}[Y]. \]
From Lemma 2.20 we have
\[ \mathbb {H}[X + Y + Z] + \mathbb {H}[X, Z, X + Y + Z] \leq \mathbb {H}[X, X + Y + Z] + \mathbb {H}[Z, X + Y + Z]. \]
However, using Lemmas 2.24, 2.2 repeatedly we have \(\mathbb {H}[X, X + Y+ Z] = \mathbb {H}[X, Y+ Z] = \mathbb {H}[X] + \mathbb {H}[Y+ Z]\), \(\mathbb {H}[Z, X + Y + Z] = \mathbb {H}[Z, X + Y] = \mathbb {H}[Z] + \mathbb {H}[X + Y]\) and \(\mathbb {H}[X, Z, X + Y+ Z] = \mathbb {H}[X, Y, Z] = \mathbb {H}[X] + \mathbb {H}[Y] + \mathbb {H}[Z]\). The claim then follows from a calculation.
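The concluding calculation can be spelled out: substituting the three identities into the inequality from Lemma 2.20 gives
\[ \mathbb {H}[X+Y+Z] + \mathbb {H}[X] + \mathbb {H}[Y] + \mathbb {H}[Z] \leq \mathbb {H}[X] + \mathbb {H}[Y+Z] + \mathbb {H}[Z] + \mathbb {H}[X+Y], \]
which rearranges to \(\mathbb {H}[X+Y+Z] - \mathbb {H}[X+Y] \leq \mathbb {H}[Y+Z] - \mathbb {H}[Y]\), as claimed.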
For \(X,Y\) random variables, there exist random variables \(X_1,X_2,Y'\) on a common probability space with \((X_1, Y'), (X_2, Y')\) both having the distribution of \((X,Y)\), and \(X_1, X_2\) conditionally independent over \(Y'\) in the sense of Definition 2.28.
Explicit construction.
Let \(A,B\) be \(G\)-valued random variables on \(\Omega \), and set \(Z := A+B\). Then
\[ d[A | Z; B | Z] \leq 3 \mathbb {I}[A:B] + 2 \mathbb {H}[Z] - \mathbb {H}[A] - \mathbb {H}[B]. \quad (2) \]
Let \((A_1, B_1)\) and \((A_2, B_2)\) be conditionally independent trials of \((A,B)\) relative to \(Z\) as produced by Lemma 3.22; thus \((A_1,B_1)\) and \((A_2,B_2)\) are coupled through the random variable \(A_1 + B_1 = A_2 + B_2\), which by abuse of notation we shall also call \(Z\).
Observe from Lemma 3.11 that the left-hand side of (2) is
\[ \mathbb {H}[A_1 - B_2 | Z] - \tfrac{1}{2}\mathbb {H}[A_1 | Z] - \tfrac{1}{2}\mathbb {H}[B_2 | Z], \quad (3) \]
since, crucially, \((A_1 | Z=z)\) and \((B_2 | Z=z)\) are independent for all \(z\).
Applying submodularity (Lemma 2.21) gives
We estimate the second, third and fourth terms appearing here. First note that, by Lemma 2.30 and Lemma 2.2 (noting that the tuple \((A_1 - B_2, A_1, B_1)\) determines the tuple \((A_1, A_2, B_1, B_2)\) since \(A_1+B_1=A_2+B_2\))
Next observe that
Finally, we have
Substituting (7), (8) and (9) into (4) yields
and so by Corollary 2.19
Since
and similarly for \(\mathbb {H}[B_2 | Z]\), we see that (3) is bounded by \(3 \mathbb {I}[A:B] + 2\mathbb {H}[Z]-\mathbb {H}[A]-\mathbb {H}[B]\) as claimed.
Suppose that \((X, Z)\) and \((Y, W)\) are random variables, where \(X, Y\) take values in an abelian group. Then
\[ d[X | Z; Y | W] \leq d[X ;Y] + \tfrac{1}{2}\mathbb {I}[X:Z] + \tfrac{1}{2}\mathbb {I}[Y:W]. \]
In particular,
\[ d[X | Z; Y] \leq d[X ;Y] + \tfrac{1}{2}\mathbb {I}[X:Z]. \]
Using Lemma 3.20 and Lemma 3.7, if \((X',Z'), (Y',W')\) are independent copies of the variables \((X,Z)\), \((Y,W)\), we have
Here, in the middle step we used Lemma 2.19, and in the last step we used Definition 3.8 and Definition 2.15.
Let \(X, Y, Z\) be random variables taking values in some abelian group of characteristic \(2\), and with \(Y, Z\) independent. Then we have
\[ d[X ;Y+Z] - d[X ;Y] \leq \tfrac{1}{2}\big (\mathbb {H}[Y+Z] - \mathbb {H}[Y]\big ) = \tfrac{1}{2} d[Y;Z] + \tfrac{1}{4}\mathbb {H}[Z] - \tfrac{1}{4}\mathbb {H}[Y] \quad (10) \]
and
\[ d[X ;Y | Y+Z] - d[X ;Y] \leq \tfrac{1}{2}\big (\mathbb {H}[Y+Z] - \mathbb {H}[Z]\big ) = \tfrac{1}{2} d[Y;Z] + \tfrac{1}{4}\mathbb {H}[Y] - \tfrac{1}{4}\mathbb {H}[Z]. \quad (11) \]
We first prove (10). We may assume (taking an independent copy, using Lemmas 3.7, 3.10 and 3.11) that \(X\) is independent of \(Y, Z\). Then, since we are in characteristic \(2\), we have
\[ d[X ;Y+Z] - d[X ;Y] = \mathbb {H}[X+Y+Z] - \mathbb {H}[X+Y] - \tfrac{1}{2}\mathbb {H}[Y+Z] + \tfrac{1}{2}\mathbb {H}[Y]. \]
Combining this with Lemma 3.21 gives the required bound. The second form of the result is immediate from Lemma 3.11.
Turning to (11), we have from Definition 2.15 and Lemma 2.2
\[ \mathbb {I}[Y : Y+Z] = \mathbb {H}[Y] + \mathbb {H}[Y+Z] - \mathbb {H}[Y, Y+Z] = \mathbb {H}[Y+Z] - \mathbb {H}[Z], \]
and so (11) is a consequence of Lemma 3.24. Once again the second form of the result is immediate from Lemma 3.11.
Let \(X, Y, Z, Z'\) be random variables taking values in some abelian group, and with \(Y, Z, Z'\) independent. Then we have