Probability with Martingales by David Williams

\[ \newcommand{\Q}{\mathbb Q} \newcommand{\R}{\mathbb R} \newcommand{\C}{\mathbb C} \newcommand{\Z}{\mathbb Z} \newcommand{\N}{\mathbb N} \newcommand{\abs}[1]{\lvert #1 \rvert} \newcommand{\norm}[1]{\lVert #1 \rVert} \newcommand{\abs}[1]{\lvert #1 \rvert} \newcommand{\Norm}[1]{\left \lVert #1 \right \rVert} \newcommand{\Abs}[1]{\left \lvert #1 \right \rvert} \newcommand{\pind}{\bot \!\!\! \bot} \newcommand{\probto}{\buildrel P\over \to} \newcommand{\vect}[1]{\boldsymbol #1} \DeclareMathOperator{\EE}{\mathbb E} \DeclareMathOperator{\PP}{\mathbb P} \DeclareMathOperator{\E}{E} \DeclareMathOperator{\dnorm}{\mathcal N} \DeclareMathOperator{\sgn}{sgn} \DeclareMathOperator{\Var}{Var} \DeclareMathOperator{\Cov}{Cov} \DeclareMathOperator{\Leb}{Leb} \DeclareMathOperator{\Bin}{Bin} \newcommand{\wto}{\buildrel w\over \to} \]

Dominated Convergence Theorem

Problem 5.1

Let \(S = [0,1]\), \(\Sigma = \mathcal B(S)\), \(\mu\) be the Lesbegue measure. Define \(f_n = n I_{(0,1/n)}\). Prove that \(f_n(s) \to 0\) for every \(s\in S\) but that \(\mu(f_n)=1\) for every \(n\). Draw a picture of \(g = \sup_n \abs{f_n}\) and show that \(g\not \in \mathcal L^1(S,\Sigma,\mu)\)

If \(n>1/s\) then \(f_n(s) = 0\) because the indicator \(I_{(0,1/n)}\) doesn't put any mass at points as large as \(s\). So \(f_n(s)\) eventually for every \(s\) and \(\lim f_n(s) \to 0\). On the other hand \(\mu(f_n) = n \mu(0,1/n) = \frac n n = 1\). The function \(g=\sup_n \abs{f_n}\) sort of looks like the function \(1/x\) near 0, except it has stairsteps at each \(1/n\). It takes the value \(n\) on the range \((1/(n+1),n]\). Now \(g_n = \max(\abs{f_1}, \dots, \abs{f_n})\) is a simple function which satisfies

\[ \begin{equation} \mu g_n \geq \sum_{k=1}^n k \left( \frac 1 k - \frac 1 {k+1}\right) = \sum_{k=1}^n \frac 1 {k+1} \end{equation} \]

The sum diverges as \(n\uparrow \infty\), which means \(\mu g\) also diverges since its greater than any \(\mu g_n\).

Problem 5.2

Prove inclusion-exclusion considering integrals of indicator functions

We will use the following property of indicators repeatedly: for events \(A\) and \(B\),

\[ \begin{equation} I_A I_B = I_{A\cap B} \label{eq:intersectionrule} \end{equation} \]

This can be verified by considering the four cases \(x \in (A\cup B)^c\), \(x\in A\setminus B\), \(x\in B\setminus A\) and \(x\in A\cap B\). These four cases partition \(\Omega\), and the values of the expressins on both sides of \(\eqref{eq:intersectionrule}\) are equal.

Let \(A_1,\dots, A_n\) be a collection of events. To simplify notation I'm going to write \(I(A)\) for indicators rather than \(I_A\), but the indicator remains a function of \(\omega \in \Omega\). Now by de Morgan's law \((A_1\cup A_2 \cup \dots A_n)^c = A^c_1 \cap A^c_2 \cap \dots \cap A^c_n\).

\[ \begin{equation} \begin{split} 1-I( A_1 \cup \dots \cup A_n) &= I((A_1\cup \dots A_n)^c) \\ =& I( A^c_1 \cap A^c_2 \cap \dots \cap A^c_n) \\ =& I(A^c_1 ) I(A^c_2) \dots I(A^c_n) \\ =& (1-I(A_1) ) (1-I(A_2)) \dots (1-I(A_n)) \\ =& 1 - \sum_i I(A_i) + \sum_{i<j} I(A_i)I(A_j) - \sum_{i<j<k} I(A_i)I(A_j)I(A_k) \\ &\,\,\,+ \dots + (-1)^{n-1} I(A_1)I(A_2)\dots I(A_n) \\ =& 1 - \sum_i I(A_i) + \sum_{i<j} I(A_i \cap A_j) - \sum_{i<j<k} I(A_i \cap A_j \cap A_k) \\ & +\dots + (-1)^{n-1} I(A_1 \cap A_2 \cap \dots \cap A_n) \end{split} \end{equation} \]

Take expectations of each side yields inclusion exclusion

\[ \begin{equation} \begin{split} \Pr( A_1 \cup \dots \cup A_n) =& \sum_i \Pr(A_i) + \sum_{i<j} \Pr(A_i \cap A_j) - \sum_{i<j<k} \Pr(A_i \cap A_j \cap A_k) \\ &+\dots + (-1)^{n-1} \Pr(A_1 \cap A_2 \cap \dots \cap A_n) \end{split} \end{equation} \]

Now we prove the inclusion-exclusion in equalities. The \(d\)-depth inequality on \(n\) terms on the indicator functions is given by

\[ \begin{equation} \begin{split} I( A_1 \cup \dots \cup A_n) \,\, \lesseqgtr \,\,& \sum_i I(A_i) + \sum_{i<j} I(A_i \cap A_j) - \sum_{i<j<k} I(A_i \cap A_j \cap A_k) \\ &\,\,\,+\dots + (-1)^{d-1} \sum_{i_1<i_2<\dots<i_d} I(A_{i_1} \cap A_{i_2} \cap \dots \cap A_{i_d}) \end{split} \end{equation} \]

where the inequality is \(\leq\) if \(d\) is odd and \(\geq\) if \(d\) is even. For each \(d\) we induct on \(n\). Clearly its true for \(d=1\) and \(n=2\) since by inclusion-exclusion

\[ \begin{equation} I(A_1\cup A_2) = I(A_1) + I(A_2) - I(A_1\cap A_2) \leq I(A_1)+ I(A_2) \end{equation} \]

Then its true inductively for any \(n\) and \(d=1\), assuming its true for \(n-1\)

\[ \begin{equation} \begin{split} I(A_1\cup \dots \cup A_n) &\leq I(A_1\cup \dots \cup A_{n-1}) \cup I(A_n) \\ &\leq I(A_1) + \dots + I(A_{n-1}) + I(A_n) \end{split} \end{equation} \]

For arbitrary \(d\), the inequality is true for \(n\leq k\), since in this case its just the inclusion-exclusion equality. No terms are truncated. So inductively if its true for \(n-1\), we can group together \(A_1 \cup A_2\) and treat it as one set

\[ \begin{equation} \begin{split} I( A_1 \cup \dots \cup A_n) \,\, \lesseqgtr \,\,& I(A_1 \cup A_2 ) + \sum_i I(A_i) \\ & - \sum_{i} I((A_1 \cup A_2)\cap A_{i} ) - \sum_{i<j} I(A_i \cap A_j) \\ & + \sum_{i,j} I((A_1 \cup A_2) \cap A_{i} \cap A_{j} ) + \sum_{i<j<k} I(A_i \cap A_j \cap A_k) \\ & + \dots\\ & + (-1)^{d-1} \sum_{i_1<i_2<\dots<i_{d-1}} I( (A_1 \cup A_2 ) \cap A_{i_1} \cap A_{i_2} \cap \dots \cap A_{i_{d-1}}) \\ & + (-1)^{d-1} \sum_{i_1<i_2<\dots<i_d} I(A_{i_1} \cap A_{i_2} \cap \dots \cap A_{i_d}) \end{split} \end{equation} \]

Here indices range from 3 to n since we've explicitly broken out 1 and 2. Using the distributive law, each indicator which includes \(A_1 \cup A_2\) can be written as the indicator of a union of two sets \((A_1\cap B) \cup (A_2 \cap B)\) where \(B = A_{i_1} \cap \dots \cap A_{i_m}\). For all such indicators of \(m < d\) terms, use inclusion-exclusion for \(n=2\) to expand \(I((A_1\cup A_2)\cap B) \to I(A_1\cap B) + I(A_2\cap B) - I(A_1 \cap A_2 \cap B)\). After performing all these expansions, we see we recover most of the terms inclusion-exclusion formula up to level \(d\). This is because the terms of level \(m\) expand to provide all the missing terms in level \(m\) which have exactly one of \(A_1\) or \(A_2\), and also to provide all the missing terms in level \(m+1\) including both \(A_1\) and \(A_2\). Terms with neither \(A_1\) or \(A_2\) were already present before the expansion.

Finally we extend the inequality by expanding the terms in level \(d\) using the inequality \(I((A_1\cup A_2) \cap B) \leq I(A_1\cap B)+I(A_2 \cap B)\). This provides the missing terms at level \(d\). This also extends the inequality in the right direction, \(\leq\) for odd \(d\) and \(\geq\) for even \(d\), owing to the factor \((-1)^{d-1}\) on these terms.

Contact

For comments or corrections please contact Ryan McCorvie at ryan@martingale.group