Probability with Martingales by David Williams

\[ \newcommand{\Q}{\mathbb Q} \newcommand{\R}{\mathbb R} \newcommand{\C}{\mathbb C} \newcommand{\Z}{\mathbb Z} \newcommand{\N}{\mathbb N} \newcommand{\abs}[1]{\lvert #1 \rvert} \newcommand{\norm}[1]{\lVert #1 \rVert} \newcommand{\Norm}[1]{\left \lVert #1 \right \rVert} \newcommand{\Abs}[1]{\left \lvert #1 \right \rvert} \newcommand{\pind}{\bot \!\!\! \bot} \newcommand{\probto}{\buildrel P\over \to} \newcommand{\vect}[1]{\boldsymbol #1} \DeclareMathOperator{\EE}{\mathbb E} \DeclareMathOperator{\PP}{\mathbb P} \DeclareMathOperator{\E}{E} \DeclareMathOperator{\dnorm}{\mathcal N} \DeclareMathOperator{\sgn}{sgn} \DeclareMathOperator{\Var}{Var} \DeclareMathOperator{\Cov}{Cov} \DeclareMathOperator{\Leb}{Leb} \DeclareMathOperator{\Bin}{Bin} \newcommand{\wto}{\buildrel w\over \to} \]

Martingales

Problem 10.1 Polya's urn

At time 0 an urn contains 1 black ball and 1 white ball. At each time \(t=1,2,\dots\) a ball is chosen at random from the urn and is replaced, and another ball of the same color is also added. Just after time \(n\) there are therefore \(n+2\) balls in the urn, of which \(B_n+1\) are black, where \(B_n\) is the number of black balls chosen by time \(n\).

Let \(M_n=(B_n+1)/(n+2)\) be the proportion of black balls just after time \(n\). Prove that \(M\) is a martingale.

Prove that \(\Pr(B_n=k) = \frac 1 {n+1}\) for \(0\leq k \leq n\). What is the distribution of \(\Theta = \lim M_n\)? Prove that for \(0<\theta<1\),

\[ \begin{equation} N^\theta_n = \frac{(n+1)!}{B_n! (n-B_n)!} \theta^{B_n}(1-\theta)^{n-B_n} \end{equation} \]

defines a martingale \(N^\theta\).

We consider \(M\) with respect to the filtration \(\mathcal F_n = \sigma(B_1,\dots,B_n)\). Since \(B_{n-1}\) is \(\mathcal F_{n-1}\)-measurable,

\[ \begin{equation} \begin{split} \E[M_n \mid \mathcal F_{n-1}] &= \Pr[B_n=B_{n-1} \mid \mathcal F_{n-1}] \frac{B_{n-1}+1} {n+2} + \Pr[B_n=B_{n-1}+1 \mid \mathcal F_{n-1}] \frac{B_{n-1}+2} {n+2} \\ &= \frac{n-B_{n-1}}{n+1} \cdot \frac{B_{n-1}+1} {n+2} + \frac{ B_{n-1}+1}{n+1} \cdot \frac{B_{n-1}+2} {n+2} \\ &= \frac{(n+2)(B_{n-1}+1)}{(n+2)(n+1)} = M_{n-1} \end{split} \end{equation} \]

Since \(0\leq M_n \leq 1\), each \(M_n\) is integrable, so this verifies that \(M\) is a martingale.

Now let \(t_i\) specify the draw on which the \(i\)th black ball is selected, and let \(s_j\) specify the draw on which the \(j\)th white ball is selected. That is, if \(n=7\), \(k=4\) and the balls drawn are \(bwwbbwb\), then \(t_1=1,t_2=4,t_3=5,t_4=7\) and \(s_1=2,s_2=3,s_3=6\). The length of each sequence is the total number of balls chosen of the respective color. Since every draw is either black or white, each index appears in exactly one of the two sequences, so \(\{t_i\} \cup \{s_j\} =\{1,2,\dots,n\}\). Let's compute the probability of a specific sequence of draws: just before draw \(t_i\) the urn contains \(i\) black balls among \(t_i+1\) in total, and just before draw \(s_j\) it contains \(j\) white balls among \(s_j+1\), so

\[ \begin{equation} P = \prod_{i=1}^k \frac{i}{t_i+1} \cdot \prod_{j=1}^{n-k} \frac{j}{s_j+1} = \frac{k! (n-k)!}{(n+1)!} = \frac{1}{(n+1)} \binom{n}{k}^{-1} \end{equation} \]

We've learned a remarkable fact: the probability doesn't depend on the order of the draws, just on the total number which are black. The sequence is exchangeable: any permutation of the outcomes results in an event with the same probability. There are \(\binom{n}{k}\) sequences with \(k\) black balls and \(n-k\) white ones, so

\[ \begin{equation} \Pr(B_n=k) = \binom{n}{k} \frac 1 {(n+1)} \binom{n}{k}^{-1} = \frac 1 {n+1} \label{eq:polya uncond} \end{equation} \]

Thus we've learned another remarkable fact: the distribution of \(B_n\) is uniform on \(\{0, 1, \dots, n\}\). Consequently \(M_n=(B_n+1)/(n+2)\) is uniform on \(\{\tfrac 1{n+2},\dots,\tfrac{n+1}{n+2}\}\), which converges in distribution to \(U(0,1)\), so \(\Theta = \lim_{n\to\infty} M_n\) has the uniform distribution \(U(0,1)\).
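
A quick way to sanity-check both facts is to simulate the urn. The sketch below (the value of \(n\) and the number of trials are arbitrary choices, not part of the problem) tallies the empirical distribution of \(B_n\), which should be close to uniform on \(\{0,\dots,n\}\); the corresponding \(M_n=(B_n+1)/(n+2)\) is then approximately \(U(0,1)\) for large \(n\).

```python
import random
from collections import Counter

def polya_run(n):
    """Simulate Polya's urn for n draws; return B_n, the number of black balls drawn."""
    black, total = 1, 2          # the urn starts with 1 black and 1 white ball
    b = 0
    for _ in range(n):
        if random.random() < black / total:   # a black ball is drawn ...
            black += 1                        # ... and another black is added
            b += 1
        total += 1                            # one ball is added either way
    return b

n, trials = 20, 100_000
counts = Counter(polya_run(n) for _ in range(trials))

# each frequency should be close to 1/(n+1) = 1/21 ~ 0.0476
for k in range(n + 1):
    print(k, counts[k] / trials)
```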

Turning to \(N^\theta_n\), let \(B\) be the event that a black ball is drawn on the \(n\)th draw, that is, \(B_n = B_{n-1}+1\), and let \(W=B^c\) be the event that a white ball is drawn.

\[ \begin{equation} \begin{split} \E[N^\theta_n \mid \mathcal F_{n-1}] = &\Pr(W \mid \mathcal F_{n-1}) \frac{(n+1)!}{B_{n-1}!(n-B_{n-1})!} \theta^{B_{n-1}}(1-\theta)^{n-B_{n-1}} \\ &\,\,\,+ \Pr(B \mid \mathcal F_{n-1}) \frac{(n+1)!}{(B_{n-1}+1)!(n-B_{n-1}-1)!} \theta^{B_{n-1}+1}(1-\theta)^{n-B_{n-1}-1} \\ =& \frac{n!}{B_{n-1}!(n-1-B_{n-1})!}\theta^{B_{n-1}}(1-\theta)^{n-1-B_{n-1}} \,\,\times\\ & \,\,\,\left[ \frac{(1-\theta)(n+1)}{n-B_{n-1}} \Pr(W\mid \mathcal F_{n-1})+ \frac{\theta(n+1)}{B_{n-1}+1} \Pr(B\mid \mathcal F_{n-1})\right] \end{split} \end{equation} \]

The factor in front of the bracket is \(N^\theta_{n-1}\). The expression inside the bracket is

\[ \begin{equation} (1-\theta)\frac {n+1}{n-B_{n-1}}\left(\frac{n-B_{n-1}}{n+1}\right) + \theta\frac{n+1}{B_{n-1}+1} \left( \frac{B_{n-1}+1}{n+1} \right) = 1-\theta + \theta = 1 \end{equation} \]

So the martingale property of \(N^\theta\) is proved.

Problem 10.2 Bellman's optimality principle

Your winnings per unit stake on game \(n\) are \(\epsilon_n\), where the \(\epsilon_n\) are i.i.d. RVs with distribution

\[ \begin{equation} \epsilon_n = \begin{cases} +1 & \text{with probability } p\\ -1 & \text{with probability } q \end{cases} \end{equation} \]

Your stake \(C_n\) on game \(n\) must lie between 0 and \(Z_{n-1}\), where \(Z_{n-1}\) is your fortune at time \(n-1\). Your object is to maximize the expected interest rate \(\E \log(Z_N/Z_0)\) where \(N\) is a given integer representing the number of times you will play and \(Z_0\), your initial wealth, is a given constant. Let \(\mathcal F_n = \sigma(\epsilon_1,\dots,\epsilon_n)\) be your history up to time \(n\). Show that if \(C\) is any previsible strategy, then \(\log Z_n - n\alpha\) is a supermartingale where \(\alpha\) denotes the entropy

\[ \begin{equation} \alpha = p \log p + q\log q + \log 2 \end{equation} \]

so that \(\E \log (Z_N/Z_0) \leq N\alpha\), but that, for a certain strategy, \(\log Z_n - n\alpha\) is a martingale. What is the best strategy?

First, a side calculation. For a discrete distribution \(p=(p_1,\dots,p_n)\) on \(\{1,2,\dots,n\}\), let's maximize \(\E \log X = \sum_i p_i \log x_i\) over random variables \(X\) with values \(X(i)=x_i>0\), subject to the constraint \(\sum_{i=1}^n x_i = K\). Using the method of Lagrange multipliers we maximize the expression

\[ \begin{equation} F = \sum_{i=1}^n p_i \log x_i - \lambda(\sum_{i=1}^n x_i - K) \end{equation} \]

over the \(n+1\) variables \(x_1,x_2,\dots,x_n, \lambda\). (Maximizing with respect to \(\lambda\) is just the same as enforcing the constraint.) The first order conditions are

\[ \begin{equation} \frac {\partial F} {\partial x_i} = 0 \qquad \Rightarrow \qquad \frac{p_i}{x_i} = \lambda \end{equation} \]

Summing over these equations gives \(1=\sum_i p_i = \lambda \sum_i x_i = \lambda K\). Thus \(x_i = K p_i\) is the extremal solution. This is a maximum, which can be seen from the second order conditions \(\frac {\partial^2 F} {\partial x_i^2} = -\frac {p_i}{x^2_i} = -\frac 1 {K^2 p_i} < 0\) (or by considering the degenerate case \(p_1=1\)). The maximum value is given by

\[ \begin{equation} \sum_i p_i \log x_i = \sum_i p_i (\log p_i + \log K) = -H(p) + \log K \end{equation} \]

where \(H(p) = -\sum_i p_i \log p_i\) is the entropy of the distribution.
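
As a quick numerical sanity check of this maximization (the distribution \(p\) and the budget \(K\) below are arbitrary choices), one can compare the objective \(\sum_i p_i\log x_i\) at the claimed maximizer \(x_i=Kp_i\) with its value at random points satisfying the same constraint:

```python
import math, random

p = [0.5, 0.3, 0.2]      # an arbitrary distribution
K = 4.0                  # an arbitrary budget: sum(x) = K

def objective(x):
    return sum(pi * math.log(xi) for pi, xi in zip(p, x))

best = [K * pi for pi in p]                  # claimed maximizer x_i = K * p_i
print("at K*p        :", objective(best))    # equals log K - H(p)

for _ in range(5):
    w = [random.random() for _ in p]
    s = sum(w)
    x = [K * wi / s for wi in w]             # a random feasible point
    print("at a random x :", objective(x))   # never exceeds the value above
```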

Returning to the gambling problem, we calculate

\[ \begin{equation} \log Z_n - \log Z_{n-1} = \log\left( \frac{Z_{n-1} + C_n \epsilon_n}{Z_{n-1}} \right) \end{equation} \]

Let \(f_n = C_n / Z_{n-1}\); for any admissible choice \(0\leq C_n \leq Z_{n-1}\) we get \(0\leq f_n \leq 1\), and \(\log Z_n - \log Z_{n-1} = \log(1+f_n\epsilon_n)\). Hence

\[ \begin{equation} \E[ \log Z_n - \log Z_{n-1} \mid \mathcal F_{n-1}] = p\log (1+f_n) + q \log(1-f_n) \leq p\log p + q\log q + \log 2 = \alpha \end{equation} \]

where the last inequality follows from our previous calculation applied with \(x_1=1+f_n\), \(x_2=1-f_n\) and \(K=2\). Since \(f_n\) is previsible, this shows \(\log Z_n - n\alpha\) is a supermartingale for any previsible strategy \(C\). Equality holds exactly when \(1+f_n = 2p\), i.e. for the strategy \(C_n = (2p-1)Z_{n-1} = (p-q) Z_{n-1}\) (admissible when \(p\geq q\)), and for this strategy \(\log Z_n - n\alpha\) is a martingale; this is the best strategy. The supermartingale property implies that \(\E[\log Z_N - N\alpha] \leq \E[\log Z_0]\) and thus \(\E[\log(Z_N/Z_0)] \leq N\alpha\).
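
A simulation of the betting game illustrates the bound (the values of \(p\), \(N\), the alternative fractions and the number of sample paths are my choices, not from the problem). Betting the fixed fraction \(f_n=p-q\) of the current fortune gives an average \(\log(Z_N/Z_0)\) close to \(N\alpha\), while other constant fractions fall short:

```python
import math, random

p = 0.6
q = 1 - p
alpha = p * math.log(p) + q * math.log(q) + math.log(2)
N, paths = 500, 1_000

def mean_log_growth(f):
    """Average of log(Z_N / Z_0) over many paths when betting the fixed fraction f each game."""
    total = 0.0
    for _ in range(paths):
        logz = 0.0
        for _ in range(N):
            eps = 1 if random.random() < p else -1
            logz += math.log(1 + f * eps)
        total += logz
    return total / paths

print("N * alpha            :", N * alpha)
print("Kelly fraction p - q :", mean_log_growth(p - q))  # ~ N * alpha
print("smaller fraction 0.05:", mean_log_growth(0.05))   # strictly less
print("larger fraction 0.5  :", mean_log_growth(0.5))    # strictly less
```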

Problem 10.3

Stopping times. Suppose \(S\) and \(T\) are stopping times relative to \((\Omega, \mathcal F, \{\mathcal F_n\})\). Prove that \(S\wedge T = \min(S,T)\), \(S\vee T = \max(S,T)\), and \(S+T\) are all stopping times.

First note that for any \(n\), \( \{T=n\} \in \mathcal F_n\): it equals \(\{T \leq n\} \setminus \{T\leq n-1\}\), and the former is in \(\mathcal F_n\) while the latter is in \(\mathcal F_{n-1} \subset \mathcal F_n\). Similarly \(\{T\geq n\} \in \mathcal F_n\), since \(\{T\geq n\} = \{T=n\} \cup \{ T>n\}\) and \(\{T>n\} = \{T\leq n\}^c \in \mathcal F_n\).

Turning to \(S\wedge T\): the event \(\{S\wedge T = n\}\) is the union of \(\{S=n\}\cap\{T\geq n\}\) and \(\{T=n\}\cap\{S\geq n\}\). We've already shown that each of these sets is in \(\mathcal F_n\), therefore their union is as well. The case of \(S\vee T\) is even simpler: \(\{S\vee T=n\}\) is the union of \(\{S=n\}\cap\{T\leq n\}\) and \(\{T=n\}\cap\{S\leq n\}\), and these events are all in \(\mathcal F_n\).

For \(S+T\), note that \(S\) and \(T\) each take values in \(\N\). Therefore if \(S+T=n\) for some \(n\in \N\), there are only finitely many possibilities with \(S=s\), \(T=t\) and \(s+t=n\) (namely \(s\in \{0,1,\dots,n\}\) and \(t=n-s\)). So \(\{S+T=n\} = \bigcup_{k=0}^n \{S=k\} \cap \{T=n-k\}\), and each of the sets in this expression is in \(\mathcal F_n\).

Problem 10.4

Let \(S\) and \(T\) be stopping times with \(S\leq T\). Define the process \(1_{(S,T]}\) with parameter set \(\N\) via

\[ \begin{equation} 1_{(S,T]}(n,\omega) = \begin{cases} 1 & \text{if } S(\omega) <n \leq T(\omega)\\ 0 & \text{otherwise} \end{cases} \end{equation} \]

Prove \(1_{(S,T]}\) is previsible and deduce that if \(X\) is a supermartingale then

\[ \begin{equation} \E(X_{T\wedge n}) \leq \E(X_{S\wedge n}), \qquad \forall n \end{equation} \]

Note

\[ \begin{equation} \{ 1_{(S,T]}(n, \omega) = 1 \} = \{S(\omega) \leq n-1\} \cap \{T(\omega) \geq n\} \end{equation} \]

The first event \(\{S\leq n-1\} \in \mathcal F_{n-1}\) because \(S\) is a stopping time. Similarly the second event \(\{T \geq n\} = \{T\leq n-1\}^c \in \mathcal F_{n-1}\) because \(T\) is a stopping time. Since \(1_{(S,T]}\) only takes the values 0 and 1, this proves that \(\omega \mapsto 1_{(S,T]}(n,\omega)\) is \(\mathcal F_{n-1}\) measurable. Hence the process \(1_{(S,T]}\) is previsible.

Thus for any supermartingale \(X\), the process \(Y = 1_{(S,T]} \bullet X\) is also a supermartingale (the integrand is previsible, bounded and non-negative), with value at time \(n\) given by

\[ \begin{equation} (1_{(S,T]} \bullet X)_n = X_{T\wedge n} - X_{S\wedge n} \end{equation} \]

Since \(Y_0 = 0\) by definition of the discrete stochastic integral, taking expectations of both sides and using the supermartingale property \(\E Y_n \leq \E Y_0\) gives the desired result

\[ \begin{equation} \E Y_n = \E X_{T\wedge n} - \E X_{S\wedge n} \leq \E Y_0 = 0 \end{equation} \]

Problem 10.5

Suppose that \(T\) is a stopping time such that for some \(N\in \N\) and some \(\epsilon>0\) we have, for every \(n\),

\[ \begin{equation} \Pr(T\leq n + N \mid \mathcal F_n) > \epsilon, \qquad \text{a.s.} \end{equation} \]

Prove by induction, using \(\Pr(T>kN) = \Pr(T>kN; T>(k-1)N)\), that for \(k=1,2,3,\dots\)

\[ \begin{equation} \Pr( T>kN) \leq (1-\epsilon)^k \end{equation} \]

Show that \(\E(T)<\infty\)

Let \(G_n = \{T>n\}\). Taking complements, our hypothesis implies

\[ \begin{equation} \Pr(G_{n+N} \mid \mathcal F_n) < 1-\epsilon, \qquad \text{a.s.} \end{equation} \]

Taking \(n=0\) and averaging over \(\Omega\), this implies \(\Pr(G_N) \leq 1-\epsilon\). For the inductive step, note that \(G_{N(k-1)} \in \mathcal F_{N(k-1)}\), so averaging the hypothesis (with \(n=N(k-1)\)) over this event gives \(\Pr(G_{N(k-1)+N} \mid G_{N(k-1)}) \leq 1-\epsilon\). Hence, using the inductive hypothesis \(\Pr(G_{N(k-1)})\leq(1-\epsilon)^{k-1}\), we get

\[ \begin{equation} \Pr(G_{Nk}) = \Pr(G_{N(k-1) + N} \mid G_{N(k-1)} ) \Pr( G_{N(k-1)})\leq (1-\epsilon)(1-\epsilon)^{k-1} \end{equation} \]

so the induction holds.

Let's derive an alternate expression for the expectation:

\[ \begin{equation} \E T = \sum_{n=1}^\infty n \Pr( T=n) = \sum_{n=1}^\infty \sum_{k=1}^n \Pr(T=n) = \sum_{k=1}^\infty \sum_{n=k}^\infty \Pr(T=n) = \sum_{k=1}^\infty \Pr( T\geq k) \end{equation} \]

Also, if \(m<n\) then \(G_m \supset G_n\), so \(\Pr(G_m) \geq \Pr( G_n)\). For \(n\) in the block \(Nk+1,\dots,N(k+1)\) we have \(\Pr(T\geq n) = \Pr(G_{n-1}) \leq \Pr(G_{Nk})\), so summing block by block,

\[ \begin{equation} \E T = \sum_{n=1}^\infty \Pr( T\geq n) \leq \sum_{k=0}^\infty N \Pr( T > Nk) \leq N\sum_{k=0}^\infty (1-\epsilon)^k = \frac N \epsilon \end{equation} \]

Clearly this is finite. Note that the hypotheses we used are almost-sure statements; however, we used only countably many of them, and a countable intersection of almost sure events is still almost sure, so the argument goes through. The result may be summarized: ‘‘so long as there's a chance something happens, no matter how small (provided the chance is bounded below by a positive number), then it will eventually happen; indeed the waiting time has finite mean.’’
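
To see the bound in action on a concrete stopping time (the example is my choice, not from the text), let \(T\) be the first time a fair coin has shown 5 heads in a row. The hypothesis holds with \(N=5\) and any \(\epsilon<2^{-5}\), so the bound gives \(\E T\le 5\cdot 2^5=160\); the exact value is \(2^6-2=62\).

```python
import random

def first_run_of_heads(run=5):
    """First time a fair coin shows `run` heads in a row."""
    t, streak = 0, 0
    while streak < run:
        t += 1
        streak = streak + 1 if random.random() < 0.5 else 0
    return t

trials = 100_000
avg = sum(first_run_of_heads() for _ in range(trials)) / trials
print("estimated E[T]:", avg)   # ~ 62, comfortably below the bound N/eps = 160
```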

Problem 10.6

ABRACADABRA. At each of times \(1,2,3,\dots\) a monkey types a capital letter at random, so the sequence typed is an i.i.d. sequence of RVs, each chosen uniformly among the 26 possible capital letters. Just before each time \(t=1,2,\dots\) a new gambler arrives. He bets $1 that

\[ \begin{equation} \text{the \(n\)th letter will be \(A\)} \end{equation} \]

If he loses he leaves. If he wins he bets his fortune of $26 that

\[ \begin{equation} \text{the \((n+1)\)st letter will be \(B\)} \end{equation} \]

If he loses he leaves. If he wins he bets his fortune of $\(26^2\) that

\[ \begin{equation} \text{the \((n+2)\)st letter will be \(R\)} \end{equation} \]

and so on through ABRACADABRA. Let \(T\) be the first time by which the monkey has produced the consecutive sequence ABRACADABRA. Show why

\[ \begin{equation} \E(T) = 26^{11} + 26^4 + 26 \label{eq:abra expected time} \end{equation} \]

Intuitively, the cumulative winnings of all the gamblers form a martingale, since they are the sum of a collection of martingales. At time \(T\) the cumulative winnings are \(-T\) for the initial stakes placed at each of the times \(1,\dots, T\), plus a payout of \(26^{11} + 26^4 + 26\): the gambler who bet on the first \(A\) of the final ABRACADABRA has won on the whole word, the gambler who bet on the second-to-last \(A\) has won on the terminal fragment ABRA, and the gambler who bet on the final \(A\) has won on \(A\). For the cumulative winnings to be a martingale the expected value must be 0, so \(\E T = 26^{11} + 26^4 + 26\).

More formally, let \(L_n\) be the \(n\)th randomly selected letter. The \(L_n\) are i.i.d., with uniform distribution over \(\{A,B,\dots,Z\}\). Write ABRACADABRA as the sequence of letters \(a_1 a_2 \dots a_{11}\), and for each \(j\) consider the process

\[ \begin{equation} M^{(j)}_n = \begin{cases} 1 & \text{if } n< j \\ 26^k & \text{if } n=j+k-1 \text{ for some } 1\leq k\leq 10 \text{ and } L_j = a_1, L_{j+1} = a_{2}, \dots, L_{j+k-1} = a_k \\ 26^{11} & \text{if } n\geq j+10 \text{ and } L_j = a_1, L_{j+1} = a_{2}, \dots, L_{j+10} = a_{11} \\ 0 & \text{otherwise} \end{cases} \end{equation} \]

This is bounded, and hence is in \(L^1\). \(M^{(j)}\) satisfies the martingale property since, from time \(n\) to \(n+1\), either no bet resolves and the value is constant, or the gambler stakes his current fortune \(M^{(j)}_n\) on the next letter, in which case

\[ \begin{equation} \E[ M^{(j)}_{n+1} \mid \mathcal F_n ] = \frac 1 {26} \cdot 26 M^{(j)}_n + \frac {25}{26} \cdot 0 = M^{(j)}_n \end{equation} \]

Now form the martingale \(N^{(j)} = I_{\{n\geq j\}}\bullet M^{(j)}\), where \(I_{\{n\geq j\}}\) is the indicator of \(\{n\geq j\}\) (a previsible process); explicitly, \(N^{(j)}_n = M^{(j)}_n - 1\) for \(n\geq j\) and \(N^{(j)}_n = 0\) for \(n<j\). Then define the martingale

\[ \begin{equation} B = \sum_{j\geq 1} N^{(j)} \end{equation} \]

At any time \(n\) only finitely many of the terms are non-zero. Each term is in \(L^1\), so \(B_n\) is also in \(L^1\). The martingale property holds since \(B\) is a sum of martingales. For \(n\in \N\), let \(S\) be the set of \(j\leq n\) such that \(M^{(j)}_n\) falls in the second case, matching some but not all of the letters, and let \(S'\) be the set of \(j\leq n\) such that \(M^{(j)}_n\) falls in the third case, matching all 11 letters. The value of \(B\) is given by

\[ \begin{equation} B_n = -n + \sum_{j\in S} 26^{n-j+1} + \sum_{j\in S'} 26^{11} \label{eq:abra martingale} \end{equation} \]

Let \(T\) be the first time some \(M^{(j)}\) enters the third case, that is, the first time the monkey has completed ABRACADABRA. It must be that \(\E T < \infty\) because \(T\) satisfies the property in 10.5 with \(N=11\) and \(\epsilon\) just under \(26^{-11}\): whatever has happened so far, the word is certainly completed within the next 11 letters if they spell ABRACADABRA in order. At the stopping time \(n=T\), the second and third terms of \(\eqref{eq:abra martingale}\) always take the same value: \(S = \{ n-3, n \}\), since the terminal fragments ABRA and A are matched by \(M^{(n-3)}\) and \(M^{(n)}\) respectively, and \(S'\) consists of the single element \(n-10\), since this is the first time ABRACADABRA has appeared. The increments of \(B\) are bounded and \(\E T<\infty\), so by Doob's optional stopping theorem

\[ \begin{equation} 0 = B_0 = \E B_T = -\E T + 26^4 + 26 + 26^{11} \end{equation} \]

This is the same as \(\eqref{eq:abra expected time}\)
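
The same gambling argument gives, for any target word, \(\E T=\sum_\ell A^{\ell}\) summed over the lengths \(\ell\) of the prefixes that are also suffixes, where \(A\) is the alphabet size (11, 4 and 1 for ABRACADABRA). Simulating the 26-letter problem directly is hopeless, but the identity is easy to check by Monte Carlo on a smaller example; below (the word and alphabet are my choices) the word is ABA over a 3-letter alphabet, for which the prediction is \(3^3+3=30\).

```python
import random

alphabet = "ABC"
word = "ABA"

def time_to_word():
    """Type uniformly random letters until `word` appears; return the number typed."""
    recent, t = "", 0
    while recent != word:
        t += 1
        recent = (recent + random.choice(alphabet))[-len(word):]
    return t

trials = 200_000
avg = sum(time_to_word() for _ in range(trials)) / trials
print("estimated E[T]:", avg)   # prediction: 3**3 + 3 = 30
```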

Problem 10.7

Gambler's Ruin. Suppose that \(X_1,X_2,\dots\) are i.i.d. RV's with

\[ \begin{equation} \Pr[X=+1] = p, \qquad \Pr[X=-1] = q, \qquad \text{where } 0<p=1-q<1 \end{equation} \]

and \(p\neq q\). Suppose that \(a\) and \(b\) are integers with \(0<a<b\). Define

\[ \begin{equation} S_n = a+X_1 +\dots +X_n, \qquad T = \inf\{n : S_n = 0 \text{ or } S_n=b\} \end{equation} \]

Let \(\mathcal F_0 = \{\emptyset, \Omega\}\) and \(\mathcal F_n = \sigma(X_1,\dots,X_n)\). Explain why \(T\) satisfies the condition in 10.5. Prove that

\[ \begin{equation} M_n = \left( \frac q p \right)^{S_n} \qquad \text{and}\qquad N_n = S_n - n(p-q) \end{equation} \]

define martingales \(M\) and \(N\). Deduce the values of \(\Pr(S_T = 0)\) and \(\E T\).

\[ \begin{align} \E[ M_n \mid \mathcal F_{n-1}] &= p \left( \frac q p \right)^{S_{n-1}+1}+ q\left( \frac q p \right)^{S_{n-1}-1} = (q+p) \left( \frac q p \right)^{S_{n-1}} = M_{n-1} \\ \E[N_n\mid \mathcal F_{n-1}] &= p(S_{n-1}+1 - n(p-q)) + q(S_{n-1}-1 - n(p-q)) \\ &= (p+q)S_{n-1} +p-q -n(p-q) = N_{n-1} \end{align} \]

Now the condition described in 10.5 holds with \(N=b-1\) and any \(\epsilon < p^{b-1}\). For any \(n\), suppose that \(X_{n+1}=X_{n+2}=\dots = X_{n+b-1} = +1\). On the event \(\{T>n\}\) we have \(S_n \in (0,b)\), so \(S_{n+b-1} = S_n+b-1 \geq b\); hence \(S_{n+k}=b\) for some \(k\in \{1,2,\dots,b-1\}\), and in particular \(T \leq n+b-1\). (On \(\{T\leq n\}\) there is nothing to prove.) Since \(\Pr(X_{n+1}=X_{n+2}=\dots = X_{n+b-1} = +1) = p^{b-1} > 0\) and this event is independent of \(\mathcal F_n\),

\[ \begin{equation} \Pr( T \leq n+b-1 \mid \mathcal F_n ) \geq p^{b-1} \end{equation} \]

and the condition is satisfied. We conclude that \(\E T < \infty\).

Let \(\pi = \Pr(S_T = 0) = 1-\Pr(S_T=b)\). Since \(S_{T\wedge n}\in[0,b]\), the stopped martingale \(M_{T\wedge n}\) is bounded and \(T<\infty\) a.s.; also \(N\) has bounded increments and \(\E T<\infty\). So Doob's optional-stopping theorem applies to both. For \(M\),

\[ \begin{equation} \left( \frac q p \right)^a = M_0 = \E M_T = \pi \cdot 1 + (1-\pi) \cdot \left( \frac q p \right)^b \end{equation} \]

or

\[ \begin{equation} \Pr(S_T = 0) = \pi = \frac{ (q/p)^a - (q/p)^b}{1-(q/p)^b} \qquad \Pr(S_T = b) = 1-\pi = \frac{ 1- (q/p)^a }{1-(q/p)^b} \end{equation} \]

Also

\[ \begin{equation} a = N_0 = \E N_T = \E S_T - (p-q)\E T = \pi \cdot 0 + (1-\pi) \cdot b - (p-q)\E T \end{equation} \]

or

\[ \begin{equation} \E T = \frac{(1-\pi)b-a} {p-q} \end{equation} \]
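
Both closed forms are easy to check by simulation (the parameter values below are arbitrary choices):

```python
import random

p, a, b = 0.6, 3, 10
q = 1 - p

def run():
    """Walk from a until hitting 0 or b; return (hit zero?, number of steps)."""
    s, steps = a, 0
    while 0 < s < b:
        s += 1 if random.random() < p else -1
        steps += 1
    return s == 0, steps

trials = 100_000
results = [run() for _ in range(trials)]
pi_hat = sum(hit for hit, _ in results) / trials
et_hat = sum(steps for _, steps in results) / trials

r = q / p
pi = (r**a - r**b) / (1 - r**b)
print("P(S_T = 0): simulated", pi_hat, " formula", pi)
print("E[T]      : simulated", et_hat, " formula", ((1 - pi) * b - a) / (p - q))
```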

Problem 10.8

A random number \(\Theta\) is chosen uniformly in \([0,1]\) and a coin with probability \(\Theta\) of heads is minted. The coin is tossed repeatedly. Let \(B_n\) be the number of heads in \(n\) tosses. Prove that \(B_n\) has exactly the same probabilistic structure as the \((B_n)\) sequence in 10.1 on Polya's urn. Prove that \(N^\theta_n\) is the regular conditional pdf of \(\Theta\) given \(B_1,\dots, B_n\)

Consider the beta function \(B(m+1,n+1) = \int_0^1 u^m(1-u)^{n} \, du\). Note \(B(m,n)=B(n,m)\) by a change of variables \(v=1-u\). Thus, without loss of generality, assume \(m\geq n\)

\[ \begin{equation} \begin{split} B(m+1,n+1) &= \int_0^1 u^m(1-u)^{n} \, du \\ &= \left. \frac 1 {m+1} u^{m+1} (1-u)^{n} \right|_0^1 + \frac{n}{m+1} \int_0^1 u^{m+1} (1-u)^{n-1}\, du \\ &= \frac{n}{m+1} B(m+2,n) = \left( \prod_{k=1}^{n} \frac{n+1-k}{m+k} \right) B(m+n+1,1) \\ &= \frac{ n! m! }{(m+n)!} \, \frac 1 {m+n+1} = \frac{n! m!}{(n+m+1)!}\label{eq:beta factorial} \end{split} \end{equation} \]

Now conditional on \(\Theta = \theta\), \(B_n\sim \Bin(n,\theta)\). Thus we can marginalize over \(\theta\) to find

\[ \begin{equation} \begin{split} \Pr( B_n = k ) &= \int_0^1 \Pr(B_n=k\mid \Theta=\theta)\, d\theta = \int_0^1 \binom{n}{k}\theta^k(1-\theta)^{n-k}\, d\theta \\ &= \binom{n}{k} B(k+1,n-k+1) = \frac 1 {n+1} \end{split} \end{equation} \]

Compare this with \(\eqref{eq:polya uncond}\). Now

\[ \begin{equation} \begin{split} \Pr( B_n = k; B_{n-1}=k-1) &= \int_0^1 \Pr( B_n = k; B_{n-1}=k-1\mid \Theta = \theta)\, d\theta \\ &= \int_0^1 \theta \cdot \binom{n-1}{k-1}\theta^{k-1}(1-\theta)^{n-k}\, d\theta \\ &= \binom{n-1}{k-1} B(k+1,n-k+1) = \frac k {n(n+1)} \end{split} \end{equation} \]

We've used the fact that \(\Pr(B_n=k \mid B_{n-1}=k-1, \Theta = \theta) = \theta\), since conditional on \(\Theta\), the tosses are independent with success probability \(\theta\). Therefore, using Bayes' rule,

\[ \begin{equation} \Pr( B_n = k \mid B_{n-1}=k-1) = \frac{\Pr(B_n=k;B_{n-1}=k-1)}{\Pr(B_{n-1}=k-1)} = \frac{k/(n(n+1))}{1/n} = \frac k {n+1} \end{equation} \]

This is exactly the transition rule for Polya's urn: given \(B_{n-1}=k-1\), the urn contains \(k\) black balls among \(n+1\) just before the \(n\)th draw, so a black ball is drawn with probability \(k/(n+1)\).

Let's compute the regular conditional pdf of \(\Theta\). Let \(B^{(n)}=(B_1,B_2, \dots, B_n)\) be the vector of values of \(B_k\) for \(k=1,\dots,n\) and let \(b=(b_1,\dots,b_n)\) be a realization (with the convention \(b_0=0\)). Now

\[ \begin{equation} \Pr( B^{(n)} = b \mid \Theta = \theta) = \prod_{k=1}^n \theta^{I(b_k = b_{k-1}+1)} (1-\theta)^{I(b_k=b_{k-1})} \end{equation} \]

That is, we pick up a factor of \(\theta\) whenever \(b_k\) increases (we flipped a head) and a factor of \(1-\theta\) whenever \(b_k\) stays the same (we flipped a tail). Clearly this simplifies to

\[ \begin{equation} \Pr( B^{(n)} = b \mid \Theta = \theta) = \theta^{B_n} (1-\theta)^{n-B_n} \end{equation} \]

Marginalizing over \(\Theta\), we compute

\[ \begin{equation} \Pr(B^{(n)}=b) = \int_0^1 \theta^{B_n}(1-\theta)^{n-B_n} \, d\theta= \frac{ B_n! (n-B_n)!}{(n+1)!} \end{equation} \]

Now by Bayes' rule

\[ \begin{equation} \begin{split} \Pr( \Theta \in d\theta \mid B^{(n)}=b ) &= \frac{\Pr(B^{(n)}=b \mid \Theta=\theta) \Pr( \Theta \in d\theta)}{\Pr(B^{(n)}=b)}\\ &= \frac{(n+1)!}{B_n! (n-B_n)!} \, \theta^{B_n} (1-\theta)^{n-B_n} \,d\theta \\ &= N^\theta_n \, d\theta \end{split} \end{equation} \]

This proves the last assertion in the problem.
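
A simulation makes the equivalence concrete (sample sizes are arbitrary choices): draw \(\Theta\sim U(0,1)\), toss the \(\Theta\)-coin \(n\) times, then compare the distribution of \(B_n\) with the uniform law and the conditional mean of \(\Theta\) given \(B_n=k\) with the \(\mathrm{Beta}(k+1,n-k+1)\) mean \((k+1)/(n+2)\) implied by the density above.

```python
import random
from collections import Counter, defaultdict

n, trials = 10, 200_000
counts = Counter()
theta_given_k = defaultdict(list)

for _ in range(trials):
    theta = random.random()                                # Theta ~ U(0,1)
    heads = sum(random.random() < theta for _ in range(n))
    counts[heads] += 1
    theta_given_k[heads].append(theta)

# B_n should be uniform on {0,...,n}: each frequency ~ 1/(n+1) ~ 0.0909
print([round(counts[k] / trials, 4) for k in range(n + 1)])

# posterior mean of Theta given B_n = k should be ~ (k+1)/(n+2)
for k in range(n + 1):
    sample = theta_given_k[k]
    print(k, round(sum(sample) / len(sample), 3), round((k + 1) / (n + 2), 3))
```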

Problem 10.9

Show that if \(X\) is a non-negative supermartingale and \(T\) is a stopping time then

\[ \begin{equation} \E(X_T;T<\infty) \leq \E X_0 \label{eq:fatou doob} \end{equation} \]

Deduce that

\[ \begin{equation} \Pr( \sup X_n \geq c )\leq \frac {\E X_0} c \label{eq:maximal markov} \end{equation} \]

Let \(Z_n = X_n \,I_{\{T<\infty\}}\). Then as \(n\to \infty\), \(Z_{T\wedge n} \to X_T I_{\{T<\infty\}}\): if \(T(\omega) = N<\infty\) then \(Z_{T\wedge n} = X_T\) for all \(n\geq N\), while if \(T(\omega) = \infty\) then \(Z_{T\wedge n}=0\) for all \(n\). Since \(X_n\geq 0\), by Fatou's lemma

\[ \begin{equation} \E[ X_T; {T<\infty}] = \E \lim Z_{T\wedge n} = \E \liminf Z_{T\wedge n} \leq \liminf \E Z_{T\wedge n} \end{equation} \]

so, because \(Z_{T\wedge n}\leq X_{T\wedge n}\) (as \(X\geq 0\)) and because the stopped supermartingale satisfies \(\E X_{T\wedge n}\leq \E X_0\),

\[ \begin{equation} \begin{split} \E[ X_T; {T<\infty}] - \E[ X_0] &\leq \liminf \E Z_{T\wedge n}-\E X_0 \\ &\leq \liminf_{n\to \infty} \E X_{T\wedge n}-\E X_0 \\ &\leq 0 \end{split} \end{equation} \]

This gives us \(\eqref{eq:fatou doob}\).

Now consider the stopping time \(T = \inf\{ n : X_n \geq c \}\). From the definition of \(T\), on the event \(\{T<\infty\}\) we have \(X_T \geq c\), and \(\{T<\infty\}\) is exactly the event \(\{X_n\geq c \text{ for some } n\}\). Hence

\[ \begin{equation} c \Pr( X_n \geq c \text{ for some } n ) = c \Pr( T<\infty ) \leq \E[ X_T I_{\{T<\infty\}}] \leq \E X_0 \end{equation} \]

Applying this with \(c\) replaced by \(c-\delta\) and letting \(\delta \downarrow 0\) gives \(\eqref{eq:maximal markov}\), since \(\{\sup_n X_n\geq c\} \subset \{X_n \geq c-\delta \text{ for some } n\}\) for every \(\delta>0\). Note this is a maximal version of Markov's inequality.
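
Here is a small numerical illustration of \(\eqref{eq:maximal markov}\) (the particular supermartingale and the horizon are my choices): take the non-negative martingale \(X_n=\prod_{k\le n}U_k\) with \(U_k\) i.i.d. uniform on \((0,2)\) and \(X_0=1\), so the bound says \(\Pr(\sup_n X_n\ge c)\le 1/c\).

```python
import random

def running_sup(steps=200):
    """Running supremum of X_n = U_1*...*U_n with U_i ~ U(0,2); X_0 = 1.

    The product drifts to 0 (E[log U] < 0), so 200 steps capture essentially all of the sup.
    """
    x, sup = 1.0, 1.0
    for _ in range(steps):
        x *= random.uniform(0.0, 2.0)
        sup = max(sup, x)
    return sup

trials = 50_000
sups = [running_sup() for _ in range(trials)]
for c in (2, 5, 10, 20):
    frac = sum(s >= c for s in sups) / trials
    print(f"P(sup X_n >= {c}) ~ {frac:.4f}   bound 1/c = {1 / c:.4f}")
```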

Problem 10.10 The ‘‘Starship Enterprise’’ problem

The control system on the star-ship Enterprise has gone wonky. All that one can do is to set a distance to be travelled. The spaceship will then move that distance in a randomly chosen direction, then stop. The object is to get into the Solar System, a ball of radius \(\rho\). Initially, the Enterprise is at a distance \(R_0 > \rho\) from the Sun.

Show that for all ways to set the jump lengths \(r_n\)

\[ \begin{equation} \Pr(\text{ the Enterprise returns to the solar system}) \leq \frac \rho {R_0} \end{equation} \]

and find a strategy of jump lengths \(r_n\) so that

\[ \begin{equation} \Pr(\text{the Enterprise returns to the solar system}) \geq \frac \rho {R_0} -\epsilon \end{equation} \]

Let \(R_n\) be the distance from the Sun after \(n\) space-hops. Now \(f(\vect x) = \frac 1 {\abs{\vect x}}\) is a harmonic function, satisfying \(\nabla^2 f = \sum_i \frac {\partial^2 f}{\partial x_i^2} = 0\) for \(\vect x\neq 0\). Let \(r=\abs{\vect x}\), let \(S(\vect x, r)\) be the 2-sphere of radius \(r\) centered at \(\vect x\), write \(S(r) = S(0,r)\), and let \(d\sigma\) be the surface area element. Similarly let \(B(\vect x, r)\) be the ball of radius \(r\) centered at \(\vect x\).

\[ \begin{equation} \frac {\partial f} {\partial x_i} = -\frac {x_i} {r^3} \qquad \qquad \frac {\partial^2 f} {\partial x_i^2} = -\frac 1 {r^3} + \frac {3 x_i^2} {r^5} \qquad \qquad \nabla^2 f = -\frac 3 {r^3} + \frac {3r^2}{r^5} = 0 \label{eq:3d fundamental} \end{equation} \]

Any harmonic function satisfies the mean-value property. Let

\[ \begin{equation} A(\vect x,r) = \frac 1 {4\pi r^2} \iint_{S(\vect x, r)} f\, d\sigma = \frac 1 {4\pi} \iint_{S(1)} f( \vect x + r \vect n)\, d\sigma_1 \end{equation} \]

In the above we applied a dilation to integrate over the unit sphere \(S(1)\), whose surface element \(d\sigma_1\) satisfies \(d\sigma = r^2\, d\sigma_1\). Then, differentiating under the integral sign with respect to \(r\),

\[ \begin{equation} \begin{split} \frac d {dr} A(\vect x,r) &=\frac 1 {4\pi} \iint_{S(1)} \nabla f( \vect x + r \vect n) \cdot \vect n \, d\sigma_1 = \frac 1 {4\pi r^2} \iint_{S(\vect x, r)} \nabla f \cdot \vect n \, d\sigma \\ &= \frac 1 {4\pi r^2} \iiint_{B(\vect x, r)} \nabla^2 f \, dV = 0 \label{eq:mean value} \end{split} \end{equation} \]

In the last equation we used Gauss’ law (the divergence theorem), which is valid as long as \(f\) is smooth throughout \(B(\vect x, r)\), i.e. as long as \(r<\abs{\vect x}\). By continuity \(A(\vect x,r) \to f(\vect x)\) as \(r\to 0\), so \(A(\vect x,r) = f(\vect x)\) for all \(r<\abs{\vect x}\).

The probability law of the Enterprise's position after a jump of size \(r\) from \(\vect x\) is the uniform law \(\frac {d\sigma}{4\pi r^2}\) on the sphere \(S(\vect x, r)\), so \(\E[ 1 / R_{n+1} \mid \mathcal F_n] = A(\vect x_n, r )\) with \(f(\vect x)=\abs{\vect x}^{-1}\), where \(\vect x_n\) is the position after \(n\) hops. Hence, if \(r < R_n\), by the above calculation

\[ \begin{equation} \E\left[ \frac 1 {R_{n+1}} \mid \mathcal F_n \right] = \frac 1 {R_n} \end{equation} \]

since \(\abs{\vect x}^{-1}\) is harmonic inside \(B(\vect x_n, r)\); therefore \(1/R_n\) is a martingale so long as the jumps satisfy \(r<R_n\).

Before considering the case \(r>R_n\) we need to perform a calculation. For any \(\epsilon>0\) and \(f(\vect x) = \abs{\vect x}^{-1}\)

\[ \begin{equation} \iint_{S(\epsilon)} \nabla f \cdot \vect n \, d\sigma = \iint_{S(\epsilon)} -\frac{ \vect x \cdot \vect x} {r^4 } \, d\sigma = -\frac{ 4 \pi \epsilon^2}{\epsilon^2} = -4\pi \end{equation} \]

since \(\nabla \abs{\vect x}^{-1} = -\vect x / r^3\) and \(\vect n = \vect x / r\). Remarkably, the integral does not depend on \(\epsilon\). Thus for \(r>R_{n}\), we can modify \(\eqref{eq:mean value}\) to integrate over \(B(\vect x, r) \setminus B(0,\epsilon)\), and then take \(\epsilon\to 0\). This gives the formula

\[ \begin{equation} \frac d {dr} A(\vect x, r) = \begin{cases} 0 & \text{if } r<\abs{x} \\ -\frac 1 {r^2} & \text{if } r\geq\abs{x} \\ \end{cases} \end{equation} \]

Thus this more complete analysis shows that \(1/R_n\) is a supermartingale whatever the jump lengths, with

\[ \begin{equation} \E\left[ \frac 1 {R_{n+1}} \mid \mathcal F_n \right] = \frac 1 {\max(r, R_n)} \leq \frac 1 {R_n} \end{equation} \]

By the positive supermartingale maximal inequality \(\eqref{eq:maximal markov}\)

\[ \begin{equation} \Pr( \inf_n R_n \leq \rho ) = \Pr\left( \sup_n \frac 1 {R_n} \geq \frac 1 \rho \right) \leq \frac \rho {R_0} \end{equation} \]

The probability on the left is the probability that the Enterprise ever gets into the Solar System, and the bound applies to any strategy \(r_n\) of jump-lengths.
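
The key identity \(\E[1/R_{n+1}\mid\mathcal F_n]=1/\max(r,R_n)\) is easy to verify for a single hop by Monte Carlo (the distances below are arbitrary choices); a uniformly random direction in 3D can be generated by normalizing a standard Gaussian vector.

```python
import math, random

def mean_inverse_distance(R, r, samples=200_000):
    """Monte Carlo estimate of E[ 1/|x + r*n| ] where |x| = R and n is uniform on the unit sphere."""
    total = 0.0
    for _ in range(samples):
        gx, gy, gz = random.gauss(0, 1), random.gauss(0, 1), random.gauss(0, 1)
        norm = math.sqrt(gx * gx + gy * gy + gz * gz)
        # start at (R, 0, 0) and hop a distance r in the random direction
        x, y, z = R + r * gx / norm, r * gy / norm, r * gz / norm
        total += 1.0 / math.sqrt(x * x + y * y + z * z)
    return total / samples

for R, r in [(2.0, 1.0), (2.0, 3.0), (2.0, 5.0)]:
    print(f"R={R} r={r}: {mean_inverse_distance(R, r):.4f}  vs  1/max(r,R) = {1 / max(R, r):.4f}")
```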

To get a lower bound, consider the strategy with constant jump sizes \(r_n = \epsilon\) for some \(0<\epsilon <\rho\), and choose \(\rho' > R_0\). Let \(\tau = \inf\{n : R_n \leq \rho \}\), \(\tau' = \inf\{n : R_n \geq \rho' \}\) and \(t = \tau \wedge \tau'\). First we show \(\Pr( t < \infty ) = 1\), that is, we hit one barrier or the other eventually. The \(x\)-coordinate of the Enterprise's location follows a random walk, since the \(x\)-projections of the jumps of length \(\epsilon\) are i.i.d. Let \(X_n\) be the \(x\)-coordinate at time \(n\). By the symmetry of each spherical jump, \(X_n\) is a martingale, so \(\E X_n = X_0\). Also, if \(J_n = X_n-X_{n-1}\), then \(\Var( J_n ) = \epsilon^2 \sigma^2_0\) for some constant \(\sigma_0\), since by the dilation symmetry of the spherical jump the variance scales as \(\epsilon^2\).

\[ \begin{equation} \Pr( R_n \geq \rho’ ) \geq \Pr( \abs{X_n} \geq \rho’ ) \geq \Pr( \abs{ X_n - X_0 } > 2\rho’) \end{equation} \]

However by the central limit theorem

\[ \begin{equation} \Pr( \abs{ X_n - X_0 } > 2\rho') \approx 2 \Phi\left( - \frac {2\rho'} {\epsilon \sigma_0 \sqrt{n}} \right) \to 1 \qquad \text{as } n \to \infty \end{equation} \]

We conclude

\[ \begin{equation} \Pr( t = \infty) \leq \Pr( \tau' = \infty) = \Pr( \sup_n R_n < \rho') \leq \inf_n \Pr( R_n < \rho' ) = 0 \end{equation} \]

Now let's consider the stopped martingale \(Z_{n} = 1/R_{t \wedge n}\). This is a martingale since our choice of \(\epsilon < \rho\) implies \(r_n=\epsilon < R_n\) for all \(n<t\). We can apply Doob's optional stopping theorem since

\[ \begin{equation} \frac 1 {\rho'+\epsilon} \leq Z_n \leq \frac 1 {\rho-\epsilon} \end{equation} \]

and therefore \(Z_n\) is a bounded, positive martingale. We can break the expectation up into two parts, when \(R_n\) hits the upper and lower barriers \(\rho'\) and \(\rho\).

\[ \begin{equation} \frac 1 {R_0} = \E[ Z_{t}] = \E[ Z_{t} ; t = \tau ] + \E[ Z_{t} ; t = \tau' ] \leq \frac{\Pr( t=\tau)}{\rho-\epsilon} + \frac{\Pr( t=\tau')}{\rho'} \end{equation} \]

where we've estimated the expectations by the simple upper bounds implied by

\[ \begin{equation} \begin{split} \rho-\epsilon \leq &R_t \leq \rho \qquad \qquad \text{if } t=\tau \\ \rho’ \leq &R_t \leq \rho'+\epsilon \qquad \text{if } t=\tau’ \end{split} \end{equation} \]

This gives the lower bound

\[ \begin{equation} \Pr( t= \tau ) \geq \frac{ \frac 1 {R_0} - \frac 1 {\rho’} }{ \frac 1 {\rho-\epsilon} - \frac 1 {\rho'}} \end{equation} \]

The event \(\{t = \tau\}\) is contained in the event that the Enterprise returns to the Solar System: it says that \(R_n \leq \rho\) occurs before \(R_n\) reaches \(\rho'\). Letting \(\rho'\to \infty\) in the bound above, we conclude

\[ \begin{equation} \Pr( \text{Enterprise returns} ) \geq \frac{\rho-\epsilon} {R_0} \end{equation} \]

Since \(\epsilon>0\) can be taken as small as we like, this exhibits, for any prescribed \(\epsilon>0\), a strategy whose probability of returning is at least \(\rho/R_0-\epsilon\), as required.
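
A direct simulation of the constant-\(\epsilon\) strategy between the two barriers (the parameter values are my choices) shows the probability of hitting \(\rho\) before \(\rho'\) sitting between the lower bound just derived and the upper bound \(\rho/R_0\):

```python
import math, random

rho, rho_prime, R0, eps = 1.0, 8.0, 2.0, 0.25

def hits_inner_barrier():
    """Hop a distance eps in a uniform 3D direction until the distance to the Sun leaves (rho, rho')."""
    x, y, z = R0, 0.0, 0.0
    while True:
        gx, gy, gz = random.gauss(0, 1), random.gauss(0, 1), random.gauss(0, 1)
        norm = math.sqrt(gx * gx + gy * gy + gz * gz)
        x, y, z = x + eps * gx / norm, y + eps * gy / norm, z + eps * gz / norm
        dist = math.sqrt(x * x + y * y + z * z)
        if dist <= rho:
            return True
        if dist >= rho_prime:
            return False

trials = 5_000
hit = sum(hits_inner_barrier() for _ in range(trials)) / trials
lower = (1 / R0 - 1 / rho_prime) / (1 / (rho - eps) - 1 / rho_prime)
print("simulated:", hit, " lower bound:", round(lower, 3), " upper bound:", rho / R0)
```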

Problem 10.11 Star Trek 2, Captain's Log

Mr Spock and Chief Engineer Scott have modified the control system so that the Enterprise is confined to move for ever in a fixed plane passing through the Sun. However, the next 'hop-length' is now automatically set to be the current distance to the Sun ('next' and 'current' being updated in the obvious way). Spock is muttering something about logarithms and random walks, but I wonder whether it is (almost) certain that we will get into the Solar System sometime.

Let \(f(\vect x) = \log \abs{\vect x} = \frac 1 2 \log( x_1^2 + x_2^2 )\).

\[ \begin{equation} \frac{ \partial f}{\partial x_i} = \frac {x_i} {r^2} \qquad \frac{ \partial^2 f}{\partial x_i^2} = \frac 1 {r^2} - \frac{ 2x_i^2}{r^4} \qquad \nabla^2 f = \frac 2 {r^2} - \frac{2 r^2}{r^4} = 0 \end{equation} \]

so this function is harmonic for \(x\neq 0\).

Thus \(\log R_n\) is a martingale, so \(X_n = \log R_n-\log R_{n-1}\) has mean zero. Note the law of \(R_n\) is uniquely determined by the value of \(R_{n-1}\). However, the law of \(X_n\) does not depend on the value of \(R_{n-1}\), because

\[ \begin{equation} X_n = \log R_n - \log R_{n-1} = \log \alpha R_n -\log \alpha R_{n-1} \qquad \text{for any } \alpha>0 \end{equation} \]

We can choose \(\alpha = 1/R_{n-1}\), reducing to the case \(R_{n-1}=1\); hence the conditional law of \(X_n\) given \(\mathcal F_{n-1}\) is always the same, and we conclude the sequence \(X_1,X_2,\dots\) is i.i.d. Furthermore, as we'll show later, \(\E X_n^2<\infty\), so the random variables have finite variance \(\sigma^2\). Consider

\[ \begin{equation} S_n = X_1+X_2 +\dots+X_n \end{equation} \]

For any real number \(\alpha>0\),

\[ \begin{equation} \Pr( \inf_n S_n = -\infty) \geq \Pr( S_n \leq -\alpha \sigma \sqrt{n}, \text{i.o}) \end{equation} \]

since if the event on the right-hand side happens, the event on the left-hand side certainly does: if \(S_n\) is less than \(-\alpha \sigma \sqrt{n}\) infinitely often, then \(S_n\) takes arbitrarily negative values.

Let \(E_k\) be a sequence of events. Then the event that \(E_k\) occur infinitely often is given by

\[ \begin{equation} E_{\text{i.o.}} = \bigcap_{n\in \N} \bigcup_{k\geq n} E_k \end{equation} \]

This implies that

\[ \begin{equation} \Pr(E_{\text{i.o}}) = \lim_{n\to\infty} \Pr( \bigcup_{k\geq n} E_k) = \limsup_{n\to \infty} \Pr( \bigcup_{k\geq n} E_k) \end{equation} \]

The limit exists because the terms are non-increasing, and it equals the \(\limsup\). Since \(\Pr( \bigcup_{k\geq n} E_k) \geq \Pr( E_n )\), a lower bound is given by

\[ \begin{equation} \Pr(E_{\text{i.o}}) \geq \limsup_{n\to \infty} \Pr( E_n ) \end{equation} \]

By the central limit theorem

\[ \begin{equation} \lim_{n\to \infty} \Pr( S_n / \sqrt n \leq -\alpha \sigma ) = \Phi(-\alpha) > 0 \end{equation} \]

and therefore \(\Pr( \inf_n S_n = -\infty) > 0\). This implies that \(\Pr( \inf_n S_n = -\infty ) = 1\) by Kolmogorov's 0-1 law, since (as we show below) this event is in the tail \(\sigma\)-field. Since \(S_n = \log(R_n/R_0)\), it follows that \(\inf_n R_n = 0\) almost surely; in particular the Enterprise is almost certain to get into the Solar System at some time.

There are two loose ends to tie up. First we demonstrate that the event \(E = \{ \inf_n S_n = -\infty \}\) is in the tail field. On \(E\) we may select a subsequence \(n_k\) such that \(S_{n_k} \leq -k\) for every \(k\). Now replace \(X_1, \dots, X_m\) by arbitrary values \(X’_1, \dots, X’_m\), and let

\[ \begin{equation} S’_n = \begin{cases} X’_1 + X’_2 + \dots + X’_n & \text{if } n\leq m \\ X’_1 + X’_2 + \dots + X’_m + X_{m+1} + \dots + X_n & \text{if } n>m \end{cases} \end{equation} \]

If \(D = \sum_{i=1}^m (X’_i - X_i)\), then for \(n>m\), \(S’_n = S_n + D\). From this it is clear that \(S’_{n_k} \leq -k + D \to -\infty\), so \(\inf_n S’_n = -\infty\) as well. Therefore membership of \(E\) does not depend on the values of \(X_1,\dots, X_m\), so \(E \in \sigma(X_{m+1}, X_{m+2}, \dots)\). Since this is true for any \(m\), \(E\) is in the tail field.

Finally we show that \(\sigma^2 = \Var(X_i) < \infty\); it suffices to show \(\E X_i^2 < \infty\). Without loss of generality, assume \(R_{n-1}=1\), and let's compute the pdf of \(R=R_n\). Let \(\Theta\) be the angle of the direction the Enterprise travels, with \(\Theta=0\) corresponding to the direction toward the Sun. By hypothesis \(\Theta \sim U(0,2\pi )\). Now \(R\) is the base of an isosceles triangle with two sides of length 1 and included angle \(\Theta\), so

\[ \begin{equation} R = 2 \sin \frac \Theta 2 \qquad \Rightarrow \qquad p_R(r) = \frac 2 {\pi\sqrt{4-r^2}} \end{equation} \]

where \(p_R(r)\) is the pdf of \(R\). Thus we want to compute

\[ \begin{equation} \E (\log R)^2 = \int_0^2 \frac {2(\log r)^2}{\pi\sqrt{4-r^2}} \, dr = \int_{-\infty}^{\log 2} \frac{2u^2 e^u}{\pi\sqrt{4-e^{2u}}} \, du \end{equation} \]

The left tail, where \(r\) is close to zero, corresponds to the transformed integral with \(u\) close to \(-\infty\). There the integrand behaves like a constant times \(u^2 e^u\), so this part of the integral resembles the tail of \(\Gamma(3) = \int_0^\infty u^2 e^{-u}\,du\) and is therefore finite. On the right tail, as \(r\to 2\), the integrand resembles a constant times \(1/\sqrt{4-r^2}\), which has antiderivative \(\arcsin (r/2)\), and therefore has a finite integral on \([2-\epsilon, 2]\). On the intermediate range \([\epsilon, 2-\epsilon]\) the integrand is continuous and hence bounded, so the integral is finite there too. Thus \(\E (\log R)^2 < \infty\).
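
Both facts used above are easy to confirm numerically (a quick Monte Carlo sketch; the sample size is arbitrary): with \(R=2\sin(\Theta/2)\) and \(\Theta\sim U(0,2\pi)\), the sample mean of \(\log R\) should be near 0 (confirming that \(\log R_n\) is a martingale), and the sample mean of \((\log R)^2\) should settle to a finite value (numerically about \(0.82\)).

```python
import math, random

samples = 1_000_000
m1 = m2 = 0.0
for _ in range(samples):
    theta = 2.0 * math.pi * (1.0 - random.random())   # Theta in (0, 2*pi]
    logr = math.log(2.0 * math.sin(theta / 2.0))      # log R for a unit starting distance
    m1 += logr
    m2 += logr * logr

print("E[log R]     ~", m1 / samples)   # ~ 0
print("E[(log R)^2] ~", m2 / samples)   # finite (~ 0.82); this is sigma^2
```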

Beta-Gamma Identity

Show \(\Gamma(x)\Gamma(y) = B(x,y) \Gamma(x+y)\)

\[ \begin{equation} \begin{split} \Gamma(x)\Gamma(y) &= \int_0^\infty s^{x-1} e^{-s}\, ds \,\, \int_0^\infty t^{y-1} e^{-t}\, dt \\ &= \int_0^\infty \int_0^\infty s^{x-1} t^{y-1} e^{-s-t} \, ds\,dt \\ &= \int_0^\infty \int_0^1 (uv)^{x-1} ((1-u)v)^{y-1} e^{-v} \,v \,du\,dv \\ &= \int_0^1 u^{x-1}(1-u)^{y-1} \, du \int_0^\infty v^{x+y-1} e^{-v}\,dv\\ &= B(x,y) \Gamma(x+y) \end{split} \end{equation} \]

where we made the substitution \(u=\frac s {s+t}, v=s+t\) (so \(s=uv\), \(t=(1-u)v\) and \(ds\,dt = v\,du\,dv\)), with Jacobian

\[ \begin{equation} \frac {\partial(u,v)}{\partial(s,t)} = \begin{vmatrix} t/ (s+t)^2 & -s/ (s+t)^2 \\ 1 & 1 \end{vmatrix} = \frac 1 {s+t} = \frac 1 v \end{equation} \]
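
A quick numerical check of the identity (the test points and the quadrature resolution are arbitrary choices), approximating \(B(x,y)\) by a midpoint Riemann sum:

```python
import math

def beta(x, y, steps=200_000):
    """Midpoint-rule approximation of B(x, y) = integral_0^1 u^(x-1) (1-u)^(y-1) du."""
    h = 1.0 / steps
    return sum(((i + 0.5) * h) ** (x - 1) * (1 - (i + 0.5) * h) ** (y - 1) for i in range(steps)) * h

for x, y in [(2.0, 3.0), (0.5, 0.5), (1.5, 2.5)]:
    lhs = math.gamma(x) * math.gamma(y)
    rhs = beta(x, y) * math.gamma(x + y)
    print(f"x={x} y={y}:  Gamma(x)Gamma(y) = {lhs:.6f}   B(x,y)Gamma(x+y) = {rhs:.6f}")
```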

Contact

For comments or corrections please contact Ryan McCorvie at ryan@martingale.group