Martingale (probability theory)

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by 171.65.18.151 (talk) at 19:26, 7 November 2006 (Continuous time equation didn't seem consistent with the text, so edited the equation to be E[x_t | x_s, s < t] instead of E[x_t | x_r, r < s]). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

For a less technical analysis of the betting system see martingale (roulette system).
A stopped Brownian motion as an example of a martingale

In probability theory, a (discrete-time) martingale is a discrete-time stochastic process (i.e., a sequence of random variables) X1, X2, X3, ... that satisfies the identity

    E(Xn+1 | X1, ..., Xn) = Xn,

i.e., the conditional expected value of the next observation, given all of the past observations, is equal to the last observation. As is frequent in probability theory, the term was adopted from the language of gambling.

Somewhat more generally, a sequence Y1, Y2, Y3, ... is said to be a martingale with respect to another sequence X1, X2, X3, ... if

    E(Yn+1 | X1, ..., Xn) = Yn

for every n.

Similarly, a continuous-time martingale is a stochastic process Xt such that

    E(Xt | {Xr : r ≤ s}) = Xs

for all s ≤ t. Namely, the conditional expectation of the observation at time t, given all the observations up to time s, is equal to the observation at time s (of course, provided that s ≤ t).
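As a concrete check of the defining identity, consider a process that steps up or down by 1 with equal probability (the fair-coin gambler of the examples below); the conditional expectation of the next value can be computed directly. A minimal sketch, with illustrative names:

```python
# For a process that moves +1 or -1 with probability 1/2 each from its
# current value x, the conditional expectation of the next value is
#   E(X_{n+1} | X_n = x) = 0.5*(x + 1) + 0.5*(x - 1) = x,
# which is exactly the martingale identity.
def expected_next(x):
    return 0.5 * (x + 1) + 0.5 * (x - 1)

print(all(expected_next(x) == x for x in range(-10, 11)))  # True
```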

History

Originally, martingale referred to a class of betting strategies popular in 18th-century France. The simplest of these strategies was designed for a game in which the gambler wins his stake if a coin comes up heads and loses it if the coin comes up tails. The strategy had the gambler double his bet after every loss, so that the first win would recover all previous losses plus a profit equal to the original stake. Since a gambler with unlimited wealth and time will eventually flip heads with probability 1, the martingale betting strategy was seen as a sure thing by those who practiced it. In reality, of course, the exponential growth of the bets would eventually bankrupt anyone foolish enough to use the martingale for long.
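The doubling strategy and its failure mode can be illustrated with a small Monte Carlo sketch (function and parameter names are illustrative; the casino is modeled only through the gambler's finite bankroll). The average final fortune stays at the starting bankroll: the rare ruin exactly cancels the frequent small wins.

```python
import random

def martingale_strategy(bankroll, base_bet, max_rounds, rng):
    """Play the double-after-loss strategy on fair coin flips until the
    gambler wins once, cannot cover the next bet, or runs out of time."""
    bet = base_bet
    for _ in range(max_rounds):
        if bet > bankroll:          # cannot cover the doubled bet: ruined
            return bankroll
        if rng.random() < 0.5:      # heads: win the current stake
            return bankroll + bet   # net profit is exactly base_bet
        bankroll -= bet             # tails: lose, then double the bet
        bet *= 2
    return bankroll

rng = random.Random(0)
results = [martingale_strategy(1000, 1, 20, rng) for _ in range(100_000)]
avg = sum(results) / len(results)
print(round(avg, 2))  # stays near the starting bankroll of 1000
```

With a $1000 bankroll the gambler either walks away with $1001 or, after nine straight losses, is left with $489 and unable to cover the next $512 bet; the expectation works out to exactly $1000.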

The concept of martingale in probability theory was introduced by Paul Pierre Lévy, and much of the original development of the theory was done by Joseph Leo Doob. Part of the motivation for that work was to show the impossibility of successful betting strategies.

Examples of martingales

  • Suppose Xn is a gambler's fortune after n tosses of a "fair" coin, where the gambler wins $1 if the coin comes up heads and loses $1 if the coin comes up tails. The gambler's conditional expected fortune after the next trial, given the history, is equal to his present fortune, so this sequence is a martingale.
  • Let Yn = Xn2 − n, where Xn is the gambler's fortune from the preceding example. Then the sequence { Yn : n = 1, 2, 3, ... } is a martingale. This can be used to show that the gambler's total gain or loss grows roughly as the square root of the number of steps.
  • (de Moivre's martingale) Now suppose an "unfair" or "biased" coin, with probability p of "heads" and probability q = 1 − p of "tails". Let

    Xn+1 = Xn ± 1,

with "+" in case of "heads" and "−" in case of "tails". Let

    Yn = (q/p)^Xn.

Then { Yn : n = 1, 2, 3, ... } is a martingale with respect to { Xn : n = 1, 2, 3, ... }.
  • Let A be a fixed event, and let Yn = P(A | X1, ..., Xn). Then { Yn : n = 1, 2, 3, ... } is a martingale with respect to { Xn : n = 1, 2, 3, ... }.
  • (Polya's urn) An urn initially contains r red and b blue marbles. One is chosen randomly. Then it is put back together with another one of the same colour. Let Xn be the number of red marbles in the urn after n iterations of this procedure, and let Yn = Xn/(n+r+b). Then the sequence { Yn : n = 1, 2, 3, ... } is a martingale.
  • (Likelihood-ratio testing in statistics) A population is thought to be distributed according to either a probability density f or another probability density g. A random sample is taken, the data being X1, ..., Xn. Let Yn be the "likelihood ratio"

    Yn = ( g(X1) g(X2) ··· g(Xn) ) / ( f(X1) f(X2) ··· f(Xn) )

(which, in applications, would be used as a test statistic). If the population is actually distributed according to the density f rather than according to g, then { Yn : n = 1, 2, 3, ... } is a martingale with respect to { Xn : n = 1, 2, 3, ... }.
  • Suppose each amoeba either splits into two amoebas, with probability p, or eventually dies, with probability 1 − p. Let Xn be the number of amoebas surviving in the nth generation (in particular Xn = 0 if the population has become extinct by that time). Let r be the probability of eventual extinction. (Finding r as a function of p is an instructive exercise. Hint: the probability that the descendants of an amoeba eventually die out is equal to the probability that both of its immediate offspring, and all of their descendants, eventually die out, given that the original amoeba has split.) Then

    { r^Xn : n = 1, 2, 3, ... }

is a martingale with respect to { Xn : n = 1, 2, 3, ... }.
  • The number of individuals of any particular species in an ecosystem of fixed size is a function of (discrete) time, and may be viewed as a sequence of random variables. This sequence is a martingale under the unified neutral theory of biodiversity.
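Two of the examples above lend themselves to a quick empirical check. The Monte Carlo sketch below (parameter values are illustrative) verifies that the expectation of each martingale stays at its initial value, which is a consequence of the martingale property.

```python
import random

rng = random.Random(1)
trials = 100_000

# de Moivre's martingale: for a biased +/-1 walk with P(step = +1) = p,
# Y_n = (q/p)^(X_n) has constant expectation E[Y_n] = (q/p)^0 = 1.
p, q, steps = 0.6, 0.4, 20
total_y = 0.0
for _ in range(trials):
    x = 0
    for _ in range(steps):
        x += 1 if rng.random() < p else -1
    total_y += (q / p) ** x
print(round(total_y / trials, 2))      # stays near 1

# Polya's urn: the fraction of red marbles is a martingale, so its
# expectation stays at the initial fraction r/(r + b).
r, b, draws = 2, 3, 50
total_frac = 0.0
for _ in range(trials):
    red, blue = r, b
    for _ in range(draws):
        if rng.random() < red / (red + blue):
            red += 1
        else:
            blue += 1
    total_frac += red / (red + blue)
print(round(total_frac / trials, 2))   # stays near 2/5 = 0.4
```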

Martingales and stopping times

A stopping time with respect to a sequence of random variables X1, X2, ... is a random variable τ with the property that for each t, the occurrence or non-occurrence of the event τ = t depends only on the values of X1, X2, ..., Xt. The intuition behind the definition is that at any particular time t, you can look at the sequence so far and tell if it is time to stop. An example in real life might be the time at which a gambler leaves the gambling table, which might be a function of his previous winnings (for example, he might leave only when he goes broke), but he can't choose to go or stay based on the outcome of games that haven't been played yet.

Some mathematicians defined the concept of stopping time by requiring only that the occurrence or non-occurrence of the event τ = t be probabilistically independent of Xt+1, Xt+2, Xt+3, ..., but not that it be completely determined by the history of the process up to time t. That is a weaker condition than the one appearing in the paragraph above, but is strong enough to serve in some of the proofs in which stopping times are used.

The optional stopping theorem (or optional sampling theorem) says that, under certain conditions, the expected value of a martingale at a stopping time is equal to its initial value. One version of the theorem is given below:

Let X1, X2, ... be a martingale and τ a stopping time with respect to X1, X2, ... . If (a) Pr[τ < ∞] = 1, (b) E[τ] < ∞, and (c) there exists a constant c such that |Xi+1 − Xi| ≤ c for all i; then E[Xτ] = E[X1].

Some applications of the theorem:

  • We can use it to prove the impossibility of successful betting strategies for a gambler with a finite lifetime (which gives conditions (a) and (b)) and a house limit on bets (condition (c)). Suppose that the gambler can wager up to c dollars on a fair coin flip at times 1, 2, 3, etc., winning his wager if the coin comes up heads and losing it if the coin comes up tails. Suppose further that he can quit whenever he likes, but cannot predict the outcome of gambles that haven't happened yet. Then the gambler's fortune over time is a martingale, and the time τ at which he decides to quit (or goes broke and is forced to quit) is a stopping time. So the theorem says that E[Xτ] = E[X1]. In other words, the gambler leaves with the same amount of money on average as when he started.
  • Suppose we have a random walk that starts at a position a, with 0 ≤ a ≤ m, and goes up or down by one with equal probability on each step. Suppose further that the walk stops if it reaches 0 or m; the time at which this first occurs is a stopping time. If we happen to know that the expected time at which the walk ends is finite (say, from Markov chain theory), the optional stopping theorem tells us that the expected position when we stop is equal to the initial position a. Solving a = pm + (1 − p)·0 for the probability p that we reach m before 0 gives p = a/m.
  • Now consider a random walk X that starts at 0 and stops if it reaches −m or +m, and use the Yn = Xn2 − n martingale from the examples section. If τ is the time at which X first reaches ±m, then 0 = E[Y1] = E[Yτ] = m2 − E[τ]. We immediately get E[τ] = m2.
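Both random-walk consequences can be checked numerically. This Monte Carlo sketch (illustrative parameters) estimates the hitting probability a/m and the expected stopping time m2:

```python
import random

def run_walk(start, top, rng):
    """Fair +/-1 walk from `start`, absorbed at 0 or `top`.
    Returns (reached_top, number_of_steps)."""
    x, steps = start, 0
    while 0 < x < top:
        x += 1 if rng.random() < 0.5 else -1
        steps += 1
    return x == top, steps

rng = random.Random(2)
trials = 50_000

# P(reach m before 0 | start at a) should be a/m.
a, m = 3, 10
hits = sum(run_walk(a, m, rng)[0] for _ in range(trials))
print(round(hits / trials, 2))   # near a/m = 0.3

# A walk from 0 stopped at -m or +m is the same as a walk from m
# stopped at 0 or 2m; its expected stopping time should be m^2.
m = 5
mean_tau = sum(run_walk(m, 2 * m, rng)[1] for _ in range(trials)) / trials
print(round(mean_tau, 1))        # near m^2 = 25
```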

Submartingales and supermartingales

A submartingale is a sequence of random variables satisfying just

    E(Xn+1 | X1, ..., Xn) ≥ Xn.

Analogously, a supermartingale satisfies

    E(Xn+1 | X1, ..., Xn) ≤ Xn.
Here is a mnemonic for remembering which is which: "Life is a supermartingale; as time advances, expectation decreases."

Examples of submartingales and supermartingales

  • Every martingale is also a submartingale and a supermartingale. Conversely, any stochastic process that is both a submartingale and a supermartingale is a martingale.
  • Consider again the gambler who wins $1 when a coin comes up heads and loses $1 when the coin comes up tails. Suppose now that the coin may be biased, so that it comes up heads with probability p.
    • If p is equal to 1/2, the gambler on average neither wins nor loses money, and the gambler's fortune over time is a martingale.
    • If p is less than 1/2, the gambler loses money on average, and the gambler's fortune over time is a supermartingale.
    • If p is greater than 1/2, the gambler wins money on average, and the gambler's fortune over time is a submartingale.
  • A convex function of a martingale is a submartingale, by Jensen's inequality. For example, the square of the gambler's fortune in the fair coin game is a submartingale (which also follows from the fact that Xn2 − n is a martingale). Similarly, a concave function of a martingale is a supermartingale.
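For the squared fortune, the submartingale inequality can be verified by direct computation rather than by appeal to Jensen's inequality; the conditional expectation of the square exceeds the current square by exactly 1. A minimal sketch, with illustrative names:

```python
# For a fair +/-1 step from current fortune x:
#   E(X_{n+1}^2 | X_n = x) = 0.5*(x + 1)**2 + 0.5*(x - 1)**2 = x**2 + 1,
# which is >= x**2 (submartingale) and also shows X_n^2 - n is a martingale.
def expected_next_square(x):
    return 0.5 * (x + 1) ** 2 + 0.5 * (x - 1) ** 2

print(all(expected_next_square(x) == x * x + 1 for x in range(-10, 11)))  # True
```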

A more general definition

One can define a martingale which is an uncountable family of random variables. Also, those random variables may take values in a more general space than just the real numbers.

Let I be a directed set, V a real topological vector space, and V* its topological dual (denote by ⟨·,·⟩ this duality). Moreover, let (Ω, F, (Fi)i∈I, P) be a filtered probability space, that is, a probability space equipped with a family of sigma-algebras Fi ⊆ F with the following property: for each i, j ∈ I with i ≤ j, one has Fi ⊆ Fj.

A family of random variables

    M = { Mi : Ω → V, i ∈ I }

is called a martingale if, for each v ∈ V* and each i, j ∈ I with i ≤ j, the three following properties are satisfied:

  • Mi is Fi-measurable.
  • E( |⟨Mi, v⟩| ) < ∞.
  • E( ⟨Mj, v⟩ | Fi ) = ⟨Mi, v⟩ almost surely.

If the directed set I is a real interval (or the whole real axis, or a semiaxis), then the martingale is called a continuous-time martingale. If I is the set of natural numbers, it is called a discrete-time martingale.
