Optimal Stopping Problem

Motivation. Imagine a gambling situation in which a sequence of prizes is hidden inside boxes. The gambler knows the distribution of the prize in each box, but is shown only one box at a time. They can claim only one prize, and once a box is opened its prize must be either claimed or discarded. How should the gambler act?

def. (Optimal stopping problem) Let the prizes be random variables $X_{1}, X_{2}, \dots, X_{n}$ with distributions $F_{1}, F_{2}, \dots, F_{n}$ (assumed independent, which the proof below relies on). The gambler knows only the distribution of each box, and the order in which the boxes are shown is shuffled randomly.

thm. Prophet Inequality. There is a strategy for the gambler to achieve at least $\frac{1}{2}$ of the optimal revenue, i.e.:

$$E(\text{payoff}) \geq \frac{1}{2} E\left(\max_{i = 1}^{n} X_{i}\right)$$

where…

$payoff$ is the payoff to the gambler
$X^{*} := \max_{i = 1}^{n} X_{i}$ is a random variable. This is what the “Prophet” gets, i.e. the payoff of an optimal strategy that sees every prize in advance.

Additionally, the theorem states that this strategy can be taken to be a cutoff strategy, which is one that stops as soon as the payoff from the currently opened box is at least a predetermined cutoff $w$.
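To make the cutoff rule concrete, here is a minimal Python sketch. The function name `cutoff_payoff` and the example prizes are hypothetical, and the sketch assumes the gambler stops at the first prize that is at least $w$ (matching the $X^{*} \geq w$ condition used in the proof) and receives nothing if no box qualifies.

```python
from typing import Sequence


def cutoff_payoff(prizes: Sequence[float], w: float) -> float:
    """Open boxes in the given order and claim the first prize that is at least w."""
    for x in prizes:
        if x >= w:
            return x  # stop here: this prize meets the cutoff
    return 0.0  # no box met the cutoff, so nothing is claimed


# Hypothetical run: the first box is below the cutoff, the second is claimed.
print(cutoff_payoff([3.0, 7.0, 9.0], w=5.0))  # prints 7.0
```

Note that the gambler walks away with 7.0 while the prophet would have taken 9.0; the theorem bounds how large this gap can be in expectation.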

Proof. We know that $\text{payoff} = \text{base payoff} + \text{excess payoff}$, where

base payoff is $w$
excess payoff is $X_{j} - w$, where $X_{j}$ is the box we stop at

We also know that these two are random variables:

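$$\text{base payoff} = \begin{cases} w & \text{if } X^{*} \geq w \\ 0 & \text{else} \end{cases}$$

$$\text{excess payoff} = \begin{cases} (X_{j} - w)^{+} & \text{if stopped at } X_{j} \\ 0 & \text{if never stopped} \end{cases}$$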

Now, the expected payoff is:

$$E(\text{payoff}) = \underbrace{P(X^{*} \geq w) \cdot w}_{\text{expected base}} + \underbrace{\sum_{j = 1}^{n} P(\text{stopping at } X_{j}) \cdot E((X_{j} - w)^{+})}_{\text{expected excess}}$$

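Here “stopping at $X_{j}$” (the first case of excess payoff) is shorthand for reaching box $j$, i.e. every earlier box fell below $w$; whether $X_{j}$ itself clears the cutoff is already accounted for by $(X_{j} - w)^{+}$. Its probability is: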

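$$P(\text{stopping at } X_{j}) = \underbrace{P\left(\max_{i = 1}^{j - 1} X_{i} < w\right)}_{\text{...that boxes before } j \text{ were } < w} \geq \underbrace{P\left(\max_{i = 1}^{n} X_{i} < w\right)}_{\text{...that all boxes are } < w} = \underbrace{P(X^{*} < w)}_{\text{by definition of } X^{*}}$$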
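As a quick numeric check of this bound, the sketch below computes the probability that all boxes before $j$ are below $w$ and compares it with $P(X^{*} < w)$. It assumes independent boxes (so the probabilities multiply) and uses a hypothetical example of three boxes uniform on $[0, b_i]$; the names `reach_probabilities` and `upper_bounds` are illustrative only.

```python
from math import prod
from typing import List, Sequence


def reach_probabilities(below_w: Sequence[float]) -> List[float]:
    """below_w[i] = P(X_i < w). Returns, for each box j, P(all boxes before j are < w).

    Assumes the boxes are independent, so the probabilities multiply.
    """
    return [prod(below_w[:j]) for j in range(len(below_w))]


# Hypothetical example: X_i uniform on [0, b_i], cutoff w = 1.
upper_bounds = [2.0, 4.0, 1.5]
w = 1.0
below_w = [min(w / b, 1.0) for b in upper_bounds]

reach = reach_probabilities(below_w)
p_all_below = prod(below_w)  # P(X* < w): every box is below the cutoff

# Each reach probability is at least P(X* < w), as in the inequality above.
assert all(p >= p_all_below for p in reach)
print(reach, p_all_below)
```

The first box is always reached (an empty product is 1), and each later box is reached with a smaller probability, but never smaller than the probability that all $n$ boxes fall below the cutoff.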

Thus:

$$E(\text{payoff}) \geq P(X^{*} \geq w) \cdot w + \underbrace{P(X^{*} < w)}_{\text{taken out since it does not depend on } j} \cdot \sum_{j = 1}^{n} E((X_{j} - w)^{+}) \tag{lemma 1}$$

On the other hand, the expected prophet payoff is

$$E(X^{*}) = E\left(w + \max_{j = 1}^{n} (X_{j} - w)\right) \leq \underbrace{w + E\left(\max_{j = 1}^{n} (X_{j} - w)^{+}\right)}_{\text{by definition of } (\dots)^{+}} \leq \underbrace{w + \sum_{j = 1}^{n} E((X_{j} - w)^{+})}_{\text{sum greater than max}} \tag{lemma 2}$$

Noticing that lemma 1 and lemma 2 both contain the term $\sum_{j = 1}^{n} E((X_{j} - w)^{+})$, we can rearrange each one to isolate it:

$$\frac{E(\text{payoff}) - w \cdot P(X^{*} \geq w)}{P(X^{*} < w)} \geq \sum_{j = 1}^{n} E((X_{j} - w)^{+}) \geq E(X^{*}) - w$$

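One choice of cutoff that works is a median of $X^{*}$, i.e. a $w$ with $P(X^{*} \geq w) = P(X^{*} < w) = \frac{1}{2}$ (such a $w$ exists when the distributions are continuous). Substituting and simplifying, we get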

$$E(\text{payoff}) \geq \frac{1}{2} E(X^{*})$$

■
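The bound can be sanity-checked with a small Monte Carlo simulation. The sketch below is illustrative only: it assumes three independent boxes uniform on $[0, b_i]$ (a hypothetical choice), and it assumes the cutoff $w$ is taken to be the median of $X^{*}$, estimated empirically from samples. The names `simulate` and `cutoff_payoff` are not from the notes.

```python
import random
import statistics
from typing import List, Sequence, Tuple


def cutoff_payoff(prizes: Sequence[float], w: float) -> float:
    """Claim the first prize that is at least w; get 0 if none qualifies."""
    for x in prizes:
        if x >= w:
            return x
    return 0.0


def simulate(upper_bounds: Sequence[float], trials: int = 20000, seed: int = 0) -> Tuple[float, float]:
    """Estimate (E[payoff], E[X*]) for independent boxes X_i ~ Uniform(0, b_i)."""
    rng = random.Random(seed)

    def draw() -> List[float]:
        return [rng.uniform(0.0, b) for b in upper_bounds]

    # Assumed cutoff: the empirical median of X*, so P(X* >= w) is about 1/2.
    w = statistics.median(max(draw()) for _ in range(trials))

    payoff_total = 0.0
    prophet_total = 0.0
    for _ in range(trials):
        prizes = draw()
        payoff_total += cutoff_payoff(prizes, w)  # gambler using the cutoff rule
        prophet_total += max(prizes)              # prophet takes the best prize
    return payoff_total / trials, prophet_total / trials


gambler, prophet = simulate([1.0, 2.0, 3.0])
print(f"gambler={gambler:.3f} prophet={prophet:.3f} ratio={gambler / prophet:.3f}")
```

Up to sampling noise, the printed ratio should sit at or above $\frac{1}{2}$; for most distributions it is strictly larger, since $\frac{1}{2}$ is a worst-case guarantee.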
