Squid Game asymptotic probability

Difficulty:   ★★★☆☆   undergraduate

We continue with the bridge-crossing scenario from Squid Game called “Glass Stepping Stones”. Here I analyse the probabilities for late contestants on a very long bridge, and the expectation value for a player’s progress. Last time I found an exact expression for the probability P(i,n) that player number i is still alive on the nth step. Now seems a good place to mention there are equivalent expressions, such as:

    \[P(i,n) = 1 - \binom{n}{i}\cdot{_2F_1}(i,n+1,i+1;-1),\]

where {_2F_1} is called the hypergeometric function, and the other term is the binomial coefficient which is read as “n choose i”. The factor 2^{-n} seen previously has been absorbed. We listed several special cases of the probabilities last time. Another is:

    \[P(i,2i-1) = \frac{1}{2}.\]

So remarkably, the chance player i will be alive on step 2i – 1 is precisely 50%! For fixed large i, if we plot the probability distribution as a function of n it looks smooth, remaining near 1 for early steps before rolling down to near 0. Qualitatively this looks like a \tan^{-1}, tanh, or erf (“error function”). We reflect these curves, centre them on the value 1/2 at n = 2i – 1, and scale them linearly: so they have the appropriate bounds and match the slope at the centre point. See the Figure below.

probability for player 400
Figure: The probability player number i = 400 is still alive at step n. The scaled tanh function is close to the exact curve, while the scaled erf function nearly overlays it.

In fact the slope used in the Figure is only an approximation as described next, but this is a deliberate choice to show it still gives a good fit. The exact slope \partial P/\partial n evaluated at n = 2i – 1 seems a little too complicated to be useful. It contains a derivative of the hypergeometric function, which appears to approach -1/2 in the limit of large i, hence the slope at the centre point is asymptotic to -1/\sqrt{4\pi i}. Another approach is to consider the subsequent bridge step, for which:

    \[P(i,2i) = \frac{1}{2} - \frac{\Gamma(i+1/2)}{2\sqrt\pi\,\Gamma(i+1)},\]

which uses the Gamma function. The difference P(i,2i) – P(i,2i-1) approximates the slope, and is also asymptotic to -1/\sqrt{4\pi i} as i \rightarrow \infty. Hence our approximation for late players is:

    \[P(i,n) \approx \frac{1}{2}\Big( 1-\operatorname{erf}\frac{n+1-2i}{2\sqrt i} \Big).\]

Now the error function by definition is the integral of a gaussian curve. The derivative with respect to n of our approximation is precisely the righthand side below, which itself approximates the chance a late player dies on that step:

    \[P(i\textrm{ dies on }n) \approx -\frac{1}{\sqrt{4\pi i}}e^{-(n-2i+1)^2/4i}.\]

For fixed i this is a gaussian distribution with centre n = 2i – 1 and standard deviation \sqrt{2i}. It is normalised in the sense its integral over n \in \mathbb R is exactly 1, but physically we want the discrete sum over n \in \mathbb N^+. For the 10th player this is approx. 0.9991, which is already close. The exact chance for dying on a given step was determined in the previous article to be b_{i,n} = \binom{n-1}{i-1} \cdot 2^{-n}. The Figure below shows some early values. As before, we can extend the function beyond integer parameters.

probability of death
Figure: The probability player i will die on step n, given no foreknowledge (that is, before the game begins). The blue dots correspond to integer values of the parameters, which are physical. Contestants face a near-gaussian “hill of death” so to speak, which peaks at n = 2i – 2 and 2i – 1. I have included the “spires” at the back for sake of interest, as a peek into the rich structure for negative n, though this is unphysical.

The ratio b_{i,n}/b_{i,n+1} = 2(n-i+1)/n precisely. Hence for a given player, the adjacent steps n = 2i – 2 and 2i – 1 are equally likely locations their game will be “discontinued”. This is surely the maximum assuming integer parameters, apart from the first player for whom step 0 is safe but step 1 is their most likely “resting place”. Hence the reader might prefer to translate our gaussian approximation by half a step or so; apparently there are various approximations to a binomial coefficient. The subsequent step n = 2i is a more likely endpoint than the earlier step 2i – 3.

The expectation value for a given player’s death is:

    \[\sum_{n=1}^\infty n \cdot b_{i,n} = \binom{0}{i-1} \cdot {_2F_1}(1,-i,2-i;-1).\]

This function is very close to 2i, apart from a small oscillatory wiggle. At integer i it is singular, but from inspection of its plot it may be extended to a continuous function with value precisely 2i on physical parameter values (that is, integer i). Finally, for a given step n, the probability that some player breaks a tile is:

    \[\sum_{i=1}^n b_{i,n} = \frac{1}{2}\]

precisely, which is unsurprising. (Better terminology would be the nth “column”, as Henle+  use.) This assumes that n or more players have finished their run, otherwise the step is less likely to be broken.

Update, May 19: The death chance b_{i,n} is ½ times a binomial distribution in n. We previously found a gaussian curve for a given player i. Now. for a fixed step n, the de Moivre-Laplace approximation is a gaussian over the player number i:

    \[\frac{1}{\sqrt{2\pi(n-1)}}e^{-\frac{(i - n/2 - 1/2)^2}{(n - 1)/2}.\]

🡐 general solution | Squid Game bridge | conditional probabilities 🠦

More Squid Game probabilities

Difficulty:   ★★★☆☆   undergraduate

Last time I analysed the bridge-crossing scenario in the series Squid Game. In this fictional challenge called “Glass Stepping Stones”, the front contestant must leap forward along glass panels, choosing left or right each time, knowing only that one side is strengthened glass while the other will shatter. At least later players may learn from the choices of their forerunners. Here I use combinatorial arguments, derive a recurrence relation for the chance to die on a given step, and obtain an analytic solution with a hypergeometric function.

Again, write a_{i,n} or equivalently P(i,n) for the probability the ith player is still alive on the nth step. We showed these probabilities satisfy the recurrence relation a_{i,n} = \frac{1}{2}(a_{i-1,n-1}+a_{i,n-1}), along with initial conditions a_{1,n} = 1/2^n, and a_{i,1} = 1 for all players after the first. Equivalently, we can start from a_{0,n} := 0, and a_{i,0} := 1 for i \ge 1. This is a bit like Pascal’s triangle. Rather than adding the previous two terms, we take their average — which of course is the sum divided by two. And rather than 1’s at the sides, we have 0’s and 1’s.

Let’s write b_{i,n} for the likelihood the ith player will die upon landing on the nth step. Then b_{i,n} = a_{i,n-1} - a_{i,n}. These values satisfy the same recurrence relation as before:

    \[\begin{aligned} b_{i,n} &=a_{i,n-1} - a_{i,n} \\ &= \frac{1}{2}(a_{i-1,n-2}+a_{i,n-2}-a_{i,n-1}-a_{i-1,n-1}) \\ &= \frac{b_{i,n-1}+b_{i-1,n-1}}{2}.\end{aligned}\]

Only the initial conditions are different: b_{1,n} = 1/2^n, and b_{i,1} = 0 for all players after the first. It is aesthetic to begin a step earlier: b_{i,0} := 0 =: b_{0,n}, except for b_{0,0} = 1. The Table below shows a few early entries.

Table: Probability player i will die on step n itself, given no foreknowledge
step:

n = 1

2 3 4 5 6 7 8
player:

i = 1

1/2 1/4 1/8 1/16 1/32 1/64 1/128 1/256
2 0 1/4 1/4 3/16 1/8 5/64 3/64 7/256
3 0 0 1/8 3/16 3/16 5/32 15/128 21/256
4 0 0 0 1/16 1/8 5/32 5/32 35/256
5 0 0 0 0 1/32 5/64 15/128 35/256
6 0 0 0 0 0 1/64 3/64 21/256

Alternatively, there are elegant combinatorial arguments, for which I was initially inspired by another blog . For player i to die on step n, the previous i – 1 players must have died somewhere amongst the n – 1 prior steps. There are n – 1 choose i – 1″ ways to arrange these mistaken steps, out of 2^{n-1} total combinations of equal probability. Given any such arrangement, the next player has a 50% chance their following leap is a misstep, hence:

    \[b_{i,n} := P(i\textrm{ dies on }n) = \binom{n-1}{i-1}2^{-n}.\]

(I originally found this simple formula in a much more roundabout way, as often happens!) If i > n, the probability is zero. By similar reasoning, the chance that precisely i players have died by step n (inclusive) is:

    \[P(i\textrm{ players died by }n) = \binom{n}{i}2^{-n}.\]

A draft paper (Henle+ 2021 ) gives this result. It may also be obtained by summing over the previous formula: \sum_{n' = i}^n b_{i,n'}/2^{n-n'}. Note if player i died on n′, the next player must make nn′ correct guesses in a row, so that no-one else dies by the nth step.

Now the probability the ith player is alive at the nth step or further, is the probability any number of previous players died by step n or before. (So what is ruled out is i or more dying by this stage.) This is just a sum over the previous displayed formula: \sum_{i'=0}^{i-1}P(i\textrm{ died by }n), which computer algebra simplifies to:

    \[P(i,n) = 1-\binom{n}{i}2^{-n}\cdot{_2F_1}(1,i-n,i+1;-1).\]

Here _2F_1 is called the “(ordinary) hypergeometric function”. I gave a limited table of these probability values in the previous blog. For fixed integer i \ge 0, the entire expression reduces to a polynomial in n of order i – 1 with rational coefficients, all times 2^{-n}. For example the likelihood the 5th player is alive at step n is:

    \[P(5,n) = 2^{-n}\frac{1}{24}\big( n^4-2n^3+11n^2+14n+24 \big).\]

In general, for n < i we have a_{i,n} = 1, so players get some steps for free. The diagonal terms are a_{i,i} = 1-2^{-i}. This makes sense because for player i to not be alive on the ith step, every leap by previous players must also have been a misstep. Some results like these may also be shown using induction and the recurrence relation. I give more special cases in the next blog post. Yet the general formula works even for non-integers, though this is not physical, as the Figure below shows. For negative parameters (not shown) it has a rich structure, with singularities, and some probabilities values negative or exceeding 1.

probabilities plot
Figure: Probability that player i is alive at step n. The blue dots are for integer values of the parameters, which are physical. The probability decreases with step number. Visually, it is as if contestants start from the plateau at top-left, then slide down a hill of death 🙁

An alternative derivation of the probabilities is based on where the previous player died (if at all). If that player i – 1 died on step n′ < n, their follower must make nn′ correct guesses in a row to reach step n safely. Now sum the result from n′ = i – 1, which is the earliest step upon which they may conceivably die, up to n′ = n – 1. Add to this the chance the player was still alive at step n – 1 (which is one minus the sum of chances they died on step n′) as this guarantees the following player i is alive at n. Numerical testing shows the result is indeed equivalent. Hence rather than summing over players for a fixed step, one may instead sum over steps for a fixed player.

🡐 recurrence relation | Squid Game bridge | asymptotics 🠦