In this note, we solve an example problem from the book Bernoulli’s Fallacy using three approaches: (1) maximum likelihood, (2) Bayes’ theorem, and (3) MCMC simulation with the PyMC3 package in Python.

Problem Statement

The problem is on page 36 of Bernoulli’s Fallacy:

Your friend rolls a six-sided die and secretly records the outcome; this number becomes the target $T$. You then put on a blindfold and roll the same six-sided die over and over. You’re unable to see how it lands, so each time, your friend (under the watchful eye of a judge, to prevent any cheating) tells you only whether the number you just rolled was greater than, equal to, or less than $T$. After some number of rolls, say 10, the sequence of outcomes is

G, G, L, E, L, L, L, E, G, L

with G representing a greater roll, L a lesser roll, and E an equal roll.

You must guess what the target was. What would your strategy for guessing be, and how confident would you be?

Solution

Because the rolls are independent given the target, the likelihood depends on the sequence only through the tally of each outcome; the tallies are sufficient statistics. The numbers of greater rolls ($N_G$), equal rolls ($N_E$), and lesser rolls ($N_L$) are

\[\begin{align*} N_G &= 3, \\ N_E & = 2, \\ N_L & = 5. \\ \end{align*}\]
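The tallies can be computed directly from the sequence; a minimal sketch in Python, with the sequence transcribed from the problem statement:

from collections import Counter

# Observed sequence: G = greater, E = equal, L = lesser
rolls = ["G", "G", "L", "E", "L", "L", "L", "E", "G", "L"]

tallies = Counter(rolls)
print(tallies["G"], tallies["E"], tallies["L"])  # 3 2 5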

Maximum Likelihood

In this approach, the best guess is the value of $T$ that maximizes the likelihood:

\[P(D|T) = \left(\frac{6-T}{6}\right)^{N_G} \left(\frac{1}{6}\right)^{N_E} \left(\frac{T-1}{6}\right)^{N_L}, \label{eqn_likelihood}\]

where $D$ is the observed sequence of rolls, and $\frac{6-T}{6}$, $\frac{1}{6}$, and $\frac{T-1}{6}$ are the probabilities of getting a greater roll, an equal roll, and a lesser roll, respectively.

The values of Equation ($\ref{eqn_likelihood}$) for all possible values of $T$ are listed in the table below.

$T$ Likelihood $P(D\mid T)$
1 0
2 $64\times 6^{-10}$
3 $864\times 6^{-10}$
4 $1944\times 6^{-10}$
5 $1024\times 6^{-10}$
6 0
Likelihood $P(D\mid T)$.
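These values can be reproduced with a few lines of Python; this sketch evaluates the likelihood formula above with exact rational arithmetic, so the coefficients of $6^{-10}$ come out exactly:

from fractions import Fraction

nG, nE, nL = 3, 2, 5

def likelihood(T):
    """P(D | T) for a candidate target T."""
    pG, pE, pL = Fraction(6 - T, 6), Fraction(1, 6), Fraction(T - 1, 6)
    return pG**nG * pE**nE * pL**nL

for T in range(1, 7):
    # Prints 0, 64, 864, 1944, 1024, 0 -- the coefficients of 6^-10
    print(T, likelihood(T) * 6**10)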

The likelihood is largest when $T=4$, so $T=4$ is the best guess under maximum likelihood. However, this approach alone does not quantify our confidence in the guess; for that, we turn to Bayes’ theorem.

Bayes’ Theorem

Using Bayes’ theorem, we calculate the probability of the target $T$ given the observed data, $D$:

\[P(T|D)=\frac{P(D|T)P(T)}{P(D)} \notag\]

The likelihood $P(D\mid T)$ was calculated in the previous section, and the evidence $P(D)$ is obtained by summing over all possible targets:

\[P(D) = \sum_{T=1}^6 P(D|T)P(T)\notag\]

Assuming a uniform prior, $P(T)=\frac{1}{6}$, the posterior is

$T$ Posterior $P(T\mid D)$
1 0
2 $\frac{8}{487}$
3 $\frac{108}{487}$
4 $\frac{243}{487}$
5 $\frac{128}{487}$
6 0
Posterior $P(T|D)$.
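These fractions can be checked numerically; a minimal sketch that normalizes the likelihoods (with a uniform prior, $P(T)$ cancels in the ratio), reusing the likelihood function from the previous section:

from fractions import Fraction

nG, nE, nL = 3, 2, 5

def likelihood(T):
    pG, pE, pL = Fraction(6 - T, 6), Fraction(1, 6), Fraction(T - 1, 6)
    return pG**nG * pE**nE * pL**nL

# Evidence P(D): sum of likelihoods (the uniform prior cancels)
evidence = sum(likelihood(T) for T in range(1, 7))
posterior = {T: likelihood(T) / evidence for T in range(1, 7)}
print(posterior)  # 0, 8/487, 108/487, 243/487, 128/487, 0 -- matches the table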

The posterior distribution quantifies the uncertainty of the inference: the target is $T=4$ with probability $\frac{243}{487}\approx 0.50$, so the best guess is correct only about half the time.

Probabilistic Programming using PyMC3

The posterior inference can also be done with probabilistic programming, for example with the PyMC3 Python package. In the following code, a discrete uniform prior is defined and the log-likelihood is attached to the model through a Potential term. The posterior is then sampled using the Metropolis algorithm.

import pymc3 as pm

# Tallies of lesser, equal, and greater rolls
nL = 5
nE = 2
nG = 3

with pm.Model() as model:
    # Discrete uniform prior over the six possible targets
    T = pm.DiscreteUniform('T', lower=1, upper=6)
    # Log-likelihood of the tallies given the target T
    logp = (nL * pm.math.log((T - 1) / 6)
            + nE * pm.math.log(1 / 6)
            + nG * pm.math.log((6 - T) / 6))
    potential = pm.Potential('potential', logp)

with model:
    # Metropolis step for the discrete variable T
    trace = pm.sample(1000, tune=1000, step=pm.Metropolis())
Posterior estimated by MCMC sampling.
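To compare the samples against the exact posterior, the draws of T can be tallied; a minimal sketch, assuming the sampling code above has been run so that trace holds the draws:

import numpy as np

# Empirical posterior: relative frequency of each sampled target value
samples = trace['T']
values, counts = np.unique(samples, return_counts=True)
for v, c in zip(values, counts):
    print(v, c / len(samples))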

The MCMC estimate agrees closely with the exact posterior obtained from Bayes’ theorem.