Blog

Modelling Uncertainty: The Laplace’s Rule of Succession

The Laplace's Rule of Succession forms a very important cornerstone of probability and uncertainty theory. It aims to provide us with a way of associating some probability with an unobserved event based on previous observed events. I came upon it in the following textbook - Introduction to Probability by Bertsekas - in one of the exercise problems at the end of one of the chapters. Curiosity got the better of me and I soon found myself reading all about this rule/law and why is it so important.

The basic problem is this: suppose your probability of having success at a task is $0 \lt p \lt 1$ but it is unknown to you. If you perform this task $n$ number of times and $r$ of these $n$ trials results in a success, what will be the probability of success for the $(n+1)^{th}$ trial? Also, note that each of the $n$ trials is independent of one another.

Observing the previous trials, an obvious answer, based on the theoretical definition of probability, would be the following: $$p = P(\text{success}) = \frac{\text{total number of successes}}{\text{total number of trials}} = \frac{r}{n}$$

However, there is one flaw in this approach - we are told that previous $r$ trials were successfull and hence, all further guesses should take this information into account. The probability of success, transforms into a conditional probability conditioned on the outcome of our previous trials $$p = P(\text{success} \: | \: \text{success in $r$ out of $n$ trials})$$ Another important point is that $p$ is not a constant and will keep on changing as you observe more and more trials with some of those resulting in success and some in failures.


Solving for Success: An Urns and Balls Approach

Lets simplify the problem by having $r = n$ i.e. each of the previous trials results in a success. We now formulate this problem and solve for the desired probability.

Suppose we have $m + 1$ boxes with the $k^{th}$ box having $k$ red balls and $m - k$ white balls, with $0 \leq k \leq m$. We choose a box at random and then choose a ball at random from that box, $n$ successive times independently (the chosen ball is replaced each time before another round). Suppose a red ball is drawn each of the $n$ times. What is the probability of drawing a red ball again on the $(n+1)^{th}$ draw?

 
Screen Shot 2020-03-27 at 1.18.40 AM.png
 

Let us define the following random variables $$R_n = \text{Getting n red balls in n draws}$$$$E = \text{Getting a red ball on $(n+1)^{th}$ draw}$$

Calculation of $P(R_n)$, involves two steps: selecting a box and then drawing $n$ balls from it.$$\begin{aligned} P(R_n) &= \sum_{k=0}^{m}P(k^{th} \text{box})P(\text{$n$ red balls from $k^{th}$ box}) \newline &= \sum_{k=0}^{m}\frac{1}{m+1}(\frac{k}{m})^{n} \newline &= \frac{1}{(m+1)m^{n}}\sum_{k=0}^{m}k^n\end{aligned}$$

Making use of Riemann Sum approximation we can convert, $\sum_{k=0}^{m}k^n = \int_{0}^{1}k^n dk$$$\begin{aligned}P(R_n) &= \frac{1}{(m+1)m^{n}}\int_{0}^{1}k^n dk \newline &= \frac{1}{(m+1)m^{n}}\frac{1}{n+1}\end{aligned}$$

Using the above result, we need to find the following, $$\begin{aligned} P(E|R_n) &= \frac{P(E, R_n)}{P(R_n)} \newline &= \frac{P(R_{n+1})}{P(R_n)} \newline &= \frac{n+1}{n+2} \end{aligned}$$

We finally arrive at the result of Laplace's Rule - $\frac{n+1}{n+2}$. Although there are other alternate ways to come to this solution, the urns and balls approach is often used to make this problem intuitively easy to understand and present its proof. A few points of note:

  • This approach makes a lot of sense - according to the original problem statement at the beginning of this article, I said that the underlying probability of achieving success is not a constant but infact a variable and keeps changing with our ongoing trials. This is simulated in a perfect manner by converting it to a problem of drawing balls from urns. Since each urn contains variable proportions of the red balls, our probability of success also becomes variable, depending on which urn we choose.

  • As $n$ approaches very large values, the probability of success approaches 1. Again, intuitively, this makes sense because as our number of successes keeps on increasing, our underlying belief of success also increases.

Note that the above result extends perfectly when not all trials result in a success (although, I wont go into the details of how it is derived). If only $r$ out of the $n$ trials lead to success, then $$P(\text{success|$r$ successes out of $n$ trials}) = \frac{r+1}{n+1}$$


Practical Application of this Rule: User Reviews and Ratings

Countless number of times now, you have opened Amazon, searched a product and then become confused as to which product to choose based on the given ratings. Lets say (for simplicity), a user can give only positve or negative rating, instead of the stars. You are then presented with the following two sellers:

  • 500 reviews and 95% positive rating

  • 20 reviews and 98% positive rating

Out of these two, which one would you choose so as to maximise your probability of having a successful experience with the seller - in other words what is your probability of success with this seller.

 
Screen Shot 2020-03-27 at 1.21.18 AM.png
 

A very straightforward answer would be to compare 98% and 95% and decide that since the latter is a larger value, you will have better experience with this seller (98%). However, lets get an answer using the Rule of Succession, $$P(\text{success with a seller}) = \frac{\text{number of positive reviews} + 1}{\text{total reviews} + 2}$$

Plugging the values in this equation we get, $$P(\text{success with 95% seller}) = \frac{0.95*500 + 1}{500 + 2} = 0.948$$$$P(\text{success with 98% seller}) = \frac{0.98*20 + 1}{20 + 2} = 0.936$$

This shows that a customer will actually have a higher probability of having successful/positive experience with the seller with 95% rating as opposed to the one with a higher 98% rating. Although the difference between 0.94 and 0.936 is not so large, your main takeaway from the above example should be to not just blindly look at numbers but try to reason with the underlying probabilities. Such scenarios are very very commonplace in the world of online reviews and ratings and this example demonstrates how a simple concept of probability can help us resolve conflits between confusing reviews.