Modelling Uncertainty: The Laplace’s Rule of Succession
The Laplace's Rule of Succession forms a very important cornerstone of probability and uncertainty theory. It aims to provide us with a way of associating some probability with an unobserved event based on previous observed events. I came upon it in the following textbook - Introduction to Probability by Bertsekas - in one of the exercise problems at the end of one of the chapters. Curiosity got the better of me and I soon found myself reading all about this rule/law and why is it so important.
The basic problem is this: suppose your probability of having success at a task is $0 \lt p \lt 1$ but it is unknown to you. If you perform this task $n$ number of times and $r$ of these $n$ trials results in a success, what will be the probability of success for the $(n+1)^{th}$ trial? Also, note that each of the $n$ trials is independent of one another.
Observing the previous trials, an obvious answer, based on the theoretical definition of probability, would be the following: $$p = P(\text{success}) = \frac{\text{total number of successes}}{\text{total number of trials}} = \frac{r}{n}$$
However, there is one flaw in this approach - we are told that previous $r$ trials were successfull and hence, all further guesses should take this information into account. The probability of success, transforms into a conditional probability conditioned on the outcome of our previous trials $$p = P(\text{success} \: | \: \text{success in $r$ out of $n$ trials})$$ Another important point is that $p$ is not a constant and will keep on changing as you observe more and more trials with some of those resulting in success and some in failures.
Solving for Success: An Urns and Balls Approach
Lets simplify the problem by having $r = n$ i.e. each of the previous trials results in a success. We now formulate this problem and solve for the desired probability.
Suppose we have $m + 1$ boxes with the $k^{th}$ box having $k$ red balls and $m - k$ white balls, with $0 \leq k \leq m$. We choose a box at random and then choose a ball at random from that box, $n$ successive times independently (the chosen ball is replaced each time before another round). Suppose a red ball is drawn each of the $n$ times. What is the probability of drawing a red ball again on the $(n+1)^{th}$ draw?
Let us define the following random variables $$R_n = \text{Getting n red balls in n draws}$$$$E = \text{Getting a red ball on $(n+1)^{th}$ draw}$$
Calculation of $P(R_n)$, involves two steps: selecting a box and then drawing $n$ balls from it.$$\begin{aligned} P(R_n) &= \sum_{k=0}^{m}P(k^{th} \text{box})P(\text{$n$ red balls from $k^{th}$ box}) \newline &= \sum_{k=0}^{m}\frac{1}{m+1}(\frac{k}{m})^{n} \newline &= \frac{1}{(m+1)m^{n}}\sum_{k=0}^{m}k^n\end{aligned}$$
Making use of Riemann Sum approximation we can convert, $\sum_{k=0}^{m}k^n = \int_{0}^{1}k^n dk$$$\begin{aligned}P(R_n) &= \frac{1}{(m+1)m^{n}}\int_{0}^{1}k^n dk \newline &= \frac{1}{(m+1)m^{n}}\frac{1}{n+1}\end{aligned}$$
Using the above result, we need to find the following, $$\begin{aligned} P(E|R_n) &= \frac{P(E, R_n)}{P(R_n)} \newline &= \frac{P(R_{n+1})}{P(R_n)} \newline &= \frac{n+1}{n+2} \end{aligned}$$
We finally arrive at the result of Laplace's Rule - $\frac{n+1}{n+2}$. Although there are other alternate ways to come to this solution, the urns and balls approach is often used to make this problem intuitively easy to understand and present its proof. A few points of note:
This approach makes a lot of sense - according to the original problem statement at the beginning of this article, I said that the underlying probability of achieving success is not a constant but infact a variable and keeps changing with our ongoing trials. This is simulated in a perfect manner by converting it to a problem of drawing balls from urns. Since each urn contains variable proportions of the red balls, our probability of success also becomes variable, depending on which urn we choose.
As $n$ approaches very large values, the probability of success approaches 1. Again, intuitively, this makes sense because as our number of successes keeps on increasing, our underlying belief of success also increases.
Note that the above result extends perfectly when not all trials result in a success (although, I wont go into the details of how it is derived). If only $r$ out of the $n$ trials lead to success, then $$P(\text{success|$r$ successes out of $n$ trials}) = \frac{r+1}{n+1}$$
Practical Application of this Rule: User Reviews and Ratings
Countless number of times now, you have opened Amazon, searched a product and then become confused as to which product to choose based on the given ratings. Lets say (for simplicity), a user can give only positve or negative rating, instead of the stars. You are then presented with the following two sellers:
500 reviews and 95% positive rating
20 reviews and 98% positive rating