## Bayes' theorem

Bayes' theorem helps us compute conditional probabilities of events. It is much more than a mere problem-solving technique: the theorem paved the way for Bayesian statistics, which is philosophically different from the frequentist method on which this whole tutorial is based. More on this in the next section. Though it looks complicated when stated formally, Bayes' theorem (also called Bayes' rule) is very easy to use for problem solving.

In this section, we will first derive and state Bayes' theorem. Next, we will solve an example problem that demonstrates the theorem and shows how each term of the formal statement is used.

Recall the multiplication rule of conditional probability we learnt in the previous section. For any two events A and B in the sample space S, the multiplication rule states that,

$\small{ P(A \cap B) = P(A|B) P(B) = P(B|A) P(A) }$

Instead of a single event A, consider a set of n mutually exclusive and exhaustive events $\small{ A_1, A_2, A_3, ..., A_n }$ in the sample space S, so that exactly one of them occurs in any trial.

Let B be any event in the sample space S.

For any one of these events $\small{ A_k }$, and provided that $\small{ P(B) \gt 0 }$, we can write the conditional probability of $\small{A_k}$ given the event B as,

$\small{ P(A_k | B) = \dfrac{P(A_k \cap B)}{P(B)} }$

Since the event B can occur together with any one of the mutually exclusive events $\small{A_1}$ to $\small{A_n}$, we can write,

$\small{ P(B) = P(A_1 \cap B) + P(A_2 \cap B) + ... + P(A_n \cap B) \\ ~~~~~~~~= P(B|A_1) P(A_1)~+~P(B|A_2) P(A_2)~+~...~+~P(B|A_n) P(A_n) }$

Also, we can write the numerator $\small{ P(A_k \cap B) }$ as,

$\small {P(A_k \cap B) = P(B|A_k) P(A_k ) }$

With this, the conditional probability $\small{ P(A_k|B) }$ becomes,

$\small{ P(A_k|B) = \dfrac{P(B|A_k) P(A_k)}{P(B|A_1) P(A_1)~+~P(B|A_2) P(A_2)~+~...~+~P(B|A_n) P(A_n)} }$

The above formula is the statement of Bayes' theorem. We can write it in summation notation, using a separate index i for the sum in the denominator, as,

$\small{ P(A_k|B) = \dfrac{P(B|A_k) P(A_k)} {\sum\limits_{i=1}^{n} P(B|A_i) P(A_i)} }$
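The summation form maps directly onto a few lines of code. The following is a minimal Python sketch rather than part of the derivation: the function name `bayes_posterior` and the two-hypothesis numbers are hypothetical, chosen only for illustration.

```python
from fractions import Fraction


def bayes_posterior(priors, likelihoods):
    """Return the posteriors P(A_k | B) for mutually exclusive, exhaustive A_k.

    priors[k] is P(A_k); likelihoods[k] is P(B | A_k).
    """
    # Numerators of Bayes' theorem: P(B | A_k) * P(A_k)
    joint = [lik * pri for lik, pri in zip(likelihoods, priors)]
    # Denominator: P(B), the sum of all the numerators
    p_b = sum(joint)
    return [j / p_b for j in joint]


# Illustrative (made-up) numbers: two hypotheses with equal priors
example_priors = [Fraction(1, 2), Fraction(1, 2)]
example_likelihoods = [Fraction(1, 4), Fraction(3, 4)]
example_posteriors = bayes_posterior(example_priors, example_likelihoods)
print(example_posteriors)  # [Fraction(1, 4), Fraction(3, 4)]
```

Because the denominator is the sum of all the numerators, the returned posteriors always add up to 1.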

In order to understand the above theorem, we will solve an example problem and revisit the theorem after that:

Exercise 1:
We have three boxes called Box1, Box2 and Box3 which contain two brands of medicinal tablets with names M1 and M2. The details are as follows:

$~~~~$ Box1 has 3 tablets of M1 and 4 tablets of M2.
$~~~~$ Box2 contains 5 tablets of M1 and 3 tablets of M2.
$~~~~$ Box3 contains 6 tablets of M1 and 3 tablets of M2.

During a clinical trial, the probabilities of selecting these boxes are not equal, but are fixed as
$\small{ P(Box1) = \frac{1}{3} }$, $\small{P(Box2) = \frac{1}{6} }$ and $\small{P(Box3) = \frac{1}{2} }$.

The experiment consists of drawing a box at random with the above probabilities and then picking a tablet at random from the selected box.

What is the probability P(M1) of picking tablet M1 in such a draw?

To select M1, we must pick a box at random and then randomly pick a tablet from that box. Thus the event of selecting M1 is the union of three mutually exclusive events $\small{M1 \cap Box1,~ M1 \cap Box2,~ M1 \cap Box3}$. Therefore,

$\small{P(M1) = P(M1 \cap Box1) + P(M1 \cap Box2) + P(M1 \cap Box3) }$

$\small{~~~~~~~~~~ = P(M1|Box1) P(Box1) + P(M1|Box2) P(Box2) + P(M1|Box3) P(Box3)}$

$\small{~~~~~~~~~~ = \dfrac{3}{7} \times \dfrac{1}{3}~ + ~\dfrac{5}{8} \times \dfrac{1}{6}~ +~ \dfrac{6}{9} \times \dfrac{1}{2} }$

$\small{~~~~~~~~~~ = 0.58 }$
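The arithmetic above can be checked with exact fractions. This is a small Python sketch; the variable names are illustrative and not part of the exercise.

```python
from fractions import Fraction

# P(Box): probability of selecting each box
p_box = {"Box1": Fraction(1, 3), "Box2": Fraction(1, 6), "Box3": Fraction(1, 2)}
# P(M1 | Box): fraction of M1 tablets in each box
p_m1_given_box = {"Box1": Fraction(3, 7), "Box2": Fraction(5, 8), "Box3": Fraction(6, 9)}

# Total probability: sum of P(M1 | Box) * P(Box) over the three boxes
p_m1 = sum(p_m1_given_box[b] * p_box[b] for b in p_box)

print(p_m1, round(float(p_m1), 3))  # 65/112 0.58
```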

Suppose we know that the medicine M1 has been picked, but we do not know which box it came from. We wish to compute the conditional probability that the selected medicine M1 came from Box1. We denote this probability by the symbol P(Box1|M1). We can use Bayes' theorem to get this probability.

From the definition of conditional probability, we can write

$\small{P(Box1|M1) = \dfrac{P(Box1 \cap M1)}{P(M1)} }$

For the numerator, we write, $\small{P(Box1 \cap M1) = P(M1|Box1) P(Box1) }$

Since M1 could have come from any one of the three boxes, the selection of M1 has three mutually exclusive possibilities, namely from Box1, Box2 or Box3. We then write

$\small{P(M1) = P(M1|Box1) P(Box1) + P(M1|Box2) P(Box2) + P(M1|Box3) P(Box3) }$

With this, the expression for the conditional probability $\small{P(Box1|M1)}$ becomes,

$\small{ P(Box1|M1) = \dfrac{ P(M1|Box1) P(Box1)}{P(M1|Box1) P(Box1) + P(M1|Box2) P(Box2) + P(M1|Box3) P(Box3)} }$

From the given data, we have

$\small{P(M1|Box1) = \dfrac{3}{7},~~ P(Box1) = \dfrac{1}{3} }$

$\small{P(M1|Box2) = \dfrac{5}{8},~~ P(Box2) = \dfrac{1}{6} }$

$\small{P(M1|Box3) = \dfrac{6}{9},~~ P(Box3) = \dfrac{1}{2} }$

Substituting the values, we get the probability that medicine M1 came from Box1 as,

$\small{ P(Box1|M1) = \dfrac{\dfrac{3}{7} \times \dfrac{1}{3}}{ (\dfrac{3}{7} \times \dfrac{1}{3})~+~ (\dfrac{5}{8} \times \dfrac{1}{6})~+~ (\dfrac{6}{9} \times \dfrac{1}{2}) } \approx 0.246 }$

Similarly, we get the conditional probability that the medicine M1 would have come from Box2:
$\small{ P(Box2|M1) = \dfrac{ P(M1|Box2) P(Box2)}{P(M1|Box1) P(Box1) + P(M1|Box2) P(Box2) + P(M1|Box3) P(Box3)} }$

$~~~~~~~~~~~~~~~~~\small{ = \dfrac{\dfrac{5}{8} \times \dfrac{1}{6}}{ (\dfrac{3}{7} \times \dfrac{1}{3})~+~ (\dfrac{5}{8} \times \dfrac{1}{6})~+~ (\dfrac{6}{9} \times \dfrac{1}{2}) } \approx 0.179 }$

Next, we compute the conditional probability that the medicine M1 came from Box3:
$\small{ P(Box3|M1) = \dfrac{ P(M1|Box3) P(Box3)}{P(M1|Box1) P(Box1) + P(M1|Box2) P(Box2) + P(M1|Box3) P(Box3)} }$

$~~~~~~~~~~~~~~~~~\small{ = \dfrac{\dfrac{6}{9} \times \dfrac{1}{2}}{ (\dfrac{3}{7} \times \dfrac{1}{3})~+~ (\dfrac{5}{8} \times \dfrac{1}{6})~+~ (\dfrac{6}{9} \times \dfrac{1}{2}) } \approx 0.574 }$
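The three conditional probabilities share the same denominator, so they can be computed together. The sketch below does this in Python with exact fractions; the dictionary names are illustrative assumptions, not notation from the exercise.

```python
from fractions import Fraction

# Probabilities of selecting each box, and fraction of M1 in each box
prior = {"Box1": Fraction(1, 3), "Box2": Fraction(1, 6), "Box3": Fraction(1, 2)}
likelihood = {"Box1": Fraction(3, 7), "Box2": Fraction(5, 8), "Box3": Fraction(6, 9)}

# Numerators P(M1 | Box) * P(Box), and their sum P(M1)
joint = {b: likelihood[b] * prior[b] for b in prior}
p_m1 = sum(joint.values())

# Conditional probabilities P(Box | M1)
posterior = {b: joint[b] / p_m1 for b in prior}

for b, p in posterior.items():
    print(f"P({b}|M1) = {p} = {float(p):.3f}")
# P(Box1|M1) = 16/65 = 0.246
# P(Box2|M1) = 7/39 = 0.179
# P(Box3|M1) = 112/195 = 0.574
```

The three values add up to exactly 1, as they must, since M1 came from one of the three boxes.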

The solutions to the set of problems shown above have to be carefully analysed.

Note that the original probabilities of selecting the three boxes were given in the beginning as $\small{P(Box1) = \dfrac{1}{3} = 0.333}$, $\small{P(Box2) = \dfrac{1}{6} = 0.167}$ and $\small{P(Box3) = \dfrac{1}{2} = 0.5}$.
However, once it was known that the medicine pulled out of the box was M1, these probabilities changed to $\small{P(Box1|M1)=0.246}$, $\small{P(Box2|M1)=0.179}$ and $\small{P(Box3|M1)=0.574}$.

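This can be intuitively understood as follows: Before a medicine is selected, the three boxes have certain probabilities of being selected. The information that the medicine M1 has been drawn from the selected box alters these probabilities, since different boxes contain different fractions of M1. Each new probability is proportional to the product of the original probability of the box and the fraction of M1 in it. Box3 has both the largest selection probability and the largest fraction of M1, so once M1 has been drawn, Box3 is the most probable source.

The fraction of M1 in a box decides whether the probability of that box goes up or down. Since

$\small{P(M1|Box1) = 0.429 ~\lt~ P(M1) = 0.58 ~\lt~ P(M1|Box2) = 0.625 ~\lt~ P(M1|Box3) = 0.667 }$,

the probability of Box1 decreases on observing M1, while the probabilities of Box2 and Box3 increase. Note, however, that the new probabilities themselves are ordered as

$\small{P(Box2|M1) \lt P(Box1|M1) \lt P(Box3|M1) }$

because Box2, despite its higher fraction of M1, is selected only half as often as Box1.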

The original probabilities $\small{P(Box1)}$, $\small{P(Box2)}$ and $\small{P(Box3)}$ are called the prior probabilities.

The conditional probabilities $\small{P(Box1|M1)}$, $\small{P(Box2|M1)}$ and $\small{P(Box3|M1)}$ are called the posterior probabilities.

Thus, by employing Bayes' theorem, the prior probabilities have been modified to posterior probabilities using the available information (data): the observation that M1 was drawn, together with the fractions $\small{P(M1|Box1)}$, $\small{P(M1|Box2)}$ and $\small{P(M1|Box3)}$ of M1 in each box.
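The posterior probabilities can also be checked empirically by repeating the experiment many times and looking only at the trials in which M1 was drawn. The following Python simulation is a rough sketch of that idea: the seed, the number of trials and all variable names are arbitrary choices made here, not part of the exercise.

```python
import random

random.seed(0)  # fixed seed so that the run is reproducible

boxes = ["Box1", "Box2", "Box3"]
box_weights = [1 / 3, 1 / 6, 1 / 2]  # probabilities of selecting each box
contents = {"Box1": (3, 4), "Box2": (5, 3), "Box3": (6, 3)}  # (M1, M2) tablets

n_trials = 200_000
m1_counts = {b: 0 for b in boxes}  # trials in which M1 came from each box

for _ in range(n_trials):
    # Step 1: pick a box with the given probabilities
    box = random.choices(boxes, weights=box_weights)[0]
    # Step 2: pick a tablet at random from that box
    n_m1, n_m2 = contents[box]
    if random.random() < n_m1 / (n_m1 + n_m2):
        m1_counts[box] += 1

# Among the trials that produced M1, the share contributed by each box
total_m1 = sum(m1_counts.values())
estimated_posterior = {b: m1_counts[b] / total_m1 for b in boxes}
print(estimated_posterior)
```

With this many trials, the estimated shares should land close to 0.246, 0.179 and 0.574, though the exact digits depend on the random seed.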