1. Sep 7, 2015

### TheMathNoob

1. The problem statement, all variables and given/known data
I don't understand why $P(A|B) = P(A \cap B)/P(B)$.

2. Relevant equations
conditional probability

$P(A|B) = P(A \cap B)/P(B)$
(read: the probability of $A$ given $B$)

3. The attempt at a solution
I can work through a bunch of problems with this definition intuitively, but I really want to understand, in terms of sets, what is going on behind this equation. Why is it true? The book is very vague in explaining it.

2. Sep 7, 2015

### RUber

If you already know that $B$ has occurred, you have to rescale your probabilities so that the probabilities of all the remaining possibilities sum to 1.
Dividing by $P(B)$ is the proper scaling, since $P(B)$ is the size (the probability) of the reduced sample space you are now interested in.
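
A short check of the scaling claim, as a sketch that assumes only $P(B) > 0$ and the additivity of $P$ (here $\bar{A}$ denotes "not-$A$"): the events $A \cap B$ and $\bar{A} \cap B$ are disjoint and together make up $B$, so
$$P(A|B) + P(\bar{A}|B) = \frac{P(A \cap B) + P(\bar{A} \cap B)}{P(B)} = \frac{P(B)}{P(B)} = 1$$
Without the division by $P(B)$, the two terms would sum only to $P(B)$ rather than to 1.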

3. Sep 7, 2015

### Ray Vickson

Why do you write in bold font? It makes it look like you are shouting at us.

Anyway: $P(A|B) = P(A\cap B)/P(B)$ is a definition. It cannot be "proved", because definitions are not subject to proof except, maybe, proof of usefulness.

It can be motivated in several ways. One way is to look at the relative frequency interpretation of probability. So, say we have events $A, B$ in some sample space $S$. Imagine performing a large number $N$ of independent, identical experiments, and in each experiment we observe whether or not $A, B$ or both occurred. We can categorize the outcomes of our experiments as follows.
$$\begin{array}{c|cc|c} & A & \bar{A} & \text{Total} \\ \hline B & N_{AB} & N_{\bar{A} B} & N_B \\ \bar{B} & N_{A \bar{B}} & N_{\bar{A} \bar{B}} & N_{\bar{B}} \\ \hline \text{Total} & N_A & N_{\bar{A}} & N \end{array}$$
Here, $\bar{A}$ means "not-$A$" and $N_{UV}$ is the number of experiments in which the events $U$ and $V$ both occurred. In the relative frequency interpretation we take the ratio $N_A/N$ as an estimate of $P(A)$, $N_{AB}/N$ as an estimate of $P(AB) \equiv P(A \cap B)$, etc.

In fact, it is a deep result in probability theory---called the strong law of large numbers---that in the limit $N \to \infty$ the ratios approach the probabilities almost surely (that is, with probability 1). This corresponds to the weird-sounding statement that "the probability is 1.0 that the limiting ratio equals the probability". More colloquially, we could restate this as "there is no chance that the limiting ratio is not equal to the probability"---but the fact remains that it is a statement about the probability of a probability.

Anyway, to continue the story: if we look only at those experiments in which $B$ occurred, these are the ones in row 1 of the table. There are $N_B \approx P(B) \cdot N$ such outcomes altogether. Among those, $N_{AB}$ outcomes had $A$ occurring, so if we imagine the $B$-outcomes to form a new "sample space", then in that sample space we have $N_{AB}$ occurrences of $A$ and $N_{\bar{A} B}$ non-occurrences of $A$. A frequentist would interpret this as saying that (in the sample space of $B$-outcomes), the probability $P_B(A)$ of $A$ is
$$P_B(A) \approx \frac{N_{AB} }{N_B} \approx \frac{P(A \cap B) \cdot N}{ P(B) \cdot N} = \frac{P(A \cap B)}{P(B)}$$
We call this $P_B$ relative to the new sample-space $B$ the "conditional probability, given $B$". Usually, at least in elementary treatments, we use the notation $P(A|B)$ instead of $P_B(A)$, but the interpretation is exactly the same.
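
The relative-frequency argument above can be tried numerically. The following is a minimal sketch, not part of the original post: it assumes a fair six-sided die, with the illustrative events $A$ = "roll is even" and $B$ = "roll is greater than 3". The function names `conditional_frequency` and `exact_conditional` are made up for this example. It counts $N_B$ and $N_{AB}$ over many simulated rolls and compares $N_{AB}/N_B$ with $P(A \cap B)/P(B)$ computed by counting outcomes.

```python
import random
from fractions import Fraction


def conditional_frequency(n_trials, seed=0):
    """Simulate n_trials fair die rolls and return the counts (N_AB, N_B).

    A = "roll is even", B = "roll is greater than 3" (illustrative choices).
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    n_b = 0   # experiments in which B occurred
    n_ab = 0  # experiments in which both A and B occurred
    for _ in range(n_trials):
        roll = rng.randint(1, 6)
        if roll > 3:            # B occurred: this experiment is in row 1 of the table
            n_b += 1
            if roll % 2 == 0:   # A occurred as well
                n_ab += 1
    return n_ab, n_b


def exact_conditional():
    """Return P(A n B) / P(B) by counting equally likely outcomes."""
    outcomes = range(1, 7)
    a = {x for x in outcomes if x % 2 == 0}
    b = {x for x in outcomes if x > 3}
    p_ab = Fraction(len(a & b), 6)
    p_b = Fraction(len(b), 6)
    return p_ab / p_b


if __name__ == "__main__":
    n_ab, n_b = conditional_frequency(100_000)
    print("N_AB / N_B        =", n_ab / n_b)
    print("P(A n B) / P(B)   =", float(exact_conditional()))
```

With a large number of trials the two printed values should be close, which is the approximation $N_{AB}/N_B \approx P(A \cap B)/P(B)$ in the argument above.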

Note that we look at conditional probabilities all the time, perhaps without even realizing it. For example, say we have a jar containing 6 red balls and 4 black balls. We draw 2 balls at random, without replacement. What is the probability that both are red? A standard argument is that for the first draw we have P(red) = 6/10, because of the randomness. Now, after drawing a red first, we are left with 9 balls, of which 5 are red and 4 are black, so for the second draw we have P(red) = 5/9. Notice that we have, essentially, switched to a conditional probability, but there is no mystery at all in doing that---we are just looking at a jar of red and black balls again. So, we have $P(R_1 \cap R_2) = P(R_1) P(R_2 | R_1)$ in this case, almost without thinking about it. That can be turned around to give the statement $P(R_2|R_1) = P(R_1 \cap R_2)/P(R_1)$.
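
The jar example can also be checked by brute-force enumeration. This is a sketch under the assumption that every ordered pair of distinct balls is equally likely. The names `balls`, `draws` and `prob` are chosen here for illustration only.

```python
from fractions import Fraction
from itertools import permutations

# 6 red and 4 black balls, labelled by position 0..9
balls = ["R"] * 6 + ["B"] * 4

# All ordered pairs of distinct balls: drawing 2 without replacement
draws = list(permutations(range(10), 2))


def prob(event):
    """Probability of an event as (favourable draws) / (all draws)."""
    favourable = sum(1 for d in draws if event(d))
    return Fraction(favourable, len(draws))


def first_red(d):
    return balls[d[0]] == "R"


def both_red(d):
    return balls[d[0]] == "R" and balls[d[1]] == "R"


p_r1 = prob(first_red)           # P(R1)
p_both = prob(both_red)          # P(R1 n R2)
p_r2_given_r1 = p_both / p_r1    # P(R2 | R1) from the definition

print(p_r1, p_both, p_r2_given_r1)
```

The ratio computed from the definition agrees with the direct "5 red out of 9 remaining" argument, and multiplying $P(R_1)$ by $P(R_2|R_1)$ recovers $P(R_1 \cap R_2)$.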

4. Sep 7, 2015

### TheMathNoob

Whoa, Ray, thank you so much for your help. No one has ever given me such a detailed explanation. It makes sense to me now how we got there from the basic idea of (# outcomes in the event)/(# outcomes in the sample space). It's a sample space inside another sample space, and we count the events that occur together in this new sample space.