1. The problem statement, all variables and given/known data

My book says: "Let A and B be two events such that P(A) > 0. Denote by P(B|A) the probability of B given that A has occurred. Since A is known to have occurred, it becomes the new sample space replacing the original S. From this we are led to the definition P(B|A) ≡ P(A ∩ B) / P(A). P(A ∩ B) ≡ P(B|A) P(A)."

2. Relevant equations

P(B|A) ≡ P(A ∩ B) / P(A) [i]
P(A ∩ B) ≡ P(B|A) P(A) [ii]

3. The attempt at a solution

There are two things I would like to ask about:

1) Could someone please explain the intuition behind equation [i] at least? I understand why P(A) is the denominator: event A has happened, so the probability we want is measured relative to the probability of A. What I don't intuitively understand is why P(A ∩ B) is the numerator of [i].

2) I suppose that [i] is not a theorem, because it is not something that can be proven (and the same goes for [ii]). Why is [i] (and likewise [ii]) a definition rather than an axiom, though? Phrased more directly: why is it a definition, and why does it need to be defined?

I hope my questions make sense; please tell me if they don't. I appreciate all answers, but please keep them as succinct as you can, because I don't want to get more confused.
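
To make the book's "new sample space" sentence concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration and not from the book: a single roll of a fair six-sided die, A = "the roll is even", B = "the roll is greater than 3", and a hypothetical helper named `prob`. It computes P(B|A) two ways: by counting the outcomes of B inside A, and by the formula P(A ∩ B) / P(A) in the original sample space S.

```python
from fractions import Fraction

# Hypothetical example (not from the book): one roll of a fair six-sided die.
S = {1, 2, 3, 4, 5, 6}
A = {s for s in S if s % 2 == 0}   # event A: the roll is even
B = {s for s in S if s > 3}        # event B: the roll is greater than 3

def prob(event, space):
    """Probability of `event` when every outcome in `space` is equally likely."""
    return Fraction(len(event & space), len(space))

# Route 1: treat A as the new sample space and count the outcomes of B inside it.
p_restricted = prob(B, A)

# Route 2: apply the formula P(A ∩ B) / P(A) in the original sample space S.
p_definition = prob(A & B, S) / prob(A, S)

print(p_restricted, p_definition)  # prints: 2/3 2/3
```

In this equally-likely setting both routes agree, which is at least consistent with reading the numerator P(A ∩ B) as "the part of B that is still possible once A is known to have occurred".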