B Are proofs needed for definitions? Conditional probabilities

jeremy22511
My probability class has me wondering about pure math questions now. We started with the axioms and are slowly building up the theory. Everything was fine, but then a definition of conditional probability, ##P[A|B] = \frac{P[AB]}{P[B]}##, appeared, and it's just not sitting right with me. I know the formula works because in simple problems I can usually see the answer. I'm just not seeing how it works or why a proof isn't needed.
 
jeremy22511 said:
My probability class has me wondering about pure math questions now. We started with the axioms and are slowly building up the theory. Everything was fine, but then a definition of conditional probability, ##P[A|B] = \frac{P[AB]}{P[B]}##, appeared, and it's just not sitting right with me. I know the formula works because in simple problems I can usually see the answer. I'm just not seeing how it works or why a proof isn't needed.
I guess you mean
$$P(A|B) = \frac{P(A \cap B)}{P(B)}$$
You can justify/prove that by noting that you are taking ##B## as your reduced sample space and taking the proportion of those cases in which event ##A## also occurs.
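For a concrete check, here is a small worked example (the numbers are mine, just for illustration): roll a fair die, let ##B## be "the roll is even" and ##A## be "the roll is at most 2". Within the reduced sample space ##B = \{2, 4, 6\}##, only the outcome ##2## also lies in ##A##, so
$$P(A|B) = \frac{1}{3} = \frac{1/6}{1/2} = \frac{P(A \cap B)}{P(B)}$$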

Note that if you rewrite this as:
$$P(A \cap B) = P(A|B)P(B)$$
Then, you also have:
$$P(A \cap B) = P(B \cap A) = P(B|A)P(A)$$
And Bayes's theorem drops out:
$$P(A|B)P(B) = P(B|A)P(A)$$
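With the same illustrative die example (##A = \{1, 2\}##, ##B = \{2, 4, 6\}##), both sides agree:
$$P(A|B)P(B) = \frac{1}{3} \cdot \frac{1}{2} = \frac{1}{6} = \frac{1}{2} \cdot \frac{1}{3} = P(B|A)P(A)$$
each side being equal to ##P(A \cap B) = P(\{2\}) = 1/6##.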
 
Well, I'm not a mathematician, but let me explain my point of view. To be clear, let me state the main answer to your question up front:
"Definitions can't (and notice I say can't, not don't need to) be proved."
So, if you define conditional probability as
$$P(A|B) = \frac{P(AB)}{P(B)}$$
this is a definition. You can say that you don't like it and would prefer another definition, or that you think the definition is useless, or whatever you want, but the statement itself cannot be proved or disproved. For example, in mathematics one defines three numbers as follows:
$$\sqrt{2}:\text{ is the unique positive number whose square is } 2.$$
$$\pi:\text{ is the circumference of a circle of diameter }1.$$
$$e = \lim_{n\to\infty}\left(1+\frac{1}{n}\right)^n$$
How could you prove those definitions? Or prove they are wrong? Try to think about it.

Another question, which I think is the important one here, is: "Can a single thing have two different definitions?" Maybe your problem is that your professor has told you that the definition of conditional probability is
$$P(A|B): \text{ Is the probability of an event, }A \text{, occurring given that another event, }B\text{, has occurred. }$$
Then, of course, you must prove that the two definitions are equivalent (i.e. that one holds if and only if the other does); a sketch of such a proof is given below. Once you have proved that, you can say that they are two definitions of the same thing (although it is usually best to keep one and only one "true" definition, because sometimes the equivalence is proved using some axiom, and in other settings with different axioms the definitions need not be equivalent).
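As a sketch of what such an equivalence proof can look like (assuming the classical model, where ##\Omega## is finite and all outcomes are equally likely), the verbal definition reduces to the formula by counting:
$$P(A|B) = \frac{|A \cap B|}{|B|} = \frac{|A \cap B|/|\Omega|}{|B|/|\Omega|} = \frac{P(A \cap B)}{P(B)}$$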

So, to summarize: a definition cannot be proved, but if you have two definitions for the same thing, then you should choose one of them as the "true" definition and prove the other statement as a consequence of it.
 
jeremy22511 said:
Everything was fine, but then a definition of conditional probability, ##P[A|B] = \frac{P[AB]}{P[B]}##, appeared

Yes, the definition of conditional probability is a conceptual leap.

I'll assume you have studied the concept of a "probability space" (perhaps by a different name). It consists of a set ##\Omega## whose elements are called "outcomes" and a "probability measure" ##P## defined on a collection ##\Sigma## of subsets of ##\Omega##. The subsets are called "events". In an advanced course, there is the requirement that the collection of events is a "sigma algebra" of sets.

The laws of probability first taught, such as ##P(A \cup B) = P(A) + P(B) - P(A \cap B)## refer to a single probability space and single probability measure ##P##.

When students are told about conditional probabilities like ##P(A|B)## they often don't understand that conditional probability involves two different probability spaces.

One can attempt to explain the distinction between ##P(A|B)## and ##P(A \cap B)## by focusing on the meaning of the English words "given" and "and". This type of explanation doesn't make it clear that the notations "##P(A|B)##" and "##P(A \cap B)##" refer to different probability measures applied to the same set of outcomes. In fact, students are liable to think that "##A|B##" and "##A \cap B##" denote different events (i.e., different subsets of outcomes) and that the same probability measure ##P## is applied to both.

The correct way to look at conditional probability is that "##P(...|B)##" refers to a probability measure different from ##P##. From the viewpoint of pure math, it would be clearer to use notation like:

Definition of conditional probability: Given a probability space ##(\Omega, \Sigma, P)## and an event ##B## with ##P(B) > 0##, the conditional probability ##P(A|B)## is defined to be the probability of the event ##A \cap B## in the probability space ##(\Omega, \Sigma, Q)##, where the probability measure ##Q## is given by ##Q(X) = P(X \cap B)/P(B)##.
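One can check that ##Q## really is a probability measure on ##(\Omega, \Sigma)## (a quick sketch, assuming ##P(B) > 0##): it is nonnegative; it is normalized, since
$$Q(\Omega) = \frac{P(\Omega \cap B)}{P(B)} = \frac{P(B)}{P(B)} = 1;$$
and countable additivity is inherited from ##P##, because if the events ##A_1, A_2, \dots## are pairwise disjoint, so are the events ##A_1 \cap B, A_2 \cap B, \dots##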

Using the "##P(A|B)##" type notation, you should think of the "##P(...|B)##" part of it as notation for a probability measure ##Q## that is different from ##P##. Think of it that way instead of thinking of "##A|B##" and "##A \cap B##" as different events that are assigned values by the same probability measure ##P##.
 