Homework Help: Probability Events in Statistics

1. Nov 28, 2012

eric.l

1. The problem statement, all variables and given/known data

Given two events, A and B, state why each of the following is not possible. Use formulas or equations to illustrate your answer.

d. P(B) = 0.24 and P(B|A) = 0.32

2. Relevant equations

P(A ∩ B) = P(A) * P(B|A)
P(B|A)= (P(A ∩ B))/P(A)
P(A|B)= (P(A ∩ B))/P(B)

3. The attempt at a solution

My teacher gave us the hint that P(B) ≥ P(A ∩ B), so:
P(B) ≥ P(A ∩ B)
P(B) ≥ P(A) * P(B|A)
P(B) ≥ P(A) * P(A ∩ B)/P(A), the P(A)'s cancel to get right back to
P(B) ≥ P(A ∩ B)
and here I am, lost. I've no idea if this is even remotely on the right track, but that was my attempt and fail.

2. Nov 28, 2012

Dick

I really don't see any reason why that should be not possible. Suppose you have 1000 equally likely outcomes. B contains 240 of them. Suppose A contains 100 outcomes and AnB contains 32 outcomes. Then P(B)=240/1000=0.24 and P(B|A)=32/100=0.32. I don't see any problem with that. Do you?
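The counterexample above can be checked by brute force. This is only a sketch: it assumes 1000 equally likely outcomes labelled 0..999, and the particular index ranges chosen for A and B are arbitrary (only the counts 240, 100 and 32 come from the post). The helper `prob` is a hypothetical name introduced for illustration.

```python
from fractions import Fraction

# Assumption: 1000 equally likely outcomes, labelled 0..999.
omega = set(range(1000))

# Only the sizes come from the post; which outcomes go where is arbitrary.
B = set(range(0, 240))        # |B| = 240
A = set(range(208, 308))      # |A| = 100, overlapping B in 208..239 (32 outcomes)

def prob(event, given=None):
    """P(event | given) for equally likely outcomes; given defaults to omega."""
    given = omega if given is None else given
    return Fraction(len(event & given), len(given))

p_B = prob(B)                   # 240/1000
p_A_and_B = prob(A & B)         # 32/1000
p_B_given_A = prob(B, given=A)  # 32/100

print(float(p_B), float(p_B_given_A), float(p_A_and_B))
```

Both given values are reproduced, and the teacher's hint P(B) ≥ P(A ∩ B) still holds, so the hint alone does not rule the pair out.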

3. Nov 28, 2012

Simon Bridge

OP's teacher supplied the following hint: P(B) ≥ P(A ∩ B)
... in Dick's example, P(AnB)=32/1000=0.032 < P(B) - so it still fits... except ... Dick gives the system a superset of 1000 members of which A and B are subsets.

But the question only mentions two categories: A and B ... if A and B are all there is, (perhaps we are supposed to take this as implicit) then P(A) + P(B) ≥ 1

... and the argument goes like this:

P(B|A) = P(AnB)/P(A) ≤ P(B)/P(A)... since P(AnB) ≤ P(B)

Hence: P(A) ≤ P(B)/P(B|A), i.e. P(A) ≤ 0.24/0.32 = 0.75

Since we know that P(B)=0.24, then P(A) + P(B) ≤ 0.99 < 1

Which contradicts the requirement that P(A) + P(B) ≥ 1
Hence: impossible.
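The arithmetic in this argument can be replayed with exact fractions. This is a sketch under the extra assumption made in the post (not stated in the problem) that A and B together cover the whole sample space; the variable names are illustrative.

```python
from fractions import Fraction

p_B = Fraction(24, 100)
p_B_given_A = Fraction(32, 100)

# From P(A n B) <= P(B):  P(A) * P(B|A) <= P(B)  =>  P(A) <= P(B) / P(B|A)
max_p_A = p_B / p_B_given_A     # 3/4
max_sum = max_p_A + p_B         # 99/100

# Extra assumption (not in the problem statement): A U B is the whole
# sample space, which would force P(A) + P(B) >= 1.
can_cover_sample_space = max_sum >= 1

print(max_p_A, max_sum, can_cover_sample_space)
```

The contradiction appears only once the covering assumption is added; without it the bound P(A) ≤ 0.75 is perfectly satisfiable.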

I wouldn't normally do someone's homework for them - but (a) there is the possibility I've made a mistake, and (b) it goes to an ambiguity in the question: I'm unsure how to encourage OP to the realization, and (c) I'd already hit save by accident :) ... so OP already got the email notification.

Last edited: Nov 28, 2012

4. Nov 28, 2012

Dick

Hah, hit save by accident. I've done that. Usually I try to delete as fast as possible and hope for the best. I think there might be a delay on the email thing, but I might be wrong. You raise a good point. If A and B are the only allowable outcomes then things change. I usually don't think of questions like this that way, but that would make sense.

Last edited: Nov 28, 2012

5. Nov 29, 2012

Simon Bridge

I don't normally read them that way either - I got there by wondering if there was some extra information left out ... perhaps implicit in the context. i.e. we have not seen the other "impossible" situations ... perhaps they all rely on A and B being the only sets?

When the numbers panned out so nicely I figured it was probably the case... even though the wording does leave room for there to be events that are neither A nor B. In fact - if you came across those statistics, you'd take it as evidence that there was at least one other possible event.

6. Nov 29, 2012

Ray Vickson

If A and B are the only sets, presumably they must be disjoint and together give the whole space. But that would give them an empty intersection, so P(A|B) = P(B|A) = 0. On the other hand, if they are not disjoint we also have "other" sets, such as $A - B, \; B - A, \; A \cap B.$ Presumably we would have $A \cup B = \Omega$ in this case, where Ω is the sample space. So, I cannot really make sense of the problem as stated.

7. Nov 29, 2012

Simon Bridge

If AUB is the entire sample-space, then A and B are disjoint if P(A)+P(B)=1
They intersect if P(A)+P(B)>1

If P(A)+P(B)<1 then AUB is not the entire sample space.

For instance, I could have set A being "women" and set B being "scientists". AnB would be "women scientists". If I carefully filled the room so that all the men present were scientists then sets A and B would be all there was and there would be a non-empty intersection.
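A tiny made-up room illustrates the point. The ten people and their attributes below are entirely hypothetical; the only property taken from the post is that every man present is a scientist, so nobody falls outside AUB.

```python
from fractions import Fraction

# Hypothetical room of 10 people: (name, is_woman, is_scientist).
# Every man in the room is a scientist, so nobody is outside A U B.
room = [
    ("p0", True, True), ("p1", True, True), ("p2", True, False),
    ("p3", True, False), ("p4", True, False), ("p5", True, False),
    ("p6", False, True), ("p7", False, True), ("p8", False, True),
    ("p9", False, True),
]
omega = {name for name, _, _ in room}
A = {name for name, woman, _ in room if woman}      # women
B = {name for name, _, sci in room if sci}          # scientists

def prob(event):
    """Probability of picking someone in `event` uniformly at random."""
    return Fraction(len(event), len(omega))

# Inclusion-exclusion: P(A U B) = P(A) + P(B) - P(A n B)
assert prob(A | B) == prob(A) + prob(B) - prob(A & B)

print(prob(A) + prob(B), prob(A & B))
```

With AUB equal to the whole room, P(A)+P(B) exceeds 1 by exactly P(AnB), which is the inclusion-exclusion identity behind the three cases listed above.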

8. Nov 29, 2012

Ray Vickson

Right. All I was saying was that if A and B are the "only" sets then they must be disjoint and their union must be the whole space, in which case P(A|B) = 0, etc. If they have a nonempty intersection then they cannot be the "only" sets---there are others (that can be constructed from them by set operations).

9. Nov 29, 2012

D H

Staff Emeritus

In short, either the problem is wrong or eric.l did not specify the full problem; something's missing.

There certainly is nothing wrong with P(B) being very small and P(B|A) being large. That is essentially what automated fault detection is all about, looking for extremely low probability events (failures should be extremely unlikely) in light of evidence.

10. Nov 29, 2012

Simon Bridge

Well the problem is supposed to be wrong - the question asks to explain how it is wrong.
The given A and B, in the problem, you mean?

Or perhaps I am not using the right terminology to describe what I have in mind?

11. Nov 29, 2012

eric.l

Basically everything above is what I ended up getting to and all of it seemed to be possible, but my teacher stated that it wasn't possible. Thank you so much for all the help though, at least I was on the right track with what I did afterwards.

12. Nov 29, 2012

Ray Vickson

I'm sorry, but the two items "P(B) = 0.24 and P(B|A) = 0.32" ARE possible, so either your teacher is wrong or you have left out some conditions.

13. Nov 29, 2012

D H

Staff Emeritus

Exactly. DNA testing, drug testing, and all other kinds of statistical inferencing rely on this.

14. Nov 29, 2012

eric.l

Showed the above to my teacher and he confirmed that it was the correct way to show that it was impossible. I double and triple checked before I even posted to make sure I didn't forget anything and I did not, and I'm almost positive the question came off a previous AP exam and I doubt that whoever issues the tests would give a question that doesn't go along with the instructions given.

15. Nov 29, 2012

D H

Staff Emeritus

If this is supposed to be a generic statement, it's wrong. Flat out wrong.

16. Nov 29, 2012

Ray Vickson

From the problem description, as you have written it, there is no reason why you need P(A) + P(B) ≥ 1. The problem talks about two events A and B and says nothing about "categories" or anything like that.

17. Nov 29, 2012

Simon Bridge

If A and B are the only two events, then P(A)+P(B)≥1 ... this condition may not have been given explicitly. There may have been something in the original wording or the context of the lesson (perhaps the class has not considered the situation where there is a superset or a third event possible yet?)

The way the numbers pan out suggests this is the case. However - I'd be interested to see what the teacher in question has to say. If this was me, I'd present both Dick's example and the above for marking and maybe finesse a bonus mark ;)

18. Nov 30, 2012

D H

Staff Emeritus

If A and B are the only two events then A is the complement of B. In that case, P(A)+P(B) is identically equal to one (not ≥1). Saying P(B|A) cannot be 0.32 in this case is a bit silly because P(B|A) is identically zero. The events are mutually exclusive.

If A is not the complement of B, there are (at least) four events: B, ~B, A, and ~A. Your P(A)+P(B)≥1 is not necessarily true.

An example: Suppose there is some boolean property P that pertains to each member of the population, and suppose there is a test Q for that condition. Set B comprises those members of the population for which property P is true. Set A comprises those members of the population for which test Q yields a positive result. Note that P(B|A) is identically one if test Q has no false positives. (This does not mean that A=B. For example, the test can have false negatives.)

Suppose the test is not perfect; it yields false positives and false negatives. Suppose there are much more expensive tests that collectively are perfect, and suppose we subject one thousand members of the population to these definitive tests and to test Q, yielding the following confusion matrix:
      |   A   ~A  total
------+----------------
B     | 220   20    240
~B    |   2  758    760
total | 222  778   1000
In this example, P(B) is 0.24 while P(B|A) is over 0.99. The test does a very good job of positively identifying members with the condition P, but it does so at the expense of a significant number of false negatives. A cheap screening test should have a low false negative rate. Suppose the confusion matrix is instead:
      |   A   ~A  total
------+----------------
B     | 239    1    240
~B    |  26  734    760
total | 265  735   1000
This has a low number of false negatives (less than 0.5% of B) but at the expense of significantly more false positives (almost 10% of the positive results). Even with those false positives, P(B|A) is still over 90%.

Note that in these two examples, P(A)+P(B) is about 1/2.
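The figures quoted for the two confusion matrices can be recomputed directly from the cell counts. The sketch below copies the counts from the tables; the function name `summarize` and the dictionary keys are illustrative choices, and "false positives" is read here as the share of positive results (A) that are not in B.

```python
from fractions import Fraction

# Counts copied from the two confusion matrices in the post, in the order
# (B and A, B and ~A, ~B and A, ~B and ~A).
matrices = {
    "first": (220, 20, 2, 758),
    "second": (239, 1, 26, 734),
}

def summarize(b_a, b_na, nb_a, nb_na):
    """Return the probabilities discussed in the post as exact fractions."""
    total = b_a + b_na + nb_a + nb_na
    p_A = Fraction(b_a + nb_a, total)
    p_B = Fraction(b_a + b_na, total)
    return {
        "P(A)": p_A,
        "P(B)": p_B,
        "P(B|A)": Fraction(b_a, b_a + nb_a),
        "P(~A|B)": Fraction(b_na, b_a + b_na),  # false negatives among B
        "P(~B|A)": Fraction(nb_a, b_a + nb_a),  # false positives among A
        "P(A)+P(B)": p_A + p_B,
    }

results = {name: summarize(*counts) for name, counts in matrices.items()}
for name, stats in results.items():
    print(name, {key: round(float(value), 4) for key, value in stats.items()})
```

In both tables P(B) is 0.24 while P(B|A) is far larger, and P(A)+P(B) stays well below 1 or just above 1/2, so neither table satisfies the covering assumption discussed earlier in the thread.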

Last edited: Nov 30, 2012

19. Nov 30, 2012

Simon Bridge

But you know what I'm trying to say?

I am imagining a Venn diagram with two circles that overlap. There is nothing outside them.

This is a typical HS beginning example used when students are starting out... so I figure this is implicit in the class in question.
Suppose this, and the question makes sense.

How would you put it?

I guess the implication is that we were supposed to know that $A \cup B = \Omega$ (the whole sample space). That does mean that A and B are not the only sets: there are also A-B, B-A and AB (the intersection). However, I suppose one can guess what is meant, and in that case the result is, indeed, true. If we let P(AB) = x, we have P(B) = x + P(B-A) = x + 1 - P(A), so 0.24 = x + 1 - P(A), which gives P(A) = x + 0.76. Thus, P(B|A) = x/(x + 0.76) = 0.32, so x = 152/425 ≈ 0.3576, and that would make P(A) = 0.76 + x = 19/17 > 1, which is impossible.
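The algebra above is easy to verify with exact fractions. This sketch takes the post's assumption that $A \cup B = \Omega$ as given; the variable names are illustrative.

```python
from fractions import Fraction

p_B = Fraction(24, 100)
p_B_given_A = Fraction(32, 100)

# Assumption from the post: A U B is the whole sample space, so
# P(B - A) = 1 - P(A), and therefore P(A) = x + (1 - P(B)) with x = P(AB).
c = 1 - p_B                               # 0.76

# Solve x / (x + c) = P(B|A) for x:  x = P(B|A) * c / (1 - P(B|A))
x = p_B_given_A * c / (1 - p_B_given_A)
p_A = x + c

print(x, p_A)
```

Besides P(A) exceeding 1, the solved value of x = P(AB) is larger than P(B) = 0.24, which contradicts the teacher's hint P(B) ≥ P(A ∩ B) as well.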