Badly worded probability question?

  • Thread starter kenewbie
  • Tags
    Probability
In summary, the probability that a random person has disease A is 0.04, the probability of disease B is 0.05, and the probability of having both is 0.002. Because 0.04 × 0.05 = 0.002, the two events are statistically independent; the probabilities of having one disease without the other follow from the given numbers and are not extra information. The word "independent" here refers only to the statistical/probabilistic interpretation and does not necessarily carry over to a medical one. With real sample data, frequencies are only estimates of the true probabilities, so researchers use statistical tests such as the chi-squared test, taking independence as the null hypothesis and rejecting it only if the data make it very unlikely.
  • #1
kenewbie

Homework Statement



The probability that a random person has disease A is 0.04. The probability that the person has disease B is 0.05. The probability that he has both diseases A and B is 0.002.

Are the two diseases independent?

The Attempt at a Solution



I can solve this, that is not the problem. What bothers me is that my book states _as a rule_ that if P(A [intersection] B) = P(A) * P(B) then A and B are independent of each other.

This sounds mind-bogglingly wrong to me. You cannot simply pick 100,000 people, tally how many have AIDS and how many have the sniffles, and if the product happens to agree with the intersection then lo and behold, they are independent of each other!

What am I missing here?

k
 
  • #2
There is not enough information to tell. You would also have to know the probabilities that a random person has disease A and NOT disease B, and vice versa.

And, the question gives you the probabilities, which strictly speaking have to come from an infinite sample size. There is no way to know the actual probabilities from a finite sample.
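
To illustrate what a finite sample gives you, here is a minimal Python sketch. It assumes the two diseases really are independent with the probabilities from the problem; the function name `simulate`, the sample size, and the seed are made up for illustration.

```python
import random


def simulate(n, p_a=0.04, p_b=0.05, seed=0):
    """Draw n people with independent diseases A and B (an assumption)
    and return the sample frequencies of A, B, and both."""
    rng = random.Random(seed)
    count_a = count_b = count_both = 0
    for _ in range(n):
        has_a = rng.random() < p_a
        has_b = rng.random() < p_b
        count_a += has_a
        count_b += has_b
        count_both += has_a and has_b
    return count_a / n, count_b / n, count_both / n


est_a, est_b, est_both = simulate(100_000)
print("estimated P(A), P(B), P(A and B):", est_a, est_b, est_both)
print("product of the estimates:       ", est_a * est_b)
```

The sample frequencies land close to the true values, but the product of the estimates and the estimated intersection will generally not agree exactly, even though the underlying events are independent by construction.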
 
  • #3
Avodyne said:
There is not enough information to tell. You would also have to know the probabilities that a random person has disease A and NOT disease B, and vice versa.

Hm? I don't quite follow. The probability that a person has A and not B would be .038, since there is a .002 intersection between the two.

k
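
For reference, the remaining joint probabilities do follow from the three given numbers. A short derivation, writing [tex] A^c [/tex] for "does not have A" (notation assumed here, not taken from the book):

[tex]
\begin{aligned}
\Pr(A \cap B^c) &= \Pr(A) - \Pr(A \cap B) = 0.04 - 0.002 = 0.038 \\
\Pr(A^c \cap B) &= \Pr(B) - \Pr(A \cap B) = 0.05 - 0.002 = 0.048 \\
\Pr(A^c \cap B^c) &= 1 - 0.038 - 0.048 - 0.002 = 0.912
\end{aligned}
[/tex]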
 
  • #4
These are independent.

There are two (equivalent) ways to check for independence of events. If

[tex]
\Pr(A \mid B) = \Pr(A)
[/tex]

then the events [tex] A, B [/tex] are independent.

This condition is equivalent to the one you give:

[tex]
\Pr(A \cap B) = \Pr(A) \cdot \Pr(B)
[/tex]

Both your condition and the condition I gave above are satisfied by the numbers the OP gave.
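
As a quick check of both conditions with the numbers from the problem, here is a small Python sketch. The variable names are illustrative, and `math.isclose` is used only because decimal fractions are not exact in floating point.

```python
import math

# Probabilities given in the problem statement.
p_a, p_b, p_both = 0.04, 0.05, 0.002

# Condition 1: Pr(A and B) == Pr(A) * Pr(B)
product_rule = math.isclose(p_both, p_a * p_b)

# Condition 2: Pr(A | B) == Pr(A), with Pr(A | B) = Pr(A and B) / Pr(B)
p_a_given_b = p_both / p_b
conditional_rule = math.isclose(p_a_given_b, p_a)

print("product rule holds:    ", product_rule)
print("conditional rule holds:", conditional_rule)
```

Both checks hold for these numbers, which is why the events count as independent in the probabilistic sense.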
 
  • #5
Allow me to restate: I know the formulas, I know how to solve this. What I am interested in is the scope in which the formula is valid. I doubt that if you count the occurrences of two diseases in a population and find the intersection of the two, the simple check would be valid medical proof of whether the diseases depend on each other.

k
 
  • #6
"Allow me to restate: I know the formulas, I know how to solve this. What I am interested in is the scope in which the formula is valid. I doubt that if you count the occurrences of two diseases in a population and find the intersection of the two, the simple check would be valid medical proof of whether the diseases depend on each other."

First, I never implied you didn't know how to use the formulas. I did misunderstand the focus of your question: I took it to mean you didn't understand the probability interpretation.

The use of the word "independent" in these problems refers only to the statistical/probabilistic interpretation, which is what the calculations address. I do not know whether this carries to a medical interpretation of the type you are questioning.
However, in studies, when different phenomena are investigated, if we have proof of statistical independence, as here, that is taken as implying a lack of interaction between the two.
 
  • #7
statdad said:
The use of the word "independent" in these problems refers only to the statistical/probabilistic interpretation
See, THAT is what I should have been asking: what does the word "independent" imply in this context? Thanks for the reply. I'm still sort of thrown that such a simple test has meaning outside of trivial or special cases, but I guess I have to live with this until I get a better understanding of statistics in general.

Thanks again.

k
 
  • #8
Avodyne said:
There is not enough information to tell. You would also have to know the probabilities that a random person has disease A and NOT disease B, and vice versa.

That probability IS given of course.

And, the question gives you the probabilities, which strictly speaking have to come from an infinite sample size. There is no way to know the actual probabilities from a finite sample.

Probabilities certainly don't have to come from an infinite sample. I thought that idea was refuted in the 50s or 60s.
 
  • #9
If you wanted to show there IS some dependence between having disease A and disease B, then you'd agree you'd need to *violate* the "independence" condition, right?
That's all that is meant.
 
  • #10
kenewbie said:
Allow me to restate: I know the formulas, I know how to solve this. What I am interested in is the scope in which the formula is valid. I doubt that if you count the occurrences of two diseases in a population and find the intersection of the two, the simple check would be valid medical proof of whether the diseases depend on each other.
You are correct. This is not what researchers do. The problem in the original post is an artificial problem for what appears to be an introductory statistics class. You were given the true probabilities. On the other hand, when you compute the frequency of some disease in a population of 1,000 (or 100,000, or whatever), you are arriving at an estimate of the true probability. Suppose in a sample of 1,000 people, 43 have disease A, 54 have disease B, and 3 have both disease A and B. Are the diseases independent? The simple test says they are not independent since 0.043*0.054=0.002322 rather than 0.003.

What researchers do instead is use various statistical tests of independence, the most common being the chi-squared test. In my artificial example, one cannot say with a reasonable degree of certainty that the diseases are not statistically independent. Note the double negative. That was intentional. Researchers develop a "null hypothesis" (diseases A and B are statistically independent) and an "alternative hypothesis" (diseases A and B are correlated). The statistical tests indicate whether one should reject the null hypothesis because the observed data would be incredibly unlikely if it were true.

Several problems can arise in doing this kind of analysis. Systematic errors in the collection process may make the gathered statistics suspect. Even after removing these, there is a chance that the researcher rejected the null hypothesis when the null hypothesis was in fact true (a type I error) or accepted the null hypothesis when the null hypothesis was in fact false (a type II error).
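
For the artificial sample above (1,000 people, 43 with A, 54 with B, 3 with both), the chi-squared test can be sketched in plain Python. This is a minimal illustration, assuming the plain Pearson statistic without a continuity correction; the variable names are made up here.

```python
import math

# Counts from the example above: 1,000 people, 43 with A, 54 with B, 3 with both.
n, n_a, n_b, n_both = 1000, 43, 54, 3

# Observed 2x2 table: rows = (A, not A), columns = (B, not B).
observed = [
    [n_both, n_a - n_both],
    [n_b - n_both, n - n_a - n_b + n_both],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]

# Pearson chi-squared statistic: sum of (observed - expected)^2 / expected,
# where the expected counts assume independence (the null hypothesis).
chi2 = 0.0
for i in range(2):
    for j in range(2):
        expected = row_totals[i] * col_totals[j] / n
        chi2 += (observed[i][j] - expected) ** 2 / expected

# A 2x2 table has 1 degree of freedom, and the upper-tail probability of a
# chi-squared variable with 1 degree of freedom is erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2))

print(f"chi2 = {chi2:.3f}, p = {p_value:.2f}")
```

The statistic comes out far below 3.84, the 5% critical value for one degree of freedom, so the null hypothesis of independence is not rejected. One caveat: the expected count for "both" is only about 2.3, and with expected counts that small an exact test such as Fisher's is usually preferred over the chi-squared approximation.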
 
  • #11
Thanks a lot for clarifying, DH.

I think my book is doing a very bad job of qualifying these formulas. In fact, the way it explains them is flat-out misleading. Then again, it is not a statistics book but more of a general math text. I guess I should get a more narrowly focused source to get better information.

k
 
