- #1
daviddoria
I have 1000 experiments on the same data, each of which estimates the probability that the data agrees with a model.
The problem is that even if every experiment really does agree with the model (i.e. p(d1) = 0.99, p(d2) = 0.99, etc.), multiplying them together to get P(d1 & d2 & d3 & ... & dn) gives a very small number: with 1000 experiments at 0.99 each, the product is about 4.3e-5. If I run two sets of such experiments, I can tell which set had the higher probability of matching the model (simply by seeing which of the two combined numbers is larger), but the numbers themselves are meaningless at that point; they are only useful relative to one another.
Is there a way to combine these into a "valid" probability without just multiplying them? One idea that seems reasonable is simply to average them, i.e. [tex]\frac{1}{n}\sum_{i=1}^{n} p(d_i)[/tex], but everyone tells me this is a terrible idea and that the result is no longer a real probability.
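To make the numbers concrete, here is a minimal Python sketch of the two combinations described above. The values 0.99 and 0.98 and the names `combine`, `set_a` and `set_b` are made up for illustration, not real results. The sum of logs is included only as an equivalent way to do the relative comparison without the product shrinking toward 0; it is not meant as an answer to the "valid probability" question.

```python
import math

def combine(probs):
    """Return (product, sum of logs, arithmetic mean) of a list of probabilities."""
    product = math.prod(probs)                  # P(d1 & ... & dn) if treated as independent
    log_sum = sum(math.log(p) for p in probs)   # log of the same product
    mean = sum(probs) / len(probs)              # the "average" idea
    return product, log_sum, mean

# Two hypothetical sets of 1000 experiments each.
set_a = [0.99] * 1000
set_b = [0.98] * 1000

prod_a, log_a, mean_a = combine(set_a)
prod_b, log_b, mean_b = combine(set_b)

print("set A:", prod_a, log_a, mean_a)
print("set B:", prod_b, log_b, mean_b)
print("A more probable than B:", log_a > log_b)
```

Both products come out tiny even though every individual value is close to 1, while the averages stay near 0.99 and 0.98. The ordering of the two sets is the same whether I compare the products or the log sums.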
Thoughts?
Thanks,
Dave