Optimizing Data Sampling for Probabilities

In summary, there is no single mathematical method for determining the optimum sampling of data for probabilities; the answer depends on the specific system under study and the data being sampled.
  • #1
Loren Booda
Is there a mathematical method to determine the optimum sampling of data for probabilities?

Flip a coin. Simplistically speaking, experience says it has a 1/2 chance of landing on either side. But what if it can land on its edge? What if it can fall through a crack? What if lava from a fissure invading the room can envelop and melt the coin? What if it can quantum mechanically flip itself after landing? Other examples of probability, like the nonlinear trajectory of a particle, have a determinism that is not immediately apparent.

Even an electronic random number generator run by a quantum computer is susceptible to decoherence between the device and the observer. It seems that we must have extensive practical knowledge about the system under observation, and then apply Occam's razor, if we are to determine the set of data required. But how may this be done systematically?
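To make the worry concrete, here is a minimal Monte Carlo sketch of a coin that can, rarely, land on its edge. The 1-in-2000 edge probability is an illustrative assumption, not a measured figure:

```python
import random

# Hypothetical outcome model: the 1/2000 edge probability is an
# illustrative assumption, not a measured value.
P_EDGE = 1 / 2000

def flip():
    """One flip of a coin that can, rarely, land on its edge."""
    r = random.random()
    if r < P_EDGE:
        return "edge"
    # Split the remaining probability mass evenly between heads and tails.
    return "heads" if r < P_EDGE + (1 - P_EDGE) / 2 else "tails"

trials = 100_000
counts = {"heads": 0, "tails": 0, "edge": 0}
for _ in range(trials):
    counts[flip()] += 1

for outcome, n in counts.items():
    print(f"{outcome}: {n / trials:.4f}")
```

However many flips are sampled, the model only accounts for the outcomes it was built to include; rarer events (the crack, the lava) remain outside its sample space.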
 
  • #2
Loren Booda said:
Flip a coin. Simplistically speaking, experience says it has a 1/2 chance of landing on either side. But what if it can land on its edge? What if it can fall through a crack? What if lava from a fissure invading the room can envelop and melt the coin? What if it can quantum mechanically flip itself after landing? Other examples of probability, like the nonlinear trajectory of a particle, have a determinism that is not immediately apparent.
You could flip it until it behaves normally. :tongue:

In fact, that's a common practical solution in the gaming world -- one keeps rerolling a die until it doesn't fall off the table or lean against something or whatever.
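In statistical terms, rerolling until an admissible result is rejection sampling: one conditions on the event "heads or tails" and renormalizes, so the conditional probability of heads is exactly 1/2 again. A minimal sketch, assuming the same hypothetical edge-landing coin as above:

```python
import random

ADMISSIBLE = {"heads", "tails"}

def flip_with_edge(p_edge=1/2000):
    """One flip that may land on its edge (p_edge is an illustrative guess)."""
    r = random.random()
    if r < p_edge:
        return "edge"
    return "heads" if r < p_edge + (1 - p_edge) / 2 else "tails"

def flip_until_admissible():
    """Reroll until the outcome is in the agreed-upon sample space.

    This is rejection sampling: P(heads | admissible) = 1/2 exactly.
    """
    while True:
        outcome = flip_with_edge()
        if outcome in ADMISSIBLE:
            return outcome

trials = 100_000
heads = sum(flip_until_admissible() == "heads" for _ in range(trials))
print(f"P(heads | admissible) ~ {heads / trials:.4f}")
```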
 
  • #3
By "normal" one might mean "average." How does one determine mathematically how many tries one needs to achieve average? Don't methods like standard deviation incorporate their own error, ad infinitum?

Overall, how and when can we be assured of precision's reproducibility?
 
  • #4
But that's not what I meant by normal. By "behaving normally" I meant "lands heads up or lands tails up".
 
  • #5
Duly noted.

Please allow me to repeat [with editing]:
How does one determine mathematically [with a huge number of interacting physical variables] how many tries one needs to achieve [a significant] average?

Don't [results] like standard deviation incorporate [errors of their own, each with their own statistical deviations] ad infinitum?

Overall, how and when can we be assured of precision's reproducibility?
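One standard answer to "how many tries" (though not the only one) is to fix a confidence level and a tolerable margin of error in advance, then solve the normal-approximation formula n = z^2 p(1-p) / E^2 for the required number of flips. A sketch with assumed values:

```python
import math

def sample_size_for_proportion(margin, z=1.96, p=0.5):
    """Normal-approximation sample size n = z^2 * p(1-p) / margin^2.

    p = 0.5 is the worst case (largest variance) for a proportion;
    z = 1.96 corresponds to roughly 95% confidence.
    """
    return math.ceil(z**2 * p * (1 - p) / margin**2)

# How many flips to pin down P(heads) to within ±0.01 at ~95% confidence?
print(sample_size_for_proportion(margin=0.01))  # roughly 9600 flips
```

As for the regress of errors: the standard error of the mean shrinks as 1/sqrt(n), and the uncertainty in the standard deviation itself shrinks at a comparable rate, so the successive corrections converge rather than compounding ad infinitum.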
 
  • #6
You decide beforehand what constitutes the sample space: either heads or tails. Any other outcome is deemed inadmissible. Why? Because that is what we want, and it has nothing to do with mathematics. Mathematics is merely a tool for modelling in this instance. Whether real life behaves sufficiently close to the model for the model to be valid is a different matter. There are plenty of tests for working out whether sample data are likely to have come from a population with assumed properties; some, such as confidence intervals, are taught to high-school students, so I'm surprised you've not met them. Then there are the strong law of large numbers, chi-squared tests, t-tests, ANOVA, etc.
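As a concrete instance, a chi-squared goodness-of-fit test asks whether observed flip counts are consistent with the fair-coin model. A sketch using SciPy, with made-up counts:

```python
from scipy import stats

# Observed counts from 1000 flips (made-up numbers for illustration).
observed = [521, 479]   # heads, tails
expected = [500, 500]   # what the fair-coin model predicts

chi2, p_value = stats.chisquare(observed, f_exp=expected)
print(f"chi2 = {chi2:.3f}, p = {p_value:.3f}")
# A large p-value means the data are consistent with the fair-coin model;
# a small one (say p < 0.05) is evidence against it.
```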
 
  • #7
Your examples are worth studying. Do you know of an online tutorial that compares most of them?
 

What is the importance of prioritizing statistical data?

Prioritizing statistical data is crucial because it allows scientists to focus on the most relevant and significant information. By prioritizing data, scientists can make informed decisions and draw accurate conclusions, leading to more meaningful and impactful research.

How do you determine which statistical data to prioritize?

The process of prioritizing statistical data involves identifying the research question or objective, understanding the available data sources, and evaluating the quality and relevance of each dataset. It is essential to consider the reliability, validity, and representativeness of the data to determine which should be prioritized.

What are the potential challenges in prioritizing statistical data?

One of the main challenges in prioritizing statistical data is the availability and accessibility of data. Some datasets may be difficult to obtain or may contain incomplete or inaccurate information. Additionally, there may be a bias in the data, which can lead to skewed results if not properly addressed.

How can prioritizing statistical data improve the research process?

Prioritizing statistical data can improve the research process in several ways. By focusing on relevant and high-quality data, scientists can save time and resources. It also allows for more accurate and reliable results, which can strengthen the validity and impact of the research.

What are some common methods used to prioritize statistical data?

Some common methods for prioritizing statistical data include data mining, exploratory data analysis, and machine learning algorithms. These methods can help identify patterns, trends, and relationships within the data and determine which variables are most important for the research question at hand.
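As a sketch of that last approach, one can fit a tree ensemble and rank variables by its impurity-based importance scores. The dataset and parameters below are illustrative assumptions, not a recommendation of any particular study design:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data standing in for a real study: 5 variables, only 2 informative.
X, y = make_classification(n_samples=500, n_features=5, n_informative=2,
                           n_redundant=0, random_state=0)

model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Rank variables by impurity-based importance; higher = more worth prioritizing.
for rank, idx in enumerate(np.argsort(model.feature_importances_)[::-1], 1):
    print(f"{rank}. feature {idx}: importance {model.feature_importances_[idx]:.3f}")
```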
