Reliability of a probability

by disregardthat
Tags: probability, reliability
 Sci Advisor P: 1,670 I'm interested in how to define a reliability parameter for a model of the probability of an event. Say you're tossing a fair coin, with a 50% chance of heads, but your model tells you there is a 40% chance of the coin showing heads. How reliable is your model? More generally, say you're tossing coins A1, A2, ..., each with probability P1, P2, ... of showing heads, and your model tells you the probabilities are T1, T2, ... How reliable is your model? Is there a concept in statistics concerning the reliability of such predictions? I'm trying to work out a method of determining the reliability of a model whose probabilities change constantly, but I'd first like to know the basic concepts before developing a method of calculating its reliability.
 Mentor P: 9,636 I think that depends on your requirements. Are you interested in the absolute difference between predicted and real outcomes? => The model is "10% wrong". Or are you interested in the difference relative to the actual result? => The model is "20% wrong" ...
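As a concrete sketch of those two scores, using the thread's coin example (the function names are my own, just for illustration):

```python
# Two ways to score a predicted probability against the true one.

def absolute_error(predicted, actual):
    """Absolute difference between predicted and actual probability."""
    return abs(predicted - actual)

def relative_error(predicted, actual):
    """Difference relative to the actual probability."""
    return abs(predicted - actual) / actual

# The coin example: model says 0.40, truth is 0.50.
abs_err = absolute_error(0.40, 0.50)  # ~0.10 -> the model is "10% wrong"
rel_err = relative_error(0.40, 0.50)  # ~0.20 -> the model is "20% wrong"
```

Which score is appropriate depends on what you will use it for; neither is "the" right one.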
Mentor
P: 13,653
 Quote by disregardthat Is there a concept in statistics concerning the reliability of events?
There are many such concepts: statistical significance, statistical hypothesis testing, statistical inference, and so on. These break down further depending on whether one subscribes to a Bayesian or a frequentist approach to statistics.
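For instance, a frequentist hypothesis test can check whether observed tosses are consistent with a model's claimed probability. A rough sketch, using the normal approximation to the binomial (the counts here are hypothetical):

```python
import math

def proportion_z_test(heads, tosses, claimed_p):
    """Two-sided z-test: is the observed frequency consistent with claimed_p?
    Returns (z statistic, approximate two-sided p-value), using the normal
    approximation to the binomial (reasonable for large toss counts)."""
    observed = heads / tosses
    se = math.sqrt(claimed_p * (1 - claimed_p) / tosses)
    z = (observed - claimed_p) / se
    # Two-sided p-value from the standard normal CDF, via math.erf.
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value

# Hypothetical data: 430 heads in 1000 tosses vs. a model claiming p = 0.5.
z, p = proportion_z_test(430, 1000, 0.5)
# |z| is about 4.4, p far below 0.05: this data would reject the 50% claim.
```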

P: 2,895

 Quote by disregardthat I'm interested in the question of defining a reliability parameter of a model of the probability of an event. Say you're tossing a fair coin, with a 50% chance of heads. Your model tells you it's 40% of the coin showing heads. How reliable is your model?
Your example suggests that your concept of "reliability" involves measuring how well a model predicts a given set of data. There is no universally correct way to measure how well a model predicts data. You can define a measure of error in various ways. In your example, is the error of predicting "H" when the data is "T" the same "size" error as predicting "T" when the data is "H"? There is no law of mathematics or logic that tells you whether these errors are equally serious. (For example, in medicine we can think of "H" as having a disease and "T" as not having it. A test predicting "T" when the actual result is "H" makes an error that has different consequences than a test predicting "H" when the actual result is "T".)

The best way to investigate an appropriate definition of "reliability" is to ask what decisions you would make on the basis of the measure.

Since a stochastic model doesn't make definite predictions, even if you define a measure of error between data and definite set of predictions, you still must define how you will compare the model to data. Your example suggests you might be thinking of running the model once and getting a definite set of predictions. Is that your idea?
P: 1,670
 Quote by Stephen Tashi Since a stochastic model doesn't make definite predictions, even if you define a measure of error between data and a definite set of predictions, you still must define how you will compare the model to data. Your example suggests you might be thinking of running the model once and getting a definite set of predictions. Is that your idea?
Yep, my idea is definitely that the model is constantly compared to what it attempts to model. Furthermore, the model constantly adapts to this, changing its probabilities based on what occurs in order to sharpen future predictions.

What I had in mind is something of this sort: instead of the fixed probabilities T1, T2, ... in my example, each TX would be a random variable with a probability distribution over the probability. Maybe it's a fair assumption that they are normally distributed. In that case I'd like to have a mean MX and a standard deviation SX for each TX.

Since I have in mind the TX's constantly changing to adapt to the situation, I'd want the parameters MX and SX to change along with them, with SX being a measure of how much the probability fluctuates, i.e. the "reliability" of MX.

Thus SX could be the reliability I am searching for. However, I don't know whether a normal distribution is a reasonable assumption, so more generally I am asking for a method that might find an optimal distribution instead, with a corresponding measure of reliability. I think the TX's tend to decrease faster than they increase over time, with a wave-like shape globally, so a more fitting distribution might be a skewed one. But from what I gather from your post, there might not be a 'customized' distribution for every way the variables change.
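To make the idea concrete, here is roughly the kind of adaptive MX and SX I mean: an exponentially weighted running mean and spread of the outcomes. The class name and the smoothing factor alpha are my own invention, purely for illustration:

```python
# Sketch of an adaptive probability estimate MX with a spread SX,
# via exponentially weighted mean and variance of the prediction errors.

class AdaptiveEstimate:
    def __init__(self, initial_p=0.5, alpha=0.05):
        self.m = initial_p   # MX: current probability estimate
        self.v = 0.0         # running variance of the prediction errors
        self.alpha = alpha   # how quickly old observations are forgotten

    def update(self, outcome):
        """Feed in one observation: 1 for heads, 0 for tails."""
        error = outcome - self.m
        self.m += self.alpha * error
        # Exponentially weighted variance of the errors; its square
        # root plays the role of SX.
        self.v = (1 - self.alpha) * (self.v + self.alpha * error * error)

    @property
    def s(self):
        return self.v ** 0.5
```

Feeding it a long stream of fair-coin outcomes, m hovers near 0.5 while s stays large; for a heavily biased coin, m drifts toward the bias and s shrinks.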

Basically, what I would practically want out of this is a confidence interval for the probability (of, say, 95%) instead of a fixed value, but that can of course be drawn from the distributions of the TX's.
 Mentor P: 9,636 Bayesian probability can do all that.
 Sci Advisor P: 2,895 The way that a "probability of a probability" is usually modeled is with a Bayesian prior distribution. A random variable is assumed to come from a family of distributions (such as Gaussians), and some "prior" probability distribution is assigned to the parameters of that family (such as a probability distribution on the mean and variance of the Gaussians). The distributions on the parameters are then updated from data using Bayes' theorem. There are many different ways to implement this general approach, so it still doesn't direct you to a particular procedure. You should first study the general method and then decide what fits your particular problem. A somewhat advanced book on this subject is Jaynes's "Probability Theory: The Logic of Science". There are pages related to this book on the web; I don't know whether the whole book is still available online, but it used to be.
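For the coin case, the standard conjugate choice is a Beta prior on the heads-probability, and the Bayes update reduces to count-keeping. A minimal sketch (the observation counts are hypothetical):

```python
# Bayesian "probability of a probability" for a coin: a Beta(a, b)
# prior on the heads-probability p, updated by Bayes' theorem.

def beta_update(a, b, heads, tails):
    """Posterior Beta parameters after observing the given counts."""
    return a + heads, b + tails

def beta_mean(a, b):
    """Posterior mean estimate of p."""
    return a / (a + b)

def beta_std(a, b):
    """Spread of the posterior -- a natural 'reliability' of the estimate."""
    return (a * b / ((a + b) ** 2 * (a + b + 1))) ** 0.5

# Start from a uniform prior Beta(1, 1), then observe 40 heads and
# 60 tails (hypothetical data).
a, b = beta_update(1, 1, 40, 60)
# beta_mean(a, b) is about 0.40; more data would shrink beta_std(a, b).
```

A 95% credible interval for p can then be read off the Beta(a, b) posterior (e.g. with scipy.stats.beta.interval), which is exactly the kind of interval-instead-of-fixed-value asked about above.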
 Sci Advisor P: 1,670 Thanks for the help, I'll look into these references.
