The probability of a probability?

  • Thread starter: alan_longor
  • Tags: Probability

Summary
The discussion revolves around the concept of assigning probabilities to an event with two outcomes, A and B, and the implications of additional information regarding the event's outcomes. Initially, without the extra information, it is debated whether a value can be assigned to the probability p' that p lies within a specific range. The additional information indicates that outcome A occurred a billion times, leading to the conclusion that the posterior probability of p' approaches 1, suggesting a strong bias towards outcome A. The conversation also touches on the complexities of Bayesian statistics and the necessity of defining a probability distribution for p to derive meaningful conclusions. Ultimately, the problem illustrates the nuanced nature of probability theory and the challenges in assigning values without clear distributions.
  • #31
alan_longor said:
thank you , so what would the probability be ?
Strictly speaking, it will depend on the prior. But suppose your prior is 1000:1 against. What posterior likelihood do you get?
 
  • #32
haruspex said:
Strictly speaking, it will depend on the prior. But suppose your prior is 1000:1 against. What posterior likelihood do you get?
Well, one of the conditions was that we know nothing about the light; the only thing we know is that it came on for one billion nights. Now, when I asked whether or not we can set a probability for it coming on the next night, I meant whether or not the information we have is enough to set a probability. If something else has to be known, then we cannot set it. In either case, logically the probability must be a number very close to 1.
 
  • #33
alan_longor said:
Well, one of the conditions was that we know nothing about the light; the only thing we know is that it came on for one billion nights. Now, when I asked whether or not we can set a probability for it coming on the next night, I meant whether or not the information we have is enough to set a probability. If something else has to be known, then we cannot set it. In either case, logically the probability must be a number very close to 1.
I assume you have been introduced to Bayes' theorem. This tells you that you cannot estimate the probability of anything from observations unless you start with a prior estimate.
 
  • #34
haruspex said:
I assume you have been introduced to Bayes' theorem. This tells you that you cannot estimate the probability of anything from observations unless you start with a prior estimate.
OK, then in this case we cannot set a probability for the appearance of the light on the next night.
 
  • #35
alan_longor said:
So if I ask you, on day number one billion, what the probability is that the light will come on tomorrow too... is that a meaningful question?

It may be a meaningful question, but it isn't a mathematical question unless you state a specific probability model.

Even if you state a probability model, it isn't a solvable mathematical question unless there is sufficient given information.

People who wish to demonstrate that mathematics can be applied in a way that agrees with "common sense" can reformulate the question in various ways by creating a probability model for it that gives sufficient information to create a solvable mathematical problem.
 
  • #36
alan_longor said:
OK, then in this case we cannot set a probability for the appearance of the light on the next night.
Sure, solipsism is a sound philosophy from a logical perspective, but it doesn't get you very far. In the real world, we function quite effectively by subconsciously assigning reasonable a priori probabilities.
Your rising sun model is insufficiently divorced from real experience to think about in a detached manner. How about, you notice that on three consecutive occasions the winning lottery number ends in a five. What is the probability that there is a flaw in the randomisation? What probability would you have assigned to that beforehand... one in 100? Too high. One in 10,000? Let's say one in a million. How many consecutive occasions of a final digit 5 will push that estimate to greater than half?
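For concreteness, here is a minimal sketch (in Python) of the odds-form Bayes update behind this lottery example. The likelihood ratio of 10 per draw is an assumption: it supposes the hypothesised flaw would force the final digit to be 5 every time, while a fair draw produces a 5 with probability 1/10.

```python
# Odds-form Bayes update: posterior odds = prior odds × (likelihood ratio)^n.
prior_odds = 1e-6          # "one in a million" against a flaw, read as odds of 1 : 1,000,000
likelihood_ratio = 10.0    # assumption: a flawed draw always ends in 5; a fair one does so with prob 1/10

n, odds = 0, prior_odds
while odds <= 1.0:         # the posterior probability exceeds 1/2 once the odds exceed 1
    odds *= likelihood_ratio
    n += 1
print(n, odds / (1.0 + odds))   # 7 consecutive final-digit-5 draws give a posterior of about 0.91
```

With these assumptions, six draws bring the posterior to exactly one half and the seventh pushes it past it.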
 
  • #37
haruspex said:
Sure, solipsism is a sound philosophy from a logical perspective, but it doesn't get you very far. In the real world, we function quite effectively by subconsciously assigning reasonable a priori probabilities.
Your rising sun model is insufficiently divorced from real experience to think about in a detached manner. How about, you notice that on three consecutive occasions the winning lottery number ends in a five. What is the probability that there is a flaw in the randomisation? What probability would you have assigned to that beforehand... one in 100? Too high. One in 10,000? Let's say one in a million. How many consecutive occasions of a final digit 5 will push that estimate to greater than half?
I am sorry for being unable to understand that exact case, and please allow me to ask you a question even though I was unable to answer yours. What if someone says he has a real-number generator that generates a random number between 1 and 100, and that there is no pattern and no way to predict what that machine gives? Is it correct to assume that the probability that the number given by the machine is between 90 and 100 is 1/10, since all the numbers seem to have an equal probability here? Thank you very much.
 
  • #38
alan_longor said:
I am sorry for being unable to understand that exact case, and please allow me to ask you a question even though I was unable to answer yours. What if someone says he has a real-number generator that generates a random number between 1 and 100, and that there is no pattern and no way to predict what that machine gives? Is it correct to assume that the probability that the number given by the machine is between 90 and 100 is 1/10, since all the numbers seem to have an equal probability here? Thank you very much.
If you completely trust that information, yes. But complete trust also constitutes a prior distribution.
 
  • #39
haruspex said:
If you completely trust that information, yes. But complete trust also constitutes a prior distribution.
But complete trust also constitutes a prior distribution.
That's the phrase I have been looking for... thank you.
 
  • #40
If the word 'assign' implies that it is just an initial value that will be corrected by some (Bayesian) method, then of course you can assign any valid initial probability guess.
If the word 'assign' implies that you can defend that value as being the correct probability parameter, then you cannot do that. Here is why:

Whether Bayesian methods are to be used or not, the answer to 1) is that no value can be assigned to p'. The most that Bayesian methods can do is to assign an initial standard distribution (uniform) with no justification. There is no real assumption that the initial standard distribution is correct. In fact, Bayesian methods assume that the initial distribution is incorrect and tell you how to change it as real information is obtained.

You should say that the answer to 1) is that no value can be assigned to p'. If you attempt to assign any value, a, to p', you can easily give an example where that value is wrong.
 
  • #41
FactChecker said:
The most that Bayesian methods can do is to assign an initial standard distribution (uniform) with no justification.
Maybe it's a matter of philosophy, but I take a different view.
Bayesian methods require you to supply a prior distribution, but that does not provide an excuse to set it to uniform without justification. In normal practice, it will be some gut feel based on experience. The fundamental point is that the choice should not be that crucial provided it is reasonable.

Consider the classic 'fair coin' problem. What would be a sensible a priori estimate of whether a coin has a 0.01% or more bias towards heads? 0.9 would probably be too much; one in a billion almost surely too low. With those bounds, we can run a trial for long enough to bring our posterior estimate within some preset range.
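To make the "run a trial for long enough" point concrete, here is a rough sketch in Python. The specific numbers (a true bias of p = 0.5001, prior odds of 1 in 1000 for the biased hypothesis, a billion simulated tosses) are illustrative assumptions, not figures from the thread.

```python
# Simulate a slightly biased coin and see how the data move a sceptical prior.
import numpy as np

rng = np.random.default_rng(0)
p_fair, p_biased = 0.5, 0.5001      # "0.01% or more bias" read here as p = 0.5001
prior_odds = 1.0 / 1000.0           # prior odds in favour of the biased hypothesis (assumed)

n = 10**9
heads = rng.binomial(n, p_biased)   # tosses of a genuinely biased coin
log_bf = heads * np.log(p_biased / p_fair) + (n - heads) * np.log((1 - p_biased) / (1 - p_fair))
posterior_odds = prior_odds * np.exp(log_bf)
print(posterior_odds / (1 + posterior_odds))   # posterior P(biased); close to 1 for a trial this long
```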

Question 1 is problematic because we are given no information regarding the nature of the event. The only experience we can call on is of probabilities of events in general, a basis so nebulous that to assign any shape to the prior distribution is highly questionable. Still, I would argue that we come across a lot of probabilities close to 1, so the range "> .9" should have a prior probability of at least, say, 1 in 1000. This is why question 2 can be answered.
 
  • Likes: FactChecker
  • #42
haruspex said:
Maybe it's a matter of philosophy, but I take a different view.
Bayesian methods require you to supply a prior distribution, but that does not provide an excuse to set it to uniform without justification. In normal practice, it will be some gut feel based on experience. The fundamental point is that the choice should not be that crucial provided it is reasonable.
That is a good point. I was influenced by this example, which doesn't give any clue about the initial distribution.
That brings up another question (I don't want to hijack this thread, though):
If you use an initial distribution that is not uniform, that might make it harder to correct. I have no experience with this, but the question has come up before at work where it was being applied iteratively.
 
  • #43
FactChecker said:
If you use an initial distribution that is not uniform, that might make it harder to correct
That assumes the uniform distribution is closer to the true distribution than is the chosen prior. As I posted, a good approach is to consider some range of priors that you feel encompass the answer and run the trials until sufficiently confident.
In the case of fairness of a coin, one that at least looks fair, you would be well justified in taking a prior distribution that sets the probability that the frequency of heads is between 0.4 and 0.6 as being at least 0.8, say.
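As a small illustration (assuming one restricts attention to a symmetric Beta(a, a) family, which the post does not require), one can search for the smallest a whose prior puts at least 0.8 of its mass on 0.4 < p < 0.6:

```python
# Find the smallest symmetric Beta(a, a) prior with P(0.4 < p < 0.6) >= 0.8.
from scipy.stats import beta

a = 1
while beta.cdf(0.6, a, a) - beta.cdf(0.4, a, a) < 0.8:
    a += 1
print(a, beta.cdf(0.6, a, a) - beta.cdf(0.4, a, a))   # stops around a ≈ 20
```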
 
  • Likes: FactChecker
  • #44
haruspex said:
That assumes the uniform distribution is closer to the true distribution than is the chosen prior.
"closer to" is tricky to define. In an iterative process, if stage 1 resulted in a distribution with a small standard deviation, it might be difficult to move the mean in stage 2. So it might be better to increase the standard deviation for the stage 2 prior distribution -- perhaps even to a uniform distribution. But we never investigated it while I was there, so I don't know.
 
  • #45
haruspex said:
Maybe it's a matter of philosophy, but I take a different view.
Bayesian methods require you to supply a prior distribution, but that does not provide an excuse to set it to uniform without justification. In normal practice, it will be some gut feel based on experience. The fundamental point is that the choice should not be that crucial provided it is reasonable.

There is a way to use Bayesian methods in the same spirit that we use frequentist methods.

Frequentist methods don't answer the question "What is the probability of the result that I'm interested in?" Instead they answer questions like "If I assume the result I'm interested in (or its negation), what is the probability of the data?" From the answer to that question, people get a subjective feeling about the probability of the result that interests them, but not an actual number for it.

In a similar manner, a person can assume a uniform (or "maximum entropy") prior and compute the posterior distribution just to get a subjective feeling about how strongly the data suggest a certain range of values for the result of interest. This is different from "taking the prior seriously".
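As a toy illustration of that "uniform prior, just to get a feel" usage (the billion-successes figure is borrowed from the light example earlier in the thread, and SciPy is assumed to be available):

```python
# Start from the uniform prior Beta(1, 1), observe N successes in N trials,
# and inspect how much posterior mass lies above 0.9.
from scipy.stats import beta

N = 1_000_000_000                 # the "billion nights" from earlier in the thread
posterior = beta(1 + N, 1)        # conjugate update: Beta(a + k, b + n - k) with a = b = 1, k = n = N
print(posterior.sf(0.9))          # P(p > 0.9 | data) = 1 - 0.9^(N+1), effectively 1.0
```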
 
  • Likes: FactChecker
  • #46
FactChecker said:
That is a good point. I was influenced by this example, which doesn't give any clue about the initial distribution.
That brings up another question (I don't want to hijack this thread, though):
If you use an initial distribution that is not uniform, that might make it harder to correct. I have no experience with this, but the question has come up before at work where it was being applied iteratively.

There are "natural" types of priors for various types of probability models. In this case the model is that of independent, repeated trials with some unknown success probability ##p##; in other words, Binomial(n, p). The natural prior that people tend to use in such a case is the Beta distribution, with density function
$$f_0(p) = \frac{1}{B(a,b)} p^{a-1} (1-p)^{b-1}, \; 0 < p < 1 $$
Here, ##a, b > 0## are parameters and ##B(a,b)## is the so-called Beta function; see, e.g.,
https://en.wikipedia.org/wiki/Beta_distribution for more formulas and details.
The uniform prior is a special case in which ##a = b = 1##.

Typically one might start by assigning (or estimating, or ...) something like the most probable value of ##p## or the prior mean of ##p##, perhaps also with some typical or probable range estimates. That gives one the purely mathematical problem of determining the parameters ##a,b##. Once one has ##a## and ##b## the Bayesian updating is easy; after observing ##k## successes in ##n## trials the posterior density of ##p## is Beta with parameters ##a+k## and ##b + n-k## in place of ##a## and ##b##. This updating simplicity is the reason the Beta is used as a prior for the Binomial case.

Nothing prevents you from using a different type of prior, but then the updating scheme becomes more difficult and less intuitive. By tuning the parameters, the Beta is capable of representing most types of general prior information, but of course there will always be exceptions.
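A minimal sketch of this conjugate updating in Python (the specific numbers a = 2, b = 5, and 7 successes in 10 trials are illustrative assumptions only):

```python
# Beta prior + Binomial data -> Beta posterior, as described above.
from scipy.stats import beta

a, b = 2.0, 5.0               # prior Beta(a, b); prior mean a / (a + b) ≈ 0.29
k, n = 7, 10                  # observed k successes in n trials

posterior = beta(a + k, b + (n - k))     # conjugate update: Beta(a + k, b + n - k)
print(posterior.mean())                  # (a + k) / (a + b + n) = 9/17 ≈ 0.53
print(posterior.interval(0.95))          # central 95% credible interval for p
```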
 
  • Likes: FactChecker
