The probability of a probability?

  • Thread starter: alan_longor
  • Tags: Probability

Summary
The discussion revolves around the concept of assigning probabilities to an event with two outcomes, A and B, and the implications of additional information regarding the event's outcomes. Initially, without the extra information, it is debated whether a value can be assigned to the probability p' that p lies within a specific range. The additional information indicates that outcome A occurred a billion times, leading to the conclusion that the posterior probability of p' approaches 1, suggesting a strong bias towards outcome A. The conversation also touches on the complexities of Bayesian statistics and the necessity of defining a probability distribution for p to derive meaningful conclusions. Ultimately, the problem illustrates the nuanced nature of probability theory and the challenges in assigning values without clear distributions.
  • #31
alan_longor said:
thank you , so what would the probability be ?
Strictly speaking, it will depend on the prior. But suppose your prior is 1000:1 against. What posterior likelihood do you get?
 
  • #32
haruspex said:
Strictly speaking, it will depend on the prior. But suppose your prior is 1000:1 against. What posterior likelihood do you get?
Well, one of the conditions was that we know nothing about the light; the only thing we know is that it came on for one billion nights. Now, when I asked whether or not we can set a probability for it coming on the next night, I meant whether or not the information we have is enough to set a probability. If something else has to be known, then we cannot set it. In either case, logically the probability must be a number very close to 1.
 
  • #33
alan_longor said:
Well, one of the conditions was that we know nothing about the light; the only thing we know is that it came on for one billion nights. Now, when I asked whether or not we can set a probability for it coming on the next night, I meant whether or not the information we have is enough to set a probability. If something else has to be known, then we cannot set it. In either case, logically the probability must be a number very close to 1.
I assume you have been introduced to Bayes' theorem. This tells you that you cannot estimate the probability of anything from observations unless you start with a prior estimate.
 
  • #34
haruspex said:
I assume you have been introduced to Bayes' theorem. This tells you that you cannot estimate the probability of anything from observations unless you start with a prior estimate.
OK, then in this case we cannot set a probability for the appearance of the light on the next night.
 
  • #35
alan_longor said:
So if I ask you, on day number one billion, what the probability is that the light will come on tomorrow too... is that a meaningful question?

It may be a meaningful question, but it isn't a mathematical question unless you state a specific probability model.

Even if you state a probability model, it isn't a solvable mathematical question unless there is sufficient given information.

People who wish to demonstrate that mathematics can be applied in a way that agrees with "common sense" can reformulate the question in various ways by creating a probability model for it that gives sufficient information to create a solvable mathematical problem.
 
  • #36
alan_longor said:
OK, then in this case we cannot set a probability for the appearance of the light on the next night.
Sure, solipsism is a sound philosophy from a logical perspective, but it doesn't get you very far. In the real world, we function quite effectively by subconsciously assigning reasonable a priori probabilities.
Your rising sun model is insufficiently divorced from real experience to think about in a detached manner. How about, you notice that on three consecutive occasions the winning lottery number ends in a five. What is the probability that there is a flaw in the randomisation? What probability would you have assigned to that beforehand... one in 100? Too high. One in 10,000? Let's say one in a million. How many consecutive occasions of a final digit 5 will push that estimate to greater than half?
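For concreteness, here is a minimal sketch (in Python) of the odds-form Bayes update behind this lottery example. The likelihood ratio of 10 per draw is an assumption: it supposes the hypothesised flaw would force the final digit to be 5 every time, while a fair draw produces a 5 with probability 1/10.

```python
# Odds-form Bayes update: posterior odds = prior odds × (likelihood ratio)^n.
prior_odds = 1e-6          # "one in a million" against a flaw, read as odds of 1 : 1,000,000
likelihood_ratio = 10.0    # assumption: a flawed draw always ends in 5; a fair one does so with prob 1/10

n, odds = 0, prior_odds
while odds <= 1.0:         # the posterior probability exceeds 1/2 once the odds exceed 1
    odds *= likelihood_ratio
    n += 1
print(n, odds / (1.0 + odds))   # 7 consecutive final-digit-5 draws give a posterior of about 0.91
```

With these assumptions, six draws bring the posterior to exactly one half and the seventh pushes it past it.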
 
  • #37
haruspex said:
Sure, solipsism is a sound philosophy from a logical perspective, but it doesn't get you very far. In the real world, we function quite effectively by subconsciously assigning reasonable a priori probabilities.
Your rising sun model is insufficiently divorced from real experience to think about in a detached manner. How about, you notice that on three consecutive occasions the winning lottery number ends in a five. What is the probability that there is a flaw in the randomisation? What probability would you have assigned to that beforehand... one in 100? Too high. One in 10,000? Let's say one in a million. How many consecutive occasions of a final digit 5 will push that estimate to greater than half?
I am sorry for being unable to understand that exact case, and please allow me to ask you a question even though I was unable to answer yours. What if someone says he has a real-number generator that generates a random number between 1 and 100, and that there is no pattern and no way to predict what that machine gives? Is it correct to assume that the probability that the number given by the machine is between 90 and 100 is 1/10, since all the numbers seem to have an equal probability here? Thank you very much.
 
  • #38
alan_longor said:
I am sorry for being unable to understand that exact case, and please allow me to ask you a question even though I was unable to answer yours. What if someone says he has a real-number generator that generates a random number between 1 and 100, and that there is no pattern and no way to predict what that machine gives? Is it correct to assume that the probability that the number given by the machine is between 90 and 100 is 1/10, since all the numbers seem to have an equal probability here? Thank you very much.
If you completely trust that information, yes. But complete trust also constitutes a prior distribution.
 
  • #39
haruspex said:
If you completely trust that information, yes. But complete trust also constitutes a prior distribution.
But complete trust also constitutes a prior distribution.
That's the phrase I have been looking for... thank you.
 
  • #40
If the word 'assign' implies that it is just an initial value that will be corrected by some (Bayesian) method, then of course you can assign any valid initial probability guess.
If the word 'assign' implies that you can defend that value as being the correct probability parameter, then you cannot do that. Here is why:

Whether Bayesian methods are to be used or not, the answer to 1) is that no value can be assigned to p'. The most that Bayesian methods can do is to assign an initial standard distribution (uniform) with no justification. There is no real assumption that the initial standard distribution is correct. In fact, Bayesian methods assume that the initial distribution is incorrect and tell you how to change it as real information is obtained.

You should say that the answer to 1) is that no value can be assigned to p'. If you attempt to assign any value, a, to p', you can easily give an example where that value is wrong.
 
  • #41
FactChecker said:
The most that Bayesian methods can do is to assign an initial standard distribution (uniform) with no justification.
Maybe it's a matter of philosophy, but I take a different view.
Bayesian methods require you to supply a prior distribution, but that does not provide an excuse to set it to uniform without justification. In normal practice, it will be some gut feel based on experience. The fundamental point is that the choice should not be that crucial provided it is reasonable.

Consider the classic 'fair coin' problem. What would be a sensible a priori estimate of whether a coin has a 0.01% or more bias towards heads? 0.9 would probably be too much; one in a billion almost surely too low. With those bounds, we can run a trial for long enough to bring our posterior estimate within some preset range.
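To make the "run a trial for long enough" point concrete, here is a rough sketch in Python. The specific numbers (a true bias of p = 0.5001, prior odds of 1 in 1000 for the biased hypothesis, a billion simulated tosses) are illustrative assumptions, not figures from the thread.

```python
# Simulate a slightly biased coin and see how the data move a sceptical prior.
import numpy as np

rng = np.random.default_rng(0)
p_fair, p_biased = 0.5, 0.5001      # "0.01% or more bias" read here as p = 0.5001
prior_odds = 1.0 / 1000.0           # prior odds in favour of the biased hypothesis (assumed)

n = 10**9
heads = rng.binomial(n, p_biased)   # tosses of a genuinely biased coin
log_bf = heads * np.log(p_biased / p_fair) + (n - heads) * np.log((1 - p_biased) / (1 - p_fair))
posterior_odds = prior_odds * np.exp(log_bf)
print(posterior_odds / (1 + posterior_odds))   # posterior P(biased); close to 1 for a trial this long
```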

Question 1 is problematic because we are given no information regarding the nature of the event. The only experience we can call on is of probabilities of events in general, a basis so nebulous that to assign any shape to the prior distribution is highly questionable. Still, I would argue that we come across a lot of probabilities close to 1, so the range "> .9" should have a prior probability of at least, say, 1 in 1000. This is why question 2 can be answered.
 
  • Likes: FactChecker
  • #42
haruspex said:
Maybe it's a matter of philosophy, but I take a different view.
Bayesian methods require you to supply a prior distribution, but that does not provide an excuse to set it to uniform without justification. In normal practice, it will be some gut feel based on experience. The fundamental point is that the choice should not be that crucial provided it is reasonable.
That is a good point. I was influenced by this example, which doesn't give any clue about the initial distribution.
That brings up another question (I don't want to hijack this thread, though):
If you use an initial distribution that is not uniform, that might make it harder to correct. I have no experience with this, but the question has come up before at work where it was being applied iteratively.
 
  • #43
FactChecker said:
If you use an initial distribution that is not uniform, that might make it harder to correct
That assumes the uniform distribution is closer to the true distribution than is the chosen prior. As I posted, a good approach is to consider some range of priors that you feel encompass the answer and run the trials until sufficiently confident.
In the case of fairness of a coin, one that at least looks fair, you would be well justified in taking a prior distribution that sets the probability that the frequency of heads is between 0.4 and 0.6 as being at least 0.8, say.
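As a small illustration (assuming one restricts attention to a symmetric Beta(a, a) family, which the post does not require), one can search for the smallest a whose prior puts at least 0.8 of its mass on 0.4 < p < 0.6:

```python
# Find the smallest symmetric Beta(a, a) prior with P(0.4 < p < 0.6) >= 0.8.
from scipy.stats import beta

a = 1
while beta.cdf(0.6, a, a) - beta.cdf(0.4, a, a) < 0.8:
    a += 1
print(a, beta.cdf(0.6, a, a) - beta.cdf(0.4, a, a))   # stops around a ≈ 20
```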
 
  • Likes: FactChecker
  • #44
haruspex said:
That assumes the uniform distribution is closer to the true distribution than is the chosen prior.
"closer to" is tricky to define. In an iterative process, if stage 1 resulted in a distribution with a small standard deviation, it might be difficult to move the mean in stage 2. So it might be better to increase the standard deviation for the stage 2 prior distribution -- perhaps even to a uniform distribution. But we never investigated it while I was there, so I don't know.
 
  • #45
haruspex said:
Maybe it's a matter of philosophy, but I take a different view.
Bayesian methods require you to supply a prior distribution, but that does not provide an excuse to set it to uniform without justification. In normal practice, it will be some gut feel based on experience. The fundamental point is that the choice should not be that crucial provided it is reasonable.

There is a way to use Bayesian methods in the same spirit that we use frequentist methods.

Frequentist methods don't answer the question "What is the probability of the result that I'm interested in?" Instead they answer questions like "If I assume the result I'm interested in (or its negation), what is the probability of the data?" From the answer to that question, people get a subjective feeling about the probability of the result that interests them, but not an actual number for it.

In a similar manner, a person can assume a uniform (or "maximum entropy") prior and compute the posterior distribution just to get a subjective feeling about how strongly the data suggest a certain range of values for the result of interest. This is different from "taking the prior seriously".
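As a toy illustration of that "uniform prior, just to get a feel" usage (the billion-successes figure is borrowed from the light example earlier in the thread, and SciPy is assumed to be available):

```python
# Start from the uniform prior Beta(1, 1), observe N successes in N trials,
# and inspect how much posterior mass lies above 0.9.
from scipy.stats import beta

N = 1_000_000_000                 # the "billion nights" from earlier in the thread
posterior = beta(1 + N, 1)        # conjugate update: Beta(a + k, b + n - k) with a = b = 1, k = n = N
print(posterior.sf(0.9))          # P(p > 0.9 | data) = 1 - 0.9^(N+1), effectively 1.0
```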
 
  • Likes: FactChecker
  • #46
FactChecker said:
That is a good point. I was influenced by this example, which doesn't give any clue about the initial distribution.
That brings up another question (I don't want to hijack this thread, though):
If you use an initial distribution that is not uniform, that might make it harder to correct. I have no experience with this, but the question has come up before at work where it was being applied iteratively.

There are "natural" types of priors for various types of probability models. In this case the model is that of independent, repeated trials with some unknown success probability ##p##; in other words, Binomial(n, p). The natural prior that people tend to use in such a case is the Beta distribution, with density function
$$f_0(p) = \frac{1}{B(a,b)} p^{a-1} (1-p)^{b-1}, \; 0 < p < 1 $$
Here, ##a, b > 0## are parameters and ##B(a,b)## is the so-called Beta function; see, e.g.,
https://en.wikipedia.org/wiki/Beta_distribution for more formulas and details.
The uniform prior is a special case in which ##a = b = 1##.

Typically one might start by assigning (or estimating, or ...) something like the most probable value of ##p## or the prior mean of ##p##, perhaps also with some typical or probable range estimates. That gives one the purely mathematical problem of determining the parameters ##a,b##. Once one has ##a## and ##b## the Bayesian updating is easy; after observing ##k## successes in ##n## trials the posterior density of ##p## is Beta with parameters ##a+k## and ##b + n-k## in place of ##a## and ##b##. This updating simplicity is the reason the Beta is used as a prior for the Binomial case.

Nothing prevents you from using a different type of prior, but then the updating scheme becomes more difficult and less intuitive. By tuning the parameters, the Beta is capable of representing most types of general prior information, but of course there will always be exceptions.
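A minimal sketch of this conjugate updating in Python (the specific numbers a = 2, b = 5, and 7 successes in 10 trials are illustrative assumptions only):

```python
# Beta prior + Binomial data -> Beta posterior, as described above.
from scipy.stats import beta

a, b = 2.0, 5.0               # prior Beta(a, b); prior mean a / (a + b) ≈ 0.29
k, n = 7, 10                  # observed k successes in n trials

posterior = beta(a + k, b + (n - k))     # conjugate update: Beta(a + k, b + n - k)
print(posterior.mean())                  # (a + k) / (a + b + n) = 9/17 ≈ 0.53
print(posterior.interval(0.95))          # central 95% credible interval for p
```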
 
  • Likes: FactChecker
