Calculating the probability that the Universe is finite

Buzz Bloom
TL;DR Summary
The calculation is based on two numbers: (1) the value of the curvature density parameter Ωk, and (2) its +/- error value. Both of these values are given in equation 47b on page 40 of the reference listed in the body. The calculation is also based on three assumptions: (1) the probability distribution is Gaussian, (2) all four Ωs can have a range of values (to find a best fit to database values), and (3) the universe is not and cannot be exactly flat.
Reference: Planck Collaboration, "Planck 2018 results. VI. Cosmological parameters," arXiv:1807.06209 (Eq. 47b, p. 40).

I note that the use of Gaussian probabilities is mentioned many times in the reference. However, in many discussions in other threads there seems to be a consensus that the distribution is actually only approximately Gaussian, so the result of the calculation presented here is likely less accurate than the number of digits suggests.

The equation 47b is
$$\Omega_k = 0.0007 \pm 0.0019.$$
This means that the integral representing the probability for
$$0 < \Omega_k < 0.0007$$
is
$$\frac{1}{\sqrt{\pi}} \int_0^{7/19} e^{-x^2}\, dx = 0.19882.$$

The integral for
$$0.0007 < \Omega_k < \infty$$
is
$$\frac{1}{\sqrt{\pi}} \int_0^{\infty} e^{-x^2}\, dx = 0.5.$$

The probability ##P_{fu}## that the universe is finite is
$$P_{fu} = 0.19882 + 0.5 = 0.69882.$$
The probability ##P_{iu}## that the universe is infinite is
$$P_{iu} = 0.30118.$$
Note that the probability that the universe is flat is 0 because the probability for a single value 0 would be calculated by an integral from 0 to 0.
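
As a check on the arithmetic, here is a minimal Python sketch (assuming only NumPy and SciPy) that evaluates the two integrals exactly as written above:

```python
import numpy as np
from scipy.integrate import quad

# Integrand with the normalization used above: (1/sqrt(pi)) * exp(-x^2)
def f(x):
    return np.exp(-x**2) / np.sqrt(np.pi)

p_inner, _ = quad(f, 0, 7/19)       # contribution from 0 < Omega_k < 0.0007
p_upper, _ = quad(f, 0, np.inf)     # contribution from Omega_k > 0.0007

print(p_inner)             # ~0.19882
print(p_upper)             # 0.5
print(p_inner + p_upper)   # P_fu ~ 0.69882
```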

This is the first of a series I plan to calculate. Next is calculating the expected value of Ωk and the corresponding expected value of the radius of curvature. I am guessing (without yet having started to calculate) that the Ωk answer will not be very far from 0.0007, but also larger and not by a trivial amount.

EDITED 2/8/2022
 
Last edited:
  • Like
Likes anuttarasammyak
Finite with almost 70% provability interests me. I thought wrongly that the Universe had been revealed to be infinite or open. Thanks.
 
  • Like
Likes sysprog
I think that it's an untestable hypothesis that is not validly subjectable to probabilistic analysis ##-## there is no legitimately comparable phenomenon set available to provide a foundation for such analysis.
 
  • Like
Likes mattt
Buzz Bloom said:
I note that the use of Gaussian probabilities is mentioned many times in the reference.
That is for the assumed distribution of errors in the measurements underlying the calculations. It is not in any way a claim that there is a meaningful "probability distribution" for the spatial curvature of the universe, or that the results given in the paper express the parameters of such a distribution. You can of course compute the portion of the total area under a Gaussian curve that corresponds to values in or outside some range. But that is very different from the claim that the result of such a computation is meaningful as a probability that the universe is positively curved/spatially finite.
 
  • Like
Likes phinds and sysprog
anuttarasammyak said:
Finite with almost 70% provability interests me.
"70% probability" is not the same as "70% provability", which is an even stronger claim. (But even the weaker claim is not meaningful in this context.)

anuttarasammyak said:
I thought wrongly that the Universe had been revealed to be infinite or open.
I don't think any cosmologist has claimed that this question is settled. Cosmologists often use exact spatial flatness in their models because it is the simplest case to model mathematically, and as long as "exactly flat" is within the error bars of the measurement, this modeling convention is justifiable. But it is not at all the same as claiming that we know our universe actually is spatially flat. Since error bars are always finite, if the error bars include exact spatial flatness, they will also include some region of very small positive and very small negative curvature. And as long as that continues to be the case, the question will remain open. The only way to resolve it for sure would be for the error bars to move away from spatial flatness, in either the positive or negative direction, far enough to no longer include exact spatial flatness. But that has not happened.
 
  • Like
Likes mattt, anuttarasammyak and sysprog
Buzz Bloom said:
Next is calculating the expected value of Ωk and the corresponding expected value of the radius of curvature. I am guessing (without yet having started to calculate) that the Ωk answer will not be very far from 0.0007, but also larger and not by a trivial amount.
Your guess is obviously false. The value you refer to in the paper is the expected value of ##\Omega_k## given the data (i.e., the mean of the Gaussian distribution obtained from all the measurements with their assumed Gaussian distribution of errors). The plus or minus value given for that ##\Omega_k## value is just the standard deviation of the Gaussian distribution.
 
Buzz Bloom said:
Note that the probability that the universe is flat is 0 because the probability for a single value 0 would be calculated by an integral from 0 to 0.
Yes, and since the universe could in fact be exactly flat, this correct statement is one way of seeing why any interpretation of calculations such as you have been making in terms of "the probability of the universe having such and such curvature" is meaningless.
 
PeterDonis said:
That is for the assumed distribution of errors in the measurements underlying the calculations. It is not in any way a claim that there is a meaningful "probability distribution" for the spatial curvature of the universe, or that the results given in the paper express the parameters of such a distribution.
Well, cosmologists typically do Bayesian statistics (and definitely in this case as they are using CosmoMC). This means that the posterior inherently is considered a probability distribution. It is of course dependent on whatever prior went into the analysis and this needs to be considered in the interpretation. In a model where a continuous prior for curvature is natural, it should not come as a surprise that you get zero probability for exactly flat out. On the other hand, a model where you predict exactly flat would also have that as an output. The more interesting computation would be to compare the Bayesian evidence for these two models, which would give you an idea of "how much should this data make us lean towards one or the other model". However, the output is certainly a probability distribution. You just have to be very careful with what conclusions you draw from it.
 
  • Like
Likes PeroK and sysprog
Orodruin said:
The more interesting computation would be to compare the Bayesian evidence for these two models
I think that would depend on which two models you try to compare. If you try to compare a model that says "the universe is exactly flat" with a model that says "the universe is positively curved", unless the latter model makes some particular prediction about what the curvature parameter should be, you're back to a continuous prior for the curvature parameter, which gives the "exactly flat" model zero probability and so isn't a meaningful comparison.

But if you could compare two models, one which says "the universe is exactly flat" and one which says "the universe is spatially curved, but the curvature parameter is very small so it's hard to distinguish from flat", then perhaps you could do a useful Bayesian analysis on those grounds. The problem here is that we don't have a model (at least AFAIK) that makes the first prediction. Our best current model, which is Lambda CDM plus inflation, makes the second sort of prediction (and it doesn't specify whether the very small spatial curvature is positive or negative). I'm not aware of any other model with which we could compare it in this way.

Orodruin said:
the output is certainly a probability distribution. You just have to be very careful with what conclusions you draw from it.
Yes, agreed. I was talking about the "very careful" part.
 
  • Like
Likes sysprog
  • #10
PeterDonis said:
I think that would depend on which two models you try to compare. If you try to compare a model that says "the universe is exactly flat" with a model that says "the universe is positively curved", unless the latter model makes some particular prediction about what the curvature parameter should be, you're back to a continuous prior for the curvature parameter, which gives the "exactly flat" model zero probability and so isn't a meaningful comparison.
No, this is not how Bayesian model comparison works. What you would do is to compare the Bayesian evidence of the respective models:
- A model where the distribution of the curvature parameter is a delta function at zero.
- A model where you have a continuous curvature prior.
The ratio of the Bayesian evidence for the models (computed separately) in essence tells you how to update your relative belief in these models. Unless your second model restricts the curvature parameter to be very close to flat, I suspect this would come out strongly in favor of the flat model.
 
  • #11
Orodruin said:
What you would do is to compare the Bayesian evidence of the respective models:
- A model where the distribution of the curvature parameter is a delta function at zero.
- A model where you have a continuous curvature prior.
I'm not sure you can compute meaningful Bayesian evidence with the first model. Wouldn't a delta function prior automatically give a zero posterior for any observation except a curvature of exactly zero?

Or are you saying the actual prediction of the first model for what data would be observed, based on finite measurement accuracy, would be some continuous Gaussian distribution of observations with a mean of zero and a narrow standard deviation?
 
  • #12
PeterDonis said:
I'm not sure you can compute meaningful Bayesian evidence with the first model. Wouldn't a delta function prior automatically give a zero posterior for any observation except a curvature of exactly zero?
Yes, but that is not what evidence is. The Bayesian evidence is in essence the likelihood of the data given the model. In the case of a model with fixed parameters, this is trivially the likelihood of the data given those parameters. In a model where the parameters are not fixed, it is the integral of the likelihood over the parameter space weighted by the prior. The evidence ratio is in essence the ratio of how likely each model was to produce the observed data (of course experimental errors and uncertainties need to be included here).
 
  • Like
Likes mattt, PeroK and sysprog
  • #13
Orodruin said:
The Bayesian evidence is in essence the likelihood of the data given the model.
Yes, I understand that. I'm just not sure how you get a nonzero likelihood of any data other than "curvature exactly zero" from the delta function model without some additional assumptions, to do with finite accuracy of measurement if nothing else, so that there is some continuous distribution of predicted data. But then I'm not sure what the difference is between this model and a model whose prior for curvature is a narrow Gaussian instead of a delta function.

Perhaps it would help if you described in more detail how you envision computing the Bayesian evidence for each model from the data given in the OP.
 
  • #14
PeterDonis said:
Yes, I understand that. I'm just not sure how you get a nonzero likelihood of any data other than "curvature exactly zero" from the delta function model without some additional assumptions, to do with finite accuracy of measurement if nothing else, so that there is some continuous distribution of predicted data. But then I'm not sure what the difference is between this model and a model whose prior for curvature is a narrow Gaussian instead of a delta function.
That is not how the measurements work though. The data is significantly more complicated in this case, but let us assume that you measure the parameter with some experimental uncertainty s and central value 0. The likelihood of obtaining a measurement x is then the Gaussian likelihood at x with standard deviation s. Again, this is not about the posterior distribution of parameters in the model - the posterior within the model will obviously still be a delta function at zero. However, this is not what you use for model comparison. You use the Bayesian evidence.

For the model with continuous parameter ##y##, the Bayesian evidence would be given by
$$
\int L(x,y,s) \pi(y) dy.
$$
Here ##L## would be the likelihood of measuring ##x## when the expectation is ##y## and the standard deviation ##s## and ##\pi(y)## is the prior within the continuous model.
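
As a toy illustration of this evidence ratio (not the actual CosmoMC analysis), one could take a single Gaussian "measurement" ##x## with uncertainty ##s## and a uniform prior of half-width ##W## for the continuous model; the numbers below are assumptions made only for the sketch:

```python
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

x, s = 0.0007, 0.0019   # treat the quoted Omega_k value as a single Gaussian measurement
W = 0.1                 # assumed half-width of the uniform prior in the continuous model

# Model 1: curvature fixed at exactly zero (delta-function prior).
# Its evidence is just the likelihood of the data at y = 0.
evidence_flat = norm.pdf(x, loc=0.0, scale=s)

# Model 2: continuous uniform prior pi(y) = 1/(2W) on [-W, W].
# Its evidence is the likelihood integrated over the prior.
evidence_curved, _ = quad(lambda y: norm.pdf(x, loc=y, scale=s) / (2 * W),
                          -W, W, points=[x])   # points= helps quad resolve the narrow peak

print(evidence_flat / evidence_curved)   # Bayes factor; values >> 1 favor the exactly-flat model
```

With these assumed numbers the ratio comes out well above 1, illustrating the point above that a broad curvature prior is disfavored by the data relative to exact flatness.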
 
  • Like
Likes sysprog
  • #15
Orodruin said:
let us assume that you measure the parameter with some experimental uncertainty s and central value 0
In other words, yes, you do have to add an additional assumption about experimental uncertainty. As I said in my previous post. And then, as I said in my previous post, I'm not sure how this model is any different than a model with a continuous prior for the curvature parameter, in terms of the likelihood of observing some particular data.

Orodruin said:
this is not about the posterior distribution of parameters in the model
Yes, I know that. I'm not talking about the posterior distribution in the model, I'm talking about Bayesian evidence, i.e., the likelihood of the data given the model.
 
  • #16
Hi @PeterDonis and @Orodruin:

I do very much appreciate your posts. Unfortunately I am not educated well enough to grasp implications based on Bayesian methods. However, I can visualize a process of producing a numerical error result corresponding to a particular choice of values for each of the four Ωs (which sum to unity) based on a reference database consisting of pairs of numbers: (1) for the redshift of light from a distant galaxy, and (2) for the distance to the galaxy measured by the brightness of a particular kind of supernova.

A process might work as follows. (I do not claim that such a process is actually used.) One might randomly choose a value for each of the four Ωs (which sum to unity), and then use the Friedmann equation with each of the database pairs to calculate a corresponding value of lookback time based on the z value. (I have just tried to find the formula for the lookback time, involving an integral, on the Internet, but I failed to find it.) The lookback time times c gives a distance value. Let E be the difference between this value and the distance value in the database pair. The process is repeated multiple times to find the combination of Ωs which gives the minimum of the sum of the E values. These minimizing values are the means of a distribution, and the standard deviations of the values can also be calculated. I do not know if this distribution would be Gaussian, but I would expect it to be close enough to Gaussian for the purpose of calculating reasonable expected values while assuming a Gaussian distribution.
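
A rough sketch of the kind of search just described might look like the following, where `data_pairs` and `model_distance` are hypothetical placeholders (a list of (z, distance) pairs and a function giving the distance a chosen set of Ωs predicts), and squared differences are used as the error measure:

```python
import numpy as np
from itertools import product

def total_squared_error(omegas, data_pairs, model_distance):
    """Sum over the database of squared differences between model and measured distance."""
    return sum((model_distance(z, omegas) - d) ** 2 for z, d in data_pairs)

def best_fit(grid, data_pairs, model_distance):
    """Brute-force search over (Omega_r, Omega_m, Omega_Lambda);
    Omega_k is fixed by requiring the four Omegas to sum to unity."""
    best, best_err = None, np.inf
    for om_r, om_m, om_l in product(grid, repeat=3):
        omegas = {"r": om_r, "m": om_m, "L": om_l, "k": 1.0 - om_r - om_m - om_l}
        err = total_squared_error(omegas, data_pairs, model_distance)
        if err < best_err:
            best, best_err = omegas, err
    return best, best_err
```

Real analyses do something far more sophisticated (MCMC over the full likelihood), but this captures the idea of minimizing a misfit over the parameter space.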

I understand that it may be a reasonable interpretation that such an approach would not be applicable to cosmology. However, I am unable to visualize any reason for this. I would much appreciate what either of both of you might say to explain such a reason.

ADDED
I did the math for lookback time T. It took me quite a while, but I think I have it correct.
$$T = \frac{1}{H_0} \int_{a(T)}^{1} F(a)\, da$$
where
$$a(T) = \frac{1}{z+1},$$
and
$$F(a) = \left( \frac{\Omega_r}{a^2} + \frac{\Omega_m}{a} + \Omega_k + a^2 \Omega_\Lambda \right)^{-1/2}.$$
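
As a numerical check of this formula, here is a short Python sketch using illustrative, roughly Planck-like parameter values (these are assumptions for the example, not the fitted values from the paper):

```python
from scipy.integrate import quad

# Illustrative, roughly Planck-like parameter values (not the fitted values from the paper)
H0 = 67.4                                  # km/s/Mpc
Om_r, Om_m, Om_k = 9.0e-5, 0.315, 0.0
Om_L = 1.0 - Om_r - Om_m - Om_k            # the four Omegas sum to unity

def F(a):
    """Integrand from the formula above."""
    return (Om_r / a**2 + Om_m / a + Om_k + a**2 * Om_L) ** -0.5

def lookback_time_gyr(z):
    a_of_z = 1.0 / (1.0 + z)
    integral, _ = quad(F, a_of_z, 1.0)
    H0_in_per_gyr = H0 * 1.0227e-3         # 1 km/s/Mpc is about 1.0227e-3 per Gyr
    return integral / H0_in_per_gyr

print(lookback_time_gyr(1.0))              # roughly 8 Gyr of lookback time to z = 1
```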

Regards,
Buzz
 
Last edited:
  • #17
PeterDonis said:
In other words, yes, you do have to add an additional assumption about experimental uncertainty.
All measurements come with uncertainty. If not, there would be no point in discussing statistics.

PeterDonis said:
And then, as I said in my previous post, I'm not sure how this model is any different than a model with a continuous prior for the curvature parameter, in terms of the likelihood of observing some particular data.
It is not; why would it be? If you find a model where it is natural to get values very close to flatness (such as inflation), it is going to be very very difficult to separate from a model with exact flatness by construction. However, if you have a model where a prior that is reasonably flat over a large region of non-flatness is natural (such as not having inflation), then such a model will be disfavored by data relative to a model with exact flatness.
 
  • #18
Orodruin said:
If you find a model where it is natural to get values very close to flatness (such as inflation), it is going to be very very difficult to separate from a model with exact flatness by construction.
Yes, and this is the sort of model I was talking about in the earlier post of mine that I referred to: a model whose prior for curvature is a very narrow one centered on flatness. Of course a model whose prior for curvature is much wider is strongly disfavored by the data.
 
  • #19
Orodruin said:
All measurements come with uncertainty.
Yes, but the effect of measurement uncertainty on the likelihood of particular data given the model can vary widely by model. For example, given a model whose prior for curvature is a very wide distribution, the effect of measurement uncertainty on the likelihood of particular data for this model will be very small. Whereas, given a model whose prior for curvature is a delta function at zero, the effect of measurement uncertainty on the likelihood of data is highly significant: without measurement uncertainty, the likelihood of any data other than exactly zero given this model is zero.
 
  • #20
PeterDonis said:
Yes, and since the universe could in fact be exactly flat, this correct statement is one way of seeing why any interpretation of calculations such as you have been making in terms of "the probability of the universe having such and such curvature" is meaningless.
Hi @PeterDonis:

I do not understand why, nor can I imagine why, the combination of
(1) the possibility that "the universe could in fact be exactly flat", and
(2) the calculation of probabilities for positive and/or negative values of Ωk
leads to your conclusion that
"the probability of the universe having such and such curvature" is meaningless.

I would very much appreciate reading your reasoning about this. It seems to me that the method can calculate a probability for any small continuous range of values, even though no probability can be calculated for a single value. This does not seem to be a reason why what it can calculate is not a useful approximation to a cosmological fact, particularly since there is no observable evidence, and no currently anticipated method to gain future evidence, that the universe is definitely flat.

Regards,
Buzz
 
Last edited:
  • #21
Buzz Bloom said:
I do not understand why, nor can I imagine why, the combination of
(1) the possibility that "the universe could in fact be exactly flat", and
(2) the calculation of probabilities for positive and/or negative values of Ωk
leads to your conclusion that
"the probability of the universe having such and such curvature" is meaningless.
Because the relevant calculation is not the calculation of a probability distribution for ##\Omega_k## but a calculation of a probability distribution for models. The statement "the universe could be exactly flat" is not a statement about one possible place where our actual universe could fall within the distribution of possible values for ##\Omega_k## that you calculated. It is a statement about possible models: it says that one possible model of the universe is that there is some constraint that forces it to be exactly spatially flat, always.

Your calculation says nothing about the relative likelihood of such a model as compared to, say, an inflation model in which the spatial curvature can start out anywhere and gets driven to be very close to flat by inflation. (Note that in such a model, the universe is not exactly spatially flat.) But that is the sort of relative likelihood we have to assess if we want to answer the title question of this thread.

Buzz Bloom said:
no currently anticipated method to gain future evidence that the universe is definitely flat
We could never gain direct evidence that the universe is exactly flat, because our measurements always have some finite uncertainty. Any belief we might have about the universe being exactly flat (as opposed to just so close to flat that we can't measure any difference) would have to be based on our belief in some model in which that was a constraint, and such a model would have to be established by other evidence besides measurements of spatial curvature.
 
  • Like
Likes PeroK
  • #22
Buzz Bloom said:
I would very much appreciate reading your reasoning about this.
Fundamentally, the point is that "the probability that the universe is closed" isn't a complete concept. The universe is what it is, it can't be anything else, so probability theory doesn't apply. It's like tossing a coin and seeing it land heads and asking what's the probability that it landed heads. It's not really a meaningful question because that toss is a done deal - probability is for things where several outcomes are possible. What you are really trying to calculate is "the probability that the universe is closed given our current understanding of theory and our data" - which is in the domain of probability theory. That's like me tossing a coin and hiding the result from you. The toss is a done deal, but you don't know the outcome.

Bayes' Theorem is your tool for this. Here's an example. I get a coin from your wallet and toss it ten times, getting nine heads and one tail. Do you think that the coin is fair?

The data implies a mean of 0.9 with a standard error of about 0.1. That's quite a long way from 0.5, so a naive view might be that your best estimate is that the coin is biased. But you have to admit that nine heads could just be luck from a fair coin.

But, come on! I specifically said I took the coin from your wallet, so it's probably legit legal tender, and national banks (and vending machine companies!) are really finicky about precision manufacture. If one particular type of coin were biased due to manufacture that'd be on Wiki or Mythbusters by now. So if it's biased it must have been damaged somehow, and (a) coins are pretty tough, and (b) damaged coins go out of circulation fairly quickly these days. How did it get damaged? And how unlucky would we have to be to draw a biased coin to do our test?

This kind of stuff is a priori information about the coin: it almost certainly is fair. We have two models in mind - a mint coin and a damaged coin. And initially we believe that we are hugely more likely to have a fair coin than an unfair one. But then we do our experiment and get a fairly extreme result. That should weaken our belief that this is a fair coin, but probably not much. Bayes' theorem will tell you what your a posteriori beliefs should be, given your a priori beliefs and the data from the experiment.
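
As a small numerical sketch of that Bayes'-theorem update (the prior of 0.999 for a fair coin and the 90%-heads "damaged" alternative are illustrative assumptions, not anything measured):

```python
from math import comb

p_fair_prior = 0.999                  # assumed strong prior that a wallet coin is fair
p_damaged_prior = 1.0 - p_fair_prior  # assumed rare "damaged" coin that lands heads 90% of the time

heads, tosses = 9, 10

def binomial_likelihood(p_heads):
    """Probability of seeing 9 heads in 10 tosses for a coin with the given heads probability."""
    return comb(tosses, heads) * p_heads**heads * (1 - p_heads)**(tosses - heads)

like_fair = binomial_likelihood(0.5)
like_damaged = binomial_likelihood(0.9)

# Bayes' theorem: posterior is proportional to prior times likelihood
posterior_fair = (p_fair_prior * like_fair) / (p_fair_prior * like_fair + p_damaged_prior * like_damaged)
print(posterior_fair)   # about 0.96: the extreme data weakens, but does not overturn, the strong prior
```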

You can apply similar reasoning to your actual problem here. You have three models in mind - open, flat, and closed universes, and you have data in the form of a measure of curvature with error bars. You need to write down your prior beliefs, then see how the data affects them. Since what the data says is pretty equivocal, it probably won't change anything much.
 
Last edited:
  • #23
Ibix said:
The universe is what it is, it can't be anything else, so probability theory doesn't apply. It's like tossing a coin and seeing it land heads and asking what's the probability that it landed heads. It's not really a meaningful question because that toss is a done deal - probability is for things where several outcomes are possible.
This is not necessarily true. Probability theory can be used to estimate the likelihood of something that is definite but unknown.

In terms of the "probability that the universe is finite", then I would say (as a frequentist) that may not be a valid use of probability theory. This is where Bayesian probability theory is able to conjure a probability from circumstances where a frequentist would not! See, for example:

https://ocw.mit.edu/courses/mathema...pring-2014/readings/MIT18_05S14_Reading20.pdf

The real issue (for the frequentist) is that you must assume a distribution. Then it's a case of applying the standard theory of conditional probabilities.

Let's look at the example of a coin that comes up heads nine times out of ten.

We can test the hypothesis that it is a fair coin. And, we can test any other hypotheses. But, it's difficult to give a probability that it is a fair coin. The reasoning is as follows:

1) If we know that all coins are fair, then it must be a fair coin.

2) If we know (or assume) that coins come in three equally-likely flavours: fair coins; coins that are 90% biased towards heads; and, coins that are 90% biased towards tails - then we can crank out the probability that this is a fair coin. (Assuming the original choice of coin was equally likely to be any one of these three.)

3) If we know or assume that biased coins are rare, then we can calculate a different number. And, the number depends on how rare we think they are.

What we can't do, therefore, is give a number to the probability that it is a fair coin without assuming the distribution of fair coins against biased coins.
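
For case 2, with the three coin types taken as equally likely a priori, the number one can crank out for nine heads and one tail is (the common binomial factor cancels):

$$
P(\text{fair} \mid 9H,\,1T) = \frac{0.5^{10}}{0.5^{10} + 0.9^{9}\cdot 0.1 + 0.1^{9}\cdot 0.9} \approx 0.025,
$$

so under that assumed distribution of coin types the data would make a fair coin quite unlikely, while under case 3 the answer depends entirely on how rare biased coins are assumed to be.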

Bayesian methods can, however, go further than this. Personally, however, I'm not convinced that the numbers that come out of such calculations are meaningful.
 
  • #24
PeroK said:
This is not necessarily true. Probability theory can be used to estimate the likelihood of something that is definite but unknown.
Yes - that's why I gave the example of a hidden coin toss. Perhaps I should have said that probability is for when several outcomes are possible based on the information you have at the time.
 
  • #25
Ibix said:
Yes - that's why I gave the example of a hidden coin toss. Perhaps I should have said that probability is for when several outcomes are possible based on the information you have at the time.
I think a probability of ##0## or ##1## is just as valid as a probability of ##0.5##!
 
  • #26
PeterDonis said:
Because the relevant calculation is not the calculation of a probability distribution for ##\Omega_k## but a calculation of a probability distribution for models.
Hi @PeterDonis:

I understand that the calculation of a Friedmann model has as a result values for five variables: H0 and the four Ωs. Also, for each variable, a +/- error range is given together with a value for the degree of confidence. The confidence can be defined in terms of standard deviations. I may have misunderstood what I read in the Planck 2018 paper, or confused what I remember, but I think the confidence values were defined as one standard deviation. It may be that the H0 variable was not calculated specifically for the Planck paper, but was taken from another source. The issue we are disagreeing about is whether (1) each of the five variables has values for mean and standard deviation, or (2) the numbers presented in the form x +/- y have a different interpretation. I think we agree that the values depend on the data used specifically for this paper, and that it is not intended to be an all-time final result. What I cannot grasp is what specifically the values could mean as a "(2) different interpretation".

Perhaps a simpler topic would help me to understand the meaning. The value of H0 has been (and continues to be) calculated based on a database of pairs of values: z and distance (z values corrected if necessary for our own velocity with respect to the CMB or another standard). The calculation gives a value for H0 and a +/- error value (or two values, one for each of + and -). Let us assume for the purpose of this discussion that it is a single +/- error value. The H0 result seems to me to be the mean of a probability distribution, and the +/- error a standard deviation, or a specified multiple of a standard deviation. With these values one could calculate the probability that H0 is within a specified range of values, with the understanding that such a calculated value is likely to change when new data produce a different mean value and standard deviation. We possibly also agree that such a change implies that the values are not a reliable mean and standard deviation, but we might disagree about whether they are useful for this purpose.

Regards,
Buzz
 
Last edited:
  • #27
Ibix said:
Fundamentally, the point is that "the probability that the universe is closed" isn't a complete concept. The universe is what it is, it can't be anything else, so probability theory doesn't apply.
Hi @Ibix:

I do not understand the logic in the above quote. It is certainly unknown, and possibly unknowable whether or not the universe is hyperbolic, flat, or hyper-spherical. However, even if it is what it is, and we do not know what it is, we do have a clue. The clue is what I interpret to be the mean and standard deviation values of Ωk. The clue is incomplete in that "flat" implies a specific value (zero), and the probability distribution clue cannot deal with that, or at least I am unable to understand how it might.

However, there is a related question that does have a calculable answer:
"Is the universe finite or infinite?"​

I just cannot understand why something that is what it is denies the possibility that among the possible things it might be, each possibility has a probability. If you agree with this, how can it be that "probability theory doesn't apply"?

Example: I have a standard shuffled deck of 52 cards. You pick a card without looking at it. It is what it is. There is a certain probability (25%) that the suit of this card is Spades. There is a certain probability (7.692...%) that the denomination of this card is a Jack. There is a certain probability (1.923...%) that the card is the Jack of Spades. Why do these probabilities "not apply"?

Regards,
Buzz
 
  • #28
Buzz Bloom said:
Example: I have a standard shuffled deck of 52 cards. You pick a card without looking at it. It is what it is. There is a certain probability (25%) that the suit of this card is Spades. There is a certain probability (7.692...%) that the denomination of this card is a Jack. There is a certain probability (1.923...%) that the card is the Jack of Spades. Why do these probabilities "not apply"?

Regards,
Buzz
What if you have a pack of cards and you don't know how many suits there are or how many cards there are in each suit? You just get to look at a few cards.

Of course, you can come up with a number, such as 25% or 1%, but unless you have sufficient knowledge of the data you are dealing with such numbers may be meaningless.
 
  • #29
PeroK said:
What if you have a pack of cards and you don't know how many suits there are or how many cards there are in each suit? You just get to look at a few cards.

Of course, you can come up with a number, such as 25% or 1%, but unless you have sufficient knowledge of the data you are dealing with such numbers may be meaningless.
Hi @PeroK:

Are you implying that the value of Ωk and its +/- standard deviation value are meaningless numbers? If so, what makes them meaningless? Are the corresponding numbers for H0, Ωr, Ωm, and ΩΛ also meaningless?

Regards,
Buzz
 
  • #30
Buzz Bloom said:
Hi @PeroK:

Are you implying that the value of Ωk and its +/- standard deviation value are meaningless numbers? If so, what makes them meaningless? Are the corresponding numbers for H0, Ωr, Ωm, and ΩΛ also meaningless?

Regards,
Buzz
A measured or estimated energy density is not meaningless. But, say, a statement that says that there is a 30% probability that the universe is finite may be a meaningless statement.

In general, probabilities only make sense given some specific assumption(s).
 
  • #31
Buzz Bloom said:
the calculation of a Friedmann model
Is not the relevant calculation, because what you mean by "a Friedmann model" here is one particular Friedmann model with particular unknown values for all the parameters, and you're trying to estimate what the parameters are from the data. In other words, you're not comparing models, you're fixing the model and estimating its parameters, and you're assuming that all of the parameters you are estimating are free, to be estimated from the data.

But if I propose a different model, which is still described by a Friedmann equation but in which there is a constraint that forces one particular parameter, ##\Omega_k##, to have the exact value zero always and everywhere, then that changes how you estimate the other parameters. And the calculations you are doing do not help you at all in comparing the likelihood of those two different models, the one I propose and the one you refer to in what I quoted from you above. If all you have is the parameter estimation data, the best you can do is say that neither model is ruled out by the data (since ##\Omega_k = 0## is within the error bars of our current measurements). You can't give any relative likelihood.
 
  • #32
Buzz Bloom said:
However, there is a related question that does have a calculable answer:
"Is the universe finite or infinite?"​

I just cannot understand why something that is what it is denies the possibility that among the possible things it might be, each possibility has a probability.
The question is how you estimate that probability. The point is that you can't estimate it just from the ##\Omega_k## calculation.

Buzz Bloom said:
I have a standard shuffled deck of 52 cards.
Yes, which means you have fixed the model, and that model already has known probabilities for all possible cards.

But the position we are in when trying to answer the title question of this post is not like that. We do not know what the correct model for the universe is. We can pick a model that looks like it might be right, and estimate parameters from that, but that does not answer the title question of this thread, because it does not answer the question of how likely that particular model is to be right among all the possible models that are consistent with the data.

It's as if, as @PeroK said, there are many different possible decks of cards, and we don't know from which one the card whose properties we are trying to calculate the probability of was drawn. Calculating the probability of drawing a Spade from a standard deck does not help answer that question, because it doesn't tell you the likelihood that the deck is a standard deck to begin with.
 
  • #33
Buzz Bloom said:
Are you implying that the value of Ωk and its +/- standard deviation value are meaningless numbers? If so, what makes them meaningless? Are the corresponding numbers for H0, Ωr, Ωm, and ΩΛ also meaningless?
Of course they're not meaningless; they are estimates of those parameters given a particular assumption about the underlying model. But that in itself, as has been pointed out, is not enough to answer the title question of this thread.
 
  • #34
PeroK said:
In general, probabilities only make sense given some specific assumption(s).
Hi @PeroK:

I would very much appreciate your listing a few examples of probabilities that make sense given some specific assumptions.

Regards,
Buzz
 
  • #35
Buzz Bloom said:
Hi @PeroK:

I would very much appreciate your listing a few examples of probabilities that make sense given some specific assumptions.

Regards,
Buzz
If we stick to cosmology:

1) The probability that Betelgeuse will go supernova in the next 100 years, as observed from Earth.

2) The probability that Dark Matter particles will be discovered in the next 20 years.

3) The probability that extraterrestrial life is discovered in the next 50 years.

4) The probability that the universe is proved to be finite (or that a finite model is adopted) in the next 500 years.

Notice that if we remove the timescale, it makes things a lot less clear. If you bet that extraterrestrial life will never be discovered, then you can only ever lose that bet and never successfully win.

Likewise, if I bet that the universe is infinite, it's a bet that I (probably!) can't ever win. What would constitute proof of an infinite universe? I could only ever lose that bet.
 
  • #36
PeterDonis said:
If all you have is the parameter estimation data, the best you can do is say that neither model is ruled out by the data (since ##\Omega_k = 0## is within the error bars of our current measurements). You can't give any relative likelihood.
Hi @PeterDonis:

I am having a bit of confusion. The two models you describe seem to me to be plausibly (but not necessarily) in competition. The competition is based on the differences between the two with respect to the total sum of the differences between the input data (z and distance) and each corresponding model's fit of
da/dt to H0 D (D= distance).
It seems plausible that the value of this fit measurement for Ωk=0 will be larger than the one for the other model's non-zero value for Ωk.
This difference will enable a calculation of
(1) the fit measurement result M1 based on the assumption that Ωk=0, and
(2) the fit measurement result M2 based on the assumption that Ωk has a Gaussian distribution.

I do not know if the following would be useful, but I would find it to be interesting. A series of fit measurement values M could be calculated for a range of Ωk values. From these results, one could find the range R between
Ωk-Q and Ωk+Q
for which the average MA of the fit measurements equals the value M1. Then, I suggest that the integral of the Gaussian distribution for Ωk over the range R is an approximate probability value that Ωk=0.

The above described calculation is just a guess that something like that might possibly produce a probability for Ωk=0.

Regards,
Buzz
 
  • #37
Buzz Bloom said:
The two models you describe seem to me to be plausibly (but not necessarily) in competition.
They're in competition in the sense that they give different answers to the title question of this thread, yes.

Buzz Bloom said:
each corresponding model's fit of
da/dt to H0 D (D= distance).
Neither model makes a single prediction for this. As I said before, each model estimates all of its free parameters based on the data. The difference is that the alternate model I proposed has one fewer free parameter, since it fixes ##\Omega_k = 0##. But that still leaves multiple free parameters for both models, and it has to estimate all of them based on the data. The method of estimation, as I understand it, is basically a least squares fit to the data, but the data is not da/dt vs. H0 D.

Buzz Bloom said:
(2) ... the assumption that Ωk has a Gaussian distribution.
The model under discussion for this item does not predict that ##\Omega_k## has a Gaussian distribution. To the extent it makes a prediction about ##\Omega_k##, it says that any value that is not ruled out by our current evidence is about equally probable. (A Bayesian would call this a uniform prior over the interval of values not ruled out by evidence.)

Buzz Bloom said:
The above described calculation is just a guess that something like that might possibly produce a probability for Ωk=0.
If the two models under discussion were known to be the only two possible models, then we could estimate the probability for ##\Omega_k = 0## based on the relative likelihood of the alternate model I proposed, which fixes ##\Omega_k = 0##, vs. the standard inflation-based model, given the data. But that's still not the same as either the "fit" estimate you've described, or the parameter estimation I described.

If there are more possible models besides those two, then we would have to be able to estimate their relative likelihoods as well given the data.

In any case, the key point is that you are talking about estimating the probability of the data given a model, while what we actually need is the probability of each model given the data.
 
  • #38
The standard deviation you are talking about is an estimate of the variation in the measurement of the curvature of the universe. One takes a number of measurements and finds how much they vary. Everyone thinks the curvature is practically constant, so the variation is due to imperfections in the measuring procedure. We then say that this standard deviation is an estimate of the amount of "error" in the measurement.

In other words, everyone thinks that during these measurements the curvature is practically a constant. This is the distribution we are hypothesizing about. If the measurements were perfect then they would always measure the same number. The variation in our sample is entirely due to measurement error.

If zero were more than 3 std deviations from the mean of the measurement then the no curvature hypothesis would be doubted.
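
For the numbers in the OP this is nowhere near the case: zero is only about
$$
\frac{|0.0007 - 0|}{0.0019} \approx 0.37
$$
standard deviations from the measured mean, so exact flatness is comfortably consistent with the measurement.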

Statistics can't be used to prove anything. One makes a hypothesis, takes some measurements, and calculates how consistent they are with the hypothesis. In this case the measurements are consistent with the zero hypothesis, while high curvature hypotheses are rejected. That's all you can get from these measurements.

Both zero and almost zero are consistent with the measurement. The measured statistics don't indicate one over the other.
 
Last edited:
  • #39
PeterDonis said:
Of course they're not meaningless; they are estimates of those parameters given a particular assumption about the underlying model.
Hi @PeterDonis:

I remain confused. Please explain what the particular assumption about the underlying model is. My understanding is that the underlying model is the values of the five variables (and the assumption that the Friedmann equation is the framework for making models). But these five values are not assumed. They are calculated as a best fit to a database of data. That is, they are not inputs, they are outputs.

By the way, I apologize for the card example. I now see that it was inappropriate.

Regards,
Buzz
 
Last edited:
  • #40
PeterDonis said:
If there are more possible models besides those two, then we would have to be able to estimate their relative likelihoods as well given the data.
Hi @PeterDonis:

Again I get the impression that the source of my confusion about your explanations is that I see a model as the result of calculations to find the model that is the best fit to the raw data. You discuss two models: (M1) with the assumed value Ωk=0, and (M2) with Ωk as one of the model variables whose value is to be calculated from the relevant database. This allows a comparison between the fits of the two models to the database variables. For this pair of models it is likely that the fit by M1 is sufficiently close to the fit by M2 that it is plausible (perhaps with a numerical degree of confidence) that Ωk=0 is a correct value.

This method is also a reasonable way to confirm that Ωr=0 is a convenient assumption for calculations of the model's values for H(t) when Ωr<<Ωm.

I still do not see why the mean and standard deviation values for Ωk do not represent a probability distribution of Ωk that can be used to calculate the probability that Ωk>0 (and also its expected value when assuming Ωk>0). Perhaps the fact that there are five variables in the models makes this too complicated an example for explaining the reasons against the probability function interpretation. I will post later what I hope will be an acceptable context based on a one-variable model: H0.

Regards,
Buzz
 
  • #41
Hornbein said:
The standard deviation you are talking about is an estimate of the variation in the measurement of the curvature of the universe.
Hi @Hornbein:

I apologize for not understanding what the quote above is intended to communicate. You seem to be saying that measurements are made to obtain values for Ωk. I am unable to guess what kinds of measurements are made. My guess about calculating a value for Ωk (as well as for the other variables) is that the Friedmann equation is used with assumed values for the five variables to calculate values of H(t) for a list of t (time) values which correspond to some astronomical observations. That is, a comparison is made between some database values and calculated values, and the particular values of the Friedmann variables with the least sum of squares of differences from the database values are the means of the variable values. The sums of squared differences can be used to calculate standard deviations.

So, my guess is that there is no measurement of curvature. The measurements are the sums of differences squared between calculated variables and database variables related to H(t).

Regards,
Buzz
 
  • #42
Buzz Bloom said:
Please explain what the particular assumption about the underlying model is.
Your underlying model is that the universe is described by an FRW spacetime and that all five of the parameters you listed are free parameters; none of their values are fixed by the model, they are all to be estimated from the data.

The alternate model that I proposed is that the universe is described by an FRW spacetime and four of the five parameters you listed are free parameters, but one, ##\Omega_k##, is not: it is fixed at the value ##0## by constraints imposed by the model (I have given no details about what those constraints are because I am just proposing this model for comparison). The other four parameters are to be estimated from the data.
 
  • Like
Likes Buzz Bloom and PeroK
  • #43
Buzz Bloom said:
I apologize for the card example. I now see that it was inappropriate.
No, it wasn't; it was a good illustration of the difference between calculating probabilities given a fixed underlying model, and trying to compare different underlying models.
 
  • Like
Likes Buzz Bloom
  • #44
Buzz Bloom said:
I see a model as the result of calculations to find the model that is the best fit to the raw data.
That is not how I am using the term "model". See post #42.

You can't even do "calculations to find the model that is the best fit to the raw data", using "model" in your sense, until you have decided on a "model" in my sense--an underlying theoretical model that tells you what the relevant equations are and what parameters in those equations are to be estimated from the data. And you cannot use "calculations to find the model that is the best fit to the raw data" to compare "models" in my sense. But comparing "models" in my sense is what you have to do to answer the title question of this thread.

In other words, the title question in this thread cannot be answered by doing calculations to find the best fit of the five parameters you listed, of which ##\Omega_k## is one, to the data, under the assumption that all five of them are free parameters to be estimated from the data. Those calculations can only answer a different question, namely: if we assume that ##\Omega_k## is a free parameter to be estimated from the data, what is the probability that the spatial curvature is positive (i.e., that the universe is spatially closed)? That is the question your calculations in the OP answer. But it is not the same question as the title question of this thread; the if clause in front makes a huge difference.
 
  • #45
Hi @PeterDonis:

If I am understanding your post #42 (further explained in post #44), you are making a distinction between two categories of models, both assuming and based on the Friedmann equation. Category 1 is a model where all five parameters are to be determined by what you describe as "estimated from the data", and I describe as making the best fit to a database of data. Category 2 is a model in which one or more (but not all five) of the five parameters are assumed to have a fixed value, and the other parameters are "estimated from the data" (or calculated by finding the best fit to a database). I also think you intend (and I agree) that for a Category 2 model to have a result that is useful, it needs to be compared with the Category 1 model. If the Category 2 model has a much worse fit to the database than Category 1, it will not be evaluated as being a reasonably reliable model.

Assuming that you and I agree about the above paragraph, I would very much appreciate your explaining why you do not accept that the Category 1 model has values for each variable which represent the mean and standard deviation of probabilities for the variables' values.

Regards,
Buzz
 
  • #46
Buzz Bloom said:
you are making a distinction between two categories of models
Not two categories, two specific models, each of which makes a specific designation of which parameters are free parameters.

Buzz Bloom said:
I also think you intend (and I agree) that for a Category 2 model to have a result that is useful, it needs to be compared with the Category 1 model.
Not at all. I can use my alternate model to estimate the fit of its free parameters to the data without even knowing of the existence of the other model, the "standard" one. And of course you can use the "standard" model to estimate the fit of its free parameters to the data without knowing of the existence of my alternate model (since that is what you did in the OP of this thread, before I had even proposed the alternate model).

What you can't do is answer the title question of this thread by using estimated parameter fits from only one model. But there are lots of other uses for particular models besides trying to answer the title question of this thread.

Buzz Bloom said:
If the Category 2 model has a much worse fit to the database than Category 1, it will not be evaluated as being a reasonably reliable model.
If the best fit of any model's parameters to the data has huge variances (i.e., the model is unable to reproduce the data very closely at all, no matter what values you plug in for its parameters), then of course it's not going to be considered a viable model if there is another model whose fits are much better (i.e., which can reproduce the data much more closely with appropriate values for parameters). But that doesn't mean the other model is necessarily the best possible one. There could be still another model that could fit the data even better.

There is one particular sense in which the "standard" model, the one with all five parameters you listed as free parameters, is "more general" than any other model that only uses those parameters: it allows for the possibility that all five parameters could have values that we don't know for sure. Whereas any other model using only those parameters (such as my alternate model) must claim that we know for sure the value of at least one parameter (in the case of my alternate model, that is ##\Omega_k##). That might be what you are trying to get at here. But it's still a fairly weak claim; in particular, it's not sufficient to ground any claim that considering only the "standard" model is sufficient to answer the title question of this thread.

Buzz Bloom said:
I would very much appreciate your explaining why you do not accept that the Category 1 model has values for each variable which represent the mean and standard deviation of probabilities for the variables' values.
If by this you mean the claim about the "standard" model being "more general" that I described above, then it is correct, but limited (as I said above).

If you mean anything stronger, such as the claim that the distribution of ##\Omega_k## in this model says anything useful about the value of ##\Omega_k## in my alternate model, then the statement is incorrect.

Also, we have so far not even discussed the possibility of another alternate model that had more free parameters than the standard one does (for example, a model in which the dark energy density is not treated as constant but is allowed to vary with time). In such a model, the estimate for ##\Omega_k## might be different from the one you calculate (because the presence of additional free parameters can change the overall best fit). You can't make any claims about that sort of model either from your calculations in the OP.

You can of course try to argue that models with additional free parameters are ruled out by Occam's razor since the fit to the data of the "standard" model is "good enough". But such arguments have nothing to do with the calculations you made in the OP.
 
  • #47
PeterDonis said:
Not two categories, two specific models, each of which makes a specific designation of which parameters are free parameters.
Hi @PeterDonis:

Every time I think I am making progress, I am instead becoming more confused. The key issue I am trying to understand is the following.
Why is it not reasonable that calculating the values of the Friedmann equation variables (five of them) by finding the Friedmann variable values which have the best fit to database values (which are related to the Friedmann equation, for example H(t) values) produces a mean and standard deviation for a probability distribution for each of the five Friedmann variables?

A secondary issue is: why aren't the two categories I defined meaningful? I get that Category 2 could be recognized as having several sub-categories, such as each sub-category having a specific subset (non-empty) of the five variables, and each such subset having preset fixed values.

My understanding of the term "Friedmann model" is that such a model has a specific value for each of the five variables. You seem to be using this term with a different sense.

Regards,
Buzz
 
Last edited:
  • Sad
Likes PeroK
  • #48
Buzz Bloom said:
Why is it not reasonable that calculating the values of the Friedmann equation variables (five of them) by finding the Friedmann variable values which have the best fit to database values (which are related to the Friedmann equation, for example H(t) values) produces a mean and standard deviation for a probability distribution for each of the five Friedmann variables?
I have said no such thing. You aren't reading what I'm actually saying.

The calculations you refer to do produce means and standard deviations for the five variables, under the assumption that you are using a model in which all five of those variables are free parameters. Of course it is reasonable to make those calculations under that assumption. I have simply been pointing out, several times now, that the assumption I put in italics just now is an essential part of the calculations. The calculations only apply to the particular model that satisfies the assumption. They don't apply to other models that don't satisfy the assumption.

Buzz Bloom said:
A secondary issue is: why aren't the two categories I defined meaningful?
Again you aren't reading what I'm saying. I said they're not categories, they're individual models (with the proper usage of the term "model"--see below). Of course the difference between them is meaningful, and I never said otherwise.

Buzz Bloom said:
My understanding of the term "Friedmann model" is that such a model has a specific value for each of the five variables. You seem to be using this term with a different sense.
Of course I am, and I already explained the different sense in which I am using it, in post #44. Please read that post again, carefully.

Basically, you are using "category" to mean what I mean by "model", and you are using "model" to mean what I mean by "model with a particular set of values for its free parameters". That is a matter of words, not physics. My usage is, I believe, the standard usage in physics. Your usage is not. But either way, the concepts are clear, and it's the concepts that we are discussing, not the choice of words.
 
  • Like
Likes Buzz Bloom
  • #49
PeterDonis said:
That is a matter of words, not physics. My usage is, I believe, the standard usage in physics. Your usage is not.

Hi @PeterDonis:

It seems that a lot of my confusions are vocabulary issues. I try to avoid having a single word used to have more than a single meaning. Over the years my experience has been that when a word has several usages, the definition I have in mind is often the wrong one. I have tried to make a distinction between (1) a "model" with five specific values, and (2) a kind of "model" where the variables have not been given specific values. I apologize for my confusion. In my readings I have not become familiar with the nuances of these two different meanings for "model".

Post #48
PeterDonis said:
The calculations you refer to do produce means and standard deviations for the five variables, under the assumption that you are using a model in which all five of those variables are free parameters.
Post #4
PeterDonis said:
That is for the assumed distribution of errors in the measurements underlying the calculations. It is not in any way a claim that there is a meaningful "probability distribution" for the spatial curvature of the universe, or that the results given in the paper express the parameters of such a distribution.

My interpretation of Post #4 led me to believe that your view is that it is unreasonable for me to use the
Ωk = 0.0007 +/- 0.0019 values
to calculate the probability that Ωk>0, and to interpret that result as a reasonable probability that the universe is finite. A good bit of this thread was about my trying to understand why that was your view. Now, I interpret Post #48 as a correction to my misunderstanding of Post #4, i.e., that you did not mean that my calculation of the probability that the universe is finite is an unreasonable conclusion. Do I now have a correct interpretation of your view?

Regards,
Buzz
 
  • #50
Buzz Bloom said:
I try to avoid having a single word used to have more than a single meaning.
That's fine, but in order to have a useful discussion about a scientific topic, you need to use words with the single meaning that scientists who work on that topic use. You can't just pick your own meaning without regard to how the word is used by scientists working in the field.

Buzz Bloom said:
I have tried to make a distinction between (1) a "model" with five specific values, and (2) a kind of "model" where the variables have not been given specific values.
The distinction itself is perfectly valid. The issue is that you have been using the word "model" to mean #1, whereas scientists use the word "model" to mean #2, so that's what I have been using the word "model" to mean as well. If you want to be consistent with standard scientific terminology, you need to find a different word for #1, since "model" is used in standard scientific terminology to mean #2.

Buzz Bloom said:
In my readings I have not become familiar with the nuances of these two different meanings for "model".
There aren't two different meanings for "model" in standard terminology, which is why you have not seen anything in your readings about any such thing. There is only one. It is just not the one you have been using "model" to mean. See above.
 
  • Like
Likes Vanadium 50