Does the statistical weight of data depend on the generating process?

Summary
The discussion centers on whether identical data sets, generated by different processes, carry the same statistical weight as evidence for or against a hypothesis. Two couples with the same child-gender outcomes but contrasting family-planning rules lead to different interpretations of what the data imply about gender bias. The analysis highlights that frequentist and Bayesian approaches yield different insights: frequentists focus on the likelihood of observing the data under a specific hypothesis, while Bayesians consider the data as fixed and the hypothesis as variable. The conversation emphasizes the importance of understanding the processes that generate data, as they can significantly influence the conclusions drawn about probabilities. Ultimately, the distinction in experimental design and assumptions is crucial for accurate statistical interpretation.
  • #31
Dale said:
It is a fundamentally different approach. In the frequentist approach the hypothesis (usually p=0.5) is taken to be certain and the data is considered to be a random variable from some sample space. That is the issue, the two sample spaces are different. For the Bayesian approach the data is considered certain and the hypothesis is a random variable.
Sorry for my stubbornness, but I have difficulty figuring out the difference.

Let's say I test a coin and the null hypothesis is ##p=0.5##. Is it true that in the frequentist model, if I flip the coin in many different tests with different setups, I only measure how reliable my data are under the assumption of an ideal coin, whereas in the Bayesian model I measure the bias of my coin under the assumption that my data will tell me?

Seems a bit linguistic to me.
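
For concreteness, here is a minimal sketch of the two calculations in Python (the flip counts are invented, and only `scipy` is assumed):

```python
from scipy import stats

n, k = 100, 60  # hypothetical data: 60 heads in 100 flips

# Frequentist: hold the hypothesis p = 0.5 fixed and ask how probable
# data like these are under it.
print(stats.binom.pmf(k, n, 0.5))       # P(60 heads | p = 0.5), about 0.011

# Bayesian: hold the data fixed and put a distribution over p.
# With a flat Beta(1, 1) prior, the posterior is Beta(k + 1, n - k + 1).
posterior = stats.beta(k + 1, n - k + 1)
print(posterior.sf(0.5))                # P(p > 0.5 | data), about 0.98
```

The two printed numbers answer different questions, which is the non-linguistic part of the difference.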
 
  • #32
Dale said:
your prior could conceivably be different

How might the prior for couple #2 be different from the prior for couple #1?
 
  • #33
fresh_42 said:
have difficulties to figure out the difference

See the question I asked @PeroK in the last part of post #30.
 
  • #34
PeterDonis said:
The only relevant difference between the two couples is the process they used. What would your answer be then?
Given a reasonable sample size, we shouldn't be able to tell a difference. However, I don't think such an ideal case can be realized. In the ideal case, boy to girl is ##p## to ##(1-p)## regardless of the measurement; in reality this is not the case, IMO.
 
  • #35
fresh_42 said:
Given a reasonable sample size, we shouldn't be able to tell a difference.

I already specified what the two samples are: the (identical) data from couples #1 and #2. So are you saying that, if the only difference between the couples is the process they used, the two data sets have the same statistical weight when estimating ##p##?

fresh_42 said:
I don't think such an ideal case can be realized.

I agree--no two couples are ever exactly the same except for just the process they used--but idealized cases are often useful for investigating questions even when they can't be realized.
 
  • #36
PeterDonis said:
Yes, but that's not the question I asked. The question I asked was whether ##\lambda = 0.5## is less likely given the second case vs. the first.
How does "the second data set is less likely given the hypothesis that ##\lambda = 0.5##" get transformed to
"the hypothesis that ##\lambda = 0.5## is less likely given the second data set"? That is not a valid deductive syllogism; in fact it's a common error people make (assuming that if A then B is equivalent to if B then A).

I'm working within standard hypothesis testing. In particular, there is a single, unknown value ##\lambda##. It's not a random variable.

We can test ##\lambda = 0.5## (or any other value) against a random data set ##X## and compute ##p(X|\lambda)## for that data set.

The data in case #2 is less likely, given the hypothesis ##\lambda = 0.5##.

Eventually, with enough data, we would have to abandon the hypothesis ##\lambda = 0.5##. That is a thornier issue. In reality, it is more about an accumulation of data than one test.

Here the data in case #2 gives us less confidence in our hypothesis. That is the sense in which ##\lambda = 0.5## is "less likely".
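
A quick numerical check of this, on one reading of the thread's example (six boys followed by one girl; the exact counts are an assumption here):

```python
from scipy import stats

lam = 0.5  # hypothesis under test: P(boy) = 0.5

# Case 1: family size fixed at 7, so the number of girls is Binomial(7, 1 - lam).
p_case1 = stats.binom.pmf(1, 7, 1 - lam)  # P(exactly 1 girl in 7), about 0.055

# Case 2: stop at the first girl, so P(6 boys then a girl) = lam^6 * (1 - lam).
p_case2 = lam**6 * (1 - lam)              # about 0.0078

print(p_case1, p_case2)
```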
 
  • #37
PeroK said:
Here the data in case #2 gives us less confidence in our hypothesis.

Why? As I've already said, there is no valid deductive reasoning that gets you from "the second data set is less likely given the hypothesis that ##\lambda = 0.5##" to "the hypothesis that ##\lambda = 0.5## is less likely given the second data set". So since you can't be using valid deductive reasoning, what reasoning are you using?

PeroK said:
I'm working to standard hypothesis testing.

I'm not sure that standard hypothesis testing (aka frequentist statistics) has a good answer to the question I just posed above. But if there is one, I would like to know it.
 
  • #38
PeterDonis said:
See the question I asked @PeroK in the last part of post #30.
If this is the difference between the two, then the Bayesian model doesn't make much sense to me for real-life situations. You cannot set up different experiments such that the outcome only depends on the random variable.
 
  • #39
fresh_42 said:
You cannot set up different experiments such that the outcome only depends on the random variable.

I don't see how this is relevant. The two cases don't differ in their outcomes; the outcomes are the same. They only differ in the process used to generate the outcomes, and that process, itself, does not depend on the variable (p, or ##\lambda## in @Dale's notation) whose value we are trying to estimate.
 
  • #40
PeterDonis said:
I already specified what the two samples are: the (identical) data from couples #1 and #2. So are you saying that, if the only difference between the couples is the process they used, the two data sets have the same statistical weight when estimating ##p##?
I don't see how we can estimate anything from two tests. By sample size I meant enough tests of either setup. If we measure an effect a million times at CERN and a thousand times at Fermi, and have the same results, why should there be a different significance? The million tops the thousand, but given the identical outcome, I don't see a different weight.
 
  • #41
fresh_42 said:
given the identical outcome, I don't see a different weight.

Ok.
 
  • #42
PeterDonis said:
I don't see how this is relevant.
I think there is a major difference between theory and real life. Given the same outcome, we cannot decide which experiment is closer to the real distribution. The quality of the processes cannot be distinguished. I just say that there are always unknowns which don't find their way into the calculation, such as the father's age in the first example.
 
  • #43
fresh_42 said:
Given the same outcome, we cannot decide which experiment is closer to the real distribution.

Again, I'm confused by this, because the two different "experiments" (the different processes the couples are using) have nothing to do with the distribution. They have nothing to do with what the value of ##\lambda## is. So asking "which experiment is closer to the real distribution" seems like nonsense to me.
 
  • #44
PeterDonis said:
I'm not sure that standard hypothesis testing (aka frequentist statistics) has a good answer to the question I just posed above. But if there is one, I would like to know it.

I wouldn't discount it quite so readily. Let's follow your line of logic through. Suppose you did a large survey of births in the USA in the last year. You want to measure the probability that a boy is born, as opposed to a girl. Call this ##\lambda##. What you cannot do is give a probability distribution for ##\lambda##. Something like:

##p(\lambda = 0.47) = 0.05##
##p(\lambda = 0.48) = 0.10##
##p(\lambda = 0.49) = 0.20##
##p(\lambda = 0.50) = 0.30##
##p(\lambda = 0.51) = 0.20##
##p(\lambda = 0.52) = 0.10##
##p(\lambda = 0.53) = 0.05##

That is not valid because ##\lambda## was not a random variable in the data you analysed.

Instead, you can say something like:

##\lambda## is in the range ##0.47 - 0.52## with ##99\%## confidence.
##\lambda## is in the range ##0.48 - 0.51## with ##90\%## confidence.
##\lambda## is in the range ##0.49 - 0.50## with ##80\%## confidence.

That's the difference between "confidence" and "probabilities". Parameters associated with a distribution have confidence levels, not probabilities. The random data has probabilities.
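
As a rough sketch of where such confidence statements come from (survey numbers invented, normal approximation assumed):

```python
from scipy import stats

k, n = 51_000, 100_000  # invented survey: 51,000 boys in 100,000 births
lam_hat = k / n         # point estimate of lambda

# Wald (normal-approximation) confidence intervals at several levels.
for conf in (0.99, 0.90, 0.80):
    z = stats.norm.ppf((1 + conf) / 2)
    half = z * (lam_hat * (1 - lam_hat) / n) ** 0.5
    print(f"{conf:.0%}: lambda in [{lam_hat - half:.4f}, {lam_hat + half:.4f}]")
```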
 
  • #45
With a single sample in either trial the ex post odds are the same: one success in seven trials. Continuing with the coin-flipping analogy, if you had additional samples, the distributions would differ: one sample set would be the number of heads in seven coin flips and the other the number of flips before the first head appeared.

The boy/girl example is confusing because it's not clear whether the problem (a) assumes an equal p=boy between the two couples, which biologically would not be true, (b) is attempting to measure p=boy for each couple separately, which, while biologically realistic, precludes any additional information from further samples, or (c) uses the two couples to estimate p=boy for the overall population, in which case one can simply disregard the two couples as outliers.
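
A simulation of the first point, using only the standard library (parameters illustrative):

```python
import random

random.seed(0)
p = 0.5
N = 10_000

# Scheme A: always flip 7 times; record the number of heads (Binomial).
heads = [sum(random.random() < p for _ in range(7)) for _ in range(N)]

# Scheme B: flip until the first head; record how many flips it took (Geometric).
def flips_to_first_head() -> int:
    flips = 1
    while random.random() >= p:  # tails: keep flipping
        flips += 1
    return flips

waits = [flips_to_first_head() for _ in range(N)]

print(sum(heads) / N)  # about 3.5, the Binomial(7, 0.5) mean
print(sum(waits) / N)  # about 2.0, the Geometric(0.5) mean
```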
 
  • #46
PeroK said:
What you cannot do is give a probability distribution for ##\lambda##. Something like:

##p(\lambda = 0.47) = 0.05##
##p(\lambda = 0.48) = 0.10##
##p(\lambda = 0.49) = 0.20##
##p(\lambda = 0.50) = 0.30##
##p(\lambda = 0.51) = 0.20##
##p(\lambda = 0.52) = 0.10##
##p(\lambda = 0.53) = 0.05##

That is not valid because ##\lambda## was not a random variable in the data you analysed.
That is exactly what Bayesian statistics do. They do treat ##\lambda## as a random variable and determine its probability distribution.
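
For instance, a grid version of exactly that kind of table follows from Bayes' theorem; the flat prior and the data (six boys, one girl) are assumptions here:

```python
from scipy import stats

lams = [0.47, 0.48, 0.49, 0.50, 0.51, 0.52, 0.53]
prior = [1 / len(lams)] * len(lams)                  # flat prior over the grid
like = [stats.binom.pmf(6, 7, lam) for lam in lams]  # P(6 boys in 7 | lambda)
unnorm = [pr * li for pr, li in zip(prior, like)]
post = [u / sum(unnorm) for u in unnorm]             # posterior p(lambda | data)

for lam, p in zip(lams, post):
    print(f"p(lambda = {lam}) = {p:.3f}")
```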
 
  • #47
BWV said:
t’s not clear whether the problem assumes an equal p=boy between the two couples

In my discussion with @fresh_42 I clarified that I intended to include this assumption, yes. I agree, as I said in that discussion, that the assumption is an idealization.

We could go into how one would analyze the data if that assumption were dropped, but that's a further complication that I don't really want to get into in this thread.
 
  • #48
PeterDonis said:
Again, I'm confused by this, because the two different "experiments" (the different processes the couples are using) have nothing to do with the distribution. They have nothing to do with what the value of ##\lambda## is. So asking "which experiment is closer to the real distribution" seems like nonsense to me.
I believe that each real-life test has different random variables and different conditional probabilities, and thus different distributions. The assumption that they are the same is already a hypothesis, one I would work with as long as the outcomes remain stable. This adds up to the confidence in the hypothesis. If by statistical weight you mean confidence, then the number of tests and the setup do play a role.
 
  • #49
Dale said:
That is exactly what Bayesian statistics do. They do treat ##\lambda## as a random variable and determine its probability distribution.

This might be a matter of differing terminology. In Jaynes' Probability Theory, for example, he describes processes like estimating a distribution for ##\lambda## as "parameter estimation". (He doesn't appear to like the term "random variable" much at all, and discusses some of the confusions that using it can cause.)
 
  • #50
Dale said:
That is exactly what Bayesian statistics do. They do treat ##\lambda## as a random variable and determine its probability distribution.

What does a Bayesian analysis give numerically for the data in post #1?
 
  • #51
PeterDonis said:
How might the prior for couple #2 be different from the prior for couple #1?
If you had previous studies that showed, for example, that couples who decided on a fixed number of children in advance had a different ##\lambda## than other couples.
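
Sketching that with conjugate Beta priors (all numbers, including the data, are invented):

```python
from scipy import stats

boys, n = 6, 7  # invented data, identical for both couples

# Couple 1: flat Beta(1, 1) prior on lambda.
post1 = stats.beta(1 + boys, 1 + (n - boys))

# Couple 2: hypothetical informed prior Beta(52, 48), as if earlier studies
# of fixed-family-size couples had centered lambda slightly above 0.5.
post2 = stats.beta(52 + boys, 48 + (n - boys))

print(post1.mean(), post2.mean())  # ~0.78 vs ~0.54: same data, different posteriors
```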
 
  • #52
PeroK said:
I wouldn't discount it quite so readily. Let's follow your line of logic through. Suppose you did a large survey of births in the USA in the last year. You want to measure the probability that a boy is born, as opposed to a girl. Call this ##\lambda##. What you cannot do is give a probability distribution for ##\lambda##...
This appears to be falling victim to the Inspection Paradox. Whether you sample based on children or parents matters. The original post discussed sampling by parents (I think) and you are now sampling by children.

- - - -
I wish Peter would restate the question in a clean probabilistic manner. Being a frequentist or Bayesian has little to do with the essence of the problem. The original post is really about stopping rules, something pioneered by Wald (who, yes, did some Bayesian stats too). And yes, subsequent to Wald, stopping rules were extended in a big way by Doob via martingales.
 
  • #53
I vaguely remember similar discussions at my institute. I like Hendrik's approach in QFT: sit down and calculate. Interpretations are another game.
 
  • #54
StoneTemplePython said:
This appears to be falling victim to the Inspection Paradox. Whether you sample based on children or parents matters. The original post discussed sampling by parents (I think) and you are now sampling by children.

Are you talking about the case where some parents have a genetic disposition to one sex for their children?

I was assuming the idealised case where we have a single probability in all cases.
 
  • #55
fresh_42 said:
Seems a bit linguistic to me.
In general, the difference between ##p(X|\lambda)## and ##p(\lambda|X)## is not merely linguistic. They are different numbers. In addition, there is a difference in the space over which the probabilities are measured: one is a measure over the space of all possible experimental outcomes ##X##, and the other is a measure over the space of all possible boy-birth probabilities ##\lambda##.
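
One way to see the "different spaces" point numerically (flat prior and six-boys/one-girl data assumed):

```python
from scipy import stats

lam, n = 0.5, 7

# p(X | lambda) lives on the data space: summing the binomial pmf over
# every possible outcome X = 0..7 boys gives 1.
print(sum(stats.binom.pmf(k, n, lam) for k in range(n + 1)))  # 1.0

# p(lambda | X) lives on the hypothesis space: the Beta posterior
# integrates to 1 over lambda in [0, 1].
posterior = stats.beta(1 + 6, 1 + 1)
print(posterior.cdf(1.0) - posterior.cdf(0.0))                # 1.0
```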
 
  • #56
StoneTemplePython said:
This appears to be falling victim to the Inspection Paradox. Whether you sample based on children or parents matters. The original post discussed sampling by parents (I think) and you are now sampling by children.

PS: In any case, I was only describing the difference between probability and confidence, not trying to analyse the initial problem. See post #6.
 
  • #58
PeroK said:
Are you talking about the case where some parents have a genetic disposition to one sex for their children?

I was assuming the idealised case where we have a single probability in all cases.

My read on the original post was a question with two 'types' (or iid representatives for classes) of families: one having n kids (stopping rule: n, so the random variable equals n with probability one for our purposes), and the other having a geometrically distributed number of kids (stopping rule: stop when a girl is born).

The underlying idea of how you sample is closely related to what Dale is saying, but the way people get tripped up happens so often that it goes under the name of the "Inspection Paradox" (originally a renewal-theory idea, but pretty general). We need to be very careful about whether we are doing our estimates by sampling the kids or sampling the parents/couples.
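
A small simulation of that trap under the stop-at-first-girl rule, with a true ##P(\text{boy}) = 0.5## assumed:

```python
import random

random.seed(1)
p_boy = 0.5
families = []
for _ in range(100_000):
    boys = 0
    while random.random() < p_boy:      # a boy is born; keep going until a girl
        boys += 1
    families.append((boys, boys + 1))   # (boys, total children incl. the girl)

# Sampling kids: pool all children, then take the boy fraction. Consistent.
print(sum(b for b, _ in families) / sum(t for _, t in families))  # ~0.50

# Sampling parents: average each family's own boy fraction. Biased:
# E[B / (B + 1)] = 1 - ln 2 ~ 0.307 when B is Geometric(1/2).
print(sum(b / t for b, t in families) / len(families))            # ~0.31
```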
 
  • #59
StoneTemplePython said:
My read on the original post was a question with two 'types' of families: one having n kids (stopping rule: n, so the random variable equals n with probability one for our purposes), and the other having a geometrically distributed number of kids (stopping rule: stop when a girl is born).

The underlying idea of how you sample is closely related to what Dale is saying, but the way people get tripped up happens so often that it goes under the name of the "Inspection Paradox" (originally a renewal-theory idea, but pretty general). We need to be very careful about whether we are doing our estimates by sampling the kids or sampling the parents/couples.

What's your opinion on post #6? I know you're the real expert on this!
 
  • #60
PeroK said:
What's your opinion on post #6? I know you're the real expert on this!
I worry that you think I was criticizing your calculation in #6. I am not. It seems to me like a valid calculation; it is just a calculation of a different probability than what you would calculate with Bayesian methods. Nothing wrong with that, just different.
 
