Neurologist: What P-values should I be expecting?

In summary, the conversation discusses a data analyst's real-world example involving MRI data for 11 subjects: the concentration of a chemical compound in specific areas of the brain over time. The data are noisy but show a clear peak after injection and a slow decay afterward. The analyst separates the subjects into two groups, younger and older, and uses a two-sample t-test to compare the group averages in different regions of the brain. However, the p-values obtained are much lower than expected and can change drastically when a single subject is excluded from the analysis. The reason is that a p-value is not a measure of the size of the difference between the two groups, but rather a measure of how surprising the data would be if the true difference were zero; with enough data points, even a tiny difference becomes statistically significant.
  • #1
ElijahRockers
Gold Member
Inexperienced data analyst here with a real-world example,

I have attached a zip-file with screenshots and p-values of the following data. The "reference regions" are Cerebellum White, Cerebellum Gray, and Temporal Cortex. The top-most graphs depict the curves in the indicated region for young and old subjects. The bottom-most graph has two curves, one for the averaged old values, and one for the averaged young values.

Say I have MRI data for 11 different human subjects which allows me to see the concentration of some chemical compound in specific areas of the brain over time. I have a total of 180 time points for each subject. The data are noisy, but you can clearly see the peak immediately after injection, and the steady slow concentration decay for a time afterward.

I separate them into two groups, 5 younger and 6 older subjects.

Our hypothesis: We expect the older subjects' curves to decay more slowly than young subjects in some areas of the brain, but in the reference regions we would not expect much of a difference.

I use MATLAB to perform a two-sample t-test ('ttest2') on the average of the young subjects against the average of the old subjects, and get p-values for each of the regions of the brain I am interested in.
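
Concretely, the call looks something like this (the variable names are made up; each input is one group's averaged 180-point curve):

Code:
% youngAvg, oldAvg: 1x180 averaged concentration curves (hypothetical names)
[h, p] = ttest2(youngAvg, oldAvg);   % h = 1 means reject the null at alpha = 0.05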

What happens is that my p-values seem to somewhat reflect what I was expecting, i.e. the p-values for the reference regions are much higher than those of the other regions.

However, all of the p-values are very low to begin with (they are all statistically significant, P < 0.05), which seems strange, and excluding a single subject from the analysis can drastically change the p-values by several orders of magnitude.

Why are my p-values so low? In the regions where I would expect a significant difference, the p-values are on the order of 10^-17. This seems wayyy lower than I was expecting; the curves are not THAT different, right?

Is this because of the signal-to-noise ratio? Because I have a small number of subjects? Or because I have a large number of time points? Or some combination of these, or something else I may not have considered?

Any suggestions as to what I should do at this point?

[EDIT] Attachment deleted by Mentor.
 
  • #2
I deleted your zip attachment as we don't allow those types of attachments.

Instead could you place the data in a new post in code tags?

[.code=data.]

your data goes here

[./code.]

Remember to remove the dots from the code tags.
 
  • #3
ElijahRockers said:
Why are my p-values so low? In the regions where I would expect a significant difference, the p-values are on the order of 10^-17. This seems wayyy lower than I was expecting; the curves are not THAT different, right?

Is this because of the signal-to-noise ratio? Because I have a small number of subjects? Or because I have a large number of time points? Or some combination of these, or something else I may not have considered?
So this is a pretty common misconception. A p-value is not a measure of how large a difference is. It is just a measure of how probable the data is if the difference were zero. If you collect a large enough sample, even the most minuscule differences can become enormously significant. So even in the reference regions it is unlikely that the true difference is zero.
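
For example, here is a quick synthetic illustration (made-up numbers, not your data): the true difference below is only 0.05 standard deviations, yet the p-value collapses once the sample is large enough.

Code:
% A tiny true difference becomes hugely "significant" given enough data.
rng(0);                         % reproducible random numbers
for n = [20 200 20000]
    x = randn(n, 1);            % group 1: mean 0, sd 1
    y = randn(n, 1) + 0.05;     % group 2: mean 0.05, sd 1
    [~, p] = ttest2(x, y);
    fprintf('n = %5d   p = %.3g\n', n, p);
end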

With that said, your statistical test sounds like it might be suboptimal. The data almost surely are not normally distributed. What might be normally distributed would be the residuals to some curve fit. It seems that it would be more reasonable to fit the time courses to some appropriate signal model and then do your statistics on the fit parameters. Also, since you have different N for your two groups I would tend to use an ANOVA or a linear model rather than a t-test.
 
  • #4
DaleSpam said:
It is just a measure of how probable the data is if the difference were zero.

Could you maybe clarify what you mean by 'if the difference were zero'? (Or perhaps suggest some light reading on the subject?)

I have very little experience with statistics, and a tad bit more with probability: My basic understanding is that our two sets of data are the results of two different probability distribution functions (PDF), and the p-value is the probability that two sets of data come from PDFs with the same mean.

DaleSpam said:
It seems that it would be more reasonable to fit the time courses to some appropriate signal model and then do your statistics on the fit parameters.

It has been suggested that I could, perhaps, use some combination of the gamma variate and some other function to model my data and analyze the parameters. If I do this, am I basically finding the parameters of the gamma variate that best fit each curve, and then performing a statistical analysis on the parameters?

I think that would be some kind of non-linear fitting, no? Is this just something I would have to do computationally, using some form of trial and error to close in on the best fit? Or is there some more 'elegant' way to do this?
 
  • #5
ElijahRockers said:
the p-value is the probability that two sets of data come from PDFs with the same mean.

NO! The p-value is not this probability AT ALL. This is a very popular misconception. If you want to interpret something as this probability, you need Bayesian statistics.

ElijahRockers said:
It has been suggested that I could, perhaps, use some combination of the gamma variate and some other function to model my data and analyze the parameters. If I do this, am I basically finding the parameters of the gamma variate that best fit each curve, and then performing a statistical analysis on the parameters?

I think that would be some kind of non-linear fitting, no? Is this just something I would have to do computationally, using some form of trial and error to close in on the best fit? Or is there some more 'elegant' way to do this?

But are you interested in whether the two curves are the same? They're clearly not for most of the data you linked. In a lot of cases I see one curve always lying below the other curve with a significant distance. This is not what I would expect for two curves that are supposed to be the same.

But in your OP your question is not "are the two curves the same" but rather "is the rate at which the curves decrease the same". This is a very different question!
 
  • #6
Attached a condensed version of the original attachment.

micromass said:
The p-value is not this probability AT ALL.

I only gleaned that information from the MATLAB help file on the t-test. I am kind of scrambling to learn this stuff.

micromass said:
In a lot of cases I see one curve always lying below the other curve with a significant distance.

This is what we are expecting: one group should differ from the other, that is, the older group should have a somewhat higher concentration than the younger group at all or most time points. I can see the difference with my eyeballs in some regions, but it is not clear from the statistical test. This is why I am looking for help. (That, and I have no prior experience with statistics.)
 
  • #7
Condensed version of the data attached to this post, in .png format.
 

Attachments

  • combined_MRI.png
  • #8
ElijahRockers said:
Could you maybe clarify what you mean by 'if the difference were zero'? (Or perhaps suggest some light reading on the subject?)
In statistics this is called the "null hypothesis". Basically, you assume that the means are in fact identical. Then using that assumption you calculate how likely it would be to get the data you measured. That is what a p value is, the probability of the data, given the null hypothesis.
 
  • #9
DaleSpam said:
That is what a p value is, the probability of the data, given the null hypothesis.

That is also not what a p-value is. Think about it, if the null-hypothesis is "the data is normally distributed with mean 0 and standard deviation 1", then the probability of the data (any data) is 0.

What the p-value actually is, is the probability of rejecting the null-hypothesis when the null-hypothesis is actually true. In statistics, the null-hypothesis is often something we want to reject. The p-value is the probability that we make errors when doing this rejection. We want this to be as small as possible: smaller p-value means smaller probabilities of making a wrong decision.
 
  • #10
micromass said:
That is also not what a p-value is. Think about it, if the null-hypothesis is "the data is normally distributed with mean 0 and standard deviation 1", then the probability of the data (any data) is 0.
Sure, I was keeping things as simple as I could. Technically the p value is the probability that a random sample produces a test statistic that is at least as extreme as the test statistic that was actually observed given the null hypothesis. I didn't think that the additional wordiness was helpful to the OP here.
micromass said:
What the p-value actually is, is the probability of rejecting the null-hypothesis when the null-hypothesis is actually true.
While that is true, that doesn't help the OP understand why he is getting the results he has found. He has some data which "are not THAT different" and is surprised by how small the p values are. He knows that the p value should lead him to reject the null hypothesis, but is surprised by that because he has the common idea that we reject the null hypothesis when the difference is large.

Instead, we reject the null hypothesis when the observed difference is unlikely under it. We can make an arbitrarily small difference unlikely (and therefore statistically significant) simply by including an arbitrarily large sample. So a low p value does not indicate a large effect, as many people expect.
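
If it helps to see that definition in code, here is a minimal sketch with synthetic data (ttest2 assumes equal variances by default):

Code:
% The p-value is the probability, under the null t distribution, of a test
% statistic at least as extreme as the one actually observed.
rng(1);
x = randn(10, 1);                % two synthetic samples with equal true means
y = randn(12, 1);
nx = numel(x);  ny = numel(y);
sp = sqrt(((nx-1)*var(x) + (ny-1)*var(y)) / (nx + ny - 2));   % pooled sd
tstat = (mean(x) - mean(y)) / (sp * sqrt(1/nx + 1/ny));       % t statistic
pManual = 2 * (1 - tcdf(abs(tstat), nx + ny - 2))  % two-sided tail probability
[~, pMatlab] = ttest2(x, y)                        % same number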
 
  • #11
ElijahRockers said:
I think that would be some kind of non-linear fitting, no? Is this just something I would have to do computationally, using some form of trial and error to close in on the best fit? Or is there some more 'elegant' way to do this?
Since you are already using MATLAB, there are nonlinear fitting routines already available. You would want to start with those.

Since you are interested in the rate of decay, your chosen fit function should have a parameter which adjusts the "height" and a parameter which adjusts the decay rate. From your plots it seems likely that all of the heights will be significantly different, but maybe not the decay rates.
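
Here is a rough sketch of that workflow, with synthetic curves standing in for the real data (the model is a placeholder single exponential; a gamma variate would be handled the same way, just with more parameters):

Code:
% Fit each subject's time course, then compare the fitted decay rates.
model = @(b, t) b(1) .* exp(-b(2) .* t);   % b(1) = height, b(2) = decay rate
t = (1:180)';
rng(2);
youngData = 2.0 .* exp(-0.030 .* t) + 0.05 .* randn(180, 5);  % 5 young subjects
oldData   = 2.2 .* exp(-0.025 .* t) + 0.05 .* randn(180, 6);  % 6 old subjects

rateY = zeros(5, 1);
for j = 1:5
    b = nlinfit(t, youngData(:, j), model, [1; 0.01]);   % fit one subject
    rateY(j) = b(2);
end
rateO = zeros(6, 1);
for j = 1:6
    b = nlinfit(t, oldData(:, j), model, [1; 0.01]);
    rateO(j) = b(2);
end
[~, p] = ttest2(rateY, rateO)   % now a 5-vs-6 comparison of fitted rates

Note that the final test now compares 5 fitted rates against 6, rather than two 180-point curves.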
 
  • #12
micromass said:
What the p-value actually is, is the probability of rejecting the null-hypothesis when the null-hypothesis is actually true.

This gives me a little better perspective, thanks.

micromass said:
if the null-hypothesis is "the data is normally distributed with mean 0 and standard deviation 1", then the probability of the data (any data) is 0.

I'm having a hard time wrapping my head around that. The PDF for a normal distribution is very well known; you can plug in any value x and, given the mean and sd, return the probability of observing that value, right?

DaleSpam said:
there are nonlinear fitting routines already available.

Thanks, this is a great start, thank you both very much for your help.
 
  • #13
ElijahRockers said:
I'm having a hard time wrapping my head around that. The PDF for a normal distribution is very well known; you can plug in any value x and, given the mean and sd, return the probability of observing that value, right?
For a continuous random variable the height of the PDF is the probability density, not the probability. So to get the probability you have to integrate over some range. If you integrate over a single point the integral is zero.
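
You can see this directly in MATLAB (standard normal, purely illustrative):

Code:
normpdf(0)                      % density at x = 0: about 0.3989, not a probability
normcdf(0.1) - normcdf(-0.1)    % P(-0.1 <= X <= 0.1): about 0.0797
normcdf(0) - normcdf(0)         % P(X = 0 exactly): 0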
 
  • #14
DaleSpam said:
For a continuous random variable the height of the PDF is the probability density, not the probability. So to get the probability you have to integrate over some range. If you integrate over a single point the integral is zero.

To reinforce that thought, think of a thin rod that has a "mass density per unit length". At any given point, the rod has zero mass. The mass density at the point is used in calculations that estimate the mass of small intervals of rod that contain that point, but the number you estimate for the mass depends on the size of the small interval.

In practice, measurements of a continuous variable in an experiment are not actually points, because there is uncertainty due to the measuring apparatus. So if you attempt to calculate the probability of a given observation you need to account for the uncertainty of the measurement to give yourself a finite interval. The probability of realizing a particular value (specified with complete certainty) from a normal distribution is zero even though the probability density function at that value is not zero.

Statistical tests are procedures, not proofs. When applied to continuous random values, such procedures are phrased in terms of intervals or "acceptance regions". For example, "...if the mean is less than..." or "if the mean is between the values...". If they were phrased in terms of single values (e.g. "...if the mean is exactly 1.3794..."), the probabilities involved would be zero.
 
  • #15
Stephen Tashi said:
To reinforce that thought, think of a thin rod that has a "mass density per unit length". At any given point, the rod has zero mass. The mass density at the point is used in calculations that estimate the mass of small intervals of rod that contain that point, but the number you estimate for the mass depends on the size of the small interval.

Interesting; how does this contrast with discrete examples?

To see if I am understanding: If I consider the PMF of the sum of two dice, can't I say I have exactly a 1/36 probability of rolling snake eyes? Now suppose I have two 'continuous' dice that can take any value from 1 to 6; they would have 0 probability of rolling snake eyes (or any particular value, for that matter), because there are infinitely many other values the dice could 'roll' instead?

Edit: I think it is dawning on me why we use probability mass vs. probability density
 
  • #16
Yes, that is exactly correct.

If you have a 6-sided die then the probability of rolling a 1 is 1/6. If you have a 20-sided die then the probability of rolling a 1 is 1/20. If you have an infinite-sided die then the probability of rolling a 1 is 0.
 
  • #17
DaleSpam said:
Yes, that is exactly correct.

If you have a 6-sided die then the probability of rolling a 1 is 1/6. If you have a 20-sided die then the probability of rolling a 1 is 1/20. If you have an infinite-sided die then the probability of rolling a 1 is 0.
If all faces of the infinite-sided die, as the limiting case of a die with a large number of faces (I think this can be made more precise, maybe with techniques/results from B.B.I's "Metric Geometry" book), are equal*, I think the limiting shape is a sphere. So we would then be "rolling a sphere".

*A theoretical die, I guess; physically difficult, if not impossible, to construct for a large number of (equal) sides.
 
  • #18
ElijahRockers said:
I use MATLAB to perform a two-sample t-test ('ttest2') on the average of the young subjects against the average of the old subjects, and get p-values for each of the regions of the brain I am interested in.

What happens is that my p-values seem to somewhat reflect what I was expecting, i.e. the p-values for the reference regions are much higher than those of the other regions.

However, all of the p-values are very low to begin with (they are all statistically significant, P < 0.05), which seems strange, and excluding a single subject from the analysis can drastically change the p-values by several orders of magnitude.

If I understood correctly: For each subject you have a time series (180 points). You then average the time series across subjects within each group.

Could you please clarify: what exactly did you feed into the ttest2 function? The entire averaged time series, 180 points each?
 
  • #19
PMRDK said:
If I understood correctly: For each subject you have a time series (180 points). You then average the time series across subjects within each group.

Could you please clarify: what exactly did you feed into the ttest2 function? The entire averaged time series, 180 points each?

Yep, I averaged across subjects within the groups, resulting in two curves of 180 time points each. One for the old subjects, and one for the young subjects. I compared these two curves using ttest2.

We have since hired a statistician, and she has done some modeling, some statistical analysis that probably makes more sense... I am hoping that she will not mind teaching me the process so I can recreate it on my own.
 
  • #20
micromass said:
That is also not what a p-value is. Think about it, if the null-hypothesis is "the data is normally distributed with mean 0 and standard deviation 1", then the probability of the data (any data) is 0.

What the p-value actually is, is the probability of rejecting the null-hypothesis when the null-hypothesis is actually true. In statistics, the null-hypothesis is often something we want to reject. The p-value is the probability that we make errors when doing this rejection. We want this to be as small as possible: smaller p-value means smaller probabilities of making a wrong decision.

The p-value is the probability of observing an effect greater than or equal to the observed effect, under the null hypothesis. The p-value cannot be interpreted as the probability of making a type-1 error: How then would you interpret a p-value of >0.05? If our significance threshold is 0.05, then the probability that we have made a type-1 error is zero, since we haven't even rejected the null hypothesis. The p-value is almost entirely unrelated to the probability of falsely rejecting the null hypothesis, which would require Bayesian methods to calculate.
 

1. What is a P-value and why is it important in neurology research?

A P-value is a statistical measure that indicates the likelihood of obtaining a result at least as extreme as the one observed, assuming that the null hypothesis is true. In neurology research, P-values are important because they help determine whether a study's findings can be distinguished from chance. A low P-value (usually less than 0.05) is conventionally treated as statistically significant; note that it reflects how surprising the data would be under the null hypothesis, not the size of the effect.

2. What factors can influence the P-value in neurology research?

There are several factors that can influence the P-value in neurology research, including sample size, variability of the data, and the chosen statistical test. For a given true difference, a larger sample size tends to decrease the P-value, while a smaller sample size tends to increase it. Similarly, if the data have a lot of variability, the P-value may be higher, making it harder to establish statistical significance. The choice of statistical test can also impact the P-value, as different tests differ in their sensitivity to detect differences.

3. What is the ideal P-value in neurology research?

There is no specific value that can be considered as the ideal P-value in neurology research. However, a P-value of less than 0.05 is generally accepted as statistically significant in most studies. It is important to note that the interpretation of the P-value should be considered in the context of the research question and study design.

4. Can a P-value alone determine the validity of a neurology study?

No, a P-value alone cannot determine the validity of a neurology study. While a low P-value may suggest that the results are not due to chance, it does not necessarily mean that the study is valid. Other factors such as study design, methodology, and potential biases should also be considered when evaluating the validity of a study.

5. Are there other statistical measures that should be considered in addition to the P-value in neurology research?

Yes, there are other statistical measures that should be considered in addition to the P-value in neurology research. These include effect size, confidence intervals, and power analysis. These measures can provide additional information about the strength and magnitude of the observed results, and can help interpret the significance of the P-value.
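
For instance, here is a minimal sketch of reporting an effect size (Cohen's d with a pooled standard deviation) and a confidence interval alongside the P-value, using synthetic data:

Code:
% Report effect size and confidence interval, not just the p-value.
rng(3);
x = randn(20, 1);                % synthetic control group
y = randn(20, 1) + 0.3;          % synthetic comparison group, true shift 0.3
nx = numel(x);  ny = numel(y);
sp = sqrt(((nx-1)*var(x) + (ny-1)*var(y)) / (nx + ny - 2));   % pooled sd
d = (mean(y) - mean(x)) / sp     % Cohen's d: effect size in sd units
[~, p, ci] = ttest2(y, x)        % p-value and 95% CI for the mean difference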
