High School Does a correlation coefficient represent probability?

R-squared values do not represent probability or chance; in a simple linear regression, r-squared is the square of the correlation coefficient and measures the fraction of the variance in one variable that is explained by the other. A high r-squared suggests a strong linear association between variables, but it is not a direct measure of probability. How likely an association at least this strong would be under pure chance is assessed with the p-value, not the r-squared value. Misinterpretation can occur because r-squared can be inflated by overfitting with too many variables. Understanding the distinction between correlation and probability is essential for accurate statistical analysis.
Hallucinogen
Sorry if this is trivial - I'm arguing with someone on Facebook who is claiming that r-squared values, or correlation coefficients, represent "chance" or "probability". I've never heard of this.
I just need a simple yes or no answer along with a short explanation why it is or isn't probability.
He says:
"It is the probability that there is a connection between two different things.

A Correlation Probability of 1 means that these two things ALWAYS occur together, even if there is absolutely no causal links between them.

A Correlation Probability of ½ (or .5, or 50%) means that the two things are connected 50% of the time. Or, roughly what we would see in "Chance" for most things, depending upon what two things we are looking at."

I say:
"I think you have something wrong here. I don't know what a correlation probability is and I'm struggling to find it online. The ".55" value and other correlations in the literature are correlation coefficients, they are r-squared values calculated using a least squares regression analysis. R-squared estimates the fraction of the variance in IQ scores that is explained by % genetic relation - it has nothing to do with chance.

Chance of randomness is given by the p value, calculated by a t test on the same data. The chance of randomness depends on the data sets you're attempting to correlate and the variances of the data points. I still don't actually understand where you've gotten 50% from."

He says:
"An R-Squared IS a PROBABILITY (Describing the Chance/Probability that two variables are correlated)."

Is r-squared the same thing as a probability? I haven't learned it that way; I've learned it *only* as how much of the variance in one variable is explained by another.
 
An ##R^2## is not a probability.
 
##R^2## is a measurement of what fraction of the variation of one variable might be explained by the other variable. Although it is not directly a probability, it is a statistic that has a distribution and associated probabilities. A large ##R^2## from enough data implies that the apparent association between the variables would have taken a lot of luck if they are not, in fact, related.
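
To make the "from enough data" caveat concrete, here is a minimal Python sketch, not anyone's official method. It computes r-squared for two made-up data sets and estimates a p-value with a simple permutation test, which is swapped in here for the usual t test so that only the standard library is needed. The function names, the sample data, and the seeds are all assumptions chosen for illustration.

```python
import random

def r_squared(x, y):
    """Squared Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)

def permutation_p_value(x, y, n_shuffles=1000, seed=0):
    """Estimate how often a random re-pairing of y with x gives an
    r^2 at least as large as the observed one (Monte Carlo estimate)."""
    rng = random.Random(seed)
    observed = r_squared(x, y)
    shuffled = list(y)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        # Small tolerance so ties with the observed value are counted.
        if r_squared(x, shuffled) >= observed - 1e-12:
            hits += 1
    return hits / n_shuffles

# Hypothetical small sample: only four points, r^2 = 0.64.
small_x = [1, 2, 3, 4]
small_y = [1, 3, 2, 4]

# Hypothetical large sample: 400 points, weak but real linear relation.
data_rng = random.Random(1)
large_x = [data_rng.gauss(0, 1) for _ in range(400)]
large_y = [0.5 * a + data_rng.gauss(0, 1) for a in large_x]

small_r2 = r_squared(small_x, small_y)
small_p = permutation_p_value(small_x, small_y)
large_r2 = r_squared(large_x, large_y)
large_p = permutation_p_value(large_x, large_y)

print(f"small sample: r^2 = {small_r2:.2f}, permutation p = {small_p:.3f}")
print(f"large sample: r^2 = {large_r2:.2f}, permutation p = {large_p:.3f}")
```

In this sketch the four-point sample has the larger r-squared, yet roughly one in three random re-pairings of its values does at least as well, while the larger sample has a smaller r-squared that essentially no shuffle matches. The two numbers answer different questions, so neither one can be read off the other.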
 
FactChecker said:
A large ##R^2## from enough data implies that the apparent association between the variables would have taken a lot of luck if they are not, in fact, related.

I would be very very careful before taking this interpretation. It is always possible to inflate your ##R^2## by taking enough variables. So your large ##R^2## might be due to overfitting, and not due to an actual relation.
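
The inflation effect can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the response and every predictor are pure random noise, so there is no real relation to find, and the helper `r_squared_ols` is a hypothetical name for a least-squares fit with an intercept, done here by Gram-Schmidt orthogonalization rather than by a statistics library.

```python
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def r_squared_ols(columns, y):
    """R^2 of a least-squares fit of y on the given predictor columns
    plus an intercept, computed by sequential orthogonal projection."""
    n = len(y)
    mean_y = sum(y) / n
    resid = [v - mean_y for v in y]          # residual after the intercept
    tss = dot(resid, resid)                  # total sum of squares
    basis = []
    for col in columns:
        mean_c = sum(col) / n
        q = [v - mean_c for v in col]        # remove the intercept direction
        for b in basis:                      # remove earlier predictors
            coef = dot(q, b) / dot(b, b)
            q = [qi - coef * bi for qi, bi in zip(q, b)]
        norm2 = dot(q, q)
        if norm2 < 1e-12:                    # skip (near-)collinear columns
            continue
        basis.append(q)
        coef = dot(resid, q) / norm2
        resid = [ri - coef * qi for ri, qi in zip(resid, q)]
    return 1 - dot(resid, resid) / tss

rng = random.Random(0)
n = 20
# Assumption for illustration: y and all predictors are unrelated noise.
y = [rng.gauss(0, 1) for _ in range(n)]
noise_columns = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(n - 1)]

results = {k: r_squared_ols(noise_columns[:k], y) for k in (1, 5, 10, 19)}
for k, r2 in results.items():
    print(f"{k:2d} noise predictors: R^2 = {r2:.3f}")
```

Because each model contains the previous one, ##R^2## can only stay the same or rise as predictors are added, and with 19 predictors plus an intercept fitted to 20 points it reaches 1 even though nothing is related to anything.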
 
micromass said:
I would be very very careful before taking this interpretation. It is always possible to inflate your ##R^2## by taking enough variables. So your large ##R^2## might be due to overfitting, and not due to an actual relation.
I agree about the danger, especially if there are a lot of variables and not a lot of data. With all applied math, you have to be careful. But it is the only interpretation to take, and it is fundamental to a lot of statistics. Algorithms will often help to select a small subset of variables that gives a statistically significant ##R^2##.
 