Good Examples Where Causation Does Not Imply Correlation

  • Thread starter: WWGD
  • Tags: Correlation
AI Thread Summary
The discussion explores the concept that causation does not imply correlation, particularly when the relationship is non-linear. Examples like Hooke's law and the quadratic relationship between voltage and power are cited to illustrate scenarios where causation exists but correlation is zero. Participants emphasize the importance of distinguishing between general correlation and linear correlation, noting that certain relationships, such as temperature variations throughout the year, can also exhibit zero correlation despite underlying causative factors. The conversation also touches on the significance of mutual information as a measure of statistical association, which can highlight relationships even when traditional correlation metrics fail. Overall, the thread seeks to clarify the nuances of causation and correlation in statistical contexts.
WWGD
Science Advisor
Homework Helper
Ok, so if the causality relation between A,B is not linear, then it will go unnoticed by correlation, i.e., we may have A causing B but Corr(A, B)=0. I am trying to find good examples to illustrate this but not coming up with much. I can think of the elastic energy in a spring obeying Hooke's law, where data pairs (x, kx^2) would have zero correlation. Is this an "effective" way of illustrating the point that causation does not imply (nonzero) correlation? Any other examples?
 
If you apply a voltage across a resistor it causes power to dissipate in the resistor. The power is quadratic in the voltage so the linear correlation coefficient is zero.
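A minimal numerical check of this (a Python/NumPy sketch; the resistance value and voltage sweep are just illustrative assumptions):

```python
import numpy as np

R = 100.0                        # arbitrary resistance in ohms (illustrative)
V = np.linspace(-10, 10, 201)    # voltages sampled symmetrically about zero
P = V**2 / R                     # dissipated power, quadratic in the voltage

# Pearson (linear) correlation between cause (V) and effect (P)
print(np.corrcoef(V, P)[0, 1])   # ~0, up to floating-point noise
```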
 
  • Like
  • Love
Likes Abhishek11235, etotheipi and WWGD
Why would (x,x^2) not have a high correlation for positive x?
 
  • Like
Likes Delta2
Something like (x,xsin(x)) would have little correlation
 
BWV said:
Why would (x,x^2) not have a high correlation for positive x?
I haven't double-checked the actual values of the correlation (difficult to do on the phone), but because points on a parabola do not closely resemble/fit points on a line.
 
BWV said:
Something like (x,xsin(x)) would have little correlation
Thanks. Can you find a causal relation described by such pairs?
 
Correlation(x,x^2)~0.97 for x=1:100
 
BWV said:
Why would (x,x^2) not have a high correlation for positive x?
I wasn’t limiting it to positive x. The correlation is 0 for a balanced positive and negative sample.
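A quick sketch of both cases (Python/NumPy; the ranges are taken from the posts above):

```python
import numpy as np

x_pos = np.arange(1, 101)    # x = 1..100, positive only
x_sym = np.arange(-50, 51)   # balanced positive and negative sample

print(np.corrcoef(x_pos, x_pos**2)[0, 1])  # ~0.97: the parabola looks nearly linear here
print(np.corrcoef(x_sym, x_sym**2)[0, 1])  # ~0: the symmetric sample kills the linear correlation
```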
 
I guess the fact that it's quadratic isn't interesting here, (x,|x|) would have similarly small correlation. Basically anytime you have a signed input, and an unsigned output whose magnitude depends on the magnitude of the input.

The correlation of the charge on an ion and the angle of curvature when it passes through a magnetic field? Actually constructing these examples is annoying.

What about something like the correlation between day of the year and temperature. Days 1 and 365 are both cold (at least in the northern hemisphere), the middle days are warm, so correlation is zero.
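A sketch of the day-of-year example (Python/NumPy; the sinusoidal temperature model is only an illustrative assumption, not real data):

```python
import numpy as np

day = np.arange(1, 366)
# Toy model: coldest near days 1 and 365, warmest mid-year (illustrative only)
temp = 10.0 - 15.0 * np.cos(2 * np.pi * day / 365)

print(np.corrcoef(day, temp)[0, 1])  # ~0, despite the obvious seasonal cause
```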
 
  • #10
Office_Shredder said:
I guess the fact that it's quadratic isn't interesting here, (x,|x|) would have similarly small correlation. Basically anytime you have a signed input, and an unsigned output whose magnitude depends on the magnitude of the input.

The correlation of the charge on an ion and the angle of curvature when it passes through a magnetic field? Actually constructing these examples is annoying.

What about something like the correlation between day of the year and temperature. Days 1 and 365 are both cold (at least in the northern hemisphere), the middle days are warm, so correlation is zero.
Thanks, but it is not just any dataset I'm after; as you said, it is relatively straightforward to construct one. I am looking for one describing a causal relation.
 
  • #11
Office_Shredder said:
I guess the fact that it's quadratic isn't interesting here, (x,|x|) would have similarly small correlation. Basically anytime you have a signed input, and an unsigned output whose magnitude depends on the magnitude of the input.

The correlation of the charge on an ion and the angle of curvature when it passes through a magnetic field? Actually constructing these examples is annoying.

What about something like the correlation between day of the year and temperature. Days 1 and 365 are both cold (at least in the northern hemisphere), the middle days are warm, so correlation is zero.
Oops! I realized I forgot to shift ##y=kx^2##; the zero correlation relies on the x-values being sampled symmetrically about the vertex. Consider, e.g., ##y=k(x-1)^2## with x sampled symmetrically about 1. That should do it.
 
  • #12
Using the word correlation to imply linear correlation is a little uncomfortable to me when used in the phrase "Causation does not Imply Correlation". I always interpret "correlation" as general correlation in the converse statement ("correlation does not imply causation").
 
Last edited:
  • Like
Likes FactChecker
  • #13
I think the examples given here all have zero general correlation.
 
  • #14
Ultimately, if measured properly, causation should result in linear correlation; some adjustment of variables will produce linear correlation in the examples above. In the quadratic example centered at the origin, for instance, a simple look at the data reveals the relationship, and all one has to do is take the absolute value of the input.
 
  • #15
Office_Shredder said:
I think the examples given here all have zero general correlation.
I think zero correlation means knowing the value of one would give you absolutely no information that is useful to predict the value of the other.
 
  • #16
For context, I may be teaching a small online class that includes this general area and was looking for examples that are "natural". I am thinking too of including Anscombe's quartet somehow. More interesting to me, but beyond the scope, is having different RVs with the same distribution: like the RVs counting heads or tails in a binomial with p=0.5.
 
  • #17
The other situation is a missing variable, where A impacts B, but does not show up statistically because the impact of C is not accounted for
 
  • #18
BWV said:
The other situation is a missing variable, where A impacts B, but does not show up statistically because the impact of C is not accounted for
I'm not sure, but encryption might be a good example.
 
  • #19
BWV said:
The other situation is a missing variable, where A impacts B, but does not show up statistically because the impact of C is not accounted for
You mean lurking variables?
 
  • #20
Jarvis323 said:
Using the word correlation to imply linear correlation is a little uncomfortable to me when used in the phrase, "Causation does not Imply Correlation". I always interpret "correlation" as general correlation in the converse.
Since this thread is in the statistics section I assumed that standard statistical correlation was implied, but you do make a good point. That isn’t the only meaning to the term.
 
  • Like
Likes Klystron
  • #21
Dale said:
Since this thread is in the statistics section I assumed that standard statistical correlation was implied, but you do make a good point. That isn’t the only meaning to the term.
I assume everything outside of General Discussion is to be interpreted technically. "Big Picture" questions, no less important/interesting than the latter, I assume belong in GD. Edit: Unless explicitly stated otherwise. The linked content below makes me think this is the way PF is organized.
 
  • #23
"Causation implies you can make a prediction about the value" is basically a tautology, and doesn't really help much. How do I figure out if there exists an arbitrarily shaped function which results in at least ##\epsilon## predictive power?

One method of testing this is by measuring the correlation. If it exists, then a predictive function exists (even if the relationship is not causal). The fact that correlation can be zero and you can still have perfect predictive power is an interesting result in my opinion.
 
  • #24
While we're talking about correlation: does anyone know if we can consider Spearman's rho for more than 2 datasets? Edit: I know we can use the Kruskal–Wallis one-way ANOVA for something similar, but I'm just curious about Spearman's rho.
 
  • #25
The Spearman coefficient is the Pearson coefficient of the ranks (pairs of integers describing the relative values) of the data sets, so if there's a Pearson coefficient there is a Spearman one.

I don't know of anything specific, but insofar as the Pearson coefficient measures how well a line fits the data, you can certainly measure e.g. the variance explained by the first PCA factor and take the square root of that. I think that won't give Pearson's coefficient, since the line returned by PCA on two variables is not the best-fit line, but I might be wrong about that.
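A quick sketch of that equivalence (Python/SciPy; the cubic toy data are just for illustration):

```python
import numpy as np
from scipy.stats import pearsonr, rankdata, spearmanr

rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = x**3 + rng.normal(scale=0.1, size=200)    # monotone but nonlinear relationship

rho_direct = spearmanr(x, y)[0]                        # Spearman's rho
rho_via_ranks = pearsonr(rankdata(x), rankdata(y))[0]  # Pearson applied to the ranks
print(rho_direct, rho_via_ranks)                       # agree up to floating point
```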
 
  • #26
I think mutual information might be one of the purest and most relevant measures of correlation? I guess some measures of correlation are used so frequently (e.g. in linear statistics), that it's become common to use correlation as short for whichever measure a group of people are used to working with.
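For instance, here is a rough histogram-based sketch (Python/NumPy; the bin count and the quadratic toy data are arbitrary choices) showing how mutual information picks up a dependence that the Pearson coefficient misses:

```python
import numpy as np

def mutual_info(x, y, bins=20):
    """Crude histogram estimate of the mutual information I(X;Y), in nats."""
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = pxy / pxy.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of X
    py = pxy.sum(axis=0, keepdims=True)   # marginal of Y
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

x = np.linspace(-1, 1, 10001)
y = x**2                                  # causal, but zero linear correlation
print(np.corrcoef(x, y)[0, 1])            # ~0
print(mutual_info(x, y))                  # clearly > 0: strong statistical dependence
```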
 
  • #27
Office_Shredder said:
One method of testing this is by measuring the correlation. If it exists, then a predictive function exists (even if the relationship is not causal).
One issue is that correlation is a statistical concept. If you have a stationary process with a finite set of states, then you can measure correlation, e.g. with mutual information. If you don't, then you can still have causality, but might not be able to use statistics at all.
Office_Shredder said:
The fact that correlation can be zero and you can still have perfect predictive power is an interesting result in my opinion.
I am skeptical about this.
 
  • #28
Jarvis323 said:
I think mutual information might be one of the purest and most relevant measures of correlation?
I think that is a pretty large abuse of terminology. Do you have any scientific reference that supports that claim?
 
  • #29
Dale said:
I think that is a pretty large abuse of terminology. Do you have any scientific reference that supports that claim?
To the contrary, using the term "correlation" as short for a specific type of linear statistical relationship is an abuse of terminology, although a convenient one if you are primarily using linear statistics. Correlation technically means any statistical relationship. Mutual information is a good measure here because it is one of the purest measures of statistical association. If there is any statistical relationship, then there will be mutual information.

In the context of the saying, it's also a good measure, because a statistical association doesn't imply causality, no matter if you're talking about correlation in the purest sense, or linear correlation. If you want to discuss the converse (does causality imply correlation?), I think it would be misleading and less interesting to use a narrow/restricted measure of correlation. Then again, due to the confusion with the word "correlation" becoming used so imprecisely in certain fields, it might be better just to ask if causality implies statistical association. Then it comes down to whether the process is stationary, or is the setting restricted properly so that the application of statistics is meaningful and core statistical assumptions can be made.

Likewise, in the context of all of the recent questions about causality and correlation, one should assume the broad definition of correlation (any statistical relationship) otherwise the questions are trivial, somewhat arbitrarily restrictive, and uninteresting.

Here is an early paper that you might find helpful.

http://www.economics.soton.ac.uk/staff/aldrich/spurious.PDF
 
Last edited:
  • #30
Jarvis323 said:
To the contrary, using the term "correlation" as short for a specific type of linear statistical relationship is an abuse of terminology, although a convenient one if you are primarily using linear statistics. Correlation technically means any statistical relationship. Mutual information is a good measure here because it is one of the purest measures of statistical association. If there is any statistical relationship, then there will be mutual information.

In the context of the saying, it's also a good measure, because a statistical association doesn't imply causality, no matter if you're talking about correlation in the purest sense, or linear correlation. If you want to discuss the converse (does causality imply correlation?), I think it would be misleading and less interesting to use a narrow/restricted measure of correlation. Then again, due to the confusion with the word "correlation" becoming used so imprecisely in certain fields, it might be better just to ask if causality implies statistical association. Then it comes down to whether the process is stationary, or is the setting restricted properly so that the application of statistics is meaningful and core statistical assumptions can be made.

Likewise, in the context of all of the recent questions about causality and correlation, one should assume the broad definition of correlation (any statistical relationship) otherwise the questions are trivial, somewhat arbitrarily restrictive, and uninteresting.
Well, " Any Statistical Relation" is hopelessly vague. Just what does that mean and how is it measured? And I don't see why it is uninteresting ( obviously it interests me, since I asked the question), because the definition of correlation : Spearman and Rho that I am aware of, entail simultaneous change of two variabled so that it seems unintuitive to have causation without simultaneous change.
 
  • #31
WWGD said:
Well, " Any Statistical Relation" is hopelessly vague. Just what does that mean and how is it measured? And I don't see why it is uninteresting ( obviously it interests me, since I asked the question), because the definition of correlation : Spearman and Rho that I am aware of, entail simultaneous change of two variabled so that it seems unintuitive to have causation without simultaneous change.
Any statistical relationship isn't hopelessly vague. It means that ##P(X|Y) \neq P(X)##. This is the type of association that is most relevant to the saying "Correlation doesn't imply causality." Mutual information is a measure that captures this notion.

Why shouldn't it be the same meaning of correlation when talking about the converse of the saying? It's less interesting to me (I shouldn't have said that since it's my opinion) if you're talking about an arbitrary restrictive correlation measure, because of course you can exploit the restriction to find the example, but that doesn't say something fundamental about causality and probability or statistics, it just points out the importance of watching which simplifying assumptions you're making, and the limitations of certain measures.
 
  • Skeptical
Likes Dale
  • #32
Jarvis323 said:
Any statistical relationship isn't hopelessly vague. It means that ##P(X|Y) > P(X)##. This is the type of association that is relevant to the saying "Correlation doesn't imply causality." Mutual information is a measure that captures this notion.

Why shouldn't it be the same meaning of correlation when talking about the converse of the saying? It's less interesting if you're talking about an arbitrary restrictive measure of correlation, because of course you can exploit the restriction to find the example, but that doesn't say something fundamental about causality and probability or statistics, it just points out the importance of watching which simplifying assumptions you're making, and the limitations of certain measures.
From what I know, that shows dependence between X and Y (specifically positive dependence), but not necessarily correlation. The term 'correlation', as far as I know, has a specific meaning and refers to either Pearson correlation or Spearman correlation.
 
  • Like
Likes Dale
  • #33
WWGD said:
From what I know, that shows dependence between X and Y (specifically positive dependence), but not necessarily correlation. The term 'correlation', as far as I know, has a specific meaning and refers to either Pearson correlation or Spearman correlation.
I edited it to ##P(X|Y)\neq P(X)##. Anyways, I don't think "Correlation doesn't imply causation" is a saying specifically about Pearson or Spearman correlation, but rather a saying about statistical relationships (and correlation in general) and causality. But that doesn't mean it isn't interesting to examine in the context of Pearson's or Spearman's correlation, because they are so well known. Sorry I disrupted the discussion.
 
  • #34
Jarvis323 said:
I edited it to ##P(X|Y)\neq P(X)##. Anyways, I don't think "Correlation doesn't imply causation" is a saying specifically about Pearson or Spearman correlation, but rather a saying about statistical relationships (and correlation in general) and causality. But that doesn't mean it isn't interesting to examine in the context of Pearson's or Spearman's correlation, because they are so well known. Sorry I disrupted the discussion.
You have valid points, but my question is a narrower, more technical one, where I make reference to correlation in the sense I think it is commonly used, and not a broader question on causality. For one, PF does not allow philosophical questions (too many headaches and no experts on the matter to moderate). But it is worth exploring the connection between causality and dependence, or causality and other factors.
 
  • Like
Likes Klystron and Dale
  • #35
Jarvis323 said:
I'm not sure what you're talking about, it's not a "philosophical question", it's logic, probability and statistics. Using the narrower definition in my view is less technical. That's why I interjected along the lines of "technically". I am being precise and true to the nature of the question as it has been for over 100 years.
But ultimately it is the question I chose to ask. The standard definitions of correlation I am aware of are either Spearman's rho or Pearson's. Do you know of any other statistic that is defined to measure correlation other than these two? I am, at this moment, concerned with the relation between causality and correlation as I understand these to be defined, and not a more general question, however enlightening it could be, on causation vs other traits RVs may have. It is the focus I chose for the question.
 
  • Like
Likes Dale
  • #36
WWGD said:
You have valid points, but my question is a narrower, more technical one, where I make reference to correlation in the sense I think it is commonly used, and not a broader question on causality. For one, PF does not allow philosophical questions (too many headaches and no experts on the matter to moderate). But it is worth exploring the connection between causality and dependence, or causality and other factors.
Fair enough, but to avoid confusion, I would make sure to choose a specific measure of correlation if you are not addressing the broader question. Because if you use a restricted measure of correlation, then your examples will also depend on the specific measure of correlation as well (what is true about Pearson's won't be true about Spearman's and so forth). If you are simply uncomfortable with correlation not being in the term "mutual information", then you could instead use "total correlation". It's even more general than "mutual information" and has the word correlation in it :) And it is a measure that comes from a branch of engineering, so you might not get in trouble for getting too Philosophical ;).

https://en.wikipedia.org/wiki/Total_correlation
 
Last edited:
  • #37
Jarvis323 said:
Fair enough, but to avoid confusion, I would make sure to choose a specific measure of correlation if you are not addressing the broader question. Because if you use a restricted measure of correlation, then your examples will also depend on the specific measure of correlation as well (what is true about Pearson's won't be true about Spearman's and so forth). If you are simply uncomfortable with correlation not being in the term "mutual information", then you could instead use "total correlation". It's even more general than "mutual information" and has the word correlation in it :) And it is a measure that comes from a branch of engineering, so you might not get in trouble for getting too Philosophical ;).

https://en.wikipedia.org/wiki/Total_correlation
I brought up a much simpler case of 2 RVs. Does Total Correlation shed much light when we only consider two variables? Clearly if there is causality there is a degree of dependence but it seems overkill for this basic case.
 
  • #38
WWGD said:
I brought up a much simpler case of 2 RVs. Does Total Correlation shed much light when we only consider two variables? Clearly if there is causality there is a degree of dependence but it seems overkill for this basic case.
Total correlation works with 2 random variables. It's just a generalization of mutual information (which is between two random variables).

It's also nice because it works with discrete and categorical data. So you can reason about abstract events, which makes connecting it with causality easier.
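For reference, the definition in terms of entropies, which for two variables reduces to the mutual information:

$$C(X_1,\ldots,X_n)=\sum_{i=1}^{n} H(X_i)-H(X_1,\ldots,X_n),\qquad C(X,Y)=H(X)+H(Y)-H(X,Y)=I(X;Y).$$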
 
Last edited:
  • #39
Jarvis323 said:
To the contrary, using the term "correlation" as short for a specific type of linear statistical relationship is an abuse of terminology
Don't be silly. It is never an abuse of terminology to use a standard term in the standard manner with the standard meaning. Never. And particularly not in the sub-forum specifically for that discipline where the standard term is defined.

However, since you did produce a reference that supports your basic point (not your specific point about mutual information but the general idea of using correlation in a more broad sense), please feel free to use your non-standard meaning in this thread as long as you explicitly identify when you are using the non-standard meaning.
 
  • Like
Likes WWGD
  • #40
It is a good ref in general but overkill for the case of two variables. Using a tank to kill a fly.
 
  • #41
Dale said:
Don't be silly. It is never an abuse of terminology to use a standard term in the standard manner with the standard meaning. Never. And particularly not in the sub-forum specifically for that discipline where the standard term is defined.
It's a term that is context dependent.

Most definitions I find online say correlation is a measure of linear dependency. I guess this means that Spearman's correlation is not standard either? We are talking about strictly linear correlation right?

Then it's pretty easy and unsurprising that there are cases where causality doesn't imply linear correlation, because there are lots of non-linear processes.

You can use examples like trends that go up and then go down, like the number of active viral infections vs the number of people that have had it. Or things like the x position of a wheel in Euclidean space vs how much it has turned, or sinusoids, etc.
 
Last edited:
  • #42
Jarvis323 said:
It's a term that is context dependent.

Most definitions I find online say correlation is a measure of linear dependency. I guess this means that Spearman's correlation is not standard either? We are talking about strictly linear correlation right?

Then it's pretty easy and unsurprising that there are cases where causality doesn't imply linear correlation, because there are lots of non-linear processes.

You can use examples like trends that go up and then go down, like the number of active viral infections vs the number of people that have had it. Or things like the x position of a wheel in Euclidean space vs how much it has turned, or sinusoids, etc.
That's why my question was narrower, on either Pearson or Spearman. That way it has a somewhat clear answer, even if it is an oversimplification. Otherwise we would likely have an endless discussion, and in an area I am not too familiar with.
 
  • Like
Likes Dale
  • #43
Dale said:
Don't be silly. It is never an abuse of terminology to use a standard term in the standard manner with the standard meaning. Never. And particularly not in the sub-forum specifically for that discipline where the standard term is defined.

However, since you did produce a reference that supports your basic point (not your specific point about mutual information but the general idea of using correlation in a more broad sense), please feel free to use your non-standard meaning in this thread as long as you explicitly identify when you are using the non-standard meaning.
Fair enough. But the mutual information part is supported, at least somewhat, by the fact that another name for mutual information is the total correlation between two random variables.

I usually am studying topics where linear correlation isn't very relevant compared with statistical association in general, and I often use the word correlation. I've learned from this thread that people are so used to meaning linear correlation when they say correlation that I should just stop using the word correlation altogether in those cases.

Just pretend I wasn't here.
 
Last edited:
  • #44
Jarvis323 said:
Fair enough. But the mutual information part is supported, at least somewhat, by the fact that another name for mutual information is the total correlation between two random variables. I usually am studying topics where linear correlation isn't very relevant compared with statistical association in general, and I often use the word correlation. I've learned from this thread that people are so used to meaning linear correlation when they say correlation that I should just stop using the word correlation altogether in those cases.
But how much sense does it make to bring it up for a simple case of two variables Y vs X?
 
  • #45
WWGD said:
But how much sense does it make to bring it up for a simple case of two variables Y vs X?
Mutual information is probably the simplest measure of dependency of all for two random variables. Its calculation is simple, and its interpretation is simple (although maybe harder to visualize), at least in my opinion.

Of course, as I have been arguing, I think it makes the most sense to use mutual information in this context, but that's just my opinion.

That said, you're right, it would make the discussion a lot more complicated and maybe philosophical, and not produce the result you need for the class. So I concede it might be inappropriate.

Encryption is a decent example though, because encrypted strings are meant to appear to have absolutely no statistical association with their plaintext counterparts, even though they are generated from each other with a small missing ingredient. But with standard linear correlation you can't use any examples like this, because of both the type of data and the limitation on what kind of association it measures.

There is also a concept of "correlation-immunity" in cryptography that is relevant.
https://en.wikipedia.org/wiki/Correlation_immunity
 
Last edited:
  • Like
Likes WWGD
  • #46
Just about any time the output increases (or decreases) monotonically with the input, causation will imply correlation.
 
  • #47
A simple example of causation without correlation is between Y and XY, where X and Y are independent random variables: X takes the values −1 and +1 with probability 1/2 each, and Y takes the values 0 and 1 with probability 1/2 each. The random variables Y and XY have correlation = 0, but the event A = {Y = 0} forces the event B = {XY = 0}.
This example should be kept in mind when considering whether causation implies a linear relationship, or whether zero correlation implies that one variable gives no indication of the value of another. Both of those statements are incorrect in the general case.
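The zero correlation can be checked directly from the independence of X and Y and ##E[X]=0##:

$$\operatorname{Cov}(Y,\,XY)=E[XY\cdot Y]-E[XY]\,E[Y]=E[X]\,E[Y^2]-E[X]\,E[Y]^2=0.$$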
 
Last edited:
  • #48
WWGD said:
Ok, so if the causality relation between A,B is not linear, then it will go unnoticed by correlation, i.e., we may have A causing B but Corr(A, B)=0.

It's insufficient proof to assert "then it will" and then give a reason using the phrase "we may have".

What you describe is not a well defined mathematical situation. If we define a "causal relation" between A and B to mean that B is a function of A, the "correlation between A and B" is undefined until some procedure of sampling the values of A is specified.

For example, if B = f(A) then there may be intervals where f is increasing with respect to A and intervals where it is decreasing with respect to A. The "correlation between A and B" is not defined as a specific number until we say how to sample the various intervals.

Furthermore, the term "causal relation" is ambiguous. For example, suppose that for each value B = b, the value of A is given by a probability distribution of the form f(A, b), where b is a parameter of the distribution. Then A is not a function of B, but B still "has an effect" on the value of A.

WWGD said:
I may be teaching a small online class that includes this general area and was looking for examples that are "natural".

I hope you don't teach using ambiguous jingles like "Causation Does Not Imply Correlation".
 
  • #49
Stephen Tashi said:
It's insufficient proof to assert "then it will" and then give a reason using the phrase "we may have".

What you describe is not a well defined mathematical situation. If we define a "causal relation" between A and B to mean that B is a function of A, the "correlation between A and B" is undefined until some procedure of sampling the values of A is specified.

For example, if B = f(A) then there may be intervals where f is increasing with respect to A and intervals where it is decreasing with respect to A. The "correlation between A and B" is not defined as a specific number until we say how to sample the various intervals.

Furthermore, the term "causal relation" is ambiguous. For example, suppose that for each value B = b, the value of A is given by a probability distribution of the form f(A, b), where b is a parameter of the distribution. Then A is not a function of B, but B still "has an effect" on the value of A.
I hope you don't teach using ambiguous jingles like "Causation Does Not Imply Correlation".
Consider Hooke's law and the causal relation (causality is still somewhat of a philosophical term at this stage, so I am settling for accepted physical laws as describing/defining causality) with a shift, to ##y=k(x-1)^2##. Then samples taken on opposite sides of the vertex of such a law, as for other physical laws, may give rise to uncorrelated data sets. I cannot afford to enter into or present serious background on causality in an intro-level class.
 
  • #50
If you are going to teach an introductory class, I think you should be careful about these terms. Saying that A and B are uncorrelated implies that A and B are random variables. Saying that A causes B implies that A and B are events. The two implications are conflicting. It would be better to talk about random variables X and Y being correlated and about the event ##X \in A## implying (not causing) the event ##Y \in B##. (You could talk about events A and B being independent, but not uncorrelated).
Also, you should be careful to indicate that "causation" is a logic problem that depends on subject knowledge, not a statistical problem.
 
Last edited: