# Correlation and the Probability of Causation

by koenigcochran
Tags: causation, correlation, fallacy, probability
 P: 15 I've always been told "correlation does not imply causation." However, I've never been told much about whether it can imply a *probability* of causation. Moreover, there seem to be competing and often misused definitions of "to cause," i.e., use in a syllogistic sense versus use in a probabilistic sense. Please consider the following:

Imagine we conduct a strictly controlled experiment only once, and it has one of two outcomes: (Outcome 1) Y is strongly correlated with X. (Outcome 2) Y is not correlated with X at all.

Suppose the single experiment has outcome (1), and call the probability that X causes Y, P1. Now go back in time, suppose it instead has outcome (2), and call the probability that X causes Y, P2. Is P1 > P2? Why?

I realize this raises lots of questions: what do I mean by "cause"? What do we know about the experiment? What do we mean by "strictly controlled"? For the sake of my curiosity, I invite you to supply your own assumptions in answering them. I apologize for the vagueness, but it's the vagueness of this question that has me scratching my head! Please feel free to point out what must be clarified, along with some possible clarifications (point out the blanks and fill them in). Thanks so much!
 P: 4,579 Another thing to consider is the nature of your information. Most of the information we deal with is a simplification of something more complex; think of it as a projection from a huge, higher-dimensional space down to something of much lower dimension. As a result of this simplification, we are going to miss things and make bad inferences, especially if we forget what the simplification is and how it relates to the un-projected, complete data as a whole.

So if you think you have the complete data, but the data are in fact a vast simplification that hides many of the internal mechanisms contributing to the final "simplified" numeric quantities used in the analysis, remember that the simplifications, what is measured, and how these relate to each other and to the context of the experiment all have a big impact on inferences about causality, and on any analysis of correlation. In fact, it is a good idea to state these things in a report, so that others are aware of the assumptions behind your analyses and can draw their own conclusions, favorable or not, in a constructive manner.
P: 250

Hi koenigcochran!

 Quote by koenigcochran Imagine we conduct a strictly controlled experiment only once, and it has one of two outcomes: (Outcome 1) Y is strongly correlated to X. (Outcome 2) Y is not correlated to X at all. Suppose the single experiment has outcome (1), and we call the probability that X causes Y, P1. Now let's go back in time, suppose it instead has outcome (2), and call the probability that X causes Y, P2. Is P1 > P2? Why?
This is a cool question, OK, my two cents on this one:

In (Outcome 1), there are infinitely many possible experiments in which X is the cause of Y, and infinitely many others in which it is not. Exactly the same is true in (Outcome 2).

So calculating (and comparing) P1 and P2 only makes sense within a well-defined set of scenarios.

There is no way whatsoever to prove (or, interestingly, to disprove) causality just by looking at the data. You need to understand the underlying model that makes X and Y behave as they do, and then the causality is only as true as your model is.

We humans have the feeling that P1 > P2 because our "well-defined set of scenarios" is not the infinite set of mathematically possible cases but our daily experience, which is full of underlying models that we build throughout our lives to improve our chances of survival.

So if we see event Y happening right after event X (whether correlated or not), our collection of human underlying models will take X causing Y as the most likely scenario. The same goes for the example you pose: we would judge P1 not just higher, but much higher, than P2, and in this context we are right that P1 >> P2.

In short, once you define the context (whether mathematical, physical, human...), you can talk about how causality relates to correlation.
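The "well-defined set of scenarios" point can be made concrete with a toy Monte Carlo sketch. Everything here is my own illustrative assumption, not anything stated in the thread: I take a linear causal model Y = X + noise, an independent null model, and define "strongly correlated" as |r| > 0.5 in a sample of 50. The simulation then estimates how often each scenario would produce Outcome 1:

```python
import random
import math

def corr(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def shows_strong_correlation(causal, n=50):
    """One toy 'experiment': sample X, generate Y causally or independently,
    and report whether the sample looks 'strongly correlated' (|r| > 0.5)."""
    xs = [random.gauss(0, 1) for _ in range(n)]
    if causal:
        ys = [x + random.gauss(0, 0.5) for x in xs]  # assumed causal model
    else:
        ys = [random.gauss(0, 1) for _ in range(n)]  # independent null model
    return abs(corr(xs, ys)) > 0.5

random.seed(0)
trials = 2000
p_corr_given_causal = sum(shows_strong_correlation(True) for _ in range(trials)) / trials
p_corr_given_null = sum(shows_strong_correlation(False) for _ in range(trials)) / trials
```

Under these particular scenario definitions, Outcome 1 is very likely when X causes Y and very unlikely when it doesn't, which is exactly the kind of likelihood comparison that only exists once the set of scenarios is pinned down. Change the models or the threshold and the numbers change with them.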
 P: 9,928 koenigcochran, you have more or less described Bayesian reasoning. Yes: if you have an a priori probability for a hypothesis, and can calculate the odds of an observation on the basis that the hypothesis is true, and again on the basis that it is false, then you can adjust your probability of the hypothesis in light of the data.

This is one reason (the main reason?) that a feasible mechanism for a cause contributes greatly to one's confidence in it. Some would say that without such a mechanism the a priori probability is zero, and so can never rise above zero. On the other hand, we should always allow for the possibility that we just haven't thought of the mechanism.

By the way, as indicated by earlier posts, we need to distinguish between a cause and a causal connection: there may be a common cause.
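The Bayesian update described above can be written out in a few lines. The prior and the two likelihood values below are made-up placeholders (the thread gives no numbers); the point is only the shape of the calculation, including the observation that a prior of exactly zero can never rise:

```python
def posterior_causal(prior, obs_corr, p_corr_if_causal=0.95, p_corr_if_not=0.02):
    """Bayes' rule: P(causal | observation) given a prior P(causal) and the
    probability of seeing a strong correlation under each hypothesis.
    The default likelihoods are illustrative assumptions only."""
    like_causal = p_corr_if_causal if obs_corr else 1 - p_corr_if_causal
    like_not = p_corr_if_not if obs_corr else 1 - p_corr_if_not
    numerator = like_causal * prior
    return numerator / (numerator + like_not * (1 - prior))

p1 = posterior_causal(0.5, obs_corr=True)   # Outcome 1: correlation observed,  ~0.979
p2 = posterior_causal(0.5, obs_corr=False)  # Outcome 2: no correlation,        ~0.049
```

With any non-degenerate prior, P1 > P2, matching the intuition in the thread. And since the prior multiplies the likelihood, `posterior_causal(0.0, True)` stays exactly zero: with no believed mechanism (prior zero), no amount of correlation moves the posterior.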
