# How is causality proven when it's impossible to change the IV?

1. Oct 26, 2012

### Cinitiator

How is causality usually proven when it's impossible to change the independent variable?
Also, why is correlation often treated as a very reliable hint of causation, even in various respected journals?

Here's an example:
http://www.webmd.com/diet/news/20080303/eating-breakfast-may-beat-teen-obesity

This article directly implies a causal link between eating breakfast and weighing less ("Although adolescents may think that skipping breakfast seems like a good way to save on calories, findings suggest the opposite."), based only on a correlation. However, there can be many causes for that correlation. For example, those who are more physically active (which is a cause of weight loss) may be more likely to eat breakfast. Or those who sleep less may be more likely to skip breakfast, and also have a slower metabolism as a result of sleep deprivation. Doesn't one need to conduct an experiment and see whether manipulating the breakfast independent variable actually leads to a reduction in weight?

2. Oct 26, 2012

### Stephen Tashi

It's easy to conclude that "Showing X doesn't mathematically prove causality" for any X that you want to pick, since there is no rigorous definition of causality in mathematical statistics. Thus there are no mathematical proofs of it.

3. Oct 26, 2012

### ImaLooser

You argue: give evidence, refute objections, etc. and try to persuade that X causes Y.

It is a reliable hint.

4. Oct 27, 2012

### chiro

As has been pointed out, the point is to provide evidence, and to use what has been found to suggest where to look next and how to interpret what has been observed.

All of statistics is about confidence at some point, but even then we need to look at why we are confident about a particular assertion and that means looking at what our data is, what the context of the experiment is and how the whole process leads to a conclusion or interpretation.

Confidence can be a con-game, and it's important to recognize when this is the case by looking past the results themselves to where they came from and how they were generated.

5. Oct 27, 2012

### Cinitiator

Let's say that ice cream sales usually go up in the summer, and so do drowning deaths.
Is the correlation between the rise in aggregate ice cream consumption and drowning deaths a reliable hint of causation between the two?

But how can we demonstrate causality with any degree of confidence if we can't change the independent variable? What kinds of statistical tools are available for assessing causality in these cases?
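One common observational approach to the ice cream question is to adjust for a suspected common cause and see whether the association survives. The sketch below is only an illustration on simulated data: the variable names, the numbers, and the assumption that temperature drives both series are all hypothetical, and the adjustment used is a plain partial correlation. It shows a correlation that vanishes once the confounder is accounted for. It does not show how to prove causation, and it only works when the confounder is known and measured.

```python
import math
import random


def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(xs, ys, zs):
    """Correlation of xs and ys after linearly adjusting both for zs."""
    rxy = pearson(xs, ys)
    rxz = pearson(xs, zs)
    ryz = pearson(ys, zs)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical, simulated data: temperature drives both series,
# and neither series has any effect on the other.
temperature = [rng.uniform(0, 35) for _ in range(5000)]
ice_cream = [2.0 * t + rng.gauss(0, 10) for t in temperature]
drownings = [0.1 * t + rng.gauss(0, 1) for t in temperature]

raw_r = pearson(ice_cream, drownings)
adjusted_r = partial_corr(ice_cream, drownings, temperature)

print(f"raw correlation:          {raw_r:.3f}")
print(f"adjusted for temperature: {adjusted_r:.3f}")
```

In this simulation the raw correlation is clearly positive while the adjusted one is close to zero, which is what one would expect when a third variable explains the association. With real data, an unmeasured confounder would leave the raw correlation unexplained, so a surviving correlation is still only a hint.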

6. Oct 27, 2012

### chiro

You can determine that something fits a particular model that relates two variables together in a way that they are highly dependent.

The simplest way of doing this is to use correlation, but this only works well when the relationship is linear. When the relationship is highly non-linear, the correlation coefficient can be small even if the dependence is strong.
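A small sketch of that limitation, using made-up data: here `y` is completely determined by `x`, yet the Pearson correlation is essentially zero because the relationship is symmetric rather than linear. The `pearson` helper is written out by hand so the example needs nothing beyond the standard library.

```python
import math


def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# x runs symmetrically from -1 to 1; y depends on x exactly, but not linearly.
x = [i / 100 for i in range(-100, 101)]
y = [xi ** 2 for xi in x]

r_quadratic = pearson(x, y)
print(f"correlation of x with x^2: {r_quadratic:.6f}")
```

A near-zero coefficient therefore rules out a linear trend, not a relationship.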

The next thing is to look at whether there is evidence of a non-linear relationship in the scatter-plot or the data. Typically one will either try to transform the data into something resembling a linear fit, or try one of several general methods for finding relationships.
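As a rough illustration of the transformation idea, the sketch below assumes the data follow an exponential trend with multiplicative noise (the constants `2.0`, `0.8` and the noise level are invented for the example). Taking the logarithm of `y` turns the curve into something close to a straight line, and the correlation with `x` rises accordingly.

```python
import math
import random


def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(1)  # fixed seed so the run is repeatable

# Hypothetical data: exponential growth with multiplicative noise.
x = [i * 0.05 for i in range(201)]  # 0.0, 0.05, ..., 10.0
y = [2.0 * math.exp(0.8 * xi + rng.gauss(0, 0.3)) for xi in x]

# A log transform makes the relationship approximately linear in x.
log_y = [math.log(v) for v in y]

r_raw = pearson(x, y)
r_log = pearson(x, log_y)

print(f"correlation of x with y:      {r_raw:.3f}")
print(f"correlation of x with log(y): {r_log:.3f}")
```

Which transform to try is a judgment call based on the scatter-plot and on what is known about the process; the log is only appropriate here because the simulated data were built to be exponential.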

The area of data mining has a lot of these general methods and they have different ideas in terms of how they originated and what they focus on.

The idea is that these methods look for patterns of any kind, and there is everything from Naive Bayes to Support Vector Machines for finding these so-called patterns.

Statistically, Naive Bayes and the entropy methods are ways of finding patterns and suggesting possible causal links between variables or, at the least, between subsets of the data that you have.

There are definitions for orthogonality of random variables and also for independence of random variables, and these can also be used to form hypotheses about whether certain variables are related or not.

7. Oct 27, 2012

### Stephen Tashi

As I said before, there is no formal definition of "causality" in mathematical statistics. You are asking a question about scientific methodology, not a question about mathematics.