Is Correlation Coefficient an Informative Indicator in Real-World Datasets?

alex.kin. · Nov 3, 2012

Hi,

Are you aware of any dataset (in R or elsewhere) consisting of a sample from two variables where the correlation coefficient is (approximately) equal to 1, but the variables refer to completely unrelated things, e.g. one measuring something that happens on Earth and the other something on a distant planet?

or

a case where a parameter causally affects a measure, but because other such causal parameters also exist, a sample from the two variables has a correlation coefficient far from 1 or -1?

My question is: is the correlation coefficient really an informative indicator?

Any suggestions?

Thanks, alex

Stephen Tashi · Nov 3, 2012

Perhaps you should look at the correlation between two nearly constant "random" variables. Something like X = 1 if there are at least 100 sunny days this year and Y = 1 if the Cubs don't win the World Series this year. (I suppose you need a little variation to prevent the covariances from being 0.)
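To make the suggestion concrete, here is a minimal sketch in R. The 50-year span and the positions of the exceptional years are made up for illustration; nothing here is real weather or baseball data. It shows that when each indicator deviates from 1 only once, the correlation is decided entirely by whether the two exceptions fall in the same year.

```r
# Hypothetical yearly indicators over n years; all values are invented.
n <- 50
x      <- rep(1, n)   # X = 1: at least 100 sunny days that year
y_same <- rep(1, n)   # Y = 1: the Cubs did not win that year
y_diff <- rep(1, n)

x[17]      <- 0       # the single exceptional year for X
y_same[17] <- 0       # Y's exception falls in the same year
y_diff[33] <- 0       # Y's exception falls in a different year

r_same <- cor(x, y_same)   # 1, because the two series are identical
r_diff <- cor(x, y_diff)   # -1/(n - 1), slightly below zero

print(c(same_year = r_same, different_year = r_diff))
```

With a single coinciding exception the sample correlation is 1 even though the variables have nothing to do with each other, and moving that one observation drops it to roughly zero.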

bpet · Nov 4, 2012

You could create your own data, for example:

Code:

> # no seed is set, so the printed values vary from run to run
> z<-rnorm(1000)
> w<-rnorm(1000)
> # spurious correlation between two independent random walks
> cor(cumsum(z),cumsum(w))
[1] 0.6556251
> # exp(2*z) is an increasing function of exp(z), yet the correlation is less than 1
> cor(exp(z),exp(2*z))
[1] 0.8726321
> # exp(-2*z) is a decreasing function of exp(z), yet the correlation is almost zero
> cor(exp(z),exp(-2*z))
[1] -0.08543019

Other measures of dependence such as rank correlation have much nicer properties.
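As a rough illustration of that remark, the sketch below repeats the two transformed examples with Spearman's rank correlation, using the `method = "spearman"` argument of base R's `cor`. The seed is arbitrary and only there for reproducibility. Because rank correlation depends only on the ordering of the values, it is unchanged by monotone transformations such as `exp`.

```r
set.seed(1)           # arbitrary seed, only for reproducibility
z <- rnorm(1000)

# Pearson correlation measures linear association only
pearson_inc <- cor(exp(z), exp(2 * z))
pearson_dec <- cor(exp(z), exp(-2 * z))

# Spearman correlation is Pearson correlation computed on the ranks
spearman_inc <- cor(exp(z), exp(2 * z),  method = "spearman")  # 1
spearman_dec <- cor(exp(z), exp(-2 * z), method = "spearman")  # -1

print(c(pearson_inc = pearson_inc, spearman_inc = spearman_inc))
print(c(pearson_dec = pearson_dec, spearman_dec = spearman_dec))
```

The Spearman values are 1 and -1 for any sample without ties, while the Pearson values depend on the sample and stay strictly inside that range.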

alex.kin. · Nov 6, 2012

Thanks Stephen,

I know that it is possible to create such a dataset; however, so far I haven't found any real-world example among the data I have access to.

Stephen Tashi · Nov 6, 2012

You'll have to distinguish between the existence of data and the existence of datasets. There are cultural reasons why people would not bother to publish a dataset of nearly constant random variables. This doesn't mean that the data isn't "real world".

bpet · Nov 7, 2012

Stock price data has properties very similar to the examples described in post #3.
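A sketch of how one might check this, using simulated prices rather than real market data: the two "stocks" below are hypothetical, their daily log returns are drawn independently, and the starting price of 100 and the volatility of 0.01 are arbitrary assumptions. Price levels are cumulative sums of returns, so they behave like the random walks in the earlier example, whereas the returns themselves are independent by construction.

```r
set.seed(42)          # arbitrary seed, only for reproducibility
n <- 1000

# Independent daily log returns for two hypothetical stocks (simulated)
ret_a <- rnorm(n, mean = 0, sd = 0.01)
ret_b <- rnorm(n, mean = 0, sd = 0.01)

# Price paths: exponentiated cumulative sums of the log returns
price_a <- 100 * exp(cumsum(ret_a))
price_b <- 100 * exp(cumsum(ret_b))

# Correlation of price levels can be far from 0 despite independence
cor_levels <- cor(price_a, price_b)

# Correlation of log returns (differences of log prices) is usually near 0
cor_returns <- cor(diff(log(price_a)), diff(log(price_b)))

print(c(levels = cor_levels, returns = cor_returns))
```

With real data, the same comparison would use downloaded closing prices in place of `price_a` and `price_b`; correlating returns rather than levels is a common way to avoid the spurious effect.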
