Question about correlation coefficient

pamparana
Hello everyone,

I have a question (perhaps a very noob one as well!) regarding correlation between variables where the number of observations differs between the two sets.

So, I have 32 responses from a survey that aims to measure a certain variable, and I want to correlate them against some company performance indicators. These performance indicators are available on a monthly basis for 63 months.

So I have 32 instances of one variable against 63 instances of another. Is it possible to do a simple correlation between these sets when the number of instances is different?

Thanks,
Luca
 
What, precisely, would you be correlating? What is the relationship between the two variables that would allow a correlation to make sense? How do you know which observations of one variable are associated with which observations of the other?

Correlation is, fundamentally, normalized covariance, which is a characteristic of joint distributions of random variables. If you have two uneven sets of completely unrelated and unpaired data, then the notion of correlation makes no sense.
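To make the "normalized covariance" point concrete, here is a minimal sketch of the Pearson coefficient in plain Python. The function name `pearson_r` is just an illustrative choice, not something from the thread. Note that the formula sums over pairs (x_i, y_i), so it cannot even be evaluated for 32 values against 63 unless something pairs them up first.

```python
import math

def pearson_r(x, y):
    """Pearson correlation: sample covariance divided by the product of sample standard deviations."""
    if len(x) != len(y):
        # The covariance is a sum over pairs, so unpaired data has no defined correlation.
        raise ValueError("correlation needs paired observations: len(x) must equal len(y)")
    n = len(x)
    if n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    var_x = sum((a - mean_x) ** 2 for a in x) / (n - 1)
    var_y = sum((b - mean_y) ** 2 for b in y) / (n - 1)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / math.sqrt(var_x * var_y)
```

Calling it with lists of length 32 and 63 raises the `ValueError`, which is the code-level version of the objection above.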
 
Are the 32 responses spread across the same 63 month period?
 
Hello,

Thanks for the replies. The 32 responses are from employees that have been employed over that 5 year period.

What I am trying to correlate is employee attitudes towards company performance over that time.

Thanks,
Luca
 
You should be aware that correlation between two variables, in its most common form, only makes sense when the relationship is linear (since a linear relationship is what it is trying to detect).

You need to take a look at your data and decide whether you can discern any relationship at all, and if necessary transform your data to try and get a linear-looking relationship.
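As a rough illustration of transforming data to get a linear-looking relationship, the sketch below uses synthetic numbers (not the poster's data) where y grows exponentially with x. The exponential form and the log transform are assumptions chosen for the example; real data may need a different transform, or none.

```python
import math

def pearson_r(x, y):
    """Pearson correlation for two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)

# Synthetic data: y depends on x exactly, but not linearly.
x = list(range(1, 11))
y = [math.exp(0.5 * v) for v in x]

r_raw = pearson_r(x, y)                          # understates the relationship
r_log = pearson_r(x, [math.log(v) for v in y])   # log(y) is linear in x

print(f"raw: {r_raw:.3f}, after log transform: {r_log:.3f}")
```

The raw coefficient comes out below 1 even though y is completely determined by x, while the transformed one is essentially 1. Plotting the data first is still the better way to decide which transform, if any, is sensible.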

The most important thing, however, is to try to put context into your data: if you can quantify the characteristics of the behaviour with a simple function that makes sense non-mathematically (i.e. you can explain what it means in plain English without using mathematics and with a reference to something specific), then this is what you should be doing.

If you are just trying to produce metrics without having a clue what's going on, you'll be setting yourself up to make a potentially bad decision.
 
pamparana said:
Hello,

Thanks for the replies. The 32 responses are from employees that have been employed over that 5 year period.

What I am trying to correlate is employee attitudes towards company performance over that time.

Thanks,
Luca

So you have performance data on a per month basis for 63 months, and survey data every month as well? If you only have survey data at one time point, I'm not sure what you would correlate, exactly.
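If the survey responses did carry a month stamp, one way to get paired data would be to average the responses per month and keep only the months present in both sets. The sketch below assumes exactly that; the function name `pair_by_month`, the `(month, score)` layout, and the sample values in the usage note are hypothetical, not taken from the thread.

```python
from collections import defaultdict

def pair_by_month(survey_rows, performance_by_month):
    """Pair mean survey score with the performance indicator for each shared month.

    survey_rows: iterable of (month, score) tuples (hypothetical layout).
    performance_by_month: dict mapping month -> indicator value.
    Returns (months, attitude, performance) as three equal-length lists.
    """
    scores = defaultdict(list)
    for month, score in survey_rows:
        scores[month].append(score)

    # Only months that appear in both data sets can contribute a pair.
    months = sorted(set(scores) & set(performance_by_month))
    attitude = [sum(scores[m]) / len(scores[m]) for m in months]
    performance = [performance_by_month[m] for m in months]
    return months, attitude, performance
```

For example, `pair_by_month([("2010-01", 4), ("2010-01", 2), ("2010-03", 5)], {"2010-01": 10.0, "2010-02": 11.0, "2010-03": 12.5})` keeps two months and drops 2010-02, which has no survey response. If every response comes from a single time point, this produces at most one pair, and a correlation is still not defined.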
 