Question about correlation coefficient

  • Context: Undergrad 
  • Thread starter: pamparana
  • Tags: Coefficient, Correlation

Discussion Overview

The discussion revolves around the feasibility of calculating a correlation coefficient between two sets of data: employee survey responses and company performance indicators, where the number of observations differs significantly. The scope includes conceptual understanding of correlation, data pairing, and the implications of linear relationships.

Discussion Character

  • Exploratory
  • Conceptual clarification
  • Debate/contested

Main Points Raised

  • Luca questions whether it is possible to correlate 32 survey responses against 63 monthly performance indicators, given the difference in the number of observations.
  • One participant emphasizes the need for a clear relationship between the two variables to justify correlation, noting that correlation requires paired data.
  • Another participant inquires if the 32 responses are distributed across the same 63-month period.
  • Luca clarifies that the responses are from employees over a 5-year period and aims to correlate employee attitudes towards company performance.
  • A participant warns that correlation typically only makes sense for linear relationships and suggests examining the data for discernible patterns or transformations.
  • There is uncertainty expressed about how to correlate the data if survey responses are collected at only one time point.

Areas of Agreement / Disagreement

Participants express differing views on the feasibility and methodology of correlating the data, with no consensus reached on how to proceed given the differences in data collection.

Contextual Notes

Participants highlight potential limitations regarding the pairing of data, the need for linear relationships, and the importance of contextualizing the data before attempting correlation.

pamparana
Hello everyone,

I have a question (perhaps a very noob one as well!) regarding correlation between variables where the number of observations differs between the two sets.

So, I have some 32 responses generated from a survey which aims to measure a certain variable, and I want to correlate them against some company performance indicators. These performance indicators are available on a per-month basis for about 63 months.

So I have 32 instances of one variable against 63 instances of another. Is it possible to do a simple correlation between these sets when the number of instances is different?

Thanks,
Luca
 
What, precisely, would you be correlating? What is the relationship between the two variables that would allow a correlation to make sense? How do you know which observations of one variable are associated with which observations of the other?

Correlation is, fundamentally, normalized covariance, which is a characteristic of joint distributions of random variables. If you have two uneven sets of completely unrelated and unpaired data, then the notion of correlation makes no sense.
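To make the "normalized covariance" point concrete, here is a minimal sketch in plain Python. The function name `pearson_r` is just an illustrative choice, and the length check is an assumption about how one might want to fail; the point is that the formula sums over pairs, so 32 values against 63 values gives it nothing to work with.

```python
import math

def pearson_r(x, y):
    """Pearson correlation: sample covariance divided by both standard deviations."""
    if len(x) != len(y):
        # The covariance sum runs over pairs (x_i, y_i); unequal lengths mean unpaired data.
        raise ValueError("correlation needs paired observations of equal length")
    n = len(x)
    if n < 2:
        raise ValueError("need at least two pairs")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    var_x = sum((a - mean_x) ** 2 for a in x) / (n - 1)
    var_y = sum((b - mean_y) ** 2 for b in y) / (n - 1)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / math.sqrt(var_x * var_y)
```

Calling this with a list of 32 survey scores and a list of 63 monthly indicators raises the `ValueError`, which is really the whole objection: something has to say which survey value goes with which month before the calculation means anything.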
 
Are the 32 responses spread across the same 63 month period?
 
Hello,

Thanks for the replies. The 32 responses are from employees that have been employed over that 5 year period.

What I am trying to correlate is employee attitudes towards company performance over that time.

Thanks,
Luca
 
You should be aware that correlation between two variables, in its most common form, only makes sense when the relationship is linear (since this is what it is trying to determine).

You need to take a look at your data and decide whether you can discern any relationship at all, and if necessary transform your data to try and get a linear-looking relationship.
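As a rough illustration of the transformation idea, the sketch below uses made-up data where y grows exponentially with x. The data and the choice of a log transform are assumptions for the example only, not something taken from the survey in question. The raw correlation understates a relationship that is in fact exact, while the log-transformed one comes out essentially perfect.

```python
import math

def pearson_r(x, y):
    """Pearson correlation for two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: y depends on x exactly, but not linearly.
x = [float(i) for i in range(1, 11)]
y = [math.exp(v) for v in x]

r_raw = pearson_r(x, y)                          # linear measure on a curved relationship
r_log = pearson_r(x, [math.log(v) for v in y])   # after transforming y, the relationship is linear

print(f"raw: {r_raw:.3f}, after log transform: {r_log:.3f}")
```

Plotting the data first is usually the quickest way to decide whether any transform is worth trying at all.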

The most important thing, however, is to try to put context into your data: if you can quantify the characteristics of the behaviour with a simple function that makes sense non-mathematically (i.e. you can explain what it means in plain English, without using mathematics and with reference to something specific), then this is what you should be doing.

If you are just trying to produce metrics without having a clue what's going on, you'll be setting yourself up to make a potentially bad decision.
 
pamparana said:
Hello,

Thanks for the replies. The 32 responses are from employees that have been employed over that 5 year period.

What I am trying to correlate is employee attitudes towards company performance over that time.

Thanks,
Luca

So you have performance data on a per month basis for 63 months, and survey data every month as well? If you only have survey data at one time point, I'm not sure what you would correlate, exactly.
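If it turns out each survey response does carry a date, one possible way to get paired data is to average the responses within each month and keep only the months that appear in both sets. The sketch below assumes exactly that; the `"YYYY-MM"` month labels, the field layout, and the function name `pair_by_month` are all hypothetical, since the thread never says how the survey was recorded.

```python
from collections import defaultdict
from statistics import mean

def pair_by_month(responses, indicators):
    """Pair mean survey score with the performance indicator for each shared month.

    responses:  list of (month, score) tuples, e.g. ("2010-01", 3)
    indicators: dict mapping month -> performance value
    Returns a list of (month, mean_score, indicator) sorted by month.
    """
    scores_by_month = defaultdict(list)
    for month, score in responses:
        scores_by_month[month].append(score)
    return [
        (month, mean(scores_by_month[month]), indicators[month])
        for month in sorted(scores_by_month)
        if month in indicators  # drop months with no matching indicator
    ]

# Hypothetical example: four responses, three months of indicators.
responses = [("2010-01", 3), ("2010-01", 5), ("2010-03", 4), ("2011-07", 2)]
indicators = {"2010-01": 10.0, "2010-02": 11.0, "2010-03": 12.0}
pairs = pair_by_month(responses, indicators)
```

Here only two months survive the pairing, which shows the catch: with 32 responses spread over 63 months, the number of usable pairs could be much smaller than either count. If all responses come from a single time point, there is only one pair and no correlation to compute.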
 
