Solving Variables Puzzle: Correlation Coefficients & SPSS for N=850

  • Context: MHB 
  • Thread starter: Leigh13
  • Tags: Stuck Variables
SUMMARY

This discussion focuses on calculating correlation coefficients using SPSS for three ordinal variables: the year of company founding, technological intensity of products, and the number of foreign markets. The consensus is that Spearman's correlation coefficient is appropriate due to the ordinal nature of the data, while Pearson's coefficient is unsuitable. Users are advised to input the data into SPSS in three adjacent columns and follow a step-by-step method to calculate Spearman's r, including ranking and handling ties.

PREREQUISITES
  • Understanding of ordinal data types
  • Familiarity with SPSS software
  • Basic knowledge of correlation coefficients, specifically Spearman's and Pearson's
  • Ability to perform data sorting and ranking
NEXT STEPS
  • Learn how to input and manipulate data in SPSS for correlation analysis
  • Study the calculation of Spearman's correlation coefficient in detail
  • Explore Excel functions for calculating correlation coefficients
  • Investigate the implications of ordinal data on statistical analysis
USEFUL FOR

Statisticians, data analysts, researchers, and students looking to understand correlation analysis using SPSS for ordinal data.

Leigh13
Hi guys!

I have never used statistics in my life, so now I'm completely lost.

I have only 3 variables -

the year of the company founding (up to 1993, 1994-2004, 2005-now), the technological intensity of their products (that's either a, b, c or d) and the number of foreign markets (that's either 1, 2, 3 or more)

I would like to check the correlation between the year and the tech. intensity, the year and the number of markets, and the intensity and the number of markets

I understand I'm looking for some sort of correlation, but which coefficient to check - Spearman or Pearson?

And how do I get the variables into SPSS (n=850)? Do I just put 3 columns next to each other?

Thanks so much for help!
 

Welcome to MHB, Leigh13! :)

All of your variables are ordinal.
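That is, the categories have an order, but the distances between them are undefined, so for instance an average is meaningless.
In particular, that means Pearson's coefficient, which relies on means and distances, is not appropriate.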
So yes, Spearman is the way to go.

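In SPSS you would indeed simply put the 3 variables in 3 columns next to each other, one row per company, and specify that you want Spearman (in the Analyze > Correlate > Bivariate dialog, tick Spearman instead of Pearson).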
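
Before the data goes into three columns, each category label needs a numeric code that preserves its order. Here is a minimal sketch in Python of that coding step. The dictionary names, the codes 1..k, the reading of the market categories as "1", "2" and "3 or more", and the sample rows are all assumptions made for illustration, not data from the thread.

```python
# Hypothetical ordinal codings: the numbers only need to preserve the order.
YEAR_CODES = {"up to 1993": 1, "1994-2004": 2, "2005-now": 3}
INTENSITY_CODES = {"a": 1, "b": 2, "c": 3, "d": 4}
MARKET_CODES = {"1": 1, "2": 2, "3 or more": 3}


def encode_row(year_band, intensity, markets):
    """Turn one company's three category labels into three ordinal codes."""
    return (YEAR_CODES[year_band], INTENSITY_CODES[intensity], MARKET_CODES[markets])


# Made-up example rows (not data from the thread); one row per company.
companies = [
    ("up to 1993", "a", "3 or more"),
    ("2005-now", "c", "1"),
    ("1994-2004", "b", "2"),
]

# Each encoded row corresponds to one line of the three-column data sheet.
encoded = [encode_row(*row) for row in companies]
for year, intensity, markets in encoded:
    print(year, intensity, markets)
```

Because Spearman only uses the ordering, any other increasing set of codes (say 10, 20, 30) should give the same coefficient.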
 
Also, try doing it in Excel! That way, you'll really get an intuitive feel for how Spearman's correlation coefficient works.

Calculating Spearman's r From Scratch
Step 1: Create Your Table
Create columns for the IDs, the two variables, their ranks, d, and d^2 (see Table 1 below).

Step 2: Sort According to the First Variable
Sort the entire data set (all columns, not just one!) according to your first variable, Intensity(a). Make a note of whether you sorted in ascending or descending order. In the example below, I used ascending order, so the value "a" comes first and starts at rank 1, "b" comes next, and so forth.

Step 3: Assign First Variable Ranks, Deal with Ties
Assign ranks 1, 2, 3, ... to the observations in their sorted order for the variable we sorted in step 2 (i.e. Intensity(a)). Now, you may notice that some of the observations have the same value but different ranks. These are called ties, and they need to receive new, equal ranks for the calculations to work: each tied observation gets the average of the ranks the group occupies.

e.g. observations with IDs 2, 3, and 7 all share a value of "c" in the Intensity(a) variable, but hold the dissimilar ranks 4, 5, and 6. Their new equal rank will be: (6 + 5 + 4)/3 = 5.
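
The tie rule can be sketched in a few lines of Python. This is only an illustration: the function name `average_ranks` is made up here, and the input is the Intensity(a) column of Table 1, listed by ID.

```python
def average_ranks(values):
    """Rank values in ascending order; tied values share the mean of their positions."""
    # Indices of the observations in sorted order (ties keep their original order).
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        # Positions are 1-based, so the block covers positions start+1 .. end+1.
        shared = (start + 1 + end + 1) / 2
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


# Intensity(a) column of Table 1, listed by ID 1..7.
intensity = ["a", "c", "c", "d", "b", "a", "c"]
intensity_ranks = average_ranks(intensity)
print(intensity_ranks)  # [1.5, 5.0, 5.0, 7.0, 3.0, 1.5, 5.0]
```

The three "c" observations occupy positions 4, 5 and 6, so each ends up with rank 5, matching the RANK(a) column.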

Step 4: Sort According to the Second Variable
Sort your data according to the next variable, #Markets(b). Make sure that the ranking system follows the same pattern as for your first variable. In our case, "a" was considered high and received the rank of one. Therefore, the observation with 8 markets will receive the rank of 1, the observation with 7 markets will receive rank 2, and so forth.

Step 5: Assign Second Variable Ranks, Deal with Ties
Assign ranks for the second variable, then give tied observations new, equal ranks using the same averaging method as in step 3.

Step 6: Sort According to Observation IDs
Re-sort all of your data according to your ID numbers (not a requirement, but it makes the data easier to read).

Step 7: Calculate d
In column d, subtract each observation's rank in variable (b) from its rank in variable (a), i.e. d = RANK(a) - RANK(b). Note that if these are summed up, you will always get zero.

Step 8: Calculate d^2
Square each value in d (raise it to the power of two) so that positive and negative differences no longer cancel and the sum isn't zero.
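
Steps 7 and 8 can be checked quickly in Python. This is a small sketch that takes the two rank columns of Table 1 as given; the variable names are chosen here for illustration.

```python
# Ranks copied from Table 1, listed by ID 1..7.
rank_a = [1.5, 5, 5, 7, 3, 1.5, 5]      # RANK(a), Intensity
rank_b = [2, 4.5, 6.5, 4.5, 3, 1, 6.5]  # RANK(b), #Markets

# Step 7: rank difference per observation.
d = [a - b for a, b in zip(rank_a, rank_b)]
# Step 8: squared differences, so positive and negative values no longer cancel.
d_squared = [x ** 2 for x in d]

print(sum(d))          # 0.0
print(sum(d_squared))  # 11.5
```

Both rank columns add up to the same total (1 + 2 + ... + 7 = 28), which is why the differences always sum to zero.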

Step 9: Sum d^2 and Apply the Formula
Sum the values of d^2 and use it in the formula below.

Formula:
$$\text{Spearman's } r = 1 - \frac{6\sum_{i = 1}^n d_i^2}{n(n^2-1)} = 1 - \frac{6 \times 11.5}{7(7^2-1)} \approx 0.79$$
Table 1. Calculating Spearman's r in Excel/Open Office
ID    Intensity(a)  RANK(a)  #Markets(b)  RANK(b)  d     d^2
1     a             1.5      7            2        -0.5  0.25
2     c             5        5            4.5      0.5   0.25
3     c             5        4            6.5      -1.5  2.25
4     d             7        5            4.5      2.5   6.25
5     b             3        6            3        0     0
6     a             1.5      8            1        0.5   0.25
7     c             5        4            6.5      -1.5  2.25
SUM:                                               0     11.5
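
For anyone who wants to double-check the arithmetic, here is a short Python sketch that plugs the ranks from Table 1 into the formula. One caveat that goes beyond the steps above: the 6·Σd² shortcut is exact only when there are no ties. With ties, Spearman's coefficient is defined as Pearson's coefficient computed on the ranks, which is what statistical packages typically report, so the sketch prints both. The helper name `pearson` is just an illustrative choice.

```python
import math

# Ranks copied from Table 1, listed by ID 1..7.
rank_a = [1.5, 5, 5, 7, 3, 1.5, 5]
rank_b = [2, 4.5, 6.5, 4.5, 3, 1, 6.5]
n = len(rank_a)

# Shortcut formula used in the steps above.
sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
shortcut = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


def pearson(x, y):
    """Plain Pearson correlation of two equally long lists."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sx = math.sqrt(sum((xi - mx) ** 2 for xi in x))
    sy = math.sqrt(sum((yi - my) ** 2 for yi in y))
    return cov / (sx * sy)


# Tie-aware version: Pearson's coefficient applied to the (averaged) ranks.
tie_aware = pearson(rank_a, rank_b)

print(round(shortcut, 2))   # 0.79
print(round(tie_aware, 2))  # 0.78
```

In this 7-row example the two values differ only slightly. With 850 companies spread over three or four categories almost every observation is tied, so the gap is likely to be larger and the tie-aware value is the safer one to report.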

Fun Experiments! Plug in and see what happens to the results.
1. What happens if all "d" values are zero?
2. What happens if all "d" values are negative?
3. What happens when the ranks are perfectly dissimilar? And similar?
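
If you would rather experiment in code than in a spreadsheet, the sketch below wraps the formula in a small Python function (the name `spearman_from_ranks` is made up for this example) and tries the two extreme cases from experiment 3, using untied ranks 1 to 7.

```python
def spearman_from_ranks(rank_a, rank_b):
    """Shortcut formula 1 - 6*sum(d^2) / (n(n^2-1)); assumes untied ranks."""
    n = len(rank_a)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


ranks = [1, 2, 3, 4, 5, 6, 7]

# Perfectly similar ranks: every d is zero.
identical = spearman_from_ranks(ranks, ranks)
# Perfectly dissimilar ranks: the second list is the first one reversed.
opposite = spearman_from_ranks(ranks, ranks[::-1])

print(identical)  # 1.0
print(opposite)   # -1.0
```

Swapping in your own rank lists is an easy way to see how the coefficient moves between these two limits.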
 