MHB Solving Variables Puzzle: Correlation Coefficients & SPSS for N=850

  • Thread starter: Leigh13
  • Tags: Stuck Variables
Leigh13
Hi guys!

I have never used statistics in my life, so now I'm completely lost.

I have only 3 variables -

the year the company was founded (up to 1993, 1994-2004, 2005-now), the technological intensity of their products (that's either a, b, c or d) and the number of foreign markets (that's either 1, 2, 3 or more).

I would like to check the correlation between the year and the tech. intensity, the year and the number of markets, and the intensity and the number of markets.

I understand I'm looking for some sort of correlation, but which coefficient should I check: Spearman or Pearson?

And how do I get the variables into SPSS? n=850. Do I just put 3 columns next to each other?

Thanks so much for help!
 

Welcome to MHB, Leigh13! :)

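All of your variables are ordinal.
That is, their categories have a natural order, but the distances between categories are not defined, so an average, for instance, is meaningless.
Pearson's coefficient relies on those distances, which means it is not appropriate here.
So yes, Spearman, which only uses the ranks, is the way to go.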

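In SPSS you would indeed simply put 3 columns next to each other, with one row per company (so 850 rows), and code each category as a number that preserves its order (e.g. a = 1, b = 2, c = 3, d = 4). Then go to Analyze > Correlate > Bivariate, select the three variables and tick Spearman instead of Pearson.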
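To make the "three columns" layout concrete, here is a minimal Python sketch of how the categories could be turned into ordered numeric codes before typing or importing them. The category labels come from the original post; the numeric codes, the names `YEAR_CODES`, `INTENSITY_CODES`, `MARKET_CODES` and `encode_company`, and the three example rows are all assumptions made up for illustration.

```python
# Hypothetical codings: only the order of the codes matters for Spearman,
# not the actual numbers chosen.
YEAR_CODES = {"up to 1993": 1, "1994-2004": 2, "2005-now": 3}
INTENSITY_CODES = {"a": 1, "b": 2, "c": 3, "d": 4}
MARKET_CODES = {"1": 1, "2": 2, "3 or more": 3}


def encode_company(year_band, intensity, markets):
    """Turn one company's three category labels into one row of numeric codes."""
    return (
        YEAR_CODES[year_band],
        INTENSITY_CODES[intensity],
        MARKET_CODES[markets],
    )


# Made-up example rows, one per company (the real data set would have 850).
raw_rows = [
    ("up to 1993", "a", "3 or more"),
    ("2005-now", "c", "1"),
    ("1994-2004", "b", "2"),
]

encoded = [encode_company(*row) for row in raw_rows]
for row in encoded:
    print(row)  # one line per company: (year code, intensity code, markets code)
```

Each printed tuple corresponds to one row of the data sheet, with the three codes going into the three adjacent columns.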
 
Also, try doing it in Excel! That way, you'll really get an intuitive feel for how Spearman's correlation coefficient works.

Calculating Spearman's r From Scratch
Step 1: Create Your Table
Create columns for the IDs, variables, ranks, d, and d^2 (see table 1 below).

Step 2: Sort According to the First Variable
Sort the entire data set (all columns, so that each row stays together) according to your first variable (Intensity(a)). Make a note of whether you sorted in ascending or descending order. In the example below, I used ascending order and gave the value "a" for the variable Intensity(a) rank 1, value "b" rank 2 and so forth.

Step 3: Assign First Variable Ranks, Deal with Ties
Number the sorted observations 1, 2, 3, ... to give each one a provisional rank for the variable we sorted in step 2 (i.e. Intensity(a)). Now, you may notice that some of the observations have the same variable value, but different ranks. These are called ties, and they need to receive new, equal ranks for the calculations to work.

e.g. observations with IDs 2, 3, and 7 all share a value of "c" in the Intensity(a) variable, but hold the dissimilar ranks 4, 5, and 6. Their new equal rank will be the average: (6 + 5 + 4)/3 = 5.
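If you want to check your spreadsheet ranks, steps 2 and 3 can be sketched in a few lines of Python. This is just one possible way of doing it; the function name `average_ranks` is my own, and the data are the Intensity(a) values from Table 1.

```python
def average_ranks(values):
    """Rank values in ascending order (smallest = rank 1).

    Tied values all receive the mean of the positions they occupy
    in the sorted list.
    """
    # Indices of the observations, ordered by their value (step 2).
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run for as long as the next sorted value is tied.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        # Mean of the 1-based positions start+1 .. end+1 (step 3).
        mean_rank = (start + 1 + end + 1) / 2
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


# Intensity(a) for IDs 1..7 in Table 1; letters sort alphabetically, so "a" is lowest.
intensity = ["a", "c", "c", "d", "b", "a", "c"]
rank_a = average_ranks(intensity)
print(rank_a)  # [1.5, 5.0, 5.0, 7.0, 3.0, 1.5, 5.0]
```

The three tied observations (IDs 2, 3 and 7) all end up with rank 5, matching the RANK(a) column of Table 1.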

Step 4: Sort According to the Second Variable
Sort your data according to the next variable (#Markets(b)). Make sure that the ranking system follows the same pattern as your first variable. In our case, "a" was considered high and received the rank of one. Therefore, the observation with 8 markets will receive the rank of 1, the observation with 7 markets will receive rank 2 and so forth.

Step 5: Assign Second Variable Ranks, Deal with Ties
Assign ranks to the second variable, then give tied observations new, equal ranks using the same averaging method as in step 3.
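For the second variable the largest value gets rank 1, so here is a small Python sketch that ranks in descending order. Instead of sorting, it uses a counting shortcut: an observation's rank is the number of strictly larger values plus the middle of its own tie group. The function name `descending_midranks` is my own, and the data are the #Markets(b) values from Table 1.

```python
def descending_midranks(values):
    """Rank values so that the largest gets rank 1; ties share their average rank."""
    ranks = []
    for v in values:
        greater = sum(1 for other in values if other > v)
        equal = sum(1 for other in values if other == v)  # counts v itself too
        # The tie group occupies positions greater+1 .. greater+equal;
        # their mean is greater + (equal + 1) / 2.
        ranks.append(greater + (equal + 1) / 2)
    return ranks


# #Markets(b) for IDs 1..7 in Table 1.
markets = [7, 5, 4, 5, 6, 8, 4]
rank_b = descending_midranks(markets)
print(rank_b)  # [2.0, 4.5, 6.5, 4.5, 3.0, 1.0, 6.5]
```

This compares every value with every other one, which is fine for a handful of rows or even 850, but a sort-based approach scales better for very large data sets.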

Step 6: Sort According to Observation IDs
Re-sort all of your data according to your ID numbers (not a requirement, but it makes the data easier to read).

Step 7: Calculate d
In column d, subtract each observation's rank in variable (b) from its rank in variable (a), i.e. d = RANK(a) - RANK(b). Note that if these differences are summed up, you will always get zero.

Step 8: Calculate d^2
Square each value of d. Squaring removes the signs, which makes sure that the sum isn't zero.
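Steps 7 and 8 are easy to double-check outside the spreadsheet. A minimal Python sketch, using the RANK(a) and RANK(b) columns copied from Table 1 and the sign convention used there (d = RANK(a) - RANK(b)):

```python
# Ranks for IDs 1..7, copied from Table 1.
rank_a = [1.5, 5, 5, 7, 3, 1.5, 5]
rank_b = [2, 4.5, 6.5, 4.5, 3, 1, 6.5]

# Step 7: difference between the two ranks of each observation.
d = [a - b for a, b in zip(rank_a, rank_b)]

# Step 8: square each difference so positives and negatives no longer cancel.
d_squared = [x ** 2 for x in d]

print(d)               # [-0.5, 0.5, -1.5, 2.5, 0, 0.5, -1.5]
print(sum(d))          # 0.0
print(sum(d_squared))  # 11.5
```

The two sums match the SUM row of Table 1.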

Step 9: Sum d^2 and Apply the Formula
Sum the values of d^2 and use the sum in the formula below.

Formula:
$$\text{Spearman's } r = 1 - \frac{6\sum_{i = 1}^{n} d_i^2}{n(n^2-1)} = 1 - \frac{6 \times 11.5}{7(7^2-1)} \approx 0.79$$
Table 1. Calculating Spearman's r in Excel/Open Office
[table="width: 500 , class: grid"]
[tr]
[td]ID[/td]
[td]Intensity(a)[/td]
[td]RANK(a)[/td]
[td]#Markets(b)[/td]
[td]RANK(b)[/td]
[td]d[/td]
[td]d^2[/td]
[/tr]
[tr]
[td]1[/td]
[td]a[/td]
[td]1.5[/td]
[td]7[/td]
[td]2[/td]
[td]-0.5[/td]
[td]0.25[/td]
[/tr]
[tr]
[td]2[/td]
[td]c[/td]
[td]5[/td]
[td]5[/td]
[td]4.5[/td]
[td]0.5[/td]
[td]0.25[/td]
[/tr]
[tr]
[td]3[/td]
[td]c[/td]
[td]5[/td]
[td]4[/td]
[td]6.5[/td]
[td]-1.5[/td]
[td]2.25[/td]
[/tr]
[tr]
[td]4[/td]
[td]d[/td]
[td]7[/td]
[td]5[/td]
[td]4.5[/td]
[td]2.5[/td]
[td]6.25[/td]
[/tr]
[tr]
[td]5[/td]
[td]b[/td]
[td]3[/td]
[td]6[/td]
[td]3[/td]
[td]0[/td]
[td]0[/td]
[/tr]
[tr]
[td]6[/td]
[td]a[/td]
[td]1.5[/td]
[td]8[/td]
[td]1[/td]
[td]0.5[/td]
[td]0.25[/td]
[/tr]
[tr]
[td]7[/td]
[td]c[/td]
[td]5[/td]
[td]4[/td]
[td]6.5[/td]
[td]-1.5[/td]
[td]2.25[/td]
[/tr]
[tr]
[td]SUM:[/td]
[td][/td]
[td][/td]
[td][/td]
[td][/td]
[td]0[/td]
[td]11.5[/td]
[/tr]
[/table]
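To check the final number, here is a short Python sketch that applies the formula to the ranks in Table 1. It also computes Pearson's correlation on the ranks themselves, which is the general definition of Spearman's coefficient; when there are ties, as in this example, the shortcut formula is only an approximation of it, so software output may differ slightly from the hand calculation. The helper name `pearson` is my own.

```python
from math import sqrt

# Ranks for IDs 1..7, copied from Table 1.
rank_a = [1.5, 5, 5, 7, 3, 1.5, 5]
rank_b = [2, 4.5, 6.5, 4.5, 3, 1, 6.5]
n = len(rank_a)

# Shortcut formula: 1 - 6 * sum(d^2) / (n * (n^2 - 1)).
sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
r_shortcut = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


def pearson(x, y):
    """Plain Pearson correlation of two equally long lists of numbers."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    spread_x = sqrt(sum((xi - mean_x) ** 2 for xi in x))
    spread_y = sqrt(sum((yi - mean_y) ** 2 for yi in y))
    return cov / (spread_x * spread_y)


# Pearson applied to the ranks: this handles ties exactly.
r_tie_aware = pearson(rank_a, rank_b)

print(round(r_shortcut, 2))   # 0.79
print(round(r_tie_aware, 2))  # 0.78
```

With only a few ties the two values are close; the more ties there are (and with 850 companies in 3 or 4 categories there will be many), the more it is worth trusting the tie-aware version.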

Fun Experiments! Plug in different values and see what happens to the results.
1. What happens if all "d" values are zero?
2. What happens if all "d" values are negative?
3. What happens when the ranks are perfectly dissimilar? And similar?
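If you would rather try these experiments in code than in the spreadsheet, here is a small Python sketch to play with. The function name `spearman_shortcut` and the five made-up ranks are my own, and it assumes untied ranks so the shortcut formula is exact.

```python
def spearman_shortcut(rank_a, rank_b):
    """Spearman's r via 1 - 6 * sum(d^2) / (n * (n^2 - 1)); assumes no ties."""
    n = len(rank_a)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


ranks = [1, 2, 3, 4, 5]

# Identical rankings: every d is zero.
r_same = spearman_shortcut(ranks, ranks)

# Perfectly opposite rankings: the first list reversed.
r_opposite = spearman_shortcut(ranks, ranks[::-1])

# The d values for the opposite case, to inspect their signs and their sum.
d_opposite = [a - b for a, b in zip(ranks, ranks[::-1])]

print(r_same)           # 1.0
print(r_opposite)       # -1.0
print(d_opposite)       # [-4, -2, 0, 2, 4]
print(sum(d_opposite))  # 0
```

Try swapping just two entries of one list and watch how far the result moves away from 1.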
 