Correlation, simple formula, meaning

AI Thread Summary
The discussion focuses on understanding the correlation formula and the logic behind its steps, particularly the multiplication of variables in the calculation. The user seeks clarity on why the algorithm is structured as it is, especially regarding the significance of multiplying the differences from the mean. It is noted that the numerator represents covariance, while the denominator involves standard deviations, but the user is looking for a more intuitive mathematical explanation. The conversation emphasizes the need for a foundational knowledge of statistics to fully grasp these concepts. Resources for further learning are also shared to assist in the user's self-education.
ducmod

Homework Statement


Hello!

Here is the mathisfun explanation of the correlation formula, quoted step by step; the lines starting with ## are my own understanding or questions:

Let us call the two sets of data "x" and "y" (in our case Temperature is x and Ice Cream Sales is y):

Step 1: Find the mean of x, and the mean of y

Step 2: Subtract the mean of x from every x value (call them "a"), do the same for y (call them "b")
## with step 2 we compute how each variable differs from the mean

Step 3: Calculate: a × b, a² and b² for every value

## here I come to the point where I need help: I understand that we have to square each value from step 2 (a and b) to avoid negative numbers;

## but I don't understand the meaning (the logic, the why) of multiplying the values from step 2, a × b

Step 4: Sum up a × b, sum up a² and sum up b²

Step 5: Divide the sum of a × b by the square root of [(sum of a²) × (sum of b²)]

## in step 5 again I don't understand the logic of the multiplication: what does this multiplication mean? And then the division?
## usually, division shows how many times the divisor fits into the dividend, or a percentage.
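
The five steps above can be sketched in Python. This is only an illustrative sketch: the function name `correlation` and the sample numbers are made up for the example and are not part of the quoted explanation.

```python
import math

def correlation(xs, ys):
    """Follow the five quoted steps literally (hypothetical helper)."""
    n = len(xs)
    # Step 1: mean of x and mean of y
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Step 2: differences from the mean ("a" for x, "b" for y)
    a = [x - mean_x for x in xs]
    b = [y - mean_y for y in ys]
    # Steps 3 and 4: a*b, a^2 and b^2 for every value, then sum each
    sum_ab = sum(ai * bi for ai, bi in zip(a, b))
    sum_a2 = sum(ai * ai for ai in a)
    sum_b2 = sum(bi * bi for bi in b)
    # Step 5: divide by the square root of (sum of a^2) * (sum of b^2)
    return sum_ab / math.sqrt(sum_a2 * sum_b2)

# Made-up data: y is exactly 2 * x, a perfect increasing straight line
print(correlation([1, 2, 3, 4, 5], [2, 4, 6, 8, 10]))  # prints 1.0
```

Each product a × b is positive when x and y lie on the same side of their means and negative when they lie on opposite sides, so the sum in step 4 grows when the two data sets tend to move together. The division in step 5 rescales that sum so the result always falls between -1 and 1.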

Thank you!

Homework Equations

The Attempt at a Solution

 
What you are quoting is an algorithm, a recipe for finding the linear correlation between two data sets. As for why the algorithm is the way it is, and what the numbers mean: wait until you have a basic knowledge of statistics.
 
Svein said:
What you are quoting is an algorithm, a recipe for finding the linear correlation between two data sets. As for why the algorithm is the way it is, and what the numbers mean: wait until you have a basic knowledge of statistics.
Thank you. You are right that I need statistics knowledge, and I am moving towards it.
I also understand that it's an algorithm. I even know that the numerator reflects the covariance, and that the denominator contains a product of standard deviations.
But my question is not about statistical terms or their usage; it is about the meaning and logic of this multiplication. I assume there is some simple mathematical logic that I don't understand.
I am learning on my own.
Thank you!
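
The remark that the numerator reflects the covariance and the denominator the standard deviations can be written out as below. This is a sketch assuming the population (divide by n) convention; the index i and the symbols \bar{x}, \bar{y}, \sigma_x, \sigma_y are notation introduced here, not taken from the quoted steps.

```latex
r = \frac{\sum_i a_i b_i}{\sqrt{\sum_i a_i^2 \,\sum_i b_i^2}}
  = \frac{\frac{1}{n}\sum_i a_i b_i}
         {\sqrt{\frac{1}{n}\sum_i a_i^2}\,\sqrt{\frac{1}{n}\sum_i b_i^2}}
  = \frac{\operatorname{cov}(x, y)}{\sigma_x \, \sigma_y},
\qquad a_i = x_i - \bar{x}, \quad b_i = y_i - \bar{y}
```

The factors of 1/n cancel between numerator and denominator, which is why the quoted steps never divide by n.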
 