MHB Visual illustration of Pearson correlation coefficient r

The discussion focuses on the calculation of the Pearson correlation coefficient using a sample dataset with five data points. A visual illustration was created to depict the relationship between the variables x and y, including color-coded mean lines for clarity. The covariance formula and the standard deviation calculations for both x and y were correctly applied in the context of the diagram. It was confirmed that the interpretation of the correlation coefficient and its formula were accurate. The formula for r can be simplified, emphasizing the relationship between the covariance and the standard deviations of the datasets.
dhiraj
I have created a visual illustration based on my understanding of the Pearson correlation coefficient, and I would like to know whether that understanding is correct.

Say I have a sample with 5 data points:-

x y
8 6
16 8
20 16
28 12
32 20

My goal is to calculate the Pearson correlation coefficient between x and y.

This is what the diagram I created looks like:-

View attachment 6472

I have done appropriate color coding.

So in this case the covariance between x and y is:-

[math]cov(x,y) = \frac {\sum d_x d_y}{n-1} [/math]

[math]d_x[/math] and [math]d_y[/math] are the deviations (not standard deviation) from [math]\bar{x}[/math] and [math]\bar{y}[/math] respectively, these mean lines are shown in the diagram (red line for [math]\bar{x}[/math] and the green line for [math]\bar{y}[/math]).
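As a rough check of the covariance formula on the five sample points, here is a minimal Python sketch. The variable names (`d_x`, `d_y`, `products`, `cov_xy`) are chosen here for illustration only, and the code assumes the sample divisor [math]n-1[/math] used above.

```python
# Sample data from the post
x = [8, 16, 20, 28, 32]
y = [6, 8, 16, 12, 20]
n = len(x)

# Means: these correspond to the two mean lines in the diagram
x_bar = sum(x) / n
y_bar = sum(y) / n

# Signed deviations of each point from the means
d_x = [xi - x_bar for xi in x]
d_y = [yi - y_bar for yi in y]

# A product d_x * d_y is positive when the point lies on the same side
# of both mean lines, and negative when it lies on opposite sides
products = [dx * dy for dx, dy in zip(d_x, d_y)]

# Sample covariance with the n - 1 divisor
cov_xy = sum(products) / (n - 1)

print("x_bar =", x_bar, "y_bar =", y_bar)
print("cov(x, y) =", cov_xy)
```

For this data the means are 20.8 and 12.4. Two of the five products come out negative (the points (20, 16) and (28, 12)), but the sum is still positive, which is why the covariance is positive.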

Pearson correlation coefficient [math] r = \frac{cov(x,y)}{S_x S_y} [/math]

Based on the diagram, standard deviations of x and y are:-

[math]S_x = \sqrt{ \frac{\sum d_x^2}{n-1} }[/math]

[math]S_y = \sqrt{ \frac{\sum d_y^2}{n-1} }[/math]
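To see these definitions in action, the sketch below computes [math]S_x[/math], [math]S_y[/math] and then [math]r[/math] for the sample data. The helper names (`deviations`, `sample_std`) are hypothetical and written just for this illustration; nothing here relies on a statistics library.

```python
import math

# Sample data from the post
x = [8, 16, 20, 28, 32]
y = [6, 8, 16, 12, 20]
n = len(x)

def deviations(values):
    """Return the signed deviations of each value from the mean."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def sample_std(values):
    """Sample standard deviation with the n - 1 divisor."""
    d = deviations(values)
    return math.sqrt(sum(di * di for di in d) / (len(values) - 1))

d_x = deviations(x)
d_y = deviations(y)

s_x = sample_std(x)
s_y = sample_std(y)

# Sample covariance, then r = cov(x, y) / (S_x * S_y)
cov_xy = sum(dx * dy for dx, dy in zip(d_x, d_y)) / (n - 1)
r = cov_xy / (s_x * s_y)

print("S_x =", s_x)
print("S_y =", s_y)
print("r =", r)
```

With these five points, r comes out at roughly 0.83, a fairly strong positive linear relationship.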

So replacing these in the formula for the correlation coefficient we get:-
[math] r = \frac {\sum d_x d_y} { (n-1) \sqrt{ \frac{\sum d_x^2}{n-1} } \sqrt{ \frac{\sum d_y^2}{n-1} } } [/math]

Is this interpretation correct with respect to the diagram I have shown? I know that the signs of [math]d_x[/math] and [math]d_y[/math] depend on which side of [math]\bar{x}[/math] and [math]\bar{y}[/math] the values [math]x[/math] and [math]y[/math] fall.
 

Attachments

  • Correlation.png
Hi dhiraj!

It's all correct.
And note that the formula for $r$ can be simplified to:
$$ r = \frac {\sum d_x d_y} {\sqrt{ \sum d_x^2 } \sqrt{ \sum d_y^2 }}$$
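The simplification can be spelled out in one step, using only the definitions of $S_x$ and $S_y$ given in the question. Each square root contributes a factor of $1/\sqrt{n-1}$, so the two together contribute $1/(n-1)$, which cancels the explicit $(n-1)$ in the denominator:
$$ r = \frac {\sum d_x d_y} { (n-1) \cdot \frac{1}{\sqrt{n-1}} \sqrt{\sum d_x^2} \cdot \frac{1}{\sqrt{n-1}} \sqrt{\sum d_y^2} } = \frac {\sum d_x d_y} {\sqrt{ \sum d_x^2 } \sqrt{ \sum d_y^2 }}$$
One consequence is that $r$ does not depend on whether $n$ or $n-1$ is used as the divisor, as long as the same divisor appears in the covariance and in both standard deviations. For the five sample points, $\sum d_x d_y = 182.4$, $\sum d_x^2 = 364.8$ and $\sum d_y^2 = 131.2$, which gives
$$ r = \frac{182.4}{\sqrt{364.8}\,\sqrt{131.2}} \approx 0.83$$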
 
