Coefficient of correlation problem

In summary, the conversation is discussing the coefficient of correlation between the number of black and red balls taken out of a vase containing one red ball, two white balls, and three black balls. The formula for the correlation coefficient is mentioned and the problem of calculating the covariance is discussed. Various methods are suggested, including using the smoothing theorem and expanding the product. The conversation ends with a final question about the probability of getting a red ball in a given "trial."
  • #1
ENgez
a vase contains one red ball, two white balls and three black balls. n balls are taken out of the vase one at a time (each ball is returned to it afterwards). let B denote the number of black balls taken out and R denote the number of red balls taken out. what is the coefficient of correlation between R and B?

well i know that the coefficient of correlation = [itex]\frac{cov(R,B)}{\sigma_R \sigma_B}[/itex]
and that R and B are simply Bernoulli trials with x and n-x trials respectively.

my problem is calculating the covariance = E[RB] - E[R]E[B] (E is the mean).
E[R] and E[B] are straightforward but E[RB] is a bit trickier for me.

i thought of using the definition of the mean with a multinomial vector for the joint probability of R and B and summing over 0<x<n, but is there an easier way?
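For reference, the brute-force sum described above can be sketched in a few lines. This is only an illustration under the assumption that each draw is red, black or white with probabilities 1/6, 1/2 and 1/3; the function name `expected_rb` is made up for this sketch and does not come from the thread.

```python
from fractions import Fraction
from math import comb

def expected_rb(n, p_red=Fraction(1, 6), p_black=Fraction(1, 2)):
    """E[RB] as the sum of r*b*P(R=r, B=b) over the multinomial joint distribution."""
    p_white = 1 - p_red - p_black
    total = Fraction(0)
    for r in range(n + 1):
        for b in range(n - r + 1):
            w = n - r - b  # the remaining draws are white
            # multinomial coefficient n! / (r! b! w!)
            ways = comb(n, r) * comb(n - r, b)
            prob = ways * p_red**r * p_black**b * p_white**w
            total += r * b * prob
    return total

for n in (2, 3, 5):
    print(n, expected_rb(n))
```

Exact fractions are used so the result can be compared term by term with any closed form proposed later.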
 
  • #2
It would be easy if it weren't for the white balls... As it is, I don't see any way of doing it short of writing out the sum.
 
  • #3
The number of red balls is [itex]R = \sum_{i=1}^n I[\text{ball } i \text{ is red}][/itex], where I is the indicator function, so to calculate E[RB] you just need to expand the product.
 
  • #4
bpet, i am sorry, but can you elaborate?

and i thought of using the smoothing theorem,
since it is easier to think in terms of conditional probability in this problem.

E[RB]=E[E[RB|B]]=E[B*E[R|B]]=E[B*(n-B)*P(R)]

=P(R)*(nE[B]-E[B^2])

B is distributed binomially with n trials and probability P(B).

the problem here is that i get a different answer for E[E[RB|R]]
what am i doing wrong?
 
  • #5
sorry for the double post, but i believe i understood the part about expanding the product :).

tell me if this is correct

[itex]R = R_1 + R_2 + R_3 + \dots + R_n[/itex]

where:
[itex] R_i = 1 , p=1/6 [/itex]
[itex] R_i = 0 , p=5/6 [/itex]

and likewise for B

[itex]E[RB] = E[(R_1+R_2+\dots+R_n)(B_1+B_2+\dots+B_n)] = (n^2-n)*P(B)*P(R)[/itex]
[itex]R_i B_i[/itex] is always 0 because one being 1 implies the other being 0, which leaves the n^2-n terms with i ≠ j, each with mean P(R)*P(B).

[itex]E[RB]-E[R]E[B] = (n^2-n)*P(B)*P(R)-n^2*P(B)*P(R) = -n*P(R)*P(B) = -n/12[/itex]
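To carry the covariance above through to the coefficient itself, here is a small sketch. It assumes R and B are each binomial with n trials (success probabilities 1/6 and 1/2), so their variances are n*p*(1-p); the function names are chosen only for illustration.

```python
from fractions import Fraction
from math import sqrt

def covariance(n, p_red, p_black):
    # the n*(n-1) cross terms with i != j each contribute p_red*p_black;
    # the n terms with i == j vanish because one ball cannot be both colours
    e_rb = n * (n - 1) * p_red * p_black
    return e_rb - (n * p_red) * (n * p_black)

def correlation(n, p_red, p_black):
    # binomial variances n*p*(1-p) for R and B (an assumption of this sketch)
    var_r = n * p_red * (1 - p_red)
    var_b = n * p_black * (1 - p_black)
    return float(covariance(n, p_red, p_black)) / sqrt(float(var_r * var_b))

p_red, p_black = Fraction(1, 6), Fraction(1, 2)
for n in (1, 4, 10):
    print(n, covariance(n, p_red, p_black), correlation(n, p_red, p_black))
```

Under these assumptions the factor n cancels between the covariance and the standard deviations, so the coefficient should come out the same for every n, namely -1/sqrt(5), roughly -0.447.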
 
  • #6
ENgez said:
E[RB]=E[E[RB|B]]=E[B*E[R|B]]=E[B*(n-B)*P(R)]

=P(R)*(nE[B]-E[B^2])


If you change the RHS to E[B*(n-B)*P(R|B)] does that work?
 
  • #7
i don't see how P(R|B) changes anything because (n-B)*P(R) is the mean of R|B (where B=s , 0<s<n).
P(R) is the probability that in a given "trial" (pulling a ball out) you get red, which is independent of the number of black balls you took out.
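A quick numerical check may help settle the disagreement. The sketch below is one reading of the suggestion in #6, not something confirmed in the thread: once B is known, the other n-B draws are known to be non-black, so each of them is red with probability P(red | not black) = (1/6)/(1/2) = 1/3 rather than 1/6. The helper names are hypothetical.

```python
from fractions import Fraction

P_RED, P_BLACK = Fraction(1, 6), Fraction(1, 2)

def tower_given_black(n, p_red_each):
    """E[RB] = p * (n*E[B] - E[B^2]), taking E[R | B] = (n - B) * p."""
    e_b = n * P_BLACK
    e_b2 = n * P_BLACK * (1 - P_BLACK) + e_b**2  # Var(B) + E[B]^2
    return p_red_each * (n * e_b - e_b2)

def tower_given_red(n, p_black_each):
    """E[RB] = p * (n*E[R] - E[R^2]), taking E[B | R] = (n - R) * p."""
    e_r = n * P_RED
    e_r2 = n * P_RED * (1 - P_RED) + e_r**2  # Var(R) + E[R]^2
    return p_black_each * (n * e_r - e_r2)

n = 5
exact = n * (n - 1) * P_RED * P_BLACK                      # from expanding the product
unconditional = tower_given_black(n, P_RED)                # uses P(R) = 1/6
conditional = tower_given_black(n, P_RED / (1 - P_BLACK))  # P(red | not black) = 1/3
other_way = tower_given_red(n, P_BLACK / (1 - P_RED))      # P(black | not red) = 3/5
print(exact, unconditional, conditional, other_way)
```

With the conditional probabilities, conditioning on B and conditioning on R give the same value as the expanded product; with the unconditional P(R) the result comes out at half of it, which would explain why the two orders of conditioning disagreed in #4.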
 

1. What is the coefficient of correlation?

The coefficient of correlation, also referred to as Pearson's correlation coefficient, is a statistical measure that indicates the strength and direction of the linear relationship between two variables.

2. How is the coefficient of correlation calculated?

The coefficient of correlation is calculated by dividing the covariance of two variables by the product of their standard deviations. This results in a value between -1 and 1, with -1 indicating a perfect negative correlation, 1 indicating a perfect positive correlation, and 0 indicating no linear relationship between the variables.

3. What does a coefficient of correlation of 0 mean?

A coefficient of correlation of 0 means that there is no linear relationship between the two variables being compared. However, it is important to note that there could still be a non-linear relationship between the variables, which would not be captured by the coefficient of correlation.

4. What is a good coefficient of correlation?

A good coefficient of correlation depends on the context of the data being analyzed. In general, a coefficient of correlation between 0.7 and 0.9 is considered a strong positive correlation, while a value between -0.7 and -0.9 is considered a strong negative correlation. However, the interpretation of the coefficient of correlation should always be accompanied by further analysis and context.

5. Can the coefficient of correlation be used to determine causation?

No, the coefficient of correlation only measures the strength and direction of the linear relationship between two variables. It does not indicate or prove causation, which requires further analysis and evidence.
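The calculation in item 2 and the caveat in item 3 can be sketched on sample data. This is a minimal illustration using population (divide-by-n) formulas; the function name `pearson` and the data are invented for the example.

```python
from math import sqrt

def pearson(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (sd_x * sd_y)

# perfectly linear, increasing and decreasing
print(pearson([1, 2, 3, 4], [2, 4, 6, 8]))
print(pearson([1, 2, 3, 4], [8, 6, 4, 2]))

# y = x^2 on symmetric x: fully dependent, yet no linear relationship
print(pearson([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4]))
```

The last call shows a pair of variables that are completely determined by each other but still have a coefficient of zero, because the relationship is not linear.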
