How do two random variables correlate and remain independent?

  • Thread starter: archaic
  • Tags: Joint
AI Thread Summary
The discussion examines the correlation and independence of two random variables, X1 and X2. The calculated correlation coefficient is approximately 0.63, indicating a fairly strong positive correlation. The variables are not independent, since the joint probability at (0, 0) does not equal the product of the corresponding marginal probabilities. The conversation points out that conditional probabilities rising and then falling for a fixed X1 do not by themselves rule out a linear relationship. A proposed method for assessing linearity involves minimizing a weighted sum of squares to find a best-fit line, for which a value of R² = 0.195 was reported. Overall, the analysis emphasizes the nuanced relationship between correlation and independence in random variables.
archaic
Homework Statement
##X_1## is the number of customers in one queue, and ##X_2## is the number of customers in a second, faster queue (see the figure for the joint pmf).
1) What's the probability that:
a) both queues are empty?
b) both queues are of the same length?
c) the total number of customers in the two queues is 4?
d) the faster line has more than 1 customer, given that the other is empty?

2) Find the correlation between ##X_1## and ##X_2##. Comment on the existence and strength of a linear relation between ##X_1## and ##X_2##.

3) Are the two random variables independent? Why?
Relevant Equations
n/a
[Attachment 1.PNG: joint pmf of ##X_1## and ##X_2##]

$$ \begin{array}{lllll}X_1&0&1&2&3\\f_{X_1}&0.4&0.3&0.25&0.05\end{array}\,|\,\begin{array}{llllll}X_2&0&1&2&3&4\\f_{X_2}&0.05&0.2&0.25&0.2&0.3\end{array}$$
1a) ##p=0.05##
1b) ##p=0.05+0.05+0+0=0.1##
1c) ##p=0.05+0.05+0+0=0.1##
1d) ##p=0.1+0.05+0.05=0.2##
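The four sums above can be checked with a short script. This is only a sketch: the attached figure is not reproduced in the text, so the table `JOINT` below is an assumption, reconstructed from the terms that appear in the sums in this post together with the marginals. Note that the sum in 1d) is the joint probability ##P(X_1=0, X_2>1)##; the conditional probability divides it by ##f_{X_1}(0)##.

```python
# Joint pmf f(x1, x2). ASSUMPTION: reconstructed from the sums and marginals
# quoted in the post, not copied from the original figure.
# Rows: x1 = 0..3, columns: x2 = 0..4.
JOINT = [
    [0.05, 0.15, 0.10, 0.05, 0.05],
    [0.00, 0.05, 0.15, 0.05, 0.05],
    [0.00, 0.00, 0.00, 0.10, 0.15],
    [0.00, 0.00, 0.00, 0.00, 0.05],
]


def prob(event):
    """Sum f(x1, x2) over all cells where event(x1, x2) is true."""
    return sum(
        p
        for x1, row in enumerate(JOINT)
        for x2, p in enumerate(row)
        if event(x1, x2)
    )


p_a = prob(lambda x1, x2: x1 == 0 and x2 == 0)        # both queues empty
p_b = prob(lambda x1, x2: x1 == x2)                   # same length
p_c = prob(lambda x1, x2: x1 + x2 == 4)               # four customers in total
p_d_joint = prob(lambda x1, x2: x1 == 0 and x2 > 1)   # X1 = 0 and X2 > 1
p_d = p_d_joint / prob(lambda x1, x2: x1 == 0)        # P(X2 > 1 | X1 = 0)

print(round(p_a, 4), round(p_b, 4), round(p_c, 4), round(p_d_joint, 4), round(p_d, 4))
```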

2) ##\mu_{X_1}=0\times0.4+1\times0.3+2\times0.25+3\times0.05=0.95##
##\sigma^2_{X_1}=0^2\times0.4+1^2\times0.3+2^2\times0.25+3^2\times0.05-0.95^2= 0.8475 ##

##\mu_{X_2}=0\times0.05+1\times0.2+2\times0.25+3\times0.2+4\times0.3=2.5##
##\sigma^2_{X_2}=0^2\times0.05+1^2\times0.2+2^2\times0.25+3^2\times0.2+4^2\times0.3-2.5^2=1.55##

##\mathrm E[X_1X_2]=1\times1\times0.05+1\times2\times0.15+1\times3\times0.05+1\times4\times0.05+2\times3\times0.1+2\times4\times0.15+3\times4\times0.05=3.1##
##\mathrm{cov}(X_1,X_2)=E[X_1X_2]-\mu_{X_1}\mu_{X_2}=3.1-0.95\times2.5=0.725##
##\rho_{X_1X_2}=\frac{\mathrm{cov}(X_1,X_2)}{\sigma_{X_1}\sigma_{X_2}}=\frac{0.725}{\sqrt{0.8475\times1.55}}=0.632560842248##
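As a numerical cross-check of the moments and the correlation, here is a minimal sketch. The table `JOINT` is an assumption: it is reconstructed from the terms in the ##\mathrm E[X_1X_2]## sum and the marginals above, since the figure itself is not reproduced in the text. The helper names are illustrative only.

```python
from math import sqrt

# ASSUMPTION: joint pmf reconstructed from the post's sums and marginals.
# Rows: x1 = 0..3, columns: x2 = 0..4.
JOINT = [
    [0.05, 0.15, 0.10, 0.05, 0.05],
    [0.00, 0.05, 0.15, 0.05, 0.05],
    [0.00, 0.00, 0.00, 0.10, 0.15],
    [0.00, 0.00, 0.00, 0.00, 0.05],
]

# Marginals: row sums give f_X1, column sums give f_X2.
f1 = [sum(row) for row in JOINT]
f2 = [sum(col) for col in zip(*JOINT)]


def mean(pmf):
    """Mean of a pmf whose support is 0, 1, 2, ..."""
    return sum(x * p for x, p in enumerate(pmf))


def variance(pmf):
    """Variance via E[X^2] - (E[X])^2."""
    m = mean(pmf)
    return sum(x * x * p for x, p in enumerate(pmf)) - m * m


mu1, mu2 = mean(f1), mean(f2)
var1, var2 = variance(f1), variance(f2)
e12 = sum(x1 * x2 * p for x1, row in enumerate(JOINT) for x2, p in enumerate(row))
cov = e12 - mu1 * mu2
rho = cov / sqrt(var1 * var2)

print(round(mu1, 4), round(var1, 4), round(mu2, 4), round(var2, 4))
print(round(e12, 4), round(cov, 4), round(rho, 4))
```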

Since ##\rho_{X_1X_2}>0.5##, the two random variables are strongly correlated.
The probability for a fixed ##X_1## increases then decreases as ##X_2## varies, so there doesn't seem to exist a linear relationship between the two variables. Correct?

3) They are not independent because ##f_{X_1X_2}(0,0)=0.05\neq f_{X_1}(0)f_{X_2}(0)=0.02##
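One counterexample is enough to rule out independence, but the same comparison can be run over every cell. The sketch below assumes the joint pmf reconstructed from the sums and marginals in this post (the figure is not reproduced here), and the tolerance `1e-12` is an arbitrary choice to absorb floating-point rounding.

```python
# ASSUMPTION: joint pmf reconstructed from the post's sums and marginals.
# Rows: x1 = 0..3, columns: x2 = 0..4.
JOINT = [
    [0.05, 0.15, 0.10, 0.05, 0.05],
    [0.00, 0.05, 0.15, 0.05, 0.05],
    [0.00, 0.00, 0.00, 0.10, 0.15],
    [0.00, 0.00, 0.00, 0.00, 0.05],
]

f1 = [sum(row) for row in JOINT]          # marginal pmf of X1
f2 = [sum(col) for col in zip(*JOINT)]    # marginal pmf of X2

# Independence requires f(x1, x2) = f1(x1) * f2(x2) in EVERY cell.
mismatches = [
    (x1, x2, p, f1[x1] * f2[x2])
    for x1, row in enumerate(JOINT)
    for x2, p in enumerate(row)
    if abs(p - f1[x1] * f2[x2]) > 1e-12
]
independent = not mismatches

print("independent:", independent)
x1, x2, joint_p, product = mismatches[0]
print(f"first mismatch at ({x1}, {x2}): joint={joint_p:.3f}, product={product:.3f}")
```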
Anything amiss? Thanks!
 
1 looks good to me. I didn't check all the numbers for your correlation calculation but the final result seems reasonable. About the linear relation I think you missed the point. Suppose we have random variables Y and Z, where ##Z=Y\pm 1##. Then for a fixed Y, the probability of various Z's will increase then decrease as you scan the table, but Y and Z are clearly linearly related.

I agree with your answer for 3; it might be worth giving a real-life description of why they are not independent.
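To make the ##Z=Y\pm 1## example concrete, here is a small sketch under assumptions that are not in the post: ##Y## is taken to be uniform on 0..4, and the sign is a fair coin independent of ##Y##. For a fixed ##Y## the conditional pmf of ##Z## is not monotone in ##Z##, yet the conditional mean of ##Z## is exactly ##Y## and the correlation is high.

```python
from math import sqrt

# HYPOTHETICAL setup: Y uniform on {0,...,4}; Z = Y + S with S = +1 or -1,
# each with probability 1/2, independent of Y.
joint = {}
for y in range(5):
    for s in (-1, 1):
        key = (y, y + s)
        joint[key] = joint.get(key, 0.0) + (1 / 5) * (1 / 2)


def expect(g):
    """Expectation of g(Y, Z) under the joint pmf."""
    return sum(p * g(y, z) for (y, z), p in joint.items())


mu_y = expect(lambda y, z: y)
mu_z = expect(lambda y, z: z)
var_y = expect(lambda y, z: y * y) - mu_y ** 2
var_z = expect(lambda y, z: z * z) - mu_z ** 2
cov_yz = expect(lambda y, z: y * z) - mu_y * mu_z
rho_yz = cov_yz / sqrt(var_y * var_z)

# Conditional pmf of Z given Y = 2: mass only at z = 1 and z = 3.
cond_z_given_2 = {z: p / (1 / 5) for (y, z), p in joint.items() if y == 2}
# Conditional mean E[Z | Y = 2], which equals 2 because E[S] = 0.
mean_z_given_2 = sum(z * p for z, p in cond_z_given_2.items())

print(cond_z_given_2, round(mean_z_given_2, 4), round(rho_yz, 4))
```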
 
Office_Shredder said:
1 looks good to me. I didn't check all the numbers for your correlation calculation but the final result seems reasonable. About the linear relation I think you missed the point. Suppose we have random variables Y and Z, where ##Z=Y\pm 1##. Then for a fixed Y, the probability of various Z's will increase then decrease as you scan the table, but Y and Z are clearly linearly related.

I agree with your answer for 3; it might be worth giving a real-life description of why they are not independent.
Thank you!
Right. Here the correlation is fairly strong, so the relation is reasonably close to linear, and it is positive, so as one RV increases the other tends to increase too.
It also makes sense given the scenario. As the slower queue fills up, people will want to move to the faster one to balance the waiting time (this also serves as an answer to your last sentence); anyone who can join ##X_2## would do so. However, nobody would switch to ##X_2## if it is already fairly full and ##X_1## is noticeably shorter.
 
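I would think the right way to measure linearity is to find the a, b, c that minimise ##\sum_{i,j} p_{i,j}(ax_i+by_j+c)^2##, where ##x_i## and ##y_j## are the possible values of the two variables and ##p_{i,j}## is their joint probability, and then look at the ##R^2## value. But maybe that's overkill here.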
 
Further to post #3, I realized I needed to fix c at a nonzero value, since otherwise a = b = c = 0 trivially minimises the sum. I chose c = 1.
This led to the best fit being ##0.0842X_1-0.354X_2+1=0##, giving ##R^2=0.195##, which seems reasonably linear.
Note that this is a symmetric fit. If you want to predict ##X_1## from ##X_2##, or vice versa, the best-fit line will be a bit different in each case, and the two regression lines will also differ from each other.
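The fit with c fixed at 1 can be reproduced by solving the two normal equations obtained from setting the derivatives with respect to a and b to zero. This is a sketch under two assumptions: the joint pmf is the one reconstructed from the sums and marginals in the first post, and the objective is read exactly as written in post #3. Under that reading the minimised value of the objective comes out near 0.195, matching the quoted figure; how the poster actually computed ##R^2## is not stated. The two ordinary regression slopes are included to illustrate the last remark.

```python
# ASSUMPTION: joint pmf reconstructed from the first post's sums and marginals.
# Rows: x1 = 0..3, columns: x2 = 0..4.
JOINT = [
    [0.05, 0.15, 0.10, 0.05, 0.05],
    [0.00, 0.05, 0.15, 0.05, 0.05],
    [0.00, 0.00, 0.00, 0.10, 0.15],
    [0.00, 0.00, 0.00, 0.00, 0.05],
]


def expect(g):
    """Expectation of g(X1, X2) under the joint pmf."""
    return sum(p * g(x1, x2) for x1, row in enumerate(JOINT) for x2, p in enumerate(row))


m1 = expect(lambda x1, x2: x1)
m2 = expect(lambda x1, x2: x2)
m11 = expect(lambda x1, x2: x1 * x1)
m12 = expect(lambda x1, x2: x1 * x2)
m22 = expect(lambda x1, x2: x2 * x2)

# Minimise E[(a*X1 + b*X2 + 1)^2]. Normal equations:
#   m11*a + m12*b = -m1
#   m12*a + m22*b = -m2
# solved here with Cramer's rule.
det = m11 * m22 - m12 * m12
a = (-m1 * m22 + m2 * m12) / det
b = (-m2 * m11 + m1 * m12) / det
objective = expect(lambda x1, x2: (a * x1 + b * x2 + 1) ** 2)

# Ordinary least-squares slopes for predicting one variable from the other.
cov = m12 - m1 * m2
slope_x2_on_x1 = cov / (m11 - m1 * m1)
slope_x1_on_x2 = cov / (m22 - m2 * m2)

print(round(a, 4), round(b, 4), round(objective, 4))
print(round(slope_x2_on_x1, 4), round(slope_x1_on_x2, 4))
```

The product of the two regression slopes equals ##\rho^2##, which ties this back to the correlation computed in the first post.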
 
I forgot to divide by ##f_{X_1}(0)## in d), so the answer should be ##0.2/0.4=0.5##. :/
 