MHB Predict Z-Score for Y Given X at 30th Percentile

  • Thread starter: dlee
dlee
Consider two random variables X,Y whose correlation is ρ = 0.7 (and the joint PMF is football shaped). Predict the z-score for Y if you observe that X is at the 30th percentile (assuming X ~ N(4,4)).

The solution to this problem is -0.364, but I'm not sure how to approach this answer.
 
Re: Correlation?

dlee said:
Consider two random variables X,Y whose correlation is ρ = 0.7 (and the joint PMF is football shaped). Predict the z-score for Y if you observe that X is at the 30th percentile (assuming X ~ N(4,4)).

The solution to this problem is -0.364, but I'm not sure how to approach this answer.

If we assume a bivariate normal distribution, we "expect" the relation:
$$y(x) = \text{sgn}(\rho) \frac {\sigma_Y}{\sigma_X} (x - \mu_X) + \mu_Y$$

With X at the 30th percentile, that means $z_X = \frac{x - \mu_X}{\sigma_X} = \text{invNorm}(0.30) = -0.524$.

In other words, the z-score for Y is
$$z_Y = \frac{y - \mu_Y}{\sigma_Y} = \text{sgn}(\rho) z_X = -0.524$$

I don't know how they got to -0.364.
 
Re: Correlation?

I like Serena said:
If we assume a bivariate normal distribution, we "expect" the relation:
$$y(x) = \text{sgn}(\rho) \frac {\sigma_Y}{\sigma_X} (x - \mu_X) + \mu_Y$$

With X at the 30th percentile, that means $z_X = \frac{x - \mu_X}{\sigma_X} = \text{invNorm}(0.30) = -0.524$.

In other words, the z-score for Y is
$$z_Y = \frac{y - \mu_Y}{\sigma_Y} = \text{sgn}(\rho) z_X = -0.524$$

I don't know how they got to -0.364.

That can't be right.

You can without loss of generality assume $$\mu_X = \mu_Y = 0$$, so we have a model:

$$y=\alpha x + \varepsilon$$

where $\varepsilon$ has zero mean and is uncorrelated with $x$; then $\displaystyle E(XY)=\alpha\; \sigma_X^2$, and $\rho=E(XY)/(\sigma_X \sigma_Y)=\alpha\; \sigma_X/\sigma_Y$

Hence: $$\alpha=\rho \frac{\sigma_Y}{\sigma_X}$$...

 
Re: Correlation?

zzephod said:
That can't be right.

You can without loss of generality assume $$\mu_X = \mu_Y = 0$$

I didn't.
The problem asks for a z-score, meaning $\mu_X$ and $\mu_Y$ get eliminated (see my derivation).

so we have a model:

$$y=\alpha x + \varepsilon$$

where $\varepsilon$ has zero mean and is uncorrelated with $x$; then $\displaystyle E(XY)=\alpha\; \sigma_X^2$, and $\rho=E(XY)/(\sigma_X \sigma_Y)=\alpha\; \sigma_X/\sigma_Y$

Hence: $$\alpha=\rho \frac{\sigma_Y}{\sigma_X}$$...

Well... multiplying by 0.7 almost gives the requested result.
But that won't be right.
 
Re: Correlation?

I like Serena said:
... Well... multiplying by 0.7 almost gives the requested result.
But that won't be right.

It will be if you take the "nearest value" when looking up the inverse normal in a table.
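
To check this numerically, here is a minimal Python sketch. It assumes the table lists z in steps of 0.01 and that "nearest value" means picking the tabulated z whose cumulative probability is closest to 0.30. The names `rho`, `z_exact`, `grid` and `z_table` are only illustrative.

```python
from statistics import NormalDist

rho = 0.7
std_normal = NormalDist()  # standard normal, mean 0 and sd 1

# Exact inverse-normal value for the 30th percentile.
z_exact = std_normal.inv_cdf(0.30)

# Emulate a printed table: z in steps of 0.01 from 0.00 down to -1.00,
# then pick the entry whose cumulative probability is closest to 0.30.
grid = [-k / 100 for k in range(101)]
z_table = min(grid, key=lambda z: abs(std_normal.cdf(z) - 0.30))

print(round(z_exact, 4), round(rho * z_exact, 3))  # -0.5244 -0.367
print(z_table, round(rho * z_table, 3))            # -0.52 -0.364
```

So the quoted -0.364 is consistent with 0.7 times a table value of -0.52, while the unrounded inverse normal gives about -0.367.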

Re: Correlation?

I like Serena said:
I didn't.
The problem asks for a z-score, meaning $\mu_X$ and $\mu_Y$ get eliminated (see my derivation).

Well, since you failed to set up a model with the correct correlation, it is relevant to make an observation that simplifies setting the correlation without changing the answer.

Re: Correlation?

zzephod said:
Well, since you failed to set up a model with the correct correlation, it is relevant to make an observation that simplifies setting the correlation without changing the answer.


The model is a positive sloped football that could be anywhere.
The problem puts the heart at x=4 with a variance of 4.
The y coordinate of the heart and the slope can still be freely chosen.
Then, with the given correlation, the "width" of the football becomes fixed.

Either way, when talking about the z-score of y, all these choices become moot, since they are standardized.
The relationship between $E(z_Y|z_X)$ and $z_X$ is simply $E(z_Y|z_X) = z_X$, whichever model you pick.
This is a "standardized" football that is aligned on the line y=x with a width such that the correlation is satisfied.
 
Re: Correlation?

I like Serena said:
The model is a positive sloped football that could be anywhere.
The problem puts the heart at x=4 with a variance of 4.
The y coordinate of the heart and the slope can still be freely chosen.
Then, with the given correlation, the "width" of the football becomes fixed.

Either way, when talking about the z-score of y, all these choices become moot, since they are standardized.
The relationship between $E(z_Y|z_X)$ and $z_X$ is simply $E(z_Y|z_X) = z_X$, whichever model you pick.
This is a "standardized" football that is aligned on the line y=x with a width such that the correlation is satisfied.

For bivariate normal random variables $X,\ Y$:

$$E(Y|X)=\mu_Y+\rho\; \frac{\sigma_Y}{\sigma_X}(X-\mu_X)$$

Since $z_X,\ z_Y$ have the same correlation coefficient as $X$ and $Y$, we have:

$$E(z_Y|z_X) = \rho\; z_X$$
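
Spelled out, as a sketch based on the standard conditional mean of a bivariate normal pair (the general-mean form is textbook material, not something derived in this thread): subtracting $\mu_Y$ and dividing by $\sigma_Y$ turns the conditional mean into the standardized relation.

$$E(Y|X)=\mu_Y+\rho\,\frac{\sigma_Y}{\sigma_X}(X-\mu_X)$$

$$\frac{E(Y|X)-\mu_Y}{\sigma_Y}=\rho\,\frac{X-\mu_X}{\sigma_X}$$

$$E(z_Y|z_X)=\rho\, z_X$$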

See: http://athenasc.com/Bivariate-Normal.pdf.

... And simulation confirms this.
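
A simulation along these lines could look like the sketch below. It is not the code used for the post, only one way to run the check. It uses the problem's X ~ N(4,4) (read as variance 4, so sd 2), while `mu_y`, `sigma_y`, the sample size, the seed and the window half-width of 0.05 are all assumptions made for illustration.

```python
import random
from statistics import NormalDist, fmean

random.seed(0)                 # fixed seed so the run is repeatable
rho = 0.7
n = 200_000
mu_x, sigma_x = 4.0, 2.0       # X ~ N(4, 4): variance 4, so sd 2
mu_y, sigma_y = 10.0, 3.0      # hypothetical; the problem does not give these

# Draw correlated pairs: build standardized values first, then rescale.
xs, ys = [], []
for _ in range(n):
    zx = random.gauss(0.0, 1.0)
    zy = rho * zx + (1 - rho**2) ** 0.5 * random.gauss(0.0, 1.0)
    xs.append(mu_x + sigma_x * zx)
    ys.append(mu_y + sigma_y * zy)

# Standardize again using the known parameters.
z_x = [(x - mu_x) / sigma_x for x in xs]
z_y = [(y - mu_y) / sigma_y for y in ys]

# Average z_Y over samples whose z_X lies near the 30th percentile.
z_target = NormalDist().inv_cdf(0.30)
window = [b for a, b in zip(z_x, z_y) if abs(a - z_target) < 0.05]
cond_mean = fmean(window)

# Least-squares slope through the origin of z_Y on z_X.
slope = sum(a * b for a, b in zip(z_x, z_y)) / sum(a * a for a in z_x)

print("slope of z_Y on z_X:", round(slope, 3))
print("mean z_Y near the 30th percentile of X:", round(cond_mean, 3))
print("rho * z_X:", round(rho * z_target, 3))
```

The slope should come out close to 0.7 and the windowed mean close to 0.7 times the z-score of X, whatever values are chosen for `mu_y` and `sigma_y`.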

