# Why does the conditional expectation used in mean square error equal zero?

1. Mar 27, 2014

### EdMel

Hi guys,

I am having trouble showing that $\mathbb{E}\left[(Y-\mathbb{E}[Y|X])^{2}\right]=0$.

I understand the proof of why E[Y|X] minimizes the mean square error, but I cannot understand why it is then equal to zero.

I tried multiplying out the square to get $\mathbb{E}\left[Y^{2}\right]-2\mathbb{E}\left[Y\mathbb{E}[Y|X]\right]+\mathbb{E}\left[\mathbb{E}[Y|X]\mathbb{E}[Y|X]\right]$
but have not been able to justify $\mathbb{E}\left[Y\mathbb{E}[Y|X]\right]=\mathbb{E}\left[Y^{2}\right]$ or $\mathbb{E}\left[\mathbb{E}[Y|X]\mathbb{E}[Y|X]\right]=\mathbb{E}\left[Y^{2}\right]$.

2. Mar 27, 2014

### micromass

Can you tell us your definition of the conditional expectation and what properties you are allowed to use?

3. Mar 27, 2014

### Stephen Tashi

It would also be helpful to improve the notation. If $X$ and $Y$ are random variables and $f(X,Y)$ is a function of them, then the notation $E f(X,Y)$ is ambiguous. It is not clear whether the expectation is being computed with respect to the distribution of $X$ or the distribution of $Y$ - or perhaps with respect to the joint distribution of $(X,Y)$.

You can use a subscript to denote which distribution is used to compute the expectation. For example, if $Y$ is not a function of $X$, then $E_X ( E_Y ( 3Y + 1) )$ is the expectation, with respect to the distribution of $X$, of the constant value $E_Y(3Y + 1)$. Hence $E_X (E_Y (3Y+1)) = E_Y (3Y+ 1)$.

4. Mar 28, 2014

### FactChecker

It's hard to prove because it is not true unless $Y - \mathbb{E}[Y|X] = 0$. Are you sure that $Y - \mathbb{E}[Y|X]$ is supposed to be squared?
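Indeed, expanding the square as in the original post and applying the tower property $\mathbb{E}[Z]=\mathbb{E}\left[\mathbb{E}[Z|X]\right]$ (and pulling the $X$-measurable factor $\mathbb{E}[Y|X]$ out of the inner expectation) gives

$$\mathbb{E}\left[Y\,\mathbb{E}[Y|X]\right]=\mathbb{E}\left[\mathbb{E}\left[Y\,\mathbb{E}[Y|X]\,\big|\,X\right]\right]=\mathbb{E}\left[\mathbb{E}[Y|X]\,\mathbb{E}[Y|X]\right]=\mathbb{E}\left[\mathbb{E}[Y|X]^{2}\right],$$

so

$$\mathbb{E}\left[(Y-\mathbb{E}[Y|X])^{2}\right]=\mathbb{E}\left[Y^{2}\right]-\mathbb{E}\left[\mathbb{E}[Y|X]^{2}\right],$$

which is zero only when $Y = \mathbb{E}[Y|X]$ almost surely.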

5. Mar 28, 2014

### Stephen Tashi

And before asking whether $Y - \mathbb{E}[Y|X]$ is equal or not equal to zero, it would have to mean something. How do we interpret $Y - \mathbb{E}[Y|X]$? Is it a random variable? To realize it, do we realize a value $Y = y_0$ from the distribution of $Y$ and then take the expected value of the constant $y_0$ with respect to the distribution of $X$?
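As a concrete illustration of the point above, here is a minimal simulation sketch (the model $Y = X + Z$ and all variable names are assumptions chosen for illustration). With $Z$ independent zero-mean noise, $\mathbb{E}[Y|X] = X$, and the residual $Y - \mathbb{E}[Y|X]$ is a genuine random variable whose mean square is $\mathrm{Var}(Z)$, not zero:

```python
# Simulation sketch: estimate E[(Y - E[Y|X])^2] for an assumed model
# Y = X + Z with Z ~ N(0, 1) independent of X, so that E[Y|X] = X
# and the exact value is Var(Z) = 1 -- nonzero.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

x = rng.normal(size=n)      # samples of X
z = rng.normal(size=n)      # independent noise Z
y = x + z                   # Y = X + Z

# Sample estimate of E[(Y - E[Y|X])^2]; here E[Y|X] = X exactly.
mse = np.mean((y - x) ** 2)
print(mse)                  # close to 1, clearly not 0
```

The residual here is just the noise $Z$, so the simulation simply recovers its variance; the identity $\mathbb{E}[(Y-\mathbb{E}[Y|X])^{2}]=0$ would require $Y$ to be a function of $X$.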