Why does conditional probability used in mean square error equal zero?

In summary, E[Y|X] minimizes the mean square error E[(Y-g(X))^2] over all functions g, but the minimum value E[(Y-E[Y|X])^2] is not zero in general; it vanishes only in the special case where Y is itself a function of X. The thread also notes that notation such as E f(X,Y) is ambiguous unless one states whether the expectation is taken with respect to the distribution of X, the distribution of Y, or the joint distribution of (X,Y).
  • #1
EdMel
Hi guys,

I am having trouble showing that [itex]\mathbb{E}\left[(Y-\mathbb{E}[Y|X])^{2}\right]=0[/itex].

I understand the proof of why E[Y|X] minimizes the mean square error, but I cannot understand why it is then equal to zero.

I tried multiplying out the square to get [itex]\mathbb{E}\left[Y^{2}\right]-2\mathbb{E}\left[Y\mathbb{E}[Y|X]\right]+\mathbb{E}\left[\mathbb{E}[Y|X]\mathbb{E}[Y|X]\right][/itex]
but have not been able to justify [itex]\mathbb{E}\left[Y\mathbb{E}[Y|X]\right]=\mathbb{E}\left[Y^{2}\right]
[/itex] or [itex]\mathbb{E}\left[\mathbb{E}[Y|X]\mathbb{E}[Y|X]\right]=\mathbb{E}\left[Y^{2}\right][/itex].
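The two candidate identities can be checked on a small discrete example. A minimal sketch (the joint distribution below is hypothetical, chosen only for illustration): it shows that E[Y·E[Y|X]] and E[(E[Y|X])^2] agree with each other (both follow from the tower property) but not with E[Y^2], and that the mean square error is positive rather than zero.

```python
from fractions import Fraction as F

# Hypothetical joint distribution (for illustration only): X is 0 or 1 with
# probability 1/2; given X=0, Y is 1 or 3 (equally likely); given X=1, Y is 2 or 6.
outcomes = [(0, 1), (0, 3), (1, 2), (1, 6)]        # each pair has probability 1/4
p = F(1, 4)
cond_mean = {0: F(1 + 3, 2), 1: F(2 + 6, 2)}       # E[Y|X=x]

E_Ysq      = sum(p * y**2 for _, y in outcomes)                   # E[Y^2]
E_Y_condY  = sum(p * y * cond_mean[x] for x, y in outcomes)       # E[Y * E[Y|X]]
E_condY_sq = sum(p * cond_mean[x]**2 for x, _ in outcomes)        # E[(E[Y|X])^2]
mse        = sum(p * (y - cond_mean[x])**2 for x, y in outcomes)  # E[(Y - E[Y|X])^2]

print(E_Ysq, E_Y_condY, E_condY_sq, mse)  # 25/2 10 10 5/2
```

Here E[Y·E[Y|X]] = E[(E[Y|X])^2] = 10, but E[Y^2] = 25/2, and the gap E[Y^2] - E[(E[Y|X])^2] = 5/2 is exactly the mean square error, which is positive.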

Thanks in advance.
 
  • #2
Can you tell us your definition of the conditional expectation and what properties you are allowed to use?
 
  • #3
It would also be helpful to improve the notation. If [itex] X [/itex] and [itex] Y [/itex] are random variables and [itex] f(X,Y) [/itex] is a function of them then the notation [itex] E f(X,y) [/itex] is ambiguous. It is not clear whether the expectation is being computed with respect to the distribution of [itex] X [/itex] or the distribution of [itex] Y [/itex] - or perhaps with respect to the joint distribution for [itex] (X,Y) [/itex].

You can use a subscript to denote which distribution is used to compute the expectation. For example, if [itex] Y [/itex] is not a function of [itex] X [/itex] then [itex] E_X ( E_Y ( 3Y + 1) ) [/itex] is the expectation, with respect to the distribution of [itex] X [/itex], of the constant value [itex] E_Y(3Y + 1) [/itex]. Hence [itex] E_X (E_Y (3Y+1)) = E_Y (3Y+ 1) [/itex].
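A quick numeric sketch of this point, using hypothetical distributions (Y independent of X): the inner expectation is a constant, so the outer expectation leaves it unchanged.

```python
from fractions import Fraction as F

# Hypothetical distributions (not from the thread): Y uniform on {1, 2, 3},
# independent of X, which is uniform on {0, 1}.
pY = {1: F(1, 3), 2: F(1, 3), 3: F(1, 3)}
pX = {0: F(1, 2), 1: F(1, 2)}

inner = sum(p * (3 * y + 1) for y, p in pY.items())   # E_Y(3Y + 1), a constant
outer = sum(p * inner for _, p in pX.items())         # E_X of that constant
print(inner, outer)  # 7 7
```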
 
  • #4
It's hard to prove because it is not true unless (Y-E(Y|X))==0. Are you sure that (Y-E(Y|X)) is supposed to be squared?
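A Monte Carlo sketch supports this. Under a hypothetical model Y = X + N, with noise N of standard deviation sigma independent of X, we have E[Y|X] = X exactly, and the mean square error converges to sigma^2, not to zero.

```python
import random

random.seed(0)
# Hypothetical model (not from the thread): X ~ Uniform(0,1), Y = X + N with
# N ~ Normal(0, 0.5) independent of X, so that E[Y|X] = X exactly.
sigma = 0.5
n = 200_000
total = 0.0
for _ in range(n):
    x = random.random()
    y = x + random.gauss(0, sigma)
    total += (y - x) ** 2            # (Y - E[Y|X])^2
mse = total / n
print(mse)                           # close to sigma**2 = 0.25, clearly not 0
```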
 
  • #5
FactChecker said:
It's hard to prove because it is not true unless (Y-E(Y|X))==0.

And before (Y - E(Y|X)) is equal or not equal to zero, it would have to mean something. How do we interpret Y - E(Y|X)? Is it a random variable? To realize it, do we realize a value Y = y0 from the distribution of Y and then take the expected value of the constant y0 with respect to the distribution of X?
 

1. What is conditional probability and how is it used in mean square error?

Conditional probability measures the likelihood of an event occurring given that another event has already occurred; its counterpart for averages, the conditional expectation E[Y|X], is the mean of the outcome given the observed variables. In the context of mean square error, the conditional distribution of the outcome given a set of conditions or variables determines both the best predictor and the expected size of its error.

2. Why is conditional probability used in mean square error instead of other probability measures?

Mean square error is a commonly used measure of the accuracy of a prediction or estimation. Because it averages squared errors, it penalizes large deviations heavily, and among all predictors that depend only on the observed variables it is minimized by the conditional expectation E[Y|X]. Using the conditional distribution therefore yields the most accurate prediction in the mean-square sense, together with a clean expression for the error that remains.

3. Can you provide an example of how conditional probability is used in mean square error?

Suppose we want to predict the height of a child from the parents' heights. Mean square error measures the accuracy of the prediction, and the conditional distribution of the child's height given the parents' heights determines both the best predictor (the conditional mean) and the expected size of the error in that prediction.
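A simulation along these lines (all numbers below are hypothetical, loosely echoing the classic parent-child height setup) shows the conditional-mean predictor beating the best constant predictor in mean square error:

```python
import random

random.seed(1)
# Hypothetical model (illustrative numbers, not real data):
# child = 0.5 * midparent + 85 + noise, noise ~ Normal(0, 4) cm, independent.
n = 100_000
samples = []
for _ in range(n):
    mid = random.gauss(170, 6)                      # mid-parent height (cm)
    child = 0.5 * mid + 85 + random.gauss(0, 4)
    samples.append((mid, child))

mean_child = sum(c for _, c in samples) / n         # best *constant* predictor

mse_cond  = sum((c - (0.5 * m + 85)) ** 2 for m, c in samples) / n
mse_const = sum((c - mean_child) ** 2 for _, c in samples) / n
print(mse_cond, mse_const)   # roughly 16 vs 25: conditioning lowers the MSE
```

The conditional-mean predictor's error (about 16, the noise variance) is strictly smaller than the constant predictor's (about 25, the total variance), but it is still far from zero.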

4. How does conditional probability affect the value of mean square error?

The shape of the conditional distribution determines the mean square error of the best predictor: the minimum mean square error equals E[Var(Y|X)], the average conditional variance. The more concentrated the conditional distribution of the outcome given the conditions, the lower the mean square error, indicating a more accurate prediction or estimation. Conversely, a widely spread conditional distribution results in a higher mean square error, and the error is zero only in the degenerate case where the outcome is completely determined by the conditioning variables.

5. Are there any limitations to using conditional probability in mean square error?

While conditional probability is a useful tool in analyzing mean square error, the approach has limitations. The conditional distribution is rarely known exactly and must be estimated from data, which introduces additional error. Mean square error itself is sensitive to outliers because errors are squared, so for heavy-tailed data or asymmetric costs it may not capture the true cost of errors; other criteria, such as mean absolute error (minimized by the conditional median), may be more suitable in those situations.
