Why does the mean square error with conditional expectation equal zero?

AI Thread Summary
The discussion centers on understanding why the mean square error, expressed as E[(Y - E[Y|X])²], equals zero. The original poster is struggling to justify certain equalities involving expectations and the properties of conditional expectation. They seek clarification on the definitions and properties of conditional expectation, emphasizing the importance of notation to avoid ambiguity in calculations. Participants suggest that the proof hinges on the interpretation of Y - E[Y|X] and its implications for random variables. The conversation highlights the need for a clear understanding of conditional expectations to resolve the mathematical confusion.
EdMel
Hi guys,

I am having trouble showing that $\mathbb{E}\left[(Y-\mathbb{E}[Y|X])^{2}\right]=0$.

I understand the proof of why E[Y|X] minimizes the mean square error, but I cannot understand why it is then equal to zero.

I tried multiplying out the square to get
$$\mathbb{E}\left[Y^{2}\right]-2\,\mathbb{E}\left[Y\,\mathbb{E}[Y|X]\right]+\mathbb{E}\left[\mathbb{E}[Y|X]\,\mathbb{E}[Y|X]\right]$$
but have not been able to justify $\mathbb{E}\left[Y\,\mathbb{E}[Y|X]\right]=\mathbb{E}\left[Y^{2}\right]$ or $\mathbb{E}\left[\mathbb{E}[Y|X]\,\mathbb{E}[Y|X]\right]=\mathbb{E}\left[Y^{2}\right]$.

Thanks in advance.
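For reference, an exact computation on a small made-up joint pmf (the numbers are arbitrary, chosen only so the conditional expectations come out as simple fractions) shows which of those equalities actually holds: E[Y·E[Y|X]] equals E[(E[Y|X])²] by the tower property, not E[Y²], and the mean square error comes out strictly positive.

```python
# Toy joint pmf for (X, Y); the probabilities below are arbitrary illustrative values.
from fractions import Fraction as F

pmf = {(0, 0): F(1, 5), (0, 1): F(1, 5), (1, 1): F(3, 10), (1, 2): F(3, 10)}

def E(g):
    """Expectation of g(X, Y) under the joint pmf."""
    return sum(p * g(x, y) for (x, y), p in pmf.items())

# Marginal of X, then m(x) = E[Y | X = x] computed from the joint pmf.
px = {x: sum(p for (xx, _), p in pmf.items() if xx == x) for x in (0, 1)}
m = {x: sum(p * y for (xx, y), p in pmf.items() if xx == x) / px[x] for x in (0, 1)}

lhs = E(lambda x, y: y * m[x])         # E[Y * E[Y|X]]
mid = E(lambda x, y: m[x] ** 2)        # E[(E[Y|X])^2]
ey2 = E(lambda x, y: y ** 2)           # E[Y^2]
mse = E(lambda x, y: (y - m[x]) ** 2)  # E[(Y - E[Y|X])^2]

print(lhs, mid, ey2, mse)  # 29/20 29/20 17/10 1/4
```

Note that `ey2 - mid == mse` here, i.e. the expansion collapses to E[Y²] − E[(E[Y|X])²], which is nonnegative but not zero in general.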
 
Can you tell us your definition of the conditional expectation and what properties you are allowed to use?
 
It would also be helpful to improve the notation. If X and Y are random variables and f(X,Y) is a function of them, then the notation E f(X,Y) is ambiguous: it is not clear whether the expectation is being computed with respect to the distribution of X, the distribution of Y, or the joint distribution of (X,Y).

You can use a subscript to denote which distribution is used to compute the expectation. For example, if Y is not a function of X, then E_X(E_Y(3Y + 1)) is the expectation, with respect to the distribution of X, of the constant value E_Y(3Y + 1). Hence E_X(E_Y(3Y + 1)) = E_Y(3Y + 1).
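A concrete check of that point, with made-up distributions (Y uniform on {0, 1, 2}, independent of X, and an arbitrary distribution for X): the inner expectation is a constant, so the outer expectation returns it unchanged.

```python
from fractions import Fraction as F

# Y uniform on {0, 1, 2}, independent of X.
py = {0: F(1, 3), 1: F(1, 3), 2: F(1, 3)}
E_Y = sum(p * (3 * y + 1) for y, p in py.items())  # E_Y(3Y + 1) = 3*1 + 1 = 4

# The distribution of X is irrelevant: E_X of a constant is that constant.
px = {-1: F(1, 2), 7: F(1, 2)}  # arbitrary
E_X_of_const = sum(p * E_Y for p in px.values())

print(E_X_of_const == E_Y)  # True
```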
 
It's hard to prove because it is not true unless Y - E(Y|X) = 0. Are you sure that Y - E(Y|X) is supposed to be squared?
 
FactChecker said:
It's hard to prove because it is not true unless Y - E(Y|X) = 0.

And before (Y - E(Y|X)) can be equal or not equal to zero, it has to mean something. How do we interpret Y - E(Y|X)? Is it a random variable? To realize it, do we realize a value Y = y0 from the distribution of Y and then take the expected value of the constant y0 with respect to the distribution of X?
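On the usual interpretation, Y - E(Y|X) is the random variable (x, y) ↦ y - m(x), where m(x) = E(Y | X = x), realized jointly with (X, Y). A toy joint pmf (arbitrary numbers) illustrates the distinction being discussed: its mean is zero, while its square has strictly positive mean.

```python
from fractions import Fraction as F

# Toy joint pmf for (X, Y); the probabilities are arbitrary illustrative values.
pmf = {(0, 0): F(1, 5), (0, 1): F(1, 5), (1, 1): F(3, 10), (1, 2): F(3, 10)}
px = {x: sum(p for (xx, _), p in pmf.items() if xx == x) for x in (0, 1)}
m = {x: sum(p * y for (xx, y), p in pmf.items() if xx == x) / px[x] for x in (0, 1)}

# The residual R = Y - E[Y|X] is a function of the pair (X, Y).
mean_R  = sum(p * (y - m[x]) for (x, y), p in pmf.items())
mean_R2 = sum(p * (y - m[x]) ** 2 for (x, y), p in pmf.items())

print(mean_R, mean_R2)  # 0 1/4
```

So E[Y - E(Y|X)] = 0 always holds, but E[(Y - E(Y|X))²] = 0 would force the residual to be zero almost surely.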
 