Conditional Expectation of a random variable

kblue
My professor made a rather concise statement in class, which amounts to this: E(Y|X=x_i) is a constant, while E(Y|X) is a variable. Could anyone help me understand how the expectation is calculated in the second case? I understand that for different values of x_i, we'll have different values for the expectation. This is where my thoughts are all muddled up:

$$E(Y \mid X) = \sum_i y_i \, P(Y=y_i \mid X) = \sum_i y_i \, \frac{P(X \mid Y=y_i)\,P(Y=y_i)}{P(X)}.$$

Could anyone explain the above computation, and how that is a variable? Also, it is my understanding that summing the probability P(Y=y_i|X) over all values of Y won't be 1. Is this true?
 
kblue said:
Could anyone explain the above computation, and how that is a variable?

One not-quite-correct explanation is to confuse "random variables" with ordinary variables.

It would go like this:

If you had an ordinary function such as g(X) = 3X, then it would be fair to say that the expression g(3) represents a constant and the expression g(X) represents a function of X, which I suppose you would call "variable".

E(Y|X) is some function of X.
When you give X a specific value, this is denoted by E(Y|X=x), and that notation represents a constant.
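To make that concrete, here is a minimal Python sketch using a made-up example (nothing from the thread): X is a fair die and Y = X + D, where D is a second, independent fair die.

```python
from fractions import Fraction

def cond_exp_Y_given_X(x):
    """E(Y | X = x) for the made-up model Y = X + D, D a fair die.
    Once x is fixed this is just a number: the average of x + d over
    the six equally likely values of d."""
    return sum(Fraction(x + d, 6) for d in range(1, 7))

print(cond_exp_Y_given_X(3))                         # 13/2 -- a constant
# E(Y | X), by contrast, is the function x -> E(Y | X = x) applied to the
# random outcome of X, so its value varies with the realization of X:
print([cond_exp_Y_given_X(x) for x in range(1, 7)])  # one value per x
```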

The expression E(Y|X) is not a two-variable function. The "Y" in that notation just tells you that you must do a summation over all possible values of Y. Since you do that summation, the answer is not a function of the variable Y.
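Written out for the discrete case with the conditioning value made explicit, this is

$$E(Y \mid X = x) = \sum_i y_i \, P(Y = y_i \mid X = x),$$

where the index i runs over every possible value of Y. Nothing depending on Y survives on the right-hand side after the sum; only x does.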

If we need a more precise explanation, we must heed the saying (that was the theme of a thread on the forum recently) "Random variables are not random and they are not variables".

It would be fair to say that E(Y|X) depends on the random variable Y, because this says it depends on the entire distribution of Y. "Random variables" are not ordinary variables because the definition of a "random variable" carries with it all the baggage about a distribution function that is not present in the definition of ordinary variables. So E(Y|X) isn't a function of an ordinary variable named "Y".

Random variables technically do not take on specific values; it is their realizations that have specific values. When we say something like "Suppose the random variable X = 5", what we should say is "Suppose we have a realization of the random variable X, and the value of that realization is 5". The statement "X=x" means that a realization of the random variable X has the value x.

I, myself, would have a hard time defining the notation E(Y|X) using those precise notions, and I tend to think about it in the crude way that I first explained! The "E(Y" tells you to sum a certain function of possible realizations of the random variable Y over all possible values that a realization may take. The "X" tells you that when you do that sum, you assume that one particular value of the random variable X has been realized, and we abuse notation by denoting that value with the letter X as well. That particular value is a "variable" in the ordinary sense of the word.
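Continuing the made-up dice sketch from above: E(Y|X) is itself a random variable, taking the value E(Y|X=x) whenever X realizes as x. A quick check of the standard tower property E(E(Y|X)) = E(Y):

```python
from fractions import Fraction

def cond_exp(x):                 # E(Y | X = x) in the same dice model
    return sum(Fraction(x + d, 6) for d in range(1, 7))

# X is uniform on 1..6, so the random variable E(Y|X) takes the value
# cond_exp(x) with probability 1/6 for each x.  Averaging those values
# recovers E(Y) -- the tower property E(E(Y|X)) = E(Y).
e_of_cond_exp = sum(Fraction(1, 6) * cond_exp(x) for x in range(1, 7))
e_of_Y = sum(Fraction(1, 36) * (x + d)
             for x in range(1, 7) for d in range(1, 7))
print(e_of_cond_exp, e_of_Y)     # both 7
```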
"Variables" and "constants" are not adequately explained in ordinary mathematics courses. For example in the earlier discussion the literal "x" is used to repesent a "constant". We are asked to pretend it is a specific numerical value, yet at the same time it could be any specific numerical value. By contrast, in the function g(X) = 3X we might be asked to pretend the literal "X" is a "variable", but it seems to be on the same footing as the literal "x" insofar as it can take on any specific value. In ordinary math classes, you have to make your way though discussions that distinguish between variables and constants without have formal training of how to do that. (And most people with mathematical aptitude are able to.)

If you've taken logic courses or done structured computer programming, you know that symbols have a certain "scope". Within a certain context (such as an argument to a function) they can take unspecified values and within another context (such as a "read-only" global variable referenced inside a function but initialized outside of it) they hold only one specific value. That's the sort of formalism needed to deal with the distinction between variables and constants in a rigorous manner.
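As a toy Python illustration of that scope distinction (the names are hypothetical):

```python
SIGMA = 2.5              # bound in the enclosing scope: plays the "constant"

def g(x):                # x is rebound on every call: plays the "variable"
    return SIGMA * x     # SIGMA is read-only inside g

print(g(3), g(4))        # same SIGMA, different x -> 7.5 10.0
```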
kblue said:
Also, it is my understanding that summing the probability P(Y=y_i|X) over all values of Y won't be 1. Is this true?
No, I don't think that's true, if by "summing" you mean that each term in the sum uses the same unspecified realization of the random variable X. Holding X fixed at any particular realization x, the probabilities P(Y=y_i|X=x) form the conditional distribution of Y, so they sum to 1.

To understand the computation you asked about, think about Bayes' rule.
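Here is a small numeric check, with an arbitrarily chosen joint distribution (made up purely for illustration), that conditioning on a fixed X = x yields probabilities that sum to 1:

```python
from fractions import Fraction

# Made-up joint pmf of (X, Y) on {0, 1} x {0, 1}:
joint = {(0, 0): Fraction(1, 8), (0, 1): Fraction(3, 8),
         (1, 0): Fraction(2, 8), (1, 1): Fraction(2, 8)}

x = 0
p_x = sum(p for (xx, _), p in joint.items() if xx == x)  # P(X = x)
# P(Y = y | X = x) = P(X = x, Y = y) / P(X = x):
cond = {y: joint[(x, y)] / p_x for y in (0, 1)}
print(sum(cond.values()))                                # 1 -- as expected
```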
 