Conditional Expectation of a random variable

SUMMARY

The discussion centers on the concept of conditional expectation, specifically the distinction between E(Y|X=xi), which is a constant, and E(Y|X), which is a variable. The computation of E(Y|X) is clarified through the formula E(Y|X)=∑yi*P(Y=yi|X) = ∑yi * P(X|Y=yi)*P(Y=yi)/P(X). Participants emphasize that E(Y|X) is a function of the random variable X, while E(Y|X=x) represents a constant based on a specific realization of X. The conversation also addresses misconceptions about random variables and constants, highlighting the importance of understanding their definitions in probability theory.

PREREQUISITES
  • Understanding of conditional expectation in probability theory
  • Familiarity with Bayes' theorem and its applications
  • Knowledge of random variables and their properties
  • Basic proficiency in mathematical notation and functions
NEXT STEPS
  • Study the derivation and applications of Bayes' theorem in probability
  • Explore the properties of random variables and their distributions
  • Learn about the differences between constants and variables in mathematical contexts
  • Investigate advanced topics in conditional expectation, such as E(Y|X) in multivariate distributions
USEFUL FOR

Students of probability theory, statisticians, data scientists, and anyone seeking to deepen their understanding of conditional expectation and random variables.

kblue
My professor made a rather concise statement in class, which amounts to this: E(Y|X=xi) = constant, E(Y|X) = variable. Could anyone help me understand how the expectation is calculated in the second case? I understand that for different values of xi, we'll have different values for the expectation. This is where my thoughts are all muddled up:

[itex]E(Y|X) = \sum_i y_i \, P(Y=y_i|X) = \sum_i y_i \, \frac{P(X|Y=y_i) \, P(Y=y_i)}{P(X)}[/itex].

Could anyone explain the above computation, and how that is a variable? Also, it is my understanding that summing the probability P(Y=yi|X) over all values of Y won't be 1. Is this true?
 
kblue said:
Could anyone explain the above computation, and how that is a variable?

One not-quite-correct explanation is to confuse "random variables" with ordinary variables.

It would go like this:

If you had an ordinary function such as g(X) = 3X, then it would be fair to say that the expression g(3) represents a constant and the expression g(X) represents a function of X, which I suppose you would call "variable".

[itex]E(Y|X)[/itex] is some function of [itex]X[/itex].
When you give [itex]X[/itex] a specific value this is denoted by [itex]E(Y|X=x)[/itex] and that notation represents a constant.

The expression [itex]E(Y|X)[/itex] is not a two-variable function. The "[itex]Y[/itex]" in that notation just tells you that you must do a summation over all possible values of [itex]Y[/itex]. Since you do that summation, the answer is not a function of the variable [itex]Y[/itex].
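
To make the crude picture concrete, here is a minimal sketch in Python. The joint distribution `joint` and the helper names are made up purely for illustration; they are not from the original question. For each fixed x the sum over the values of Y gives a single number E(Y|X=x), and E(Y|X) is the function that maps a realization of X to that number.

```python
from fractions import Fraction as F

# Hypothetical joint pmf P(X=x, Y=y); the numbers are made up for illustration.
joint = {
    (0, 1): F(1, 8), (0, 2): F(3, 8),
    (1, 1): F(1, 4), (1, 2): F(1, 4),
}

def marginal_x(x):
    """P(X=x), obtained by summing the joint pmf over all y."""
    return sum(p for (xv, _), p in joint.items() if xv == x)

def cond_expectation(x):
    """E(Y|X=x) = sum_i y_i * P(Y=y_i | X=x): a constant once x is fixed."""
    px = marginal_x(x)
    return sum(y * p / px for (xv, y), p in joint.items() if xv == x)

# E(Y|X) viewed as a function of X: one value per possible realization x.
e_y_given_x = {x: cond_expectation(x) for x in sorted({x for x, _ in joint})}
print(e_y_given_x)
```

Note that `cond_expectation` takes only `x` as an argument. The values of Y are summed away inside it, which is why the result is not a function of Y.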

If we need a more precise explanation, we must heed the saying (that was the theme of a thread on the forum recently) "Random variables are not random and they are not variables".

It would be fair to say that [itex]E(Y|X)[/itex] depends on the random variable [itex]Y[/itex], because this says it depends on the entire distribution of [itex]Y[/itex]. "Random variables" are not ordinary variables because the definition of a "random variable" carries with it all the baggage about a distribution function that is not present in the definition of ordinary variables. So [itex]E(Y|X)[/itex] isn't a function of an ordinary variable named "[itex]Y[/itex]".

Random variables technically do not take on specific values. It is their realizations that have specific values. When we say something like "Suppose the random variable X = 5", what we should say is "Suppose we have a realization of the random variable X and the value of that realization is 5". The statement "X=x" means a realization of the random variable X has the value x.

I, myself, would have a hard time defining the notation [itex]E(Y|X)[/itex] using those precise notions, and I tend to think about it in the crude way that I first explained it! The "E(Y" part of the notation tells you to sum a certain function of possible realizations of the random variable [itex]Y[/itex] over all possible values that a realization may take. The "[itex]X[/itex]" part tells you that when you do that sum, you assume that one particular value of the random variable [itex]X[/itex] has been realized, and we abuse notation by denoting that value with the letter [itex]X[/itex] also. That particular value is a "variable" in the ordinary sense of the word variable.

"Variables" and "constants" are not adequately explained in ordinary mathematics courses. For example, in the earlier discussion the literal "x" is used to represent a "constant". We are asked to pretend it is a specific numerical value, yet at the same time it could be any specific numerical value. By contrast, in the function [itex]g(X) = 3X[/itex] we might be asked to pretend the literal "[itex]X[/itex]" is a "variable", but it seems to be on the same footing as the literal "x" insofar as it can take on any specific value. In ordinary math classes, you have to make your way through discussions that distinguish between variables and constants without having formal training in how to do that. (And most people with mathematical aptitude are able to.)

If you've taken logic courses or done structured computer programming, you know that symbols have a certain "scope". Within a certain context (such as an argument to a function) they can take unspecified values, and within another context (such as a "read-only" global variable referenced inside a function but initialized outside of it) they hold only one specific value. That's the sort of formalism needed to deal with the distinction between variables and constants in a rigorous manner.

kblue said:
Also, it is my understanding that summing the probability P(Y=yi|X) over all values of Y won't be 1. Is this true?

No, I don't think that's true, if by "summing" you mean that every term in the sum uses the same unspecified realization of the random variable X. For a fixed realization [itex]X=x[/itex] with [itex]P(X=x)>0[/itex], the conditional probabilities [itex]P(Y=y_i|X=x)[/itex] form a probability distribution over the values of [itex]Y[/itex], so they sum to 1.

To understand the computation you asked about, think about Bayes rule.
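
As a rough check of the Bayes rule step, here is a small Python sketch. The joint distribution and the function names are again hypothetical, made up only to have numbers to work with. It computes P(Y=y|X=x) directly from the definition and again through P(X=x|Y=y)·P(Y=y)/P(X=x), and it prints the sum of the conditional probabilities over the values of Y for each fixed x.

```python
from fractions import Fraction as F

# Hypothetical joint pmf P(X=x, Y=y); values are made up for illustration.
joint = {
    (0, 1): F(1, 8), (0, 2): F(3, 8),
    (1, 1): F(1, 4), (1, 2): F(1, 4),
}
xs = sorted({x for x, _ in joint})
ys = sorted({y for _, y in joint})

def p_x(x):
    """Marginal P(X=x)."""
    return sum(joint[(x, y)] for y in ys)

def p_y(y):
    """Marginal P(Y=y)."""
    return sum(joint[(x, y)] for x in xs)

def p_y_given_x(y, x):
    """Direct definition: P(Y=y | X=x) = P(X=x, Y=y) / P(X=x)."""
    return joint[(x, y)] / p_x(x)

def p_y_given_x_bayes(y, x):
    """Bayes' rule: P(X=x | Y=y) * P(Y=y) / P(X=x)."""
    p_x_given_y = joint[(x, y)] / p_y(y)
    return p_x_given_y * p_y(y) / p_x(x)

for x in xs:
    direct = [p_y_given_x(y, x) for y in ys]
    bayes = [p_y_given_x_bayes(y, x) for y in ys]
    assert direct == bayes  # both routes give the same conditional pmf
    print(x, direct, sum(direct))
```

Exact fractions are used instead of floats so that the comparison between the two routes, and the check that each conditional distribution sums to 1, are not blurred by rounding.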
 