A random variable is defined as a function from one set,

Rasalhague · Jul 15, 2011

A random variable is defined as a function from one set, called a sample space, to another, called an observation space, both of which must be underlying sets of probability spaces. But often when people talk about a random variable - as in definitions of a particular, named distribution, such as the binomial distribution - they make no explicit mention of the domain or codomain of the random variable, or any rule to specify how this function associates an element of the one with an element, or elements, of the other. In this case, would a good rule of thumb be to assume they mean the identity function on a suitable sample space? (...so that the same set is being used for sample space and observation space.) When people write, for example, E[g(X)], for expectation, are they stating that E is a function of the composite random variable g o X, where X is to be understood as the identity function on the sample space, unless otherwise stated? Or is it that the exact details of, at least the domain of X, are usually irrelevant to applications?

The notation E[X] makes it look as if expectation is a function of a random variable, but sometimes people also talk about the "expectation of a distribution" (such as the binomial distribution). Is E in fact a function of both the random variable and the distribution, E[X,Q]? If X is held fixed, can a given expectation be made into any other expectation by a suitable choice of distribution Q; and if Q is held fixed, can the expectation be made into any other by a suitable choice of random variable X? Or are these best thought of as separate, equivalent ways of formalising the concept of expectation, so that, in one formalism, E is a function of random variables (in which all the necessary information is encoded), while in the other way of thinking, E is a function of distributions (probability measures)?

chiro · Jul 15, 2011

Rasalhague said:

A random variable is defined as a function from one set, called a sample space, to another, called an observation space, both of which must be underlying sets of probability spaces. But often when people talk about a random variable - as in definitions of a particular, named distribution, such as the binomial distribution - they make no explicit mention of the domain or codomain of the random variable, or any rule to specify how this function associates an element of the one with an element, or elements, of the other. In this case, would a good rule of thumb be to assume they mean the identity function on a suitable sample space? (...so that the same set is being used for sample space and observation space.) When people write, for example, E[g(X)], for expectation, are they stating that E is a function of the composite random variable g o X, where X is to be understood as the identity function on the sample space, unless otherwise stated? Or is it that the exact details of, at least the domain of X, are usually irrelevant to applications?

The notation E[X] makes it look as if expectation is a function of a random variable, but sometimes people also talk about the "expectation of a distribution" (such as the binomial distribution). Is E in fact a function of both the random variable and the distribution, E[X,Q]? If X is held fixed, can a given expectation be made into any other expectation by a suitable choice of distribution Q; and if Q is held fixed, can the expectation be made into any other by a suitable choice of random variable X? Or are these best thought of as separate, equivalent ways of formalising the concept of expectation, so that, in one formalism, E is a function of random variables (in which all the necessary information is encoded), while in the other way of thinking, E is a function of distributions (probability measures)?

Expectation is a function of a set of random variables. The introductory case starts off with one variable, but it can apply to any number of random variables and even conditional cases.

The key thing is to recognize which distribution you are taking the expectation of. For example, for something like E[X|Y=y] you first need the conditional pdf of X given Y = y: integrate x out of the joint pdf to get the marginal pdf of Y, then divide the joint pdf by that marginal evaluated at the particular y value. Whatever kind of expectation you have, the procedure is the same: you get your pdf, work out your integral (or summation) domain, and apply the definition of expectation.
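As a minimal sketch of that recipe in the discrete case (summation instead of integration), the snippet below computes E[X|Y=y] from a joint pmf. The joint pmf values and the helper name `conditional_expectation` are made up for illustration; they are not from the thread.

```python
from fractions import Fraction

# Hypothetical joint pmf of (X, Y), stored as {(x, y): probability}.
# The numbers are illustrative only and sum to 1.
joint = {
    (0, 0): Fraction(1, 8),
    (1, 0): Fraction(3, 8),
    (0, 1): Fraction(1, 4),
    (1, 1): Fraction(1, 4),
}

def conditional_expectation(joint, y):
    """E[X | Y = y] for a discrete joint pmf {(x, y): probability}."""
    # Marginal P(Y = y): sum the joint pmf over all x.
    p_y = sum(p for (x, yy), p in joint.items() if yy == y)
    if p_y == 0:
        raise ValueError("conditioning event has probability zero")
    # Conditional pmf is joint / marginal; weight each x by it and sum.
    return sum(x * p / p_y for (x, yy), p in joint.items() if yy == y)

e_given_0 = conditional_expectation(joint, 0)
e_given_1 = conditional_expectation(joint, 1)
print(e_given_0, e_given_1)
```

For continuous variables the sums become integrals over the joint pdf, but the steps are the same.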

If you have non-trivial regions of integration (or summation) or complicated set theoretic statements (like say (X and Y) or (B and C)) you translate that, solve for your pdf, figure out the domain space, and apply the expectation formula.

As for expectations of functions (like g(X)), it's like you would expect.

Instead of your observations being x, they are now g(x). The idea of expectation is exactly the same, except that your expectation is now taken over the observations g(x) instead of x.
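A small sketch of this, assuming a discrete X: E[g(X)] can be computed either by weighting g(x) with the pmf of X, or by first working out the pmf of the new random variable g(X) and taking its mean. The pmf and the choice g(x) = x^2 below are assumptions made for the example.

```python
from collections import defaultdict
from fractions import Fraction

# Assumed pmf of X: uniform on {-1, 0, 1} (illustrative choice).
pmf_x = {-1: Fraction(1, 3), 0: Fraction(1, 3), 1: Fraction(1, 3)}

def g(x):
    """Illustrative transformation g(x) = x^2."""
    return x * x

def mean(pmf):
    """Mean of a discrete distribution given as {value: probability}."""
    return sum(value * p for value, p in pmf.items())

# Way 1: keep the pmf of X and replace each observation x by g(x).
e_via_x = sum(g(x) * p for x, p in pmf_x.items())

# Way 2: build the pmf of g(X) first, then take its mean.
pmf_gx = defaultdict(Fraction)  # Fraction() is 0
for x, p in pmf_x.items():
    pmf_gx[g(x)] += p
e_via_gx = mean(pmf_gx)

print(e_via_x, e_via_gx)
```

Both routes give the same number, which is why the domain details of X rarely matter in applications.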

So, the key points: for a general expectation you have your random variables and your domain. Depending on the conditions, you work out the final pdf that you integrate against (you may have to integrate out variables if you have conditional statements like X|Y = y). Once you have the final density function, you set up your domain. If you want the expectation of transformed variables [like g(X), h(Y)] and so on, substitute appropriately, and then calculate the expectation.

Rasalhague · Jul 15, 2011

Okay, I hope you don't mind if I set aside, for now, these other, also fascinating, topics of how distributions relate to PDFs and CDFs and marginal distribution functions, and the differences between single and multivariable cases, so that we can focus on this specific question of how expectation is defined. I'll try to paraphrase my question; please let me know if you're unsure what I'm asking. I'll ask it in terms of a single random variable, then think about the generalisations after I've understood that basic definition.

Am I right in thinking of "expectation", E, as a function which takes something as its input, and whose value is a real number, "the expected value of..."?

Suppose E is a function of a random variable X, as the notation E[X] suggests. This means that the value of E is completely determined by X, i.e. by which function X we chose. But, in the notation of the thread I linked to, X:S-->T, where S is the sample space, and T the observation space. The definition of X itself doesn't refer to a distribution. But wouldn't the value of E change if we imposed a different distribution on T?

Indeed, I can look up a particular, named distribution, say the binomial distribution, and find listed among its particular characteristics, its mean, i.e. expected value, as if E is a function of distributions. But suppose, having calculated this mean, I keep everything else the same except that I choose a new random variable, Y, which may take the form g o X. This will affect the output of E, won't it?

So is expectation a function not only of a random variable but also a distribution, or are these alternate ways of defining expectation? Does this question make sense?

Suppose I know nothing except the definitions quoted in the thread I linked to; how would you introduce the notion of expectation? Do we need to assume any restrictions on what kind of sample space or observation space we're talking about? All of the examples I've seen seem to deal with numbers, whereas the general definitions just talk about arbitrary sets.

micromass · Jul 15, 2011

But X uniquely determines a distribution on T. Let [itex]P_T[/itex] be the distribution on T, then

[tex]P_T(A):=P(X^{-1}(A))[/tex]

where P is the probability measure on S.
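To make the pushforward construction P_T(A) = P(X^{-1}(A)) concrete, here is a minimal sketch for a finite sample space. The setup is an assumption chosen for illustration: S is the set of outcomes of three coin flips with success probability 1/3, and X counts the successes, so the induced distribution should be binomial. The names `P_point` and `pushforward` are hypothetical helpers.

```python
from fractions import Fraction
from itertools import product
from math import comb

n = 3
p = Fraction(1, 3)  # assumed success probability, for illustration

# Sample space S: all length-n sequences of 0s and 1s.
S = list(product((0, 1), repeat=n))

def P_point(omega):
    """Probability measure P on S, given on single outcomes (independent flips)."""
    k = sum(omega)
    return p**k * (1 - p) ** (n - k)

def X(omega):
    """Random variable X: S -> {0, ..., n}, the number of successes."""
    return sum(omega)

def pushforward(X, S, P_point):
    """Distribution P_T on the observation space: P_T({t}) = P(X^{-1}({t}))."""
    dist = {}
    for omega in S:
        t = X(omega)
        dist[t] = dist.get(t, Fraction(0)) + P_point(omega)
    return dist

P_T = pushforward(X, S, P_point)

# Compare with the textbook binomial pmf.
binomial = {k: comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)}
print(P_T == binomial)
```

Note that the function X alone does not carry P; the distribution only appears once P on S is supplied as a separate input.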

Rasalhague · Jul 15, 2011

Ah, thanks micromass! So from the requirement that

[tex]P_T(A)=P(X^{-1}(A)),[/tex]

we get the distribution, provided we already know P. But do we know P? I mean: as a function, X, tells us what S and T are, and a rule to get from one to the other, but does it encode information about the rest of the probability-space structure, such as which measure was chosen for S?

Hmm, I just had a thought. Is the general definition X:S-->T shorthand for something like X:Sx{P}-->Tx{P_T}, where x means Cartesian product?

micromass · Jul 15, 2011

S is the sample space which is always fixed. So the measure P on the sample space is always the same. If you change the measure P on the sample space, then of course the measure P_T will also change. But the point is that we always work with the same measure P on the same sample space throughout the problem. I've never seen a problem in probability where we suddenly have another sample space, or where we have two sample spaces.
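A tiny sketch of the dependence on P, under assumed toy values: the same function X on the same sample space gives different expectations under two different measures, which is why P has to be fixed as part of the problem. The coin-flip setup and the two measures are illustrative only.

```python
from fractions import Fraction

def X(omega):
    """Same random variable in both cases: indicator of heads on S = {"H", "T"}."""
    return 1 if omega == "H" else 0

def expectation(X, P):
    """E[X] = sum over outcomes of X(omega) * P({omega}) for a finite sample space."""
    return sum(X(omega) * prob for omega, prob in P.items())

# Two hypothetical probability measures on the same sample space.
P_fair = {"H": Fraction(1, 2), "T": Fraction(1, 2)}
P_biased = {"H": Fraction(3, 4), "T": Fraction(1, 4)}

e_fair = expectation(X, P_fair)
e_biased = expectation(X, P_biased)
print(e_fair, e_biased)
```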

Rasalhague · Jul 15, 2011

So, if all is fixed for a given problem, does the definition of a specific X tacitly include all information pertaining to the problem - (S,E,P), (T,F,Q) - making these terms, random variable, sample space, observation space, distribution, etc. effectively, in a sense, synonyms when it comes to specifying a named "distribution", such as the binomial distribution; so we could equally well say "the binomial random variable" or the "expected value of the binomial sample space", "...binomial observation space", "...binomial events"?

But then, consider problems that are different but similar. Might we not want to compare them? Are there not cases where two problems are modeled using the same sample space but a different random variable, or two problems which have the same sample space and the same random variable but differ in probability measure P, and hence in distribution P_T?

micromass · Jul 15, 2011

Rasalhague said:

So, if all is fixed for a given problem, does the definition of a specific X tacitly include all information pertaining to the problem - (S,E,P), (T,F,Q) - making these terms, random variable, sample space, observation space, distribution, etc. effectively, in a sense, synonyms when it comes to specifying a named "distribution", such as the binomial distribution; so we could equally well say "the binomial random variable" or the "expected value of the binomial sample space", "...binomial observation space", "...binomial events"?

Well, yes, you must see the information (S,E,P) and (T,F,Q) as fixed during a given problem. But there is no such thing as "the binomial random variable". There can be two random variables X:S-->T and Y:S-->T that both have the same binomial distribution, but that doesn't mean that they are equal!

As an example, consider

[tex]S=\{(0,0),(0,1),(1,0),(1,1)\}[/tex]

where each element has probability 1/4. And let X="the first coordinate" and Y="the second coordinate". These two random variables have the exact same distribution, but they are not equal! They are even independent!
In the same manner, we can have two different binomial random variables with the exact same distribution.
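The example can be checked directly. The sketch below encodes S with probability 1/4 on each point, defines X and Y as the two coordinates, and tests the three claims; the helper names `prob` and `distribution` are introduced here for illustration.

```python
from fractions import Fraction

S = [(0, 0), (0, 1), (1, 0), (1, 1)]
P = {omega: Fraction(1, 4) for omega in S}

def X(omega):
    return omega[0]  # first coordinate

def Y(omega):
    return omega[1]  # second coordinate

def prob(event):
    """P of the set of outcomes where the predicate `event` holds."""
    return sum(P[omega] for omega in S if event(omega))

def distribution(rv):
    """Induced distribution of rv as {value: probability}."""
    values = {rv(omega) for omega in S}
    return {v: prob(lambda omega, v=v: rv(omega) == v) for v in values}

same_distribution = distribution(X) == distribution(Y)
equal_as_functions = all(X(omega) == Y(omega) for omega in S)
independent = all(
    prob(lambda w: X(w) == a and Y(w) == b)
    == prob(lambda w: X(w) == a) * prob(lambda w: Y(w) == b)
    for a in (0, 1)
    for b in (0, 1)
)
print(same_distribution, equal_as_functions, independent)
```

X and Y disagree on the outcomes (0,1) and (1,0), which is exactly where "same distribution" and "same random variable" come apart.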

But then, consider problems that are different but similar. Might we not want to compare them? Are there not cases where two problems are modeled using the same sample space but a different random variable, or two problems which have the same sample space and the same random variable but differ in probability measure P, and hence in distribution P_T?

Yes, we can have two random variables on the same sample space. So we can look at different distributions. But the probability measure P of the sample space will never (or very rarely) be subject to change!
So, for example, if you read "take [itex]X_n[/itex] a sequence of independent random variables", then the sample space is always the same: we always have [itex]X_n:S\rightarrow T[/itex]. Furthermore, the probability on the sample space will not change.

Rasalhague · Jul 15, 2011

micromass said:

In the same manner, we can have two different binomial random variables with the exact same distribution.

Okay, point taken. So, if I've understood this, a random variable determines a distribution, but a distribution doesn't determine a random variable. And that's how it is that some people can say "the expected value of such-and-such a distribution" while others, referring to the same thing, say "the expected value of such-and-such a random variable" (i.e. of the class of random variables which induce that distribution).

micromass · Jul 15, 2011

Rasalhague said:

Okay, point taken. So, if I've understood this, a random variable determines a distribution, but a distribution doesn't determine a random variable. And that's how it is that some people can say "the expected value of such-and-such a distribution" while others, referring to the same thing, say "the expected value of such-and-such a random variable" (i.e. of the class of random variables which induce that distribution).

Indeed, a random variable will induce a distribution (but not vice versa). And good probability notions depend only on the distribution. That is, the expected value, the variance, etc. depend only on the distribution.
So, two variables that are "equally distributed" (i.e. have the same distribution) will have the same expected value, variance, etc.
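For a real-valued X this can be written out explicitly. A standard way to state it, in the notation of this thread and assuming T is a subset of the reals, g is a measurable real-valued function, and the integrals exist, is the change-of-variables identity

[tex]E[X]=\int_S X(\omega)\,dP(\omega)=\int_T t\,dP_T(t),\qquad E[g(X)]=\int_S g(X(\omega))\,dP(\omega)=\int_T g(t)\,dP_T(t).[/tex]

The middle expressions treat E as a function of the random variable (with P fixed in the background), while the right-hand sides mention only the distribution P_T. That is why "the expected value of X" and "the expected value of the distribution of X" name the same number.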
