How do we interpret a random variable?

In summary, a random variable is a function that maps elements of a sample space to a measurable space, such as the real numbers. In elementary treatments it is often spoken of as a variable whose value becomes known once an experiment has been performed, but the mathematical theory of probability makes no assumption about performing experiments or taking random samples, so it is important to distinguish the theory from its application to specific physical problems. The term "variable" admits different interpretations; a random variable must have an associated probability space, and is best read as a symbol standing for a mathematical structure rather than as a conventional algebraic variable. The discussion also touches on how, in logic, a propositional function is converted into a proposition by means of quantifiers.
  • #1
etotheipi
I've read that we can define a random variable on a probability space ##(\Omega, F, P)## such that it is a function that maps elements of the sample space to a measurable space - for instance, the reals - i.e. ##X: \Omega \rightarrow \mathbb{R}##.

That being said, it's often treated (at least from what I've encountered in elementary probability) as a variable, whose value becomes known after the experiment has been completed. When we write the probability of a certain event, like ##P(X<5)##, then I guess what we really mean is ##P(X(\omega)<5)## where ##\omega \in \Omega##. But if - as is often the case - the elements of ##\Omega## are also real numbers, then a perfectly good mapping would be ##X(\omega) = \omega##, so we need not even think too much about it.
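To make that concrete for myself, here's a minimal sketch (Python, with a made-up finite example of a fair die) of that picture: ##\Omega## as a set of outcomes, ##P## as a function on events, and ##X## as an ordinary function on ##\Omega##, so that ##P(X<5)## is just the probability of the preimage ##\{\omega : X(\omega) < 5\}##:

```python
# A toy finite probability space (Omega, P) and a random variable X: Omega -> R.
# Hypothetical example: a single roll of a fair six-sided die, with X(omega) = omega.

Omega = [1, 2, 3, 4, 5, 6]                  # the sample space
P = {omega: 1 / 6 for omega in Omega}       # probability of each outcome

def X(omega):
    """The random variable: an ordinary function on outcomes."""
    return omega                            # here the identity map is a perfectly good choice

def prob(event):
    """P(A) for an event A, i.e. a subset of Omega."""
    return sum(P[omega] for omega in event)

# "P(X < 5)" is shorthand for the probability of the preimage {omega : X(omega) < 5}.
event = {omega for omega in Omega if X(omega) < 5}
print(prob(event))                          # 0.666..., i.e. 4/6
```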

My question is, is it correct to think of the random variable as analogous to a variable like velocity, which happens to also be a function (of time, or other parameters)? As in, we may define velocity as a function, ##v(t) = \frac{dx}{dt}##, but more often than not I tend to think of ##v## as a variable. Can we treat a random variable ##X## in the same way, as a variable?

That's what the name seems to imply, but names can be misleading... the text I'm using makes a point to state that it is definitely not a variable (nor is it random)!
 
Last edited by a moderator:
  • Like
Likes Klystron
  • #2
etotheipi said:
That being said, it's often treated (at least from what I've encountered in elementary probability) as a variable, whose value becomes known after the experiment has been completed.
We must distinguish between mathematical theory versus the applications of theory to specific physical problems. In the mathematical theory of probability, there is no definition concerning doing experiments which have outcomes. There is no definition of taking random samples or any assumption that random samples can be taken. In the application of probability theory, outcomes of experiments and random samples are the common interpretations given to random variables.

My question is, is it correct to think of the random variable as analogous to a variable like velocity, which happens to also be a function (of time, or other parameters)?
It's difficult to define "variable" in a precise manner. A rigorous definition of "variable" is given in documents that define the syntax of computer languages or other formal languages. What definition of "variable" do you have in mind?

That's what the name seems to imply, but names can be misleading... the text I'm using makes a point to state that it is definitely not a variable (nor is it random)!
A deterministic quantity need not have an associated probability space. A "random variable" must have an associated probability space. If you wish to use the term "variable" only to describe something that has a single definite but perhaps unknown numerical value, then a "random variable" is not such a variable. If you wish to use the term "variable" to denote a symbol that stands for a mathematical structure that can consist of many other structures (such as a probability space), then a "random variable" can be such a variable.
 
  • Like
  • Informative
Likes sysprog, Klystron and etotheipi
  • #3
Stephen Tashi said:
It's difficult to define "variable" in a precise manner. A rigorous definition of "variable" is given in documents that define the syntax of computer languages or other formal languages. What definition of "variable" do you have in mind?

For instance, we might denote the sum of 3 observations ##S = X_{1} + X_{2} + X_{3}##, and use this in expectation algebra and what not. In this scenario, the ##X##'s are definitely being used as variables - in the sense that they could take on any values.

This doesn't seem to mesh well at all with the definition of a random variable as a function. I don't even know how the function definition would make sense in the context of the sum of those observations.

Stephen Tashi said:
A deterministic quantity need not have an associated probability space. A "random variable" must have an associated probability space. If you wish to use the term "variable" only to describe something that has a single definite but perhaps unknown numerical value, then a "random variable" is not such a variable. If you wish to use the term "variable" to denote a symbol that stands for a mathematical structure that can consist of many other structures (such as a probability space), then a "random variable" can be such a variable.

I sort of see the distinction you are making, though it's a little subtle. ##X## doesn't seem like a conventional algebraic variable (we definitely can't solve for it!), but we can plug it through various formulae and get out variances and all sorts - though this might be indicative of the alternative mathematical structure you were talking about!

So I'm sort of at a loss about how to interpret it!
 
  • Like
Likes sysprog
  • #4
etotheipi said:
For instance, we might denote the sum of 3 observations ##S = X_{1} + X_{2} + X_{3}##, and use this in expectation algebra and what not. In this scenario, the ##X##'s are definitely being used as variables - in the sense that they could take on any values.

What would it mean to "take on a value"?

A last resort is to try Logic! In logic a "proposition" is either True or False, and not both. A sentence like "2X=3" is not a proposition. Instead it is a "propositional function". It maps particular values of the "variable" "X" to a particular value of True or False. To convert a propositional function into a proposition we can use the quantifiers "for each" and "there exists". For example, we can say "for each number X, 2X = 3" and produce a False proposition. We can say "there exists a number X such that 2X = 3" and produce a True proposition.

Often mathematical writing does not explicitly employ quantifiers and it is left to the reader to supply them. For example, we would usually understand "x + x = 2x" to mean "for each number x, x + x = 2x". If we were looking at a solution to a problem and saw "x = 4 - 2x", we would understand this to mean that for the particular x that exists as a solution, x = 4 - 2x.

In coherent mathematical writing, a symbolic expression like ##S = X_{1} + X_{2} + X_{3}## must be interpretable as a complete sentence and, if some assertion is being made, we must be able to interpret it as a proposition. How would you interpret the expression "##S = X_{1} + X_{2} + X_{3}##"? I think it is ambiguous without any context.
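To illustrate the role of the quantifiers, here is a minimal sketch (Python, over a small hypothetical finite domain standing in for "the numbers", since the real line cannot be enumerated) of how "for each" and "there exists" turn the propositional function "2X = 3" into propositions:

```python
from fractions import Fraction

# A propositional function: it maps a value of X to True or False.
def prop(x):
    return 2 * x == 3

# A small stand-in domain for "the numbers": -5, -4.5, ..., 4.5, 5.
domain = [Fraction(n, 2) for n in range(-10, 11)]

print(all(prop(x) for x in domain))   # "for each number X, 2X = 3"            -> False
print(any(prop(x) for x in domain))   # "there exists a number X with 2X = 3"  -> True (X = 3/2)
```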

This doesn't seem to mesh well at all with the definition of a random variable as a function. I don't even know how the function definition would make sense in the context of the sum of those observations.
One can speak of adding two functions to obtain another function, e.g. ##f(x) = 2x##, ##g(x) = \cos(x)##, ##h(x) = (f+g)(x) = 2x + \cos(x)##.

I sort of see the distinction you are making, though it's a little subtle. ##X## doesn't seem like a conventional algebraic variable

Technically, in your example, the random variables are ##X_1, X_2, X_3##. How would a reader interpret the symbol "X" without a subscript? (It's a common convention, but can you express in words what you are doing?)

(we definitely can't solve for it!),

We can't solve for a single value of x in the statement "for each number x, x + x = 2x".
So I'm sort of at a loss about how to interpret it!

The same symbols are used for different mathematical structures. For example, " A = B" means one thing for numbers A,B and something completely different for sets A,B. With each type of mathematical structure we must learn new definitions for the same symbolism. What I consider "the power of mathematics" is its ability to get results from symbolic manipulations without our having to think about and express in words the meaning of the symbolic manipulations. People can get results without fully understanding what is going on. However, understanding mathematics requires understanding the technical meaning of the symbolism. That is very difficult. When a concept is explained in words, we are alerted to the fact that there is something to understand. When symbols are written down, if they are familiar, we tend to feel they say something self evident and don't bother to translate them into words.
 
  • Like
  • Informative
Likes sysprog and etotheipi
  • #5
Stephen Tashi said:
In coherent mathematical writing, a symbolic expression like ##S = X_{1} + X_{2} + X_{3}## must be interpretable as a complete sentence and, if some assertion is being made, we must be able to interpret it as a proposition. How would you interpret the expression "##S = X_{1} + X_{2} + X_{3}##"? I think it is ambiguous without any context.
---
Technically, in your example, the random variables are ##X_1, X_2, X_3##. How would a reader interpret the symbol "X" without a subscript? (It's a common convention, but can you express in words what you are doing?)

I'm very unsure. Up until now I've worked with them by thinking of them as placeholders for the possible values they could take given a distribution. Putting it into words, if ##X## represents the length of a piece of string from a certain manufacturer, then I'd say "The total length of 3 pieces of string equals the length of the first string (##X_1##) plus the length of the second string (##X_2##) plus the length of the third string (##X_3##)". This seems a little fishy though, since each ##X_i## doesn't represent a single number.

If Wikipedia is not lying, then ##X## is a function and ##S## is then the sum of three functions. So I suppose we might then say that ##S## is another function. But it doesn't make any sense to write ##P(S<63)## if ##S## is a function!
 
  • Like
Likes sysprog
  • #6
etotheipi said:
I'm very unsure. Up until now I've worked with them by thinking of them as placeholders for the possible values they could take given a distribution. Putting it into words, if ##X## represents the length of a piece of string from a certain manufacturer, then I'd say "The total length of 3 pieces of string equals the length of the first string (##X_1##) plus the length of the second string (##X_2##) plus the length of the third string (##X_3##)". This seems a little fishy though, since each ##X_i## doesn't represent a single number.

That is a correct interpretation. You are correct to be cautious. The use of "=" with random variables is tricky. For example, with numbers we can reason: if X = Y then X+Y = X+X = 2X. However, in applications of random variables "X+Y" denotes the random variable whose samples are created by adding one sample taken from X to one sample taken from Y. This is different from taking one sample from X and multiplying that value by 2.
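To see the difference concretely, here is a minimal sketch (Python, using two hypothetical independent fair dice, so that X and Y have the same distribution) comparing the distribution of X+Y with that of 2X:

```python
from collections import Counter
from itertools import product

faces = range(1, 7)    # a fair six-sided die; X and Y have the same distribution

# X + Y: one sample from X plus one sample from Y, enumerated over the product space.
counts = Counter(x + y for x, y in product(faces, faces))
pmf_sum = {s: n / 36 for s, n in sorted(counts.items())}

# 2X: one sample from X, multiplied by 2.
pmf_double = {2 * x: 1 / 6 for x in faces}

print(pmf_sum)      # values 2..12 with triangular probabilities 1/36, 2/36, ..., 6/36, ..., 1/36
print(pmf_double)   # only the even values 2, 4, ..., 12, each with probability 1/6
```

Both random variables have the same mean, but their distributions are plainly different.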

Many texts use the notation ##X \sim Y## to indicate that X and Y are two random variables that have the same probability distribution, but are different in the sense that taking samples from X is a different experiment than taking samples from Y.

The notation "=" is used to make definitions. So ##S = X_1 + X_2 + X_3## abbreviates the statement that a sample of the random variable ##S## is defined by an experiment that involves sampling 3 other random variables and adding those three values. In general, each ##X_i## is a function of an outcome ##\omega_i## that is an element of a sample space ##\Omega_i##. So ##S## is a function of the triple ##(\omega_1, \omega_2, \omega_3)##, and the set of outcomes in the probability space for ##S## consists of the set of such triples (technically the "cartesian product" of the 3 other spaces) ##\Omega = \Omega_1 \times \Omega_2 \times \Omega_3##. The probability measure associated with ##S## is not defined by the "=" in the above notation. Often, the probability measure on ##S## is implied by saying that "##X_1, X_2, X_3## are mutually independent". This tells us that the probability density function of the "random vector" ##(\omega_1, \omega_2,\omega_3)## is ##f(\omega_1,\omega_2,\omega_3) = f_1(\omega_1)f_2(\omega_2) f_3(\omega_3)##, where ##f_i## is the probability density for ##X_i##. Knowing the probability density for the random vector ##(\omega_1,\omega_2,\omega_3)##, we can compute the probability density for ##S## by an application of integration known as "convolution". In other cases, the probability measure of the random vector does not take such a simple form.
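In the discrete case the convolution is just a sum, and the computation can be sketched numerically. Here is a minimal illustration (Python with numpy, assuming three hypothetical independent fair dice) of obtaining the probability mass function of ##S = X_1 + X_2 + X_3## by convolving the three individual mass functions:

```python
import numpy as np

# pmf of a single fair die on the values 1..6
die = np.full(6, 1 / 6)

# pmf of S = X_1 + X_2 + X_3 for independent dice: convolve the three pmfs.
# The support of the result runs from 1+1+1 = 3 up to 6+6+6 = 18.
pmf_S = np.convolve(np.convolve(die, die), die)
support = np.arange(3, 19)

print({int(s): float(p) for s, p in zip(support, np.round(pmf_S, 4))})
print(float(pmf_S.sum()))   # 1.0 (up to rounding), as a probability mass function should be
```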

If Wikipedia is not lying, then ##X## is a function and ##S## is then the sum of three functions. So I suppose we might then say that ##S## is another function. But it doesn't make any sense to write ##P(S<63)## if ##S## is a function!

You are correct that the notation ##P(S < 63)## doesn't apply to ##S## merely because it is a function. However, if ##S## is defined as a random variable by ##S = X_1 + X_2 + X_3##, then ##S## does have some associated probability measure. It is true that the notation "##S = X_1 + X_2 + X_3##" does not, by itself, define the probability measure for ##S## in terms of the probability measures associated with the ##X_i##.

An oft-recurring topic in probability theory is problems of the form: given that ##S## is such-and-such a function of mutually independent ##X_i##, compute the probability density for ##S##. Examples (a sketch of the ##\max## case follows the list) are:
##S = X_1 + X_2##
##S = \max(X_1, X_2)##
##S = (X_1)(X_2)##
##S = X_1/X_2##
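As an illustration of the second example, here is a minimal sketch (Python, assuming ##X_1, X_2## are hypothetical independent uniform random variables on ##[0,1]##) comparing the simulated distribution of ##\max(X_1, X_2)## with the analytic answer ##F(z) = z^2##, ##f(z) = 2z##:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Independent uniform(0,1) samples for X1 and X2, and samples of S = max(X1, X2).
x1, x2 = rng.random(n), rng.random(n)
s = np.maximum(x1, x2)

# For independent uniforms, P(S <= z) = P(X1 <= z) * P(X2 <= z) = z**2, so the pdf is 2z.
z = 0.7
print(float(np.mean(s <= z)), z**2)   # simulated vs analytic CDF at z = 0.7
print(float(np.mean(s)), 2 / 3)       # simulated vs analytic mean, E[S] = ∫ z·2z dz = 2/3
```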
 
  • Informative
  • Like
  • Love
Likes sysprog, Klystron and etotheipi
  • #7
Awesome, that's a lot to take in but it clarifies a lot of what I had doubts about! Also interesting since I'd never really considered how constructions which appear similar and are manipulated in the same way (e.g. in the form of an equation) can indeed have quite distinct interpretations. Especially this part:

Stephen Tashi said:
What I consider "the power of mathematics" is its ability to get results from symbolic manipulations without our having to think about and express in words the meaning of the symbolic manipulations.
On another note,
Stephen Tashi said:
An oft-recurring topic in probability theory is problems of the form: given that ##S## is such-and-such a function of mutually independent ##X_i##, compute the probability density for ##S##. Examples are:
##S = X_1 + X_2##
##S = \max(X_1, X_2)##
##S = (X_1)(X_2)##
##S = X_1/X_2##

I think the way one would do these is to write down the CDF of ##S## in terms of ##X_1## and ##X_2##, and the PDF could then be determined on differentiation.

For the one with the max function, I'm going to suppose that the constraint is that, for ##P(Z<z)##, both observations are also less than ##z##. This should then allow solving for the PDF!
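Writing ##Z = \max(X_1, X_2)## and assuming ##X_1## and ##X_2## are independent, that line of reasoning would give

##F_Z(z) = P(X_1 \leq z \text{ and } X_2 \leq z) = F_{X_1}(z) \, F_{X_2}(z)##
##f_Z(z) = \frac{d}{dz} F_Z(z) = f_{X_1}(z) F_{X_2}(z) + F_{X_1}(z) f_{X_2}(z)##

so in the identically distributed case the PDF would come out as ##2F(z)f(z)##.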
 
  • Like
Likes sysprog
  • #8
etotheipi said:
I think the way one would do these is to write down the CDF of ##S## in terms of ##X_1## and ##X_2##, and the PDF could then be determined on differentiation.

For the one with the max function, I'm going to suppose that the constraint is that, for ##P(Z<z)##, both observations are also less than ##z##. This should then allow solving for the PDF!

The details would be a digression from the general topic of random variables, but it's on-topic to mention that the notion of a random variable as a function leads to new concepts of integration and differentiation.

One convenient aspect of the definition of a random variable as a function is that a "function of a random variable" is itself a random variable. A random variable ##X## is defined as a function on the outcomes of a probability space ##S = (\Omega,\mathcal{F},P)##. The variable ##X## lets us define a different probability space ##S_X = (\Omega_X,\mathcal{F}_X,P_X)## that describes the possible values of ##X## and their associated probabilities. If ##F## is a function defined on those values, then ##F## is a random variable on the space ##S_X##.
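A minimal sketch of that idea (Python, with a hypothetical die-valued ##X## and a hypothetical choice of ##F##): the distribution of ##F## on the value space of ##X## is obtained by pushing the probabilities of ##X## forward through ##F##:

```python
from collections import defaultdict

# pmf of X on its value space Omega_X: a fair die.
pmf_X = {x: 1 / 6 for x in range(1, 7)}

def F(x):
    """A function defined on the values of X; here, hypothetically, the remainder mod 3."""
    return x % 3

# F is a random variable on the value space of X: its pmf is obtained by summing the
# probabilities of all values of X that F maps to the same place.
pmf_F = defaultdict(float)
for x, p in pmf_X.items():
    pmf_F[F(x)] += p

print(dict(pmf_F))   # each of the values 0, 1, 2 ends up with probability 1/3
```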

Introductory probability theory deals with two types of random variables: continuous and discrete. Discrete variables are handled with summations and differences. Continuous variables are handled with Riemann integration and differentiation. However, the notion of "function" is very general. It includes things that can be described by complicated algorithms that include conditions of the form "if so-and-so then such-and-such". The generality of the notion of "function" leads to random variables that are neither (purely) continuous nor discrete.

For example, if ##X## is a uniformly distributed random variable on the interval [0,1], we can define another random variable ##F(X)## as:

##F(X) = X## when ##X < 1/3##.
##F(X) = 1/3## when ##X \in [1/3, 2/3]##
##F(X) = 1/3 + 2( X - 2/3)## when ##X > 2/3##

Defining a probability density function ##f## for ##F## involves treating the value 1/3 in a special manner. Riemann integration gives ##\int_{1/3}^{1/3} f(x) dx = 0## but we want the answer to be 1/3.

People can handle such cases informally by using Riemann integration together with adding a value to account for a "point mass" at the value ##1/3##. Developing a proper mathematical theory that can handle such situations involves defining notions of integration and differentiation that are more general than those in introductory calculus.
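A quick simulation makes the point mass visible (a minimal sketch in Python, under the assumptions above):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random(1_000_000)    # X uniform on [0, 1]

# The piecewise map F from above: identity below 1/3, constant 1/3 in the middle,
# then rising with slope 2 above 2/3.
y = np.where(x < 1/3, x,
    np.where(x <= 2/3, 1/3,
             1/3 + 2 * (x - 2/3)))

print(float(np.mean(y == 1/3)))   # about 1/3: a genuine point mass at the value 1/3
print(float(np.mean(y < 1/3)))    # about 1/3: the continuous part below 1/3
```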
 
Last edited:
  • Love
  • Like
Likes sysprog and etotheipi
  • #9
Although the topic of random variables motivates new types of integration and differentiation, it's worth mentioning that the definition of a random variable does not require that a random variable have an associated distribution function. (In spite of the habit that I and others have of talking about random variables as if a distribution is taken for granted!)

A random variable is associated with a probability space ##(\Omega,\mathcal{F},P)##. The function ##P## has a domain consisting of certain subsets of ##\Omega##. The set ##\Omega## need not be a set of numbers. In the special case that ##\Omega## happens to be the real number line and we wish to assign a probability to a subset such as ##[0,1/2]##, the set ##[0,1/2]## must be in the domain of ##P##. So ##P## is not a cumulative distribution function or a probability density function; the domain of those functions consists of individual numbers such as ##0## or ##1/2##, not subsets of ##\Omega##.

A cumulative distribution function ##F## provides a way of implementing the probability measure ##P##. For example, we can define ##P([a,b])## to be ##F(b) - F(a)##. The most important applications of random variables involve those that have distribution functions. However, keep in mind that distribution functions are not required by the definition of a random variable. The definition requires the existence of a probability measure, without any specification of how that probability measure is implemented.
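As a minimal sketch of that idea (Python, assuming a hypothetical exponential distribution with rate 1, written out by hand):

```python
import math

def F(x):
    """CDF of an exponential random variable with rate 1: F(x) = 1 - e^(-x) for x >= 0."""
    return 1 - math.exp(-x) if x >= 0 else 0.0

def P_interval(a, b):
    """Implement the probability measure on an interval from a to b via the CDF."""
    return F(b) - F(a)

print(P_interval(0.0, 0.5))       # 1 - e^(-0.5) ≈ 0.3935
print(P_interval(0.0, math.inf))  # total probability: 1.0
```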
 
  • Like
  • Informative
Likes sysprog, Klystron and etotheipi
  • #10
Stephen Tashi said:
Although the topic of random variables motivates new types of integration and differentiation, it's worth mentioning that the definition of a random variable does not require that a random variable have an associated distribution function. (In spite of the habit that I and others have of talking about random variables as if a distribution is taken for granted!)

A random variable is associated with a probability space ##(\Omega,\mathcal{F},P)##. The function ##P## has a domain consisting of certain subsets of ##\Omega##. The set ##\Omega## need not be a set of numbers. In the special case that ##\Omega## happens to be the real number line and we wish to assign a probability to a subset such as ##[0,1/2]##, the set ##[0,1/2]## must be in the domain of ##P##. So ##P## is not a cumulative distribution function or a probability density function; the domain of those functions consists of individual numbers such as ##0## or ##1/2##, not subsets of ##\Omega##.

A cumulative distribution function ##F## provides a way of implementing the probability measure ##P##. For example, we can define ##P([a,b])## to be ##F(b) - F(a)##. The most important applications of random variables involve those that have distribution functions. However, keep in mind that distribution functions are not required by the definition of a random variable. The definition requires the existence of a probability measure, without any specification of how that probability measure is implemented.

You explained it a lot better than Wikipedia did! So ##P## is a function whose domain is ##\mathcal{F}##, which maps each event to a number in ##[0,1]##. But no matter whether the sample space is discrete or continuous, we can always use the function ##P## (e.g. continuous: ##P(X<24.1343)## or discrete: ##P(X=5)##) since we just change how we define ##P##!
 
  • Like
Likes sysprog
  • #11
etotheipi said:
So ##P## is a function whose domain is ##\mathcal{F}##, which maps each event to a number in ##[0,1]##.

Yes (using the term "event" to refer to a subset of ##\Omega## that is an element of ##\mathcal{F}##, and the term "outcome" to refer to a (single) element of the set ##\Omega##).

The vexing thing about ##\mathcal{F}## is that we cannot necessarily make ##\mathcal{F}## the collection of all possible subsets of ##\Omega##. So there can be subsets of ##\Omega## that are not "events". Intuitively, this wouldn't be surprising in some weird probability space that was unrelated to any practical application. However, in the commonly encountered situation where ##\Omega## is the set of real numbers and we have a measure ##P## defined by use of a cumulative distribution, there are "unmeasurable" subsets of ##\Omega## that are not elements of ##\mathcal{F}##. I don't have an intuitive grasp of what an unmeasurable set is. The formal definition of "Vitali set" is easy to look up.
 
  • Informative
  • Like
Likes sysprog and etotheipi

1. What is a random variable?

A random variable is a numerical quantity that takes on different values based on the outcome of a random event. It is often represented by the letter X and can be discrete or continuous.

2. How do we interpret a random variable?

Interpreting a random variable involves understanding its possible values and the probability of each value occurring. This can be done through mathematical calculations, such as finding the mean or standard deviation, or through visual representations, such as probability distributions.

3. What is the difference between a discrete and continuous random variable?

A discrete random variable can only take on specific, separate values, while a continuous random variable can take on any value within a range. For example, the number of children a family has is a discrete random variable, while the height of a person is a continuous random variable.

4. How do we calculate the mean of a random variable?

The mean, or expected value, of a random variable is calculated by multiplying each possible value by its corresponding probability and then adding all of these products together. This can also be written as a summation formula: E(X) = ∑xP(x).
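For example, for a single roll of a fair six-sided die, E(X) = (1 + 2 + 3 + 4 + 5 + 6) × (1/6) = 3.5.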

5. What is a probability distribution and how is it related to a random variable?

A probability distribution is a function that assigns probabilities to each possible value of a random variable. It shows the likelihood of each value occurring and can be used to calculate other important measures, such as the mean and variance, of the random variable.
