Basic random variable question - measure theory approach

  1. Oct 22, 2012 #1
    I have always struggled in understanding probability theory, but since coming across the measure theoretic approach it seems so much simpler to grasp. I want to verify I have a couple basic things.

    So say we have a set χ. Together with a σ-algebra κ on χ, we can call (χ,κ) a measurable space. We can define a measure M on κ that satisfies the basic requirements (non-negativity, countable additivity, the measure of the empty set is 0). Now (χ,κ,M) is a measure space.

    Now, getting to probability theory: if M(χ) < ∞, M is a finite measure, and in particular, if M(χ)=1, we can consider M a probability measure. If we're talking about a probability measure, we can consider χ as our sample space, and every E in κ is an event. Furthermore, we can call the measure space (χ,κ,M) a probability space.

    If M is a probability measure, it assigns to each event E in κ a probability, which is the chance that E will happen. (That's why M(χ)=1: this is like asking, if I get some feasible outcome, what's the probability that it is one of the feasible outcomes in χ? Obviously a 100% chance. Meanwhile M(∅)=0 asks, what's the probability that the outcome isn't any of the possible outcomes in χ? No chance of that happening, since we've established beforehand that the outcome is feasible.)

    So here is my confusion: understanding the random variable, R. From what I understand, the analogue in measure theory is that the random variable is a measurable function. For example, like the counting measure, or the Lebesgue measure. A measurable function is a function R: S1 --> S2, where (S1,Ω1,M1) and (S2,Ω2,M2) are measure spaces, such that if E is in Ω2 (the σ-algebra on S2), then R⁻¹(E) is in Ω1 (its pre-image under R is an event in the σ-algebra on S1). I have difficulty understanding intuitively what this is in a probability sense. What are the two measure spaces we would be mapping between? I feel like with probability, I'm only looking at one measure space: for example, the population of a country could be my sample space, whereas the events could correspond to different possible ages of the residents, so an event might be (person A is 25, person B is 30, etc.). What is this "second measurable space" I would be dealing with when establishing a random variable?
    Last edited: Oct 22, 2012
  3. Oct 22, 2012 #2
    It's been a while since I looked into this theory in any depth, but the Wikipedia article should be useful.

    Suppose you have a probability space where the sample space X = {a,b}. Let your sigma-algebra be the collection {empty, {a}, {b}, {a,b}}, and let the probability measure be P(empty) = 0, P({a}) = P({b}) = 0.5, and P({a,b}) = 1.
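
    As a rough check of the two-point example above, the following sketch verifies the σ-algebra and probability-measure axioms by brute force. The helper names `is_sigma_algebra` and `is_probability_measure` are made up for illustration, and on a finite space it is assumed that checking pairwise unions is enough.

```python
from fractions import Fraction
from itertools import combinations

# Sample space and sigma-algebra from the post (here, the full power set).
X = frozenset({"a", "b"})
sigma_algebra = {frozenset(), frozenset({"a"}), frozenset({"b"}), X}

# Probability measure from the post, stored as exact fractions.
P = {
    frozenset(): Fraction(0),
    frozenset({"a"}): Fraction(1, 2),
    frozenset({"b"}): Fraction(1, 2),
    X: Fraction(1),
}

def is_sigma_algebra(space, family):
    """Hypothetical helper: contains the space, closed under complement and union."""
    if space not in family:
        return False
    if any(space - A not in family for A in family):
        return False
    return all(A | B in family for A, B in combinations(family, 2))

def is_probability_measure(space, family, measure):
    """Hypothetical helper: P(empty)=0, P(space)=1, non-negative, additive on disjoint pairs.

    Assumes `family` has already passed is_sigma_algebra.
    """
    if measure[frozenset()] != 0 or measure[space] != 1:
        return False
    if any(measure[A] < 0 for A in family):
        return False
    return all(
        measure[A | B] == measure[A] + measure[B]
        for A, B in combinations(family, 2)
        if not (A & B)  # only disjoint pairs need to be additive
    )

print(is_sigma_algebra(X, sigma_algebra))           # True
print(is_probability_measure(X, sigma_algebra, P))  # True
```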

    Though the probability measure itself already contains the information needed to produce a random variable, a measurable function lets you take that probability space and apply it to other measurable spaces, e.g. the real numbers.

    An (R, Borel measure) random variable is a function F: X → R that would map from the sample space X to intervals in R, the real numbers (with the Borel measure defined by μ([x, y]) = y − x). For example, mapping F({a}) = [0, 0.5] and F({b}) = [0.5, 1]. This random variable would be in the interval [0, 0.5] with probability 0.5, per the probability measure. This is a measurable function because the preimage under F of each of those two intervals is measurable back in the probability space.

    In this case, the Borel measure of each interval equalled its probability, but you could also map G({a}) = [0, 1] and G({b}) = [1, 2]. This random variable would be in the interval [0, 1] with probability 0.5.

    So a measurable function relates a fixed probability space to many different possibilities of realizing random variables drawn with that fixed probability distribution.

    That is my understanding, but I never progressed terribly far into formal probability theory so I too may have misunderstandings or may have made mistakes!
  4. Oct 22, 2012 #3
    Hey, thank you so much for the response, and especially this sentence here. That had been my main confusion, because I would look at it and think, I already have a "function" (the probability measure), so I was getting very confused by how the random variable related to this, why one was more important than the other, etc. It seems like they are very much related, and it makes sense to say the random variable allows me to kind of apply it to other measurable spaces.
  5. Oct 22, 2012 #4
    You should see random variables as some kind of "selection of data". Typically, sample spaces contain a lot of data that we're not interested in. For example, let's say that we want to prove or disprove the following: if I throw 10 dice, then the probability that the sum of the eyes is exactly 37 is 1/4. An easy way of proving/disproving this is just to throw 10 dice a number of times. So let's throw 10 dice 1000 times. If approximately 1/4 of our sums are 37, then the conjecture is proved (whatever approximately means).

    So, we begin by throwing dice. We get results such as (1,3,5,2,1,2,1,2,1,1). However, this is too much information. We are simply not interested in the first die being 1 and the second one being 3. We are only interested in the sum of the eyes. In this case, we are interested in the number 19. So instead of keeping track of the full outcome (1,3,5,2,1,2,1,2,1,1), we might just keep track of 19.

    This is where random variables come in. The sample space is often extremely big and contains a lot of useless information. But the random variable selects the interesting information. For example, we might have the random variable [itex]X=\text{sum of the eyes of the 10 dice}[/itex]. This defines a function [itex]X:\Omega\rightarrow \mathbb{R}[/itex].

    We want our X to be measurable precisely so that we can define a measure on [itex]\mathbb{R}[/itex] by [itex]P_X(A)=P(X^{-1}(A))[/itex] for Borel sets A. In order for this to be well-defined, we need [itex]X^{-1}(A)[/itex] to be an event, that is, a measurable subset of [itex]\Omega[/itex].
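
    To make the induced measure P_X concrete, here is a minimal sketch for the dice sum, assuming fair six-sided dice. The function names `sum_distribution` and `pushforward` are invented for this illustration, and the set A is represented as a predicate on real numbers.

```python
from fractions import Fraction

def sum_distribution(num_dice, sides=6):
    """Return {total: probability} for the sum of fair dice, i.e. P_X on single values."""
    dist = {0: Fraction(1)}
    for _ in range(num_dice):
        new_dist = {}
        for total, prob in dist.items():
            for face in range(1, sides + 1):
                key = total + face
                new_dist[key] = new_dist.get(key, Fraction(0)) + prob / sides
        dist = new_dist
    return dist

def pushforward(dist, A):
    """P_X(A) = P(X^{-1}(A)), with the set A given as a predicate on values of X."""
    return sum(p for t, p in dist.items() if A(t))

P_X = sum_distribution(10)
print(sum(P_X.values()))                           # total mass: 1
print(float(pushforward(P_X, lambda t: t == 37)))  # probability that the sum is exactly 37
```

    Under these assumptions the exact value for a sum of 37 should come out well below 1/4, so the conjecture can be settled without throwing any dice.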
  6. Oct 22, 2012 #5
    Micromass, thanks for this definition and the example. It helps tremendously.

    Can you tell me if my thinking is correct? If I take your example, where X is the sum of the ten dice, then I almost feel like I need to use a sort of Lebesgue integral to answer questions like "what's the probability that I'll get a sum between a and b?". Let me explain my reasoning:

    So I have my probability space (Ω,κ,M) where Ω is the sample space, κ is the event space, and M is the probability measure. My random variable X represents the sum of the ten dice. Now X is a mapping from Ω --> ℝ. I feel we can compute something like the Lebesgue integral as follows: break up the range into pieces (for example, 15 would correspond to the point in ℝ where the sum of the dice is 15). So for each t in the range of X, we could look at the pre-image of t. (This pre-image would give us all the outcomes where the sum of the dice is t.) Since X is a measurable function, we know this pre-image must be in κ, so we can assign it a measure with M; we "measure" this pre-image and record the result. Now, if we sum together all of these measures for each possible point in the range, that should give us 1, right (since partitioning up the range corresponds to pre-images that are disjoint; for example, an outcome in the pre-image of 15 can't also be in the pre-image of 10)? Furthermore, if we want to know the probability that, say, the sum of the dice is between numbers a and b, then for each number t from a through b we go through this process and sum these measures together, and the resulting sum would give us the answer (as in, for t from a to b, take the pre-image X⁻¹(t), then take M(X⁻¹(t)), and sum all of these measures together).
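
    The procedure described above can be sketched by direct enumeration. To keep the sample space small, this example assumes 3 fair dice rather than 10; the names `preimage` and `prob_between` are hypothetical.

```python
from fractions import Fraction
from itertools import product

NUM_DICE = 3  # assumption: 3 dice instead of 10, so omega has only 216 outcomes
omega = list(product(range(1, 7), repeat=NUM_DICE))  # sample space: all outcomes

def M(event):
    """Uniform probability measure: each outcome has mass 1/|omega|."""
    return Fraction(len(event), len(omega))

def X(outcome):
    """Random variable: the sum of the dice."""
    return sum(outcome)

def preimage(t):
    """X^{-1}({t}): the set of outcomes whose sum is t."""
    return {w for w in omega if X(w) == t}

values = sorted({X(w) for w in omega})        # range of X
total = sum(M(preimage(t)) for t in values)   # the preimages partition omega

def prob_between(a, b):
    """M(X^{-1}([a, b])) computed as a sum over the disjoint preimages."""
    return sum(M(preimage(t)) for t in values if a <= t <= b)

print(total)                 # 1
print(prob_between(10, 11))  # 1/4
```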
  7. Oct 22, 2012 #6
    Yes, I think you get the right idea. But I want to remark that you're not really doing a Lebesgue integral. When I hear Lebesgue integral, I think of an integral with respect to Lebesgue measure [itex]\lambda[/itex]. What you want, however, is an integral with respect to the measure M.

    But this technical issue aside, your intuition seems correct. If we want to find the probability that X takes a value between a and b, then we want to find [itex]M(X^{-1}([a,b]))[/itex]. This can of course be written as an integral:

    [tex]\int_{X^{-1}([a,b])} dM[/tex]

    Measure theoretic integrals will be especially crucial in determining expectation values.
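
    For a random variable X that takes only countably many values, as with the dice sum, these integrals should reduce to sums over the values t of X. A sketch using the thread's symbols M and X, where E[X] is assumed notation for the expectation value:

    [tex]\int_{X^{-1}([a,b])} dM = M(X^{-1}([a,b])) = \sum_{t \in X(\Omega) \cap [a,b]} M(X^{-1}(\{t\}))[/tex]

    [tex]E[X] = \int_{\Omega} X \, dM = \sum_{t \in X(\Omega)} t \, M(X^{-1}(\{t\}))[/tex]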
  8. Oct 22, 2012 #7
    Thanks so much! I finally feel like I'm beginning to understand this stuff on an intuitive level rather than just memorizing definitions and whatnot. I love this measure theory approach to probability. I know I have a lot of work to do, but I feel like I can finally "see the light" (or a little bit of it at least).
  9. Oct 23, 2012 #8

    Stephen Tashi

    Science Advisor

    It would be useful to have an example of measure spaces that aren't described by sets of numbers (and not trivially given in terms of things that are easily parameterized by numbers), but that's hard to find. I can't think of a good one.

    Suppose there is a dart game where your score s on one throw is based on how far the dart lands from the center of a target. If the dart lands d inches from the center, you get s = 1/d points when d > 1. If d is less than or equal to 1, you get s = 5 points.

    Assume the probability distribution for distance d is some continuous probability distribution on [0, infinity), like a Rayleigh distribution. What does the probability distribution of the score s look like? The probability distribution of s isn't a discrete distribution and it isn't a (purely) continuous distribution. It has a "point mass" at s = 5.

    Suppose we have the formula for the probability distribution of s. What kind of integration will we do to find the mean value of s? If we try to do Riemann integration of a density, the "point mass" at s = 5 won't contribute anything, since the point s = 5 has zero length. So the kind of integration that makes sense is a combination of integrating the density of a continuous random variable plus adding a term corresponding to how a discrete random variable is treated.
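
    The "integral plus point-mass term" recipe can be sketched numerically. This example assumes a Rayleigh distribution for d with scale parameter `SIGMA = 1.0`, a value the post does not specify, and the function names are illustrative. It computes the mean score once by midpoint-rule quadrature and once from a closed form derived under the same assumption.

```python
import math

SIGMA = 1.0  # assumption: Rayleigh scale parameter, not given in the post

def rayleigh_pdf(d, sigma=SIGMA):
    """Density of the distance d."""
    return (d / sigma**2) * math.exp(-d**2 / (2 * sigma**2))

def rayleigh_cdf(d, sigma=SIGMA):
    """P(distance <= d)."""
    return 1.0 - math.exp(-d**2 / (2 * sigma**2))

def mean_score_numeric(sigma=SIGMA, upper=50.0, steps=200_000):
    """E[s] = 5 * P(d <= 1) + integral over d > 1 of (1/d) * pdf(d), midpoint rule."""
    point_mass_term = 5.0 * rayleigh_cdf(1.0, sigma)  # discrete part: s = 5
    h = (upper - 1.0) / steps
    continuous_term = 0.0
    for i in range(steps):
        d = 1.0 + (i + 0.5) * h
        continuous_term += (1.0 / d) * rayleigh_pdf(d, sigma) * h  # s = 1/d
    return point_mass_term + continuous_term

def mean_score_closed_form(sigma=SIGMA):
    """Same quantity, with the integral evaluated via the complementary error function."""
    point_mass_term = 5.0 * rayleigh_cdf(1.0, sigma)
    continuous_term = (
        math.sqrt(math.pi / 2) / sigma * math.erfc(1.0 / (sigma * math.sqrt(2)))
    )
    return point_mass_term + continuous_term

print(mean_score_numeric())
print(mean_score_closed_form())
```

    The two printed values should agree closely; the first term in each function is the contribution of the point mass at s = 5 that a plain Riemann integral of a density would miss.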

    One way to look at measure theory is that it is a way of making such ad hoc combinations of integration and addition respectable. One thing the above example illustrates is that much of measure theory indeed deals with "a measure" and not specifically Lebesgue measure. There can be measures (such as the measure defined by the distribution of s) that have "point masses". Another thing it illustrates is that Lebesgue measure still plays a key role in some definitions and theorems. To find the distribution of s, we would do integrations based on the continuous distribution of d.