# Question about the rule of the Lazy Statistician - If Y is discrete, w

1. Jul 8, 2013

Link to theorem: http://en.wikipedia.org/wiki/Law_of_the_unconscious_statistician

Suppose Y is a discrete random variable related to X, a continuous random variable by some function r (so Y = r(X) ).

For each value y, let A_y be the following set: A_y = {x ∈ R : r(x) = y}.

Since Y is discrete, f_Y (y) = P(Y = y) = P(r(X) = y). r(X) = y is equivalent to X ∈ A_y, so f_Y (y) = P(X ∈ A_y) = Sum of all P(X = x) such that x ∈ A_y.

It seems to me that the previous sum is valid for both discrete and continuous X. However, if X is continuous then P(X = x) = 0 for all x ∈ R, which would force X to be discrete. But I can construct a transformation from a continuous variable to a discrete one, so X is not necessarily discrete.

Am I wrong? Can anyone show me my mistake, if there is one? I really would like some clarification on this. Thank you!

2. Jul 8, 2013

### chiro

You need to work in the space that is intended: by transforming from continuous to discrete (or vice-versa) you are going from one measure to an entirely different one.

This (very important) fact needs to be taken into account.

3. Jul 9, 2013

### Stephen Tashi

The "law of the unconscious statistician" doesn't justify that step. It doesn't claim you can interpret $P(X \in A_y)$ as a sum. You define Y as a function of X, so the case of the theorem that deals with a continuous random variable applies.
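As a sketch of what the continuous case gives instead of the sum, assume X has a probability density, written here as $f_X$ (a symbol not used earlier in the thread). Then the probability of each value of Y is an integral over $A_y$, and the theorem expresses the expectation of Y directly in terms of $f_X$:

$$f_Y(y) = P(X \in A_y) = \int_{A_y} f_X(x)\,dx, \qquad E[Y] = \sum_y y\, f_Y(y) = \int_{-\infty}^{\infty} r(x)\, f_X(x)\,dx .$$

The sum on the right-hand side of the second equation runs over the countably many values of Y, not over the points x of $A_y$.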

As Chiro says, your example combines two "measure spaces". Introductory probability theory deals with two types of random variables, continuous and discrete. This is mathematically awkward, since analogous results for the two types need separate statements. But it is traditional to teach introductory probability theory this way instead of introducing more advanced viewpoints.

The use of the Riemann-Stieltjes integral on the cumulative distribution (which is mentioned in that article) is one way of unifying discrete and continuous distributions. It allows describing distributions that are a mixture of continuous and discrete.

(For example, suppose there is a dart game where you get 10 points for hitting the center ring, which has radius R, and 100/X points if the dart lands outside the ring at a distance X from the center of the board. Suppose the distribution of the landing distance is continuous, such as a Rayleigh distribution. The distribution of points combines the characteristics of a continuous and a discrete distribution. From the viewpoint of introductory probability theory, you can't define a continuous density for it, since there is some probability greater than zero of scoring exactly 10 points.)
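The dart game can be simulated to see the mixed character of the score distribution. This is only a rough sketch: the ring radius R = 1, the Rayleigh scale sigma = 1, and all function names are assumptions chosen for illustration, not part of the example as stated.

```python
import math
import random


def rayleigh_sample(sigma, rng):
    # Inverse-transform sampling: if U ~ Uniform[0, 1), then
    # sigma * sqrt(-2 ln(1 - U)) follows a Rayleigh(sigma) distribution.
    return sigma * math.sqrt(-2.0 * math.log(1.0 - rng.random()))


def score(distance, ring_radius):
    # 10 points inside the center ring, 100 / distance points outside it.
    return 10.0 if distance <= ring_radius else 100.0 / distance


def simulate(n, sigma=1.0, ring_radius=1.0, seed=0):
    # Returns the simulated scores and the fraction of throws inside the ring.
    rng = random.Random(seed)
    distances = [rayleigh_sample(sigma, rng) for _ in range(n)]
    scores = [score(d, ring_radius) for d in distances]
    frac_ten = sum(1 for d in distances if d <= ring_radius) / n
    return scores, frac_ten


def exact_prob_ten(sigma=1.0, ring_radius=1.0):
    # Rayleigh CDF at the ring radius: P(X <= R) = 1 - exp(-R^2 / (2 sigma^2)).
    return 1.0 - math.exp(-ring_radius ** 2 / (2.0 * sigma ** 2))


if __name__ == "__main__":
    _, frac = simulate(200_000)
    print("simulated P(score = 10):", frac)
    print("exact     P(score = 10):", exact_prob_ten())
```

Under these assumed parameters a single score value (10) carries a positive share of the probability, while the remaining scores are spread continuously, which is why neither a pure density nor a pure probability mass function describes the score.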

The high-class way of doing probability theory is to use measure theory, which is an even more general and abstract way of looking at things than the Riemann-Stieltjes view.

The Riemann-Stieltjes view gives a general definition of "integration" that includes both ordinary integration and summation. Measure theory defines an abstract kind of integration that includes Riemann-Stieltjes integration as a special case. In these general theories, when you see an "$\int$" sign it also includes the case of "$\sum$".

4. Jul 13, 2013

Can you explain why that step is not justified? I rewrote a proof that they are equivalent and it seems to hold for both continuous and discrete random variables. Here it is as an attachment.

Sorry, I'm still very new to probability and I'm trying to understand measure theory.

#### Attached Files:

• image.jpg (45.2 KB)

5. Jul 13, 2013

### Stephen Tashi

A continuous random variable has a probability density function, but evaluating that function at a given value cannot be interpreted as giving the probability that the value occurs. For example, a random variable X with a uniform distribution on the interval [0, 1/2] has the probability density function f(x) = 2 for x in [0, 1/2] and f(x) = 0 otherwise. The fact that f(1/3) = 2 cannot be interpreted as meaning "The probability that X = 1/3 is 2". To find the probability that X is in an interval, you must integrate the density over the interval, not do a discrete summation over all the values in the interval.

That's too hard for me to read. I don't even see a statement of what is to be proven. You might find it interesting to learn the forum's LaTeX (post #3 of https://www.physicsforums.com/showthread.php?t=617567 ). Using LaTeX is a useful skill. With some variation in what "tags" are used, LaTeX is used on other forums, in Wikipedia, and in many document editors.
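The difference between a density value and a probability can be checked with exact arithmetic. This is a minimal sketch for the uniform distribution on [0, 1/2]; the function names `density` and `prob_interval` are made up for illustration.

```python
from fractions import Fraction

LOW, HIGH = Fraction(0), Fraction(1, 2)


def density(x):
    """Density of X ~ Uniform[0, 1/2]: 2 on the interval, 0 elsewhere."""
    return Fraction(2) if LOW <= x <= HIGH else Fraction(0)


def prob_interval(a, b):
    """P(a <= X <= b): the integral of the density over [a, b]."""
    lo, hi = max(a, LOW), min(b, HIGH)
    return Fraction(2) * (hi - lo) if hi > lo else Fraction(0)


print(density(Fraction(1, 3)))                        # 2   (a density value, not a probability)
print(prob_interval(Fraction(1, 3), Fraction(1, 3)))  # 0   (P(X = 1/3))
print(prob_interval(Fraction(1, 4), Fraction(1, 3)))  # 1/6 (P(1/4 <= X <= 1/3))
```

The density exceeds 1 at every point of [0, 1/2], yet every probability computed from it by integration stays between 0 and 1.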

6. Jul 13, 2013

Y is discrete, so f_Y (y) = P(Y = y) = P(r(X) = y)

r(X) = y is equivalent to X ∈ A_y

X ∈ A_y → {ω: X(ω) ∈ A_y} = U(x ∈ A_y) of {ω: X(ω) = x}

Since each {ω: X(ω) = x} is disjoint for distinct x, P(X ∈ A_y) = Sum of P(X = x) for x ∈ A_y

Can you show me at which step I am wrong exactly? I apologize about this, I'm having a hard time getting this.

7. Jul 13, 2013

### Stephen Tashi

That is not a correct statement. You're not giving serious consideration to specific examples. Again, if X is a random variable that is uniformly distributed on the interval [0, 1/2], the probability that X is in the interval [1/8, 2/8] is not "the sum over all x in [1/8, 2/8] of the probability that X is exactly equal to x". The probability that X is exactly equal to any specific number is zero. Furthermore, the notion of a "sum" over an infinite set is defined as the limit of a sequence, so it is only defined for summing countable infinities of things (e.g. you can sum over "n = 1,2,3.." but not over "n = each number in [1/8,2/8]").

The thought that the probability of a set can be computed as the sum of the probabilities of disjoint subsets of the set is not in general true. The measure-theoretic version of what may be done is that the measure of the union of a finite or countably infinite collection of disjoint measurable sets can be found by summing the measures of the individual sets. If you want to compute the probability of a set by breaking it into mutually exclusive subsets, you have to break it into no more than a countable infinity of such subsets, and the method of computing the probability of each subset must be well defined (i.e. each subset must be "measurable").
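Here is a small numerical sketch of a countable decomposition that does work. The setup is an assumption chosen for illustration and does not come from the thread: X is exponential with rate 1 and Y = floor(X), so each set A_k = [k, k+1) is an interval (uncountably many points), but there are only countably many such sets. Each P(Y = k) comes from integrating the density over A_k, and summing over k is legitimate.

```python
import math


def pmf_floor_exponential(k):
    """P(Y = k) for Y = floor(X), X ~ Exponential(1).

    This is the integral of the density e^{-x} over A_k = [k, k + 1).
    """
    return math.exp(-k) - math.exp(-(k + 1))


def expectation_via_pmf(terms=200):
    """E[Y] as a sum over the countably many values of Y (truncated)."""
    return sum(k * pmf_floor_exponential(k) for k in range(terms))


def expectation_via_density(upper=60.0, steps=600_000):
    """E[r(X)] as a midpoint Riemann sum of floor(x) * e^{-x} on [0, upper]."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += math.floor(x) * math.exp(-x) * h
    return total


if __name__ == "__main__":
    print("total probability:", sum(pmf_floor_exponential(k) for k in range(200)))
    print("E[Y] via sum over y:     ", expectation_via_pmf())
    print("E[Y] via integral over x:", expectation_via_density())
    print("closed form 1 / (e - 1): ", 1.0 / (math.e - 1.0))
```

The two ways of computing E[Y] agree to within the discretization error of the Riemann sum, which is the content of the law of the unconscious statistician in this setting: a sum over the values of Y on one side, an integral against the density of X on the other, and no sum over individual points x.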

Last edited: Jul 13, 2013
8. Jul 13, 2013