Question about the rule of the Lazy Statistician - If Y is discrete, w

  • Thread starter LoadedAnvils
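In summary, the "law of the unconscious statistician" does not justify writing P(X ∈ A_y) as a sum of P(X = x) over all x in A_y. When X is continuous, that probability comes from integrating the density over A_y, and probabilities add only over finitely or countably many disjoint sets.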
  • #1
LoadedAnvils
Link to theorem: http://en.wikipedia.org/wiki/Law_of_the_unconscious_statistician

Suppose Y is a discrete random variable related to X, a continuous random variable, by some function r (so Y = r(X)).

For each value y, let A_y be the following set: A_y = {x ∈ R : r(x) = y}.

Since Y is discrete, f_Y (y) = P(Y = y) = P(r(X) = y). r(X) = y is equivalent to X ∈ A_y, so f_Y (y) = P(X ∈ A_y) = Sum of all P(X = x) such that x ∈ A_y.

It seems to me that the previous sum is valid for both discrete and continuous X. However, if X is continuous, then P(X = x) = 0 for all x ∈ R, so the sum would be zero. That suggests X must be discrete. Yet I can construct a transformation from a continuous variable to a discrete one, so X is not necessarily discrete.

Am I wrong? Can anyone show me my mistake, if there is one? I really would like some clarification on this. Thank you!
 
  • #2
Hey LoadedAnvils.

You need to work in the space that is intended: by transforming from continuous to discrete (or vice-versa) you are going from one measure to an entirely different one.

This (very important) fact needs to be taken into account.
 
  • #3
LoadedAnvils said:
f_Y (y) = P(X ∈ A_y) = Sum of all P(X = x) such that x ∈ A_y.

The "law of the unconscious statistician" doesn't justify that step. It doesn't claim you can interpret [itex] P(X \in A_y) [/itex] as a sum. You define Y as a function of X, so the case of the theorem that deals with a continuous random variable applies.
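
To make the continuous case concrete, here is a minimal Python sketch. The specifics are assumptions for illustration and are not from the thread: X is uniform on [0, 1] and r(x) = floor(3x), so Y = r(X) takes the values 0, 1, 2. P(Y = y) is obtained by integrating the density of X over A_y, and E[Y] computed from that pmf should agree with the integral of r(x) f_X(x) that the law prescribes.

```python
import math

def f_X(x):
    """Density of X, assumed uniform on [0, 1] for illustration."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0

def r(x):
    """Hypothetical transformation that makes Y = r(X) discrete (values 0, 1, 2)."""
    return min(math.floor(3 * x), 2)

def midpoint_integral(g, a, b, n=3000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

# P(Y = y) = P(X in A_y) = integral of f_X over A_y = {x : r(x) = y}
pmf_Y = {
    y: midpoint_integral(lambda x, y=y: f_X(x) if r(x) == y else 0.0, 0.0, 1.0)
    for y in (0, 1, 2)
}

# E[Y] two ways: from the pmf of Y, and from the density of X directly
mean_from_pmf = sum(y * p for y, p in pmf_Y.items())
mean_from_lotus = midpoint_integral(lambda x: r(x) * f_X(x), 0.0, 1.0)

print(pmf_Y)
print(mean_from_pmf, mean_from_lotus)
```

No P(X = x) terms appear anywhere: each of those is zero, and the mass of Y comes entirely from integrals over the sets A_y.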

As Chiro says, your example combines two "measure spaces". Introductory probability theory deals with two types of random variables, continuous and discrete. This is mathematically awkward, since analogous results for the two types need separate statements. But it is traditional to teach introductory probability theory this way instead of introducing more advanced viewpoints.

The use of the Riemann-Stieltjes integral on the cumulative distribution (which is mentioned in that article) is one way of unifying discrete and continuous distributions. It allows describing distributions that are a mixture of continuous and discrete.

(For example, suppose there is a dart game where you get 10 points for hitting the center ring, which has radius R, and 100/X points if the dart lands outside the ring at a distance X from the center of the board. Suppose the distribution of the landing distance is continuous, such as a Rayleigh distribution. The distribution of points combines the characteristics of a continuous and a discrete distribution. From the viewpoint of introductory probability theory, you can't define a continuous density for it, since there is some probability greater than zero of scoring exactly 10 points.)
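
The dart game can be simulated in a few lines. This is only a sketch: the ring radius, the Rayleigh scale, the seed, and the helper names `dart_score` and `sample_rayleigh` are assumed for illustration. It checks that the score has an atom at exactly 10 points whose probability matches the Rayleigh CDF at R, while every other score comes from the continuous part.

```python
import math
import random

def dart_score(distance, ring_radius):
    """10 points inside the center ring, otherwise 100 / distance."""
    return 10.0 if distance <= ring_radius else 100.0 / distance

def sample_rayleigh(sigma, rng):
    """Inverse-CDF sample from a Rayleigh distribution with scale sigma."""
    u = rng.random()  # uniform on [0, 1)
    return sigma * math.sqrt(-2.0 * math.log(1.0 - u))

rng = random.Random(0)   # fixed seed so the run is repeatable
ring_radius = 1.0        # assumed value of R
sigma = 2.0              # assumed Rayleigh scale

n = 200_000
scores = [dart_score(sample_rayleigh(sigma, rng), ring_radius) for _ in range(n)]

# Discrete part: positive probability of scoring exactly 10 points
empirical_atom = sum(s == 10.0 for s in scores) / n
exact_atom = 1.0 - math.exp(-ring_radius**2 / (2.0 * sigma**2))  # Rayleigh CDF at R

print(empirical_atom, exact_atom)
```

With these assumed parameters the atom has probability 1 - exp(-1/8), roughly 0.12, so the score cannot have an ordinary density, yet it is not a purely discrete variable either.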

The high-class way of doing probability theory is to use measure theory, which is an even more general and abstract way of looking at things than the Riemann-Stieltjes view.

The Riemann-Stieltjes view gives a general definition of "integration" that includes both ordinary integration and summation. Measure theory defines an abstract kind of integration that includes Riemann-Stieltjes integration as a special case. In these general theories, when you see an "[itex] \int [/itex]" sign it also includes the case of "[itex] \sum [/itex]".
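
As a sketch of how the single integral covers both cases (the symbols F, f, x_i and p_i are notation assumed here, and r is assumed well behaved enough for the integrals to exist), the expectation of r(X) can be written against the cumulative distribution F of X and then specialized:

```latex
% General statement: F is the cumulative distribution function of X.
E[r(X)] = \int_{-\infty}^{\infty} r(x)\, dF(x)

% Continuous X with density f = F':
E[r(X)] = \int_{-\infty}^{\infty} r(x)\, f(x)\, dx

% Discrete X, where F jumps by p_i = P(X = x_i) at each point x_i:
E[r(X)] = \sum_{i} r(x_i)\, p_i
```

A mixed distribution such as the dart score contributes both an integral term for its continuous part and a sum term for its jumps.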
 
  • #4
Can you explain why that step is not justified? I rewrote a proof that they are equivalent and it seems to hold for both continuous and discrete random variables. Here it is as an attachment.

Sorry, I'm still very new to probability and I'm trying to understand measure theory.
 

Attachments

  • image.jpg
  • #5
LoadedAnvils said:
Can you explain why that step is not justified?

A continuous random variable has a probability density function, but evaluating that function at a given value cannot be interpreted as giving the probability that the value occurs. For example, a random variable X with a uniform distribution on the interval [0, 1/2] has the probability density function f(x) = 2 for x in [0, 1/2] and f(x) = 0 otherwise. The fact that f(1/3) = 2 cannot be interpreted as meaning "The probability that X = 1/3 is 2". To find the probability that X is in an interval, you must integrate the density over the interval, not take a discrete sum over all the values in the interval.
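
A small numerical check of this example, as a sketch only (the helper names and the midpoint rule are choices made here, not part of the thread): the density value at 1/3 is 2, the density integrates to 1 over [0, 1/2], and the probability of an interval around 1/3 shrinks to 0 as the interval shrinks.

```python
def density(x):
    """Density of X uniform on [0, 1/2]: 2 on the interval, 0 elsewhere."""
    return 2.0 if 0.0 <= x <= 0.5 else 0.0

def prob_interval(a, b, n=10_000):
    """P(a <= X <= b), approximated by a midpoint-rule integral of the density."""
    h = (b - a) / n
    return sum(density(a + (i + 0.5) * h) for i in range(n)) * h

value_at_third = density(1 / 3)       # a density value, not a probability
total_mass = prob_interval(0.0, 0.5)  # the density integrates to 1

# Probability of a shrinking interval around 1/3 (half-widths are arbitrary)
shrinking = [prob_interval(1 / 3 - eps, 1 / 3 + eps) for eps in (0.1, 0.01, 0.001)]

print(value_at_third, total_mass, shrinking)
```

The interval probabilities are 4 times the half-width, so they tend to 0, which is consistent with P(X = 1/3) = 0 even though f(1/3) = 2.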

I rewrote a proof that they are equivalent and it seems to hold for both continuous and discrete random variables. Here it is as an attachment.

That's too hard for me to read. I don't even see a statement of what is to be proven. You might find it interesting to learn the forum's LaTeX (post #3 of https://www.physicsforums.com/showthread.php?t=617567 ). Using LaTeX is a useful skill. With some variation in what "tags" are used, LaTeX is used on other forums, in Wikipedia, and in many document editors.
 
  • #6
Y is discrete, so f_Y (y) = P(Y = y) = P(r(X) = y)

r(X) = y is equivalent to X ∈ A_y

X ∈ A_y means {ω: X(ω) ∈ A_y} = ∪_{x ∈ A_y} {ω: X(ω) = x}

Since each {ω: X(ω) = x} is disjoint for distinct x, P(X ∈ A_y) = Sum of P(X = x) for x ∈ A_y

Can you show me at which step I am wrong exactly? I apologize about this, I'm having a hard time getting this.
 
  • #7
LoadedAnvils said:
Since each {ω: X(ω) = x} is disjoint for distinct x, P(X ∈ A_y) = Sum of P(X = x) for x ∈ A_y

That is not a correct statement. You're not giving serious consideration to specific examples. Again, if X is a random variable that is uniformly distributed on the interval [0, 1/2], the probability that X is in the interval [1/8, 2/8] is not "the sum over all x in [1/8, 2/8] of the probability that X is exactly equal to x". The probability that X is exactly equal to any specific number is zero. Furthermore, the notion of a "sum" over an infinite set is defined as the limit of a sequence, so it is only defined for summing countable infinities of things (e.g. you can sum over "n = 1,2,3.." but not over "n = each number in [1/8,2/8]").

The thought that the probability of a set can be computed as the sum of the probabilities of disjoint subsets of the set is not in general true. The measure-theoretic version of what may be done is that the measure of the union of a finite or countably infinite collection of disjoint measurable sets can be found by summing the measures of the individual sets. If you want to compute the probability of a set by breaking it into mutually exclusive subsets, you have to break it into no more than a countable infinity of such subsets, and the method of computing the probability of each subset must be well defined (i.e. each subset must be "measurable").
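
The countable case can be checked exactly. The sketch below uses one particular partition chosen for illustration (it is not from the thread): [1/8, 2/8) is cut into disjoint pieces whose lengths halve at each step, and X is uniform on [0, 1/2] with density 2 as above. The partial sums of the piece probabilities approach 1/4, whereas each single point contributes probability 0.

```python
from fractions import Fraction

DENSITY = Fraction(2)                        # X uniform on [0, 1/2]
start, length = Fraction(1, 8), Fraction(1, 8)   # the interval [1/8, 2/8)

def piece(k):
    """k-th disjoint piece [left, right) of [1/8, 2/8); lengths halve each step."""
    left = start + length * (1 - Fraction(1, 2**k))
    right = start + length * (1 - Fraction(1, 2**(k + 1)))
    return left, right

def prob(left, right):
    """P(left <= X < right) for subintervals of [0, 1/2]."""
    return DENSITY * (right - left)

# Countably many disjoint pieces: partial sums of their probabilities
partial_sums = []
total = Fraction(0)
for k in range(40):
    total += prob(*piece(k))
    partial_sums.append(total)

# A single point is a degenerate interval and has probability 0
point_probability = prob(Fraction(1, 6), Fraction(1, 6))

print(float(partial_sums[-1]), point_probability)
```

After K pieces the partial sum is (1/4)(1 - 2^(-K)), which converges to P(1/8 <= X < 2/8) = 1/4. No such limit is available for the uncountable family of single points, which is where the step in post #6 breaks down.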
 
  • #8
I finally see. I will try to learn measure theory, but for now, thank you.
 

1. What is the rule of the Lazy Statistician?

The rule of the lazy statistician, more commonly called the law of the unconscious statistician, says that the expected value of Y = r(X) can be computed from the distribution of X alone: E[r(X)] = ∫ r(x) f_X(x) dx when X is continuous with density f_X, and E[r(X)] = Σ r(x) P(X = x) when X is discrete. There is no need to find the distribution of Y first.

2. What does the rule mean when Y is discrete?

When Y is discrete, it takes only a finite or countable number of values, so it has a probability mass function f_Y(y) = P(Y = y). If Y = r(X) and X is continuous, then P(Y = y) = P(X ∈ A_y) is found by integrating the density of X over A_y = {x : r(x) = y}, not by summing P(X = x), which is zero for every x.

3. How does the rule of the Lazy Statistician apply to discrete variables?

If X itself is discrete, the rule reads E[r(X)] = Σ r(x) P(X = x), summed over the countably many values of X. If only Y = r(X) is discrete while X is continuous, the continuous form of the rule still applies, because the integral is taken against the distribution of X.

4. Are there any exceptions to the rule of the Lazy Statistician?

The rule requires the expectation to exist: the integral (or sum) of |r(x)| against the distribution of X must be finite. The related step of splitting a probability into a sum over disjoint pieces works only for finitely or countably many measurable pieces; it fails for an uncountable union of single points.

5. How can I apply the rule of the Lazy Statistician in my own research or analysis?

To apply the rule, write the quantity of interest as Y = r(X), then integrate (or sum) r(x) against the density (or mass function) of X. For example, if X is uniform on [0, 1/2] with density 2, then E[X^2] is the integral of 2x^2 from 0 to 1/2, which equals 1/12, with no need to derive the distribution of X^2.
