How can we prove that E(X)=0 if X=0 almost surely?

  • Context: Graduate 
  • Thread starter: kingwinner

Discussion Overview

The discussion revolves around proving that if a random variable X equals 0 almost surely (i.e., P(X=0)=1), then the expected value E(X) must also equal 0. Participants explore the axioms of expectation, the properties of indicator functions, and the implications of "almost surely" in probability theory.

Discussion Character

  • Technical explanation
  • Conceptual clarification
  • Debate/contested
  • Mathematical reasoning

Main Points Raised

  • Some participants assert that if X=0 almost surely, then |X| I(|X|=0) equals 0, leading to E(|X| I(|X|=0)) = E(0).
  • Others question the validity of certain steps in the proof, particularly the transition from E[|X| I(0<|X|<N)] to E[N I(0<|X|<N)].
  • There is a discussion about the difference between "X=0 almost surely" and "X=0," with some participants seeking clarification on the implications of probability measures and sets of measure zero.
  • One participant proposes that the limit as N approaches infinity of both sides of an inequality should preserve the inequality, while others challenge the assumption that convergence can be taken for granted.
  • Several participants express confusion about the meaning of "almost surely" and seek simpler explanations, with examples provided to illustrate the concept.

Areas of Agreement / Disagreement

Participants do not reach a consensus on the validity of certain mathematical steps in the proof or the interpretation of "almost surely." Multiple competing views remain regarding the implications of the definitions and properties discussed.

Contextual Notes

Some participants note that the axioms of expectation are not well-defined without specifying the space of random variables, which could affect the validity of the claims made. Additionally, the discussion highlights the need for a deeper understanding of measure theory and probability measures to fully grasp the concepts being debated.

kingwinner
Axioms of expectation:
1. X>0 => E(X)>0
2. E(1)=1
3. E(aX+bY) = aE(X) + bE(Y)
4. If X_n is a nondecreasing sequence of nonnegative random variables and lim(X_n) = X, then E(X)=lim E(X_n) [monotone convergence theorem]

Definition: P(A)=E[I(A)]

Using the above, prove that if X=0 almost surely [i.e. P(X=0)=1 ], then E(X)=0.

Proof:
X=0 almost surely <=> |X|=0 almost surely

[note: I is the indicator/dummy variable
I(A)=1 if event A occurs
I(A)=0 otherwise]

|X| = |X| I(|X|=0) + |X| I(|X|>0)

=> E(|X|) = E(0) + E[|X| I(|X|>0)]
=E(0*1) + E[|X| I(|X|>0)]
=0E(1) + E[|X| I(|X|>0)] (axiom 3)
=0 + E[|X| I(|X|>0)] (axiom 2)
=E[|X| I(|X|>0)]
=E[lim |X| * I(0<|X|<N)] (lim here means the limit as N->∞)
=lim E[|X| * I(0<|X|<N)] (axiom 4)
=lim E[N * I(0<|X|<N)]
=lim N * E[I(0<|X|<N)]
=lim N * P(0<|X|<N) (by definition)
=lim (0) since P(X=0)=1 => P(0<|X|<N)=0
=0
=>E(X)=0
=======================================
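(Not part of the original post: the pointwise identity |X| = |X| I(|X|=0) + |X| I(|X|>0) that the proof starts from can be sanity-checked numerically. A minimal sketch, with `indicator` standing in for I:)

```python
# Sketch (not from the thread): verify pointwise that
# |x| = |x| * I(|x| = 0) + |x| * I(|x| > 0) for sample values of x.
def indicator(condition):
    """1 if the event occurred, 0 otherwise."""
    return 1 if condition else 0

for x in [0.0, -3.5, 2.0, 1e-9]:
    lhs = abs(x)
    rhs = abs(x) * indicator(abs(x) == 0) + abs(x) * indicator(abs(x) > 0)
    assert lhs == rhs
```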

Now, there are two steps above that I don't understand.
1. We have |X| I(|X|=0), taking the expected value it becomes E(0), how come?? I don't understand why E[|X| I(|X|=0)] = E(0).
2. lim E[|X| * I(0<|X|<N)] = lim E[N * I(0<|X|<N)], why??

Thanks for explaining!
 
What is I(Z<0) + I(Z=0) + I(Z>0)?

(By the way, what do you mean by I?)
 
Hurkyl said:
What is I(Z<0) + I(Z=0) + I(Z>0)?

(By the way, what do you mean by I?)

I is the indicator/dummy variable
I(A)=1 if event A occurs
I(A)=0 otherwise
 
First, your axioms aren't really well-defined. You have to specify the space of random variables to which it applies, which could be non-negative r.v.s, bounded r.v.s or integrable r.v.s and, if the latter, you should really specify what integrable means. Anyway...

1 is easy,
\vert X\vert I(\vert X\vert=0) = 0\ \Rightarrow\ E\left(\vert X\vert I(\vert X\vert=0)\right)=E(0).

2 isn't true. You should have an inequality,
\vert X\vert I(0<\vert X\vert\le N) \le N I(0<\vert X\vert\le N)\ \Rightarrow\ E\left(\vert X\vert I(0<\vert X\vert\le N)\right)\le E\left(N I(0<\vert X\vert\le N)\right).
 
1. But I don't understand why |X| I(|X|=0) = 0.
Here we are only given that "X=0 almost surely", instead of "X=0". I was told that "X=0 almost surely" is NOT the same thing as saying "X=0", so in this case we cannot say that "X=0", and now I don't see why it would always be the case that |X| * I(|X|=0) = 0. Would you mind explaining a little more about this?

By the way, I actually don't quite understand the real difference between "X=0 almost surely" and "X=0". If P(X=0)=1, then we can say with certainty that X must equal 0. It is impossible (i.e. has zero probability) for X not to be 0. So this seems to be exactly the same as saying that X=0??


2. Yes, I think there may be a typo in the source and it should be an inequality.
\vert X\vert I(0<\vert X\vert\le N) \le N I(0<\vert X\vert\le N)\ \Rightarrow\ E\left(\vert X\vert I(0<\vert X\vert\le N)\right)\le E\left(N I(0<\vert X\vert\le N)\right).
I agree with this. But now, if we take the limit as N->∞ of both sides of this inequality of expected values, would the inequality still hold? What justifies this?

Your help is greatly appreciated!
 
You're dealing with inequalities and absolute values -- you should be dividing things into cases!

1. But I don't understand why |X| I(|X|=0) = 0.​
Split it into cases. Analyze each case individually.



As for not understanding "almost surely", surely you've been introduced to some sets of measure zero? Can you find a subset of [0,1] whose Lebesgue measure is zero? Now, can you find a function that is zero everywhere except on that set?

It also might help to have at your disposal probability measures of much different qualitative behavior to consider. For example, consider the following probability measure on R:

P(S) = \begin{cases} 1 & 0 \in S \\ 0 & 0 \notin S \end{cases}
 
Hurkyl said:
As for not understanding "almost surely", surely you've been introduced to some sets of measure zero? Can you find a subset of [0,1] whose Lebesgue measure is zero? Now, can you find a function zero on that set?

It also might help to have at your disposal probability measures of much different qualitative behavior to consider. For example, consider the following probability measure on R:

P(S) = \begin{cases} 1 & 0 \in S \\ 0 & 0 \notin S \end{cases}

Unfortunately, I am a second-year undergrad stat student and I don't have the background in Lebesgue integration and measure zero. Everything in my multivariable calculus course is in terms of the Riemann integral, and we haven't discussed the idea of measure zero.

Is it possible to explain the difference between "X=0 a.s." and "X=0" in a simpler way? I know my understanding won't be perfect, but for now I just want to understand the difference in the simplest sense.

Thank you!
 
1. Now I see why it's always 0.
Claim: |X| I(|X|=0) is always 0

If X=0, then |X|=0, so |X| I(|X|=0) = 0
If X≠0, then I(|X|=0) =0, so |X| I(|X|=0) = 0
Therefore, |X| I(|X|=0) is always 0.
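(The two cases above can also be checked mechanically; a small sketch of my own, not from the thread, where `ind_abs_zero` plays the role of I(|X|=0):)

```python
# Case analysis check: |x| * I(|x| = 0) is 0 for every x, because
# either the first factor vanishes (x = 0) or the indicator does (x != 0).
def ind_abs_zero(x):
    return 1 if abs(x) == 0 else 0

assert all(abs(x) * ind_abs_zero(x) == 0 for x in [0.0, 7.0, -2.5, 1e-12])
```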

But now I have a question. In the assumptions, we are given that X=0 almost surely [i.e. P(X=0)=1]. We can say that X=0 with certainty. It is impossible for X to not be 0. Then, why are we even considering the case X≠0? There should only be one possible case, X=0, right?


2. |X| * I(0<|X|<N) < N * I(0<|X|<N)
=> E[|X| * I(0<|X|<N)] < E[N * I(0<|X|<N)]
=> lim E[|X| * I(0<|X|<N)] < lim E[N * I(0<|X|<N)] ??
(lim=limit as N->infinity)

I was flipping through my calculus textbooks, but I couldn't find a theorem that applies.

Any help is appreciated!
 
To be precise, you can't even take the limit as N tends to infinity unless you can first show that it converges. However, it is equal to 0 for each N, so that problem is easily solved.
 
  • #10
gel said:
To be precise, you can't even take the limit as N tends to infinity unless you can first show that it converges. However, it is equal to 0 for each N, so that problem is easily solved.
Um...do you mean the left side E[|X| * I(0<|X|<N)] equals 0 for each N, or the right side E[N * I(0<|X|<N)] equals 0 for each N?

In general, is it true that f(N) ≤ g(N) for all N ALWAYS implies
lim f(N) ≤ lim g(N) as N->∞?
And we can safely take the limit of both sides while still preserving the same inequality sign??

Thanks!
 
  • #11
First I'll give a (hopefully simple) intuitive explanation addressing your X=0 a.s. versus X=0. A related discussion is the meaning of P(A)=0. Some people use the word "impossible" in the sense that "P(A)=0 means A is impossible." Kolmogorov calls that usage incorrect. He says if A is impossible, then P(A)=0, but not conversely. Instead, P(A)=0 means A is "practically impossible."

Example: Select a random real number from the interval [0,3] according to a uniform distribution. The probability of selecting the number 2 is 0, so it is practically impossible, but not impossible.

Now use the same example, letting X=0 if the random real number is not 2, but X=57 if the random real number is 2. Then X=0 a.s., but X is not identically 0.
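(This example can be simulated; the sketch below is my own illustration, not from the thread. A floating-point draw from [0,3] is, for all practical purposes, never exactly 2, which mirrors the probability-0 event:)

```python
import random

random.seed(0)
# Uniform draw from [0, 3]; X = 57 only if the draw is exactly 2, else X = 0.
draws = [random.uniform(0, 3) for _ in range(100_000)]
xs = [57 if u == 2 else 0 for u in draws]

# X = 0 almost surely, but X is not identically 0 as a function;
# the sample mean reflects E(X) = 0.
assert sum(xs) / len(xs) == 0
```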


Second: if f(N) < g(N) for all N, then lim f(N) <= lim g(N). Note how the strict inequality might become equality in the limit. If f(N) <= g(N) for all N, then lim f(N) <= lim g(N).
 
  • #12
Billy Bob said:
First I'll give a (hopefully simple) intuitive explanation addressing your X=0 a.s. versus X=0. A related discussion is the meaning of P(A)=0. Some people use the word "impossible" in the sense that "P(A)=0 means A is impossible." Kolmogorov calls that usage incorrect. He says if A is impossible, then P(A)=0, but not conversely. Instead, P(A)=0 means A is "practically impossible."

Example: Select a random real number from the interval [0,3] according to a uniform distribution. The probability of selecting the number 2 is 0, meaning it is practically impossible, but not impossible. It has zero probability of occurring.

Now use the same example, letting X=0 if the random real number is not 2, but X=57 if the random real number is 2. Then X=0 a.s., but X is not identically 0.
Thank you! This is very helpful!


Second: if f(N) < g(N) for all N, then lim f(N) <= lim g(N). Note how the strict inequality might become equality in the limit. If f(N) <= g(N) for all N, then lim f(N) <= lim g(N).
This doesn't seem too obvious to me. Is there a name for this theorem? It looks like the squeeze/sandwich theorem may be appropriate, but it is not quite saying the same thing as the above...

Once again, thanks for your help!
 
  • #13
If s(N)>=0 for all N, and if lim s(N)=L, then it is not hard to prove L>=0. (Suppose not: suppose L<0, and take epsilon>0 s.t. L+epsilon<0. Then eventually s(N)<L+epsilon<0, contradicting s(N)>=0.)

Now let s(N)=g(N)-f(N). It follows that if g(N)>=f(N) for all N and if lim g(N) and lim f(N) both exist, then lim g(N) >= lim f(N).
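(A concrete instance of why a strict "<" for every N can weaken to "<=" in the limit; my illustration, not from the thread:)

```python
# f(N) = 1 - 1/N < g(N) = 1 for every N >= 1, yet both limits equal 1:
# the strict inequality becomes equality in the limit.
def f(n):
    return 1 - 1 / n

def g(n):
    return 1.0

assert all(f(n) < g(n) for n in range(1, 10_000))
# For large N, f(N) is within any epsilon of g's limit 1.
assert g(10**9) - f(10**9) < 1e-8
```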
 
  • #14
Billy Bob said:
If s(N)>=0 for all N, and if lim s(N)=L, then it is not hard to prove L>=0. (Suppose not, suppose L<0, take epsilon s.t. L<L+epsilon<0, then eventually s(N)<L+epsilon<0.)

Now let s(N)=g(N)-f(N). It follows that if g(N)>=f(N) for all N and if lim g(N) and lim f(N) both exist, then lim g(N) >= lim f(N).
Thanks! This really helps!

But it seems like an important condition is that both lim g(N) and lim f(N) must exist.
So in our context, we have to show that both lim E[|X| * I(0<|X|<N)] and lim E[N * I(0<|X|<N)] exist before we can say that
E[|X| * I(0<|X|<N)] < E[N * I(0<|X|<N)]
=> lim E[|X| * I(0<|X|<N)] < lim E[N * I(0<|X|<N)]
How can we show the existence of lim g(N) and lim f(N) in our case?
 
