(weak) Law of Large Numbers and variance

ntg865 · Apr 27, 2012

Sorry it's my first time posting so I am not sure if latex works here...

I have been trying to understand the proof of WLLN using Chebychev's inequalities, and here are my problems:

I know for the Strong Law of Large Numbers,

if n→∞, Ʃ(1 to n) σi ^2 / i^2 < ∞ => SLLN is satisfied,

My question is:
if n→∞, (n^-2) Ʃ(1 to n) σi ^2 < ∞ (which is a condition of WLLN), does it mean that WLLN is satisfied?

It sounds like it is but looking at the proof of WLLN, it doesn't make much sense because the RHS of the inequality is non-zero, thus it does not guarantee that LHS is 0, which is the definition of WLLN.

In the Chebychev's inequality (with k=ε/σ),

Pr(|sum(Xi)/n - sum(mean_i)/n|≥ε) ≤ σ^2 /ε^2

We know that if the left-hand side tends to 0, then the WLLN holds, and if the RHS tends to 0, then the LHS must tend to 0 as well. Now my problem is this: suppose that as n→∞, σ^2 converges to a non-zero constant (i.e. finite, so that the RHS becomes, say, 0.5/ε^2). Then the right-hand side is not 0. According to the inequality this does not imply that the left-hand side won't be 0, but I can't see intuitively why the Weak Law would hold. And if finding that the RHS is non-zero is not sufficient to show that the Weak Law fails, and there is no way of computing the LHS directly since in most cases Xi is random, how would I go about convincing myself that the sequence does (or does not) satisfy the WLLN?

chiro · Apr 27, 2012

Hey ntg865 and welcome to the forums.

This link was pretty informative and I think answers your question better than I could:

http://mathworld.wolfram.com/WeakLawofLargeNumbers.html

ntg865 · Apr 27, 2012

Thank you very much Chiro, I have used that link but it doesn't really answer my question. Moreover, the WLLN in that link assumes i.i.d. whereas I am only assuming independence. Thanks again.

Stephen Tashi · Apr 28, 2012

ntg865 said:

Moreover, the WLLN in that link assumes i.i.d. whereas I am only assuming independence.

You should state a coherent question. If you are referring to a Weak Law Of Large Numbers that does not assume a series of identically distributed random variables, then state the law. Your first post asks if "the WLLN is satisfied". What is that supposed to mean? Are you asking if all the hypotheses of the Weak Law Of Large Numbers are satisfied? Or are you asking whether the conclusion of the Weak Law Of Large Numbers is satisfied?

ntg865 · Apr 28, 2012

Stephen Tashi said:

You should state a coherent question. If you are referring to a Weak Law Of Large Numbers that does not assume a series of identically distributed random variables, then state the law. Your first post asks if "the WLLN is satisfied". What is that supposed to mean? Are you asking if all the hypotheses of the Weak Law Of Large Numbers are satisfied? Or are you asking whether the conclusion of the Weak Law Of Large Numbers is satisfied?

Thank you Stephen, I am sorry that I didn't clarify. I meant for independent random variables, not iid. and by WLLN satisfied, I meant for a random variable, as n observations -> ∞, the sample mean -> μ
More formally, I can write:
Pr(|sum(xi)/n - E(xi)| < epsilon) → 1 for any epsilon>0 as n→∞
or
Pr(|sum(xi)/n - E(xi)| ≥ epsilon) → 0 for any epsilon>0 as n→∞
If xi satisfies above, it satisfies WLLN

Now, using Chebyshev's inequalities, we get
Pr(|sum(xi)/n - E(xi)| ≥ epsilon) ≤ (σ^2)/epsilon^2
where σ^2= Var(sum(xi)/n)=Var(sum(xi))/n^2

One way to check the WLLN is that if the RHS of the inequality is 0, (so if we find that σ^2 → 0 as n→∞), LHS must be zero, thus for that random variable WLLN is satisfied. If we show that σ^2 → ∞ as n→∞, then WLLN is not satisfied.

My question is: if I find that σ^2 → a constant that is greater than zero but smaller than infinity, could I conclude that WLLN is satisfied? (I have checked several cases using alternative methods (e.g. Kolmogorov's WLLN) and WLLN is satisfied, but I can't prove or convince myself that σ^2→a<∞ implies WLLN is satisfied.)

Stephen Tashi · Apr 28, 2012

ntg865 said:

I meant for independent random variables, not iid. and by WLLN satisfied, I meant for a random variable, as n observations -> ∞, the sample mean -> μ

If you're speaking of taking n observations of the same random variable, how are these not going to be n identically distributed random variables?

If you don't have identically distributed random variables, then what is your definition of [itex]\sigma^2[/itex]? Are you trying to make a statement about a sequence of independent random variables that each have the same variance, but which are not necessarily identically distributed?

The LaTex on the forum is described in the thread:
https://www.physicsforums.com/showthread.php?t=546968

ntg865 · Apr 28, 2012

Stephen Tashi said:

Are you trying to make a statement about a sequence of independent random variables that each have the same variance, but which are not necessarily identically distributed?

Sorry about that, I keep omitting key information... I am looking at sequences of independent random variables, so a simple example would be:
[tex]\Pr(X_i=\pm\sqrt{i})=0.5[/tex]

Thanks for the latex help by the way.

Stephen Tashi · Apr 28, 2012

ntg865 said:

a simple example would be:
[tex]\Pr(X_i=\pm\sqrt{i})=0.5[/tex]

Ok, [itex]Pr(X_k = \sqrt{k }) = 0.5[/itex] and [itex]Pr(X_k = -\sqrt{k}) = 0.5[/itex]
What would [itex]\sigma[/itex] be in that case? You didn't put any subscript on the [itex]\sigma[/itex] in your equations.

By the way, I've forgotten if the LaTex thread explains using the tag "itex" instead of "tex" when you want the symbols to appear inline with written text. I find the "itex" tag very handy.

ntg865 · Apr 28, 2012

Stephen Tashi said:

Ok, [itex]Pr(X_k = \sqrt{k }) = 0.5[/itex] and [itex]Pr(X_k = -\sqrt{k}) = 0.5[/itex]
What would [itex]\sigma[/itex] be in that case? You didn't put any subscript on the [itex]\sigma[/itex] in your equations.

I used tex for equations to make it look neater and easier to read (like how one would write in articles).

[itex]\sigma^2[/itex] is the variance of the sum of all [itex]\sigma_i[/itex] divided by n (by independence, see below). So it is [itex]\text{Var} \left[ \frac{S_n}{n}\right][/itex] where [itex]S_n[/itex] is [itex]\sum_{i=1}^{n} X_i[/itex]

Firstly,
[tex]\text{E} [X_i] = \frac{1}{2} \sqrt{i} - \frac{1}{2} \sqrt{i} = 0[/tex]
By independent variance additivity,
[tex]\text{Var} [S_n] = \sum_{i=1}^{n} \text{Var} [X_i][/tex]
[tex]=\sum_{i=1}^{n} \left( \text{E}[X_i^2] - (\text{E} [X_i])^2 \right)[/tex]
[tex]=\sum_{i=1}^{n} \left( \frac{1}{2} (\sqrt{i})^2 + \frac{1}{2} (-\sqrt{i})^2 - 0 \right)[/tex]
[tex]=\sum_{i=1}^{n} i = \frac{n(n+1)}{2}[/tex]
Now Chebychev's bound gives:
[tex]\Pr \left\{\left|\frac{S_n}{n}\right| \ge \epsilon \right\} \le \text{Var} \left[ \frac{S_n}{n}\right] \frac{1}{\epsilon^2}= \frac{1}{n^2} \text{Var} [S_n] \frac{1}{\epsilon^2}[/tex]
The right hand side is
[tex]\frac{1}{n^2} \text{Var} [S_n] \frac{1}{\epsilon^2}=\frac{1}{2\epsilon^2}+\frac{1}{2n\epsilon^2}[/tex]
This converges to [itex]\frac{1}{2\epsilon^2}[/itex] as [itex]n[/itex] goes to infinity. And that's where my problem is: it's neither infinity nor zero.
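
As a rough numerical check of the gap between the two sides, the following sketch simulates the example [itex]X_i = \pm\sqrt{i}[/itex] and compares a Monte Carlo estimate of the left-hand side with the Chebyshev bound. The function names, the trial count and the choice ε = 1 are assumptions made for illustration; they are not part of the original post.

```python
import math
import random

def chebyshev_bound(n, eps):
    """Right-hand side for X_i = +/- sqrt(i): Var(S_n/n) / eps^2 = (n+1) / (2*n*eps^2)."""
    return (n + 1) / (2 * n * eps ** 2)

def empirical_tail(n, eps, trials=2000, seed=0):
    """Monte Carlo estimate of Pr(|S_n/n| >= eps) for X_i = +/- sqrt(i)."""
    rng = random.Random(seed)
    roots = [math.sqrt(i) for i in range(1, n + 1)]
    hits = 0
    for _ in range(trials):
        # Each X_i takes +sqrt(i) or -sqrt(i) with probability 1/2.
        s = sum(r if rng.random() < 0.5 else -r for r in roots)
        if abs(s / n) >= eps:
            hits += 1
    return hits / trials

if __name__ == "__main__":
    eps = 1.0  # illustrative choice
    for n in (10, 100, 1000):
        print(n, empirical_tail(n, eps), chebyshev_bound(n, eps))
```

The inequality only guarantees that the first printed number does not exceed the second (up to sampling noise), so a bound that stays positive cannot by itself decide whether the left-hand side tends to zero.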

Stephen Tashi · Apr 29, 2012

You didn't make significant use of [itex]\sigma[/itex] in your last post, but in case it reappears, let's get it defined correctly.

ntg865 said:

[itex]\sigma^2[/itex] is the variance of the sum of all [itex]\sigma_i[/itex] divided by n

it is [itex]\text{Var} \left[ \frac{S_n}{n}\right][/itex] where [itex]S_n[/itex] is [itex]\sum_{i=1}^{n} X_i[/itex]

If [itex]\sigma^2[/itex] is the mean of the variances of "all" of the [itex]X_i[/itex], it would have to be [itex]\lim_{n \rightarrow \infty} \frac{ \sum_{i=1}^n \text{Var}(X_i)}{n}[/itex].

If [itex]\sigma^2[/itex] is the mean of the variances of the first n of the [itex]X_i[/itex] then it should be denoted [itex]\sigma^2_n[/itex] to indicate that it depends on n.

The rest of the math is clear, but what is the question?

Let [itex]m_n = \frac{\sum_{i=1}^n X_i}{n}[/itex]. Let [itex]\mu_n = E( m_n)[/itex]. I think you are asking whether it is true that for each [itex]\epsilon > 0[/itex] we have [itex]\lim_{n \rightarrow \infty} Pr( | m_n - \mu_n | > \epsilon) = 0[/itex].

I doubt that conjecture is true for an arbitrary sequence of independent but not identically distributed random variables.

For the example, I agree that Chebyshev's inequality does not establish the conjecture. So the question is whether the conjecture can be proven or disproven for that example.

For a disproof, it would help to have some sort of reverse inequality of the form

[tex]Pr( |Y - \mu_Y| > \theta) \ge f(\sigma_{Y})[/tex]

where [itex]f[/itex] will be some known function. It's too late in the evening for me to think about that. (Maybe Chiro will figure it out and I won't have to.)

chiro · Apr 29, 2012

If you don't want to assume i.i.d. but just independent samples from whatever distributions, one idea I have is to find the supremum of all the distribution variances and base your variance bound on σ², where σ² is that supremum. Then, using the standard argument used in the i.i.d. case, you can prove the result.

The reason is that the sum of the variances will always be less than or equal to the sum of the supremum variances. If you need a rigorous proof, just use norms and their properties to prove this.

Then follow the normal argument and you will get the intended result.

ntg865 · Apr 29, 2012

Stephen Tashi said:

I doubt that conjecture is true for an arbitrary sequence of independent but not identically distributed random variables.

For the example, I agree that Chebyshev's inequality does not establish the conjecture. So the question is whether the conjecture can be proven or disproven for that example.

For a disproof, it would help to have some sort of reverse inequality of the form

[tex]Pr( |Y - \mu_Y| > \theta) \ge f(\sigma_{Y})[/tex]

where [itex]f[/itex] will be some known function. It's too late in the evening for me to think about that. (Maybe Chiro will figure it out and I won't have to.)

Thanks Stephen and Chiro. My question is not whether the example above satisfies the WLLN; my question is about Chebyshev's inequality: what can we conclude (if anything) if the RHS of the inequality is greater than 0 and smaller than ∞?

As for the example, I can prove that it does not satisfy WLLN using Kolmogorov's WLLN which states that a sequence of random variables obeys the WLLN if and only if:
[tex]\text{E} \left[\frac{[\sum_{i=1}^{n} X_i - \sum_{i=1}^{n} \text{E} (X_i)]^2}{n^2 +[\sum_{i=1}^{n} X_i - \sum_{i=1}^{n} \text{E} (X_i)]^2} \right] \to 0 \text{ as } n \to \infty[/tex]
We can rewrite this as:
[tex]\text{E} \left[1 - \frac{n^2}{n^2 +[\sum_{i=1}^{n} X_i - \sum_{i=1}^{n} \text{E} (X_i)]^2 }\right][/tex]
[tex]=1 - \frac{n^2}{ n^2 + \text{E}[(\sum_{i=1}^{n} X_i - \sum_{i=1}^{n} \text{E} (X_i))^2]}[/tex]
In our example above, E([itex]X_i[/itex])=0, now,
[tex]\text{E} [(\sum_{i=1}^{n} X_i)^2][/tex]
[tex]=\text{E} \left(\sum_{i=1}^{n} X_i^2 + 2\sum_{i<j} X_i X_j\right)[/tex]
where the second sum runs over all pairs [itex]1 \le i < j \le n[/itex]. By linearity of expectation,
[tex]=\sum_{i=1}^{n} \text{E} (X_i^2) + 2\sum_{i<j}\text{E} (X_i X_j)[/tex]
by independence of [itex]X_i[/itex],
[tex]=\sum_{i=1}^{n} \text{E} (X_i^2) + 2\sum_{i<j}\text{E} (X_i)\text{E}( X_j)[/tex]
[tex]=\sum_{i=1}^{n} \text{E} (X_i^2)[/tex]
as shown in my last post,
[tex]=\frac{n(n+1)}{2}[/tex]
Kolmogorov's condition therefore becomes:
[tex]=1 - \frac{n^2}{ n^2 + \frac{n(n+1)}{2}}[/tex]
which tends to [itex]\frac{1}{3}[/itex] as [itex]n \to \infty[/itex], which means that the WLLN does not hold.
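
The quantity in Kolmogorov's condition can also be estimated directly by simulation. The sketch below does this for [itex]X_i = \pm\sqrt{i}[/itex]; the function names, trial count and seed are illustrative assumptions. It samples the ratio itself instead of relying on the closed form, because moving the expectation inside the fraction is in general only an upper bound (Jensen's inequality, since [itex]x \mapsto x/(1+x)[/itex] is concave), so the simulated value may sit below the closed-form value while still staying away from 0.

```python
import math
import random

def kolmogorov_ratio_estimate(n, trials=2000, seed=0):
    """Monte Carlo estimate of E[S_n^2 / (n^2 + S_n^2)] for X_i = +/- sqrt(i)."""
    rng = random.Random(seed)
    roots = [math.sqrt(i) for i in range(1, n + 1)]
    total = 0.0
    for _ in range(trials):
        # E(X_i) = 0, so no centering term is needed.
        s = sum(r if rng.random() < 0.5 else -r for r in roots)
        total += s * s / (n * n + s * s)
    return total / trials

def closed_form_from_post(n):
    """1 - n^2 / (n^2 + n(n+1)/2), the expression derived in the post."""
    return 1 - n * n / (n * n + n * (n + 1) / 2)

if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, kolmogorov_ratio_estimate(n), closed_form_from_post(n))
```

If the estimates level off at a positive value as n grows, that is consistent with the conclusion that this sequence does not obey the WLLN.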

My question is, does WLLN not hold if the RHS of the Chebyshev's inequality is greater than 0 (as in the example, recall that it equals 1/(2ε^2)), or is this example just a special case? Intuitively, I do not think that it holds if the RHS is greater than 0, but I cannot confirm or show that, which is why I am asking... Thanks again.

chiro · Apr 29, 2012

I think you should consider what I said above.

With regards to specifics: regardless of the distributions, you know that the variance of the distribution of the mean will always be bounded by [tex]\frac{\sigma^{2}}{n}[/tex], where our σ² is the supremum of the variances of all the distributions considered, even if the samples do not come from the same distribution.

Remember we need to divide by n because we are looking at the distribution of a mean of samples and this will affect the variance.

Again, even if we had distributions with different variances, we can use the triangle inequality for a 2-norm to show the bounds.

I'm also going from the definitions provided in the link I posted above with the exception that I am allowing the samples to come from just independent distributions and not necessarily identical, which means I have to use the triangle inequality and the supremum variance of the set of all population variances (assuming that the variances are finite also).
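
The supremum argument can be sketched numerically. The helper names and the particular variance sequences below are assumptions chosen for illustration: one sequence has variances bounded by 3, the other has [itex]\text{Var}(X_i) = i[/itex] as in ntg865's example, where no finite supremum exists.

```python
def chebyshev_rhs(variances, eps):
    """(1/n^2) * sum(variances) / eps^2 for the mean of independent variables."""
    n = len(variances)
    return sum(variances) / (n * n * eps * eps)

def sup_bound(variances, eps):
    """Cruder bound max(variances) / (n * eps^2), valid because each variance <= max."""
    n = len(variances)
    return max(variances) / (n * eps * eps)

if __name__ == "__main__":
    eps = 0.5  # illustrative choice
    for n in (10, 100, 1000):
        bounded = [1.0 + (i % 3) for i in range(1, n + 1)]  # variances in {1, 2, 3}
        growing = [float(i) for i in range(1, n + 1)]       # Var(X_i) = i, unbounded
        print(n, chebyshev_rhs(bounded, eps), sup_bound(bounded, eps),
              chebyshev_rhs(growing, eps))
```

For the bounded sequence both quantities shrink like 1/n, which is the usual Chebyshev proof of the weak law. For the growing sequence the right-hand side equals (n+1)/(2nε²) and does not shrink, so the argument gives no conclusion there.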

Stephen Tashi · Apr 29, 2012

In the situation ntg865 is asking about, the supremum of the [itex]\sigma_i ^2[/itex] could be [itex]\infty[/itex].

I think we have a glimpse of the question in the statement:

My question is, does WLLN not hold if the RHS of the Chebyshev's inequality is greater than 0

It doesn't make sense to ask if "does WLLN not hold" since WLLN is a theorem, not a property. I'll try to formulate the question correctly.

Let [itex]\{X_i\}[/itex] be an infinite sequence of independent random variables, each of which has a finite mean [itex]\mu_i[/itex] and finite variance [itex]\sigma^2_i[/itex] (but not necessarily having identical distributions, means or variances, and not necessarily with [itex]\sup_i \text{Var}(X_i)[/itex] being finite).

Let [itex]C_n = \frac{ \sum_{i=1}^n \sigma^2_i}{n^2}[/itex] (This defines the expression involved in the right hand side of Chebyshev's inequality.)

Let [itex]Y_n = \frac{ \sum_{i=1}^n X_i }{n}[/itex] Let [itex]\theta_n[/itex] be the expected value of [itex]Y_n[/itex]

Define the terminology "The sequence of random variables [itex]\{X_n\}[/itex] satisfies the WLLN property" to mean that for any [itex]\epsilon > 0[/itex], [itex]\lim_{n \rightarrow \infty} P( | Y_n - \theta_n| > \epsilon) = 0[/itex].

The conjecture is:

Suppose [itex]C = \lim_{n \rightarrow \infty} C_n[/itex] exists and [itex]C > 0[/itex]. Then [itex]\{X_i\}[/itex] does not satisfy the WLLN property.

i.e. There exists an [itex]\epsilon > 0[/itex] such that [itex]\lim_{n \rightarrow \infty}P( |Y_n - \theta_n| > \epsilon )[/itex] either fails to exist or exists and is [itex]> 0[/itex].
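
One way to probe the conjecture is to compare two sequences with the same [itex]C_n[/itex]. The sketch below uses the thread's example [itex]X_i = \pm\sqrt{i}[/itex] and a hypothetical second sequence that is not from the thread: [itex]X_i = \pm i^{3/2}[/itex] with probability [itex]1/(2i^2)[/itex] each and [itex]X_i = 0[/itex] otherwise. Both have [itex]\text{Var}(X_i) = i[/itex], so [itex]C_n \to 1/2[/itex] in both cases. For the second sequence [itex]\sum_i P(X_i \ne 0) = \sum_i 1/i^2 < \infty[/itex], so by the Borel-Cantelli lemma only finitely many terms are non-zero almost surely and [itex]Y_n \to 0[/itex], which would make it a counterexample to the conjecture as stated. Function names, trial counts and ε are illustrative assumptions.

```python
import math
import random

def mean_dense(n, rng):
    """Sample S_n/n for X_i = +/- sqrt(i), each sign with probability 1/2."""
    s = sum(math.sqrt(i) if rng.random() < 0.5 else -math.sqrt(i)
            for i in range(1, n + 1))
    return s / n

def mean_sparse(n, rng):
    """Sample S_n/n for the hypothetical X_i = +/- i**1.5 w.p. 1/(2 i^2) each, else 0."""
    s = 0.0
    for i in range(1, n + 1):
        if rng.random() < 1.0 / (i * i):  # X_i is non-zero with probability 1/i^2
            s += i ** 1.5 if rng.random() < 0.5 else -(i ** 1.5)
    return s / n

def tail_estimate(sampler, n, eps, trials=1000, seed=0):
    """Monte Carlo estimate of Pr(|S_n/n| > eps)."""
    rng = random.Random(seed)
    return sum(abs(sampler(n, rng)) > eps for _ in range(trials)) / trials

if __name__ == "__main__":
    eps = 0.5  # illustrative choice
    for n in (10, 100, 1000):
        print(n, tail_estimate(mean_dense, n, eps), tail_estimate(mean_sparse, n, eps))
```

If the reasoning above is right, the first column of estimates should stay roughly constant while the second shrinks as n grows, even though Chebyshev's right-hand side is identical for the two sequences.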

Stephen Tashi · May 17, 2012

I found this theorem in the book "Probability" by Leo Breiman. I've paraphrased it.

Theorem 3.27. Let [itex]X_1,X_2,...[/itex] be independent random variables with [itex]E(X_k) = 0[/itex] and [itex]E(X_k^2) < \infty[/itex] for each [itex]k = 1,2,...[/itex] Let [itex]\{b_n\}[/itex] be a sequence of non-negative numbers such that [itex]\lim_{n \rightarrow \infty} b_n = +\infty[/itex]. If [itex]\sum_{k=1}^\infty \frac{E(X_k^2)}{b_k^2} < \infty[/itex] then [itex]\frac{\sum_{k=1}^n X_k}{b_n}[/itex] converges to zero almost surely as [itex]n[/itex] approaches infinity.
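
To see how the theorem relates to the example in this thread, the sketch below tabulates partial sums of the series in the hypothesis. It assumes the hypothesis is read as [itex]\sum_k E(X_k^2)/b_k^2 < \infty[/itex] (the form that matches the SLLN condition quoted in the first post when [itex]b_k = k[/itex]), uses [itex]E(X_k^2) = k[/itex] from the [itex]\pm\sqrt{k}[/itex] example, and tries the hypothetical normalisations [itex]b_k = k[/itex] and [itex]b_k = k^{3/2}[/itex]. The function name is illustrative.

```python
def partial_sum(second_moment, b, n):
    """Partial sum of E(X_k^2) / b(k)^2 for k = 1..n."""
    return sum(second_moment(k) / b(k) ** 2 for k in range(1, n + 1))

if __name__ == "__main__":
    def second_moment(k):
        return float(k)  # E(X_k^2) = k for X_k = +/- sqrt(k)

    for n in (10**2, 10**3, 10**4, 10**5):
        print(n,
              partial_sum(second_moment, lambda k: float(k), n),         # b_k = k
              partial_sum(second_moment, lambda k: float(k) ** 1.5, n))  # b_k = k^1.5
```

With [itex]b_k = k[/itex] the partial sums are harmonic numbers and keep growing, so under this reading the theorem says nothing about [itex]S_n/n[/itex] for the example. With [itex]b_k = k^{3/2}[/itex] they stay bounded, so the theorem would apply to the more heavily normalised sum [itex]S_n/n^{3/2}[/itex].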
