Convergence of random variables

Summary:
The discussion centers on the convergence of random variables and related concepts. It clarifies that if a sequence of random variables converges in probability to a limit, any subsequence also converges to the same limit. The notation for random variables being finite or infinite is explored, particularly for distributions like the Poisson, which take arbitrarily large (but finite) values. The moment generating function (mgf) is noted to not always exist, while the characteristic function always does, making it a more general tool for determining distributions. Overall, the characteristic function's reliability and uniqueness in identifying distributions are emphasized, in contrast with the limitations of the mgf.
kingwinner
I was reading some proofs about the convergence of random variables, and here are the little bits that I couldn't figure out...

1) Let Xn be a sequence of random variables, and let Xnk be a subsequence of it. If Xn converges in probability to X, then Xnk also converges in probability to X. WHY?

2) I was looking at a theorem: if E(Y)<∞, then Y<∞ almost surely. Now I am puzzled by the notation. What does it MEAN to say that Y=∞ or Y<∞?
For example, if Y is a Poisson random variable, then the possible values are 0,1,2,..., (there is no upper bound). Is it true to say that Y=∞ in this case?

3) If Xn^4 converges to 0 almost surely, then is it true to say that Xn also converges to 0 almost surely? Why or why not?

4) The moment generating function (mgf) determines the distribution uniquely, so we can use the mgf to find the distributions of random variables. If the mgf already does the job, what is the point of introducing the "characteristic function"?

Can someone please explain?
Any help is much appreciated! :)
 
I can answer the first one. By definition, Xn converges to X in probability if for all epsilon > 0,
Pr(|Xn-X|>epsilon) converges to 0. Suppose Xn converges to X in probability, and let Xnk be a subsequence. Then for any epsilon > 0, Pr(|Xnk-X|>epsilon) is a subsequence of Pr(|Xn-X|>epsilon) (these are sequences of numbers). Since a subsequence of a convergent sequence of numbers converges to the limit of the original sequence, it follows that Pr(|Xnk-X|>epsilon) converges to 0. So Xnk converges in probability to X.
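
In symbols, the same argument, writing $a_n := \Pr(|X_n - X| > \epsilon)$ for a fixed $\epsilon > 0$:
$$
a_n \to 0 \quad\Longrightarrow\quad a_{n_k} \to 0 \ \text{ for every subsequence } (n_k),
$$
so $\Pr(|X_{n_k} - X| > \epsilon) \to 0$ as $k \to \infty$, which is exactly convergence in probability of Xnk to X.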
 
1) This is just the analogue of the fact that every subsequence of a convergent sequence of real numbers converges to the same limit.

2) An example of this would be first exit times - consider a process that has a positive probability of never exiting (e.g. a fly in a jar), so the first exit time can be infinite (a concrete sketch follows below).

3) Not sure

4) No - the moments do not always determine a distribution uniquely (the lognormal is the classic counterexample), and the mgf doesn't necessarily exist (e.g. Pareto). The c.f. is useful because it always exists on the real axis (if the r.v. is a.s. finite) and acts like a Fourier transform.
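
On 2), a concrete worked instance (a standard gambler's-ruin sketch, just by way of illustration): let $S_0 = 0$ and $S_m = \xi_1 + \cdots + \xi_m$ with i.i.d. steps $\Pr(\xi_i = +1) = p$, $\Pr(\xi_i = -1) = 1-p$, where $p < 1/2$, and let $T = \inf\{m : S_m = 1\}$ be the first time the walk reaches level 1. Then
$$
\Pr(T < \infty) = \frac{p}{1-p} < 1, \qquad \Pr(T = \infty) = 1 - \frac{p}{1-p} > 0,
$$
so $T$ is a perfectly good (extended-real-valued) random variable that is infinite with positive probability.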

Hope this helps
 
Thank you for the replies.

2) I don't get it. The theorem is talking about this: "if E(Y)<∞, then Y<∞ almost surely", but I don't even know what Y<∞ means...:(
For a Poisson random variable Y, the possible values are 0,1,2,..., and there is NO upper bound, so Y=∞ is possible? (same for exponential random variable, there is no upper bound.)
For a binomial random variable X, the possible values are 0,1,2,...,n, there is a upper bound, so Y<∞?
I am really confused. Can someone please explain more on this? What does it mean to say that Y<∞? (or Y=∞?)


4) So you mean the characteristic function c(t) always exists for ALL real numbers t, is that right?
Also, for example, if we are asked to prove that the sum of 2 independent normal r.v.'s is again normal, then I think the proof using the mgf is perfectly fine, but I see my textbook using the characteristic function for this; is it absolutely necessary to use the characteristic function in a proof like this?
 
"Also, for example, if we are asked to prove that the sum of 2 indepndent normal r.v.'s is again normal, then I think the proof using mgf is perfectly fine, but I see my textbook using characteristic function for this, is it absolutely necessary to use characteristic function in a proof like this?"

No, it isn't necessary.
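
For instance, a sketch of the mgf route, using the normal mgf $M(t) = e^{\mu t + \sigma^2 t^2/2}$ (which exists for all real $t$): if $X \sim N(\mu_1, \sigma_1^2)$ and $Y \sim N(\mu_2, \sigma_2^2)$ are independent, then
$$
M_{X+Y}(t) = M_X(t)\,M_Y(t) = e^{\mu_1 t + \sigma_1^2 t^2/2}\, e^{\mu_2 t + \sigma_2^2 t^2/2} = e^{(\mu_1+\mu_2)t + (\sigma_1^2+\sigma_2^2)t^2/2},
$$
which is the mgf of $N(\mu_1+\mu_2,\ \sigma_1^2+\sigma_2^2)$; since these mgfs exist in a neighborhood of 0 and determine the distribution, the sum is normal.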

Every probability distribution has a characteristic function, and that function is unique - it determines the distribution.

In order for a distribution to have a moment generating function, every moment has to exist - that is, you must have

<br /> \int x^n \,dF(x) &lt; \infty<br />

for all n. This isn't always true - consider

<br /> f(x) = \frac 1 {\pi (1+x^2)}<br />

which doesn't even have a mean (this is the standard Cauchy density).
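
To check this, note that
$$
\int_{-\infty}^{\infty} |x| \, \frac{dx}{\pi(1+x^2)} = \frac{2}{\pi}\int_0^{\infty} \frac{x}{1+x^2}\, dx = \frac{1}{\pi}\Big[\ln(1+x^2)\Big]_0^{\infty} = \infty,
$$
so the mean (and every higher moment) fails to exist, and the mgf integral $\int e^{tx}\, dF(x)$ diverges for every $t \neq 0$.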

If a distribution's moments identify it exactly (say they satisfy Carleman's condition), then the moments - and hence the moment generating function, when it exists - determine the distribution uniquely.

I'm guessing (and it's only a guess, since I don't know which probability text you're using) that the author(s) use the characteristic function approach to show the sum of two independent normals is normal because it is a relatively easy example to use to demonstrate the general procedure.
 
4) So while the moment generating function does not always exist in a neighborhood of 0, the "characteristic function" ALWAYS exists for ALL real numbers t, is this right? (so that it is more general?)

2) Can you also explain the meaning of "Y<∞", please?
Is this about the difference between binomial random variables (which have an upper bound on their possible values) and Poisson (or exponential) random variables (which have no upper bound on their possible values)?
So that for binomial random variables Y, we can say that Y<∞, while for Poisson (or exponential) random variables X, we cannot say that X<∞?

Your help is much appreciated! :)
 
It is often more convenient to do calculus using the extended real numbers rather than the real numbers. The extended real numbers contain two extra points, called +∞ and −∞.

Every infinite sum of nonnegative extended real numbers is convergent. For example:
$$
1 + 1 + 1 + \cdots = +\infty
$$
A similar statement is true for definite integrals.
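
Applied to the Poisson example above: a Poisson($\lambda$) variable Y has an unbounded range $\{0, 1, 2, \dots\}$, yet
$$
\Pr(Y < \infty) = \sum_{k=0}^{\infty} e^{-\lambda}\frac{\lambda^k}{k!} = 1,
$$
so Y < ∞ almost surely; an unbounded range is not the same as actually taking the value +∞. The first exit time mentioned earlier is different: there the event {T = ∞} genuinely has positive probability.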
 
kingwinner said:
4) So while the moment generating function does not always exist in a neighborhood of 0, the "characteristic function" ALWAYS exists for ALL real numbers t, is this right? (so that it is more general?)

2) Can you also explain the meaning of "Y<∞", please?
Is this about the difference between binomial random variables (which have an upper bound on their possible values) and Poisson (or exponential) random variables (which have no upper bound on their possible values)?
So that for binomial random variables Y, we can say that Y<∞, while for Poisson (or exponential) random variables X, we cannot say that X<∞?

Your help is much appreciated! :)

1) Correct - as I (and others) have noted, there are some distributions for which the moment generating function does not exist - distributions that fail to have moments from some order on. The reason this is a problem comes from the definition of the mgf and can be seen from the series expansion of the exponential function. For the real-valued case

<br /> \begin{align*}<br /> \phi_X(t) &amp; = \int_{\mathcal{R}} e^{tx} \, dF(x) \\<br /> &amp; = \int_{\mathcal{R}} \sum_{n=0}^\infty \frac{(tx)^n}{n!} \, dF(x)<br /> \end{align*}<br />

If the distribution does not have moments of all orders, eventually an integral involving x^n will diverge, and so the mgf does not exist.

2) The characteristic function exists for every distribution, for every real t, since

<br /> |\psi_X(t)| = \left|\int_{\mathcal{R}} e^{itx} \, dF(x)\right| \le \int_{\mathcal{R}} |e^{itx}| \, dF(x) = \int_{\mathcal{R}} dF(x) = 1<br />
 
