# Convergence of random variables

kingwinner
I was reading some proofs about the convergence of random variables, and here are the little bits that I couldn't figure out...

1) Let Xn be a sequence of random variables, and let Xnk be a subsequence of it. If Xn converges in probability to X, then Xnk also converges in probability to X. WHY?

2) I was looking at a theorem: if E(Y)<∞, then Y<∞ almost surely. Now I am puzzled by the notation. What does it MEAN to say that Y=∞ or Y<∞?
For example, if Y is a Poisson random variable, then the possible values are 0,1,2,..., (there is no upper bound). Is it true to say that Y=∞ in this case?

3) If Xn^4 converges to 0 almost surely, then is it true to say that Xn also converges to 0 almost surely? Why or why not?

4) The moment generating function (mgf) determines the distribution uniquely, so we can use the mgf to find the distributions of random variables. If the mgf already does the job, what is the point of introducing the "characteristic function"?

Any help is much appreciated! :)

I can answer the first one. By definition, Xn converges to X in probability if for all epsilon > 0, Pr(|Xn-X|>epsilon) converges to 0. Suppose Xn converges to X in probability, and let Xnk be a subsequence. Then for any fixed epsilon>0, the numbers Pr(|Xnk-X|>epsilon) form a subsequence of the numbers Pr(|Xn-X|>epsilon) (these are sequences of real numbers, not of random variables). Since a subsequence of a convergent sequence of numbers converges to the limit of the original sequence, it follows that Pr(|Xnk-X|>epsilon) converges to 0. So Xnk converges in probability to X.
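
As a rough numerical illustration of this argument, here is a minimal Python sketch. The sequence Xn = X + Z/n with Z standard normal is an assumed example (it is not from the thread), chosen because Pr(|Xn-X|>epsilon) = Pr(|Z| > n*epsilon) has a closed form via the complementary error function. The subsequence nk = k^2 is likewise an arbitrary choice.

```python
import math

def tail_prob(n, eps):
    """Pr(|X_n - X| > eps) for the assumed example X_n = X + Z/n, Z ~ N(0, 1)."""
    # Pr(|Z| > a) = erfc(a / sqrt(2)) for a standard normal Z.
    return math.erfc(n * eps / math.sqrt(2))

eps = 0.1
# The full sequence of probabilities a_1, ..., a_100 (plain real numbers).
full = [tail_prob(n, eps) for n in range(1, 101)]

# A subsequence of indices n_k = k^2 (hypothetical choice for illustration).
sub_indices = [k * k for k in range(1, 11)]
sub = [tail_prob(n, eps) for n in sub_indices]

# Every term of the subsequence is literally a term of the full sequence,
# so it inherits the same limit, 0.
for k, n in enumerate(sub_indices):
    print(n, sub[k], sub[k] == full[n - 1])
```

The point of the sketch is only that, once epsilon is fixed, the question reduces to a statement about ordinary sequences of numbers.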


1) This would be a generalization of convergence of subsequences of real numbers.

2) An example of this would be first exit times - consider a process that has a positive probability of never exiting (e.g. fly in a jar), so the first exit time can be infinite.

3) Not sure

4) No - a distribution is not always determined by its moments (e.g. lognormal distribution), and the mgf doesn't necessarily exist (e.g. Pareto). The c.f. is useful because it always exists on the real axis (if the r.v. is a.s. finite) and acts like a Fourier transform.

Hope this helps
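
For question 3, here is a short sketch of an argument, assuming that "Xn4" in the question means the fourth power $X_n^4$ (the exponent looks like it was flattened in the post). Almost sure convergence is a statement about individual outcomes $\omega$, so it is enough to argue pointwise:

$$X_n(\omega)^4 \to 0 \quad\Longrightarrow\quad |X_n(\omega)| = \left(X_n(\omega)^4\right)^{1/4} \to 0,$$

because $x \mapsto x^{1/4}$ is continuous at 0. If the left-hand side holds for all $\omega$ in an event of probability 1, then so does the right-hand side, which under this reading would give $X_n \to 0$ almost surely.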

Thank you for the replies.

2) I don't get it. The theorem is talking about this: "if E(Y)<∞, then Y<∞ almost surely", but I don't even know what Y<∞ means...:(
For a Poisson random variable Y, the possible values are 0,1,2,..., and there is NO upper bound, so Y=∞ is possible? (same for exponential random variable, there is no upper bound.)
For a binomial random variable X, the possible values are 0,1,2,...,n, there is an upper bound, so X<∞?
I am really confused. Can someone please explain more on this? What does it mean to say that Y<∞? (or Y=∞?)

4) So you mean the characteristic function c(t) always exists for ALL real numbers t, is that right?
Also, for example, if we are asked to prove that the sum of 2 independent normal r.v.'s is again normal, then I think the proof using mgf is perfectly fine, but I see my textbook using characteristic function for this, is it absolutely necessary to use characteristic function in a proof like this?

"Also, for example, if we are asked to prove that the sum of 2 independent normal r.v.'s is again normal, then I think the proof using mgf is perfectly fine, but I see my textbook using characteristic function for this, is it absolutely necessary to use characteristic function in a proof like this?"

No, it isn't necessary.

Every probability distribution has a characteristic function, and that function is unique - it determines the distribution.

In order for a distribution to have a moment generating function, every moment has to exist (this is necessary, though not sufficient) - that is, you must have

$$\int |x|^n \,dF(x) < \infty$$

for all n. This isn't always true - consider

$$f(x) = \frac 1 {\pi (1+x^2)}$$

which is the standard Cauchy density and doesn't even have a mean.

If a distribution's moments identify the distribution exactly (say they satisfy Carleman's condition) then the moment generating function is unique and identifies the distribution.

I'm guessing (and it's only a guess, since I don't know which probability text you're using) that the author(s) use the characteristic function approach to show the sum of two independent normals is normal because it is a relatively easy example to use to demonstrate the general procedure.
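
To make the characteristic function route concrete, here is a minimal Python sketch of the check behind it. It uses the standard formula exp(i*mu*t - sigma^2*t^2/2) for the c.f. of a normal distribution and the fact that the c.f. of a sum of independent variables is the product of the individual c.f.s. The parameter values are hypothetical, and a numerical comparison on a grid is only an illustration, not a proof.

```python
import cmath

def normal_cf(t, mu, sigma):
    """Characteristic function of N(mu, sigma^2): exp(i*mu*t - sigma^2 * t^2 / 2)."""
    return cmath.exp(1j * mu * t - 0.5 * sigma**2 * t**2)

# Hypothetical parameters for two independent normals X and Y.
mu_x, sigma_x = 1.0, 2.0
mu_y, sigma_y = -0.5, 1.5

def cf_of_sum(t):
    """By independence, the c.f. of X + Y is the product of the two c.f.s."""
    return normal_cf(t, mu_x, sigma_x) * normal_cf(t, mu_y, sigma_y)

def cf_of_candidate(t):
    """C.f. of the candidate law N(mu_x + mu_y, sigma_x^2 + sigma_y^2)."""
    sigma = (sigma_x**2 + sigma_y**2) ** 0.5
    return normal_cf(t, mu_x + mu_y, sigma)

# Compare the two functions on a grid of t values in [-5, 5].
max_gap = max(abs(cf_of_sum(k / 10) - cf_of_candidate(k / 10)) for k in range(-50, 51))
print(max_gap)  # expected to be at rounding-error level
```

Since the two characteristic functions agree and a characteristic function determines the distribution, X + Y has the candidate normal law. The same algebra works with the mgf for normals, which is why either proof is acceptable here.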

4) So while the moment generating function does not always exist in a neighborhood of 0, the "characteristic function" ALWAYS exists for ALL real numbers t, is this right? (so that it is more general?)

2) Can you also explain the meaning of "Y<∞", please?
Is this about the difference between binomial random variables (which have an upper bound on the possible values), and Poisson (or exponential) random variables (which have no upper bound on the possible values)?
So that for binomial random variables Y, we can say that Y<∞, while for Poisson (or exponential) random variables X, we cannot say that X<∞?

Your help is much appreciated! :)

It is often more convenient to do calculus using the extended real numbers rather than the real numbers. The extended real numbers contain two extra points, called $+\infty$ and $-\infty$.

Every infinite sum of nonnegative extended real numbers is convergent. For example:
$$1 + 1 + 1 + \cdots = +\infty$$
A similar statement is true for definite integrals.
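
To connect this with the theorem in question 2, here is a short worked sketch, under the assumption (not stated in the thread) that $Y$ is nonnegative and allowed to take the value $+\infty$. "$Y<\infty$ almost surely" then means $P(Y=\infty)=0$. For every real $M>0$,

$$E(Y) \;\ge\; M \cdot P(Y \ge M) \;\ge\; M \cdot P(Y = \infty), \qquad\text{so}\qquad P(Y=\infty) \le \frac{E(Y)}{M}.$$

If $E(Y)<\infty$, letting $M \to \infty$ gives $P(Y=\infty)=0$. For a Poisson variable with parameter $\lambda$, the same conclusion can be read off directly from the probabilities:

$$P(Y=\infty) = 1 - \sum_{k=0}^{\infty} e^{-\lambda}\frac{\lambda^k}{k!} = 1 - 1 = 0.$$

So an unbounded set of possible values is not the same as $Y=\infty$: every individual value of a Poisson (or exponential) variable is a finite number, and $Y<\infty$ holds even though there is no upper bound.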


1) Correct - as I (and others) have noted, there are some distributions for which the moment generating function does not exist - distributions that fail to have moments from some order on. The reason this is a problem comes from the definition of the mgf and can be seen from the series expansion of the exponential function. For the real-valued case

\begin{align*} \phi_X(t) & = \int_{\mathcal{R}} e^{tx} \, dF(x) \\ & = \int_{\mathcal{R}} \sum_{n=0}^\infty \frac{(tx)^n}{n!} \, dF(x) \end{align*}

If the distribution does not have moments of all orders, eventually an integral involving $x^n$ will diverge, and so the mgf does not exist.

2) The characteristic function exists for every distribution, for every real t, since

$$|\psi_X(t)| = \left|\int_{\mathcal{R}} e^{itx} \, dF(x)\right| \le \int_{\mathcal{R}} |e^{itx}| \, dF(x) = \int_{\mathcal{R}} dF(x) = 1$$
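
The contrast between the two transforms can be seen numerically. The following Python sketch uses the standard Cauchy density as the example and a plain midpoint rule on a truncated range [-A, A]; the truncation points, step counts, and function names are arbitrary choices made for illustration. The truncated mgf integral keeps growing as A increases, while the truncated c.f. integral settles near exp(-|t|), the known c.f. of the standard Cauchy distribution.

```python
import math

def cauchy_pdf(x):
    """Standard Cauchy density 1 / (pi * (1 + x^2))."""
    return 1.0 / (math.pi * (1.0 + x * x))

def midpoint_integral(g, a, b, steps):
    """Plain midpoint rule; adequate for a rough illustration."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))

def truncated_mgf(t, A, steps=20000):
    """Integral of e^{tx} f(x) over [-A, A]; diverges as A grows when t != 0."""
    return midpoint_integral(lambda x: math.exp(t * x) * cauchy_pdf(x), -A, A, steps)

def truncated_cf_real(t, A, steps=40000):
    """Integral of cos(tx) f(x) over [-A, A]; the sin part cancels by symmetry."""
    return midpoint_integral(lambda x: math.cos(t * x) * cauchy_pdf(x), -A, A, steps)

mgf_vals = [truncated_mgf(0.5, A) for A in (10, 20, 40)]
cf_val = truncated_cf_real(1.0, 200)

print(mgf_vals)                 # increases rapidly with the truncation point
print(cf_val, math.exp(-1.0))   # the two numbers should be close
```

The boundedness of the c.f. integrand, |e^{itx}| = 1, is exactly what the inequality above uses; the mgf integrand e^{tx} has no such bound.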

## 1. What is the concept of convergence of random variables?

Convergence of random variables refers to the idea that a sequence of random variables X1, X2, ... settles down, in some specified sense, to a limiting random variable or distribution as the index n increases. Several different senses are in use, and they are not equivalent, so the mode of convergence always has to be stated.

## 2. What are the different types of convergence of random variables?

Three commonly used types of convergence are almost sure convergence, convergence in probability, and convergence in distribution (others, such as convergence in mean, also exist). Almost sure convergence means that, with probability 1, the sequence of values Xn converges to the limit X, which may itself be a random variable. Convergence in probability means that for every epsilon > 0, the probability that Xn differs from X by more than epsilon approaches 0 as n increases. Convergence in distribution means that the cumulative distribution functions of the Xn converge to the cumulative distribution function of the limit at every point where the latter is continuous.

## 3. What is the difference between convergence in probability and almost sure convergence?

The difference lies in what is required of individual sample paths. Almost sure convergence requires that, with probability 1, the realized sequence of numbers Xn(ω) converges to X(ω). Convergence in probability only requires that, for each fixed epsilon > 0, the probability that Xn is more than epsilon away from X tends to 0; individual sample paths may still fail to converge. Almost sure convergence implies convergence in probability, but the converse is false in general.
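
A standard counterexample, often called the "typewriter" sequence, shows the gap between the two modes. The Python sketch below is an illustration under assumed conventions: the sample space is [0, 1) with the uniform distribution, and the function names and the indexing n = 2^k + j are choices made here. The probability that Xn equals 1 shrinks to 0, so Xn converges to 0 in probability, yet for any fixed outcome omega the value 1 keeps recurring, so the sample path does not converge.

```python
def typewriter(n, omega):
    """Value X_n(omega) of the typewriter sequence, n = 1, 2, 3, ...

    Write n = 2**k + j with 0 <= j < 2**k. X_n is the indicator of the
    dyadic interval [j / 2**k, (j + 1) / 2**k).
    """
    k = n.bit_length() - 1
    j = n - 2**k
    return 1 if j / 2**k <= omega < (j + 1) / 2**k else 0

def prob_one(n):
    """Pr(X_n = 1) under the uniform distribution on [0, 1): the interval length."""
    k = n.bit_length() - 1
    return 2.0 ** -k

omega = 0.3  # an arbitrary fixed outcome
# Indices n < 2**10 at which the sample path takes the value 1.
hits = [n for n in range(1, 2**10) if typewriter(n, omega) == 1]

print(hits)              # one hit in every block 2**k <= n < 2**(k+1)
print(prob_one(2**9))    # the probability of a hit is already small here
```

Each block of indices sweeps an interval of shrinking length across [0, 1), so every outcome is hit once per block, forever.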

## 4. What is the importance of convergence of random variables in statistics?

The concept of convergence of random variables is crucial in statistics because it justifies inferences about a population based on a sample. For example, the law of large numbers says that the sample mean converges to the population mean, which is what makes the sample mean a reasonable estimator. By understanding how a sequence of random variables behaves as the number of terms increases, we can say when estimators become accurate as the sample size grows and how reliable the resulting conclusions are.

## 5. How is convergence of random variables related to the central limit theorem?

The central limit theorem states that the standardized sum (the sum minus its mean, divided by its standard deviation) of a large number of independent and identically distributed random variables with finite variance is approximately standard normal, regardless of the underlying distribution of the individual variables. This is a statement of convergence in distribution: the distribution of the standardized sum approaches the standard normal distribution as the number of terms increases. The central limit theorem is therefore a specific instance of convergence of random variables.
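
As a small exact check of this kind of convergence, the Python sketch below takes sums of independent Bernoulli(1/2) variables, whose distribution is Binomial(n, 1/2), and measures the largest gap between the cdf of the standardized sum and the standard normal cdf. The choice of Bernoulli summands, the sample sizes, and the function names are assumptions made for illustration; no simulation is involved.

```python
import math

def normal_cdf(z):
    """Standard normal cdf, written via the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2))

def kolmogorov_distance(n, p=0.5):
    """Largest gap between the cdf of the standardized Binomial(n, p) and the normal cdf.

    The binomial cdf is a step function and the normal cdf is continuous and
    increasing, so the largest gap occurs at a jump; both sides of each jump
    are checked.
    """
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    cdf_before = 0.0
    worst = 0.0
    for k in range(n + 1):
        pmf = math.comb(n, k) * p**k * (1 - p) ** (n - k)
        cdf_after = cdf_before + pmf
        phi = normal_cdf((k - mean) / sd)
        worst = max(worst, abs(cdf_before - phi), abs(cdf_after - phi))
        cdf_before = cdf_after
    return worst

distances = [kolmogorov_distance(n) for n in (10, 100, 400)]
print(distances)  # the gap shrinks as n grows
```

The shrinking gap between cdfs is what convergence in distribution means here; it says nothing about individual sample paths converging.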
