CDF of a normal vector: is there a closed form?

Discussion Overview

The discussion revolves around the probability of a vector of independent standard normal random variables exceeding certain threshold values. Participants explore the concept of a cumulative distribution function (CDF) for this scenario and whether a closed form exists for the probability that at least one component of the vector exceeds specified values.

Discussion Character

  • Exploratory
  • Technical explanation
  • Mathematical reasoning

Main Points Raised

  • One participant proposes calculating the probability that at least one component of a vector X exceeds given thresholds by using Bernoulli random variables, suggesting a formula involving the individual probabilities.
  • Another participant discusses the relationship between the squared length of the vector and the Chi-Square distribution, noting that the distribution of the squared length is more tractable than the length itself.
  • There is mention of conditional probabilities when considering the relationship between the components of the vector and the squared length, indicating a need for careful calculation of probabilities within specified intervals.
  • One participant raises a question about the shape of the resulting probability distribution, speculating whether it would exhibit a sigmoid shape in n-space.
  • There is a clarification regarding the interpretation of the vector's length, with a participant suggesting that the MATLAB definition of length may refer to the number of elements rather than a geometric length.

Areas of Agreement / Disagreement

Participants express differing views on the approach to calculating the probability and the interpretation of the vector's length. The discussion remains unresolved, with multiple competing ideas and no consensus on a closed form for the probability.

Contextual Notes

Participants highlight the need for specific intervals when calculating probabilities due to the continuous nature of the Chi-Square distribution. There are also mentions of computational tools and statistical tables that may be necessary for detailed probability calculations.

dabeth
Suppose a vector X of length n, where each component of X is normally distributed with mean 0 and variance 1, and independent of the other components. I want to know the probability that at least one of X_1>2, X_2>2.5, X_3>1.9, etc. happens (inclusive), i.e. the probability that the vector's individual values are all greater than a vector with some arbitrary values. In the above case, I'd be looking for the probability that the individual components of X are greater than [2, 2.5, 1.9, ... ]. To me this looks like a CDF, but a CDF of n independently distributed random variables. Does this have a closed form? If not, is there a manageable way to approximate it? Would it have a sigmoid shape (i.e. concave where all components are positive in n-space)?

Actually, I think the vector notation confuses things. So basically, I'm just asking about the "CDF" of a bunch of independent random variables that all are distributed standard normal.

Thanks.
 
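If you put B_1 = 1 if X_1>2 and 0 otherwise, etc., you have a collection of independent Bernoulli random variables with parameters p_1=Prob[X_1>2], etc. Because they are independent, the probability that none of them is 1 is (1-p_1)*...*(1-p_n), so the probability that at least one is 1 is 1-(1-p_1)*...*(1-p_n).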
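
As a rough sketch of the formula above in Python (not from the thread): it assumes p_i = Prob[X_i > t_i] is the standard normal upper tail, computed with the standard library's math.erfc. The names upper_tail, prob_at_least_one and prob_all are made up for illustration, and the thresholds are just the three values quoted in the question.

```python
import math

def upper_tail(t):
    """P(Z > t) for a standard normal Z, via the complementary error function."""
    return 0.5 * math.erfc(t / math.sqrt(2.0))

def prob_at_least_one(thresholds):
    """P(X_i > t_i for at least one i), for independent standard normal X_i."""
    none_exceed = 1.0
    for t in thresholds:
        none_exceed *= 1.0 - upper_tail(t)  # factor (1 - p_i)
    return 1.0 - none_exceed

def prob_all(thresholds):
    """P(X_i > t_i for every i), which is the product of the p_i by independence."""
    all_exceed = 1.0
    for t in thresholds:
        all_exceed *= upper_tail(t)
    return all_exceed

if __name__ == "__main__":
    thresholds = [2.0, 2.5, 1.9]  # the example values from the question
    print("at least one exceeds:", prob_at_least_one(thresholds))
    print("all exceed:", prob_all(thresholds))
```

Both readings of the question ("at least one" and "all") are included, since the original post mentions each. Either way the answer is a product of one-dimensional normal tail probabilities, so it is only as "closed form" as the normal CDF itself.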
 
Hello dabeth and welcome to the forums.

In terms of length, I'll assume three dimensions in which L^2 = (X1)^2 + (X2)^2 + (X3)^2 and L^2 = N^2 for some given N.

In terms of the random variable L^2: if X1, X2, X3 are independent and distributed N(0,1), then L^2 has a Chi-Square distribution with 3 degrees of freedom (this is the definition of a chi-square distribution). In this case it is better to deal with the distribution of the square of the length rather than the length itself, because the squared length has a known distribution. You can use it to get probabilities for the length of your vector over some given range.

You can't just get the probability of the length being some fixed real number, because the chi-square distribution is continuous and the probability of any single value is zero. You have to give the interval some nonzero width (for example, a width of 1).
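
A minimal Python sketch of such an interval probability, assuming three dimensions as in this post. It relies on the known closed form of the chi-square CDF with 3 degrees of freedom, F(x) = erf(sqrt(x/2)) - sqrt(2x/pi) * exp(-x/2); the function names chi2_3_cdf and prob_length_between are hypothetical.

```python
import math

def chi2_3_cdf(x):
    """CDF of a chi-square random variable with 3 degrees of freedom."""
    if x <= 0.0:
        return 0.0
    return (math.erf(math.sqrt(x / 2.0))
            - math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0))

def prob_length_between(a, b):
    """P(a < L < b) where L^2 = X1^2 + X2^2 + X3^2 and the X_i are iid N(0,1)."""
    # L lies in (a, b) exactly when L^2 lies in (a^2, b^2), for 0 <= a <= b.
    return chi2_3_cdf(b * b) - chi2_3_cdf(a * a)

if __name__ == "__main__":
    print(prob_length_between(3.0, 4.0))  # length between 3 and 4
    print(prob_length_between(2.0, 2.0))  # zero-width interval gives 0.0
```

For other dimensions the CDF is a regularized incomplete gamma function, which the standard library does not provide, so a statistics package would be the more practical route there.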

With regards to finding out whether |X1| > 2.5, this is going to be a conditional probability.

So let's say you choose that you are considering values between a and a+1 for length of the vector. What you will need to find out is P(|X1| > 2.5 | a^2 < X1^2 + X2^2 + X3^2 < (a+1)^2) which is going to be P(|X1| > 2.5 AND a^2 < Chi-Square(3) < (a+1)^2) / P(a^2 < Chi-Square(3) < (a+1)^2).

Since these are actual probabilities, you can calculate them to get a specific value. (You will have to use either a computer or statistical tables with more detailed probabilities than the standard 0.01, 0.025, 0.05, 0.1, etc. I would use a computer to do it.)
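
One way to do this on a computer is plain simulation. The sketch below is an assumption about how one might go about it, not the exact calculation described above: it swaps in a Monte Carlo estimate of P(|X1| > 2.5 | a^2 < X1^2 + X2^2 + X3^2 < (a+1)^2). The function name conditional_prob_mc, the sample size and the seed are arbitrary choices.

```python
import random

def conditional_prob_mc(a, threshold=2.5, n_samples=200_000, seed=0):
    """Estimate P(|X1| > threshold | a^2 < X1^2 + X2^2 + X3^2 < (a+1)^2)."""
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    lo, hi = a * a, (a + 1.0) ** 2
    in_shell = 0  # samples whose squared length falls in (a^2, (a+1)^2)
    both = 0      # of those, samples that also have |X1| > threshold
    for _ in range(n_samples):
        x1 = rng.gauss(0.0, 1.0)
        x2 = rng.gauss(0.0, 1.0)
        x3 = rng.gauss(0.0, 1.0)
        sq_len = x1 * x1 + x2 * x2 + x3 * x3
        if lo < sq_len < hi:
            in_shell += 1
            if abs(x1) > threshold:
                both += 1
    if in_shell == 0:
        return float("nan")  # conditioning event never observed
    return both / in_shell   # ratio of counts estimates the ratio of probabilities

if __name__ == "__main__":
    print(conditional_prob_mc(3.0))  # length between 3 and 4
    print(conditional_prob_mc(1.0))  # length below 2, so |X1| > 2.5 cannot occur
```

The estimate is only as good as the number of samples that land in the conditioning interval, so for large a (where that event is rare) many more samples or an exact integration would be needed.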

Just one note though: here we are dealing with the squared value of each component, which discards its sign, so you need to make sure you are calculating the right thing. In my example I used the absolute value |X1| for that reason.

In terms of more complicated probability calculations, you will need to use probability axioms to reduce whatever conditional probabilities you have into ones that you can measure.

Hopefully this has given you some hints to work from.
 
chiro said:
Hello dabeth and welcome to the forums.

In terms of length, I'll assume three dimensions in which L^2 = (X1)^2 + (X2)^2 + (X3)^2 and L^2 = N^2 for some given N.

...

I think he meant the MATLAB length, i.e. the number of elements in the vector.
 
