Expectation of maximum of a multinormal random vector

In summary: the thread discusses the distribution of the maximum of a multinormal random vector and whether a closed-form expression exists for its expectation. It also covers the degenerate case where the covariance matrix is only positive semi-definite, so that no probability density function exists, and how one might still sample from such a distribution. Participants point to the Generalized Extreme Value distribution and to a 1961 paper by Charles E. Clark.
  • #1
Pere Callahan
Hello,

This time my question is not about Catalan numbers but something much more interesting (to me at least;))

I was wondering how the maximum of a multinormal random vector is distributed, for example let

[tex]X \sim N(\mu_1,\sigma_1^2)[/tex]
[tex]Y \sim N(\mu_2,\sigma_2^2)[/tex]

be normally distributed random variables.

Let further

[tex]Z=\max(X,Y)[/tex]

What is the distribution of Z? Maybe it doesn't even have a distribution that can be put in closed form... so what can be said about the expectation value?

What about the more general case, if you have a random vector

[tex](X_1,X_2,\dots,X_n) \sim N(\vec{\mu},\vec{\vec{\Sigma}})[/tex]

with mean [tex]\vec{\mu}[/tex] and covariance matrix [tex]\vec{\vec{\Sigma}}[/tex].
Can anything be said about the distribution of [tex]Z=\max(X_1,X_2,\dots,X_n)[/tex]?


And speaking of multivariate normal distributions, yet another question comes to my mind. If the covariance matrix [tex]\vec{\vec{\Sigma}}[/tex] is not positive definite but only positive semi-definite, then [tex]\det(\vec{\vec{\Sigma}})=0[/tex], so there is no probability density function.

Is there nonetheless a good way to draw samples from such a multivariate normal distribution? (The most obvious way, via the Cholesky decomposition of [tex]\vec{\vec{\Sigma}}[/tex] and a suitable linear transformation of a standard normally distributed random vector, doesn't work if the determinant is zero.)
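One workaround I can think of is to use an eigendecomposition instead of Cholesky: any factor A with A Aᵀ = Σ works for the linear transformation, and the eigendecomposition exists even when Σ is singular. A minimal sketch in Python (NumPy; the function name and the rank-1 example covariance are my own choices):

```python
import numpy as np

def sample_degenerate_mvn(mu, Sigma, n_samples, rng=None):
    """Sample from N(mu, Sigma) even when Sigma is singular,
    using an eigendecomposition instead of Cholesky."""
    rng = np.random.default_rng() if rng is None else rng
    vals, vecs = np.linalg.eigh(Sigma)   # Sigma = V diag(vals) V^T
    vals = np.clip(vals, 0.0, None)      # clamp tiny negative round-off
    A = vecs * np.sqrt(vals)             # columns scaled so that A A^T = Sigma
    z = rng.standard_normal((n_samples, len(mu)))
    return mu + z @ A.T

# rank-1 (singular) covariance: the two components are equal almost surely
mu = np.array([0.0, 0.0])
Sigma = np.array([[1.0, 1.0], [1.0, 1.0]])
x = sample_degenerate_mvn(mu, Sigma, 10_000, rng=np.random.default_rng(0))
```

The samples then live on the lower-dimensional subspace spanned by the eigenvectors with nonzero eigenvalue, which is exactly what a degenerate normal distribution should do.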

So a bunch of questions, thanks for any answers.

Cheers,
Pere
 
  • #2
Just concerning the first part of your question: for two independent random variables X and Y with distribution functions F and G, the distribution function H of max(X,Y) is simply the product of F and G:
H(c) = P(max(X,Y) <= c) = P(X <= c and Y <= c) = P(X <= c)P(Y <= c) = F(c)G(c)
Differentiating with respect to c gives the density function of max(X,Y):
h(c) = F(c)g(c) + G(c)f(c)

here is the more general case
http://en.wikipedia.org/wiki/Order_statistic
(Actually it is only more general in allowing more than two random variables; they assume F = G most of the time.)
I don't know whether in the normal case you get a closed-form expression for the expectation...

If you don't have independence, matters get more complicated, of course.
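To make the two-variable formulas concrete, here is a quick numerical check of H = FG and h = Fg + Gf (Python with NumPy/SciPy; the parameters are made up for illustration):

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

# X ~ N(0, 1), Y ~ N(1, 4): made-up parameters for illustration
mu1, s1, mu2, s2 = 0.0, 1.0, 1.0, 2.0

def h(c):
    """Density of max(X, Y): h(c) = F(c) g(c) + G(c) f(c)."""
    return (norm.cdf(c, mu1, s1) * norm.pdf(c, mu2, s2)
            + norm.cdf(c, mu2, s2) * norm.pdf(c, mu1, s1))

# h should integrate to 1, and H(c) = F(c) G(c) should match Monte Carlo
total = quad(h, -15, 20)[0]
rng = np.random.default_rng(1)
z = np.maximum(rng.normal(mu1, s1, 200_000), rng.normal(mu2, s2, 200_000))
H_exact = norm.cdf(1.5, mu1, s1) * norm.cdf(1.5, mu2, s2)
H_mc = (z <= 1.5).mean()
```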
 
  • #3
Hi Jodoudo,
Thanks for your answer and the link.

So if the random vector has a pdf, it seems to be rather a matter of finding a nice way to simplify the integrals. But what if there is no such thing as a density function, as in the case when the determinant of the covariance matrix is zero? How do I even calculate the probability that the random vector takes values in a certain region of [tex]\mathbb{R}^n[/tex] when integrating a pdf over that region is not an option?

Cheers
Eckhard
 
  • #4
Hi,

I found this interesting article:

http://en.wikipedia.org/wiki/Generalized_extreme_value_distribution

This covers the iid case (at least asymptotically) but shouldn't it be possible to derive something stronger for the special case of a multivariate normal distribution?

I am going to have a look at Coles, Stuart (2001). An Introduction to Statistical Modeling of Extreme Values, maybe this can provide some insights.

Still, I'd appreciate any comments:smile:
 
  • #5
Pere Callahan said:
Hi,

I found this interesting article:

http://en.wikipedia.org/wiki/Generalized_extreme_value_distribution

This covers the iid case (at least asymptotically) but shouldn't it be possible to derive something stronger for the special case of a multivariate normal distribution?

I am going to have a look at Coles, Stuart (2001). An Introduction to Statistical Modeling of Extreme Values, maybe this can provide some insights.

Still, I'd appreciate any comments:smile:
Say (x1, x2) ~ multivariate normal: F(z1, z2) = P{x1<z1 and x2<z2}. Then P{max(x1, x2) < z} = P{x1 < z and x2 < z} = F(z, z).

When x1 and x2 are independent, x1 ~ F1 and x2 ~ F2, then F(z1, z2) = F1(z1)F2(z2).
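Evaluating F(z, z) is a single call in SciPy, since `multivariate_normal.cdf` computes the joint cdf numerically (the correlation 0.5 and the helper name `cdf_max` are just my example choices):

```python
import numpy as np
from scipy.stats import multivariate_normal

# example: standard bivariate normal with correlation 0.5
Sigma = np.array([[1.0, 0.5], [0.5, 1.0]])
mvn = multivariate_normal(mean=[0.0, 0.0], cov=Sigma)

def cdf_max(z):
    """P{max(x1, x2) <= z} = F(z, z)."""
    return mvn.cdf([z, z])
```

E[max] can then be recovered numerically, e.g. as the integral of 1 - F(z, z) over [0, ∞) minus the integral of F(z, z) over (-∞, 0].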
 
  • #6
Hi thanks for your answer,

I know these formulas; I was just wondering whether it is possible to actually evaluate them for a multivariate normal random vector.

I started calculating

[tex]E[max({X_1,\dots ,X_n})][/tex]

where the [tex]X_i[/tex] are i.i.d standard normal.

However, I succeeded only for [tex]1\leq n\leq 4[/tex]. For the case n=5 I could reduce the initial expression to a definite integral I can't solve (and neither can Mathematica, Maple, ... which doesn't mean much).

I don't have my notes here right now; I will post the integral later tonight, so maybe one of you has an idea how to solve it.

Cheers,
Pere

P.S. If even the i.i.d. case for n=5 turns out to be so tricky, I am really becoming less confident about finding an expression for general n and a general (singular) covariance matrix... :grumpy:

P.P.S. So here's what I ended up with:

[tex]E[\max(\{X_1,X_2,X_3,X_4,X_5\})] = \int_{-3\pi/4}^{\pi/4} \frac{15 \arctan\left(\frac{2\cos\phi}{\sqrt{3+\cos 2\phi}}\right)\cos\phi}{\pi^{5/2}\sqrt{1+\cos^2\phi}}\, d\phi[/tex]

Probably it's just not possible to find a more elementary expression for this ...

By the way, my results for n=1,2,3,4 are 0, [tex]\frac{1}{\sqrt{\pi}}[/tex], [tex]\frac{3}{2\sqrt{\pi}}[/tex] and [tex]\frac{3 \left(\pi -2 \operatorname{arccot}\sqrt{2}\right)}{\pi ^{3/2}}[/tex]
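For anyone who wants to check these, the values agree with direct quadrature of E[max] = n ∫ z Φ(z)^(n-1) φ(z) dz. A quick sketch (Python/SciPy; `e_max_iid` is my own helper name):

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

def e_max_iid(n):
    """E[max of n i.i.d. standard normals] = n * int z Phi(z)^(n-1) phi(z) dz."""
    return n * quad(lambda z: z * norm.cdf(z) ** (n - 1) * norm.pdf(z),
                    -np.inf, np.inf)[0]

# closed forms quoted above for n = 1, 2, 3, 4
closed = [0.0,
          1 / np.sqrt(np.pi),
          3 / (2 * np.sqrt(np.pi)),
          3 * (np.pi - 2 * np.arctan(1 / np.sqrt(2))) / np.pi ** 1.5]
```

(Note arccot(√2) = arctan(1/√2), which is what the last entry uses.)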
 
  • #7
Hello,

I'm also interested in this problem. Did you manage to get a general expression for E(max(X_1,X_2,X_3)), with arbitrary mean and covariance?

There is a paper by Charles E. Clark in Operations Research, 1961 that gets an exact analytic solution for two arbitrary normal random variables. But what about three, in the case where they are all independent?

I'd appreciate any insight you might have.
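For reference, Clark's result for two (possibly correlated) normals, as I recall it from that paper: with [tex]\theta^2=\sigma_1^2+\sigma_2^2-2\rho\sigma_1\sigma_2[/tex] and [tex]\alpha=(\mu_1-\mu_2)/\theta[/tex],

[tex]E[\max(X_1,X_2)]=\mu_1\Phi(\alpha)+\mu_2\Phi(-\alpha)+\theta\phi(\alpha)[/tex]

A quick sanity check of this (Python/SciPy; the function name is mine):

```python
import numpy as np
from scipy.stats import norm

def clark_e_max(mu1, mu2, s1, s2, rho):
    """E[max(X1, X2)] for jointly normal X1, X2 (Clark's 1961 formula)."""
    theta = np.sqrt(s1**2 + s2**2 - 2 * rho * s1 * s2)
    alpha = (mu1 - mu2) / theta
    return mu1 * norm.cdf(alpha) + mu2 * norm.cdf(-alpha) + theta * norm.pdf(alpha)

# independent standard normals: should recover the known value 1/sqrt(pi)
e2 = clark_e_max(0.0, 0.0, 1.0, 1.0, 0.0)
```

For independent standard normals this collapses to √2 φ(0) = 1/√π, matching the n=2 result earlier in the thread.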
 
  • #8
Hi,

I remember getting my result for i.i.d standard normals.

Are you wondering about general covariances or just about general variances ... ?

For the latter case I'm pretty confident that my calculations also work (I haven't tried yet, though), for the dependent case I have strong doubts, even for "only" three random variables.

Did you do a thorough literature search to see whether anything has been published on that topic since 1961?

I hope to be able to have another look at my notes within the next couple of days and see if they easily generalize to arbitrary means and variances.

Pere
 
  • #9
Hi,

Sorry I took so long to see this. I'm interested in E(max(X_1,X_2,X_3)) where the X_i have arbitrary means mu_i and arbitrary variances sigma^2_i. But they are all independent, so the covariances are all zero. Do your calculations work for that case?

I've tried to find other material on this, but it seems that most papers on the subject just do it for standard normals, trying to calculate the answer to 50 decimal places or something. But I need a closed-form solution for the case with arbitrary means and variances.

Thanks in advance, hope you can find something...
 
  • #10
Pere, I'm interested in your proof for the standard normal case of 4 variates.

This is relevant for a problem analysing auctions with sealed bids awarded to the highest bidder.

Any chance of a hint as to how you got that result? I get your answers for n=1 to 3 but n=4 has me beat.

Thanks in advance for your help
 
  • #11
Hi Andrew,

Long time since the last post in this thread. My notes on that problem are all long gone, but I will try to reproduce the result. I don't remember if I numerically checked my result for n=4, so maybe that's the first thing on the agenda :smile:

O.K. my result for n=4 seems to be correct. For n=5 I claim that the expected value of the maximum equals
[tex]
\frac{5\left(\pi-3\operatorname{ArcCot}[\sqrt{2}]\right)}{\pi^{3/2}}
[/tex]

My method is actually simply evaluating the integrals; I'll post a note later if I find the time.
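In the meantime, here is a numerical cross-check of the claimed n=5 value against n ∫ z Φ(z)^4 φ(z) dz (Python/SciPy sketch; note arccot(√2) = arctan(1/√2)):

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

# claimed closed form: 5 * (pi - 3 * arccot(sqrt(2))) / pi^(3/2)
claimed = 5 * (np.pi - 3 * np.arctan(1 / np.sqrt(2))) / np.pi ** 1.5

# direct quadrature of E[max of 5 i.i.d. standard normals]
numeric = 5 * quad(lambda z: z * norm.cdf(z) ** 4 * norm.pdf(z),
                   -np.inf, np.inf)[0]
```

Both come out near 1.16296, which matches the numeric table posted later in similar computations.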
 
  • #12
If you have i.i.d. normals with zero means, the problem can be attacked this way. (Here Φ and φ denote the standard normal cdf and pdf.)

[tex]
\begin{align*}
P(X_{(n)} \le t) &= \Phi\left(\frac t {\sigma}\right)^n \\
f_{(n)}(t) & = \frac n \sigma \Phi \left(\frac t {\sigma}\right)^{n-1} \phi\left(\frac t \sigma \right) \\
E(X_{(n)}) &= \frac n \sigma \int_{-\infty}^\infty t \Phi\left(\frac t \sigma\right)^{n-1} \phi \left(\frac t \sigma \right) \, dt
\end{align*}
[/tex]

Let [tex] z = {t}/{\sigma} [/tex] so [tex] dt = \sigma \, dz [/tex]

[tex]
\begin{align*}
E(X_{(n)}) & = \frac n \sigma \int_{-\infty}^\infty (\sigma z) \Phi (z)^{n-1} \phi (z) \, \sigma dz \\
& = n \sigma \int_{-\infty}^\infty z \Phi(z)^{n-1} \phi(z) \, dz
\end{align*}
[/tex]

One way to proceed is to note that

[tex]
z \phi(z) = - \phi'(z)
[/tex]

so

[tex]
\begin{align*}
E(X_{(n)}) & = n \sigma \int_{-\infty}^\infty \Phi(z)^{n-1} (-\phi'(z))\, dz \\
& = -n \sigma \int_{-\infty}^\infty \Phi(z)^{n-1} \phi'(z) \, dz \\
& = n(n-1) \sigma \int_{-\infty}^\infty \Phi(z)^{n-2} \phi(z)^2 \, dz
\end{align*}
[/tex]

where the last step integrates by parts and the boundary term vanishes.

The problem becomes much more difficult if you don't assume equal variances, since then (still with means = 0 and pair-wise independence)

[tex]
F_{(n)}(t) = \prod_{i=1}^n \Phi\left(\frac t {\sigma_i}\right)
[/tex]

The density for the maximum has the (to me, ugly) form

[tex]
f_{(n)}(t) = \sum_{i=1}^n \frac 1 {\sigma_{i}} \phi\left( \frac t {\sigma_i}\right) \prod_{\substack{k=1\\k \ne i}}^n \Phi\left( \frac t {\sigma_k}\right)
[/tex]

Things only get more complicated (at least, for this direct approach) if you don't assume all means are zero.
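Ugly analytically, but trivial to evaluate numerically. A sketch (Python/SciPy; the σ values are arbitrary examples):

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

sigmas = np.array([1.0, 2.0, 0.5])  # arbitrary example: zero means, independent

def f_max(t):
    """f_(n)(t) = sum_i (1/sigma_i) phi(t/sigma_i) * prod_{k != i} Phi(t/sigma_k)."""
    Phi = norm.cdf(t / sigmas)
    phi_scaled = norm.pdf(t / sigmas) / sigmas
    return sum(phi_scaled[i] * np.prod(np.delete(Phi, i))
               for i in range(len(sigmas)))

# expectation of the maximum by quadrature against this density
E_max = quad(lambda t: t * f_max(t), -np.inf, np.inf)[0]
```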
 
  • #13
For the standard normal i.i.d. case, Mathematica predicts E[max{X1,...,Xn}] = Integrate[x PDF[NormalDistribution[0, 1], x]^n, {x, -Infinity, z}] = -1/(n Exp[(n z^2)/2] (2 Pi)^(n/2))
 
  • #14
What is z supposed to be in this formula?
 
  • #15
Pere Callahan said:
What is z supposed to be in this formula?
It represents the error term :blushing:

Thanks...

Correction:
Mathematica does not have a closed-form representation of Integrate[x maxpdf[x], {x, -Infinity, Infinity}] for general n, where maxpdf[x] is the derivative of CDF[NormalDistribution[0, 1], x]^n with respect to x, that is:

maxpdf[x] = n (Exp[-x^2/2] (1 + Erf[x/Sqrt[2]])^(n-1))/(2^(n - 1/2) Sqrt[Pi]).

For n = 1, ..., 10, Mathematica numerically computes the following {n, E[max]} sequence:
{{1, 0.}, {2, 0.56419}, {3, 0.846284}, {4, 1.02938}, {5, 1.16296}, {6, 1.26721}, {7, 1.35218}, {8, 1.4236}, {9, 1.48501}, {10, 1.53875}}

For n = 10, ..., 100:
{{10, 1.53875}, {20, 1.86748}, {30, 2.04276}, {40, 2.16078}, {50, 2.24907}, {60, 2.31928}, {70, 2.37736}, {80, 2.42677}, {90, 2.4697}, {100, 2.50759}}

For n = 100, ..., 1000:
{{100, 2.50759}, {200, 2.74604}, {300, 2.87777}, {400, 2.96818}, {500, 3.0367}, {600, 3.0917}, {700, 3.13755}, {800, 3.17679}, {900, 3.21106}, {1000, 3.24144}}

For n = 1000, ..., 10000:
{{1000, 3.24144}, {2000, 3.43534}, {3000, 3.54436}, {4000, 3.61992}, {5000, 3.67756}, {6000, 3.72406}, {7000, 3.76296}, {8000, 3.79637}, {9000, 3.82562}, {10000, 3.85162}}

For n = 10000, ..., 100000:
{{10000, 3.85162}, {20000, 4.01879}, {30000, 4.11369}, {40000, 4.17982}, {50000, 4.23046}, {60000, 4.27143}, {70000, 4.30578}, {80000, 4.33534}, {90000, 4.36126}, {100000, 4.38432}}

and for n = 100000, ..., 1000000:
{{100000, 4.38432}, {200000, 4.53333}, {300000, 4.61842}, {400000, 4.67792}, {500000, 4.72359}, {600000, 4.7606}, {700000, 4.79169}, {800000, 4.81846}, {900000, 4.84196}, {1000000, 4.8629}}
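The first of those tables can be reproduced with a straightforward quadrature (Python/SciPy sketch; the finite limits ±10 comfortably cover the mass of the integrand for these n):

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

def e_max(n):
    """E[max of n i.i.d. standard normals] via n * int z Phi(z)^(n-1) phi(z) dz."""
    return n * quad(lambda z: z * norm.cdf(z) ** (n - 1) * norm.pdf(z),
                    -10, 10)[0]

# reproduces the {n, E[max]} table for n = 1, ..., 10
table = [(n, round(e_max(n), 5)) for n in range(1, 11)]
```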
 

What is the expectation of the maximum of a multinormal random vector?

It is the average value that the maximum component, max(X1, X2, ..., Xn), would take over repeated samples from the same multinormal distribution. It is not an upper bound on the vector; individual samples can and do exceed it.

How is the expectation of the maximum calculated?

When the distribution has a density, E[max(X1, X2, ..., Xn)] = ∫∫...∫ max(x1, x2, ..., xn) f(x1, x2, ..., xn) dx1 dx2 ... dxn, where f is the probability density function of the multinormal distribution. Equivalently, since P{max(X1, ..., Xn) ≤ z} = F(z, z, ..., z), one can integrate z against the density of the maximum, as done in the posts above.

Can the expectation of the maximum be negative?

Yes. Since max(X1, ..., Xn) ≥ Xi for every i, the expectation satisfies E[max] ≥ max_i μ_i, but if all the means μ_i are sufficiently negative, the expectation of the maximum is negative as well. For i.i.d. standard normals it is nonnegative, with equality only at n = 1.

How does the dimension of a multinormal random vector affect the expectation of the maximum?

It matters a great deal: for n i.i.d. standard normals the expectation of the maximum grows like √(2 ln n) to leading order, which is consistent with the numerical values above (about 1.54 at n = 10 and about 4.86 at n = 10^6).

What is the significance of the expectation of the maximum in statistical analysis?

It is a central quantity in extreme value theory and order statistics, and it appears in applications such as sealed-bid auctions won by the highest bidder and project scheduling, the setting that motivated Clark's 1961 paper.
