Quantiles of a log-multivariate-normal-distributed set.

  • Thread starter mtal
In summary: The vector will be of length N, not 3; the length 3 business is only for the one-dimensional case. (3) For each of the K vectors, calculate the sum. (4) Arrange the resulting K sums in order. (5) The 2.5% quantile, for instance, is now the 2.5% element of the list you ordered in (4). That's still all.
  • #1
mtal
Hello,

Let [itex]X[/itex] be a set of [itex]N[/itex] lognormal prices (in dollars), meaning
[tex]\log(X) = Y \sim MN(\mu_Y , \Sigma_Y) ,[/tex]
i.e. the log of [itex]X[/itex] follows a multivariate normal distribution.

Imagine now that one wants to compute various quantiles for this set, e.g. 2.5%, 50% and 97.5%, and does this by simulating 100k draws from the distribution above.
You then get, say, a [itex]100k \times N[/itex] matrix. To get the total value, you would find these three quantiles for each of the [itex]N[/itex] prices, giving an [itex]N \times 3[/itex] matrix of quantiles, and then simply add up each of the three columns, giving three numbers that represent the total quantiles for the whole set.

So, my question is:
How would one go about transforming the information for the quantiles for the whole set from log-dollars to dollars?
I am of course aware of the exponential function, so what I'm asking is how and where in this process do I use it?

------------------------------------------------------------------------------------------------------------------------------------------------------

My main idea is this:
The three final quantiles, 2.5%, 50% and 97.5%, represent the log-quantiles of the set as a whole, say [itex] V_{2.5\%}, V_{50\%}[/itex] and [itex] V_{97.5\%}[/itex]. They also have their log/exp counterparts, e.g. [itex] V_{50\%} = \log(X_{50\%})[/itex].
Now, the difference between, for example, the 50% and 2.5% quantiles, [itex]V_{50\%} - V_{2.5\%}[/itex], could then be written as
[tex] \log(X_{50\%}) - \log(X_{2.5\%}) = \log\left(\frac{X_{50\%}}{X_{2.5\%}}\right),[/tex]

meaning that the log difference can be interpreted as a log-proportional difference. Thus one could just exponentiate and invert this value and get

[tex] \left[ \exp \left( \log \left( \frac{ X_{50\%} }{ X_{2.5\%} } \right) \right) \right]^{-1} = \frac{X_{2.5\%}}{X_{50\%}} [/tex]

which gives you the proportional difference of the quantiles in dollars, instead of log-dollars. With this information the 2.5% quantile can be obtained by multiplying [itex] X_{2.5\%}[/itex] by the mean of the [itex]N[/itex] prices.

Or am I way off here?

Any help is greatly appreciated.
 
  • #2
First, let me make sure I have something clear. You have these N prices, and you just want to know, for each price, where the 2.5, 50, and 97.5 percentiles are? You're not paying any attention to the correlations among them, right? So, in fact, the fact that there are N of them isn't important, because you can just do each one on its own.

If all you want is quantiles based on a large (100k) sample, you don't need to muck around with logs and exponentials at all, and indeed, you need not know anything about the underlying distribution. For a given stock, sort your 100k prices. The 2.5 percentile is [itex]\frac{x_{2500} + x_{2501}}{2}[/itex] (since 2.5% of 100,000 is 2,500), and the 50 and 97.5 percentiles are analogous. That's it. You're done.
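
A minimal sketch of this sort-and-read-off approach, in Python with only the standard library. The function name `empirical_quantile` and the simulated parameters (3.0 and 0.4 for the mean and standard deviation of the logs) are made up for illustration. Averaging the two neighbouring order statistics is one convention among several, and other interpolation rules give slightly different values.

```python
import random

def empirical_quantile(sample, q):
    """Average the two order statistics around position q*n (1-indexed)."""
    xs = sorted(sample)
    n = len(xs)
    k = min(max(round(q * n), 1), n - 1)  # 1-indexed lower position, kept in range
    return 0.5 * (xs[k - 1] + xs[k])

# Hypothetical example: 100k simulated prices for one stock.
rng = random.Random(0)
prices = [rng.lognormvariate(3.0, 0.4) for _ in range(100_000)]
for q in (0.025, 0.5, 0.975):
    print(q, empirical_quantile(prices, q))
```

Nothing here uses the fact that the prices are lognormal; the same function works on any sample.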

If you want to make use of the fact that they're lognormal, then you can do something different that won't require such a large sample size. (This would be useful, for instance, if you were dealing with real data, such as daily closing prices, where the quantity is limited.) Let's call the price of stock i on day j [itex]p_{ij}[/itex]. Take the (natural) logs of all those prices -- let's call those [itex]\pi_{ij}[/itex]. For each i, calculate the mean and standard deviation of the [itex]\pi_{i.}[/itex]. Now the 2.5 percentile is at [itex]\pi=\mu -1.95996 \sigma[/itex], the 50 at [itex]\mu[/itex], and the 97.5 at [itex]\mu + 1.95996 \sigma[/itex]. To get the percentiles for p, just use [itex]p=e^\pi[/itex].
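
The lognormal shortcut could be sketched like this, again in plain Python. `lognormal_quantiles` is a hypothetical helper name, and the three prices at the bottom are toy data chosen so the logs are -1, 0 and 1. `NormalDist().inv_cdf` from the `statistics` module supplies the standard normal quantile (about ±1.95996 for 2.5% and 97.5%).

```python
import math
from statistics import NormalDist, mean, stdev

def lognormal_quantiles(prices, probs=(0.025, 0.5, 0.975)):
    """Quantiles of one price series, assuming its logs are normal."""
    logs = [math.log(p) for p in prices]
    mu, sigma = mean(logs), stdev(logs)   # sample mean and std of the logs
    std_normal = NormalDist()             # standard normal, mean 0 and sd 1
    # Quantile in log space is mu + z*sigma; exponentiate to get back to prices.
    return [math.exp(mu + std_normal.inv_cdf(q) * sigma) for q in probs]

# Toy data (made up): logs are -1, 0, 1, so mu = 0 and sigma = 1.
toy_prices = [math.exp(-1.0), 1.0, math.exp(1.0)]
print(lognormal_quantiles(toy_prices))
```

Note that this is only as good as the normality assumption on the logs, and it applies to one price at a time.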

A big advantage of this second approach is that you can pay attention to the correlations between your different stocks. All you need to do is calculate the [itex]N(N-1)/2[/itex] distinct covariances of the logs from your sample data; together with the N means and N variances, you now have a complete description of the multivariate lognormal distribution. This can be used, for instance, to predict the behavior of a portfolio of stocks. (Of course, the predictions will only be as reliable as the assumption of normality.)
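
As a rough sketch of that estimation step, the mean vector and covariance matrix of the logs could be computed as below. The name `log_moments` and the three rows of toy prices are assumptions made for illustration, not anything from a real data set.

```python
import math

def log_moments(price_rows):
    """Mean vector and sample covariance matrix of the log prices.

    price_rows: list of observations, each a list of N positive prices.
    """
    logs = [[math.log(p) for p in row] for row in price_rows]
    m, n = len(logs), len(logs[0])
    mean = [sum(row[i] for row in logs) / m for i in range(n)]
    # Symmetric N x N matrix; the diagonal holds the variances of the logs.
    cov = [[sum((row[i] - mean[i]) * (row[j] - mean[j]) for row in logs) / (m - 1)
            for j in range(n)] for i in range(n)]
    return mean, cov

# Toy data (made up): two stocks on three days, with logs (0,0), (1,2), (2,4).
toy_rows = [[1.0, 1.0],
            [math.exp(1.0), math.exp(2.0)],
            [math.exp(2.0), math.exp(4.0)]]
mean_log, cov_log = log_moments(toy_rows)
print(mean_log, cov_log)
```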

EDIT: Reading your post again, I think this is still not exactly what you want. But I don't know what you DO want. What exactly do you mean by the 2.5 percentile for a list of N prices? I know what the percentile is for one scalar random variable, but how do you define a percentile for a list of random variables?
 
  • #3
Thanks for the reply!

These prices are estimates from a statistical model and are indeed correlated.

About the second approach you mentioned: The multivariate log-normal distribution didn't enter my mind before since I had never thought of its existence before. I did some searching and found that R (the statistical program I use) has a function which simulates data from the MVLN distribution given a mean vector of logs and a covariance matrix of logs. That is exactly what I was looking for!

But is that the case then? If the log of the data follows a multivariate normal, does the data itself follow the MVLN?
 
  • #4
mtal said:
But is that the case then? If the log of the data follows a multivariate normal, does the data itself follow the MVLN?
That is in fact the definition.

I still don't get what you're trying to do, though. It seems like you want to have ONE number for the 2.5 percentile of (say) 23 stock prices. I just don't understand what that means. How could the 2.5 percentile of 23 stock prices be a single number? Now, if you had a portfolio containing 23 stocks, i.e. a weighted sum of the stock prices, and you wanted the 2.5 percentile of the portfolio, that I would understand.
 
  • #5
Yes, I do have the equivalent of a portfolio, in that I'm trying to estimate the behaviour of the price of the set as a whole (i.e. the behaviour of the sum of the prices).

What I mean by the "x percentile of N prices" is that I would make some K draws from the multivariate normal distribution, resulting in K vectors of length N. Then for each of the N prices calculate the corresponding quantiles, then sum each column of quantiles. This would result in three different quantiles for the set as a whole. Right?
 
  • #6
Ah, I see. But that won't actually give you the correct quantiles for the sum of the prices. If you want to do that, you need to:

(1) Draw K vectors of length N.
(2) Sum up each of the K vectors, to get a single vector of length K.
(3) Sort vector (2) and locate the quantiles.

What you're proposing is something like (1), (3), (2), but that will not give the right answer. The quantile of the sums is not the sum of the quantiles.
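
Steps (1) to (3) can be sketched as follows, in plain Python so that it runs without extra packages. All names (`cholesky`, `draw_prices`, `quantile`) and the parameter values in `mu` and `cov` are made up for illustration. The sketch assumes the draws are made in log space from a multivariate normal and exponentiated before summing, since the sum of prices is a sum of dollars, not of log-dollars.

```python
import math
import random

def cholesky(cov):
    """Lower-triangular L with L L^T = cov (cov must be positive definite)."""
    n = len(cov)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(cov[i][i] - s)
            else:
                L[i][j] = (cov[i][j] - s) / L[j][j]
    return L

def draw_prices(mu, cov, k, rng):
    """Step (1): K draws of X = exp(Y) with Y ~ MN(mu, cov), as K lists of length N."""
    L = cholesky(cov)
    n = len(mu)
    draws = []
    for _ in range(k):
        z = [rng.gauss(0.0, 1.0) for _ in range(n)]
        y = [mu[i] + sum(L[i][j] * z[j] for j in range(i + 1)) for i in range(n)]
        draws.append([math.exp(v) for v in y])  # back to dollars before summing
    return draws

def quantile(values, q):
    """Step (3): sort and pick the element at fraction q of the way through."""
    xs = sorted(values)
    return xs[min(int(q * len(xs)), len(xs) - 1)]

# Hypothetical parameters: three correlated log prices.
mu = [0.0, 0.0, 0.0]
cov = [[0.25, 0.05, 0.05],
       [0.05, 0.25, 0.05],
       [0.05, 0.05, 0.25]]
probs = (0.025, 0.5, 0.975)

draws = draw_prices(mu, cov, 20_000, random.Random(1))
sums = [sum(row) for row in draws]                       # step (2)
quantile_of_sums = [quantile(sums, q) for q in probs]    # order (1), (2), (3)
sum_of_quantiles = [sum(quantile([row[i] for row in draws], q) for i in range(len(mu)))
                    for q in probs]                      # order (1), (3), (2)
print("quantiles of the sum:", quantile_of_sums)
print("sum of the quantiles:", sum_of_quantiles)
```

With these made-up parameters the two results differ visibly in the tails: summing the per-price quantiles gives an interval that is too wide, because it treats all prices as hitting their extremes at the same time.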
 
  • #7
Thanks a lot!

I was so concentrated on this whole exp-log idea that I somehow never looked at adding the rows instead of the columns.

And thanks for pointing out where I went wrong, you've been tons of help!
 

1. What is a log-multivariate-normal distribution?

A log-multivariate-normal (multivariate lognormal) distribution is the joint distribution of a vector of positive random variables whose componentwise logarithms follow a multivariate normal distribution. It is commonly used to model positive, right-skewed quantities such as prices.

2. How are quantiles calculated for a log-multivariate-normal distribution?

For a single component, the quantiles follow in closed form from the normal quantiles of its log: compute [itex]\mu + z_q \sigma[/itex], where [itex]z_q[/itex] is the standard normal quantile, and exponentiate. For the sum of several correlated components there is no closed form, so the quantiles are usually estimated by simulation: draw many vectors, sum each one, sort the sums, and read off the desired quantile.

3. What is the significance of quantiles in a log-multivariate-normal distribution?

Quantiles provide a way to summarize the distribution of a log-multivariate-normal dataset by dividing it into equal-sized groups based on their probability or percentile. They can be used to estimate the probability of an event occurring or to compare different datasets.

4. Can quantiles be used to measure the central tendency of a log-multivariate-normal distribution?

In part, yes. The 50% quantile is the median, which is a measure of central tendency; for a lognormal variable it equals [itex]e^{\mu}[/itex] and lies below the mean. The other quantiles describe the spread and the tails of the distribution rather than its typical value.

5. What are some applications of quantiles for a log-multivariate-normal distribution?

Quantiles have various applications in fields such as finance, economics, and environmental studies. They can be used to analyze income inequality, estimate risk in financial portfolios, and model extreme weather events.
