Finance: does the volatility of a stock follow a certain distribution?

Master1022 · Feb 1, 2022

Hi,

I am not sure whether this is the right forum for this post. Please let me know if I should move it and I will do so.

Overall question: Does the 'volatility' (i.e. standard deviation of the log returns) follow any sort of statistical distribution - maybe normal or log-normal?

Background/Context: I was looking at some stock data from Yahoo Finance and was plotting out some metrics for the data. For example, I looked at a group of technology companies and calculated a metric (let us call it ##m_1##), which was the linear fractional change in closing price from the previous day:
[tex]m_1 = \frac{c_{t} - c_{t - 1}}{c_{t-1}}[/tex]

where ##c_t## is the closing price of the stock on day ##t##. The distribution came out looking something like: (Note: it will take a very long time to make/read this post if I have to post all the code and the cleaning/processing steps)

[Attachment: Screen Shot 2022-02-01 at 2.07.13 PM.png (histogram of ##m_1##)]

This looks fairly normal, at least from a visual standpoint. Then I thought: does it make sense that I see a normal-looking distribution?

After some searching around, it seems like yes, I do expect this value to be normally distributed. One explanation was provided on Stack Exchange (here), and it concludes (after some brief mathematics) that stock returns are normally distributed.

From the post

[Attachment: Screen Shot 2022-02-01 at 2.15.50 PM.png (excerpt from the Stack Exchange post)]

So now I am calculating the volatility using the standard deviation of the log(returns) (i.e. ## \log \left( \frac{c_t}{c_{t-1}} \right) ##). I used a rolling window of about 10 days for the standard deviation (and also annualized it, etc. - this part doesn't matter as much in terms of the shape of the distribution, as it's just a scaling factor). When I plotted this out in Python, I was getting a distribution looking like this:

[Attachment: Screen Shot 2022-02-01 at 2.18.45 PM.png (histogram of the rolling volatility)]

This seems lognormal to me. However, is this what we expect? I couldn't find any answers online (at least that were comprehensible by someone like me who has no stochastic/financial mathematics background). I didn't really have time to learn stochastic calculus just for this, but am just trying to understand if this aligns with expectation or not.
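For anyone who wants to reproduce the calculation described above, here is a minimal sketch using only the Python standard library. The prices are synthetic (i.i.d. normal log returns with a hypothetical 2% daily volatility) rather than real Yahoo Finance data, and the 252-day annualization factor is an assumed convention.

```python
import math
import random
import statistics

TRADING_DAYS = 252  # assumed annualization convention
WINDOW = 10         # rolling window length, as in the post

# Synthetic closing prices standing in for real data (an assumption):
# i.i.d. normal daily log returns with a hypothetical 2% daily volatility.
rng = random.Random(0)
closes = [100.0]
for _ in range(1000):
    closes.append(closes[-1] * math.exp(rng.gauss(0.0, 0.02)))

# Linear fractional change m_1 and log return for each day t >= 1.
m1 = [(c - p) / p for p, c in zip(closes, closes[1:])]
log_ret = [math.log(c / p) for p, c in zip(closes, closes[1:])]

# Rolling sample standard deviation of the log returns, annualized.
rolling_vol = [
    statistics.stdev(log_ret[i - WINDOW:i]) * math.sqrt(TRADING_DAYS)
    for i in range(WINDOW, len(log_ret) + 1)
]

print(len(rolling_vol), min(rolling_vol), max(rolling_vol))
```

Note that the windows overlap, so consecutive values of `rolling_vol` are strongly correlated, and with only 10 points per window the estimate scatters noticeably even though the true volatility in this sketch is constant.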

Any help would be greatly appreciated.

fresh_42 · Feb 1, 2022

Volatility is commonly the standard deviation of the distribution. So once you determined the probability distribution function of the stock price, you automatically fixed volatility to a certain number.

Master1022 · Feb 1, 2022

Thanks for the reply @fresh_42 !

fresh_42 said:

Volatility is commonly the standard abbreviation of the distribution. So once you determined the probability distribution function of the stock price, you automatically fixed volatility to a certain number.

(did you mean 'deviation' instead of abbreviation?) Nonetheless, I suppose that makes sense. From reading online, it seems like:
- the stock price is log-normally distributed (this makes sense as the price can't be negative),
- the linear fractional returns are normally distributed (from link above)

Perhaps I should re-phrase the question as:
If ##c_t## follows a log-normal distribution, how can I find out what type of distribution describes ##P##:
## P = STDEV \left( log \left( \frac{c_{t}}{c_{t-1}} \right) \right) ##?

Apologies if this is what you were explaining before, but I didn't quite understand that.

fresh_42 · Feb 1, 2022

Master1022 said:

Thanks for the reply @fresh_42 !

(did you mean 'deviation' instead of abbreviation?)

Sure, sorry, I corrected it.

Master1022 said:

Nonetheless, I suppose that makes sense. From reading online, it seems like:
- the stock price is log-normally distributed (this makes sense as the price can't be negative),
- the linear fractional returns are normally distributed (from link above)

Perhaps I should re-phrase the question as:
If ##c_t## follows a log-normal distribution, how can I find out what type of distribution describes:
## STDEV \left( log \left( \frac{c_{t}}{c_{t-1}} \right) \right) ##?

Apologies if this is what you were explaining before, but I didn't quite understand that.

What is the random variable in ## STDEV \left( log \left( \frac{c_{t}}{c_{t-1}} \right) \right) ##?

FactChecker · Feb 1, 2022

Of course, you can only say this about randomly generated stock fluctuation. So you must be ignoring significant events that are directly related to the stock price.

hutchphd · Feb 1, 2022

I am confused as to the question at hand. Therefore I will supply my one-size-fits-all answer: Central Limit Theorem.
If this is not the answer I will be interested.

FactChecker · Feb 1, 2022

Master1022 said:

This seems lognormal to me. However, is this what we expect?

It is not lognormal. Here is a link to information about the distribution of the sample standard deviation.

Master1022 · Feb 1, 2022

Thanks for all the replies @fresh_42 , @hutchphd , and @FactChecker . It looks like I didn't do a good job of asking the question. The only way I can phrase the question in my mind is as I wrote before: if stock price is assumed to be log-normal, then what is the distribution of the sum of the squared log-returns? The log returns are a ratio between two prices, which themselves are random variables...

The first plot was just there for context: I calculated a metric, wondered "does this make sense?", and then found something to support it.

fresh_42 said:

What is the random variable in ## STDEV \left( log \left( \frac{c_{t}}{c_{t-1}} \right) \right) ##?

I thought the prices ##c_{t}## are the random variables. They are the prices on days ##t## and ##t -1 ## respectively. Then I thought that a function of a random variable is a random variable.

FactChecker said:

Of course, you can only say this about randomly generated stock fluctuation. So you must be ignoring significant events that are directly related to the stock price.

Yes, I am ignoring those events for now. Just trying to ascertain things about the basic (and perhaps un-realistic/non-applicable) distribution.

hutchphd said:

I am confused as to the question at hand. Therefore I will supply my one-size-fits-all answer: Central Limit Theorem.
If this is not the answer I will be interested.

Hmm, fair enough. How can I reconcile that with the second plot in my original post? That plot was generated by looking at the price time-series for a few companies, then calculating the volatility time-series for each of them. Then I averaged them together (on each day) to get an 'average sector volatility' just so I could look at the distribution of those (e.g. average technology volatilities). The distribution doesn't quite look normal, but you are saying that I should expect it to be normal?

FactChecker said:

It is not lognormal. Here is a link to information about the distribution of the sample standard deviation.

Okay many thanks. However, the one part that I am not sure about is that ##x## is drawn from a normal distribution here, whereas the stock price is log-normal. I feel like this should make a difference?

fresh_42 · Feb 1, 2022

The prices are the random variable according to some probability distribution function (pdf). The pdf has one expectation value, one variance and so one standard deviation, all are a number. By saying the prices are distributed along with a certain pdf, you automatically determine a certain value for the standard deviation aka volatility.

In order to consider the standard deviation itself as a random variable, you have to consider a variety of pdf, along which the standard deviation can vary. @FactChecker gave an example in #7, where all pdf were normal distributions, but I couldn't see how the pdf themselves were distributed. I guess also normally.

So if you want to vary a standard deviation, you have to consider probability distribution functions as values for a random variable. This requires some math to do. E.g. what does it mean that two pdf are close to each other, which kind of pdf do we allow at all, etc. And if the space for the pdf is set, then we have to consider a pdf on these pdf. That finally has again a standard deviation. You have basically asked for the pdf of all stock price pdf. However, this cannot be answered without a lot more information. You only said that the stock price pdf have to be normally distributed (or log normally). This brings you to the link in post #7 - with the same amount of missing information. What is the pdf of those (log) normal pdf?

Master1022 · Feb 1, 2022

Thanks for taking the time to respond @fresh_42

fresh_42 said:

The prices are the random variable according to some probability distribution function (pdf). The pdf has one expectation value, one variance and so one standard deviation, all are a number. By saying the prices are distributed along with a certain pdf, you automatically determine a certain value for the standard deviation aka volatility.

Sure, but volatility isn't just the standard deviation of the prices. It is the standard deviation of the log returns of the stock prices (at least with the definition I am going by). The log return is the logarithm of the ratio of two random variables... I think this may be where our discrepancy arises? If we have two log-normal random variables, take the ratio between them and then take the natural logarithm of that, what distribution does that yield?

fresh_42 · Feb 1, 2022

The variance of the logarithmic normal distribution is
$$
\operatorname{Var}\left(e^{X}\right)=e^{2\mu+\sigma^2}\left(e^{\sigma^2}-1\right) \neq e^{\operatorname{Var}(X)}=e^{\sigma^2}
$$
where ##X \sim \mathrm{N}(\mu,\sigma^2)## is the underlying normal variable and ##(\mu,\sigma )## are its parameters (expectation value, standard deviation).

Does that answer your question?

pasmith · Feb 2, 2022

I think you should have [tex] \log (c_n/c_{n-1}) \sim \mathrm{N}(\mu,\sigma^2).[/tex] The sample variance (about the known mean [itex]\mu[/itex]) of [itex]N[/itex] independent [itex]\mathrm{N}(\mu,\sigma^2)[/itex] variables [itex]Y_1, \dots, Y_N[/itex] is [tex] \frac{1}{N}\sum_{n=1}^{N} (Y_n - \mu)^2 = \frac{\sigma^2}{N}\sum_{n=1}^{N} \left(\frac{Y_n - \mu}{\sigma}\right)^2.[/tex] The sum on the right is a sum of squares of [itex]N[/itex] independent [itex]\mathrm{N}(0,1)[/itex] variables, which follows a chi-square distribution with [itex]N[/itex] degrees of freedom. The variance is therefore a scaled chi-square variable, while the standard deviation follows a (scaled) chi distribution.
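As a rough numerical check of the chi-distribution claim, here is a minimal simulation sketch using only the Python standard library. The values of `N`, `SIGMA`, `TRIALS` and the seed are illustrative assumptions, and the mean is treated as known, as in the formula above.

```python
import math
import random
import statistics

N = 10          # observations per sample, e.g. a 10-day window
SIGMA = 0.02    # hypothetical true daily volatility
MU = 0.0        # mean assumed known, as in the formula above
TRIALS = 20000

rng = random.Random(1)

def known_mean_std(n):
    """Square root of (1/n) * sum (Y - MU)^2 for n i.i.d. normal draws."""
    ys = [rng.gauss(MU, SIGMA) for _ in range(n)]
    return math.sqrt(sum((y - MU) ** 2 for y in ys) / n)

# Rescaled so the statistic should follow a chi distribution with N d.o.f.
scaled = [known_mean_std(N) * math.sqrt(N) / SIGMA for _ in range(TRIALS)]

# Mean of a chi distribution with k degrees of freedom:
# sqrt(2) * Gamma((k + 1) / 2) / Gamma(k / 2).
chi_mean = math.sqrt(2) * math.gamma((N + 1) / 2) / math.gamma(N / 2)

sim_mean = statistics.fmean(scaled)
sim_second_moment = statistics.fmean([x * x for x in scaled])  # near N

print(sim_mean, chi_mean, sim_second_moment)
```

If the mean is estimated from the same sample instead of being known, the usual result is a chi distribution with [itex]N-1[/itex] degrees of freedom rather than [itex]N[/itex].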

BWV · Feb 2, 2022

Lognormal is the standard model used in finance and the basis for the Black-Scholes option pricing model and others.

However, the vol is not normal. Particularly at shorter time periods, financial market returns are fat-tailed. Heteroskedastic models, rather than Lévy-stable distributions, fit the data best. You can get an idea of the volatility of the volatility by looking at the VIX index, which reflects the standard deviation priced into S&P 500 options.

A standard model of volatility is GARCH, where volatility is an autoregressive process (i.e. it trends with a random shock). This reflects the observation that tomorrow's volatility tends to be correlated with today's.

https://en.wikipedia.org/wiki/Autoregressive_conditional_heteroskedasticity
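To illustrate the idea, here is a minimal GARCH(1,1) simulation sketch using only the Python standard library. The parameter values (`OMEGA`, `ALPHA`, `BETA`), the sample length and the seed are hypothetical choices for illustration, not fitted to any market data.

```python
import math
import random
import statistics

# Hypothetical GARCH(1,1) parameters (ALPHA + BETA < 1 for stationarity).
OMEGA, ALPHA, BETA = 1e-6, 0.10, 0.85
N_DAYS = 20000

rng = random.Random(2)
var_t = OMEGA / (1.0 - ALPHA - BETA)  # start at the unconditional variance
returns = []
for _ in range(N_DAYS):
    r = math.sqrt(var_t) * rng.gauss(0.0, 1.0)  # conditionally normal return
    returns.append(r)
    # Tomorrow's variance depends on today's squared return and variance.
    var_t = OMEGA + ALPHA * r * r + BETA * var_t

def kurtosis(xs):
    """Sample kurtosis (equals 3 for a normal distribution)."""
    m = statistics.fmean(xs)
    m2 = statistics.fmean([(x - m) ** 2 for x in xs])
    m4 = statistics.fmean([(x - m) ** 4 for x in xs])
    return m4 / (m2 * m2)

def lag1_autocorr(xs):
    """Lag-1 sample autocorrelation."""
    m = statistics.fmean(xs)
    num = sum((a - m) * (b - m) for a, b in zip(xs, xs[1:]))
    den = sum((x - m) ** 2 for x in xs)
    return num / den

squared = [r * r for r in returns]
print(kurtosis(returns), lag1_autocorr(squared), lag1_autocorr(returns))
```

In this sketch each day's return is conditionally normal, yet the pooled returns should show kurtosis above 3 (fat tails) and the squared returns should be positively autocorrelated (volatility clustering), while the returns themselves stay close to uncorrelated.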

BWV · Feb 2, 2022

pasmith said:

I think you should have [tex] \log (c_n/c_{n-1}) \sim \mathrm{N}(\mu,\sigma^2).[/tex] The sample variance (about the known mean [itex]\mu[/itex]) of [itex]N[/itex] independent [itex]\mathrm{N}(\mu,\sigma^2)[/itex] variables [itex]Y_1, \dots, Y_N[/itex] is [tex] \frac{1}{N}\sum_{n=1}^{N} (Y_n - \mu)^2 = \frac{\sigma^2}{N}\sum_{n=1}^{N} \left(\frac{Y_n - \mu}{\sigma}\right)^2.[/tex] The sum on the right is a sum of squares of [itex]N[/itex] independent [itex]\mathrm{N}(0,1)[/itex] variables, which follows a chi-square distribution with [itex]N[/itex] degrees of freedom. The variance is therefore a scaled chi-square variable, while the standard deviation follows a (scaled) chi distribution.

You are describing the cross-sectional variance of stock returns, but the returns are not independent: there is a high correlation among the returns of individual stocks based on market, industry and other factors. I have never seen chi-squared used for cross-sectional stock variance in any finance papers.

BWV · Feb 2, 2022

The obvious problem with normally distributed returns is that the probability of a return < -100% is not zero.
Lognormal solves this. You can also get it by applying Ito's lemma to a geometric Brownian motion:

Given a stochastic process S, if you model F(S)=log(S) and apply Ito's Lemma, you reproduce a log-normal distribution.
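The step can be sketched as follows, assuming (this is not spelled out above) that S follows a geometric Brownian motion with constant drift ##\mu## and constant volatility ##\sigma##, driven by a Wiener process ##W_t##:
$$
dS_t = \mu S_t\,dt + \sigma S_t\,dW_t
$$
Applying Ito's lemma to ##F(S)=\log S##, with ##(dS_t)^2 = \sigma^2 S_t^2\,dt##:
$$
d(\log S_t) = \frac{1}{S_t}\,dS_t - \frac{1}{2S_t^2}\,(dS_t)^2 = \left(\mu - \frac{\sigma^2}{2}\right)dt + \sigma\,dW_t
$$
Integrating from 0 to t gives
$$
\log\frac{S_t}{S_0} \sim \mathrm{N}\left(\left(\mu - \frac{\sigma^2}{2}\right)t,\ \sigma^2 t\right)
$$
so under these assumptions the log return is normal and ##S_t## itself is log-normal.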

Master1022 · Feb 2, 2022

Many thanks for the continued responses! I need to take some time to read through and fully understand them before I come back with any follow up questions. I've heard about some of these concepts before (e.g. GARCH), but not some of the other terms so I need to do some extra reading on them.

Master1022 · Feb 13, 2022

Just to follow up on this, I think a few answers here are correct with regards to ARCH/GARCH, but I am writing to update what I found. I was shown the book:

"Bouchaud, J., & Potters, M. (2003). Theory of Financial Risk and Derivative Pricing: From Statistical Physics to Risk Management (2nd ed.). Cambridge: Cambridge University Press. doi:10.1017/CBO9780511753893"

(I don't know if I can just screenshot the figure I am talking about. What are the rules regarding this??).

In Chapter 7, there is Figure 7.7 which plots the 'Distribution of the measured volatility ##\sigma_{hf}## of the S&P using a semi-log plot'. They then fit both a log-normal distribution and an inverse gamma distribution and note that the log-normal distribution underestimates the right tail. The inverse Gamma distribution arises from a derivation that essentially stems from a simple 'volatility-feedback' model (derivation is explained fully in Chapter 20).
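To see why a log-normal fit can fall short in the right tail, it may help to compare the two densities in their standard textbook forms. The symbols ##\alpha, \beta, m, s## below are my own notation and not necessarily the book's parametrization:
$$
f_{\text{IG}}(x) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\,x^{-\alpha-1}\,e^{-\beta/x}, \qquad
f_{\text{LN}}(x) = \frac{1}{x\,s\sqrt{2\pi}}\,\exp\left(-\frac{(\ln x - m)^2}{2s^2}\right)
$$
For large ##x## the inverse gamma density decays like the power law ##x^{-\alpha-1}##, whereas the log-normal density eventually decays faster than any power of ##x##. A log-normal that matches the bulk of the data would therefore tend to put less weight on very high volatilities.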

fresh_42 · Feb 13, 2022

I'm always skeptical if prices are explained by chartists. Volatility depends heavily on external variables: news, politics, trends, conversations between analysts and their customers, or simply among analysts, to name a few. It is not that the actual economic behavior leads to volatility, it is the opinions and decisions of others!

WWGD · Feb 13, 2022

BWV said:

you are describing the cross-sectional variance of stock returns, but this variance is not independent - there is a high correlation among the returns of individual stocks based on market, industry and other factors. Have never seen Chi-Squared used for cross sectional stock variance in any finance papers

Aren't we dealing with ##(c_n-c_{n-1})/c_{n-1}##, though?

BWV · Feb 13, 2022

WWGD said:

Aren't we dealing with ##(c_n-c_{n-1})/c_{n-1}##, though?

Yes, I misread the post. In the lognormal model the variance is a constant, and chi-square applies to the sampling error in estimating the variance. However, as variance is a stochastic process as well, no one bothers with the variance sampling error.

Office_Shredder · Feb 13, 2022

fresh_42 said:

It is not that the actual economic behavior leads to volatility, it is the opinions and decisions of others!

The opinions and decisions of others are economic behavior of the world, so I'm not sure your belief here is entirely well founded. Questions like how often an exogenous fat-tailed shock occurs are just measurable empirical results.

BWV · Feb 13, 2022

If you take a simple growing perpetuity model for stocks:

P=CF/(r-g), where CF is either company earnings or dividends, r is the discount rate, g is the growth rate of CF, and r>g, then small changes in r or g have a large impact on value. This is hard to observe directly, but using EPS, r-g for the market is around 0.04 (1/(r-g) is the PE ratio).
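A quick numerical sketch of that sensitivity follows. The inputs are hypothetical and chosen only so that r-g = 0.04, and the function name `perpetuity_price` is my own.

```python
def perpetuity_price(cf, r, g):
    """Growing-perpetuity value P = CF / (r - g); requires r > g."""
    if r <= g:
        raise ValueError("discount rate r must exceed growth rate g")
    return cf / (r - g)

# Hypothetical inputs chosen so that r - g = 0.04, i.e. a multiple of 25.
base = perpetuity_price(cf=1.0, r=0.08, g=0.04)
bumped = perpetuity_price(cf=1.0, r=0.085, g=0.04)  # r up by 0.5 points

pct_change = (bumped / base - 1.0) * 100.0
print(base, bumped, pct_change)
```

With these numbers, a half-point rise in r (from 8% to 8.5%) lowers the value from 25 to about 22.2 times CF, a drop of roughly 11%.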

WWGD · Feb 17, 2022

FactChecker said:

It is not lognormal. Here is a link to information about the distribution of the sample standard deviation.

But this is composed with ##\log(c_t/c_{t-1})##. Wouldn't that affect the distribution?

WWGD · Feb 20, 2022

Master1022 said:

Summary:: The stock price follows a log-normal distribution; the linear fractional change of the stock price follows a normal distribution. What about the volatility of the stock?

Don't know if this is an empty technicality or not, but it doesn't seem clear what the parameters are for your choice ##m_1:= ((C_t - C_{t-1})/C_{t-1})## of Estimator Statistic. Is it just a function of t? Maybe I'm just being thick.

Master1022 · Feb 23, 2022

WWGD said:

Don't know if this is an empty technicality or not, but it doesn't seem clear what the parameters are for your choice ##m_1:= ((C_t - C_{t-1})/C_{t-1})## of Estimator Statistic. Is it just a function of t? Maybe I'm just being thick.

Sorry yes, it was just a function of ##t##. That formula was just meant to describe the linear fractional change in the closing price of a stock.

The post at a high level was supposed to be:

"I am looking at variable A and I see the following plot [show plot]. Is this what should be expected?

Likewise, when I looked at variable B (formula in your comment), I see the following plot which is substantiated by online resources, thus confirming I should see that."

Apologies for the confusion
