# What is the expected (squared) coefficient of variation of a sample?


#### NotEuler

I've been trying to figure this out, but am not getting anywhere and was hoping someone here might know.
Say I have a distribution of which I know the variance and mean. I then take samples of n random variables from this distribution.
Without knowing anything more about the distribution, can I calculate the expectation of the coefficient of variation of the samples?
What about the squared coefficient of variation?
If I understand correctly, the expected mean of the samples is just the mean of the distribution. And the variance of the sample mean is just the variance of the distribution divided by n (the expected sample variance itself, with the n-1 divisor, equals the variance of the distribution).
But I haven't been able to get much further than that.

Perhaps another way to ask the same is: If I have n iid random variables, what is their expected (squared) coefficient of variation?

The coefficient of variation, and also its square, will depend on the distribution of the random variable. That's because it's a nonlinear function of the samples, so things don't cancel out as they do for the mean and the variance, which are respectively linear in the sampled values and the squared sampled values.
Consider for example the following two random variables:
1. X is chosen from {1, 2, 3, 4, 5} with equal probability for each one.
2. Y is chosen from {2.293, 5.828} with probability of 0.8 on the lower number.
You can verify that X and Y have the same mean and variance. But if you simulate enough samples of size 10 and take the average of the sample coefficients of variation observed you should get something like 0.47 for X and 0.42 for Y.
In short, in most cases, distribution matters and we can't just use the mean and std dev. The cases where we can ignore distribution are generally special cases, or are linear combinations of random variables so we can use linearities to just add means and variances.
Even more concisely, the answer to the "can I?" question is No.
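
The comparison of X and Y above can be sketched as a small simulation. This is only an illustration under stated assumptions: the sample standard deviation uses the n-1 divisor, and the helper names (`sample_cv`, `mean_sample_cv`, `draw_x`, `draw_y`, `pop_mean_var`), the seed and the number of trials are all made up for this example.

```python
import math
import random

def sample_cv(xs):
    """Sample coefficient of variation s / mean (assumes the n-1 divisor for s)."""
    n = len(xs)
    m = sum(xs) / n
    s = math.sqrt(sum((x - m) ** 2 for x in xs) / (n - 1))
    return s / m

def mean_sample_cv(draw, n=10, trials=20000, seed=0):
    """Average the sample CV over many simulated samples of size n."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += sample_cv([draw(rng) for _ in range(n)])
    return total / trials

def draw_x(rng):
    # X: uniform on {1, 2, 3, 4, 5}
    return rng.choice([1, 2, 3, 4, 5])

def draw_y(rng):
    # Y: 2.293 with probability 0.8, otherwise 5.828
    return 2.293 if rng.random() < 0.8 else 5.828

def pop_mean_var(values, probs):
    """Exact mean and variance of a finite discrete distribution."""
    mu = sum(v * p for v, p in zip(values, probs))
    var = sum(p * (v - mu) ** 2 for v, p in zip(values, probs))
    return mu, var

if __name__ == "__main__":
    print("X mean, var:", pop_mean_var([1, 2, 3, 4, 5], [0.2] * 5))
    print("Y mean, var:", pop_mean_var([2.293, 5.828], [0.8, 0.2]))
    print("average sample CV for X:", mean_sample_cv(draw_x))
    print("average sample CV for Y:", mean_sample_cv(draw_y))
```

The two distributions agree in mean and variance (up to the rounding of Y's values), yet the averaged sample CVs come out visibly different, which is the point of the counterexample.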

Excellent, many thanks! Makes a lot of sense.

So does this mean that the 'best' one can do (for an expression of the expectation of the coefficient of variation of the samples) is an expression like Expectation(cv) or Expectation(cv^2) for the square of it?

Or can something more be done if one knows more about the distribution, like skewness, kurtosis and so on?
Or perhaps this is a silly question. Maybe knowing 'all the central moments' is ultimately the same thing as knowing the whole distribution...
But I suppose there could be an expression that involves only a limited number of the central moments, like the first 3, 4 or 5...

That can't be answered until you define precisely what you want "best" to mean. In the theory of statistical estimation, some terminologies for possible meanings of "best" for an estimator are:

Maximum likelihood
Minimum variance
Unbiased
Least squares
Asymptotically efficient

Perhaps you can find a technical article involving some of this terminology and "square of the coefficient of variation".

But is this really the same thing? I don't want to estimate anything, I want to find an expression for something (cv^2) in terms of exactly known properties of a distribution. I just don't know if that is actually possible (although I know from andrewkirk's reply that it is not possible in terms of variance and mean alone).

I see what you mean. Does what you call "coefficient of variation of the samples" have a precise definition?

For example, to some people, the terminology "standard deviation of a sample" of N outcomes involves division by N and to others it involves division by N-1. We can use different estimators for the ##\sigma## of the distribution.
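
To make the N versus N-1 point concrete, here is a minimal sketch using Python's standard `statistics` module, where `pstdev` divides by N and `stdev` divides by N-1. The sample values are arbitrary and chosen only for illustration.

```python
import math
import statistics

# Arbitrary illustrative sample (not taken from the thread).
sample = [1, 2, 3, 4, 5, 3, 2, 4, 1, 5]
n = len(sample)
mean = statistics.mean(sample)

cv_n = statistics.pstdev(sample) / mean         # standard deviation with divisor N
cv_n_minus_1 = statistics.stdev(sample) / mean  # standard deviation with divisor N-1

# The two definitions differ by the constant factor sqrt((N-1)/N).
ratio = cv_n / cv_n_minus_1
print(cv_n, cv_n_minus_1, ratio, math.sqrt((n - 1) / n))
```

So the two conventions give sample CVs that differ by a fixed factor for a given N, and any expression for the expectation has to state which one it refers to.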

It seems to me that however you define cv^2, its purpose is to estimate something. So you might find an answer by looking at the literature on estimators.

I don't know of a general statement for the distribution of a sample coefficient of variation. I can say that in the case where the samples come from a normal distribution then

$$\sqrt n \left(\frac{s}{\overline{x}} - \frac {\sigma}{\mu}\right) \xrightarrow{d} \mathcal{N}\left(0,\frac{\sigma^2}{2\mu^2} + \frac{\sigma^4}{\mu^4}\right)$$

That might give you some insight for a [very special] case.
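
A rough way to sanity-check the limiting variance in the normal case is to simulate it. The sketch below is only a check under assumed values: `mu`, `sigma`, `n`, `reps` and the seed are arbitrary choices, `s` uses the n-1 divisor, and the helper names are hypothetical.

```python
import math
import random

def scaled_cv_errors(mu, sigma, n, reps, seed=0):
    """Simulate sqrt(n) * (s / xbar - sigma / mu) for normal samples of size n."""
    rng = random.Random(seed)
    out = []
    for _ in range(reps):
        xs = [rng.gauss(mu, sigma) for _ in range(n)]
        m = sum(xs) / n
        s = math.sqrt(sum((x - m) ** 2 for x in xs) / (n - 1))
        out.append(math.sqrt(n) * (s / m - sigma / mu))
    return out

def variance(zs):
    """Plain sample variance with the n-1 divisor."""
    m = sum(zs) / len(zs)
    return sum((z - m) ** 2 for z in zs) / (len(zs) - 1)

mu, sigma = 10.0, 2.0  # assumed values; mu is far from 0 so xbar stays positive
zs = scaled_cv_errors(mu, sigma, n=200, reps=3000)

empirical = variance(zs)
theoretical = sigma**2 / (2 * mu**2) + sigma**4 / mu**4
print(empirical, theoretical)
```

For large n the empirical variance should land close to the theoretical value, but this says nothing about non-normal distributions or about the exact expectation at small n.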