# Combined measurement uncertainty for mass computation

1. Apr 1, 2016

### JotWe

Problem
I would like to determine the combined standard measurement uncertainty for a mean mass $\bar{m}$ computed from a mean volume $\bar{V}$ and a constant density $\rho$ with $$\bar{m} = \bar{V} \, \rho.$$
I know the mean volume $\bar{V}$, volume standard deviation $\sigma_V$, number of volume samples $n_V$ and constant density $\rho$. The density $\rho$ is assumed to have no uncertainty.

I also have a small set of relative errors $\epsilon_i$ with $n_\epsilon$ samples that describe the deviation between masses $\tilde{m}_i$ obtained with the applied measurement method (same system to determine the volume and the same constant density) and the masses $\hat{m}_i$ obtained with a presumably more reliable reference measurement method. The relative error $\epsilon_i$ is computed by $$\epsilon_i = \frac{\tilde{m}_i - \hat{m}_i}{\hat{m}_i}.$$
All uncertainties are assumed to be normally distributed.

Approach
My approach to compute the combined standard measurement uncertainty $u_c(\bar{m})$ is $$u_c(\bar{m}) = \sqrt{\frac{\rho^2 \, u(V)^2 + u(m)^2}{n_V}}$$
with
$$u(V) = \sigma_V,$$
$$u(m)^2 = \frac{1}{n_\epsilon - 1} \sum_{i = 1}^{n_\epsilon} \; (\epsilon_i \, \bar{m} - 0)^2.$$
Here, I assume that the error of the measurement method is zero-mean and I use the computed mean mass $\bar{m}$ to convert the given relative errors $\epsilon_i$ with unknown reference masses $\hat{m}_i$ into an absolute error.

Is this approach reasonable?

2. Apr 4, 2016

### Stephen Tashi

I think it overestimates the standard deviation.

Let V be the random variable defined by the measured volume. Let X be the random variable that is the actual volume. Let Y be the random variable that gives the deviation from the actual volume produced by the measuring apparatus. Assuming X and Y are independent, the standard deviations obey $\sigma_V = \sqrt{ \sigma^2_X + \sigma^2_Y }$. I think this is the motivation for your formula.

However, in your formula $$u_c(\bar{m}) = \sqrt{\frac{\rho^2 \, u(V)^2 + u(m)^2}{n_V}}$$ the $u(V)^2$ term does not correspond to $\sigma_X$ because you have no data on the variance of the actual volumes. You only have data that gives the sum $V = X + Y$, not data that gives $X$ by itself.
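The relation $\sigma_V = \sqrt{\sigma^2_X + \sigma^2_Y}$ can be checked numerically. The following is a minimal sketch, assuming independent, normally distributed $X$ and $Y$; the values $\sigma_X = 3$, $\sigma_Y = 4$ and the mean of 100 are hypothetical and chosen only so that the expected result is 5.

```python
import math
import random
import statistics


def simulate_measured_sd(sigma_x, sigma_y, mean_x=100.0, n=50_000, seed=0):
    """Sample standard deviation of V = X + Y for independent normal X and Y."""
    rng = random.Random(seed)
    # X: actual volume, Y: zero-mean measurement deviation
    v = [rng.gauss(mean_x, sigma_x) + rng.gauss(0.0, sigma_y) for _ in range(n)]
    return statistics.stdev(v)


sigma_x, sigma_y = 3.0, 4.0  # hypothetical values, not from the thread
sd_v = simulate_measured_sd(sigma_x, sigma_y)
expected = math.sqrt(sigma_x**2 + sigma_y**2)
print(f"simulated sd of V: {sd_v:.3f}, sqrt(sx^2 + sy^2): {expected:.3f}")
```

The simulated spread of the measured values already contains the contribution of $Y$, which is why adding a further measurement term on top of $\sigma_V$ tends to overestimate.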

3. Apr 4, 2016

### JotWe

Thank you very much for your feedback!

You are absolutely right, the volume is the random variable here. But my intention with the formula
$$u_c(\bar{m}) = \sqrt{\frac{\rho^2 \, u(V)^2 + u(m)^2}{n_V}}$$
is not to estimate the uncertainty of the measured volume. I would like to estimate the uncertainty of determining a mass with the uncertain volume incorporating additional information given by the relative errors. (I hope this makes sense, somehow.) I should have given more background on my derivation of this formula in my first post. Sorry for being unclear.

I know the (sample) mean, (sample) standard deviation and number of the volume samples $\bar{V}$, $\sigma_V$, $n_V$ as well as the constant density $\rho$. Based on this information, I can compute the standard deviation (or standard uncertainty) of the mean volume $\bar{V}$ by
$$\sigma_{\bar{V}} = \frac{\sigma_V}{\sqrt{n_V}} = \sqrt{\frac{\sigma_V^2}{n_V}}.$$
Including the constant density $\rho$, I should be able to compute the standard deviation (or standard uncertainty) of the mean mass $\bar{m} = \rho \, \bar{V}$ with
$$\sigma_{\bar{m}} = \frac{\rho \, \sigma_V}{\sqrt{n_V}} = \sqrt{\frac{\rho^2 \, \sigma_V^2}{n_V}}.$$
I could stop here and use this standard deviation (or standard uncertainty) for my analysis. But I have additional information: The small set of relative errors $\epsilon_i$ with $n_\epsilon$ samples that describe the deviation between the applied measurement method and a presumably more reliable reference measurement method for the mass. My approach for incorporating this information is to estimate an uncertainty component $u(m)$ and add it to the formula of the standard deviation (or standard uncertainty) of the mean mass $\bar{m}$ by adding in quadrature. This results in my formula
$$u_c(\bar{m}) = \sqrt{\frac{\rho^2 \, \sigma_V^2 + u(m)^2}{n_V}} = \sqrt{\frac{\rho^2 \, u(V)^2 + u(m)^2}{n_V}}.$$
In this way, the indirectly determined mean mass $\bar{m}$ has two uncertainty components: The uncertainty resulting from the mean volume $\bar{V}$ and the uncertainty of the measurement system. At least, that's what I would like to do.

Last edited: Apr 4, 2016

4. Apr 5, 2016

### Stephen Tashi

You can try to estimate the standard deviation of the error-in-measurement from the relative errors. However, if your bottom line is to estimate the standard deviation of the computed masses, note that the computed masses depend only on the measured volumes. Each measured volume combines the effect of having a particular actual volume with the effect of making some error in measuring that volume. So the standard deviation you get from your volume measurements already accounts for the variation due to errors in measuring.

To illustrate the point, think of a situation where the actual volumes are all the same identical value and the only variation in the measured volumes is due to the error of the measuring device. The uncertainty of the volume measurement can be estimated from the data and it would be identical to the uncertainty of the measuring device. You wouldn't need to add another term to double count the uncertainty of the measuring device.
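In the notation used earlier ($V = X + Y$ with independent $X$ and $Y$), this limiting case can be written out as a short worked step. With identical actual volumes, $\sigma_X = 0$, so
$$\sigma_V = \sqrt{\sigma_X^2 + \sigma_Y^2} = \sigma_Y.$$
Adding the device uncertainty a second time in quadrature would then give
$$\sqrt{\sigma_V^2 + \sigma_Y^2} = \sqrt{2} \, \sigma_Y \approx 1.41 \, \sigma_Y,$$
which overstates the spread by about 41 % in this idealized case.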

If you want to estimate the standard deviation of the actual masses, which are a function of the actual volumes, then this involves solving for $\sigma^2_X$ in the equation $\sigma^2_V = \sigma^2_X + \sigma^2_Y$. However, you have no direct measurements of the actual volumes. So you would be computing an uncertainty for an estimator for which you have no data.

Estimating the error in measurement from the relative errors is an interesting problem. The equation $$\epsilon_i = \frac{\tilde{m}_i - \hat{m}_i}{\hat{m}_i}$$ and the statement that "all variables are normally distributed" are incompatible because the right side would contain a quotient of normal random variables and so the left side could not be normally distributed. Of course, in practical work, people make technically contradictory assumptions all the time and perhaps one can get away with approximating both sides of the equation with normal distributions. (As I recall, there is a paper by a well-known statistician on this topic.)

5. Apr 5, 2016

### Staff: Mentor

The reciprocal of a normally distributed variable is not normally distributed, but if the standard deviation is small compared to the mean value, the deviation from a normal distribution is small. If your denominator just varies by 1%, you probably don't care about those details. If it varies by 30%, you have to consider it.
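A quick simulation can show the difference between the 1% and 30% cases. The sketch below is an illustration under assumptions: it draws $X \sim N(1, s)$, forms $1/X$, and compares the distance from the median to the upper and to the lower one-sigma quantile. For a symmetric distribution this ratio is 1; for $1/X$ it is roughly $(1+s)/(1-s)$ as long as negative values of $X$ are negligible. The function name and sample size are hypothetical choices.

```python
import random


def reciprocal_asymmetry(rel_sd, n=100_000, seed=1):
    """Ratio of upper to lower one-sigma quantile distance for 1/X, X ~ N(1, rel_sd)."""
    rng = random.Random(seed)
    samples = sorted(1.0 / rng.gauss(1.0, rel_sd) for _ in range(n))
    q16 = samples[int(0.1587 * n)]  # lower one-sigma quantile
    q50 = samples[int(0.5 * n)]     # median
    q84 = samples[int(0.8413 * n)]  # upper one-sigma quantile
    return (q84 - q50) / (q50 - q16)


a_small = reciprocal_asymmetry(0.01)  # denominator varies by 1 %
a_large = reciprocal_asymmetry(0.30)  # denominator varies by 30 %
print(f"asymmetry at 1 %: {a_small:.2f}, at 30 %: {a_large:.2f}")
```

Quantiles are used instead of skewness because the moments of $1/X$ are dominated by rare draws of $X$ near zero and would be unstable in a simulation.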

6. Apr 13, 2016

### JotWe

It's true, I don't know the actual volumes. That's why I tried to estimate the uncertainty resulting from the error-in-measurement by the standard deviation of the mean given by $$\sigma_{\bar{V}} = \frac{\sigma_V}{\sqrt{n_V}} = \sqrt{\frac{\sigma_V^2}{n_V}}.$$ But in addition, I have a (more systematic) error of the measurement method which I tried to estimate with the relative errors giving $$\sigma_{\bar{m}} = \frac{\rho \, \sigma_V}{\sqrt{n_V}} = \sqrt{\frac{\rho^2 \, \sigma_V^2}{n_V}}.$$ Here, I assume that the error of the measurement method is reduced by taking the mean over $n_V$ measurements (unlike real systematic errors).

Your arguments against a normal distribution of this error are absolutely valid. One solution could be to apply a Monte-Carlo simulation to determine the distribution of the estimated mass error. But I still need a distribution for the error of the measurement method for sampling. If you have a better idea for obtaining a distribution from the relative errors, I would be happy to try it. The main problem is that only six relative errors are reported in the literature, so the mean and standard deviation are not very representative.
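To get a feeling for how little six values constrain a standard deviation, one can simulate it. The following sketch assumes the errors are normally distributed with a true standard deviation of 1 and reports the central 95 % range of the sample standard deviation for samples of size six; the function name, trial count and seed are hypothetical choices.

```python
import random
import statistics


def sample_sd_spread(n_samples=6, n_trials=20_000, seed=2):
    """Central 95 % range of the sample sd for n_samples standard-normal draws."""
    rng = random.Random(seed)
    sds = sorted(
        statistics.stdev([rng.gauss(0.0, 1.0) for _ in range(n_samples)])
        for _ in range(n_trials)
    )
    return sds[int(0.025 * n_trials)], sds[int(0.975 * n_trials)]


low, high = sample_sd_spread()
print(f"95 % of sample sds fall between {low:.2f} and {high:.2f} (true sd = 1)")
```

Under these assumptions the estimated standard deviation can plausibly be off by several tens of percent in either direction, so any distribution fitted to the six relative errors should be treated as rough.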

The background of this problem is the estimation of human body segment masses of living subjects (e.g., shank mass). The approach used in most biomechanics literature is to estimate the segment mass from a linear regression model based on measured segment volumes and a constant density. Of course, the human body does not have a constant density and that's why there is a significant error.

7. Apr 14, 2016

### Stephen Tashi

One interpretation of the problem is a question like "What is the standard deviation of the mass of the right forearm in the population of all human subjects?" (male, female, old, young, etc.). A different interpretation of the problem is a question like "If I randomly pick a particular individual from the population and apply a given method to estimate the mass of his right forearm, what is the standard deviation of the error in that measurement?"

There are many other ways of making the general question into something specific. We need to understand which specific versions of the question you are working on.

We also need to understand the specific interpretation of the two sets of data.

For example, are the six reported relative errors each an error in the measurement of the same body segment (like a forearm)? Are the relative errors summaries of larger sets of errors? For example, is one reported relative error the mean error in measuring the mass of the right forearm of some randomly selected adult subjects?

What defines the larger set of volume measurements, i.e. what defines the population from which the measurements are taken? Does the population involve measurements of the same body part (e.g. right forearm)? Or is the population defined by a random selection of some human subject and then a random selection of a body part on that subject?

8. Apr 15, 2016

### JotWe

The larger set of volumes is based on a sample of 31 young male subjects [1]. For each body segment the mean volume and the standard deviation is reported as well as a constant density $\rho$ for all segments. These are my $n_V$, $\bar{V}$, $\sigma_V$ and $\rho$.

The six relative errors $\epsilon_i$ are reported in a previous study for each body segment [2] and were determined from six male cadavers. Each relative error represents the error on one specific cadaver. The relative errors describe the deviation between the applied measurement method (measuring the body segment volume and multiplying it with the constant density) and a presumably more reliable reference measurement method.

$$\bar{V} = 4369 \, \text{cm}^3, \sigma_V = 295 \, \text{cm}^3, n_V = 31, \rho = 0.001 \, \frac{\text{kg}}{\text{cm}^3}$$
$$\epsilon_1 = -0.4 \, \%, \epsilon_2 = 1.1 \, \%, \epsilon_3 = -5.6 \, \%, \epsilon_4 = 3.7 \, \%, \epsilon_5 = -1.2 \, \%, \epsilon_6 = -1.4 \, \%$$

I am interested in the uncertainty of the mean mass $\bar{m} = \bar{V} \, \rho$.

[1] McConville et al. (1980). Anthropometric Relationships of Body and Body Segment Moments of Inertia.
[2] McConville et al. (1976). Anthropometric Assessment of the Mass Distribution Characteristics of the Living Human Body.
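As a rough numerical illustration, the formulas from the first post can be evaluated with the values given above. The sketch assumes that the expression for $u(m)$ is meant as a variance whose square root enters the combination, that the relative errors are zero-mean, and that percentages are converted to fractions; the variable names are illustrative only.

```python
import math

v_mean = 4369.0  # mean volume in cm^3
v_sd = 295.0     # volume standard deviation in cm^3
n_v = 31         # number of volume samples
rho = 0.001      # constant density in kg/cm^3
eps = [-0.004, 0.011, -0.056, 0.037, -0.012, -0.014]  # relative errors as fractions

# Mean mass and its standard uncertainty from the volume spread alone
m_mean = rho * v_mean
u_mean_volume_only = rho * v_sd / math.sqrt(n_v)

# Method component: zero-mean assumption, n - 1 divisor as in the first post
u_m = m_mean * math.sqrt(sum(e**2 for e in eps) / (len(eps) - 1))

# Combination in quadrature, both components divided by n_V as proposed
u_c = math.sqrt(((rho * v_sd) ** 2 + u_m**2) / n_v)

print(f"mean mass: {m_mean:.3f} kg")
print(f"volume-only uncertainty of the mean: {u_mean_volume_only:.4f} kg")
print(f"method component u(m): {u_m:.4f} kg")
print(f"combined u_c: {u_c:.4f} kg")
```

With these inputs the volume-only value is about 0.053 kg and the combined value about 0.058 kg, so the added term changes the result by roughly 10 %. Whether adding it is justified is a separate question from the arithmetic.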

9. Apr 17, 2016

### Stephen Tashi

Let's consider one body segment (e.g. head segment). For that segment, your data for 31 male subjects consists of what I will call the "measured volume", from which you infer the "measured mass". The "measured volume" of the segment is the result of a measuring process that may have some error, so the "measured volume" is distinct from the "actual volume".

For the segment, there is (in theory) a "mean measured volume" for the population of all male subjects. The "sample mean measured volume" is an estimate of the population "mean measured volume". If you want to estimate the standard deviation for this estimator, you can estimate it by using the sample standard deviation.

A population parameter can be estimated in many different ways. The definition of what makes one estimation formula "best" is a conceptually complicated matter and "best" can be interpreted mathematically in different ways. I think estimating the standard deviation of the sample measured volume by using a formula that adds on a term to the sample standard deviation is not best from the point of view of getting an "unbiased" estimator.

Instead of the standard deviation of the "sample mean measured volume", perhaps you want to estimate the standard deviation of the "sample mean actual volume". We can think about doing this, but it isn't straightforward because your data from the 31 subjects doesn't consist of "actual" volumes.

Applying statistics is highly subjective. Sometimes the goal is "cultural" - e.g. to please the referees of a scientific journal. Sometimes the goal is a practical engineering project - e.g. to design some equipment that will work. Different goals require different approaches.

10. Apr 18, 2016

### JotWe

Thank you again for your support and patience.

Yes, I only know the sample mean and standard deviation of the measured volume. I (maybe too hastily) anticipated that these - based on the given information - are the best estimators for the sample mean and standard deviation of the actual volume I can get. In my original post I tried to derive a closed form for the combined measurement uncertainty. This might not be the best approach. My final goal is to estimate the uncertainty of the regression model. I could use a Monte Carlo simulation for example. So an unbiased estimation would definitely be favourable.

11. Apr 19, 2016

### Stephen Tashi

Your original post is a reasonable line of thinking, but (in the notation of my reply), the standard deviation of the actual volume would be computed by subtraction: $\sigma_X^2 = \sigma_V^2 - \sigma_Y^2$

However, you have no measurements of the estimator $X$. You can say "I assume the mean measured volume estimates the mean actual volume", but then you are talking about a different estimator of the actual volume than using given values of $X$.
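For illustration only, the subtraction can be tried with the numbers posted on Apr 15. The sketch below rests on strong assumptions that the thread does not confirm: the six relative mass errors are taken as zero-mean, their spread is converted to a volume deviation $\sigma_Y$ at the mean volume, and $X$ and $Y$ are treated as independent. The helper name `actual_sd` is hypothetical.

```python
import math


def actual_sd(sd_measured, sd_error):
    """Solve sigma_V^2 = sigma_X^2 + sigma_Y^2 for sigma_X; None if the variance is negative."""
    var_x = sd_measured**2 - sd_error**2
    if var_x < 0:
        return None  # the error estimate exceeds the observed spread
    return math.sqrt(var_x)


v_mean, sigma_v = 4369.0, 295.0  # cm^3, from the reported volume statistics
eps = [-0.004, 0.011, -0.056, 0.037, -0.012, -0.014]  # relative errors as fractions

# Assumed conversion of the relative errors into an absolute volume deviation
sigma_y = v_mean * math.sqrt(sum(e**2 for e in eps) / (len(eps) - 1))
sigma_x = actual_sd(sigma_v, sigma_y)
print(f"sigma_Y: {sigma_y:.1f} cm^3, sigma_X: {sigma_x:.1f} cm^3")
```

Under these assumptions $\sigma_X$ comes out smaller than $\sigma_V$ (about 261 cm³ against 295 cm³), the opposite direction from adding a term in quadrature.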

A Monte Carlo simulation is always a good idea.

You prefer the term "uncertainty" to the term "standard deviation". The term "uncertainty" is used by various "cultures" to mean different things. For example, there is (certainly!) a culture of people who think about statistical problems in an intuitive and technically wrong way. Often statisticians find they must present arguments to managers who belong to such a culture. In the culture of intuitive wrong statistics, there is some specific number (e.g. 27.53) and there is an "error bar" of a certain size around that number (e.g. plus or minus 5.6) and the intuitive conclusion is that "there is a 68% chance that the actual measurement is within the interval spanned by the error bar".

It is possible to formulate statistical problems so that this intuitive interpretation is correct, but doing so involves using Bayesian statistics. What I often see is that statisticians formulate statistical problems without using Bayesian statistics and, to please non-statisticians, they phrase their results in terms of "error bars". This allows the statisticians to deny complicity when the non-statisticians make their own incorrect interpretation of the results.

Saying you want to "estimate the uncertainty" of a model is ambiguous. It's difficult to translate this into a specific mathematical task. I think of a regression model as a function. Are you seeking to plot the model and then find "error bars" that let you bound the graph of the function by one function above it and another function below it?

12. Apr 22, 2016

### JotWe

Well, to answer your last question first: No, I would like to know (or approximate) the distribution of the model output.

If I know the distribution, I still can think of the parameters to describe it.

Actually, I don't have much information. I know a mean and a standard deviation of the measured volume. I think it is reasonable to assume a normal distribution, but I don't know exactly. I also have a small number of relative errors for masses that were computed with other measured volumes using the same measuring method and a constant density which is just a rough approximation of the real density distribution in the volume. Again, I don't have any information about the distribution.

If I want to apply a Monte Carlo simulation, I think I have to make some assumptions.

13. Apr 23, 2016

### Stephen Tashi

What shall we mean by "the model"? Glancing at those articles online, the articles give statistics. So is "the model" to be a distribution that matches those statistics?

I think what you want to do is find a model for the actual mass distribution of a given body segment using the published statistics of measured mass distributions together with the published statistics about the errors in measurement. We have 3 random variables, V = X + Y. You have data for the distribution of V and data for the distribution of Y. You want to estimate the distribution of X. This would be called a problem of "deconvolution".

An approach to investigating this by Monte Carlo simulation would be to use the data to fit a distribution to the data for Y. Then try various distributions for X and see which produce results that match (in some sense) the data for V.
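The procedure described above can be sketched as a small grid search. This is a minimal illustration, not a full deconvolution: it assumes normal distributions for both $X$ and $Y$, a zero-mean $Y$ with a standard deviation of about 138 cm³ (a hypothetical value derived from the six relative errors), and it matches only the standard deviation of $V$. The function names, candidate grid and seed are assumptions.

```python
import random
import statistics


def simulated_sd_v(sigma_x, sigma_y, mean_x, n, rng):
    """Sample sd of simulated measured volumes V = X + Y."""
    v = [rng.gauss(mean_x, sigma_x) + rng.gauss(0.0, sigma_y) for _ in range(n)]
    return statistics.stdev(v)


def best_sigma_x(candidates, sigma_y, target_sd_v, mean_x=4369.0, n=20_000, seed=3):
    """Candidate sigma_X whose simulated sd of V is closest to the reported one."""
    rng = random.Random(seed)
    return min(
        candidates,
        key=lambda sx: abs(simulated_sd_v(sx, sigma_y, mean_x, n, rng) - target_sd_v),
    )


candidates = range(200, 301, 10)  # trial values for sigma_X in cm^3
best = best_sigma_x(candidates, sigma_y=138.0, target_sd_v=295.0)
print(f"best-matching sigma_X: {best} cm^3")
```

With non-normal candidate distributions for $X$ or $Y$, the same loop could compare more than one summary statistic, but the published data only provide the mean and standard deviation of $V$.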

14. Apr 24, 2016

### JotWe

Yes, I would like to estimate the distribution of the actual segment mass given the published statistics.

Thank you for this hint! I will read up on deconvolution and see where it gets me.