# Combine several measures: consensus mean or a sampling problem?


Dear all,
I have this problem:
the work aims to test the value (mean and error) of several sensors (with a different measure for each) in a subjects group, and then to have a single global value (a single mean and error) from all the sensors used to test the group. All sensors measure the same molecule with different specificity. The normality of the sensors measure in unknown.

In more detail: I have 20 subjects (a, b, c, ..., n). For each subject I can use different sensors (A, B, C, ..., M), each producing one measure. The interest is focused first on the value of each sensor (reflecting a biological parameter) and then on the overall value across all sensor measures.

The value provided by each sensor is interesting, so for the subjects I have the value of each sensor measurement. Not all measures are reliable and some must be discarded, so each sensor covers a different subset of the 20 subjects.
I calculated the mean and variance of sensors A, B, C, ..., M using the available measures from the group for each sensor.

So this is the available data:

Sensor A -> values from subjects (a,c,d,f,h,j...,n) ===> mean(A), variance(A)
Sensor B -> values from subjects (c,d,e,g,...,l) ===> mean(B), variance(B)
Sensor C -> values from subjects (a,b,d,e,j,k,...,n) ===> mean(C), variance(C)
...
Sensor M -> values from subjects (b,c,d,g,h,...,k) ===> mean(M), variance(M)

Each mean and variance is calculated on a different (sometimes identical) set of subjects, because some bad values were discarded.

Now the problem:

I want a single measure from all my means, together with its variance, and I find it difficult to decide which method to use, and for which reasons (heteroskedasticity, correlation of samples, etc.).

The global measure should account for variability between and within each sensor, yielding in the end a single mean value and a single error.

In particular I am considering two approaches: calculating a "consensus mean", or treating my work as stratified sampling (where each sensor would be a stratum) and weighting each sensor by the number of subjects for which measures are available.

The theoretical approach of "consensus mean" is described here: http://www.fire.nist.gov/bfrlpubs/build02/PDF/b02027.pdf (page 26)
and here:
http://www.itl.nist.gov/div898/software/dataplot/refman1/auxillar/consmean.htm
And I would use Dataplot with the Mandel-Paule approach.
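To make the Mandel-Paule option concrete, here is a rough Python sketch of the procedure (assuming you have each sensor's mean, sample variance, and subject count; this is only an illustration, not a substitute for Dataplot's validated implementation):

```python
import numpy as np

def mandel_paule(means, variances, ns, tol=1e-10, max_iter=200):
    """Consensus mean via the Mandel-Paule procedure: find the
    between-sensor variance tau^2 that makes the weighted sum of squared
    deviations equal its expected value (M - 1)."""
    x = np.asarray(means, dtype=float)
    # Variance of each sensor's *mean*: s_i^2 / n_i
    v = np.asarray(variances, dtype=float) / np.asarray(ns, dtype=float)
    M = len(x)

    def excess(tau2):
        w = 1.0 / (tau2 + v)
        xbar = np.sum(w * x) / np.sum(w)
        return np.sum(w * (x - xbar) ** 2) - (M - 1)

    if excess(0.0) <= 0.0:
        tau2 = 0.0  # no detectable between-sensor spread
    else:
        # excess() decreases in tau^2, so bisect for its root.
        lo, hi = 0.0, M * np.var(x) + 1.0
        for _ in range(max_iter):
            mid = 0.5 * (lo + hi)
            if excess(mid) > 0.0:
                lo = mid
            else:
                hi = mid
            if hi - lo < tol:
                break
        tau2 = 0.5 * (lo + hi)

    w = 1.0 / (tau2 + v)
    xbar = np.sum(w * x) / np.sum(w)
    se = np.sqrt(1.0 / np.sum(w))  # standard error of the consensus mean
    return xbar, se, tau2
```

The returned standard error combines within-sensor variability (through v) and between-sensor variability (through tau^2), which is the "between and within" behaviour described above.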

While for the stratified sampling approach I am considering the formula described here:
http://www.spsstools.net/Tutorials/WEIGHTING.pdf (page 9)
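For comparison, the subject-count weighting from that tutorial reduces to something like this sketch (hypothetical function name; it assumes the sensors behave like non-overlapping strata, which may not hold here):

```python
import numpy as np

def stratified_mean(means, variances, ns):
    """Combine per-sensor means with weights proportional to subject
    counts, as in the stratified-sampling formula (each sensor treated
    as one stratum)."""
    x = np.asarray(means, dtype=float)
    s2 = np.asarray(variances, dtype=float)
    n = np.asarray(ns, dtype=float)
    w = n / n.sum()                   # stratum weights n_h / N
    xbar = np.sum(w * x)              # stratified point estimate
    var_xbar = np.sum(w**2 * s2 / n)  # Var = sum of w_h^2 * s_h^2 / n_h
    return xbar, np.sqrt(var_xbar)
```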

Which approach do you think is more suitable for my needs, and why?

Thank you very much!
Michael.

Stephen Tashi

Using stratification would imply that the goal was to compute a mean that is likely to agree with other researchers who use the same set of sensors (and had roughly the same number of non-discarded measurements from them). This is not the same as the goal of finding a mean value that is likely to agree with researchers who use a different set of sensors or, for example, make a larger fraction of their measurements with sensor 'A' than you did.

The consensus mean is simple and if you publish your data, some readers will probably compute it themselves even if you don't. So you should be prepared to answer questions that it may raise.

Usually when people ask questions about real world statistical problems on the forum, they abstract the mathematical information that they think is relevant in order to be concise. This is good, but they usually omit necessary information. (A large amount of information and a considerable number of assumptions are needed to give a mathematical answer to a real world question.)

For example, when you say subject 'a' is measured by sensors 'A' and 'C', this might mean that subject 'a' is something like a mineral specimen that was weighed non-destructively on two different scales. Or it might mean that subject 'a' is a tissue sample from an animal that was divided into two parts, with one part sent to lab 'A' and one part sent to lab 'C'.

It's also necessary to be clear about your purpose. For example, if your goal is to publish a paper in a particular journal or write a thesis that will be approved by a certain committee, then you should look at what statistical methods were used in papers that they have approved. You might simply ask one of the editors or committee members for advice.

I realized that I cannot treat my data as stratified sampling, because the same subject is measured with different sensors (so my strata would be overlapping).
This leaves the "consensus mean", but usually it is computed for data from different labs that worked on different (even if similar) samples.

At this point I do not know exactly how to treat my data and obtain a "global" estimate from all the measures. I also thought of directly calculating a single mean and variance from all the available data, ignoring the fact that I used different sensors, as they are all designed to measure the concentration of a specific molecule.

You are absolutely right about the questions that a particular approach would raise, but I do not really care about them, as I think the most important thing is to obtain a reasonable measure, hence understanding the importance of all the significant aspects (like the overlapping, which rules out treating the data as a stratified sample). Then the answers will be logical.

The aim is to publish a paper, but this technique is really new and no other works have been done previously.

If any other detail is needed I can provide it. About the sensors, I already said that the normality of their measures is unknown; also homoskedasticity cannot be assumed (and this prevents me from just taking all the raw measures and treating them as a single dataset).

The purpose is to combine the contributions of different approaches (sensors) that share the same purpose and use the same measuring unit. The idea is to squeeze the maximum information from each sensor, assuming that they are all good at measuring the molecule without ignoring that they may differ (i.e. no normality, no equal variance), and I do not know how important this is.

Stephen Tashi

> The aim is to publish a paper, but this technique is really new and no other works have been done previously.

The particular subject matter (measuring your molecule) may be new, but there have probably been many other papers published where some other thing was measured by a variety of instruments. Don't neglect to scan for those in the journals that you are considering.

Do you anticipate that a large part of the content of the paper will be to expound your statistical method?

> ... hence understanding the importance of all the significant aspects (like the overlapping, which rules out treating the data as a stratified sample). Then the answers will be logical.
Logic and mathematics can only proceed if they have sufficient information. It's unlikely that you have sufficient objective information. You must be willing to make some assumptions in order to get a mathematical answer.

My interests are in simulation and probabilistic modelling, so that will be my bias in advising you. In complex situations it is very useful to construct a simulation, even if you don't implement it as a computer program. The exercise of building a simulation helps you break the problem into manageable parts and leads to asking pertinent questions.

As an expert in this subject (whatever it is!) you probably know something about how the measuring instruments work. You may find that someone has published a simulation of how these instruments produce errors or you may be able to create such a simulation yourself. You might also find data about a different molecule of similar properties to yours where the true value is known and the values given by various sensors are listed. That would let you form some model of the errors.
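As a hypothetical example of such a simulation (all noise levels and subject counts below are made up): generate synthetic sensor readings around a known true value, then compare how well two candidate estimators recover it.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_once(true_value, sensor_sigmas, sensor_ns):
    """One synthetic data set: each sensor measures its own subset of
    subjects with its own (assumed normal) noise level."""
    return [true_value + rng.normal(0.0, s, n)
            for s, n in zip(sensor_sigmas, sensor_ns)]

true_value = 5.0
sigmas = [0.2, 0.5, 1.0]  # hypothetical per-sensor noise levels
ns = [15, 12, 18]         # how many subjects survived discarding

pooled_err, weighted_err = [], []
for _ in range(2000):
    data = simulate_once(true_value, sigmas, ns)
    # Estimator 1: pool all raw measures into one dataset.
    pooled = np.mean(np.concatenate(data))
    # Estimator 2: inverse-variance weighted mean of the sensor means.
    means = np.array([d.mean() for d in data])
    w = np.array(ns) / np.array(sigmas) ** 2
    weighted = np.sum(w * means) / np.sum(w)
    pooled_err.append((pooled - true_value) ** 2)
    weighted_err.append((weighted - true_value) ** 2)
```

With these (invented) noise levels, the weighted estimator's mean squared error comes out clearly lower than the pooled one's, which quantifies why ignoring the sensor identities can be costly.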

A robust method of statistical estimation is the Maximum Likelihood method. If you assume a probability distribution of errors for each sensor (not a family of distributions, but a specific distribution, perhaps a different one for each individual sensor) and you assume the true mean is a specific value x_bar, then you can calculate the probability of the data you observed. A computer program can try various values for x_bar and determine which of them gives the data the highest probability.
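A minimal sketch of that grid search, under the strong assumption of independent normal errors with a known standard deviation per sensor (the sigmas are placeholders for whatever error model you settle on):

```python
import numpy as np

def log_likelihood(mu, data_by_sensor, sigmas):
    """Log-probability of all observations if the true value is mu,
    assuming independent normal errors with a known sigma per sensor."""
    ll = 0.0
    for obs, sigma in zip(data_by_sensor, sigmas):
        obs = np.asarray(obs, dtype=float)
        ll += np.sum(-0.5 * ((obs - mu) / sigma) ** 2
                     - np.log(sigma * np.sqrt(2.0 * np.pi)))
    return ll

def mle_mean(data_by_sensor, sigmas, grid):
    """Try each candidate true value on the grid; keep the most likely."""
    lls = [log_likelihood(mu, data_by_sensor, sigmas) for mu in grid]
    return float(grid[int(np.argmax(lls))])
```

For normal errors with known variances the maximum sits at the inverse-variance weighted mean, so the grid search only becomes interesting once you substitute non-normal error distributions for each sensor.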

> The purpose is to combine the contributions of different approaches (sensors) that share the same purpose and use the same measuring unit.
The term "sensor fusion" is often used when people pursue this goal. You might find relevant material by searching for it. I, myself, have never seen any work with that title that impressed me.