How do I calculate the error of my data?

  • Context: Graduate
  • Thread starter: tjosan
  • Tags: Data, Error

Discussion Overview

The discussion revolves around calculating the error associated with mass flow rate measurements obtained from repeated experiments. Participants explore methods for integrating mass flow rates over time using the trapezoid rule and seek to quantify the uncertainty in their results based on multiple repetitions of the experiment.

Discussion Character

  • Technical explanation
  • Mathematical reasoning
  • Debate/contested

Main Points Raised

  • One participant outlines a method for calculating the mean mass flow rate and standard deviation from three repetitions of an experiment, using the trapezoid rule for integration.
  • Another participant questions the assumption that the time dependence of mass flow should be the same across all repetitions and highlights the need to account for correlated uncertainties when calculating total area.
  • A participant explains that when a value is multiplied by a constant, its standard deviation scales linearly with that constant while its variance scales with the square of it, and corrects the proposed total-variance formula accordingly.
  • Concerns are raised about the impact of random errors versus sample inhomogeneity on the standard deviation, with a participant noting that the standard deviation may reflect sampling quality rather than measurement uncertainty.
  • Further discussion includes the process of calculating total mass and its associated error when multiple chemicals are measured, with participants proposing methods for combining variances from different chemicals.
  • A later reply emphasizes the importance of ensuring that the variance calculations align with error propagation principles, particularly when dealing with independent measurements.

Areas of Agreement / Disagreement

Participants express differing views on how to properly account for uncertainties and correlations in their calculations. While some methods are proposed, no consensus is reached on the best approach to calculate the total error.

Contextual Notes

Participants note potential limitations in their assumptions regarding the independence of measurements and the nature of the errors involved, particularly in relation to sample inhomogeneity.

Who May Find This Useful

This discussion may be useful for researchers or students involved in experimental physics or engineering who are interested in data analysis, error propagation, and uncertainty quantification in repeated measurements.

tjosan
Hello. I have 3 repetitions of an experiment. The data consist of mass flow rates vs. time.
What I want to do is calculate the mass vs. time, i.e. integrate my data.

I consider the resolution for each measurement to be exactly 3 minutes. To calculate the area I use the trapezoid
rule, i.e. (y(i)+y(i+1))/2*\Delta X = (y(i)+y(i+1))/2*3

Now I want to calculate the error. I know that using the trapezoid rule introduces errors, but since I
have no way of knowing how this affects my result, I don't care about that particular error. Rather, I want to know
the error between my 3 repetitions.
My data look like this:
X       #1    #2    #3
0       y11   y21   y31
3       y12   y22   y32
6       y13   y23   y33
.       .     .     .
.       .     .     .
.       .     .     .
3(n-1)  y1n   y2n   y3n
where y is the mass flow rate and X is the time in minutes.

This is my approach:
Calculate the mean mass flow rate, i.e
yavg(1)=(y11+y21+y31)/3
yavg(2)=(y12+y22+y32)/3
yavg(3)=(y13+y23+y33)/3
...
yavg(n)=(y1n+y2n+y3n)/3

and also the standard deviation

ystd(1)=std(y11,y21,y31)
ystd(2)=std(y12,y22,y32)
ystd(3)=std(y13,y23,y33)
...
ystd(n)=std(y1n,y2n,y3n)

Then use the trapezoid rule:
Area(1)=(yavg(1)+yavg(2))/2*3
Area(2)=(yavg(2)+yavg(3))/2*3
...
Area(n-1)=(yavg(n-1)+yavg(n))/2*3
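
The steps so far can be sketched in Python as follows. This is only a minimal illustration: the flow-rate values in `runs` are made up, and `stdev` is the sample standard deviation (dividing by n - 1), which is an assumption since the post does not say which definition of std is meant.

```python
from statistics import mean, stdev

DT = 3.0  # minutes between samples, as stated in the post

# Hypothetical flow rates: one inner list per repetition,
# one entry per time point (0, 3, 6, 9 minutes).
runs = [
    [1.0, 2.0, 4.0, 3.0],
    [1.2, 2.2, 3.8, 3.0],
    [0.8, 1.8, 4.2, 3.0],
]

# zip(*runs) regroups the values by time point:
# (y11, y21, y31), (y12, y22, y32), ...
yavg = [mean(col) for col in zip(*runs)]
ystd = [stdev(col) for col in zip(*runs)]  # sample std (n - 1), an assumption

# Trapezoid area of each 3-minute segment between neighbouring time points.
areas = [(yavg[i] + yavg[i + 1]) / 2 * DT for i in range(len(yavg) - 1)]
total_mass = sum(areas)
```

Note that each interior `yavg` value enters two neighbouring areas, which matters for how the per-segment errors may be combined.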

But how do I calculate the error? This is what I think:
Error(1)=sqrt((ystd(1)^2+ystd(2)^2)/2*3)
Error(2)=sqrt((ystd(2)^2+ystd(3)^2)/2*3)

or should it be like this:
Error(1)=sqrt(ystd(1)^2+ystd(2)^2) ??

Area(1)+-Error(1)
Area(2)+-Error(2)

.. etc

And suppose I want to calculate the total mass, i.e Area(1)+Area(2)...+Area(n-1), then what will the error be?

Thanks
 
Do you have a good reason to expect the same time dependence of mass flow for all three repetitions?

You have to be careful to account for the uncertainty exactly once: the uncertainty on area 1 and on area 2 are correlated.
Luckily, your trapezoid rule can be simplified: total area is 3 minutes * (1/2 yavg(1) + yavg(2) + yavg(3) + ... + 1/2 yavg(n)). The uncertainties on those averages now can* be independent, and you can add them in quadrature.

*something that would need more investigation: what leads to different measurements between the repetitions?
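
A rough sketch of this point, with made-up means and standard deviations and assuming the per-time-point averages really are independent: the weighted sum gives the same total as adding the segments one by one, but adding per-segment errors in quadrature as if the segments were independent gives a different (here smaller) total uncertainty, because neighbouring segments share a data point.

```python
import math

DT = 3.0  # minutes between samples

# Hypothetical per-time-point means and standard deviations.
yavg = [1.0, 2.0, 4.0, 3.0]
ystd = [0.2, 0.2, 0.2, 0.0]

# Weights of the simplified trapezoid rule: 1/2 at both ends, 1 in between.
weights = [0.5] + [1.0] * (len(yavg) - 2) + [0.5]

total_area = DT * sum(w * y for w, y in zip(weights, yavg))

# Same total obtained segment by segment.
segment_sum = sum((yavg[i] + yavg[i + 1]) / 2 * DT for i in range(len(yavg) - 1))

# Quadrature sum over the independent averages (each counted exactly once).
total_std = DT * math.sqrt(sum((w * s) ** 2 for w, s in zip(weights, ystd)))

# For contrast: treating the segment areas as independent ignores that
# neighbouring segments share a point, so interior points are mishandled.
segment_errs = [DT / 2 * math.sqrt(ystd[i] ** 2 + ystd[i + 1] ** 2)
                for i in range(len(ystd) - 1)]
naive_std = math.sqrt(sum(e ** 2 for e in segment_errs))
```

With these numbers `total_std` is about 0.9 while `naive_std` is about 0.67, so the shortcut would understate the uncertainty.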
 
Hi, thank you for your answer.

I am using an instrument (gas chromatograph) to measure concentrations for different chemicals which I then relate to their mass. The chemicals are produced in the same way for all three experiments so I expect the result to be "the same".

Since I am multiplying and dividing my areas by constants, should I do the same to the uncertainties? I.e. std=sqrt(3*(var_1*0.5+var_2+...+var_n*0.5)), where var is the variance.

Random error would be the reason for having different values, although I suspect sample inhomogeneity probably has a larger impact. If this is the case, I guess the standard deviation is more a measure of how well I sample than of the uncertainty of the measurements :) But I assume that I only have random errors since I normalize my results.
 
tjosan said:
Since I am multiplying and dividing my areas by constants, should I do the same to the uncertainties?
Sure. But do it outside the square root. Standard deviations scale linearly if you just multiply the uncertain value with some constant, variances scale with the square of that value. This also means your *0.5 should be *0.25.
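
A quick numerical check of that scaling rule, using three made-up measurements and the end-point weight 0.5 as the constant (a sketch only, with the sample standard deviation as an assumed definition):

```python
from statistics import stdev, variance

c = 0.5                    # constant multiplier (the trapezoid end-point weight)
values = [1.0, 1.2, 0.8]   # hypothetical repeated measurements
scaled = [c * v for v in values]

# The standard deviation scales with c, the variance with c squared.
std_ratio = stdev(scaled) / stdev(values)
var_ratio = variance(scaled) / variance(values)
```

Here `std_ratio` comes out as 0.5 and `var_ratio` as 0.25, which is why the weight inside a variance sum is 0.25 rather than 0.5.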
tjosan said:
If this is the case, I guess the standard deviation is more a measure of how well I sample than of the uncertainty of the measurements :)
That is a possible issue.
tjosan said:
But I assume that I only have random errors since I normalize my results.
Normalize to what? That could be relevant as well.
 
mfb said:
Normalize to what? That could be relevant as well.
Actually, I realized that the data are not normalized, so never mind what I wrote.

mfb said:
Sure. But do it outside the square root. Standard deviations scale linearly if you just multiply the uncertain value with some constant, variances scale with the square of that value. This also means your *0.5 should be *0.25.
Thank you.

So if I understand you correctly, I should calculate the error like this:

## 3^2 \left(0.5^2 s_1^2 + s_2^2 + \dots + 0.5^2 s_n^2\right) = \text{total variance} ##

One more question.

The instrument detects several chemicals. So in addition to calculating the mass for each chemical, I also need to add those masses together to get the total mass. I assume that I just repeat the process for each chemical and, in the end, add all the variances and take the square root to get the total standard deviation.

E.g.

Chemical A:
Average mass of ##A = 3*(A_1*0.5+A_2+...+A_n*0.5)##
Variance of ##A = 3^2*(S_{A1}^2*0.5^2+S_{A2}^2+...+S_{An}^2*0.5^2)##

Chemical B:
Average mass of ##B = 3*(B_1*0.5+B_2+...+B_n*0.5)##
Variance of ##B = 3^2*(S_{B1}^2*0.5^2+S_{B2}^2+...+S_{Bn}^2*0.5^2)##
.
.
.
Chemical K:
Average mass of ##K = 3*(K_1*0.5+K_2+...+K_n*0.5)##
Variance of ##K = 3^2*(S_{K1}^2*0.5^2+S_{K2}^2+...+S_{Kn}^2*0.5^2)##

Total mass:
##{\text{Average mass of A}}+{\text{Average mass of B}}+...+{\text{Average mass of K}}##
Total error:
##\sqrt{{\text{Variance of A}}+{\text{Variance of B}}+...+{\text{Variance of K}}}##
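
The per-chemical procedure might be sketched like this. The helper `integrate`, the chemical names and all numbers are hypothetical, and the final quadrature sum assumes the chemicals' measurements are independent of each other.

```python
import math

DT = 3.0  # minutes between samples


def integrate(yavg, ystd, dt=DT):
    """Return (mass, variance) from the weighted trapezoid sum.

    Assumes at least two time points and independent per-point averages.
    """
    weights = [0.5] + [1.0] * (len(yavg) - 2) + [0.5]
    mass = dt * sum(w * y for w, y in zip(weights, yavg))
    var = dt ** 2 * sum((w * s) ** 2 for w, s in zip(weights, ystd))
    return mass, var


# Hypothetical data per chemical: (means, standard deviations) per time point.
chemicals = {
    "A": ([1.0, 2.0, 1.0], [0.2, 0.1, 0.2]),
    "B": ([0.5, 0.5, 0.5], [0.0, 0.2, 0.0]),
}

results = {name: integrate(y, s) for name, (y, s) in chemicals.items()}

# Masses add directly; variances add only if the chemicals are independent.
total_mass = sum(mass for mass, _ in results.values())
total_std = math.sqrt(sum(var for _, var in results.values()))
```

If the chemicals share a common error source (for example the same sampling step), the variances would no longer simply add.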
 
tjosan said:
So if I understand you correctly, I should calculate the error like this:

## 3^2 \left(0.5^2 s_1^2 + s_2^2 + \dots + 0.5^2 s_n^2\right) = \text{total variance} ##
Right.
The instrument detects several chemicals. So in addition to calculating the mass for each chemical, I also need to add those masses together to get the total mass. I assume that I just repeat the process for each chemical and, in the end, add all the variances and take the square root to get the total standard deviation.
If all those measurements are independent, that works.
 
Note that your variance is now the same as what you would get from error propagation (with no correlation between your ##x_i##):
You have the variable ##f(x_1,x_2,...,x_n) = a (b x_1 + x_2 + ... + b x_n)##
and its standard deviation will be ##\sigma_f = \sqrt{\Big(\frac{\partial f}{\partial x_1} \sigma_{x_1}\Big)^2+ \Big(\frac{\partial f}{\partial x_2} \sigma_{x_2} \Big)^2 + ... + \Big(\frac{\partial f}{\partial x_n} \sigma_{x_n} \Big)^2}##
or ##\sigma_f^2 \equiv \mathrm{Var}(f) = a^2 b^2 \mathrm{Var}(x_1) + a^2 \mathrm{Var}(x_2) + ... + a^2 b^2 \mathrm{Var}(x_n) = a^2 \big( b^2 \mathrm{Var}(x_1) + \mathrm{Var}(x_2) + ... + b^2 \mathrm{Var}(x_n) \big)##
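
As a sanity check, the propagation formula can be evaluated numerically and compared with the closed form. The sketch below uses a = 3 and b = 0.5 as in the thread; the input values, the helper name `propagate` and the step size `h` are arbitrary choices for illustration, and the inputs are assumed uncorrelated.

```python
import math


def f(x, a=3.0, b=0.5):
    """f(x_1..x_n) = a * (b*x_1 + x_2 + ... + x_(n-1) + b*x_n)."""
    return a * (b * x[0] + sum(x[1:-1]) + b * x[-1])


def propagate(func, x, sigma, h=1e-6):
    """Uncorrelated error propagation using central-difference derivatives."""
    total = 0.0
    for i, s in enumerate(sigma):
        up = list(x)
        down = list(x)
        up[i] += h
        down[i] -= h
        dfdx = (func(up) - func(down)) / (2 * h)  # approximates df/dx_i
        total += (dfdx * s) ** 2
    return math.sqrt(total)


# Hypothetical means and standard deviations at four time points.
x = [1.0, 2.0, 4.0, 3.0]
sigma = [0.2, 0.2, 0.2, 0.1]

a, b = 3.0, 0.5
numeric = propagate(f, x, sigma)
closed_form = a * math.sqrt(
    (b * sigma[0]) ** 2 + sum(s ** 2 for s in sigma[1:-1]) + (b * sigma[-1]) ** 2
)
```

Because f is linear in every ##x_i##, the two results agree up to rounding; for a nonlinear f the propagation formula would only be a first-order approximation.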
 
