Can temperature data be averaged to a greater resolution than the raw data?

  • Thread starter: Jobrag
  • Tags: Data, Temperature

Summary
The discussion centers on the validity of averaging temperature data to a higher precision than the original measurements. Participants debate whether averaging can yield more precise values than the precision of the raw data, with some arguing that averaging multiple measurements can improve the estimate's accuracy. However, others maintain that the computed average should not exceed the precision of the individual measurements, emphasizing the distinction between accuracy and precision. The conversation also touches on the implications of measurement errors and the nature of continuous data, suggesting that averaging multiple readings can provide a more reliable estimate, but only if the measurements are consistent. Ultimately, the debate highlights the complexities of statistical interpretation in temperature data analysis.
  • #31
zgozvrm said:
I'm not saying that the average is exact, only that by adding that extra degree of precision, we are closer to the exact value over many measurements, which goes back to the original question.

Agreed, we don't know what the exact average is if even one measurement is not known exactly.
Every time you measure something, the measurement is inexact.
zgozvrm said:
But, by taking the average of the values you do have, I'm simply saying that you can add that extra decimal place (more precision), as it is likely to be closer to the actual average than if you don't
 
  • #32
Mark44 said:
Every time you measure something, the measurement is inexact.

Not true. Counting items is a measurement of quantity.

As for measurements not involving counts ("real" measurements), we can only be as accurate as our measuring devices allow. In most cases, we estimate the last figure. For example, when reading a ruler with graduations every 0.1 inch, we can usually estimate a measurement fairly accurately to the nearest 0.025 inch (one quarter of a graduation). Measurements from a ruler with 0.020-inch graduations, though, can probably not be estimated any more accurately than that (with the naked eye). Whatever our means of measurement, we should only make estimates to the degree that we can guarantee the accuracy. (In the case of the 0.020-inch ruler, it would be ridiculous to try to estimate to the nearest 0.005 inch, or 1/4 of a graduation.)

So, as long as we can guarantee our accuracy to a given level, we can then take an average of several of those readings at the next higher level of accuracy (one more decimal place).

If I take measurements with my 0.020-inch ruler of 1.26, 3.68, and 2.42, the sum is 7.36 and the average is 2.453. Given that our original 3 measurements are only good to the nearest 0.020 inch, the average of 2.453 is a truer representation of the average than 2.45.
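The claim above can be checked with a quick simulation. This is only an editorial sketch, not zgozvrm's program: it assumes each true length lies within ±0.01 inch of its reading (half a 0.020-inch graduation) and counts how often the extra-digit average 2.453 lands closer to the true mean than the truncated 2.45.

```python
import random

readings = [1.26, 3.68, 2.42]        # readings to the nearest 0.020 inch
avg = sum(readings) / len(readings)  # 2.4533...

# Assume each true length is the reading plus a uniform error in +/-0.01.
# Count how often 2.453 beats 2.45 as an estimate of the true mean.
wins = 0
trials = 100_000
for _ in range(trials):
    true_mean = sum(r + random.uniform(-0.01, 0.01) for r in readings) / 3
    if abs(true_mean - 2.453) < abs(true_mean - 2.45):
        wins += 1
print(round(avg, 4), wins / trials)
```

Under this (assumed) error model, the extra digit wins well over half the time, which is the sense in which it is a "truer representation."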
 
  • #33
When people talk about "measuring" something, they almost never mean this in the sense of counting things.

There is some statistical justification for your extra place of precision, which I believe statdad hinted at many posts ago in this thread - the standard error of the mean. If you take repeated measurements of something with an arbitrary distribution, the distribution of their mean will be approximately normal, and its standard deviation will be \sigma/\sqrt{n}. This has the effect of bunching the means more tightly around the population mean (which we don't know).
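The \sigma/\sqrt{n} behavior described above is easy to demonstrate empirically. The sketch below (editorial, not from the thread) draws many samples of size n from a deliberately non-normal population, an exponential with \sigma = 1, and checks that the spread of the sample means matches the prediction.

```python
import random
import statistics

# Repeatedly sample n values from an exponential(1) population (sigma = 1)
# and compare the spread of the sample means to sigma / sqrt(n).
random.seed(1)
sigma = 1.0
n = 25
means = [statistics.mean(random.expovariate(1.0) for _ in range(n))
         for _ in range(20_000)]
observed = statistics.stdev(means)
predicted = sigma / n ** 0.5  # 0.2
print(observed, predicted)
```

The observed standard deviation of the means comes out very close to 0.2, even though the underlying population is strongly skewed.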
 
  • #34
zgozvrm said:
What I am saying is that 42.75 is a better approximation of the average between 42.5 and 43.0 than either 42.7 or 42.8.

Not true, as has been pointed out - if that were the case, the entire idea of confidence intervals and estimation would be worthless.

If the discussion is about means (I assume that is what "average" refers to here), the idea is that when repeated samples are taken, the sample-to-sample variability in their means is

\frac{\sigma}{\sqrt n}

where the numerator is the population standard deviation. If this isn't known, the sample standard deviation is used instead. This quantity is called the "standard error."

The idea is that gathering a sample and averaging smooths out the variability that occurs from individual to individual, and so statements about the mean of a group are more accurate than statements about individual values.

Caveats: the "root n" decrease is correct for random samples; even a little correlation among the items in the sample can give different results. Hampel & Ronchetti, in "Robust Statistics: The Approach Based on Influence Functions" (think I remember the title correctly: my copy is upstairs) discuss, in the final chapter, some of the real problems that can occur with data when this law isn't satisfied.
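The caveat about correlation can be illustrated with a small simulation (an editorial sketch; the AR(1) error model and parameter values are assumptions for illustration, not anything from the thread). With positively correlated "measurements," the sample-to-sample spread of the mean is noticeably larger than the root-n rule predicts.

```python
import random
import statistics

# AR(1)-correlated measurements: each value is rho times the previous one
# plus fresh noise, scaled so the stationary variance is 1.
random.seed(2)
rho, n, reps = 0.6, 200, 5_000

def ar1_mean():
    x, total = random.gauss(0, 1), 0.0
    for _ in range(n):
        x = rho * x + random.gauss(0, (1 - rho**2) ** 0.5)
        total += x
    return total / n

observed = statistics.stdev(ar1_mean() for _ in range(reps))
iid_prediction = 1 / n ** 0.5  # what the root-n rule would claim
print(observed, iid_prediction)
```

With rho = 0.6 the true standard error is inflated by roughly a factor of sqrt((1 + rho)/(1 - rho)) = 2 over the i.i.d. prediction, so "even a little correlation" matters.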
 
  • #35
Then why, in a program that randomly selects two values A and B to 12 decimal places, such that 42.45 <= A < 42.55 and 42.95 <= B < 43.05 (so that they round to 42.5 and 43.0, respectively), and then takes their average M = (AB)/2 to 13 decimal places, do I find after 10,000 iterations that 42.75 is closer to the actual average M than 42.8 is 85.44% of the time?
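The original program is not shown, but it can be sketched as follows (taking "(AB)/2" to mean (A + B)/2, the reading statdad also assumes later in the thread):

```python
import random

# Draw A and B uniformly from the ranges of values that round to 42.5 and
# 43.0, average them, and count how often 42.75 is closer to the true
# average M than 42.8 is.
random.seed(0)
closer = 0
trials = 10_000
for _ in range(trials):
    a = random.uniform(42.45, 42.55)  # rounds to 42.5
    b = random.uniform(42.95, 43.05)  # rounds to 43.0
    m = (a + b) / 2
    if abs(m - 42.75) < abs(m - 42.8):
        closer += 1
print(closer / trials)
```

Analytically, M - 42.75 has a symmetric triangular distribution on (-0.05, 0.05), and 42.75 wins whenever M < 42.775, which happens with probability 1 - (1/2)(1/2)^2 = 87.5%; the reported 85.44% is consistent with a run of 10,000 iterations.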
 
  • #36
If I understand #35 correctly, your program is generating a pair of random numbers, A and B. A has a uniform distribution on [42.45, 42.55), so the mean \mu_A = 42.5. B has a uniform distribution on [42.95, 43.05), so its mean is \mu_B = 43. If you calculate \frac{A+B} 2 a large number of times (I assume you mean this instead of (AB)/2), you should expect most of the results to be close to

\mu_{\frac{A+B}2} = \frac{\mu_A + \mu_B}2 = 42.75

I don't know what you mean by this:

" I show that 85.44% of the time, 42.75 is closer to the actual average M than 42.8?"
 
  • #37
statdad said:
If I understand #35 correctly, your program is generating a pair of random numbers, A and B. A has a uniform distribution on [42.45, 42.55), so the mean \mu_A = 42.5. B has a uniform distribution on [42.95, 43.05), so its mean is \mu_B = 43.

Yes, but the point wasn't so much that the mean values were 42.5 and 43 as that the values, when rounded to the nearest 0.1, were 42.5 and 43 (basically the same difference).

And, for instance, A might be the value 42.45032157926 and B might be 43.049000783 in one iteration of the program.

statdad said:
If you calculate \frac{A+B} 2 a large number of times (I assume you mean this instead of (AB)/2), ...

Yes, that is what I meant

statdad said:
... you should expect most of the results to be close to

\mu_{\frac{A+B}2} = \frac{\mu_A + \mu_B}2 = 42.75

and they are, as shown by my program.

statdad said:
I don't know what you mean by this:

" I show that 85.44% of the time, 42.75 is closer to the actual average M than 42.8?"

I counted the number of times 42.75 was closer to the actual average and compared that sum to the number of times 42.8 was closer to the actual average. 42.75 was closer 85.44% of the time.

Thus making my point that 42.75 is a better representation of the average between measurements of 42.5 and 43, given that these measurements may actually be any value in the ranges given above (to any number of decimal places).
 
  • #38
Then this shouldn't be a surprise - this is exactly what SHOULD happen. I'm still not sure why you're mentioning 42.8.
 
  • #39
statdad said:
Then this shouldn't be a surprise - this is exactly what what SHOULD happen. I'm still not sure why you're mentioning 42.8.

It's not a surprise to me! According to Mark44, in post #24, there is no justification for "the 5" in 42.75, therefore he wants to say that the average is either 42.7 or 42.8. I chose 42.8 since 42.75 rounds to 42.8 (obviously, the result would've been the same, had I chosen 42.7).
 
  • #40
zgozvrm said:
It's not a surprise to me! According to Mark44, in post #24, there is no justification for "the 5" in 42.75, therefore he wants to say that the average is either 42.7 or 42.8. I chose 42.8 since 42.75 rounds to 42.8 (obviously, the result would've been the same, had I chosen 42.7).

If you're going to cite what I said, please get it right. Here's what I said at the end of post #24 (complete with two typos I made).
Mark44 said:
If you take the average of the two rod lengths, you get 42.75 +/- 0.05, which represents a number somewhere between 42.7 and 42.8. As it happens, 42.75 is right smack in the middle of that interval, by there is no justifaction whatsoever for the 5 in the hundredths' place.
I DID NOT say that the average was either 42.7 or 42.8, as the quote above shows. Since we are dealing with only two measurements, and the measurements are of two different things, we can't invoke the increased precision that would come about from numerous (i.e., much more than two) measurements of the same thing (the rods are different).
 
  • #41
Mark44 said:
I DID NOT say that the average was either 42.7 or 42.8, as the quote above shows.

No, you didn't say that the average was either 42.7 or 42.8, but you DID say that there was no justification for the 5 in the hundredths' place. I'm saying that there IS.

Mark44 said:
Since we are dealing with only two measurements

All my posts refer to multiple measurements, which pertains to the OP.
 
  • #42
But you were responding to my post, and my example had two rods, with each one measured only once.
 