Calculating variance of a single statistic from different collections of data

  1. Sep 30, 2012 #1
    Let's assume there's a magical ball that bounces once every 0.5 seconds, and that there is a machine designed to measure the rate of the ball's bounce (with some error). After taking numerous readings of the ball's bounce with this machine, you have a collection of values describing the rate at which this particular ball bounces.

    Now, say you have another magical ball that bounces once every 0.25 seconds, and you use the same machine to read the rate at which this other ball bounces. You now have data on both the bounce-per-0.5-second ball and the bounce-per-0.25-second ball.

    Question: How would one be able to relate the accuracy and precision of this machine between the readings of the two balls?

    Question from different perspective: Could you relate the standard deviation from readings from the first ball to readings from the second?

    Question from another perspective: Would it be possible, after collecting readings from the first ball and the second ball, to estimate the readings present in a third magic ball which bounced at a different rate?

    Question from yet another perspective: How would you analyze the machine's precision off a collection of balls with different bounce frequencies?

    Any feedback would be greatly appreciated. If this type of statistical analysis exists, I'd especially like to know its name. I'm having immense trouble finding it using Google.

    Last edited: Sep 30, 2012
  3. Sep 30, 2012 #2

    Stephen Tashi

    Science Advisor

    It will be important to clarify whether the periodic motion in this problem really has any impact on the statistical analysis. In other words, is there any difference in analyzing these measurements vs analyzing two sets of measurements with a scale, one set on a 0.5 kg mass and the other on a 0.25 kg mass?

    You say the machine measures the "rate" at which the ball bounces and not something like the times when the ball lands. If I imagine a real machine to do this job, it would have an "integration time". Over a certain time interval, it would detect the ball interrupting light to a photodiode. It would record the times when this happened. Then it would produce a rate based on the average time between these events, using all the events it recorded in the integration time interval. If it measured a ball bouncing every 0.25 seconds, it would have more measurements to average than with the slower ball. So the final output of the measurement as a rate might have different precision for balls bouncing at different rates.

    Are we to assume the machine is just a "black box" that has the same precision regardless of the rate of bounce of the balls?
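To make the integration-time idea concrete, here is a rough simulation sketch. Everything in it is an assumption made for illustration rather than something specified in the thread: the 10-second window, the Gaussian timing jitter of 0.01 s on each detected bounce, and the function names `measured_period` and `period_spread`.

```python
import random
import statistics

def measured_period(true_period, window, jitter_sd, rng):
    """One hypothetical machine reading: mean interval between detected bounces."""
    n_events = int(window / true_period) + 1
    # Each bounce time is the ideal time plus assumed Gaussian timing jitter.
    times = [k * true_period + rng.gauss(0.0, jitter_sd) for k in range(n_events)]
    intervals = [later - earlier for earlier, later in zip(times, times[1:])]
    return statistics.fmean(intervals)

def period_spread(true_period, window=10.0, jitter_sd=0.01, trials=2000, seed=0):
    """Standard deviation of repeated period readings for one ball."""
    rng = random.Random(seed)
    estimates = [measured_period(true_period, window, jitter_sd, rng)
                 for _ in range(trials)]
    return statistics.stdev(estimates)

slow = period_spread(0.5)   # ball bouncing every 0.5 s
fast = period_spread(0.25)  # ball bouncing every 0.25 s
print(slow, fast)
```

Under these assumptions the spread of the period readings comes out smaller for the faster ball, because more intervals fit into the same window. A different machine model could behave differently, which is why the black-box question matters.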
     
  4. Sep 30, 2012 #3
    On an added note, how would you create a single normal distribution of the machine's readings when the machine reads different balls?

    EDIT: And yes, it's best to treat the machine as a black box with the same precision no matter how long it's been looking at a ball.
     
  5. Oct 1, 2012 #4

    Stephen Tashi

    Science Advisor

    Why would you want to combine readings from both balls into one distribution? Did you mean you want a single normal distribution of the errors the machine makes?

    If this is a real-life problem, it would be best if you described that problem rather than attempting to abstract the mathematics from it on your own. If this is a hypothetical problem, then we can discuss it that way. If the machine is being treated as a black box, it would be simpler to forget about the bouncing balls and just think about a machine that measures weights, heights, etc.
     
  6. Oct 1, 2012 #5
    The overall goal is to create a normal distribution which can be used to visually describe the accuracy and precision of something, when the 'something' is used on a multitude of different objects.

    Granted, the ball-rate-measuring-machine was an example. One could use anything that can measure anything.
     
  7. Oct 2, 2012 #6

    Stephen Tashi

    Science Advisor

    Assume the variance of the errors of the device doesn't depend on the true magnitude of what it is measuring. (For example, assume the variability of the error a scale makes doesn't increase with the weight of the thing measured, as it would if the scale had a constant variability in its percentage error rather than a constant variability in error.)

    Assume that the mean error of the device is 0.

    Given that you know the correct value of the thing being measured, you can compute the error made on each measurement. You can combine all these observations of errors into one data set and estimate the mean and variance of the normal distribution that fits it. Is that what you want?
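As a minimal sketch of that pooling step, assuming the true value for each object is known: subtract the true value from every reading, put all the errors in one list, and estimate the mean and variance. The function name `pooled_error_stats` and the readings in `data` are made up for illustration.

```python
import statistics

def pooled_error_stats(readings_by_true_value):
    """Pool errors (reading - true value) across all objects.

    readings_by_true_value maps each known true value to its list of readings.
    Returns (mean error, sample variance of the errors).
    """
    errors = [reading - true_value
              for true_value, readings in readings_by_true_value.items()
              for reading in readings]
    mean_error = statistics.fmean(errors)
    var_error = statistics.variance(errors)  # sample variance, n - 1 denominator
    return mean_error, var_error

# Hypothetical readings for the 0.5 s ball and the 0.25 s ball.
data = {
    0.5: [0.52, 0.49, 0.51, 0.47],
    0.25: [0.26, 0.23, 0.27, 0.24],
}
mean_error, var_error = pooled_error_stats(data)
print(mean_error, var_error)
```

If the mean error really is taken to be 0, the variance could instead be estimated as the average of the squared errors. The version above estimates the mean from the data, which also gives a check on whether the zero-mean assumption looks plausible.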
     
  8. Oct 2, 2012 #7
    Percentage error could be huge in some cases. If the scale were inaccurate enough that its error is comparable to the weight of the object being measured, the percent error could exceed 1 (that is, 100%).

    I suppose it would depend on the things being weighed. If the error depended on the object being measured, it would be percentage error. If not, it would be a flat plus-or-minus. Both normal distributions would have a high-point at 0. I suppose one would need to look at actual results of said hypothetical data in order to look at how best to group it.

    Thanks!
     
  9. Oct 2, 2012 #8

    Stephen Tashi

    Science Advisor

    Yes, you'd have to look at actual data or know the physics involved to determine if the variability of the error depends on the magnitude of the thing being measured. So the problem also includes estimating a formula for this dependence.
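One simple way to look for such a dependence, sketched here under the assumption that a straight line is an adequate model: compute the standard deviation of the errors for each object separately, then fit sd = intercept + slope * (true value) by ordinary least squares. The function name `fit_sd_vs_magnitude` and the sample data are hypothetical, and the linear form is a choice made for illustration, not something the data or the physics guarantees.

```python
import statistics

def fit_sd_vs_magnitude(readings_by_true_value):
    """Least-squares line through (true value, sd of errors) points.

    Needs at least two distinct true values, each with two or more readings.
    Returns (intercept, slope).
    """
    xs, sds = [], []
    for true_value, readings in readings_by_true_value.items():
        errors = [reading - true_value for reading in readings]
        xs.append(true_value)
        sds.append(statistics.stdev(errors))
    x_bar = statistics.fmean(xs)
    s_bar = statistics.fmean(sds)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sum((x - x_bar) * (s - s_bar) for x, s in zip(xs, sds)) / sxx
    intercept = s_bar - slope * x_bar
    return intercept, slope

# Hypothetical data in which the error grows in proportion to the true value.
sample = {1.0: [0.9, 1.1], 2.0: [1.8, 2.2], 3.0: [2.7, 3.3]}
intercept, slope = fit_sd_vs_magnitude(sample)
print(intercept, slope)
```

A slope near 0 would be consistent with a flat plus-or-minus error, while an intercept near 0 with a clearly positive slope would point toward a roughly constant percentage error. With only a few objects the fit is crude, so it is better treated as a diagnostic than as a final model.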
     