Standard Error of Replicates, Each With Standard Error

  1. Jun 16, 2011 #1
    So I have a situation that I keep confusing myself with:

    I am running a computer simulation that, after a set number of iterations (a block) will output the running average (mean) of a property and the standard error based on the previous blocks. So, as the simulation runs, these standard errors tend to shrink because the simulation settles into its "equilibrium".

    The problem is, I have 6 of these simulations (all the same, replicates), each with their own means and standard errors for each block during the simulation. I could easily find the mean and standard errors of the means, but what about the standard error associated with each replicate? Is there any way to legitimately average those?

    Thanks for any help.
  3. Jun 16, 2011 #2

    Stephen Tashi

    Science Advisor

    It's unlikely that people are going to understand exactly what you mean by "block" or "replicates" or what it means for 6 simulations to be "all the same, replicates". Whether it is legitimate to average a set of numbers will depend not only on what the numbers represent, but what you intend to do with the average. I suggest that you describe your work more clearly, including what you intend to do with any statistics that you compute. (Does "replicate" mean "replication"? I've most often heard (in the USA) people speak of "replications" of a simulation. I haven't heard the term "replicate" used.)
  4. Jun 17, 2011 #3
    Sure, he is confusing. This is what I made of it:

    He has 6 simulations.
    Each simulation has "X" number of iterations/blocks.
    He has computed the mean and standard deviation of a property for each of the simulations.

    What I cannot make out is "I could easily find the mean and standard errors of the means, but what about the standard error associated with each replicate? Is there anyway to legitimately average those?"
  5. Jun 17, 2011 #4
    I'm sorry about the confusion. I don't mean to be patronizing in any way, but I think a simpler analogy might be much better.

    Say, for instance, that I am interested in the speed a ball rolls down a large hill. Somehow, I can measure it every 1/2 second as it rolls. I then calculate a running average of all measurements as well as a running standard error.

    The running standard error would obviously be undefined for the first measurement, but would get smaller and smaller as subsequent measurements were taken because N increases (assuming the ball maintained a fairly constant speed).
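    A minimal sketch of the running mean and running standard error described above. It assumes the measurements can be treated as independent samples (consecutive readings from one descent may well be correlated, in which case this standard error would be too small). The function name `running_mean_and_se` and the speed values are made up for illustration.

```python
import math

def running_mean_and_se(samples):
    """Return a list of (mean, standard_error) pairs, one per sample seen so far.

    The standard error is None after the first sample, where it is undefined.
    """
    results = []
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the mean (Welford's update)
    for x in samples:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
        if n < 2:
            results.append((mean, None))
        else:
            sample_var = m2 / (n - 1)            # unbiased sample variance
            results.append((mean, math.sqrt(sample_var / n)))
    return results

speeds = [2.0, 4.0, 6.0]  # made-up speed readings, one per half second
history = running_mean_and_se(speeds)
```

    Each entry of `history` is the pair (running mean, running standard error) after that measurement, so the list traces how both quantities evolve during one descent.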

    When the ball reaches the bottom, I then have running averages for the entire descent, as well as running standard errors associated with them.

    Now, assume I wanted more confidence in that data, so I decide to measure the ball rolling down the hill a second time. Then a third, fourth, etc. up to six. Each time, I have running averages and standard errors.

    I'm not just interested in the final speed based on the final measurements, but I am interested specifically in how that running mean and error change over time (i.e. how the mean "equilibrates" and how the error shrinks). Another assumption here is that all of these times matched up perfectly. For example, I have measurements for all six experiments at 0 sec, 1/2 sec, 1 sec, ... and I want to combine all six running means and running errors at each point in time along the descent. Combining the means seems easy (just take the mean of the means). However, is there a way to use the running standard errors from the six experiments for a given time along the descent? Can they combine in some way to give a better estimate of some true standard error (if that even makes sense)?
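    One possible way to frame the combination at a single time point, offered as a sketch and not as an answer given in this thread. It assumes the six experiments are independent of each other. Under that assumption the mean of the means has two candidate standard errors: one computed from the spread of the six means themselves, and one propagated from the six per-experiment standard errors via the variance of an average of independent quantities. The function name and the numbers are hypothetical.

```python
import math
import statistics

def combine_replicates(means, std_errors):
    """Combine per-replicate means and standard errors at one time point.

    Returns (grand_mean, between_se, propagated_se), assuming independent replicates.
    """
    k = len(means)
    grand_mean = statistics.mean(means)
    # Standard error of the grand mean estimated from the scatter of the k means.
    between_se = statistics.stdev(means) / math.sqrt(k)
    # Standard error of the grand mean propagated from the per-replicate errors:
    # Var(average) = sum(Var_i) / k^2 for independent replicates.
    propagated_se = math.sqrt(sum(se ** 2 for se in std_errors)) / k
    return grand_mean, between_se, propagated_se

means_at_t = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]   # made-up running means at one time
ses_at_t = [0.3, 0.3, 0.3, 0.3, 0.3, 0.3]     # made-up running standard errors
grand_mean, between_se, propagated_se = combine_replicates(means_at_t, ses_at_t)
```

    The two error estimates need not agree. Which one is appropriate depends on what the combined number is meant to represent and on whether the per-experiment standard errors are themselves trustworthy.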

    Again I'm sorry about any confusion, but I am a little confused myself.

    P.S. A "replicate" (noun) is just another experiment under the same conditions (or as close as possible).
    Last edited: Jun 17, 2011
  6. Jun 17, 2011 #5

    Stephen Tashi

    Science Advisor

    A ball rolling down a hill at a roughly constant speed may not get a good reception on a physics forum, but fortunately I lean more toward mathematics than physics.

    Let's say the hill is bumpy, so the ball's constant average speed is due to small accelerations and decelerations. We can even include friction, which will make things quite complicated since the "normal force" of the ball against the hill will vary.

    Suppose we have 6 experiments of rolling the ball down the hill and record the (x,y) location of the ball as a function of time, so we have functions [itex] x = X_i(t) [/itex] and [itex] y = Y_i(t) [/itex] for [itex] i = 1, 2, \dots, 6 [/itex].

    Let's focus on [itex] X(t) [/itex]. We can be interested in various parameters of the population of all possible replications of the experiment (not just the 6 that were done).

    Examples of this are:

    1) For a fixed [itex] t_{final} [/itex], what is the mean value of [itex] \frac{X(t_{final}) - X(0)}{t_{final} - 0} [/itex]?

    2) For a fixed [itex] \Delta t [/itex] and a randomly selected [itex] t [/itex], what is the mean value of [itex] \frac{X(t + \Delta t) - X(t)}{\Delta t} [/itex]?

    3) For a given time t, what is the mean value of the derivative [itex] \frac{dX}{dt} [/itex], averaged over all possible replications of the experiment?
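    As a rough illustration of parameters 1) and 3), the sketch below computes sample estimates from recorded positions. It assumes each replication is stored as a list of positions sampled at a uniform spacing `dt`, and it approximates the derivative with a central difference. The function names and the data are hypothetical, and the results are only estimates from the replications at hand, not the population values.

```python
import statistics

def mean_average_velocity(paths, dt):
    """Estimate parameter 1): mean over replications of (X(t_final) - X(0)) / t_final."""
    t_final = dt * (len(paths[0]) - 1)
    return statistics.mean((x[-1] - x[0]) / t_final for x in paths)

def mean_instantaneous_velocity(paths, dt, j):
    """Estimate parameter 3) at sample index j with a central difference.

    Requires 1 <= j <= len(path) - 2 so that both neighbours exist.
    """
    return statistics.mean((x[j + 1] - x[j - 1]) / (2 * dt) for x in paths)

# Two made-up replications of X(t), sampled every half second.
paths = [
    [0.0, 1.0, 2.0, 3.0, 4.0],
    [0.0, 1.5, 3.0, 4.5, 6.0],
]
dt = 0.5
v_avg = mean_average_velocity(paths, dt)
v_mid = mean_instantaneous_velocity(paths, dt, 2)
```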

    To decide the relevance of such parameters, we must decide what we are trying to accomplish. One can make deterministic representations of physical phenomena or stochastic models of physical phenomena.

    A simple-minded deterministic model for the phenomenon would be to find a function F(t) such that F(t) is the parameter given by 3).

    This establishes the relevance of parameter 3), but it leaves open the question of how to estimate it from the data. For example, to estimate F(t) at t = 9.4, should we average only the 6 velocities at time t = 9.4, or may we use other times or average velocities?

    The answer to that depends on whether the other data can be regarded as independent (or approximately independent) random samples of the same random variable. You might be able to answer that question from your knowledge of the physics of a particular problem. If not, you can investigate the empirical evidence for dependence between different types of data.
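    One simple empirical check, sketched here as an assumption and not as something prescribed in this thread, is the lag-1 autocorrelation of a series of measurements taken at successive times. A value near zero is consistent with roughly uncorrelated successive samples, although zero correlation alone does not establish independence. The function name and the data are made up.

```python
import statistics

def lag1_autocorrelation(series):
    """Sample lag-1 autocorrelation of a sequence of measurements."""
    mean = statistics.mean(series)
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)
    # Products of each deviation with the deviation that follows it.
    num = sum(a * b for a, b in zip(dev, dev[1:]))
    return num / denom

alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]  # made-up, strongly anti-correlated
trending = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]        # made-up, steadily drifting
r_alt = lag1_autocorrelation(alternating)
r_trend = lag1_autocorrelation(trending)
```

    A clearly nonzero value for real data would suggest that measurements at neighbouring times should not be pooled as if they were independent samples.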

    Perhaps you can elaborate further on this example to describe exactly what your final goal is.