
Combining loosely correlated data set

  1. Dec 16, 2013 #1
    I need some help finding an appropriate statistics model for some experimental data. Thanks in advance for any suggestions.

    I am trying to compare simulated results from a code that models nuclide concentrations in spent nuclear fuel to experimental data. These concentrations have complicated dependencies on starting concentrations, reactor conditions, fuel design, etc.

    I have a set of experimental data (and associated standard error) representing fuel from a wide variety of the conditions listed above.

    For each experimental data point I have a simulated result. The simulated result has no given error.

    I am taking the ratio of measured value to calculated value (M/C) for a variety of nuclides, both to determine how well the simulation works and to conservatively correct future calculated values. If the simulation and the measurements were both perfect, every M/C value would be 1.0. However, I don't think I can combine the data points as if each were a measurement of the same quantity, because each one depends on a different set of conditions.
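A minimal sketch of the M/C bookkeeping described above. It assumes the calculated value C is exact (the post says the simulation has no given error), so the standard error of each ratio is the measurement's standard error divided by |C|. The function name and the sample numbers are made up for illustration and do not come from the thread.

```python
def mc_ratios(measured, sigma_measured, calculated):
    """Return a list of (M/C, sigma_M/C) pairs, one per data point.

    Assumption: the calculated value C carries no uncertainty, so the
    standard error of the ratio is simply sigma_M / |C|.
    """
    pairs = []
    for m, sigma_m, c in zip(measured, sigma_measured, calculated):
        if c == 0:
            raise ValueError("calculated value must be nonzero")
        pairs.append((m / c, sigma_m / abs(c)))
    return pairs


# Made-up numbers for illustration only (not from the thread).
example = mc_ratios([2.0, 3.3], [0.2, 0.3], [4.0, 3.0])
```

Each pair keeps the ratio and its uncertainty together, so later steps can weight or shift individual points instead of treating them as repeated measurements of one value.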

    Previous work has treated the data as normally distributed, but I think that is a flawed approach. So how can I collapse my data set into a single value that bounds some known percentage of results and accounts for the experimental uncertainty? At this point I am considering using the mean deviation about the mean, computed from the experimental data points plus their respective errors.
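One possible reading of the "mean deviation about the mean" idea, sketched under stated assumptions: each M/C value is shifted upward by its own standard error (one interpretation of "data points plus their respective error"), the mean and mean absolute deviation of the shifted values are computed, and the fraction of the original points falling under a candidate bound is counted directly rather than inferred from a normal distribution. The multiplier k = 2 and all numbers are arbitrary placeholders, not values from the thread.

```python
def mean_and_mean_deviation(values):
    """Return (mean, mean absolute deviation about the mean)."""
    n = len(values)
    if n == 0:
        raise ValueError("need at least one value")
    mean = sum(values) / n
    mean_dev = sum(abs(v - mean) for v in values) / n
    return mean, mean_dev


def shifted_up(ratios, sigmas):
    """Shift each ratio up by its own standard error (assumed reading)."""
    return [r + s for r, s in zip(ratios, sigmas)]


def coverage(ratios, bound):
    """Fraction of the observed ratios that lie at or below the bound."""
    return sum(1 for r in ratios if r <= bound) / len(ratios)


# Made-up M/C values and standard errors, for illustration only.
ratios = [0.9, 1.0, 1.1]
sigmas = [0.1, 0.1, 0.1]

mean, mean_dev = mean_and_mean_deviation(shifted_up(ratios, sigmas))
bound = mean + 2 * mean_dev  # k = 2 is an arbitrary choice
covered = coverage(ratios, bound)
```

Counting coverage empirically gives a "known percentage" only for the data in hand; whether it carries over to future calculations depends on how representative those conditions are.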
  3. Dec 16, 2013 #2

    Stephen Tashi

    Science Advisor

    Your goal isn't clear. For example, if I have a model that works well on one set of similar conditions and works badly on another set of conditions, the principle of "under promise, over perform" might lead me to publish the "bound" of some percentage based on results where the model works badly. On the other hand, if I assume a person picks a set of conditions "at random" (in a manner to be specified) from the possible sets of conditions, then I can ask for a bound on how the model deviates from experiment in such a scenario.
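The two goals contrasted above can be illustrated with a small sketch. It assumes the M/C values have already been grouped by condition set (the group labels "A" and "B" are hypothetical), uses a simple nearest-rank empirical quantile as the "bound", and treats pooling all points as a stand-in for picking conditions "at random" in proportion to how often they appear in the data. None of these choices comes from the thread.

```python
import math


def nearest_rank_quantile(values, p):
    """Empirical p-quantile by the nearest-rank rule, for 0 < p <= 1."""
    if not values or not 0 < p <= 1:
        raise ValueError("need data and 0 < p <= 1")
    ordered = sorted(values)
    rank = math.ceil(p * len(ordered))  # 1-based rank
    return ordered[rank - 1]


def worst_group_bound(groups, p):
    """'Under promise' bound: the largest per-group quantile."""
    return max(nearest_rank_quantile(v, p) for v in groups.values())


def pooled_bound(groups, p):
    """Bound when every data point is treated as equally likely."""
    pooled = [x for v in groups.values() for x in v]
    return nearest_rank_quantile(pooled, p)


# Hypothetical M/C values grouped by condition set, for illustration only.
groups = {
    "A": [1.00, 1.01, 0.99, 1.02],  # conditions where the model works well
    "B": [1.10, 1.20, 1.15, 1.05],  # conditions where it works badly
}

worst = worst_group_bound(groups, 0.75)
pooled = pooled_bound(groups, 0.75)
```

With these made-up numbers the worst-group bound is larger than the pooled one, which is the distinction the reply draws: the answer depends on which scenario the bound is supposed to cover.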