How to Normalize a simulated dataset to fit the actual dataset?
| Feb19-12, 06:46 AM | #1 |
Can someone tell me how I can 'normalize' my dataset?
My scenario is as follows. I have two datasets, A (real-life data) and B (simulated data).

Dataset A contains 4 numerical values from an actual experiment, e.g. 4 leaves of a binary tree assigned the values 12.5, 13.5, 20.0 and 45.0.

Dataset B contains 40 numerical values from a computer simulation, e.g. 40 leaves from a total of 10 binary trees, where each tree produces 4 leaves and each leaf is assigned a random numerical value.

For both datasets, I computed the cumulative frequencies and plotted them in MS Excel (cumulative frequency of leaf values vs. leaf value). The aim was to see how similar or different the two datasets are: the smaller the vertical displacement between the two plots, the less the datasets differ.

I was instructed to normalize the data in Dataset B and re-plot the chart for a better comparison between set A and set B. How can I do this, and why is it important? An example based on the situation described here would help a great deal. Thanks in advance.
| Feb19-12, 11:42 AM | #2 |
Unfortunately "normalize" is an ambiguous instruction. It might mean to convert each data value [itex] v [/itex] to its "z-score" by computing [itex] \frac{v - \mu}{\sigma} [/itex] where [itex] \mu [/itex] is the mean of the sample in question (real or simulated) and [itex] \sigma [/itex] is the standard deviation of that sample.
It could mean something as simplistic as converting each data value [itex] v [/itex] to a sort of ranking by computing [itex] \frac{v - v_{min}}{v_{max} - v_{min}} [/itex] where [itex] v_{max} [/itex] and [itex] v_{min} [/itex] are, respectively, the max and min values in the sample. We'd have to know more about what the data and the simulation represent to know what makes sense (and we'd have to assume the person who told you to do this gave sensible advice!). If you use z-scores you can probably defend that choice as a common meaning for "normalize". If both your histograms had a roughly bell-shaped appearance, I'd guess that this was what your advisor meant.
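For what it's worth, here is a rough Python sketch of both readings of "normalize", followed by the cumulative-frequency comparison described in the first post. Treat it as an illustration under assumptions, not as what your advisor necessarily intended. Dataset A uses the four values you quoted. Dataset B is a made-up stand-in (40 uniform random numbers between 0 and 100), since the thread does not say how the simulation assigns leaf values. The function names are my own, and I have used the sample standard deviation (dividing by n - 1), which is also a choice.

```python
import random
import statistics


def z_scores(values):
    """Convert each value v to (v - mean) / sd, using the sample's own mean and sd."""
    mu = statistics.mean(values)
    sigma = statistics.stdev(values)  # sample sd (n - 1); pstdev would divide by n
    return [(v - mu) / sigma for v in values]


def min_max(values):
    """Rescale each value v to (v - min) / (max - min), so results lie in [0, 1].

    Fails with a division by zero if all values are identical.
    """
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def ecdf(sample, x):
    """Cumulative relative frequency: fraction of sample values <= x."""
    return sum(1 for v in sample if v <= x) / len(sample)


def max_vertical_gap(a, b):
    """Largest vertical distance between the two cumulative-frequency curves.

    Both curves are step functions, so checking every observed value is enough.
    """
    points = sorted(set(a) | set(b))
    return max(abs(ecdf(a, x) - ecdf(b, x)) for x in points)


# Dataset A: the four values quoted in the question.
a = [12.5, 13.5, 20.0, 45.0]

# Dataset B: HYPOTHETICAL stand-in for the 40 simulated leaf values.
rng = random.Random(0)
b = [rng.uniform(0.0, 100.0) for _ in range(40)]

# Normalize each sample with its OWN statistics, then compare the curves.
print("raw gap:    ", max_vertical_gap(a, b))
print("z-score gap:", max_vertical_gap(z_scores(a), z_scores(b)))
print("min-max gap:", max_vertical_gap(min_max(a), min_max(b)))
```

Note that each sample is normalized with its own mean and standard deviation (or its own min and max). That puts both curves on a common scale, so the comparison is about the shape of the two distributions rather than their location and spread. Whether that is the comparison you actually want depends on what the leaf values represent. Also note that ecdf returns relative cumulative frequencies (between 0 and 1); with 4 values in one set and 40 in the other, raw cumulative counts would not be comparable.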