## Discrepancy events

I am having problems deciding on which statistical measure to use. Although this problem is of the simplest type, none of my books seem to address exactly what I need.

Let me describe a typical example:

Suppose that I have a factory that produces four types of products, say boats, cars, planes, and trains. (Big factory, I know). For ideal production, I want 20% of the factory's output to be boats, 25% to be cars, 40% to be planes, and 15% to be trains.

Suppose the factory instead produces 15% boats, 20% cars, 40% planes, and 25% trains. How would I express, using one statistical measure, how far the factory's actual production is from the ideal?

I want the answer to be a percentage, so that 100% is perfect alignment with the targeted goals and 0% is no alignment (the factory produces spaceships instead).

My first inclination was simply to find the average discrepancy, that is, take the absolute value of each difference and average them. If needed, I could scale the result so that it falls between 0 and 100%, but something tells me that my plan is too unsophisticated. Is there a form of linear regression that I could use on data that is not described by a function but given as a finite set of values? What about weighting the standard deviation?

As you can tell, I am not a statistician (all of my experience is using statistics on functional data), but if I was just told the name of the statistical measure to use I could figure out the rest on my own.

You are looking for a metric (distance function), so the sum of absolute differences is fine. An alternative measure is the sum of squared differences (or errors, i.e. SSE). You may have to take into account that the percentages always add up to 100%, so you only need to know 3 out of 4. In a regression the errors are based on the difference between an actual value and a projected (estimated) value. As far as I understand, you are not trying to project anything, so whether or how regression might help is not obvious.
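
To make the two measures concrete, here is a minimal Python sketch applied to the factory numbers from the question. The function names are made up for illustration. The rescaling in `alignment_percent` is an assumption on my part, not something stated in the thread: it relies on the fact that two sets of percentages that each add up to 100 can differ by at most 200 percentage points in total, so halving the sum of absolute differences gives a value between 0 and 100. For the "spaceships" case to come out at 0%, spaceships have to be listed as a fifth category with a target of 0.

```python
def sum_abs_diff(target, actual):
    """Sum of absolute differences, in percentage points."""
    return sum(abs(t - a) for t, a in zip(target, actual))


def sum_sq_diff(target, actual):
    """Sum of squared differences (SSE), in squared percentage points."""
    return sum((t - a) ** 2 for t, a in zip(target, actual))


def alignment_percent(target, actual):
    """Hypothetical 0-100 score: 100 minus half the sum of absolute differences.

    Assumes both lists are percentages that each add up to 100.
    """
    return 100.0 - sum_abs_diff(target, actual) / 2.0


# Boats, cars, planes, trains (percentages from the question).
target = [20, 25, 40, 15]
actual = [15, 20, 40, 25]

print(sum_abs_diff(target, actual))       # 20
print(sum_sq_diff(target, actual))        # 150
print(alignment_percent(target, actual))  # 90.0

# Only spaceships: add a fifth category whose target is 0.
print(alignment_percent([20, 25, 40, 15, 0], [0, 0, 0, 0, 100]))  # 0.0
```

The SSE has no equally simple 0-to-100 rescaling here, because its largest possible value depends on the target percentages themselves.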
 Thanks for the response. What advantage does the sum of squared differences have over the sum of absolute differences?


Quote by BiologyGirl: "Thanks for the response. What advantage does the sum of squared differences have over the sum of absolute differences?"

1. It's often easier to analyze mathematically
2. The worst values are weighted more heavily, so "0% boats, 20% cars" is worse than "10% boats, 15% cars" in your example.
3. There aren't continuous ranges of values that are all considered 'equally bad'. With absolute differences there are such ties, which makes it hard to decide what to prefer.
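
Points 2 and 3 can be checked numerically. The sketch below is only an illustration: the two production mixes are hypothetical, chosen so that both miss the target by the same total number of percentage points. The sum of absolute differences cannot tell them apart, while the sum of squared differences penalises the mix with the two large misses.

```python
def sum_abs_diff(target, actual):
    """Sum of absolute differences, in percentage points."""
    return sum(abs(t - a) for t, a in zip(target, actual))


def sum_sq_diff(target, actual):
    """Sum of squared differences, in squared percentage points."""
    return sum((t - a) ** 2 for t, a in zip(target, actual))


# Boats, cars, planes, trains (target percentages from the question).
target = [20, 25, 40, 15]

# Hypothetical mixes, each adding up to 100%.
two_big_misses = [10, 25, 40, 25]     # off by 10, 0, 0, 10
four_small_misses = [15, 20, 45, 20]  # off by 5, 5, 5, 5

# Absolute differences: a tie.
print(sum_abs_diff(target, two_big_misses))     # 20
print(sum_abs_diff(target, four_small_misses))  # 20

# Squared differences: the large misses count for more.
print(sum_sq_diff(target, two_big_misses))      # 200
print(sum_sq_diff(target, four_small_misses))   # 100
```

Whether the tie or the heavier penalty is the better behaviour depends on whether one large miss really is worse for the factory than several small ones.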