# Need help proving a flawed model

1. May 30, 2013

### IBYCFOTA

A little backstory - I work in a warehouse where my job is to quickly and accurately pick products from the shelves to ship out to the stores. Recently, I got a talking-to from my supervisor regarding my accuracy. The standard here is that you must maintain 98% accuracy or you will get what essentially amounts to a written warning, and if those continue to pile up they can eventually lead to termination. The problem is that the sample used to determine accuracy is incredibly small at times, and is a tiny percentage of our overall body of work. For instance, their sample on me for this last month was 122 pieces of product, and in any given month I probably pick around 100,000 pieces - if not more.

I took a class on statistics years ago and have some very basic knowledge on the matter, but enough to suspect that their current standard for accuracy is inherently flawed because it doesn't rely on a reliable sample size. You can make one mistake in the roughly 0.1% of work that they check and possibly get fired over it, and that's a slightly frightening prospect.

I've talked it over with my supervisor, who mostly agrees but claims that they reviewed their policy on this issue in the past and found that 100 pieces was a large enough sample to determine accuracy. I'd like to prove that wrong with some number crunching, but I'm afraid I've forgotten most of what I learned in the one statistics course that I took in college and need help from those more mathematically inclined than I.

The first thing I'd like to do is establish how likely it is that somebody who normally picks with an accuracy of 98.5% on average can fall below 98% in a 100 piece sample. You can round the 98.5 up if it makes it any easier. I suspect that the odds of this happening are not particularly low.

The next part of the problem is determining what a reasonable sample size ought to be. How about this: how many pieces need to be checked before we can determine with 95% confidence that a 98.5%+ picker has picked that sample with 98% or better accuracy? I don't know if that's the correct approach to go about tackling that problem so if anybody has any better way to phrase that then feel free to change it, but I think you guys get the gist of what I'm asking.

Anyways, I appreciate all the help I can get on the matter. Thanks in advance!

2. May 30, 2013

### Number Nine

Kind of an interesting problem. Too lazy to work it out analytically (which shouldn't be too difficult), but I ran a quick simulation to see what the distribution of errors should look like. I've attached a histogram below.

I simulated 100,000 "subjects" classifying 100 objects with 98.5% accuracy, and counted the number of errors for each of them. The histogram is the error count. Approximately 20% of subjects display an error count greater than 2 (i.e. an accuracy less than 98%).
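A simulation along these lines can be sketched in a few lines of Python. This is not the code used for the histogram above, just a minimal reconstruction that assumes each inspected piece is wrong independently with probability 1.5%; the function name `simulate_failure_rate` and its parameters are made up for illustration.

```python
import random


def simulate_failure_rate(n_subjects=100_000, sample_size=100,
                          accuracy=0.985, max_errors=2, seed=0):
    """Fraction of simulated pickers with more than max_errors mistakes."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    error_rate = 1.0 - accuracy
    failures = 0
    for _ in range(n_subjects):
        # Assumption: each piece is mispicked independently of the others.
        errors = sum(rng.random() < error_rate for _ in range(sample_size))
        if errors > max_errors:
            failures += 1
    return failures / n_subjects


if __name__ == "__main__":
    print(f"share below 98% accuracy: {simulate_failure_rate():.3f}")
```

With enough simulated subjects the printed share should land close to the roughly 20% quoted above.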

#### Attached Files:

• sim.jpg (histogram of simulated error counts)

3. May 30, 2013

### Office_Shredder

Staff Emeritus
To flip the question around: suppose you pick with accuracy p and want a smaller than, say, 5% chance of being fired at each inspection (giving you an expected 20 years on the job if they do this once per year). How large does p need to be so that you miss only 0, 1, or 2 pieces out of 100 at least 95% of the time?

The probability you miss 0 is $p^{100}$.
The probability you miss 1 is $100\,p^{99}(1-p)$ (100 ways to miss one).
The probability you miss 2 is $\frac{100 \cdot 99}{2}\,p^{98}(1-p)^2$ (100 choose 2 ways to miss 2).
So we want to know what p is such that
$$p^{100} + 100\,p^{99}(1-p) + \frac{100 \cdot 99}{2}\,p^{98}(1-p)^2 = 0.95$$
Good luck solving that analytically (degree 100 polynomials are scary), but solving it numerically is a piece of cake:
http://www.wolframalpha.com/input/?...99}*(1-p)*100+++p^{98}*(1-p)^2*100*99/2+=+.95

If you want to pass 19 out of every 20 inspections, your success rate needs to be about 99.2% (the numerical solution is p ≈ 0.9918).
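For anyone who would rather not rely on the link, the same equation can be solved with a simple bisection. The sketch below is one possible way to do it, not the method behind the link; `pass_probability` and `required_accuracy` are illustrative names, and errors are assumed to be independent from piece to piece.

```python
from math import comb


def pass_probability(p, n=100, max_errors=2):
    """P(at most max_errors misses in n pieces) for per-piece accuracy p."""
    return sum(comb(n, k) * p ** (n - k) * (1 - p) ** k
               for k in range(max_errors + 1))


def required_accuracy(target=0.95, n=100, max_errors=2, tol=1e-10):
    """Bisection for the smallest p whose pass probability reaches target."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        # The pass probability increases with p, so bisection applies.
        if pass_probability(mid, n, max_errors) < target:
            lo = mid
        else:
            hi = mid
    return hi


if __name__ == "__main__":
    print(f"required accuracy: {required_accuracy():.4f}")
    print(f"pass rate at 98.5% accuracy: {pass_probability(0.985):.4f}")
```

Changing `target`, `n`, or `max_errors` shows how the required accuracy moves with the inspection rules.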

4. May 30, 2013

### IBYCFOTA

Thanks, this is exactly what I was looking for. Do you mind running that simulation again with picker accuracy at 99% instead? I'm curious how much that changes the probability. Maybe even increase the sample to 200 or 300 pieces to see how that changes the results. The more of those simulations you post the better. I'd feel a bit more at ease if I knew they were using a sample that only has a 5% chance of screwing me over in any given month. :D

Here's the kicker - written warnings, aka 'coachings', that go on your employee record are wiped out every 6 months, but only if you go through that period without another coaching. If I get a coaching from my supervisor, and then 5 months later get another, both will stick on my record for at least another 6 months, so it's possible that a coaching from a year and a half ago could end up hurting me. Considering that at 98.5% accuracy there is about a 20% chance that I pick below 98% in a 100-ish piece sample, I would think the possibility of those coachings continuing to pile up without resetting isn't all that farfetched.
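The numbers requested above (99% accuracy, samples of 200 or 300) can also be computed exactly rather than simulated. The following is a rough sketch under two assumptions that are not confirmed anywhere in the thread: errors are independent, and an inspection is passed whenever the error count is at most 2% of the sample. The function name `inspection_pass_rate` is hypothetical.

```python
from math import comb


def inspection_pass_rate(accuracy, sample_size, min_accuracy_pct=98):
    """Chance that a picker with the given true accuracy passes one check."""
    # Largest error count that still meets the standard (integer arithmetic
    # avoids floating-point surprises): 2 for 100 pieces, 4 for 200, ...
    allowed = sample_size * (100 - min_accuracy_pct) // 100
    miss = 1 - accuracy
    return sum(comb(sample_size, k) * miss ** k * accuracy ** (sample_size - k)
               for k in range(allowed + 1))


if __name__ == "__main__":
    for accuracy in (0.985, 0.99):
        for sample_size in (100, 200, 300):
            rate = inspection_pass_rate(accuracy, sample_size)
            print(f"accuracy {accuracy:.1%}, sample {sample_size}: "
                  f"pass {rate:.1%}")
```

If the real rule treats a result of exactly 98% differently, only the `allowed` line needs to change.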

5. May 30, 2013

### IBYCFOTA

Thanks a lot. And based on the previous poster's data, a 98.5% picker would only pass 16 of every 20 inspections. Pretty scary.

6. May 31, 2013

### MrAnchovy

Why are you assuming that a 98.5% picker is satisfactory? This means on average 1500 items a month wrongly picked! I would assume they have set the threshold and sample size to identify the failure rate that they deem unsatisfactory. So if they want you to be picking with 99.5% accuracy then the combination of a sample size of 100 and a threshold of > 2 items is sufficient to avoid false positives with approximately 98.5% confidence.

You can't say their sample size is too small without knowing what they are trying to achieve. 98% is just the threshold for the monthly test, not the target accuracy.
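As a rough check of the 98.5% figure, assuming independent errors and a true accuracy of 99.5%, the same binomial sum used earlier in the thread gives:

$$P(\text{at most 2 misses}) = 0.995^{100} + 100 \cdot 0.995^{99} \cdot 0.005 + \frac{100 \cdot 99}{2} \cdot 0.995^{98} \cdot 0.005^2 \approx 0.606 + 0.304 + 0.076 \approx 0.986$$

So under those assumptions a 99.5% picker would be flagged in only about 1.4% of 100-piece inspections.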

7. May 31, 2013

### IBYCFOTA

This is a good point, and something that had crossed my mind recently. However, I do think it's something my supervisor would have brought up when I discussed it with him, so unless he's out of the loop (unlikely) I think it's more likely that they're just using a flawed model. I'm going to try to get an answer from management on this soon, so we'll see how that goes.