# Stopping rule for quality control problem

In summary, the conversation discusses a production process that yields output in batches of n items, where a batch is discarded if more than half of its items are of 'bad' quality. It explores using Bayesian statistics to update beliefs about the unknown mean of a Bernoulli distribution, and how this can determine when to stop testing items in a batch while remaining statistically confident that the discard decision matches the one that full testing would give. Alternative approaches, such as a normal approximation under a worst-case assumption, are also discussed. The thread emphasises accounting for the finite batch size n, and suggests making individual keep/discard choices for each item to improve overall quality.
estebanox
Problem:

Suppose I have a production process that yields output in batches of n items. For each batch, I can test whether they are of good or bad quality. Let q_i ∈ {1,0} be the quality of tested item i.

If more than half of the items are ‘bad’, the batch should be discarded. In other words: the batch should be discarded if the average quality of items is q(n)<0.5.

Suppose testing is sequential (i.e. you learn the quality of items one at a time), and each test is a random draw from a Bernoulli distribution with unknown mean p. I would like to know when to stop testing items in a batch, in order to decide whether to discard it without testing all items, and yet be statistically confident that the outcome (discard vs. not discard) is the same as the one that would have been reached if all items had been tested.

What I currently have:

If I knew p, I could use a normal approximation to estimate the confidence interval of the binomial proportion q(k) for any k. With this, for any arbitrary width of the interval, I could calculate the number of tests that are required to achieve a given confidence level (let me call such number of tests k*).

Since I don’t know p, one option would be to solve for k* assuming the worst case scenario (i.e. p=0.5). This was proposed (and discussed) in a related question here. Another similar alternative was proposed here.

This approach, however, does not take into account that n is finite, so once many items have been tested, it becomes very unlikely that one more test will swing the outcome...
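The worst-case calculation sketched above can be made concrete. This is a minimal sketch (the function name `worst_case_sample_size` is mine); it assumes a symmetric normal-approximation interval of a chosen half-width, with the variance maximised at p = 0.5:

```python
from math import ceil
from statistics import NormalDist

def worst_case_sample_size(half_width, confidence=0.95):
    """Number of tests k* so that the normal-approximation CI for a
    binomial proportion has the given half-width, in the worst case
    p = 0.5 (which maximises the variance p*(1-p))."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # e.g. ~1.96 for 95%
    # half_width = z * sqrt(0.25 / k)  =>  k = (z / (2 * half_width))**2
    return ceil((z / (2 * half_width)) ** 2)

print(worst_case_sample_size(0.05))  # 385 tests for a +/-0.05 interval at 95%
print(worst_case_sample_size(0.10))  # 97 tests for a +/-0.10 interval
```

Note that k* here does not depend on n at all, which is exactly the shortcoming described above: for small batches it can exceed the batch size.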

This is interesting, because you'll be doing two estimations at the same time. While testing a batch, you are getting information on ##p## and on ##q(n)## simultaneously. So if you test ##100## items and they all turn out to be positive, it's no longer a good idea to assume the worst case scenario ##p=0.5##: you will want to update your confidence in ##p##.

This can be done adequately with Bayesian statistics. In Bayesian statistics, you have your parameter ##p## which is essentially unknown to you. What's different is that now you can give a probability distribution to ##p##, which corresponds to your beliefs on what ##p## is.

Of course, in the beginning you have no idea what ##p## is. So you could use what is called a "vague prior" on ##p##, or you could be more pessimistic and assume the worst case value of ##p##; the latter option is more conservative.

Anyway, what happens then is you test a number of items. The tests will turn out to be positive or negative. The results of these tests allow you to update your beliefs on ##p##. Using this updated belief, you can then generate a predictive distribution of ##q(n)## and see whether it is good enough.

So let me put this in an example. We have ##1000## items. We test ##100## items.
In the beginning, I have no information on my probability ##p##. So I give it a vague prior. I have a lot of choice here, but I pick a uniform distribution on ##[0,1]##. This means that I allow my probability ##p## to be anything with equal chance. Important for the computations is that the uniform distribution on ##[0,1]## is itself a beta distribution, namely beta(1,1).

Now assume that I know my probability ##p## (I don't, but let's assume). We test ##100## items and ##20## are inadequate. What's the probability of this? Well, it's not too far-fetched to model this with a binomial distribution: the probability of ##20## inadequate samples out of ##100## is ##\binom{100}{20}p^{20}(1-p)^{80}##.

Now I can combine my uniform prior and my binomial distribution using Bayes' theorem. This will give a posterior distribution on ##p##. Basically, my posterior distribution will be proportional to ##p^{20}(1-p)^{80}##. This is a beta distribution again (this is no coincidence) with parameters beta(21,81). You should graph it to see what happens.
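Graphing the posterior is easy to do numerically. A minimal sketch (the function name `posterior_pdf` is mine), evaluating the beta(21, 81) density at a few points:

```python
from math import lgamma, exp, log

# Density of the beta(21, 81) posterior, proportional to p^20 (1-p)^80.
def posterior_pdf(p, a=21, b=81):
    log_norm = lgamma(a + b) - lgamma(a) - lgamma(b)   # log of 1/B(a, b)
    return exp(log_norm + (a - 1) * log(p) + (b - 1) * log(1 - p))

# The density peaks at the observed bad rate (a-1)/(a+b-2) = 20/100 = 0.2,
# while the worst case p = 0.5 has become extremely implausible:
for p in (0.1, 0.2, 0.3, 0.5):
    print(p, posterior_pdf(p))
```

This makes the earlier point concrete: after 100 tests, the data have sharpened our beliefs on ##p## far beyond the worst-case assumption.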

Now we can compute the likelihood of what happens with the other ##900## samples. We already had ##20## inadequate samples, so the batch fails if there are ##481## or more further inadequate samples (bringing the total above ##500##). If we knew ##p##, then the probability of this happening would be ##\sum_{k=481}^{900} \binom{900}{k} p^k (1-p)^{900-k}##. We don't know ##p##, but we know a distribution of ##p##. So we can use integration to find the probability, namely
$$\int_0^1 \sum_{k=481}^{900} \binom{900}{k} p^k (1-p)^{900-k} \frac{1}{B(21,81)}p^{20}(1-p)^{80}\, dp = \sum_{k=481}^{900} \binom{900}{k}\frac{1}{B(21,81)} \int_0^1 p^{20+k}(1-p)^{980-k}\,dp$$
The integral above is B(21+k, 981-k). So I end up with
$$\sum_{k=481}^{900} \binom{900}{k} \frac{B(21+k,981-k)}{B(21,81)}$$
I could use the CLT to compute this probability, or use software. If the probability is ##<0.05## (or another value), then I can say my batch is adequate; if it is larger than ##0.95##, then the batch is inadequate. Otherwise I need to go on.
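The sum is a beta-binomial tail probability and is easy to evaluate directly in software. A minimal sketch using log-gamma for numerical stability (the helper names are mine; the sum starts at k = 481 so that the total bad count, with the 20 already found, exceeds 500, and the beta arguments follow from B(21 + k, 81 + 900 - k)):

```python
from math import lgamma, exp

def log_beta(a, b):
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def log_comb(n, k):
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)

# P(481 or more of the remaining 900 items are bad), marginalised over
# the beta(21, 81) posterior on p -- a beta-binomial tail probability.
tail = sum(exp(log_comb(900, k) + log_beta(21 + k, 981 - k) - log_beta(21, 81))
           for k in range(481, 901))
print(tail)  # essentially zero here: the batch can already be accepted
```

With only 20% bad in the first 100 tests, the predictive probability of the batch failing is negligible, so under the 0.05 rule testing could stop at this point.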

How to go on? Well, I test ##100## more items. Let's say ##50## are inadequate. My prior this time is beta(21,81), and I combine it with the result of my test, which is ##\binom{100}{50} p^{50}(1-p)^{50}##. I get a posterior distribution on ##p## of beta(71, 131). I can then again predict what happens with the ##800## remaining samples.

estebanox said:
You use the observed rate to construct the confidence interval.

If p=0.5, no test can reliably tell you whether p<0.5. You cannot even decide whether the fraction of good items is below 0.5 without testing the full sample, and any procedure that typically stops after a small subsample has to give a nearly random result in this case.

You could keep testing until the confidence interval (90%, 95% or whatever) is fully below 0.5, for example. Use this confidence interval for the untested items only, and add the tested items with their known quality to it. If you can make keep/discard choices for each item individually, doing so would clearly improve the quality here.

I don't know the application, but if money is involved a Bayesian approach is probably better. Your process won't give all possible p values with a uniform distribution; this is information (gained over time) that can be used in the process.
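This finite-population stopping rule can be sketched as follows. It is only a sketch under assumptions: I use a Wilson score interval for the good fraction among tested items (any binomial interval would do), apply it to the untested remainder, and combine it with the known tested counts; the function and variable names are mine:

```python
from math import sqrt
from statistics import NormalDist

def wilson_interval(good, n, confidence=0.95):
    """Wilson score interval for the fraction of good items."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    phat = good / n
    denom = 1 + z * z / n
    centre = (phat + z * z / (2 * n)) / denom
    half = z * sqrt(phat * (1 - phat) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half

def can_stop(good_tested, n_tested, n_total, confidence=0.95):
    """Stop when even the interval bounds, applied to the untested
    items only, cannot move the batch-wide good fraction across 0.5."""
    lo, hi = wilson_interval(good_tested, n_tested, confidence)
    untested = n_total - n_tested
    worst = (good_tested + lo * untested) / n_total
    best = (good_tested + hi * untested) / n_total
    return best < 0.5 or worst > 0.5          # decided either way

print(can_stop(90, 100, 1000))   # True: 90% good after 100 tests, stop
print(can_stop(52, 100, 1000))   # False: too close to 0.5, keep testing
```

Because the tested items enter with their known quality, the rule automatically becomes easier to satisfy as the untested remainder shrinks, which addresses the finite-n point raised earlier.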

I'm also wondering how fixed that limit of 0.5 is. What would you prefer?
- a batch where it is known that a fraction of 0.51 is good.
- a batch where there is a 1% probability that a fraction of 0.49 is good, and 99% probability that all are good.

Here is a good starting point for understanding Bayesian decision making

http://www.stat.ucla.edu/~yuille/courses/Stat161-261-Spring13/LectureNote2.pdf

To OP: To find the optimal stopping rule for this problem, you would also have to specify the cost of doing each test, and the costs of making a wrong decision (i.e., discarding an OK batch, or letting a faulty batch slip through).

@micromass, thanks a lot for the suggestion of using a Bayesian approach – and for spelling it out. I'll try to implement it computationally.

mfb said:
I don't know the application, but if money is involved a Bayesian approach is probably better.

Do you mean "if testing is very costly"? In my application it is costly (not cash, but that's beside the point). In other words, I am willing to trade statistical confidence for a smaller number of tests.

Regarding your last question, the acceptance criterion in my application is not necessarily 0.5, but it is sharp: I do prefer knowing that a batch is above the threshold, even if only marginally.

I suppose both of these things point to the Bayesian approach. I'll look into it (thanks @Dale for the reference)

estebanox said:
Do you mean "if testing is very costly"?
No, I mean the difference between a pure research environment and running a business. For a scientific study, you might be interested in finding the fraction of batches where more than half of the samples are good. If you want to sell something, it is probably not the overall goal to determine that number as precisely as possible.
estebanox said:
I do prefer knowing that a batch is above the threshold, even if only marginally.
That will need many tests then, if the sample is close to the threshold.

Hey estebanox.

The only thing I'd add to the comments above is to consider the correlation aspect involved, and note that a binomial (or its Bayesian conjugate, the beta distribution) is probably not going to be a good assumption to use on a line.

Typically if some process is making a bad batch then it impacts things in a correlated way - not an independent one.

If you are testing for a bad batch you should evaluate the physical process and see how that is likely to impact on the result of a positive or negative test. You will probably find that with more dependencies, this correlation component will be significant. Usually you assume independence if everything physically is disconnected from everything else but follows the same sort of layout or procedure.

As an example: you could have two assembly lines with exactly the same design and components, and they should be independent; but within a single line you would expect a lot of correlation between whether items on that line are faulty or not.

## What is a stopping rule for quality control problems?

A stopping rule for quality control problems is a predetermined rule that determines when to stop a quality control process. It is used to ensure that the process is effective and efficient in detecting and correcting any defects in the production process.

## Why is a stopping rule important in quality control?

A stopping rule is important in quality control because it helps prevent unnecessary waste of resources and time. It also ensures that any defects are detected and corrected before they become larger issues, ultimately improving the overall quality of the product.

## What are the different types of stopping rules for quality control?

There are two main types of stopping rules for quality control: fixed sample size and sequential. Fixed sample size rules involve taking a predetermined number of samples and then making a decision, while sequential rules involve continuously sampling until a decision is made.

## How do you determine the appropriate stopping rule for a quality control problem?

The appropriate stopping rule for a quality control problem depends on the specific needs and goals of the company or organization. Factors such as the type of product being produced, the level of risk tolerance, and the cost of sampling should be considered when determining the best stopping rule.

## What are the benefits of using a stopping rule in quality control?

Using a stopping rule in quality control can help improve efficiency and reduce costs by preventing unnecessary sampling. It also ensures that any defects are detected and corrected in a timely manner, leading to higher overall product quality and customer satisfaction.
