Is this a valid way to test a hypothesis?

rede96
We manufacture materials of various types, and hence to various target thicknesses. This is a continuous process, and material is cut to specified lengths (often over 10,000 meters) after production. To control the process during production, we take regular measurements of the thickness (which is the key variable we want to control) throughout production, and the machine compensates automatically.

The data show that the processes are very capable of making the material to the required thickness, with Cp and Cpk of 4 or over.

However, I suspect that something in the properties of the material changes after production which affects the thickness, and I want to take random samples to check.

As these tests would be done by many different people from different locations, and as I suspect the number of samples from each material type would need to be quite high in order to accurately compare the means of the sample and process distributions, I thought of a different approach.

My hypothesis is that if I take a number of random samples of the different material types, each say just a meter in length, and measure the average thickness of each sample and compare it to the target value, I should see roughly equal numbers of thicknesses above and below the target.

So, using binomial probability, I can work out that there is a very high probability (approx. 0.999) of finding between 30 and 70 measurements over the specified thickness in a sample of 100.

So when I find that over 90 of the random samples are above the target thickness, I conclude that the sample thicknesses are very likely different from the thicknesses measured during the process.
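(For reference, the binomial figures above can be checked with a short snippet; this is just a sketch in Python, assuming scipy is available, using the numbers quoted in this post.)

```python
# Sketch: check the binomial probabilities quoted above (assumes scipy).
from scipy.stats import binom

n, p = 100, 0.5  # 100 samples; 50/50 chance of being above target if nothing changed

# Probability of between 30 and 70 samples above target
p_interval = binom.cdf(70, n, p) - binom.cdf(29, n, p)
print(f"P(30 <= X <= 70) = {p_interval:.5f}")  # ~0.9999, consistent with approx. 0.999

# Probability of 90 or more above target, if the 50/50 assumption held
p_tail = binom.sf(89, n, p)
print(f"P(X >= 90) = {p_tail:.1e}")  # vanishingly small (~1e-17)
```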

Is this a valid conclusion?
 
If 90 samples are above and 50 samples are below - what would be the conclusion?
If you are happy with 100 or so samples, why not use regular hypothesis testing? The measuring process introduces Gaussian uncertainties anyway.
 
Firstly, thanks for the reply.

If 90 samples are above and 50 samples are below...

Sorry, I might not have explained it too well: I only have 100 samples, so if 90 were above, only 10 would be below.

If you are happy with 100 or so samples, why not use regular hypothesis testing?

There are probably a number of reasons, I guess. Firstly, the total number of samples is 100, but the number of samples for each material type is only a few, as there are many different material types. As each material type is produced by a different process, I would need more samples (from memory, over 30 as a minimum) for each material type.

I think the other reason is that I have an understanding of some statistics, but my maths is crap! I don't use this stuff too often, and to be honest I just don't remember how to do a two-sample t-test, or even whether I can 'normalise' the different material types (which all have different thicknesses) to get an understanding of the overall situation.

Hence I opted for just checking the percentage of my samples that were above the target, as intuitively I thought there should be an even spread above and below.
 
Well, the machine may have a tendency to err on the side of "too thick" while staying within the tolerance limits. This will produce a skewed distribution.

But what you have suggested would make a good start ... you are just defining your two outcomes as "high" and "low" and arbitrarily picking "high" as the state you are interested in, converting a continuous distribution into a discrete one. Then binomial probability is fine: if you assume equal probability of being high or low, you can pick a probability level at which you reject the hypothesis that the distribution is symmetric. It does not tell you why the distribution is not symmetric, but it gives you a reason to investigate further.

I don't think that small samples of each material type are necessarily a problem - adding them all up for about 100 of them should give you a well-behaved Gaussian, by the central limit theorem. It will be like rolling lots of different kinds of dice and adding them.
If you do this for the deviations of the samples from the requirement, then you'll be able to find out whether the mean deviation is unacceptably high.
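Something like this illustrates the idea (a rough sketch in Python, with made-up numbers; each material type is put on a common scale by working with deviations from its own target):

```python
# Sketch: pool deviations from target across many material types and
# check whether the mean deviation is unacceptably high (made-up data).
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical samples: each has its own target thickness (mm), and the
# measured value is simulated here with a small upward bias.
targets = rng.choice([0.5, 1.0, 2.0, 5.0], size=100)
measured = targets + rng.normal(0.002, 0.010, size=100)

deviations = measured - targets  # common scale across material types

# By the central limit theorem, the mean of ~100 such deviations is close
# to Gaussian even though each material type has its own distribution.
mean_dev = deviations.mean()
se = deviations.std(ddof=1) / np.sqrt(len(deviations))
print(f"mean deviation = {mean_dev:.4f} mm, z = {mean_dev / se:.2f}")
```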

Standard hypothesis testing is not that hard - there are lots of references online.
The main difficulty with this stuff is the interpretation at the end. For that, concentrate on what you want to achieve: what do you need the result to tell you, in terms of what action will be taken? You'd do the calculation differently if you wanted to use it as evidence in a lawsuit than if you wanted to see whether the machines need recalibrating.
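For example, a one-sample t-test on the pooled deviations takes only a few lines (again just a sketch, assuming scipy; the data here are made up):

```python
# Sketch: one-sample t-test that the mean deviation from target is zero.
import numpy as np
from scipy.stats import ttest_1samp

deviations = np.array([0.011, -0.003, 0.008, 0.015, 0.002,
                       0.009, -0.001, 0.012, 0.006, 0.010])  # made-up data (mm)

t_stat, p_value = ttest_1samp(deviations, popmean=0.0)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
# A small p-value is evidence the post-production thickness differs from target.
```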
 
Simon Bridge said:
But what you have suggested would make a good start ... you are just defining your two outcomes as "high" and "low" and arbitrarily picking "high" as the state you are interested in, converting a continuous distribution into a discrete one. Then binomial probability is fine: if you assume equal probability of being high or low, you can pick a probability level at which you reject the hypothesis that the distribution is symmetric. It does not tell you why the distribution is not symmetric, but it gives you a reason to investigate further.

Ok, thanks. Yes, it was just a quick way of assessing whether further investigation was needed. So I assumed that the probability of finding a sample higher (or lower) than the mean of the process would be 0.5.

Simon Bridge said:
I don't think that small samples of each material type are necessarily a problem - adding them all up for about 100 of them should give you a well-behaved Gaussian, by the central limit theorem. It will be like rolling lots of different kinds of dice and adding them.
If you do this for the deviations of the samples from the requirement, then you'll be able to find out whether the mean deviation is unacceptably high.

I'll be honest, I don't know enough about the CLT to know whether this would help or not. But considering there are many different machines that make this product and many different target thicknesses, and that I am gathering data from the 'customer' point of view (i.e. I haven't traced each sample back to its manufacturing origin or batch), I assumed that I couldn't interpret anything more from the data than the binomial higher-or-lower probability that I did.

But thanks very much for your help.
 