Is this a vaild way to test a hypothesis?

  • Context: Graduate 
  • Thread starter Thread starter rede96
  • Start date Start date
  • Tags Tags
    Test
Click For Summary

Discussion Overview

The discussion revolves around the validity of a proposed method for testing a hypothesis related to the thickness of manufactured materials. Participants explore statistical approaches to assess whether the thickness measurements deviate from expected values, considering the complexities of sampling from different material types and the implications of their findings.

Discussion Character

  • Exploratory
  • Technical explanation
  • Debate/contested
  • Mathematical reasoning

Main Points Raised

  • One participant describes a manufacturing process where material thickness is controlled, but suspects post-production changes may affect measurements, leading to a hypothesis about the distribution of thicknesses.
  • Another participant questions the conclusion drawn from the sample sizes, suggesting that if 90 samples are above the target, it raises questions about the distribution and validity of the hypothesis.
  • There is a discussion about the appropriateness of using binomial probability to assess the outcomes of the samples, with some participants suggesting that regular hypothesis testing might be more suitable.
  • Concerns are raised about the potential for a skewed distribution due to the manufacturing process, which could affect the validity of the hypothesis testing approach.
  • Some participants propose that small sample sizes for each material type may not be problematic if combined, referencing the central limit theorem to support this view.
  • There is an acknowledgment of the challenges in interpreting statistical results, particularly in relation to the intended use of the findings.

Areas of Agreement / Disagreement

Participants express differing views on the validity of the proposed hypothesis testing method, with some supporting the initial approach and others advocating for more traditional statistical methods. The discussion remains unresolved regarding the best approach to take.

Contextual Notes

Participants note limitations related to sample sizes for different material types and the potential impact of manufacturing processes on the distribution of measurements. There is also uncertainty about the applicability of the central limit theorem in this context.

rede96
Messages
663
Reaction score
16
We manufacture materials of various types and hence target thickness. This is a continuous process, and material is cut to specified lengths (often over 10, 000 meters) after production. To control the process during production, we take regular measurements of the thickness (Which is the key variable we want to control) throughout production and the machine automatically compensates.

The data shows that the processes are very capable of making the material to required thickness, with CP and CPK of 4 or over.

However I suspect that there is something that changes in the properties of the material after production which effects the thickness and want to take random samples to check.

As these tests would be done by many different people from different locations, and as I suspect the number of samples from each material type would need to be quite high in order to accurately compare the means of the sample and process distributions, I thought of a different approach.

My hypothesis is that if I take a number of random samples for the different material types, of say just a meter length, and measure the average thickness of the samples and compare them to the target value, I should see an equal number of thicknesses above and below the target thickness.

So using Binomial probability I can work out that there is a very high probability of finding between 30 and 70 measurements over the specified thickness for a sample size of 100. (appx 0.999)

So when I find that over 90 random samples are above the target thickness, I conclude that the sample thicknesses are very likely to be different from the thicknesses measured during the process.

Is this a valid conclusion?
 
Physics news on Phys.org
If 90 samples are above and 50 samples are below - what would be the conclusion?
If you are happy with 100 or so samples, why not use regular hypothesis testing? The measuring process introduces gaussian uncertainties anyway.
 
Firstly, thanks for the reply.

If 90 samples are above and 50 samples are below...

Sorry, I might not have explained it too well, I only have 100 samples, so if 90 were above only 10 would be below.

If you are happy with 100 or so samples, why not use regular hypothesis testing?

There are probably a number of reasons I guess. Firstly the total number of samples is 100, but the total number of samples for each material type is only a few, as there are many different material types. As each material type produced is a different process I would need more samples (From memory over 30 as a minimum) for each material type.

I think the other reason is I have an understanding of some statistics, but my Math is crap! I don't use this stuff too often and to be honest I really just don't remember how to do the two sample t-test or even if I can 'normalise' the different material types (Which all have different thicknesses) to get an understanding of the overall situation.

Hence why I opted for just checking the percentage that were above the means from my sample, as intuitively I thought there should be an even spread of above or below.
 
Well, the machine may have a tendency to err on the side of "too thick" while staying within the tolerance limits. This will produce a skewed distribution.

But what you have suggested would make a good start ... you are just defining your two outcomes as "high" and "low" and arbitrarily picking "high" as the state you are interested in... converting a continuous distribution into a discrete one. Then binomial probability is fine ... if you assume equal probability of being high or low, then you can pick a probability level where you reject the hypothesis that the distribution is symmetric. It does not tell you why the distribution is not symmetric but it will be a reason to investigate further.

I don't think that small samples of each material is necessarily a problem - adding them all up for about 100 of them should get you a well-behaved gaussian by the central limit theorem. It will be like rolling lots of different kinds of dice and adding them.
If you do this for the deviations of the samples from the requirement, then you'll be able to find if the mean deviation is unacceptably high.

Standard hypothesis testing is not that hard - there are lots of references online.
The main difficulty with this stuff is the interpretation at the end. For that concentrate on what you want to achieve... what do you need the result to tell you in terms of what action will be taken? i.e. you'd do the calculation differently if you want use it as evidence in a law suit compared with if you want to see if the machines need a recalibrate.
 
Simon Bridge said:
But what you have suggested would make a good start ... you are just defining your two outcomes as "high" and "low" and arbitrarily picking "high" as the state you are interested in... converting a continuous distribution into a discrete one. Then binomial probability is fine ... if you assume equal probability of being high or low, then you can pick a probability level where you reject the hypothesis that the distribution is symmetric. It does not tell you why the distribution is not symmetric but it will be a reason to investigate further.

Ok, thanks. Yes it was just a quick way of assessing if further investigation was needed. So I assumed that the probability of finding a sample that was higher (or lower) than the mean of the process would be 0.5.

Simon Bridge said:
I don't think that small samples of each material is necessarily a problem - adding them all up for about 100 of them should get you a well-behaved gaussian by the central limit theorem. It will be like rolling lots of different kinds of dice and adding them.
If you do this for the deviations of the samples from the requirement, then you'll be able to find if the mean deviation is unacceptably high.

I'll be honest I don't know enough about the CLT to know if this would help or not. But considering there are many different machines that make this product and many different target widths, and that I am gathering data from the 'customer' point of view (i.e. I haven't traced each sample back to it's manufacturing origin or batch) then I assumed that the I couldn't interpret anything else from the data other than the binomial higher or lower probability that I did.

But thanks very much for your help.
 

Similar threads

  • · Replies 4 ·
Replies
4
Views
2K
  • · Replies 11 ·
Replies
11
Views
2K
Replies
20
Views
3K
  • · Replies 30 ·
2
Replies
30
Views
4K
  • · Replies 9 ·
Replies
9
Views
2K
  • · Replies 21 ·
Replies
21
Views
4K
  • · Replies 3 ·
Replies
3
Views
1K
  • · Replies 6 ·
Replies
6
Views
2K
  • · Replies 8 ·
Replies
8
Views
3K
  • · Replies 20 ·
Replies
20
Views
4K