Sample Test | Component Lifetime


Discussion Overview

The discussion revolves around determining the lifetime of a mechanical component using statistical methods, specifically focusing on how to establish a confidence interval based on a small sample size. Participants explore Bayesian statistical approaches to model failure times and the implications of uncertainty in their estimates.

Discussion Character

  • Exploratory
  • Technical explanation
  • Mathematical reasoning
  • Debate/contested

Main Points Raised

  • One participant seeks to understand how to determine the 'trueness' of the mean from a small sample of failure times, expressing concern about the uncertainty associated with limited data.
  • Another participant suggests using Bayesian methods, proposing that the failure time can be modeled as an exponential distribution with a constant hazard rate, and introduces the gamma distribution as a conjugate prior for the failure rate.
  • Further elaboration includes an example with specific failure times, demonstrating how to compute parameters for the gamma distribution based on observed data.
  • Concerns are raised about implementing the gamma distribution in Excel, particularly regarding the correct interpretation of parameters and the calculations involved.
  • Questions arise about the concept of 'trueness' and whether it is accounted for in the gamma distribution, with discussions on how to incorporate uncertainty in the modeling process.
  • Clarifications are provided about the relationship between the gamma and exponential distributions, emphasizing the need to simulate failure rates as random variables in a Monte Carlo framework.
  • Participants discuss the implications of small sample sizes on the estimation of the mean and the associated uncertainty, with one participant questioning if the Bayesian method adequately addresses this concern.

Areas of Agreement / Disagreement

Participants express varying levels of familiarity with Bayesian methods, leading to some confusion and requests for clarification. While there is a general agreement on using Bayesian approaches, the discussion reveals multiple interpretations of how to handle uncertainty and the relationship between different statistical distributions, indicating that no consensus has been reached on these points.

Contextual Notes

Limitations include the small sample size affecting the reliability of the mean estimate, the dependence on the choice of prior distributions in Bayesian analysis, and unresolved questions about the correct application of statistical methods in practical scenarios.

Joris Kievits
Hi,

I'm currently working on a project for which I have to determine the lifetime of a certain mechanical component within a certain confidence interval. By sampling a small number (let's say n = 10) of these components and measuring the number of hours until failure, I want to determine this confidence interval.
I currently have one main question:

How can I determine the 'trueness' of the mean of these 10 components after determining the standard deviation and mean from the test? (If I were to measure just 2 times and the deviation were extremely small, I would still have a huge uncertainty in my mean, right?)

I'd greatly appreciate it if someone could help me figure this out!
 
I tend to like Bayesian statistical methods, so that is what I would use here. If you are modeling your failure time as exponential, meaning that you have a constant hazard rate ##\lambda##, then a conjugate prior/posterior for ##\lambda## is the Gamma distribution with ##\alpha## equal to the number of observations and ##\beta## equal to the sum of the failure times for previous observations.
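For readers who want the step behind this statement, the conjugate update can be sketched as follows. This is the standard textbook result, not something spelled out in the post; the prior parameters ##\alpha_0## and ##\beta_0## are notation introduced here for illustration.

```latex
p(\lambda \mid x_1,\dots,x_n)
  \;\propto\;
  \underbrace{\lambda^{n} e^{-\lambda \sum_i x_i}}_{\text{exponential likelihood}}
  \cdot
  \underbrace{\lambda^{\alpha_0 - 1} e^{-\beta_0 \lambda}}_{\text{Gamma}(\alpha_0,\beta_0)\text{ prior}}
  \;=\;
  \lambda^{\alpha_0 + n - 1}\, e^{-(\beta_0 + \sum_i x_i)\,\lambda}
```

The right-hand side is again a Gamma density, with parameters ##\alpha_0 + n## and ##\beta_0 + \Sigma x##. Setting ##\alpha = n## and ##\beta = \Sigma x##, as described above, corresponds to the limit of a very weak prior (##\alpha_0, \beta_0 \to 0##), assuming that is the intended reading.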
 
Hi Dave,

Thanks for responding so quickly! Could you please elaborate a bit more, maybe using an example? I'm not familiar with Bayesian methods and I'd hate to misinterpret your answer.

Thanks either way!
 
Sure. Suppose you model a set of components whose true constant failure rate is ##\lambda## = 0.01/day, meaning that the mean time to failure is 100 days. Since we assume a constant failure rate (no burn-in or wear), we can use an exponential distribution to model the days to failure of any given component. So suppose you take 10 such components and use them until they all fail; then you might have data that looks like this:
{98.0327, 364.585, 135.671, 69.0766, 113.28, 15.9393, 64.3092, 118.104, 15.5044, 71.9093}

So if you don't know ##\lambda## then the Bayesian approach would be to treat it as a random variable, take this data, and compute a probability distribution function for P(##\lambda## | data). It turns out that the most convenient PDF to use for ##\lambda## is the gamma distribution, because it is the conjugate prior for the exponential likelihood (see https://en.wikipedia.org/wiki/Exponential_distribution#Bayesian_inference). The gamma distribution is a pretty common distribution, but not as common as the normal distribution, so you may not have worked with it before. It has two parameters, ##\alpha## and ##\beta##, and the data allow us to set ##\alpha = n = 10## and ##\beta = n \bar{x} = \Sigma x = 1066.41##. We can then plot this PDF if you like, or summarize it with mean ##\alpha/\beta = 0.0094## and standard deviation ##\sqrt{\alpha}/\beta = 0.0030##. You can construct 95% credible intervals and so forth as well.
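As a rough illustration of the numbers in this post, the posterior summary and a 95% credible interval can be computed with the Python standard library alone. This is a minimal sketch, assuming ##\beta## is a rate parameter and ##\alpha## is an integer; the helper names `gamma_cdf` and `gamma_quantile` are hypothetical and exist only for this example.

```python
import math

# Failure times (days) quoted in the post.
times = [98.0327, 364.585, 135.671, 69.0766, 113.28,
         15.9393, 64.3092, 118.104, 15.5044, 71.9093]

alpha = len(times)   # shape: number of observed failures
beta = sum(times)    # rate: total observed time to failure

post_mean = alpha / beta            # posterior mean of the failure rate
post_sd = math.sqrt(alpha) / beta   # posterior standard deviation


def gamma_cdf(x, shape, rate):
    """Gamma CDF for an integer shape (Erlang form), rate parameterization."""
    if x <= 0:
        return 0.0
    term, total = 1.0, 1.0
    for k in range(1, shape):
        term *= rate * x / k
        total += term
    return 1.0 - math.exp(-rate * x) * total


def gamma_quantile(p, shape, rate):
    """Invert the CDF by bisection (hypothetical helper for this sketch)."""
    lo, hi = 0.0, 10.0 * shape / rate   # upper bracket far into the tail
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if gamma_cdf(mid, shape, rate) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


ci_low = gamma_quantile(0.025, alpha, beta)
ci_high = gamma_quantile(0.975, alpha, beta)

print(f"alpha = {alpha}, beta = {beta:.2f}")
print(f"mean = {post_mean:.4f}, sd = {post_sd:.4f}")
print(f"95% credible interval for lambda: [{ci_low:.4f}, {ci_high:.4f}]")
```

Taking reciprocals of the interval endpoints gives a corresponding interval for the mean time to failure, which for this data comes out at roughly 62 to 222 days.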
 
Hi Dave,

I'm trying to implement this into an Excel file, so that I can update it while testing. However, I seem to get extremely low values for the Gamma distribution. I'm also not able to totally wrap my head around the prior/posterior story so that might be what I've done wrong. Is there any way that you could show me the exact calculations for a data set of for example: { 8.4, 8.9, 7.3, 9.0, 6.9 } hours of operation until failure?
I'd understand if it would be too much trouble...
Thanks for the help either way!
 
So, for that data ##\alpha = n = 5## and ##\beta = \Sigma x = 40.5##. So the mean failure rate is ##\alpha/\beta = 0.12## failures per hour. That makes sense because the components typically failed in less than 10 hours, so the rate is more than 0.1 failures per hour.

If you are evaluating the PDF, then the PDF of this gamma distribution is equal to 7.9 at 0.1. The CDF is equal to 0.38 at the same 0.1 value. If you find that your Excel sheet is giving you far different numbers (like 3.8E-14 and 7.6E-16 for the PDF and CDF) then try using ##1/\beta## instead of ##\beta##. Unfortunately, some software packages parameterize the gamma distribution by the rate ##\beta## and some by the scale ##1/\beta##, and when the documentation does not say which, you may have to try it both ways and see which makes sense.
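The two sets of numbers quoted above can be reproduced with a short standard-library sketch. It assumes an integer ##\alpha## and writes the gamma density in the rate form; the function names are hypothetical. Passing `1.0 / beta` as the rate mimics software that treats 40.5 as a scale parameter.

```python
import math

alpha = 5      # number of failures
beta = 40.5    # sum of failure times (hours), used as a rate
x = 0.1        # failure rate at which to evaluate the distribution


def gamma_pdf(x, shape, rate):
    """Gamma density in the rate parameterization."""
    return rate**shape * x**(shape - 1) * math.exp(-rate * x) / math.gamma(shape)


def gamma_cdf(x, shape, rate):
    """Gamma CDF for integer shape, summed as an upper Poisson tail.

    Summing the tail directly avoids the cancellation in 1 - (...) when
    the CDF is tiny. 500 terms is ample for the moderate values used here.
    """
    z = rate * x
    term = math.exp(-z) * z**shape / math.factorial(shape)
    total = 0.0
    k = shape
    for _ in range(500):
        total += term
        k += 1
        term *= z / k
    return total


# Intended reading: beta is a rate.
pdf_rate = gamma_pdf(x, alpha, beta)
cdf_rate = gamma_cdf(x, alpha, beta)

# Mistaken reading: 40.5 treated as a scale, i.e. rate = 1 / 40.5.
pdf_scale = gamma_pdf(x, alpha, 1.0 / beta)
cdf_scale = gamma_cdf(x, alpha, 1.0 / beta)

print(f"rate form:  pdf = {pdf_rate:.2f}, cdf = {cdf_rate:.2f}")
print(f"scale form: pdf = {pdf_scale:.1e}, cdf = {cdf_scale:.1e}")
```

In Excel, `GAMMA.DIST(x, alpha, beta, cumulative)` takes its third argument as a scale, which is consistent with needing `1/40.5` there.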
 
Hi Dave,

It was indeed the case that I needed to use ##1/\beta##! Thanks for the clear explanation! However, I have some questions with respect to Bayes and the "trueness" of this PDF. Is this accounted for in the Gamma distribution? Or does the value, that I get from the Gamma distribution, still have to be modified using some "trueness error"? And you told me that I could model the time to failure as an exponential distribution, is this the Gamma distribution or should I implement the mean of the Gamma distribution, with an amount of uncertainty, into another exponential distribution? Thanks for all the help so far and I'm sorry for going on and on...
 
Joris Kievits said:
Hi Dave,
I am Dale, not Dave.

Joris Kievits said:
I have some questions with respect to Bayes and the "trueness" of this PDF. Is this accounted for in the Gamma distribution? Or does the value, that I get from the Gamma distribution, still have to be modified using some "trueness error"?
I don’t know what you mean by “trueness”.

Joris Kievits said:
And you told me that I could model the time to failure as an exponential distribution, is this the Gamma distribution or should I implement the mean of the Gamma distribution, with an amount of uncertainty, into another exponential distribution?
So the idea is that we are modeling the time to failure as an exponential distribution.

The exponential distribution has a single parameter, ##\lambda##, which is the failure rate. We don’t know that failure rate, so we treat it as a random variable too, in this case as a gamma distributed variable. The gamma distribution has two parameters, ##\alpha## and ##\beta##, which are determined from the data.

So if you are doing a Monte Carlo simulation of failure times, you would first simulate a failure rate from the gamma distribution and then use that rate as the parameter of an exponential distribution to simulate the failure time. To get your next Monte Carlo draw you would simulate a new gamma draw and a new exponential draw. This process accounts for both the inherent randomness of failure times and the lack of knowledge of the exact failure rate.
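The two-stage draw described above might look like this in Python. It is a sketch using the five-point data set from earlier in the thread (##\alpha = 5##, ##\beta = 40.5##); the function name, seed, and number of draws are arbitrary choices for illustration.

```python
import random

alpha = 5      # number of observed failures
beta = 40.5    # sum of observed failure times (hours), a rate parameter


def simulate_failure_times(n_draws, alpha, beta, rng):
    """Draw failure times that reflect uncertainty in the failure rate."""
    times = []
    for _ in range(n_draws):
        # Stage 1: draw a failure rate from the gamma posterior.
        # random.gammavariate takes (shape, scale), so pass 1 / beta.
        lam = rng.gammavariate(alpha, 1.0 / beta)
        # Stage 2: draw a failure time from an exponential with that rate.
        times.append(rng.expovariate(lam))
    return times


rng = random.Random(0)   # fixed seed so the sketch is repeatable
sim_times = simulate_failure_times(100_000, alpha, beta, rng)
sim_mean = sum(sim_times) / len(sim_times)

print(f"simulated mean time to failure: {sim_mean:.2f} hours")
print(f"plug-in mean (beta / alpha):     {beta / alpha:.2f} hours")
```

The simulated mean should land near ##\beta/(\alpha - 1) \approx 10.1## hours rather than the plug-in value ##\beta/\alpha = 8.1## hours; the difference comes from the heavier tail introduced by the uncertain failure rate.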
 
Hi Dale (my apologies for mixing up your name),

By "trueness" I mean the amount of uncertainty associated with a small number of tests. If I were to perform 3 tests but the values would all be very close to each other I'd have a small standard deviation but I would expect that there is more uncertainty to the mean than just this small standard deviation... do you know if this is true?

About the exponential distribution, what value from the Gamma distribution should I use as lambda? I've attached two plots, one of the measurements and one of the Gamma distribution (for n=10) I got out of Excel. Does it look correct to you?
[Attachment: Distributions.PNG]
 
Joris Kievits said:
By "trueness" I mean the amount of uncertainty associated with a small number of tests. If I were to perform 3 tests but the values would all be very close to each other I'd have a small standard deviation but I would expect that there is more uncertainty to the mean than just this small standard deviation... do you know if this is true?
This is automatically accounted for in the Bayesian method. There, ##\alpha = n##, and a small ##\alpha## gives a broad distribution for the failure rate.
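One way to make this concrete is to look at the relative width of the gamma distribution, using the mean ##\alpha/\beta## and standard deviation ##\sqrt{\alpha}/\beta## quoted earlier in the thread. The following is a short worked step under the same exponential-model assumption:

```latex
\frac{\text{standard deviation}}{\text{mean}}
  = \frac{\sqrt{\alpha}/\beta}{\alpha/\beta}
  = \frac{1}{\sqrt{\alpha}}
  = \frac{1}{\sqrt{n}}
```

So with ##n = 3## the failure rate is uncertain by about 58% of its mean, and with ##n = 10## by about 32%, regardless of how tightly the observed failure times happen to cluster. Note that this relies on the exponential model; it does not use the sample standard deviation at all.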

Joris Kievits said:
About the exponential distribution, what value from the Gamma distribution should I use as lambda?
All of them!

If you are doing a Monte Carlo simulation then you start with a draw from the gamma distribution and only then make a draw from the exponential distribution. If you are doing something else then you would marginalize over ##\lambda##.
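For the exponential and gamma pair, marginalizing over ##\lambda## can be done in closed form: integrating the exponential survival probability ##e^{-\lambda t}## against the gamma density gives ##(\beta/(\beta + t))^{\alpha}##. The sketch below compares that with simply plugging in the mean rate, using ##\alpha = 5## and ##\beta = 40.5## from the earlier example; the function names are hypothetical.

```python
import math

alpha = 5      # number of observed failures
beta = 40.5    # sum of observed failure times (hours)


def predictive_survival(t, alpha, beta):
    """P(T > t) after integrating lambda out over Gamma(alpha, rate=beta)."""
    return (beta / (beta + t)) ** alpha


def plugin_survival(t, alpha, beta):
    """P(T > t) if lambda is fixed at the posterior mean alpha / beta."""
    return math.exp(-(alpha / beta) * t)


for t in (5.0, 10.0, 20.0, 40.0):
    marginal = predictive_survival(t, alpha, beta)
    plugin = plugin_survival(t, alpha, beta)
    print(f"t = {t:5.1f} h  marginalized = {marginal:.4f}  plug-in = {plugin:.4f}")
```

The marginalized curve has a heavier tail than the plug-in exponential, which is the effect of carrying the uncertainty in ##\lambda## through to the prediction.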

Joris Kievits said:
I've attached two plots, one of the measurements and one of the Gamma distribution (for n=10) I got out of Excel. Does it look correct to you?
I am not on my main computer, but visually yes it looks good.
 
