Strum
(Sorry for the terrible title. If anybody has a better idea, post it and I will edit. Also, I have no idea of the level, so for now I just put undergraduate since the problem is fairly easy to state.)
Suppose I buy ## N ## sensors which the manufacturer tells me will fail at some point, with the failure time following a Normal distribution (## p_{f}(t) = N[\mu_{f},\sigma_{f}^{2}](t) ##). Now, after a time ## t<\mu_{f} ##, I have ## n = |\mathbf{n}| \ll N ## failed sensors, where ## \mathbf{n} ## is an ordered vector of timestamps of when the sensors broke. I would like to quantify whether my manufacturer is living up to his end of the deal. I have made a few attempts, but I would very much like to know if there is some canonical way to do this.
Attempt 1: Calculate the joint probability of having the ## n ## sensors broken by time ## t ## in the correct order. This, I suppose, would be given by
\begin{equation}
P_{joint} = \prod_{i=1}^{n}\mathrm{cdf}[\mu_{f},\sigma_{f}^{2}](n_{i})
\end{equation}
where ## \mathrm{cdf}(t) ## is the cumulative distribution function of ## p_{f}(t) ##.
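To make Attempt 1 concrete, here is a minimal sketch that evaluates the product exactly as written above, in log space so that many small factors do not underflow. The function name `log_joint_cdf` and all numbers are hypothetical and only for illustration; whether this product is the right quantity to compute is part of my question.

```python
import math
from statistics import NormalDist

def log_joint_cdf(failure_times, mu_f, sigma_f):
    """Return log(prod_i cdf(n_i)) under N(mu_f, sigma_f^2)."""
    dist = NormalDist(mu_f, sigma_f)
    # Summing logs is numerically safer than multiplying small probabilities.
    return sum(math.log(dist.cdf(t)) for t in failure_times)

# Hypothetical failure timestamps and manufacturer parameters.
times = [310.0, 420.0, 455.0, 490.0]
log_p_joint = log_joint_cdf(times, mu_f=1000.0, sigma_f=200.0)
print("log P_joint =", log_p_joint)
```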
Attempt 2: Define a new random variable as the sum of ## n ## failure times, which will follow ## p_{tot}(t) = N[n\mu_{f},n\sigma_{f}^{2}](t) ##, and calculate the probability
\begin{equation}
P_{tot} = p_{tot}(t<\max{(\mathbf{n})})
\end{equation}
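As a sketch of Attempt 2 exactly as stated (the CDF of ## N[n\mu_{f},n\sigma_{f}^{2}] ## evaluated at ## \max(\mathbf{n}) ##), assuming independent failure times so that the variances add. The name `p_tot` as a function and the numbers are hypothetical.

```python
import math
from statistics import NormalDist

def p_tot(failure_times, mu_f, sigma_f):
    """P(sum of n failure times < max(failure_times)) under the stated model."""
    n = len(failure_times)
    # Sum of n independent N(mu_f, sigma_f^2) variables: N(n*mu_f, n*sigma_f^2).
    total = NormalDist(n * mu_f, math.sqrt(n) * sigma_f)
    return total.cdf(max(failure_times))

# Hypothetical failure timestamps and manufacturer parameters.
times = [310.0, 420.0, 455.0, 490.0]
print("P_tot =", p_tot(times, mu_f=1000.0, sigma_f=200.0))
```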
Attempt 3: Make a one-sample Kolmogorov-Smirnov test using ## \mathbf{n} ## and a cut (truncated) normal distribution, ## p_{cut} = N[\mu_{f},\sigma_{f}^{2}](t<\max(\mathbf{n})) / L ##, where ## L ## is a normalisation constant, and then estimate the significance.
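A minimal sketch of Attempt 3 using only the Python standard library. The truncation point is left as a parameter `t_cut` (set to ## \max(\mathbf{n}) ## here, as in the attempt), and ## L ## is the untruncated CDF at `t_cut`. The p-value uses the asymptotic Kolmogorov series, which I assume is only a rough guide for small ## n ##; all names and numbers are hypothetical.

```python
import math
from statistics import NormalDist

def truncated_normal_cdf(mu_f, sigma_f, t_cut):
    """CDF of N(mu_f, sigma_f^2) restricted to t < t_cut."""
    base = NormalDist(mu_f, sigma_f)
    norm = base.cdf(t_cut)  # the normalisation constant L (assumed > 0)
    def cdf(t):
        return min(base.cdf(t), norm) / norm
    return cdf

def ks_statistic(sample, cdf):
    """One-sample KS statistic: sup |ECDF - cdf|."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # Check the gap just after and just before each ECDF jump.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d

def kolmogorov_sf(lam, terms=100):
    """Asymptotic P(sqrt(n) * D > lam) from the Kolmogorov series."""
    if lam < 0.2:
        return 1.0  # the series converges slowly here; the value is ~1
    s = sum((-1) ** (j - 1) * math.exp(-2.0 * j * j * lam * lam)
            for j in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * s))

# Hypothetical failure timestamps and manufacturer parameters.
times = [310.0, 420.0, 455.0, 490.0]
cdf_cut = truncated_normal_cdf(mu_f=1000.0, sigma_f=200.0, t_cut=max(times))
d_stat = ks_statistic(times, cdf_cut)
p_value = kolmogorov_sf(math.sqrt(len(times)) * d_stat)
print("D =", d_stat, "asymptotic p =", p_value)
```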
Attempt 4: Make a two-sample Kolmogorov-Smirnov test using simulated data from ## p_{cut} ##, sort of like the answer given here: http://stats.stackexchange.com/questions/126539/testing-whether-data-follows-t-distribution , and then estimate the significance.
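A minimal sketch of Attempt 4, again standard library only. It draws simulated failure times from ## p_{cut} ## by inverse-CDF sampling and compares them with the observed times using the two-sample KS statistic. The p-value is the asymptotic approximation with effective sample size ## nm/(n+m) ##, which I assume is rough for small ## n ##; function names, the seed, and the numbers are hypothetical.

```python
import math
import random
from statistics import NormalDist

def sample_truncated_normal(mu_f, sigma_f, t_cut, size, rng):
    """Draw `size` values from N(mu_f, sigma_f^2) restricted to t < t_cut."""
    base = NormalDist(mu_f, sigma_f)
    norm = base.cdf(t_cut)  # normalisation constant L
    out = []
    while len(out) < size:
        u = norm * rng.random()
        if 0.0 < u < 1.0:  # inv_cdf is only defined on the open interval (0, 1)
            # Clamp to guard against rounding just above t_cut.
            out.append(min(base.inv_cdf(u), t_cut))
    return out

def ks_two_sample(a, b):
    """Two-sample KS statistic: sup |ECDF_a - ECDF_b|."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d

def kolmogorov_sf(lam, terms=100):
    """Asymptotic tail probability of the Kolmogorov distribution."""
    if lam < 0.2:
        return 1.0  # the series converges slowly here; the value is ~1
    s = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
            for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * s))

# Hypothetical failure timestamps and manufacturer parameters.
times = [310.0, 420.0, 455.0, 490.0]
rng = random.Random(0)  # fixed seed so the sketch is reproducible
simulated = sample_truncated_normal(1000.0, 200.0, max(times), 5000, rng)
d_stat = ks_two_sample(times, simulated)
n_eff = len(times) * len(simulated) / (len(times) + len(simulated))
p_value = kolmogorov_sf(math.sqrt(n_eff) * d_stat)
print("D =", d_stat, "asymptotic p =", p_value)
```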
I am not sure which method is best, or what the advantages and disadvantages of each are. I also need some help quantifying the uncertainty on my final answer. I understand how to calculate the uncertainty on the mean and variance of a normally distributed sample, but I do not know how to do it for this sample.
I feel this should be a super simple exercise, but I just cannot seem to get a real hold on it.