How to calculate confidence interval for a CDF curve

In summary, the thread asks how to calculate a 95% confidence interval for a curve rather than for a straight line, using a normal cumulative distribution function (CDF) as the example. The replies suggest fitting a regression model and analyzing the residuals, computing point-by-point intervals, or using software such as MATLAB or Minitab, and they give equations for a confidence interval built from a maximum-likelihood estimate and the Fisher information. Links are given for further reading.
  • #1
ILEVEN
I have a question that has confused me for a long time.

The question is how to calculate the 95% confidence interval for a curve. I have already learned how to calculate it for a straight line.

For example, the cumulative distribution function (CDF) could be expressed as below:
Y = 1/2 * {1 + erf[(X - mean) / (sd * 2^0.5)]}
where ‘erf’ is the error function, and ‘mean’ and ‘sd’ are the mean and standard deviation of X, respectively. X is normally distributed, and Y ranges from 0 to 1.

If ‘mean’ and ‘sd’ are known, varying the value of X gives a series of values of Y, which you can then plot as a typical CDF graph.
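
A minimal sketch of the step above, evaluating the formula on a grid of X values. The function name `normal_cdf` and the values mean = 10 and sd = 2 are illustrative assumptions, not part of the original question.

```python
import math

def normal_cdf(x, mean, sd):
    """Y = 1/2 * (1 + erf((x - mean) / (sd * sqrt(2))))."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

# Illustrative parameter values (assumed for this example only).
mean, sd = 10.0, 2.0

# Evaluate the curve from 3 sd below the mean to 3 sd above it.
xs = [mean + k * sd for k in (-3, -2, -1, 0, 1, 2, 3)]
ys = [normal_cdf(x, mean, sd) for x in xs]

for x, y in zip(xs, ys):
    print(f"X = {x:5.1f}  Y = {y:.4f}")
```

With known ‘mean’ and ‘sd’ the curve is exact and has no sampling uncertainty; a confidence interval only arises once the parameters, or the curve itself, are estimated from data.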

Then I need to calculate the 95% confidence intervals of this plotted curve. Could someone tell me how to do it?

I know it could be completed using MATLAB, Minitab, etc. But I want to know the algorithm.

Thank you.
 
  • #2
Someone, please help.
 
  • #3
ILEVEN said:
Someone, please help.

A flexible general-purpose model for curved data is polynomial regression. If you find a reasonable fit, you can use an analysis of residuals to determine confidence bounds. Alternatively, you can compute point-by-point CIs on the Y axis and connect the dots to form upper and lower "curves", if your data allow it. This also gives you some idea of the consistency of data quality.

http://www.mathworks.com/help/toolbox/curvefit/bq_5ka6-1_1.html

If you are just curve fitting for CDFs or PDFs, most stat packages contain programs for this.
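
One way to read the point-by-point suggestion, sketched here for an empirical CDF: at each x the estimated Y is a sample proportion, so a normal-approximation binomial interval can be computed there and the upper and lower points joined into curves. The function name `ecdf_pointwise_ci`, the simulated sample, and the choice of z = 1.96 are assumptions for illustration only.

```python
import math
import random

def ecdf_pointwise_ci(sample, grid, z=1.96):
    """Return (x, F_hat, lower, upper) for each x in grid.

    F_hat is the fraction of the sample <= x; the interval is the
    normal-approximation binomial CI, clipped to [0, 1].
    """
    n = len(sample)
    out = []
    for x in grid:
        f_hat = sum(1 for s in sample if s <= x) / n
        se = math.sqrt(f_hat * (1.0 - f_hat) / n)
        out.append((x, f_hat, max(0.0, f_hat - z * se), min(1.0, f_hat + z * se)))
    return out

# Simulated data standing in for real measurements (assumed values).
random.seed(0)
sample = [random.gauss(10.0, 2.0) for _ in range(200)]
grid = [6.0, 8.0, 10.0, 12.0, 14.0]

band = ecdf_pointwise_ci(sample, grid)
for x, f_hat, lo, hi in band:
    print(f"x = {x:5.1f}  F = {f_hat:.3f}  95% CI = [{lo:.3f}, {hi:.3f}]")
```

The normal approximation is poor where the estimate is close to 0 or 1, so the tails of the band should be treated with caution.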
 
  • #4
SW VandeCarr said:
A flexible general-purpose model for curved data is polynomial regression. If you find a reasonable fit, you can use an analysis of residuals to determine confidence bounds. Alternatively, you can compute point-by-point CIs on the Y axis and connect the dots to form upper and lower "curves", if your data allow it. This also gives you some idea of the consistency of data quality.

http://www.mathworks.com/help/toolbox/curvefit/bq_5ka6-1_1.html

If you are just curve fitting for CDFs or PDFs, most stat packages contain programs for this.

Thank you.

I have read the information on MathWorks, but I still cannot figure out what algorithm they use. I don't think C = b ± t*sqrt(S) alone can solve the problem, or I may not fully understand it.

I know I can simply use MATLAB, Minitab, etc. to analyze such statistics problems, but I need to understand how it works.

Could you give me an example, please?

Thank you!
 
  • #5
ILEVEN said:
Thank you.

I have read the information on MathWorks, but I still cannot figure out what algorithm they use. I don't think C = b ± t*sqrt(S) alone can solve the problem, or I may not fully understand it.

I know I can simply use MATLAB, Minitab, etc. to analyze such statistics problems, but I need to understand how it works.

Could you give me an example, please?

Thank you!

I don't know the proprietary algorithms they use but for unspecified non-linear regressions, it's probably an iterative ML estimate.

[tex]CI= [\hat\theta-2SE, \hat\theta+2SE][/tex]

[tex]SE=\frac{1}{\sqrt{nI_{X_i}(\hat\theta)}}[/tex]

[tex]I_X(\theta)=E(\theta)-\frac{\delta^2(\ln p(X(\theta)))}{\delta\theta^2}[/tex]

http://learning.eng.cam.ac.uk/zoubin/SALD/week3b.pdf
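
A sketch of how the CI and SE equations above could be carried over to the CDF curve itself, assuming the data are normal and both parameters are estimated by maximum likelihood. For the normal model the inverse Fisher information gives Var(mean) ≈ sd²/n and Var(sd) ≈ sd²/(2n); propagating these through the CDF to first order (the delta method, which is an added assumption and not something stated in this thread) gives SE[Y(x)] ≈ φ(u)·sqrt((1 + u²/2)/n) with u = (x − mean)/sd and φ the standard normal density. The function name and the use of 1.96 in place of the rounded 2 are illustrative choices.

```python
import math

def fitted_normal_cdf_band(sample, grid, z=1.96):
    """Pointwise band for a fitted normal CDF via ML estimates and the delta method."""
    n = len(sample)
    mu = sum(sample) / n
    # ML estimate of the standard deviation (divides by n, not n - 1).
    sigma = math.sqrt(sum((s - mu) ** 2 for s in sample) / n)
    out = []
    for x in grid:
        u = (x - mu) / sigma
        f_hat = 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))
        pdf = math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)
        # First-order standard error of F_hat from Var(mu) and Var(sigma).
        se = pdf * math.sqrt((1.0 + 0.5 * u * u) / n)
        out.append((x, f_hat, max(0.0, f_hat - z * se), min(1.0, f_hat + z * se)))
    return mu, sigma, out

# Tiny illustrative data set (assumed values).
mu, sigma, band = fitted_normal_cdf_band([8.0, 9.0, 10.0, 11.0, 12.0], [8.0, 10.0, 12.0])
for x, f_hat, lo, hi in band:
    print(f"x = {x:4.1f}  F = {f_hat:.3f}  95% CI = [{lo:.3f}, {hi:.3f}]")
```

The approximation relies on large-sample behaviour, so with very small samples such as the one shown the band is only indicative.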
 
  • #6
SW VandeCarr said:
I don't know the proprietary algorithms they use but for unspecified non-linear regressions, it's probably an iterative ML estimate.

[tex]CI= [\hat\theta-2SE, \hat\theta+2SE][/tex]

[tex]SE=\frac{1}{\sqrt{nI_{X_i}(\hat\theta)}}[/tex]

[tex]I_X(\theta)=E(\theta)-\frac{\delta^2(\ln p(X(\theta)))}{\delta\theta^2}[/tex]

http://learning.eng.cam.ac.uk/zoubin/SALD/week3b.pdf

Correction to the third equation above:

[tex]I_X(\theta)=-E_\theta\left[\frac{\partial^2 \ln p(X|\theta)}{\partial\theta^2}\right][/tex]
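
As a worked illustration (an added example, assuming X is normal with unknown mean [tex]\mu[/tex] and known standard deviation [tex]\sigma[/tex], and taking [tex]\theta=\mu[/tex]):

[tex]\ln p(X|\mu)=-\frac{1}{2}\ln(2\pi\sigma^2)-\frac{(X-\mu)^2}{2\sigma^2}[/tex]

[tex]\frac{\partial^2 \ln p(X|\mu)}{\partial\mu^2}=-\frac{1}{\sigma^2}\quad\Rightarrow\quad I_X(\mu)=\frac{1}{\sigma^2}[/tex]

[tex]SE=\frac{1}{\sqrt{nI_X(\hat\mu)}}=\frac{\sigma}{\sqrt{n}},\qquad CI=\left[\hat\mu-\frac{2\sigma}{\sqrt{n}},\ \hat\mu+\frac{2\sigma}{\sqrt{n}}\right][/tex]

This recovers the familiar interval for a sample mean, with the factor 2 standing in for the 95% normal quantile 1.96.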
 
  • #8
SW VandeCarr said:
Correction to the third equation above:

[tex]I_X(\theta)=-E_\theta\left[\frac{\partial^2 \ln p(X|\theta)}{\partial\theta^2}\right][/tex]

Thank you.
 

Related to How to calculate confidence interval for a CDF curve

Question 1:

What is a confidence interval for a CDF curve?

A confidence interval for a CDF curve is a range of values, at a given x, that is likely to contain the true value of the CDF, F(x), based on a sample. Drawn across all x, the intervals form a band around the estimated curve that shows the uncertainty of the estimate.

Question 2:

How is a confidence interval for a CDF curve calculated?

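It depends on how the curve is estimated. For an empirical CDF, a common pointwise interval at each x treats the estimate F(x) as a binomial proportion: F(x) +/- critical value x sqrt(F(x) x (1 - F(x)) / sample size). For a fitted parametric CDF, the standard errors of the estimated parameters (for example, from the Fisher information) are propagated through the CDF formula. A band that covers the whole curve simultaneously, such as one based on the Dvoretzky-Kiefer-Wolfowitz inequality, is wider than the pointwise intervals.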
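
A minimal sketch of a band that covers the whole empirical CDF at once, based on the Dvoretzky-Kiefer-Wolfowitz inequality, which gives a half-width of sqrt(ln(2/alpha) / (2n)). The function name `dkw_band` and the simulated data are assumptions for illustration, and the sketch assumes the sample has no tied values.

```python
import math
import random

def dkw_band(sample, alpha=0.05):
    """Simultaneous (1 - alpha) band for the empirical CDF via the DKW inequality."""
    n = len(sample)
    eps = math.sqrt(math.log(2.0 / alpha) / (2.0 * n))  # constant half-width
    band = []
    for i, x in enumerate(sorted(sample), start=1):
        f_hat = i / n  # empirical CDF at the i-th smallest value (no ties assumed)
        band.append((x, f_hat, max(0.0, f_hat - eps), min(1.0, f_hat + eps)))
    return eps, band

# Simulated data standing in for real measurements (assumed values).
random.seed(1)
sample = [random.gauss(10.0, 2.0) for _ in range(100)]

eps, band = dkw_band(sample)
print(f"half-width = {eps:.4f}")
for x, f_hat, lo, hi in band[::20]:
    print(f"x = {x:6.2f}  F = {f_hat:.2f}  band = [{lo:.3f}, {hi:.3f}]")
```

Unlike a pointwise interval, this band has the same half-width at every x and needs no distributional assumption beyond independent sampling.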

Question 3:

What is the significance of a confidence level in a CDF curve?

The confidence level describes how often the interval procedure captures the true value of the curve over repeated sampling. For example, at a confidence level of 95%, about 95% of intervals built this way from repeated samples would contain the true value of F(x). It is not the probability that the true value lies within one particular computed interval.

Question 4:

How does the sample size affect the confidence interval for a CDF curve?

The sample size has a direct impact on the width of the confidence interval for a CDF curve. A larger sample gives a narrower interval and therefore a more precise estimate of the curve, with the width shrinking roughly in proportion to 1/sqrt(sample size). A smaller sample gives a wider interval and a less precise estimate.

Question 5:

Can a confidence interval for a CDF curve be used to compare two or more populations?

Yes, with care. If the confidence bands for the CDF curves of two populations do not overlap over some range of x, it suggests a difference between the populations there. Overlapping bands, however, do not show that the populations are the same; a formal test such as the two-sample Kolmogorov-Smirnov test is a better basis for a conclusion.
