Finding the confidence interval

Homework Help Overview

The discussion revolves around finding the confidence interval in the context of a database reliability assessment. Participants are exploring the necessary formulas and concepts related to confidence intervals, particularly in relation to sample data and its implications for a larger dataset.

Discussion Character

  • Exploratory, Conceptual clarification, Assumption checking

Approaches and Questions Raised

  • Participants are discussing the appropriate formulas for calculating confidence intervals, with some questioning the relevance of specific equations provided. There is an exploration of the relationship between sample size, confidence level, and accuracy assumptions. The original poster is seeking clarity on how to interpret their findings in the context of the entire database.

Discussion Status

Some participants have provided guidance on the formulas for confidence intervals, while others have raised questions about the assumptions and definitions being used. There is an ongoing exploration of the problem without a clear consensus on the correct approach or interpretation.

Contextual Notes

The original poster has outlined a practical scenario involving a database with 4000 entries and a sample of 100 checked entries, noting discrepancies. There is a focus on understanding how to generalize findings from the sample to the entire dataset, including considerations of confidence levels and intervals.

Pietair

Homework Statement



What formula do I need to find the confidence interval, when I have got:

- Number of samples
- Level of Confidence
- The assumed (1st guess) accuracy

Homework Equations



I found the following equation online: µ = z * [p * (1 - p) / n] ^ (-1/2)

The Attempt at a Solution



When I fill in this formula, I get µ = 125.5, while I think the confidence interval should be around 3 percent.
 
If you want a 95% CI, then you want P(-a<Z<a)=0.95 where


[tex]Z=\frac{\bar{x}-\mu}{\frac{\sigma}{\sqrt{n}}}[/tex].


So [itex]\bar{x} \pm a \frac{\sigma}{\sqrt{n}}[/itex] will be a 95% CI for μ (for a standard normal Z, a ≈ 1.96).
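A minimal Python sketch of the interval above, assuming σ is known and Z is standard normal. The function name `z_interval` and the numbers passed to it are hypothetical and chosen only for illustration; they do not come from the thread.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(x_bar, sigma, n, confidence=0.95):
    """Two-sided z-interval for the mean, assuming sigma is known."""
    # a satisfies P(-a < Z < a) = confidence for a standard normal Z
    a = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = a * sigma / sqrt(n)
    return x_bar - half_width, x_bar + half_width


# Hypothetical inputs, for illustration only
low, high = z_interval(x_bar=10.0, sigma=2.0, n=100)
print(low, high)
```

The result is an interval with a left and a right endpoint, centred on the sample mean.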
 
Pietair said:

Homework Statement



What formula do I need to find the confidence interval, when I have got:

- Number of samples
- Level of Confidence
- The assumed (1st guess) accuracy

Homework Equations



I found the following equation online: µ = z * [p * (1 - p) / n] ^ (-1/2)
This may or may not be relevant to your problem. This formula looks vaguely related to a binomial distribution. You haven't said what the distribution is, so it's hard to say if this is something you need to use.
Pietair said:

The Attempt at a Solution



When I fill in this formula, I get µ = 125.5, while I think the confidence interval should be around 3 percent.

Again, you haven't provided enough information for me to tell if this is a reasonable value for µ. What you said about the confidence interval makes no sense at all. A confidence interval is an interval, with a left endpoint and a right endpoint. It is not given as a percentage.
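One possible explanation for the 125.5, sketched in Python: the quoted formula has exponent -1/2, whereas the usual normal-approximation margin of error for a proportion uses the square root (exponent +1/2). The inputs below (p = 0.975, n = 100, 95% confidence) are assumptions, since the thread does not state which values were used; they are picked because they reproduce the reported number.

```python
from math import sqrt
from statistics import NormalDist

# Assumed inputs; the thread does not say which values were plugged in
z = NormalDist().inv_cdf(0.975)  # about 1.96 for 95% confidence
p = 0.975                        # assumed first-guess accuracy
n = 100                          # assumed sample size

variance_of_proportion = p * (1 - p) / n

# Formula as quoted in the post, with exponent -1/2
as_posted = z * variance_of_proportion ** (-1 / 2)

# Normal-approximation margin of error, with exponent +1/2
margin_of_error = z * sqrt(variance_of_proportion)

print(round(as_posted, 1))        # about 125.5
print(round(margin_of_error, 3))  # about 0.031, i.e. roughly 3 percentage points
```

Under these assumptions the quantity is a margin of error (the half-width of the interval), not µ, and the interval itself would be p ± margin_of_error.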
 
Thank you for your replies.

Here is all the information I have about this practice situation:

Information written down on a form will be put in a database. The information in the database can be correct (it matches the information written on the form) or incorrect (it does not match the information written on the form). A mismatch occurs when the database administrator enters the wrong information (for example, putting "b" in the database when "a" is written on the form).

Now I would like to take a sample to judge whether the data in the database is reliable (i.e. consistent with the source information) or not. The database contains a total of 4000 entries. I would like to take a sample because it is quite time-consuming to check whether all 4000 entries are correct. With this sample I would like to say something about the reliability of the entire database (4000 entries).

So, suppose I have 100 entries checked and 2 of them do not match. Then 98% of the entries in the sample are consistent with the source information. But what can I say about the confidence level and confidence interval of this 98% with respect to the entire database (4000 entries)?

Thanks in advance!
 
Has anyone got an idea regarding this practical situation?
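For the practical situation described (98 matches in a sample of 100 from 4000 entries), a rough Python sketch of two common interval estimates for a proportion is given below. It assumes the 100 entries were drawn at random without replacement and that 95% confidence is wanted; neither is stated in the thread. The function name `proportion_intervals` is hypothetical. The first interval is the normal approximation with a finite population correction, the second is the Wilson score interval (without that correction), which tends to behave better when the proportion is close to 1.

```python
from math import sqrt
from statistics import NormalDist


def proportion_intervals(matches, n, population, confidence=0.95):
    """Return (normal-approximation interval with FPC, Wilson interval)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p_hat = matches / n

    # Normal approximation, shrunk by the finite population correction
    fpc = sqrt((population - n) / (population - 1))
    half = z * sqrt(p_hat * (1 - p_hat) / n) * fpc
    # Clip to [0, 1], since a proportion cannot lie outside that range
    normal = (max(0.0, p_hat - half), min(1.0, p_hat + half))

    # Wilson score interval (no finite population correction applied)
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    spread = z * sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    wilson = (centre - spread, centre + spread)

    return normal, wilson


normal, wilson = proportion_intervals(matches=98, n=100, population=4000)
print(normal)  # roughly (0.953, 1.0)
print(wilson)  # roughly (0.930, 0.995)
```

Either result is read as a range for the fraction of correct entries in the whole database at the chosen confidence level, rather than as a single percentage.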
 
