Calculating the confidence interval of your data

tsaitea
Hello guys,

I would like to calculate a confidence interval for my data. In other words, I would like to know how much confidence we can have that the data is correct.

Could anyone direct me where to start?

Thank you.
 
Googling "confidence interval" is a good place to start.
Note: you decide how much confidence you want, e.g. 95%. The confidence interval is then the range within which you can have that confidence.
 
Thank you for your reply Simon.

That definitely helped me understand the confidence interval. I am struggling to put it into context now. Say for example I am loading 300 lines of data into a database. Now I want to figure out the # of errors that would occur during the load (data not loaded properly).

What I was thinking is that I would perform the load, calculate the sample size I would need for a 95% confidence level, and then randomly check lines until I have checked that many. Based on the # of errors found, could I then determine the confidence interval?
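For illustration, here is a rough sketch of the sample-size step described above. It uses the standard formula for estimating a proportion with a finite population correction. The 5% margin of error and the worst-case guess p = 0.5 are assumptions that do not come from this thread, and the function name `sample_size` is made up for the example.

```python
import math

def sample_size(population, margin=0.05, p=0.5, z=1.96):
    """Number of lines to check to estimate an error fraction.

    population: total number of loaded lines (300 in the example above)
    margin:     desired half-width of the interval (assumed, not from the thread)
    p:          prior guess of the error fraction; 0.5 is the worst case (assumed)
    z:          normal quantile, 1.96 for 95% confidence
    """
    # Sample size for an effectively infinite population.
    n0 = z**2 * p * (1 - p) / margin**2
    # Finite population correction: fewer checks are needed when the
    # population itself is small.
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)

print(sample_size(300))
```

Under these assumptions, about 169 of the 300 lines would need checking. With so small a population, checking every line may be nearly as cheap as sampling.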
 
Let me try to restate your problem. Say you want to load millions of lines into a database and want to estimate what percentage of the lines is corrupt. You decide to check, say, N=250 lines at random. Let's assume that the number of corrupt lines follows a Poisson distribution and you find that n=25 lines (i.e. 10%) are corrupt. Your estimate of the probability, or fraction, of corrupted lines is then p=n/N=0.1. For a Poisson distribution the variance equals the mean, so it is also estimated as 25, and the standard deviation is ## \sigma=\sqrt{n}=5 ##. Using a normal approximation you can construct a 95% confidence interval for the true fraction as ## [ (n - z \sigma)/N, (n + z \sigma)/N ] ##, where for a 95% interval z=2 (1.96 to be exact).
So, taking z=2, the true value lies in the range [15/250, 35/250] = [0.06, 0.14].
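The arithmetic above can be checked with a small Python sketch. It only reproduces the Poisson/normal approximation from this post. The function name `poisson_normal_ci` is made up for the example, and clamping the result to [0, 1] is an added assumption.

```python
import math

def poisson_normal_ci(n_bad, n_checked, z=1.96):
    """Approximate confidence interval for the fraction of corrupt lines.

    Assumes the count of corrupt lines is Poisson distributed, so its
    standard deviation is sqrt(n_bad), and uses a normal approximation.
    """
    sigma = math.sqrt(n_bad)
    lower = (n_bad - z * sigma) / n_checked
    upper = (n_bad + z * sigma) / n_checked
    # A fraction cannot leave [0, 1], so clamp the bounds (added assumption).
    return max(0.0, lower), min(1.0, upper)

low, high = poisson_normal_ci(25, 250)             # z = 1.96
low2, high2 = poisson_normal_ci(25, 250, z=2.0)    # rounded z used above
print(f"z=1.96: [{low:.4f}, {high:.4f}]")
print(f"z=2:    [{low2:.4f}, {high2:.4f}]")
```

With z=2 this gives [0.06, 0.14], i.e. [15/250, 35/250]. With z=1.96 the interval is slightly narrower, roughly [0.061, 0.139].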
 