Sample size calculation, defect analysis

AI Thread Summary
Testing a rewritten software function involved processing 15,000 data sets, revealing one defect due to a coding error, which was subsequently fixed. A second test with the same number of data sets showed no defects of that type, raising questions about confidence levels in defect estimates. While the test data aimed to mimic real-world scenarios, it may not accurately represent actual usage, leading to uncertainty in the defect rate. Statistical methods like p-charts and the Student's t distribution were discussed for determining sample size and confidence intervals, but their applicability is limited due to the nature of the test data. Overall, while the specific defect was resolved, the potential for other undetected bugs remains a concern.
cinger
I am testing a change made to a software application; a function had to be rewritten to perform in a completely different way. The specific part of this application iterates through input data sets and outputs the data to another application. To test that data was being sent correctly, 15,000 data sets were passed through. This is a time-expensive test, taking about 10 hours of labor for processing and 2-3 hours for data analysis.

During this test, 1 defect was encountered where the specific function failed to send data. The developers reviewed the code and attributed the defect to a counter being incremented twice where it should have been incremented only once. The code was corrected, and another test of 15,000 data sets was performed with 0 defects of this specific type seen. Other defects occurred, but they were completely outside the scope of the specific function in question. Once the software is complete and ships, it will be responsible for passing millions of transactions in this manner.
How would confidence levels be determined for the first and second tests, i.e., how confident can we be with 0 defects seen in the second test?
The first test showed 1 defect in 15,000 samples, i.e. about 67 defects per million encounters; how confident is this defect-per-million estimate?
How would an adequate sample size be determined?
Would p-charts (http://en.wikipedia.org/wiki/P-chart) be appropriate here?
If p-charts are useful: I attempted the sample size calculation listed on the wiki page, but I wasn't sure of the units; I could post some of that as well if it is applicable here.
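For concreteness, here is a minimal sketch of the calculations these questions point at, assuming Python with scipy (none of this code is from the original post, and the positive-LCL rule used for the p-chart sizing is a common guideline rather than necessarily the wiki formula):

```python
from scipy.stats import beta

n = 15_000  # data sets per test run

# Second test: 0 defects seen.  "Rule of three" style bound: the exact
# one-sided 95% upper limit on the defect rate when zero events occur
# in n independent trials.
upper_zero = 1 - 0.05 ** (1 / n)
print(f"0/{n}: 95% upper bound ~ {upper_zero:.2e} "
      f"({upper_zero * 1e6:.0f} per million)")

# First test: 1 defect seen.  Exact (Clopper-Pearson) 95% two-sided interval.
k = 1
lower = beta.ppf(0.025, k, n - k + 1)
upper = beta.ppf(0.975, k + 1, n - k)
print(f"{k}/{n}: 95% CI ~ ({lower:.1e}, {upper:.1e}) "
      f"= ({lower * 1e6:.1f}, {upper * 1e6:.0f}) per million")

# p-chart sizing: for the chart to have a positive lower control limit,
# a common guideline is n > 9 * (1 - p_bar) / p_bar per subgroup.
p_bar = k / n
print(f"p-chart with positive LCL needs n > {9 * (1 - p_bar) / p_bar:.0f}")
```

With these numbers, the upper bound for the zero-defect test comes out near 200 per million, and the interval around the 67-per-million point estimate spans roughly 2 to 370 per million; with only one observed event, the interval is very wide.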
 
That kind of analysis doesn't apply here, because you fixed the bug: you should have no defects of that type in the future.

Also, your test data sets are not necessarily representative of the data sets that you will find in practice - a bug that happens once in 15000 runs might actually happen 1 in 100 runs for real data, or it might happen 1 in 2 million runs for real data. So you can't draw conclusions about real usage based on your test run.
 
The test data had been selected to be as representative of actual field data as possible, with an increased percentage of input data designed to catch errors associated with the type of bug encountered, given the software's new functionality and changes. This particular bug has been fixed, and no defects were seen when the test was repeated; but perhaps there are other bugs with a smaller probability of error, since it is a very large piece of software. Defects of this type may no longer be attributable to this bug, but the same type of defect could still arise from a different bug. Can any statistical information be derived from these tests? Can an adequate sample size be determined?
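One way to frame the sample-size question: if the goal is to see at least one defect with confidence C whenever the true defect rate is at least p, then n must satisfy 1 - (1 - p)^n >= C. A minimal sketch, assuming Python and independent runs (the rates below are illustrative, not from the thread):

```python
from math import ceil, log

def runs_needed(p: float, confidence: float = 0.95) -> int:
    """Smallest n with P(at least one defect in n runs) >= confidence,
    assuming independent runs with true defect rate p."""
    return ceil(log(1 - confidence) / log(1 - p))

# Illustrative rates: the observed 1/15,000 and two smaller hypothetical rates.
for p in (1 / 15_000, 1 / 100_000, 1 / 1_000_000):
    print(f"rate {p:.1e}: ~{runs_needed(p):,} runs for 95% chance of detection")
```

By this yardstick, 15,000 runs only gives good odds of catching defects whose rate is about 1 in 5,000 or worse; reliably detecting a one-in-a-million defect would take roughly three million runs.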
 
The problem is still that your test data is not field data. You could use the Student's t distribution to get a confidence interval for the bug rate, but it wouldn't be very meaningful (see the sketch below), especially since there may be many bugs that you have not noticed.

You could just track the absolute number of unfixed functional bugs. Maybe divide it by the total lines of code or by the number of current developers to get a proportional figure.
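Regarding the t-based interval mentioned above, a quick sketch of why it "wouldn't be very meaningful" for 1 defect in 15,000 (assuming Python with scipy; not part of the original reply):

```python
from math import sqrt
from scipy.stats import t

n, k = 15_000, 1
p_hat = k / n
se = sqrt(p_hat * (1 - p_hat) / n)   # standard error of the sample proportion
t_crit = t.ppf(0.975, df=n - 1)      # ~1.96 at this sample size

print(f"t-based 95% CI: ({p_hat - t_crit * se:.2e}, {p_hat + t_crit * se:.2e})")
# The lower limit comes out negative -- an impossible defect rate -- because
# the normal/t approximation is poor when the event count is this small.
# An exact binomial (Clopper-Pearson) interval avoids that problem.
```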
 