Let's say I'm testing anti-spam software on a sample of 100 messages. The number of false positives (i.e., friendly messages misidentified as spam) is 40. The number of false negatives (spam messages misidentified as friendly) is also 40. How many more messages would I need to test in order to be 99.99% confident that the null hypothesis can or cannot be rejected?
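
To make the arithmetic concrete, here is a rough sketch of one way the numbers could be plugged into a one-sample z-test for a proportion. The question does not state a null hypothesis, so the sketch assumes a hypothetical one: that the filter's true error rate is `p0 = 0.5` (no better than a coin flip). It also assumes a two-sided test, the normal approximation, and that the observed error rate of 80/100 would stay the same in a larger sample. The function names are made up for illustration.

```python
import math
from statistics import NormalDist


def z_critical(confidence: float) -> float:
    """Two-sided critical z value for the given confidence level."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


def z_statistic(errors: int, n: int, p0: float) -> float:
    """z statistic for the observed error rate against the assumed null rate p0."""
    p_hat = errors / n
    standard_error = math.sqrt(p0 * (1.0 - p0) / n)
    return (p_hat - p0) / standard_error


def min_sample_size(p_hat: float, p0: float, confidence: float) -> int:
    """Smallest n at which |z| reaches the critical value,
    assuming the observed rate p_hat does not change as n grows."""
    z = z_critical(confidence)
    return math.ceil(z ** 2 * p0 * (1.0 - p0) / (p_hat - p0) ** 2)


# Numbers from the question: 40 false positives + 40 false negatives in 100 messages.
errors, n = 40 + 40, 100
p0 = 0.5  # ASSUMPTION: hypothetical null hypothesis, not given in the question

z = z_statistic(errors, n, p0)
z_crit = z_critical(0.9999)
needed = min_sample_size(errors / n, p0, 0.9999)

print(f"z = {z:.2f}, critical z = {z_crit:.2f}")
print(f"minimum n under these assumptions: {needed}")
print(f"additional messages needed: {max(0, needed - n)}")
```

Under a different null hypothesis, a one-sided test, or an error rate closer to `p0`, the required sample size would change, possibly by a lot. This calculation also ignores statistical power, so it should be read as a lower bound on the sample size rather than a recommendation.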