## Main Question or Discussion Point

I'm a programmer, but I know very little about statistics and am not even sure where or how to ask this. Let's say you have 2 variables about people in general, var A and var B, that are tangible characteristics of these people. Each person possesses either A or B.

I then take 11 different measurements of each person and use those to determine whether they are actually A or B without looking at them. The program successfully determines whether someone is A or B in a group of 10 people. But as I test more and more people, I find that some people have slight differences or exceptions in their variables that I have to account for.

Example: All A people have the first variable in a range of 12 to 13 and the 2nd variable in a range of 5 to 6, but then I find an A person who has a value of 1 for the 2nd variable. So I add to the formula that if the 2nd variable = 1, then the person has A.
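To make the example concrete, the rule so far could be sketched roughly like this in Python. The function name `classify`, the list layout, and the fallback of returning "B" when no rule matches are assumptions for illustration only. The sketch uses just the first two of the 11 measurements, and it assumes the exception for the 2nd variable applies regardless of the first variable.

```python
def classify(measurements):
    """Return "A" or "B" for one person.

    `measurements` is a list of 11 numbers; only the first two are used
    in this sketch (hypothetical layout, not the real program).
    """
    first, second = measurements[0], measurements[1]

    # Original rule: first variable in 12..13 and second variable in 5..6.
    if 12 <= first <= 13 and 5 <= second <= 6:
        return "A"

    # Exception added later, after seeing an A person whose second variable was 1.
    # Assumption: this applies no matter what the first variable is.
    if second == 1:
        return "A"

    # Assumption: anyone who matches no A rule is treated as B.
    return "B"
```

Each new exception adds another branch like the second one, so the function fits the people already seen more and more closely.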

My question - How many people would I have to test to get an accuracy rating above 80% for the program, or is that even possible? As I add more and more subjects that fit the formula, does that translate into an increase in the accuracy of the program when it is used on the general population?
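One way to frame the question is as estimating a proportion: run the program, with its rules frozen, on people who were not used to write those rules, count how many it gets right, and put a confidence interval around that fraction. The sketch below uses the Wilson score interval for that. The function name `wilson_interval`, the 95% level (`z=1.96`), and the counts in the usage note are illustrative assumptions, not results from the actual program.

```python
import math


def wilson_interval(correct, total, z=1.96):
    """Wilson score interval for the true accuracy.

    correct: number of held-out people classified correctly
    total:   number of held-out people tested
    z:       normal quantile (1.96 is roughly a 95% interval)
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = correct / total                      # observed accuracy
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return center - half, center + half


# Hypothetical counts: same observed accuracy (90%), different sample sizes.
small = wilson_interval(9, 10)
large = wilson_interval(90, 100)
```

With these made-up counts, 9 of 10 correct gives an interval of roughly 0.60 to 0.98, which does not rule out an accuracy below 80%, while 90 of 100 gives roughly 0.83 to 0.94. Testing more people narrows the estimate but does not by itself raise the accuracy, and people whose results were used to add exceptions should not be counted in `total`, since the rules were written to fit them.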
