# Stats experts, GTHIH!

1. Aug 9, 2010

### Jamin2112

1. The problem statement, all variables and given/known data

A group of 50,000 tax forms has an average gross income of $37,000, with an SD of $20,000. Furthermore, 20% of the forms have a gross income over $50,000. A group of 900 forms is chosen at random for audit. To estimate the chance that between 19% and 21% of the forms chosen for audit have gross incomes over $50,000, a box model is needed.
(a) Should the number of tickets in the box be 900 or 50,000?
(b) Each ticket in the box shows (choose one): a zero or a one / a gross income
(c) True or false: the SD of the box is $20,000.
(d) True or false: the number of draws is 900.
(e) Find the chance (approximately) that between 19% and 21% of the forms chosen for the audit have gross incomes over $50,000.
(f) With the information given, can you find the chance (approximately) that between 9% and 11% of the forms chosen for the audit have gross incomes over $75,000? Either find the chance, or explain why you need more information.

2. Relevant equations

The basic Stats stuff: Standard Deviation, Standard Error, Mean, Expected Value, etc.

3. The attempt at a solution

Just tell me if I've done these right.

20% of 50,000 tax forms is 10,000 forms. So if we were to make a box model, it would be like

+1 +0 +0 +0 +0 +1 +0 +0 +0 +0 +1 +0 +0 +0 +0 + ........

10,000 incomes over $50,000 and 40,000 incomes not over $50,000, in other words. So if you are to draw 900 forms out of the 50,000, it'd be like drawing 900 times, with replacement, from the box model. We expect that 1/5 of our 900 draws will be over $50,000; that's our best guess. So we expect 180 forms to be over $50,000.

But I'm a little confused about doing the Standard Error here. If the first form you draw is an income over $50,000, the chance of drawing a second is not 1/5 but 9999/49999. See what I'm saying? It's the replacement that's confusing me. If the draws were made with replacement, the SE would simply be

(1-0) * √[(10,000/50,000)*(40,000/50,000)] * √[900] = 0.4 * 30 = 12.

Then I could estimate the area under the normal curve between 19% * 900 = 171 and 21% * 900 = 189 (that is, within 9/12 = 0.75 SEs of the expected 180), which is 54.67%, .... and do the rest of the problem.
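The arithmetic in this attempt can be checked with a short Python sketch. It is only an illustration under stated assumptions: it uses the normal approximation, and it also applies the standard finite-population correction factor √[(N − n)/(N − 1)] for draws made without replacement, which does not appear in the post itself. All variable and function names are hypothetical.

```python
import math

def normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def chance_between(lo, hi, center, se):
    """Normal-approximation chance that a count lands between lo and hi."""
    return normal_cdf((hi - center) / se) - normal_cdf((lo - center) / se)

N = 50_000          # tickets in the box (all tax forms)
n = 900             # draws (forms chosen for audit)
p = 10_000 / N      # fraction of tickets marked 1 (income over $50,000)

expected = n * p                             # expected number of 1s among the draws
sd_box = (1 - 0) * math.sqrt(p * (1 - p))    # SD of a 0-1 box
se_with = sd_box * math.sqrt(n)              # SE for draws WITH replacement

# Assumption beyond the post: correction factor for draws WITHOUT replacement.
correction = math.sqrt((N - n) / (N - 1))
se_without = se_with * correction

lo, hi = 0.19 * n, 0.21 * n                  # 171 and 189 forms
chance_with = chance_between(lo, hi, expected, se_with)
chance_without = chance_between(lo, hi, expected, se_without)

print(f"expected count      : {expected:.1f}")
print(f"SE with replacement : {se_with:.3f}")
print(f"correction factor   : {correction:.4f}")
print(f"SE without replacem.: {se_without:.3f}")
print(f"chance (with)       : {chance_with:.4f}")
print(f"chance (without)    : {chance_without:.4f}")
```

Because 900 draws are a small fraction of the 50,000 tickets, the correction factor comes out close to 1, so the two SEs and the two chances differ only slightly.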