Inferential Statistics Problem

  • Thread starter: Elpinetos
  • Tags: Statistics
Elpinetos

Homework Statement



Using a sample, party A wants to estimate the proportion p of voters who voted for party A. They want to determine p to within 5% with 90% certainty. How many people do they need to survey?
Let X be the number of party-A voters among n surveyed people.

I don't need the solution for this, since it is already solved step-by-step in the textbook. I just don't understand the first step and am looking for an explanation.

They say that through the given information you can calculate this:

$$P\left(\left|\frac{X}{N}-p\right|\leq 0.05\right)\geq 0.90$$

Afterwards, they rearrange it to

$$P\left(\left|\frac{X-np}{\sigma}\right|\leq\frac{0.05n}{\sigma}\right)\geq 0.90$$

and solve for z.

The last part I understand, but I don't understand how they infer the first statement from the given information.

I don't really have much experience in stats, and I have trouble understanding how to get from the information given to the first equation.

Thank you in advance! :)
 
What happens to the inequality
$$\left| \frac{X}{N} - p \, \right| \le 0.05$$

when you multiply both sides by the value
$$\frac N \sigma$$
(which is positive, so the inequality doesn't change direction)?
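
Spelled out, the multiplication can be sketched as follows. This assumes (the thread does not state it explicitly) that ##N## and ##n## denote the same sample size and that ##\sigma = \sqrt{np(1-p)}## is the standard deviation of the binomial count ##X##:

$$
\left|\frac{X}{n}-p\right|\le 0.05
\;\iff\;
\frac{n}{\sigma}\left|\frac{X}{n}-p\right|\le\frac{0.05\,n}{\sigma}
\;\iff\;
\left|\frac{X-np}{\sigma}\right|\le\frac{0.05\,n}{\sigma}
$$

The left-hand side of the last form is the standardized count, which is what allows a standard normal table to be used under the normal approximation.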
 
I understand how they get from the first to the second equation. I don't understand how they get the first equation from the information given.
 

The quantity ##X/N## is the measured proportion, while ##p## is the theoretical proportion. You want the difference between these two proportions to be ≤ 0.05 with high probability (that is, with probability at least 0.9). Also: use either the symbol ##N## or the symbol ##n##, but not both in the same problem (as you have done). That can cause confusion and loss of marks, etc.
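
As a rough illustration of where that inequality leads, here is a minimal Python sketch of the sample-size calculation. It is not the textbook's solution: it assumes the normal approximation to the binomial, ##\sigma = \sqrt{np(1-p)}##, and the worst case ##p = 0.5## (which maximizes ##\sigma##). The function name `required_sample_size` is made up for this example.

```python
import math
from statistics import NormalDist


def required_sample_size(margin, confidence, p=0.5):
    """Smallest n with P(|X/n - p| <= margin) >= confidence,
    under the normal approximation (an assumption of this sketch)."""
    # Two-sided condition P(|Z| <= z) = confidence gives
    # z = Phi^{-1}((1 + confidence) / 2).
    z = NormalDist().inv_cdf((1 + confidence) / 2)
    # Require margin * n / sqrt(n p (1 - p)) >= z,
    # i.e. n >= z^2 p (1 - p) / margin^2.
    return math.ceil(z**2 * p * (1 - p) / margin**2)


n_needed = required_sample_size(margin=0.05, confidence=0.90)
print(n_needed)
```

Using ##p = 0.5## is a conservative choice: since ##p## is unknown before the survey, taking the value that makes ##p(1-p)## largest guarantees the bound for every possible ##p##.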
 
I see now, thank you.

What I still don't fully understand is how to set up the first equation in problems like this.
I now understand how they got it in this particular example, but how do I approach such a problem in general? Where do I start? I wouldn't have thought of taking the distance between those two quantities.
Can you help me with that? :)
 
Not sure if this helps, but it can be generalized a little.
There is some property of the distribution, s, that you want to estimate. (In the present case it's p.) You have a procedure for estimating it from sample data, and the procedure produces the estimate ##\hat s##. You want the probability of the estimate being within ε of s to exceed some threshold probability τ. Algebraically, ##P[|\hat s - s| < \epsilon] > \tau##.
In the OP, the assumption is that the estimate for p will be X/N.
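
To make the general statement concrete, the probability ##P[|\hat s - s| \le \epsilon]## can be evaluated directly for the sample proportion ##\hat s = X/n##. The sketch below assumes ##X## is exactly binomial with parameters ##n## and ##p##, and uses ≤ as in the original post; the function name `prob_within` is hypothetical.

```python
from math import comb


def prob_within(n, p, eps):
    """Exact P(|X/n - p| <= eps) for X ~ Binomial(n, p)."""
    total = 0.0
    for x in range(n + 1):
        # Keep only outcomes whose estimate X/n lands within eps of p.
        if abs(x / n - p) <= eps:
            total += comb(n, x) * p**x * (1 - p) ** (n - x)
    return total


# Compare a small and a larger survey for p = 0.5, eps = 0.05.
for n in (50, 271):
    print(n, prob_within(n, 0.5, 0.05))
```

Increasing ##n## until this probability exceeds the threshold τ is the same search the textbook carries out algebraically with the normal approximation.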
 
Thank you, this helps a lot :)

Any tips on how to best approach such problems? What should I look out for? How do I start best?
 
I can't think of anything I can add to what I wrote before. It's a matter of understanding what the objective is (finding the probability distribution of the error in an estimate for a parameter of another distribution) and expressing that in algebra.
 