# Homework Help: Inferential Statistics Problem

1. Jan 6, 2014

### Elpinetos

1. The problem statement, all variables and given/known data

Using a sample, party A wants to estimate the proportion p of voters who voted for party A. They want to determine p to within 5% (that is, ±0.05) with 90% certainty. How many people do they need to survey?
Let X be the number of party-A voters among the n people surveyed.

I don't need the solution for this, since it is already solved step-by-step in the textbook. I just don't understand the first step and am looking for an explanation.

They say that through the given information you can calculate this:

$P(|\frac{X}{N}-p|\leq0.05)\geq0.90$

Afterwards, they rearrange it to

$P(|\frac{X-np}{\sigma}|\leq\frac{0.05n}{\sigma})\geq0.90$

and solve for z.

The last part I understand, but I don't understand how they infer the first statement from the given information.

I don't really have much experience in stats, and I have trouble understanding how to get from the information given to the first equation.

2. Jan 6, 2014

What happens to the inequality
$$\left| \frac{X}{N} - p \, \right| \le 0.05$$

when you multiply both sides by the value
$$\frac N \sigma$$

which is positive (so the inequality does not change direction)?

3. Jan 6, 2014

### Elpinetos

I understand how they get from the first to the second equation. I don't understand how they get the first equation from the information given.

4. Jan 6, 2014

### Ray Vickson

The quantity $X/N$ is the measured proportion, while $p$ is the theoretical proportion. You want the difference between these two proportions to be ≤ 0.05 with high probability (that is, with probability at least 0.9). Also: use either the symbol $N$ or the symbol $n$, but do not mix both in the same problem, as you have done. Mixing them can cause confusion and loss of marks.
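
To see where this probability statement leads, here is a minimal Python sketch of the sample-size calculation. It is not the textbook's worked solution. It assumes the normal approximation to the binomial with $\sigma = \sqrt{np(1-p)}$, and, since $p$ is unknown, it assumes the worst case $p = 0.5$, which maximizes $\sigma$. The function name `required_sample_size` and its parameters are made up for illustration.

```python
import math
from statistics import NormalDist


def required_sample_size(margin, confidence, p=0.5):
    """Smallest n with P(|X/n - p| <= margin) >= confidence,
    under the normal approximation to the binomial.

    p=0.5 is an assumed worst case (it maximizes p*(1-p)),
    not a value given in the problem.
    """
    # Two-sided statement: P(|Z| <= z) = confidence  <=>  Phi(z) = (1 + confidence) / 2
    z = NormalDist().inv_cdf((1 + confidence) / 2)
    # With sigma = sqrt(n*p*(1-p)), the condition margin*n/sigma >= z
    # rearranges to n >= z^2 * p*(1-p) / margin^2.
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


print(required_sample_size(0.05, 0.90))  # -> 271
```

Under these assumptions, about 271 people would need to be surveyed. A different assumed value of $p$ gives a smaller number.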

5. Jan 7, 2014

### Elpinetos

I see now, thank you.

What I still don't fully understand is how to set up the first equation in problems like this.
I now understand how they got to it in this particular example, but how do I even approach such a problem? Where do I start? I somehow wouldn't have thought of taking the distance between those two.
Can you help me with that? :)

6. Jan 7, 2014

### haruspex

Not sure if this helps, but it can be generalized a little.
There is some property of the distribution, s, that you want to estimate. (In the present case it's p.) You have a procedure for estimating it from sample data, and the procedure produces the estimate $\hat s$. You want the probability of the estimate being within ε of s to exceed some threshold probability τ. Algebraically, $P[|\hat s - s| < \epsilon] > \tau$.
In the OP, the assumption is that the estimate for p will be X/N.
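
The statement $P[|\hat s - s| < \epsilon] > \tau$ can also be checked empirically. The following is a rough simulation sketch, not part of the original problem: it repeatedly draws a sample of size n, computes the estimate X/n, and counts how often the estimate lands within ε of the true value. The function name `coverage`, the chosen n = 271, the true p = 0.5, and the fixed seed are all assumptions made for illustration.

```python
import random


def coverage(n, p, eps, trials=2000, seed=0):
    """Estimate P(|X/n - p| <= eps) by simulation,
    where X counts successes in n independent trials with success probability p."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        # Simulate one survey: each respondent votes for party A with probability p
        x = sum(rng.random() < p for _ in range(n))
        # Count the survey as a hit if the estimate is within eps of the true p
        if abs(x / n - p) <= eps:
            hits += 1
    return hits / trials


# Assumed values: n = 271 respondents, true proportion 0.5, tolerance 0.05
print(coverage(271, 0.5, 0.05))
```

If the sample size is adequate, the printed fraction should come out at or above the threshold τ = 0.90, up to simulation noise.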

7. Jan 7, 2014

### Elpinetos

Thank you, this helps a lot :)

Any tips on how to best approach such problems? What should I look out for? How do I start best?

8. Jan 7, 2014

### haruspex

I can't think of anything I can add to what I wrote before. It's a matter of understanding what the objective is (finding the probability distribution of the error in an estimate for a parameter of another distribution) and expressing that in algebra.