Probability Problem: Find Total # of Typos in 269-Page Book

aaaa202 · Jun 12, 2013

Homework Statement

The first 18 pages of a 269-page book are examined for typos, and 7 are found. Given this, find the probability that the entire book contains 7, 8, 9, ... typos in total.

Homework Equations

Probably the binomial distribution but I am very unsure.

The Attempt at a Solution

I am very unsure where to start. I might want to use the binomial distribution somewhere, but on the other hand it doesn't really fit the problem. I am not looking for an answer, merely a hint as to where to start. Thank you :)

LCKurtz · Jun 12, 2013

aaaa202 said:

Homework Statement

The first 18 pages of a 269-page book are examined for typos, and 7 are found. Given this, find the probability that the entire book contains 7, 8, 9, ... typos in total.

Homework Equations

Probably the binomial distribution but I am very unsure.

The Attempt at a Solution

I am very unsure where to start. I might want to use the binomial distribution somewhere, but on the other hand it doesn't really fit the problem. I am not looking for an answer, merely a hint as to where to start. Thank you :)

I would think a logical assumption would be that the errors are uniformly distributed throughout the book, and your data suggest that the probability that a given page has an error is 7/18. If you have independence among pages and you let ##X_i = 1## if there is an error on page ##i## and ##X_i = 0## otherwise, aren't you inquiring about ##X = X_1 + X_2 + \dots + X_{269}##? So...

[Edit] The more I think about it, the less sure I am. Hint not guaranteed.

awkward · Jun 12, 2013

The Poisson distribution might work well here.

aaaa202 · Jun 12, 2013

If the probability that a page has an error is 7/18, what if there were 21 errors? Would it then be 21/18? That wouldn't make sense.

haruspex · Jun 12, 2013

aaaa202 said:

If the probability that a page has an error is 7/18, what if there were 21 errors? Would it then be 21/18? That wouldn't make sense.

By what logic would you multiply a number of errors per page by a number of errors? Multiplying by a number of pages would be reasonable, and that would indeed give you the expected (average) number of errors in that many pages.
A precise answer is not possible because we're not told how many letters per page, so, as suggested, a Poisson distribution seems appropriate. (Poisson is an approximation to the binomial which works well when there's a large number of trials and relatively few 'successes'.)
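A minimal Python sketch of the Poisson suggestion follows. It assumes the per-page error rate is known and equal to the observed 7/18 (a plug-in estimate that ignores the uncertainty in the rate), and that errors on the unexamined pages are independent of those already found. Under those assumptions the number of errors on the remaining 251 pages is Poisson with mean 251 × 7/18 ≈ 97.6, and the book total is 7 plus that count. The function names are illustrative only.

```python
import math

PAGES_TOTAL = 269
PAGES_SEEN = 18
ERRORS_SEEN = 7

# Plug-in assumption: the true per-page rate equals the observed rate.
rate_per_page = ERRORS_SEEN / PAGES_SEEN
mean_remaining = rate_per_page * (PAGES_TOTAL - PAGES_SEEN)

def poisson_pmf(k, mean):
    """P(K = k) for a Poisson variable, computed in log space to avoid overflow."""
    if k < 0:
        return 0.0
    return math.exp(-mean + k * math.log(mean) - math.lgamma(k + 1))

def prob_total(m):
    """P(book has m errors in total), given 7 already found in the first 18 pages."""
    return poisson_pmf(m - ERRORS_SEEN, mean_remaining)

for m in (7, 8, 9, 100, 105):
    print(m, prob_total(m))
```

Because the rate is treated as exact, this sketch understates the spread of the total compared with an analysis that also models uncertainty in the rate.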

HallsofIvy · Jun 13, 2013

aaaa202 said:

If the probability that a page has an error is 7/18, what if there were 21 errors? Would it then be 21/18? That wouldn't make sense.

If you were told that there were 21 errors in 18 pages then there would, indeed, be an average of 21/18 = 7/6 errors per page. Why would that not make sense? Because it is larger than one? Are you assuming that you cannot have more than one error per page? Why?

aaaa202 · Jun 13, 2013

I thought he used 21/18 as a probability. Anyway, as it turns out, the correct answer is not found using the Poisson distribution.
Rather, use Bayes' theorem:

P(m errors | 7 errors in first 18 pages) = C * P(7 errors in first 18 pages | m errors)

where C is a normalization constant.

Ray Vickson · Jun 13, 2013

aaaa202 said:

I thought he used 21/18 as a probability. Anyway, as it turns out, the correct answer is not found using the Poisson distribution.
Rather, use Bayes' theorem:

P(m errors | 7 errors in first 18 pages) = C * P(7 errors in first 18 pages | m errors)

where C is a normalization constant.

This tells you precisely nothing. If ##E_m## = {m errors in book} and ##E_{7,18}## = {7 errors in first 18 pages}, we have
$$P(E_m \mid E_{7,18}) = P(E_{7,18} \mid E_m)\,\frac{P(E_m)}{P(E_{7,18})},$$
so your C is equal to ##P(E_m)/P(E_{7,18})##. Now ##P(E_{7,18} \mid E_m)## is computable from a model (say, a uniform distribution of the m errors throughout the book), but we still need to know ##P(E_m)## to get anywhere.

aaaa202 · Jun 13, 2013

P(E_m) is just the a priori probability that the book contains m errors given no background information. We can set that to a constant.

Ray Vickson · Jun 13, 2013

aaaa202 said:

P(E_m) is just the a priori probability that the book contains m errors given no background information. We can set that to a constant.

You seem to be under the impression that problems of this type have "right" and "wrong" answers. That is not the case.

You are arguing for a Bayesian analysis using a so-called uniform prior, but in problems like this one it is perfectly acceptable for two different people to use two different "priors", and there is not really any way to say for sure that one is right and the other is wrong. For example, an editor or a publisher may have a lot of experience regarding misprints, and might use a prior very different from the uniform one you propose. Besides that, there are the so-called "classical statisticians" who would reject the use of Bayes Theorem entirely in such problems. (Do not misinterpret what I say: of course Bayes Theorem is a true theorem in Probability Theory, but the issue is how you apply it in certain situations---or, rather, whether it applies at all in some contexts.)

Even if we accept the Bayesian viewpoint, you still need a probability model for ##P(E_{7,18} \mid E_m)##. What model would YOU use? What actual answer would you get?
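The dependence on the prior can be illustrated numerically. The sketch below is not anyone's posted solution. It assumes the binomial likelihood that aaaa202 proposes later in the thread (each of the m errors falls in the first 18 pages independently with probability 18/269), truncates the total at a hypothetical cap `M_MAX`, and compares a flat prior with a Poisson prior whose mean of 50 is an arbitrary stand-in for an editor's experience.

```python
import math

P_SEEN = 18 / 269   # chance that a given error lies in the examined pages
FOUND = 7
M_MAX = 1000        # hypothetical truncation of the total error count

def likelihood(m):
    """P(exactly 7 of the m errors fall in the first 18 pages), binomial model."""
    if m < FOUND:
        return 0.0
    return math.comb(m, FOUND) * P_SEEN**FOUND * (1 - P_SEEN)**(m - FOUND)

def posterior(prior):
    """Normalise likelihood * prior over m = 0..M_MAX (Bayes' theorem)."""
    weights = [likelihood(m) * prior(m) for m in range(M_MAX + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def flat_prior(m):
    return 1.0

def poisson_prior(m, mean=50.0):
    # Hypothetical prior: the mean is an arbitrary value chosen for illustration.
    return math.exp(-mean + m * math.log(mean) - math.lgamma(m + 1))

def mean_of(dist):
    return sum(m * p for m, p in enumerate(dist))

flat_mean = mean_of(posterior(flat_prior))
poisson_mean = mean_of(posterior(poisson_prior))
print(flat_mean, poisson_mean)
```

With these choices the posterior mean comes out at roughly 118.6 errors under the flat prior and roughly 53.7 under the Poisson prior, so the same data support quite different conclusions depending on the prior.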

aaaa202 · Jun 13, 2013

P(7 errors in first 18 pages | m errors) = (18/269)^7 * (1-18/269)^(m-7) * K(m,7), where K is the binomial coefficient. I don't see what other models to use than a binomial distribution. You could then find C by computing an infinite sum, but I don't think you can evaluate the sum - it's quite ugly.
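The binomial model in this post can be written out directly. The sketch below evaluates the likelihood for several candidate totals and reports the total that maximises it. It treats each of the m errors as landing in the first 18 pages independently with probability 18/269, as stated above; the search limit of 500 is an arbitrary choice.

```python
import math

p = 18 / 269
found = 7

def likelihood(m):
    """P(exactly 7 of m errors are in the first 18 pages)."""
    if m < found:
        return 0.0
    return math.comb(m, found) * p**found * (1 - p)**(m - found)

# Likelihood for a few candidate totals.
for m in (7, 8, 9, 50, 104, 200):
    print(m, likelihood(m))

# Most likely total under this model, searched over an arbitrary range.
best_m = max(range(found, 501), key=likelihood)
print("maximum-likelihood total:", best_m)
```

The maximum falls at m = 104, the largest integer not exceeding 7 × 269/18 ≈ 104.6, which matches simply scaling the observed count up to the whole book.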

haruspex · Jun 13, 2013

aaaa202 said:

P(7 errors in first 18 pages | m errors) = (18/269)^7 * (1-18/269)^(m-7) * K(m,7), where K is the binomial coefficient. I don't see what other models to use than a binomial distribution. You could then find C by computing an infinite sum, but I don't think you can evaluate the sum - it's quite ugly.

You're overlooking that pages are arbitrary boundaries here. Typos occur at a much finer granularity. E.g. if there are 2000 letters to a page then the info is effectively that there were 7 errors in 36000 letters. This low hit rate makes Poisson entirely appropriate.
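A quick numerical check of this approximation, using the illustrative figure of 2000 letters per page from the post (so 36000 letters in 18 pages). The per-letter error probability of 7/36000 is an assumption made only so that the expected count matches the 7 observed errors.

```python
import math

N_LETTERS = 18 * 2000          # illustrative figure from the post above
P_TYPO = 7 / N_LETTERS         # assumed per-letter error probability
MEAN = N_LETTERS * P_TYPO      # Poisson mean, equal to 7

def binomial_pmf(k, n, p):
    """P(exactly k errors among n independent letters)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, mean):
    """P(exactly k errors) under the Poisson approximation."""
    return math.exp(-mean) * mean**k / math.factorial(k)

# Largest gap between the two distributions over k = 0..30 errors.
max_gap = max(
    abs(binomial_pmf(k, N_LETTERS, P_TYPO) - poisson_pmf(k, MEAN))
    for k in range(31)
)
print("largest absolute difference:", max_gap)
```

The two distributions agree closely at every count because the number of trials is large and the per-trial probability is tiny, which is the regime described above.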

aaaa202 · Jun 13, 2013

I think you misunderstand. The probability for a specific typo to occur in the first 18 pages is 18/269. This has nothing to do with whether my granularity is fine enough.

Joffan · Jun 13, 2013

aaaa202 said:

P(7 errors in first 18 pages | m errors) = (18/269)^7 * (1-18/269)^(m-7) * K(m,7), where K is the binomial coefficient. I don't see what other models to use than a binomial distribution. You could then find C by computing an infinite sum, but I don't think you can evaluate the sum - it's quite ugly.

C=18/269 for a uniform prior (all total error counts equally likely).

All right, I admit, it's an empirical result from evaluating the sum in Excel. But completely consistent as I vary the parameters: sum(p(E_found|E_i)) = 1/(proportion_examined).
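The empirical value can be confirmed in closed form. Writing ##p = 18/269## and ##q = 1 - p##, and assuming the flat prior over all totals ##m \ge 7## together with the binomial likelihood quoted above, the normalising sum is a negative binomial series:

$$\sum_{m=7}^{\infty}\binom{m}{7}p^{7}q^{m-7} = p^{7}\sum_{k=0}^{\infty}\binom{k+7}{7}q^{k} = \frac{p^{7}}{(1-q)^{8}} = \frac{1}{p} = \frac{269}{18}.$$

The middle step uses the identity ##\sum_{k \ge 0}\binom{k+r}{r}q^{k} = (1-q)^{-(r+1)}## with ##r = 7##. Hence ##C = p = 18/269##, and under these assumptions the posterior is

$$P(E_m \mid E_{7,18}) = \binom{m}{7}p^{8}q^{m-7}, \qquad m = 7, 8, 9, \dots$$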

haruspex · Jun 13, 2013

aaaa202 said:

I think you misunderstand. The probability for a specific typo to occur in the first 18 pages is 18/269. This has nothing to do with whether my granularity is fine enough.

I was commenting on this statement

I don't see what other models to use than a binomial distribution.

which I took to be a general statement about the problem, regardless of approach.
