Probability problem

  1. Jun 12, 2013 #1
    1. The problem statement, all variables and given/known data
    The first 18 pages of a 269-page book are examined for typos and 7 are found. Given this, find the probability that the entire book contains 7, 8, 9, ... typos in total.

    2. Relevant equations
    Probably the binomial distribution but I am very unsure.


    3. The attempt at a solution
    I am very unsure where to start. I might want to use the binomial distribution somewhere, but on the other hand it doesn't really fit the problem. I am not looking for an answer, merely a hint as to where to start. Thank you :)
     
  3. Jun 12, 2013 #2

    LCKurtz

    Science Advisor
    Homework Helper
    Gold Member

    I would think a logical assumption would be that the errors are uniformly distributed throughout the book. And your data suggests the probability that a given page has an error is 7/18. If you have independence among pages and you let ##X_i = 1## if there is an error on page ##i## and ##0## otherwise, aren't you inquiring about ##X = X_1+X_2+\dots+X_{269}##? So...

    [Edit] The more I think about it, I'm not so sure, hint not guaranteed :frown:
     
    Last edited: Jun 12, 2013
  4. Jun 12, 2013 #3
    The Poisson distribution might work well here.
     
  5. Jun 12, 2013 #4
    If the probability that a page has an error is 7/18, what if there had been 21 errors? Would it then be 21/18? That wouldn't make sense.
     
  6. Jun 12, 2013 #5

    haruspex

    Science Advisor
    Homework Helper
    Gold Member
    2016 Award

    By what logic would you multiply a number of errors per page by a number of errors? Multiplying by a number of pages would be reasonable, and that would indeed give you the expected (average) number of errors in that many pages.
    A precise answer is not possible because we're not told how many letters per page, so, as suggested, a Poisson distribution seems appropriate. (Poisson is an approximation to the binomial which works well when there's a large number of trials and relatively few 'successes'.)
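A minimal sketch of the Poisson approach suggested above. It assumes the rate estimated from the sample (7/18 typos per page) also holds for the 251 unexamined pages, so the number of further typos is Poisson with mean 251 × 7/18 ≈ 97.6. The helper name `prob_total_typos` and the plug-in rate are illustrative assumptions, not something stated in the thread.

```python
import math

PAGES_TOTAL = 269
PAGES_CHECKED = 18
TYPOS_FOUND = 7

# Plug-in estimate of the typo rate (assumption: same rate on every page).
rate_per_page = TYPOS_FOUND / PAGES_CHECKED
mean_remaining = rate_per_page * (PAGES_TOTAL - PAGES_CHECKED)

def prob_total_typos(m: int) -> float:
    """P(total typos == m) under the plug-in Poisson model (hypothetical helper)."""
    k = m - TYPOS_FOUND  # typos that would sit in the unexamined pages
    if k < 0:
        return 0.0
    # Work in log space so that k! does not overflow for large k.
    log_p = k * math.log(mean_remaining) - mean_remaining - math.lgamma(k + 1)
    return math.exp(log_p)

print(f"expected total typos: {TYPOS_FOUND + mean_remaining:.1f}")
for m in (7, 50, 100, 105, 150):
    print(m, prob_total_typos(m))
```

Treating the estimated rate as exact ignores the uncertainty in the estimate itself, so this should be read as a first approximation only.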
     
  7. Jun 13, 2013 #6

    HallsofIvy

    Staff Emeritus
    Science Advisor

    If you were told that there were 21 errors in 18 pages then there would, indeed, be an average of 21/18 = 7/6 errors per page. Why would that not make sense? Because it is larger than one? Are you assuming that you cannot have more than one error per page? Why?
     
  8. Jun 13, 2013 #7
    I thought he used 21/18 as a probability. Anyway, as it turns out, the correct answer is not found using the Poisson distribution.
    Rather, use Bayes' theorem:

    P(m errors | 7 errors in first 18 pages) = C * P(7 errors in first 18 pages | m errors)

    where C is a normalization constant.
     
  9. Jun 13, 2013 #8

    Ray Vickson

    Science Advisor
    Homework Helper



    This tells you precisely nothing. If E_m = {m errors in book} and E_7,18 = {7 errors in first 18 pages} we have
    [tex] P(E_m|E_{7,18}) = P(E_{7,18}|E_m) \frac{P(E_m)}{P(E_{7,18})}[/tex]
    so your C is equal to P(E_m)/P(E_7,18). Now P(E_7,18 | E_m) is computable from a model (say, a uniform distribution of the m errors throughout the book), but we still need to know P(E_m) to get anywhere.
     
    Last edited: Jun 13, 2013
  10. Jun 13, 2013 #9
    P(E_m) is just the a priori probability that the book contains m errors given no background information. We can set that to a constant.
     
  11. Jun 13, 2013 #10

    Ray Vickson

    Science Advisor
    Homework Helper

    You seem to be under the impression that problems of this type have "right" and "wrong" answers. That is not the case.

    You are arguing for a Bayesian analysis using a so-called uniform prior, but in problems like this one it is perfectly acceptable for two different people to use two different "priors", and there is not really any way to say for sure that one is right and the other is wrong. For example, an editor or a publisher may have a lot of experience regarding misprints, and might use a prior very different from the uniform one you propose. Besides that, there are the so-called "classical statisticians" who would reject the use of Bayes Theorem entirely in such problems. (Do not misinterpret what I say: of course Bayes Theorem is a true theorem in Probability Theory, but the issue is how you apply it in certain situations---or, rather, whether it applies at all in some contexts.)

    Even if we accept the Bayesian viewpoint, you still need a probability model for P(E_7,18|E_m). What model would YOU use? What actual answer would you get?
     
  12. Jun 13, 2013 #11
    P(7 errors in first 18 pages | m errors) = (18/269)^7 * (1-18/269)^(m-7) * K(m,7), where K is the binomial coefficient. I don't see what model to use other than a binomial distribution. You could then find C by computing an infinite sum, but I don't think you can evaluate the sum; it's quite ugly.
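The binomial likelihood in this post, combined with the constant prior proposed earlier in the thread, can be explored numerically. The sketch below is only an illustration: the truncation point `M_MAX` and the function names are arbitrary assumptions, and the infinite sum is approximated by cutting it off where the terms have become negligible.

```python
from math import comb

P_IN_SAMPLE = 18 / 269  # chance that any one typo falls in the examined pages
FOUND = 7
M_MAX = 2000            # truncation point for the infinite sum (assumption)

def likelihood(m: int) -> float:
    """P(7 typos in first 18 pages | m typos in the book), binomial model."""
    if m < FOUND:
        return 0.0
    return comb(m, FOUND) * P_IN_SAMPLE**FOUND * (1 - P_IN_SAMPLE)**(m - FOUND)

# Approximate the normalising sum over m = 7, 8, 9, ... by truncation.
total = sum(likelihood(m) for m in range(FOUND, M_MAX))

def posterior(m: int) -> float:
    """P(m typos in book | 7 found), assuming a flat prior over m."""
    return likelihood(m) / total

most_likely = max(range(FOUND, M_MAX), key=posterior)
print("normalising sum:", total)
print("most probable total:", most_likely)
```

A different prior would change `posterior` but not `likelihood`, which is the part of the calculation the model actually fixes.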
     
  13. Jun 13, 2013 #12

    haruspex

    Science Advisor
    Homework Helper
    Gold Member
    2016 Award

    You're overlooking that pages are arbitrary boundaries here. Typos occur at a much finer granularity. E.g. if there are 2000 letters to a page then the info is effectively that there were 7 errors in 36000 letters. This low hit rate makes Poisson entirely appropriate.
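The claim that a low hit rate makes Poisson appropriate can be checked directly. This sketch uses the hypothetical 2000 letters per page from the post and, as a further assumption, plugs in 7/36000 as the per-letter typo probability; it then compares the binomial and Poisson probabilities for the number of typos in the 18 examined pages.

```python
from math import comb, exp, factorial

LETTERS = 18 * 2000       # 36000 letters examined (hypothetical page size)
p_letter = 7 / LETTERS    # plug-in per-letter typo probability (assumption)
lam = LETTERS * p_letter  # matching Poisson mean, approximately 7

def binom_pmf(k: int) -> float:
    """Exact binomial probability of k typos among LETTERS letters."""
    return comb(LETTERS, k) * p_letter**k * (1 - p_letter)**(LETTERS - k)

def poisson_pmf(k: int) -> float:
    """Poisson approximation with the same mean."""
    return exp(-lam) * lam**k / factorial(k)

for k in range(15):
    print(k, binom_pmf(k), poisson_pmf(k))
```

The two columns agree closely for every k printed, which is the sense in which the per-page boundaries stop mattering at this granularity.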
     
  14. Jun 13, 2013 #13
    I think you misunderstand. The probability for a specific typo to occur in the first 18 pages is 18/269. This has nothing to do with whether my granularity is fine enough.
     
  15. Jun 13, 2013 #14
    C=18/269 for a uniform prior (all total error counts equally likely).

    All right, I admit, it's an empirical result from evaluating the sum in Excel. But completely consistent as I vary the parameters: sum(p(E_found|E_i)) = 1/(proportion_examined).
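The empirical result has a closed-form explanation. Writing p = 18/269 for the proportion examined and using the standard negative binomial series, valid for |x| < 1, the sum can be evaluated exactly. The symbols p and x are introduced here for illustration only.

```latex
\sum_{m=7}^{\infty} \binom{m}{7} x^{m-7} = \frac{1}{(1-x)^{8}}, \qquad |x| < 1
% Set x = 1 - p in the binomial likelihood:
\sum_{m=7}^{\infty} \binom{m}{7} p^{7} (1-p)^{m-7} = \frac{p^{7}}{p^{8}} = \frac{1}{p} = \frac{269}{18}
% Hence C = p = 18/269, and under a uniform prior
P(m \text{ errors} \mid 7 \text{ found}) = \binom{m}{7} p^{8} (1-p)^{m-7}, \qquad m = 7, 8, 9, \dots
```

Under that uniform prior, m - 7 therefore follows a negative binomial distribution with parameters 8 and p, which matches the observation that the sum always comes out as 1/(proportion examined).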
     
  16. Jun 13, 2013 #15

    haruspex

    Science Advisor
    Homework Helper
    Gold Member
    2016 Award

    I was commenting on this statement
    which I took to be a general statement about the problem, regardless of approach.
     