Inferring quantity based on changing repetition rate

  • Context: Undergrad 
  • Thread starter: sir_manning
  • Tags: Rate

Discussion Overview

The discussion revolves around estimating the total number of individual comics on a webcomic site based on the frequency of repeats encountered while clicking a random selection button. Participants explore statistical methods to infer this quantity, considering assumptions about distribution and sampling.

Discussion Character

  • Exploratory
  • Mathematical reasoning

Main Points Raised

  • One participant suggests that if the comic selection is truly random, the number of unique comics can be estimated by analyzing the frequency of repeats encountered during random clicks.
  • Another participant proposes assuming a uniform distribution of comics, indicating that the average number of clicks before encountering a repeat can help estimate the probability of selecting a given comic.
  • A later reply introduces the idea of using maximum likelihood estimation on a joint Poisson distribution to model the occurrences of each comic, although they note potential computational difficulties if the number of comics is large.
  • Another method is proposed that involves counting the average occurrences of each comic after a set number of clicks, suggesting that this could provide a probability estimate for selecting a comic, though the participant expresses uncertainty about the required sample size for accuracy.

Areas of Agreement / Disagreement

Participants present multiple competing views on how to approach the estimation problem, with no consensus reached on the best method or the necessary conditions for accuracy.

Contextual Notes

There are assumptions about the uniformity of comic distribution and the computational feasibility of certain statistical methods that remain unverified. The discussions also highlight uncertainty regarding the sample size needed for reliable estimates.

sir_manning
Hey

I was bored the other day and I stumbled across a webcomic site. In my boredom, I kept on clicking "random" to choose a random comic. After a couple clicks on the random button, a comic that I had already read popped up.

Assuming it was truly random, I would have to run out of new comics sometime. Also, as the number of comics I hadn't read got smaller and smaller, clicking the random button would give me more and more comics that I had already read.

So now I'm wondering how I could go about getting an approximate number of how many individual comics exist on the site by applying some statistics to my initial few clicks (assuming that out of these clicks I will mostly get comics I haven't read but I will also get some repeats).

Thanks.
 
I'd start by making the assumption that they are uniformly distributed. For each comic you get, there should be an average number of clicks before you get that comic again.

The average number of clicks would be given by:

[tex]\sum_{n=1}^\infty n \, p \, (1-p)^{n-1} = \frac{1}{p}[/tex]

where p is the probability of getting a given comic each click.

Thus if you measure the average number of clicks it takes for a given comic to repeat, you can estimate p. Once you have p, the number of comics N is equal to 1/p.
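
A minimal sketch of this idea in Python, assuming the random button picks uniformly: under that assumption the gap between successive appearances of a given comic is geometric with mean 1/p, so the average observed gap estimates N. The function name `estimate_from_return_times` and the simulated site with 200 comics are hypothetical, chosen only for illustration.

```python
import random
from statistics import mean

def estimate_from_return_times(draws):
    """Estimate the number of comics as the mean gap between repeat sightings.

    Assumes every click picks a comic uniformly at random, so the gap between
    consecutive appearances of the same comic has mean 1/p = N.
    """
    last_seen = {}  # comic id -> index of the click where it last appeared
    gaps = []       # number of clicks between consecutive appearances
    for i, comic in enumerate(draws):
        if comic in last_seen:
            gaps.append(i - last_seen[comic])
        last_seen[comic] = i
    if not gaps:
        return None  # no repeats yet, so there is nothing to average
    return mean(gaps)

# Hypothetical site with 200 comics, sampled with 20000 clicks.
rng = random.Random(0)
clicks = [rng.randrange(200) for _ in range(20000)]
print(estimate_from_return_times(clicks))
```

With a short click history the long gaps are cut off before they complete, so this estimate tends to come out too low; it should only be trusted when the number of clicks is well above the number of comics.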
 
As a note to the above, the most accurate method would be to do maximum likelihood on a joint Poisson distribution. Your random variable would be the number of occurrences of each comic after a given number of clicks. If there are a lot of comics, I suspect that this would be difficult computationally.
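
As a rough illustration of the maximum likelihood idea, here is a sketch that swaps in a simpler model than the joint Poisson one: it assumes uniform selection and uses the exact occupancy likelihood, which depends on the data only through the number of clicks n and the number of distinct comics d seen, L(M) ∝ M!/(M-d)! / M^n. The function names and the simulated data are hypothetical.

```python
import math
import random

def log_likelihood(m, n_clicks, n_distinct):
    """Log-likelihood (up to a constant) of m comics under uniform selection."""
    return (math.lgamma(m + 1) - math.lgamma(m - n_distinct + 1)
            - n_clicks * math.log(m))

def mle_number_of_comics(draws, max_m=1_000_000):
    """Return the integer m that maximizes the occupancy likelihood."""
    n = len(draws)
    d = len(set(draws))
    if d == n:
        return None  # no repeats: the likelihood keeps growing with m
    best_m, best_ll = d, log_likelihood(d, n, d)
    for m in range(d + 1, max_m + 1):
        ll = log_likelihood(m, n, d)
        if ll < best_ll:
            break  # the likelihood has a single peak in m, so stop here
        best_m, best_ll = m, ll
    return best_m

# Hypothetical site with 200 comics, sampled with only 600 clicks.
rng = random.Random(0)
clicks = [rng.randrange(200) for _ in range(600)]
print(mle_number_of_comics(clicks))
```

In this simplified form the search is a one-dimensional scan over m, so the cost stays modest even for a large number of comics. That would not necessarily hold for a model that allows non-uniform selection probabilities.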
 
Here is a simple method. After N clicks (N here is the number of clicks, not the number of comics), count the number of times each comic has shown up, subtract one from each count, and take the average. Divide this by N.

Clearly for large N this should give the probability of getting a given comic each click. The inverse of this should be related to the number of comics. However, I'm not sure how large N needs to be for this to give a good estimate.
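
One reading of this recipe, sketched in Python under the assumption that the average is taken over the comics actually seen. The name `simple_count_estimate` and the simulated 200-comic site are hypothetical.

```python
import random
from collections import Counter

def simple_count_estimate(draws):
    """Estimate the number of comics from the average repeat count per comic."""
    if not draws:
        return None
    n_clicks = len(draws)
    counts = Counter(draws)  # how many times each seen comic showed up
    # Average number of extra (repeat) appearances per comic seen.
    avg_extra = sum(c - 1 for c in counts.values()) / len(counts)
    if avg_extra == 0:
        return None  # no repeats yet, so the estimate is undefined
    p_hat = avg_extra / n_clicks  # estimated chance of a given comic per click
    return 1 / p_hat

# Hypothetical site with 200 comics: compare a short and a long click history.
rng = random.Random(1)
for n in (300, 20000):
    clicks = [rng.randrange(200) for _ in range(n)]
    print(n, simple_count_estimate(clicks))
```

Algebraically this estimate equals n·d/(n − d) for n clicks and d distinct comics seen, so it overshoots when n is comparable to the number of comics and only settles near the true count once n is much larger.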
 
