Challenge Micromass' big probability challenge

PeroK · May 20, 2016

BiGyElLoWhAt said:

So here's my thinking (not a huge probability guy, so I might be missing something, but I think this makes sense):

We want to find the number of consecutive heads/tails such that the probability is 1 in 4.320432x10^17. The probability of flipping 1 heads is 1/2, the probability of two is (1/2)^2, and the probability of n heads is (1/2)^n. So we have (1/2)^n = 1/4.320432x10^(17)
or
##2^n = 4.320432E17##
##\text{log}_2(2^n) = \text{log}_2(4.320432E17) ## which gives us about ##n=58.584## which honestly seems a little low to me. But that's my guess, anyways.

I think you might have missed that to get exactly ##n## heads, you also need to follow up the heads with a tail. So, what you calculated is ##n+1 = 58.584##.

andrewkirk · May 20, 2016

Here's my attempt at Problem 4:

The number of different 8-shoe sequences that can be picked in 8 selections is T\equiv P^{200}_8=\frac{200!}{(200-8)!}

The number of those that have exactly one pair is the product of the following factors:

(number of different pairs that could be the complete pair)
(number of different ordered pairs of integers from {1,...,8} without replacement that could denote the pick numbers of the left and right shoes in the complete picked pair)
(number of different ordered 6-tuples from {1,...,99} without replacement that could denote the pairs from which the six picked unpaired shoes come)
(number of different ordered 6-tuples from {left,right} with replacement that could denote which foot each picked, unpaired shoe is for)

This is

$$R\equiv 100\times P^8_2\times P^{99}_6\times 2^6$$

So the probability of picking exactly one pair from eight picks is ##R/T##.

A quick calc tells me this is about 13%.

andrewkirk · May 20, 2016

Here's my attempt at number 6:

The number of spots in which aces can be placed is

$$N=\sum_{i=1}^{52-6}\ \ \sum_{j=i+2}^{52-4}\ \ \sum_{k=j+2}^{52-2}\ \ \sum_{l=k+2}^{52-0}1=
\sum_{i=1}^{52-6}\ \ \sum_{j=i+2}^{52-4}\ \ \sum_{k=j+2}^{52-2}\ \ (54-k)$$

The outer sum is over possible positions for the first ace, the sum inside that over possible positions for the second ace, and so on. The deductions from the upper sum limits are to leave room for the remaining aces without any of them being adjacent. The lower sum limits are to leave at least one space between this ace and the preceding one.

So the number of arrangements with no aces adjacent is

N x
(number of different orders for the four aces) x
(number of different orders for the remaining 48 cards)

This is
$$R=N\times 4!\times 48!$$

The probability of this occurring will be

$$\frac{R}{52!}=\frac{24N}{52\cdot 51\cdot 50\cdot 49}$$

My calc is that N=263,764, in which case the probability would be 97%. That's surprisingly high. I'll have to check.

andrewkirk · May 20, 2016

Here is my partial attempt at 8.

The probability will be ##3!=6## times ##R## where ##R## is the probability that the first piece is the longest, the third piece the shortest and an obtuse triangle is formed. That factor of six allows for the different possible orderings of the pieces.

Then, letting x and y be the coordinates of the first and second breaks, we have

$$R=\int_{1/3}^{1/2}\int_{\frac{x+1}2}^{2x}\chi_A((x,y))dy\ dx$$

where ##A=\{(x,y)\in\mathbb R^2\ |\ x^2>(y-x)^2+(1-y)^2\}## is the requirement that the triangle be obtuse (there are points in A that are not triangles, but that doesn't matter because the integration limits ensure triangularity.

The integral limits are necessary and sufficient conditions for the first piece to be the longest, the second to be the next longest and for a triangle to be formed.

lower outer limit could have been 0 but, because of the inner limits, the integrand will be zero if x is in [0,1/3], so we might as well set the limit to 1/3.
upper outer limit is necessary for a triangle to be possible (triangle inequality)
lower inner limit is necessary and sufficient for the 3rd piece to be shorter than the 2nd
upper inner limit is necessary and sufficient for 2nd piece to be shorter than 1st

Limits 3 and 4 together ensure that 1st length > 2nd length > 3rd length.

My problem is that I can't yet see an analytic formulation of the indicator function ##\chi_A##.

A crude numerical integration gives a number between ##8\frac12##% and 9%.

fresh_42 · May 20, 2016

andrewkirk said:

Here is my partial attempt at 8.

The probability will be ##3!=6## times ##R## where ##R## is the probability that the first piece is the longest, the third piece the shortest and an obtuse triangle is formed. That factor of six allows for the different possible orderings of the pieces.

Then, letting x and y be the coordinates of the first and second breaks, we have

$$R=\int_{1/3}^{1/2}\int_{\frac{x+1}2}^{2x}\chi_A((x,y))dy\ dx$$

where ##A=\{(x,y)\in\mathbb R^2\ |\ x^2>(y-x)^2+(1-y)^2\}## is the requirement that the triangle be obtuse (there are points in A that are not triangles, but that doesn't matter because the integration limits ensure triangularity.

The integral limits are necessary and sufficient conditions for the first piece to be the longest, the second to be the next longest and for a triangle to be formed. My problem is that I can't yet see an analytic formulation of the indicator function ##\chi_A##.

A crude numerical integration gives a number between ##8\frac12##% and 9%.

You can transform ##x^2>(y-x)^2+(1-y)^2## into ##uv > 1##. Could this help?

micromass · May 20, 2016

PeroK said:

Amended answer for number 4 (100 pairs this time):

To have exactly one complete pair, you need one pair and six odd shoes. The pair is equally likely to be any two shoes, from the 1st and 2nd to the 7th and 8th. There are, therefore, ##\binom{8}{2} = 28## equally likely ways to get a pair.

The probability that the 1st and 2nd shoes are the pair is:

##p_{12} = \frac{200}{200} \times \frac{1}{199} \times \frac{198}{198} \times \frac{196}{197} \times \frac{194}{196} \times \frac{192}{195} \times \frac{190}{194} \times \frac{188}{193}##

Hence, the total probablity that there is exactly one pair is:

##p = 28 \times p_{12} = 0.13##

QuantumQuest said:

Problem 4

There are 100 ways of choosing 1 pair of shoes from 100 pairs. Because we have to choose exactly one complete pair, for all the remaining shoes we cannot have a match, so there are ##99 \choose 6## ways to choose pairs for the remaining shoes. Now, in order to select individual shoes from the six pairs the possible ways are ##2^6##. So, if we multiply ##99 \choose 6## ## \cdot 100 \cdot 2^6## and divide by the number of possible ways of selecting 8 shoes we get what is asked:

##\frac {{99 \choose 6} \cdot 100 \cdot 2^6}{200 \choose 8} \approx 0.130154##

Those replies are both correct. I'm afraid for QuantumQuest that PeroK was first though, so he gets credited.

micromass · May 20, 2016

andrewkirk said:

Here's my attempt at Problem 4:

The number of different 8-shoe sequences that can be picked in 8 selections is T\equiv P^{200}_8=\frac{200!}{(200-8)!}

The number of those that have exactly one pair is the product of the following factors:

(number of different pairs that could be the complete pair)

(number of different ordered pairs of integers from {1,...,8} without replacement that could denote the pick numbers of the left and right shoes in the complete picked pair)

(number of different ordered 6-tuples from {1,...,99} without replacement that could denote the pairs from which the six picked unpaired shoes come)

(number of different ordered 6-tuples from {left,right} with replacement that could denote which foot each picked, unpaired shoe is for)

This is

$$R\equiv 100\times P^8_2\times P^{99}_6\times 2^6$$

So the probability of picking exactly one pair from eight picks is ##R/T##.

A quick calc tells me this is about 13%.

Correct, but PeroK was first.

andrewkirk said:

Here's my attempt at number 6:

The number of spots in which aces can be placed is

$$N=\sum_{i=1}^{52-6}\ \ \sum_{j=i+2}^{52-4}\ \ \sum_{k=j+2}^{52-2}\ \ \sum_{l=k+2}^{52-0}1=
\sum_{i=1}^{52-6}\ \ \sum_{j=i+2}^{52-4}\ \ \sum_{k=j+2}^{52-2}\ \ (54-k)$$

The outer sum is over possible positions for the first ace, the sum inside that over possible positions for the second ace, and so on. The deductions from the upper sum limits are to leave room for the remaining aces without any of them being adjacent. The lower sum limits are to leave at least one space between this ace and the preceding one.

So the number of arrangements with no aces adjacent is

N x
(number of different orders for the four aces) x
(number of different orders for the remaining 48 cards)

This is
$$R=N\times 4!\times 48!$$

The probability of this occurring will be

$$\frac{R}{52!}=\frac{24N}{52\cdot 51\cdot 50\cdot 49}$$

My calc is that N=263,764, in which case the probability would be 97%. That's surprisingly high. I'll have to check.

That number is too high.

andrewkirk said:

Here is my partial attempt at 8.

The probability will be ##3!=6## times ##R## where ##R## is the probability that the first piece is the longest, the third piece the shortest and an obtuse triangle is formed. That factor of six allows for the different possible orderings of the pieces.

Then, letting x and y be the coordinates of the first and second breaks, we have

$$R=\int_{1/3}^{1/2}\int_{\frac{x+1}2}^{2x}\chi_A((x,y))dy\ dx$$

where ##A=\{(x,y)\in\mathbb R^2\ |\ x^2>(y-x)^2+(1-y)^2\}## is the requirement that the triangle be obtuse (there are points in A that are not triangles, but that doesn't matter because the integration limits ensure triangularity.

The integral limits are necessary and sufficient conditions for the first piece to be the longest, the second to be the next longest and for a triangle to be formed.

lower outer limit could have been 0 but, because of the inner limits, the integrand will be zero if x is in [0,1/3], so we might as well set the limit to 1/3.

upper outer limit is necessary for a triangle to be possible (triangle inequality)

lower inner limit is necessary and sufficient for the 3rd piece to be shorter than the 2nd

upper inner limit is necessary and sufficient for 2nd piece to be shorter than 1st

Limits 3 and 4 together ensure that 1st length > 2nd length > 3rd length.

My problem is that I can't yet see an analytic formulation of the indicator function ##\chi_A##.

A crude numerical integration gives a number between ##8\frac12##% and 9%.

All I will say now is that the correct answer is not between your two limits.

andrewkirk · May 20, 2016

On problem 6, prompted by MM's observation, I see that my summand (54-k) is incorrect. Since the previous ace was in position k, there are only 52-(k+1) =51-k remaining valid places for the last ace.

The calc using that summand instead gives N=211,876.
That would give the probabiity as ##\frac{24N}{52\cdot 51\cdot 50\cdot 49}## which is 78%.

micromass · May 20, 2016

andrewkirk said:

On problem 6, prompted by MM's observation, I see that my summand (54-k) is incorrect. Since the previous ace was in position k, there are only 52-(k+1) =51-k remaining valid places for the last ace.

The calc using that summand instead gives N=211,876.
That would give the probabiity as ##\frac{24N}{52\cdot 51\cdot 50\cdot 49}## which is 78%.

Very good. The answer is in fact
\frac{49\cdot 48\cdot 47\cdot 46}{52\cdot 51\cdot 50\cdot 49}
A simple reasoning for this short expression will be given in the OP.

andrewkirk · May 20, 2016

micromass said:

All I will say now is that the correct answer is not between your two limits.

On problem 8, I see from a brute force calc that my answer appears to be three times what I think it should be. It looks like I should have multiplied R by 2 rather than by 6, but I can't immediately see why. I must have done something silly in working out the different orders the lengths could be and how that affects the probability. I'll think about that later.

But I have a method that avoids assuming an order, in which the probability of an obtuse triangle is as follows:

$$\int_0^{\frac12}\int_{\frac12}^{x+\frac12} \chi_A\ dy\ dx$$

the integration limits collectively ensure that no side is longer than 1/2, and hence that a triangle can be formed. The indicator function ##\chi_A## requires the triangle to be obtuse.

The answer I get from a numerical integration on that is about 2.9%. That's the same as for my brute force calc and one third of my erroneous answer above.

EDIT: I think there may still be an error in this. Can you hold off commenting please MicroMass until I've thought about it a little more? I could have deleted it but decided it's better not to since it has already been quoted.
A bit later: I've confirmed it needs correction, because

it doesn't allow for the fact that any of the three sides could be opposite the obtuse angles; and
it needs to be doubled, since it assumes the cut that relates to the outer integral will be closer to the left end than the other cut.

I'll post the corrected version in a new post.

Charles Link · May 20, 2016

I would like to try #9. For the probability that the first person has a Jan 1 birthday and all others different, that is ## p=(1/365)(364/365)^{999} ##. There are 1000 different people who can have this Jan.1 birthday, so that probability of one and only one Jan.1 birthday is ## P=(1000/365)(364/365)^{999} ##. (Not going to try to account for leap years.) 365 days, each of which has a similar possibility of one and only one birthday, so that mean number (expected number of people with no one matching their birthday) will be ## \mu=365 P=1000(364/365)^{999} ## . My arithmetic with a log table (I don't have a calculator handy) is ##\mu=64 ##. I didn't account either for any cross correlations, but these should be insignificant.

PeroK · May 21, 2016

andrewkirk said:

On problem 6, prompted by MM's observation, I see that my summand (54-k) is incorrect. Since the previous ace was in position k, there are only 52-(k+1) =51-k remaining valid places for the last ace.

The calc using that summand instead gives N=211,876.
That would give the probabiity as ##\frac{24N}{52\cdot 51\cdot 50\cdot 49}## which is 78%.

There's a neat solution to this:

Take the ##48## other cards and place them in a row with a space between each (and a space at either end), so there are ##49## spaces. Now, each arrangement of the cards where no two Aces are adjacent corresponds to an arrangment of the cards where each of the ##4## aces occupies one of the ##49## spaces.

Hence, there are ##\binom{49}{4}## valid arrangements, out of the total of ##\binom{52}{4}## possible arrangements for the 52 cards.

This simply generalises. If there are ##n## cards and ##k## aces, then the probability that no two aces are adjacent is:

##\binom{n-k+1}{k} / \binom{n}{k}##

PeroK · May 21, 2016

For number 9. Note: solution ignores leap years.

Imagine you conduct this experiment many times with ##999## other people. How often do you find yourself without a birthday partner?

##p = (\frac{364}{365})^{999} = 0.0645##

This is the same for everyone else. So, everyone is without a birthday partner with this probability. Overall, therefore, the expected number in any ##1000## is:

##1000p = 64.5##

Just saw @Charles Link beat me to it!

So, taking 1 leap year every 4 years, we have 1461 days every 4 years:

##p_x = 1/1461## is the probability of being born on 29th Feb. And

##p_y = 1460/1461## is the probability otherwise.

For those born on 29th Feb, the probability no one shares their birthday is:

##q_x = (1460/1461)^{999}##

For those not born on 29th Feb:

##q_y = (1457/1461)^{999}##

This gives a revised calculation:

##p = p_xq_x + p_yq_y = 0.0649##

Hence the expected number in 1000 is ##64.9##.

thephystudent · May 21, 2016

andrewkirk said:

On problem 8, I see from a brute force calc that my answer appears to be three times what I think it should be. It looks like I should have multiplied R by 2 rather than by 6, but I can't immediately see why. I must have done something silly in working out the different orders the lengths could be and how that affects the probability. I'll think about that later.

But I have a method that avoids assuming an order, in which the probability of an obtuse triangle is as follows:

$$\int_0^{\frac12}\int_{\frac12}^{x+\frac12} \chi_A\ dy\ dx$$

the integration limits collectively ensure that no side is longer than 1/2, and hence that a triangle can be formed. The indicator function ##\chi_A## requires the triangle to be obtuse.

The answer I get from a numerical integration on that is about 2.9%. That's the same as for my brute force calc and one third of my erroneous answer above.

I get the same result now for the numerator when I use the triangular inequality I forgot about last time.

The total probability, however is(in my notation)

$$P(\text{obtuse triangle})=\langle\Theta(c^2-a^2-b^2)\Theta(1/2-c)\rangle=\frac{\int_0^1dc\int_0^1db\int_0^1da \Theta(c^2-a^2-b^2)\Theta(1/2-c)\delta(1-a-b-c) }{\int_0^1dc\int_0^1db\int_0^1da\delta(1-a-b-c)\Theta(c-a)\Theta(c-b)}\approx 0.17
$$

Note that demand that ##c## is the biggest one is automatically included in the numerator by the pythagoras inequality.
If I'm correct this permutation counting doesn't matter in the end as long as you're consistent in numerator and denominator. In my way of doing, I for example first pick the longest part and call it ##c##, and than ##a## and ##b## are the other two parts as you find them from the left to the right.

andrewkirk · May 21, 2016

Here's my corrected attempt at Q8. I think the probability will be:

$$
2\times \int_0^{\frac12}\int_{\frac12}^{x+\frac12} \chi_B\ dy\ dx
$$

where B is the subset of ##\mathbb R^2## (where the two coordinates denote the distances from the left end of the stick where the first and second cuts are made respectively) such that ##a^2>b^2+c^2##, where ##a,b,c## are the lengths of the three stick components arranged from longest to shortest (so ##a## is the longest).

That is, if the cut locations are ##x## and ##y## then the stick lengths are ##\min(x,y),1-\max(x,y)## and ##|x-y|## and ##a,b,c## is those three numbers sorted from largest to smallest.

The integral gives the probability of an obtuse triangle if the first cut is closer to the left hand end of the stick. We then double it to get the total probability, as the prob if the second cut is closest to the left hand end is the same as that one, and the overlap of the two events has measure zero.

The integration limits are necessary and sufficient conditions for the three sticks to form a triangle, given that the first cut is closer to the left-hand end.

The result I get from a numerical integration of this is approximately 17%.

thephystudent · May 21, 2016

andrewkirk said:

Here's my corrected attempt at Q8. I think the probability will be:

$$
2\times \int_0^{\frac12}\int_{\frac12}^{x+\frac12} \chi_B\ dy\ dx
$$

where B is the subset of ##\mathbb R^2## (where the two coordinates denote the distances from the left end of the stick where the first and second cuts are made respectively) such that ##a^2>b^2+c^2##, where ##a,b,c## are the lengths of the three stick components arranged from longest to shortest (so ##a## is the longest).

That is, if the cut locations are ##x## and ##y## then the stick lengths are ##\min(x,y),1-\max(x,y)## and ##|x-y|## and ##a,b,c## is those three numbers sorted from largest to smallest.

The integral gives the probability of an obtuse triangle if the first cut is closer to the left hand end of the stick. We then double it to get the total probability, as the prob if the second cut is closest to the left hand end is the same as that one, and the overlap of the two events has measure zero.

The integration limits are necessary and sufficient conditions for the three sticks to form a triangle, given that the first cut is closer to the left-hand end.

The result I get from a numerical integration of this is approximately 17%.

Nice to have results that agree

micromass · May 21, 2016

Charles Link said:

I would like to try #9. For the probability that the first person has a Jan 1 birthday and all others different, that is ## p=(1/365)(364/365)^{999} ##. There are 1000 different people who can have this Jan.1 birthday, so that probability of one and only one Jan.1 birthday is ## P=(1000/365)(364/365)^{999} ##. (Not going to try to account for leap years.) 365 days, each of which has a similar possibility of one and only one birthday, so that mean number (expected number of people with no one matching their birthday) will be ## \mu=365 P=1000(364/365)^{999} ## . My arithmetic with a log table (I don't have a calculator handy) is ##\mu=64 ##. I didn't account either for any cross correlations, but these should be insignificant.

PeroK said:

For number 9. Note: solution ignores leap years.

Imagine you conduct this experiment many times with ##999## other people. How often do you find yourself without a birthday partner?

##p = (\frac{364}{365})^{999} = 0.0645##

This is the same for everyone else. So, everyone is without a birthday partner with this probability. Overall, therefore, the expected number in any ##1000## is:

##1000p = 64.5##

Just saw @Charles Link beat me to it!

So, taking 1 leap year every 4 years, we have 1461 days every 4 years:

##p_x = 1/1461## is the probability of being born on 29th Feb. And

##p_y = 1460/1461## is the probability otherwise.

For those born on 29th Feb, the probability no one shares their birthday is:

##q_x = (1460/1461)^{999}##

For those not born on 29th Feb:

##q_y = (1457/1461)^{999}##

This gives a revised calculation:

##p = p_xq_x + p_yq_y = 0.0649##

Hence the expected number in 1000 is ##64.9##.

Very good. I'll credit you both since PeroK also did it for leap years.

thephystudent said:

I get the same result now for the numerator when I use the triangular inequality I forgot about last time.

The total probability, however is(in my notation)

$$P(\text{obtuse triangle})=\langle\Theta(c^2-a^2-b^2)\Theta(1/2-c)\rangle=\frac{\int_0^1dc\int_0^1db\int_0^1da \Theta(c^2-a^2-b^2)\Theta(1/2-c)\delta(1-a-b-c) }{\int_0^1dc\int_0^1db\int_0^1da\delta(1-a-b-c)\Theta(c-a)\Theta(c-b)}\approx 0.17
$$

Note that demand that ##c## is the biggest one is automatically included in the numerator by the pythagoras inequality.
If I'm correct this permutation counting doesn't matter in the end as long as you're consistent in numerator and denominator. In my way of doing, I for example first pick the longest part and call it ##c##, and than ##a## and ##b## are the other two parts as you find them from the left to the right.

andrewkirk said:

Here's my corrected attempt at Q8. I think the probability will be:

$$
2\times \int_0^{\frac12}\int_{\frac12}^{x+\frac12} \chi_B\ dy\ dx
$$

where B is the subset of ##\mathbb R^2## (where the two coordinates denote the distances from the left end of the stick where the first and second cuts are made respectively) such that ##a^2>b^2+c^2##, where ##a,b,c## are the lengths of the three stick components arranged from longest to shortest (so ##a## is the longest).

That is, if the cut locations are ##x## and ##y## then the stick lengths are ##\min(x,y),1-\max(x,y)## and ##|x-y|## and ##a,b,c## is those three numbers sorted from largest to smallest.

The integral gives the probability of an obtuse triangle if the first cut is closer to the left hand end of the stick. We then double it to get the total probability, as the prob if the second cut is closest to the left hand end is the same as that one, and the overlap of the two events has measure zero.

The integration limits are necessary and sufficient conditions for the three sticks to form a triangle, given that the first cut is closer to the left-hand end.

The result I get from a numerical integration of this is approximately 17%.

That is correct, but sadly thephystudent was first. I will credit anybody else who gives an exact solution as opposed to the approximate one.

QuantumQuest · May 21, 2016

micromass said:

Those replies are both correct. I'm afraid for QuantumQuest that PeroK was first though, so he gets credited

Yes indeed. I had worked for some time on the problem and when I was ready to post, PeroK had already done. But anyway, such is life...;)

Redbelly98 · May 21, 2016

micromass said:

5. In a town of ##n+1## inhabitants. A person tells a rumor to two distinct numbers, the "first generation". These repeat the performance and generally, each person tells a rumor to two people at random without regard to the past development. Find the probability that the generation ##r## will not contain the person who started the rumor. Find the median of this distribution assuming ##n## large.

Can you clarify something. If each person chooses "two people at random without regard to the past development" to repeat the rumor to, then they could repeat the rumor to the same person who told them. That doesn't seem to follow common sense, but is a literal interpretation of the problem statement.

Thanks for putting these together, btw!

micromass · May 21, 2016

Redbelly98 said:

Can you clarify something. If each person chooses "two people at random without regard to the past development" to repeat the rumor to, then they could repeat the rumor to the same person who told them. That doesn't seem to follow common sense, but is a literal interpretation of the problem statement.

Thanks for putting these together, btw!

No, you can't tell them to the same person who told you. However, if person A told B and C, then it is possibly for B to tell C. But B cannot tell A. Thank you for noticing this.

Samy_A · May 21, 2016

micromass said:

No, you can't tell them to the same person who told you. However, if person A told B and C, then it is possibly for B to tell C. But B cannot tell A. Thank you for noticing this.

Is this true forever, or only for the next generation?

I mean: say A tells the secret to B. After a few generations, the secret is again told to B. Can he now tell it to A?

Redbelly98 · May 21, 2016

Samy_A said:

Is this true forever, or only for the next generation?

I mean: say A tells the secret to B. After a few generations, the secret is again told to B. Can he now tell it to A?

Related: After a few generations, the secret is again told to A. Can A tell B again?

Common sense would say "no" to both questions, since A & B both realize that the other has heard the rumor already.

micromass · May 21, 2016

Samy_A said:

I mean: say A tells the secret to B. After a few generations, the secret is again told to B. Can he now tell it to A?

Yes that is possible

micromass · May 21, 2016

Redbelly98 said:

Related: After a few generations, the secret is again told to A. Can A tell B again?

Yes, otherwise things would be intractable.

mfb · May 21, 2016

The rumor spreading rules get complicated.Proof for problem 7: As Fightfish noted already, the number of ways we can get the votes in follows a law similar to the binomial coefficients. If x is the number of votes for Trump and y is the number of votes for Hillary, then
$$f(x,y) = \begin{cases} 1 \qquad \mathrm{if}\qquad y = 0 \\ 0 \qquad \mathrm{if}\qquad x = y \\ f(x-1,y) + f(x,y-1) \qquad \mathrm{otherwise} \end{cases}$$

We can show that $$f(x,y) = \frac{(x-y)(x+y-1)!}{x!y!}$$ is the solution to this by (two-dimensional) induction. It is clearly true for x=y (where the first expression in the numerator gets zero) and for y=0 (where we get x!/x!=1).

Induction step:
$$\begin{align*}
f(x-1,y) + f(x,y-1) &= \frac{(x-y-1)(x+y-2)!}{(x-1)!y!} + \frac{(x-y+1)(x+y-2)!}{x!(y-1)!} \\
&= \frac{x(x-y-1)(x+y-2)!}{x!y!} + \frac{y(x-y+1)(x+y-2)!}{x!y!}\\
&= \frac{(x+y-2)!}{x!y!} \left( (x+y)(x-y)-(x-y) \right)\\
&= \frac{(x+y-2)!}{x!y!} (x-y)(x+y-1)\\
&= \frac{(x-y)(x+y-1)!}{x!y!}\\
&= f(x,y) \end{align*}$$

Finally, we have to divide f(x,y) by the total number of ways the votes can go up, (x+y choose x):

$$P = \frac{(x-y)(x+y-1)!}{x!y!} \cdot \frac{x!y!}{(x+y)!} = \frac{x-y}{x+y}$$

Isaac0427 · May 21, 2016

micromass said:

In a town of n+1n+1n+1 inhabitants. A person tells a rumor to two distinct numbers, the "first generation". These repeat the performance and generally, each person tells a rumor to two people at random without regard to the past development. Find the probability that the generation rrr will not contain the person who started the rumor. Find the median of this distribution assuming nnn large.

Well, I can answer part of the problem. In generation r, ##2^r## people will be told the rumor. The probability of anyone person being told the rumor would be the people being told the rumor divided by the people in the town, and thus ##\frac{2^r}{n+1}##. Therefore, the probability of that person not being told the rumor is ##1-\frac{2^r}{n+1}## (I think I'm doing this right). I am assuming that somebody can't be told the rumor more than once in one generation, so ##2^r\leq n+1##. I'm not quite sure what you mean by the median of the distribution.

micromass · May 21, 2016

Isaac0427 said:

In generation r, ##2^r## people will be told the rumor.

I am not convinced of this.

Isaac0427 · May 21, 2016

micromass said:

I am not convinced of this.

In generation 1, 2 people will be told. Then, each of those people will tell 2 people. 4 people will be told (gen 2). And then those will tell 2 people. 8 people will be told (gen 3). This pattern has a constant multiplier, and thus an exponential function must be used. The multiplier is 2, and is therefore is the base of the exponent. We can check this with results that we know; ##2^1=2##, ##2^2=4## and ##2^3=8##.

Redbelly98 · May 21, 2016

Isaac0427 said:

In generation 1, 2 people will be told. Then, each of those people will tell 2 people. 4 people will be told (gen 2).

But if those people tell the same 2 people, then only 2 people, not 4, will be told in gen 2.

Or, those 2 people might tell 3 people in total, if 1 person gets told by both of the "tellers" in gen 2.

Isaac0427 · May 21, 2016

Redbelly98 said:

But if those people tell the same 2 people, then only 2 people, not 4, will be told in gen 2.

Or, those 2 people might tell 3 people in total, if 1 person gets told by both of the "tellers" in gen 2.

I said in my first post that I was assuming that somebody couldn't be told more than once in the same generation, because

micromass said:

each person tells a rumor to two people at random without regard to the past development.

And I took that to assume that they were aware of present development (as @micromass specifically talked about past development)

andrewkirk · May 21, 2016

Isaac0427 said:

I said in my first post that I was assuming that somebody couldn't be told more than once in the same generation, because

And I took that to assume that they were aware of present development (as @micromass specifically talked about past development)

Micromass makes clear in post 70 that it is possible for somebody to be told the rumour more than once in the same generation, by two different people.

One quasi-realistic implementation of this would be if the communication were by private, one-way medium such as email or text. That way there is no opportunity for Hermann to stop Lakshmi as she verbally tells him the rumour by saying 'Stop, Arjun has just told me that'. Lakshmi will forward the email to Hermann anyway because she doesn't know Arjun has already forwarded it to Hermann. But Lakshmi would not forward the email to Hermann if she has only just received it from him in the same generation (say, on the same day).

The fact that Lakshmi can forward the email to Hermann if he forwarded it to her in a previous generation can be explained by the supposition that people have short memories.

I was trying to think this one through yesterday and the questions raised in the above posts came up. With the clarifications, the process of determining the size of each generation really does seem quite complicated. No doubt there is a clever way to cut through, but what is it?

Isaac0427 · May 21, 2016

andrewkirk said:

Micromass makes clear in post 70 that it is possible for somebody to be told the rumour more than once in the same generation, by two different people.

That's not what I got out of post #70.

micromass · May 21, 2016

Isaac0427 said:

That's not what I got out of post #70.

It's what I meant anyway. Sorry if that was unclear.

andrewkirk · May 21, 2016

Can I make a further request for clarification on this rumour question. The last sentence asks for the median of the distribution, but it does not say what the random variable is of which the median is to be calculated. I can think of a few different possibilities. My best guess is that it's the median of ##X_n##, as a function of ##n##, where ##X_n## is the number of people who are told the rumour in generation ##n##.

I'm nowhere near up to that part of the question, but thought it might be best to plan ahead.

micromass · May 21, 2016

andrewkirk said:

Can I make a further request for clarification on this rumour question. The last sentence asks for the median of the distribution, but it does not say what the random variable is of which the median is to be calculated. I can think of a few different possibilities. My best guess is that it's the median of ##X_n##, as a function of ##n##, where ##X_n## is the number of people who are told the rumour in generation ##n##.

I'm nowhere near up to that part of the question, but thought it might be best to plan ahead.

For every ##r##, you have the probability ##P(r)## that generation ##r## will contain the original starter. I want the median of those ##P(r)##.

andrewkirk · May 21, 2016

I'm afraid I don't follow. ##P(r)## is a deterministic infinite sequence, not a random variable (assuming we are referring to unconditional probabilities, which I'm pretty confident we are). What does it mean to take the median of a deterministic infinite sequence?

micromass · May 21, 2016

andrewkirk said:

I'm afraid I don't follow. ##P(r)## is a deterministic infinite sequence, not a random variable (assuming we are referring to unconditional probabilities, which I'm pretty confident we are). What does it mean to take the median of a deterministic infinite sequence?

Every ##P(r)## is a probability. Namely the probability that generation ##1## and generation ##2## and ... and generation ##r## do not contain the original starter. Basically, I want to know when this probability is ##1/2##.

Samy_A · May 22, 2016

Deleted:

Note to myself: read the question before answering.

Samy_A · May 22, 2016

Second shot at 5:

I'm not sure about terminology. Here I define generation m as the people who are told the rumor during the m'th iteration of the process.

EDIT: it wasn't clear in the OP, but from post #87, it appears that generation m consists of all the people who have been told the rumor up to the m'th iteration.

So "solution" deleted.

Samy_A · May 22, 2016

Never mind, one of those days. :oldconfused:

mfb · May 22, 2016

micromass said:

Every ##P(r)## is a probability. Namely the probability that generation ##1## and generation ##2## and ... and generation ##r## do not contain the original starter. Basically, I want to know when this probability is ##1/2##.

That looks like an odd definition of median. While it does not happen here, in general P(r) could go down. Do we have multiple medians then?

If generation r has N(r) "active" people, then we have 2 N mails sent to nearly random people, neglecting that a 2N out of N(n+1) specific combinations are not possible (don't tell the rumor back, and don't tell it yourself) and that no one sents two mails to the same person (a very unrealistic assumption)[/size]. That gives a probability of (n/(n+1))^(2N) for everyone to not get the rumor, or E(N(r+1))=(n+1)(1-(n/(n+1))^(2N)) people hearing the rumor on average.

Initially, the growth is nearly exponential, but then levels off as more and more people hear the rumor twice within a generation. In the long run and for large n, on average about 79.68% of the population are active, which is given by the solution to E(N(r+1))=N(r), the analytic solution needs the product log function.

Without a proof, N(r) > (n+1)/2 first happens roughly at generation r=ld(n+1)+1.1 rounded up, if N(1)=1 is the first person and first generation.

Megaquark · May 22, 2016

In number 5, can you tell yourself the rumor?...Can a person look in the mirror and say, "Hey, want to hear some juicy gossip!?"

Samy_A · May 22, 2016

So, last last try at 5.

For fixed n, let ##P(m)## be the probability that the m'th generation still doesn't include the starter.

We have ##P(1)=1, P(2)=1##.

For m>2:
For starter not to be in generation m+1, he has to satisfy two conditions: not be in generation m (probability P(m)), and not being selected for generation m+1.
There are ##\binom{n-1}{2}= (n-1)(n-2)/2## ways for selecting the people to tell the rumor to.
There are ##\binom{n-2}{2}=(n-2)(n-3)/2## ways for selecting the people to tell the rumor to, not including the starter.
So for one person not to select the starter, the probability is ##(n-3)/(n-1)##.
Since there are two people telling the rumor in each generation, the probabilty that starter will not be selected by either is ##((n-3)/(n-1))^2##.

This gives the recurrence relation ##P(m+1)= P(m)*((n-3)/(n-1))^2##.
Since ##P(2)=1##, we get ##P(m)=(\frac{n-3}{n-1})^{2(m-2)}## for m>2.

When will P(m) reach 1/2?

Let's solve ##(\frac{n-3}{n-1})^{2(m-2)} =1/2##.
Notice that ##\frac{n-3}{n-1}=1-\frac{2}{n-1}##.

Taking logarithm, we get ##2(m-2)\log (1-2/(n-1))=\log(1/2)= -\log(2)##.
If n is large, ##\log(1 -2/(n-1)) \approx -2/(n-1)##.
So we get ##-4(m-2)/(n-1) \approx -\log(2)##.
Solving for m, we get ##m \approx 2+\frac{1}{4}(n-1)\log(2) ##.

CynicusRex · May 22, 2016

5) I'm probably wrong but I wanted to think about it. The first generation can not contain the instigator because he can not tell himself the rumor. Beyond the first generation, the 2 distinct persons told, each have ##\left(\frac{\left(n+1\right)-3}{\left(n+1\right)-2}\right)## (##\frac{8}{9}## if there are 11 people) choices for the first person if they do not 'want' to pick the instigator and themselves and ##\left(\frac{\left(n+1\right)-4}{\left(n+1\right)-3}\right)## (##\frac{7}{8}## if there are 11 people) choices for the second distinct person. Thus far the probability of the instigator not being picked is ##\left(\frac{\left(n+1\right)-3}{\left(n+1\right)-2}\right)^2## * ##\left(\frac{\left(n+1\right)-4}{\left(n+1\right)-3}\right)^2## =## \left(\frac{\left(n+1\right)-4}{\left(n+1\right)-2}\right)^2 ##. The third generation separately doubles the exponent because the amount of people picked is doubled. However, we shouldn't forget to multiply the chances of the instigator not being picked for each past generation. For example, the chance of instigator not being told after 4 generations is: ##
\left(\frac{n-3}{n-1}\right)^{2^{\left(2-1\right)}} * \left(\frac{n-3}{n-1}\right)^{2^{\left(3-1\right)}} * \left(\frac{n-3}{n-1}\right)^{2^{\left(4-1\right)}}##

P(instigator not told after r generations) = ##\prod_{2}^{r}\left(\frac{n-3}{n-1}\right)^{2^{\left(r-1\right)}}##, where r ≠ 1

As an example I used population of n+1 = 11 people.
Probability of instigator not being picked in generation r separately:
r=1: ##\left(\frac{8}{9}\frac{7}{8}\right)^{0}## (~~exponent: ##2^{\left(1-1\right)}## = 1~~) (Probability of not choosing the instigator in generation one is 1)
r=2: ##\left(\frac{8}{9}\frac{7}{8}\right)^{2}## (exponent: ##2^{\left(2-1\right)}## = 2)
r=3: ##\left(\frac{8}{9}\frac{7}{8}\right)^{4}## (exponent: ##2^{\left(3-1\right)}## = 4)
r=4: ##\left(\frac{8}{9}\frac{7}{8}\right)^{8}## (exponent: ##2^{\left(4-1\right)}## = 8)

Probability of instigator not being picked after generation r:
Multiply the probabilities above until you've reached the generation. With the use of this example I then replaced the numbers with 11=n+1 accordingly.

I think the median has to be calculated with the log equalling 1/2 or something. But haven't thought about it yet.

As for sources; I used Quora to know if there was a notation like sigma, but for multiplication. Yes, I'm new to math beyond high school.

mfb · May 22, 2016

Samy_A said:

Since there are two people telling the rumor in each generation

Why? If the 2 people of generation 2 both tell the rumor to different people, we then have 4 people telling the rumor. 3 is also an option.

TheBlackAdder said:

The third generation separately doubles the exponent because the amount of people picked is doubled.

It does not have to be doubled.

Samy_A · May 22, 2016

mfb said:

Why? If the 2 people of generation 2 both tell the rumor to different people, we then have 4 people telling the rumor.

Of course. As I said, one of those days ...

It could be 2, 3 or 4 people.

CynicusRex · May 22, 2016

mfb said:

Why? If the 2 people of generation 2 both tell the rumor to different people, we then have 4 people telling the rumor. 3 is also an option.It does not have to be doubled.

It doesn't, but the formula does, at least I think so.

mfb · May 22, 2016

TheBlackAdder said:

It doesn't, but the formula does, at least I think so.

Well if the formula assumes something that is not true, the formula does not work.

mpresic · May 22, 2016

The Birthday Cereal problem reminds me of the m-m/s problem . In the 1990's I learned M&M were thinking of coming out in 23 colors. I wondered how many M&M would I have to buy in order to get one of each color assuming the colors are equally distributed. I solved it in the manner of Profusely Quarky. I believe the answer was 82 (or maybe it was 85, I do not remember).

ProfuselyQuarky · May 22, 2016

mpresic said:

The Birthday Cereal problem

I think you're mixing up problems...

EDIT: Yeeesss, I'm post #100 !

Challenge Micromass' big probability challenge

Similar threads

Hot Threads

Insights Fermat's Last Theorem

B What could prove this wrong? I'm having a dispute with friends

B About a definition: What is the number of terms of a polynomial P(x)?

B Geometry Puzzle with 20 points in a cross pattern

I Geometry problem of interest with a 3-4-5 triangle

Recent Insights

Insights Thinking Outside The Box Versus Knowing What’s In The Box

Insights Why Entangled Photon-Polarization Qubits Violate Bell’s Inequality

Insights Quantum Entanglement is a Kinematic Fact, not a Dynamical Effect

Insights What Exactly is Dirac’s Delta Function? - Insight

Insights Relativator (Circular Slide-Rule): Simulated with Desmos - Insight

Insights Fixing Things Which Can Go Wrong With Complex Numbers