Micromass' big July Challenge

mfb · Jul 12, 2016

fluidistic said:

This is not that much different from the result I obtained. I don't understand why my simulation is wrong, I only did 10k trials. Had I done more, it would have converged toward the real ratio. My ratio is 0.2327 and you got 0.2310.

Sorry, I misinterpreted your post, I read it as "in all of 2327 simulations all days were birthdays".
2327 out of 10000 is perfectly reasonable.

PeroK · Jul 12, 2016

fluidistic said:

This is not that much different from the result I obtained. I don't understand why my simulation is wrong, I only did 10k trials. Had I done more, it would have converged toward the real ratio. My ratio is 0.2327 and you got 0.2310.

My knowledge of statistics is a bit shaky, but I think the standard deviation of a binomial distribution with probability 0.2310 and 10k trials is about 42. So, you would expect 2310 successes, plus or minus 42. The 2327 is, therefore, well within one standard deviation.

andrewkirk · Jul 12, 2016

Here is my proposed solution to problem 1. I have found the mode and median. The mean is left for somebody else, because it would be messy. It would be easy enough to compute numerically though.
Let the two points have coordinates [itex](X_1,Y_1),\ (X_2,Y_2)[/itex], where the coordinates are pairwise independent and each is drawn from a uniform distribution on [0,1]. Then the squared length [itex]L^2[/itex] of the connecting line segment is given by:
$$L^2=(X_2-X_1)^2+(Y_2-Y_1)^2$$
Let us write [itex]X=X_2-X_1,\ Y=Y_2-Y_1[/itex] so that [itex]L^2=X^2+Y^2[/itex]. Then [itex]X,Y[/itex] are mutually independent and the cdf of [itex]X[/itex] is [itex]F_X(c)=Pr(X<c)=Pr(X_2<X_1+c)[/itex] which is the area of the region of the unit square in the number plane above the line [itex]y=x+c[/itex]. A little geometry shows that [itex]F_X(c)[/itex] is equal to [itex]\frac12 (1+2c+c^2)[/itex] if [itex]c\in[-1,0][/itex] and [itex]\frac12 (1+2c-c^2)[/itex] if [itex]c\in[0.1][/itex], The pdf is [itex]f_X(c)=1+c[/itex] if [itex]c\in[-1,0][/itex] and [itex]1-c[/itex] if [itex]c\in[0.1][/itex], [itex]F_Y[/itex] and [itex]f_Y[/itex] have the same form. Note that the pdfs are even functions.

Then the cdf of [itex]L[/itex] is [itex]F_L(l)[/itex] which is the integral over the origin-centred circle of radius [itex]l[/itex] of [itex]f_X(x)f_Y(y)[/itex], excluding parts of the circle that lie outside the square [itex]S\equiv[-1,1]\times[-1,1][/itex]. If we restrict to [itex]l<1[/itex], the whole circle lies inside [itex]S[/itex] so we can calculate the integral without having to break it into sections. It is simplest as a polar integral. Since both the pdfs are even, the full integral is equal to four times the integral in the first quadrant, which is
$$4\int_0^l \int_0^{\frac\pi2} f_X(r\cos \theta)f_Y(r\sin\theta)\,r\,d\theta\,dr$$
Hence we have
\begin{eqnarray*}
F_L(l)&=4\int_0^l \int_0^{\frac\pi2} (1-r\cos \theta)(1-r\sin\theta)\,r\,d\theta\,dr\\
&=4\int_0^l \int_0^{\frac\pi2} (1-r\cos \theta-r\sin\theta+r^2\cos\theta\sin\theta)\,r\,d\theta\,dr\\
&=4\int_0^l \left[ \theta-r\sin \theta+r\cos\theta+\frac12r^2\sin^2\theta)\right]_0^{\frac\pi2}\,r\,dr\\
&=4\int_0^l \left( \frac\pi2-r-r+\frac12r^2\right)\,r\,dr\\
&=4 \left[ \frac\pi2\cdot\frac12r^2-\frac23r^3+\frac18r^4\right]_0^l\\
&=\pi l^2-\frac83 l^3+\frac12l^4
\end{eqnarray*}

The pdf is
$$f_L(l)=\frac d{dl}F_L(l)=2\pi l-8l^2+2l^3$$

The mode will occur when [itex]\frac d{dl}f_L(l)=0[/itex], that is:
$$0=2\pi-16l+6l^2$$
This has roots at [itex]\frac13(4-\sqrt{16-3\pi})[/itex] of which only the lower one is in the admissible range, giving a value of about 0.48.

Hence the mode is approx 0.48.

To find the median we solve the equation [itex]F_L(x)=0.5[/itex]. That is
$$\pi l^2-\frac83 l^3+\frac12l^4=0.5$$
Numerically solving this, we find it has four real roots, of which the second is at approx 0.51. The LHS of the equation is increasing at that point, so we know that is the median.

Hence the median is approx 0.51.

Finding the mean would be trickier, because it would require integrating over the entire admissible region, which will involve breaking the integral into parts to allow for excluding the areas that lie outside the square [itex]S[/itex]. That is a task for somebody with more patience than me.

andrewkirk · Jul 12, 2016

I did a numerical calc and estimated the mean at about 0.52. Here is the code, in R:

Code:

n=1000
wt=1/(n*n)
cumexp=0
for (j in 1:n){
  x=j/n
  for (k in 1:n){
  y=k/n
  cumexp=cumexp+wt*(1-x)*(1-y)*sqrt(x^2+y^2)
  }
}
4*cumexp

TheDemx27 · Jul 13, 2016

Seeing all this group theory is really exciting, although the problems shown here are quite a bit beyond my reach. I can only do simple proofs I guess.
For #1, I believe I found the average distance. Considering only 1 dimension, we have a line upon which we consider a certain point, ##\tau##. The distance from ##\tau## to every other point is ##abs(x-\tau)##. The average distance of all points from the point in question is ##\int_0^1 abs(x-\tau) \,dx##. Iterating over all points ##\tau## (and essentially taking the average of all the averages), we have ##\int_0^1 \int_0^1 abs(x-\tau) \, dx \, d\tau##. Wolfram alpha tells us that the double integral evaluates to ##1/3##. Then we just pythagorize that with itself because the other dimension is symmetrical and we get ##\frac{\sqrt{2}}{3}##.

I feel like I did this wrong after seeing andrewkirk's solution ho boy

andrewkirk · Jul 13, 2016

I've been looking at number 4. I don't understand the question.

It says 'containing all continuous functions and closed under pointswise limits'

If we were talking about sequences of functions I could understand the ref to pointswise limits, but not for an unstructured set that contains a bunch of continuous functions.

What does 'closed under pointswise limits' mean here? Since the functions are all continuous, they are trivially equal to the limit at every point. Since there are no sequences defined, we can't talk about limits of sequences of functions, so I don't know what 'limits' can be referring to.

I doubt I'll be able to answer the question anyway, as I'm exceedingly rusty on measurability. But I'd at least like to understand the question.

Infrared · Jul 13, 2016

I think the meaning is that if [itex](f_n)[/itex] is a sequence of functions in [itex] B[/itex] that converges pointwise to a function [itex] f[/itex], then [itex] f[/itex] is also in [itex]B[/itex]. Micro, please correct me if this isn't the right interpretation.

I've been meaning to look at this problem but I've been busy these past few days.

micromass · Jul 13, 2016

andrewkirk said:

I've been looking at number 4. I don't understand the question.

It says 'containing all continuous functions and closed under pointswise limits'

If we were talking about sequences of functions I could understand the ref to pointswise limits, but not for an unstructured set that contains a bunch of continuous functions.

What does 'closed under pointswise limits' mean here? Since the functions are all continuous, they are trivially equal to the limit at every point. Since there are no sequences defined, we can't talk about limits of sequences of functions, so I don't know what 'limits' can be referring to.

I doubt I'll be able to answer the question anyway, as I'm exceedingly rusty on measurability. But I'd at least like to understand the question.

The functions in ##\mathcal{B}## are not all continuous. All I'm saying is that every continuous function is in ##\mathcal{B}##, but there might be more. The limit condition says that if ##f_n\in \mathcal{B}##, then so is ##\lim_n f_n\in \mathcal{B}##.

micromass · Jul 13, 2016

andrewkirk said:

Here is my proposed solution to problem 1. I have found the mode and median. The mean is left for somebody else, because it would be messy. It would be easy enough to compute numerically though.
Let the two points have coordinates [itex](X_1,Y_1),\ (X_2,Y_2)[/itex], where the coordinates are pairwise independent and each is drawn from a uniform distribution on [0,1]. Then the squared length [itex]L^2[/itex] of the connecting line segment is given by:
$$L^2=(X_2-X_1)^2+(Y_2-Y_1)^2$$
Let us write [itex]X=X_2-X_1,\ Y=Y_2-Y_1[/itex] so that [itex]L^2=X^2+Y^2[/itex]. Then [itex]X,Y[/itex] are mutually independent and the cdf of [itex]X[/itex] is [itex]F_X(c)=Pr(X<c)=Pr(X_2<X_1+c)[/itex] which is the area of the region of the unit square in the number plane above the line [itex]y=x+c[/itex]. A little geometry shows that [itex]F_X(c)[/itex] is equal to [itex]\frac12 (1+2c+c^2)[/itex] if [itex]c\in[-1,0][/itex] and [itex]\frac12 (1+2c-c^2)[/itex] if [itex]c\in[0.1][/itex], The pdf is [itex]f_X(c)=1+c[/itex] if [itex]c\in[-1,0][/itex] and [itex]1-c[/itex] if [itex]c\in[0.1][/itex], [itex]F_Y[/itex] and [itex]f_Y[/itex] have the same form. Note that the pdfs are even functions.

Then the cdf of [itex]L[/itex] is [itex]F_L(l)[/itex] which is the integral over the origin-centred circle of radius [itex]l[/itex] of [itex]f_X(x)f_Y(y)[/itex], excluding parts of the circle that lie outside the square [itex]S\equiv[-1,1]\times[-1,1][/itex]. If we restrict to [itex]l<1[/itex], the whole circle lies inside [itex]S[/itex] so we can calculate the integral without having to break it into sections. It is simplest as a polar integral. Since both the pdfs are even, the full integral is equal to four times the integral in the first quadrant, which is
$$4\int_0^l \int_0^{\frac\pi2} f_X(r\cos \theta)f_Y(r\sin\theta)\,r\,d\theta\,dr$$
Hence we have
\begin{eqnarray*}
F_L(l)&=4\int_0^l \int_0^{\frac\pi2} (1-r\cos \theta)(1-r\sin\theta)\,r\,d\theta\,dr\\
&=4\int_0^l \int_0^{\frac\pi2} (1-r\cos \theta-r\sin\theta+r^2\cos\theta\sin\theta)\,r\,d\theta\,dr\\
&=4\int_0^l \left[ \theta-r\sin \theta+r\cos\theta+\frac12r^2\sin^2\theta)\right]_0^{\frac\pi2}\,r\,dr\\
&=4\int_0^l \left( \frac\pi2-r-r+\frac12r^2\right)\,r\,dr\\
&=4 \left[ \frac\pi2\cdot\frac12r^2-\frac23r^3+\frac18r^4\right]_0^l\\
&=\pi l^2-\frac83 l^3+\frac12l^4
\end{eqnarray*}

The pdf is
$$f_L(l)=\frac d{dl}F_L(l)=2\pi l-8l^2+2l^3$$

The mode will occur when [itex]\frac d{dl}f_L(l)=0[/itex], that is:
$$0=2\pi-16l+6l^2$$
This has roots at [itex]\frac13(4-\sqrt{16-3\pi})[/itex] of which only the lower one is in the admissible range, giving a value of about 0.48.

Hence the mode is approx 0.48.

To find the median we solve the equation [itex]F_L(x)=0.5[/itex]. That is
$$\pi l^2-\frac83 l^3+\frac12l^4=0.5$$
Numerically solving this, we find it has four real roots, of which the second is at approx 0.51. The LHS of the equation is increasing at that point, so we know that is the median.

Hence the median is approx 0.51.

Finding the mean would be trickier, because it would require integrating over the entire admissible region, which will involve breaking the integral into parts to allow for excluding the areas that lie outside the square [itex]S[/itex]. That is a task for somebody with more patience than me.

Very neat. I will credit you after somebody found the mean.

micromass · Jul 13, 2016

TheDemx27 said:

Seeing all this group theory is really exciting, although the problems shown here are quite a bit beyond my reach. I can only do simple proofs I guess.
For #1, I believe I found the average distance. Considering only 1 dimension, we have a line upon which we consider a certain point, ##\tau##. The distance from ##\tau## to every other point is ##abs(x-\tau)##. The average distance of all points from the point in question is ##\int_0^1 abs(x-\tau) \,dx##. Iterating over all points ##\tau## (and essentially taking the average of all the averages), we have ##\int_0^1 \int_0^1 abs(x-\tau) \, dx \, d\tau##. Wolfram alpha tells us that the double integral evaluates to ##1/3##. Then we just pythagorize that with itself because the other dimension is symmetrical and we get ##\frac{\sqrt{2}}{3}##.

I feel like I did this wrong after seeing andrewkirk's solution ho boy

Sorry, that is the incorrect answer

mfb · Jul 13, 2016

TheDemx27 said:

Then we just pythagorize that with itself because the other dimension is symmetrical and we get ##\frac{\sqrt{2}}{3}##

The Pythagoras is not linear, you can't "pull the average in/out" like that.

PeroK · Jul 13, 2016

I have a revised formula for number 6:

$$P = 1 - \sum_{i = 1}^{d-1} \binom{d}{i} \sum_{j=1}^{i} (-1)^{i+j} \binom{i}{j} (\frac{j}{d})^n$$

With ##d = 365## and ##n = 2016##

I ran this calculation and got ##P = 0.230987## (which agrees with @mfb).

That's my final bid!

andrewkirk · Jul 13, 2016

micromass said:

The functions in ##\mathcal{B}## are not all continuous. All I'm saying is that every continuous function is in ##\mathcal{B}##, but there might be more. The limit condition says that if ##f_n\in \mathcal{B}##, then so is ##\lim_n f_n\in \mathcal{B}##.

Is the following then an accurate representation of what the set ##\mathcal B## is intended to be in problem 4?

Let ##C^0## be the set of all continuous functions from ##\mathbb R\to \mathbb R##.
Given a set ##S## of functions ##\mathbb R\to\mathbb R##, we say ##S## is 'pointwise closed' if, for any countable subset ##(f_n)_{n\in\mathbb N}## of ##S## that converges pointwise to a function ##f:\mathbb R\to\mathbb R##, the function ##f## is also in ##S##.

Then ##\mathcal B## is defined to be the intersection of all sets of functions ##S## that contain ##C^0## and are pointwise closed.

micromass · Jul 13, 2016

andrewkirk said:

Is the following then an accurate representation of what the set ##\mathcal B## is intended to be in problem 4?

Let ##C^0## be the set of all continuous functions from ##\mathbb R\to \mathbb R##.
Given a set ##S## of functions ##\mathbb R\to\mathbb R##, we say ##S## is 'pointwise closed' if, for any countable subset ##(f_n)_{n\in\mathbb N}## of ##S## that converges pointwise to a function ##f:\mathbb R\to\mathbb R##, the function ##f## is also in ##S##.

Then ##\mathcal B## is defined to be the intersection of all sets of functions ##S## that contain ##C^0## and are pointwise closed.

That is right.

mfb · Jul 14, 2016

PeroK said:

I have a revised formula for number 6:

$$P = 1 - \sum_{i = 1}^{d-1} \binom{d}{i} \sum_{j=1}^{i} (-1)^{i+j} \binom{i}{j} (\frac{j}{d})^n$$

With ##d = 365## and ##n = 2016##

I ran this calculation and got ##P = 0.230987## (which agrees with @mfb).

That's my final bid!

The inner sum should give the probability that people have birthdays on exactly d (specified) days, right? Why don't you directly extend it to 365 and consider that only? We don't need the double sum.
$$\sum_{j=1}^{d}(-1)^{j+1} \binom{d}{j} \left(\frac j d \right)^n$$
For d=365 and n=2016, it gives 0.230987.

That's the inclusion/exclusion I was looking for earlier.

andrewkirk · Jul 14, 2016

Well, I have waded through the mire of messy integration to solve the mean for problem four.

WARNING: it is not pretty. Avert your eyes if you are easily distressed.

To calculate the mean we want to use Cartesian rather than Polar integration so that we can include the corners of the square ##[-1,1]\times[-1,1]##. Fortunately, we don't need to calculate the cdf of L in Cartesian coordinates, which might be very messy. Instead we calculate the mean directly using Cartesian integration over the first quadrant, using only the pdfs of X and Y.

\begin{align*}
E[L]&=4\int_0^1\int_0^1 f_X{x}f_Y(y)\sqrt{x^2+y^2}\,dy\,dx\\
&=4\int_0^1\int_0^1 (1-x)(1-y)\sqrt{x^2+y^2}\,dy\,dx\\
&=4\int_0^1(1-x)\left[-\frac13 (x^2+y^2)^\frac32+ \frac12\left(y\sqrt{x^2+y^2}+x^2\log\left(\sqrt{x^2+y^2}+y\right)\right)\right]_0^1\,dx\\
%
&=4\int_0^1(1-x)\Bigg(\left(-\frac13 (x^2+1^2)^\frac32+ \frac12\left(1\sqrt{x^2+1^2}+x^2\log\left(\sqrt{x^2+1^2}+1\right)\right)\right)\\
&\quad\quad\quad-\left(
-\frac13 (x^2+0)^\frac32+ \frac12\left(0+x^2\log\left(\sqrt{x^2+0}+0\right)\right)
\right)
\Bigg)\,dx\\
%
&=4\int_0^1(1-x)\Bigg(-\frac13 (x^2+1)^\frac32+ \frac12\sqrt{x^2+1}+\frac12 x^2\log\left(1+\sqrt{x^2+1}\right)\\
&\quad\quad\quad+\frac13 x^3-\frac12 x^2\log{x}\Bigg)dx\\
%
&=
4\left[\frac1{12}x^4-\frac1{15}x^5\right]_0^1+
%
4\int_0^1\Bigg(+\frac13 x(x^2+1)^\frac32 - \frac12 x\sqrt{x^2+1}\\
&\quad\quad\quad-\frac13 (x^2+1)^\frac32+\frac12 \sqrt{x^2+1}
+(1-x)\left(\frac12 x^2\log\left(1+\sqrt{x^2+1}\right)
-\frac12 x^2\log{x}\right)\Bigg)dx\\
%
&=\frac13-\frac4{15}
+4\left[
\frac1{15}(x^2+1)^\frac52
-\frac16(x^2+1)^\frac32
\right]_0^1\\
&\quad\quad\quad+4\int_0^1\Bigg(
-\frac13 (x^2+1)^\frac32+\frac12 \sqrt{x^2+1}
+(1-x)\left(\frac12 x^2\log\left(1+\sqrt{x^2+1}\right)
-\frac12 x^2\log{x}\right)\Bigg)dx\\
%
&=\frac1{15}+\frac1{15}\cdot 16\sqrt2-\frac16 8\sqrt2-\frac4{15}+\frac23\\
&\quad\quad\quad+4\int_0^1\Bigg(
-\frac13 (x^2+1)^\frac32+\frac12 \sqrt{x^2+1}
+(1-x)\left(\frac12 x^2\log\left(1+\sqrt{x^2+1}\right)
-\frac12 x^2\log{x}\right)\Bigg)dx\\
%
&=0.090
+4\int_0^1\Bigg(
-\frac13 (x^2+1)^\frac32+\frac12 \sqrt{x^2+1}
+\frac12 x^2\log\left(1+\sqrt{x^2+1}\right)
-\frac12 x^2\log{x}\\
&\quad\quad\quad-\frac12 x^3\log\left(1+\sqrt{x^2+1}\right)
+\frac12 x^3\log{x}
\Bigg)dx
%
\end{align*}

\begin{align*}
&=0.090
-\frac13\cdot\frac18\left[x\sqrt{x^2+1}(2x^2+5)+3\sinh^{-1}x\right]_0^1\\
&\quad\quad\quad+\frac12\cdot\frac12\left[x\sqrt{x^2+1}+\sinh^{-1}x\right]_0^1\\
&\quad\quad\quad+\frac12\cdot\frac1{18}\left[
-2x^3+3x\sqrt{x^2+1}+6x^3\log(1+\sqrt{x^2+1})-3\sinh^{-1} x\right]_0^1\\
&\quad\quad\quad-\frac12\cdot\frac19\left[
x^3(3\log x-1)\right]_0^1\\
&\quad\quad\quad-\frac12\cdot\frac1{48}\left[
-3x^4+4x^2\sqrt{x^2+1}-8\sqrt{x^2+1}+12x^4\log(1+\sqrt{x^2+1})
\right]_0^1\\
&\quad\quad\quad+\frac12\cdot\frac1{16}\left[x^4(4\log x - 1)\right]_0^1
\end{align*}
The alert reader will have noticed that in evaluating the fourth and sixth definite integrals here it appears I will have to perform an illegal operation in evaluating [itex]\log 0[/itex]. However, we can validate such a step by observing that the formulas actually being evaluated are [itex](x^4\log x)[/itex] and [itex](x^3\log x)[/itex] in the fourth and six items respectively. We can show by l'Hopital's rule that the limit of these formulas as [itex]x[/itex] goes to zero from above is 0. Hence we can use that, since it is only the limit, not the value of the antiderivative at 0, that determines the value of the definite integral.

Let us continue:
\begin{align*}
&=0.090
-\frac1{24}\left[1\cdot\sqrt{1+1}(2+5)+3\sinh^{-1}1-(0)\right]
+\frac14\left[\sqrt{2}+\sinh^{-1}1-(0)\right]\\
&\quad\quad\quad+\frac1{36}\left[-2+3\sqrt{2}+6\log(1+\sqrt{2})-3\sinh^{-1} 1-(0)\right]\\
&\quad\quad\quad-\frac1{18}\left[1(3\cdot 0-1)-(0)\right]\\
&\quad\quad\quad-\frac1{96}\left[
-3+4\sqrt{2}-8\sqrt{2}+12\log(1+\sqrt{2})-(-8)\right]\\
&\quad\quad\quad+\frac1{32}\left[1\cdot (4\cdot 0 - 1)-(0)\right]\\
%
&=0.52
\end{align*}
which matches the value calculated earlier by numerical integration. What a relief!

mfb · Jul 14, 2016

The calculation is much shorter if you use polar coordinates and make the radius depend on the angle.

PeroK · Jul 14, 2016

mfb said:

The inner sum should give the probability that people have birthdays on exactly d (specified) days, right? Why don't you directly extend it to 365 and consider that only? We don't need the double sum.
$$\sum_{j=1}^{d}(-1)^{j+1} \binom{d}{j} \left(\frac j d \right)^n$$
For d=365 and n=2016, it gives 0.230987.

That's the inclusion/exclusion I was looking for earlier.

I missed that!

haruspex · Jul 14, 2016

Problem 1.
(I assume it means average distance between two points, not two numbers.)
The obvious approach is a quadruple integral, ##\int\int\int\int\sqrt{(x_1-x_2)^2+(y_1-y_2)^2}##. The first integral can be handled with a tan substitution, leading to integral of sec³, with a standard but messy result, or a sinh substitution, leading to a sinh^-1 term. Either way, the second integral looks very nasty.

We can do better by noting that lines in which the endpoints differ by a vector (x,y) have a relative frequency (1-x)(1-y). This leads to the double integral ##\int\int(1-x)(1-y)\sqrt {x^2+y^2}.dxdy/\int\int(1-x)(1-y)dxdy=4\int\int(1-x)(1-y)\sqrt {x^2+y^2}.dxdy##.
Multiplying out, we get an xy term, an x term, a y term and a "1" term. The xy term can be integrated twice easily. The x and y terms can also be integrated twice, though it involves integrating sec⁵ (I think).
But for the "1" term we have the same horrendous second integral as before.

Any better ideas?

mfb · Jul 14, 2016

Go to polar coordinates for the "1" term.

haruspex · Jul 14, 2016

mfb said:

Go to polar coordinates for the "1" term.

Doh! I looked at that when I had the quadruple integral and it didn't help, but I didn't retry it with the simpler integral.
Thanks!

rcgldr · Jul 15, 2016

Following up on mfb's post for number 6, a similar formula I had from the early 1990's, for dropping n balls into a plinko like box, each ball with an equal probability of ending up in 1 of d slots (note a real plinko box (bean machine) would have a bell curve distribution, not a uniform distribution, this was being used as an analogy for comparing error rates at the bit level versus a block level on tape drives and hard drives).

correction - was (-1)^(j+1), corrected to (-1)^(j+d):

## \left( \sum_{j=1}^{d} (-1)^{j+d} \begin{pmatrix} d\\j\end{pmatrix} j^n \right) \ / \ (d^n) ##

for d = 365, and n = 2016, the result is

~= 0.23098718155802874302697567890329631779588686618349968487

To explain this sum, note that there are ##d^n## total possible cases, each of n people can have one of d possible birthdays.

For all the birthdays to fall on a single day, there are ...
For all the birthdays to fall on exactly two days, there are ...
For all the birthdays to fall on exactly three days, there are ...
For all the birthdays to fall on exactly 365 days, there are

The probability of all the birthdays to fall on exactly 365 days = the number of cases for exactly 365 days divided by the total number of cases, ## d^n ## .

PeroK · Jul 15, 2016

rcgldr said:

To explain this sum, note that there are ##d^n## total possible cases, each of n people can have one of d possible birthdays.

For all the birthdays to fall on a single day, there are
## \begin{pmatrix} d\\1\end{pmatrix} 1^n ## cases.

For all the birthdays to fall on exactly two days, there are
## \begin{pmatrix} d\\2\end{pmatrix} 2^n ## cases minus the number of single day cases.

For all the birthdays to fall on exactly three days, there are
## \begin{pmatrix} d\\3\end{pmatrix} 3^n ## cases minus the number of exactly two day cases.

For all the birthdays to fall on exactly 365 days, there are
## \begin{pmatrix} d\\365\end{pmatrix} 365^n ## cases minus the number of exactly 364 day cases.

I must confess, I don't see this argument. First, why is the final number the total number of cases minus the exactly 364 cases? What about all the 363, 362 etc cases? The same for the others: the number of exactly 3-day cases should be the total number of 3-day cases minus the number of exactly 2-days cases minus the number of exactly 1-day cases.

Second, that calculation for the number of 2-days cases (not exact) includes as separate 2-day cases day-1/day-2 and day-1/day-3, which counts the subcase of all day-1 twice.

PeroK · Jul 15, 2016

In fact, the probability that ##n## people have birthdays covering exactly any ##x## days is:

$$ \binom{d}{x} \sum_{j=1}^{x}(-1)^{j+x} \binom{x}{j} \left(\frac j d \right)^n$$

It's only when ##x = d## that the initial binomial term disappears.

Note also, a minor point, that it should be in general ##(-1)^{j+x}##. It's only ##(-1)^{j+1}## when ##x## is odd.

PeroK · Jul 15, 2016

PeroK said:

In fact, the probability that ##n## people have birthdays covering exactly any ##x## days is:

$$ \binom{d}{x} \sum_{j=1}^{x}(-1)^{j+x} \binom{x}{j} \left(\frac j d \right)^n$$

It's only when ##x = d## that the initial binomial term disappears.

Note also, a minor point, that it should be in general ##(-1)^{j+x}##. It's only ##(-1)^{j+1}## when ##x## is odd.

Here's a counting argument that I think works:

Let ##N_i## be the number of ways that exactly any ##i## days can be birthdays:

##\binom{d}{2} 2^n = N_2 + 2\binom{d}{2}##

As all the ##N_2## cases are covered and every pair chosen generates ##2## cases where only 1 day is covered. Hence:

##N_2 = \binom{d}{2} (2^n - 2) = \binom{d}{2} \sum_{j=1}^{2} (-1)^{j+2} \binom{2}{j} j^n##

And:

##\binom{d}{3} 3^n = N_3 + \binom{d}{3}(3(2^n) - 3) ##

All the ##N_3## cases are generated, plus for every triple there are 3 sets of ##2^n## cases where at most 2 days are covered, but this counts twice each of the three cases where only one day is covered. Hence:

##N_3 = \binom{d}{3}(3^n - 3(2^n) + 3) ##

And, in general - or for the last day - we get the series of alternating inclusions and exclusions.

rcgldr · Jul 15, 2016

PeroK said:

In fact, the probability that ##n## people have birthdays covering exactly any ##x## days is:

$$ \binom{d}{x} \sum_{j=1}^{x}(-1)^{j+x} \binom{x}{j} \left(\frac j d \right)^n$$

It's only when ##x = d## that the initial binomial term disappears.

Note also, a minor point, that it should be in general ##(-1)^{j+x}##. It's only ##(-1)^{j+1}## when ##x## is odd.

PeroK said:

Here's a counting argument that I think works:

Let ##N_i## be the number of ways that exactly any ##i## days can be birthdays:

##\binom{d}{2} 2^n = N_2 + 2\binom{d}{2}##

As all the ##N_2## cases are covered and every pair chosen generates ##2## cases where only 1 day is covered. Hence:

##N_2 = \binom{d}{2} (2^n - 2) = \binom{d}{2} \sum_{j=1}^{2} (-1)^{j+2} \binom{2}{j} j^n##

And:

##\binom{d}{3} 3^n = N_3 + \binom{d}{3}(3(2^n) - 3) ##

All the ##N_3## cases are generated, plus for every triple there are 3 sets of ##2^n## cases where at most 2 days are covered, but this counts twice each of the three cases where only day is covered. Hence:

##N_3 = \binom{d}{3}(3^n - 3(2^n) + 3) ##

And, in general - or for the last day - we get the series of alternating inclusions and exclusions.

I think everything you've mentioned is correct. Brain fade on my wording, especially since I had done this back in 1992 where I first determined this series based on observation, and ended up with a similar description of the terms of the series as alternating corrections (inclusions and exclusions).

http://rcgldr.net/misc/prob.rtf

I also posted here about that series back in 2012, the series is shown in posts #1 and #12, wondering if there was a better way than the alternating series. It includes some of the observation process, with comments on the series in posts 16 through 19.

https://www.physicsforums.com/threads/permutation-7-identical-items-32-distinct-containers.588810

I've got some notes about this, but haven't found them yet.

Erland · Jul 15, 2016

I will give solution of Problem 4. I use andrewkirks notation:

andrewkirk said:

Is the following then an accurate representation of what the set ##\mathcal B## is intended to be in problem 4?

Let ##C^0## be the set of all continuous functions from ##\mathbb R\to \mathbb R##.
Given a set ##S## of functions ##\mathbb R\to\mathbb R##, we say ##S## is 'pointwise closed' if, for any countable subset ##(f_n)_{n\in\mathbb N}## of ##S## that converges pointwise to a function ##f:\mathbb R\to\mathbb R##, the function ##f## is also in ##S##.

Then ##\mathcal B## is defined to be the intersection of all sets of functions ##S## that contain ##C^0## and are pointwise closed.

micromass says that this is correct.

Here is my solution:

Since all continuous functions are Borel measurable, and all pointwise limits of Borel measurable functions are Borel measurable (W. Rudin, Real and Complex Analysis, 3rd ed, Corollary (a), p. 15, McGraw-Hill, Singapore, 1986), it follows that all functions in ##\mathcal B## are Borel measurable functions.

Conversely, let ##\Omega## be the set of all countable ordinals.

We define ##C^\alpha##, for ##\alpha\in \Omega##, by recursion thus:
##C^0## is as above. For ##\alpha>0##, ##C^\alpha## is the set of all pointwise limits of functions in ##\cup_{\beta<\alpha} C^\beta##.

We prove that ##\mathcal B=\cup_{\alpha\in\Omega}C^\alpha##.

To see this, we first notice that by transfinite induction, any pointwise closed set ##S## of real valued functions on ##\mathbb R## which contains ##C^0## contains ##C^\alpha## for all ##\alpha\in\Omega##. This means that ##\cup_{\alpha\in\Omega}C^\alpha\subset\mathcal B##.

Conversely, ##C^0\subset\cup_{\alpha\in\Omega}C^\alpha##. Also, for any sequence ##(f_n)_{n\in\mathbb N}## of functions in ##\cup_{\alpha\in\Omega}C^\alpha##, there is a sequence of countable ordinals ##(\alpha_n)_{n\in\mathbb N}##, such that ##f_n\in C^{\alpha_n}## for all ##n\in\mathbb N##.
The supremum of any countable subset of ##\Omega## is a countable ordinal (a consequence of ##\aleph_0^2=\aleph_0##), so there is an ##\alpha\in \Omega## such that ##f_n\in C^\alpha## for all ##n\in\mathbb N##. Therefore, if such a sequence ##(f_n)_{n\in\mathbb N}## is pointwise convergent, its limit lies in ##C^{\alpha+1}##. It follows that ##\cup_{\alpha\in\Omega}C^\alpha## is pointwise closed.
This means that ##\mathcal B\subset\cup_{\alpha\in\Omega}C^\alpha##.

Hence, ##\mathcal B=\cup_{\alpha\in\Omega}C^\alpha##.

Next, we prove by transfinite induction that if ##f,g\in C^\alpha##, then ##f+g\in C^\alpha## and ##\min(f,g)\mathcal \in C^\alpha## for all ##\alpha\in \Omega##: This is true for ##\alpha=0##. Assume that it holds for all ##\beta<\alpha##, for some ##\alpha>0## (##\alpha\in\Omega##). Let ##f,g\in C^\alpha##. Then there are two sequences ##(f_n)_{n\in\mathbb N}## and ##(g_n)_{n\in\mathbb N}## of functions in ##\cup_{\beta<\alpha}C^\beta## which converge pointwise to ##f## and ##g##, respectively. Then, to each ##n\in \mathbb N## there are ##\gamma_n<\alpha## and ##\delta_n<\alpha## such that ##f_n\in C^{\gamma_n}## and ##g_n\in C^{\delta_n}##. With ##\beta_n=\max(\gamma_n,\delta_n)##, ##f_n,g_n\in C^{\beta_n}##, and, by the induction hypothesis, ##f_n+g_n\in C^{\beta_n}## and ##\min(f_n,g_n)\in C^{\beta_n}##. Since ##(f_n+g_n)_{n\in \mathbb N}## and ##\min (f_n,g_n)_{n\in \mathbb N}## converge pointwise to ##f+g## and ##\min(f,g)##, respectively, those limit functions lie in ##C^\alpha##.
The conclusion now follows by transfinite induction.

It follows that if ##f,g\in \mathcal B##, then ##f+g\in \mathcal B## and ##\min(f,g)\in\mathcal B##.

By a similar, somewhat simpler, argument, ##f \in \mathcal B## implies ##rf\in \mathcal B## for all ##r\in \mathbb R##.

Next, let ##\mathcal E## be the family of all sets ##E\subset\mathbb R## such that ##\chi_E\in\mathcal B##, where ##\chi_E## is the characteristic function of ##E##.

Then, ##\mathcal E## contains all bounded open intervals ##(a,b)\subset \mathbb R##, for if we put
[itex]f_n(x)=\begin{cases} 0, \qquad x\le a\\ \frac{2n}{b-a}(x-a), \qquad a<x<a+\frac{b-a}{2n}\\1,\qquad a+\frac{b-a}{2n}\le x\le b -\frac{b-a}{2n}\\
\frac{2n}{b-a}(b-x),\qquad b-\frac{b-a}{2n}<x<b\\0,\qquad x\ge b\end{cases}[/itex]
for all ##n\in \mathbb N##, then ##(f_n)_{n\in \mathbb N}## is a sequence in ##C^0## which converges pointwise to ##\chi_{(a,b)}##.

If ##E,F\in\mathcal E##, then, by the above, ##\chi_{\mathbb R\setminus E}=1-\chi_E\in\mathcal B## and ##\chi_{E\cup F}=\min(\chi_E+\chi_F,1)\in \mathcal B##, so ##\mathbb R\setminus E\in \mathcal E## and ##E\cup F\in\mathcal E##.
Assume that ##E_n\in\mathcal E##, for all ##n \in \mathbb N##. Then, by induction, ##\cup_{k=1}^n E_k\in\mathcal E##, for all ##n\in \mathbb N##.
It follows that ##\chi_{\cup_{n=1}^{\infty} E_n}=\lim_{n\to\infty} \chi_{\cup_{k=1}^{\infty} E_k}## (pointwise), so ##\chi_{\cup_{n=1}^{\infty} E_n}\in \mathcal B## and ##\cup_{n=1}^{\infty} E_n\in \mathcal E##.
This means that ##\mathcal E## is a ##\sigma##-algebra which contains all bounded open intervals in ##\Bbb R##. It follows that ##\mathcal E## contains all Borel sets.

From this and the above, it follows that ##\mathcal B## contains all simple Borel measurable functions, since these are linear combinations of characteristic functions of Borel sets.
Now, if ##f:\mathbb R\to [0,\infty)## is a positive Borel measurable function, then there is a sequence ##(s_n)_{n\in\mathbb N}## of simple Borel meaurable functions which converges pointwise to ##f## (W. Rudin, Real and Complex Analysis, 3rd ed, Theorem 1.17, p. 15, McGraw-Hill, Singapore, 1986). Hence ##f\in \mathcal B##. Every real Borel measurable function ##f## can be written as a difference ##f_+-f_-##, where

##f_+(x)=\begin{cases}f(x),\qquad f(x)\ge 0\\ 0,\qquad f(x)<0\end{cases}## and ##f_-=f-f_+##, and ##f_+## and ##f_-## both are positive Borel measurable functions, so ##f\in \mathcal B##.
It follows that ##\mathcal B## contains all Borel measurable functions.

We have proved that ##\mathcal B## is the set of Borel measurable functions.

Kaura · Jul 16, 2016

I am a bit late to the party but none the less for Problem 6 I wrote a script to serve as a way to get experimental probability

Entering in birthday(2016, 1000000) to the console served as a way to run 1000000 trials
This returned a value of 0.23017 probability
This is close to some of the other values I have seen in the thread so it might be close to the answer
If my code if incorrect for the situation please tell me
Also if you want to run the code for yourself in the console I would advise setting the trails parameter to 10000 or less as it more can take forever to run

mfb · Jul 16, 2016

Kaura said:

This is close to some of the other values I have seen in the thread so it might be close to the answer

We have the analytic answer already.
Here are more digits, all correct unless WolframAlpha has a bug:
0.23098718155802874302697567890329631779588686618349968487

haruspex · Jul 16, 2016

mfb said:

Go to polar coordinates for the "1" term.

So, to complete the calculation:
1.
##I_1=\int\int\sqrt{x^2+y^2}=\int_{\theta=0}^{\pi/2}\int_{r=0}^{1}r^2drd\theta+2\int_{\theta=0}^{\pi/4}\int_{r=0}^{\sec(\theta)}r^2drd\theta##
##=\pi/6+\frac 23\int_0^{\pi/4}(\sec^3(\theta)-1)d\theta=\pi/6+\frac 13\left[\sec(\theta)\tan(\theta)+\ln|\sec(\theta)+\tan(\theta)|-2\theta\right]_0^{\pi/4}##
##=\frac 13\sqrt 2+\frac 13\ln(1+\sqrt 2)##
2. The 'x' and 'y' terms are equivalent, so we combine them:
##I_2=2\int\int x\sqrt{x^2+y^2}=\frac 23\int((1+y^2)^{\frac 32-y^3}dy=\frac 23\int_0^{\pi/4}\sec^5(\theta)d\theta-1/6##
Writing ##s=\sin(\theta)##:
##\int_0^{\pi/4}\sec^5(\theta)d\theta=\int_0^{1/\sqrt 2}(1-s^2)^{-3}ds##
The integrand can be written ##\frac 1{16}(3(1-s)^{-1}+3(1+s)^{-1}+3(1-s)^{-2}+3(1+s)^{-2}+2(1-s)^{-3}+2(1+s)^{-3})##
whence ##I_2=\frac 14\ln(1+\sqrt 2)+\frac 5{12}\sqrt 2-\frac 16##
3. ##I_3=\int\int xy\sqrt{x^2+y^2}=\frac 14\int\sqrt{x+y}=\frac 14\frac 23\frac 25 2^{\frac 52}=\frac 4{15}\sqrt 2##

Pulling the three integrals together, I₁-I₂+I₃, and applying the normalisation factor of 4:
average length of line ##=\frac 13\ln(1+\sqrt 2)+\frac {11}{15}\sqrt 2-\frac 23##

Kaura · Jul 16, 2016

mfb said:

0.23098718155802874302697567890329631779588686618349968487

0.3537% error on my part then which makes me happy

andrewkirk · Jul 16, 2016

Erland said:

I will give solution of Problem 4.

That's an interesting approach. I've only read part of it so far, as other distractions keep butting in. I have a question on what I've read thus far, re the following:

We define ##C^\alpha##, for ##\alpha\in \Omega##, by recursion thus:
##C^0## is as above. For ##\alpha>0##, ##C^\alpha## is the set of all pointwise limits of functions in ##\cup_{\beta<\alpha} C^\beta##.

This is a recursive definition. Don't we need well-orderedness of ##\Omega##, or some similar constraint, for a recursive definition to be meaningful? Can you demonstrate, or provide a reference, supporting the well-orderedness of the countable ordinals?

Thanks

Andrew

Erland · Jul 16, 2016

andrewkirk said:

This is a recursive definition. Don't we need well-orderedness of ##\Omega##, or some similar constraint, for a recursive definition to be meaningful? Can you demonstrate, or provide a reference, supporting the well-orderedness of the countable ordinals?

All sets of ordinals are well ordered. In fact, if we use von Neumann's definition of ordinals

https://en.wikipedia.org/wiki/Ordinal_number#Von_Neumann_definition_of_ordinals

an ordinal ##\alpha## is identical to the set of all ordinals smaller than ##\alpha##. This means that ##\Omega## is the same as the smallest uncountable ordinal, which is also the smallest uncountable cardinal ##\aleph_1##.

fresh_42 · Jul 16, 2016

I'll try to finish it (somehow - and hope I haven't forgotten some cases).

5.
Notation: ##G## is a ##p##-group, and in our case of order ##p^2## or ##p^3 \; , \; Z = Z(G)## its center and ##G' = [G,G]## its commutator subgroup. ##<a_i>## will denote a subgroup generated by elements ##a_i## of ##G## and ##|H| , |a|## the order of a group ##H## or an element ##a##, resp. The cyclic group of order ##n## will be denoted ##C_n##. (And ##\times## will be the direct product, as usual.)

First of all:
All ##p##-groups are nilpotent (the upper central series terminate and have abelian quotients) and therefore have a non-trivial center. If ##G \diagup Z## is cyclic, then ##G## is abelian, because all elements could be written ##g = a^n \cdot z## with a generator ##aZ## of ##G \diagup Z## and a center element ##z##.

If ##|G| = p^2## then ##G \diagup Z## is either ##\{1\}## or isomorphic to ##C_p## In both cases ##G## is abelian. (The second case can therefore not appear, but it doesn't matter.) I.e. all groups of order ##p^2## are abelian and thus all subgroups are normal and the only possibilities are ##G ≅ C^2_p## or ##G ≅ C_{p^2}.##

From now on let us assume ##|G| = p^3.## Again ##Z ≠ \{1\}## which leaves the cases ##|Z| \in \{p,p^2,p^3\}##.
##|Z|=p^2## is impossible for ##G \diagup Z ≅ C_p## and ##G## would be abelian, i.e. ##|Z| = |G| = p^3##.
If ##|Z|=p^3## then ##G## is abelian, all subgroups are normal and ##G## has to be isomorphic to one of ##C_{p^3} \; , \; C_{p^2} \times C_{p} \; , \; C_p^3##

So we are left with the following situation: ##|G| = p^3 \; , \; |Z| = p## and ##G \diagup Z## is not cyclic of order ##p^2##. Thus ##G \diagup Z ≅ C_p \times C_p## by the previous result, which is abelian. Furthermore ##G' = [G,G] \subseteq Z## which is only possible for ##G' = Z ≅ C_p##. (##G' = \{1\}## would imply ##G## abelian.)

All group elements beside ##1## have orders ##p## or ##p^2##. There is at least one element of order ##p## in ##G \diagup Z ≅ C_p \times C_p## so we have two non-central elements ##a## and ##b## of order ##(|a|,|b|) = (p,p)## or w.l.o.g. ##(|a|,|b|) = (p^2,p)## and a central element in ##G'## of order ##p##. So either ##G ≅ A := <a,b \; | \; a^p = b^p = [a,b]^p = 1>## or ##G ≅ B := <a,b \; | \; a^{p^2} = b^p = 1 , [a,b] = a^p>.##
The group ##D := <a,b \; | \; a^{p^2} = b^p = [a,b] = 1>## is isomorphic to ##B## by interchanging the generators.

In case ##p=2 \; ,\; A## are the quaternions ##Q_8## with ##a = i \; , \; b= j \; , \; ab = k## and ##B## is the dihedral group ##D_4## (with ##8## elements).

micromass · Jul 17, 2016

fresh_42 said:

So either ##G ≅ A := <a,b \; | \; a^p = b^p = [a,b]^p = 1>## or ##G ≅ B := <a,b \; | \; a^{p^2} = b^p = 1 , [a,b] = a^p>.##

If you write groups as a presentation, then you risk the groups to be trivial, or not having the order you want. So how are you sure that ##A## and ##B## have order ##p^3##?

Micromass' big July Challenge

Similar threads

Hot Threads

Recent Insights