# Random set of N points in a unit disc, what is the average nearest distance

• I
Gold Member
Pick a random set of N points from the unit disc. Calculate the distance between all pairs of points and call the smallest value r. Do this calculation for many such sets. Please give me a hint how to estimate what the average value of r is. I guess a computer program could quickly come up with an accurate estimate as long as N is not too big?

Thanks!

Last edited:

mfb
Mentor
If you don't need an analytic solution (and I don't know if one exists for general N) a computer simulation is a nice approach.

For very large N there are good analytic approximations.

Gold Member
For very large N there are good analytic approximations.

Would 115 billion be considered a large number in this case?

I guess my 2 dimensional problem simulated on a computer could be simplified by doing the 1 dimensional problem on a computer, N random points on the unit interval, coming up with r and then inferring that for the 2 dimensional case r should be approximately (or exactly?) √2 times r for the 1D problem?

Thanks!

Would 115 billion be considered a large number in this case?

I guess my 2 dimensional problem simulated on a computer could be simplified by doing the 1 dimensional problem on a computer, N random points on the unit interval, coming up with r and then inferring that for the 2 dimensional case r should be approximately (or exactly?) √2 times r for the 1D problem?

Thanks!

You need to increase the number of repetitions to get more accurate results, not ##N##. What you need to do is to first generate ##N## random points, such that distance from the center ,say (0,0) is less than or equal to 1. This means that ##\sqrt{r_1^2+r_2^2}\leq 1##, where ##\mathbf{r}=(r_1,\,r_2)## is a point in the unit disk. I guess how you can do this, is first to generate ##N## random variables in the interval ##[0, 1]## that represent ##r_1##, and then generate another ##N## random variables between ##[0,\sqrt{1-r_1^2}]## that represent ##r_2##. Now you have ##N## random points on a unit disk. Then you find the pair distance of all points and select the minimum, and store it. Repeat this process say ##K=10^5## times. At the end add all the minimum distances from all trials, and divide it by ##K##. This way you get the average minimum distance. Maybe there is another simpler way, but this what comes to mind now.

• Spinnor
mfb
Mentor
Would 115 billion be considered a large number in this case?
Yes.
Keep in mind that the distribution of what you want to study is not uniform.
Also keep in mind that (a) protons are not point particles, (b) the bunches collide with a non-zero crossing angle and (c) there are many collisions per bunch crossing.
I guess my 2 dimensional problem simulated on a computer could be simplified by doing the 1 dimensional problem on a computer, N random points on the unit interval, coming up with r and then inferring that for the 2 dimensional case r should be approximately (or exactly?) √2 times r for the 1D problem?
For a uniform distribution that should be a reasonable approximation.

• Spinnor
Gold Member
Thank you all your corrections. So if I used the numbers for focused beam bunches, bunch diameter of order several millimeters and a bunch number of a hundred billion protons should give us a r minimum of order the proton radius? Protons need to be pretty close to interact strongly?

Thank you for your help. Need to look into free easy to use math calculating software.

mfb
Mentor
bunch diameter of order several millimeters
That is way too much.
should give us a r minimum of order the proton radius?
No as they don't have to overlap to interact and they are not pool balls with a single sharp border anyway.

We have much more accurate numbers for the proton radius from other measurements. Overall LHC cross sections don't help with that.

Gold Member
So back to the original math question. I think I have a solution and would appreciate flaws in my approximation pointed out.

If we have N points we then have (N-1)N/2 distinct pairs of points. Focus on one pair of points and call them a and b. If I give you the coordinates of a then you could say that because b is random it could be anywhere in the unit disc. divide the unit disc into angular regions centered on point a. These annular regions do not form complete annular rings unless point a was at the center of the unit disc, consider this an error in my approximation a value I think will be between approximately between 2 and 1/2, brain is too fuzzy to figure if the error factor is larger or smaller than 1. Call the radius to the random point b from a, r_i. The probability that the random point b is in a particular annular region is proportional to the area of the annular region. Considering all pairs of points there will be (N-1)N/2 values of r_i. Randomly distribute (N-1)N/2 points about point a. The density of points is the number of points divided by the area of the unit disc, and is (N-1)N/2]/π, call that ρ. Use this density to calculate the minimum value of r, r_min such there is likely one point in an annular region centered on point a with radius r_min. We want,

Density of points times area of circular region of radius r_min = 1

[(N-1)N/2]/π X π(r_min)^2 = 1 or,

(r_min)^2 ≅ 2/[(N-1)N/2] for large N

r_min ≅ √2/N

Extrapolating, in D dimensions r_min ≅ √D/N Edit, unless I can come up with an argument for the D dimensional case lets forget it for now.

Edit, the factor of √2 should be 2, simple math error.

So at this point I should step back and ask myself, should the answer scale as 1/N. Maybe you smarter guys out there can argue yes or no I do not know.

Last edited:
mfb
Mentor
##r_{min} = \frac{\sqrt{2}}{N}## for a unit disk looks good. Edge effects are negligible, so the shape (disk) doesn't matter.
We can generalize it to ##r_{min} = \frac{\sqrt{2A}}{\sqrt{\pi} N}## for an area A that is not too fractured.

In d dimensions a point will have an expected ##c x^d N## points within a radius x where c is some numerical constant depending on the volume of the unit ball in d dimensions. To estimate the rmin let this value be 2/N: ##\frac{2}{N} = c r_{min}^d N## or ##r_{min}=\left(\frac{2}{cN^2}\right)^{1/d}##.
You can see the curse of dimensionality here. ##r_{min} \propto N^{-2/d}##. For large d the radius will stay quite large even for large N.

• Spinnor
WWGD
Gold Member
As a general result, assuming each point comes from the same distribution, i.e., the points ##p_1,....,p_N## are IID RVs, then the distribution of the minimum

## Min (p_1,p_2,...,p_N) ## is given by ##P(Min(p_1,...,p_N)<y)= 1- (1-\phi)^N ## , where ##\phi## is the cdf.

• Spinnor
Gold Member
As a general result, assuming each point comes from the same distribution, i.e., the points p1,....,pNp1,....,pNp_1,....,p_N are IID RVs, then the distribution of the minimum

Min(p1,p2,...,pN)Min(p1,p2,...,pN) Min (p_1,p_2,...,p_N) is given by P(Min(p1,...,pN)<y)=1−(1−ϕ)NP(Min(p1,...,pN)<y)=1−(1−ϕ)NP(Min(p_1,...,p_N)ϕϕ\phi is the cdf.

Does the above result agree with that of mfb, I don't see it right now? I will start here, Cumulative distribution function,
https://en.wikipedia.org/wiki/Cumulative_distribution_function

Can you point to a link or translate "IID RVs"?

Thanks!

Gold Member
Can you point to a link or translate "IID RVs"?

Would that be 2 dimensional random values?

WWGD
Gold Member
Does the above result agree with that of mfb, I don't see it right now? I will start here, Cumulative distribution function,
https://en.wikipedia.org/wiki/Cumulative_distribution_function

Can you point to a link or translate "IID RVs"?

Thanks!
Sorry for my laziness :). This stands for Independently, Identically Distributed Random Variables, a condition assumed in some results like the Central Limit Theorem and others.

• Spinnor
mfb
Mentor
They are not independent - we always have two points with the smallest distance to a neighbor. I would expect the approach with independent distributions to miss a factor sqrt(2) in the result.

• WWGD and Spinnor