Anybody ever heard of the broken stick model? If so, what is it used for besides species abundance in ecology? Basically, n-1 points are randomly distributed on a unit interval, creating n intervals of random lengths. The points are independent, uniform random variables on the unit interval.(adsbygoogle = window.adsbygoogle || []).push({});

Here I am going to derive the expected length of I_{r}, the r^{th}shortest interval using a brute force approach. The I’s are the order statistics for the intervals. That is, 0<=I_{1}<= I_{2}<= … I_{r}<= … I_{n}<= 1. Looking around online, there are some more sophisticated derivations for the expectation that I wasn't able to follow.

First I'll give a brief sketch of the my derivation. I'll fill in the details later.

To begin, we need to find the probability that I_{r}> x, or P(I_{r}> x). This condition holds exactly when there are at least n-r+1 intervals with length greater than x.

As a sub-problem, we need to find the probability that exactly j intervals have length > x, and then we can sum over j from n-r+1 to n. The set of n-1 points X1, X2, … X(n-1) can be thought of as a vector uniformly distributed on an (n-1)-cube of unit size. It turns out that the easiest probability to calculate is the probability that a particular group of s intervals (and possibly others, too) have lengths > x. This probability is just (1-sx)^{n-1}, provided s is less than the integer part of 1/x. Otherwise it’s zero.

For example, the probability that the first two intervals starting from the left are greater than .01 in length is (1-2*.01)^{n-1}Because of the symmetry of the problem, all such groups have the same probability, so the sum over all groups of s intervals of the probability that at least those s intervals have length > x is given by n_C_s (1-sx)^{n-1}, provided s < floor(1/x). You might think we could immediately get the probability that at least n-r+1 intervals have length > x by putting s=n-r+1. But the cases where there are more than s intervals with length > x are being counted multiple times, so we have to use the inclusion-exclusion principle. The probability that exactly j intervals have length > x is

1) [tex]\sum_{s=j}^{n}(-1)^{s-j}\binom{s}{j}\binom{n}{s}(1-sx)^{n-1}[/tex] if s<floor(1/x). Zero otherwise. In other words,

2) [tex]\sum_{s=j}^{\min(n,\floor{\frac{1}{x}})}(-1)^{s-j}\binom{s}{j}\binom{n}{s}(1-sx)^{n-1}[/tex]

Therefore the tail probability distribution for the r-th smallest interval is given by

3) [tex]P(I_r > x) = \sum_{j=n-r+1}^{n}\sum_{s=j}^{\min(n,\floor{\frac{1}{x}})}(-1)^{s-j}\binom{s}{j}\binom{n}{s}(1-sx)^{n-1 }[/tex]

To find the expected value of Ir, we integrate this tail probability over the allowable values for x. The smallest x could be is zero. The largest it could be is 1/(n-r+1). This would happen if the smallest r-1 intervals were zero length and all other intervals had length 1/(n-r+1). So we need to integrate P(I_r>x) from 0 to 1/(n-r+1).

Because the upper limit of the sum over s depends on x, we have to break the integral up into small intervals:

4) [tex]E[I_r] = \int_{0}^{\frac{1}{n}}P(I_r > x)dx + \sum_{l=1}^{r-1} \int_{\frac{1}{n-(l-1)}}^{\frac{1}{n-l}}P(I_r>x)dx[/tex]

Plugging in (3) for P(I_r>x) we get two main terms for E[I_r].

5) [tex]E[I_r] = \int_{0}^{\frac{1}{n}} \sum_{j=n-r+1}^{n}\sum_{s=j}^{n}(-1)^{s-j}\binom{s}{j}\binom{n}{s}(1-sx)^{n-1} dx + \sum_{l=1}^{r-1} \int_{\frac{1}{n-(l-1)}}^{\frac{1}{n-l}}\sum_{j=n-r+1}^{n}\sum_{s=j}^{n-l}(-1)^{s-j}\binom{s}{j}\binom{n}{s}(1-sx)^{n-1} dx[/tex]

After performing the integrals and rearranging the summations, you get to this:

6) [tex]E[I_r] = \frac{1}{n}\sum_{j=n-r+1}^{n} \sum_{s=j}^{n}(-1)^{s-j} \frac{ \binom{s}{j} \binom{n}{s} }{s}[/tex]

That inner sum over s had me stumped for over a week, but I finally figured out how it comes out to 1/j. That gives the desired answer

7) [tex]E[I_r] = \frac{1}{n}\sum_{j=n-r+1}^{n}\frac{1}{j}[/tex]

**Physics Forums | Science Articles, Homework Help, Discussion**

Join Physics Forums Today!

The friendliest, high quality science and math community on the planet! Everyone who loves science is here!

The friendliest, high quality science and math community on the planet! Everyone who loves science is here!

# Broken stick model?

**Physics Forums | Science Articles, Homework Help, Discussion**