Sorting sample from exponential

AI Thread Summary
When a large sample from an exponential distribution is sorted and plotted, the resulting curve may not look exponential and can show systematic deviations, particularly in the tails. The quality of the random number generator can play a role, since simpler generators may introduce biases. The empirical distribution function (edf) and empirical quantile function (eqf) describe the observed shape, with Donsker's theorem providing the theoretical framework. If the sorted sample deviates significantly from what an exponential distribution predicts, the initial sampling may have been flawed.
Gerenuk
What is the function that I get when I take a large sample from an exponential distribution (many values from the same distribution) and sort the sample points? I'm a bit surprised it's not really exponential.
The shape seems to fit some data I have nicely, but I don't know the function to fit :(
I found something about order statistics, but that seems to give answers to other questions.

Does anyone have a suggestion?
 
It sounds like you're looking at either the empirical distribution function (edf) or the empirical quantile function (eqf).

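For the edf, Donsker's theorem tells us that sqrt(n)*(F_n(x)-F(x)) -> B(F(x)) in distribution, where B is a Brownian bridge. So for large n the curve you see is close to the true function, with fluctuations of order 1/sqrt(n): for this example you'd get either 1-exp(-bx) (for the edf) or -log(1-u)/b (for the eqf), where b is the rate of the exponential.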
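
As a rough illustration of the eqf case, here is a minimal Python sketch using only the standard library. The rate b = 2.0, the sample size, the seed, and the plotting position u = (i + 0.5)/n are assumptions made for this example, not values from the thread.

```python
import math
import random

def sorted_exponential_sample(n, b, seed=0):
    """Draw n values from an exponential distribution with rate b and sort them."""
    rng = random.Random(seed)
    return sorted(rng.expovariate(b) for _ in range(n))

def exponential_quantile(u, b):
    """Theoretical quantile function F^{-1}(u) = -log(1-u)/b."""
    return -math.log(1.0 - u) / b

n, b = 10000, 2.0  # illustrative values
xs = sorted_exponential_sample(n, b)

# Compare the i-th sorted value with the quantile at u = (i + 0.5)/n.
for i in (100, 2500, 5000, 7500, 9900):
    u = (i + 0.5) / n
    print(f"u={u:.4f}  sorted={xs[i]:.4f}  theory={exponential_quantile(u, b):.4f}")
```

Plotting xs against (i + 0.5)/n should trace a curve close to -log(1-u)/b, which is why the sorted sample does not look like exp(-bx).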
 
Thanks! Hmm, quite possible. The empirical curve for a uniform distribution looks uniform, but for the exponential there seem to be some systematic deviations in the tails. Is there any known result about such systematic deviations?
 
Maybe it can be attributed to numerical precision?
 
I made a run with 10000 samples and quite a smooth curve came out. Its log is linear in the center, with the tails smoothly deviating in opposite directions. But I'll try to check that again.
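
For reference, the exact mean and variance of the sorted values are known in the exponential case (the Rényi representation of exponential order statistics), and they show that the largest few values scatter far more than the central ones. This may be one contributor to the tail behaviour, though it is not a diagnosis of the plot described here. A minimal sketch, assuming rate b = 1 and n = 10000 (both chosen for illustration; the function name is hypothetical):

```python
import math

def order_stat_mean_sd(n, k, b=1.0):
    """Exact mean and standard deviation of the k-th smallest (k = 1..n)
    of n i.i.d. exponential(rate b) values, via the Renyi representation:
    mean = (1/b) * sum 1/j, var = (1/b^2) * sum 1/j^2, for j = n-k+1..n."""
    js = range(n - k + 1, n + 1)
    mean = sum(1.0 / j for j in js) / b
    var = sum(1.0 / (j * j) for j in js) / (b * b)
    return mean, math.sqrt(var)

n = 10000  # illustrative sample size
for k in (1, 100, 5000, 9900, 9990, 10000):
    mean, sd = order_stat_mean_sd(n, k)
    print(f"k={k:5d}  mean={mean:.4f}  sd={sd:.4f}")
```

With these values the middle of the sorted sample has a standard deviation of roughly 0.01, while the largest value has one of roughly 1.28, and that does not shrink as n grows. So the extreme upper end of a sorted-sample plot never settles onto the smooth theoretical curve.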
 
Maybe the "random number" generator you use is not random enough? This often happens with the simplest pseudorandom generators built into programming languages, like rand().
Try something more reliable, e.g. http://root.cern.ch/root/html/TRandom1.html
http://root.cern.ch/root/html/src/TRandom1.cxx.html#URPliB
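
One way to check whether the generator matters is to draw the uniforms from two different sources and push both through the same inverse transform. The sketch below assumes Python rather than ROOT: random.Random is the Mersenne Twister and random.SystemRandom reads from the operating system's entropy source. Both are stand-ins chosen for illustration, not the TRandom1 generator linked above, and the helper name is hypothetical.

```python
import math
import random

def exponential_from_uniform(rng, n, b):
    """Inverse-transform sampling: x = -log(1-u)/b with u uniform on [0, 1).
    Returns the n values in sorted order."""
    return sorted(-math.log(1.0 - rng.random()) / b for _ in range(n))

n, b = 10000, 1.0  # illustrative values
mt = exponential_from_uniform(random.Random(12345), n, b)    # Mersenne Twister
osr = exponential_from_uniform(random.SystemRandom(), n, b)  # OS entropy source

# Compare a few ranks; the SystemRandom column changes from run to run.
for i in (0, n // 2, n - 10):
    print(f"rank {i}: MT={mt[i]:.4f}  OS={osr[i]:.4f}")
```

If both sorted curves show the same tail behaviour, the generator is probably not the cause.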
 
For uniform random bits there is the site:

http://www.random.org

The site says that it uses atmospheric noise as input to generate the random data. In terms of compression, it looks pretty good (random data can't be compressed with standard methods), but apart from that you'll probably have to run it through some analysis to test its fit for statistical randomness.
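
One simple check of that kind is the Kolmogorov-Smirnov distance between the edf of the numbers and the uniform cdf. The sketch below is a minimal version under stated assumptions: Python's built-in generator stands in for numbers downloaded from an external source, the seed and sample size are arbitrary, and 1.36/sqrt(n) is the usual large-n approximation to the 5% critical value.

```python
import math
import random

def ks_statistic_uniform(values):
    """Kolmogorov-Smirnov distance between the edf of values and U(0,1)."""
    xs = sorted(values)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        # The edf jumps from i/n to (i+1)/n at x; check both sides of the jump.
        d = max(d, (i + 1) / n - x, x - i / n)
    return d

rng = random.Random(2024)  # stand-in for externally supplied random numbers
data = [rng.random() for _ in range(10000)]
d = ks_statistic_uniform(data)
print(f"D = {d:.5f}, approx. 5% critical value = {1.36 / math.sqrt(len(data)):.5f}")
```

A value of D well above the critical value would suggest the numbers are not uniform. Passing this one test does not establish randomness, since it only looks at the marginal distribution and not at correlations between successive values.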
 
Gerenuk said:
What is the function that I get when I take a large sample from an exponential distribution (many values from the same distribution) and sort the sample points. I'm a bit surprised it's not really exponential.
Sorry -- this must be a dumb question, because everyone else seems to understand what you're talking about, but could you explain more clearly what you're doing? My understanding of "sort the sample points" is "put them in numerical order". That obviously will not change the distribution. If the sample points truly were sampled from an exponential distribution, then they will also have an exponential distribution after sorting. If they don't, it can only mean that you were mistaken when you thought you were sampling from an exponential distribution.

Where am I going wrong?
 
