Re-scaling of exponentially distributed numbers

roam · May 25, 2019

For simplicity, let ##N=1##. The following histograms show my results. The generated random numbers are initially exponentially distributed. But after re-scaling they become almost uniformly distributed.

What is the cause of that, and is there a solution?

P.S. Here is my code in Matlab:

Matlab:

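subplot(121)
samples = 10000;
lambda = 1;
X = -log(rand(samples,2))/lambda;  % inverse-transform sampling: exponential draws, two per row
hist(X(:,1),100)                   % first column before re-scaling
subplot(122)
X = X./sum(X,2);                   % re-scaling: divide each row by its own sum
hist(X(:,1),100)                   % first column after re-scaling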

phinds · May 25, 2019

roam said:

The generated random numbers are initially exponentially distributed. But after re-scaling they become almost uniformly distributed.

Yes, of course they do. That's the way math works. If you have a log distribution on a log scale, that's the same as a flat distribution on a linear scale. I don't know why you would expect otherwise. There IS no "solution".

Stephen Tashi · May 25, 2019

roam said:

Summary: However, the re-scaling always causes the numbers to become uniformly distributed.

The mathematical question is more likely to be answered if it is stated precisely. As I read the MATLAB code, the mathematical question is:

##X## and ##Y## are independent random variables and each is uniformly distributed on [0,1]. What is the distribution of ##W = \frac{ \log(X)}{ \log(X) + \log(Y)}## ?
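As a rough numerical check of this formulation, the sketch below draws independent uniform ##X## and ##Y##, forms ##W##, and compares its empirical CDF with the line ##F(w) = w## that a uniform distribution on [0,1] would give. The sample size and the names `n`, `w`, `Femp` and `maxdev` are illustrative choices, not part of the original post.

```matlab
% Sketch: empirical check of W = log(X) / (log(X) + log(Y)).
% n, w, Femp and maxdev are illustrative names, not from the thread.
n = 100000;                          % number of simulated pairs (arbitrary)
X = rand(n,1);                       % X ~ U(0,1)
Y = rand(n,1);                       % Y ~ U(0,1), independent of X
W = log(X) ./ (log(X) + log(Y));     % ratio from the restated question

w      = sort(W);                    % sorted sample of W
Femp   = (1:n)' / n;                 % empirical CDF evaluated at the sorted values
maxdev = max(abs(Femp - w));         % largest gap to the uniform CDF F(w) = w

fprintf('min = %.4f, max = %.4f, mean = %.4f\n', min(W), max(W), mean(W));
fprintf('largest deviation from the uniform CDF: %.4f\n', maxdev);
```

A small `maxdev` is consistent with ##W## being uniform on [0,1], but a simulation like this is only suggestive, not a proof.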

FactChecker · May 25, 2019

I don't quite understand why you are scaling by sum(X,2). That will rescale each row by the sum of the numbers on that row (2 numbers). I changed it to sum(X,1), which sums the 10000 numbers along index 1 (sums each column), and got what I think you were expecting. I like to keep it simple and see the intermediate calculations so that I can make sure it is doing what I expected:
N = 10
S = sum(X,1)               % column sums (1x2); no semicolon, so the values are displayed
Y = N*X(:,1)/S(1,1);       % re-scaling: column 1 now sums to N
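To see why the column-wise version keeps the shape, here is a self-contained sketch. It regenerates the samples as in the original post and compares the coefficient of variation (standard deviation divided by mean, which is 1 for an exponential distribution) before and after re-scaling. The names `cvX` and `cvY` are illustrative, and `S(1)` is used as shorthand for `S(1,1)`.

```matlab
% Sketch: column-wise re-scaling multiplies every sample by one constant,
% so the shape of the histogram is unchanged. cvX and cvY are illustrative names.
samples = 10000;
lambda  = 1;
N       = 10;
X = -log(rand(samples,2))/lambda;    % exponential samples, as in the original post
S = sum(X,1);                        % 1x2 row vector of column sums
Y = N*X(:,1)/S(1);                   % column 1 rescaled so that it sums to N

cvX = std(X(:,1)) / mean(X(:,1));    % coefficient of variation before re-scaling
cvY = std(Y) / mean(Y);              % and after; equal up to rounding error
fprintf('sum(Y) = %.6f\n', sum(Y));
fprintf('cv before = %.4f, cv after = %.4f\n', cvX, cvY);
```

Both values should come out close to 1, since dividing by a single number only stretches the horizontal axis of the histogram.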

pbuk · Jun 3, 2019

roam said:

Summary: I am trying to generate ##M## random numbers which are exponentially distributed and whose sum adds up to ##N##.

If their sum adds up to a given ##N## then they are not independent, and so they cannot individually be exponentially distributed. Take the case of ##M = 2##: even if the first number ##n_0## is exponentially distributed in the range ##[0,N]##, the second number must always equal ##N-n_0##.
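The dependence for ##M = 2## can be illustrated with a short sketch. It assumes the numbers are produced as in the original code (independent exponential pairs, each row divided by its sum); `gap` and `r` are illustrative names, and the correlation is computed by hand to keep the snippet free of toolbox calls.

```matlab
% Sketch: for M = 2, row-wise re-scaling fixes the second number once the
% first is known. samples, N, W, gap and r are illustrative choices.
samples = 10000;
N = 1;
X = -log(rand(samples,2));               % independent Exp(1) pairs
W = N * X ./ sum(X,2);                   % each row now sums to N

gap = max(abs(W(:,2) - (N - W(:,1))));   % distance of W(:,2) from N - W(:,1)

a = W(:,1) - mean(W(:,1));               % centred first column
b = W(:,2) - mean(W(:,2));               % centred second column
r = sum(a.*b) / sqrt(sum(a.^2) * sum(b.^2));   % sample correlation coefficient

fprintf('largest |W2 - (N - W1)| = %.2e\n', gap);
fprintf('correlation between the two columns = %.6f\n', r);
```

The gap is at rounding-error level and the correlation is essentially -1, so the two numbers in a row carry the same information.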

roam said:

However, the re-scaling always causes the numbers to become uniformly distributed.

Yes, of course, because, as @StephenTashi points out, your 'rescaling' creates a completely different distribution.

FactChecker · Jun 4, 2019

pbuk said:

If their sum adds up to a given N then their distributions are not independent

He was trying to generate the data and do a simple linear rescaling of the data afterward. It should not have changed the general shape of the distribution. He had a MATLAB coding error.

Stephen Tashi · Jun 5, 2019

FactChecker said:

He was trying to generate the data and do a simple linear rescaling of the data afterward. It should not have changed the general shape of the distribution.

As to rescaling, I think the goal is to take pairs of random variables ##X_a, X_b## and, from each pair, create the pair ##W_a = X_a/(X_a + X_b),\ W_b = X_b/(X_a + X_b)##. Then we look at the distribution of ##W_a##. So the intent is not to do a linear rescaling. The intent is to create pairs of random variables ##W_a,\ W_b## that sum to 1.

FactChecker · Jun 5, 2019

Stephen Tashi said:

As to rescaling, I think the goal is to take pairs of random variables ##X_a, X_b## and, from each pair, create the pair ##W_a = X_a/(X_a + X_b),\ W_b = X_b/(X_a + X_b)##. Then we look at the distribution of ##W_a##. So the intent is not to do a linear rescaling. The intent is to create pairs of random variables ##W_a,\ W_b## that sum to 1.

That is what his original code did. I don't know what the real intention was. He got a valid answer to either case in this thread. I didn't see anything about "pairs" in the description. I still think that my assumption fits the original description better (or at least as well).

Stephen Tashi · Jun 5, 2019

FactChecker said:

I don't know what the real intention was. He got a valid answer to either case in this thread.

We don't yet have a good mathematical explanation for the shape of the second histogram. I agree that we don't yet have a clear statement of a mathematical question!
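One possible explanation can be sketched for the pairs interpretation only, assuming ##X_a## and ##X_b## are independent and exponentially distributed with the same rate ##\lambda## (the symbols ##w## and ##x## are introduced just for this sketch). For ##0 < w < 1##, the event ##X_a/(X_a+X_b) \le w## is the same as ##X_b \ge \frac{1-w}{w} X_a##, and ##P(X_b \ge t) = e^{-\lambda t}##, so

$$P(W_a \le w) = \int_0^\infty \lambda e^{-\lambda x}\, e^{-\lambda \frac{1-w}{w} x}\, dx = \int_0^\infty \lambda e^{-\lambda x / w}\, dx = w .$$

Under these assumptions the CDF of ##W_a## is ##w## on [0,1], i.e. ##W_a## would be uniformly distributed, which is consistent with the flat second histogram.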

roam said:

Summary: I am trying to generate ##M## random numbers which are exponentially distributed and whose sum adds up to ##N##.

The generated random numbers are initially exponentially distributed. But after re-scaling they become almost uniformly distributed.

It's often hard to translate a procedure into a question about random variables. The first step is to describe the procedure clearly.

As a procedure, one interpretation of what you want to do, in general, is to generate 1 set of ##M## random numbers that sum to ##N## and then make a histogram of all those ##M## random numbers, i.e. all ##M## of the numbers contribute to the histogram. Your claim is that most such histograms are approximately uniform distributions.

A different interpretation is that you want to generate many sets of ##M## random numbers, each set being one where the ##M## numbers sum to the same ##N##. Then you want to make a histogram by using one number from each of the sets. For example, if you generate 100 sets of ##M= 20## numbers, you might make a histogram by picking the first number from each of the 100 sets of numbers. The histogram would involve 100 values.

Or, to make a hybrid of the previous procedures, perhaps you want to generate, say, ##100## sets of ##M = 20## random numbers such that the sum of the numbers in each set is ##N##. Then you want to histogram all 2000 of the numbers.
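The three procedures can be put side by side in a sketch. It assumes each set is built by drawing exponential numbers and re-scaling the set to sum to ##N##, which is only one reading of the original post; `M`, `N`, `nsets` and the variable names are illustrative.

```matlab
% Sketch of the three histogram procedures described above.
% Assumption: a "set" is M Exp(1) draws re-scaled to sum to N.
M = 20;  N = 1;  nsets = 100;

X = -log(rand(nsets, M));            % each row: M independent Exp(1) draws
W = N * X ./ sum(X,2);               % each row re-scaled to sum to N

oneSet    = W(1,:);                  % (1) all M numbers from a single set
firstOnly = W(:,1);                  % (2) the first number from each set
allValues = W(:);                    % (3) hybrid: all nsets*M numbers pooled

% hist with an output argument returns bin counts instead of plotting;
% call hist(oneSet,10) etc. without outputs to draw the histograms.
c1 = hist(oneSet, 10);
c2 = hist(firstOnly, 10);
c3 = hist(allValues, 50);
fprintf('values per histogram: %d, %d, %d\n', sum(c1), sum(c2), sum(c3));
fprintf('mean of pooled values = %.4f (N/M = %.4f)\n', mean(allValues), N/M);
```

With `M = 2` this reduces to the pairs in the original code. For larger `M` the histograms need not look flat, so it matters which of the three procedures is meant.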
