Re-scaling of exponentially distributed numbers

  • #1
roam
TL;DR Summary
I am trying to generate ##M## random numbers which are exponentially distributed and whose sum adds up to ##N##. However, the re-scaling always causes the numbers to become uniformly distributed.
For simplicity, let ##N=1##. The following histograms show my results. The generated random numbers are initially exponentially distributed. But after re-scaling they become almost uniformly distributed.

[Attachment 244159: two histograms of X(:,1), before re-scaling (left) and after re-scaling (right)]


What is the cause of that, and is there a solution?

P.S. Here is my code in Matlab:

Matlab:
subplot(121)
samples = 10000;
lambda = 1;
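X = -log(rand(samples,2))/lambda; % samples-by-2 matrix of Exp(lambda) values (inverse-transform sampling)
hist(X(:,1),100) % first column before re-scaling
subplot(122)
X = X./sum(X,2); % re-scaling: divide each row by the sum of its 2 entries
hist(X(:,1),100) % first column after re-scaling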
 
  • #2
roam said:
The generated random numbers are initially exponentially distributed. But after re-scaling they become almost uniformly distributed.
Yes, of course they do. That's the way math works. If you have a log distribution on a log scale, that's the same as a flat distribution in a linear scale. I don't know why you would expect otherwise. There IS no "solution".
 
  • #3
roam said:
Summary: However, the re-scaling always causes the numbers to become uniformly distributed.
The mathematical question is more likely to be answered if it is stated precisely. As I read the MATLAB code, the mathematical question is:

##X## and ##Y## are independent random variables and each is uniformly distributed on [0,1]. What is the distribution of ##W = \frac{ \log(X)}{ \log(X) + \log(Y)}## ?
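One way to answer this, sketched under the assumption stated above that ##X## and ##Y## are independent and uniform on (0,1): write ##A = -\log X## and ##B = -\log Y##, which are then independent exponential variables with rate 1 (the symbols ##A## and ##B## are introduced here only for illustration), so that ##W = A/(A+B)##. For ##0 < w < 1##:

```latex
\begin{aligned}
P(W \le w) &= P\big(A(1-w) \le wB\big) = P\!\left(B \ge \tfrac{1-w}{w}\,A\right) \\
           &= \int_0^\infty e^{-a}\, e^{-\frac{1-w}{w}a}\, da
            = \int_0^\infty e^{-a/w}\, da = w .
\end{aligned}
```

Under these assumptions ##W## is uniform on (0,1), which is consistent with the second histogram in post #1. The factor ##1/\lambda## in the MATLAB code cancels in the ratio, so the result does not depend on ##\lambda##.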
 
Likes roam
  • #4
I don't quite understand why you are scaling by sum(X,2). That will rescale each row by the sum of the numbers on that row (2 numbers). I changed it to sum(X,1), which sums the 10000 numbers along index 1 (sums each column), and got what I think you were expecting. I like to keep it simple and see the intermediate calculations so that I can make sure it is doing what I expected:
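Matlab:
N=10
S = sum(X,1) % column sums (1-by-2); no semicolon, so the value is displayed
Y = N*X(:,1)/S(1,1); % re-scaling: first column scaled so that it sums to N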
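A minimal sketch of the difference between the two normalizations, reusing the variable names from the posts above. The names `W`, `countsW`, `countsY`, and `cvY` are illustrative additions, not part of the original code. Dividing each row by its own sum divides every sample by a different random number, whereas dividing a column by its total multiplies every sample by the same constant:

```matlab
samples = 10000;
lambda  = 1;
N       = 10;                          % target total, as in the post above
X = -log(rand(samples, 2)) / lambda;   % two columns of Exp(lambda) samples

% Row-wise: each sample is divided by the (random) sum of its own row.
W = X(:,1) ./ sum(X, 2);               % values lie in (0,1)

% Column-wise: every sample is multiplied by the same constant N/S(1).
S = sum(X, 1);
Y = N * X(:,1) / S(1);                 % sum(Y) equals N up to round-off

% Bin counts only; with an output argument, hist does not draw a figure.
countsW = hist(W, 100);
countsY = hist(Y, 100);

% For an exponential distribution the standard deviation equals the mean,
% and multiplying by a constant leaves this ratio unchanged.
cvY = std(Y) / mean(Y);
```

Calling `hist(W,100)` and `hist(Y,100)` without an output argument draws the two histograms for a visual comparison.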
 
Likes Stephen Tashi and roam
  • #5
roam said:
Summary: I am trying to generate ##M## random numbers which are exponentially distributed and whose sum adds up to ##N##.
If their sum adds up to a given N then their distributions are not independent and so they cannot individually be exponentially distributed. Take the case of M = 2: if the first number ## n_0 ## is exponentially distributed in the range ## [0,N] ##, then the second number must always equal ## N-n_0 ##, so it cannot be exponentially distributed as well.
roam said:
However, the re-scaling always causes the numbers to become uniformly distributed.
Yes, of course, because, as @StephenTashi points out, your 'rescaling' creates a completely different distribution.
 
  • #6
pbuk said:
If their sum adds up to a given N then their distributions are not independent
He was trying to generate the data and do a simple linear rescaling of the data afterward. It should not have changed the general shape of the distribution. He had a MATLAB coding error.
 
Likes pbuk and roam
  • #7
FactChecker said:
He was trying to generate the data and do a simple linear rescaling of the data afterward. It should not have changed the general shape of the distribution.

As to rescaling, I think the goal is to take pairs of random variables ##X_a, X_b## and, from each pair, create the pair ##W_a = X_a/(X_a + X_b),\ W_b = X_b/(X_a + X_b)##. Then we look at the distribution of ##W_a##. So the intent is not to do a linear rescaling. The intent is to create pairs of random variables ##W_a,\ W_b## that sum to 1.
 
Likes roam
  • #8
Stephen Tashi said:
As to rescaling, I think the goal is to take pairs of random variables ##X_a, X_b## and, from each pair, create the pair ##W_a = X_a/(X_a + X_b),\ W_b = X_b/(X_a + X_b)##. Then we look at the distribution of ##W_a##. So the intent is not to do a linear rescaling. The intent is to create pairs of random variables ##W_a,\ W_b## that sum to 1.
That is what his original code did. I don't know what the real intention was. He got a valid answer to either case in this thread. I didn't see anything about "pairs" in the description. I still think that my assumption fits the original description better (or at least as well).
 
  • #9
FactChecker said:
I don't know what the real intention was. He got a valid answer to either case in this thread.

We don't yet have a good mathematical explanation for the shape of the second histogram. I agree that we don't yet have a clear statement of a mathematical question!
roam said:
Summary: I am trying to generate ##M## random numbers which are exponentially distributed and whose sum adds up to ##N##.

The generated random numbers are initially exponentially distributed. But after re-scaling they become almost uniformly distributed.

It's often hard to translate a procedure into a question about random variables. The first step is to describe the procedure clearly.

As a procedure, one interpretation of what you want to do, in general, is to generate 1 set of ##M## random numbers that sum to ##N## and then make a histogram of all those ##M## random numbers, i.e. all ##M## of the numbers contribute to the histogram. Your claim is that most such histograms are approximately uniform distributions.

A different interpretation is that you want to generate many sets of ##M## random numbers, each set being one where the ##M## numbers sum to the same ##N##. Then you want to make a histogram by using one number from each of the sets. For example, if you generate 100 sets of ##M= 20## numbers, you might make a histogram by picking the first number from each of the 100 sets of numbers. The histogram would involve 100 values.

Or, to make a hybrid of the previous procedures, perhaps you want to generate, say, ##100## sets of ##M = 20## random numbers such that the sum of the numbers in each set is ##N##. Then you want to histogram all 2000 of the numbers.
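The three procedures above can be sketched as follows. This assumes each set is built the way post #1 does it, by dividing exponential samples by their sum; that construction, the choice ##N = 1##, and the names `E`, `sets`, `oneSet`, `firstOfEach`, and `allVals` are illustrative assumptions, not something stated in the thread:

```matlab
M     = 20;    % numbers per set
nSets = 100;   % number of sets
N     = 1;     % required sum of each set

% Assumption: normalize M exponential samples per set, as in post #1.
% Each row of "sets" is one set whose entries sum to N.
E    = -log(rand(nSets, M));
sets = N * E ./ repmat(sum(E, 2), 1, M);

oneSet      = sets(1, :);   % first procedure: the M numbers of a single set
firstOfEach = sets(:, 1);   % second procedure: one number from each set
allVals     = sets(:);      % hybrid: all nSets*M numbers

% Bin counts for each case; call hist without an output to draw the plot.
c1 = hist(oneSet, 10);
c2 = hist(firstOfEach, 10);
c3 = hist(allVals, 50);
```

With `M = 2`, the second procedure is what the code in post #1 plots in its right-hand panel.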
 
Likes FactChecker

1. What is the purpose of re-scaling exponentially distributed numbers?

The purpose of re-scaling exponentially distributed numbers is to transform the data into a more manageable range. Exponential distributions can often have a large range of values, making it difficult to visualize and analyze the data. Re-scaling helps to standardize the data and make it easier to interpret.

2. How is re-scaling different from normalization or standardization?

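Re-scaling in the plain sense means multiplying every value by the same constant, which changes the range of the data but not the shape of its distribution. Normalization and standardization are specific conventions: min-max normalization maps the data onto a fixed interval such as [0, 1], and standardization subtracts the mean and divides by the standard deviation. None of these is specific to exponentially distributed numbers.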

3. What are the common methods for re-scaling exponentially distributed numbers?

Common transformations applied to exponentially distributed (right-skewed) data include the logarithmic, square-root, and Box-Cox transformations. These are nonlinear transformations rather than simple re-scalings; they are used to reduce skew and bring the data closer to a normal distribution.

4. Can re-scaling affect the interpretation of the data?

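Yes, re-scaling can affect the interpretation of the data. Multiplying every value by one constant changes the range but not the shape of the distribution. Dividing each value by a quantity that itself varies from sample to sample, as the row-wise division in this thread does, or applying a nonlinear transformation, changes the shape as well. This can impact the conclusions and insights drawn from the data.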

5. When is re-scaling of exponentially distributed numbers necessary?

Re-scaling of exponentially distributed numbers is useful when the data span a large range of values and are strongly skewed. This can make it difficult to visualize and analyze the data in its original form, and re-scaling can help to address this issue.
