Re-scaling of exponentially distributed numbers


Discussion Overview

The discussion revolves around the re-scaling of exponentially distributed random numbers and its effect on their distribution. Participants explore the mathematical implications of this re-scaling, particularly in relation to generating random variables that sum to a specific value.

Discussion Character

  • Exploratory
  • Technical explanation
  • Debate/contested
  • Mathematical reasoning

Main Points Raised

  • Some participants note that re-scaling exponentially distributed numbers results in a distribution that appears almost uniform, questioning the expectation of maintaining the original distribution shape.
  • One participant suggests that the mathematical question can be framed in terms of independent random variables and their transformations, specifically regarding the distribution of a derived variable from two uniformly distributed inputs.
  • Another participant challenges the method of re-scaling by summing across rows versus columns, proposing an alternative approach that may align better with the intended outcome.
  • Some participants argue that if the sum of generated numbers must equal a constant, then the independence of the distributions is compromised, leading to a different distribution than initially expected.
  • There is a discussion about the intent behind the re-scaling process, with some suggesting it aims to create pairs of random variables that sum to one, rather than performing a linear rescaling.
  • Participants express uncertainty regarding the mathematical explanation for the shape of the resulting histogram after re-scaling, indicating a lack of clarity in the mathematical question being posed.
  • Different interpretations of the procedure for generating and analyzing the random numbers are presented, highlighting the complexity of translating procedural steps into a coherent mathematical framework.

Areas of Agreement / Disagreement

Participants do not reach a consensus on the implications of re-scaling exponentially distributed numbers, with multiple competing views on the nature of the distributions involved and the correct approach to the problem.

Contextual Notes

There are limitations in the clarity of the mathematical question being posed, as well as the assumptions regarding the independence of the random variables and the intended outcomes of the re-scaling process.

roam
TL;DR: I am trying to generate ##M## random numbers which are exponentially distributed and whose sum adds up to ##N##. However, the re-scaling always causes the numbers to become uniformly distributed.
For simplicity, let ##N=1##. The following histograms show my results. The generated random numbers are initially exponentially distributed. But after re-scaling they become almost uniformly distributed.

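[Figure: two histograms of the first column of X. Left: before re-scaling (exponential shape). Right: after re-scaling (almost uniform).]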


What is the cause of that, and is there a solution?

P.S. Here is my code in Matlab:

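Matlab:
subplot(121)
samples = 10000;                    % number of pairs to generate
lambda = 1;                         % rate of the exponential distribution
X = -log(rand(samples,2))/lambda;   % inverse-transform sampling, two columns
hist(X(:,1),100)                    % first column before re-scaling
subplot(122)
X = X./sum(X,2);                    % re-scaling: divide each row by its own sum
hist(X(:,1),100)                    % first column after re-scaling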
 
roam said:
The generated random numbers are initially exponentially distributed. But after re-scaling they become almost uniformly distributed.
Yes, of course they do. That's the way math works. If you have a log distribution on a log scale, that's the same as a flat distribution in a linear scale. I don't know why you would expect otherwise. There IS no "solution".
 
roam said:
Summary: However, the re-scaling always causes the numbers to become uniformly distributed.
The mathematical question is more likely to be answered if it is stated precisely. As I read the MATLAB code, the mathematical question is:

##X## and ##Y## are independent random variables and each is uniformly distributed on [0,1]. What is the distribution of ##W = \frac{ \log(X)}{ \log(X) + \log(Y)}## ?
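
One way to answer this can be sketched as follows. The names ##A = -\log X## and ##B = -\log Y## are introduced here for illustration; they are independent and exponentially distributed with rate 1, and ##W = A/(A+B)##. The factor ##1/\lambda## in the code cancels in the ratio, so it does not matter. For ##0 < w < 1##, the event ##W \le w## is the same as ##B \ge \frac{1-w}{w} A##, and conditioning on ##A = a## gives

$$ P(W \le w) = \int_0^\infty e^{-a}\, e^{-\frac{1-w}{w} a}\, da = \int_0^\infty e^{-a/w}\, da = w . $$

Under these assumptions ##W## has the cumulative distribution function of a uniform random variable on [0,1], which is consistent with the flat second histogram.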
 
I don't quite understand why you are scaling by sum(X,2). That will rescale each row by the sum of the numbers on that row (2 numbers). I changed it to sum(X,1), which sums the 10000 numbers along index 1 (sums each column), and got what I think you were expecting. I like to keep it simple and see the intermediate calculations so that I can make sure it is doing what I expected:
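N=10                   % target sum (no semicolon, so the value is displayed)
S = sum(X,1)           % column sums, one per column (also displayed)
Y = N*X(:,1)/S(1,1);   % re-scaling: the first column now sums to N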
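To make the difference between the two normalizations concrete, here is a minimal sketch that applies both to the same samples. The names `W_rows` and `Y_cols` and the fixed seed are illustrative assumptions, not part of the original code, and `./` with a column vector relies on implicit expansion (MATLAB R2016b or later).

```matlab
rng(0);                              % fixed seed, only for reproducibility
samples = 10000;
lambda  = 1;
N       = 10;
X = -log(rand(samples,2))/lambda;    % exponential samples, as in the original post

% Row-wise: each pair (row) is divided by its own sum, so every row sums to 1.
W_rows = X./sum(X,2);

% Column-wise: the whole first column is divided by one constant, so the
% column sums to N and the shape of its histogram is unchanged.
Y_cols = N*X(:,1)/sum(X(:,1));

subplot(121); hist(W_rows(:,1),100); title('row-wise: X./sum(X,2)')
subplot(122); hist(Y_cols,100);      title('column-wise: N*X(:,1)/sum(X(:,1))')

fprintf('max |row sum - 1| = %g\n', max(abs(sum(W_rows,2) - 1)));
fprintf('sum of Y_cols     = %g\n', sum(Y_cols));
```

The row-wise version divides each sample by a different random number, which is why the histogram changes shape. The column-wise version divides every sample by the same constant, which only stretches the horizontal axis.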
 
roam said:
Summary: I am trying to generate ##M## random numbers which are exponentially distributed and whose sum adds up to ##N##.
If their sum adds up to a given ##N## then the numbers are not independent, and so they cannot all individually be exponentially distributed. Take the case of ##M = 2##: the first number ##n_0## can be exponentially distributed in the range ##[0,N]##, but the second number must then always equal ##N-n_0##.
roam said:
However, the re-scaling always causes the numbers to become uniformly distributed.
Yes of course, because as @StephenTashi points out your 'rescaling' creates a completely different distribution.
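
The dependence can be made quantitative. As a sketch, assume the procedure is: draw ##X_1, \dots, X_M## independently from the same exponential distribution, then set ##W_i = X_i / S## with ##S = X_1 + \dots + X_M## (the symbols ##W_i## and ##S## are introduced here for illustration). A standard result is that ##(W_1, \dots, W_M)## is then uniformly distributed on the simplex (a Dirichlet distribution with all parameters equal to 1), and each ##W_i## has the Beta(1, M-1) density

$$ f_{W_i}(w) = (M-1)\,(1-w)^{M-2}, \qquad 0 \le w \le 1, \quad M \ge 2 . $$

For ##M = 2## this is the constant density 1, i.e. the uniform histogram in the original post. For larger ##M## the density decays with ##w##, and since ##(1 - t/M)^{M-2} \to e^{-t}## as ##M \to \infty##, the rescaled value ##M W_i## is approximately exponential with mean 1 when ##M## is large. Under these assumptions the numbers cannot be exactly exponential while summing to a fixed value, but for large ##M## each one is close to exponential.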
 
pbuk said:
If their sum adds up to a given N then their distributions are not independent
He was trying to generate the data and do a simple linear rescaling of the data afterward. It should not have changed the general shape of the distribution. He had a MATLAB coding error.
 
FactChecker said:
He was trying to generate the data and do a simple linear rescaling of the data afterward. It should not have changed the general shape of the distribution.

As to rescaling, I think the goal is to take pairs of random variables ##X_a, X_b##, and from each pair create the pair ##W_a = X_a/(X_a + X_b),\ W_b = X_b/(X_a + X_b)##. Then we look at the distribution of ##W_a##. So the intent is not to do a linear rescaling. The intent is to create pairs of random variables ##W_a,\ W_b## that sum to 1.
 
Stephen Tashi said:
As to rescaling, I think the goal is to take pairs of random variables ##X_a, X_b##, and from each pair create the pair ##W_a = X_a/(X_a + X_b),\ W_b = X_b/(X_a + X_b)##. Then we look at the distribution of ##W_a##. So the intent is not to do a linear rescaling. The intent is to create pairs of random variables ##W_a,\ W_b## that sum to 1.
That is what his original code did. I don't know what the real intention was. He got a valid answer to either case in this thread. I didn't see anything about "pairs" in the description. I still think that my assumption fits the original description better (or at least as well).
 
FactChecker said:
I don't know what the real intention was. He got a valid answer to either case in this thread.

We don't yet have a good mathematical explanation for the shape of the second histogram. I agree that we don't yet have a clear statement of a mathematical question!
roam said:
Summary: I am trying to generate ##M## random numbers which are exponentially distributed and whose sum adds up to ##N##.

The generated random numbers are initially exponentially distributed. But after re-scaling they become almost uniformly distributed.

It's often hard to translate a procedure into a question about random variables. The first step is to describe the procedure clearly.

As a procedure, one interpretation of what you want to do, in general, is to generate one set of ##M## random numbers that sum to ##N## and then make a histogram of all those ##M## random numbers, i.e. all ##M## of the numbers contribute to the histogram. Your claim is that most such histograms are approximately uniform distributions.

A different interpretation is that you want to generate many sets of ##M## random numbers, each set being one where the ##M## numbers sum to the same ##N##. Then you want to make a histogram by using one number from each of the sets. For example, if you generate 100 sets of ##M= 20## numbers, you might make a histogram by picking the first number from each of the 100 sets of numbers. The histogram would involve 100 values.

Or, to make a hybrid of the previous procedures, perhaps you want to generate, say, ##100## sets of ##M = 20## random numbers such that the sum of the numbers in each set is ##N##. Then you want to histogram all 2000 of the numbers.
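
The second and hybrid interpretations can be sketched in a few lines, reusing the row-wise re-scaling from the original post. The values `numSets = 100`, `M = 20` and `N = 1`, the variable names, and the fixed seed are illustrative assumptions, and `./` with a column vector relies on implicit expansion (MATLAB R2016b or later).

```matlab
rng(1);                                  % fixed seed, only for reproducibility
numSets = 100;                           % number of sets, one per row
M       = 20;                            % numbers per set
N       = 1;                             % required sum of each set
lambda  = 1;

X = -log(rand(numSets,M))/lambda;        % exponential samples, one set per row
W = N*X./sum(X,2);                       % each row now sums to N

firstOfEach = W(:,1);                    % second interpretation: 100 values
allValues   = W(:);                      % hybrid interpretation: 2000 values

subplot(121); hist(firstOfEach,20); title('first number of each set')
subplot(122); hist(allValues,50);   title('all numbers of all sets')
```

With ##M = 20## the histograms should come out right-skewed rather than flat, which suggests that the flat shape in the original post is specific to ##M = 2##. The first interpretation corresponds to `numSets = 1` and a histogram of the single row.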
 
