How do I randomly generate a set of numbers that sum up to one? 
#1
Nov15-12, 04:45 PM

P: 3

I teach cost-benefit analysis, which requires me to teach Monte Carlo simulation for sensitivity analysis. I use Excel. I understand how to generate a number from a uniform, triangular, normal or other distribution, but I don't know how to randomly generate a set of numbers between zero and one that sum to one.

Here is the exact application. Suppose I have an estimate of the proportions of blacks, whites, hispanics, and asians in a given population, let's say 0.2, 0.5, 0.2, 0.1. In my cost-benefit analysis there is some impact I'm estimating that depends on these proportions. For example, suppose I'm estimating the total number of people who will sign up for Medicaid benefits, and I have an estimate of the sign-up rate for each race/ethnic group, so the total number of sign-ups depends on the weighted average of sign-up rates.

In Monte Carlo sensitivity analysis I want to vary each of these parameters over some believable range. Let's say I have reason to believe that the race/ethnic proportions may not be exactly 0.2, 0.5, 0.2, 0.1, but that each of them lies within 0.05 above or below those numbers, so my ranges are [0.15,0.25], [0.45,0.55], [0.15,0.25], [0.05,0.15]. If I naively tell Excel to choose four numbers randomly, one from within each of those ranges, they are extremely unlikely to sum to one. If I tell Excel to choose a number from each of the first three ranges and subtract their sum from one to get the fourth number, the first three may sum to more than 1 (up to 1.05), and the fourth may fall outside its range.

Ultimately what I need is to be able to randomly generate a set of proportions that sum to one when I have some belief about the range each proportion must lie in. I have no idea how to think about how to do this, but it must come up a lot. I (and my students) will be hugely grateful for a solution. Dan Acland, Goldman School of Public Policy, UC Berkeley.


#2
Nov15-12, 06:31 PM

Homework
Sci Advisor
HW Helper
Thanks
P: 13,051

Welcome to PF;
From your criteria, you cannot get a set of proportions that add up to one unless you make one of them dependent on the others. In your example, you could randomly generate the first three, and make the last one whatever makes the four sum to one.
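A minimal Python sketch of this "three free, one dependent" idea, using the ranges from the original question. The rejection step (redrawing whenever the fourth value leaves its own range) is an assumption added here to deal with the problem raised in the question; it is not part of the reply above. The names `LOWS`, `HIGHS` and `sample_with_dependent_last` are hypothetical.

```python
import numpy as np

# Ranges from the original question: lower and upper bound for each group.
LOWS = np.array([0.15, 0.45, 0.15, 0.05])
HIGHS = np.array([0.25, 0.55, 0.25, 0.15])

def sample_with_dependent_last(rng, max_tries=10_000):
    """Draw the first three proportions uniformly and set the fourth to the remainder.

    Assumption (not in the thread): redraw whenever the fourth value
    falls outside its own range.
    """
    for _ in range(max_tries):
        first = rng.uniform(LOWS[:-1], HIGHS[:-1])  # three independent draws
        last = 1.0 - first.sum()                    # dependent fourth value
        if LOWS[-1] <= last <= HIGHS[-1]:
            return np.append(first, last)
    raise RuntimeError("no acceptable draw found")

rng = np.random.default_rng(0)
p = sample_with_dependent_last(rng)
print(p, p.sum())
```

Note that the dependent value does not end up with the same distribution as the three free ones, so the choice of which proportion is "last" matters.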


#3
Nov15-12, 09:47 PM

Homework
Sci Advisor
HW Helper
Thanks
P: 9,922

An obvious solution is to generate all four numbers randomly then rescale to get the desired total. Now, that won't generate values with exactly the distribution fed in, but I gather those distributions are only plucked out of the air anyway, so that shouldn't matter.
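A rough Python sketch of the sample-then-rescale idea, assuming uniform draws over the ranges given in the original question. It also counts how often the rescaled proportions leave those ranges, which is one way to see how far the result drifts from the distributions fed in. The names `LOWS`, `HIGHS` and `sample_and_rescale` are hypothetical.

```python
import numpy as np

# Ranges from the original question.
LOWS = np.array([0.15, 0.45, 0.15, 0.05])
HIGHS = np.array([0.25, 0.55, 0.25, 0.15])

def sample_and_rescale(rng, n):
    """Draw n rows uniformly within the ranges, then divide each row by its sum."""
    raw = rng.uniform(LOWS, HIGHS, size=(n, 4))
    return raw / raw.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
props = sample_and_rescale(rng, 100_000)

# A row counts as "inside" only if all four proportions respect their ranges.
inside = np.all((props >= LOWS) & (props <= HIGHS), axis=1)
print("fraction of rows with every proportion inside its range:", inside.mean())
```

Every row sums to one by construction; the printed fraction shows how many rows would have to be discarded if the ranges were treated as hard limits.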



#4
Nov16-12, 09:17 AM

P: 74

Hi Dan, I suggest following the Bayesian procedure with a multinomial model and a Dirichlet prior. You set the Dirichlet prior through its shape parameters, chosen so that the prior corresponds to your [.2, .5, .2, .1] vector. Then you update the Dirichlet distribution with the multinomial model (e.g. with randomly generated classes, according to your needs) to obtain a posterior that differs slightly from [.2, .5, .2, .1].

Another, more direct way is to sample directly from the prior, e.g. in Python (after import numpy):

    numpy.random.mtrand.dirichlet([2.,5.,2.,1.], 2)

which generates 2 vectors, e.g.:

    array([[ 0.09636368, 0.53846125, 0.20418588, 0.16098919],
           [ 0.19053245, 0.69141272, 0.11662014, 0.00143469]])

Notice that the rows sum to unity. The concentration around the original values is driven by the magnitude of the Dirichlet parameters, e.g.

    numpy.random.mtrand.dirichlet(numpy.array([2.,5.,2.,1.])*1e5, 2)

    array([[ 0.20068111, 0.49879339, 0.20027888, 0.10024661],
           [ 0.20021036, 0.49957287, 0.19975537, 0.10046141]])

In R, you can proceed similarly. Presumably, composing an equivalent function in Excel wouldn't be too hard.
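The two routes described above can be sketched with NumPy's `Generator` interface (`numpy.random.default_rng`), which offers the same Dirichlet sampler. The prior strength (`base * 10`) and the 50 simulated observations are illustrative assumptions, not values from the post; the update rule itself (posterior parameters = prior parameters + observed counts) is the standard Dirichlet-multinomial result.

```python
import numpy as np

rng = np.random.default_rng(0)
base = np.array([2.0, 5.0, 2.0, 1.0])  # shape parameters with mean [.2, .5, .2, .1]
mean = base / base.sum()

# Direct route: sample from the prior. Larger parameters give tighter draws.
loose = rng.dirichlet(base, size=2)
tight = rng.dirichlet(base * 1e5, size=2)

# Two-stage route: update the prior with simulated multinomial counts.
prior_alpha = base * 10                # assumed prior strength
counts = rng.multinomial(50, mean)     # assumed 50 simulated observations
posterior_alpha = prior_alpha + counts # conjugate update
posterior_mean = posterior_alpha / posterior_alpha.sum()

print(loose)
print(tight)
print(posterior_mean)
```

Each row of `loose` and `tight` sums to one, and `posterior_mean` is a proportion vector near, but not equal to, the prior mean.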


#5
Nov18-12, 12:56 AM

P: 570

Good for you for giving us so much detail. 


#6
Nov18-12, 01:16 AM

Homework
Sci Advisor
HW Helper
Thanks
P: 13,051

@Danacland: how did you get on?



#7
Nov18-12, 08:32 AM

Sci Advisor
P: 1,718

Randomly generate positive numbers, then divide each one by their total.



#8
Nov19-12, 04:39 PM

P: 3

Thanks for these responses. The sample-and-scale approach occurred to me. I think a simulation would give me a sense of how badly it would violate my ranges.
@Camillio: I was unaware of the Dirichlet distribution. It looks like the right answer, though if I use the simpler, direct approach you suggest, it looks like I can't specify the range I believe the true proportions lie in around the initial alphas. If I ran a simulation I could probably get a sense of how much spread the Dirichlet distribution generates around the alphas, and for most of the rough-and-ready policy analysis stuff I teach, this would probably be fine. My sense is that the two-stage procedure you proposed is a way to get the spread I want. Is that right? Unfortunately I don't really follow the steps you outline. Is that because I don't know anything about Bayesian inference? Is there somewhere a moderately boneheaded economist could get a quick and dirty introduction to the kind of procedure you are outlining? Or can you explain it to me in "layman's" terms without taking up too much of your time?
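The simulation mentioned here might look like the following Python sketch. It scales the mean vector [.2, .5, .2, .1] by a concentration factor and reports how often all four sampled proportions stay within 0.05 of their means. The concentration values tried and the names `MEAN`, `HALF_WIDTH` and `fraction_inside` are assumptions chosen for illustration.

```python
import numpy as np

MEAN = np.array([0.2, 0.5, 0.2, 0.1])
HALF_WIDTH = 0.05  # allowed deviation from each mean, from the original question

def fraction_inside(rng, concentration, n=20_000):
    """Fraction of Dirichlet draws whose four components all stay within range."""
    draws = rng.dirichlet(MEAN * concentration, size=n)
    return np.all(np.abs(draws - MEAN) <= HALF_WIDTH, axis=1).mean()

rng = np.random.default_rng(0)
results = {c: fraction_inside(rng, c) for c in (10, 100, 1_000, 10_000)}
for c, frac in results.items():
    print(f"concentration {c:>6}: fraction inside = {frac:.3f}")
```

Raising the concentration tightens the draws around the means, so it acts as a single dial for the spread, although it cannot set a different width for each proportion.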


#9
Nov20-12, 10:36 AM

P: 74

Well, offhand I'm not sure how to guarantee that your values don't exceed the limits. Still, there are some ways to generate them so that, with high probability, they do not:

1) Notice that you have 4 classes [itex]X_1, ..., X_4[/itex] with parameters [itex]\alpha_1,...,\alpha_4[/itex], each with mean value [itex]\mathbb{E}[X_i] = \alpha_i / \sum_j \alpha_j[/itex]. You can exploit the Chebyshev inequality and set the parameters' values so high that the probability of the sampled values falling outside your limits is adequately small.

2) Consider [.2, .5, .2, .1] to be the mean values of a Dirichlet distribution with [itex]\alpha = (20, 50, 20, 10)[/itex]. Then you can generate (uniform) random vectors with elements' values from (-5, 5), update [itex]\alpha[/itex] (i.e. add the random vector to it) and calculate the estimate [itex]\alpha_i / \sum_j \alpha_j[/itex].

The former case produces vectors with a non-uniform distribution: values close to the original ones will be more frequent. However, in both cases, you will need to check whether the final estimates are within your constraints. In the latter case, e.g., the extreme case [25, 55, 25, 5] leads to [.227, .5, .227, .045]. The last value is below its lower limit of 0.05.
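Option 2 can be sketched in Python as follows, including the constraint check the post says is still needed and the extreme case it quotes. Discarding rows that violate the ranges is one possible way to act on that check and is an assumption here, as are the names `ALPHA`, `LOWS`, `HIGHS` and `perturbed_means`.

```python
import numpy as np

ALPHA = np.array([20.0, 50.0, 20.0, 10.0])  # means [.2, .5, .2, .1]
LOWS = np.array([0.15, 0.45, 0.15, 0.05])
HIGHS = np.array([0.25, 0.55, 0.25, 0.15])

def perturbed_means(rng, n, spread=5.0):
    """Add uniform(-spread, spread) noise to ALPHA and return the normalised rows."""
    noise = rng.uniform(-spread, spread, size=(n, ALPHA.size))
    alpha = ALPHA + noise
    return alpha / alpha.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
est = perturbed_means(rng, 100_000)

# Assumed handling of the check: keep only rows that satisfy every range.
ok = np.all((est >= LOWS) & (est <= HIGHS), axis=1)
accepted = est[ok]
print("fraction of rows kept:", ok.mean())

# The extreme case quoted in the post.
extreme = np.array([25.0, 55.0, 25.0, 5.0])
extreme_mean = extreme / extreme.sum()
print(extreme_mean.round(3))  # last entry is about 0.045, below the 0.05 limit
```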


#10
Nov20-12, 02:12 PM

P: 825

First you take your random numbers to be the deviations from your average values (the average values should sum to 1). You now want to create a list of deviation values that sum to zero.
The algorithm works like this: once you have chosen a number, the next number you can choose will lie in the intersection of what can still be corrected (bounded by the sum of the maximum remaining deviations, in either direction) and the deviation allowed for this value.
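One reading of this algorithm, sketched in Python. Each deviation is drawn uniformly from the interval that both respects its own limit and leaves a running sum the remaining deviations can still cancel; the last deviation is then forced. Visiting the indices in random order and drawing uniformly within each interval are choices made for this sketch, and the names `MEANS`, `MAX_DEV` and `zero_sum_deviations` are hypothetical.

```python
import numpy as np

MEANS = np.array([0.2, 0.5, 0.2, 0.1])
MAX_DEV = np.array([0.05, 0.05, 0.05, 0.05])  # allowed deviation per proportion

def zero_sum_deviations(rng, max_dev):
    """Draw deviations within +/- max_dev that sum to zero."""
    k = len(max_dev)
    order = rng.permutation(k)   # random visiting order (a choice made here)
    dev = np.zeros(k)
    running = 0.0                # sum of deviations chosen so far
    remaining = max_dev.sum()    # total correction still available
    for pos, idx in enumerate(order):
        if pos == k - 1:
            dev[idx] = -running  # last deviation is forced by the zero-sum rule
            break
        remaining -= max_dev[idx]  # what the later picks can still absorb
        lo = max(-max_dev[idx], -remaining - running)
        hi = min(max_dev[idx], remaining - running)
        dev[idx] = rng.uniform(lo, max(hi, lo))  # guard against rounding
        running += dev[idx]
    return dev

rng = np.random.default_rng(0)
samples = np.array([MEANS + zero_sum_deviations(rng, MAX_DEV) for _ in range(1_000)])
print(samples[:3])
```

Every row sums to one and every proportion stays within its range, so no rescaling or rejection is needed. The resulting distribution is not uniform over the feasible set, because later intervals shrink depending on earlier draws.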



#11
Nov21-12, 04:20 AM

P: 74

That's smart, 0xDEADBEEF! I'd however add one more thing: choose the indices of the deviations list in a random order, otherwise the result will be biased because the supports of the successive distributions shrink. Another (easier) possibility would be to shuffle the deviations with numpy.random.shuffle.
EDIT: Ough, I'm even more lazy... Now I've read your post to its end, where you mention shuffling... 


#12
Nov27-12, 12:56 PM

P: 3

That sounds like a rather brilliantly simple solution, 0xDEADBEEF. Thanks to all.



#13
Nov28-12, 12:40 AM

Homework
Sci Advisor
HW Helper
Thanks
P: 9,922

It's still not clear to me whether the result is better or worse than post-scaling. But given the crudeness of supposing uniform distributions in the first place, I can't see that it'll matter much.

