Generating Random Numbers with the Acceptance-Rejection Method

AI Thread Summary
The discussion focuses on implementing the acceptance-rejection method for generating random numbers in C++ using ROOT for graphing. Key issues include the initial incorrect assignment of an integer variable and the need for proper function evaluations within the loop. Participants suggest adjusting the random number ranges to ensure sufficient coverage of the probability density function (PDF) and recommend using a two-dimensional histogram for better visualization. There is also a discussion about alternative methods like Box-Muller and std::normal_distribution for generating random numbers. Ultimately, the importance of correctly scaling histogram values to maintain the integrity of the PDF is emphasized.
Amrator
Messages
246
Reaction score
83
I'm trying to write a C++ program to generate random numbers using the acceptance-rejection method. To plot the graphs, I'm using ROOT by CERN. I am checking if y values taken from the rectangular boundary are less than or equal to the function ##f(x_{i}) = e^{-k(x_{i} - x_{0})^{2}}##.

C++:
void rejection()
{
    int n = 100;
    float x, y;
    float pi = TMath::Pi();
    float squeeze = 1.;
    float A = 1.0/sqrt(2.*pi);

    cout << "Amplitude: " << A << endl;
 
    gStyle->SetOptStat(0);
    TH1D *dist = new TH1D("dist", "sampling", 100, -4, 100);

    TF1 *f = new TF1("f", "exp(-1000.*(x-0.3)*(x-0.3))", 0, 1);

    for (int i = 0; i < n; i++)
    {
        x = gRandom->Rndm();
        y = gRandom->Rndm();
        if(y <= f)
        {
            dist->Fill(x); //accept x
        }
    }
    float scale = A/dist->GetMaximum();
    float scale = 1./dist->GetMaximum();
    dist->Scale(scale);
    dist->Draw("same");
}

I am not sure I am even coding the method correctly. May I have some guidance?
1606780667919.png
 
Last edited:
Technology news on Phys.org
The first line of the function is already problematic:
Code:
int n = .5;
You're assigning a floating point number to an integer. This is legal, but probably not what you want, as 0.5 will be rounded down to the nearest integer, which is 0. Any decent compiler should generate a warning about this.

Then, a few lines later, you're assigning to two variables, dist and f, that have not been defined:
Code:
    dist = new TH1D("dist", "sampling", 100, -4, 100);
    f = new TF1("f", "exp(-1000.*(x-0.3)*(x-0.3))", 0, 1);
Does this code even compile?
 
jbunniii said:
The first line of the function is already problematic:
Code:
int n = .5;
You're assigning a floating point number to an integer. This is legal, but probably not what you want, as 0.5 will be rounded down to the nearest integer, which is 0. Any decent compiler should generate a warning about this.

Then, a few lines later, you're assigning to two variables, dist and f, that have not been defined:
Code:
    dist = new TH1D("dist", "sampling", 100, -4, 100);
    f = new TF1("f", "exp(-1000.*(x-0.3)*(x-0.3))", 0, 1);
Does this code even compile?
You are right about the first point. I changed .5 to 1.

I am using ROOT to graph the code. So dist and f are legal. f is a TF1 function and dist is a one dimensional histogram. They are legal. And yes, it compiles, but I'm not worried about whether or not it compiles right now. I'm trying to figure out how to code the acceptance-rejection method correctly.
 
Even if we don't know ROOT, this can be considered pseudocode. I suggest that you strip out the graphing parts of the code and just show us the rejection code that you are wondering about. Some things that look wrong to me are:
1) n is set to 1 and the loop will only be executed once
2) the function value, f, is not calculated for the new value of x inside the loop.
If ROOT is somehow taking care of these problems, then only people familiar with ROOT can help you.
 
FactChecker said:
Even if we don't know ROOT, this can be considered pseudocode. I suggest that you strip out the graphing parts of the code and just show us the rejection code that you are wondering about. Some things that look wrong to me are:
1) n is set to 1 and the loop will only be executed once
2) the function value, f, is not calculated for the new value of x inside the loop.
If ROOT is somehow taking care of these problems, then only people familiar with ROOT can help you.
1) I changed n to 100.
2) ROOT does indeed take care of that. https://root.cern.ch/doc/master/classTF1.html
 
Another concern that I have is the range of the random numbers generated by rndm. You need to allow a large enough y-range to include all of the height of the PDF and enough x-range to include all but the small-probability tails at the left and right. It looks like rndm will only generate numbers in [0,1]. That would require setting x to something like: x = maxRange*rndm - maxRange/2

EDIT: Actually, after looking at the PDF, I see that the x-range of [0,1] covers practically all of the probability. That forces most of your generated (x,y) samples to be rejected. You might want to set a small maxRange.
 
Last edited:
FactChecker said:
Another concern that I have is the range of the random numbers generated by rndm. You need to allow a large enough y-range to include all of the height of the PDF and enough x-range to include all but the small-probability tails at the left and right. It looks like rndm will only generate numbers in [0,1]. That would require setting x to something like: x = maxRange*rndm - maxRange/2

EDIT: Actually, after looking at the PDF, I see that the x-range of [0,1] covers practically all of the probability. That forces most of your generated (x,y) samples to be rejected. You might want to set a small maxRange.
Does the for-loop cover all of the y values as well? And would you elaborate on the maxRange?
 
Last edited:
It covers practically all of the possibility. The y values (the PDF) stays under 1 and the range of x values only omit the very unlikely values on each side. The rejection method will reject a large fraction, so it might take a while to get enough valid samples (but computers are fast these days).
1606841341444.png
 
  • Like
Likes Amrator and sysprog
Just out of curiosity, why are you not using a Box-Muller? That looks much easier.

Easier still is the Gaus method in TRandom.
 
  • #10
Vanadium 50 said:
Just out of curiosity, why are you not using a Box-Muller? That looks much easier.

Easier still is the Gaus method in TRandom.
That's actually one of the other methods I have to use. I'm doing the same problem three times using three different methods: Box-Muller, acceptance-rejection, and central limit theorem.
 
  • #11
1606845075117.png

So this is what I get when I run n from 0 to 10000. Obviously, this doesn't look right.
 
  • #12
If you examine the sample mean, you might see a very slight bias toward the high side since your range of x cuts off more of the lower tail end than the upper.
 
  • #13
Amrator said:
View attachment 273509
So this is what I get when I run n from 0 to 10000. Obviously, this doesn't look right.
What is your x-axis here? Your accepted x-values will cluster tightly around 0.3, so I don't think that this histogram is usable. But setting the parameters of the histogram is a ROOT plotting issue, not a random number issue. I see no reason to doubt your rejection method just because of the unusable histogram.
 
  • Like
Likes Amrator
  • #14
FactChecker said:
What is your x-axis here? Your accepted x-values will cluster tightly around 0.3, so I don't think that this histogram is usable. But setting the parameters of the histogram is a ROOT plotting issue, not a random number issue. I see no reason to doubt your rejection method just because of the unusable histogram.
Honestly, you're right. I don't even think a 1D histogram will work here. Let me try a 2D histogram.
 
  • #15
Amrator said:
Honestly, you're right. I don't even think a 1D histogram will work here. Let me try a 2D histogram.
I don't know what you mean by 1D and 2D, two dimensional? The problems with the histogram are the range of the x values and the lack of fine resolution on the x axis.

Your line:
TH1D *dist = new TH1D("dist", "sampling", 100, -4, 100);
is specifying 100 bins from x=-4 to x=100.
But x only goes from 0 to 1, and much of that range has very small probability. Look at the graph in post #8. Better parameters would be 100, 0.1, 0.5:
TH1D *dist = new TH1D("dist", "sampling", 100, 0.1, 0.5);
 
Last edited:
  • #16
When you plot a PDF, remember that the integral is 1 (not the maximum). So you need to know how many total samples were accepted (not rejected) and divide the histogram bin numbers by that total number.
 
  • #17
You want a one dimensional histogram (only x is variable) - but as FactChecker wrote the binning is wrong. All your entries are in just one or two bins.

Line 25 does nothing, by the way, and line 26 is a redefinition of scale.
Code:
if(y <= f)
Doesn't work under ROOT 6.06 at least:
> error: invalid operands to binary expression ('float' and 'TF1 *')
Code:
if(y <= f->Eval(x))
This works. After fixing the binning, the redefinition of scale and this point I get a graph that looks reasonable given the small n=100. With n=10000 it resembles a Gauss shape.
 
Last edited:
  • Like
Likes Amrator and FactChecker
  • #18
Vanadium 50 said:
Just out of curiosity, why are you not using a Box-Muller? That looks much easier.

Easier still is the Gaus method in TRandom.
Or better yet, use std::normal_distribution, which has been part of the C++ standard library since C++11.
 
  • #19
Well, this is ROOT, which is not exactly pure C++ Only a subset of STL is in there.
 
  • #20
root [0] gROOT->GetVersion()
(const char *) "6.08/06"
root [1] std::normal_distribution<> d{5,2};
root [2]

This one is in.
 
  • #21
Alright, I got it. Thanks for the help, everyone.
1606887450864.png
Here is the same problem but using the Box-Muller method:
1606887699982.png
 
Last edited:
  • Like
Likes FactChecker
  • #22
One unusual thing about this example is that the maximum of the PDF is 1 (exactly?), which is also the total integral of the PDF. So it is hard to know if you scaled the values by the right number. You should scale the histogram cell counts by the total valid (not-rejected) sample size so that the total (integral) is 1. You should not scale them by the maximum cell count. I am not sure that your original scale code was correct for the general case.
 
Last edited:
  • Like
Likes Amrator
  • #23
In addition it's highly advisable to use the same binning for the different methods, that makes them easier to compare.
 
  • Like
Likes Amrator

Similar threads

Replies
4
Views
2K
Replies
6
Views
2K
Replies
3
Views
2K
Replies
1
Views
8K
Replies
2
Views
2K
Replies
0
Views
798
Replies
1
Views
4K
Back
Top