# VEGAS algorithm

wuliwong
I am attempting to compute the integral of a function that is mostly flat relative to a few localized peaks. These peaks are the bulk of the contribution to the integral. I decided to try the VEGAS algorithm to do this. I am running into some difficulty, so I am reaching out to YOU physicsforums.com.

One of the main areas that I am having trouble with is using the density function. The density function is p(x) = |f(x)|/int(|f(x)|dx).

This is how I THINK this is supposed to work.

1. Calculate int(|f(x)|dx) using regular Monte-Carlo.
2. Use this result to calculate p(x) = |f(x)|/int(|f(x)|dx).
3. Calculate int((f(x)/p(x))*p(x)dx).
3(a). This integral is represents the first problem for me. I know I am to use p(x) to create a new "mesh" to integrate over, let's call it xp. xp is some function of p(x) and x.
3(b). Another issue I have needs us to now assume I have figured out how to calculate xp. Is the integral in step 3 calculated like this:

int((f(x)/p(x))*p(x)dx) = (1/N)*sum((f_i/g_i)*dxp_i) ?

And finally, assuming that I have learned all of the above and can approximate the integral in 3, how do I extract the original integral I wanted to calculate?

Also, I'm open to other suggestions too. I'm not married to this VEGAS algorithm, not by a long shot.

blayzinblues
I am attempting to compute the integral of a function that is mostly flat relative to a few localized peaks. These peaks are the bulk of the contribution to the integral. I decided to try the VEGAS algorithm to do this. I am running into some difficulty, so I am reaching out to YOU physicsforums.com.

One of the main areas that I am having trouble with is using the density function. The density function is p(x) = |f(x)|/int(|f(x)|dx).

This is how I THINK this is supposed to work.

1. Calculate int(|f(x)|dx) using regular Monte-Carlo.
2. Use this result to calculate p(x) = |f(x)|/int(|f(x)|dx).
3. Calculate int((f(x)/p(x))*p(x)dx).
3(a). This integral is represents the first problem for me. I know I am to use p(x) to create a new "mesh" to integrate over, let's call it xp. xp is some function of p(x) and x.
3(b). Another issue I have needs us to now assume I have figured out how to calculate xp. Is the integral in step 3 calculated like this:

int((f(x)/p(x))*p(x)dx) = (1/N)*sum((f_i/g_i)*dxp_i) ?

And finally, assuming that I have learned all of the above and can approximate the integral in 3, how do I extract the original integral I wanted to calculate?

Also, I'm open to other suggestions too. I'm not married to this VEGAS algorithm, not by a long shot.

This thread is a bit old but still unanswered so I'll give it a go to try and eliminate some confusion... or perhaps create more. I'm going to assume a basic knowledge of Monte Carlo integration (but check Wikipedia if you need to refresh).

In a basic Monte Carlo integration, data points X are sampled from a uniform distribution of x (called p(x)) and then used to obtain an estimate of the integral of f(x). This can lead to slow convergence or possibly no convergence if the points sampled happen not to contribute substantially to the integral of f(x).

Using importance sampling, we can pick a distribution of x, p(x), which will focus our sampled points in the region of x that most contributes to the integral of f(x). This should help convergence but requires that we know what p(x) is. Unfortunately, finding p(x) is a big problem.

As was mentioned above, p(x) = |f(x)|/int(|f(x)|dx). This is the density function which which minimizes the variance of the estimate of the integral. At first inspection, it appears this is a somewhat useless fact since it involves integrating f(x), which is what we set out to do in the first place, or knowing p(x) beforehand, which we don't. Fortunately, the Vegas algorithm can help us estimate p(x).

The Vegas algorithm adaptively determines p(x) by using a technique called stratified sampling. We divide the integration interval into N subregions, for each of which the average function value f_avg is estimated. Once we have f_avg for each interval, we can create a weighting for each interval by dividing f_avg_i/sum(f_avg_i) where i = 1:N is the interval index. Multiply each weight by a constant K (say K = 1000) to get the number of subregions to divide each interval into.

Now we can update p(x). p(x) is initially uniform and x has a probability of 1/N of being in the each interval. After subdividing each of the N intervals, we now have a bunch of small intervals. We want to return to having N intervals of equal probability. We want to redistribute the subintervals (don't rearrange them just count them from left to right) so that we have N intervals each containing the same amount of subintervals. The thing that changes about the N intervals is their width: initially each interval has the same width but after determining the subintervals the regions with higher weights (i.e. that contribute more to the integral) will have NARROWER widths and the regions of lower probability will have WIDER widths (this requires some tricky bookkeeping to make sure that the correct widths are used). [Consider the CDF of a function where the x axis is 'x' and the y axis is 'cdf(x)' and 0<=cdf(x) <=1. If 2 intervals on the x axis are equally likely, then if one of the intervals is shorter the chance of observing each value in that range is increased.]

Once the new interval widths are known, the CDF of x can be constructed. We can now sample points on [0 1] from a uniform distribution using standard random number generators (best to use the Twister algorithm if you can afford it), input them into the inverse cdf, and voila we have points sampled from the updated (i.e. nonuniform) distribution p(x). Of course, we can keep refining the grid by repeating the process of estimating weights, dividing into subincrements etc. until the optimal case where we find that each interval has the same amount of subintervals.

Instead of a brute monte carlo using a uniform distribution, we can estimate the integral by instead sampling from our adaptive p(x). Notice that by sampling from p(x) I don't need to divide by p(x) when I estimate the integral (the notation given in the literature often makes it confusing what needs to be done in implementation). With enough computational horsepower, you should be able to evaluate a very wide variety of integrals.

While the description I've given is for a single dimension, this sort of algorithm has no business estimating most integrals for N < 4. There are much faster methods for estimating those. Also, Vegas integration has been around since 1977 and newer methods are available (e.g. the Cuba package). MCMC is also very popular.