Using VEGAS Algorithm to Compute Integral of Flat Function with Localized Peaks

  • #1
wuliwong
I am attempting to compute the integral of a function that is mostly flat relative to a few localized peaks. These peaks are the bulk of the contribution to the integral. I decided to try the VEGAS algorithm to do this. I am running into some difficulty, so I am reaching out to YOU physicsforums.com.

One of the main areas that I am having trouble with is using the density function. The density function is p(x) = |f(x)|/int(|f(x)|dx).

This is how I THINK this is supposed to work.

1. Calculate int(|f(x)|dx) using regular Monte Carlo.
2. Use this result to calculate p(x) = |f(x)|/int(|f(x)|dx).
3. Calculate int((f(x)/p(x))*p(x)dx).
3(a). This integral represents the first problem for me. I know I am to use p(x) to create a new "mesh" to integrate over; let's call it xp. xp is some function of p(x) and x.
3(b). For my other issue, assume I have figured out how to calculate xp. Is the integral in step 3 calculated like this:

int((f(x)/p(x))*p(x)dx) = (1/N)*sum((f_i/g_i)*dxp_i) ?

And finally, assuming that I have learned all of the above and can approximate the integral in 3, how do I extract the original integral I wanted to calculate?

Also, I'm open to other suggestions too. I'm not married to this VEGAS algorithm, not by a long shot.

Thanks in advance.
 
  • #2
wuliwong said:
I am attempting to compute the integral of a function that is mostly flat relative to a few localized peaks. These peaks are the bulk of the contribution to the integral. I decided to try the VEGAS algorithm to do this. [...] And finally, assuming that I have learned all of the above and can approximate the integral in 3, how do I extract the original integral I wanted to calculate?

This thread is a bit old but still unanswered so I'll give it a go to try and eliminate some confusion... or perhaps create more. I'm going to assume a basic knowledge of Monte Carlo integration (but check Wikipedia if you need to refresh).

In basic Monte Carlo integration, points are sampled from a uniform distribution p(x) over the integration region and used to form an estimate of the integral of f(x). Convergence can be slow (or appear to stall entirely) if most of the sampled points happen to land where f(x) contributes little to the integral.
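
As a concrete baseline, here is a minimal plain Monte Carlo sketch in Python; the integrand f and the interval are made-up illustrations of the "flat with a sharp peak" situation, not anything from the thread:

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    # Illustrative integrand (not from the thread): nearly flat,
    # with one sharp peak at x = 0.5.
    return 1.0 + 50.0 * np.exp(-((x - 0.5) ** 2) / (2 * 0.01 ** 2))

a, b = 0.0, 1.0           # integration interval
n = 100_000
x = rng.uniform(a, b, n)  # uniform sampling, i.e. p(x) = 1/(b - a)
fx = f(x)

# Plain Monte Carlo: integral ~ (b - a) * <f>, error ~ (b - a) * sigma / sqrt(n)
print((b - a) * fx.mean(), "+/-", (b - a) * fx.std(ddof=1) / np.sqrt(n))
```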

Using importance sampling, we instead pick a distribution p(x) that concentrates the sampled points in the regions of x that contribute most to the integral of f(x). This helps convergence, but it requires that we know a good p(x), and finding p(x) is the big problem.

As was mentioned above, the optimal choice is p(x) = |f(x)|/int(|f(x)|dx): this is the density function which minimizes the variance of the estimate of the integral. At first inspection this looks like a somewhat useless fact, since it involves integrating f(x), which is what we set out to do in the first place, or knowing p(x) beforehand, which we don't. Fortunately, the VEGAS algorithm can help us estimate p(x).
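
To see what importance sampling buys you before any adaptivity, here is a sketch that samples from a hand-picked density concentrated near the peak; the mixture weights and the peak width are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

def f(x):
    # Same illustrative integrand as before.
    return 1.0 + 50.0 * np.exp(-((x - 0.5) ** 2) / (2 * 0.01 ** 2))

def peak_pdf(x, loc=0.5, scale=0.02):
    # Normal density centered on the peak; with these parameters its mass
    # outside [0, 1] is negligible, so we skip explicit truncation.
    return np.exp(-((x - loc) ** 2) / (2 * scale ** 2)) / (scale * np.sqrt(2 * np.pi))

n = 100_000
# 50/50 mixture of Uniform(0, 1) and the normal around the peak.
from_peak = rng.random(n) < 0.5
x = np.where(from_peak, rng.normal(0.5, 0.02, n), rng.uniform(0.0, 1.0, n))
p = 0.5 * 1.0 + 0.5 * peak_pdf(x)   # mixture density, positive on all of [0, 1]

w = f(x) / p                         # importance-sampling weights
print(w.mean(), "+/-", w.std(ddof=1) / np.sqrt(n))
```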

The VEGAS algorithm determines p(x) adaptively by refining a grid. We divide the integration interval into N subregions and estimate the average function value f_avg in each one. Once we have f_avg for each interval, we form a weight for each interval, w_i = f_avg_i / sum_j(f_avg_j), where i = 1:N is the interval index. Multiplying each weight by a constant K (say K = 1000) gives the number of subregions to divide interval i into.
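
A sketch of this weighting step, with N, K, and the per-interval sample count chosen arbitrarily for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)

def f(x):
    # Same illustrative integrand as before.
    return 1.0 + 50.0 * np.exp(-((x - 0.5) ** 2) / (2 * 0.01 ** 2))

N = 10                 # number of (initially equal-width) intervals
K = 1000               # total number of subregions to distribute
edges = np.linspace(0.0, 1.0, N + 1)

# Estimate the average |f| in each interval from a handful of samples.
f_avg = np.empty(N)
for i in range(N):
    xs = rng.uniform(edges[i], edges[i + 1], 100)
    f_avg[i] = np.abs(f(xs)).mean()

weights = f_avg / f_avg.sum()                        # w_i = f_avg_i / sum_j f_avg_j
m = np.maximum(1, np.rint(K * weights).astype(int))  # subregions per interval
print(weights.round(3))
print(m)   # peaked intervals get many subregions, flat ones get few
```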

Now we can update p(x). Initially p(x) is uniform, so x has probability 1/N of falling in each interval. After subdividing each of the N intervals we are left with a large number of small subintervals, and we want to get back to N intervals of equal probability. So we redistribute the subintervals (don't rearrange them, just count them from left to right) into N new intervals that each contain the same number of subintervals. What changes about the N intervals is their width: initially every interval has the same width, but after the redistribution the regions with higher weights (i.e. that contribute more to the integral) get NARROWER intervals and the regions of lower probability get WIDER ones (this takes some careful bookkeeping to make sure the correct widths are used). [Consider the CDF of a function, with x on the horizontal axis and cdf(x) on the vertical axis, 0 <= cdf(x) <= 1. If two intervals on the x-axis are equally likely, the shorter one must give each value in its range a higher chance of being observed.]
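
A sketch of the redistribution step; the subregion counts m are made-up values of the kind the weighting step above might produce:

```python
import numpy as np

# `edges` are the current (equal-width) interval edges and m[i] is the
# number of subregions assigned to interval i (illustrative values).
N = 10
edges = np.linspace(0.0, 1.0, N + 1)
m = np.array([5, 8, 20, 120, 350, 350, 120, 20, 8, 5])

# All subinterval boundaries: interval i is split into m[i] equal pieces.
sub_edges = np.concatenate(
    [np.linspace(edges[i], edges[i + 1], m[i], endpoint=False) for i in range(N)]
    + [edges[-1:]]
)

# Walk left to right and cut after every (total / N) subintervals, so each of
# the N new intervals holds the same number of subintervals (= probability).
total = m.sum()
idx = np.rint(np.arange(N + 1) * total / N).astype(int)
new_edges = sub_edges[idx]
print(np.diff(new_edges).round(4))  # narrow bins where the weights were large
```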

Once the new interval widths are known, the CDF of x can be constructed. We can now draw points uniformly on [0, 1] with a standard random number generator (the Mersenne Twister is a good choice if you can afford it), push them through the inverse CDF, and voila: we have points sampled from the updated (i.e. nonuniform) distribution p(x). Of course, we can keep refining the grid by repeating the process of estimating weights, dividing into subincrements, etc., approaching the optimal case in which each interval contains the same number of subintervals.
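
A sketch of the sampling step; since each of the N intervals carries probability 1/N, the inverse CDF is piecewise linear and easy to apply directly (the bin edges below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)

# Equal-probability bin edges such as the rebinning step might produce
# (illustrative values: narrow near the peak at 0.5, wide in the flat parts).
new_edges = np.array([0.0, 0.15, 0.32, 0.45, 0.49, 0.50,
                      0.51, 0.55, 0.68, 0.85, 1.0])
N = len(new_edges) - 1   # each bin carries probability 1/N

u = rng.random(100_000)                        # uniform on [0, 1)
k = np.minimum((u * N).astype(int), N - 1)     # which bin u lands in
frac = u * N - k                               # position within that bin
x = new_edges[k] + frac * (new_edges[k + 1] - new_edges[k])

# The density of these samples is piecewise constant: p(x) = 1 / (N * width_k)
p = 1.0 / (N * (new_edges[k + 1] - new_edges[k]))
```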

Instead of a brute-force Monte Carlo using a uniform distribution, we can now estimate the integral by sampling from our adaptive p(x). Notice that by sampling from p(x) I don't need to divide by p(x) when I estimate the integral (the notation given in the literature often makes it confusing what needs to be done in implementation). With enough computational horsepower, you should be able to evaluate a very wide variety of integrals.

While the description I've given is one-dimensional, this sort of algorithm has no business estimating most integrals in fewer than about four dimensions; deterministic quadrature methods are much faster there. Also, VEGAS has been around since 1977 and newer methods are available (e.g. the Cuba package); MCMC is also very popular.

You can read all about the algorithm at http://bloodaxe.phyast.pitt.edu/CompPhys/vegas.pdf

GB
 
  • #3
blayzinblues said:
Instead of a brute-force Monte Carlo using a uniform distribution, we can now estimate the integral by sampling from our adaptive p(x). Notice that by sampling from p(x) I don't need to divide by p(x) when I estimate the integral

This is incorrect - please excuse the error and ignore those statements. Once points are sampled from the new distribution, their corresponding pdf values must also be calculated; then the accumulators S1, S2, etc., and finally the integral can be evaluated. See the link in the original response to learn how to calculate those.
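
A minimal sketch of the corrected estimator, reusing the illustrative equal-probability bin edges and integrand from the sketches above; the key line divides by the pdf value of each sampled point:

```python
import numpy as np

rng = np.random.default_rng(4)

def f(x):
    # Same illustrative integrand as in the sketches above.
    return 1.0 + 50.0 * np.exp(-((x - 0.5) ** 2) / (2 * 0.01 ** 2))

# Illustrative equal-probability bin edges from the adapted grid.
new_edges = np.array([0.0, 0.15, 0.32, 0.45, 0.49, 0.50,
                      0.51, 0.55, 0.68, 0.85, 1.0])
N = len(new_edges) - 1

n = 100_000
u = rng.random(n)
k = np.minimum((u * N).astype(int), N - 1)
x = new_edges[k] + (u * N - k) * (new_edges[k + 1] - new_edges[k])
p = 1.0 / (N * (new_edges[k + 1] - new_edges[k]))   # pdf of each sample

# The estimator averages f(x)/p(x): the division by p(x) is essential.
w = f(x) / p
print(w.mean(), "+/-", w.std(ddof=1) / np.sqrt(n))
```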
 

Frequently asked questions

1. How does the VEGAS algorithm work?

The VEGAS algorithm uses adaptive Monte Carlo importance sampling to compute integrals of functions with localized peaks. It divides the integration region into subregions and iteratively adapts a grid so that sample points concentrate where the integrand contributes most; the function values at those points are then combined into an estimate of the integral and its statistical error.
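
For concreteness, here is a toy one-dimensional VEGAS-style loop that strings together the pieces described earlier in the thread (sampling, estimating, rebinning). It is a bare-bones illustration, not Lepage's production algorithm (in particular it omits the damping of the grid adjustment and any stratification):

```python
import numpy as np

rng = np.random.default_rng(5)

def f(x):
    # Illustrative integrand: flat background plus a sharp peak at x = 0.5.
    return 1.0 + 50.0 * np.exp(-((x - 0.5) ** 2) / (2 * 0.01 ** 2))

N, K = 50, 1000            # grid intervals, subregions to distribute
n_iter, n_samp = 10, 20_000

edges = np.linspace(0.0, 1.0, N + 1)   # start from a uniform grid

for it in range(n_iter):
    widths = np.diff(edges)

    # Sample from the current piecewise-constant p(x): each bin carries
    # probability 1/N, so pick a bin uniformly, then a point inside it.
    u = rng.random(n_samp)
    k = np.minimum((u * N).astype(int), N - 1)
    x = edges[k] + (u * N - k) * widths[k]
    p = 1.0 / (N * widths[k])

    w = f(x) / p           # importance-sampling weights
    print(f"iter {it}: {w.mean():.4f} +/- {w.std(ddof=1) / np.sqrt(n_samp):.4f}")

    # Adapt the grid: weight each bin by its estimated |f| mass, split bins
    # into subregions in proportion, then re-cut into N bins that each hold
    # the same number of subregions.
    f_avg = np.array([np.abs(f(x[k == i])).mean() if np.any(k == i) else 0.0
                      for i in range(N)])
    mass = f_avg * widths
    m = np.maximum(1, np.rint(K * mass / mass.sum()).astype(int))
    sub = np.concatenate([np.linspace(edges[i], edges[i + 1], m[i],
                                      endpoint=False) for i in range(N)]
                         + [edges[-1:]])
    idx = np.rint(np.arange(N + 1) * m.sum() / N).astype(int)
    edges = sub[idx]
```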

2. Can the VEGAS algorithm handle functions with multiple peaks?

Yes, the VEGAS algorithm is designed to handle functions with multiple localized peaks. It uses an adaptive grid to efficiently sample points within each peak, resulting in accurate integral estimates.

3. How accurate is the VEGAS algorithm compared to other methods?

The VEGAS algorithm is known for its accuracy and efficiency on peaked integrands, which makes it a popular choice for this kind of problem. On such integrands it can strongly outperform plain (uniform-sampling) Monte Carlo, and in higher dimensions it remains practical where deterministic rules such as Gauss-Kronrod are not.

4. What are the limitations of the VEGAS algorithm?

The VEGAS algorithm may not perform well for highly oscillatory integrands, and because its adaptive grid factorizes along the coordinate axes, it adapts poorly to peaks that are not axis-aligned (e.g. a ridge along a diagonal). It also requires a significant number of function evaluations, which can be costly for expensive integrands.

5. Are there any recommended settings for using the VEGAS algorithm?

For optimal performance, it is recommended to use a large number of samples and to tune the algorithm's parameters, such as the number of grid subdivisions and the number of adaptation iterations, to the characteristics of the function being integrated.
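
In practice it is usually easier to reach for an existing implementation, e.g. Lepage's vegas package on PyPI. A minimal usage sketch, assuming the package's documented interface (the parameter values here are arbitrary):

```python
import numpy as np
import vegas  # pip install vegas

def f(x):
    # vegas passes x as an array with one entry per dimension.
    return 1.0 + 50.0 * np.exp(-((x[0] - 0.5) ** 2) / (2 * 0.01 ** 2))

integ = vegas.Integrator([[0.0, 1.0]])    # one-dimensional region [0, 1]

integ(f, nitn=5, neval=10_000)            # warm-up: adapt the grid, discard
result = integ(f, nitn=10, neval=10_000)  # production run on the adapted grid
print(result.summary())
print("I =", result.mean, "+/-", result.sdev)
```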
