Using VEGAS Algorithm to Compute Integral of Flat Function with Localized Peaks

SUMMARY

The discussion focuses on using the VEGAS algorithm to compute the integral of a function characterized by localized peaks amidst a flat background. The density function is defined as p(x) = |f(x)|/int(|f(x)|dx), which is crucial for importance sampling. Key steps include calculating the integral using Monte Carlo methods, updating the density function, and refining the mesh for integration. The conversation highlights challenges in implementing these steps and suggests that while VEGAS is useful, alternative methods like the Cuba package or MCMC may offer better performance.

PREREQUISITES
  • Understanding of Monte Carlo integration techniques
  • Familiarity with the VEGAS algorithm and its application
  • Knowledge of importance sampling and density functions
  • Basic proficiency in numerical methods for integration
NEXT STEPS
  • Study the implementation of the VEGAS algorithm in computational physics
  • Learn about the Cuba package for multidimensional integration
  • Explore advanced Monte Carlo methods, particularly Markov Chain Monte Carlo (MCMC)
  • Investigate techniques for adaptive sampling and mesh refinement in numerical integration
USEFUL FOR

Researchers and practitioners in computational physics, mathematicians focusing on numerical analysis, and anyone involved in optimizing integration techniques for functions with localized features.

wuliwong
I am attempting to compute the integral of a function that is mostly flat relative to a few localized peaks. These peaks are the bulk of the contribution to the integral. I decided to try the VEGAS algorithm to do this. I am running into some difficulty, so I am reaching out to YOU physicsforums.com.

One of the main areas that I am having trouble with is using the density function. The density function is p(x) = |f(x)|/int(|f(x)|dx).

This is how I THINK this is supposed to work.

1. Calculate int(|f(x)|dx) using regular Monte-Carlo.
2. Use this result to calculate p(x) = |f(x)|/int(|f(x)|dx).
3. Calculate int((f(x)/p(x))*p(x)dx).
3(a). This integral represents the first problem for me. I know I am to use p(x) to create a new "mesh" to integrate over; let's call it xp. xp is some function of p(x) and x.
3(b). For my other issue, assume I have figured out how to calculate xp. Is the integral in step 3 calculated like this:

int((f(x)/p(x))*p(x)dx) = (1/N)*sum((f_i/g_i)*dxp_i) ?

And finally, assuming that I have learned all of the above and can approximate the integral in 3, how do I extract the original integral I wanted to calculate?

Also, I'm open to other suggestions too. I'm not married to this VEGAS algorithm, not by a long shot.

Thanks in advance.
 

This thread is a bit old but still unanswered so I'll give it a go to try and eliminate some confusion... or perhaps create more. I'm going to assume a basic knowledge of Monte Carlo integration (but check Wikipedia if you need to refresh).

In basic Monte Carlo integration, data points are sampled from a uniform distribution over x and then used to obtain an estimate of the integral of f(x). This can lead to slow convergence, or possibly no convergence at all, if the sampled points happen not to contribute substantially to the integral of f(x).

Using importance sampling, we can pick a distribution of x, p(x), which will focus our sampled points in the region of x that contributes most to the integral of f(x). This should help convergence but requires that we know what p(x) is. Unfortunately, finding p(x) is a big problem.
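To make the contrast concrete, here is a minimal sketch (my own example, not from the thread) of plain Monte Carlo versus importance sampling on a hypothetical flat-plus-peak integrand, using a hand-picked mixture density rather than an adaptive one:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy integrand (hypothetical): flat background plus one narrow peak on [0, 1].
# Exact integral ≈ 0.1 + sqrt(2*pi)*0.01 ≈ 0.12507.
def f(x):
    return 0.1 + np.exp(-0.5 * ((x - 0.5) / 0.01) ** 2)

N = 200_000

# Plain Monte Carlo: sample x uniformly and average f(x).
x_u = rng.uniform(0.0, 1.0, N)
I_plain = f(x_u).mean()

# Importance sampling with a "defensive" mixture density:
# p(x) = 0.5 * Uniform(0,1) + 0.5 * Normal(0.5, 0.02) (truncation negligible).
mu, sig = 0.5, 0.02
use_peak = rng.uniform(size=N) < 0.5
x_p = np.where(use_peak, rng.normal(mu, sig, N), rng.uniform(0.0, 1.0, N))
norm_pdf = np.exp(-0.5 * ((x_p - mu) / sig) ** 2) / (sig * np.sqrt(2 * np.pi))
p = 0.5 * 1.0 + 0.5 * norm_pdf
I_is = (f(x_p) / p).mean()  # E_p[f/p] equals the integral of f
```

The uniform component in the mixture is a safety net: it keeps f(x)/p(x) bounded in the flat background, where a purely peak-shaped p(x) would produce rare but enormous weights.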

As was mentioned above, p(x) = |f(x)|/int(|f(x)|dx). This is the density function which minimizes the variance of the estimate of the integral. At first inspection this appears to be a somewhat useless fact, since it involves either integrating f(x), which is what we set out to do in the first place, or knowing p(x) beforehand, which we don't. Fortunately, the VEGAS algorithm can help us estimate p(x).

The VEGAS algorithm adaptively determines p(x) using a technique called stratified sampling. We divide the integration interval into N subregions and estimate the average function value f_avg for each. Once we have f_avg for each interval, we can create a weighting for each interval by computing f_avg_i/sum(f_avg_i), where i = 1:N is the interval index. Multiply each weight by a constant K (say K = 1000) to get the number of subregions to divide each interval into.
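A sketch of that weighting step, with a toy integrand and bin counts of my own choosing:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical integrand: flat background plus a narrow peak at x = 0.5.
def f(x):
    return 0.1 + np.exp(-0.5 * ((x - 0.5) / 0.01) ** 2)

N_bins = 50          # number of equal-width intervals
K = 1000             # total number of subregions to distribute
samples_per_bin = 20

edges = np.linspace(0.0, 1.0, N_bins + 1)

# Estimate the average |f| in each interval from a few uniform samples.
f_avg = np.empty(N_bins)
for i in range(N_bins):
    xs = rng.uniform(edges[i], edges[i + 1], samples_per_bin)
    f_avg[i] = np.abs(f(xs)).mean()

# Weight per interval, then the number of subregions each interval receives.
weights = f_avg / f_avg.sum()
m = np.maximum(1, np.round(K * weights).astype(int))  # at least 1 each
```

Intervals covering the peak end up with the most subregions, which is what drives the grid refinement described next.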

Now we can update p(x). Initially p(x) is uniform, so x has a probability of 1/N of falling in each interval. After subdividing each of the N intervals, we have a large collection of small subintervals, and we want to return to having N intervals of equal probability. Redistribute the subintervals (don't rearrange them, just count them from left to right) so that we end up with N intervals each containing the same number of subintervals. What changes about the N intervals is their width: initially every interval has the same width, but after redistributing the subintervals, the regions with higher weights (i.e. that contribute more to the integral) end up NARROWER and the regions of lower probability end up WIDER (this requires some careful bookkeeping to make sure the correct widths are used). [Consider the CDF of a function, with 'x' on the horizontal axis and cdf(x) on the vertical axis, 0 <= cdf(x) <= 1. If two intervals on the x-axis are equally likely but one is shorter, then each value in the shorter interval is more likely to be observed.]
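One way to do the redistribution is to treat the subregion counts as a piecewise-linear CDF and invert it at the N equal-probability levels; here is a small sketch with made-up counts m_i:

```python
import numpy as np

# Suppose a previous pass assigned these subregion counts m_i to N equal-width
# intervals on [0, 1] (hypothetical numbers: the middle bins dominate).
m = np.array([5, 5, 10, 40, 30, 5, 5], dtype=float)
N = len(m)
old_edges = np.linspace(0.0, 1.0, N + 1)

# Piecewise-linear CDF: interval i carries probability m_i / sum(m).
cdf = np.concatenate([[0.0], np.cumsum(m) / m.sum()])

# New edges: invert the CDF at j/N, j = 0..N.  Intervals with large m_i
# (large contribution) come out NARROWER, so they get sampled more often
# per unit length.
targets = np.linspace(0.0, 1.0, N + 1)
new_edges = np.interp(targets, cdf, old_edges)
```

Each new interval now carries probability exactly 1/N, and the narrowest new intervals sit where the old counts were largest.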

Once the new interval widths are known, the CDF of x can be constructed. We can now sample points on [0, 1] from a uniform distribution using a standard random number generator (the Mersenne Twister is a good choice if you can afford it), feed them into the inverse CDF, and voila: we have points sampled from the updated (i.e. nonuniform) distribution p(x). Of course, we can keep refining the grid by repeating the process of estimating weights, dividing into subincrements, etc., until in the optimal case each interval contains the same number of subintervals.
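A sketch of that sampling step, assuming a hypothetical set of adapted interval edges:

```python
import numpy as np

rng = np.random.default_rng(2)

# Adapted grid: N intervals of EQUAL probability 1/N but unequal width
# (hypothetical edges; the narrow intervals sit where f contributes most).
edges = np.array([0.0, 0.35, 0.46, 0.51, 0.56, 0.63, 0.69, 1.0])
N = len(edges) - 1

# The CDF is piecewise linear, rising by exactly 1/N across each interval.
cdf_vals = np.linspace(0.0, 1.0, N + 1)

# Sample u ~ Uniform(0,1) and push it through the inverse CDF.
u = rng.uniform(0.0, 1.0, 100_000)
x = np.interp(u, cdf_vals, edges)

# The density inside interval i is (1/N) / width_i, so narrow intervals
# receive proportionally more samples per unit length.
```

Each interval catches about 1/N of the samples regardless of its width, which is exactly the equal-probability property the rebinning was designed to produce.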

Instead of a brute-force Monte Carlo using a uniform distribution, we can estimate the integral by instead sampling from our adaptive p(x). Notice that by sampling from p(x) I don't need to divide by p(x) when I estimate the integral (the notation given in the literature often makes it confusing what needs to be done in implementation). With enough computational horsepower, you should be able to evaluate a very wide variety of integrals.

While the description I've given is for a single dimension, this sort of algorithm has no business estimating most integrals in fewer than about 4 dimensions; there are much faster deterministic methods for those. Also, VEGAS has been around since 1977 and newer methods are available (e.g. the Cuba package). MCMC is also very popular.

You can read all about the algorithm at http://bloodaxe.phyast.pitt.edu/CompPhys/vegas.pdf

GB
 
blayzinblues said:
Instead of a brute-force Monte Carlo using a uniform distribution, we can estimate the integral by instead sampling from our adaptive p(x). Notice that by sampling from p(x) I don't need to divide by p(x) when I estimate the integral

This is incorrect - please excuse the error and ignore those statements. Once points are sampled from the new distribution, their corresponding pdf values must also be calculated. Then S1, S2, etc., and finally the integral can be evaluated. See the link in the original response to learn how to calculate those.
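Putting the correction into code: after sampling from the adapted p(x), you evaluate p at each sampled point and average f/p. A sketch with a hypothetical adapted grid and toy integrand:

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy integrand (hypothetical); exact integral ≈ 0.12507.
def f(x):
    return 0.1 + np.exp(-0.5 * ((x - 0.5) / 0.01) ** 2)

# Hypothetical adapted grid: equal-probability intervals, narrow near the peak.
edges = np.array([0.0, 0.30, 0.47, 0.49, 0.51, 0.53, 0.70, 1.0])
N = len(edges) - 1
widths = np.diff(edges)

# Sample via the inverse of the piecewise-linear CDF.
u = rng.uniform(0.0, 1.0, 100_000)
cdf_vals = np.linspace(0.0, 1.0, N + 1)
x = np.interp(u, cdf_vals, edges)

# Density of the sampled points: p(x) = (1/N) / width_i inside interval i.
idx = np.clip(np.searchsorted(edges, x, side="right") - 1, 0, N - 1)
p = (1.0 / N) / widths[idx]

# The estimator DOES divide by p: I ≈ (1/M) * sum over j of f(x_j) / p(x_j).
I_est = (f(x) / p).mean()
```

Because p is piecewise constant on the grid, evaluating it at the sampled points only requires looking up which interval each point landed in.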
 
