
Numerical integration, use given random distribution as integration points

  1. Apr 5, 2012 #1
    Hi folks,

    I need to evaluate (numerically) a multi-dimensional integral of the form
    [itex]\int_A f(x) dx[/itex].

    Now in my application, I already have points inside the domain A which are distributed like [itex]\frac{f(x)}{\int_A f(x) dx}[/itex]. So I hoped I could use these random points in some importance-sampling Monte Carlo integration technique, since this would not incur any additional computational cost.
    However, using these random points for Monte Carlo integration doesn't seem so easy: every idea that came to my mind needs the value of the normalizing constant [itex]\int_A f(x) dx[/itex]. Obviously, I don't have this value, since it is exactly what I want to compute.
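    To see the circularity concretely, here is a toy 1-D sketch (the Gaussian f is only an illustration, not my application): if the points follow p(x) = f(x)/Z with Z = ∫ f(x) dx, then the importance-sampling weight f(x)/p(x) is identically Z, so the estimator presupposes exactly the constant I am after.

```python
import math

def f(x):
    """Unnormalized density (toy example: a Gaussian bump)."""
    return math.exp(-x**2 / 2)

Z = math.sqrt(2 * math.pi)   # normalizing constant of this toy f

def p(x):
    """The density my sample points actually follow: f / Z."""
    return f(x) / Z

# The importance-sampling weight f(x)/p(x) equals Z at every point,
# so any estimator built from these weights already requires Z.
for x in (-1.7, 0.0, 0.3, 2.5):
    assert math.isclose(f(x) / p(x), Z)
```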

    Does anybody have an idea?

  3. Apr 5, 2012 #2


    Science Advisor

    How did you get the random points without knowing f(x)?
  4. Apr 7, 2012 #3
    There are computer programs that can do this, but they'd probably need more data than the random points you have. I think they build cubic polynomial interpolation functions for ranges between your data points and then numerically integrate that function. That way, the program has some control over the accuracy of the results. Otherwise, your results might be so inaccurate as to be meaningless.

    For example, consider the function f = sin(x).
    Say you want to numerically integrate this function from x = 0 to x = π/2 but have only three data points for f on this range:
    (0, 0), (π/4, (√2)/2), and (π/2, 1).
    A simple one-segment trapezoidal approach, using only the endpoints, yields (π/2)(0 + 1)/2 ≈ 0.785, i.e. an answer like 0.8 ± 0.3, which has so much error in it as to be meaningless.
    So most numerical integration routines for problems such as yours would create an interpolating function for the points between the known values. The program can then evaluate function values as many times and at as many places as necessary to compute an answer to the desired accuracy. That way, you'd get an answer much closer to the true value of 1--something you could actually use.
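    A pure-Python sketch of this comparison, using the three sample points from the sin example above (integrating the quadratic interpolant through three points is just Simpson's rule):

```python
import math

# Three known samples of f(x) = sin(x) on [0, pi/2].
xs = [0.0, math.pi / 4, math.pi / 2]
ys = [math.sin(x) for x in xs]

# One-segment trapezoid: uses only the two endpoints.
trap = (xs[2] - xs[0]) * (ys[0] + ys[2]) / 2                 # ~0.785

# Integrating the quadratic interpolant through all three points
# is Simpson's rule -- far closer to the true value of 1.
simpson = (xs[2] - xs[0]) / 6 * (ys[0] + 4 * ys[1] + ys[2])  # ~1.0023

print(trap, simpson)
```

    Note that the interpolation step recovers most of the accuracy without a single new evaluation of f.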

    Do you have access to a commercial product like MATLAB?
    Or do you have to write your own program?
  5. Apr 7, 2012 #4


    Staff Emeritus
    Science Advisor
    Gold Member


    The idea is that you do a random walk in the space, where the probability distribution of where to walk next is based on f(x). This lets you sample according to the global distribution of f(x) surprisingly quickly.
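    A minimal sketch of such a random walk (the Metropolis algorithm, with a 1-D toy f chosen only for illustration):

```python
import math
import random

def f(x):
    """Unnormalized target density (toy example: exp(-x^2/2))."""
    return math.exp(-x**2 / 2)

def metropolis(n_steps, step=1.0, x0=0.0, seed=0):
    """Random walk whose accept/reject rule uses only ratios of f,
    so the (unknown) normalizing constant never appears."""
    rng = random.Random(seed)
    x, samples = x0, []
    for _ in range(n_steps):
        proposal = x + rng.uniform(-step, step)
        # Accept with probability min(1, f(proposal)/f(x)).
        if rng.random() < f(proposal) / f(x):
            x = proposal
        samples.append(x)
    return samples

samples = metropolis(50000)[5000:]   # discard burn-in
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
```

    The key point is that only the ratio f(proposal)/f(current) appears, which is exactly why MCMC can sample from f/Z without ever knowing Z.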
  6. Apr 10, 2012 #5
    Well, I know f(x), but I don't know [itex]\int_A f(x) dx[/itex]

    Yes, if I wanted to calculate an integral like [itex]\int g(x)\frac{f(x)}{\int_A f(y) dy} dx[/itex], then I could use Markov Chain Monte Carlo. However, I want to calculate [itex]\int_A f(x) dx[/itex] by using previously obtained random points (distributed like [itex]\frac{f(x)}{\int_A f(x) dx}[/itex]) as sampling points.


    @DuncanM: It's a high-dimensional integration problem, so I think there is no way around Monte Carlo integration. What do you think?
    Last edited: Apr 10, 2012
  7. Apr 10, 2012 #6


    Science Advisor

    This doesn't make sense.

    If you have an analytic expression (or can split it into analytic pieces that are mutually exclusive but together exhaust the region A), then all you have to do is a normal integration.

    From what you have written, it seems that all you need to find is the cumulative distribution function for some distribution on the region A, and you said that you know f(x).

    If this is the case, then this is just a standard integration. You might have to use a numerical integration scheme, but it's a classic integration nevertheless. Pick a resolution that's good enough and then use this to get the answer.

    From my observations, I think what you are trying to do is basically create a censored distribution, where you only have random variables that have been obtained (from simulation or otherwise) within the region A, and you want to normalize this so that you get the right distribution in the context of your problem.

    Finding the density of the region A for a specific and known f(x) in any dimension does not require any simulation at all: it requires at most a sophisticated numerical integration routine to get a density that is good enough for approximation purposes, if an analytic expression can't be found.

    Maybe you could tell us what f(x) actually corresponds to.
  8. Apr 10, 2012 #7
    It could be any distribution, but let's assume it is the Boltzmann distribution.

    Let me rephrase my problem:

    From a Markov Chain Monte Carlo simulation, I already have points that are distributed like [itex]\frac{f(x)}{\int_A f(x) dx}[/itex]. Now I want to calculate the normalization [itex]\frac{1}{\int_A f(x) dx}[/itex] explicitly, in a way that is computationally as cheap as possible. I'm happy with any method that can do this (keep in mind that this is a high-dimensional integral). My idea was that, since I already have points in the important regions of the integrand, it might be useful to use these; however, it is not clear how to do that.
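    For what it's worth, one standard trick (sometimes called reciprocal importance sampling) can estimate the normalization from samples you already have: pick any instrumental function g with a known integral, since E[g(X)/f(X)] = (∫g)/Z when X ~ f/Z. Below is a 1-D sketch with g the indicator of a small box; exact normal draws stand in for the MCMC samples, and the Gaussian f is a toy choice, not from this thread.

```python
import math
import random

def f(x):
    return math.exp(-x**2 / 2)   # unnormalized target; true Z = sqrt(2*pi)

rng = random.Random(0)
# Stand-in for MCMC output: draws distributed like f(x)/Z.
samples = [rng.gauss(0.0, 1.0) for _ in range(200000)]

# g = indicator of the box [-0.5, 0.5]; its integral is the box volume 1.0.
a, b = -0.5, 0.5
vol_g = b - a

# E[g(X)/f(X)] = vol_g / Z   =>   Z ~= vol_g * N / sum_{x in box} 1/f(x)
acc = sum(1.0 / f(x) for x in samples if a <= x <= b)
Z_hat = vol_g * len(samples) / acc

print(Z_hat, math.sqrt(2 * math.pi))   # estimate vs. exact value
```

    In high dimensions the crux is choosing a g whose support the chain visits often; with a poorly placed g the estimator's variance blows up, so this is a sketch of the idea rather than a recipe.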
  9. Apr 10, 2012 #8


    Science Advisor

    So do you acknowledge that any numerical integrator can do this on a finite-dimensional domain? In terms of optimizations, this is going to depend more on f(x) than anything else.

    Another question is how often this needs to be computed. If it only needs to be computed once, then it should not be a big issue in the context of things.

    Also, if it needs to be computed multiple times for different regions, how does the region change over time? If the change in regions is purely local, then you can partition your integral into n-dimensional cubes and just calculate the deltas of the region, rather than having to recompute the entire region from scratch.
  10. Apr 10, 2012 #9
    Theoretically yes, but in practice most integration schemes are computationally infeasible, since the required number of sampling points grows exponentially with the dimension.
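    The scaling is easy to make concrete: a grid with m points per axis needs m^d evaluations, while plain Monte Carlo error shrinks like N^(-1/2) regardless of d. A quick illustration:

```python
# Grid-based quadrature: points needed with m = 20 nodes per axis.
m = 20
for d in (2, 6, 10):
    print(f"d = {d:2d}: {m**d:.3e} grid points")
# d =  2: 4.000e+02
# d =  6: 6.400e+07
# d = 10: 1.024e+13  -- hopeless, while MC error ~ N**-0.5 in any dimension
```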
  11. Apr 10, 2012 #10


    Science Advisor

    So is the region changing, and if so, how? Also, what accuracy do you need, and based on that, can you for example cut off redundant tails with respect to the approximation requirements?
  12. Apr 10, 2012 #11
    No, the region of integration is not changing over time. The accuracy should be as high as possible; however, I'm not sure how good it has to be in order to obtain useful results in the end.

    What do you mean by "redundant tails"?
  13. Apr 10, 2012 #12


    Science Advisor

    Basically, areas where the distribution is so low that they can be ignored completely, thus speeding up the integration.

    As an example, think of a standard normal where you are, say, 5+ standard deviations from the x = 0 line. Depending on your application, it might be OK to just skip that region altogether.
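    For the standard-normal example, the mass beyond 5 standard deviations can be checked in one line via the complementary error function:

```python
import math

# Two-sided tail mass of a standard normal beyond +-5 sigma:
# P(|X| > 5) = erfc(5 / sqrt(2))
tail = math.erfc(5 / math.sqrt(2))
print(tail)   # ~5.7e-7: negligible for most accuracy requirements
```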
  14. Apr 10, 2012 #13
    Ah, ok. These redundant regions are one reason why I want to use the random points which are distributed according to [itex]\frac{f(x)}{\int_A f(x) dx}[/itex]. (A second reason is that evaluating f(x) is very expensive, so reusing already computed sample values is desirable.)