Numerical integration, use given random distribution as integration points

In summary, the speaker is trying to find an efficient way to calculate the normalization factor \frac{1}{\int_A f(x) dx} for a high dimensional integral, using previously obtained points from a Markov Chain Monte Carlo simulation. They are open to any method that can accomplish this, but have not found a clear way to do so. It is suggested that the optimization will depend on the specific distribution f(x), and if the calculation only needs to be done once, a more expensive but precise numerical integrator could be used.
  • #1
Derivator
149
0
Hi folks,

I need to evaluate (numerically) a multi-dimensional integral of the form
[itex]\int_A f(x) dx[/itex].

Now in my application, I already have points inside the domain A which are distributed like [itex]\frac{f(x)}{\int_A f(x) dx}[/itex]. So I hoped I could use these random points in some importance sampling monte carlo integration technique, since this would not result in any additional computational cost.
However, using these random points for monte carlo integration doesn't seem so easy, since all ideas that came to my mind need the value of the normalizing constant [itex]\int_A f(x) dx[/itex]. Obviously, I don't have this value, since this is exactly what I want to compute.

Does anybody have an idea?derivator
 
Physics news on Phys.org
  • #2
How did you get the random points without knowing f(x)?
 
  • #3
There are computer programs that can do this, but they'd probably need more data than the random points you have. I think they build cubic polynomial interpolation functions for ranges between your data points and then numerically integrate that function. That way, the program has some control over the accuracy of the results. Otherwise, your results might be so inaccurate as to be meaningless.

For example, consider the function f = sin(x)
Say you want to numerically integrate this function from x = 0 to x = ∏/2 but have only three data points for f on this range:
(0,0), (∏/4, (√2)/2), and (∏/2, 1).
A simple one-segment trapezoidal approach might yield an answer like 0.8 ± 0.3, which has so much error in it as to be meaningless.
So most numerical integration routines for problems such as yours would create an interpolating function for points between the known values. The program would then be able to evaluate function values as many times and at as many places as necessary to compute an answer to the desired accuracy. That way, you'd get an answer like 1 ± 1e-14, which is much more accurate--something you could actually use.

Do you have access to a commercial product like MATLAB?
Or do you have to write your own program?
 
  • #4
http://en.wikipedia.org/wiki/Markov_chain_Monte_Carlo

The idea is you do a random walk in space where your probability distribution of where to walk next is based on f(x). This allows you to sample according to the global distribution of f(x) in a surprisingly fast period of time.
 
  • #5
mathman said:
How did you get the random points without knowing f(x)?
Well, I know f(x), but I don't know [itex]\int_A f(x) dx[/itex]

Office_Shredder said:
http://en.wikipedia.org/wiki/Markov_chain_Monte_Carlo

The idea is you do a random walk in space where your probability distribution of where to walk next is based on f(x). This allows you to sample according to the global distribution of f(x) in a surprisingly fast period of time.

Yes, if I wanted to calculate an integral like [itex]\int g(x)\frac{f(x)}{\int_A f(y) dy} dx[/itex], then I could use Markov Chain Monte Carlo. However, I want to calculate [itex]\int_A f(x) dx[/itex] by using previously obtained random points (distributed like [itex]\frac{f(x)}{\int_A f(x) dx}[/itex]) as sampling points.derivator

@DuncanM: It's a high dimensional integration problem, so I think there is now way around Monte Carlo Integration. What do you think?
 
Last edited:
  • #6
Derivator said:
Well, I know f(x), but I don't know [itex]\int_A f(x) dx[/itex]

This doesn't make sense.

If you have the expression that is analytic (or can be separated into analytic expressions that are mutually exclusive but still exhaust the region A) then all you have to do is do a normal integration.

From what you have written it seems that all you need to find is the cumulative density function for some distribution in the region A and you said that you know f(x).

If this is the case then this is just a standard integration. You might have to use a numerical integration scheme but it's just a classic integration never the less. Pick the resolution you want that's good enough and then use this to get the answer.

I think what you are trying to do form my observations is basically create a censored distribution where you only have random variables that been obtained from simulation or otherwise within the region A and you want to normalize this so that you get the right distribution in the context of your problem.

Finding the density of the region for A given a specific and known f(x) in any dimension does not require any simulation at all: it only requires at the most, a sophisticated numerical integration routine to get a density that is good enough for approximation purposes if an analytic expression can't be found.

Maybe you could tell us what f(x) actually corresponds to.
 
  • #7
chiro said:
Maybe you could tell us what f(x) actually corresponds to.

This could be any arbitrary distribution. But let's assume, that it is the Boltzmann distribution.

Lets rephrase my problem:

From a Markov Chain Monte Carlo simulation, I already have points that are distributed like [itex]\frac{f(x)}{\int_A f(x) dx}[/itex]. Now I want to calculate the normalization [itex]\frac{1}{\int_A f(x) dx}[/itex] explicitly, in a way that is computationally as cheap as possible. I'm happy with any method that can do this (keep in mind, that this is a high dimensional integral). Though, my idea was, since I already know points in important regions of the integral, that it may be useful to use these. However, it is not clear, how to do this.
 
  • #8
Derivator said:
This could be any arbitrary distribution. But let's assume, that it is the Boltzmann distribution.

Lets rephrase my problem:

From a Markov Chain Monte Carlo simulation, I already have points that are distributed like [itex]\frac{f(x)}{\int_A f(x) dx}[/itex]. Now I want to calculate the normalization [itex]\frac{1}{\int_A f(x) dx}[/itex] explicitly, in a way that is computationally as cheap as possible. I'm happy with any method that can do this (keep in mind, that this is a high dimensional integral). Though, my idea was, since I already know points in important regions of the integral, that it may be useful to use these. However, it is not clear, how to do this.

So do you acknowledge the fact that any numerical integrator will do this in a finite-dimensional boundary? In terms of optimizations, this is going to depend more on f(x) than anything else.

Also another thing is how often does this need to be computed? If it only needs to be computer once then this should not be a big issue in the context of things.

Also if it needs to be computed multiple times corresponding to different regions then how does the region change over time? If the change in regions is purely local then you can use the fact that you can partition your integral into n-dimensional cubes and just calculate the delta's of the region rather than having to recompute the entire region from scratch.
 
  • #9
chiro said:
So do you acknowledge the fact that any numerical integrator will do this in a finite-dimensional boundary?
Theoretically yes, but in practice most integration schemes are computationally not feasible since the needed number of sampling points is growing exponentially with the dimension.
 
  • #10
Derivator said:
Theoretically yes, but in practice most integration schemes are computationally not feasible since the needed number of sampling points is growing exponentially with the dimension.

So is the region changing and if so how is it changing? Also what accuracy do you need and based on this can you rule out for example getting rid of redundant tails with respect to approximation requirements?
 
  • #11
no, the region of integration is not changing over time. The accurarcy should be as high as possible, however, I'm not sure how good it has to be, in order to obtain useful results, finally.

What do you mean by "redundant tails"?
 
  • #12
Derivator said:
What do you mean by "redundant tails"?

Basically areas of the distribution where it is so low that it can be ignored completely thus speeding up he integration.

As an example think of a standard normal where you are say 5+ from the x = 0 line. Depending on your application that might be ok to just skip that region altogether.
 
  • #13
chiro said:
Basically areas of the distribution where it is so low that it can be ignored completely thus speeding up he integration.

Ah, ok. These redundant regions are one reason why I want to use the random points which are distributed according [itex]\frac{f(x)}{\int_A f(x) dx}[/itex]. (A second reason is, that evaluation f(x) is very expensive, hence a reuse of already calculated sample values is desirable)
 

1. What is numerical integration and how does it work?

Numerical integration is a method of approximating the definite integral of a function using numerical techniques. It works by dividing the area under the curve into smaller sections and using mathematical formulas to calculate the sum of these sections.

2. How is a random distribution used in numerical integration?

A random distribution is used as integration points in numerical integration to improve the accuracy of the approximation. It helps to evenly distribute the integration points across the area under the curve, resulting in a better estimation of the integral.

3. What are the advantages of using a random distribution in numerical integration?

Using a random distribution in numerical integration can help to reduce bias and errors in the approximation. It also allows for a more efficient use of integration points, resulting in a faster and more accurate estimation of the integral.

4. Can any random distribution be used in numerical integration?

Yes, any random distribution can be used in numerical integration as long as it is continuous and has a known probability density function. Some commonly used distributions include uniform, normal, and exponential distributions.

5. Are there any limitations to using a random distribution in numerical integration?

While using a random distribution can improve the accuracy of numerical integration, it may not always be necessary or beneficial. In some cases, a regular distribution of integration points may be more suitable, especially if the function being integrated is simple or has a known pattern.

Similar threads

Replies
31
Views
753
  • Calculus
Replies
6
Views
1K
Replies
3
Views
2K
Replies
1
Views
1K
Replies
3
Views
1K
Replies
20
Views
2K
Replies
1
Views
833
  • Calculus
Replies
8
Views
926
  • Calculus
Replies
1
Views
920
Back
Top