
Numerical integration, use given random distribution as integration points

  1. Apr 5, 2012 #1
    Hi folks,

    I need to evaluate (numerically) a multi-dimensional integral of the form
    [itex]\int_A f(x) dx[/itex].

    Now in my application, I already have points inside the domain A which are distributed like [itex]\frac{f(x)}{\int_A f(x) dx}[/itex]. So I hoped I could use these random points in some importance-sampling Monte Carlo integration technique, since this would not incur any additional computational cost.
    However, using these random points for Monte Carlo integration doesn't seem so easy: every idea that came to my mind needs the value of the normalizing constant [itex]\int_A f(x) dx[/itex]. Obviously, I don't have this value, since it is exactly what I want to compute.
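    To see the circularity concretely, here is a toy 1-D sketch (the Gaussian f is only an illustration, not my application): if the points follow p(x) = f(x)/Z with Z = ∫ f(x) dx, then the importance-sampling weight f(x)/p(x) is identically Z, so the estimator presupposes exactly the constant I am after.

```python
import math

def f(x):
    """Unnormalized density (toy example: a Gaussian bump)."""
    return math.exp(-x**2 / 2)

Z = math.sqrt(2 * math.pi)   # normalizing constant of this toy f

def p(x):
    """The density my sample points actually follow: f / Z."""
    return f(x) / Z

# The importance-sampling weight f(x)/p(x) equals Z at every point,
# so any estimator built from these weights already requires Z.
for x in (-1.7, 0.0, 0.3, 2.5):
    assert math.isclose(f(x) / p(x), Z)
```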

    Does anybody have an idea?

  3. Apr 5, 2012 #2


    Science Advisor

    How did you get the random points without knowing f(x)?
  4. Apr 7, 2012 #3
    There are computer programs that can do this, but they'd probably need more data than the random points you have. I think they build cubic polynomial interpolation functions for ranges between your data points and then numerically integrate that function. That way, the program has some control over the accuracy of the results. Otherwise, your results might be so inaccurate as to be meaningless.

    For example, consider the function f = sin(x).
    Say you want to numerically integrate this function from x = 0 to x = π/2 but have only three data points for f on this range:
    (0, 0), (π/4, (√2)/2), and (π/2, 1).
    A simple one-segment trapezoidal approach, using only the endpoints, yields (π/2)(0 + 1)/2 ≈ 0.785, i.e. an answer like 0.8 ± 0.3, which has so much error in it as to be meaningless.
    So most numerical integration routines for problems such as yours would create an interpolating function for the points between the known values. The program can then evaluate function values as many times and at as many places as necessary to compute an answer to the desired accuracy. That way, you'd get an answer much closer to the true value of 1--something you could actually use.
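    A pure-Python sketch of this comparison, using the three sample points from the sin example above (integrating the quadratic interpolant through three points is just Simpson's rule):

```python
import math

# Three known samples of f(x) = sin(x) on [0, pi/2].
xs = [0.0, math.pi / 4, math.pi / 2]
ys = [math.sin(x) for x in xs]

# One-segment trapezoid: uses only the two endpoints.
trap = (xs[2] - xs[0]) * (ys[0] + ys[2]) / 2                 # ~0.785

# Integrating the quadratic interpolant through all three points
# is Simpson's rule -- far closer to the true value of 1.
simpson = (xs[2] - xs[0]) / 6 * (ys[0] + 4 * ys[1] + ys[2])  # ~1.0023

print(trap, simpson)
```

    Note that the interpolation step recovers most of the accuracy without a single new evaluation of f.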

    Do you have access to a commercial product like MATLAB?
    Or do you have to write your own program?
  5. Apr 7, 2012 #4


    Staff Emeritus
    Science Advisor
    Gold Member


    The idea is that you do a random walk in the space, where the probability distribution of where to walk next is based on f(x). This lets you sample according to the global distribution of f(x) surprisingly quickly.
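    A minimal sketch of such a random walk (the Metropolis algorithm, with a 1-D toy f chosen only for illustration):

```python
import math
import random

def f(x):
    """Unnormalized target density (toy example: exp(-x^2/2))."""
    return math.exp(-x**2 / 2)

def metropolis(n_steps, step=1.0, x0=0.0, seed=0):
    """Random walk whose accept/reject rule uses only ratios of f,
    so the (unknown) normalizing constant never appears."""
    rng = random.Random(seed)
    x, samples = x0, []
    for _ in range(n_steps):
        proposal = x + rng.uniform(-step, step)
        # Accept with probability min(1, f(proposal)/f(x)).
        if rng.random() < f(proposal) / f(x):
            x = proposal
        samples.append(x)
    return samples

samples = metropolis(50000)[5000:]   # discard burn-in
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
```

    The key point is that only the ratio f(proposal)/f(current) appears, which is exactly why MCMC can sample from f/Z without ever knowing Z.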
  6. Apr 10, 2012 #5
    Well, I know f(x), but I don't know [itex]\int_A f(x) dx[/itex]

    Yes, if I wanted to calculate an integral like [itex]\int g(x)\frac{f(x)}{\int_A f(y) dy} dx[/itex], then I could use Markov Chain Monte Carlo. However, I want to calculate [itex]\int_A f(x) dx[/itex] by using previously obtained random points (distributed like [itex]\frac{f(x)}{\int_A f(x) dx}[/itex]) as sampling points.


    @DuncanM: It's a high-dimensional integration problem, so I think there is no way around Monte Carlo integration. What do you think?
    Last edited: Apr 10, 2012
  7. Apr 10, 2012 #6


    Science Advisor

    This doesn't make sense.

    If you have an analytic expression (or can split it into analytic pieces that are mutually exclusive but together exhaust the region A), then all you have to do is a normal integration.

    From what you have written, it seems that all you need to find is the cumulative distribution function for some distribution on the region A, and you said that you know f(x).

    If this is the case, then this is just a standard integration. You might have to use a numerical integration scheme, but it's a classic integration nevertheless. Pick a resolution that's good enough and then use this to get the answer.

    From my observations, I think what you are trying to do is basically create a censored distribution, where you only have random variables that have been obtained (from simulation or otherwise) within the region A, and you want to normalize this so that you get the right distribution in the context of your problem.

    Finding the density of the region A for a specific and known f(x) in any dimension does not require any simulation at all: it requires at most a sophisticated numerical integration routine to get a density that is good enough for approximation purposes, if an analytic expression can't be found.

    Maybe you could tell us what f(x) actually corresponds to.
  8. Apr 10, 2012 #7
    It could be any distribution, but let's assume it is the Boltzmann distribution.

    Let me rephrase my problem:

    From a Markov Chain Monte Carlo simulation, I already have points that are distributed like [itex]\frac{f(x)}{\int_A f(x) dx}[/itex]. Now I want to calculate the normalization [itex]\frac{1}{\int_A f(x) dx}[/itex] explicitly, in a way that is computationally as cheap as possible. I'm happy with any method that can do this (keep in mind that this is a high-dimensional integral). My idea was that, since I already have points in the important regions of the integrand, it might be useful to use these; however, it is not clear how to do that.
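    For what it's worth, one standard trick (sometimes called reciprocal importance sampling) can estimate the normalization from samples you already have: pick any instrumental function g with a known integral, since E[g(X)/f(X)] = (∫g)/Z when X ~ f/Z. Below is a 1-D sketch with g the indicator of a small box; exact normal draws stand in for the MCMC samples, and the Gaussian f is a toy choice, not from this thread.

```python
import math
import random

def f(x):
    return math.exp(-x**2 / 2)   # unnormalized target; true Z = sqrt(2*pi)

rng = random.Random(0)
# Stand-in for MCMC output: draws distributed like f(x)/Z.
samples = [rng.gauss(0.0, 1.0) for _ in range(200000)]

# g = indicator of the box [-0.5, 0.5]; its integral is the box volume 1.0.
a, b = -0.5, 0.5
vol_g = b - a

# E[g(X)/f(X)] = vol_g / Z   =>   Z ~= vol_g * N / sum_{x in box} 1/f(x)
acc = sum(1.0 / f(x) for x in samples if a <= x <= b)
Z_hat = vol_g * len(samples) / acc

print(Z_hat, math.sqrt(2 * math.pi))   # estimate vs. exact value
```

    In high dimensions the crux is choosing a g whose support the chain visits often; with a poorly placed g the estimator's variance blows up, so this is a sketch of the idea rather than a recipe.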
  9. Apr 10, 2012 #8


    Science Advisor

    So do you acknowledge that any numerical integrator can do this on a finite-dimensional domain? In terms of optimizations, this is going to depend more on f(x) than anything else.

    Another question is how often this needs to be computed. If it only needs to be computed once, then it should not be a big issue in the context of things.

    Also, if it needs to be computed multiple times for different regions, how does the region change over time? If the change in regions is purely local, then you can partition your integral into n-dimensional cubes and just calculate the deltas of the region, rather than having to recompute the entire region from scratch.
  10. Apr 10, 2012 #9
    Theoretically yes, but in practice most integration schemes are computationally infeasible, since the required number of sampling points grows exponentially with the dimension.
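    The scaling is easy to make concrete: a grid with m points per axis needs m^d evaluations, while plain Monte Carlo error shrinks like N^(-1/2) regardless of d. A quick illustration:

```python
# Grid-based quadrature: points needed with m = 20 nodes per axis.
m = 20
for d in (2, 6, 10):
    print(f"d = {d:2d}: {m**d:.3e} grid points")
# d =  2: 4.000e+02
# d =  6: 6.400e+07
# d = 10: 1.024e+13  -- hopeless, while MC error ~ N**-0.5 in any dimension
```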
  11. Apr 10, 2012 #10


    Science Advisor

    So is the region changing, and if so, how? Also, what accuracy do you need, and based on that, can you for example cut off redundant tails with respect to the approximation requirements?
  12. Apr 10, 2012 #11
    No, the region of integration is not changing over time. The accuracy should be as high as possible; however, I'm not sure how good it has to be in order to obtain useful results in the end.

    What do you mean by "redundant tails"?
  13. Apr 10, 2012 #12


    Science Advisor

    Basically, areas where the distribution is so low that they can be ignored completely, thus speeding up the integration.

    As an example, think of a standard normal where you are, say, 5+ standard deviations from the x = 0 line. Depending on your application, it might be OK to just skip that region altogether.
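    For the standard-normal example, the mass beyond 5 standard deviations can be checked in one line via the complementary error function:

```python
import math

# Two-sided tail mass of a standard normal beyond +-5 sigma:
# P(|X| > 5) = erfc(5 / sqrt(2))
tail = math.erfc(5 / math.sqrt(2))
print(tail)   # ~5.7e-7: negligible for most accuracy requirements
```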
  14. Apr 10, 2012 #13
    Ah, ok. These redundant regions are one reason why I want to use the random points which are distributed according to [itex]\frac{f(x)}{\int_A f(x) dx}[/itex]. (A second reason is that evaluating f(x) is very expensive, so reusing already computed sample values is desirable.)