Understanding different Monte Carlo approximation notations


Discussion Overview

The discussion revolves around different notations and approximations used in Monte Carlo integration, particularly focusing on the expectations and the assumptions underlying various formulas. Participants explore the relationship between deterministic functions and random variables in the context of Monte Carlo methods, as well as the implications of different probability density functions (pdfs) on these approximations.

Discussion Character

  • Exploratory
  • Technical explanation
  • Debate/contested
  • Mathematical reasoning

Main Points Raised

  • One participant questions whether the general Monte Carlo approximation $$E[f(X)] \to \frac{1}{N} \sum_{i=1}^N f(X_i)$$ assumes uniformity or normalization of the pdf.
  • Another participant asserts that the only assumption is the existence of the mean and that the pdf is normalized, suggesting uniformity is irrelevant.
  • Different approximations are noted, including $$V\frac{1}{N}\sum_{i=1}^N f(X_i)$$ and $$\frac{1}{N}\sum_{i=1}^N \frac{f(X_i)}{g(X_i)}$$, with questions raised about their derivation and relationship.
  • One participant highlights that the factor $$V$$ in the approximation relates to estimating the mean value of a function over a volume, while another seeks clarification on the conceptual justification for using uniform density in sampling.
  • There is a discussion on the distinction between integrating deterministic functions and random variables, with questions about whether they can be treated equivalently under certain conditions.
  • Participants explore the general formulation of Monte Carlo methods, emphasizing the role of random variables and the potential for various sampling methods, including importance sampling.
  • One participant notes that $$\hat{I} = \frac{1}{N}\sum_{i=1}^N f(X_i)$$ is not a general unbiased estimator unless specific conditions are met, such as uniform distribution over a volume.

Areas of Agreement / Disagreement

Participants express differing views on the assumptions required for various Monte Carlo approximations, particularly regarding the role of the pdf and uniformity. The discussion remains unresolved, with multiple competing perspectives on the relationships between the different formulations and their applications.

Contextual Notes

There are unresolved questions about the definitions and implications of density in the context of Monte Carlo integration, as well as the conditions under which different approximations hold true. The discussion also highlights the complexity of relating probabilistic and deterministic approaches in Monte Carlo methods.

schniefen
TL;DR
Understanding different notations of Monte Carlo approximations.
Currently working on a project involving Monte Carlo integrals. I haven't studied this method before, hence the following question.

Consider the following expectation:

$$E[f(X)]=\int_A f(x)g(x)dx.$$

Let ##X## be a random variable taking values in ##A\subseteq\mathbb{R}^n##. Let ##g:A\to\mathbb{R}_+## be the probability density (pdf) of ##X##, and ##f:A\to\mathbb{R}## a function such that the expectation above is finite.

If ##X_1,X_2,\dots,X_N## are independent random variables with pdf ##g##, then by the law of large numbers,

$$ \frac{1}{N} \sum_{i=1}^N f(X_i) \to E[f(X)] \quad \text{as } N \to \infty.$$

As far as I understand, the sum above is a general Monte Carlo approximation of the integral.

Does the above approximation make any assumptions about the pdf, e.g. uniformity or normalisation? If it is a general approximation, it should hold for any pdf, but I have seen different approximations such as ##V\frac{1}{N}\sum_{i=1}^N f(X_i)## and ##\frac{1}{N}\sum_{i=1}^N \frac{f(X_i)}{g(X_i)}##, where in the former ##V## denotes the volume of the integration region. How are these related, and how are they derived?
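For concreteness, here is a minimal Python sketch of the plain estimator. The specific choices ##X\sim N(0,1)## and ##f(x)=x^2## (so that ##E[f(X)]=1##) are purely illustrative assumptions, not anything from the thread:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 100_000

# Draw X_1, ..., X_N from the pdf g (illustration: standard normal)
x = rng.standard_normal(N)

# f(x) = x^2, so E[f(X)] = Var(X) = 1 for the standard normal;
# the sample mean of f(X_i) approximates E[f(X)] for large N
estimate = np.mean(x**2)
```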
 
(First question) - The only assumption is that the mean exists. By definition, a pdf is normalized, and uniformity is irrelevant.

Different approximations - I have never seen the first one ##(V\times)##. The second looks like what is needed when ##X_i## is sampled from a uniform distribution, but it should have ##f(X_i)g(X_i)##.

The form ##\frac{f(X_i)}{g(X_i)}## would be used when sampling ##X_i## using ##g(x)## when the distribution of ##X_i## is actually uniform. Both here and in the notes above, it is assumed that ##X## is distributed over a finite interval.
 
Thanks for the reply. The formula ##V\frac{1}{N}\sum_{i=1}^N f(X_i)## appears in the Wikipedia article on Monte Carlo integration. Using the mean value theorem, I can see how this formula could easily be derived. However, when using a probabilistic approach, the integral ##I## in the article could be written as ##I=\int_{\Omega} f(x)g(x)dx## with ##g(x)=1##. Wouldn't then the law of large numbers simply yield ##\frac{1}{N} \sum_{i=1}^N f(X_i)## for this integral too? Where does the factor ##V## come from?
 
schniefen said:
Where does the factor ##V## come from?

Monte-carlo integration can be used:
1) To estimate the value of a (deterministic) function over a volume.
2) To estimate a function (e.g. the mean) of a random variable when the function is defined by an integral.

The Wikipedia article refers to integrating a (deterministic) function ##f## over a volume ##V##, not to integrating a multidimensional random variable whose domain is that volume. The estimate is done by introducing a random variable ##\bar{x}## that is uniformly distributed over the volume.

The mean of a set of samples drawn from various places in a volume is used to estimate the mean value of ##f## per unit volume. The estimated value of the integral is the volume ##V## times the estimate of the mean value of ##f## per unit volume.
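A minimal numerical sketch of that recipe (the region ##[0,2]## and integrand ##f(x)=x^2##, with exact integral ##8/3##, are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 100_000

a, b = 0.0, 2.0               # illustrative region; "volume" V = b - a
V = b - a
x = rng.uniform(a, b, N)      # samples uniformly distributed over the region

# The sample mean of f estimates the mean value of f per unit volume;
# multiplying by V estimates the integral of f over the region.
f_vals = x**2                 # illustrative integrand; exact integral is 8/3
estimate = V * f_vals.mean()
```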

How do we explain that sampling a function defined on volume using a uniform density is an estimate of the "density" of that function per unit volume? Justifying that thought is the basic conceptual difficulty.
 
Stephen Tashi said:
The Wikipedia article refers to integrating a (deterministic) function ##f## over a volume ##V## , not to integrating a multidimensional random variable whose domain is that volume.
What is the difference? If the multidimensional random variable ##f(X)## "looks like" the deterministic function ##f##, wouldn't it be the same (with the pdf of ##f(X)## being equal to 1, i.e. ##E[f(X)]=I=\int_{\Omega=A}f(x)dx##)?

Stephen Tashi said:
How do we explain that sampling a function defined on volume using a uniform density is an estimate of the "density" of that function per unit volume?
Wouldn't it be the "density" of that function per that volume (from which we sample the function values)? Why is it unit volume?
 
Stephen Tashi said:
Monte-carlo integration can be used:
1) To estimate the value of a (deterministic) function over a volume.
2) To estimate a function (e.g. the mean) of a random variable when the function is defined by an integral.
Is there a defining link between the two areas of application? A case where 1) would be equivalent to 2), or some other relation? I'm currently struggling to reconcile the presentation of Monte Carlo integration in the language of probability theory with the plain calculus language.
 
schniefen said:
Is there a defining link between the two areas of applications?
There is a general way of defining a problem that includes both major types of problems:
1) Integration of deterministic functions by introducing random variables
2) Computation of integrals involving probability distributions - i.e. computing certain functions of random variables.

In either case, a Monte-Carlo method involves random variables. In both cases, the random variables may or may not be uniformly distributed.

In both cases, the distribution of the random variable may or may not determine how samples of the random variable are selected. For example, in statistics there are sampling methods such as "importance sampling" and "stratified sampling" that do not draw samples from uniform distributions or from the distribution of the population in question.

So a general formulation is that we are estimating an integral ##\int_{\Omega} f(\bar{x}) g(\bar{x}) d\bar{x}## where ##g(\bar{x})## is a probability density (which may have the constant value 1). We select samples of ##\bar{x}## by using some distribution ##s(\bar{x})## which may or may not be equal to ##g(\bar{x})##. If necessary, in computing our estimate of the integral we use formulae that compensate for the fact that we sampled using ##s(\bar{x})## instead of ##g(\bar{x})##.
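A sketch of that compensation, under illustrative assumptions: the target density ##g## is taken to be a standard normal, the sampling density ##s## a wider normal, and ##f(x)=x^2##, so the exact value of ##\int f g\,dx## is ##E_g[X^2]=1##:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 200_000

def g(x):
    # density appearing in the integrand: standard normal N(0, 1)
    return np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)

def s(x):
    # sampling density: a wider normal N(0, 2^2), deliberately different from g
    return np.exp(-x**2 / 8) / (2 * np.sqrt(2 * np.pi))

x = rng.normal(0.0, 2.0, N)   # samples drawn from s, not from g
f_vals = x**2                 # illustrative f; exact integral is E_g[X^2] = 1

# The weights g/s compensate for having sampled from s instead of g
estimate = np.mean(f_vals * g(x) / s(x))
```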

A further generalization is the "Markov Chain Monte-Carlo" type of estimation where the location of samples is determined by a Markov Chain instead of by taking independent random samples from some given distribution.

schniefen said:
Wouldn't it be the "density" of that function per that volume (from which we sample the function values)? Why is it unit volume?

It depends on how we define the "density" of a function - which I deliberately neglected to do! If I think of the function ##f(\bar{x})## itself as the physical density of something per unit volume (e.g. kg/m^3) then ##\frac{1}{N} ( f(\bar{x_1}) + f(\bar{x_2}) + ...+f(\bar{x_N}))## is an estimate of the mean density per unit volume and ##V## times that is an estimate of the total stuff (whatever ##f## is a density for) in the volume ##V##. The exact total stuff is given by ##\int_V f(\bar{x}) d\bar{x}##.

In some physical applications it's unnatural to think of ##f(\bar{x}) ## as a density. (For example, in 1-D it's unnatural to think of velocity as the density of position in time even though it has the appropriate physical units) However, the above way of looking at things makes the general method of estimation plausible.
 
schniefen said:
If ##X_1,X_2,\dots,X_N## are independent random variables with pdf ##g##, then by the law of large numbers,

$$ \frac{1}{N} \sum_{i=1}^N f(X_i) \to E[f(X)] \quad \text{as } N \to \infty.$$

As far as I understand, the sum above is a general Monte Carlo approximation of the integral.

The above is not the general formula for an unbiased estimator. ##\hat{I} = \frac{1}{N}\sum_{i=1}^N f(X_i)## is an unbiased estimator of ##\int_V f(\bar{x})\,d\bar{x}## only in the special case where we integrate over a volume ##V = 1## and the ##X_i## are uniformly distributed over that volume. Discussions of Monte-Carlo integration often treat integration over a unit hypercube, and you may have seen the above formula in that context.

schniefen said:
but I have seen different approximations such as ##V\frac{1}{N}\sum_{i=1}^N f(X_i)## and ##\frac{1}{N}\sum_{i=1}^N \frac{f(X_i)}{g(X_i)}##, where in the former ##V## denotes the definite integral over the pdf. How are these related and derived?

##V\frac{1}{N}\sum_{i=1}^N f(X_i)## is a special case of ##\frac{1}{N}\sum_{i=1}^N \frac{f(X_i)}{g(X_i)}## where the probability density ##g(X_i)## has the constant value ##1/V##. (The ##X_i## are assumed to be independently and uniformly distributed over the volume, so ##g## is the uniform probability distribution over the volume ##V##, which has constant value ##1/V##.) Here is an intuitive way to unify the two formulae.
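The special case can be checked numerically in a few lines. The interval ##[0,3]## and integrand ##\sin^2 x## are illustrative assumptions; the point is only that the two forms coincide when ##g## is the constant ##1/V##:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 50_000

a, b = 0.0, 3.0
V = b - a                        # volume of the region
x = rng.uniform(a, b, N)         # uniform samples, so g(x) = 1/V
f_vals = np.sin(x) ** 2          # illustrative integrand

est_volume_form = V * f_vals.mean()            # V * (1/N) sum f(X_i)
est_weighted_form = np.mean(f_vals / (1 / V))  # (1/N) sum f(X_i)/g(X_i)
# With constant g = 1/V the two forms agree (up to rounding)
```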

Begin with the first formula. Think of the function ##f## as a density of "stuff" per unit volume, even if the physical interpretation of ##f## isn't naturally that way. Then ##\int_V f(\bar{x}) d\bar{x}## is the total stuff in a volume and ##\mu_f = (1/V) \int_V f(\bar{x}) d\bar{x}## is the average volume density of stuff in the volume.

Thinking of ##f## as a density per unit volume, if we estimate the average of ##f## by taking ##N## random samples of ##f## in the volume ##V##, it is intuitive to use ##\hat{\mu}_f = \frac{1}{N} \sum_{i=1}^N f(\bar{x}_i)## as an estimator of the average density of stuff per unit volume and ##(V)(\hat{\mu}_f)## as the estimate of the total stuff in ##V##.

Now consider a case where we sample the volume in a non-uniform manner. Suppose the volume ##V## is composed of unequal sub-volumes ##V_1## and ##V_2##, and suppose we take ##N_1## samples from ##V_1## and ##N_2## samples from ##V_2##. To compensate for the bias in sampling and the inequality of the volumes, it is intuitive to use a weighted average as the estimator for the average volume density.

##\hat{\mu}_f = \frac{V_1}{V} \frac{1}{N_1} \sum_{\bar{x}_i \in V_1} f(\bar{x}_i) + \frac{V_2}{V} \frac{1}{N_2} \sum_{\bar{x}_i \in V_2} f(\bar{x}_i)##

The estimate for the total stuff in the volume is

##\hat{I} = (V)(\hat{\mu}_f) = \frac{V_1}{N_1} \sum_{\bar{x}_i \in V_1} f(\bar{x}_i) + \frac{V_2}{N_2} \sum_{\bar{x}_i \in V_2} f(\bar{x}_i) ##

Now, instead of considering ##N_1## and ##N_2## to have been deterministically selected, suppose we conduct random sampling with probability ##g_1## per unit volume that a sample is taken from ##V_1## and probability ##g_2## per unit volume that a sample is taken from ##V_2##.

[Note: Post has been edited from first version to add the "per unit volume" aspect to ##g_1, g_2## ]

"On the average" from a total of ##N## samples we take ## g_1 V_1 N ## samples from ##V_1## and ##g_2 V_2 N## samples from ##V_2##. Proceeding by superstition and ignorance, this suggests setting ##N_1 = g_1 V_1 N ## and ##N_2 = g_2 V_2 N## in the above formula.

We obtain:
##\hat{I} = \frac{V_1}{g_1 V_1 N} \sum_{\bar{x}_i \in V_1} f(\bar{x}_i) + \frac{V_2}{g_2 V_2 N} \sum_{\bar{x}_i \in V_2} f(\bar{x}_i) ##

##= \frac{1}{N} ( \sum_{\bar{x}_i \in V_1} \frac{f(\bar{x}_i)}{g_1} + \sum_{\bar{x}_i \in V_2} \frac{f(\bar{x}_i)}{g_2} ) ##

Generalizing from two volumes to many volumes: if the volume ##V## is divided into many subvolumes, this suggests that a good estimator for the total stuff in ##V## is
##\hat{I} = \frac{1}{N} \sum_{\bar{x}_i \in V} \frac{ f(\bar{x}_i)}{g(\bar{x}_i)}##
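The two-sub-volume construction above can be sketched numerically. All specifics here are illustrative assumptions: ##V=[0,3]## split into ##V_1=[0,1]## and ##V_2=[1,3]##, per-unit-volume densities ##g_1=0.5## and ##g_2=0.25## (so ##g_1|V_1|+g_2|V_2|=1##), and ##f(x)=x## with exact integral ##4.5##:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 200_000

# Illustrative region V = [0, 3] split into V1 = [0, 1] and V2 = [1, 3].
# Per-unit-volume sampling densities with g1*|V1| + g2*|V2| = 1:
g1, g2 = 0.5, 0.25

# Each sample lands in V1 with probability g1*|V1| = 0.5,
# and is then uniform within its sub-volume.
in_v1 = rng.random(N) < g1 * 1.0
x = np.where(in_v1,
             rng.uniform(0.0, 1.0, N),   # uniform in V1
             rng.uniform(1.0, 3.0, N))   # uniform in V2

f_vals = x                               # illustrative f; exact integral is 4.5
g_vals = np.where(in_v1, g1, g2)         # sampling density at each sample
estimate = np.mean(f_vals / g_vals)      # (1/N) sum f(x_i)/g(x_i)
```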
 