Increasing variance of weights in sequential importance sampling

Summary
Increasing variance of the importance weights in sequential importance sampling (SIS) leads to the degeneracy problem, where a few particles gain high normalized weights while many others become insignificant. This high variance is problematic because it means the estimate rests on only a small number of particles, reducing the effectiveness of the sampling process. Resampling addresses the issue: it eliminates low-weight samples and duplicates high-weight ones, which lowers the variance of the weights and improves exploration of the posterior density. The discussion also clarifies the terminology: SIR (sequential importance resampling) includes a resampling step, while SIS does not. Understanding these concepts is crucial for implementing sequential importance sampling techniques effectively.
sisyphuss
Hi all,

I know about these facts:
1- The variance of the importance weights increases in SIR (also known as the degeneracy problem).
2- It's bad (lol), because in practice there will be one particle with a high normalized weight and many particles with insignificant (normalized) weights.

But I cannot really understand what "increasing variance of importance weights" means in sequential importance sampling (SIR).

Can you please explain it to me? Why is the high variance bad? Also, is there an intuitive proof of that?

Thanks.
 
sisyphuss said:
Hi all,

I know about these facts:
1- The variance of the importance weights increases in SIR (also known as the degeneracy problem).
2- It's bad (lol), because in practice there will be one particle with a high normalized weight and many particles with insignificant (normalized) weights.

You may know those facts, but they haven't struck a chord with anyone else, based on those fragmentary descriptions. There are probably people on the forum who know enough probability theory to help you if you describe your question precisely.

But I cannot really understand what "increasing variance of importance weights" means in sequential importance sampling (SIR).

I certainly don't know what "increasing variance of importance weights" means. And why do you use the abbreviation "SIR" for "sequential importance sampling"? Did you mean "sequential importance re-sampling"?
 
sisyphuss said:
Hi all,

I know about these facts:
1- The variance of the importance weights increases in SIR (also known as the degeneracy problem).
2- It's bad (lol), because in practice there will be one particle with a high normalized weight and many particles with insignificant (normalized) weights.

But I cannot really understand what "increasing variance of importance weights" means in sequential importance sampling (SIR).

Thanks.

- The abbreviation for sequential importance sampling is SIS, as far as I know, and that approach does not include resampling, unlike sequential importance resampling (SIR).

- Basically, facts 1 and 2 say much the same thing. The variance is the second central moment of the normalized weights (informally, it measures the average squared deviation from the mean). The sample weights are normalized, and at the initialization of the algorithm they are distributed evenly (low variance). As time goes by and we proceed with the algorithm, some particles perform well and gain more and more weight, and since the variance is sensitive to such outliers, it grows.
The solution in SIR is to resample: eliminate samples with low importance weights and duplicate samples with high weights. This lowers the variance of the weights and keeps us from wasting effort on particles with negligible weight, so we explore the interesting regions of the posterior density (the ones with high weights). The resampling step is often followed by a Markov chain Monte Carlo (MCMC) move to reintroduce sample diversity without affecting the posterior density. A concrete sketch is given below.
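
As an illustration (not taken from the posts themselves; the toy model and all parameter values below are arbitrary choices), here is a minimal Python sketch. It runs plain SIS on a one-dimensional random-walk model, prints the variance of the normalized weights together with the usual effective-sample-size diagnostic ##N_{eff} \approx 1/\sum_i w_i^2##, and includes a multinomial resampling step of the kind SIR adds:

Code:
import numpy as np

# Toy 1-D model (arbitrary choices, for illustration only):
#   hidden state:  x_t = x_{t-1} + N(0, 1)
#   observation:   y_t = x_t + N(0, 0.5^2)
rng = np.random.default_rng(0)
N, T = 1000, 50
particles = rng.normal(0.0, 1.0, N)
weights = np.full(N, 1.0 / N)              # uniform weights -> zero variance

def ess(w):
    # effective sample size estimate: 1 / sum(w_i^2); equals N for uniform weights
    return 1.0 / np.sum(w ** 2)

true_x = 0.0
for t in range(T):
    true_x += rng.normal(0.0, 1.0)
    y = true_x + rng.normal(0.0, 0.5)

    # SIS step: propagate with the prior and multiply the weights by the likelihood
    particles = particles + rng.normal(0.0, 1.0, N)
    weights = weights * np.exp(-0.5 * ((y - particles) / 0.5) ** 2)
    weights = weights / np.sum(weights)

    print(f"t={t:2d}  var(w)={weights.var():.2e}  ESS={ess(weights):.1f}")

    # The step SIR adds: when the weights degenerate, resample and reset to uniform
    if ess(weights) < 0.5 * N:
        idx = rng.choice(N, size=N, p=weights)   # high-weight particles get duplicated
        particles = particles[idx]
        weights = np.full(N, 1.0 / N)            # weight variance drops back to zero

Commenting out the resampling block at the end of the loop reproduces the behaviour the original post asks about: var(w) grows and the ESS collapses towards 1 within a few steps, i.e. effectively only one particle still contributes.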

If you are interested, read the great book on the topic:

A. Doucet, N. de Freitas, and N. Gordon. Sequential Monte Carlo Methods in Practice. Springer-Verlag, 2001.
 