Inverse gamma distribution & simulated annealing problem

tbishop · Dec 24, 2007

I've hit a problem trying to sample from an inverse gamma distribution that has been 'scaled' by a temperature variable T. If my distribution is defined as (with normalising constant k = b^a / Gamma(a)):
IG(x|a,b) = k * x^(-a-1) * exp(-b/x)
then the scaled version is
(IG(x|a,b))^(1/T) = k^(1/T) * x^(-(a+1)/T) * exp(-b/(Tx)),
which I want to sample as part of a simulated annealing procedure.

If I am not mistaken, this is also proportional to a new IG distribution, IG(x|a',b')
where a'+1 = (a+1)/T and b'=b/T. Hence a' = (a+1-T)/T.
The problem, however, is that a' and b' must be strictly > 0 for the distribution to be valid. Thus for temperatures T >= a+1, a' is no longer positive and the samples can't be drawn.
(However, I can still evaluate the IG pdf for a' < 0, so I'm not sure why the condition is strictly necessary. Is it just a physical interpretation?)
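
One way to see why a' > 0 is needed (a sketch using the a', b' defined above; the substitution variable u is introduced here for illustration): the kernel can be evaluated pointwise for any a', but it only integrates to a finite constant when a' > 0.

```latex
\int_0^\infty x^{-a'-1} e^{-b'/x}\,dx
  \;\overset{u = 1/x}{=}\; \int_0^\infty u^{a'-1} e^{-b'u}\,du
  \;=\; \frac{\Gamma(a')}{b'^{\,a'}} \qquad (a' > 0,\ b' > 0).
% For a' \le 0 the integrand behaves like x^{-a'-1} as x \to \infty,
% so the integral diverges and no normalising constant exists.
```

Read this way, the restriction looks like a normalisability condition rather than a physical one: for T >= a+1 the tempered function is not a proper density, so there is nothing to sample from.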

I can do the sampling for X~IG(a',b') by transformation of a Gamma variate
G(x|a,b)=k * x^(a-1) exp(-bx)
by drawing Y~G(a',b') and letting X=1/Y.
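
The reciprocal-Gamma recipe above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the function names, the example values a=9, b=4, T=2 and the use of NumPy are assumptions. NumPy's gamma sampler takes a scale, whereas b in this thread is a rate, hence the 1/b'.

```python
import numpy as np

def tempered_ig_params(a, b, T):
    """Return (a', b') with IG(x|a,b)^(1/T) proportional to IG(x|a',b')."""
    a_new = (a + 1.0 - T) / T   # from a' + 1 = (a + 1) / T
    b_new = b / T
    if a_new <= 0:
        # The tempered kernel cannot be normalised once T >= a + 1.
        raise ValueError("need T < a + 1 for a valid inverse gamma")
    return a_new, b_new

def sample_tempered_ig(a, b, T, size, rng):
    """Draw X ~ IG(a', b') by sampling Y ~ Gamma(a', rate=b') and taking X = 1/Y."""
    a_new, b_new = tempered_ig_params(a, b, T)
    # rng.gamma is parameterised by (shape, scale); scale = 1 / rate.
    y = rng.gamma(shape=a_new, scale=1.0 / b_new, size=size)
    return 1.0 / y

# Hypothetical example values: a' = 4 and b' = 2, so the mean b'/(a'-1) is 2/3.
rng = np.random.default_rng(0)
samples = sample_tempered_ig(a=9.0, b=4.0, T=2.0, size=200_000, rng=rng)
print(samples.mean())
```

Raising an error for T >= a+1 keeps the invalid regime explicit instead of silently passing a non-positive shape to the sampler.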

Now the strange thing is that if I apply the same temperature scaling to the Gamma distribution, I get a new Gamma distribution with
a' = (a-1+T)/T
which is positive for all T >= 1 if a > 0. Whether I work with x or 1/x (and therefore with the IG or the Gamma) should be purely a matter of convenience and should not depend on which definition I start from, so something is at odds here...
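
A possible source of the mismatch, sketched under the definitions above (p_X and p_Y are labels introduced here): a density picks up a Jacobian factor under the change of variable y = 1/x, so tempering and inverting need not commute.

```latex
% Route 1: temper the IG density in x, then change variable to y = 1/x, |dx/dy| = y^{-2}.
p_X(x) \propto x^{-(a+1)/T} e^{-b/(Tx)}
\;\Longrightarrow\;
p_Y(y) \propto y^{(a+1)/T}\, e^{-by/T}\, y^{-2}
      = y^{\frac{a+1}{T} - 2}\, e^{-by/T},
\qquad \text{shape } \frac{a+1-T}{T}.

% Route 2: temper the Gamma density in y directly.
\bigl(y^{a-1} e^{-by}\bigr)^{1/T} = y^{\frac{a-1}{T}}\, e^{-by/T},
\qquad \text{shape } \frac{a-1+T}{T}.

% The shapes differ by 2(1 - 1/T): the Jacobian y^{-2} is raised to the
% power 1/T in route 2 but left untouched in route 1.
```

The two routes agree only at T = 1, which suggests that the variable in which the target is tempered is part of the definition of the annealed distribution rather than a matter of convenience.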

I've probably missed something obvious but I can't think what it is. Any suggestions appreciated!

EnumaElish · Dec 25, 2007

I am somewhat unclear as to why you'd like to scale the pdf in the way you described. The usual approach is to scale the random variable itself, say Y = sX for a scaling constant s > 0; and if X ~ Gamma(a, b) then Y ~ Gamma(a, sb), taking b as the scale parameter (with b as a rate, as in exp(-bx), this would read Y ~ Gamma(a, b/s)).

In the usual meaning of "scaling," parameter a is not affected because it is not a scaling parameter -- parameter b is.

tbishop · Dec 25, 2007

I'm not referring to this usual kind of "scaling"; I mean raising the whole density to the power 1/T, which effectively "broadens" the distribution. In the case of a Gaussian, the effect is indeed simply to increase the variance. I want to do this to speed up convergence in MCMC by exploring the space faster using simulated annealing (http://mathworld.wolfram.com/SimulatedAnnealing.html).
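
For the Gaussian case mentioned above, the broadening can be written out explicitly. The symbols mu and sigma are assumptions introduced for this illustration only.

```latex
\left[\exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)\right]^{1/T}
  = \exp\!\left(-\frac{(x-\mu)^2}{2T\sigma^2}\right)
  \;\propto\; \mathcal{N}\!\left(x \mid \mu,\; T\sigma^2\right).
```

So tempering a Gaussian at temperature T multiplies the variance by T and leaves the mean unchanged.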
