Chernoff Bounds using importance sampling identity

  • Thread starter Thread starter WMDhamnekar
  • Start date Start date
WMDhamnekar
MHB
Messages
376
Reaction score
28
How to use importance sampling identity to obtain the Chernoff bounds as given below?

Let X have moment generating function ##\phi(t)= E[e^{tX}]##. Then, for any c > 0 ,

##P[X\geq c ]\leq e^{-tc} \phi(t), \text{if t > 0}##

##P[X \leq c]\leq e^{-tc}\phi(t), \text{if t<0} ##

Solution:
Here's an example of using the importance sampling identity to obtain Chernoff bounds for a binomial random variable.

Suppose we have a binomial random variable ##X \sim \text{Bin}(n,p)##, where n is the number of trials and p is the probability of success on each trial. We want to bound the probability that X deviates from its mean by more than a certain amount, i.e., we want to bound ##P(X \geq (1+\delta)np)## for some ##\delta > 0##.

Let ##X = \sum_{i=1}^n X_i##, where the ##X_i## are independent Bernoulli random variables with parameter p. Let ##Q_i## be a Bernoulli distribution with parameter q, and let ##Q = Q_1 \times \cdots \times Q_n##.

Then, by the importance sampling identity, we have ##P(X \geq (1+\delta)np) = E_Q\left[\prod_{i=1}^n \frac{dP_i}{dQ_i}I_{X \geq (1+\delta)np}\right]##

Now, let ##\phi_i(t) = E[e^{tX_i}] = 1 + p(e^t - 1)## and ##\psi_i(t) = E_Q[e^{tX_i}] = 1 + q(e^t - 1)##.

By choosing ##Q_i## such that ##\frac{dP_i}{dQ_i} = e^{tX_i - \log \phi_i(t)}##, we obtain ##P(X \geq (1+\delta)np) = E_Q\left[e^{tX - n\log \phi(t)}I_{X \geq (1+\delta)np}\right]##

Applying Markov's inequality, we get ##P(X \geq (1+\delta)np) \leq e^{-t(1+\delta)np}E_Q\left[e^{tX - n\log \phi(t)}\right] = e^{-t(1+\delta)np}\prod_{i=1}^n\psi_i(t)e^{-\log \phi_i(t)} = e^{-t(1+\delta)np}\left(\frac{1+q(e^t-1)}{1+p(e^t-1)}\right)^n##

This is the Chernoff bound for a binomial random variable. By choosing an appropriate value of t, we can obtain a tight bound on the tail probability.

In my opinion, this solution seems to look correct. Do you have any doubts? Let me know, so that I shall clear them.
 
Last edited:
Hi all, I've been a roulette player for more than 10 years (although I took time off here and there) and it's only now that I'm trying to understand the physics of the game. Basically my strategy in roulette is to divide the wheel roughly into two halves (let's call them A and B). My theory is that in roulette there will invariably be variance. In other words, if A comes up 5 times in a row, B will be due to come up soon. However I have been proven wrong many times, and I have seen some...
Thread 'Detail of Diagonalization Lemma'
The following is more or less taken from page 6 of C. Smorynski's "Self-Reference and Modal Logic". (Springer, 1985) (I couldn't get raised brackets to indicate codification (Gödel numbering), so I use a box. The overline is assigning a name. The detail I would like clarification on is in the second step in the last line, where we have an m-overlined, and we substitute the expression for m. Are we saying that the name of a coded term is the same as the coded term? Thanks in advance.
Back
Top