Challenge 8: Sub-Gaussian Variables

AI Thread Summary
A random variable X is defined as sub-gaussian if the probability of its absolute value exceeding t is bounded by a specific exponential decay. The discussion focuses on proving that the standard normal distribution is sub-gaussian by demonstrating that its probability density function meets the criteria outlined. The integral of the product of a bounding function and the standard normal density is analyzed using Fourier transforms, leading to the conclusion that it can be bounded appropriately. Various attempts are made to establish inequalities and bounds for the integral, ultimately showing that with a constant K set to 1/2, the standard normal distribution satisfies the sub-gaussian condition. The proof confirms that the standard normal distribution is indeed sub-gaussian.
Office_Shredder
A random variable ##X## is called sub-gaussian if
$$P( |X| > t) \leq 2e^{-K t^2}$$
for some constant ##K##. Equivalently, the probability density ##p(x)## is sub-gaussian if
$$\int_{-t}^{t} p(x)\, dx \geq 1 - 2 e^{-Kt^2}.$$

The challenge: Prove that the standard normal distribution (with mean 0 and variance 1) is sub-gaussian.
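Before diving in, a quick numerical sanity check never hurts. The sketch below (standard-library Python, an illustration rather than a proof) tests the claimed bound with ##K = 1/2##, the constant the discussion ends up settling on:

```python
# Sanity check of the sub-gaussian bound P(|X| > t) <= 2*exp(-K*t^2) for the
# standard normal, with K = 1/2 (the constant the thread arrives at).
# Illustration only, not a proof.
import math

def two_sided_tail(t):
    # P(|X| > t) = 2*(1 - Phi(t)) = erfc(t/sqrt(2)) for X ~ N(0, 1)
    return math.erfc(t / math.sqrt(2))

K = 0.5
for t in [0.0, 0.5, 1.0, 2.0, 4.0]:
    assert two_sided_tail(t) <= 2 * math.exp(-K * t * t)
```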
 
I haven't found a solution yet, but here's a method that might be useful.

The standard normal distribution has the following probability density function:
$$p(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$$
Therefore we wish to prove that there exists some ##K## such that
$$\int_{-\infty}^{\infty} r_t(x) p(x) dx \geq 1 - 2e^{-Kt^2}$$
where
$$r_t(x) = \begin{cases}
1 & \text{ if }|x| < t \\
0 & \text{ if }|x| > t\\
\end{cases}$$
Note that in general, the integral of a function is equal to its Fourier transform evaluated at zero:
$$\int_{-\infty}^{\infty}f(x) dx = \left. \int_{-\infty}^{\infty} f(x) e^{-2\pi i u x} dx \right|_{u = 0} = \hat{f}(0)$$
where ##\hat{f}## is the Fourier transform of ##f##:
$$\hat{f}(u) = \int_{-\infty}^{\infty}f(x) e^{-2\pi i u x} dx$$
We may therefore express our desired integral (let's call it ##I##) in terms of Fourier transforms:
$$I = \int_{-\infty}^{\infty} r_t(x) p(x) dx = \widehat{r_t p}(0)$$
where ##\widehat{r_t p}## is the Fourier transform of the product ##r_t p##. It's easy to show that the Fourier transform of a product is equal to the convolution of the Fourier transforms of the factors:
$$\widehat{r_t p}(u) = \int_{-\infty}^{\infty} \widehat{r_t}(u-v)\widehat{p}(v) dv$$
Evaluating at ##u = 0##, we get
$$I = \widehat{r_t p}(0) = \int_{-\infty}^{\infty} \widehat{r_t}(-v) \widehat{p}(v) dv$$
Let us now compute ##\widehat{r_t}## and ##\widehat{p}##. Starting with the definition of the Fourier transform (the odd imaginary sine part of ##e^{-2\pi i v x}## integrates to zero over the symmetric interval, leaving the cosine), we obtain
$$\begin{align}
\widehat{r_t}(v) &= \int_{-\infty}^{\infty} r_t(x) e^{-2\pi i v x} dx \\
&= \int_{-t}^{t} e^{-2\pi i v x} dx \\
&= \int_{-t}^{t} \cos(2\pi v x) dx \\
&= 2 \int_{0}^{t} \cos(2\pi v x) dx \\
&= \left. 2 \frac{\sin(2\pi v x)}{2\pi v} \right|_{x = 0}^{x = t} \\
&= 2 \frac{\sin(2 \pi v t)}{2\pi v} \\
&= \frac{\sin(2\pi v t)}{\pi v}\\
\end{align}$$
Note that the result is an even function of ##v##, i.e., ##\widehat{r_t}(-v) = \widehat{r_t}(v)##, so we can simplify our expression for ##I## to
$$I = \int_{-\infty}^{\infty} \widehat{r_t}(v) \widehat{p}(v) dv$$
Next we calculate ##\hat{p}##, again starting from the definition. We have
$$\begin{align}
\hat{p}(v) &= \int_{-\infty}^{\infty} p(x) e^{-2 \pi i v x} dx \\
&= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-x^2/2}e^{-2\pi i v x} dx\\
&= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \exp\left(-\frac{1}{2} (x^2 + 4\pi i v x)\right)dx \\
&= \exp\left(-\frac{1}{2} 4 \pi^2 v^2\right) \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \exp\left(-\frac{1}{2} (x + 2\pi i v)^2\right) dx \\
&= \exp\left(-\frac{1}{2} 4 \pi^2 v^2\right) \frac{1}{\sqrt{2\pi}} \cdot \sqrt{2\pi} \\
&= \exp(-2\pi^2 v^2)
\end{align}$$
Plugging the expressions for ##\widehat{r_t}## and ##\hat{p}## into our integral, we obtain
$$I = \int_{-\infty}^{\infty}\widehat{r_t}(v) \widehat{p}(v) dv =
\int_{-\infty}^{\infty} \frac{\sin(2\pi v t)}{\pi v} \exp(-2\pi^2 v^2) dv$$
Now if I can find the right lower bound for the integrand as a function of ##v## and ##t##, the result might follow. Or I could be barking up the wrong tree.

Intuitively, as ##t## increases, ##\sin(2\pi v t)/(\pi v)## becomes a taller and skinnier function centered at ##v = 0##, although with tails that decay on the order of ##1/v##, so the oscillations are important! Most of the value of ##I## is hopefully determined by the part spanned by the "main peak" of ##\sin(2\pi v t)/(\pi v)##, which is the part for which ##|v| < 1/(2t)##.

I think it's true that
$$I > \int_{-1/t}^{1/t} \frac{\sin(2\pi v t)}{\pi v} \exp(-2\pi^2 v^2) dv$$
because outside the interval ##[-1/t, 1/t]## the successive lobes of the integrand form an alternating series of decreasing magnitude whose first term is positive, so the omitted tails contribute a positive amount.
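As a side check (my own numerical sketch, not part of the argument), the Fourier-side expression for ##I## can be compared against the known closed form ##I = P(|X|<t) = \operatorname{erf}(t/\sqrt{2})##. A plain trapezoidal rule suffices because the Gaussian factor kills the tails:

```python
# Numerical check of I = ∫ sin(2πvt)/(πv) · exp(-2π²v²) dv against the closed
# form P(|X| < t) = erf(t/√2). Trapezoidal rule on [-L, L]; the Gaussian
# factor makes the truncated tails negligible (exp(-2π²·1.5²) ≈ e^-44).
import math

def integrand(v, t):
    if v == 0.0:
        return 2.0 * t  # limit of sin(2πvt)/(πv) as v -> 0
    return math.sin(2 * math.pi * v * t) / (math.pi * v) * math.exp(-2 * math.pi ** 2 * v * v)

def fourier_side(t, L=1.5, n=20001):
    h = 2 * L / (n - 1)
    total = 0.5 * (integrand(-L, t) + integrand(L, t))
    for i in range(1, n - 1):
        total += integrand(-L + i * h, t)
    return total * h

for t in [0.5, 1.0, 2.0]:
    assert abs(fourier_side(t) - math.erf(t / math.sqrt(2))) < 1e-6
```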
 
Office_Shredder said:
A random variable ##X## is called sub-gaussian if
$$P( |X| > t) \leq 2e^{-K t^2}$$
for some constant ##K##. Equivalently, the probability density ##p(x)## is sub-gaussian if
$$\int_{-t}^{t} p(x)\, dx \geq 1 - 2 e^{-Kt^2}.$$

The challenge: Prove that the standard normal distribution (with mean 0 and variance 1) is sub-gaussian.
Let's try this.

We wish to prove that ##\displaystyle 1\leq\frac{\sqrt{2}}{\sqrt{\pi}}\int\limits_{[0,t]}e^{-\frac{x^2}{2}}\mathrm{d}x+2e^{-kt^2}=f(t)##.

Attempt 1:
Expanding by power series, we get $$\frac{\sqrt{2}}{\sqrt{\pi}}\int\limits_{[0,t]}e^{-\frac{x^2}{2}}\mathrm{d}x=\frac{\sqrt{2}}{\sqrt{\pi}}\sum_{0\leq n}\frac{(-1)^n t^{2n+1}}{2^n(2n+1)n!}.$$

For ##2e^{-kt^2}##, we expand and get ##\displaystyle 2e^{-kt^2}=2\sum_{m\geq 0}\frac{(-1)^mk^mt^{2m}}{m!}##.

Let ##k=\frac{1}{2}## for ease of notation. After a myriad of simplifications and a couple of series manipulations which I am not going to LaTeX out, we just need to prove $$1\leq 2\sum_{n\geq 0}(-1)^n\frac{t^{2n}(t+\sqrt{2\pi}(2n+1))}{\sqrt{2\pi}2^n(2n+1)n!}.$$

By properties of alternating series, I'm pretty sure that one can show the RHS is always greater than 1.

Attempt 2:
Letting ##k=1/2## again for convenience, note that $$\frac{d}{dt}\left(\frac{\sqrt{2}}{\sqrt{\pi}}\int\limits_{[0,t]}e^{-\frac{x^2}{2}}\mathrm{d}x+2e^{-kt^2}\right)=\sqrt{\frac{2}{\pi}}e^{-\frac{1}{2}t^2}-4kte^{-kt^2}=e^{-\frac{1}{2}t^2}\left(\sqrt{\frac{2}{\pi}}-2t\right).$$ Then, we note that ##\max(f)=f\left(\sqrt{\frac{1}{2\pi}}\right)>1##. Since ##f## is decreasing (with a horizontal asymptote at 1) for all ##t>\sqrt{\frac{1}{2\pi}}## and ##f(t)>1## for all ##0\leq t\leq \sqrt{\frac{1}{2\pi}}##, the result follows.

Not sure if either of these is right. They were pretty ugly.

Edit: Ignore the second part. I clearly did something horribly wrong with the derivatives. >_>
 
I guess we need only consider t ≥ 0?
For t = 0 the claim is trivial (in the tail form derived below, the left side is 1/2 and the right side is 1), so consider t > 0 from now on.

We want to show
$$\int_{-t}^{t} p(x) dx \geq 1 - 2 e^{-Kt^2}$$
As the normal distribution is symmetric, this can be written as
$$1-2\int_{t}^{\infty} p(x) dx \geq 1 - 2 e^{-Kt^2}$$
or
$$\int_{t}^{\infty} p(x) dx \leq e^{-Kt^2}$$

As a plot, and with the example K=1/2, this means the red line has to be above the blue line (WolframAlpha query: "plot int(1/sqrt(2pi)*exp(-y^2/2), y from x to infinity), e^(-x^2/2)").

Let's find an upper bound for the integral. As the integrand is monotonically decreasing,
$$\int_{t}^{\infty} p(x) dx \\
\leq \sum_{i=0}^\infty p(t+i) \\
= \frac{1}{\sqrt{2\pi}} \sum_{i=0}^\infty \exp\left(\frac{-(t+i)^2}{2}\right) \\
< \frac{1}{\sqrt{2\pi}} \sum_{i=0}^\infty \exp\left(\frac{-t^2-2ti}{2}\right) \\
= \frac{1}{\sqrt{2\pi}} \exp(-t^2/2) \sum_{i=0}^\infty \left(e^{-t}\right)^i
$$
Since ##t>0##, we have ##e^{-t}<1##, so the geometric series converges and we can evaluate the sum.
$$ = \frac{1}{\sqrt{2\pi}} \exp(-t^2/2) \frac{1}{1-e^{-t}}$$

Let's fix ##K=\frac{1}{2}##. We have to show that
$$\frac{1}{\sqrt{2\pi}} \exp(-t^2/2) \frac{1}{1-e^{-t}} \leq e^{-t^2/2}$$
This is satisfied if
$$\sqrt{2 \pi} (1-e^{-t}) \geq 1$$
with equality at approximately t ≈ 0.51. Therefore, the inequality is satisfied for all t ≥ 0.51, in particular for t ≥ 1.

It is left to show the inequality for 0 < t < 1. In this interval, the LHS — the tail integral itself — is at most ##1/2## (it equals ##1/2## at ##t=0## and decreases in ##t##), but ##e^{-t^2/2} > e^{-1/2} \approx 0.607 > 0.5##.

As a result, with K=1/2 the inequality is satisfied everywhere and the standard normal distribution is sub-gaussian.
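The chain of bounds above is easy to probe numerically. The sketch below (standard-library Python, my own check, not part of the proof) verifies both the Riemann-sum bound on the tail and the final sub-gaussian inequality with K = 1/2 at a few sample points:

```python
# Spot checks of the bounds above with K = 1/2:
#   tail(t) <= exp(-t²/2) / (sqrt(2π)·(1 - e^-t))   (Riemann-sum bound, t > 0)
#   tail(t) <= exp(-t²/2)                           (the sub-gaussian bound)
import math

def tail(t):
    # one-sided tail P(X > t) of a standard normal
    return 0.5 * math.erfc(t / math.sqrt(2))

def riemann_bound(t):
    return math.exp(-t * t / 2) / (math.sqrt(2 * math.pi) * (1 - math.exp(-t)))

for t in [0.25, 0.75, 1.0, 1.5, 3.0]:
    assert tail(t) <= riemann_bound(t)
    assert tail(t) <= math.exp(-t * t / 2)
for t in [1.0, 1.5, 3.0]:
    # the Riemann-sum bound itself drops below exp(-t²/2) once t >= ~0.51
    assert riemann_bound(t) <= math.exp(-t * t / 2)
```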
 
Mandelbroth said:
[...] we just need to prove $$1\leq 2\sum_{n\geq 0}(-1)^n\frac{t^{2n}(t+\sqrt{2\pi}(2n+1))}{\sqrt{2\pi}2^n(2n+1)n!}.$$
By properties of alternating series, I'm pretty sure that one can show the RHS is always greater than 1.

That seems like it would be hard: if ##t## is sufficiently large, the first batch of terms increases in magnitude (they carry ever higher powers of ##t##), so the alternating series test doesn't apply directly.

mfb, that's a really nice solution; I like the idea of bounding the integral with a Riemann sum. ##K=1/2## is actually the best constant that I know of, but your proof is the first one that makes me think that it is genuinely tight as opposed to just convenient (since the sum should be a really good estimate when ##t## is large).

Any alternate proofs out there people can come up with?
 
Office_Shredder said:
k=1/2 is actually the best bound that I know of, but your proof is the first one that makes me think that it is genuinely tight as opposed to just convenient (since the sum should be a really good estimate if t is large).
Proof:

$$\int_{t}^{\infty} p(x) dx \\
\geq \sum_{i=1}^\infty p(t+i) \\
= \frac{1}{\sqrt{2\pi}} \sum_{i=1}^\infty \exp\left(-\frac{(t+i)^2}{2}\right) \\
> \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{(t+1)^2}{2}\right)$$
where the last expression is simply the first term of the sum (note that this time the rectangles sit below the curve, so the sum bounds the tail from below).

Let's look at the inequality:
$$\frac{1}{\sqrt{2\pi}} \exp\left(-\frac{(t+1)^2}{2}\right) < e^{-K t^2}$$
Taking logarithms, this rearranges to
$$c+(t+1)^2 > 2Kt^2$$
where ##c=\log(2 \pi)## is a constant, i.e.
$$K< \frac{c+(t+1)^2}{2t^2} = \frac{1}{2} + \frac{1}{t} + \frac{1+c}{2t^2}$$
So ##K## cannot be larger than ##1/2##: for any fixed ##K>1/2##, the right-hand side eventually drops below ##K##, and the inequality is violated for large enough ##t##.
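This optimality argument can also be sampled numerically; the sketch below (my own check, not part of the argument) picks ##K = 0.6## and a value of ##t## past the crossover point implied by the algebra above, and confirms the bound fails there while ##K = 1/2## still holds:

```python
# With K = 0.6 > 1/2, the algebra above predicts the bound fails once
# 0.1·t² - t - (1 + ln 2π)/2 > 0, i.e. for t beyond roughly 11.3; t = 12
# is past that crossover. K = 1/2 still holds there.
import math

def tail(t):
    # one-sided tail P(X > t) of a standard normal
    return 0.5 * math.erfc(t / math.sqrt(2))

K, t = 0.6, 12.0
assert tail(t) > math.exp(-K * t * t)      # K = 0.6 bound is violated
assert tail(t) <= math.exp(-t * t / 2)     # K = 1/2 bound still holds
```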
 
mfb said:
$$K< \frac{c+(t+1)^2}{2t^2} = \frac{1}{2} + \frac{1}{t} + \frac{1+c}{2t^2}$$
K cannot be larger than 1/2, or the inequality gets violated at some point.
That's...awesome. In the series method I used, ##k=\frac{1}{2}## was actually what I considered an aesthetic choice. I chose it because it made the computations simple and lots of things canceled to give something I knew would converge. That's pretty fantastic, when you think about how all of it connects. :approve:
 
Mandelbroth said:
we just need to prove $$1\leq 2\sum_{n\geq 0}(-1)^n\frac{t^{2n}(t+\sqrt{2\pi}(2n+1))}{\sqrt{2\pi}2^n(2n+1)n!}.$$
One part gives a nice sum, the other part gives the error function:

$$ \sum_{n\geq 0}(-1)^n\frac{t^{2n}(t+\sqrt{2\pi}(2n+1))}{\sqrt{2\pi}2^n(2n+1)n!} \\
= \sum_{n\geq 0}(-1)^n\frac{t^{2n}(\sqrt{2\pi}(2n+1))}{\sqrt{2\pi}2^n(2n+1)n!} + \sum_{n\geq 0}(-1)^n\frac{t^{2n}t}{\sqrt{2\pi}2^n(2n+1)n!} \\
= \sum_{n\geq 0}(-1)^n\frac{t^{2n}}{2^nn!} + \frac{1}{\sqrt{2\pi}}\sum_{n\geq 0}(-1)^n\frac{t^{2n+1}}{2^n(2n+1)n!} \\
= \sum_{n\geq 0}\frac{(-t^2/2)^n}{n!} + \frac{1}{\sqrt{2\pi}}\sum_{n\geq 0}(-1)^n\frac{t^{2n+1}}{2^n(2n+1)n!} \\
= \exp\left(-\frac{t^2}{2}\right) + \frac{1}{\sqrt{2\pi}}\sum_{n\geq 0}(-1)^n\frac{t^{2n+1}}{2^n(2n+1)n!}$$
Let's focus on the second part:
$$\frac{1}{\sqrt{2\pi}}\sum_{n\geq 0}(-1)^n\frac{t^{2n+1}}{2^n(2n+1)n!} \\
= \int dt \frac{d}{dt} \frac{1}{\sqrt{2\pi}}\sum_{n\geq 0}(-1)^n\frac{t^{2n+1}}{2^n(2n+1)n!} \\
= \frac{1}{\sqrt{2\pi}} \int dt \sum_{n\geq 0}(-1)^n\frac{t^{2n}}{2^n n!} \\
= \frac{1}{\sqrt{2\pi}} \int dt \exp\left(\frac{-t^2}{2}\right)$$
Now we have to get the integration constant right: For t=0, the expression is 0.
$$ = \frac{1}{\sqrt{2\pi}} \int_0^t dt' \exp\left(\frac{-t'^2}{2}\right)$$
So now we are comparing a gaussian with the error function again. I think this went in a circle...

We have to prove that $$1 \leq 2 \exp\left(-\frac{t^2}{2}\right) + \frac{2}{\sqrt{2\pi}} \int_0^t dt' \exp\left(\frac{-t'^2}{2}\right)$$
Which is equivalent to (using the symmetry of the integral)
$$1 - 2 \exp\left(-\frac{t^2}{2}\right) \leq \frac{1}{\sqrt{2\pi}} \int_{-t}^t dt' \exp\left(\frac{-t'^2}{2}\right)$$
And that is exactly the initial problem statement. Okay, bad direction. Well now we have that in TeX.
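For completeness, the series identity that fell out along the way, ##\sum_{n\geq 0}(-1)^n\frac{t^{2n}(t+\sqrt{2\pi}(2n+1))}{\sqrt{2\pi}2^n(2n+1)n!} = e^{-t^2/2} + \tfrac{1}{2}\operatorname{erf}(t/\sqrt{2})##, checks out numerically. A sketch in standard-library Python (my own verification, nothing more):

```python
# Check: sum_{n>=0} (-1)^n t^(2n) (t + sqrt(2π)(2n+1)) / (sqrt(2π) 2^n (2n+1) n!)
#        = exp(-t²/2) + (1/2) erf(t/√2)
import math

def series(t, terms=60):
    s = 0.0
    for n in range(terms):
        num = (-1) ** n * t ** (2 * n) * (t + math.sqrt(2 * math.pi) * (2 * n + 1))
        den = math.sqrt(2 * math.pi) * 2 ** n * (2 * n + 1) * math.factorial(n)
        s += num / den
    return s

def closed_form(t):
    return math.exp(-t * t / 2) + 0.5 * math.erf(t / math.sqrt(2))

for t in [0.3, 1.0, 2.0]:
    assert abs(series(t) - closed_form(t)) < 1e-12
# and indeed 2 * series(t) >= 1, which is Mandelbroth's target inequality
assert all(2 * series(t) >= 1.0 for t in [0.3, 1.0, 2.0])
```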
 
mfb, you literally just reversed the steps that he took to arrive at that :p
 
mfb said:
So now we are comparing a gaussian with the error function again. I think this went in a circle... And that is exactly the initial problem statement. Okay, bad direction. Well now we have that in TeX.
I'm literally crying with laughter right now. That was hilarious. :smile::smile::smile:
 