Challenge 8: Sub-Gaussian Variables

  • Context: Graduate
  • Thread starter: Office_Shredder
  • Tags: Challenge, Variables

Discussion Overview

The discussion revolves around proving that the standard normal distribution is sub-gaussian, starting from the definition of a sub-gaussian random variable via the tail bound P(|X| > t) ≤ 2e^{-Kt²}. Approaches include Fourier transforms, power-series expansions, and a Riemann-sum bound on the tail integral.

Discussion Character

  • Technical explanation
  • Mathematical reasoning
  • Debate/contested

Main Points Raised

  • Some participants define a sub-gaussian random variable and propose proving that the standard normal distribution meets this definition.
  • One participant suggests using the Fourier transform to express the integral of the probability density function in terms of its Fourier transform, aiming to find a lower bound for the integral.
  • Another participant attempts to establish a relationship involving power series expansions for the integral of the probability density function and the exponential decay term.
  • Some participants discuss the symmetry of the normal distribution and reformulate the integral to facilitate proving the sub-gaussian property.
  • One participant proposes a specific value for K and explores the implications of this choice on the inequalities that need to be satisfied.
  • Another participant raises concerns about the validity of their derivative calculations in their attempts to prove the sub-gaussian property.
  • One participant bounds the tail integral from above by a Riemann sum, evaluates the resulting geometric series, and completes the proof with K = 1/2; a follow-up argument shows that no constant larger than 1/2 can work.

Areas of Agreement / Disagreement

Participants eventually converge on a complete proof: bounding the tail integral by a Riemann sum and evaluating the resulting geometric series establishes the sub-gaussian property with K = 1/2, and a matching lower bound on the tail shows that no K > 1/2 can work. The earlier Fourier-transform and power-series approaches are left unfinished.

Contextual Notes

The Fourier-transform attempt stops at finding a suitable lower bound for an oscillatory integral, and the power-series manipulation is later shown to lead back to the original problem statement. The accepted proof depends on fixing K = 1/2 and treating small and large t separately.

Office_Shredder
Staff Emeritus, Science Advisor, Gold Member
A random variable X is called sub-gaussian if
[tex]P( |X| > t) \leq 2e^{-K t^2}[/tex]
for some constant K. Equivalently, the probability density p(x) is sub-gaussian if
[tex]\int_{-t}^{t} p(x) dx \geq 1 - 2 e^{-Kt^2}.[/tex]

The challenge: Prove that the standard normal distribution (with mean 0 and variance 1) is sub-gaussian.
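As a numerical plausibility check (an added sketch, not part of the original challenge post; the grid of sample points is an arbitrary choice), the claimed bound with K = 1/2 can be sampled directly, using the identity ##P(|X| > t) = \operatorname{erfc}(t/\sqrt{2})## for a standard normal ##X##:

```python
import math

def tail_prob(t):
    # P(|X| > t) for a standard normal X, via the complementary error function
    return math.erfc(t / math.sqrt(2))

def sub_gaussian_bound(t, K=0.5):
    # the claimed bound 2 * exp(-K t^2)
    return 2 * math.exp(-K * t * t)

# sample the claim on a grid of t values in [0, 10]
bound_holds = all(tail_prob(t) <= sub_gaussian_bound(t) for t in [i / 10 for i in range(101)])
```

This is of course not a proof, just evidence that K = 1/2 is a sensible target.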
 
I haven't found a solution yet, but here's a method that might be useful.

The standard normal distribution has the following probability density function:
$$p(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$$
Therefore we wish to prove that there exists some ##K## such that
$$\int_{-\infty}^{\infty} r_t(x) p(x) dx \geq 1 - 2e^{-Kt^2}$$
where
$$r_t(x) = \begin{cases}
1 & \text{ if }|x| < t \\
0 & \text{ if }|x| > t\\
\end{cases}$$
Note that in general, the integral of a function is equal to its Fourier transform evaluated at zero:
$$\int_{-\infty}^{\infty}f(x) dx = \left. \int_{-\infty}^{\infty} f(x) e^{-2\pi i u x} dx \right|_{u = 0} = \hat{f}(0)$$
where ##\hat{f}## is the Fourier transform of ##f##:
$$\hat{f}(u) = \int_{-\infty}^{\infty}f(x) e^{-2\pi i u x} dx$$
We may therefore express our desired integral (let's call it ##I##) in terms of Fourier transforms:
$$I = \int_{-\infty}^{\infty} r_t(x) p(x) dx = \widehat{r_t p}(0)$$
where ##\widehat{r_t p}## is the Fourier transform of the product ##r_t p##. It's easy to show that the Fourier transform of a product is equal to the convolution of the Fourier transforms of the factors:
$$\widehat{r_t p}(u) = \int_{-\infty}^{\infty} \widehat{r_t}(u-v)\widehat{p}(v) dv$$
Evaluating at ##u = 0##, we get
$$I = \widehat{r_t p}(0) = \int_{-\infty}^{\infty} \widehat{r_t}(-v) \widehat{p}(v) dv$$
Let us now compute ##\widehat{r_t}## and ##\widehat{p}##. Starting with the definition of the Fourier transform, we obtain
$$\begin{align}
\widehat{r_t}(v) &= \int_{-\infty}^{\infty} r_t(x) e^{-2\pi i v x} dx \\
&= \int_{-t}^{t} e^{-2\pi i v x} dx \\
&= \int_{-t}^{t} \cos(2\pi v x) dx \\
&= 2 \int_{0}^{t} \cos(2\pi v x) dx \\
&= \left. 2 \frac{\sin(2\pi v x)}{2\pi v} \right|_{x = 0}^{x = t} \\
&= 2 \frac{\sin(2 \pi v t)}{2\pi v} \\
&= \frac{\sin(2\pi v t)}{\pi v}\\
\end{align}$$
Note that the result is an even function of ##v##, i.e., ##\widehat{r_t}(-v) = \widehat{r_t}(v)##, so we can simplify our expression for ##I## to
$$I = \int_{-\infty}^{\infty} \widehat{r_t}(v) \widehat{p}(v) dv$$
Next we calculate ##\hat{p}##, again starting from the definition. We have
$$\begin{align}
\hat{p}(v) &= \int_{-\infty}^{\infty} p(x) e^{-2 \pi i v x} dx \\
&= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-x^2/2}e^{-2\pi i v x} dx\\
&= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \exp\left(-\frac{1}{2} (x^2 + 4\pi i v x)\right)dx \\
&= \exp\left(-\frac{1}{2} 4 \pi^2 v^2\right) \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \exp\left(-\frac{1}{2} (x + 2\pi i v)^2\right) dx \\
&= \exp\left(-\frac{1}{2} 4 \pi^2 v^2\right) \frac{1}{\sqrt{2\pi}} \cdot \sqrt{2\pi} \\
&= \exp(-2\pi^2 v^2)
\end{align}$$
Plugging the expressions for ##\widehat{r_t}## and ##\hat{p}## into our integral, we obtain
$$I = \int_{-\infty}^{\infty}\widehat{r_t}(v) \widehat{p}(v) dv =
\int_{-\infty}^{\infty} \frac{\sin(2\pi v t)}{\pi v} \exp(-2\pi^2 v^2) dv$$
Now if I can find the right lower bound for the integrand as a function of ##v## and ##t##, the result might follow. Or I could be barking up the wrong tree.

Intuitively, as ##t## increases, ##\sin(2\pi v t)/(\pi v)## becomes a taller and skinnier function centered at ##v = 0##, although with tails that decay on the order of ##1/v##, so the oscillations are important! Most of the value of ##I## is hopefully determined by the part spanned by the "main peak" of ##\sin(2\pi v t)/(\pi v)##, which is the part for which ##|v| < 1/(2t)##.

I think it's true that
$$I > \int_{-1/t}^{1/t} \frac{\sin(2\pi v t)}{\pi v} \exp(-2\pi^2 v^2) dv$$
because outside the interval ##[-1/t, 1/t]## you could approximate the integrand as an alternating series whose first term is positive.
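To sanity-check this Fourier representation numerically (an added sketch; the trapezoidal rule, the truncation at ##|v| = 1.5##, and the tolerances are my own choices), one can compare the convolution integral against the direct value ##\int_{-t}^{t} p(x)\,dx = \operatorname{erf}(t/\sqrt{2})##:

```python
import math

def fourier_I(t, L=1.5, n=60000):
    # trapezoidal integration of sin(2*pi*v*t)/(pi*v) * exp(-2*pi^2*v^2) over [-L, L];
    # the Gaussian factor makes the tail beyond |v| = 1.5 negligible (~1e-19)
    h = 2 * L / n
    total = 0.0
    for i in range(n + 1):
        v = -L + i * h
        gauss = math.exp(-2 * math.pi ** 2 * v * v)
        if abs(v) < 1e-12:
            f = 2 * t * gauss  # limit of sin(2*pi*v*t)/(pi*v) as v -> 0
        else:
            f = math.sin(2 * math.pi * v * t) / (math.pi * v) * gauss
        total += (0.5 if i in (0, n) else 1.0) * f
    return total * h

def direct_I(t):
    # int_{-t}^{t} p(x) dx for the standard normal
    return math.erf(t / math.sqrt(2))
```

The two agree to high accuracy, which at least confirms the representation of ##I## before any bounding is attempted.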
 
Office_Shredder said:
A random variable X is called sub-gaussian if
[tex]P( |X| > t) \leq 2e^{-K t^2}[/tex]
for some constant K. Equivalently, the probability density p(x) is sub-gaussian if
[tex]\int_{-t}^{t} p(x) dx \geq 1 - 2 e^{-Kt^2}.[/tex]

The challenge: Prove that the standard normal distribution (with mean 0 and variance 1) is sub-gaussian.
Let's try this.

We wish to prove that ##\displaystyle 1\leq\frac{\sqrt{2}}{\sqrt{\pi}}\int\limits_{[0,t]}e^{-\frac{x^2}{2}}\mathrm{d}x+2e^{-kt^2}=f(t)##.

Attempt 1:
Expanding by power series, we get $$\frac{\sqrt{2}}{\sqrt{\pi}}\int\limits_{[0,t]}e^{-\frac{x^2}{2}}\mathrm{d}x=\frac{\sqrt{2}}{\sqrt{\pi}}\sum_{0\leq n}\frac{(-1)^n t^{2n+1}}{2^n(2n+1)n!}.$$

For ##2e^{-kt^2}##, we expand and get ##\displaystyle 2e^{-kt^2}=2\sum_{m\geq 0}\frac{(-1)^mk^mt^{2m}}{m!}##.

Let ##k=\frac{1}{2}## for ease of notation. After a myriad of simplifications and a couple series manipulations which I am not going to LaTeX out, we just need to prove $$1\leq 2\sum_{n\geq 0}(-1)^n\frac{t^{2n}(t+\sqrt{2\pi}(2n+1))}{\sqrt{2\pi}2^n(2n+1)n!}.$$

By properties of alternating series, I'm pretty sure that one can show the RHS is always greater than 1.
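The power series used above can at least be checked numerically (an added sketch; the number of terms and the tolerances are my own choices) against the closed form ##\frac{\sqrt{2}}{\sqrt{\pi}}\int_0^t e^{-x^2/2}\,dx = \operatorname{erf}(t/\sqrt{2})##:

```python
import math

def erf_series(t, terms=40):
    # sqrt(2/pi) * sum_{n>=0} (-1)^n t^(2n+1) / (2^n (2n+1) n!)
    s = 0.0
    for n in range(terms):
        s += (-1) ** n * t ** (2 * n + 1) / (2 ** n * (2 * n + 1) * math.factorial(n))
    return math.sqrt(2 / math.pi) * s

def erf_closed(t):
    # closed form of the same integral, via the error function
    return math.erf(t / math.sqrt(2))
```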

Attempt 2:
Note that $$\frac{d}{dt}\left(\frac{\sqrt{2}}{\sqrt{\pi}}\int\limits_{[0,t]}e^{-\frac{x^2}{2}}\mathrm{d}x+2e^{-kt^2}\right)=\sqrt{\frac{2}{\pi}}e^{-\frac{1}{2}t^2}-4kte^{-kt^2}=e^{-\frac{1}{2}t^2}(\sqrt{\frac{2}{\pi}}-2t).$$ Letting ##k=1/2## again, for convenience. Then, we note that ##max(f)=f\left(\sqrt{\frac{1}{2\pi}}\right)>1##. Since ##f## is [strike]increasing[/strike] (clearly decreasing with an asymptote at 1) for all ##t>\sqrt{\frac{1}{2\pi}}## and ##f(t)>1## for all ##0\leq t\leq \sqrt{\frac{1}{2\pi}}##, the result follows.
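As a numerical aside (my own check, independent of the derivative calculation above), one can sample ##f(t) = \operatorname{erf}(t/\sqrt{2}) + 2e^{-t^2/2}##, i.e. the ##k = 1/2## case, and it does appear to stay at or above 1 for all ##t \geq 0##:

```python
import math

def f(t):
    # erf(t/sqrt(2)) + 2*exp(-t^2/2): the k = 1/2 case of the target function
    return math.erf(t / math.sqrt(2)) + 2 * math.exp(-t * t / 2)

# f(0) = 2 and f(t) -> 1 from above as t grows; sample a grid on [0, 6]
stays_above_one = all(f(i / 20) > 1 for i in range(121))
```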

Not sure if either of these is right. They were pretty ugly.

Edit: Ignore the second part. I clearly did something horribly wrong with the derivatives. >_>
 
I guess we just consider t≥0?
For t=0, it is trivial (left side 1/2, right side 1), consider t>0 now.

We want to show
$$\int_{-t}^{t} p(x) dx \geq 1 - 2 e^{-Kt^2}$$
As the normal distribution is symmetric, this can be written as
$$1-2\int_{t}^{\infty} p(x) dx \geq 1 - 2 e^{-Kt^2}$$
or
$$\int_{t}^{\infty} p(x) dx \leq e^{-Kt^2}$$

As a plot with the example K = 1/2, this means the red line has to stay above the blue line in a WolframAlpha plot (query: "plot int(1/sqrt(2pi)*exp(-y^2/2), y from x to infinity), e^(-x^2/2)").

Let's find an upper bound for the integral. As the integrand is monotonically decreasing,
$$\int_{t}^{\infty} p(x) dx \\
\leq \sum_{i=0}^\infty p(t+i) \\
= \frac{1}{\sqrt{2\pi}} \sum_{i=0}^\infty \exp\left(\frac{-(t+i)^2}{2}\right) \\
< \frac{1}{\sqrt{2\pi}} \sum_{i=0}^\infty \exp\left(\frac{-t^2-2ti}{2}\right) \\
= \frac{1}{\sqrt{2\pi}} \exp(-t^2/2) \sum_{i=0}^\infty \left(e^{-t}\right)^i
$$
As t>0, exp(-t)<1 so we can evaluate the sum.
$$ = \frac{1}{\sqrt{2\pi}} \exp(-t^2/2) \frac{1}{1-e^{-t}}$$
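A quick numerical check of this geometric-series bound (an added sketch; the sample grid is my own choice), with the tail integral written as ##\tfrac{1}{2}\operatorname{erfc}(t/\sqrt{2})##:

```python
import math

def tail_integral(t):
    # int_t^infty p(x) dx = 0.5 * erfc(t / sqrt(2)) for the standard normal
    return 0.5 * math.erfc(t / math.sqrt(2))

def geometric_bound(t):
    # exp(-t^2/2) / (sqrt(2*pi) * (1 - exp(-t))), valid for t > 0
    return math.exp(-t * t / 2) / (math.sqrt(2 * math.pi) * (1 - math.exp(-t)))

# the bound should dominate the tail integral at every sampled t > 0
bound_dominates = all(tail_integral(t) <= geometric_bound(t) for t in [i / 20 for i in range(1, 201)])
```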

Let's fix ##K=\frac{1}{2}##. We have to show that
$$\frac{1}{\sqrt{2\pi}} \exp(-t^2/2) \frac{1}{1-e^{-t}} \leq e^{-t^2/2}$$
This is satisfied if
$$\sqrt{2 \pi} (1-e^{-t}) \geq 1$$
where equality gives approximately t ≈ 0.51. Therefore, the inequality is satisfied for t≥1.

It is left to show the inequality for 0<t<1. In this interval, the LHS satisfies ##\int_{t}^{\infty} p(x) dx \leq 0.5##, but ##e^{-t^2/2} > e^{-1/2} \approx 0.607 > 0.5##.

As a result, with K=1/2 the inequality is satisfied everywhere and the standard normal distribution is sub-gaussian.
 
Mandelbroth said:
Let ##k=\frac{1}{2}## for ease of notation. After a myriad of simplifications and a couple series manipulations which I am not going to LaTeX out, we just need to prove $$1\leq 2\sum_{n\geq 0}(-1)^n\frac{t^{2n}(t+\sqrt{2\pi}(2n+1))}{\sqrt{2\pi}2^n(2n+1)n!}.$$

By properties of alternating series, I'm pretty sure that one can show the RHS is always greater than 1.

That seems like it would be hard - if t is sufficiently large the first bunch of terms will be increasing in magnitude (as they will have higher powers of t).

mfb, that's a really nice solution, I like the idea of bounding the integral with a Riemann sum. k=1/2 is actually the best bound that I know of, but your proof is the first one that makes me think that it is genuinely tight as opposed to just convenient (since the sum should be a really good estimate if t is large).

Any alternate proofs out there people can come up with?
 
Office_Shredder said:
k=1/2 is actually the best bound that I know of, but your proof is the first one that makes me think that it is genuinely tight as opposed to just convenient (since the sum should be a really good estimate if t is large).
Proof:

$$\int_{t}^{\infty} p(x) dx \\
{\bf \geq} \sum_{i=1}^\infty p(t+i) \\
= \frac{1}{\sqrt{2\pi}} \sum_{i=1}^\infty \exp\left(\frac{-(t+i)^2}{2}\right) \\
> \frac{1}{\sqrt{2\pi}} \exp(-\frac{(t+1)^2}{2})$$
Where the last expression is simply the first term of the sum.

Let's look at the inequality:
$$\frac{1}{\sqrt{2\pi}} \exp(-\frac{(t+1)^2}{2}) < e^{-K t^2}$$
This can be solved for K:
$$c+(t+1)^2 > 2Kt^2$$
Where ##c=\log(2 \pi)## is constant.
$$K< \frac{c+(t+1)^2}{2t^2} = \frac{1}{2} + \frac{1}{t} + \frac{1+c}{2t^2}$$
K cannot be larger than 1/2, or the inequality gets violated at some point.
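As a concrete numerical illustration of this tightness (an added sketch; the values ##K = 0.51## and ##t = 30## are my own choices), ##K = 1/2## still works at ##t = 30##, while ##K = 0.51## already fails there:

```python
import math

def tail_prob(t):
    # P(|X| > t) = erfc(t / sqrt(2)) for a standard normal X
    return math.erfc(t / math.sqrt(2))

t = 30.0
holds_at_half = tail_prob(t) <= 2 * math.exp(-0.5 * t * t)     # K = 1/2: bound holds
fails_above_half = tail_prob(t) > 2 * math.exp(-0.51 * t * t)  # K = 0.51: bound violated
```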
 
mfb said:
K cannot be larger than 1/2, or the inequality gets violated at some point.
That's...awesome. In the series method I used, ##k=\frac{1}{2}## was actually what I considered an aesthetic choice. I chose it because it made the computations simple and lots of things canceled to give something I knew would converge. That's pretty fantastic, when you think about how all of it connects. :approve:
 
Mandelbroth said:
we just need to prove $$1\leq 2\sum_{n\geq 0}(-1)^n\frac{t^{2n}(t+\sqrt{2\pi}(2n+1))}{\sqrt{2\pi}2^n(2n+1)n!}.$$
One part gives a nice sum, the other part gives the error function:

$$ \sum_{n\geq 0}(-1)^n\frac{t^{2n}(t+\sqrt{2\pi}(2n+1))}{\sqrt{2\pi}2^n(2n+1)n!} \\
= \sum_{n\geq 0}(-1)^n\frac{t^{2n}(\sqrt{2\pi}(2n+1))}{\sqrt{2\pi}2^n(2n+1)n!} + \sum_{n\geq 0}(-1)^n\frac{t^{2n}t}{\sqrt{2\pi}2^n(2n+1)n!} \\
= \sum_{n\geq 0}(-1)^n\frac{t^{2n}}{2^nn!} + \frac{1}{\sqrt{2\pi}}\sum_{n\geq 0}(-1)^n\frac{t^{2n+1}}{2^n(2n+1)n!} \\
= \sum_{n\geq 0}\frac{(-t^2/2)^n}{n!} + \frac{1}{\sqrt{2\pi}}\sum_{n\geq 0}(-1)^n\frac{t^{2n+1}}{2^n(2n+1)n!} \\
= \exp\left(-\frac{t^2}{2}\right) + \frac{1}{\sqrt{2\pi}}\sum_{n\geq 0}(-1)^n\frac{t^{2n+1}}{2^n(2n+1)n!}$$
Let's focus on the second part:
$$\frac{1}{\sqrt{2\pi}}\sum_{n\geq 0}(-1)^n\frac{t^{2n+1}}{2^n(2n+1)n!} \\
= \int dt \frac{d}{dt} \frac{1}{\sqrt{2\pi}}\sum_{n\geq 0}(-1)^n\frac{t^{2n+1}}{2^n(2n+1)n!} \\
= \frac{1}{\sqrt{2\pi}} \int dt \sum_{n\geq 0}(-1)^n\frac{t^{2n}}{2^n n!} \\
= \frac{1}{\sqrt{2\pi}} \int dt \exp\left(\frac{-t^2}{2}\right)$$
Now we have to get the integration constant right: For t=0, the expression is 0.
$$ = \frac{1}{\sqrt{2\pi}} \int_0^t dt' \exp\left(\frac{-t'^2}{2}\right)$$
So now we are comparing a gaussian with the error function again. I think this went in a circle...

We have to prove that$$1 \leq 2 \exp\left(-\frac{t^2}{2}\right) + \frac{2}{\sqrt{2\pi}} \int_0^t dt' \exp\left(\frac{-t'^2}{2}\right)$$
Which is equivalent to (using the symmetry of the integral)
$$1 - 2 \exp\left(-\frac{t^2}{2}\right) \leq \frac{1}{\sqrt{2\pi}} \int_{-t}^t dt' \exp\left(\frac{-t'^2}{2}\right)$$
And that is exactly the initial problem statement. Okay, bad direction. Well now we have that in TeX.
 
mfb, you literally just reversed the steps that he took to arrive at that :p
 
mfb said:
One part gives a nice sum, the other part gives the error function:

And that is exactly the initial problem statement. Okay, bad direction. Well now we have that in TeX.
I'm literally crying with laughter right now. That was hilarious. :smile::smile::smile:
 
