# FFT scaling vs analytical FT

• Featured
Mentor

## Summary:

How does the scaling of the numerical output of a forward FFT compare to the mathematical definition of the Fourier transform?

## Main Question or Discussion Point

When considering the forward FFT of a mathematical function sampled at times $t = 0, \Delta, \ldots, (N-1) \Delta$, following the usual convention, we have something like
$$H(f) = \int_{-\infty}^{+\infty} h(t) e^{-2 \pi i f t} dt \quad \Rightarrow \quad H_k = \sum_{n=0}^{N-1} h_n e^{-2 \pi i f t} = \sum_{n=0}^{N-1} h_n e^{-2 \pi i k n / N}$$
The difference between the mathematical and numerical version is the absence in the latter of the time-sampling step $\Delta$. (Conventionally, this is compensated by adding a factor $1/N$ in the inverse FFT, to recover the original signal.)

This appears to work with a function like a Gaussian, $h(t) = \exp(-(t-t_0)^2 / 2 \sigma^2)$, where $H(f)$ is recovered from $\Delta \times H_k$, independently of the number of points $N$.

However, if I take $h(t) = \exp(2 \pi i \nu t)$, which mathematically transforms to a Dirac delta, I recover $H_k$'s that are all zeros except for $k = j \equiv \nu N \Delta$ (assuming that $\nu$ is indeed chosen to be one of the discrete frequencies of the spectrum), and the value of $H_j$ is exactly $N$, independently of $\Delta$. Why is that?

Related Programming and Computer Science News on Phys.org
.Scott
Homework Helper
I am not following you down to the last detail - but let me take a shot at it anyway.
I believe the Dirac function you are using is (mathematically) zero at all values except one - and at that one exception it is 1. The pulse width is thus zero, and the power at all frequencies is zero. So if the FFT correctly modeled it, it would produce all zeroes.

Instead the FFT is working on a different assumption - that this pulse has a non-zero pulse width.

For practical uses, converting a real-life function into a series of discrete samples is best done by averaging over $\Delta$ - rather than strictly sampling.

Last edited:
PeterDonis
Mentor
2019 Award
I believe the Dirac function you are using is (mathematically) zero at all values except one - and at that one exception it is 1.
Actually the Dirac delta function is (heuristically) infinite at the one point where it's not zero (the point where its argument is zero). The defining property of the Dirac delta function is that its integral over the range $- \infty$ to $\infty$ is 1.

Mathematical purists would state this as the Delta not being a well-defined function at all, but rather a distribution.

.Scott
Homework Helper
Edited to fix the time/frequency/spatial terminology:
---------------------

So of the steps you are taking, I am not fully following.
So let me do my "FFT of Delta" experiment, and perhaps you can reply with similar detail.

I realize "time" and "frequency" are just special case applications for the FFT, but they might make some of the dialog easier.

And your H(t) is your frequency-domain function - the results of the FFT.

I am using Matlab. I realize you are not generating your Dirac this way - but here's my way:
Code:
ht_tm=[1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0];
ht_fr=fft(ht_tm);
This gets me an FFT (ht_tm) of all ones.
If I divide by sqrt(N) = 4, I get a total power of 1 - which is what I started with. So I am happy.

You are using h(t)=exp(2πiνt)h(t)=exp⁡(2πiνt). So I am not doing exactly what you are doing.

Last edited:
PeterDonis
Mentor
2019 Award
No. There is no $h(x)$ anywhere in his post. He's transforming between the time domain and the frequency domain. The time domain function he has a question about is $h(t) = \exp(2 \pi i \nu t)$.

And your H(t) is your time-domain function - the results of the FFT.
No. The result of the Fourier transform is a frequency domain function $H(f)$. The Fourier transform (by the continuous integral method) of $h(t) = \exp(2 \pi i \nu t)$ is $H(f) = \delta(f - \nu)$.

PeterDonis
Mentor
2019 Award
The difference between the mathematical and numerical version is the absence in the latter of the time-sampling step $\Delta$.
I think you mean the absence in the former, i.e., in the $H(f)$ obtained from the integral.

if I take $h(t) = \exp(2 \pi i \nu t)$, which mathematically transforms to a Dirac delta, I recover $H_k$'s that are all zeros except for $k = j \equiv \nu N \Delta$ (assuming that $\nu$ is indeed chosen to be one of the discrete frequencies of the spectrum), and the value of $H_j$ is exactly $N$, independently of $\Delta$. Why is that?
What values are you using for $h_n$?

Also, I don't understand what you mean by "one of the discrete frequencies of the spectrum". There is no "spectrum" in your $h(t)$ here; it's just one frequency, $\nu$. So that $\nu$, a constant, is what appears in the FFT formulas.

.Scott
Homework Helper
I have corrected my bad terminology.

For FFT, the Dirac function will yield a "spectrum" in the sense that there will be many occupied bins in the frequency domain.
For the specific example I gave, every frequency bin is occupied.

But I want to know exactly what is in @DrClaude 's discrete time-domain array. Is it all zeros and one non-zero value (similar to mine)?
The reason this is important is because there is no good way to represent the Delta function in a discrete array - and whatever method is used to approximate it will result in artifacts.

Part of the issue is that you are dealing with a function that is heavy with high-frequency components. What the FFT is really operating on is a periodic function - one that has a pulse every N $\Delta$'s.

Last edited:
Baluncore
2019 Award
I never trust the transfer function of an FFT as it is dependent on the algorithm and the size of the data set. Also the gain of the DC = Cos( zero ) term can often be different to AC terms.

The only way to be sure is to Slowly-Fourier-Synthesis a known amplitude sinewave, at an integer frequency, then FFT to identify the transform gain. Also check the DC gain.

Obviously different window functions change the gain by scaling the data, but they also convolute energy into adjacent bins which confuses the transfer function for real data with non-integer frequencies.

Most applications of the FFT can ignore the amplitude scale factor, and save time.

.Scott
Homework Helper
Yes. The windowing function is what you normally apply to your FFT output to avoid the problem with ambiguous high frequencies. It is usually 1 for most values and tails off to about zero at the ends of the FFT array. For the delta function, the windowing function is problematic.

PeterDonis
Mentor
2019 Award
I want to know exactly what is in @DrClaude 's discrete time-domain array. Is it all zeros and one non-zero value (similar to mine)?
It shouldn't be, since that's not what you get if you evaluate his $h(t)$ at a series of discrete values of $t$ separated by $\Delta$.

I asked @DrClaude in my last post to explicitly give the discrete time-domain array he is using.

PeterDonis
Mentor
2019 Award
I don't understand what you mean by "one of the discrete frequencies of the spectrum".
I think I do now; this statement by @.Scott made the light bulb go on in my head:

there is no good way to represent the Delta function in a discrete array
The obvious way is all of the values zero except one (which is what @DrClaude is saying he gets--all of his $H_k$'s are zero except the one corresponding to $\nu$). The problem is what the one non-zero value should be (since in the continuous case it's infinity, as I said in an earlier post).

But the "discrete array" here, I assume, is what @DrClaude meant by "discrete frequencies of the spectrum": you have to discretize the frequency range, so if you're trying to FFT a wave with a single frequency $\nu$, you have to make sure that $\nu$ is one of the discrete frequencies in the set you are using to discretize the frequency range.

Baluncore
2019 Award
Yes. The windowing function is what you normally apply to your FFT output to avoid the problem with ambiguous high frequencies. It is usually 1 for most values and tails off to about zero at the ends of the FFT array. For the delta function, the windowing function is problematic.
Well not quite, you have it backwards. The window function is applied to the input before the FFT. It does two things. Firstly, it eliminates the sudden step where the extreme ends of the cycle meet, so it removes the inherent high-frequency step noise from the spectrum. Secondly, it convolutes the input signal, which is widened so it falls into two or more bins, and not into the deep null between two bins where it could disappear completely.

Frequency analysis involves multiplying the input signal by all possible waves to detect by correlation. The FFT is therefore a harmonic mixer, so an anti-aliasing filter is needed to remove all frequency components above half the sampling frequency. That is done before the window is applied, to eliminate the harmonics of valid signals from being folded back around the spectrum as aliases.

.Scott
Homework Helper
Here is an excerpt from the User's Manual for the Trilib DSP library from Infineon:
Windowing: In the windowing method, an initial impulse response is derived by taking the Inverse Discrete Fourier Transform (IDFT) of the desired frequency response. Then, the impulse response is refined by applying a data window to it.
Note that the window is applied after the DSP. It has to be because it is specified in the frequency domain.

So here is the full signal treatment:
1) Apply a low pass filter to the raw analog signal - aiming for no more than half the sampling frequency as you cut-off frequency.
2) Digitize the signal.
3) Process it through an FFT
4) Apply the window.

However:
This thread has prompted me to look at a wider range of the use of the "windowing" method. It exists outside of the scope of the FFT. Only within the scope of FFT does it apply to frequency domain - and even then, inconsistently so.
So it boils down to what you are trying to do with the signal. In a situation where you have access to the analog signal before it is digitized, there is no sense in applying the window in the time domain. Otherwise, applying it in the time domain can be useful - especially when that window contains only real values (no imaginary part) and it avoids further windowing in the frequency domain.

DrGreg
Gold Member
Here is an excerpt from the User's Manual for the Trilib DSP library from Infineon:
Windowing: In the windowing method, an initial impulse response is derived by taking the Inverse Discrete Fourier Transform (IDFT) of the desired frequency response. Then, the impulse response is refined by applying a data window to it.
Note that the window is applied after the DSP. It has to be because it is specified in the frequency domain.
No, you have misread the quote. In that example,
1. You take the desired frequency response (in the frequency domain), apply an inverse DFT to get an initial impulse response (in the time domain)
2. You multiply the initial impulse response by a window (in the time domain).
In the context of DFTs, windows are always applied in the time domain.

Summary:: How does the scaling of the numerical output of a forward FFT compare to the mathematical definition of the Fourier transform?

Are we to worry about (1)going from a continuous Fourier transform to a discrete one or (2) use of the Fast Fourier transform method to compute the discrete fourier transform?

It seems to me to be getting mixed up and they are distinct questions.

This appears to work with a function like a Gaussian, $h(t) = \exp(-(t-t_0)^2 / 2 \sigma^2)$, where $H(f)$ is recovered from $\Delta \times H_k$, independently of the number of points $N$.
Just to be clear, when you sum over all the bins in the frequency domain for the Gaussian $H(f)$, you don't get $N$?

Last edited:
...not into the deep null between two bins where it could disappear completely.

Last edited:
DrGreg
Gold Member
it convolutes the input signal, which is widened so it falls into two or more bins, and not into the deep null between two bins where it could disappear completely.
Without windowing, a signal at a frequency between two bins doesn't vanish ar all; it is spread out over a larger number of bins; the window reduces the spreading,

Think about applying a rectangle function that truncates your data. Or think about the Heisenberg uncertainty principle. If you apply a window to localize your data, the alternate domain must spread out.

The purpose of this window functions is apodization. Getting rid of the “feet” around peaks and cleaning up the noise from aliasing so to our eyes we can better visualize the data. I think this is what you meant to convey.

Last edited:
Baluncore
2019 Award
OK, so the energy cannot completely disappear, the energy is redistributed in the spectrum. Like a tube of toothpaste, if you squeeze it at one end, it will come out somewhere else.

An input frequency midway between two bins implies a maximum step at the ends of the cycle, so windowing lowers the noise floor by removing that step, which in a sense is a reduction or a narrowing of the spectrum.

At the same time, multiplication by the window function in the time domain, is convolution in the frequency domain. As the time domain windows used are fundamentally a single cycle, the convolution adds one bin to either side of the response peak. So window convolution widens the response, usually from two bins to four or more bins. Just what the window does is determined by the spectrum of the time domain window selected and the exact frequency of the signal, but it effectively spreads or partitions available energy into adjacent bins that were once skirt.

So windowing does two opposite things, it reduces the broadband noise floor, while at the same time it widens and lowers the peak of the response, both of which smooth the spectrum. Because the local spreading of energy reduces the peak value detected, depending on application, it may require the transform gain be recalibrated.

There are too many places to squeeze the toothpaste tube. There is the top of the peak, the skirts, or the high frequency noise floor.
You can usually find a window function that will convolve the input energy into any spectrum you might want as an output. You cannot be truly good, unless you have bad thoughts, and can resist them.

marcusl
Gold Member
No, you have misread the quote. In that example,
1. You take the desired frequency response (in the frequency domain), apply an inverse DFT to get an initial impulse response (in the time domain)
2. You multiply the initial impulse response by a window (in the time domain).
In the context of DFTs, windows are always applied in the time domain.
Actually, since the FT and DFT are essentially symmetric, a window can be applied in either domain. One application where a window is applied in the frequency domain is in radar signal processing, where it is used to reduce so-called range sidelobes (time corresponds to range in radar).

Last edited:
marcusl
Gold Member

Think about applying a rectangle function that truncates your data. Or think about the Heisenberg uncertainty principle. If you apply a window to localize your data, the alternate domain must spread out.

The purpose of this window functions is apodization. Getting rid of the “feet” around peaks and cleaning up the noise from aliasing so to our eyes we can better visualize the data. I think this is what you meant to convey.
Apodization can't fix aliasing. It's used to reduce sidelobe levels, which can lead to a host of ills when performing spectral analysis including: "straddle loss," energy leakage into neighboring bins (sometimes referred to as the "Picket Fence" distortion), and the masking of weak spectral lines by spillover from nearby strong ones.

The downsides of windowing are: broadening of the main peak (someone correctly mentioned energy conservation--energy from the sidelobes ends up in the main peak), arbitrary deprecation of signal content at the beginning and end of the data record, and the lack of any rigorous basis (it's an ad-hoc band aid, which is one reason that dozens of different windows have been invented and used).

Apodization can't fix aliasing.
That's right. I got confused. I even just mentioned the Nyquist theorem. With this example, apodization happens in the time-domain. Low-pass filters belong in the frequency domain.

Mentor
Thanks to all that replied, but the discussion didn't go in the direction I was aiming for. That said, I think I have found the solution, in particular thanks to @PeterDonis in post #3 that made me focus on the aspects of the Dirac delta. Below, I will restate the problem with examples so everything is clear.

Consider the Gaussian function $h(t) = (\pi \sigma^2)^{-1/2} \exp(-(t-t_0)^2/\sigma^2)$. Let's transform it using the FFT and compare the results for different sampling rates $\Delta t$ and number of points $N$:
Matlab:
N = 1024;
dt = 1/N;
t = 0:dt:1-dt;
t0 = .5;
sigma2 = (16 * dt)^2;
norm = 1 / sqrt (pi * sigma2);
h = norm * exp (-(t - t0).^2 / sigma2);
H = fft(h);
H(1)
ans =  1024
Matlab:
N = 2048;
dt = 1/(N/2);
t = 0:dt:2-dt;
t0 = .5;
sigma2 = (16 * dt)^2;
norm = 1 / sqrt (pi * sigma2);
h = norm * exp (-(t - t0).^2 / sigma2);
H = fft(h);
H(1)
ans =  1024
Matlab:
N = 2048;
dt = 1/N;
t = 0:dt:1-dt;
t0 = .5;
sigma2 = (16 * dt)^2;
norm = 1 / sqrt (pi * sigma2);
h = norm * exp (-(t - t0).^2 / sigma2);
H = fft(h);
H(1)
ans =  2048
The first two cases use the same $\Delta t$ but a different $N$, and the answer is the same. Comparing the last two cases, we see that same $N$ bit different $\Delta t$ gives a different scaling. To recover the same scaling as the mathematical case, we need to multiply the result of the FFT by $\Delta t$, as in my post #1.

Consider now the function $h(t) = \exp(-2 \pi i f t)$. Again, let's transform it using the FFT and compare the results for different sampling rates $\Delta t$ and number of points $N$:
Matlab:
N = 1024;
dt = 1/N;
t = 0:dt:1-dt;
f = 4;
h = exp(2 * pi * i * f * t);
H = fft(h);
H(1:8)
ans =
Columns 1 and 2:
-3.8085e-14 + 8.2337e-15i  -5.6050e-14 - 5.6177e-15i
Columns 3 and 4:
-7.9501e-14 + 2.6375e-15i  -1.5801e-13 + 2.9424e-15i
Columns 5 and 6:
1.0240e+03 - 4.5938e-13i   1.6074e-13 + 3.1842e-15i
Columns 7 and 8:
7.6452e-14 + 8.8088e-16i   5.7424e-14 - 5.9275e-15i
Matlab:
N = 2048;
dt = 1/(N/2);
t = 0:dt:2-dt;
f = 4;
h = exp(2 * pi * i * f * t);
H = fft(h);
H(1:16)
ans =
Columns 1 and 2:
-9.2786e-15 + 5.7537e-15i  -5.7238e-14 + 5.7350e-14i
Columns 3 and 4:
-1.2683e-13 + 1.1743e-14i  -1.2466e-13 + 1.2523e-14i
Columns 5 and 6:
-1.6657e-13 + 2.6473e-14i  -2.2042e-13 - 7.1758e-15i
Columns 7 and 8:
-3.0651e-13 + 5.6659e-15i  -6.4047e-13 + 9.3314e-15i
Columns 9 and 10:
2.0480e+03 - 1.9989e-12i   6.4303e-13 + 1.1222e-14i
Columns 11 and 12:
3.1166e-13 + 8.0520e-15i   2.2321e-13 - 9.5963e-15i
Columns 13 and 14:
1.6746e-13 + 2.1866e-14i   1.2470e-13 + 1.2386e-14i
Columns 15 and 16:
1.2610e-13 + 1.3216e-14i   5.6498e-14 + 5.5692e-14i
Matlab:
N = 2048;
dt = 1/N;
t = 0:dt:1-dt;
f = 4;
h = exp(2 * pi * i * f * t);
H = fft(h);
H(1:8)
ans =
Columns 1 and 2:
-7.7900e-14 + 1.0295e-14i  -1.1034e-13 - 5.8091e-15i
Columns 3 and 4:
-1.5691e-13 + 6.5048e-15i  -3.1835e-13 + 4.1198e-15i
Columns 5 and 6:
2.0480e+03 - 1.0105e-12i   3.2215e-13 + 4.7495e-15i
Columns 7 and 8:
1.5294e-13 + 7.5246e-16i   1.1180e-13 - 4.4143e-15i
We see that the result is indeed a Dirac delta, with a single frequency containing a non-zero element. The value of that non-zero element is $N$, irrespective of the value of $\Delta t$. So, what is going on?

Using the properties of the Dirac delta, the integral of the frequency spectrum should be 1. We approximate it by a sum:
$$\int H(f) df \approx \sum_{k} (\Delta t \times H_k) \Delta f = \Delta t \times H_j \frac{1}{N \Delta t} = \frac{H_j}{N}$$
where $H_j$ is the non-zero element of the FFT and I have added the $\Delta t$ in front of $H_k$ following what I found for the Gaussian.

The conclusion is that indeed the Dirac delta must be scaled with $N$ to recover the correct mathematical Fourier transform.

Dr. Courtney