# Limits of pitch and time resolution in signal analysis (FFT/STFT)

## Main Question or Discussion Point

With the DFT/FFT/STFT or similar Fourier transforms, there's always a trade-off between frequency and time resolution. It's like the 'uncertainty principle' for signal analysis.

I have a few questions regarding this:

1: With a theoretical, infinitely fast processor, is it possible to obtain both good time AND frequency resolution? Which algorithm may help here?

2: How much better are wavelets for this sort of thing? If wavelets are better than the FFT/STFT, then could there be something even better than wavelets? What's the theoretical ceiling?

3: With the help of an incredibly fast CPU, one idea I thought of would be to analyze all possible sets of frequencies, amplitudes, and offsets of individual sine waves, mix them, and see which combination produces a result closest to a given window. Some signals/sounds may require one or two sine waves to come close, whilst others may require hundreds or even thousands of mixed sine waves (each with their own amplitudes, phase and frequency) to come close. Would this whole idea get close to perfection for signal analysis?

4: What's the difference between a DFT of a given window length and the STFT?

5: Since the STFT gives only a rough approximation of the original signal (with 'blurring' around each frequency found), how can it be so effective for pitch shifting, which it is commonly used for?

The reason for asking all this is that I'd love a spectrogram VST in the future to display a more accurate analysis of any sound.


1: With a theoretical, infinitely fast processor, is it possible to obtain both good time AND frequency resolution? Which algorithm may help here?
The FFT is already good here: it runs in O(N log N) time (log-linear complexity).
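As a sanity check on what the FFT computes, here's a sketch comparing NumPy's O(N log N) FFT against a naive O(N²) DFT; the signal and its length are arbitrary:

```python
import numpy as np

# Naive O(N^2) DFT for comparison with NumPy's O(N log N) FFT.
def naive_dft(x):
    N = len(x)
    n = np.arange(N)
    # DFT matrix: W[k, m] = exp(-2j*pi*k*m/N)
    W = np.exp(-2j * np.pi * np.outer(n, n) / N)
    return W @ x

rng = np.random.default_rng(0)
x = rng.standard_normal(256)

X_fast = np.fft.fft(x)   # O(N log N)
X_slow = naive_dft(x)    # O(N^2)

# Same transform, vastly different cost for large N.
print(np.allclose(X_fast, X_slow))  # True
```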

2: How much better are wavelets for this sort of thing? If wavelets are better than the FFT/STFT, then could there be something even better than wavelets? What's the theoretical ceiling?
I'll have to read up on this, but I don't think wavelets will give good frequency resolution, because their amplitude envelope should increase their bandwidth over that of a pure sinusoidal signal.
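A quick numerical illustration of that point: putting an amplitude envelope on a sinusoid does spread its spectrum. This sketch (all parameters are made up for illustration) compares a pure tone with a Gaussian-enveloped, Morlet-style one:

```python
import numpy as np

# Compare the spectral width of a pure tone vs. the same tone under a
# Gaussian amplitude envelope (a Morlet-style wavelet).
N, fs = 4096, 1024.0
t = np.arange(N) / fs                               # 4 s of signal
f0 = 100.0                                          # lands exactly on a DFT bin
pure = np.sin(2 * np.pi * f0 * t)
envelope = np.exp(-0.5 * ((t - 2.0) / 0.05) ** 2)   # 50 ms Gaussian envelope
wavelet = envelope * pure

def bins_above_half_max(sig):
    mag = np.abs(np.fft.rfft(sig))
    return int(np.sum(mag > 0.5 * mag.max()))

# The enveloped tone spreads its energy over many more frequency bins.
print(bins_above_half_max(pure), bins_above_half_max(wavelet))
```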

3: With the help of an incredibly fast CPU, one idea I thought of would be to analyze all possible sets of frequencies, amplitudes, and offsets of individual sine waves, mix them, and see which combination produces a result closest to a given window. Some signals/sounds may require one or two sine waves to come close, whilst others may require hundreds or even thousands of mixed sine waves (each with their own amplitudes, phase and frequency) to come close. Would this whole idea get close to perfection for signal analysis?
If you know a priori that you only need a small basis to describe the signal, then there is no need to use an FFT, and of course you can find a more efficient algorithm. Fitting the signal to an ARMA model, for example, might be more appropriate.
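For what it's worth, here's a minimal greedy sketch of the idea from the question: fit the strongest sinusoid, subtract it, and repeat. This is a crude matching pursuit over the DFT basis (not the ARMA fit suggested above), and the test signal is invented for the example:

```python
import numpy as np

# Greedy sine fitting: find the strongest frequency in the residual,
# subtract that one sinusoid (with its amplitude and phase), repeat.
def greedy_sinusoids(x, n_components):
    N = len(x)
    residual = np.asarray(x, dtype=float).copy()
    model = np.zeros(N)
    for _ in range(n_components):
        X = np.fft.rfft(residual)
        k = np.argmax(np.abs(X[1:])) + 1      # strongest non-DC bin
        atom_spec = np.zeros_like(X)
        atom_spec[k] = X[k]                   # keep that bin's amplitude and phase
        atom = np.fft.irfft(atom_spec, n=N)
        model += atom
        residual -= atom
    return model

t = np.arange(512) / 512.0
x = np.sin(2 * np.pi * 5 * t) + 0.3 * np.sin(2 * np.pi * 20 * t)
approx = greedy_sinusoids(x, 2)
print(np.max(np.abs(x - approx)))  # near zero: two components suffice here
```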

4: What's the difference between a DFT of a given window length and the STFT?

5: Since the STFT gives only a rough approximation of the original signal (with 'blurring' around each frequency found), how can it be so effective for pitch shifting, which it is commonly used for?

The reason for asking all this is that I'd love a spectrogram VST in the future to display a more accurate analysis of any sound.

With the DFT/FFT/STFT or similar Fourier transforms, there's always a trade-off between frequency and time resolution. It's like the 'uncertainty principle' for signal analysis.

I have a few questions regarding this:

1: With a theoretical, infinitely fast processor, is it possible to obtain both good time AND frequency resolution? Which algorithm may help here?
First, no information is lost in a DFT / IDFT round trip, except for the effects of floating-point precision. So for a given data set, the time-domain and frequency-domain representations have the same "resolution" in this sense.

The resolution in this sense is simply the number of samples and the floating-point precision of each sample (whether that's the number of frequency bins in the frequency domain or the number of time samples in the time domain). So you just need more CPU and more memory (for more samples and greater floating-point precision), and yes, you can get any resolution you want given enough of both.
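That losslessness is easy to verify numerically; a minimal sketch with NumPy (signal length is arbitrary):

```python
import numpy as np

# DFT then inverse DFT: the round trip recovers the signal up to
# floating-point rounding, i.e. no information is lost.
rng = np.random.default_rng(1)
x = rng.standard_normal(1024)
x_back = np.fft.ifft(np.fft.fft(x)).real

print(np.max(np.abs(x - x_back)))  # tiny: on the order of machine epsilon
```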

2: How much better are wavelets for this sort of thing? If wavelets are better than the FFT/STFT, then could there be something even better than wavelets? What's the theoretical ceiling?
Wavelets only give you more precision in measuring a specific frequency at the expense (assuming limited CPU and memory resources) of precision elsewhere. Also, to truly take advantage of those benefits, the wavelet correlation should be done before the signal is digitized (in the front-end analog circuitry). The other advantage of wavelet analysis, which is the case even when implemented after digitization, is that you can effectively shorten buffer lengths, which is useful in real-time applications so that buffering delay is reduced.

3: With the help of an incredibly fast CPU, one idea I thought of would be to analyze all possible sets of frequencies, amplitudes, and offsets of individual sine waves, mix them, and see which combination produces a result closest to a given window. Some signals/sounds may require one or two sine waves to come close, whilst others may require hundreds or even thousands of mixed sine waves (each with their own amplitudes, phase and frequency) to come close. Would this whole idea get close to perfection for signal analysis?
I think you are simply describing a lossy or lossless frequency-domain compression, which does have its merits. For example, it saves memory.
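A toy version of that frequency-domain compression, assuming NumPy and a made-up three-sinusoid test signal: keep only the K largest DFT coefficients and reconstruct.

```python
import numpy as np

# Keep only the K largest DFT coefficients and reconstruct: a crude
# frequency-domain compression.
rng = np.random.default_rng(2)
N = 1024
t = np.arange(N) / N
x = (np.sin(2 * np.pi * 10 * t)
     + 0.5 * np.sin(2 * np.pi * 37 * t)
     + 0.25 * np.sin(2 * np.pi * 91 * t)
     + 0.01 * rng.standard_normal(N))       # a little noise

X = np.fft.fft(x)
K = 6                                       # 3 sinusoids -> 6 bins (+/- frequencies)
keep = np.argsort(np.abs(X))[-K:]
X_sparse = np.zeros_like(X)
X_sparse[keep] = X[keep]
x_hat = np.fft.ifft(X_sparse).real

# Six coefficients out of 1024 capture nearly all the energy.
err = np.linalg.norm(x - x_hat) / np.linalg.norm(x)
print(err)  # small relative error (roughly the noise floor)
```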

4: What's the difference between a DFT of a given window length and the STFT?

5: Since the STFT gives only a rough approximation of the original signal (with 'blurring' around each frequency found), how can it be so effective for pitch shifting, which it is commonly used for?
Sorry, I don't know enough about the STFT.

The reason for asking all this is that I'd love a spectrogram VST in the future to display a more accurate analysis of any sound.

First, no information is lost in a DFT / IDFT round trip, except for the effects of floating-point precision. So for a given data set, the time-domain and frequency-domain representations have the same "resolution" in this sense.
Edit: It seems I said below what you wrote above, but I'll say it anyway.

If you are using a DFT to approximate a Fourier transform, then increasing the number of time steps improves the frequency resolution, while increasing the sampling rate raises the highest frequency you can estimate. Given a fixed computation time, this might be the trade-off the original poster meant.
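Concretely, with NumPy's `rfftfreq` (the sampling rate here is arbitrary): the bin spacing is fs/N, so longer records give finer resolution, while the top bin sits at the Nyquist frequency fs/2:

```python
import numpy as np

# For an N-point DFT at sampling rate fs: bin spacing is fs/N (finer
# with longer records), and the highest bin is the Nyquist frequency
# fs/2 (higher with faster sampling).
fs = 48000.0
for N in (1024, 4096):
    freqs = np.fft.rfftfreq(N, d=1.0 / fs)
    df = freqs[1] - freqs[0]
    print(N, df, freqs[-1])   # df = fs/N, last bin = fs/2
```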

However, I don't think this is the uncertainty principle as it applies to frequency transforms. I do know that the wider a rect function is in time, the narrower its transform is in the frequency domain. I think this is related to the uncertainty principle, but it has been a while since I took a DSP course. I remember a test question that involved DSP and an uncertainty principle, and I'm not sure anyone in the class knew how to solve it.