Is the FFT effective for analyzing real-world music?

Discussion Overview

The discussion revolves around the effectiveness of the Fast Fourier Transform (FFT) for analyzing real-world music, particularly in the context of a program that processes audio from a microphone. Participants explore the challenges faced when applying FFT to non-synthetic sounds, such as music from YouTube, and consider various techniques to improve the analysis.

Discussion Character

  • Exploratory
  • Technical explanation
  • Debate/contested
  • Mathematical reasoning

Main Points Raised

  • One participant notes that FFT is designed for repeated waveforms and may yield misleading results when applied to arbitrary audio samples.
  • Another suggests using a windowing technique to mitigate artifacts, proposing the Hanning window and overlapping samples to maintain information integrity.
  • A participant questions whether they should zero out unused samples in their FFT process, indicating confusion about the application of windowing.
  • There is a discussion about the importance of filtering to avoid frequency leakage in FFT results.
  • One participant mentions that their FFT results are slightly off from expected frequencies, raising concerns about the accuracy of their analysis.
  • Another participant explains that the resolution of FFT frequencies is limited by the sample size and Nyquist frequency, suggesting that minor discrepancies in frequency readings may be acceptable.
  • One participant describes having a filter to remove low noise, comparing it to windowing, but does not elaborate on its effectiveness.

Areas of Agreement / Disagreement

Participants express varying opinions on the application of windowing and filtering techniques, with no consensus on the best approach to analyze real-world music using FFT. The discussion remains unresolved regarding the effectiveness of FFT in this context.

Contextual Notes

Participants highlight limitations related to the assumptions of FFT, the need for windowing to reduce artifacts, and the potential for frequency inaccuracies due to the nature of the audio source.

btb4198
So I have a program that pulls data from a microphone and does an FFT.
It is working for computer-generated sine waves, like those from
http://onlinetonegenerator.com/
and
YouTube.

But it is not working for real-world sound from YouTube.
I do not have a piano, so I have been using YouTube videos that play the middle C scale,
and my program is not working so well with them...

Does the FFT not work well with real music?
Is there anything I should add to the FFT?
 
The thing about an FFT is that it only applies (by definition) to a repeated waveform. You take a string of samples, and the FFT (a fast algorithm for the Discrete Fourier Transform) assumes that string goes round and round for ever. An FFT of just any old string of samples can give you a load of very misleading results, telling you there is a structure of harmonics (of the repeat rate of the sample loop) when there may well be none in the original signal source.
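
To make the "round and round" point concrete, here is a minimal sketch in plain Python. It uses a naive DFT rather than a real FFT library, and the 64-sample length, the 8 and 8.5 cycle counts, and the 5% threshold are arbitrary choices for illustration. It counts how many bins carry noticeable energy when the sample string holds a whole number of cycles versus when it does not loop seamlessly.

```python
import cmath
import math

def dft_magnitudes(samples):
    """Naive O(N^2) DFT; returns magnitudes for bins 0..N/2."""
    n = len(samples)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(samples)))
        for k in range(n // 2 + 1)
    ]

def sine(cycles, n):
    """n samples of a sine that completes `cycles` periods in the block."""
    return [math.sin(2 * math.pi * cycles * i / n) for i in range(n)]

def count_significant(mags, fraction=0.05):
    """Number of bins whose magnitude exceeds `fraction` of the peak."""
    peak = max(mags)
    return sum(1 for m in mags if m > fraction * peak)

N = 64
exact = dft_magnitudes(sine(8.0, N))   # whole number of cycles: loops seamlessly
broken = dft_magnitudes(sine(8.5, N))  # half a cycle left over: loop does not join up

print("bins with energy, 8 cycles:  ", count_significant(exact))
print("bins with energy, 8.5 cycles:", count_significant(broken))
```

The first case puts all its energy in a single bin; the second smears it over many bins even though the underlying tone is just as pure.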

One way round this is to take a very long string and then 'window' them down to a smaller number of samples in the middle, which will suppress the unwanted artefacts a bit. Unfortunately, this topic gets very hard very quickly. Have a look in this Wiki article.

Another way to get some sense would be to look at the samples on a digital editor and to snip out a sequence that looks as though it repeats. But you will find it hard to get any information about the transients of the attack of notes.

You can still have fun with waveforms and their transforms and see if you can make sense out of relating the two together.
 
sophiecentaur said:
One way round this is to take a very long string and then 'window' them down to a smaller number of samples in the middle, which will suppress the unwanted artefacts a bit. Unfortunately, this topic gets very hard very quickly. Have a look in this Wiki article.

You don't need to take a "very long" string. Windowing is a good idea for any signal-processing using Fourier analysis.

I agree the subject can get esoteric, but a practical recipe that doesn't throw away any information is to use the Hanning (or Hann) window (as defined on the wiki page), and split the data into chunks that overlap by half. In other words, if you use a 1024-point FFT, then do FFTs on samples 1...1024, 513...1536, 1025...2048, etc. That choice of window function makes the whole process invertible, because the overlapping window functions add up to exactly 1 everywhere. The window function is ##(1 - \cos 2x)/2 = \sin^2 x##, so successive windows alternate between ##\sin^2 x## and ##\cos^2 x##, and ##\sin^2 x + \cos^2 x = 1##.
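
A rough sketch of that recipe in plain Python, to check the "adds up to exactly 1" claim numerically. It assumes the periodic form of the Hann window, ##\sin^2(\pi i / N)## for ##i = 0 \dots N-1##, and uses 0-based sample indices, so the chunks start at 0, 512, 1024, and so on; the function names are made up for this example.

```python
import math

def hann(n):
    """Periodic Hann window: sin^2(pi * i / n) for i = 0..n-1."""
    return [math.sin(math.pi * i / n) ** 2 for i in range(n)]

def overlapping_chunks(samples, size):
    """Yield (start, windowed_chunk) pairs with a hop of size // 2."""
    window = hann(size)
    hop = size // 2
    for start in range(0, len(samples) - size + 1, hop):
        chunk = samples[start:start + size]
        yield start, [w * x for w, x in zip(window, chunk)]

SIZE = 1024
signal = [1.0] * (4 * SIZE)  # constant input makes the sum of the windows visible

# Add every windowed chunk back at its original position.
total = [0.0] * len(signal)
for start, chunk in overlapping_chunks(signal, SIZE):
    for i, value in enumerate(chunk):
        total[start + i] += value

# Away from the first and last half-chunk, each sample is covered by two windows.
interior = total[SIZE // 2 : len(signal) - SIZE // 2]
print(min(interior), max(interior))
```

Only the first and last half-chunk are covered by a single window, which is where padding the ends of the data with zeros helps.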

If you are interested in the transients at the beginning and end of the data, add some 0's at the start and end, so that half of the first chunk of data is zero.

If you are processing "music", one FFT will show you all the harmonics of all the notes that were played in that time interval, which can just look like a mess. It might work better to use fairly short FFTs, e.g. 4096 points, which is about 1/10 of a second at 44100 samples/second. See http://en.wikipedia.org/wiki/Short-time_Fourier_transform
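
As an illustration of the short-FFT idea, the sketch below runs half-overlapping 4096-point Hann-windowed FFTs over a made-up test signal (a 261.63 Hz tone followed by a 440 Hz tone) and reports the strongest frequency in each frame. The small recursive radix-2 `fft` is only there to keep the example self-contained; the test signal and all the names are assumptions for this example, not anything from the original program.

```python
import cmath
import math

SAMPLE_RATE = 44100   # samples per second, as quoted above
FRAME = 4096          # samples per short FFT (about 0.093 s)
HOP = FRAME // 2      # half-overlapping frames

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def tone(freq_hz, n_samples):
    """A pure sine tone at freq_hz, n_samples long."""
    return [math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)
            for i in range(n_samples)]

def peak_frequencies(samples):
    """Strongest frequency (Hz) in each Hann-windowed frame."""
    window = [math.sin(math.pi * i / FRAME) ** 2 for i in range(FRAME)]
    peaks = []
    for start in range(0, len(samples) - FRAME + 1, HOP):
        frame = [w * s for w, s in zip(window, samples[start:start + FRAME])]
        mags = [abs(c) for c in fft(frame)[: FRAME // 2]]
        peak_bin = max(range(len(mags)), key=mags.__getitem__)
        peaks.append(peak_bin * SAMPLE_RATE / FRAME)
    return peaks

# Hypothetical test signal: two notes, one after the other.
signal = tone(261.63, 2 * FRAME) + tone(440.0, 2 * FRAME)
peaks = peak_frequencies(signal)
print([round(p, 1) for p in peaks])
```

One long FFT over the whole signal would show both tones at once; the per-frame peaks show which one was sounding when, at the price of a coarser frequency grid (about 10.8 Hz per bin here).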
 
OK, I am windowing... I think.
I am taking 16384 samples at a time
and I forget about the other samples;
well, I just do not use them.
Are you saying I should make them 0 and put them in my FFT?
 
btb4198 said:
OK, I am windowing... I think.
I am taking 16384 samples at a time
and I forget about the other samples;
well, I just do not use them.
Are you saying I should make them 0 and put them in my FFT?

No, he's saying you should apply a filter to your data (a window) such that the beginning and the end of the data sequence are roughly equal. Otherwise there will be significant "frequency leakage" in your FFT and it could be very misleading.
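
A small sketch of what that looks like in practice, in plain Python with a naive DFT. The 64-sample block, the 8.5-cycle test tone, and the "12 bins away from the peak" measure of leakage are all arbitrary choices for this example. The window is multiplied into the samples before the transform, so both ends of the block taper toward zero.

```python
import cmath
import math

def dft_magnitudes(samples):
    """Naive O(N^2) DFT; returns magnitudes for bins 0..N/2."""
    n = len(samples)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(samples)))
        for k in range(n // 2 + 1)
    ]

def far_leakage(mags, offset=12):
    """Level `offset` bins above the peak, relative to the peak itself."""
    peak = max(range(len(mags)), key=mags.__getitem__)
    return mags[peak + offset] / mags[peak]

N = 64
# A tone that does not fit a whole number of cycles into the block.
raw = [math.sin(2 * math.pi * 8.5 * i / N) for i in range(N)]

# Periodic Hann window, applied sample by sample before the transform.
hann = [math.sin(math.pi * i / N) ** 2 for i in range(N)]
windowed = [w * x for w, x in zip(hann, raw)]

rect_leak = far_leakage(dft_magnitudes(raw))
hann_leak = far_leakage(dft_magnitudes(windowed))
print(f"no window: {rect_leak:.4f}   Hann window: {hann_leak:.6f}")
```

The windowed block shows far less energy in bins well away from the true frequency, at the cost of a slightly wider main peak.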
 
What kind of filter? And what should it do?
 
If you are doing an FFT on 16384 samples and the Nyquist frequency is 22050 Hz (i.e. a 44100 Hz sample rate), the spacing of your FFT frequencies will be 44100/16384 = 2.69 Hz, because only 16384/2 = 8192 bins cover the range from 0 to 22050 Hz.

If you didn't do any windowing and are just looking at a plot of the FFT data, getting within 2 or 3 frequency increments of the "accurate" frequency is probably about as good as you will get. In any case, there is no guarantee that the pitch of the music you played from YouTube was actually A = 440 Hz.
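
For reference, the spacing between adjacent FFT bins is the sample rate divided by the number of samples. The sketch below works that out for a 16384-point FFT, assuming a 44100 Hz sample rate (twice the 22050 Hz Nyquist frequency), and shows how far the nearest bin centre sits from two equal-tempered pitches. The note frequencies assume A = 440 Hz tuning, which, as said above, the recording may not follow.

```python
SAMPLE_RATE = 44100   # Hz, assumed from the 22050 Hz Nyquist frequency
N = 16384             # samples per FFT

bin_spacing = SAMPLE_RATE / N   # Hz between adjacent bins

def nearest_bin(freq_hz):
    """Return (bin index, bin centre frequency in Hz) closest to freq_hz."""
    k = round(freq_hz / bin_spacing)
    return k, k * bin_spacing

print(f"bin spacing: {bin_spacing:.3f} Hz")
for name, freq in [("middle C", 261.63), ("A above middle C", 440.0)]:
    k, centre = nearest_bin(freq)
    print(f"{name}: {freq:.2f} Hz -> bin {k} at {centre:.2f} Hz "
          f"(off by {centre - freq:+.2f} Hz)")
```

A reading that is off by a hertz or so is therefore within what the bin grid alone can explain, before any leakage or tuning differences are considered.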
 
I have a filter that removes low-level noise:
if the magnitude is less than 300,000,
I do not display it.

That is kind of like windowing,
and you are right about the YouTube video.
 
