# Real world music and FFT

1. Jan 29, 2014

### btb4198

So I have a program that pulls data from a microphone and does an FFT.
Right now it is working for computer-generated sine waves, like the ones from:
http://onlinetonegenerator.com/
and

I do not have a piano, but I have been using YouTube videos that play the middle C scale,
and my program is not working so well with them...

Does the FFT not work well with real music?
Is there anything I should add to the FFT?
I

2. Jan 29, 2014

### sophiecentaur

The thing about an FFT is that it only applies (by definition) to a repeated waveform. You take a string of samples, and the FFT (a fast algorithm for the Discrete Fourier Transform) assumes that the string goes round and round forever. An FFT of just any old string of samples can give you a load of very misleading results, telling you there is a structure of harmonics (of the repeat rate of the sample loop) when there may well not be one in the original signal source.
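A minimal sketch of that effect, using only the Python standard library. The block length of 64 samples, the helper names, and the naive DFT (used here in place of a real FFT so the example needs no extra packages) are all assumptions for illustration. It compares a sine that fits a whole number of cycles into the block with one that ends mid-cycle, and reports how much energy lands outside the bins next to the expected peak.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Naive O(N^2) DFT; returns |X[k]| for bins k = 0 .. N/2."""
    n = len(samples)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(samples)))
        for k in range(n // 2 + 1)
    ]


def sine_block(cycles, n):
    """n samples of a unit sine completing `cycles` periods in the block."""
    return [math.sin(2 * math.pi * cycles * t / n) for t in range(n)]


def leaked_fraction(mags, peak_bin, halfwidth=1):
    """Fraction of spectral energy outside peak_bin +/- halfwidth."""
    total = sum(m * m for m in mags)
    near = sum(m * m for k, m in enumerate(mags)
               if abs(k - peak_bin) <= halfwidth)
    return 1.0 - near / total


N = 64  # hypothetical block length, kept small so the naive DFT stays fast
whole = leaked_fraction(dft_magnitudes(sine_block(8.0, N)), peak_bin=8)
ragged = leaked_fraction(dft_magnitudes(sine_block(8.5, N)), peak_bin=8)
print(f"whole number of cycles: leaked fraction = {whole:.3g}")
print(f"block ends mid-cycle:   leaked fraction = {ragged:.3g}")
```

The first case matches the "goes round and round" assumption, so essentially all the energy sits in one bin. The second does not, and a noticeable share of the energy is smeared across bins that correspond to no frequency actually present in the source.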

One way round this is to take a very long string of samples and then 'window' it down to a smaller number of samples in the middle, which will suppress the unwanted artefacts a bit. Unfortunately, this topic gets very hard very quickly. Have a look at this Wiki article.

Another way to get some sense would be to look at the samples on a digital editor and to snip out a sequence that looks as though it repeats. But you will find it hard to get any information about the transients of the attack of notes.

You can still have fun with waveforms and their transforms and see if you can make sense out of relating the two together.

3. Jan 29, 2014

### AlephZero

You don't need to take a "very long" string. Windowing is a good idea for any signal processing that uses Fourier analysis.

I agree the subject can get esoteric, but a practical recipe that doesn't throw away any information is to use the Hanning (or Hann) window (as defined on the wiki page), and split the data into chunks that overlap by half. In other words, if you use a 1024 point FFT, then do FFTs on samples 1...1024, 513...1536, 1025...2048, etc. That choice of window function makes the whole process invertible, because the overlapping window functions add up to exactly 1 everywhere. The window function is $(1 - \cos 2x)/2 = \sin^2 x$, so the successive windows alternate between $\sin^2 x$ and $\cos^2 x$, and $\sin^2 x + \cos^2 x = 1$.
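The "adds up to exactly 1" claim can be checked numerically. This is a sketch under one assumption worth stating: it uses the periodic form of the Hann window, $w[i] = \sin^2(\pi i / N)$ for $i = 0 \ldots N-1$. The symmetric form that divides by $N - 1$ only sums to approximately 1. The function name and the number of frames are arbitrary choices for the example.

```python
import math


def hann_periodic(n):
    """Periodic Hann window: w[i] = sin^2(pi * i / n) = (1 - cos(2*pi*i/n)) / 2."""
    return [math.sin(math.pi * i / n) ** 2 for i in range(n)]


N = 1024          # FFT length from the post
HOP = N // 2      # chunks overlap by half
NUM_FRAMES = 6    # arbitrary number of chunks for the check

w = hann_periodic(N)

# Overlap-add the bare window at each chunk position.
total = [0.0] * (HOP * (NUM_FRAMES + 1))
for frame in range(NUM_FRAMES):
    start = frame * HOP
    for i in range(N):
        total[start + i] += w[i]

# Away from the two ends, every sample is covered by exactly two windows.
interior = total[HOP:HOP * NUM_FRAMES]
max_dev = max(abs(v - 1.0) for v in interior)
print(f"largest deviation from 1 in the interior: {max_dev:.2e}")
```

Only the first and last half-chunks fall short of 1, since they are covered by a single window.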

If you are interested in the transients at the beginning and end of the data, add some zeros at the start and end, so that half of the first chunk of data is zero.

If you are processing "music", one FFT will show you all the harmonics of all the notes that were played in that time interval, which can just look like a mess. It might work better to use fairly short FFTs, e.g. 4096 points, which is about 1/10 of a second at 44100 samples/second. See http://en.wikipedia.org/wiki/Short-time_Fourier_transform
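A rough sketch of the chunking step, assuming half a chunk of zeros is added at each end (the posts say "some" zeros, so the exact amount is an assumption) and that any leftover samples in the trailing padding are simply dropped. `split_frames` is a hypothetical helper, not a library function. Each chunk would then be multiplied by the window and passed to the FFT.

```python
def split_frames(samples, frame_len):
    """Split samples into half-overlapping chunks of frame_len samples.

    Half a chunk of zeros is added at each end so that the first chunk
    is half zeros and the attack of the first note is not lost.
    """
    hop = frame_len // 2
    padded = [0.0] * hop + list(samples) + [0.0] * hop
    last_start = len(padded) - frame_len
    return [padded[s:s + frame_len] for s in range(0, last_start + 1, hop)]


SAMPLE_RATE = 44100   # samples per second
FRAME_LEN = 4096      # short FFT length suggested in the post

frame_seconds = FRAME_LEN / SAMPLE_RATE
print(f"one chunk covers about {frame_seconds:.3f} s")

# Stand-in signal: 10000 samples of constant 1.0 instead of microphone data.
frames = split_frames([1.0] * 10000, FRAME_LEN)
print(f"{len(frames)} chunks of {len(frames[0])} samples each")
```

Shorter chunks give better time resolution but coarser frequency spacing, so the chunk length is a trade-off to tune for the material.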

Last edited: Jan 29, 2014

4. Jan 29, 2014

### btb4198

OK, I am windowing... I think.
I am taking 16384 samples at a time
and I forget about the other samples;
well, I just do not use them.
Are you saying I should make them 0 and put them in my FFT?

5. Jan 29, 2014

### analogdesign

No, he's saying you should apply a filter to your data (a window) such that the beginning and the end of the data sequence are roughly equal. Otherwise there will be significant "frequency leakage" in your FFT and it could be very misleading.
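To make the idea concrete, here is a small sketch that multiplies a block of samples by a Hann window before transforming it. Everything specific is an assumption for illustration: the 64-sample block, the test tone that ends mid-cycle, the naive DFT standing in for a real FFT, and the choice to call a bin "far" when it is more than 3 bins from the tone.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Naive O(N^2) DFT; returns |X[k]| for bins k = 0 .. N/2."""
    n = len(samples)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(samples)))
        for k in range(n // 2 + 1)
    ]


def far_fraction(mags, tone_bin, keep=3):
    """Fraction of spectral energy more than `keep` bins from the tone."""
    total = sum(m * m for m in mags)
    far = sum(m * m for k, m in enumerate(mags) if abs(k - tone_bin) > keep)
    return far / total


N = 64
TONE_BIN = 8.5  # 8.5 cycles per block, so the block ends mid-cycle

signal = [math.sin(2 * math.pi * TONE_BIN * t / N) for t in range(N)]

# Hann window: tapers smoothly to zero at both ends of the block.
hann = [0.5 - 0.5 * math.cos(2 * math.pi * t / N) for t in range(N)]
windowed = [s * w for s, w in zip(signal, hann)]

raw_far = far_fraction(dft_magnitudes(signal), TONE_BIN)
win_far = far_fraction(dft_magnitudes(windowed), TONE_BIN)
print(f"energy far from the tone, no window:   {raw_far:.3g}")
print(f"energy far from the tone, Hann window: {win_far:.3g}")
```

The window widens the main peak a little, but it strongly reduces the energy that shows up at frequencies far from the real tone.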

6. Jan 29, 2014

### btb4198

What kind of filter? And what should it do?

7. Jan 29, 2014

### analogdesign

8. Jan 29, 2014

### btb4198

9. Jan 30, 2014

### AlephZero

If you are doing an FFT on 16384 samples at 44100 samples/second (a Nyquist frequency of 22050 Hz), the spacing of your FFT frequencies will be 44100/16384 ≈ 2.69 Hz.
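A short sketch of the bin arithmetic, assuming a 44100 Hz sample rate (twice the 22050 Hz Nyquist frequency) and taking middle C as 261.63 Hz, which is its equal-temperament value when A = 440 Hz. The spacing between adjacent FFT bins is the sample rate divided by the number of samples. The function names are made up for the example.

```python
def bin_spacing_hz(sample_rate, fft_size):
    """Spacing between adjacent FFT bins in Hz."""
    return sample_rate / fft_size


def nearest_bin(freq_hz, sample_rate, fft_size):
    """Index of the FFT bin whose centre is closest to freq_hz."""
    return round(freq_hz * fft_size / sample_rate)


SAMPLE_RATE = 44100   # assumed: twice the 22050 Hz Nyquist frequency
FFT_SIZE = 16384
MIDDLE_C_HZ = 261.63  # assumed: equal temperament, A = 440 Hz

spacing = bin_spacing_hz(SAMPLE_RATE, FFT_SIZE)
k = nearest_bin(MIDDLE_C_HZ, SAMPLE_RATE, FFT_SIZE)
bin_centre = k * spacing

print(f"bin spacing: {spacing:.3f} Hz")
print(f"middle C falls nearest bin {k}, centred at {bin_centre:.2f} Hz")
```

Because the note does not sit exactly on a bin centre, the peak read straight off the plot can be off by a fraction of the bin spacing even for a perfectly tuned source.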

If you didn't do any windowing and are just looking at a plot of the FFT data, getting within 2 or 3 frequency increments of the "accurate" frequency is probably about as good as you will get. In any case, there is no guarantee that the pitch of the music you played from YouTube was actually A = 440 Hz.

10. Jan 30, 2014

### btb4198

I have a filter that removes low-level noise:
if the magnitude is less than 300,000,
I do not display it.

That is kind of like windowing.