
Real world music and FFT

  1. Jan 29, 2014 #1
    So I have a program that pulls data from a microphone and does an FFT.
    It is working for computer-generated sine waves, like the ones from:

    but not for real-world sound from YouTube.
    I do not have a piano, so I have been using YouTube videos that play the middle C scale,
    and my program is not working so well with them...

    Does the FFT not work well with real music?
    Is there anything I should add to the FFT?
  3. Jan 29, 2014 #2


    Science Advisor
    Gold Member
    2017 Award

    The thing about an FFT is that it only applies (by definition) to a repeating waveform. You take a string of samples, and the FFT (a fast algorithm for the Discrete Fourier Transform) assumes that string goes round and round for ever. An FFT of just any old string of samples can give you very misleading results, telling you there is a structure of harmonics (of the repeat rate of the sample loop) when there may well be none in the original signal source.
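
    A minimal sketch of that effect, in Python with only the standard library. The function names, the 64-sample length and the 1% threshold are illustrative choices, not anything from the thread, and a plain (slow) DFT stands in for the FFT since both give the same numbers. A sine that fits a whole number of cycles into the sample string lands in one frequency bin; one that stops half a cycle short of repeating smears across many bins.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Magnitudes of DFT bins 0..N/2 (naive O(N^2) transform)."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        total = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples))
        mags.append(abs(total))
    return mags


def sine(cycles, n):
    """n samples of a sine that completes `cycles` cycles over the string."""
    return [math.sin(2.0 * math.pi * cycles * i / n) for i in range(n)]


def significant_bins(mags, fraction=0.01):
    """Indices of bins whose magnitude is at least `fraction` of the peak."""
    peak = max(mags)
    return [k for k, m in enumerate(mags) if m >= fraction * peak]


N = 64
exact = significant_bins(dft_magnitudes(sine(8, N)))    # loops cleanly
leaky = significant_bins(dft_magnitudes(sine(8.5, N)))  # does not loop

print("whole number of cycles:", exact)
print("extra half cycle:", len(leaky), "bins above threshold")
```

    The second signal is still a single pure tone; the extra "harmonics" come only from the jump where the end of the string wraps round to its start.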

    One way round this is to take a very long string of samples and then 'window' it down to a smaller number of samples in the middle, which will suppress the unwanted artefacts a bit. Unfortunately, this topic gets very hard very quickly. Have a look at this Wiki article.

    Another way to get some sense would be to look at the samples on a digital editor and to snip out a sequence that looks as though it repeats. But you will find it hard to get any information about the transients of the attack of notes.

    You can still have fun with waveforms and their transforms and see if you can make sense out of relating the two together.
  4. Jan 29, 2014 #3


    Science Advisor
    Homework Helper

    You don't need to take a "very long" string. Windowing is a good idea for any signal processing using Fourier analysis.

    I agree the subject can get esoteric, but a practical recipe that doesn't throw away any information is to use the Hanning (or Hann) window (as defined on the wiki page), and split the data into chunks that overlap by half. In other words, if you use a 1024 point FFT, then do FFTs on samples 1...1024, 513...1536, 1025...2048, etc. That choice of window function makes the whole process invertible, because the overlapping window functions add up to exactly 1 everywhere. With ##x = \pi n / N## for sample ##n## of an ##N##-point chunk, the window function is ##(1 - \cos 2x)/2 = \sin^2 x##, so the successive windows look like an alternation between ##\sin^2 x## and ##\cos^2 x##, and ##\sin^2 x + \cos^2 x = 1##.
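
    The "adds up to exactly 1" claim can be checked numerically. This is a sketch only: it assumes the periodic form of the Hann window (dividing by N rather than N - 1, which is what makes the sum exact), uses 0-based indices so the chunks start at 0, 512, 1024, ..., and the helper names are made up for illustration.

```python
import math


def hann_periodic(n):
    """Periodic Hann window: w[i] = (1 - cos(2*pi*i/n)) / 2 = sin^2(pi*i/n)."""
    return [0.5 * (1.0 - math.cos(2.0 * math.pi * i / n)) for i in range(n)]


def chunk_starts(total, size):
    """0-based start indices of chunks of `size` samples overlapping by half."""
    hop = size // 2
    return list(range(0, total - size + 1, hop))


def window_sum(total, size):
    """Sum of all the overlapping window functions at every sample position."""
    window = hann_periodic(size)
    acc = [0.0] * total
    for start in chunk_starts(total, size):
        for i, w in enumerate(window):
            acc[start + i] += w
    return acc


SIZE = 1024
TOTAL = 4096
starts = chunk_starts(TOTAL, SIZE)
coverage = window_sum(TOTAL, SIZE)

# Away from the first and last half chunk, two windows cover every sample.
interior = coverage[SIZE // 2: TOTAL - SIZE // 2]
max_deviation = max(abs(c - 1.0) for c in interior)
print(starts[:3], max_deviation)
```

    Only the first and last half chunk fall short of 1, because a single window covers them.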

    If you are interested in the transients at the beginning and end of the data, add some zeros at the start and end, so half the first chunk of data is zero.
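
    One possible way to do that padding, as a sketch: the helper names and the tiny chunk size of 8 are illustrative assumptions, and rounding the padded length up to a whole number of half chunks is a choice made here, not something stated in the post.

```python
def pad_for_half_overlap(samples, size):
    """Add half a chunk of zeros at each end, then round up to whole hops."""
    hop = size // 2
    padded = [0.0] * hop + list(samples) + [0.0] * hop
    remainder = len(padded) % hop
    if remainder:
        padded += [0.0] * (hop - remainder)
    return padded


def half_overlapping_chunks(padded, size):
    """Split into chunks of `size` samples, each starting half a chunk later."""
    hop = size // 2
    return [padded[start:start + size]
            for start in range(0, len(padded) - size + 1, hop)]


SIZE = 8                      # tiny chunk so the result is easy to read
data = [1.0] * 10             # stand-in for recorded samples
padded = pad_for_half_overlap(data, SIZE)
chunks = half_overlapping_chunks(padded, SIZE)

for chunk in chunks:
    print(chunk)
```

    The first chunk is half zeros and half data, so the very first real samples sit in the middle of a window instead of at its edge, where the window would scale them down to almost nothing.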

    If you are processing "music", one FFT will show you all the harmonics of all the notes that were played in that time interval, which can just look like a mess. It might work better to use fairly short FFTs, e.g. 4096 points, which is about 1/10 of a second at 44100 samples/second. See http://en.wikipedia.org/wiki/Short-time_Fourier_transform
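
    The arithmetic behind "about 1/10 of a second", as a small sketch. The 44100 samples/second rate is the one quoted above; the function names and the other chunk sizes are only for comparison.

```python
SAMPLE_RATE = 44100  # samples per second, as quoted in the post


def chunk_duration_seconds(size, rate=SAMPLE_RATE):
    """How much time one FFT chunk of `size` samples spans."""
    return size / rate


def chunks_in(total_samples, size):
    """Number of half-overlapping chunks that fit in `total_samples`."""
    hop = size // 2
    return len(range(0, total_samples - size + 1, hop))


for size in (1024, 4096, 16384):
    print(size, "samples =", round(chunk_duration_seconds(size), 4), "s")

chunks_per_second = chunks_in(SAMPLE_RATE, 4096)
print("half-overlapping 4096-point chunks in one second:", chunks_per_second)
```

    A shorter chunk pins down when a note was played more precisely, but its frequency bins are wider, so the chunk length is a trade-off and not a free choice.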
    Last edited: Jan 29, 2014
  5. Jan 29, 2014 #4
    OK, I am windowing... I think...
    I am taking 16384 samples at a time
    and I forget about the other samples.
    Well, I just do not use them.
    Are you saying I should make them 0 and put them in my FFT?
  6. Jan 29, 2014 #5


    Science Advisor

    No, he's saying you should apply a filter to your data (a window) such that the beginning and the end of the data sequence are roughly equal. Otherwise there will be significant "frequency leakage" in your FFT and it could be very misleading.
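
    To make that concrete, here is a sketch comparing the same off-bin tone with and without a Hann window. The names, the 64-sample length, the 8.5-cycle tone and the "more than 4 bins from the peak" measure are all illustrative assumptions, and a plain DFT stands in for the FFT.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Magnitudes of DFT bins 0..N/2 (naive O(N^2) transform)."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        total = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples))
        mags.append(abs(total))
    return mags


def hann(n):
    """Hann window: tapers smoothly to zero at both ends of the chunk."""
    return [0.5 * (1.0 - math.cos(2.0 * math.pi * i / n)) for i in range(n)]


def far_leakage(mags, guard=4):
    """Largest magnitude more than `guard` bins from the peak, over the peak."""
    peak_bin = max(range(len(mags)), key=lambda k: mags[k])
    far = [m for k, m in enumerate(mags) if abs(k - peak_bin) > guard]
    return max(far) / mags[peak_bin]


N = 64
tone = [math.sin(2.0 * math.pi * 8.5 * i / N) for i in range(N)]
windowed = [w * x for w, x in zip(hann(N), tone)]  # multiply sample by sample

rect_leak = far_leakage(dft_magnitudes(tone))
hann_leak = far_leakage(dft_magnitudes(windowed))
print("no window:", rect_leak)
print("Hann window:", hann_leak)
```

    The window is applied to the time samples before the transform. The price is a somewhat wider main peak, so it trades a little sharpness near the true frequency for much less energy far away from it.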
  7. Jan 29, 2014 #6
    What kind of filter? And what should it do?
  8. Jan 29, 2014 #7


    Science Advisor

  9. Jan 29, 2014 #8
  10. Jan 30, 2014 #9


    Science Advisor
    Homework Helper

    If you are doing an FFT on 16384 samples and the Nyquist frequency is 22050 Hz (a sample rate of 44100 samples/second), the spacing of your FFT frequencies will be 44100/16384 = 2.69 Hz.
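
    For an N-point FFT of real samples, bin k sits at k times the sample rate divided by N, so the bin spacing is the sample rate over N. A small sketch of that bookkeeping follows, assuming 44100 samples/second (twice the 22050 Hz Nyquist frequency) and taking middle C as roughly 261.63 Hz, which holds only if the recording is tuned to A = 440 Hz. The function names are made up for illustration.

```python
SAMPLE_RATE = 44100   # samples per second (twice the 22050 Hz Nyquist frequency)
FFT_SIZE = 16384      # samples per FFT, as in the thread


def bin_spacing_hz(size, rate):
    """Distance in Hz between neighbouring FFT bins."""
    return rate / size


def bin_frequency_hz(k, size, rate):
    """Centre frequency of bin k."""
    return k * rate / size


def nearest_bin(freq_hz, size, rate):
    """Index of the bin whose centre is closest to freq_hz."""
    return round(freq_hz * size / rate)


MIDDLE_C_HZ = 261.63  # assumes equal temperament with A = 440 Hz

spacing = bin_spacing_hz(FFT_SIZE, SAMPLE_RATE)
k = nearest_bin(MIDDLE_C_HZ, FFT_SIZE, SAMPLE_RATE)
print("bin spacing:", spacing, "Hz")
print("middle C falls nearest bin", k,
      "centred at", bin_frequency_hz(k, FFT_SIZE, SAMPLE_RATE), "Hz")
```

    A note will almost never fall exactly on a bin centre, so the strongest bin only locates the pitch to within about one bin spacing.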

    If you didn't do any windowing and are just looking at a plot of the FFT data, getting within 2 or 3 frequency increments of the "accurate" frequency is probably about as good as you will get. In any case, there is no guarantee that the pitch of the music you played from YouTube was actually A = 440 Hz.
  11. Jan 30, 2014 #10
    I have a filter that removes low-level noise:
    if the magnitude is less than 300,000,
    I do not display it.

    That is kind of like windowing,
    and you are right about the YouTube video.