Smearing an audio recording using Fourier transform

Discussion Overview

The discussion revolves around the process of smearing an audio recording using the Fourier transform, specifically focusing on how to manipulate the frequency content of an audio signal to achieve a more uniform sound. Participants explore methods for averaging frequency content and the implications of the Discrete Fourier Transform (DFT) in this context.

Discussion Character

  • Exploratory
  • Technical explanation
  • Conceptual clarification

Main Points Raised

  • One participant proposes using the Short-Time Fourier Transform (STFT) to average the frequency content of an audio recording, suggesting that this could produce a smeared effect similar to that achieved with reverb.
  • Another participant questions the purpose of the smearing process, asking if it could allow for reconstructing the original sequence of notes under ideal conditions.
  • A participant clarifies that the goal is to understand the frequency content of the audio, such as the balance of bass and treble, rather than to perfectly reproduce the original signal.
  • One participant shares an experiment using the inverse Fourier transform on the absolute values of the Fourier transform, noting that while the result was somewhat smeared, it still exhibited pulsating volume corresponding to the original notes.

Areas of Agreement / Disagreement

Participants express differing views on the feasibility and purpose of the smearing process, with no consensus on the best method or the implications of the DFT in achieving the desired effect.

Contextual Notes

Participants mention confusion regarding the relationship between the DFT and the average frequency content, indicating a need for clarity on how frequency changes over time affect the interpretation of the DFT.

dwarp
Hi!

I'd like to smear an audio recording, where the frequency content audibly changes, into an audio recording where it does not. Here's a recording of a sampled piano playing a melody, which will serve as an example:

https://dl.dropboxusercontent.com/u/9355745/oldmcdonald.wav

The frequency content changes, both during each note played and because different notes are being played. I'd like to use the Fourier transform to somehow produce something like this:

https://dl.dropboxusercontent.com/u/9355745/oldmcdonaldsmear.wav

This was created by repeatedly playing the original recording into a reverb with a very high decay. There's still some "shimmering" in the recording, so the result isn't completely smeared out.

One way I can think of doing this is by computing an STFT and then combining the resulting frames into an average spectrum, which is then used as input to the inverse Fourier transform to produce a new audio recording (and I'm guessing this is what the reverb is actually doing). Is there a simpler, perhaps more elegant way of doing this?
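The STFT-averaging idea above can be sketched roughly as follows. This is only a minimal sketch under several assumptions that are not in the post: it averages the frame magnitudes (averaging the complex frames directly would tend to cancel, since their phases differ), it resynthesizes by overlap-adding frames with random phases, and it uses a synthetic two-note signal as a hypothetical stand-in for the piano recording. The function names `average_spectrum` and `smear`, the frame length, and the hop size are all illustrative choices, and nothing here claims to be what the reverb does.

```python
import numpy as np


def average_spectrum(signal, frame_len=1024, hop=256):
    """Mean STFT magnitude over all Hann-windowed frames of `signal`."""
    if len(signal) < frame_len:
        raise ValueError("signal must be at least one frame long")
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    total = np.zeros(frame_len // 2 + 1)
    for i in range(n_frames):
        frame = signal[i * hop : i * hop + frame_len] * window
        total += np.abs(np.fft.rfft(frame))
    return total / n_frames


def smear(signal, out_len, frame_len=1024, hop=256, seed=0):
    """Overlap-add frames that share the average magnitude but use random phases."""
    rng = np.random.default_rng(seed)
    window = np.hanning(frame_len)
    avg_mag = average_spectrum(signal, frame_len, hop)
    out = np.zeros(out_len + frame_len)
    for start in range(0, out_len, hop):
        # Assumption: random phases, so no frame carries timing information.
        phase = rng.uniform(0.0, 2.0 * np.pi, avg_mag.shape)
        frame = np.fft.irfft(avg_mag * np.exp(1j * phase), n=frame_len)
        out[start : start + frame_len] += frame * window
    out = out[:out_len]
    peak = np.max(np.abs(out))
    return out / peak if peak > 0 else out  # normalize to a peak of 1


# Hypothetical stand-in for the piano recording: two alternating sine "notes",
# four notes per second, at an 8 kHz sample rate.
sample_rate = 8000
t = np.arange(2 * sample_rate) / sample_rate
note_index = (t * 4).astype(int)
melody = np.where(note_index % 2 == 0,
                  np.sin(2 * np.pi * 440 * t),
                  np.sin(2 * np.pi * 660 * t))

smeared = smear(melody, out_len=len(melody))
```

In this sketch both notes end up present throughout `smeared`, because every output frame is built from the same averaged magnitudes. Longer frames give finer frequency resolution at the cost of a coarser time average.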

The reason I'm asking is that for the longest time, I thought the result of the DFT *was* the average frequency content of the input. It seems this is only true if the frequency content of the input signal does not change over time: if I had been able to smear the signal completely, the DFT of that smeared signal *would* have been the average frequency content of the smeared signal. Imagine, for instance, that I'd taken the DFT of 4 seconds of a square wave at 80 Hz. It would give me the same thing an EQ analyzer would.
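The square-wave case can be checked numerically. The following is a small sketch, assuming an 8 kHz sample rate (not stated in the post) so that one 80 Hz period is exactly 100 samples. It takes the DFT of 4 seconds of the wave and lists the strongest frequencies, which for a square wave should be the odd harmonics of 80 Hz.

```python
import numpy as np

sample_rate = 8000           # assumed; chosen so one period is a whole number of samples
f0 = 80                      # square-wave frequency in Hz
duration = 4                 # seconds
period = sample_rate // f0   # 100 samples per period
n = sample_rate * duration

# +1 for the first half of each period, -1 for the second half.
square = np.where(np.arange(n) % period < period // 2, 1.0, -1.0)

# One-sided amplitude spectrum: scale so a unit sine shows up with amplitude 1.
spectrum = np.abs(np.fft.rfft(square)) * 2 / n
freqs = np.fft.rfftfreq(n, d=1 / sample_rate)

# The three strongest components, strongest first.
strongest = freqs[np.argsort(spectrum)[-3:][::-1]]
print(strongest)
```

Because the signal never changes character over the 4 seconds, the single DFT and a running analyzer would agree here: energy at 80 Hz, 240 Hz, 400 Hz and so on, with amplitudes falling off roughly as 1/k.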

But then, since the un-smeared signal can be accurately represented by a sum of sine and cosine waves (that is, the result of the DFT), its frequency content does not, in fact, change over time. Obviously, though, if you listen to the unsmeared signal, its frequency content DOES change over time, or there wouldn't be a melody! I find all this incredibly confusing.
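One way to look at this puzzle is that the DFT keeps the timing information in the phases of its coefficients rather than in their magnitudes. The sketch below illustrates that with two hypothetical sine "notes" (220 Hz and 330 Hz at an assumed 8 kHz sample rate) played in opposite orders: the two signals have the same magnitude spectrum, yet the full complex DFT still reconstructs each ordering exactly.

```python
import numpy as np

sample_rate = 8000                          # assumed sample rate
t = np.arange(sample_rate) / sample_rate    # one second per note
low_note = np.sin(2 * np.pi * 220 * t)
high_note = np.sin(2 * np.pi * 330 * t)

# Same two notes, opposite order (one is a circular shift of the other).
low_then_high = np.concatenate([low_note, high_note])
high_then_low = np.concatenate([high_note, low_note])

spec_a = np.fft.rfft(low_then_high)
spec_b = np.fft.rfft(high_then_low)

# The magnitudes cannot tell the two melodies apart ...
same_magnitudes = np.allclose(np.abs(spec_a), np.abs(spec_b))
# ... although the signals themselves are clearly different ...
same_signal = np.allclose(low_then_high, high_then_low)
# ... and the complex spectrum (magnitude plus phase) recovers the order.
recovered = np.fft.irfft(spec_a, n=len(low_then_high))
order_recovered = np.allclose(recovered, low_then_high)

print(same_magnitudes, same_signal, order_recovered)
```

Each sinusoid in the DFT sum does last for the whole recording. It is the way their phases line up that makes them cancel outside a note and reinforce inside it.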
 
What is the purpose for such a process? Could you for instance, under perfect conditions, go backwards and end up with the same notes in the right sequence?
 
The purpose is getting an idea of the frequency content in an audio recording - whether there's a lot of bass, a lot of treble, etc. I also like the idea of getting the "average sound" of a longer recording just to hear what it sounds like. I don't expect to be able, and am not interested in being able, to reproduce the original signal from the result, as I would be able to with a straight up DFT, no.
 
Also, since I believed the DFT contained the average frequency content, I did try ifft(abs(fft(signal))), and I actually got kind of close to what I was trying to achieve. The result is kind of smeared, but it's also still "pulsating" (the volume is modulating) at the same rate at which the notes were played in the original recording (4 Hz):

https://dl.dropboxusercontent.com/u/9355745/newmcdonald.wav
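The pulsation can be reproduced on a much simpler signal. The sketch below is only an illustration under stated assumptions: instead of the piano recording it uses a single 440 Hz tone whose loudness swells four times per second, at an assumed 8 kHz sample rate, and the helper `envelope_rate` is a hypothetical name for a crude loudness-fluctuation estimate. It applies the NumPy equivalent of ifft(abs(fft(signal))) and measures the dominant fluctuation rate before and after.

```python
import numpy as np

sample_rate = 8000     # assumed sample rate
duration = 4           # seconds
note_rate = 4.0        # loudness swells per second, like 4 notes per second

t = np.arange(sample_rate * duration) / sample_rate
envelope = 0.5 * (1 + np.cos(2 * np.pi * note_rate * t))
signal = envelope * np.sin(2 * np.pi * 440 * t)

# Discard all phases: keep only the magnitude of every DFT coefficient.
zero_phase = np.fft.ifft(np.abs(np.fft.fft(signal))).real


def envelope_rate(x, sample_rate, max_hz=50.0):
    """Frequency (Hz) of the strongest loudness fluctuation at or below max_hz."""
    rectified = np.abs(x)
    spectrum = np.abs(np.fft.rfft(rectified - rectified.mean()))
    freqs = np.fft.rfftfreq(len(x), d=1 / sample_rate)
    low = freqs <= max_hz
    return freqs[low][np.argmax(spectrum[low])]


rate_before = envelope_rate(signal, sample_rate)
rate_after = envelope_rate(zero_phase, sample_rate)
print(rate_before, rate_after)
```

In this toy case the 4 Hz modulation shows up in the magnitude spectrum as sidebands 4 Hz either side of 440 Hz, and taking the absolute value leaves those sidebands in place, so the result still pulses at the note rate. That may be part of what is audible in the recording above, though a real melody is more complicated than this stand-in.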
 