Randomizing phases of harmonics

Summary
Randomizing the phases of harmonics in a discrete audio signal alters the temporal arrangement of frequencies, potentially leading to a sound that differs significantly from the original. While human hearing can often cope with phase variations, the perception of sound may change, especially when the temporal structure is disrupted, as in music or speech. The cochlea processes sound in a way that does not scramble temporal order, but a complete randomization of phases can still affect how the brain interprets the audio. The discussion emphasizes that while some sounds may remain recognizable despite phase shifts, others may sound nonsensical due to the loss of temporal coherence. Ultimately, the relationship between phase and perceived sound quality is complex and context-dependent.
  • #31
sophiecentaur said:
If your 'discrete audio signal' is a length of real audio, and not just generated with a simple signal generator or a basic synth, then the "harmonics" you refer to will not actually be harmonics. Musical instruments and voices contain overtones which are not harmonically related to any fundamental frequency. That means the waveform will be changing all the time, and an isolated clip will not 'sound right' when played as a loop. So the simple scenario you propose will already not sound the same as the original.
I don't know exactly how this is implemented in, for instance, a vocoder, but I was initially thinking of slicing an audio clip into blocks of, say, 512 samples, taking the FT of each block, and then taking the IFT but with the phases of the harmonics randomized. By "harmonics" I mean the frequency components (and their amplitudes) resulting from the FT.

This is something a little different from overtones. Overtones are part of the audio signal and can be transformed to frequency components (harmonics). The harmonics are not IN the audio signal the way the overtones are, but they CAN reproduce a BLOCK of samples of the audio signal: they form a basis in which to express a series of samples. So in that sense, they could be viewed as part of the audio signal (in that block).

The overtones are related to a key note; the harmonics are not. The harmonics are related only to the number of samples (the block size). However, if you just add them up, they reproduce the original signal, AS IF they were IN the signal. In my OP I look only at these harmonics, not at the contents of the signal.

I forgot to mention that I am not looking to reproduce only part of the signal. I mean, for instance, slicing the signal up into blocks of 512 samples, FT-ing each block, randomizing the phases, reconstructing, and laying the blocks in sequence to produce the new signal.
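To make that concrete, here is a minimal numpy sketch of the block-wise phase randomization described above, for illustration only. The 512-sample block size, the uniform phase distribution, and the function name are all assumptions, not anyone's actual implementation:

```python
import numpy as np

def randomize_phases(signal, block_size=512, rng=None):
    """Block-wise phase randomization (hypothetical sketch): FT each
    block, keep the magnitudes, replace the phases with random values,
    IFT, and lay the blocks back in sequence."""
    rng = np.random.default_rng() if rng is None else rng
    out = np.zeros(len(signal))
    for start in range(0, len(signal) - block_size + 1, block_size):
        block = signal[start:start + block_size]
        spectrum = np.fft.rfft(block)      # one-sided FT of a real-valued block
        mags = np.abs(spectrum)
        phases = rng.uniform(0.0, 2.0 * np.pi, len(spectrum))
        phases[0] = 0.0    # keep the DC bin real so its magnitude survives
        phases[-1] = 0.0   # likewise the Nyquist bin (block_size is even)
        out[start:start + block_size] = np.fft.irfft(
            mags * np.exp(1j * phases), n=block_size)
    # Any tail shorter than one block is left as silence in this sketch.
    return out
```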
 
  • #32
This YouTube video shows the way that the higher-frequency components of a guitar string do not stay in one phase relative to the fundamental. Unfortunately, the effect is difficult to see when a digital scope display is shown in a mangled video format like MPEG. A (real-life) look at the display on an analogue scope is far better than what you can see here, but you can see that the waveform shape is constantly changing - yet it is still a guitar note.
@entropy1 this practical demo does go some way towards answering your question, I think.
 
  • #33
entropy1 said:
Overtones are part of the audio signal and can be transformed to frequency components (harmonics). The harmonics are not IN the audio signal the way the overtones are.
The same is true for all the components of the original audio signal, assuming the sampling satisfies the Nyquist criterion. There is nothing special about the harmonics or the overtones - or the fundamental(s) - in the source signal.
Limiting the duration of the recording is windowing, and it introduces modulation products into the signal.
I don't understand what you say about components being "in the audio signal". Once the windowing has been done, they are all just 'signal' components.
 
  • #34
Assuming the sampling frequency is higher than twice the highest audio frequency (the Nyquist criterion), you can more or less forget that there's sampling involved. The spectrum of the resultant string of samples will be a comb of frequencies, spaced by 1/T, up to the maximum audio frequency (where T is the time interval for the whole clip). If an audio frequency does not coincide with the frequencies in the comb (the most common situation), then that component of the input audio will be 'missed out', but there will be adjacent comb frequencies. So you already have a distorted signal. This applies to every component (i.e. every perfect Fourier component) of the original. I sometimes look at this windowing in terms of modulation of a carrier of frequency 1/T by the audio signal, which will produce sidebands on either side of the frequency comb elements.
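A rough numpy illustration of that comb effect, assuming a 48 kHz sample rate and a 512-sample clip (so the comb spacing 1/T works out to 93.75 Hz; the tone frequencies are picked just for this example):

```python
import numpy as np

fs = 48000                    # assumed sample rate (Hz)
N = 512                       # clip length in samples; comb spacing = fs/N = 93.75 Hz
n = np.arange(N)

on_bin = np.sin(2 * np.pi * 937.5 * n / fs)    # exactly 10 * (fs/N): on a comb tooth
off_bin = np.sin(2 * np.pi * 980.0 * n / fs)   # between teeth: the common situation

print(np.argmax(np.abs(np.fft.rfft(on_bin))))        # -> 10: one clean spectral line
print(np.abs(np.fft.rfft(off_bin))[8:14].round(1))   # energy smeared over neighbours
```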
 
  • #35
sophiecentaur said:
The spectrum of the resultant string of samples will be a comb of frequencies, spaced by 1/T, up to the maximum audio frequency (where T is the time interval for the whole clip). If an audio frequency does not coincide with the frequencies in the comb (the most common situation), then that component of the input audio will be 'missed out', but there will be adjacent comb frequencies. So you already have a distorted signal.
The time windowing function applied before the Fourier transform effectively spreads, or broadens, the teeth of the analyser comb. Then signals will not be lost in deep nulls between the teeth. Windowing also distorts the signal and reduces HF noise.
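A small numpy comparison of the two cases, using the same assumed 48 kHz rate and off-bin tone as in the sketch above; the Hann window is just one common choice:

```python
import numpy as np

fs, N = 48000, 512
n = np.arange(N)
tone = np.sin(2 * np.pi * 980.0 * n / fs)    # off-bin tone, between comb teeth

rect = np.abs(np.fft.rfft(tone))                  # rectangular window (no window at all)
hann = np.abs(np.fft.rfft(tone * np.hanning(N)))  # Hann-windowed

# The Hann window broadens the main lobe (the comb 'teeth') but knocks
# down the far sidelobes, so an off-bin tone is not lost in a deep null.
print(rect[40:46].round(3))   # slowly decaying leakage skirt
print(hann[40:46].round(3))   # leakage falls off much faster
```

The trade-off described above shows up directly: the Hann-windowed peak is wider (the tone's energy spans about three bins instead of one), but the bins far from the peak are orders of magnitude lower.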
 
