How to separate sound from another sound?

  • Thread starter cybertific
  • Tags
    Sound
In summary, separating individual sounds in a mixed song is a complex process that involves techniques such as template matching, Fourier transforms, and stochastic signal analysis. Our ability to separate sounds is often artificial and relies on our brain's ability to fill in missing information. In some cases, such as during the production of a record, techniques like equalizing and "ducking" are used to distribute instruments throughout the spectrum and create a sense of separation. Ultimately, more research and understanding are needed to separate sounds in a mixed song effectively.
  • #1
cybertific
For example, in a song there are several instruments, such as guitar, piano, and violin, plus a singer's voice. How can we separate them? How can we get the individual sounds? Humans can easily tell that the song consists of several instruments and a singer's voice. But how exactly do we do that, theoretically?

If that can be done, I believe it may change our perspective on signal processing. It would also be very useful in many applications.
 
  • #2
You'd usually do something like take a Fourier transform of the signal, develop a sort of "template signal" for each instrument, and match it. The worse the quality of the recording, the harder this is to do and the more advanced the techniques you need.
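
As a rough illustration of that idea (not the poster's own method), here is a minimal sketch using NumPy. Everything in it is an assumption made for the example: the two "instruments" are synthetic tones with made-up harmonic amplitudes, their partials never overlap, and the recording is perfectly clean. Under those conditions a least-squares fit of the template spectra to the mix spectrum recovers the mixing levels.

```python
import numpy as np

SAMPLE_RATE = 8000   # Hz (assumed for this toy example)
N = SAMPLE_RATE      # one second of audio, so each FFT bin is exactly 1 Hz wide

def harmonic_tone(f0, harmonic_amps):
    """Synthesize a tone as a sum of sine harmonics of f0."""
    t = np.arange(N) / SAMPLE_RATE
    return sum(a * np.sin(2 * np.pi * f0 * (k + 1) * t)
               for k, a in enumerate(harmonic_amps))

def magnitude_spectrum(x):
    """Magnitude of the real FFT, scaled so a unit-amplitude sine peaks at 1."""
    return np.abs(np.fft.rfft(x)) / (N / 2)

# Hypothetical clean notes, one per instrument (amplitudes are invented).
note_a = harmonic_tone(220, [1.0, 0.5, 0.25])
note_b = harmonic_tone(310, [1.0, 0.1, 0.6])

# The "template signals": one magnitude spectrum per instrument.
templates = np.column_stack([magnitude_spectrum(note_a),
                             magnitude_spectrum(note_b)])

# A mix whose levels the matcher does not know in advance.
mix = 0.7 * note_a + 0.3 * note_b

# Match: find the template weights that best explain the mix spectrum.
weights, *_ = np.linalg.lstsq(templates, magnitude_spectrum(mix), rcond=None)
print(weights)  # close to the true levels 0.7 and 0.3
```

With overlapping partials, phase differences, or noise, the magnitude spectra no longer add this neatly and the fit degrades, which is where the more advanced techniques come in.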
 
  • #3
Isn't this akin to trying to remove a watermark from an image? You'd have to know the original function in order to separate the two "signals"..

When you add together multiple signals it's impossible to know if it was 1 + 2 + 3 = 6 or 0 + 3 + 3, etc...

but I'm sure there are tricky ways to make very good guesses based on probability
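
A tiny sketch of that ambiguity in Python, using made-up sample values: two different sets of three "sources" add up to exactly the same mixed signal, so the mix alone cannot tell them apart.

```python
def mix(*sources):
    """Add the sources together sample by sample."""
    return [sum(samples) for samples in zip(*sources)]

# Two hypothetical decompositions, three sources of three samples each.
decomposition_1 = ([1.0, 0.0, -1.0], [2.0, 1.0, 0.5], [3.0, -1.0, 0.5])
decomposition_2 = ([0.0, 0.5, 0.0], [3.0, -0.5, 0.0], [3.0, 0.0, 0.0])

# The first sample is 1 + 2 + 3 in one case and 0 + 3 + 3 in the other.
print(mix(*decomposition_1))
print(mix(*decomposition_2))
print(mix(*decomposition_1) == mix(*decomposition_2))
```

Any extra knowledge about what the sources look like (templates, statistics, stereo position) is what narrows the choice down.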
 
  • #4
maverick_starstrider said:
You'd usually do something like take a Fourier transform of the signal, develop a sort of "template signal" for each instrument, and match it. The worse the quality of the recording, the harder this is to do and the more advanced the techniques you need.

I think a Fourier transform is not enough. For example, if we mix a C from a piano and a C from a violin, the frequencies are the same but the sound is different. I think we need more than the frequency domain to recognize it.
Template matching may work for verifying or identifying an individual sound, but not for separating a signal into its components.
 
  • #5
The distinction between e.g. a piano C and a violin C lies in the relative intensities of the higher harmonics. This pattern of relative intensities is roughly characteristic of each instrument. So if you had the waveform of a perfectly ideal piano/violin duet playing a unison C, you could probably tell that the intensities of the harmonics were the sums of the piano intensities and violin intensities, and thereby identify the instruments. But in a real recording, you'd have to deal with phase differences, various levels of interference, echoes, random noise, etc. - basically all sorts of effects that would make it very difficult to clearly identify which instruments were contributing to a waveform.

You're certainly not the only one wondering about how to do that, though ;-)
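
The harmonic argument above can be sketched with NumPy. The harmonic profiles below are invented for illustration and are not measured piano or violin data, and the fundamental is rounded to 262 Hz so that every harmonic lands exactly on an FFT bin. In the ideal in-phase case the harmonic amplitudes of the duet are the sums of the two profiles; shifting one instrument by a quarter cycle already breaks that simple addition.

```python
import numpy as np

SAMPLE_RATE = 16000  # Hz (assumed)
N = SAMPLE_RATE      # one second of audio, so each FFT bin is exactly 1 Hz wide
F0 = 262             # roughly middle C, rounded to sit on an FFT bin

# Hypothetical relative amplitudes of harmonics 1..4 (not real instrument data).
PIANO_LIKE = np.array([1.0, 0.6, 0.3, 0.1])
VIOLIN_LIKE = np.array([1.0, 0.9, 0.7, 0.5])

def tone(profile, phase=0.0):
    """Sum of sine harmonics of F0, each shifted by the same phase offset."""
    t = np.arange(N) / SAMPLE_RATE
    return sum(a * np.sin(2 * np.pi * F0 * (k + 1) * t + phase)
               for k, a in enumerate(profile))

def harmonic_amplitudes(x, n_harmonics=4):
    """Read the amplitude at each harmonic of F0 from the FFT magnitude."""
    spectrum = np.abs(np.fft.rfft(x)) / (N / 2)
    return spectrum[[F0 * (k + 1) for k in range(n_harmonics)]]

# Ideal duet: both instruments exactly in phase, so amplitudes simply add.
in_phase = harmonic_amplitudes(tone(PIANO_LIKE) + tone(VIOLIN_LIKE))

# Same duet with the violin-like tone shifted by a quarter cycle:
# each harmonic now has amplitude sqrt(a^2 + b^2) instead of a + b.
shifted = harmonic_amplitudes(tone(PIANO_LIKE) + tone(VIOLIN_LIKE, phase=np.pi / 2))

print(in_phase)
print(shifted)
```

This only covers a single fixed phase offset; echoes and noise in a real recording distort the pattern further.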
 
  • #6
DavidSnider said:
Isn't this akin to trying to remove a watermark from an image? You'd have to know the original function in order to separate the two "signals"..

When you add together multiple signals it's impossible to know if it was 1 + 2 + 3 = 6 or 0 + 3 + 3, etc...

but I'm sure there are tricky ways to make very good guesses based on probability

No, this is not about that. Think about the applications of this technique. If we can separate a mixed sound into its components, we can build very effective speech recognition. But first we need voice recognition. People often confuse speech recognition with voice recognition: speech recognition recognizes speech (what is being said), and voice recognition recognizes the voice (whose voice it is).

I think this is not about addition, but something else, and not about probability. I believe there are several patterns that we don't know yet, because we humans can easily name the instruments that make up a song.
 
  • #7
Well, with something like voice recognition things get a lot more complicated. The general approach is probably some form of stochastic signal analysis and machine learning (I could see using something like a genetic algorithm that mashes together base frequencies and matches against an atlas of clean voices).
 
  • #8
cybertific said:
For example, in a song there are several instruments, such as guitar, piano, and violin, plus a singer's voice. How can we separate them? How can we get the individual sounds? Humans can easily tell that the song consists of several instruments and a singer's voice. But how exactly do we do that, theoretically?

We are not nearly as good at this as you might think. During the production of a record, one of the main tasks of the mixing engineer is to make sure the listener can separate the main instruments. There are many ways to achieve this, the most obvious being to use equalizers (i.e. filtering) to "distribute" the various instruments throughout the spectrum. Sometimes this leaves only a small range around the fundamental of the instrument (this is quite common for acoustic guitars). Another common technique is to "duck" instruments with respect to each other, e.g. the bass is ducked with respect to the kick. Last but not least, the instruments are distributed in "space", i.e. the stereo field (two instruments that use the same part of the spectrum can be panned hard right/left).
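
To make the "ducking" idea concrete, here is a minimal NumPy sketch. It is an assumption-laden toy rather than how any particular mixing desk or plug-in works: the function names, the instant-attack envelope follower, the 50 ms release, and the `depth` parameter are all hypothetical choices made for the example.

```python
import numpy as np

def envelope(signal, sample_rate, release_ms=50.0):
    """Peak envelope follower: instant attack, exponential release."""
    decay = np.exp(-1.0 / (sample_rate * release_ms / 1000.0))
    env = np.empty(len(signal))
    level = 0.0
    for i, s in enumerate(np.abs(signal)):
        level = max(s, level * decay)  # jump up at once, fall back slowly
        env[i] = level
    return env

def duck(target, trigger, sample_rate, depth=0.7):
    """Turn `target` down while `trigger` is loud.

    depth=0.7 means the target loses up to 70% of its gain at the
    trigger's loudest point.
    """
    env = envelope(trigger, sample_rate)
    peak = env.max()
    if peak > 0:
        env = env / peak               # normalize the envelope to 0..1
    return target * (1.0 - depth * env)

# Toy signals: a constant "bass" and a short "kick" burst.
sample_rate = 1000
bass = np.ones(1000)
kick = np.zeros(1000)
kick[100:110] = 1.0

ducked_bass = duck(bass, kick, sample_rate)
print(ducked_bass[50], ducked_bass[105], ducked_bass[900])
```

The bass is untouched before the kick, drops while the kick sounds, and recovers as the envelope releases, so the two no longer compete for the same moment in time.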

Now, there are of course many cases where we DON'T want the listener to separate the sounds. It is quite often the case that what we perceive as a single instrument is really several layered sounds. Modern pop songs often use 30-40 different tracks, e.g. a synth bass "underneath" the main bass line to make it sound "thicker" (not to mention compressed side-chains, reverbs, delays, etc.).

Hence, our ability to separate instruments is to a large extent artificial; it is much more difficult to do so if the song is just a straightforward recording, a typical example would be a symphony orchestra with all the strings playing at once.

Note that the reason why we can "remove" so much of a sound is that our brains are very good at "adding" the missing bits, and this is true in general. Even when we THINK we can separate two sounds, it is usually the case that we are really only hearing some parts of the spectrum and our brains fill in the blanks. This is incidentally also used by mp3 and other psycho-acoustic compression methods.
 

Related to How to separate sound from another sound?

1. How can sound be separated from background noise?

Sound is usually separated from background noise with signal-processing techniques such as filtering: the desired signal is isolated, for example by frequency band, and the unwanted noise is attenuated. This can be done using specialized software or hardware equipment.
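
As one hedged example of such filtering, the NumPy sketch below zeroes every FFT bin outside a chosen band. The function name, the synthetic test tones, and the band edges are assumptions made for the illustration. It works here only because the wanted tone and the interference sit in clearly different frequency ranges.

```python
import numpy as np

def bandpass_fft(x, sample_rate, low_hz, high_hz):
    """Zero every FFT bin outside [low_hz, high_hz] and transform back."""
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / sample_rate)
    spectrum[(freqs < low_hz) | (freqs > high_hz)] = 0.0
    return np.fft.irfft(spectrum, n=len(x))

sample_rate = 8000
t = np.arange(sample_rate) / sample_rate          # one second of audio

wanted = np.sin(2 * np.pi * 440 * t)              # the sound we want to keep
hum = 0.5 * np.sin(2 * np.pi * 50 * t)            # low-frequency interference
whistle = 0.3 * np.sin(2 * np.pi * 3000 * t)      # high-frequency interference

cleaned = bandpass_fft(wanted + hum + whistle, sample_rate, 300, 1000)
print(np.max(np.abs(cleaned - wanted)))           # residual error, very small
```

If the noise overlapped the 300 to 1000 Hz band, this filter would either leave it in or remove part of the wanted sound along with it.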

2. Can sound separation be achieved in real-time?

Yes, it is possible to separate sound from another sound in real-time. This is commonly used in audio recording and live sound engineering, where unwanted background noise can be removed from the final recording or performance.

3. Is there a universal method for separating sound from another sound?

No, there is no one specific method that can be universally applied for separating sound from another sound. The best technique to use will depend on the specific situation and the type of sound being separated.

4. Can sound separation be done without affecting the quality of the sound?

Yes, it is possible to separate sound without significantly impacting the quality of the sound. However, some level of quality loss may occur, especially if the desired sound and background noise are very similar in frequency and amplitude.

5. Are there any limitations to sound separation?

While sound separation technology has advanced significantly in recent years, there are still some limitations. Separating sounds that are very similar in frequency and amplitude can be challenging, and in some cases, it may not be possible to completely remove all background noise without affecting the desired sound.
