High School Mixing two discrete audio signals

Mixing two discrete audio signals recorded at maximum levels in 16-bit PCM can lead to clipping if simply added together. To avoid clipping while maintaining volume, one approach is to add the samples and apply a scaling factor, such as multiplying by 0.707 to achieve a 3dB reduction. However, this method may not work for all signal types, especially if they are in phase, as they could still clip. The discussion also highlights that human perception of sound does not always equate to the arithmetic sum of sound waves, suggesting that our ears process combined sounds differently. Ultimately, achieving a mix without clipping and maintaining original volume levels presents inherent challenges due to the nature of sound wave interactions.
entropy1
Suppose I have two audio files in 16-bit PCM, both recorded at the maximum level: apart from noise and distortion, the loudest part of each recording reaches the maximum PCM value. So each signal is recorded as loud as it can be without clipping.

If we want to mix both files together so that both audio signals keep their original volume, we can:
  1. Just add both signals together sample-wise. However, the mix can now contain samples that clip, because the sum of the two signals can exceed the 16-bit range.
  2. Add both signals sample-wise and divide by two. Now the mix won't clip, but the volume of both signals has halved.
So how can we mix the audio files without clipping and without changing the volume?
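To make the two options concrete, here is a minimal sketch in plain Python. The test tones (440 Hz and 660 Hz), the 8 kHz sample rate and all function names are assumptions chosen for illustration; they are not from the original files.

```python
import math

INT16_MIN, INT16_MAX = -32768, 32767

def sine_int16(freq_hz, n_samples, rate_hz=8000, amplitude=INT16_MAX):
    """Full-scale sine wave as a list of 16-bit integer samples."""
    return [round(amplitude * math.sin(2 * math.pi * freq_hz * i / rate_hz))
            for i in range(n_samples)]

def mix_add_clip(a, b):
    """Option 1: add sample-wise, then saturate to the 16-bit range."""
    return [max(INT16_MIN, min(INT16_MAX, x + y)) for x, y in zip(a, b)]

def mix_average(a, b):
    """Option 2: add sample-wise and halve; the result always fits."""
    return [(x + y) // 2 for x, y in zip(a, b)]

def count_overflows(a, b):
    """Number of samples whose plain sum leaves the 16-bit range."""
    return sum(1 for x, y in zip(a, b) if not INT16_MIN <= x + y <= INT16_MAX)

a = sine_int16(440, 8000)   # hypothetical test tone 1
b = sine_int16(660, 8000)   # hypothetical test tone 2
print("samples that would overflow:", count_overflows(a, b))
print("peak after clipping mix:", max(abs(s) for s in mix_add_clip(a, b)))
print("peak after averaging mix:", max(abs(s) for s in mix_average(a, b)))
```

Option 1 keeps each source at its original level but has to saturate some samples; option 2 never saturates but halves every sample.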
 
What is the sampling rate? I wonder how it would sound taking every other sample from each file, then alternating them. Would that work (with just losing some of the higher frequencies), or would it just sound like garbage?
 
I don't know the specific PCM format you have. Is there overall scaling information that you can adjust after taking the average for each sample? If not, then your sound is already as loud as it can get (for its frequency spectrum), and there is no way to get a louder sound.

Just adding the samples won't necessarily give the result you expect, however. Imagine, for example, that both sounds are the same sine wave with a phase difference of pi: the result will be silence. What you can add easily is the magnitude of the Fourier transform of each sound, then transform back. That might lead to some unwanted side effects, but it will add the sound intensities as you expect.
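The cancellation case can be checked numerically. This is a small sketch only (the 440 Hz tone, the 8 kHz rate and the helper names are assumptions); it does not attempt the Fourier-magnitude approach.

```python
import math

def sine(freq_hz, phase, n, rate_hz=8000):
    """Unit-amplitude sine with a given starting phase (radians)."""
    return [math.sin(2 * math.pi * freq_hz * i / rate_hz + phase)
            for i in range(n)]

def rms(x):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(v * v for v in x) / len(x))

a = sine(440, 0.0, 8000)
b = sine(440, math.pi, 8000)           # same tone, shifted by pi
mixed = [x + y for x, y in zip(a, b)]  # plain sample-wise sum

print("rms of one source:", rms(a))
print("rms of the sum:   ", rms(mixed))  # essentially zero
```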
 
If you're doing this in software, can you convert them to a format with more dynamic range, i.e., more bits, before adding them?
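A rough sketch of that idea, assuming the samples are held in Python `array` objects: sum into a wider integer type, then scale the whole mix back only if its peak no longer fits in 16 bits. The function names and the tiny sample lists are made up for illustration.

```python
from array import array

INT16_MAX = 32767

def mix_wide(a, b):
    """Sum two 16-bit sample arrays into a wider ('l', at least 32-bit) array."""
    return array("l", (x + y for x, y in zip(a, b)))

def normalize_to_int16(wide):
    """Scale the wide mix so that its largest peak just fits in 16 bits."""
    peak = max(abs(s) for s in wide)
    if peak <= INT16_MAX:
        return array("h", list(wide))   # already fits, no scaling needed
    gain = INT16_MAX / peak             # one gain for the whole mix
    return array("h", (int(s * gain) for s in wide))

a = array("h", [32767, -32768, 1000, 0])     # hypothetical samples
b = array("h", [32767, -32768, -1000, 500])  # hypothetical samples
wide = mix_wide(a, b)
out = normalize_to_int16(wide)
print(list(wide))
print(list(out))
```

The wider type removes the overflow during the addition itself, but going back to 16 bits still requires a gain below 1 whenever the peaks line up.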
 
Maybe I was not entirely clear. It was meant as a theoretical 'problem' rather than a practical one. So, how can you theoretically add two PCM signals together, so that their volumes in the mix sound just as loud as in the originals? (and you don't get clipping and distortion?) (see (1) and (2))
 
I don't see a relevant difference between having two actual files and asking how to add them, and a purely theoretical discussion of how you would add them if you had two actual files.
 
entropy1 said:
Maybe I was not entirely clear. It was meant as a theoretical 'problem' rather than a practical one. So, how can you theoretically add two PCM signals together, so that their volumes in the mix sound just as loud as in the originals? (and you don't get clipping and distortion?) (see (1) and (2))
You can't. There are too many constraints there. As a simple example, suppose you have two different sine waves, each of the maximum amplitude that can be achieved without clipping.

So what would you like to happen when they're in phase? If you retain the same amplitude of each sine wave, you're going to clip. You have in effect asked how to add two numbers between 0 and ##2^n## in such a way that you don't scale either but their sum is still ##\leq 2^n##. You can't achieve all those goals. Something has to give.
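Written out as a worked inequality (the common gain ##g## is a symbol introduced here for illustration): take two samples ##0 \leq x_1, x_2 \leq 2^n## and require the scaled sum to fit for every possible pair, including the worst case ##x_1 = x_2 = 2^n##.

$$ g\,(x_1 + x_2) \leq 2^n \ \text{ for all } x_1, x_2 \quad\Longrightarrow\quad g \cdot 2 \cdot 2^n \leq 2^n \quad\Longrightarrow\quad g \leq \tfrac{1}{2} $$

So ##g = 1## (no scaling) cannot be guaranteed to fit, and the largest gain that is always safe is ##1/2##, which is exactly the "divide by two" option.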
 
entropy1 said:
Suppose I have two audio files in 16-bit PCM, both recorded at the maximum level: apart from noise and distortion, the loudest part of each recording reaches the maximum PCM value. So each signal is recorded as loud as it can be without clipping.

If we want to mix both files together so that both audio signals keep their original volume, we can:
  1. Just add both signals together sample-wise. However, the mix can now contain samples that clip, because the sum of the two signals can exceed the 16-bit range.
  2. Add both signals sample-wise and divide by two. Now the mix won't clip, but the volume of both signals has halved.
So how can we mix the audio files without clipping and without changing the volume?
Let's assume the signals are white noise. To retain the same volume we must add them so that the total power, and not the voltage, remains the same. In the analogue world we would combine the signals and pass them through a 3dB attenuator. A 3dB attenuator reduces voltage by a factor of 0.707.
I think that in the PCM system we would need to add the two digital samples and then multiply the value by 0.707 to obtain a 3dB reduction. The resulting signal will fit into the 16 bits.
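A quick numerical check of the 0.707 idea, as a sketch only. It assumes uniformly distributed, independent white noise at full scale and a fixed random seed; real recordings will behave differently. It compares the RMS level of the scaled mix with one source and counts how many samples still exceed the 16-bit range.

```python
import math
import random

def rms(x):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(v * v for v in x) / len(x))

rng = random.Random(0)     # fixed seed so the run is repeatable
n = 100_000
full_scale = 32767         # largest positive 16-bit sample value

# Two independent full-scale noise signals (uniform noise is an assumption).
a = [rng.uniform(-full_scale, full_scale) for _ in range(n)]
b = [rng.uniform(-full_scale, full_scale) for _ in range(n)]

# Add sample-wise, then attenuate by 3 dB (voltage factor 0.707).
mixed = [(x + y) * 0.707 for x, y in zip(a, b)]

over = sum(1 for s in mixed if abs(s) > full_scale)
print("rms ratio mix / source:", rms(mixed) / rms(a))
print("samples above full scale:", over, "of", n)
```

Under these assumptions the average power of the mix matches one source, but individual peaks can still exceed full scale, so the 3 dB figure holds for level on average rather than for every sample.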
 
Get 2 sets of amplifier/speakers. Play each file on its own device. Let your ears/brain handle the complexities of mixing the signals :wink:
 
If I have a sound source playing through my PC speakers, and I start a second sound source, I don't perceive any change in volume of the first sound source. Suppose both sound sources have the same amplitude, and this amplitude is pretty much maximum. Then, if we mix the two, there must be clipping, right? Yet I hear neither a change in volume nor clipping when I start a second sound source on my PC speakers. How is this possible?
 
entropy1 said:
If I have a sound source playing through my PC speakers, and I start a second sound source, I don't perceive any change in volume of the first sound source. Suppose both sound sources have the same amplitude, and this amplitude is pretty much maximum. Then, if we mix the two, there must be clipping, right? Yet I hear neither a change in volume nor clipping when I start a second sound source on my PC speakers. How is this possible?

An excellent question!

We can break it into 2 parts (neither of which I can answer).

1) Does the physics of the real-life situation actually say that the result of two pressure waves is the arithmetic sum of the two waves? That's how we portray transverse waves when we teach about wave interference. However, if we throw a handful of pebbles in a pond, we see interference patterns, but we don't see local maxima in the water level that increase linearly with the number of pebbles thrown. Longitudinal waves are harder to visualize, but I suspect sound waves aren't actually additive.

2) Whatever the physics is, how does our perception combine sounds from different sources?
An interesting webpage on musical acoustics, http://digitalsoundandmusic.com/chapters/ch4/ says in section 4.1.6.2 that the human ear performs a frequency analysis of sounds.

In section 4.2.1.5, that same article gives an example of computing the loudness of hearing a lawnmower and a symphony orchestra at the same time. It concludes:
The combined sounds in this case are not perceptibly louder than the louder of the two original sounds being combined.
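For a sense of the arithmetic behind that kind of conclusion, here is a sketch of the usual rule for combining incoherent sources: add the powers, not the decibel values. The levels of 90 dB and 80 dB are hypothetical and are not the figures used on the cited page.

```python
import math

def combine_db(*levels_db):
    """Combined level of incoherent sources: sum the powers, convert back to dB."""
    total_power = sum(10 ** (level / 10) for level in levels_db)
    return 10 * math.log10(total_power)

# Hypothetical levels, chosen only for illustration.
print(combine_db(90, 80))   # a loud source plus one 10 dB quieter
print(combine_db(80, 80))   # two equally loud sources
```

With these made-up numbers the quieter source adds well under 1 dB to the louder one, while two equal sources add about 3 dB.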

 
tech99 said:
Let's assume the signals are white noise. To retain the same volume we must add them so that the total power, and not the voltage, remains the same. In the analogue world we would combine the signals and pass them through a 3dB attenuator. A 3dB attenuator reduces voltage by a factor of 0.707.
I think that in the PCM system we would need to add the two digital samples and then multiply the value by 0.707 to obtain a 3dB reduction. The resulting signal will fit into the 16 bits.

0.707 (##1/\sqrt{2}##) of the peak signal voltage would be the RMS value (for a sine wave).
 
tech99 said:
I think that in the PCM system we would need to add the two digital samples and then multiply the value by 0.707 to obtain a 3dB reduction. The resulting signal will fit into the 16 bits.
Wouldn't this possibly use more bits than the original signals (assuming no cancelation)?
 
FactChecker said:
Wouldn't this possibly use more bits than the original signals (assuming no cancelation)?
If the signals are noise-like and we use a 3dB attenuator then the signal power is the same. Then if the wave shape is the same the amplitude is the same. If we are dealing with noise-like signals there is always the occasional peak which overloads the system.
If you add two sine waves which are in phase then you need to reduce the input level by 6dB rather than 3dB, because they add on a voltage basis, but this is very unlikely to occur with real sounds.
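The 6 dB versus 3 dB distinction can be reproduced with a short sketch. The 440 Hz tone, the Gaussian noise and the fixed seed are assumptions made for the example.

```python
import math
import random

def rms(x):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(v * v for v in x) / len(x))

def level_change_db(mixed, reference):
    """Level of the mix relative to one source, in dB."""
    return 20 * math.log10(rms(mixed) / rms(reference))

# Case 1: two identical, in-phase sine waves add on a voltage basis.
tone = [math.sin(2 * math.pi * 440 * i / 8000) for i in range(8000)]
in_phase_mix = [2 * v for v in tone]

# Case 2: two independent noise signals add on a power basis.
rng = random.Random(1)      # fixed seed so the run is repeatable
noise_a = [rng.gauss(0, 1) for _ in range(100_000)]
noise_b = [rng.gauss(0, 1) for _ in range(100_000)]
noise_mix = [x + y for x, y in zip(noise_a, noise_b)]

print("in-phase sines: %.2f dB" % level_change_db(in_phase_mix, tone))
print("independent noise: %.2f dB" % level_change_db(noise_mix, noise_a))
```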
When you add two noise-like signals, it gives an increase of 3dB, and notice that this is only just audible, as the ear has a log response.
In a practical case I would be surprised if you operated with no headroom to allow for fluctuations in level, so I think it is unlikely that 16 bits are filled.
When dealing with audio levels, also be aware that phase distortion can create a new wave shape having much higher peaks, even though the power remains the same. This new wave does not sound louder and cannot be audibly distinguished from the original. It is uncertain whether a particular microphone and amplifier will maintain zero phase distortion.
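The point about phase and peaks can be illustrated with a sketch: the same 16 harmonics are summed twice, once with all phases aligned and once with quadratic (Schroeder-style) phases. The number of harmonics and the phase formula are assumptions picked for the demonstration. The code only compares power and peak; it says nothing about how the two waves sound.

```python
import math

def harmonic_sum(phases, n=4096):
    """One period of sum_k sin(k*t + phase_k) for k = 1..len(phases)."""
    return [sum(math.sin(k * 2 * math.pi * i / n + p)
                for k, p in enumerate(phases, start=1))
            for i in range(n)]

def rms(x):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(v * v for v in x) / len(x))

def peak(x):
    """Largest absolute sample value."""
    return max(abs(v) for v in x)

n_harm = 16   # number of equal-amplitude harmonics (an arbitrary choice)
aligned = harmonic_sum([0.0] * n_harm)
# Quadratic phases, used here only as one convenient way to spread the peaks.
spread = harmonic_sum([math.pi * k * k / n_harm for k in range(1, n_harm + 1)])

print("rms  aligned / spread:", rms(aligned), rms(spread))
print("peak aligned / spread:", peak(aligned), peak(spread))
```

Both waves contain the same harmonics at the same amplitudes, so their power is identical, yet the aligned version has noticeably higher peaks and would need more headroom.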
 
