Mixing two discrete audio signals

entropy1 · Sep 12, 2019

Suppose I have two audio files in 16-bit PCM, both recorded at the highest level that does not clip, so that (apart from noise and distortion) the loudest peaks reach the maximum PCM value.

If we want to mix both files together so that each audio signal keeps its original volume, we can:

1. Just add both signals together samplewise. However, samples in the mix can now clip, because the sum of the two signals can exceed the 16-bit range.
2. Add both signals samplewise and divide by two. Now the mix won't clip, but the amplitude of each signal has been halved.

So how can we mix the audio files without clipping and without changing the volume?
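The two options in the question can be sketched in a few lines of Python. This is only an illustration: the sample values and the helper names `clip16`, `mix_sum` and `mix_average` are made up for the example, and plain Python integers stand in for 16-bit PCM samples.

```python
INT16_MIN, INT16_MAX = -32768, 32767

def clip16(x):
    """Clamp an integer to the signed 16-bit range."""
    return max(INT16_MIN, min(INT16_MAX, x))

def mix_sum(a, b):
    """Add samplewise; anything outside 16 bits is clipped."""
    return [clip16(x + y) for x, y in zip(a, b)]

def mix_average(a, b):
    """Add samplewise and halve; never overflows, but each source is halved."""
    return [(x + y) // 2 for x, y in zip(a, b)]

# Hypothetical sample values close to full scale.
a = [30000, -20000, 1000]
b = [10000, -20000, -500]

print(mix_sum(a, b))      # [32767, -32768, 500]
print(mix_average(a, b))  # [20000, -20000, 250]
```

The first two samples show the trade-off: the plain sum hits the 16-bit limits, while the average stays in range at half the amplitude (about 6 dB lower).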

scottdave · Sep 12, 2019

What is the sampling rate? I wonder how it would sound taking every other sample from each file, then alternating them. Would that work (with just losing some of the higher frequencies), or would it just sound like garbage?

mfb · Sep 12, 2019

I don't know the specific PCM format you have - is there overall scaling information that you can adjust after taking the average for each sample? If not, then your sound is already as loud as it can get (for its frequency spectrum); there is no way to get a louder sound.

Just adding the samples won't necessarily give the result you expect, however. Imagine, for example, that both sounds are sine waves of the same frequency and amplitude with a phase difference of pi: the result will be silence. What you can add easily is the magnitude of the Fourier transform of each sound, then transform back. That might lead to some unwanted side effects, but it will add the sound intensities as you expect.
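The sine-wave case can be checked numerically. The sketch below assumes two full-scale sine waves of the same frequency, sampled at a made-up 64 samples per cycle; `sine` and `peak` are hypothetical helper names.

```python
import math

N = 64        # samples per cycle (arbitrary choice for the example)
AMP = 32767   # full-scale 16-bit amplitude

def sine(phase):
    """One cycle of a full-scale sine wave with the given phase offset."""
    return [AMP * math.sin(2 * math.pi * n / N + phase) for n in range(N)]

def peak(signal):
    """Largest absolute sample value."""
    return max(abs(s) for s in signal)

in_phase = [x + y for x, y in zip(sine(0.0), sine(0.0))]
anti_phase = [x + y for x, y in zip(sine(0.0), sine(math.pi))]

print(peak(in_phase))    # twice the single-wave peak
print(peak(anti_phase))  # essentially zero (floating-point rounding only)
```

So the samplewise sum of two equally loud signals can be anything from silence to double the amplitude, depending on their relative phase.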

RPinPA · Sep 12, 2019

If you're doing this in software, can you convert them to a format with more dynamic range, i.e., more bits, before adding them?
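A rough sketch of that idea using Python's standard `array` module, assuming typecode `'h'` is 16 bits wide (true on common platforms) and `'l'` is at least 32 bits. The sample values are made up.

```python
from array import array

a = array('h', [30000, -20000, 1000])   # 'h': signed short, 16 bits on common platforms
b = array('h', [10000, -20000, -500])

# Sum into a wider type ('l' is a signed long, at least 32 bits),
# so values such as 40000 are representable.
wide = array('l', [x + y for x, y in zip(a, b)])
print(list(wide))

# Writing the sums back to 16 bits without scaling or clipping fails.
try:
    array('h', list(wide))
    fits_in_16_bits = True
except OverflowError:
    fits_in_16_bits = False
print(fits_in_16_bits)
```

The wider format holds the exact sum, but the problem returns as soon as the result has to be stored or played back as 16-bit PCM.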

entropy1 · Sep 12, 2019

Maybe I was not entirely clear. It was meant as a theoretical 'problem' rather than a practical one. So, how can you theoretically add two PCM signals together, so that their volumes in the mix sound just as loud as in the originals? (and you don't get clipping and distortion?) (see (1) and (2))

mfb · Sep 12, 2019

I don't see a relevant difference between having two actual files and asking how to add them, and a purely theoretical discussion of how you would add them if you had two actual files.

RPinPA · Sep 12, 2019

entropy1 said:

Maybe I was not entirely clear. It was meant as a theoretical 'problem' rather than a practical one. So, how can you theoretically add two PCM signals together, so that their volumes in the mix sound just as loud as in the originals? (and you don't get clipping and distortion?) (see (1) and (2))

You can't. There are too many constraints there. As a simple example, suppose you have two different sine waves, each of the maximum amplitude that can be achieved without clipping.

So what would you like to happen when they're in phase? If you retain the same amplitude of each sine wave, you're going to clip. You have in effect asked how to add two numbers between 0 and ##2^n## in such a way that you don't scale either but their sum is still ##\leq 2^n##. You can't achieve all those goals. Something has to give.

tech99 · Sep 12, 2019

entropy1 said:

Suppose I have two audio files in 16-bit PCM, both recorded at the highest level that does not clip, so that (apart from noise and distortion) the loudest peaks reach the maximum PCM value.

If we want to mix both files together so that each audio signal keeps its original volume, we can:

1. Just add both signals together samplewise. However, samples in the mix can now clip, because the sum of the two signals can exceed the 16-bit range.
2. Add both signals samplewise and divide by two. Now the mix won't clip, but the amplitude of each signal has been halved.

So how can we mix the audio files without clipping and without changing the volume?

Let's assume the signals are white noise. To retain the same volume we must add them so that the total power, and not the voltage, remains the same. In the analogue world we would combine the signals and pass them through a 3 dB attenuator. A 3 dB attenuator scales the voltage by a factor of 0.707.
I think that in the PCM system we would need to add the two digital samples and then multiply the value by 0.707 to obtain a 3dB reduction. The resulting signal will fit into the 16 bits.
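A quick numerical check of the power argument, as a sketch only: it assumes two independent Gaussian noise signals (a stand-in for "white noise"), with a made-up RMS level `SIGMA` and a fixed random seed so the run is repeatable. `rms` is a hypothetical helper.

```python
import math
import random

random.seed(0)
N = 20000
SIGMA = 5000.0  # made-up RMS level, well below 16-bit full scale

def rms(x):
    """Root-mean-square value of a list of samples."""
    return math.sqrt(sum(s * s for s in x) / len(x))

# Two independent (uncorrelated) noise signals at the same level.
a = [random.gauss(0.0, SIGMA) for _ in range(N)]
b = [random.gauss(0.0, SIGMA) for _ in range(N)]

# Add samplewise, then scale by 1/sqrt(2), i.e. about 0.707 (-3 dB).
mixed = [(x + y) / math.sqrt(2) for x, y in zip(a, b)]

print(rms(a), rms(b), rms(mixed))
```

The three RMS values come out close to each other, so the mix has roughly the same power as either input. Equal RMS says nothing about individual peaks, which can still exceed those of the inputs.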

scottdave · Sep 12, 2019

Get two sets of amplifiers/speakers. Play each file on its own device. Let your ears/brain handle the complexities of mixing the signals.

entropy1 · Oct 4, 2019

If I have a sound source playing through my PC speakers, and I start a second sound source, I don't perceive any change in the volume of the first one. Suppose both sound sources have the same amplitude, and this amplitude is pretty much maximum. Then, if we mix the two, there must be clipping, right? Yet I hear neither a change in volume nor clipping when I start a second sound source on my PC speakers. How is this possible?

Stephen Tashi · Oct 4, 2019

entropy1 said:

If I have a sound source playing through my PC speakers, and I start a second sound source, I don't perceive any change in the volume of the first one. Suppose both sound sources have the same amplitude, and this amplitude is pretty much maximum. Then, if we mix the two, there must be clipping, right? Yet I hear neither a change in volume nor clipping when I start a second sound source on my PC speakers. How is this possible?

An excellent question!

We can break it into two parts (neither of which I can answer).

1) Does the physics of the real-life situation actually say that the result of two pressure waves is the arithmetic sum of the two waves? That's how we portray transverse waves when we teach about wave interference. However, if we throw a handful of pebbles in a pond, we see interference patterns, but we don't see local maxima in the water level that increase linearly with the number of pebbles thrown. Longitudinal waves are harder to visualize, but I suspect sound waves aren't actually additive.

2) Whatever the physics is, how does our perception combine sounds from different sources?
An interesting webpage on musical acoustics, http://digitalsoundandmusic.com/chapters/ch4/, says in section 4.1.6.2 that the human ear performs a frequency analysis of sounds.

In section 4.2.1.5, the same article gives an example of computing the loudness of hearing a lawnmower and a symphony orchestra at the same time. It concludes:

The combined sounds in this case are not perceptibly louder than the louder of the two original sounds being combined.

osilmag · Oct 6, 2019

tech99 said:

Let's assume the signals are white noise. To retain the same volume we must add them so that the total power, and not the voltage, remains the same. In the analogue world we would combine the signals and pass them through a 3 dB attenuator. A 3 dB attenuator scales the voltage by a factor of 0.707.
I think that in the PCM system we would need to add the two digital samples and then multiply the value by 0.707 to obtain a 3dB reduction. The resulting signal will fit into the 16 bits.

0.707 (##1/\sqrt{2}##) of the signal voltage would be the RMS value.

FactChecker · Oct 6, 2019

tech99 said:

I think that in the PCM system we would need to add the two digital samples and then multiply the value by 0.707 to obtain a 3dB reduction. The resulting signal will fit into the 16 bits.

Wouldn't this possibly use more bits than the original signals (assuming no cancelation)?

tech99 · Oct 7, 2019

FactChecker said:

Wouldn't this possibly use more bits than the original signals (assuming no cancelation)?

If the signals are noise-like and we use a 3dB attenuator then the signal power is the same. Then if the wave shape is the same the amplitude is the same. If we are dealing with noise-like signals there is always the occasional peak which overloads the system.
If you add two sine waves which are in phase then you need to reduce the input level by 6dB rather than 3dB, because they add on a voltage basis, but this is very unlikely to occur with real sounds.
When you add two noise-like signals, it gives an increase of 3dB, and notice that this is only just audible, as the ear has a log response.
In a practical case I would be surprised if you operated with no headroom to allow for fluctuations in level, so I think it is unlikely that 16 bits are filled.
When dealing with audio levels, also be aware that phase distortion can create a new wave shape having much higher peaks, even though the power remains the same. This new wave does not sound louder and cannot be audibly distinguished from the original. It is uncertain whether a particular microphone and amplifier will maintain zero phase distortion.
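The 3 dB and 6 dB figures follow from the usual voltage-to-decibel conversion, ##20 \log_{10}## of the amplitude factor. A small sketch (the helper name `gain_db` is made up for the example):

```python
import math

def gain_db(factor):
    """Convert an amplitude (voltage) scale factor to decibels."""
    return 20 * math.log10(factor)

print(round(gain_db(1 / math.sqrt(2)), 2))  # -3.01: scaling by 0.707 for uncorrelated signals
print(round(gain_db(0.5), 2))               # -6.02: halving for two in-phase sine waves
```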
