Mixing two discrete audio signals


Discussion Overview

The discussion revolves around the theoretical mixing of two discrete audio signals recorded in 16-bit PCM format, focusing on how to combine them without causing clipping or altering their perceived volume. Participants explore various methods and implications of audio mixing, including technical constraints and perceptual aspects.

Discussion Character

  • Exploratory
  • Technical explanation
  • Debate/contested
  • Conceptual clarification

Main Points Raised

  • Some participants suggest that simply adding two audio signals samplewise can lead to clipping if the combined value exceeds the maximum PCM level.
  • Others propose averaging the signals to avoid clipping, but note that this would halve the volume of the mixed signals.
  • A participant questions the feasibility of mixing by alternating samples from each file, wondering about the potential loss of higher frequencies.
  • One participant mentions the possibility of using Fourier transformations to combine sound intensities, although this may introduce unwanted side effects.
  • Another participant raises the idea of converting the audio files to a format with a greater dynamic range before mixing to prevent clipping.
  • Some participants emphasize the theoretical nature of the problem, arguing that certain constraints make it impossible to mix two maximum amplitude signals without clipping or reducing volume.
  • A few participants discuss the perception of sound, noting that the human ear may not perceive changes in volume when multiple sources are played simultaneously, despite potential physical clipping.
  • There is mention of using a 3dB attenuator in an analog context to maintain volume while mixing, with a suggestion that a similar approach could be applied in PCM systems.
  • Concerns are raised about whether mixing signals could lead to a need for more bits than the original signals, particularly in the context of noise-like signals.

Areas of Agreement / Disagreement

Participants express differing views on the feasibility of mixing audio signals without clipping or volume reduction. While some agree on the theoretical constraints involved, others propose various methods that may or may not resolve the issues raised. The discussion remains unresolved with multiple competing views.

Contextual Notes

Limitations include assumptions about the nature of the audio signals (e.g., sine waves vs. noise), the potential for phase differences leading to silence when signals are added, and the impact of perceptual factors on sound mixing.

entropy1
Suppose I have two audio files in 16-bit PCM, both recorded at the maximum level, so that (apart from noise and distortion) the loudest peaks reach the maximum PCM value. In other words, each signal is recorded as loud as possible without clipping.

If we want to mix both files so that each audio signal keeps its original volume, we can:
  1. Just add both signals together sample-wise. However, samples in the mix can then clip, because the sum of both signals can exceed the 16-bit range.
  2. Add both signals sample-wise and divide by two. Now the mix won't clip, but the volume of both signals has been halved.
So how can we mix the audio files without clipping and without changing the volume?
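A minimal sketch of options (1) and (2), assuming signed 16-bit samples held in plain Python lists. The function names `mix_add_clip` and `mix_average` and the sample values are illustrative only, not taken from any library or from real recordings.

```python
INT16_MIN, INT16_MAX = -32768, 32767

def mix_add_clip(a, b):
    """Option 1: add sample-wise, then saturate to the 16-bit range."""
    return [max(INT16_MIN, min(INT16_MAX, x + y)) for x, y in zip(a, b)]

def mix_average(a, b):
    """Option 2: add sample-wise and halve; the result stays within 16 bits."""
    return [(x + y) // 2 for x, y in zip(a, b)]

# Hypothetical sample values chosen so that the first two sums overflow.
a = [30000, -20000, 1000]
b = [10000, -20000, -500]

clipped = mix_add_clip(a, b)   # overflowing sums are pinned to the limits
averaged = mix_average(a, b)   # no overflow, but each source is at half amplitude
```

With these values, option 1 distorts the two loud samples, while option 2 keeps the waveform shape at half the amplitude.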
 
What is the sampling rate? I wonder how it would sound taking every other sample from each file, then alternating them. Would that work (with just losing some of the higher frequencies), or would it just sound like garbage?
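One way to reason about the interleaving idea, as a sketch that assumes both files share the sampling rate ##f_s## and writes the interleaved stream as ##y[n]## (notation introduced here, not from the thread): taking even samples from ##a## and odd samples from ##b## gives

$$ y[n] = \frac{1+(-1)^n}{2}\,a[n] + \frac{1-(-1)^n}{2}\,b[n] = \frac{a[n]+b[n]}{2} + (-1)^n\,\frac{a[n]-b[n]}{2} $$

The first term is the plain average of the two signals. Since ##(-1)^n = \cos(\pi n)##, the second term shifts the spectrum of ##(a-b)/2## by ##f_s/2##, so a component at frequency ##f## reappears at ##f_s/2 - f##. Under these assumptions the result would be a half-volume mix plus mirrored images, which could only be filtered out if both signals were band-limited to below ##f_s/4## beforehand.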
 
I don't know the specific PCM format you have. Is there overall scaling information that you can adjust after taking the average for each sample? If not, then your sound is already as loud as it can get (for its frequency spectrum), and there is no way to get a louder sound.

Just adding the samples won't necessarily give the result you expect, however. Imagine, for example, that both sounds are a sine wave with a phase difference of pi: the result will be silence. What you can add easily is the magnitude of the Fourier transform of each sound, then transform back. That might lead to some unwanted side effects, but it will add the sound intensities as you expect.
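The cancellation case can be checked numerically. This is a small sketch assuming two identical tones that differ only by a phase of pi; the helper `sine` and its parameters are made up for the illustration.

```python
import math

def sine(n_samples, cycles, phase, amplitude=32767):
    """One sine tone as a list of floats (not yet quantized to 16 bits)."""
    return [amplitude * math.sin(2 * math.pi * cycles * i / n_samples + phase)
            for i in range(n_samples)]

n = 1000
a = sine(n, cycles=5, phase=0.0)
b = sine(n, cycles=5, phase=math.pi)      # same tone, shifted by pi
mixed = [x + y for x, y in zip(a, b)]

peak_in = max(abs(x) for x in a)          # full-scale input
peak_out = max(abs(x) for x in mixed)     # numerically ~0: the tones cancel
```

Sample-wise addition is therefore sensitive to the relative phase of the inputs, not just to their levels.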
 
If you're doing this in software, can you convert them to a format with more dynamic range, i.e., more bits, before adding them?
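A sketch of that idea using only the standard library, assuming the 16-bit samples are already loaded into an `array` (the sample values are hypothetical). The sum is stored in a wider integer type, so nothing is clipped during the mix itself.

```python
from array import array

a = array('h', [32767, -32768, 12000])   # 'h' = signed 16-bit samples
b = array('h', [32767, -32768, -2000])

# 'l' is a signed integer of at least 32 bits, wide enough for any 16-bit sum.
mixed = array('l', (x + y for x, y in zip(a, b)))

peak = max(abs(s) for s in mixed)
fits_in_16_bits = all(-32768 <= s <= 32767 for s in mixed)
```

The wider format only postpones the decision: to write the result back as 16-bit PCM, the mix still has to be scaled or clipped whenever `fits_in_16_bits` is false.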
 
Maybe I was not entirely clear. It was meant as a theoretical 'problem' rather than a practical one. So, how can you theoretically add two PCM signals together, so that their volumes in the mix sound just as loud as in the originals (and you don't get clipping and distortion)? (See (1) and (2).)
 
I don't see a relevant difference between having two actual files and asking how to add them, and a purely theoretical discussion of how you would add them if you had two actual files.
 
entropy1 said:
Maybe I was not entirely clear. It was meant as a theoretical 'problem' rather than a practical one. So, how can you theoretically add two PCM signals together, so that their volumes in the mix sound just as loud as in the originals (and you don't get clipping and distortion)? (See (1) and (2).)
You can't. There are too many constraints there. As a simple example, suppose you have two different sine waves, each of the maximum amplitude that can be achieved without clipping.

So what would you like to happen when they're in phase? If you retain the same amplitude of each sine wave, you're going to clip. You have in effect asked how to add two numbers between 0 and ##2^n## in such a way that you don't scale either but their sum is still ##\leq 2^n##. You can't achieve all those goals. Something has to give.
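The constraint can be written out explicitly. As a sketch, let ##M## be the full-scale sample value and ##g## a gain applied to the sum (both symbols are introduced here for illustration):

$$ y[n] = g\,\bigl(a[n] + b[n]\bigr), \qquad |a[n]| \le M,\ |b[n]| \le M \quad\Rightarrow\quad \max_n |y[n]| \le 2gM $$

The bound ##2gM## is reached when two full-scale peaks coincide. Requiring ##2gM \le M## forces ##g \le 1/2##, which is the add-and-halve mix, so any ##g > 1/2## can clip in the worst case.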
 
entropy1 said:
Suppose I have two audio files in 16-bit PCM, both recorded at the maximum level, so that (apart from noise and distortion) the loudest peaks reach the maximum PCM value. In other words, each signal is recorded as loud as possible without clipping.

If we want to mix both files so that each audio signal keeps its original volume, we can:
  1. Just add both signals together sample-wise. However, samples in the mix can then clip, because the sum of both signals can exceed the 16-bit range.
  2. Add both signals sample-wise and divide by two. Now the mix won't clip, but the volume of both signals has been halved.
So how can we mix the audio files without clipping and without changing the volume?
Let's assume the signals are white noise. To retain the same volume we must add them so that the total power, and not the voltage, remains the same. In the analogue world we would combine the signals and pass them through a 3dB attenuator. A 3dB attenuator reduces voltage by a factor of 0.707.
I think that in the PCM system we would need to add the two digital samples and then multiply the value by 0.707 to obtain a 3dB reduction. The resulting signal will fit into the 16 bits.
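A rough numerical check of the 3 dB idea, assuming Gaussian noise as a stand-in for white noise and an arbitrarily chosen level (`sigma`) well below full scale. The names and numbers here are assumptions for the sketch, not measurements.

```python
import math
import random

def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

random.seed(0)                       # fixed seed so the run is repeatable
n = 20000
sigma = 8000.0                       # assumed noise level, well below full scale
a = [random.gauss(0.0, sigma) for _ in range(n)]
b = [random.gauss(0.0, sigma) for _ in range(n)]

gain = 1 / math.sqrt(2)              # about 0.707, i.e. -3 dB
mixed = [gain * (x + y) for x, y in zip(a, b)]

ratio = rms(mixed) / rms(a)          # close to 1 for uncorrelated noise
worst_case_peak = gain * (32767 + 32767)   # if two full-scale peaks coincide
```

For uncorrelated inputs the mix has about the same RMS level as each input, but `worst_case_peak` is roughly 1.41 times the 16-bit full scale, so an occasional coinciding peak could still overflow.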
 
Get 2 sets of amplifier/speakers. Play each file on its own device. Let your ears/brain handle the complexities of mixing the signals :wink:
 
  • #10
If I have a sound source playing through my PC speakers, and I start a second sound source, I don't perceive any change in volume of the first one. Suppose both sound sources have the same amplitude, and this amplitude is pretty much the maximum. Then, if we mix the two, there must be clipping, right? Yet I hear neither a change in volume nor clipping when I start a second sound source on my PC speakers. How is this possible?
 
  • #11
entropy1 said:
If I have a sound source playing through my PC speakers, and I start a second sound source, I don't perceive any change in volume of the first one. Suppose both sound sources have the same amplitude, and this amplitude is pretty much the maximum. Then, if we mix the two, there must be clipping, right? Yet I hear neither a change in volume nor clipping when I start a second sound source on my PC speakers. How is this possible?

An excellent question!

We can break it into 2 parts (neither of which I can answer).

1) Does the physics of the real-life situation actually say that the result of two pressure waves is the arithmetic sum of the two waves? That's how we portray transverse waves when we teach about wave interference. However, if we throw a handful of pebbles in a pond, we see interference patterns, but we don't see local maxima in the water level that increase linearly with the number of pebbles thrown. Longitudinal waves are harder to visualize, but I suspect sound waves aren't actually additive.

2) Whatever the physics is, how does our perception combine sounds from different sources?
An interesting webpage on musical acoustics, http://digitalsoundandmusic.com/chapters/ch4/ says in section 4.1.6.2 that the human ear performs a frequency analysis of sounds.

In section 4.2.1.5, that same article gives an example of computing the loudness of hearing a lawnmower and a symphony orchestra at the same time. It concludes:
The combined sounds in this case are not perceptibly louder than the louder of the two original sounds being combined.
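The usual way to combine the levels of independent sources is to add their powers rather than their decibel values. The sketch below uses hypothetical levels (80, 90 and 70 dB), not the figures from the cited page, and the function name `combine_db` is made up for the illustration.

```python
import math

def combine_db(*levels_db):
    """Combined level of incoherent sources: sum the powers, not the decibels."""
    total_power = sum(10 ** (level / 10) for level in levels_db)
    return 10 * math.log10(total_power)

equal_sources = combine_db(80, 80)     # about 83.0 dB: +3 dB for two equal sources
unequal_sources = combine_db(90, 70)   # about 90.04 dB: barely above the louder one
```

When one source is much louder than the other, the combined level is almost indistinguishable from the louder source alone, which is consistent with the quoted conclusion.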

 
  • #12
tech99 said:
Let's assume the signals are white noise. To retain the same volume we must add them so that the total power, and not the voltage, remains the same. In the analogue world we would combine the signals and pass them through a 3dB attenuator. A 3dB attenuator reduces voltage by a factor of 0.707.
I think that in the PCM system we would need to add the two digital samples and then multiply the value by 0.707 to obtain a 3dB reduction. The resulting signal will fit into the 16 bits.

0.707 (##1/\sqrt{2}##) of the signal voltage would be the RMS value.
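For a sine wave of peak amplitude ##V_p## and period ##T## (symbols introduced here for illustration), the relation can be worked out as:

$$ V_\text{rms} = \sqrt{\frac{1}{T}\int_0^T V_p^2 \sin^2\!\left(\frac{2\pi t}{T}\right) dt} = \frac{V_p}{\sqrt{2}} \approx 0.707\,V_p, \qquad 20\log_{10}\frac{1}{\sqrt{2}} \approx -3.01\ \text{dB} $$

The same factor is the voltage ratio of a 3 dB attenuator, because halving the power scales the voltage by ##1/\sqrt{2}##. The RMS-to-peak ratio of 0.707 is specific to sine waves and would differ for other waveforms.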
 
  • #13
tech99 said:
I think that in the PCM system we would need to add the two digital samples and then multiply the value by 0.707 to obtain a 3dB reduction. The resulting signal will fit into the 16 bits.
Wouldn't this possibly use more bits than the original signals (assuming no cancelation)?
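The worst case can be counted directly. This sketch assumes signed two's-complement samples and two coinciding full-scale peaks; `bits_needed` is a hypothetical helper written for the illustration.

```python
def bits_needed(value):
    """Bits for a signed two's-complement integer able to hold +/-value."""
    return abs(value).bit_length() + 1    # +1 for the sign bit

FULL_SCALE = 32767
raw_sum = FULL_SCALE + FULL_SCALE             # 65534
attenuated = round(raw_sum * 0.707)           # 46333

bits_input = bits_needed(FULL_SCALE)          # 16
bits_raw = bits_needed(raw_sum)               # 17
bits_attenuated = bits_needed(attenuated)     # 17
```

So in the worst case the attenuated sum still needs one bit more than the inputs; the 0.707 factor keeps the average power the same but does not bound the peak to 16 bits.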
 
  • #14
FactChecker said:
Wouldn't this possibly use more bits than the original signals (assuming no cancelation)?
If the signals are noise-like and we use a 3dB attenuator then the signal power is the same. Then if the wave shape is the same the amplitude is the same. If we are dealing with noise-like signals there is always the occasional peak which overloads the system.
If you add two sine waves which are in phase then you need to reduce the input level by 6dB rather than 3dB, because they add on a voltage basis, but this is very unlikely to occur with real sounds.
When you add two noise-like signals, it gives an increase of 3dB, and notice that this is only just audible, as the ear has a log response.
In a practical case I would be surprised if you operated with no headroom to allow for fluctuations in level, so I think it is unlikely that 16 bits are filled.
When dealing with audio levels, also be aware that phase distortion can create a new wave shape having much higher peaks, even though the power remains the same. This new wave does not sound louder and cannot be audibly distinguished from the original. It is uncertain whether a particular microphone and amplifier will maintain zero phase distortion.
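The point about phase and peak level can be illustrated with a synthetic signal. This is a sketch assuming eight equal-amplitude harmonics (an arbitrary choice); only the phases differ between the two versions, so the power is the same while the peaks are not.

```python
import math

N = 4096          # samples over one fundamental period
HARMONICS = 8     # assumed number of equal-amplitude harmonics

def synth(phase):
    """Sum of harmonics 1..HARMONICS, each with the same phase offset."""
    return [sum(math.sin(2 * math.pi * k * i / N + phase)
                for k in range(1, HARMONICS + 1))
            for i in range(N)]

def rms(x):
    return math.sqrt(sum(s * s for s in x) / len(x))

def peak(x):
    return max(abs(s) for s in x)

sine_phased = synth(0.0)              # all harmonics start as sines
cosine_phased = synth(math.pi / 2)    # same magnitudes, phases shifted by 90 degrees

rms_a, rms_b = rms(sine_phased), rms(cosine_phased)       # equal power
peak_a, peak_b = peak(sine_phased), peak(cosine_phased)   # different peaks
```

Both versions have the same RMS level, but in the cosine-phased version all harmonics peak at the same instant, so it needs noticeably more headroom.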
 
