My motivation was to combine two audio signals in such a way that, in the combined form, one would HEAR a volume level that corresponds to the increase that would exist in a circumstance where it is clearly just a doubling of power. If a person is speaking or singing at a volume level equivalent to 4 watts from a speaker, and a 2nd person chimes in at the same level, we presume it involves twice the power. My motivation was to achieve, in signal arithmetic, exactly that same thing. This had the apparent contradiction that adding voltage would be the equivalent of 16 watts, since doubling the voltage quadruples the power. How could one get 16 watts from a pair of 4 watt sources? So it sure seemed plausible that what is correct is adding power, not adding voltage. Since the actual signal numbers were voltage, they had to be converted to power to do the power arithmetic, then converted back to voltage.
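To put numbers on the two kinds of arithmetic, here is a rough sketch in Python. The 8 ohm load and the function names are my own assumptions for illustration (the load value cancels out of the comparison anyway), and it only deals with signal levels, not signed sample values.

```python
import math

# Hypothetical load resistance, chosen only so the numbers are concrete.
LOAD_OHMS = 8.0

def power_from_voltage(volts, ohms=LOAD_OHMS):
    """P = V^2 / R"""
    return volts * volts / ohms

def voltage_from_power(watts, ohms=LOAD_OHMS):
    """V = sqrt(P * R)"""
    return math.sqrt(watts * ohms)

# Two sources, each at a level equivalent to 4 watts.
v_each = voltage_from_power(4.0)

# Arithmetic 1: add the voltages directly.
v_added = v_each + v_each
p_from_voltage_add = power_from_voltage(v_added)   # about 16 W

# Arithmetic 2: convert to power, add, convert back to voltage.
p_added = power_from_voltage(v_each) + power_from_voltage(v_each)  # about 8 W
v_from_power_add = voltage_from_power(p_added)

print("voltage add:", v_added, "V ->", p_from_voltage_add, "W")
print("power add:  ", v_from_power_add, "V ->", p_added, "W")
```

The voltage sum comes out at twice the single-source voltage, while the power sum comes out at only sqrt(2) times it. That gap is the apparent contradiction described above.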
That is why adding power "makes sense" (although I'm now quite confident that, however much sense it STILL makes, it is actually wrong).
Reality is more complex. Waves, space, and time are involved. And with two people speaking on stage (without an amplification system), different people will hear different things. Given two different places to emit the sounds, the mixing is not as simple as adding voltage (or adding power, for that matter). One must consider the situation, and deal with how the arithmetic applies. The situation is that the two audio sources are not in the same place (no more so than two talkers in a conference room with one phone on the table). How to apply the arithmetic requires deciding where the listener or microphone is placed (with even more complications if things are moving over time).
Simply adding voltages makes an assumption about spatial placement. Adding them with no phase/time shift means accepting the assumption that what is heard is at a point equidistant from both sound sources in space. They actually do combine to "appear" to be more power at that point. This is how waves achieve gain ... it's at a loss in other directions (where waves cancel out). Power (and energy) is still conserved (nothing gained ... nothing lost besides the usual physical losses like heat). The wave structure just moves it around so it appears to be gained in some places and lost in others. Radio waves do this, and antennas are designed to take advantage of it. Panelized speakers are the same thing, I presume (an antenna array equivalent).
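A small numeric sketch of that "gained in some places, lost in others" idea, under some big simplifying assumptions of mine: both sources are the same single-frequency sine wave at equal amplitude, power is taken into a 1 ohm load, and a phase offset stands in for the listener's position (a difference in path length to the two sources). The function name is made up for this example.

```python
import math

def mean_power_of_sum(phase, amplitude=1.0, samples=200):
    """Mean of (a*sin(t) + a*sin(t + phase))^2 over one full cycle."""
    total = 0.0
    for k in range(samples):
        t = 2.0 * math.pi * k / samples
        s = amplitude * math.sin(t) + amplitude * math.sin(t + phase)
        total += s * s
    return total / samples

# Mean power of ONE unit-amplitude sine wave (into 1 ohm) is a^2 / 2.
single = 0.5

# Listener equidistant from both sources: no phase shift, voltages add fully.
in_phase = mean_power_of_sum(0.0)

# Listener where the paths differ by half a wavelength: the waves cancel.
opposite = mean_power_of_sum(math.pi)

# Average over evenly spaced phase offsets, standing in for "all positions".
steps = 360
averaged = sum(mean_power_of_sum(2.0 * math.pi * k / steps)
               for k in range(steps)) / steps

print("in phase:  ", in_phase / single, "x one source")
print("opposite:  ", opposite / single, "x one source")
print("averaged:  ", averaged / single, "x one source")
```

At the equidistant point the sum carries four times the power of one source, at the cancellation point it carries essentially none, and averaged over all the phase offsets it comes back to twice one source, which is just the two powers added.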
This was not about me assuming the signal values were power or such. I always knew they were a measure of voltage. But I had not determined that the addition of voltage was the correct one, until getting through this thread. I looked at quite a number of websites about this, and not a single one ever even said in an unsupported way that addition of voltage is the way to go to effectively combine two power sources (much less a site saying so with an explanation of why it was correct). Hopefully, in the future, Google and the others find this thread.
Waves are funny things.