@Max....I seem to have stirred up a hornet's nest here.... ::-)
If you had a good old analog audio mixer with a summing op-amp, and your inputs were two rail to rail signals, then you'd have to attenuate both by 1/2 to avoid overdriving your output, hence Z=(A+B)/2. Sure, you'll never hear either channel at full volume, but then you'll never get any distortion with both signals going either.
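A quick sketch of that averaging mix in Python, for anyone who wants to poke at the numbers. The function name and the -1..+1 "rail to rail" range are assumptions made for illustration; they are not from anyone's actual design.

```python
# Averaging mix, Z = (A + B) / 2, on signals assumed to swing between -1 and +1.
# The name average_mix and the sample values are hypothetical.

def average_mix(a, b):
    """Attenuate both inputs by 1/2 and sum them."""
    return (a + b) / 2


# Both inputs at the positive rail: the output just reaches the rail, no clipping.
both_full = average_mix(1.0, 1.0)

# Only one input active: it comes out at half level, the price paid for headroom.
one_full = average_mix(1.0, 0.0)

print(both_full, one_full)
```

The second case is the "never hear either channel at full volume" trade-off: a lone full-scale input is heard at half amplitude.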
If you apply attenuation to each signal based on the average level of the other signal, that would be fair enough - that's just a variation on the compression commonly used in telecoms to ensure that your signal is always at a usable level. Or, in the case of my local rock station, to ensure that their signal always sounds louder than the other guys. In this case, you'd just use the above formula and then apply compression to the whole result.
I know how to do this with analog electronics - there are some very nice ICs that do it for you - but I'm hopelessly out of my depth doing it digitally.....
@Tom-ii: It's pricey (at $357US), but here's something we're using for a product. It does (I think) what you are asking...
Hmm -- it sort of does what we want -- but this is more an industrial-level unit for doing things like haunted houses.
I think the price tag would put that way out of the range of all but the most dedicated hobbyists -- I'd really hope ours could be less than $50 -- plus ours will be custom made to work with an Arduino or chipKIT (and anything else with an I2C interface)
@David: For instance when you mix two streams with an amplitude of 0.5 then you get a result of 0.75, not 1. I presume you are applying this formula to individual samples? Then surely this WILL result in distortion?
Further to my previous comment, I actually think this may be correct, although I will reserve final judgement until I've had time to ponder it over a beer or three...
Here's the way I'm thinking about it. We start with the fact that in an electronic audio system we have a maximum value for the stream -- anything above this will be clipped. For the purposes of this exercise, we're assuming that our input channels -- and output channel -- can range between 0 and 1 (we can scale this to other values later).
Let's start by assuming two channels -- A and B. If A is 0 and B is ranging between 0 and 1, then -- using the formula (A + B) - A*B -- the output will simply be equal to B, which is what we'd expect. Similarly, if B is 0 and A is ranging between 0 and 1, then the output will simply be equal to A, which is -- once again -- what we'd expect.
OK, so what do we expect if both A and B are 0.5? I think you are thinking the result should be 1. Similarly for A = 0.75 and B = 0.25 (and vice versa), our knee-jerk reaction might be that the output should be 1. But in this case, it's easy to imagine a combination of signals where A and B are both constantly changing while their sum always equals 1 -- in which case we aren't hearing anything useful at the output.
The formula (A + B) - A*B has the effect of scaling the signals relative to each other while also ensuring that the total value is <= 1.
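The bound follows from a little algebra: (A + B) - A*B is the same as 1 - (1 - A)*(1 - B), and with A and B both between 0 and 1 that product is also between 0 and 1. Here is a minimal Python sketch that checks the edge cases and scans a coarse grid of inputs. The function name and the grid spacing are assumptions made for illustration.

```python
# Product-style mix, Z = (A + B) - A*B, on values assumed to lie in 0..1.
# The name product_mix and the 0.1 grid step are hypothetical.

def product_mix(a, b):
    """Mix two 0..1 values; algebraically equal to 1 - (1 - a) * (1 - b)."""
    return a + b - a * b


# With one channel at 0, the other passes through unchanged.
only_b = product_mix(0.0, 0.3)
only_a = product_mix(0.3, 0.0)

# Two half-scale inputs give 0.75 rather than 1.
half_half = product_mix(0.5, 0.5)

# Scan a coarse grid of inputs and record the largest output seen.
grid = [i / 10 for i in range(11)]
worst = max(product_mix(a, b) for a in grid for b in grid)

print(only_b, only_a, half_half, worst)
```

The largest output on the grid occurs at A = B = 1, where the result is exactly 1.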
@max and @david: I tend to agree with David on this point. IMO, no decent sound engineer will use this algorithm as you describe it; it is in fact an instantaneous (sample by sample) way of "riding the gain control" which is something only a novice would do in the analog-only days! The only redeeming aspect of the A*B scaling is it WILL eliminate clipping caused by exceeding the 12-bit range, but at the cost of reducing the dynamic range of the output. You also need to remember that the sample values represent a BIPOLAR signal (logically equivalent to an 11-bit amplitude and a sign bit) with a "zero" value of half-scale. If you want to examine this further I'd suggest you construct a spreadsheet using two NON-SYNCHRONOUS sine wave signals and see what happens. Have columns for both the straight summation and for the A*B scaling algorithm. The difference between the two will vary quite a bit, and represent distortion compared to the reality of the original sounds mixing in the analog/audio listening environment.
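For anyone without a spreadsheet handy, here is a rough Python sketch of that experiment. Everything specific in it is an assumption made for illustration: the 8 kHz sample rate, the 440 Hz and 587 Hz tones, the 0..1 scale with "zero" at 0.5, and the reading of "straight summation" as adding the two deviations from mid-scale and re-centering on 0.5.

```python
import math

SAMPLE_RATE = 8000               # Hz, hypothetical
FREQ_A, FREQ_B = 440.0, 587.0    # Hz, hypothetical non-synchronous tones


def unipolar_sine(freq, n, amplitude=0.25):
    """Sample n of a sine on a 0..1 scale whose 'zero' sits at half-scale (0.5)."""
    return 0.5 + amplitude * math.sin(2 * math.pi * freq * n / SAMPLE_RATE)


def straight_sum(a, b):
    """Add the two deviations from mid-scale, then re-center on 0.5 (assumed reading)."""
    return a + b - 0.5


def product_mix(a, b):
    """The (A + B) - A*B formula applied sample by sample."""
    return a + b - a * b


rows = []
for n in range(16):
    a = unipolar_sine(FREQ_A, n)
    b = unipolar_sine(FREQ_B, n)
    rows.append((n, a, b, straight_sum(a, b), product_mix(a, b)))

print(" n      A      B    sum   prod   diff")
for n, a, b, s, p in rows:
    print(f"{n:2d}  {a:.3f}  {b:.3f}  {s:.3f}  {p:.3f}  {p - s:+.3f}")
```

Under these assumptions the "diff" column works out to 0.5 - A*B, so it moves with the signals themselves. Note also that two silent inputs (A = B = 0.5) give 0.75 from the product formula rather than the mid-scale 0.5, which is the bipolar point made above.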
@David: I gave this some thought (difficult in the post-Xmas mental fog :-) and thought that this would involve some compression. I plugged a couple of numbers into Excel and this seems to be the case: (Total = (A + B) - A*B)
This is why I love you like the brother I never had (of course I do have a brother -- you're the one I don't have :-)
I can't believe I didn't perform this test myself -- I just skimmed Viktor Toth's article -- agreed that (A + B) / 2 didn't work -- agreed that (A + B) - A*B would work for A = 1 and B = 0, and vice versa -- and carried on trundling merrily along.
I think Viktor has the right idea as a starting point, but maybe we need to develop his algorithm -- I need to ponder this some more -- watch this space...
Interesting stuff Max, as was Viktor Toth's article.
> "Our knee-jerk reaction might be simply to add them together and divide the result by two; that is, Z = (A + B)/2. However, a moment's thought reveals that this is valid only in the case where both A and B are at the maximum 1 value. Suppose that A is at 1 while B is at its minimum 0. If we use our original equation, we end up halving the value of A, which is not what we want. A much better approach requiring minimal computational overhead is to use the formula Z = (A + B) - A*B, which allows the contributing signals to be heard clearly without distortion or perceived loss of volume."
I gave this some thought (difficult in the post-Xmas mental fog :-) and thought that this would involve some compression. I plugged a couple of numbers into Excel and this seems to be the case: (Total = (A + B) - A*B)
For instance when you mix two streams with an amplitude of 0.5 then you get a result of 0.75, not 1. I presume you are applying this formula to individual samples? Then surely this WILL result in distortion? When A = 1 and B = 0, you will get the result of 1 (ie A is not attenuated at all), but when both A=B=0.5, the result is 0.75, which implies that A and B are each 0.375, which implies that they've both been attenuated by a factor of 3/4 (0.75) - so if you plotted the waveforms of A and B individually after this process, they would be distorted?
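Here is the same arithmetic as a small Python sketch rather than a spreadsheet. The "effective gain" (the mixed output divided by the plain sum A + B) is a measure introduced here for illustration, as are the function names and the chosen levels.

```python
# Effective gain of the (A + B) - A*B mix relative to a plain sum,
# for inputs assumed to lie in 0..1. Names and levels are hypothetical.

def product_mix(a, b):
    return a + b - a * b


def effective_gain(a, b):
    """Ratio of the mixed output to the plain sum a + b (1.0 if both are 0)."""
    total = a + b
    return product_mix(a, b) / total if total else 1.0


# Equal levels on both channels: the gain falls as the level rises.
for level in (0.1, 0.25, 0.5, 0.75, 1.0):
    z = product_mix(level, level)
    g = effective_gain(level, level)
    print(f"A = B = {level:.2f}  Z = {z:.4f}  gain = {g:.4f}")

# A single channel at full scale is not attenuated at all.
print(effective_gain(1.0, 0.0))
```

For A = B = x the gain simplifies to 1 - x/2, so it depends on the instantaneous level. That is the compression-like behaviour suspected above: a gain that changes from sample to sample rather than a fixed attenuation.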
I'd love to have pursued this further and got Excel to plot some results to prove my contention, but alas the salt mines are beckoning :-(( I have a feeling I am missing something here, can you tell me what?