
Psychoacoustic modeled compression and sound reproduction

  1. Dec 18, 2003 #1
    My question regards some of the audio compression codecs and the effects they have on physical audio reproduction and power consumption.

    Specifically, I am referring to the well-known "lossy" codecs such as MP3, Ogg Vorbis, and Musepack. These codecs use algorithms designed around a psychoacoustic model to selectively remove portions of the audio signal that are not perceivable to the listener. They also include tricks that fool the listener into hearing things that don't really exist as explicit components of the compressed signal. In the end, the amount of information in the signal is reduced dramatically.

    OK, to the point.

    My questions are:

    1) How does this affect the power consumed in amplifying the final output signal, which is indistinguishable from the original to the listener but in reality has much less information/signal content? Less signal, less power?

    2) How does this affect driver/speaker performance? Most speakers are much less accurate when driven by many frequencies simultaneously than when driven by fewer. Less information to reproduce = more accuracy?

    I'm thinking that if these two questions have answers that favor the information removal, more work can theoretically be focused on the most important job: reproducing the audio signals that matter most and removing the stagnant information that is being reproduced for no reason. ...Or does the codec add its own version of useless garbage? Could "lossy" compression actually improve real-world performance?

    (Oh yeah, I'm not worried about artifacts; assume the compression is transparent to the listener.)

    Sorry for the mess.
    Thanks in advance.
    Last edited: Dec 18, 2003
  3. Dec 18, 2003 #2
    I'm thinking about modeling Question 1 using circuit simulation. The first step is to carefully select a short sample PCM stream and encode it using the Musepack encoder while keeping the original as a benchmark. The next is to dump the .mpc file back to PCM using foobar2000. I think this is exactly what happens in virtually all hardware and software decoders before the D/A conversion and final output. Foobar has the best decoding routines and the ability to add modules to the DSP stack. The hard part will be running the D/A conversion and amplification simulation to get an analog waveform and to figure out the differences in the power used and any other interesting things I might find. I have seen many analytical comparisons of the analog waveforms produced by compressed and uncompressed outputs, but none of them consider the aftermath once the signal is sent through the many remaining analog/mechanical processes.
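
Before any circuit simulation, the two PCM streams could be compared directly. The following is only a minimal sketch, assuming an ideal amplifier and a purely resistive load, so that average power is proportional to the mean square of the samples. The two signals are synthetic stand-ins (a loud 1 kHz tone plus a much quieter 15 kHz tone, and the same signal with the quiet tone dropped); they are not real Musepack output, and the function names are hypothetical.

```python
import math


def mean_square(samples):
    """Mean-square value of a PCM block (proportional to average power
    delivered into an ideal resistive load)."""
    return sum(s * s for s in samples) / len(samples)


def power_ratio_db(original, decoded):
    """Average power of `decoded` relative to `original`, in dB."""
    return 10.0 * math.log10(mean_square(decoded) / mean_square(original))


RATE = 44100  # samples per second
N = 4410      # 0.1 s, a whole number of cycles of both tones

# Assumed stand-in for the original: loud 1 kHz tone plus a tone 40 dB quieter.
original = [
    math.sin(2 * math.pi * 1000 * n / RATE)
    + 0.01 * math.sin(2 * math.pi * 15000 * n / RATE)
    for n in range(N)
]

# Assumed stand-in for the decoded stream: the quiet tone has been removed.
decoded = [math.sin(2 * math.pi * 1000 * n / RATE) for n in range(N)]

ratio_db = power_ratio_db(original, decoded)
print(f"decoded vs. original power: {ratio_db:.6f} dB")
```

With real data, `original` and `decoded` would be the sample lists read from the benchmark PCM file and from the foobar2000 dump, time-aligned and of equal length. In this toy case the removed component is so far below the loud one that the power difference is a tiny fraction of a dB.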

    I have never done this type of simulation specifically.

    What is the best simulation program to use for mixed-signal designs? I know Cadence will do it, but I have little experience with it. I am only familiar with Mathematica, PSpice (my experience is limited to analog), and Mentor Graphics ModelSim (I did logic systems only).
    Last edited: Dec 19, 2003
  4. Dec 19, 2003 #3
    1) Power consumption obviously depends on the spectral density of the signal. So of course, less signal, less power. But why is power consumption important to you? Also notice that to play psychoacoustic tricks on the human mind, decoders "invent" signals that are not in the compressed source. Thus the signal going into the amps can be spectrally even denser than the original.

    2) Speakers have inertia. Complex signals can demand motion that exceeds the mechanical abilities of the speakers. But that's only one reason for inaccuracy. The rest is the zillions of speaker resonances and the complex impedance that the amps are facing.

    In addition, did you know that what you hear (air pressure changes) and the mechanical movement of speaker membranes are not the same kind of thing? To keep air pressure at some level, speaker membranes must accelerate. Since their range of motion is constrained, that sets some limits. To get sudden changes in air pressure, an even higher order of acceleration is needed.
    The air pressure depends on room acoustics, resonances, the input signal to the speakers, etc., etc., etc. It's such a swamp that you'd want to step into it only if you are a hardcore audiophile.
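
The link between acceleration and the limited range of motion can be put into rough numbers. This is a minimal sketch under idealized assumptions that are not in the post: the cone is a rigid piston moving sinusoidally, so its peak acceleration is (2*pi*f)^2 times its peak excursion, and the same peak acceleration is wanted at every frequency. The 1 mm and 100 Hz figures are purely illustrative, and the function names are hypothetical.

```python
import math


def peak_acceleration(peak_excursion_m, freq_hz):
    """Peak acceleration (m/s^2) of sinusoidal motion: a = (2*pi*f)^2 * x."""
    return (2.0 * math.pi * freq_hz) ** 2 * peak_excursion_m


def peak_excursion(peak_accel, freq_hz):
    """Peak excursion (m) needed to reach `peak_accel` at `freq_hz`."""
    return peak_accel / (2.0 * math.pi * freq_hz) ** 2


# Illustrative reference point: 1 mm peak excursion at 100 Hz.
target = peak_acceleration(0.001, 100.0)

# Excursion needed to reach the same acceleration at other frequencies.
for f in (25.0, 50.0, 100.0, 200.0):
    x_mm = 1000.0 * peak_excursion(target, f)
    print(f"{f:6.1f} Hz: {x_mm:7.3f} mm peak excursion")
```

Under these assumptions, halving the frequency quadruples the required excursion, which is one way to see why a finite range of motion limits what a driver can do at low frequencies.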

    If you think that reducing the spectrum sent to the speakers helps a lot, then you are mistaken. It takes roughly 5 frequencies in a select combination to clip any speaker. We can pray that such "bad" combinations never happen, or we can modify the material, forgetting about accuracy.

    This is completely bizarre. Hi-Fi stands for high fidelity, meaning high accuracy: not that of a crippled, half-lost signal, but that of the complete original source, with every single nuance. Taken to the limit, your idea means that perfect accuracy is a 50 Hz signal from the mains.

    Lossy compression can NEVER improve accuracy of real world performance, period.
    Compression is never transparent to the listener. It's only a question of which listener notices which difference, or even suspects that there might be a difference, if he doesn't have an immediate reference for comparison. Without having heard an artist in real life, we don't really have any accurate idea of how it should sound.

    Regarding your simulation idea, hmm, it looks like the start of a 10-15 year project. Your starting idea of comparing against a reference wave is completely flawed, but heck, everybody starts from there. Not sure why you'd need circuit simulation, though. Seems to me you'd need a spectrum analyzer.
  5. Dec 20, 2003 #4
    There are so many deep errors and misconceptions in your post, wimms, that I don't have the time or energy to address them quote by quote.

    I did not ask for a K-12 level (and also inaccurate) review of the electromechanical dynamics of drivers.

    I did not ask you how you feel about the lossy compression revolution in audio and video (or the argument in favor of $2000 liquid-cooled speaker cables which so often accompanies your type of stance against modern media compression).

    I did not ask you what to use to compare generated waveforms (do you really think that I have not seen a spectrum analyzer used to compare the decoded analog signal?).

    Selected Quotes:

    "15 years" - I'll let this speak for itself.

    "crippled half-lost signal" - I'll put you to the listening test with Musepack. (Even with a modest VBR of 180, you had better not have wagered money on it.)

    One more: "Clip the speaker" <- refer to "15 years". The term clipping is used to describe output stage saturation within an amplifier. Please don't tell me you mean over-excursion.

    Answer my questions or ignore them.
    I did not ask for sophistry or hubris.

    With respect
    Last edited: Dec 20, 2003
  6. Dec 20, 2003 #5
    Honestly, you've left the impression of a 13-year-old. Good luck.
  7. Dec 20, 2003 #6
    That was the reply I was looking for from you.
    Last edited: Dec 20, 2003
  8. Dec 22, 2003 #7



    Staff: Mentor

    Keep it civil, Achy.