More on Distributing High Quality Audio

  • Thread starter bhobba
AI Thread Summary
High-quality audio recordings are increasingly made using one-bit DSD, but mastering is often done in DXD format, which offers greater flexibility with a sampling rate of 352.8 kHz at 24 bits. While CD quality at 44.1/16 bits is generally sufficient for distribution, many audiophiles argue that higher resolutions like 88.2/16 or even DXD are preferable. Techniques such as lossyWAV enhance FLAC compression by effectively managing ultrasonic noise, which can exceed 16 bits in DXD files. Upsampling methods can help maintain audio quality while reducing noise. Recent advancements in compression technology, such as Microsoft's frequency domain compression, show promise for better performance than traditional methods like FLAC. The evolution of DAC technology, including noise-shaping and the use of FPGA chips, is paving the way for improved audio systems, indicating a dynamic future for audio recording and playback.
A while ago I did some posts on the modern way audio is recorded, mastered and distributed.

I have been investigating this further and am writing this post on what I found.

These days, high-quality recordings are often made in one-bit DSD, which you can look into (a link I provide later has details). However, DSD is hard to use when creating masters. So, a format called DXD was devised, in which audio engineers have an overkill amount of leeway in creating masters (352.8/24, i.e. 352.8 kHz sampling at 24 bits). Some high-quality producers release their recordings in DXD. I have one, and it sounds glorious. However, have a look at:

What About DXD? Surprise!

CD quality 44.1/16 is good enough if 16 bits are used. The full DXD is certainly not needed; it is all noise above 50 kHz. Knowing this, some DAC manufacturers have a 50 kHz filter in their DACs.

However, is 16 bits enough? To answer that, we need to look into dithering:

24/192 Music Downloads

So, for distribution, 16 bits are more than good enough. 88.2/16 is likely all that is ever needed; even audiophile nuts do not need 24 bits. Most of the time, 44.1 sampling is enough.
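To make the dithering point concrete, here is a minimal Python sketch of reducing 24-bit samples to 16 bits. It is not taken from the linked article; the function name `to_16_bit` and the choice of triangular (TPDF) dither spanning one 16-bit LSB either side are assumptions for illustration only.

```python
import random

def to_16_bit(samples_24, rng=None):
    """Requantize signed 24-bit integer samples to signed 16-bit with TPDF dither."""
    rng = rng or random.Random(0)  # fixed seed so the sketch is repeatable
    step = 1 << 8  # one 16-bit LSB expressed in 24-bit units
    out = []
    for s in samples_24:
        # The sum of two uniform values has a triangular distribution (+/- 1 LSB).
        dither = (rng.random() - 0.5 + rng.random() - 0.5) * step
        q = round((s + dither) / step)
        # Clamp to the 16-bit range.
        out.append(max(-32768, min(32767, q)))
    return out

# A constant a quarter of a 16-bit LSB high would simply vanish without dither.
quiet = to_16_bit([64] * 20000)
print(sum(quiet) / len(quiet))
```

Without the dither term, every output sample here would be 0 and the quarter-LSB detail would be lost. With it, the detail survives as the long-run average of a slightly noisy signal, which is why 16 bits can carry more than the raw bit count suggests.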

There is a sneaky way to process the DXD file so that only the audio, not the noise, is distributed. It is called lossyWAV:

lossyWAV - Hydrogenaudio Knowledgebase

It is a form of adaptive dither that allows FLAC compression to operate much more effectively.

There is the issue with the ultrasonic noise in DXD being larger than 16 bits, so it is not removed when truncated to 16 bits. The methods of the following article can fix this, as well as explain DSD:

Fundamental Principles Behind the Sigma-Delta ADC Topology: Part 1

It is then easy for a program to determine the minimum sampling rate necessary to prevent aliasing (distortion that occurs if there is any content above half the sampling frequency) and decimate the recording to that minimum (that is, just throw away unnecessary samples while keeping the sampling rate above twice the maximum frequency). It will usually be just 44.1 sampling, but a higher sampling rate may occasionally be required.
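A rough Python sketch of that selection and decimation step follows. It assumes the highest frequency with real content has already been measured (for example from a spectrum) and that the ultrasonic noise has already been removed. The names `COMMON_RATES_HZ`, `minimum_rate` and `decimate` are hypothetical, not from any of the linked tools.

```python
# Candidate rates in the 44.1 kHz family, all of which divide the DXD rate evenly.
COMMON_RATES_HZ = [44_100, 88_200, 176_400, 352_800]

def minimum_rate(highest_freq_hz, source_rate_hz=352_800):
    """Lowest candidate rate that is above twice the highest frequency present."""
    for rate in COMMON_RATES_HZ:
        if rate > 2 * highest_freq_hz and source_rate_hz % rate == 0:
            return rate
    return source_rate_hz  # nothing lower is safe, so keep the source rate

def decimate(samples, source_rate_hz, target_rate_hz):
    """Keep every k-th sample. Only valid if nothing remains above target_rate_hz / 2."""
    if source_rate_hz % target_rate_hz != 0:
        raise ValueError("target rate must divide the source rate exactly")
    factor = source_rate_hz // target_rate_hz
    return samples[::factor]

print(minimum_rate(20_000))                          # content up to 20 kHz
print(minimum_rate(30_000))                          # content up to 30 kHz
print(decimate(list(range(16)), 352_800, 44_100))    # keeps every 8th sample
```

The decimation itself is just a slice because the rates are integer multiples of each other; all the real work is in making sure nothing is left above half the target rate beforehand.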

This can be done by upsampling to 10xDSD. A little math shows it is an exact multiple of all the common sampling frequencies. This decreases noise, plus decimation is trivial.
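The "little math" can be checked in a few lines of Python. This sketch assumes "10xDSD" means ten times the DSD64 rate, with DSD64 being 64 x 44.1 kHz; that reading and the variable names are assumptions.

```python
BASE_HZ = 44_100
DSD64_HZ = 64 * BASE_HZ        # 2,822,400 Hz
TEN_X_DSD_HZ = 10 * DSD64_HZ   # 28,224,000 Hz (assumed meaning of "10xDSD")

def ratio_table(rates_hz):
    """Map each PCM rate to how many 10xDSD samples fit into one PCM sample period."""
    return {rate: TEN_X_DSD_HZ / rate for rate in rates_hz}

rates = [44_100, 48_000, 88_200, 96_000, 176_400, 192_000, 352_800]
for rate, ratio in ratio_table(rates).items():
    print(f"{rate:>7} Hz -> {ratio}")
```

Every ratio printed is a whole number, so decimating to any of these rates means keeping one sample in every 640, 588, 320 and so on. Under this reading 384 kHz is the exception, since 28,224,000 / 384,000 = 73.5.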

Also, using lossyWAV and FLAC, files that are close in size to lossy audio files are all that is needed for very transparent audio.
 
bhobba said:
I have one, and it sounds glorious.
It is interesting that you can hear the difference. That tells me there is something lacking in the usual DVD format. I have always been skeptical of the claim that the DVD format is completely satisfactory. (although it is fine for my use and hearing ability)
ADDED: I should have said CD format. I don't know if DVD audio is the same as CD.
 
FactChecker said:
It is interesting that you can hear the difference. That tells me there is something lacking in the usual DVD format. I have always been skeptical of the claim that the DVD format is completely satisfactory. (although it is fine for my use and hearing ability)

I think you are correct.

The conjectured reason (from the article DXD - Surprise) is 'Maybe the reason is because the filtering needed at lower sample rates becomes unnecessary when you get to 352.8 or 384 kHz'

Some filtering must be used to create the DXD file from the DSD file, but it can be so gentle and is at such a high frequency that it is like no filter at all:
https://media.ifi-audio.com/wp-content/uploads/2020/02/iFi-audio-Tech-Note-The-GTO-Filter.pdf

Such a filter at 352.8, while still a filter, for all practical purposes does nothing in the audible range - it is just leaky and lets ultrasonic noise through (see the attached file for the 192k GTO filter response). It would be even better at doing nothing at 352.8.

The suggested method needs only this 'do nothing' filtering to reach lower sample rates: simply upsample to get the ultrasonic noise below 16 bits, then decimate.

The real issues with filters come at the DAC end. While decimation is simple, the upsampling filter to restore it to DXD sampling is more difficult. The upsampling filter in a product like HQ Player is one possibility.

As an aside, ideas like this are being looked into by companies like MQA (it is not MQA - just being researched by the new owners of MQA):

https://mqalabs.com/wp-content/uploads/2024/12/MQA-Labs-QRONO-White-Paper_updated.pdf

Thanks
Bill
 

Attachments

  • freq-192.png.668f717df7a0c9d3b2afc93a96c8e902.png
bhobba said:
A while ago I did some posts on the modern way audio is recorded, mastered and distributed.

As an example, the loudest recorded album I know of is Metallica's Death Magnetic. To my knowledge, it was a record-breaking recording in terms of decibel level.
 
Now here is something interesting. I used FLAC for compression after decimation from filtered DXD. Microsoft has, however, devised a compression method in the frequency domain, rather than transmitting the difference from a predictor like FLAC does:

https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/Malvar_DCC07.pdf

It has better compression performance than FLAC.
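For contrast, the predictor idea behind FLAC can be sketched in Python. This is a simplified illustration using one fixed second-order predictor. A real FLAC encoder chooses among several predictors and then entropy-codes the residual, and the function names here are hypothetical.

```python
import math

def encode_residual(samples):
    """Residuals of a fixed second-order predictor: prediction = 2*x[n-1] - x[n-2]."""
    residual = list(samples[:2])  # warm-up samples are stored verbatim
    for n in range(2, len(samples)):
        prediction = 2 * samples[n - 1] - samples[n - 2]
        residual.append(samples[n] - prediction)
    return residual

def decode_residual(residual):
    """Rebuild the original samples exactly by re-running the same predictor."""
    samples = list(residual[:2])
    for n in range(2, len(residual)):
        prediction = 2 * samples[n - 1] - samples[n - 2]
        samples.append(residual[n] + prediction)
    return samples

# A smooth test tone: integer samples of a sine wave with amplitude 10000.
signal = [round(10000 * math.sin(2 * math.pi * n / 100)) for n in range(200)]
residual = encode_residual(signal)
print(max(abs(s) for s in signal), max(abs(r) for r in residual[2:]))
```

The residuals are far smaller than the samples, so they need fewer bits, and decoding is exact because it repeats the same integer arithmetic. A frequency-domain codec instead codes transform coefficients, which is the difference described above.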

64 DSD is 2.8 MHz, 128 DSD is 5.6 MHz, and so on. These days, the latest is 1024 DSD, which is 45 MHz, and it is likely to go even higher in the future. The output is one-bit noise-shaped audio that, if passed through a filter like the GTO, easily reaches DXD accuracy with noise below 16 bits for 256 DSD and above.
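As a rough illustration of those rates and of what "one-bit noise-shaped" means, here is a Python sketch. It uses a first-order delta-sigma loop, which is far simpler than the high-order modulators real DSD equipment uses. The function names are hypothetical.

```python
def dsd_rate_hz(multiple):
    """DSD rates are multiples of the 44.1 kHz CD rate (64 DSD = 64 x 44.1 kHz)."""
    return multiple * 44_100

def delta_sigma_1bit(samples):
    """First-order delta-sigma modulator: input in [-1, 1], output +1.0 or -1.0."""
    integrator = 0.0
    feedback = 0.0
    bits = []
    for x in samples:
        # Accumulate the error between the input and the previous one-bit output.
        integrator += x - feedback
        feedback = 1.0 if integrator >= 0 else -1.0
        bits.append(feedback)
    return bits

for multiple in (64, 128, 256, 512, 1024):
    print(multiple, dsd_rate_hz(multiple) / 1e6, "MHz")

bits = delta_sigma_1bit([0.25] * 1000)
print(sum(bits) / len(bits))  # the one-bit stream averages out close to the input
```

Each output is only +1 or -1, yet a low-pass filter (here just an average) recovers the input level, because the loop pushes the quantization error up to high frequencies. The faster the one-bit stream runs, the further that noise sits from the audio band.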

Chop it off at 16 bits, and there are rarely any frequencies above the usual 22 kHz. Convert to lossyWAV and compress using Microsoft compression. It could be decoded at the DAC, but since it is in the frequency domain, padding it out with extra zero frequencies makes it easy to convert to some very high sample-rate PCM. Modern DAC chips (or FPGAs) easily convert it into one-bit DSD by noise shaping and upsampling. DACs like the Direct Stream DAC feed the noise-shaped DSD into a simple high-quality audio transformer. Chips are available from companies like ESS that do the conversion for the DAC designer who does not want to code their own FPGA chip. I have the Chord TT2 DAC, in which the designer, Rob Watts, uses FPGA chips (probably some output high-speed switching transistors as well) and managed to coax 18 W out of it to connect to speakers directly. A digital system from end to end.
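The zero-padding idea can be sketched in Python with a plain DFT. The DFT is only a stand-in for illustration and is not the transform the Microsoft codec uses. The sketch also assumes an odd block length to avoid special handling of the Nyquist bin, and the function names are hypothetical.

```python
import cmath
import math

def dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def upsample_by_zero_padding(samples, factor):
    """Upsample an odd-length block by adding zero-valued bins above the original band."""
    n = len(samples)
    if n % 2 == 0:
        raise ValueError("this sketch assumes an odd block length (no Nyquist bin)")
    spectrum = dft(samples)
    half = (n - 1) // 2
    m = n * factor
    padded = [0j] * m
    padded[:half + 1] = spectrum[:half + 1]   # DC and positive frequencies
    padded[m - half:] = spectrum[n - half:]   # negative frequencies
    # Scale by factor so amplitudes survive the longer inverse transform.
    return [factor * v.real for v in idft(padded)]

# Two cycles of a sine in a 15-sample block, upsampled 4x to 60 samples.
block = [math.sin(2 * math.pi * 2 * t / 15) for t in range(15)]
fine = upsample_by_zero_padding(block, 4)
print(len(fine))
```

Every fourth output sample matches an original sample, and the samples in between follow the same sine, so no separate interpolation filter is needed. Real music would be processed in overlapping blocks rather than one isolated block like this.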

There are interesting times ahead in audio.

Thanks
Bill
 