More on Distributing High Quality Audio

  • Thread starter: bhobba
SUMMARY

The discussion centers on the modern methods of recording, mastering, and distributing high-quality audio, particularly focusing on one-bit DSD and DXD formats. DXD, with a sampling rate of 352.8 kHz at 24 bits, provides audio engineers with extensive flexibility for mastering, while 16-bit audio is deemed sufficient for distribution, often at 44.1 kHz or 88.2 kHz. The lossyWAV technique enhances FLAC compression by effectively managing ultrasonic noise in DXD files. Additionally, advancements in DAC technology and compression methods, such as those developed by Microsoft, are shaping the future of high-fidelity audio.

PREREQUISITES
  • Understanding of DSD (Direct Stream Digital) audio format
  • Familiarity with DXD (Digital eXtreme Definition) audio specifications
  • Knowledge of audio dithering techniques
  • Basic principles of digital-to-analog conversion (DAC)
NEXT STEPS
  • Research the lossyWAV technique and its impact on audio compression
  • Explore the principles of Sigma-Delta ADC topology for audio processing
  • Learn about the capabilities and applications of DAC chips from manufacturers like ESS
  • Investigate advanced audio compression methods developed by Microsoft
USEFUL FOR

Audio engineers, music producers, audiophiles, and anyone involved in high-quality audio recording and distribution will benefit from this discussion.

A while ago I did some posts on the modern way audio is recorded, mastered and distributed.

I have been investigating this further and am writing this post on what I found.

These days, high-quality recordings are often made in one-bit DSD (a link I provide later has details). However, DSD is hard to work with when creating masters. So a format called DXD was devised that gives audio engineers an overkill amount of leeway in creating masters (352.8/24, i.e. 352.8 kHz sampling at 24 bits). Some high-quality producers release their recordings in DXD. I have one, and it sounds glorious. However, have a look at:

What About DXD? Surprise!

CD quality 44.1/16 is good enough if 16 bits are used. The full DXD bandwidth is certainly not needed; above 50 kHz it is all noise. Knowing this, some DAC manufacturers put a 50 kHz filter in their DACs.

However, is 16 bits enough? To answer that, we need to look into dithering:

24/192 Music Downloads

So, for distribution, 16 bits are more than good enough. 88.2/16 is likely all that is ever needed; even audiophile nuts do not need 24 bits. Most of the time, 44.1 sampling is enough.
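To make the dithering point concrete, here is a minimal sketch (my own illustration, not taken from the linked article). It assumes TPDF dither, a common choice, and a 1 kHz test tone whose peak is only 0.4 of one 16-bit step. The names `quantize` and `correlation` are made up for this example.

```python
import math
import random

def quantize(samples, dither=False, rng=None):
    """Round samples (in units of one 16-bit LSB) to integers, optionally with TPDF dither."""
    rng = rng or random.Random(0)
    out = []
    for s in samples:
        # TPDF dither: sum of two independent uniform values, spanning +/- 1 LSB.
        d = (rng.random() - 0.5) + (rng.random() - 0.5) if dither else 0.0
        out.append(round(s + d))
    return out

def correlation(a, b):
    """Projection of b onto a, divided by the energy of a (1.0 means a is fully present in b)."""
    energy = sum(x * x for x in a)
    return sum(x * y for x, y in zip(a, b)) / energy

n = 20000
# A 1 kHz tone at 44.1 kHz with a peak of 0.4 LSB: smaller than the 16-bit step.
tone = [0.4 * math.sin(2 * math.pi * 1000 * i / 44100) for i in range(n)]

plain = quantize(tone)
dithered = quantize(tone, dither=True)

print("tone recovered without dither:", correlation(tone, plain))
print("tone recovered with dither:   ", correlation(tone, dithered))
```

Without dither the tone rounds to silence. With dither it is still there, just sitting in a little noise, which is why 16 bits can carry detail below the last bit.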

There is a sneaky way to process the DXD file so that only the audio, not the noise, is distributed. It is called lossyWAV:

lossyWAV - Hydrogenaudio Knowledgebase

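It works like a form of adaptive dither: it lowers the bit depth block by block, only as far as the added noise should stay inaudible, which allows FLAC compression to operate much more effectively.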
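A rough sketch of the idea, under loud assumptions: the real lossyWAV decides how many bits to drop in each block from an analysis of the signal, whereas here the number is simply passed in. The functions `reduce_bits` and `wasted_bits` are hypothetical names for this illustration, not part of lossyWAV or FLAC.

```python
def reduce_bits(block, bits_to_remove):
    """Round each integer sample to the nearest multiple of 2**bits_to_remove."""
    step = 1 << bits_to_remove
    return [((s + step // 2) // step) * step for s in block]

def wasted_bits(block):
    """Number of trailing zero bits shared by every sample in the block."""
    combined = 0
    for s in block:
        combined |= s
    if combined == 0:
        return 0
    count = 0
    while combined & 1 == 0:
        combined >>= 1
        count += 1
    return count

block = [1003, -517, 2250, 31, -1999]
reduced = reduce_bits(block, 4)   # 4 is an arbitrary choice for the example

print(reduced)
print("shared zero bits before:", wasted_bits(block))
print("shared zero bits after: ", wasted_bits(reduced))
```

Every sample in the processed block ends in the same run of zero bits. A lossless coder that detects this only has to store the remaining bits, which is where the extra compression comes from.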

There is an issue: the ultrasonic noise in DXD is larger than the smallest 16-bit step, so it is not removed when the file is truncated to 16 bits. The methods of the following article can fix this, as well as explain DSD:

Fundamental Principles Behind the Sigma-Delta ADC Topology: Part 1

It is then easy for a program to determine the minimum sampling rate necessary to prevent aliasing (distortion that occurs if there is any content above half the sampling frequency) and decimate the recording to that minimum (that is, just throw away unnecessary samples while keeping the sampling rate above twice the maximum frequency). It will usually be just 44.1 sampling, but a higher sampling rate may occasionally be required.
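As a sketch of that step, assuming the highest frequency present in the recording has already been measured somehow (not shown here) and that the source is DXD at 352.8 kHz. `minimum_rate` and `decimate` are names invented for this example.

```python
COMMON_RATES_HZ = [44100, 48000, 88200, 96000, 176400, 192000, 352800]

def minimum_rate(highest_frequency_hz, source_rate_hz=352800):
    """Smallest common rate that divides the source rate evenly and exceeds twice the highest frequency."""
    for rate in sorted(COMMON_RATES_HZ):
        if source_rate_hz % rate == 0 and rate > 2 * highest_frequency_hz:
            return rate
    raise ValueError("no common rate is high enough for this content")

def decimate(samples, factor):
    """Keep every factor-th sample. Only safe if nothing remains above the new half-sampling frequency."""
    return samples[::factor]

rate = minimum_rate(20000)          # content up to 20 kHz
factor = 352800 // rate
print(rate, factor)
print(decimate(list(range(16)), factor))
```

Content up to 20 kHz fits in 44.1 kHz sampling (a factor of 8 from DXD), while content up to 30 kHz would need 88.2 kHz.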

This can be done by upsampling to 10xDSD. A little math shows it is an exact multiple of all the common sampling frequencies. This decreases noise, plus decimation is trivial.
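The "little math" can be checked in a few lines. This assumes "10xDSD" means ten times the 64 DSD rate (64 x 44.1 kHz), i.e. 28.224 MHz; that reading is my assumption. The list of rates and the function name `decimation_factors` are also just for illustration.

```python
BASE_DSD_HZ = 64 * 44_100         # 64 DSD: 2.8224 MHz
TEN_X_DSD_HZ = 10 * BASE_DSD_HZ   # assumed meaning of "10xDSD": 28.224 MHz

COMMON_PCM_RATES_HZ = (44_100, 48_000, 88_200, 96_000, 176_400, 192_000, 352_800, 384_000)

def decimation_factors(master_rate_hz=TEN_X_DSD_HZ):
    """Map each PCM rate to its integer decimation factor, or None if it does not divide evenly."""
    factors = {}
    for rate in COMMON_PCM_RATES_HZ:
        factor, remainder = divmod(master_rate_hz, rate)
        factors[rate] = factor if remainder == 0 else None
    return factors

for rate, factor in decimation_factors().items():
    print(f"{rate / 1000:6.1f} kHz -> {factor}")
```

Under that assumption the rate is an exact multiple of both the 44.1 and 48 kHz families up to 352.8 and 192 kHz. The one exception in this list is 384 kHz, which would need a factor of 73.5.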

Also, using lossyWAV and FLAC, files that are close in size to lossy audio files are all that is needed for very transparent audio.
 
bhobba said:
I have one, and it sounds glorious.
It is interesting that you can hear the difference. That tells me there is something lacking in the usual DVD format. I have always been skeptical of the claim that the DVD format is completely satisfactory. (although it is fine for my use and hearing ability)
ADDED: I should have said CD format. I don't know if DVD audio is the same as CD.
 
FactChecker said:
It is interesting that you can hear the difference. That tells me there is something lacking in the usual DVD format. I have always been skeptical of the claim that the DVD format is completely satisfactory. (although it is fine for my use and hearing ability)

I think you are correct.

The conjectured reason (from the article DXD - Surprise) is 'Maybe the reason is because the filtering needed at lower sample rates becomes unnecessary when you get to 352.8 or 384 kHz'

Some filtering must be used to create the DXD file from the DSD file, but it can be so gentle and is at such a high frequency that it is like no filter at all:
https://media.ifi-audio.com/wp-content/uploads/2020/02/iFi-audio-Tech-Note-The-GTO-Filter.pdf

Such a filter at 352.8, while still a filter, for all practical purposes does nothing in the audible range. It is just leaky and lets ultrasonic noise through (see the attached file for the 192k GTO filter response). It would be even better at doing nothing at 352.8.

The suggested method needs only this 'do nothing' filtering to reach lower sample rates: simply upsample to get the ultrasonic noise below 16 bits, then decimate.

The real issues with filters come at the DAC end. While decimation is simple, the upsampling filter to restore it to DXD sampling is more difficult. The upsampling filter in a product like HQ Player is one possibility.

As an aside, ideas like this are being looked into by companies like MQA (this is not the MQA format itself, just research by the new owners of MQA):

https://mqalabs.com/wp-content/uploads/2024/12/MQA-Labs-QRONO-White-Paper_updated.pdf

Thanks
Bill
 

Attachments

  • freq-192.png (11.3 KB)
bhobba said:
A while ago I did some posts on the modern way audio is recorded, mastered and distributed.

As an example, the loudest recorded album I know of is Metallica's Death Magnetic. To my knowledge it was a record-breaking recording in terms of decibel level.
 
Now here is something interesting. I used FLAC for compression after decimation from filtered DXD. Microsoft has, however, devised a compression method in the frequency domain, rather than transmitting the difference from a predictor like FLAC does:

https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/Malvar_DCC07.pdf

It has better compression performance than FLAC.
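For comparison, here is a minimal sketch of the predictor approach. It uses the order-2 fixed predictor, one of the simple predictors FLAC can choose from, and leaves out everything else (block splitting, the other predictors, entropy coding of the residual). The function names and the 440 Hz test tone are my own choices.

```python
import math

def fixed_order2_residual(samples):
    """Residual of the order-2 fixed predictor: e[n] = x[n] - 2*x[n-1] + x[n-2]."""
    return [samples[n] - 2 * samples[n - 1] + samples[n - 2] for n in range(2, len(samples))]

def reconstruct(first_two, residual):
    """Invert the predictor exactly from the two warm-up samples and the residual."""
    out = list(first_two)
    for e in residual:
        out.append(e + 2 * out[-1] - out[-2])
    return out

# A 440 Hz tone at 44.1 kHz, rounded to integers as a stand-in for real audio.
signal = [round(20000 * math.sin(2 * math.pi * 440 * n / 44100)) for n in range(1000)]
residual = fixed_order2_residual(signal)

print("largest sample:  ", max(abs(s) for s in signal))
print("largest residual:", max(abs(e) for e in residual))
print("lossless:", reconstruct(signal[:2], residual) == signal)
```

The residual is far smaller than the signal, so it takes fewer bits to store, and the decoder gets the original back exactly. The Microsoft method reaches the same goal by a different route, working on transform coefficients instead of a prediction error.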

64 DSD is 2.8 MHz, 128 DSD is 5.6 MHz, and so on. These days, the latest is 1024 DSD, which is 45 MHz, and it is likely to go even higher in the future. The output is one-bit noise-shaped audio that, if passed through a filter like the GTO, easily reaches DXD accuracy with noise below 16 bits for 256 DSD and above.
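To show what "one-bit noise-shaped audio" means, here is a first-order sigma-delta modulator sketch. Real DSD modulators are higher order and far more refined, and a plain average stands in for the low-pass filter here (it is not the GTO filter). The name `sigma_delta_1bit` is made up for this example.

```python
def sigma_delta_1bit(samples):
    """First-order sigma-delta modulator: inputs in [-1, 1] become a stream of +1/-1 values."""
    integrator = 0.0
    feedback = 0.0
    bits = []
    for x in samples:
        # Accumulate the error between the input and the previous one-bit output.
        integrator += x - feedback
        feedback = 1.0 if integrator >= 0 else -1.0
        bits.append(feedback)
    return bits

dc_level = 0.25
bits = sigma_delta_1bit([dc_level] * 4096)

print(bits[:16])
print("average of the bit stream:", sum(bits) / len(bits))
```

Each output is only +1 or -1, yet the average of the stream tracks the input level. The quantisation error is pushed up to high frequencies, which is why a gentle low-pass filter is enough to recover the audio.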

Chop it off at 16 bits, and there are rarely any frequencies above the usual 22 kHz. Convert to lossyWAV and compress using the Microsoft compression. It could be decoded at the DAC, but since it is in the frequency domain, padding it out with extra zero frequencies makes it easy to convert to PCM at some very high sampling rate. Modern DAC chips (or FPGAs) easily convert that into one-bit DSD by upsampling and noise shaping. DACs like the Direct Stream DAC feed the noise-shaped DSD into a simple high-quality audio transformer. Chips are available from companies like ESS that do the conversion for the DAC designer who does not want to program their own FPGA. I have the Chord TT2 DAC, in which the designer, Rob Watts, uses FPGA chips (probably with some high-speed output switching transistors as well); he managed to coax 18 W out of it, so it can connect to speakers directly. A digital system from end to end.
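The zero-padding trick can be sketched with an ordinary DFT. This is only a stand-in: it is not the transform used in the Microsoft paper, the slow textbook DFT is used to keep the example self-contained, and the input is assumed to have no content exactly at half the sampling frequency. All function names are my own.

```python
import cmath
import math

def dft(x):
    """Textbook discrete Fourier transform (slow, but fine for a short example)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)) for k in range(n)]

def idft(spectrum):
    """Inverse DFT, returning the real part of each output sample."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]

def upsample_by_zero_padding(x, factor):
    """Raise the sampling rate by an integer factor by inserting zero-valued high-frequency bins."""
    n = len(x)
    spectrum = dft(x)
    half = n // 2
    # Zeros go between the positive-frequency and negative-frequency halves.
    padded = spectrum[:half] + [0j] * (n * factor - n) + spectrum[half:]
    # Scale by the factor so the waveform keeps its original amplitude.
    return [factor * v for v in idft(padded)]

coarse = [math.sin(2 * math.pi * 3 * t / 16) for t in range(16)]   # 3 cycles in 16 samples
fine = upsample_by_zero_padding(coarse, 4)                         # same 3 cycles in 64 samples

print(len(coarse), len(fine))
print(max(abs(fine[4 * t] - coarse[t]) for t in range(16)))
```

The new samples fall on the same sine wave as the originals, so no separate interpolation filter has to be designed. That is the sense in which starting in the frequency domain makes the upsampling easy.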

There are interesting times ahead in audio.

Thanks
Bill
 