Why is 44100 Hz the standard sampling rate for compact discs?

  • Thread starter: Bassalisk
  • Tags: Rate, Sampling
SUMMARY

The standard sampling rate of 44.1 kHz for compact discs was primarily established due to the need for compatibility with existing video recording technology, specifically U-matic video tape systems. This rate allows for effective digital audio conversion while minimizing aliasing through the use of pre-sampling filters. The choice of 44.1 kHz also accommodates the upper limits of human hearing, which can extend beyond 20 kHz, ensuring a flat frequency response. Additionally, historical context suggests that the duration of classical recordings influenced this specification, as exemplified by Beethoven's 9th symphony.

PREREQUISITES
  • Understanding of the Nyquist theorem and its implications for sampling rates.
  • Familiarity with digital audio conversion processes, particularly PCM (Pulse Code Modulation).
  • Knowledge of aliasing and its effects on audio quality.
  • Basic concepts of video recording technology and its historical context in audio engineering.
NEXT STEPS
  • Research the Nyquist theorem and its application in audio sampling.
  • Explore the role of PCM adaptors in digital audio conversion.
  • Investigate the effects of aliasing and methods to mitigate it in audio processing.
  • Study the evolution of audio sampling rates in professional digital audio equipment.
USEFUL FOR

Audio engineers, sound designers, music producers, and anyone interested in the technical aspects of digital audio and its historical development.

Bassalisk
I know that, from the sampling theorem, you have to sample at a rate of at least twice the highest frequency you want to capture.

Human hearing extends to around 20 kHz, so 40 kHz would be enough. What are those extra 4100 Hz there for? Now don't get me wrong, I tried Google, but all the results were vague and only scratched the surface of the question.

Some of them said reduction in entropy, some of them mentioned pass bands, etc. Can anybody give me a full, straight answer? Don't be afraid to go technical on me, I am very curious about this...
 
A pre-sampling (Nyquist) filter needs to cut off unwanted high frequencies to avoid aliasing. But to have a fairly flat response up to where you can hear, a practical filter needs a band in which to 'roll off', and this requires an extra gap between your maximum programme frequency and half the sample frequency. That takes you to, say, 44 kHz.

I believe the actual choice of 44100 Hz came from the fact that early recording of digital signals had to be done on existing video recording equipment. A colour TV signal is a very complex thing, and analogue recording of video 'only just works' on VHS (many people would say that it doesn't really work, and I could agree). In order to use analogue TV recording, the digital signal had to fit in with the existing standards, so 44100 Hz was high enough to suit the digital sound system and gave a bit rate at which the very complex video circuitry could work.
 
This frequency is used when preparing an audio signal to be digitized onto a CD.

This is a quote from Wikipedia giving the historical reasons for its choice:

The exact sampling rate of 44.1 kHz was inherited from a method of converting digital audio into an analog video signal for storage on U-matic video tape, which was the most affordable way to transfer data from the recording studio to the CD manufacturer at the time the CD specification was being developed. The device that converts an analog audio signal into PCM audio, which in turn is changed into an analog video signal is called a PCM adaptor. This technology could store six samples (three samples per stereo channel) in a single horizontal line. A standard NTSC video signal has 245 usable lines per field, and 59.94 fields/s, which works out to be 44,056 samples/s/stereo channel. Similarly, PAL has 294 lines and 50 fields, which gives 44,100 samples/s/stereo channel. This system could store 14-bit samples with some error correction, or 16-bit samples with almost no error correction.
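
The arithmetic in that quote can be checked in a few lines. This is only a sketch using the figures quoted above (usable lines per field, fields per second, three samples per channel per line); the function and variable names are made up for illustration.

```python
def samples_per_second(lines_per_field, fields_per_second, samples_per_line=3):
    """Samples per second per stereo channel stored by a PCM adaptor.

    samples_per_line is per channel (the quote gives six per line in total,
    i.e. three for each of the two stereo channels).
    """
    return lines_per_field * fields_per_second * samples_per_line


# Figures taken from the quoted text above.
ntsc_rate = samples_per_second(245, 59.94)  # NTSC: 245 usable lines, 59.94 fields/s
pal_rate = samples_per_second(294, 50)      # PAL: 294 usable lines, 50 fields/s

print(round(ntsc_rate))  # 44056
print(pal_rate)          # 44100
```

The PAL figure lands exactly on 44,100, while the NTSC figure comes out about 44 samples per second lower.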
 
One part of the answer is aliasing (again!). You can't design a practical filter that acts like a "brick wall" to block all frequencies above a particular value without some nasty side effects, like unwanted phase shifts that vary with frequency. It's much easier to design a filter with a steep slope that rolls off the amplitude over a range like 20 kHz to 22.05 kHz.
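
To put numbers on the roll-off range, here is a rough sketch of how much room each sample rate leaves between a 20 kHz audio band and half the sample rate. The 20 kHz passband edge is an assumption taken from the figures in this thread, and the function name is just illustrative.

```python
def transition_band_hz(sample_rate_hz, passband_edge_hz=20_000):
    """Width of the band available for the anti-aliasing filter to roll off.

    This is the gap between the top of the wanted audio band and the
    Nyquist frequency (half the sample rate).
    """
    nyquist_hz = sample_rate_hz / 2
    return nyquist_hz - passband_edge_hz


for rate in (40_000, 44_100, 48_000):
    print(rate, transition_band_hz(rate))
# 40000 0.0     -> would need an ideal brick-wall filter
# 44100 2050.0  -> the 20 kHz to 22.05 kHz range mentioned above
# 48000 4000.0
```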

There are two frequency "standards" in practical use, because DAT (digital audio tape) was based on 48 kHz sampling, and professional digital audio products (as compared with consumer products) still use 48 kHz or higher multiples of it (even as high as 384 kHz). For example, DVD-audio disks use a 96 kHz sample rate.

There is a story that the specification for audio CDs using 44.1 kHz was based on the requirement by Philips to issue their longest "popular" classical recording, Beethoven's 9th symphony conducted by Otto Klemperer (who was well known for slow tempos), on a single CD. That set the playing time at 74 minutes, and the 44.1 kHz sampling rate then followed from the available technology to manufacture and play the disks. That may be apocryphal, but Klemperer's recording of the 9th does play for just a few seconds short of 74 minutes.

The advantage of the higher sampling rates is better capture of transient sounds with ultrasonic components (e.g. percussion instruments like cymbals) and a better signal-to-noise ratio. If you can spread the "truncation noise" from rounding the results of digital processing to the nearest integer over the whole frequency range from 0 to 384 kHz, and tilt the frequency spectrum of the noise so most of it is at high frequencies, you can then throw most of the noise away when you resample at 48 kHz.
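
As a rough illustration of the "spreading" part only: if the rounding noise is assumed to be flat (white) across the whole band, the share of it that survives resampling is just the ratio of the two rates. The sketch below uses that simplifying assumption and does not model the "tilt" (noise shaping), which would push the gain higher. The function name is hypothetical.

```python
import math


def oversampling_gain_db(high_rate_hz, low_rate_hz):
    """SNR improvement in dB when flat noise is spread over the band of the
    higher rate and only the band of the lower rate is kept."""
    return 10 * math.log10(high_rate_hz / low_rate_hz)


gain = oversampling_gain_db(384_000, 48_000)
print(f"{gain:.2f} dB")  # 9.03 dB
```

So going from 48 kHz to 384 kHz and back buys roughly 9 dB under this assumption, about 3 dB for each doubling of the rate.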
 
http://www.snopes.com/music/media/cdlength.asp

The story gets better: a Sony executive's wife was fond of the 9th...

The adaptation of video recording equipment makes a lot of sense. It was also used briefly for computer mass storage...
 
Bassalisk said:
I know that, from the sampling theorem, you have to sample at a rate of at least twice the highest frequency you want to capture.

Human hearing extends to around 20 kHz, so 40 kHz would be enough. What are those extra 4100 Hz there for? Now don't get me wrong, I tried Google, but all the results were vague and only scratched the surface of the question.

Some of them said reduction in entropy, some of them mentioned pass bands, etc. Can anybody give me a full, straight answer? Don't be afraid to go technical on me, I am very curious about this...

The upper frequency cutoff of human hearing is not fixed from person to person, and it decreases with age in the same person. Young people can often hear sounds above 20 kHz. I think even some musical instruments have spectral components up to 22 kHz. According to the sampling theorem, that would require 44 kHz. The 0.1 kHz is added to avoid any aliasing at the boundary.
 
Dickfore said:
The upper frequency cutoff of human hearing is not fixed from person to person, and it varies (decreases) with age in the same person. Young people often can hear ultrasounds above 20 kHz. I think even some musical instruments have spectral components up to 22 kHz. According to the sampling theorem, that would be 44 kHz. The 0.1 kHz is added to avoid any aliasing at the boundary,

That's a bad explanation, because it's 'up to about 22 kHz', not 'up to exactly 22 kHz', and adding 100 Hz to 'about 44 kHz' changes pretty much nothing about aliasing.

The reason for such an odd frequency, as others have pointed out, is the storage of digital audio on video tapes, fitting N samples into a single horizontal scan of the video signal.
 
I would guess that the factorization 44100 = 2^2 × 3^2 × 5^2 × 7^2 is more than "just a coincidence". But since there isn't any particular reason to chop audio signals into pieces exactly 1 second long, it's not entirely obvious why that is a nice property for a sampling rate. FWIW, 48000 = 2^7 × 3 × 5^3 also has lots of small prime factors.
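
Both factorizations are easy to confirm by trial division. This is a minimal sketch (the helper name is made up), which also checks one consequence that may matter here: 44100 divides evenly by field rates of 50 and 60 per second.

```python
def prime_factors(n):
    """Return the prime factorization of n as a {prime: exponent} dict,
    found by simple trial division."""
    factors = {}
    divisor = 2
    while divisor * divisor <= n:
        while n % divisor == 0:
            factors[divisor] = factors.get(divisor, 0) + 1
            n //= divisor
        divisor += 1
    if n > 1:  # whatever is left over is itself prime
        factors[n] = factors.get(n, 0) + 1
    return factors


print(prime_factors(44100))  # {2: 2, 3: 2, 5: 2, 7: 2}
print(prime_factors(48000))  # {2: 7, 3: 1, 5: 3}

# Whole number of samples per field at 50 and 60 fields per second.
print(44100 // 50, 44100 % 50)  # 882 0
print(44100 // 60, 44100 % 60)  # 735 0
```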
 
Dickfore said:
I think even some musical instruments have spectral components up to 22 kHz.

There are many instruments that have spectral components way above 22 kHz. A reasonable cut-off point for recording is about 100 kHz, which is why professional digital audio uses sample rates of 192 kHz or even 384 kHz. http://www.cco.caltech.edu/~boyk/spectra/spectra.htm
 
  • #10
Well I did expect an explanation technical in nature. So in a nutshell Philips was the "guilty" one for the standards?

Nice to learn something new today. Still, thank you all, you gave me a lot to work with. I will research this out even more.
 
  • #11
"I would guess that the factorization 44100 = 2^2 × 3^2 × 5^2 × 7^2 is more than 'just a coincidence'."

Perhaps it is not a harmonic of the TV vertical or horizontal sweep frequency?

Like the NTSC color carrier: 3.579xxx MHz is not a multiple of 60...
 
  • #12
Bassalisk said:
Well I did expect an explanation technical in nature. So in a nutshell Philips was the "guilty" one for the standards?

Nice to learn something new today. Still, thank you all, you gave me a lot to work with. I will research this out even more.

Read the Wikipedia article about compact discs:
http://en.wikipedia.org/wiki/Compact_disks
About the 6th topic on the page is the quote I gave earlier, explaining exactly why 44.1 kHz is used.
 
