How does Nyquist's theory apply to digital recording of opera music?

amylase · Mar 4, 2012

Pretty sure I have some misconceptions, please correct me thanks.

SHORT VERSION
According to Nyquist's theory, if I want to digitally record a live opera and be able to reconstruct back to exactly the original sound quality, I have to sample at at least twice the frequency of the original signal.
(1) Am I right on this?
(2) But an opera has lots of different instruments and voices all at different frequencies. At twice of which frequency do we sample?

LONG VERSION
Nyquist's theory says if you want to sample something, and be able to reconstruct to get the original, then you have to sample at at least twice the frequency of the original signal.

For example assume there is a rather complex wave, which can be plotted on Y vs X axis, and its frequency is f.
Looking at the graph of this curve, it is made of a huge number of points on the X-Y plane. For example each X has a corresponding Y value.

Suppose I want to note down only some values of the curve, and hope I can reconstruct back to get the exact original complex wave from only the values I jotted down.

Imagine I walk along the x-axis and periodically I look at the Y value and I note it down.
How often do I have to record a Y value, in order for me to be able to later reconstruct the original complex wave exactly the way it is?
Ie. what should my minimum sampling rate be, in order to be able to reconstruct the original signal exactly, using my sampled values.

According to Nyquist's theory, if the original wave is of frequency f, then I have to sample at a frequency of at least 2f in order to be able to reconstruct back to original exact wave.

Am I right so far?

My second question is this: suppose that original complex wave is recording of an opera. It's going to have all sorts of instruments plus the singer. What do you call the frequency of this complex wave? It's going to be a wave complex as hell, what do you call its frequency? Do we break it down by Fourier Transformation and then deal with each component wave separately?

Thanks very much. Hope my two questions are clear. Firstly am I right about Nyquist's theory, and secondly what's the frequency of opera recording.

(Please note this is not home work. I know my questions might sound like home work but I assure you they are not. I don't get home work).

jtbell · Mar 4, 2012

amylase said:

(2) But an opera has lots of different instruments and voices all at different frequencies. At twice of which frequency do we sample?

In principle, you sample at twice the highest frequency that the input signal contains.

In practice, I suspect that most commercial recording of opera or other classical music nowadays is done at a sample rate of 96 kHz. This preserves frequencies up to 48 kHz, which is well above a human's hearing range (at least an octave). Then they use software to re-sample it downwards to 44.1 kHz for CD, or to 48 kHz for DVD or Blu-ray.

There are also places where you can buy and download 96 kHz files. To play them, you need an audio system with a digital-to-analog converter (DAC) that can handle them.

AlephZero · Mar 4, 2012

jtbell said:

In practice, I suspect that most commercial recording of opera or other classical music nowadays is done at a sample rate of 96 kHz.

96 kHz is a fairly low sample rate for pro recording. You can get relatively cheap ($200) soundcards for PCs that will handle 192 kHz. Pros tend to use 384 kHz.

This is not just a matter of exceeding the range of human hearing. It also means any digital noise introduced in the signal processing and mixing can be spread over the whole frequency range up to the Nyquist frequency, and most of it is then eliminated when the signal is resampled at a lower frequency.
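A rough way to put a number on this point is sketched below. It assumes the quantisation noise is white, i.e. spread evenly from 0 Hz up to the Nyquist frequency, and that no noise shaping is used; the function name and the 20 kHz audio band are illustrative choices, not something from the posts above. Under those assumptions, the noise left in the audio band after filtering and resampling drops by 10*log10 of the oversampling ratio, which is about 3 dB per doubling of the sample rate.

```python
import math

def inband_noise_reduction_db(sample_rate_hz, audio_band_hz=20_000.0):
    """Reduction (in dB) of quantisation noise inside the audio band,
    relative to sampling at exactly 2 * audio_band_hz.

    Assumption: the noise is white, spread evenly over 0 .. sample_rate_hz / 2,
    and everything above audio_band_hz is filtered out afterwards.
    """
    oversampling_ratio = sample_rate_hz / (2.0 * audio_band_hz)
    return 10.0 * math.log10(oversampling_ratio)

for fs in (48_000, 96_000, 192_000, 384_000):
    print(f"{fs / 1000:g} kHz: {inband_noise_reduction_db(fs):.1f} dB less in-band noise")
```

Real converters usually add noise shaping, which does better than this simple estimate, so treat the numbers as a lower bound on the benefit rather than a description of any particular product.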

jim hardy · Mar 4, 2012

There's got to be practical limits to that?

Sampling at just 2x a frequency might be fine for steady tone, but what if it's amplitude modulated or just short bursts, like the snare in Bolero?

I have to believe you could hear the difference between a 40khz sampled and a 192 khz sampled recording.
In other words, all digital recordings are not created equal.

But i am not an audio professional, so am asking not asserting.

Michael C · Mar 5, 2012

amylase said:

My second question is this: suppose that original complex wave is recording of an opera. It's going to have all sorts of instruments plus the singer. What do you call the frequency of this complex wave? It's going to be a wave complex as hell, what do you call its frequency? Do we break it down by Fourier Transformation and then deal with each component wave separately?

All you need to worry about is the highest frequency in the signal: if your sampling rate is OK for that frequency, it will be OK for lower frequencies. And since your recording is presumably destined for human ears (instead of those of bats or pygmy marmosets), you only need to worry about the highest frequencies that a human ear can hear. As jtbell wrote in the first response in this thread, 96 kHz will be ample.

AlephZero said:

96 kHz is a fairly low sample rate for pro recording. You can get relatively cheap ($200) soundcards for PCs that will handle 192 kHz. Pros tend to use 384 kHz.

Where do you get that information? A friend who runs a recording studio told me that most pros use 48 or 96 kHz. According to the Final Cut Pro user manual:

96 kHz:
A multiple of 48 kHz. This is becoming the professional standard for audio post-production and music recording.
192 kHz:
A multiple of 48 and 96 kHz, this is a very high-resolution sample rate used mostly for professional music recording and mastering.

f95toli · Mar 5, 2012

jim hardy said:

Sampling at just 2x a frequency might be fine for steady tone, but what if it's amplitude modulated or just short bursts, like the snare in Bolero?

It does not matter. ALL signals can be decomposed (using a Fourier transform) into single frequencies, and all you have to do is sample at 2x the highest frequency.
If you e.g. look at a square wave you can usually get away with just retaining frequencies 3x the rise time if you want to be able to re-construct the signal again (but you need to multiply that by 2 if you are sampling, because of the possibility of aliasing). The more frequency components you retain, the better the re-construction.
In reality there is no need to keep frequencies above about 20 kHz, since our ears can't hear those anyway (and will "smooth" any square wave anyway).
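To illustrate "the more frequency components you retain, the better the re-construction", here is a minimal sketch using the standard Fourier series of an ideal +/-1 square wave (odd harmonics with amplitudes 4/(pi*k)). The 1 kHz fundamental, the function names and the RMS error measure are illustrative assumptions, not anything specified in the post.

```python
import math

def square_partial_sum(t, fundamental_hz, max_harmonic):
    """Fourier partial sum of a +/-1 square wave, odd harmonics up to max_harmonic."""
    total = 0.0
    for k in range(1, max_harmonic + 1, 2):
        total += math.sin(2 * math.pi * k * fundamental_hz * t) / k
    return 4.0 / math.pi * total

def rms_error(fundamental_hz, max_harmonic, points=2000):
    """RMS difference between the partial sum and the ideal square wave over one period."""
    period = 1.0 / fundamental_hz
    acc = 0.0
    for i in range(points):
        t = (i + 0.5) * period / points          # midpoints avoid the jump instants
        ideal = 1.0 if t < period / 2 else -1.0
        acc += (square_partial_sum(t, fundamental_hz, max_harmonic) - ideal) ** 2
    return math.sqrt(acc / points)

for highest in (1, 3, 9, 19):
    print(f"harmonics up to {highest:2d}: RMS error {rms_error(1000.0, highest):.3f}")
```

The error shrinks as harmonics are added but never reaches zero for a finite number of them, which is the sense in which an ideal square wave has no highest frequency.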

amylase · Mar 5, 2012

f95toli said:

It does not matter. ALL signals can be decomposed (using a Fourier transform) into single frequencies, and all you have to do is sample at 2x the highest frequency.
If you e.g. look at a square wave you can usually get away with just retaining frequencies 3x the rise time if you want to be able to re-construct the signal again (but you need to multiply that by 2 if you are sampling, because of the possibility of aliasing). The more frequency components you retain, the better the re-construction.
In reality there is no need to keep frequencies above about 20 kHz, since our ears can't hear those anyway (and will "smooth" any square wave anyway).

Hey thanks guys for the replies.

@ f95toli: What do you mean "The more frequency components you retain, the better the re-construction"? Do you mean even if I sample at 2f, I still won't be able to reconstruct back to the exact original signal? I think this is where I have some misconception. I thought, if you sample at 2f, then you can reconstruct the original wave 100% ie. exactly the same as the original.
But you are saying, 3f will allow even better reconstruction, 4f even better, 5f yet better etc. Higher sampling rate, the closer your reconstructed wave is to the real original wave. Right?

My question now is: when I sample at 2f, and I use the sampled data to reconstruct. How close is my reconstructed wave to the real original wave? How do you even give a number to describe closeness of resemblance? And what would that number be like when I restore using data sampled at 2f sampling rate?

Thanks a lot. Hope my questions are clear. Basically asking, if I sample at 2f and reconstruct, how close is the reconstructed wave in comparison to the true original wave?

@ jim hardy: But even Bolero with progressive increase in overall amplitude, new instruments still keep coming in. You still will get more and more frequencies joining in. So I take it whatever instrument has the highest frequency, you'll need to sample at twice that frequency.

f95toli · Mar 5, 2012

amylase said:

Hey thanks guys for the replies.

@ f95toli: What do you mean "The more frequency components you retain, the better the re-construction"? Do you mean even if I sample at 2f, I still won't be able to reconstruct back to the exact original signal?

Depends on what you are asking. If you have a signal where the highest frequency component is f, and you sample at 2f then you can reproduce the original signal EXACTLY.

However, in order to re-construct for example a square wave you would need an infinite number of frequency components and there IS -theoretically- no highest frequency. But, this is only an issue for an "ideal" square wave, any real-world signal (including the sound from a musical instrument) will have a finite bandwidth. Moreover, your hearing can only pick up signals with a maximum frequency of about 20 kHz, so there is no need to record frequency components much higher than that (which is why 44.1 kHz is the CD standard: 2x20 kHz plus a few kHz to allow for the filtering).

AlephZero · Mar 5, 2012

Michael C said:

Where do you get that information? A friend who runs a recording studio told me that most pros use 48 or 96 kHz.

Google for "384Khz sound cards". If nobody uses them, nobody would be selling them.

But if your final delivery format is going to be low quality MP3s played on low quality hardware (and that is how large sections of the music industry make money) I entirely agree high sample rates are overkill.

AlephZero · Mar 5, 2012

f95toli said:

Moreover, your hearing can only pick up signals with a maximum frequency of about 20 kHz, so there is no need to record frequency components much higher than that (which is why 44.1 kHz is the CD standard: 2x20 kHz plus a few kHz to allow for the filtering).

It is true that human hearing can't "decode" continuous tones above about 20 kHz as identifiable "notes" with a definite "pitch". Whether humans can hear higher frequencies (in the sense that they produce a measurable stimulus in the brain) is a different question.

The CD standard for audio was based on what was technically feasible to implement 30 years ago. The DVD audio standard (which is more than 10 years old!) supports sampling up to 192 kHz.

amylase · Mar 5, 2012

f95toli said:

Depends on what you are asking. If you have a signal where the highest frequency component is f, and you sample at 2f then you can reproduce the original signal EXACTLY.

However, in order to re-construct for example a square wave you would need an infinite number of frequency components and there IS -theoretically- no highest frequency. But, this is only an issue for an "ideal" square wave, any real-world signal (including the sound from a musical instrument) will have a finite bandwidth.

Cool. Thank you very much (and everyone who replied!). So if I'm recording a natural, real world phenomenon, like a live concert, as long as I am sampling at >= 2f the concert's highest frequency f then I can reconstruct back to exact original. That's clear to me now.
However if the original is made of digital signals which are artificial, square waves, then intuitively it makes sense that no matter how many sine waves I use (as long as the number is finite), I won't be able to represent that artificial square wave exactly. Here comes my question though:

Suppose we have a digital signal that goes 1,0,1,0,1,0 and so on, and I want to represent it as a sum of sine waves. You are saying I will need an infinite number of sine waves to represent it exactly, right?

How about this? Wouldn't adding these two sine waves together give you that exact digital signal?

y1 = square root of (sin x)^2
y2 = 1 - [ square root of (sin x)^2 ]

In other words, y1 is only the positive parts of this graph y = sinx
y2 is only the positive part of y=sinx and that being taken away from 1

The computer can draw both curves very well. And they both come from perfect sine waves, just massaged a bit.

So now if I tell the computer to calculate Y = y1 + y2, that'll give you Y = 1 for x is 0 - pi/2.
Then for x = pi/2 to pi, Y is 0
Then for x = pi to 3pi / 2 , Y is 1 again.
So on and so forth you get your 1,0,1,0,1,0...

Isn't this reproducing a square wave exactly using sine waves?

sophiecentaur · Mar 5, 2012

I can't find any mention of quantising levels here. The benefit of high sample rate is to reduce quantisation noise. You can use two level sampling as long as your sample rate is high enough.

Khashishi · Mar 5, 2012

amylase said:

y1 = square root of (sin x)^2
y2 = 1 - [ square root of (sin x)^2 ]

These aren't linear combinations of sine waves, and they don't have a single frequency when expressed in Fourier series.

But, you don't have to use Fourier Transform if you don't want to. You could use a wavelet transformation instead. In particular, if you use the Haar wavelet (https://en.wikipedia.org/wiki/Haar_wavelet), I think the square waves will come out nicely.
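A small numerical check of the two formulas, taken exactly as written in the question, may help here. It is only a sketch: the helper `cosine_coefficient` is a hypothetical name for a plain midpoint-rule estimate of a Fourier cosine coefficient. Two things show up. First, y1 + y2 as written is 1 for every x (the two square-root terms cancel), so the sum is a constant rather than an alternating 1,0,1,0 pattern. Second, y1 = |sin x| on its own already contains a whole series of harmonics, which is the sense in which it is not a single sine wave.

```python
import math

def y1(x):
    return math.sqrt(math.sin(x) ** 2)          # same as |sin x|

def y2(x):
    return 1.0 - math.sqrt(math.sin(x) ** 2)    # 1 - |sin x|

def cosine_coefficient(func, harmonic, points=20000):
    """Midpoint-rule estimate of the Fourier cosine coefficient a_n
    of a 2*pi-periodic function (hypothetical helper for this sketch)."""
    step = 2 * math.pi / points
    total = 0.0
    for i in range(points):
        x = (i + 0.5) * step
        total += func(x) * math.cos(harmonic * x)
    return total * step / math.pi

sums = [y1(x) + y2(x) for x in (0.0, 0.7, math.pi / 2, 2.5, 4.0)]
harmonics = {n: cosine_coefficient(y1, n) for n in (2, 4, 6)}

print("y1 + y2 at a few points:", sums)
print("cosine coefficients of |sin x|:", harmonics)
```

The non-zero coefficients at harmonics 2, 4, 6 and so on agree with the known series |sin x| = 2/pi - (4/pi) * sum over k of cos(2kx)/(4k^2 - 1).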

sophiecentaur · Mar 6, 2012

sophiecentaur said:

I can't find any mention of quantising levels here. The benefit of high sample rate is to reduce quantisation noise. You can use two level sampling as long as your sample rate is high enough.

I'm sorry - AlephZero already brought up the topic of quantisation noise (but not by name) but no one picked up on it. QN is, I know, not true noise but a signal distortion but its properties make it appear and sound like noise to human perception.
The performance of any practical system is going to depend on both the number of quantising levels and the sample rate (that is, if your sampling is on the basis of a series of amplitude samples). However you choose to analyse your input signal mathematically, you won't get a different answer for the Nyquist rate for sampling.

People forget that Sampling and Quantisation are two different issues. Nyquist does not concern itself with limiting the precision of the samples: analogue samples are assumed in the basic theory. In practice, the samples will be digitised (quantised) and that brings in other considerations.

The answer to the OP is, I think, that Nyquist applies to all sampling operations in the same way. The Nyquist limit assumes perfect filtering of your input signal, with none of the signal spectrum being higher than f_s/2 so you are immediately into the realms of what is practicable. If you oversample then you can accommodate the characteristics of practical anti-aliasing filters. 'Gross' oversampling can allow you to use further filtering which can be a good engineering solution.

The "digital" part of the question then introduces Quantisation and other factors. The chosen solution will depend upon the available technology. Processing speed and distortion (linearity) were both at their limits for early ADCs. All sorts of combinations were tried.

These days, the big interest is in coding those original samples to achieve extreme bit rate reduction with no subjective impairment. Where did Hi Fi in the Home go? How many people actually listen to really good quality sound any more?

The choice of Opera as an example is a very good one in that it is one of the most complex of audio signals to deal with and has a very discerning audience. It sounds really rubbish if you try to squeeze it into a few kb/s. But that's all way down the line from the simple, basic requirement of Nyquist - which is always with us.

rcgldr · Mar 6, 2012

If sampling a monotone (constant frequency, constant amplitude) sound wave at exactly 2f, the amplitude information will be lost (except for the case where the samples fall on the peaks of the wave), so the sampling rate needs to be at least some tiny amount greater than 2f. There are also some issues with how digital-to-analog circuits work, and other practical limitations. Wiki article:

http://en.wikipedia.org/wiki/Nyquist_theorem#Practical_considerations

sophiecentaur · Mar 6, 2012

rcgldr said:

If sampling a monotone (constant frequency, constant amplitude) sound wave at exactly 2f, the amplitude information will be lost (except for the case where the samples fall on the peaks of the wave), so the sampling rate needs to be at least some tiny amount greater than 2f. There are also some issues with how digital-to-analog circuits work, and other practical limitations. Wiki article:

http://en.wikipedia.org/wiki/Nyquist_theorem#Practical_considerations

That frequency is the 'Limit' and is one that you can't actually get to. Sample at just a fraction of a Hz above this and the samples will strobe through the sine wave, revealing the full amplitude. There is Enough Information in the sample stream to enable you to reconstitute the signal BUT the spectrum of this sampled signal will contain Two Components - one just above and one just below f_s/2 and you would need to filter out the higher one if you wanted to see just the wanted one. This involves a filter with a very long delay time and an extremely sharp 'knee' (same as the Pre-filter, in fact).
No one said it would be easy to do, remember.

That's why they leave sufficient headroom for the practicalities of Nyquist Filtering.
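The "strobing" effect can be seen in a few lines. This is a minimal sketch with made-up numbers (a 1 kHz unit-amplitude sine, sampled once at exactly 2 kHz with zero phase and once at 2001 Hz); `sample_sine` is a hypothetical helper, not part of any library.

```python
import math

def sample_sine(tone_hz, sample_rate_hz, count, phase=0.0):
    """Return `count` samples of a unit-amplitude sine taken at sample_rate_hz."""
    return [math.sin(2 * math.pi * tone_hz * n / sample_rate_hz + phase)
            for n in range(count)]

tone = 1000.0
exactly_2f = sample_sine(tone, 2 * tone, 5000)           # every sample lands on a zero crossing
slightly_above = sample_sine(tone, 2 * tone + 1.0, 5000)  # 1 Hz above 2f

peak_exact = max(abs(s) for s in exactly_2f)
peak_above = max(abs(s) for s in slightly_above)

print("largest sample at exactly 2f:", peak_exact)
print("largest sample at 2f + 1 Hz :", peak_above)
```

At exactly 2f with this phase the samples are all (numerically) zero, so the amplitude is invisible. One hertz higher, the sample instants drift slowly across the waveform and eventually land very close to a peak, although it takes a long run of samples to see it, which matches the remark about the long delay of the filter.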

f95toli · Mar 6, 2012

AlephZero said:

Google for "384Khz sound cards". If nobody uses them, nobody would be selling them.

But if your final delivery format is going to be low quality MP3s played on low quality hardware (and that is how large sections of the music industry make money) I entirely agree high sample rates are overkill.

It is perhaps worth keeping in mind that sound cards are not only used for recording. Even though most studios record, mix and master music "in the box" (i.e. using computers), many still use a lot of outboard analogue equipment such as compressors, equalizers etc.
These are used as part of the (potentially very long) signal chain when for example mixing (which in all likelihood also involves some plugins etc.). Now, the "problem" with using external analogue equipment is of course that you need to go through a D/A - outboard - A/D chain every time. If you do this enough times you'll get signal degradation unless very high spec converters are used. Moreover, since we are not actually recording NEW sounds there is no "natural" limit to the bandwidth, which means it makes sense to use a high sample rate (with lots of bits; many plugins use 32 bits internally for the same reason).

Hence, whereas music is typically RECORDED at say 96 kHz/24 bit, higher sample rates (and more bits) are often used during mixing/mastering. Besides, using anything higher than 96 kHz for recording would be pointless since no microphone has a BW much higher than perhaps 30 kHz, and most roll off much earlier than that.

(caveat: My only sources of knowledge in this field come from reading SoundOnSound for many years, reading books and doing some hobby recordings using Pro Tools; I am in no way an expert).

sophiecentaur · Mar 6, 2012

This whole business is a massive mixture of theory and practice. If we're talking about the needs of high quality Opera recordings then the 'sampling end' of things is relevant - along with Nyquist matters. If we're squeezing all our 'easy listening' music onto an iPod then the MP3 coding will do enough damage to cover any ADC inadequacies.

Michael C · Mar 6, 2012

amylase said:

Cool. Thank you very much (and everyone who replied!). So if I'm recording a natural, real world phenomenon, like a live concert, as long as I am sampling at >= 2f the concert's highest frequency f then I can reconstruct back to exact original. That's clear to me now.

"Reconstruct back to exact original" is a nice idea, but what does it mean? You're talking about recording an opera. In fact, every person in the audience will hear a different version of the performance: depending on your position in the auditorium, you'll hear more of the singers, or the orchestra, or more from the left, or the right, with different proportions of direct and reverberated sound... A professional opera recording will use a large number of microphones: there'll be some on stage, some in the auditorium, some in the orchestra pit, some wireless ones for individual singers and maybe some backstage ones for an offstage chorus. The inputs from all these mikes will be mixed into an end product that will be different from any version that a single person in the audience heard.

We need to make a choice as to what we want to achieve. Here's a possible one: we try to reproduce what one particular person in the audience would have heard (maybe sitting at "the best seat in the house"). We use a dummy head setup with two microphones. Even the very best microphone doesn't have a flat response curve, and the frequency response usually drops sharply above 20 kHz. Most frequency response charts for professional microphones don't go any further than 20 kHz.

Now is a good time to ask the question: what use is sampling the sound at 192 kHz or 384 kHz? The input to the sound card is not the original sound: all the information the sound card is getting is coming from the output of the microphone. If the sound card registers information in frequencies above 100 kHz, it is due to artefacts created by the microphone and has no relation to the original sound.

Now that we have a digital version of the recorded sound, we need a way to get it into the ears of the hearer: an amplifier and loudspeakers or headphones. Our ideal is to get the air next to the listener's ears to vibrate in exactly the same way as the air vibrated around the dummy head as we were recording. Suffice it to say that no headphone or loudspeaker can create a pressure wave in the air that has the exact form of the one that is fed to it.

So "reconstruct to exact original" is a myth. We can make something that approaches the original. Maybe we can do this well enough that a blindfolded person sitting on a chair in a concert hall couldn't tell if the sound they are hearing is coming from a live orchestra or a recording of the same orchestra. That's what really matters: even if some instruments are producing sounds in the ultrasound range, nobody can hear them so they could add nothing to the quality of the recording.

To go back to the Nyquist-Shannon theory, this is what it states:

"If a function x(t) contains no frequencies higher than B hertz, it is completely determined by giving its ordinates at a series of points spaced 1/(2B) seconds apart."

Note that the wave is completely determined when sampled at twice the highest frequency. Sampling at a higher frequency simply doesn't add any more information (see http://www.lavryengineering.com/documents/Sampling_Theory.pdf, which goes into some detail about this). If the listeners cannot hear frequencies above 20 kHz, there's no sense in sampling at more than twice this frequency. 48 kHz is easily sufficient, not just for the "low quality mp3", but also for a high end opera recording.
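One way to make "completely determined" concrete is the Whittaker-Shannon interpolation formula, which rebuilds the waveform between the samples as a sum of shifted sinc functions. The sketch below is only an illustration under stated assumptions: the test signal (5 kHz plus 18 kHz, both below the 24 kHz Nyquist frequency of a 48 kHz sample rate) and the function names are invented for the example, and the infinite sum in the formula is truncated to 4000 samples, so the result is close to, but not exactly, the true value.

```python
import math

def sinc(x):
    """Normalised sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)

def reconstruct(samples, sample_rate_hz, t):
    """Whittaker-Shannon interpolation from a finite list of samples."""
    position = t * sample_rate_hz
    return sum(s * sinc(position - n) for n, s in enumerate(samples))

def signal(t):
    """Illustrative band-limited test signal: 5 kHz and 18 kHz components."""
    return math.sin(2 * math.pi * 5_000 * t) + 0.5 * math.cos(2 * math.pi * 18_000 * t)

sample_rate = 48_000.0
samples = [signal(n / sample_rate) for n in range(4000)]

t_mid = 2000.5 / sample_rate    # halfway between two samples, far from both ends
error = abs(reconstruct(samples, sample_rate, t_mid) - signal(t_mid))
print("true value   :", signal(t_mid))
print("reconstructed:", reconstruct(samples, sample_rate, t_mid))
print("error        :", error)
```

The remaining error comes from cutting off the sum, not from the sample rate: with more samples on either side of the evaluation point it keeps shrinking.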

jim hardy · Mar 6, 2012

Verrrry interesting, folks ...

Thanks to all.

Probably the ear wouldn't distinguish between a 20 khz sine vs square wave,

http://hyperphysics.phy-astr.gsu.edu/hbase/sound/place.html#c1

and if it won't, that's that in my book.
Further that's something i could try if i can find where i packed my old headphones three moves ago..

48K vs 192K could become another "Tubes vs Transistors" debate.

old jim

Delta Kilo · Mar 6, 2012

Michael C said:

Note that the wave is completely determined when sampled at twice the highest frequency. Sampling at a higher frequency simply doesn't add any more information (see http://www.lavryengineering.com/documents/Sampling_Theory.pdf, which goes into some detail about this). If the listeners cannot hear frequencies above 20 kHz, there's no sense in sampling at more than twice this frequency. 48 kHz is easily sufficient, not just for the "low quality mp3", but also for a high end opera recording.

True, but there are conditions to be met.
First, the source signal must not contain any frequencies above f/2 (half the sampling frequency f), otherwise you'll get aliasing. So you have to use a low-pass filter. Perfect low-pass aka brick-wall filters are in the same category as massless springs and absolutely rigid bodies, i.e. they do not exist. Real-world filters are a lot more sloppy. Plus this filter has to come before the ADC, in the analog domain, and these days everything analog is fiddly and expensive and everything digital is dirt cheap and abundant.
Second, you need to use sinc interpolation to re-create the original function, which is again fiddly and expensive etc etc. Raising sampling frequency is the easy way out.

As sophiecentaur said, this is a mixture of theory and practice. And marketing. Surely 192Khz sounds 4 times better than 48 :)
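The aliasing condition mentioned in this post can be checked directly. In this minimal sketch (the 48 kHz rate and the two tone frequencies are illustrative choices), a 30 kHz cosine, which lies above half the sampling rate, produces the same sample values as an 18 kHz cosine, so once sampled the two cannot be told apart.

```python
import math

def sample_cosine(tone_hz, sample_rate_hz, count):
    """Return `count` samples of a unit-amplitude cosine taken at sample_rate_hz."""
    return [math.cos(2 * math.pi * tone_hz * n / sample_rate_hz) for n in range(count)]

sample_rate = 48_000.0
ultrasonic = sample_cosine(30_000.0, sample_rate, 64)   # above f_s/2 = 24 kHz
audible = sample_cosine(18_000.0, sample_rate, 64)      # 48 kHz - 30 kHz

largest_difference = max(abs(a - b) for a, b in zip(ultrasonic, audible))
print("largest difference between the two sample streams:", largest_difference)
```

This is why the low-pass filter has to sit in front of the converter: after sampling there is nothing left to distinguish the ultrasonic tone from an audible one.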

sophiecentaur · Mar 6, 2012

Delta Kilo said:

Surely 192Khz sounds 4 times better than 48 :)

And we hear through our wallets!

rumleymusic · Mar 6, 2012

Goodness, a lot of interesting theories about how recording engineers do their jobs.

I just had to register to clear a few things up. I am not a physicist, but a classical recording engineer.

A few things:
1) Sampling rates are usually between 44.1 and 192kHz.
2) If there is a high channel count, 44.1 or 48 kHz is probably the most common.
3) DXD formats like 384kHz are not common and are used primarily for PCM to DSD format conversion, as most Digital Audio Workstations cannot work with DSD formats.
4) High quality converters are coveted for their clean analog circuitry and dynamic range, not the sample rate.
5) Most propaganda about the superiority of high sample rates is sales pitch and has no basis in reality. In truth, some inferior converters will sound better at 2x sample rates (88.2 and 96kHz) because of the stability of the clocking and performance of the converter, not because of the higher frequencies.
6) Forget any wave other than sine waves. Those are artificial waves that are not generated by any known acoustic instrument. We are talking about opera, right?
7) Any limitation of the Nyquist theorem is due to the need for analog circuitry in audio recording. 16bit audio works alright, but the theoretical limit of 144dB of dynamic range for 24-bit PCM recording has yet to be achieved due to analog limitations. Therefore a converter cannot output exactly what it records at any sample rate, even though the converter is trying to produce an identical waveform from the original (post decimation filter) analog signal.
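For reference, the 144 dB figure in point 7 follows from simple arithmetic: the ratio between full scale and one quantisation step of an N-bit word is 2^N, which in decibels is 20*log10(2^N), or roughly 6 dB per bit. The sketch below just evaluates that expression; the function name is an illustrative choice and the calculation describes an ideal quantiser, not any real converter.

```python
import math

def ideal_dynamic_range_db(bits):
    """Ratio of full scale to one quantisation step, in dB: 20 * log10(2 ** bits)."""
    return 20.0 * math.log10(2 ** bits)

for bits in (16, 24):
    print(f"{bits}-bit PCM: about {ideal_dynamic_range_db(bits):.1f} dB")
```

This gives roughly 96 dB for 16 bits and roughly 144 dB for 24 bits.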

rbj · Mar 6, 2012

AlephZero said:

96 kHz is a fairly low sample rate for pro recording... Pros tend to use 384 kHz.

that's news to me. i haven't heard of any audio recording system that uses 384 kHz sampling rate.

rbj · Mar 6, 2012

Michael C said:

To go back to the Nyquist-Shannon theory, this is what it states:

"If a function x(t) contains no frequencies higher than B hertz, it is completely determined by giving its ordinates at a series of points spaced 1/(2B) seconds apart."

Note that the wave is completely determined when sampled at twice the highest frequency.

strictly speaking, the sampling frequency must be slightly higher than 2B. you will have a phase or amplitude ambiguity if you have a sinusoid at exactly B Hz and you sample it at exactly 2B.

even though Nyquist-Shannon didn't say this back in 1945 or whatever year it was, it has since been commonly expressed in the literature that you must sample at a frequency greater than 2B, if B is the highest possible frequency.

rbj · Mar 6, 2012

f95toli said:

If you have a signal where the highest frequency component is f, and you sample at 2f then you can reproduce the original signal EXACTLY.

okay, for any [itex] -\pi/2 < \theta < +\pi/2 [/itex], the following sampled at 2f will result in the same samples of +1, -1, +1, -1...

[tex] x(t) = \frac{1}{\cos(\theta)} \cos(2 \pi f t + \theta ) [/tex]

[tex] x(nT) = (-1)^n [/tex]

[tex] T \equiv \frac{1}{2f} [/tex]

So how are you going to reconstruct the highest component if the samples don't give you any hint as to what [itex]\theta[/itex] is?
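The ambiguity described by these formulas is easy to confirm numerically. This is a minimal sketch: the 1 kHz value for f and the particular phases are arbitrary illustrative choices. Whatever the phase theta (and hence whatever the true amplitude 1/cos(theta)), the samples taken at T = 1/(2f) come out as +1, -1, +1, -1, ...

```python
import math

def x(t, f, theta):
    """x(t) = cos(2*pi*f*t + theta) / cos(theta), as in the post above."""
    return math.cos(2 * math.pi * f * t + theta) / math.cos(theta)

f = 1000.0             # illustrative signal frequency in Hz
T = 1.0 / (2.0 * f)    # sampling interval for a sample rate of exactly 2f

worst_deviation = 0.0
for theta in (-1.2, -0.5, 0.0, 0.5, 1.2):
    samples = [x(n * T, f, theta) for n in range(8)]
    for n, s in enumerate(samples):
        worst_deviation = max(worst_deviation, abs(s - (-1) ** n))
    print(f"theta = {theta:+.1f}, amplitude = {1 / math.cos(theta):.3f}, "
          f"samples = {[round(s, 6) for s in samples]}")

print("largest deviation from (-1)^n:", worst_deviation)
```

Signals with quite different amplitudes therefore give identical sample streams, which is why the sample rate has to be strictly greater than 2f.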

sophiecentaur · Mar 6, 2012

rbj said:

okay, for any [itex] -\pi/2 < \theta < +\pi/2 [/itex], the following sampled at 2f will result in the same samples of +1, -1, +1, -1...

[tex] x(t) = \frac{1}{\cos(\theta)} \cos(2 \pi f t + \theta ) [/tex]

[tex] x(nT) = (-1)^n [/tex]

[tex] T \equiv \frac{1}{2f} [/tex]

So how are you going to reconstruct the highest component if the samples don't give you any hint as to what [itex]\theta[/itex] is?

That's so obvious. I'm sure he's just forgotten the 'greater than' sign.

rbj · Mar 6, 2012

sophiecentaur said:

That's so obvious. I'm sure he's just forgotten the 'greater than' sign.

nyquist or shannon did too.

sophiecentaur · Mar 6, 2012

To be fair to them, they never built one which would have made it clear to them by producing a string of equal value pulses.
It's easy to be wise after the event ;-)

Naty1 · Mar 7, 2012

great discussion!

A practical insight into the above 'sampling' discussions can be found at this wiki article:

http://en.wikipedia.org/wiki/T-carrier

[T carrier is an older version of what is usually referred to as DS-1 today.]

[QUOTE]A more detailed understanding of how the rate of 1.544 Mbit/s was divided into channels is as follows. (This explanation glosses over T1 voice communications, and deals mainly with the numbers involved.) Given that the telephone system nominal voiceband (including guardband) is 4,000 Hz, the required digital sampling rate is 8,000 Hz (see Nyquist rate). Since each T1 frame contains 1 byte of voice data for each of the 24 channels, that system needs then 8,000 frames per second to maintain those 24 simultaneous voice channels. Because each frame of a T1 is 193 bits in length (24 channels × 8 bits per channel + 1 framing bit = 193 bits), 8,000 frames per second is multiplied by 193 bits to yield a transfer rate of 1.544 Mbit/s (8,000 × 193 = 1,544,000).[/QUOTE]
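The arithmetic in the quoted passage can be retraced step by step. The sketch below uses only the numbers given in the quote; the variable names are illustrative.

```python
VOICEBAND_HZ = 4_000        # nominal voiceband including guardband
CHANNELS = 24               # voice channels per T1 frame
BITS_PER_CHANNEL = 8        # one byte of voice data per channel per frame
FRAMING_BITS = 1            # one framing bit per frame

sampling_rate_hz = 2 * VOICEBAND_HZ                          # Nyquist rate: 8,000 samples/s
bits_per_frame = CHANNELS * BITS_PER_CHANNEL + FRAMING_BITS  # 193 bits
line_rate_bps = sampling_rate_hz * bits_per_frame            # 1,544,000 bit/s
per_channel_bps = sampling_rate_hz * BITS_PER_CHANNEL        # 64,000 bit/s per voice channel

print(sampling_rate_hz, bits_per_frame, line_rate_bps, per_channel_bps)
```

Each voice channel is sampled once per frame, so the frame rate equals the sampling rate and the per-channel rate works out to 64 kbit/s.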

A voice conversation, or music on a telephone line, gives you a bit of a subjective feel for what 'voice communications' sounds like when sampled at twice the nominal 4,000 Hz analog bandwidth the former AT&T allotted for voice telephone calls here in the US. [As I recall, that analog bandwidth was based on subjective tests of human hearing by Bell Labs and what was needed to communicate human emotion and nuance and tones in typical voice conversations.]

That digital sample rate resulted in subjectively 'higher quality' digital voice communications because background noise was noticeably reduced. One could tell early on if a conversation was on digital or analog facilities...

[These digital coding schemes required precise clock timing throughout the country via distribution of timing signals from the National Institute of Standards and Technology (NIST), formerly the National Bureau of Standards (NBS). That's not a problem in local audio systems. Loss or impairment of that timing signal was a cause for near 'panic' in testrooms around the country!]

But after reading the above posts I AM wondering if any error correction schemes are employed in modern digital audio systems?

[The digital scheme employed by AT&T utilized AMI which was replaced by B8ZS error correction schemes.]

And for those interested in some of the effects of analog filters,

http://en.wikipedia.org/wiki/Bandwidth_(signal_processing)

PhilDSP · Mar 9, 2012

Michael C said:

Now is a good time to ask the question: what use is sampling the sound at 192 kHz or 384 kHz?

One of the biggest problems with recreating a digitally recorded audio signal with extreme high fidelity is the low pass or anti-aliasing filter that has to be implemented to ensure that a very minimum amount of energy ever gets above the Nyquist frequency.

The low pass filter in the case of 44.1 kHz, or even 88.2 kHz and 96 kHz, needs to be very, very steep. That introduces some severe phase distortion that varies sharply with frequency. Golden-eared audiophiles with high quality analog equipment can definitely hear that, especially with sounds that are percussive such as triangles and cymbals. It's not just the individual sounds from each instrument or voice that matter but the spatial and timing relationships between them, and sharp filters do some strange things with those. (I have experience as both a physicist and professional recording and production engineer.) The higher sample rates reduce the requirements for the filter, and the result should be much reduced phase distortion in the higher audio frequencies.
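
A rough way to put numbers on "steep" is to look at how much room the filter has between the top of the audio band and the Nyquist frequency. The sketch below assumes the passband has to reach 20 kHz and the filter has to be fully attenuating by half the sample rate; both are simplifying assumptions, and `transition_band` is an illustrative name.

```python
import math


def transition_band(fs_hz, passband_edge_hz=20_000):
    """Return (width in Hz, width in octaves) of the gap between the
    assumed passband edge and the Nyquist frequency fs_hz / 2."""
    nyquist = fs_hz / 2
    width_hz = nyquist - passband_edge_hz
    octaves = math.log2(nyquist / passband_edge_hz)
    return width_hz, octaves


for fs in (44_100, 48_000, 96_000, 192_000):
    width, octaves = transition_band(fs)
    print(f"{fs:>7} Hz sampling: {width:>7.0f} Hz transition band, {octaves:.2f} octaves")
```

At 44.1 kHz the filter has only about 2 kHz (a small fraction of an octave) to go from passing everything to blocking everything, while at 96 kHz it has 28 kHz, which is more than a full octave.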

One of the smarter things that can be done is to upsample the signal to 192 kHz or 384 kHz and then apply the filter, though that's not quite as refined as having the original signal there already.

RobAnderson · Mar 9, 2012

Howdy folks.

First post here. I should qualify it by saying that I am a recording engineer, not a scientist.

From a practical standpoint, sampling rate depends as much on delivery format and track count as anything else when it comes to recording audio.

Most adult humans can't hear much above 16 kHz, but we go with the assumption that the average human can hear up to 20 kHz. Can you perceive anything above that? Hard to say, but most evidence points to the fact that higher sampling rates sound better, because the slope of the anti-aliasing filter can be relaxed so that there are no ripples or resonances in the audio band, and the cutoff frequency can be above 20 kHz, rather than in the audio band.

44.1 kHz and 48 kHz were decided upon 30 years ago, since the first commercially available digital audio recorders were adapted video decks, and we seem to be stuck on multiples of these for PCM encoding.

These sampling rates fulfilled two criteria:
1 - they were fast enough to recreate almost the entire bandwidth of human hearing (with rather sharply sloped anti-aliasing filters);
2 - they could function with the frame rates of the video recorders that were being used;
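
For the second criterion, the commonly cited arithmetic for 44.1 kHz is three audio samples stored per usable video line. The line counts below (245 usable lines per field at 60 fields/s, 294 at 50 fields/s) come from that commonly cited account, not from this thread, so treat them as an assumption; `pcm_adaptor_rate` is an illustrative name.

```python
def pcm_adaptor_rate(fields_per_second, usable_lines_per_field, samples_per_line=3):
    """Audio sample rate when a fixed number of samples is stored on each
    usable video line (figures are assumptions from the usual account)."""
    return fields_per_second * usable_lines_per_field * samples_per_line


print(pcm_adaptor_rate(60, 245))  # 44100 (60-field video)
print(pcm_adaptor_rate(50, 294))  # 44100 (50-field video)
```

The same rate falling out of both video standards is the usual explanation for why 44.1 kHz, rather than a rounder number, was chosen.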

A lot of classical recordings, if they are using PCM, will use a sample rate of 96 kHz. Few (if any) of my colleagues are specifying 192 kHz or above. If track counts are high, chances are it will be recorded at 44.1 kHz for CD release, sometimes 48 kHz.

Nyquist/Shannon works. If you have listened to a CD in the past 30 years and heard all of the instruments, then you can be a witness. Are there some overtones or ultrasonic components missing? Maybe. But it must be said that most microphones and audio processors don't even pass audio signal much beyond 20 kHz. Even the best of the best might have a 50 kHz bandwidth at most. Your speakers sure don't reproduce anything that high.

Which leaves only the quality of the ADC and DAC. High-quality analog components, stable low-jitter clocks, temperature stability, and the like all make far more of a difference in the quality of reproduction than the sampling rate does.

I'll take a Genex or Mytek converter at 44.1 over a "soundcard" at 384 kHz any day, and if you heard the difference, you would too.

Michael C · Mar 9, 2012

RobAnderson said:

A lot of classical recordings, if they are using PCM, will use a sample rate of 96 kHz. Few (if any) of my colleagues are specifying 192 kHz or above. If track counts are high, chances are it will be recorded at 44.1 kHz for CD release, sometimes 48 kHz.

A few days ago I participated in a classical concert being recorded for the German radio network. I had a chat with the chief recording engineer, who told me that 48 kHz at a depth of 24 bit is still the standard for radio recordings here. He doesn't see any reason for this standard to change, since the microphones don't have any appreciable output above 24 kHz. He said that some colleagues use 96 kHz for classical recordings, but he would challenge anybody to hear the difference. Higher sampling rates, according to him, only make sense if you are using a lot of digital effects, which isn't the case in a classical recording where you are trying to reproduce the original as faithfully as possible.

RobAnderson · Mar 9, 2012

Higher sampling rates, according to him, only make sense if you are using a lot of digital effects, which isn't the case in a classical recording where you are trying to reproduce the original as faithfully as possible.

Yep - 48 kHz is typically used for broadcast or visual media. Bit depth is an important part of this discussion, and probably makes more of a difference than sampling rate - especially in terms of reproducing the dynamic range found in classical music. 24-bit resolution vs 16-bit makes a much bigger difference than 48 vs. 44.1 kHz sampling rate.
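
To put a figure on the bit depth point, here is a minimal sketch using the textbook formula for an ideal quantizer driven by a full-scale sine wave (about 6.02 dB per bit plus 1.76 dB). Real converters fall short of this, so the numbers are an upper bound under that assumption, and `ideal_dynamic_range_db` is an illustrative name.

```python
import math


def ideal_dynamic_range_db(bits):
    """Theoretical signal-to-quantization-noise ratio in dB for an ideal
    N-bit quantizer and a full-scale sine wave: 6.02 * N + 1.76."""
    return 20 * math.log10(2 ** bits) + 10 * math.log10(1.5)


for bits in (16, 24):
    print(f"{bits}-bit: {ideal_dynamic_range_db(bits):.1f} dB")
```

That works out to roughly 98 dB for 16-bit and roughly 146 dB for 24-bit, a gap of about 48 dB, whereas going from 44.1 to 48 kHz only moves the Nyquist frequency from 22.05 to 24 kHz.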

His argument does run counter to the way many folks on this side of the pond seem to think, though. For classical recordings, the lower track counts, the desire to maintain the utmost fidelity, and the reluctance to do much in the way of digital processing often lead folks to go with the 88.2 or 96 kHz sampling frequencies, while pop and rock productions often go with the lower sampling rates because of the need for extreme amounts of processing on many tracks.

Delta Kilo · Mar 9, 2012

Well, you can always run the ADC at 384 kHz (8x oversampling), followed by a digital anti-aliasing filter and downsampling to 48 kHz for transmission/storage. Same thing on the playback side - digitally upsample to 384 kHz using proper sinc interpolation before sending it to the DAC. This way the analog filters are a lot less critical, noise and jitter are much reduced, and you basically get a couple of extra bits of resolution.
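
As a rough check on the "extra bits" claim, here is a sketch of the standard oversampling estimate. It assumes the quantization noise is white and spread evenly up to the oversampled Nyquist frequency, with no noise shaping; under that assumption each doubling of the sample rate buys about 3 dB, or half a bit. The function names are illustrative.

```python
import math


def oversampling_gain_db(fs_oversampled_hz, fs_target_hz):
    """SNR improvement from filtering and decimating white quantization
    noise (assumption: no noise shaping)."""
    return 10 * math.log10(fs_oversampled_hz / fs_target_hz)


def extra_bits(fs_oversampled_hz, fs_target_hz):
    """Equivalent gain in bits of resolution: half a bit per doubling."""
    return 0.5 * math.log2(fs_oversampled_hz / fs_target_hz)


print(f"{oversampling_gain_db(384_000, 48_000):.2f} dB")
print(extra_bits(384_000, 48_000))  # 1.5
```

So plain 8x oversampling gives about 9 dB, or 1.5 bits, under this model. Converters that add noise shaping can do considerably better than this simple estimate.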
