# How does Nyquist's theory apply to digital recording of opera music?

by amylase
P: 13
Pretty sure I have some misconceptions, please correct me, thanks.

SHORT VERSION

According to Nyquist's theory, if I want to digitally record a live opera and be able to reconstruct back to exactly the original sound quality, I have to sample at at least twice the frequency of the original signal.
(1) Am I right on this?
(2) But an opera has lots of different instruments and voices all at different frequencies. At twice of which frequency do we sample?

LONG VERSION

Nyquist's theory says if you want to sample something, and be able to reconstruct to get the original, then you have to sample at at least twice the frequency of the original signal.

For example, assume there is a rather complex wave, which can be plotted on Y vs X axes, and its frequency is f. Looking at the graph of this curve, it is made of a huge number of points on the X-Y plane: each X has a corresponding Y value. Suppose I want to note down only some values of the curve, and hope I can reconstruct the exact original complex wave from only the values I jotted down. Imagine I walk along the x-axis and periodically I look at the Y value and note it down. How often do I have to record a Y value in order to be able to later reconstruct the original complex wave exactly the way it is? I.e. what should my minimum sampling rate be, in order to be able to reconstruct the original signal exactly, using my sampled values?

According to Nyquist's theory, if the original wave is of frequency f, then I have to sample at a frequency of at least 2f in order to be able to reconstruct back to the original exact wave. Am I right so far?

My second question is this: suppose that original complex wave is recording of an opera. It's going to have all sorts of instruments plus the singer. What do you call the frequency of this complex wave? It's going to be a wave complex as hell, what do you call its frequency? Do we break it down by Fourier Transformation and then deal with each component wave separately?

Thanks very much. Hope my two questions are clear. Firstly, am I right about Nyquist's theory, and secondly, what's the frequency of an opera recording?

(Please note this is not homework. I know my questions might sound like homework but I assure you they are not. I don't get homework.)
Mentor
P: 11,057
 Quote by amylase (2) But an opera has lots of different instruments and voices all at different frequencies. At twice of which frequency do we sample?
In principle, you sample at twice the highest frequency that the input signal contains.

In practice, I suspect that most commercial recording of opera or other classical music nowadays is done at a sample rate of 96 kHz. This preserves frequencies up to 48 kHz, which is well above a human's hearing range (at least an octave). Then they use software to re-sample it downwards to 44.1 kHz for CD, or to 48 kHz for DVD or Blu-ray.

There are also places where you can buy and download 96 kHz files. To play them, you need an audio system with a digital-to-analog converter (DAC) that can handle them.
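To make the "twice the highest frequency" rule concrete, here is a minimal sketch of what goes wrong when it is violated. It assumes pure cosine test tones and a 44.1 kHz sample rate; the helper names `alias_frequency` and `sample_cosine` are made up for illustration and do not come from any audio library.

```python
import math

def alias_frequency(f, fs):
    """Apparent frequency (Hz) of a real tone at f Hz once sampled at fs Hz."""
    f_mod = f % fs                   # fold into [0, fs)
    return min(f_mod, fs - f_mod)    # reflect into [0, fs/2]

def sample_cosine(f, fs, n):
    """Return n samples of cos(2*pi*f*t) taken at rate fs."""
    return [math.cos(2 * math.pi * f * k / fs) for k in range(n)]

fs = 44_100.0  # assumed sample rate (the CD rate)
for tone in (1_000.0, 20_000.0, 30_000.0):
    print(tone, "Hz is seen as", alias_frequency(tone, fs), "Hz")

# A 30 kHz tone lies above fs/2 = 22.05 kHz, so its samples coincide
# with those of a 14.1 kHz tone (44.1 kHz - 30 kHz).
high = sample_cosine(30_000.0, fs, 64)
low = sample_cosine(14_100.0, fs, 64)
max_diff = max(abs(a - b) for a, b in zip(high, low))
print("largest difference between the two sample sets:", max_diff)
```

Once the samples are taken, the two tones cannot be told apart. That is why anything above half the sample rate has to be filtered out before sampling.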
Engineering
Thanks
P: 6,064
 Quote by jtbell In practice, I suspect that most commercial recording of opera or other classical music nowadays is done at a sample rate of 96 kHz.
96 kHz is a fairly low sample rate for pro recording. You can get relatively cheap ($200) soundcards for PCs that will handle 192 kHz. Pros tend to use 384 kHz.

This is not just a matter of exceeding the range of human hearing. It also means any digital noise introduced in the signal processing and mixing can be spread over the whole frequency range up to the Nyquist frequency, and most of it is then eliminated when the signal is resampled at a lower frequency.
Sci Advisor
P: 3,019
There's got to be practical limits to that?

Sampling at just 2x a frequency might be fine for steady tone, but what if it's amplitude modulated or just short bursts, like the snare in Bolero?

I have to believe you could hear the difference between a 40 kHz sampled and a 192 kHz sampled recording. In other words, all digital recordings are not created equal. But I am not an audio professional, so I am asking, not asserting.
P: 132
 Quote by amylase My second question is this: suppose that original complex wave is recording of an opera. It's going to have all sorts of instruments plus the singer. What do you call the frequency of this complex wave? It's going to be a wave complex as hell, what do you call its frequency? Do we break it down by Fourier Transformation and then deal with each component wave separately?
All you need to worry about is the highest frequency in the signal: if your sampling rate is OK for that frequency, it will be OK for lower frequencies. And since your recording is presumably destined for human ears (instead of those of bats or pygmy marmosets), you only need to worry about the highest frequencies that a human ear can hear. As jtbell wrote in the first response in this thread, 96 kHz will be ample.
 Quote by AlephZero 96 kHz is a fairly low sample rate for pro recording. You can get relatively cheap ($200) soundcards for PCs that will handle 192 kHz. Pros tend to use 384 kHz.
Where do you get that information? A friend who runs a recording studio told me that most pros use 48 or 96 kHz. According to the Final Cut Pro user manual:

 96 kHz: A multiple of 48 kHz. This is becoming the professional standard for audio post-production and music recording.
 192 kHz: A multiple of 48 and 96 kHz, this is a very high-resolution sample rate used mostly for professional music recording and mastering.
PF Gold
P: 2,176
 Quote by jim hardy Sampling at just 2x a frequency might be fine for steady tone, but what if it's amplitude modulated or just short bursts, like the snare in Bolero?
It does not matter. ALL signals can be decomposed (using a Fourier transform) into single frequencies, and all you have to do is sample at 2x the highest frequency.
If you e.g. look at a square wave you can usually get away with just retaining frequencies 3x the rise time if you want to be able to re-construct the signal again (but you need to multiply that by 2 if you are sampling, because of the possibility of aliasing). The more frequency components you retain, the better the re-construction.
In reality there is no need to keep frequencies above about 20 kHz, since our ears can't hear those anyway (and will "smooth" any square wave anyway).
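As a rough illustration of "the more frequency components you retain, the better the re-construction", the sketch below truncates the standard Fourier series of an ideal +/-1 square wave, (4/pi) * (sin x + sin 3x / 3 + sin 5x / 5 + ...), and measures the RMS error over one period. The function names and the choice of 1, 3, 10 and 30 harmonics are assumptions made for the example.

```python
import math

def square_wave(x):
    """Ideal +/-1 square wave with period 2*pi."""
    return 1.0 if math.sin(x) >= 0 else -1.0

def partial_sum(x, n_harmonics):
    """Fourier series of the square wave truncated to n_harmonics odd harmonics."""
    return (4 / math.pi) * sum(
        math.sin((2 * k + 1) * x) / (2 * k + 1) for k in range(n_harmonics)
    )

def rms_error(n_harmonics, n_points=2000):
    """RMS difference between the ideal wave and its truncated series over one period."""
    total = 0.0
    for i in range(n_points):
        x = 2 * math.pi * (i + 0.5) / n_points   # midpoints avoid the jumps
        total += (square_wave(x) - partial_sum(x, n_harmonics)) ** 2
    return math.sqrt(total / n_points)

errors = [rms_error(n) for n in (1, 3, 10, 30)]
for n, err in zip((1, 3, 10, 30), errors):
    print(f"{n:2d} harmonics retained: RMS error {err:.3f}")
```

The error keeps shrinking as harmonics are added. It never reaches zero for the ideal square wave, because that wave has no highest frequency.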
P: 13
 Quote by f95toli It does not matter. ALL signals can be decomposed (using a Fourier transform) into single frequencies, and all you have to do is sample at 2x the highest frequency. If you e.g. look at a square wave you can usually get away with just retaining frequencies 3x the rise time if you want to be able to re-construct the signal again (but you need to multiply that by 2 if you are sampling, because of the possibility of aliasing). The more frequency components you retain, the better the re-construction. In reality there is no need to keep frequencies above about 20 kHz, since our ears can't hear those anyway (and will "smooth" any square wave anyway).
Hey thanks guys for the replies.

@ f95toli: What do you mean "The more frequency components you retain, the better the re-construction"? Do you mean even if I sample at 2f, I still won't be able to reconstruct back to the exact original signal? I think this is where I have some misconception. I thought, if you sample at 2f, then you can reconstruct the original wave 100%, i.e. exactly the same as the original.
But you are saying 3f will allow even better reconstruction, 4f even better, 5f better yet, etc. The higher the sampling rate, the closer your reconstructed wave is to the real original wave. Right?

My question now is: when I sample at 2f, and I use the sampled data to reconstruct. How close is my reconstructed wave to the real original wave? How do you even give a number to describe closeness of resemblance? And what would that number be like when I restore using data sampled at 2f sampling rate?

Thanks a lot. Hope my questions are clear. Basically asking, if I sample at 2f and reconstruct, how close is the reconstructed wave in comparison to the true original wave?

@ jim hardy: But even Bolero with progressive increase in overall amplitude, new instruments still keep coming in. You still will get more and more frequencies joining in. So I take it whatever instrument has the highest frequency, you'll need to sample at twice that frequency.
PF Gold
P: 2,176
 Quote by amylase Hey thanks guys for the replies. @ f95toli: What do you mean "The more frequency components you retain, the better the re-construction"? Do you mean even if I sample at 2f, I still won't be able to reconstruct back to the exact original signal?
Depends on what you are asking. If you have a signal where the highest frequency component is f, and you sample at 2f then you can reproduce the original signal EXACTLY.

However, in order to re-construct for example a square wave you would need an infinite number of frequency components and there IS -theoretically- no highest frequency. But this is only an issue for an "ideal" square wave; any real-world signal (including the sound from a musical instrument) will have a finite bandwidth. Moreover, your hearing can only pick up signals with a maximum frequency of about 20 kHz, so there is no need to record frequency components much higher than that (which is why 44.1 kHz is the CD standard: 2x20 kHz plus a few kHz to allow for the filtering).
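The exact-reconstruction claim can be tried numerically with the Whittaker-Shannon (sinc) interpolation formula. The sketch below is illustrative only: the test signal (3 Hz and 7 Hz components), the 20 Hz sample rate and the helper names are assumptions. The ideal formula sums over infinitely many samples, so a finite sum only approximates it.

```python
import math

FS = 20.0  # assumed sample rate in Hz; the Nyquist frequency is then 10 Hz

def signal(t):
    """Band-limited test signal: components at 3 Hz and 7 Hz, both below FS/2."""
    return math.sin(2 * math.pi * 3.0 * t) + 0.5 * math.cos(2 * math.pi * 7.0 * t)

def sinc(u):
    """Normalised sinc, sin(pi*u)/(pi*u), with sinc(0) = 1."""
    return 1.0 if u == 0 else math.sin(math.pi * u) / (math.pi * u)

def reconstruct(t, samples, first_index):
    """Whittaker-Shannon interpolation: samples weighted by shifted sinc functions."""
    return sum(
        x_k * sinc(FS * t - (first_index + i)) for i, x_k in enumerate(samples)
    )

N = 2000  # samples kept on each side of t = 0 (the ideal formula needs all of them)
samples = [signal(k / FS) for k in range(-N, N + 1)]

# Evaluate between the sample instants and compare with the true signal.
for t in (0.0123, 0.26, 1.037):
    print(t, signal(t), reconstruct(t, samples, -N))
```

The small remaining mismatch comes from cutting the sum off at a finite number of samples, not from any shortage of sample rate.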
Engineering
Thanks
P: 6,064
 Quote by Michael C Where do you get that information? A friend who runs a recording studio told me that most pros use 48 or 96 kHz.
Google for "384Khz sound cards". If nobody uses them, nobody would be selling them.

But if your final delivery format is going to be low quality MP3s played on low quality hardware (and that is how large sections of the music industry make money) I entirely agree high sample rates are overkill.
Engineering
Thanks
P: 6,064
 Quote by f95toli Moreover, your hearing can only pick up signals with a maximum frequency of about 20 kHz, so there is no need to record frequency components much higher than that (which is why 44.1 kHz is the CD standard: 2x20 kHz plus a few kHz to allow for the filtering).
It is true that human hearing can't "decode" continuous tones above about 20 kHz as identifiable "notes" with a definite "pitch". Whether humans can hear higher frequencies (in the sense that they produce a measurable stimulus in the brain) is a different question.

The CD standard for audio was based on what was technically feasible to implement 30 years ago. The DVD audio standard (which is more than 10 years old!) supports sampling up to 192 kHz.
P: 13
 Quote by f95toli Depends on what you are asking. If you have a signal where the highest frequency component is f, and you sample at 2f then you can reproduce the original signal EXACTLY. However, in order to re-construct for example a square wave you would need an infinite number of frequency components and there IS -theoretically- no highest frequency. But, this is only an issue for an "ideal" square wave, any real-world signal (including the sound from a musical instrument) will have a finite bandwidth.
Cool. Thank you very much (and everyone who replied!). So if I'm recording a natural, real world phenomenon, like a live concert, as long as I am sampling at >= 2f the concert's highest frequency f then I can reconstruct back to exact original. That's clear to me now.
However, if the original is made of digital signals which are artificial square waves, then intuitively it makes sense that no matter how many sine waves I use (as long as it is a finite number), I won't be able to represent that artificial square wave exactly. Here comes my question though:

Suppose we have a digital signal that goes 1,0,1,0,1,0 and so on, and I want to represent it as a sum of sine waves. You are saying I will need an infinite number of sine waves to represent it exactly, right?

y1 = square root of (sin x)^2
y2 = 1 - [ square root of (sin x)^2 ]

In other words, y1 is only the positive parts of this graph y = sinx
y2 is only the positive part of y=sinx and that being taken away from 1

The computer can draw both curves very well. And they both come from perfect sine waves, just massaged a bit.

So now if I tell the computer to calculate Y = y1 + y2, that'll give you Y = 1 for x from 0 to pi/2.
Then for x = pi/2 to pi, Y is 0
Then for x = pi to 3pi / 2 , Y is 1 again.
So on and so forth you get your 1,0,1,0,1,0...

Isn't this reproducing a square wave exactly using sine waves?
Sci Advisor
PF Gold
P: 11,138
I can't find any mention of quantising levels here. The benefit of high sample rate is to reduce quantisation noise. You can use two level sampling as long as your sample rate is high enough.
P: 798
 Quote by amylase y1 = square root of (sin x)^2 y2 = 1 - [ square root of (sin x)^2 ]
These aren't linear combinations of sine waves, and they don't have a single frequency when expressed in Fourier series.

But, you don't have to use Fourier Transform if you don't want to. You could use a wavelet transformation instead. In particular, if you use the Haar wavelet (https://en.wikipedia.org/wiki/Haar_wavelet), I think the square waves will come out nicely.
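A quick numerical check of the point about y1 and y2, offered as a sketch rather than a proof. Since sqrt((sin x)^2) is |sin x|, the sum y1 + y2 is the constant 1, not a 1,0,1,0 pattern. And |sin x| alone is not a single sine wave: its Fourier series is 2/pi - (4/pi) * sum over k >= 1 of cos(2kx) / (4k^2 - 1). The helper `cosine_coefficient` and the number of integration points are assumptions for the example.

```python
import math

def y1(x):
    return math.sqrt(math.sin(x) ** 2)      # equals |sin x|

def y2(x):
    return 1 - math.sqrt(math.sin(x) ** 2)  # equals 1 - |sin x|

def cosine_coefficient(f, n, n_points=20000):
    """Midpoint-rule estimate of a_n = (1/pi) * integral_0^{2 pi} f(x) cos(n x) dx."""
    h = 2 * math.pi / n_points
    total = sum(
        f((i + 0.5) * h) * math.cos(n * (i + 0.5) * h) for i in range(n_points)
    )
    return total * h / math.pi

# y1 + y2 is 1 everywhere, so it cannot be a square wave.
xs = [0.1 * i for i in range(100)]
max_deviation = max(abs(y1(x) + y2(x) - 1) for x in xs)
print("largest deviation of y1 + y2 from 1:", max_deviation)

# |sin x| contains every even harmonic; compare with -4 / (pi * (n^2 - 1)).
coeffs = {n: cosine_coefficient(y1, n) for n in (2, 4, 6)}
expected = {n: -4 / (math.pi * (n * n - 1)) for n in (2, 4, 6)}
for n in (2, 4, 6):
    print(n, coeffs[n], expected[n])
```

The harmonics of |sin x| fall off slowly and never stop, so taking the square root of a squared sine brings an infinite set of frequency components with it.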
PF Gold
P: 11,138
 Quote by sophiecentaur I can't find any mention of quantising levels here. The benefit of high sample rate is to reduce quantisation noise. You can use two level sampling as long as your sample rate is high enough.
I'm sorry - AlephZero already brought up the topic of quantisation noise (but not by name) but no one picked up on it. Quantisation noise (QN) is, I know, not true noise but a signal distortion, but its properties make it appear and sound like noise to human perception.
The performance of any practical system is going to depend on both the number of quantising levels and the sample rate (that is, if your sampling is on the basis of a series of amplitude samples). However you choose to analyse your input signal mathematically, you won't get a different answer for the Nyquist rate for sampling.

People forget that Sampling and Quantisation are two different issues. Nyquist does not concern itself with limiting the precision of the samples: analogue samples are assumed in the basic theory. In practice, the samples will be digitised (quantised) and that brings in other considerations.
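To show quantisation as a separate step from sampling, this sketch samples a sine well above the Nyquist rate, rounds each sample to a uniform grid, and measures the resulting signal-to-noise ratio. The result is printed beside the usual rule of thumb for a full-scale sine, about 6.02 * bits + 1.76 dB. The 997 Hz tone, the 48 kHz rate, the 0.999 amplitude and the function names are all assumptions for the example.

```python
import math

def quantise(x, bits):
    """Round x to the nearest multiple of the step size 2 / 2**bits."""
    step = 2.0 / (2 ** bits)
    return step * round(x / step)

def snr_db(bits, n_samples=20_000, f=997.0, fs=48_000.0):
    """Measured signal-to-quantisation-noise ratio for a near-full-scale sine."""
    signal_power = 0.0
    noise_power = 0.0
    for k in range(n_samples):
        x = 0.999 * math.sin(2 * math.pi * f * k / fs)
        e = quantise(x, bits) - x        # quantisation error of this sample
        signal_power += x * x
        noise_power += e * e
    return 10 * math.log10(signal_power / noise_power)

for bits in (8, 12, 16):
    print(bits, "bits: measured", round(snr_db(bits), 1),
          "dB, rule of thumb", round(6.02 * bits + 1.76, 1), "dB")
```

The sample rate is the same in every run. Only the number of levels changes, and the noise floor moves with it, roughly 6 dB per bit.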

The answer to the OP is, I think, that Nyquist applies to all sampling operations in the same way. The Nyquist limit assumes perfect filtering of your input signal, with none of the signal spectrum being higher than fs/2 so you are immediately into the realms of what is practicable. If you oversample then you can accommodate the characteristics of practical anti-aliasing filters. 'Gross' oversampling can allow you to use further filtering which can be a good engineering solution.

The "digital" part of the question then introduces Quantisation and other factors. The chosen solution will depend upon the available technology. Processing speed and distortion (linearity) were both at their limits for early ADCs. All sorts of combinations were tried.

These days, the big interest is in coding those original samples to achieve extreme bit rate reduction with no subjective impairment. Where did Hi Fi in the Home go? How many people actually listen to really good quality sound any more?

The choice of Opera as an example is a very good one in that it is one of the most complex of audio signals to deal with and has a very discerning audience. It sounds really rubbish if you try to squeeze it into a few kb/s. But that's all way down the line from the simple, basic requirement of Nyquist - which is always with us.
HW Helper
P: 6,903
If sampling a monotone (constant frequency, constant amplitude) sound wave at exactly 2f, the amplitude information will be lost (except for the case where the samples land on the peaks of the wave), so the sampling rate needs to be at least some tiny amount greater than 2f. There's also some issue with how digital to analog circuits work, and other practical limitations. Wiki article: http://en.wikipedia.org/wiki/Nyquist...considerations
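A small sketch of the edge case described above. Sampling A*sin(2*pi*f*t + phase) at exactly fs = 2f gives samples equal to +/- A*sin(phase), so what you capture depends entirely on where the samples fall. The function name and the 1 kHz tone are assumptions for illustration.

```python
import math

def samples_at_twice_f(amplitude, phase, f=1000.0, n=8):
    """Sample amplitude*sin(2*pi*f*t + phase) at exactly fs = 2*f."""
    fs = 2 * f
    return [amplitude * math.sin(2 * math.pi * f * k / fs + phase) for k in range(n)]

for phase in (0.0, math.pi / 6, math.pi / 2):
    peak = max(abs(s) for s in samples_at_twice_f(1.0, phase))
    print(f"phase = {phase:.3f} rad, largest |sample| = {peak:.3f}")
```

With zero phase every sample lands on a zero crossing and the tone vanishes. A unit-amplitude tone with phase pi/6 gives the same samples as a half-amplitude tone with phase pi/2, so amplitude and phase cannot be separated at exactly 2f.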
PF Gold
P: 11,138
 Quote by rcgldr If sampling a monotone (constant frequency, constant amplitude) sound wave at exactly 2f, the amplitude information will be lost (except for the case where the samples land on the peaks of the wave), so the sampling rate needs to be at least some tiny amount greater than 2f. There's also some issue with how digital to analog circuits work, and other practical limitations. Wiki article: http://en.wikipedia.org/wiki/Nyquist...considerations
That frequency is the 'Limit' and is one that you can't actually get to. Sample at just a fraction of a Hz above this and the samples will strobe through the sine wave, revealing the full amplitude. There is Enough Information in the sample stream to enable you to reconstitute the signal BUT the spectrum of this sampled signal will contain Two Components - one just above and one just below fs/2 - and you would need to filter out the higher one if you wanted to see just the wanted one. This involves a filter with a very long delay time and an extremely sharp 'knee' (same as the Pre-filter, in fact).
No one said it would be easy to do, remember.
That's why they leave sufficient headroom for the practicalities of Nyquist Filtering.