What exactly is a sample in terms of sample rate (audio)

hypnoticdesign · Jan 6, 2016

I can't seem to find an answer anywhere to this just by searching. I'm trying to understand samplerate more thoroughly. I know 44000 samples per second means that the highest frequency that can be recorded is 22000 hz but what is each sample? Does it mean there is a constant 44000 snippets of sound making up every sound each second? Or is this just the limit of the number of cycles any waveform can have? Is each sample a waveform?

jtbell · Jan 6, 2016

The instantaneous value of the waveform is measured 44000 times per second, as represented by the black dots in the following diagram:

http://manual.audacityteam.org/m/images/e/e2/Waveform_digital.png

Source: http://manual.audacityteam.org/index.php?title=Digital_Audio
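
A minimal Python sketch of what those black dots are: the waveform is measured at evenly spaced instants and each measurement is stored as one number. The 1 kHz tone and the helper name `sample_sine` are assumptions for illustration; 44000 Hz is the round figure used in this thread.

```python
import math

def sample_sine(freq_hz, sample_rate_hz, num_samples):
    """Return the instantaneous value of a sine wave at each sample instant."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)
            for n in range(num_samples)]

# One sample is taken every 1/44000 of a second, whatever the tone's frequency.
samples = sample_sine(1000, 44000, 44)   # 44 samples span one cycle of 1 kHz
for n, value in enumerate(samples[:5]):
    print(f"sample {n} at t = {n / 44000:.6f} s -> {value:+.4f}")
```

Each entry in `samples` is a single amplitude value, not a snippet of sound and not a waveform in its own right.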

Ibix · Jan 6, 2016

As you note, the Nyquist criterion means that a 44kHz sample rate picks up frequencies up to 22kHz. Only the very young can hear frequencies above 20kHz, so there is no point to recording higher frequencies. That's the reason why the sampling rate for audio is usually set to 44kHz or lower.

I don't know of any reason why a sound wave couldn't have higher frequency components, but we wouldn't be able to hear it.

Each sample is simply the amplitude of the wave at one instant, as per jtbell's post.

sophiecentaur · Jan 6, 2016

Ibix said:

Each sample is simply the amplitude of the wave at one instant, as per jtbell's post.

That's a good way to describe it. The 22kHz figure that's used is pretty arbitrary, in many ways, because old gimmers like me can hardly hear anything above 10kHz and there are a few 'golden ears' who can (or claim to be able to) hear way above 22kHz. A vibration with much higher frequencies than 22kHz can travel through air and end up being sampled at 44kHz. The resulting set of samples will produce signals that are below 22kHz and are quite audible. They are called Aliases. If you want an analogue/digital system that can be relied on, it is essential to low pass filter the input signal to less than or equal to 22kHz. The filter is referred to as a Nyquist filter (anti-aliasing) and is usually made to be as sharp cut as possible / practicable, to give a 'flat' frequency response in the audio band.

hypnoticdesign said:

Or is this just the limit of the number of cycles any waveform can have? Is each sample a waveform?

There's a problem here with terminology, I think. The samples are just values (e.g. the values of the instantaneous voltage). Samples are normally taken at a regular rate and, once digitised, they are a stream of digits. When they are reconstituted (in a DAC) they become a stream of pulses of different amplitudes which join together to produce a continuous waveform when they are passed through a low pass filter to 'smooth the jagged edges between them'.
Note. Samples do not have to be 'digitised'. Historically, sampling used to be done in an analogue 'sampling oscilloscope' which would sample a very high frequency signal, consisting of a repeating, fast waveform with a relatively slow rate set of sampling pulses. This would generate a waveform that was an Alias of the high frequency wave but which appeared at a low enough frequency to be displayed on a conventional (slow) oscilloscope. This was an all Analogue process. Ancient history now, of course but the only way to look at UHF waveforms with slow circuitry.

meBigGuy · Jan 7, 2016

Ibix said:

I don't know of any reason why a sound wave couldn't have higher frequency components, but we wouldn't be able to hear it.

Frequencies higher than half the sample rate (the Nyquist frequency) are aliased back into the sub-Nyquist band and can therefore become audible.

For the OP:

One of the hardest things to accept about sampling a sub-Nyquist signal is that the periodic sample data contains ALL the information of the original signal and can be used to recreate it exactly (within the limits of the sample word size). The samples actually represent evenly spaced impulses which, when filtered, will create the original signal exactly (filling in the spaces between samples). It is a very important concept.

If you want to read more about sampling, try searching for sampling or digital signal processing.

https://en.wikipedia.org/wiki/Sampling_(signal_processing)
https://en.wikipedia.org/wiki/Digital_signal_processing
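
The "filling in the spaces between samples" idea can be sketched with sinc interpolation, which is what an ideal low pass filter does to a train of impulses. This is only an illustration under stated assumptions: the 8 Hz sample rate, the 1 Hz tone and the helper names are made up for the example, and the sum is truncated to a finite number of samples, so the result is close to the original rather than exact.

```python
import math

def sinc(x):
    """Normalised sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def reconstruct(samples, first_index, sample_rate_hz, t):
    """Estimate the signal at time t from its samples (truncated sinc sum)."""
    u = t * sample_rate_hz
    return sum(value * sinc(u - (first_index + k))
               for k, value in enumerate(samples))

FS = 8.0      # sample rate in Hz (assumed, kept small for readability)
FREQ = 1.0    # tone frequency in Hz, well below FS / 2
N = 2000      # number of samples taken on each side of t = 0

samples = [math.sin(2 * math.pi * FREQ * n / FS) for n in range(-N, N + 1)]

t = 0.3       # an instant that falls between two sample times
estimate = reconstruct(samples, -N, FS, t)
exact = math.sin(2 * math.pi * FREQ * t)
print(f"reconstructed {estimate:.5f}, original {exact:.5f}")
```

The small remaining difference comes from cutting the sum off after a finite number of samples; the theory assumes the sample stream goes on forever.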

Ibix · Jan 7, 2016

Sorry I wasn't clear. You will not be able to hear an ultrasonic dog whistle at (say) 30kHz, so there's little point trying to record it. However, as sophiecentaur and meBigGuy point out, a 44kHz sample of the dog whistle will produce an aliased sound at 14kHz, which you are much more likely to be able to hear. You will need to filter out any stuff above the Nyquist frequency before the sampling occurs in order to avoid this.
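
The dog whistle arithmetic can be checked directly. The sketch below (helper name assumed) samples a 30 kHz cosine and a 14 kHz cosine at 44 kHz and compares them sample by sample; once sampled, the two tones are indistinguishable, which is why the filtering has to happen before sampling.

```python
import math

FS = 44000             # sample rate in Hz
WHISTLE = 30000        # ultrasonic tone, above FS / 2
ALIAS = FS - WHISTLE   # 14000 Hz

def sampled_cosine(freq_hz, sample_rate_hz, num_samples):
    """Values of a cosine of the given frequency at each sample instant."""
    return [math.cos(2 * math.pi * freq_hz * n / sample_rate_hz)
            for n in range(num_samples)]

whistle_samples = sampled_cosine(WHISTLE, FS, 200)
alias_samples = sampled_cosine(ALIAS, FS, 200)

# The tones differ in continuous time but agree at every sample instant.
largest_gap = max(abs(a - b) for a, b in zip(whistle_samples, alias_samples))
print(f"alias at {ALIAS} Hz, largest sample difference = {largest_gap:.2e}")
```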

sophiecentaur · Jan 7, 2016

meBigGuy said:

One of the hardest things to accept about sampling a sub-Nyquist signal is that the periodic sample data contains ALL the information of the original signal

Yes - it's a revolutionary idea. There is a way of looking at it that could help. Firstly, with above-Nyquist sampling, you have to low pass filter the input signal appropriately. If you pass the string of samples (impulses of different levels) through an ideal low pass filter, the voltage in between the sample times will follow the values of the original signal (even when there are only slightly more than two samples per cycle of the signal). The devil is in the detail of how the two low pass filters are designed; to reconstitute such a signal may involve the filters using contributions of many samples. Also the pre and post filters have to be exactly complementary or the reconstituted signal will not be right - but, hell, this is theory and all engineering is based on 'near enough'. The harder you try and the more money you spend, the better the equipment will follow the theory.

Sampling a signal with a stream of narrow pulses is a form of modulation (multiplication of one signal by the other). The string of sampling pulses has a frequency spectrum which consists of a 'comb' of frequencies (the harmonics of the sampling frequency). Sampling will produce sidebands around all of these harmonics. If you (sub-)sample a single tone with a high frequency (higher than the Nyquist limit), one of the sample harmonics (it could be the eleventh harmonic) will be 'near' your sampled frequency and produce a sideband that's less than the Nyquist frequency (i.e. it will be 'audible'). This shouldn't be hard to accept when you consider that a bog-standard superhet receiver does exactly the same thing by beating down an extremely high frequency signal to one that can be processed in a lowish frequency IF section. All that's required is that the spectrum of the final signal must not be 'jumbled up', with reconstituted products lying on top of one another. If they are, they are inseparable and what you hear / see will be distorted.
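
The 'comb' picture can be put into a few lines: find the harmonic of the sample rate nearest to the input tone, and the difference between the two is the sideband that lands in the audio band. The function name and the 487 kHz test tone are assumptions for illustration.

```python
def alias_of(freq_hz, sample_rate_hz):
    """Return (k, alias): the nearest harmonic number of the sample rate
    and the difference-frequency sideband it produces with the tone."""
    k = round(freq_hz / sample_rate_hz)
    return k, abs(freq_hz - k * sample_rate_hz)

print(alias_of(30_000, 44_000))    # tone near the 1st harmonic
print(alias_of(487_000, 44_000))   # tone near the 11th harmonic
```

A tone already below half the sample rate gives `k = 0` and comes back unchanged, which is the case the anti-aliasing filter is there to guarantee.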

hypnoticdesign · Jan 7, 2016

Thanks for the replies guys, really helpful stuff! I think I get it now. It's the damn term sample that gets me, why this term is used for so many things in music production is beyond me, it just creates confusion. I mean, I suspected it was constant but also thought it could've just meant that was the maximum number of samples and would only be used if needed for the higher frequencies...
I just want to get something out. A lot of you are saying that there's no point sampling at higher rates, but that's not actually true. The frequencies of higher notes actually interact with the lower frequencies creating a beating. These are audible and form part of some sounds. Taking away this beating can result in a less natural sounding recording but won't affect a sound that has none of this higher frequency info recorded.

Soooo... Samples are little snapshots(so to speak) of amplitude that make up waveforms depending on the arrangement of the samples. So if I've got a sine wave at 44khz samplerate and play 1 bar at 60 bpm.. I will have 44000 samples in a bar regardless of frequency.

So I was wondering, using this would be a great way to get the best out of resampling right? Or getting the most out of digital synths? I mean say I have a sound that I want to resample, If I match the fundamental to the sample rate would that ensure the recording would be as accurate as possible? Or if a synth is tuned so that all the fundamental frequencies are cycling in time with the sample rate then nothing would be getting cut off anywhere in between note(for high frequency content)?? Or would it make no difference to lower notes unless they had crazy high harmonic content? There surely has to be some optimal relationship between frequency and sample rate or is bpm to frequency more important? or both? Or Neither? lol.. sorry If I'm going on.. I just want to have the most solid foundation for my music that's possible.
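
On the bar-length arithmetic in this post: the number of samples depends only on the sample rate and the duration, never on the frequency of the note. A small sketch, assuming a 4/4 bar (the post does not state a time signature) and hypothetical helper names: at 60 bpm there are 44000 samples per beat, so a four-beat bar holds four times that.

```python
def samples_per_beat(sample_rate_hz, bpm):
    """Samples in one beat: one beat lasts 60 / bpm seconds."""
    return sample_rate_hz * 60 / bpm

def samples_per_bar(sample_rate_hz, bpm, beats_per_bar=4):
    """Samples in one bar; beats_per_bar=4 assumes 4/4 time."""
    return samples_per_beat(sample_rate_hz, bpm) * beats_per_bar

print(samples_per_beat(44000, 60))   # 44000.0
print(samples_per_bar(44000, 60))    # 176000.0
```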

meBigGuy · Jan 7, 2016

The optimal relationship is that the sample rate needs to be greater than 2X the highest frequency in your signal and the word size large enough for your dynamic range (or noise floor requirements).

In reality that is impossible since there are always higher frequencies present. The trick is to filter them such that their alias products are below the inherent noise.

There is absolutely no relationship between samples in the synth ROM sense and sampling theory for DSP. There are standard DSP sample rates and word sizes, so you stick with them. They range roughly from 44.1 kHz at 16 bits to 96 kHz at 24 bits (lower and higher are possible, of course). Any given sample rate will faithfully reproduce any frequency between 0 and close to the Nyquist frequency. There is nothing about your source material that should affect the sample rate other than the anti-alias issues (and your desire or need to cut corners for memory capacity or whatever).

The sample frequency determines the frequency above which aliasing will occur, and so determines the complexity of the anti-alias filters and the resulting phase and amplitude distortion. The higher the sample frequency, the easier it is to filter out everything above 20 kHz (or 24 kHz, or whatever) without radical phase and amplitude distortion. (Modern digital filters make channel matching easier.)

The sample word size determines the dynamic range (approx 6 dB per bit), so 16 bits can reproduce a ~96 dB dynamic range. 24 bits allows ~144 dB.
At 16 bits, you can hear hiss from home theater systems when the volume is set moderately loud and the material is silent. Sampling always produces a noise floor. The idea is to keep it below what the ear can perceive. 16 bits is pretty marginal. I wish the standard was 18 or 20 bits.
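
The "approx 6 dB per bit" rule follows from each extra bit doubling the number of levels, and a factor of two in amplitude is about 6.02 dB. A quick sketch (function name assumed):

```python
import math

def dynamic_range_db(bits):
    """Ratio of full scale to one quantisation step, in decibels."""
    return 20 * math.log10(2 ** bits)

for bits in (16, 18, 20, 24):
    print(f"{bits} bits -> {dynamic_range_db(bits):.1f} dB")
```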

hypnoticdesign · Jan 7, 2016

meBigGuy said:

There is absolutely no relationship between samples in the synth ROM sense and sampling theory for DSP. There are standard DSP sample rates and word sizes, so you stick with them. They range roughly from 44.1 kHz at 16 bits to 96 kHz at 24 bits (lower and higher are possible, of course). Any given sample rate will faithfully reproduce any frequency between 0 and close to the Nyquist frequency. There is nothing about your source material that should affect the sample rate other than the anti-alias issues.

So use a steep low pass at the cut off point of the sampling rate limit? Thanks for clearing that up.

I totally get that the samples are different and not like for example a kick drum sample but that's not what I mean. I'm mainly referring to doing things like using 375 hertz as a root note with 48khz samplerate and 120 bpm tempo or something similar to that.
Surely frequency would make a difference. I mean if a sound never lands on a zero crossing that would have to sound different to a sound that always lands on a zero crossing? If a sine wave was made up of let's say 1000 samples but the last sample is in the middle of where the waveform ends.. if that's on a loop for a sustained sound, won't that cause some sort of artefact? I'm assuming here that the samples are evenly distributed and thinking completely digital, not recorded sounds. It's just that one of my favourite synths (Serum) allows you to use other synths' waveforms by recording and importing them. It allows up to 256 waves and can modulate between them. The manual says to use specific numbers of samples to get accurate results if you're using a modulated sound, and when I found a tutorial of someone doing it, they used the sample rate to work out the cycle duration for a specific frequency, then recorded for exactly the length of time that matched the number of samples asked for by the manual.

meBigGuy · Jan 7, 2016

hypnoticdesign said:

Surely frequency would make a difference. I mean if a sound never lands on a zero crossing that would have to sound different to a sound that always lands on a zero crossing?

That's the counter-intuitive, hard to accept part I was referring to earlier. The zero crossing will end up in exactly the right places. The ratios don't matter. Mathematically the exact signal is reproduced from the samples no matter what (except for the limitations of Nyquist and the noise floor). The lining up of samples is not an issue in any way when dealing with real signals. (There is some statistical sampling theory involved that relates to dither and noise, but it's not worth going into.)

hypnoticdesign · Jan 7, 2016

I think I just realized... they were just doing that to make sure the sound completed entire cycles.

meBigGuy said:

The zero crossing will end up in exactly the right places.

Just realized a zero crossing has nothing to do with samples. I think I'm confusing myself with this synth... it was that the synth internally uses a certain number of samples to make up a frame for each waveform as the waveforms can be modulated between 256 different waves... I think I understand how that works now I've let this stuff sink in a little lol. For some reason I was picturing a waveform as a stack, I was just severely confusing myself and then it hit me :)
PS thanks for explaining this to me, it's like a void in my head has just been lit up aha

Svein · Jan 8, 2016

hypnoticdesign said:

I think I'm confusing myself with this synth

Strictly speaking, what you are referring to is known as a sampler. In a synthesizer, the sounds are created in an artificial way, like FM (Yamaha DX7) or oscillators and filter banks (the original Moog and early Roland synths).

hypnoticdesign · Jan 8, 2016

Svein said:

Strictly speaking, what you are referring to is known as a sampler. In a synthesizer, the sounds are created in an artificial way, like FM (Yamaha DX7) or oscillators and filter banks (the original Moog and early Roland synths).

Nah this is a little different than most. It doesn't actually allow you to load samples but rather recreates a representation of the waveform by analysing an audio file. I'm guessing that's why it requires the precision. But that's just a special feature of the thing. It even allows you to create waveforms from user defined equations. I've been getting a little obsessed with that feature... trying to make golden ratio waveforms and stuff like that ^^

meBigGuy · Jan 8, 2016

I guess there is DJ_sampling and DSP_sampling.

When you create waveforms with equations I expect you can use floating point numbers (like 3.14159) to describe relationships. Wouldn't be much good if it were just integers. So, don't sweat the sample rate when you are choosing frequencies. It flat out doesn't matter in most real world situations.

Now, if you are choosing a sequence to analyze (like do an FFT) or loop, there can be issues due to the discontinuity when looping. The FFT issues are generally fixed by windowing (like a Hamming or Hann window), and can sometimes be reduced by tweaking the sample length and using a DFT. As I am sure you have found, sometimes you have to tweak the beginning and/or end of a loop to get it to sound ok at the transition. But, that's not really a DSP_sampling issue.

If you are limited to choosing a specific sequence size then looping can be an issue.
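
The looping issue can be made concrete with the 375 Hz / 48 kHz example from earlier in the thread, where one cycle is exactly 128 samples. The sketch below compares the sample that should come after the end of the loop with the sample that actually follows when playback jumps back to the start. The 1280-sample loop length and the function name are assumptions for illustration.

```python
import math

def loop_jump(freq_hz, sample_rate_hz, loop_length):
    """Gap between the sample that should follow the loop and the one
    that does (the first sample, once playback wraps around)."""
    def tone(n):
        return math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)
    return abs(tone(loop_length) - tone(0))

FS = 48000
LOOP = 1280   # assumed loop length in samples

print(FS / 375)                            # samples per cycle at 375 Hz
print(f"{loop_jump(375, FS, LOOP):.3f}")   # loop holds a whole number of cycles
print(f"{loop_jump(440, FS, LOOP):.3f}")   # loop cuts a cycle short
```

The jump is a property of where the loop is cut, not of the sampling itself: the same 440 Hz tone loops cleanly if the loop length is chosen to hold a whole number of its cycles.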

sophiecentaur · Jan 8, 2016

hypnoticdesign said:

It's the damn term sample that gets me, why this term is used for so many things in music production is beyond me, it just creates confusion

As do many of the terms that are used in Engineering. But there is just one 'official' meaning and that is the instantaneous value of the input waveform, that is measured and stored / processed. Reading the popular press for 'enlightenment' is usually a forlorn hope. 'Sampling' often means taking a long string of samples - but that's another topic.

hypnoticdesign said:

Surely frequency would make a difference. I mean if a sound never lands on a zero crossing that would have to sound different to a sound that always lands on a zero crossing?

The only instance of this is when you are sampling at exactly twice the signal frequency. Nyquist demands that your input signal frequency must be lower than that. I made a point, earlier, of saying that the process of reconstitution may take some time and, if you have an input tone that's 1Hz below the Nyquist frequency and you want to avoid your zero crossing problem, then you may need to take a half second for the samples to coincide with max and min of the waveform. Note - you couldn't have a very short burst of this frequency and still comply with the Nyquist criterion because the input signal would have components higher than f_s/2. So, by definition, your input waveform would have to have a long enough stretch of high frequencies for the zero crossing problem to be avoided. The consequences of sampling are very much dependent upon the filtering that's used and that tends to get ignored in discussions; the basic theory assumes 'perfect' pre and post filtering, to keep the signal within the Nyquist limits in a healthy way.

hypnoticdesign said:

A lot of you are saying that there's no point sampling at higher rates, but that's not actually true.

Absolutely. But first, remember that basic sampling theory relates to Analogue samples. Mr Shannon was around before the age of digits. If you want to introduce the consequences of digitisation / quantisation of your samples then that's a new ball game. As soon as the samples are quantised, there is distortion and this distortion is often described as Quantisation Noise because of what it can sound like. Quantisation noise can sound a bit like crossover distortion (buzz buzz during low level passages). There is a tradeoff between sample rate and the number of bits needed per sample. Gross oversampling can have massive advantages and the 'bitstream' (delta-sigma) ADC, which uses single bit quantisation and many MHz of sample rate, can give very low quantisation noise because this noise energy is spread over the whole spectrum and the post filter will remove all but the audio components of the noise. There is so much to read about this topic and, of course, there is a lot of BS in the Hi Fi mags.
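
A rough sketch of quantisation noise, under stated assumptions: a full-scale sine is rounded to a grid with step 2 / 2**bits (no clipping, no dither), and the signal-to-noise ratio is measured from the rounding error. The test tone frequency and function names are arbitrary choices for the example. The measured figures should land near the textbook value of about 6.02 dB per bit plus 1.76 dB for a full-scale sine.

```python
import math

def quantise(value, bits):
    """Round a value to a grid with step 2 / 2**bits (no clipping)."""
    step = 2.0 / (2 ** bits)
    return round(value / step) * step

def measured_snr_db(bits, num_samples=100_000):
    """Signal-to-quantisation-noise ratio of a full-scale sine, in dB."""
    signal_power = 0.0
    noise_power = 0.0
    for n in range(num_samples):
        x = math.sin(2 * math.pi * 0.01234567 * n)   # arbitrary test tone
        error = quantise(x, bits) - x
        signal_power += x * x
        noise_power += error * error
    return 10 * math.log10(signal_power / noise_power)

for bits in (8, 12, 16):
    print(f"{bits} bits -> SNR about {measured_snr_db(bits):.1f} dB")
```

This only shows the word-size side of the tradeoff; the oversampling and noise-spreading side would need a model of the converter and its post filter.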

hypnoticdesign said:

Soooo... Samples are little snapshots(so to speak) of amplitude that make up waveforms depending on the arrangement of the samples. So if I've got a sine wave at 44khz samplerate and play 1 bar at 60 bpm.. I will have 44000 samples in a bar regardless of frequency.

I just found this. You are jumping from basics to practicalities here. If you have a simple sampling synthesiser then each note for a particular sound could be based on a recording of a number of cycles of just a single source sound, scaled up or down in frequency. Between this and the playback, there must be a re-sampling, to stretch or compress the available string of samples so that they are available at the standard sample rate of the downstream circuits. This involves re-timing and interpolating between the recorded samples.
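
The re-timing and interpolating step can be sketched with plain linear interpolation. This is the simplest possible version and only an illustration: the function name is hypothetical, and real samplers typically use better interpolation plus filtering to keep aliasing down when a sound is pitched up.

```python
def resample_linear(samples, ratio):
    """Read through `samples` in steps of `ratio`, interpolating linearly.

    ratio < 1 stretches the sound (lower pitch); ratio > 1 compresses it.
    """
    out = []
    k = 0
    while k * ratio <= len(samples) - 1:
        position = k * ratio
        i = int(position)            # recorded sample just before `position`
        frac = position - i          # how far we are towards the next one
        nxt = samples[min(i + 1, len(samples) - 1)]
        out.append(samples[i] * (1 - frac) + nxt * frac)
        k += 1
    return out

print(resample_linear([0.0, 1.0, 2.0, 3.0], 0.5))   # stretched to 7 values
print(resample_linear([0.0, 1.0, 2.0, 3.0], 1.5))   # compressed to 3 values
```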

hypnoticdesign · Jan 9, 2016

meBigGuy said:

I guess there is DJ_sampling and DSP_sampling.

When you create waveforms with equations I expect you can use floating point numbers (like 3.14159) to describe relationships. Wouldn't be much good if it were just integers. So, don't sweat the sample rate when you are choosing frequencies. It flat out doesn't matter in most real world situations.

Now, if you are choosing a sequence to analyze (like do an FFT) or loop, there can be issues due to the discontinuity when looping. The FFT issues are generally fixed by windowing (like a Hamming or Hann window), and can sometimes be reduced by tweaking the sample length and using a DFT. As I am sure you have found, sometimes you have to tweak the beginning and/or end of a loop to get it to sound ok at the transition. But, that's not really a DSP_sampling issue.

If you are limited to choosing a specific sequence size then looping can be an issue.

For the waveforms I just start with sin(-x*pi) for a sine wave or abs(x)*2-1 for a triangle and go from there.. I came up with a formula a while ago where if you take any number and divide it by a decimal then divide by the number you start with and + 1 you eventually get 1.618 and I just try stuff similar to that using something like < (-x*pi) to blend between waveforms. It results in a square like finish sound most of the time but blends between the waveform you start and allows a total of 256 waves so there's a lot of room for modulation. I just experiment to be honest though, I used a lyrebird song once to get a sine like wave while just messing about without thinking what I was doing and it's one of my favourite sounds!
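
For anyone following along, here is a rough sketch of what evaluating those two formulas into a single-cycle frame might look like. Everything around the formulas is an assumption: the thread does not give the frame length or the range of x, so the 2048-sample frame and the x range of -1 to 1 are placeholders, and `render_frame` is a hypothetical helper, not the synth's own method.

```python
import math

FRAME_SIZE = 2048   # assumed frame length; use whatever the synth's manual gives

def render_frame(formula, frame_size=FRAME_SIZE):
    """Evaluate formula(x) at evenly spaced x from -1 (inclusive) to 1 (exclusive)."""
    return [formula(-1 + 2 * i / frame_size) for i in range(frame_size)]

sine = render_frame(lambda x: math.sin(-x * math.pi))
triangle = render_frame(lambda x: abs(x) * 2 - 1)

print(len(sine), max(sine), min(triangle), max(triangle))
```

Because x covers exactly one period, each frame holds one whole cycle and loops without a jump.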

I don't really use loops at all, I like to combine samples to create a kit or just synthesise them and program the beats with midi, though I like the sound of a continuous loop sometimes to create a sustained sound and finding the right place to loop can be really hard sometimes.

Well before I got into this silly samplerate theory of mine I was timing the bpm to the cycle duration of my root bass note or basing it off the root mean of the chromatic scale. It sounds awesome to me so I'll stick to it! I just thought there might be some benefit to using sample rate in the same way. I understand now how samplerate and frequency are independent other than the range. I did notice something last night though that's a little strange. I was playing around in Logic with a few different sounds and testing using a cut off filter at really high frequencies. When I have a project set to 96khz there's just no cut off point my eq's can reach that sounds better to me than leaving it alone. I tried a 24 kHz cut off and while the difference was only tiny, it definitely sounded better without it. I'd describe the effect similar to how a tape machine can offset the sound making it brighter or darker. I'm guessing that it would be better to wait until the final stages before cutting off these high frequencies just before converting to more lossy formats?

meBigGuy · Jan 9, 2016

hypnoticdesign said:

When I have a project set to 96khz there's just no cut off point my eq's can reach that sounds better to me than leaving it alone. I tried a 24 kHz cut off and while the difference was only tiny, it definitely sounded better without it.

I'm not sure I entirely understand your experiment here, and whether or not you samplerate converted to 48KHz.

hypnoticdesign said:

I'm guessing that it would be better to wait until the final stages before cutting off these high frequencies just before converting to more lossy formats?

I concur that waiting until the end to sample-rate convert will generally give the purest results. But, if you are dealing only with digitally created sounds created at the lower sample rate, I'm not sure where the difference might creep in. On the other hand, if you have a 96 kHz sample rate signal, sample-rate converting it to 48 kHz requires the pre-conversion anti-alias filter to remove all content above 24 kHz, and that means you will always change some content below 24 kHz, since filters always have a transition region. You need to look at the response of the specific anti-alias filter used. And, since you cannot remove all >24 kHz content, there will, in reality, always be some aliasing at the time of subsampling. The goal is to design such that what aliasing there is will always be below the noise floor (or not, if you want to capitalize on aliasing).

https://www.maximintegrated.com/en/app-notes/index.mvp/id/928 explains anti-aliasing very well, but may raise some new questions.

Just remember that even purely digital sample-rate conversion always introduces changes caused by aliasing and filter responses.

The human ear (and personal taste) is "funny", and it is possible that a particular sound will sound "cooler" with alias artifacts than without. Just another distortion effect to be judged by "he who is listening". Supposedly there are tests that found that mp3 listeners preferred mp3 encoded music over losslessly encoded. A matter of what they are used to.

leright · Jan 17, 2016

meBigGuy said:

One of the hardest things to accept about sampling a sub-Nyquist signal is that the periodic sample data contains ALL the information of the original signal and can be used to recreate it exactly (within the limits of the sample word size). The samples actually represent evenly spaced impulses which, when filtered, will create the original signal exactly (filling in the spaces between samples). It is a very important concept.

Yup. This absolutely blew my mind when we studied this in my signals class. If you sample above the Nyquist rate of your signal, ALL of the information of the original signal is there in the SAMPLES. Amazing. If you convert the time series of samples into the frequency domain you get a series of copies of the original frequency spectrum! However, if the sample rate is too low (below Nyquist) these copies of the original frequency spectrum overlap and you lose information. Also, if the copies of the original frequency spectrum are too close together it becomes difficult to recover the original signal via low pass filtering (since there is no such thing as an ideal filter).
