Learning Materials for Speech, Music, and Acoustics Research

Skyler0114 · Sep 27, 2012

I am interested in starting to research into the acoustics of speech and music.
In particular I want to analyze the spectral profile of human voice in hopes of finding relationships between it and music. A possible future direction for this research is to improve synthetic speech to mimic humans with more fidelity.
I am a sophomore(ish) college student who has completed mechanics and is currently taking a second-level calculus course, but I have a very strong aptitude (and enthusiasm) for learning new things and I have a lot of science background across the board, though I must admit my biology (at scales larger than the cellular level) is less than desirable. I have a very firm grasp on almost all the math I've learned to this point. I have a knack for computers and I can make almost any program start singing 'come all ye faithful' after 15 minutes, and a background in Java and C++, though by all means I would really like to further my skills in programming. However, my knowledge of music theory is almost nonexistent, and my understanding of acoustics is far from the level I think I'll need to adequately perform research. I am probably the definition of a beginner pianist, but am teaching myself when I have time. I also have little experience with linguistics, and would be interested in learning the parts of the discipline pertinent to the research.
I live in California and have worked with researchers at Caltech and USC, though they were in computational chemistry and bioinformatics, so I don't know how much help that would be.
I need to come up with a plan to get myself to a level where I can start researching. I'll definitely need to study acoustics, and most likely a decent amount of music theory, and find software that can perform the analysis I'll need. There is more, which I'd gladly add to this list if I knew, but as I start learning I can probably better describe what I'll need. Currently I'm looking for books, scientific articles, web sites, videos, anything that would help me better understand acoustics, speech, and music theory in order to be able to perform this research. Also, if you have experience in any of these fields, an outline of what you think is essential would just be lovely. Thanks

Bobbywhy · Sep 27, 2012

I spent over 25 years as a sonar engineer pinging in oceans all around our planet using low, mid, and high frequencies. As a result I am slightly biased towards your project. We can learn great acoustic techniques from studying how animals use sound. Some of our military sonar applications have borrowed directly from a variety of sea animals that have achieved success by homeostasis through evolution. This includes, for instance, a big noisemaker, the lowly snapping shrimp. (See acoustic shockwaves and sonoluminescence)

Two examples of infrasound in nature used for communication where the transmission medium is air: elephants and peacocks!
http://www.birds.cornell.edu/brp/elephant/sections/dictionary/infrasound.html
http://www.sciencenews.org/view/generic/id/341606/title/Peacocks_ruffle_feathers,_make_a_rumble

The Macaulay Library at the Cornell Lab of Ornithology is the world's largest and oldest scientific archive of biodiversity audio and video recordings of birds, mammals, reptiles, amphibians, arthropods, and fishes. At the bottom of the web page, see the sections on Field Recording, Audio Equipment, Audio Techniques, Video Techniques, and Workshops.
http://macaulaylibrary.org/

Here is one of the best sites to study all aspects of acoustics, full of excellent animations to help the student grasp the fundamentals intuitively:
Acoustics and Vibration Animations, by Dan Russell, Ph.D., Professor of Acoustics & Director of Distance Education
Graduate Program in Acoustics, The Pennsylvania State University
http://www.acs.psu.edu/drussell/demos.html

Audacity® is free, open source, cross-platform software for recording and editing sounds. You will want this in your toolbox:
http://audacity.sourceforge.net/

“Our goal in Speech Technology Research is twofold: to make speaking to your phones and computers ubiquitous and seamless, and to help make videos on the web accessible and searchable.”
http://research.google.com/pubs/SpeechProcessing.html

Speech synthesis is the artificial production of human speech. A computer system used for this purpose is called a speech synthesizer, and can be implemented in software or hardware. A text-to-speech (TTS) system converts normal language text into speech; other systems render symbolic linguistic representations like phonetic transcriptions into speech.
http://en.wikipedia.org/wiki/Speech_synthesis

The Scientist and Engineer's Guide to Digital Signal Processing, by Steven W. Smith, Ph.D.
See Chapter 22, especially where the spectral profile of human voice is considered.
http://www.dspguide.com/ch22/6.htm

Work in this area at the Audio Laboratory, Department of Electronics, University of York, UK includes music performances using formant synthesis in music technology teaching, and articulatory synthesis in 2 and 3 dimensions based on MRI images of the vocal tract.
http://www.davidmhoward.com/voiceSynthesis.htm

As for human languages, I suggest you consider the different modern living languages, including those that use “clicks” to transfer meaning to others. For example, see: http://en.wikipedia.org/wiki/Natural_language
And this: http://dir.yahoo.com/Social_Science/Linguistics_and_Human_Languages/

Finally, the Acoustical Society of America’s website is loaded with information, much of it related to your interests.
http://www.acoustics.org/

Let us know how your study progresses.

Cheers,
Bobbywhy

Skyler0114 · Oct 4, 2012

Thanks for all the leads, Bobbywhy. A lot of good links, but one thing you didn't mention was the math I should be strong in. Also, looking at the Cornell link you gave me is making me a bit scared about the startup costs. Obviously I don't need everything there, but working out what I will really need is challenging.

BenG549 · Oct 8, 2012

Hey, my name is Ben and I'm currently studying for a Master's degree in Acoustics, with a background in Audio and Music Technology.

I did a project in my undergraduate course on formant frequency variation in vowel sounds across regional accents in the UK. If you've not got any knowledge of acoustics or linguistics I wouldn't expect that to mean much, but formant frequencies are a great first step to understanding how timbre is created in music; for example, why a C# on a piano and a C# on a trumpet are the same pitch but sound distinctly different. It's all to do with higher harmonics 'colouring' the sound created.
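To make the "same pitch, different colour" idea concrete, here is a minimal sketch in plain Python. It builds two tones on the same fundamental with different harmonic strengths, then measures how strong each harmonic is. The harmonic recipes, the names `synthesize` and `amplitude_at`, and the rounded fundamental of 275 Hz are all assumptions made for illustration; they are not measured piano or trumpet data.

```python
import math

SAMPLE_RATE = 8000  # Hz
N_SAMPLES = 1600    # 0.2 s, a whole number of periods of F0 below
F0 = 275.0          # Hz, near C#4 (about 277 Hz); rounded so the window holds whole periods

def synthesize(harmonic_amps):
    """Sum sine harmonics of F0; harmonic_amps[k] scales harmonic number k + 1."""
    return [
        sum(amp * math.sin(2 * math.pi * (k + 1) * F0 * i / SAMPLE_RATE)
            for k, amp in enumerate(harmonic_amps))
        for i in range(N_SAMPLES)
    ]

def amplitude_at(signal, freq):
    """Estimate the amplitude of one frequency by projecting onto cos and sin."""
    re = sum(s * math.cos(2 * math.pi * freq * i / SAMPLE_RATE) for i, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * freq * i / SAMPLE_RATE) for i, s in enumerate(signal))
    return 2 * math.hypot(re, im) / len(signal)

# Made-up harmonic recipes, NOT measured instrument spectra.
mellow = synthesize([1.0, 0.5, 0.25, 0.125])
bright = synthesize([1.0, 0.9, 0.8, 0.7])

# Same fundamental (same pitch), different strengths in the upper harmonics (different timbre).
for h in range(1, 5):
    print(h * F0, round(amplitude_at(mellow, h * F0), 3), round(amplitude_at(bright, h * F0), 3))
```

Both tones have the same strength at the fundamental, so they would be heard at the same pitch; only the balance of the harmonics above it differs.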

As a link with the spoken word, you can then start to think of the human voice (or more specifically your mouth and throat) as a pipe closed at one end. Just as a trumpet player vibrates his lips and sound is produced out of the end of the tube, you vibrate your vocal cords and sound arrives at the end of your throat, your open mouth. A uniform tube closed at one end will have a very specific set of natural harmonics that give it its natural resonant sound. Your voice works in exactly the same way, only we have the ability to shape our mouth, tongue and throat in order to change the harmonic content of the sound. People from different places do this in slightly different ways, hence different accents, basically. So in the vowel sound 'ah', for example, the 2nd, 3rd and 4th harmonics will be slightly different if you've been brought up in California, Alabama, or Luton (like me).
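As a rough illustration of the pipe picture, the textbook result for an ideal uniform tube closed at one end and open at the other is that it resonates at odd multiples of c / (4L). The sketch below applies that formula. The 0.17 m tube length (a commonly quoted ballpark for an adult vocal tract), the 343 m/s speed of sound, and the function name are assumptions for illustration, and a real vocal tract is not a uniform tube.

```python
def closed_open_resonances(length_m, count=3, speed_of_sound=343.0):
    """Resonant frequencies (Hz) of an ideal uniform tube closed at one end.

    Only odd multiples of the quarter-wave frequency c / (4 L) are supported.
    """
    return [(2 * n - 1) * speed_of_sound / (4 * length_m) for n in range(1, count + 1)]

# Assumed ballpark values: 0.17 m tube, 343 m/s speed of sound in air.
for freq in closed_open_resonances(0.17):
    print(round(freq), "Hz")
```

With these assumed numbers the first three resonances land near 500, 1500 and 2500 Hz. Changing the shape of the tube, which this idealized model cannot represent, is what moves those resonances around for different vowels.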

The guy above posted you a link to Audacity. That's a good bit of software: if you record or sample some clean speech, try cutting out a single word or vowel sound, like 'ooo' or 'ahhh', and pull it up in the spectrogram view. You will be able to see the formant frequencies I'm talking about.
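If you would rather see what a spectrum display is doing under the hood, here is a small pure-Python sketch of the idea: window one short frame of audio, take a discrete Fourier transform, and look for the strongest peaks. It is not how Audacity is implemented, and the test signal is a hypothetical stand-in for a vowel (three sine waves at made-up frequencies); real formants are broad resonances, not pure tones. The function names are also just for illustration.

```python
import cmath
import math

def magnitude_spectrum(frame, sample_rate):
    """Return (frequency_hz, magnitude) pairs for a Hann-windowed frame (naive DFT)."""
    n = len(frame)
    windowed = [s * 0.5 * (1 - math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]
    spectrum = []
    for k in range(n // 2):
        acc = sum(w * cmath.exp(-2j * math.pi * k * i / n) for i, w in enumerate(windowed))
        spectrum.append((k * sample_rate / n, abs(acc)))
    return spectrum

def strongest_peaks(spectrum, count):
    """Frequencies of the `count` largest local maxima, sorted by frequency."""
    peaks = [spectrum[i] for i in range(1, len(spectrum) - 1)
             if spectrum[i - 1][1] < spectrum[i][1] > spectrum[i + 1][1]]
    peaks.sort(key=lambda p: p[1], reverse=True)
    return sorted(freq for freq, _ in peaks[:count])

SAMPLE_RATE = 8000  # Hz
FRAME_LEN = 400     # 50 ms frame, 20 Hz between DFT bins

# Hypothetical stand-in for a vowel: energy at three made-up frequencies.
frame = [math.sin(2 * math.pi * 700 * i / SAMPLE_RATE)
         + 0.6 * math.sin(2 * math.pi * 1200 * i / SAMPLE_RATE)
         + 0.3 * math.sin(2 * math.pi * 2600 * i / SAMPLE_RATE)
         for i in range(FRAME_LEN)]

print(strongest_peaks(magnitude_spectrum(frame, SAMPLE_RATE), 3))
```

A spectrogram is essentially this calculation repeated on successive frames and drawn with time along one axis and frequency along the other.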

So because of the impedance mismatch between your mouth and the air outside it, some of the sound will be reflected back down your throat (or back up the trumpet in that example). Standing waves develop in the throat and those frequencies are reinforced (look up wave summation, the superposition principle, Fourier, etc.), and these frequencies show up on the spectrogram.
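The superposition step can be written out in one line. As an idealized sketch, assume the reflected wave has the same amplitude $A$ as the outgoing wave (total reflection, no loses to the outside air are modelled) and ignore the sign change that a real boundary may introduce, which only shifts where the nodes fall. The symbols $A$, $k$ and $\omega$ (amplitude, wavenumber, angular frequency) are introduced here for illustration.

```latex
\begin{align*}
p(x,t) &= A\cos(kx - \omega t) + A\cos(kx + \omega t) \\
       &= 2A\cos(kx)\cos(\omega t)
\end{align*}
```

The result no longer travels: every point oscillates in step, with an amplitude $2A\cos(kx)$ that depends only on position. Only frequencies whose pattern fits the conditions at both ends of the tube build up in this way, which is why particular frequencies stand out.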

OK, so the first point is: if you're not totally clear on them, then before you start trying to solve the wave equation for a human throat, look up things like impedance mismatches and understand what they mean in acoustics (if you don't already), standing waves and resonant frequencies (harmonics), Fourier series, waves in tubes, etc. Let me know whether I was just rambling or whether that was a useful pointer. If it was useful, feel free to ask questions; if not, I'll shut up :).

Good luck

Ben.

BenG549 · Oct 8, 2012

Oh, and as for the areas of maths you need to work on... it's all calculus, baby! Vibration and radiation are essentially all about modelling motion, so it's differential equations. If you've completed a mechanics course and you've done a bit of maths you're probably familiar with differentiation, integration, etc., and the fact that, with respect to time, velocity and acceleration are the first and second derivatives of displacement. That is all pretty fundamental. To be honest the best place to brush up on calculus is YouTube; I know that MIT and Berkeley have free course content on there, and I found it pretty useful when studying maths and vibrations for the first time. Look up stuff like waves on a string, then you can listen to some professor solve the wave equation for clamped strings or uncomplicated strings or damped strings. But yeah, acoustics is basically mechanical engineering in the maths that you use. Have fun with it!
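Here is a small numerical check of the displacement, velocity, acceleration relationship, as a sketch in plain Python. It samples a sinusoidal displacement, differentiates it twice with central differences, and compares the result with the value calculus predicts for simple harmonic motion, a = -ω²x. The 2 Hz frequency, the time step and the helper name `derivative` are arbitrary choices made for illustration.

```python
import math

def derivative(samples, dt):
    """Central-difference estimate of d/dt at the interior sample points."""
    return [(samples[i + 1] - samples[i - 1]) / (2 * dt)
            for i in range(1, len(samples) - 1)]

OMEGA = 2 * math.pi * 2.0  # rad/s, an arbitrary 2 Hz oscillation
DT = 1e-4                  # s, time step between samples

t = [i * DT for i in range(5001)]        # 0.5 s of samples
x = [math.sin(OMEGA * ti) for ti in t]   # displacement
v = derivative(x, DT)                    # velocity, aligned with t[1:-1]
a = derivative(v, DT)                    # acceleration, aligned with t[2:-2]

# For simple harmonic motion, calculus predicts a = -OMEGA**2 * x.
max_err = max(abs(a[i] + OMEGA ** 2 * x[i + 2]) for i in range(len(a)))
print("largest deviation from -omega^2 * x:", max_err)
```

The deviation shrinks as the time step gets smaller, which is the numerical counterpart of taking the limit in the definition of a derivative.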

Ben.
