Learn Speech Recognition Programming from Scratch

Gianluca · Jan 4, 2011

Hello guys!
Programming language: any; I am willing to learn from scratch. Platform: preferably Windows, but I'm prepared to fall back on Linux (though I don't know it very well).
The end result I want to achieve is a program (or script) that lets the computer determine which of four words it already knows (e.g. pear, flower, bee, apple) I have just said.
Something like this: I start the program (it already knows the four words), I say "apple" into the mic, and it answers: apple.
Nothing fancy, completely useless, I know, but it will not only expand my coding skills considerably, it will also help me significantly with a physics thesis I'm working on. And that is precisely the problem.
For Linux there are a lot of packages that, if used in a script, help in this area. Unfortunately I can't let those packages do the dirty work for me, because it is a physics thesis, not a computer science one: I have to find a way to teach the PC to recognize what I'm saying on the basis of the waveform of the sound, or some other parameter that can be measured physically and objectively (with a mic).
Another problem: where should I save the four words it has learned? A database? An ad hoc file format? Hmm :|
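
On the storage question, four words are few enough that a plain file would probably do. The sketch below is only an illustration under assumptions: it supposes each learned word is represented by a list of numeric feature vectors (the feature format is not specified in this thread), and the names `save_templates` and `load_templates` are hypothetical helpers, not part of any library.

```python
import json

def save_templates(path, templates):
    """Write a {word: [feature vectors]} dictionary to a JSON file.

    Assumption: each word maps to a list of lists of numbers.
    """
    with open(path, "w", encoding="utf-8") as f:
        json.dump(templates, f)

def load_templates(path):
    """Read the {word: [feature vectors]} dictionary back from disk."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```

A JSON file stays human-readable, which may be convenient when the numbers have to be inspected or plotted for a thesis; a database would only start to pay off with a much larger vocabulary.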

I realize this is something incredibly difficult to achieve, maybe impossible, but if you know of any sites, papers, or books (author and title are enough for me), would you do me a favor and link them?
Even better, could you point me to another section of this forum that is maybe a bit more suitable for this question?

If something is not clear, just say so... I'm Italian and it's quite possible that I've made some stupid mistakes somewhere :P

THX! :D

Mark44 · Jan 4, 2011

There is quite a bit of speech recognition functionality already built into Windows and the .NET Framework that you can use.

Here's a link to the MSDN articles on speech recognition and speech synthesis (translating text to speech): http://msdn.microsoft.com/en-us/library/ff394875.aspx

Take a look at the Speech Recognition section (http://msdn.microsoft.com/en-us/library/ff394880.aspx), and that might give you some ideas about how you can accomplish your goal.

honestrosewater · Jan 5, 2011

So are you more interested in classifying the signal (as one of your four words) or in processing the signal into something useful to a classifier? If you want to classify it, I think you'll have the most luck doing it statistically with a hidden Markov model (HMM). This has been the predominant approach for long enough that you can find resources explaining how to do it. I've never done the type of signal processing that's required for speech recognition, but an HMM is pretty simple to write and train, and implementing the Viterbi algorithm is short and straightforward. You can certainly write these parts yourself from scratch.
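
To give an idea of how short the Viterbi part can be, here is a minimal sketch in Python. It rests on assumptions that are not from this thread: the audio has already been turned into a sequence of discrete symbols (the signal processing step is not shown), there is one already-trained HMM per word, and all probabilities are stored as logarithms. The names `safe_log`, `viterbi` and `classify` are made up for this illustration.

```python
import math

def safe_log(p):
    """Natural log that maps probability 0 to -infinity instead of raising."""
    return math.log(p) if p > 0 else float("-inf")

def viterbi(obs, states, log_start, log_trans, log_emit):
    """Return (best_log_prob, best_state_path) for one discrete-symbol HMM."""
    # best[t][s]: log-probability of the best state path ending in s at time t
    best = [{s: log_start[s] + log_emit[s][obs[0]] for s in states}]
    back = [{}]  # back[t][s]: predecessor of s on that best path
    for t in range(1, len(obs)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((p, best[t - 1][p] + log_trans[p][s]) for p in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = score + log_emit[s][obs[t]]
            back[t][s] = prev
    # Trace back from the most likely final state
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return best[-1][last], path

def classify(obs, models):
    """Pick the word whose HMM gives the highest Viterbi score.

    models maps word -> (states, log_start, log_trans, log_emit).
    """
    return max(models, key=lambda word: viterbi(obs, *models[word])[0])
```

With four words, `models` would hold four entries and `classify` would return the best-scoring one. Working in log space avoids numerical underflow on long observation sequences. Training the models is a separate step that this sketch does not cover.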
