Learn Speech Recognition Programming from Scratch

  • Thread starter: Gianluca
SUMMARY

This discussion focuses on developing a speech recognition program from scratch, specifically targeting the identification of four predefined words: "pear," "flower," "bee," and "apple." The user seeks to utilize Windows or Linux platforms, emphasizing a hands-on approach without relying on pre-built packages. Key suggestions include leveraging the built-in speech recognition capabilities of the .NET Framework and implementing statistical methods such as Hidden Markov Models (HMM) for signal classification. Resources such as MSDN articles on speech recognition are recommended for further exploration.

PREREQUISITES
  • Basic understanding of programming languages (any language)
  • Familiarity with Windows and Linux operating systems
  • Knowledge of signal processing concepts
  • Understanding of statistical models, particularly Hidden Markov Models (HMM)
NEXT STEPS
  • Explore MSDN articles on speech recognition and synthesis
  • Research Hidden Markov Models (HMM) for speech classification
  • Learn about the Viterbi algorithm for implementing HMM
  • Investigate signal processing techniques relevant to speech recognition
USEFUL FOR

This discussion is beneficial for aspiring programmers, physics students integrating coding into their research, and anyone interested in developing custom speech recognition applications.

Gianluca
Hello guys!
Programming language: any; I am willing to learn from scratch. Platform: preferably Windows, but I'm prepared to fall back on Linux (though I don't know it perfectly).
The end result I want to achieve is a program (or script) that lets the computer determine which of four words it already knows (i.e. pear, flower, bee, apple) I have just said.
Something like this: I start the program (it already knows the four words), I say "apple" into the mic, and it answers: apple.
Nothing fancy, and completely useless, I know, but it will not only expand my coding skills considerably, it will also contribute significantly to a physics thesis I'm working on. And that is precisely the problem.
For Linux there are a lot of packages that, used in a script, help in this area. Unfortunately I can't let those packages do the dirty work for me, because this is a physics thesis, not a computer science one: I have to find a way to teach the PC to recognize what I'm saying based on the waveform of the sound, or on some other parameter that can be measured physically and objectively with a mic.
Another problem: where do I save the four words it has learned? A database? An ad hoc file format? Hmm :|

I realize this is incredibly difficult to achieve, maybe impossible, but if you know of any sites, papers, or books (author and title are enough for me), would you do me a favor and link them?
Even better, could you point me to another section of this forum that might be more suitable for this question?

If something is not clear, just say so... I'm Italian, and it's quite possible that I've made some stupid mistakes somewhere :P

THX! :D
 
So are you more interested in classifying the signal (as one of your four words) or in processing the signal into something useful to a classifier? If you want to classify it, I think you'll have the most luck doing it statistically with an HMM. This has been the predominant approach for long enough that you can find resources explaining how to do it. I've never done the type of signal processing that's required for speech recognition, but the HMM is pretty simple to write and train, and implementing Viterbi is short and straightforward. You can certainly write these parts yourself from scratch.
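To give a rough idea of how short the Viterbi part can be, here is a minimal sketch in plain Python. It is not code from this thread, and it rests on several assumptions: the audio has already been turned into a sequence of discrete symbol indices (for example, by quantizing per-frame features), each word has its own small HMM, and the word whose model gives the highest best-path score wins. The names `viterbi`, `classify`, and `MODELS`, and all the probabilities, are made up for illustration; real models would be trained on recordings of each word.

```python
import math

NEG_INF = float("-inf")

def _log(p):
    # log(0) is treated as -infinity so impossible paths never win
    return math.log(p) if p > 0.0 else NEG_INF

def viterbi(obs, start, trans, emit):
    """Return (best_log_prob, best_state_path) for a discrete HMM.

    obs   : list of observation symbol indices
    start : start[i]    = P(first state is i)
    trans : trans[i][j] = P(next state is j | current state is i)
    emit  : emit[i][k]  = P(symbol k | state i)
    """
    n = len(start)
    # score[i] = log-probability of the best path so far that ends in state i
    score = [_log(start[i]) + _log(emit[i][obs[0]]) for i in range(n)]
    backptrs = []
    for symbol in obs[1:]:
        new_score, ptr = [], []
        for j in range(n):
            # best predecessor state for landing in state j at this step
            best_i = max(range(n), key=lambda i: score[i] + _log(trans[i][j]))
            ptr.append(best_i)
            new_score.append(score[best_i] + _log(trans[best_i][j])
                             + _log(emit[j][symbol]))
        score = new_score
        backptrs.append(ptr)
    # trace the best path backwards from the best final state
    last = max(range(n), key=lambda i: score[i])
    path = [last]
    for ptr in reversed(backptrs):
        path.append(ptr[path[-1]])
    path.reverse()
    return score[last], path

def classify(obs, models):
    """Pick the word whose HMM gives the highest Viterbi score for obs."""
    return max(models, key=lambda word: viterbi(obs, *models[word])[0])

# Hypothetical, hand-picked two-state left-to-right models over 3 symbols.
LEFT_TO_RIGHT = [[0.6, 0.4], [0.0, 1.0]]
MODELS = {
    "bee":   ([1.0, 0.0], LEFT_TO_RIGHT, [[0.8, 0.1, 0.1], [0.1, 0.1, 0.8]]),
    "apple": ([1.0, 0.0], LEFT_TO_RIGHT, [[0.1, 0.8, 0.1], [0.8, 0.1, 0.1]]),
}

print(classify([0, 0, 2, 2], MODELS))  # -> bee
print(classify([1, 1, 0], MODELS))     # -> apple
```

Working in log-probabilities avoids numerical underflow on long sequences. The signal-processing step that produces the symbol sequence, and the training of the model parameters, are left out here on purpose.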
 
