Learn Speech Recognition Programming from Scratch

  • Thread starter Gianluca
In summary, the thread starter wants to learn a programming language from scratch, preferably on Windows, in order to write a program that recognizes which of a small set of known spoken words was just said and responds with that word. They are asking for resources and guidance, and are reluctant to rely on pre-existing speech packages because the project is part of a physics thesis and the recognition has to be based on measurable properties of the sound. The replies suggest looking into the speech recognition and synthesis functionality already available in Windows and the .NET Framework, and using a hidden Markov model (HMM) for classification.
  • #1
Gianluca
Hello guys!
As for the programming language: any. I am willing to learn from scratch. For the platform I'd prefer Windows, but I'm prepared to fall back on Linux (though I don't know it very well).
The end result I want is a program (or script) that lets the computer determine which of, say, four words it already knows (e.g. pear, flower, bee, apple) I have just said.
Something like this: I start the program (it already knows the 4 words), I say "apple" into the mic, and it answers: apple.
Nothing fancy, and completely useless, I know, but it will not only expand my coding skills considerably, it will also add a significant piece to a physics thesis I'm working on. And that is precisely the problem.
For Linux there are a lot of packages that help in this area if you call them from a script. Unfortunately I can't let such packages do the dirty work for me, because it is a physics thesis, not a computer science one: I have to find a way to teach the PC to recognize what I'm saying on the basis of the sound's waveform, or of some other parameter that can be physically and objectively measured (with a mic).
Another problem: where to save the four words it has learned? A database? An ad hoc file format? Hmm :|

I realize this is something incredibly difficult to achieve, maybe impossible, but perhaps you know some sites, papers, or books (author and title are enough for me). Would you do me a favor and link them?
Even better, could you point me to another section of this forum that is more suitable for this question?

If something is not clear, just say so ... I'm Italian and it's quite possible that I've made some stupid errors somewhere :P

THX! :D
 
  • #3
So are you more interested in classifying the signal (as one of your four words) or in processing the signal into something useful to a classifier? If you want to classify it, I think you'll have the most luck doing it statistically with a hidden Markov model (HMM). This has been the predominant approach for long enough that you can find resources explaining how to do it. I've never done the type of signal processing that's required for speech recognition, but an HMM is pretty simple to write and train, and implementing the Viterbi algorithm is short and straightforward. You can certainly write these parts yourself from scratch.
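To give an idea of how short the Viterbi part can be, here is a minimal sketch in plain Python. It is not from the thread and rests on several assumptions: the audio has already been turned into a sequence of discrete symbols (for example by quantizing per-frame features), each word has its own small HMM, and the probabilities in the hypothetical `MODELS` table are invented for illustration rather than trained. Training (e.g. with Baum-Welch) and the signal processing itself are not shown.

```python
import math

def _log(p):
    # log(0) is treated as -inf so impossible steps drop out of the max.
    return math.log(p) if p > 0 else float("-inf")

def viterbi(obs, start_p, trans_p, emit_p):
    """Return (log-probability, state path) of the most likely path for obs."""
    n = len(start_p)
    # delta[s]: best log-probability of any path that ends in state s so far.
    delta = [_log(start_p[s]) + _log(emit_p[s][obs[0]]) for s in range(n)]
    backpointers = []
    for symbol in obs[1:]:
        new_delta, pointers = [], []
        for s in range(n):
            # Best predecessor state for landing in state s at this step.
            prev = max(range(n), key=lambda r: delta[r] + _log(trans_p[r][s]))
            pointers.append(prev)
            new_delta.append(
                delta[prev] + _log(trans_p[prev][s]) + _log(emit_p[s][symbol])
            )
        delta = new_delta
        backpointers.append(pointers)
    # Trace back from the best final state to recover the full path.
    last = max(range(n), key=lambda s: delta[s])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return delta[last], path

# Hypothetical two-state, left-to-right word models over symbols 0, 1, 2.
# Each entry is (start_p, trans_p, emit_p); all numbers are made up.
LEFT_TO_RIGHT = [[0.6, 0.4], [0.0, 1.0]]
MODELS = {
    "apple": ([1.0, 0.0], LEFT_TO_RIGHT, [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1]]),
    "pear":  ([1.0, 0.0], LEFT_TO_RIGHT, [[0.1, 0.1, 0.8], [0.8, 0.1, 0.1]]),
}

def classify(obs, models):
    """Pick the word whose HMM gives obs the highest Viterbi score."""
    return max(models, key=lambda word: viterbi(obs, *models[word])[0])

if __name__ == "__main__":
    print(classify([0, 0, 1, 1], MODELS))
    print(classify([2, 2, 0, 0], MODELS))
```

With these toy numbers, `classify([0, 0, 1, 1], MODELS)` returns `"apple"` and `classify([2, 2, 0, 0], MODELS)` returns `"pear"`. Working in log-probabilities avoids numerical underflow on longer sequences. The per-word tables are plain lists of numbers, so for four words they could simply be saved to a small text or JSON file.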
 

1. What is speech recognition programming?

Speech recognition programming is the process of creating software that can recognize and interpret spoken language. This technology allows computers to understand and respond to human speech, making it possible for us to interact with devices using our voices.

2. Why should I learn speech recognition programming?

Learning speech recognition programming opens up a wide range of career opportunities in fields such as artificial intelligence, natural language processing, and human-computer interaction. It also allows you to develop innovative applications and improve user experience in various industries.

3. Do I need any prior programming experience to learn speech recognition programming?

While some basic knowledge of programming concepts may be helpful, it is not necessary to have prior programming experience to learn speech recognition programming. Many resources are available that cater to beginners and provide step-by-step guidance.

4. What are the main components of speech recognition programming?

The main components of a speech recognition system are a speech recognition engine, an acoustic model, a language model, and a user interface. The engine converts spoken words into text. The acoustic model relates the measured audio features to speech sounds, while the language model estimates which words and word sequences are likely, which helps the engine choose between similar-sounding candidates. The user interface lets users interact with the speech recognition software.

5. What are some common programming languages used in speech recognition?

Commonly used programming languages in speech recognition include Python, Java, C++, and JavaScript. There are also specialized speech recognition services and toolkits, such as Google's Cloud Speech-to-Text API and CMU Sphinx, which can be used from various programming languages.
