Voice is increasingly used to interact with our devices, and for people who depend on this kind of accessibility it is the only method available. In any case, it is quite convenient to dictate text or simply give voice commands so that our systems perform some operation without our having to use our hands. The problem is that speech recognition programs rely on engines that use mathematical algorithms to recognize speech, and they are not 100% reliable.
Technological advances keep pushing reliability closer to perfection, and artificial intelligence and big data are helping a great deal to improve speech recognition programs. A lot of effort has recently gone into improving these systems, and many studies focus on them in order to improve control and make voice the interface of the future. Keep in mind that current interfaces are less natural for people, and slower, than voice.
Voice recognition systems are expected to be worth about 10 billion dollars in the coming years. That is why large companies are focusing on the development of assistants such as Apple's Siri, Microsoft's Cortana, or Mycroft for Linux. Products such as Amazon Echo, Google Home, and Apple HomePod are also becoming increasingly popular in the home, and sophisticated voice recognition systems are being integrated into connected cars.
Having said that, our list of speech recognition tools for Linux is:
- Julius: a powerful continuous speech recognition engine with a large vocabulary.
- DeepSpeech: a TensorFlow implementation of Baidu's DeepSpeech architecture.
- Simon: a fairly flexible speech recognition program.
- Kaldi: a C++ toolkit designed for speech recognition research.
- CMU Sphinx: a speech recognition engine for mobile apps and servers.
- deepspeech.python: an implementation of DeepSpeech in Python that uses Baidu's Warp-CTC.
7 comments
Very good. Is there any good TTS (text to speech) for Linux?
On Windows and Android there are very good quality voices such as Loquendo, Ivona, or NeoSpeech, but they are not available for Linux. On Linux I tried the MBROLA and picoTTS voices, but they sound very robotic.
Cepstral offers a free Alejandra voice for Linux which is pretty good, but I couldn't figure out how to install it.
I'm in the same situation. If you find a good one, please share it.
You can use Loquendo with Wine on Linux. I recommend this video ...
I tried to install an assistant, Google Assistant I mean, and I couldn't. I got stuck at the part with the registry file, I think it's called. Too bad Alexa is crap ...
The espeak program works from the console. On Debian, install it with apt install espeak, and then run, for example, espeak -ves "Hello World"
In -ves, the v stands for voice and es for Spanish.
It has many options, such as reading a text file or writing the result to a WAV file.
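As a rough illustration of the espeak options mentioned above, here is a minimal Python sketch that builds an espeak command to read a text file and, optionally, save the speech as a WAV file. It relies on espeak's -v (voice), -f (text file to speak), and -w (output WAV file) options. The helper names build_espeak_command and run_espeak, and the file names notes.txt and notes.wav, are made up for this example and are not part of espeak.

```python
import shlex
import shutil
import subprocess


def build_espeak_command(text_file, wav_file=None, voice="es"):
    """Build an espeak argument list that speaks text_file in the given voice.

    If wav_file is given, espeak writes the audio there instead of playing it.
    """
    command = ["espeak", "-v", voice, "-f", text_file]
    if wav_file is not None:
        command += ["-w", wav_file]
    return command


def run_espeak(command):
    """Run the command, failing with a clear message if espeak is missing."""
    if shutil.which("espeak") is None:
        raise RuntimeError("espeak not found; on Debian: apt install espeak")
    subprocess.run(command, check=True)


if __name__ == "__main__":
    # notes.txt and notes.wav are placeholder file names (assumptions).
    command = build_espeak_command("notes.txt", wav_file="notes.wav")
    # Only print the command here; call run_espeak(command) to execute it.
    print(shlex.join(command))
```

Printing the command first makes it easy to check the options before running anything; run_espeak then executes it once espeak is installed and the text file exists.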
The truth is that it is all very bad. Windows is another world ... here they are 10 years behind.
And 3 years later, yes, this is still behind.