DeepSpeech | open source | Speech library

 by mozilla | C++ | Version: 0.10.0a3 | License: MPL-2.0

kandi X-RAY | DeepSpeech Summary

DeepSpeech is a C++ library typically used in artificial intelligence, speech, deep learning, and TensorFlow applications. It has no reported bugs or vulnerabilities, a weak copyleft license, and medium support. You can download it from GitHub.

DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine that can run in real time on devices ranging from a Raspberry Pi 4 to high-power GPU servers.

            kandi-support Support

              DeepSpeech has a medium active ecosystem.
              It has 22108 star(s) with 3761 fork(s). There are 658 watchers for this library.
              It had no major release in the last 12 months.
              There are 110 open issues and 1977 have been closed. On average issues are closed in 97 days. There are 19 open pull requests and 0 closed requests.
              It has a neutral sentiment in the developer community.
              The latest version of DeepSpeech is 0.10.0a3

            kandi-Quality Quality

              DeepSpeech has 0 bugs and 0 code smells.

            kandi-Security Security

              DeepSpeech has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.
              DeepSpeech code analysis shows 0 unresolved vulnerabilities.
              There are 0 security hotspots that need review.

            kandi-License License

              DeepSpeech is licensed under the MPL-2.0 License. This license is Weak Copyleft.
              Weak Copyleft licenses have some restrictions, but you can use them in commercial projects.

            kandi-Reuse Reuse

              DeepSpeech releases are available to install and integrate.
              It has 9524 lines of code, 659 functions and 119 files.
              It has high code complexity. Code complexity directly impacts maintainability of the code.

            Community Discussions


            Adafruit I2S MEMS microphone is not working with voice activity detection system
            Asked 2022-Jan-26 at 13:36

            I am trying to build a speech-to-text system on a Raspberry Pi, and I am running into many problems with voice activity detection (VAD). I am using DeepSpeech's VAD script. The Adafruit I2S MEMS microphone only outputs 32-bit PCM audio, so I modified the script to record 32-bit audio and then convert it to 16-bit for DeepSpeech to process. The frame generation and conversion parts are below:



            Answered 2022-Jan-26 at 13:36

            I searched for DeepSpeech's VAD script and found it. The problem is related to webrtcvad. The webrtcvad VAD only accepts 16-bit mono PCM audio sampled at 8000, 16000, 32000 or 48000 Hz. So you need to convert each 32-bit frame (I mean the PyAudio output frame) to 16-bit before passing it to webrtcvad.is_speech(). I made that change and it worked fine.
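            The conversion described in this answer can be sketched as follows. This is a minimal illustration, not the code from the original script: the function name pcm32_to_pcm16 is hypothetical, and it assumes the recorded frames are little-endian signed 32-bit mono samples. It keeps the 16 most significant bits of each sample.

```python
import struct


def pcm32_to_pcm16(frame_32: bytes) -> bytes:
    """Convert little-endian signed 32-bit PCM to signed 16-bit PCM.

    Assumption: the input is mono, little-endian, signed 32-bit audio.
    """
    if len(frame_32) % 4 != 0:
        raise ValueError("frame length must be a multiple of 4 bytes")
    count = len(frame_32) // 4
    samples_32 = struct.unpack("<%di" % count, frame_32)
    # Keep the 16 most significant bits of every sample.
    samples_16 = [sample >> 16 for sample in samples_32]
    return struct.pack("<%dh" % count, *samples_16)
```

            The returned bytes are half the length of the input and could then be passed to webrtcvad's is_speech(frame, sample_rate), provided the sample rate is one of the supported values listed above.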



            Can't set a hotword with deepspeech
            Asked 2022-Jan-23 at 23:41

            I tried to set a hotword for DeepSpeech on my Raspberry Pi and got a very long error when I ran this in the terminal:

            python3 /home/pi/DeepSpeech_RaspberryPi4_Hotword/ --keywords jarvis


            I don't know how to fix this and didn't find anything anywhere else.



            Answered 2021-Nov-25 at 03:10

            These errors are not related to DeepSpeech; they come from ALSA, the sound subsystem for Linux. Judging by the error, your system is having trouble accessing the microphone.

            I would recommend running several ALSA tests, such as:

            arecord -l

            This should give you a list of recording devices that are detected, such as:



            Subprocess call error while calling of DeepSpeech
            Asked 2021-Dec-06 at 03:33

            I am trying to build a customised scorer (language model) for speech-to-text using DeepSpeech in Colab. When I call the script, I get this error:



            Answered 2021-Dec-06 at 03:33

            I was able to find a solution to the question above. The language model was created successfully after I reduced the value of top_k to 15000. My phrases file has only about 42000 entries, so top_k has to be adjusted to the size of the collection. The top_k parameter sets how many of the most frequent words are kept; less frequent words are removed before processing.
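            One way to choose a value is to count the distinct words in the phrases file and cap top_k at that number. The sketch below is only an illustration: the helper name suggest_top_k is hypothetical, and the idea that an oversized top_k caused the error is an assumption drawn from this answer, not a confirmed diagnosis.

```python
from collections import Counter


def suggest_top_k(lines, requested_top_k):
    """Return (capped_top_k, vocabulary_size) for an iterable of phrases.

    Hypothetical helper: words are split on whitespace and lowercased,
    which may differ from how the real scorer tooling tokenizes text.
    """
    counts = Counter(
        word for line in lines for word in line.lower().split()
    )
    vocabulary_size = len(counts)
    return min(requested_top_k, vocabulary_size), vocabulary_size
```

            For a phrases file on disk, the lines could come from open(path, encoding="utf-8"), and the first returned value is a candidate for top_k.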



            (0) Invalid argument: Not enough time for target transition sequence (required: 28, available: 24) During the Training in Mozilla Deepspeech
            Asked 2021-Sep-25 at 18:12

            I am using the command below to start training a DeepSpeech model:



            Answered 2021-Sep-25 at 18:12

            The following worked for me:

            Go to



            ['kenlm/build/bin/build_binary', '-a', '255', '-q', '8', '-v', 'trie', '', '/content/lm.binary']' returned non-zero exit status 1
            Asked 2021-Sep-25 at 14:09

            While building the LM binary to create a scorer for the DeepSpeech model, I kept getting the following error:



            Answered 2021-Sep-25 at 14:09

            The following worked for me: Go to



            Error during training in deepspeech Internal: Failed to call ThenRnnForward with model config: [rnn_mode, rnn_input_mode, rnn_direction_mode]
            Asked 2021-Sep-24 at 13:04

            I get the following error when trying to execute:



            Answered 2021-Sep-23 at 07:59

            When I tried it as below, it worked fine.



            The method 'getApplicationDocumentsDirectory' isn't defined for the type '_MyAppState'
            Asked 2021-Jun-13 at 13:29

            Part of my code is:



            Answered 2021-Jun-13 at 13:29

            You have to install the path_provider package by running flutter pub add path_provider in your terminal. If you have already installed it, check whether you are importing it in your file.



            While trying to train a DeepSpeech model on Google Colab, I get an error saying that the .whl file is not supported
            Asked 2021-May-26 at 00:07

            Commands I used:



            Answered 2021-May-26 at 00:07

            You are using wget to pull down a .whl file that was built for a different version of Python. You are pulling down


            but are running Python 3.7. You need a different .whl file, such as:


            This is available from the DeepSpeech releases page on GitHub.
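            The mismatch described in this answer can be checked before installing. A wheel filename ends with three tags (Python tag, ABI tag, platform tag), and the Python tag has to match the running interpreter. The sketch below is a simplified check under that assumption; the function names and the example filename are hypothetical, and pip's real compatibility logic also considers the ABI and platform tags.

```python
import sys


def wheel_python_tag(filename):
    """Return the Python tag (for example 'cp37') from a wheel filename."""
    stem = filename[:-4] if filename.endswith(".whl") else filename
    parts = stem.split("-")
    if len(parts) < 5:
        raise ValueError("not a valid wheel filename: %r" % filename)
    # The last three fields are the python, abi and platform tags.
    return parts[-3]


def wheel_matches(filename, version_info=sys.version_info):
    """Check only the CPython version tag; ABI and platform are ignored."""
    wanted = "cp%d%d" % (version_info[0], version_info[1])
    return wanted in wheel_python_tag(filename).split(".")


# Hypothetical filename, used for illustration only.
example = "example_pkg-1.0.0-cp36-cp36m-linux_x86_64.whl"
print(wheel_python_tag(example), wheel_matches(example, (3, 7)))
```

            With the hypothetical filename above, the check fails for Python 3.7 because the wheel was built for CPython 3.6.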



            How to change microphone sample rate to 16000 on linux?
            Asked 2021-May-18 at 13:17

            I am currently working on a project in which I am trying to use DeepSpeech on a Raspberry Pi with microphone audio, but I keep getting an Invalid Sample Rate error. Using PyAudio, I create a stream with the sample rate the model expects, 16000 Hz, but the microphone I am using has a sample rate of 44100 Hz. When the Python script runs, no rate conversion is done, and the mismatch between the microphone's sample rate and the rate the model expects produces the Invalid Sample Rate error.

            PyAudio lists the microphone info like this:



            Answered 2021-Jan-09 at 16:47

            After some more testing, I wound up editing the PulseAudio config file. In this file you can uncomment entries that set the default and/or alternate sampling rate. Changing the alternate sampling rate from 48000 to 16000 is what solved my problem.

            The file is located at /etc/pulse/daemon.conf. On Raspbian, open and edit it with sudo vi /etc/pulse/daemon.conf. Uncomment the line ; alternate-sample-rate = 48000 by removing the ; and change the value from 48000 to 16000. Save the file and exit vim. Then restart PulseAudio with pulseaudio -k so that it picks up the changed file.
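            The same edit can be sketched in Python. This is an illustration only: the function name set_alternate_sample_rate is hypothetical, and it works on the file contents as a string so it can be tried without touching /etc/pulse/daemon.conf. Writing the result back would need root privileges, followed by the pulseaudio -k restart described above.

```python
import re


def set_alternate_sample_rate(config_text, rate=16000):
    """Uncomment and set alternate-sample-rate in daemon.conf text."""
    pattern = re.compile(
        r"^[ \t]*;?[ \t]*alternate-sample-rate[ \t]*=.*$", re.MULTILINE
    )
    replacement = "alternate-sample-rate = %d" % rate
    new_text, count = pattern.subn(replacement, config_text)
    if count == 0:
        # The entry is missing entirely, so append it.
        new_text = config_text.rstrip("\n") + "\n" + replacement + "\n"
    return new_text


sample = "; default-sample-rate = 44100\n; alternate-sample-rate = 48000\n"
print(set_alternate_sample_rate(sample))
```

            Only the alternate-sample-rate line changes; other entries, commented or not, are left as they are.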

            If you are unfamiliar with vim and Linux here is a more elaborate guide through the process of changing the sample rate.



            DeepSpeech failed to learn Persian language
            Asked 2021-May-15 at 08:12

            I’m training DeepSpeech from scratch (without checkpoint) with a language model generated using KenLM as stated in its doc. The dataset is a Common Voice dataset for Persian language.

            My configurations are as follows:

            1. Batch size = 2 (due to cuda OOM)
            2. Learning rate = 0.0001
            3. Num. neurons = 2048
            4. Num. epochs = 50
            5. Train set size = 7500
            6. Test and Dev sets size = 5000
            7. dropout for layers 1 to 5 = 0.2 (also 0.4 is experimented, same results)

            Train and validation losses both decrease during training, but after a few epochs the validation loss stops decreasing. Train loss is about 18 and validation loss is about 40.

            The predictions are all empty strings at the end of the process. Any ideas how to improve the model?



            Answered 2021-May-11 at 14:02

            Maybe you need to decrease the learning rate or use a learning rate scheduler.
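            As an illustration of what a learning rate scheduler does, here is a minimal reduce-on-plateau sketch in plain Python. The class name, parameters and default values are hypothetical and are not part of DeepSpeech's training code; the idea is to lower the learning rate whenever the validation loss has not improved for a number of epochs.

```python
class ReduceOnPlateau:
    """Multiply the learning rate by `factor` after `patience` epochs
    without improvement in validation loss. Illustrative only."""

    def __init__(self, lr, factor=0.1, patience=3, min_lr=1e-6):
        self.lr = lr
        self.factor = factor
        self.patience = patience
        self.min_lr = min_lr
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch's validation loss and return the current rate."""
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
            if self.bad_epochs >= self.patience:
                self.lr = max(self.lr * self.factor, self.min_lr)
                self.bad_epochs = 0
        return self.lr


scheduler = ReduceOnPlateau(lr=0.0001, factor=0.5, patience=2)
for loss in [50.0, 45.0, 40.0, 40.0, 40.0]:
    current_lr = scheduler.step(loss)
```

            In this run the validation loss stalls at 40 for two epochs, so the rate is halved once at the end of the loop.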


            Community Discussions, Code Snippets contain sources that include Stack Exchange Network


            Install DeepSpeech

            You can download it from GitHub.


            For new features, suggestions and bugs, create an issue on GitHub. If you have questions, search for or ask them on the Stack Overflow community page.

          • PyPI

            pip install deepspeech

          • CLI

            gh repo clone mozilla/DeepSpeech
