18 best Python Speech Recognition Libraries for 2023


by naveen.kumar@openweaver.com | Updated: Jul 31, 2023



Speech recognition is the process of converting spoken words to text. The speech_recognition module listed below, for example, supports Google Speech Recognition, Google Cloud Speech API, Bing Voice Recognition, and IBM Speech to Text.


Python is a multipurpose language used to develop many kinds of applications, including web apps. It has many libraries dedicated to speech recognition, text-to-speech conversion, and text analysis.


This kit lists some of the best Python speech recognition libraries with their key features. They include Real-Time-Voice-Cloning, which clones a voice from 5 seconds of audio to generate arbitrary speech; speech_recognition, a speech recognition module for Python that supports several engines; and wav2letter, Facebook AI Research's Automatic Speech Recognition Toolkit. Find the top 18 Python speech recognition libraries for 2023 below.

Real-Time-Voice-Cloning by CorentinJ
Python | Stars: 42399 | Version: Current | License: Others (Non-SPDX)
Clone a voice in 5 seconds to generate arbitrary speech in real-time

speech_recognition by Uberi
Python | Stars: 7239 | Version: 3.10.0 | License: Permissive (BSD-3-Clause)
Speech recognition module for Python, supporting several engines and APIs, online and offline.
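
As a rough illustration of the speech_recognition module described above, the following sketch transcribes a WAV file. It assumes the package is installed (pip install SpeechRecognition). The helper names write_silent_wav and transcribe are hypothetical and not part of the library. The Google recognizer needs network access and the Sphinx recognizer needs the separate pocketsphinx package, so treat this as a starting point rather than a definitive implementation.

```python
import wave


def write_silent_wav(path, seconds=1.0, rate=16000):
    """Write a mono 16-bit PCM WAV of silence (hypothetical helper for testing)."""
    n_frames = int(seconds * rate)
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)       # mono
        wf.setsampwidth(2)       # 16-bit samples
        wf.setframerate(rate)
        wf.writeframes(b"\x00\x00" * n_frames)
    return n_frames


def transcribe(path, offline=False):
    """Transcribe a WAV file; returns the text, or None if nothing was understood.

    Hypothetical wrapper around the speech_recognition package.
    """
    # Imported lazily so the WAV helper above works without the package.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.AudioFile(path) as source:
        audio = recognizer.record(source)  # read the whole file into memory

    try:
        if offline:
            # Offline engine; requires the pocketsphinx package.
            return recognizer.recognize_sphinx(audio)
        # Online engine; requires internet access.
        return recognizer.recognize_google(audio)
    except sr.UnknownValueError:
        return None  # the engine could not make out any speech
    except sr.RequestError as exc:
        raise RuntimeError(f"recognition engine unavailable: {exc}") from exc
```

Calling transcribe("recording.wav") with a real recording returns the recognized text. A file containing only silence should come back as None, because the recognizer raises UnknownValueError when it finds no intelligible speech.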

wav2letter by facebookresearch
Python | Stars: 5531 | Version: v0.1 | License: Others (Non-SPDX)
Facebook AI Research's Automatic Speech Recognition Toolkit

ASRT_SpeechRecognition by nl8590687
Python | Stars: 6646 | Version: v1.3.0 | License: Strong Copyleft (GPL-3.0)
A Deep-Learning-Based Chinese Speech Recognition System

ffsubsync by smacke
Python | Stars: 5990 | Version: 0.4.22 | License: Permissive (MIT)
Automagically synchronize subtitles with video.

espnet by espnet
Python | Stars: 6684 | Version: v.202304 | License: Permissive (Apache-2.0)
End-to-End Speech Processing Toolkit

speechbrain by speechbrain
Python | Stars: 6123 | Version: v0.5.14 | License: Permissive (Apache-2.0)
A PyTorch-based Speech Toolkit

speech-to-text-wavenet by buriburisuri
Python | Stars: 3746 | Version: Current | License: Permissive (Apache-2.0)
Speech-to-Text-WaveNet: End-to-end sentence-level English speech recognition based on DeepMind's WaveNet and TensorFlow

Automatic_Speech_Recognition by zzw922cn
Python | Stars: 2729 | Version: Current | License: Permissive (MIT)
End-to-end Automatic Speech Recognition for Mandarin and English in TensorFlow

tacotron by keithito
Python | Stars: 2787 | Version: v0.2.0 | License: Permissive (MIT)
A TensorFlow implementation of Google's Tacotron speech synthesis with pre-trained model (unofficial)

TensorFlowTTS by TensorSpeech
Python | Stars: 3375 | Version: v1.8 | License: Permissive (Apache-2.0)
TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for TensorFlow 2 (supports English, French, Korean, Chinese, and German, and is easy to adapt to other languages)

tensorflow-speech-recognition by pannous
Python | Stars: 2142 | Version: Current | License: Others (Non-SPDX)
🎙 Speech recognition using the TensorFlow deep learning framework and sequence-to-sequence neural networks

                                                                                                                                                                                                                                                  say_whatby joshnewlan

                                                                                                                                                                                                                                                  Python doticonstar image 2080 doticonVersion:Currentdoticon
                                                                                                                                                                                                                                                  no licences License: No License (null)

                                                                                                                                                                                                                                                  Using speech-to-text to fully check out during con calls


pytorch-kaldi by mravanelli

Python | Stars: 2267 | Version: Current
License: No License

pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by pytorch, while feature extraction, label computation, and decoding are performed with the kaldi toolkit.


deepspeech.pytorch by SeanNaren

Python | Stars: 2023 | Version: V3.0
License: Permissive (MIT)

Speech Recognition using DeepSpeech2.


aeneas by readbeyond

Python | Stars: 2169 | Version: v1.7.3
License: Strong Copyleft (AGPL-3.0)

aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment).


waveglow by NVIDIA

Python | Stars: 2110 | Version: Current
License: Permissive (BSD-3-Clause)

A Flow-based Generative Network for Speech Synthesis

lip-reading-deeplearning by astorfi

Python | Stars: 1730 | Version: 1.2
License: Permissive (Apache-2.0)

Lip Reading - Cross Audio-Visual Recognition using 3D Architectures