18 best Python Speech Recognition libraries in 2022

by naveen.kumar@openweaver.com | Updated: Jul 13, 2022

Speech recognition is the process of converting spoken words to text. Python libraries give access to many speech recognition engines and APIs, including Google Speech Engine, Google Cloud Speech API, Microsoft Bing Voice Recognition, and IBM Speech to Text. Python is a general-purpose language used for many kinds of applications, including web apps, and it has many libraries dedicated to speech recognition, text-to-speech conversion, and text analysis.

In this kit, I list some of the best Python speech recognition libraries with their key features. They include Real-Time-Voice-Cloning, which clones a voice from 5 seconds of audio to generate arbitrary speech; speech_recognition, a speech recognition module for Python that supports several engines; and wav2letter, Facebook AI Research's automatic speech recognition toolkit. Below are the top 18 Python speech recognition libraries in 2022.

Real-Time-Voice-Cloning by CorentinJ

Language: Python | Stars: 32619 | Version: Current

License: Others (Non-SPDX)

Clone a voice in 5 seconds to generate arbitrary speech in real-time

speech_recognition by Uberi

Language: Python | Stars: 5813 | Version: 3.8.1

License: Others (Non-SPDX)

Speech recognition module for Python, supporting several engines and APIs, online and offline.
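
To make "several engines and APIs, online and offline" concrete, here is a minimal sketch of transcribing an audio file with this library. It assumes the package is installed (`pip install SpeechRecognition`) and, for the offline path, that PocketSphinx is available. The `transcribe` helper and its `engine` parameter are hypothetical names introduced for illustration and are not part of the library.

```python
def transcribe(path, engine="sphinx"):
    """Transcribe an audio file (WAV, AIFF, or FLAC).

    Returns the transcript, "" if the speech was unintelligible, or None if
    the library or the chosen engine is unavailable. This helper is an
    illustrative wrapper, not part of the speech_recognition API.
    """
    if engine not in ("sphinx", "google"):
        raise ValueError(f"unsupported engine: {engine!r}")

    try:
        import speech_recognition as sr  # third-party: pip install SpeechRecognition
    except ImportError:
        return None  # library not installed

    recognizer = sr.Recognizer()
    with sr.AudioFile(path) as source:
        audio = recognizer.record(source)  # read the whole file into memory

    try:
        if engine == "sphinx":
            # Offline engine; needs the PocketSphinx package.
            return recognizer.recognize_sphinx(audio)
        # Online engine; sends the audio to Google's web speech API.
        return recognizer.recognize_google(audio)
    except sr.UnknownValueError:
        return ""  # audio was processed but no speech was recognized
    except sr.RequestError:
        return None  # engine missing or service unreachable
```

For example, `transcribe("meeting.wav", engine="google")` would try the online engine on a hypothetical file named `meeting.wav`. Switching engines changes only the `recognize_*` call, which is the main convenience of a multi-engine wrapper.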

wav2letter by facebookresearch

Language: Python | Stars: 5531 | Version: v0.1

License: Others (Non-SPDX)

Facebook AI Research's Automatic Speech Recognition Toolkit

ASRT_SpeechRecognition by nl8590687

Language: Python | Stars: 5170 | Version: v1.1.1

License: Strong Copyleft (GPL-3.0)

A Deep-Learning-Based Chinese Speech Recognition System

ffsubsync by smacke

Language: Python | Stars: 5018 | Version: 0.4.11

License: Permissive (MIT)

Automagically synchronize subtitles with video.

espnet by espnet

Language: Python | Stars: 4934 | Version: v.202204

License: Permissive (Apache-2.0)

End-to-End Speech Processing Toolkit

speechbrain by speechbrain

Language: Python | Stars: 3933 | Version: v0.5.11

License: Permissive (Apache-2.0)

A PyTorch-based Speech Toolkit

speech-to-text-wavenet by buriburisuri

Language: Python | Stars: 3557 | Version: Current

License: Permissive (Apache-2.0)

Speech-to-Text-WaveNet: End-to-end sentence-level English speech recognition based on DeepMind's WaveNet and TensorFlow

Automatic_Speech_Recognition by zzw922cn

Language: Python | Stars: 2729 | Version: Current

License: Permissive (MIT)

End-to-end Automatic Speech Recognition for Mandarin and English in TensorFlow

tacotron by keithito

Language: Python | Stars: 2489 | Version: v0.2.0

License: Permissive (MIT)

A TensorFlow implementation of Google's Tacotron speech synthesis with pre-trained model (unofficial)

TensorFlowTTS by TensorSpeech

Language: Python | Stars: 2140 | Version: v1.8

License: Permissive (Apache-2.0)

TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for TensorFlow 2 (supports English, French, Korean, Chinese, and German, and is easy to adapt to other languages)

tensorflow-speech-recognition by pannous

Language: Python | Stars: 2115 | Version: Current

License: Others (Non-SPDX)

🎙 Speech recognition using the TensorFlow deep learning framework and sequence-to-sequence neural networks

say_what by joshnewlan

Language: Python | Stars: 2080 | Version: Current

License: No License

Using speech-to-text to fully check out during con calls

pytorch-kaldi by mravanelli

Language: Python | Stars: 2021 | Version: Current

License: No License

pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by PyTorch, while feature extraction, label computation, and decoding are performed with the Kaldi toolkit.

deepspeech.pytorch by SeanNaren

Language: Python | Stars: 1881 | Version: V3.0

License: Permissive (MIT)

Speech Recognition using DeepSpeech2.

aeneas by readbeyond

Language: Python | Stars: 1867 | Version: v1.7.3

License: Strong Copyleft (AGPL-3.0)

aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment)

waveglow by NVIDIA

Language: Python | Stars: 1809 | Version: Current

License: Permissive (BSD-3-Clause)

A Flow-based Generative Network for Speech Synthesis

lip-reading-deeplearning by astorfi

Language: Python | Stars: 1598 | Version: 1.2

License: Permissive (Apache-2.0)

Lip Reading - Cross Audio-Visual Recognition using 3D Architectures

© 2022 Open Weaver Inc.