react-speech-kit by MikeyParton
React hooks for Speech Recognition and Speech Synthesis
JavaScript
168
Updated: 2 y ago
License: No License (No License)
TensorVox by ZDisket
Desktop application for neural speech synthesis written in C++
C++
168
Updated: 2 y ago
License: Permissive (MIT)
Bing-GPT-Voice-Assistant by Ai-Austin
This is a Python voice assistant that takes two different wake words. One for prompting Bing AI using EdgeGPT and the other will prompt the GPT-3.5-Turbo API
Python
168
Updated: 2 y ago
License: No License (No License)
python-speech-recognition by realpython
Speech Recognition with Python examples
Python
167
Updated: 4 y ago
License: Permissive (MIT)
gst-deepspeech by Elleo
NOTE: This plugin is now deprecated in favour of the coqui-stt branch in gst-plugins-bad: https://gitlab.freedesktop.org/philn/gstreamer/-/tree/coqui-stt/subprojects/gst-plugins-bad/ext/coqui
C++
166
Updated: 2 y ago
License: Proprietary (Proprietary)
TTS_TFLite by tulasiram58827
This repository is a collection of TTS Models in TFLite
Jupyter Notebook
166
Updated: 2 y ago
License: Permissive (Apache-2.0)
tacotron_asr by Kyubyong
Speech Recognition Using Tacotron
Python
165
Updated: 4 y ago
License: Permissive (Apache-2.0)
noise_reduction by dodiku
Speech noise reduction which was generated using existing post-production techniques implemented in Python
HTML
165
Updated: 2 y ago
License: No License (No License)
crank by k2kobayashi
A toolkit for non-parallel voice conversion based on vector-quantized variational autoencoder
Python
164
Updated: 2 y ago
License: Permissive (MIT)
Beamforming-for-speech-enhancement by AkojimaSLP
simple delaysum, MVDR and CGMM-MVDR
Python
164
Updated: 2 y ago
License: No License (No License)
cotatron by mindslab-ai
Official code for Cotatron @ INTERSPEECH 2020
Python
164
Updated: 4 y ago
License: Permissive (BSD-3-Clause)
norbert by sigsep
Painless Wiener filters for audio separation
Python
164
Updated: 2 y ago
License: Permissive (MIT)
SPTK by sp-nitech
A suite of speech signal processing tools
C++
164
Updated: 2 y ago
License: Permissive (Apache-2.0)
StyleSpeech by keonlee9420
PyTorch Implementation of Meta-StyleSpeech : Multi-Speaker Adaptive Text-to-Speech Generation
Python
164
Updated: 2 y ago
License: Permissive (MIT)
voice_chatgpt by nickbild
VoiceGPT is a voice assistant that leverages the powerful ChatGPT chatbot to answer your questions.
Python
164
Updated: 2 y ago
License: No License (No License)
myprosody by Shahabks
A Python library for measuring the acoustic features of speech (simultaneous speech, high entropy) compared to ones of native speech.
Python
163
Updated: 2 y ago
License: Permissive (MIT)
source_separation by AppleHolic
Deep learning based speech source separation using Pytorch
Python
163
Updated: 5 y ago
License: Permissive (Apache-2.0)
speech-router by lukasolson
A way to utilize Chrome's speech recognition APIs to perform actions when specific text is heard.
JavaScript
163
Updated: 5 y ago
License: Permissive (MIT)
kaldi-tuda-de by uhh-lt
Scripts for training general-purpose large vocabulary German acoustic models for ASR with Kaldi.
Shell
163
Updated: 3 y ago
License: Permissive (Apache-2.0)
aimybox-android-assistant by just-ai
Embeddable custom voice assistant for Android applications
Kotlin
163
Updated: 3 y ago
License: Permissive (Apache-2.0)
LiveWhisper by Nikorasu
A nearly-live implementation of OpenAI's Whisper, using sounddevice. Requires existing Whisper install.
Python
163
Updated: 2 y ago
License: Permissive (MIT)
pykaldi2 by jzlianglu
Yet another speech toolkit based on Kaldi and PyTorch
Python
162
Updated: 4 y ago
License: Permissive (MIT)
python-pesq by ludlows
PESQ (Perceptual Evaluation of Speech Quality) Wrapper for Python Users (narrow band and wide band)
C
162
Updated: 4 y ago
License: Permissive (MIT)
Uni-SVC by PlayVoice
uni-svc based on whisper for singing voice conversion, also for singing voice clone. lora for svc.
Python
162
Updated: 2 y ago
License: Permissive (MIT)
FullSubNet-plus by RookieJunChen
The official PyTorch implementation of "FullSubNet+: Channel Attention FullSubNet with Complex Spectrograms for Speech Enhancement".
Python
162
Updated: 2 y ago
License: Permissive (Apache-2.0)
Talkify by Hagsten
Javascript Text to speech library
JavaScript
160
Updated: 3 y ago
License: No License (No License)
electron-speech by noffle
:microphone: Easy speech recognition in Node!
JavaScript
160
Updated: 4 y ago
License: No License (No License)
electron-speech by hackergrrl
:microphone: Easy speech recognition in Node!
JavaScript
160
Updated: 4 y ago
License: No License (No License)
HateSonar by Hironsan
Hate Speech Detection Library for Python.
Jupyter Notebook
160
Updated: 2 y ago
License: Permissive (MIT)
SiFiGAN by chomeyama
Official implementation of the source-filter HiFiGAN vocoder
Python
159
Updated: 2 y ago
License: Permissive (MIT)
voicetools by namco1992
All in one voice processing library
Python
158
Updated: 4 y ago
License: Permissive (Apache-2.0)
ClovaCall by clovaai
ClovaCall dataset and Pytorch LAS baseline code (Interspeech 2020)
Python
158
Updated: 4 y ago
License: Permissive (MIT)
rnn-transducer by ZhengkunTian
A Pytorch Implementation of Transducer Model for End-to-End Speech Recognition
Python
157
Updated: 4 y ago
License: No License (No License)
speech-to-text by akras14
Example transcribing audio file (speech) to text with Google Cloud Speech API and Python
Python
157
Updated: 2 y ago
License: No License (No License)
py-kaldi-asr by gooofy
Some simple wrappers around kaldi-asr intended to make using kaldi's (online) decoders as convenient as possible.
C++
157
Updated: 4 y ago
License: Permissive (Apache-2.0)
Voice_Activity_Detector by eesungkim
A statistical model-based Voice Activity Detection
Jupyter Notebook
157
Updated: 2 y ago
License: No License (No License)
Speech_Enhancement_DNN_NMF by eesungkim
Speech Enhancement based on DNN (Spectral-Mapping, TF-Masking), DNN-NMF, NMF
Python
156
Updated: 2 y ago
License: No License (No License)
code-by-voice by simianhacker
All the support file for my code by voice setup using Dragon Naturally Speaking and DragonFly
Python
156
Updated: 4 y ago
License: Permissive (MIT)
gpt-voice-conversation-chatbot by Adri6336
Allows you to have an engaging and safely emotive spoken / CLI conversation with the AI ChatGPT / GPT-4 while giving you the option to let it remember things discussed.
Python
156
Updated: 2 y ago
License: Strong Copyleft (GPL-3.0)
conv-tasnet by funcwj
A PyTorch implementation of "TasNet: Surpassing Ideal Time-Frequency Masking for Speech Separation"
Python
153
Updated: 4 y ago
License: Permissive (MIT)
speechT by louiskirsch
An opensource speech-to-text software written in tensorflow
Python
153
Updated: 4 y ago
License: Permissive (Apache-2.0)
TLSphinx by tryolabs
Swift wrapper around Pocketsphinx
C++
153
Updated: 4 y ago
License: Permissive (MIT)
DiffSinger by keonlee9420
PyTorch implementation of DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (focused on DiffSpeech)
Python
152
Updated: 2 y ago
License: Permissive (MIT)
MsEdgeTTS by Migushthe2nd
A simple Azure Speech Service module that uses the Microsoft Edge Read Aloud API
TypeScript
152
Updated: 2 y ago
License: Permissive (MIT)
Chinese-automatic-speech-recognition by chenmingxiang110
Chinese speech recognition
Jupyter Notebook
152
Updated: 2 y ago
License: Permissive (MIT)
JARVIS-ChatGPT by gia-guar
A Conversational Assistant equipped with synthetic voices including J.A.R.V.I.S's. Powered by OpenAI and IBM Watson APIs and a Tacotron model for voice generation.
Python
152
Updated: 2 y ago
License: Permissive (MIT)
mitzuli by artetxem
The open, easy-to-use and powerful translator app for Android
C
151
Updated: 4 y ago
License: Strong Copyleft (GPL-2.0)
muavic by facebookresearch
MuAViC: A Multilingual Audio-Visual Corpus for Robust Speech Recognition and Robust Speech-to-Text Translation
Python
151
Updated: 2 y ago
License: Proprietary (Proprietary)
jPTDP by datquocnguyen
Neural network models for joint POS tagging and dependency parsing (CoNLL 2017-2018)
Python
150
Updated: 4 y ago
License: Proprietary (Proprietary)
VoiceSplit by Edresson
VoiceSplit: Targeted Voice Separation by Speaker-Conditioned Spectrogram
Python
150
Updated: 2 y ago
License: Permissive (Apache-2.0)