react-speech-kit by MikeyParton
React hooks for Speech Recognition and Speech Synthesis
JavaScript 168 | Updated: 2 y ago | License: No License

TensorVox by ZDisket
Desktop application for neural speech synthesis written in C++
C++ 168 | Updated: 2 y ago | License: Permissive (MIT)

Bing-GPT-Voice-Assistant by Ai-Austin
A Python voice assistant that takes two different wake words: one prompts Bing AI using EdgeGPT, and the other prompts the GPT-3.5-Turbo API.
Python 168 | Updated: 1 y ago | License: No License

python-speech-recognition by realpython
Speech Recognition with Python examples
Python 167 | Updated: 3 y ago | License: Permissive (MIT)

gst-deepspeech by Elleo
NOTE: This plugin is now deprecated in favour of the coqui-stt branch in gst-plugins-bad: https://gitlab.freedesktop.org/philn/gstreamer/-/tree/coqui-stt/subprojects/gst-plugins-bad/ext/coqui
C++ 166 | Updated: 2 y ago | License: Proprietary

TTS_TFLite by tulasiram58827
This repository is a collection of TTS Models in TFLite
Jupyter Notebook 166 | Updated: 2 y ago | License: Permissive (Apache-2.0)

tacotron_asr by Kyubyong
Speech Recognition Using Tacotron
Python 165 | Updated: 4 y ago | License: Permissive (Apache-2.0)

noise_reduction by dodiku
Speech noise reduction built from existing post-production techniques, implemented in Python
HTML 165 | Updated: 1 y ago | License: No License

crank by k2kobayashi
A toolkit for non-parallel voice conversion based on vector-quantized variational autoencoder
Python 164 | Updated: 2 y ago | License: Permissive (MIT)

Beamforming-for-speech-enhancement by AkojimaSLP
Simple delay-and-sum, MVDR, and CGMM-MVDR beamforming
Python 164 | Updated: 1 y ago | License: No License

cotatron by mindslab-ai
Official code for Cotatron @ INTERSPEECH 2020
Python 164 | Updated: 3 y ago | License: Permissive (BSD-3-Clause)

norbert by sigsep
Painless Wiener filters for audio separation
Python 164 | Updated: 1 y ago | License: Permissive (MIT)

SPTK by sp-nitech
A suite of speech signal processing tools
C++ 164 | Updated: 2 y ago | License: Permissive (Apache-2.0)

StyleSpeech by keonlee9420
PyTorch Implementation of Meta-StyleSpeech: Multi-Speaker Adaptive Text-to-Speech Generation
Python 164 | Updated: 1 y ago | License: Permissive (MIT)

voice_chatgpt by nickbild
VoiceGPT is a voice assistant that leverages the powerful ChatGPT chatbot to answer your questions.
Python 164 | Updated: 1 y ago | License: No License

myprosody by Shahabks
A Python library for measuring the acoustic features of speech (simultaneous speech, high entropy) compared to those of native speech.
Python 163 | Updated: 2 y ago | License: Permissive (MIT)

source_separation by AppleHolic
Deep-learning-based speech source separation using PyTorch
Python 163 | Updated: 4 y ago | License: Permissive (Apache-2.0)

speech-router by lukasolson
A way to utilize Chrome's speech recognition APIs to perform actions when specific text is heard.
JavaScript 163 | Updated: 4 y ago | License: Permissive (MIT)

kaldi-tuda-de by uhh-lt
Scripts for training general-purpose large vocabulary German acoustic models for ASR with Kaldi.
Shell 163 | Updated: 2 y ago | License: Permissive (Apache-2.0)

aimybox-android-assistant by just-ai
Embeddable custom voice assistant for Android applications
Kotlin 163 | Updated: 3 y ago | License: Permissive (Apache-2.0)

LiveWhisper by Nikorasu
A nearly-live implementation of OpenAI's Whisper, using sounddevice. Requires an existing Whisper install.
Python 163 | Updated: 1 y ago | License: Permissive (MIT)

pykaldi2 by jzlianglu
Yet another speech toolkit based on Kaldi and PyTorch
Python 162 | Updated: 3 y ago | License: Permissive (MIT)

python-pesq by ludlows
PESQ (Perceptual Evaluation of Speech Quality) Wrapper for Python Users (narrow band and wide band)
C 162 | Updated: 3 y ago | License: Permissive (MIT)

Uni-SVC by PlayVoice
Uni-SVC, based on Whisper, for singing voice conversion and singing voice cloning. LoRA for SVC.
Python 162 | Updated: 2 y ago | License: Permissive (MIT)

FullSubNet-plus by RookieJunChen
The official PyTorch implementation of "FullSubNet+: Channel Attention FullSubNet with Complex Spectrograms for Speech Enhancement".
Python 162 | Updated: 2 y ago | License: Permissive (Apache-2.0)

Talkify by Hagsten
JavaScript text-to-speech library
JavaScript 160 | Updated: 3 y ago | License: No License

electron-speech by noffle
Easy speech recognition in Node!
JavaScript 160 | Updated: 4 y ago | License: No License

electron-speech by hackergrrl
Easy speech recognition in Node!
JavaScript 160 | Updated: 4 y ago | License: No License

HateSonar by Hironsan
Hate Speech Detection Library for Python.
Jupyter Notebook 160 | Updated: 2 y ago | License: Permissive (MIT)

SiFiGAN by chomeyama
Official implementation of the source-filter HiFiGAN vocoder
Python 159 | Updated: 1 y ago | License: Permissive (MIT)

voicetools by namco1992
All in one voice processing library
Python 158 | Updated: 4 y ago | License: Permissive (Apache-2.0)

ClovaCall by clovaai
ClovaCall dataset and PyTorch LAS baseline code (Interspeech 2020)
Python 158 | Updated: 3 y ago | License: Permissive (MIT)

rnn-transducer by ZhengkunTian
A PyTorch implementation of a Transducer model for end-to-end speech recognition
Python 157 | Updated: 3 y ago | License: No License

speech-to-text by akras14
Example of transcribing an audio file (speech) to text with the Google Cloud Speech API and Python
Python 157 | Updated: 2 y ago | License: No License

py-kaldi-asr by gooofy
Some simple wrappers around kaldi-asr intended to make using Kaldi's (online) decoders as convenient as possible.
C++ 157 | Updated: 3 y ago | License: Permissive (Apache-2.0)

Voice_Activity_Detector by eesungkim
Statistical model-based voice activity detection
Jupyter Notebook 157 | Updated: 2 y ago | License: No License

Speech_Enhancement_DNN_NMF by eesungkim
Speech Enhancement based on DNN (Spectral-Mapping, TF-Masking), DNN-NMF, NMF
Python 156 | Updated: 2 y ago | License: No License

code-by-voice by simianhacker
All the support files for my code-by-voice setup using Dragon NaturallySpeaking and Dragonfly
Python 156 | Updated: 4 y ago | License: Permissive (MIT)

gpt-voice-conversation-chatbot by Adri6336
Allows you to have an engaging and safely emotive spoken / CLI conversation with the AI ChatGPT / GPT-4 while giving you the option to let it remember things discussed.
Python 156 | Updated: 1 y ago | License: Strong Copyleft (GPL-3.0)

conv-tasnet by funcwj
A PyTorch implementation of "TasNet: Surpassing Ideal Time-Frequency Masking for Speech Separation"
Python 153 | Updated: 3 y ago | License: Permissive (MIT)

speechT by louiskirsch
Open-source speech-to-text software written in TensorFlow
Python 153 | Updated: 4 y ago | License: Permissive (Apache-2.0)

TLSphinx by tryolabs
Swift wrapper around Pocketsphinx
C++ 153 | Updated: 4 y ago | License: Permissive (MIT)

DiffSinger by keonlee9420
PyTorch implementation of DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (focused on DiffSpeech)
Python 152 | Updated: 2 y ago | License: Permissive (MIT)

MsEdgeTTS by Migushthe2nd
A simple Azure Speech Service module that uses the Microsoft Edge Read Aloud API
TypeScript 152 | Updated: 2 y ago | License: Permissive (MIT)

Chinese-automatic-speech-recognition by chenmingxiang110
Chinese speech recognition
Jupyter Notebook 152 | Updated: 2 y ago | License: Permissive (MIT)

JARVIS-ChatGPT by gia-guar
A Conversational Assistant equipped with synthetic voices including J.A.R.V.I.S's. Powered by OpenAI and IBM Watson APIs and a Tacotron model for voice generation.
Python 152 | Updated: 1 y ago | License: Permissive (MIT)

mitzuli by artetxem
The open, easy-to-use and powerful translator app for Android
C 151 | Updated: 4 y ago | License: Strong Copyleft (GPL-2.0)

muavic by facebookresearch
MuAViC: A Multilingual Audio-Visual Corpus for Robust Speech Recognition and Robust Speech-to-Text Translation
Python 151 | Updated: 2 y ago | License: Proprietary

jPTDP by datquocnguyen
Neural network models for joint POS tagging and dependency parsing (CoNLL 2017-2018)
Python 150 | Updated: 3 y ago | License: Proprietary

VoiceSplit by Edresson
VoiceSplit: Targeted Voice Separation by Speaker-Conditioned Spectrogram
Python 150 | Updated: 2 y ago | License: Permissive (Apache-2.0)