LPCTron by alokprasad
Tacotron2 + LPCNET for complete End-to-End TTS System
C 81 | Updated: 3 y ago | License: No License (No License)
android-vad by gkonovalov
This VAD library is designed to process audio in real-time and detect human speech in audio samples that have a mix of speech and noise. It supports both DNN-based Silero VAD and GMM-based WebRTC VAD models.
C 81 | Updated: 1 y ago | License: Permissive (MIT)
NBSS by Audio-WestlakeU
The official repo of "Multi-channel Narrow-band Deep Speech Separation with Full-band Permutation Invariant Training", "Multichannel Speech Separation with Narrow-band Conformer" and "NBC2: Multichannel Speech Separation with Revised Narrow-band Conformer".
Python 81 | Updated: 2 y ago | License: No License (No License)
control-vc by MelissaChen15
This is the implementation for "ControlVC: Zero-Shot Voice Conversion with Time-Varying Controls on Pitch and Rhythm"
Python 81 | Updated: 2 y ago | License: Proprietary (Proprietary)
deepspeech-websocket-server by daanzu
Server & client for DeepSpeech using WebSockets for real-time speech recognition in separate environments
Python 80 | Updated: 4 y ago | License: Weak Copyleft (MPL-2.0)
voice-corpus-tool by mozilla
Tool for creation, manipulation and maintenance of voice corpora
Python 80 | Updated: 4 y ago | License: Weak Copyleft (MPL-2.0)
fft by j2kun
Python code and wav files for the post "The Fast Fourier Transform Algorithm, and Denoising a Sound Clip"
Python 80 | Updated: 4 y ago | License: No License (No License)
idear by OpenASR
🎙️ Handsfree Audio Development Interface
Kotlin 80 | Updated: 2 y ago | License: Permissive (Apache-2.0)
FFTNet by fatchord
PyTorch Implementation of FFTNet
Jupyter Notebook 80 | Updated: 3 y ago | License: No License (No License)
GPTalk by 0ut0flin3
GPT-3 client for Windows and Unix with memory management that supports both text and speech in any language.
Python 80 | Updated: 2 y ago | License: Permissive (Apache-2.0)
PitchExtractor by yl4579
Deep Neural Pitch Extractor for Voice Conversion and TTS Training
Python 80 | Updated: 2 y ago | License: Permissive (MIT)
Wav2Letter by LearnedVector
Speech recognition model based on the FAIR research paper, built using PyTorch.
Python 79 | Updated: 4 y ago | License: No License (No License)
tacotron2 by nii-yamagishilab
An implementation of Tacotron and Tacotron2
Python 79 | Updated: 4 y ago | License: Permissive (BSD-3-Clause)
VokaturiAndroid by alshell7
Emotion recognition from speech on Android.
C 79 | Updated: 4 y ago | License: No License (No License)
OpenASR by by2101
A PyTorch-based end-to-end speech recognition system.
Python 78 | Updated: 3 y ago | License: Permissive (Apache-2.0)
deepstory by thetobysiu
Deepstory turns a text/generated text into a video where the character is animated to speak your story using his/her voice.
Python 78 | Updated: 2 y ago | License: No License (No License)
speech_synth_series by bisqwit
Let’s Create a Speech Synthesizer
PHP 78 | Updated: 4 y ago | License: Permissive (BSD-2-Clause)
voice-generator-webui by log1stics
A multi-speaker, multilingual speech generation tool
Jupyter Notebook 78 | Updated: 1 y ago | License: Permissive (MIT)
speechutils by Kaljurand
Android library for speech-to-text and text-to-speech apps
Java 77 | Updated: 4 y ago | License: Permissive (Apache-2.0)
text-to-speech-sample by alexram1313
Python3 Text to Speech Video Sample
Python 77 | Updated: 4 y ago | License: Permissive (Apache-2.0)
magphase by CSTR-Edinburgh
MagPhase Vocoder: Speech analysis/synthesis system for TTS and related applications.
Python 77 | Updated: 4 y ago | License: Permissive (Apache-2.0)
Deep-Clustering-for-Speech-Separation by JusperLee
PyTorch implementation of Deep Clustering: Discriminative Embeddings For Segmentation And Separation
Python 77 | Updated: 3 y ago | License: No License (No License)
pySpeechRev by mravanelli
This Python code performs efficient speech reverberation starting from a dataset of close-talking speech signals and a collection of acoustic impulse responses.
Python 77 | Updated: 3 y ago | License: No License (No License)
Crystal.TTVS by thuhcsi
Crystal TTVS engine is a real-time audio-visual multilingual speech synthesizer with a 3D expressive avatar.
C++ 77 | Updated: 1 y ago | License: Permissive (Apache-2.0)
kaldi-serve by Vernacular-ai
Server framework for Kaldi ASR Toolkit
C++ 77 | Updated: 3 y ago | License: Permissive (Apache-2.0)
kaldi-decoders by jpuigcerver
Custom decoders for Kaldi
C++ 77 | Updated: 4 y ago | License: Permissive (MIT)
easy-speech by jankapunkt
Cross browser Speech Synthesis; no dependencies
JavaScript 77 | Updated: 1 y ago | License: No License (No License)
whisper_dart by azkadev
Speech recognition in Dart. Supports all audio formats, server-side and client-side use, and all languages; runs on CPU only.
C++ 77 | Updated: 2 y ago | License: Permissive (MIT)
StyleTTS-VC by yl4579
Official Implementation of StyleTTS-VC
Python 77 | Updated: 2 y ago | License: Permissive (MIT)
emotional-voice-conversion-with-CycleGAN-and-CWT-for-Spectrum-and-F0 by KunZhou9646
This is the implementation of the Speaker Odyssey 2020 paper "Transforming spectrum and prosody for emotional voice conversion with non-parallel training data".
Python 76 | Updated: 4 y ago | License: No License (No License)
vggvox-speaker-identification by linhdvu14
Speaker identification with VGGVox network
Python 76 | Updated: 4 y ago | License: No License (No License)
TTS-recipes by coqui-ai
🐸TTS recipes for different datasets
Shell 76 | Updated: 2 y ago | License: Weak Copyleft (MPL-2.0)
TalkNet2-pytorch by rishikksh20
TalkNet 2: Non-Autoregressive Depth-Wise Separable Convolutional Model for Speech Synthesis with Explicit Pitch and Duration Prediction.
Python 76 | Updated: 1 y ago | License: Permissive (MIT)
manim-voiceover by ManimCommunity
Manim plugin for all things voiceover
Python 76 | Updated: 2 y ago | License: Permissive (MIT)
russian_stt_text_normalization by snakers4
Russian text normalization pipeline for speech-to-text and other applications based on tagging s2s networks
Python 75 | Updated: 3 y ago | License: Strong Copyleft (GPL-3.0)
speechrecognition by aritchie
Easy to use cross platform speech recognition (speech to text) plugin for Xamarin & UWP
C# 75 | Updated: 4 y ago | License: Permissive (MIT)
OfficeAssistant by thebeebs
The Office Assistant was an intelligent user interface for Microsoft Office. The code, written in C++, is now available for anyone to use who agrees to the licence. Enjoy
C++ 75 | Updated: 4 y ago | License: No License (No License)
mcse by DistantSpeechRecognition
Multi-channel speech enhancement system (MVDR beamformer + several postfilters)
Python 74 | Updated: 4 y ago | License: No License (No License)
A fast cnn-based vocoder
Pink-Trombone by zakaton
A programmable version of Neil Thapen's Pink Trombone
JavaScript 74 | Updated: 3 y ago | License: Strong Copyleft (GPL-3.0)
zhtts by Jackiexiao
A demo of a zh/Chinese text-to-speech system that runs on CPU in real time.
Python 74 | Updated: 3 y ago | License: Permissive (MIT)
AaltoASR by aalto-speech
Aalto Automatic Speech Recognition tools
C++ 74 | Updated: 4 y ago | License: Permissive (BSD-3-Clause)
larynx2 by rhasspy
A fast, local neural text to speech system
C++ 74 | Updated: 2 y ago | License: Permissive (MIT)
ASR_benchmark by Franck-Dernoncourt
Program to benchmark various speech recognition APIs
Python 73 | Updated: 4 y ago | License: No License (No License)
BentoChain by ssheng
A voice-enabled chatbot application built using 🦜️🔗 LangChain, text-to-speech and speech-to-text models from 🤗 Hugging Face, and 🍱 BentoML.
Python 73 | Updated: 2 y ago | License: No License (No License)
Android-Speech-Recognition by maxwellobi
Continuous speech recognition library for Android with options to use GoogleVoiceIme dialog and offline mode.
Java 72 | Updated: 4 y ago | License: Permissive (MIT)
2dtan by ChenJoya
An optimized re-implementation for 2D-TAN: Learning 2D Temporal Localization Networks for Moment Localization with Natural Language (AAAI'2020).
Python 72 | Updated: 3 y ago | License: No License (No License)
audio_recogition_system by baliksjosay
An audio recognition system
Python 72 | Updated: 4 y ago | License: No License (No License)
speech-emotion-recognition-exercise by YJango
Project report from the speech emotion recognition track of a two-week AI training camp held from July 30 to August 13, 2018.
Python 72 | Updated: 4 y ago | License: No License (No License)
LocalSTT by ccoreilly
Android Speech Recognition Service using Vosk/Kaldi and Mozilla DeepSpeech
Java 72 | Updated: 2 y ago | License: Strong Copyleft (GPL-3.0)