P2P VoIP in Unity
Audio File Library
KontinuousSpeechRecognizer by StephenVinouze
A Kotlin Speech Recognizer that runs continuously and is triggered with an activation keyword
Kotlin
132
Updated: 2 y ago
License: Permissive (Apache-2.0)
graph-based-code-modelling by microsoft
Code for "Generative Code Modeling with Graphs" (ICLR'19)
C#
131
Updated: 4 y ago
License: Permissive (MIT)
pytorch-dc-tts by tugstugi
Text to Speech with PyTorch (English and Mongolian)
Jupyter Notebook
131
Updated: 4 y ago
License: Permissive (MIT)
mongolian-nlp by tugstugi
Useful resources for Mongolian NLP
Jupyter Notebook
131
Updated: 2 y ago
License: No License (No License)
TDNN by cvqluu
Time delay neural network (TDNN) implementation in PyTorch using the unfold method
Python
130
Updated: 4 y ago
License: No License (No License)
vae-npvc by JeremyCCHsu
Re-implementation of the code used in "Voice Conversion from Non-parallel Corpora Using Variational Auto-encoder"
Python
130
Updated: 4 y ago
License: Proprietary (Proprietary)
Chinese-speech-to-text by liangstein
Chinese Speech To Text Using Wavenet
Python
129
Updated: 4 y ago
License: Permissive (Apache-2.0)
openspeech by sooftware
Open-Source Toolkit for End-to-End Speech Recognition leveraging PyTorch-Lightning and Hydra.
Python
129
Updated: 4 y ago
License: Proprietary (Proprietary)
FastSpeech2 by rishikksh20
PyTorch Implementation of FastSpeech 2: Fast and High-Quality End-to-End Text to Speech
Jupyter Notebook
129
Updated: 4 y ago
License: Permissive (Apache-2.0)
ukrainian-tts by robinhad
Ukrainian TTS (text-to-speech) using ESPNET
Jupyter Notebook
129
Updated: 2 y ago
License: Permissive (MIT)
Multi-Singer by Rongjiehuang
PyTorch Implementation of Multi-Singer (ACM-MM'21)
Python
129
Updated: 2 y ago
License: Permissive (MIT)
VI-SVS by PlayVoice
Uses VITS and Opencpop to develop singing voice synthesis; different from VISinger.
Python
129
Updated: 2 y ago
License: Permissive (Apache-2.0)
ei-keyword-spotting by ShawnHymel
C
128
Updated: 2 y ago
License: No License (No License)
tensorflow-ctc-speech-recognition by philipperemy
Application of Connectionist Temporal Classification (CTC) for Speech Recognition (Tensorflow 1.0 but compatible with 2.0).
Python
127
Updated: 4 y ago
License: Permissive (Apache-2.0)
CNN-for-single-channel-speech-enhancement by zhr1201
Convolutional neural nets for single channel speech enhancement
Python
126
Updated: 4 y ago
License: No License (No License)
sova-tts by sovaai
Python
126
Updated: 2 y ago
License: Permissive (Apache-2.0)
Crystal by thuhcsi
Crystal - C++ implementation of a unified framework for multilingual TTS synthesis engine with SSML specification as interface.
C++
126
Updated: 4 y ago
License: Permissive (Apache-2.0)
bark-voice-cloning-HuBERT-quantizer by gitmylo
The code for the bark-voicecloning model. Training and inference.
Python
126
Updated: 2 y ago
License: Permissive (MIT)
fdndlp by helianvine
A speech dereverberation algorithm, also called WPE
Python
125
Updated: 2 y ago
License: Permissive (MIT)
Tacotron-Wavenet-Vocoder-Korean by hccho2
Tacotron, Korean, Wavenet-Vocoder, Korean TTS
Python
125
Updated: 4 y ago
License: Permissive (MIT)
Saiy-PS by brandall76
Saiy Android Play Services dependencies
Java
125
Updated: 4 y ago
License: Strong Copyleft (AGPL-3.0)
quran-align by cpfair
Word-accurate timestamps for Qur'anic audio.
C++
125
Updated: 4 y ago
License: Permissive (MIT)
keras-kaldi by dspavankumar
Keras Interface for Kaldi ASR
Python
124
Updated: 4 y ago
License: Strong Copyleft (GPL-3.0)
at16k by at16k
Trained models for automatic speech recognition (ASR). A library to quickly build applications that require speech to text conversion.
Python
124
Updated: 4 y ago
License: Permissive (MIT)
STYLER by keonlee9420
Official repository of STYLER: Style Factor Modeling with Rapidity and Robustness via Speech Decomposition for Expressive and Controllable Neural Text to Speech, INTERSPEECH 2021
Python
124
Updated: 2 y ago
License: Permissive (MIT)
Twist by DCubix
Twist - node-based audio synthesizer
C
124
Updated: 2 y ago
License: No License (No License)
python-google-speech-scripts by jeysonmc
Simple scripts to interact with Google's speech services
Python
123
Updated: 4 y ago
License: Proprietary (Proprietary)
scribe by VikParuchuri
Simple speech recognition using your microphone.
Python
123
Updated: 6 y ago
License: No License (No License)
end-to-end-lipreading by mpc001
PyTorch code for End-to-End Audiovisual Speech Recognition
Python
123
Updated: 4 y ago
License: No License (No License)
fac-via-ppg by guanlongzhao
Foreign Accent Conversion by Synthesizing Speech from Phonetic Posteriorgrams (Interspeech'19)
Python
123
Updated: 2 y ago
License: Permissive (Apache-2.0)
DurIAN by ivanvovk
Implementation of the paper "Duration Informed Attention Network for Multimodal Synthesis" (https://arxiv.org/pdf/1909.01700.pdf).
Python
123
Updated: 4 y ago
License: Permissive (BSD-3-Clause)
3D-Speaker by alibaba-damo-academy
A repository for single- and multi-modal speaker verification, speaker recognition, and speaker diarization.
Python
123
Updated: 2 y ago
License: Permissive (Apache-2.0)
phasen by huyanxin
An unofficial PyTorch implementation of Microsoft's PHASEN
Python
122
Updated: 4 y ago
License: No License (No License)
MTFAA-Net by echocatzh
Multi-Scale Temporal Frequency Convolutional Network With Axial Attention for Speech Enhancement
Python
122
Updated: 2 y ago
License: Permissive (MIT)
penn by interactiveaudiolab
Pitch Estimating Neural Networks (PENN)
Python
122
Updated: 2 y ago
License: Permissive (MIT)
beamformers by Enny1991
Easy-to-use beamformers for multi-channel speech separation/enhancement
Python
121
Updated: 2 y ago
License: Permissive (MIT)
SpleeterRT by james34602
Real-time monaural source separation based on a fully convolutional neural network that operates in the time-frequency domain.
C
121
Updated: 2 y ago
License: Strong Copyleft (GPL-3.0)
mongolian-speech-recognition by tugstugi
Mongolian speech recognition with PyTorch
Python
120
Updated: 2 y ago
License: No License (No License)
howl by castorini
Wake word detection modeling toolkit for Firefox Voice, supporting open datasets like Speech Commands and Common Voice.
Python
120
Updated: 3 y ago
License: Weak Copyleft (MPL-2.0)
cmusphinx by cjac
CMU Sphinx - Speech Recognition Toolkit
C
120
Updated: 2 y ago
License: No License (No License)
UEAzSpeech by lucoiso
This plugin integrates Azure Speech Cognitive Services in Unreal Engine.
C++
120
Updated: 2 y ago
License: Permissive (MIT)
RNN-Transducer by HawkAaron
MXNet implementation of RNN Transducer (Graves 2012): Sequence Transduction with Recurrent Neural Networks
Python
119
Updated: 4 y ago
License: No License (No License)
gst-interpipe by RidgeRun
GStreamer plug-in for interpipeline communication
C
119
Updated: 2 y ago
License: Proprietary (Proprietary)
gocap by cugu
List your dependencies' capabilities and monitor whether updates require more capabilities.
Go
119
Updated: 3 y ago
License: Strong Copyleft (GPL-3.0)
HTML5-overview by dret
Overview of HTML5 Standardization Activities.
HTML
118
Updated: 2 y ago
License: Permissive (Unlicense)
OpenSpeech by sooftware
Open-Source Toolkit for End-to-End Speech Recognition leveraging PyTorch-Lightning and Hydra.
Python
118
Updated: 4 y ago
License: Proprietary (Proprietary)
Voice-Conversion-GAN by pritishyuvraj
Voice Conversion using CycleGANs for Non-Parallel Data
Jupyter Notebook
118
Updated: 2 y ago
License: Permissive (Unlicense)
DroidSpeech by vikramezhil
Android library for continuous speech recognition
Java
117
Updated: 4 y ago
License: Permissive (Apache-2.0)