P2P VoIP in Unity
Audio File Library
KontinuousSpeechRecognizer by StephenVinouze
A Kotlin Speech Recognizer that runs continuously and is triggered with an activation keyword
Kotlin 132 Updated: 2 y ago License: Permissive (Apache-2.0)
graph-based-code-modelling by microsoft
Code for "Generative Code Modeling with Graphs" (ICLR'19)
C# 131 Updated: 4 y ago License: Permissive (MIT)
pytorch-dc-tts by tugstugi
Text to Speech with PyTorch (English and Mongolian)
Jupyter Notebook 131 Updated: 3 y ago License: Permissive (MIT)
mongolian-nlp by tugstugi
Useful resources for Mongolian NLP
Jupyter Notebook 131 Updated: 2 y ago License: No License (No License)
TDNN by cvqluu
Time delay neural network (TDNN) implementation in PyTorch using the unfold method
Python 130 Updated: 3 y ago License: No License (No License)
vae-npvc by JeremyCCHsu
Re-implementation of the code used in Voice Conversion from Non-parallel Corpora Using Variational Auto-encoder
Python 130 Updated: 4 y ago License: Proprietary (Proprietary)
Chinese-speech-to-text by liangstein
Chinese Speech To Text Using Wavenet
Python 129 Updated: 4 y ago License: Permissive (Apache-2.0)
openspeech by sooftware
Open-Source Toolkit for End-to-End Speech Recognition leveraging PyTorch-Lightning and Hydra.
Python 129 Updated: 3 y ago License: Proprietary (Proprietary)
FastSpeech2 by rishikksh20
PyTorch Implementation of FastSpeech 2: Fast and High-Quality End-to-End Text to Speech
Jupyter Notebook 129 Updated: 3 y ago License: Permissive (Apache-2.0)
ukrainian-tts by robinhad
Ukrainian TTS (text-to-speech) using ESPNET
Jupyter Notebook 129 Updated: 2 y ago License: Permissive (MIT)
Multi-Singer by Rongjiehuang
PyTorch Implementation of Multi-Singer (ACM-MM'21)
Python 129 Updated: 2 y ago License: Permissive (MIT)
VI-SVS by PlayVoice
Use VITS and Opencpop to develop singing voice synthesis; Different from VISinger.
Python 129 Updated: 1 y ago License: Permissive (Apache-2.0)
ei-keyword-spotting by ShawnHymel
C 128 Updated: 2 y ago License: No License (No License)
tensorflow-ctc-speech-recognition by philipperemy
Application of Connectionist Temporal Classification (CTC) for Speech Recognition (Tensorflow 1.0 but compatible with 2.0).
Python 127 Updated: 4 y ago License: Permissive (Apache-2.0)
CNN-for-single-channel-speech-enhancement by zhr1201
Convolutional neural nets for single channel speech enhancement
Python 126 Updated: 4 y ago License: No License (No License)
sova-tts by sovaai
Python 126 Updated: 2 y ago License: Permissive (Apache-2.0)
Crystal by thuhcsi
Crystal - C++ implementation of a unified framework for multilingual TTS synthesis engine with SSML specification as interface.
C++ 126 Updated: 3 y ago License: Permissive (Apache-2.0)
bark-voice-cloning-HuBERT-quantizer by gitmylo
The code for the bark-voicecloning model. Training and inference.
Python 126 Updated: 1 y ago License: Permissive (MIT)
fdndlp by helianvine
A speech dereverberation algorithm, also called wpe
Python 125 Updated: 2 y ago License: Permissive (MIT)
Tacotron-Wavenet-Vocoder-Korean by hccho2
Tacotron, Korean, Wavenet-Vocoder, Korean TTS
Python 125 Updated: 3 y ago License: Permissive (MIT)
Saiy-PS by brandall76
Saiy Android Play Services dependencies
Java 125 Updated: 4 y ago License: Strong Copyleft (AGPL-3.0)
quran-align by cpfair
Word-accurate timestamps for Qur'anic audio.
C++ 125 Updated: 3 y ago License: Permissive (MIT)
keras-kaldi by dspavankumar
Keras Interface for Kaldi ASR
Python 124 Updated: 4 y ago License: Strong Copyleft (GPL-3.0)
at16k by at16k
Trained models for automatic speech recognition (ASR). A library to quickly build applications that require speech to text conversion.
Python 124 Updated: 4 y ago License: Permissive (MIT)
STYLER by keonlee9420
Official repository of STYLER: Style Factor Modeling with Rapidity and Robustness via Speech Decomposition for Expressive and Controllable Neural Text to Speech, INTERSPEECH 2021
Python 124 Updated: 2 y ago License: Permissive (MIT)
Twist by DCubix
Twist - node-based audio synthesizer
C 124 Updated: 2 y ago License: No License (No License)
python-google-speech-scripts by jeysonmc
Simple scripts to interact with Google's speech services
Python 123 Updated: 4 y ago License: Proprietary (Proprietary)
scribe by VikParuchuri
Simple speech recognition using your microphone.
Python 123 Updated: 5 y ago License: No License (No License)
end-to-end-lipreading by mpc001
PyTorch code for End-to-End Audiovisual Speech Recognition
Python 123 Updated: 3 y ago License: No License (No License)
fac-via-ppg by guanlongzhao
Foreign Accent Conversion by Synthesizing Speech from Phonetic Posteriorgrams (Interspeech'19)
Python 123 Updated: 1 y ago License: Permissive (Apache-2.0)
DurIAN by ivanvovk
Implementation of "Duration Informed Attention Network for Multimodal Synthesis" (https://arxiv.org/pdf/1909.01700.pdf) paper.
Python 123 Updated: 3 y ago License: Permissive (BSD-3-Clause)
3D-Speaker by alibaba-damo-academy
A repository for single- and multi-modal speaker verification, speaker recognition, and speaker diarization.
Python 123 Updated: 1 y ago License: Permissive (Apache-2.0)
phasen by huyanxin
An unofficial PyTorch implementation of Microsoft's PHASEN
Python 122 Updated: 3 y ago License: No License (No License)
MTFAA-Net by echocatzh
Multi-Scale Temporal Frequency Convolutional Network With Axial Attention for Speech Enhancement
Python 122 Updated: 1 y ago License: Permissive (MIT)
penn by interactiveaudiolab
Pitch Estimating Neural Networks (PENN)
Python 122 Updated: 2 y ago License: Permissive (MIT)
beamformers by Enny1991
Easy-to-use beamformers for multi-channel speech separation/enhancement
Python 121 Updated: 1 y ago License: Permissive (MIT)
SpleeterRT by james34602
Real-time monaural source separation based on a fully convolutional neural network operating in the time-frequency domain.
C 121 Updated: 2 y ago License: Strong Copyleft (GPL-3.0)
mongolian-speech-recognition by tugstugi
Mongolian speech recognition with PyTorch
Python 120 Updated: 2 y ago License: No License (No License)
howl by castorini
Wake word detection modeling toolkit for Firefox Voice, supporting open datasets like Speech Commands and Common Voice.
Python 120 Updated: 3 y ago License: Weak Copyleft (MPL-2.0)
cmusphinx by cjac
CMU Sphinx - Speech Recognition Toolkit
C 120 Updated: 2 y ago License: No License (No License)
UEAzSpeech by lucoiso
This plugin integrates Azure Speech Cognitive Services in Unreal Engine.
C++ 120 Updated: 1 y ago License: Permissive (MIT)
RNN-Transducer by HawkAaron
MXNet implementation of RNN Transducer (Graves 2012): Sequence Transduction with Recurrent Neural Networks
Python 119 Updated: 3 y ago License: No License (No License)
gst-interpipe by RidgeRun
GStreamer plug-in for interpipeline communication
C 119 Updated: 2 y ago License: Proprietary (Proprietary)
gocap by cugu
List your dependencies' capabilities and monitor whether updates require more capabilities.
Go 119 Updated: 3 y ago License: Strong Copyleft (GPL-3.0)
HTML5-overview by dret
Overview of HTML5 Standardization Activities.
HTML 118 Updated: 2 y ago License: Permissive (Unlicense)
OpenSpeech by sooftware
Open-Source Toolkit for End-to-End Speech Recognition leveraging PyTorch-Lightning and Hydra.
Python 118 Updated: 3 y ago License: Proprietary (Proprietary)
Voice-Conversion-GAN by pritishyuvraj
Voice Conversion using Cycle GANs for Non-Parallel Data
Jupyter Notebook 118 Updated: 2 y ago License: Permissive (Unlicense)
DroidSpeech by vikramezhil
Android library for continuous speech recognition
Java 117 Updated: 3 y ago License: Permissive (Apache-2.0)