Baidu Cloud streaming speech recognition client SDK
Audio/Video Processing Service
Text-to-speech reader and voice changer using VITS
easy-kaldi by JRMeyer
Use your data to create a speech recognition system in Kaldi. Fast.
Shell 54 Updated: 4 y ago License: Permissive (Apache-2.0)

tap-plugins by tomszilagyi
Tom's Audio Processing LADSPA plugins
C 54 Updated: 4 y ago License: Strong Copyleft (GPL-2.0)

rsrgan by wangkenpu
Robust Speech Recognition Using Generative Adversarial Networks (GAN)
Shell 54 Updated: 4 y ago License: Permissive (MIT)

iOS-Speech-To-Text by mzeeshanid
This library uses the Google Voice API and the Speex audio codec for speech-to-text on iOS.
C 54 Updated: 4 y ago License: Proprietary (Proprietary)

ASR-Wav2vec-Finetune by khanld
:zap: Fine-tune Wav2vec 2.0 for speech recognition
Python 54 Updated: 2 y ago License: No License (No License)

MAX-Speech-to-Text-Converter by IBM
Converts spoken words into text form.
Python 53 Updated: 3 y ago License: Permissive (Apache-2.0)

Co-Speech_Gesture_Generation by youngwoo-yoon
This is an implementation of "Robots learn social skills: End-to-end learning of co-speech gesture generation for humanoid robots".
Python 53 Updated: 1 y ago License: Proprietary (Proprietary)

talkie by joelpurra
Text-to-speech browser extension button. Select text on any web page, and have the computer read it out loud for you by simply clicking the Talkie button.
TypeScript 53 Updated: 2 y ago License: Strong Copyleft (GPL-3.0)

Unity-Text-to-Speech by ActiveNick
Sample app used to demonstrate the use of Microsoft Cognitive Services Text-to-Speech APIs (aka Speech Synthesis) from within Unity.
C# 53 Updated: 4 y ago License: Permissive (MIT)

EA-SVC by hhguo
An implementation of "Phonetic Posteriorgrams based Many-to-Many Singing Voice Conversion via Adversarial Training"
Python 53 Updated: 3 y ago License: Permissive (MIT)

silent_speech by dgaddy
Code for the papers "Digital Voicing of Silent Speech" at EMNLP 2020 and "An Improved Model for Voicing Silent Speech" at ACL 2021.
Python 53 Updated: 3 y ago License: No License (No License)

orangetext by hrbrmstr
🍊📄 : An #rstats project to keep track of The 🍊 One's speeches
R 53 Updated: 4 y ago License: No License (No License)

Reverb.js by burnson
Reverb.js is a Web Audio API extension for creating reverb nodes and an accompanying impulse-response reverb library.
HTML 53 Updated: 4 y ago License: Proprietary (Proprietary)

Inter-SubNet by RookieJunChen
The official PyTorch implementation of "Inter-SubNet: Speech Enhancement with Subband Interaction", accepted by ICASSP 2023.
Python 53 Updated: 2 y ago License: Permissive (Apache-2.0)

Speech_Recognition by drbinliang
A simple speech recognizer using HMMs (Python)
Python 52 Updated: 3 y ago License: No License (No License)

Speech-Accent-Recognition by yatharthgarg
Python 52 Updated: 4 y ago License: No License (No License)

pkwrap by idiap
A PyTorch wrapper for LF-MMI training and parallel training in Kaldi
Python 52 Updated: 4 y ago License: Proprietary (Proprietary)

text-to-speech-js by IonicaBizau
:v: A small JavaScript library that provides text-to-speech conversion using the tts-api.com service.
JavaScript 52 Updated: 4 y ago License: Permissive (MIT)

Unity-MS-SpeechSDK by ActiveNick
Sample Unity project used to demonstrate Speech Recognition using the new Microsoft Speech Service (Preview) via WebSockets.
C# 52 Updated: 4 y ago License: Permissive (MIT)

WG-WaveNet by BogiHsu
Real-Time High-Fidelity Speech Synthesis without GPU
Python 52 Updated: 3 y ago License: Permissive (MIT)

klangsynthese by 200sc
Waveform and Audio Synthesis library in Go
Go 52 Updated: 4 y ago License: Permissive (Apache-2.0)

multi-task-kaldi by JRMeyer
An example directory for running Multi-Task Learning training on Kaldi neural networks. In Kaldi-speak, this is an egs dir for nnet3 training.
Shell 52 Updated: 3 y ago License: Permissive (Apache-2.0)

cs224n-gpu-that-talks by akashmjn
Attention, I'm Trying to Speak: End-to-end speech synthesis (CS224n '18)
Jupyter Notebook 52 Updated: 4 y ago License: No License (No License)

VoiceGAN by Yolanda-Gao
These are the results for VoiceGAN voice transformation. The audio samples are in the folder A-AB-ABA/B-BA-BAB.
Jupyter Notebook 52 Updated: 3 y ago License: No License (No License)

spot-cpp-sdk by boston-dynamics
C++ 52 Updated: 2 y ago License: Proprietary (Proprietary)

McNet by Audio-WestlakeU
The official repo for "McNet: Fuse Multiple Cues for Multichannel Speech Enhancement", submitted to ICASSP 2023
Python 52 Updated: 2 y ago License: No License (No License)

GANSynth by skmhrk1209
TensorFlow implementation of "GANSynth: Adversarial Neural Audio Synthesis"
Python 51 Updated: 4 y ago License: No License (No License)

alex-asr by UFAL-DSG
Online decoder for Kaldi NNET2 and GMM speech recognition models with Python bindings.
Python 51 Updated: 4 y ago License: Proprietary (Proprietary)

mxnet-seq2seq by yoosan
Sequence to sequence learning with MXNet
Python 51 Updated: 4 y ago License: No License (No License)

obvi by googlecreativelab
A Polymer 3+ webcomponent / button for doing speech recognition
JavaScript 51 Updated: 4 y ago License: Permissive (Apache-2.0)

sova-tts-engine by sovaai
Tacotron2-based engine for the SOVA-TTS project
Python 51 Updated: 3 y ago License: Permissive (Apache-2.0)

voice-command by PRFTDigitalLabs
A simple no-API voice command assistant
JavaScript 51 Updated: 4 y ago License: No License (No License)

DISSC by gallilmaimon
Official repository for "Speaking Style Conversion With Discrete Self-Supervised Units". https://arxiv.org/abs/2212.09730
Python 51 Updated: 2 y ago License: Permissive (MIT)

Synthalingua by cyberofficial
Synthalingua - Real Time Translation
Python 51 Updated: 2 y ago License: Strong Copyleft (GPL-3.0)

turkish-pos-tagger by onuryilmaz
Part-of-Speech (POS) Tagger for Turkish
Python 50 Updated: 4 y ago License: Permissive (Apache-2.0)

deep_avsr by lordmartian
A PyTorch implementation of the Deep Audio-Visual Speech Recognition paper.
Python 50 Updated: 4 y ago License: Permissive (MIT)

voice-commands.js by jimmybyrum
Simple wrapper for JavaScript speech-to-text to add voice commands.
HTML 50 Updated: 4 y ago License: Permissive (MIT)

Emovox by KunZhou9646
This is the implementation of the paper "Emotion Intensity and its Control for Emotional Voice Conversion".
Python 50 Updated: 2 y ago License: No License (No License)

RuntimeSpeechRecognizer by gtreshchev
Cross-platform, real-time, offline speech recognition plugin for Unreal Engine. Based on OpenAI Whisper technology (whisper.cpp).
C++ 50 Updated: 1 y ago License: Permissive (MIT)

vocos by charactr-platform
Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis
Python 50 Updated: 1 y ago License: Permissive (MIT)

ncnn-android-mobilenetssd by nihui
The MobileNet-SSD object detection Android example
Java 49 Updated: 3 y ago License: No License (No License)

keras-sincnet by grausof
Keras (tensorflow) implementation of SincNet (Mirco Ravanelli, Yoshua Bengio - https://github.com/mravanelli/SincNet)
Python 49 Updated: 4 y ago License: No License (No License)

clari_wavenet_vocoder by HaiFengZeng
Python 49 Updated: 4 y ago License: Proprietary (Proprietary)

RNN-Transducer by sooftware
PyTorch implementation of RNN-Transducer (RNN-T).
Python 49 Updated: 2 y ago License: Permissive (Apache-2.0)

LAS_Mandarin_PyTorch by jackaduma
Listen, Attend and Spell (LAS) model and a pretrained Mandarin Chinese model (Mandarin Chinese ASR model)
Python 49 Updated: 3 y ago License: Permissive (MIT)

vo-aacenc by mstorsjo
VisualOn AAC encoder from Android
C 49 Updated: 4 y ago License: Permissive (Apache-2.0)

hey-victoria by sk89q
TeamSpeak bot w/ speech recognition (like Siri, OK Google, Cortana, etc.)
C 49 Updated: 4 y ago License: No License (No License)