Kim, your personal voice assistant.
Voice Activity Detector
FastSpeech2 by ga642381
Multi-Speaker PyTorch FastSpeech2: Fast and High-Quality End-to-End Text to Speech :fist:
Python 72 Updated: 2 y ago License: No License (No License)
webrtc-speech-to-text by rviscarra
Speech transcription in the browser using WebRTC and Google Speech
Go 72 Updated: 4 y ago License: Permissive (MIT)
SPTK by r9y9
A modified version of Speech Signal Processing Toolkit (SPTK)
C 72 Updated: 4 y ago License: Proprietary (Proprietary)
Complex-MTASSNet by Windstudent
Multi-Task Audio Source Separation, Two-Stage Model, Complex Domain.
Python 72 Updated: 2 y ago License: No License (No License)
audio-webui by gitmylo
A web UI for different audio-related neural networks
Python 72 Updated: 1 y ago License: Permissive (MIT)
Utterance by benbahrenburg
Utterance lets you use the platform's native text-to-speech engine within your Titanium project
Java 71 Updated: 5 y ago License: No License (No License)
voice-filter by funcwj
An unofficial PyTorch implementation of Google's VoiceFilter
Python 71 Updated: 3 y ago License: No License (No License)
ppg_vc by ryokamoi
Implementation of a voice conversion system utilizing phonetic posteriorgrams (status: archive)
Python 71 Updated: 4 y ago License: No License (No License)
STL by openitu
The ITU-T Software Tool Library (G.191)
C 71 Updated: 2 y ago License: Proprietary (Proprietary)
Voice_Recognition_Control_Robot by wwptrdudu
A voice-controlled, voice-chat smart robot on the Raspberry Pi, built with development libraries such as the Raspberry Pi's wiringPi, iFLYTEK, Turing Robot, and ALSA.
C 71 Updated: 3 y ago License: No License (No License)
is-angular-ivy-ready by benbraou
Is Angular Ivy Ready website
TypeScript 71 Updated: 4 y ago License: Permissive (MIT)
simple_diarizer by cvqluu
Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code
Python 71 Updated: 2 y ago License: Strong Copyleft (GPL-3.0)
speechy by chrisenytc
A speech recognition API service to decode audio to text
JavaScript 70 Updated: 4 y ago License: Permissive (MIT)
react-client by speechly
A React client library for the Speechly API
TypeScript 70 Updated: 3 y ago License: Permissive (MIT)
kaldi-yesno-tutorial by keighrim
Tutorial on Kaldi for Brandeis ASR course
Shell 70 Updated: 4 y ago License: Permissive (Apache-2.0)
vosk-asterisk by alphacep
Speech Recognition in Asterisk with Vosk Server
C 70 Updated: 2 y ago License: Strong Copyleft (GPL-2.0)
glados-tts by R2D2FISH
A GLaDOS TTS, using Forward Tacotron and HiFiGAN. Inference is fast and stable, even on the CPU. A low quality vocoder model is included for mobile use. Rudimentary TTS script included. Works perfectly on Linux, partially on Maybe someone smarter than me can make a GUI.
Python 70 Updated: 2 y ago License: Permissive (MIT)
kiwix-build by kiwix
Kiwix & openZIM build engine
Python 69 Updated: 2 y ago License: Strong Copyleft (GPL-3.0)
PyTSMod by KAIST-MACLab
An open-source Python library for audio time-scale modification.
Python 69 Updated: 3 y ago License: Strong Copyleft (GPL-3.0)
gtn_applications by facebookresearch
Applications using the GTN library and code to reproduce experiments in "Differentiable Weighted Finite-State Transducers"
Python 69 Updated: 2 y ago License: Permissive (MIT)
Jarvis-Assisant by Dipeshpal
Jarvis Assisant is a virtual assistant for basic tasks
Python 69 Updated: 3 y ago License: No License (No License)
aps by funcwj
A personal toolkit for single/multi-channel speech recognition & enhancement & separation.
Python 69 Updated: 3 y ago License: Permissive (Apache-2.0)
opendcd by opendcd
Open Source WFST-based Decoder Toolkit
C++ 69 Updated: 4 y ago License: Permissive (Apache-2.0)
GCRN-complex by JupiterEthan
Python 69 Updated: 2 y ago License: No License (No License)
SpeechSynthesisPlugin by macdonst
W3C Web Speech API - Speech synthesis plugin for PhoneGap
Java 68 Updated: 4 y ago License: Permissive (MIT)
kaggle_speech_recognition by huschen
Conv-LSTM-CTC speech recognition network (end-to-end), written in TensorFlow.
Python 68 Updated: 3 y ago License: Permissive (MIT)
dctts-pytorch by chaiyujin
A PyTorch implementation of DC-TTS
Python 68 Updated: 4 y ago License: Permissive (MIT)
speech-synthesis by janantala
Speech Synthesis polyfill
JavaScript 68 Updated: 4 y ago License: Permissive (MIT)
SpeechEnhancement by tech-podcasts
Speech Enhancement Tools Based on Deep Learning
Go 68 Updated: 2 y ago License: Strong Copyleft (AGPL-3.0)
Speech_AI by LetsPlayNow
Simple speech linguistic AI with Python
Python 67 Updated: 3 y ago License: No License (No License)
vae_tacotron2 by rishikksh20
VAE Tacotron 2, an alternative to GST Tacotron
Python 67 Updated: 4 y ago License: Permissive (MIT)
Speech-Transformer by foamliu
PyTorch re-implementation of Speech-Transformer
Python 67 Updated: 4 y ago License: Permissive (MIT)
gendy by abbernie
A Web Audio stochastic synthesis module
JavaScript 67 Updated: 4 y ago License: No License (No License)
VoiceSens by bedangSen
A Voice Biometric Application using Watson Speech to Text
JavaScript 67 Updated: 2 y ago License: Permissive (Apache-2.0)
AdaSpeech2 by rishikksh20
AdaSpeech 2: Adaptive Text to Speech with Untranscribed Data
Jupyter Notebook 67 Updated: 2 y ago License: Permissive (MIT)
K-wav2vec by JoungheeKim
Python 67 Updated: 2 y ago License: Permissive (Apache-2.0)
UnityTTS by voxell-tech
Text to Speech in Unity.
C# 67 Updated: 1 y ago License: Permissive (Apache-2.0)
NS2VC by adelacvg
Unofficial implementation of NaturalSpeech2 for Voice Conversion
Python 67 Updated: 1 y ago License: No License (No License)
GCommandsPytorch by adiyoss
ConvNets for Audio Recognition using Google Commands Dataset
Python 66 Updated: 4 y ago License: No License (No License)
Cortana-Skills-Samples-Build-2017 by microsoft
This is a collection of Cortana Skills Kit code samples from the Microsoft Build 2017 conference.
C# 66 Updated: 3 y ago License: Permissive (MIT)
Speech_Feature_Extraction by aishoot
Feature extraction from the speech signal is the initial stage of any speech recognition system.
Python 66 Updated: 3 y ago License: No License (No License)
voiceRecognition by katchsvartanian
Project uses the Google Speech API to transcribe sound files (.flac) and play them
C 66 Updated: 4 y ago License: No License (No License)
Neural-Speech-Dereverberation by DiegoLeon96
Machine and Deep Learning models for speech dereverberation
Python 66 Updated: 2 y ago License: Strong Copyleft (GPL-3.0)
FastPitchFormant by keonlee9420
PyTorch Implementation of NCSOFT's FastPitchFormant: Source-filter based Decomposed Modeling for Speech Synthesis
Python 66 Updated: 2 y ago License: Permissive (MIT)
WearableIntelligenceSystem by emexlabs
Wearable computing software framework for intelligence augmentation research and applications. Easily build smart glasses apps, relying on built-in voice command, speech recognition, computer vision, UI, sensors, smartphone connection, NLP, facial recognition, database, cloud connection, and more. This repo is in beta.
C++ 66 Updated: 1 y ago License: Permissive (MIT)
WatBot by VidyasagarMSC
An Android ChatBot powered by IBM Watson Services (Assistant V1, Text-to-Speech, and Speech-to-Text with Speaker Recognition) on IBM Cloud.
Java 65 Updated: 4 y ago License: Proprietary (Proprietary)
jejueo by kakaobrain
Jejueo Datasets for Machine Translation and Speech Synthesis
Python 65 Updated: 4 y ago License: Permissive (Apache-2.0)
speech-android-sdk by watson-developer-cloud
DEPRECATED - Please use https://github.com/watson-developer-cloud/android-sdk
Java 65 Updated: 4 y ago License: Permissive (Apache-2.0)