Opus .NET Wrapper
Windows "say"
Tacotron by bshall
A PyTorch implementation of Location-Relative Attention Mechanisms For Robust Long-Form Speech Synthesis
Python 104 Updated: 2 y ago License: Permissive (MIT)

simple-google-tts by glutanimate
Use Google text-to-speech on your Linux desktop
Perl 104 Updated: 4 y ago License: Strong Copyleft (GPL-3.0)

textblob-aptagger by sloria
*Deprecated* A fast and accurate part-of-speech tagger for TextBlob.
Python 103 Updated: 4 y ago License: Permissive (MIT)

StyleSpeech by KevinMIN95
Official implementation of Meta-StyleSpeech and StyleSpeech
Python 103 Updated: 3 y ago License: Permissive (MIT)

WaveFlow by L0SG
A PyTorch implementation of "WaveFlow: A Compact Flow-based Model for Raw Audio"
Jupyter Notebook 102 Updated: 3 y ago License: Permissive (BSD-3-Clause)

ASR-for-Chinese-Pipeline by CynthiaSuwi
Google Summer of Code 2018 Project: Automatic Speech Recognition for Speech-to-Text on Chinese
Python 102 Updated: 4 y ago License: No License

wavenet-for-chrome by pgmichael
A wrapper for Google Cloud's text-to-speech services that transforms highlighted text into high-quality, natural-sounding audio.
TypeScript 102 Updated: 2 y ago License: Permissive (MIT)

Speech-Tranformer-Pytorch by ZhengkunTian
Seq2Seq Speech Recognition with Transformer on Mandarin Chinese
Python 101 Updated: 4 y ago License: No License

Cognitive-SpeakerRecognition-Python by microsoft
Python SDK for the Microsoft Speaker Recognition API, part of Cognitive Services
Python 101 Updated: 4 y ago License: Proprietary

K-Sonic by jcodeing
A demo for Android based on Sonic (speed, pitch, and rate). [Deprecated. See https://github.com/jcodeing/KMedia, an application-level media framework for Android.]
Java 101 Updated: 4 y ago License: Permissive (MIT)

sonos by duncan3dc
A PHP library for interacting with Sonos speakers
PHP 101 Updated: 3 y ago License: Permissive (Apache-2.0)

ff_meters by ffAudio
Plug-and-play component to display LED meters for JUCE audio buffers
C++ 101 Updated: 2 y ago License: Permissive (BSD-3-Clause)

tacotron2-mandarin by atomicoo
TensorFlow implementation of Chinese/Mandarin TTS (Text-to-Speech) based on the Tacotron-2 model.
Python 101 Updated: 4 y ago License: Permissive (MIT)

VAENAR-TTS by thuhcsi
The official implementation of VAENAR-TTS, a VAE-based non-autoregressive TTS model.
Python 101 Updated: 3 y ago License: Permissive (MIT)

SummerTTS by huakunyang
SummerTTS is a standalone Chinese speech synthesis (TTS) project written in C++. It compiles independently, runs locally without a network connection, has almost no additional dependencies, and is ready for Chinese speech synthesis after a one-step build.
C++ 101 Updated: 1 y ago License: No License

self-attention-tacotron by nii-yamagishilab
An implementation of "Investigation of enhanced Tacotron text-to-speech synthesis systems with self-attention for pitch accent language" https://arxiv.org/abs/1810.11960
Python 100 Updated: 3 y ago License: Permissive (BSD-3-Clause)

audiomate by ynop
Python library for handling audio datasets.
Python 100 Updated: 4 y ago License: Permissive (MIT)

Noise2Noise-audio_denoising_without_clean_training_data by madhavmk
Source code for the paper "Speech Denoising without Clean Training Data: a Noise2Noise Approach", accepted at the INTERSPEECH 2021 conference. The paper addresses the heavy dependence of deep-learning-based audio denoising methods on clean speech data by showing that deep speech denoising networks can be trained using only noisy speech samples.
Jupyter Notebook 100 Updated: 2 y ago License: Permissive (MIT)

react-native-android-voice by JoaoCnh
react-native-android-voice is a speech-to-text library for React Native on the Android platform.
Java 99 Updated: 4 y ago License: Permissive (MIT)

Shazam by bmoquist
Music identification program based on Shazam's methods
Python 99 Updated: 2 y ago License: Permissive (MIT)

Android-Audio-Processing-Using-WebRTC by mail2chromium
A complete guide to enabling rich, high-quality **Real-Time Voice Communication** on the Android platform with WebRTC. This repository covers the concepts, implementation, and documentation of WebRTC audio processing.
C++ 99 Updated: 2 y ago License: No License

speech-representations by awslabs
Code for DeCoAR (ICASSP 2020) and BERTphone (Odyssey 2020)
Python 98 Updated: 2 y ago License: Permissive (Apache-2.0)

speech_course by yandexdataschool
YSDA course in Speech Processing.
Jupyter Notebook 98 Updated: 2 y ago License: Permissive (MIT)

audio-sync-kit by google
Python 97 Updated: 2 y ago License: Permissive (Apache-2.0)

extension by wavenet-for-chrome
A wrapper for Google Cloud's text-to-speech services that transforms highlighted text into high-quality, natural-sounding audio.
TypeScript 97 Updated: 2 y ago License: Permissive (MIT)

ivector-xvector by zeroQiaoba
Extract x-vectors and i-vectors with Kaldi
Shell 97 Updated: 2 y ago License: No License

MRCP-Plugin-Demo by cotinyang
A demo repository for a UniMRCP plugin implementation using the iFLYTEK ASR & TTS API
C 97 Updated: 3 y ago License: No License

ChatGPT-OpenAI-Smart-Speaker by Olney1
This program uses speech recognition and text-to-speech to enable voice-driven conversations with OpenAI. The user speaks a prompt into the microphone, and the program sends the prompt to OpenAI to generate a response. The response is then converted to an audio file and played back to the user.
Python 97 Updated: 1 y ago License: Permissive (MIT)

ElevateAIPythonSDK by NICEElevateAI
ElevateAI - Speech-to-text API Python SDK
Python 97 Updated: 2 y ago License: Permissive (MIT)

deep-clustering by funcwj
Deep clustering method for single-channel speech separation
Python 96 Updated: 4 y ago License: No License

BVAE-TTS by LEEYOONHYUNG
Official implementation of BVAE-TTS
Python 96 Updated: 4 y ago License: Permissive (MIT)

vietTTS by NTT123
Vietnamese text-to-speech library
Python 96 Updated: 2 y ago License: Permissive (MIT)

AISHELL-4 by felixfuyihui
Python 96 Updated: 2 y ago License: Permissive (Apache-2.0)

cross_vc by Kyubyong
Cross-lingual voice conversion
Python 95 Updated: 3 y ago License: Permissive (Apache-2.0)

segan-pytorch by dansuh17
SEGAN PyTorch implementation (https://arxiv.org/abs/1703.09452)
Python 95 Updated: 4 y ago License: Strong Copyleft (GPL-3.0)

openBliSSART by openBliSSART
Blind Source Separation for Audio Recognition Tasks
C++ 95 Updated: 4 y ago License: Strong Copyleft (GPL-2.0)

You-Only-Speak-Once by Speaker-Identification
Deep learning: one-shot learning for speaker recognition using filter banks
Jupyter Notebook 95 Updated: 2 y ago License: No License

Fre-GAN-pytorch by rishikksh20
Fre-GAN: Adversarial Frequency-consistent Audio Synthesis
Python 95 Updated: 2 y ago License: Permissive (MIT)

emovoice by hcmlab
Build your own real-time speech emotion recognizer
Python 94 Updated: 4 y ago License: Proprietary

Speech-Transformer-tf2.0 by xingchensong
Transformer for ASR systems (via TensorFlow 2.0)
Python 94 Updated: 4 y ago License: No License

acrcloud_sdk_python by acrcloud
Python 94 Updated: 3 y ago License: No License

Speech.js by yyx990803
Simple wrapper for Chrome's Web Speech recognition API
JavaScript 94 Updated: 4 y ago License: No License

web-voice-processor by Picovoice
A library for real-time voice processing in web browsers
TypeScript 94 Updated: 1 y ago License: Permissive (Apache-2.0)

speaker by duncan3dc
A PHP library to convert text to speech using various web services
PHP 94 Updated: 3 y ago License: Permissive (Apache-2.0)

go-wav by youpy
A Go library to read and write the WAVE (RIFF Waveform Audio) format
Go 94 Updated: 3 y ago License: Permissive (ISC)

tacotron2-mandarin-griffin-lim by Joee1995
A repository for Chinese/Mandarin TTS (text-to-speech).
Python 93 Updated: 4 y ago License: Permissive (MIT)

ruby-pocketsphinx-server by alumae
Ruby-based web service for speech recognition, using the PocketSphinx GStreamer module
Ruby 93 Updated: 4 y ago License: Proprietary

FFTNet by syang1993
A PyTorch implementation of FFTNet: a Real-Time Speaker-Dependent Neural Vocoder
Python 92 Updated: 4 y ago License: No License