Speech Libraries - Page 8

Polly-Samples by App-vNext

Provides sample implementations of the Polly library. The intent of this project is to help newcomers kick-start their use of Polly within their own projects.

C#

235

Updated: 2 y ago

License: Strong Copyleft (GPL-2.0)

VQ-VAE-Speech by swasun

PyTorch implementation of VQ-VAE + WaveNet by [Chorowski et al., 2019] and VQ-VAE on speech signals by [van den Oord et al., 2017]

Python

234

Updated: 2 y ago

License: Permissive (MIT)

wekws by wenet-e2e

Production First and Production Ready End-to-End Keyword Spotting Toolkit

Python

234

Updated: 2 y ago

License: Permissive (Apache-2.0)

MoeTTS by luoyily

Speech synthesis model repo for galgame characters based on Tacotron2 and HiFi-GAN

Python

234

Updated: 3 y ago

License: Permissive (BSD-3-Clause)

vocoders by openvpi

DiffSinger community vocoders release page

HTML

234

Updated: 2 y ago

License: Strong Copyleft (AGPL-3.0)

vosk-browser by ccoreilly

A speech recognition library running in the browser thanks to a WebAssembly build of Vosk

JavaScript

231

Updated: 2 y ago

License: Permissive (Apache-2.0)

nonparaSeq2seqVC_code by jxzhanggg

Implementation code of non-parallel sequence-to-sequence VC

Python

230

Updated: 2 y ago

License: Permissive (MIT)

volute by webfansplz

Raspberry Pi + Node.js = Speech Robot

JavaScript

228

Updated: 4 y ago

License: Permissive (MIT)

wavegrad by lmnt-com

A fast, high-quality neural vocoder.

Python

228

Updated: 2 y ago

License: Permissive (Apache-2.0)

K6nele by Kaljurand

An Android app that offers speech-to-text user interfaces to other apps

Java

227

Updated: 2 y ago

License: Permissive (Apache-2.0)

rnnt-speech-recognition by noahchalifour

End-to-end speech recognition using RNN Transducers in TensorFlow 2.0

Python

227

Updated: 2 y ago

License: Permissive (MIT)

p5.js-speech by IDMNYU

Web Audio Speech Synthesis / Recognition for p5.js

JavaScript

227

Updated: 3 y ago

License: Permissive (MIT)

Tacotron by barronalex

Implementation of Google's Tacotron in TensorFlow

Python

226

Updated: 4 y ago

License: No License (No License)

asr-evaluation by belambert

Python module for evaluating ASR hypotheses (e.g. word error rate, word recognition rate).

Python

223

Updated: 3 y ago

License: Permissive (Apache-2.0)

spectrographic by LeviBorodenko

Turn an image into sound whose spectrogram looks like the image.

Python

222

Updated: 2 y ago

License: Permissive (MIT)

webaudio-modem by martme

Encode and decode text using the Web Audio API to enable offline data transfer between devices.

HTML

222

Updated: 2 y ago

License: Permissive (MIT)

voice-assistant by b4rtaz

Voice assistant for Visual Studio Code.

TypeScript

222

Updated: 4 y ago

License: Permissive (MIT)

kaldiio by nttcslab-sp

A pure Python module for reading and writing Kaldi ark files

Python

220

Updated: 2 y ago

License: Proprietary (Proprietary)

edgedict by theblackcat102

Working online speech recognition based on RNN Transducer. (Trained model available in the releases.)

Python

219

Updated: 4 y ago

License: Permissive (Apache-2.0)

GAN-TTS by yanggeng1995

A PyTorch implementation of GAN-TTS: High Fidelity Speech Synthesis with Adversarial Networks

Python

218

Updated: 2 y ago

License: No License (No License)

datasets-CMU_Wilderness by festvox

CMU Wilderness Multilingual Speech Dataset

Shell

218

Updated: 4 y ago

License: No License (No License)

TTS-Cube by tiberiu44

End-2-end speech synthesis with recurrent neural networks

Python

216

Updated: 2 y ago

License: Permissive (Apache-2.0)

DNN-based_source_separation by tky823

A PyTorch implementation of DNN-based source separation.

Python

216

Updated: 2 y ago

License: No License (No License)

AutoPST by auspicious3000

Global Rhythm Style Transfer Without Text Transcriptions

Python

216

Updated: 2 y ago

License: Permissive (MIT)

multi-speaker-tacotron by nii-yamagishilab

VCTK multi-speaker tacotron for ICASSP 2020

Python

214

Updated: 4 y ago

License: Permissive (BSD-3-Clause)

my-voice-analysis by Shahabks

My-Voice Analysis is a Python library for the analysis of voice (simultaneous speech, high entropy) without the need of a transcription. It breaks utterances and detects syllable boundaries, fundamental frequency contours, and formants.

Python

213

Updated: 2 y ago

License: Permissive (MIT)

ronor by mlang

Sonos smart speaker controller API and command-line tools

Rust

213

Updated: 4 y ago

License: No License (No License)

Naomi by NaomiProject

The Naomi Project is an open source, technology agnostic platform for developing always-on, voice-controlled applications!

Python

211

Updated: 2 y ago

License: Permissive (MIT)

kaldi-lstm by dophist

C++ implementation of LSTM (Long Short-Term Memory) in Kaldi's nnet1 framework. Used for automatic speech recognition and possibly language modeling; training can be switched between CPU and GPU (CUDA). This repo has been merged into the official Kaldi codebase (Karel's setup) and is no longer maintained; please check out the Kaldi project instead.

C++

211

Updated: 2 y ago

License: No License (No License)

Dictater by Nosrac

Replacement for built-in Speech services. Supports playing, skipping, progress, and more

Swift

211

Updated: 4 y ago

License: Permissive (MIT)

MelGAN-VC by marcoppasini

MelGAN-VC: Voice Conversion and Audio Style Transfer on arbitrarily long samples using Spectrograms

Jupyter Notebook

210

Updated: 2 y ago

License: Permissive (MIT)

BigCiDian by speechio

Pronunciation lexicon covering both English and Chinese languages for Automatic Speech Recognition.

Python

209

Updated: 2 y ago

License: No License (No License)

DynamicAudioNormalizer by lordmulder

Dynamic Audio Normalizer

C++

209

Updated: 2 y ago

License: Proprietary (Proprietary)

MS-SNSD by microsoft

The Microsoft Scalable Noisy Speech Dataset (MS-SNSD) is a noisy speech dataset that can scale to arbitrary sizes depending on the number of speakers, noise types, and Speech to Noise Ratio (SNR) levels desired.

HTML

209

Updated: 4 y ago

License: Permissive (MIT)

ttslearn by r9y9

ttslearn: Library for Pythonで学ぶ音声合成 (Text-to-speech with Python)

Jupyter Notebook

209

Updated: 2 y ago

License: Permissive (MIT)

CREMA-D by CheyneyComputerScience

Crowd Sourced Emotional Multimodal Actors Dataset (CREMA-D)

R

208

Updated: 2 y ago

License: Proprietary (Proprietary)

klaam by ARBML

Arabic speech recognition, classification and text-to-speech.

Jupyter Notebook

208

Updated: 2 y ago

License: Permissive (MIT)

Speech_emotion_recognition_BLSTM by RayanWang

Bidirectional LSTM network for speech emotion recognition.

Python

207

Updated: 4 y ago

License: Permissive (MIT)

speaker-recognition-py3 by crouchred

Based on MFCC and GMM (speech recognition based on MFCC and Gaussian mixture models)

Python

206

Updated: 4 y ago

License: Permissive (Apache-2.0)

waveglow by npuichigo

A PyTorch implementation of WaveGlow: A Flow-based Generative Network for Speech Synthesis

Python

205

Updated: 4 y ago

License: Permissive (Apache-2.0)

speaker-id by google

This repository contains audio samples and supplementary materials accompanying publications by the "Speaker, Voice and Language" team at Google.

Python

205

Updated: 2 y ago

License: Permissive (Apache-2.0)

spectrology by solusipse

Encoder that turns images into audio files with corresponding spectrograms.

Python

204

Updated: 4 y ago

License: Permissive (MIT)

UniversalVocoding by bshall

A PyTorch implementation of "Robust Universal Neural Vocoding"

Python

203

Updated: 4 y ago

License: Permissive (MIT)

multimodal-speech-emotion by david-yoon

TensorFlow implementation of "Multimodal Speech Emotion Recognition using Audio and Text," IEEE SLT-18

Jupyter Notebook

203

Updated: 2 y ago

License: Permissive (MIT)

nnnoiseless by jneem

Recurrent neural network for audio noise reduction

Rust

201

Updated: 2 y ago

License: Permissive (BSD-3-Clause)

stm32-speech-recognition by gk969

Isolated-word speech recognition based on STM32

C

199

Updated: 4 y ago

License: Permissive (MIT)

html2Dash by selfboot

Generate a docset from any HTML documentation. Written in Python.

Python

198

Updated: 2 y ago

License: No License (No License)

wetts by wenet-e2e

Production First and Production Ready End-to-End Text-to-Speech Toolkit

Python

198

Updated: 2 y ago

License: Permissive (Apache-2.0)

vosk by alphacep

VOSK Speech Recognition Toolkit

C

197

Updated: 4 y ago

License: Permissive (Apache-2.0)

WGANSing by MTG

Multi-voice singing voice synthesis

Python

195

Updated: 3 y ago

License: No License (No License)

