Speech Libraries - Page 14

Tacotron by bshall

Python 104 Version:Current
License: Permissive (MIT)
Updated: 2 y ago

A PyTorch implementation of Location-Relative Attention Mechanisms For Robust Long-Form Speech Synthesis

Opus.NET by DevJohnC

C# 104 Version:Current
License: Permissive (MIT)
Updated: 2 y ago

Opus .NET Wrapper

simple-google-tts by glutanimate

Perl 104 Version:Current
License: Strong Copyleft (GPL-3.0)
Updated: 4 y ago

Use Google text-to-speech on your Linux desktop

textblob-aptagger by sloria

Python 103 Version:Current
License: Permissive (MIT)
Updated: 4 y ago

Deprecated: A fast and accurate part-of-speech tagger for TextBlob.

StyleSpeech by KevinMIN95

Python 103 Version:Current
License: Permissive (MIT)
Updated: 3 y ago

Official implementation of Meta-StyleSpeech and StyleSpeech

WaveFlow by L0SG

Jupyter Notebook 102 Version:Current
License: Permissive (BSD-3-Clause)
Updated: 4 y ago

A PyTorch implementation of "WaveFlow: A Compact Flow-based Model for Raw Audio"

ASR-for-Chinese-Pipeline by CynthiaSuwi

Python 102 Version:Current
License: No License (No License)
Updated: 4 y ago

Google Summer of Code 2018 Project: Automatic Speech Recognition for Speech-to-Text on Chinese

wavenet-for-chrome by pgmichael

TypeScript 102 Version:Current
License: Permissive (MIT)
Updated: 2 y ago

A wrapper for Google Cloud’s text-to-speech services that transforms highlighted text into high-quality, natural-sounding audio.

Speech-Tranformer-Pytorch by ZhengkunTian

Python 101 Version:Current
License: No License (No License)
Updated: 4 y ago

Seq2Seq Speech Recognition with Transformer on Mandarin Chinese

Cognitive-SpeakerRecognition-Python by microsoft

Python 101 Version:Current
License: Proprietary (Proprietary)
Updated: 4 y ago

Python SDK for the Microsoft Speaker Recognition API, part of Cognitive Services

K-Sonic by jcodeing

Java 101 Version:Current
License: Permissive (MIT)
Updated: 4 y ago

A demo for Android based on Sonic (speed, pitch, and rate). [Deprecated; see https://github.com/jcodeing/KMedia, an application-level media framework for Android.]

sonos by duncan3dc

PHP 101 Version:Current
License: Permissive (Apache-2.0)
Updated: 3 y ago

A PHP library for interacting with Sonos speakers

ff_meters by ffAudio

C++ 101 Version:Current
License: Permissive (BSD-3-Clause)
Updated: 2 y ago

Plug and play component to display LED meters for JUCE audio buffers

tacotron2-mandarin by atomicoo

Python 101 Version:Current
License: Permissive (MIT)
Updated: 4 y ago

TensorFlow implementation of Chinese/Mandarin TTS (Text-to-Speech) based on the Tacotron-2 model.

VAENAR-TTS by thuhcsi

Python 101 Version:Current
License: Permissive (MIT)
Updated: 4 y ago

The official implementation of VAENAR-TTS, a VAE-based non-autoregressive TTS model.

SummerTTS by huakunyang

C++ 101 Version:Current
License: No License (No License)
Updated: 2 y ago

SummerTTS is a standalone Chinese speech synthesis (TTS) project written in C++. It runs locally without a network connection, has almost no external dependencies, and is ready for Chinese TTS after a one-step build.

self-attention-tacotron by nii-yamagishilab

Python 100 Version:Current
License: Permissive (BSD-3-Clause)
Updated: 4 y ago

An implementation of "Investigation of enhanced Tacotron text-to-speech synthesis systems with self-attention for pitch accent language" https://arxiv.org/abs/1810.11960

audiomate by ynop

Python 100 Version:Current
License: Permissive (MIT)
Updated: 4 y ago

Python library for handling audio datasets.

Noise2Noise-audio_denoising_without_clean_training_data by madhavmk

Jupyter Notebook 100 Version:Current
License: Permissive (MIT)
Updated: 2 y ago

Source code for the paper "Speech Denoising without Clean Training Data: a Noise2Noise Approach", accepted at the INTERSPEECH 2021 conference. The paper addresses the heavy dependence of deep-learning-based audio denoising methods on clean speech data by showing that deep speech denoising networks can be trained using only noisy speech samples.

react-native-android-voice by JoaoCnh

Java 99 Version:Current
License: Permissive (MIT)
Updated: 4 y ago

react-native-android-voice is a speech-to-text library for React Native on the Android platform.

Shazam by bmoquist

Python 99 Version:Current
License: Permissive (MIT)
Updated: 2 y ago

Music Identification Program based on Shazam's methods

Android-Audio-Processing-Using-WebRTC by mail2chromium

C++ 99 Version:Current
License: No License (No License)
Updated: 2 y ago

All in all WebRTC. A complete guide to enabling rich, high-quality real-time voice communication on the Android platform. This repository covers the understanding, implementation, and documentation of WebRTC audio processing.

speech-representations by awslabs

Python 98 Version:Current
License: Permissive (Apache-2.0)
Updated: 2 y ago

Code for DeCoAR (ICASSP 2020) and BERTphone (Odyssey 2020)

speech_course by yandexdataschool

Jupyter Notebook 98 Version:Current
License: Permissive (MIT)
Updated: 2 y ago

YSDA course in Speech Processing.

audio-sync-kit by google

Python 97 Version:Current
License: Permissive (Apache-2.0)
Updated: 2 y ago

extension by wavenet-for-chrome

TypeScript 97 Version:Current
License: Permissive (MIT)
Updated: 2 y ago

A wrapper for Google Cloud’s text-to-speech services that transforms highlighted text into high-quality, natural-sounding audio.

ivector-xvector by zeroQiaoba

Shell 97 Version:Current
License: No License (No License)
Updated: 2 y ago

Extract x-vectors and i-vectors using Kaldi

MRCP-Plugin-Demo by cotinyang

C 97 Version:Current
License: No License (No License)
Updated: 4 y ago

A demo repository for a UniMRCP plugin implementation using the iFLYTEK ASR and TTS APIs

ChatGPT-OpenAI-Smart-Speaker by Olney1

Python 97 Version:Current
License: Permissive (MIT)
Updated: 2 y ago

This program uses speech recognition and text-to-speech to enable voice-driven conversations with OpenAI. The user speaks a prompt into the microphone, and the program sends the prompt to OpenAI to generate a response. The response is then converted to an audio file and played back to the user.

ElevateAIPythonSDK by NICEElevateAI

Python 97 Version:Current
License: Permissive (MIT)
Updated: 2 y ago

ElevateAI - Speech-to-text API Python SDK

deep-clustering by funcwj

Python 96 Version:Current
License: No License (No License)
Updated: 4 y ago

Deep clustering method for single-channel speech separation

BVAE-TTS by LEEYOONHYUNG

Python 96 Version:Current
License: Permissive (MIT)
Updated: 4 y ago

Official implementation of BVAE-TTS

vietTTS by NTT123

Python 96 Version:Current
License: Permissive (MIT)
Updated: 2 y ago

Vietnamese Text to Speech library

AISHELL-4 by felixfuyihui

Python 96 Version:Current
License: Permissive (Apache-2.0)
Updated: 2 y ago

cross_vc by Kyubyong

Python 95 Version:Current
License: Permissive (Apache-2.0)
Updated: 4 y ago

Cross-lingual Voice Conversion

segan-pytorch by dansuh17

Python 95 Version:Current
License: Strong Copyleft (GPL-3.0)
Updated: 4 y ago

PyTorch implementation of SEGAN (https://arxiv.org/abs/1703.09452)

openBliSSART by openBliSSART

C++ 95 Version:Current
License: Strong Copyleft (GPL-2.0)
Updated: 4 y ago

Blind Source Separation for Audio Recognition Tasks

You-Only-Speak-Once by Speaker-Identification

Jupyter Notebook 95 Version:Current
License: No License (No License)
Updated: 2 y ago

Deep learning: one-shot learning for speaker recognition using filter banks

Fre-GAN-pytorch by rishikksh20

Python 95 Version:Current
License: Permissive (MIT)
Updated: 2 y ago

Fre-GAN: Adversarial Frequency-consistent Audio Synthesis

emovoice by hcmlab

Python 94 Version:Current
License: Proprietary (Proprietary)
Updated: 4 y ago

Build your own Real-time Speech Emotion Recognizer

Speech-Transformer-tf2.0 by xingchensong

Python 94 Version:Current
License: No License (No License)
Updated: 4 y ago

Transformer for ASR systems (via TensorFlow 2.0)

acrcloud_sdk_python by acrcloud

Python 94 Version:Current
License: No License (No License)
Updated: 4 y ago

Speech.js by yyx990803

JavaScript 94 Version:Current
License: No License (No License)
Updated: 4 y ago

Simple wrapper for Chrome's web speech recognition API

web-voice-processor by Picovoice

TypeScript 94 Version:Current
License: Permissive (Apache-2.0)
Updated: 2 y ago

A library for real-time voice processing in web browsers

speaker by duncan3dc

PHP 94 Version:Current
License: Permissive (Apache-2.0)
Updated: 3 y ago

A PHP library to convert text to speech using various web services

go-wav by youpy

Go 94 Version:Current
License: Permissive (ISC)
Updated: 3 y ago

A Go library to read and write the WAVE (RIFF Waveform Audio) format

wsay by p-groarke

C++ 94 Version:Current
License: Permissive (BSD-3-Clause)
Updated: 2 y ago

Windows "say"

tacotron2-mandarin-griffin-lim by Joee1995

Python 93 Version:Current
License: Permissive (MIT)
Updated: 4 y ago

A repository for Chinese/Mandarin TTS (text-to-speech).

ruby-pocketsphinx-server by alumae

Ruby 93 Version:Current
License: Proprietary (Proprietary)
Updated: 4 y ago

Ruby-based web service for speech recognition, using the PocketSphinx GStreamer module

FFTNet by syang1993

Python 92 Version:Current
License: No License (No License)
Updated: 4 y ago

A PyTorch implementation of "FFTNet: a Real-Time Speaker-Dependent Neural Vocoder"

Open Weaver – Develop Applications Faster with Open Source

