Speech Libraries - Page 9

dictate.js by Kaljurand

JavaScript 195 Version: Current
License: Permissive (BSD-3-Clause)

A small JavaScript library for browser-based real-time speech recognition. It uses Recorderjs for audio capture and a WebSocket connection to the Kaldi GStreamer server for recognition.

masr by binzhouchn

Python 195 Version: Current
License: No License (No License)

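A Chinese speech recognition series: readers can use it to quickly train their own Chinese speech recognition model, or use a pretrained model directly to test the results.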

expressive_tacotron by Kyubyong

Python 194 Version: Current
License: No License (No License)

TensorFlow implementation of Expressive Tacotron

doc2audiobook by danthelion

Python 194 Version: Current
License: Permissive (MIT)

Convert text documents to high-fidelity audio(books).

Cross-Lingual-Voice-Cloning by deterministic-algorithms-lab

Jupyter Notebook 193 Version: Current
License: Permissive (BSD-3-Clause)

PyTorch implementation of Tacotron 2 with faster-than-real-time inference, modified to enable cross-lingual voice cloning.

ddsp-singing-vocoders by YatingMusic

Python 193 Version: Current
License: Strong Copyleft (AGPL-3.0)

Official implementation of SawSing (ISMIR'22)

Tacotron-pytorch by soobinseo

Python 192 Version: Current
License: Permissive (Apache-2.0)

PyTorch implementation of Tacotron

onssen by speechLabBcCuny

Python 192 Version: Current
License: Strong Copyleft (GPL-3.0)

An open-source speech separation and enhancement library

kaldi-offline-transcriber by alumae

Python 191 Version: Current
License: Proprietary (Proprietary)

Offline transcription system for Estonian using Kaldi

SpeechToText-WebSockets-Javascript by Azure-Samples

TypeScript 191 Version: Current
License: Permissive (MIT)

SDK and sample for speech recognition using WebSockets in JavaScript

pychain by YiwenShaoStephen

C++ 191 Version: Current
License: No License (No License)

PyTorch implementation of LF-MMI for End-to-end ASR

simple-ehm by morrolinux

Jupyter Notebook 191 Version: Current
License: Permissive (MIT)

A simple tool for a simple task: remove filler sounds ("ehm") from pre-recorded speeches. AI powered.

WaveRNN by mkotha

Python 190 Version: Current
License: Permissive (MIT)

A WaveRNN implementation

parrots by shibing624

Python 190 Version: Current
License: Permissive (Apache-2.0)

Automatic Speech Recognition (ASR) and Text-To-Speech (TTS) engine for Chinese, built on a speech library and easy to extend.

baby_cry_detection by giulbia

Python 189 Version: Current
License: No License (No License)

Recognition of baby cry audio signals

kaldi-onnx by XiaoMi

Python 189 Version: Current
License: Permissive (Apache-2.0)

Kaldi model converter to ONNX

pytorch-StarGAN-VC by hujinsen

Python 189 Version: Current
License: No License (No License)

Fully reproduces the StarGAN-VC paper, with stable training and better audio quality.

pytorch_xvectors by manojpamk

Python 189 Version: Current
License: Permissive (MIT)

Deep speaker embeddings in PyTorch, including x-vectors. Code used in this work: https://arxiv.org/abs/2007.16196

Listen-Attend-Spell by kaituoxu

Python 188 Version: Current
License: No License (No License)

A PyTorch implementation of Listen, Attend and Spell (LAS), an End-to-End ASR framework.

chatbot-watson-android by IBM-Cloud

Java 186 Version: Current
License: Proprietary (Proprietary)

An Android ChatBot powered by Watson Services - Assistant, Speech-to-Text and Text-to-Speech on IBM Cloud.

voicefixer_main by haoheliu

Python 186 Version: Current
License: Strong Copyleft (AGPL-3.0)

General Speech Restoration

pyannote-whisper by yinruiqing

Python 186 Version: Current
License: No License (No License)

2D-TAN by microsoft

Python 185 Version: Current
License: Proprietary (Proprietary)

AAAI'20 - Learning 2D Temporal Localization Networks for Moment Localization with Natural Language

mbelib by szechyjs

C++ 185 Version: Current
License: Proprietary (Proprietary)

P25 Phase 1 and ProVoice vocoder

Scyclone by Torsion-Audio

C++ 185 Version: Current
License: Proprietary (Proprietary)

Real-time Neural Timbre Transfer

deepspeech-server by MainRo

Python 184 Version: Current
License: Weak Copyleft (MPL-2.0)

A testing server for a speech-to-text service based on Mozilla DeepSpeech

gst-kaldi-nnet2-online by alumae

C++ 184 Version: Current
License: Permissive (Apache-2.0)

GStreamer plugin around Kaldi's online neural network decoder

transcribe-anything by zackees

Python 184 Version: Current
License: Permissive (MIT)

Input a local file or URL and this service will transcribe it using Whisper AI. Completely private and free 🤯🤯🤯

transcriber_app by davabase

Python 183 Version: Current
License: No License (No License)

Real-time speech-to-text transcription app.

Wave-U-Net-For-Speech-Enhancement by craigmacartney

Python 182 Version: Current
License: No License (No License)

Improved speech enhancement with the Wave-U-Net, a deep convolutional neural network architecture for audio source separation, implemented for the task of speech enhancement in the time-domain.

end-to-end-SLU by lorenlugosch

Python 182 Version: Current
License: Permissive (Apache-2.0)

PyTorch code for end-to-end spoken language understanding (SLU) with ASR-based transfer learning

Deep_VoiceChanger by pstuvwx

Python 181 Version: Current
License: Permissive (MIT)

A repository for building a voice changer using deep learning and related techniques

rtmonoaudio2midi by aniawsz

Python 180 Version: Current
License: Strong Copyleft (GPL-3.0)

Real-time note recognition in a monophonic audio stream

One-Shot-Voice-Cloning by CMsmartvoice

Jupyter Notebook 180 Version: Current
License: No License (No License)

One-shot voice cloning based on Unet-TTS

ASR-Audio-Data-Links by robmsmt

Shell 180 Version: Current
License: Permissive (Apache-2.0)

A list of publicly available audio data that anyone can download for ASR or other speech activities

speech-input by Daniel-Hug

JavaScript 179 Version: Current
License: No License (No License)

Simple speech input for <input>s; replaces the now-defunct x-webkit-speech attribute

pits by anonymous-pits

Python 179 Version: Current
License: Permissive (MIT)

PITS: Variational Pitch Inference for End-to-end Pitch-controllable TTS without External Pitch Predictor

VocGAN by rishikksh20

Python 178 Version: Current
License: Permissive (MIT)

VocGAN: A High-Fidelity Real-time Vocoder with a Hierarchically-nested Adversarial Network

ovos-buildroot by OpenVoiceOS

Python 177 Version: Current
License: Permissive (Apache-2.0)

Open Voice Operating System - Buildroot edition is a minimalistic Linux OS bringing the open source voice assistant Mycroft A.I. to embedded, low-spec headless and/or small (touch)screen devices.

asr-server by dialogflow

C++ 176 Version: Current
License: Permissive (Apache-2.0)

FastCGI support for Kaldi ASR

Expressive-FastSpeech2 by keonlee9420

Python 176 Version: Current
License: Proprietary (Proprietary)

PyTorch Implementation of Non-autoregressive Expressive (emotional, conversational) TTS based on FastSpeech2, supporting English, Korean, and your own languages.

malaya-speech by huseinzol05

Jupyter Notebook 176 Version: Current
License: Permissive (MIT)

Speech Toolkit for Bahasa Malaysia, https://malaya-speech.readthedocs.io/

androidspeech by mozilla

C 175 Version: Current
License: No License (No License)

An Android library module to Mozilla's Speech-To-Text services

speech-emotion-recognition by harry-7

Python 174 Version: Current
License: Permissive (MIT)

Speaker independent emotion recognition

Speaker-Identification-Python by Atul-Anand-Jha

Python 174 Version: Current
License: Weak Copyleft (LGPL-3.0)

Speaker Identification System (up to 100% accuracy); built using Python 2.7 and the python_speech_features library

cld by jtoy

C++ 174 Version: Current
License: Proprietary (Proprietary)

Compact language detection in Ruby

MBROLA by numediart

C 170 Version: Current
License: Strong Copyleft (AGPL-3.0)

MBROLA is a speech synthesizer based on the concatenation of diphones

cognitive-services-speech-sdk-js by microsoft

TypeScript 169 Version: Current
License: Proprietary (Proprietary)

Microsoft Azure Cognitive Services Speech SDK for JavaScript

speechly by speechly

TypeScript 169 Version: Current
License: Permissive (MIT)

Client libraries, examples and demos of Speechly API for the Web.

audio-SNR by Sato-Kunihiko

Python 168 Version: Current
License: No License (No License)

Mixing an audio file with a noise file at any Signal-to-Noise Ratio (SNR)
