Speech Libraries - Page 7

speech-to-text-webcam-overlay by 1heisuzuki

JavaScript 283 Version:Current
License: Permissive (CC0-1.0)

A web page that overlays subtitles from Web Speech API speech recognition onto webcam video

VGG-Speaker-Recognition by WeidiXie

Python 282 Version:Current
License: No License (No License)

Utterance-level Aggregation For Speaker Recognition In The Wild

speech-emotion-recognition by xuanjihe

Python 279 Version:Current
License: No License (No License)

Speech emotion recognition using convolutional recurrent networks, based on IEMOCAP

PortaSpeech by keonlee9420

Python 279 Version:Current
License: Permissive (MIT)

PyTorch Implementation of PortaSpeech: Portable and High-Quality Generative Text-to-Speech

speech-resynthesis by facebookresearch

Python 277 Version:Current
License: Proprietary (Proprietary)

An official reimplementation of the method described in the INTERSPEECH 2021 paper - Speech Resynthesis from Discrete Disentangled Self-Supervised Representations.

alac by macosforge

C++ 275 Version:Current
License: Permissive (Apache-2.0)

The Apple Lossless Audio Codec (ALAC) is a lossless audio codec developed by Apple and deployed on all of its platforms and devices.

FDSoundActivatedRecorder by fulldecent

Swift 275 Version:Current
License: Permissive (MIT)

Start recording when the user speaks

huggingsound by jonatasgrosman

Python 275 Version:Current
License: Permissive (MIT)

HuggingSound: A toolkit for speech-related tasks based on Hugging Face's tools

PytorchWaveNetVocoder by kan-bayashi

Shell 272 Version:Current
License: Permissive (Apache-2.0)

WaveNet-Vocoder implementation with pytorch.

chinese_text_normalization by speechio

Python 271 Version:Current
License: Permissive (MIT)

Chinese text normalization for speech processing

end2end-asr-pytorch by gentaiscool

Python 270 Version:Current
License: Permissive (MIT)

End-to-End Automatic Speech Recognition on PyTorch

speech-denoiser by lucianodato

C 269 Version:Current
License: Weak Copyleft (LGPL-3.0)

A speech denoising LV2 plugin based on the RNNoise library

alexis_speech_assistant by bradtraversy

Python 268 Version:Current
License: No License (No License)

Python speech assist app

speak-js by mtttmpl

JavaScript 267 Version:Current
License: Strong Copyleft (GPL-3.0)

Text-to-Speech in JavaScript

speech-vad-demo by Baidu-AIP

C 266 Version:Current
License: No License (No License)

Integrates WebRTC's VAD to split audio files

speech-emotion-recognition by hkveeranki

Python 265 Version:Current
License: Permissive (MIT)

Speaker independent emotion recognition

LSTM_PIT_Speech_Separation by aishoot

Jupyter Notebook 263 Version:Current
License: No License (No License)

Two-talker speech separation with LSTM/BLSTM using the Permutation Invariant Training method.

KoSpeech by sooftware

Python 262 Version:Current
License: Permissive (Apache-2.0)

Open-Source Toolkit for End-to-End Korean Automatic Speech Recognition.

pocketsphinx-go by xlab

C 260 Version:Current
License: No License (No License)

CMU PocketSphinx for Golang, a lightweight speech recognition engine.

attention-lvcsr by rizar

Python 259 Version:Current
License: Permissive (MIT)

End-to-End Attention-Based Large Vocabulary Speech Recognition

Speech-to-Text-Russian by SergeyShk

Python 259 Version:Current
License: No License (No License)

A project for Russian speech recognition based on pykaldi.

ZeroSpeech by bshall

Python 258 Version:Current
License: No License (No License)

VQ-VAE for Acoustic Unit Discovery and Voice Conversion

CLMR by Spijkervet

Python 258 Version:Current
License: Permissive (Apache-2.0)

Official PyTorch implementation of Contrastive Learning of Musical Representations

soft-vc by bshall

Jupyter Notebook 258 Version:Current
License: Permissive (MIT)

Soft speech units for voice conversion

zeroth by goodatlas

Shell 257 Version:Current
License: Permissive (Apache-2.0)

Kaldi-based Korean ASR (Korean speech recognition) open-source project

StreamingSpeakerDiarization by juanmc2005

Python 255 Version:Current
License: Permissive (MIT)

Lightweight Python library for real-time speaker diarization, implemented in PyTorch

Place-Recognition-using-Autoencoders-and-NN by aqibsaeed

Jupyter Notebook 254 Version:Current
License: Permissive (Apache-2.0)

Place recognition with WiFi fingerprints using Autoencoders and Neural Networks

pysepm by schmiph2

Python 252 Version:Current
License: Strong Copyleft (GPL-3.0)

Python implementation of performance metrics in Loizou's Speech Enhancement book

tacotron_pytorch by r9y9

Jupyter Notebook 252 Version:Current
License: Proprietary (Proprietary)

PyTorch implementation of Tacotron speech synthesis model.

sherpa by k2-fsa

Python 252 Version:Current
License: Permissive (Apache-2.0)

Speech-to-text server framework with next-gen Kaldi

esp-va-sdk by espressif

C 251 Version:Current
License: Proprietary (Proprietary)

Espressif's Voice Assistant SDK: Alexa, Google Voice Assistant, Google DialogFlow

rvc-webui by ddPn08

Python 251 Version:Current
License: Permissive (MIT)

liujing04/Retrieval-based-Voice-Conversion-WebUI reconstruction project

assem-vc by mindslab-ai

Jupyter Notebook 250 Version:Current
License: Permissive (BSD-3-Clause)

Official Code for Assem-VC @ICASSP2022

pocketsphinx-ruby by watsonbox

Ruby 249 Version:Current
License: Permissive (MIT)

Ruby speech recognition with Pocketsphinx

google-tts by zlargon

JavaScript 248 Version:Current
License: Permissive (MIT)

Google TTS (Text-To-Speech) for node.js

PercepNet by jzi040941

C++ 245 Version:Current
License: Permissive (BSD-3-Clause)

Unofficial implementation of PercepNet: A Perceptually-Motivated Approach for Low-Complexity, Real-Time Enhancement of Fullband Speech

Speech-and-Text by Renovamen

Python 243 Version:Current
License: No License (No License)

Speech to text (PocketSphinx, iFlytek API, Baidu API) and text to speech (pyttsx3)

Wave-U-Net-for-Speech-Enhancement by haoxiangsnr

Python 243 Version:Current
License: Permissive (MIT)

Wave-U-Net implemented in PyTorch and adapted to speech enhancement.

speech-javascript-sdk by watson-developer-cloud

JavaScript 243 Version:Current
License: No License (No License)

Library for using the IBM Watson Speech to Text and Text to Speech services in web browsers.

mayavoz by shahules786

Python 243 Version:Current
License: Permissive (MIT)

PyTorch-based speech enhancement toolkit.

kaldi-active-grammar by daanzu

Python 240 Version:Current
License: Strong Copyleft (AGPL-3.0)

Python Kaldi speech recognition with grammars that can be set active/inactive dynamically at decode-time

Neural-Voice-Cloning-with-Few-Samples by Sharad24

Python 240 Version:Current
License: No License (No License)

Implementation of the Baidu research paper "Neural Voice Cloning with Few Samples"

chatgpt-api-whisper-api-voice-assistant by hackingthemarkets

Python 240 Version:Current
License: No License (No License)

ChatGPT API and Whisper API tutorial: voice conversation with a therapist

warp-transducer by HawkAaron

C++ 239 Version:Current
License: Permissive (Apache-2.0)

A fast parallel implementation of RNN Transducer.

GenerSpeech by Rongjiehuang

Python 239 Version:Current
License: Permissive (MIT)

PyTorch Implementation of GenerSpeech (NeurIPS'22): a text-to-speech model towards zero-shot style transfer of OOD custom voice.

DaCiDian by aishell-foundation

Python 237 Version:Current
License: No License (No License)

DaCiDian is an open-source Mandarin Chinese lexicon for automatic speech recognition (ASR)

Maix-Speech by sipeed

Python 237 Version:Current
License: Proprietary (Proprietary)

Maix Speech AI lib, a fast and small speech library that runs on embedded devices, including ASR, chat, TTS, etc.

OpenTransformer by ZhengkunTian

Python 236 Version:Current
License: Permissive (MIT)

A No-Recurrence Sequence-to-Sequence Model for Speech Recognition

speech-recognition-uk by egorsmkv

Shell 236 Version:Current
License: No License (No License)

Speech Recognition for Ukrainian

gcc-nmf by seanwood

Python 235 Version:Current
License: Permissive (MIT)

Real-time GCC-NMF Blind Speech Separation and Enhancement
