speech-to-text-webcam-overlay by 1heisuzuki
A web page that overlays captions from Web Speech API speech recognition on webcam video.
JavaScript 283 Updated: 2 y ago License: Permissive (CC0-1.0)
VGG-Speaker-Recognition by WeidiXie
Utterance-level Aggregation For Speaker Recognition In The Wild
Python 282 Updated: 3 y ago License: No License (No License)
speech-emotion-recognition by xuanjihe
Speech emotion recognition using convolutional recurrent networks, based on IEMOCAP
Python 279 Updated: 3 y ago License: No License (No License)
PortaSpeech by keonlee9420
PyTorch Implementation of PortaSpeech: Portable and High-Quality Generative Text-to-Speech
Python 279 Updated: 2 y ago License: Permissive (MIT)
speech-resynthesis by facebookresearch
An official reimplementation of the method described in the INTERSPEECH 2021 paper "Speech Resynthesis from Discrete Disentangled Self-Supervised Representations".
Python 277 Updated: 2 y ago License: Proprietary (Proprietary)
alac by macosforge
The Apple Lossless Audio Codec (ALAC) is a lossless audio codec developed by Apple and deployed on all of its platforms and devices.
C++ 275 Updated: 2 y ago License: Permissive (Apache-2.0)
FDSoundActivatedRecorder by fulldecent
Start recording when the user speaks
Swift 275 Updated: 2 y ago License: Permissive (MIT)
huggingsound by jonatasgrosman
HuggingSound: A toolkit for speech-related tasks based on Hugging Face's tools
Python 275 Updated: 2 y ago License: Permissive (MIT)
PytorchWaveNetVocoder by kan-bayashi
WaveNet vocoder implementation in PyTorch.
Shell 272 Updated: 4 y ago License: Permissive (Apache-2.0)
chinese_text_normalization by speechio
Chinese text normalization for speech processing
Python 271 Updated: 3 y ago License: Permissive (MIT)
end2end-asr-pytorch by gentaiscool
End-to-End Automatic Speech Recognition on PyTorch
Python 270 Updated: 2 y ago License: Permissive (MIT)
speech-denoiser by lucianodato
A speech denoising LV2 plugin based on the RNNoise library
C 269 Updated: 1 y ago License: Weak Copyleft (LGPL-3.0)
alexis_speech_assistant by bradtraversy
Python speech assistant app
Python 268 Updated: 2 y ago License: No License (No License)
speak-js by mtttmpl
Text-to-Speech in JavaScript
JavaScript 267 Updated: 4 y ago License: Strong Copyleft (GPL-3.0)
speech-vad-demo by Baidu-AIP
Integrates the WebRTC VAD (voice activity detection) to split audio files
C 266 Updated: 3 y ago License: No License (No License)
speech-emotion-recognition by hkveeranki
Speaker-independent emotion recognition
Python 265 Updated: 2 y ago License: Permissive (MIT)
LSTM_PIT_Speech_Separation by aishoot
Two-talker speech separation with LSTM/BLSTM using the Permutation Invariant Training method.
Jupyter Notebook 263 Updated: 3 y ago License: No License (No License)
KoSpeech by sooftware
Open-Source Toolkit for End-to-End Korean Automatic Speech Recognition.
Python 262 Updated: 3 y ago License: Permissive (Apache-2.0)
pocketsphinx-go by xlab
CMU PocketSphinx for Golang, a lightweight speech recognition engine.
C 260 Updated: 3 y ago License: No License (No License)
attention-lvcsr by rizar
End-to-End Attention-Based Large Vocabulary Speech Recognition
Python 259 Updated: 3 y ago License: Permissive (MIT)
Speech-to-Text-Russian by SergeyShk
A project for Russian speech recognition based on pykaldi.
Python 259 Updated: 2 y ago License: No License (No License)
ZeroSpeech by bshall
VQ-VAE for Acoustic Unit Discovery and Voice Conversion
Python 258 Updated: 2 y ago License: No License (No License)
CLMR by Spijkervet
Official PyTorch implementation of Contrastive Learning of Musical Representations
Python 258 Updated: 1 y ago License: Permissive (Apache-2.0)
soft-vc by bshall
Soft speech units for voice conversion
Jupyter Notebook 258 Updated: 1 y ago License: Permissive (MIT)
zeroth by goodatlas
Kaldi-based open-source project for Korean automatic speech recognition (ASR)
Shell 257 Updated: 3 y ago License: Permissive (Apache-2.0)
StreamingSpeakerDiarization by juanmc2005
Lightweight Python library for real-time speaker diarization, implemented in PyTorch
Python 255 Updated: 2 y ago License: Permissive (MIT)
Place-Recognition-using-Autoencoders-and-NN by aqibsaeed
Place recognition with WiFi fingerprints using Autoencoders and Neural Networks
Jupyter Notebook 254 Updated: 3 y ago License: Permissive (Apache-2.0)
pysepm by schmiph2
Python implementation of performance metrics in Loizou's Speech Enhancement book
Python 252 Updated: 2 y ago License: Strong Copyleft (GPL-3.0)
tacotron_pytorch by r9y9
PyTorch implementation of Tacotron speech synthesis model.
Jupyter Notebook 252 Updated: 3 y ago License: Proprietary (Proprietary)
sherpa by k2-fsa
Speech-to-text server framework with next-gen Kaldi
Python 252 Updated: 1 y ago License: Permissive (Apache-2.0)
esp-va-sdk by espressif
Espressif's Voice Assistant SDK: Alexa, Google Voice Assistant, Google DialogFlow
C 251 Updated: 2 y ago License: Proprietary (Proprietary)
rvc-webui by ddPn08
liujing04/Retrieval-based-Voice-Conversion-WebUI reconstruction project
Python 251 Updated: 1 y ago License: Permissive (MIT)
assem-vc by mindslab-ai
Official Code for Assem-VC @ICASSP2022
Jupyter Notebook 250 Updated: 2 y ago License: Permissive (BSD-3-Clause)
pocketsphinx-ruby by watsonbox
Ruby speech recognition with Pocketsphinx
Ruby 249 Updated: 4 y ago License: Permissive (MIT)
google-tts by zlargon
Google TTS (Text-To-Speech) for node.js
JavaScript 248 Updated: 1 y ago License: Permissive (MIT)
PercepNet by jzi040941
Unofficial implementation of PercepNet: A Perceptually-Motivated Approach for Low-Complexity, Real-Time Enhancement of Fullband Speech
C++ 245 Updated: 2 y ago License: Permissive (BSD-3-Clause)
Speech-and-Text by Renovamen
Speech to text (PocketSphinx, iFlytek API, Baidu API) and text to speech (pyttsx3)
Python 243 Updated: 1 y ago License: No License (No License)
Wave-U-Net-for-Speech-Enhancement by haoxiangsnr
A PyTorch implementation of Wave-U-Net, adapted to speech enhancement.
Python 243 Updated: 2 y ago License: Permissive (MIT)
speech-javascript-sdk by watson-developer-cloud
Library for using the IBM Watson Speech to Text and Text to Speech services in web browsers.
JavaScript 243 Updated: 3 y ago License: No License (No License)
mayavoz by shahules786
PyTorch-based speech enhancement toolkit.
Python 243 Updated: 1 y ago License: Permissive (MIT)
kaldi-active-grammar by daanzu
Python Kaldi speech recognition with grammars that can be set active/inactive dynamically at decode-time
Python 240 Updated: 3 y ago License: Strong Copyleft (AGPL-3.0)
Neural-Voice-Cloning-with-Few-Samples by Sharad24
Implementation of the Baidu research paper "Neural Voice Cloning with Few Samples"
Python 240 Updated: 2 y ago License: No License (No License)
chatgpt-api-whisper-api-voice-assistant by hackingthemarkets
ChatGPT API and Whisper API tutorial: voice conversation with a therapist
Python 240 Updated: 2 y ago License: No License (No License)
warp-transducer by HawkAaron
A fast parallel implementation of RNN Transducer.
C++ 239 Updated: 3 y ago License: Permissive (Apache-2.0)
GenerSpeech by Rongjiehuang
PyTorch Implementation of GenerSpeech (NeurIPS'22): a text-to-speech model towards zero-shot style transfer of OOD custom voice.
Python 239 Updated: 1 y ago License: Permissive (MIT)
DaCiDian by aishell-foundation
DaCiDian is an open-source Mandarin Chinese lexicon for automatic speech recognition (ASR)
Python 237 Updated: 3 y ago License: No License (No License)
Maix-Speech by sipeed
Maix Speech AI lib, a fast and small speech lib running on embedded devices, including ASR, chat, TTS, etc.
Python 237 Updated: 3 y ago License: Proprietary (Proprietary)
OpenTransformer by ZhengkunTian
A No-Recurrence Sequence-to-Sequence Model for Speech Recognition
Python 236 Updated: 3 y ago License: Permissive (MIT)
speech-recognition-uk by egorsmkv
Speech Recognition for Ukrainian
Shell 236 Updated: 2 y ago License: No License (No License)
gcc-nmf by seanwood
Real-time GCC-NMF Blind Speech Separation and Enhancement
Python 235 Updated: 4 y ago License: Permissive (MIT)