Problem Agnostic Speech Encoder
SpeechKITT by TalAter
🗣 A flexible GUI for Speech Recognition
JavaScript
460
Updated: 3 y ago
License: Permissive (MIT)
gstreamer-rs by sdroege
GStreamer bindings for Rust - This repository moved to https://gitlab.freedesktop.org/gstreamer/gstreamer-rs
Rust
458
Updated: 2 y ago
License: Proprietary (Proprietary)
tone-analyzer-nodejs by watson-developer-cloud
Sample Node.js Application for the IBM Tone Analyzer Service
CSS
454
Updated: 3 y ago
License: Permissive (Apache-2.0)
uSpeech by arjo129
Speech recognition toolkit for the arduino
C++
453
Updated: 4 y ago
License: Permissive (MIT)
xVA-Synth by DanRuta
Machine learning based speech synthesis Electron app, with voices from specific characters from video games
JavaScript
452
Updated: 2 y ago
License: Strong Copyleft (GPL-3.0)
SpeechRecognitionView by zagum
"Google Now" style animation for Speech Recognizer.
Java
450
Updated: 4 y ago
License: Permissive (Apache-2.0)
react-speech-recognition by JamesBrill
💬Speech recognition for your React app
JavaScript
448
Updated: 2 y ago
License: Permissive (MIT)
MMT by yxgeee
[ICLR-2020] Mutual Mean-Teaching: Pseudo Label Refinery for Unsupervised Domain Adaptation on Person Re-identification.
Python
442
Updated: 2 y ago
License: Permissive (MIT)
project_news_alan_ai by adrianhajdin
In this video, we're going to build a Conversational Voice Controlled React News Application using Alan AI. Alan AI is a revolutionary speech recognition software that allows you to add voice capabilities to your applications.
JavaScript
440
Updated: 2 y ago
License: No License (No License)
sudo-prompt by jorangreef
Run a command using sudo, prompting the user with an OS dialog if necessary.
JavaScript
438
Updated: 2 y ago
License: Permissive (MIT)
audioread by beetbox
cross-library (GStreamer + Core Audio + MAD + FFmpeg) audio decoding for Python
Python
436
Updated: 2 y ago
License: Permissive (MIT)
android-speech by gotev
Android speech recognition and text to speech made easy
Java
434
Updated: 2 y ago
License: Permissive (Apache-2.0)
TensorflowASR by Z-yq
A project dedicated to bringing CPU/on-device model performance close to that of GPU models, with a real-time factor (RTF) below 0.1 on CPU
Python
428
Updated: 2 y ago
License: Permissive (Apache-2.0)
bumblebee by jaxcore
Jaxcore Bumblebee - a JavaScript voice application framework
JavaScript
427
Updated: 3 y ago
License: Permissive (MIT)
Palaver by JamezQ
Linux Speech Recognition
Python
425
Updated: 4 y ago
License: Strong Copyleft (GPL-3.0)
FullSubNet by Audio-WestlakeU
PyTorch implementation of "FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement."
Python
425
Updated: 2 y ago
License: Permissive (MIT)
knausj_talon by knausj85
Config for talon for Mac, Windows and Linux. Very much in progress.
Python
424
Updated: 2 y ago
License: Permissive (MIT)
dialog by sqweek
Simple cross-platform dialog API for go-lang
Go
420
Updated: 2 y ago
License: Permissive (ISC)
elevenlabs-python by elevenlabs
The official Python API for ElevenLabs text-to-speech.
Python
419
Updated: 2 y ago
License: No License (No License)
lora-svc by PlayVoice
singing voice change based on whisper, and lora for singing voice clone
Python
418
Updated: 2 y ago
License: Permissive (MIT)
adaptive_voice_conversion by jjery2243542
Python
414
Updated: 2 y ago
License: Permissive (Apache-2.0)
voxpopuli by facebookresearch
A large-scale multilingual speech corpus for representation learning, semi-supervised learning and interpretation
Python
407
Updated: 2 y ago
License: Proprietary (Proprietary)
css10 by Kyubyong
CSS10: A Collection of Single Speaker Speech Datasets for 10 Languages
HTML
407
Updated: 2 y ago
License: Permissive (Apache-2.0)
hey-athena-client by rcbyron
Your personal voice assistant
Python
406
Updated: 2 y ago
License: Permissive (MIT)
nara_wpe by fgnt
Different implementations of "Weighted Prediction Error" for speech dereverberation
Python
405
Updated: 2 y ago
License: Permissive (MIT)
allosaurus by xinjli
Allosaurus is a pretrained universal phone recognizer for more than 2000 languages
Python
405
Updated: 2 y ago
License: Strong Copyleft (GPL-3.0)
zamia-speech by gooofy
Open tools and data for cloudless automatic speech recognition
Python
397
Updated: 4 y ago
License: Weak Copyleft (LGPL-3.0)
Voice-Converter-CycleGAN by leimao
Voice Converter Using CycleGAN and Non-Parallel Data
Python
396
Updated: 4 y ago
License: Permissive (MIT)
PESQ by ludlows
PESQ (Perceptual Evaluation of Speech Quality) Wrapper for Python Users (narrow band and wide band)
C
395
Updated: 2 y ago
License: Permissive (MIT)
Phonetisaurus by AdolfVonKleist
Phonetisaurus G2P
Shell
393
Updated: 2 y ago
License: Permissive (BSD-3-Clause)
NISQA by gabrielmittag
NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment
Python
392
Updated: 2 y ago
License: Permissive (MIT)
Neural-Voice-Cloning-With-Few-Samples by SforAiDl
This repository has implementation for "Neural Voice Cloning With Few Samples"
Python
391
Updated: 2 y ago
License: Permissive (MIT)
musig by sfluor
A shazam like tool to store songs fingerprints and retrieve them
Go
390
Updated: 4 y ago
License: Permissive (MIT)
Speech-Backbones by huawei-noah
This is the main repository of open-sourced speech technology by Huawei Noah's Ark Lab.
Jupyter Notebook
388
Updated: 2 y ago
License: No License (No License)
nnmnkwii by r9y9
Library to build speech synthesis systems designed for easy and fast prototyping.
Python
382
Updated: 2 y ago
License: Proprietary (Proprietary)
picovoice by Picovoice
On-device voice assistant platform powered by deep learning
Python
377
Updated: 2 y ago
License: Permissive (Apache-2.0)
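MASR by yeyupiaoling
A PyTorch framework for streaming and non-streaming automatic speech recognition, compatible with both online and offline recognition. It currently supports the Conformer, Squeezeformer, and DeepSpeech2 models and multiple data augmentation methods.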
Python
372
Updated: 2 y ago
License: Permissive (Apache-2.0)
esp-skainet by espressif
Espressif intelligent voice assistant
C
372
Updated: 2 y ago
License: Proprietary (Proprietary)
dragonfly by t4ngo
ARCHIVED! - Speech recognition framework allowing powerful Python-based scripting and extension of Dragon NaturallySpeaking (DNS) and Windows Speech Recognition (WSR)
Python
364
Updated: 4 y ago
License: Weak Copyleft (LGPL-3.0)
StarGANv2-VC by yl4579
StarGANv2-VC: A Diverse, Unsupervised, Non-parallel Framework for Natural-Sounding Voice Conversion
Python
364
Updated: 2 y ago
License: Permissive (MIT)
pocketsphinx-python by bambocher
Python interface to CMU Sphinxbase and Pocketsphinx libraries
Python
363
Updated: 2 y ago
License: Proprietary (Proprietary)
time by golang
[mirror] Go supplementary time packages
Go
363
Updated: 2 y ago
License: Permissive (BSD-3-Clause)
echoprint-server by spotify
Server for the Echoprint audio fingerprint system
C
363
Updated: 4 y ago
License: Permissive (Apache-2.0)
dla by markovka17
Deep learning for audio processing
Jupyter Notebook
362
Updated: 2 y ago
License: Permissive (MIT)
vosk-server by alphacep
WebSocket, gRPC and WebRTC speech recognition server based on Vosk and Kaldi libraries
Python
361
Updated: 4 y ago
License: Permissive (Apache-2.0)
speech-aligner by open-speech
speech-aligner is a tool that generates phoneme-level time alignments between human speech and its transcription.
C++
361
Updated: 2 y ago
License: Proprietary (Proprietary)
nnsvs by r9y9
Neural network-based singing voice synthesis library for research
Python
360
Updated: 3 y ago
License: Permissive (MIT)
mandarin-tts by ranchlai
Chinese Mandarin text-to-speech (TTS) based on FastSpeech 2, implemented in PyTorch, using WaveGlow as the vocoder, with the Biaobei and AISHELL-3 datasets
Python
360
Updated: 2 y ago
License: No License (No License)
leopard by Picovoice
On-device speech-to-text engine powered by deep learning
Python
359
Updated: 2 y ago
License: Permissive (Apache-2.0)