SpeechKITT by TalAter
🗣 A flexible GUI for Speech Recognition
JavaScript 460 Updated: 3 y ago License: Permissive (MIT)
gstreamer-rs by sdroege
GStreamer bindings for Rust - This repository moved to https://gitlab.freedesktop.org/gstreamer/gstreamer-rs
Rust 458 Updated: 1 y ago License: Proprietary
tone-analyzer-nodejs by watson-developer-cloud
Sample Node.js Application for the IBM Tone Analyzer Service
CSS 454 Updated: 3 y ago License: Permissive (Apache-2.0)
uSpeech by arjo129
Speech recognition toolkit for the Arduino
C++ 453 Updated: 3 y ago License: Permissive (MIT)
xVA-Synth by DanRuta
Machine learning based speech synthesis Electron app, with voices from specific characters from video games
JavaScript 452 Updated: 1 y ago License: Strong Copyleft (GPL-3.0)
SpeechRecognitionView by zagum
"Google Now" style animation for Speech Recognizer.
Java 450 Updated: 3 y ago License: Permissive (Apache-2.0)
react-speech-recognition by JamesBrill
💬 Speech recognition for your React app
JavaScript 448 Updated: 2 y ago License: Permissive (MIT)
MMT by yxgeee
[ICLR-2020] Mutual Mean-Teaching: Pseudo Label Refinery for Unsupervised Domain Adaptation on Person Re-identification.
Python 442 Updated: 2 y ago License: Permissive (MIT)
project_news_alan_ai by adrianhajdin
In this video, we're going to build a Conversational Voice Controlled React News Application using Alan AI. Alan AI is a revolutionary speech recognition software that allows you to add voice capabilities to your applications.
JavaScript 440 Updated: 2 y ago License: No License
sudo-prompt by jorangreef
Run a command using sudo, prompting the user with an OS dialog if necessary.
JavaScript 438 Updated: 2 y ago License: Permissive (MIT)
audioread by beetbox
Cross-library (GStreamer + Core Audio + MAD + FFmpeg) audio decoding for Python
Python 436 Updated: 2 y ago License: Permissive (MIT)
android-speech by gotev
Android speech recognition and text to speech made easy
Java 434 Updated: 1 y ago License: Permissive (Apache-2.0)
TensorflowASR by Z-yq
A project dedicated to bringing CPU/on-device model performance close to that of GPU models; the real-time factor (RTF) on CPU is below 0.1
Python 428 Updated: 1 y ago License: Permissive (Apache-2.0)
bumblebee by jaxcore
Jaxcore Bumblebee - a JavaScript voice application framework
JavaScript 427 Updated: 3 y ago License: Permissive (MIT)
Palaver by JamezQ
Linux Speech Recognition
Python 425 Updated: 4 y ago License: Strong Copyleft (GPL-3.0)
FullSubNet by Audio-WestlakeU
PyTorch implementation of "FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement."
Python 425 Updated: 1 y ago License: Permissive (MIT)
knausj_talon by knausj85
Config for Talon for Mac, Windows and Linux. Very much in progress.
Python 424 Updated: 2 y ago License: Permissive (MIT)
dialog by sqweek
Simple cross-platform dialog API for Go
Go 420 Updated: 1 y ago License: Permissive (ISC)
elevenlabs-python by elevenlabs
The official Python API for ElevenLabs text-to-speech.
Python 419 Updated: 1 y ago License: No License
lora-svc by PlayVoice
Singing voice conversion based on Whisper, with LoRA for singing voice cloning
Python 418 Updated: 1 y ago License: Permissive (MIT)
adaptive_voice_conversion by jjery2243542
Python 414 Updated: 1 y ago License: Permissive (Apache-2.0)
voxpopuli by facebookresearch
A large-scale multilingual speech corpus for representation learning, semi-supervised learning and interpretation
Python 407 Updated: 2 y ago License: Proprietary
css10 by Kyubyong
CSS10: A Collection of Single Speaker Speech Datasets for 10 Languages
HTML 407 Updated: 1 y ago License: Permissive (Apache-2.0)
hey-athena-client by rcbyron
Your personal voice assistant
Python 406 Updated: 2 y ago License: Permissive (MIT)
nara_wpe by fgnt
Different implementations of "Weighted Prediction Error" for speech dereverberation
Python 405 Updated: 2 y ago License: Permissive (MIT)
allosaurus by xinjli
Allosaurus is a pretrained universal phone recognizer for more than 2000 languages
Python 405 Updated: 1 y ago License: Strong Copyleft (GPL-3.0)
zamia-speech by gooofy
Open tools and data for cloudless automatic speech recognition
Python 397 Updated: 3 y ago License: Weak Copyleft (LGPL-3.0)
Voice-Converter-CycleGAN by leimao
Voice Converter Using CycleGAN and Non-Parallel Data
Python 396 Updated: 4 y ago License: Permissive (MIT)
PESQ by ludlows
PESQ (Perceptual Evaluation of Speech Quality) Wrapper for Python Users (narrow band and wide band)
C 395 Updated: 1 y ago License: Permissive (MIT)
Phonetisaurus by AdolfVonKleist
Phonetisaurus G2P
Shell 393 Updated: 2 y ago License: Permissive (BSD-3-Clause)
NISQA by gabrielmittag
NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment
Python 392 Updated: 2 y ago License: Permissive (MIT)
Neural-Voice-Cloning-With-Few-Samples by SforAiDl
This repository has an implementation of "Neural Voice Cloning With Few Samples"
Python 391 Updated: 2 y ago License: Permissive (MIT)
musig by sfluor
A Shazam-like tool to store song fingerprints and retrieve them
Go 390 Updated: 4 y ago License: Permissive (MIT)
Speech-Backbones by huawei-noah
This is the main repository of open-sourced speech technology by Huawei Noah's Ark Lab.
Jupyter Notebook 388 Updated: 1 y ago License: No License
nnmnkwii by r9y9
Library to build speech synthesis systems designed for easy and fast prototyping.
Python 382 Updated: 1 y ago License: Proprietary
picovoice by Picovoice
On-device voice assistant platform powered by deep learning
Python 377 Updated: 1 y ago License: Permissive (Apache-2.0)
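MASR by yeyupiaoling
A PyTorch implementation of a streaming and non-streaming automatic speech recognition framework, compatible with both online and offline recognition. It currently supports the Conformer, Squeezeformer, and DeepSpeech2 models, along with multiple data augmentation methods.
Python 372 Updated: 1 y ago License: Permissive (Apache-2.0)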
esp-skainet by espressif
Espressif intelligent voice assistant
C 372 Updated: 1 y ago License: Proprietary
Problem Agnostic Speech Encoder
dragonfly by t4ngo
ARCHIVED! - Speech recognition framework allowing powerful Python-based scripting and extension of Dragon NaturallySpeaking (DNS) and Windows Speech Recognition (WSR)
Python 364 Updated: 3 y ago License: Weak Copyleft (LGPL-3.0)
StarGANv2-VC by yl4579
StarGANv2-VC: A Diverse, Unsupervised, Non-parallel Framework for Natural-Sounding Voice Conversion
Python 364 Updated: 1 y ago License: Permissive (MIT)
pocketsphinx-python by bambocher
Python interface to CMU Sphinxbase and Pocketsphinx libraries
Python 363 Updated: 2 y ago License: Proprietary
time by golang
[mirror] Go supplementary time packages
Go 363 Updated: 2 y ago License: Permissive (BSD-3-Clause)
echoprint-server by spotify
Server for the Echoprint audio fingerprint system
C 363 Updated: 3 y ago License: Permissive (Apache-2.0)
dla by markovka17
Deep learning for audio processing
Jupyter Notebook 362 Updated: 2 y ago License: Permissive (MIT)
vosk-server by alphacep
WebSocket, gRPC and WebRTC speech recognition server based on Vosk and Kaldi libraries
Python 361 Updated: 3 y ago License: Permissive (Apache-2.0)
speech-aligner by open-speech
speech-aligner is a tool that generates phoneme-level time alignments between human speech and its transcription
C++ 361 Updated: 2 y ago License: Proprietary
nnsvs by r9y9
Neural network-based singing voice synthesis library for research
Python 360 Updated: 3 y ago License: Permissive (MIT)
mandarin-tts by ranchlai
Chinese (Mandarin) text-to-speech synthesis based on FastSpeech 2, implemented in PyTorch, using WaveGlow as the vocoder, with the Biaobei and AISHELL-3 datasets
Python 360 Updated: 1 y ago License: No License
leopard by Picovoice
On-device speech-to-text engine powered by deep learning
Python 359 Updated: 1 y ago License: Permissive (Apache-2.0)