Creates audio supercuts.
Open STT
Chinese speech recognition
Adapt Intent Parser
Speech API examples
nerd-dictation by ideasman42
Simple, hackable offline speech to text - using the VOSK-API.
Python
956
Updated: 2 y ago
License: Strong Copyleft (GPL-3.0)
subsync by sc0ty
Subtitle Speech Synchronizer
C++
943
Updated: 2 y ago
License: Strong Copyleft (GPL-3.0)
botium-speech-processing by codeforequity-at
Botium Speech Processing
JavaScript
939
Updated: 2 y ago
License: Permissive (MIT)
pykaldi by pykaldi
A Python wrapper for Kaldi
Python
936
Updated: 2 y ago
License: Permissive (Apache-2.0)
voicefilter by mindslab-ai
Unofficial PyTorch implementation of Google AI's VoiceFilter system
Python
912
Updated: 2 y ago
License: No License (No License)
zhrtvc by KuangDD
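Chinese real time voice cloning (VC) and Chinese text to speech (TTS). An easy-to-use Chinese voice cloning and speech synthesis system, including a speech encoder, speech synthesizer, vocoder, and visualization module.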
Python
890
Updated: 4 y ago
License: No License (No License)
espresso by freewym
Espresso: A Fast End-to-End Neural Speech Recognition Toolkit
Python
887
Updated: 3 y ago
License: Proprietary (Proprietary)
athena by athena-team
An open-source implementation of a sequence-to-sequence based speech processing engine
C++
869
Updated: 2 y ago
License: Permissive (Apache-2.0)
amodem by romanz
Audio MODEM Communication Library in Python
Python
864
Updated: 2 y ago
License: Proprietary (Proprietary)
speechpy by astorfi
SpeechPy - A Library for Speech Processing and Recognition: http://speechpy.readthedocs.io/en/latest/
Python
839
Updated: 4 y ago
License: Permissive (Apache-2.0)
TensorFlowASR by TensorSpeech
TensorFlowASR: Almost State-of-the-art Automatic Speech Recognition in TensorFlow 2. Supports languages that can use characters or subwords.
Jupyter Notebook
839
Updated: 2 y ago
License: Permissive (Apache-2.0)
waveform-data.js by bbc
Audio Waveform Data Manipulation API – resample, offset and segment waveform data in JavaScript.
JavaScript
824
Updated: 3 y ago
License: Weak Copyleft (LGPL-3.0)
flowtron by NVIDIA
Flowtron is an auto-regressive flow-based generative network for text to speech synthesis with control over speech variation and style transfer
Jupyter Notebook
817
Updated: 2 y ago
License: Permissive (Apache-2.0)
eesen by srvk
The official repository of the Eesen project
C++
816
Updated: 2 y ago
License: Permissive (Apache-2.0)
piper by rhasspy
A fast, local neural text to speech system
C++
794
Updated: 2 y ago
License: Permissive (MIT)
stephanie-va by SlapBot
Stephanie is an open-source platform built specifically for voice-controlled applications as well as to automate daily tasks, imitating much of a virtual assistant's work.
Python
788
Updated: 2 y ago
License: Permissive (MIT)
FastSpeech by xcmyz
An implementation of FastSpeech based on PyTorch.
Python
785
Updated: 2 y ago
License: Permissive (MIT)
OBS-captions-plugin by ratwithacompiler
Closed Captioning OBS plugin using Google Speech Recognition
C++
785
Updated: 2 y ago
License: Strong Copyleft (GPL-2.0)
jarvis by alexylem
Jarvis.sh is a simple configurable multi-lang assistant.
Shell
780
Updated: 2 y ago
License: Permissive (MIT)
mellotron by NVIDIA
Mellotron: a multispeaker voice synthesis model based on Tacotron 2 GST that can make a voice emote and sing without emotive or singing training data
Jupyter Notebook
773
Updated: 2 y ago
License: Permissive (BSD-3-Clause)
CTCDecoder by githubharald
Connectionist Temporal Classification (CTC) decoding algorithms: best path, beam search, lexicon search, prefix search, and token passing. Implemented in Python.
Python
766
Updated: 2 y ago
License: Permissive (MIT)
larynx by rhasspy
End to end text to speech system using gruut and onnx
Python
766
Updated: 2 y ago
License: Permissive (MIT)
quillman by modal-labs
A chat app that transcribes audio in real-time, streams back a response from a language model, and synthesizes this response as natural-sounding speech.
JavaScript
730
Updated: 2 y ago
License: Permissive (MIT)
whisper-asr-webservice by ahmetoner
OpenAI Whisper ASR Webservice API
Python
728
Updated: 2 y ago
License: Permissive (MIT)
ekho by hgneng
Chinese text-to-speech engine
C++
719
Updated: 4 y ago
License: Proprietary (Proprietary)
Speech-Transformer by kaituoxu
A PyTorch implementation of Speech Transformer, an End-to-End ASR with Transformer network on Mandarin Chinese.
Python
709
Updated: 2 y ago
License: No License (No License)
stream-audio-fingerprint by adblockradio
Audio landmark fingerprinting as a Node Stream module
JavaScript
704
Updated: 4 y ago
License: Weak Copyleft (MPL-2.0)
mycroft-precise by MycroftAI
A lightweight, simple-to-use, RNN wake word listener
Python
700
Updated: 2 y ago
License: Permissive (Apache-2.0)
wenet by mobvoi
Production First and Production Ready End-to-End Speech Recognition Toolkit
Python
687
Updated: 4 y ago
License: Permissive (Apache-2.0)
lhotse by lhotse-speech
Tools for handling speech data in machine learning projects.
Python
686
Updated: 2 y ago
License: Permissive (Apache-2.0)
nodejs-speech by googleapis
Node.js client for Google Cloud Speech: Speech to text conversion powered by machine learning.
TypeScript
683
Updated: 2 y ago
License: Permissive (Apache-2.0)
mimic3 by MycroftAI
A fast local neural text to speech engine for Mycroft
Python
676
Updated: 2 y ago
License: Strong Copyleft (AGPL-3.0)
ttskit by kuangdd
Text to speech toolkit. An easy-to-use Chinese speech synthesis toolbox, including a speech encoder, speech synthesizer, vocoder, and visualization module.
Python
661
Updated: 2 y ago
License: Permissive (MIT)
Python-ai-assistant by ggeop
Python AI assistant 🧠
Python
660
Updated: 2 y ago
License: Permissive (MIT)
voice-assistant-scripts by alan-ai
Example scripts for voice assistants created with the Alan AI Platform.
JavaScript
659
Updated: 2 y ago
License: No License (No License)
speech by awni
A PyTorch Implementation of End-to-End Models for Speech-to-Text
Python
654
Updated: 2 y ago
License: Permissive (Apache-2.0)
speaker-recognition by ppwwyyxx
A Speaker Recognition System
C++
645
Updated: 4 y ago
License: Permissive (Apache-2.0)
LibreASR by iceychris
An On-Premises, Streaming Speech Recognition System
Python
642
Updated: 4 y ago
License: Permissive (MIT)
Cognitive-Speech-TTS by Azure-Samples
Microsoft Text-to-Speech API sample code in several languages, part of Cognitive Services.
C#
635
Updated: 2 y ago
License: Proprietary (Proprietary)
mysam by mysamai
An open "intelligent" assistant for the web that can listen to you and learn.
JavaScript
628
Updated: 4 y ago
License: Proprietary (Proprietary)
auditok by amsehili
An audio/acoustic activity detection and audio segmentation tool
Python
626
Updated: 2 y ago
License: Permissive (MIT)
kuromoji.js by takuyaa
JavaScript implementation of Japanese morphological analyzer
JavaScript
620
Updated: 4 y ago
License: Permissive (Apache-2.0)
aiexperiments-drum-machine by googlecreativelab
Thousands of everyday sounds, organized using machine learning.
JavaScript
615
Updated: 4 y ago
License: Permissive (Apache-2.0)
mimic1 by MycroftAI
Mycroft's TTS engine, based on CMU's Flite (Festival Lite)
C
615
Updated: 4 y ago
License: Proprietary (Proprietary)
Lip2Wav by Rudrabha
This is the repository containing the code for our CVPR 2020 paper titled "Learning Individual Speaking Styles for Accurate Lip to Speech Synthesis"
Python
613
Updated: 2 y ago
License: Permissive (MIT)