cloud-asr by UFAL-DSG
Cloud-based Automatic Speech Recognition (ASR) platform and a public ASR webservice.
Python 65 Updated: 4 y ago License: Permissive (Apache-2.0)

UnityGoogleStreamingSpeechToText by oshoham
A Unity plugin for real-time, indefinite speech-to-text transcription from a microphone using Google Cloud Speech-to-Text.
C# 65 Updated: 2 y ago License: Permissive (MIT)

80speak by connornishijima
80speak is an online speech synthesizer based on DECtalk, famously used by Professor Stephen Hawking, the US National Weather Service, Back To The Future Part II, and Benny Benassi.
C 65 Updated: 4 y ago License: Strong Copyleft (GPL-3.0)

SAPI4 by TETYYS
Web interface for Microsoft Sam & friends
C++ 65 Updated: 1 y ago License: Permissive (MIT)

mimic3-voices by MycroftAI
Voice models for the Mimic 3 text-to-speech system
HTML 65 Updated: 2 y ago License: Strong Copyleft (CC-BY-SA-4.0)

Working-with-the-Web-Audio-API by joshreiss
Various simple Web Audio API examples
HTML 65 Updated: 2 y ago License: No License (No License)

willow-inference-server by toverainc
Open source, local, and self-hosted highly optimized language inference server supporting ASR/STT, TTS, and LLM across WebRTC, REST, and WS
Python 65 Updated: 1 y ago License: No License (No License)

whisper-standalone-win by Purfview
Standalone executables for those who don't want to bother with Python.
Python 65 Updated: 1 y ago License: No License (No License)

LAVSE by kagaminccino
Python code for Lite Audio-Visual Speech Enhancement.
Python 64 Updated: 4 y ago License: Permissive (MIT)

Inimesed by Kaljurand
An Android app that lets you search your contacts by voice. Internet not required. Based on Pocketsphinx. Uses Estonian acoustic models.
Java 64 Updated: 4 y ago License: Permissive (Apache-2.0)

speech-recorder by serenadeai
speech-recorder is a node.js module for streaming audio from a device's microphone and filtering for speech.
C++ 64 Updated: 2 y ago License: Permissive (MIT)

Speech by Microsoft
Convert audio to text, understand intent, and convert text back to speech for natural responsiveness.
cloud_api 64 Updated: Current License: Proprietary (Proprietary)

resemble-alexa by resemble-ai
This is sample code for an Alexa skill that uses realistic voice cloning powered by Resemble AI's text-to-speech API and OpenAI's GPT-3 AI engine.
Python 64 Updated: 2 y ago License: Permissive (MIT)

Transformer-Transducer by oshindow
A pytorch_lightning reimplementation of the Transducer module from ESPnet.
Python 64 Updated: 3 y ago License: No License (No License)

watson-speech-translator by IBM
Use Watson Speech to Text, Language Translator, and Text to Speech in a web app with React components
JavaScript 63 Updated: 3 y ago License: Permissive (Apache-2.0)

speechmarkdown-js by speechmarkdown
Speech Markdown grammar, parser, and formatters for use with JavaScript.
TypeScript 63 Updated: 2 y ago License: Permissive (MIT)

tacotron-tts-cpp by syoyo
Tacotron text to speech in C++ (synthesis only)
C++ 63 Updated: 4 y ago License: Permissive (MIT)

mopo by mtytel
Modular and Polyphonic audio synthesis library
C++ 63 Updated: 4 y ago License: Strong Copyleft (GPL-3.0)

Cognitive-Services-Voice-Assistant by Azure-Samples
Welcome to the Microsoft Voice Assistant samples repository! Here you will find samples to help you get started building a client application for your bot or Custom Command service. You will also be able to easily deploy a working Custom Command based Voice Assistant to your own Azure subscription.
C++ 63 Updated: 2 y ago License: Permissive (MIT)

FLEX by purvaten
Code for our CVPR'23 paper - "FLEX: Full-Body Grasping Without Full-Body Grasps"
Python 63 Updated: 2 y ago License: No License (No License)

TasNet by kaituoxu
A PyTorch implementation of Time-domain Audio Separation Network (TasNet) with Permutation Invariant Training (PIT) for speech separation.
Python 62 Updated: 4 y ago License: No License (No License)

Word-to-Number-Russian by SergeyShk
A project for converting numbers written out as text in Russian.
Python 62 Updated: 4 y ago License: Permissive (MIT)

TextToSpeechPlugin by jamesmontemagno
Text to Speech Plugin for Xamarin and Windows
C# 62 Updated: 2 y ago License: Permissive (MIT)

ios-client by speechly
The iOS client library for Speechly API
Swift 62 Updated: 3 y ago License: Permissive (MIT)

voxseg by NickWilkinson37
A Python library for voice activity detection (VAD) for speech/non-speech segmentation.
Python 62 Updated: 2 y ago License: Permissive (MIT)

Reverb.js by andibrae
Reverb.js is a Web Audio API extension for creating reverb nodes and an accompanying impulse-response reverb library.
HTML 62 Updated: 2 y ago License: Permissive (CC0-1.0)

audio.whisper by bnosac
Transcribe audio files using the "Whisper" Automatic Speech Recognition model from R
C 62 Updated: 1 y ago License: Proprietary (Proprietary)

rtx-voice-script by amirldn
A Python script that takes an input MP3/FLAC and outputs an acapella/background noise stripped WAV using the power of NVIDIA's RTX Voice
Python 62 Updated: 2 y ago License: Strong Copyleft (GPL-3.0)

simple-speech-recognition by kelvinguu
A complete speech recognition system you can deploy with just a few lines of Python, built on CMU Sphinx-4.
Python 61 Updated: 4 y ago License: No License (No License)

TD-PSOLA by sannawag
A simple pitch shifting script (Time-Domain Pitch-Synchronous Overlap and Add)
Python 61 Updated: 3 y ago License: Permissive (MIT)

asr_preprocessing by hirofumi0810
Python implementation of pre-processing for End-to-End speech recognition
Python 61 Updated: 4 y ago License: Permissive (MIT)

MMM-voice by fewieden
Offline Voice Recognition Module for MagicMirror²
JavaScript 61 Updated: 4 y ago License: Permissive (MIT)

honkling by castorini
Web app for keyword spotting using TensorFlow.js
JavaScript 61 Updated: 3 y ago License: Permissive (MIT)

spoken by stephenlb
Spoken - JavaScript Text-to-Speech and Speech-to-Text for AI apps
JavaScript 61 Updated: 2 y ago License: No License (No License)

voice-input-button2 by ferrinweb
New version of the voice input button, a Vue component built on the new iFlytek voice dictation (streaming version) API.
JavaScript 61 Updated: 2 y ago License: No License (No License)

lessampler by YuzukiTsuru
lessampler is a Singing Voice Synthesizer
C++ 61 Updated: 2 y ago License: Weak Copyleft (LGPL-3.0)

EasyVC by MingjieChen
Python 61 Updated: 2 y ago License: Permissive (Apache-2.0)

JVoiceXML by JVoiceXML
Open Source VoiceXML interpreter
Java 60 Updated: 2 y ago License: Weak Copyleft (LGPL-2.1)

GST-tacotron by acetylSv
Reproducing Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis (https://arxiv.org/pdf/1803.09017.pdf)
Python 60 Updated: 4 y ago License: No License (No License)

kaldi-python by janchorowski
Python wrappers for Kaldi data
C++ 60 Updated: 4 y ago License: Permissive (Apache-2.0)

speechrtc by andrenatal
Speech recognition using WebRTC for Firefox OS
C 60 Updated: 5 y ago License: No License (No License)

musicg by loisaidasam
Automatically exported from code.google.com/p/musicg
Java 59 Updated: 4 y ago License: No License (No License)

FFTNet by azraelkuan
FFTNet: a Real-Time Speaker-Dependent Neural Vocoder
Python 59 Updated: 4 y ago License: No License (No License)

avsr-tf1 by georgesterpu
Audio-Visual Speech Recognition using Sequence to Sequence Models
Python 59 Updated: 4 y ago License: Strong Copyleft (GPL-3.0)

DiscordEarsBot by healzer
A speech-to-text framework and bot for Discord. Take control of your Discord server using speech and voice commands. Can also be useful for hearing impaired and deaf people.
JavaScript 59 Updated: 3 y ago License: Permissive (MIT)

mage by numediart
MAGE is a C/C++ software toolkit for reactive implementation of HMM-based speech and singing synthesis.
C++ 59 Updated: 4 y ago License: Proprietary (Proprietary)

Nakloid by acknak
Nakloid: Unit-waveform-oriented Singing Voice Synthesis System
C++ 59 Updated: 3 y ago License: Permissive (MIT)

StageMate by Langhalsdino
StageMate is the smart assistant for your presentation. It will cover all aspects of your pitch, from skipping slides to reminding you if you miss a major point.
HTML 59 Updated: 4 y ago License: Permissive (MIT)

kaldi.js by adrianbg
C++ 59 Updated: 4 y ago License: Proprietary (Proprietary)

Parallel-Tacotron2 by keonlee9420
PyTorch implementation of Google's Parallel Tacotron 2: A Non-Autoregressive Neural TTS Model with Differentiable Duration Modeling
Python 59 Updated: 3 y ago License: Permissive (MIT)