go-speak by nicolaifsf
Speech to Text for Golang. Capable of continuous speech. Also wrappers for various Speech to Text APIs
Go 22 Updated: 4 y ago License: Permissive (MIT)
gildas-ai by gildasch
Easy AI from your friends & family
Go 22 Updated: 4 y ago License: Permissive (MIT)
xunfei_tts by goldengrape
Python library for interfacing with the XunFei text-to-speech API
HTML 22 Updated: 4 y ago License: Permissive (MIT)
kaldi-io by open-speech
C++ Kaldi I/O library (static and dynamic).
C 22 Updated: 4 y ago License: Permissive (MIT)
ALSoundRecognition by zyqzyq
NAO robot speech recognition module. online file:
Shell 22 Updated: 4 y ago License: Permissive (Apache-2.0)
houndify-sdk-go by soundhound
The official Houndify SDK for Go
Go 22 Updated: 4 y ago License: Permissive (MIT)
asterisk-voicechanger by jart
Asterisk module for adjusting pitch of voices
C 22 Updated: 4 y ago License: Strong Copyleft (GPL-2.0)
fast-gpio by OnionIoT
Provides access to GPIOs by writing directly to the hardware registers; also implements software PWM
C++ 22 Updated: 4 y ago License: Strong Copyleft (GPL-3.0)
bubble by r-lyeh-archived
💬 A simple and lightweight C++11 dialog library (for Windows)
C++ 22 Updated: 5 y ago License: Permissive (Zlib)
thunder-speech by scart97
A hackable speech recognition library.
Python 22 Updated: 2 y ago License: Permissive (MIT)
Thai_TTS by Prim9000
Thai_TTS is a project for training Thai text-to-speech using NVIDIA's Tacotron2.
Jupyter Notebook 22 Updated: 2 y ago License: Permissive (Apache-2.0)
REPET-Python by zafarrafii
REPeating Pattern Extraction Technique (REPET) in Python for audio source separation: original REPET, REPET extended, adaptive REPET, REPET-SIM, online REPET-SIM
Jupyter Notebook 22 Updated: 1 y ago License: No License (No License)
Comprehensive-Tacotron2 by keonlee9420
PyTorch implementation of Google's Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions. This implementation supports both single- and multi-speaker TTS, along with several techniques to improve the robustness and efficiency of the model.
Python 22 Updated: 3 y ago License: Permissive (MIT)
TSJ-TTS by FujiwaraShirakana
Integrated UI software that reimplements Tacotron2 inference with C++ OnnxRuntime and implements single-speaker and multi-speaker VITS model inference with Libtorch
C++ 22 Updated: 2 y ago License: Strong Copyleft (GPL-3.0)
Unified-Enhance-Separation by YUCHEN005
Code for paper "Unifying Speech Enhancement and Separation with Gradient Modulation for End-to-End Noise-Robust Speech Separation"
Python 22 Updated: 1 y ago License: Permissive (Apache-2.0)
ld3320 by hepingood
LD3320 full function driver for general MCU and Linux.
C 22 Updated: 2 y ago License: Permissive (MIT)
MB-iSTFT-VITS-44100Hz-Ja by AcogiMin
[44100Hz and Ja Support] Lightweight and High-Fidelity End-to-End Text-to-Speech with Multi-Band Generation and Inverse Short-Time Fourier Transform
Jupyter Notebook 22 Updated: 2 y ago License: Permissive (Apache-2.0)
max-vc by PlayVoice
Singing voice conversion without F0
Python 22 Updated: 2 y ago License: Permissive (MIT)
AudioProcessor by alexanderchiu
Java library for speech enhancement
Java 21 Updated: 4 y ago License: Permissive (MIT)
python-google-transcribe by korylprince
Simple voice to speech transcription using Google
Python 21 Updated: 4 y ago License: No License (No License)
jpmml-android by jpmml
PMML evaluator library for the Android operating system (http://www.android.com/)
Java 21 Updated: 4 y ago License: Strong Copyleft (AGPL-3.0)
Speaker-Recognition-System-using-GMM by genzen2103
System for identifying the speaker from a given speech signal using MFCC and LPC features and Gaussian Mixture Models
Python 21 Updated: 4 y ago License: No License (No License)
watson-streaming-stt by IBM
Example of using Watson's Streaming Speech to Text WebSocket interface for real-time transcription. Written in Python. WARNING: This repository is no longer maintained and will not be updated. It will be kept available in read-only mode.
Python 21 Updated: 4 y ago License: Permissive (Apache-2.0)
LightningTweeter by Hexalyse
A lightning strike detector using the AS3935 sensor from AMS that tweets about detected storms.
Python 21 Updated: 4 y ago License: No License (No License)
KeenASR-Android-PoC by keenresearch
A proof-of-concept app using KeenASR SDK on Android. WE ARE HIRING: https://keenresearch.com/careers.html
Java 21 Updated: 4 y ago License: No License (No License)
cmusphinx-models by collectivat
Acoustic and language models for minorised languages.
Python 21 Updated: 4 y ago License: Strong Copyleft (AGPL-3.0)
spoken_language_dataset by tomasz-oponowicz
A dataset with English, German and Spanish speech samples.
Python 21 Updated: 3 y ago License: Permissive (MIT)
PySpeak by johnwyles
Python Speech Recognition, Voice Recognition, Text-to-Speech and Voice Command Engine
Python 21 Updated: 2 y ago License: Strong Copyleft (GPL-3.0)
CS-Tacotron-Pytorch by andi611
PyTorch implementation of CS-Tacotron, a code-switching speech synthesis end-to-end generative TTS model.
Python 21 Updated: 2 y ago License: Permissive (MIT)
eloquence_threshold by pumper42nickel
Eloquence synthesizer NVDA add-on compatible with threshold versions of NVDA (2019.3 and later). Supports Python 3 and new NVDA speech framework.
Python 21 Updated: 2 y ago License: No License (No License)
ReversoAPI by demian-wolf
Reverso API for Python. Currently available for Reverso Context and Reverso Voice
Python 21 Updated: 4 y ago License: Permissive (MIT)
gladys-voice by GladysAssistant
[DEPRECATED] Makes it possible to speak to Gladys!
JavaScript 21 Updated: 5 y ago License: Permissive (MIT)
paystack-simple by ashinzekene
Framework/library-agnostic Paystack wrapper
JavaScript 21 Updated: 4 y ago License: Permissive (MIT)
EAS-Encoder by SotaJoe
Encode EAS (Emergency Alert System - United States) audio messages with valid SAME (Specific Area Message Encoding) headers, EBS (Emergency Broadcast System) attention tones, NWS (National Weather Service) attention tones, and/or spoken announcement, which is synthesized by Microsoft SAPI TTS voices. Supports output to .wav/.mp3 file or MemoryStream.
C# 21 Updated: 3 y ago License: Weak Copyleft (LGPL-3.0)
asr-rescoring by diego-fustes
Rescoring methods for end-to-end Automatic Speech Recognition
Python 21 Updated: 2 y ago License: Permissive (Apache-2.0)
node-openjtalk by TanUkkii007
Japanese text-to-speech engine binding for NodeJS
C++ 21 Updated: 4 y ago License: Permissive (MIT)
ngx-online-status by VadimDez
🔛 Angular 5+ Detect online/offline state
TypeScript 21 Updated: 3 y ago License: Permissive (MIT)
revai-node-sdk by revdotcom
Node.js SDK for the Rev AI API
TypeScript 21 Updated: 2 y ago License: Permissive (MIT)
speech-polyfill by anteloe
A polyfill for the HTML5 Speech Recognition API. It uses Microsoft's Cognitive Services as a backend. All browsers that support WebRTC are supported by this polyfill.
TypeScript 21 Updated: 5 y ago License: Permissive (MIT)
mp3net by korneelvdbroek
A convolutional generative audio synthesis model
Python 21 Updated: 3 y ago License: Permissive (MIT)
DORi by crodriguezo
Public repository for DORi: Discovering Object Relationships for Moment Localization of a Natural Language Query in a Video. Code accompanying the paper.
Python 21 Updated: 2 y ago License: No License (No License)
my_hmm_gmm_speech_recognition by cc8848
Python-based HMM-GMM acoustic model
Python 21 Updated: 2 y ago License: No License (No License)
sepia-stt-server by SEPIA-Framework
SEPIA server to support open-source speech recognition via WebSocket connection.
Python 21 Updated: 4 y ago License: Permissive (MIT)
DNN-HSMM by sp-nitech
PyTorch implementation of DNN-HSMM for TTS
Python 21 Updated: 3 y ago License: Permissive (BSD-3-Clause)
porfir by morfeusys
Voice assistant Porfirevich (Порфирьевич)
Kotlin 21 Updated: 4 y ago License: Permissive (Apache-2.0)
vad by dreamflyforever
VAD (Voice Activity Detection) for embedded systems.
C 21 Updated: 4 y ago License: No License (No License)
mbelib-testing by pbarfuss
AMBE/AMBE+ Vocoder implementation/decoding library.
C 21 Updated: 4 y ago License: Proprietary (Proprietary)
asterisk-eagi-google-speech-recognition by phsultan
An example of how to use Asterisk EAGI along with Google Speech recognition to transcribe voice to text
Shell 21 Updated: 4 y ago License: No License (No License)
kanji-handwriting-swift by tuanna-hsp
Kanji handwriting recognition for iOS using Zinnia.
C++ 21 Updated: 4 y ago License: No License (No License)
srvk-eesen-offline-transcriber by srvk
Top level code to transcribe English audio/video files into text/subtitles
Shell 21 Updated: 4 y ago License: No License (No License)