deepgram-node-sdk by deepgram
Official JavaScript SDK for Deepgram's automated speech recognition APIs.
TypeScript 46 | Updated: 1 y ago | License: Permissive (MIT)

translate by apaar97
Android app to translate text conversations, supporting 90+ languages with speech-to-text and text-to-speech features for accessibility.
Java 46 | Updated: 2 y ago | License: Strong Copyleft (GPL-3.0)

end2endASR by cdyangbo
Implementation of an end-to-end ASR algorithm with TensorFlow.
Python 45 | Updated: 4 y ago | License: No License (No License)

se_relativisticgan by deepakbaby
Keras framework for speech enhancement using relativistic GANs
Python 45 | Updated: 4 y ago | License: Permissive (MIT)

Extended_VQVAE by nii-yamagishilab
Python 45 | Updated: 4 y ago | License: Permissive (MIT)

spokestack-android by spokestack
Extensible Android mobile voice framework: wakeword, ASR, NLU, and TTS. Easily add voice to any Android app!
Java 45 | Updated: 3 y ago | License: Permissive (Apache-2.0)

deepspeech by MyrtleSoftware
A PyTorch implementation of DeepSpeech and DeepSpeech2.
Python 45 | Updated: 4 y ago | License: Proprietary (Proprietary)

specAugment by Kyubyong
Tensor2tensor experiment with SpecAugment
Python 45 | Updated: 4 y ago | License: Permissive (Apache-2.0)

Cognitive-SpeakerRecognition-Android by microsoft
Android SDK for Microsoft Speaker Recognition API, part of Cognitive Services
Java 45 | Updated: 4 y ago | License: Proprietary (Proprietary)

Tacotron2 by kaituoxu
A PyTorch implementation of Tacotron2, an end-to-end text-to-speech (TTS) system described in "Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions".
Python 45 | Updated: 4 y ago | License: No License (No License)

jarvis by edisonwong520
Chinese-language Jarvis voice assistant (an enhanced Siri for the PC: automatic music playback and download, weather reports, directions and navigation, timers, search, and more)
Python 45 | Updated: 4 y ago | License: Permissive (MIT)

dialectID_e2e by swshon
End-to-end dialect identification using a convolutional neural network
Python 45 | Updated: 3 y ago | License: No License (No License)

mikutter by katsyoshi
"my" mikutter mirror. Please check the official repository. This repository does not accept pull requests.
Ruby 45 | Updated: 6 y ago | License: Permissive (MIT)

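VoiceDictation by MuGuiLin
Xunfei (iFLYTEK) speech dictation WebAPI: converts speech (≤60 seconds) into the corresponding text so that machines can "understand" human language; in effect it gives the machine "ears" so that it is able to "listen".
JavaScript 45 | Updated: 2 y ago | License: No License (No License)
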
Template10.Validation by Windows-XAML
C# 45 | Updated: 4 y ago | License: Permissive (MIT)

smart_app_framework by sberdevices
SmartApp Framework for building skills for the "Salute" (Салют) family of virtual assistants in Python
Python 45 | Updated: 3 y ago | License: Proprietary (Proprietary)

formant-analyzer by fulldecent
iOS application for finding formants in spoken sounds
Swift 45 | Updated: 4 y ago | License: Permissive (MIT)

khalzam by kisasexypantera94
Simple audio recognition library.
Go 45 | Updated: 4 y ago | License: Permissive (MIT)

ofxSpeech by latrokles
[abandoned] Speech recognition and synthesis addon for openFrameworks
C++ 45 | Updated: 6 y ago | License: No License (No License)

vosk-rs by wzhd
State-of-the-art voice recognition for Rust using vosk. View demo: https://fars.ee/F9-b.mp4
Rust 45 | Updated: 3 y ago | License: Permissive (MIT)

mecab-rs by tsurai
Safe Rust bindings for MeCab, a part-of-speech and morphological analyzer library
Rust 45 | Updated: 2 y ago | License: Permissive (MIT)

Audio-Source-Separation by ShichengChen
WaveNet for the separation of audio sources
Jupyter Notebook 45 | Updated: 3 y ago | License: Permissive (MIT)

Python-Sound-Tool by a-n-rose
SoundPy (alpha stage) is a research-based Python package for speech and sound. Applications include deep learning, filtering, speech enhancement, audio augmentation, feature extraction and visualization, dataset and audio file conversion, and beyond.
Jupyter Notebook 45 | Updated: 3 y ago | License: Proprietary (Proprietary)

ctc_decoder by Slyne
A CTC decoder for both online and offline ASR models
C++ 45 | Updated: 1 y ago | License: No License (No License)

voicekit-examples by Tinkoff
Examples of how to use Tinkoff Voicekit
C# 45 | Updated: 2 y ago | License: Permissive (Apache-2.0)

audioWhisper by Awexander
Listen to any audio stream on your machine and print out the transcribed or translated audio.
Python 45 | Updated: 2 y ago | License: Permissive (MIT)

alqalign by xinjli
Multilingual speech aligner
Python 45 | Updated: 2 y ago | License: Permissive (Apache-2.0)

chatgpt-voice-assistant by jakecyr
A chatbot that uses speech-to-text for input, sends the text to OpenAI's ChatGPT text generation model, and speaks the response using text-to-speech.
Python 45 | Updated: 2 y ago | License: No License (No License)

knn-vc by bshall
Voice Conversion With Just Nearest Neighbors
Python 45 | Updated: 1 y ago | License: Proprietary (Proprietary)

Converts any person's voice timbre into thousands of different timbres.

sms_wsj by fgnt
SMS-WSJ: Spatialized Multi-Speaker Wall Street Journal database for multi-channel source separation and recognition
Python 44 | Updated: 4 y ago | License: Permissive (MIT)

FastSpeech2 by xcmyz
An implementation of FastSpeech2 based on PyTorch.
Python 44 | Updated: 4 y ago | License: No License (No License)

pytorch-ivectors by vvestman
GPU-accelerated implementation of i-vector extractor training using PyTorch. Requires Kaldi for feature extraction and UBM training. An example script is provided for VoxCeleb data.
Python 44 | Updated: 4 y ago | License: No License (No License)

rcaudio by mhy12345
Real-time audio analysis library supporting acoustic feature extraction and real-time beat detection
Python 44 | Updated: 2 y ago | License: Permissive (MIT)

AcousticRakeReceiver by LCAV
The acoustic rake receiver, a microphone beamformer that uses echoes to improve noise and interference suppression. Python code to reproduce all the results from "Raking the Cocktail Party" by Ivan Dokmanic, Robin Scheibler, and Martin Vetterli.
Python 44 | Updated: 3 y ago | License: No License (No License)

AdaIN-VC by cyhuang-tw
An unofficial implementation of the paper "One-shot Voice Conversion by Separating Speaker and Content Representations with Instance Normalization".
Python 44 | Updated: 3 y ago | License: No License (No License)

myG2P by ye-kyaw-thu
Myanmar (Burmese) Language Grapheme to Phoneme (myG2P) Conversion Dictionary for speech recognition (ASR) and speech synthesis (TTS).
Perl 44 | Updated: 2 y ago | License: No License (No License)

speaker_recognition_GMM_UBM by scelesticsiva
A GMM-UBM speaker recognition system for use in an Android application that helps monitor patients suffering from schizophrenia.
Jupyter Notebook 44 | Updated: 3 y ago | License: No License (No License)

translate-Red-Deat-Redemption-2 by IndiMops
Ukrainian localization for the game Red Dead Redemption 2. Feel like a cowboy, one hundred percent.
Python 44 | Updated: 2 y ago | License: No License (No License)

speech-rest-api by askrella
Transcription and TTS REST API (OpenAI Whisper, SpeechBrain)
Python 44 | Updated: 2 y ago | License: No License (No License)

nltk-maxent-pos-tagger by arne-cl
Maximum-entropy-based part-of-speech tagger for NLTK
Python 43 | Updated: 4 y ago | License: No License (No License)

tts by DeepHorizons
A simple Python TTS wrapper
Python 43 | Updated: 4 y ago | License: No License (No License)

AiVoice by candlewill
Deep CNNs for speech synthesis
Python 43 | Updated: 4 y ago | License: No License (No License)

Goodness-of-Pronunciation by sweekarsud
Python 43 | Updated: 4 y ago | License: No License (No License)

asr by biemster
Android offline speech recognition natively on PC
Python 43 | Updated: 4 y ago | License: No License (No License)

deepSpeech2 by yao-matrix
End-to-end speech recognition using TensorFlow
Python 43 | Updated: 3 y ago | License: Permissive (BSD-3-Clause)

AVSR-Deep-Speech by pandeydivesh15
Google Summer of Code 2017 Project: Development of Speech Recognition Module for Red Hen Lab
Python 43 | Updated: 4 y ago | License: Strong Copyleft (GPL-2.0)

ASAM by jacoxu
Code and dataset for the paper "Modeling Attention and Memory for Auditory Selection in a Cocktail Party Environment" (AAAI 2018)
Python 43 | Updated: 3 y ago | License: No License (No License)

PNCC by supikiti
An implementation of Power Normalized Cepstral Coefficients (PNCC)
Python 43 | Updated: 4 y ago | License: Permissive (MIT)

vq-vae by Kyubyong
A TensorFlow implementation of VQ-VAE speaker conversion
Python 43 | Updated: 4 y ago | License: Permissive (Apache-2.0)