ITRI-speech-recognition-dataset-generation by khuangaf
Automatic Speech Recognition Dataset Generation
Jupyter Notebook 33 Updated: 3 y ago License: No License (No License)
SpeechRecognitionAI by viktorvano
Speech recognition AI based on FFNN in Java
Java 33 Updated: 2 y ago License: Permissive (Apache-2.0)
CIF-PyTorch by MingLunHan
[ICASSP 2020] CIF: Continuous Integrate-and-Fire for End-to-End Speech Recognition (A PyTorch implementation of Continuous Integrate-and-Fire mechanism).
Python 33 Updated: 2 y ago License: Permissive (Apache-2.0)
FG-transformer-TTS by b04901014
Official implementation for the paper Fine-grained style control in transformer-based text-to-speech synthesis.
Python 33 Updated: 3 y ago License: Permissive (MIT)
CausalityCheck by zqwang7
Causality Check in Frame-online Speech Separation
Python 33 Updated: 2 y ago License: No License (No License)
chatGPT_Talking by ch-tseng
ChatGPT + Google T2S + Google S2T
Python 33 Updated: 2 y ago License: No License (No License)
PPSpeech by rishikksh20
PPSpeech: Phrase based Parallel End-to-End TTS System
Python 32 Updated: 4 y ago License: No License (No License)
sox-wrapper-java by corballis
Java wrapper around the famous sox (sound-exchange) audio processing utility
Java 32 Updated: 4 y ago License: Permissive (MIT)
keyword-spotting by rajathkmp
Wake-Up-Word Keyword Spotting implemented in Keras
Python 32 Updated: 4 y ago License: No License (No License)
RTNet by Andong-Li-speech
Implementation of Monaural Speech Enhancement with Recursive Learning in the Time Domain
Python 32 Updated: 4 y ago License: No License (No License)
localatt_emorecog by gogyzzz
A PyTorch implementation of 'AUTOMATIC SPEECH EMOTION RECOGNITION USING RECURRENT NEURAL NETWORKS WITH LOCAL ATTENTION'
Python 32 Updated: 4 y ago License: No License (No License)
speech-emotion-recognition-exercise by lmingde
Project report for the speech emotion recognition camp of the two-week TAL Education (好未来) AI training camp, held July 30 to August 13, 2018
Python 32 Updated: 4 y ago License: No License (No License)
vasisualy by Oknolaz
Vasisualy is a simple Russian voice assistant written in Python for GNU/Linux, Windows and Android.
Python 32 Updated: 3 y ago License: Strong Copyleft (GPL-3.0)
bingspeech-api-client by palmerabollo
Microsoft Bing Speech API client in node.js
TypeScript 32 Updated: 4 y ago License: Proprietary (Proprietary)
go-webrtcvad by maxhawkins
cgo interface to WebRTC Voice Activity Detection
C 32 Updated: 4 y ago License: Proprietary (Proprietary)
noise-gate by Michael-F-Bryan
A simple Noise Gate algorithm for splitting an audio stream into chunks based on volume/silence
Rust 32 Updated: 4 y ago License: Proprietary (Proprietary)
kaldi-br by falabrasil
☕🇧🇷 Scripts for Kaldi in Brazilian Portuguese
Shell 32 Updated: 2 y ago License: Permissive (MIT)
Voicenet by Robofied
Comprehensive Python library for speech and voice.
Jupyter Notebook 32 Updated: 4 y ago License: Permissive (BSD-3-Clause)
VoiceNET.Library by nhannt201
.NET library to easily create Voice Command Control feature.
C# 32 Updated: 2 y ago License: Permissive (MIT)
carter-voice-assistant by huwprosser
An example project showing how to use www.carterapi.com as a voice assistant.
Python 32 Updated: 1 y ago License: Permissive (MIT)
vosk-rs by Bear-03
Rust bindings to the Vosk API Speech Recognition library
Rust 32 Updated: 2 y ago License: Permissive (MIT)
PPG-GradVC by seahore
A diffusion-based cross-lingual voice conversion model, as my bachelor's thesis
Python 32 Updated: 1 y ago License: No License (No License)
RecordDialog by IvanSotelo
A Simple Wav audio recorder dialog
Java 31 Updated: 4 y ago License: Permissive (MIT)
phasevocoder by haoyu987
Phase Vocoder In Python
Python 31 Updated: 4 y ago License: No License (No License)
Android App for Deaf
Jarvis by m4n3dw0lf
Voice command assistant
Python 31 Updated: 4 y ago License: Strong Copyleft (GPL-3.0)
Translation-Augmented-LibriSpeech-Corpus by alicank
Large scale (>200h) and publicly available read audio book corpus. This corpus is an augmentation of LibriSpeech ASR Corpus (1000h) and contains English utterances (from audiobooks) automatically aligned with French text. Our dataset offers ~236h of speech aligned to translated text.
Python 31 Updated: 3 y ago License: No License (No License)
TPGST-Tacotron by Yangyangii
Google's TPGST reimplementation.
Python 31 Updated: 3 y ago License: No License (No License)
linkatype-android by linkasu
A program that helps people with speech disorders communicate
Java 31 Updated: 3 y ago License: No License (No License)
las-pytorch by jiwidi
Listen, Attend and Spell model for E2E ASR. Implementation in PyTorch
Python 31 Updated: 4 y ago License: No License (No License)
cocktail-party by avivga
Official Implementation of "Seeing Through Noise: Speaker Separation and Enhancement using Visually-derived Speech", ICASSP 2018.
Python 31 Updated: 4 y ago License: Permissive (MIT)
CNTN by candlewill
ChiNese Text Normalization (CNTN) tool for Text-to-speech system
Python 31 Updated: 4 y ago License: No License (No License)
tUnE.js by LevyGuy
Web Speech recognition grammar POC for webkit using the Levenshtein distance algorithm
JavaScript 31 Updated: 4 y ago License: No License (No License)
spokestack-ios by spokestack
Spokestack: give your iOS app a voice interface!
Swift 31 Updated: 2 y ago License: Permissive (Apache-2.0)
theano-kaldi-rnn by mravanelli
THEANO-KALDI-RNNs is a project implementing various Recurrent Neural Networks (RNNs) for RNN-HMM speech recognition. The Theano Code is coupled with the Kaldi decoder.
Perl 31 Updated: 4 y ago License: No License (No License)
STEMM by ictnlp
Code for ACL 2022 main conference paper "STEMM: Self-learning with Speech-text Manifold Mixup for Speech Translation".
Python 31 Updated: 2 y ago License: Permissive (MIT)
neon-tts-plugin-coqui by NeonGeckoCom
Coqui AI TTS plugin
Python 31 Updated: 2 y ago License: Proprietary (Proprietary)
gpt_chatbot by 1nnovat1on
This chatbot lets you use your microphone to communicate with GPT-4. It uses the Windows TTS to respond with a voice. It uses Pinecone to store long-term information and retrieves it to create context. API keys for OpenAI and Pinecone required. Tested on Windows.
Python 31 Updated: 2 y ago License: No License (No License)
subui-speech-assistant by python019
Python AI project
Python 31 Updated: 2 y ago License: Permissive (MIT)
ChatGPT
Audio-Speech-To-Sign-Language-Converter by jigargajjar55
A web-based application that accepts audio speech or text as input and converts it to the corresponding Indian Sign Language for people with speech or hearing impairments, including deaf people.
HTML 31 Updated: 1 y ago License: Permissive (MIT)
Master-Voice_Prints by prajual
This repository includes four different implementations of the Speaker Verification task: GMM_UBM, Ivector, Deep-Speaker, and voice-vector
Python 30 Updated: 3 y ago License: No License (No License)
sonos-java by SR-G
Command line and webapp application for driving Sonos boxes
Java 30 Updated: 4 y ago License: No License (No License)
TranslateApp by apaar97
Android App to translate text conversations, supporting 90 languages with Speech-To-Text and Text-to-Speech features for ease of accessibility.
Java 30 Updated: 4 y ago License: Strong Copyleft (GPL-3.0)
SNR-Based-Progressive-Learning-of-Deep-Neural-Network-for-Speech-Enhancement by haoxiangsnr
Implementation of the paper "SNR-Based Progressive Learning of Deep Neural Network for Speech Enhancement."
Python 30 Updated: 4 y ago License: Permissive (MIT)
SemanticMask by MarkWuNLP
The repo contains our code for "Semantic Mask for Transformer based End-to-End Speech Recognition"
Python 30 Updated: 3 y ago License: No License (No License)
DACS by vikolss
Code from the paper "DACS: Domain Adaptation via Cross-domain Mixed Sampling"
Python 30 Updated: 4 y ago License: Permissive (MIT)
LAS-SpeechRecognition by PengdaLiu
Listen, Attend and Spell (LAS) framework for speech recognition (see https://arxiv.org/pdf/1508.01211.pdf).
Python 30 Updated: 4 y ago License: No License (No License)
greenkey-asrtoolkit by finos
A collection of useful tools for handling speech recognition data
Python 30 Updated: 3 y ago License: Permissive (Apache-2.0)
pambox by achabotl
Python auditory modeling toolbox.
Python 30 Updated: 4 y ago License: Permissive (BSD-3-Clause)