dvmhost by DVMProject
MMDVM-based Digital Voice Modem Host Software
C++ 27 Updated: 2 y ago License: Strong Copyleft (GPL-2.0)

prompt_semantics by awebson
This repository accompanies our paper “Do Prompt-Based Models Really Understand the Meaning of Their Prompts?”
Python 27 Updated: 3 y ago License: Permissive (MIT)

kaldi-native-fbank by csukuangfj
Kaldi-compatible online fbank extractor without external dependencies
C++ 27 Updated: 1 y ago License: Permissive (Apache-2.0)

emoASR by emonosuke
End-to-end MOdeling of ASR (Automatic Speech Recognition)
Python 27 Updated: 2 y ago License: No License (No License)

talk-to-gpt-3 by davila7
Whisper + OpenAI + Speech Recognition
Python 27 Updated: 2 y ago License: Permissive (MIT)

voice-dataset-creation by hollygrimm
Tools to create your own voice dataset for TTS training
Jupyter Notebook 27 Updated: 2 y ago License: No License (No License)

java-Speech-preprocessing by happyKen
Java speech preprocessing and MFCC extraction
Java 26 Updated: 4 y ago License: No License (No License)

Fool by mkruglikov
Simple Russian voice assistant based on Android Things and Raspberry Pi 3
Java 26 Updated: 4 y ago License: Permissive (MIT)

blather by ajbogh
Python application for speech recognition using pocketsphinx and gstreamer. A GUI is available for both Qt and Gtk.
Python 26 Updated: 5 y ago License: No License (No License)

aipa by ahmetozlu
AIPA (A.I. Personal Assistant): an intelligent personal assistant based on speech, vision, machine learning, and IoT for Ubuntu-based Linux distributions.
Python 26 Updated: 4 y ago License: Permissive (MIT)

user-authentication-using-voice-biometrics by rhythmize
Project that processes users' voice signals to recognise them using voice biometrics
Python 26 Updated: 4 y ago License: No License (No License)

mbed-os-tools by ARMmbed
The tools to test and work with Mbed OS
Python 26 Updated: 3 y ago License: Permissive (Apache-2.0)

feeds by lunixbochs
Transcribe audio feeds into a public web UI
Python 26 Updated: 4 y ago License: Permissive (MIT)

Autosub-with-Baidu-DeepSpeech2 by lizhaokun
Chinese speech recognition that combines autosub with Baidu's DeepSpeech2 model
Python 26 Updated: 4 y ago License: Permissive (Apache-2.0)

tcnse by ykoyama58
TCN-based Speech Enhancement
Python 26 Updated: 4 y ago License: No License (No License)

freespeech-vr by themanyone
Speaker-independent voice recognition with dynamic language learning
Python 26 Updated: 2 y ago License: Strong Copyleft (GPL-3.0)

parallel_wavenet_vocoder by geneing
Python 26 Updated: 5 y ago License: Proprietary (Proprietary)

kaldi_kws by dzhelonkin
Keyword spotting with the Kaldi library
Python 26 Updated: 4 y ago License: No License (No License)

voice-based-email-for-blind by hacky1997
Emailing system for visually impaired persons
Python 26 Updated: 3 y ago License: Permissive (MIT)

thereader by xnohat
Ebook (PDF and EPUB) voice reader using TTS based on the unofficial Google TTS API
PHP 26 Updated: 4 y ago License: No License (No License)

speak.awf by mklement0
An Alfred 3 workflow that uses macOS's TTS (text-to-speech) feature to speak text aloud.
Shell 26 Updated: 4 y ago License: No License (No License)

Adversarial-Many-to-Many-VC by shaojinding
[InterSpeech 2020] "Improving the Speaker Identity of Non-Parallel Many-to-Many Voice Conversion with Adversarial Speaker Recognition" by Shaojin Ding, Guanlong Zhao, Ricardo Gutierrez-Osuna
Python 26 Updated: 3 y ago License: Proprietary (Proprietary)

speech-transformer by sooftware
Transformer framework specialized in speech recognition tasks using PyTorch.
Python 26 Updated: 3 y ago License: Permissive (MIT)

voice-type-classifier by MarvinLvn
A deep learning model for classifying audio frames into [SPEECH, KCHI, CHI, MAL, FEM] classes.
Shell 26 Updated: 2 y ago License: No License (No License)

godot-tts by lightsoutgames
Rust 26 Updated: 3 y ago License: Permissive (MIT)

AudioDenoise by cpuimage
C implementation of "Audio Denoise by Time-Frequency Block Thresholding"
C 26 Updated: 4 y ago License: Permissive (MIT)

py-ads1256 by fabiovix
Python library with C wrappers to read 8 channels from the Texas Instruments ADS1256 ADC chip
C 26 Updated: 4 y ago License: No License (No License)

GlottDNN by ljuvela
GlottDNN vocoder and tools for training DNN excitation models
C++ 26 Updated: 4 y ago License: Proprietary (Proprietary)

Speech-Recognition by soheil-mpg
End-to-End Speech Recognition using Neural Networks.
Jupyter Notebook 26 Updated: 3 y ago License: No License (No License)

VoiceFilter by jain-abhinav02
Unofficial Keras implementation of Google AI VoiceFilter
Jupyter Notebook 26 Updated: 1 y ago License: No License (No License)

vosk-tts by alphacep
Text To Speech Synthesis with Vosk
Python 26 Updated: 2 y ago License: Permissive (Apache-2.0)

stickerchat by gsh199449
Dataset for WWW 2020 paper "Learning to Respond with Stickers: A Framework of Unifying Multi-Modality in Multi-Turn Dialog"
Python 25 Updated: 4 y ago License: No License (No License)

DTMF-Decoder by tino1b2be
A Java program that implements a DTMF decoder.
Java 25 Updated: 4 y ago License: Permissive (MIT)

minutes by ubclaunchpad
Speaker diarization via transfer learning
Python 25 Updated: 4 y ago License: No License (No License)

Interaction-Aware-Attention-Network by 30stomercury
[ICASSP19] An Interaction-aware Attention Network for Speech Emotion Recognition in Spoken Dialogs
Python 25 Updated: 4 y ago License: No License (No License)

ContextNet by iankur
TensorFlow 2 implementation of ContextNet, an improved convolutional RNN-transducer architecture for end-to-end speech recognition using global context
Python 25 Updated: 4 y ago License: No License (No License)

kaldi-nnet-dur-model by alumae
Neural network phone duration model on top of the Kaldi speech recognition framework
Python 25 Updated: 5 y ago License: Permissive (BSD-3-Clause)

image-and-speech-processing by niehen6174
Face and speech recognition using PyQt5, face_recognition, and Baidu AI
Python 25 Updated: 4 y ago License: No License (No License)

python-deep-speech by patyork
A Python implementation of the Deep Speech paper.
Python 25 Updated: 7 y ago License: No License (No License)

voxceleb-luigi by maxhollmann
Luigi pipeline to download VoxCeleb(2) audio from YouTube and extract speaker segments
Python 25 Updated: 4 y ago License: Permissive (MIT)

Tensorflow-Keyword-Spotting by mostafaelaraby
Keyword spotting using various architectures, such as a convolutional VGGNet, a 1D convolutional network, and CTC.
Python 25 Updated: 4 y ago License: Permissive (Apache-2.0)

AudioToSignLanguageConverter by sahilkhoslaa
A web-based application that accepts audio/voice as input and converts it to the corresponding sign language for Deaf people.
JavaScript 25 Updated: 4 y ago License: No License (No License)

SpeechRecgnition by hccho2
Audio Signal Processing & Speech Recognition
Python 25 Updated: 4 y ago License: No License (No License)

voice-recognition-ua by robinhad
Training scripts for speech-to-text models for the Ukrainian language
Jupyter Notebook 25 Updated: 2 y ago License: Permissive (MIT)

KWS-Net by lilianemomeni
Seeing Wake Words: Audio-visual Keyword Spotting
Python 25 Updated: 4 y ago License: Permissive (MIT)

PitchNet by xiaozhuo12138
An unofficial implementation of the paper titled "PitchNet: Unsupervised Singing Voice Conversion with Pitch Adversarial Network".
Python 25 Updated: 2 y ago License: No License (No License)

Arabic TTS (الناطق العربي)

exkaldi by wangyu09
An advanced Kaldi wrapper for Python
C++ 25 Updated: 3 y ago License: Permissive (Apache-2.0)

golang-tts by zhaopuyang
Microsoft TTS (Text-To-Speech) for golang
Go 25 Updated: 1 y ago License: No License (No License)

audioneex by a-gram
General purpose, real-time audio recognition engine
C++ 25 Updated: 3 y ago License: Weak Copyleft (MPL-2.0)