WaveRNN Vocoder + TTS
gTTS by pndurette
Python library and CLI tool to interface with Google Translate's text-to-speech API
Python
1886
Updated: 2 y ago
License: Permissive (MIT)
STT by coqui-ai
🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy.
C++
1886
Updated: 2 y ago
License: Weak Copyleft (MPL-2.0)
tacotron by Kyubyong
A TensorFlow Implementation of Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model
Python
1813
Updated: 2 y ago
License: Permissive (Apache-2.0)
VITS-fast-fine-tuning by Plachtaa
This repo is a pipeline of VITS finetuning for fast speaker adaptation TTS, and many-to-many voice conversion
Python
1809
Updated: 2 y ago
License: Permissive (Apache-2.0)
asteroid by asteroid-team
The PyTorch-based audio source separation toolkit for researchers
Python
1801
Updated: 2 y ago
License: Permissive (MIT)
deepvoice3_pytorch by r9y9
PyTorch implementation of convolutional neural networks-based text-to-speech synthesis models
Python
1777
Updated: 2 y ago
License: Proprietary (Proprietary)
masr by nobody132
Mandarin Automatic Speech Recognition
Python
1708
Updated: 2 y ago
License: No License (No License)
julius by julius-speech
Open-Source Large Vocabulary Continuous Speech Recognition Engine
C
1671
Updated: 2 y ago
License: Permissive (BSD-3-Clause)
project_alias by bjoernkarmann
Alias is a teachable “parasite” designed to give users more control over their smart assistants, in both customisation and privacy. Through a simple app, the user can train Alias to react to a custom wake word or sound; once trained, Alias can take control of the home assistant by activating it for the user.
Python
1648
Updated: 2 y ago
License: Strong Copyleft (GPL-3.0)
kalliope by kalliope-project
Kalliope is a framework that will help you to create your own personal assistant.
Python
1622
Updated: 2 y ago
License: Strong Copyleft (GPL-3.0)
Digital_Life_Server by zixiiu
Yet another voice assistant, but alive.
Python
1606
Updated: 2 y ago
License: Permissive (MIT)
pyttsx3 by nateshmbhat
Offline text-to-speech synthesis for Python
Python
1571
Updated: 2 y ago
License: Weak Copyleft (MPL-2.0)
delta by Delta-ML
DELTA is a deep learning based natural language and speech processing platform.
Python
1549
Updated: 2 y ago
License: Permissive (Apache-2.0)
OpenSeq2Seq by NVIDIA
Toolkit for efficient experimentation with Speech Recognition, Text2Speech and NLP
Python
1508
Updated: 2 y ago
License: Permissive (Apache-2.0)
DeepSpeech by PaddlePaddle
A Speech Toolkit based on PaddlePaddle.
Python
1470
Updated: 4 y ago
License: Permissive (Apache-2.0)
DeepSpeechRecognition by audier
A Chinese deep speech recognition system, including a deep-learning-based acoustic model and a deep-learning-based language model
Python
1454
Updated: 4 y ago
License: No License (No License)
free-chatgpt-client-pub by akl7777777
ShellGPT is a free ChatGPT client that now supports online search. No key and no login are required. It offers automatic speed testing and switching across multiple nodes, long-text translation with no word limit, and AI image generation.
JavaScript
1445
Updated: 2 y ago
License: No License (No License)
say.js by Marak
TTS (text to speech) for Node.js. Send text from Node.js to your speakers.
JavaScript
1427
Updated: 2 y ago
License: Permissive (MIT)
web-speech-api by mdn
A repository for demos illustrating features of the Web Speech API. See https://developer.mozilla.org/en-US/docs/Web/API/Web_Speech_API for more details.
JavaScript
1408
Updated: 2 y ago
License: Permissive (CC0-1.0)
sphinx4 by cmusphinx
Pure Java speech recognition library
Java
1350
Updated: 2 y ago
License: Proprietary (Proprietary)
voice-elements by zenorocha
Web Component wrapper for the Web Speech API that allows you to do voice recognition and speech synthesis using Polymer
HTML
1344
Updated: 2 y ago
License: No License (No License)
live-transcribe-speech-engine by google
Live Transcribe is an Android application that provides real-time captioning for people who are deaf or hard of hearing. This repository contains the Android client libraries for communicating with Google's Cloud Speech API that are used in Live Transcribe.
Java
1327
Updated: 2 y ago
License: Permissive (Apache-2.0)
ParallelWaveGAN by kan-bayashi
Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN & HiFi-GAN & StyleMelGAN) with PyTorch
Jupyter Notebook
1324
Updated: 2 y ago
License: Permissive (MIT)
delta by didi
DELTA is a deep learning based natural language and speech processing platform.
Python
1289
Updated: 5 y ago
License: Permissive (Apache-2.0)
hifi-gan by jik876
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
Python
1266
Updated: 2 y ago
License: Permissive (MIT)
merlin by CSTR-Edinburgh
This is now the official location of the Merlin project.
Python
1260
Updated: 2 y ago
License: Permissive (Apache-2.0)
RHVoice by RHVoice
a free and open source speech synthesizer for Russian and other languages
C++
1255
Updated: 2 y ago
License: Strong Copyleft (GPL-2.0)
FastSpeech2 by ming024
An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech"
Python
1246
Updated: 2 y ago
License: Permissive (MIT)
speak.js by kripken
Text-to-Speech in JavaScript using eSpeak
C++
1234
Updated: 2 y ago
License: Strong Copyleft (GPL-3.0)
so-vits-svc-5.0 by PlayVoice
Core Engine of Singing Voice Conversion & Singing Voice Clone
Python
1207
Updated: 2 y ago
License: Permissive (MIT)
pororo by kakaobrain
PORORO: Platform Of neuRal mOdels for natuRal language prOcessing
Python
1199
Updated: 2 y ago
License: Permissive (Apache-2.0)
artyom.js by sdkcarlos
A JavaScript library for voice control, voice commands, speech recognition, and speech synthesis. Create your own Siri, Google Now, or Cortana with Google Chrome within your website.
JavaScript
1165
Updated: 2 y ago
License: Permissive (MIT)
dc_tts by Kyubyong
A TensorFlow Implementation of DC-TTS: yet another text-to-speech model
Python
1133
Updated: 2 y ago
License: Permissive (Apache-2.0)
praat by praat
Praat: Doing Phonetics By Computer
C
1117
Updated: 2 y ago
License: No License (No License)
XZVoice by bawangxx
Free and open source text-to-speech software
JavaScript
1117
Updated: 2 y ago
License: No License (No License)
edge-tts by rany2
Use Microsoft Edge's online text-to-speech service from Python (without needing Microsoft Edge/Windows or an API key)
Python
1056
Updated: 2 y ago
License: Strong Copyleft (GPL-3.0)
SAM by s-macke
Software Automatic Mouth - Tiny Speech Synthesizer
C
1054
Updated: 2 y ago
License: No License (No License)
speech-to-text-nodejs by watson-developer-cloud
Sample Node.js Application for the IBM Watson Speech to Text Service
JavaScript
1050
Updated: 2 y ago
License: Permissive (Apache-2.0)
svoice by facebookresearch
We provide a PyTorch implementation of the paper Voice Separation with an Unknown Number of Multiple Speakers, in which we present a new method for separating a mixed audio sequence where multiple voices speak simultaneously. The new method employs gated neural networks that are trained to separate the voices at multiple processing steps, while keeping the speaker in each output channel fixed. A different model is trained for every number of possible speakers, and the model with the largest number of speakers is employed to select the actual number of speakers in a given sample. Our method greatly outperforms the current state of the art, which, as we show, is not competitive for more than two speakers.
Python
1029
Updated: 2 y ago
License: Proprietary (Proprietary)
voice2json by synesthesiam
Command-line tools for speech and intent recognition on Linux
Python
1028
Updated: 2 y ago
License: Permissive (MIT)
SincNet by mravanelli
SincNet is a neural architecture for efficiently processing raw audio samples.
Python
1017
Updated: 2 y ago
License: Permissive (MIT)
World by mmorise
A high-quality speech analysis, manipulation and synthesis system
C++
1017
Updated: 2 y ago
License: Proprietary (Proprietary)
kaldi-gstreamer-server by alumae
Real-time full-duplex speech recognition server, based on the Kaldi toolkit and the GStreamer framework.
Python
1015
Updated: 2 y ago
License: Permissive (BSD-2-Clause)
NeuralSpeech by microsoft
Python
999
Updated: 2 y ago
License: Permissive (MIT)
vall-e by lifeiteng
PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech). Reproduced demo: https://lifeiteng.github.io/valle/index.html
Python
996
Updated: 2 y ago
License: Permissive (Apache-2.0)
Montreal-Forced-Aligner by MontrealCorpusTools
Command line utility for forced alignment using Kaldi
Python
991
Updated: 2 y ago
License: Permissive (MIT)
ffmpeg-normalize by slhck
Audio Normalization for Python/ffmpeg
Python
982
Updated: 2 y ago
License: Permissive (MIT)
TransformerTTS by as-ideas
🤖💬 Transformer TTS: Implementation of a non-autoregressive Transformer based neural network for text to speech.
Python
974
Updated: 2 y ago
License: Proprietary (Proprietary)
nlp-paper by DengBoCong
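Papers in the field of natural language processing (with reading notes), reproduced models, data processing, and more (code provided in both TensorFlow and PyTorch versions)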
Python
960
Updated: 2 y ago
License: Permissive (Apache-2.0)