WaveRNN Vocoder + TTS

gTTS by pndurette
Python library and CLI tool to interface with Google Translate's text-to-speech API
Python | 1886 | Updated: 1 y ago | License: Permissive (MIT)

STT by coqui-ai
🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy.
C++ | 1886 | Updated: 1 y ago | License: Weak Copyleft (MPL-2.0)

tacotron by Kyubyong
A TensorFlow Implementation of Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model
Python | 1813 | Updated: 2 y ago | License: Permissive (Apache-2.0)

VITS-fast-fine-tuning by Plachtaa
A VITS fine-tuning pipeline for fast speaker-adaptation TTS and many-to-many voice conversion
Python | 1809 | Updated: 1 y ago | License: Permissive (Apache-2.0)

asteroid by asteroid-team
The PyTorch-based audio source separation toolkit for researchers
Python | 1801 | Updated: 2 y ago | License: Permissive (MIT)

deepvoice3_pytorch by r9y9
PyTorch implementation of convolutional neural networks-based text-to-speech synthesis models
Python | 1777 | Updated: 2 y ago | License: Proprietary (Proprietary)

masr by nobody132
Mandarin Automatic Speech Recognition
Python | 1708 | Updated: 1 y ago | License: No License (No License)

julius by julius-speech
Open-Source Large Vocabulary Continuous Speech Recognition Engine
C | 1671 | Updated: 1 y ago | License: Permissive (BSD-3-Clause)

project_alias by bjoernkarmann
Alias is a teachable “parasite” designed to give users more control over their smart assistants, in both customisation and privacy. Through a simple app, the user can train Alias to react to a custom wake-word or sound; once trained, Alias can take control of the home assistant by activating it for the user.
Python | 1648 | Updated: 2 y ago | License: Strong Copyleft (GPL-3.0)

kalliope by kalliope-project
Kalliope is a framework that will help you to create your own personal assistant.
Python | 1622 | Updated: 2 y ago | License: Strong Copyleft (GPL-3.0)

Digital_Life_Server by zixiiu
Yet another voice assistant, but alive.
Python | 1606 | Updated: 1 y ago | License: Permissive (MIT)

pyttsx3 by nateshmbhat
Offline text-to-speech synthesis for Python
Python | 1571 | Updated: 1 y ago | License: Weak Copyleft (MPL-2.0)

delta by Delta-ML
DELTA is a deep learning based natural language and speech processing platform.
Python | 1549 | Updated: 1 y ago | License: Permissive (Apache-2.0)

OpenSeq2Seq by NVIDIA
Toolkit for efficient experimentation with Speech Recognition, Text2Speech and NLP
Python | 1508 | Updated: 1 y ago | License: Permissive (Apache-2.0)

DeepSpeech by PaddlePaddle
A Speech Toolkit based on PaddlePaddle.
Python | 1470 | Updated: 3 y ago | License: Permissive (Apache-2.0)

DeepSpeechRecognition by audier
A Chinese deep speech recognition system, including a deep-learning-based acoustic model and a deep-learning-based language model
Python | 1454 | Updated: 4 y ago | License: No License (No License)

free-chatgpt-client-pub by akl7777777
ShellGPT is a free ChatGPT client that now supports online search. No key and no login are needed. It offers multi-node automatic speed testing and switching, long-text translation with no word limit, and AI image generation.
JavaScript | 1445 | Updated: 1 y ago | License: No License (No License)

say.js by Marak
TTS (text to speech) for Node.js. Send text from Node.js to your speakers.
JavaScript | 1427 | Updated: 1 y ago | License: Permissive (MIT)

web-speech-api by mdn
A repository for demos illustrating features of the Web Speech API. See https://developer.mozilla.org/en-US/docs/Web/API/Web_Speech_API for more details.
JavaScript | 1408 | Updated: 2 y ago | License: Permissive (CC0-1.0)

sphinx4 by cmusphinx
Pure Java speech recognition library
Java | 1350 | Updated: 2 y ago | License: Proprietary (Proprietary)

voice-elements by zenorocha
Web Component wrapper for the Web Speech API that allows you to do voice recognition and speech synthesis using Polymer
HTML | 1344 | Updated: 2 y ago | License: No License (No License)

live-transcribe-speech-engine by google
Live Transcribe is an Android application that provides real-time captioning for people who are deaf or hard of hearing. This repository contains the Android client libraries for communicating with Google's Cloud Speech API that are used in Live Transcribe.
Java | 1327 | Updated: 2 y ago | License: Permissive (Apache-2.0)

ParallelWaveGAN by kan-bayashi
Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN & HiFi-GAN & StyleMelGAN) with PyTorch
Jupyter Notebook | 1324 | Updated: 1 y ago | License: Permissive (MIT)

delta by didi
DELTA is a deep learning based natural language and speech processing platform.
Python | 1289 | Updated: 4 y ago | License: Permissive (Apache-2.0)

hifi-gan by jik876
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
Python | 1266 | Updated: 2 y ago | License: Permissive (MIT)

merlin by CSTR-Edinburgh
This is now the official location of the Merlin project.
Python | 1260 | Updated: 2 y ago | License: Permissive (Apache-2.0)

RHVoice by RHVoice
A free and open source speech synthesizer for Russian and other languages
C++ | 1255 | Updated: 1 y ago | License: Strong Copyleft (GPL-2.0)

FastSpeech2 by ming024
An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech"
Python | 1246 | Updated: 1 y ago | License: Permissive (MIT)

speak.js by kripken
Text-to-Speech in JavaScript using eSpeak
C++ | 1234 | Updated: 2 y ago | License: Strong Copyleft (GPL-3.0)

so-vits-svc-5.0 by PlayVoice
Core Engine of Singing Voice Conversion & Singing Voice Clone
Python | 1207 | Updated: 1 y ago | License: Permissive (MIT)

pororo by kakaobrain
PORORO: Platform Of neuRal mOdels for natuRal language prOcessing
Python | 1199 | Updated: 1 y ago | License: Permissive (Apache-2.0)

artyom.js by sdkcarlos
A JavaScript library for voice control, voice commands, speech recognition and speech synthesis. Create your own Siri, Google Now or Cortana with Google Chrome within your website.
JavaScript | 1165 | Updated: 2 y ago | License: Permissive (MIT)

dc_tts by Kyubyong
A TensorFlow Implementation of DC-TTS: yet another text-to-speech model
Python | 1133 | Updated: 2 y ago | License: Permissive (Apache-2.0)

praat by praat
Praat: Doing Phonetics By Computer
C | 1117 | Updated: 2 y ago | License: No License (No License)

XZVoice by bawangxx
Free and open source text-to-speech software
JavaScript | 1117 | Updated: 2 y ago | License: No License (No License)

edge-tts by rany2
Use Microsoft Edge's online text-to-speech service from Python (without needing Microsoft Edge/Windows or an API key)
Python | 1056 | Updated: 1 y ago | License: Strong Copyleft (GPL-3.0)

SAM by s-macke
Software Automatic Mouth - Tiny Speech Synthesizer
C | 1054 | Updated: 1 y ago | License: No License (No License)

speech-to-text-nodejs by watson-developer-cloud
Sample Node.js Application for the IBM Watson Speech to Text Service
JavaScript | 1050 | Updated: 2 y ago | License: Permissive (Apache-2.0)

svoice by facebookresearch
A PyTorch implementation of the paper "Voice Separation with an Unknown Number of Multiple Speakers", which presents a new method for separating a mixed audio sequence in which multiple voices speak simultaneously. The method employs gated neural networks that are trained to separate the voices at multiple processing steps, while keeping the speaker in each output channel fixed. A different model is trained for every number of possible speakers, and the model with the largest number of speakers is employed to select the actual number of speakers in a given sample. According to the authors, the method greatly outperforms the current state of the art, which, as they show, is not competitive for more than two speakers.
Python | 1029 | Updated: 1 y ago | License: Proprietary (Proprietary)

voice2json by synesthesiam
Command-line tools for speech and intent recognition on Linux
Python | 1028 | Updated: 2 y ago | License: Permissive (MIT)

SincNet by mravanelli
SincNet is a neural architecture for efficiently processing raw audio samples.
Python | 1017 | Updated: 2 y ago | License: Permissive (MIT)

World by mmorise
A high-quality speech analysis, manipulation and synthesis system
C++ | 1017 | Updated: 2 y ago | License: Proprietary (Proprietary)

kaldi-gstreamer-server by alumae
Real-time full-duplex speech recognition server, based on the Kaldi toolkit and the GStreamer framework.
Python | 1015 | Updated: 2 y ago | License: Permissive (BSD-2-Clause)

NeuralSpeech by microsoft
Python | 999 | Updated: 1 y ago | License: Permissive (MIT)

vall-e by lifeiteng
PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech). Reproduced demo: https://lifeiteng.github.io/valle/index.html
Python | 996 | Updated: 1 y ago | License: Permissive (Apache-2.0)

Montreal-Forced-Aligner by MontrealCorpusTools
Command line utility for forced alignment using Kaldi
Python | 991 | Updated: 1 y ago | License: Permissive (MIT)

ffmpeg-normalize by slhck
Audio Normalization for Python/ffmpeg
Python | 982 | Updated: 2 y ago | License: Permissive (MIT)

TransformerTTS by as-ideas
🤖💬 Transformer TTS: Implementation of a non-autoregressive Transformer based neural network for text to speech.
Python | 974 | Updated: 2 y ago | License: Proprietary (Proprietary)

nlp-paper by DengBoCong
Papers in the field of natural language processing (with reading notes), reproduced models, data processing, and more (code provided in both TensorFlow and PyTorch versions)
Python | 960 | Updated: 1 y ago | License: Permissive (Apache-2.0)
