Speech Libraries - Page 4

asteroid by mpariente

The PyTorch-based audio source separation toolkit for researchers

Python

611

Updated: 4 y ago

License: Permissive (MIT)

ms-ra-forwarder by wxxxcxx

Free online text-to-speech API

HTML

607

Updated: 2 y ago

License: Permissive (MIT)

SpecAugment by DemisEom

An implementation of SpecAugment with TensorFlow & PyTorch, introduced by Google Brain

Python

596

Updated: 2 y ago

License: Permissive (Apache-2.0)

speech-denoising-wavenet by drethage

A neural network for end-to-end speech denoising

Python

594

Updated: 2 y ago

License: Permissive (MIT)

cboard by cboard-org

Augmentative and Alternative Communication (AAC) system with text-to-speech for the browser

JavaScript

594

Updated: 2 y ago

License: Strong Copyleft (GPL-3.0)

diffwave by lmnt-com

DiffWave is a fast, high-quality neural vocoder and waveform synthesizer.

Python

593

Updated: 2 y ago

License: Permissive (Apache-2.0)

sonus by evancohen

/so.nus/ STT (speech to text) for Node with offline hotword detection

JavaScript

592

Updated: 2 y ago

License: Permissive (MIT)

WavAugment by facebookresearch

A library for speech data augmentation in the time domain

Python

585

Updated: 2 y ago

License: Permissive (MIT)

Parakeet by PaddlePaddle

PAddle PARAllel text-to-speech toolKIT (supporting Tacotron2, Transformer TTS, FastSpeech2/FastPitch, SpeedySpeech, WaveFlow and Parallel WaveGAN)

Python

584

Updated: 2 y ago

License: Proprietary (Proprietary)

inaSpeechSegmenter by ina-foss

CNN-based audio segmentation toolkit. Detects speech, music, and speaker gender. Designed for large-scale gender equality studies based on speech time per gender.

Python

584

Updated: 2 y ago

License: Permissive (MIT)

vosk-android-demo by alphacep

Offline speech recognition for Android with Vosk library.

Java

578

Updated: 2 y ago

License: Permissive (Apache-2.0)

dialogflow-android-client by dialogflow

Android SDK for Dialogflow

Java

577

Updated: 4 y ago

License: Permissive (Apache-2.0)

ark-tweet-nlp by brendano

CMU ARK Twitter Part-of-Speech Tagger

Java

573

Updated: 2 y ago

License: Proprietary (Proprietary)

openspeech by openspeech-team

Open-Source Toolkit for End-to-End Speech Recognition leveraging PyTorch-Lightning and Hydra.

Python

572

Updated: 2 y ago

License: Permissive (MIT)

MoeVoiceStudio by NaruseMioShirakana

Audio processing software written in C++

C++

572

Updated: 2 y ago

License: Strong Copyleft (AGPL-3.0)

concrete5-legacy by concretecms

Legacy repository for concrete5

PHP

566

Updated: 2 y ago

License: No License (No License)

SpeechAlgorithms by Ryuk17

Speech Algorithms

C

564

Updated: 2 y ago

License: Permissive (Apache-2.0)

dialogflow-python-client by dialogflow

Python library for Dialogflow

Python

559

Updated: 4 y ago

License: Permissive (Apache-2.0)

av_hubert by facebookresearch

A self-supervised learning framework for audio-visual speech

Python

559

Updated: 2 y ago

License: Proprietary (Proprietary)

sprocket by k2kobayashi

Voice Conversion Tool Kit

Python

553

Updated: 2 y ago

License: Permissive (MIT)

Conv-TasNet by kaituoxu

A PyTorch implementation of Conv-TasNet described in "TasNet: Surpassing Ideal Time-Frequency Masking for Speech Separation" with Permutation Invariant Training (PIT).

Python

552

Updated: 2 y ago

License: Permissive (MIT)

language-detector by optimaize

Language Detection Library for Java

Java

543

Updated: 2 y ago

License: Permissive (Apache-2.0)

YourTTS by Edresson

YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone

Jupyter Notebook

541

Updated: 2 y ago

License: Proprietary (Proprietary)

rhino by Picovoice

On-device Speech-to-Intent engine powered by deep learning

Python

533

Updated: 2 y ago

License: Permissive (Apache-2.0)

opentts by synesthesiam

Open Text to Speech Server

Python

530

Updated: 2 y ago

License: Permissive (MIT)

Multilingual_Text_to_Speech by Tomiinek

An implementation of Tacotron 2 that supports multilingual experiments with parameter-sharing, code-switching, and voice cloning.

Python

528

Updated: 3 y ago

License: Permissive (MIT)

voice-builder by google

An open-source text-to-speech (TTS) voice building tool

JavaScript

528

Updated: 2 y ago

License: Permissive (Apache-2.0)

SpeechSplit by auspicious3000

Unsupervised Speech Decomposition Via Triple Information Bottleneck

Python

527

Updated: 2 y ago

License: Permissive (MIT)

FunASR by alibaba-damo-academy

A Fundamental End-to-End Speech Recognition Toolkit

Python

524

Updated: 2 y ago

License: Proprietary (Proprietary)

tacotron by google

Audio samples accompanying publications related to Tacotron, an end-to-end speech synthesis model.

HTML

522

Updated: 2 y ago

License: Proprietary (Proprietary)

PaddlePaddle-DeepSpeech by yeyupiaoling

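Speech recognition implemented with PaddlePaddle, targeting Chinese speech recognition. The project is complete and recognition quality is good. Supports training and inference on Windows and Linux, and inference on Nvidia Jetson development boards.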

Python

519

Updated: 2 y ago

License: Permissive (Apache-2.0)

GigaSpeech by SpeechColab

Large, modern dataset for speech recognition

Shell

515

Updated: 2 y ago

License: Permissive (Apache-2.0)

mir_eval by craffel

Evaluation functions for music/audio information retrieval/signal processing algorithms.

Python

512

Updated: 2 y ago

License: Permissive (MIT)

hark by otalk

Converts an audio stream to speech events in the browser

JavaScript

504

Updated: 2 y ago

License: No License (No License)

CTCWordBeamSearch by githubharald

Connectionist Temporal Classification (CTC) decoder with dictionary and language model.

C++

504

Updated: 2 y ago

License: Permissive (MIT)

gantts by r9y9

PyTorch implementation of GAN-based text-to-speech synthesis and voice conversion (VC)

Jupyter Notebook

503

Updated: 2 y ago

License: Proprietary (Proprietary)

speech-to-text-benchmark by Picovoice

Speech-to-text benchmark framework

Python

502

Updated: 3 y ago

License: Permissive (Apache-2.0)

voicefixer by haoheliu

General Speech Restoration

Python

501

Updated: 2 y ago

License: Permissive (MIT)

The J.A.R.V.I.S. Speech API is designed to be simple and efficient, using the speech engines created by Google to provide functionality for parts of the API. Essentially, it is an API written in Java, including a recognizer, synthesizer, and a microphone capture utility. The project uses Google services for the synthesizer and recognizer. While this requires an Internet connection, it provides a complete, modern, and fully functional speech API in Java.

Java

496

Updated: 4 y ago

License: Strong Copyleft (GPL-3.0)

PPASR by yeyupiaoling

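End-to-end Chinese speech recognition implemented with PaddlePaddle, from getting started to practical use, with very simple introductory examples and highly practical enterprise projects. Supports the currently most popular DeepSpeech2, Conformer, and Squeezeformer models.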

Python

495

Updated: 2 y ago

License: Permissive (Apache-2.0)

melgan by seungwonpark

MelGAN vocoder (compatible with NVIDIA/tacotron2)

Python

494

Updated: 4 y ago

License: Permissive (BSD-3-Clause)

cheetah by Picovoice

On-device streaming speech-to-text engine powered by deep learning

Python

491

Updated: 2 y ago

License: Permissive (Apache-2.0)

cn2an by Ailln

📦 Quickly converts between Chinese numerals and Arabic numerals (latest features: conversion of fractions, dates, temperatures, and more)

Python

486

Updated: 2 y ago

License: Permissive (MIT)

kospeech by sooftware

Open-Source Toolkit for End-to-End Korean Automatic Speech Recognition leveraging PyTorch and Hydra.

Python

485

Updated: 2 y ago

License: Permissive (Apache-2.0)

FloWaveNet by ksw0306

A PyTorch implementation of "FloWaveNet: A Generative Flow for Raw Audio"

Python

476

Updated: 4 y ago

License: Permissive (MIT)

AutoSub by abhirooptalasila

A CLI script to generate subtitle files (SRT/VTT/TXT) for any video using either DeepSpeech or Coqui

Python

470

Updated: 2 y ago

License: Permissive (MIT)

neural_sp by hirofumi0810

End-to-end ASR/LM implementation with PyTorch

Python

469

Updated: 4 y ago

License: Permissive (Apache-2.0)

Python-Wrapper-for-World-Vocoder by JeremyCCHsu

A Python wrapper for the high-quality vocoder "World"

Python

468

Updated: 4 y ago

License: Permissive (MIT)

spec_augment by zcaceres

🔦 A PyTorch implementation of Google Brain's SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition

Jupyter Notebook

467

Updated: 2 y ago

License: Permissive (MIT)

flutter_tts by dlutton

Flutter Text to Speech package

C++

461

Updated: 2 y ago

License: Permissive (MIT)


Open Weaver – Develop Applications Faster with Open Source
