Speech Libraries - Page 27

DeepSpeech2-Keras by ShankHarinath

DeepSpeech, Speech To Text, ASR, Speech recognition, Keras, TensorFlow

Python 30 Version: Current

Updated: 4 y ago

License: No License (No License)

NER-with-LS by ghaddarAbs

Python 30 Version: Current

Updated: 4 y ago

License: No License (No License)

speechapi-examples by iandevlin

A small collection of examples that use the Web Speech API.

JavaScript 30 Version: Current

Updated: 4 y ago

License: Permissive (MIT)

wiki2ssml by baxtree

Wiki2SSML provides the WikiVoice markup language used for fine-tuning synthesised voice.

JavaScript 30 Version: Current

Updated: 3 y ago

License: Permissive (Apache-2.0)

vue-waveform by chenqiaoen521

waveform wavesurfer -waveform js html audio waveform chart

JavaScript 30 Version: Current

Updated: 3 y ago

License: No License (No License)

voicekit-examples by TinkoffCreditSystems

Examples of how to use Tinkoff Voicekit

C# 30 Version: Current

Updated: 4 y ago

License: Permissive (Apache-2.0)

universal-vocoder by yistLin

A PyTorch implementation of the universal neural vocoder

Python 30 Version: Current

Updated: 3 y ago

License: No License (No License)

BingSpeech by davrous

A small JavaScript library to call Bing Speech-To-Text API with continuous detection and Text-To-Speech API

TypeScript 30 Version: Current

Updated: 4 y ago

License: Permissive (MIT)

pytorch_MLP_for_ASR by mravanelli

This code implements a basic MLP for speech recognition. The MLP is trained with PyTorch, while feature extraction, alignments, and decoding are performed with Kaldi. The current implementation supports dropout and batch normalization. An example for phoneme recognition using the standard TIMIT dataset is provided.

Perl 30 Version: Current

Updated: 4 y ago

License: No License (No License)

SpeakerVoiceIdentifier by FragJage

SpeakerVoiceIdentifier can recognize the voice of a speaker by learning.

C++ 30 Version: Current

Updated: 4 y ago

License: Strong Copyleft (GPL-3.0)

lpctron-tts-cpp by alokprasad

C++ implementation of end-to-end TTS that combines Tacotron2 and the LPCNet vocoder.

C 30 Version: Current

Updated: 4 y ago

License: No License (No License)

speech.ko by homink

Korean read speech corpus (about 120 hours, 17 GB) from the National Institute of Korean Language

Shell 30 Version: Current

Updated: 4 y ago

License: No License (No License)

SpeechRecognition by OAID

A local automatic speech recognition project based on Kaldi and ALSA.

C++ 30 Version: Current

Updated: 4 y ago

License: No License (No License)

freeswitch-sounds-tts by jpawlowski

FreeSWITCH TTS Voice Prompt Generator

Shell 30 Version: Current

Updated: 3 y ago

License: Proprietary (Proprietary)

Python-Assistant by Umesh-01

Python Assistant (PA) is a voice-command-based assistant service written in Python 3.9+. It can recognize human speech, talk to the user, and execute basic commands.

Python 30 Version: Current

Updated: 3 y ago

License: Permissive (MIT)

Scribosermo by Jaco-Assistant

Train fast Speech-to-Text networks in different languages

Python 30 Version: Current

Updated: 3 y ago

License: Weak Copyleft (GNU LGPLv3)

kaldifst by k2-fsa

Python wrapper for OpenFST and its extensions from Kaldi. Also supports reading/writing ark/scp files.

C++ 30 Version: Current

Updated: 2 y ago

License: Proprietary (Proprietary)

baf-dataset by guillemcortes

Reproducibility kit for "BAF: An Audio Fingerprinting Dataset for Broadcast Monitoring" by Guillem Cortès, Álex Ciurana, Emilio Molina, Marius Miron, Owen Meyers, Joren Six and Xavier Serra.

Python 30 Version: Current

Updated: 2 y ago

License: Permissive (Apache-2.0)

AudioToText by Carleslc

Transcribe and translate audio to text using Whisper and DeepL.

Jupyter Notebook 30 Version: Current

Updated: 2 y ago

License: No License (No License)

PPG-Diff-VC by seahore

A diffusion-based cross-lingual voice conversion model, developed as the author's bachelor's thesis

Python 30 Version: Current

Updated: 2 y ago

License: No License (No License)

tafrigh by ieasybooks

Transcribe visual or audio materials into text

Python 30 Version: Current

Updated: 2 y ago

License: Permissive (MIT)

whisper-writer by savbell

💬📝 A small dictation app using OpenAI's Whisper speech recognition model.

Python 30 Version: Current

Updated: 2 y ago

License: Strong Copyleft (GPL-3.0)

tal-asrd by calclavia

Code for the paper "Speech Recognition and Multi-Speaker Diarization of Long Conversations"

Python 30 Version: Current

Updated: 2 y ago

License: No License (No License)

tg_bot_stt_tts by tochilkinva

Telegram bot with voice message recognition and generation. Speech to Text and Text to Speech

Python 30 Version: Current

Updated: 2 y ago

License: Permissive (MIT)

wav2train by talonvoice

Automatically aligns transcribed audio and generates a wav2letter training corpus

Python 29 Version: Current

Updated: 4 y ago

License: Permissive (MIT)

Persian-Speech-Recognition by amirfrsd

Persian Speech Recognition using Google APIs

Python 29 Version: Current

Updated: 4 y ago

License: Permissive (MIT)

download_audioset by jim-schwoebel

📁 This repo makes it easy to download the raw audio files from AudioSet (32.45 GB, 632 classes).

Python 29 Version: Current

Updated: 4 y ago

License: Proprietary (Proprietary)

pepperspeechrecognition by JBramauer

Google Speech Recognition Module for Naoqi and the Pepper Robot by Aldebaran

Python 29 Version: Current

Updated: 2 y ago

License: Permissive (MIT)

ember-autoserve by ebryn

Runs `ember serve` and will automatically restart it when necessary

JavaScript 29 Version: Current

Updated: 5 y ago

License: Permissive (MIT)

SpeechSynthesisRecorder by guest271314

Get audio output from a window.speechSynthesis.speak() call as ArrayBuffer, AudioBuffer, Blob, MediaSource, MediaStream, ReadableStream, or other object or data types

JavaScript 29 Version: Current

Updated: 4 y ago

License: No License (No License)

speaktome-web by mozilla

JavaScript modules for Mozilla's cloud speech recognition API.

JavaScript 29 Version: Current

Updated: 3 y ago

License: Weak Copyleft (MPL-2.0)

jarvis by Blooware

Jarvis Tutorial

JavaScript 29 Version: Current

Updated: 4 y ago

License: No License (No License)

langue by yuhr

A modern platform for conlanging. Currently in the planning stage.

TypeScript 29 Version: Current

Updated: 4 y ago

License: Proprietary (Proprietary)

Multiband-WaveRNN by Rongjiehuang

An unofficial implementation of the autoregressive vocoder Multiband-WaveRNN. Audio samples at https://rongjiehuang.github.io/Multiband-WaveRNN/

Python 29 Version: Current

Updated: 2 y ago

License: Permissive (MIT)

Phase_aware_Deep_Complex_UNet by Doyosae

Implementation of Phase-aware Speech Enhancement with Deep Complex U-Net

Python 29 Version: Current

Updated: 3 y ago

License: No License (No License)

auditory-slow-fast by ekazakos

Implementation of "Slow-Fast Auditory Streams for Audio Recognition, ICASSP, 2021" in PyTorch

Python 29 Version: Current

Updated: 3 y ago

License: Proprietary (Proprietary)

hmm-for-emo-tts by Emotional-Text-to-Speech

A repository with comprehensive instructions for using the Festvox toolkit to generate emotional speech from text

CSS 29 Version: Current

Updated: 2 y ago

License: Permissive (MIT)

speech by ng-web-apis

A library for using Web Speech API with Angular

TypeScript 29 Version: Current

Updated: 2 y ago

License: Permissive (MIT)

AUDIO-SPEECH-TO-SIGN-LANGUAGE-CONVERTER by jigargajjar55

A web-based application that accepts audio speech or text as input and converts it to the corresponding Indian Sign Language for deaf people and people with speech or hearing impairments.

HTML 29 Version: Current

Updated: 2 y ago

License: Permissive (MIT)

en-pos by FinNLP

⚙️ [Processor] A better English POS tagger written in JavaScript

TypeScript 29 Version: Current

Updated: 4 y ago

License: No License (No License)

DiDiSpeech by athena-team

HTML 29 Version: Current

Updated: 4 y ago

License: No License (No License)

e6870 by placebokkk

Assignments for the e6870 ASR class

C 29 Version: Current

Updated: 4 y ago

License: No License (No License)

blacksiren by rokid

Front-end noise reduction module for embedded device environments

C++ 29 Version: Current

Updated: 4 y ago

License: No License (No License)

ttskit by KuangDD

Speech synthesis toolbox (Text To Speech Toolkit): a speech synthesis tool with a choice of multiple voice timbres.

Python 29 Version: Current

Updated: 3 y ago

License: No License (No License)

speech-to-text by sdhilip200

Speech to Text with Hugging Face and Wav2vec 2.0

Jupyter Notebook 29 Version: Current

Updated: 2 y ago

License: No License (No License)

EaBNet by Andong-Li-speech

This is the repo of the manuscript "Embedding and Beamforming: All-Neural Causal Beamformer for Multichannel Speech Enhancement", which was submitted to ICASSP 2022.

Python 29 Version: Current

Updated: 3 y ago

License: No License (No License)

UnsupTTS by lwang114

Shell 29 Version: Current

Updated: 2 y ago

License: No License (No License)

HOSCY by PaciStardust

Companion for OSC and Communication

C# 29 Version: Current

Updated: 1 y ago

License: Strong Copyleft (GPL-2.0)

MimicMania by everydaycodings

MimicMania is a web application that allows you to generate speech and clone voices using text-to-speech technology. With MimicMania, you can create custom voices in a variety of languages and use them for a range of applications, from voiceovers to chatbots.

Python 29 Version: Current

Updated: 2 y ago

License: Permissive (MIT)

DisCo by PantoMatrix

Disentangled Implicit Content and Rhythm Learning for Diverse Co-Speech Gestures Synthesis [ACMMM 2022]

Python 29 Version: Current

Updated: 2 y ago

License: No License (No License)
