by kandikits Updated: Oct 16, 2022
Speaker Diarization is the process of identifying and distinguishing the different speakers in a speech or audio file. In this kit, we demonstrate an application of the Speaker Diarization concept using open source libraries.
To install this kit, scroll down to the 'Kit Deployment Instructions' section and follow the instructions.
The Speaker Diarization solution created using this kit is added to this section. The entire solution is available as a package to download from the source code repository.
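To make the idea concrete, diarization output is often represented as a list of time segments, each labelled with a speaker. The sketch below is only an illustration: the segment values and the `SPEAKER_00`-style labels are made-up sample data, not output from this kit, and `speaking_time` is a hypothetical helper name.

```python
from collections import defaultdict

# Hypothetical diarization result: (start_seconds, end_seconds, speaker_label).
# These values are invented for illustration only.
segments = [
    (0.0, 4.5, "SPEAKER_00"),
    (4.5, 9.0, "SPEAKER_01"),
    (9.0, 12.5, "SPEAKER_00"),
]


def speaking_time(segments):
    """Sum the duration of all segments for each speaker label."""
    totals = defaultdict(float)
    for start, end, speaker in segments:
        totals[speaker] += end - start
    return dict(totals)


totals = speaking_time(segments)
for speaker, seconds in sorted(totals.items()):
    print(f"{speaker}: {seconds:.1f} s")
```

A representation like this answers "who spoke when" and makes follow-up analysis, such as total speaking time per speaker, a simple aggregation.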
- Download, extract, and double-click the kit installer file to install the kit.
- After successful installation of the kit, press 'Y' to run the kit and execute the cells in the notebook.
- To run the kit manually, press 'N' and locate the zip file 'speaker-diarization'.
- Extract the zip file and navigate to the directory 'speaker-diarization-master'.
- Open a command prompt in the extracted directory 'speaker-diarization-master' and run the command 'jupyter notebook'.
- Locate and open the 'SpeakerDiarization.ipynb' notebook from the Jupyter Notebook browser window.
- Execute the cells in the notebook. Note: The demo source code will be downloaded to the local machine. It is also available here
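The manual steps above could also be scripted. The following is a minimal sketch, assuming the archive is named `speaker-diarization.zip` (the page only gives the name 'speaker-diarization') and that the `jupyter` command is on the PATH; `extract_kit` and `launch_notebook` are hypothetical helper names, not part of the kit.

```python
import subprocess
import zipfile
from pathlib import Path


def extract_kit(zip_path, dest_dir):
    """Extract the kit archive and return the expected project directory."""
    dest = Path(dest_dir)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
    # Directory name taken from the instructions above.
    return dest / "speaker-diarization-master"


def launch_notebook(project_dir):
    """Run 'jupyter notebook' in the project directory (blocks until it exits)."""
    subprocess.run(["jupyter", "notebook"], cwd=project_dir, check=True)


# Example usage (assumed file name; uncomment to run):
# project = extract_kit("speaker-diarization.zip", ".")
# launch_notebook(project)
```

Once the Jupyter browser window opens, 'SpeakerDiarization.ipynb' can be opened and its cells executed as described in the list.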
VSCode and Jupyter Notebook are used for development and debugging. Jupyter Notebook is a web-based interactive environment often used for experiments, whereas VSCode provides a typical IDE experience for developers. This kit was developed in Jupyter Notebook.
Jupyter metapackage for installation, docs and chat
Python · 14197 · Version: Current · License: Permissive (BSD-3-Clause)
Transformers and PyTorch Hub are state-of-the-art libraries that provide pre-trained models for various ML/AI applications.
🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
Python · 88466 · Version: v4.27.4 · License: Permissive (Apache-2.0)
Tensors and Dynamic neural networks in Python with strong GPU acceleration
C++ · 64522 · Version: v2.0.0 · License: Others (Non-SPDX)
The pyannote.audio library from PyPI provides support for audio processing and other speech-related applications.
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
Python · 2558 · Version: 2.1.1 · License: Permissive (MIT)
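As a rough illustration of how pyannote.audio can be used for diarization, the sketch below loads a pretrained pipeline and lists the speaker turns. It is not necessarily what the kit's notebook does: the model name `pyannote/speaker-diarization`, the need for a Hugging Face access token, and the helper names `diarize` and `format_turns` are assumptions made for this example, based on the pyannote.audio 2.x API.

```python
def diarize(audio_path, access_token):
    """Run a pretrained pyannote.audio pipeline on one audio file.

    Returns a list of (start_seconds, end_seconds, speaker_label) tuples.
    """
    # Imported lazily so format_turns() can be used without pyannote installed.
    from pyannote.audio import Pipeline

    # Assumed model name and token; adjust to the model you have access to.
    pipeline = Pipeline.from_pretrained(
        "pyannote/speaker-diarization", use_auth_token=access_token
    )
    annotation = pipeline(audio_path)
    # itertracks(yield_label=True) yields (segment, track, speaker_label).
    return [
        (turn.start, turn.end, speaker)
        for turn, _, speaker in annotation.itertracks(yield_label=True)
    ]


def format_turns(turns):
    """Render speaker turns as readable lines."""
    return [f"{start:.1f}s - {end:.1f}s: {speaker}" for start, end, speaker in turns]


# Example usage (requires pyannote.audio, a token, and an audio file):
# for line in format_turns(diarize("audio.wav", "YOUR_ACCESS_TOKEN")):
#     print(line)
```

Keeping the formatting separate from the model call means the output handling can be checked without downloading a model.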
Kit Solution Source
Differentiates different speakers in a speech or audio file
Jupyter Notebook · 0 · Version: v1.0.0 · License: Permissive (MIT)
If you need help using this kit, you can email us at firstname.lastname@example.org or send us a direct message on Twitter at @OpenWeaverInc.