voxpopuli | large-scale multilingual speech corpus for representation learning | Speech library
kandi X-RAY | voxpopuli Summary
voxpopuli is a Python library typically used in Artificial Intelligence and Speech applications. It has no reported bugs or vulnerabilities, a build file is available, and it has low support. However, voxpopuli has a Non-SPDX License. You can download it from GitHub.
A large-scale multilingual speech corpus for representation learning, semi-supervised learning and interpretation. voxpopuli provides:
- 400k hours of unlabelled speech data for 23 languages
- 1.8k hours of transcribed speech data for 16 languages
- 17.3k hours of speech-to-speech interpretation data for 15x15 directions

The raw data is collected from 2009-2020 European Parliament event recordings. We acknowledge the European Parliament for creating and sharing these materials.

Unlabelled and transcribed data:

| Language | Code | Unlabelled hours (v1/v2) | Transcribed hours | Transcribed speakers | Transcribed tokens | LM tokens |
|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| English | en | 4.5k/24.1k | 543 | 1313 | 4.8M | 60.1M |
| German | de | 4.5k/23.2k | 282 | 531 | 2.3M | 50.0M |
| French | fr | 4.5k/22.8k | 211 | 534 | 2.1M | 58.6M |
| Spanish | es | 4.4k/21.4k | 166 | 305 | 1.6M | 57.4M |
| Polish | pl | 4.5k/21.2k | 111 | 282 | 802k | 13.6M |
| Italian | it | 4.6k/21.9k | 91 | 306 | 757k | 52.1M |
| Romanian | ro | 4.5k/17.9k | 89 | 164 | 739k | 10.3M |
| Hungarian | hu | … |
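To relate the per-language rows to the headline figure of 1.8k transcribed hours, the seven complete rows of the table above can be summed. The following is a small illustrative sketch, not part of voxpopuli itself; the names `ROWS` and `total_transcribed_hours` are hypothetical, and the truncated Hungarian row is left out because its values are not shown.

```python
# Illustrative only: values copied from the complete rows of the table above.
# Each tuple is (language, code, transcribed hours, transcribed speakers).
ROWS = [
    ("English", "en", 543, 1313),
    ("German", "de", 282, 531),
    ("French", "fr", 211, 534),
    ("Spanish", "es", 166, 305),
    ("Polish", "pl", 111, 282),
    ("Italian", "it", 91, 306),
    ("Romanian", "ro", 89, 164),
]


def total_transcribed_hours(rows):
    """Sum the transcribed-hours column over the given rows."""
    return sum(hours for _lang, _code, hours, _speakers in rows)


print(total_transcribed_hours(ROWS))  # prints 1493
```

The seven listed languages therefore account for 1493 of the roughly 1.8k transcribed hours; the remainder belongs to the languages not shown in this excerpt.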
Support
voxpopuli has a low active ecosystem.
It has 407 stars, 35 forks, and 19 watchers.
It had no major release in the last 6 months.
There are 9 open issues and 10 closed issues. On average, issues are closed in 7 days. There is 1 open pull request and there are 0 closed pull requests.
It has a neutral sentiment in the developer community.
The latest version of voxpopuli is current.
Quality
voxpopuli has no bugs reported.
Security
voxpopuli has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.
License
voxpopuli has a Non-SPDX License.
A Non-SPDX license may be an open source license that is not SPDX-compliant, or it may not be an open source license at all, so review it closely before use.
Reuse
voxpopuli releases are not available. You will need to build from source code and install.
Build file is available. You can build the component from source.
Installation instructions are not available. Examples and code snippets are available.
Top functions reviewed by kandi - BETA
kandi has reviewed voxpopuli and discovered the below as its top functions. This is intended to give you an instant insight into voxpopuli implemented functionality, and help decide if they suit your requirements.
- Downloads subtitles
- Get metadata for given subset
- Perform multiprocessing
- Parses src_id string
- Split audio
- Get a list of audio segments
- Get pyannote segments
- Load tracks from a pkl file
- Download the audios data
- Load an annotation file
- Get all audio data
- Get a list of all audio files for lang
- Return a list of all the years of lang
- Return a list of all sessions for a given language
- Process an audio session
- Process the word alignment file
- Process text
- Parse arguments
- Load normalized text
- Return a set of session IDs that are used in alignment
- Get all the audio files for a language
- Wrapper for multiprocessing
- Download the ASR dataset
- Check if the audio file exists
- Cut a session
- Run the cut
- Launches the segmentation on the given sessions
voxpopuli Key Features
No Key Features are available at this moment for voxpopuli.
voxpopuli Examples and Code Snippets
No Code Snippets are available at this moment for voxpopuli.
Community Discussions
Trending Discussions on voxpopuli
QUESTION
Wrong array with while
Asked 2019-Mar-18 at 22:13
I'm sorry for this question, I'm sure is a noob question.
But... I'm not able to manage that and I'm sorry for asking your help.
I have this script:
...
ANSWER
Answered 2019-Mar-18 at 21:58
You want to use fetch() in while loops, not fetchAll(). fetchAll pulls all the rows at once.
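The answer above refers to PHP's PDO methods, and the asker's script is not shown. As a rough analogue only, the same distinction can be sketched in Python with the standard `sqlite3` module, where `fetchone()` returns one row per call and `fetchall()` returns every row at once. The `items` table and the function names are hypothetical and exist only for this illustration.

```python
import sqlite3

# Hypothetical in-memory table used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO items (name) VALUES (?)", [("a",), ("b",), ("c",)])


def names_one_by_one(connection):
    """Read rows in a while loop: fetchone() yields one row per call, then None."""
    cursor = connection.execute("SELECT name FROM items ORDER BY id")
    names = []
    while (row := cursor.fetchone()) is not None:
        names.append(row[0])
    return names


def names_all_at_once(connection):
    """Read every row in a single call: fetchall() returns a list of rows."""
    cursor = connection.execute("SELECT name FROM items ORDER BY id")
    return [row[0] for row in cursor.fetchall()]
```

Both functions return the same names; the difference is that the loop version handles one row per iteration, which is the pattern the answer recommends for a `while` loop.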
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
Vulnerabilities
No vulnerabilities reported
Install voxpopuli
You can download it from GitHub.
You can use voxpopuli like any standard Python library. You will need to make sure that you have a development environment consisting of a Python distribution including header files, a compiler, pip, and git installed. Make sure that your pip, setuptools, and wheel are up to date. When using pip it is generally recommended to install packages in a virtual environment to avoid changes to the system.
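The virtual-environment recommendation can be followed with Python's standard `venv` module. The sketch below is a minimal example under stated assumptions: the helper name `create_env` and the `.venv` directory are hypothetical, and nothing here is taken from voxpopuli's own instructions, which this page reports as unavailable.

```python
import sys
import venv
from pathlib import Path


def create_env(env_dir, with_pip=True):
    """Create a virtual environment in env_dir and return the path to its interpreter.

    with_pip=True asks venv to bootstrap pip into the new environment.
    """
    venv.EnvBuilder(with_pip=with_pip).create(env_dir)
    # The interpreter lives in "Scripts" on Windows and in "bin" elsewhere.
    if sys.platform == "win32":
        return Path(env_dir) / "Scripts" / "python.exe"
    return Path(env_dir) / "bin" / "python"
```

Calling `create_env(".venv")` returns the path to the environment's interpreter. Running that interpreter with `-m pip install --upgrade pip setuptools wheel` brings the packaging tools up to date before anything else is installed.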
Support
For any new features, suggestions and bugs create an issue on GitHub.
If you have any questions, search for existing answers or ask on the Stack Overflow community page.