Speech-Transformer | PyTorch re-implementation of Speech-Transformer | Speech library
kandi X-RAY | Speech-Transformer Summary
This is a PyTorch re-implementation of Speech-Transformer: A No-Recurrence Sequence-to-Sequence Model for Speech Recognition.
Top functions reviewed by kandi - BETA
- Train the network
- Calculate loss
- Calculate the loss
- Update statistics
- Train the model
- Augment the spectrogram using time warping
- Add zero-flow control points at the boundaries
- Warp a time series
- Warp an image
- Perform decoding
- Pad a list
- Compute the key-padding mask for a query
- Prepare padded input
- Get data for given split
- Ensure folder exists
- Build the vocabulary
- Add results to a JSON file
- Parse the hypothesis
- Update the sum
- Process a dictionary of sos and eos IDs
- Recognize with beam search
- Parse command line arguments
- Calculate the character error rate (CER)
- Build LFR (low frame rate) features (see the sketch after this list)
- Run a layer's forward pass
- Extract a feature
- Extract data from a tar file
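As a hint of what "Build LFR features" does, here is a minimal NumPy sketch of low frame rate (LFR) frame stacking; the function name build_lfr_features and the defaults m=4, n=3 are illustrative assumptions, not necessarily the repository's exact implementation.

import numpy as np

def build_lfr_features(inputs, m=4, n=3):
    """Stack every m consecutive frames and keep one stacked frame out of
    every n, so a (T, D) feature matrix becomes roughly (T/n, m*D).
    m and n are illustrative defaults, not necessarily the repo's values."""
    T, D = inputs.shape
    T_lfr = int(np.ceil(T / n))
    stacked = []
    for i in range(T_lfr):
        frames = inputs[i * n: i * n + m]
        if len(frames) < m:
            # pad the tail by repeating the last frame
            pad = np.tile(inputs[-1], (m - len(frames), 1))
            frames = np.vstack([frames, pad])
        stacked.append(frames.reshape(-1))
    return np.stack(stacked)

feats = np.random.randn(100, 80).astype(np.float32)  # 100 frames of 80-dim filterbanks
print(build_lfr_features(feats).shape)               # (34, 320)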
Speech-Transformer Examples and Code Snippets
$ python extract.py
$ cd data/data_aishell/wav
$ find . -name '*.tar.gz' -execdir tar -xzvf '{}' \;
$ python pre_process.py
@inproceedings{aishell_2017,
  title={AIShell-1: An Open-Source Mandarin Speech Corpus and A Speech Recognition Baseline},
  author={Hui Bu and Jiayu Du and Xingyu Na and Bengu Wu and Hao Zheng},
  booktitle={Oriental COCOSDA 2017},
  pages={Submitted},
  year={2017}
}
Community Discussions
Trending Discussions on Speech-Transformer
QUESTION
github: https://github.com/sephiroce/tfsr/tree/exprimental
I'm trying to reproduce recognition accuracies described in the speech transformer paper [1]. The attention penalty is a technique I could not fully understand. This is the description of the attention penalty in the paper.
"In addition, we encouraged the model attending to closer positions by adding bigger penalty on the attention weights of more distant position-pairs."
I understood this to mean adding increasingly negative values to the scaled attention logits (before masking) the farther a position pair is from the diagonal, except for the first multi-head attention in the decoder.
This is a code snippet for computing attention weights.
...ANSWER
Answered 2020-Jan-13 at 10:33
I think you understand it well. They probably did a stripe around the diagonal, something like:
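As a hedged illustration of that idea (not the original answer's code), a minimal PyTorch sketch might add a penalty that grows with distance from the diagonal; the helper name diagonal_attention_penalty and the penalty/bandwidth defaults below are assumptions for illustration only.

import torch

def diagonal_attention_penalty(q_len, k_len, penalty=0.1, bandwidth=0):
    """Additive penalty that grows with the distance |i - j| from the
    diagonal; added to the scaled attention logits before masking/softmax
    so that distant position pairs get smaller attention weights.
    `penalty` and `bandwidth` are illustrative, not values from the paper."""
    q_pos = torch.arange(q_len).unsqueeze(1)   # (q_len, 1)
    k_pos = torch.arange(k_len).unsqueeze(0)   # (1, k_len)
    distance = (q_pos - k_pos).abs().float()
    # positions within `bandwidth` of the diagonal are left unpenalised
    distance = torch.clamp(distance - bandwidth, min=0.0)
    return -penalty * distance                 # (q_len, k_len)

# usage: add to the scaled dot-product logits before masking and softmax
scores = torch.randn(2, 4, 50, 50)             # (batch, heads, q_len, k_len)
scores = scores + diagonal_attention_penalty(50, 50)
weights = torch.softmax(scores, dim=-1)

Leaving a small unpenalised band (the "stripe") around the diagonal and penalising everything outside it is one way to realise the paper's description; as the questioner notes, the penalty would be skipped for the first (masked) multi-head attention in the decoder.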
Community Discussions and Code Snippets contain sources that include the Stack Exchange Network.
Vulnerabilities
No vulnerabilities reported
Install Speech-Transformer
You can use Speech-Transformer like any standard Python library. You will need to make sure that you have a development environment consisting of a Python distribution including header files, a compiler, pip, and git installed. Make sure that your pip, setuptools, and wheel are up to date. When using pip it is generally recommended to install packages in a virtual environment to avoid changes to the system.
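As a concrete starting point, a typical setup following those recommendations might look like the shell session below; the repository URL placeholder and the requirements.txt file are assumptions, so adjust them to the actual repository layout.

$ python -m venv venv                       # isolated environment, keeps the system untouched
$ source venv/bin/activate
$ pip install --upgrade pip setuptools wheel
$ git clone <repository-url> Speech-Transformer && cd Speech-Transformer
$ pip install -r requirements.txt           # assuming the repository ships a requirements file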