Neural-Machine-Translation | Google Summer of Code 2018 Project | Machine Learning library
kandi X-RAY | Neural-Machine-Translation Summary
The aim of this project is to build a multilingual Neural Machine Translation system capable of translating Red Hen Lab's TV News Transcripts from different source languages into English. The system uses Reinforcement Learning (the Advantage Actor-Critic algorithm) on top of a neural encoder-decoder architecture and outperforms the results obtained by plain Neural Machine Translation, which is based on maximum log-likelihood training. Our system achieves close to state-of-the-art results on the standard WMT (Workshop on Machine Translation) test datasets. This project is inspired by the approaches described in the paper An Actor-Critic Algorithm for Sequence Prediction. I have made a GSoC blog; please refer to it for all my GSoC blog posts about the progress made so far. Blog link:
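For readers curious how such an objective looks in code, below is a minimal, generic sketch of an advantage actor-critic loss for sequence prediction, written in PyTorch for illustration. It is not the project's actual training code: the tensor shapes, the per-step reward (e.g. an incremental BLEU gain), and the undiscounted return are all assumptions.

import torch
import torch.nn.functional as F

def a2c_losses(log_probs, values, rewards):
    # log_probs: (T,) log-probability of each emitted token under the actor.
    # values:    (T,) critic's value estimate at each decoding step.
    # rewards:   (T,) per-step reward, e.g. incremental BLEU gain (assumed).
    returns = torch.flip(torch.cumsum(torch.flip(rewards, [0]), 0), [0])
    advantage = returns - values.detach()        # how much better than expected
    actor_loss = -(log_probs * advantage).sum()  # policy gradient with baseline
    critic_loss = F.mse_loss(values, returns)    # regress values onto returns
    return actor_loss, critic_loss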
Top functions reviewed by kandi - BETA
- Score a single sentence
- Update ngram count by ngrams
- Compute BLEU score
- Convert indices to labels
- Return the label associated with the given index
- Prune the entries in the dictionary
- Add a label to the distribution
- Convert labels to indices
- Lookup a label by key
- Perform the forward computation
- Perform a single step
- Save sentences to file
- Write the label to a file
- Run the forward prediction
- Load text from a file
- Convert a list of lines to lowercase
- Fix the encoder hidden state
- Calculate BLEU score
- Calculate the loss function
- Forward computation
- Compute the BLEU score
- Translate the input tensor
- Load labels from file
Neural-Machine-Translation Key Features
Neural-Machine-Translation Examples and Code Snippets
Community Discussions
Trending Discussions on Neural-Machine-Translation
QUESTION
I always get 100% training and validation accuracy. Here's how it looks:
...ANSWER
Answered 2020-Jun-10 at 12:39: You initialize decoder_targets_one_hot as vectors of zeros but never set the index of the true class to 1. So the target vectors are not actually one-hot vectors, and the model tries to learn the same target for every input, namely the all-zeros vector.
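A minimal sketch of the fix, using NumPy with hypothetical toy shapes (in the question these come from the tokenized dataset): set the position of the true class to 1 at every timestep so each target vector is genuinely one-hot.

import numpy as np

# Hypothetical toy dimensions; in practice these come from the dataset.
num_samples, max_len, num_words = 4, 10, 500
decoder_targets = np.random.randint(0, num_words, size=(num_samples, max_len))

decoder_targets_one_hot = np.zeros(
    (num_samples, max_len, num_words), dtype="float32")

# Mark the true class at every timestep so each target is a real one-hot vector.
for i, sequence in enumerate(decoder_targets):
    for t, word_index in enumerate(sequence):
        decoder_targets_one_hot[i, t, word_index] = 1.0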
QUESTION
I have a for loop with excerpts of try-except blocks, referring to https://machinetalk.org/2019/03/29/neural-machine-translation-with-attention-mechanism/?unapproved=67&moderation-hash=ea8e5dcb97c8236f68291788fbd746a7#comment-67:
ANSWER
Answered 2019-Jul-01 at 07:47: It may be a problem with nested loops, as covered by this answer. They suggest using return, but then your loop would need to be written as a function. If that doesn't appeal, you could try using various levels of break statements, as shown in some of the answers. Using the for/else construction (explained here), I think your code would look like the following.
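Since the original loop is not reproduced on this page, here is a generic sketch of that for/else pattern; the ranges and the stopping condition are placeholders.

# for/else: the else clause runs only when the inner loop finishes without break.
found = None
for outer in range(10):
    for inner in range(10):
        if outer * inner == 12:      # placeholder condition worth stopping on
            found = (outer, inner)
            break                    # leaves only the inner loop
    else:
        continue  # inner loop completed normally: keep scanning
    break         # inner loop broke, so break the outer loop too

print(found)  # (2, 6)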
QUESTION
I am building a toy encoder-decoder model for machine translation using TensorFlow.
I use the TensorFlow 1.8.0 CPU version. FastText pretrained word vectors of 300 dimensions are used in the embedding layer. Each batch of training data then goes through the encoder and a decoder with an attention mechanism. In the training stage the decoder uses TrainingHelper, and in the inference stage GreedyEmbeddingHelper is used.
I already ran the model successfully with a bidirectional LSTM encoder. However, when I try to further improve the model with a multilayer LSTM, a bug arises. The code that builds the training-stage model is below:
...ANSWER
Answered 2018-Jun-16 at 10:52: Use the following method to define a list of cell instances:
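The answer's snippet is not reproduced on this page; below is a sketch of the usual fix in the TensorFlow 1.x API, with hypothetical num_layers and num_units. The key point is to create a fresh cell object per layer instead of reusing one instance, which is what triggers the shape mismatch in a multilayer setup.

import tensorflow as tf  # TensorFlow 1.x API

num_layers, num_units = 2, 128  # hypothetical sizes

def make_cell():
    # A new LSTMCell instance per call, so each layer gets its own weights.
    return tf.nn.rnn_cell.LSTMCell(num_units)

# Wrong: tf.nn.rnn_cell.MultiRNNCell([make_cell()] * num_layers) repeats one
# instance across layers and fails when the layers' input sizes differ.
stacked_cell = tf.nn.rnn_cell.MultiRNNCell(
    [make_cell() for _ in range(num_layers)])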
QUESTION
For some self-study, I'm trying to implement a simple sequence-to-sequence model using Keras. While I get the basic idea, and there are several tutorials available online, I still struggle with some basic concepts when looking at these tutorials:
- Keras Tutorial: I've tried to adapt this tutorial. Unfortunately, it is for character sequences, while I'm aiming for word sequences. There is a block explaining the changes required for word sequences, but this is currently throwing "wrong dimension" errors; that's OK, probably some data-preparation errors on my side. More importantly, in this tutorial I can clearly see the two types of input and one type of output: encoder_input_data, decoder_input_data, decoder_target_data
- MachineLearningMastery Tutorial: Here the network model looks very different, completely sequential with 1 input and 1 output. From what I can tell, here the decoder gets just the output of the encoder.
Is it correct to say that these are indeed two different approaches to seq2seq? Which one is better, and why? Or am I reading the second tutorial wrongly? I already have an understanding of sequence classification and sequence labeling, but sequence-to-sequence hasn't properly clicked for me yet.
...ANSWER
Answered 2018-Feb-11 at 03:12: Yes, those two are different approaches, and there are other variations as well. MachineLearningMastery simplifies things a bit to make them accessible. I believe the Keras method may perform better, and it is what you will need if you want to advance to seq2seq with attention, which is almost always the case.
MachineLearningMastery has a hacky workaround that allows it to work without handing in decoder inputs: it simply repeats the last hidden state and passes that as the input at each timestep. This is not a flexible solution.
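To make the contrast concrete, here is a minimal sketch of both designs in Keras; the layer sizes are placeholders and neither model is the tutorial's exact code.

from tensorflow.keras.layers import Input, LSTM, Dense, RepeatVector, TimeDistributed
from tensorflow.keras.models import Model, Sequential

num_tokens, latent_dim, max_len = 1000, 256, 20  # placeholder sizes

# MachineLearningMastery-style: one input; the encoder's final state is
# repeated and fed to the decoder at every timestep.
simple = Sequential([
    LSTM(latent_dim, input_shape=(max_len, num_tokens)),       # encoder
    RepeatVector(max_len),                                     # repeat final state
    LSTM(latent_dim, return_sequences=True),                   # decoder
    TimeDistributed(Dense(num_tokens, activation="softmax")),
])

# Keras-tutorial-style: two inputs; the decoder consumes the shifted target
# sequence (teacher forcing) and starts from the encoder's states.
enc_in = Input(shape=(None, num_tokens))
_, h, c = LSTM(latent_dim, return_state=True)(enc_in)
dec_in = Input(shape=(None, num_tokens))
dec_seq = LSTM(latent_dim, return_sequences=True)(dec_in, initial_state=[h, c])
out = Dense(num_tokens, activation="softmax")(dec_seq)
teacher_forced = Model([enc_in, dec_in], out)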
Community Discussions and Code Snippets contain sources that include the Stack Exchange Network.
Vulnerabilities
No vulnerabilities reported
Install Neural-Machine-Translation
Users who want the pipeline to work on CASE HPC should simply copy the directory named nmt from the home directory of my HPC account, i.e. /home/vxg195, and then follow the instructions described for training and translation.
The nmt directory will contain the following subdirectories: singularity, data, models, Neural-Machine-Translation, myenv.
The singularity directory contains a Singularity image (rh_xenial_20180308.img) copied from the home directory of Mr. Michael Pacchioli's CASE HPC account. This image provides modules such as CUDA and cuDNN needed by the system.
The data directory contains cleaned and processed datasets for the respective language pairs. Its subdirectories should be named like de-en, where de and en are the language codes for German and English. So for any language pair whose source language is $src and target language is $tgt, the data subdirectory should be named $src-$tgt and should contain the following train, validation, and test files:
train.$src-$tgt.$src.processed
train.$src-$tgt.$tgt.processed
valid.$src-$tgt.$src.processed
valid.$src-$tgt.$tgt.processed
test.$src-$tgt.$src.processed
test.$src-$tgt.$tgt.processed
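As an illustration, a small hypothetical Python helper (not part of the repository) that generates the six expected file names for a language pair under this convention:

def expected_files(src, tgt):
    # Build the processed file names for a $src-$tgt language pair.
    pair = "{}-{}".format(src, tgt)
    return ["{}.{}.{}.processed".format(split, pair, lang)
            for split in ("train", "valid", "test")
            for lang in (src, tgt)]

print(expected_files("de", "en"))
# ['train.de-en.de.processed', 'train.de-en.en.processed',
#  'valid.de-en.de.processed', 'valid.de-en.en.processed',
#  'test.de-en.de.processed', 'test.de-en.en.processed']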
The models directory contains trained models for the respective language pairs and follows the same subdirectory structure as the data directory. For example, models/de-en will contain trained models for the German-English language pair.
The following commands were used to install dependencies for the project:
$ git clone https://github.com/RedHenLab/Neural-Machine-Translation.git
$ virtualenv myenv
$ source myenv/bin/activate
$ pip install -r Neural-Machine-Translation/requirements.txt
Note that the virtual environment (myenv) created with the virtualenv command above should use Python 2.