Neural-Machine-Translation | Google Summer of Code 2018 Project | Machine Learning library

by RedHenLab | Python | Version: Current | License: No License

kandi X-RAY | Neural-Machine-Translation Summary

Neural-Machine-Translation is a Python library typically used in Telecommunications, Media, Entertainment, Artificial Intelligence, Machine Learning, Deep Learning, and PyTorch applications. Neural-Machine-Translation has no bugs and no vulnerabilities, a build file is available, and it has low support. You can download it from GitHub.

The aim of this project is to build a Multilingual Neural Machine Translation System capable of translating Red Hen Lab's TV News Transcripts from different source languages into English. The system uses Reinforcement Learning (the Advantage Actor-Critic algorithm) on top of a neural encoder-decoder architecture, and it outperforms the results obtained by a simple Neural Machine Translation system trained with maximum log-likelihood. Our system achieves close to state-of-the-art results on the standard WMT (Workshop on Machine Translation) test datasets. This project is inspired by the approach described in the paper An Actor-Critic Algorithm for Sequence Prediction. I have made a GSoC blog; please refer to it for all my GSoC blog posts about the progress made so far. Blog link:
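To make the training objective concrete, here is a minimal PyTorch sketch of an advantage-actor-critic update for sequence prediction. All tensor names, shapes, and the reward definition are illustrative assumptions, not this repository's actual API.

    import torch
    import torch.nn.functional as F

    # Illustrative sizes: T decoding steps, V vocabulary entries.
    T, V = 5, 100
    decoder_logits = torch.randn(T, V, requires_grad=True)  # actor (decoder) outputs
    critic_values = torch.randn(T, requires_grad=True)      # per-step value estimates
    sampled_tokens = torch.randint(0, V, (T,))              # tokens sampled by the actor
    rewards = torch.rand(T)                                 # hypothetical per-step reward, e.g. from BLEU

    log_probs = F.log_softmax(decoder_logits, dim=-1)
    chosen = log_probs.gather(1, sampled_tokens.unsqueeze(1)).squeeze(1)
    advantage = rewards - critic_values.detach()            # reward minus critic baseline
    actor_loss = -(chosen * advantage).sum()                # policy-gradient term
    critic_loss = F.mse_loss(critic_values, rewards)        # regression target for the critic
    loss = actor_loss + critic_loss
    loss.backward()

Subtracting the critic's value estimate from the reward is what reduces the variance of the policy gradient relative to plain REINFORCE-style training.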

            kandi-support Support

              Neural-Machine-Translation has a low active ecosystem.
              It has 21 star(s) with 8 fork(s). There are 12 watchers for this library.
              It had no major release in the last 6 months.
              There is 1 open issue and 0 have been closed. On average issues are closed in 601 days. There are 11 open pull requests and 0 closed requests.
              It has a neutral sentiment in the developer community.
              The latest version of Neural-Machine-Translation is current.

            kandi-Quality Quality

              Neural-Machine-Translation has no bugs reported.

            kandi-Security Security

              Neural-Machine-Translation has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.

            kandi-License License

              Neural-Machine-Translation does not have a standard license declared.
              Check the repository for any license declaration and review the terms closely.
              Without a license, all rights are reserved, and you cannot use the library in your applications.

            kandi-Reuse Reuse

              Neural-Machine-Translation releases are not available. You will need to build from source code and install.
              Build file is available. You can build the component from source.
              Installation instructions, examples and code snippets are available.

            Top functions reviewed by kandi - BETA

            kandi has reviewed Neural-Machine-Translation and discovered the below as its top functions. This is intended to give you an instant insight into the functionality Neural-Machine-Translation implements, and to help you decide if it suits your requirements.
            • Score a single sentence
            • Update ngram count by ngrams
            • Compute BLEU score
            • Converts indices to labels
            • Return the label associated with the given index
            • Prune the entries in the dictionary
            • Add a label to the distribution
            • Convert labels to indices
            • Lookup a label by key
            • Perform the forward computation
            • Perform a single step
            • Save sentences to file
            • Write the label to a file
            • Runs the forward prediction
            • Load text from a file
            • Converts a list of lines to lowercase
            • Fix the encoder hidden
            • Calculate BLEU score
            • Calculate the loss function
            • Forward computation
            • Compute the bleu score
            • Translate the input tensor
            • Load labels from file

            Neural-Machine-Translation Key Features

            No Key Features are available at this moment for Neural-Machine-Translation.

            Neural-Machine-Translation Examples and Code Snippets

            No Code Snippets are available at this moment for Neural-Machine-Translation.

            Community Discussions

            QUESTION

            100% training and validation accuracy, tried gradient clipping too
            Asked 2020-Jun-10 at 13:30

            I always get 100% training and validation accuracy. Here's how it looks:

            ...

            ANSWER

            Answered 2020-Jun-10 at 12:39

            You initialize decoder_targets_one_hot as vectors of zeros, but never set the index of the true class to 1. So the target vectors are not actually one-hot vectors, and the model tries to learn the same target for all inputs, i.e. the all-zeros vector.
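            A minimal sketch of the missing step, assuming integer-encoded decoder targets; all names and shapes here are hypothetical, mirroring the question's setup:

                import numpy as np

                # Hypothetical dimensions mirroring the question's setup.
                num_samples, max_len_target, num_decoder_tokens = 2, 4, 10
                decoder_targets = [[3, 1, 0, 2], [5, 4, 2, 0]]   # integer-encoded target sequences

                decoder_targets_one_hot = np.zeros(
                    (num_samples, max_len_target, num_decoder_tokens), dtype='float32')
                for i, seq in enumerate(decoder_targets):
                    for t, token_index in enumerate(seq):
                        decoder_targets_one_hot[i, t, token_index] = 1.0   # the missing step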

            Source https://stackoverflow.com/questions/62303604

            QUESTION

            How to break out of this for loop with try-except statements?
            Asked 2019-Jul-01 at 07:47

            ANSWER

            Answered 2019-Jul-01 at 07:47

            It may be a problem with nested loops, as covered by this answer. They suggest using return, but then your loop would need to be wrapped in a function. If that doesn't appeal, you could try using break statements at various levels, as shown in some of the answers. Using the for/else construction (explained here), I think your code would look like the following:
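            A minimal sketch of the for/else pattern; process, the data, and the exception type are placeholders for the question's actual work:

                # Breaking out of nested loops with for/else.
                def process(token):
                    if token == "bad":
                        raise ValueError(token)   # placeholder failure

                sentences = [["a", "b"], ["c", "bad", "d"], ["e"]]
                for sentence in sentences:
                    for token in sentence:
                        try:
                            process(token)
                        except ValueError:
                            break          # leave the inner loop on failure
                    else:
                        continue           # inner loop finished cleanly: next sentence
                    break                  # inner loop broke: leave the outer loop too

            The else clause on a for loop runs only when the loop completes without hitting break, which is what lets the outer loop distinguish the two cases.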

            Source https://stackoverflow.com/questions/56805439

            QUESTION

            Dimension Issue with Tensorflow stack_bidirectional_dynamic_rnn
            Asked 2018-Jun-16 at 10:52

            I am building a toy encoder-decoder model for machine translation by using Tensorflow.

            I use the Tensorflow 1.8.0 CPU version. A 300-dimensional FastText pretrained word vector is used in the embedding layer. The batch of training data then goes through the encoder and a decoder with an attention mechanism. In the training stage the decoder uses TrainingHelper, and in the inference stage GreedyEmbeddingHelper is used.

            I already ran the model successfully using a bidirectional LSTM encoder. However, when I try to further improve the model by using a multilayer LSTM, the bug arises. The code that builds the training-stage model is below:

            ...

            ANSWER

            Answered 2018-Jun-16 at 10:52

            Use the following method to define a list of cell instances,
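            A sketch under the question's setup (TensorFlow 1.8; input placeholders and sizes are assumptions): build a separate cell object per layer and per direction, rather than reusing one instance, and pass the lists to stack_bidirectional_dynamic_rnn.

                import tensorflow as tf

                num_layers, hidden_size, embed_dim = 2, 128, 300

                # Placeholders standing in for the question's embedded source batch.
                encoder_inputs = tf.placeholder(tf.float32, [None, None, embed_dim])
                source_lengths = tf.placeholder(tf.int32, [None])

                # One fresh cell per layer and per direction; reusing a single
                # cell object across layers triggers the dimension error.
                cells_fw = [tf.nn.rnn_cell.LSTMCell(hidden_size) for _ in range(num_layers)]
                cells_bw = [tf.nn.rnn_cell.LSTMCell(hidden_size) for _ in range(num_layers)]

                outputs, states_fw, states_bw = tf.contrib.rnn.stack_bidirectional_dynamic_rnn(
                    cells_fw, cells_bw, encoder_inputs,
                    sequence_length=source_lengths, dtype=tf.float32)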

            Source https://stackoverflow.com/questions/50886684

            QUESTION

            Seq2Seq with Keras understanding
            Asked 2018-Feb-11 at 03:12

            For some self-studying, I'm trying to implement a simple sequence-to-sequence model using Keras. While I get the basic idea and there are several tutorials available online, I still struggle with some basic concepts when looking at these tutorials:

            • Keras Tutorial: I've tried to adapt this tutorial. Unfortunately, it is for character sequences, but I'm aiming for word sequences. There is a block explaining what is required for word sequences, but this is currently throwing "wrong dimension" errors -- that's OK, probably some data preparation mistake on my side. More importantly, in this tutorial I can clearly see the 2 types of input and 1 type of output: encoder_input_data, decoder_input_data, decoder_target_data
            • MachineLearningMastery Tutorial: Here the network model looks very different, completely sequential with 1 input and 1 output. From what I can tell, here the decoder gets just the output of the encoder.

            Is it correct to say that these are indeed two different approaches to Seq2Seq? Which one is maybe better, and why? Or am I reading the 2nd tutorial wrongly? I already have an understanding of sequence classification and sequence labeling, but with sequence-to-sequence it hasn't properly clicked yet.

            ...

            ANSWER

            Answered 2018-Feb-11 at 03:12

            Yes, those two are different approaches, and there are other variations as well. MachineLearningMastery simplifies things a bit to make them accessible. I believe the Keras method might perform better and is what you will need if you want to advance to seq2seq with attention, which is almost always the case.

            MachineLearningMastery has a hacky workaround that allows it to work without handing in decoder inputs. It simply repeats the last hidden state and passes that as the input at each timestep. This is not a flexible solution.
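            A minimal Keras sketch of that repeated-state approach; all layer sizes and variable names are illustrative:

                from keras.models import Sequential
                from keras.layers import LSTM, RepeatVector, TimeDistributed, Dense

                src_len, tgt_len, n_features, n_vocab = 10, 8, 50, 1000  # illustrative sizes

                # The encoder compresses the source; RepeatVector feeds the final
                # hidden state to the decoder at every output timestep.
                model = Sequential()
                model.add(LSTM(128, input_shape=(src_len, n_features)))
                model.add(RepeatVector(tgt_len))
                model.add(LSTM(128, return_sequences=True))
                model.add(TimeDistributed(Dense(n_vocab, activation='softmax')))
                model.compile(loss='categorical_crossentropy', optimizer='adam')

            Because the decoder sees only the repeated encoder state, there is no teacher forcing with decoder inputs, which is exactly the inflexibility described above.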

            Source https://stackoverflow.com/questions/48717670

            Community Discussions and Code Snippets contain sources that include the Stack Exchange Network

            Vulnerabilities

            No vulnerabilities reported

            Install Neural-Machine-Translation

            Users who want the pipeline to work on Case HPC should just copy the directory named nmt from the home directory of my HPC account, i.e. /home/vxg195, and then follow the instructions described for training and translation.
            The nmt directory will contain the following subdirectories: singularity, data, models, Neural-Machine-Translation, and myenv.
            The singularity directory contains a singularity image (rh_xenial_20180308.img) copied from the home directory of Mr. Michael Pacchioli's Case HPC account. This singularity image contains modules like CUDA and cuDNN needed by the system.
            The data directory consists of cleaned and processed datasets for the respective language pairs. The subdirectories of this directory should be named like de-en, where de and en are the language codes for German and English. So for any general language pair whose source language is $src and target language is $tgt, the language data subdirectory should be named $src-$tgt and should contain the following files (train, validation, and test): train.$src-$tgt.$src.processed, train.$src-$tgt.$tgt.processed, valid.$src-$tgt.$src.processed, valid.$src-$tgt.$tgt.processed, test.$src-$tgt.$src.processed, test.$src-$tgt.$tgt.processed. The German-English layout is expanded below.
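            For example, expanding that naming pattern for the de-en pair gives:

                data/de-en/
                    train.de-en.de.processed
                    train.de-en.en.processed
                    valid.de-en.de.processed
                    valid.de-en.en.processed
                    test.de-en.de.processed
                    test.de-en.en.processed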
            The models directory consists of trained models for the respective language pairs and follows the same subdirectory structure as the data directory. For example, models/de-en will contain trained models for the German-English language pair.
            The following commands were used to install dependencies for the project:

                $ git clone https://github.com/RedHenLab/Neural-Machine-Translation.git
                $ virtualenv myenv
                $ source myenv/bin/activate
                $ pip install -r Neural-Machine-Translation/requirements.txt
            Note that the virtual environment (myenv) created with the virtualenv command above must use Python 2.
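            If the system default is Python 3, the interpreter can be pinned explicitly with virtualenv's standard -p flag (assuming a python2 binary is on the PATH):

                $ virtualenv -p python2 myenv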

            Support

            For any new features, suggestions, and bugs, create an issue on GitHub. If you have any questions, check and ask on the community page, Stack Overflow.
            CLONE
          • HTTPS

            https://github.com/RedHenLab/Neural-Machine-Translation.git

          • CLI

            gh repo clone RedHenLab/Neural-Machine-Translation

          • SSH

            git@github.com:RedHenLab/Neural-Machine-Translation.git
