pytorch-seq2seq | Decoder model with global attention mechanism | Translation library

by simpthon | Python | Version: Current | License: No License

kandi X-RAY | pytorch-seq2seq Summary

pytorch-seq2seq is a Python library typically used in Telecommunications, Media, Telecom, Utilities, Translation, Deep Learning, and Transformer applications. It has no reported bugs or vulnerabilities, and it has low support. However, no build file is available for pytorch-seq2seq. You can download it from GitHub.

An implementation of an encoder-decoder model with a global attention mechanism.

Support

pytorch-seq2seq has a low active ecosystem.

It has 13 star(s) with 4 fork(s). There are 2 watchers for this library.

It had no major release in the last 6 months.

There is 1 open issue and 0 have been closed. On average, issues are closed in 27 days. There are no pull requests.

It has a neutral sentiment in the developer community.

The latest version of pytorch-seq2seq is current.

Quality

pytorch-seq2seq has no bugs reported.

Security

pytorch-seq2seq has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.

License

pytorch-seq2seq does not have a standard license declared.

Check the repository for any license declaration and review the terms closely.

Without a license, all rights are reserved, and you cannot use the library in your applications.

Reuse

pytorch-seq2seq releases are not available. You will need to build from source code and install.

pytorch-seq2seq has no build file. You will need to create the build yourself to build the component from source.

Installation instructions are not available. Examples and code snippets are available.

pytorch-seq2seq Key Features

No Key Features are available at this moment for pytorch-seq2seq.

pytorch-seq2seq Examples and Code Snippets

No Code Snippets are available at this moment for pytorch-seq2seq.

Community Discussions

Trending Discussions on pytorch-seq2seq

RuntimeError: The size of tensor a (1024) must match the size of tensor b (512) at non-singleton dimension 3

Implementing Attention

QUESTION

RuntimeError: The size of tensor a (1024) must match the size of tensor b (512) at non-singleton dimension 3

Asked 2020-Aug-27 at 06:07

I am doing the following operation,

...

ANSWER

Answered 2020-Aug-27 at 06:07

I took a look at your code (which, by the way, didn't run with seq_len = 10), and the problem is that you hard-coded batch_size to 1 (line 143).

It looks like the example you are trying to run the model on has batch_size = 2.

Just uncomment the previous line where you wrote batch_size = query.shape[0] and everything runs fine.
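
The asker's code is not reproduced here, so the following is only a minimal sketch of how a hard-coded batch size can produce this kind of size mismatch. The helper `split_heads`, the tensor shapes, and the `per_head` tensor are hypothetical and chosen purely for illustration; only `batch_size = query.shape[0]` comes from the answer above.

```python
import torch


def split_heads(x, num_heads, batch_size=None):
    """Reshape (batch, seq_len, d_model) into (batch, seq_len, num_heads, head_dim)."""
    if batch_size is None:
        # Read the batch size from the input instead of hard-coding it.
        batch_size = x.shape[0]
    seq_len = x.shape[1]
    return x.reshape(batch_size, seq_len, num_heads, -1)


query = torch.randn(2, 10, 512)  # hypothetical input: batch 2, seq_len 10, width 512
per_head = torch.randn(64)       # hypothetical tensor of size head_dim = 512 / 8

# Batch size taken from the input: the last dimension is the expected head_dim.
good = split_heads(query, num_heads=8)
print(good.shape)                # (2, 10, 8, 64)
print((good + per_head).shape)   # broadcasting works

# Batch size hard-coded to 1: the second example is folded into the last dimension.
bad = split_heads(query, num_heads=8, batch_size=1)
print(bad.shape)                 # (1, 10, 8, 128)

try:
    bad + per_head
    mismatch = None
except RuntimeError as err:
    # The last dimension (index 3) no longer matches, so broadcasting fails.
    mismatch = str(err)
print(mismatch)
```

With the batch size derived from the input, the same function works for any batch; with it fixed to 1, the extra examples end up in whichever dimension is left free in the reshape.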

Source https://stackoverflow.com/questions/63566232

QUESTION

Implementing Attention

Asked 2020-Jun-18 at 07:53

I'm implementing attention in PyTorch, and some questions came up while implementing the attention mechanism.

1. What is the initial state of the decoder, $s_0$? Some posts represent it as a zero vector and some implement it as the final hidden state of the encoder. So what is the real $s_0$? The original paper doesn't mention it.
2. Should I replace the maxout layer with a dropout layer? The original paper uses Goodfellow's maxout layer.
3. Is there any difference between the encoder's dropout probability and the decoder's? Some implementations set different dropout probabilities for the encoder and the decoder.
4. When calculating $a_{ij}$ in the alignment model (concat), there are two trainable weights, $W$ and $U$. I think the better way to implement this is with two linear layers. If I use linear layers, should I remove their bias terms?
5. The dimension of the encoder output ($H$) doesn't match the decoder's hidden state. $H$ is concatenated, so it has to be 2000 (in the original paper). However, the decoder's hidden dimension is 1000. Do I need to add a linear layer after the encoder to match the encoder's dimension to the decoder's?

...

ANSWER

Answered 2020-Jun-18 at 07:53

In general, the answer to many of these questions is that it differs between implementations. The original implementation from the paper is at https://github.com/lisa-groundhog/GroundHog/tree/master/experiments/nmt. For later implementations that reached better translation quality, you can check:

Neural Monkey or Nematus in TensorFlow
OpenNMT in PyTorch
Marian in C++

Now to your points:

1. In the original paper, it was a zero vector. Later implementations use a projection of either the encoder's final state or the average of the encoder states. The argument for using the average is that it propagates the gradients more directly into the encoder states. However, this decision does not seem to influence the translation quality much.
2. A maxout layer is a variant of a non-linear layer. It is sort of two ReLU layers in one: you do two independent linear projections and take the maximum of them. You can happily replace maxout with ReLU (modern implementations do so), but you should still use dropout.
3. I don't know of any meaningful use case in MT where I would set the dropout rates differently. Note, however, that seq2seq models are used in many wild scenarios where it might make sense.
4. Most implementations do use a bias when computing attention energies. If you use two linear layers, the bias is split into two variables. Biases are usually zero-initialized, so both will get the same gradients and the same updates. However, you can always disable the bias in a linear layer.
5. Yes, if you want to initialize $s_0$ with the encoder states. In the attention mechanism, matrix $U$ takes care of it.
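
The points above can be sketched in PyTorch as follows. This is a minimal illustration under stated assumptions, not the paper's or any library's implementation: the class names, `attn_dim`, the vector `v`, the `init_proj` layer, and the `tanh` on the initial state are hypothetical choices, and the tensor sizes are small stand-ins for the paper's 2000 and 1000.

```python
import torch
import torch.nn as nn


class Maxout(nn.Module):
    """Point 2: two independent linear projections, then an element-wise max."""

    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.proj_a = nn.Linear(in_dim, out_dim)
        self.proj_b = nn.Linear(in_dim, out_dim)

    def forward(self, x):
        return torch.max(self.proj_a(x), self.proj_b(x))


class AdditiveAttention(nn.Module):
    """Concat-style alignment: e_ij = v^T tanh(W s_{i-1} + U h_j)."""

    def __init__(self, enc_dim, dec_dim, attn_dim):
        super().__init__()
        # Point 4: W and U as two linear layers; the bias is kept on only one
        # of them (an assumption) to avoid two redundant bias vectors.
        self.W = nn.Linear(dec_dim, attn_dim, bias=False)
        # Point 5: U maps encoder states into the attention space, so enc_dim
        # and dec_dim do not have to match inside the attention itself.
        self.U = nn.Linear(enc_dim, attn_dim, bias=True)
        self.v = nn.Linear(attn_dim, 1, bias=False)
        # Points 1 and 5: an extra projection is only needed to build s_0
        # from the encoder states.
        self.init_proj = nn.Linear(enc_dim, dec_dim)

    def initial_state(self, H):
        # H: (batch, src_len, enc_dim) -> s_0: (batch, dec_dim),
        # here a projection of the average encoder state.
        return torch.tanh(self.init_proj(H.mean(dim=1)))

    def forward(self, s_prev, H):
        # s_prev: (batch, dec_dim), H: (batch, src_len, enc_dim)
        scores = self.v(torch.tanh(self.W(s_prev).unsqueeze(1) + self.U(H)))
        weights = torch.softmax(scores.squeeze(-1), dim=-1)       # (batch, src_len)
        context = torch.bmm(weights.unsqueeze(1), H).squeeze(1)   # (batch, enc_dim)
        return context, weights


# Small stand-in sizes; the encoder width is twice the decoder width,
# as with the concatenated bidirectional states in the question.
batch, src_len, enc_dim, dec_dim = 2, 7, 8, 4
H = torch.randn(batch, src_len, enc_dim)

attention = AdditiveAttention(enc_dim, dec_dim, attn_dim=dec_dim)
s0 = attention.initial_state(H)
context, weights = attention(s0, H)
hidden = Maxout(enc_dim + dec_dim, dec_dim)(torch.cat([context, s0], dim=-1))

print(s0.shape, context.shape, weights.shape, hidden.shape)
```

Replacing `Maxout` with a single `nn.Linear` followed by ReLU, or starting from a zero vector instead of `initial_state`, would be equally valid variants according to the answer.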

Source https://stackoverflow.com/questions/62444430

Community Discussions, Code Snippets contain sources that include Stack Exchange Network

Vulnerabilities

No vulnerabilities reported

Install pytorch-seq2seq

You can download it from GitHub.
You can use pytorch-seq2seq like any standard Python library. You will need to make sure that you have a development environment consisting of a Python distribution including header files, a compiler, pip, and git installed. Make sure that your pip, setuptools, and wheel are up to date. When using pip it is generally recommended to install packages in a virtual environment to avoid changes to the system.

Support

For new features, suggestions, and bugs, create an issue on GitHub. If you have questions, check and ask them on the Stack Overflow community page.
