pytorch-transformer | PyTorch implementation of the Transformer model | Machine Learning library
kandi X-RAY | pytorch-transformer Summary
This repository provides a PyTorch implementation of the Transformer model introduced in the paper Attention Is All You Need (Vaswani et al., 2017).
Top functions reviewed by kandi - BETA
- Execute a query
- Compute attention
- Project query inputs
- Concatenate the output tensor
- The number of key dimensions
- Checks that the argument is a valid integer
- The attention dropout
- Sanitize a probability value
- Sample the output
- Prepare the data
- The number of value dimensions
- The model dimension
- The number of layers
- The residual dropout
- Returns the number of attention heads
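The function names above suggest a fairly standard multi-head attention implementation. As a rough sketch of what a "Compute attention" function typically does in such a repository (illustrative PyTorch, not the repository's actual code):

import math
import torch

def scaled_dot_product_attention(query, key, value, mask=None, dropout=None):
    # query, key, value: (batch, heads, seq_len, dim_per_head)
    scores = query @ key.transpose(-2, -1) / math.sqrt(query.size(-1))
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = torch.softmax(scores, dim=-1)   # attention distribution over the keys
    if dropout is not None:
        weights = dropout(weights)            # the "attention dropout" from the list above
    return weights @ value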
pytorch-transformer Key Features
pytorch-transformer Examples and Code Snippets
Community Discussions
Trending Discussions on pytorch-transformer
QUESTION
I am building a Docker container based on python:3.7-slim-stretch (the same problem also happens on python:3.7-slim-stretch), and it is getting Killed on ...
ANSWER
Answered 2021-Feb-22 at 06:09 I experience something similar on Windows when my Docker containers run out of memory in WSL. I think the settings are different for Mac, but it looks like there is info here on setting the VM RAM, disk size, and swap file for Docker Desktop on Mac:
QUESTION
I have a sentence like: "I like sitting in my new chair and _____ about life". And I have a SPECIFIC set of tokens like ["watch", "run", "think", "apple", "light"].
I would like to calculate the probability of each of those tokens appearing as the next word in that incomplete sentence. Hopefully I should find that the probability of "think" is higher than "apple", for instance.
I am working with pytorch-transformers (GPT2LMHeadModel specifically), and a possible solution is to evaluate the score of the full sentence with each of the tokens, but when the number of tokens to evaluate is on the order of 100 or 1,000 the computation time starts to be too long.
It must be possible to process the sentence only once and somehow use the hidden states to calculate the probabilities of the set of tokens, but I don't know how to do it.
Any ideas? Thanks in advance.
EDIT:
The actual code looks like the one below (estimating the probability for the full sentence every time). For every sentence it takes about 0.1 seconds to run the score() method, which turns into hours if I want to evaluate some thousands of words.
ANSWER
Answered 2020-Aug-03 at 14:50 Your example produced the following output and took around 48.5 seconds with 282 candidates to finish in my environment (I only conducted 3 runs):
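The answer's code and timing output are not reproduced above. As a minimal sketch of the single-forward-pass idea the question asks about (assuming the Hugging Face transformers GPT-2 API; words that split into several BPE pieces would need extra handling):

import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

context = "I like sitting in my new chair and"
candidates = ["watch", "run", "think", "apple", "light"]

# One forward pass over the context; the logits at the last position score
# every possible next token at once.
input_ids = tokenizer.encode(context, return_tensors="pt")
with torch.no_grad():
    outputs = model(input_ids)
next_token_probs = torch.softmax(outputs[0][0, -1], dim=-1)

for word in candidates:
    # GPT-2's BPE treats " think" (mid-sentence) and "think" as different
    # tokens; only the first piece of each candidate is scored here.
    first_piece = tokenizer.encode(" " + word)[0]
    print(word, next_token_probs[first_piece].item())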
QUESTION
I started working on this about two months ago on Google Colab for a midterm project and everything worked perfectly. Now I am modifying it for a final project and keep getting the error 'RuntimeError: Trying to create tensor with negative dimension -1: [-1, 768]'. It looks like PyTorch recently pushed a new version, 1.5, so I downgraded to version 1.4 and still got the same error. Same with 1.3, and I know I wasn't using anything lower since that came out last year. I checked it with my midterm code and still got the same error, so I don't know what's going on. Here is the chunk of code related to downloading and using the model.
ANSWER
Answered 2020-Apr-29 at 03:54 You can try transformers instead of pytorch_transformers.
In Google Colab:
! pip install transformers
In a terminal:
pip install transformers
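Switching packages usually also means updating the imports. A generic sketch of the change (the BERT classes are used purely as an illustration, not the asker's actual code):

# Old import, from the package that raised the error:
# from pytorch_transformers import BertTokenizer, BertModel
# New import, after installing the renamed package:
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")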
QUESTION
I wanted to test text generation with CTRL using PyTorch-Transformers, before using it for fine-tuning. But it doesn't prompt anything like it does with GPT-2 and other similar language generation models. I'm very new to this and am stuck and can't figure out what's going on.
This is the procedure I followed in my Colab notebook,
ANSWER
Answered 2020-Mar-02 at 00:18 The solution was to increase the RAM. Since I was using Google Colab's free GPU, I went through this GitHub issue and found this useful: Solution
The following piece of code will crash the session in Colab; then select 'Get more RAM' when prompted, which will increase the RAM up to 25.51 GB.
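The crash snippet itself is not included above; a commonly used variant of the trick (an assumption, not necessarily the exact code from the linked issue) simply allocates memory until the runtime dies:

# Keep allocating strings until the Colab session runs out of memory and
# crashes, which triggers the prompt to switch to a high-RAM runtime.
filler = []
while True:
    filler.append("1" * 10**7)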
QUESTION
Hi, I am working on implementing a multi-classification model (5 classes) with the new spaCy model en_pytt_bertbaseuncased_lg. The code for the new pipe is here:
ANSWER
Answered 2019-Aug-13 at 11:22 This is a regression in the most recent version we released of spacy-pytorch-transformers. Sorry about this!
The root cause is that this is another case of the evils of **kwargs. I'm looking forward to refining the spaCy API to prevent these issues in future.
You can see the offending line here: https://github.com/explosion/spacy-pytorch-transformers/blob/c1def95e1df783c69bff9bc8b40b5461800e9231/spacy_pytorch_transformers/pipeline/textcat.py#L71 . We provide the nr_class positional argument, which overlaps with the explicit argument you passed in via the config.
To work around the problem, you can simply remove the nr_class key from the config dict you're passing into spacy.create_pipe().
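A minimal sketch of that workaround; the pipe name and architecture value are assumptions to check against the installed spacy-pytorch-transformers version:

import spacy

nlp = spacy.load("en_pytt_bertbaseuncased_lg")

# The important part is that "nr_class" is NOT in this dict, so it no longer
# collides with the positional argument the library supplies internally.
textcat_config = {
    "architecture": "softmax_last_hidden",
    "exclusive_classes": True,
}
textcat = nlp.create_pipe("pytt_textcat", config=textcat_config)
nlp.add_pipe(textcat, last=True)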
QUESTION
I am currently working with the spacy-pytorch-transformers package to experiment with the respective embeddings.
When reading the introductory article (essentially the GitHub README), my understanding was that the token-level embeddings are the mean over the embeddings of all corresponding word pieces, i.e. embed(complex) would be the same as 1/2 * (embed(comp#) + embed(#lex)).
According to the BERT paper, this should simply utilize the last_hidden_state property of the network, but my MCVE below shows that this is not the case for spaCy 2.1.8 and spacy-pytorch-transformers 0.4.0, at least for BERT and RoBERTa (I have not verified it for more models):
ANSWER
Answered 2019-Dec-10 at 14:28 It seems that there is a more elaborate weighting scheme behind this, which also accounts for the [CLS] and [SEP] token outputs in each sequence.
This has also been confirmed by an issue post from the spaCy developers.
Unfortunately, it seems that this part of the code has since moved with the renaming to spacy-transformers.
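A toy illustration of the point, with made-up numbers: a plain mean over the word pieces differs from a mean that also folds in the special-token outputs, which is roughly why the question's expectation did not hold:

import numpy as np

# All vectors here are invented purely for illustration.
cls_out  = np.array([0.9, 0.1])   # [CLS] output
comp_out = np.array([0.2, 0.4])   # "comp#" word piece
lex_out  = np.array([0.6, 0.8])   # "#lex" word piece
sep_out  = np.array([0.1, 0.7])   # [SEP] output

plain_mean = (comp_out + lex_out) / 2                                   # what the question expected
with_specials = np.mean([cls_out, comp_out, lex_out, sep_out], axis=0)  # special tokens included

print(plain_mean)      # [0.4 0.6]
print(with_specials)   # [0.45 0.5 ]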
QUESTION
In relation to the previous Stack Overflow post, Model() got multiple values for argument 'nr_class' - SpaCy multi-classification model (BERT integration), in which my problem was partially resolved, I wanted to share the issue that comes up after implementing the solution.
If I take out the nr_class argument, I get this error here:
ANSWER
Answered 2019-Aug-21 at 19:42 As @Milla Well already commented, the answer can be found here (the bug fix on GitHub from @syllogism_).
Community Discussions and Code Snippets contain sources that include the Stack Exchange Network.
Vulnerabilities
No vulnerabilities reported
Install pytorch-transformer
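No installation steps are listed on this page. For a research-style GitHub implementation like this, the usual approach is to clone the repository and install PyTorch; the URL placeholder and steps below are assumptions to verify against the repository's own README:

git clone https://github.com/<owner>/pytorch-transformer.git
cd pytorch-transformer
pip install torch            # plus anything listed in requirements.txt, if present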
Support