longformer | Longformer: The Long-Document Transformer | Natural Language Processing library
kandi X-RAY | longformer Summary
Longformer and LongformerEncoderDecoder (LED) are pretrained transformer models for long documents. A LongformerEncoderDecoder (LED) model is now available. It supports seq2seq tasks with long input. With gradient checkpointing, fp16, and a 48GB GPU, the input length can be up to 16K tokens. Check the updated paper for the model details and evaluation. A significant speed degradation in huggingface/transformers was recently discovered and fixed (check this PR for details). To avoid this problem, either use the old release v2.11.0, which does not support gradient checkpointing, or use the master branch. This problem should be fixed in the next huggingface/transformers release.
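As an illustration of the seq2seq usage described above, here is a minimal sketch that loads the Hugging Face port of LED; the allenai/led-base-16384 checkpoint and the summarization-style generate call are assumptions for the example, not this repository's own API.

# Sketch: long-input seq2seq with the Hugging Face LED port (an assumption,
# not this repo's own classes). Gradient checkpointing reduces memory so
# very long inputs can fit on a large GPU.
import torch
from transformers import LEDTokenizer, LEDForConditionalGeneration

tokenizer = LEDTokenizer.from_pretrained("allenai/led-base-16384")
model = LEDForConditionalGeneration.from_pretrained("allenai/led-base-16384")
model.gradient_checkpointing_enable()

long_document = "..."  # a long input document
inputs = tokenizer(long_document, max_length=16384, truncation=True, return_tensors="pt")

# LED expects global attention on at least the first token
global_attention_mask = torch.zeros_like(inputs["input_ids"])
global_attention_mask[:, 0] = 1

summary_ids = model.generate(
    inputs["input_ids"],
    attention_mask=inputs["attention_mask"],
    global_attention_mask=global_attention_mask,
    max_length=256,
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))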
Top functions reviewed by kandi - BETA
- Convert a single example.
- Find library path.
- Compile GPU.
- Create a LongformerEncoder.
- Convert arguments to TVM arguments.
- Add command line arguments to the given parser.
- Export the module to a file.
- Register an extension.
- Find and return the path to include.
- Register a global function.
longformer Key Features
longformer Examples and Code Snippets
# Function to download a file from Google Drive from the terminal
function gdrive_download () {
CONFIRM=$(wget --quiet --save-cookies /tmp/cookies.txt --keep-session-cookies --no-check-certificate "https://docs.google.com/uc?export=download&id=$1" -O- | sed -rn 's/.*co
@misc{sekuli2020longformer,
  title={Longformer for MS MARCO Document Re-ranking Task},
  author={Ivan Sekulić and Amir Soleimani and Mohammad Aliannejadi and Fabio Crestani},
  year={2020},
  eprint={2009.09392},
  archivePrefix={arXiv},
  primaryClass={cs.IR}
}
python main.py experiment=joint use_wandb=True
python main.py experiment=litbank trainer.label_smoothing_wt=0.0
python main.py experiment=ontonotes_pseudo model/doc_encoder/transformer=longformer_base
python main.py experiment=litbank model/memory
Community Discussions
Trending Discussions on longformer
QUESTION
I am training huggingface longformer for a classification problem and got the output below.
I am confused about Total optimization steps. As I have 7000 training data points, 5 epochs, and Total train batch size (w. parallel, distributed & accumulation) = 64, shouldn't I get 7000*5/64 steps? That comes to 546.875, so why is it showing Total optimization steps = 545?
Why, in the output below, are there 16 steps of
Input ids are automatically padded from 1500 to 1536 to be a multiple of config.attention_window: 512
followed by [ 23/545 14:24 < 5:58:16, 0.02 it/s, Epoch 0.20/5]? What are these steps?
ANSWER
Answered 2022-Mar-29 at 00:30
Looking at the implementation of the transformers package, we see that the Trainer uses a variable called max_steps when printing the Total optimization steps message in the train method.
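The arithmetic can be reproduced with a minimal sketch. The per-device batch size of 8 and gradient accumulation of 8 (64 effective) are assumptions, not values from the original post, but they show why the count is floored to 545 rather than the fractional 546.875.

# Sketch of how Total optimization steps can come out to 545 instead of 546.875.
# Assumed values (not from the original post): per-device batch size 8,
# gradient accumulation 8, i.e. an effective batch size of 64.
import math

num_examples = 7000
per_device_batch_size = 8
gradient_accumulation_steps = 8
num_epochs = 5

# Batches the dataloader yields per epoch
batches_per_epoch = math.ceil(num_examples / per_device_batch_size)   # 875

# One optimizer update per gradient_accumulation_steps batches;
# the leftover batches at the end of an epoch are dropped (floor division)
updates_per_epoch = batches_per_epoch // gradient_accumulation_steps  # 109

total_optimization_steps = updates_per_epoch * num_epochs             # 545
print(total_optimization_steps)

The fractional 546.875 comes from dividing 7000*5 by 64 directly; in this sketch the step count is instead rounded down within each epoch before being multiplied by the number of epochs.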
QUESTION
I am using Longformer for sequence classification (a binary problem).
I have downloaded the required files.
ANSWER
Answered 2022-Mar-22 at 23:48
requires_grad == True means that the gradient of that tensor will be computed, so the default setting is to train/finetune all layers.
- You can train only the output layer by freezing the encoder, as in the sketch below.
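A minimal sketch of freezing the encoder, assuming the Hugging Face LongformerForSequenceClassification class; model.longformer and model.classifier are the submodule names that class exposes.

# Sketch: freeze the Longformer encoder and train only the classification head.
from transformers import LongformerForSequenceClassification

model = LongformerForSequenceClassification.from_pretrained(
    "allenai/longformer-base-4096", num_labels=2
)

# Freeze every parameter of the encoder (model.longformer);
# the classification head (model.classifier) keeps requires_grad == True.
for param in model.longformer.parameters():
    param.requires_grad = False

trainable = [name for name, p in model.named_parameters() if p.requires_grad]
print(trainable)  # only the classifier parameters remain trainable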
QUESTION
I am replicating code from this page and I am getting F1, precision, and recall of 0. I got the same accuracy as the author. What could be the reason?
I looked into the compute_metrics function and it seems to be correct. I tried some toy data, as below, and precision_recall_fscore_support seems to give the correct answer.
ANSWER
Answered 2022-Feb-15 at 20:46
My guess is that the transformation of your dependent variable was somehow messed up. I think this because all of your metrics that depend on TP (True Positives) are 0.
Both Precision and Sensitivity (Recall) have TP as their numerator.
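In standard notation, these are:

Precision = TP / (TP + FP)
Recall    = TP / (TP + FN)

If the model never predicts the positive class, TP is 0 and both metrics collapse to 0, while accuracy can still look reasonable on an imbalanced label distribution.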
QUESTION
Goal: Amend this Notebook to work with Albert and Distilbert models
Kernel: conda_pytorch_p36. I did Restart & Run All and refreshed the file view in the working directory.
Error occurs in Section 1.2, only for these 2 new models.
For filenames etc., I've created a variable used everywhere:
ANSWER
Answered 2022-Jan-13 at 14:10
When instantiating AutoModel, you must specify a model_type parameter in the ./MRPC/config.json file (downloaded during Notebook runtime).
A list of model_types can be found here.
Code that appends model_type to config.json, in the same format, can be sketched as follows.
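This is an illustrative sketch: the ./MRPC/config.json path comes from the question, while the "distilbert" value stands in for whichever model_type is needed.

# Sketch: add a "model_type" key to an existing config.json so AutoModel
# can resolve the architecture. The model_type value is an assumption.
import json

config_path = "./MRPC/config.json"

with open(config_path, "r", encoding="utf-8") as f:
    config = json.load(f)

config["model_type"] = "distilbert"  # use "albert" for the Albert model

with open(config_path, "w", encoding="utf-8") as f:
    json.dump(config, f, indent=2)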
QUESTION
I'm working on a competition on Kaggle. First, I trained a Longformer base with the competition dataset and achieved a quite good result on the leaderboard. Due to the CUDA memory limit and time limit, I could only train 2 epochs with a batch size of 1. The loss started at about 2.5 and gradually decreased to 0.6 at the end of my training.
I then continued training for 2 more epochs using those saved weights. This time I used a slightly larger learning rate (the one from the Longformer paper) and added the validation data to the training data (meaning I no longer split the dataset 90/10). I did this to try to achieve a better result.
However, this time the loss started at about 0.4 and constantly increased to 1.6 at about half of the first epoch. I stopped because I didn't want to waste computational resources.
Should I have waited more? Could it eventually lead to a better test result? I think the model could have been slightly overfitting at first.
ANSWER
Answered 2022-Jan-11 at 08:50Your model got fitted to the original training data the first time you trained it. When you added the validation data to the training set the second time around, the distribution of your training data must have changed significantly. Thus, the loss increased in your second training session since your model was unfamiliar with this new distribution.
Should you have waited more? Yes, the loss would eventually have decreased (although not necessarily to a value lower than the original training loss).
Could it have led to a better test result? Probably. It depends on whether your validation data contains patterns that are:
- Not present in your training data already
- Similar to those that your model will encounter in deployment
QUESTION
From the Transformers library I use LongformerModel, LongformerTokenizerFast, and LongformerConfig (all of them loaded with from_pretrained("allenai/longformer-base-4096")).
When I do
ANSWER
Answered 2021-Aug-27 at 18:33
I managed to fix this by reindexing my position_ids. When PyTorch created that tensor, for some reason some values in position_ids were bigger than 4098. I used an explicit reindexing along the lines of the sketch below.
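This sketch builds position_ids by hand so every index stays within max_position_embeddings (4098 for allenai/longformer-base-4096); the variable names and the example document are illustrative assumptions, since the original code was not shown.

# Sketch: supply explicit position_ids so no index exceeds
# max_position_embeddings (4098 for allenai/longformer-base-4096).
import torch
from transformers import LongformerModel, LongformerTokenizerFast

tokenizer = LongformerTokenizerFast.from_pretrained("allenai/longformer-base-4096")
model = LongformerModel.from_pretrained("allenai/longformer-base-4096")

encoded = tokenizer("some long document ...", return_tensors="pt")
seq_len = encoded["input_ids"].shape[1]

# Simple 0..seq_len-1 positions, one row shared across the batch
position_ids = torch.arange(seq_len, dtype=torch.long).unsqueeze(0)

outputs = model(
    input_ids=encoded["input_ids"],
    attention_mask=encoded["attention_mask"],
    position_ids=position_ids,
)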
QUESTION
I am trying to follow this example in the huggingface documentation here https://huggingface.co/transformers/model_doc/longformer.html:
ANSWER
Answered 2021-Apr-02 at 13:20
Do not select via index:
QUESTION
I am trying to save the tokenizer in huggingface so that I can load it later from a container where I don't need access to the internet.
ANSWER
Answered 2020-Oct-28 at 09:27
save_vocabulary() saves only the vocabulary file of the tokenizer (the list of BPE tokens). To save the entire tokenizer, you should use save_pretrained(), as in the sketch below.
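A minimal sketch of the round trip; the local directory name ./local_longformer_tokenizer is an illustrative assumption.

# Sketch: save the full tokenizer locally, then reload it without internet access.
from transformers import LongformerTokenizerFast

tokenizer = LongformerTokenizerFast.from_pretrained("allenai/longformer-base-4096")

# Writes the vocab, merges, tokenizer config, and special-tokens map
tokenizer.save_pretrained("./local_longformer_tokenizer")

# Later, e.g. inside the offline container:
offline_tokenizer = LongformerTokenizerFast.from_pretrained("./local_longformer_tokenizer")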
Vulnerabilities
No vulnerabilities reported
Install longformer
You can use longformer like any standard Python library. You will need a development environment with a Python distribution (including header files), a compiler, pip, and git installed. Make sure that your pip, setuptools, and wheel are up to date. When using pip, it is generally recommended to install packages in a virtual environment to avoid making changes to the system.