longformer | Longformer: The Long-Document Transformer | Natural Language Processing library

by allenai | Python | Version: v0.2 | License: Apache-2.0

kandi X-RAY | longformer Summary


longformer is a Python library typically used in Artificial Intelligence, Natural Language Processing, Deep Learning, PyTorch, TensorFlow, BERT, Neural Network, and Transformer applications. longformer has no bugs and no reported vulnerabilities, it has a build file available, it has a Permissive License, and it has medium support. You can download it from GitHub.

Longformer and LongformerEncoderDecoder (LED) are pretrained transformer models for long documents. A LongformerEncoderDecoder (LED) model is now available. It supports seq2seq tasks with long inputs. With gradient checkpointing, fp16, and a 48GB GPU, the input length can be up to 16K tokens. Check the updated paper for the model details and evaluation. A significant speed degradation in huggingface/transformers was recently discovered and fixed (check this PR for details). To avoid this problem, either use the old release v2.11.0 (which does not support gradient checkpointing) or use the master branch. This problem should be fixed in the next huggingface/transformers release.
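As a quick orientation, the following is a minimal usage sketch, assuming the Hugging Face transformers package is installed and the allenai/longformer-base-4096 checkpoint is used (the example text is arbitrary):

# Minimal sketch: load the pretrained Longformer and encode a document.
import torch
from transformers import LongformerModel, LongformerTokenizerFast

tokenizer = LongformerTokenizerFast.from_pretrained("allenai/longformer-base-4096")
model = LongformerModel.from_pretrained("allenai/longformer-base-4096")

inputs = tokenizer("A long document goes here.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch_size, sequence_length, hidden_size)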

            kandi-support Support

              longformer has a medium active ecosystem.
              It has 1778 star(s) with 261 fork(s). There are 41 watchers for this library.
              It had no major release in the last 12 months.
There are 122 open issues and 102 closed issues. On average, issues are closed in 30 days. There are 10 open pull requests and 0 closed pull requests.
              It has a neutral sentiment in the developer community.
The latest version of longformer is v0.2.

            kandi-Quality Quality

              longformer has 0 bugs and 0 code smells.

            kandi-Security Security

              longformer has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.
              longformer code analysis shows 0 unresolved vulnerabilities.
              There are 0 security hotspots that need review.

            kandi-License License

              longformer is licensed under the Apache-2.0 License. This license is Permissive.
              Permissive licenses have the least restrictions, and you can use them in most projects.

            kandi-Reuse Reuse

              longformer releases are available to install and integrate.
              Build file is available. You can build the component from source.
              Installation instructions are not available. Examples and code snippets are available.
              longformer saves you 1263 person hours of effort in developing the same functionality from scratch.
              It has 2839 lines of code, 238 functions and 33 files.
              It has high code complexity. Code complexity directly impacts maintainability of the code.

            Top functions reviewed by kandi - BETA

kandi has reviewed longformer and discovered the functions below as its top functions. This is intended to give you an instant insight into the functionality longformer implements, and to help you decide if it suits your requirements.
            • Convert a single example .
            • Find library path .
            • Compile GPU .
            • Create a LongformerEncoder .
            • Convert arguments to TVM arguments .
            • Add command line arguments to the given parser .
            • Export the module to a file .
            • Register an extension .
            • Find and return the path to include .
            • Register a global function .

            longformer Key Features

            No Key Features are available at this moment for longformer.

            longformer Examples and Code Snippets

Rissanen Data Analysis, Reproducing Our Results, RDA on HotpotQA
Python | Lines of Code: 90 | License: Non-SPDX (NOASSERTION)
# Function to download from Google Drive from the terminal
            function gdrive_download () {
              CONFIRM=$(wget --quiet --save-cookies /tmp/cookies.txt --keep-session-cookies --no-check-certificate "https://docs.google.com/uc?export=download&id=$1" -O- | sed -rn 's/.*co  
@misc{sekuli2020longformer,
  title={Longformer for MS MARCO Document Re-ranking Task},
  author={Ivan Sekulić and Amir Soleimani and Mohammad Aliannejadi and Fabio Crestani},
  year={2020},
  eprint={2009.09392},
  archivePrefix={arXiv},
  primaryClass={cs.IR}
}
              
On Generalization in Coreference Resolution, Training and Inference, Training
Jupyter Notebook | Lines of Code: 4 | License: No License
            python main.py experiment=joint use_wandb=True
            
            python main.py experiment=litbank trainer.label_smoothing_wt=0.0
            
            python main.py experiment=ontonotes_pseudo model/doc_encoder/transformer=longformer_base
            
            python main.py experiment=litbank model/memory  

            Community Discussions

            QUESTION

            understanding gpu usage huggingface classification - Total optimization steps
            Asked 2022-Mar-29 at 00:30

I am training the huggingface Longformer for a classification problem and got the output below.

1. I am confused about Total optimization steps. As I have 7000 training data points, 5 epochs, and Total train batch size (w. parallel, distributed & accumulation) = 64, shouldn't I get 7000*5/64 steps? That comes to 546.875, so why is it showing Total optimization steps = 545?

2. Why, in the output below, are there 16 messages saying Input ids are automatically padded from 1500 to 1536 to be a multiple of config.attention_window: 512, followed by [ 23/545 14:24 < 5:58:16, 0.02 it/s, Epoch 0.20/5]? What are these steps?

            ==========================================================

            ...

            ANSWER

            Answered 2022-Mar-29 at 00:30
            1. Why 545 optimization steps?

            Looking at the implementation of the transformers package, we see that the Trainer uses a variable called max_steps when printing the Total optimization steps message in the train method:
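The quoted Trainer source is elided above. As a rough sketch of the arithmetic only (an assumption about the setup: the effective number of update steps per epoch is floored, e.g. because of gradient accumulation or a dataloader that drops the last incomplete batch), the count works out to 545 rather than 546.875:

# Sketch of the step arithmetic (assumes steps per epoch are floored rather than rounded up).
num_examples = 7000
total_train_batch_size = 64      # per-device batch * devices * gradient accumulation
num_train_epochs = 5

steps_per_epoch = num_examples // total_train_batch_size   # 109, not 109.375
max_steps = steps_per_epoch * num_train_epochs
print(max_steps)                                            # 545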

            Source https://stackoverflow.com/questions/71607906

            QUESTION

            huggingface sequence classification unfreezing layers
            Asked 2022-Mar-22 at 23:48

I am using Longformer for sequence classification (a binary problem).

I have downloaded the required files.

            ...

            ANSWER

            Answered 2022-Mar-22 at 23:48
1. requires_grad==True means that we will compute the gradient of this tensor, so the default setting is that we train/finetune all layers.
2. You can train only the output layer by freezing the encoder with:
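The code that followed is not included above. A minimal sketch of freezing the encoder so that only the classification head is trained might look like this (assuming a LongformerForSequenceClassification model, whose encoder lives under the longformer attribute):

# Sketch: freeze the Longformer encoder; only the classification head stays trainable.
from transformers import LongformerForSequenceClassification

model = LongformerForSequenceClassification.from_pretrained(
    "allenai/longformer-base-4096", num_labels=2
)
for param in model.longformer.parameters():   # encoder weights
    param.requires_grad = False
# model.classifier parameters keep requires_grad=True and will be updated during training.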

            Source https://stackoverflow.com/questions/71577525

            QUESTION

            transformers longformer classification problem with f1, precision and recall classification
            Asked 2022-Feb-15 at 21:30

I am replicating code from this page and I am getting F1, precision, and recall of 0. I got the same accuracy as the author. What could be the reason?

I looked into the compute_metrics function and it seems to be correct. I tried some toy data as below, and precision_recall_fscore_support seems to give a correct answer.

            ...

            ANSWER

            Answered 2022-Feb-15 at 20:46

My guess is that the transformation of your dependent variable was somehow messed up. I think this because all of your metrics that depend on TP (True Positives) are 0.

Both Precision and Sensitivity (Recall) have TP as the numerator:
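The formula and the compute_metrics code referenced in the question are not reproduced above. A common formulation of such a function (a sketch, assuming binary labels and scikit-learn) looks like this; if the labels fed into it are malformed, e.g. collapsed to a single class by a bad transformation, precision, recall, and F1 can all drop to 0 while accuracy still looks reasonable:

# Sketch of a typical compute_metrics for a Hugging Face Trainer (binary classification).
import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    precision, recall, f1, _ = precision_recall_fscore_support(
        labels, preds, average="binary"
    )
    return {
        "accuracy": accuracy_score(labels, preds),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }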

            Source https://stackoverflow.com/questions/71086923

            QUESTION

            ValueError: Unrecognized model in ./MRPC/. Should have a `model_type` key in its config.json, or contain one of the following strings in its name
            Asked 2022-Jan-13 at 14:10

            Goal: Amend this Notebook to work with Albert and Distilbert models

            Kernel: conda_pytorch_p36. I did Restart & Run All, and refreshed file view in working directory.

            Error occurs in Section 1.2, only for these 2 new models.

            For filenames etc., I've created a variable used everywhere:

            ...

            ANSWER

            Answered 2022-Jan-13 at 14:10
            Explanation:

When instantiating AutoModel, you must specify a model_type parameter in the ./MRPC/config.json file (downloaded during Notebook runtime).

            List of model_types can be found here.

            Solution:

            Code that appends model_type to config.json, in the same format:
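That code is not reproduced above. A minimal sketch that adds the key is shown below; the value "albert" is only an example and must match the checkpoint actually stored in ./MRPC/:

# Sketch: add a model_type key to an existing config.json so AutoModel can resolve the architecture.
import json

with open("./MRPC/config.json", "r") as f:
    config = json.load(f)

config["model_type"] = "albert"   # example value; use the type matching your checkpoint

with open("./MRPC/config.json", "w") as f:
    json.dump(config, f, indent=2)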

            Source https://stackoverflow.com/questions/70697470

            QUESTION

            Can the increase in training loss lead to better accuracy?
            Asked 2022-Jan-11 at 23:11

            I'm working on a competition on Kaggle. First, I trained a Longformer base with the competition dataset and achieved a quite good result on the leaderboard. Due to the CUDA memory limit and time limit, I could only train 2 epochs with a batch size of 1. The loss started at about 2.5 and gradually decreased to 0.6 at the end of my training.

I then continued training for 2 more epochs using those saved weights. This time I used a slightly larger learning rate (the one in the Longformer paper) and added the validation data to the training data (meaning I no longer split the dataset 90/10). I did this to try to achieve a better result.

            However, this time the loss started at about 0.4 and constantly increased to 1.6 at about half of the first epoch. I stopped because I didn't want to waste computational resources.

            Should I have waited more? Could it eventually lead to a better test result? I think the model could have been slightly overfitting at first.

            ...

            ANSWER

            Answered 2022-Jan-11 at 08:50

            Your model got fitted to the original training data the first time you trained it. When you added the validation data to the training set the second time around, the distribution of your training data must have changed significantly. Thus, the loss increased in your second training session since your model was unfamiliar with this new distribution.

Should you have waited more? Yes, the loss would have eventually decreased (although not necessarily to a value lower than the original training loss).

Could it have led to a better test result? Probably. It depends on whether your validation data contains patterns that are:

            1. Not present in your training data already
            2. Similar to those that your model will encounter in deployment

            Source https://stackoverflow.com/questions/70663238

            QUESTION

            Transformers Longformer IndexError: index out of range in self
            Asked 2021-Aug-27 at 18:33

From the Transformers library I use LongformerModel, LongformerTokenizerFast, and LongformerConfig (all of them loaded with from_pretrained("allenai/longformer-base-4096")).

            When I do

            ...

            ANSWER

            Answered 2021-Aug-27 at 18:33

            I have managed to fix this by reindexing my position_ids.

When PyTorch was creating that tensor, for some reason some values in position_ids were bigger than 4098.

            I used:
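The fix itself is elided above. One hedged sketch of building position_ids explicitly so that every index stays inside the position-embedding table (allenai/longformer-base-4096 has max_position_embeddings = 4098, with positions starting at 2 in RoBERTa style) could be:

# Sketch: construct position_ids explicitly so no index exceeds the embedding table size.
import torch
from transformers import LongformerModel, LongformerTokenizerFast

tokenizer = LongformerTokenizerFast.from_pretrained("allenai/longformer-base-4096")
model = LongformerModel.from_pretrained("allenai/longformer-base-4096")

inputs = tokenizer("some long document", return_tensors="pt")
seq_len = inputs["input_ids"].shape[1]
position_ids = torch.arange(2, seq_len + 2).unsqueeze(0)   # stays below 4098 for seq_len <= 4096

outputs = model(input_ids=inputs["input_ids"],
                attention_mask=inputs["attention_mask"],
                position_ids=position_ids)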

            Source https://stackoverflow.com/questions/68951828

            QUESTION

            Longformer get last_hidden_state
            Asked 2021-Apr-02 at 13:20

            I am trying to follow this example in the huggingface documentation here https://huggingface.co/transformers/model_doc/longformer.html:

            ...

            ANSWER

            Answered 2021-Apr-02 at 13:20

            Do not select via index:
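The code that followed is elided above. The point, as a sketch, is to read last_hidden_state as an attribute on the returned output object rather than indexing into it:

# Sketch: access the hidden states by attribute instead of by tuple index.
import torch
from transformers import LongformerModel, LongformerTokenizerFast

tokenizer = LongformerTokenizerFast.from_pretrained("allenai/longformer-base-4096")
model = LongformerModel.from_pretrained("allenai/longformer-base-4096")

inputs = tokenizer("Hello, my dog is cute", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

last_hidden_state = outputs.last_hidden_state   # attribute access, not outputs[0][...]
print(last_hidden_state.shape)                  # (1, sequence_length, 768)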

            Source https://stackoverflow.com/questions/66655023

            QUESTION

            Huggingface saving tokenizer
            Asked 2020-Oct-28 at 09:27

I am trying to save the tokenizer in huggingface so that I can load it later from a container where I don't have access to the internet.

            ...

            ANSWER

            Answered 2020-Oct-28 at 09:27

save_vocabulary() saves only the vocabulary file of the tokenizer (the list of BPE tokens).

To save the entire tokenizer, you should use save_pretrained().

            Thus, as follows:
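The concluding snippet is elided above. A minimal sketch (the local directory name is an arbitrary example):

# Sketch: save the full tokenizer locally, then reload it later without internet access.
from transformers import LongformerTokenizerFast

tokenizer = LongformerTokenizerFast.from_pretrained("allenai/longformer-base-4096")
tokenizer.save_pretrained("./my_longformer_tokenizer")   # example path

# Later, e.g. inside the offline container:
tokenizer = LongformerTokenizerFast.from_pretrained("./my_longformer_tokenizer")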

            Source https://stackoverflow.com/questions/64550503

Community Discussions and Code Snippets contain sources that include the Stack Exchange Network.

            Vulnerabilities

            No vulnerabilities reported

            Install longformer

            You can download it from GitHub.
            You can use longformer like any standard Python library. You will need to make sure that you have a development environment consisting of a Python distribution including header files, a compiler, pip, and git installed. Make sure that your pip, setuptools, and wheel are up to date. When using pip it is generally recommended to install packages in a virtual environment to avoid changes to the system.

            Support

For any new features, suggestions, or bugs, create an issue on GitHub. If you have any questions, check for and ask them on the Stack Overflow community page.
            Find more information at:

            CLONE
          • HTTPS

            https://github.com/allenai/longformer.git

          • CLI

            gh repo clone allenai/longformer

• SSH

            git@github.com:allenai/longformer.git
