gpt2 | Implementation of OpenAI GPT-2 | Natural Language Processing library

by darr | Python | Version: Current | License: No License

kandi X-RAY | gpt2 Summary

gpt2 is a Python library typically used in Artificial Intelligence, Natural Language Processing, and TensorFlow applications. gpt2 has no bugs, no reported vulnerabilities, and low support. However, its build file is not available. You can download it from GitHub.

Implementation of OpenAI GPT-2

Support

gpt2 has a low-activity ecosystem.
              It has 6 star(s) with 3 fork(s). There are 3 watchers for this library.
              It had no major release in the last 6 months.
              gpt2 has no issues reported. There are no pull requests.
              It has a neutral sentiment in the developer community.
              The latest version of gpt2 is current.

Quality

              gpt2 has 0 bugs and 0 code smells.

Security

              gpt2 has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.
              gpt2 code analysis shows 0 unresolved vulnerabilities.
              There are 0 security hotspots that need review.

License

              gpt2 does not have a standard license declared.
              Check the repository for any license declaration and review the terms closely.
              Without a license, all rights are reserved, and you cannot use the library in your applications.

Reuse

              gpt2 releases are not available. You will need to build from source code and install.
gpt2 has no build file. You will need to create the build yourself to build the component from source.
              Installation instructions are not available. Examples and code snippets are available.
              It has 2762 lines of code, 290 functions and 58 files.
              It has high code complexity. Code complexity directly impacts maintainability of the code.

            Top functions reviewed by kandi - BETA

            kandi has reviewed gpt2 and discovered the below as its top functions. This is intended to give you an instant insight into gpt2 implemented functionality, and help decide if they suit your requirements.
            • Tokenize text
            • Clean text
            • Check if a character is a control character
            • Check if a character is whitespace
            • Forward computation
            • Splits the input tensors
            • Layer attention
            • Merge the rows of x
            • RunFinetuning
            • Evaluate the model
            • Loads a model from the given config
            • Evaluate the given model
            • Get the model and tokenizer
            • Create an instance from a JSON file
            • Create a GPT2 model from pretrained data
            • Generate a block for a given token
• Returns a set of symbol pairs in a word
            • Set special tokens
            • Get a dictionary of special tokens
            • Convert a list of ids to a string
            • Convert ids to tokens
• Retrieve BPE ranks from merges_file
            • Loads the tokenizer
            • Load the model from pretrained data
            • Runs the direct evaluation of a trained graph
            • Runs the finished training
            Get all kandi verified functions for this library.

            gpt2 Key Features

            No Key Features are available at this moment for gpt2.

            gpt2 Examples and Code Snippets

            No Code Snippets are available at this moment for gpt2.

            Community Discussions

            QUESTION

            Solving "CUDA out of memory" when fine-tuning GPT-2 (HuggingFace)
            Asked 2022-Apr-03 at 09:45

I get the recurring CUDA out of memory error when using the HuggingFace Transformers library to fine-tune a GPT-2 model and can't seem to solve it, despite my 6 GB GPU capacity, which I thought should be enough for fine-tuning on texts. The error reads as follows:

            ...

            ANSWER

            Answered 2022-Apr-03 at 09:45
1. If the memory problems still persist, you could opt for DistilGPT2, as it has 33% fewer parameters than the full network (the forward pass is also twice as fast). Particularly for a small GPU memory like 6 GB of VRAM, it could be a solution/alternative to your problem.
2. At the same time, it depends on how you preprocess the data. Indeed, the model is capable of "receiving" a maximum length of N tokens (for example 512/768) depending on the model you choose. I recently trained a named entity recognition model and the model had a maximum length of 768 tokens. However, when I manually set the dimension of the padded tokens in my PyTorch DataLoader() to a big number, I also ran out of memory (even on a 3090 with 24 GB of VRAM). As I reduced the padded token dimension to a much smaller one (512 instead of 768, for example), the training started to work and I did not get any issues with the lack of memory.

TLDR: Reducing the number of tokens in the preprocessing phase, regardless of the max capacity of the network, can also help to solve your memory problems. Note that reducing the number of tokens to process in a sequence is different from the dimension of a token.
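
A minimal sketch of that preprocessing change, assuming the Hugging Face GPT-2 tokenizer and an illustrative 512-token budget (the texts and the exact limit are placeholders, not values from the answer):

from transformers import GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token   # GPT-2 has no pad token by default

# Truncate every example to a smaller token budget before batching,
# so each padded sequence handed to the DataLoader stays short.
encodings = tokenizer(
    ["your training text ..."],             # your list of training texts
    truncation=True,
    max_length=512,                         # illustrative, not a model requirement
    padding="max_length",
    return_tensors="pt",
)
print(encodings["input_ids"].shape)         # (batch_size, 512) instead of a larger length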

            Source https://stackoverflow.com/questions/70606666

            QUESTION

            AttributeError: 'GPT2Model' object has no attribute 'gradient_checkpointing'
            Asked 2022-Mar-15 at 04:33

I am trying to load a fine-tuned GPT2 model in Flask initially. The model is loaded during the init functions using:

            ...

            ANSWER

            Answered 2021-Nov-20 at 11:21

This issue occurs only when the framework is run using a venv or deployment frameworks like uWSGI or gunicorn. It is resolved when transformers version 4.10.0 is used instead of the latest package.
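
A quick way to confirm which version the venv / uWSGI / gunicorn worker actually imports (the pin itself is applied with your package manager, e.g. pip install transformers==4.10.0):

import transformers

# Verify the version loaded inside the deployed environment;
# the workaround above pins it to 4.10.0.
print(transformers.__version__)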

            Source https://stackoverflow.com/questions/69773687

            QUESTION

            Chatbot using Huggingface Transformers
            Asked 2022-Mar-04 at 19:46

            I would like to use Huggingface Transformers to implement a chatbot. Currently, I have the code shown below. The transformer model already takes into account the history of past user input.

            Is there something else (additional code) I have to take into account for building the chatbot?

            Second, how can I modify my code to run with TensorFlow instead of PyTorch?

Later on, I also plan to fine-tune the model on other data. I also plan to test different models such as BlenderBot and GPT2. I think that to test these different models it should be as easy as replacing the corresponding model in AutoTokenizer.from_pretrained("microsoft/DialoGPT-small") and AutoModelForCausalLM.from_pretrained("microsoft/DialoGPT-small").

            ...

            ANSWER

            Answered 2021-Nov-21 at 17:17

            Here is an example of using the DialoGPT model with Tensorflow:
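
A minimal sketch of such a chat loop, assuming the TensorFlow classes in Hugging Face Transformers and the DialoGPT-small checkpoint named in the question (this is not the answerer's original snippet, which is not reproduced here):

import tensorflow as tf
from transformers import AutoTokenizer, TFAutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("microsoft/DialoGPT-small")
model = TFAutoModelForCausalLM.from_pretrained("microsoft/DialoGPT-small")

chat_history_ids = None
for _ in range(5):
    user_ids = tokenizer.encode(input(">> User: ") + tokenizer.eos_token,
                                return_tensors="tf")
    # Append the new user turn to the running conversation history.
    bot_input_ids = (tf.concat([chat_history_ids, user_ids], axis=-1)
                     if chat_history_ids is not None else user_ids)
    chat_history_ids = model.generate(bot_input_ids, max_length=1000,
                                      pad_token_id=tokenizer.eos_token_id)
    reply = tokenizer.decode(chat_history_ids[0, bot_input_ids.shape[-1]:],
                             skip_special_tokens=True)
    print("Bot:", reply)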

            Source https://stackoverflow.com/questions/70055966

            QUESTION

            query() of generator `max_length` being succeeded
            Asked 2022-Mar-04 at 10:30

            Goal: set min_length and max_length in Hugging Face Transformers generator query.

I've passed 50, 200 as these parameters. Yet, the lengths of my outputs are much higher...

            There's no runtime failure.

            ...

            ANSWER

            Answered 2022-Mar-04 at 10:30
            Explanation:

As explained by Narsil in a Hugging Face 🤗 Transformers GitHub issue response:

Models don't ingest the text one character at a time, but one token at a time. There are different algorithms to achieve this, but basically "My name is Nicolas" gets transformed into something like ["my", " name", " is", " nic", "olas"], and each of those tokens has a number.

So when you are generating tokens, each one can itself contain one or more characters (usually several; almost any common word is a single token, for instance). That's why you are seeing 1015 instead of your expected 200 (the tokens here average about 5 characters).

            Solution:

            As I resolved...

Rename min_char_len, max_char_len to min_tokens, max_tokens and simply reduce their values to roughly 1/4 or 1/5 of the original character counts.
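
A minimal sketch of the distinction, assuming the Hugging Face text-generation pipeline with the stock gpt2 checkpoint (the prompt and limits are illustrative):

from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

# min_length / max_length count tokens, not characters; with roughly
# 4-5 characters per token, 200 tokens comes out near 1000 characters.
out = generator("My name is Nicolas and", min_length=50, max_length=200)
print(len(out[0]["generated_text"]))   # character count will exceed 200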

            Source https://stackoverflow.com/questions/71338307

            QUESTION

            HuggingFace | ValueError: Connection error, and we cannot find the requested files in the cached path. Please try again or make sure your Internet con
            Asked 2022-Mar-03 at 13:51

            Not always, but occasionally when running my code this error appears.

At first, I doubted it was a connectivity issue, but suspected it had to do with a caching issue, as discussed in an older Git Issue.

Clearing the cache didn't help at runtime:

            ...

            ANSWER

            Answered 2022-Mar-03 at 11:59

            Since I am working in a conda venv and using Poetry for handling dependencies, I needed to re-install torch - a dependency for Hugging Face 🤗 Transformers.

First, install torch: PyTorch's website lets you choose your exact setup/specification for the install. In my case, the command was

            Source https://stackoverflow.com/questions/71335585

            QUESTION

How to save checkpoints for the transformer gpt2 to continue training?
            Asked 2022-Feb-22 at 19:10

            I am retraining the GPT2 language model, and am following this blog :

            https://towardsdatascience.com/train-gpt-2-in-your-own-language-fc6ad4d60171

Here, they have trained a network on GPT2, and I am trying to recreate the same. However, my dataset is too large (250 MB), so I want to continue training in intervals. In other words, I want to checkpoint the model training. If there is any help, or a piece of code that I can implement to checkpoint and continue training, it would help me a great deal. Thank you.

            ...

            ANSWER

            Answered 2022-Feb-22 at 19:10
# model, tokenizer, train_set, dev_set and model_checkpoint (the checkpoint
# directory) are assumed to be defined earlier in the script.
from transformers import AutoModelForCausalLM, Trainer, TrainingArguments

training_args = TrainingArguments(
    output_dir=model_checkpoint,
    # other hyper-params
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_set,
    eval_dataset=dev_set,
    tokenizer=tokenizer,
)

trainer.train()
# Save the model to model_checkpoint (the output_dir above)
trainer.save_model()

def prepare_model(tokenizer, model_name_path):
    model = AutoModelForCausalLM.from_pretrained(model_name_path)
    model.resize_token_embeddings(len(tokenizer))
    return model

# Assume tokenizer is defined; you can simply pass the saved model directory path.
model = prepare_model(tokenizer, model_checkpoint)
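
To pick up training later from the checkpoints the Trainer writes into output_dir, train() also accepts a resume flag; a minimal sketch:

# Resume from the most recent checkpoint found in training_args.output_dir
trainer.train(resume_from_checkpoint=True)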
            

            Source https://stackoverflow.com/questions/71215965

            QUESTION

            Hugging face - Efficient tokenization of unknown token in GPT2
            Asked 2022-Jan-16 at 07:28

            I am trying to train a dialog system using GPT2. For tokenization, I am using the following configuration for adding the special tokens.

            ...

            ANSWER

            Answered 2022-Jan-16 at 07:28

For the important_tokens which contain several actual words (like frankie_and_bennys), you can replace the underscores with spaces and feed them in normally, or add them as special tokens. I prefer the first option because this way you can use the pre-trained embeddings for their sub-tokens. For the ones which aren't actual words (like cb17dy), you must add them as special tokens.
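
A minimal sketch of both options, assuming the stock GPT-2 tokenizer (the token names are taken from the question):

from transformers import GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")

# Option 1: turn multi-word identifiers into plain words so the
# pre-trained sub-token embeddings can be reused.
print(tokenizer.tokenize("frankie_and_bennys".replace("_", " ")))

# Option 2: register non-word identifiers as additional tokens.
tokenizer.add_tokens(["cb17dy"])
# Afterwards the model's embedding matrix must be resized:
# model.resize_token_embeddings(len(tokenizer))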

            Source https://stackoverflow.com/questions/70672460

            QUESTION

            ValueError: Unrecognized model in ./MRPC/. Should have a `model_type` key in its config.json, or contain one of the following strings in its name
            Asked 2022-Jan-13 at 14:10

            Goal: Amend this Notebook to work with Albert and Distilbert models

            Kernel: conda_pytorch_p36. I did Restart & Run All, and refreshed file view in working directory.

            Error occurs in Section 1.2, only for these 2 new models.

            For filenames etc., I've created a variable used everywhere:

            ...

            ANSWER

            Answered 2022-Jan-13 at 14:10
            Explanation:

When instantiating AutoModel, you must specify a model_type key in the ./MRPC/config.json file (downloaded during Notebook runtime).

            List of model_types can be found here.

            Solution:

            Code that appends model_type to config.json, in the same format:
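
A minimal sketch of that idea (the path and the "albert" value are illustrative assumptions, not the answer's exact code):

import json

config_path = "./MRPC/config.json"
with open(config_path) as f:
    config = json.load(f)

# Add the key AutoModel looks for; use the type matching your model family.
config["model_type"] = "albert"   # or "distilbert"

with open(config_path, "w") as f:
    json.dump(config, f, indent=2)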

            Source https://stackoverflow.com/questions/70697470

            QUESTION

            How to use GPU for this python file
            Asked 2022-Jan-05 at 07:19

I have this python file where I am trying to train a GPT2 model from scratch. For the same, I want to use a GPU for faster acceleration, but I am unable to do so. Help will be much appreciated.

My python code is as follows.

PS: I am running this code on AWS SageMaker, so I want to use their GPU acceleration.

I have used this link for reference.

            ...

            ANSWER

            Answered 2022-Jan-05 at 07:19

            You need to activate GPU runtime while hosting the notebook session in AWS SageMaker. The code will automatically take care of utilizing GPU resources.

            Looking at the link which you shared - it doesn't have any custom configs to manually specify GPU resources.

            If it's handled automatically by the framework which you're using to train the network, then in an active GPU session it will automatically allocate GPU resources while training.
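
If you want to verify that the GPU instance is actually visible to the training code, a quick check (assuming the script uses PyTorch, as the referenced material does):

import torch

# Quick sanity check that the SageMaker GPU instance is visible.
print(torch.cuda.is_available())   # True on a GPU-backed instance
print(torch.cuda.get_device_name(0) if torch.cuda.is_available() else "CPU only")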

            Source https://stackoverflow.com/questions/70588756

            QUESTION

            "ValueError: You have to specify either input_ids or inputs_embeds" when training AutoModelWithLMHead Model (GPT-2)
            Asked 2022-Jan-04 at 14:08

I want to fine-tune the AutoModelWithLMHead model from this repository, which is a German GPT-2 model. I have followed the tutorials for pre-processing and fine-tuning. I have preprocessed a bunch of text passages for the fine-tuning, but when beginning training, I receive the following error:

            ...

            ANSWER

            Answered 2022-Jan-04 at 14:08

I didn't find a concrete answer to this question, but a workaround. For anyone looking for examples of how to fine-tune the GPT models from HuggingFace, you may have a look at this repo. They list a couple of examples of how to fine-tune different Transformer models, complemented by documented code examples. I used the run_clm.py script and it achieved what I wanted.

            Source https://stackoverflow.com/questions/70577285

            Community Discussions, Code Snippets contain sources that include Stack Exchange Network

            Vulnerabilities

            No vulnerabilities reported

            Install gpt2

            You can download it from GitHub.
            You can use gpt2 like any standard Python library. You will need to make sure that you have a development environment consisting of a Python distribution including header files, a compiler, pip, and git installed. Make sure that your pip, setuptools, and wheel are up to date. When using pip it is generally recommended to install packages in a virtual environment to avoid changes to the system.

            Support

For any new features, suggestions, and bugs, create an issue on GitHub. If you have any questions, check and ask on the Stack Overflow community page.
            Find more information at:

            CLONE
          • HTTPS

            https://github.com/darr/gpt2.git

          • CLI

            gh repo clone darr/gpt2

          • sshUrl

            git@github.com:darr/gpt2.git


            Consider Popular Natural Language Processing Libraries

            transformers

            by huggingface

            funNLP

            by fighting41love

            bert

            by google-research

            jieba

            by fxsjy

            Python

            by geekcomputers

            Try Top Libraries by darr

pytorch_gpu_memory

by darr (Shell)

chatbot

by darr (Python)

DCGAN

by darr (Python)

easy_marl

by darr (Shell)

gpt

by darr (Python)