BERTweet | trained language model for English Tweets | Natural Language Processing library

by VinAIResearch | Python | Version: Current | License: MIT

kandi X-RAY | BERTweet Summary

BERTweet is a Python library typically used in Artificial Intelligence, Natural Language Processing, PyTorch, BERT, Neural Network, and Transformer applications. BERTweet has no reported bugs or vulnerabilities, has a permissive license, and has low support. However, no build file is available for BERTweet. You can download it from GitHub.

The general architecture and experimental results of BERTweet can be found in our paper. Please cite our paper when BERTweet is used to help produce published results or is incorporated into other software.

Pre-training data listed for the released models:
• 845M English Tweets (cased)
• 23M COVID-19 English Tweets (cased)
• 23M COVID-19 English Tweets (uncased)

As of 09/2020, we have collected a corpus of about 23M "cased" COVID-19 English Tweets and also generated an "uncased" version of this corpus. We then continued pre-training from vinai/bertweet-base on each of the "cased" and "uncased" corpora of 23M Tweets for 40 additional epochs, resulting in two BERTweet variants: vinai/bertweet-covid19-base-cased and vinai/bertweet-covid19-base-uncased, respectively.

Before applying fastBPE to the pre-training corpus of 850M English Tweets, we tokenized these Tweets using TweetTokenizer from the NLTK toolkit and used the emoji package to translate emotion icons into text strings (here, each icon is referred to as a word token). We also normalized the Tweets by converting user mentions and web/URL links into the special tokens @USER and HTTPURL, respectively. It is therefore recommended to apply the same pre-processing step to the raw input Tweets in BERTweet-based downstream applications. BERTweet provides this pre-processing step, which is applied when the normalization argument is enabled.
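The mention and link normalization described above can be illustrated with a minimal, standard-library-only sketch. It is not the project's own implementation: whitespace splitting stands in for NLTK's TweetTokenizer, the emoji-to-text step is left out, and the function names `normalize_token` and `normalize_tweet` are hypothetical.

```python
# Simplified sketch of the @USER / HTTPURL normalization.
# Assumptions: whitespace tokenization replaces NLTK's TweetTokenizer,
# and emoji-to-text conversion (the emoji package) is omitted.


def normalize_token(token: str) -> str:
    """Map one token to its normalized form (hypothetical helper)."""
    lowered = token.lower()
    # A user mention such as "@someone" becomes the special token @USER.
    if token.startswith("@") and len(token) > 1:
        return "@USER"
    # A web link becomes the special token HTTPURL.
    if lowered.startswith(("http://", "https://", "www.")):
        return "HTTPURL"
    return token


def normalize_tweet(tweet: str) -> str:
    """Normalize every whitespace-separated token of a raw Tweet."""
    return " ".join(normalize_token(tok) for tok in tweet.split())


if __name__ == "__main__":
    raw = "Thanks @VinAI_Research for the model https://example.com/paper !"
    print(normalize_tweet(raw))  # Thanks @USER for the model HTTPURL !
```

With the Hugging Face transformers tokenizer for vinai/bertweet-base, the built-in step is reportedly enabled by passing normalization=True when loading the tokenizer; check the project README for the exact call.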

            kandi-support Support

              BERTweet has a low-activity ecosystem.
              It has 504 stars and 56 forks. There are 12 watchers for this library.
              It had no major release in the last 6 months.
              There are 0 open issues, and 43 have been closed. On average, issues are closed in 13 days. There are no pull requests.
              It has a neutral sentiment in the developer community.
              The latest version of BERTweet is current.

            kandi-Quality Quality

              BERTweet has 0 bugs and 0 code smells.

            kandi-Security Security

              BERTweet has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.
              BERTweet code analysis shows 0 unresolved vulnerabilities.
              There are 0 security hotspots that need review.

            kandi-License License

              BERTweet is licensed under the MIT License. This license is Permissive.
              Permissive licenses have the least restrictions, and you can use them in most projects.

            kandi-Reuse Reuse

              BERTweet releases are not available. You will need to build from source code and install.
              BERTweet has no build file. You will need to create the build yourself to build the component from source.
              Installation instructions are not available. Examples and code snippets are available.
              BERTweet saves you 11 person hours of effort in developing the same functionality from scratch.
              It has 49 lines of code, 2 functions and 1 files.
              It has low code complexity. Code complexity directly impacts maintainability of the code.

            Top functions reviewed by kandi - BETA

            kandi has reviewed BERTweet and identified the functions below as its top functions. This is intended to give you quick insight into the functionality BERTweet implements and to help you decide whether it suits your requirements.
            • Normalize Tweet
            • Normalize a token

            Community Discussions

            QUESTION

            Error Running "config = RobertaConfig.from_pretrained( "/Absolute-path-to/BERTweet_base_transformers/config.json""
            Asked 2020-Oct-24 at 17:16

            I'm trying to run the 'transformers' version of this code to use the new pre-trained BERTweet model, and I'm getting an error.

            The following lines of code ran successfully in my Google Colab notebook:

            ...

            ANSWER

            Answered 2020-Jun-16 at 12:15

            First of all, you have to download the proper package, as described in the GitHub README:

            Source https://stackoverflow.com/questions/62405867

            Community Discussions, Code Snippets contain sources that include Stack Exchange Network

            Vulnerabilities

            No vulnerabilities reported

            Install BERTweet

            You can download it from GitHub.
            You can use BERTweet like any standard Python library. You will need to make sure that you have a development environment consisting of a Python distribution including header files, a compiler, pip, and git installed. Make sure that your pip, setuptools, and wheel are up to date. When using pip it is generally recommended to install packages in a virtual environment to avoid changes to the system.

            Support

            For new features, suggestions, and bugs, create an issue on GitHub. If you have questions, search for answers or ask on the Stack Overflow community page.

            Clone
            • HTTPS: https://github.com/VinAIResearch/BERTweet.git
            • GitHub CLI: gh repo clone VinAIResearch/BERTweet
            • SSH: git@github.com:VinAIResearch/BERTweet.git

            Consider Popular Natural Language Processing Libraries

            transformers

            by huggingface

            funNLP

            by fighting41love

            bert

            by google-research

            jieba

            by fxsjy

            Python

            by geekcomputers

            Try Top Libraries by VinAIResearch

            CPM

            by VinAIResearch (Python)

            WaveDiff

            by VinAIResearch (Python)

            XPhoneBERT

            by VinAIResearch (Python)

            blur-kernel-space-exploring

            by VinAIResearch (Python)

            PhoNLP

            by VinAIResearch (Python)