KeyBERT | Minimal keyword extraction with BERT | Natural Language Processing library

by MaartenGr | Python | Version: 0.8.5 | License: MIT

kandi X-RAY | KeyBERT Summary

KeyBERT is a Python library typically used in Artificial Intelligence, Natural Language Processing, and BERT applications. KeyBERT has no bugs and no vulnerabilities, has a build file available, has a permissive license, and has medium support. You can install it using 'pip install KeyBERT' or download it from GitHub or PyPI.

Although there are already many methods available for keyword generation (e.g., Rake, YAKE!, TF-IDF), I wanted to create a very basic but powerful method for extracting keywords and keyphrases. This is where KeyBERT comes in: it uses BERT embeddings and simple cosine similarity to find the sub-phrases in a document that are most similar to the document itself.

First, a document embedding is extracted with BERT to get a document-level representation. Then, word embeddings are extracted for N-gram words and phrases. Finally, cosine similarity is used to find the words and phrases that are most similar to the document. These can then be identified as the words that best describe the entire document.

KeyBERT is by no means unique; it was created as a quick and easy method for generating keywords and keyphrases. Although there are many great papers and solutions out there that use BERT embeddings (e.g., 1, 2, 3), I could not find a BERT-based solution that did not have to be trained from scratch and that beginners could use (correct me if I'm wrong!). Thus, the goal was a pip install keybert and at most 3 lines of code in usage.
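The three steps described above (embed the document, embed the N-gram candidates, rank by cosine similarity) can be sketched in plain Python. This is a minimal illustration, not KeyBERT's actual implementation: `toy_embed` is a hypothetical stand-in that counts character trigrams instead of calling a BERT model, and the helper names `candidates`, `cosine`, and `rank_keywords` are assumptions made for this example.

```python
import math
import re
from collections import Counter


def toy_embed(text):
    """Hypothetical stand-in for a BERT embedding: character-trigram counts."""
    padded = f" {text.lower()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def candidates(doc, ngram_range=(1, 2)):
    """Collect unique word N-grams from the document, in order of appearance."""
    words = re.findall(r"[a-z0-9]+", doc.lower())
    low, high = ngram_range
    phrases = []
    for n in range(low, high + 1):
        for i in range(len(words) - n + 1):
            phrases.append(" ".join(words[i:i + n]))
    return list(dict.fromkeys(phrases))  # de-duplicate, keep order


def rank_keywords(doc, ngram_range=(1, 2), top_n=5, embed=toy_embed):
    """Score each candidate against the whole document and keep the best."""
    doc_vec = embed(doc)  # step 1: document-level representation
    scored = [
        (phrase, cosine(embed(phrase), doc_vec))  # steps 2 and 3
        for phrase in candidates(doc, ngram_range)
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


if __name__ == "__main__":
    text = "Supervised learning is the machine learning task of learning a function"
    for phrase, score in rank_keywords(text, top_n=3):
        print(f"{phrase}: {score:.3f}")
```

The sketch skips stop-word filtering, and the toy embedding captures only surface overlap, so its rankings will differ from what a real BERT model produces. With the library itself, the same pipeline is exposed through `KeyBERT().extract_keywords(doc)`.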

Support

KeyBERT has a medium active ecosystem.

It has 2,419 stars and 276 forks. There are 27 watchers for this library.

It had no major release in the last 12 months.

There are 27 open issues and 119 have been closed. On average, issues are closed in 61 days. There is 1 open pull request and there are 0 closed pull requests.

It has a neutral sentiment in the developer community.

The latest version of KeyBERT is 0.8.5

Quality

KeyBERT has 0 bugs and 3 code smells.

Security

KeyBERT has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.

KeyBERT code analysis shows 0 unresolved vulnerabilities.

There are 0 security hotspots that need review.

License

KeyBERT is licensed under the MIT License. This license is Permissive.

Permissive licenses have the least restrictions, and you can use them in most projects.

Reuse

KeyBERT releases are available to install and integrate.

Deployable package is available in PyPI.

Build file is available. You can build the component from source.

Installation instructions, examples and code snippets are available.

It has 489 lines of code, 26 functions and 19 files.

It has medium code complexity. Code complexity directly impacts maintainability of the code.

Top functions reviewed by kandi - BETA

kandi has reviewed KeyBERT and identified the following as its top functions. This is intended to give you instant insight into the functionality KeyBERT implements and to help you decide whether it suits your requirements.

Embed documents
Embed a document using the tokenizer

KeyBERT Key Features

No Key Features are available at this moment for KeyBERT.

KeyBERT Examples and Code Snippets

Publications

Python

Lines of Code : 67

License : Permissive (Apache-2.0)

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Nat

Community Discussions

Trending Discussions on KeyBERT

KeyBERT package is not working on Google Colab

QUESTION

Asked 2021-Jun-24 at 03:46

I'm using KeyBERT on Google Colab to extract keywords from the text.

...

ANSWER

Answered 2021-Jun-24 at 03:46

I couldn't reproduce this issue with the code you've provided, but from the error message I believe you're just missing an 's' in the model name. Make sure that the model name is as follows:

distilbert-base-nli-mean-tokens

and not

distilbert-base-nli-mean-token

Also refer to this link for all models available for use.
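As a quick guard against this kind of typo, a small check with the standard library's `difflib` can suggest the closest known name before the model is loaded. This is only a sketch: `KNOWN_MODELS` is an illustrative, non-exhaustive list and `suggest_model_name` is a hypothetical helper, not part of KeyBERT or sentence-transformers.

```python
import difflib

# Illustrative, non-exhaustive list of sentence-transformers model names (assumption).
KNOWN_MODELS = [
    "distilbert-base-nli-mean-tokens",
    "all-MiniLM-L6-v2",
    "all-mpnet-base-v2",
    "paraphrase-multilingual-MiniLM-L12-v2",
]


def suggest_model_name(name, known=KNOWN_MODELS):
    """Return `name` if it is known, else the closest known name, else None."""
    if name in known:
        return name
    matches = difflib.get_close_matches(name, known, n=1, cutoff=0.8)
    return matches[0] if matches else None


if __name__ == "__main__":
    # The misspelled name from the question resolves to the correct one.
    print(suggest_model_name("distilbert-base-nli-mean-token"))
```

A return value of `None` only means the name is not close to anything in this short list; it may still be a valid model.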

Source https://stackoverflow.com/questions/68107887

Community Discussions, Code Snippets contain sources that include Stack Exchange Network

Vulnerabilities

No vulnerabilities reported

Install KeyBERT

Installation can be done using PyPI: pip install keybert

Support

For new features, suggestions, and bugs, create an issue on GitHub. If you have any questions, check existing answers and ask on the Stack Overflow community page.
