15 best Python Topic Modelling libraries in 2023
by reegs20 Updated: Dec 20, 2022
Find patterns or themes in large document sets, create links, pinpoint important subjects, and implement popular algorithms such as LSA/LSI/SVD for your artificial intelligence, topic modeling, BERT, neural network, transformer, and NLP applications. Topic modeling is a method for discovering hidden (latent) subjects in large amounts of text. Topic models help organize and make sense of extensive collections of unstructured text. Although they were first created as a text-mining technique, topic models have since been used to find instructive structure in other kinds of data, including genetic information, images, and networks. Topic modeling is an unsupervised machine learning technique. One of its most widely used algorithms is Latent Dirichlet Allocation (LDA), which is implemented in Python's Gensim library, among others.
Topic modeling is applied to several tasks, including document segmentation, classification, and summarization. Some of its more novel applications are in social networks, population genetics, and computer vision. In information retrieval, topic modeling aids query expansion. It can also personalize search results or drive recommendations by associating user preferences with topics.
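As a rough illustration of associating user preferences with topics, the sketch below ranks documents by how closely their topic mixture matches a user's preference vector. The topic mixtures, document names, and the `recommend` helper are all hypothetical; in practice the mixtures would come from a fitted topic model.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def recommend(user_profile, doc_topics, top_n=2):
    """Rank documents by similarity between their topic mix and the user's."""
    scored = [(cosine(user_profile, mix), doc_id) for doc_id, mix in doc_topics.items()]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:top_n]]

# Hypothetical topic mixtures over three topics: (sports, politics, technology)
doc_topics = {
    "match_report": [0.90, 0.05, 0.05],
    "election_news": [0.05, 0.90, 0.05],
    "gadget_review": [0.05, 0.05, 0.90],
    "esports_feature": [0.50, 0.00, 0.50],
}
user_profile = [0.10, 0.00, 0.90]  # a reader who mostly likes technology

print(recommend(user_profile, doc_topics))
```

Cosine similarity is only one reasonable choice here; a divergence between probability distributions would work as well.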
Key features of Python topic modeling libraries include intuitive interfaces, easy plugging-in of your own input corpus or data stream, distributed computing, state-of-the-art multilingual word embeddings, and large-scale, high-quality bilingual dictionaries for training and evaluation.
Check out the list below to find the best Python topic modeling libraries for your application:
MUSE by facebookresearch
A library for Multilingual Unsupervised or Supervised word Embeddings
Python
3082
Version: Current
License: Others (Non-SPDX)
texthero by jbesomi
Text preprocessing, representation and visualization from zero to hero.
Python
2741
Version: 1.1.0
License: Permissive (MIT)
BERTopic by MaartenGr
Leveraging BERT and c-TF-IDF to create easily interpretable topics.
Python
4305
Version: v0.15.0
License: Permissive (MIT)
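To give a feel for class-based TF-IDF (c-TF-IDF), here is a minimal standard-library sketch. It assumes the commonly documented form of the weight, tf(word, class) * log(1 + A / f(word)), where A is the average number of words per class and f(word) is the word's frequency across all classes. This is not BERTopic's own code, and the `c_tf_idf` and `top_words` helpers and the toy clusters are hypothetical.

```python
import math
from collections import Counter

def c_tf_idf(class_docs):
    """Score words per class as tf(word, class) * log(1 + A / f(word)).

    class_docs maps a class (cluster) label to the list of tokens obtained
    by concatenating all documents in that class.
    """
    class_counts = {label: Counter(tokens) for label, tokens in class_docs.items()}
    total_counts = Counter()
    for counts in class_counts.values():
        total_counts.update(counts)
    # A: average number of words per class
    avg_words = sum(total_counts.values()) / len(class_docs)

    scores = {}
    for label, counts in class_counts.items():
        class_size = sum(counts.values())
        scores[label] = {
            # term frequency is normalized by the class length (an assumption)
            word: (count / class_size) * math.log(1 + avg_words / total_counts[word])
            for word, count in counts.items()
        }
    return scores

def top_words(scores, label, n=2):
    """Highest-scoring words for a class; ties are broken alphabetically."""
    ranked = sorted(scores[label].items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:n]]

# Toy clusters: each value stands for all documents of a cluster joined together
clusters = {
    "space": "the rocket launch the orbit rocket".split(),
    "food": "the pasta sauce the pasta recipe".split(),
}
scores = c_tf_idf(clusters)
print(top_words(scores, "space"))
print(top_words(scores, "food"))
```

Words that occur in every cluster, such as "the", are pushed down by the logarithmic term, which is what makes the top words per cluster readable as a topic description.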
awesome-sentence-embedding by Separius
A curated list of pretrained sentence and word embedding models
Python
2099
Version: Current
License: Strong Copyleft (GPL-3.0)
scattertext by JasonKessler
Beautiful visualizations of how language differs among document types.
Python
2072
Version: 0.0.2.4.4
License: Permissive (Apache-2.0)
word2vec-api by 3Top
Simple web service providing a word embedding model
Python
1388
Version: Current
License: No License
deep-siamese-text-similarity by dhwajraj
TensorFlow-based implementation of a deep siamese LSTM network to capture phrase/sentence similarity using character/word embeddings
Python
1390
Version: Current
License: Permissive (MIT)
nlp-journey by msgi
Documents, papers, and code related to Natural Language Processing, including Topic Model, Word Embedding, Named Entity Recognition, Text Classification, Text Generation, Text Similarity, Machine Translation, etc. All code is implemented in TensorFlow 2.0.
Python
1528
Version: v1.0
License: Permissive (Apache-2.0)
lda by lda-project
Topic modeling with latent Dirichlet allocation using Gibbs sampling
Python
1122
Version: 0.3.2
License: Weak Copyleft (MPL-2.0)
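For readers curious what LDA with Gibbs sampling involves, the following is a minimal collapsed Gibbs sampler written with the standard library only. It is a teaching sketch, not the lda package's implementation; the `gibbs_lda` function, its default hyperparameters (`alpha`, `beta`), and the toy corpus are assumptions made for illustration.

```python
import random

def gibbs_lda(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA on a list of tokenized documents."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)

    doc_topic = [[0] * n_topics for _ in docs]             # tokens in doc d with topic k
    topic_word = [[0] * n_words for _ in range(n_topics)]  # times word w has topic k
    topic_total = [0] * n_topics                           # tokens assigned to topic k
    assignments = []

    # Start from a random topic for every token
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts
                doc_topic[d][k] -= 1
                topic_word[k][wid] -= 1
                topic_total[k] -= 1
                # Conditional weight of each topic given all other assignments
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][wid] + beta)
                    / (topic_total[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][wid] += 1
                topic_total[k] += 1

    return doc_topic, topic_word, vocab

# Toy corpus (hypothetical)
docs = [
    "goal match team goal player".split(),
    "team player match coach".split(),
    "vote election party vote".split(),
    "party election minister vote".split(),
]
doc_topic, topic_word, vocab = gibbs_lda(docs, n_topics=2)
for k, counts in enumerate(topic_word):
    top = sorted(zip(counts, vocab), reverse=True)[:3]
    print(k, [w for _, w in top])
```

The returned count tables can be normalized (after adding `alpha` and `beta`) to estimate document-topic and topic-word distributions.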
contextualized-topic-models by MilaNLProc
A Python package to run contextualized topic modeling. CTMs combine contextualized embeddings (e.g., BERT) with topic models to get coherent topics. Published at EACL and ACL 2021.
Python
1053
Version: Current
License: Permissive (MIT)
GuidedLDA by vi3k6i5
Semi-supervised guided topic model with custom GuidedLDA
Python
404
Version: Current
License: Weak Copyleft (MPL-2.0)
dynamic-nmf by derekgreene
Dynamic Topic Modeling via Non-negative Matrix Factorization
Python
239
Version: Current
License: Permissive (Apache-2.0)
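The sketch below shows plain non-negative matrix factorization with multiplicative updates, the building block behind NMF-based topic models. It does not reproduce the dynamic, time-windowed method of the dynamic-nmf repository; the `nmf` function, the helper names, and the toy document-term matrix are assumptions for illustration.

```python
import random

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def frobenius_error(V, W, H):
    """Frobenius norm of V - W H."""
    WH = matmul(W, H)
    return sum((v - wh) ** 2 for rv, rwh in zip(V, WH) for v, wh in zip(rv, rwh)) ** 0.5

def nmf(V, k, n_iter=200, seed=0, eps=1e-9):
    """Factor non-negative V (docs x terms) into W (docs x k) and H (k x terms)."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + eps for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + eps for _ in range(m)] for _ in range(k)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[h * a / (b + eps) for h, a, b in zip(rh, ra, rb)]
             for rh, ra, rb in zip(H, num, den)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[w * a / (b + eps) for w, a, b in zip(rw, ra, rb)]
             for rw, ra, rb in zip(W, num, den)]
    return W, H

# Toy document-term count matrix (4 documents x 4 terms, hypothetical)
V = [
    [3, 2, 0, 0],
    [2, 3, 0, 0],
    [0, 0, 4, 1],
    [0, 0, 1, 4],
]
W, H = nmf(V, k=2)
print(round(frobenius_error(V, W, H), 3))
```

Each row of `H` can be read as a topic (weights over terms) and each row of `W` as a document's loading on those topics.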
topics by vladsandulescu
Topic modeling with gensim and LDA
Python
158
Version: Current
License: Permissive (Apache-2.0)