DPR by facebookresearch
Dense Passage Retriever (DPR) is a set of tools and models for open-domain Q&A tasks.
Python
1270
Updated: 2 y ago
License: Proprietary (Proprietary)

spacy-transformers by explosion
🛸 Use pretrained transformers like BERT, XLNet and GPT-2 in spaCy
Python
1243
Updated: 2 y ago
License: Permissive (MIT)

detext by linkedin
DeText: A Deep Neural Text Understanding Framework for Ranking and Classification Tasks
Python
1239
Updated: 2 y ago
License: Permissive (BSD-2-Clause)

EDA_NLP_for_Chinese by zhanlaoban
An implementation of the EDA paper for Chinese corpora. An EDA data augmentation tool for Chinese corpora. NLP data augmentation. Paper reading notes.
Python
1227
Updated: 2 y ago
License: No License (No License)

PreSumm by nlpyang
Code for the EMNLP 2019 paper "Text Summarization with Pretrained Encoders"
Python
1226
Updated: 2 y ago
License: Permissive (MIT)

bert-extractive-summarizer by dmmiller612
Easy-to-use extractive text summarization with BERT
Python
1206
Updated: 2 y ago
License: Permissive (MIT)

BERT-NER by kyzhouhzau
Use Google's BERT for named entity recognition (CoNLL-2003 as the dataset).
Python
1176
Updated: 2 y ago
License: Permissive (MIT)

hmtl by huggingface
🌊HMTL: Hierarchical Multi-Task Learning - A State-of-the-Art neural network model for several NLP tasks based on PyTorch and AllenNLP
Python
1169
Updated: 2 y ago
License: Permissive (MIT)
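
bert_seq2seq by 920232796
A PyTorch implementation of BERT for seq2seq tasks using the UniLM approach. It now also supports automatic summarization, text classification, sentiment analysis, NER, and part-of-speech tagging, as well as the T5 model and GPT-2 for article continuation.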
Python
1158
Updated: 2 y ago
License: Permissive (Apache-2.0)

tagger by glample
Named Entity Recognition Tool
Python
1144
Updated: 2 y ago
License: Permissive (Apache-2.0)

Transformer by SamLynnEvans
Transformer seq2seq model: a program that can build a language translator from a parallel corpus
Python
1123
Updated: 2 y ago
License: Permissive (Apache-2.0)

KoBERT by SKTBrain
Korean BERT pre-trained cased (KoBERT)
Jupyter Notebook
1115
Updated: 2 y ago
License: Permissive (Apache-2.0)

BERT-NER by kamalkraj
Pytorch-Named-Entity-Recognition-with-BERT
Python
1106
Updated: 2 y ago
License: Strong Copyleft (AGPL-3.0)

shift-ctrl-f by model-zoo
🔎 Search the information available on a webpage using natural language instead of an exact string match.
JavaScript
1103
Updated: 2 y ago
License: Permissive (Apache-2.0)

Entity-Relation-Extraction by yuanxiaosc
Entity and Relation Extraction Based on TensorFlow and BERT. A pipeline-style entity and relation extraction approach based on TensorFlow and BERT; a solution to the information extraction task of the 2019 Language and Intelligence Challenge. Schema based Knowledge Extraction, SKE 2019
Python
1083
Updated: 2 y ago
License: No License (No License)

nlp-in-practice by kavgan
Starter code to solve real world text data problems. Includes: Gensim Word2Vec, phrase embeddings, Text Classification with Logistic Regression, word count with pyspark, simple text preprocessing, pre-trained embeddings and more.
Jupyter Notebook
1059
Updated: 2 y ago
License: No License (No License)

PPLM by uber-research
Plug and Play Language Model implementation. Allows steering the topic and attributes of GPT-2 models.
Python
1058
Updated: 2 y ago
License: Permissive (Apache-2.0)

PyTorchText by chenyuntc
1st Place Solution for Zhihu Machine Learning Challenge. Implementation of various text-classification models. (First-place solution for the Zhihu Kanshan Cup.)
Python
1055
Updated: 2 y ago
License: Permissive (MIT)

T2T-ViT by yitu-opensource
ICCV2021, Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet
Jupyter Notebook
1055
Updated: 2 y ago
License: Proprietary (Proprietary)

contextualized-topic-models by MilaNLProc
A python package to run contextualized topic modeling. CTMs combine contextualized embeddings (e.g., BERT) with topic models to get coherent topics. Published at EACL and ACL 2021.
Python
1053
Updated: 2 y ago
License: Permissive (MIT)

sentiment-discovery by NVIDIA
Unsupervised Language Modeling at scale for robust sentiment classification
Python
1047
Updated: 2 y ago
License: Proprietary (Proprietary)

Entity Linker solution

nlp by makcedward
📝 This repository records my NLP journey.
Python
1043
Updated: 2 y ago
License: No License (No License)

learn-nlp-with-transformers by datawhalechina
We want to create a repo to illustrate the usage of transformers in Chinese.
Shell
1017
Updated: 2 y ago
License: No License (No License)

CPM-Generate by TsinghuaAI
Chinese Pre-Trained Language Models (CPM-LM) Version-I
Python
997
Updated: 4 y ago
License: Permissive (MIT)

this-word-does-not-exist by turtlesoupy
This Word Does Not Exist
Python
992
Updated: 2 y ago
License: Permissive (MIT)

GPT2-NewsTitle by liucongg
Chinese NewsTitle Generation Project by GPT2. A Chinese GPT-2 news title generation project with extremely detailed comments.
Python
963
Updated: 2 y ago
License: Permissive (Apache-2.0)

bert_language_understanding by brightmart
Pre-training of Deep Bidirectional Transformers for Language Understanding: pre-train TextCNN
Python
945
Updated: 4 y ago
License: No License (No License)

insuranceqa-corpus-zh by chatopera
🚁 Insurance industry corpus, chatbot
Python
928
Updated: 2 y ago
License: Strong Copyleft (GPL-3.0)

rasa_chatbot_cn by GaoQ1
Building a Chinese dialogue system based on the newest version of Rasa.
Python
896
Updated: 2 y ago
License: No License (No License)

bertsearch by Hironsan
Elasticsearch with BERT for advanced document search.
Python
865
Updated: 2 y ago
License: Permissive (MIT)

K-BERT by autoliuweijie
Source code of K-BERT (AAAI2020)
Python
863
Updated: 2 y ago
License: No License (No License)

Transformers4Rec by NVIDIA-Merlin
Transformers4Rec is a flexible and efficient library for sequential and session-based recommendation and works with PyTorch.
Python
856
Updated: 2 y ago
License: Permissive (Apache-2.0)

former by pbloem
Simple transformer implementation from scratch in pytorch.
Python
855
Updated: 2 y ago
License: Permissive (MIT)

transformer-time-series-prediction by oliverguhr
proof of concept for a transformer-based time series prediction model
Python
854
Updated: 2 y ago
License: Permissive (MIT)

wikipedia2vec by wikipedia2vec
A tool for learning vector representations of words and entities from Wikipedia
Python
850
Updated: 2 y ago
License: Proprietary (Proprietary)

gpt-2-Pytorch by graykode
Simple Text-Generator with OpenAI gpt-2 Pytorch Implementation
Python
845
Updated: 2 y ago
License: Permissive (MIT)

TextMatch by MachineLP
QAmatch (qa_match) / text matching / text classification / text embedding / text clustering / text retrieval (bow/ifidf/ngramtf-df/bert/albert/bm25/…/nn/gbdt/xgb/kmeans/dscan/faiss/….)
Python
824
Updated: 2 y ago
License: No License (No License)

sempre by percyliang
Semantic Parser with Execution
Java
817
Updated: 2 y ago
License: Proprietary (Proprietary)

ccks_baidu_entity_link by panchunguang
CCKS Baidu entity linking, first place
Python
808
Updated: 2 y ago
License: No License (No License)

DeepNER by z814081807
Champion solution for the Tianchi Traditional Chinese Medicine instruction-leaflet entity recognition challenge; Chinese named entity recognition; NER; BERT-CRF & BERT-SPAN & BERT-MRC; PyTorch
Python
807
Updated: 2 y ago
License: No License (No License)

Chinese-NLP-Corpus by OYE93
Collections of Chinese NLP corpus
Python
805
Updated: 2 y ago
License: No License (No License)

Chatito by rodrigopivi
🎯🗯 Generate datasets for AI chatbots, NLP tasks, named entity recognition or text classification models using a simple DSL!
TypeScript
805
Updated: 2 y ago
License: Permissive (MIT)

NLP-Tutorials by MorvanZhou
Simple implementations of NLP models. Tutorials are written in Chinese on my website https://mofanpy.com
Python
789
Updated: 2 y ago
License: Permissive (MIT)

gector by grammarly
Official implementation of the papers "GECToR – Grammatical Error Correction: Tag, Not Rewrite" (BEA-20) and "Text Simplification by Tagging" (BEA-21)
Python
787
Updated: 2 y ago
License: Permissive (Apache-2.0)

Bert-Multi-Label-Text-Classification by lonePatient
This repo contains a PyTorch implementation of a pretrained BERT model for multi-label text classification.
Python
769
Updated: 2 y ago
License: Permissive (MIT)

Parrot_Paraphraser by PrithivirajDamodaran
A practical and feature-rich paraphrasing framework to augment human intents in text form to build robust NLU models for conversational engines. Created by Prithiviraj Damodaran. Open to pull requests and other forms of collaboration.
Python
765
Updated: 2 y ago
License: Permissive (Apache-2.0)

simbert by ZhuiyiTechnology
A BERT for retrieval and generation
Python
746
Updated: 2 y ago
License: Permissive (Apache-2.0)

CLUEPretrainedModels by CLUEbenchmark
A collection of high-quality Chinese pre-trained models: state-of-the-art large models, fastest small models, and specialized similarity models
Python
745
Updated: 2 y ago
License: No License (No License)

multiwoz by budzianowski
Source code for end-to-end dialogue model from the MultiWOZ paper (Budzianowski et al. 2018, EMNLP)
Python
730
Updated: 2 y ago
License: Permissive (MIT)