DPR by facebookresearch
Dense Passage Retriever: a set of tools and models for open-domain Q&A tasks.
Python 1270 Updated: 2 y ago License: Proprietary (Proprietary)

spacy-transformers by explosion
🛸 Use pretrained transformers like BERT, XLNet and GPT-2 in spaCy
Python 1243 Updated: 2 y ago License: Permissive (MIT)

detext by linkedin
DeText: A Deep Neural Text Understanding Framework for Ranking and Classification Tasks
Python 1239 Updated: 2 y ago License: Permissive (BSD-2-Clause)

EDA_NLP_for_Chinese by zhanlaoban
An implementation of the EDA paper for Chinese corpora. An EDA data augmentation tool for Chinese corpora. NLP data augmentation. Paper reading notes.
Python 1227 Updated: 2 y ago License: No License (No License)

PreSumm by nlpyang
Code for the EMNLP 2019 paper "Text Summarization with Pretrained Encoders"
Python 1226 Updated: 2 y ago License: Permissive (MIT)

bert-extractive-summarizer by dmmiller612
Easy to use extractive text summarization with BERT
Python 1206 Updated: 2 y ago License: Permissive (MIT)

BERT-NER by kyzhouhzau
Use Google's BERT for named entity recognition (CoNLL-2003 as the dataset).
Python 1176 Updated: 2 y ago License: Permissive (MIT)

hmtl by huggingface
🌊 HMTL: Hierarchical Multi-Task Learning - A State-of-the-Art neural network model for several NLP tasks based on PyTorch and AllenNLP
Python 1169 Updated: 2 y ago License: Permissive (MIT)

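bert_seq2seq by 920232796
PyTorch implementation of BERT for seq2seq tasks using the UniLM scheme; it now also handles automatic summarization, text classification, sentiment analysis, NER, part-of-speech tagging, and other tasks, supports T5 models, and supports GPT-2 for article continuation.
Python 1158 Updated: 2 y ago License: Permissive (Apache-2.0)
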
tagger by glample
Named Entity Recognition Tool
Python 1144 Updated: 2 y ago License: Permissive (Apache-2.0)

Transformer by SamLynnEvans
Transformer seq2seq model; a program that can build a language translator from a parallel corpus
Python 1123 Updated: 2 y ago License: Permissive (Apache-2.0)

KoBERT by SKTBrain
Korean BERT pre-trained cased (KoBERT)
Jupyter Notebook 1115 Updated: 2 y ago License: Permissive (Apache-2.0)

BERT-NER by kamalkraj
Pytorch-Named-Entity-Recognition-with-BERT
Python 1106 Updated: 2 y ago License: Strong Copyleft (AGPL-3.0)

shift-ctrl-f by model-zoo
🔎 Search the information available on a webpage using natural language instead of an exact string match.
JavaScript 1103 Updated: 2 y ago License: Permissive (Apache-2.0)

Entity-Relation-Extraction by yuanxiaosc
Entity and Relation Extraction Based on TensorFlow and BERT. A pipeline-style entity and relation extraction solution for the information extraction task of the 2019 Language and Intelligence Technology Competition. Schema based Knowledge Extraction, SKE 2019
Python 1083 Updated: 2 y ago License: No License (No License)

nlp-in-practice by kavgan
Starter code to solve real world text data problems. Includes: Gensim Word2Vec, phrase embeddings, Text Classification with Logistic Regression, word count with pyspark, simple text preprocessing, pre-trained embeddings and more.
Jupyter Notebook 1059 Updated: 2 y ago License: No License (No License)

PPLM by uber-research
Plug and Play Language Model implementation. Allows steering the topic and attributes of GPT-2 models.
Python 1058 Updated: 2 y ago License: Permissive (Apache-2.0)

PyTorchText by chenyuntc
1st place solution for the Zhihu Machine Learning Challenge. Implementation of various text-classification models. (First-place solution for the Zhihu Kanshan Cup.)
Python 1055 Updated: 2 y ago License: Permissive (MIT)

T2T-ViT by yitu-opensource
ICCV2021, Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet
Jupyter Notebook 1055 Updated: 2 y ago License: Proprietary (Proprietary)

contextualized-topic-models by MilaNLProc
A Python package to run contextualized topic modeling. CTMs combine contextualized embeddings (e.g., BERT) with topic models to get coherent topics. Published at EACL and ACL 2021.
Python 1053 Updated: 2 y ago License: Permissive (MIT)

sentiment-discovery by NVIDIA
Unsupervised Language Modeling at scale for robust sentiment classification
Python 1047 Updated: 2 y ago License: Proprietary (Proprietary)

Entity Linker solution

nlp by makcedward
📝 This repository records my NLP journey.
Python 1043 Updated: 2 y ago License: No License (No License)

learn-nlp-with-transformers by datawhalechina
We want to create a repo to illustrate the usage of transformers in Chinese.
Shell 1017 Updated: 2 y ago License: No License (No License)

CPM-Generate by TsinghuaAI
Chinese Pre-Trained Language Models (CPM-LM) Version-I
Python 997 Updated: 4 y ago License: Permissive (MIT)

this-word-does-not-exist by turtlesoupy
This Word Does Not Exist
Python 992 Updated: 2 y ago License: Permissive (MIT)

GPT2-NewsTitle by liucongg
Chinese news title generation project using GPT-2, with extremely detailed code comments.
Python 963 Updated: 2 y ago License: Permissive (Apache-2.0)

bert_language_understanding by brightmart
Pre-training of Deep Bidirectional Transformers for Language Understanding: pre-train TextCNN
Python 945 Updated: 4 y ago License: No License (No License)

insuranceqa-corpus-zh by chatopera
🚁 Insurance industry corpus, chatbot
Python 928 Updated: 2 y ago License: Strong Copyleft (GPL-3.0)

rasa_chatbot_cn by GaoQ1
Building a Chinese dialogue system based on the newest version of Rasa
Python 896 Updated: 2 y ago License: No License (No License)

bertsearch by Hironsan
Elasticsearch with BERT for advanced document search.
Python 865 Updated: 2 y ago License: Permissive (MIT)

K-BERT by autoliuweijie
Source code of K-BERT (AAAI2020)
Python 863 Updated: 2 y ago License: No License (No License)

Transformers4Rec by NVIDIA-Merlin
Transformers4Rec is a flexible and efficient library for sequential and session-based recommendation and works with PyTorch.
Python 856 Updated: 2 y ago License: Permissive (Apache-2.0)

former by pbloem
Simple transformer implementation from scratch in PyTorch.
Python 855 Updated: 2 y ago License: Permissive (MIT)

transformer-time-series-prediction by oliverguhr
Proof of concept for a transformer-based time series prediction model
Python 854 Updated: 2 y ago License: Permissive (MIT)

wikipedia2vec by wikipedia2vec
A tool for learning vector representations of words and entities from Wikipedia
Python 850 Updated: 2 y ago License: Proprietary (Proprietary)

gpt-2-Pytorch by graykode
Simple Text-Generator with OpenAI gpt-2 Pytorch Implementation
Python 845 Updated: 2 y ago License: Permissive (MIT)

TextMatch by MachineLP
QAmatch (qa_match) / text matching / text classification / text embedding / text clustering / text retrieval (bow/ifidf/ngramtf-df/bert/albert/bm25/…/nn/gbdt/xgb/kmeans/dscan/faiss/….)
Python 824 Updated: 2 y ago License: No License (No License)

sempre by percyliang
Semantic Parser with Execution
Java 817 Updated: 2 y ago License: Proprietary (Proprietary)

ccks_baidu_entity_link by panchunguang
CCKS Baidu entity linking, first-place solution
Python 808 Updated: 2 y ago License: No License (No License)

DeepNER by z814081807
Champion solution for the Tianchi Chinese medicine instruction leaflet entity recognition challenge; Chinese named entity recognition; NER; BERT-CRF & BERT-SPAN & BERT-MRC; PyTorch
Python 807 Updated: 2 y ago License: No License (No License)

Chinese-NLP-Corpus by OYE93
Collections of Chinese NLP corpus
Python 805 Updated: 2 y ago License: No License (No License)

Chatito by rodrigopivi
🎯🗯 Generate datasets for AI chatbots, NLP tasks, named entity recognition or text classification models using a simple DSL!
TypeScript 805 Updated: 2 y ago License: Permissive (MIT)

NLP-Tutorials by MorvanZhou
Simple implementations of NLP models. Tutorials are written in Chinese on my website https://mofanpy.com
Python 789 Updated: 2 y ago License: Permissive (MIT)

gector by grammarly
Official implementation of the papers "GECToR – Grammatical Error Correction: Tag, Not Rewrite" (BEA-20) and "Text Simplification by Tagging" (BEA-21)
Python 787 Updated: 2 y ago License: Permissive (Apache-2.0)

Bert-Multi-Label-Text-Classification by lonePatient
This repo contains a PyTorch implementation of a pretrained BERT model for multi-label text classification.
Python 769 Updated: 2 y ago License: Permissive (MIT)

Parrot_Paraphraser by PrithivirajDamodaran
A practical and feature-rich paraphrasing framework to augment human intents in text form to build robust NLU models for conversational engines. Created by Prithiviraj Damodaran. Open to pull requests and other forms of collaboration.
Python 765 Updated: 2 y ago License: Permissive (Apache-2.0)

simbert by ZhuiyiTechnology
A BERT for retrieval and generation
Python 746 Updated: 2 y ago License: Permissive (Apache-2.0)

CLUEPretrainedModels by CLUEbenchmark
A collection of high-quality Chinese pre-trained models: state-of-the-art large models, fastest small models, and specialized similarity models
Python 745 Updated: 2 y ago License: No License (No License)

multiwoz by budzianowski
Source code for end-to-end dialogue model from the MultiWOZ paper (Budzianowski et al. 2018, EMNLP)
Python 730 Updated: 2 y ago License: Permissive (MIT)