KeyBERT by MaartenGr
Minimal keyword extraction with BERT
Python
2419
Updated: 2 y ago
License: Permissive (MIT)
keras-bert by CyberZHG
Implementation of BERT that could load official pre-trained models for feature extraction and prediction
Python
2411
Updated: 2 y ago
License: Permissive (MIT)
roberta_zh by brightmart
RoBERTa for Chinese: pre-trained Chinese RoBERTa models
Python
2366
Updated: 2 y ago
License: No License (No License)
texar by asyml
Toolkit for Machine Learning, Natural Language Processing, and Text Generation, in TensorFlow. This is part of the CASL project: http://casl-project.ai/
Python
2360
Updated: 2 y ago
License: Permissive (Apache-2.0)
zh-NER-TF by Determined22
A very simple BiLSTM-CRF model for Chinese Named Entity Recognition (TensorFlow)
Python
2214
Updated: 2 y ago
License: No License (No License)
electra by google-research
ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators
Python
2179
Updated: 2 y ago
License: Permissive (Apache-2.0)
Kashgari by BrikerMan
Kashgari is a production-level NLP transfer learning framework built on top of tf.keras for text labeling and text classification; it includes Word2Vec, BERT, and GPT2 language embeddings.
Python
2137
Updated: 4 y ago
License: Permissive (Apache-2.0)
Information-Extraction-Chinese by crownpku
Chinese Named Entity Recognition with IDCNN/biLSTM+CRF, and Relation Extraction with biGRU+2ATT
Python
2103
Updated: 2 y ago
License: No License (No License)
awesome-sentence-embedding by Separius
A curated list of pretrained sentence and word embedding models
Python
2099
Updated: 2 y ago
License: Strong Copyleft (GPL-3.0)
esm by facebookresearch
Evolutionary Scale Modeling (esm): Pretrained language models for proteins
Python
2078
Updated: 2 y ago
License: Permissive (MIT)
text2vec by shibing624
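text2vec, text to vector. A text vector representation tool that converts text into vector matrices; it implements text representation and text similarity models such as Word2Vec, RankBM25, Sentence-BERT, and CoSENT, ready to use out of the box.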
Python
2066
Updated: 2 y ago
License: Permissive (Apache-2.0)
HarvestText by blmoistawinde
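Text mining and preprocessing toolkit (text cleaning, new word discovery, sentiment analysis, entity recognition and linking, keyword extraction, knowledge extraction, syntactic parsing, etc.) using unsupervised or weakly supervised methods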
Python
1954
Updated: 2 y ago
License: Permissive (MIT)
rust-bert by guillaume-be
Rust native ready-to-use NLP pipelines and transformer-based models (BERT, DistilBERT, GPT2,...)
Rust
1876
Updated: 2 y ago
License: Permissive (Apache-2.0)
fast-bert by utterworks
Super easy library for BERT based NLP models
Python
1793
Updated: 2 y ago
License: Permissive (Apache-2.0)
ABSA-PyTorch by songyouwei
Aspect Based Sentiment Analysis, PyTorch Implementations.
Python
1782
Updated: 2 y ago
License: Permissive (MIT)
longformer by allenai
Longformer: The Long-Document Transformer
Python
1778
Updated: 2 y ago
License: Permissive (Apache-2.0)
BERT-NER-Pytorch by lonePatient
Chinese NER (Named Entity Recognition) using BERT (Softmax, CRF, Span)
Python
1770
Updated: 2 y ago
License: Permissive (MIT)
kaggle-CrowdFlower by ChenglongChen
1st Place Solution for CrowdFlower Product Search Results Relevance Competition on Kaggle.
C++
1742
Updated: 2 y ago
License: No License (No License)
Senta by baidu
Baidu's open-source Sentiment Analysis System.
Python
1719
Updated: 2 y ago
License: Permissive (Apache-2.0)
ChineseNLP by didi
Datasets and SOTA results for every field of Chinese NLP
HTML
1710
Updated: 2 y ago
License: No License (No License)
gpt2-ml by imcaspar
GPT2 for Multiple Languages, including pretrained models. GPT2 multilingual support, with a 1.5-billion-parameter Chinese pretrained model
Python
1682
Updated: 2 y ago
License: Permissive (Apache-2.0)
NLP-Interview-Notes by km1994
This repository mainly collects interview questions for NLP algorithm engineers
Jupyter Notebook
1678
Updated: 2 y ago
License: No License (No License)
ChineseGLUE by ChineseGLUE
Language Understanding Evaluation benchmark for Chinese: datasets, baselines, pre-trained models, corpus, and leaderboard
Python
1671
Updated: 2 y ago
License: No License (No License)
biobert by dmis-lab
Bioinformatics'2020: BioBERT: a pre-trained biomedical language representation model for biomedical text mining
Python
1651
Updated: 2 y ago
License: Proprietary (Proprietary)
FARM by deepset-ai
Fast & easy transfer learning for NLP. Harvesting language models for the industry. Focus on Question Answering.
Python
1646
Updated: 2 y ago
License: Permissive (Apache-2.0)
Chinese-XLNet by ymcui
Pre-Trained Chinese XLNet
Python
1567
Updated: 2 y ago
License: Permissive (Apache-2.0)
magnitude by plasticityai
A fast, efficient universal vector embedding utility package.
Python
1564
Updated: 2 y ago
License: Permissive (MIT)
lm-evaluation-harness by EleutherAI
A framework for few-shot evaluation of autoregressive language models.
Python
1550
Updated: 2 y ago
License: Permissive (MIT)
DeBERTa by microsoft
The implementation of DeBERTa
Python
1542
Updated: 2 y ago
License: Permissive (MIT)
Keras-TextClassification by yongzhuo
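Chinese long-text classification, short-sentence classification, multi-label classification, and sentence-pair similarity (Chinese Text Classification of Keras NLP, multi-label classify, or sentence classify, long or short); base classes for building character, word, and sentence embedding layers (embeddings) and network layers (graph); FastText, TextCNN, CharCNN, TextRNN, RCNN, DCNN, DPCNN, VDCNN, CRNN, Bert, Xlnet, Albert, Attention, DeepMoji, HAN, CapsuleNet, Transformer-encode, Seq2seq, SWEM, LEAM, TextGCN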
Python
1530
Updated: 2 y ago
License: Permissive (MIT)
nlp-journey by msgi
Documents, papers, and code related to Natural Language Processing, including Topic Model, Word Embedding, Named Entity Recognition, Text Classification, Text Generation, Text Similarity, Machine Translation, etc. All code is implemented in TensorFlow 2.0.
Python
1528
Updated: 2 y ago
License: Permissive (Apache-2.0)
jiant
jiant is an NLP toolkit
spago by nlpodyssey
Self-contained Machine Learning and Natural Language Processing library in Go
Go
1510
Updated: 2 y ago
License: Permissive (BSD-2-Clause)
nlpcda by 425776024
One-click Chinese data augmentation package; NLP data augmentation, BERT data augmentation, EDA: pip install nlpcda
Python
1450
Updated: 2 y ago
License: Permissive (Apache-2.0)
swift-coreml-transformers by huggingface
Swift Core ML 3 implementations of GPT-2, DistilGPT-2, BERT, and DistilBERT for Question answering. Other Transformers coming soon!
Swift
1438
Updated: 2 y ago
License: Permissive (Apache-2.0)
NeuronBlocks by microsoft
NLP DNN Toolkit - Building Your NLP DNN Models Like Playing Lego
Python
1433
Updated: 2 y ago
License: Permissive (MIT)
nlp_xiaojiang by yongzhuo
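Natural language processing (nlp): Xiaojiang robot (retrieval-based chit-chat chatbot), BERT sentence vectors and similarity (Sentence Similarity), XLNET sentence vectors and similarity (text xlnet embedding), text classification, entity extraction (ner, bert+bilstm+crf), data augmentation (text augment, data enhance), synonymous sentence and synonym generation, sentence main-part extraction (mainpart), Chinese short-text similarity, text feature engineering, and keras-http-service invocation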
Python
1432
Updated: 2 y ago
License: Permissive (MIT)
pke by boudinfl
Python Keyphrase Extraction module
Python
1408
Updated: 2 y ago
License: Strong Copyleft (GPL-3.0)
eda_nlp by jasonwei20
Data augmentation for NLP, presented at EMNLP 2019
Python
1403
Updated: 2 y ago
License: No License (No License)
Transformer-Explainability by hila-chefer
[CVPR 2021] Official PyTorch implementation for Transformer Interpretability Beyond Attention Visualization, a novel method to visualize classifications by Transformer based networks.
Jupyter Notebook
1376
Updated: 2 y ago
License: Permissive (MIT)
fast-bert by kaushaltrivedi
Super easy library for BERT based NLP models
Python
1372
Updated: 5 y ago
License: Permissive (Apache-2.0)
BertSum by nlpyang
Code for the paper "Fine-tune BERT for Extractive Summarization"
Python
1367
Updated: 2 y ago
License: Permissive (Apache-2.0)
ERNIE by thunlp
Source code and dataset for ACL 2019 paper "ERNIE: Enhanced Language Representation with Informative Entities"
Python
1354
Updated: 2 y ago
License: Permissive (MIT)
nlp-tutorial by lyeoni
A list of NLP (Natural Language Processing) tutorials
Jupyter Notebook
1341
Updated: 2 y ago
License: Permissive (MIT)
deepnlp by rockingdingo
Deep Learning NLP Pipeline implemented on TensorFlow
Python
1340
Updated: 2 y ago
License: Permissive (MIT)
TurboTransformers by Tencent
A fast and user-friendly runtime for transformer inference (Bert, Albert, GPT2, Decoders, etc.) on CPU and GPU.
C++
1322
Updated: 2 y ago
License: Proprietary (Proprietary)
CLUENER2020 by CLUEbenchmark
CLUENER2020: Chinese Fine-Grained Named Entity Recognition
Python
1302
Updated: 2 y ago
License: No License (No License)
scibert by allenai
A BERT model for scientific text.
Python
1299
Updated: 2 y ago
License: Permissive (Apache-2.0)
Chinese-ELECTRA by ymcui
Pre-trained Chinese ELECTRA
Python
1289
Updated: 2 y ago
License: Permissive (Apache-2.0)
KBQA-BERT by WenRichard
Knowledge-graph-based question answering system that uses BERT for named entity recognition and sentence similarity, with online and outline modes
Python
1279
Updated: 2 y ago
License: Permissive (MIT)