jiant is an NLP toolkit
KeyBERT by MaartenGr
Minimal keyword extraction with BERT
Python | 2419 | Updated: 2 y ago | License: Permissive (MIT)
keras-bert by CyberZHG
Implementation of BERT that could load official pre-trained models for feature extraction and prediction
Python | 2411 | Updated: 2 y ago | License: Permissive (MIT)
roberta_zh by brightmart
RoBERTa for Chinese: pre-trained Chinese RoBERTa models
Python | 2366 | Updated: 2 y ago | License: No License
texar by asyml
Toolkit for Machine Learning, Natural Language Processing, and Text Generation, in TensorFlow. This is part of the CASL project: http://casl-project.ai/
Python | 2360 | Updated: 2 y ago | License: Permissive (Apache-2.0)
zh-NER-TF by Determined22
A very simple BiLSTM-CRF model for Chinese Named Entity Recognition (TensorFlow)
Python | 2214 | Updated: 2 y ago | License: No License
electra by google-research
ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators
Python | 2179 | Updated: 2 y ago | License: Permissive (Apache-2.0)
Kashgari by BrikerMan
Kashgari is a production-level NLP transfer learning framework built on top of tf.keras for text labeling and text classification. It includes Word2Vec, BERT, and GPT2 language embeddings.
Python | 2137 | Updated: 4 y ago | License: Permissive (Apache-2.0)
Information-Extraction-Chinese by crownpku
Chinese Named Entity Recognition with IDCNN/biLSTM+CRF, and Relation Extraction with biGRU+2ATT
Python | 2103 | Updated: 2 y ago | License: No License
awesome-sentence-embedding by Separius
A curated list of pretrained sentence and word embedding models
Python | 2099 | Updated: 2 y ago | License: Strong Copyleft (GPL-3.0)
esm by facebookresearch
Evolutionary Scale Modeling (esm): Pretrained language models for proteins
Python | 2078 | Updated: 2 y ago | License: Permissive (MIT)
text2vec by shibing624
text2vec, text to vector. A text vector representation tool that converts text into vector matrices. It implements text representation and text similarity models including Word2Vec, RankBM25, Sentence-BERT, and CoSENT, and works out of the box.
Python | 2066 | Updated: 2 y ago | License: Permissive (Apache-2.0)
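HarvestText by blmoistawinde
Text mining and preprocessing toolkit (text cleaning, new word discovery, sentiment analysis, entity recognition and linking, keyword extraction, knowledge extraction, syntactic parsing, etc.) based on unsupervised or weakly supervised methods
Python | 1954 | Updated: 2 y ago | License: Permissive (MIT)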
rust-bert by guillaume-be
Rust native ready-to-use NLP pipelines and transformer-based models (BERT, DistilBERT, GPT2,...)
Rust | 1876 | Updated: 2 y ago | License: Permissive (Apache-2.0)
fast-bert by utterworks
Super easy library for BERT based NLP models
Python | 1793 | Updated: 2 y ago | License: Permissive (Apache-2.0)
ABSA-PyTorch by songyouwei
Aspect Based Sentiment Analysis, PyTorch Implementations.
Python | 1782 | Updated: 2 y ago | License: Permissive (MIT)
longformer by allenai
Longformer: The Long-Document Transformer
Python | 1778 | Updated: 2 y ago | License: Permissive (Apache-2.0)
BERT-NER-Pytorch by lonePatient
Chinese NER (Named Entity Recognition) using BERT (Softmax, CRF, Span)
Python | 1770 | Updated: 2 y ago | License: Permissive (MIT)
kaggle-CrowdFlower by ChenglongChen
1st Place Solution for CrowdFlower Product Search Results Relevance Competition on Kaggle.
C++ | 1742 | Updated: 2 y ago | License: No License
Senta by baidu
Baidu's open-source Sentiment Analysis System.
Python | 1719 | Updated: 2 y ago | License: Permissive (Apache-2.0)
ChineseNLP by didi
Datasets and SOTA results for every field of Chinese NLP
HTML | 1710 | Updated: 2 y ago | License: No License
gpt2-ml by imcaspar
GPT2 for Multiple Languages, including pretrained models. Multilingual GPT2 support, with a 1.5-billion-parameter pre-trained Chinese model
Python | 1682 | Updated: 2 y ago | License: Permissive (Apache-2.0)
NLP-Interview-Notes by km1994
This repository mainly collects interview questions for NLP algorithm engineers
Jupyter Notebook | 1678 | Updated: 2 y ago | License: No License
ChineseGLUE by ChineseGLUE
Language Understanding Evaluation benchmark for Chinese: datasets, baselines, pre-trained models, corpus, and leaderboard
Python | 1671 | Updated: 2 y ago | License: No License
biobert by dmis-lab
Bioinformatics'2020: BioBERT: a pre-trained biomedical language representation model for biomedical text mining
Python | 1651 | Updated: 2 y ago | License: Proprietary
FARM by deepset-ai
Fast & easy transfer learning for NLP. Harvesting language models for the industry. Focus on Question Answering.
Python | 1646 | Updated: 2 y ago | License: Permissive (Apache-2.0)
Chinese-XLNet by ymcui
Pre-Trained Chinese XLNet
Python | 1567 | Updated: 2 y ago | License: Permissive (Apache-2.0)
magnitude by plasticityai
A fast, efficient universal vector embedding utility package.
Python | 1564 | Updated: 2 y ago | License: Permissive (MIT)
lm-evaluation-harness by EleutherAI
A framework for few-shot evaluation of autoregressive language models.
Python | 1550 | Updated: 2 y ago | License: Permissive (MIT)
DeBERTa by microsoft
The implementation of DeBERTa
Python | 1542 | Updated: 2 y ago | License: Permissive (MIT)
Keras-TextClassification by yongzhuo
Chinese long-text classification, short-sentence classification, multi-label classification, and sentence-pair similarity (Chinese Text Classification of Keras NLP, multi-label classify, or sentence classify, long or short); base classes for building character, word, and sentence embedding layers (embeddings) and network layers (graph); FastText, TextCNN, CharCNN, TextRNN, RCNN, DCNN, DPCNN, VDCNN, CRNN, Bert, Xlnet, Albert, Attention, DeepMoji, HAN, CapsuleNet, Transformer-encode, Seq2seq, SWEM, LEAM, TextGCN
Python | 1530 | Updated: 2 y ago | License: Permissive (MIT)
nlp-journey by msgi
Documents, papers, and code related to Natural Language Processing, including Topic Model, Word Embedding, Named Entity Recognition, Text Classification, Text Generation, Text Similarity, Machine Translation, etc. All code is implemented in TensorFlow 2.0.
Python | 1528 | Updated: 2 y ago | License: Permissive (Apache-2.0)
spago by nlpodyssey
Self-contained Machine Learning and Natural Language Processing library in Go
Go | 1510 | Updated: 2 y ago | License: Permissive (BSD-2-Clause)
nlpcda by 425776024
One-click Chinese data augmentation package; NLP data augmentation, BERT data augmentation, EDA: pip install nlpcda
Python | 1450 | Updated: 2 y ago | License: Permissive (Apache-2.0)
swift-coreml-transformers by huggingface
Swift Core ML 3 implementations of GPT-2, DistilGPT-2, BERT, and DistilBERT for Question answering. Other Transformers coming soon!
Swift | 1438 | Updated: 2 y ago | License: Permissive (Apache-2.0)
NeuronBlocks by microsoft
NLP DNN Toolkit - Building Your NLP DNN Models Like Playing Lego
Python | 1433 | Updated: 2 y ago | License: Permissive (MIT)
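nlp_xiaojiang by yongzhuo
Natural language processing (nlp): Xiaojiang robot (retrieval-based chit-chat chatbot), BERT sentence vectors and similarity (Sentence Similarity), XLNET sentence vectors and similarity (text xlnet embedding), text classification, entity extraction (ner, bert+bilstm+crf), data augmentation (text augment, data enhance), synonymous sentence and synonym generation, sentence main-part extraction (mainpart), Chinese short-text similarity, text feature engineering, keras-http-service invocation
Python | 1432 | Updated: 2 y ago | License: Permissive (MIT)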
pke by boudinfl
Python Keyphrase Extraction module
Python | 1408 | Updated: 2 y ago | License: Strong Copyleft (GPL-3.0)
eda_nlp by jasonwei20
Data augmentation for NLP, presented at EMNLP 2019
Python | 1403 | Updated: 2 y ago | License: No License
Transformer-Explainability by hila-chefer
[CVPR 2021] Official PyTorch implementation for Transformer Interpretability Beyond Attention Visualization, a novel method to visualize classifications by Transformer based networks.
Jupyter Notebook | 1376 | Updated: 2 y ago | License: Permissive (MIT)
fast-bert by kaushaltrivedi
Super easy library for BERT based NLP models
Python | 1372 | Updated: 4 y ago | License: Permissive (Apache-2.0)
BertSum by nlpyang
Code for paper Fine-tune BERT for Extractive Summarization
Python | 1367 | Updated: 2 y ago | License: Permissive (Apache-2.0)
ERNIE by thunlp
Source code and dataset for ACL 2019 paper "ERNIE: Enhanced Language Representation with Informative Entities"
Python | 1354 | Updated: 2 y ago | License: Permissive (MIT)
nlp-tutorial by lyeoni
A list of NLP (Natural Language Processing) tutorials
Jupyter Notebook | 1341 | Updated: 2 y ago | License: Permissive (MIT)
deepnlp by rockingdingo
Deep Learning NLP Pipeline implemented on Tensorflow
Python | 1340 | Updated: 2 y ago | License: Permissive (MIT)
TurboTransformers by Tencent
A fast and user-friendly runtime for transformer inference (Bert, Albert, GPT2, Decoders, etc) on CPU and GPU.
C++ | 1322 | Updated: 2 y ago | License: Proprietary
CLUENER2020 by CLUEbenchmark
CLUENER2020: Fine Grained Named Entity Recognition for Chinese
Python | 1302 | Updated: 2 y ago | License: No License
scibert by allenai
A BERT model for scientific text.
Python | 1299 | Updated: 2 y ago | License: Permissive (Apache-2.0)
Chinese-ELECTRA by ymcui
Pre-trained Chinese ELECTRA
Python | 1289 | Updated: 2 y ago | License: Permissive (Apache-2.0)
KBQA-BERT by WenRichard
A knowledge-graph-based question answering system that uses BERT for named entity recognition and sentence similarity, with online and outline modes
Python | 1279 | Updated: 2 y ago | License: Permissive (MIT)