TextCNN | Chinese text classification based on deep learning | Machine Learning library
kandi X-RAY | TextCNN Summary
Chinese text classification based on deep learning (TensorFlow)
Top functions reviewed by kandi - BETA
- Train the model
- Read the contents of a file
- Convert a keras data file into one-hot embedding
- Evaluate the model
- Generator for batches of data
- Return a file-like object
- Feed data into a dictionary
- Return native content
- Returns a timedelta from start_time
- Save train data files
- Read a file
- Runs test
- Builds a vocabulary from css files
- Return a dictionary of category and id
- Reads a vocab file
TextCNN Key Features
TextCNN Examples and Code Snippets
Community Discussions
Trending Discussions on TextCNN
QUESTION
I have a task and I wish to use TextCNN to finish it. The input sequence is like this:
...ANSWER
Answered 2020-Jun-15 at 07:03
You can do it with the tf Tokenizer by customizing the filters argument.
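A minimal sketch of that idea, assuming the Keras Tokenizer (the texts and the kept character below are hypothetical): by default the Tokenizer's filters strip most punctuation, so removing a character such as '-' from the filters string keeps it inside the tokens.
from tensorflow.keras.preprocessing.text import Tokenizer

# Hypothetical input sequences; the default `filters` string would split on '-',
# so we pass a custom string with '-' removed to keep hyphenated tokens intact.
texts = ["A-B-C A-D", "B-C D"]
custom_filters = '!"#$%&()*+,./:;<=>?@[\\]^_`{|}~\t\n'  # default filters minus '-'
tokenizer = Tokenizer(filters=custom_filters, lower=False)
tokenizer.fit_on_texts(texts)

print(tokenizer.word_index)                 # learned vocabulary with hyphenated tokens preserved
print(tokenizer.texts_to_sequences(texts))  # index sequences for the input texts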
QUESTION
I am trying to implement the preprocessing code for this paper (code in this repo). The preprocessing code is described in the paper here:
"A convolutional neural network (Kim, 2014) is used to extract textual features from the transcript of the utterances. We use a single convolutional layer followed by max-pooling and a fully connected layer to obtain the feature representations for the utterances. The input to this network is the 300 dimensional pretrained 840B GloVe vectors (Pennington et al., 2014). We use filters of size 3, 4 and 5 with 50 feature maps in each. The convoluted features are then max-pooled with a window size of 2 followed by the ReLU activation (Nair and Hinton, 2010). These are then concatenated and fed to a 100 dimensional fully connected layer, whose activations form the representation of the utterance. This network is trained at utterance level with the emotion labels."
The authors of the paper state that the CNN feature extraction code can be found in this repo. However, this code is for a complete model that does sequence classification. It does everything in the quote above except the bolded part (and it goes further to do the classification). I want to edit the code so that it concatenates the pooled features, feeds them into the 100-dimensional layer, and then extracts the activations. The data to train on is found in the repo (it's the IMDB dataset).
The output should be a (100, ) tensor for each sequence.
Here's the code for the CNN model:
...ANSWER
Answered 2020-May-11 at 19:35
The convolutional neural network you are trying to implement is a great baseline in the NLP domain. It was introduced for the first time in this paper (Kim, 2014).
I found the code you linked very useful, but it may be more complex than we need. I have tried to rewrite the network in plain Keras (only the regularization is missing).
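A rough Keras sketch of the utterance encoder described in the quoted paragraph, under stated assumptions: the sequence length, the ReLU on the 100-d layer, and the use of Flatten before concatenation are choices of mine, not the authors' exact code.
from tensorflow.keras import layers, Model

def utterance_encoder(seq_len=50, embed_dim=300):
    """Parallel Conv1D branches (filter sizes 3, 4, 5; 50 feature maps each),
    max-pooling with window size 2, ReLU, concatenation, and a 100-d dense layer
    whose activations serve as the utterance representation."""
    inputs = layers.Input(shape=(seq_len, embed_dim))   # pretrained 300-d GloVe vectors
    branches = []
    for k in (3, 4, 5):
        conv = layers.Conv1D(filters=50, kernel_size=k)(inputs)
        pooled = layers.MaxPooling1D(pool_size=2)(conv)  # window size 2, as described
        act = layers.Activation('relu')(pooled)
        branches.append(layers.Flatten()(act))
    merged = layers.Concatenate()(branches)
    features = layers.Dense(100, activation='relu')(merged)  # (100,) per sequence
    return Model(inputs, features)

encoder = utterance_encoder()
encoder.summary()   # final output shape: (None, 100)
For training, a softmax classification head would sit on top of features; the 100-d activations can then be extracted afterwards with a sub-model that outputs the features tensor.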
QUESTION
I'm using a TextCNN model in an Estimator to classify some text. After I train the model, it is stored as checkpoints. But when I try to predict the same test file with the same checkpoints, the predicted results (probability and logits) vary slightly.
- I have set dropout_keep_prob=1 in the dropout layer.
- The checkpoints and the test file remain the same.
- I have used LoggingTensorHook to check the tensor values during prediction; the two values begin to vary at the max_pool step (at least the conv values are the same, but I am not sure).
ANSWER
Answered 2019-Sep-26 at 02:29
Actually, I figured out this problem. The variation is caused by the word embedding vectors, which are generated randomly every time.
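A minimal sketch of one way to make the runs repeatable, assuming a TF 1.x embedding variable like the tutorial's (the variable name and sizes below are placeholders): give the random initializer a fixed seed, or better, make sure the embedding is among the variables restored from the checkpoint.
import tensorflow as tf

vocab_size, embedding_size = 10000, 128   # placeholder sizes

# Seeding the initializer makes the randomly generated embedding identical across
# runs; restoring the trained embedding from the checkpoint is the real fix.
embedding = tf.get_variable(
    "embedding",
    shape=[vocab_size, embedding_size],
    initializer=tf.random_uniform_initializer(-1.0, 1.0, seed=42))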
QUESTION
from keras.layers import Embedding, Dense, Input, Dropout, Reshape
from keras.layers.convolutional import Conv2D
from keras.layers.pooling import MaxPool2D
from keras.layers import Concatenate, Lambda
from keras.backend import expand_dims
from keras.models import Model
from keras.initializers import constant, random_uniform, TruncatedNormal
from keras.callbacks import Callback
class TextCNN(object):
def __init__(
self, sequence_length, num_classes, vocab_size,
embedding_size, filter_sizes, num_filters, l2_reg_lambda=0.0):
# input layer
input_x = Input(shape=(sequence_length, ), dtype='int32')
# embedding layer
embedding_layer = Embedding(vocab_size,
embedding_size,
embeddings_initializer=random_uniform(minval=-1.0, maxval=1.0))(input_x)
embedded_sequences = Lambda(lambda x: expand_dims(embedding_layer, -1))(embedding_layer)
# Create a convolution + maxpool layer for each filter size
pooled_outputs = []
for filter_size in filter_sizes:
conv = Conv2D(filters=num_filters,
kernel_size=[filter_size, embedding_size],
strides=1,
padding="valid",
activation='relu',
kernel_initializer=TruncatedNormal(mean=0.0, stddev=0.1),
bias_initializer=constant(value=0.1),
name=('conv_%d' % filter_size))(embedded_sequences)
max_pool = MaxPool2D(pool_size=[sequence_length - filter_size + 1, 1],
strides=(1, 1),
padding='valid',
name=('max_pool_%d' % filter_size))(conv)
pooled_outputs.append(max_pool)
# combine all the pooled features
num_filters_total = num_filters * len(filter_sizes)
h_pool = Concatenate(axis=3)(pooled_outputs)
h_pool_flat = Reshape([num_filters_total])(h_pool)
# add dropout
dropout = Dropout(0.8)(h_pool_flat)
# output layer
output = Dense(num_classes,
kernel_initializer='glorot_normal',
bias_initializer=constant(0.1),
activation='softmax',
name='scores')(dropout)
self.model = Model(inputs=input_x, output=output)
# model saver callback
class Saver(Callback):
def __init__(self, num):
self.num = num
self.epoch = 0
def on_epoch_end(self, epoch, logs={}):
if self.epoch % self.num == 0:
name = './model/model.h5'
self.model.save(name)
self.epoch += 1
# evaluation callback
class Evaluation(Callback):
def __init__(self, num):
self.num = num
self.epoch = 0
def on_epoch_end(self, epoch, logs={}):
if self.epoch % self.num == 0:
score = model.evaluate(x_train, y_train, verbose=0)
print('train score:', score[0])
print('train accuracy:', score[1])
score = model.evaluate(x_dev, y_dev, verbose=0)
print('Test score:', score[0])
print('Test accuracy:', score[1])
self.epoch += 1
model.fit(x_train, y_train,
epochs=num_epochs,
batch_size=batch_size,
callbacks=[Saver(save_every), Evaluation(evaluate_every)])
Traceback (most recent call last):
File "D:/Projects/Python Program Design/sentiment-analysis-Keras/train.py", line 107, in
callbacks=[Saver(save_every), Evaluation(evaluate_every)])
File "D:\Anaconda3\lib\site-packages\keras\engine\training.py", line 1039, in fit
validation_steps=validation_steps)
File "D:\Anaconda3\lib\site-packages\keras\engine\training_arrays.py", line 204, in fit_loop
callbacks.on_batch_end(batch_index, batch_logs)
File "D:\Anaconda3\lib\site-packages\keras\callbacks.py", line 115, in on_batch_end
callback.on_batch_end(batch, logs)
File "D:/Projects/Python Program Design/sentiment-analysis-Keras/train.py", line 83, in on_batch_end
self.model.save(name)
File "D:\Anaconda3\lib\site-packages\keras\engine\network.py", line 1090, in save
save_model(self, filepath, overwrite, include_optimizer)
File "D:\Anaconda3\lib\site-packages\keras\engine\saving.py", line 382, in save_model
_serialize_model(model, f, include_optimizer)
File "D:\Anaconda3\lib\site-packages\keras\engine\saving.py", line 83, in _serialize_model
model_config['config'] = model.get_config()
File "D:\Anaconda3\lib\site-packages\keras\engine\network.py", line 931, in get_config
return copy.deepcopy(config)
File "D:\Anaconda3\lib\copy.py", line 150, in deepcopy
y = copier(x, memo)
File "D:\Anaconda3\lib\copy.py", line 240, in _deepcopy_dict
y[deepcopy(key, memo)] = deepcopy(value, memo)
File "D:\Anaconda3\lib\copy.py", line 150, in deepcopy
y = copier(x, memo)
File "D:\Anaconda3\lib\copy.py", line 215, in _deepcopy_list
append(deepcopy(a, memo))
File "D:\Anaconda3\lib\copy.py", line 150, in deepcopy
y = copier(x, memo)
File "D:\Anaconda3\lib\copy.py", line 240, in _deepcopy_dict
y[deepcopy(key, memo)] = deepcopy(value, memo)
File "D:\Anaconda3\lib\copy.py", line 150, in deepcopy
y = copier(x, memo)
File "D:\Anaconda3\lib\copy.py", line 240, in _deepcopy_dict
y[deepcopy(key, memo)] = deepcopy(value, memo)
File "D:\Anaconda3\lib\copy.py", line 150, in deepcopy
y = copier(x, memo)
File "D:\Anaconda3\lib\copy.py", line 220, in _deepcopy_tuple
y = [deepcopy(a, memo) for a in x]
File "D:\Anaconda3\lib\copy.py", line 220, in
y = [deepcopy(a, memo) for a in x]
File "D:\Anaconda3\lib\copy.py", line 150, in deepcopy
y = copier(x, memo)
File "D:\Anaconda3\lib\copy.py", line 220, in _deepcopy_tuple
y = [deepcopy(a, memo) for a in x]
File "D:\Anaconda3\lib\copy.py", line 220, in
y = [deepcopy(a, memo) for a in x]
File "D:\Anaconda3\lib\copy.py", line 180, in deepcopy
y = _reconstruct(x, memo, *rv)
File "D:\Anaconda3\lib\copy.py", line 280, in _reconstruct
state = deepcopy(state, memo)
File "D:\Anaconda3\lib\copy.py", line 150, in deepcopy
y = copier(x, memo)
File "D:\Anaconda3\lib\copy.py", line 240, in _deepcopy_dict
y[deepcopy(key, memo)] = deepcopy(value, memo)
File "D:\Anaconda3\lib\copy.py", line 180, in deepcopy
y = _reconstruct(x, memo, *rv)
File "D:\Anaconda3\lib\copy.py", line 280, in _reconstruct
state = deepcopy(state, memo)
File "D:\Anaconda3\lib\copy.py", line 150, in deepcopy
y = copier(x, memo)
File "D:\Anaconda3\lib\copy.py", line 240, in _deepcopy_dict
y[deepcopy(key, memo)] = deepcopy(value, memo)
File "D:\Anaconda3\lib\copy.py", line 180, in deepcopy
y = _reconstruct(x, memo, *rv)
File "D:\Anaconda3\lib\copy.py", line 280, in _reconstruct
state = deepcopy(state, memo)
File "D:\Anaconda3\lib\copy.py", line 150, in deepcopy
y = copier(x, memo)
File "D:\Anaconda3\lib\copy.py", line 240, in _deepcopy_dict
y[deepcopy(key, memo)] = deepcopy(value, memo)
File "D:\Anaconda3\lib\copy.py", line 169, in deepcopy
rv = reductor(4)
TypeError: can't pickle _thread.RLock objects
...ANSWER
Answered 2019-Mar-21 at 12:18
It might be due to this layer:
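The layer the answer points to is not preserved in this excerpt. One common cause of this exact pickling error is a Lambda layer that closes over an outer Keras tensor, as the embedding Lambda in the question's code does; a hedged sketch of how that step could be rewritten (an assumption of mine, not the original answer, with hypothetical sizes):
from keras.layers import Input, Embedding, Lambda
from keras.backend import expand_dims

# Have the Lambda operate on its own input `x` instead of capturing the outer
# `embedding_layer` tensor, which cannot be deep-copied when model.save()
# serializes the model config.
input_x = Input(shape=(56,), dtype='int32')        # hypothetical sequence_length
embedding_layer = Embedding(5000, 128)(input_x)    # hypothetical vocab/embedding sizes
embedded_sequences = Lambda(lambda x: expand_dims(x, -1))(embedding_layer)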
QUESTION
I am relatively new to tensorflow and I am working on relation classification. I will list my problem step by step so that it is clear, and I hope that someone can point out my mistake (which I am sure must be a silly one):
- For the word embedding layer I needed to initialize a tf variable with a tensor which was of size more that 2GB. So I followed the solutions provided here and changed my code.
Code snippets before change :
...ANSWER
Answered 2018-Aug-01 at 10:21
The placeholder wordvecs needs to be fed. This can be reproduced with the following example from the tf.placeholder entry in the official documentation:
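For reference, the documentation's example (TF 1.x) is essentially the following: running the op without feeding the placeholder raises an error, while supplying feed_dict succeeds.
import numpy as np
import tensorflow as tf

x = tf.placeholder(tf.float32, shape=(1024, 1024))
y = tf.matmul(x, x)

with tf.Session() as sess:
    # print(sess.run(y))                           # ERROR: will fail because x was not fed
    rand_array = np.random.rand(1024, 1024)
    print(sess.run(y, feed_dict={x: rand_array}))  # succeeds once x is fed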
QUESTION
I'm a novice to deep learning. I use TensorFlow to construct my TextCNN model (two categories), referring to this tutorial.
This model can predict the category of the text, but I want a score (a continuous value in [0,1]) rather than the discrete value. For example, if the model gives 0.77, the text is more likely one of the categories; if it gives 1.0, the text is definitely that category.
This is the part of my code.
ANSWER
Answered 2018-Jul-15 at 07:06
Use tf.nn.softmax(self.logits) to get probabilistic scores. Also see this question: What is logits, softmax and softmax_cross_entropy_with_logits?
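A tiny self-contained illustration (TF 1.x), with a stand-in constant for self.logits: softmax maps the raw logits to probabilities in [0, 1] that sum to 1, giving the continuous score asked for.
import tensorflow as tf

logits = tf.constant([[2.0, 0.5]])       # stand-in for self.logits (two categories)
probabilities = tf.nn.softmax(logits)    # continuous scores in [0, 1], summing to 1

with tf.Session() as sess:
    print(sess.run(probabilities))       # approximately [[0.82, 0.18]]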
QUESTION
I have trained a model using the Wild ML implementation of a CNN which can be found here, and deployed it to Google Cloud Platform. I am now trying to send a JSON prediction request to the model, but I am getting the following error:
...ANSWER
Answered 2018-Feb-24 at 05:06
The issue has nothing to do with train.py and text_cnn.py; they build your model. After building your model, make the following modifications in your eval.py code.
First, you can use an argument-parsing library to read your JSON file.
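A hedged sketch of that first step, assuming eval.py takes the JSON request as a command-line flag (the flag name and request structure below are placeholders, not the original answer's code):
import argparse
import json

# Parse a command-line flag pointing at the JSON prediction request.
parser = argparse.ArgumentParser()
parser.add_argument("--json_file", required=True,
                    help="Path to the JSON prediction request")
args = parser.parse_args()

# Load the request body so it can be turned into model inputs.
with open(args.json_file) as f:
    request = json.load(f)
print(request)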
QUESTION
I got an error message like this:
IndexError  Traceback (most recent call last)
...
ANSWER
Answered 2017-Nov-13 at 13:00
Your problem seems similar to this StackOverflow question.
Try removing tf.reset_default_graph() and see if that fixes your issue.
QUESTION
I have written tensorflow code based on:
http://www.wildml.com/2015/12/implementing-a-cnn-for-text-classification-in-tensorflow/
but using precomputed word embeddings from the GoogleNews word2vec 300 dimension model.
I created my own data from the UCML News Aggregator Dataset in which I parsed the content of the news articles and have created my own labels.
Due to the size of the articles I use TF-IDF to filter out the top 120 words per article and embed those into 300 dimensions.
When I run the CNN I created, it converges to a low overall accuracy of around 38%, regardless of the hyperparameters.
Hyper parameters changed:
Various filter sizes:
I've tried single filters of size 1, 2, and 3, and combinations of filters [3,4,5] and [1,3,4].
Learning Rate:
I've varied this from very low to very high; at very low rates it doesn't converge to 38%, but anything between 0.0001 and 0.4 does.
Batch Size:
Tried many ranges between 5 and 100.
Weight and Bias Initialization:
Set stddev of weights between 0.4 and 0.01. Set bias initial values between 0 and 0.1. Tried using the xavier initializer for the conv2d weights.
Dataset Size:
I have only tried two partial data sets, one with 15,000 training examples and the other with the 5,000 test examples. In total I have 263,000 examples to train on. There is no accuracy difference whether I train and evaluate on the 15,000 training examples or use the 5,000 test examples as the training data (to save testing time).
I've run successful classifications on the 15,000 / 5,000 split using a feed-forward network with a BoW input (93% accurate), TF-IDF with an SVM (92%), and TF-IDF with Naive Bayes (91.5%). So I don't think it is the data.
What does this imply? Is the model just a poor model for this task? Is there an error in my work?
I feel like my do_eval function is incorrect for evaluating the accuracy/loss over an epoch of the data:
...ANSWER
Answered 2017-Sep-20 at 00:19
Turns out my error was in the creation of the input matrix.
QUESTION
I am following this tutorial in order to understand CNNs in NLP. There are a few things which I don't understand despite having the code in front of me. I hope somebody can clear a few things up here.
The first rather minor thing is the sequence_length
parameter of the TextCNN
object. In the example on github this is just 56
which I think is the max-length of all sentences in the training data. This means that self.input_x
is a 56-dimensional vector which will contain just the indices from the dictionary of a sentence for each word.
This list goes into tf.nn.embedding_lookup(W, self.input_x)
which will return a matrix consisting of the word embeddings of those words given by self.input_x
. According to this answer this operation is similar to using indexing with numpy:
ANSWER
Answered 2017-Feb-01 at 19:19
Answer to question 1 (Am I correct if I assume that tf.nn.embedding_lookup ignores the value 0?):
The 0's in the input vector are the index of the 0th symbol in the vocabulary, which is the PAD symbol. I don't think it gets ignored when the lookup is performed; the 0th row of the embedding matrix will be returned.
Answer to question 2 (But how can tf.nn.embedding_lookup know about those indices given by self.input_x?):
The size of the embedding matrix is [V x E], where V is the size of the vocabulary and E is the dimension of the embedding vector. The 0th row of the matrix is the embedding vector for the 0th element of the vocabulary, the 1st row for the 1st element, and so on. From the input vector x, we get the indices of the words in the vocabulary, and these are used to index the embedding matrix.
Answer to question 3 (Does this mean that we are actually learning the word embeddings here?):
Yes, we are actually learning the embedding matrix.
In the embedding layer, in the line W = tf.Variable(tf.random_uniform([vocab_size, embedding_size], -1.0, 1.0), name="W"), W is the embedding matrix, and by default variables in TensorFlow have trainable=True. So W will also be a learned parameter. To use pre-trained embeddings, set trainable=False.
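A small illustration of the lookup behaviour described above (TF 1.x, with toy numbers of my own): embedding_lookup gathers rows of W by index, much like numpy fancy indexing W[ids], and index 0 simply returns row 0 (the PAD embedding).
import numpy as np
import tensorflow as tf

W = tf.constant(np.arange(12, dtype=np.float32).reshape(4, 3))  # [V=4, E=3] embedding matrix
ids = tf.constant([0, 2, 2, 1])                                 # padded word indices
looked_up = tf.nn.embedding_lookup(W, ids)                      # gathers rows 0, 2, 2, 1 of W

with tf.Session() as sess:
    print(sess.run(looked_up))   # row 0 (the PAD embedding) is returned, not ignored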
For detailed explanation of the code you can follow blog: https://agarnitin86.github.io/blog/2016/12/23/text-classification-cnn
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
Vulnerabilities
No vulnerabilities reported
Install TextCNN
You can use TextCNN like any standard Python library. You will need to make sure that you have a development environment consisting of a Python distribution including header files, a compiler, pip, and git installed. Make sure that your pip, setuptools, and wheel are up to date. When using pip it is generally recommended to install packages in a virtual environment to avoid changes to the system.