CNN-for-text-classification | A simple CNN implementation in Keras | Machine Learning library
kandi X-RAY | CNN-for-text-classification Summary
A simple CNN implementation in Keras.
Top functions reviewed by kandi - BETA
- Creates a CNN using ROB CNN
- Read ROB data
- Predict with the model
- Loads a word2vec model
- Train a sentence model
- Train the sentence model
- Draw a balanced sample
- Performs preprocessing
- Fit the tokenizer
- Train the doc_model for train_documents
- Call the sentence model
CNN-for-text-classification Key Features
CNN-for-text-classification Examples and Code Snippets
Community Discussions
Trending Discussions on CNN-for-text-classification
QUESTION
I'm trying to add 2-stacked character-level CNNs into a larger neural network system, but I'm getting a ValueError for the input dimensions.
What I want to achieve is to get orthographic representations of the input words by replacing characters (according to capitalization, or being numeric or alphabetic) and feeding them into a CNN. I'm aware that this can be achieved with an LSTM/RNN, but the requirements specify a CNN, so using another type of network is not an option.
Most of the examples out there naturally use image datasets (MNIST, etc.) rather than text datasets, so I'm not sure how to "reshape" character embeddings so that they are valid inputs for a CNN.
So here is the part of the code I'm trying to run:
...ANSWER
Answered 2018-May-11 at 15:16
conv1d expects the channel dimension to be defined when the graph is created, so you can't pass None for that dimension.
You need to make the following changes:
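For illustration, a minimal Keras sketch of a 2-stacked character-level Conv1D with the channel dimension fixed (the vocabulary size, sequence length, and layer sizes here are assumptions, not the asker's values):

    from keras.models import Sequential
    from keras.layers import Embedding, Conv1D, GlobalMaxPooling1D, Dense

    max_len, num_chars = 50, 100  # hypothetical sequence length and character-vocabulary size

    model = Sequential([
        # The Embedding layer fixes the channel dimension to 16,
        # so Conv1D sees (max_len, 16) rather than (max_len, None)
        Embedding(input_dim=num_chars, output_dim=16, input_length=max_len),
        Conv1D(64, kernel_size=3, activation="relu"),
        Conv1D(64, kernel_size=3, activation="relu"),  # the second of the "2-stacked" CNNs
        GlobalMaxPooling1D(),
        Dense(2, activation="softmax"),
    ])
    model.compile(optimizer="adam", loss="categorical_crossentropy")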
QUESTION
I'm a novice at deep learning. I use TensorFlow to construct my TextCNN model (two categories), referring to this tutorial.
The model can predict the category of a text, but I want a score (a continuous value in [0, 1]) rather than a discrete value. For example, if the model gives 0.77, the text is likely to belong to one of the categories; if it gives 1.0, the text definitely belongs to that category.
This is the relevant part of my code.
ANSWER
Answered 2018-Jul-15 at 07:06
Use tf.nn.softmax(self.logits) to get probabilistic scores. Also see this question: What is logits, softmax and softmax_cross_entropy_with_logits?
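A small self-contained sketch of the idea (the logits values are made up):

    import tensorflow as tf

    # Hypothetical logits from a two-class TextCNN's final dense layer
    logits = tf.constant([[2.3, 0.8],
                          [0.1, 1.9]])
    probs = tf.nn.softmax(logits)  # each row now sums to 1.0

    with tf.Session() as sess:
        print(sess.run(probs))  # approx. [[0.82, 0.18], [0.14, 0.86]]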
QUESTION
I wrote a module based on this article: http://www.wildml.com/2015/12/implementing-a-cnn-for-text-classification-in-tensorflow/
The idea is to pass the input through multiple streams, then concatenate them and connect the result to a fully connected (FC) layer. I divided my source code into three custom modules: TextClassifyCnnNet >> FlatCnnLayer >> FilterLayer
FilterLayer:
...ANSWER
Answered 2018-Jan-16 at 08:40
I realised that the L2_loss term with the Adam optimizer made the loss value remain unchanged (I haven't tried other optimizers yet). It works when I remove L2_loss:
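A sketch of the kind of change described, assuming a TF1-style loss (the variable names, the stand-in dense layer, and the regularization weight are hypothetical):

    import tensorflow as tf

    x = tf.placeholder(tf.float32, [None, 128])      # hypothetical feature input
    labels = tf.placeholder(tf.float32, [None, 2])   # one-hot targets
    logits = tf.layers.dense(x, 2)                   # stand-in for the network's FC layer

    cross_entropy = tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits_v2(labels=labels, logits=logits))
    # Original formulation: cross-entropy plus an L2 penalty over trainable weights
    # l2_loss = 1e-3 * tf.add_n([tf.nn.l2_loss(v) for v in tf.trainable_variables()])
    # loss = cross_entropy + l2_loss   # loss plateaued with this term included
    loss = cross_entropy               # without L2_loss, as described above
    train_op = tf.train.AdamOptimizer(1e-3).minimize(loss)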
QUESTION
Update: I had misused binary cross-entropy with a softmax output and changed to categorical cross-entropy; I also reviewed the details of the problem in my own answer below.
I am trying to use open-source data, sogou_news_csv (converted to pinyin using jieba), for text classification, following https://arxiv.org/abs/1502.01710, "Text Understanding from Scratch" by Xiang Zhang and Yann LeCun (mainly following the idea of using a character-level CNN rather than the exact structure proposed in the paper).
I did the preprocessing using one-hot encoding according to an alphabet collection, filling everything not in the alphabet with 0s. As a result, I got training data of shape (450000, 1000, 70), i.e. (data_size, sequence_length, alphabet_size).
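For concreteness, a minimal sketch of this kind of character-level one-hot preprocessing (the alphabet below is hypothetical, not the asker's 70-symbol collection):

    import numpy as np

    ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789,;.!?:'\"/\\|_@#$%^&*~`+-=<>()[]{}"
    CHAR_TO_IDX = {c: i for i, c in enumerate(ALPHABET)}

    def one_hot_encode(text, seq_len=1000):
        # Characters outside the alphabet are left as all-zero rows, as described above
        x = np.zeros((seq_len, len(ALPHABET)), dtype=np.float32)
        for t, ch in enumerate(text[:seq_len]):
            idx = CHAR_TO_IDX.get(ch)
            if idx is not None:
                x[t, idx] = 1.0
        return x

    batch = np.stack([one_hot_encode("ni hao"), one_hot_encode("zai jian")])
    print(batch.shape)  # (2, 1000, len(ALPHABET)); cf. (450000, 1000, 70)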
Then I fed the data into the CNN structure from http://www.wildml.com/2015/12/implementing-a-cnn-for-text-classification-in-tensorflow/.
The problem is that during training the loss and accuracy barely change. I tried preprocessing the data again and tried different learning-rate settings, but neither helped, so what went wrong?
Below is one-hot encoding:
...ANSWER
Answered 2017-Nov-01 at 08:43
You have set the SGD optimizer's learning rate to 0.000001 (opt = SGD(lr=1e-6)). The default learning rate for SGD is 0.01:
keras.optimizers.SGD(lr=0.01, momentum=0.0, decay=0.0, nesterov=False)
I suspect that 1e-6 is too small; try increasing it and/or try a different optimizer.
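For example (a sketch; the one-layer model is a stand-in, not the asker's network):

    from keras.models import Sequential
    from keras.layers import Dense
    from keras.optimizers import SGD, Adam

    model = Sequential([Dense(2, activation="softmax", input_shape=(70,))])

    opt = SGD(lr=0.01)      # the Keras default, vs. the asker's lr=1e-6
    # opt = Adam(lr=1e-3)   # or try a different optimizer entirely
    model.compile(optimizer=opt, loss="categorical_crossentropy", metrics=["accuracy"])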
QUESTION
I am reviewing Denny Britz's tutorial on text classification using CNNs in TensorFlow. Filter and stride shapes make perfect sense in the image domain. However, when it comes to text, I am confused about how to correctly define the stride and filter shapes. Consider the following two layers from Denny's code:
ANSWER
Answered 2017-Oct-25 at 23:08
Filters
Have I interpreted this correctly?
Yes, exactly.
Strides
Do this shape's dimensions correspond to the dimensions of the filter_shape?
Yes, it corresponds to the strides in which you convolve the filter on the input embedding.
It would seem that the nature of word-vector representations means that the stride should be [1, embedding_size, 1, 1], meaning I want to move the window one full word at a time, over one channel, for each filter.
Pay attention to the padding strategy: the padding in conv2d is set to VALID, which means there is no padding. Since the filter size in the embedding dimension covers the input entirely, it can fit only once, regardless of the stride along that dimension.
Put differently: you can convolve along the embedding dimension only once, independently of the stride.
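To make the shapes concrete, here is a small sketch in the style of that tutorial (the batch, sequence, and filter sizes are made up):

    import tensorflow as tf

    batch, seq_len, emb, num_filters, filter_size = 4, 56, 128, 8, 3

    # (batch, sequence_length, embedding_size, 1): text as a one-channel "image"
    embedded = tf.random_uniform([batch, seq_len, emb, 1])

    # The filter spans filter_size words and the *entire* embedding dimension
    W = tf.Variable(tf.truncated_normal([filter_size, emb, 1, num_filters], stddev=0.1))

    conv = tf.nn.conv2d(embedded, W, strides=[1, 1, 1, 1], padding="VALID")
    # With VALID padding the filter fits only once along the embedding axis:
    # conv.shape == (batch, seq_len - filter_size + 1, 1, num_filters)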
QUESTION
I have written TensorFlow code based on http://www.wildml.com/2015/12/implementing-a-cnn-for-text-classification-in-tensorflow/, but using precomputed word embeddings from the GoogleNews word2vec 300-dimension model.
I created my own data from the UCML News Aggregator Dataset, parsing the content of the news articles and creating my own labels.
Due to the size of the articles, I use TF-IDF to keep the top 120 words per article and embed those into 300 dimensions.
When I run the CNN I created, it converges to a low overall accuracy of around 38%, regardless of the hyperparameters.
Hyperparameters changed:
Filter sizes:
I've tried single filters of size 1, 2, and 3, and combinations of filters: [3,4,5], [1,3,4].
Learning rate:
I've varied this from very low to very high; very low doesn't converge to 38%, but anything between 0.0001 and 0.4 does.
Batch size:
Tried many values between 5 and 100.
Weight and bias initialization:
Set the stddev of the weights between 0.01 and 0.4. Set initial bias values between 0 and 0.1. Tried the Xavier initializer for the conv2d weights.
Dataset size:
I have only tried two partial datasets, one with 15,000 training examples and the other with the 5,000 test examples. In total I have 263,000 examples to train on. There is no accuracy difference whether I train and evaluate on the 15,000 training examples or use the 5,000 test examples as training data (to save testing time).
I've run successful classifications on the 15,000/5,000 split using a feed-forward network with a bag-of-words input (93% accuracy), TF-IDF with an SVM (92%), and TF-IDF with Naive Bayes (91.5%), so I don't think it is the data.
What does this imply? Is the model simply a poor fit for this task? Is there an error in my work?
I feel like my do_eval function for evaluating the accuracy/loss over an epoch of the data may be incorrect:
...ANSWER
Answered 2017-Sep-20 at 00:19
It turns out my error was in the creation of the input matrix.
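The corrected code was not shared; for reference, a typical epoch-evaluation helper of the kind do_eval usually implements might look like this sketch (all names and the session-based API are assumptions):

    def do_eval(sess, accuracy_op, loss_op, x_ph, y_ph, data_x, data_y, batch_size=50):
        # Average accuracy/loss over one full pass of the data,
        # weighting the final partial batch by its true size
        total_acc, total_loss, n = 0.0, 0.0, len(data_x)
        for start in range(0, n, batch_size):
            end = min(start + batch_size, n)
            feed = {x_ph: data_x[start:end], y_ph: data_y[start:end]}
            acc, loss = sess.run([accuracy_op, loss_op], feed_dict=feed)
            total_acc += acc * (end - start)
            total_loss += loss * (end - start)
        return total_acc / n, total_loss / n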
QUESTION
I am following this tutorial in order to understand CNNs in NLP. There are a few things which I don't understand despite having the code in front of me. I hope somebody can clear a few things up here.
The first, rather minor, thing is the sequence_length parameter of the TextCNN object. In the example on GitHub this is just 56, which I think is the maximum length of all sentences in the training data. This means that self.input_x is a 56-dimensional vector containing, for each word of a sentence, just its index in the dictionary.
This list goes into tf.nn.embedding_lookup(W, self.input_x), which returns a matrix consisting of the word embeddings of the words given by self.input_x. According to this answer, this operation is similar to indexing with NumPy:
ANSWER
Answered 2017-Feb-01 at 19:19
Answer to question 1 (So am I correct if I assume that tf.nn.embedding_lookup ignores the value 0?):
The 0s in the input vector are the index of the 0th symbol in the vocabulary, which is the PAD symbol. I don't think it gets ignored when the lookup is performed; the 0th row of the embedding matrix will be returned.
Answer to question 2 (But how can tf.nn.embedding_lookup know about those indices given by self.input_x?):
The size of the embedding matrix is [V x E], where V is the size of the vocabulary and E is the dimension of the embedding vectors. The 0th row of the matrix is the embedding vector for the 0th element of the vocabulary, the 1st row for the 1st element, and so on. From the input vector x we get the indices of the words in the vocabulary, and those indices are used to index the embedding matrix.
Answer to question 3 (Does this mean that we are actually learning the word embeddings here?):
Yes, we are actually learning the embedding matrix. In the embedding layer, in the line W = tf.Variable(tf.random_uniform([vocab_size, embedding_size], -1.0, 1.0), name="W"), W is the embedding matrix, and in TensorFlow variables have trainable=True by default, so W will also be a learned parameter. To use a pre-trained model, set trainable=False.
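To illustrate the lookup-as-indexing point, a runnable sketch with made-up sizes:

    import tensorflow as tf

    vocab_size, embedding_size = 5, 3
    W = tf.Variable(tf.random_uniform([vocab_size, embedding_size], -1.0, 1.0), name="W")

    input_x = tf.constant([[0, 2, 4]])             # word indices; 0 is the PAD row
    embedded = tf.nn.embedding_lookup(W, input_x)  # shape (1, 3, embedding_size)

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        w, e = sess.run([W, embedded])
        assert (e[0] == w[[0, 2, 4]]).all()        # same as NumPy fancy indexing W[x]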
For a detailed explanation of the code, you can follow this blog post: https://agarnitin86.github.io/blog/2016/12/23/text-classification-cnn
Community Discussions and Code Snippets contain sources that include the Stack Exchange Network.
Vulnerabilities
No vulnerabilities reported
Install CNN-for-text-classification
You can use CNN-for-text-classification like any standard Python library. You will need a development environment consisting of a Python distribution with header files, a compiler, pip, and git installed. Make sure that your pip, setuptools, and wheel are up to date. When using pip, it is generally recommended to install packages in a virtual environment to avoid changes to the system.