Convolutional-Network | A convolutional neural network from scratch | Machine Learning library
kandi X-RAY | Convolutional-Network Summary
The purpose of this project was to understand the full architecture of a convolutional network and to visually break down what is going on while training to recognize images. In particular, I was interested in seeing how the weight kernels pick up patterns over the course of training. Even though you can get some insight into the learning during training, the network is extremely slow! This is mainly because it was never designed and optimized to process large volumes of images. It would be great to rewrite it in Theano or TensorFlow.
Top functions reviewed by kandi - BETA
- Backpropagate a layer to the pool
- Max prime function
- Calculate the sigmoid
- Return the sigmoid
- Load training data
- Load MNIST dataset
- Return vectorized result
- Backprop compatibility
- Backprop
Convolutional-Network Key Features
Convolutional-Network Examples and Code Snippets
Community Discussions
Trending Discussions on Convolutional-Network
QUESTION
This might come across as a seriously newbie question, but I do not have many options, as I am not sure which direction I should be heading.
I am currently studying Deep Learning and want to toy around with Stanford's CS231n Convolutional Neural Network demo, as I find it extremely user friendly. The visuals are embedded in the website. I really want to experiment with it, but I do not know how or where to start.
I have knowledge of Python and VS-Code if that helps.
...ANSWER
Answered 2021-Jan-28 at 08:05 Take the index.html file from the link above.
If you look closely at index.html, there are two scripts you need to make it work.
Copy the files from the demo folder in the link; the file structure should look like this (the same as in the GitHub demo).
Now double-click index.html and choose a browser to open it; it should work as expected. You can also modify the code and reload index.html to see the changes live.
QUESTION
Does Eigen support getting the next block with stride = 2?
I observed that the default behavior is stride = 1 in this:
...ANSWER
Answered 2020-Mar-09 at 05:05 You can declare the stride with Eigen::Map, as such:
QUESTION
I'm trying to use scipy's ndimage.convolve function to perform a convolution on a 3-dimensional image (RGB, width, height).
Taking a look here:
It is clear to see that for any input, each kernel/filter should only ever have an output of NxN, with strictly a depth of 1.
This is a problem with scipy, as when you do ndimage.convolve with an input of size (3, 5, 5) and a filter/kernel of size (3, 3, 3), the result of this operation produces an output size of (3, 5, 5), clearly not summing the different channels.
Is there a way to force this summation without doing it manually? I try to do as little in base Python as possible, as a lot of external libraries are written in C++ and do the same operations faster. Or is there an alternative?
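A minimal pure-Python sketch of the channel-summed "valid" convolution being asked for (shapes taken from the question; this is an illustration only, not a scipy call):

```python
def conv_channels_summed(img, ker):
    """Valid (no padding) convolution that sums over channels.

    img: C x H x W nested lists; ker: C x kH x kW nested lists.
    Returns an (H - kH + 1) x (W - kW + 1) map of depth 1.
    """
    C, H, W = len(img), len(img[0]), len(img[0][0])
    kH, kW = len(ker[0]), len(ker[0][0])
    out = [[0.0] * (W - kW + 1) for _ in range(H - kH + 1)]
    for i in range(H - kH + 1):
        for j in range(W - kW + 1):
            s = 0.0
            for c in range(C):          # summing over channels here
                for u in range(kH):
                    for v in range(kW):
                        s += img[c][i + u][j + v] * ker[c][u][v]
            out[i][j] = s
    return out

img = [[[1.0] * 5 for _ in range(5)] for _ in range(3)]  # shape (3, 5, 5)
ker = [[[1.0] * 3 for _ in range(3)] for _ in range(3)]  # shape (3, 3, 3)
out = conv_channels_summed(img, ker)
print(len(out), len(out[0]), out[0][0])  # 3 3 27.0
```

With all-ones inputs, every output value is 27.0 (3 channels × 9 kernel taps), and the result is a single 3×3 map of depth 1, as the question expects.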
...ANSWER
Answered 2020-Jan-17 at 17:21 No, scipy doesn't skip the summation of channels. The reason you get a (3, 5, 5) output is that ndimage.convolve pads the input array along all axes and then performs the convolution in "same" mode (i.e., the output has the same shape as the input, centered with respect to the output of the "full"-mode correlation). See scipy.signal.convolve for more detail on modes.
For your input of shape (3, 5, 5) and filter w0 of shape (3, 3, 3), the input is padded, resulting in a (7, 9, 9) array. See below (for simplicity I use constant padding with 0's):
QUESTION
Let's say the input to an intermediate CNN layer is of size 512×512×128, and in the convolutional layer we apply 48 7×7 filters at stride 2 with no padding. I want to know the size of the resulting activation map.
I checked some previous posts (e.g., here or here) that point to this Stanford course page. The formula given there is (W − F + 2P)/S + 1 = (512 − 7)/2 + 1, which would imply that this setup is not possible, as the value we get is not an integer.
However, if I run the following snippet in Python 2.7, the code seems to suggest that the size of the activation map was computed via (512 − 6)/2, which makes sense but does not match the formula above:
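For reference, the two computations can be reconciled by flooring the division, which is what frameworks do in practice; a quick check with the sizes from the question (a sketch, not the asker's original snippet):

```python
def conv_out(W, F, P=0, S=1):
    # floor((W - F + 2P) / S) + 1, as frameworks compute it
    return (W - F + 2 * P) // S + 1

print(conv_out(512, 7, P=0, S=2))  # 253
print((512 - 6) // 2)              # 253 -- the two computations agree
```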
...ANSWER
Answered 2019-Dec-16 at 08:14
QUESTION
I have retrained a VGG16 classifier and want to show the class activation map. Unfortunately this only works with some pictures, even though the images are preprocessed. It is only a binary classifier.
I have seen that some pictures are not at the desired width and height, despite setting target_size while loading the image. Manual resizing did not help either. z has the desired shape.
...ANSWER
Answered 2019-Aug-11 at 20:50 The issue was connected to preprocess_input from applications.vgg16. Setting
QUESTION
I am reading multiple conflicting Stack Overflow posts and I'm really confused about what the reality is.
My question is the following. If I trained an FCN on 128x128x3 images, is it possible to feed it A) an image of size 256x256x3, or B) 128x128, or C) neither, since the inputs have to be the same during training and testing?
Consider SO post #1. In this post, it suggests that the images have to be the same dimensions during input and output. This makes sense to me.
SO post #2: In this post, it suggests that we can forward a different sized image during test time and if you do some weird squeeze operations, this becomes possible. Not sure at all how this is possible.
SO post #3: In this post, it suggests that only the depth needs to be the same, not the height and width. How is this possible?
The bottom line, as I understand it: if I trained on 128x128x3, then from the input layer to the first conv layer, (1) there is a fixed number of strides that take place, consequently (2) a fixed feature map size, and accordingly (3) a fixed number of weights. If I suddenly change the input image size to 512x512x3, there's no way the feature maps from training and testing are even comparable, due to the difference in size, UNLESS:
- When I input an image of size 512x512, only the top 128x128 is considered and the rest of the image is ignored.
- The 512x512 image is resized before being fed to the network.
Can someone clarify this? As you can see, there are multiple posts on this with no canonical answer. Hence, a community-aided answer that everyone agrees on would be very helpful.
...ANSWER
Answered 2019-Aug-05 at 07:17 Here's my breakdown.
Post 1: Yes, this is the standard way to do things. If you have variable-sized inputs, you crop/pad/resize them so that your inputs are all the same size.
Post 2: Note that this person is talking about a "fully convolutional network", not a "fully connected network". In a fully convolutional network, all the layers are convolution layers, and convolution layers have no issue consuming arbitrarily sized (width and height) inputs as long as the channel dimension is fixed.
The need for a fixed input size arises in standard convolutional networks because of the "flattening" done before feeding the convolution output to fully connected layers. So if you get rid of the fully connected layers (i.e., fully convolutional networks), you don't have that problem.
Post 3: It is saying basically the same thing as Post 2 (in my eyes). To summarize: if your convolutional network has a fully connected layer and you try to feed it variable-sized inputs, you'll get a RuntimeError. But if you have a convolutional output and you feed it a 7x7x512 (h x w x channels) input, you'll get a (1x1xout_channels) output, whereas an 8x8x512 input yields a (2x2xout_channels) output (because of the convolution operation).
The bottom line is that if your network has fully connected layers somewhere, you cannot directly feed variable-sized inputs (without pad/crop/resize), but if your network is fully convolutional, you can.
One thing I don't know and can't comment on is, when the probability map is [None, n, n, num_classes] sized (as in Post #2), how to bring that to [None, 1, 1, num_classes], as you need to do that to perform tf.squeeze.
Edit 1: How the convolution kernel/input/output behave
I am adding this section to clarify how the input/output/kernel of a convolution operation behave when the input size changes. As you can see, a change in the input will change the output size (that is, the height and width dimensions). But the kernel (which is of shape [height x width x in_channels x out_channels]) will not be affected by this change.
Hope this makes sense.
QUESTION
I have implemented a paper's CNN architecture in both Keras and PyTorch, but the Keras implementation is much more efficient: it takes 4 GB of GPU memory for training with 50,000 samples and 10,000 validation samples, while the PyTorch one takes all 12 GB of GPU memory and I can't even use a validation set! The optimizer for both is SGD with momentum, with the same settings for both. More info about the paper: [architecture]: https://github.com/Moeinh77/Lightweight-Deep-Convolutional-Network-for-Tiny-Object-Recognition/edit/master/train.py
pytorch code :
...ANSWER
Answered 2019-Mar-27 at 04:17 Edit: on a closer look, acc doesn't seem to require gradient, so this paragraph probably doesn't apply.
It looks like the most significant issue is that total_train_acc accumulates history across the training loop (see https://pytorch.org/docs/stable/notes/faq.html for details). Changing total_train_acc += acc to total_train_acc += acc.item() should fix this.
Another thing: you should use with torch.no_grad() for the validation loop.
Not really about speed, but model.train() and model.eval() should be used for training/evaluation to make batchnorm and dropout layers work in the correct mode.
QUESTION
I have a question about tf.layers.conv3d. If I have understood correctly, it takes an input of shape
(Batch x depth x height x width x channels)
where channels should be only one; and given a filter shape (depth x height x width), it creates #filters different filters of the same shape to produce #filters output channels, and convolves them with the input to obtain an output of shape
(Batch x out_depth x out_height x out_width x num_filters)
First of all, am I right so far? The question is: it appears to me that this layer does not obey the law binding the input, output, filter, and stride shapes of convolutional layers, which should be:
(W − F + 2P)/S + 1
as described here. Instead, the output depth, height, and width are always the same as the input's. What is happening? Thanks for the help!
...ANSWER
Answered 2019-Feb-11 at 11:13 Kind of true, but if the input shape, filter shape, and strides are:
[Batch, depth, height, width, channels]
[filter_depth, filter_height, filter_width, in_channels, out_channels]
[1, s1, s2, s3, 1]
then the output shape is:
[Batch, int(depth/s1), int(height/s2), int(width/s3), out_channels]
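For reference, with 'same' padding TensorFlow computes each spatial output dimension as ceil(input/stride), which coincides with int(input/stride) whenever the stride divides the dimension evenly; a quick sketch:

```python
import math

def same_out_shape(spatial, strides):
    # TF 'SAME' padding: out = ceil(in / stride) for each spatial dim
    return [math.ceil(n / s) for n, s in zip(spatial, strides)]

print(same_out_shape([16, 64, 64], [1, 2, 2]))  # [16, 32, 32]
print(same_out_shape([5, 5, 5], [1, 2, 2]))     # [5, 3, 3]
```

With all strides equal to 1, the output spatial shape equals the input's, which is why the questioner saw identical shapes.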
QUESTION
https://github.com/kuangliu/pytorch-cifar/blob/master/models/resnet.py
From reading https://www.cs.toronto.edu/~kriz/cifar.html, the CIFAR dataset consists of images, each of dimension 32x32.
My understanding of code :
...ANSWER
Answered 2018-Dec-30 at 06:11 You do not provide enough information in your question (see my comment). However, if I have to guess, then you have two pooling layers (with stride 2) in between your convolution layers:
- Input size 32x32 (3 channels).
- conv1 output size 28x28 (6 channels): a convolution with no padding and kernel size 5 reduces the input size by 4.
- Pooling layer with stride 2, output size 14x14 (6 channels).
- conv2 output size 10x10 (16 channels).
- Another pooling layer with stride 2, output size 5x5 (16 channels).
- A fully connected layer (nn.Linear) connecting all 5x5x16 inputs to all 120 outputs.
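The size bookkeeping in the list above can be reproduced with two small helpers (a sketch following the answer's guessed architecture):

```python
def conv(n, k):  # valid convolution, stride 1: reduces size by k - 1
    return n - k + 1

def pool(n):     # pooling with stride 2: halves the size
    return n // 2

n = 32                    # input 32x32
n = conv(n, 5); print(n)  # 28 (conv1)
n = pool(n);    print(n)  # 14
n = conv(n, 5); print(n)  # 10 (conv2)
n = pool(n);    print(n)  # 5 -> flatten 5 * 5 * 16 = 400 inputs to nn.Linear
```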
A more thorough guide for estimating the receptive field can be found here.
QUESTION
Here is code I wrote to perform a single convolution and output the shape.
Using the formula from http://cs231n.github.io/convolutional-networks/ to calculate the output size:
"You can convince yourself that the correct formula for calculating how many neurons "fit" is given by (W−F+2P)/S+1."
The formula for computing the output size has been implemented below as
...ANSWER
Answered 2018-Dec-17 at 01:58 The problem is that your input image is not square, so you should apply the formula to the width and the height of the input image separately.
Also, you should not use nb_channels in the formula, because we explicitly define how many channels we want in the output.
And you should use f=kernel_size, not f=kernel_size*kernel_size,
as described in the formula.
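The answer's point can be illustrated by running the formula on each axis separately (hypothetical sizes, not the asker's code):

```python
def conv_out(n, f, p=0, s=1):
    # (W - F + 2P) / S + 1, applied per spatial dimension
    return (n - f + 2 * p) // s + 1

# A non-square 9x7 input with a 3x3 kernel: apply the formula to the
# height and the width independently; the output channel count comes
# from the number of filters, not from this formula.
h, w, f = 9, 7, 3
print(conv_out(h, f), conv_out(w, f))  # 7 5
```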
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
Vulnerabilities
No vulnerabilities reported
Install Convolutional-Network
You can use Convolutional-Network like any standard Python library. You will need a development environment consisting of a Python distribution (including header files), a compiler, pip, and git. Make sure that your pip, setuptools, and wheel are up to date. When using pip, it is generally recommended to install packages in a virtual environment to avoid changes to the system.