Image-Segmentation | Image segmentation application | Machine Learning library
kandi X-RAY | Image-Segmentation Summary
Image segmentation application for identifying background and foreground objects using Gaussian and Markov Random Field probabilistic models, Edmonds-Karp/Dinic max-flow algorithms, and minimum cut.
Community Discussions
Trending Discussions on Image-Segmentation
QUESTION
I want to remove the image background with OpenCV in Android. The code is working, but the output quality is not as expected. I followed the Java documentation for code reference:
https://opencv-java-tutorials.readthedocs.io/en/latest/07-image-segmentation.html
Thanks
My code snippet in Android:
...ANSWER
Answered 2021-May-11 at 02:14
The task, as you have seen, is not trivial at all. OpenCV has a segmentation algorithm called "GrabCut" that tries to solve this particular problem. The algorithm is pretty good at classifying background and foreground pixels, but it needs very specific information to work. It can operate in two modes:
1st Mode (Mask Mode): Using a Binary Mask (same size as the original input) where 100% definite background pixels are marked, as well as 100% definite foreground pixels. You don't have to mark every pixel on the image, just a region where you are sure the algorithm will find either class of pixels.
2nd Mode (Foreground ROI): Using a bounding box that encloses 100% definite foreground pixels.
Now, I use the notation "100% definite" to label those pixels you are 100% sure correspond to either the background or the foreground. The algorithm classifies pixels into four possible classes: "Definite Background", "Probable Background", "Definite Foreground" and "Probable Foreground". It will predict both Probable Background and Probable Foreground pixels, but it needs a priori information on where to find at least "Definite Foreground" pixels.
With that said, we can use GrabCut in its 2nd mode (rectangle ROI) to try to segment the input image. We can try to get a first, rough, binary mask of the input. This will mark where we are sure the algorithm can find foreground pixels. We will feed this rough mask to the algorithm and check out the results. Now, the method is not easy and its automation is not straightforward; there is some manual information we will set that works particularly well for this input image. I don't know the Java implementation of OpenCV, so I'm giving you the solution in Python. Hopefully you will be able to port it. This is the general outline of the algorithm:
- Get a first rough mask of the foreground object via thresholding
- Detect contours on the rough mask to retrieve a bounding rectangle
- The bounding rectangle will serve as input ROI for the GrabCut algorithm
- Set the parameters needed for the GrabCut algorithm
- Clean the segmentation mask obtained by GrabCut
- Use the segmentation mask to finally segment the foreground object
The code follows this outline.
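A minimal Python/OpenCV sketch of the pipeline (the thresholding mode, GrabCut iteration count, and kernel size are illustrative choices, not the answer's exact listing):

import cv2
import numpy as np

image = cv2.imread("input.png")              # placeholder file name

# 1. Rough foreground mask via Otsu thresholding on the grayscale image
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
_, rough_mask = cv2.threshold(gray, 0, 255,
                              cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

# 2./3. Bounding rectangle of the largest contour -> GrabCut ROI
contours, _ = cv2.findContours(rough_mask, cv2.RETR_EXTERNAL,
                               cv2.CHAIN_APPROX_SIMPLE)
rect = cv2.boundingRect(max(contours, key=cv2.contourArea))

# 4. Run GrabCut in rectangle mode with its required model buffers
gc_mask = np.zeros(image.shape[:2], np.uint8)
bg_model = np.zeros((1, 65), np.float64)
fg_model = np.zeros((1, 65), np.float64)
cv2.grabCut(image, gc_mask, rect, bg_model, fg_model, 5,
            cv2.GC_INIT_WITH_RECT)

# 5. Keep definite + probable foreground; clean with a morphological opening
out_mask = np.where((gc_mask == cv2.GC_FGD) | (gc_mask == cv2.GC_PR_FGD),
                    255, 0).astype(np.uint8)
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
out_mask = cv2.morphologyEx(out_mask, cv2.MORPH_OPEN, kernel)

# 6. Segment the foreground object with the cleaned mask
result = cv2.bitwise_and(image, image, mask=out_mask)
cv2.imwrite("segmented.png", result)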
QUESTION
I am trying to adapt the following k-means image-segmentation code: https://www.thepythoncode.com/article/kmeans-for-image-segmentation-opencv-python
My question is: is there a way to write a loop such that, for each cluster used, you get a new image with that part of the image blacked out?
I was trying for example this:
...ANSWER
Answered 2021-May-10 at 17:51
In your approach, I think if you just change labels == cluster to labels != cluster, it should work.
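A sketch of that loop in context (following the tutorial's k-means setup; k and the file names are placeholders):

import cv2
import numpy as np

image = cv2.imread("input.png")              # placeholder file name
pixels = image.reshape(-1, 3).astype(np.float32)

k = 5
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 100, 0.2)
_, labels, _ = cv2.kmeans(pixels, k, None, criteria, 10,
                          cv2.KMEANS_RANDOM_CENTERS)
labels = labels.flatten()

for cluster in range(k):
    masked = pixels.copy()
    masked[labels == cluster] = [0, 0, 0]    # black out this cluster
    # (use labels != cluster instead to keep only this cluster)
    out = masked.reshape(image.shape).astype(np.uint8)
    cv2.imwrite(f"cluster_{cluster}_blacked.png", out)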
However, here is another way in Python/OpenCV.
QUESTION
I want to solve a multiclass segmentation task using deep learning (in Python). Here is a summary of a vgg_unet model, mostly collected from GitHub. In my dataset, 8 labels are available, so the last convolution layer has 8 channels for the categorical classification of every class. The summary of my model is below:
...ANSWER
Answered 2020-Aug-30 at 13:22
There will be no issue if you do not reshape; in fact, the reshape operation is not necessary and, in this case, is redundant.
I asked myself the same question when I began delving deeper into image segmentation. Most repositories omit this step; some reshape and only then add the sigmoid/softmax activation.
In my experience, I have seen no mathematical advantage, better results, or strong reason for implementing the reshape, so I see no problem with omitting it in your code.
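To illustrate the equivalence, a Keras sketch with assumed shapes (softmax acts on the last axis in both variants, so the per-pixel probabilities are identical):

from tensorflow.keras import layers

n_classes = 8                                   # assumed label count

# Variant A: softmax applied directly to the (H, W, n_classes) map
def head_direct(feature_map):
    return layers.Activation("softmax")(feature_map)

# Variant B: reshape to (H*W, n_classes) first, then softmax;
# the reshape changes only the layout, not the per-pixel values
def head_reshaped(feature_map, h, w):
    x = layers.Reshape((h * w, n_classes))(feature_map)
    return layers.Activation("softmax")(x)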
QUESTION
ANSWER
Answered 2020-Jun-11 at 10:43
You'll have to decide whether you want an RGB or grayscale input for your images: either convert your images to grayscale or change the conv layer. Another option would be to flatten the 256x256x3 input to one dimension and use that as input.
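For illustration, a PyTorch sketch of the three options (the framework, layer sizes, and tensor names are assumptions, since the original code is not shown):

import torch
import torch.nn as nn

rgb = torch.randn(1, 3, 256, 256)          # a batch of one 256x256 RGB image

# Option 1: convert to grayscale before a single-channel model
gray = rgb.mean(dim=1, keepdim=True)       # -> (1, 1, 256, 256)

# Option 2: widen the first conv layer to accept 3 input channels
conv1 = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1)

# Option 3: flatten the whole input to one dimension
flat = rgb.flatten(start_dim=1)            # -> (1, 196608)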
QUESTION
I have a UNet++ (view in private; code for the model is at the bottom of the article) which I'm trying to reconfigure. I'm getting some artifacts in some images, so I'm following this article, which suggests doing upsampling followed by a convolution operation.
I'm replacing the up-sample layers with the sequential operation shown below, but my model isn't learning. I suspect it's to do with how I've configured the channels, so I'd like another opinion.
Old up-sample operation:
...ANSWER
Answered 2020-Jun-08 at 13:01
"Do these two operations have the same output/function within my model?"
If out_ch*2 == in_ch, then yes, they have the same output shape. If the input x is the output of a BatchNorm+ReLU op, then they could be even more similar.
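For concreteness, a PyTorch sketch of the two operations (channel sizes are assumptions, chosen so that out_ch*2 == in_ch):

import torch
import torch.nn as nn

in_ch, out_ch = 64, 32                     # assumed sizes, out_ch*2 == in_ch

# Old op: learned up-sampling in a single transposed convolution
old_up = nn.ConvTranspose2d(in_ch, out_ch, kernel_size=2, stride=2)

# New op: fixed up-sampling followed by a convolution, which the
# article recommends to reduce checkerboard artifacts
new_up = nn.Sequential(
    nn.Upsample(scale_factor=2, mode="bilinear", align_corners=True),
    nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
)

x = torch.randn(1, in_ch, 16, 16)
print(old_up(x).shape, new_up(x).shape)    # both: (1, 32, 32, 32)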
QUESTION
Problem Statement: I have an image, and a pixel of the image can belong to only (either) one of 'Band5', 'Band6', or 'Band7' (see below for details). Hence, I have a PyTorch multi-class problem, but I am unable to understand how to set the targets, which need to be in the form [batch, w, h].
My dataloader return two values:
...ANSWER
Answered 2020-May-20 at 05:34
If I understand correctly, your current "target" is [batch_size, channels, w, h] with channels==3, as you have three possible targets.
What do the values in your target represent? You basically have a 3-vector target for each pixel - are these the expected class probabilities? Are they "one-hot" vectors indicating the correct "band"?
If so, you can get the target indices by simply taking the argmax along the target channel dimension:
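A sketch of that step (shapes are assumed; the resulting index tensor is the [batch, w, h] long tensor that, e.g., nn.CrossEntropyLoss expects):

import torch

# Assumed shapes: batch of 4, three band channels, 64x64 images
target_onehot = torch.zeros(4, 3, 64, 64)
target_onehot[:, 0] = 1                    # e.g. every pixel is 'Band5'

# Collapse the channel dimension into class indices
target = target_onehot.argmax(dim=1)       # shape (4, 64, 64), dtype int64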
QUESTION
I'm using image-segmentation on some images, and sometimes it would be nice to be able to plot the borders of the segments.
I have a 2D NumPy array that I plot with Matplotlib, and the closest I've gotten is contour plotting. This cuts the corners in the array, but is otherwise perfect.
Can Matplotlib's contour-function be made to only plot vertical/horizontal lines, or is there some other way to do this?
An example can be seen here:
...ANSWER
Answered 2020-Feb-07 at 14:53
I wrote some functions to achieve this some time ago, but I would be glad to find out how it can be done more quickly.
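One possible approach, sketched below (array contents are placeholders): compare each pixel with its right and lower neighbours, and draw a unit-length axis-aligned line wherever the labels differ.

import numpy as np
import matplotlib.pyplot as plt

labels = np.random.randint(0, 3, (20, 20))   # placeholder segmentation
fig, ax = plt.subplots()
ax.imshow(labels, cmap="tab10")

# Vertical border segments: neighbours differ along the x direction
ys, xs = np.nonzero(labels[:, 1:] != labels[:, :-1])
for y, x in zip(ys, xs):
    ax.plot([x + 0.5, x + 0.5], [y - 0.5, y + 0.5], color="k")

# Horizontal border segments: neighbours differ along the y direction
ys, xs = np.nonzero(labels[1:, :] != labels[:-1, :])
for y, x in zip(ys, xs):
    ax.plot([x - 0.5, x + 0.5], [y + 0.5, y + 0.5], color="k")

plt.show()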
QUESTION
I've been working on a segmentation problem for many days, and after finally finding out how to properly read the dataset, I ran into this problem: "ValueError: Error when checking target: expected activation_1 (Softmax) to have 3 dimensions, but got array with shape (32, 416, 608, 3)". I used the functional API, since I took the FCNN architecture from: https://github.com/divamgupta/image-segmentation-keras/blob/master/Models/FCN32.py.
It is slightly modified and adapted to my task (IMAGE_ORDERING = "channels_last", i.e. the TensorFlow backend).
Can anyone please help me?
Massive thanks in advance.
The architecture below is for FCNN, which I try to implement for the purpose of the segmentation.
Here is the architecture(after calling model.summary()).
[1]: https://i.stack.imgur.com/2Ou5z.png
[2]: https://i.stack.imgur.com/zOFAz.png
[3]: The specific error is : https://i.stack.imgur.com/DVo2k.png
[4]: "Importing the dataset" function :https://i.stack.imgur.com/UY2FE.png
[5]: "Fit_Generator method calling" :https://i.stack.imgur.com/VskLj.png
ANSWER
Answered 2018-Apr-18 at 07:59
The original code in the FCNN architecture example works with an input dimension of (416, 608), whereas in your code the input dimension is (192, 192) (ignoring the channel dimension). Now if you notice carefully, this particular layer

x = MaxPooling2D((2, 2), strides=(2, 2), name='block5_pool', data_format=IMAGE_ORDERING)(x)

generates an output of dimension (6, 6) (you can verify this in your model.summary()).
The next convolution layer

o = (Convolution2D(4096, (7, 7), activation='relu', padding='same', data_format=IMAGE_ORDERING))(o)

uses convolution filters of size (7, 7), but your input has already been reduced to a size smaller than that, i.e. (6, 6). Try fixing that first.
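To see the size arithmetic concretely (a sketch; the five stride-2 poolings of the VGG-style encoder divide each spatial dimension by 2**5 = 32):

for h, w in [(416, 608), (192, 192)]:
    print((h, w), "->", (h // 32, w // 32))
# (416, 608) -> (13, 19)   comfortably larger than the (7, 7) filter
# (192, 192) -> (6, 6)     already smaller than the (7, 7) filter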
Also, if you look at the model.summary() output, you'll notice that it does not contain the layers defined after the block5_pool layer. There is a transposed convolution layer in it (which basically upsamples your input). You may want to take a look and try to resolve that as well.
NOTE: In all my dimensions, I have ignored the channel dimension.
EDIT: Detailed answer below.
First of all, this is my keras.json file. It uses the TensorFlow backend, with image_ordering set to channels_last.
QUESTION
I am using tensorflow's object detection api with faster_rcnn_resnet101 and get the following error when trying to train:
tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found. (0) Invalid argument: Input to reshape is a tensor with 36 values, but the requested shape requires a multiple of 16
[[{{node Reshape_13}}]]
[[IteratorGetNext]]
[[IteratorGetNext/_7243]]
(1) Invalid argument: Input to reshape is a tensor with 36 values, but the requested shape requires a multiple of 16
[[{{node Reshape_13}}]]
[[IteratorGetNext]]
0 successful operations. 0 derived errors ignored.
I am using a slightly modified version of the pets-train.sh file to run the training (only paths have been altered). I am trying to train on tf.record files containing jpg images of size (1280, 720) and have made no changes to the network architecture (I have confirmed that all images in the record are of this size).
Curiously, I can successfully run inference on these images when I do something equivalent to what's in the tutorial file detect_pets.py. This makes me think something is wrong with the way I've created the tf.record files (code below) rather than anything to do with the shape of the images, despite the error having to do with reshape. However, I've successfully trained on tf.records created in the same way before (from images of size (600, 600), (1024, 1024), and (720, 480), all with the same network). Moreover, I've previously encountered a similar error (only the numbers were different, but the error was still with node Reshape_13) on a different dataset of images with size (600, 600).
I am using python 3.7, tf version 1.14.0, cuda 10.2, Ubuntu 18.04
I've looked extensively at various other posts (here, here, here, here, and here) but I wasn't able to make any progress.
I've tried adjusting the keep_aspect_ratio_resizer parameters (originally min_dimension=600, max_dimension=1024 but I've also tried min, max = (720, 1280) and have tried pad_to_max_dimension: true with both of these min/max choices as well).
This is the code I'm using to create the tf.record file (apologies for the indentation being off here):
...ANSWER
Answered 2019-Sep-17 at 22:41
I'm an idiot: confirmed.
The problem was that classes_text, classes, and difficult were the wrong length.
Replaced
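For reference, a sketch of what the per-box lists should look like (the helper name and normalized box format are assumptions; the feature keys are the standard object-detection-API ones), with every list holding exactly one entry per ground-truth box:

import tensorflow as tf

def make_example(encoded_jpg, boxes, classes_text, classes, difficult):
    # Mismatched list lengths surface later as Reshape errors like the one above
    assert len(boxes) == len(classes_text) == len(classes) == len(difficult)
    xmins = [b[0] for b in boxes]
    ymins = [b[1] for b in boxes]
    xmaxs = [b[2] for b in boxes]
    ymaxs = [b[3] for b in boxes]
    feature = {
        "image/encoded": tf.train.Feature(
            bytes_list=tf.train.BytesList(value=[encoded_jpg])),
        "image/object/bbox/xmin": tf.train.Feature(
            float_list=tf.train.FloatList(value=xmins)),
        "image/object/bbox/ymin": tf.train.Feature(
            float_list=tf.train.FloatList(value=ymins)),
        "image/object/bbox/xmax": tf.train.Feature(
            float_list=tf.train.FloatList(value=xmaxs)),
        "image/object/bbox/ymax": tf.train.Feature(
            float_list=tf.train.FloatList(value=ymaxs)),
        "image/object/class/text": tf.train.Feature(
            bytes_list=tf.train.BytesList(value=classes_text)),
        "image/object/class/label": tf.train.Feature(
            int64_list=tf.train.Int64List(value=classes)),
        "image/object/difficult": tf.train.Feature(
            int64_list=tf.train.Int64List(value=difficult)),
    }
    return tf.train.Example(features=tf.train.Features(feature=feature))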
QUESTION
I am running a CNN of a regressive type which inputs and outputs images of different dimensions (so not an image-segmentation problem), based on a dataset of samples and corresponding labels. As a result, the last dense layer of my network has the height and width of the labels multiplied together. I have been training the network for a while now, and I wanted to see what the images look like to judge how good or bad my model is. Is there a function that provides me with this option, or do I have to hard-code it? How do I do it? The code for my network and its summary are attached below.
Layer (type)                 Output Shape              Param #
=================================================================
conv2d_1 (Conv2D)            (None, 54, 1755, 4)       20
activation_1 (Activation)    (None, 54, 1755, 4)       0
max_pooling2d_1 (MaxPooling2 (None, 18, 585, 4)        0
batch_normalization_1 (Batch (None, 18, 585, 4)        16
conv2d_2 (Conv2D)            (None, 17, 584, 8)        136
activation_2 (Activation)    (None, 17, 584, 8)        0
max_pooling2d_2 (MaxPooling2 (None, 8, 292, 8)         0
batch_normalization_2 (Batch (None, 8, 292, 8)         32
conv2d_3 (Conv2D)            (None, 7, 291, 16)        528
activation_3 (Activation)    (None, 7, 291, 16)        0
max_pooling2d_3 (MaxPooling2 (None, 3, 145, 16)        0
batch_normalization_3 (Batch (None, 3, 145, 16)        64
conv2d_4 (Conv2D)            (None, 2, 144, 32)        2080
activation_4 (Activation)    (None, 2, 144, 32)        0
max_pooling2d_4 (MaxPooling2 (None, 1, 72, 32)         0
batch_normalization_4 (Batch (None, 1, 72, 32)         128
flatten_1 (Flatten)          (None, 2304)              0
dropout_1 (Dropout)          (None, 2304)              0
dense_1 (Dense)              (None, 19316)             44523380
activation_5 (Activation)    (None, 19316)             0
=================================================================
Total params: 44,526,384
Trainable params: 44,526,264
Non-trainable params: 120
Thanks in advance!
...ANSWER
Answered 2019-Jul-29 at 11:03
From what I understand, the output of your model should represent the grayscale values of the pixels of an image with dimensions (11, 1756).
There is no need to hard-code a special function; you can simply use the standard reshape() function on the output of the model.
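A sketch of that (assumed names: model is the trained network and sample a single input; 19316 = 11 * 1756, the label height times width):

import numpy as np
import matplotlib.pyplot as plt

pred = model.predict(sample[np.newaxis, ...])   # shape (1, 19316)
img = pred.reshape(11, 1756)                    # back to image layout
plt.imshow(img, cmap="gray")
plt.show()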
Community Discussions and Code Snippets contain sources from the Stack Exchange Network.