autoencoders | Implementation of several different types of autoencoders | SDK library
kandi X-RAY | autoencoders Summary
Implementation of several different types of autoencoders in Theano.
Top functions reviewed by kandi - BETA
- Setup the dataset
- Binarizes the given labels
- Get data from file
- Tile raster images
- Scale an ndarray
- Get the outputs of the model
- Setup the output
- Plot samples
- Get the outputs for the model
autoencoders Key Features
autoencoders Examples and Code Snippets
def fit(self, X, Y, Xtest, Ytest,
        pretrain=True,
        train_head_only=False,
        learning_rate=0.1,
        mu=0.99,
        reg=0.0,
        epochs=1,
        batch_sz=100):
    # cast to float32 (the snippet assumes numpy is imported as np)
    learning_rate = np.float32(learning_rate)
Community Discussions
Trending Discussions on autoencoders
QUESTION
I'm getting following error and I'm not able to figure out why:
RuntimeError: Model-building function did not return a valid Keras Model instance, found (, )
I have read the answers here and here, which seem to be telling me to import keras from tensorflow instead of standalone keras, which I'm doing, but I'm still getting the error. I would very much appreciate your help in figuring this out. Below is my entire code:
ANSWER
Answered 2021-Feb-21 at 09:13
RuntimeError: Model-building function did not return a valid Keras Model instance, found (, )
(, )
As you can see, this is a tuple of two Keras Model instances. This is the output of create_autoencoder(hp, input_dim, output_dim).
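The usual fix is to have the tuner's model-building function return a single compiled Keras Model rather than the tuple. A minimal sketch (the hyperparameter name and layer sizes below are illustrative, not taken from the question's code):

# Hedged sketch: return one compiled Model from the build function,
# not an (autoencoder, encoder) tuple.
from tensorflow import keras
from tensorflow.keras import layers

def build_model(hp, input_dim=784):
    inputs = keras.Input(shape=(input_dim,))
    # hyperparameter-tuned bottleneck size (illustrative range)
    units = hp.Int("units", min_value=16, max_value=128, step=16)
    encoded = layers.Dense(units, activation="relu")(inputs)
    decoded = layers.Dense(input_dim, activation="sigmoid")(encoded)
    autoencoder = keras.Model(inputs, decoded)
    autoencoder.compile(optimizer="adam", loss="mse")
    return autoencoder  # a single Keras Model instance, not a tuple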
QUESTION
Autoencoders are generally used for reducing the number of dimensions: they compress the input by discarding unnecessary dimensions. They work by compressing the input into a latent-space representation and then reconstructing the output from this representation. So, how do autoencoders come to know which features are most important to retain and which are unimportant to throw away? One more question: how are autoencoders used for extracting the features of images? In a CNN, the convolutional layers are responsible for extracting image features. In an autoencoder, how, and in which layer, are the image features extracted?
...ANSWER
Answered 2021-Feb-01 at 14:51
An AutoEncoder is a particular network that tries to solve the identity problem expressed as x' = g(h(x)), where h is the encoder block and g the decoder block.
The latent space z is the minimal expression for a given input x, and it resides in the middle of the network. It's worth clarifying that different shapes reside in that space, and each one corresponds to a certain instance seen during the training phase. Using the CNN that you referred to as support: it's like a feature map, but instead of a bunch of feature maps across the network there's only one, and again, it holds different representations based on what it observed during training.
So, the question is: how does it manage to compress and decompress? Well, the data used for training has a domain and every instance has similarities (all cats have the same abstract qualities, the same for mountains; all have something in common), therefore the network learns how to fit what describes the data into smaller combined pieces, and, from smaller pieces (with ranges from 0-1), how to build bigger pieces.
Taking the same sample of cats: all of them have two ears, have fur, have two eyes and so on. I didn't mention the details, but you can think of the shape of those ears, how the fur is, and probably how big those eyes are, the colors and the brightness. Think of that listing as the latent space z and the details as the x' output.
For more details see this exhaustive explanation with the different AE variants: https://wikidocs.net/3413.
Hope that this helps you.
EDIT 1:
How and which layer extract the features of images?
Its design:
An AutoEncoder is a network designed to make it possible to compress and decompress the training data; it is not an arbitrary network at all.
First, it has the shape of a sand-clock, meaning that each layer in the encoder block has fewer neurons than the previous one, and right after the "latent space layer(s)" it starts doing the opposite, increasing the number of neurons in the decoder block until it reaches the size of the input layer (the reconstruction, hence the output).
Next, each layer is a Dense layer, meaning that all the neurons of each layer are fully connected to the next, so all the features are carried from layer to layer. The activation function of each neuron is (ideally) tanh, so all the possible outputs lie in [-1,1]; finally, the loss function tends to be the Root Mean Squared Error, which tries to tell how far the reconstruction is from the original.
A bonus to this is to normalize the input tensors, setting the mean of each feature to zero; this helps the network learn a lot, as I'll explain next.
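As an illustration of the design described above, here is a minimal sketch (not from the original answer) of such a sand-clock shaped dense autoencoder in Keras, with tanh activations and a mean-squared-error reconstruction loss (Keras has no built-in RMSE loss; the square root does not change the optimum):

from tensorflow import keras
from tensorflow.keras import layers

input_dim = 784  # e.g. flattened 28x28 MNIST digits (assumed), scaled to [-1, 1]

inputs = keras.Input(shape=(input_dim,))
# encoder block: each layer has fewer neurons than the previous one
h = layers.Dense(256, activation="tanh")(inputs)
h = layers.Dense(64, activation="tanh")(h)
z = layers.Dense(16, activation="tanh", name="latent_space")(h)
# decoder block: mirror the encoder back up to the input size
h = layers.Dense(64, activation="tanh")(z)
h = layers.Dense(256, activation="tanh")(h)
outputs = layers.Dense(input_dim, activation="tanh")(h)

autoencoder = keras.Model(inputs, outputs)
# reconstruction loss: how far x' is from x
autoencoder.compile(optimizer="adam", loss="mse")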
Words are cheap, show me the backpropagation
Remember that the values in the hidden layers are in [-1,1]? Well, this range, with the support of the weights and a bias (Wx + b), makes it possible to have on each layer a continuous combination of fewer features (values from -1 to 1, considering ALL the possible rational numbers within).
With backpropagation (driven by the loss function), the idea is to find a sweet spot of weights that turns the domain training set (say black-and-white MNIST digits, RGB cat images, and so on) into a low-dimensional continuous set (really small numbers ranging between [-1,1]) in the encoding layers; then, in the decoding layers, it tries to use the same weights (remember it is a sand-clock shaped network) to emit the higher-dimensional representation of the previous [-1,1] combination.
An analogy
To put this into a kind of game, two persons are back to back, one looking through a window and the other with a whiteboard in front. The first looks outside, sees a flower with all its details, and says "sunflower" (the latent space); the second person hears that and draws a sunflower with all the colors and details that they learned in the past.
A real world sample please
Continuing with the sunflower analogy, imagine the same case, but your input image (tensor) has noise (you know, glitchy). The AutoEncoder was trained with high-quality images, so it's capable of compressing the sunflower concept and then reconstructing it. What happened to the glitch? The network encoded the sunflower colors, shape, and background (let's say a blue sky), the decoder reconstructed it, and the glitch was left behind as residual. And this is a Denoising AutoEncoder, one of the many applications of this network.
QUESTION
I am trying to get an LSTM autoencoder to recreate its inputs. So far I have:
...ANSWER
Answered 2021-Jan-19 at 02:32
The time-distributed dense layer is, as the name suggests, just an ordinary dense layer that is applied to every temporal slice of an input; you can think of it as a special form of RNN cell, i.e. one without a recurrent hidden state.
So you can use any layer that is time-distributed as your output layer for an autoencoder that deals with time-distributed inputs, e.g. an RNN layer with an LSTM cell, GRU cell or simple RNN cell, or a time-distributed dense layer. In the research paper that proposes the LSTM-Autoencoder, the basic model for reconstructing a sequence of vectors (image patches or features) only uses one LSTM layer in both the encoder and the decoder; the model structure is:
The following is an example of using a time-distributed dense layer in the decoder:
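The answer's code block was not captured on this page, so here is a minimal sketch of that decoder pattern, assuming Keras and illustrative shapes:

from tensorflow import keras
from tensorflow.keras import layers

timesteps, n_features, latent_dim = 30, 1, 16  # assumed shapes

inputs = keras.Input(shape=(timesteps, n_features))
# encoder: keep only the last hidden state as the sequence summary
encoded = layers.LSTM(latent_dim)(inputs)
# decoder: repeat the summary for every timestep, then decode step by step
repeated = layers.RepeatVector(timesteps)(encoded)
decoded = layers.LSTM(latent_dim, return_sequences=True)(repeated)
# time-distributed dense output layer applied to every temporal slice
outputs = layers.TimeDistributed(layers.Dense(n_features))(decoded)

autoencoder = keras.Model(inputs, outputs)
autoencoder.compile(optimizer="adam", loss="mse")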
QUESTION
Specifically, what spurred this question is the return_sequences argument of TensorFlow's version of an LSTM layer.
The docs say:
Boolean. Whether to return the last output. in the output sequence, or the full sequence. Default: False.
I've seen some implementations, especially autoencoders that use this argument to strip everything but the last element in the output sequence as the output of the 'encoder' half of the autoencoder.
Below are three different implementations. I'd like to understand the reasons behind the differences, as they seem like very large differences, but all call themselves the same thing.
Example 1 (TensorFlow): This implementation strips away all outputs of the LSTM except the last element of the sequence, and then repeats that element some number of times to reconstruct the sequence:
...ANSWER
Answered 2020-Dec-08 at 15:43
There is no official or correct way of designing the architecture of an LSTM-based autoencoder... The only specifics the name provides is that the model should be an autoencoder and that it should use an LSTM layer somewhere.
The implementations you found are each different and unique on their own even though they could be used for the same task.
Let's describe them:
TF implementation:
- It assumes the input has only one channel, meaning that each element in the sequence is just a number and that this is already preprocessed.
- The default behaviour of the LSTM layer in Keras/TF is to output only the last output of the LSTM; you can set it to output all the output steps with the return_sequences parameter.
- In this case the input data has been shrunk to (batch_size, LSTM_units).
- Consider that the last output of an LSTM is of course a function of the previous outputs (specifically if it is a stateful LSTM)
- It applies a Dense(1) in the last layer in order to get the same shape as the input.
PyTorch 1:
- They apply an embedding to the input before it is fed to the LSTM.
- This is standard practice and it helps for example to transform each input element to a vector form (see word2vec for example where in a text sequence, each word that isn't a vector is mapped into a vector space). It is only a preprocessing step so that the data has a more meaningful form.
- This does not defeat the idea of the LSTM autoencoder, because the embedding is applied independently to each element of the input sequence, so it is not encoded when it enters the LSTM layer.
PyTorch 2:
- In this case the input shape is not (seq_len, 1) as in the first TF example, so the decoder doesn't need a dense layer after it. The author used a number of units in the LSTM layer equal to the input shape.
In the end you choose the architecture of your model depending on the data you want to train on, specifically: the nature (text, audio, images), the input shape, the amount of data you have and so on...
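To make the return_sequences behaviour mentioned in the TF bullets concrete, here is a small shape check (a sketch, assuming TensorFlow 2.x):

import numpy as np
import tensorflow as tf

x = np.random.rand(4, 10, 1).astype("float32")  # (batch, timesteps, channels)

last_only = tf.keras.layers.LSTM(8)(x)                        # default: return_sequences=False
full_seq = tf.keras.layers.LSTM(8, return_sequences=True)(x)  # one output per timestep

print(last_only.shape)  # (4, 8)      -> only the last output: (batch_size, LSTM_units)
print(full_seq.shape)   # (4, 10, 8)  -> the full output sequence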
QUESTION
I want to use my VAE trained on an image dataset as a feature extractor for another task, so that I could for example replace a ResNet for feature extraction with my VAE. Which Layers do I use for this?
With "standard" autoencoders you just take the encoding network, but since the latent layer of the VAE consist of mean and distribution I do not know which layers I should use for feature extraction.
Does somebody know how to use a VAE as a feature extractor and what to consider with using different components?
...ANSWER
Answered 2020-Nov-10 at 17:44
The hidden variables z are used in VAEs as the extracted features for dimensionality reduction. Here is an example of dimensionality reduction from four features in the original space ([x1,x2,x3,x4]) to two features in the reduced space ([z1,z2]) (source):
Once you have trained the model, you can pass a sample to the encoder and it extracts the features. You may find a Keras implementation example on MNIST data here (see the plot_label_clusters function):
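Since the linked example is not reproduced here, a minimal sketch of the idea, assuming a trained Keras encoder sub-model that returns [z_mean, z_log_var, z] as in that example:

import numpy as np
from tensorflow import keras

# assumed: `encoder` is the trained encoder sub-model of the VAE and
# returns [z_mean, z_log_var, z] for a batch of images
def extract_features(encoder, images):
    z_mean, z_log_var, z = encoder.predict(images)
    # use the deterministic mean of q(z|x) as the feature vector;
    # sampling z adds noise you usually don't want for downstream tasks
    return z_mean

# usage sketch: features = extract_features(encoder, x_batch)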
QUESTION
How does one suppress certain outliers in Anomaly detection?
We built a model using autoencoders and it has detected anomalies. Some of the data points which are flagged as anomalies (outside the normal distribution) are not actually anomalies.
How do we train the model to not recognize these as anomalies ?
Do we add multiple duplicates of these data points into the dataset and then train again, or are there any other techniques we can apply here?
Here the normal distribution is of Cosine Similarity (distance) since data points are vectorized representations of text data (log entries). So if the cosine distance between the input and reconstructed vector does not fall under the normal distribution it is treated as anomaly.
...ANSWER
Answered 2020-Oct-28 at 09:38
Since the anomaly detector is usually trained unsupervised, it can be hard to incorporate labels directly into that process without losing outlier detection properties. A simple alternative is to take the instances that were marked as anomalies and put them into a classifier that classifies them as "real anomaly" vs "not real anomaly". This classifier would be trained on prior anomalies that have been labeled. It can be either binary classification, or one-class with respect to known "not real" samples. A simple starting point would be k-Nearest-Neighbours or a domain-specific distance function. The classifier can use the latent feature vector as input, or do its own feature extraction.
This kind of system is described in Anomaly Detection with False Positive Suppression (relayr.io). The same basic idea is used in this paper to minimize False Negative Rate: SNIPER: Few-shot Learning for Anomaly Detection to Minimize False-negative Rate with Ensured True-positive Rate
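A rough sketch of such a second-stage suppressor, assuming you keep the latent vectors of previously labelled alerts (all file and variable names below are hypothetical):

import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# assumed data: latent vectors of past alerts and their human labels
# (1 = real anomaly, 0 = false positive)
labelled_vectors = np.load("past_alert_vectors.npy")  # hypothetical file
labels = np.load("past_alert_labels.npy")             # hypothetical file

suppressor = KNeighborsClassifier(n_neighbors=5)
suppressor.fit(labelled_vectors, labels)

# at inference time: keep only alerts the detector flagged AND the
# suppressor does not classify as a known false positive
def filter_alerts(flagged_vectors):
    keep = suppressor.predict(flagged_vectors) == 1
    return flagged_vectors[keep]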
QUESTION
We are trying to build an anomaly detection model for application logs.
The preprocessing is already completed where we have built our own word2vec model which was trained on application log entries.
Now we have training data of 1.5M rows * 100 columns, where each row is the vectorized representation of a log entry (the length of each vector is 100, hence 100 columns).
The problem is that most of the anomaly detection algorithms (LOF, SOS, SOD, SVM) do not scale to this amount of data. We reduced the training size to 500K but these algorithms still hang. SVM, which performed best on POC sample data, does not have an n_jobs option to run it on multiple cores.
Some algorithms are able to finish such as Isolation Forest (with low n_estimators), Histogram and Clustering. But these are not able to detect the anomalies which we purposely put in the training data.
Does anyone have an idea on how do we run the Anomaly detection algorithm for large datasets ?
Could not find any option for batch training in standard anomaly detection techniques. Shall we look into neural nets (autoencoders)?
Selecting Best Model:
Given this is unsupervised learning, the approach we are taking for selecting a model is the following:
In the log entries training data, insert an entry from a novel (say Lord of the Rings). The vector representation of this log entry would be different from the rest of the log entries.
While running the dataset on various Anomaly detection algorithms, see which ones were able to detect the entry from the novel (which is an anomaly).
This approach worked when we tried to run anomaly detection on a very small dataset (1000 entries) where the log files were vectorized using the google provided word2vec model.
Is this approach a sound one? We are open to other ideas as well. Given it's an unsupervised learning algorithm, we had to put in an anomalous entry and see which model was able to identify it.
The contamination ratio put in is 0.003.
...ANSWER
Answered 2020-Oct-19 at 09:17
From your explanation, it seems that you are approaching a novelty detection problem. Novelty detection problems are usually semi-supervised (exceptions and approaches can vary).
Now, the problem with the huge matrix size can be solved if you use batch processing. This can help you: https://scikit-learn.org/0.15/modules/scaling_strategies.html
Finally, yes, if you can use deep learning, your problem can be solved in a much better way using either unsupervised or semi-supervised learning (I recommend the latter).
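If you do look into autoencoders, here is a minimal sketch (my own illustration, not part of the answer) for the 100-dimensional log vectors; Keras already trains in mini-batches, so the 1.5M rows are not a problem:

import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# X: float32 array of shape (n_rows, 100) holding the word2vec log vectors (assumed)
def build_detector(input_dim=100):
    inputs = keras.Input(shape=(input_dim,))
    h = layers.Dense(32, activation="relu")(inputs)
    z = layers.Dense(8, activation="relu")(h)
    h = layers.Dense(32, activation="relu")(z)
    outputs = layers.Dense(input_dim)(h)
    model = keras.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="mse")
    return model

# train in mini-batches, then flag rows with extreme reconstruction error:
# model = build_detector()
# model.fit(X, X, batch_size=256, epochs=10)
# errors = np.mean((model.predict(X) - X) ** 2, axis=1)
# threshold = np.quantile(errors, 1 - 0.003)  # matches the stated contamination ratio
# anomalies = errors > threshold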
QUESTION
I'm following this tutorial on Building Autoencoders in Keras on MNIST handwritten digits. Here is the code below:
...ANSWER
Answered 2020-Sep-22 at 18:36
On the first loop, i==0 because range(10) is [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]. You can't use 0 as an index for the subplots, which causes that error. You should instead use i+1 in your plt.subplot() to get the correct axis.
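A sketch of the corrected loop, assuming the tutorial's 2x5 grid of digit plots (the x_test below is just a stand-in for the tutorial's data):

import numpy as np
import matplotlib.pyplot as plt

# stand-in for the tutorial's test images (assumed shape: flattened 28x28)
x_test = np.random.rand(10, 28 * 28)

n = 10
plt.figure(figsize=(20, 4))
for i in range(n):
    ax = plt.subplot(2, n, i + 1)  # subplot indices start at 1, not 0
    ax.imshow(x_test[i].reshape(28, 28), cmap="gray")
    ax.axis("off")
plt.show()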
QUESTION
I recently finished the Image super-resolution using Autoencoders in Coursera and when I try to run the same code on my laptop using Spyder and Jupyter notebook, I keep getting this error. I am using Nvidia GeForce 1650Ti along with Tensorflow-gpu=2.3.0, CUDA=10.1, cuDNN=7.6.5 and python=3.8.5. I have used the same configurations for running many deep neural network problems and none of them gave this error.
Code:
...ANSWER
Answered 2020-Sep-19 at 07:26
The conv2d op raised an error message:
Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
Looking above, we see
Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3891 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1650 Ti, pci bus id: 0000:01:00.0, compute capability: 7.5)
failed to allocate 3.80G (4080218880 bytes) from device:
CUDA_ERROR_OUT_OF_MEMORY: out of memory
failed to allocate 3.42G (3672196864 bytes) from device:
CUDA_ERROR_OUT_OF_MEMORY: out of memory
So this graph would need more memory than there is available on your GeForce GTX 1650 Ti (3891 MB). Try using a smaller input image size and/or a smaller batch size.
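Beyond a smaller image size or batch size, a commonly used complementary workaround (my addition, not part of the original answer) is to let TensorFlow grow GPU memory on demand rather than pre-allocating it:

import tensorflow as tf

# allow TensorFlow to allocate GPU memory gradually instead of reserving
# the full amount at startup; must run before any GPU ops are created
for gpu in tf.config.list_physical_devices("GPU"):
    tf.config.experimental.set_memory_growth(gpu, True)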
QUESTION
I want to train a VAE on a huge dataset and decided to use VAE code made for fashion MNIST plus popular modifications for batch-loading using filenames that I found on GitHub. My research Colab notebook is here, along with a sample section of the dataset.
But the way the VAE class is written, it does not have a call function, which should be there according to the Keras documentation. I am getting the error NotImplementedError: When subclassing the Model class, you should implement a call method.
ANSWER
Answered 2020-Sep-14 at 08:01
APaul31, specifically in your code I suggest adding a call() function to the VAE class:
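The answer's code block was not captured on this page; a minimal sketch of what such a call() method could look like, assuming the VAE class already defines self.encoder and self.decoder as in the common Keras VAE example:

import tensorflow as tf

class VAE(tf.keras.Model):
    # ... existing __init__, encoder, decoder, and train_step are assumed ...

    def call(self, inputs):
        # forward pass used by Model.fit / Model.predict:
        # encode the inputs, then reconstruct from the sampled latent code
        z_mean, z_log_var, z = self.encoder(inputs)
        return self.decoder(z)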
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
Vulnerabilities
No vulnerabilities reported
Install autoencoders
You can use autoencoders like any standard Python library. You will need to make sure that you have a development environment consisting of a Python distribution including header files, a compiler, pip, and git installed. Make sure that your pip, setuptools, and wheel are up to date. When using pip it is generally recommended to install packages in a virtual environment to avoid changes to the system.