LSTM-autoencoder | TensorFlow LSTM-autoencoder implementation | Machine Learning library
kandi X-RAY | LSTM-autoencoder Summary
TensorFlow LSTM-autoencoder implementation
Top functions reviewed by kandi - BETA
- Initialize the model.
Community Discussions
Trending Discussions on LSTM-autoencoder
QUESTION
Autoencoder underfits timeseries reconstruction and just predicts average value.
Question set-up: Here is a summary of my attempt at a sequence-to-sequence autoencoder. This image was taken from this paper: https://arxiv.org/pdf/1607.00148.pdf
Encoder: Standard LSTM layer. Input sequence is encoded in the final hidden state.
Decoder: LSTM Cell (I think!). Reconstruct the sequence one element at a time, starting with the last element x[N].
Decoder algorithm is as follows for a sequence of length N (a sketch of this loop follows the list):
- Get the decoder's initial hidden state hs[N]: just use the encoder's final hidden state.
- Reconstruct the last element in the sequence: x[N] = w.dot(hs[N]) + b.
- Same pattern for the other elements: x[i] = w.dot(hs[i]) + b.
- Use x[i] and hs[i] as inputs to LSTMCell to get x[i-1] and hs[i-1].
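Below is a minimal sketch of that decoder loop, assuming PyTorch (the class name, hidden size, and other details are illustrative, not taken from the original post):

```python
# Sketch of the encoder/decoder described above; names and sizes are illustrative.
import torch
import torch.nn as nn

class Seq2SeqAE(nn.Module):
    def __init__(self, n_features=1, hidden_size=64):
        super().__init__()
        self.encoder = nn.LSTM(n_features, hidden_size, batch_first=True)
        self.decoder_cell = nn.LSTMCell(n_features, hidden_size)
        self.out = nn.Linear(hidden_size, n_features)   # x[i] = w.dot(hs[i]) + b

    def forward(self, x):                     # x: (batch, seq_len, n_features)
        _, (h, c) = self.encoder(x)           # encode the sequence into the final hidden state
        hs, cs = h.squeeze(0), c.squeeze(0)
        x_i = self.out(hs)                    # reconstruct the last element x[N]
        outputs = [x_i]
        for _ in range(x.size(1) - 1):        # walk backwards: x[N-1], ..., x[1]
            hs, cs = self.decoder_cell(x_i, (hs, cs))
            x_i = self.out(hs)
            outputs.append(x_i)
        return torch.stack(outputs[::-1], dim=1)   # flip back into forward order
```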
Here is my implementation, starting with the encoder:
...

ANSWER
Answered 2020-Dec-17 at 10:13
Okay, after some debugging I think I know the reasons.
TL;DR:
- You try to predict the next timestep value instead of the difference between the current timestep and the previous one (a small differencing sketch follows this list).
- Your hidden_features number is too small, making the model unable to fit even a single sample.
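The first point can be sketched concretely: difference the series before training and invert the differencing after reconstruction (x here is just an assumed 1-D NumPy array standing in for the original data):

```python
# Train on first differences rather than raw values, then invert afterwards.
import numpy as np

x = np.sin(np.linspace(0, 20, 300))           # stand-in series, for illustration only
diff = np.diff(x)                             # what the model would learn to reconstruct
restored = np.concatenate([[x[0]], x[0] + np.cumsum(diff)])   # undo the differencing
assert np.allclose(restored, x)
```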
Let's start with the code (model is the same):
QUESTION
Specifically, what spurred this question is the return_sequences argument of TensorFlow's version of an LSTM layer.
The docs say:
Boolean. Whether to return the last output in the output sequence, or the full sequence. Default: False.
I've seen some implementations, especially autoencoders that use this argument to strip everything but the last element in the output sequence as the output of the 'encoder' half of the autoencoder.
Below are three different implementations. I'd like to understand the reasons behind the differences, as they seem like very large differences but all call themselves the same thing.
Example 1 (TensorFlow): This implementation strips away all outputs of the LSTM except the last element of the sequence, and then repeats that element some number of times to reconstruct the sequence:
...

ANSWER
Answered 2020-Dec-08 at 15:43
There is no official or correct way of designing the architecture of an LSTM-based autoencoder... The only specifics the name provides are that the model should be an autoencoder and that it should use an LSTM layer somewhere.
The implementations you found are each different and unique on their own even though they could be used for the same task.
Let's describe them:
TF implementation (a minimal Keras sketch follows this list):
- It assumes the input has only one channel, meaning that each element in the sequence is just a number and that this is already preprocessed.
- The default behaviour of the LSTM layer in Keras/TF is to output only the last output of the LSTM; you could set it to output all the output steps with the return_sequences parameter.
- In this case the input data has been shrunk to (batch_size, LSTM_units).
- Consider that the last output of an LSTM is of course a function of the previous outputs (specifically if it is a stateful LSTM).
- It applies a Dense(1) in the last layer in order to get the same shape as the input.
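A minimal Keras sketch of the TF-style implementation described above (the layer sizes and sequence length are placeholders, not the original post's values):

```python
from tensorflow import keras
from tensorflow.keras import layers

seq_len, n_features, latent_dim = 30, 1, 16   # placeholder shapes

model = keras.Sequential([
    keras.Input(shape=(seq_len, n_features)),
    # Encoder: return_sequences defaults to False, so only the last output
    # survives and the data is shrunk to (batch_size, latent_dim).
    layers.LSTM(latent_dim),
    # Repeat that single vector seq_len times so the decoder can unroll it.
    layers.RepeatVector(seq_len),
    # Decoder: return_sequences=True emits one output per timestep.
    layers.LSTM(latent_dim, return_sequences=True),
    # Dense(1) applied per timestep recovers the single input channel.
    layers.TimeDistributed(layers.Dense(n_features)),
])
model.compile(optimizer="adam", loss="mse")
```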
PyTorch 1:
- They apply an embedding to the input before it is fed to the LSTM.
- This is standard practice, and it helps, for example, to transform each input element into vector form (see word2vec, where each word in a text sequence, not itself a vector, is mapped into a vector space). It is only a preprocessing step so that the data has a more meaningful form.
- This does not defeat the idea of the LSTM autoencoder, because the embedding is applied independently to each element of the input sequence, so it is not encoded when it enters the LSTM layer.
PyTorch 2:
- In this case the input shape is not (seq_len, 1) as in the first TF example, so the decoder doesn't need a Dense layer afterwards. The author used a number of units in the LSTM layer equal to the input shape.
In the end, you choose the architecture of your model depending on the data you want to train on, specifically: its nature (text, audio, images), the input shape, the amount of data you have, and so on...
QUESTION
I was looking at the blog and the author used 'relu' instead of 'tanh'. Why? https://towardsdatascience.com/step-by-step-understanding-lstm-autoencoder-layers-ffab055b6352
...

ANSWER
Answered 2020-Jul-08 at 16:12
First, the ReLU function is not a cure-all activation function. Specifically, it still suffers from the exploding gradient problem, since it is unbounded in the positive domain. This implies the problem would still exist in deeper LSTM networks. Most LSTM networks become very deep, so they have a decent chance of running into the exploding gradient problem. RNNs also have exploding gradients when using the same weight matrix at each time step. There are methods, such as gradient clipping, that help reduce this problem in RNNs. However, ReLU functions themselves do not solve the exploding gradient problem.
The ReLU function does help reduce the vanishing gradient problem, but it doesn't solve vanishing gradients completely. Methods such as batch normalization can help reduce the vanishing gradient problem even further.
Now, to answer your question about using a ReLU function in place of a tanh function: as far as I know, there shouldn't be much of a difference between the ReLU and tanh activation functions on their own for this particular gate. Neither of them completely solves the vanishing/exploding gradient problems in LSTM networks. For more information about how LSTMs reduce the vanishing and exploding gradient problems, please refer to this post.
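As a hedged illustration of the two levers mentioned in this answer, the snippet below swaps the LSTM activation and adds gradient clipping in the optimizer (the shapes and the clipnorm value are arbitrary examples, not a recommendation):

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(100, 1)),
    # activation defaults to 'tanh'; the blog swapped in 'relu' here.
    layers.LSTM(32, activation="relu", return_sequences=True),
    layers.TimeDistributed(layers.Dense(1)),
])
# clipnorm rescales gradients whose norm exceeds 1.0, which limits exploding
# gradients regardless of which activation is chosen.
model.compile(optimizer=keras.optimizers.Adam(clipnorm=1.0), loss="mse")
```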
QUESTION
I am completely new to the field of LSTMs. Are there any tips to optimize my autoencoder for the task of reconstructing sequences of length 300?
The bottleneck layer should have 10-15 neurons.
...

ANSWER
Answered 2020-Jul-03 at 08:24
I believe the activation function you are using here, i.e. 'relu', might be killing the gradients. Try other activation functions, such as 'tanh', that are suitable for your data.
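One way to combine the question's constraints with this advice is sketched below: sequences of length 300, a bottleneck inside the 10-15 neuron budget, and the default 'tanh' activation (the outer layer sizes are guesses, not from the question):

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(300, 1)),              # sequences of length 300
    layers.LSTM(64, return_sequences=True),   # outer sizes are assumptions
    layers.LSTM(12),                          # bottleneck: within the 10-15 neuron budget
    layers.RepeatVector(300),
    layers.LSTM(64, return_sequences=True),
    layers.TimeDistributed(layers.Dense(1)),
])
model.compile(optimizer="adam", loss="mse")   # LSTM activation defaults to 'tanh'
```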
QUESTION
I am trying to build an LSTM model that receives a sequence of integers as input and outputs the probability of each integer appearing. If this probability is low, then the integer should be considered an anomaly. I tried to follow this tutorial - https://towardsdatascience.com/lstm-autoencoder-for-extreme-rare-event-classification-in-keras-ce209a224cfb - in particular, this is where my model is from. My input looks like this:
...

ANSWER
Answered 2020-May-12 at 09:08
As you said, it's an autoencoder, and your autoencoder tries to reconstruct your input. As you can see, the output values are very close to the input values; there is not a big error. So the autoencoder is well trained.
Now, if you want to detect outliers in your data, you can compute the reconstruction error (this could be the mean squared error between input and output) and set a threshold.
If the reconstruction error is greater than the threshold, the sample is an outlier, since the autoencoder has not been trained to reconstruct outlier data; a sketch of this step follows below.
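A minimal sketch of that thresholding step, assuming a trained Keras model called autoencoder and input of shape (n_samples, seq_len, n_features):

```python
import numpy as np

def flag_anomalies(autoencoder, sequences, threshold):
    reconstructed = autoencoder.predict(sequences)
    # Mean squared error per sample, averaged over timesteps and features.
    mse = np.mean(np.square(sequences - reconstructed), axis=(1, 2))
    return mse > threshold          # True where reconstruction is poor, i.e. an outlier

# One common way to pick the threshold: a high percentile of the training error.
# train_mse = np.mean(np.square(x_train - autoencoder.predict(x_train)), axis=(1, 2))
# threshold = np.percentile(train_mse, 99)
# anomalies = flag_anomalies(autoencoder, x_test, threshold)
```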
This schema represents the idea better:
I hope this helps ;)
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
Vulnerabilities
No vulnerabilities reported
Install LSTM-autoencoder
You can use LSTM-autoencoder like any standard Python library. You will need to make sure that you have a development environment consisting of a Python distribution including header files, a compiler, pip, and git installed. Make sure that your pip, setuptools, and wheel are up to date. When using pip it is generally recommended to install packages in a virtual environment to avoid changes to the system.
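For example, the environment set-up described above can be scripted from Python itself (this is only a generic sketch; it does not install the LSTM-autoencoder package, since its install source is not given here):

```python
# Create a virtual environment and bring pip, setuptools and wheel up to date.
import subprocess, sys, venv

venv.create(".venv", with_pip=True)
pip = ".venv/bin/pip" if sys.platform != "win32" else ".venv\\Scripts\\pip.exe"
subprocess.run([pip, "install", "--upgrade", "pip", "setuptools", "wheel"], check=True)
```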