wgan | Tensorflow Implementation of Wasserstein GAN | Machine Learning library
kandi X-RAY | wgan Summary
Tensorflow Implementation of Wasserstein GAN (and Improved version in wgan_v2)
Top functions reviewed by kandi - BETA
- Train the model
- Reshape a grid to a given size
- Plot a matplotlib figure
- Split a number into two parts
- Biasy batch norm
- leaky_relu
wgan Key Features
wgan Examples and Code Snippets
Community Discussions
Trending Discussions on wgan
QUESTION
I am following this Github Repo for the WGAN implementation with Gradient Penalty.
I am trying to understand the following method, which does the job of unit-testing the gradient-penalty calculations.
...ANSWER
Answered 2022-Apr-02 at 17:11
good_gradient = torch.ones(*image_shape) / torch.sqrt(image_size)
First, note that the gradient penalty term in WGAN is
(norm(gradient(interpolated)) - 1)^2
For the ideal gradient (i.e. a good gradient), this penalty term would be 0; that is, a good gradient is one whose gradient penalty is as close to 0 as possible.
This means the following should hold, considering the L2 norm of the gradient:
(norm(gradient(x')) - 1)^2 = 0
i.e. norm(gradient(x')) = 1
i.e. sqrt(sum(gradient_i^2)) = 1
Now if you continue simplifying the above expression (considering how the norm is calculated, see my note below), you will end up with
good_gradient = torch.ones(*image_shape) / torch.sqrt(image_size)
Since you are passing image_shape as (256, 1, 28, 28), torch.sqrt(image_size) in your case is tensor(28.).
Effectively, the above line divides each element of a 4-D tensor like [[[[1., 1., ...]]]] by the scalar tensor(28.).
Separately, note how the norm is calculated: torch.norm without extra arguments performs what is called a Frobenius norm, which effectively reshapes the matrix into one long vector and returns the 2-norm of that.
Given an M x N matrix, the Frobenius norm is defined as the square root of the sum of the squares of its elements.
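For illustration, here is a minimal, self-contained sketch of this idea (shapes taken from the question; this is not the repository's actual test code):
import torch

def gradient_penalty_from_grad(gradient):
    # Flatten each image's gradient and penalize deviation of its L2 norm from 1.
    grad_flat = gradient.view(gradient.shape[0], -1)
    norms = grad_flat.norm(2, dim=1)
    return ((norms - 1.0) ** 2).mean()

image_shape = (256, 1, 28, 28)            # batch of 256 single-channel 28x28 images
image_size = torch.tensor(28.0 * 28.0)    # number of pixels per image (784)
good_gradient = torch.ones(*image_shape) / torch.sqrt(image_size)

# Each flattened gradient has norm sqrt(784 * (1/28)^2) = 1, so the penalty is ~0.
print(gradient_penalty_from_grad(good_gradient))   # tensor(0.) up to floating-point error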
QUESTION
I'm trying to develop a GAN using FastAi. When converting the Tensor to an Image I get this error.
...ANSWER
Answered 2021-Dec-11 at 17:40
I suggest using this code to convert the output of your model from a tensor to a PIL image:
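The exact snippet is not reproduced above; a rough sketch of that kind of conversion, assuming the generator output is a (C, H, W) tensor in [-1, 1], could look like this:
import torch
from torchvision.transforms.functional import to_pil_image

def tensor_to_pil(t: torch.Tensor):
    t = t.detach().cpu()
    t = (t.clamp(-1, 1) + 1) / 2   # rescale from [-1, 1] to [0, 1]
    return to_pil_image(t)         # expects a float (C, H, W) tensor in [0, 1]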
QUESTION
I want to use WGAN-GP, and when I run the code, it gives me an error:
...ANSWER
Answered 2021-Sep-08 at 08:58
All loss tensors which are saved outside of the optimization cycle (i.e. outside the for g_iter in range(generator_iters) loop) need to be detached from the graph. Otherwise, you are keeping all previous computation graphs in memory.
As such, you should detach anything that gets appended to d_progress, d_fake_progress, d_real_progress, penalty, and g_progress.
You can do so by converting the tensor to a scalar value with torch.Tensor.item; the graph will then free itself on the following iteration. Change the following lines:
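A minimal, self-contained illustration of the pattern (the actual variable names, d_progress and friends, come from the asker's code, which is not shown here):
import torch

losses = []
for step in range(3):
    x = torch.randn(4, requires_grad=True)
    loss = (x ** 2).mean()
    loss.backward()
    # Append a plain Python float instead of the tensor, so the computation
    # graph built in this iteration can be freed instead of being kept alive.
    losses.append(loss.item())
print(losses)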
QUESTION
On this GitHub page of "Improved WGAN Training", particularly in their CIFAR Python file, the authors use the Python module tflib. I installed it via pip (pip install tflib), but when I import it, I still get the error message No module named tflib.
Any hints would be appreciated.
...ANSWER
Answered 2021-Feb-27 at 10:19
The package on PyPI is an unrelated package. If you look closely at the GitHub repository, you will find that tflib is actually a custom module made by the author and provided in the same repository. Cloning the full repository will resolve the issue.
QUESTION
I have trained a WGAN on the CelebA dataset in PyTorch following this YouTube video. Since I do this on Google Cloud Platform, where TensorBoard is not available, I save one figure of images generated by the GAN every epoch to see how the GAN is actually doing.
Now, the saved PDF files look something like this: generated images. Unfortunately, this is not really readable, and I suspect this has to do with the preprocessing I do:
...ANSWER
Answered 2021-Jan-09 at 16:33
It seems like your output pixel values are in the range [-1, 1] (please verify this).
Therefore, when you save the images, the negative part is being clipped (as the error message you got suggests).
Try:
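The snippet that followed is not shown above; the usual fix looks roughly like the sketch below (assuming the figures are built from NumPy arrays, e.g. via matplotlib):
import numpy as np

def denormalize(img):
    # Map pixel values from [-1, 1] back to [0, 1] before plotting or saving.
    return np.clip((img + 1.0) / 2.0, 0.0, 1.0)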
QUESTION
I am trying to implement a WGAN-GP model using tensorflow and keras (for credit card fraud data from kaggle).
I mostly followed the sample code provided on the Keras website and several other sample codes on the internet (but changed them from images to my data), and it is pretty straightforward.
But when I want to update the critic, the gradient of the loss w.r.t. the critic's weights becomes all nan after a few batches. This causes the critic's weights to become nan, after that the generator's weights become nan, and so on; everything becomes nan!
I used tf.debugging.enable_check_numerics and found that the problem arises because a -Inf appears in the gradient after some iterations.
This is directly related to the gradient-penalty term in the loss, because when I remove it the problem goes away.
Please note that the gp itself is not nan, but when I get the gradient of the loss w.r.t. the critic's weights (c_grads in the code below), it contains -Inf and then somehow becomes all nan.
I checked the math and network architecture for possible mistakes (like the possibility of vanishing gradients, etc.), and I checked my code for possible bugs for hours and hours. But I'm stuck.
I would very much appreciate it if anyone could find the root of the problem.
Note: Bear in mind that the critic's output and loss function are slightly different from the original paper's (because I'm trying to make it conditional), but that has nothing to do with the problem because, as I said before, the whole problem goes away when I just remove the gradient penalty term.
This is my critic:
...ANSWER
Answered 2020-Aug-28 at 00:26
So after much more digging into the internet, it turns out that this is because of the numerical instability of tf.norm (and some other functions as well).
In the case of the norm function, the problem is that when calculating its gradient, its value appears in the denominator. So d(norm(x))/dx at x = 0 would become 0 / 0 (this is the mysterious division-by-zero I was looking for!).
The problem is that the computational graph sometimes ends up with things like a / a where a = 0, which is numerically undefined although the limit exists. And because of the way tensorflow works (computing gradients using the chain rule), this results in nans or +/-Infs.
The best way would probably be for tensorflow to detect these patterns and replace them with their analytically-simplified equivalents. But until they do so, there is another way, and that is using something called tf.custom_gradient to define our custom function with our custom gradient (related issue on their github).
Although in my case there was actually an even simpler solution (although it wasn't simple when I didn't know that tf.norm was the culprit):
So instead of:
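The exact before/after lines are not reproduced above; the kind of replacement described is sketched below, where a small epsilon inside the square root keeps the gradient finite at zero:
import tensorflow as tf

def safe_l2_norm(x, axis=None, keepdims=False, epsilon=1e-12):
    # Unlike tf.norm, the gradient of sqrt(sum(x^2) + eps) does not become 0/0 at x == 0.
    return tf.sqrt(tf.reduce_sum(tf.square(x), axis=axis, keepdims=keepdims) + epsilon)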
QUESTION
I am working on a project with Wasserstein GANs, and more specifically with an implementation of the improved version of Wasserstein GANs. I have two theoretical questions about WGANs regarding their stability and training process. Firstly, the value of the loss function is notoriously correlated with the quality of the generated samples (as stated here). Is there some extra bibliography that supports that argument?
Secondly, during my experimental phase, I noticed that training my architecture using WGANs is much faster than using a simple version of GANs. Is that a common behavior? Is there also some literature analysis about that?
Furthermore, one question about the continuity of the functions that is guaranteed by using the Wasserstein loss. I am having some issues understanding this concept in practice; what does it mean that the normal GAN loss is not a continuous function?
...ANSWER
Answered 2020-May-02 at 16:19
You can check the Inception Score and the Frechet Inception Distance for now. And also here. The problem is that, since GANs don't have a unified objective function (there are two networks), there is no agreed way of evaluating and comparing GAN models. Instead, people devise metrics that relate the image distribution and the generator distribution.
WGAN could be faster due to having more stable training procedures as opposed to vanilla GAN (Wasserstein metric, weight clipping, and gradient penalty, if you are using it). I don't know if there is a literature analysis of speed, and WGAN may not always be faster than a simple GAN. WGAN also cannot find the best Nash equilibrium the way GAN can.
Think of two distributions, p and q. If these distributions overlap, i.e. their domains overlap, then the KL or JS divergence is differentiable. The problem arises when p and q don't overlap. As in the WGAN paper's example, say two pdfs on a 2D space, V = (0, Z) and Q = (K, Z), where K is different from 0 and Z is sampled from a uniform distribution. If you try to take the derivative of the KL/JS divergence of these two pdfs, you cannot; these divergences would be a binary indicator function (equal or not), and we cannot take the derivative of such a function. However, if we use the Wasserstein loss or Earth-Mover distance, we can, since we are approximating it as a distance between two points in space. Short story: the normal GAN loss function is continuous iff the distributions have an overlap; otherwise, it is discrete.
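A small numeric illustration of this point (a sketch using SciPy, with two point masses placed a distance theta apart on a shared two-bin support):
import numpy as np
from scipy.stats import wasserstein_distance
from scipy.spatial.distance import jensenshannon

for theta in [0.1, 1.0, 5.0]:
    p = np.array([1.0, 0.0])   # all mass in the first bin (x = 0)
    q = np.array([0.0, 1.0])   # all mass in the second bin (x = theta)
    js = jensenshannon(p, q, base=2)           # saturates at 1 for any theta != 0
    w = wasserstein_distance([0.0], [theta])   # grows smoothly with theta
    print(f"theta={theta}: JS={js:.3f}, Wasserstein={w:.3f}")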
Hope this helps
QUESTION
Good morning,
I am trying to implement the improved WGAN for 1D data as described in this paper: https://arxiv.org/pdf/1704.00028.pdf
It has been implemented as an example in the keras-contrib GitHub: https://github.com/keras-team/keras-contrib/blob/master/examples/improved_wgan.py Nevertheless, this implementation of the gradient penalty loss no longer works with tf2: K.gradients() returns [None].
...ANSWER
Answered 2020-Apr-08 at 17:06
If you do what is proposed in the UPDATE, tf will just ignore the loss function.
With Tensorflow 2, it seems impossible to do this the old way. I finally changed the code to adapt it to this way of creating models. What do I suggest?
- Create the gen/disc models with keras
- Join them by extending the tf.keras.Model class, like the WGAN of https://github.com/timsainb/tensorflow2-generative-models (see the sketch after this list)
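A minimal sketch of that approach, with illustrative names and shapes for 1D data (this is not the keras-contrib example or the linked repository's code):
import tensorflow as tf

class WGANGP(tf.keras.Model):
    # Wraps a generator and a critic so the gradient penalty can be computed
    # eagerly with tf.GradientTape instead of K.gradients.
    def __init__(self, generator, critic, latent_dim, gp_weight=10.0):
        super().__init__()
        self.generator = generator
        self.critic = critic
        self.latent_dim = latent_dim
        self.gp_weight = gp_weight

    def gradient_penalty(self, real, fake):
        # Random interpolation between real and fake samples, shape (batch, features).
        alpha = tf.random.uniform([tf.shape(real)[0], 1], 0.0, 1.0)
        interpolated = alpha * real + (1.0 - alpha) * fake
        with tf.GradientTape() as tape:
            tape.watch(interpolated)
            pred = self.critic(interpolated, training=True)
        grads = tape.gradient(pred, interpolated)
        norm = tf.sqrt(tf.reduce_sum(tf.square(grads), axis=-1) + 1e-12)
        return self.gp_weight * tf.reduce_mean((norm - 1.0) ** 2)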
QUESTION
I am trying to implement WGAN with GP in TensorFlow 2.0. Calculating the gradient penalty requires you to compute the gradients of the predictions with respect to the input images.
Now, to make it a bit more tractable, instead of computing the gradients of the predictions with respect to all the input images, it computes interpolated data points along the lines between original and fake data points and uses these as the inputs.
To implement this, I am first developing the compute_gradients function, which would take some predictions and return the gradients of those with respect to some input images. First, I thought of doing this with tf.keras.backend.gradients, but it won't work in eager mode. So, I am now trying to do this using GradientTape.
Here's the code I am using to test things out:
...ANSWER
Answered 2020-Mar-20 at 15:52
"Gradients of predictions with respect to some tensors ... Am I missing something here?"
Yes. You need a tape.watch(interpolated_img):
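A minimal sketch of where the watch call goes (names follow the question's compute_gradients idea; this is not the asker's exact code):
import tensorflow as tf

def compute_gradients(critic, interpolated_img):
    with tf.GradientTape() as tape:
        # interpolated_img is a plain tensor (not a tf.Variable), so the tape
        # will not track it unless it is explicitly watched.
        tape.watch(interpolated_img)
        preds = critic(interpolated_img, training=True)
    return tape.gradient(preds, interpolated_img)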
QUESTION
As I understand it, one of the differences between a regular GAN and a WGAN is that we train the discriminator/critic with more examples in each epoch. If in the regular GAN we have one batch for both modules per epoch, in WGAN we have 5 batches (or more) for the discriminator and one for the generator.
So basically we have another inner loop for the discriminator:
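A minimal structural sketch of that inner loop (assumed names; the helper functions are hypothetical placeholders, not the asker's original code):
n_critic = 5    # critic/discriminator updates per generator update
num_epochs = 2
num_batches = 3

def train_critic_step(batch_idx):   # hypothetical placeholder for one critic update
    return 0.0

def train_generator_step():         # hypothetical placeholder for one generator update
    return 0.0

for epoch in range(num_epochs):
    for batch_idx in range(num_batches):
        # Inner loop: the critic sees n_critic batches for every generator step.
        for _ in range(n_critic):
            d_loss = train_critic_step(batch_idx)
        g_loss = train_generator_step()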
...ANSWER
Answered 2020-Mar-17 at 03:16
Yes, it does sound reasonable: typically, increasing the batch size during training decreases the training time, at the cost of using more memory and lower accuracy (lower generalization ability).
Having said this, you should always do trial and error with regard to batching, as extreme values may or may not increase the training time.
For further discussion you can refer to this question.
Community Discussions and Code Snippets contain sources that include the Stack Exchange Network.
Vulnerabilities
No vulnerabilities reported
Install wgan
You can use wgan like any standard Python library. You will need to make sure that you have a development environment consisting of a Python distribution including header files, a compiler, pip, and git installed. Make sure that your pip, setuptools, and wheel are up to date. When using pip it is generally recommended to install packages in a virtual environment to avoid changes to the system.