MNIST | Neural Network with single hidden layer learning MNIST | Machine Learning library

by evolvingstuff | Java | Version: Current | License: MIT

kandi X-RAY | MNIST Summary

MNIST is a Java library typically used in Artificial Intelligence, Machine Learning, Deep Learning, Tensorflow, Keras, and Neural Network applications. MNIST has no vulnerabilities, it has a Permissive License, and it has low support. However, MNIST has 9 bugs and its build file is not available. You can download it from GitHub.
Neural Network with single hidden layer learning MNIST with less than 1.2% test error. This is the source code for an accompanying blog post.

Support

MNIST has a low active ecosystem.
It has 19 star(s) with 18 fork(s). There are 3 watchers for this library.
It had no major release in the last 6 months.
MNIST has no issues reported. There are no pull requests.
It has a neutral sentiment in the developer community.
The latest version of MNIST is current.

Quality

MNIST has 9 bugs (4 blocker, 0 critical, 0 major, 5 minor) and 132 code smells.

Security

MNIST has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.
MNIST code analysis shows 0 unresolved vulnerabilities.
There is 1 security hotspot that needs review.

License

MNIST is licensed under the MIT License. This license is Permissive.
Permissive licenses have the least restrictions, and you can use them in most projects.

Reuse

MNIST releases are not available. You will need to build from source code and install.
MNIST has no build file. You will need to create the build yourself to build the component from source.
MNIST saves you 184 person hours of effort in developing the same functionality from scratch.
It has 455 lines of code, 43 functions and 11 files.
It has medium code complexity. Code complexity directly impacts the maintainability of the code.
Top functions reviewed by kandi - BETA
kandi has reviewed MNIST and discovered the below as its top functions. This is intended to give you an instant insight into MNIST's implemented functionality, and help decide if they suit your requirements.
• Main method for testing
• Evaluate the fitted fitness function
• Convert a file to a matrix
• Evaluate a supervised sample
• Write a matrix to a file
• Internal evaluation function
• Loads weights from a file
• Save the layers to a file
• Gets the task dimension
• Gets the dimension of the observation
• Sets the validation mode
• Creates a new readout layer
• Determine the effective input dimension
• Add a new hidden layer
Get all kandi verified functions for this library.

                                                                                  MNIST Key Features

                                                                                  Neural Network with single hidden layer learning MNIST with less than 1.2% test error.
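
For context, the architecture described above can be sketched in a few lines of Keras. This is only an illustrative approximation of the idea (a single hidden layer on MNIST), not the library's Java implementation; the hidden-layer width, optimizer, and epoch count here are assumptions, and whether it reaches the repository's quoted error rate depends on training details.

import tensorflow as tf

# Load MNIST and scale pixels to [0, 1]
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# One hidden layer between the flattened input and the softmax output
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(800, activation='relu'),  # the single hidden layer (width is illustrative)
    tf.keras.layers.Dense(10, activation='softmax'),
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train, epochs=10, validation_data=(x_test, y_test))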

                                                                                  MNIST Examples and Code Snippets

                                                                                  No Code Snippets are available at this moment for MNIST.
                                                                                  Community Discussions

                                                                                  Trending Discussions on MNIST

• Audio widget within Jupyter notebook is **not** playing. How can I get the widget to play the audio?
• How to calculate maximum gradient for each layer given a mini-batch
• Save GAN generated images one by one
• Flux.jl : Customizing optimizer
• Generator model doesn't produce pictures during training
• Use of tf.GradientTape() exhausts all the gpu memory, without it it doesn't matter
• partial tucker decomposition
• Is it possible to use a collection of hyperspectral 1x1 pixels in a CNN model purposed for more conventional datasets (CIFAR-10/MNIST)?
• Why did it always missing one subplot when I import mnist digits dataset?
• Can not run the the tflite model on Interpreter in android studio

                                                                                  QUESTION

                                                                                  Audio widget within Jupyter notebook is **not** playing. How can I get the widget to play the audio?
                                                                                  Asked 2022-Mar-15 at 00:07

I am writing my code within a Jupyter notebook in VS Code. I am hoping to play some of the audio within my data set. However, when I execute the cell, the console reports no errors and produces the widget, but the widget displays 0:00 / 0:00, indicating there is no sound to play.

                                                                                  Below, I have listed two ways to reproduce the error.

1. I have acquired data from the hub data store. Looking specifically at the spoken MNIST data set, I cannot get the data from the audio tensor to play.
import hub
from IPython.display import display, Audio
from ipywidgets import interactive

# Obtain the data using the hub module
ds = hub.load("hub://activeloop/spoken_mnist")

# Create widget
sample = ds.audio[0].numpy()
display(Audio(data=sample, rate=8000, autoplay=True))
                                                                                  
2. The second example is a test (copied from another post) that I ran to see whether it was something wrong with the data or something wrong with my console, environment, etc.
# Same imports as shown above, plus numpy for generating the signal
import numpy as np

# Toy function to play beats in the notebook
def beat_freq(f1=220.0, f2=224.0):
    max_time = 5
    rate = 8000
    times = np.linspace(0, max_time, rate*max_time)
    signal = np.sin(2*np.pi*f1*times) + np.sin(2*np.pi*f2*times)
    display(Audio(data=signal, rate=rate))
    return signal

v = interactive(beat_freq, f1=(200.0, 300.0), f2=(200.0, 300.0))
display(v)
                                                                                  

I believe that if something is wrong with the data (this is a well-known data set, so I doubt it), then only the second example will play. If it is something to do with the IDE or something else, then neither will work, as is the case now.
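
One quick way to narrow this down (a sketch, assuming the ds handle from the first snippet) is to inspect the raw array before handing it to Audio:

import numpy as np

# Check the shape, dtype, and amplitude of the raw sample
sample = ds.audio[0].numpy()
print(sample.shape, sample.dtype, np.abs(sample).max())
# A trailing channel axis such as (N, 1) means Audio receives 2-D data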

                                                                                  ANSWER

                                                                                  Answered 2022-Mar-15 at 00:07

                                                                                  Apologies for the late reply! In the future, please tag the questions with activeloop so it's easier to sort through (or hit us up directly in community slack -> slack.activeloop.ai).

                                                                                  Regarding the Free Spoken Digit Dataset, I managed to track the error with your usage of activeloop hub and audio display.

Adding [:,0] to the 9th line fixes the display on Colab, since Audio expects one-dimensional data:

%matplotlib inline
import hub
from IPython.display import display, Audio
from ipywidgets import interactive

# Obtain the data using the hub module
ds = hub.load("hub://activeloop/spoken_mnist")

# Create widget
sample = ds.audio[0].numpy()[:,0]
display(Audio(data=sample, rate=8000, autoplay=True))
                                                                                  

(When we uploaded the dataset, we decided to store the audio as (N, C), where C is the number of channels, which happens to be 1 for this particular dataset. The extra channel dimension isn't removed automatically.)
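
Equivalently, assuming the (N, 1) shape described above and the imports from the snippet, you can drop the singleton channel axis with numpy rather than slicing:

import numpy as np

sample = ds.audio[0].numpy()        # shape (N, 1): N samples, 1 channel
mono = np.squeeze(sample, axis=1)   # shape (N,), which is what Audio expects
display(Audio(data=mono, rate=8000, autoplay=True))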

Regarding VS Code: the audio, unfortunately, still would not work (not because of us, but because of VS Code), but you can still try visualizing the Free Spoken Digit Dataset there (you can play the audio, too). Hopefully this addresses your needs!

                                                                                  Let us know if you have further questions.

                                                                                  Mikayel from Activeloop

                                                                                  Source https://stackoverflow.com/questions/71200390

                                                                                  QUESTION

                                                                                  How to calculate maximum gradient for each layer given a mini-batch
                                                                                  Asked 2022-Mar-14 at 07:58

I am trying to implement a fully-connected model for classification using the MNIST dataset. A part of the code is the following:

import tensorflow as tf
from tensorflow.keras import layers

n = 5
act_func = 'relu'

# x_train, y_train, x_test, y_test are assumed to be loaded already
classifier = tf.keras.models.Sequential()
classifier.add(layers.Flatten(input_shape=(28, 28, 1)))
for i in range(n):
    classifier.add(layers.Dense(32, activation=act_func))
classifier.add(layers.Dense(10, activation='softmax'))
opt = tf.keras.optimizers.SGD(learning_rate=0.01)
classifier.compile(optimizer=opt, loss="categorical_crossentropy", metrics=["accuracy"])

classifier.summary()

history = classifier.fit(x_train, y_train, batch_size=32, epochs=3, validation_data=(x_test, y_test))

                                                                                  Is there a way to print the maximum gradient for each layer for a given mini-batch?

                                                                                  ANSWER

                                                                                  Answered 2022-Mar-10 at 08:19

                                                                                  You could start off with a custom training loop using tf.GradientTape:

                                                                                  import tensorflow as tf
                                                                                  import tensorflow_datasets as tfds 
                                                                                  
                                                                                  (ds_train, ds_test), ds_info = tfds.load(
                                                                                      'mnist',
                                                                                      split=['train', 'test'],
                                                                                      shuffle_files=True,
                                                                                      as_supervised=True,
                                                                                      with_info=True,
                                                                                  )
                                                                                  n = 5
                                                                                  act_func = 'relu'
                                                                                  
                                                                                  classifier = tf.keras.models.Sequential()
                                                                                  classifier.add(tf.keras.layers.Flatten(input_shape = (28, 28, 1)))
                                                                                  for i in range(n):
                                                                                    classifier.add(tf.keras.layers.Dense(32, activation=act_func))
                                                                                  classifier.add(tf.keras.layers.Dense(10, activation='softmax'))
                                                                                  opt = tf.keras.optimizers.SGD(learning_rate=0.01)
                                                                                  loss = tf.keras.losses.CategoricalCrossentropy()
                                                                                  
                                                                                  classifier.summary()
                                                                                  
                                                                                  epochs = 1
                                                                                  for epoch in range(epochs):
                                                                                      print("\nStart of epoch %d" % (epoch,))
                                                                                      for step, (x_batch_train, y_batch_train) in enumerate(ds_train.take(50).batch(10)):
                                                                                          x_batch_train = tf.cast(x_batch_train, dtype=tf.float32)
                                                                                          y_batch_train = tf.keras.utils.to_categorical(y_batch_train, 10)
                                                                                  
                                                                                          with tf.GradientTape() as tape:
                                                                                            logits = classifier(x_batch_train, training=True)
                                                                                            loss_value = loss(y_batch_train, logits)
                                                                                  
                                                                                          grads = tape.gradient(loss_value, classifier.trainable_weights)
                                                                                          opt.apply_gradients(zip(grads, classifier.trainable_weights)) 
                                                                                  
                                                                                          with tf.GradientTape(persistent=True) as tape:
                                                                                            tape.watch(x_batch_train)
                                                                                            x = classifier.layers[0](x_batch_train)
                                                                                            outputs = []
                                                                                            for layer in classifier.layers[1:]:
                                                                                                x = layer(x)
                                                                                                outputs.append(x)
                                                                                  
                                                                                          for idx, output in enumerate(outputs):
                                                                                             grad = tf.math.abs(tape.gradient(output, x_batch_train))
                                                                                             print('Max gradient for layer {} is {}'.format(idx + 1, tf.reduce_max(grad)))
                                                                                          print('End of batch {}'.format(step + 1))
                                                                                  
                                                                                  Model: "sequential_9"
                                                                                  _________________________________________________________________
                                                                                   Layer (type)                Output Shape              Param #   
                                                                                  =================================================================
                                                                                   flatten_9 (Flatten)         (None, 784)               0         
                                                                                                                                                   
                                                                                   dense_54 (Dense)            (None, 32)                25120     
                                                                                                                                                   
                                                                                   dense_55 (Dense)            (None, 32)                1056      
                                                                                                                                                   
                                                                                   dense_56 (Dense)            (None, 32)                1056      
                                                                                                                                                   
                                                                                   dense_57 (Dense)            (None, 32)                1056      
                                                                                                                                                   
                                                                                   dense_58 (Dense)            (None, 32)                1056      
                                                                                                                                                   
                                                                                   dense_59 (Dense)            (None, 10)                330       
                                                                                                                                                   
                                                                                  =================================================================
                                                                                  Total params: 29,674
                                                                                  Trainable params: 29,674
                                                                                  Non-trainable params: 0
                                                                                  _________________________________________________________________
                                                                                  
                                                                                  Start of epoch 0
                                                                                  Max gradient for layer 1 is 0.7913536429405212
                                                                                  Max gradient for layer 2 is 0.8477020859718323
                                                                                  Max gradient for layer 3 is 0.7188305854797363
                                                                                  Max gradient for layer 4 is 0.5108454823493958
                                                                                  Max gradient for layer 5 is 0.3362882435321808
                                                                                  Max gradient for layer 6 is 1.9748875867975357e-09
                                                                                  End of batch 1
                                                                                  Max gradient for layer 1 is 0.7535678148269653
                                                                                  Max gradient for layer 2 is 0.6814548373222351
                                                                                  Max gradient for layer 3 is 0.5748667120933533
                                                                                  Max gradient for layer 4 is 0.5439972877502441
                                                                                  Max gradient for layer 5 is 0.27793681621551514
                                                                                  Max gradient for layer 6 is 1.9541412932255753e-09
                                                                                  End of batch 2
                                                                                  Max gradient for layer 1 is 0.8606255650520325
                                                                                  Max gradient for layer 2 is 0.8506941795349121
                                                                                  Max gradient for layer 3 is 0.8556670546531677
                                                                                  Max gradient for layer 4 is 0.43756356835365295
                                                                                  Max gradient for layer 5 is 0.2675274908542633
                                                                                  Max gradient for layer 6 is 3.7072431791074223e-09
                                                                                  End of batch 3
                                                                                  Max gradient for layer 1 is 0.7640039324760437
                                                                                  Max gradient for layer 2 is 0.6926062107086182
                                                                                  Max gradient for layer 3 is 0.6164448857307434
                                                                                  Max gradient for layer 4 is 0.43013691902160645
                                                                                  Max gradient for layer 5 is 0.32356566190719604
                                                                                  Max gradient for layer 6 is 3.2926392723453546e-09
                                                                                  End of batch 4
                                                                                  Max gradient for layer 1 is 0.7604862451553345
                                                                                  Max gradient for layer 2 is 0.6908300518989563
                                                                                  Max gradient for layer 3 is 0.6122230887413025
                                                                                  Max gradient for layer 4 is 0.39982378482818604
                                                                                  Max gradient for layer 5 is 0.3172021210193634
                                                                                  Max gradient for layer 6 is 2.3238742041797877e-09
                                                                                  End of batch 5
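
If instead you want the maximum gradient of the loss with respect to each layer's weights (rather than with respect to the inputs, as above), a small variation works. This is a sketch of lines you could add inside the same batch loop, reusing the grads list already computed from the first tape:

# grads is aligned with classifier.trainable_weights (kernel and bias
# tensors per Dense layer), so reduce each one to its maximum magnitude
for var, grad in zip(classifier.trainable_weights, grads):
    max_grad = tf.reduce_max(tf.math.abs(grad))
    print('Max |dL/dw| for {}: {}'.format(var.name, max_grad.numpy()))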
                                                                                  

                                                                                  Source https://stackoverflow.com/questions/71420132

                                                                                  QUESTION

                                                                                  Save GAN generated images one by one
                                                                                  Asked 2022-Mar-13 at 17:07

I have generated some images from the Fashion-MNIST dataset. However, I have not been able to come up with a way to save each image as a single file; I have only found a way to save them in groups. Can someone help me with how to save the images one by one?

                                                                                  This is what I have for the moment:

def generate_and_save_images(model, epoch, test_input):
    predictions = model(test_input, training=False)

    fig = plt.figure(figsize=(4, 4))

    for i in range(predictions.shape[0]):
        plt.subplot(4, 4, i+1)
        plt.imshow(predictions[i, :, :, 0] * 127.5 + 127.5, cmap='gray')
        plt.axis('off')

    plt.savefig('image_at_epoch_{:04d}.png'.format(epoch))

    plt.show()
                                                                                  

                                                                                  ANSWER

                                                                                  Answered 2022-Mar-13 at 17:07

                                                                                  Try using plt.imsave to save each image separately:

                                                                                  def generate_and_save_images(model, epoch, test_input):
                                                                                    predictions = model(test_input, training=False)
                                                                                    fig = plt.figure(figsize=(4, 4))
                                                                                  
                                                                                    for i in range(predictions.shape[0]):
                                                                                        plt.subplot(4, 4, i+1)
                                                                                        plt.imshow(predictions[i, :, :, 0] * 127.5 + 127.5, cmap='gray')
                                                                                        plt.imsave('image_at_epoch_{:04d}-{}.png'.format(epoch, i), predictions[i, :, :, 0] * 127.5 + 127.5, cmap='gray')
                                                                                        plt.axis('off')
                                                                                  
                                                                                    plt.savefig('image_at_epoch_{:04d}.png'.format(epoch))
                                                                                    plt.show()
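
If you would rather bypass matplotlib's figure machinery entirely, each array can also be written directly with PIL. This is a sketch, assuming predictions holds generator outputs in [-1, 1] as in the code above, and the helper name is hypothetical:

import numpy as np
from PIL import Image

def save_images_individually(predictions, epoch):
    # Rescale each [-1, 1] image to [0, 255] and save it as its own PNG
    for i in range(predictions.shape[0]):
        img = np.asarray(predictions[i, :, :, 0]) * 127.5 + 127.5
        Image.fromarray(img.astype(np.uint8), mode='L').save(
            'image_at_epoch_{:04d}-{}.png'.format(epoch, i))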
                                                                                  

                                                                                  Source https://stackoverflow.com/questions/71452209

                                                                                  QUESTION

                                                                                  Flux.jl : Customizing optimizer
                                                                                  Asked 2022-Jan-25 at 07:58

I'm trying to implement a gradient-free optimizer function to train convolutional neural networks with Julia using Flux.jl. The reference paper is this: https://arxiv.org/abs/2005.05955. This paper proposes RSO, a gradient-free optimization algorithm that updates a single weight at a time on a sampling basis. The pseudocode of this algorithm is depicted in the picture below.

[Figure: RSO optimizer pseudocode]

                                                                                  I'm using MNIST dataset.

function train(; kws...)
    args = Args(; kws...) # collect options in a struct for convenience

    if CUDA.functional() && args.use_cuda
        @info "Training on CUDA GPU"
        CUDA.allowscalar(false)
        device = gpu
    else
        @info "Training on CPU"
        device = cpu
    end

    # Prepare datasets
    x_train, x_test, y_train, y_test = getdata(args, device)

    # Create DataLoaders (mini-batch iterators)
    train_loader = DataLoader((x_train, y_train), batchsize=args.batchsize, shuffle=true)
    test_loader = DataLoader((x_test, y_test), batchsize=args.batchsize)

    # Construct model
    model = build_model() |> device
    ps = Flux.params(model) # model's trainable parameters

    best_param = ps
    if args.optimiser == "SGD"
        # Regular training step with SGD

    elseif args.optimiser == "RSO"
        # Run RSO function and update ps
        best_param .= RSO(x_train, y_train, args.RSOupdate, model, args.batchsize, device)
    end
                                                                                  

                                                                                  And the corresponding RSO function:

                                                                                  function RSO(X,L,C,model, batch_size, device)
                                                                                  """
                                                                                  model = convolutional model structure
                                                                                  X = Input data
                                                                                  L = labels
                                                                                  C = Number of rounds to update parameters
                                                                                  W = Weight set of layers
                                                                                  Wd = Weight tensors of layer d that generates an activation
                                                                                  wid = weight tensor that generates an activation aᵢ
                                                                                  wj = a weight in wid
                                                                                  """
                                                                                  
                                                                                  # Normalize input data to have zero mean and unit standard deviation
                                                                                  X .= (X .- sum(X))./std(X)
                                                                                  train_loader = DataLoader((X, L), batchsize=batch_size, shuffle=true)
                                                                                  
                                                                                  #println("model = $(typeof(model))")
                                                                                  
                                                                                  std_prep = []
                                                                                  σ_d = Float64[]
                                                                                  D = 1
                                                                                  for layer in model
                                                                                      D += 1
                                                                                      Wd = Flux.params(layer)
                                                                                      # Initialize the weights of the network with Gaussian distribution
                                                                                      for id in Wd
                                                                                          wj = convert(Array{Float32, 4}, rand(Normal(0, sqrt(2/length(id))), (3,3,4,4)))
                                                                                          id = wj
                                                                                          append!(std_prep, vec(wj))
                                                                                      end
                                                                                      # Compute std of all elements in the weight tensor Wd
                                                                                      push!(σ_d, std(std_prep))
                                                                                  end
                                                                                  
                                                                                  W = Flux.params(model)
                                                                                  
                                                                                  # Weight update
                                                                                  for _ in 1:C
                                                                                      d = D
                                                                                      while d > 0
                                                                                          for id in 1:length(W[d])
                                                                                              # Randomly sample change in weights from Gaussian distribution
                                                                                              for j in 1:length(w[d][id])
                                                                                                  # Randomly sample mini-batch
                                                                                                  (x, l) = train_loader[rand(1:length(train_loader))]
                                                                                                  
                                                                                                  # Sample a weight from normal distribution
                                                                                                  ΔWj[d][id][j] = rand(Normal(0, σ_d[d]), 1)
                                                                                  
                                                                                                  loss, acc = loss_and_accuracy(data_loader, model, device)
                                                                                                  W = argmin(F(x,l, W+ΔWj), F(x,l,W), F(x,l, W-ΔWj))
                                                                                              end
                                                                                          end
                                                                                          d -= 1
                                                                                      end
                                                                                  end
                                                                                  
                                                                                  return W
                                                                                  end
                                                                                  

The problem here is the second block of the RSO function. I'm trying to evaluate the loss with the change of a single weight in three scenarios, which are F(x, l, W+ΔW), F(x, l, W), and F(x, l, W−ΔW), and choose the weight set with minimum loss. But how do I do that using Flux.jl? The loss function I'm trying to use is logitcrossentropy(ŷ, y, agg=sum). In order to generate ŷ, we should use model(x), but changing a single weight parameter in Zygote.Params() form was already challenging...

                                                                                  ANSWER

                                                                                  Answered 2022-Jan-14 at 23:47

                                                                                  Based on the paper you shared, it looks like you need to change the weight arrays per each output neuron per each layer. Unfortunately, this means that the implementation of your optimization routine is going to depend on the layer type, since an "output neuron" for a convolution layer is quite different than a fully-connected layer. In other words, just looping over Flux.params(model) is not going to be sufficient, since this is just a set of all the weight arrays in the model and each weight array is treated differently depending on which layer it comes from.

                                                                                  Fortunately, Julia's multiple dispatch does make this easier to write if you use separate functions instead of a giant loop. I'll summarize the algorithm using the pseudo-code below:

                                                                                  for layer in model
                                                                                    for output_neuron in layer
                                                                                      for weight_element in parameters(output_neuron)
                                                                                        weight_element = sample(N(0, sqrt(2 / num_outputs(layer))))
                                                                                      end
                                                                                    end
                                                                                    sigmas[layer] = stddev(parameters(layer))
                                                                                  end
                                                                                  
                                                                                  for c in 1 to C
                                                                                    for layer in reverse(model)
                                                                                      for output_neuron in layer
                                                                                        for weight_element in parameters(output_neuron)
                                                                                          x, y = sample(batches)
                                                                                          dw = N(0, sigmas[layer])
                                                                                          # optimize weights
                                                                                        end
                                                                                      end
                                                                                    end
                                                                                  end
                                                                                  

                                                                                  It's the for output_neuron ... portions that we need to isolate into separate functions.

                                                                                  In the first block, we don't actually do anything different to every weight_element, they are all sampled from the same normal distribution. So, we don't actually need to iterate the output neurons, but we do need to know how many there are.

using Statistics: std

# This function sets the weights according to the normal distribution
# N(0, sqrt(2 / num_outputs)) from the pseudocode above, and returns
# the standard deviation of the weights.
function sample_weight!(layer::Dense)
    sample = randn(eltype(layer.weight), size(layer.weight))
    num_outputs = size(layer.weight, 1)
    # note the "." notation, which mutates the array in place
    layer.weight .= sample .* sqrt(2 / num_outputs)

    return std(layer.weight)
end

function sample_weight!(layer::Conv)
    sample = randn(eltype(layer.weight), size(layer.weight))
    num_outputs = size(layer.weight, 4)
    layer.weight .= sample .* sqrt(2 / num_outputs)

    return std(layer.weight)
end

sigmas = [sample_weight!(layer) for layer in model]
                                                                                  

Now, for the second block, we apply the same trick by defining a method for each layer type.

function optimize_layer!(loss, layer::Dense, data, sigma)
  for i in 1:size(layer.weight, 1)
    for j in 1:size(layer.weight, 2)
      wj = layer.weight[i, j]
      # draw a random sample and a random perturbation
      x, y = data[rand(1:length(data))]
      dw = randn() * sigma
      ws = [wj + dw, wj, wj - dw]
      losses = zeros(Float32, length(ws))
      for (k, w) in enumerate(ws)
        layer.weight[i, j] = w
        losses[k] = loss(x, y)
      end
      # keep whichever candidate gave the lowest loss
      layer.weight[i, j] = ws[argmin(losses)]
    end
  end
end
                                                                                  
function optimize_layer!(loss, layer::Conv, data, sigma)
  for i in 1:size(layer.weight, 4)
    # we use a view to reference the full kernel
    # for this output channel
    wid = view(layer.weight, :, :, :, i)

    # eachindex lets us treat wid like a vector
    for j in eachindex(wid)
      wj = wid[j]
      x, y = data[rand(1:length(data))]
      dw = randn() * sigma
      ws = [wj + dw, wj, wj - dw]
      losses = zeros(Float32, length(ws))
      for (k, w) in enumerate(ws)
        wid[j] = w
        losses[k] = loss(x, y)
      end
      wid[j] = ws[argmin(losses)]
    end
  end
end
                                                                                  
using Flux.Losses: logitcrossentropy

for c in 1:C  # C optimization passes, as in the pseudo-code
  # iterate the layers in reverse order, pairing each with its sigma
  for (layer, sigma) in zip(reverse(collect(model)), reverse(sigmas))
    optimize_layer!(layer, data, sigma) do x, y
      logitcrossentropy(model(x), y; agg = sum)
    end
  end
end
                                                                                  

Notice that nowhere did I use Flux.params, which does not help us here. Also, Flux.params would include both the weights and the biases, and the paper doesn't look like it bothers with the biases at all. If you had an optimization method that generically optimized any parameter the same way regardless of layer type (i.e. like gradient descent), then you could use for p in Flux.params(model) ....

                                                                                  Source https://stackoverflow.com/questions/70641453

                                                                                  QUESTION

                                                                                  Generator model doesn't produce pictures during training
                                                                                  Asked 2022-Jan-15 at 02:45

I'm training a GAN on MNIST, and I want to visualize the Generator's output from noise input during training.

                                                                                  Here is the code:

                                                                                  from numpy import expand_dims
                                                                                  import numpy as np
                                                                                  import time
                                                                                  import tensorflow as tf
                                                                                  from numpy import zeros
                                                                                  from numpy import ones
                                                                                  from numpy import vstack
                                                                                  from numpy.random import randn
                                                                                  from numpy.random import randint
                                                                                  from tensorflow.keras.datasets.mnist import load_data
                                                                                  from tensorflow.keras.optimizers import Adam
                                                                                  from tensorflow.keras import layers, Sequential
                                                                                  import matplotlib.pyplot as plt
                                                                                  from IPython import display
                                                                                  import imageio # for creating gifs
                                                                                  import PIL
                                                                                  
                                                                                  (trainX, _), (_, _) = load_data()
                                                                                  # add channels dimension
                                                                                  X = expand_dims(trainX, axis=-1)
                                                                                  # convert from unsigned ints to floats
                                                                                  X = X.astype('float32')
                                                                                  # scale from [0,255] to [0,1]
                                                                                  dataset = X / 255.0
                                                                                  
                                                                                  def define_generator(latent_dim):
                                                                                      model = Sequential()
                                                                                      # foundation for 7x7 image
                                                                                      n_nodes = 128 * 7 * 7
                                                                                      model.add(layers.Dense(n_nodes, input_dim=latent_dim))
                                                                                      model.add(layers.LeakyReLU())
                                                                                      model.add(layers.Reshape((7, 7, 128)))
                                                                                      # upsample to 14x14
                                                                                      model.add(layers.Conv2DTranspose(128, (4,4), strides=(2,2), padding='same'))
                                                                                      model.add(layers.LeakyReLU())
                                                                                      # upsample to 28x28
                                                                                      model.add(layers.Conv2DTranspose(128, (4,4), strides=(2,2), padding='same'))
                                                                                      model.add(layers.LeakyReLU())
                                                                                      model.add(layers.Conv2D(1, (7,7), activation='sigmoid', padding='same'))
                                                                                      return model
                                                                                  
                                                                                  def define_discriminator(in_shape=(28,28,1)):
                                                                                      model = Sequential()
                                                                                      model.add(layers.Conv2D(128, (5,5), strides=(2, 2), padding='same', input_shape=in_shape))
                                                                                      model.add(layers.LeakyReLU())
                                                                                      model.add(layers.Dropout(0.3))
                                                                                      model.add(layers.Conv2D(128, (5,5), strides=(2, 2), padding='same'))
                                                                                      model.add(layers.LeakyReLU())
                                                                                      model.add(layers.Dropout(0.3))
                                                                                      model.add(layers.Flatten())
                                                                                      model.add(layers.Dense(1, activation='sigmoid'))
                                                                                      # compile model
    opt = Adam(learning_rate=0.0002)
                                                                                      model.compile(loss='binary_crossentropy', optimizer=opt, metrics=['accuracy'])
                                                                                      return model
                                                                                  
                                                                                  # size of the noise vector
                                                                                  latent_dim = 100
                                                                                  num_examples_to_generate = 16
                                                                                  
                                                                                  g_model = define_generator(latent_dim)
                                                                                  
                                                                                  d_model = define_discriminator()
                                                                                  
                                                                                  # define the combined generator and discriminator model, for updating the generator
                                                                                  def define_gan(g_model, d_model):
                                                                                      # make weights in the discriminator not trainable
                                                                                      d_model.trainable = False
                                                                                      # connect them
                                                                                      model = Sequential()
                                                                                      # add generator
                                                                                      model.add(g_model)
                                                                                      # add the discriminator
                                                                                      model.add(d_model)
                                                                                      # compile model
    opt = Adam(learning_rate=0.0002)
                                                                                      model.compile(loss='binary_crossentropy', optimizer=opt)
                                                                                      return model
                                                                                  
                                                                                  gan_model = define_gan(g_model, d_model)
                                                                                  
                                                                                  # select real samples
                                                                                  def generate_real_samples(dataset, n_samples):
                                                                                      # choose random instances
                                                                                      ix = randint(0, dataset.shape[0], n_samples)
                                                                                      # retrieve selected images
                                                                                      X = dataset[ix]
                                                                                      # generate 'real' class labels (1)
                                                                                      y = ones((n_samples, 1))
                                                                                      return X, y
                                                                                  
                                                                                  # generate points in latent space as input for the generator
                                                                                  def generate_latent_points(latent_dim, n_samples):
                                                                                      # generate noise vector for training
                                                                                      x_input = randn(latent_dim * n_samples)
                                                                                      # reshape into a batch of inputs for the network
                                                                                      x_input = x_input.reshape(n_samples, latent_dim)
                                                                                      return x_input
                                                                                  
                                                                                  # use the generator to generate n fake examples, with class labels
                                                                                  def generate_fake_samples(g_model, latent_dim, n_samples):
                                                                                      # generate points in latent space
                                                                                      x_input = generate_latent_points(latent_dim, n_samples)
                                                                                      # predict outputs
                                                                                      X = g_model.predict(x_input)
                                                                                      # create 'fake' class labels (0)
                                                                                      y = zeros((n_samples, 1))
                                                                                      return X, y
                                                                                  
                                                                                  def generate_and_save_images(epoch):
                                                                                    # for generating the fake images after each epoch
                                                                                  
                                                                                    # generate points in the latent space
                                                                                    noise = randn(latent_dim * num_examples_to_generate)
                                                                                    # reshape into a batch of inputs for the network
                                                                                    noise = noise.reshape(num_examples_to_generate, latent_dim)
                                                                                   
                                                                                    predictions = g_model(noise, training=False)
                                                                                  
                                                                                    fig = plt.figure(figsize=(4, 4))
                                                                                  
                                                                                    for i in range(predictions.shape[0]):
                                                                                        plt.subplot(4, 4, i+1)
                                                                                        plt.imshow(predictions[i, :, :, 0], cmap='gray')
                                                                                        plt.axis('off')
                                                                                  
                                                                                    plt.savefig('image_at_epoch_{:04d}.png'.format(epoch))
                                                                                    plt.show()
                                                                                  
                                                                                  def train(g_model, d_model, gan_model, dataset, latent_dim, n_epochs=60, n_batch=256):
                                                                                    bat_per_epo = int(dataset.shape[0] / n_batch)
                                                                                    half_batch = int(n_batch / 2)
                                                                                  
                                                                                    for epoch in range(n_epochs):
                                                                                  
                                                                                      start = time.time()
                                                                                  
                                                                                      for batch in range(bat_per_epo):
                                                                                        
                                                                                        X_real, y_real = generate_real_samples(dataset, half_batch)
                                                                                  
                                                                                        X_fake, y_fake = generate_fake_samples(g_model, latent_dim, half_batch)
                                                                                  
                                                                                        X = vstack((X_real, X_fake))
                                                                                  
                                                                                        y = vstack((y_real, y_fake))
                                                                                  
                                                                                        d_loss, _ = d_model.train_on_batch(X,y)
                                                                                  
                                                                                        X_gan = generate_latent_points(latent_dim, n_batch)
                                                                                  
                                                                                        y_gan = ones((n_batch, 1))
                                                                                  
                                                                                        g_loss = gan_model.train_on_batch(X_gan, y_gan)
                                                                                  
                                                                                      display.clear_output(wait=True)
                                                                                      generate_and_save_images(epoch + 1)
                                                                                  
                                                                                      print('Time for epoch {} is {} sec'.format(epoch + 1, time.time()-start))
                                                                                  
                                                                                    display.clear_output(wait=True)
                                                                                    generate_and_save_images(n_epochs)
                                                                                  
                                                                                  train(g_model, d_model, gan_model, dataset, latent_dim)
                                                                                  
                                                                                  

                                                                                  The output that I'm getting is:

[screenshot: output of the train(...) command]

There is no error or anything; I just can't see the output from the Generator given the noise input.

                                                                                  The function that is supposed to show the output is generate_and_save_images.

                                                                                  ANSWER

                                                                                  Answered 2022-Jan-15 at 02:45

When you use cmap="gray" in plt.imshow() you must either unscale your output or set vmin and vmax. From what I see, you scaled by dividing by 255, so you must either multiply your data by 255 or, alternatively, set vmin=0, vmax=1.

Option 1:

                                                                                  plt.imshow(predictions[i, :, :, 0]*255, cmap='gray')
                                                                                  

Option 2:

                                                                                  plt.imshow(predictions[i, :, :, 0], cmap='gray', vmin=0, vmax=1)
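
As a quick, self-contained illustration of the difference (with random stand-in data, since I don't have your trained generator), the left panel below autoscales the display while the right panel pins it to the training scale:

import numpy as np
import matplotlib.pyplot as plt

img = np.random.rand(28, 28)  # stand-in for a generator output scaled to [0, 1]

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(6, 3))
ax1.imshow(img, cmap='gray')                  # autoscaled to the data's own min/max
ax1.set_title('autoscaled')
ax2.imshow(img, cmap='gray', vmin=0, vmax=1)  # pinned to the [0, 1] training scale
ax2.set_title('vmin=0, vmax=1')
plt.show()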
                                                                                  

                                                                                  Source https://stackoverflow.com/questions/70707808

                                                                                  QUESTION

                                                                                  Use of tf.GradientTape() exhausts all the gpu memory, without it it doesn't matter
                                                                                  Asked 2022-Jan-07 at 11:47

I'm working on Convolution TasNet; the model I made has about 5.05 million variables.

I want to train it using a custom training loop, and the problem is:

                                                                                  for i, (input_batch, target_batch) in enumerate(train_ds): # each shape is (64, 32000, 1)
                                                                                      with tf.GradientTape() as tape:
                                                                                          predicted_batch = cv_tasnet(input_batch, training=True) # model name
                                                                                          loss = calculate_sisnr(predicted_batch, target_batch) # some custom loss
                                                                                      trainable_vars = cv_tasnet.trainable_variables
                                                                                      gradients = tape.gradient(loss, trainable_vars)
                                                                                      cv_tasnet.optimizer.apply_gradients(zip(gradients, trainable_vars))
                                                                                  

This part exhausts all the GPU memory (24GB available).
When I tried it without tf.GradientTape() as tape,

                                                                                  for i, (input_batch, target_batch) in enumerate(train_ds):
                                                                                          predicted_batch = cv_tasnet(input_batch, training=True)
                                                                                          loss = calculate_sisnr(predicted_batch, target_batch)
                                                                                  

This uses a reasonable amount of GPU memory (about 5-6GB).

I tried the same tf.GradientTape() as tape pattern on the basic MNIST data, and it works without problem.
So does the model size matter? But the same error arises when I lower BATCH_SIZE to 32 or smaller.

Why does the 1st code block exhaust all the GPU memory?

                                                                                  Of course, I put

                                                                                  gpus = tf.config.experimental.list_physical_devices('GPU')
                                                                                  if gpus:
                                                                                      try:
                                                                                          # Currently, memory growth needs to be the same across GPUs
                                                                                          for gpu in gpus:
                                                                                              tf.config.experimental.set_memory_growth(gpu, True)
                                                                                      except RuntimeError as e:
                                                                                          # Memory growth must be set before GPUs have been initialized
                                                                                          print(e)
                                                                                  

this code in the very first cell.

                                                                                  ANSWER

                                                                                  Answered 2022-Jan-07 at 11:08

Gradient tape triggers automatic differentiation, which requires tracking gradients on all your weights and activations. Autodiff requires several times more memory; this is normal. You'll have to manually tune your batch size until you find one that works, then tune your LR. Usually, tuning just means guess-and-check or grid search. (I am working on a product to do all of that for you, but I'm not here to plug it.)
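
If you want to probe how small the batch must be, here is a minimal sketch, reusing the names from the question and assuming train_ds is a tf.data.Dataset that was batched at 64: unbatch and rebatch it before the loop, then lower the batch size until the step fits in memory.

import tensorflow as tf

# rebatch to a smaller batch size to probe the memory limit
small_batch_ds = train_ds.unbatch().batch(16)

for input_batch, target_batch in small_batch_ds:
    with tf.GradientTape() as tape:
        predicted_batch = cv_tasnet(input_batch, training=True)
        loss = calculate_sisnr(predicted_batch, target_batch)
    trainable_vars = cv_tasnet.trainable_variables
    gradients = tape.gradient(loss, trainable_vars)
    cv_tasnet.optimizer.apply_gradients(zip(gradients, trainable_vars))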

                                                                                  Source https://stackoverflow.com/questions/70615673

                                                                                  QUESTION

                                                                                  partial tucker decomposition
                                                                                  Asked 2021-Dec-28 at 21:06

I want to apply a partial Tucker decomposition algorithm to reduce the MNIST image tensor dataset of shape (60000, 28, 28), in order to conserve its features when applying another machine learning algorithm afterwards, like SVM. I have this code that reduces the second and third dimensions of the tensor:

import tensorly as tl
from tensorly.decomposition import partial_tucker

i = 16
j = 10
modes = [1, 2]
core, factors = partial_tucker(train_data_mnist, modes=modes, tol=10e-5, rank=[i, j])
train_data_partial_tucker = tl.tenalg.multi_mode_dot(train_data_mnist, factors,
                                                     modes=modes, transpose=True)
test_data_partial_tucker = tl.tenalg.multi_mode_dot(test_data_mnist, factors,
                                                    modes=modes, transpose=True)
                                                                                  

How do I find the best rank [i, j] when using partial_tucker in tensorly, i.e. the one that gives the best dimension reduction for the image while conserving as much data as possible?

                                                                                  ANSWER

                                                                                  Answered 2021-Dec-28 at 21:05

So if you look at the source code for tensorly linked here, you can see that the documentation for the function in question, partial_tucker, says:

                                                                                  """
                                                                                  Partial tucker decomposition via Higher Order Orthogonal Iteration (HOI)
                                                                                  Decomposes 'tensor' into a Tucker decomposition exclusively along 
                                                                                  the provided modes.
                                                                                  
                                                                                  Parameters
                                                                                  ----------
                                                                                  tensor: ndarray
                                                                                  modes: int list
                                                                                  list of the modes on which to perform the decomposition
                                                                                  rank: None, int or int list
                                                                                  size of the core tensor, 
                                                                                  if int, the same rank is used for all modes
                                                                                  """
                                                                                  

The purpose of this function is to provide you with the approximation which conserves as much data as possible for a given rank. I cannot tell you which rank "will give the best dimension reduction for the image while conserving as much data," because the optimal tradeoff between dimensionality reduction and loss of precision has no objectively "correct" answer in the abstract; it will largely depend upon the specific goals of your project and the computational resources available to you to achieve those goals.

If I told you "the best rank", it would defeat the purpose of this approximate decomposition in the first place, because "the best rank" would be the one which gives no loss, which is no longer an approximation of fixed rank and renders the term approximation meaningless. But how far to stray from this "best rank" in order to gain dimensionality reduction is not a question anyone else can answer for you objectively. One could certainly give an opinion, but this opinion would depend on much more information than I have from you at the moment. If you are looking for a more in-depth perspective on this tradeoff and which tradeoff suits you best, I would suggest posting a question with some more detail on your situation on a site in the Stack network more focused on the mathematical/statistical underpinnings of dimensionality reduction and less on the programming aspects that Stack Overflow focuses on, such as Stack Exchange Cross Validated or perhaps Stack Exchange Data Science.
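
That said, one common heuristic is to scan a small grid of ranks and look at the relative reconstruction error, then pick the smallest rank whose error you can tolerate. Here is a rough sketch, assuming the same tensorly API as in the question (where partial_tucker returns a core tensor and a list of factors):

import tensorly as tl
from tensorly.decomposition import partial_tucker

def relative_error(tensor, modes, rank):
    core, factors = partial_tucker(tensor, modes=modes, rank=rank, tol=10e-5)
    # rebuild the approximation from the core and the factors
    approx = tl.tenalg.multi_mode_dot(core, factors, modes=modes)
    return tl.norm(tensor - approx) / tl.norm(tensor)

modes = [1, 2]
for i in (4, 8, 16):
    for j in (4, 8, 16):
        err = relative_error(train_data_mnist, modes, [i, j])
        print(f"rank=[{i},{j}]  relative reconstruction error: {err:.4f}")

The error generally shrinks as the ranks grow; where you stop on that curve is exactly the project-specific tradeoff described above.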

                                                                                  Sources/References/Further Reading:

1. Blog/article on Tucker decompositions
2. "Some mathematical notes on three-mode factor analysis" by Ledyard R. Tucker
3. "A Multilinear Singular Value Decomposition" by De Lathauwer et al.
4. "On the Best Rank-1 and Rank-(R1, R2, ..., RN) Approximation of Higher-Order Tensors" by De Lathauwer et al.

                                                                                  Source https://stackoverflow.com/questions/70466992

                                                                                  QUESTION

                                                                                  Is it possible to use a collection of hyperspectral 1x1 pixels in a CNN model purposed for more conventional datasets (CIFAR-10/MNIST)?
                                                                                  Asked 2021-Dec-17 at 09:08

I have created a working CNN model in Keras/Tensorflow, and have successfully used the CIFAR-10 & MNIST datasets to test it. The functioning code is seen below:

import keras
from keras.datasets import cifar10
from keras.utils import to_categorical
from keras.models import Sequential
from keras.layers import Dense, Activation, Dropout, Conv2D, Flatten, MaxPooling2D, BatchNormalization
                                                                                  
                                                                                  (X_train, y_train), (X_test, y_test) = cifar10.load_data()
                                                                                  
                                                                                  #reshape data to fit model
                                                                                  X_train = X_train.reshape(50000,32,32,3)
                                                                                  X_test = X_test.reshape(10000,32,32,3)
                                                                                  
                                                                                  y_train = to_categorical(y_train)
                                                                                  y_test = to_categorical(y_test)
                                                                                  
                                                                                  
# Building the model
model = Sequential()

#1st Convolutional Layer
                                                                                  model.add(Conv2D(filters=64, input_shape=(32,32,3), kernel_size=(11,11), strides=(4,4), padding='same'))
                                                                                  model.add(BatchNormalization())
                                                                                  model.add(Activation('relu'))
                                                                                  model.add(MaxPooling2D(pool_size=(2,2), strides=(2,2), padding='same'))
                                                                                  
                                                                                  #2nd Convolutional Layer
                                                                                  model.add(Conv2D(filters=224, kernel_size=(5, 5), strides=(1,1), padding='same'))
                                                                                  model.add(BatchNormalization())
                                                                                  model.add(Activation('relu'))
                                                                                  model.add(MaxPooling2D(pool_size=(2,2), strides=(2,2), padding='same'))
                                                                                  
                                                                                  #3rd Convolutional Layer
                                                                                  model.add(Conv2D(filters=288, kernel_size=(3,3), strides=(1,1), padding='same'))
                                                                                  model.add(BatchNormalization())
                                                                                  model.add(Activation('relu'))
                                                                                  
                                                                                  #4th Convolutional Layer
                                                                                  model.add(Conv2D(filters=288, kernel_size=(3,3), strides=(1,1), padding='same'))
                                                                                  model.add(BatchNormalization())
                                                                                  model.add(Activation('relu'))
                                                                                  
                                                                                  #5th Convolutional Layer
                                                                                  model.add(Conv2D(filters=160, kernel_size=(3,3), strides=(1,1), padding='same'))
                                                                                  model.add(BatchNormalization())
                                                                                  model.add(Activation('relu'))
                                                                                  model.add(MaxPooling2D(pool_size=(2,2), strides=(2,2), padding='same'))
                                                                                  
                                                                                  model.add(Flatten())
                                                                                  
                                                                                  # 1st Fully Connected Layer
                                                                                  model.add(Dense(4096, input_shape=(32,32,3,)))
                                                                                  model.add(BatchNormalization())
                                                                                  model.add(Activation('relu'))
                                                                                  # Add Dropout to prevent overfitting
                                                                                  model.add(Dropout(0.4))
                                                                                  
                                                                                  #2nd Fully Connected Layer
                                                                                  model.add(Dense(4096))
                                                                                  model.add(BatchNormalization())
                                                                                  model.add(Activation('relu'))
                                                                                  #Add Dropout
                                                                                  model.add(Dropout(0.4))
                                                                                  
                                                                                  #3rd Fully Connected Layer
                                                                                  model.add(Dense(1000))
                                                                                  model.add(BatchNormalization())
                                                                                  model.add(Activation('relu'))
                                                                                  #Add Dropout
                                                                                  model.add(Dropout(0.4))
                                                                                  
                                                                                  #Output Layer
                                                                                  model.add(Dense(10))
                                                                                  model.add(BatchNormalization())
                                                                                  model.add(Activation('softmax'))
                                                                                  
                                                                                  
                                                                                  #compile model using accuracy to measure model performance
                                                                                  opt = keras.optimizers.Adam(learning_rate = 0.0001)
                                                                                  model.compile(optimizer=opt, loss='categorical_crossentropy', 
                                                                                                metrics=['accuracy'])
                                                                                  
                                                                                  
                                                                                  #train the model
                                                                                  model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=30)
                                                                                  

From this point, after utilising the aforementioned datasets, I wanted to go one step further and use a dataset with more channels than greyscale or RGB provide, hence the inclusion of a hyperspectral dataset. When looking for a hyperspectral dataset I came across this one.

The issue at this stage was realising that this hyperspectral dataset was a single image, with each value in the ground truth relating to one pixel. So I reformatted the data into a collection of hyperspectral data/pixels.

                                                                                  Code reformatting corrected dataset for x_train & x_test:

                                                                                  import keras
                                                                                  import scipy
                                                                                  import numpy as np
                                                                                  import matplotlib.pyplot as plt
                                                                                  from keras.utils import to_categorical
                                                                                  from scipy import io
                                                                                  
                                                                                  mydict = scipy.io.loadmat('Indian_pines_corrected.mat')
                                                                                  dataset = np.array(mydict.get('indian_pines_corrected'))
                                                                                  
                                                                                  
                                                                                  #This is creating the split between x_train and x_test from the original dataset 
                                                                                  # x_train after this code runs will have a shape of (121, 145, 200) 
                                                                                  # x_test after this code runs will have a shape of (24, 145, 200)
xtemp = np.array_split(dataset, [121])
x_train = np.array(xtemp[0])
x_test = np.array(xtemp[1])
                                                                                  
                                                                                  # x_train will have a shape of (17545, 200) 
                                                                                  # x_test will have a shape of (3480, 200)
                                                                                  x_train = x_train.reshape(-1, x_train.shape[-1])
                                                                                  x_test = x_test.reshape(-1, x_test.shape[-1])
                                                                                  

                                                                                  Code reformatting ground truth dataset for Y_train & Y_test:

                                                                                  truthDataset = scipy.io.loadmat('Indian_pines_gt.mat')
                                                                                  gTruth = truthDataset.get('indian_pines_gt')
                                                                                  
                                                                                  #This is creating the split between Y_train and Y_test from the original dataset 
                                                                                  # Y_train after this code runs will have a shape of (121, 145) 
                                                                                  # Y_test after this code runs will have a shape of (24, 145)
                                                                                  
ytemp = np.array_split(gTruth, [121])
Y_train = np.array(ytemp[0])
Y_test = np.array(ytemp[1])
                                                                                  
                                                                                  # Y_train will have a shape of (17545) 
                                                                                  # Y_test will have a shape of (3480)
                                                                                  Y_train = Y_train.reshape(-1)
                                                                                  Y_test = Y_test.reshape(-1)
                                                                                  
                                                                                  
                                                                                  #17 binary categories ranging from 0-16
                                                                                  
                                                                                  #Y_train one-hot encode target column
                                                                                  Y_train = to_categorical(Y_train)
                                                                                  
                                                                                  #Y_test one-hot encode target column
                                                                                  Y_test = to_categorical(Y_test, num_classes = 17)
                                                                                  

My thought process was that, even though the initial image is broken down into 1x1 patches, the large number of channels in each patch, with their respective values, would still provide enough information to categorise the dataset.

Essentially I'd want to input this reformatted data into my model (seen in the first code fragment of this post), but I'm uncertain whether I'm taking the wrong approach, due to my inexperience in this area. I was expecting to input a shape of (1, 1, 200), i.e. the shapes of x_train and x_test would be (17545, 1, 1, 200) and (3480, 1, 1, 200) respectively.
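For reference, a minimal sketch of the reshape that would produce those shapes (assuming x_train and x_test as built above); the two unit axes simply treat each pixel's 200 channels as a 1x1 "image":

# Hypothetical reshape to the (N, 1, 1, 200) layout described above
x_train_4d = x_train.reshape(-1, 1, 1, x_train.shape[-1])  # (17545, 1, 1, 200)
x_test_4d = x_test.reshape(-1, 1, 1, x_test.shape[-1])     # (3480, 1, 1, 200)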

                                                                                  ANSWER

                                                                                  Answered 2021-Dec-16 at 10:18

If the hyperspectral dataset is given to you as a large image with many channels, I suppose that the classification of each pixel should depend on the pixels around it (otherwise there would be no point in formatting the data as an image, i.e. with a grid structure). Given this assumption, breaking up the input picture into 1x1 parts is not a good idea, as you are losing the grid structure.

I further suppose that the order of the channels is arbitrary, which implies that convolution over the channels is probably not meaningful (which you did not plan to do anyway).

                                                                                  Instead of reformatting the data the way you did, you may want to create a model that takes an image as input and also outputs an "image" containing the classifications for each pixel. I.e. if you have 10 classes and take a (145, 145, 200) image as input, your model would output a (145, 145, 10) image. In that architecture you would not have any fully-connected layers. Your output layer would also be a convolutional layer.
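As a rough illustration of that architecture, here is a minimal fully convolutional sketch in Keras; the filter counts and kernel sizes are placeholders, not a tuned design, and the 10 classes mirror the example numbers above:

import tensorflow as tf
from tensorflow.keras import layers, models

# Input: one (145, 145, 200) hyperspectral image.
# Output: a (145, 145, 10) "image" of per-pixel class probabilities.
# 'same' padding keeps the spatial grid size unchanged throughout.
model = models.Sequential([
    layers.Input(shape=(145, 145, 200)),
    layers.Conv2D(64, 3, padding='same', activation='relu'),
    layers.Conv2D(64, 3, padding='same', activation='relu'),
    layers.Conv2D(10, 1, activation='softmax'),  # 1x1 conv as the output layer
])
model.summary()  # note: no fully-connected layers anywhere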

That, however, means that you will not be able to keep your current architecture. The tasks for MNIST/CIFAR10 and your hyperspectral dataset are not the same: for MNIST/CIFAR10 you want to classify an image in its entirety, while for the other dataset you want to assign a class to each pixel (while most likely also using the pixels around each pixel).

                                                                                  Some further ideas:

• If you want to turn the pixel classification task on the hyperspectral dataset into a classification task for an entire image, maybe you can reformulate that task as "classifying a hyperspectral image as the class of its center (or top-left, or bottom-right, or (21st, 104th), or whatever) pixel". To obtain the data from your single hyperspectral image, for each pixel, I would shift the image such that the target pixel is at the desired location (e.g. the center). All pixels that "fall off" the border could be inserted at the other side of the image (see the numpy sketch after this list).
• If you want to stick with a pixel classification task but need more data, maybe split up the single hyperspectral image you have into many smaller images (e.g. 10x10x200). You may even want to use images of many different sizes. If your model only has convolution and pooling layers and you make sure to maintain the size of the image, that should work out (also covered in the sketch below).
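A minimal numpy sketch of both ideas: the wrap-around shift from the first bullet is exactly what np.roll does, and the patch split from the second is plain slicing. The zero-filled 145x145x200 cube below is a stand-in for the real data:

import numpy as np

def recenter(image, row, col):
    """Shift the image so pixel (row, col) lands at the center.

    Pixels that fall off one border wrap around to the other side.
    """
    h, w = image.shape[:2]
    return np.roll(image, shift=(h // 2 - row, w // 2 - col), axis=(0, 1))

cube = np.zeros((145, 145, 200))           # stand-in for the hyperspectral cube
example = recenter(cube, row=21, col=104)  # one (145, 145, 200) training example

# Splitting the cube into smaller images, e.g. 10x10x200 patches:
patches = [cube[r:r + 10, c:c + 10, :]
           for r in range(0, 140, 10)
           for c in range(0, 140, 10)]     # 196 patches of shape (10, 10, 200)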

                                                                                  Source https://stackoverflow.com/questions/70226626

                                                                                  QUESTION

Why is one subplot always missing when I import the MNIST digits dataset?
                                                                                  Asked 2021-Dec-07 at 04:08

I want to import MNIST digits and show them in one figure, with code like this:

import keras
from keras.datasets import mnist
import matplotlib.pyplot as plt

(X_train, y_train), (X_test, y_test) = mnist.load_data()
fig = plt.figure(figsize=(8, 8))
n = 0
for i in range(5):
    for j in range(5):
        plt.subplot(5, 5, i*5 + j + 1)
        plt.imshow(X_train[n], cmap='Greys')
        plt.title("Digit:{}".format(y_train[n]))
        n += 1
        plt.tight_layout()  # note: called inside the loop
plt.show()
                                                                                  

However, no matter how I change the rows and columns, one subplot is always missing at the bottom, as shown in the screenshot (not reproduced here). I don't know what is happening...

                                                                                  ANSWER

                                                                                  Answered 2021-Dec-07 at 04:04

                                                                                  I was able to reproduce this bug too. It seems to be related to the plt.tight_layout() that you apply within the loop. Instead of doing this, use plt.subplots to produce the axes objects first, then iterate over those instead. Once you plot everything, use tight_layout on the opened figure:

                                                                                  import keras
                                                                                  from keras.datasets import mnist
                                                                                  import matplotlib.pyplot as plt
                                                                                  (X_train, y_train), (X_test, y_test) = mnist.load_data()
                                                                                  fig, axes = plt.subplots(nrows=5, ncols=5, figsize=(8,8))
                                                                                  for i, ax in enumerate(axes.flat):
                                                                                      ax.imshow(X_train[i], cmap='Greys')
                                                                                      ax.set_title("Digit:{}".format(y_train[i]))
                                                                                  fig.tight_layout()
                                                                                  plt.show()
                                                                                  

We now get what is expected (the resulting figure is omitted here).

                                                                                  Source https://stackoverflow.com/questions/70254697

                                                                                  QUESTION

Cannot run the tflite model with the Interpreter in Android Studio
                                                                                  Asked 2021-Nov-24 at 00:05

I am trying to run a TensorFlow Lite model in my app on a smartphone. First, I trained the model on numerical data using LSTM layers built with tf.keras. I used TensorFlow v2.x and saved the trained model on a server. After that, the model is downloaded to the internal memory of the smartphone by the app and loaded into the interpreter using "MappedByteBuffer". Up to here everything works correctly.

The problem is that the interpreter cannot read and run the model. I also added the required dependencies in build.gradle.

The conversion code to a tflite model in Python:

import tensorflow as tf  # needed for tf.saved_model and tf.lite below
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, LSTM
from tensorflow.keras import regularizers

# Create the network
model = Sequential()
model.add(LSTM(...... name = 'First_layer'))
model.add(Dropout(rate=Drop_out))
model.add(LSTM(...... name = 'Second_layer'))
model.add(Dropout(rate=Drop_out))

# compile model
model.compile(loss=keras.losses.mae,
              optimizer=keras.optimizers.Adam(learning_rate=learning_rate),
              metrics=["mae"])

# fit model
model.fit(.......)

# save the model
tf.saved_model.save(model, 'saved_model')
print("Model type", model.dtype)  # model dtype is float32; size around 2MB

# Convert saved model into TFLite
converter = tf.lite.TFLiteConverter.from_saved_model('saved_model')
tflite_model = converter.convert()

with open("Model.tflite", "wb") as f:
    f.write(tflite_model)  # the with-block closes the file automatically
                                                                                  

I also tried another conversion approach using Keras:

                                                                                  # converter = tf.lite.TFLiteConverter.from_keras_model(keras_model)
                                                                                  # tflite_model = converter.convert()
                                                                                  

After this step, "Model.tflite" is converted and downloaded to the internal memory of the smartphone.
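One possible culprit worth checking at this step (an assumption, not something the post confirms): Keras LSTM models do not always convert to purely builtin TFLite ops, and an interpreter will refuse to run a model containing unsupported ops. Enabling select TF ops during conversion, together with the matching org.tensorflow:tensorflow-lite-select-tf-ops runtime dependency on Android, is the documented workaround:

import tensorflow as tf

# Assumption: the LSTM model uses ops outside the TFLite builtin set.
converter = tf.lite.TFLiteConverter.from_saved_model('saved_model')
converter.target_spec.supported_ops = [
    tf.lite.OpsSet.TFLITE_BUILTINS,  # regular TFLite builtin ops
    tf.lite.OpsSet.SELECT_TF_OPS,    # allow fallback to TensorFlow kernels
]
tflite_model = converter.convert()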

                                                                                  Android studio code:

// Note: 'private' is not valid on a local variable; the interpreter is
// declared as a plain local here (it could also be a class field).
try {
    Interpreter tflite = new Interpreter(loadModelFile());
    Log.d("Load_model", "Created a Tensorflow Lite of AutoAuth.");
} catch (IOException e) {
    Log.e("Load_model", "IOException loading the tflite file");
}
                                                                                  
private MappedByteBuffer loadModelFile() throws IOException {
    String model_path = model_directory + model_name + ".tflite";
    Log.d(TAG, model_path);
    File file = new File(model_path);
    if (file.exists()) {  // new File(...) never returns null; check existence
        FileInputStream inputStream = new FileInputStream(file);
        FileChannel fileChannel = inputStream.getChannel();
        return fileChannel.map(FileChannel.MapMode.READ_ONLY, 0, file.length());
    } else {
        return null;
    }
}
                                                                                  

                                                                                  The "loadModelFile()" function is working correctly because I checked it with another tflite model using MNIST dataset for image classification. The problem is only the interpreter.

These are also the contents of build.gradle:

android {
    aaptOptions {
        noCompress "tflite"
    }
    defaultConfig {
        ndk {
            abiFilters 'armeabi-v7a', 'arm64-v8a'
        }
    }
}

dependencies {
    implementation 'com.jakewharton:butterknife:8.8.1'
    implementation 'org.tensorflow:tensorflow-lite:0.1.2-nightly'
    annotationProcessor 'com.jakewharton:butterknife-compiler:8.8.1'
    implementation fileTree(dir: 'libs', include: ['*.jar'])
    //noinspection GradleCompatible
    implementation 'com.android.support:appcompat-v7:28.0.0'
    implementation 'com.android.support.constraint:constraint-layout:2.0.4'
    testImplementation 'junit:junit:4.12'
    androidTestImplementation 'com.android.support.test:runner:1.0.2'
    androidTestImplementation 'com.android.support.test.espresso:espresso-core:3.0.2'
}
                                                                                  

Whenever I run the app from Android Studio, I get one of two errors (both were shown as screenshots in the original post and are not reproduced here).

I have gone through many resources and threads and read about saving trained models, TFLite conversion, and interpreters. I have been trying to solve this issue for 5 days with no luck. Can anyone suggest a solution?

                                                                                  ANSWER

                                                                                  Answered 2021-Nov-24 at 00:05

Referring to one of the most recent TFLite Android app examples might help: Model Personalization App. This demo app uses a transfer-learning model instead of an LSTM, but the overall workflow should be similar.

As Farmaker mentioned in the comment, try using the SNAPSHOT build in the Gradle dependency:

                                                                                  implementation 'org.tensorflow:tensorflow-lite:0.0.0-nightly-SNAPSHOT'
                                                                                  

To load the model properly, can you try the following:

// assetManager and directoryName are fields of the surrounding class in the
// linked example app; adapt them to wherever your model file actually lives.
protected MappedByteBuffer loadMappedFile(String filePath) throws IOException {
    AssetFileDescriptor fileDescriptor = assetManager.openFd(this.directoryName + "/" + filePath);
    FileInputStream inputStream = new FileInputStream(fileDescriptor.getFileDescriptor());
    FileChannel fileChannel = inputStream.getChannel();
    long startOffset = fileDescriptor.getStartOffset();
    long declaredLength = fileDescriptor.getDeclaredLength();
    return fileChannel.map(MapMode.READ_ONLY, startOffset, declaredLength);
}
                                                                                  

                                                                                  This snippet can also be found in the GitHub example link I posted above.
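Once the buffer loads, a minimal sketch of constructing and invoking the interpreter with the standard org.tensorflow.lite.Interpreter API; the input/output shapes below are hypothetical placeholders, so match them to your model's actual signature:

void runInference() throws IOException {
    MappedByteBuffer modelBuffer = loadMappedFile("Model.tflite");
    // Interpreter implements AutoCloseable, so try-with-resources frees it.
    try (Interpreter interpreter = new Interpreter(modelBuffer)) {
        float[][] input = new float[1][10];  // e.g. one window of 10 features
        float[][] output = new float[1][1];  // e.g. one predicted value
        interpreter.run(input, output);
    }
}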

                                                                                  Source https://stackoverflow.com/questions/69796868

                                                                                  Community Discussions, Code Snippets contain sources that include Stack Exchange Network

                                                                                  Vulnerabilities

                                                                                  No vulnerabilities reported

                                                                                  Install MNIST

                                                                                  You can download it from GitHub.
You can use MNIST like any standard Java library. Please include the jar files in your classpath. You can also use any IDE, and you can run and debug the MNIST component as you would any other Java program. Best practice is to use a build tool that supports dependency management, such as Maven or Gradle. For Maven installation, please refer to maven.apache.org. For Gradle installation, please refer to gradle.org.

                                                                                  Support

For any new features, suggestions and bugs, create an issue on GitHub. If you have any questions, check and ask them on the Stack Overflow community page.
                                                                                  CLONE
                                                                                • HTTPS

                                                                                  https://github.com/evolvingstuff/MNIST.git

                                                                                • CLI

                                                                                  gh repo clone evolvingstuff/MNIST

                                                                                • sshUrl

                                                                                  git@github.com:evolvingstuff/MNIST.git
