# MNIST | Neural Network with single hidden layer learning MNIST | Machine Learning library

## Reuse

- Main method for testing
- Evaluate the fitted fitness function
- Convert a file to a matrix
- Evaluate a sample -vised sample
- Write a matrix to a file
- Internal evaluation function
- Loads weights from a file
- Save the layers to a file
- Gets the task dimension
- Gets the dimension of the observation
- Sets the validation mode

- Creates a new readout layer
- Determine the effective input dimension

- Add a new hidden layer

Trending Discussions on MNIST

QUESTION

I am writing my code within a Jupyter notebook in VS Code. I am hoping to play some of the audio within my data set. However, when I execute the cell, the console reports no errors and produces the widget, but the widget displays 0:00 / 0:00 (see below), indicating there is no sound to play.

Below, I have listed two ways to reproduce the error.

- I have acquired data from the hub data store. Looking specifically at the spoken MNIST data set, I cannot get the data from the `audio` tensor to play:

```
import hub
from IPython.display import display, Audio
from ipywidgets import interactive
# Obtain the data using the hub module
ds = hub.load("hub://activeloop/spoken_mnist")
# Create widget
sample = ds.audio[0].numpy()
display(Audio(data=sample, rate = 8000, autoplay=True))
```

- The second example is a test (copied from another post) that I ran to see whether something was wrong with the data or with my console, environment, *etc.*

```
# Same imports as shown above, plus NumPy
import numpy as np

# Toy function to play beats in the notebook
def beat_freq(f1=220.0, f2=224.0):
    max_time = 5
    rate = 8000
    times = np.linspace(0, max_time, rate * max_time)
    signal = np.sin(2 * np.pi * f1 * times) + np.sin(2 * np.pi * f2 * times)
    display(Audio(data=signal, rate=rate))
    return signal

v = interactive(beat_freq, f1=(200.0, 300.0), f2=(200.0, 300.0))
display(v)
```

I believe that if it is something wrong with the data (this is a well-known data set so, I doubt it), then only the second one will play. If it is something to do with the IDE or something else, then neither will work, as is the case now.

ANSWER

Answered 2022-Mar-15 at 00:07

Apologies for the late reply! In the future, please tag such questions with activeloop so they are easier to sort through (or reach us directly in the community Slack: slack.activeloop.ai).

Regarding the Free Spoken Digit Dataset, I managed to track the error down to how the Activeloop hub data is passed to the audio display.

Adding `[:, 0]` to the line that creates `sample` fixes the display on Colab, because `Audio` expects one-dimensional data:

```
%matplotlib inline
import hub
from IPython.display import display, Audio
from ipywidgets import interactive
# Obtain the data using the hub module
ds = hub.load("hub://activeloop/spoken_mnist")
# Create widget
sample = ds.audio[0].numpy()[:,0]
display(Audio(data=sample, rate = 8000, autoplay=True))
```

(When we uploaded the dataset, we decided to store the audio as (N, C), where C is the number of channels, which happens to be 1 for this particular dataset. The extra channel dimension is not dropped automatically, so it has to be indexed away as shown above.)
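The shape issue can be reproduced without the dataset. The following is a minimal sketch, assuming NumPy and a synthetic array standing in for `ds.audio[0].numpy()`; the helper name `to_mono` is hypothetical and not part of hub.

```python
import numpy as np

def to_mono(samples):
    """Return a 1-D array from audio shaped (N,), (N, 1) or (N, C)."""
    samples = np.asarray(samples)
    if samples.ndim == 1:
        return samples
    if samples.ndim == 2:
        # (N, C): average the channels; for C == 1 this equals samples[:, 0]
        return samples.mean(axis=1)
    raise ValueError("expected a 1-D or 2-D array, got shape %r" % (samples.shape,))

# Synthetic stand-in for an (N, 1) clip: one second of a 440 Hz tone at 8 kHz
rate = 8000
clip = np.sin(2 * np.pi * 440 * np.arange(rate) / rate).reshape(-1, 1)
mono = to_mono(clip)
print(clip.shape, "->", mono.shape)
```

The one-dimensional `mono` array is the kind of input that `Audio(data=..., rate=rate)` expects.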

Regarding VS Code: unfortunately, the audio will still not play there (this is a VS Code limitation, not a hub one), but you can still try visualizing the Free Spoken Digit Dataset (you can play the audio there, too). Hopefully this addresses your needs!

Let us know if you have further questions.

Mikayel from Activeloop

QUESTION

I am trying to implement a fully connected model for classification using the MNIST dataset. Part of the code is the following:

```
n = 5
act_func = 'relu'
classifier = tf.keras.models.Sequential()
classifier.add(layers.Flatten(input_shape = (28, 28, 1)))
for i in range(n):
    classifier.add(layers.Dense(32, activation=act_func))
classifier.add(layers.Dense(10, activation='softmax'))
opt = tf.keras.optimizers.SGD(learning_rate=0.01)
classifier.compile(optimizer=opt,loss="categorical_crossentropy",metrics ="accuracy")
classifier.summary()
history = classifier.fit(x_train, y_train, batch_size=32, epochs=3, validation_data=(x_test,y_test))
```

Is there a way to print the maximum gradient for each layer for a given mini-batch?

ANSWER

Answered 2022-Mar-10 at 08:19

You could start off with a custom training loop using `tf.GradientTape`:

```
import tensorflow as tf
import tensorflow_datasets as tfds

(ds_train, ds_test), ds_info = tfds.load(
    'mnist',
    split=['train', 'test'],
    shuffle_files=True,
    as_supervised=True,
    with_info=True,
)

n = 5
act_func = 'relu'
classifier = tf.keras.models.Sequential()
classifier.add(tf.keras.layers.Flatten(input_shape = (28, 28, 1)))
for i in range(n):
    classifier.add(tf.keras.layers.Dense(32, activation=act_func))
classifier.add(tf.keras.layers.Dense(10, activation='softmax'))
opt = tf.keras.optimizers.SGD(learning_rate=0.01)
loss = tf.keras.losses.CategoricalCrossentropy()
classifier.summary()

epochs = 1
for epoch in range(epochs):
    print("\nStart of epoch %d" % (epoch,))
    for step, (x_batch_train, y_batch_train) in enumerate(ds_train.take(50).batch(10)):
        x_batch_train = tf.cast(x_batch_train, dtype=tf.float32)
        y_batch_train = tf.keras.utils.to_categorical(y_batch_train, 10)
        # Regular training step
        with tf.GradientTape() as tape:
            logits = classifier(x_batch_train, training=True)
            loss_value = loss(y_batch_train, logits)
        grads = tape.gradient(loss_value, classifier.trainable_weights)
        opt.apply_gradients(zip(grads, classifier.trainable_weights))
        # Record each layer's output so its gradient w.r.t. the input batch can be taken
        with tf.GradientTape(persistent=True) as tape:
            tape.watch(x_batch_train)
            x = classifier.layers[0](x_batch_train)
            outputs = []
            for layer in classifier.layers[1:]:
                x = layer(x)
                outputs.append(x)
        for idx, output in enumerate(outputs):
            grad = tf.math.abs(tape.gradient(output, x_batch_train))
            print('Max gradient for layer {} is {}'.format(idx + 1, tf.reduce_max(grad)))
        print('End of batch {}'.format(step + 1))
```

```
Model: "sequential_9"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
flatten_9 (Flatten) (None, 784) 0
dense_54 (Dense) (None, 32) 25120
dense_55 (Dense) (None, 32) 1056
dense_56 (Dense) (None, 32) 1056
dense_57 (Dense) (None, 32) 1056
dense_58 (Dense) (None, 32) 1056
dense_59 (Dense) (None, 10) 330
=================================================================
Total params: 29,674
Trainable params: 29,674
Non-trainable params: 0
_________________________________________________________________
Start of epoch 0
Max gradient for layer 1 is 0.7913536429405212
Max gradient for layer 2 is 0.8477020859718323
Max gradient for layer 3 is 0.7188305854797363
Max gradient for layer 4 is 0.5108454823493958
Max gradient for layer 5 is 0.3362882435321808
Max gradient for layer 6 is 1.9748875867975357e-09
End of batch 1
Max gradient for layer 1 is 0.7535678148269653
Max gradient for layer 2 is 0.6814548373222351
Max gradient for layer 3 is 0.5748667120933533
Max gradient for layer 4 is 0.5439972877502441
Max gradient for layer 5 is 0.27793681621551514
Max gradient for layer 6 is 1.9541412932255753e-09
End of batch 2
Max gradient for layer 1 is 0.8606255650520325
Max gradient for layer 2 is 0.8506941795349121
Max gradient for layer 3 is 0.8556670546531677
Max gradient for layer 4 is 0.43756356835365295
Max gradient for layer 5 is 0.2675274908542633
Max gradient for layer 6 is 3.7072431791074223e-09
End of batch 3
Max gradient for layer 1 is 0.7640039324760437
Max gradient for layer 2 is 0.6926062107086182
Max gradient for layer 3 is 0.6164448857307434
Max gradient for layer 4 is 0.43013691902160645
Max gradient for layer 5 is 0.32356566190719604
Max gradient for layer 6 is 3.2926392723453546e-09
End of batch 4
Max gradient for layer 1 is 0.7604862451553345
Max gradient for layer 2 is 0.6908300518989563
Max gradient for layer 3 is 0.6122230887413025
Max gradient for layer 4 is 0.39982378482818604
Max gradient for layer 5 is 0.3172021210193634
Max gradient for layer 6 is 2.3238742041797877e-09
End of batch 5
```
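Note that the loop above reports the gradient of each layer's output with respect to the input batch. If the quantity of interest is instead the gradient of the loss with respect to each layer's weights, the idea can be sketched with manual backpropagation in NumPy. This is only an illustration under stated assumptions: a bias-free ReLU network with a softmax output and mean cross-entropy loss, random data in place of MNIST, and a hypothetical helper name `max_grad_per_layer`.

```python
import numpy as np

def max_grad_per_layer(weights, x, y_onehot):
    """Return max |dL/dW| for each weight matrix of a bias-free ReLU MLP
    with a softmax output and mean cross-entropy loss."""
    inputs, pre_acts = [], []
    a = x
    for i, w in enumerate(weights):
        inputs.append(a)            # input seen by layer i
        z = a @ w
        pre_acts.append(z)
        if i < len(weights) - 1:
            a = np.maximum(z, 0.0)  # ReLU on hidden layers only
    # Softmax over the last layer's logits (shifted for numerical stability)
    z = pre_acts[-1]
    e = np.exp(z - z.max(axis=1, keepdims=True))
    probs = e / e.sum(axis=1, keepdims=True)
    # Gradient of the mean cross-entropy w.r.t. the logits
    delta = (probs - y_onehot) / x.shape[0]
    max_grads = [0.0] * len(weights)
    for i in reversed(range(len(weights))):
        grad_w = inputs[i].T @ delta
        max_grads[i] = float(np.abs(grad_w).max())
        if i > 0:
            # Propagate through the weights and the ReLU of the previous layer
            delta = (delta @ weights[i].T) * (pre_acts[i - 1] > 0)
    return max_grads

rng = np.random.default_rng(0)
sizes = [784, 32, 32, 10]
weights = [rng.normal(0, np.sqrt(2 / m), (m, n)) for m, n in zip(sizes[:-1], sizes[1:])]
x = rng.random((32, 784))                 # one random mini-batch
y = np.eye(10)[rng.integers(0, 10, 32)]   # random one-hot labels
grads = max_grad_per_layer(weights, x, y)
for i, g in enumerate(grads, start=1):
    print("Max |dL/dW| for layer {} is {:.6f}".format(i, g))
```

In the TensorFlow loop, the corresponding values would come from the `grads` list returned by `tape.gradient(loss_value, classifier.trainable_weights)`, reduced with `tf.reduce_max(tf.abs(...))` per entry.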

QUESTION

I have generated some images from the Fashion MNIST dataset. However, I have not been able to come up with a function or a way to save each image as a single file; I have only found a way to save them in groups. Can someone help me save the images one by one?

This is what I have for the moment:

```
def generate_and_save_images(model, epoch, test_input):
    predictions = model(test_input, training=False)
    fig = plt.figure(figsize=(4, 4))
    for i in range(predictions.shape[0]):
        plt.subplot(4, 4, i + 1)
        plt.imshow(predictions[i, :, :, 0] * 127.5 + 127.5, cmap='gray')
        plt.axis('off')
    plt.savefig('image_at_epoch_{:04d}.png'.format(epoch))
    plt.show()
```

ANSWER

Answered 2022-Mar-13 at 17:07

Try using `plt.imsave` to save each image separately:

```
def generate_and_save_images(model, epoch, test_input):
    predictions = model(test_input, training=False)
    fig = plt.figure(figsize=(4, 4))
    for i in range(predictions.shape[0]):
        plt.subplot(4, 4, i + 1)
        plt.imshow(predictions[i, :, :, 0] * 127.5 + 127.5, cmap='gray')
        # save this single image to its own file
        plt.imsave('image_at_epoch_{:04d}-{}.png'.format(epoch, i), predictions[i, :, :, 0] * 127.5 + 127.5, cmap='gray')
        plt.axis('off')
    # save the 4x4 grid as before
    plt.savefig('image_at_epoch_{:04d}.png'.format(epoch))
    plt.show()
```
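The two ingredients of that answer, the pixel rescaling and the per-image file name, can be looked at in isolation. The sketch below assumes the generator output lies in [-1, 1] (which is what the `* 127.5 + 127.5` factor suggests) and uses random data in place of a model; `to_uint8` and `per_image_filenames` are hypothetical helper names.

```python
import numpy as np

def to_uint8(image):
    """Map values in [-1, 1] to integer pixel values in [0, 255]."""
    scaled = np.asarray(image, dtype=np.float64) * 127.5 + 127.5
    return np.clip(scaled, 0, 255).astype(np.uint8)

def per_image_filenames(epoch, count):
    """Build one file name per image, following the pattern used in the answer."""
    return ['image_at_epoch_{:04d}-{}.png'.format(epoch, i) for i in range(count)]

# Random stand-in for a batch of 16 generated 28x28 single-channel images
batch = np.random.default_rng(0).uniform(-1, 1, size=(16, 28, 28, 1))
images = [to_uint8(batch[i, :, :, 0]) for i in range(batch.shape[0])]
names = per_image_filenames(epoch=3, count=len(images))
print(names[0], images[0].shape, images[0].dtype)
```

Each `(name, image)` pair could then be handed to `plt.imsave(name, image, cmap='gray')`.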

QUESTION

I'm trying to implement a gradient-free optimizer function to train convolutional neural networks with Julia using Flux.jl. The reference paper is this: https://arxiv.org/abs/2005.05955. The paper proposes RSO, a gradient-free optimization algorithm that updates a single weight at a time on a sampling basis. The paper includes pseudocode for the algorithm.

I'm using MNIST dataset.

```
function train(; kws...)
args = Args(; kws...) # collect options in a struct for convenience
if CUDA.functional() && args.use_cuda
@info "Training on CUDA GPU"
CUDA.allowscalar(false)
device = gpu
else
@info "Training on CPU"
device = cpu
end
# Prepare datasets
x_train, x_test, y_train, y_test = getdata(args, device)
# Create DataLoaders (mini-batch iterators)
train_loader = DataLoader((x_train, y_train), batchsize=args.batchsize, shuffle=true)
test_loader = DataLoader((x_test, y_test), batchsize=args.batchsize)
# Construct model
model = build_model() |> device
ps = Flux.params(model) # model's trainable parameters
best_param = ps
if args.optimiser == "SGD"
# Regular training step with SGD
elseif args.optimiser == "RSO"
# Run RSO function and update ps
best_param .= RSO(x_train, y_train, args.RSOupdate, model, args.batchsize, device)
end
```

And the corresponding RSO function:

```
function RSO(X,L,C,model, batch_size, device)
"""
model = convolutional model structure
X = Input data
L = labels
C = Number of rounds to update parameters
W = Weight set of layers
Wd = Weight tensors of layer d that generates an activation
wid = weight tensor that generates an activation aᵢ
wj = a weight in wid
"""
# Normalize input data to have zero mean and unit standard deviation
X .= (X .- sum(X))./std(X)
train_loader = DataLoader((X, L), batchsize=batch_size, shuffle=true)
#println("model = $(typeof(model))")
std_prep = []
σ_d = Float64[]
D = 1
for layer in model
D += 1
Wd = Flux.params(layer)
# Initialize the weights of the network with Gaussian distribution
for id in Wd
wj = convert(Array{Float32, 4}, rand(Normal(0, sqrt(2/length(id))), (3,3,4,4)))
id = wj
append!(std_prep, vec(wj))
end
# Compute std of all elements in the weight tensor Wd
push!(σ_d, std(std_prep))
end
W = Flux.params(model)
# Weight update
for _ in 1:C
d = D
while d > 0
for id in 1:length(W[d])
# Randomly sample change in weights from Gaussian distribution
for j in 1:length(w[d][id])
# Randomly sample mini-batch
(x, l) = train_loader[rand(1:length(train_loader))]
# Sample a weight from normal distribution
ΔWj[d][id][j] = rand(Normal(0, σ_d[d]), 1)
loss, acc = loss_and_accuracy(data_loader, model, device)
W = argmin(F(x,l, W+ΔWj), F(x,l,W), F(x,l, W-ΔWj))
end
end
d -= 1
end
end
return W
end
```

The problem here is the second block of the RSO function. I'm trying to evaluate the loss with the change of a single weight in three scenarios, which are `F(w, l, W+gW), F(w, l, W), F(w, l, W-gW)`, and choose the weight set with the minimum loss. But how do I do that using Flux.jl? The loss function I'm trying to use is `logitcrossentropy(ŷ, y, agg=sum)`. In order to generate y_hat, we should use model(W), but changing a single weight parameter in Zygote.Params() form was already challenging.

ANSWER

Answered 2022-Jan-14 at 23:47

Based on the paper you shared, it looks like you need to change the weight arrays per output neuron per layer. Unfortunately, this means that the implementation of your optimization routine is going to depend on the layer type, since an "output neuron" for a convolution layer is quite different from one for a fully-connected layer. In other words, just looping over `Flux.params(model)` is not going to be sufficient, since this is just a set of all the weight arrays in the model and each weight array is treated differently depending on which layer it comes from.

Fortunately, Julia's multiple dispatch does make this easier to write if you use separate functions instead of a giant loop. I'll summarize the algorithm using the pseudo-code below:

```
for layer in model
for output_neuron in layer
for weight_element in parameters(output_neuron)
weight_element = sample(N(0, sqrt(2 / num_outputs(layer))))
end
end
sigmas[layer] = stddev(parameters(layer))
end
for c in 1 to C
for layer in reverse(model)
for output_neuron in layer
for weight_element in parameters(output_neuron)
x, y = sample(batches)
dw = N(0, sigmas[layer])
# optimize weights
end
end
end
end
```

It's the `for output_neuron ...` portions that we need to isolate into separate functions.

In the first block, we don't actually do anything different to every `weight_element`; they are all sampled from the same normal distribution. So, we don't actually need to iterate over the output neurons, but we do need to know how many there are.

```
using Flux
using Statistics: std

# these functions set the weights according to the
# normal distribution and the number of output neurons;
# they also return the standard deviation of the weights
function sample_weight!(layer::Dense)
    sample = randn(eltype(layer.weight), size(layer.weight))
    num_outputs = size(layer.weight, 1)
    # notice the "." notation which is used to mutate the array;
    # scaling by sqrt(2 / num_outputs) matches the pseudo-code above
    layer.weight .= sample .* sqrt(2 / num_outputs)
    return std(layer.weight)
end

function sample_weight!(layer::Conv)
    sample = randn(eltype(layer.weight), size(layer.weight))
    num_outputs = size(layer.weight, 4)
    # notice the "." notation which is used to mutate the array
    layer.weight .= sample .* sqrt(2 / num_outputs)
    return std(layer.weight)
end

sigmas = map(sample_weight!, model)
```

Now, for the second block, we will do a similar trick by defining different functions for each layer.

```
function optimize_layer!(loss, layer::Dense, data, sigma)
    for i in 1:size(layer.weight, 1)
        for j in 1:size(layer.weight, 2)
            wj = layer.weight[i, j]
            x, y = data[rand(1:length(data))]
            dw = randn() * sigma
            ws = [wj + dw, wj, wj - dw]
            # preallocate so that losses[k] can be assigned
            losses = zeros(Float32, length(ws))
            for (k, w) in enumerate(ws)
                layer.weight[i, j] = w
                losses[k] = loss(x, y)
            end
            layer.weight[i, j] = ws[argmin(losses)]
        end
    end
end

function optimize_layer!(loss, layer::Conv, data, sigma)
    for i in 1:size(layer.weight, 4)
        # we use a view to reference the full kernel
        # for this output channel
        wid = view(layer.weight, :, :, :, i)
        # each index lets us treat wid like a vector
        for j in eachindex(wid)
            wj = wid[j]
            x, y = data[rand(1:length(data))]
            dw = randn() * sigma
            ws = [wj + dw, wj, wj - dw]
            # preallocate so that losses[k] can be assigned
            losses = zeros(Float32, length(ws))
            for (k, w) in enumerate(ws)
                wid[j] = w
                losses[k] = loss(x, y)
            end
            wid[j] = ws[argmin(losses)]
        end
    end
end

for c in 1:C
    # collect the pairs first so that they can be reversed
    for (layer, sigma) in reverse(collect(zip(model, sigmas)))
        optimize_layer!(layer, data, sigma) do x, y
            logitcrossentropy(model(x), y; agg = sum)
        end
    end
end
```

Notice that nowhere did I use `Flux.params`, which does not help us here. Also, `Flux.params` would include both the weight and bias, and the paper doesn't look like it bothers with the bias at all. If you had an optimization method that generically optimized any parameter the same way regardless of layer type (i.e. like gradient descent), then you could use `for p in Flux.params(model) ...`.
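The core update (try `w + dw`, `w` and `w - dw`, keep whichever gives the lowest loss) is independent of Flux and can be tried on a toy problem. The sketch below is an illustration only, not the paper's full algorithm: it assumes a linear least-squares model, evaluates the loss on the full data set instead of a sampled mini-batch, and uses hypothetical names (`mse_loss`, `rso_step`).

```python
import numpy as np

def mse_loss(w, x, y):
    """Mean squared error of the linear model x @ w."""
    return float(np.mean((x @ w - y) ** 2))

def rso_step(w, x, y, sigma, rng):
    """One pass over all weights: for each weight, try w+dw, w, w-dw and keep the best."""
    w = w.copy()
    for j in range(w.size):
        wj = w.flat[j]
        dw = rng.normal(0.0, sigma)
        candidates = [wj + dw, wj, wj - dw]
        losses = []
        for c in candidates:
            w.flat[j] = c
            losses.append(mse_loss(w, x, y))
        w.flat[j] = candidates[int(np.argmin(losses))]
    return w

rng = np.random.default_rng(0)
x = rng.normal(size=(64, 5))
true_w = rng.normal(size=(5, 1))
y = x @ true_w
w = rng.normal(size=(5, 1))
sigma = float(np.std(w))  # perturbation scale taken from the initial weights
history = [mse_loss(w, x, y)]
for _ in range(50):
    w = rso_step(w, x, y, sigma, rng)
    history.append(mse_loss(w, x, y))
print("loss before: {:.4f}, after: {:.4f}".format(history[0], history[-1]))
```

Because the unchanged weight is always one of the three candidates and the loss here is computed on the full data, the loss cannot increase from one pass to the next. With sampled mini-batches, as in the paper, that guarantee only holds for the batch used in each comparison.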

QUESTION

I'm training GAN with MNIST and I want to visualize Generator output with noise input during training.

Here is the code:

```
from numpy import expand_dims
import numpy as np
import time
import tensorflow as tf
from numpy import zeros
from numpy import ones
from numpy import vstack
from numpy.random import randn
from numpy.random import randint
from tensorflow.keras.datasets.mnist import load_data
from tensorflow.keras.optimizers import Adam
from tensorflow.keras import layers, Sequential
import matplotlib.pyplot as plt
from IPython import display
import imageio # for creating gifs
import PIL
(trainX, _), (_, _) = load_data()
# add channels dimension
X = expand_dims(trainX, axis=-1)
# convert from unsigned ints to floats
X = X.astype('float32')
# scale from [0,255] to [0,1]
dataset = X / 255.0
def define_generator(latent_dim):
    model = Sequential()
    # foundation for 7x7 image
    n_nodes = 128 * 7 * 7
    model.add(layers.Dense(n_nodes, input_dim=latent_dim))
    model.add(layers.LeakyReLU())
    model.add(layers.Reshape((7, 7, 128)))
    # upsample to 14x14
    model.add(layers.Conv2DTranspose(128, (4,4), strides=(2,2), padding='same'))
    model.add(layers.LeakyReLU())
    # upsample to 28x28
    model.add(layers.Conv2DTranspose(128, (4,4), strides=(2,2), padding='same'))
    model.add(layers.LeakyReLU())
    model.add(layers.Conv2D(1, (7,7), activation='sigmoid', padding='same'))
    return model

def define_discriminator(in_shape=(28,28,1)):
    model = Sequential()
    model.add(layers.Conv2D(128, (5,5), strides=(2, 2), padding='same', input_shape=in_shape))
    model.add(layers.LeakyReLU())
    model.add(layers.Dropout(0.3))
    model.add(layers.Conv2D(128, (5,5), strides=(2, 2), padding='same'))
    model.add(layers.LeakyReLU())
    model.add(layers.Dropout(0.3))
    model.add(layers.Flatten())
    model.add(layers.Dense(1, activation='sigmoid'))
    # compile model
    opt = Adam(learning_rate=0.0002)
    model.compile(loss='binary_crossentropy', optimizer=opt, metrics=['accuracy'])
    return model

# size of the noise vector
latent_dim = 100
num_examples_to_generate = 16
g_model = define_generator(latent_dim)
d_model = define_discriminator()

# define the combined generator and discriminator model, for updating the generator
def define_gan(g_model, d_model):
    # make weights in the discriminator not trainable
    d_model.trainable = False
    # connect them
    model = Sequential()
    # add generator
    model.add(g_model)
    # add the discriminator
    model.add(d_model)
    # compile model
    opt = Adam(learning_rate=0.0002)
    model.compile(loss='binary_crossentropy', optimizer=opt)
    return model

gan_model = define_gan(g_model, d_model)

# select real samples
def generate_real_samples(dataset, n_samples):
    # choose random instances
    ix = randint(0, dataset.shape[0], n_samples)
    # retrieve selected images
    X = dataset[ix]
    # generate 'real' class labels (1)
    y = ones((n_samples, 1))
    return X, y

# generate points in latent space as input for the generator
def generate_latent_points(latent_dim, n_samples):
    # generate noise vector for training
    x_input = randn(latent_dim * n_samples)
    # reshape into a batch of inputs for the network
    x_input = x_input.reshape(n_samples, latent_dim)
    return x_input

# use the generator to generate n fake examples, with class labels
def generate_fake_samples(g_model, latent_dim, n_samples):
    # generate points in latent space
    x_input = generate_latent_points(latent_dim, n_samples)
    # predict outputs
    X = g_model.predict(x_input)
    # create 'fake' class labels (0)
    y = zeros((n_samples, 1))
    return X, y

def generate_and_save_images(epoch):
    # for generating the fake images after each epoch
    # generate points in the latent space
    noise = randn(latent_dim * num_examples_to_generate)
    # reshape into a batch of inputs for the network
    noise = noise.reshape(num_examples_to_generate, latent_dim)
    predictions = g_model(noise, training=False)
    fig = plt.figure(figsize=(4, 4))
    for i in range(predictions.shape[0]):
        plt.subplot(4, 4, i+1)
        plt.imshow(predictions[i, :, :, 0], cmap='gray')
        plt.axis('off')
    plt.savefig('image_at_epoch_{:04d}.png'.format(epoch))
    plt.show()

def train(g_model, d_model, gan_model, dataset, latent_dim, n_epochs=60, n_batch=256):
    bat_per_epo = int(dataset.shape[0] / n_batch)
    half_batch = int(n_batch / 2)
    for epoch in range(n_epochs):
        start = time.time()
        for batch in range(bat_per_epo):
            X_real, y_real = generate_real_samples(dataset, half_batch)
            X_fake, y_fake = generate_fake_samples(g_model, latent_dim, half_batch)
            X = vstack((X_real, X_fake))
            y = vstack((y_real, y_fake))
            d_loss, _ = d_model.train_on_batch(X,y)
            X_gan = generate_latent_points(latent_dim, n_batch)
            y_gan = ones((n_batch, 1))
            g_loss = gan_model.train_on_batch(X_gan, y_gan)
        display.clear_output(wait=True)
        generate_and_save_images(epoch + 1)
        print('Time for epoch {} is {} sec'.format(epoch + 1, time.time()-start))
    display.clear_output(wait=True)
    generate_and_save_images(n_epochs)

train(g_model, d_model, gan_model, dataset, latent_dim)
```

The output that I'm getting is:

There is no error; I just can't see the output from the generator for the noise input.

The function that is supposed to show the output is `generate_and_save_images`.

ANSWER

Answered 2022-Jan-15 at 02:45

When you use `cmap="gray"` in `plt.imshow()`, you must either unscale your output or set `vmin` and `vmax`. From what I see, you scaled the data by dividing by 255, so you must either multiply it by 255 or, alternatively, set `vmin=0, vmax=1`.

Option 1:

```
plt.imshow(predictions[i, :, :, 0]*255, cmap='gray')
```

Option 2:

```
plt.imshow(predictions[i, :, :, 0], cmap='gray', vmin=0, vmax=1)
```
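For reference, the effect of `vmin` and `vmax` can be sketched without plotting. With a colormap, `imshow` linearly rescales scalar data to the range 0..1 before looking up colours, and when `vmin` and `vmax` are not given they default to the data's own minimum and maximum. The helper `normalize` below is a hypothetical stand-in for that linear rescaling, not matplotlib's own code, and the sample values are made up.

```python
import numpy as np

def normalize(data, vmin=None, vmax=None):
    """Linearly map data to [0, 1]; limits default to the data's min and max."""
    data = np.asarray(data, dtype=np.float64)
    vmin = data.min() if vmin is None else vmin
    vmax = data.max() if vmax is None else vmax
    if vmax == vmin:
        return np.zeros_like(data)
    return np.clip((data - vmin) / (vmax - vmin), 0.0, 1.0)

# A nearly uniform "image" with values close to 0.5
flat = np.array([[0.49, 0.50], [0.50, 0.51]])
auto = normalize(flat)             # limits taken from the data
fixed = normalize(flat, 0.0, 1.0)  # limits fixed to the sigmoid output range
print(auto)
print(fixed)
```

With automatic limits, the tiny differences are stretched over the whole grey range; with fixed limits the image stays close to mid-grey, which reflects the actual pixel intensities.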

QUESTION

I'm working on a Conv-TasNet model; the model I built has about 5.05 million variables.

I want to train it using a custom training loop, and the problem is this part:

```
for i, (input_batch, target_batch) in enumerate(train_ds):  # each shape is (64, 32000, 1)
    with tf.GradientTape() as tape:
        predicted_batch = cv_tasnet(input_batch, training=True)  # model name
        loss = calculate_sisnr(predicted_batch, target_batch)  # some custom loss
    trainable_vars = cv_tasnet.trainable_variables
    gradients = tape.gradient(loss, trainable_vars)
    cv_tasnet.optimizer.apply_gradients(zip(gradients, trainable_vars))
```

This part exhausts all the GPU memory (24 GB available).

When I tried it without `tf.GradientTape() as tape`,

```
for i, (input_batch, target_batch) in enumerate(train_ds):
    predicted_batch = cv_tasnet(input_batch, training=True)
    loss = calculate_sisnr(predicted_batch, target_batch)
```

This uses a reasonable amount of GPU memory (about 5-6 GB).

I tried the same `tf.GradientTape() as tape` pattern on the basic MNIST data, and it works without problems.

So does the size matter? The same error arises even when I lower `BATCH_SIZE` to 32 or smaller.

Why does the first code block exhaust all the GPU memory?

Of course, I put

```
gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    try:
        # Currently, memory growth needs to be the same across GPUs
        for gpu in gpus:
            tf.config.experimental.set_memory_growth(gpu, True)
    except RuntimeError as e:
        # Memory growth must be set before GPUs have been initialized
        print(e)
```

this code in the very first cell.

ANSWER

Answered 2022-Jan-07 at 11:08

Gradient tape triggers automatic differentiation, which requires tracking gradients for all your weights and activations. Autodiff requires several times more memory. This is normal. You'll have to manually tune your batch size until you find one that works, then tune your learning rate. Usually, tuning just means guess and check or grid search. (I am working on a product to do all of that for you, but I'm not here to plug it.)
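A back-of-the-envelope estimate shows why the tape changes the picture. In a plain forward pass, an intermediate activation can be released once the next layer has consumed it; under the tape, activations are kept until the gradients have been computed. The sketch below uses made-up layer counts and channel widths (they are assumptions, not the asker's actual architecture), and `activation_bytes` is a hypothetical helper.

```python
def activation_bytes(batch_size, samples, channels, n_layers, bytes_per_value=4):
    """Rough memory needed to keep one float32 activation per layer."""
    return batch_size * samples * channels * n_layers * bytes_per_value

def gib(n_bytes):
    """Convert a byte count to GiB."""
    return n_bytes / 2**30

# Hypothetical numbers: 64 clips of 32000 samples, 128 channels kept at full
# length, and 30 layers whose outputs must be retained for the backward pass
forward_only = activation_bytes(64, 32000, 128, n_layers=1)
with_tape = activation_bytes(64, 32000, 128, n_layers=30)
print("one layer: {:.2f} GiB, all layers: {:.2f} GiB".format(gib(forward_only), gib(with_tape)))
```

The estimate scales linearly with batch size, so halving the batch halves the retained activation memory. If even small batches fail, the per-sample activation footprint itself is the likely limit.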

QUESTION

I want to apply a partial Tucker decomposition algorithm to reduce the MNIST image tensor dataset of shape (60000, 28, 28) while conserving its features, so that I can apply another machine learning algorithm such as SVM afterwards. I have this code that reduces the second and third dimensions of the tensor:

```
import tensorly as tl
from tensorly.decomposition import partial_tucker

i = 16
j = 10
modes = [1, 2]  # decompose along the two image dimensions only
core, factors = partial_tucker(train_data_mnist, modes=modes, tol=10e-5, rank=[i, j])
train_data_partial_tucker = tl.tenalg.multi_mode_dot(train_data_mnist, factors,
                                                     modes=modes, transpose=True)
test_data_partial_tucker = tl.tenalg.multi_mode_dot(test_data_mnist, factors,
                                                    modes=modes, transpose=True)
```

How do I find the best rank `[i,j]` when using `partial_tucker` in tensorly, i.e. the rank that gives the best dimension reduction for the images while conserving as much data as possible?

ANSWER

Answered 2021-Dec-28 at 21:05

If you look at the source code for `tensorly`, you can see that the documentation for the function in question, `partial_tucker`, says:

```
"""
Partial tucker decomposition via Higher Order Orthogonal Iteration (HOI)
Decomposes 'tensor' into a Tucker decomposition exclusively along
the provided modes.
Parameters
----------
tensor: ndarray
modes: int list
list of the modes on which to perform the decomposition
rank: None, int or int list
size of the core tensor,
if int, the same rank is used for all modes
"""
```

The purpose of this function is to provide you with the approximation which conserves as much data as possible *for a given rank*. I cannot tell you which rank "will give the best dimension reduction for the image while conserving as much data", because the optimal tradeoff between dimensionality reduction and loss of precision has no objectively "correct" answer in the abstract; it will largely depend on the specific goals of your project and the computational resources available to you to achieve those goals.

If I told you to use "the best rank", it would defeat the purpose of this approximate decomposition in the first place, because "the best rank" would be the rank which gives no "loss", which is no longer an approximation of fixed rank and renders the term *approximation* meaningless. How far to stray from this "best rank" in order to gain dimensionality reduction is not a question anyone else can answer for you objectively. One could certainly give an opinion, but that opinion would depend on much more information than I have from you at the moment. If you are looking for a more in-depth perspective on this tradeoff and which tradeoff suits you best, I would suggest posting a question with more detail on your situation on a Stack Exchange site that focuses on the mathematical and statistical underpinnings of dimensionality reduction rather than on the programming aspects that Stack Overflow covers, such as Cross Validated or perhaps Data Science.
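One practical way to explore the tradeoff is to sweep over candidate ranks and record the relative reconstruction error for each. The sketch below is not tensorly's HOOI-based `partial_tucker`; it assumes a simpler truncated higher-order SVD on modes 1 and 2, written in plain NumPy, with random data standing in for the MNIST tensor. The helper names `mode_factor` and `relative_error` are hypothetical.

```python
import numpy as np

def mode_factor(tensor, mode, rank):
    """Leading left singular vectors of the unfolding along `mode`."""
    unfolded = np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)
    u, _, _ = np.linalg.svd(unfolded, full_matrices=False)
    return u[:, :rank]

def relative_error(tensor, ranks):
    """Relative error after projecting modes 1 and 2 of an (N, H, W) tensor."""
    u1 = mode_factor(tensor, 1, ranks[0])
    u2 = mode_factor(tensor, 2, ranks[1])
    # Project onto the factor subspaces (reduced representation) ...
    core = np.einsum('nhw,hi,wj->nij', tensor, u1, u2)
    # ... and map back to the original size to measure what was lost
    approx = np.einsum('nij,hi,wj->nhw', core, u1, u2)
    return float(np.linalg.norm(tensor - approx) / np.linalg.norm(tensor))

rng = np.random.default_rng(0)
data = rng.random((200, 28, 28))  # synthetic stand-in for the MNIST tensor
for ranks in [(4, 4), (8, 8), (16, 10), (28, 28)]:
    print(ranks, round(relative_error(data, ranks), 4))
```

A rank could then be chosen as the smallest one whose error, or whose downstream classifier accuracy, is acceptable for the project, which is the subjective part described above.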

Sources/References/Further Reading:

- Blog/Article on Tucker Decompositions
- "Some mathematical notes on three-mode factor analysis" - Paper on Tucker Decomposition by Ledyard R Tucker
- "A Multilinear Singular Value Decomposition" by Lathauwer et al
- "On the Best Rank-1 and Rank-(R1 ,R2 ,. . .,RN) Approximation of Higher-Order Tensors" by Lathauwer et al

QUESTION

I have created a working CNN model in Keras/TensorFlow, and have successfully used the CIFAR-10 and MNIST datasets to test this model. The functioning code is shown below:

```
import keras
from keras.datasets import cifar10
from keras.utils import to_categorical
from keras.models import Sequential
from keras.layers import Dense, Activation, Dropout, Conv2D, Flatten, MaxPooling2D, BatchNormalization
(X_train, y_train), (X_test, y_test) = cifar10.load_data()
#reshape data to fit model
X_train = X_train.reshape(50000,32,32,3)
X_test = X_test.reshape(10000,32,32,3)
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)
# Building the model
model = Sequential()
#1st Convolutional Layer
model.add(Conv2D(filters=64, input_shape=(32,32,3), kernel_size=(11,11), strides=(4,4), padding='same'))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2,2), strides=(2,2), padding='same'))
#2nd Convolutional Layer
model.add(Conv2D(filters=224, kernel_size=(5, 5), strides=(1,1), padding='same'))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2,2), strides=(2,2), padding='same'))
#3rd Convolutional Layer
model.add(Conv2D(filters=288, kernel_size=(3,3), strides=(1,1), padding='same'))
model.add(BatchNormalization())
model.add(Activation('relu'))
#4th Convolutional Layer
model.add(Conv2D(filters=288, kernel_size=(3,3), strides=(1,1), padding='same'))
model.add(BatchNormalization())
model.add(Activation('relu'))
#5th Convolutional Layer
model.add(Conv2D(filters=160, kernel_size=(3,3), strides=(1,1), padding='same'))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2,2), strides=(2,2), padding='same'))
model.add(Flatten())
# 1st Fully Connected Layer
model.add(Dense(4096, input_shape=(32,32,3,)))
model.add(BatchNormalization())
model.add(Activation('relu'))
# Add Dropout to prevent overfitting
model.add(Dropout(0.4))
#2nd Fully Connected Layer
model.add(Dense(4096))
model.add(BatchNormalization())
model.add(Activation('relu'))
#Add Dropout
model.add(Dropout(0.4))
#3rd Fully Connected Layer
model.add(Dense(1000))
model.add(BatchNormalization())
model.add(Activation('relu'))
#Add Dropout
model.add(Dropout(0.4))
#Output Layer
model.add(Dense(10))
model.add(BatchNormalization())
model.add(Activation('softmax'))
#compile model using accuracy to measure model performance
opt = keras.optimizers.Adam(learning_rate = 0.0001)
model.compile(optimizer=opt, loss='categorical_crossentropy',
metrics=['accuracy'])
#train the model
model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=30)
```

From this point after utilising the aforementioned datasets, I wanted to go one further and use a dataset with more channels than a greyscale or rgb presented, hence the inclusion of a hyperspectral dataset. When looking for a hyperspectral dataset I came across this one.

The issue at this stage was realising that this hyperspectral dataset was one image, with each value in the ground truth relating to each pixel. At this stage I reformatted the data from this into a collection of hyperspectral data/pixels.

**Code reformatting corrected dataset for x_train & x_test:**

```
import keras
import scipy
import numpy as np
import matplotlib.pyplot as plt
from keras.utils import to_categorical
from scipy import io
mydict = scipy.io.loadmat('Indian_pines_corrected.mat')
dataset = np.array(mydict.get('indian_pines_corrected'))
#This is creating the split between x_train and x_test from the original dataset
# x_train after this code runs will have a shape of (121, 145, 200)
# x_test after this code runs will have a shape of (24, 145, 200)
x_train = np.zeros((121,145,200), dtype=int)
x_test = np.zeros((24,145,200), dtype=int)
xtemp = np.array_split(dataset, [121])
x_train = np.array(xtemp[0])
x_test = np.array(xtemp[1])
# x_train will have a shape of (17545, 200)
# x_test will have a shape of (3480, 200)
x_train = x_train.reshape(-1, x_train.shape[-1])
x_test = x_test.reshape(-1, x_test.shape[-1])
```

**Code reformatting ground truth dataset for Y_train & Y_test:**

```
truthDataset = scipy.io.loadmat('Indian_pines_gt.mat')
gTruth = truthDataset.get('indian_pines_gt')
#This is creating the split between Y_train and Y_test from the original dataset
# Y_train after this code runs will have a shape of (121, 145)
# Y_test after this code runs will have a shape of (24, 145)
Y_train = np.zeros((121,145), dtype=int)
Y_test = np.zeros((24,145), dtype=int)
ytemp = np.array_split(gTruth, [121])
Y_train = np.array(ytemp[0])
Y_test = np.array(ytemp[1])
# Y_train will have a shape of (17545)
# Y_test will have a shape of (3480)
Y_train = Y_train.reshape(-1)
Y_test = Y_test.reshape(-1)
#17 binary categories ranging from 0-16
#Y_train one-hot encode target column
Y_train = to_categorical(Y_train)
#Y_test one-hot encode target column
Y_test = to_categorical(Y_test, num_classes = 17)
```

My thought process was that, despite the initial image being broken down into 1x1 patches, the large number of channels each patch possessed with their respective values would aid in categorisation of the dataset.

Essentially I'd want to input this reformatted data into my model (seen within the first code fragment in this post), however I'm uncertain whether I am taking the wrong approach due to my inexperience in this area. I was expecting to input a shape of (1,1,200), i.e. the shapes of x_train and x_test would be (17545,1,1,200) and (3480,1,1,200) respectively.
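
For reference, the reshape described here could look like the minimal sketch below. The zero-filled arrays are placeholders standing in for the flattened `x_train` and `x_test` produced earlier, and the `_4d` names are hypothetical.

```python
import numpy as np

# Placeholder arrays with the shapes produced by the reformatting code above.
x_train = np.zeros((17545, 200), dtype=np.float32)
x_test = np.zeros((3480, 200), dtype=np.float32)

# Add two singleton spatial axes so each pixel becomes a 1x1 "image"
# with 200 channels.
x_train_4d = x_train.reshape(-1, 1, 1, x_train.shape[-1])
x_test_4d = x_test.reshape(-1, 1, 1, x_test.shape[-1])

print(x_train_4d.shape)  # (17545, 1, 1, 200)
print(x_test_4d.shape)   # (3480, 1, 1, 200)
```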

ANSWER

Answered 2021-Dec-16 at 10:18

If the hyperspectral dataset is given to you as a large image with many channels, I suppose that the classification of each pixel should depend on the pixels around it (otherwise there would be no reason to format the data as an image, i.e. with a grid structure). Given this assumption, breaking up the input picture into 1x1 parts is not a good idea, as you are losing the grid structure.

I further suppose that the order of the channels is arbitrary, which implies that convolution over the channels is probably not meaningful (which you however did not plan to do anyways).

Instead of reformatting the data the way you did, you may want to create a model that takes an image as input and also outputs an "image" containing the classifications for each pixel. I.e. if you have 10 classes and take a (145, 145, 200) image as input, your model would output a (145, 145, 10) image. In that architecture you would not have any fully-connected layers. Your output layer would also be a convolutional layer.

That, however, means that you will not be able to keep your current architecture, because the tasks for MNIST/CIFAR10 and your hyperspectral dataset are not the same. For MNIST/CIFAR10 you want to classify an image in its entirety, while for the other dataset you want to assign a class to each pixel (most likely also using the pixels around it).
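
The image-in, image-out shapes described above can be illustrated with a small NumPy sketch. It is not a trained model: the image and weights are random placeholders, and the single layer is a 1x1 convolution written as a per-pixel matrix product. A real network would presumably put several larger convolutions with size-preserving padding in front of it, so that neighbouring pixels influence each prediction.

```python
import numpy as np

rng = np.random.default_rng(0)
height, width, channels, num_classes = 145, 145, 200, 10

# Placeholder hyperspectral image and placeholder 1x1 convolution weights.
image = rng.normal(size=(height, width, channels))
kernel = rng.normal(size=(channels, num_classes)) * 0.01
bias = np.zeros(num_classes)

# A 1x1 convolution applies the same channel-mixing matrix at every pixel.
logits = np.einsum("hwc,ck->hwk", image, kernel) + bias

# Softmax over the class axis gives per-pixel class probabilities.
shifted = logits - logits.max(axis=-1, keepdims=True)
probs = np.exp(shifted) / np.exp(shifted).sum(axis=-1, keepdims=True)

# The predicted label map holds one class index per pixel.
label_map = probs.argmax(axis=-1)

print(probs.shape)      # (145, 145, 10)
print(label_map.shape)  # (145, 145)
```

The ground truth for such a model would be a one-hot label image of the same height and width, rather than one label per input.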

Some further ideas:

- If you want to turn the pixel classification task on the hyperspectral dataset into a classification task for an entire image, maybe you can reformulate that task as "classifying a hyperspectral image as the class of its center (or top-left, or bottom-right, or (21st, 104th), or whatever) pixel". To obtain the data from your single hyperspectral image, for each pixel, I would shift the image such that the target pixel is at the desired location (e.g. the center). All pixels that "fall off" the border could be inserted at the other side of the image.
- If you want to stick with a pixel classification task but need more data, maybe split up the single hyperspectral image you have into many smaller images (e.g. 10x10x200). You may even want to use images of many different sizes. If your model only has convolution and pooling layers and you make sure to maintain the sizes of the image, that should work out.
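
Both ideas can be sketched in NumPy. The helper names `center_on_pixel` and `split_into_tiles` are hypothetical, the image is synthetic, and dropping the leftover border when the image size is not a multiple of the tile size is an assumption rather than something the answer specifies.

```python
import numpy as np

def center_on_pixel(image, row, col):
    """Cyclically shift `image` so that pixel (row, col) lands at the center.

    Pixels pushed past one border re-enter on the opposite side.
    """
    center_row, center_col = image.shape[0] // 2, image.shape[1] // 2
    return np.roll(image, shift=(center_row - row, center_col - col), axis=(0, 1))

def split_into_tiles(image, tile):
    """Cut `image` into non-overlapping tile x tile patches, dropping any remainder."""
    rows, cols = image.shape[0] // tile, image.shape[1] // tile
    cropped = image[: rows * tile, : cols * tile]
    tiles = cropped.reshape(rows, tile, cols, tile, -1).swapaxes(1, 2)
    return tiles.reshape(rows * cols, tile, tile, -1)

# Synthetic stand-in for the (145, 145, 200) hyperspectral image.
image = np.arange(145 * 145 * 200, dtype=np.float32).reshape(145, 145, 200)

shifted = center_on_pixel(image, row=20, col=103)
tiles = split_into_tiles(image, tile=10)

print(shifted.shape)  # (145, 145, 200)
print(tiles.shape)    # (196, 10, 10, 200)
```

For the first idea, the label for each shifted image would be the ground-truth class of the pixel that was moved to the center. For the second, the ground-truth map would need to be tiled the same way.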

QUESTION

I want to import MNIST digits and show them in one figure, with code like this:

```
import keras
from keras.datasets import mnist
import matplotlib.pyplot as plt
(X_train, y_train), (X_test, y_test) = mnist.load_data()
fig = plt.figure(figsize=(8,8))
n = 0
for i in range(5):
    for j in range(5):
        plt.subplot(5, 5, i*5 + j + 1)
        plt.imshow(X_train[n], cmap='Greys')
        plt.title("Digit:{}".format(y_train[n]))
        n += 1
        plt.tight_layout()
plt.show()
```

ANSWER

Answered 2021-Dec-07 at 04:04

I was able to reproduce this bug too. It seems to be related to the `plt.tight_layout()` that you apply within the loop. Instead of doing this, use `plt.subplots` to produce the axes objects first, then iterate over those instead. Once you plot everything, use `tight_layout` on the opened figure:

```
import keras
from keras.datasets import mnist
import matplotlib.pyplot as plt
(X_train, y_train), (X_test, y_test) = mnist.load_data()
fig, axes = plt.subplots(nrows=5, ncols=5, figsize=(8,8))
for i, ax in enumerate(axes.flat):
    ax.imshow(X_train[i], cmap='Greys')
    ax.set_title("Digit:{}".format(y_train[i]))
fig.tight_layout()
plt.show()
```

QUESTION

I am trying to run a TensorFlow Lite model in my app on a smartphone. First, I built the model layers with TensorFlow.Keras and trained the LSTM model on numerical data. I used TensorFlow V2.x and saved the trained model on a server. After that, the app downloads the model to the internal memory of the smartphone and loads it into the interpreter using "MappedByteBuffer". Up to this point everything works correctly.

The problem is that the interpreter cannot read and run the model. I also added the required dependencies to build.gradle.

**The conversion code to tflite model in python:**

```
import tensorflow as tf  # needed for the tf.saved_model and tf.lite calls below
from tensorflow import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, LSTM
from tensorflow.keras import regularizers
#Create the network
model = Sequential()
model.add(LSTM(...... name = 'First_layer'))
model.add(Dropout(rate=Drop_out))
model.add(LSTM(...... name = 'Second_layer'))
model.add(Dropout(rate=Drop_out))
# compile model
model.compile(loss=keras.losses.mae,
optimizer=keras.optimizers.Adam(learning_rate=learning_rate), metrics=["mae"])
# fit model
model.fit(.......)
#save the model
tf.saved_model.save(model,'saved_model')
print("Model type", model.dtype)  # Model type is float32 and size around 2MB
#Convert saved model into TFlite
converter = tf.lite.TFLiteConverter.from_saved_model('saved_model')
tflite_model = converter.convert()
# The with-block closes the file automatically, so no explicit close() is needed
with open("Model.tflite", "wb") as f:
    f.write(tflite_model)
```

**I also tried another conversion approach using Keras:**

```
# converter = tf.lite.TFLiteConverter.from_keras_model(keras_model)
# tflite_model = converter.convert()
```

After this step, the "Model.tflite" is converted and downloaded to the internal memory of the smartphone.

**Android studio code:**

```
try {
    private Interpreter tflite = new Interpreter(loadModelFile());
    Log.d("Load_model", "Created a Tensorflow Lite of AutoAuth.");
} catch (IOException e) {
    Log.e("Load_model", "IOException loading the tflite file");
}

private MappedByteBuffer loadModelFile() throws IOException {
    String model_path = model_directory + model_name + ".tflite";
    Log.d(TAG, model_path);
    File file = new File(model_path);
    if (file != null) {
        FileInputStream inputStream = new FileInputStream(file);
        FileChannel fileChannel = inputStream.getChannel();
        return fileChannel.map(FileChannel.MapMode.READ_ONLY, 0, file.length());
    } else {
        return null;
    }
}
```

The "loadModelFile()" function is working correctly because I checked it with another tflite model using MNIST dataset for image classification. The problem is only the interpreter.

**This is also build.gradle's contents:**

```
android {
    aaptOptions {
        noCompress "tflite"
    }
}
android {
    defaultConfig {
        ndk {
            abiFilters 'armeabi-v7a', 'arm64-v8a'
        }
    }
}
dependencies {
    implementation 'com.jakewharton:butterknife:8.8.1'
    implementation 'org.tensorflow:tensorflow-lite:0.1.2-nightly'
    annotationProcessor 'com.jakewharton:butterknife-compiler:8.8.1'
    implementation fileTree(dir: 'libs', include: ['*.jar'])
    //noinspection GradleCompatible
    implementation 'com.android.support:appcompat-v7:28.0.0'
    implementation 'com.android.support.constraint:constraint-layout:2.0.4'
    testImplementation 'junit:junit:4.12'
    androidTestImplementation 'com.android.support.test:runner:1.0.2'
    androidTestImplementation 'com.android.support.test.espresso:espresso-core:3.0.2'
}
```

ANSWER

Answered 2021-Nov-24 at 00:05

Referring to one of the most recent TfLite Android app examples might help: Model Personalization App. This demo app uses a transfer learning model instead of an LSTM, but the overall workflow should be similar.

As Farmaker mentioned in the comment, try using SNAPSHOT in the gradle dependency:

```
implementation 'org.tensorflow:tensorflow-lite:0.0.0-nightly-SNAPSHOT'
```

To load the model properly, can you try:

```
protected MappedByteBuffer loadMappedFile(String filePath) throws IOException {
    AssetFileDescriptor fileDescriptor = assetManager.openFd(this.directoryName + "/" + filePath);
    FileInputStream inputStream = new FileInputStream(fileDescriptor.getFileDescriptor());
    FileChannel fileChannel = inputStream.getChannel();
    long startOffset = fileDescriptor.getStartOffset();
    long declaredLength = fileDescriptor.getDeclaredLength();
    return fileChannel.map(MapMode.READ_ONLY, startOffset, declaredLength);
}
```

This snippet can also be found in the GitHub example link I posted above.

Community Discussions, Code Snippets contain sources that include Stack Exchange Network

## Vulnerabilities

No vulnerabilities reported

## Install MNIST

You can use MNIST like any standard Java library. Please include the jar files in your classpath. You can also use any IDE to run and debug the MNIST component as you would any other Java program. Best practice is to use a build tool that supports dependency management, such as Maven or Gradle. For Maven installation, please refer to maven.apache.org. For Gradle installation, please refer to gradle.org.
