emnist | Classify EMNIST Digits using Convolutional Neural Networks | Machine Learning library

 by j05t | Jupyter Notebook | Version: Current | License: GPL-3.0

kandi X-RAY | emnist Summary

emnist is a Jupyter Notebook library typically used in Artificial Intelligence, Machine Learning, and Deep Learning applications. emnist has no reported bugs or vulnerabilities, carries a Strong Copyleft (GPL-3.0) license, and has low support. You can download it from GitHub.

EMNIST (Cohen, G., Afshar, S., Tapson, J. and van Schaik, A., 2017. EMNIST: an extension of MNIST to handwritten letters.) Digits dataset, downloaded in Matlab format. The Matlab-format dataset can be conveniently imported with scipy.io.loadmat. All models were trained from scratch on the EMNIST Digits training data using real-time data augmentation. All test error rates are given in percent. All results were obtained with Keras using the Theano backend; a slightly adapted (and more recent) version of the source code using the TensorFlow backend is also available, with separate instructions for setup and usage. Detailed results and the best model weights have been uploaded, and a JSON/H5 export of the best single model is in the export_json_h5 directory. The input-normalization function for this exported model hard-codes the image mean and standard deviation computed on the EMNIST Digits training data.
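As a rough illustration of the exported model's input-normalization step, here is a minimal pure-Python sketch. The MEAN and STD values below are placeholders, not the actual hard-coded EMNIST Digits statistics, and the normalize function is a hypothetical name:

```python
MEAN = 0.5   # placeholder; the export hard-codes the training-set mean
STD = 0.25   # placeholder; the export hard-codes the training-set std dev

def normalize(pixels):
    """Standardize flattened pixel values with fixed dataset statistics,
    as the exported model's input-normalization function does."""
    return [(p - MEAN) / STD for p in pixels]
```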

            Support

              emnist has a low active ecosystem.
              It has 38 stars, 18 forks, and 2 watchers.
              It has had no major release in the last 6 months.
              emnist has no reported issues and no open pull requests.
              It has a neutral sentiment in the developer community.
              The latest version of emnist is current.

            Quality

              emnist has no bugs reported.

            Security

              emnist has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.

            License

              emnist is licensed under the GPL-3.0 License. This license is Strong Copyleft.
              Strong Copyleft licenses enforce sharing, and you can use them when creating open source projects.

            Reuse

              emnist releases are not available. You will need to build from source code and install.


            emnist Key Features

            No Key Features are available at this moment for emnist.

            emnist Examples and Code Snippets

            No Code Snippets are available at this moment for emnist.

            Community Discussions

            QUESTION

            How to flatten a test image dataset and create a batch of tuples of (flattened image, labels)?
            Asked 2022-Feb-04 at 18:21

            I'm working on handwritten math symbol classification using Federated Learning. I have preprocessed the images with keras.preprocessing.image.ImageDataGenerator and also obtained the labels for each image.

            ...

            ANSWER

            Answered 2022-Feb-04 at 18:20

            You can try something like this:
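The snippet from the answer is not preserved on this page, but the core idea can be sketched in plain Python. The flatten_and_batch helper and the toy data below are illustrative names of mine, not from the original answer; a real pipeline would do the same with tf.data:

```python
# Pure-Python sketch: flatten each H x W image to a 1-D list, pair it with
# its label, then group the (flattened_image, label) tuples into batches.
def flatten_and_batch(images, labels, batch_size):
    flat = [[px for row in img for px in row] for img in images]
    pairs = list(zip(flat, labels))
    return [pairs[i:i + batch_size] for i in range(0, len(pairs), batch_size)]
```

With an actual tf.data.Dataset, the equivalent would be something like `ds.map(lambda x, y: (tf.reshape(x, [-1]), y)).batch(batch_size)`.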

            Source https://stackoverflow.com/questions/70990386

            QUESTION

            How to get rid of placements (SERVER or CLIENTS) so that I can transform float32@SERVER to float32?
            Asked 2021-Jul-18 at 12:42

            I am trying to do the learning rate decay challenge from the Building Your Own Federated Learning Algorithm tutorial. I have used the following code

            ...

            ANSWER

            Answered 2021-Jul-18 at 10:49

            The problem in your code is in how federated_server_type_with_LR is created manually.

            In the type system, a structure of federated values (each placed at @SERVER) is different from a federated structure placed at @SERVER as a whole. You can convert the former to the latter by using tff.federated_zip(), which promotes the placement to the top level.

            Two solutions:

            (1) Modify the decorator of next_fn to be @tff.federated_computation(tff.federated_zip(federated_server_type_with_LR), federated_dataset_type)

            (2) [preferred, to avoid this kind of issue] Do not create the type manually, and read it from initialize_fn instead. The decorator would be @tff.federated_computation(initialize_fn.type_signature.result, federated_dataset_type)

            Source https://stackoverflow.com/questions/68424157

            QUESTION

            Tensorflow federated (TFF) 0.19 performs significantly worse than TFF 0.17 when running "Building Your Own Federated Learning Algorithm" tutorial
            Asked 2021-Jul-14 at 02:04

            At the very end of the "Building Your Own Federated Learning Algorithm" tutorial it is stated that, after training our model for 15 rounds, we should expect a sparse_categorical_accuracy around 0.25, but running the tutorial in Colab as is gives a result between 0.09 and 0.11 in my runs. Yet simply changing the tf and tff versions to 2.3.x and 0.17, respectively, gives a result around 0.25, just as expected!

            To replicate, run the said tutorial as is (it uses tf 2.5 and tff 0.19). After that, run the same tutorial again, simply changing

            ...

            ANSWER

            Answered 2021-Jul-12 at 16:21

            TFF 0.19 moved the provided datasets (including EMNIST, which is used in the tutorial) away from an HDF5-backed implementation to a SQL-backed implementation (commit). It's possible that this changed the ordering of the clients, which would change which clients are used for training in the tutorial.

            It's worth noting that in most simulations, this should not change anything. Clients should generally be randomly sampled at each round (which is not done in the tutorial for reasons of exposition) and generally at least 100 rounds should be done (as you say).

            I'll update the tutorial to guarantee reproducibility by sorting the client ids, and then selecting them in order.

            For anyone who's interested, a better practice would be to a) sort the client ids, and then b) sample using something like np.random.RandomState, as in the following snippet:
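The referenced snippet is missing here; a stdlib-only sketch of the suggested practice might look like the following, where sample_clients is a hypothetical name and random.Random stands in for np.random.RandomState:

```python
import random

def sample_clients(client_ids, num_clients, seed):
    """a) Sort the ids so the ordering is deterministic across dataset
    backends, then b) draw a reproducible sample with a seeded RNG."""
    ordered = sorted(client_ids)
    rng = random.Random(seed)
    return rng.sample(ordered, num_clients)
```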

            Source https://stackoverflow.com/questions/68341156

            QUESTION

            Flipping the labels of a TF dataset
            Asked 2021-Jun-23 at 17:24

            I want to create a malicious dataset for CIFAR-100 to test a Federated Learning Attack similar to this malicious dataset for EMNIST:

            ...

            ANSWER

            Answered 2021-Jun-23 at 17:24

            In general, tf.data.Dataset objects can be modified using their .map method. So for example, a simple label flipping could be done as follows:
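The answer's snippet is not preserved; assuming a toy list of (image, label) pairs in place of a tf.data.Dataset, a simple label flip via a map function could look like this (flip_label and the toy data are illustrative, not from the answer):

```python
# Toy stand-in for a tf.data.Dataset: a list of (image, label) pairs.
dataset = [([0.0, 0.1, 0.2, 0.3], 3), ([0.4, 0.5, 0.6, 0.7], 0)]

def flip_label(example, num_classes=10):
    """Map function: keep the image, replace label l with
    num_classes - 1 - l (so 0 <-> 9, 3 <-> 6, ...)."""
    image, label = example
    return image, (num_classes - 1) - label

flipped = [flip_label(ex) for ex in dataset]
```

With an actual tf.data.Dataset, the same transformation would be something like `ds.map(lambda x, y: (x, num_classes - 1 - y))`.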

            Source https://stackoverflow.com/questions/68104098

            QUESTION

            ValueError: shapes (240000,28,28) and (2,512) not aligned: 28 (dim 2) != 2 (dim 0)
            Asked 2021-May-07 at 20:49

            I'm making a CNN and I've got this error that the matrices don't align. I understand the error but I don't know how to fix it. Here is the code:

            ...

            ANSWER

            Answered 2021-May-07 at 20:49

            Firstly, you should flatten your input so its shape is (240000, 28*28) = (240000, 784). After that, the problem is in this line:
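To see why flattening fixes the alignment error, a small helper (hypothetical, stdlib-only) shows the shape arithmetic: after collapsing the trailing dimensions, the inner dimension (784) can match a first dense layer that expects 784 inputs:

```python
def flattened_shape(shape):
    """Collapse all but the leading (batch) dimension, e.g.
    (240000, 28, 28) -> (240000, 784), so a matmul against a
    (784, 512) weight matrix aligns on the inner dimension."""
    n, *rest = shape
    size = 1
    for d in rest:
        size *= d
    return (n, size)
```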

            Source https://stackoverflow.com/questions/67441364

            QUESTION

            How to feed TensorFlow Datasets into training_x, training_y, testing_x, testing_y into the Keras API?
            Asked 2021-Mar-28 at 07:24

            TensorFlow Datasets is a convenient tool for using datasets from the internet. However, I got confused about how to feed one into the Input layer of the TensorFlow Keras API. The dataset used was TensorFlow Datasets' emnist.

            Here's what was known:

            Point 1: Instead of storing the dataset in memory, TensorFlow Datasets wraps the tensorflow data module, preprocesses the dataset on the hard drive, and uses a pipeline (a class-instance-like object?) to feed the data into the Python function. It does so with a load function.

            Issue 1 "as_supervised": However, there were two "different" load behaviors depending on whether "as_supervised" is on,

            ...

            ANSWER

            Answered 2021-Mar-28 at 07:24
            Issue 1

            About as_supervised, according to the doc

            bool, if True, the returned tf.data.Dataset will have a 2-tuple structure (input, label) according to builder.info.supervised_keys. If False, the default, the returned tf.data.Dataset will have a dictionary with all the features.
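A tiny sketch of what as_supervised changes, using a plain dict in place of a TFDS example. The key names follow the doc's supervised_keys idea; the to_supervised helper itself is hypothetical:

```python
# Hypothetical feature dict, as returned with as_supervised=False.
example = {"image": [0.1, 0.2, 0.3], "label": 7}

def to_supervised(ex, supervised_keys=("image", "label")):
    """Mimic as_supervised=True: turn the feature dict into the
    (input, label) 2-tuple that model.fit can consume directly."""
    x_key, y_key = supervised_keys
    return ex[x_key], ex[y_key]
```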

            Source https://stackoverflow.com/questions/66837940

            QUESTION

            How to load the Fashion MNIST dataset in TensorFlow Federated?
            Asked 2020-Nov-28 at 16:28

            I am working on a project with TensorFlow Federated. I have managed to use the libraries provided by the TensorFlow Federated Learning simulations in order to load, train, and test some datasets.

            For example, I load the emnist dataset

            ...

            ANSWER

            Answered 2020-Nov-13 at 16:38

            You're on the right track. To recap: the datasets returned by tff.simulation.dataset APIs are tff.simulation.ClientData objects. The object returned by tf.keras.datasets.fashion_mnist.load_data is a tuple of numpy arrays.

            So what is needed is to implement a tff.simulation.ClientData to wrap the dataset returned by tf.keras.datasets.fashion_mnist.load_data. Some previous questions about implementing ClientData objects:

            This does require answering an important question: how should the Fashion MNIST data be split into individual users? The dataset doesn't include features that could be used for partitioning. Researchers have come up with a few ways to synthetically partition the data, e.g. randomly sampling some labels for each participant, but this choice has a large effect on model training, so it is worth investing some thought here.
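One common synthetic split, sketched in stdlib Python (the partition_by_labels helper is illustrative, not a TFF API): each user is randomly assigned a small set of labels and receives every example carrying those labels.

```python
import random

def partition_by_labels(examples, num_users, labels_per_user, seed=0):
    """Split a centralized list of (x, y) examples into per-user shards
    by randomly assigning each user a subset of labels. This is only one
    of several partitioning schemes, and the choice strongly affects
    federated training."""
    rng = random.Random(seed)
    all_labels = sorted({y for _, y in examples})
    shards = {}
    for user in range(num_users):
        chosen = set(rng.sample(all_labels, labels_per_user))
        shards[user] = [(x, y) for x, y in examples if y in chosen]
    return shards
```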

            Source https://stackoverflow.com/questions/64760396

            QUESTION

            TensorFlow Federated: How to tune non-IIDness in federated dataset?
            Asked 2020-Nov-25 at 00:46

            I am testing some algorithms in TensorFlow Federated (TFF). In this regard, I would like to test and compare them on the same federated dataset with different "levels" of data heterogeneity, i.e. non-IIDness.

            Hence, I would like to know whether there is any way to control and tune the "level" of non-IIDness in a specific federated dataset, in an automatic or semi-automatic fashion, e.g. by means of TFF APIs or just traditional TF API (maybe inside the Dataset utils).

            To be more practical: for instance, the EMNIST federated dataset provided by TFF has 3383 clients, each one of them having their own handwritten characters. However, these local datasets seem to be quite balanced in terms of the number of local examples and the represented classes (all classes are, more or less, represented locally). If I would like to have a federated dataset (e.g., starting from TFF's EMNIST one) that is:

            • Pathologically non-IID, for example having clients that hold only one class out of N classes (always referring to a classification task). Is this the purpose of tff.simulation.datasets.build_single_label_dataset (documentation here)? If so, how should I use it with a federated dataset such as the ones already provided by TFF?;
            • Unbalanced in terms of the amount of local examples (e.g., one client has 10 examples, another one has 100 examples);
            • Both the possibilities;

            how should I proceed inside the TFF framework to prepare a federated dataset with those characteristics?

            Should I do all of this by hand? Or does anyone have advice on how to automate this process?

            An additional question: in the paper "Measuring the Effects of Non-Identical Data Distribution for Federated Visual Classification" by Hsu et al., they exploit the Dirichlet distribution to synthesize a population of non-identical clients, and they use a concentration parameter to control the identicalness among clients. This seems an easy-to-tune way to produce datasets with different levels of heterogeneity. Any advice on how to implement this strategy (or a similar one) inside the TFF framework, or just in TensorFlow (Python) with a simple dataset such as EMNIST, would be very useful too.

            Thank you a lot.

            ...

            ANSWER

            Answered 2020-Nov-25 at 00:46

            For Federated Learning simulations, it's quite reasonable to set up the client datasets in Python, in the experiment driver, to achieve the desired distributions. At a high level, TFF handles modeling data location ("placements" in the type system) and computation logic. Re-mixing/generating a simulation dataset is not quite core to the library, though there are helpful utilities, as you've found. Doing this directly in Python by manipulating the tf.data.Dataset and then "pushing" the client datasets into a TFF computation seems straightforward.
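As a stdlib-only sketch of the Dirichlet-based partitioning mentioned in the question (Hsu et al.), per-client class proportions can be sampled via the gamma construction; in practice np.random.dirichlet would be used, and the dirichlet helper name here is mine:

```python
import random

def dirichlet(alpha, k, rng):
    """Sample a length-k proportion vector from Dirichlet(alpha) using
    normalized gamma draws; small alpha -> near one-hot proportions
    (highly non-IID clients), large alpha -> near uniform (close to IID)."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(k)]
    total = sum(draws)
    return [d / total for d in draws]

# One class-proportion vector per simulated client.
rng = random.Random(0)
client_mixtures = [dirichlet(alpha=0.1, k=10, rng=rng) for _ in range(5)]
```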

            Label non-IID

            Yes, tff.simulation.datasets.build_single_label_dataset is intended for this purpose.

            It takes a tf.data.Dataset and essentially filters out all examples that don't match the desired_label value for the label_key (assuming the dataset yields dict-like structures).

            For EMNIST, to create a dataset of all the ones (regardless of user), this could be achieved by:
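The referenced snippet is missing here, but what build_single_label_dataset boils down to can be sketched over plain dicts (the keep_single_label helper is illustrative, not the TFF API):

```python
def keep_single_label(examples, desired_label=1, label_key="label"):
    """Keep only the examples whose label matches desired_label,
    e.g. a dataset of all the 'ones' regardless of user."""
    return [ex for ex in examples if ex[label_key] == desired_label]

ones = keep_single_label(
    [{"pixels": [0], "label": 1},
     {"pixels": [1], "label": 7},
     {"pixels": [2], "label": 1}],
    desired_label=1,
)
```

With TFF itself, the call would be along the lines of `tff.simulation.datasets.build_single_label_dataset(ds, label_key='label', desired_label=1)`.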

            Source https://stackoverflow.com/questions/64970504

            QUESTION

            Compile a TensorFlow Lite project for microcontrollers with arm-none-eabi-gcc
            Asked 2020-Nov-17 at 13:43

            I just downloaded the TensorFlow repository from GitHub (https://github.com/tensorflow/tensorflow (v2.3.1)). I included it in my C++ project. After building with my Makefile I got this error:

            ...

            ANSWER

            Answered 2020-Nov-17 at 13:43

            I got the solution. In your Makefile you have to include the directory of your TensorFlow library.

            This part of my Makefile looks like this in my case:

            Source https://stackoverflow.com/questions/64818867

            QUESTION

            Change of the dataset type in the execution stack
            Asked 2020-Jul-23 at 03:51

            The problem is that the dataset changes from one type to another at different points in the execution stack. For example, if I add a new dataset class with more member properties of interest (which inherits from one of the classes in ops.data.dataset_ops, like UnaryDataset), the result is that at a later execution point (the client_update function) the dataset is converted to the _VariantDataset type, and hence any added attributes are lost. So the question is how to retain the member attributes of the newly defined dataset class over the course of execution. Below is the emnist example, where the type changes from ParallelMapDataset to _VariantDataset.

            In the client_dataset function of training_utils.py, at line 194, I modified the code to show the type of the dataset as follows

            ...

            ANSWER

            Answered 2020-Jul-21 at 14:19

            The new dataset Python class will need to support serialization. This is necessary because TensorFlow Federated is designed to run on machines that are not necessarily the same as the machine that wrote the computation (e.g. smartphones, in the case of cross-device federated learning). These machines may not be running Python, and hence would not understand the newly created subclass, so the serialization layer would need to be updated. However, this is pretty low-level, and there may be alternative ways to achieve the desired goal.

            Going out on a limb: if the goal is to provide metadata along with the dataset for a client, it may be easier to alter the function signature of the iterative process returned by fed_avg_schedule.build_fed_avg_process to accept a tuple of (dataset, metadata structure) for each client.

            Currently the signature of the next computation is (in TFF type shorthand introduced in Custom Federated Algorithms, Part 1: Introduction to the Federated Core):

            Source https://stackoverflow.com/questions/62993389

            Community Discussions, Code Snippets contain sources that include Stack Exchange Network

            Vulnerabilities

            No vulnerabilities reported

            Install emnist

            You can download it from GitHub.

            Support

            For any new features, suggestions, and bugs, create an issue on GitHub. If you have any questions, check and ask on the Stack Overflow community page.

            CLONE
          • HTTPS

            https://github.com/j05t/emnist.git

          • CLI

            gh repo clone j05t/emnist

          • sshUrl

            git@github.com:j05t/emnist.git
