hickle | an HDF5-based Python pickle replacement

by telegraphic | Python | Version: 5.0.3 | License: Non-SPDX

kandi X-RAY | hickle Summary

hickle is a Python library. It has no reported bugs or vulnerabilities, a build file is available, and it has low support. However, hickle has a Non-SPDX license. You can install it with 'pip install hickle' or download it from GitHub or PyPI.

Hickle is an HDF5-based clone of pickle, with a twist: instead of serializing to a pickle file, Hickle dumps to an HDF5 file (Hierarchical Data Format). It is designed to be a "drop-in" replacement for pickle (for common data objects), but is really an amalgam of h5py and dill/pickle with extended functionality. That is: hickle is a neat little way of dumping Python variables to HDF5 files that can be read in most programming languages, not just Python. Hickle is fast, and allows for transparent compression of your data (LZF / GZIP).
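As a quick illustration, here is a minimal usage sketch, assuming the dump/load calls and the compression keyword shown in the hickle README; the file name and data are purely illustrative:

import numpy as np
import hickle as hkl

# Build some ordinary Python data containing a NumPy array.
data = {"name": "test", "values": np.random.random(10000)}

# Dump to an HDF5 file; compression is optional ("gzip" or "lzf").
hkl.dump(data, "test.hkl", mode="w", compression="gzip")

# Load it back; the same file can also be opened directly with h5py
# or any other HDF5 reader.
restored = hkl.load("test.hkl")
assert np.allclose(data["values"], restored["values"])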

            kandi-Support Support

              hickle has a low-activity ecosystem.
              It has 381 stars, 64 forks, and 21 watchers.
              There was 1 major release in the last 12 months.
              There are 6 open issues and 85 closed issues. On average, issues are closed in 389 days. There is 1 open pull request and 0 closed pull requests.
              It has a neutral sentiment in the developer community.
              The latest version of hickle is 5.0.3.

            kandi-Quality Quality

              hickle has 0 bugs and 0 code smells.

            kandi-Security Security

              hickle has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.
              hickle code analysis shows 0 unresolved vulnerabilities.
              There are 0 security hotspots that need review.

            kandi-License License

              hickle has a Non-SPDX license.
              A Non-SPDX license may be an open-source license that is not SPDX-compliant, or a non-open-source license; you need to review it closely before use.

            kandi-Reuse Reuse

              hickle releases are available to install and integrate.
              A deployable package is available on PyPI.
              A build file is available, so you can build the component from source.
              Installation instructions, examples, and code snippets are available.
              hickle saves you 1091 person-hours of effort in developing the same functionality from scratch.
              It has 2469 lines of code, 213 functions, and 31 files.
              It has high code complexity, which directly impacts the maintainability of the code.

            Top functions reviewed by kandi - BETA

            kandi has reviewed hickle and identified the functions below as its top functions. This is intended to give you an instant insight into the functionality hickle implements, and to help you decide whether it suits your requirements.
            • Dump a Python object to a file
            • Dump a py_obj to HDF5
            • Create an hdf5 file opener
            • Test if f is an io-like object
            • Load NDMasked Dataset
            • Convert to array
            • Load an ndarray Dataset
            • Get the type and data of a node
            • Create a set-like dataset from an object
            • Create a dataset for a list-like object
            • Recover a custom dataset
            • Load the python dtype dataset
            • Register a class
            • Yields all reference types from the parent
            • Load sparse matrix data
            • Register a list of classes
            • Load an astropy quantity dataset
            • Load a numpy dtype dataset
            • Load an astropy constant dataset
            • Load an astropy Angle dataset
            • Load a numpy array from h_node
            • Load an astropy SkyCoord dataset
            • Load a string from a hickle file
            • Helper function to create a dataset
            • Load an astropy time dataset
            • Load an astropy table from a h5 node

            hickle Key Features

            No Key Features are available at this moment for hickle.

            hickle Examples and Code Snippets

            No Code Snippets are available at this moment for hickle.

            Community Discussions

            QUESTION

            What is the most compact way of storing numpy data?
            Asked 2020-Apr-15 at 13:52

            I have a large data set.

            The best I could achieve is to use numpy arrays, write them to a binary file, and then compress it:

            ...

            ANSWER

            Answered 2020-Apr-15 at 13:52

            An array of 46800 x 4 x 18 8-byte floats takes up 26956800 bytes. That's 25.7 MiB or 27.0 MB. A compressed size of 22 MB is an 18% reduction (or 14% if you really meant MiB), which is pretty good by most standards, especially for random binary data. You are unlikely to improve on that much. Using a smaller data type like float32, or perhaps trying to represent your data as rationals, may be useful.

            Since you mention that you want to store metadata, you can record a byte for the number of dimensions (numpy allows at most 32 dimensions) and N integers for the size in each dimension (either 32- or 64-bit). Let's say you use 64-bit integers. That makes for 193 bytes of metadata in your particular case, or about 7×10⁻⁴% of the total array size.

            Source https://stackoverflow.com/questions/61213944
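            As an aside, here is a minimal sketch of the metadata scheme described in the answer above (one byte for the number of dimensions plus one 64-bit integer per dimension, followed by the compressed raw bytes). zlib stands in for gzip here, and the helper names are purely illustrative:

            import struct
            import zlib

            import numpy as np

            def pack_array(arr):
                # Header: 1 byte for ndim, then one little-endian 64-bit integer per dimension.
                header = struct.pack("<B", arr.ndim) + struct.pack(f"<{arr.ndim}q", *arr.shape)
                # Body: the raw float64 bytes, compressed.
                body = zlib.compress(np.ascontiguousarray(arr, dtype=np.float64).tobytes(), level=9)
                return header + body

            def unpack_array(blob):
                ndim = struct.unpack_from("<B", blob, 0)[0]
                shape = struct.unpack_from(f"<{ndim}q", blob, 1)
                raw = zlib.decompress(blob[1 + 8 * ndim:])
                return np.frombuffer(raw, dtype=np.float64).reshape(shape)

            a = np.random.random((46800, 4, 18))   # the array shape discussed above
            blob = pack_array(a)
            assert np.array_equal(a, unpack_array(blob))
            print(f"raw: {a.nbytes / 1e6:.1f} MB, packed: {len(blob) / 1e6:.1f} MB")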

            QUESTION

            Least memory-consuming way of saving large Python data into a DB
            Asked 2020-Apr-10 at 11:28

            I have to save significantly large Python data, consisting of lists and dictionaries, into a MySQL database, but I get a memory exception during the save operation.

            I have already benchmarked the saving operation and tried different ways of dumping the data, including binary formats, but all methods seemed to consume a lot of memory. Benchmarks below:

            MAX MEMORY USAGE DURING JSON SAVE: 966.83 MB

                                              json         pickle         msgpack
            SIZE AFTER DUMPING                81.03 MB     66.79 MB       33.83 MB
            COMPRESSION TIME                  5.12 s       11.17 s        0.27 s
            DECOMPRESSION TIME                2.57 s       1.66 s         0.52 s
            COMPRESSION MAX MEMORY USAGE      840.84 MB    1373.30 MB     732.67 MB
            DECOMPRESSION MAX MEMORY USAGE    921.41 MB    1481.25 MB     1006.12 MB

            msgpack seems to be the most performant library, but its decompression takes up a lot of memory too. I also tried hickle, which is said to consume little memory, but the final size ended up being 800 MB.

            Does anyone have a suggestion? Should I just increase the memory limit? Can MongoDB handle the save operation with less memory?

            The stack trace is below:

            ...

            ANSWER

            Answered 2020-Apr-06 at 21:54

            In essence, here is how I would do it to reduce memory consumption and improve performance (a rough sketch follows below):

            1. Load the JSON file (no way to stream it in Python, AFAIK)
            2. Chunk the array of dictionaries into smaller chunks
            3. Convert each chunk into model objects
            4. Call bulk_create
            5. Garbage-collect after every loop iteration

            Source https://stackoverflow.com/questions/61069230
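            The sketch referenced above, assuming a Django ORM model named Record (hypothetical) whose fields match the JSON keys:

            import gc
            import json

            from myapp.models import Record   # hypothetical Django model matching the JSON keys

            CHUNK_SIZE = 5000

            def import_json(path):
                # 1. Load the JSON file (the whole document still has to fit in memory here).
                with open(path) as fh:
                    rows = json.load(fh)

                # 2-4. Walk the list in chunks, convert each chunk into model instances,
                # and insert them with one bulk_create call per chunk.
                for start in range(0, len(rows), CHUNK_SIZE):
                    chunk = rows[start:start + CHUNK_SIZE]
                    objects = [Record(**row) for row in chunk]
                    Record.objects.bulk_create(objects, batch_size=CHUNK_SIZE)

                    # 5. Drop references and garbage-collect after every loop iteration.
                    del chunk, objects
                    gc.collect()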

            Community Discussions and Code Snippets contain sources that include the Stack Exchange Network.

            Vulnerabilities

            No vulnerabilities reported

            Install hickle

            You should have Python 3.5 or above installed.
            Install h5py (official page: http://docs.h5py.org/en/latest/build.html).
            Install HDF5 (official page: http://www.hdfgroup.org/ftp/HDF5/current/src/unpacked/release_docs/INSTALL).
            Download hickle, either via the terminal (git clone https://github.com/telegraphic/hickle.git) or manually: go to https://github.com/telegraphic/hickle, where you will find a Download ZIP option on the right-hand side.
            cd into your downloaded hickle directory.
            Then run the following command in the hickle directory: python setup.py install.

            Support

            Documentation for hickle can be found at http://telegraphic.github.io/hickle/.
            Install
          • PyPI

            pip install hickle

          • CLONE
          • HTTPS

            https://github.com/telegraphic/hickle.git

          • CLI

            gh repo clone telegraphic/hickle

          • sshUrl

            git@github.com:telegraphic/hickle.git
