huffman-coding | Python Implementation of Huffman Coding | Compression library
kandi X-RAY | huffman-coding Summary
An explanation is available in a YouTube video, and the code was originally posted in a blog article. The library consists of a compress and a decompress function. To compress a different text file, edit the path variable in the useHuffman.py file. For now, decompress() must be called on the same object on which compress() was called, because the encoding information is stored only in that object's data members.
Top functions reviewed by kandi - BETA
- Decompress the file
- Decode encoded text
- Remove padding
- Compress the text
- Recursively build code
- Convert encoded text to byte array
- Merges two nodes
- Pad encoded text
- Make frequency from text
- Get encoded text
- Make the code blocks
- Make a heap
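The steps named in the list above can be sketched roughly as follows. This is a minimal in-memory illustration and not the library's actual code: all function names (make_frequency, build_codes, pad_encoded_text, and so on) are hypothetical, and the padding scheme (one leading byte that stores the number of pad bits) is an assumption made for the example.

```python
import heapq
from collections import Counter


def make_frequency(text):
    """Count how often each character occurs."""
    return Counter(text)


def build_codes(text):
    """Return a {char: bitstring} table built from a min-heap of subtrees."""
    freq = make_frequency(text)
    # Heap entries are (weight, tie_breaker, tree); a tree is either a
    # single character or a (left, right) pair of subtrees.
    heap = [(w, i, ch) for i, (ch, w) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        w1, _, t1 = heapq.heappop(heap)
        w2, _, t2 = heapq.heappop(heap)
        # Merge the two lightest nodes into one parent node.
        heapq.heappush(heap, (w1 + w2, counter, (t1, t2)))
        counter += 1

    codes = {}

    def walk(tree, prefix):
        # Recursively assign "0" to left branches and "1" to right branches.
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:
            codes[tree] = prefix or "0"  # single-symbol input still needs one bit

    if heap:
        walk(heap[0][2], "")
    return codes


def pad_encoded_text(bits):
    """Pad to a multiple of 8 bits and prefix one byte holding the pad length."""
    extra = (8 - len(bits) % 8) % 8
    return format(extra, "08b") + bits + "0" * extra


def to_byte_array(padded_bits):
    """Convert a bit string whose length is a multiple of 8 into bytes."""
    return bytes(int(padded_bits[i:i + 8], 2) for i in range(0, len(padded_bits), 8))


def compress(text):
    """Return (compressed bytes, code table); the table is needed to decompress."""
    codes = build_codes(text)
    bits = "".join(codes[ch] for ch in text)
    return to_byte_array(pad_encoded_text(bits)), codes


def remove_padding(data):
    """Undo pad_encoded_text: drop the length byte and the trailing pad bits."""
    bits = "".join(format(b, "08b") for b in data)
    extra = int(bits[:8], 2)
    body = bits[8:]
    return body[:len(body) - extra] if extra else body


def decompress(data, codes):
    """Decode the bit stream by matching prefixes against the code table."""
    reverse = {code: ch for ch, code in codes.items()}
    out, current = [], ""
    for bit in remove_padding(data):
        current += bit
        if current in reverse:
            out.append(reverse[current])
            current = ""
    return "".join(out)
```

In this sketch the code table is returned alongside the compressed bytes and must be passed back to decompress(). That mirrors the constraint described above: the encoding information is not written into the compressed output, so it has to be kept around by whoever called compress().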
Community Discussions
Trending Discussions on huffman-coding
QUESTION
Given a random integer, for example, 19357982357627685397198. How can I compress these numbers into a string of text that has fewer characters?
The string of text must only contain numbers or alphabetical characters, both uppercase and lowercase.
I've tried Base64 and Huffman-coding that claim to compress, but none of them makes the string shorter when writing on a keyboard.
I also tried to make some kind of algorithm that tries to divide the integer by the numbers "2,3,...,10" and check if the last number in the result is the number it was divided by (looks for 0 in case of division by 10). So, when decrypting, you would just multiply the number by the last number in the integer. But that does not work because in some cases you can't divide by anything and the number would stay the same, and when it would be decrypted, it would just multiply it into a larger number than you started with.
I also tried to divide the integer into blocks of 2 numbers starting from left and giving a letter to them (a=1, b=2, o=15), and when it would get to z it would just roll back to a. This did not work because when it was decrypted, it would not know how many times the number rolled over z and therefore be a much smaller number than in the start.
I also tried some other common encryption strategies. For example Base32, Ascii85, Bifid Cipher, Baudot Code, and some others I can not remember.
It seems like an unsolvable problem. But because it starts as an integer, each digit can take only 10 different values, while a letter of the alphabet can take 26. This means that 5 alphabetical letters can store more data than a 5-digit integer. So, mathematically, it is possible to store more data in a string of characters than in an integer of the same length, but I just can't find anyone who has ever done it.
...ANSWER
Answered 2021-Jun-01 at 21:47
You switch from base 10 to, e.g., base 62 by repeatedly dividing by 62 and recording the remainder from each step, like this:
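The answer's original snippet is not reproduced on this page. As a minimal sketch of the repeated-division idea, assuming an alphabet of digits, then uppercase, then lowercase letters (the ordering is an arbitrary choice, and the names ALPHABET, to_base62 and from_base62 are hypothetical):

```python
import string

# 10 digits + 26 uppercase + 26 lowercase = 62 symbols (ordering is an assumption).
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase


def to_base62(n):
    """Encode a non-negative integer by repeatedly dividing by 62."""
    if n < 0:
        raise ValueError("only non-negative integers are supported")
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, remainder = divmod(n, 62)
        digits.append(ALPHABET[remainder])
    # Remainders come out least-significant first, so reverse them.
    return "".join(reversed(digits))


def from_base62(s):
    """Decode by multiplying the running total by 62 and adding each symbol's value."""
    n = 0
    for ch in s:
        n = n * 62 + ALPHABET.index(ch)
    return n
```

For the 23-digit integer from the question, the encoded string is 13 characters long, because the number lies between 62^12 and 62^13. Decoding with from_base62() returns the original integer exactly.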
QUESTION
I have a requirement where text files are sent from one location to another. Both locations are under our control. The nature of the content and the words that can appear in it are mostly the same. This means that if I keep the deflate dictionary at both locations once, there is no need to send it with each file.
I have been reading about this for the last week and experimenting with some available code, such as this & this.
However, I am still in the dark.
A few questions I still have:
- Can we generate and use a custom deflate dictionary from a preset list of words?
- Can we send a file without the deflate dictionary and use a local one?
- If not gzip, is there any compression library that can be used for this purpose?
Some references I stumbled upon so far:
...ANSWER
Answered 2021-May-22 at 00:15
The zlib library supports dictionaries with the zlib (not gzip) format. See deflateSetDictionary() and inflateSetDictionary().
There is nothing special about the construction of a dictionary. All it is is 32K bytes of strings that you believe will occur often in the data you are compressing. You should put the most common strings at the end of the 32K.
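In Python, the standard zlib module exposes the same mechanism through the zdict argument of zlib.compressobj() and zlib.decompressobj(). The following is a minimal sketch, assuming both sides hold an identical copy of the dictionary; the dictionary contents and the helper names compress_with_dict and decompress_with_dict are made up for illustration.

```python
import zlib

# Hypothetical shared dictionary: byte strings expected to occur often in the
# data, with the most common ones placed towards the end.
SHARED_DICT = b"status pending shipped delivered customer order invoice "


def compress_with_dict(data, zdict=SHARED_DICT):
    """Compress to the zlib format using a preset dictionary."""
    comp = zlib.compressobj(zdict=zdict)
    return comp.compress(data) + comp.flush()


def decompress_with_dict(blob, zdict=SHARED_DICT):
    """Decompress data produced with the same preset dictionary."""
    decomp = zlib.decompressobj(zdict=zdict)
    return decomp.decompress(blob) + decomp.flush()
```

Only the compressed bytes travel between the two locations; the dictionary itself is never transmitted. The receiver must supply exactly the same dictionary bytes, otherwise decompression fails.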
QUESTION
I am running the Java program shown here to generate canonical Huffman codes, https://www.geeksforgeeks.org/canonical-huffman-coding/
Although the code gives the correct canonical Huffman codes for the input shown, in other cases the codes it prints are not valid prefix codes. For example,
...ANSWER
Answered 2021-May-09 at 22:24
It is generating the codes correctly, but then printing them incorrectly. It is leaving off the leading zero bits of the codes that have them. They should have prepended the necessary zero bits after converting the number to a string of digits.
If you replace the line that prints the code with this:
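The Java replacement line from the answer is not reproduced on this page. As a rough Python illustration of the same idea (not the answer's code), the sketch below assigns canonical codes from a table of code lengths and zero-pads each code to its full length when formatting it. The function name canonical_codes and the input format are assumptions.

```python
def canonical_codes(lengths):
    """Assign canonical Huffman codes from {symbol: code_length}.

    Returns {symbol: bitstring}, with each bitstring zero-padded on the left
    to exactly its code length.
    """
    codes = {}
    code = 0
    prev_len = 0
    # Canonical order: shorter codes first, ties broken by symbol.
    for sym, length in sorted(lengths.items(), key=lambda kv: (kv[1], kv[0])):
        code <<= (length - prev_len)  # append zeros when the length grows
        # The "0{length}b" format keeps leading zeros, which a plain
        # binary conversion of the integer would drop.
        codes[sym] = format(code, "0{}b".format(length))
        code += 1
        prev_len = length
    return codes
```

For lengths such as {"a": 2, "b": 2, "c": 2, "d": 3, "e": 3}, the first two codes are the integers 0 and 1. Without padding they would print as "0" and "1", which look like prefixes of other codes; with padding they print as "00" and "01".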
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
Vulnerabilities
No vulnerabilities reported
Install huffman-coding
You can use huffman-coding like any standard Python library. You will need to make sure that you have a development environment consisting of a Python distribution including header files, a compiler, pip, and git installed. Make sure that your pip, setuptools, and wheel are up to date. When using pip it is generally recommended to install packages in a virtual environment to avoid changes to the system.