huffman-coding | decompression program based on Huffman

by cynricfu | C++ | Version: Current | License: MIT

kandi X-RAY | huffman-coding Summary

huffman-coding is a C++ library. It has no reported bugs or vulnerabilities, a permissive license, and low support. You can download it from GitHub.

This project implements compression and decompression programs based on Huffman coding. The idea of Huffman coding is to minimize the weighted expected length of the code by assigning shorter codes to frequently used characters and longer codes to seldom-used characters.

Support

huffman-coding has a low-activity ecosystem.

It has 23 stars and 21 forks. There are 3 watchers for this library.

It had no major release in the last 6 months.

There is 1 open issue and 0 closed issues. There is 1 open pull request and 0 closed pull requests.

It has a neutral sentiment in the developer community.

The latest version of huffman-coding is current.

Quality

huffman-coding has no bugs reported.

Security

huffman-coding has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.

License

huffman-coding is licensed under the MIT License. This license is Permissive.

Permissive licenses have the least restrictions, and you can use them in most projects.

Reuse

huffman-coding releases are not available. You will need to build from source code and install.

huffman-coding Key Features

No Key Features are available at this moment for huffman-coding.

huffman-coding Examples and Code Snippets

No Code Snippets are available at this moment for huffman-coding.

Community Discussions

Trending Discussions on huffman-coding

How to compress an integer to a smaller string of text?

Using a preset deflate dictionary to reduce compressed archive file size

Issues with a Reference Code for Running Canonical Huffman Code on Java

Dynamic Programming for prefix-free coding

Create structs from array elements based on index and value

How Huffman coding figured out the property that the codes are unique

QUESTION

How to compress an integer to a smaller string of text?

Asked 2021-Jun-01 at 21:47

Given a random integer, for example 19357982357627685397198, how can I compress it into a string of text that has fewer characters?

The string of text must only contain numbers or alphabetical characters, both uppercase and lowercase.

I've tried Base64 and Huffman-coding that claim to compress, but none of them makes the string shorter when writing on a keyboard.

I also tried to make some kind of algorithm that tries to divide the integer by the numbers "2,3,...,10" and check if the last number in the result is the number it was divided by (looks for 0 in case of division by 10). So, when decrypting, you would just multiply the number by the last number in the integer. But that does not work because in some cases you can't divide by anything and the number would stay the same, and when it would be decrypted, it would just multiply it into a larger number than you started with.

I also tried to divide the integer into blocks of 2 numbers starting from left and giving a letter to them (a=1, b=2, o=15), and when it would get to z it would just roll back to a. This did not work because when it was decrypted, it would not know how many times the number rolled over z and therefore be a much smaller number than in the start.

I also tried some other common encryption strategies. For example Base32, Ascii85, Bifid Cipher, Baudot Code, and some others I can not remember.

It seems like an unsolvable problem. But because it starts with an integer, each number can contain 10 different combinations. While in the alphabet, letters can contain 26 different combinations. This makes it so that you can store more data in 5 alphabetical letters, than in a 5 digit integer. So it is possible to store more data in a string of characters than in an integer in mathematical means, but I just can't find anyone who has ever done it.

...

ANSWER

Answered 2021-Jun-01 at 21:47

You switch from base 10 to, e.g., base 62 by repeatedly dividing by 62 and recording the remainder from each step. The remainders, read in reverse order, are the base-62 digits.
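A minimal C++ sketch of that repeated division follows. It assumes the integer arrives as a decimal string (the example value does not fit in 64 bits) and assumes a digit alphabet of 0-9, A-Z, a-z. The names `kDigits`, `to_base62` and `from_base62` are illustrative and do not come from the original answer.

```cpp
#include <algorithm>
#include <string>

// Assumed digit alphabet: 10 digits + 26 uppercase + 26 lowercase = 62 symbols.
const std::string kDigits =
    "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

// Convert a non-negative decimal string (digits only) to base 62.
std::string to_base62(std::string decimal) {
    if (decimal.empty()) return "0";
    // Strip leading zeros, but keep at least one digit.
    decimal.erase(0, std::min(decimal.find_first_not_of('0'), decimal.size() - 1));
    std::string out;
    while (decimal != "0") {
        // Schoolbook long division of the decimal string by 62.
        std::string quotient;
        int remainder = 0;
        for (char c : decimal) {
            int cur = remainder * 10 + (c - '0');
            int q = cur / 62;
            remainder = cur % 62;
            if (!quotient.empty() || q != 0) quotient.push_back(char('0' + q));
        }
        out.push_back(kDigits[remainder]);  // least significant digit first
        decimal = quotient.empty() ? "0" : quotient;
    }
    if (out.empty()) out = "0";
    std::reverse(out.begin(), out.end());
    return out;
}

// Inverse: multiply the running decimal value by 62 and add each digit.
// Assumes every character of `text` is in kDigits.
std::string from_base62(const std::string& text) {
    std::string decimal = "0";
    for (char c : text) {
        int carry = static_cast<int>(kDigits.find(c));
        for (auto it = decimal.rbegin(); it != decimal.rend(); ++it) {
            int cur = (*it - '0') * 62 + carry;
            *it = char('0' + cur % 10);
            carry = cur / 10;
        }
        while (carry > 0) {
            decimal.insert(decimal.begin(), char('0' + carry % 10));
            carry /= 10;
        }
    }
    return decimal;
}
```

With this sketch, the 23-digit example 19357982357627685397198 becomes a 13-character string, and `from_base62` recovers the original digits.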

Source https://stackoverflow.com/questions/67791840

QUESTION

Using a preset deflate dictionary to reduce compressed archive file size

Asked 2021-May-23 at 02:01

I have a requirement where text files are sent from one location to another. Both locations are under our control. The nature of the content and the words that could appear in it are mostly the same. This means that if I keep the deflate dictionary at both locations, there is no need to send it with each file.

I have been reading about this for the past week and experimenting with some available code such as this and this.

However, I am still in the dark.

A few questions I still have:

Can we generate and use a custom deflate dictionary from a preset list of words?
Can we send a file without the deflate dictionary and use a local one?
If not gzip, is there any compression library that can be used for this purpose?

Some references I stumbled upon so far:

...

ANSWER

Answered 2021-May-22 at 00:15

The zlib library supports dictionaries with the zlib (not gzip) format. See deflateSetDictionary() and inflateSetDictionary().

There is nothing special about the construction of a dictionary. All it is is 32K bytes of strings that you believe will occur often in the data you are compressing. You should put the most common strings at the end of the 32K.

Source https://stackoverflow.com/questions/67632095

QUESTION

Issues with a Reference Code for Running Canonical Huffman Code on Java

Asked 2021-May-09 at 22:24

I am running the Java program shown here to generate canonical Huffman codes, https://www.geeksforgeeks.org/canonical-huffman-coding/

Although the code gives the correct canonical Huffman codes for the input shown, in other cases the codes I get are neither prefix-free nor correct. For example,

...

ANSWER

Answered 2021-May-09 at 22:24

It is generating the codes correctly, but then printing them incorrectly. It is leaving off the leading zero bits of the codes that have them. They should have prepended the necessary zero bits after converting the number to a string of digits.

If you replace the line that prints the code with this:
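The exact replacement line from the answer is not reproduced here. The underlying idea is to emit exactly as many bits as the code length, so that leading zeros survive. A minimal C++ sketch of that idea (the answer itself concerns Java, and `code_to_bits` is a hypothetical helper name):

```cpp
#include <cstdint>
#include <string>

// Render `code` as a bit string of exactly `length` characters,
// keeping leading zeros. Assumes length <= 32.
std::string code_to_bits(std::uint32_t code, unsigned length) {
    std::string bits(length, '0');
    for (unsigned i = 0; i < length; ++i) {
        // Bit i of the code goes i places from the right end of the string.
        if ((code >> i) & 1u) bits[length - 1 - i] = '1';
    }
    return bits;
}
```

For a code with value 1 and length 3 this yields "001", whereas converting the number alone to binary would print just "1".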

Source https://stackoverflow.com/questions/67462155

QUESTION

Dynamic Programming for prefix-free coding

Asked 2019-Aug-05 at 03:22

Is there a way of computing the prefix-free coding of a given dictionary of letters and their frequencies? Something similar to Huffman coding but computed dynamically. What would the optimization function look like?

The problem with building the tree only up to position i of the dictionary is that the least frequent letters could change, and so would the whole tree's structure.

...

ANSWER

Answered 2019-Aug-05 at 03:22

Yes, there are several ways to generate prefix-free codes dynamically.

As you suggested, it would be conceptually simple to start with some default frequency, track the frequencies of the letters used so far, and for every letter decoded, increment that letter's count and then rebuild a Huffman tree from all the counts (potentially changing the tree completely after each letter). That would require a lot of work for each letter and be very slow -- and yet there are a couple of adaptive Huffman coding algorithms that effectively do the same thing, using clever algorithms that do much less work and so are faster.

Many other data compression algorithms also generate prefix-free codes dynamically much faster than any adaptive Huffman algorithm, at a small sacrifice of compression -- such as Polar codes or Engel coding or universal codes such as Elias delta coding.
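As one concrete instance of the universal codes mentioned above, here is a minimal C++ sketch of Elias delta encoding for positive integers. It returns the code as a string of '0' and '1' characters for readability. The function names are illustrative, and a real encoder would pack bits instead of building strings.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>

// Binary representation of value (>= 1) without leading zeros.
std::string to_binary(std::uint64_t value) {
    if (value == 0) throw std::invalid_argument("value must be >= 1");
    std::string bits;
    for (; value > 0; value >>= 1)
        bits.insert(bits.begin(), char('0' + (value & 1)));
    return bits;
}

// Elias gamma: (bit length - 1) zeros, then the binary representation.
std::string elias_gamma(std::uint64_t value) {
    std::string bits = to_binary(value);
    return std::string(bits.size() - 1, '0') + bits;
}

// Elias delta: gamma code of the bit length, then the value's bits
// with the leading 1 dropped (it is implied by the length).
std::string elias_delta(std::uint64_t value) {
    std::string bits = to_binary(value);
    return elias_gamma(bits.size()) + bits.substr(1);
}
```

No frequency table is needed. The code for each integer is fixed in advance, which is why such codes are fast but usually compress less well than a Huffman code fitted to the data.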

The arithmetic coding data compression algorithm is technically not a prefix-free code, but typically gives slightly better compression (but runs slower) than either static Huffman coding or adaptive Huffman coding. Arithmetic coding is generally implemented adaptively, tracking the frequencies of all the letters used so far. (Many arithmetic coding implementations track even more context -- if the previous letter was a "t", it remembers that the most-frequent letter in this context is "h" and exactly how frequent it was, etc., giving even better compression).

Source https://stackoverflow.com/questions/55110156

QUESTION

Create structs from array elements based on index and value

Asked 2019-Apr-14 at 10:01

Is there a more elegant way to express the following code (e.g. without explicit for-loop)?

...

ANSWER

Answered 2019-Apr-14 at 10:01

How about:

Source https://stackoverflow.com/questions/55638157

QUESTION

How Huffman coding figured out the property that the codes are unique

Asked 2019-Feb-12 at 09:28

I have just read this:

This is where a really smart idea called Huffman coding comes in! The idea is that we represent our characters (like a, b, c, d, ….) with codes like
...

ANSWER

Answered 2019-Feb-12 at 09:28

Huffman coding works by laying out data in a tree. If you have a binary tree, you can associate every leaf with a code by saying that a left child corresponds to a 0 bit and a right child to a 1 bit. The path that leads from the root to a leaf then corresponds to a code unambiguously.

This works for any tree, and the prefix property follows from the fact that a leaf is terminal. Hence, you cannot reach a leaf (have a code) by passing through another leaf (by having another code as a prefix).

The basic idea of Huffman coding is that you can build trees in such a way that the depth of every node is correlated with the probability of appearance of the node (codes more likely to occur will be closer to the root).

There are several algorithms to build such a tree. For instance, assume you have a set of items you want to code, say a..f. You must know the probability of appearance of every item, thanks to either a model of the source or an analysis of the actual values (for instance by analysing the file to code).

Then you can:

1. Sort the items by probability.
2. Pick the two items with the lowest probability.
3. Remove these items, group them in a new compound node, and assign one item to the left child (code 0) and the other to the right child (code 1). The probability of the compound node is the sum of the individual probabilities; insert this new node into the sorted item list.
4. Go to step 2 while the number of items is greater than 1.

For example, the tree may correspond to the following set of probabilities:

a (0.5) b (0.2) c (0.1) d (0.05) e (0.05) f (0.1)

Then you pick items with the lowest probability (d and e), group them in a compound node (de) and get the new list

a (0.5) b (0.2) c (0.1) (de) (0.1) f (0.1)

And the successive item lists can be

a (0.5) b (0.2) c(de) (0.2) f (0.1)

a (0.5) b (0.2) (c(de))f (0.3)

a (0.5) b((c(de))f) (0.5)

a(b((c(de))f)) (1.0)

So the prefix property is ensured by construction.
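The steps above can be sketched in C++ with a min-priority queue standing in for the sorted item list. This is an illustrative sketch and not the library's code: the names `Node`, `huffman_codes` and `is_prefix_free` are assumptions, and weights are integers (for example, percentages) to avoid floating-point ties.

```cpp
#include <functional>
#include <map>
#include <queue>
#include <string>
#include <utility>
#include <vector>

struct Node {
    long weight;
    int left;     // index of the left child, -1 for a leaf
    int right;    // index of the right child, -1 for a leaf
    char symbol;  // meaningful only for leaves
};

// Build Huffman codes from (symbol, weight) pairs.
std::map<char, std::string> huffman_codes(
        const std::vector<std::pair<char, long>>& freqs) {
    std::vector<Node> nodes;
    using Entry = std::pair<long, int>;  // (weight, node index)
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> pq;
    for (const auto& f : freqs) {
        nodes.push_back({f.second, -1, -1, f.first});
        pq.push({f.second, static_cast<int>(nodes.size()) - 1});
    }
    std::map<char, std::string> codes;
    if (nodes.empty()) return codes;
    if (nodes.size() == 1) { codes[nodes[0].symbol] = "0"; return codes; }
    // Repeatedly merge the two lowest-weight items into a compound node.
    while (pq.size() > 1) {
        Entry a = pq.top(); pq.pop();
        Entry b = pq.top(); pq.pop();
        nodes.push_back({a.first + b.first, a.second, b.second, '\0'});
        pq.push({a.first + b.first, static_cast<int>(nodes.size()) - 1});
    }
    // Walk from the root: left edge appends 0, right edge appends 1.
    std::vector<std::pair<int, std::string>> stack{{pq.top().second, ""}};
    while (!stack.empty()) {
        std::pair<int, std::string> cur = stack.back();
        stack.pop_back();
        const Node& n = nodes[cur.first];
        if (n.left < 0) { codes[n.symbol] = cur.second; continue; }
        stack.push_back({n.left, cur.second + "0"});
        stack.push_back({n.right, cur.second + "1"});
    }
    return codes;
}

// True if no code is a prefix of (or equal to) another code.
bool is_prefix_free(const std::map<char, std::string>& codes) {
    for (const auto& a : codes)
        for (const auto& b : codes)
            if (a.first != b.first &&
                b.second.compare(0, a.second.size(), a.second) == 0)
                return false;
    return true;
}
```

Called with the weights from the example (a=50, b=20, c=10, d=5, e=5, f=10), the sketch gives `a` a 1-bit code. Ties may be broken differently from the walkthrough, so individual codes can differ, but the weighted total length is the same: 210 bits per 100 symbols, or 2.1 bits per symbol on average.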

Source https://stackoverflow.com/questions/54643091

Community Discussions, Code Snippets contain sources that include Stack Exchange Network

Vulnerabilities

No vulnerabilities reported

Install huffman-coding

You can download it from GitHub.

Support

For any new features, suggestions, or bugs, create an issue on GitHub. If you have any questions, check and ask on the Stack Overflow community page.
