Reed-Solomon | Reed Solomon BCH encoder and decoder | Messaging library
kandi X-RAY | Reed-Solomon Summary
This RS implementation was designed for embedded purposes, so all memory allocations are performed on the stack. If somebody wants to reimplement memory management with heap usage, pull requests are welcome.
Community Discussions
Trending Discussions on Reed-Solomon
QUESTION
I have some questions related to the RAID storage structure. First, let me show a picture of a RAID system: https://i.stack.imgur.com/Whd6B.png. According to my understanding, each disk is partitioned into multiple strips. The encoding or decoding is done along a stripe, which is a collection of strips. Let's say each stripe consists of k data strips and t parity strips. If I'm using a Reed-Solomon (RS) code constructed over GF(2^w), each strip with a size of q bytes will be divided into q/k symbols, and each symbol consists of w bits.
My questions:
- When we talk about RS en/decoding, do we treat each strip as an RS symbol, or each w-bit unit in the strip as an RS symbol, even though each w-bit unit in a strip is multiplied by the same w-bit element of GF(2^w)? Description:
- When I was working on the software implementation of a RAID system, the thing I learned from each research paper, especially from "A Performance Evaluation and Examination of Open-Source Erasure Coding Libraries For Storage", is that the t parity strips, P=[p0,p1,...,p_{t-1}], are computed as the matrix multiplication P=CxD, where D=[d0,d1,...,d_{k-1}] and C is a t x k Cauchy matrix (I'm using Cauchy-matrix-based encoding as an example).
- When I was reading another paper that focuses on the hardware implementation of RS coding, "A Low Complexity Design of Reed Solomon Code Algorithm for Advanced RAID System", they seem to refer to each strip as an RS symbol. How is that possible? Let's say we are using GF(2^8); can the size of a strip be only 8 bits? Or, in hardware implementations, do people simply use a higher-order finite field to construct the RAID system?
- I sometimes see people describe the RAID storage system in terms of drives, where "each drive is divided into multiple strips". So, what is the difference between the terms drive and disk? Are they used interchangeably?
ANSWER
Answered 2020-Oct-14 at 16:01
each strip
Each group of blocks can be considered to be a matrix: let r = the number of disks and c = the number of bytes per block; then the matrix is an r-row by c-column matrix. The Reed-Solomon-like encoding and decoding are performed on each column of the matrix, where each column is treated as an array of r bytes.
open source erasure
Erasure encoding may be different from Raid encoding. Erasure encoding can be based on a modified Vandermonde matrix, the transpose of what Wikipedia calls the systematic encoding for original-view Reed-Solomon.
Erasure encoding can also be based on a Cauchy matrix, as shown in the paper you read.
For Raid, the encoding matrix row corresponding to the P parity is all 1's (effectively XOR), the Q parity row is powers of 2 (in GF(2^8)), the R parity row is powers of 4, and so on. Unlike standard Reed-Solomon, which generates parities (based on the remainder of division by a generator polynomial), Raid generates syndromes.
In all 3 cases, the encoding matrix is the identity matrix augmented by the rows that encode the parities, but the augmented rows are different for Vandermonde, Cauchy, and Raid.
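As a concrete illustration of the P (all 1's) and Q (powers of 2) parity rows, here is a minimal sketch, not the code of any particular Raid implementation; it assumes GF(2^8) with the commonly used reducing polynomial 0x11D, and the function name `raid_pq` and the byte values are hypothetical:

```python
from functools import reduce

def xtime(b):
    # Multiply a byte by 2 in GF(2^8), reducing polynomial 0x11D
    b <<= 1
    if b & 0x100:
        b ^= 0x11D
    return b & 0xFF

def raid_pq(column):
    # column: one byte from each data drive at the same offset
    p = reduce(lambda a, b: a ^ b, column, 0)   # P row: all 1's, i.e. XOR
    q = 0
    for d in reversed(column):                  # Q row: powers of 2, via Horner's rule
        q = xtime(q) ^ d
    return p, q

col = [0x11, 0xA5, 0x3C, 0xF0]   # arbitrary example bytes, one per data drive
p, q = raid_pq(col)
```

A single erased data byte can then be regenerated as the XOR of P with the surviving data bytes, which matches the Raid single-erasure rule described later in this answer.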
drive or disk
In this case, they mean the same thing.
encoding
Let d = the number of data rows per strip and p = the number of parity rows per strip; then encoding can be implemented as a matrix multiply using the last p rows of the encoding matrix: a p by d encoding sub-matrix times the d by c matrix of data rows.
decoding
Open-source erasure decoding is done in two steps. Let e = the number of erased data rows. First, the e erased data rows are regenerated: take the rows of the encoding matrix corresponding (same row index) to d non-erased rows, invert that d by d sub-matrix, and take the e rows of the inverted matrix corresponding to the erased data rows; this yields an e by d matrix that is multiplied by the d by c matrix of non-erased rows to regenerate the data. Second, once the data is regenerated, any erased parity rows are regenerated by re-encoding the now-regenerated data.
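The two-step procedure above can be sketched end to end. This is illustrative only, not code from any particular library; it assumes GF(2^8) with reducing polynomial 0x11D, 4 data rows, and 2 Raid-style parity rows (all 1's and powers of 2), a layout chosen so that the surviving rows used below form an invertible sub-matrix:

```python
# GF(2^8) log/antilog tables, reducing polynomial 0x11D
GF_EXP = [0] * 512
GF_LOG = [0] * 256
x = 1
for i in range(255):
    GF_EXP[i] = x
    GF_LOG[x] = i
    x <<= 1
    if x & 0x100:
        x ^= 0x11D
for i in range(255, 512):
    GF_EXP[i] = GF_EXP[i - 255]

def gf_mul(a, b):
    if a == 0 or b == 0:
        return 0
    return GF_EXP[GF_LOG[a] + GF_LOG[b]]

def gf_inv(a):
    return GF_EXP[255 - GF_LOG[a]]

def mat_mul(A, B):
    # Matrix product over GF(2^8); addition in the field is XOR
    C = [[0] * len(B[0]) for _ in range(len(A))]
    for i in range(len(A)):
        for k in range(len(B)):
            if A[i][k]:
                for j in range(len(B[0])):
                    C[i][j] ^= gf_mul(A[i][k], B[k][j])
    return C

def mat_inv(A):
    # Gauss-Jordan elimination over GF(2^8)
    n = len(A)
    M = [row[:] + [int(i == j) for j in range(n)] for i, row in enumerate(A)]
    for col in range(n):
        piv = next(r for r in range(col, n) if M[r][col])
        M[col], M[piv] = M[piv], M[col]
        scale = gf_inv(M[col][col])
        M[col] = [gf_mul(scale, v) for v in M[col]]
        for r in range(n):
            if r != col and M[r][col]:
                f = M[r][col]
                M[r] = [v ^ gf_mul(f, M[col][j]) for j, v in enumerate(M[r])]
    return [row[n:] for row in M]

# 4 data rows of c = 5 bytes each (arbitrary example values)
data = [[16 * r + c for c in range(5)] for r in range(4)]

# Encoding matrix: identity over the data, plus P (all 1's) and Q (powers of 2) rows
E = [[int(i == j) for j in range(4)] for i in range(4)]
E += [[1, 1, 1, 1], [1, 2, 4, 8]]

code = mat_mul(E, data)              # rows 0-3: data, row 4: P, row 5: Q

# Erase data rows 0 and 2; decode from the surviving rows 1, 3, P, Q
surviving = [1, 3, 4, 5]
inv = mat_inv([E[r] for r in surviving])
recovered = mat_mul(inv, [code[r] for r in surviving])
```

`recovered` equals the original 4 data rows; erased parity rows would then be rebuilt by re-encoding the regenerated data.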
Raid decoding is different. If there is only a single erasure in the data or P rows, XOR is used to regenerate the erased row. If there is only a single erasure in the Q, R, ... rows, the erased row is re-encoded. For multiple row erasures, GF() algebra is normally used to regenerate the erased data. This could also be done by creating an e by e matrix based on powers of 2 raised to the indexes of the erased rows, inverting that e by e matrix, multiplying the inverse by the first e of the p parity rows of the encoding matrix, and then using the result to regenerate the erased data.
Note that neither of these decoding methods is a conventional Reed-Solomon decoding method; conventional decoders are intended to correct errors as well as erasures. The modified Vandermonde matrix gives the same encoding as original-view systematic encoding, but its erasure decoding would be the same as described above.
If the encoding were based on what the Wiki calls "BCH view", then the parity rows would be actual parities (the remainder from a generator polynomial); during decoding, e rows of syndromes are generated, and once they are, a Raid-like regeneration of erased data could be done. I'm not aware of an erasure code based on "BCH view" encoding.
https://en.wikipedia.org/wiki/Reed%E2%80%93Solomon_error_correction#Systematic_encoding_procedure
Update based on comments.
Raid (standard version) does not use a Cauchy matrix. Raid does not use a modified Vandermonde matrix either. Cauchy and modified Vandermonde matrices are mostly used by jerasure-type algorithms, not Raid. Raid encoding is based on powers of 2: {1 1 1 1 1 ...}, {2 4 8 16 ...}, {4 16 64 ...}.
Raid is byte oriented, not bit oriented. Each column of a "strip" is treated as an array of bytes.
Raid (and erasure) encoding and decoding is performed on the columns of a matrix, treating each column as an independent array of bytes.
As for the limitation of operating within GF(2^8): this limits Raid to 255 data drives and 255 parity (syndrome) drives. Jerasure is limited to a total of 255 (BCH view) or 256 (original view) drives for both data and parity. I'm not aware of any implementation that comes close to these limits. Raid 6 (2 parities, P and Q) specifies a max of 32 drives, but that is an arbitrary choice. A common jerasure implementation splits a file into 17 logical strips called shards and adds 3 shards for parity, where each shard is typically stored on a different Raid block of drives, on a different node, or on a different server.
QUESTION
I have a question about a statement in this paper, "Generalized Integrated Interleaved Codes". The paper mentions that erasure decoding of a Reed-Solomon (RS) code incurs no miscorrection, but error-only decoding of an RS code incurs a high miscorrection rate if the correction capability is too low.
From my understanding, the difference between erasure decoding and error-only decoding is that erasure decoding does not need to compute the error locations, while error-only decoding has to determine the error locations, which can be computed by the Berlekamp–Massey algorithm. I wonder if the miscorrection in error-only decoding comes from computing the wrong error locations? If yes, why is the miscorrection rate related to the correction capability of the RS code?
ANSWER
Answered 2020-Oct-01 at 18:27
miscorrection for error-only decoding comes from computing the wrong error locations
Yes. For example, consider an RS code with 6 parities, which can correct 3 errors. Assume that 4 errors have occurred, and that a 3 error correction attempt created an additional 3 errors, for a total of 7 errors. It will produce a valid codeword, but the wrong codeword.
There are situations where the probability of miscorrection can be lowered. If the message is a shortened message, say 64 bytes of data and 6 parities for a total of 70 bytes, then miscorrection can be avoided whenever a 3-error case produces an invalid location. In this case, the odds of 3 random locations all being valid are (70/255)^3 ~= .02 (2%).
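The 2% figure is easy to reproduce with a one-line calculation (a numeric sanity check only; the variable name is hypothetical):

```python
# Chance that 3 independently random locations in 0..254 all fall
# inside the 70 valid positions of the shortened 70-byte message
p_all_valid = (70 / 255) ** 3   # probability the miscorrection goes undetected
```

Equivalently, at least one invalid location is produced, and the miscorrection avoided, with probability 1 - (70/255)^3, about 98%.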
Another way to avoid miscorrection is to not use all of the parities for correction. With 6 parities, the correction could be limited to 2 errors, leaving 2 parities for detection purposes. Or use 7 parities, for 3 error correction, with 1 parity used for detection.
Follow up based on comments:
First, note that there are 3 decoders that can be used for BCH-view Reed-Solomon: the PGZ (Peterson Gorenstein Zierler) matrix decoder, the BKM (Berlekamp Massey) discrepancy decoder, and Sugiyama's extended Euclid decoder. PGZ has greater time complexity, O((n-k)^3), than BKM or Euclid, so most implementations use BKM or Euclid. You can read a bit more about these decoders here:
https://en.wikipedia.org/wiki/Reed%E2%80%93Solomon_error_correction
Getting back to 6 parities and 4 errors: all valid RS(n+6, n) codewords differ from each other in at least 7 elements. If 4 elements of a message are in error, there may be a valid codeword that differs from that message in only 3 more elements. In that case, all 3 decoders will find that the message differs from a valid codeword in 3 elements and "correct" those 3 elements to produce a valid codeword, but the wrong valid codeword, one that differs from the original codeword in 7 elements. With 5 elements in error, a 2- or 3-error miscorrection could occur, and with 6 or more elements in error, a 1-, 2-, or 3-error miscorrection could occur.
Invalid location: consider an RS code based on GF(2^8), which allows for a message size of up to 255 bytes. Valid locations for a 255-byte message are 0 to 254. If the message size is less than 255 bytes, for example 64 data + 6 parity = 70 bytes, then locations 0 to 69 are valid, while locations 70 to 254 are invalid. In what would otherwise be a case of miscorrection, if a calculated location is out of range, the decoder has detected an uncorrectable message rather than miscorrecting it. Assume a garbled message and that the decoder generates 3 random locations in the range 0 to 254; the probability of all 3 being in the range 0 to 69 is (70/255)^3.
Another case where miscorrection is avoided is when the number of distinct roots of the error locator polynomial does not match the degree of the polynomial. Consider a 3-error case with generated error locator polynomial x^3 + a x^2 + b x + c. If there are more than 3 errors in the message, the generated polynomial may have fewer than 3 distinct roots (such as a double root, or no roots at all), in which case miscorrection is avoided and the message is detected as uncorrectable.
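The degree-versus-distinct-roots check can be sketched as follows. This is illustrative only; it assumes GF(2^8) with reducing polynomial 0x11D and builds a locator with a deliberate double root (a real decoder would obtain the locator from BKM or Euclid, not from known roots):

```python
# GF(2^8) log/antilog tables, reducing polynomial 0x11D
GF_EXP = [0] * 512
GF_LOG = [0] * 256
x = 1
for i in range(255):
    GF_EXP[i] = x
    GF_LOG[x] = i
    x <<= 1
    if x & 0x100:
        x ^= 0x11D
for i in range(255, 512):
    GF_EXP[i] = GF_EXP[i - 255]

def gf_mul(a, b):
    if a == 0 or b == 0:
        return 0
    return GF_EXP[GF_LOG[a] + GF_LOG[b]]

def poly_from_roots(roots):
    # Expand the product of (x + r) factors; field addition is XOR
    p = [1]                          # p[i] is the coefficient of x^i
    for r in roots:
        q = [0] * (len(p) + 1)
        for i, c in enumerate(p):
            q[i] ^= gf_mul(c, r)     # the r * (c * x^i) term
            q[i + 1] ^= c            # the x * (c * x^i) term
        p = q
    return p

def poly_eval(p, v):
    # Horner evaluation over GF(2^8)
    acc = 0
    for c in reversed(p):
        acc = gf_mul(acc, v) ^ c
    return acc

# Locator with roots {3, 3, 5}: degree 3 but only 2 distinct roots
locator = poly_from_roots([3, 3, 5])
distinct_roots = [v for v in range(1, 256) if poly_eval(locator, v) == 0]
uncorrectable = len(distinct_roots) != len(locator) - 1
```

Here the brute-force root search (a Chien-search stand-in) finds only 2 distinct roots for a degree-3 polynomial, so the message is flagged as uncorrectable instead of being miscorrected.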
QUESTION
I am working on an object storage project where I need to understand the Reed-Solomon error correction algorithm.
I have gone through this doc as a starter and also some thesis papers:
1. content.sakai.rutgers.edu
2. theseus.fi
but I can't seem to understand where the lower part of the identity matrix (red box) comes from. How is this calculation done?
Can anyone please explain this?
ANSWER
Answered 2020-May-23 at 14:08
The encoding matrix is a 6 x 4 Vandermonde matrix using the evaluation points {0 1 2 3 4 5}, modified so that the upper 4 x 4 portion of the matrix is the identity matrix. To create it, a 6 x 4 Vandermonde matrix is generated (where matrix[r][c] = pow(r,c)), then multiplied by the inverse of its upper 4 x 4 portion to produce the encoding matrix. This is the equivalent of "systematic encoding" with Reed-Solomon's "original view", as mentioned in the Wikipedia article you linked to above, which is different from Reed-Solomon's "BCH view", which links 1. and 2. refer to. Wikipedia's example systematic encoding matrix is a transposed version of the encoding matrix used in the question.
https://en.wikipedia.org/wiki/Vandermonde_matrix
The code to generate the encoding matrix is near the bottom of this github source file:
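The same construction can be reproduced with a short sketch (an illustrative reimplementation, not the code in that file; it assumes the GF(2^8) field with reducing polynomial 0x11D):

```python
# GF(2^8) log/antilog tables, reducing polynomial 0x11D
GF_EXP = [0] * 512
GF_LOG = [0] * 256
x = 1
for i in range(255):
    GF_EXP[i] = x
    GF_LOG[x] = i
    x <<= 1
    if x & 0x100:
        x ^= 0x11D
for i in range(255, 512):
    GF_EXP[i] = GF_EXP[i - 255]

def gf_mul(a, b):
    if a == 0 or b == 0:
        return 0
    return GF_EXP[GF_LOG[a] + GF_LOG[b]]

def gf_inv(a):
    return GF_EXP[255 - GF_LOG[a]]

def gf_pow(a, n):
    if n == 0:
        return 1
    if a == 0:
        return 0
    return GF_EXP[(GF_LOG[a] * n) % 255]

def mat_mul(A, B):
    # Matrix product over GF(2^8); field addition is XOR
    C = [[0] * len(B[0]) for _ in range(len(A))]
    for i in range(len(A)):
        for k in range(len(B)):
            if A[i][k]:
                for j in range(len(B[0])):
                    C[i][j] ^= gf_mul(A[i][k], B[k][j])
    return C

def mat_inv(A):
    # Gauss-Jordan elimination over GF(2^8)
    n = len(A)
    M = [row[:] + [int(i == j) for j in range(n)] for i, row in enumerate(A)]
    for col in range(n):
        piv = next(r for r in range(col, n) if M[r][col])
        M[col], M[piv] = M[piv], M[col]
        scale = gf_inv(M[col][col])
        M[col] = [gf_mul(scale, v) for v in M[col]]
        for r in range(n):
            if r != col and M[r][col]:
                f = M[r][col]
                M[r] = [v ^ gf_mul(f, M[col][j]) for j, v in enumerate(M[r])]
    return [row[n:] for row in M]

# 6 x 4 Vandermonde matrix on evaluation points {0 1 2 3 4 5}: matrix[r][c] = r^c
vand = [[gf_pow(r, c) for c in range(4)] for r in range(6)]

# Multiply by the inverse of the upper 4 x 4 portion: the top becomes the
# identity, and the bottom 2 rows become the parity rows (the "red box")
encode = mat_mul(vand, mat_inv([row[:] for row in vand[:4]]))
```

After the multiplication, the top 4 rows are exactly the identity, which is why the encoded output starts with an unmodified copy of the data.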
QUESTION
The Reed-Solomon algorithm adds additional data to the input, so that potential errors (of a particular size/quantity) in such damaged input can be corrected back to the original state. Correct? Does this algorithm also protect the added data that is not part of the input but is used by the algorithm? If not, what happens if an error occurs in that non-input data part?
ANSWER
Answered 2020-May-08 at 08:54
An important aspect is that Reed-Solomon (RS) codes are cyclic: the set of codewords is stable under cyclic shifts.
A consequence is that no particular part of a code word is more protected or less protected.
An RS code has an error-correction capability of t = (n-k)/2, where n is the code length (generally expressed in bytes) and k is the length of the information part.
If the total number of errors (across both parts) is at most t, the RS decoder will be able to correct them (more precisely, up to t erroneous bytes in the general case). If it is higher, the errors cannot be corrected (but they may be detected, which is another story).
The placement of the errors, whether in the information part or the added part, has no influence on the error-correction capability.
EDIT: the rule t = (n-k)/2 that I mentioned is valid for Reed-Solomon codes. This rule is not generally correct for BCH codes, where t <= (n-k)/2. However, with respect to your question, this does not change the answer: these families of codes have a given correction capacity, corresponding to the minimum distance between codewords, and the decoders can then correct t errors, whatever the positions of the errors in the codeword.
QUESTION
I have combined data that requires a minimum of 35 bits.
Using a 4-state barcode, each bar represents 2 bits, so the above mentioned information can be translated into 18 bars.
I would like to add some strong error correction to this barcode, so that if it's somehow damaged, it can be corrected. One such approach is Reed-Solomon error correction.
My goal is to add as strong error correction as possible, but on the other hand I have a size limitation on the barcode. If I understood the Reed-Solomon algorithm correctly, m∙k has to be at least the size of my message, i.e. 35 in my case.
Based on the Reed-Solomon Interactive Demo, I can go with (m, n, t, k) being (4, 15, 3, 9), which would allow me to code message up to 4∙9 = 36 bits. This would lead to code word of size 4∙15 = 60 bits, or 30 bars, but the error correction ratio t / n would be just 20.0%.
Next option is to go with (m, n, t, k) being (5, 31, 12, 7), which would allow me to code message up to 5∙7 = 35 bits. This would lead to code word of size 5∙31 = 155 bits, or 78 bars, and the error correction ratio t / n would be ~38.7%.
The first scenario requires use of barcode with 30 bars, which is nice, but 20.0% error correction is not as great as desired. The second scenario offers excellent error correction of 38.7%, but the barcode would have to have 78 bars, which is too many.
Is there some other approach or a different method, that would offer great error correction and a reasonable barcode length?
ANSWER
Answered 2020-May-07 at 21:51
You could use a shortened code word such as (5, 19, 6, 7): a ~31.6% correction ratio, 95 bits, 48 bars. One advantage of a shortened code word is a reduced chance of miscorrection if it is allowed to correct the maximum of 6 errors: if any of the 6 computed error locations is outside the range of valid locations, that is an indication that there are more than 6 errors. The probability of miscorrection is about (19/31)^6 = 5.3%.
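The trade-off between the three candidate codes can be tabulated with a short helper (a quick calculation sketch; the function name is hypothetical, and the (m, n, t, k) convention follows the demo cited in the question: m-bit symbols, n symbols per codeword, t correctable errors, k data symbols):

```python
import math

def barcode_params(m, n, t, k):
    # m-bit symbols, n symbols per codeword, t correctable symbol errors,
    # k data symbols; each 4-state bar carries 2 bits
    total_bits = m * n
    return {"data_bits": m * k,
            "total_bits": total_bits,
            "bars": math.ceil(total_bits / 2),
            "correction_ratio": t / n}

full_gf16 = barcode_params(4, 15, 3, 9)    # the (4, 15, 3, 9) option
full_gf32 = barcode_params(5, 31, 12, 7)   # the (5, 31, 12, 7) option
shortened = barcode_params(5, 19, 6, 7)    # the shortened code word

# Chance that all 6 computed locations of a miscorrection happen to be valid
p_mis = (19 / 31) ** 6
```

This reproduces the numbers above: 30 bars at 20%, 78 bars at ~38.7%, and the shortened compromise of 48 bars at ~31.6% with roughly a 5.3% residual miscorrection chance.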
QUESTION
CD-ROM data uses a 3rd layer of error detection using Reed-Solomon and an EDC using a 32-bit CRC polynomial.
The ECMA-130 standard defines the EDC CRC polynomial as follows (page 16, 14.3):
P(x) = (x^16 + x^15 + x^2 + 1) · (x^16 + x^2 + x + 1)
and
The least significant bit of a data byte is used first.
Usually, translating the polynomial into its integer value form is pretty straightforward. Using modulo math, the expanded polynomial must be P(x) = x^32 + x^31 + x^18 + x^17 + x^16 + x^15 + x^4 + x^3 + x^2 + x + 1, thus the value being 0x8007801F.
The last sentence means that the polynomial is reversed (if I get it right).
But I haven't managed to get the right value so far. The Cdrtools source code uses 0x08001801 as the polynomial value. Can someone explain how they found that value?
ANSWER
Answered 2020-May-06 at 08:13
Posting the answer:
First, I made a mistake in the modulo-2 algebra used to expand the polynomial. The non-modulo expanded form is x^32 + x^31 + 2x^18 + 2x^17 + 3x^16 + x^15 + x^4 + x^3 + 2x^2 + x + 1; reducing each coefficient modulo 2 leaves P(x) = x^32 + x^31 + x^16 + x^15 + x^4 + x^3 + x + 1, i.e. the value 0x8001801B.
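The mod-2 expansion of the two factors is easy to verify with a carry-less (GF(2) polynomial) multiplication, a small sketch independent of any CRC library:

```python
def clmul(a, b):
    # Carry-less (GF(2) polynomial) multiplication of two bit-vectors
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r

P1 = 0x18005   # x^16 + x^15 + x^2 + 1
P2 = 0x10007   # x^16 + x^2 + x + 1
product = clmul(P1, P2)
# The set bits of `product` are the exponents of the mod-2 expanded polynomial
exponents = [i for i in range(product.bit_length()) if (product >> i) & 1]
```

The set bits come out as exponents {32, 31, 16, 15, 4, 3, 1, 0}, i.e. x^32 + x^31 + x^16 + x^15 + x^4 + x^3 + x + 1, whose low 32 bits give the value 0x8001801B.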
Community Discussions and Code Snippets contain sources that include the Stack Exchange Network.