base32-js | Base32 encoding for JavaScript
kandi X-RAY | base32-js Summary
Base32 encoding for JavaScript, based (loosely) on Crockford's Base32
base32-js Key Features
base32-js Examples and Code Snippets
/**
* Decodes a Nano-implementation Base32 encoded string into a Uint8Array
* @param {string} input A Nano-Base32 encoded string
* @returns {Uint8Array}
*/
function decode (input)
/**
 * Encode provided Uint8Array using the Nano-specific Base-32 alphabet
 */
function encode (input)
// using color(x, y) to describe the image pixels
[ red(0, 0), green(0, 0), blue(0, 0), alpha(0, 0), red(1, 0), ... ] =
[ 124, 12, 123, 255, 122, ... ]
// JavaScript Array
[ /* pixel 1 */ 124, 12, 123, 255, /* pixel 2 */ 122, ... ]
$> echo -n "" | openssl dgst -binary -sha1 | base32
3I42H3S6NNFQ2MSVX7XZKYAYSCX5QBYJ
warc-record = header CRLF
block CRLF CRLF
$> cat request
GET /vital-signs/carbon-
function dec2hex(s) {
  return (s < 15.5 ? '0' : '') + Math.round(s).toString(16);
}
function hex2dec(s) {
  return parseInt(s, 16);
}
function base32tohex(base32) {
  var base32chars = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"; // snippet was truncated here; completed with the standard RFC 4648 alphabet
  var bits = "", hex = "";
  for (var i = 0; i < base32.length; i++) {
    bits += base32chars.indexOf(base32.charAt(i).toUpperCase()).toString(2).padStart(5, '0'); // 5 bits per symbol
  }
  for (var j = 0; j + 4 <= bits.length; j += 4) {
    hex += parseInt(bits.substr(j, 4), 2).toString(16); // 4 bits per hex digit
  }
  return hex;
}
Community Discussions
Trending Discussions on base32-js
QUESTION
I want to encode random integers I create in my Web application using Douglas Crockford's Base32 implementation, described at the following URL https://www.crockford.com/base32.html. I had planned to build the encoder myself as a learning exercise, but the 'lower level' details have opened a bit of a Pandora's box for me.
Problem
1. Encoding "12345" using the Base32 implementations I have tried (e.g. https://github.com/agnoster/base32-js) results in "64t36d1n".
2. Encoding the same number using (12345).toString(32) results in "c1p", which makes more sense to me, because it's shorter (and that's my goal).
My hunch is that the difference comes from operating on a string as opposed to a number. However, inspecting the code of the implementations I have tried reveals that they turn strings into integers using something akin to byte.charCodeAt(0) anyway, so naively this seems like the same thing.
I would use option 2, except for the fact that I would like to have control over the alphabet (e.g. to omit U, I, etc.). I would appreciate it if someone in the know could help point me in the right direction and improve my understanding of this topic. Many thanks.
...ANSWER
Answered 2021-Jul-04 at 12:56
The confusion probably stems from the fact that there are two different things (albeit closely related) you could mean when you say "base 32".
Thing 1: Number radix
The way a representation of a number is structured, defining how many different symbol options a single "digit" has. We humans usually use base 10 to represent our numbers with 10 different symbols (0-9), but you can also have binary, which is base 2 (using only the symbols 0 and 1), octal, which is base 8 (using symbols 0-7), and so on. See base/radix on Wikipedia. With bases higher than 10 you usually use letters, starting with A for the 11th symbol. For example, hexadecimal (base 16) uses the symbols 0-9 and A-F.
Since we only have 26 distinct letters to use in addition to the 10 symbols 0-9, in most scenarios only representations up to base 36 are defined. If you try to run 12345..toString(40) you'll get a RangeError: toString() radix argument must be between 2 and 36 for this reason.
Now, representing the number 12345 in base 32 this way (using symbols 0-9 and A-V) will give you C1P, since C has the value 12, 1 has the value 1 and P has the value 25, and 12 * 32^2 + 1 * 32^1 + 25 * 32^0 equals 12345. This is just a way to write a number using 32 distinct symbols instead of 10; I wouldn't call it an "encoding" as such.
If the base is larger than 10, this will result in a shorter1 (or equally long) representation than regular base 10.
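The radix change described above can be seen directly with JavaScript's built-in conversions (a quick illustration, not code from base32-js):

```javascript
// Number.prototype.toString with a radix argument produces the "Thing 1"
// representation (using lowercase a-v); parseInt reverses it.
const n = 12345;
console.log(n.toString(32));      // "c1p", the lowercase form of C1P
console.log(parseInt("c1p", 32)); // 12345
// Check the place values: C=12, 1=1, P=25
console.log(12 * 32 ** 2 + 1 * 32 + 25 * 1); // 12345
```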
Thing 2: baseN encodingThe term of a "baseN" encoding as in "base64" (the most well-known such encoding) means the encoding of an octet stream (a stream of bytes, of 8-bit binary2 data) using an "alphabet" (a set of allowed symbols), where the N specifies how many symbols the alphabet has3. This is used to store or transmit any octet stream (regardless of its contents) on a medium that doesn't allow for the full range of possible values in a byte to be used (for instance, a medium such as an email which may only contain text, or an URL which doesn't allow for certain special characters such as /
or ?
because they have semantic meaning, etc. - or even a piece of paper because the symbols 0
and O
as well as I
and l
and 1
can't be reliably used without danger for confusion between them when read back by a human).
Now comes the part that marks the relation to the first "thing": The way the conversion works can be imagined by turning the input bytes into a huge number, and changing its radix but using the alphabet the encoding defines, not necessarily digits followed by letters. A great visual explanation can be found here.
The "turning the input bytes into a huge number" part is where the charCodeAt you mentioned comes into play: I can turn the string ABC into the number 4276803, for instance, which becomes more obvious when looking at the bytes in their hexadecimal representation, because a byte can have 256 values and this range fits neatly into exactly two hexadecimal "digits" (0x00-0xFF4). The three bytes5 in ABC have hexadecimal values of 0x41, 0x42 and 0x43 respectively, and if I put them next to each other I can look at them as the large number 0x414243 = 4276803.
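The byte-level view can be reproduced with charCodeAt (a quick sketch, not code from base32-js):

```javascript
// Turn "ABC" into its byte values, then read them side by side as one hex number.
const bytes = [..."ABC"].map(c => c.charCodeAt(0)); // [65, 66, 67] = [0x41, 0x42, 0x43]
const hex = bytes.map(b => b.toString(16)).join(""); // "414243"
console.log(parseInt(hex, 16)); // 4276803
```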
An additional overlap with the first "thing" is that in cryptography, very large numbers come into play, and often those are also encoded using a mechanism like base32 or base58 or base64, but unless the programming language and/or processor have a data type/register that fits the large number, the number is at that point already represented as some sort of octet stream yet again (the inverse of what I just described, sometimes with the reverse byte order though).
Of course this is only conceptually how it's done, because otherwise the algorithm would have to cope with gigantic numbers once we are talking about encoding not 3 bytes but 3,000,000 bytes. In reality, clever ways involving bit shifting etc. are used to achieve the same result on any length of data sequentially.
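The conceptual "huge number, changed radix" view can be sketched with BigInt. This is illustrative only (real encoders use the sequential bit-shifting approach), and the choice of Crockford's alphabet here is an assumption for the demo:

```javascript
// Conceptual base32 encoder: fold all bytes into one big number, then
// repeatedly peel off digits modulo 32. Impractical for large inputs.
const ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"; // Crockford's alphabet (assumed)
function bytesToBase32(bytes) {
  let n = 0n;
  for (const b of bytes) n = (n << 8n) | BigInt(b); // bytes -> huge number
  if (n === 0n) return ALPHABET[0];
  let out = "";
  while (n > 0n) {
    out = ALPHABET[Number(n % 32n)] + out; // one base-32 digit at a time
    n /= 32n;
  }
  return out;
}
console.log(bytesToBase32([0x41, 0x42, 0x43])); // "42GJ3"
```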
Since a "string" as you are used to seeing it (ignoring Unicode for a second) can be loosely compared to a number represented in sort of a base 256 (one symbol for each of the 256 possible values in a byte), this means that any such baseN encoding will make the output longer, because the new "radix" is lower than 256. Note that putting 12345 into a base32 algorithm will mean the string 12345, which could be viewed as the number 211295614005 (or 0x3132333435) in my explanation above. Looking at it this way, 64t36d1n (what you got from base32) is definitely shorter than 211295614005 in base 10, so it all makes sense again.
Important note: This explanation isn't entirely correct if you have input data which can't be exactly mapped to its new representation without padding due to its length. For example, a 3-byte-long data chunk occupies 3*8=24 bits, and a base64 representation of it that uses 6 bits per symbol is easily possible because exactly four of these symbols would also occupy 4*6=24 bits. But a 4-byte-long data chunk occupies 4*8=32 bits and would therefore require 5.333... symbols in base64 (5.333...*6=32). To "fill up" the remaining data space, some sort of padding6 is used so we can round it up to 6 symbols. Normally, the padding is added to the end of the input data, and this is where reality differs from my "changing radix of huge number" concept above because in math you'd expect leading zeroes as padding.
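Node's Buffer, used here purely as a convenient base64 encoder (an assumption, not something the answer relies on), shows this length arithmetic:

```javascript
// 3 bytes = 24 bits -> exactly 4 base64 symbols, no padding needed.
console.log(Buffer.from([1, 2, 3]).toString("base64"));    // "AQID"
// 4 bytes = 32 bits -> 6 symbols, plus "==" to round up to a multiple of 4.
console.log(Buffer.from([1, 2, 3, 4]).toString("base64")); // "AQIDBA=="
```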
Douglas Crockford's base32
To address your initial question: Douglas Crockford's base32 algorithm is actually designed for numbers, but with a modified alphabet; it doesn't take an octet stream as input the way programmers are used to. So it's more like a middle ground between the two things described above. You are right that toString(32) goes half the way to what you need, but you'd have to map between the regular "alphabet" of radix 32 (0-9, A-V, case insensitive) and Crockford's (0-9 and A-Z but without I, L, O and U, case insensitive, mapping I and L to 1 and O to 0 when decoding).
Replacing those things back and forth is enough complexity that I guess it'd be cleaner (and more educational) to write the algorithm yourself from scratch instead of relying on toString.
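A minimal sketch of that mapping on top of toString(32), assuming Crockford's published alphabet (encoding direction only, no check symbol):

```javascript
const STD = "0123456789abcdefghijklmnopqrstuv";       // toString(32) symbols
const CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"; // skips I, L, O, U
function encodeCrockford(n) {
  // Translate each radix-32 digit into Crockford's alphabet.
  return [...n.toString(32)].map(c => CROCKFORD[STD.indexOf(c)]).join("");
}
// Note the last symbol differs from plain base 32's C1P, because
// Crockford's alphabet skips letters: digit 25 is S there, not P.
console.log(encodeCrockford(12345)); // "C1S"
```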
(Also, Crockford proposes an additional "check symbol" at the end which goes beyond what was explained here anyway.)
Footnotes:
1: This is assuming integers. If you have fractions, then things are very different, because you can get recurring decimals in the new radix for numbers that didn't have recurring decimals in the old radix, or the other way round. For instance, 0.1 in base 32 is 0.36CPJ6CPJ6CPJ6CPJ... which is an infinitely long number (in that particular representation).
2: The term "binary" here doesn't refer to representation in radix 2 but to "any kind of data which can use the full range of values from 0-255 per byte, not restricted to values representing human-readable text in ASCII range 32-126".
3: Note that from the N alone, you can't infer what the alphabet exactly is, only how long it is. Well-known encodings have universally accepted conventions about which alphabet is used, such as base64 and base58 (the latter often being used for cryptocurrency addresses, and its alphabet is not even in alphabetical order by the way). Note that even for base64 there are variations like base64url which change the alphabet slightly. Others such as base32 don't have a universally accepted alphabet yet which is why the website that you linked mentions "this is a base32 encoding and not the base32 encoding" - notably it's not the same as Crockford's alphabet.
4: The prefix 0x
is commonly used to denote that the following symbols are to be interpreted as a number in base 16 (hexadecimal) instead of base 10.
5: I'm talking about bytes here, because this is what the baseN algorithms work with, but in fact strings are based on characters and not bytes, and they may also contain Unicode characters with numerical values beyond 255, therefore not fitting into a single byte anymore. Normally, strings are first encoded using a character encoding like UTF-8 to bytes and then the baseN encoding is performed on those bytes.
6: base64 uses = as its padding character: zero bits pad the input data, and = characters are appended to the output to record how much padding was applied (= isn't in the alphabet of base64).
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
Vulnerabilities
No vulnerabilities reported