This blog will attempt to explain how to encode streams of bytes in Base32 and Base64. Yes, again, this is with unsigned bytes, for almost no reason other than that I might mess something up or introduce some inadequacy by not doing so.
Base32 works with any amount of data. Let's say I have the stream 10010110 00101101 10001100 00010000 01011011
This is a block of five bytes (40 bits), which is what every full encoding requires, because 40 bits split evenly into eight groups of five bits.
What we do is combine the stream into one giant amalgamation:
1001011000101101100011000001000001011011
I wonder what integer this represents...
Next, we break it apart into eight groups of five bits:
10010 11000 10110 11000 11000 00100 00010 11011
Five bits can represent a total of 32 possible values (hence Base32). How do we show those values? We simply start at 0 for 0 up to 9 for 9, then continue with A for 10 up to V for 31.
So, the first group from the left defines the integer 18, which becomes I. The next group represents the integer 24, which becomes O. The entire output is IOMOO42R.
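Here is a minimal Python sketch of the steps so far, assuming a full five-byte block and the 0-9, A-V character set described above. The names ALPHABET and encode_block are my own, made up for illustration.

```python
# Digits 0-9 followed by letters A-V: 32 symbols, one per 5-bit value.
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUV"

def encode_block(block: bytes) -> str:
    """Encode exactly five bytes as eight Base32 characters."""
    if len(block) != 5:
        raise ValueError("this sketch only handles a full five-byte block")
    # Combine the five bytes into one 40-bit integer (the giant amalgamation).
    value = int.from_bytes(block, "big")
    chars = []
    # Read the eight 5-bit groups from left (most significant) to right.
    for shift in range(35, -1, -5):
        chars.append(ALPHABET[(value >> shift) & 0b11111])
    return "".join(chars)

stream = bytes([0b10010110, 0b00101101, 0b10001100, 0b00010000, 0b01011011])
print(encode_block(stream))  # IOMOO42R
```

Shifting and masking the integer is just one way to pull out the groups; slicing a string of bits would work as well.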
Let's add one more byte to the stream: 10010110 00101101 10001100 00010000 01011011 00101101
When we group it into as many blocks of five bytes as possible, we get 1001011000101101100011000001000001011011 00101101
In this situation, we need to pad, that is, add more bytes to fill the block. We always read from left to right, so we add four more bytes of 0 to the right.
You pad whenever you don't have enough bytes to make a group of five. Sometimes you'll see these strings ending in equal signs. An equal sign marks a five-bit group made up entirely of padding, so a decoder can tell it apart from a real group of zeros. Several variants of Base32 also use different character sets. A common one starts at A for 0 up to Z for 25, then continues with 2 for 26 up to 7 for 31. 0 and 1 are skipped because they look too much like O and I. The same string above in this character set, with the six all-padding groups written as equal signs, would be:
SYWYYEC3FU======
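To check the padded case, here is a rough sketch that handles input of any length with the A-Z, 2-7 character set. The function name base32_encode is hypothetical. It pads a short final block with zero bytes as described, then writes every position that holds only padding as an equal sign, which is also what Python's standard base64.b32encode produces.

```python
import base64

# A-Z for 0-25, then 2-7 for 26-31.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"

def base32_encode(data: bytes) -> str:
    out = []
    for start in range(0, len(data), 5):
        chunk = data[start:start + 5]
        # Pad a short final chunk with zero bytes on the right.
        padded = chunk + bytes(5 - len(chunk))
        value = int.from_bytes(padded, "big")
        chars = [ALPHABET[(value >> shift) & 0b11111]
                 for shift in range(35, -1, -5)]
        # Only the groups that contain at least one real bit are kept;
        # groups made entirely of padding are written as "=".
        used = (len(chunk) * 8 + 4) // 5
        out.append("".join(chars[:used]) + "=" * (8 - used))
    return "".join(out)

stream = bytes([0b10010110, 0b00101101, 0b10001100,
                0b00010000, 0b01011011, 0b00101101])
encoded = base32_encode(stream)
print(encoded)  # SYWYYEC3FU======
# Cross-check against the standard library.
assert encoded == base64.b32encode(stream).decode("ascii")
```

The sixth byte only fills two characters (F and U), so the remaining six positions of the second block are pure padding.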
The idea is similar for Base64. Let's take our original stream and make groups of three bytes:
100101100010110110001100 0001000001011011 (make sure to pad here)
100101100010110110001100 000100000101101100000000
Then we split each group into four groups of six bits:
100101 100010 110110 001100 000100 000101 101100 000000
Six bits can represent a total of 64 values (hence Base64). Base64 also has a standard character set, which I will use here. We start at A for 0 up to Z for 25, then continue with lowercase a for 26 up to lowercase z for 51, then with 0 for 52 up to 9 for 61, ending with + for 62 and / for 63.
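We get the string: li2MEFs= (the last six-bit group is nothing but padding, so it is written as = rather than A, which would decode as real zero bits)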
Voila!
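The Base64 steps can be sketched the same way. This is only an illustration under my own naming (base64_encode is hypothetical), using the A-Z, a-z, 0-9, +, / character set and checked against Python's standard base64.b64encode, which writes all-padding groups as equal signs.

```python
import base64

# A-Z for 0-25, a-z for 26-51, 0-9 for 52-61, then + and /.
ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)

def base64_encode(data: bytes) -> str:
    out = []
    for start in range(0, len(data), 3):
        chunk = data[start:start + 3]
        # Pad a short final chunk with zero bytes on the right.
        padded = chunk + bytes(3 - len(chunk))
        value = int.from_bytes(padded, "big")
        # Four 6-bit groups, read from left to right.
        chars = [ALPHABET[(value >> shift) & 0b111111]
                 for shift in range(18, -1, -6)]
        # One byte fills two characters, two bytes fill three, three fill four.
        used = len(chunk) + 1
        out.append("".join(chars[:used]) + "=" * (4 - used))
    return "".join(out)

stream = bytes([0b10010110, 0b00101101, 0b10001100, 0b00010000, 0b01011011])
encoded = base64_encode(stream)
print(encoded)  # li2MEFs=
# Cross-check against the standard library.
assert encoded == base64.b64encode(stream).decode("ascii")
```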
((Also Base32 is case insensitive, so a and A are the same.))