Getting rid of binary strings, part 2

Prepending bits
Another common operation is to add bits to the beginning of some data.

var bin = "10011101011110100"; //80628
bin = "0110011" + bin; //51
console.log(bin); //Prints 011001110011101011110100
console.log(parseInt(bin, 2)); //Prints 6765300

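It’s trivial to do with bit manipulation, as long as everything fits into the buffer. In JavaScript, bitwise operators work on 32-bit signed integers, so the buffer can safely hold at most 31 bits.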

var buffer = 80628,
    nbBits = 17;
buffer |= 51 << nbBits;
nbBits += 7; //I'm adding 7 relevant bits to the buffer
console.log(buffer); //6765300

//Explanation
// 51 << 17 = 11001100000000000000000 (51 in binary followed by 17 zeroes)
//Thus
//          10011101011110100
//            BINARY OR
//    11001100000000000000000
// => 11001110011101011110100 (6765300)
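
The same steps can be wrapped in a small function that also checks the "fits into the buffer" condition. This is only a sketch: prependBits and its parameters are hypothetical names, not something from the original code, and the 31-bit limit assumes the buffer is a plain JavaScript number used with bitwise operators.

```javascript
// Hypothetical helper (not from the original post): prepend the `width`
// lowest bits of `value` in front of a buffer holding `nbBits` relevant bits.
function prependBits(buffer, nbBits, value, width) {
    // Bitwise operators work on 32-bit signed integers, so keep the total
    // at 31 bits or fewer to stay clear of the sign bit.
    if (nbBits + width > 31) {
        throw new RangeError("buffer overflow: " + (nbBits + width) + " bits");
    }
    // Mask the value so stray high bits cannot leak into the buffer.
    var mask = (1 << width) - 1;
    return {
        buffer: buffer | ((value & mask) << nbBits),
        nbBits: nbBits + width
    };
}

var result = prependBits(80628, 17, 51, 7);
console.log(result.buffer); // 6765300
console.log(result.nbBits); // 24
```

Returning both the new buffer and the new bit count keeps the two values in sync, which is easy to forget when they are updated by hand.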

Getting rid of binary strings, part 1

Every programmer is used to manipulating data at the byte level, but most rarely, if ever, have to directly manipulate bits. When such a situation arises, the “obvious” way to do it (when it comes to beginners) is to use binary strings, that is, strings of ones and zeroes. The manipulations are done using the usual string concatenation, substr and other basic string manipulation tools. The string is then finally parsed 8 characters at a time into bytes (using the language’s equivalent of parseInt()). In many languages, including Java and JavaScript, strings are immutable, so every concatenation copies the whole string again.

Any half-decent programmer would feel “dirty” writing such code, but my feeling is that most would do it anyway and say something like “it’s fast enough”, “it’s more readable” or “I’ll optimize it later if it’s too slow, but that’ll do for now” to justify themselves. Until relatively recently, I used to be one of them, and hopefully you will know better than to use binary strings after reading this. I’ll be using JavaScript, but this article is language-agnostic.

Partial bytes are often used in compression and stream algorithms.

Appending bits
By far the most common operation performed on binary strings is appending more binary data, such as:

var arr = [],
    bin = "";

bin += "101110";
bin += "0101";
bin += "0000";
bin += "1100110011";

//Pad the last byte if needed
if (bin.length % 8 !== 0){
    bin += Array(8 - bin.length%8 +1).join("0");
}

//Parse the string
for (var i=0, len=Math.ceil(bin.length/8);i<len;i++){
    arr.push(parseInt(bin.substring(i*8, (i+1)*8), 2));
}

console.log(arr); //Prints [185, 67, 51]
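
For comparison, here is one way the same bytes could be produced without any strings, by accumulating bits in an integer and emitting a byte whenever eight are available. It is a minimal sketch rather than the approach the article goes on to describe: BitWriter, append and finish are hypothetical names, and each call is assumed to append at most 24 bits so the buffer stays within 31 bits.

```javascript
// Hypothetical helper, not from the original post: accumulate bits in an
// integer and emit a byte whenever 8 or more bits are pending.
function BitWriter() {
    this.bytes = [];  // completed bytes
    this.buffer = 0;  // pending bits, right-aligned
    this.nbBits = 0;  // number of pending bits (always < 8 between calls)
}

// Append the `width` lowest bits of `value` (assumes width <= 24).
BitWriter.prototype.append = function (value, width) {
    this.buffer = (this.buffer << width) | (value & ((1 << width) - 1));
    this.nbBits += width;
    while (this.nbBits >= 8) {
        this.nbBits -= 8;
        this.bytes.push((this.buffer >>> this.nbBits) & 0xFF);
    }
    // Drop the bits that were already emitted.
    this.buffer &= (1 << this.nbBits) - 1;
};

// Pad the last partial byte with zeroes, as the string version does.
BitWriter.prototype.finish = function () {
    if (this.nbBits > 0) {
        this.bytes.push((this.buffer << (8 - this.nbBits)) & 0xFF);
        this.buffer = 0;
        this.nbBits = 0;
    }
    return this.bytes;
};

var writer = new BitWriter();
writer.append(0x2E, 6);   // "101110"
writer.append(0x5, 4);    // "0101"
writer.append(0x0, 4);    // "0000"
writer.append(0x333, 10); // "1100110011"
var bytes = writer.finish();
console.log(bytes); // [185, 67, 51]
```

No intermediate string is ever allocated, and the pending bits never take more than a single number.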

Javascript integers and bitwise operations

This is the first post in a series to demystify Javascript.

JavaScript is the most misunderstood language, and it has a huge stigma attached to it. Programmers learn it by copy-pasting from Stack Overflow questions and retroactively applying their knowledge of other languages to it, but JavaScript is different. It should be treated as a mostly functional language, not as a broken Java or C# web clone, and it should be learned properly. That stigma is starting to fade, mostly due to projects like Node.js, Express, EmberJS and the other client-side frameworks.

In JS, variables are declared using the var keyword. The type is inferred from the value of the variable. While JS only has a single Number type to represent numbers, modern engines internally use two representations: 32-bit signed integers and 64-bit double-precision floats. JS switches from one to the other automatically.

Bitwise operations are one of those cases, for obvious reasons. The AND (&), OR (|) and XOR (^) operators are well known, but the NOT (~) and bitshift operators (<<, >> and >>>) have a few lesser-known uses. I’m going to be using 16-bit ints for my examples.
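
As a rough illustration, here are a few commonly cited uses of ~, >> and >>>. These are not necessarily the ones covered in the rest of the post, the variable names are made up for the example, and the values are the real 32-bit results that JavaScript produces rather than 16-bit ones.

```javascript
// ~~x truncates toward zero by forcing a conversion to a 32-bit integer.
var truncated = ~~-7.9;                // -7

// ~n equals -(n + 1), so ~(-1) is 0: a compact "not found" test for indexOf.
var notFound = ~"abc".indexOf("z");    // 0 (falsy)
var found = ~"abc".indexOf("b");       // -2 (truthy)

// >> keeps the sign bit, while >>> shifts in zeroes from the left.
var signedShift = -8 >> 1;             // -4
var unsignedShift = -8 >>> 28;         // 15

// >>> 0 reinterprets a signed 32-bit value as unsigned.
var asUnsigned = -1 >>> 0;             // 4294967295

// x | 0 also coerces to a 32-bit signed integer and wraps past 2^31 - 1.
var wrapped = 2147483648 | 0;          // -2147483648

console.log(truncated, notFound, found, signedShift, unsignedShift, asUnsigned, wrapped);
```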

Hello world!

Let’s get started with this blog thing.
I never thought I’d end up having one myself, but after reading dozens of interesting blog posts in the last year (mostly linked on Hacker News), I can see why programmers start one.

Since I’ve been mostly writing Node.js and Javascript in the last year, many posts will be covering these topics.