Logarithmic scaling with the fastest JSON validator

If there’s something any backend engineer loves, it’s when they manage to solve a problem in an O(log(n)) way. I did that this week.

As the number of requests per second goes up exponentially, the number of servers needed to handle them goes up linearly (until we reach a bandwidth bottleneck). Amazing!

First, let me recap. We were having performance issues in an application. We needed a lot of big and expensive EC2 compute-optimized instances to handle fairly low traffic by our standards (250mb/s). Those instances were burning CPU like crazy. The codebase is built with performance in mind and we couldn’t find a single inefficiency in it. That means it’s time for flame graphs!

They look like this:
flame

Continue reading

My impressions after a week of CoffeeScript

Until a few days ago, I was a hardcore pure JavaScript believer.
I didn’t want to hear anything about Dart, TypeScript, LiveScript, CoffeeScript, IcedCoffeeScript and all the other compile-to-JS languages out there. To me, they were only an additional layer of stuff that could go wrong. I still think they’re all too focused on solving a non-issue with JS, namely the prototypal Object system. I absolutely love the approach used by Javascript when it comes to OOP because it takes the best functional-style features of JS and creates an elegant way of managing objects during runtime.

I’ve been following the development of ECMAScript 6 for many months, and I’m excited by all the new features, like list comprehensions, splats, array destructuring, syntactic sugar for map/reduce/filter/etc and many others. I’m sure you can see where I’m going with this. I knew CoffeeScript provided all that, but to me it was just a layer of features that are about to land in V8 any second now, so why bother learning it? Well, for one, the dense syntax is great. It can be really confusing at first. Take this snippet, for example. It prints the numbers from 1 to 5, waiting 1 second every time.

for (var i=1;i<=5;i++){
    (function(i){
        setTimeout(function(){
            console.log(i);
        }, i*1000);
    })(i);
}

Continue reading

Learning JavaScript the right way™, part 1

I’ve been thinking about writing this for a while and I’ve finally decided to bite the bullet. This is going to be very dense and terse, I’ll go deep into details whenever I feel necessary. This whole series is all about avoiding the bad parts of JavaScript and focusing on the good ones. There isn’t any lack of explanations online about how a for loop works, so I’m assuming that the reader already knows:

  • The structures of imperative programming (for, while, if, else, return, etc.)
  • The principles of Object Oriented Programming (class, instance, interface, method, etc.)
  • Simple data structures (arrays, lists, hashmaps)
  • Big-O notation on a basic level (O(1) vs O(n), etc)

The style recommendations (do’s and don’ts) are my own opinion of what makes JS code readable and maintainable.

Parts:

  1. Basic types
  2. Arrays
  3. Objects
  4. Functions and Scope
  5. Prototypes
  6. Good JS practices


Continue reading

Getting rid of binary strings, part 2

Click here for part 1

Prepending bits
Another common operation is to add bits to the beginning of some data.

var bin = "10011101011110100"; //80628
bin = "0110011" + bin; //51
console.log(bin); //Prints 011001110011101011110100
console.log(parseInt(bin, 2)); //Prints 6765300

It’s trivial to do it using bit manipulations as long as everything fits into the buffer.

var buffer = 80628,
    nbBits = 17;
buffer |= 51 < < nbBits;
nbBits += 7; //I'm adding 7 relevant bits to the buffer
console.log(buffer); //6765300

//Explanation
// 51 << 17 = 11001100000000000000000 (51 in binary followed by 17 zeroes)
//Thus
//          10011101011110100
//            BINARY OR
//    11001100000000000000000
// => 11001110011101011110100 (6765300)

Continue reading

Getting rid of binary strings, part 1

Every programmer is used to manipulating data at the byte level, but most rarely, if ever, have to directly manipulate bits. When such a situation arises, the “obvious” way to do it (when it comes to beginners) is to use binary strings, that is, strings of ones and zeroes. The manipulations are done using the usual string concatenation, substr and other basic string manipulation tools. The string is then finally parsed 8 characters at a time into bytes (using the language’s equivalent of parseInt()). In many languages, including Java and Javascript, strings are immutable, so any concatenation causes the string to be copied all over, everytime.

Any half decent programmer would feel “dirty” writing such code, but my feeling is that most would do it anyway and say something like “it’s fast enough”, “it’s more readable”, “I’ll optimize it later if it’s too slow, but that’ll do for now” to justify themselves. Until relatively recently, I used to be one of them and hopefully you will know better than using binary strings after reading this. I’ll be using Javascript, but this article is language-agnostic.

Partial bytes are often used in compression and stream algorithms.

Appending bits
By far the most common operation performed on them is to append more binary data, such as:

var arr = [],
    bin = "";

bin += "101110";
bin += "0101";
bin += "0000";
bin += "1100110011";

//Pad the last byte if needed
if (bin.length % 8 !== 0){
    bin += Array(8 - bin.length%8 +1).join("0");
}

//Parse the string
for (var i=0, len=Math.ceil(bin.length/8);i<len;i++){
    arr.push(parseInt(bin.substring(i*8, (i+1)*8), 2));
}

console.log(arr); //Prints [185, 67, 51]

Continue reading

Javascript integers and bitwise operations

This is the first post in a series to demystify Javascript.

Javascript is the most misunderstood language, it has a huge stigma attached to it. Programmers learn it by copy-pasting it from Stack Overflow question and retroactively applying their knowledge of other languages to Javascript, but JavaScript is different. It should be treated as a mostly functional language, not as a broken Java or C# web clone and it should be learned properly. That stigma is starting to go, mostly due to projects like Node.JS, Express, EmberJS and the other client-side frameworks.

In JS, variables are declared using the var keyword. The type is inferred from the value of the variable. While JS only has a Number object to represent numbers, internally, modern engines have 2 types: 32-bit signed integers and 64-bit double-precision floats. JS switches from one type to the other automatically.

Bitwise operations is one of those cases, for obvious reasons. The AND (&), OR (|) and XOR (^) operators are well known, but the NOT (~) and bitshift operators (< <, >> and >>>) have a few less known uses. I’m going to be using 16-bit ints for my examples.

Continue reading