Software Reliability

It seems I have to deal with the question of what to trust – our new product or an established software package – far too often. The answers make me question the level of testing in open-source software and the reliability of software in general.

We are using JavaScript libraries for basic cryptographic operations. One of them is SJCL – the Stanford Javascript Crypto Library. It is a nice, small library I have been using since 2012 or so. This time, we are using it for client-side cryptography in our new secure file-sharing application.

What a surprise when a colleague of mine discovered that it fails to compute any of its hash functions (SHA-1, SHA-256, SHA-512) on data larger than 256 MB. We are building use cases for our cloud encryption service Enigma Bridge, and it took us more than a day to figure out that the problem was neither on our side nor in the way we use the library. The problem was in the library itself, and the reason is quite interesting from a technical point of view (see below).

What strikes me more, though, is how this kind of error could remain hidden since the inception of the SJCL library. What kind of testing has been done to verify the correctness of the implementation? After all, the standards describing these functions a) describe the algorithm and provide test vectors, and b) define constraints on all external inputs – including the maximum length of data the function is able to process.

Also, as I said – this is a low-level library doing just one well-defined task: it computes several cryptographic algorithms according to their specifications. There is no user interface, no business logic, no database to store data. It's just a stateless, re-entrant mathematical library. Still, it fails.

Since we started using continuous integration and rich sets of test cases internally, I have been looking for similar tests in other projects, especially on GitHub. There aren't many, and I wonder whether it's safe to rely on something only because it has been around for a long time.


P.S.: The problem my colleague discovered lies in how JavaScript handles integer vs. floating-point variables. JavaScript numbers are all doubles – 64-bit floating-point representations with a 52-bit mantissa. However, for bitwise operators (AND, OR, XOR, shifts, …), numbers are first converted to 32-bit integers – i.e., values up to about 2 billion, which in our case means 256 million bytes, as the length is counted in bits. With the 32-bit conversion avoided, the length limit becomes about 1,126 TB – something pretty hard to reach these days.
