I went to see “Interstellar”, the new Christopher Nolan film, over the weekend. (You caught me; I easily cop to being a geek, particularly when it comes to movies.) In addition to all the other qualities that make it a good watch (engaging actors, top-notch effects, interesting script), it did something particularly intriguing: it made the relativistic time dilation of galactic-scale travel an intrinsic part of the characters' personal stories.
Time has a material impact on the relative quality of data protection, too. No cipher will provide unending data protection so long as technology and expertise continue to advance. A recent example, in the news just a few days ago, involves the MD5 hash function and underscores how time erodes data protection: Nat McHugh demonstrated how to create a collision in less than a day, for just pennies in cost.
A hash function is an algorithm for mapping a data input of arbitrary size to an output of fixed size. For a well-made function, small differences in the input cause radical differences in the output. Hash functions are frequently used for managing user passwords: variable-length passwords (usually along with other inputs, like a salt) are passed through the function to create irreversible, fixed-length hashes of the originals. Then, when a user presents the password as a credential for authentication, the hash of the presented password is compared to the hash stored by the system. In that way, the system can still confirm that the right password was entered without the risk of persistently storing the actual passwords.
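The properties described above can be sketched in a few lines of Python using only the standard library. This is an illustration, not a recommendation for any particular system: the function names (`fingerprint`, `differing_bits`, `hash_password`, `verify_password`) and the iteration count are assumptions made for the example, and it uses PBKDF2 with SHA-256 as one common way to combine a password with a salt rather than a single pass of a hash function.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

ITERATIONS = 100_000  # assumed work factor for this sketch; tune for real use


def fingerprint(data: bytes) -> bytes:
    """Map input of any length to a fixed 32-byte SHA-256 digest."""
    return hashlib.sha256(data).digest()


def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions at which two equal-length digests differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, derived hash); a fresh random salt is made if none is given."""
    if salt is None:
        salt = os.urandom(16)
    derived = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, derived


def verify_password(candidate: str, salt: bytes, stored: bytes) -> bool:
    """Re-hash the presented password and compare it to the stored hash."""
    _, derived = hash_password(candidate, salt)
    return hmac.compare_digest(derived, stored)  # constant-time comparison


if __name__ == "__main__":
    # Inputs that differ in one character give very different digests.
    a, b = fingerprint(b"password1"), fingerprint(b"password2")
    print(len(a), len(b), differing_bits(a, b))

    # Only the salt and the derived hash are stored, never the password.
    salt, stored = hash_password("correct horse")
    print(verify_password("correct horse", salt, stored))
    print(verify_password("wrong horse", salt, stored))
```

Only the salt and the derived hash need to be kept by the system; the password itself is discarded after hashing.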
The MD5 hash function was first published in 1992 and was widely used for many purposes, including as a check routine for file transmissions. A sending application would read the file being sent, calculate its MD5 hash, append the hash to the end of the transmitted file, and send it. The receiving application would independently compute an MD5 hash of the file contents it received and compare the two. A match meant a successful transmission; a mismatch confirmed there was a problem. As early as 1996, though, researchers had determined that the hash function could allow for “collisions”, that is, different variable-length inputs that result in the same fixed-length hash.
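The transmission check described above might look something like the following Python sketch. The framing (a raw 16-byte digest appended to the payload) and the names `package` and `unpack` are assumptions for illustration, not the format of any particular transfer tool. Note that a check like this catches accidental corruption only; because collisions can be manufactured, it offers no protection against deliberate tampering.

```python
import hashlib

DIGEST_SIZE = hashlib.md5().digest_size  # MD5 digests are 16 bytes long


def package(payload: bytes) -> bytes:
    """Sender side: append the MD5 digest of the payload to the payload."""
    return payload + hashlib.md5(payload).digest()


def unpack(message: bytes) -> bytes:
    """Receiver side: recompute the digest and compare it with the one received."""
    if len(message) < DIGEST_SIZE:
        raise ValueError("message too short to contain a digest")
    payload, received = message[:-DIGEST_SIZE], message[-DIGEST_SIZE:]
    if hashlib.md5(payload).digest() != received:
        raise ValueError("checksum mismatch: transmission problem")
    return payload


if __name__ == "__main__":
    sent = package(b"quarterly report contents")
    print(unpack(sent))  # a clean transfer returns the original payload

    # Flip one bit in transit to simulate corruption.
    corrupted = bytes([sent[0] ^ 0x01]) + sent[1:]
    try:
        unpack(corrupted)
    except ValueError as err:
        print(err)
```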
That conclusion was largely theoretical and academic in 1996, though cryptologists started to recommend moving to newer functions, such as SHA-1, even then. However, the theoretical proved to be practical as technology progressed and faster processors, particularly GPUs in place of CPUs, could be applied to the mathematics. Marc Stevens, in his Master's project HashClash, estimated that an MD5 collision could be created in as little as a day, using only three PlayStation 3 consoles.
While that is relatively inexpensive in terms of time and resources, it reflects a time before cloud computing became ubiquitous and its capacity commoditized. Rather than spending more than a thousand dollars on several game consoles and tediously connecting them for parallel processing, McHugh created an MD5 collision between two disparate inputs (images of two different performers) in less than 10 hours and for only US$0.65 in fees for a GPU instance on Amazon Web Services. Strictly speaking, a collision of this kind pairs two inputs of the attacker's own choosing; it does not by itself recover the password behind a stored hash. Practically, though, it shows how little it now costs to bring serious computing power to bear on MD5, which is bad news for any MD5-hashed passwords contained in a fraudulently obtained copy of a password database.
The point of this, past “gee, whiz,” is that the sun will set on every cryptographic algorithm, hashing and encryption alike, sooner or later, so long as we continue to make advances in the speed of processors and programs. The data we seek to protect will only be protected by the algorithms we apply today for a limited and shortening period of time. This can be a critical issue for enterprises in industries that are required by regulation (banking, financial) or necessary practice (healthcare) to maintain sensitive data for years, even decades. The ability to quickly and pervasively shift to newer algorithms without unnecessary disruption to the lines of business or the operating budget becomes, therefore, a mandatory best practice.
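One way to keep such a shift cheap, sketched here in Python, is to have applications ask for protection by policy name and to record the algorithm alongside each stored value. Everything in this example is hypothetical: the `POLICY_TABLE` mapping, the `algorithm$digest` storage format, and the function names are assumptions for illustration and do not represent the API of any product.

```python
import hashlib
import hmac

# Hypothetical policy table: applications refer to the policy name only.
POLICY_TABLE = {"record-fingerprint": "sha1"}


def protect(policy: str, data: bytes) -> str:
    """Hash data with whatever algorithm the policy currently names."""
    algorithm = POLICY_TABLE[policy]
    digest = hashlib.new(algorithm, data).hexdigest()
    return f"{algorithm}${digest}"  # store the algorithm with the value


def verify(data: bytes, stored: str) -> bool:
    """Check data against a stored value using the algorithm recorded in it."""
    algorithm, _, expected = stored.partition("$")
    actual = hashlib.new(algorithm, data).hexdigest()
    return hmac.compare_digest(actual, expected)


def needs_upgrade(policy: str, stored: str) -> bool:
    """True when a stored value predates the policy's current algorithm."""
    return stored.partition("$")[0] != POLICY_TABLE[policy]


if __name__ == "__main__":
    old = protect("record-fingerprint", b"account 12345")

    # The administrative change: one table entry, no application code touched.
    POLICY_TABLE["record-fingerprint"] = "sha256"

    print(verify(b"account 12345", old))             # old values still verify
    print(needs_upgrade("record-fingerprint", old))  # and are flagged for re-hashing
    print(protect("record-fingerprint", b"account 12345").split("$")[0])
```

Because each stored value names its own algorithm, older values remain verifiable after the switch and can be re-hashed gradually rather than in one disruptive migration.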
For another discussion of the best practices for protecting enterprise data, see the prior post "How Companies Should Protect PII."
Many thanks to the head of our Product Support group, Sean Workman, for pointing out last week’s article.
EncryptRIGHT, Prime Factors' universal data protection platform, was designed to separate the application of policy (protect data in this way) from the means (use algorithm x to protect data). As a result, moving the data protection integrated into your applications from the SHA-1 hash function to SHA-256, or from the 3DES encryption algorithm to AES, is only an administrative change.