Secure password storage – a myth?

by Ochronus on April 29, 2011

Secure password

sony playstation, playstation network, security breach, sony hack

With Sony‘s recent fiasco (PSN hacked, user data, including passwords stolen) the old topic of password storage is in spotlight again. Is it really that hard to store passwords securely? Obviously not. Let me show you how a post-stone age developer would solve it and some of the cryptographic background.

 

 

 

First of all, what’s so wrong with storing passwords as-is (plaintext)?

Basically, it offends your users’ right to privacy. You, the sysadmin, the IT guy or a hacker can get access to it. It’s a huge security risk in your service (the hacker can then impersonate your users easily) and also many users use the same passwords on many sites, so you might easily be posing a security threat to other sites, companies. Moreover, many people choose personal passwords – sensitive insights about them is thus revealed by their password choice. As a summary, it’s both unethical and insecure to store plaintext passwords.

 

What’s the solution then? We have to be able to verify the user somehow upon login, don’t we? True, but for this, the password itself is of no information. The real information in the authorization process is whether the person on the other end of the line knows the password or not. It’s a huge difference. So how do we verfiy the knowledge itself without actually storing the information? We need something that produces some obfuscated info from the original, sensitive plaintext data (password). We have many possible candidates for this, e.g. simple checksum, crc , the point is that it’s like a function which always produces the same output from a single input. With such a function at hand we only need to store the result of the function, the obfuscated info, then during authorization use the same function on the typed-in password and see if the outcome is the same as the stored value.

This is only one part of the whole story, though. Simple checksums and crc’s are not considered secure, because when someone gets hold of your database containing the obfuscated values, there are mathematically very easy (simple and fast) ways to recover the original password from them. Enter secure hashes.

A secure (also called cryptographic) hash is an one-way function which has the following properties:

  • it is easy to compute the hash value for any given message (fast forward calculation),
  • it is infeasible to generate a message that has a given hash,
  • it is infeasible to modify a message without the hash being changed,
  • it is infeasible to find two different messages with the same hash.

Some of the best known secure hash algorithms are MD5 and SHA*. Their usage is quite simple: take your favourite language (or use MySQL‘s, etc. built-on functions), get a lib for it if these crypto functions aren’t built-in, when the user registers, calculate the hash, store it in your DB and when it’s stolen, no hacker will be able to recover the password.

Really? Well, not really. Simple MD5 and SHA-1 are still quite vulnerable. There’s a clever attack on them, known as the rainbow table attack. Basically rainbow table is an efficient data structure to store precomputed hash chains. It’s like a cache for brute-force trial attack and it’s proven to be quite effective.

OMG, what can I do then?

Don’t panic. It’s not a lost fight. Fortunately rainbow table efficiency can be nulled with using a very simple method: salts. Salting is adding a pre-generated (used as a constant in the registration-authorization cycle) string which you automatically add (concatenate) to the passwords before you’re giving them to the secure hash functions. For example given your pre-generated salt is ‘iw8ahphieBa7aesahLeenaKae3lae0ee’, when the user registers with the password ‘i love bunnies’, you’ll be hashing like this: stored_hash = md5(‘I love bunnies iw8ahphieBa7aesahLeenaKae3lae0ee’).

Even better, periodically ‘change’ your salt. Say at every 10th user you generate a new salt and from that time you use it. Of course this method requires storing additional information, like salt history, but it’s a quite secure solution.

Well, I hope you agree by now that secure password storage is not a myth, it’s a quite easy method. Please share your experience and further thoughts!

Edit

A few really nice comments came up, and after reviewing some solutions I feel you’re right – currently the best way is to use bcrypt. Thanks for the great comments and suggestions guys!

Enhanced by Zemanta

  • Chui Tey

    You should probably talk about key strengthening. It may be ancient, but it is especially valid today when salting alone may not suffice.

  • TheGreatProgrammer

    It is a lot easier to just generate a random number/string as salt for each user, and store that together with the hash in the database.

    • http://blog.mostof.it/ ochronus

      And how does that protect you when your DB is stolen/accessed?

      • Kevin Ruble

        Storing the randomly generated salt along with each user doesn’t make the password any less secure. The sole purpose of a salt is to prevent an attacker from matching a hash with a pre-computed rainbow table. The salt serves this purpose regardless of whether it’s secret or not. Even if the attacker knows the user’s salt/nonce, they would still have to recreate a new rainbow table to use in an attack for each individual user, which would be so time-consuming that the attacker would probably use a different approach.

        Theoretically, a “secret” salt has advantages, but how will the salt be kept secret? Hard-coded in a script? Encrypted with a set cipher (which is also stored somewhere)? In most situations where a database is compromised, other parts of the application (your source code) are as well. All you’re doing is adding one additional step for the attacker – find where the hard-coded salt or encryption keys are held. Once they find that, they either have only just one salt to hash a rainbow table for all the users’ passwords (or users/10, users/100, etc. tables if you use a “change salt every n users” as mentioned in the article), or use the key to decrypt the salt, leaving you in the same situation you’d be in if you just stored them as plain-text in the user’s database row in the first place.

        Randomly unique ‘public’ salts for each user are more secure than fewer – or one – “hidden” salt. The extra effort to try to keep salts “secret” isn’t worth it in the long run for the marginal benefit it provides (having to decrypt a salt means that much more time to authenticate each user), and actually makes your application less secure if it encourages you to use any fewer than N salts for your N users.

  • https://matiaskorhonen.fi Matt

    Just use bcrypt (and certainly don’t use MD5 or reuse hashes!). See this for the reasons why: http://codahale.com/how-to-safely-store-a-password/

    And further discussion on Hacker News: http://news.ycombinator.com/item?id=2500495

    Advising people to use general purpose (computationally fast) hashes and re-used salts is just plain wrong.

  • Jesse

    The method you suggest isn’t especially secure; take a look at the discussion about this on HN: http://news.ycombinator.com/item?id=2500495

  • Aaron

    Don’t use MD5 or SHA*; they are fast and thus the enemy of “secure” (because fast means easy to brute force). Use bcrypt (or things like it such as scrypt) that are intentionally designed to be slow.

  • Cyranix

    You should be reading the comments on Hacker News: http://news.ycombinator.com/item?id=2500495

    The short version is that you are wrong. The comments on HN contain links to much better ideas. (But hey, I didn’t know about bcrypt until a couple of months ago myself — we all have to learn sometime.)

  • http://weevify.com Milan Cvejic

    Unfortunately, even adding salt to your password is not a perfect solution due to md5 and sha coallisions. As multiple values can sometimes have same md5 values.

    Cheers.

  • http://tech.sybreon.com Shawn Tan

    Salting is also not encouraged – due to the fact that developers may not implement it correctly. Best practice for now would be to HMAC the password.

  • Kevin

    In the press conference, Sony stated that they did only store the hashed value of passwords. http://www.insidesocal.com/techout/2011/04/sonys-press-con.html

    • http://blog.mostof.it/ ochronus

      Well, they only stated that in an update and I wouldn’t be so sure to trust them…

  • Lonny Eachus

    It is foolish to take the advice of Coda Hale (the above link) without a large grain of salt. Pun intended… blame Coda Hale for making it necessary.

    First, as Ochronus correctly mentions, even visible salts can essentially eliminate the utility of rainbow table attacks, which is a significant gain. Coda Hale fails to address that point… in fact he contradicts it.

    Second, while bcrypt supposedly somehow uses salts internally, I have not yet seen how that is done and until I do, I will not place my trust in that fact.

    Third, relying strictly on computability time is a temporary solution at best, as Hale himself makes clear (apparently without realizing it), by pointing out how easy it is to do brute-force attacks today that would have been infeasible only a few years ago.

    And fourth, an attempt to get around the temporary nature of such a solution by introducing a “work factor” may or may not be effective in the future. Again, I do not yet know how this work factor is built in, but if I had to guess sight-unseen, I would say that it is likely someone could implement their own version of bcrypt that did not include the “work factor”, which renders it irrelevant for all practical purposes.

    People, keep in mind that unless you are talking about a single scenario (an external brute-force attack against a website password field, for example), the speed of YOUR hashing algorithm is pretty much irrelevant. While a slow hash can protect against that, it has absolutely no relevance to an equally likely scenario: someone has your database and is trying to break the passwords. In that case, they will be using THEIR OWN program to compute the hash, and you had better believe they will make it as fast as they reasonably can.

    So keep in mind that bcrypt is (A) a solution for only a single, narrow circumstance, and (B) very likely a temporary solution even for that.

    If a solution to that one problem is all you need, then bcrypt might be just the thing.

  • Lonny Eachus

    I should add one more thing:

    If, in fact, that one scenario is all you are concerned about, then you could continue to use the conventional hashing algorithms exactly as you have in the past, and introduce a back-end (server side) delay before computing the hash. The effect would be exactly the same.

  • http://www.facebook.com/tomfluff Tom Luff

    I usually store the username and hash the username, password & salt. New salts are generated everyday on the server and the salt table is cached in memory by a daemon so if the database is breached the salts are not retrieved. Then 

    • http://blog.mostof.it/ Ochronus

      Nice solution, good tradeoff between efficiency and perfection

  • http://www.facebook.com/tomfluff Tom Luff

    I usually store the username and hash the username, password & salt. New salts are generated everyday on the server and the salt table is cached in memory by a daemon so if the database is breached the salts are not retrieved. Then 

    • http://blog.mostof.it/ Ochronus

      Nice solution, good tradeoff between efficiency and perfection

Previous post:

Next post: