Could you Pass the Salt? Improving the Security in Encrypting Passwords using MD5By Scott Mitchell and Thomas Tomiczek
A couple weeks ago Scott Mitchell wrote an article titled Using MD5 to Encrypt Passwords in a Database. In his article, Scott examined how to use the built-in ASP.NET
MD5CryptoServiceProviderclass to use MD5 hashing. To recap, MD5 is known as a one-way encryption algorithm. It is presented a plain-text string and then computes an encrypted version of that string. Given the encrypted version, it is computationally infeasible to determine the plain-text version. (If you haven't yet read Scott's article, I would encourage you to closely read the article before continuing here.)
Specifically, in his article, Scott encourages the hashing of the user's password, and storing that encrypted value in the database. When a user wishes to login, then, they will enter their plain-text password, which will then be hashed and compared to the value of their hashed password in the database. The idea behind this is that if a insidious hacker does somehow manage to get access to the database, they won't be able to determine any of the user's passwords since they are stored in encrypted form.
The Perils of Scott's Approach
Unfortunately there is a security risk with Scott's approach. To see why, assume that a hacker has managed to break into the user database of a Web site with 10 million user accounts, and that the passwords in this database are protected by the same, simple MD5 hash of the password field as Scott proposed. Now, this hacker is interested in obtaining a valid password for any arbitrary user (not necessarily a specific user or specific set of users). The hacker can create a dictionary of common password entries, which he can then compute their hashed values using the MD5 algorithm. Next, the hacker can start going through the dictionary of common passwords in their hashed form, checking to see if any of the 10 million user accounts have a matching encrypted password field.
Since only the password field is hashed, the hacker can simultaneously check each of the 10 million user's passwords at once! For example, imagine that the word "foobar" was a common password, and that the hashed MD5 version of "foobar" was "#KH##NM@MM@M". The hacker could then run a SQL query like:
If any user has the password "foobar" the hacker would see their username. With this information Hence, by only hashing on the password field a hacker can attack 10 million encrypted passwords at once.
Take it With a Grain of Salt
This weakness of using hashes is well known, but can be easily solved by "salting" the password fields before hasing them. This is done by adding a second piece of information to the hash that is non-changing and unique for every user – like, for example, the username, or a user ID. Personally, when developing a user database, I normally use a GUID as primary key for the table since it is random and big enough to be used as "salt input" into the hash.
Hopefully an example will clear up any confusion. In Scott's article from a few weeks back, when a user is creating a new user account their chosen password is hashed and the hashed version is stored in the database. In VB.NET, this code might look like:
txtPwd.Text is the value of the user's plain-text password and
is the encrypted byte array. (This encrypted byte array would then be stored in the user's
password field in the database.) My approach would include adding a salt to the value before computing
its hashed value. For example, if we used the username, the last line of code in the above example
txtUsername.Text corresponds to the user's username. Say that this user, "user1" chose the
popular password "foobar". Also imagine that "user2" choose the password "foobar". In Scott's original
article both users would have the same value in the encrypted password field. However, by salting the
value before hashing, "user1"'s password field will contain the hashed version of "foobaruser1", while
"user2"'s password field will contain the hashed version of "foobaruser2" - these are different and
will result in nonsimilar hashed values.
Naturally the code needed to authenticate a user would need to change. For example, when a user logs on they would supply their username and password. In Scott's original article, the code used hashed the user's entered password value and then examined if the hashed value matched up to the password field with the user's specified username. With salting, however, you would need to combine the user's username and password, then compute the hash value, and then determine if the particular user's password field matched up to this computed value.
Even if the hacker is privy to how we salt the hash values, he can only attack one user's password at a time. That is, say that he wants to check all users to see if any of them has the password "foobar". Rather than being able to issue one SQL query, he must issue a unique SQL query for each of the 10 million users.
In this article we examined Scott's earlier article titled Using MD5 to Encrypt Passwords in a Database. Specifically, we examined a major security issue inherent in Scott's approach, which was present due to the fact that Scott was not salting the password values before storing their hashed value. As we discussed, failure to salt a value prior to hashing allows for a hacker who has compromised the system to efficiently attack all user accounts. That is, the hacker can quickly determine if any user is using a particular password. For sites with many, many users, there is a good chance that there will be at least one user with a common password.
However, the problem becomes much more difficult for the hacker if we salt the password value with some known, non-changing value that is unique to every user, such as a username or user ID. Salting improves security because it denies the hacker the ability to attack all user accounts in one go. Therefore, if you plan on encrypting your passwords in your database using a hashing encryption algorithm like MD5, be certain to salt the passwords prior to computing their hashed value!
About the Authors:
Thomas Tomiczek is Microsoft MVP for C# and Director of Technology of THONA Consulting Ltd., a UK based software developer with its main development offices in Berlin, Germany. His main area of expertise is the development of high performance distributed applications and database based systems, and he has spent a good part of the last year leading the development of a business object framework for the .NET environment that allows developers to work with automatically persisted business objects, instead of writing SQL.
For information about Scott Mitchell, please visit http://www.4guysfromrolla.com/ScottMitchell.shtml