Encrypting Sensitive Data in a DatabaseBy Scott Mitchell
Every couple of weeks or so tech news sites like News.com or ZDNet tell an increasingly common tale - some big name company's databases have been hacked, resulting in the hackers discovering oodles of personal information about the company's customers. If you build public-facing, data-driven web applications, stories like these can easily send shivers down your spine. What steps are you taking to secure your application's data from outside threats?
There are a plethora of steps one can take to harden their data-drive applications from malicious hackers. The first and
most important step is to harden your database software. Do you have the latest service packs installed? If you're using
Microsoft SQL Server, have you chosen a sufficiently complex password for the
sa account? Is your database configured
only to accept Windows authentication? For more information on improving your database's security see 10
Steps to Help Secure SQL Server 2000.
Even if you lock down SQL server as best you can, the unthinkable might still happen, and your data might fall into the
hands of ne'er-do-wells. If you have particularly sensitive information - financial records, social-security numbers, and
so on - it might behoove you to add an additional layer of security and actually encrypt the extremely sensitive data.
Microsoft SQL Server 2000 and earlier versions do not include a way to automatically encrypt the contents of your database tables. (SQL Server 2005, which is still in beta at the time of this writing, will provide native encryption capabilities; see SQL Server 2005 Security - Encryption for more information.) Therefore, if you want to encrypt the contents of your tables, you'll need to do it yourself. There are a variety of techniques; the one we'll be examining in this article is how to use code in the .NET layer to encrypt the sensitive data before writing it to SQL Server and how to decrypt it back to its plain-text form when reading the encrypted content from SQL Server. Read on to learn more!
A Quick Overview of Encrypting Database Contents
Imagine that we are working on creating a database for an eCommerce site that has a
Customerstable. This table might have fields like
CreditCardExpiration. Clearly all of the records and fields in the
Customerstable are meant to be kept private, but the payment-related fields (
CreditCardExpiration) especially so. That is, while our customers would surely be upset if their email address and home address was discovered by hackers, losing their payment information could lead to real financial harm. Therefore, we might plan for the worst and say to ourselves, "Well, we have hardened our database server as best we can, so it should be very, very hard for someone to breach it. But, if it is breached and the data is compromised, it would be nice to have the payment information encrypted, so that there's another level of protection of the most sensitive information."
Implementing this carries a few challenges, questions, and concerns:
- What is the schema of our database? If we were just storing the
CreditCardExpirationfields as plain-text, they'd likely be just
varcharfields. But now that we want to encrypt them, what data type should these fields be?
- How do we enrcypt the data before inserting it into the database? Likewise, once we read back the database data how can we decrypt it so that we can see it's plain-text value?
- What encryption algorithm should be used, and how does the choice of the encryption routine affect the overall security of this approach? More bluntly, if a hacker gets his hands on this encrypted payment information, will he be able to decrypt it?
- What consequences, if any, does encrypting the data stored in a table carry? Does it impede performance? Usability? Maintainability? Does it limit what queries I can run against the underlying data?
In order to store the encrypted data we'll use a field of type
varbinary. That is, the encrypted data will
be saved as a series of bytes. This means that from our ASP.NET application, when inserting or updating encrypted data,
we'll need to:
- Take the user's plain-text input,
- Encrypt it,
- Convert it from a string into a byte array, and, finally,
- Issue the
- Read the encrypted data from the database using a
- Convert it from a byte array into a string,
- Decrypt it, and, finally
- Work with the plain-text data
Customerstable the only encrypted fields are the payment-related fields -
CreditCardExpiration- all other fields remain as plain-text. So we only need to go through these steps when needing to read or insert/update the customer's payment information.)
Choosing an Encryption Algorithm
There are two flavors of encryption algorithms: symmetric and asymmetric. Symmetric algorithms involve the use of a secret key, which is some bit of information that is known only to those participating in the encryption and decryption. With asymmetric encryption, public and private keys are used. A detailed discussion between these different encryption models is far beyond the aim of this article. In our example, let's use symmetric encryption.
|A Drawback to Symmetric Encryption|
A major downside of symmetric encryption is this secret key that is used. If the hacker who has broken into our database
can find the symmetric key used the encrypt and decrypt the data, he could easily decrypt the payment information. (Just like
if a thief gets hold of your ATM card and can find out what your PIN is, your account balance will quickly go to $0.00)
This symmetric key will be stored within the code of our ASP.NET Web application. Therefore, in order for the hacker to get
ahold of the encrypted payment information he'll have to both break into the database server and the Web server.
On an aside, this highlights one of the fundamental concepts of security - there is no such thing, really. Sure, you can take steps to improve your application's security, but it is impossible to have air-tight, guaranteed security. There will always be some open hole, some backdoor, a disgruntled employee who has access to the server room... something. Building a secure application is about making the application as difficult as possible for attackers.
There are a variety of symmetric encryption routines available, some more robust than others. One of the most robust
symmetric encryption routines is AES, which is the symmetric key
encryption standard used by the United States government. In fact, there already exists a great article on how to
provide database encryption using AES: Implementing Encrypted
SQL Server Database Columns with .NET. The .NET Framework contains classes in the
namespace for implementing this encryption algorithm.
For this article I'll use a different symmetric encryption routine, since the AES angle has already been examined. Specifically, I'll use RC4. I chose RC4 in part because of an earlier article here on 4Guys, Mike Shaffer's article RC4 Encryption Using ASP & VBScript, which was converted into .NET code by Chris Scott in his article Converting Mike Shaffer's VBScript RC4 Encryption Code to C#. Before using this algorithm (or any security-related algorithm, for that matter) in production code be sure to have a firm understanding of RC4 and its strengths and weaknesses.
One final comment on symmetric encryption - when encrypting data with a secret key it is important to realize that using the same secret key on the same piece of data will result in the same encrypted output. If you are encrypting output that might have duplicates, this fact can lead to a weakness of the system. For example, if a hacker somehow knows customer x credit card expiration is 07/09 (maybe he is customer x, or he already has customer x's payment information from exploiting some other site), and he sees that customer x's encrypted expiration is, say, "&9k@m", our hacker now knows that any other customer who has the encrypted expiration date of "&9k@m" also has a credit card that expires on 07/09. While this information, alone, isn't going to do much good for the hacker, it does bring him one step closer to his nefarious goal.
To overcome this weakness, it is important to salt the secret key with some extra bit of information specific to each row.
That is, rather than using the exact same secret key to encrypt the expiration date for each and every customer, we might
instead use a secret key that is composed of some common secret key concatenated with the customer's
field. (Of course, this assumes that the
CustomerID field won't change; it might be more prudent to create a
separate field in the table that simply hold's the salt.) For more information on salting, refer to a
previous 4Guys article by Thomas Tomiczek, Could you Pass
the Salt? Improving the Security in Encrypting Passwords using MD5.
In Part 2 we'll look at creating the database and writing the code to perform the encryption/decryption needed.