I've talked about DPAPI and symmetric encryption. Both of these are very good for certain things. But what about passwords? Encrypting them with DPAPI is not ideal ... as DPAPI from ASP.NET would be machine-specific; it won't scale out and it's not easy to transfer between machines if there is a need for disaster recovery. Symmetric encryption can be a reasonable option, but there is a more secure (and faster) way to do this. Let me explain a bit further. Let's say that you forget your Windows domain password. Can you get that password back? No, you can only reset the password. Yes, I know there are password crackers, but they do tend to be brute-force tools or they use tables with known hashes and compare them to what's in the SAM. So, I'm sure you can guess what it is ... hash algorithms (yeah, I guess the title was a giveaway).
Hash algorithms have a simple function: the take input text, run it through and algorithm and produce output that cannot be reversed to the original. A small change in the input results in a large change in the output. The output itself will always have the same size, in bits, regardless of the input. So, for example, a 500 character string processed by a 256 bit hash algorithm will always return a 256 bit value. As would a 1 character string. This is another key difference between hashing and encryption functions. However, the same input will produce the same output ... so it is, as you can certainly guess, a very good way to store passwords. Since it's not reversible, it is very hard, if not impossible, for it to be retrieved except through a brute-force attack. And there are ways to make even a brute force attack even more difficult than they already are; we will touch on that. Hashes can be used for checksums (you'll see MD5 hashes used for checksums on many Linux distribution downloads) ... they can be considered a "digital fingerprint" that ensures the integrity of a downloaded file, zip archive and more; however, there are other algorithms that can also be used for these purposes that are not secure (for example: CRC or cyclic redundancy check). All of the hash algorithms in .Net inherit from System.Security.Cryptography.HashAlgorithm. And, of course, you can find them in the System.Security.Cryptography namespace.
Hash Algorithms Supported in .Net
- MD5: This is a widely used 128-bit hash algorithm, especially for validating downloaded files. It is an Internet standard, being described in RFC 1321. However, there are known issues with MD5, with collisions (that is, two different inputs producing the same hash) being having been shown to be found on a laptop computer in a minute. While there are ways to mitigate this, in general it is not recommended for new applications.
- RipeMD160: This is a 160-bit hash algorithm designed to replace the earlier RipeMD, which was, in turn, based on the now-defunct MD4 (which was replaced by MD5). Like MD4, the original RipeMD was found to have some weaknesses. RipeMD 160 improves on this, if only because the size is larger.
- SHA1: Designed by the National Security Agency for use as a Federal Information Publishing Standard (FIPS). This produces a 160-bit hash value. It is in the process of being phased out due to vulnerabilities that have been reported in the algorithm.
- SHA256, SHA384, SHA512: This family of algorithms is collectively known as SHA2. They have lengths of 256, 294 and 512 bits, respectively. Due to the known issues with SHA1, these algorithms are generally considered more secure.
OK, so we have that out of the way. So, let me run something else by you. Remember when I said that the same input produces the same output? That can be problematic, especially with passwords. This is because if you know that one password is, for example P@ssw0rd and see that another entry has the same hash value, then you will know that the second entry is also P@ssw0rd. Symmetric cryptography has a similar issue and, with symmetric crypto, we use an initialization vector to resolve it. But hash algorithms don't have an IV. Instead, with a hash algorithm, we use a salt. This salt is the extra bit of gobbledygook that provide the randomization required to ensure that the above scenario doesn't occur. As with an initialization vector, this can be stored in the clear. But ... it's something that you have to add to the data to be hashed; there are no properties for it as there are with an IV.
And now, without further ado, for some code. A note ... I'm passing the name of the algorithm into the function. This isn't necessary, but it does provide some flexibility. You can use the names of the algorithm (above) or you can hard-code the algorithm's class into the function.
This first code sample shows hashing in its simplest form ... no salt, nothing special, just a straight hash. The return value is a Base64 Encoded string ... I do like to use these for better (and easier) storage at the database level, though it is at the expense of some CPU cycles on the application logic level.
private string HashPasswordSimple(string password, string hashAlg)
//convert the password to bytes with UTF8 Encoding. byte passwordBytes = System.Text.Encoding.UTF8.GetBytes(password);//HashAlgorithm is disposable, so we'll use a "using" blockusing (HashAlgorithm hashAlgorithm = HashAlgorithm.Create(hashAlg))
byte passwordHash = hashAlgorithm.ComputeHash(passwordBytes);//convert the computed hash to a string representation ... string hashString = System.Convert.ToBase64String(passwordHash);return hashString;
As you can see, there's not that much to it. Pretty straightforward. To verify a password, you recalculate the password's hash and then compare it to the stored value. Adding a salt takes this up a level and, of course, you'll need to the salt somewhere as well. Good thing is that the salt isn't helpful by itself to a bad guy, so you can store it in the clear. Here is one method of using a salt (you can also add it to the end, etc. just a long as you can reproduce it).
private static string HashPasswordSalt(string password, byte salt, string hashAlg)
//convert the password to bytes with UTF8 Encoding. byte passwordBytes = System.Text.Encoding.UTF8.GetBytes(password);//Add the hash to the password bytes. byte hashData = new byte[passwordBytes.Length + salt.Length];//Use Buffer.BlockCopy to copy the salt and password//into a new array that will actually be hashed.
Buffer.BlockCopy(salt, 0, hashData, 0, salt.Length);
Buffer.BlockCopy(passwordBytes, 0, hashData, salt.Length, passwordBytes.Length);//From here, compute the hash.//HashAlgorithm is disposable, so we'll use a "using" blockusing (HashAlgorithm hashAlgorithm = HashAlgorithm.Create(hashAlg))
byte passwordHash = hashAlgorithm.ComputeHash(hashData);//convert the computed hash to a string representation ... string hashString = System.Convert.ToBase64String(passwordHash);return hashString;
The next question, of course, is how to create the salt. There are many ways to go about it as long as it is unique to the individual hash (i.e. the same passwords should not have the same salt ... would defeat the purpose). You can use the System.Security.Cryptography.RandomNumberGenerator class to create the salt. This class generates a cryptographically strong random sequence of values ... just using the System.Random class doesn't do that. You can use a unique identifier associated with the user account (for example) to create the salt ... i.e a user id Guid. You can do any number of things as long as it is unique in the hashing context.
In addition to traditional hash algorithms, .Net also has support for keyed hash algorithms. These take regular hashes a step up and are more commonly called a Hash Message Authentication Code (HMAC). These algorithms use a hash algorithm in addition to a secret key. This provides not just the data integrity, but also the integrity of the message. Think about it for a second ... if a hash algorithm is repeatable, a hacker could intercept the message, change it, recalculate the hash and you'd be none the wiser. With an HMAC, this is not possible as the key is required to regenerate the hash. A keyed hash algorithm is essential to protect the integrity of a hash value that is transmitted to users (for example, in ASP.NET's ViewState). Keep in mind, however, that you still need to think about protecting the key. With all of that said, .Net does support 2 keyed hash algorithms and they both inherit from System.Security.Cryptography.KeyedHashAlgorithm. This, of course, inherits from HashAlgorithm.
Keyed Hash Algorithms Supported in .Net
- HMACSHA1: Based on the SHA1 hashing algorithm (and, therefore 160 bits), this adds a key of arbitrary length to the function.
- MACTripleDES: As it's name implies, that uses the TripleDES algorithm to produce a hash. The keys can be 8, 16 or 24 bytes and generates a 64-bit hash.
The only difference between a straight hash algorithm and a keyed hash algorithm is the addition of the key. There isn't a need for a salt with a keyed algorithm; it is used for a different purpose (message authentication and validation) than a regular hash algorithm and, since the HMAC is authentication a message sent in the clear, there really isn't any point to it. An example of using a Keyed Hash Algorithm is in ASP.NET ... the <pages> element has an attribute of "enableViewStateMac". This has nothing to do with enabling ViewState on Macintosh, but to add a MAC to the ViewState. There is also a page directive that will do this at the page level. The key used can be specified in the <machineKey>; if you have it auto-generated, you run the chance that the ViewState will fail validation when the AppDomain recycles or, if you are using a web farm, the request goes to another web server.
That's all for now. Have fun and happy coding!