This one is a slightly different Question of the Week. Makerofthings7 asked a list-type question, which generally doesn’t fit on the Stack Exchange network. However, this question generated a lot of interest and some excellent answers containing a lot of useful information, so it is worthwhile posting an excerpt of that content here. If you are a budding cryptographer, or a developer asked to implement a crypto function, read these guidelines first!
D.W., one of our resident high-rep cryptographers, provided a number of the highest-scoring answers.
Don’t roll your own crypto.
Don’t invent your own encryption algorithm or protocol; that is extremely error-prone. As Bruce Schneier likes to say,
“Anyone can invent an encryption algorithm they themselves can’t break; it’s much harder to invent one that no one else can break”.
Crypto algorithms are very intricate and need intensive vetting to be sure they are secure; if you invent your own, you won’t get that, and it’s very easy to end up with something insecure without realizing it.
Instead, use a standard cryptographic algorithm and protocol. Odds are that someone else has encountered your problem before and designed an appropriate algorithm for that purpose.
Your best case is to use a high-level, well-vetted scheme: for communication security, use TLS (or SSL); for data at rest, use GPG (or PGP). If you can’t do that, use a high-level crypto library, like cryptlib, GPGME, Keyczar, or NaCl, instead of a low-level one, like OpenSSL, CryptoAPI, or JCE. Thanks to Nate Lawson for this suggestion.
Don’t use encryption without message authentication
It is a very common error to encrypt data without also authenticating it.
Example: The developer wants to keep a message secret, so encrypts the message with AES-CBC mode. The error: This is not sufficient for security in the presence of active attacks, replay attacks, reaction attacks, etc. There are known attacks on encryption without message authentication, and the attacks can be quite serious. The fix is to add message authentication.
This mistake has led to serious vulnerabilities in deployed systems that used encryption without authentication, including ASP.NET, XML encryption, Amazon EC2, JavaServer Faces, Ruby on Rails, OWASP ESAPI, IPSEC, WEP, ASP.NET again, and SSH2. You don’t want to be the next one on this list.
To avoid these problems, you need to use message authentication every time you apply encryption. You have two choices for how to do that:
Probably the simplest solution is to use an encryption scheme that provides authenticated encryption, e.g., GCM, CWC, EAX, CCM, OCB. (See also: 1.) The authenticated encryption scheme handles this for you, so you don’t have to think about it.
Alternatively, you can apply your own message authentication, as follows. First, encrypt the message using an appropriate symmetric-key encryption scheme (e.g., AES-CBC). Then, take the entire ciphertext (including any IVs, nonces, or other values needed for decryption), apply a message authentication code (e.g., AES-CMAC, SHA1-HMAC, SHA256-HMAC), and append the resulting MAC digest to the ciphertext before transmission. On the receiving side, check that the MAC digest is valid before decrypting. This is known as the encrypt-then-authenticate construction. (See also: 1, 2.) This also works fine, but requires a little more care from you.
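As a minimal sketch of the encrypt-then-authenticate steps above, the following Python shows only the MAC half. It assumes the IV and ciphertext have already been produced by a suitable cipher from a crypto library (the standard library has no AES), that the IV is 16 bytes, and that HMAC-SHA256 is the chosen MAC. The function names `seal` and `open_sealed` are hypothetical.

```python
import hashlib
import hmac
import secrets

TAG_LEN = 32  # HMAC-SHA256 output size in bytes


def seal(mac_key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    """Append an HMAC-SHA256 tag computed over IV || ciphertext."""
    tag = hmac.new(mac_key, iv + ciphertext, hashlib.sha256).digest()
    return iv + ciphertext + tag


def open_sealed(mac_key: bytes, blob: bytes, iv_len: int = 16) -> tuple[bytes, bytes]:
    """Verify the tag first; return (iv, ciphertext) only if it is valid."""
    if len(blob) < iv_len + TAG_LEN:
        raise ValueError("message too short")
    body, tag = blob[:-TAG_LEN], blob[-TAG_LEN:]
    expected = hmac.new(mac_key, body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):  # constant-time comparison
        raise ValueError("bad MAC: refusing to decrypt")
    return body[:iv_len], body[iv_len:]


mac_key = secrets.token_bytes(32)   # independent of the encryption key
iv = secrets.token_bytes(16)
ciphertext = b"\x01" * 32           # placeholder for real cipher output
blob = seal(mac_key, iv, ciphertext)
assert open_sealed(mac_key, blob) == (iv, ciphertext)
```

Decryption should only ever be attempted on the values returned by `open_sealed`, so that a forged or altered message is rejected before the cipher touches it.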
Be careful when concatenating multiple strings, before hashing.
An error I sometimes see: People want a hash of the strings S and T. They concatenate them to get a single string S||T, then hash it to get H(S||T). This is flawed.
The problem: Concatenation leaves the boundary between the two strings ambiguous. Example: builtin||securely = built||insecurely. Put another way, the hash H(S||T) does not uniquely identify the strings S and T. Therefore, the attacker may be able to change the boundary between the two strings without changing the hash. For instance, if Alice wanted to send the two strings builtin and securely, the attacker could change them to the two strings built and insecurely without invalidating the hash.
Similar problems apply when applying a digital signature or message authentication code to a concatenation of strings.
The fix: rather than plain concatenation, use some encoding that is unambiguously decodable. For instance, instead of computing H(S||T), you could compute H(length(S)||S||T), where length(S) is a 32-bit value denoting the length of S in bytes. Or, another possibility is to use H(H(S)||H(T)), or even H(H(S)||T).
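A short sketch of the length-prefix fix, assuming SHA-256 as H and a big-endian encoding for the 32-bit length (both are choices made for illustration only). The two input pairs are picked so that they concatenate to the same bytes.

```python
import hashlib
import struct


def naive_hash(s: bytes, t: bytes) -> str:
    """H(S || T): the boundary between S and T is not encoded."""
    return hashlib.sha256(s + t).hexdigest()


def length_prefixed_hash(s: bytes, t: bytes) -> str:
    """H(length(S) || S || T) with length(S) as a 32-bit big-endian integer."""
    return hashlib.sha256(struct.pack(">I", len(s)) + s + t).hexdigest()


a = (b"builtin", b"securely")
b = (b"built", b"insecurely")
print(naive_hash(*a) == naive_hash(*b))                      # True: same concatenation
print(length_prefixed_hash(*a) == length_prefixed_hash(*b))  # False: boundary is encoded
```

With more than two strings, every field except possibly the last needs its own length prefix for the encoding to stay unambiguous.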
For a real-world example of this flaw, see this flaw in Amazon Web Services or this flaw in Flickr [pdf].
Make sure you seed random number generators with enough entropy.
Make sure you use crypto-strength pseudorandom number generators for things like generating keys, choosing IVs/nonces, etc. Don’t use a general-purpose generator such as rand() for these.
Make sure you seed the pseudorandom number generator with enough entropy. Don’t seed it with the time of day; that’s guessable. srand(time(NULL)) is very bad. A good way to seed your PRNG is to grab 128 bits of true-random numbers, e.g., from /dev/urandom, CryptGenRandom, or similar. In Java, use SecureRandom, not Random. In .NET, use System.Security.Cryptography.RandomNumberGenerator, not System.Random. In Python, use random.SystemRandom, not random. Thanks to Nate Lawson for some examples.
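For Python specifically, a minimal sketch of drawing key material from the operating system's generator might look like this. It assumes Python 3.6 or later, where the `secrets` module wraps `random.SystemRandom`. The sizes are illustrative.

```python
import random
import secrets

key = secrets.token_bytes(16)       # 128-bit key from the OS CSPRNG
iv = secrets.token_bytes(16)        # fresh 128-bit IV/nonce
token = secrets.token_urlsafe(32)   # unguessable text token from 32 random bytes
pin = secrets.randbelow(10 ** 6)    # uniform integer in [0, 10**6)

# random.SystemRandom exposes the same OS source through the familiar random API.
sysrand = random.SystemRandom()
die_roll = sysrand.randint(1, 6)
```

None of these calls need (or accept) a seed, which removes the opportunity to seed them badly.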
Real-world example: see this flaw in early versions of Netscape’s browser, which allowed an attacker to break SSL.
Don’t reuse nonces or IVs
Many modes of operation require an IV (Initialization Vector). You must never use the same IV value twice with the same key; doing so can cancel all the security guarantees and cause a catastrophic breach of security.
For stream cipher modes of operation, like CTR mode or OFB mode, re-using an IV is a security disaster. It can cause the encrypted messages to be trivially recoverable.
For other modes of operation, like CBC mode, re-using an IV can also facilitate plaintext-recovery attacks in some cases.
No matter what mode of operation you use, you shouldn’t reuse the IV. If you’re wondering how to do it right, the NIST specification provides detailed documentation of how to use block cipher modes of operation properly.
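To see why IV reuse is so damaging in stream-style modes, here is a small sketch. The keystream generator is a toy built from SHA-256 in a counter arrangement and is an assumption made for illustration only, not a real CTR implementation. The effect it shows (the keystream cancelling out) applies to any scheme that XORs a keystream into the plaintext.

```python
import hashlib


def toy_keystream(key: bytes, iv: bytes, n: int) -> bytes:
    """Toy CTR-style keystream: SHA-256(key || iv || counter). Illustration only."""
    out = b""
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + iv + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:n]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


key, iv = b"k" * 16, b"\x00" * 16      # the same IV is (wrongly) used twice
p1 = b"attack at dawn!!"
p2 = b"retreat at noon!"
c1 = xor(p1, toy_keystream(key, iv, len(p1)))
c2 = xor(p2, toy_keystream(key, iv, len(p2)))

# The keystream cancels out: the attacker learns p1 XOR p2 without the key.
assert xor(c1, c2) == xor(p1, p2)
# Knowing (or guessing) one plaintext then reveals the other.
assert xor(xor(c1, c2), p1) == p2
```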
Don’t use a block cipher with ECB for symmetric encryption
(Applies to AES, 3DES, … )
Equivalently, don’t rely on library default settings to be secure. Specifically, many libraries that implement AES implement only the block algorithm described in FIPS 197. Applied directly to each block of a message, this is the so-called ECB (Electronic Code Book) mode, which is essentially a straightforward mapping of:
AES(plaintext block, key) -> ciphertext block
This is very insecure. The reasoning is simple: while the number of possible keys in the keyspace is quite large, the weak link here is the amount of entropy in the message, because identical plaintext blocks always encrypt to identical ciphertext blocks. As always, xkcd.com describes it better than I can: http://xkcd.com/257/
It’s very important to use something like CBC (Cipher Block Chaining) which basically makes ciphertext[i] a mapping:
ciphertext[i] = SomeFunction(ciphertext[i-1], message[i], key)
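The difference between the two mappings can be sketched without a real cipher. In the code below, `toy_block` is a stand-in for a single block-cipher call, built from HMAC-SHA256; it is not AES and cannot be decrypted. It is used only to show which output blocks repeat. The function names are hypothetical.

```python
import hashlib
import hmac
import secrets

BLOCK = 16


def toy_block(key: bytes, block: bytes) -> bytes:
    """Stand-in for one block-cipher call: a keyed PRF truncated to one block.
    Not invertible and not AES; it only shows which outputs repeat."""
    return hmac.new(key, block, hashlib.sha256).digest()[:BLOCK]


def ecb_like(key: bytes, msg: bytes) -> list[bytes]:
    """Each block is mapped independently of its neighbours."""
    blocks = [msg[i:i + BLOCK] for i in range(0, len(msg), BLOCK)]
    return [toy_block(key, b) for b in blocks]


def chained(key: bytes, iv: bytes, msg: bytes) -> list[bytes]:
    """ciphertext[i] depends on ciphertext[i-1], message[i] and the key."""
    out, prev = [], iv
    for i in range(0, len(msg), BLOCK):
        mixed = bytes(x ^ y for x, y in zip(msg[i:i + BLOCK], prev))
        prev = toy_block(key, mixed)
        out.append(prev)
    return out


key = secrets.token_bytes(16)
msg = b"SAME BLOCK HERE!" * 3    # three identical 16-byte blocks
print(len(set(ecb_like(key, msg))))                          # 1: repetition is visible
print(len(set(chained(key, secrets.token_bytes(16), msg))))  # 3: each block differs
```

An observer of the ECB-style output learns that all three plaintext blocks are equal without knowing the key; the chained output hides that.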
Just to point out a few language libraries where this sort of mistake is easy to make:
- http://golang.org/pkg/crypto/aes/ provides an AES implementation which, if used naively, would result in ECB mode.
- The pycrypto library defaults to ECB mode when creating a new AES object.
- OpenSSL does this right: every AES call is explicit about the mode of operation.
Really, the safest thing IMO is to just try not to do low-level crypto like this yourself. If you’re forced to, proceed as if you’re walking on broken glass (carefully), and try to make sure your users are justified in placing their trust in you to safeguard their data.
Don’t use the same key for both encryption and authentication. Don’t use the same key for both encryption and signing.
A key should not be reused for multiple purposes; that may open up various subtle attacks.
For instance, if you have an RSA private/public key pair, you should not both use it for encryption (encrypt with the public key, decrypt with the private key) and for signing (sign with the private key, verify with the public key): pick a single purpose and use it for just that one purpose. If you need both abilities, generate two keypairs, one for signing and one for encryption/decryption.
Similarly, with symmetric cryptography, you should use one key for encryption and a separate independent key for message authentication. Don’t re-use the same key for both purposes.
Kerckhoffs’s principle: A cryptosystem should be secure even if everything about the system, except the key, is public knowledge
A wrong example: LANMAN hashes
The LANMAN hashes would be hard to figure out if no one knew the algorithm; however, once the algorithm became known, the hashes became trivial to crack.
The algorithm is as follows (from wikipedia) :
- The user’s ASCII password is converted to uppercase.
- This password is null-padded to 14 bytes
- The “fixed-length” password is split into two seven-byte halves.
- These values are used to create two DES keys, one from each 7-byte half
- Each of the two keys is used to DES-encrypt the constant ASCII string “KGS!@#$%”, resulting in two 8-byte ciphertext values.
- These two ciphertext values are concatenated to form a 16-byte value, which is the LM hash
Because you now know these facts, you can very easily split the 16-byte hash into its two 8-byte ciphertexts and attack each one separately. Each half is derived from at most seven characters, all of them upper case, which leaves a very limited set of values the password could possibly be.
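The first three steps of the algorithm, and the effect they have on the search space, can be sketched as follows. The DES step is left out (the Python standard library has no DES), truncation to 14 characters is an assumption about over-long input, and the character counts (95 printable ASCII characters, 69 once lower case is folded into upper case) are rough figures for illustration.

```python
def lm_halves(password: str) -> tuple[bytes, bytes]:
    """Steps 1-3 of the LM scheme: uppercase, null-pad to 14 bytes, split in two."""
    raw = password.upper().encode("ascii")[:14].ljust(14, b"\x00")
    return raw[:7], raw[7:]


print(lm_halves("Password1"))   # (b'PASSWOR', b'D1\x00\x00\x00\x00\x00')

# Rough search-space comparison under the character-count assumptions above.
full_space = 95 ** 14      # one 14-character mixed-case password
half_space = 69 ** 7       # each 7-character uppercase half, attacked separately
print(f"work reduced by a factor of about {full_space // (2 * half_space):.3e}")
```

Because the two halves are hashed independently, the attacker searches two small spaces one after the other instead of one large space.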
A correct example: AES encryption
The algorithm is fully public, and its security rests on the key alone. It also scales with technology: increase the key size when in need of more cryptographic oomph.
Try to avoid using passwords as encryption keys.
A common weakness in many systems is to use a password or passphrase, or a hash of a password or passphrase, as the encryption/decryption key. The problem is that this tends to be highly susceptible to offline keysearch attacks. Most users choose passwords that do not have sufficient entropy to resist such attacks.
The best fix is to use a truly random encryption/decryption key, not one deterministically generated from a password/passphrase.
However, if you must use one based upon a password/passphrase, use an appropriate scheme to slow down exhaustive keysearch. I recommend PBKDF2, which uses iterative hashing (along the lines of H(H(H(….H(password)…)))) to slow down dictionary search. Arrange to use sufficiently many iterations to cause this process to take, say, 100ms on the user’s machine to generate the key.
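A minimal sketch of password-based key derivation with PBKDF2, using `hashlib.pbkdf2_hmac` from the Python standard library. The random per-password salt, the SHA-256 choice, the 256-bit output, and the iteration count are assumptions made for illustration; the iteration count in particular should be tuned by measuring on the target machine.

```python
import hashlib
import secrets
import time


def derive_key(password: str, salt: bytes, iterations: int) -> bytes:
    """Derive a 256-bit key with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations, dklen=32
    )


salt = secrets.token_bytes(16)   # random salt, stored alongside the ciphertext
iterations = 200_000             # assumed starting point; tune on the target hardware

start = time.perf_counter()
key = derive_key("correct horse battery staple", salt, iterations)
elapsed_ms = (time.perf_counter() - start) * 1000
print(f"{iterations} iterations took {elapsed_ms:.0f} ms")
```

If the measured time is well below the budget, raise the iteration count; the salt and the count both need to be stored so the same key can be derived again.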
In a cryptographic protocol: Make every authenticated message recognisable: no two messages should look the same
A generalisation/variant of:
Be careful when concatenating multiple strings, before hashing.
Don’t reuse keys.
Don’t reuse nonces.
During a run of a cryptographic protocol, many messages that cannot be counterfeited without a secret (key or nonce) can be exchanged. These messages can be verified by the receiver because he knows some public (signature) key, or because only he and the sender know some symmetric key or nonce. This makes sure that these messages have not been modified.
But this does not make sure that these messages have been emitted during the same run of the protocol: an adversary might have captured these messages previously, or during a concurrent run of the protocol. An adversary may start many concurrent runs of a cryptographic protocol to capture valid messages and reuse them unmodified.
By cleverly replaying messages, it might be possible to attack a protocol without compromising any primary key, without attacking any RNG, any cipher, etc.
By making every authenticated message of the protocol obviously distinct for the receiver, opportunities to replay unmodified messages are reduced (not eliminated).
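One way to make messages distinct, sketched under assumptions: authenticate each payload together with a protocol label, a per-run session identifier, the direction, and the step number, all length-prefixed so the encoding is unambiguous. The field set and the names `encode_fields` and `tag_message` are hypothetical, and HMAC-SHA256 is an illustrative choice of MAC.

```python
import hashlib
import hmac
import struct


def encode_fields(*fields: bytes) -> bytes:
    """Length-prefix every field so the encoding is unambiguous."""
    return b"".join(struct.pack(">I", len(f)) + f for f in fields)


def tag_message(key: bytes, protocol: bytes, session_id: bytes,
                direction: bytes, step: int, payload: bytes) -> bytes:
    """MAC the payload together with context that pins it to one place in one run."""
    data = encode_fields(protocol, session_id, direction,
                         struct.pack(">I", step), payload)
    return hmac.new(key, data, hashlib.sha256).digest()


key = b"\x11" * 32
t1 = tag_message(key, b"demo-v1", b"session-1", b"A->B", 1, b"hello")
t2 = tag_message(key, b"demo-v1", b"session-2", b"A->B", 1, b"hello")
assert t1 != t2   # the same payload replayed into another run no longer verifies
```

The receiver recomputes the tag with the context it expects, so a message captured from another run, another direction, or another step fails verification.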
Don’t use the same key in both directions.
In network communications, a common mistake is to use the same key for communication in the A->B direction as for the B->A direction. This is a bad idea, because it often enables replay attacks that replay something A sent to B, back to A.
The safest approach is to negotiate two independent keys, one for each direction. Alternatively, you can negotiate a single key K, then use K1 = AES(K,00..0) for one direction and K2 = AES(K,11..1) for the other direction.
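The single-key variant can be sketched as below. Note the substitution: the text derives K1 and K2 with AES under K, while this sketch uses HMAC-SHA256 keyed with K because the Python standard library has no AES. The direction labels and the function name are hypothetical.

```python
import hashlib
import hmac
import secrets


def directional_keys(k: bytes) -> tuple[bytes, bytes]:
    """Derive one key per direction from the negotiated key K."""
    k_a_to_b = hmac.new(k, b"A->B", hashlib.sha256).digest()
    k_b_to_a = hmac.new(k, b"B->A", hashlib.sha256).digest()
    return k_a_to_b, k_b_to_a


k = secrets.token_bytes(32)           # stands in for the negotiated key K
k1, k2 = directional_keys(k)
assert k1 != k2                       # a message sealed under k1 fails under k2
```

Both sides run the same derivation, with A sending under `k1` and verifying under `k2`, and B doing the opposite.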
Don’t use insecure key lengths.
Ensure you use algorithms with a sufficiently long key.
For symmetric-key cryptography, I’d recommend at least an 80-bit key, and if possible, a 128-bit key is a good idea. Don’t use 40-bit crypto; it is insecure and easily broken by amateurs, simply by exhaustively trying every possible key. Don’t use 56-bit DES; it is not trivial to break, but it is within the reach of dedicated attackers to break DES. A 128-bit algorithm, like AES, is not appreciably slower than 40-bit crypto, so you have no excuse for using crummy crypto.
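To get a feel for these key sizes, the following back-of-the-envelope calculation estimates the time to try every key. The rate of one billion keys per second is an assumption picked purely for illustration; a real attacker's rate depends on the hardware and the cipher.

```python
RATE = 10 ** 9   # assumed keys tested per second; purely illustrative

SECONDS_PER_YEAR = 365 * 24 * 3600

for bits in (40, 56, 80, 128):
    seconds = 2 ** bits / RATE
    years = seconds / SECONDS_PER_YEAR
    print(f"{bits:3d}-bit key: {seconds:.3g} s  (~{years:.3g} years)")
```

At this assumed rate a 40-bit keyspace is exhausted in well under an hour, while each additional bit doubles the work.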
For public-key cryptography, key length recommendations are dependent upon the algorithm and the level of security required. Also, increasing the key size does harm performance, so massive overkill is not economical; thus, this requires a little more thought than selection of symmetric-key key sizes. For RSA, El Gamal, or Diffie-Hellman, I’d recommend that the key be at least 1024 bits, as an absolute minimum; however, 1024-bit keys are on the edge of what might become crackable in the near term and are generally not recommended for modern use, so if at all possible, I would recommend 1536- or even 2048-bit keys. For elliptic-curve cryptography, 160-bit keys appear adequate, and 224-bit keys are better. You can also refer to published guidelines establishing rough equivalences between symmetric- and public-key key sizes.
Don’t re-use the same key on many devices.
The more widely you share a cryptographic key, the less likely you’ll be able to keep it secret. Some deployed systems have re-used the same symmetric key on every device in the system. The problem with this is that sooner or later, someone will extract the key from a single device, and then they’ll be able to attack all the other devices. So, don’t do that.
See also “Symmetric Encryption Don’t #6: Don’t share a single key across many devices” in this blog article. Credits to Matthew Green.
A one-time pad is not a one-time pad if the key is stretched by an algorithm
The identifier “one-time pad” (also known as a Vernam cipher) is frequently misapplied to various cryptographic solutions in an attempt to claim unbreakable security. But by definition, a Vernam cipher is secure if and only if all three of these conditions are met:
The key material is truly unpredictable; AND
The key material is the same length as the plaintext; AND
The key material is never reused.
Any violation of those conditions means it is no longer a one-time pad cipher.
The common mistake is that a short key is stretched with an algorithm. This violates the unpredictability rule (never mind the key length rule). Once this is done, the security of the one-time pad is reduced to that of the key-stretching algorithm. Combining the short key with random bytes only alters the search space needed to brute force the key-stretching algorithm. Similarly, using “randomly generated” bytes turns the random number generator algorithm into the security algorithm.
You may have a very good key-stretching algorithm. You may also have a very secure random number generator. However, your algorithm is by definition not a one-time pad, and thus does not have the unbreakable property of a one-time pad.
Don’t use an OTP or stream cipher in disk encryption
Suppose a file is saved using a stream cipher / OTP. If the file is resaved with the same keystream after a minor edit, an attacker who sees both versions can tell that only certain bits were changed and infer information about the document. (Imagine changing the salutation “Dear Bob” to “Dear Alice”.)
There is no integrity in the output: an attacker can modify the ciphertext and modify the contents of the data by simply XORing the data.
Takeaway: Modifications to the ciphertext go undetected and have a predictable impact on the plaintext.
Use a block cipher mode that includes message integrity checks for these situations.
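The lack of integrity can be demonstrated in a few lines. The keystream here is a toy built from SHA-256 and is an assumption made for illustration only. The replacement name is kept the same length as the original so that byte offsets line up; the attacker is assumed to know the salutation and its position, but not the key.

```python
import hashlib
import secrets


def toy_keystream(key: bytes, n: int) -> bytes:
    """Toy keystream: SHA-256(key || counter) blocks. Illustration only."""
    out = b""
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:n]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


key = secrets.token_bytes(16)
plaintext = b"Dear Bob, pay $100"
ciphertext = xor(plaintext, toy_keystream(key, len(plaintext)))

# The attacker XORs the difference between the old and new name into the
# ciphertext at the known offset ("Dear " is 5 bytes), without using the key.
delta = (bytes(5) + xor(b"Bob", b"Eve")).ljust(len(ciphertext), b"\x00")
forged = xor(ciphertext, delta)

assert xor(forged, toy_keystream(key, len(forged))) == b"Dear Eve, pay $100"
```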
Liked this question of the week? Interested in reading more detail, and other answers? See the question in full. Have questions of a security nature of your own? Security expert and want to help others? Come and join us at security.stackexchange.com.