Posts Tagged ‘crypto’

A short statement on the Heartbleed problem and its impact on common Internet users.

2014-04-11 by lucaskauffman. 2 comments

On the 7th of April 2014 a team of security engineers (Riku, Antti and Matti) at Codenomicon and Neel Mehta of Google Security published information on a security issue in OpenSSL. OpenSSL is a piece of software used in the encryption process; it encodes your computer traffic so that unauthorized people cannot understand what you are sending from one computer network to another. It is used in many applications: for example, if you use on-line banking websites, code such as OpenSSL helps to ensure that your PIN code remains secret.

The information that was released caused great turmoil in the security community, and many panic buttons were pressed because of the widespread use of OpenSSL. If you use a computer and the Internet, you might be impacted: people at home just as much as major corporations. OpenSSL is used, for example, in web, e-mail and VPN servers and even in some security appliances. However, the fact that you have been impacted does not mean you can no longer use your PC or any of its applications. You may be a little more vulnerable, but the end of the world may still be further off than you think. First of all, some media reported on the “Heartbleed virus”. Heartbleed is in fact not a virus at all. You cannot be infected with it and you cannot protect against being infected. Instead, it is an error in the computer programming code of specific OpenSSL versions (not all) which a hacker could potentially use to obtain information from the server (possibly including passwords and encryption keys, along with other random data in the server’s memory), potentially allowing him to break into a system or account.

Luckily, most applications in which OpenSSL is used rely on more security measures than OpenSSL alone. Most banks, for instance, continuously work to remain abreast of security issues, and have implemented several measures that lower the risk this vulnerability poses. An example of such a protective measure is transaction signing with an off-line card reader or other forms of two-factor authentication. Typically, exploiting the vulnerability on its own will not allow an attacker to post fraudulent transactions if you are using two-factor authentication or an offline token generator for transaction signing.

So in summary, does the Heartbleed vulnerability affect end-users? Yes, but not dramatically. A lot of the risk to the end-users can be lowered by following common-sense security principles:

  • Regularly change your on-line passwords (do this as soon as the websites you use let you know they have updated their software, but it should also be part of your regular routine)
  • Ideally, do not use the same password for two on-line websites or applications
  • Keep the software on your computer up-to-date.
  • Do not perform on-line transactions on a public network (e.g. WiFi hotspots in an airport). Anyone could be trying to listen in.

Security Stack Exchange has a wide range of questions on Heartbleed, from detail on how it works to how to explain it to non-technical friends.

Authors: Ben Van Erck, Lucas Kauffman

QoTW #47: Lessons learned and misconceptions regarding encryption and cryptology

2013-06-10 by roryalsop. 0 comments

This one is a slightly different Question of the Week. Makerofthings7 asked a list-type question, which generally doesn’t fit on the Stack Exchange network; however, this question generated a lot of interest and some excellent answers containing a lot of useful information, so it is probably worthwhile posting an excerpt of that content here. If you are a budding cryptographer, or a developer asked to implement a crypto function, read these guidelines first!

D.W., one of our resident high-rep cryptographers, provided a number of the highest-scoring answers.

Don’t roll your own crypto.

Don’t invent your own encryption algorithm or protocol; that is extremely error-prone. As Bruce Schneier likes to say,

“Anyone can invent an encryption algorithm they themselves can’t break; it’s much harder to invent one that no one else can break”.

Crypto algorithms are very intricate and need intensive vetting to be sure they are secure; if you invent your own, you won’t get that, and it’s very easy to end up with something insecure without realizing it.

Instead, use a standard cryptographic algorithm and protocol. Odds are that someone else has encountered your problem before and designed an appropriate algorithm for that purpose.

Your best case is to use a high-level well-vetted scheme: for communication security, use TLS (or SSL); for data at rest, use GPG (or PGP). If you can’t do that, use a high-level crypto library, like cryptlib, GPGME, Keyczar, or NaCl, instead of a low-level one, like OpenSSL, CryptoAPI, JCE, etc. Thanks to Nate Lawson for this suggestion.
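As an illustration of that advice, here is a minimal sketch using PyNaCl, the Python bindings to NaCl (an assumption: that PyNaCl is installed, via pip install pynacl). The library chooses the cipher, generates the nonce and appends the authentication tag for you:

    # Minimal PyNaCl sketch: the library picks the cipher, nonce and MAC.
    import nacl.secret
    import nacl.utils

    key = nacl.utils.random(nacl.secret.SecretBox.KEY_SIZE)  # 32 random bytes
    box = nacl.secret.SecretBox(key)

    ciphertext = box.encrypt(b"attack at dawn")          # nonce + ciphertext + MAC
    assert box.decrypt(ciphertext) == b"attack at dawn"  # raises if tampered with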

Don’t use encryption without message authentication

It is a very common error to encrypt data without also authenticating it.

Example: The developer wants to keep a message secret, so encrypts the message with AES-CBC mode. The error: This is not sufficient for security in the presence of active attacks, replay attacks, reaction attacks, etc. There are known attacks on encryption without message authentication, and the attacks can be quite serious. The fix is to add message authentication.

This mistake has led to serious vulnerabilities in deployed systems that used encryption without authentication, including ASP.NET, XML encryption, Amazon EC2, JavaServer Faces, Ruby on Rails, OWASP ESAPI, IPSEC, WEP, ASP.NET again, and SSH2. You don’t want to be the next one on this list.

To avoid these problems, you need to use message authentication every time you apply encryption. You have two choices for how to do that:

Probably the simplest solution is to use an encryption scheme that provides authenticated encryption, e.g., GCM, CWC, EAX, CCM, or OCB. The authenticated encryption scheme handles this for you, so you don’t have to think about it.

Alternatively, you can apply your own message authentication, as follows. First, encrypt the message using an appropriate symmetric-key encryption scheme (e.g., AES-CBC). Then, take the entire ciphertext (including any IVs, nonces, or other values needed for decryption), apply a message authentication code (e.g., AES-CMAC, SHA1-HMAC, SHA256-HMAC), and append the resulting MAC digest to the ciphertext before transmission. On the receiving side, check that the MAC digest is valid before decrypting. This is known as the encrypt-then-authenticate construction. This also works fine, but requires a little more care from you.
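A sketch of that encrypt-then-authenticate construction, using the pyca/cryptography package (an assumption; any library exposing AES-CBC and HMAC works the same way). Note that the MAC covers the IV, and that verification happens before any decryption:

    import os
    from cryptography.hazmat.primitives import hashes, hmac, padding
    from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

    def encrypt_then_mac(enc_key: bytes, mac_key: bytes, message: bytes) -> bytes:
        iv = os.urandom(16)  # fresh random IV for every message
        padder = padding.PKCS7(128).padder()
        padded = padder.update(message) + padder.finalize()
        enc = Cipher(algorithms.AES(enc_key), modes.CBC(iv)).encryptor()
        ciphertext = iv + enc.update(padded) + enc.finalize()
        tag = hmac.HMAC(mac_key, hashes.SHA256())  # MAC over IV + ciphertext
        tag.update(ciphertext)
        return ciphertext + tag.finalize()

    def verify_then_decrypt(enc_key: bytes, mac_key: bytes, blob: bytes) -> bytes:
        ciphertext, tag = blob[:-32], blob[-32:]
        check = hmac.HMAC(mac_key, hashes.SHA256())
        check.update(ciphertext)
        check.verify(tag)  # raises InvalidSignature BEFORE any decryption
        iv, body = ciphertext[:16], ciphertext[16:]
        dec = Cipher(algorithms.AES(enc_key), modes.CBC(iv)).decryptor()
        padded = dec.update(body) + dec.finalize()
        unpadder = padding.PKCS7(128).unpadder()
        return unpadder.update(padded) + unpadder.finalize()

Here enc_key and mac_key must be two independent keys (16, 24 or 32 bytes for AES); see the point on key separation further down.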

Be careful when concatenating multiple strings, before hashing.

An error I sometimes see: People want a hash of the strings S and T. They concatenate them to get a single string S||T, then hash it to get H(S||T). This is flawed.

The problem: Concatenation leaves the boundary between the two strings ambiguous. Example: builtin||securely = built||insecurely. Put another way, the hash H(S||T) does not uniquely identify the strings S and T. Therefore, the attacker may be able to change the boundary between the two strings, without changing the hash. For instance, if Alice wanted to send the two strings builtin and securely, the attacker could change them to the two strings built and insecurely without invalidating the hash.

Similar problems apply when applying a digital signature or message authentication code to a concatenation of strings.

The fix: rather than plain concatenation, use some encoding that is unambiguously decodeable. For instance, instead of computing H(S||T), you could compute H(length(S)||S||T), where length(S) is a 32-bit value denoting the length of S in bytes. Or, another possibility is to use H(H(S)||H(T)), or even H(H(S)||T).
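A short sketch of the length-prefix fix in Python, showing both the collision in the naive scheme and its absence in the fixed one:

    import hashlib
    import struct

    def hash_pair(s: bytes, t: bytes) -> bytes:
        # 32-bit big-endian length of S, then S, then T: unambiguously decodable.
        return hashlib.sha256(struct.pack(">I", len(s)) + s + t).digest()

    # Naive concatenation collides...
    assert (hashlib.sha256(b"builtin" + b"securely").digest()
            == hashlib.sha256(b"built" + b"insecurely").digest())
    # ...the length-prefixed encoding does not.
    assert hash_pair(b"builtin", b"securely") != hash_pair(b"built", b"insecurely")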

For a real-world example of this flaw, see this flaw in Amazon Web Services or this flaw in Flickr [pdf].

Make sure you seed random number generators with enough entropy.

Make sure you use crypto-strength pseudorandom number generators for things like generating keys, choosing IVs/nonces, etc. Don’t use rand(), random(), drand48(), etc.

Make sure you seed the pseudorandom number generator with enough entropy. Don’t seed it with the time of day; that’s guessable.

Examples: srand(time(NULL)) is very bad. A good way to seed your PRNG is to grab 128 bits of true-random numbers, e.g., from /dev/urandom, CryptGenRandom, or similar. In Java, use SecureRandom, not Random. In .NET, use System.Security.Cryptography.RandomNumberGenerator, not System.Random. In Python, use random.SystemRandom, not random. Thanks to Nate Lawson for some examples.
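In Python, for instance, doing this correctly needs no manual seeding at all; the secrets and os modules draw directly from the OS CSPRNG:

    import os
    import random
    import secrets

    key = secrets.token_bytes(16)  # 128 bits from the OS CSPRNG
    iv = os.urandom(16)            # equivalent source for raw random bytes

    # random.Random is fine for simulations but guessable for crypto;
    # random.SystemRandom is the CSPRNG-backed drop-in mentioned above.
    pin = random.SystemRandom().randrange(10**6)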

Real-world example: see this flaw in early versions of Netscape’s browser, which allowed an attacker to break SSL.

Don’t reuse nonces or IVs

Many modes of operation require an IV (Initialization Vector). You must never re-use the same value for an IV twice; doing so can cancel all the security guarantees and cause a catastrophic breach of security.

For stream cipher modes of operation, like CTR mode or OFB mode, re-using an IV is a security disaster. It can cause the encrypted messages to be trivially recoverable. For other modes of operation, like CBC mode, re-using an IV can also facilitate plaintext-recovery attacks in some cases.

No matter what mode of operation you use, you shouldn’t reuse the IV. If you’re wondering how to do it right, the NIST specification provides detailed documentation of how to use block cipher modes of operation properly.
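To see why reuse is such a disaster for a stream cipher mode, consider this sketch (pyca/cryptography assumed): with the same key and nonce, the XOR of two ciphertexts equals the XOR of the two plaintexts, so one known plaintext reveals the other.

    import os
    from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

    key, nonce = os.urandom(32), os.urandom(16)

    def ctr_encrypt(message: bytes) -> bytes:
        enc = Cipher(algorithms.AES(key), modes.CTR(nonce)).encryptor()
        return enc.update(message) + enc.finalize()

    c1 = ctr_encrypt(b"meet me at noon!")
    c2 = ctr_encrypt(b"the cake is nice")  # same key AND nonce: disaster

    # The identical keystreams cancel out: c1 XOR c2 == p1 XOR p2
    assert bytes(a ^ b for a, b in zip(c1, c2)) == \
        bytes(a ^ b for a, b in zip(b"meet me at noon!", b"the cake is nice"))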

Don’t use a block cipher with ECB for symmetric encryption

(Applies to AES, 3DES, … )

Equivalently, don’t rely on library default settings to be secure. Specifically, many libraries which implement AES expose the raw block transformation described in FIPS 197, in so-called ECB (Electronic Code Book) mode, which is essentially a straightforward mapping:

AES(plaintext [16]byte, key [32]byte) -> ciphertext [16]byte

This is very insecure. The reasoning is simple: while the number of possible keys in the keyspace is quite large, the weak link here is the amount of entropy in the message. As always, xkcd.com describes it better than I do: http://xkcd.com/257/

It’s very important to use something like CBC (Cipher Block Chaining) instead, which basically makes ciphertext[i] a mapping:

ciphertext[i] = SomeFunction(ciphertext[i-1], message[i], key)

Just to point out a few language libraries where this sort of mistake is easy to make: http://golang.org/pkg/crypto/aes/ provides an AES implementation which, if used naively, would result in ECB mode.

The pycrypto library defaults to ECB mode when creating a new AES object.

OpenSSL does this right: every AES call is explicit about the mode of operation. Really, the safest thing IMO is to just try not to do low-level crypto like this yourself. If you’re forced to, proceed as if you’re walking on broken glass (carefully), and try to make sure your users are justified in placing their trust in you to safeguard their data.
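A quick sketch of the ECB weakness (pyca/cryptography assumed): identical plaintext blocks produce identical ciphertext blocks, so the structure of the message shows through, while a chained mode like CBC hides it.

    import os
    from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

    key = os.urandom(32)
    plaintext = b"ATTACK AT DAWN!!" * 2  # two identical 16-byte blocks

    ecb = Cipher(algorithms.AES(key), modes.ECB()).encryptor()
    ct = ecb.update(plaintext) + ecb.finalize()
    assert ct[:16] == ct[16:32]  # the repetition leaks straight through ECB

    cbc = Cipher(algorithms.AES(key), modes.CBC(os.urandom(16))).encryptor()
    ct = cbc.update(plaintext) + cbc.finalize()
    assert ct[:16] != ct[16:32]  # chaining hides the repetition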

Don’t use the same key for both encryption and authentication. Don’t use the same key for both encryption and signing.

A key should not be reused for multiple purposes; that may open up various subtle attacks.

For instance, if you have an RSA private/public key pair, you should not both use it for encryption (encrypt with the public key, decrypt with the private key) and for signing (sign with the private key, verify with the public key): pick a single purpose and use it for just that one purpose. If you need both abilities, generate two keypairs, one for signing and one for encryption/decryption.

Similarly, with symmetric cryptography, you should use one key for encryption and a separate independent key for message authentication. Don’t re-use the same key for both purposes.
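One reasonable way to get independent keys without managing two secrets is to derive both from a single master key with a KDF; here is a sketch using HKDF from pyca/cryptography (the purpose labels are illustrative):

    import os
    from cryptography.hazmat.primitives import hashes
    from cryptography.hazmat.primitives.kdf.hkdf import HKDF

    master_key = os.urandom(32)

    def derive(purpose: bytes) -> bytes:
        return HKDF(algorithm=hashes.SHA256(), length=32,
                    salt=None, info=purpose).derive(master_key)

    enc_key = derive(b"encryption")      # distinct "info" labels yield
    mac_key = derive(b"authentication")  # independent keys
    assert enc_key != mac_key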

Kerckhoffs’s principle: A cryptosystem should be secure even if everything about the system, except the key, is public knowledge

A wrong example: LANMAN hashes

The LANMAN hashes would be hard to figure out if no one knew the algorithm; however, once the algorithm was known, they became very trivial to crack.

The algorithm is as follows (from Wikipedia):

  1. The user’s ASCII password is converted to uppercase.
  2. This password is null-padded to 14 bytes.
  3. The “fixed-length” password is split into two seven-byte halves.
  4. These values are used to create two DES keys, one from each 7-byte half.
  5. Each of the two keys is used to DES-encrypt the constant ASCII string “KGS!@#$%”, resulting in two 8-byte ciphertext values.
  6. These two ciphertext values are concatenated to form a 16-byte value, which is the LM hash.

Because the algorithm and the constant are now known, you can very easily break the hash into its two 8-byte halves and attack each one independently; and since you know the password was converted to upper case, only a limited set of characters remains possible for each 7-character half.
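To make the weakness concrete, here is a sketch of those steps. It expresses single DES as 3DES with one repeated key, which is equivalent; whether algorithms.TripleDES is still exposed at this import path depends on your version of the pyca/cryptography package. LM is shown strictly as a cautionary tale.

    from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

    def seven_to_eight(key7: bytes) -> bytes:
        # Spread 56 bits over 8 bytes; DES ignores the low (parity) bit of each byte.
        bits = int.from_bytes(key7, "big")
        return bytes((((bits >> (49 - 7 * i)) & 0x7F) << 1) for i in range(8))

    def lm_hash(password: str) -> bytes:
        pwd = password.upper().encode("ascii")[:14].ljust(14, b"\x00")  # steps 1-2
        out = b""
        for half in (pwd[:7], pwd[7:]):                                 # step 3
            key = seven_to_eight(half) * 3      # single DES via 3DES, steps 4-5
            enc = Cipher(algorithms.TripleDES(key), modes.ECB()).encryptor()
            out += enc.update(b"KGS!@#$%") + enc.finalize()
        return out                                                      # step 6

Each 8-byte half of the output depends on at most seven uppercase characters, so the two halves can be brute-forced independently.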

A correct example: AES encryption

  • Known algorithm.
  • Scales with technology: increase the key size when in need of more cryptographic oomph.

Try to avoid using passwords as encryption keys.

A common weakness in many systems is to use a password or passphrase, or a hash of a password or passphrase, as the encryption/decryption key. The problem is that this tends to be highly susceptible to offline keysearch attacks. Most users choose passwords that do not have sufficient entropy to resist such attacks.

The best fix is to use a truly random encryption/decryption key, not one deterministically generated from a password/passphrase.

However, if you must use one based upon a password/passphrase, use an appropriate scheme to slow down exhaustive keysearch. I recommend PBKDF2, which uses iterative hashing (along the lines of H(H(H(….H(password)…)))) to slow down dictionary search. Arrange to use sufficiently many iterations to cause this process to take, say, 100ms on the user’s machine to generate the key.
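The Python standard library ships PBKDF2; a minimal sketch (the iteration count here is illustrative, so calibrate it on your own hardware as described above):

    import hashlib
    import os

    password = b"correct horse battery staple"
    salt = os.urandom(16)  # random per-password salt, stored alongside the key

    key = hashlib.pbkdf2_hmac("sha256", password, salt, 600_000, dklen=32)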

In a cryptographic protocol: Make every authenticated message recognisable: no two messages should look the same

A generalisation/variant of:

Be careful when concatenating multiple strings, before hashing. Don’t reuse keys. Don’t reuse nonces.

During a run of a cryptographic protocol, many messages that cannot be counterfeited without a secret (key or nonce) can be exchanged. These messages can be verified by the receiver because he knows some public (signature) key, or because only he and the sender know some symmetric key, or nonce. This makes sure that these messages have not been modified.

But this does not make sure that these messages have been emitted during the same run of the protocol: an adversary might have captured these messages previously, or during a concurrent run of the protocol. An adversary may start many concurrent runs of a cryptographic protocol to capture valid messages and reuse them unmodified.

By cleverly replaying messages, it might be possible to attack a protocol without compromising any primary key, without attacking any RNG, any cypher, etc.

By making every authenticated message of the protocol obviously distinct for the receiver, opportunities to replay unmodified messages are reduced (not eliminated).

Don’t use the same key in both directions.

In network communications, a common mistake is to use the same key for communication in the A->B direction as for the B->A direction. This is a bad idea, because it often enables replay attacks that replay something A sent to B, back to A.

The safest approach is to negotiate two independent keys, one for each direction. Alternatively, you can negotiate a single key K, then use K1 = AES(K,00..0) for one direction and K2 = AES(K,11..1) for the other direction.
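A sketch of that single-key variant (pyca/cryptography assumed). ECB mode is acceptable here only because we are using AES as a raw pseudorandom permutation on two fixed, distinct single blocks, not encrypting messages:

    from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

    def direction_keys(k: bytes) -> tuple:
        enc = Cipher(algorithms.AES(k), modes.ECB()).encryptor()
        k1 = enc.update(b"\x00" * 16)  # K1 = AES(K, 00..0), for A -> B
        k2 = enc.update(b"\xff" * 16)  # K2 = AES(K, 11..1), for B -> A
        enc.finalize()
        return k1, k2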

Don’t use insecure key lengths.

Ensure you use algorithms with a sufficiently long key.

For symmetric-key cryptography, I’d recommend at least an 80-bit key, and if possible, a 128-bit key is a good idea. Don’t use 40-bit crypto; it is insecure and easily broken by amateurs, simply by exhaustively trying every possible key. Don’t use 56-bit DES; it is not trivial to break, but it is within the reach of dedicated attackers. A 128-bit algorithm, like AES, is not appreciably slower than 40-bit crypto, so you have no excuse for using crummy crypto.

For public-key cryptography, key length recommendations are dependent upon the algorithm and the level of security required. Also, increasing the key size does harm performance, so massive overkill is not economical; thus, this requires a little more thought than selection of symmetric-key key sizes. For RSA, El Gamal, or Diffie-Hellman, I’d recommend that the key be at least 1024 bits, as an absolute minimum; however, 1024-bit keys are on the edge of what might become crackable in the near term and are generally not recommended for modern use, so if at all possible, I would recommend 1536- or even 2048-bit keys. For elliptic-curve cryptography, 160-bit keys appear adequate, and 224-bit keys are better. You can also refer to published guidelines establishing rough equivalences between symmetric- and public-key key sizes.

Don’t re-use the same key on many devices.

The more widely you share a cryptographic key, the less likely you’ll be able to keep it secret. Some deployed systems have re-used the same symmetric key on every device on the system. The problem with this is that sooner or later, someone will extract the key from a single device, and then they’ll be able to attack all the other devices. So, don’t do that.

See also “Symmetric Encryption Don’t #6: Don’t share a single key across many devices” in this blog article. Credits to Matthew Green.

A one-time pad is not a one-time pad if the key is stretched by an algorithm

The identifier “one-time pad” (also known as a Vernam cipher) is frequently misapplied to various cryptographic solutions in an attempt to claim unbreakable security. But by definition, a Vernam cipher is secure if and only if all three of these conditions are met:

  • The key material is truly unpredictable; AND
  • The key material is the same length as the plaintext; AND
  • The key material is never reused.

Any violation of those conditions means it is no longer a one-time pad cipher.

The common mistake made is that a short key is stretched with an algorithm. This action violates the unpredictability rule (never mind the key length rule.) Once this is done, the one-time pad is mathematically transformed into the key-stretching algorithm. Combining the short key with random bytes only alters the search space needed to brute force the key-stretching algorithm. Similarly, using “randomly generated” bytes turns the random number generator algorithm into the security algorithm.

You may have a very good key-stretching algorithm. You may also have a very secure random number generator. However, your algorithm is by definition not a one-time pad, and thus does not have the unbreakable property of a one-time pad.

Don’t use an OTP or stream cipher in disk encryption

Example 1

Suppose a file is saved using a stream cipher / OTP. If the file is resaved after a minor edit, an attacker can see that only certain bits were changed and infer information about the document. (Imagine changing the salutation “Dear Bob” to “Dear Alice”.)

Example 2

There is no integrity in the output: an attacker can modify the contents of the plaintext simply by XORing bits into the ciphertext.

Take away: Modifications to ciphertext are undetected and have predictable impact on the plaintext.

Solution

Use a block cipher mode for these situations that includes message integrity checks, i.e., authenticated encryption.
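Example 2 is easy to demonstrate. In this sketch (pyca/cryptography, with AES-CTR standing in for any stream cipher or OTP), the attacker rewrites the message without ever touching the key:

    import os
    from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

    key, nonce = os.urandom(32), os.urandom(16)
    enc = Cipher(algorithms.AES(key), modes.CTR(nonce)).encryptor()
    ct = bytearray(enc.update(b"pay alice $100") + enc.finalize())

    # Attacker flips ciphertext bits; no key needed, and nothing detects it.
    for i in range(5):
        ct[4 + i] ^= b"alice"[i] ^ b"david"[i]

    dec = Cipher(algorithms.AES(key), modes.CTR(nonce)).decryptor()
    assert dec.update(bytes(ct)) + dec.finalize() == b"pay david $100"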

Liked this question of the week? Interested in reading more detail, and other answers? See the question in full. Have questions of a security nature of your own? Security expert and want to help others? Come and join us at security.stackexchange.com.

QoTW #38: What is SHA-3 – and why did we change it?

2012-10-12 by ninefingers. 0 comments

Lucas Kauffman selected this week’s Question of the Week: What is SHA3 and why did we change it? 

No doubt if you are at least a little bit curious about security, you’ll have heard of AES, the advanced encryption standard. Way back in 1997, when winters were really hard, our modems froze as we used them and Windows 98 had yet to appear, NIST saw the need to replace the then mainstream Data Encryption Standard with something resistant to the advances in cryptography that had occurred since its inception. So, NIST announced a competition and invited interested parties to submit algorithms matching the desired specification – the AES Process was underway.

A number of algorithms with varying designs were submitted for the process and three rounds held, with comments, cryptanalysis and feedback submitted at each stage. Between rounds, designers could tweak their algorithms if needed to address minor concerns; clearly broken algorithms did not progress. This was somewhat of a first for the crypto community – after the export restrictions and the so called “crypto wars” of the 90s, open cryptanalysis of published algorithms was novel, and it worked. We ended up with AES (and some of you may also have used Serpent, or Twofish) as a result.

Now, onto hashing. Way back in 1996, discussions were underway in the cryptographic community on the possibility of finding a collision within MD5. In practice, MD5 collisions were being exploited by 2005 to create fake certificate authorities. More recently, the FLAME malware used MD5 collisions to bypass Windows signature restrictions. Indeed, we covered this attack right here on the security blog.

The need for new hash functions has therefore been known for some time. To replace MD5, SHA-1 was released. However, like its predecessor, cryptanalysis began to reveal that its collision resistance required a less-than-brute-force search. Given that this eventually yields practical exploits that undermine cryptographic systems, a hash standard is needed that is resistant to finding collisions.

As of 2001, we have also had available to us SHA-2, a family of functions that as yet has survived cryptanalysis. However, SHA-2 is similar in design to its predecessor, SHA-1, and one might deduce that similar weaknesses may hold.

So, in response and in a similar vein to the AES process, NIST launched the SHA3 competition in 2007, in their words, in response to recent improvements in cryptanalysis of hash functions. Over the past few years, various algorithms have been analyzed and the number of candidates reduced, much like a reality TV show (perhaps without the tears, though). The final round algorithms essentially became the candidates for SHA3.

The big event this year is that Keccak has been announced as the SHA-3 hash standard. Before we go too much further, we should clarify some parts of the NIST process. The round an algorithm reaches determines the amount of cryptanalysis it will have received: the longer a function stays in the competition, the more analysis it faces. The report on round two candidates does not reveal any suggestion of breakage; however, NIST selected its final round candidates based on a combination of performance factors and safety margins. Respected cryptographer Bruce Schneier even suggested that perhaps NIST should consider adopting several of the finalist functions as suitable.

That’s the background, so I am sure you are wondering: how does this affect me? Well, here’s what you should take into consideration:

  • MD5 is broken. You should not use it; it has been used in practical exploits in the wild, if reports are to be believed – and even if they are not, there are alternatives.
  • SHA-1 is shown to be theoretically weaker than expected. It is possible it may become practical to exploit it. As such, it would be prudent to migrate to a better hash function.
  • In spite of concerns, the family of SHA-2 functions has thus far survived cryptanalysis. These are fine for current usage.
  • Keccak and selected other SHA-3 finalists will likely become available in mainstream cryptographic libraries soon. SHA-3 is approved by NIST, so it is fine for current usage.
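Those take-aways translate directly into code; both SHA-2 and SHA-3 already ship in the Python standard library, for example:

    import hashlib

    data = b"hello world"
    print(hashlib.sha256(data).hexdigest())    # SHA-2: fine for current usage
    print(hashlib.sha3_256(data).hexdigest())  # SHA-3 (Keccak): also fine
    # hashlib.md5(data), hashlib.sha1(data): legacy only; avoid them
    # wherever collision resistance matters.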

Liked this question of the week? Interested in reading it or adding an answer? See the question in full. Have questions of a security nature of your own? Security expert and want to help others? Come and join us at security.stackexchange.com.

QOTW #34 – iMessage – what security features are present?

2012-09-07 by Terry Chia. 0 comments

Two weeks ago, a phishing vulnerability in the text messaging function of Apple’s iPhone was discovered by pod2g. The statement released by Apple said that iMessage, Apple’s proprietary instant messaging service, was secure, and suggested using it instead.

Apple takes security very seriously. When using iMessage instead of SMS, addresses are verified which protects against these kinds of spoofing attacks. One of the limitations of SMS is that it allows messages to be sent with spoofed addresses to any phone, so we urge customers to be extremely careful if they’re directed to an unknown website or address over SMS.

This prompted a rather large interest in the security community over how secure iMessage really is, given that the technology behind it is not public information. There have been several blog posts and articles about the topic, including one right here on security.stackexchange – The inner workings of iMessage security?

The answer given by dr jimbob provided a few clues, sourced from several links.

It appears that the connection between the phone and Apple’s servers is encrypted with SSL/TLS (unclear which version), using a certificate self-signed by Apple with a 2048-bit RSA key.

The iMessage service has been partially reverse engineered. More information can be found here and here.

My thoughts – Is iMessage truly more secure?

Compared to traditional SMS, yes. I do consider iMessage more secure. For one thing, it uses SSL/TLS to encrypt the connection between the phone and Apple’s servers. Compared to the A5/1 cipher used to encrypt SMS communications, this is much more secure.

However, iMessage still should not be used to send sensitive information. All data so far indicates that the messages are stored in plaintext on Apple’s servers. This presents several vulnerabilities. Apple, or anyone able to compromise Apple’s servers, would be able to read your messages – for as long as they’re cached.

Treat iMessage as you would emails or SMS communications. It is safe enough for daily usage, but highly sensitive information should not be sent through it.

References:

http://en.wikipedia.org/wiki/IMessage

http://www.pod2g.org/2012/08/never-trust-sms-ios-text-spoofing.html

http://www.engadget.com/2012/08/18/apple-responds-to-iphone-text-message-spoofing/

http://blog.cryptographyengineering.com/2012/08/dear-apple-please-set-imessage-free.html

http://imfreedom.org/wiki/IMessage

https://github.com/meeee/pushproxy

Liked this question of the week? Interested in reading it or adding an answer? See the question in full. Have questions of a security nature of your own? Security expert and want to help others? Come and join us at security.stackexchange.com.

Why passwords should be hashed

2011-11-01 by Thomas Pornin. 3 comments

How passwords should be hashed before storage or usage is a very common question, always triggering passionate debate. There is a simple and comprehensive answer (use bcrypt, but PBKDF2 is not bad either) which is not the end of the question since theoretically better solutions have been proposed and will be worth considering once they have withstood the test of time (i.e. “5 to 10 years in the field, and not broken yet”).

The less commonly asked question is:

why should a password be hashed?

This is what this post is about.

Encryption and Hashing

A first thing to note is that there are many people who talk about encrypted passwords but really mean hashed passwords. An encrypted password is like anything else which has been encrypted: it has been rendered unreadable through a process which used an extra piece of secret data (the key) and which can be reversed with knowledge of the same key (or of a distinct, mathematically related key, in the case of asymmetric encryption). For password hashing, on the other hand, there is no key. The hashing process is like a meat grinder: everybody can operate it, but there is no way to get your cow back in full moo-ing state, whereas encryption would be akin to locking the cow in a stable. Cryptographic hash functions are functions which anybody can compute, efficiently, over arbitrary inputs. They are deterministic (same input yields same output, for everybody).

In shorter words: if MD5 or SHA-1 is involved, this is password hashing, not password encryption. Let’s use the correct term.

Once hashed, the password is still quite useful, because even though the hashing process is not reversible, the output still contains the “essence” of the hashed password and two distinct passwords will yield, with very high probability (i.e. always, in practice), two distinct hashed values (that’s because we are talking about cryptographic hash functions, not the other kind). And the hash function is deterministic, so you can always rehash a putative password and see if the result is equal to a given hash value. Thus, a hashed password is sufficient to verify whether a given password is correct or not.

This still does not tell us why we would hash a password, only that hashing a password does not forfeit the intended usage of authenticating users.

To Hash or Not To Hash?

Let’s consider the following scenario: you have a Web site, with users who can “sign in” by showing their name and password. Once signed in, users gain “extra powers” such as reading and writing data. The server must then store “something” which can be used to verify user passwords. The most basic “something” consists of the passwords themselves. Presumably, the passwords would be stored in a SQL database, probably along with whatever data is used by the application.

The bad thing about such “cleartext” storage of passwords is that it induces a vulnerability in an attack model where the attacker gets read-only access to the server data. If that data includes the user passwords, then the villain could use these passwords to sign in as any user and get the corresponding powers, including any write access that valid users may have. This is an edge case (the attacker can read the database but not write to it). However, this is a realistic edge case: unwanted read access to parts of a Web server database is a common consequence of an SQL injection vulnerability. This really happens.

Also, the passwords themselves can be a prized spoil of war: users are human beings, they tend to reuse passwords across several systems. Or, on the more trivial aspect of things, many users choose as password the name of their girlfriend/boyfriend/dog. So knowing the password for a user on a given site has a tactical value which extends beyond that specific site, and even having an accidental look at the passwords can be embarrassing and awkward for the most honest system administrator.

Storing only hashed passwords solves these problems as best as one can hope for. It is unavoidable that a complete dump of the server data yields enough information to “try” passwords (that’s an “offline dictionary attack”) because the dump allows the attacker to “simulate” the complete server on his own machines, and try passwords at his leisure. We just want to ensure that the attacker has no faster strategy. A hash function is the right tool for that. In full detail, the hashing process should include a per-password random salt (stored along with the hashed value) and be appropriately slow (through thousands or millions of nested iterations), but that’s not the subject of this post. Just use bcrypt.
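A minimal sketch of that advice with the bcrypt package (an assumption: pip install bcrypt). The per-password salt is generated and stored inside the hash string itself, and the cost factor supplies the deliberate slowness:

    import bcrypt

    stored = bcrypt.hashpw(b"hunter2", bcrypt.gensalt(rounds=12))

    # At login, re-hash the candidate and compare:
    assert bcrypt.checkpw(b"hunter2", stored)
    assert not bcrypt.checkpw(b"wrong guess", stored)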

Summary: we hash passwords to prevent an attacker with read-only access from escalating to higher power levels. Password hashing will not make your Web site impervious to attacks; it will still be hacked. Password hashing is damage containment.

A drawback of password hashing is that since you do not store the passwords themselves (but only a piece of data which is sufficient to verify a password without being able to recover it), you cannot send their passwords back to users who have forgotten them. Instead, you must select a new random password for them, or let them choose a new password. Is that an issue? One could say that since the user forgot his old password, that password was not easy to remember, so changing it is not a bad thing after all. Users are accustomed to such a procedure. It may surprise them if you are able to send them back their password. Some of them might even frown upon your lack of security savviness if you thus demonstrate that you do not hash the stored passwords.

Storing secrets in software

2011-09-06 by ninefingers. 0 comments

This question comes up on Stack Overflow and IT Security relatively regularly, and goes along one of these lines:

  1. I have a symmetric encryption key I would like to store in my application so attackers can’t find it.
  2. I have an asymmetric encryption key I would like to store in my application so attackers can’t find it.
  3. I would like to store authentication details of some kind in my application so attackers can’t find them.
  4. I have developed an algorithm. How do I make it so attackers can never find it?
  5. I am selling commercial music. I need to make it so The Nasty Pirates can’t decode it.

Firstly, a review of what cryptography is at heart: encryption and decryption are all about sending data over untrusted networks, or storing data in untrusted places, such that only the intended recipient can read that information. This gives you confidentiality; cryptography as a whole also aims to provide integrity checks through signatures and message digests. Cryptography is often confused with access control, in which it so often plays a part. Cryptography ceases to be able to protect you when you decide to put the key and the encrypted data together – at that point, the data has reached its destination, the trusted place. The expectation that cryptography can protect data once it is decrypted is similar to the expectation that a locked door will protect your house if you leave the key under the mat. The act of accessing the key and decrypting the data (and even encrypting it) is a weak link in the chain: it assumes the system you’re performing these actions on guarantees your confidentiality and integrity – it assumes that system is trusted. Reading the argument presented here, our resident cryptographer provided the following explanation: “Encryption does not create confidentiality, it just concentrates confidentiality into the key. Presumably, it is easier to keep confidential a small key of fixed size, and the key uniform structure allows for the key confidentiality to be measured. Yet you have to start confidentiality at something. Once the key is known, confidentiality has left.”

All of the above questions are really forms of the same thing: how do I safely decode some encrypted data on an untrusted system, without interception? In other words, you’re now asking cryptography to guarantee the security of that information even after you’ve decrypted it. This isn’t possible. The problem then becomes one of how you ensure that the system in question will maintain confidentiality and integrity for you.

The answer, then, is to create a system which acts as the recipient such that data can be decoded in it and never needs to be transferred outside of it. I’ll call it the black box. The rest of this blog post will be about looking for the black box setup.

  1. We will begin with the idea that we want to write a program that stores something securely whilst preventing the user from accessing that information. So the first place people want to do this is in their source code. Which is fine, except source code can be disassembled. Yes, there are ways to make this more difficult, but it is not possible to prevent it. The same goes for hiding or stashing files around the system, since a cursory analysis with the right tools will tell you exactly where to look. I should add that disassembly prevention probably makes your software less stable and/or portable.
  2. The next option is to coerce the system into helping you hide your information. In any form, this will essentially look like a rootkit. This is dangerous: you may well have your application categorised as malware, for starters, but more importantly at this stage it is very easy to introduce extra vulnerability into the system you have just hooked. You could also crash your customer’s system, which is “not cool” whichever way you look at it. Finally, whilst rootkits are difficult to remove, taking a live listing of files and then an offline one is not. Rootkits can be found and their installation prevented. In fact, a cunning reverse engineer might replace your rootkit with their own equivalent, stealing the information you send it. Handy.
  3. Stage 3 is one for the slashdot home page: the operating system vendor is complicit in helping you. Unless you are a music giant I suspect this probably is not an option for you. A complicit operating system is harder to circumvent, but entirely possible. All you need is access to ring 0 (for people not familiar with the term, ring 0 is the mode where the processor will not stop you lifting any restrictions placed upon memory or code. You can ignore read only page checks, rewrite chunks of kernel memory, whatever you want to do). Depending on the system, this might be difficult to achieve, but it certainly is not impossible. Another route to this stage is to compromise the boot process. You can pass the OS the correct validation codes if it checks anything. Etc. Clearly, this stage is difficult to pull off and requires time/effort, but it can be done.
  4. So, OS security not enough? Hardware then. At this stage you actually have a real black box (or chip) somewhere on the computer. Of course, if the OS has responses sent to it, see stages 1-3. So now your black box needs to talk to all your other hardware, like your monitor or speaker system. Oh, and they need to be intercept-proof too – maybe they can’t be trusted either and are really decoding the data straight back to the hard disk. This might sound far-fetched, but High-bandwidth Digital Content Protection (wikipedia) aims to provide exactly this kind of protection.

The move/counter-move sequence carries on – so what is to stop a hardware engineer taking your box apart? Self-destructing hardware? Ok, how do you prevent the fact that you’ll display it on a monitor, sending light-waves and electronic signals out?

What does this mean for software on the PC?

  1. You cannot store passwords, encryption keys etc in your code, or anywhere on the system. The only way around that is to allow for user input and aim to prevent interception, which could in itself be difficult.
  2. License keys do not work either, for the same reason.
  3. Nor does obfuscated code. It has to be decrypted to execute, at some point.

However, there is a case where hardware-complicit defences are perfectly possible and may well be encouraged: where hardware and software come together. There are many scenarios in which this happens, the most obvious being the mobile phone. A mobile phone is, relatively speaking, hard to take apart and put back together again whole (unless you’re a mobile phone engineer), so it is possible to have hardware-based security work reasonably effectively. Smart cards with on-board cryptographic functions again ensure the keys are exceptionally hard to steal. In the case of smart cards, the boundary problem still exists in terms of transferring the decrypted data back to the untrusted system, but on a mobile phone-like device it would be entirely feasible not to route that information through the OS itself and instead play it directly on the screen, isolating those buffers from the “untrusted” sections of the OS. However, let’s leave that idea here. Trusted platforms is a blog post for another day.

Is there a case for devices capable of such secure display? Absolutely. Want to securely read data on a financial transaction, or have a trusted communication channel with your bank? Upping the bar like this definitely helps protect against malware and other interception threats. However, the most common use case appears to be “how do I defend my asset in an untrusted environment in order to enforce my desired price model?” Which brings me full circle – that is not the problem set cryptography solves.

When people say secure, they often mean “impossible for bad guy, possible for me”. That is almost never the case. Clearly, looking at the above, the number of people skilled enough to counter point 3 is actually pretty small. You will, therefore, achieve part of this aim: “hard for bad guy, easyish for me”. When people say secure, they also frequently mean technical security measures. Security is more than just technical security – that’s why we have policies, community, law, education, awareness etc. Going as far as points 2/3 reduces the number of people capable of subverting your protection system to the point where legal action is feasible.

However, I think we still have a narrow definition of security in the first place. If you are talking about delivering content or software, the level of issues a customer is likely to come up against increases exponentially as you move from point 1 through to point 4. Does your customer really want to be told that you do not support Windows 8 yet? Or that they cannot use their favourite media player for your content? Or that they need to upgrade their BigMediaCorpSatelliteTVBox because you altered the algorithms? Or that they’ve been locked out of that Abba Specials subscription channel they paid for because of a hardware fault, or… Security here is not just about protecting your content, it is about protecting your business. Will the cost of securing the content using any of 1-4 make up for the potential revenue you could have made if everyone bought the copy legally? Or will the number of legitimate sales simply go down as consumers react to all those technical barriers shattering their plug-and-play expectation? I do not have any numbers on that one, but I am willing to bet the net result of adding this extra “security” is a loss.

Security needs to be appropriate to the risk of the situation presented, having been fully evaluated from all angles. The general consensus is that DRM is an excessive protection measure given the risks involved – indeed, a very simple solution is to make software/music/whatever at the right price point and value such that the vast majority of users buy it, and allow for the fact that some people will pirate/steal/make unauthorised use of it. In certain situations, providing a value add around the product can make the difference (some companies selling open source software generate their entire revenue using this model), and even persuade users of illegal copies to buy a licence in order to gain access to these services (e.g. support, upgrades).