Posts Tagged ‘cryptography’

QoTW #38: What is SHA-3 – and why did we change it?

2012-10-12 by ninefingers. 0 comments

Lucas Kauffman selected this week’s Question of the Week: What is SHA3 and why did we change it? 

No doubt if you are at least a little bit curious about security, you’ll have heard of AES, the advanced encryption standard. Way back in 1997, when winters were really hard, our modems froze as we used them and Windows 98 had yet to appear, NIST saw the need to replace the then mainstream Data Encryption Standard with something resistant to the advances in cryptography that had occurred since its inception. So, NIST announced a competition and invited interested parties to submit algorithms matching the desired specification – the AES Process was underway.

A number of algorithms with varying designs were submitted for the process and three rounds held, with comments, cryptanalysis and feedback submitted at each stage. Between rounds, designers could tweak their algorithms if needed to address minor concerns; clearly broken algorithms did not progress. This was somewhat of a first for the crypto community – after the export restrictions and the so called “crypto wars” of the 90s, open cryptanalysis of published algorithms was novel, and it worked. We ended up with AES (and some of you may also have used Serpent, or Twofish) as a result.

Now, onto hashing. Way back in 1996, discussions were underway in the cryptographic community on the possibility of finding a collision within MD5. Practically MD5 started to be commonly exploited in 2005 to create fake certificate authorities. More recently, the FLAME malware used MD5 collisions to bypass Windows signature restrictions. Indeed, we covered this attack right here on the security blog.

The need for new hash functions has been known for some time, therefore. To replace MD5, SHA-1 was released. However, like its predecessor, cryptanalysis began to reveal that its collision resistance required a less-than-bruteforce search. Given that this eventually yields practical exploits that undermine cryptographic systems, a hash standard is needed that is resistant to finding collisions.

As of 2001, we have also had available to us SHA-2, a family of functions that as yet has survived cryptanalysis. However, SHA-2 is similar in design to its predecessor, SHA-1, and one might deduce that similar weaknesses may hold.

So, in response and in a similar vein to the AES process, NIST launched the SHA3 competition in 2007, in their words, in response to recent improvements in cryptanalysis of hash functions. Over the past few years, various algorithms have been analyzed and the number of candidates reduced, much like a reality TV show (perhaps without the tears, though). The final round algorithms essentially became the candidates for SHA3.

The big event this year is that Keccak has been announced as the SHA-3 hash standard. Before we go too much further, we should clarify some parts of the NIST process. Depending on the round an algorithm has reached determines the amount of cryptanalysis it will have received – the longer a function stays in the competition, the more analysis it faces. The report of round two candidates does not reveal any suggestion of breakage; however, NIST has selected its final round candidates based on a combination of performance factors and safety margins. Respected cryptographer Bruce Schneier even suggested that perhaps NIST should consider adopting several of the finalist functions as suitable.

That’s the background, so I am sure you are wondering: how does this affect me? Well, here’s what you should take into consideration:

  • MD5 is broken. You should not use it; it has been used in practical exploits in the wild, if reports are to be believed – and even if they are not, there are alternatives.
  • SHA-1 is shown to be theoretically weaker than expected. It is possible it may become practical to exploit it. As such, it would be prudent to migrate to a better hash function.
  • In spite of concerns, the family of SHA-2 functions has thus far survived cryptanalysis. These are fine for current usage.
  • Keccak and selected other SHA-3 finalists will likely become available in mainstream cryptographic libraries soon. SHA-3 is approved by NIST, so it is fine for current usage.

Liked this question of the week? Interested in reading it or adding an answer? See the question in full. Have questions of a security nature of your own? Security expert and want to help others? Come and join us at security.stackexchange.com.

QoTW #37: How does SSL work ?

2012-10-05 by Thomas Pornin. 0 comments

In this week’s question, we will talk about SSL. This question was asked by @Polynomial, who noticed that our site did not have yet a generic question on how SSL works. There were already some questions on the concept of SSL, but nothing really detailed.

Three answers were given, one by @Luc, and two by myself (because I got really verbose and there is a size limit on answers). The three answers concentrated on distinct aspects of SSL; together, they can explain why SSL works: SSL appears to be decently secure and we can see how this is achieved.

My first answer is a painfully long description of the detailed protocol as it appears on the wire. I wrote it as an introduction to the intricacies of the protocol; what information it contains must be known if you want to understand the details of the cryptographic attacks which have been tried on SSL. This is not much more than RFC-reading, but I still made an effort to merge four RFC (for the four protocol versions: SSLv3, TLS 1.0, 1.1 and 1.2) into one text which should be readable linearly. If you plan on implementing your own SSL client or server (a very instructive exercise, which I recommend for its pedagogical virtues), then I hope that this answer will be a useful reading guide for the actual standards.

What the protocol description shows is that at one point, during the initial steps of the SSL connection (the “handshake”), the server sends its “certificate” to the client (actually, a certificate chain), and then, a few steps later, the client appears to have gained some knowledge of the server’s public key, with which asymmetric cryptography is then performed. The SSL/TLS protocol handles these certificates as opaque blobs. What usually happens is that the client decodes the blobs as X.509 certificates and validates them with regards to a set of known trust anchors. The validation yields the server’s public key, with some guarantee that it really is the key owned by the intended server.

This certificate validation is the first foundation of SSL, as it is used for the Web (i.e. HTTPS). @Luc’s answer contains clear explanations on why certificates are used, and on what the guarantees rely on. Most enlightening is this excerpt:

You have to trust the CA not to make certificates as they please. When organizations like Microsoft, Apple and Mozilla trust a CA though, the CA must have audits; another organization checks on them periodically to make sure everything is still running according to the rules.

So the whole system relies on big companies checking on each other. Some of the trusted CA are governmental (from various governments) but the most often used are private business (e.g. Thawte, Verisign…). An important point to make is that it suffices to subvert or corrupt one CA to get a fake certificate which will be trusted worldwide; so this really is as robust as the weakest of the hundred or so trusted CA which browser vendors include by default. Nevertheless, it works quite well (attacks on CA are rare).

Note that since the certificate parts are quite isolated in the protocol itself (the certificates are just opaque blobs), SSL/TLS can be used without certificate validation in setups where the client “just knows” the server’s public key. This happens a lot in closed environments, such as embedded systems which talk to a mother server. Also, there are a few certificate-less cipher suites, such as the ones which use SRP.

This brings us to the second foundation of SSL: its intricate usage of cryptographic algorithms. Asymmetric encryption (RSA) or key exchange (Diffie-Hellman, or an elliptic curve variant), symmetric encryption with stream or block ciphers, hashing, message authentication codes (HMAC)… the whole paraphernalia is there. Assembling all these primitives into a coherent and secure protocol is not easy at all, and the history of SSL is a rather lengthy sequence of attacks and fixes. My second answer gives details on some of them. What must be remembered is that SSL is state of the art: every attack which has ever been conceived has been tried on SSL, because it is a high-value target. SSL got a lot of exposure, and its survival is testimony to its strength. Sure, it was occasionally harmed, but it was always salvaged. It is rather telling that the recent crop of attacks from Duong and Rizzo (ASP.NET padding oracles, BEAST, CRIME) are actually old attacks which Duong and Rizzo applied; their merit is not in inventing them (they didn’t) but in showing how practical they can be in a Web context (and masterfully did they show it).

From all of this we can list the reasons which make SSL work:

  • The binding between the alleged public key and the intended server is addressed. Granted, it is done with X.509 certificates, which have been designed by the Adversary to drive implementors crazy; but at least the problem is dealt with upfront.
  • The encryption system includes checked integrity, with a decent primitive (HMAC). The encryption uses CBC mode for block ciphers and the MAC is included in the MAC-then-encrypt way: both characteristics are suboptimal, and security with these choices requires special care in the specification (the need of random unpredictible IV, basis of the BEAST attack, fixed in TLS 1.1) and in the implementation (information leak through the padding, used in padding oracle attacks, fixed when Microsoft finally consented to notice the warning which was raised by Vaudenay in 2002).
  • All internal key expansion and checksum tasks are done with a specific function (called “the PRF”) which builds on standard primitives (HMAC with cryptographic hash functions).
  • The client and the server send random values, which are included in all PRF invocations, and protect against replay attacks.
  • The handshake ends with a couple of checksums, which are covered by the encryption+MAC layer, and the checksums are computed over all of the handshake messages (and this is important in defeating a lot of nasty things that an active attacker could otherwise do).
  • Algorithm agility. The cryptographic algorithms (cipher suites) and other features (protocol version, compression) are negotiated between the client and server, which allows for a smooth and gradual transition. This is how current browsers and servers can use AES encryption, which was defined in 2001, several years after SSLv3. It also facilitates recovery from attacks on some features, which can be deactivated on the client and/or the server (e.g. compression, which is used in CRIME).

All these characteristics contribute to the strength of SSL.

Liked this question of the week? Interested in reading it or adding an answer? See the question in full. Have questions of a security nature of your own? Security expert and want to help others? Come and join us at security.stackexchange.com.

QotW #11: Is it possible to have a key for encryption, that cannot be used for decryption?

2011-09-30 by ninefingers. 0 comments

This week’s question of the week was asked by George Bailey, who wanted to know if it were possible to have a key for encryption that could not be used for decryption. This seems at first sight like a simple question, but underneath it there are some cryptographic truths that are interesting to look at.

Firstly, as our first answerer SteveS pointed out, the process of encrypting data according to this model is asymmetric encryption. Steve provided links to several other answers we have. First up from this list was asymmetric vs symmetric encryption. From our answers there, public key cryptography requires two keys, one that can only encrypt material and another which can decrypt material. As was observed in several answers, when compared to straightforward symmetric encryption, the requirement for the public key in public key cryptography creates a large additional burden that depends heavily on careful mathematics, while symmetric key encryption really relies on the confusion and diffusion principle outlined in Shannon’s 1949 Communication Theory of Secrecy Systems. I’ll cover some other points raised in answers later on.

A similarly excellent source of information is what are private and public key cryptography and where are they useful?

So that answered the “is it possible to have such a system” question; the next step is how. This question was asked on the SE network’s Crypto site – how does asymmetric encryption work?. In brief, in the most commonly used asymmetric encryption algorithm (RSA), the core element is a trapdoor function or permutation – a process that is relatively trivial to perform in one direction, but difficult (ideally, impossible, but we’ll discuss that in a minute) to perform in reverse, except for those who own some “insider information” — knowledge of the private key being that information. For this to work, the “insider information” must not be guessable from the outside.

This leads directly into interesting territory on our original question. The next linked answer was what is the mathematical model behind the security claims of symmetric ciphers and hash algorithms. Our accepted answer there by D.W. tells you everything you need to know – essentially, there isn’t one. We only believe these functions are secure based on the fact no vulnerability has yet been found.

The problem then becomes: are asymmetric algorithms “secure”? Let’s take RSA as example. RSA uses a trapdoor permutation, which is raising values to some exponent (e.g. 3) modulo a big non-prime integer (the modulus). Anybody can do that (well, with a computer at least). However, the reverse operation (extracting a cube root) appears to be very hard, except if you know the factorization of the modulus, in which case it becomes easy (again, using a computer). We have no actual proof that factoring the modulus is required to compute a cube root; but more of 30 years of research have failed to come up with a better way. And we have no actual proof either that integer factorization is inherently hard; but that specific problem has been studied for, at least, 2500 years, so easy integer factorization is certainly not obvious. Right now, the best known factorization algorithm is General Number Field Sieve and its cost becomes prohibitive when the modulus grows (current World record is for a 768-bit modulus). So it seems that RSA is secure (with a long enough modulus): breaking it would require to outsmart the best mathematicians in the field. Yet it is conceivable that a new mathematical advance may occur any day, leading to an easy (or at least easier) factorization algorithm. The basis for the security claim remains the same: smart people spent time thinking about it, and found no weakness.

Cryptography offers very few algorithms with mathematically proven security (e.g. One-Time Pad), let alone practical algorithms with mathematically proven security; none of them is an asymmetric encryption algorithm. There is no proof that asymmetric encryption can really exist. But there is no proof that hash functions exist, either, and it never prevented anybody from using hash functions.

Blog promotion afficionado Jeff Ferland provided some extra detail in his answer. Specifically, Jeff addressed which cipher setup should be used for actually encrypting the data, noting that the best setup for most real world scenarios is the combined use of asymmetric and symmetric cryptography as occurs in PGP, for example, where a transfer key encrypts the data using symmetric encryption and that key, a much smaller piece of data, can be effectively be protected by asymmetric encryption; this is often called “hybrid encryption”. The reason asymmetric encryption is not used throughout, aside from speed, is the padding requirement as Jeff himself and this question over on Crypto.SE discusses.

So in conclusion, it is definitely possible to have a key that works only for encryption and not for decryption; it requires mathematical structure, and faith in the difficulty of inverting some of these operations. However, using asymmetric encryption correctly and effectively is one of the biggest challenges in the security field; beyond the maths, private key storage, public key distribution, and key usage without leaking confidential information through careless implementation are very difficult to get right.

QotW #4: How can you reliably wipe data from a storage device?

2011-08-03 by Thomas Pornin. 1 comments

Today we investigate the problem of disposing of hardware storage devices (say, hard disks) which may contain sensitive data. The question which prompted this discussion “Is it enough to only wipe a flash drive once” is about Flash disks (SSD) and received some very good answers; here, we will try to look at the wider picture.

Confidentiality Issues and How to Avoid Them

Storage devices grow old; at some point we want to get rid of them, either because they broke down and need to be replaced, or because they became too small with regards to what we want to do with them. A storage device does not shrink over time, but our needs increase. Quite some time ago, I was sharing some disk space with about one hundred co-students, and the disk offered a hefty 120 megabytes. At that time, a colorful GIF picture was considered to be the top of technology. Today, we take for granted that we can store 2-hour long HD videos on a cellphone. The increase in disk space shows no sign of slowing down, so we have a steady stream of old disks (or USB sticks or SD cards or even ZIP/Jaz cartridges for the old timers — I will not delve into floppy disks) to dispose of. The problem is that all these pieces of hardware have been used quite liberally to store data, possibly confidential data. For the home user, think about the browser cookies, the saved passwords, the cryptographic private keys… In a business context, just about any data element could be of value for competitors.

This is a matter of confidentiality. There are two generic ways to deal with it: encryption, and destruction.

Disk Encryption

Disk encryption is about transforming data in a way which is reversible only with the knowledge of a given secret short-sized element called a key. We are talking about symmetric encryption here; the key is a simple sequence of, say, 128 bits chosen at random. The confidentiality issue is not totally obliterated, only severely reduced: supposedly, it is easier to deal with the secrecy of a sequence of 128 bits than that of 100 GB worth of data. Encryption can be done either by the disk itself, by the operating system, or by the application.

Application-based encryption is limited to what the application can control. For instance, it may have to fight a bit with the OS about virtual memory (that’s pure memory from the point of view of the application, but the OS is prone to write it to the disk anyway). Also, most applications do not have any encryption feature, and modifying all of them is out of the question (not even counting the closed-source ones, that’s just too much work).

OS disk encryption is more thorough, since it can be applied on the complete disk for all files from all applications. It has a few drawbacks, though:

  • The computer must still be able to boot up; in particular, to read from the disk the code which is used to decrypt data. So there must be at least an unencrypted area on the disk. This can cause some system administration headaches.
  • Performance may suffer, for an internal hard disk. An x86 Core2 CPU at 2.4 GHz can encrypt about 160 MB/s with AES (that’s what my PC does with the well-known OpenSSL crypto library): not only do some SSD go faster than that, I also have other things to do with my CPU (my OS is multitasking).
  • For external media (USB sticks, Flash cards…), there can be interoperability issues. There is no well-established standard on disk encryption. You could transport some appropriate software on the media itself but most places will not let you install applications as easily.

Encryption on the hardware itself is easier, but you do not really know if the drive does it properly. Also, the drive must keep the key in some way, and you want it to “forget” that key when the media is decommissioned. As Jesper’s answer describes, good encrypted disks keep the key in NVRAM (i.e. RAM with a battery) and can be instructed to forget the key, but this can prove difficult if the disk is broken: if it does not respond to commands anymore, you cannot really be sure that the NVRAM got blanked.

So while encryption is the theoretically appropriate way to go, it is not complete (you still have to manage the confidentiality of the key) and, in practice, it is not easy. Most of all, it works only if it is applied from day one: encryption can do nothing about data which was written before encryption was envisioned at all. So let’s see what can be done to destroy data.

Data Wiping

Wiping out data is a popular method; but popular does not necessarily mean efficient. The traditional wiping patterns rely on the idea that each data bit will be written exactly over the bit which was at the same logical emplacement on the disk: this was mostly true in 1996, but technology has evolved. Today’s hard disks do not have “track railings” to guide the read/write heads; instead, they use the data itself as guide. The net result is that the new data may be physically off the previous one by a small bit; the old data is still readable “on the edge”.

Also, modern hard disks do not have visibly bad sectors. Bad sectors still exist, but the disk transparently substitutes good sectors instead of bad sectors. This happens dynamically: when a disk writes some data on a sector and detects that the write operation did not go well, then it will allocate a new good sector from its spare area and do the write again there. From the point of view of the operating system, this is invisible; the only consequence is that the write took a few more milliseconds than could have been expected. However, the data has been written to the bad sector (admittedly, one or two bits of it may be wrong, but this leaves more than 4000 genuine bits) and since the sector is now marked as “bad”, it is forever inaccessible from the host computer. No amount of wiping can do anything about that.

On Flash, the same issue arises, multiplied a thousand times. Flash memory works by “blocks” (a few dozen kilobytes) in which two operations can be done:

  • changing a bit from value 1 to value 0;
  • erasing a whole block: all the bit blocks are set to 1.

A given block will endure only that many “erase” operations, so Flash devices use wear leveling techniques, in which write operations are scattered all about the place. The “Flash Translation Layer” is a standard wear leveling algorithm, designed to operate smoothly with a FAT filesystem. The wear leveling means that if you try to overwrite a file, the wiping pattern will be most of the time written elsewhere. Moreover, some blocks can be declared as “bad” and remapped to spare blocks, in a way similar to what magnetic hard disks do (only more often). So some data blocks just linger, forever unreachable from any software wiping.

The basic conclusion is that wiping does not work against a determined attacker. Simply overwriting the whole partition with zeros is enough to deter an attacker who will simply plug the disk in a computer: logically, a single write is enough to “remove” the data, and by working on the partition instead of files, you avoid any filesystem shenanigans. But if your enemies are so cheap that they will limit themselves to the logical layer, then you are lucky indeed.

@nealmcb’s question, How can I reliably erase all information on a hard drive? sparked some excellent discussion including NIST’s guidelines for Media Sanitisation.

Media Destruction

Without encryption (does not work for data which is already there) and data wiping (does not work reliably, or at all), the remaining solution is physical destruction. It is not that easy, though; a good sledgehammer swing, for instance, though satisfying, is not very effective towards data destruction. After all, this is a hard disk. To destroy its contents, you have to remove the cover (there are often many screws, some of which hidden, and glue, and rivets in some models) and then extract the platters, which are quite rigid disks. A simple office shredder will choke on those (although they would easily munch through older floppy disks). The magnetic layer is not thick, so mechanical abrasion may do the trick: use a sander or a grinder. Otherwise, dipping the platters in concentrated sulfuric acid should work.

For Flash devices, things are simpler: that’s mostly silicon with small bits of copper or aluminium, and a plastic cover. Just burn it.

Bottom-line: media destruction requires resources. In a business environment, this could be a system administrator task, but it will involve extra manpower, safety issues (seriously, a geeky system administrator with access to an acid cauldron or a furnace, isn’t it a bit scary ?) and possibly environmental considerations.

Conclusion

Data security must be thought throughout the complete life cycle of storage devices. Whether you go crypto or physical, you must put some thought and resources into it. Most people will simply store old disks and hope for the problem to get away on its own accord.