Posts Tagged ‘hash’

QoTW #38: What is SHA-3 – and why did we change it?

2012-10-12 by ninefingers. 0 comments

Lucas Kauffman selected this week’s Question of the Week: What is SHA3 and why did we change it? 

No doubt if you are at least a little bit curious about security, you’ll have heard of AES, the advanced encryption standard. Way back in 1997, when winters were really hard, our modems froze as we used them and Windows 98 had yet to appear, NIST saw the need to replace the then mainstream Data Encryption Standard with something resistant to the advances in cryptography that had occurred since its inception. So, NIST announced a competition and invited interested parties to submit algorithms matching the desired specification – the AES Process was underway.

A number of algorithms with varying designs were submitted for the process and three rounds held, with comments, cryptanalysis and feedback submitted at each stage. Between rounds, designers could tweak their algorithms if needed to address minor concerns; clearly broken algorithms did not progress. This was somewhat of a first for the crypto community – after the export restrictions and the so called “crypto wars” of the 90s, open cryptanalysis of published algorithms was novel, and it worked. We ended up with AES (and some of you may also have used Serpent, or Twofish) as a result.

Now, onto hashing. Way back in 1996, discussions were underway in the cryptographic community on the possibility of finding a collision within MD5. Practically MD5 started to be commonly exploited in 2005 to create fake certificate authorities. More recently, the FLAME malware used MD5 collisions to bypass Windows signature restrictions. Indeed, we covered this attack right here on the security blog.

The need for new hash functions has been known for some time, therefore. To replace MD5, SHA-1 was released. However, like its predecessor, cryptanalysis began to reveal that its collision resistance required a less-than-bruteforce search. Given that this eventually yields practical exploits that undermine cryptographic systems, a hash standard is needed that is resistant to finding collisions.

As of 2001, we have also had available to us SHA-2, a family of functions that as yet has survived cryptanalysis. However, SHA-2 is similar in design to its predecessor, SHA-1, and one might deduce that similar weaknesses may hold.

So, in response and in a similar vein to the AES process, NIST launched the SHA3 competition in 2007, in their words, in response to recent improvements in cryptanalysis of hash functions. Over the past few years, various algorithms have been analyzed and the number of candidates reduced, much like a reality TV show (perhaps without the tears, though). The final round algorithms essentially became the candidates for SHA3.

The big event this year is that Keccak has been announced as the SHA-3 hash standard. Before we go too much further, we should clarify some parts of the NIST process. Depending on the round an algorithm has reached determines the amount of cryptanalysis it will have received – the longer a function stays in the competition, the more analysis it faces. The report of round two candidates does not reveal any suggestion of breakage; however, NIST has selected its final round candidates based on a combination of performance factors and safety margins. Respected cryptographer Bruce Schneier even suggested that perhaps NIST should consider adopting several of the finalist functions as suitable.

That’s the background, so I am sure you are wondering: how does this affect me? Well, here’s what you should take into consideration:

  • MD5 is broken. You should not use it; it has been used in practical exploits in the wild, if reports are to be believed – and even if they are not, there are alternatives.
  • SHA-1 is shown to be theoretically weaker than expected. It is possible it may become practical to exploit it. As such, it would be prudent to migrate to a better hash function.
  • In spite of concerns, the family of SHA-2 functions has thus far survived cryptanalysis. These are fine for current usage.
  • Keccak and selected other SHA-3 finalists will likely become available in mainstream cryptographic libraries soon. SHA-3 is approved by NIST, so it is fine for current usage.

Liked this question of the week? Interested in reading it or adding an answer? See the question in full. Have questions of a security nature of your own? Security expert and want to help others? Come and join us at security.stackexchange.com.

QotW #19: Why can’t a password hash be reverse engineered?

2012-02-24 by ninefingers. 0 comments

Are you a systems administrator of professional computer systems? Well, serverfault is where you want to be and that’s where this week’s question of the week came from.

New user mucker wanted to understand why, if hashing is just an algorithm, it cannot simply be reverse engineered. A fair question and security.SE as usual did not disappoint.

Since I’m a moderator on crypto.se this question is a perfect fit to write up, so much so I’m going to take a slight detour and define some terms for you and a little on how hashes work.

A background on the internals of hash functions

First, an analogy for hash functions. A hash function works one block of data at a time – so when you hash your large file, the hash function takes so many blocks (depending on the algorithm) at once. It has an initial state – i.e. configuration – which is why sha of nothing actually has a value. Then, each set of incoming blocks alter those values. (Side note: collision resistance is achieved like this).

The analogy in this case is like a bike lock with twisty bits on. Imagine the default state is “1234” and every time you get a number, you alter each of the digits according to the input. When you’ve processed all of the incoming blocks, you then read the number you have in front of you. Hash functions work in a similar way – the state is an array and individual parts of it are shifted, xor’d etc depending on each incoming block. See the linked articles above for more.

Then, we can define input and output of two things: one instance of the hash function has inputs and outputs, as does the overall process of passing all your data through the hash function.

The answers

The top answer from Dietrich Epp is excellent – a simple example was provided of a function – in this case multiplication – which one can do easily forwards (O(N^2)) but that becomes difficult backwards. Factoring large numbers, especially ones with large prime factors, is a famous “hard problem”. Hash functions rely on exactly this property: it is not that they cannot be inverted, it is just that they are hard.

Before migration, Serverfault user Coredump also provided a similar explanation. Some interesting debate came up in the comments of this answer – user nealmcb observed that actually collisions are available in abundance. To go back to the mathsy stuff – the number of inputs is every possible piece of data there is, whereas for outputs we only have 256 bits of data.  So, there are many really long passwords that map to each valid hash value, but that still doesn’t help you find them.

Neal then answered the question himself to raise some further important issues – from a security perspective, it is important to not think of hashes as “impossible” to reverse.  At best they are “hard”, and that is true only if the hash is expertly designed.   As Neal alludes to, breaking hashes often involves significant computing power and dictionary attacks, and might be considered, to steal his words, “messy” (as opposed to a pretty closed-form inverted function) but it can be done.  And all-too-often, it is not even “hard”, as we see with both the famously bad LanMan hash that the original poster mentioned, and the original MYSQL hash.

Several other answers also provided excellent explanations – one to note from Mikeazo that in practise, hash functions are many to one as a result of the fact there are infinite possible inputs, but a fixed number of outputs (hash strings). Luckily for us, a well designed hash function has a large enough output space that collisions aren’t a problem.

So hashes can be inverted?

As a final point on hash functions I’m going to briefly link to this question about the general justification for the security of block ciphers and hash functions. The answer is that even for the best common hashes, no, there is no guarantee of the hardness of reversing them – just as there is no cast iron guarantee products of large primes cannot be factored.

Liked this question of the week? Have questions of a security nature of your own? Security expert and want to help others? Come and join us at security.stackexchange.com.