ZM15 asked this interesting question just before Christmas over on Superuser. It came over to Security Stack Exchange for some security specific input and I was delighted to see it, as I have done a fair bit of work in the practical elements of securing communications – so this blog post may be a tad biased towards my experiences.
For those not in the know, Powerline ethernet is a technology which allows you to transmit ethernet over your existing mains wiring – which is very useful for buildings which aren’t suitable for running cabling, as all you need to do is pop one of these where you want to connect a computer or other ethernet enabled device and they will be able to route TCP/IP packets. There are some caveats of course, the signal really only works on a single phase, so if you have multiple phases in your house the signal may not travel from one to another, although as DBasnett commented, to get around this, commercial properties may inject the signal deliberately onto all phases.
Early Powerline adapters had very poor signal quality – noise on the mains caused many problems – but since then the technology has improved considerably, partly through increasing the signal strength, but also through improving the filters which allow you to separate signal from mains.
This is where the security problem lies – that signal can travel quite far down wires, and despite fuse boxes offering some resistance to signals, you can often find the signal is retrievable in the neighbour’s house. Damien answered:
I have experienced the signal bleed from my next door neighbor. I … could identify two other powerline adapters using the same network name. I got anywhere between 10 to 20Mbps of throughput between their adapters and mine. I was able to access their router, watch streaming video and see the computers on the network. I also noticed they had gotten IPs on my router also.
This prompted him to enable security.
Tylerl gave an excellent viewpoint, which is as accurate here as it has ever been:
Many of the more expensive network security disasters in IT have come from the assumption that “behind the firewall” everything is safe.
Here the assumption was that the perimeter of the house is a barrier, but it really isn’t.
Along even weirder lines, as is the way with any electrical signal, it will be transmitted to some degree from every wire that carries it, so if you have the right equipment you may be able to pick up the traffic from a vehicle parked on the street. This has long been an issue for organisations dealing in highly sensitive information, so various techniques have been developed to shield against these transmissions, however you are unlikely to have a Faraday cage built into your house. (See the article on TEMPEST over on Wikipedia or this 1972 NSA document for more information)
For similar wireless eavesdropping, read about keyboards, securing physical locations, this answer from Tom Leek and this one from Rook – all pointing out that to a determined attacker, there is not a lot the average person can do to protect themselves.
Well, unless you have attackers specifically targeting you, you shouldn’t be, as it is very straightforward to enable security that would be appropriate for most individuals, at least for the foreseeable future. TEMPEST shielding should not be necessary and if you do run Powerline ethernet:
Most Powerline adapters have a security option – simply encryption using a shared key. It adds a little overhead to each communication, but as you can now get 1Gb adapters, this shouldn’t affect most of us. If you need >1Gb, get your property wired.
Liked this question of the week? Have questions of a security nature of your own? Security expert and want to help others? Come and join us at security.stackexchange.com.
Are you a systems administrator of professional computer systems? Well, serverfault is where you want to be and that’s where this week’s question of the week came from.
New user mucker wanted to understand why, if hashing is just an algorithm, it cannot simply be reverse engineered. A fair question and security.SE as usual did not disappoint.
Since I’m a moderator on crypto.se this question is a perfect fit to write up, so much so I’m going to take a slight detour and define some terms for you and a little on how hashes work.
A background on the internals of hash functions
First, an analogy for hash functions. A hash function works one block of data at a time – so when you hash your large file, the hash function takes so many blocks (depending on the algorithm) at once. It has an initial state – i.e. configuration – which is why sha of nothing actually has a value. Then, each set of incoming blocks alter those values. (Side note: collision resistance is achieved like this).
The analogy in this case is like a bike lock with twisty bits on. Imagine the default state is “1234” and every time you get a number, you alter each of the digits according to the input. When you’ve processed all of the incoming blocks, you then read the number you have in front of you. Hash functions work in a similar way – the state is an array and individual parts of it are shifted, xor’d etc depending on each incoming block. See the linked articles above for more.
Then, we can define input and output of two things: one instance of the hash function has inputs and outputs, as does the overall process of passing all your data through the hash function.
The top answer from Dietrich Epp is excellent – a simple example was provided of a function – in this case multiplication – which one can do easily forwards (O(N^2)) but that becomes difficult backwards. Factoring large numbers, especially ones with large prime factors, is a famous “hard problem”. Hash functions rely on exactly this property: it is not that they cannot be inverted, it is just that they are hard.
Before migration, Serverfault user Coredump also provided a similar explanation. Some interesting debate came up in the comments of this answer – user nealmcb observed that actually collisions are available in abundance. To go back to the mathsy stuff – the number of inputs is every possible piece of data there is, whereas for outputs we only have 256 bits of data. So, there are many really long passwords that map to each valid hash value, but that still doesn’t help you find them.
Neal then answered the question himself to raise some further important issues – from a security perspective, it is important to not think of hashes as “impossible” to reverse. At best they are “hard”, and that is true only if the hash is expertly designed. As Neal alludes to, breaking hashes often involves significant computing power and dictionary attacks, and might be considered, to steal his words, “messy” (as opposed to a pretty closed-form inverted function) but it can be done. And all-too-often, it is not even “hard”, as we see with both the famously bad LanMan hash that the original poster mentioned, and the original MYSQL hash.
Several other answers also provided excellent explanations – one to note from Mikeazo that in practise, hash functions are many to one as a result of the fact there are infinite possible inputs, but a fixed number of outputs (hash strings). Luckily for us, a well designed hash function has a large enough output space that collisions aren’t a problem.
So hashes can be inverted?
As a final point on hash functions I’m going to briefly link to this question about the general justification for the security of block ciphers and hash functions. The answer is that even for the best common hashes, no, there is no guarantee of the hardness of reversing them – just as there is no cast iron guarantee products of large primes cannot be factored.
Liked this question of the week? Have questions of a security nature of your own? Security expert and want to help others? Come and join us at security.stackexchange.com.
How passwords should be hashed before storage or usage is a very common question, always triggering passionate debate. There is a simple and comprehensive answer (use bcrypt, but PBKDF2 is not bad either) which is not the end of the question since theoretically better solutions have been proposed and will be worth considering once they have withstood the test of time (i.e. “5 to 10 years in the field, and not broken yet”).
The less commonly asked question is:
why should a password be hashed?
This is what this post is about.
Encryption and Hashing
A first thing to note is that there are many people who talk about encrypted passwords but really mean hashed passwords. An encrypted password is like anything else which has been encrypted: it has been rendered unreadable through a process which used an extra piece of secret data (the key) and which can be reversed with knowledge of the same key (or of a distinct, mathematically related key, in the case of asymmetric encryption). For password hashing, on the other hand, there is no key. The hashing process is like a meat grinder: there is no key, everybody can operate it, but there is no way to get your cow back in full moo-ing state. Whereas encryption would be akin to locking the cow in a stable. Cryptographic hash functions are functions which anybody can compute, efficiently, over arbitrary inputs. They are deterministic (same input yields same output, for everybody).
Once hashed, the password is still quite useful, because even though the hashing process is not reversible, the output still contains the “essence” of the hashed password and two distinct passwords will yield, with very high probability (i.e. always, in practice), two distinct hashed values (that’s because we are talking about cryptographic hash function, not the other kind). And the hash function is deterministic, so you can always rehash a putative password and see if the result is equal to a given hash value. Thus, a hashed password is sufficient to verify whether a given password is correct or not.
This still does not tell us why we would hash a password, only that hashing a password does not forfeit the intended usage of authenticating users.
To Hash or Not To Hash ?
Let’s see the following scenario: you have a Web site, with users who can “sign in” by showing their name and password. Once signed in, users gain “extra powers” such as reading and writing data. The server must then store “something” which can be used to verify user passwords. The most basic “something” consists in the password themselves. Presumably, the passwords would be stored in a SQL database, probably along with whatever data is used by the application.
The bad thing about such “cleartext” storage of passwords is that it induces a vulnerability in the case of an attack model where the attacker could get a read-only access to the server data. If that data includes the user passwords, then the villain could use these passwords to sign in as any user and get the corresponding powers, including any write access that valid users may have. This is an edge case (attacker can read the database but not write to it). However, this is a realistic edge case. Unwanted read access to parts of a Web server database is a common consequence of an SQL injection vulnerability. This really happens.
Also, the passwords themselves can be a prized spoil of war: users are human beings, they tend to reuse passwords across several systems. Or, on the more trivial aspect of things, many users choose as password the name of their girlfriend/boyfriend/dog. So knowing the password for a user on a given site has a tactical value which extends beyond that specific site, and even having an accidental look at the passwords can be embarrassing and awkward for the most honest system administrator.
Storing only hashed passwords solves these problems as best as one can hope for. It is unavoidable that a complete dump of the server data yields enough information to “try” passwords (that’s an “offline dictionary attack”) because the dump allows the attacker to “simulate” the complete server on his own machines, and try passwords at his leisure. We just want that the attacker may not have any faster strategy. A hash function is the right tool for that. In full details, the hashing process should include a per-password random salt (stored along the hashed value) and be appropriately slow (through thousands or millions of nested iterations), but that’s not the subject of this post. Just use bcrypt.
Summary: we hash passwords to prevent an attacker with read-only access from escalating to higher power levels. Password hashing will not make your Web site impervious to attacks; it will still be hacked. Password hashing is damage containment.
A drawback of password hashing is that since you do not store the passwords themselves (but only a piece of data which is sufficient to verify a password without being able to recover it), you cannot send back their passwords to users who have forgotten them. Instead, you must select a new random password for them, or let them choose a new password. Is that an issue ? One could say that since the user forgot his old password, then that password was not easy to remember, so changing it is not a bad thing after all. Users are accustomed to such a procedure. It may surprise them if you are able to send them back their password. Some of them might even frown upon your lack of security savviness if you so demonstrates that you do not hash the stored passwords.
This week’s question of the week was asked by George Bailey, who wanted to know if it were possible to have a key for encryption that could not be used for decryption. This seems at first sight like a simple question, but underneath it there are some cryptographic truths that are interesting to look at.
Firstly, as our first answerer SteveS pointed out, the process of encrypting data according to this model is asymmetric encryption. Steve provided links to several other answers we have. First up from this list was asymmetric vs symmetric encryption. From our answers there, public key cryptography requires two keys, one that can only encrypt material and another which can decrypt material. As was observed in several answers, when compared to straightforward symmetric encryption, the requirement for the public key in public key cryptography creates a large additional burden that depends heavily on careful mathematics, while symmetric key encryption really relies on the confusion and diffusion principle outlined in Shannon’s 1949 Communication Theory of Secrecy Systems. I’ll cover some other points raised in answers later on.
A similarly excellent source of information is what are private and public key cryptography and where are they useful?
So that answered the “is it possible to have such a system” question; the next step is how. This question was asked on the SE network’s Crypto site – how does asymmetric encryption work?. In brief, in the most commonly used asymmetric encryption algorithm (RSA), the core element is a trapdoor function or permutation – a process that is relatively trivial to perform in one direction, but difficult (ideally, impossible, but we’ll discuss that in a minute) to perform in reverse, except for those who own some “insider information” — knowledge of the private key being that information. For this to work, the “insider information” must not be guessable from the outside.
This leads directly into interesting territory on our original question. The next linked answer was what is the mathematical model behind the security claims of symmetric ciphers and hash algorithms. Our accepted answer there by D.W. tells you everything you need to know – essentially, there isn’t one. We only believe these functions are secure based on the fact no vulnerability has yet been found.
The problem then becomes: are asymmetric algorithms “secure”? Let’s take RSA as example. RSA uses a trapdoor permutation, which is raising values to some exponent (e.g. 3) modulo a big non-prime integer (the modulus). Anybody can do that (well, with a computer at least). However, the reverse operation (extracting a cube root) appears to be very hard, except if you know the factorization of the modulus, in which case it becomes easy (again, using a computer). We have no actual proof that factoring the modulus is required to compute a cube root; but more of 30 years of research have failed to come up with a better way. And we have no actual proof either that integer factorization is inherently hard; but that specific problem has been studied for, at least, 2500 years, so easy integer factorization is certainly not obvious. Right now, the best known factorization algorithm is General Number Field Sieve and its cost becomes prohibitive when the modulus grows (current World record is for a 768-bit modulus). So it seems that RSA is secure (with a long enough modulus): breaking it would require to outsmart the best mathematicians in the field. Yet it is conceivable that a new mathematical advance may occur any day, leading to an easy (or at least easier) factorization algorithm. The basis for the security claim remains the same: smart people spent time thinking about it, and found no weakness.
Cryptography offers very few algorithms with mathematically proven security (e.g. One-Time Pad), let alone practical algorithms with mathematically proven security; none of them is an asymmetric encryption algorithm. There is no proof that asymmetric encryption can really exist. But there is no proof that hash functions exist, either, and it never prevented anybody from using hash functions.
Blog promotion afficionado Jeff Ferland provided some extra detail in his answer. Specifically, Jeff addressed which cipher setup should be used for actually encrypting the data, noting that the best setup for most real world scenarios is the combined use of asymmetric and symmetric cryptography as occurs in PGP, for example, where a transfer key encrypts the data using symmetric encryption and that key, a much smaller piece of data, can be effectively be protected by asymmetric encryption; this is often called “hybrid encryption”. The reason asymmetric encryption is not used throughout, aside from speed, is the padding requirement as Jeff himself and this question over on Crypto.SE discusses.
So in conclusion, it is definitely possible to have a key that works only for encryption and not for decryption; it requires mathematical structure, and faith in the difficulty of inverting some of these operations. However, using asymmetric encryption correctly and effectively is one of the biggest challenges in the security field; beyond the maths, private key storage, public key distribution, and key usage without leaking confidential information through careless implementation are very difficult to get right.
This question comes up on Stack Overflow and IT Security relatively regularly, and goes along one of these lines:
- I have a symmetric encryption key I would like to store in my application so attackers can’t find it.
- I have an asymmetric encryption key I would like to store in my application so attackers can’t find it.
- I would like to store authentication details of some kind in my application so attackers can’t find them.
- I have developed an algorithm. How do I make it so attackers can never find it.
- I am selling commercial music. I need to make it so The Nasty Pirates can’t decode it.
Firstly, a review of what cryptography is at heart: encryption and decryption are all about sending data over untrusted networks or storing data in untrusted places such that only the intended recipient can read that information. This gives you confidentiality; cryptography as a whole also aims to provide integrity checks through signatures and message digests. Confusingly, cryptography is often confused with access control, in which it so often plays a part. Cryptography ceases to be able to protect you when you decide to put the key and the encrypted data together – at this point, the data has reached its destination, the trusted place. The expectation that cryptography can protect data once it is decrypted is similar to the expectation that a locked door will protect your house if you leave the key under the mat. The act of accessing the key and decrypting the data (and even encrypting it) is a weak link in the chain: it assumes the system you’re performing these actions on guarantees your confidentiality and integrity – it assumes that system is trusted. Reading the argument presented here, our resident cryptographer provided the following explanation: “Encryption does not create confidentiality, it just concentrates confidentiality into the key. Presumably, it is easier to keep confidential a small key of fixed size, and the key uniform structure allows for the key confidentiality to be measured. Yet you have to start confidentiality at something. Once the key is known, confidentiality has left.”
All of the above questions are really forms of the same thing: how do I on an untrusted system safely decode some encrypted data without interception? In other words, you’re now asking for cryptography to guarantee the security of that information even after you’ve decrypted it. This isn’t possible. The problem then becomes one of how do you ensure that the system in question will maintain confidentiality and integrity for you.
The answer, then, is to create a system which acts as the recipient such that data can be decoded in it and never needs to be transferred outside of it. I’ll call it the black box. The rest of this blog post will be about looking for the black box setup.
- We will begin with the idea we want to write a program that stores something securely whilst preventing the user from accessing that information. So the first place people want to do this is in their source code. Which is fine, except source code can be disassembled. Yes, there are ways to make this more difficult but it is not possible to prevent. The same goes for hiding or stashing files around the system, since a cursory analysis with the right tools will tell you exactly where to look. I should add that disassembly prevention probably makes your software less stable and/or portable.
- The next option is to coerce the system into helping you hide your information. In any form, this will essentially look like a rootkit. This is dangerous: you may well have your application categorised as malware, for starters, but more importantly at this stage it is very easy to introduce extra vulnerability into the system you have just hooked. You could also crash your customer’s system, which is “not cool” whichever way you look at it. Finally, whilst rootkits are difficult to remove, taking a live listing of files and then an offline one is not. Rootkits can be found and their installation prevented. In fact, a cunning reverse engineer might replace your rootkit with their own equivalent, stealing the information you send it. Handy.
- Stage 3 is one for the slashdot home page: the operating system vendor is complicit in helping you. Unless you are a music giant I suspect this probably is not an option for you. A complicit operating system is harder to circumvent, but entirely possible. All you need is access to ring 0 (for people not familiar with the term, ring 0 is the mode where the processor will not stop you lifting any restrictions placed upon memory or code. You can ignore read only page checks, rewrite chunks of kernel memory, whatever you want to do). Depending on the system, this might be difficult to achieve, but it certainly is not impossible. Another route to this stage is to compromise the boot process. You can pass the OS the correct validation codes if it checks anything. Etc. Clearly, this stage is difficult to pull off and requires time/effort, but it can be done.
- So, OS security not enough? Hardware then. At this stage you actually have a real black box (or chip) somewhere on the computer. Of course, if the OS has responses sent to it, see stages 1-3. So now your black box needs to talk to all your other hardware, like your monitor or speaker system. Oh and they need to be intercept-proof too – maybe they can’t be trusted either and are really decoding the data straight back to the hard disk. This might sound far fetched, but High-bandwidth Digital Content Protection (wikipedia) aims to provide exactly this kind of protection.
The move/counter-move sequence carries on – so what is to stop a hardware engineer taking your box apart? Self-destructing hardware? Ok, how do you prevent the fact that you’ll display it on a monitor, sending light-waves and electronic signals out?
What does this mean for software on the PC?
- You cannot store passwords, encryption keys etc in your code, or anywhere on the system. The only way around that is to allow for user input and aim to prevent interception, which could in itself be difficult.
- License keys do not work either, for the same reason.
- Nor does obfuscated code. It has to be decrypted to execute, at some point.
However, there is a case where hardware complicit defences are perfectly possible and may well be encouraged, where the hardware and software combination come together. There are many scenarios in which this happens, the most obvious being the mobile phone. A mobile phone is, relatively speaking, hard to take apart and put back together again whole (unless you’re a mobile phone engineer) so it is possible to have hardware-based security work reasonably effectively. Smart cards with on-board cryptographic function again ensure the keys are exceptionally hard to steal. In the case of smart cards, the boundary problem still exists in terms of transferring the decrypted data back to the untrusted system, but on a mobile phone-like device it would be entirely feasible not to route that information through the OS itself and instead play it directly on the screen, isolating those buffers from the “untrusted” sections of the OS. However, let’s leave that idea here. Trusted platforms is a blog post for another day.
Is there a case for devices capable of such secure display? Absolutely. Want to securely read data on a financial transaction, or have a trusted communication channel with your bank? Upping the bar like this definitely helps protect against malware and other interception threats. However, the most common use case appears to be “how do I defend my asset in an untrusted environment in order to enforce my desired price model?” Which brings me full circle – that is not the problem set cryptography solves.
When people say secure, they often mean “impossible for bad guy, possible for me”. That is almost never the case. Clearly, looking at the above, the number of people skilled enough to counter 3 is actually pretty small. You will, therefore, achieve part of this aim: “hard for bad guy, easyish for me”. When people say secure, they also frequently mean technical security measures. Security is more than just technical security – that’s why we have policies, community, law, education, awareness etc. Going as far as points 2/3 reduces the number of people capable of subverting your protection system to the point where legal action is feasible.
However, I think we still have a narrow definition of security in the first place. If you are talking about delivering content or software, the level of issues a customer is likely to come up against increases exponentially as you move from 1 through to point 4. Does your customer really want to be told that you do not support Windows 8 yet? Or that they cannot use their favourite media player for your content? Or that they need to upgrade their BigMediaCorpSatelliteTVBox because you altered the algorithms? Or that they’ve been locked out of that Abba Specials subscription channel they paid for because of a hardware fault, or… Security here is not just about protecting your content, it is about protecting your business. Will the cost of securing the content using any of 1-4 will make up for the potential revenue you could have made if everyone brought the copy legally? Or will the number of legitimate sales simply go down as consumers react to all those technical barriers shattering their plug-and-play expectation? I do not have any numbers on that one, but I am willing to bet the net result of adding this extra “security” is a loss.
Security needs to be appropriate to the risk of the situation presented, having been fully evaluated from all angles. The general consensus is that DRM is an excessive protection measure given the risks involved – indeed, a very simple solution is to make software/music/whatever at the right price point and value such that the vast majority of users buy it, and allow for the fact that some people will pirate/steal/make unauthorised use of it. In certain situations, providing a value add around the product can make the difference (some companies selling open source software generate their entire revenue using this model), and even persuade users of illegal copies to buy a licence in order to gain access to these services (e.g. support, upgrades).
Bogus SSL certificates are not a new problem and incidents can be found going back over a decade. With a fake certificate for Google.com in the wild, they’re in the news again today. My last two posts have touched on the SSL topic and Moxie Marlinspike’s Convergence.io software is being mentioned in these news articles. At the same time, Dan Kaminski has been pushing for DNSSEC as a replacement for the SSL CA system. Last night, Moxie and Dan had it out 140 characters at a time on Twitter over whether DNSSEC for key distribution was wise.
I’m going to do two things with the discussion: I’m going to cover the discussion I said should be had about replacing the CA system, and I’m going to try and show a risk-based assessment to justify that.
The Risks of the Current System
Risk: Any valid CA chain can create a valid certificate. Currently, there are a number of root certificate authorities. My laptop lists 175 root certificates, and these include names like Comodo, DigiNotar, GoDaddy and Verisign. Any of those 175 root certificates can sign a trusted chain for any certificate and my system will accept it, even if it has previously seen a different unexpired certificate before and even if that certificate has been issued by a different CA. A greater exposure for this risk comes from intermediate certificates. These exist where the CA has signed a certificate for another party that has flags authorizing it to sign other keys and maintain a trusted path. While Google’s real root CA is Equifax, another CA signed a valid certificate that would be accepted by users.
Risk mitigation: DNSSEC. DNSSEC limits the exposure we see above. In the DNSSEC system, *.google.com can only be valid if the signature chain is the DNS root, then the “com.” domain. If Google’s SSL key is distributed through DNSSEC, then we know that the key can only be inappropriate if the DNS root or top-level domain was compromised or is mis-behaving. Just as we trust the properties of an SSL key to secure data, we trust it to maintain a chain, and from that we can say that there is no other risk of a spoofed certificate path if the only certificate innately trusted is the one belonging to the DNS root. Courtesy of offline domain signing, this also means that host a root server does not provide the opportunity to provide malicious responses even by modifying the zone files (they would become invalid).
Risks After Moving to DNSSEC
Risk: We distrust that chain. We presume from China’s actions that China has a direct interest in being able to track all information flowing over the Internet for its users. While the Chinese government has no control over the “com.” domain, they do control the “cn.” domain. Visiting google.cn. does not mean that one aims to let the Chinese government view their data. Many countries have enough resources to perform attacks of collusion where they can alter both the name resolution under their control and the data path.
Risk mitigation: Multiple views of a single site. Convergence.io is a tool that allows one to view the encryption information as seen from different sites all over the world. Without Convergence.io, an attacker needs to compromise the DNS chain and a point between you and your intended site. With Convergence.io, the bar is raised again so that the attacker must compromise both the DNS chain and the point between your target website and the rest of the world.
Weighing the Risks
The two threats we have identified require gaining a valid certificate and compromising the information path to the server. To be a successful attack, both must occur together. In the current system, the bar for getting a validly signed yet inappropriately issued certificate is too low. The bar for performing a Man-in-the-Middle attack on the client side is also relatively low, especially over public wireless networks. DNSSEC raises the certificate bar, and Convergence raises the network bar (your attack will be detected unless you compromise the server side).
That seems ideal for the risks we identified, so what are we weighing? Well, we have to weigh the complexity of Convergence. A new layer is being added, and at the moment it requires an active participation by the user. This isn’t weighing a risk so much as weighing how much we think we can benefit. We also must remember that Convergence doesn’t raise the bar for one who can perform a MITM attack that is between the target server and the whole world.
Weighing DNSSEC, Moxie drives the following risk home for it in key distribution: “DNSSEC provides reduced trust agility. We have to trust VeriSign forever.” I argue that we must already trust the DNS system to point us to the right place, however. If we distrust that system, be it for name resolution or for key distribution, the action is still the same — the actors must be replaced, or the resolution done outside of it. The counter Moxie’s statement in implementing DNSSEC for SSL key distribution is we’ve made the problem an entirely social one. We now know that if we see a fake certificate in the wild, it is because somebody in the authorized chain has lost control of their systems or acted maliciously. We’ve reduced the exposure to only include those in the path — a failure with the “us.” domain doesn’t affect the “com.” domain.
So we’re left with the social risk that we have a bad actor who is now clearly identified. Convergence.io can help to detect that issue even if it only enjoys limited deployment. We are presently ham-strung by the fact that any of a broad number of CAs and ever-broader number of delegates can be responsible for issuing “certified lies,” and that still needs to be reduced.
Making a Decision
Convergence.io is a bolt-on. It does not require anybody else to change except the user. It does add complexity that all users may not be comfortable with, and that may limit its utility. I see no additional exposure risk (in the realm of SSL discussion, not in the realm of somebody changing the source code illicitly) from using it. To that end, Moxie has released a great tool and I see no reason to not be a proponent of the concept.
Moxie still argues that DNSSEC has a potential downside. The two-part question is thus: is reducing the number of valid certification paths for a site to one an improvement? When we remove the risk of an unwanted certification of the site, is the new risk that we can’t drop a certifier a credible increase in risk? Because we must already trust the would-be DSNSEC certifier to direct us to the correct site, the technical differences in my mind are moot.
To put into a practical example: China as a country can already issue a “valid” certificate for Google. They can control any resolution of google.cn. Whether the control the sole key channel for google.cn. or any valid key channel for google.cn., you can’t reach the “true” server and certificate / key combination without trusting China. The solution for that, whether it be the IP address or the keypair is to hold them accountable or find another channel. DNSSEC does not present an additional risk in that regard. It also removes a lot of ugliness related to parsing X.509 certificates and codes (which includes such storied history as the “*” certificate that worked on anything before browser vendors hard-coded it out). Looking at the risks presented and arguments from both sides, I think it’s time to start putting secure channel keys in the DNS stream.
Today we investigate the problem of disposing of hardware storage devices (say, hard disks) which may contain sensitive data. The question which prompted this discussion “Is it enough to only wipe a flash drive once” is about Flash disks (SSD) and received some very good answers; here, we will try to look at the wider picture.
Confidentiality Issues and How to Avoid Them
Storage devices grow old; at some point we want to get rid of them, either because they broke down and need to be replaced, or because they became too small with regards to what we want to do with them. A storage device does not shrink over time, but our needs increase. Quite some time ago, I was sharing some disk space with about one hundred co-students, and the disk offered a hefty 120 megabytes. At that time, a colorful GIF picture was considered to be the top of technology. Today, we take for granted that we can store 2-hour long HD videos on a cellphone. The increase in disk space shows no sign of slowing down, so we have a steady stream of old disks (or USB sticks or SD cards or even ZIP/Jaz cartridges for the old timers — I will not delve into floppy disks) to dispose of. The problem is that all these pieces of hardware have been used quite liberally to store data, possibly confidential data. For the home user, think about the browser cookies, the saved passwords, the cryptographic private keys… In a business context, just about any data element could be of value for competitors.
This is a matter of confidentiality. There are two generic ways to deal with it: encryption, and destruction.
Disk encryption is about transforming data in a way which is reversible only with the knowledge of a given secret short-sized element called a key. We are talking about symmetric encryption here; the key is a simple sequence of, say, 128 bits chosen at random. The confidentiality issue is not totally obliterated, only severely reduced: supposedly, it is easier to deal with the secrecy of a sequence of 128 bits than that of 100 GB worth of data. Encryption can be done either by the disk itself, by the operating system, or by the application.
Application-based encryption is limited to what the application can control. For instance, it may have to fight a bit with the OS about virtual memory (that’s pure memory from the point of view of the application, but the OS is prone to write it to the disk anyway). Also, most applications do not have any encryption feature, and modifying all of them is out of the question (not even counting the closed-source ones, that’s just too much work).
OS disk encryption is more thorough, since it can be applied on the complete disk for all files from all applications. It has a few drawbacks, though:
- The computer must still be able to boot up; in particular, to read from the disk the code which is used to decrypt data. So there must be at least an unencrypted area on the disk. This can cause some system administration headaches.
- Performance may suffer, for an internal hard disk. An x86 Core2 CPU at 2.4 GHz can encrypt about 160 MB/s with AES (that’s what my PC does with the well-known OpenSSL crypto library): not only do some SSD go faster than that, I also have other things to do with my CPU (my OS is multitasking).
- For external media (USB sticks, Flash cards…), there can be interoperability issues. There is no well-established standard on disk encryption. You could transport some appropriate software on the media itself but most places will not let you install applications as easily.
Encryption on the hardware itself is easier, but you do not really know if the drive does it properly. Also, the drive must keep the key in some way, and you want it to “forget” that key when the media is decommissioned. As Jesper’s answer describes, good encrypted disks keep the key in NVRAM (i.e. RAM with a battery) and can be instructed to forget the key, but this can prove difficult if the disk is broken: if it does not respond to commands anymore, you cannot really be sure that the NVRAM got blanked.
So while encryption is the theoretically appropriate way to go, it is not complete (you still have to manage the confidentiality of the key) and, in practice, it is not easy. Most of all, it works only if it is applied from day one: encryption can do nothing about data which was written before encryption was envisioned at all. So let’s see what can be done to destroy data.
Wiping out data is a popular method; but popular does not necessarily mean efficient. The traditional wiping patterns rely on the idea that each data bit will be written exactly over the bit which was at the same logical emplacement on the disk: this was mostly true in 1996, but technology has evolved. Today’s hard disks do not have “track railings” to guide the read/write heads; instead, they use the data itself as guide. The net result is that the new data may be physically off the previous one by a small bit; the old data is still readable “on the edge”.
Also, modern hard disks do not have visibly bad sectors. Bad sectors still exist, but the disk transparently substitutes good sectors instead of bad sectors. This happens dynamically: when a disk writes some data on a sector and detects that the write operation did not go well, then it will allocate a new good sector from its spare area and do the write again there. From the point of view of the operating system, this is invisible; the only consequence is that the write took a few more milliseconds than could have been expected. However, the data has been written to the bad sector (admittedly, one or two bits of it may be wrong, but this leaves more than 4000 genuine bits) and since the sector is now marked as “bad”, it is forever inaccessible from the host computer. No amount of wiping can do anything about that.
On Flash, the same issue arises, multiplied a thousand times. Flash memory works by “blocks” (a few dozen kilobytes) in which two operations can be done:
- changing a bit from value 1 to value 0;
- erasing a whole block: all the bit blocks are set to 1.
A given block will endure only that many “erase” operations, so Flash devices use wear leveling techniques, in which write operations are scattered all about the place. The “Flash Translation Layer” is a standard wear leveling algorithm, designed to operate smoothly with a FAT filesystem. The wear leveling means that if you try to overwrite a file, the wiping pattern will be most of the time written elsewhere. Moreover, some blocks can be declared as “bad” and remapped to spare blocks, in a way similar to what magnetic hard disks do (only more often). So some data blocks just linger, forever unreachable from any software wiping.
The basic conclusion is that wiping does not work against a determined attacker. Simply overwriting the whole partition with zeros is enough to deter an attacker who will simply plug the disk in a computer: logically, a single write is enough to “remove” the data, and by working on the partition instead of files, you avoid any filesystem shenanigans. But if your enemies are so cheap that they will limit themselves to the logical layer, then you are lucky indeed.
@nealmcb’s question, How can I reliably erase all information on a hard drive? sparked some excellent discussion including NIST’s guidelines for Media Sanitisation.
Without encryption (does not work for data which is already there) and data wiping (does not work reliably, or at all), the remaining solution is physical destruction. It is not that easy, though; a good sledgehammer swing, for instance, though satisfying, is not very effective towards data destruction. After all, this is a hard disk. To destroy its contents, you have to remove the cover (there are often many screws, some of which hidden, and glue, and rivets in some models) and then extract the platters, which are quite rigid disks. A simple office shredder will choke on those (although they would easily munch through older floppy disks). The magnetic layer is not thick, so mechanical abrasion may do the trick: use a sander or a grinder. Otherwise, dipping the platters in concentrated sulfuric acid should work.
For Flash devices, things are simpler: that’s mostly silicon with small bits of copper or aluminium, and a plastic cover. Just burn it.
Bottom-line: media destruction requires resources. In a business environment, this could be a system administrator task, but it will involve extra manpower, safety issues (seriously, a geeky system administrator with access to an acid cauldron or a furnace, isn’t it a bit scary ?) and possibly environmental considerations.
Data security must be thought throughout the complete life cycle of storage devices. Whether you go crypto or physical, you must put some thought and resources into it. Most people will simply store old disks and hope for the problem to get away on its own accord.
Last week, we looked at the hardening tag. Today we are going back on a specific question : Does an established SSL connection mean a line is really secure?
Why did we pick up this question? Because almost everyone has heard of SSL, but many are not sure how it works and what it is used for.
Well, the first thing we need to talk about is history of the protocol.
Secure Sockets Layer
The Secure Sockets Layer (SSL) is a cryptographic protocol – now renamed to Transport Layer Security (or TLS). This protocol is designed to create eavesdropping-proof and tampering-proof communication over the Internet and other untrusted networks.
Originally developed by Netscape, the protocol came out in 1995 with its 2.0 version (1.0 was never publicly released). But SSL 2.0 came with some serious security flaws, which included a Man in the Middle vulnerability, which could allow an attacker to sit in the middle of an encrypted communication, unknown to either end, and decrypt the traffic. A new version was released in 1996 as SSL 3.0. The next step of the standard is SSL 3.1 also named TLS 1.0. Only a few improvements were made for this version, but enough to make TLS 1.0 and SSL 3.0 incompatible. The current version of TLS is the 1.2 release from 2008.
So what is it used for? Well many people use it on a regular basis. In fact, TLS is used on many websites to provide the HTTPS connections. But it can also encapsulate other protocols, like FTP, SMTP, NNTP or XMPP. You can even use it to secure an entire VPN as with OpenVPN.
So is it secure?
The question can’t be answered as is. It depends…
First thing to rule out is that you are not using SSL < 3.0 versions. Since they all showed flaws they should not be used now.
Secondly we must ensure the implementation is correct. This question discusses the SSL TLS Renegotiation Vulnerability which is present in some browser versions.
Next we must make sure the connection is using encryption. What? Yes: TLS supports a mode of NULL encryption.
From curiosity I’ve looked in the
about:config page of Firefox 5.x. Be relieved, all ssl2 settings and null encryption mode are disabled.
And finally you need to check that the connection has been established with regard to the protocol. You may be curious on what you could have done not to respect the protocol, let me explain:
The connection is established in multiple steps called a handshake.
- The client connects, and if it requires a secure connection it sends a list of supported ciphers.
- The server picks a cipher from the list (Usually the first in the client list which is compatible with the server list, not necesarrily the strongest), then it notifies the client.
- The server also sends back its identification, packed into a digital certificate. This certification contains the asymmetric public key of the server and it is signed by a Certificate authority.
- The client MAY contact this Certificate Authority to confirm the validity of the server’s public key. This is very important, but the cost of online verification makes it impractical. So usually, the browser embeds Certificate Authority public key to perform an online check of the server’s certificate.
- To generate the session keys, the client encrypts a random number using the server’s public key. Asymmetric cryptography makes deciphering the number without the private key infeasible.
- Now the client and the server have a shared secret they can use to derive the keys for the actual connection.
One of the key point in this scenario is the verification of the Certificate. Did you ever connect to a site and see your browser pop up a message about the certificate? Did you read the message? Usually those kind of security warning are here to tell you that the server certificate has expired, or that it is not signed by a trusted authority.
Theses warnings are critical! You may be subject to a man in the middle attack. By clicking : go to this site anyway, you are giving your browser the express command to trust the certificate you were presented. But, unless you had verified it from the Certificate Authority yourself and decided it should be trusted, you could have accepted a forged certificate by a compromised authority. Your connection is then no longer secure.
Does this means that respecting the protocol ensures your security? Well yes.
Provided you made all the verifications, and as AviD said in the today’s featured question:
While there are some mostly theoretical attacks on the cryptography of SSL[TLS], from my PoV its still plenty strong enough for almost all purposes, and will be for a long time.
But I should ponder that yes the connection is secure. And even if we will not turn into paranoiacs of security, one should always ask oneself what the server will be doing with the data provided. It is a good thing to send data on a secured connection, but this has no meaning if the other endpoint forwards them on unsecured lines.