A few weeks ago the anti-spam provider Spamhaus was hit by one of the biggest denial of service attacks ever seen, producing over 300 Gbit/s of traffic. Most of that traffic was generated with DNS amplification, a technique which doesn’t require thousands of infected hosts but instead exploits misconfigured DNS servers and a serious design flaw in DNS. We will discuss how this works, what it abuses and how Spamhaus was able to mitigate the attack.
A short refresher on the DNS protocol. There are two types of DNS servers:
- Authoritative Nameservers
- Resolving/caching Nameservers
The authoritative nameserver is the nameserver which is responsible for one or more domains. It knows the correct IP addresses for the names in its domains. If you need to know the IP that belongs to example.com, the DNS server responsible for example.com has to be consulted.
The resolving nameserver is a nameserver that clients can ask to resolve a certain domain. For instance, if a client needs to know the IP of example.com, it asks the resolving nameserver. The resolving nameserver then asks the authoritative nameserver (or a DNS server which can point it to the server that might know) which IP belongs to example.com. To put it a little simplistically, here’s an example:
- Computer: I need to know the IP of example.com, I shall query the configured Resolving DNS server A
- DNS server A: Computer needs to resolve example.com, I don’t know example.com, but I know who knows .com, I’ll ask DNS Server B
- DNS server B: I don’t know the IP of example.com, but I know the server who is responsible for example.com, it’s DNS Server C
- DNS server A: Hello DNS server C, what is the IP of example.com?
- DNS server C: Hello DNS server A, I know the IP for example.com, it’s 220.127.116.11
- DNS server A: Hello computer, the IP address of example.com is 220.127.116.11
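The conversation above can be sketched as a toy model. This is only an illustration: the server names and the `SERVERS` table are made up for the example, and no real DNS traffic is sent.

```python
# Toy model of the lookup above. Server names and records are hypothetical.
SERVERS = {
    # DNS server B does not know the answer, but knows who is responsible.
    "B": {"referrals": {"example.com": "C"}, "records": {}},
    # DNS server C is authoritative for example.com.
    "C": {"referrals": {}, "records": {"example.com": "220.127.116.11"}},
}


def resolve(name, start="B"):
    """Follow referrals the way resolving server A does.

    Returns (ip_or_None, list_of_servers_asked).
    """
    asked = []
    server = start
    while server is not None:
        asked.append(server)
        zone = SERVERS[server]
        if name in zone["records"]:
            return zone["records"][name], asked
        # No answer here: follow the referral, if there is one.
        server = zone["referrals"].get(name)
    return None, asked


ip, asked = resolve("example.com")
print(ip, asked)
```

A real resolver also caches answers and starts at the root servers, which this sketch leaves out.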
In principle a resolving DNS server should only respond to hosts it trusts. For instance, as an ISP you want requests coming from IPs allocated to or used by you to be answered, but not requests from outside your network. However, this is where things go wrong: some resolving DNS servers are completely open and will reply to anyone who asks. This is handy in one way, because if you ever need to resolve domain names and the network you are in doesn’t have a DNS server, you can just ask any of the open ones. I always make my machines resolve from 8.8.8.8, which is Google’s DNS service. It’s an easy to remember number and the reliability will probably be a lot higher than your ISP can guarantee.
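As a rough sketch of what "only respond to hosts you trust" means, here is the check a resolver could make before doing recursion for a client. The networks in `ALLOWED_NETWORKS` are documentation ranges picked for the example, not anyone's real allocation, and `may_recurse` is a hypothetical helper name.

```python
import ipaddress

# Hypothetical ranges an ISP has allocated to its own customers.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def may_recurse(client_ip: str) -> bool:
    """Return True only if the client sits inside one of our own networks."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in ALLOWED_NETWORKS)


print(may_recurse("192.0.2.17"))   # True: one of our customers
print(may_recurse("203.0.113.5"))  # False: outside our network
```

An open resolver behaves as if this function always returned True. Note that a check on the source address alone does not help against spoofed packets that claim to come from inside your own ranges.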
Another feature of DNS which facilitates this attack sits in the 4th layer of the OSI model, the transport layer. DNS is normally sent over UDP (according to the RFC it can also be sent over TCP, but that is used far less). UDP is a connectionless protocol, meaning it gives no guarantee that all data arrives or arrives in order, but it also has much less overhead, which makes it very fast. Being lightweight and fast makes UDP ideal for DNS, since in the end you just want a relatively small chunk of data. But because no handshake has to be completed, the source IP of a UDP request is easy to spoof, and the server will answer to whatever source IP is in the request.
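To make the "no handshake" point concrete, here is a small loopback-only sketch. It assumes a local machine where binding a UDP socket on 127.0.0.1 is allowed, and `udp_round_trip` is a made-up helper name. The "server" here is just a socket echoing a payload, not a DNS server, and nothing is spoofed.

```python
import socket


def udp_round_trip(payload: bytes) -> bytes:
    """Send one datagram over loopback and return the server's reply."""
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        server.settimeout(2.0)
        client.settimeout(2.0)

        # No connect/accept step: the very first packet carries the payload.
        client.sendto(payload, server.getsockname())

        # The server learns the sender only from the address in the packet...
        data, claimed_source = server.recvfrom(512)
        # ...and replies there, with no way to check that it is genuine.
        server.sendto(b"answer to " + data, claimed_source)

        reply, _ = client.recvfrom(512)
        return reply
    finally:
        client.close()
        server.close()


print(udp_round_trip(b"question"))
```

With TCP the three-way handshake would have to complete with the claimed source first, which is exactly the step UDP skips.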
Amplification: a flaw in the DNS protocol
Do note that the following examples assume an ideal world and may not be achievable in practice. As said before, you can ask a DNS server for the IP that corresponds to a domain name. A major flaw in DNS is the size of the question versus that of the answer. On average a DNS request is about 20-30 bytes long, but answers (depending on what you ask) can go up to 512 bytes. This is where things become interesting, because it means that a server replying to a DNS message has to send significantly more data than the original requestor did. This was actually used in the past to take down DNS servers, as they have to do a lot more work than the requestor.
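To get a feeling for the request size, here is a minimal sketch that builds the DNS payload of an A-record query by hand, following the standard DNS message layout. The function name `build_query` and the query ID are arbitrary choices for the example, and the size counts only the DNS message, not the UDP and IP headers around it.

```python
import struct


def build_query(name: str, qtype: int = 1, query_id: int = 0x1234) -> bytes:
    """Build the DNS payload for a single question (default: A record)."""
    # Header: ID, flags (0x0100 = recursion desired), 1 question,
    # 0 answer / authority / additional records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: every label prefixed with its length, closed by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE (1 = A) and QCLASS (1 = IN).
    return header + qname + struct.pack("!HH", qtype, 1)


query = build_query("example.com")
print(len(query))                  # 29
print(round(512 / len(query), 1))  # 17.7
```

So a 29 byte question for example.com could, in the best case for the attacker, trigger a 512 byte answer, close to 18 times as much data.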
Since we can alter the source address of the DNS request, we can put our victim’s IP address in as the source. We then send a valid DNS request with the modified source address to a resolving DNS server, which sends the answer to the source address in the request. Let’s say we have a 1 Mbit/s line (upload) and we saturate it completely with 25 byte requests: we should (ideally) be able to send 5000 requests per second. Because we have altered the source address, it’s our victim that receives all the answers. Since the answers are up to 512 bytes long, this might result in 5000*512 bytes/s (about 2.5 megabytes per second), or roughly 20 Mbit/s. This is why it’s called a DNS amplification attack: we transform our 1 Mbit/s into 20 Mbit/s, amplifying the traffic.
From the attacker’s point of view this type of denial of service has a lot of benefits, because “slow” lines become useful. The upload speed needn’t be massive to generate a lot of traffic, so ADSL lines used in home internet connections can be used more effectively. You also don’t need a lot of compromised machines. To get the same amount of traffic with a normal UDP flood, you would need 20 times (the amplification factor) more hosts. Discovering the bots is difficult as well, since the source address is altered.
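The arithmetic above, written out as a small calculation. It uses the same idealised numbers as the text (1 Mbit/s uplink, 25 byte requests, 512 byte answers) and ignores packet headers and loss; `amplified_rate` is just a name chosen for this sketch.

```python
def amplified_rate(uplink_bits_per_s, request_bytes, response_bytes):
    """Return (requests per second, bits per second arriving at the victim)."""
    requests_per_s = uplink_bits_per_s / 8 / request_bytes
    victim_bits_per_s = requests_per_s * response_bytes * 8
    return requests_per_s, victim_bits_per_s


requests, victim_bits = amplified_rate(1_000_000, 25, 512)
print(requests)             # 5000.0 requests per second
print(victim_bits / 8)      # 2560000.0 bytes per second
print(victim_bits / 1e6)    # 20.48 Mbit/s
print(512 / 25)             # 20.48, the amplification factor
```

The amplification factor depends only on the ratio between answer size and request size, not on the speed of the line.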
Now in the case of Spamhaus, an attack was initiated with a bandwidth reaching up to 300 Gbit/s. There were some claims about the DDoS being “the one that almost broke the internet”. However, a Tier 1 provider (NTT) said that while 300 Gbit/s may be a lot of bandwidth for a single enterprise, Tier 1 providers run capacities of several Tbit/s, so 300 Gbit/s isn’t going to take them down so easily. Akamai said that they didn’t see the internet taking a big hit, just a rather high increase of traffic around Western Europe (as displayed in the figure below).
This doesn’t mean it wasn’t significant. Kaspersky Lab said that it is probably the largest DDoS ever recorded. Eugene Kaspersky even said we have to be happy they only used 30 thousand resolvers. This attack could have taken down a lot of DNS servers as well, which could have disrupted service. The fear mongering is obviously a marketing technique used by CloudFlare to gain more clients, but regardless of their marketing stunts, I must admit they still did a great job at mitigating the attack. PALMAM QUI MERUIT FERAT, “let whoever earns the palm bear it”. Not only did CloudFlare help Spamhaus mitigate this attack, Google also absorbed some of the traffic.
Whoever initiated the attack is still unclear, but a lot of people seem to think it was a response to the recent mass-blocking of all CyberBunker clients. Investigations are still pending.
How was this mitigated? Well, first have a read of Thomas Pornin’s answer. In this case, instead of calling the firemen, they decided that no one can access Jim’s shop directly. Instead, people come in through one of many gates, and at every gate there is a guard verifying why the person came to Jim’s shop. If someone tells the guard he’s from Germany but arrives on a road originating in France, chances are he’s really from France, so something must be wrong and he shouldn’t be let into Jim’s shop.
In practice this was done with anycast. Anycast is a routing technique that sends traffic to the nearest node, so depending on your geographical location the same address leads to a different machine or network. Google uses this as well for their DNS services. So anycast is “one IP, multiple machines”.
Because the attackers were sending requests from different locations, the traffic was divided between the 24 datacenters CloudFlare owns. This means 12.5 Gbit/s per datacenter, which is a lot more manageable. Also note that a router’s real limit is the number of packets it can route. Vendors often say their routers can handle 10 or 24 Gbit/s, but this is actually calculated as a certain number of packets of a certain size. Because the packet size here is quite high, a router can cope with this type of attack relatively well (the 12.5 Gbit/s is made up of 300-500 byte packets).
Now what can you do against a DNS amplification attack? First of all, everyone should secure their DNS servers and only allow certain hosts to do DNS requests. But open recursive DNS servers aren’t the only ones to blame. The actual problem is the possibility of IP spoofing, and to counter that BCP38 was defined. BCP38 describes ingress filtering, making “sure that incoming packets are actually from the networks that they claim to be from”.
BCP38 defines that if a packet arrives from a segment of your network with a source IP that cannot possibly belong there, you drop it. For instance, if you manage the subnet 126.96.36.199/16 and suddenly you see a packet flying by with source IP 188.8.131.52, you know something is fishy, because there is no way that 188.8.131.52 is legitimately located within that LAN segment under your control (we can also look at complete regions rather than subnets), and therefore it should be dropped rather than forwarded. In short:
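A quick back-of-the-envelope calculation shows why packet size matters for the routers. It assumes, like the text, that the 300 Gbit/s is split evenly over the 24 datacenters; the 64 byte row is an extra comparison point for an attack made of small packets and is not from the Spamhaus case.

```python
ATTACK_BITS_PER_S = 300e9   # 300 Gbit/s in total
DATACENTERS = 24

# Assumes an even split, which anycast does not guarantee in practice.
per_datacenter = ATTACK_BITS_PER_S / DATACENTERS
print(per_datacenter / 1e9)  # 12.5 Gbit/s


def packets_per_second(bits_per_s, packet_bytes):
    """How many packets of a given size fit in a given bandwidth."""
    return bits_per_s / (packet_bytes * 8)


for size in (64, 300, 500):
    pps = packets_per_second(per_datacenter, size)
    print(f"{size} byte packets: {pps / 1e6:.1f} million packets/s")
```

At 300-500 byte packets the same 12.5 Gbit/s needs several times fewer routing decisions per second than it would with tiny packets, which is why the routers held up relatively well.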
    IF    packet's source address from within [its assigned space]
    THEN  forward as appropriate

    IF    packet's source address is anything else
    THEN  deny packet

In the event that there is actually a reason to allow such behavior, manual exceptions can be made. I can’t come up with a valid reason, but there probably are…somewhere.
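The rule above, including the manual exceptions, could be sketched like this. The function name `ingress_filter` and the networks used are assumptions for the example (documentation ranges); a real deployment would do this in the router’s own filter configuration, not in Python.

```python
import ipaddress


def ingress_filter(source_ip, assigned_space, exceptions=()):
    """Decide what to do with a packet entering from a customer segment.

    assigned_space: the network that segment was given, e.g. "192.0.2.0/24".
    exceptions: extra networks that were manually allowed for that segment.
    """
    src = ipaddress.ip_address(source_ip)
    allowed = [ipaddress.ip_network(net) for net in (assigned_space, *exceptions)]
    if any(src in net for net in allowed):
        return "forward"
    return "deny"


print(ingress_filter("192.0.2.44", "192.0.2.0/24"))   # forward
print(ingress_filter("203.0.113.9", "192.0.2.0/24"))  # deny: spoofed source
print(ingress_filter("203.0.113.9", "192.0.2.0/24",
                     exceptions=["203.0.113.0/24"]))  # forward: manual exception
```

The important part is where the check happens: at the edge, next to the sender, because that is the only place where the network still knows which addresses can legitimately appear.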
BCP38 has been around for 13 years, so it’s about time that everyone adopts it (about 80% of the internet already has!), as it will mitigate a lot of attacks involving IP spoofing.
To wrap it up:
- DNS uses UDP which allows the source IP address to be spoofed easily
- 300 Gbit/s didn’t actually pose a threat to the internet
- 300 Gbit/s is, however, probably the biggest DDoS we have ever seen
- DNS Amplification is caused by open DNS resolvers, but the open resolver is not the only problem
- Some providers aren’t even aware they are open resolvers
- DNS Amplification is possible because a lot of networks have not adopted BCP38
If you have comments, questions or think I’m wrong, I’m always open to constructive criticism, so feel free to contact me or leave a comment below.