The Collateral Damage of DDoS Attacks - Part 3. Data available from the router
Mar 20, 2018
Collectively, the edge/peering routers forward all traffic, legitimate and malicious alike. Detecting attacks therefore requires monitoring not only the volume but also the type of this traffic.
In general, DDoS detection is based on anomaly detection and/or fingerprints, for example:
Unexpectedly high traffic targeting a single IP host.
Moderate but persistent traffic sourced from UDP port 19 (chargen).
UDP/53 (DNS) traffic with a QTYPE of ANY (0xFF).
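The two port/payload fingerprints above could be checked per sampled packet roughly as follows. This is a minimal sketch, not a production detector: the `SampledPacket` record and the `match_fingerprints` function are hypothetical names introduced here, and the sketch assumes the DNS QTYPE has already been extracted from the payload by some earlier parsing step.

```python
from dataclasses import dataclass
from typing import List, Optional

UDP = 17            # IP protocol number for UDP
CHARGEN_PORT = 19   # chargen service
DNS_PORT = 53
QTYPE_ANY = 0xFF    # DNS QTYPE "ANY" (255)


@dataclass
class SampledPacket:
    """Hypothetical record for one sampled packet (illustration only)."""
    dst_ip: str
    protocol: int
    src_port: int
    dst_port: int
    dns_qtype: Optional[int] = None  # assumed to be parsed from the payload


def match_fingerprints(pkt: SampledPacket) -> List[str]:
    """Return the names of the fingerprints this packet matches."""
    hits = []
    if pkt.protocol == UDP and pkt.src_port == CHARGEN_PORT:
        hits.append("chargen-reflection")
    if (pkt.protocol == UDP
            and DNS_PORT in (pkt.src_port, pkt.dst_port)
            and pkt.dns_qtype == QTYPE_ANY):
        hits.append("dns-any")
    return hits


print(match_fingerprints(SampledPacket("192.0.2.1", UDP, 19, 40000)))
print(match_fingerprints(SampledPacket("192.0.2.1", UDP, 53, 40000, dns_qtype=0xFF)))
```

Note that the DNS check only works if the reporting method exposes the packet payload; the port-19 check needs only L3/L4 fields.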
Interface utilization and the dynamics of its changes (sharp increases) can also indicate a volumetric attack, but they are not sufficient to identify the attack's fingerprint, which is necessary to launch countermeasures.
Ideally, a router should report information on the traffic passing through it that allows us to derive the following:
TTL, fragment flags, TCP control flags, ToS/DS-field, etc.
Volume of each flow in bps/pps (or the count of bytes and packets handled in the last period).
This information needs to be delivered to the DDoS detection system as soon as possible to allow fast detection and mitigation. Therefore, collecting information on routers (e.g. in files) and then transferring it via an FTP-like protocol is not acceptable.
For the purpose of volumetric DDoS detection, it is worth noting that the above information only needs to be statistically correct. Some micro-flows may go unreported, and some big flows may be over-represented. Since we are looking for attacks that cause collateral damage to network capacity, these inaccuracies are acceptable. Therefore, reporting on traffic based on sampling is a perfectly fine approach.
Looking at the required information, we can see that there are several ways a router/switch can provide it:
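To illustrate why sampled data is good enough for volume estimates, the sketch below scales sampled counters back up to an estimated pps/bps. The function name `estimate_rates` and the example numbers are assumptions made for illustration; the scaling itself is just multiplication by the sampling rate N (for 1-in-N sampling).

```python
def estimate_rates(sampled_packets, sampled_bytes, sampling_n, interval_s):
    """Estimate real traffic rates from 1-in-N sampled counters.

    sampled_packets / sampled_bytes: totals seen in the samples
    sampling_n: N in 1:N sampling
    interval_s: length of the observation window in seconds
    """
    pps = sampled_packets * sampling_n / interval_s
    bps = sampled_bytes * 8 * sampling_n / interval_s
    return pps, bps


# Hypothetical example: 1,200 samples totalling 600,000 bytes in 10 s at 1:1000.
pps, bps = estimate_rates(1200, 600_000, 1000, 10)
print(pps, bps)  # 120000.0 480000000.0
```

The estimate is only statistical: a flow with very few packets may produce no samples at all, which is acceptable when the goal is to find flows large enough to threaten link capacity.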
JFlow/NetFlow/Caida version 5. These are traditional flow export formats commonly available on routers. They provide information on flows identified by some L3 and L4 fields, followed by the bytes and packets carried in the flow during the time the report covers, plus some meta-information such as the router's interfaces, src/dst ASN, etc. There are two drawbacks to this technique:
Inability to expose packet payload. This badly impacts the ability to detect attacks based on signatures that are visible only in the packet payload.
Reports represent flow statistics collected over a period of time (the active/inactive timeout) and are sent periodically. This delays the reporting of new flows. For example, if the report period is set to 5 minutes, the DDoS flow would be exposed to the detection system with a delay of up to 5 minutes, and only then could it be recognized and a mitigation action triggered.
JFlow/NetFlow/Caida version 8. This version is used to process and aggregate flow information on the router and export pre-aggregated data. As with all aggregation, some details are lost, so this format is not suitable for DDoS detection. Pre-aggregation also means longer periods between reports and a higher demand on the router's CPU and memory to perform the aggregation.
JFlow/NetFlow/Caida v9. This format is very flexible in terms of the data it exports. For DDoS detection, FlowExport v9 provides all the data needed, including packet payload (IPFIX Information Element 314). However, the way FlowExport is expected to work makes IE 314 virtually unusable. As said earlier, the router collects per-flow information for some period of time in a flow table and then periodically generates a flow report. A single flow report therefore potentially covers multiple packets that share the same L3/L4 fields but whose payloads may naturally vary. In such a situation, the payload of the first packet of the flow is exported (RFC 5477, section 8.5). This has devastating implications for DDoS detection. Consider the following example. There is legitimate traffic from DNS server A to destination host B, [A,B,UDP,53,C], with QTYPE set to "A" (IPv4 address). At the same time, a DNS amplification attack is triggered against host B, [A,B,UDP,53,C], with QTYPE set to "ANY". FlowExport would see this as a single flow [A,B,UDP,53,C] and attach the IE 314 of the first packet to the report. If by pure luck the first packet was legitimate (QTYPE == "A"), the entire flow would be seen as legitimate. Otherwise, the entire flow would be seen as DNS amplification, and if mitigation were triggered (e.g. discard all [*,B,UDP,53,*]), legitimate traffic would suffer as well. Because of the periodic report concept, FlowExport v9 would also introduce a delay in attack flow reporting (the same as the other FlowExport versions).
IPFIX follows the same concept as FlowExport v9 but additionally allows optional reporting without aggregation (flow timeout == 0). In that case IE 314 would provide correct information, and there would be no unnecessary latency in DDoS detection.
sFlow is similar to IPFIX in terms of the data it can export. The packet payload can be exported using the "Raw Packet Header" record (opaque = flow_data; enterprise = 0; format = 1). sFlow explicitly mandates per-packet reporting, not per-flow-over-time-period reporting. In essence, despite its name, sFlow is not a flow-oriented data export mechanism but a per-packet reporting one.
Packet mirroring/SPAN port. In this technique, the router copies the unmodified packet. While the original packet is processed and forwarded normally toward its destination, the copy is sent elsewhere over a hard-coded egress interface or tunnel. In the context of DDoS protection, this technique naturally exposes the packet payload and allows fine-grained identification, and then discarding, of malicious traffic. It is also immediate, so DDoS detection time is reduced to the necessary minimum. This method has two drawbacks:
Lack of metadata such as the prefix/netmask the packet matched, in/out interfaces, AS numbers, etc. This information is not critical for DDoS detection, given the _distributed_ nature of the attacks and the common practice of spoofing source IPs.
The amount of data it could generate if every packet is mirrored. Again, this problem can be mitigated by statistical sampling and packet truncation. With just 1:1000 sampling, which is more than sufficient for detecting volumetric DDoS attacks, 1 Tbps of screened data becomes only 1 Gbps.
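The mirroring arithmetic above can be written down as a small helper. This is a rough sketch: `mirror_load_bps` is a hypothetical name, and the truncation factor assumes every packet has the same average size, which real traffic does not. The 800-byte average and 128-byte truncation length are illustrative assumptions, not figures from this article.

```python
def mirror_load_bps(link_bps, sampling_n, avg_packet_bytes=None, truncate_bytes=None):
    """Approximate bandwidth of the mirrored stream.

    link_bps: traffic being screened, in bits per second
    sampling_n: N in 1:N sampling
    avg_packet_bytes / truncate_bytes: optional; if both are given and the
    truncation length is shorter than the average packet, scale the load
    by their ratio (a simplification that ignores the packet size mix).
    """
    load = link_bps / sampling_n
    if avg_packet_bytes and truncate_bytes and truncate_bytes < avg_packet_bytes:
        load *= truncate_bytes / avg_packet_bytes
    return load


# 1 Tbps screened at 1:1000 sampling gives 1 Gbps of mirrored traffic.
print(mirror_load_bps(1e12, 1000))
# Additionally truncating assumed 800-byte packets to 128 bytes: roughly 160 Mbps.
print(mirror_load_bps(1e12, 1000, avg_packet_bytes=800, truncate_bytes=128))
```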
All of the above traffic reporting techniques are based on a sampling mechanism: only 1 out of N packets is exposed to the FlowExport/IPFIX/sFlow process on the router/switch and reported. The sampling rate, or rather the number of packets that can be sampled per second, is a property of the router/switch platform. It can therefore vary significantly among vendors and products.
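To make the FlowExport v9 payload problem described earlier concrete, the sketch below simulates a flow table that keeps only the first packet's payload per L3/L4 key, and contrasts it with per-packet reporting. It is a toy model, not an actual exporter: `Packet`, `build_flow_table`, the packet lengths, and the use of a plain QTYPE string in place of the real IE 314 payload bytes are all assumptions made for illustration.

```python
from collections import namedtuple

# Hypothetical packet record; "qtype" stands in for the exported payload.
Packet = namedtuple("Packet", "src dst proto src_port dst_port length qtype")


def build_flow_table(packets):
    """Aggregate packets by 5-tuple, keeping only the first packet's payload."""
    table = {}
    for p in packets:
        key = (p.src, p.dst, p.proto, p.src_port, p.dst_port)
        rec = table.get(key)
        if rec is None:
            # First packet of the flow: its payload is the one that gets exported.
            table[key] = {"packets": 1, "bytes": p.length, "first_qtype": p.qtype}
        else:
            # Later packets only update the counters; their payload is lost.
            rec["packets"] += 1
            rec["bytes"] += p.length
    return table


legit = Packet("A", "B", "UDP", 53, "C", 120, "A")
attack = Packet("A", "B", "UDP", 53, "C", 1400, "ANY")
stream = [legit] + [attack] * 99   # one legitimate reply, then 99 attack packets

table = build_flow_table(stream)
record = table[("A", "B", "UDP", 53, "C")]
print(record)  # one flow, 100 packets, but first_qtype is "A"

# Per-packet reporting (sFlow, mirroring, IPFIX with timeout 0) sees every payload.
any_count = sum(1 for p in stream if p.qtype == "ANY")
print(any_count)  # 99
```

In this run the aggregated record looks legitimate even though 99 of its 100 packets belong to the attack; reversing the packet order would flag the legitimate reply as part of the attack instead.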