BGP

Network Security Fall 2021
Due


The goals of this lab are to:

  1. Analyze a real-world network failure from the perspective of BGP
  2. Gain experience with BGP updates and AS connectivity

Facebook and associated services such as WhatsApp and Instagram experienced a global outage on October 4, 2021. A post-mortem published by Facebook’s engineering team outlines the root causes of the downtime. Externally, as documented by Cloudflare, the underlying network configuration errors manifested as global routing instability and, in turn, the unavailability of Facebook’s authoritative DNS servers.

In this lab, you will investigate this instability using BGP UPDATE messages collected during the incident. The messages were collected from a BGP looking glass located at the San Francisco Metropolitan Internet Exchange (SFMIX). They were originally captured as MRT records, but have been converted with bgpdump to a pipe-separated format: each record occupies one line, and fields are separated by a | character.

You will need to handle two message types with slightly different formats, announcements and withdrawals:

Index  Field
0      BGP4_ET
1      UNIX timestamp (seconds.nanoseconds)
2      A (announcement) or W (withdrawal)
3      IP address of BGP message source
4      ASN of BGP message source
5      Announced or withdrawn network prefix
6      (announcements only) AS path
8      (announcements only) Next hop IP address
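
As a starting point, one record can be split into the fields listed above roughly as follows. This is a minimal sketch, not a reference solution: the Update class and parse_update function are hypothetical names, the sample addresses and ASNs come from documentation ranges rather than the real archive, and AS path elements are kept as strings on the assumption that some paths may contain non-numeric tokens.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Update:
    timestamp: float
    kind: str                      # "A" (announcement) or "W" (withdrawal)
    peer_ip: str                   # index 3: IP address of message source
    peer_asn: int                  # index 4: ASN of message source
    prefix: str                    # index 5: announced or withdrawn prefix
    as_path: Tuple[str, ...] = ()  # index 6: empty for withdrawals
    next_hop: Optional[str] = None # index 8: None for withdrawals


def parse_update(line: str) -> Update:
    """Parse one pipe-separated record using the field indices in the table."""
    fields = line.rstrip("\n").split("|")
    kind = fields[2]
    if kind not in ("A", "W"):
        raise ValueError(f"unexpected message type: {kind!r}")
    as_path: Tuple[str, ...] = ()
    next_hop = None
    if kind == "A":
        # Only announcements carry an AS path and a next hop.
        as_path = tuple(fields[6].split())
        next_hop = fields[8]
    return Update(float(fields[1]), kind, fields[3], int(fields[4]),
                  fields[5], as_path, next_hop)


# Made-up sample records (documentation prefixes and ASNs, not real data).
sample_a = "BGP4_ET|1633361000.5|A|192.0.2.1|64496|198.51.100.0/24|64496 64500 64511|IGP|192.0.2.1"
sample_w = "BGP4_ET|1633361060.0|W|192.0.2.1|64496|198.51.100.0/24"
print(parse_update(sample_a))
print(parse_update(sample_w))
```

The last element of as_path is the origin AS of an announcement, which is useful when deciding which organization a prefix belongs to.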

The message archive is located in Canvas as facebook-20211004.psv.zst. In addition, a mapping from ASN to organization is given in asn.txt.zst.
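
Both files are Zstandard-compressed. One way to read them line by line without unpacking them first is sketched below. It assumes the third-party zstandard package is installed (pip install zstandard); iter_lines is a hypothetical helper name, and plain-text files are passed through unchanged so the same function works on an already decompressed copy.

```python
import io
from typing import Iterator


def iter_lines(path: str) -> Iterator[str]:
    """Yield text lines from path, decompressing on the fly if it ends in .zst."""
    if path.endswith(".zst"):
        # Imported lazily so the plain-text branch works without the package.
        import zstandard  # third-party; assumed to be installed

        with open(path, "rb") as raw:
            reader = zstandard.ZstdDecompressor().stream_reader(raw)
            with io.TextIOWrapper(reader, encoding="utf-8") as text:
                for line in text:
                    yield line.rstrip("\n")
    else:
        with open(path, encoding="utf-8") as text:
            for line in text:
                yield line.rstrip("\n")
```

For example, `for line in iter_lines("facebook-20211004.psv.zst"): ...` streams the archive one record at a time. Alternatively, the files can be decompressed once on the command line with `zstd -d` and then read as ordinary text.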

Lab Objectives

Using the information contained in the Cloudflare blog post on this incident, answer the following questions:

  1. What prefixes does Facebook own?
  2. What prefixes were withdrawn during the outage?
  3. What prefixes were still reachable during the outage?
  4. What organizations does Facebook peer with?
  5. From the vantage point of the looking glass, what is an example of an UPDATE that would allow a malicious ASN to hijack (a subset of) Facebook’s traffic? Why would this UPDATE work? Assume the malicious ASN is 65536.
  6. Now, assume that Facebook publishes an RPKI ROA. What would a malicious UPDATE look like in that scenario?
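
One possible way to approach the withdrawal and reachability questions is to replay the updates in order and track which peers still hold a route for each prefix. The sketch below is only an illustration under stated assumptions: replay is a hypothetical function name, the sample records are made up from documentation ranges, and because the archive contains updates rather than a full routing table snapshot, prefixes that never appear in an update are invisible to it.

```python
from collections import defaultdict


def replay(lines):
    """Replay updates in file order and return (reachable, unreachable).

    reachable maps each prefix to the set of peer IPs that still announce it
    after the last record. unreachable is the set of prefixes that appeared
    in some update but that no peer announces at the end.
    """
    table = defaultdict(set)  # prefix -> peers currently announcing it
    for line in lines:
        fields = line.rstrip("\n").split("|")
        kind, peer_ip, prefix = fields[2], fields[3], fields[5]
        if kind == "A":
            table[prefix].add(peer_ip)
        elif kind == "W":
            # discard() tolerates withdrawals for routes announced
            # before the capture started.
            table[prefix].discard(peer_ip)
    reachable = {p: peers for p, peers in table.items() if peers}
    unreachable = {p for p, peers in table.items() if not peers}
    return reachable, unreachable


# Made-up records: two peers announce one prefix, one peer later withdraws.
demo = [
    "BGP4_ET|1.0|A|192.0.2.1|64496|198.51.100.0/24|64496 64511|IGP|192.0.2.1",
    "BGP4_ET|2.0|A|192.0.2.2|64497|198.51.100.0/24|64497 64511|IGP|192.0.2.2",
    "BGP4_ET|3.0|A|192.0.2.1|64496|203.0.113.0/24|64496 64511|IGP|192.0.2.1",
    "BGP4_ET|4.0|W|192.0.2.1|64496|203.0.113.0/24",
    "BGP4_ET|5.0|W|192.0.2.1|64496|198.51.100.0/24",
]
reachable, unreachable = replay(demo)
print(reachable)
print(unreachable)
```

Tracking state per peer matters because a withdrawal from one peer does not make a prefix unreachable while another peer still announces it. Withdrawals carry no AS path, so attributing a withdrawn prefix to an organization requires remembering the origin AS from an earlier announcement of the same prefix.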

Submission Instructions

Submit your code and a README with your answers to the above questions.