2020 ARIN Community Grant Program Recipient Report
Artwork by Natasha Allegri
Nowadays almost everyone is aware that Resource Public Key Infrastructure (RPKI) exists and can help protect the Internet’s routing system, but perhaps lesser known is … how do computer systems fetch RPKI data?
In this article we share a report on technical challenges encountered when implementing RPKI Repository Delta Protocol (RRDP) in the free, functional, and secure RPKI validator software rpki-client, and how those challenges were addressed. We also want to invite RPKI operators to celebrate the release of rpki-client 7.0 and help with testing. But first, let’s start with how RPKI data is transported between computer systems and what role RRDP plays in this.
Transmission of RPKI Objects over the Internet
The RPKI’s threat model allows signed objects to be transported via any means, even unsecured channels! The RPKI technology is transport protocol agnostic, which permits the Internet industry to transition from one RPKI data synchronisation protocol to another, and then to the next one after that! In the current IETF RPKI specifications, the valid on-the-wire transport mechanisms are (all of the following via IPv4 or IPv6): UDP (HTTP/3), TCP 443 (HTTP over TLS), and even unencrypted TCP 873 (RSYNC). [Note that HTTP over TCP 80 is not permitted. However, some RPKI engineers have argued that any strict requirement for common WebPKI Trust Anchors between RPKI Repository and Relying Party might have unintended and detrimental consequences. Reducing the requirements for synchronizing RPKI data to just a mutual RPKI Trust Anchor and some form of Internet access would help facilitate the exchange of RPKI data at a planetary scale across all geo-political boundaries to foster a truly global Internet.]
Whatever Layer-3 or Layer-4 protocol ends up being used, there are fundamentally two ways to determine which files containing RPKI objects should be transported from CAs (Certificate Authorities: organizations in charge of delegating IP space to third parties) to Internet Service Providers (“Relying Parties” in RPKI lingo): RSYNC or RRDP. Both acronyms represent data transfer protocols which are capable of very efficient one-direction synchronization. RRDP and RSYNC each require their own purpose-built client/server software, and each protocol has a different impact as to where most of the operational burden lies.
An advantage of the RSYNC protocol is that it allows for easy manual debugging with standard UNIX utilities; an advantage of RRDP, on the other hand, is that there are more HTTPS implementations to choose from than RSYNC protocol implementations. The RPKI community as a whole benefits from multiple distribution mechanisms existing concurrently: if one protocol doesn’t work, perhaps fetching via another protocol yields the desired result. Also, in the author’s opinion, it would be beneficial for the long-term evolution of the RPKI to continue to encourage RPKI implementers not to tie themselves to one specific synchronization protocol.
What exactly is RRDP?
RRDP’s design borrowed from the concept of distributing Internet Routing Registry (IRR) data through the Near Real Time Mirroring (NRTM) protocol – and in turn – the operational experience with RRDP is expected to influence the next iteration of IRR NRTM v4.
RRDP is a mechanism defined in RFC 8182. A short overview of how it works: an RPKI Repository operator writes a bundle of RPKI objects into a single “Delta” file (Base64-encoded DER wrapped in XML). These Delta files are referenced from a central journal called the “Update Notification” file (also XML), and both kinds of file are published as static files hosted on an HTTPS server.
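To make the Delta format concrete, here is a minimal Python sketch that unpacks one Delta document into its published and withdrawn objects. The element and attribute names follow RFC 8182; the function name, the sample URIs, and the placeholder hash value are made up for illustration, and the sketch skips the hash checks and strict validation a real Relying Party needs. It does not reflect how rpki-client itself (written in C, using libexpat) is implemented.

```python
import base64
import xml.etree.ElementTree as ET

RRDP_NS = "{http://www.ripe.net/rpki/rrdp}"  # XML namespace used by RFC 8182


def parse_delta(xml_text):
    """Return (serial, published, withdrawn) for one RRDP Delta document."""
    root = ET.fromstring(xml_text)
    if root.tag != RRDP_NS + "delta":
        raise ValueError("not an RRDP delta document")

    published = {}  # object URI -> raw DER bytes
    withdrawn = []  # object URIs the server asks the client to remove
    for element in root:
        uri = element.get("uri")
        if element.tag == RRDP_NS + "publish":
            # The element text is the Base64 encoding of the DER object.
            published[uri] = base64.b64decode(element.text or "")
        elif element.tag == RRDP_NS + "withdraw":
            withdrawn.append(uri)
        else:
            raise ValueError("unexpected element: " + element.tag)
    return int(root.get("serial")), published, withdrawn


# Hypothetical sample data: a tiny DER SEQUENCE standing in for a real object.
SAMPLE_DER = bytes([0x30, 0x03, 0x02, 0x01, 0x05])
SAMPLE_DELTA = (
    '<delta xmlns="http://www.ripe.net/rpki/rrdp" version="1" '
    'session_id="9df4b597-af9e-4dca-bdda-719cce2c4e28" serial="3">'
    '<publish uri="rsync://rpki.example.net/repo/example.cer">'
    + base64.b64encode(SAMPLE_DER).decode("ascii")
    + '</publish>'
    '<withdraw uri="rsync://rpki.example.net/repo/old.roa" hash="00"/>'
    '</delta>'
)

serial, published, withdrawn = parse_delta(SAMPLE_DELTA)
```

Note that the objects arrive labeled only with a URI chosen by the publisher; nothing in this step says whether they are valid.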
RRDP clients periodically fetch the Update Notification file (which is only a very small file), and if the notification file has changed since the last fetch, the client proceeds to download only the missing Delta files. An RRDP client bootstrapping from an empty local cache can use an RRDP Snapshot file to get up to speed. A Snapshot is expected to contain the complete repository, and from then onwards the client can fetch just Delta files.
While “piecing together the delta files” slightly increases the computational cost on the ISP side, the ease for Repository Operators of merely moving static (cacheable) files around is considered advantageous by many in the RPKI community.
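The client-side decision described above (nothing to do, fetch the missing Deltas, or fall back to the Snapshot) can be sketched roughly as follows. This is an illustrative Python sketch, not rpki-client code: the function name and the sample URIs are hypothetical, the XML layout follows RFC 8182, and verification of the per-file hashes listed in the notification is left out.

```python
import xml.etree.ElementTree as ET

RRDP_NS = "{http://www.ripe.net/rpki/rrdp}"  # XML namespace used by RFC 8182


def plan_sync(notification_xml, cached_session=None, cached_serial=None):
    """Decide what to download, given the state left by the previous fetch."""
    root = ET.fromstring(notification_xml)
    session = root.get("session_id")
    serial = int(root.get("serial"))
    snapshot_uri = root.find(RRDP_NS + "snapshot").get("uri")
    deltas = {int(d.get("serial")): d.get("uri")
              for d in root.findall(RRDP_NS + "delta")}

    if cached_serial is not None and session == cached_session:
        if cached_serial == serial:
            return "up-to-date", []
        missing = range(cached_serial + 1, serial + 1)
        # Deltas are only usable if every serial in the gap is on offer.
        if cached_serial < serial and all(s in deltas for s in missing):
            return "deltas", [deltas[s] for s in missing]
    # Empty cache, new session, or a gap in the Delta chain: start over.
    return "snapshot", [snapshot_uri]


# Hypothetical sample notification with placeholder hashes.
SESSION = "9df4b597-af9e-4dca-bdda-719cce2c4e28"
SAMPLE_NOTIFICATION = (
    '<notification xmlns="http://www.ripe.net/rpki/rrdp" version="1" '
    'session_id="' + SESSION + '" serial="3">'
    '<snapshot uri="https://rrdp.example.net/3/snapshot.xml" hash="AB"/>'
    '<delta serial="3" uri="https://rrdp.example.net/3/delta.xml" hash="CD"/>'
    '<delta serial="2" uri="https://rrdp.example.net/2/delta.xml" hash="EF"/>'
    '</notification>'
)

action, uris = plan_sync(SAMPLE_NOTIFICATION, SESSION, 1)
```

A client that last saw serial 1 of the same session gets the two Delta URIs in ascending order; a client with an empty cache, a different session, or a serial older than the oldest offered Delta is sent to the Snapshot.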
RRDP implementation challenges
Until the advent of RRDP, RPKI Repositories were synonymous with directories accessible via RSYNC modules on an RSYNC server. Most RPKI validator implementations would simply synchronize the entire module, and in doing so fetch all CA Repositories from the RSYNC server in one single operation. However, in RRDP the CA’s location is somewhat obfuscated behind a slightly more opaque “RPKI Notify URI”, and additionally, a given X.509 certificate’s “CA Repository” attribute in the SIA Extension now means multiple things: it can either literally mean the RSYNC location, or be an indicator for a relative location in the global RPKI file hierarchy. At first glance this might seem a clever trick, but overloading of existing data structures always increases the potential for considerable confusion when implementing a protocol!
Challenge 1: What’s in a name?
OpenBSD rpki-client requires POSIX filesystem semantics to store to-be-validated and validated RPKI data. Storing RPKI objects simply as files on a file system allows for greater debuggability and offers significant advantages to users who wish to construct advanced pipelines where all RPKI X.509 artefacts are archived before and after validation.
When synchronizing through RRDP, a Relying Party downloads arbitrary “self-labeled” digital objects which reference filenames that could collide with the validated filesystem hierarchy. Because retrieved objects have not yet been validated, they are not mappable to the validated RPKI object tree stored on the local filesystem hierarchy. This presents a bit of a chicken-and-egg problem!
The rpki-client developers came up with a trick for this catch-22: to link data retrieved from an RRDP service to a hierarchical file system layout, a SHA256 digest is calculated over the object’s RPKI Notify URI, and this digest is used as the unique directory name under which objects retrieved from the RRDP server are stored.
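The idea can be illustrated in a few lines of Python. This is only a sketch: the base directory, the hexadecimal encoding of the digest, and the function name are assumptions made for the example, and the exact on-disk layout used by rpki-client may differ.

```python
import hashlib
from pathlib import PurePosixPath


def rrdp_staging_dir(base_dir, notify_uri):
    """Map an RPKI Notify URI to a directory named after its SHA256 digest."""
    digest = hashlib.sha256(notify_uri.encode("utf-8")).hexdigest()
    return PurePosixPath(base_dir) / digest


# Hypothetical base directory and Notify URIs, for illustration only.
dir_a = rrdp_staging_dir("/var/cache/rrdp", "https://rrdp.example.net/notification.xml")
dir_b = rrdp_staging_dir("/var/cache/rrdp", "https://rrdp.example.org/notification.xml")
```

Because the directory name depends only on the Notify URI, every object fetched from one RRDP service lands in the same place, and whatever filenames the objects claim for themselves cannot collide with the validated tree or with another repository's staging area.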
Challenge 2: The RRDP <withdraw> XML element, a burden or a feature?
An interesting aspect of RRDP is that the RRDP information stream itself is not signed (just like RSYNC data exchanges are not protected). This means that any intermediate relay (for example a CDN) is in a position to modify (either by accident or deliberately) any data presented to the RRDP client. To reduce the risk of a rogue RSYNC server instructing the client to empty its cache, rpki-client invokes openrsync without the ‘--delete’ option. Instead, rpki-client performs a Garbage Collection process based on whether any valid current Manifest references a given file: files found in the cache directory that no Manifest lists are deleted. Similarly, an RRDP client could ignore <withdraw> instructions from the RRDP server, and instead rely on cryptographically asserted file reference counting. The operational and security implications of RRDP as one of multiple synchronization channels for RPKI data are an area of further research and study.
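A simplified model of Manifest-driven garbage collection is sketched below in Python. The function name and data shapes are hypothetical: the sketch assumes validation has already produced, for each valid and current Manifest, the list of files it references, and it treats the Manifests themselves as files to keep. It does not mirror rpki-client's actual implementation.

```python
def stale_files(cached_files, valid_manifests):
    """Return cached files that no valid, current Manifest refers to.

    valid_manifests maps the path of each validated Manifest to the
    list of file paths that Manifest references.
    """
    referenced = set(valid_manifests)  # keep the Manifests themselves
    for listed in valid_manifests.values():
        referenced.update(listed)
    return sorted(set(cached_files) - referenced)


# Hypothetical cache contents, for illustration only.
cached = ["ca/a.roa", "ca/b.roa", "ca/ca.mft", "ca/old.roa"]
manifests = {"ca/ca.mft": ["ca/a.roa", "ca/b.roa"]}
to_delete = stale_files(cached, manifests)
```

The decision to delete rests only on signed, validated data, so neither an RSYNC server nor an RRDP <withdraw> element can directly instruct the client to drop an object that a valid Manifest still references.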
Challenge 3: RRDP can suffer from publication inconsistencies… just like RSYNC!
During the development of rpki-client it came to light that some Internet Registries, motivated by performance and reliability considerations, use HTTP load balancing to distribute incoming RRDP fetching requests across multiple backends. However, when load balancing requests towards data sensitive to inconsistency (such as RPKI files!), it is of paramount importance to ensure all backends are perfectly synchronized.
Much to our surprise we found that some RRDP service operators appeared unable to keep backends perfectly synchronized. As the RRDP protocol requires multiple successful HTTP requests to perform a single synchronization, this can lead to race conditions when fetching an Update Notification from one backend which references Delta files not yet available on the other backends. Similar cache inconsistency issues can exist in suboptimal RSYNC publication pipelines. A method to reduce the risk of fetching from multiple different backends is for RRDP clients to establish a persistent HTTP connection. A proposal on how to implement HTTP Keep-Alive support in rpki-client was shared in this email thread. Another (complementary) approach is for RRDP server operators to use a “sticky” bucket assignment process. All RRDP service operators we reached out to responded positively to our problem reports, and in most cases were able to improve service reliability in a matter of days.
At the time of writing, it appears there is no formal requirement in the RRDP specification for clients and servers to support HTTP Keep-Alive. This might be an opportunity for clarification in the next RPKI synchronization protocol specification!
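As a rough illustration of the persistent-connection approach, the Python sketch below fetches the Update Notification and the files it references over a single reused connection, using the standard library's http.client. The function name, the connection_factory parameter (there so a test double can be substituted), and the example URIs are assumptions of this sketch, and it is unrelated to the rpki-client proposal mentioned above. A server or load balancer may still close the connection between requests, so this reduces the risk of hitting different backends rather than eliminating it.

```python
import http.client
from urllib.parse import urlsplit


def fetch_all(uris, connection_factory=http.client.HTTPSConnection, timeout=30):
    """Fetch several https:// URIs on the same host over one reused connection."""
    parts = [urlsplit(u) for u in uris]
    hosts = {p.netloc for p in parts}
    if len(hosts) != 1 or any(p.scheme != "https" for p in parts):
        raise ValueError("expected https URIs that all share one host")

    conn = connection_factory(hosts.pop(), timeout=timeout)
    bodies = []
    try:
        for p in parts:
            target = (p.path or "/") + ("?" + p.query if p.query else "")
            conn.request("GET", target)
            response = conn.getresponse()
            body = response.read()  # drain fully so the connection can be reused
            if response.status != 200:
                raise OSError("HTTP %d for %s" % (response.status, target))
            bodies.append(body)
    finally:
        conn.close()
    return bodies
```

A caller would pass the notification URI followed by the Delta URIs it lists, for example fetch_all(["https://rrdp.example.net/notification.xml", "https://rrdp.example.net/3/delta.xml"]), where the host name is a placeholder.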
RPKI operators should keep in mind that publication data inconsistencies can exist within RRDP itself, within RSYNC, but also between RSYNC and RRDP. Similar to how hosts need to monitor both IPv4 and IPv6 when offering dual-stack services, RPKI Repository operators have to monitor both RRDP and RSYNC. Having said that, the industry as a whole benefits from a deterministic approach to moving forward with RSYNC and RRDP. Simply put, if all validator implementations prefer RRDP and use RSYNC only as a fallback option, eventually RSYNC will no longer receive synchronization requests. RSYNC falling out of fashion is of course contingent on a consistently high-quality RRDP service offering!
RRDP is a protocol across external boundaries, outside the local trust domain
Another discovery was that some RRDP feeds produced XML that does not conform to the IETF specification. Given that RRDP is a conduit between distinct administrative domains, it is very desirable for validators to apply the highest level of scrutiny and to expect nothing less than a strict interpretation of the IETF standards. It was discovered that some RPKI validators were unable to handle certain malformed RRDP input without crashing. Security sensitive applications such as RPKI validators require the opposite of the Robustness Principle: be conservative in what you do, and be even more conservative in what you accept from others!
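To give a flavour of what strict acceptance can look like, the Python sketch below checks an Update Notification with the standard library's expat bindings and rejects anything outside the elements and attributes that RFC 8182 defines for that file. The function name and the allow-list structure are made up for this example, and the check is deliberately incomplete (it does not verify attribute values or that exactly one snapshot element is present), so treat it as a sketch rather than a description of any validator's actual checks.

```python
import xml.parsers.expat

RRDP_NS = "http://www.ripe.net/rpki/rrdp"
# Element name -> (required nesting depth, exact attribute set), per RFC 8182.
ALLOWED = {
    "notification": (0, {"version", "session_id", "serial"}),
    "snapshot": (1, {"uri", "hash"}),
    "delta": (1, {"serial", "uri", "hash"}),
}


def check_notification(data):
    """Raise ValueError on malformed XML or on unknown elements/attributes."""
    parser = xml.parsers.expat.ParserCreate(namespace_separator=" ")
    depth = 0

    def start(name, attrs):
        nonlocal depth
        # With namespace processing on, names arrive as "<namespace> <local>".
        namespace, _, local = name.rpartition(" ")
        if namespace != RRDP_NS or local not in ALLOWED:
            raise ValueError("unexpected element: " + name)
        want_depth, want_attrs = ALLOWED[local]
        if depth != want_depth or set(attrs) != want_attrs:
            raise ValueError("misplaced element or bad attributes: " + local)
        depth += 1

    def end(name):
        nonlocal depth
        depth -= 1

    parser.StartElementHandler = start
    parser.EndElementHandler = end
    try:
        parser.Parse(data, True)
    except xml.parsers.expat.ExpatError as exc:
        raise ValueError("malformed XML: " + str(exc)) from exc
```

The point of the allow-list is that the parser fails closed: input that is truncated, uses the wrong namespace, or carries an unexpected attribute is refused outright instead of being interpreted on a best-effort basis.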
Security features in rpki-client
The OpenBSD project has an extensive history of pushing the envelope of cybersecurity research. The rpki-client application architecture is such that each task of the fetch and validation process takes place in a different process context (privilege separation), and each task-specific subprocess is further restricted from potential unauthorized access to resources through the pledge(2) and unveil(2) system calls. For example, the embedded asynchronous HTTP client can access the Internet, but not the local file system, while the embedded RRDP XML parser has access to neither the local file system nor any network functions, and can only communicate with the main process via imsg pipes. The XML tree is constructed with the stream-oriented XML parser libexpat. TLS connections are established and maintained with the novel OpenBSD libtls library, which aims to make it easier and safer to write TLS applications.
The rpki-client utility’s RRDP implementation follows a very restricted access pattern, which helps reduce the attack surface!
Benefits to the Internet industry in the ARIN region
The rpki-client software is available to all ARIN members (and the rest of the world) under a permissive open source license. Some ARIN members already use rpki-client directly or indirectly, and ARIN tests against rpki-client (along with other validators).
All ARIN members who create RPKI ROAs through ARIN’s RPKI service will benefit from the validator’s capability to use both rsync and RRDP. The RRDP protocol allows the ARIN organization to further streamline the delivery of RPKI data to Relying Parties, while rsync continues to serve as an excellent secondary synchronization protocol in case there is an issue with RRDP.
The rpki-client code was used in demonstrations to other RPKI (RSYNC/RRDP) implementers and stakeholders to show how to implement flipping back and forth between RSYNC and RRDP, and the reception was positive.
Acknowledgement of contributors for RRDP support in OpenBSD rpki-client
RRDP support in rpki-client was primarily developed by Nils Fisher (Australia) and Claudio Jeker (Switzerland); testing and code changeset review by Theo de Raadt (Canada), Theo Beuhler (Germany), Job Snijders (The Netherlands), and Sebastian Benoit (Norway). Tom Harrison and George Michaelson (APNIC) offered assistance as RRDP subject matter experts. Nils Fisher’s portion of the project received financial support from the American Registry for Internet Numbers (ARIN) and the Asia Pacific Network Information Centre (APNIC).
The RRDP protocol offers advantages to the global network operations community; the deployment of RRDP is an important evolution in the RPKI technology stack. However, the technology is not entirely without issues. Implementers will need to take great care to avoid scenarios in which the RRDP protocol itself can be used as an attack vector. Hopefully this implementation report contributes to the development of IETF specifications for future RPKI synchronization protocols.
Call to Action: rpki-client 7.0 (with support for RRDP as a Technology Preview) was released on April 15th, 2021. All RPKI Repository operators are requested to assist in testing rpki-client in relation to their RPKI publication service offering. We call upon each RIR and NIR to ensure their RPKI publication service is interoperable with rpki-client. Repository Operators can benefit from rpki-client as it can function as an early warning system, for example as part of the preparation process before commencing maintenance. Additionally, use of rpki-client helps define what the common denominator is amongst a diverse set of RPKI validator implementations.
The rpki-client 7.0 release notes are available here, and signed release files here. Most people are expected to run rpki-client either natively on OpenBSD, or through third-party software frameworks such as EPEL (CentOS, Fedora, Red Hat), or Ubuntu/Debian. Example build scripts to generate containers compatible with the Open Container Initiative (OCI) format are available here. Software defects in rpki-client itself may be reported to email@example.com. Issues found when building or running rpki-client on Linux, macOS, FreeBSD, or Windows may be filed at the rpki-client-portable project on GitHub. We welcome feedback and improvements from the broader community!