IPv6 Case Study from Monmouth University in West Long Branch, New Jersey
Today all of Monmouth University’s academic, lab, administrative, and wireless networks are dual stacked with IPv6. IPv4 has been around for decades. It’s tried and true, and for that exact reason, we were hesitant to move on to IPv6. After all, most of us know IPv4 inside and out: the private IPv4 ranges, the multicast ranges, the APIPA range (colloquially, “plug and play”), etc. And let’s face it, if you’re a network engineer, you can probably convert subnet masks from dotted-quad notation to CIDR and back again in your sleep (or if you’re in other areas of IT, you probably have a cheat sheet). Enter IPv6. We’re all beginners again. What’s with addresses having letters in them? Colons instead of dots? Double colons? Heresy, I say! Deep breath. It’s OK. Your knowledge ramps up quickly, and in our case, we went from IPv6 zero to IPv6 hero in less than one summer. Here is our story.
The Wakeup Call
Rewind to 2013. NJEdge, a consortium of higher education institutions in New Jersey, from whom we buy Internet access, gave us a block of IPv6 addresses to play around with. We tinkered a little, but saw no real value in going further. However, our wakeup call to get serious about IPv6 arrived the following summer when we were having an issue with Microsoft Exchange. We had recently spun up a few new Exchange 2013 servers – a routine hardware refresh. The upgrade should have resulted in an enhanced client experience. Instead, clients were now experiencing delays whenever they attempted a connection to the servers. After a full day of troubleshooting with 5 server administrators and 2 network engineers, we gave up and opened a trouble ticket with Microsoft. After several days of calls, their engineers isolated the problem. Sometimes the most difficult problems to troubleshoot are those where erroneous assumptions are made. In our case, at some point during the spin-up, our new servers published both their IPv4 and IPv6 addresses to DNS. Our network wasn’t fully set up to handle IPv6 yet, so clients were attempting an IPv6 connection, timing out, and falling back to IPv4. There is an old saying involving an egg and a person’s face which I shall not retell here, but suffice it to say, we felt that it was finally time to “get with it” and embrace IPv6.
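The delay described above can be illustrated with a small simulation. This is only a sketch and does not reproduce how any particular mail client or operating system behaves: the function names, the 3-second timeout, and the documentation-range addresses are all assumptions made for illustration.

```python
from typing import Callable, List, Tuple


def connect_with_fallback(
    addresses: List[str],
    try_connect: Callable[[str, float], bool],
    timeout: float = 3.0,  # assumed value, for illustration only
) -> Tuple[str, float]:
    """Try each address in order.

    Returns the first address that connects and the seconds lost to
    earlier attempts that timed out.
    """
    lost = 0.0
    for address in addresses:
        if try_connect(address, timeout):
            return address, lost
        lost += timeout  # a silently dropped attempt costs the full timeout
    raise ConnectionError("no address was reachable")


def ipv6_unrouted(address: str, timeout: float) -> bool:
    """Hypothetical network: AAAA records exist, but IPv6 is not routed."""
    return ":" not in address  # only IPv4 literals succeed


# DNS returned both records; the client prefers IPv6 and tries it first.
published = ["2001:db8::10", "192.0.2.10"]
winner, delay = connect_with_fallback(published, ipv6_unrouted)
print(f"connected via {winner} after {delay:.0f}s of timeouts")
```

Every new connection pays the timeout before IPv4 is tried, which is why the symptom looked like a slow server rather than a broken one.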
We applied for (and ARIN assigned) our own /48 address block (2620:3A:C000::/48). For those new to IPv6, I should explain that an address is made up of 8 “octets” (actually called hexadectets, as each one represents 16 bits instead of the 8 bits of an IPv4 octet). However, the important takeaway is that ARIN allocated about 1.2 septillion addresses (2^80) to us. Did I mention how large the entire IPv6 range is? We needed to come up with an internal allocation scheme that would be simple and memorable. We eventually settled on the format 2620:3A:C000:<vlan#>:0000:0000:<x>:<z>, where <x> identifies whether the address is static or dynamic and <z> is dynamically assigned. This helps the IT staff make a good guess as to a server’s IPv6 address if they know its IPv4 address, and vice versa. In the case of a client using DHCP, the address tells us the VLAN number and which DHCP server assigned the address.
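As a rough sketch of the numbering plan above, Python's standard `ipaddress` module can confirm the size of the /48 and assemble an address from its parts. The function name `plan_address` and the sample values for `<x>` and `<z>` are hypothetical, and the code assumes the VLAN number is written with its decimal digits in the fourth group (VLAN 120 becomes `:120:`); the text does not say how the VLAN number is actually encoded.

```python
import ipaddress

# The /48 block named in the text.
BLOCK = ipaddress.IPv6Network("2620:3a:c000::/48")


def plan_address(vlan: int, x: int, z: int) -> ipaddress.IPv6Address:
    """Build 2620:3A:C000:<vlan>:0000:0000:<x>:<z>.

    Assumption: the VLAN's decimal digits are reused as hex digits so the
    group reads like the VLAN number (an illustrative choice, not
    confirmed by the text).
    """
    if not 0 <= vlan <= 4094:
        raise ValueError("VLAN number out of range")
    if not (0 <= x <= 0xFFFF and 0 <= z <= 0xFFFF):
        raise ValueError("<x> and <z> must each fit in one 16-bit group")
    vlan_group = int(str(vlan), 16)  # e.g. 120 -> 0x0120
    value = (
        int(BLOCK.network_address)
        | (vlan_group << 64)  # fourth group
        | (x << 16)           # seventh group
        | z                   # eighth group
    )
    return ipaddress.IPv6Address(value)


print(BLOCK.num_addresses == 2**80)  # a /48 leaves 80 bits for us
print(plan_address(120, 1, 0x25))    # hypothetical VLAN 120, x=1, z=0x25
```

A /48 also leaves 16 bits for the fourth group, so there is room for 65,536 separate /64 networks, far more than the number of VLANs.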
Limited IPv4 address resources
With the advent of cheap mobile devices, we have seen our total device count nearly triple in the past decade. IPv4 addresses are tough to come by these days, and like other organizations, we are forced to use private addressing internally and NAT to assign a public address to private addresses when they access the Internet. Occasionally we even run out of NAT addresses and fall back to using PAT (that is, one public address translates to multiple private addresses).
As an educational institution (and essentially an ISP), we occasionally receive copyright infringement notices that need to be traced back to a particular private address and user. Typically, we need to consult our firewall logs to determine which private address was mapped to the public address shown in the infringement notice at a particular time. We then take that information and consult another log file, which gives us the username. This can be time-consuming, especially with large log files.
The process is streamlined with IPv6 because the address that is exposed to the Internet is the actual address that was handed out to the device. The more traffic that uses IPv6, the less demand there is on our limited IPv4 NAT addresses. If one of our students is on an IPv6-enabled website such as Facebook, YouTube or Google, they use IPv6 and do not use up NAT resources. In turn, this means we are less likely to fall back onto PAT during peak times (tracing addresses that used PAT instead of NAT is sometimes trickier).
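The difference between the two tracing paths can be sketched as follows. The record layouts (`NatEntry`, `Lease`), the usernames, the timestamps, and the addresses are all invented for illustration; real firewall and DHCP logs will look different.

```python
import ipaddress
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class NatEntry:
    """Hypothetical firewall log row: public address mapped to a private one."""
    public_ip: str
    private_ip: str
    start: datetime
    end: datetime


@dataclass
class Lease:
    """Hypothetical log row tying an internal address to a username."""
    ip: str
    username: str
    start: datetime
    end: datetime


def same_ip(a: str, b: str) -> bool:
    # Compare parsed addresses so different spellings of one IPv6 address match.
    return ipaddress.ip_address(a) == ipaddress.ip_address(b)


def user_for(leases: List[Lease], ip: str, when: datetime) -> Optional[str]:
    for lease in leases:
        if same_ip(lease.ip, ip) and lease.start <= when <= lease.end:
            return lease.username
    return None


def trace(notice_ip: str, when: datetime,
          nat_log: List[NatEntry], leases: List[Lease]) -> Optional[str]:
    """Map the address in an infringement notice to a username."""
    if ipaddress.ip_address(notice_ip).version == 6:
        # IPv6: the address seen on the Internet is the one we handed out.
        return user_for(leases, notice_ip, when)
    # IPv4: first undo the NAT translation, then look up the user.
    for entry in nat_log:
        if same_ip(entry.public_ip, notice_ip) and entry.start <= when <= entry.end:
            return user_for(leases, entry.private_ip, when)
    return None


# Invented sample data.
t0, t1 = datetime(2016, 3, 1, 8, 0), datetime(2016, 3, 1, 18, 0)
noon = datetime(2016, 3, 1, 12, 0)
nat_log = [NatEntry("198.51.100.7", "10.20.0.15", t0, t1)]
leases = [Lease("10.20.0.15", "jdoe", t0, t1),
          Lease("2001:db8:0:120::2:a1", "asmith", t0, t1)]

print(trace("198.51.100.7", noon, nat_log, leases))          # two lookups
print(trace("2001:DB8:0:120::2:A1", noon, nat_log, leases))  # one lookup
```

With PAT, the IPv4 branch would also need to match on source port, which is part of why those traces take longer.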
People look at IPv4 and say everything is working fine now. Then, when they move to IPv6, they say everything is still working fine, so why do I want to go through the hassle? From a user perspective (or even management perspective), there’s no change, but from a network perspective it makes our lives easier because the firewall doesn’t have to maintain as many entries on the NAT table. The network performance is better, even if users don’t notice a difference.
Since almost all modern operating systems and applications are dual-stack aware, we found that there were no departments or users saying they “need” IPv6 capability. Without any outside pressure to migrate toward IPv6, the IT department itself had to spearhead the movement. We used our idle time (typically mid-summer) to implement IPv6 on a small scale – first enabling the VLAN that the IT department uses along with a test VLAN that would simulate the network that held most of our servers. Our deployment team consisted of two network engineers and one server administrator. This team of three was able to make the needed router changes, create a test VLAN, spin up a test server on the test VLAN, and create DNS records. Since IPv6 has actually been around for years, we were assured that Mr. Google could help us fix any beginner issues we would likely encounter.
As we finished our internal testing, we enabled IPv6 on all VLANs. This essentially meant telling our routers which IPv6 prefixes belonged to which VLANs. We had full IPv6 connectivity internally (at least to the servers that we re-enabled IPv6 on and published AAAA records for). We then began to enable VLANs going out toward our Internet border routers. This is where we ran into our first real roadblock.
Challenge 1: Our network design at the border consists of two separate border routers, each with its own link to our ISP. However, load balancing is accomplished by a third router (with built-in redundancy) which connects the two border routers to our firewall. We found that the Cisco IOS software installed on this third router did not support IPv6. The software needed to be changed out to a different version that had IPv6 support – this resulted in a 5-minute outage overnight as the router rebooted. However, the next day we found that the new version did not have EIGRP support (an internal routing protocol). That meant another 5-minute planned outage overnight for a reboot to install the correct software (for real this time). This wasn’t terrible, but it meant two planned outages, one of which could have been avoided.
Challenge 2: Enabling IPv6 on Wi-Fi. Our wireless controller vendor claimed to support IPv6. It did, sort of. Normally the controller handles about 8,000 users/IP addresses. However, now that each device was using 3 addresses (IPv4, IPv6 global, and an IPv6 link-local address), its load suddenly was up to 12,000+. It wasn’t quite double/triple, as some devices weren’t actively using more than one address at a given time (depending on what sites the device happened to be communicating with). We needed to upgrade our controller to the next-fastest model to alleviate some performance issues. This was an unexpected purchase, but the cost was low enough to squeeze into the existing budget.
Challenge 3: Enabling statistics reporting. The big question was, “OK, we enabled everything with IPv6. How much Internet traffic is actually using it?” We settled on implementing this at the two Internet border routers. Out of the box, the router does not distinguish between IPv4 and IPv6 traffic. We created two sub-interfaces and routed IPv4 traffic to one and IPv6 traffic to the other. This allowed us to pull traffic statistics from the sub-interfaces to visualize IPv6 utilization. Approximately 40% of our inbound traffic is IPv6.
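Once each protocol has its own sub-interface, the IPv6 share is simple arithmetic on the two byte counters. The sketch below assumes cumulative inbound byte counters polled at two points in time. The function name and the counter values are made up, with the numbers chosen only so the result lands on the roughly 40% quoted above; they are not measured data.

```python
from typing import Tuple

# Each reading is (ipv4_bytes, ipv6_bytes) from the two sub-interfaces.
Reading = Tuple[int, int]


def ipv6_share(previous: Reading, current: Reading) -> float:
    """Fraction of bytes that were IPv6 between two counter polls."""
    v4 = current[0] - previous[0]
    v6 = current[1] - previous[1]
    if v4 < 0 or v6 < 0:
        # Cumulative counters only go backwards after a reset or wrap.
        raise ValueError("counter reset or wrap between polls")
    total = v4 + v6
    return v6 / total if total else 0.0


# Invented counter values for two polls.
earlier = (1_000_000, 500_000)
later = (1_600_000, 900_000)
share = ipv6_share(earlier, later)
print(f"{share:.0%} of inbound bytes were IPv6")
```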
Challenge 4: A couple of pieces of equipment claimed to support IPv6, but then would not do their job correctly over IPv6. For example, our bandwidth management appliance would pass IPv6 traffic, but it would not actually rate-limit any of it. We had to work with the vendor to resolve the issue. Once they saw how much of our traffic was IPv6, they agreed to finish implementing full support. Now they actively highlight IPv6 as a feature of their product. Our spam firewall solution, from a well-known but far from cutting-edge company, advertises IPv6 support, but simple features like being able to whitelist an IPv6 address were (and still are) missing. This has not been a major issue yet, but in the years to come, it may be.
While SLAAC auto-configuration is a great idea – clients pick their own address within the valid range for the network they are on – in practice, we found it to be the wild west.
We chose stateful DHCPv6 over SLAAC because we wanted better accountability to trace a user when necessary. With SLAAC, we wouldn’t always be able to do that. As an educational institution (and essentially an ISP), we must have the ability to look back and determine who was using an IP address at what time. We were already using DHCPv4, so why not use DHCPv6 as well to hand out addresses?
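To see why SLAAC makes tracing harder, consider this minimal sketch of a client choosing its own address. Real clients derive the interface identifier in other ways (for example from the MAC address, or with privacy extensions); the random pick here is a stand-in, and the function name and the documentation-range prefix are assumptions.

```python
import ipaddress
import secrets


def slaac_style_address(prefix: str) -> ipaddress.IPv6Address:
    """Pick a random 64-bit interface ID inside an advertised /64.

    The router only advertises the prefix; the lower 64 bits are chosen
    by the client, so no server holds a record of who picked what.
    """
    network = ipaddress.IPv6Network(prefix)
    if network.prefixlen != 64:
        raise ValueError("SLAAC expects a /64 prefix")
    interface_id = secrets.randbits(64)
    return ipaddress.IPv6Address(int(network.network_address) | interface_id)


# Hypothetical /64 advertised on one VLAN.
address = slaac_style_address("2001:db8:0:120::/64")
print(address)
```

With stateful DHCPv6, the server chooses the address and logs the lease, which is the record needed to look back and tie an address to a user.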
While we are currently using stateful DHCPv6 (aka “regular” DHCP, just like v4), we have found that Android (by intended design) does not support DHCPv6. Instead, Android’s authors believe that users have better privacy when a device chooses its own short-duration IP addresses via SLAAC. This is true; however, we found that there were issues when an Android device chose a SLAAC address on our network, then lost Wi-Fi connectivity and continued using our address over the mobile carrier network. We needed to turn off IPv6 for Android mobile users so they did not get timeouts or refused connections. We may re-evaluate the use of stateful DHCPv6 and the current state of Android behavior in the months to come.
Your webserver is just another host
We treated migrating our main website onto IPv6 just like any other server. We gave an IPv6 address to our main web server, published the record in DNS, made the required firewall changes and that was it. By that point, we already had other servers enabled, we had all of our clients enabled, and we had our path all the way out to the Internet enabled. There was nothing special about the webserver compared to any other. In our initial testing, we spun up a webserver similar to ours to do testing with IPv6 along with a variety of browsers.
Since we advertise both A and AAAA DNS records for the main website, our biggest fear was that IPv4-only clients would experience a timeout delay. It turned out we didn’t have any issues with that, but it was a big unknown at first.
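One quick way to confirm which record types a client will see for a hostname is to group the results of `socket.getaddrinfo` by address family. This is a sketch: the function name `published_families`, the hostname, and the stub resolver are hypothetical, and the stub exists only so the example runs without touching the network.

```python
import socket
from typing import Callable, Dict, List


def published_families(
    hostname: str,
    port: int = 443,
    resolver: Callable = socket.getaddrinfo,
) -> Dict[str, List[str]]:
    """Group resolved addresses into IPv4 ("A") and IPv6 ("AAAA")."""
    found: Dict[str, List[str]] = {"A": [], "AAAA": []}
    results = resolver(hostname, port, type=socket.SOCK_STREAM)
    for family, _type, _proto, _canonname, sockaddr in results:
        if family == socket.AF_INET:
            found["A"].append(sockaddr[0])
        elif family == socket.AF_INET6:
            found["AAAA"].append(sockaddr[0])
    return found


def stub_resolver(host, port, type=0):
    """Stand-in for DNS, returning documentation-range addresses."""
    return [
        (socket.AF_INET6, socket.SOCK_STREAM, 6, "", ("2001:db8::10", port, 0, 0)),
        (socket.AF_INET, socket.SOCK_STREAM, 6, "", ("192.0.2.10", port)),
    ]


records = published_families("www.example.edu", resolver=stub_resolver)
print(records)
```

Leaving out the `resolver` argument uses the system resolver, so the result reflects whatever the local machine is able to look up.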
Don’t be afraid of IPv6. The IPv6 numbering scheme seems haphazard at first glance – letters, numbers, colons, double colons, not all “octets” have 4 characters. However, it is only a shorthand notation that you will come to appreciate very quickly. Depending on how you do your numbering plan, there is even a lot of information you can put into the address itself of a server. The choice to use SLAAC or stateful DHCP may depend on your accountability needs, but a DHCP server such as Windows Server 2016 or 2012R2 is very easy to implement.
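The shorthand is easy to demystify with Python's standard `ipaddress` module, which converts between the full and compressed spellings of the same address. The host portion of the sample address is hypothetical.

```python
import ipaddress

# Full form: eight groups of four hex digits.
full = "2620:003A:C000:0120:0000:0000:0001:0025"
address = ipaddress.IPv6Address(full)

# Leading zeros in each group are dropped, and one run of all-zero
# groups collapses into a double colon.
short_form = address.compressed
long_form = address.exploded
print(short_form)
print(long_form)

# Both spellings parse to the same 128-bit value.
print(ipaddress.IPv6Address(short_form) == address)
```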
In the education field, we try to stay on the cutting edge. We want to present the technology that the student will encounter in the workplace 4 years down the road. While enabling IPv6 connectivity may be behind-the-scenes for most users, a computer science or software engineering student would take notice at some point during their academic career.