Docker containers with IPv6 behind NAT

WARNING

In production IPv6 should always be used without NAT. Only use IPv6 and NAT for testing purposes. There is no valid reason to use IPv6 with NAT in any production environment.

IPv6 and NAT

IPv6 is designed to remove the need for NAT, and that is a very, very good thing. NAT breaks peer-to-peer connections, and restoring them is one of the great things about IPv6: every device on the internet gets its own public IP address again.

Docker and IPv6

Support for IPv6 in Docker has been there for a while now, but it is disabled by default. The documentation describes how to enable it.

I wanted to enable IPv6 on my Docker setup on my laptop running Ubuntu, but as my laptop is a mobile device the IPv6 prefix I have changes when I move to a different location. IPv6 Prefix Delegation isn’t available at every IPv6-enabled location either, so I wanted to figure out if I could enable IPv6 in my Docker setup locally and use NAT to have my containers reach the internet over IPv6.

At home I have IPv6 via ZeelandNet and at the office we have a VDSL connection from XS4All. When I’m at a remote location I enable our OpenVPN tunnel, which has IPv6 enabled. This way I always have IPv6 available.

The Docker documentation shows that enabling IPv6 is very easy. I modified the systemd service file of docker and added a fixed IPv6 CIDR:

ExecStart=/usr/bin/dockerd --ipv6 --fixed-cidr-v6="fd00::/64" -H fd://

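fd00::/64 lies in the Unique Local Address (ULA) range fc00::/7 from RFC 4193, which replaced the deprecated Site-Local range fec0::/10. It is not routed on the public internet, so it can be used safely here.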
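fd00::/64 has an all-zero Global ID, which works for a quick test but can clash with other networks that made the same choice. RFC 4193 describes locally assigned prefixes in fd00::/8 that carry a random 40-bit Global ID. A minimal sketch of generating one, assuming /dev/urandom and od are available; the function name random_ula_prefix is made up for illustration, and plain random bytes are a simplification of the algorithm the RFC suggests:

```shell
# Print a random unique local /48 of the form fdXX:XXXX:XXXX::/48.
random_ula_prefix() {
    # Read 5 random bytes (40 bits) and turn them into 10 hex digits.
    hex=$(od -An -N5 -tx1 /dev/urandom | tr -d ' \n')
    printf 'fd%s:%s:%s::/48\n' \
        "$(printf '%s' "$hex" | cut -c1-2)" \
        "$(printf '%s' "$hex" | cut -c3-6)" \
        "$(printf '%s' "$hex" | cut -c7-10)"
}

random_ula_prefix
```

A /64 taken from the printed /48 could then be used for --fixed-cidr-v6 and in the ip6tables rule instead of fd00::/64.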

I then added a masquerading rule to ip6tables so that traffic from this subnet gets NATed:

sudo ip6tables -t nat -A POSTROUTING -s fd00::/64 -j MASQUERADE
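Running that command more than once appends duplicate rules. A small sketch of an idempotent variant, which first asks ip6tables whether the rule exists (-C) and only appends it (-A) when it does not. The function name ensure_masquerade and the IP6TABLES override variable are hypothetical helpers, not part of ip6tables:

```shell
# Add the MASQUERADE rule for a subnet only if it is not present yet.
# IP6TABLES may be overridden, e.g. for a dry run; it defaults to ip6tables.
ensure_masquerade() {
    subnet=$1
    ipt=${IP6TABLES:-ip6tables}
    # -C exits non-zero when no matching rule exists in the chain.
    if ! "$ipt" -t nat -C POSTROUTING -s "$subnet" -j MASQUERADE 2>/dev/null; then
        "$ipt" -t nat -A POSTROUTING -s "$subnet" -j MASQUERADE
    fi
}
```

From a root shell, or a script started with sudo, this would be called as ensure_masquerade fd00::/64.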

Result

My Docker containers now get an IPv6 address, as can be seen below:

root@da80cf3d8532:~# ip -6 a
1: lo:  mtu 65536 state UNKNOWN qlen 1
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
15: eth0@if16:  mtu 1500 state UP 
    inet6 fd00::242:ac11:2/64 scope global nodad 
       valid_lft forever preferred_lft forever
    inet6 fe80::42:acff:fe11:2/64 scope link 
       valid_lft forever preferred_lft forever
root@da80cf3d8532:~#

In this case the address is fd00::242:ac11:2, which was assigned by Docker.

Since my laptop has IPv6 I can now ping pcextreme.nl from my Docker container.

root@da80cf3d8532:~# ping6 -c 3 pcextreme.nl -n
PING pcextreme.nl (2a00:f10:101:0:46e:c2ff:fe00:93): 56 data bytes
64 bytes from 2a00:f10:101:0:46e:c2ff:fe00:93: icmp_seq=0 ttl=61 time=14.368 ms
64 bytes from 2a00:f10:101:0:46e:c2ff:fe00:93: icmp_seq=1 ttl=61 time=16.132 ms
64 bytes from 2a00:f10:101:0:46e:c2ff:fe00:93: icmp_seq=2 ttl=61 time=15.790 ms
--- pcextreme.nl ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max/stddev = 14.368/15.430/16.132/0.764 ms
root@da80cf3d8532:~#

Again, this should ONLY be used for testing purposes. For production, IPv6 Prefix Delegation is the way to go.

Docker and IPv6 Prefix Delegation

As posted earlier I have IPv6 Prefix Delegation working at our office to test with Docker.

One of the missing links was to automatically configure Docker to use the prefix obtained through DHCPv6+PD. I manually configured the prefix in Docker, but I also had to run dhclient manually.

I figured this could be automated so I gave it a try.

Ubuntu Networking

At first I tried to figure out if Ubuntu’s networking was somehow able to request a prefix through DHCPv6. Long story short: neither Ubuntu nor CentOS is able to do so. You have to script this manually.

dhclient

To obtain a prefix I had to run dhclient manually. That wasn’t too hard. Simply run:

dhclient -6 -P -d -v eth0

This resulted in obtaining a prefix:

Bound to *:546
Listening on Socket/eth0
Sending on   Socket/eth0
PRC: Confirming active lease (INIT-REBOOT).
XMT: Forming Rebind, 0 ms elapsed.
XMT:  X-- IA_PD d5:68:28:08
XMT:  | X-- Requested renew  +3600
XMT:  | X-- Requested rebind +5400
XMT:  | | X-- IAPREFIX 2001:980:XXXX:140::/60
XMT:  | | | X-- Preferred lifetime +7200
XMT:  | | | X-- Max lifetime +7500
XMT:  V IA_PD appended.
XMT: Rebind on eth0, interval 940ms.
RCV: Reply message on eth0 from fe80::da67:d9ff:fe81:bcec.
RCV:  X-- IA_PD d5:68:28:08
RCV:  | X-- starts 1457617054
RCV:  | X-- t1 - renew  +604800
RCV:  | X-- t2 - rebind +967680
RCV:  | X-- [Options]
RCV:  | | X-- IAPREFIX 2001:980:XXXX:140::/60
RCV:  | | | X-- Preferred lifetime 1209600.
RCV:  | | | X-- Max lifetime 2592000.
RCV:  X-- Server ID: 00:03:00:01:d8:67:d9:81:bc:f0
PRC: Bound to lease 00:03:00:01:d8:67:d9:81:bc:f0.
PRC: Renewal event scheduled in 604800 seconds, to run for 362880 seconds.
PRC: Depreference scheduled in 1209600 seconds.
PRC: Expiration scheduled in 2592000 seconds.

As you can see, I got a /60 prefix. Now I had to somehow get this automated and configure Docker to use it.
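To pick the delegated prefix out of such a verbose log, for instance while debugging, a short awk filter is enough. This is only a sketch for inspecting output in the format shown above; the function name extract_iaprefix is invented here, it assumes the prefix is the last field on an RCV line containing IAPREFIX, and 2001:db8:0:140::/60 is a documentation prefix standing in for the real one:

```shell
# Print the first delegated prefix found on an "RCV: ... IAPREFIX" line.
extract_iaprefix() {
    awk '/^RCV:/ && /IAPREFIX/ { print $NF; exit }'
}

# Example with two lines in the log format shown above.
printf '%s\n' \
    'XMT:  | | X-- IAPREFIX 2001:db8:0:140::/60' \
    'RCV:  | | X-- IAPREFIX 2001:db8:0:140::/60' | extract_iaprefix
```

If dhclient writes its log to stderr, a live run would need 2>&1 before the pipe.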

Upstart

Since I was testing with Docker 1.10 under Ubuntu 14.04 I had to use Upstart to run dhclient.

The /etc/init/dhclient6-pd.conf Upstart script I created was rather simple:

description     "DHCPv6 Prefix Delegation client"

start on runlevel [2345]
stop on runlevel [!2345]

respawn
respawn limit 30 3
umask 022

console log

exec dhclient -6 -P -d eth0

DHCP hook

dhclient can run hook scripts when its lease state changes. I wrote a hook which extracts the delegated IPv6 prefix and restarts Docker.

I placed the hook in the default location for DHCP hooks: /etc/dhcp/dhclient-enter-hooks.d/docker-ipv6:

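#!/bin/bash
# dhclient enter hook: derive a Docker subnet from the delegated IPv6 prefix.

SUBNET_SIZE=80
DOCKER_ETC_DIR="/etc/docker"
DOCKER_PREFIX_FILE="${DOCKER_ETC_DIR}/ipv6.prefix"

# new_ip6_prefix is set by dhclient once a prefix has been delegated.
if [ -n "$new_ip6_prefix" ]; then
    # Take the first /80 that sipcalc splits out of the delegated prefix.
    SUBNET=$(sipcalc -S "$SUBNET_SIZE" "$new_ip6_prefix" | grep Network | head -n 1 | awk '{print $3}')
    echo "${SUBNET}/${SUBNET_SIZE}" > "$DOCKER_PREFIX_FILE"

    # Only restart Docker when the delegated prefix actually changed.
    if [ "$old_ip6_prefix" != "$new_ip6_prefix" ]; then
        service docker restart
    fi
fi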
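The hook can only carve an /80 out of the delegation if the delegated prefix is no longer than /80, and it does not check that. A guard along the following lines could be added. This is a sketch; prefix_fits is a hypothetical helper name, it assumes the prefix always has the form address/length, and 2001:db8:0:140::/60 is a documentation prefix:

```shell
# Succeed when the delegated prefix (address/length) is at most
# as long as the wanted subnet size, e.g. a /60 can hold a /80.
prefix_fits() {
    prefix_len=${1##*/}      # strip everything up to the last "/"
    [ "$prefix_len" -le "$2" ]
}

if prefix_fits "2001:db8:0:140::/60" 80; then
    echo "prefix is usable"
fi
```

In the hook the call would use "$new_ip6_prefix" and "$SUBNET_SIZE" instead of the literal values.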

For this to work you need to modify /etc/default/docker so that this line reads:

DOCKER_OPTS="--ipv6 --fixed-cidr-v6=`cat /etc/docker/ipv6.prefix`"

The result

Docker is now running properly with an IPv6 subnet configured, and my containers have an IPv6 address as well.

wido@wido-desktop:~$ docker exec -ti 94c8f02 ip addr show dev eth0
13: eth0:  mtu 1500 qdisc noqueue state UP group default 
    link/ether 02:42:ac:11:00:02 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.2/16 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 2001:980:XXXX:140:0:242:ac11:2/80 scope global 
       valid_lft forever preferred_lft forever
    inet6 fe80::42:acff:fe11:2/64 scope link 
       valid_lft forever preferred_lft forever
wido@wido-desktop:~$
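In the output above the last three groups of the global IPv6 address (242:ac11:2) mirror the MAC address 02:42:ac:11:00:02. That suggests this Docker version builds the address from the subnet plus the 48-bit MAC, which would also explain the /80 subnet size (128 - 48). A small sketch of that mapping, based only on the addresses shown here; mac_to_suffix is an illustrative name:

```shell
# Turn a MAC address into the three trailing IPv6 groups,
# dropping leading zeros the way IPv6 notation does.
mac_to_suffix() {
    # Drop the colons: 02:42:ac:11:00:02 -> 0242ac110002
    hex=$(printf '%s' "$1" | tr -d ':')
    a=$(printf '%s' "$hex" | cut -c1-4)
    b=$(printf '%s' "$hex" | cut -c5-8)
    c=$(printf '%s' "$hex" | cut -c9-12)
    printf '%x:%x:%x\n' "0x$a" "0x$b" "0x$c"
}

mac_to_suffix 02:42:ac:11:00:02   # prints 242:ac11:2
```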

Native IPv6 in my Docker containers fully automated and dynamic!

All the scripts I used can be found on GitHub.

IPv6 Prefix Delegation on a Cisco 887VA behind a XS4All VDSL2 connection

XS4All connection

At the PCextreme office we have a XS4All VDSL2 connection which has native IPv6. We get a /48 from XS4All.

I wrote two earlier blog posts about getting the Cisco 887VA router set up, which might be of interest before you continue reading.

IPv6 Prefix Delegation

From XS4All we get a /48 routed to our office using DHCPv6 Prefix Delegation. We are experimenting and testing with Docker at the office where we also want to test the IPv6 capabilities of Docker.

The goal was to sub-delegate /60 subnets out of a /56 towards clients internally. I had to figure out how to get this configured on Cisco IOS.

  • We get a /48 delegated from XS4All
  • The first /56 is used for our local networks (LAN, Guest and Servers)
  • The second /56 is used as a pool to delegate /60 subnets from

Sipcalc

To calculate the IPv6 subnets I used the tool ‘sipcalc’. I needed to find the second /56 in our /48:

sipcalc -S 56 2001:980:XX::/48

The output is rather long, so I trimmed it a bit:

-[ipv6 : 2001:980:XX::/48] - 0

[Split network]
Network			- 2001:0980:XX:0000:0000:0000:0000:0000 -
			  2001:0980:XX:00ff:ffff:ffff:ffff:ffff
Network			- 2001:0980:XX:0100:0000:0000:0000:0000 -
			  2001:0980:XX:01ff:ffff:ffff:ffff:ffff
Network			- 2001:0980:XX:0200:0000:0000:0000:0000 -
			  2001:0980:XX:02ff:ffff:ffff:ffff:ffff
...
...
Network			- 2001:0980:XX:ff00:0000:0000:0000:0000 -
			  2001:0980:XX:ffff:ffff:ffff:ffff:ffff

-

In this case 2001:0980:XX:0100:0000:0000:0000:0000/56 is the second /56 in our /48.
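The position of a /56 inside a /48 is encoded in the upper 8 bits of the fourth 16-bit group, so the sipcalc result can be cross-checked with shell arithmetic. A minimal sketch; the function name nth_56_group is made up, and the index is zero-based, so the second /56 has index 1:

```shell
# Fourth address group (as 4 hex digits) of the n-th /56 in a /48.
# A /56 leaves 8 of the group's 16 bits free, so the index is shifted by 8.
nth_56_group() {
    printf '%04x\n' $(( $1 << 8 ))
}

nth_56_group 1                 # second /56, prints 0100
echo $(( 1 << (56 - 48) ))     # number of /56 subnets in a /48, prints 256
```

Index 0 gives 0000 and index 255 gives ff00, matching the first and last networks in the sipcalc output.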

Cisco IOS

Some searching brought me to cisco.com which had some examples.

Eventually it was actually quite easy to get it working.

Configuration

You need to define a DHCPv6 pool on the Cisco router and tell it to start a DHCPv6 server on the proper interface.

ipv6 dhcp pool local-ipv6
 prefix-delegation pool local-ipv6-pd-pool lifetime 3600 1800
 dns-server 2001:888:0:6::66
 dns-server 2001:888:0:9::99
 domain-name pcextreme.nl
interface Vlan1
 ip address 192.168.5.1 255.255.255.0
 ip nat inside
 ip virtual-reassembly in
 ipv6 address xs4all-prefix ::1/64
 ipv6 enable
 ipv6 nd other-config-flag
 ipv6 nd ra interval 30
 ipv6 nd ra dns server 2001:888:0:6::66
 ipv6 nd ra dns server 2001:888:0:9::99
 ipv6 dhcp server local-ipv6 rapid-commit
 ipv6 mld query-interval 60
ipv6 local pool local-ipv6-pd-pool 2001:980:XX:100::/56 60

That’s all!
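With this configuration the pool can hand out 16 prefixes, since a /56 contains 2^(60-56) subnets of size /60. The sketch below lists them; 2001:db8:0 is a documentation prefix standing in for the real 2001:980:XX, and list_pool_prefixes is an illustrative name:

```shell
# Enumerate the /60 subnets inside the pool <base>:100::/56.
list_pool_prefixes() {
    base=$1
    i=0
    while [ "$i" -lt $(( 1 << (60 - 56) )) ]; do
        # Each /60 advances the fourth group by 0x10.
        printf '%s:%x::/60\n' "$base" $(( 0x100 + (i << 4) ))
        i=$(( i + 1 ))
    done
}

list_pool_prefixes 2001:db8:0
```

The list runs from 2001:db8:0:100::/60 up to 2001:db8:0:1f0::/60.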

Asking for a Prefix

On my Ubuntu desktop I could now request a subnet:

wido@wido-desktop:~$ sudo dhclient -6 -P -v eth0
Internet Systems Consortium DHCP Client 4.2.4
Copyright 2004-2012 Internet Systems Consortium.
All rights reserved.
For info, please visit https://www.isc.org/software/dhcp/

Bound to *:546
Listening on Socket/eth0
Sending on   Socket/eth0
PRC: Soliciting for leases (INIT).
XMT: Forming Solicit, 0 ms elapsed.
XMT:  X-- IA_PD d5:68:28:08
XMT:  | X-- Request renew in  +3600
XMT:  | X-- Request rebind in +5400
XMT: Solicit on eth0, interval 1060ms.
RCV: Advertise message on eth0 from fe80::da67:d9ff:fe81:bcec.
RCV:  X-- IA_PD d5:68:28:08
RCV:  | X-- starts 1455279332
RCV:  | X-- t1 - renew  +900
RCV:  | X-- t2 - rebind +1440
RCV:  | X-- [Options]
RCV:  | | X-- IAPREFIX 2001:980:XX:100::/60
RCV:  | | | X-- Preferred lifetime 1800.
RCV:  | | | X-- Max lifetime 3600.
RCV:  X-- Server ID: 00:03:00:01:d8:67:d9:81:bc:f0
RCV:  Advertisement recorded.
PRC: Selecting best advertised lease.

As you can see I got 2001:980:XX:100::/60 delegated to my desktop.

IPv6 routes

After I asked for a subnet on my desktop, this is what the routes look like. You can see a /60 being routed to my link-local address.

firewall-vdsl-veldzigt#show ipv6 route
IPv6 Routing Table - default - 8 entries
Codes: C - Connected, L - Local, S - Static, U - Per-user Static route
       B - BGP, HA - Home Agent, MR - Mobile Router, R - RIP
       H - NHRP, D - EIGRP, EX - EIGRP external, ND - ND Default
       NDp - ND Prefix, DCE - Destination, NDr - Redirect, O - OSPF Intra
       OI - OSPF Inter, OE1 - OSPF ext 1, OE2 - OSPF ext 2, ON1 - OSPF NSSA ext 1
       ON2 - OSPF NSSA ext 2, la - LISP alt, lr - LISP site-registrations
       ld - LISP dyn-eid, a - Application
S   ::/0 [1/0]
     via Dialer0, directly connected
S   2001:980:XX::/48 [1/0]
     via Null0, directly connected
C   2001:980:XX::/64 [0/0]
     via Vlan1, directly connected
L   2001:980:XX::1/128 [0/0]
     via Vlan1, receive
C   2001:980:XX:1::/64 [0/0]
     via Vlan300, directly connected
L   2001:980:XX:1::1/128 [0/0]
     via Vlan300, receive
S   2001:980:XX:100::/60 [1/0]
     via FE80::C23F:D5FF:FE68:XX, Vlan1
L   FF00::/8 [0/0]
     via Null0, receive
firewall-vdsl-veldzigt#

The subnet is working now and I can hand out addresses from it to my Docker containers.

Maximum amount of Docker containers on a single host

While playing with Docker I wanted to know how many containers I could spawn on a single system.

A quick for-loop told me that the maximum is 1023 containers on a single host:

Error response from daemon: Cannot start container 09c8f46b59ccc311e8d0352789db6debd0fa1df98186c5cda98583d762d48601: adding interface vetha5d205e to bridge docker0 failed: exchange full
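The loop itself is not shown here; a sketch of what such a test could look like follows. The busybox image, the sleep command and the DOCKER override variable are assumptions chosen for illustration, not necessarily what was actually run:

```shell
# Start containers until "docker run" fails or the limit is reached,
# then print how many were started successfully.
spawn_until_failure() {
    docker_cmd=${DOCKER:-docker}   # overridable, e.g. for a dry run
    limit=$1
    count=0
    while [ "$count" -lt "$limit" ]; do
        "$docker_cmd" run -d busybox sleep 86400 >/dev/null 2>&1 || break
        count=$(( count + 1 ))
    done
    echo "$count"
}
```

Calling spawn_until_failure 2000 on a test machine would print the number of containers that could be started before the first failure.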

The limitation here is Linux bridging: a bridge can’t have more than 1023 interfaces attached. Specifically, BR_PORT_BITS in net/bridge/br_private.h cannot be extended because of spanning tree requirements.
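The number 1023 follows from the width of the bridge port number. Assuming BR_PORT_BITS is 10, as it is in the kernel sources of that era to my knowledge, and that port number 0 is reserved, the arithmetic is:

```shell
# Assumed value of BR_PORT_BITS from net/bridge/br_private.h.
BR_PORT_BITS=10
# 2^10 port numbers, minus port 0 which is assumed to be reserved.
max_ports=$(( (1 << BR_PORT_BITS) - 1 ))
echo "$max_ports"
```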

wido@wido-desktop:~$ docker ps|wc -l
1024
wido@wido-desktop:~$

Although that says 1024 there is a header line, so we have to subtract one. That brings it to 1023.

wido@wido-desktop:~$ docker version
Client:
 Version:      1.8.3
 API version:  1.20
 Go version:   go1.4.2
 Git commit:   f4bf5c7
 Built:        Mon Oct 12 05:37:18 UTC 2015
 OS/Arch:      linux/amd64

Server:
 Version:      1.8.3
 API version:  1.20
 Go version:   go1.4.2
 Git commit:   f4bf5c7
 Built:        Mon Oct 12 05:37:18 UTC 2015
 OS/Arch:      linux/amd64
wido@wido-desktop:~$