Using the new dashboard in ceph-mgr

The upcoming Ceph Luminous (12.2.0) release features the new ceph-mgr daemon which has a few default plugins. One of these plugins is a dashboard to give you a graphical overview of your cluster.

Enabling Module

To enable the dashboard you have to enable the module in your /etc/ceph/ceph.conf on all machines running the ceph-mgr daemon. These are usually your Monitors.

Add this to the configuration:

[mgr]
mgr_modules = dashboard

Don’t restart your ceph-mgr daemon yet. More configuration changes have to be made first.

Setting server address and port

A server address and optionally a port have to be configured as a config-key.

By setting the value to :: the dashboard will be available on all IPv4 and IPv6 addresses on port 7000 (default):

ceph config-key put mgr/dashboard/server_addr ::

Restart daemons

Now restart all ceph-mgr daemons on your hosts:

systemctl restart ceph-mgr@

Accessing the dashboard

The default port is 7000, so point your browser at the IP address of the active ceph-mgr on that port to see the dashboard.

You can find the active ceph-mgr in the ceph status:

root@alpha:~# ceph -s
  cluster:
    id:     30d838cd-955f-42e5-bddb-5609e1c880f8
    health: HEALTH_OK
 
  services:
    mon: 3 daemons, quorum alpha,bravo,charlie
    mgr: charlie(active), standbys: alpha, bravo
    osd: 3 osds: 3 up, 3 in
 
  data:
    pools:   1 pools, 64 pgs
    objects: 0 objects, 0 bytes
    usage:   3173 MB used, 27243 MB / 30416 MB avail
    pgs:     64 active+clean
 
root@alpha:~#

In this case charlie is the active mgr, which in my cluster has the IPv6 address 2001:db8::102.
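If you want to pick the active mgr out of the status output in a script, a small regular expression is enough. This is only a sketch that assumes the Luminous-style "mgr: name(active)" line shown above; the function name is my own and the text format may differ in other releases, so the JSON output of ceph -s --format json is probably the more robust source.

```python
import re


def active_mgr(status_text):
    """Return the name of the active ceph-mgr from `ceph -s` text, or None."""
    # Matches a line such as "mgr: charlie(active), standbys: alpha, bravo"
    match = re.search(r"^\s*mgr:\s*(\S+?)\(active\)", status_text, re.MULTILINE)
    return match.group(1) if match else None


sample = """  services:
    mon: 3 daemons, quorum alpha,bravo,charlie
    mgr: charlie(active), standbys: alpha, bravo
    osd: 3 osds: 3 up, 3 in
"""

print(active_mgr(sample))
```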

Point your browser to: http://[2001:db8::102]:7000 and you will see the dashboard.
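Note the square brackets around the IPv6 address in the URL. As an illustration, this is how such a URL could be built in Python for either address family; dashboard_url is a hypothetical helper of mine, not something shipped with Ceph.

```python
import ipaddress


def dashboard_url(host, port=7000):
    """Build the dashboard URL; IPv6 literals must be wrapped in brackets."""
    address = ipaddress.ip_address(host)
    if address.version == 6:
        return "http://[{}]:{}".format(address, port)
    return "http://{}:{}".format(address, port)


print(dashboard_url("2001:db8::102"))
```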

Apache CloudStack and MySQL 5.7

SQL Mode

Starting with MySQL 5.7 the default SQL mode is far stricter than it was before.

It now includes ONLY_FULL_GROUP_BY, STRICT_TRANS_TABLES, NO_ZERO_IN_DATE, NO_ZERO_DATE, ERROR_FOR_DIVISION_BY_ZERO, NO_AUTO_CREATE_USER, and NO_ENGINE_SUBSTITUTION.

This can cause problems for applications which need other SQL modes. Apache CloudStack is one of these applications.

The best thing would be to modify the SQL queries executed by CloudStack, but that’s not that easy.

Changing the mode

Luckily the SQL mode can be changed either in my.cnf or as a session variable.

In the my.cnf one can add:

[mysqld]
sql_mode = 'STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION'
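Comparing this value with the MySQL 5.7 default shows that the only mode being dropped is ONLY_FULL_GROUP_BY. The small sketch below derives the relaxed value from the default list; the names MYSQL57_DEFAULT and relaxed_sql_mode are just mine for illustration.

```python
# Default sql_mode of MySQL 5.7, in the order listed earlier in this post
MYSQL57_DEFAULT = [
    "ONLY_FULL_GROUP_BY",
    "STRICT_TRANS_TABLES",
    "NO_ZERO_IN_DATE",
    "NO_ZERO_DATE",
    "ERROR_FOR_DIVISION_BY_ZERO",
    "NO_AUTO_CREATE_USER",
    "NO_ENGINE_SUBSTITUTION",
]


def relaxed_sql_mode(modes, drop=("ONLY_FULL_GROUP_BY",)):
    """Return a comma-separated sql_mode string without the dropped modes."""
    return ",".join(mode for mode in modes if mode not in drop)


# Prints the line that goes into the [mysqld] section
print("sql_mode = '{}'".format(relaxed_sql_mode(MYSQL57_DEFAULT)))
```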

Or modify the /etc/cloudstack/management/db.properties file to include this line:

db.cloud.url.params=prepStmtCacheSize=517&cachePrepStmts=true&sessionVariables=sql_mode='STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION'

You should now be able to run a CloudStack management server on MySQL 5.7!

Future

In the future CloudStack should only use SQL queries which comply with the new, stricter SQL mode. In the meantime an issue and a Pull Request have been created to track this situation.

Docker containers with IPv6 behind NAT

WARNING

In production IPv6 should always be used without NAT. Only use IPv6 and NAT for testing purposes. There is no valid reason to use IPv6 with NAT in any production environment.

IPv6 and NAT

IPv6 is designed to remove the need for NAT and that is a very, very good thing. NAT breaks Peer-to-Peer connections, and restoring them is exactly one of the great things about IPv6. Every device on the internet gets its own public IP address again.

Docker and IPv6

Support for IPv6 in Docker has been there for a while now. It is disabled by default, however. The documentation describes how to enable it.

I wanted to enable IPv6 on my Docker setup on my laptop running Ubuntu, but as my laptop is a mobile device the IPv6 prefix I have changes when I move to a different location. IPv6 Prefix Delegation isn’t available at every IPv6-enabled location either, so I wanted to figure out if I could enable IPv6 in my Docker setup locally and use NAT to have my containers reach the internet over IPv6.

At home I have IPv6 via ZeelandNet and at the office we have a VDSL connection from XS4All. When I’m at a remote location I enable our OpenVPN tunnel, which has IPv6 enabled. This way I always have IPv6 available.

The Docker documentation shows that enabling IPv6 is very easy. I modified the systemd service file of docker and added a fixed IPv6 CIDR:

ExecStart=/usr/bin/dockerd --ipv6 --fixed-cidr-v6="fd00::/64" -H fd://

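fd00::/64 lies in the Unique Local Address (ULA) range fc00::/7 defined in RFC 4193, the replacement for the deprecated Site-Local addresses (fec0::/10). It is not routed on the public internet, so it can be safely used here.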
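To double-check which address range a prefix like this belongs to, Python's ipaddress module can be used. This is just a quick sketch with variable names of my own; it shows that fd00::/64 sits inside the Unique Local range fc00::/7 (RFC 4193) and not inside the deprecated Site-Local range fec0::/10.

```python
import ipaddress

docker_net = ipaddress.ip_network("fd00::/64")
ula_range = ipaddress.ip_network("fc00::/7")     # Unique Local Addresses (RFC 4193)
site_local = ipaddress.ip_network("fec0::/10")   # deprecated Site-Local range

in_ula = docker_net.subnet_of(ula_range)         # requires Python 3.7 or newer
in_site_local = docker_net.subnet_of(site_local)

print("inside fc00::/7:", in_ula)
print("inside fec0::/10:", in_site_local)
print("private (not globally routed):", docker_net.is_private)
```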

I then added a NAT rule into ip6tables so that it would NAT for me:

sudo ip6tables -t nat -A POSTROUTING -s fd00::/64 -j MASQUERADE

Result

My Docker containers now get an IPv6 address, as can be seen below:

root@da80cf3d8532:~# ip -6 a
1: lo:  mtu 65536 state UNKNOWN qlen 1
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
15: eth0@if16:  mtu 1500 state UP 
    inet6 fd00::242:ac11:2/64 scope global nodad 
       valid_lft forever preferred_lft forever
    inet6 fe80::42:acff:fe11:2/64 scope link 
       valid_lft forever preferred_lft forever
root@da80cf3d8532:~#

In this case the address is fd00::242:ac11:2, which was assigned by Docker.

Since my laptop has IPv6 I can now ping pcextreme.nl from my Docker container.

root@da80cf3d8532:~# ping6 -c 3 pcextreme.nl -n
PING pcextreme.nl (2a00:f10:101:0:46e:c2ff:fe00:93): 56 data bytes
64 bytes from 2a00:f10:101:0:46e:c2ff:fe00:93: icmp_seq=0 ttl=61 time=14.368 ms
64 bytes from 2a00:f10:101:0:46e:c2ff:fe00:93: icmp_seq=1 ttl=61 time=16.132 ms
64 bytes from 2a00:f10:101:0:46e:c2ff:fe00:93: icmp_seq=2 ttl=61 time=15.790 ms
--- pcextreme.nl ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max/stddev = 14.368/15.430/16.132/0.764 ms
root@da80cf3d8532:~#

Again, this should ONLY be used for testing purposes. For production IPv6 Prefix Delegation is the route to go down.

Do not use SMR disks with Ceph

Many new disks like the Seagate He8 disks are using a technique called Shingled Magnetic Recording to increase capacity.

As these disks offer a very low price per Gigabyte they seem interesting to use in a Ceph cluster.

Performance

Due to the nature of SMR these disks are very, very, very bad when it comes to Random Write performance. Random I/O is something that Ceph does a lot on the backing disks.

This results in disks spiking to 100% utilization very quickly, causing all kinds of trouble with OSDs going down and committing suicide.

Do NOT use them

The solution is very simple. Do not use SMR disks in Ceph but stick to the traditional PMR disks in your Ceph cluster.

In the future we might see SMR support in the new BlueStore of Ceph, but at this moment no work has been done, so don’t expect anything soon.

Testing Ceph BlueStore with the Kraken release

Ceph version Kraken (11.2.0) has been released and the Release Notes tell us that the new BlueStore backend for the OSDs is now available.

BlueStore

The current backend for the OSDs is the FileStore, which mainly uses the XFS filesystem to store its data. To overcome several limitations of XFS and POSIX in general, the BlueStore backend was developed.

It will provide better performance (mainly for writes), data safety through checksumming, and compression.

Users are encouraged to test BlueStore starting with the Kraken release for non-production and non-critical data sets and report back to the community.

Deploying with BlueStore

To deploy OSDs with BlueStore you can use ceph-deploy with the --bluestore flag.

I created a simple test cluster with three machines: alpha, bravo and charlie.

Each machine will be running a ceph-mon and ceph-osd process.

This is the sequence of ceph-deploy commands I used to deploy the cluster:

ceph-deploy new alpha bravo charlie
ceph-deploy mon create alpha bravo charlie

Now, edit the ceph.conf file in the current directory and add:

[osd]
enable_experimental_unrecoverable_data_corrupting_features = bluestore

With this setting we allow the use of BlueStore and we can now deploy our OSDs:

ceph-deploy --overwrite-conf osd create --bluestore alpha:sdb bravo:sdb charlie:sdb

Running BlueStore

This tiny cluster now runs three OSDs with BlueStore:

root@alpha:~# ceph -s
    cluster c824e460-2f09-4994-8b2f-108aedc52d19
     health HEALTH_OK
     monmap e2: 3 mons at {alpha=[2001:db8::100]:6789/0,bravo=[2001:db8::101]:6789/0,charlie=[2001:db8::102]:6789/0}
            election epoch 14, quorum 0,1,2 alpha,bravo,charlie
        mgr active: charlie standbys: alpha, bravo
     osdmap e14: 3 osds: 3 up, 3 in
            flags sortbitwise,require_jewel_osds,require_kraken_osds
      pgmap v24: 64 pgs, 1 pools, 0 bytes data, 0 objects
            43356 kB used, 30374 MB / 30416 MB avail
                  64 active+clean
root@alpha:~#
root@alpha:~# ceph osd tree
ID WEIGHT  TYPE NAME        UP/DOWN REWEIGHT PRIMARY-AFFINITY 
-1 0.02907 root default                                       
-2 0.00969     host alpha                                     
 0 0.00969         osd.0         up  1.00000          1.00000 
-3 0.00969     host bravo                                     
 1 0.00969         osd.1         up  1.00000          1.00000 
-4 0.00969     host charlie                                   
 2 0.00969         osd.2         up  1.00000          1.00000 
root@alpha:~#

On alpha I see that osd.0 only has a small partition for a bit of configuration and the rest is used by BlueStore.

root@alpha:~# df -h /var/lib/ceph/osd/ceph-0
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdb1        97M  5.4M   92M   6% /var/lib/ceph/osd/ceph-0
root@alpha:~# lsblk 
NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda      8:0    0    8G  0 disk 
├─sda1   8:1    0  7.5G  0 part /
├─sda2   8:2    0    1K  0 part 
└─sda5   8:5    0  510M  0 part [SWAP]
sdb      8:16   0   10G  0 disk 
├─sdb1   8:17   0  100M  0 part /var/lib/ceph/osd/ceph-0
└─sdb2   8:18   0  9.9G  0 part 
sdc      8:32   0   10G  0 disk 
root@alpha:~# cat /var/lib/ceph/osd/ceph-0/type
bluestore
root@alpha:~#

The OSDs should work just like OSDs running FileStore, but they should perform better.
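On a host with more than one OSD it can be handy to check the backend of all of them at once. The sketch below simply reads the same type file I looked at with cat above; the function name osd_backends is my own and it assumes the default /var/lib/ceph/osd/ceph-N layout.

```python
from pathlib import Path


def osd_backends(base="/var/lib/ceph/osd"):
    """Map each OSD data directory name to the content of its 'type' file."""
    backends = {}
    for type_file in sorted(Path(base).glob("ceph-*/type")):
        # e.g. /var/lib/ceph/osd/ceph-0/type contains "bluestore"
        backends[type_file.parent.name] = type_file.read_text().strip()
    return backends


if __name__ == "__main__":
    for osd, backend in osd_backends().items():
        print(osd, backend)
```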

Running headless VirtualBox inside Nested KVM

For the Ceph training at 42on I use VirtualBox to build Virtual Machines. This is because they work under MacOS, Windows and Linux.

For the internal Git at 42on we use Gitlab and I wanted to use Gitlab’s CI to build my Virtual Machines automatically.

As we don’t have any physical hardware at 42on (everything runs in the cloud) I wanted to see if I could run VirtualBox Headless inside a VM with Nested KVM enabled.

Nested KVM

The first thing I checked was if my KVM Virtual Machine actually supported Nested KVM. This can be verified with the kvm-ok command under Ubuntu:

root@glrun01:~# kvm-ok 
INFO: /dev/kvm exists
KVM acceleration can be used
root@glrun01:~#
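If kvm-ok is not installed, roughly the same information can be gathered by hand. The sketch below is my own approximation and not how kvm-ok is implemented: on x86 it looks for the vmx (Intel) or svm (AMD) CPU flag in /proc/cpuinfo and checks that /dev/kvm exists.

```python
import os


def has_virt_flags(cpuinfo_text):
    """Return True if a 'flags' line lists vmx (Intel) or svm (AMD)."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags") and ":" in line:
            flags = line.split(":", 1)[1].split()
            if "vmx" in flags or "svm" in flags:
                return True
    return False


def kvm_usable(cpuinfo_path="/proc/cpuinfo", kvm_device="/dev/kvm"):
    """Approximate check: virtualization flag present and /dev/kvm exists."""
    with open(cpuinfo_path) as handle:
        cpuinfo = handle.read()
    return has_virt_flags(cpuinfo) and os.path.exists(kvm_device)
```

Inside a guest the vmx or svm flag only shows up when the hypervisor exposes nested virtualization, which is exactly what I wanted to confirm here.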

Now that this was verified I went on to install VirtualBox.

VirtualBox

Installing VirtualBox is straightforward. Just add the repository and install the packages. Don’t forget to reboot afterwards to make sure all kernel modules are loaded and properly installed.

apt-get install virtualbox

VirtualBox Extension Pack

The trick to get everything working properly is to install Oracle’s VirtualBox Extension Pack. It took me a while to figure out that I needed to install it manually; it isn’t installed by default.

You need to download the pack and install it using the VBoxManage command.

wget http://download.virtualbox.org/virtualbox/5.0.24/Oracle_VM_VirtualBox_Extension_Pack-5.0.24.vbox-extpack
vboxmanage extpack install Oracle_VM_VirtualBox_Extension_Pack-5.0.24.vbox-extpack
vboxmanage list extpacks
vboxmanage setproperty vrdeextpack "Oracle VM VirtualBox Extension Pack"

With that installed and configured I rebooted the machine again just to be sure.

It works!

With that it actually worked. The VirtualBox VMs can now be built inside a Nested KVM machine controlled by Gitlab’s CI 🙂

3 years of Model S ownership

September 26th 2013

On 26-09-2013 the day had finally arrived: Delivery of my Tesla Model S!

In the morning my Delivery Specialist sent me this picture asking me if I was ready (with a smiley behind it 😉 ).

Tesla Model S 2013

It was 3 years since I ordered my Model S, so I couldn’t wait to pick it up! (In the back you see the Blue Model S of my colleague)

Specifications

These are the options I chose for my Model S:

  • 85kWh non-performance (RWD)
  • Pearl White
  • All Glass Panoramic Roof
  • Base 19″ wheels
  • Black Nappa Leather Interior
  • Piano Black Décor
  • Tech Package
  • Sound Studio Package
  • Active Air Suspension
  • Lighting Package
  • Parking Sensors
  • Twin Chargers (22kW)

Price: EUR 97.890,00 (Including all taxes)

Afterwards I swapped the 19″ base wheels for the 19″ Cyclone Grey. These wheels are no longer available.

SuperCharger Germany

September 2016

Fast forward 3 years and 141.466km: I’m still super happy with my Model S. Best car ever, hands down.

The 100.000km mark was hit on November 14th 2015 as I was almost home. Literally, I was just 200m away from my house. So I could stop and take a good picture.

100.000km on Model S

Now, 3 years later it is well over 141.000km and will probably hit 150.000km somewhere in October.

Tesla Model S 141k km

Tesla Model S driveway

The roadtrips I did in these 3 years were all over Europe:

  • 3x: Oktoberfest in München (DE): 2.000km
  • 3x: To Prague (CZ): 2.2000km
  • 3x: To Berlin (DE): 2.000km
  • 2x: Northern Norway above the Arctic circle: 6.000km
  • 2x: Summer roadtrip to Germany, Austria, Italy and Switzerland: 3.500km
  • 1x: To Swansea in Wales (UK): 1.500km

Each of these trips was with either friends or my girlfriend. Awesome trips, each of them. All powered by the ever expanding SuperCharger network.

Over the Air software updates

Thanks to the over-the-air software updates the car only got better and better in these 3 years. A few things (but not all of them) which were added:

  • Trip Planner using SuperChargers
  • Spotify integration (awesome!)
  • New UI
  • Calendar sync
  • Slightly improved efficiency

All for free and while my car was parked at home. My other cars never got better over time, they always got worse.

Problems

In three years I’ve driven my car to a lot of places in Europe (see below). From the cold in Norway to the heat of Italy.

Did I have some issues? Yes, but to be clear: I was never stranded! It did not malfunction in such a way that it was disabled.

So what did I experience?

  • Humming drivetrain. It was replaced 3 times under warranty
  • Main contactor failure in battery. Reboot of car worked and contactor was replaced.
  • Fogging rear lights
  • Slave charger failure. Causing reduced charging speed with AC charging
  • Window washer pump failure.

Again, none of these issues left me stranded along the road. They were also all fixed under warranty except for the window washer pump. That was EUR 100 in total.

Maintenance

In total my S went for service 3 times. I figured once every year would be enough. I paid one invoice of EUR 700,00. The other ones were discounted from the referral credit I have at Tesla.

Including the washer fluid pump my total expenses on service and maintenance were EUR 800,00. Not bad I would say!

Energy Consumption

About 70% of my charging is done at home, the rest at SuperChargers and other (public) chargers.

There is a kWh meter in front of my charging station at home and I’ve used about 20.000kWh. Judging from my 70% ‘charge at home’ assumption my total energy usage in 3 years was roughly 28.000kWh.

28.000kWh / 141.000km = 198Wh/km, which is about what I see in my general consumption in the car.
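For those who want to redo the math with their own numbers, here is the same calculation as a small Python sketch. The function names are mine; the inputs are the figures from this post. Note that the unrounded estimate from the 70% assumption comes out slightly above the 28.000kWh I used.

```python
def total_energy_kwh(home_kwh, home_share):
    """Estimate total energy from the metered home share."""
    return home_kwh / home_share


def consumption_wh_per_km(total_kwh, distance_km):
    """Average consumption in Wh per km."""
    return total_kwh * 1000 / distance_km


estimate = total_energy_kwh(20000, 0.70)          # a bit over 28.500 kWh
average = consumption_wh_per_km(28000, 141000)    # using the rounded 28.000 kWh

print("estimated total: {:.0f} kWh".format(estimate))
print("average consumption: {:.1f} Wh/km".format(average))
```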

Roadtrips

As I wrote above I undertook multiple Roadtrips in the three years, but the best trips I did were the trips to Wales, Norway and to Slovenia. I wrote blogs about two of them:

I didn’t write a blogpost about my roadtrip through Europe in June 2015, but you can see the route below (Prague, Austria, Slovenia, Austria, Germany, Netherlands).

I tried to draw the routes I’ve driven on a Google Maps overview.

Routes Tesla Model S

Highlights

In these three years my trips to the Arctic are still the highlights for me. However, there were more highlights, so I gathered a bunch of pictures I took and added them below in a random order.

Route 74 between Norway and Sweden:

Route 74 in Norway

Stuck on the arctic circle in Norway:

Model S stuck Arctic Circle

On the Lofoten Islands in Norway:

Car under snow from back

Model S next to house Lofoten

Oktoberfest in München (I’m the one in the green blouse):

Oktoberfest in München.

Charging at Fastned using 50kW CHAdeMO.

Charging at Fastned

In the Belgian Ardennes:

In the Belgian Ardennes

On a train in Austria going towards Bad Gastein:

On train in Austria

At the Slovenian <> Italian border:

Slovenian Border

Conclusion

After owning an Audi A3 2.0TDI (2007), Toyota Auris Hybrid (2011) and BMW M5 E39 (1999) I can say that Model S is the best car I’ve ever owned. I love driving it and still enjoy every km (except when stuck in traffic). Even without Autopilot it is still an amazing car!

People still come up to me to ask things about the car and are really interested.

As I said: best car ever. Period. I will never buy a car which burns fossil fuel again.

My deposit for Model 3 was made on the day they opened. Waiting!

Chown Ceph OSD data directory using GNU Parallel

Starting with Ceph version Jewel (10.2.X) all daemons (MON and OSD) will run under the unprivileged user ceph. Prior to Jewel daemons were running under root, which is a potential security issue.

This means data has to change ownership before a daemon running the Jewel code can run.

Chown data

As the Release Notes state you will have to chown all your data to ceph:ceph in /var/lib/ceph.

chown -R ceph:ceph /var/lib/ceph

On a system with multiple OSDs this might take a lot of time. Using GNU Parallel you can speed this up considerably.

Static UID

The ceph user and group have been assigned static UIDs and GIDs in the major distributions:

  • Fedora/CentOS/RHEL: 167:167
  • Debian/Ubuntu: 64045:64045

Chown in parallel

Using these commands you can chown the data in /var/lib/ceph much faster.

WARNING: Make sure the OSDs are stopped on the system before you continue!

Now you can run these commands (Ubuntu in this case):

find /var/lib/ceph/osd -maxdepth 1 -mindepth 1 -type d|parallel chown -R 64045:64045
chown 64045:64045 /var/lib/ceph
chown 64045:64045 /var/lib/ceph/*
chown 64045:64045 /var/lib/ceph/bootstrap-*/*

The first command will take the longest. I tested it on a system with 24 OSDs all containing about 800GB of data. That took roughly 20 minutes.
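If GNU Parallel is not available on a machine, the same idea (one worker per OSD directory) can be sketched in Python. This is an untuned illustration with helper names of my own, not a replacement I benchmarked: it uses threads and os.lchown, so symbolic links themselves are changed rather than followed. As with the commands above, the OSDs have to be stopped and it has to run as root.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def chown_tree(path, uid, gid):
    """Recursively change ownership of one directory tree."""
    os.lchown(path, uid, gid)
    for root, dirs, files in os.walk(path):
        for name in dirs + files:
            os.lchown(os.path.join(root, name), uid, gid)
    return path


def chown_osds(base, uid, gid, workers=8):
    """Chown every directory directly below base, several at a time."""
    osd_dirs = sorted(
        entry.path
        for entry in os.scandir(base)
        if entry.is_dir(follow_symlinks=False)
    )
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda d: chown_tree(d, uid, gid), osd_dirs))


if __name__ == "__main__":
    # 64045:64045 is the ceph user and group on Debian/Ubuntu
    chown_osds("/var/lib/ceph/osd", 64045, 64045)
```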

Calculating SLAAC IPv6 Address in Java

SLAAC

With IPv6 a host on a network can use StateLess Address AutoConfiguration (SLAAC) to configure its network.

Routers will send out Router Advertisements telling the network which subnet is used in the network.

Based on its MAC address (modified EUI-64) a host will then derive the IPv6 address it will use.

Java

For the Apache CloudStack project I had to write Java code which would take a subnet and a MAC address as arguments and would generate an IPv6 SLAAC address from them.

Combining subnet 2001:db8:100::/64 with MAC address 06:7a:88:00:00:8b yields IPv6 address 2001:db8:100:0:47a:88ff:fe00:8b.

/*
 * Java code using Java-ipv6 from Google Code to convert
 * a given IPv6 subnet and a MAC address into a IPv6 address
 * calculated using SLAAC.
 *
 * Author: Wido den Hollander 
*/
import com.googlecode.ipv6.IPv6Address;
import com.googlecode.ipv6.IPv6Network;

public class IPv6EUI64 {
    public static IPv6Address EUI64Address(final IPv6Network cidr, final String macAddress) {
        if (cidr.getNetmask().asPrefixLength() > 64) {
            throw new IllegalArgumentException("IPv6 subnet " + cidr.toString() + " is not 64 bits or larger in size");
        }

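        // Split the MAC address into its six hexadecimal octets
        String[] mac = macAddress.toLowerCase().split(":");

        // Modified EUI-64: flip the universal/local bit of the first octet
        // and insert ff:fe between the third and fourth octet
        return IPv6Address.fromString(cidr.getFirst().toString() +
                Integer.toHexString(Integer.parseInt(mac[0], 16) ^ 2) +
                mac[1] + ":" + mac[2] + "ff:fe" + mac[3] + ":" + mac[4] + mac[5]);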
    }

    public static void main(String[] argv) {
        IPv6Network cidr = IPv6Network.fromString("2001:db8:100::/64");
        String mac = "06:7a:88:00:00:8b";
        IPv6Address eui64addr = EUI64Address(cidr, mac);

        /* This will print 2001:db8:100:0:47a:88ff:fe00:8b */
        System.out.println(eui64addr);
    }
}

The code can also be found on my Github Gist page.
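For comparison, the same modified EUI-64 derivation can be sketched in Python using only the standard ipaddress module. This is not the code that went into CloudStack, just an illustration with a function name of my own; it only accepts /64 prefixes to keep things simple.

```python
import ipaddress


def slaac_address(prefix, mac):
    """Combine a /64 prefix and a MAC address into a SLAAC (EUI-64) address."""
    network = ipaddress.IPv6Network(prefix)
    if network.prefixlen != 64:
        raise ValueError("SLAAC with EUI-64 needs a /64 prefix")

    octets = bytes(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError("MAC address must have six octets")

    # Flip the universal/local bit and insert ff:fe in the middle
    interface_id = bytes([octets[0] ^ 0x02]) + octets[1:3] + b"\xff\xfe" + octets[3:6]

    address = int(network.network_address) | int.from_bytes(interface_id, "big")
    return ipaddress.IPv6Address(address)


print(slaac_address("2001:db8:100::/64", "06:7a:88:00:00:8b"))
```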

Calculating DS record from DNSKEY with Python 3

While working on DNSSEC for PCextreme’s Aurora DNS I had to convert a DNSKEY to a DS-record which could be set in the parent zone for proper delegation.

The foundation for Aurora DNS is PowerDNS together with Python 3.

The API for Aurora DNS has to return the DS-records so that an end-user can use these in the parent zone. I had the DNSKEY, but I didn’t have the DS-record so I had to calculate it using Python 3.

I eventually ended up with this Python code which you can find on my Github Gists page.

"""
Generate a DNSSEC DS record based on the incoming DNSKEY record

The DNSKEY can be found using for example 'dig':

$ dig DNSKEY secure.widodh.nl

The output can then be parsed with the following code to generate a DS record
for in the parent DNS zone

Author: Wido den Hollander 

Many thanks to this blogpost: https://www.v13.gr/blog/?p=239
"""

import struct
import base64
import hashlib


DOMAIN = 'secure.widodh.nl'
DNSKEY = '257 3 8 AwEAAckZ+lfb0j6aHBW5AanV5A0V0IfF99vAKFZd6+fJfEChpZtjnItWDnJLPa3/LAFec/tUhLZ4jgmzaoEuX3EQQgI1V4kp9SYf8HMlFPP014eO+AnjkYFGLE2uqHPx/Tu7/pO3EyKwTXi5fMadROKuo/mfat5AEIhGjteGGO93DhnOa6kcqj5RHYJBh5OZ/GoZfbeYHK6Muur1T16hHiI12rYGoqJ6ZW5+njYprG6qwp6TZXxJyE7wF1JdD+Zhbjhf0Md4zMEysP22wBLghBaX6eDIBh/7jU7dw1Ob+I42YWk+X4NSiU3sRYPaq1R13JEK4zVqQtL++UVtgRPEbfj5RQ8='


def _calc_keyid(flags, protocol, algorithm, dnskey):
    st = struct.pack('!HBB', int(flags), int(protocol), int(algorithm))
    st += base64.b64decode(dnskey)

    cnt = 0
    for idx in range(len(st)):
        s = struct.unpack('B', st[idx:idx+1])[0]
        if (idx % 2) == 0:
            cnt += s << 8
        else:
            cnt += s

    return ((cnt & 0xFFFF) + (cnt >> 16)) & 0xFFFF


def _calc_ds(domain, flags, protocol, algorithm, dnskey):
    if not domain.endswith('.'):
        domain += '.'

    signature = bytes()
    for i in domain.split('.'):
        signature += struct.pack('B', len(i)) + i.encode()

    signature += struct.pack('!HBB', int(flags), int(protocol), int(algorithm))
    signature += base64.b64decode(dnskey)

    return {
        'sha1':    hashlib.sha1(signature).hexdigest().upper(),
        'sha256':  hashlib.sha256(signature).hexdigest().upper(),
    }


def dnskey_to_ds(domain, dnskey):
    dnskeylist = dnskey.split(' ', 3)

    flags = dnskeylist[0]
    protocol = dnskeylist[1]
    algorithm = dnskeylist[2]
    key = dnskeylist[3].replace(' ', '')

    keyid = _calc_keyid(flags, protocol, algorithm, key)
    ds = _calc_ds(domain, flags, protocol, algorithm, key)

    ret = list()
    ret.append(str(keyid) + ' ' + str(algorithm) + ' ' + str(1) + ' '
               + ds['sha1'].lower())
    ret.append(str(keyid) + ' ' + str(algorithm) + ' ' + str(2) + ' '
               + ds['sha256'].lower())
    return ret


print(dnskey_to_ds(DOMAIN, DNSKEY))
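The part of this code that is probably the least obvious is the loop over the domain labels: the DS digest is calculated over the owner name in DNS wire format followed by the DNSKEY RDATA. The sketch below isolates just that encoding step so you can see what the bytes look like. The function name is my own, and unlike the code above it also lowercases the name, since the digest is defined over the canonical (lowercase) form.

```python
import struct


def name_to_wire(domain):
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    labels = [label for label in domain.lower().split('.') if label]

    wire = b''
    for label in labels:
        wire += struct.pack('B', len(label)) + label.encode('ascii')

    # The empty root label terminates the name
    return wire + b'\x00'


print(name_to_wire('secure.widodh.nl'))
```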