Comparing two Ceph CRUSH maps

Sometimes you want to test if changes you are about to make to a CRUSH map will cause data to move or not.

In this case I wanted to change a CRUSH rule so that it would use device classes, but I didn’t want any of the ~1PB of data in that cluster to move.
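To get an editable copy, you fetch the current CRUSH map from the cluster and decompile it to text. Roughly like this (the file names are just examples):

root@mon01:~# ceph osd getcrushmap -o crushmap
root@mon01:~# crushtool -d crushmap -o crushmap.txt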

By swapping IDs I could prevent data from moving:

Before:

root default {
        id -50                  # do not change unnecessarily
        id -53 class hdd        # do not change unnecessarily
        id -122 class ssd       # do not change unnecessarily

After:

root default {
        id -53                  # do not change unnecessarily
        id -50 class hdd        # do not change unnecessarily
        id -122 class ssd       # do not change unnecessarily

Notice how I swapped the IDs. After this I updated the rule:

rule rgw {
        id 6
        type replicated
        min_size 1
        max_size 10
        step take ams02-objects class hdd
        step chooseleaf firstn 0 type host
        step emit
}
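Before the two maps can be compared, the edited text file has to be compiled back into a binary CRUSH map. A minimal sketch, assuming the edited text map was saved as crushmap.txt:

root@mon01:~# crushtool -c crushmap.txt -o crushmap.new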

With the new map compiled, I ran crushtool to see if there were any differences:

root@mon01:~# crushtool -i crushmap --compare crushmap.new 
rule 0 had 0/10240 mismatched mappings (0)
rule 1 had 0/10240 mismatched mappings (0)
rule 2 had 0/10240 mismatched mappings (0)
rule 3 had 0/10240 mismatched mappings (0)
rule 4 had 0/10240 mismatched mappings (0)
rule 5 had 0/3072 mismatched mappings (0)
rule 6 had 0/10240 mismatched mappings (0)
maps appear equivalent
root@mon01:~#

No changes! So it was safe to inject this map:

root@mon01:~# ceph osd setcrushmap -i crushmap.new
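As a quick sanity check afterwards, the cluster status should show no misplaced objects and no backfill starting:

root@mon01:~# ceph -s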

HAProxy in front of Ceph Manager dashboard

The Ceph Mgr dashboard plugin provides an easy dashboard that can show you how your Ceph cluster is performing.

In certain situations you can’t contact the Mgr daemons directly and you have to place a Proxy server between your computer and the Mgr daemons.

This can be done easily with HAProxy and the following configuration, which assumes that:

  • SSL has been disabled in the Dashboard plugin (see the sketch after this list)
  • the Dashboard plugin listens on port 8080
  • Mgr is running on the hosts mon01, mon02 and mon03
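The first two assumptions are configured on the Mgr side. The exact commands differ per Ceph release; on a recent release it roughly looks like this:

ceph mgr module enable dashboard
ceph config set mgr mgr/dashboard/ssl false
ceph config set mgr mgr/dashboard/server_port 8080

The HAProxy configuration itself: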
global
  log         127.0.0.1 local1
  log         127.0.0.1 local2 notice

  chroot      /var/lib/haproxy
  pidfile     /var/run/haproxy.pid
  maxconn     4000
  user        haproxy
  group       haproxy
  daemon

  stats socket /var/lib/haproxy/stats

defaults
  log                     global
  mode                    http
  retries                 3
  timeout http-request    10s
  timeout queue           1m
  timeout connect         10s
  timeout client          1m
  timeout server          1m
  timeout http-keep-alive 10s
  timeout check           10s
  maxconn                 3000
  option                  httplog
  no option               httpclose
  no option               http-server-close
  no option               forceclose

  stats enable
  stats hide-version
  stats refresh 30s
  stats show-node
  stats uri /haproxy?stats
  stats auth admin:haproxy

frontend https
  bind *:80
  default_backend ceph-dashboard

backend ceph-dashboard
  balance roundrobin
  option httpchk GET /
  http-check expect status 200
  server mon01 mon01:8080 check
  server mon02 mon02:8080 check
  server mon03 mon03:8080 check

You can now point your browser to the URL/IP of your HAProxy and use your Ceph dashboard.

In case a Mgr machine fails, the health checks of HAProxy will make sure it fails over to one of the other Mgr daemons.
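To see what that health check probes, you can query a Mgr host by hand (hostname and port as assumed above). Typically only the active Mgr answers with a 200 while standby Mgrs redirect to the active one, so the http-check above keeps traffic on the active daemon:

curl -i http://mon01:8080/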

Renaming a network interface with systemd-networkd on Ubuntu 18.04

On an Ubuntu system where I’m creating a VXLAN Proof of Concept with CloudStack, I wanted to rename the interface enp5s0 to cloudbr0.

I found a lot of documentation on the internet on how to do this with *.link files, but it was all missing the golden tip: you need to re-generate your initramfs.
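The .link file matches the interface on its MAC address; if you don’t have it at hand, it can be looked up first (enp5s0 being the interface in my case):

ip link show enp5s0

With the MAC address known, create the .link file: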

/etc/systemd/network/50-cloudbr0.link

[Match]
MACAddress=00:25:90:4b:81:54

[Link]
Name=cloudbr0

After you create this file, re-generate your initramfs:

update-initramfs -c -k all

You can now reference cloudbr0 in *.network files and use it like any other network interface.
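A minimal sketch of such a .network file, reusing the address from my setup (the file name and the gateway are just examples):

/etc/systemd/network/50-cloudbr0.network

[Match]
Name=cloudbr0

[Network]
# Address taken from the interface output below; the gateway is only an example
Address=192.168.0.11/24
Gateway=192.168.0.1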

In my case, this is what my interfaces look like:

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
6: cloudbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc fq_codel state UP group default qlen 1000
    link/ether 00:25:90:4b:81:54 brd ff:ff:ff:ff:ff:ff
    inet 192.168.0.11/24 brd 192.168.0.255 scope global cloudbr0
       valid_lft forever preferred_lft forever
    inet6 2a00:f10:114:0:225:90ff:fe4b:8154/64 scope global dynamic mngtmpaddr noprefixroute 
       valid_lft 2591993sec preferred_lft 604793sec
    inet6 fe80::225:90ff:fe4b:8154/64 scope link 
       valid_lft forever preferred_lft forever
8: cloudbr1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default qlen 1000
    link/ether 86:fa:b6:31:6e:c1 brd ff:ff:ff:ff:ff:ff
    inet 172.16.0.11/24 brd 172.16.0.255 scope global cloudbr1
       valid_lft forever preferred_lft forever
    inet6 fe80::84fa:b6ff:fe31:6ec1/64 scope link 
       valid_lft forever preferred_lft forever
9: vxlan100: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master cloudbr1 state UNKNOWN group default qlen 1000
    link/ether 56:df:29:8d:db:83 brd ff:ff:ff:ff:ff:ff