Testing with ConfigDrive and cloud-init

cloud-init

cloud-init is a very easy way to bootstrap and configure Virtual Machines running in a cloud environment. It can read its metadata from various data sources and, for example, configure public SSH keys or create users.

Most large clouds support cloud-init to quickly deploy new Instances.

Config Drive

Config Drive is a data source which reads a local ‘CD-ROM’ device containing the metadata for the Virtual Machine. This allows Virtual Machines to be configured automatically without requiring network connectivity. My main use case for Config Drive is CloudStack, which has supported Config Drive since version 4.11.
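
When such a drive is attached it shows up inside the guest as a block device with the filesystem label config-2, which is how cloud-init finds it. A quick way to spot it from inside a VM (the device name varies, a virtual CD-ROM often appears as /dev/sr0):

# look for the config-2 volume label among the block devices
blkid | grep -i config-2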

I wanted to use Config Drive outside CloudStack to test some functionality.

meta_data.json

On my laptop I spun up an Ubuntu 18.04 Virtual Machine with cloud-init installed and attached an ISO which I created with the following directory structure:

/home/wido/Desktop/
  cloud-init/
    openstack/
      latest/
        meta_data.json
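
Creating that layout is a single command (paths taken from the example above):

mkdir -p /home/wido/Desktop/cloud-init/openstack/latest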

In meta_data.json I put:

{
  "hostname": "ubuntu-test",
  "name": "ubuntu-test",
  "uuid": "0109a241-6fd9-46b6-955a-cd52ad168ee7",
  "public_keys": [
      "ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAwPKBJDJdlvOKIfilr0VSkF9i3viwLtO8GyCpxL/8TxrGKnEg19LPLN3lwKWbbTqBZgRmbrR3bgQfM4ffPoTCxSPv44eZCF8jMPv8PxpC0yVaTcqW4Q7woD7pjdIuGVImrmEls0U8rS3uGQDx7LhFphkAh+blfUtobqzyHvqcbtVEh+drESn8AXrKd1MZfGg6OB8Xrfdr6d959uHBHFJ8pOxxppYbInxKREPb3XmZzmoNQUmqFRN/VNVTreRHAxDcPM8pEPuNmr3Vp+vDVvfpA58yr2rZ21ASB4LlNOOSEx7vnLd6uH9rsqAJOtr0ZEE29fU609i4rd6Zda2HTGQO+Q== wido@wido-laptop",
      "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQC0HJ4gm7QCUZjeXh/GmCzUamABgtvOZ4GcDOA1mSGoXqGHYkG8qPa5OaURx5yqROmclTtgwkaHxjY4in6Bi7DGxlSyFQuEOKBwNCcZOrGFiRbazeh3c5jEXgoCtDynrqBjuivWTaxzl46WVglxGSy2LAaps9sAhDU+8Xfm9wBYYtjv3CDAHEe9rN3UxgGEO5K+yhoFzcI9vZl4Q9hvMZRRBBFq3vOFoLr9pyyxVUNiH9oQLubeiqDwnQqaGn48WMX9eO9KiH32eSnzlgJg8viFyonVRJq2dxJMEcrfolQhD9GR1BXNeVGDNJYuo9FX1/NeZZYJTUAwZ4jP008CfaBAnuKBfdZTkjNZlXxMQOY29VAetHI/urvh4NK17o4g4gcBpmLV+13saBIHTLXmA8ADkGZgWE6yIYEah04nBHr28vmZJBX4vpQYz9VWiZcJeEsGah/oaPkFMNQB7mXE+plz46YJ1JtozBpntir7A4cHJX3qZNz/JwOU7yAc4K3EP6fGirSvIDtQ/SJs9Gp51Wbv32E3/5ouemsGnh/ziOc8eIUJMTFWqpEG9qVi/tVJiryuvRcha6+0cZmnWnknBztzRO3Oh3JgPwECdNA/X0acXlzFlRbkQpAJx9+ADESauNaIvfQ3l+kZx3m/1eAAZpAI2ZrnvQMUa40XYksiwncM4Q== wido@wido-desktop"
  ]
}
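
Before building the ISO it can be worth checking that the file is valid JSON, for example with jq (any JSON validator will do):

jq . /home/wido/Desktop/cloud-init/openstack/latest/meta_data.json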

Using mkisofs I was able to create an ISO and attach it to the VM:

mkisofs -iso-level 3 -V 'CONFIG-2' -o /var/lib/libvirt/images/cloud-init.iso /home/wido/Desktop/cloud-init
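
Since the ISO is placed under /var/lib/libvirt/images/, the VM here presumably runs under libvirt; in that case one way to attach the ISO is with virsh. This is only a sketch: the domain name ubuntu-test and the target device hdc are assumptions and will differ per setup.

# domain name and target device are examples; adjust to your own VM
virsh attach-disk ubuntu-test /var/lib/libvirt/images/cloud-init.iso hdc --type cdrom --mode readonly --config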

The ISO was attached to the Virtual Machine and after a reboot I could log in as the user ubuntu over SSH using my public key!
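
Inside the VM you can also verify that cloud-init actually used the ConfigDrive data source, for example by grepping its log:

grep -i configdrive /var/log/cloud-init.log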

Placement Groups with Ceph Luminous stay in activating state

Placement Groups stuck in activating

When migrating from FileStore to BlueStore with Ceph Luminous you might run into the problem that certain Placement Groups stay stuck in the activating state.

44    activating+undersized+degraded+remapped
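
To see exactly which Placement Groups are stuck you can filter the brief PG dump, a minimal sketch:

ceph pg dump pgs_brief | grep activating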

PG Overdose

This is a side-effect of the new PG overdose protection in Ceph Luminous.

Too many PGs on your OSDs can cause serious performance or availability problems.

You can see the number of Placement Groups per OSD using this command:

$ ceph osd df
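
The PGS column in that output holds the count per OSD. If you prefer machine-readable output, something like this works, assuming jq is installed and the JSON layout of ceph osd df matches the Luminous-era structure with a 'nodes' array:

# print each OSD id and its number of PGs
ceph osd df -f json | jq '.nodes[] | {id: .id, pgs: .pgs}'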

Increase Max PG per OSD

The default value is a maximum of 200 PGs per OSD and you should stay below that! However, if you are hit by PGs stuck in the activating state you can set this configuration value in ceph.conf:

[global]
mon_max_pg_per_osd = 500

Then restart the OSDs and MONs which are serving the Placement Groups affected by this.
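
After a restart you can verify that the new value is active by querying the daemon's admin socket on the node it runs on. osd.0 below is just an example ID:

# run on the host where the OSD lives; replace osd.0 with your OSD ID
ceph daemon osd.0 config get mon_max_pg_per_osd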

Usually you shouldn’t run into this, but if this hits you in the middle of a migration or upgrade this might save you.

Quick overview of Ceph versions running on OSDs

When checking a Ceph cluster it's useful to know which versions the OSDs in the cluster are running.

There is a very simple one-line command to do this:

ceph osd metadata | jq '.[].ceph_version' | sort | uniq -c

Running this on a cluster which is currently being upgraded from Jewel to Luminous shows:

     10 "ceph version 10.2.6 (656b5b63ed7c43bd014bcafd81b001959d5f089f)"
   1670 "ceph version 10.2.7 (50e863e0f4bc8f4b9e31156de690d765af245185)"
    426 "ceph version 10.2.9 (2ee413f77150c0f375ff6f10edd6c8f9c7d060d0)"
     66 "ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable)"

So 66 OSDs are running Luminous and 2106 OSDs are running Jewel.
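
The same one-liner works for other daemon types that expose metadata, for example the Monitors (assuming the same jq filter applies to their output):

ceph mon metadata | jq '.[].ceph_version' | sort | uniq -c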

Starting with Luminous there is also this command:

ceph features

This shows us all daemon and client versions in the cluster:

{
    "mon": {
        "group": {
            "features": "0x1ffddff8eea4fffb",
            "release": "luminous",
            "num": 5
        }
    },
    "osd": {
        "group": {
            "features": "0x7fddff8ee84bffb",
            "release": "jewel",
            "num": 426
        },
        "group": {
            "features": "0x1ffddff8eea4fffb",
            "release": "luminous",
            "num": 66
        }
    },
    "client": {
        "group": {
            "features": "0x7fddff8ee84bffb",
            "release": "jewel",
            "num": 357
        },
        "group": {
            "features": "0x1ffddff8eea4fffb",
            "release": "luminous",
            "num": 7
        }
    }
}