Changing the region of an RGW bucket

As of Ceph version 0.67 (Dumpling) the Ceph Object Gateway, aka the RADOS Gateway, supports regions. This allows you to create a geo-replicated, Amazon S3-compatible service.

While working on a setup we decided late in the process that we wanted regions, but by then we had already created about 50 buckets with data in them. We didn’t feel like re-creating all of them, so we wanted to change the region of the existing buckets.

A fresh Object Gateway has a region ‘default’ with one zone ‘default’. We created the region ‘ams02’ (Amsterdam) with one zone called ‘zone01’.

All buckets had the region ‘default’, which we wanted to change to ‘ams02’. No data migration is required since all the data stays on the same Ceph cluster.

This can be done with a couple of ‘radosgw-admin’ commands.

The bucket in these examples is ‘widodh’.

$ radosgw-admin metadata get bucket:widodh

This outputs JSON data:

{ "key": "bucket:widodh",
  "ver": { "tag": "_2qGuaDCBixHpx2lddTe0g-x",
      "ver": 1},
  "mtime": 1380653343,
  "data": { "bucket": { "name": "widodh",
          "pool": ".rgw.buckets",
          "index_pool": ".rgw.buckets.index",
          "marker": "default.20111.1",
          "bucket_id": "default.20111.1"},
      "owner": "widodh",
      "creation_time": 1380653343,
      "linked": "true",
      "has_bucket_info": "false"}}

With the ‘bucket_id’ from this output we can fetch the bucket instance metadata:

$ radosgw-admin metadata get bucket.instance:widodh:default.20111.1

The ID at the end is the ‘bucket_id’ from the previous command.

This returns:

{ "key": "bucket.instance:widodh:default.20111.1",
  "ver": { "tag": "_-HNwyMLAnRALV9tyPqdX5_V",
      "ver": 1},
  "mtime": 1380653343,
  "data": { "bucket_info": { "bucket": { "name": "widodh",
              "pool": ".rgw.buckets",
              "index_pool": ".rgw.buckets.index",
              "marker": "default.20111.1",
              "bucket_id": "default.20111.1"},
          "creation_time": 1380653343,
          "owner": "widodh",
          "flags": 0,
          "region": "default",
          "placement_rule": "default-placement",
          "has_instance_obj": "true"},
      "attrs": [
            { "key": "user.rgw.acl",
              "val": "AgKXAAAAAgIgAAAABgAAAHdpZG9kaBIAAABXaWRvIGRlbiBIb2xsYW5kZXIDA2sAAAABAQAAAAYAAAB3aWRvZGgPAAAAAQAAAAYAAAB3aWRvZGgDA0AAAAACAgQAAAAAAAAABgAAAHdpZG9kaAAAAAAAAAAAAgIEAAAADwAAABIAAABXaWRvIGRlbiBIb2xsYW5kZXIAAAAAAAAAAA=="},
            { "key": "user.rgw.idtag",
              "val": ""},
            { "key": "user.rgw.manifest",
              "val": ""}]}}

Save this output to a file and change the ‘region’ value to what you want; in this case I changed ‘default’ to ‘ams02’.

Afterwards you run:

$ radosgw-admin metadata put bucket.instance:widodh:default.20111.1 < bucket.json
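
Putting that together, the manual flow per bucket boils down to something like this (the sed line is just an illustration; check the exact formatting of your JSON before relying on it):

$ radosgw-admin metadata get bucket.instance:widodh:default.20111.1 > bucket.json
$ sed -i 's/"region": "default"/"region": "ams02"/' bucket.json
$ radosgw-admin metadata put bucket.instance:widodh:default.20111.1 < bucket.json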

Now I could change these configuration variables in the ceph.conf:

[client.radosgw.rgw1]
    host = rgw1
    ...
    ...
    rgw zone = zone01
    rgw region = ams02
    ...
    ...

We had to change the information of 50 buckets and we didn't feel like doing this manually, so I wrote this script:

#!/usr/bin/env python

import rados
import os
import json
import copy
import subprocess

ceph_id = "admin"
ceph_secret = "ADMIN SECRET"
ceph_monitor = "MONITOR ADDRESS"
ceph_rgw_pool = ".rgw"
ceph_rgw_region = "NEW RGW REGION"

def change_bucket_region(bucket, new_region):
	# Fetch the bucket metadata to find the bucket_id
	me = os.popen("radosgw-admin metadata get bucket:" + bucket)
	meta = json.loads(me.read())
	bucket_id = meta['data']['bucket']['bucket_id']
	# Fetch the bucket instance metadata, which contains the region
	mei = os.popen("radosgw-admin metadata get bucket.instance:" + bucket + ":" + bucket_id)
	imeta = json.loads(mei.read())
	current_region = imeta['data']['bucket_info']['region']
	if current_region != new_region:
		# Change the region and feed the JSON back via 'metadata put'
		newmeta = copy.copy(imeta)
		newmeta['data']['bucket_info']['region'] = new_region
		process = subprocess.Popen(['radosgw-admin', 'metadata', 'put', "bucket.instance:" + bucket + ":" + bucket_id], stdin=subprocess.PIPE, stdout=subprocess.PIPE)
		process.stdin.write(json.dumps(newmeta))
		process.stdin.close()
		process.wait()


try:
	# Connect to the Ceph cluster
	r = rados.Rados(rados_id=ceph_id)
	r.conf_set("mon_host", ceph_monitor)
	r.conf_set("key", ceph_secret)
	r.connect()

	io = r.open_ioctx(ceph_rgw_pool)

	# Every object in the .rgw pool which doesn't start with a dot is a bucket
	for o in io.list_objects():
		b = str(o.key)
		if not b.startswith("."):
			change_bucket_region(b, ceph_rgw_region)

	io.close()
	r.shutdown()
except Exception as e:
	print "Error: " + str(e)

Also available as a download.

Use this script with caution since it will change the region of ALL buckets on your cluster to what you specify.

CloudStack: The given command does not exist or it is not available for user

So I was working on CloudStack today and I built new packages from the 4.2 branch to test some new things for the Ceph integration.

After installing the new packages and restarting my management server I wasn’t able to log on anymore. This is what I got:

The given command does not exist or it is not available for user

It took me quite some time to figure out what was going on, but after turning on MySQL logging it turned out that I was missing a column in a database. This setup is a dev setup where I build packages on a daily basis and perform a lot of database changes manually.

The problem was that my database was out of sync with what the code expected it to be. When you go from version A to B the management server will upgrade the database accordingly, but I went from version B to B, which did include some database changes that weren’t taken care of by the DatabaseUpgradeChecker. That makes perfect sense, since this is a dev server.

So should you encounter this message at some point, turn on MySQL query logging and look at the queries it tries to execute. You’ll probably see that one of them fails.

This causes the whole management server not to start properly.
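
For reference, enabling MySQL’s general query log can be done at runtime; on my setup it was something along these lines (the log file path is just an example):

mysql> SET GLOBAL general_log_file = '/var/log/mysql/query.log';
mysql> SET GLOBAL general_log = 'ON';
$ tail -f /var/log/mysql/query.log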

A quick note on running CloudStack with RBD on Ubuntu 12.04

When you want to use Ceph as Primary Storage in Apache CloudStack you need a recent version of libvirt with RBD storage pool support enabled.

If you want to use Ubuntu 12.04 LTS (Precise) you would need to manually compile libvirt since the default libvirt version doesn’t include RBD storage pool support.

But not anymore! Ubuntu has its Cloud Archive, which is aimed at OpenStack, but that doesn’t matter: we just want a newer version of libvirt with RBD storage pool support.

So, add the Cloud Archive repository and an Apt source for Ceph and you can use RBD with CloudStack without compiling anything!

$ sudo apt-get install ubuntu-cloud-keyring
$ echo deb http://ubuntu-cloud.archive.canonical.com/ubuntu precise-updates/grizzly main | sudo tee /etc/apt/sources.list.d/cloud-archive.list
$ wget -q -O- 'https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc' | sudo apt-key add -
$ echo deb http://eu.ceph.com/debian-cuttlefish/ $(lsb_release -sc) main | sudo tee /etc/apt/sources.list.d/ceph.list
$ sudo apt-get install cloudstack-agent

Voila, you now have all the packages you need to run a CloudStack agent with RBD support.
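
If you want to double-check where the packages are coming from, something like this should show libvirt and Qemu being pulled in from the Cloud Archive (package names as they are on Precise):

$ apt-cache policy libvirt-bin qemu-kvm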

How I built my 3-phase Open EVSE

Ever since I posted on my blog that I built my own Open EVSE for my future Tesla Model S I’ve been getting a lot of e-mails from people asking how I built it.

A couple of notes to everybody who wants to build one:

I’m using an Open EVSE board with a modified firmware
I modified the firmware so that, with the Advanced Power Supply, it switches to Level 2 charging when it senses 230V on L1.

The source code can be found on my Github account.

I also have two compiled versions (with LCD support) available (both from 01-09-2012):

You can program the EVSE using ‘avrdude’ and the right programmer.

The relay I’m using is a 40A 4-pole
I’m using a Hager ESL440S relay.

This relay has 4 poles and works on 12V AC or DC.

There is a second relay which switches on my main relay
The main relay (Hager ESL440S) works on 12V DC, but pulls about 1000mA to switch on.

That is a bit too much for the Open EVSE board, so I had to buy a 12V DC power supply and a second, smaller relay. When the Open EVSE board switches on the small relay, that in turn switches on the main relay using the external 12V supply.

If you go to page 8 of the PDF I wrote you can see these components.

In the casing where the Open EVSE board sits you can see the small relay on the left.

The external 12V DC power supply is in the distribution panel on the right, to the left of the main relay. You can see the green and red LEDs on it.

I limited my EVSE to 30A
I limited the pilot signal to 30A. 32A would stress some fuses in the distribution panel in my house, since that 32A relay also provides power to the fluorescent lights in my shed. So I turned the EVSE down to 30A instead of 32A. Technically I could use 32A, but 30A was a safe bet in this case.

1-phase or 3-phase doesn’t matter
The EVSE itself doesn’t know anything about 1-phase or 3-phase. When a car connects it talks to the EVSE and requests power; when all the criteria match, the EVSE turns on the relay.

The car then senses the 3 phases and will use them if its charger supports it. The EVSE has nothing to do with that.

To conclude:

  • Read the PDF I wrote.
  • For the EVSE 3-phase or 1-phase doesn’t matter. It just switches on a relay.
  • Read the Open EVSE website about J1772, programming, etc.

Redundant Ceph monitors with Round Robin DNS

One of the unique features of Ceph is that it can be built without any Single Point of Failure. When designed properly, no single machine failure will take your cluster down.

Ceph’s monitors play a crucial part in this. To make them redundant you want an odd number of monitors, where 3 is more than sufficient for most clusters.

When librados (the RADOS client library) reads ceph.conf it can find something like:

[mon.a]
  mon addr = 192.168.0.1:6789

[mon.b]
  mon addr = 192.168.0.2:6789

[mon.c]
  mon addr = 192.168.0.3:6789

The problem is that when working with, for example, Apache CloudStack you can’t have it read a ceph.conf, nor does CloudStack support multiple Ceph monitors.

The reason behind this is that CloudStack passes storage pools around internally in the form of URIs, for example: rbd://1.2.3.4:6789/mypool

So you’d be stuck with a single monitor in CloudStack. It’s not a disaster, since when a client successfully connects to the Ceph cluster it will receive a monitor map which tells it which other monitors are available should the one it’s connected to fail. But if that specific monitor is down at the moment you want to connect, you have a problem.

A solution to this is to create a Round Robin DNS record with all your monitors in it:

monitor.ceph.lan. A 192.168.0.1
monitor.ceph.lan. A 192.168.0.2
monitor.ceph.lan. A 192.168.0.3

You can have your librados client connect to “monitor.ceph.lan” and it will connect to one of the monitors listed in that A record. Is one of the monitors down? It will connect to another one.
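
If you still have clients that read a ceph.conf you can point them at the same record instead of listing every monitor (a sketch; the record name is of course up to you):

[global]
  mon host = monitor.ceph.lan

And in CloudStack the storage URI then simply becomes something like rbd://monitor.ceph.lan:6789/mypool.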

This doesn’t only work with CloudStack; it works with any RADOS client, like Qemu, libvirt, phprados, rados-java, python-rados, etc. Anything that connects via librados.

P.S.: Ceph fully (!) supports IPv6, so you can also create a Round Robin AAAA-record 🙂

CloudStack: Zone X is not ready to launch console proxy yet

As you might know, I’m a committer in the Apache CloudStack project and I work on it on a daily basis.

I have a couple of development setups running and I upgraded one of them (where I do all my Ceph development) from 4.0 to 4.1 (which isn’t out yet), and suddenly I got this message in my logs:

Zone 1 is not ready to launch console proxy yet

That log line didn’t tell me much, so I started digging through the code to find out WHY my zone wasn’t ready, since it was working under 4.0.

It turned out that the global setting “secondary.storage.vm” wasn’t set to true, and that caused my KVM zone not to work.

This setting can’t be changed through the Web UI (not sure why) and I had to change it in the database instead. After setting it to “true” my System VMs began to start again and all worked just fine.
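
For the record, changing it in the database boils down to something like this (a sketch; verify the table and setting name on your own setup, and restart the management server afterwards):

mysql> UPDATE cloud.configuration SET value = 'true' WHERE name = 'secondary.storage.vm';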

It seems this was legacy on my end, since the upgrade process doesn’t touch this setting at all. I’m adding some extra debugging to the code to make it a bit more clear WHY your zone isn’t ready.

Should you ever encounter this one, verify this setting.

Enhanced RBD support for CloudStack 4.2

About an hour ago the new storage subsystem got merged into the master branch of CloudStack. That is wonderful news for all of you out there who want to use features like snapshotting with RBD in CloudStack.

In pre-4.2 CloudStack a snapshot was the same as a backup. As soon as you created a snapshot it would also copy that snapshot to the secondary storage. Not only could this lead to high network utilization when talking about 1TB RBD volumes, it also caused problems with the underlying ‘qemu-img’ tool. To make a long story short: snapshots with RBD just wouldn’t work in CloudStack 4.0 or 4.1 without resorting to dirty hacks. Which we didn’t do.

The new storage subsystem separates the backup and snapshot process. Snapshots are handled by the primary storage and they can be copied to the ‘backup storage’ on request. This allows us to use the full snapshot potential of RBD.
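
To give an idea of what RBD brings to the table here, these are the kind of native snapshot operations it supports (pool and image names are just examples):

$ rbd snap create mypool/myvolume@snap1
$ rbd snap ls mypool/myvolume
$ rbd snap rollback mypool/myvolume@snap1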

I was waiting for the storage subsystem to be merged into the master branch before I could start working on this. About two weeks ago I already wrote a small functional spec on CloudStack’s wiki to describe what has to be done.

A couple of choices still have to be made. Traditionally we could do everything through libvirt and ‘qemu-img’, but from what I can see now we’ll run into some trouble. We might have to go through the process of wrapping librbd into a Java library to get it all done, but I’m not completely positive about that. Some patches for libvirt(-java) could probably also do the job, but it would take a lot of time and work to get those upstream and into the repositories. The goal is to have this new RBD code work natively on a Ubuntu 13.04 system.

The expectation is that CloudStack 4.2 will be released mid-July this year, but if you are a daredevil you can always track the master branch and play around with that.

I’ll post updates to the cloudstack-dev list on a regular basis about the progress, but you can also watch the master branch and search for commits with ‘RBD’ in the message.

100% CPU utilization on a Cisco 887VA

Some time ago I wrote a blogpost about using a Cisco 887VA router on an XS4All (Dutch ISP) connection. The original article is mostly in Dutch, but I’ll keep this one in English since it will probably help users all over the world.

A couple of days ago I got an e-mail from somebody who read my blogpost and asked me if the 887VA was able to handle more than 25Mbit. I had never really tested it since I thought the copper cable in our office wasn’t that good. During a download I logged into the router and saw that the CPU was 94% utilized!

The VDSL line was however online at 38Mbit, so how could this happen? Was the router underpowered?

I couldn’t wrap my head around it. A brand new VDSL router from Cisco couldn’t handle just 25Mbit? Something had to be wrong.

Some searching brought me to the Cisco Support Forums, and one of the suggestions was to turn on CEF (Cisco Express Forwarding), a Cisco technology to improve Layer 3 forwarding performance.

Logging in to the router indeed showed that CEF was disabled for both IPv4 and IPv6:

no ip cef
no ipv6 cef

Enabling CEF was simple:

conf t
ip cef
ipv6 cef

And voilà! I was suddenly able to use the full 38Mbit with just ~50% CPU load.
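
If you want to verify this on your own router, these show commands should give you the CEF state and the CPU load (output will obviously differ per setup):

show ip cef summary
show processes cpu sorted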

My EVSE is online!

It took some work and tuning, but my own Open EVSE is online!

After connecting the Advanced Power Supply cabling it automatically switches to Level 2 charging at 30A.

I made a small change to the Open EVSE code since in the EU we have 230/400V instead of 110/220V. This can be found on my Github account.


Today I already had another Roadster owner visit while I helped him install OVMS in his 2.0 Roadster Sport. His car charged nicely at 30A for about 4 hours.

To get Open EVSE working with the Roadster I had to add a 2.4k resistor on top of R1 to get the resistance back to 650~700 Ohm, as mentioned in the Open EVSE issue tracker.


Below are two pictures of both Roadsters charging at the newly installed EVSE.

If you have any questions, feel free to contact me!

Ceph distributed storage with CloudStack

As we are nearing the CloudStack 4.0 release, I figured it was time I wrote something about the Ceph integration in CloudStack 4.0.

At the beginning of this year we (my company) decided we wanted to use CloudStack for our cloud product, but we also wanted to use Ceph for the storage. CloudStack lacked support for Ceph, so I decided to implement it.

Fast forward 4 months, a long flight to California, becoming a committer and PPMC member of CloudStack, various patches for libvirt(-java) and here we are, 25 September 2012!

RBD, the RADOS Block Device from Ceph, enables you to stripe disks for (virtual) machines across your Ceph cluster. This not only gives high performance, it also gives you virtually unlimited scalability (without downtime!) and redundancy. Something your NetApp, EMC or EqualLogic SAN can’t give you.

Although I’m a very big fan of Nexenta (I use it a lot), it also has its limitations. A SAS environment won’t keep scaling forever, and SAS is expensive! Yes, ZFS is truly awesome, but you can’t compare it to the distributed powers Ceph has.

The current implementation of RBD in CloudStack is for Primary Storage only, but that’s mainly what you want. It does have a couple of limitations though:

  • You still need either NFS or Local Storage for your System VMs
  • Snapshotting isn’t enabled (see below!)
  • It only works with KVM (Using RBD in Qemu)

If you are happy with that, you’ll be able to allocate hundreds of TBs to your CloudStack cluster like it’s nothing.

What do you need to use RBD for Primary Storage?

  • CloudStack 4.0 (RC2 is out now)
  • Hypervisors with Ubuntu 12.04.1
  • librbd and librados on your hypervisors
  • Libvirt 0.10.0 (Needs manual installation)
  • Qemu compiled with RBD enabled

There is no need for special configuration on your hypervisor; that’s all controlled by the Management Server. I’d however recommend that you test the Ceph connectivity first:

rbd -m <monitor address> --user <cephx id> --key <cephx key> ls

If that works you can go ahead and add the RBD Primary Storage pool to your CloudStack cluster. It should be there when adding a new storage pool.

It behaves like any storage pool in CloudStack, except the fact that it is running on the next generation of storage 🙂

About the snapshots: this will be implemented in a later version, probably 4.2. It mainly has to do with the way CloudStack currently handles snapshots. A major overhaul of the storage code is planned, and as part of that I’ll implement snapshotting.

Testing is needed! So if you have the time, please test and report back!

You can find me on the Ceph and CloudStack IRC channels and mailing lists; feel free to contact me. Remember that I’m in GMT+2 (Netherlands).