Easy mapping+mounting rbd images on CentOS 7 with Systemd

Warning: Things change a lot around Ceph. The guide below might not work with the current release.

Following Laurent Barble's guide about mapping rbd images without installing ceph-common, I've played with systemd + fstab on CentOS 7 (other distributions with kmod-rbd and systemd might also support this method). Systemd adds some interesting new features to regular fstab entries, such as mounting filesystems automatically on demand.

Setting things up

Let's assume you have a working Ceph cluster and have already created a block device image {ceph_image} in the pool {ceph_pool}.
On the client side you need a CentOS 7 minimal install (no Ceph software installed).
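
If you still need to create the image, something like the following on a node that has the Ceph tools installed should do it ({size_in_MB} is a placeholder for the image size in megabytes):

 rbd create --pool {ceph_pool} --size {size_in_MB} {ceph_image}   # stick to an image format/features your kernel's rbd module supports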

Update: RHEL 7.1 now contains the ceph and rbd kernel modules, so there is no need to install them.

First install kmod-rbd and kmod-libceph from the official Ceph repos. (You can also try your luck with kernel-ml from ELRepo, which also supports rbd.) If your kernel already has rbd support you can skip this step.

 yum -y install https://ceph.com/rpm-testing/rhel7/x86_64/kmod-rbd-3.10-0.1.20140702gitd... https://ceph.com/rpm-testing/rhel7/x86_64/kmod-libceph-3.10-0.1.20140702...

Then try to load the rbd module

 modprobe rbd

If it loads without errors, you can move on to mapping an image.
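
To double-check that the module is actually loaded, lsmod should list it:

 lsmod | grep rbd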

Map the rbd device manually (test and initial setup)

Now try to map the remote rbd image to a local device. The admin name and key can be read using "ceph auth list" and "ceph auth get-key" on your Ceph monitor node.
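
For example, on the monitor node something like this should print the key for the admin user (client.admin is assumed here; use whatever client name you actually have):

 ceph auth get-key client.admin

Use that value as {ceph_key} in the command below.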

 /bin/echo {ceph_monitor_ip} name={ceph_admin},secret={ceph_key} {ceph_pool} {ceph_image} >/sys/bus/rbd/add

This will create a new block device called /dev/rbdX or /dev/rbd/{ceph_pool}/{ceph_image}.
After this you can format and mount your filesystem using the new device.

mkfs.xfs -L {ceph_image} /dev/rbd/...  # It is important to label the filesystem (the label is used in fstab later). XFS is the default filesystem on CentOS 7
mount /dev/rbd/... {local_mount_point} 

If you want to remove the rbd device, then unmount the filesystem and issue

 echo "0" >/sys/bus/rbd/remove

(assuming this was your only rbd device).
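
If more than one image is mapped, you can check which device number belongs to which image under sysfs before removing anything (attribute names assumed from the standard rbd sysfs interface):

 ls /sys/bus/rbd/devices/
 cat /sys/bus/rbd/devices/0/name   # image name behind /dev/rbd0
 cat /sys/bus/rbd/devices/0/pool   # pool of /dev/rbd0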

Now comes the interesting part: how to mount automagically.

Specifying the mount entry in /etc/fstab

If you have multiple rbd images (and thus multiple rbd devices) it is important to label your filesystems and use the labels in the fstab entries. Put something like this in /etc/fstab (replace the image and mount point names)

LABEL={ceph_image} {local_mount_point} xfs rw 0 2 
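
If you are not sure what label a filesystem ended up with, blkid on the mapped device will show it (assuming the image is mapped as /dev/rbd0):

blkid /dev/rbd0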

Now, let's tell systemd to take care of mounting the filesystem when needed. Modify the previous entry this way

LABEL={ceph_image} {local_mount_point} xfs rw,noauto,x-systemd.automount 0 2 

The noauto option means that the filesystem will not be mounted when "mount -a" is run; systemd will take care of mounting it on demand instead. You may add other options like seclabel, relatime and x-systemd.device-timeout=10.
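
For instance, a hypothetical entry using relatime and a device timeout (so boot does not hang forever if the rbd device never shows up) could look like this:

LABEL={ceph_image} {local_mount_point} xfs rw,noauto,relatime,x-systemd.automount,x-systemd.device-timeout=10 0 2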

A systemd service to map and mount automatically on boot / demand

You can create a systemd service to do the rbd mapping on boot.
Create a new systemd service unit (e.g. /etc/systemd/system/rbd-{ceph_pool}-{ceph_image}.service) for each of your remote rbd images:

[Unit]
Description=RADOS block device mapping for "{ceph_pool}"/"{ceph_image}"
Conflicts=shutdown.target
Wants=network-online.target
# Remove this if you don't have NetworkManager
After=NetworkManager-wait-online.service

[Service]
Type=oneshot
ExecStart=/sbin/modprobe rbd
ExecStart=/bin/sh -c "/bin/echo {ceph_mon_ip} name={ceph_admin},secret={ceph_key} {ceph_pool} {ceph_image} >/sys/bus/rbd/add"
TimeoutSec=0
RemainAfterExit=yes

[Install]
WantedBy=multi-user.target
WantedBy=remote-fs-pre.target
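
After creating or editing the unit file, tell systemd to reload its configuration so it picks up the new unit:

systemctl daemon-reload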

Start the service and check whether /dev/rbd0 has been created.

systemctl start rbd-{ceph_pool}-{ceph_image}.service
systemctl -l status rbd-{ceph_pool}-{ceph_image}.service
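
To double-check that the mapping produced a usable block device, lsblk should list it (assuming it became /dev/rbd0):

lsblk /dev/rbd0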

If everything seems OK, enable the service:

systemctl enable rbd-{ceph_pool}-{ceph_image}.service

The above unit file (without the NetworkManager dependency) works well for me on oVirt nodes.

Problems... and solutions?

Systemd is a pretty complex piece of software, and sometimes it does not work as expected. I thought that putting "Wants=network-online.target" in the unit meant the network connection would be up and working by the time this service starts. Well, it was not. This is why I've also enabled NetworkManager-wait-online.service and made it a dependency of the rbd mapping service. NetworkManager, however, is not enabled on all systems (for example, it is disabled on oVirt nodes).
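
If your client does use NetworkManager, the wait-online service can be enabled with the usual systemctl call:

systemctl enable NetworkManager-wait-online.service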

Running out of space? No problem. Expand the image on the Ceph admin node

rbd --pool={ceph_pool} --size={new-size} resize {ceph_image} 

and resize the filesystem on the fly on the client

xfs_growfs {local_mount_point}
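
To confirm that the extra space is really there, df on the mount point should show the new size (placeholder names as above):

df -h {local_mount_point}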

...