Kubernetes and Ceph : I think I want to marry you

I’ve been playing with my Kubernetes cluster for quite some time now, mostly trying to learn by deploying simple workloads. I’ve played with Ceph before and was really impressed with it. So, with that old Ceph cluster still around, I was thinking (also because I have that Bruno Mars song stuck in my head while writing this), why not marry the two after reading the following today:

According to Gartner, 50% of global enterprises will be running containers in production by the year 2020. By that time, over 20% of enterprise storage capacity will be allocated to container workloads, compared to only 1% today.

Similar to CNI, there’s also a standard for storage vendors developing plugins that work with orchestration systems: CSI, the Container Storage Interface. I stumbled upon rbd-provisioner (an external provisioner that predates CSI), and this is my experience of how I managed to use Ceph as the storage provider for my Kubernetes cluster.

Let’s start by creating the ClusterRole, ClusterRoleBinding, Role, RoleBinding, Service Account, and the RBD Provisioner Deployment resources using this.
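For reference, here is a sketch of what my Ceph-RBD-Provisioner.yaml roughly contains, abridged from the kubernetes-incubator/external-storage project (the Role/RoleBinding granting access to Secrets, and a few ClusterRole rules, are omitted here for brevity):

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: rbd-provisioner
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: rbd-provisioner
rules:
  - apiGroups: [""]
    resources: ["persistentvolumes"]
    verbs: ["get", "list", "watch", "create", "delete"]
  - apiGroups: [""]
    resources: ["persistentvolumeclaims"]
    verbs: ["get", "list", "watch", "update"]
  - apiGroups: ["storage.k8s.io"]
    resources: ["storageclasses"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["create", "update", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: rbd-provisioner
subjects:
  - kind: ServiceAccount
    name: rbd-provisioner
    namespace: kube-system
roleRef:
  kind: ClusterRole
  name: rbd-provisioner
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: rbd-provisioner
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: rbd-provisioner
    spec:
      serviceAccount: rbd-provisioner
      containers:
        - name: rbd-provisioner
          image: quay.io/external_storage/rbd-provisioner:latest
          env:
            # This name is what the StorageClass will reference later
            - name: PROVISIONER_NAME
              value: ceph.com/rbd
```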

[me@devops Ceph]# kubectl create -n kube-system -f Ceph-RBD-Provisioner.yaml
clusterrole.rbac.authorization.k8s.io/rbd-provisioner created
clusterrolebinding.rbac.authorization.k8s.io/rbd-provisioner created
role.rbac.authorization.k8s.io/rbd-provisioner created
rolebinding.rbac.authorization.k8s.io/rbd-provisioner created
serviceaccount/rbd-provisioner created
deployment.extensions/rbd-provisioner created
[me@devops Ceph]#

Checking the rbd-provisioner deployment

[me@devops Ceph]# kubectl get pods -l app=rbd-provisioner -n kube-system
NAME READY STATUS RESTARTS AGE
rbd-provisioner-67b4857bcd-nb97h 1/1 Running 0 58s
[me@devops Ceph]#

rbd-provisioner requires the admin client key, which you can get by issuing the following on your Ceph cluster:

[cephuser@ceph-admin-mon ~]$ ceph auth get-key client.admin
AQCp+ltdatIKFhAAOia5xyKg/CeTvwd4rUImvw==
[cephuser@ceph-admin-mon ~]$

Using that key, let’s create a Secret resource:

kubectl create secret generic ceph-secret --type="kubernetes.io/rbd" --from-literal=key='AQCp+ltdatIKFhAAOia5xyKg/CeTvwd4rUImvw==' --namespace=kube-system

Let’s now create a new Ceph pool and also a client key for it.

[cephuser@ceph-admin-mon ~]$ ceph --cluster ceph osd pool create kube 16 16
pool 'kube' created
[cephuser@ceph-admin-mon ~]$

[cephuser@ceph-admin-mon ~]$ ceph --cluster ceph auth get-or-create client.kube mon 'allow r' osd 'allow rwx pool=kube'
[client.kube]
key = AQCFsIhdiHVuMRAAIps556gTO6UiUotI41LGog==
[cephuser@ceph-admin-mon ~]$

Let’s create a Secret resource that will hold that key for the pool.

[me@devops Ceph]# kubectl create secret generic ceph-secret-kube --type="kubernetes.io/rbd" --from-literal=key="AQCFsIhdiHVuMRAAIps556gTO6UiUotI41LGog==" --namespace=kube-system
secret/ceph-secret-kube created
[me@devops Ceph]#

Let’s now create a new StorageClass for our Ceph pool. This holds the client connection details (Ceph monitors, secret references, pool name, etc.) and the provisioner to be used (remember that Deployment resource earlier?).
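Here is roughly what my Ceph-RBD-StorageClass.yaml looks like. The monitor address below is a placeholder; substitute your own Ceph monitor endpoint(s). The parameter names are the ones the RBD provisioner expects:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-rbd
provisioner: ceph.com/rbd          # must match PROVISIONER_NAME in the Deployment
parameters:
  monitors: 192.168.0.10:6789     # placeholder: your Ceph monitor address(es)
  adminId: admin
  adminSecretName: ceph-secret
  adminSecretNamespace: kube-system
  pool: kube
  userId: kube
  userSecretName: ceph-secret-kube
  userSecretNamespace: kube-system
  imageFormat: "2"
  imageFeatures: layering
```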

[me@devops Ceph]# kubectl create -f Ceph-RBD-StorageClass.yaml
storageclass.storage.k8s.io/fast-rbd created
[me@devops Ceph]#

Let’s check the rbd-provisioner

[me@devops Ceph]# kubectl describe po rbd-provisioner-67b4857bcd-nb97h -n kube-system
Name: rbd-provisioner-67b4857bcd-nb97h
Namespace: kube-system
Priority: 0
PriorityClassName: <none>
Node: k8s-node2/192.168.0.158
Start Time: Mon, 23 Sep 2019 20:40:17 +0800
Labels: app=rbd-provisioner
pod-template-hash=67b4857bcd
Annotations: cni.projectcalico.org/podIP: 10.244.2.184/32
Status: Running
IP: 10.244.2.184
Controlled By: ReplicaSet/rbd-provisioner-67b4857bcd
Containers:
rbd-provisioner:
Container ID: docker://eef78544e2047020de1d2e7614413d0bf1f49220fb6a7922602cb5c902022420
Image: quay.io/external_storage/rbd-provisioner:latest
Image ID: docker-pullable://quay.io/external_storage/rbd-provisioner@sha256:94fd36b8625141b62ff1addfa914d45f7b39619e55891bad0294263ecd2ce09a
Port: <none>
Host Port: <none>
State: Running
Started: Mon, 23 Sep 2019 20:41:11 +0800
Ready: True
Restart Count: 0
Environment:
PROVISIONER_NAME: ceph.com/rbd
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from rbd-provisioner-token-nmrj6 (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
rbd-provisioner-token-nmrj6:
Type: Secret (a volume populated by a Secret)
SecretName: rbd-provisioner-token-nmrj6
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ --- ---- -------
Normal Scheduled <invalid> default-scheduler Successfully assigned kube-system/rbd-provisioner-67b4857bcd-nb97h to k8s-node2
Normal Pulling <invalid> kubelet, k8s-node2 pulling image "quay.io/external_storage/rbd-provisioner:latest"
Normal Pulled <invalid> kubelet, k8s-node2 Successfully pulled image "quay.io/external_storage/rbd-provisioner:latest"
Normal Created <invalid> kubelet, k8s-node2 Created container
Normal Started <invalid> kubelet, k8s-node2 Started container
[me@devops Ceph]#

Let’s test it out by creating a PersistentVolumeClaim using this.
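My Ceph-RBD-PVC.yaml is a plain PersistentVolumeClaim that references the new StorageClass; the name, size, and access mode below match the output that follows:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: testclaim
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  # ask the fast-rbd StorageClass to dynamically provision the volume
  storageClassName: fast-rbd
```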

[me@devops Ceph]# kubectl create -f Ceph-RBD-PVC.yaml
persistentvolumeclaim/testclaim created

You will see that a PersistentVolume is automatically created

[me@devops Ceph]# kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
mysql-pv-volume 1Gi RWO Retain Bound ehandoff/mysql-pv-claim manual 100d
pvc-4d816e48-de02-11e9-9d31-525400459b48 1Gi RWO Delete Bound default/testclaim fast-rbd 4s
[me@devops Ceph]#

And here is our PersistentVolumeClaim

[me@devops Ceph]# kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
testclaim Bound pvc-4d816e48-de02-11e9-9d31-525400459b48 1Gi RWO fast-rbd 8m55s
[me@devops Ceph]#

Looking at our Ceph Cluster

So far so good. Let’s now try to use that PVC in a container, mounting it under /data.
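My nginx-ceph.yaml is a simple pod that mounts the claim; the volume name matches the one that shows up in the events later:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx-ceph
spec:
  containers:
    - name: nginx
      image: nginx
      volumeMounts:
        - name: ceph-rbd-storage
          mountPath: /data      # the RBD-backed volume appears here
  volumes:
    - name: ceph-rbd-storage
      persistentVolumeClaim:
        claimName: testclaim
```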

[me@devops Ceph]# kubectl create -f nginx-ceph.yaml
pod/nginx-ceph created
[me@devops Ceph]#

Or so I thought:

[root@devops Ceph]# kubectl get po
NAME READY STATUS RESTARTS AGE
nginx-ceph 0/1 ContainerCreating 0 118s
[root@devops Ceph]#

Checking in detail what’s going on with the pod, I get the following:

Normal SuccessfulAttachVolume <invalid> attachdetach-controller AttachVolume.Attach succeeded for volume "pvc-4d816e48-de02-11e9-9d31-525400459b48"
Warning FailedMount <invalid> (x6 over <invalid>) kubelet, k8s-master Unable to mount volumes for pod "nginx-ceph_default(a9e6054d-de04-11e9-9d31-525400459b48)": timeout expired waiting for volumes to attach or mount for pod "default"/"nginx-ceph". list of unmounted volumes=[ceph-rbd-storage]. list of unattached volumes=[ceph-rbd-storage default-token-99fc5]

And here is the culprit

Warning FailedMount <invalid> (x15 over <invalid>) kubelet, k8s-master MountVolume.WaitForAttach failed for volume "pvc-4d816e48-de02-11e9-9d31-525400459b48" : fail to check rbd image status with: (executable file not found in $PATH), rbd output: ()

Going to the node, I realized I had forgotten that this needs the Ceph client, specifically the rbd binary, to work. We can easily get that by installing ceph-common:

[me@k8s-master ~]# yum install ceph-common

Once it’s installed and the pod is redeployed, we can see on the node that the RBD volume is mounted correctly:

[me@k8s-master ~]# df -h | grep rbd
/dev/rbd0 976M 2.6M 958M 1% /var/lib/kubelet/plugins/kubernetes.io/rbd/mounts/kube-image-kubernetes-dynamic-pvc-4e191de7-de02-11e9-9d2c-aa2e8e29c249
[me@k8s-master ~]#

Going inside our container, let’s create a sample file:

me@nginx-ceph:/data# touch test.txt
me@nginx-ceph:/data# ls -ltrh
total 16K
drwx------ 2 root root 16K Sep 23 13:39 lost+found
-rw-r--r-- 1 root root 0 Sep 23 13:41 test.txt
me@nginx-ceph:/data#

And checking Ceph again, we can see an object was created.

Conclusion

In this post we have seen how to integrate Kubernetes and Ceph. Aside from RBD, we can also leverage Ceph through CephFS. This was just a simple test of how we can use Ceph to provide storage for our Kubernetes workloads.