Wednesday, September 14, 2022

Installing Red Hat Openshift Data Foundation via the command line

Red Hat Openshift Data Foundation (ODF) is an Openshift Container Platform (OCP) add-on for providing Ceph storage to pods running within a deployed cluster. This Red Hat product is based on the upstream Rook project. When deployed, ODF will provide CephFS, RBD and RGW backed storage classes. The Ceph cluster monitors will be installed on the OCP master nodes with the Ceph OSD processes running on the designated OCP worker nodes.

The processes in this post have been validated against OCP and ODF versions 4.8 through 4.11.

A basic understanding of Openshift, Openshift Data Foundation and Ceph is needed to follow this post. This post is not intended to replace official product documentation.

Lab Description

Hardware

This post is developed utilizing a virtual lab consisting of 7 virtual machines. These consist of the following types of hosts:

  • Provision: 2 CPU, 16G RAM, >70G system disk, RHEL 8
  • Master (x3): 6 CPU, 16G RAM, 60G system disk
  • Worker (x3): 14 CPU, 48G RAM, 60G system disk, 10G data disk (x3)

Network

The virtual lab consists of two networks.

  • Provision: Isolated, managed by the Openshift Installer
  • Baremetal: Bridged, managed by a DNS/DHCP server out of scope for this document

Additional Lab Infrastructure

  • A virtual BMC service is used to provide IPMI management of the virtual machines. This service runs on the hypervisor and is reachable from the provision host.
  • A DNS/DHCP virtual server. This service provides DHCP and DNS services to the OCP cluster. The OS provided dhcp and bind servers are used to provide IP address assignment and name resolution.

Cluster Installation

The OCP cluster is deployed using the Openshift Installer Provisioned Infrastructure (IPI) install method. This method uses a series of Ansible playbooks executed from the provision node to deploy an OCP cluster. The details of the installation are beyond the scope of this post.

Installation Environment
The ODF installation process is performed from the provision host by a user with access to the admin level kubeconfig. This is automatically configured for the user who runs the IPI installer.

ODF Installation Process

Node Verification

Verify availability of nodes. Ensure three master and three worker nodes are available.
$ oc get nodes
NAME       STATUS   ROLES    AGE    VERSION
master-0   Ready    master   110m   v1.24.0+b62823b
master-1   Ready    master   110m   v1.24.0+b62823b
master-2   Ready    master   110m   v1.24.0+b62823b
worker-0   Ready    worker   91m    v1.24.0+b62823b
worker-1   Ready    worker   91m    v1.24.0+b62823b
worker-2   Ready    worker   90m    v1.24.0+b62823b
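
The same check can be scripted rather than read by eye. The following is a minimal sketch, not part of the original procedure: the function names count_ready_nodes and verify_lab_nodes are made up for illustration, and the expected totals of three masters and three workers are specific to this lab.

```shell
# Count nodes that are Ready for a given role, e.g. "master" or "worker".
# Parses `oc get nodes --no-headers` columns: NAME STATUS ROLES AGE VERSION.
# A node reported as "Ready,SchedulingDisabled" is deliberately not counted.
count_ready_nodes() {
  role="$1"
  oc get nodes --no-headers |
    awk -v role="$role" '$2 == "Ready" && $3 ~ role { n++ } END { print n + 0 }'
}

# Succeed only when the lab topology (3 masters, 3 workers) is fully Ready.
verify_lab_nodes() {
  [ "$(count_ready_nodes master)" -eq 3 ] &&
    [ "$(count_ready_nodes worker)" -eq 3 ]
}
```

Running verify_lab_nodes before continuing gives a non-zero exit status if any node is missing or not Ready.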

Verify availability of storage on the worker nodes. At least one disk needs to be available per host but three disks are used in this example. worker-0 is used as an example with worker-1 and worker-2 being similar.
$ oc debug node/worker-0

Pod IP: 192.168.122.35
If you don't see a command prompt, try pressing enter.
sh-4.4# chroot /host
sh-4.4# lsblk
NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda      8:0    0   60G  0 disk 
|-sda1   8:1    0    1M  0 part 
|-sda2   8:2    0  127M  0 part 
|-sda3   8:3    0  384M  0 part /boot
`-sda4   8:4    0 59.5G  0 part /sysroot
vda    252:0    0   10G  0 disk 
vdb    252:16   0   10G  0 disk 
vdc    252:32   0   10G  0 disk
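
Instead of opening an interactive debug shell on each worker, the disk listing can be collected in one pass. This is a sketch under the assumption that the workers carry the standard node-role.kubernetes.io/worker label; the function name list_worker_disks is hypothetical.

```shell
# Run lsblk on every worker node without an interactive session.
# `oc debug node/<name> -- <command>` starts a debug pod, runs the
# command, and removes the pod afterwards.
list_worker_disks() {
  for node in $(oc get nodes -l node-role.kubernetes.io/worker -o name); do
    echo "== ${node} =="
    oc debug "${node}" -- chroot /host lsblk
  done
}
```

Disks that show no partitions and no mount point, such as vda, vdb and vdc above, are the candidates for OSD usage.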

ODF Subscription Installation and Verification

The ODF subscription needs to be added to the OCP cluster. This subscription is available via the Openshift Marketplace and is added to the cluster with the following yaml file.
$ cat odf-subscription.yaml
apiVersion: v1
kind: Namespace
metadata:
  labels:
    openshift.io/cluster-monitoring: "true"
  name: openshift-storage
spec: {}
---
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  annotations:
  generateName: openshift-storage-
  name: openshift-storage-24mhn
  namespace: openshift-storage
spec:
  targetNamespaces:
  - openshift-storage
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  labels:
    operators.coreos.com/odf-operator.openshift-storage: ""
  name: odf-operator
  namespace: openshift-storage
spec:
  channel: stable-4.11
  installPlanApproval: Automatic
  name: odf-operator
  source: redhat-operators
  sourceNamespace: openshift-marketplace
  startingCSV: odf-operator.v4.11.0
  
$ oc apply -f odf-subscription.yaml
namespace/openshift-storage created
operatorgroup.operators.coreos.com/openshift-storage-24mhn created
subscription.operators.coreos.com/odf-operator created
The installation can be monitored with the following command. The operator is fully installed when "Succeeded" is returned as the phase.
$ watch -d "oc get csv -n openshift-storage -l operators.coreos.com/ocs-operator.openshift-storage='' -o jsonpath='{.items[0].status.phase}'"
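
For unattended installs, watch can be replaced by a polling loop that returns once the phase is reached. The sketch below is one possible approach; the function name wait_for_csv, the 10 second poll interval and the 600 second default timeout are assumptions, not values taken from the product documentation.

```shell
# Poll a CSV's phase until it reports "Succeeded" or the timeout expires.
# Arguments: namespace, label selector, timeout in seconds (default 600).
wait_for_csv() {
  ns="$1"; selector="$2"; timeout="${3:-600}"
  waited=0; phase=""
  while [ "$waited" -lt "$timeout" ]; do
    # Errors are silenced because the CSV may not exist yet early on.
    phase="$(oc get csv -n "$ns" -l "$selector" \
      -o jsonpath='{.items[0].status.phase}' 2>/dev/null)"
    if [ "$phase" = "Succeeded" ]; then
      echo "CSV ready after ${waited}s"
      return 0
    fi
    sleep 10
    waited=$((waited + 10))
  done
  echo "timed out; last phase: ${phase:-unknown}" >&2
  return 1
}
```

For the ODF operator this would be called as: wait_for_csv openshift-storage "operators.coreos.com/ocs-operator.openshift-storage="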

The openshift-storage pods can be listed with the following command. All pods should be in the "Running" or "Completed" state. No pods should be in a "Pending" or "Error" state.
$ oc get pods -n openshift-storage
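
With several dozen pods in the namespace it can be easier to print only the ones that need attention. A small sketch, with a hypothetical function name list_unhealthy_pods:

```shell
# Print the name and status of pods that are neither Running nor Completed.
# `oc get pods --no-headers` columns: NAME READY STATUS RESTARTS AGE.
list_unhealthy_pods() {
  ns="${1:-openshift-storage}"
  oc get pods -n "$ns" --no-headers |
    awk '$3 != "Running" && $3 != "Completed" { print $1, $3 }'
}
```

An empty result means every pod is in one of the two expected states. The namespace argument is optional, so the same function can be reused later for openshift-local-storage.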

Next we will enable the ODF console in the OCP web console. This will allow the modification and monitoring of the ODF cluster via the OCP web console.
$ oc patch console.operator cluster -n openshift-storage --type json -p '[{"op": "add", "path": "/spec/plugins", "value": ["odf-console"]}]'

Local Storage Subscription Installation and Verification

The Local Storage subscription needs to be installed. This is to support the worker nodes serving as Ceph OSD servers. This subscription is available via the Openshift Marketplace and is added to the cluster with the following yaml file.

$ cat local-subscription.yaml
apiVersion: v1
kind: Namespace
metadata:
  labels:
    kubernetes.io/metadata.name: openshift-local-storage
    olm.operatorgroup.uid/1b9690c6-f7d4-47f4-8046-b389b44b0612: ""
  name: openshift-local-storage
spec:
  finalizers:
  - kubernetes
---
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  annotations:
    olm.providedAPIs: LocalVolume.v1.local.storage.openshift.io,LocalVolumeDiscovery.v1alpha1.local.storage.openshift.io,LocalVolumeDiscoveryResult.v1alpha1.local.storage.openshift.io,LocalVolumeSet.v1alpha1.local.storage.openshift.io
  name: openshift-local-storage-operator
  namespace: openshift-local-storage
spec:
  targetNamespaces:
  - openshift-local-storage
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  labels:
    operators.coreos.com/local-storage-operator.openshift-local-storage: ""
  name: local-storage-operator
  namespace: openshift-local-storage
spec:
  channel: stable
  installPlanApproval: Automatic
  name: local-storage-operator
  source: redhat-operators
  sourceNamespace: openshift-marketplace

$ oc apply -f local-subscription.yaml

The installation of the local storage operator can be monitored with a command similar to the one used for the ODF operator. The operator is fully installed when "Succeeded" is returned as the phase.

$ watch -d "oc get csv -n openshift-local-storage -l operators.coreos.com/local-storage-operator.openshift-local-storage='' -o jsonpath='{.items[0].status.phase}'"

Again, the status of the pods can be checked. All pods in the openshift-local-storage namespace should be in either the "Running" or "Completed" state.

Configure Local Storage Disks

Once the ODF and local storage operators are installed, the local disks can be configured and made ready for the storage cluster deployment. This consists of two elements:
  1. Labeling the nodes which are used for OSD storage
  2. Configuring the disks for usage
Each worker node will need to be labeled with the "cluster.ocs.openshift.io/openshift-storage=" label. The command for worker-0 is given as an example below. worker-1 and worker-2 are labeled with a similar command.

$ oc label nodes worker-0 cluster.ocs.openshift.io/openshift-storage=''
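
The three label commands can be folded into a loop. This is only a sketch; the function names label_storage_nodes and list_storage_nodes are made up, and --overwrite is added so the command can be re-run safely on nodes that already carry the label.

```shell
# Apply the ODF storage label to every node name passed as an argument.
label_storage_nodes() {
  for node in "$@"; do
    oc label nodes "$node" cluster.ocs.openshift.io/openshift-storage='' --overwrite
  done
}

# List the nodes that currently carry the label (existence selector).
list_storage_nodes() {
  oc get nodes -l cluster.ocs.openshift.io/openshift-storage -o name
}
```

In this lab the call would be: label_storage_nodes worker-0 worker-1 worker-2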

The local storage disks are provisioned with the following yaml file:

$ cat label-disks.yaml
apiVersion: local.storage.openshift.io/v1alpha1
kind: LocalVolumeSet
metadata:
  name: localpv
  namespace: openshift-local-storage
spec:
  deviceInclusionSpec:
    deviceTypes: #Unused disks and partitions meeting these requirements are used
    - disk
    - part
    minSize: 1Gi
  nodeSelector:
    nodeSelectorTerms:
    - matchExpressions: #Nodes with this label are used
      - key: cluster.ocs.openshift.io/openshift-storage
        operator: Exists
  storageClassName: localblock
  tolerations:
  - effect: NoSchedule
    key: node.ocs.openshift.io/storage
    operator: Equal
    value: "true"
  volumeMode: Block
$ oc apply -f label-disks.yaml
localvolumeset.local.storage.openshift.io/localpv created

The apply command will start the configuration process of the local volume disks and their related persistent volumes (PV). The progress of this process can be monitored with the following command. The process is complete when the status returns "True".

$ watch -d "oc get localvolumeset -n openshift-local-storage localpv -o jsonpath='{.status.conditions[0].status}'"

This command will provide more information while monitoring the configuration of the local disks.
$ watch -d "oc get LocalVolumeSet -A;echo; oc get pods -n openshift-local-storage; echo ; oc get pv"

There should be one PV for each disk which matches the "deviceInclusionSpec". In this lab, a total of 9 PVs should be available.

$ oc get pv --no-headers | wc -l
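
A plain line count also includes PVs that belong to other storage classes. The sketch below narrows the count to one class and compares it with the expected total of workers times data disks per worker, which is 3 x 3 = 9 in this lab. The function names count_pvs_for_class and expected_pvs are hypothetical.

```shell
# Count PVs belonging to one storage class. custom-columns is used so the
# class is always the second field, regardless of empty CLAIM or REASON
# columns in the default output.
count_pvs_for_class() {
  class="$1"
  oc get pv --no-headers \
    -o custom-columns=NAME:.metadata.name,CLASS:.spec.storageClassName,PHASE:.status.phase |
    awk -v class="$class" '$2 == class { n++ } END { print n + 0 }'
}

# Expected PV total: number of worker nodes times data disks per worker.
expected_pvs() {
  echo $(( $1 * $2 ))
}
```

A simple gate before deploying the storage cluster would be: [ "$(count_pvs_for_class localblock)" -eq "$(expected_pvs 3 3)" ]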

Configure ODF Storage Cluster

The ODF storage cluster can be deployed with the following yaml file. The storageDeviceSets count should be the number of OSD sets to configure, that is, the number of OSD disks divided by the replica value. In this lab, 9 disks are configured for OSD usage with a replica of 3, so the storageDeviceSets count is set to 3. This value will need to be adjusted for the local environment.

$ cat deploy-storagecluster.yaml
apiVersion: ocs.openshift.io/v1
kind: StorageCluster
metadata:
  name: ocs-storagecluster
  namespace: openshift-storage
spec:
  arbiter: {}
  encryption:
    kms: {}
  externalStorage: {}
  flexibleScaling: true
  resources:
    mds:
      limits:
        cpu: "3"
        memory: "8Gi"
      requests:
        cpu: "3"
        memory: "8Gi"
  monDataDirHostPath: /var/lib/rook
  managedResources:
    cephBlockPools:
      reconcileStrategy: manage   # <-- Default value is manage
    cephConfig: {}
    cephFilesystems: {}
    cephObjectStoreUsers: {}
    cephObjectStores: {}
  multiCloudGateway:
    reconcileStrategy: manage   # <-- Default value is manage
  storageDeviceSets:
  - count: 3  # <-- Modify count to desired value. For each set of 3 disks increment the count by 1.
    dataPVCTemplate:
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: "100Mi"
        storageClassName: localblock
        volumeMode: Block
    name: ocs-deviceset
    placement: {}
    portable: false
    replica: 3
    resources:
      limits:
        cpu: "2"
        memory: "5Gi"
      requests:
        cpu: "2"
        memory: "5Gi"
$ oc apply -f deploy-storagecluster.yaml
storagecluster.ocs.openshift.io/ocs-storagecluster created
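
The count used in the yaml file follows from simple arithmetic: one device set per group of replica disks, so 9 disks with replica 3 give a count of 3. The helper below is an illustrative sketch of that calculation for other environments; device_set_count is a made-up name and the rule it encodes is the one stated in the yaml comment above, not an official sizing formula.

```shell
# Derive the storageDeviceSets count from the total number of OSD disks
# and the replica value. Refuses totals that do not divide evenly, since
# the leftover disks would not form a complete set.
device_set_count() {
  disks="$1"; replica="$2"
  if [ "$replica" -le 0 ] || [ $((disks % replica)) -ne 0 ]; then
    echo "disk total ${disks} is not a multiple of replica ${replica}" >&2
    return 1
  fi
  echo $((disks / replica))
}
```

Calling device_set_count 9 3 reproduces the value used in this lab.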

Applying the storage cluster starts the configuration of the Ceph monitor and OSD processes on the nodes. This process can be monitored with the command below. The storage cluster is deployed and ready for usage once the phase returns "Ready".

$ watch -d "oc get storagecluster -n openshift-storage ocs-storagecluster -o jsonpath='{.status.phase}'"

A more thorough monitoring of the configuration process can be accomplished with this command:

$ watch -d "oc get storagecluster -n openshift-storage; echo ; oc get cephcluster -n openshift-storage; echo; oc get noobaa -n openshift-storage ; echo; oc get pods -n openshift-storage|tail -n 20"

Checking Available Storage Classes

Multiple Ceph storage classes are available at the completion of the storage cluster deployment. These can be viewed with the following command:

$ oc get sc
NAME                          PROVISIONER                             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
localblock                    kubernetes.io/no-provisioner            Delete          WaitForFirstConsumer   false                  5m54s
ocs-storagecluster-ceph-rbd   openshift-storage.rbd.csi.ceph.com      Delete          Immediate              true                   2m4s
ocs-storagecluster-ceph-rgw   openshift-storage.ceph.rook.io/bucket   Delete          Immediate              false                  5m36s
ocs-storagecluster-cephfs     openshift-storage.cephfs.csi.ceph.com   Delete          Immediate              true                   2m4s
openshift-storage.noobaa.io   openshift-storage.noobaa.io/obc         Delete          Immediate              false                  14s

PVs can now be provisioned against these storage classes for eventual consumption by PVCs and pods.
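
As a quick smoke test, a small PVC can be requested from one of the new storage classes. This is a minimal sketch only: the PVC name odf-test-pvc, the default namespace and the 1Gi size are placeholder values chosen for illustration, and the function name create_test_pvc is hypothetical.

```shell
# Create a small test PVC against an ODF storage class.
# Argument: storage class name (defaults to the RBD class listed above).
# The name "odf-test-pvc" and namespace "default" are placeholders.
create_test_pvc() {
  class="${1:-ocs-storagecluster-ceph-rbd}"
  cat <<EOF | oc apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: odf-test-pvc
  namespace: default
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: ${class}
EOF
}
```

Because the Ceph classes use the Immediate binding mode, the claim should reach the Bound state shortly after creation, which can be confirmed with oc get pvc -n default.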

Deploying Ceph Toolbox Pod

A pod containing the Ceph command line tools can be added to the deployed cluster with the below command.
$ oc patch OCSInitialization ocsinit -n openshift-storage --type json --patch  '[{ "op": "replace", "path": "/spec/enableCephTools", "value": true }]'
ocsinitialization.ocs.openshift.io/ocsinit patched

The Ceph toolbox can be used with the following command:
$ oc rsh -n openshift-storage `oc get pods -n openshift-storage -l app=rook-ceph-tools -o jsonpath='{.items[0].metadata.name}'`
sh-4.4$ ceph status
  cluster:
    id:     c680a945-60bb-4da3-b419-64f017884b8f
    health: HEALTH_OK
 
  services:
    mon: 3 daemons, quorum a,b,c (age 17m)
    mgr: a(active, since 17m)
    mds: 1/1 daemons up, 1 hot standby
    osd: 9 osds: 9 up (since 16m), 9 in (since 17m)
    rgw: 1 daemon active (1 hosts, 1 zones)
 
  data:
    volumes: 1/1 healthy
    pools:   12 pools, 449 pgs
    objects: 362 objects, 134 MiB
    usage:   437 MiB used, 90 GiB / 90 GiB avail
    pgs:     449 active+clean
 
  io:
    client:   3.4 KiB/s rd, 80 KiB/s wr, 4 op/s rd, 9 op/s wr

sh-4.4$ ceph osd tree
ID  CLASS  WEIGHT   TYPE NAME                          STATUS  REWEIGHT  PRI-AFF
-1         0.08817  root default                                                
-7         0.02939      host worker-0-osp-example-com                           
 0    hdd  0.00980          osd.0                          up   1.00000  1.00000
 5    hdd  0.00980          osd.5                          up   1.00000  1.00000
 8    hdd  0.00980          osd.8                          up   1.00000  1.00000
-3         0.02939      host worker-1-osp-example-com                           
 1    hdd  0.00980          osd.1                          up   1.00000  1.00000
 3    hdd  0.00980          osd.3                          up   1.00000  1.00000
 7    hdd  0.00980          osd.7                          up   1.00000  1.00000
-5         0.02939      host worker-2-osp-example-com                           
 2    hdd  0.00980          osd.2                          up   1.00000  1.00000
 4    hdd  0.00980          osd.4                          up   1.00000  1.00000
 6    hdd  0.00980          osd.6                          up   1.00000  1.00000
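
The toolbox can also be driven without an interactive shell, which is handy for scripted health checks. The following sketch assumes the toolbox pod has already been enabled as shown above; the function names ceph_tools and ceph_is_healthy are made up for illustration.

```shell
# Run an arbitrary command inside the Ceph toolbox pod.
ceph_tools() {
  pod="$(oc get pods -n openshift-storage -l app=rook-ceph-tools \
    -o jsonpath='{.items[0].metadata.name}')"
  oc exec -n openshift-storage "$pod" -- "$@"
}

# Succeed only when `ceph health` reports exactly HEALTH_OK.
ceph_is_healthy() {
  [ "$(ceph_tools ceph health)" = "HEALTH_OK" ]
}
```

For example, ceph_tools ceph osd tree prints the same tree as above, and ceph_is_healthy can be used as a final gate at the end of an automated deployment.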

Conclusion and Followup

This post has detailed the operations needed to configure OCP for ODF, the configuration of the local disks and the deployment of the storage cluster. Additional posts will be made providing examples of how to consume the storage provided by ODF.
