Deploy ceph cluster on Ubuntu 18.04 and CentOS 7.8

Posted by Iceberg on August 22, 2021

In this article, we learn how to deploy a Ceph cluster on Ubuntu 18.04. Three nodes are used for this study.

We aim to deploy Pacific, the most recent Ceph release at the time of writing. With this release, we can use cephadm to create a Ceph cluster by bootstrapping on a single host and then expanding the cluster to additional hosts.

Intro to ceph

Whether you want to provide Ceph Object Storage and/or Ceph Block Device services to Cloud Platforms, deploy a Ceph Filesystem or use Ceph for another purpose, all Ceph Storage Cluster deployments begin with setting up each Ceph Node, your network, and the Ceph Storage Cluster. A Ceph Storage Cluster requires at least one Ceph Monitor, Ceph Manager, and Ceph OSD (Object Storage Daemon). The Ceph Metadata Server is also required when running Ceph Filesystem clients.

  • Monitors: A Ceph Monitor (ceph-mon) maintains maps of the cluster state, including the monitor map, manager map, the OSD map, and the CRUSH map. These maps are critical cluster state required for Ceph daemons to coordinate with each other. Monitors are also responsible for managing authentication between daemons and clients. At least three monitors are normally required for redundancy and high availability.

  • Managers: A Ceph Manager daemon (ceph-mgr) is responsible for keeping track of runtime metrics and the current state of the Ceph cluster, including storage utilization, current performance metrics, and system load. The Ceph Manager daemons also host python-based plugins to manage and expose Ceph cluster information, including a web-based dashboard and REST API. At least two managers are normally required for high availability.

  • Ceph OSDs: A Ceph OSD (object storage daemon, ceph-osd) stores data, handles data replication, recovery, rebalancing, and provides some monitoring information to Ceph Monitors and Managers by checking other Ceph OSD Daemons for a heartbeat. At least 3 Ceph OSDs are normally required for redundancy and high availability.

  • MDSs: A Ceph Metadata Server (MDS, ceph-mds) stores metadata on behalf of the Ceph Filesystem (i.e., Ceph Block Devices and Ceph Object Storage do not use MDS). Ceph Metadata Servers allow POSIX file system users to execute basic commands (like ls, find, etc.) without placing an enormous burden on the Ceph Storage Cluster.

Ceph stores data as objects within logical storage pools. Using the CRUSH algorithm, Ceph calculates which placement group should contain the object, and further calculates which Ceph OSD Daemon should store the placement group. The CRUSH algorithm enables the Ceph Storage Cluster to scale, rebalance, and recover dynamically.
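
To make the first half of that mapping concrete, here is a toy sketch of "hash the object name, reduce it to a placement group id". It is only an illustration: Ceph uses its own rjenkins-based hash and the CRUSH algorithm, not cksum, and the function name object_to_pg and the object names are made up for this example. The second step (placement group to OSD) is done by CRUSH and is not shown.

```shell
# Toy illustration only: NOT Ceph's real hash or placement logic.
object_to_pg() {
    # $1 = object name, $2 = number of placement groups in the pool
    hash=$(printf '%s' "$1" | cksum | cut -d ' ' -f 1)
    echo $(( hash % $2 ))
}

# The same name always maps to the same placement group id.
object_to_pg "my-object-1" 128
object_to_pg "my-object-2" 128
object_to_pg "my-object-1" 128
```

Because the placement is computed rather than looked up in a central table, any client can work out where an object lives, which is part of what lets the cluster scale.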

Deploy a ceph storage cluster

Prepare Ubuntu Linux and packages

According to the Ceph installation guide, the following system requirements must be met before deployment.

  • Python 3
  • Systemd
  • Podman or Docker for running containers
  • Time synchronization (such as chrony or NTP)
  • LVM2 for provisioning storage devices
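
Before installing anything, it can help to see which of these requirements a host already meets. The following is a minimal sketch, assuming the usual command names on an Ubuntu host (python3, systemctl, podman or docker, chronyd or ntpd, lvm); the helper names have, report and preflight are our own and not part of any Ceph tool.

```shell
# Report which cephadm prerequisites are present on this host.
# The command names checked here are assumptions about a typical Ubuntu host.
have() {
    command -v "$1" >/dev/null 2>&1
}

report() {
    # $1 = label, remaining args = candidate commands (any one is enough)
    label=$1
    shift
    for cmd in "$@"; do
        if have "$cmd"; then
            echo "OK       $label ($cmd)"
            return 0
        fi
    done
    echo "MISSING  $label"
    return 1
}

preflight() {
    missing=0
    report "Python 3" python3                  || missing=$((missing + 1))
    report "systemd" systemctl                 || missing=$((missing + 1))
    report "container runtime" podman docker   || missing=$((missing + 1))
    report "time synchronization" chronyd ntpd || missing=$((missing + 1))
    report "LVM2" lvm                          || missing=$((missing + 1))
    echo "$missing requirement(s) missing"
}

preflight
```

Run it as root on every node, since some of these binaries live in /usr/sbin and may not be on a regular user's PATH.
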
root@host1:~# cat /etc/*release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=16.04

Upgrade to Ubuntu 18.04

root@host1:~# apt install update-manager-core

root@host1:~# do-release-upgrade -c
Checking for a new Ubuntu release
New release '18.04.5 LTS' available.
Run 'do-release-upgrade' to upgrade to it.

root@host1:~# do-release-upgrade

root@host1:~# cat /etc/*release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=18.04

Install python3

root@host1:~# apt-get install python3

Install docker

Refer to the Docker installation guide for Ubuntu.

Install ntp

root@host1:~# apt-get install ntp
root@host1:~# service ntp start
root@host1:~# timedatectl set-timezone UTC

Install lvm2

root@host1:~# apt-get install lvm2

Check the firewall status and disable the firewall if it is active

root@host1:~# ufw status

Add cluster nodes to /etc/hosts
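
As a sketch of this step, the helper below appends a name only if it is not already listed, so it is safe to run more than once. The function name add_host_entry is our own, and the addresses are documentation placeholders from 192.0.2.0/24, not the real IPs of the nodes.

```shell
# Append "ip name" to a hosts file unless an entry for that name already exists.
# The addresses below are placeholders, not real hosts.
add_host_entry() {
    # $1 = hosts file, $2 = IP address, $3 = hostname
    if grep -qE "[[:space:]]$3([[:space:]]|\$)" "$1" 2>/dev/null; then
        echo "$3 already present in $1"
    else
        printf '%s %s\n' "$2" "$3" >> "$1"
        echo "added $3 to $1"
    fi
}

# Try it on a scratch file first; set HOSTS_FILE=/etc/hosts when satisfied.
HOSTS_FILE=${HOSTS_FILE:-$(mktemp)}
add_host_entry "$HOSTS_FILE" 192.0.2.11 host1
add_host_entry "$HOSTS_FILE" 192.0.2.12 host2
add_host_entry "$HOSTS_FILE" 192.0.2.13 host3
cat "$HOSTS_FILE"
```

The same entries should be present on all three nodes so that every host can resolve the others by name.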

Configure passwordless ssh from primary host to the others
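
One possible way to do this is sketched below, assuming root logins over SSH are allowed on host2 and host3 and that an ed25519 key is acceptable. The run wrapper and the function name setup_passwordless_ssh are our own; by default the script only prints the commands it would run.

```shell
# Print the commands by default; set DRY_RUN=0 to actually run them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

setup_passwordless_ssh() {
    # $1 = private key path, remaining args = target hosts
    key=$1
    shift
    # Generate a key pair only if one does not exist yet.
    # ssh-keygen prompts for a passphrase; leave it empty for unattended logins.
    [ -f "$key" ] || run ssh-keygen -t ed25519 -f "$key"
    for host in "$@"; do
        run ssh-copy-id -i "$key.pub" "root@$host"
    done
}

setup_passwordless_ssh "$HOME/.ssh/id_ed25519" host2 host3
```

After running it with DRY_RUN=0 on host1, `ssh root@host2` should log in without asking for a password.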

Install cephadm

The cephadm command can

  • bootstrap a new cluster
  • launch a containerized shell with a working Ceph CLI
  • aid in debugging containerized Ceph daemons
root@host1:~# curl --silent --remote-name --location https://github.com/ceph/ceph/raw/pacific/src/cephadm/cephadm
root@host1:~# ls
cephadm 
root@host1:~# chmod +x cephadm

root@host1:~# ./cephadm add-repo --release pacific
root@host1:~# ./cephadm install
root@host1:~# which cephadm
/usr/sbin/cephadm

Bootstrap a new cluster

The first step in creating a new Ceph cluster is to run the cephadm bootstrap command on the cluster’s first host. This creates the cluster’s first monitor daemon, and that monitor daemon needs an IP address. You must therefore know the IP address of the first host and pass it to the cephadm bootstrap command with --mon-ip.

root@host1:~# cephadm bootstrap --mon-ip <host1-ip> --allow-fqdn-hostname
Ceph Dashboard is now available at:

	     URL: https://host1:8443/
	    User: admin
	Password: btauef87vj

Enabling client.admin keyring and conf on hosts with "admin" label
You can access the Ceph CLI with:

	sudo /usr/sbin/cephadm shell --fsid ad30a6fc-068f-11ec-8323-000c29bf98ea -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.client.admin.keyring

Please consider enabling telemetry to help improve Ceph:

	ceph telemetry on

For more information see:

	https://docs.ceph.com/docs/pacific/mgr/telemetry/

Bootstrap complete.

root@host1:~# docker ps
CONTAINER ID   IMAGE                        COMMAND                  CREATED         STATUS         PORTS     NAMES
a946ae868dbc   prom/alertmanager:v0.20.0    "/bin/alertmanager -…"   6 minutes ago   Up 6 minutes             ceph-ad30a6fc-068f-11ec-8323-000c29bf98ea-alertmanager.host1
504d9271b24c   ceph/ceph-grafana:6.7.4      "/bin/sh -c 'grafana…"   6 minutes ago   Up 6 minutes             ceph-ad30a6fc-068f-11ec-8323-000c29bf98ea-grafana.host1
622a5e234406   prom/prometheus:v2.18.1      "/bin/prometheus --c…"   6 minutes ago   Up 6 minutes             ceph-ad30a6fc-068f-11ec-8323-000c29bf98ea-prometheus.host1
6c2b0440d4c1   prom/node-exporter:v0.18.1   "/bin/node_exporter …"   6 minutes ago   Up 6 minutes             ceph-ad30a6fc-068f-11ec-8323-000c29bf98ea-node-exporter.host1
8bc618e9ffa3   ceph/ceph                    "/usr/bin/ceph-crash…"   6 minutes ago   Up 6 minutes             ceph-ad30a6fc-068f-11ec-8323-000c29bf98ea-crash.host1
b57a021238ba   ceph/ceph:v16                "/usr/bin/ceph-mgr -…"   7 minutes ago   Up 7 minutes             ceph-ad30a6fc-068f-11ec-8323-000c29bf98ea-mgr.host1.ltfphc
e812853ef17d   ceph/ceph:v16                "/usr/bin/ceph-mon -…"   7 minutes ago   Up 7 minutes             ceph-ad30a6fc-068f-11ec-8323-000c29bf98ea-mon.host1

Enable ceph CLI

To execute ceph commands, you can run them through the containerized shell like this:

root@host1:~# cephadm shell -- ceph -s
Inferring fsid ad30a6fc-068f-11ec-8323-000c29bf98ea
Inferring config /var/lib/ceph/ad30a6fc-068f-11ec-8323-000c29bf98ea/mon.host1/config
Using recent ceph image ceph/ceph@sha256:829ebf54704f2d827de00913b171e5da741aad9b53c1f35ad59251524790eceb
  cluster:
    id:     ad30a6fc-068f-11ec-8323-000c29bf98ea
    health: HEALTH_WARN
            OSD count 0 < osd_pool_default_size 3

  services:
    mon: 1 daemons, quorum host1 (age 9m)
    mgr: host1.ltfphc(active, since 10m)
    osd: 0 osds: 0 up, 0 in

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 B
    usage:   0 B used, 0 B / 0 B avail
    pgs:

Cephadm does not require any Ceph packages to be installed on the host. However, the Ceph documentation recommends enabling easy access to the ceph command.

You can install the ceph-common package, which contains all of the ceph commands, including ceph, rbd, mount.ceph (for mounting CephFS file systems), etc.:

root@host1:~# cephadm add-repo --release pacific
Installing repo GPG key from https://download.ceph.com/keys/release.gpg...
Installing repo file at /etc/apt/sources.list.d/ceph.list...
Updating package list...
Completed adding repo.
root@host1:~# cephadm install ceph-common
Installing packages ['ceph-common']...

root@host1:~# ceph -v
ceph version 16.2.5 (0883bdea7337b95e4b611c768c0279868462204a) pacific (stable)

root@host1:~# ceph status
  cluster:
    id:     ad30a6fc-068f-11ec-8323-000c29bf98ea
    health: HEALTH_WARN
            OSD count 0 < osd_pool_default_size 3

  services:
    mon: 1 daemons, quorum host1 (age 11m)
    mgr: host1.ltfphc(active, since 12m)
    osd: 0 osds: 0 up, 0 in

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 B
    usage:   0 B used, 0 B / 0 B avail
    pgs:

Adding additional hosts to the cluster

To add each new host to the cluster, perform two steps:

  1. Install the cluster’s public SSH key in the new host’s root user’s authorized_keys file:

    root@host1:~# ssh-copy-id -f -i /etc/ceph/ceph.pub root@host2
    root@host1:~# ssh-copy-id -f -i /etc/ceph/ceph.pub root@host3
    
  2. Tell Ceph that the new node is part of the cluster:

    root@host1:~# ceph orch host add host2 <host2-ip> --labels _admin
    root@host1:~# ceph orch host add host3 <host3-ip> --labels _admin
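
With more than a couple of hosts, the two steps above can be wrapped in a small helper. This is a sketch rather than an official procedure: the run wrapper and add_ceph_host are our own names, the IP addresses are placeholders, and the script only prints the commands unless DRY_RUN=0 is set.

```shell
# Print the commands by default; set DRY_RUN=0 to actually run them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

add_ceph_host() {
    # $1 = hostname, $2 = IP address
    # Step 1: install the cluster's public SSH key on the new host.
    run ssh-copy-id -f -i /etc/ceph/ceph.pub "root@$1"
    # Step 2: tell Ceph that the new host is part of the cluster.
    run ceph orch host add "$1" "$2" --labels _admin
}

# Placeholder addresses; substitute the real IPs of host2 and host3.
add_ceph_host host2 192.0.2.12
add_ceph_host host3 192.0.2.13
```
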
    

Wait a while until the monitor detects the new hosts, then verify the newly added hosts as shown below.

root@host1:~# cat /etc/ceph/ceph.conf
# minimal ceph.conf for ad30a6fc-068f-11ec-8323-000c29bf98ea
[global]
	fsid = ad30a6fc-068f-11ec-8323-000c29bf98ea
	mon_host = [v2:<host2-ip>:3300/0,v1:<host2-ip>:6789/0] [v2:<host3-ip>:3300/0,v1:<host3-ip>:6789/0] [v2:<host1-ip>:3300/0,v1:<host1-ip>:6789/0]

root@host1:~# ceph status
  cluster:
    id:     ad30a6fc-068f-11ec-8323-000c29bf98ea
    health: HEALTH_WARN
            clock skew detected on mon.host2, mon.host3
            OSD count 0 < osd_pool_default_size 3

  services:
    mon: 3 daemons, quorum host1,host2,host3 (age 115s)
    mgr: host1.ltfphc(active, since 30m), standbys: host2.dqlsnk
    osd: 0 osds: 0 up, 0 in

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 B
    usage:   0 B used, 0 B / 0 B avail
    pgs:

Adding storage

To add storage to the cluster, either tell Ceph to consume any available and unused device:

ceph orch apply osd --all-available-devices

Or deploy OSDs on specific storage devices.

Listing storage devices

To deploy an OSD, there must be an available storage device on which the OSD will be deployed.

Run this command to display an inventory of storage devices on all cluster hosts:

root@host1:~# ceph orch device ls
Hostname  Path      Type  Serial  Size   Health   Ident  Fault  Available
host2     /dev/sdb  hdd           85.8G  Unknown  N/A    N/A    Yes
host3     /dev/sdb  hdd           85.8G  Unknown  N/A    N/A    Yes
host1     /dev/sdb  hdd           85.8G  Unknown  N/A    N/A    Yes

A storage device is considered available if all of the following conditions are met:

  • The device must have no partitions.
  • The device must not have any LVM state.
  • The device must not be mounted.
  • The device must not contain a file system.
  • The device must not contain a Ceph BlueStore OSD.
  • The device must be larger than 5 GB.

Ceph will not provision an OSD on a device that is not available.
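
To pick out the usable devices from the inventory, a small filter over the command output is enough. The sketch below assumes the column layout shown earlier (host in the first column, path in the second, Available in the last); other Ceph versions may print different columns, and the function name available_devices is our own.

```shell
available_devices() {
    # Reads `ceph orch device ls` output on stdin and prints host:path
    # for every row whose last column is "Yes". The header row is skipped
    # because its last column is "Available", not "Yes".
    awk '$NF == "Yes" { print $1 ":" $2 }'
}

# Sample rows taken from the inventory shown earlier. On a live cluster use:
#   ceph orch device ls | available_devices
available_devices <<'EOF'
Hostname  Path      Type  Serial  Size   Health   Ident  Fault  Available
host2     /dev/sdb  hdd           85.8G  Unknown  N/A    N/A    Yes
host3     /dev/sdb  hdd           85.8G  Unknown  N/A    N/A    Yes
host1     /dev/sdb  hdd           85.8G  Unknown  N/A    N/A    Yes
EOF
```

Each printed host:path pair has the form expected by ceph orch daemon add osd.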

Creating new OSDs

There are a few ways to create new OSDs:

  • Tell Ceph to consume any available and unused storage device:

    ceph orch apply osd --all-available-devices
    

    After running the above command:

    • If you add new disks to the cluster, they will automatically be used to create new OSDs.
    • If you remove an OSD and clean the LVM physical volume, a new OSD will be created automatically.

    If you want to avoid this behavior (disable automatic creation of OSD on available devices), use the unmanaged parameter:

    ceph orch apply osd --all-available-devices --unmanaged=true
    
  • Create an OSD from a specific device on a specific host:

    ceph orch daemon add osd <host>:<device-path>
    

    For example:

    ceph orch daemon add osd host1:/dev/sdb
    

In our case, we use the following commands to create one OSD on each of the three nodes. The commands only need to be run from host1.

root@host1:~# ceph orch daemon add osd host1:/dev/sdb
Created osd(s) 0 on host 'host1'
root@host1:~# ceph orch daemon add osd host2:/dev/sdb
Created osd(s) 1 on host 'host2'
root@host1:~# ceph orch daemon add osd host3:/dev/sdb
Created osd(s) 2 on host 'host3'

root@host1:~# ceph status
  cluster:
    id:     ad30a6fc-068f-11ec-8323-000c29bf98ea
    health: HEALTH_WARN
            clock skew detected on mon.host2, mon.host3
            59 slow ops, oldest one blocked for 130 sec, mon.host2 has slow ops

  services:
    mon: 3 daemons, quorum host1,host2,host3 (age 2m)
    mgr: host1.ltfphc(active, since 102s), standbys: host2.dqlsnk
    osd: 3 osds: 3 up (since 7m), 3 in (since 7m)

  data:
    pools:   1 pools, 1 pgs
    objects: 0 objects, 0 B
    usage:   15 MiB used, 240 GiB / 240 GiB avail
    pgs:     1 active+clean

Dry run

The --dry-run flag causes the orchestrator to present a preview of what will happen without actually creating the OSDs.

For example:

ceph orch apply osd --all-available-devices --dry-run

Create a pool

Pools are logical partitions for storing objects. When you first deploy a cluster without creating a pool, Ceph uses the default pools for storing data.

By default, Ceph makes 3 replicas of RADOS objects. Ensure you have a realistic number of placement groups: Ceph recommends approximately 100 per OSD, rounded to the nearest power of 2.
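
The arithmetic behind that recommendation can be sketched as follows. We assume the commonly cited rule of thumb of (number of OSDs × target PGs per OSD) / replica count, rounded up to the next power of two; the function name pg_count is our own, and on clusters where the PG autoscaler is enabled Ceph may adjust the value anyway.

```shell
pg_count() {
    # $1 = number of OSDs, $2 = replica count (pool size),
    # $3 = target PGs per OSD (optional, defaults to 100)
    target=$(( $1 * ${3:-100} / $2 ))
    pgs=1
    # Double until we reach the first power of two that is >= target.
    while [ "$pgs" -lt "$target" ]; do
        pgs=$(( pgs * 2 ))
    done
    echo "$pgs"
}

pg_count 3 3    # three OSDs, three replicas
```

For our three OSDs with three replicas the target is 100, so the function prints 128.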

root@host1:~# ceph osd lspools
1 device_health_metrics
root@host1:~# ceph osd pool create datapool 128 128
pool 'datapool' created
root@host1:~# ceph osd lspools
1 device_health_metrics
2 datapool


root@host1:~# ceph osd pool ls detail
pool 1 'device_health_metrics' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 1 pgp_num 1 autoscale_mode on last_change 22 flags hashpspool stripe_width 0 pg_num_min 1 application mgr_devicehealth
pool 2 'datapool' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 128 pgp_num 128 autoscale_mode on last_change 39 flags hashpspool stripe_width 0

root@host1:~# ceph osd pool get datapool all
size: 3
min_size: 2
pg_num: 128
pgp_num: 128
crush_rule: replicated_rule
hashpspool: true
nodelete: false
nopgchange: false
nosizechange: false
write_fadvise_dontneed: false
noscrub: false
nodeep-scrub: false
use_gmt_hitset: 1
fast_read: 0
pg_autoscale_mode: on

On the admin node, use the rbd tool to initialize the pool for use by RBD:

[ceph: root@host1 /]# rbd pool init datapool

Create rbd volume and map to a block device on the host

The rbd command enables you to create, list, introspect and remove block device images. You can also use it to clone images, create snapshots, rollback an image to a snapshot, view a snapshot, etc.

root@host1:~# rbd create --size 102400 datapool/rbdvol1
root@host1:~# rbd map datapool/rbdvol1
rbd: sysfs write failed
RBD image feature set mismatch. You can disable features unsupported by the kernel with "rbd feature disable datapool/rbdvol1 object-map fast-diff deep-flatten".
In some cases useful info is found in syslog - try "dmesg | tail".
rbd: map failed: (6) No such device or address

root@host1:~# dmesg | tail
[50268.015821] cgroup: cgroup: disabling cgroup2 socket matching due to net_prio or net_cls activation
[59168.019848] Key type ceph registered
[59168.020080] libceph: loaded (mon/osd proto 15/24)
[59168.023667] rbd: loaded (major 252)
[59168.028478] libceph: mon2 <host1-ip>:6789 session established
[59168.028571] libceph: mon2 <host1-ip>:6789 socket closed (con state OPEN)
[59168.028594] libceph: mon2 <host1-ip>:6789 session lost, hunting for new mon
[59175.101037] libceph: mon0 <host1-ip>:6789 session established
[59175.101413] libceph: client14535 fsid ad30a6fc-068f-11ec-8323-000c29bf98ea
[59175.105601] rbd: image rbdvol1: image uses unsupported features: 0x38

root@host1:~# rbd feature disable datapool/rbdvol1 object-map fast-diff deep-flatten

root@host1:~# rbd map datapool/rbdvol1
/dev/rbd0

root@host1:~# rbd showmapped
id  pool      namespace  image    snap  device
0   datapool             rbdvol1  -     /dev/rbd0

root@host1:~# lsblk
NAME                                                                                                  MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda                                                                                                     8:0    0  128G  0 disk
├─sda1                                                                                                  8:1    0  127G  0 part /
├─sda2                                                                                                  8:2    0    1K  0 part
└─sda5                                                                                                  8:5    0  975M  0 part
sdb                                                                                                     8:16   0   80G  0 disk
└─ceph--bc7eff08--2ac6--44a5--b941--5444c4a8600a-osd--block--b4dfb938--05af--413d--a327--18d26fc75b8d 253:0    0   80G  0 lvm
rbd0                                                                                                  252:0    0  100G  0 disk

root@host1:~# ls -la /dev/rbd/datapool/rbdvol1
lrwxrwxrwx 1 root root 10 Aug 26 19:35 /dev/rbd/datapool/rbdvol1 -> ../../rbd0

root@host1:~# ls -la /dev/rbd0
brw-rw---- 1 root disk 252, 0 Aug 26 19:35 /dev/rbd0

root@host1:~# rbd status datapool/rbdvol1
Watchers:
	watcher=<host1-ip>:0/2778790200 client.14556 cookie=18446462598732840967

root@host1:~# rbd info datapool/rbdvol1
rbd image 'rbdvol1':
	size 100 GiB in 25600 objects
	order 22 (4 MiB objects)
	snapshot_count: 0
	id: 38bebe718b2f
	block_name_prefix: rbd_data.38bebe718b2f
	format: 2
	features: layering, exclusive-lock
	op_features:
	flags:
	create_timestamp: Thu Aug 26 19:31:29 2021
	access_timestamp: Thu Aug 26 19:31:29 2021
	modify_timestamp: Thu Aug 26 19:31:29 2021

Create filesystem and mount rbd volume

You can use standard Linux commands to create a filesystem on the volume and mount it for your own purposes.
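
A minimal sketch of these steps follows. The choice of ext4 and the mount point /mnt/rbdvol1 are our assumptions, and the run wrapper and format_and_mount are our own names. Creating a filesystem destroys any data already on the device, so the script only prints the commands unless DRY_RUN=0 is set.

```shell
# Print the commands by default; set DRY_RUN=0 to actually run them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

format_and_mount() {
    # $1 = block device, $2 = mount point
    run mkfs.ext4 "$1"      # WARNING: erases existing data on the device
    run mkdir -p "$2"
    run mount "$1" "$2"
    run df -h "$2"          # confirm the filesystem is mounted
}

# /dev/rbd0 is the device returned by rbd map; /mnt/rbdvol1 is hypothetical.
format_and_mount /dev/rbd0 /mnt/rbdvol1
```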

Troubleshooting

  1. Ceph does not support Pacific or later on CentOS 7.8

    If you try to install the Pacific release of Ceph on CentOS 7.8, you may see the following error.

    $ cat /etc/*release
    CentOS Linux release 7.8.2003 (Core)
    NAME="CentOS Linux"
    
    $ uname -r
    5.7.12-1.el7.elrepo.x86_64
    
    # ./cephadm add-repo --release pacific
    ERROR: Ceph does not support pacific or later for this version of this linux distro and therefore cannot add a repo for it
    

    You can install the Octopus release of Ceph instead.

    $ ./cephadm add-repo --release octopus
    Writing repo to /etc/yum.repos.d/ceph.repo...
    Enabling EPEL...
    Completed adding repo
    

    Note: cephadm is new in Ceph release v15.2.0 (Octopus) and does not support older versions of Ceph.

  2. Invalid GPG Key

    $ ./cephadm install
    Installing packages ['cephadm']...
    Non-zero exit code 1 from yum install -y cephadm
    yum: stdout Loaded plugins: fastestmirror, langpacks, priorities
    yum: stdout Loading mirror speeds from cached hostfile
    yum: stdout  * base: pxe.dev.purestorage.com
    yum: stdout  * centosplus: pxe.dev.purestorage.com
    yum: stdout  * epel: mirror.lax.genesisadaptive.com
    yum: stdout  * extras: pxe.dev.purestorage.com
    yum: stdout  * updates: pxe.dev.purestorage.com
    yum: stdout 279 packages excluded due to repository priority protections
    yum: stdout Resolving Dependencies
    yum: stdout --> Running transaction check
    yum: stdout ---> Package cephadm.noarch 2:15.2.14-0.el7 will be installed
    yum: stdout --> Finished Dependency Resolution
    yum: stdout
    yum: stdout Dependencies Resolved
    yum: stdout
    yum: stdout ================================================================================
    yum: stdout  Package        Arch          Version                  Repository          Size
    yum: stdout ================================================================================
    yum: stdout Installing:
    yum: stdout  cephadm        noarch        2:15.2.14-0.el7          Ceph-noarch         55 k
    yum: stdout
    yum: stdout Transaction Summary
    yum: stdout ================================================================================
    yum: stdout Install  1 Package
    yum: stdout
    yum: stdout Total download size: 55 k
    yum: stdout Installed size: 223 k
    yum: stdout Downloading packages:
    yum: stdout Public key for cephadm-15.2.14-0.el7.noarch.rpm is not installed
    yum: stdout Retrieving key from https://download.ceph.com/keys/release.gpg
    yum: stderr warning: /var/cache/yum/x86_64/7/Ceph-noarch/packages/cephadm-15.2.14-0.el7.noarch.rpm: Header V4 RSA/SHA256 Signature, key ID 460f3994: NOKEY
    yum: stderr
    yum: stderr
    yum: stderr Invalid GPG Key from https://download.ceph.com/keys/release.gpg: No key found in given key data
    Traceback (most recent call last):
      File "./cephadm", line 8432, in <module>
        main()
      File "./cephadm", line 8420, in main
        r = ctx.func(ctx)
      File "./cephadm", line 6384, in command_install
        pkg.install(ctx.packages)
      File "./cephadm", line 6231, in install
        call_throws(self.ctx, [self.tool, 'install', '-y'] + ls)
      File "./cephadm", line 1461, in call_throws
        raise RuntimeError('Failed command: %s' % ' '.join(command))
    RuntimeError: Failed command: yum install -y cephadm
    

    Based on the Ceph documentation (https://docs.ceph.com/en/mimic/install/get-packages/), execute the following to import the release.asc key.

    $ rpm --import 'https://download.ceph.com/keys/release.asc'
    

    Install the cephadm package again; this time it succeeds.

    $ ./cephadm install
    Installing packages ['cephadm']...
       
    $ which cephadm
    /usr/sbin/cephadm
    
  3. Failed to add host during bootstrap

    $ cephadm bootstrap --mon-ip 192.168.1.183
    Adding host host1...
    Non-zero exit code 22 from /usr/bin/docker run --rm --ipc=host --net=host --entrypoint /usr/bin/ceph -e CONTAINER_IMAGE=docker.io/ceph/ceph:v15 -e NODE_NAME=host1 -v /var/log/ceph/ccc938de-0c30-11ec-8c3f-ac1f6bc8d268:/var/log/ceph:z -v /tmp/ceph-tmpdqxjp0ly:/etc/ceph/ceph.client.admin.keyring:z -v /tmp/ceph-tmpmt5hrjo9:/etc/ceph/ceph.conf:z docker.io/ceph/ceph:v15 orch host add host1
    /usr/bin/ceph: stderr Error EINVAL: Failed to connect to host1 (host1).
    /usr/bin/ceph: stderr Please make sure that the host is reachable and accepts connections using the cephadm SSH key
    /usr/bin/ceph: stderr
    /usr/bin/ceph: stderr To add the cephadm SSH key to the host:
    /usr/bin/ceph: stderr > ceph cephadm get-pub-key > ~/ceph.pub
    /usr/bin/ceph: stderr > ssh-copy-id -f -i ~/ceph.pub root@host1
    /usr/bin/ceph: stderr
    /usr/bin/ceph: stderr To check that the host is reachable:
    /usr/bin/ceph: stderr > ceph cephadm get-ssh-config > ssh_config
    /usr/bin/ceph: stderr > ceph config-key get mgr/cephadm/ssh_identity_key > ~/cephadm_private_key
    /usr/bin/ceph: stderr > chmod 0600 ~/cephadm_private_key
    /usr/bin/ceph: stderr > ssh -F ssh_config -i ~/cephadm_private_key root@host1
    ERROR: Failed to add host <host1>: Failed command: /usr/bin/docker run --rm --ipc=host --net=host --entrypoint /usr/bin/ceph -e CONTAINER_IMAGE=docker.io/ceph/ceph:v15 -e NODE_NAME=host1 -v /var/log/ceph/ccc938de-0c30-11ec-8c3f-ac1f6bc8d268:/var/log/ceph:z -v /tmp/ceph-tmpdqxjp0ly:/etc/ceph/ceph.client.admin.keyring:z -v /tmp/ceph-tmpmt5hrjo9:/etc/ceph/ceph.conf:z docker.io/ceph/ceph:v15 orch host add host1
    

    Note: If there are multiple networks and interfaces, be sure to choose one that will be accessible by any host accessing the Ceph cluster.

    Make sure passwordless ssh is configured on each host.

  4. Remove ceph cluster

    $ cephadm rm-cluster --fsid ccc938de-0c30-11ec-8c3f-ac1f6bc8d268 --force
    
  5. ceph-common installation failure

    $ cephadm install ceph-common
    Installing packages ['ceph-common']...
    Non-zero exit code 1 from yum install -y ceph-common
    yum: stdout Loaded plugins: fastestmirror, langpacks, priorities
    yum: stdout Loading mirror speeds from cached hostfile
    yum: stdout  * base: pxe.dev.purestorage.com
    yum: stdout  * centosplus: pxe.dev.purestorage.com
    yum: stdout  * epel: mirror.lax.genesisadaptive.com
    yum: stdout  * extras: pxe.dev.purestorage.com
    yum: stdout  * updates: pxe.dev.purestorage.com
    yum: stdout 279 packages excluded due to repository priority protections
    yum: stdout Resolving Dependencies
    yum: stdout --> Running transaction check
    yum: stdout ---> Package ceph-common.x86_64 1:10.2.5-4.el7 will be installed
    yum: stdout --> Processing Dependency: python-rbd = 1:10.2.5-4.el7 for package: 1:ceph-common-10.2.5-4.el7.x86_64
    yum: stdout Package python-rbd is obsoleted by python3-rbd, but obsoleting package does not provide for requirements
    yum: stdout --> Processing Dependency: python-rados = 1:10.2.5-4.el7 for package: 1:ceph-common-10.2.5-4.el7.x86_64
    yum: stdout Package python-rados is obsoleted by python3-rados, but obsoleting package does not provide for requirements
    yum: stdout --> Processing Dependency: hdparm for package: 1:ceph-common-10.2.5-4.el7.x86_64
    yum: stdout --> Processing Dependency: gdisk for package: 1:ceph-common-10.2.5-4.el7.x86_64
    yum: stdout --> Processing Dependency: libboost_regex-mt.so.1.53.0()(64bit) for package: 1:ceph-common-10.2.5-4.el7.x86_64
    yum: stdout --> Processing Dependency: libboost_program_options-mt.so.1.53.0()(64bit) for package: 1:ceph-common-10.2.5-4.el7.x86_64
    yum: stdout --> Running transaction check
    yum: stdout ---> Package boost-program-options.x86_64 0:1.53.0-28.el7 will be installed
    yum: stdout ---> Package boost-regex.x86_64 0:1.53.0-28.el7 will be installed
    yum: stdout --> Processing Dependency: libicuuc.so.50()(64bit) for package: boost-regex-1.53.0-28.el7.x86_64
    yum: stdout --> Processing Dependency: libicui18n.so.50()(64bit) for package: boost-regex-1.53.0-28.el7.x86_64
    yum: stdout --> Processing Dependency: libicudata.so.50()(64bit) for package: boost-regex-1.53.0-28.el7.x86_64
    yum: stdout ---> Package ceph-common.x86_64 1:10.2.5-4.el7 will be installed
    yum: stdout --> Processing Dependency: python-rbd = 1:10.2.5-4.el7 for package: 1:ceph-common-10.2.5-4.el7.x86_64
    yum: stdout Package python-rbd is obsoleted by python3-rbd, but obsoleting package does not provide for requirements
    yum: stdout --> Processing Dependency: python-rados = 1:10.2.5-4.el7 for package: 1:ceph-common-10.2.5-4.el7.x86_64
    yum: stdout Package python-rados is obsoleted by python3-rados, but obsoleting package does not provide for requirements
    yum: stdout ---> Package gdisk.x86_64 0:0.8.10-3.el7 will be installed
    yum: stdout ---> Package hdparm.x86_64 0:9.43-5.el7 will be installed
    yum: stdout --> Running transaction check
    yum: stdout ---> Package ceph-common.x86_64 1:10.2.5-4.el7 will be installed
    yum: stdout --> Processing Dependency: python-rbd = 1:10.2.5-4.el7 for package: 1:ceph-common-10.2.5-4.el7.x86_64
    yum: stdout Package python-rbd is obsoleted by python3-rbd, but obsoleting package does not provide for requirements
    yum: stdout --> Processing Dependency: python-rados = 1:10.2.5-4.el7 for package: 1:ceph-common-10.2.5-4.el7.x86_64
    yum: stdout Package python-rados is obsoleted by python3-rados, but obsoleting package does not provide for requirements
    yum: stdout ---> Package libicu.x86_64 0:50.2-4.el7_7 will be installed
    yum: stdout --> Finished Dependency Resolution
    yum: stdout  You could try using --skip-broken to work around the problem
    yum: stdout  You could try running: rpm -Va --nofiles --nodigest
    yum: stderr Error: Package: 1:ceph-common-10.2.5-4.el7.x86_64 (base)
    yum: stderr            Requires: python-rbd = 1:10.2.5-4.el7
    yum: stderr            Available: 1:python-rbd-10.2.5-4.el7.x86_64 (base)
    yum: stderr                python-rbd = 1:10.2.5-4.el7
    yum: stderr            Available: 2:python3-rbd-15.2.14-0.el7.x86_64 (Ceph)
    yum: stderr                python-rbd = 2:15.2.14-0.el7
    yum: stderr Error: Package: 1:ceph-common-10.2.5-4.el7.x86_64 (base)
    yum: stderr            Requires: python-rados = 1:10.2.5-4.el7
    yum: stderr            Available: 1:python-rados-10.2.5-4.el7.x86_64 (base)
    yum: stderr                python-rados = 1:10.2.5-4.el7
    yum: stderr            Available: 2:python3-rados-15.2.14-0.el7.x86_64 (Ceph)
    yum: stderr                python-rados = 2:15.2.14-0.el7
    Traceback (most recent call last):
      File "/usr/sbin/cephadm", line 6242, in <module>
        r = args.func()
      File "/usr/sbin/cephadm", line 5073, in command_install
        pkg.install(args.packages)
      File "/usr/sbin/cephadm", line 4931, in install
        call_throws([self.tool, 'install', '-y'] + ls)
      File "/usr/sbin/cephadm", line 1112, in call_throws
        raise RuntimeError('Failed command: %s' % ' '.join(command))
    RuntimeError: Failed command: yum install -y ceph-common
    
  6. cephadm log

    /var/log/ceph/cephadm.log

  7. rbd image map failed

    [ceph: root@host1 /]# rbd map datapool/rbdvol1
    modinfo: ERROR: Module alias rbd not found.
    modprobe: FATAL: Module rbd not found in directory /lib/modules/5.7.12-1.el7.elrepo.x86_64
    rbd: failed to load rbd kernel module (1)
    rbd: sysfs write failed
    In some cases useful info is found in syslog - try "dmesg | tail".
    rbd: map failed: (2) No such file or directory
    
    [root@host1 ~]# modprobe rbd
    [root@host1 ~]# lsmod | grep rbd
    rbd                   106496  0
    libceph               331776  1 rbd
    
  8. rbd image map failed on other cluster nodes

    [ceph: root@host2 /]# rbd map datapool/rbdvol5 --id admin
    2021-09-21T19:49:49.384+0000 7f91ea781500 -1 auth: unable to find a keyring on /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
    rbd: sysfs write failed
    2021-09-21T19:49:49.387+0000 7f91ea781500 -1 auth: unable to find a keyring on /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
    2021-09-21T19:49:49.387+0000 7f91ea781500 -1 AuthRegistry(0x5633b09431e0) no keyring found at /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,, disabling cephx
    2021-09-21T19:49:49.388+0000 7f91ea781500 -1 auth: unable to find a keyring on /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
    2021-09-21T19:49:49.388+0000 7f91ea781500 -1 AuthRegistry(0x7fffd357c350) no keyring found at /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,, disabling cephx
    2021-09-21T19:49:49.389+0000 7f91d9b68700 -1 monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [1]
    2021-09-21T19:49:49.389+0000 7f91da369700 -1 monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [1]
    2021-09-21T19:49:49.389+0000 7f91dab6a700 -1 monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [1]
    2021-09-21T19:49:49.389+0000 7f91ea781500 -1 monclient: authenticate NOTE: no keyring found; disabled cephx authentication
    rbd: couldnot connect to the cluster!
    In some cases useful info is found in syslog - try "dmesg | tail".
    rbd: map failed: (22) Invalid argument
    
    [ceph: root@host2 /]# ls /etc/ceph
    ceph.conf  rbdmap
    

    Copy /etc/ceph/ceph.keyring from the admin node (host1) to host2, then map the image again.

    [ceph: root@host2 /]# ls /etc/ceph
    ceph.conf  ceph.keyring  rbdmap
    
    [ceph: root@host2 /]# rbd map datapool/rbdvol5
    [ceph: root@host2 /]# rbd device list
    id  pool      namespace  image    snap  device
    0   datapool             rbdvol5  -     /dev/rbd0
    

Reference