
Ceph Cluster Deployment (Part 2)

Recommended Configuration

CPU: 16 or 32 cores, dual-socket

Memory: 64 GB or more

Disks: enterprise NVMe SSDs (recommended models below)
Intel D7 P5520 data-center enterprise SSD, U.2 NVMe, 3.84 TB
Intel S4510/S4520 data-center enterprise SATA3 SSD, 3.84 TB

MON servers: 16C 32G 200G

MGR servers: 8C 16G 200G

Ceph-deploy server: 4C 8G 120G

1. Deployment Methods

ceph-ansible: https://github.com/ceph/ceph-ansible (Python)
ceph-salt: https://github.com/ceph/ceph-salt (Python)
ceph-container: https://github.com/ceph/ceph-container (Shell)
ceph-chef: https://github.com/ceph/ceph-chef (Ruby)

ceph-deploy: https://github.com/ceph/ceph-deploy (Python). ceph-deploy is an officially maintained command-line tool for deploying Ceph clusters. It runs shell commands (via sudo) and some Python scripts over SSH to deploy, manage, and maintain the cluster.

ceph-deploy is only used to deploy and manage the Ceph cluster itself; clients that need to access Ceph must have the client tools installed separately.

2. Server Preparation

Hardware recommendations: https://docs.ceph.com/en/latest/start/hardware-recommendations/#

2.1 OSD Servers

Three servers act as the Ceph OSD storage nodes. Each server has two networks: the public network serves client access, and the cluster network handles cluster management and data replication. Each server has three or more disks.

10.1.0.30/192.168.10.240
10.1.0.31/192.168.10.241
10.1.0.32/192.168.10.242

Disk layout on each of the three storage servers:
/dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf #200G

2.2 MON Monitor Servers

Three servers act as the Ceph MON monitor nodes; each can communicate with the cluster network of the Ceph cluster.

10.1.0.33/192.168.10.243
10.1.0.34/192.168.10.244
10.1.0.35/192.168.10.245

2.3 ceph-mgr Management Servers

Two ceph-mgr management servers, each able to communicate with the cluster network of the Ceph cluster.

10.1.0.30/192.168.10.240
10.1.0.31/192.168.10.241

2.4 ceph-deploy Deployment Server

One server is used to deploy the Ceph cluster, i.e., it has ceph-deploy installed; it can also be co-located with ceph-mgr or other roles.

10.1.0.31/192.168.10.248

3. Server Environment Preparation

3.1 Configure the Cluster Networks

# Rename the network interfaces to eth*:

sudo vim /etc/default/grub

GRUB_DEFAULT=0
GRUB_TIMEOUT_STYLE=hidden
GRUB_TIMEOUT=0
GRUB_DISTRIBUTOR=`lsb_release -i -s 2> /dev/null || echo Debian`
GRUB_CMDLINE_LINUX_DEFAULT="maybe-ubiquity"
GRUB_CMDLINE_LINUX="net.ifnames=0 biosdevname=0"


~$ sudo update-grub
Sourcing file `/etc/default/grub'
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-4.15.0-55-generic
Found initrd image: /boot/initrd.img-4.15.0-55-generic
done
# Configure the cluster and public networks
sudo vim /etc/netplan/00-installer-config.yaml   # the netplan file written by subiquity; the exact filename may differ

# This is the network config written by 'subiquity'
network:
  ethernets:
    eth0:
      dhcp4: no
      dhcp6: no
      addresses:
        - 10.1.0.39/24
      gateway4: 10.1.0.254
      nameservers:
        addresses:
          - 223.5.5.5
    eth1:
      dhcp4: no
      dhcp6: no
      addresses: [192.168.10.239/24]
  version: 2


# Apply the configuration
netplan apply

# Verify the IP addresses of both NICs:
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether fe:fc:fe:cf:34:9d brd ff:ff:ff:ff:ff:ff
inet 10.1.0.39/24 brd 10.1.0.255 scope global eth0
valid_lft forever preferred_lft forever
inet6 fe80::fcfc:feff:fecf:349d/64 scope link
valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether fe:fc:fe:79:6f:6e brd ff:ff:ff:ff:ff:ff
inet 192.168.10.239/24 brd 192.168.10.255 scope global eth1
valid_lft forever preferred_lft forever
inet6 fe80::fcfc:feff:fe79:6f6e/64 scope link
valid_lft forever preferred_lft forever

3.2 Configure Hostname Resolution

vim /etc/hosts
10.1.0.39 ceph-node1.xx.local ceph-node1
10.1.0.40 ceph-node2.xx.local ceph-node2
10.1.0.41 ceph-node3.xx.local ceph-node3
10.1.0.39 ceph-mon1.xx.local ceph-mon1
10.1.0.40 ceph-mon2.xx.local ceph-mon2
10.1.0.41 ceph-mon3.xx.local ceph-mon3
10.1.0.40 ceph-mgr1.xx.local ceph-mgr1
10.1.0.41 ceph-mgr2.xx.local ceph-mgr2
10.1.0.39 ceph-deploy.xx.local ceph-deploy
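
Every node needs the same /etc/hosts entries. A small sketch for pushing the file from this host to the other two machines (the three node names cover all nine aliases, since they share the same IPs; assumes you can still SSH in as root at this stage):

for h in ceph-node2 ceph-node3; do
  scp /etc/hosts root@$h:/etc/hosts
done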

3.3 Configure the APT Sources

https://download.ceph.com/ # official Ceph repository
https://mirrors.aliyun.com/ceph/ # Aliyun mirror
http://mirrors.163.com/ceph/ # NetEase mirror
https://mirrors.tuna.tsinghua.edu.cn/ceph/ # Tsinghua University mirror

Add the Ceph repository on all nodes:

wget -q -O- 'https://download.ceph.com/keys/release.asc' | sudo apt-key add -
echo deb https://download.ceph.com/debian-pacific/ $(lsb_release -sc) main | sudo tee /etc/apt/sources.list.d/ceph.list

sudo apt update
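
If downloads from download.ceph.com are slow, the same two steps can point at one of the mirrors listed above instead, for example the Tsinghua mirror (a sketch, assuming the mirror keeps the same directory layout as the official site):

wget -q -O- 'https://mirrors.tuna.tsinghua.edu.cn/ceph/keys/release.asc' | sudo apt-key add -
echo deb https://mirrors.tuna.tsinghua.edu.cn/ceph/debian-pacific/ $(lsb_release -sc) main | sudo tee /etc/apt/sources.list.d/ceph.list
sudo apt update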

3.4 Time Synchronization

# Set the time zone
timedatectl set-timezone Asia/Shanghai


# Install chrony on all three nodes
apt install chrony -y


## Server-side configuration (the node that serves time, 10.1.0.39 here)
vim /etc/chrony/chrony.conf

# Welcome to the chrony configuration file. See chrony.conf(5) for more
# information about usuable directives.

# This will use (up to):
# - 4 sources from ntp.ubuntu.com which some are ipv6 enabled
# - 2 sources from 2.ubuntu.pool.ntp.org which is ipv6 enabled as well
# - 1 source from [01].ubuntu.pool.ntp.org each (ipv4 only atm)
# This means by default, up to 6 dual-stack and up to 2 additional IPv4-only
# sources will be used.
# At the same time it retains some protection against one of the entries being
# down (compare to just using one of the lines). See (LP: #1754358) for the
# discussion.
#
# About using servers from the NTP Pool Project in general see (LP: #104525).
# Approved by Ubuntu Technical Board on 2011-02-08.
# See http://www.pool.ntp.org/join.html for more information.
# To serve local time without following external servers, comment out the pool lines below
# (here the three regional pools are commented out and ntp.ubuntu.com is kept as the upstream source)
pool ntp.ubuntu.com iburst maxsources 4
#pool 0.ubuntu.pool.ntp.org iburst maxsources 1
#pool 1.ubuntu.pool.ntp.org iburst maxsources 1
#pool 2.ubuntu.pool.ntp.org iburst maxsources 2

# Optionally add another upstream server of your own
#server 192.168.1.1 iburst
# Allow all clients to connect (simplest access policy)
allow all
# When no upstream source is reachable, serve the local clock to clients

# Note: the value 10 can be replaced by any value from 1 to 15. Stratum 1 means the machine has a directly attached reference clock (GPS, an atomic clock, etc.) and is very close to real time; stratum 2 means it synchronizes to a stratum 1 machine; stratum 3 means it synchronizes to a stratum 2 machine, and so on. Stratum 10 is deliberately large: it marks the local clock as far from a real time source and not very trustworthy, so the machine's own time is never confused with real time and will not be preferred by NTP clients that can reach servers synchronized to real time.

local stratum 10

# This directive specify the location of the file containing ID/key pairs for
# NTP authentication.
keyfile /etc/chrony/chrony.keys

# This directive specify the file into which chronyd will store the rate
# information.
driftfile /var/lib/chrony/chrony.drift

# Uncomment the following line to turn logging on.
#log tracking measurements statistics

# Log files location.
logdir /var/log/chrony

# Stop bad estimates upsetting machine clock.
maxupdateskew 100.0

# This directive enables kernel synchronisation (every 11 minutes) of the
# real-time clock. Note that it can’t be used along with the 'rtcfile' directive.
rtcsync

# Step the system clock instead of slewing it if the adjustment is larger than
# one second, but only in the first three clock updates.
makestep 1 3
## Client-side configuration (the other nodes, /etc/chrony/chrony.conf)

# This will use (up to):
# - 4 sources from ntp.ubuntu.com which some are ipv6 enabled
# - 2 sources from 2.ubuntu.pool.ntp.org which is ipv6 enabled as well
# - 1 source from [01].ubuntu.pool.ntp.org each (ipv4 only atm)
# This means by default, up to 6 dual-stack and up to 2 additional IPv4-only
# sources will be used.
# At the same time it retains some protection against one of the entries being
# down (compare to just using one of the lines). See (LP: #1754358) for the
# discussion.
#
# About using servers from the NTP Pool Project in general see (LP: #104525).
# Approved by Ubuntu Technical Board on 2011-02-08.
# See http://www.pool.ntp.org/join.html for more information.
pool 10.1.0.39 iburst maxsources 4
#pool 0.ubuntu.pool.ntp.org iburst maxsources 1
#pool 1.ubuntu.pool.ntp.org iburst maxsources 1
#pool 2.ubuntu.pool.ntp.org iburst maxsources 2

# This directive specify the location of the file containing ID/key pairs for
# NTP authentication.
keyfile /etc/chrony/chrony.keys

# This directive specify the file into which chronyd will store the rate
# information.
driftfile /var/lib/chrony/chrony.drift

# Uncomment the following line to turn logging on.
#log tracking measurements statistics

# Log files location.
logdir /var/log/chrony

# Stop bad estimates upsetting machine clock.
maxupdateskew 100.0

# This directive enables kernel synchronisation (every 11 minutes) of the
# real-time clock. Note that it can’t be used along with the 'rtcfile' directive.
rtcsync

# Step the system clock instead of slewing it if the adjustment is larger than
# one second, but only in the first three clock updates.
makestep 1 3
# Check the status of the NTP sources
chronyc sourcestats -v

# Force the system clock to step immediately
chronyc -a makestep

# Check whether the NTP sources are online
chronyc activity -v
200 OK

3.5 Create a Regular User

It is recommended to deploy and run the Ceph cluster as a dedicated regular user. The user only needs to be able to run privileged commands non-interactively via sudo. Newer versions of ceph-deploy accept any user that can run sudo, including root, but a regular user such as ceph, cephuser, or cephadmin (here: xceo) is still the recommended way to manage the cluster.

Create the deployment user on the storage nodes, the mon nodes, the mgr nodes, and the ceph-deploy node.

# Create the user
groupadd -r -g 20235 xceo && useradd -r -m -s /bin/bash -u 20235 -g 20235 xceo && echo xceo:ceamg.com | chpasswd

~ #:id xceo
uid=20235(xceo) gid=20235(xceo) groups=20235(xceo)


# Allow the xceo user to run privileged commands via sudo without a password
echo "xceo ALL=(ALL:ALL) NOPASSWD:ALL" >> /etc/sudoers

3.6 Configure Passwordless SSH

root@ceph-node1:~# su - xceo
xceo@ceph-node1:~$
xceo@ceph-node1:~$ ssh-keygen
ssh-copy-id xceo@10.1.0.40
ssh-copy-id xceo@10.1.0.41

ssh ceph-mon1.xx.local
ssh ceph-mon2.xx.local
ssh ceph-mon3.xx.local
ssh ceph-mgr1.xx.local
ssh ceph-mgr2.xx.local
ssh ceph-node1.xx.local
ssh ceph-node2.xx.local
ssh ceph-node3.xx.local
ssh ceph-deploy.xx.local
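
Optionally verify every alias in one pass; BatchMode makes the loop fail immediately if a key is still missing (a sketch):

for h in ceph-node1 ceph-node2 ceph-node3 ceph-mon1 ceph-mon2 ceph-mon3 ceph-mgr1 ceph-mgr2 ceph-deploy; do
  ssh -o BatchMode=yes $h.xx.local hostname
done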

3.7 Other Basic Tuning

~$ vi /etc/sysctl.conf
Add:
fs.file-max = 10000000000
fs.nr_open = 1000000000


~$ vi /etc/security/limits.conf
# Soft and hard resource limits for the root account

root soft core unlimited
root hard core unlimited
root soft nproc 1000000
root hard nproc 1000000
root soft nofile 1000000
root hard nofile 1000000
root soft memlock 32000
root hard memlock 32000
root soft msgqueue 8192000
root hard msgqueue 8192000

# Soft and hard resource limits for all other accounts
* soft core unlimited
* hard core unlimited
* soft nproc 1000000
* hard nproc 1000000
* soft nofile 1000000
* hard nofile 1000000
* soft memlock 32000
* hard memlock 32000
* soft msgqueue 8192000
* hard msgqueue 8192000

# Copy the two modified files to the other nodes
~$ scp /etc/sysctl.conf ceph-node2:/etc/
~$ scp /etc/sysctl.conf ceph-node3:/etc/
~$ scp /etc/security/limits.conf ceph-node2:/etc/security/
~$ scp /etc/security/limits.conf ceph-node3:/etc/security/

# Run the following on each of the three machines to apply the kernel parameters, then reboot
~$ sysctl -p
~$ reboot

4. Deploy the RADOS Cluster

4.1 Install the ceph-deploy Tool

# Check the available versions
root@ceph-node1[15:45:17]~ #:apt-cache madison ceph-deploy

ceph-deploy | 2.0.1-0ubuntu1.1 | https://mirrors.tuna.tsinghua.edu.cn/ubuntu focal-updates/universe amd64 Packages
ceph-deploy | 2.0.1-0ubuntu1 | https://mirrors.tuna.tsinghua.edu.cn/ubuntu focal/universe amd64 Packages

root@ceph-node1[15:45:17]~ #:apt install ceph-deploy -y

4.2 Install Python 2.7

root@ceph-mon1[09:49:37]~ apt install python2.7

# Create the symlink on every machine
ln -sv /usr/bin/python2.7 /usr/bin/python2

# After installation, run python2 to test; if the interpreter starts, the installation is fine
root@ceph-mon1[09:49:37]~ #:python2
Python 2.7.18 (default, Jul 1 2022, 12:27:04)
[GCC 9.4.0] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>>

4.3 Initialize ceph-deploy

# Switch to the xceo account and create the cluster directory
~$su - xceo
~$ pwd
/home/xceo
~$ mkdir ceph-cluster
~$ cd ceph-cluster/
~/ceph-cluster$
# The help output looks like this:

xceo@ceph-mon1:~/ceph-cluster$ ceph-deploy --help
usage: ceph-deploy [-h] [-v | -q] [--version] [--username USERNAME] [--overwrite-conf] [--ceph-conf CEPH_CONF] COMMAND ...

Easy Ceph deployment

        -^-
       /   \
       |O o|  ceph-deploy v2.0.1
       ).-.(
      '/|||\`
      | '|` |
        '|`

Full documentation can be found at: http://ceph.com/ceph-deploy/docs

ceph-deploy subcommand reference

ceph-deploy --help
new: Start deploying a new Ceph cluster and generate the CLUSTER.conf configuration file and the keyring authentication file.
install: Install Ceph packages on remote hosts; --release selects the version.
rgw: Manage RGW daemons (RADOSGW, the object storage gateway).
mgr: Manage MGR daemons (ceph-mgr, the Ceph Manager daemon).
mds: Manage MDS daemons (Ceph Metadata Server).
mon: Manage MON daemons (ceph-mon, the Ceph monitor).
gatherkeys: Fetch the authentication keys used when new MON/OSD/MDS nodes are added.
disk: Manage disks on remote hosts.
osd: Prepare data disks on remote hosts, i.e., add the specified disks on the specified hosts to the cluster as OSDs.
repo: Manage repositories on remote hosts.
admin: Push the cluster configuration file and the client.admin keyring to remote hosts.
config: Push ceph.conf to remote hosts or copy it back.
uninstall: Remove the Ceph packages from remote hosts.
purgedata: Delete the Ceph data under /var/lib/ceph and the contents of /etc/ceph.
purge: Remove the packages and all data from remote hosts.
forgetkeys: Delete all keyrings from the local host, including client.admin, monitor, and bootstrap keyrings.
pkg: Manage packages on remote hosts.
calamari: Install and configure a Calamari web node (Calamari is a web monitoring platform).

4.4 Generate the MON Configuration File

Initialize the mon node from the admin (deploy) node:

xceo@ceph-node1:~/ceph-cluster$ ceph-deploy new --cluster-network 192.168.10.0/24 --public-network 10.1.0.0/24 ceph-mon1.xx.local
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/xceo/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (2.0.1): /usr/bin/ceph-deploy new --cluster-network 192.168.10.0/24 --public-network 10.1.0.0/24 ceph-mon1.xx.local
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] overwrite_conf : False
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] mon : ['ceph-mon1.xx.local']
[ceph_deploy.cli][INFO ] ssh_copykey : True
[ceph_deploy.cli][INFO ] fsid : None
[ceph_deploy.cli][INFO ] cluster_network : 192.168.10.0/24
[ceph_deploy.cli][INFO ] public_network : 10.1.0.0/24
[ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf object at 0x7fa0a0be30a0>
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy.cli][INFO ] func : <function new at 0x7fa0a0bdaf70>
[ceph_deploy.new][DEBUG ] Creating new cluster named ceph
[ceph_deploy.new][INFO ] making sure passwordless SSH succeeds
[ceph-mon1.xx.local][DEBUG ] connected to host: ceph-node1
[ceph-mon1.xx.local][INFO ] Running command: ssh -CT -o BatchMode=yes ceph-mon1.xx.local
[ceph-mon1.xx.local][DEBUG ] connection detected need for sudo
[ceph-mon1.xx.local][DEBUG ] connected to host: ceph-mon1.xx.local
[ceph-mon1.xx.local][INFO ] Running command: sudo /bin/ip link show
[ceph-mon1.xx.local][INFO ] Running command: sudo /bin/ip addr show
[ceph-mon1.xx.local][DEBUG ] IP addresses found: ['10.1.0.39', '192.168.10.239']
[ceph_deploy.new][DEBUG ] Resolving host ceph-mon1.xx.local
[ceph_deploy.new][DEBUG ] Monitor ceph-mon1 at 10.1.0.39
[ceph_deploy.new][DEBUG ] Monitor initial members are ['ceph-mon1']
[ceph_deploy.new][DEBUG ] Monitor addrs are ['10.1.0.39']
[ceph_deploy.new][DEBUG ] Creating a random mon key...
[ceph_deploy.new][DEBUG ] Writing monitor keyring to ceph.mon.keyring...
[ceph_deploy.new][DEBUG ] Writing initial config to ceph.conf...

Verify that the configuration files were generated:

xceo@ceph-node1:~/ceph-cluster$ ll
total 24
drwxrwxr-x 2 xceo xceo 4096 May 26 16:29 ./
drwxr-xr-x 5 xceo xceo 4096 May 26 16:15 ../
-rw-rw-r-- 1 xceo xceo 259 May 26 16:29 ceph.conf
-rw-rw-r-- 1 xceo xceo 7307 May 26 16:29 ceph-deploy-ceph.log
-rw------- 1 xceo xceo 73 May 26 16:29 ceph.mon.keyring
xceo@ceph-node1:~/ceph-cluster$ cat ceph.conf
[global]
fsid = 31fdd971-2963-459b-9d6f-588f1811993f
public_network = 10.1.0.0/24
cluster_network = 192.168.10.0/24
mon_initial_members = ceph-mon1
mon_host = 10.1.0.39
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx

The following error may appear during this step:

[ceph_deploy][ERROR ] RuntimeError: AttributeError: module 'platform' has no attribute 'linux_distribution'

This happens because platform.linux_distribution() was deprecated and then removed from the Python standard library (in Python 3.8), but this version of ceph-deploy still calls it.

Fix

Edit /usr/lib/python3/dist-packages/ceph_deploy/hosts/remotes.py so the function reads as follows:

def platform_information(_linux_distribution=None):
    """ detect platform information from remote host """
    """
    linux_distribution = _linux_distribution or platform.linux_distribution
    distro, release, codename = linux_distribution()
    """
    distro = release = codename = None
    try:
        linux_distribution = _linux_distribution or platform.linux_distribution
        distro, release, codename = linux_distribution()
    except AttributeError:
        pass

After the initialization completes, three files are present in the working directory:

xceo@ceph-node1:~/ceph-cluster$ ls
ceph.conf ceph-deploy-ceph.log ceph.mon.keyring
-------------------------------------------------
ceph.conf # cluster configuration file
ceph-deploy-ceph.log # deployment log
ceph.mon.keyring # mon keyring

4.5 Initialize the Node Hosts

# --no-adjust-repos keeps the APT sources configured earlier instead of rewriting them
# Run as the regular user (running from inside ~/ceph-cluster can produce errors, so change to the home directory first)
xceo@ceph-mon1:~/ceph-cluster$ cd ~

ceph-deploy install --no-adjust-repos --nogpgcheck ceph-node1 ceph-node2 ceph-node3                   # default release
ceph-deploy install --no-adjust-repos --nogpgcheck --release pacific ceph-node1 ceph-node2 ceph-node3  # pinned release
xceo@ceph-node1:~/ceph-cluster$ ceph-deploy install --no-adjust-repos --nogpgcheck --release pacific  ceph-node1  ceph-node2 ceph-node3


4.6 Initialize the MON Node

Install ceph-mon on the three mon nodes.

# Note: the hostname of each mon host must match the mon host name in ceph.conf

root@ceph-mon1[10:40:52]~ #:apt-cache madison ceph-mon
ceph-mon | 16.2.13-1focal | https://download.ceph.com/debian-pacific focal/main amd64 Packages
ceph-mon | 15.2.17-0ubuntu0.20.04.4 | http://mirrors.tuna.tsinghua.edu.cn/ubuntu focal-updates/main amd64 Packages
ceph-mon | 15.2.17-0ubuntu0.20.04.3 | http://mirrors.tuna.tsinghua.edu.cn/ubuntu focal-security/main amd64 Packages
ceph-mon | 15.2.1-0ubuntu1 | http://mirrors.tuna.tsinghua.edu.cn/ubuntu focal/main amd64 Packages

# Install the pinned version
apt -y install ceph-mon=16.2.13-1focal
xceo@ceph-mon1:~/ceph-cluster$ ceph-deploy mon create-initial
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/xceo/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (2.0.1): /usr/bin/ceph-deploy mon create-initial
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] overwrite_conf : False
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] subcommand : create-initial
[ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf object at 0x7fb775ca7a60>
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy.cli][INFO ] func : <function mon at 0x7fb775d12700>
[ceph_deploy.cli][INFO ] keyrings : None
[ceph_deploy.mon][DEBUG ] Deploying mon, cluster ceph hosts ceph-mon1
[ceph_deploy.mon][DEBUG ] detecting platform for host ceph-mon1 ...
[ceph-mon1][DEBUG ] connection detected need for sudo
[ceph-mon1][DEBUG ] connected to host: ceph-mon1
[ceph_deploy.mon][INFO ] distro info: ubuntu 20.04 focal
[ceph-mon1][DEBUG ] determining if provided host has same hostname in remote
[ceph-mon1][DEBUG ] deploying mon to ceph-mon1
[ceph-mon1][DEBUG ] remote hostname: ceph-mon1
[ceph-mon1][DEBUG ] checking for done path: /var/lib/ceph/mon/ceph-ceph-mon1/done
[ceph-mon1][DEBUG ] done path does not exist: /var/lib/ceph/mon/ceph-ceph-mon1/done
[ceph-mon1][INFO ] creating keyring file: /var/lib/ceph/tmp/ceph-ceph-mon1.mon.keyring
[ceph-mon1][INFO ] Running command: sudo ceph-mon --cluster ceph --mkfs -i ceph-mon1 --keyring /var/lib/ceph/tmp/ceph-ceph-mon1.mon.keyring --setuser 64045 --setgroup 64045
[ceph-mon1][INFO ] unlinking keyring file /var/lib/ceph/tmp/ceph-ceph-mon1.mon.keyring
[ceph-mon1][INFO ] Running command: sudo systemctl enable ceph.target
[ceph-mon1][INFO ] Running command: sudo systemctl enable ceph-mon@ceph-mon1
[ceph-mon1][WARNIN] Created symlink /etc/systemd/system/ceph-mon.target.wants/ceph-mon@ceph-mon1.service → /lib/systemd/system/ceph-mon@.service.
[ceph-mon1][INFO ] Running command: sudo systemctl start ceph-mon@ceph-mon1
[ceph-mon1][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-mon1.asok mon_status
[ceph-mon1][DEBUG ] ********************************************************************************
[ceph-mon1][DEBUG ] status for monitor: mon.ceph-mon1
[ceph-mon1][DEBUG ] {
[ceph-mon1][DEBUG ] "election_epoch": 3,
[ceph-mon1][DEBUG ] "extra_probe_peers": [],
[ceph-mon1][DEBUG ] "feature_map": {
[ceph-mon1][DEBUG ] "mon": [
[ceph-mon1][DEBUG ] {
[ceph-mon1][DEBUG ] "features": "0x3f01cfbdfffdffff",
[ceph-mon1][DEBUG ] "num": 1,
[ceph-mon1][DEBUG ] "release": "luminous"
[ceph-mon1][DEBUG ] }
[ceph-mon1][DEBUG ] ]
[ceph-mon1][DEBUG ] },
[ceph-mon1][DEBUG ] "features": {
[ceph-mon1][DEBUG ] "quorum_con": "4540138314316775423",
[ceph-mon1][DEBUG ] "quorum_mon": [
[ceph-mon1][DEBUG ] "kraken",
[ceph-mon1][DEBUG ] "luminous",
[ceph-mon1][DEBUG ] "mimic",
[ceph-mon1][DEBUG ] "osdmap-prune",
[ceph-mon1][DEBUG ] "nautilus",
[ceph-mon1][DEBUG ] "octopus",
[ceph-mon1][DEBUG ] "pacific",
[ceph-mon1][DEBUG ] "elector-pinging"
[ceph-mon1][DEBUG ] ],
[ceph-mon1][DEBUG ] "required_con": "2449958747317026820",
[ceph-mon1][DEBUG ] "required_mon": [
[ceph-mon1][DEBUG ] "kraken",
[ceph-mon1][DEBUG ] "luminous",
[ceph-mon1][DEBUG ] "mimic",
[ceph-mon1][DEBUG ] "osdmap-prune",
[ceph-mon1][DEBUG ] "nautilus",
[ceph-mon1][DEBUG ] "octopus",
[ceph-mon1][DEBUG ] "pacific",
[ceph-mon1][DEBUG ] "elector-pinging"
[ceph-mon1][DEBUG ] ]
[ceph-mon1][DEBUG ] },
[ceph-mon1][DEBUG ] "monmap": {
[ceph-mon1][DEBUG ] "created": "2023-05-29T04:19:43.502249Z",
[ceph-mon1][DEBUG ] "disallowed_leaders: ": "",
[ceph-mon1][DEBUG ] "election_strategy": 1,
[ceph-mon1][DEBUG ] "epoch": 1,
[ceph-mon1][DEBUG ] "features": {
[ceph-mon1][DEBUG ] "optional": [],
[ceph-mon1][DEBUG ] "persistent": [
[ceph-mon1][DEBUG ] "kraken",
[ceph-mon1][DEBUG ] "luminous",
[ceph-mon1][DEBUG ] "mimic",
[ceph-mon1][DEBUG ] "osdmap-prune",
[ceph-mon1][DEBUG ] "nautilus",
[ceph-mon1][DEBUG ] "octopus",
[ceph-mon1][DEBUG ] "pacific",
[ceph-mon1][DEBUG ] "elector-pinging"
[ceph-mon1][DEBUG ] ]
[ceph-mon1][DEBUG ] },
[ceph-mon1][DEBUG ] "fsid": "62be32df-9cb4-474f-8727-d5c4bbceaf97",
[ceph-mon1][DEBUG ] "min_mon_release": 16,
[ceph-mon1][DEBUG ] "min_mon_release_name": "pacific",
[ceph-mon1][DEBUG ] "modified": "2023-05-29T04:19:43.502249Z",
[ceph-mon1][DEBUG ] "mons": [
[ceph-mon1][DEBUG ] {
[ceph-mon1][DEBUG ] "addr": "10.1.0.39:6789/0",
[ceph-mon1][DEBUG ] "crush_location": "{}",
[ceph-mon1][DEBUG ] "name": "ceph-mon1",
[ceph-mon1][DEBUG ] "priority": 0,
[ceph-mon1][DEBUG ] "public_addr": "10.1.0.39:6789/0",
[ceph-mon1][DEBUG ] "public_addrs": {
[ceph-mon1][DEBUG ] "addrvec": [
[ceph-mon1][DEBUG ] {
[ceph-mon1][DEBUG ] "addr": "10.1.0.39:3300",
[ceph-mon1][DEBUG ] "nonce": 0,
[ceph-mon1][DEBUG ] "type": "v2"
[ceph-mon1][DEBUG ] },
[ceph-mon1][DEBUG ] {
[ceph-mon1][DEBUG ] "addr": "10.1.0.39:6789",
[ceph-mon1][DEBUG ] "nonce": 0,
[ceph-mon1][DEBUG ] "type": "v1"
[ceph-mon1][DEBUG ] }
[ceph-mon1][DEBUG ] ]
[ceph-mon1][DEBUG ] },
[ceph-mon1][DEBUG ] "rank": 0,
[ceph-mon1][DEBUG ] "weight": 0
[ceph-mon1][DEBUG ] }
[ceph-mon1][DEBUG ] ],
[ceph-mon1][DEBUG ] "removed_ranks: ": "",
[ceph-mon1][DEBUG ] "stretch_mode": false,
[ceph-mon1][DEBUG ] "tiebreaker_mon": ""
[ceph-mon1][DEBUG ] },
[ceph-mon1][DEBUG ] "name": "ceph-mon1",
[ceph-mon1][DEBUG ] "outside_quorum": [],
[ceph-mon1][DEBUG ] "quorum": [
[ceph-mon1][DEBUG ] 0
[ceph-mon1][DEBUG ] ],
[ceph-mon1][DEBUG ] "quorum_age": 3,
[ceph-mon1][DEBUG ] "rank": 0,
[ceph-mon1][DEBUG ] "state": "leader",
[ceph-mon1][DEBUG ] "stretch_mode": false,
[ceph-mon1][DEBUG ] "sync_provider": []
[ceph-mon1][DEBUG ] }
[ceph-mon1][DEBUG ] ********************************************************************************
[ceph-mon1][INFO ] monitor: mon.ceph-mon1 is running
[ceph-mon1][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-mon1.asok mon_status
[ceph_deploy.mon][INFO ] processing monitor mon.ceph-mon1
[ceph-mon1][DEBUG ] connection detected need for sudo
[ceph-mon1][DEBUG ] connected to host: ceph-mon1
[ceph-mon1][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-mon1.asok mon_status
[ceph_deploy.mon][INFO ] mon.ceph-mon1 monitor has reached quorum!
[ceph_deploy.mon][INFO ] all initial monitors are running and have formed quorum
[ceph_deploy.mon][INFO ] Running gatherkeys...
[ceph_deploy.gatherkeys][INFO ] Storing keys in temp directory /tmp/tmpcsuc00de
[ceph-mon1][DEBUG ] connection detected need for sudo
[ceph-mon1][DEBUG ] connected to host: ceph-mon1
[ceph-mon1][INFO ] Running command: sudo /usr/bin/ceph --connect-timeout=25 --cluster=ceph --admin-daemon=/var/run/ceph/ceph-mon.ceph-mon1.asok mon_status
[ceph-mon1][INFO ] Running command: sudo /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-ceph-mon1/keyring auth get client.admin
[ceph-mon1][INFO ] Running command: sudo /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-ceph-mon1/keyring auth get client.bootstrap-mds
[ceph-mon1][INFO ] Running command: sudo /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-ceph-mon1/keyring auth get client.bootstrap-mgr
[ceph-mon1][INFO ] Running command: sudo /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-ceph-mon1/keyring auth get client.bootstrap-osd
[ceph-mon1][INFO ] Running command: sudo /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-ceph-mon1/keyring auth get client.bootstrap-rgw
[ceph_deploy.gatherkeys][INFO ] Storing ceph.client.admin.keyring
[ceph_deploy.gatherkeys][INFO ] Storing ceph.bootstrap-mds.keyring
[ceph_deploy.gatherkeys][INFO ] Storing ceph.bootstrap-mgr.keyring
[ceph_deploy.gatherkeys][INFO ] keyring 'ceph.mon.keyring' already exists
[ceph_deploy.gatherkeys][INFO ] Storing ceph.bootstrap-osd.keyring
[ceph_deploy.gatherkeys][INFO ] Storing ceph.bootstrap-rgw.keyring
[ceph_deploy.gatherkeys][INFO ] Destroy temp directory /tmp/tmpcsuc00de


# The ceph.client.admin.keyring file is now present (it belongs on whichever host you manage the cluster from)
xceo@ceph-mon1:~/ceph-cluster$ ls
ceph.bootstrap-mds.keyring ceph.bootstrap-osd.keyring ceph.client.admin.keyring ceph-deploy-ceph.log
ceph.bootstrap-mgr.keyring ceph.bootstrap-rgw.keyring ceph.conf ceph.mon.keyring

# Check the mon service process
xceo@ceph-mon1:~/ceph-cluster$ ps -ef | grep ceph-mon
ceph 898964 1 1 May29 ? 00:28:37 /usr/bin/ceph-mon -f --cluster ceph --id ceph-mon1 --setuser ceph --setgroup ceph

4.7 Distribute the Admin Keyring to the Node Hosts (Optional)

Keep the generated ceph.client.admin.keyring file safe. Install ceph-common on the deploy and node hosts:

sudo apt install ceph-common -y
xceo@ceph-mon1:~/ceph-cluster$ ceph-deploy --overwrite-conf admin ceph-node1 ceph-node2 ceph-node3
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/xceo/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (2.0.1): /usr/bin/ceph-deploy --overwrite-conf admin ceph-node1 ceph-node2 ceph-node3
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] overwrite_conf : True
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] client : ['ceph-node1', 'ceph-node2', 'ceph-node3']
[ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf object at 0x7f19678e51f0>
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy.cli][INFO ] func : <function admin at 0x7f1967e2f040>
[ceph_deploy.admin][DEBUG ] Pushing admin keys and conf to ceph-node1
[ceph-node1][DEBUG ] connection detected need for sudo
[ceph-node1][DEBUG ] connected to host: ceph-node1
[ceph_deploy.admin][DEBUG ] Pushing admin keys and conf to ceph-node2
[ceph-node2][DEBUG ] connection detected need for sudo
[ceph-node2][DEBUG ] connected to host: ceph-node2
[ceph_deploy.admin][DEBUG ] Pushing admin keys and conf to ceph-node3
[ceph-node3][DEBUG ] connection detected need for sudo
[ceph-node3][DEBUG ] connected to host: ceph-node3

For security, the keyring's owner and group default to the root user and root group. If the deployment user should also be able to run ceph commands, grant it read/write access to the keyring:
setfacl -m u:xceo:rw /etc/ceph/ceph.client.admin.keyring    # run on the deploy host

xceo@ceph-mon1:~$ sudo apt install acl

root@ceph-mon1[14:11:31]~ #:setfacl -m u:xceo:rw /etc/ceph/ceph.client.admin.keyring
xceo@ceph-mon1:~$ ceph -s
cluster:
id: 62be32df-9cb4-474f-8727-d5c4bbceaf97
health: HEALTH_WARN
mon is allowing insecure global_id reclaim

services:
mon: 1 daemons, quorum ceph-mon1 (age 112m)
mgr: no daemons active
osd: 0 osds: 0 up, 0 in

data:
pools: 0 pools, 0 pgs
objects: 0 objects, 0 B
usage: 0 B used, 0 B / 0 B avail
pgs:
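
The same ACL is needed on every host where the admin keyring was pushed, if the regular user should be able to run ceph there as well (a sketch, assuming the xceo user and the node names used above):

for h in ceph-node1 ceph-node2 ceph-node3; do
  ssh $h "sudo apt install -y acl && sudo setfacl -m u:xceo:rw /etc/ceph/ceph.client.admin.keyring"
done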

4.8 Deploy the ceph-mgr Nodes

The mgr nodes need to read the Ceph configuration, i.e., the files under /etc/ceph.

# Pre-install the mgr package on the mgr nodes
apt install ceph-mgr -y

xceo@ceph-mon1:~/ceph-cluster$ ceph-deploy mgr create ceph-mgr1
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/xceo/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (2.0.1): /usr/bin/ceph-deploy mgr create ceph-mgr1
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] overwrite_conf : False
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] subcommand : create
[ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf object at 0x7fd8738734c0>
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy.cli][INFO ] func : <function mgr at 0x7fd87392ec10>
[ceph_deploy.cli][INFO ] mgr : [('ceph-mgr1', 'ceph-mgr1')]
[ceph_deploy.mgr][DEBUG ] Deploying mgr, cluster ceph hosts ceph-mgr1:ceph-mgr1
The authenticity of host 'ceph-mgr1 (10.1.0.40)' can't be established.
ECDSA key fingerprint is SHA256:lhRjKQBhgEhjbqcfKBb6oyle8C9EIOzu48QUoaeISIE.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added 'ceph-mgr1' (ECDSA) to the list of known hosts.
[ceph-mgr1][DEBUG ] connection detected need for sudo
[ceph-mgr1][DEBUG ] connected to host: ceph-mgr1
[ceph_deploy.mgr][INFO ] Distro info: ubuntu 20.04 focal
[ceph_deploy.mgr][DEBUG ] remote host will use systemd
[ceph_deploy.mgr][DEBUG ] deploying mgr bootstrap to ceph-mgr1
[ceph-mgr1][WARNIN] mgr keyring does not exist yet, creating one
[ceph-mgr1][INFO ] Running command: sudo ceph --cluster ceph --name client.bootstrap-mgr --keyring /var/lib/ceph/bootstrap-mgr/ceph.keyring auth get-or-create mgr.ceph-mgr1 mon allow profile mgr osd allow * mds allow * -o /var/lib/ceph/mgr/ceph-ceph-mgr1/keyring
[ceph-mgr1][INFO ] Running command: sudo systemctl enable ceph-mgr@ceph-mgr1
[ceph-mgr1][WARNIN] Created symlink /etc/systemd/system/ceph-mgr.target.wants/ceph-mgr@ceph-mgr1.service → /lib/systemd/system/ceph-mgr@.service.
[ceph-mgr1][INFO ] Running command: sudo systemctl start ceph-mgr@ceph-mgr1
[ceph-mgr1][INFO ] Running command: sudo systemctl enable ceph.target
xceo@ceph-mon1:~/ceph-cluster$ ceph-deploy mgr create ceph-mgr2
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/xceo/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (2.0.1): /usr/bin/ceph-deploy mgr create ceph-mgr2
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] overwrite_conf : False
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] subcommand : create
[ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf object at 0x7f5b7f96b4c0>
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy.cli][INFO ] func : <function mgr at 0x7f5b7fa26c10>
[ceph_deploy.cli][INFO ] mgr : [('ceph-mgr2', 'ceph-mgr2')]
[ceph_deploy.mgr][DEBUG ] Deploying mgr, cluster ceph hosts ceph-mgr2:ceph-mgr2
The authenticity of host 'ceph-mgr2 (10.1.0.41)' can't be established.
ECDSA key fingerprint is SHA256:lhRjKQBhgEhjbqcfKBb6oyle8C9EIOzu48QUoaeISIE.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added 'ceph-mgr2' (ECDSA) to the list of known hosts.
[ceph-mgr2][DEBUG ] connection detected need for sudo
[ceph-mgr2][DEBUG ] connected to host: ceph-mgr2
[ceph_deploy.mgr][INFO ] Distro info: ubuntu 20.04 focal
[ceph_deploy.mgr][DEBUG ] remote host will use systemd
[ceph_deploy.mgr][DEBUG ] deploying mgr bootstrap to ceph-mgr2
[ceph-mgr2][WARNIN] mgr keyring does not exist yet, creating one
[ceph-mgr2][INFO ] Running command: sudo ceph --cluster ceph --name client.bootstrap-mgr --keyring /var/lib/ceph/bootstrap-mgr/ceph.keyring auth get-or-create mgr.ceph-mgr2 mon allow profile mgr osd allow * mds allow * -o /var/lib/ceph/mgr/ceph-ceph-mgr2/keyring
[ceph-mgr2][INFO ] Running command: sudo systemctl enable ceph-mgr@ceph-mgr2
[ceph-mgr2][WARNIN] Created symlink /etc/systemd/system/ceph-mgr.target.wants/ceph-mgr@ceph-mgr2.service → /lib/systemd/system/ceph-mgr@.service.
[ceph-mgr2][INFO ] Running command: sudo systemctl start ceph-mgr@ceph-mgr2
[ceph-mgr2][INFO ] Running command: sudo systemctl enable ceph.target

Verify the mgr service


Run ceph -s on the deploy node. The output currently shows the health warning: mons are allowing insecure global_id reclaim.

# The warning is caused by insecure global_id reclaim being allowed; after disabling it the warning disappears
ceph config set mon auth_allow_insecure_global_id_reclaim false
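
Since the screenshot is not reproduced here, the mgr state can also be checked from the command line (a sketch; expect one active mgr and one standby once both are deployed):

ceph -s | grep mgr
systemctl status ceph-mgr@ceph-mgr1    # on the mgr host itself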

4.9 Initialize the OSD Nodes

# Run on the deploy host
ceph-deploy install --release pacific ceph-node1 ceph-node2 ceph-node3 # already done during node initialization above
# The basic Ceph runtime must be installed on the node hosts (from the deploy node) before wiping the disks

An error referencing File "/usr/lib/python3/dist-packages/ceph_deploy/util/decorators.py", line 69, in newfunc may appear when listing disks; the full traceback and the fix are shown below.

# List the disks on a remote storage node
xceo@ceph-mon1:~/ceph-cluster$ sudo ceph-deploy disk list ceph-node2
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (2.0.1): /usr/bin/ceph-deploy disk list ceph-node2
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] overwrite_conf : False
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] subcommand : list
[ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf object at 0x7ff31b4353a0>
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy.cli][INFO ] func : <function disk at 0x7ff31b4d2a60>
[ceph_deploy.cli][INFO ] host : ['ceph-node2']
[ceph_deploy.cli][INFO ] debug : False
The authenticity of host 'ceph-node2 (10.1.0.40)' can't be established.
ECDSA key fingerprint is SHA256:lhRjKQBhgEhjbqcfKBb6oyle8C9EIOzu48QUoaeISIE.
Are you sure you want to continue connecting (yes/no/[fingerprint])? ^C[ceph_deploy][ERROR ] KeyboardInterrupt

xceo@ceph-mon1:~/ceph-cluster$ ceph-deploy disk list ceph-node2
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/xceo/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (2.0.1): /usr/bin/ceph-deploy disk list ceph-node2
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] overwrite_conf : False
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] subcommand : list
[ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf object at 0x7f4c4cac53d0>
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy.cli][INFO ] func : <function disk at 0x7f4c4cb62af0>
[ceph_deploy.cli][INFO ] host : ['ceph-node2']
[ceph_deploy.cli][INFO ] debug : False
[ceph-node2][DEBUG ] connection detected need for sudo
[ceph-node2][DEBUG ] connected to host: ceph-node2
[ceph-node2][INFO ] Running command: sudo fdisk -l
[ceph_deploy][ERROR ] Traceback (most recent call last):
[ceph_deploy][ERROR ] File "/usr/lib/python3/dist-packages/ceph_deploy/util/decorators.py", line 69, in newfunc
[ceph_deploy][ERROR ] return f(*a, **kw)
[ceph_deploy][ERROR ] File "/usr/lib/python3/dist-packages/ceph_deploy/cli.py", line 166, in _main
[ceph_deploy][ERROR ] return args.func(args)
[ceph_deploy][ERROR ] File "/usr/lib/python3/dist-packages/ceph_deploy/osd.py", line 434, in disk
[ceph_deploy][ERROR ] disk_list(args, cfg)
[ceph_deploy][ERROR ] File "/usr/lib/python3/dist-packages/ceph_deploy/osd.py", line 375, in disk_list
[ceph_deploy][ERROR ] if line.startswith('Disk /'):
[ceph_deploy][ERROR ] TypeError: startswith first arg must be bytes or a tuple of bytes, not str
[ceph_deploy][ERROR ]

Fix

Edit /usr/lib/python3/dist-packages/ceph_deploy/osd.py:
if line.startswith('Disk /'):
# replace with
if line.startswith(b'Disk /'):
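
The same patch as a one-liner, applied on the deploy host (a sketch; back up the file first if you prefer):

sudo sed -i "s|startswith('Disk /')|startswith(b'Disk /')|" /usr/lib/python3/dist-packages/ceph_deploy/osd.py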
ceph-deploy disk list ceph-node1   
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/xceo/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (2.0.1): /usr/bin/ceph-deploy disk list ceph-node1
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] overwrite_conf : False
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] subcommand : list
[ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf object at 0x7f90a9868310>
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy.cli][INFO ] func : <function disk at 0x7f90a9906af0>
[ceph_deploy.cli][INFO ] host : ['ceph-node1']
[ceph_deploy.cli][INFO ] debug : False
[ceph-node1][DEBUG ] connection detected need for sudo
[ceph-node1][DEBUG ] connected to host: ceph-node1
[ceph-node1][INFO ] Running command: sudo fdisk -l
[ceph-node1][INFO ] b'Disk /dev/loop0: 55.45 MiB, 58130432 bytes, 113536 sectors'
[ceph-node1][INFO ] b'Disk /dev/loop1: 55.65 MiB, 58339328 bytes, 113944 sectors'
[ceph-node1][INFO ] b'Disk /dev/loop2: 70.32 MiB, 73728000 bytes, 144000 sectors'
[ceph-node1][INFO ] b'Disk /dev/loop3: 91.85 MiB, 96292864 bytes, 188072 sectors'
[ceph-node1][INFO ] b'Disk /dev/loop4: 53.24 MiB, 55824384 bytes, 109032 sectors'
[ceph-node1][INFO ] b'Disk /dev/loop5: 63.46 MiB, 66535424 bytes, 129952 sectors'
[ceph-node1][INFO ] b'Disk /dev/vda: 1 TiB, 1099511627776 bytes, 2147483648 sectors'
[ceph-node1][INFO ] b'Disk /dev/vdb: 120 GiB, 128849018880 bytes, 251658240 sectors'
[ceph-node1][INFO ] b'Disk /dev/vdc: 120 GiB, 128849018880 bytes, 251658240 sectors'
[ceph-node1][INFO ] b'Disk /dev/vdd: 120 GiB, 128849018880 bytes, 251658240 sectors'
[ceph-node1][INFO ] b'Disk /dev/vde: 120 GiB, 128849018880 bytes, 251658240 sectors'
[ceph-node1][INFO ] b'Disk /dev/vdf: 120 GiB, 128849018880 bytes, 251658240 sectors'

Use ceph-deploy disk zap to wipe the Ceph data disks on each OSD node.

xceo@ceph-mon1:~/ceph-cluster$ ceph-deploy disk zap  ceph-node1 /dev/vdb
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/xceo/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (2.0.1): /usr/bin/ceph-deploy disk zap ceph-node1 /dev/vdb
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] overwrite_conf : False
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] subcommand : zap
[ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf object at 0x7fc53b4543a0>
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy.cli][INFO ] func : <function disk at 0x7fc53b4f1af0>
[ceph_deploy.cli][INFO ] host : ceph-node1
[ceph_deploy.cli][INFO ] disk : ['/dev/vdb']
[ceph_deploy.cli][INFO ] debug : False
[ceph_deploy.osd][DEBUG ] zapping /dev/vdb on ceph-node1
[ceph-node1][DEBUG ] connection detected need for sudo
[ceph-node1][DEBUG ] connected to host: ceph-node1
[ceph_deploy.osd][INFO ] Distro info: ubuntu 20.04 focal
[ceph-node1][INFO ] Running command: sudo /usr/sbin/ceph-volume lvm zap /dev/vdb
[ceph-node1][WARNIN] --> Zapping: /dev/vdb
[ceph-node1][WARNIN] --> --destroy was not specified, but zapping a whole device will remove the partition table
[ceph-node1][WARNIN] Running command: /usr/bin/dd if=/dev/zero of=/dev/vdb bs=1M count=10 conv=fsync
[ceph-node1][WARNIN] stderr: 10+0 records in
[ceph-node1][WARNIN] 10+0 records out
[ceph-node1][WARNIN] stderr: 10485760 bytes (10 MB, 10 MiB) copied, 0.103617 s, 101 MB/s
[ceph-node1][WARNIN] --> Zapping successful for: <Raw Device: /dev/vdb>


xceo@ceph-mon1:~/ceph-cluster$ ceph-deploy disk zap ceph-node1 /dev/vdc
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/xceo/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (2.0.1): /usr/bin/ceph-deploy disk zap ceph-node1 /dev/vdc
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] overwrite_conf : False
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] subcommand : zap
[ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf object at 0x7f9c0700b3a0>
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy.cli][INFO ] func : <function disk at 0x7f9c070a9af0>
[ceph_deploy.cli][INFO ] host : ceph-node1
[ceph_deploy.cli][INFO ] disk : ['/dev/vdc']
[ceph_deploy.cli][INFO ] debug : False
[ceph_deploy.osd][DEBUG ] zapping /dev/vdc on ceph-node1
[ceph-node1][DEBUG ] connection detected need for sudo
[ceph-node1][DEBUG ] connected to host: ceph-node1
[ceph_deploy.osd][INFO ] Distro info: ubuntu 20.04 focal
[ceph-node1][INFO ] Running command: sudo /usr/sbin/ceph-volume lvm zap /dev/vdc
[ceph-node1][WARNIN] --> Zapping: /dev/vdc
[ceph-node1][WARNIN] --> --destroy was not specified, but zapping a whole device will remove the partition table
[ceph-node1][WARNIN] Running command: /usr/bin/dd if=/dev/zero of=/dev/vdc bs=1M count=10 conv=fsync
[ceph-node1][WARNIN] stderr: 10+0 records in
[ceph-node1][WARNIN] 10+0 records out
[ceph-node1][WARNIN] stderr: 10485760 bytes (10 MB, 10 MiB) copied, 0.106743 s, 98.2 MB/s
[ceph-node1][WARNIN] --> Zapping successful for: <Raw Device: /dev/vdc>


xceo@ceph-mon1:~/ceph-cluster$ ceph-deploy disk zap ceph-node1 /dev/vdd
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/xceo/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (2.0.1): /usr/bin/ceph-deploy disk zap ceph-node1 /dev/vdd
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] overwrite_conf : False
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] subcommand : zap
[ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf object at 0x7f719c0fa3a0>
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy.cli][INFO ] func : <function disk at 0x7f719c197af0>
[ceph_deploy.cli][INFO ] host : ceph-node1
[ceph_deploy.cli][INFO ] disk : ['/dev/vdd']
[ceph_deploy.cli][INFO ] debug : False
[ceph_deploy.osd][DEBUG ] zapping /dev/vdd on ceph-node1
[ceph-node1][DEBUG ] connection detected need for sudo
[ceph-node1][DEBUG ] connected to host: ceph-node1
[ceph_deploy.osd][INFO ] Distro info: ubuntu 20.04 focal
[ceph-node1][INFO ] Running command: sudo /usr/sbin/ceph-volume lvm zap /dev/vdd
[ceph-node1][WARNIN] --> Zapping: /dev/vdd
[ceph-node1][WARNIN] --> --destroy was not specified, but zapping a whole device will remove the partition table
[ceph-node1][WARNIN] Running command: /usr/bin/dd if=/dev/zero of=/dev/vdd bs=1M count=10 conv=fsync
[ceph-node1][WARNIN] stderr: 10+0 records in
[ceph-node1][WARNIN] 10+0 records out
[ceph-node1][WARNIN] stderr: 10485760 bytes (10 MB, 10 MiB) copied, 0.122102 s, 85.9 MB/s
[ceph-node1][WARNIN] --> Zapping successful for: <Raw Device: /dev/vdd>

.....

The "Zapping successful" message confirms the disk was wiped. The remaining disks are handled the same way; a loop sketch follows.
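
A sketch that wipes the rest of the disks in one pass (assuming the same vdb-vdf layout on every node):

for node in ceph-node1 ceph-node2 ceph-node3; do
  for dev in /dev/vdb /dev/vdc /dev/vdd /dev/vde /dev/vdf; do
    ceph-deploy disk zap $node $dev
  done
done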

4.10 Add the OSDs

  • Data: the object data stored by Ceph
  • block: the RocksDB data, i.e., the metadata
  • block-wal: the write-ahead log (WAL) of the database

Before any OSDs are added, the cluster still reports a health warning.

# Add a disk as an OSD
xceo@ceph-mon1:~/ceph-cluster$ ceph-deploy osd create ceph-node1 --data /dev/vdb
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/xceo/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (2.0.1): /usr/bin/ceph-deploy osd create ceph-node1 --data /dev/vdb
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] overwrite_conf : False
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] subcommand : create
[ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf object at 0x7fef2b8bbfa0>
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy.cli][INFO ] func : <function osd at 0x7fef2b959a60>
[ceph_deploy.cli][INFO ] data : /dev/vdb
[ceph_deploy.cli][INFO ] journal : None
[ceph_deploy.cli][INFO ] zap_disk : False
[ceph_deploy.cli][INFO ] fs_type : xfs
[ceph_deploy.cli][INFO ] dmcrypt : False
[ceph_deploy.cli][INFO ] dmcrypt_key_dir : /etc/ceph/dmcrypt-keys
[ceph_deploy.cli][INFO ] filestore : None
[ceph_deploy.cli][INFO ] bluestore : None
[ceph_deploy.cli][INFO ] block_db : None
[ceph_deploy.cli][INFO ] block_wal : None
[ceph_deploy.cli][INFO ] host : ceph-node1
[ceph_deploy.cli][INFO ] debug : False
[ceph_deploy.osd][DEBUG ] Creating OSD on cluster ceph with data device /dev/vdb
[ceph-node1][DEBUG ] connection detected need for sudo
[ceph-node1][DEBUG ] connected to host: ceph-node1
[ceph_deploy.osd][INFO ] Distro info: ubuntu 20.04 focal
[ceph_deploy.osd][DEBUG ] Deploying osd to ceph-node1
[ceph-node1][INFO ] Running command: sudo /usr/sbin/ceph-volume --cluster ceph lvm create --bluestore --data /dev/vdb
[ceph-node1][WARNIN] Running command: /usr/bin/ceph-authtool --gen-print-key
[ceph-node1][WARNIN] Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new d9637751-9b7f-42d6-b9d8-2718039162b5
[ceph-node1][WARNIN] Running command: vgcreate --force --yes ceph-ac4ab09d-3655-4447-a0e3-d0d79f3b326a /dev/vdb
[ceph-node1][WARNIN] stdout: Physical volume "/dev/vdb" successfully created.
[ceph-node1][WARNIN] stdout: Volume group "ceph-ac4ab09d-3655-4447-a0e3-d0d79f3b326a" successfully created
[ceph-node1][WARNIN] Running command: lvcreate --yes -l 30719 -n osd-block-d9637751-9b7f-42d6-b9d8-2718039162b5 ceph-ac4ab09d-3655-4447-a0e3-d0d79f3b326a
[ceph-node1][WARNIN] stdout: Logical volume "osd-block-d9637751-9b7f-42d6-b9d8-2718039162b5" created.
[ceph-node1][WARNIN] Running command: /usr/bin/ceph-authtool --gen-print-key
[ceph-node1][WARNIN] Running command: /usr/bin/mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-0
[ceph-node1][WARNIN] --> Executable selinuxenabled not in PATH: /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/snap/bin/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin
[ceph-node1][WARNIN] Running command: /usr/bin/chown -h ceph:ceph /dev/ceph-ac4ab09d-3655-4447-a0e3-d0d79f3b326a/osd-block-d9637751-9b7f-42d6-b9d8-2718039162b5
[ceph-node1][WARNIN] Running command: /usr/bin/chown -R ceph:ceph /dev/dm-0
[ceph-node1][WARNIN] Running command: /usr/bin/ln -s /dev/ceph-ac4ab09d-3655-4447-a0e3-d0d79f3b326a/osd-block-d9637751-9b7f-42d6-b9d8-2718039162b5 /var/lib/ceph/osd/ceph-0/block
[ceph-node1][WARNIN] Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring mon getmap -o /var/lib/ceph/osd/ceph-0/activate.monmap
[ceph-node1][WARNIN] stderr: 2023-05-29T14:48:36.998+0800 7f3b7f4e2700 -1 auth: unable to find a keyring on /etc/ceph/ceph.client.bootstrap-osd.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
[ceph-node1][WARNIN] 2023-05-29T14:48:36.998+0800 7f3b7f4e2700 -1 AuthRegistry(0x7f3b7805bc18) no keyring found at /etc/ceph/ceph.client.bootstrap-osd.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,, disabling cephx
[ceph-node1][WARNIN] stderr: got monmap epoch 1
[ceph-node1][WARNIN] --> Creating keyring file for osd.0
[ceph-node1][WARNIN] Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-0/keyring
[ceph-node1][WARNIN] Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-0/
[ceph-node1][WARNIN] Running command: /usr/bin/ceph-osd --cluster ceph --osd-objectstore bluestore --mkfs -i 0 --monmap /var/lib/ceph/osd/ceph-0/activate.monmap --keyfile - --osd-data /var/lib/ceph/osd/ceph-0/ --osd-uuid d9637751-9b7f-42d6-b9d8-2718039162b5 --setuser ceph --setgroup ceph
[ceph-node1][WARNIN] stderr: 2023-05-29T14:48:37.374+0800 7ff803100080 -1 bluestore(/var/lib/ceph/osd/ceph-0/) _read_fsid unparsable uuid
[ceph-node1][WARNIN] --> ceph-volume lvm prepare successful for: /dev/vdb
[ceph-node1][WARNIN] Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-0
[ceph-node1][WARNIN] Running command: /usr/bin/ceph-bluestore-tool --cluster=ceph prime-osd-dir --dev /dev/ceph-ac4ab09d-3655-4447-a0e3-d0d79f3b326a/osd-block-d9637751-9b7f-42d6-b9d8-2718039162b5 --path /var/lib/ceph/osd/ceph-0 --no-mon-config
[ceph-node1][WARNIN] Running command: /usr/bin/ln -snf /dev/ceph-ac4ab09d-3655-4447-a0e3-d0d79f3b326a/osd-block-d9637751-9b7f-42d6-b9d8-2718039162b5 /var/lib/ceph/osd/ceph-0/block
[ceph-node1][WARNIN] Running command: /usr/bin/chown -h ceph:ceph /var/lib/ceph/osd/ceph-0/block
[ceph-node1][WARNIN] Running command: /usr/bin/chown -R ceph:ceph /dev/dm-0
[ceph-node1][WARNIN] Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-0
[ceph-node1][WARNIN] Running command: /usr/bin/systemctl enable ceph-volume@lvm-0-d9637751-9b7f-42d6-b9d8-2718039162b5
[ceph-node1][WARNIN] stderr: Created symlink /etc/systemd/system/multi-user.target.wants/ceph-volume@lvm-0-d9637751-9b7f-42d6-b9d8-2718039162b5.service → /lib/systemd/system/ceph-volume@.service.
[ceph-node1][WARNIN] Running command: /usr/bin/systemctl enable --runtime ceph-osd@0
[ceph-node1][WARNIN] stderr: Created symlink /run/systemd/system/ceph-osd.target.wants/ceph-osd@0.service → /lib/systemd/system/ceph-osd@.service.
[ceph-node1][WARNIN] Running command: /usr/bin/systemctl start ceph-osd@0
[ceph-node1][WARNIN] --> ceph-volume lvm activate successful for osd ID: 0
[ceph-node1][WARNIN] --> ceph-volume lvm create successful for: /dev/vdb
[ceph-node1][INFO ] checking OSD status...
[ceph-node1][INFO ] Running command: sudo /bin/ceph --cluster=ceph osd stat --format=json
[ceph-node1][WARNIN] there is 1 OSD down
[ceph_deploy.osd][DEBUG ] Host ceph-node1 is now ready for osd use.

......

The full set of disks to add is listed below (a loop version follows the list):

#ceph-node1:
ceph-deploy osd create ceph-node1 --data /dev/vdb
ceph-deploy osd create ceph-node1 --data /dev/vdc
ceph-deploy osd create ceph-node1 --data /dev/vdd
ceph-deploy osd create ceph-node1 --data /dev/vde
ceph-deploy osd create ceph-node1 --data /dev/vdf


#ceph-node2:
ceph-deploy osd create ceph-node2 --data /dev/vdb
ceph-deploy osd create ceph-node2 --data /dev/vdc
ceph-deploy osd create ceph-node2 --data /dev/vdd
ceph-deploy osd create ceph-node2 --data /dev/vde
ceph-deploy osd create ceph-node2 --data /dev/vdf

#ceph-node3:
ceph-deploy osd create ceph-node3 --data /dev/vdb
ceph-deploy osd create ceph-node3 --data /dev/vdc
ceph-deploy osd create ceph-node3 --data /dev/vdd
ceph-deploy osd create ceph-node3 --data /dev/vde
ceph-deploy osd create ceph-node3 --data /dev/vdf
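
The same commands expressed as a loop (a sketch, assuming every node contributes vdb through vdf):

for node in ceph-node1 ceph-node2 ceph-node3; do
  for dev in /dev/vdb /dev/vdc /dev/vdd /dev/vde /dev/vdf; do
    ceph-deploy osd create $node --data $dev
  done
done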

After all the disks have been added, check the cluster again with ceph -s.

On the OSD nodes, check the running ceph-osd processes (ps -ef | grep ceph-osd).

Note: record which OSD id was assigned to each physical disk as you add them; this makes troubleshooting a failed disk much easier later. The commands below are one way to capture that mapping.
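
Two commands that help record the disk-to-OSD mapping (ceph osd tree from the deploy host, ceph-volume on each OSD node):

ceph osd tree                 # OSD ids grouped by host
sudo ceph-volume lvm list     # on each node: shows which /dev/vdX backs each OSD id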

5. Verify the Ceph Cluster

5.1 Remove an OSD from RADOS

This step is included for reference; it is not needed often.

An OSD in a Ceph cluster is a dedicated daemon running on a node host and corresponds to one physical disk device.

When an OSD device fails, or an administrator needs to remove a specific OSD for maintenance, the related daemon must be stopped first and the device removed afterwards. For Luminous and later releases, the stop and removal commands are:

Mark the OSD out:  ceph osd out {osd-num}
Stop the daemon:   sudo systemctl stop ceph-osd@{osd-num}
Purge the device:  ceph osd purge {id} --yes-i-really-mean-it

If the OSD is recorded in the ceph.conf configuration file, the administrator should also remove that entry manually after deleting the OSD.

# Remove the device from the CRUSH map
ceph osd crush remove {name}
# Remove the OSD's authentication key
ceph auth del osd.{osd-num}
# Finally remove the OSD itself
ceph osd rm {osd-num}
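
Put together, removing a hypothetical osd.14 that lives on ceph-node3 would look like this (a sketch; the id and host are examples only):

ceph osd out 14
ssh ceph-node3 "sudo systemctl stop ceph-osd@14"
ceph osd purge 14 --yes-i-really-mean-it
# pre-Luminous style, step by step:
# ceph osd crush remove osd.14 && ceph auth del osd.14 && ceph osd rm 14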

5.2 Test Uploading and Downloading Data

To store or retrieve data with the rados command, the client first connects to a pool in the RADOS cluster; the object name is then mapped to a placement group and to OSDs by the relevant CRUSH rule. To exercise the cluster's data path, first create a test pool named mypool with 32 PGs:

Create the pool

root@ceph-mon1[20:03:36]~ #:ceph osd pool create mypool 32 32 # create the pool
pool 'mypool' created

# Verify (alternatively: rados lspools)
root@ceph-mon1[20:44:54]~ #:ceph osd pool ls
device_health_metrics
mypool

ceph pg ls-by-pool mypool | awk '{print $1,$2,$15}' # check the PGs and their acting OSD sets

root@ceph-mon1[20:45:22]~ #:ceph pg ls-by-pool mypool | awk '{print $1,$2,$15}'
PG OBJECTS ACTING
2.0 0 [8,10,3]p8
2.1 0 [2,13,9]p2
2.2 0 [5,1,10]p5
2.3 0 [5,4,14]p5
2.4 0 [1,12,6]p1
2.5 0 [12,4,8]p12
2.6 0 [1,13,9]p1
2.7 0 [6,13,2]p6
2.8 0 [8,13,0]p8
2.9 0 [4,9,12]p4
2.a 0 [11,4,8]p11
2.b 0 [13,7,4]p13
2.c 0 [12,0,5]p12
2.d 0 [12,8,3]p12
2.e 0 [2,13,8]p2
2.f 0 [11,8,0]p11
2.10 0 [10,1,8]p10
2.11 0 [6,1,12]p6
2.12 0 [10,3,9]p10
2.13 0 [13,6,3]p13
2.14 0 [8,13,0]p8
2.15 0 [10,1,5]p10
2.16 0 [8,12,1]p8
2.17 0 [6,14,2]p6
2.18 0 [13,9,2]p13
2.19 0 [3,6,13]p3
2.1a 0 [6,14,2]p6
2.1b 0 [11,7,3]p11
2.1c 0 [10,7,1]p10
2.1d 0 [10,7,0]p10
2.1e 0 [3,13,5]p3
2.1f 0 [4,7,14]p4

* NOTE: afterwards

Upload a file

sudo rados put msg1 /var/log/syslog --pool=mypool
# Upload the file to mypool as an object named msg1

rados ls --pool=mypool # list objects
root@ceph-mon1[20:58:42]~ #:rados ls --pool=mypool
msg1

# The exact placement of an object in the pool can be queried:
root@ceph-mon1[21:00:22]~ #:ceph osd map mypool msg1
osdmap e86 pool 'mypool' (2) object 'msg1' -> pg 2.c833d430 (2.10) -> up ([10,1,8], p10) acting ([10,1,8], p10)

The output shows that the object msg1 hashes to c833d430 and maps to PG 2.10, i.e., PG 0x10 in the pool with id 2 (mypool). The up and acting set is [10,1,8]: OSD 10 is the primary, and the data is kept as three replicas on OSDs 10, 1, and 8, which are chosen by the CRUSH algorithm.

Download a file

# Download msg1 from mypool and save it under a local path
sudo rados get msg1 --pool=mypool /opt/my.txt

xceo@ceph-mon1:~/ceph-cluster$ sudo rados get msg1 --pool=mypool /opt/my.txt
xceo@ceph-mon1:~/ceph-cluster$ ll -ls /opt/my.txt
860 -rw-r--r-- 1 root root 879881 May 30 09:10 /opt/my.txt

Overwrite the object

# Upload /etc/passwd into mypool as msg1, overwriting the previous contents
xceo@ceph-mon1:~/ceph-cluster$ sudo rados put msg1 /etc/passwd --pool=mypool

# Download msg1 again and save it locally as /opt/my3.txt
xceo@ceph-mon1:~/ceph-cluster$ sudo rados get msg1 --pool=mypool /opt/my3.txt
xceo@ceph-mon1:~/ceph-cluster$ head /opt/my3.txt
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
bin:x:2:2:bin:/bin:/usr/sbin/nologin
sys:x:3:3:sys:/dev:/usr/sbin/nologin
sync:x:4:65534:sync:/bin:/bin/sync
games:x:5:60:games:/usr/games:/usr/sbin/nologin
man:x:6:12:man:/var/cache/man:/usr/sbin/nologin
lp:x:7:7:lp:/var/spool/lpd:/usr/sbin/nologin
mail:x:8:8:mail:/var/mail:/usr/sbin/nologin
news:x:9:9:news:/var/spool/news:/usr/sbin/nologin

Delete the object

sudo rados rm msg1 --pool=mypool
rados ls --pool=mypool

6. Scale Out Ceph for High Availability

This mainly means adding mon and mgr nodes so that the cluster stays available when individual daemons fail.

6.1 Add ceph-mon Nodes

Install ceph-mon on mon2 and mon3. ceph-mon has a built-in quorum/election mechanism for high availability, so the number of mon nodes should normally be odd.

# Install the matching version on Ubuntu
apt install ceph-mon=16.2.13-1focal

# From the ceph-deploy host, add the new mon nodes (one at a time)
ceph-deploy mon add ceph-mon2
ceph-deploy mon add ceph-mon3

# Adding ceph-mon2:

xceo@ceph-mon1:~/ceph-cluster$ ceph-deploy mon add ceph-mon2
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/xceo/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (2.0.1): /usr/bin/ceph-deploy mon add ceph-mon2
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] overwrite_conf : False
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] subcommand : add
[ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf object at 0x7f4eea93da90>
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy.cli][INFO ] func : <function mon at 0x7f4eea9a8700>
[ceph_deploy.cli][INFO ] address : None
[ceph_deploy.cli][INFO ] mon : ['ceph-mon2']
[ceph_deploy.mon][INFO ] ensuring configuration of new mon host: ceph-mon2
[ceph_deploy.admin][DEBUG ] Pushing admin keys and conf to ceph-mon2
[ceph-mon2][DEBUG ] connection detected need for sudo
[ceph-mon2][DEBUG ] connected to host: ceph-mon2
[ceph_deploy.mon][DEBUG ] Adding mon to cluster ceph, host ceph-mon2
[ceph_deploy.mon][DEBUG ] using mon address by resolving host: 10.1.0.40
[ceph_deploy.mon][DEBUG ] detecting platform for host ceph-mon2 ...
[ceph-mon2][DEBUG ] connection detected need for sudo
[ceph-mon2][DEBUG ] connected to host: ceph-mon2
[ceph_deploy.mon][INFO ] distro info: ubuntu 20.04 focal
[ceph-mon2][DEBUG ] determining if provided host has same hostname in remote
[ceph-mon2][DEBUG ] adding mon to ceph-mon2
[ceph-mon2][DEBUG ] checking for done path: /var/lib/ceph/mon/ceph-ceph-mon2/done
[ceph-mon2][INFO ] Running command: sudo systemctl enable ceph.target
[ceph-mon2][INFO ] Running command: sudo systemctl enable ceph-mon@ceph-mon2
[ceph-mon2][INFO ] Running command: sudo systemctl start ceph-mon@ceph-mon2
[ceph-mon2][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-mon2.asok mon_status
[ceph-mon2][WARNIN] ceph-mon2 is not defined in `mon initial members`
[ceph-mon2][WARNIN] monitor ceph-mon2 does not exist in monmap
[ceph-mon2][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-mon2.asok mon_status
[ceph-mon2][DEBUG ] ********************************************************************************
[ceph-mon2][DEBUG ] status for monitor: mon.ceph-mon2
[ceph-mon2][DEBUG ] {
[ceph-mon2][DEBUG ] "election_epoch": 0,
[ceph-mon2][DEBUG ] "extra_probe_peers": [],
[ceph-mon2][DEBUG ] "feature_map": {
[ceph-mon2][DEBUG ] "mon": [
[ceph-mon2][DEBUG ] {
[ceph-mon2][DEBUG ] "features": "0x3f01cfbdfffdffff",
[ceph-mon2][DEBUG ] "num": 1,
[ceph-mon2][DEBUG ] "release": "luminous"
[ceph-mon2][DEBUG ] }
[ceph-mon2][DEBUG ] ]
[ceph-mon2][DEBUG ] },
[ceph-mon2][DEBUG ] "features": {
[ceph-mon2][DEBUG ] "quorum_con": "0",
[ceph-mon2][DEBUG ] "quorum_mon": [],
[ceph-mon2][DEBUG ] "required_con": "2449958197560098820",
[ceph-mon2][DEBUG ] "required_mon": [
[ceph-mon2][DEBUG ] "kraken",
[ceph-mon2][DEBUG ] "luminous",
[ceph-mon2][DEBUG ] "mimic",
[ceph-mon2][DEBUG ] "osdmap-prune",
[ceph-mon2][DEBUG ] "nautilus",
[ceph-mon2][DEBUG ] "octopus",
[ceph-mon2][DEBUG ] "pacific",
[ceph-mon2][DEBUG ] "elector-pinging"
[ceph-mon2][DEBUG ] ]
[ceph-mon2][DEBUG ] },
[ceph-mon2][DEBUG ] "monmap": {
[ceph-mon2][DEBUG ] "created": "2023-05-29T04:19:43.502249Z",
[ceph-mon2][DEBUG ] "disallowed_leaders: ": "",
[ceph-mon2][DEBUG ] "election_strategy": 1,
[ceph-mon2][DEBUG ] "epoch": 1,
[ceph-mon2][DEBUG ] "features": {
[ceph-mon2][DEBUG ] "optional": [],
[ceph-mon2][DEBUG ] "persistent": [
[ceph-mon2][DEBUG ] "kraken",
[ceph-mon2][DEBUG ] "luminous",
[ceph-mon2][DEBUG ] "mimic",
[ceph-mon2][DEBUG ] "osdmap-prune",
[ceph-mon2][DEBUG ] "nautilus",
[ceph-mon2][DEBUG ] "octopus",
[ceph-mon2][DEBUG ] "pacific",
[ceph-mon2][DEBUG ] "elector-pinging"
[ceph-mon2][DEBUG ] ]
[ceph-mon2][DEBUG ] },
[ceph-mon2][DEBUG ] "fsid": "62be32df-9cb4-474f-8727-d5c4bbceaf97",
[ceph-mon2][DEBUG ] "min_mon_release": 16,
[ceph-mon2][DEBUG ] "min_mon_release_name": "pacific",
[ceph-mon2][DEBUG ] "modified": "2023-05-29T04:19:43.502249Z",
[ceph-mon2][DEBUG ] "mons": [
[ceph-mon2][DEBUG ] {
[ceph-mon2][DEBUG ] "addr": "10.1.0.39:6789/0",
[ceph-mon2][DEBUG ] "crush_location": "{}",
[ceph-mon2][DEBUG ] "name": "ceph-mon1",
[ceph-mon2][DEBUG ] "priority": 0,
[ceph-mon2][DEBUG ] "public_addr": "10.1.0.39:6789/0",
[ceph-mon2][DEBUG ] "public_addrs": {
[ceph-mon2][DEBUG ] "addrvec": [
[ceph-mon2][DEBUG ] {
[ceph-mon2][DEBUG ] "addr": "10.1.0.39:3300",
[ceph-mon2][DEBUG ] "nonce": 0,
[ceph-mon2][DEBUG ] "type": "v2"
[ceph-mon2][DEBUG ] },
[ceph-mon2][DEBUG ] {
[ceph-mon2][DEBUG ] "addr": "10.1.0.39:6789",
[ceph-mon2][DEBUG ] "nonce": 0,
[ceph-mon2][DEBUG ] "type": "v1"
[ceph-mon2][DEBUG ] }
[ceph-mon2][DEBUG ] ]
[ceph-mon2][DEBUG ] },
[ceph-mon2][DEBUG ] "rank": 0,
[ceph-mon2][DEBUG ] "weight": 0
[ceph-mon2][DEBUG ] }
[ceph-mon2][DEBUG ] ],
[ceph-mon2][DEBUG ] "removed_ranks: ": "",
[ceph-mon2][DEBUG ] "stretch_mode": false,
[ceph-mon2][DEBUG ] "tiebreaker_mon": ""
[ceph-mon2][DEBUG ] },
[ceph-mon2][DEBUG ] "name": "ceph-mon2",
[ceph-mon2][DEBUG ] "outside_quorum": [],
[ceph-mon2][DEBUG ] "quorum": [],
[ceph-mon2][DEBUG ] "rank": -1,
[ceph-mon2][DEBUG ] "state": "probing",
[ceph-mon2][DEBUG ] "stretch_mode": false,
[ceph-mon2][DEBUG ] "sync_provider": []
[ceph-mon2][DEBUG ] }
[ceph-mon2][DEBUG ] ********************************************************************************
[ceph-mon2][INFO ] monitor: mon.ceph-mon2 is currently at the state of probing
#Add ceph-mon3
xceo@ceph-mon1:~/ceph-cluster$ ceph-deploy mon add ceph-mon3
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/xceo/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (2.0.1): /usr/bin/ceph-deploy mon add ceph-mon3
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] overwrite_conf : False
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] subcommand : add
[ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf object at 0x7fd7bb83da90>
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy.cli][INFO ] func : <function mon at 0x7fd7bb8a8700>
[ceph_deploy.cli][INFO ] address : None
[ceph_deploy.cli][INFO ] mon : ['ceph-mon3']
[ceph_deploy.mon][INFO ] ensuring configuration of new mon host: ceph-mon3
[ceph_deploy.admin][DEBUG ] Pushing admin keys and conf to ceph-mon3
[ceph-mon3][DEBUG ] connection detected need for sudo
[ceph-mon3][DEBUG ] connected to host: ceph-mon3
[ceph_deploy.mon][DEBUG ] Adding mon to cluster ceph, host ceph-mon3
[ceph_deploy.mon][DEBUG ] using mon address by resolving host: 10.1.0.41
[ceph_deploy.mon][DEBUG ] detecting platform for host ceph-mon3 ...
[ceph-mon3][DEBUG ] connection detected need for sudo
[ceph-mon3][DEBUG ] connected to host: ceph-mon3
[ceph_deploy.mon][INFO ] distro info: ubuntu 20.04 focal
[ceph-mon3][DEBUG ] determining if provided host has same hostname in remote
[ceph-mon3][DEBUG ] adding mon to ceph-mon3
[ceph-mon3][DEBUG ] checking for done path: /var/lib/ceph/mon/ceph-ceph-mon3/done
[ceph-mon3][INFO ] Running command: sudo systemctl enable ceph.target
[ceph-mon3][INFO ] Running command: sudo systemctl enable ceph-mon@ceph-mon3
[ceph-mon3][INFO ] Running command: sudo systemctl start ceph-mon@ceph-mon3
[ceph-mon3][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-mon3.asok mon_status
[ceph-mon3][WARNIN] ceph-mon3 is not defined in `mon initial members`
[ceph-mon3][WARNIN] monitor ceph-mon3 does not exist in monmap
[ceph-mon3][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-mon3.asok mon_status
[ceph-mon3][DEBUG ] ********************************************************************************
[ceph-mon3][DEBUG ] status for monitor: mon.ceph-mon3
[ceph-mon3][DEBUG ] {
[ceph-mon3][DEBUG ] "election_epoch": 0,
[ceph-mon3][DEBUG ] "extra_probe_peers": [],
[ceph-mon3][DEBUG ] "feature_map": {
[ceph-mon3][DEBUG ] "mon": [
[ceph-mon3][DEBUG ] {
[ceph-mon3][DEBUG ] "features": "0x3f01cfbdfffdffff",
[ceph-mon3][DEBUG ] "num": 1,
[ceph-mon3][DEBUG ] "release": "luminous"
[ceph-mon3][DEBUG ] }
[ceph-mon3][DEBUG ] ]
[ceph-mon3][DEBUG ] },
[ceph-mon3][DEBUG ] "features": {
[ceph-mon3][DEBUG ] "quorum_con": "0",
[ceph-mon3][DEBUG ] "quorum_mon": [],
[ceph-mon3][DEBUG ] "required_con": "2449958197560098820",
[ceph-mon3][DEBUG ] "required_mon": [
[ceph-mon3][DEBUG ] "kraken",
[ceph-mon3][DEBUG ] "luminous",
[ceph-mon3][DEBUG ] "mimic",
[ceph-mon3][DEBUG ] "osdmap-prune",
[ceph-mon3][DEBUG ] "nautilus",
[ceph-mon3][DEBUG ] "octopus",
[ceph-mon3][DEBUG ] "pacific",
[ceph-mon3][DEBUG ] "elector-pinging"
[ceph-mon3][DEBUG ] ]
[ceph-mon3][DEBUG ] },
[ceph-mon3][DEBUG ] "monmap": {
[ceph-mon3][DEBUG ] "created": "2023-05-29T04:19:43.502249Z",
[ceph-mon3][DEBUG ] "disallowed_leaders: ": "",
[ceph-mon3][DEBUG ] "election_strategy": 1,
[ceph-mon3][DEBUG ] "epoch": 2,
[ceph-mon3][DEBUG ] "features": {
[ceph-mon3][DEBUG ] "optional": [],
[ceph-mon3][DEBUG ] "persistent": [
[ceph-mon3][DEBUG ] "kraken",
[ceph-mon3][DEBUG ] "luminous",
[ceph-mon3][DEBUG ] "mimic",
[ceph-mon3][DEBUG ] "osdmap-prune",
[ceph-mon3][DEBUG ] "nautilus",
[ceph-mon3][DEBUG ] "octopus",
[ceph-mon3][DEBUG ] "pacific",
[ceph-mon3][DEBUG ] "elector-pinging"
[ceph-mon3][DEBUG ] ]
[ceph-mon3][DEBUG ] },
[ceph-mon3][DEBUG ] "fsid": "62be32df-9cb4-474f-8727-d5c4bbceaf97",
[ceph-mon3][DEBUG ] "min_mon_release": 16,
[ceph-mon3][DEBUG ] "min_mon_release_name": "pacific",
[ceph-mon3][DEBUG ] "modified": "2023-05-30T07:32:19.726912Z",
[ceph-mon3][DEBUG ] "mons": [
[ceph-mon3][DEBUG ] {
[ceph-mon3][DEBUG ] "addr": "10.1.0.39:6789/0",
[ceph-mon3][DEBUG ] "crush_location": "{}",
[ceph-mon3][DEBUG ] "name": "ceph-mon1",
[ceph-mon3][DEBUG ] "priority": 0,
[ceph-mon3][DEBUG ] "public_addr": "10.1.0.39:6789/0",
[ceph-mon3][DEBUG ] "public_addrs": {
[ceph-mon3][DEBUG ] "addrvec": [
[ceph-mon3][DEBUG ] {
[ceph-mon3][DEBUG ] "addr": "10.1.0.39:3300",
[ceph-mon3][DEBUG ] "nonce": 0,
[ceph-mon3][DEBUG ] "type": "v2"
[ceph-mon3][DEBUG ] },
[ceph-mon3][DEBUG ] {
[ceph-mon3][DEBUG ] "addr": "10.1.0.39:6789",
[ceph-mon3][DEBUG ] "nonce": 0,
[ceph-mon3][DEBUG ] "type": "v1"
[ceph-mon3][DEBUG ] }
[ceph-mon3][DEBUG ] ]
[ceph-mon3][DEBUG ] },
[ceph-mon3][DEBUG ] "rank": 0,
[ceph-mon3][DEBUG ] "weight": 0
[ceph-mon3][DEBUG ] },
[ceph-mon3][DEBUG ] {
[ceph-mon3][DEBUG ] "addr": "10.1.0.40:6789/0",
[ceph-mon3][DEBUG ] "crush_location": "{}",
[ceph-mon3][DEBUG ] "name": "ceph-mon2",
[ceph-mon3][DEBUG ] "priority": 0,
[ceph-mon3][DEBUG ] "public_addr": "10.1.0.40:6789/0",
[ceph-mon3][DEBUG ] "public_addrs": {
[ceph-mon3][DEBUG ] "addrvec": [
[ceph-mon3][DEBUG ] {
[ceph-mon3][DEBUG ] "addr": "10.1.0.40:3300",
[ceph-mon3][DEBUG ] "nonce": 0,
[ceph-mon3][DEBUG ] "type": "v2"
[ceph-mon3][DEBUG ] },
[ceph-mon3][DEBUG ] {
[ceph-mon3][DEBUG ] "addr": "10.1.0.40:6789",
[ceph-mon3][DEBUG ] "nonce": 0,
[ceph-mon3][DEBUG ] "type": "v1"
[ceph-mon3][DEBUG ] }
[ceph-mon3][DEBUG ] ]
[ceph-mon3][DEBUG ] },
[ceph-mon3][DEBUG ] "rank": 1,
[ceph-mon3][DEBUG ] "weight": 0
[ceph-mon3][DEBUG ] }
[ceph-mon3][DEBUG ] ],
[ceph-mon3][DEBUG ] "removed_ranks: ": "",
[ceph-mon3][DEBUG ] "stretch_mode": false,
[ceph-mon3][DEBUG ] "tiebreaker_mon": ""
[ceph-mon3][DEBUG ] },
[ceph-mon3][DEBUG ] "name": "ceph-mon3",
[ceph-mon3][DEBUG ] "outside_quorum": [],
[ceph-mon3][DEBUG ] "quorum": [],
[ceph-mon3][DEBUG ] "rank": -1,
[ceph-mon3][DEBUG ] "state": "probing",
[ceph-mon3][DEBUG ] "stretch_mode": false,
[ceph-mon3][DEBUG ] "sync_provider": []
[ceph-mon3][DEBUG ] }
[ceph-mon3][DEBUG ] ********************************************************************************
[ceph-mon3][INFO ] monitor: mon.ceph-mon3 is currently at the state of probing

Verify the ceph-mon status of the cluster:

ceph quorum_status --format json-pretty
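Besides quorum_status, the generic mon query commands below can also be used to quickly confirm that all three mons have joined the quorum (output will vary with the actual cluster):

#Brief mon status and current quorum members
ceph mon stat

#Detailed monmap
ceph mon dump

#Overall cluster status; the mon line should show 3 daemons with all members in quorum
ceph -s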

The following problem occurred while adding a mon node:

xceo@ceph-mon1:~/ceph-cluster$ ceph-deploy mon add ceph-mon3
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/xceo/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (2.0.1): /usr/bin/ceph-deploy mon add ceph-mon3
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] overwrite_conf : False
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] subcommand : add
[ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf object at 0x7f068934ca90>
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy.cli][INFO ] func : <function mon at 0x7f06893b7700>
[ceph_deploy.cli][INFO ] address : None
[ceph_deploy.cli][INFO ] mon : ['ceph-mon3']
[ceph_deploy.mon][INFO ] ensuring configuration of new mon host: ceph-mon3
[ceph_deploy.admin][DEBUG ] Pushing admin keys and conf to ceph-mon3
[ceph-mon3][DEBUG ] connection detected need for sudo
[ceph-mon3][DEBUG ] connected to host: ceph-mon3
[ceph_deploy.mon][DEBUG ] Adding mon to cluster ceph, host ceph-mon3
[ceph_deploy.mon][DEBUG ] using mon address by resolving host: 10.1.0.41
[ceph_deploy.mon][DEBUG ] detecting platform for host ceph-mon3 ...
[ceph-mon3][DEBUG ] connection detected need for sudo
[ceph-mon3][DEBUG ] connected to host: ceph-mon3
[ceph_deploy.mon][INFO ] distro info: ubuntu 20.04 focal
[ceph-mon3][DEBUG ] determining if provided host has same hostname in remote
[ceph-mon3][DEBUG ] adding mon to ceph-mon3
[ceph-mon3][DEBUG ] checking for done path: /var/lib/ceph/mon/ceph-ceph-mon3/done
[ceph-mon3][DEBUG ] done path does not exist: /var/lib/ceph/mon/ceph-ceph-mon3/done
[ceph-mon3][INFO ] creating keyring file: /var/lib/ceph/tmp/ceph-ceph-mon3.mon.keyring
[ceph-mon3][INFO ] Running command: sudo ceph --cluster ceph mon getmap -o /var/lib/ceph/tmp/ceph.ceph-mon3.monmap
[ceph-mon3][WARNIN] got monmap epoch 1
[ceph-mon3][INFO ] Running command: sudo ceph-mon --cluster ceph --mkfs -i ceph-mon3 --monmap /var/lib/ceph/tmp/ceph.ceph-mon3.monmap --keyring /var/lib/ceph/tmp/ceph-ceph-mon3.mon.keyring --setuser 64045 --setgroup 64045
[ceph-mon3][INFO ] unlinking keyring file /var/lib/ceph/tmp/ceph-ceph-mon3.mon.keyring
[ceph-mon3][INFO ] Running command: sudo systemctl enable ceph.target
[ceph-mon3][INFO ] Running command: sudo systemctl enable ceph-mon@ceph-mon3
[ceph-mon3][WARNIN] Created symlink /etc/systemd/system/ceph-mon.target.wants/ceph-mon@ceph-mon3.service → /lib/systemd/system/ceph-mon@.service.
[ceph-mon3][INFO ] Running command: sudo systemctl start ceph-mon@ceph-mon3
[ceph-mon3][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-mon3.asok mon_status
[ceph-mon3][WARNIN] ceph-mon3 is not defined in `mon initial members`
[ceph-mon3][WARNIN] monitor ceph-mon3 does not exist in monmap
[ceph-mon3][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-mon3.asok mon_status
[ceph-mon3][DEBUG ] ********************************************************************************
[ceph-mon3][DEBUG ] status for monitor: mon.ceph-mon3
[ceph-mon3][DEBUG ] {
[ceph-mon3][DEBUG ] "election_epoch": 0,
[ceph-mon3][DEBUG ] "extra_probe_peers": [],
[ceph-mon3][DEBUG ] "feature_map": {},
[ceph-mon3][DEBUG ] "features": {
[ceph-mon3][DEBUG ] "quorum_con": "0",
[ceph-mon3][DEBUG ] "quorum_mon": [],
[ceph-mon3][DEBUG ] "required_con": "2449958197560098820",
[ceph-mon3][DEBUG ] "required_mon": [
[ceph-mon3][DEBUG ] "kraken",
[ceph-mon3][DEBUG ] "luminous",
[ceph-mon3][DEBUG ] "mimic",
[ceph-mon3][DEBUG ] "osdmap-prune",
[ceph-mon3][DEBUG ] "nautilus",
[ceph-mon3][DEBUG ] "octopus",
[ceph-mon3][DEBUG ] "pacific",
[ceph-mon3][DEBUG ] "elector-pinging"
[ceph-mon3][DEBUG ] ]
[ceph-mon3][DEBUG ] },
[ceph-mon3][DEBUG ] "monmap": {
[ceph-mon3][DEBUG ] "created": "2023-05-29T04:19:43.502249Z",
[ceph-mon3][DEBUG ] "disallowed_leaders: ": "",
[ceph-mon3][DEBUG ] "election_strategy": 1,
[ceph-mon3][DEBUG ] "epoch": 0,
[ceph-mon3][DEBUG ] "features": {
[ceph-mon3][DEBUG ] "optional": [],
[ceph-mon3][DEBUG ] "persistent": [
[ceph-mon3][DEBUG ] "kraken",
[ceph-mon3][DEBUG ] "luminous",
[ceph-mon3][DEBUG ] "mimic",
[ceph-mon3][DEBUG ] "osdmap-prune",
[ceph-mon3][DEBUG ] "nautilus",
[ceph-mon3][DEBUG ] "octopus",
[ceph-mon3][DEBUG ] "pacific",
[ceph-mon3][DEBUG ] "elector-pinging"
[ceph-mon3][DEBUG ] ]
[ceph-mon3][DEBUG ] },
[ceph-mon3][DEBUG ] "fsid": "62be32df-9cb4-474f-8727-d5c4bbceaf97",
[ceph-mon3][DEBUG ] "min_mon_release": 16,
[ceph-mon3][DEBUG ] "min_mon_release_name": "pacific",
[ceph-mon3][DEBUG ] "modified": "2023-05-29T04:19:43.502249Z",
[ceph-mon3][DEBUG ] "mons": [
[ceph-mon3][DEBUG ] {
[ceph-mon3][DEBUG ] "addr": "10.1.0.39:6789/0",
[ceph-mon3][DEBUG ] "crush_location": "{}",
[ceph-mon3][DEBUG ] "name": "ceph-mon1",
[ceph-mon3][DEBUG ] "priority": 0,
[ceph-mon3][DEBUG ] "public_addr": "10.1.0.39:6789/0",
[ceph-mon3][DEBUG ] "public_addrs": {
[ceph-mon3][DEBUG ] "addrvec": [
[ceph-mon3][DEBUG ] {
[ceph-mon3][DEBUG ] "addr": "10.1.0.39:3300",
[ceph-mon3][DEBUG ] "nonce": 0,
[ceph-mon3][DEBUG ] "type": "v2"
[ceph-mon3][DEBUG ] },
[ceph-mon3][DEBUG ] {
[ceph-mon3][DEBUG ] "addr": "10.1.0.39:6789",
[ceph-mon3][DEBUG ] "nonce": 0,
[ceph-mon3][DEBUG ] "type": "v1"
[ceph-mon3][DEBUG ] }
[ceph-mon3][DEBUG ] ]
[ceph-mon3][DEBUG ] },
[ceph-mon3][DEBUG ] "rank": 0,
[ceph-mon3][DEBUG ] "weight": 0
[ceph-mon3][DEBUG ] }
[ceph-mon3][DEBUG ] ],
[ceph-mon3][DEBUG ] "removed_ranks: ": "",
[ceph-mon3][DEBUG ] "stretch_mode": false,
[ceph-mon3][DEBUG ] "tiebreaker_mon": ""
[ceph-mon3][DEBUG ] },
[ceph-mon3][DEBUG ] "name": "ceph-mon3",
[ceph-mon3][DEBUG ] "outside_quorum": [],
[ceph-mon3][DEBUG ] "quorum": [],
[ceph-mon3][DEBUG ] "rank": -1,
[ceph-mon3][DEBUG ] "state": "???",
[ceph-mon3][DEBUG ] "stretch_mode": false,
[ceph-mon3][DEBUG ] "sync_provider": []
[ceph-mon3][DEBUG ] }
[ceph-mon3][DEBUG ] ********************************************************************************
[ceph-mon3][INFO ] monitor: mon.ceph-mon3 is currently at the state of ???


#The new mon is stuck in the ??? state and never joins the quorum

Check the logs (e.g. via journalctl -u ceph-mon@<hostname>, or the mon log files under /var/log/ceph/):

May 30 15:30:45 ceph-mon2 ceph-mon[3434060]: 2023-05-30T15:30:45.027+0800 7f3d9ff07700 -1 mon.ceph-node2@-1(probing) e0 handle_auth_bad_method hmm, they didn't like 2 result (13) Permission denied
May 30 15:31:00 ceph-mon2 ceph-mon[3434060]: 2023-05-30T15:31:00.044+0800 7f3d9ff07700 -1 mon.ceph-node2@-1(probing) e0 handle_auth_bad_method hmm, they didn't like 2 result (13) Permission denied
May 30 15:31:15 ceph-mon2 ceph-mon[3434060]: 2023-05-30T15:31:15.057+0800 7f3d9ff07700 -1 mon.ceph-node2@-1(probing) e0 handle_auth_bad_method hmm, they didn't like 2 result (13) Permission denied
May 30 15:31:30 ceph-mon2 ceph-mon[3434060]: 2023-05-30T15:31:30.073+0800 7f3d9ff07700 -1 mon.ceph-node2@-1(probing) e0 handle_auth_bad_method hmm, they didn't like 2 result (13) Permission denied
May 30 15:31:45 ceph-mon2 ceph-mon[3434060]: 2023-05-30T15:31:45.090+0800 7f3d9ff07700 -1 mon.ceph-node2@-1(probing) e0 handle_auth_bad_method hmm, they didn't like 2 result (13) Permission denied
May 31 11:20:25 ceph-mon2 systemd[1]: Started Ceph cluster monitor daemon.
May 31 11:20:26 ceph-mon2 ceph-mon[2463379]: 2023-05-31T11:20:26.133+0800 7f2df3583700 -1 Processor -- bind unable to bind to v2:10.1.0.40:3300/0: (98) Address already in use
May 31 11:20:26 ceph-mon2 ceph-mon[2463379]: 2023-05-31T11:20:26.133+0800 7f2df3583700 -1 Processor -- bind was unable to bind. Trying again in 5 seconds
May 31 11:20:31 ceph-mon2 ceph-mon[2463379]: 2023-05-31T11:20:31.133+0800 7f2df3583700 -1 Processor -- bind unable to bind to v2:10.1.0.40:3300/0: (98) Address already in use
May 31 11:20:31 ceph-mon2 ceph-mon[2463379]: 2023-05-31T11:20:31.133+0800 7f2df3583700 -1 Processor -- bind was unable to bind. Trying again in 5 seconds

Solution:

#From the logs above, a leftover mon service created under the old name (ceph-node2), ceph-mon@ceph-node2, is still running on this host. It holds ports 3300/6789 and uses stale identity data, so the new ceph-mon@ceph-mon2 cannot bind its ports and authentication fails. Stop and disable the leftover service:

root@ceph-mon2[15:31:21]/etc/systemd/system/ceph-mon.target.wants #:mv ceph-mon@ceph-node2.service ceph-mon@ceph-node2.service.bak
root@ceph-mon2[15:31:44]/etc/systemd/system/ceph-mon.target.wants #:systemctl disable ceph-mon@ceph-node2.service
root@ceph-mon2[15:31:49]/etc/systemd/system/ceph-mon.target.wants #:systemctl stop ceph-mon@ceph-node2.service
root@ceph-mon2[15:32:07]/etc/systemd/system/ceph-mon.target.wants #:ls
ceph-mon@ceph-mon2.service ceph-mon@ceph-node2.service.bak
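After cleaning up the leftover service, the correct mon instance can be restarted and verified roughly as follows (generic systemd / ss commands; adjust the unit name to the actual hostname):

#Restart and check the correct mon service on this node
systemctl restart ceph-mon@ceph-mon2
systemctl status ceph-mon@ceph-mon2

#Confirm ports 3300/6789 are now listened on by ceph-mon
ss -tlnp | grep -E '3300|6789'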

Problem

health: HEALTH_WARN

clock skew detected on mon.ceph-mon2, mon.ceph-mon3

This warning is caused by the cluster nodes' clocks being out of sync; Ceph is strict about clock skew between mon nodes (the allowed drift is controlled by mon_clock_drift_allowed, about 0.05 s by default), so consistent time synchronization must be configured on all nodes, as sketched below.
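A minimal sketch of time synchronization with chrony on all nodes (the NTP server below is only an example; replace it with an internal or preferred public NTP source):

#Install chrony on all nodes
apt install chrony -y

#Point chrony at an NTP source in /etc/chrony/chrony.conf, e.g.:
#pool ntp.aliyun.com iburst

systemctl restart chrony
systemctl enable chrony

#Verify synchronization status
chronyc sources -v
timedatectl status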

6.2 Extend the Ceph-Mgr Nodes

#Install ceph-mgr on the new mgr host (ceph-mgr2)
apt install ceph-mgr -y

# Push the config file to the ceph-mgr2 node and create the mgr
ceph-deploy admin ceph-mgr2
ceph-deploy mgr create ceph-mgr2
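Once deployed, the active/standby state of the mgr daemons can be verified on the ceph-deploy node (or any node holding the admin keyring); there should be one active and one standby mgr (generic commands, output will vary with the actual cluster):

#Overall cluster status; the mgr line under services should look like:
#mgr: ceph-mgr1(active), standbys: ceph-mgr2
ceph -s

#mgr status only
ceph mgr stat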