Ceph cluster: nodes offline and health check errors after a reboot

1. Errors appear after the reboot

[root@ct ~(keystone_admin)]# systemctl list-units --type=service|grep ceph
  ceph-crash.service                        loaded active running Ceph crash dump collector
● [email protected]                    loaded failed failed  Ceph cluster manager daemon
  [email protected]                       loaded active running Ceph cluster manager daemon
● [email protected]                    loaded failed failed  Ceph cluster monitor daemon
● [email protected]                    loaded failed failed  Ceph cluster monitor daemon
  [email protected]                       loaded active running Ceph cluster monitor daemon
  [email protected]                        loaded active running Ceph object storage daemon osd.0
● [email protected]                    loaded failed failed  Ceph object storage daemon osd.comp2
● [email protected]                       loaded failed failed  Ceph object storage daemon osd.ct
[root@ct ~(keystone_admin)]# systemctl reset-failed [email protected]
[root@ct ~(keystone_admin)]# systemctl reset-failed [email protected]
[root@ct ~(keystone_admin)]# systemctl reset-failed [email protected]
[root@ct ~(keystone_admin)]# systemctl reset-failed [email protected]
[root@ct ~(keystone_admin)]# systemctl reset-failed [email protected]
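
Note that systemctl reset-failed only clears the "failed" state recorded by systemd; it does not start the daemons again. A minimal check, assuming the unit names shown above, to confirm which Ceph units are still down on a node before restarting them in the next step:

# list any ceph units systemd still considers failed on this node
systemctl list-units --type=service --state=failed | grep ceph

# inspect a single daemon, e.g. the monitor unit on node ct
systemctl status ceph-mon@ct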

Then check the status again:

[root@ct ~(keystone_admin)]# ceph osd status
+----+------+-------+-------+--------+---------+--------+---------+-----------+
| id | host |  used | avail | wr ops | wr data | rd ops | rd data |   state   |
+----+------+-------+-------+--------+---------+--------+---------+-----------+
| 0  |      |    0  |    0  |    0   |     0   |    0   |     0   | exists,up |
| 1  |      |    0  |    0  |    0   |     0   |    0   |     0   | exists,up |
| 2  |      |    0  |    0  |    0   |     0   |    0   |     0   |   exists  |
+----+------+-------+-------+--------+---------+--------+---------+-----------+
[root@ct ~(keystone_admin)]# ceph -s
  cluster:
    id:     15200f4f-1a57-46c5-848f-9b8af9747e54
    health: HEALTH_WARN
            Reduced data availability: 192 pgs inactive, 192 pgs peering
            1 slow ops, oldest one blocked for 584 sec, mon.ct has slow ops
 
  services:
    mon: 3 daemons, quorum ct,comp1,comp2
    mgr: ct(active), standbys: comp1
    osd: 3 osds: 2 up, 2 in
 
  data:
    pools:   3 pools, 192 pgs
    objects: 406  objects, 1.8 GiB
    usage:   8.3 GiB used, 3.0 TiB / 3.0 TiB avail
    pgs:     100.000% pgs not active
             192 peering
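
The warning above means every PG is still peering and a monitor op on mon.ct has been blocked for several minutes, which is typically seen right after a reboot while daemons are still coming back. A few standard ceph CLI commands, shown as a sketch, that are useful for watching the cluster settle:

# more detail on the current health warnings
ceph health detail

# PGs that are stuck inactive (peering counts as inactive)
ceph pg dump_stuck inactive

# refresh the summary every few seconds until it returns to HEALTH_OK
watch -n 5 ceph -s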
 

Restart the services: whichever node's service went down, log in to that node and restart the corresponding service (a remote variant run from a single host is sketched after these commands).

systemctl stop ceph-mon.target
systemctl restart ceph-mon.target
systemctl status ceph-mon.target
systemctl enable ceph-mon.target

systemctl stop ceph-mgr.target
systemctl restart ceph-mgr.target
systemctl status ceph-mgr.target
systemctl enable ceph-mgr.target

systemctl restart ceph-osd.target
systemctl status ceph-osd.target
systemctl enable ceph-osd.target
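
If logging in to each node is inconvenient, the same restarts can be driven from one host over SSH. This is only a sketch; it assumes passwordless SSH to the three nodes named in the quorum above (ct, comp1 and comp2) and restarts all three targets on each of them:

for node in ct comp1 comp2; do
    # restart mon, mgr and osd daemons on each node, then re-enable them at boot
    ssh "$node" "systemctl restart ceph-mon.target ceph-mgr.target ceph-osd.target && systemctl enable ceph-mon.target ceph-mgr.target ceph-osd.target"
done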

ceph osd pool application enable vms mon
ceph osd pool application enable images mon
ceph osd pool application enable volumes mon
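
The three pools above (vms, images, volumes) are the usual RBD-backed OpenStack pools; if they are used as RBD backends, the conventional application tag is rbd rather than mon, so it is worth confirming what tag each pool actually carries. The ceph osd pool application get subcommand prints the tags currently set on a pool:

# show which application tags are set on each pool
ceph osd pool application get vms
ceph osd pool application get images
ceph osd pool application get volumes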