Kolla-Ansible deploys OpenStack Ocata successfully, but the Ceph OSDs do not start

Environment:

OpenStack version: Ocata

Number of nodes: 4

Host OS on each node: CentOS 7.7

While deploying OpenStack Ocata with Kolla-Ansible, the deployment itself completed successfully, using Ceph as the backend storage. Oddly, though, none of the Ceph OSDs on the storage nodes had started. The deployment log contained the following:

TASK [ceph : Looking up disks to bootstrap for Ceph OSDs] *********************************************************************************************
 [WARNING]: when statements should not include jinja2 templating delimiters such as {{ }} or {% %}. Found: {{ osd_lookup.stdout.find('localhost |
SUCCESS => ') != -1 and (osd_lookup.stdout.split('localhost | SUCCESS => ')[1]|from_json).changed }}

ok: [Compute01]
 [WARNING]: when statements should not include jinja2 templating delimiters such as {{ }} or {% %}. Found: {{ osd_lookup.stdout.find('localhost |
SUCCESS => ') != -1 and (osd_lookup.stdout.split('localhost | SUCCESS => ')[1]|from_json).changed }}

ok: [Compute03]
 [WARNING]: when statements should not include jinja2 templating delimiters such as {{ }} or {% %}. Found: {{ osd_lookup.stdout.find('localhost |
SUCCESS => ') != -1 and (osd_lookup.stdout.split('localhost | SUCCESS => ')[1]|from_json).changed }}

ok: [Compute02]

TASK [ceph : Parsing disk info for Ceph OSDs] *********************************************************************************************************
ok: [Compute01]
ok: [Compute02]
ok: [Compute03]

TASK [ceph : Looking up disks to bootstrap for Ceph Cache OSDs] ***************************************************************************************
 [WARNING]: when statements should not include jinja2 templating delimiters such as {{ }} or {% %}. Found: {{ osd_cache_lookup.stdout.find('localhost
| SUCCESS => ') != -1 and (osd_cache_lookup.stdout.split('localhost | SUCCESS => ')[1]|from_json).changed }}

 [WARNING]: when statements should not include jinja2 templating delimiters such as {{ }} or {% %}. Found: {{ osd_cache_lookup.stdout.find('localhost
| SUCCESS => ') != -1 and (osd_cache_lookup.stdout.split('localhost | SUCCESS => ')[1]|from_json).changed }}

ok: [Compute01]
ok: [Compute03]
 [WARNING]: when statements should not include jinja2 templating delimiters such as {{ }} or {% %}. Found: {{ osd_cache_lookup.stdout.find('localhost
| SUCCESS => ') != -1 and (osd_cache_lookup.stdout.split('localhost | SUCCESS => ')[1]|from_json).changed }}

ok: [Compute02]

...

The playbook task that looks up disks to bootstrap as Ceph OSDs succeeded; there were no errors, only warnings. At first I suspected the warnings were the cause, and spent half a day chasing them without result.

In the end I tried a different approach: I re-ran the deploy with the -vvv flag to print verbose logs, and found the following in the detailed output:

TASK [ceph : Looking up disks to bootstrap for Ceph OSDs] *********************************************************************************************
...
(a few dozen lines omitted)
...
ok: [Compute01] => {
    "changed": false,
    "cmd": [
        "docker",
        "exec",
        "-t",
        "kolla_toolbox",
        "sudo",
        "-E",
        "/usr/bin/ansible",
        "localhost",
        "-m",
        "find_disks",
        "-a",
        "partition_name='KOLLA_CEPH_OSD_BOOTSTRAP' match_mode='prefix' use_udev=True"
    ],
    "delta": "0:00:01.029759",
    "end": "2020-03-11 14:37:57.020245",
    "failed": false,
    "failed_when_result": false,
    "invocation": {
        "module_args": {
            "_raw_params": "docker exec -t kolla_toolbox sudo -E /usr/bin/ansible localhost -m find_disks -a \"partition_name='KOLLA_CEPH_OSD_BOOTSTRAP' match_mode='prefix' use_udev=True\"",
            "_uses_shell": false,
            "chdir": null,
            "creates": null,
            "executable": null,
            "removes": null,
            "warn": true
        }
    },
    "rc": 0,
    "start": "2020-03-11 14:37:55.990486",
    "stderr": "",
    "stderr_lines": [],
    "stdout": "localhost | SUCCESS => {\r\n    \"changed\": false, \r\n    \"disks\": \"[]\"\r\n}",
    "stdout_lines": [
            "localhost | SUCCESS => {",
        "    \"changed\": false, ",
        "    \"disks\": \"[]\"",
        "}"
    ]
}
...

From the detailed JSON output, the returned disks value is an empty array []. In other words, kolla found no disks usable as OSDs, even though every compute (storage) node did have disks prepared for OSD use. The problem therefore had to be somewhere in how the OSD disks were prepared.
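The playbook's `when` expression parses that stdout the same way. A minimal sketch of the parsing, using the stdout string copied from the log above:

```python
import json

# Raw stdout captured from the kolla_toolbox container (as in the -vvv log).
stdout = ('localhost | SUCCESS => {\r\n'
          '    "changed": false, \r\n'
          '    "disks": "[]"\r\n'
          '}')

# Reproduce what the playbook does: split off the "localhost | SUCCESS => "
# prefix and parse the remainder as JSON.
result = json.loads(stdout.split('localhost | SUCCESS => ')[1])

# "disks" is itself a JSON-encoded string, so decode it once more.
disks = json.loads(result['disks'])
print(result['changed'], disks)  # → False []
```

The empty list is exactly what the task registered, which is why no OSD bootstrap work was done afterwards.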

According to the official documentation, there are two ways to prepare Ceph disks: without a dedicated journal drive, and with one. I chose to let Ceph create the journal on the OSD disk itself rather than on a separate drive. The commands for this case are:

parted /dev/sdb -s -- mklabel gpt mkpart KOLLA_CEPH_OSD_BOOTSTRAP 1 -1
parted /dev/sdb print
Model: VMware, VMware Virtual S (scsi)
Disk /dev/sdb: 10.7GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Number  Start   End     Size    File system  Name                      Flags
 1      1049kB  10.7GB  10.7GB               KOLLA_CEPH_OSD_BOOTSTRAP

The key point is the partition label, which must be KOLLA_CEPH_OSD_BOOTSTRAP. The File system column may look different on different systems, but that does not matter; once the OSD starts and Ceph has processed the disk, it all becomes the Ceph OSD type. When I labelled the disks, however, I added a suffix, e.g. KOLLA_CEPH_OSD_BOOTSTRAP1. When no external journal drive is used, such a suffix causes kolla's OSD disk detection to skip those disks, which is exactly why no OSD disks were found.
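kolla's disk-lookup module (its source is shown further below) supports two label match modes, strict and prefix, and the suffix problem reduces to the difference between them. A standalone sketch of that comparison, with the label passed in as a plain string:

```python
def matched(dev_name, name, mode):
    # Same comparison logic as the match modes in kolla's find_disks module.
    if mode == 'strict':
        return dev_name == name
    elif mode == 'prefix':
        return dev_name.startswith(name)
    return False

NAME = 'KOLLA_CEPH_OSD_BOOTSTRAP'

# My suffixed label passes the initial prefix scan ...
print(matched('KOLLA_CEPH_OSD_BOOTSTRAP1', NAME, 'prefix'))  # → True
# ... but fails the strict check that selects the co-located-journal path.
print(matched('KOLLA_CEPH_OSD_BOOTSTRAP1', NAME, 'strict'))  # → False
```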

When a dedicated journal drive is configured, on the other hand, the official documentation requires a suffix on the label, to pair each storage drive with its corresponding journal drive. The documentation says:

Prepare the storage drive in the same way as documented above:

# <WARNING ALL DATA ON $DISK will be LOST!>
# where $DISK is /dev/sdb or something similar
parted $DISK -s -- mklabel gpt mkpart KOLLA_CEPH_OSD_BOOTSTRAP_FOO 1 -1

To prepare the journal external drive execute the following command:

# <WARNING ALL DATA ON $DISK will be LOST!>
# where $DISK is /dev/sdc or something similar
parted $DISK -s -- mklabel gpt mkpart KOLLA_CEPH_OSD_BOOTSTRAP_FOO_J 1 -1
Note Use different suffixes (_42, _FOO, _FOO42, ..) to use different external journal drives for different storage drives. One external journal drive can only be used for one storage drive.

Note The partition labels KOLLA_CEPH_OSD_BOOTSTRAP and KOLLA_CEPH_OSD_BOOTSTRAP_J are not working when using external journal drives. It is required to use suffixes (_42, _FOO, _FOO42, ..). If you want to setup only one storage drive with one external journal drive it is also necessary to use a suffix.

So in both cases the disk labels must match the scheme in use exactly; mixing them up will leave the OSDs unable to start.

As for why a suffixed label breaks OSD creation when no dedicated journal is used, the log above already gives a clue. Consider the command:

docker exec -t kolla_toolbox sudo -E /usr/bin/ansible localhost -m find_disks -a "partition_name='KOLLA_CEPH_OSD_BOOTSTRAP' match_mode='prefix' use_udev=True"

During deployment, kolla performs its work by executing commands inside the kolla_toolbox container on each node. Here it runs the find_disks module, which corresponds to a Python file inside kolla_toolbox with the following content:

#!/usr/bin/python

# Copyright 2015 Sam Yaple
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# This module has been relicensed from the source below:
# https://github.com/SamYaple/yaodu/blob/master/ansible/library/ceph_osd_list

DOCUMENTATION = '''
---
module: find_disks
short_description: Return list of devices containing a specfied name or label
description:
     - This will return a list of all devices with either GPT partition name
       or filesystem label of the name specified.
options:
  match_mode:
    description:
      - Label match mode, either strict or prefix
    default: 'strict'
    required: False
    choices: [ "strict", "prefix" ]
    type: str
  name:
    description:
      - Partition name or filesystem label
    required: True
    type: str
    aliases: [ 'partition_name' ]
  use_udev:
    description:
      - When True, use Linux udev to read disk info such as partition labels,
        uuid, etc.  Some older host operating systems have issues using udev to
        get the info this module needs. Set to False to fall back to more low
        level commands such as blkid to retrieve this information. Most users
        should not need to change this.
    default: True
    required: False
    type: bool
author: Sam Yaple
'''

EXAMPLES = '''
- hosts: ceph-osd
  tasks:
    - name: Return all valid formated devices with the name KOLLA_CEPH_OSD
      find_disks:
          name: 'KOLLA_CEPH_OSD'
      register: osds

- hosts: swift-object-server
  tasks:
    - name: Return all valid devices with the name KOLLA_SWIFT
      find_disks:
          name: 'KOLLA_SWIFT'
      register: swift_disks

- hosts: swift-object-server
  tasks:
    - name: Return all valid devices with wildcard name 'swift_d*'
      find_disks:
          name: 'swift_d' match_mode: 'prefix'
      register: swift_disks
'''

import json
import pyudev
import re
import subprocess  # nosec


def get_id_part_entry_name(dev, use_udev):
    if use_udev:
        dev_name = dev.get('ID_PART_ENTRY_NAME', '')
    else:
        part = re.sub(r'.*[^\d]', '', dev.device_node)
        parent = dev.find_parent('block').device_node
        # NOTE(Mech422): Need to use -i as -p truncates the partition name
        out = subprocess.Popen(['/usr/sbin/sgdisk', '-i', part,  # nosec
                                parent],
                               stdout=subprocess.PIPE).communicate()
        match = re.search(r'Partition name: \'(\w+)\'', out[0])
        if match:
            dev_name = match.group(1)
        else:
            dev_name = ''
    return dev_name


def get_id_fs_uuid(dev, use_udev):
    if use_udev:
        id_fs_uuid = dev.get('ID_FS_UUID', '')
    else:
        out = subprocess.Popen(['/usr/sbin/blkid', '-o', 'export',  # nosec
                                dev.device_node],
                               stdout=subprocess.PIPE).communicate()
        match = re.search(r'\nUUID=([\w-]+)', out[0])
        if match:
            id_fs_uuid = match.group(1)
        else:
            id_fs_uuid = ''
    return id_fs_uuid


def is_dev_matched_by_name(dev, name, mode, use_udev):
    if dev.get('DEVTYPE', '') == 'partition':
        dev_name = get_id_part_entry_name(dev, use_udev)
    else:
        dev_name = dev.get('ID_FS_LABEL', '')

    if mode == 'strict':
        return dev_name == name  # must be exactly equal; name here is the passed-in parameter, KOLLA_CEPH_OSD_BOOTSTRAP
    elif mode == 'prefix':
        return dev_name.startswith(name)
    else:
        return False


def find_disk(ct, name, match_mode, use_udev):
    for dev in ct.list_devices(subsystem='block'):
        if is_dev_matched_by_name(dev, name, match_mode, use_udev):
            yield dev


def extract_disk_info(ct, dev, name, use_udev):
    if not dev:
        return
    kwargs = dict()
    kwargs['fs_uuid'] = get_id_fs_uuid(dev, use_udev)
    kwargs['fs_label'] = dev.get('ID_FS_LABEL', '')
    if dev.get('DEVTYPE', '') == 'partition':
        kwargs['device'] = dev.find_parent('block').device_node
        kwargs['partition'] = dev.device_node
        kwargs['partition_num'] = \
            re.sub(r'.*[^\d]', '', dev.device_node)
        if is_dev_matched_by_name(dev, name, 'strict', use_udev):  # strict check: the label must equal name (KOLLA_CEPH_OSD_BOOTSTRAP) exactly; see the comment in is_dev_matched_by_name
            kwargs['external_journal'] = False  # in this case no external journal is used
            kwargs['journal'] = dev.device_node[:-1] + '2'
            kwargs['journal_device'] = kwargs['device']
            kwargs['journal_num'] = 2
        else:  # here the label merely starts with KOLLA_CEPH_OSD_BOOTSTRAP
            kwargs['external_journal'] = True  # an external journal is used
            journal_name = get_id_part_entry_name(dev, use_udev) + '_J'  # the external journal drive's label is the OSD label plus _J, and must match strictly
            for journal in find_disk(ct, journal_name, 'strict', use_udev):
                kwargs['journal'] = journal.device_node
                kwargs['journal_device'] = \
                    journal.find_parent('block').device_node
                kwargs['journal_num'] = \
                    re.sub(r'.*[^\d]', '', journal.device_node)
                break
            if 'journal' not in kwargs:  # with my configuration this branch was taken, returning immediately
                # NOTE(SamYaple): Journal not found, not returning info
                return
    else:
        kwargs['device'] = dev.device_node
    yield kwargs


def main():
    argument_spec = dict(
        match_mode=dict(required=False, choices=['strict', 'prefix'],
                        default='strict'),  # called with match_mode='prefix'
        name=dict(aliases=['partition_name'], required=True, type='str'),  # partition_name='KOLLA_CEPH_OSD_BOOTSTRAP'
        use_udev=dict(required=False, default=True, type='bool')
    )
    module = AnsibleModule(argument_spec)
    match_mode = module.params.get('match_mode')  # actually 'prefix'
    name = module.params.get('name')  # actually 'KOLLA_CEPH_OSD_BOOTSTRAP'
    use_udev = module.params.get('use_udev')

    try:
        ret = list()
        ct = pyudev.Context()
        for dev in find_disk(ct, name, match_mode, use_udev):  # first pass: prefix mode, so any disk whose label starts with KOLLA_CEPH_OSD_BOOTSTRAP is matched
            for info in extract_disk_info(ct, dev, name, use_udev):  # this function uses the label's form to decide whether an external journal drive is in use
                if info:
                    ret.append(info)

        module.exit_json(disks=json.dumps(ret))
    except Exception as e:
        module.exit_json(failed=True, msg=repr(e))

# import module snippets
from ansible.module_utils.basic import *  # noqa
if __name__ == '__main__':
    main()

From the code above, find_disks.py essentially uses the label name to decide whether an OSD uses an external journal drive or a journal partition carved out of the OSD disk itself. Given this design, the suffix I added when labelling the OSD disks made kolla assume an external journal drive existed; since no drive with the matching _J label could be found, the function returned early, and the final result was an empty array. After I changed the label on every node's OSD disk back to KOLLA_CEPH_OSD_BOOTSTRAP and re-ran the deploy, all the OSDs were created successfully.
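The whole decision path can be reduced to a few lines of standalone Python. This is a simplified sketch of extract_disk_info's label handling, with partition labels passed in as plain strings instead of being read via udev:

```python
NAME = 'KOLLA_CEPH_OSD_BOOTSTRAP'

def classify(label, all_labels):
    """Simplified re-implementation of find_disks' label classification.

    all_labels is the set of partition labels present on the node.
    """
    if not label.startswith(NAME):    # prefix scan: not an OSD disk at all
        return None
    if label == NAME:                 # strict match: co-located journal
        return 'osd with co-located journal'
    if label + '_J' in all_labels:    # suffixed: an external journal is expected
        return 'osd with external journal'
    return None                       # suffixed but no _J drive found: dropped

# My (broken) labelling: suffix but no matching journal drive.
labels = {'KOLLA_CEPH_OSD_BOOTSTRAP1'}
print(classify('KOLLA_CEPH_OSD_BOOTSTRAP1', labels))  # → None (disk ignored)

# Corrected labelling: exact match, journal co-located on the OSD disk.
labels = {'KOLLA_CEPH_OSD_BOOTSTRAP'}
print(classify('KOLLA_CEPH_OSD_BOOTSTRAP', labels))   # → osd with co-located journal
```

The first call is exactly the failure mode I hit: the disk survives the prefix scan but is silently dropped when no `_J` journal drive exists, producing the empty disks array seen in the verbose log.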
