Linux Storage I/O Stack (4): SCSI Subsystem Overview

Overview

The layered architecture of the Linux SCSI subsystem:

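  • Lower level: the actual drivers for the physical interfaces to SCSI, for example the drivers that vendors write for their specific host bus adapters (HBAs). A low-level driver discovers the SCSI devices attached to the host adapter, builds the in-memory data structures the SCSI subsystem needs, and provides the message-passing interface that translates sending and receiving SCSI commands into operations on the host adapter.

  • Upper level: the drivers for the various SCSI device types, such as the SCSI disk driver and the SCSI tape driver. An upper-level driver claims the SCSI devices discovered by the low-level drivers, assigns names to them, converts I/O on a device into SCSI commands, and hands those commands to the low-level driver.

  • Mid level: the common service functions of the SCSI stack. The upper and lower levels do their work by calling mid-level functions, and the mid level in turn calls back into functions registered by the upper and lower levels for driver-specific handling.

The Linux SCSI model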

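The Linux SCSI model is a kernel abstraction. A host adapter connects the host I/O bus (for example, PCI) to the storage I/O bus (for example, SCSI). A computer can have several host adapters, a host adapter can control one or more SCSI buses, a bus can have several target nodes attached, and a target node can have several logical units.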

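In the Linux SCSI subsystem, a target node (target) in the kernel corresponds to a SCSI disk. A SCSI disk can contain several logical units, all managed by the disk controller. These logical units are the storage devices that actually act as I/O endpoints, and the kernel abstracts each one as a device. A Host in the kernel corresponds to a host adapter, which is either physical (an HBA or RAID card) or virtual (for example, the host that fronts an iSCSI target).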

The kernel uses the 4-tuple <host:channel:id:lun> to uniquely identify a SCSI logical unit. Viewing the sda disk <2:0:0:0> in sysfs gives:

root@ubuntu16:/home/comet/Costor/bin# ls /sys/bus/scsi/devices/2\:0\:0\:0/block/sda/
alignment_offset  device             events_poll_msecs  integrity  removable  sda5    subsystem
bdi               discard_alignment  ext_range          power      ro         size    trace
capability        events             holders            queue      sda1       slaves  uevent
dev               events_async       inflight           range      sda2       stat
root@ubuntu16:/home/comet/Costor/bin# cat /sys/bus/scsi/devices/2\:0\:0\:0/block/sda/dev
8:0
root@ubuntu16:/home/comet/Costor/bin# ll /dev/sda
brw-rw---- 1 root disk 8, 0 Sep 19 11:36 /dev/sda
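  • host: the unique number of the host adapter.
  • channel: the number of the SCSI channel within the host adapter, maintained by the adapter firmware.
  • id: the unique identifier of the target node.
  • lun: the number of the logical unit within the target node.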
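
To make the 4-tuple concrete, here is a minimal user-space sketch that parses a sysfs device name such as "2:0:0:0" into its four fields. The struct scsi_addr type and the scsi_addr_parse function are hypothetical names invented for this illustration; they are not kernel APIs.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical user-space mirror of the <host:channel:id:lun> tuple. */
struct scsi_addr {
    unsigned int host;
    unsigned int channel;
    unsigned int id;
    uint64_t lun;
};

/*
 * Parse a sysfs device name such as "2:0:0:0".
 * Returns 0 on success, -1 if the string does not consist of four
 * colon-separated numbers or has trailing characters after the LUN.
 */
int scsi_addr_parse(const char *name, struct scsi_addr *out)
{
    int consumed = 0;

    assert(name != NULL && out != NULL);
    if (sscanf(name, "%u:%u:%u:%" SCNu64 "%n",
               &out->host, &out->channel, &out->id, &out->lun,
               &consumed) != 4)
        return -1;
    /* Reject trailing garbage after the LUN. */
    return name[consumed] == '\0' ? 0 : -1;
}
```

The LUN is parsed as a 64-bit value because the kernel structures shown later store it as u64, while the other three fields are plain unsigned integers.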

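SCSI commands

A SCSI command is defined in a Command Descriptor Block (CDB). The CDB contains the operation code that identifies the operation to perform, plus a number of operation-specific parameters.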

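Command           Purpose
Test unit ready   Checks whether the device is ready for transfers
Inquiry           Requests basic device information
Request sense     Requests error information for the previous command
Read capacity     Requests storage capacity information
Read              Reads data from the device
Write             Writes data to the device
Mode sense        Requests mode pages (device parameters)
Mode select       Configures device parameters in a mode page

With roughly 60 commands available, SCSI suits many kinds of devices, including random-access devices such as disks and sequential-access devices such as tapes. SCSI also provides dedicated commands for accessing enclosure services, such as the sensor state and temperature inside a storage enclosure.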
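
As an illustration of what a CDB looks like, the sketch below fills in the 10-byte CDB of a READ(10) command: the operation code in byte 0, a big-endian logical block address in bytes 2 to 5, and a big-endian transfer length in bytes 7 and 8. The function name build_read10 is an assumption made for this example; it only builds the byte layout and sends nothing to a device.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define READ_10_OPCODE  0x28  /* READ(10) operation code */
#define READ_10_CDB_LEN 10

/*
 * Fill cdb[0..9] with a READ(10) command that reads `blocks` logical
 * blocks starting at logical block address `lba`.
 * Multi-byte fields in a CDB are stored big-endian.
 */
void build_read10(uint8_t cdb[READ_10_CDB_LEN], uint32_t lba, uint16_t blocks)
{
    assert(cdb != NULL);
    memset(cdb, 0, READ_10_CDB_LEN);
    cdb[0] = READ_10_OPCODE;
    cdb[2] = (uint8_t)(lba >> 24);
    cdb[3] = (uint8_t)(lba >> 16);
    cdb[4] = (uint8_t)(lba >> 8);
    cdb[5] = (uint8_t)lba;
    cdb[7] = (uint8_t)(blocks >> 8);
    cdb[8] = (uint8_t)blocks;
    /* cdb[1] (flags), cdb[6] (group number) and cdb[9] (control) stay 0. */
}
```

In the kernel, a buffer like this is what the cmnd field of struct scsi_cmnd points to, with cmd_len holding its length.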

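Core data structures

Host adapter template: scsi_host_template

The host adapter template holds what is common to host adapters of the same model, including the request queue depth, the SCSI command handling callback, and the error-recovery functions. When a host adapter structure is allocated, its fields are filled in from the template. When writing a SCSI low-level driver, the first step is to define a scsi_host_template; only then can host adapters be created from it.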

struct scsi_host_template {
    struct module *module;  // the low-level driver module that implements hosts from this template
    const char *name;       // host adapter name

    int (* detect)(struct scsi_host_template *);
    int (* release)(struct Scsi_Host *);

    const char *(* info)(struct Scsi_Host *); // returns information about the HBA; optional

    int (* ioctl)(struct scsi_device *dev, int cmd, void __user *arg); // handles ioctl requests from user space; optional


#ifdef CONFIG_COMPAT
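    // handles ioctl calls made by 32-bit user space
    int (* compat_ioctl)(struct scsi_device *dev, int cmd, void __user *arg);
#endif

    // queues a SCSI command to the low-level driver; called by the mid layer; mandatory
    int (* queuecommand)(struct Scsi_Host *, struct scsi_cmnd *);

    // the next five functions are error-handling callbacks, invoked by the mid layer in order of increasing severity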
    int (* eh_abort_handler)(struct scsi_cmnd *);        //Abort
    int (* eh_device_reset_handler)(struct scsi_cmnd *); //Device Reset
    int (* eh_target_reset_handler)(struct scsi_cmnd *); //Target Reset
    int (* eh_bus_reset_handler)(struct scsi_cmnd *);    //Bus Reset
    int (* eh_host_reset_handler)(struct scsi_cmnd *);   //Host Reset

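    // called when a new device is found during scanning; the low-level driver can allocate and initialize its per-device data here
    int (* slave_alloc)(struct scsi_device *);

    // called after the device has responded to INQUIRY, to apply device-specific configuration
    int (* slave_configure)(struct scsi_device *);

    // called just before a SCSI device is destroyed; frees the private data allocated by slave_alloc
    void (* slave_destroy)(struct scsi_device *);

    // called by the mid layer when a new target is discovered; allocates target-private data
    int (* target_alloc)(struct scsi_target *);

    // called by the mid layer just before a target is destroyed; frees the data allocated by target_alloc
    void (* target_destroy)(struct scsi_target *);

    // for custom target scanning: the mid layer polls this function until it returns 1, which means scanning has finished
    int (* scan_finished)(struct Scsi_Host *, unsigned long);

    // for custom target scanning: called before the scan starts
    void (* scan_start)(struct Scsi_Host *);

    // changes the queue depth of a device and returns the depth that was set
    int (* change_queue_depth)(struct scsi_device *, int);

    // fills in the disk's BIOS geometry (heads, sectors, cylinders) for the given device and capacity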
    int (* bios_param)(struct scsi_device *, struct block_device *,
            sector_t, int []);

    void (*unlock_native_capacity)(struct scsi_device *);

    // read and write callbacks for the procfs entry
    int (*show_info)(struct seq_file *, struct Scsi_Host *);
    int (*write_info)(struct Scsi_Host *, char *, int);

    // called when the mid layer detects that a SCSI command has timed out
    enum blk_eh_timer_return (*eh_timed_out)(struct scsi_cmnd *);

    // called when the host adapter is reset through its sysfs attribute
    int (*host_reset)(struct Scsi_Host *shost, int reset_type);
#define SCSI_ADAPTER_RESET  1
#define SCSI_FIRMWARE_RESET 2

    const char *proc_name; // name used in the proc filesystem

    struct proc_dir_entry *proc_dir;

    int can_queue; // number of commands the host adapter can accept at the same time

    int this_id;

    /*
     * This determines the degree to which the host adapter is capable
     * of scatter-gather.
     */  // scatter-gather list limits
    unsigned short sg_tablesize;
    unsigned short sg_prot_tablesize;

    /*
     * Set this if the host adapter has limitations beside segment count.
     */ // maximum number of sectors a single SCSI command may access
    unsigned int max_sectors;

    /*
     * DMA scatter gather segment boundary limit. A segment crossing this
     * boundary will be split in two.
     */
    unsigned long dma_boundary; // DMA scatter-gather segment boundary; a segment that crosses it is split in two

#define SCSI_DEFAULT_MAX_SECTORS    1024

    short cmd_per_lun;

    /*
     * present contains counter indicating how many boards of this
     * type were found when we did the scan.
     */
    unsigned char present;

    /* If use block layer to manage tags, this is tag allocation policy */
    int tag_alloc_policy;

    /*
     * Track QUEUE_FULL events and reduce queue depth on demand.
     */
    unsigned track_queue_depth:1;

    /*
     * This specifies the mode that a LLD supports.
     */
    unsigned supported_mode:2; // mode supported by the low-level driver (initiator or target)

    /*
     * True if this host adapter uses unchecked DMA onto an ISA bus.
     */
    unsigned unchecked_isa_dma:1;

    unsigned use_clustering:1;

    /*
     * True for emulated SCSI host adapters (e.g. ATAPI).
     */
    unsigned emulated:1;

    /*
     * True if the low-level driver performs its own reset-settle delays.
     */
    unsigned skip_settle_delay:1;

    /* True if the controller does not support WRITE SAME */
    unsigned no_write_same:1;

    /*
     * True if asynchronous aborts are not supported
     */
    unsigned no_async_abort:1;

    /*
     * Countdown for host blocking with no commands outstanding.
     */
    unsigned int max_host_blocked; // starting value of the host_blocked countdown

#define SCSI_DEFAULT_HOST_BLOCKED   7

    /*
     * Pointer to the sysfs class properties for this host, NULL terminated.
     */
    struct device_attribute **shost_attrs; // host adapter class attributes

    /*
     * Pointer to the SCSI device properties for this host, NULL terminated.
     */
    struct device_attribute **sdev_attrs;  // attributes of the SCSI devices on this host

    struct list_head legacy_hosts;

    u64 vendor_id;

    /*
     * Additional per-command data allocated for the driver.
     */  // cmd_pool is the SCSI command pool: commands are preallocated and kept there
    unsigned int cmd_size;
    struct scsi_host_cmd_pool *cmd_pool;

    /* temporary flag to disable blk-mq I/O path */
    bool disable_blk_mq;  // flag that disables the block-layer multi-queue (blk-mq) mode
};
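
The five eh_* callbacks above are described as being called in order of severity. The sketch below is a simplified user-space model of that escalation: try the least disruptive recovery step first and move on only when it fails or is not implemented. All names here (struct eh_template, eh_escalate, the stub handlers, and the EH_* values) are hypothetical; the real mid-layer error handler works on lists of failed commands and does considerably more between steps.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative result codes, loosely modelled on SUCCESS/FAILED. */
enum eh_result { EH_FAILED = 0, EH_SUCCESS = 1 };

typedef enum eh_result (*eh_handler_fn)(void *cmd);

/* Hypothetical cut-down template holding only the five recovery hooks. */
struct eh_template {
    eh_handler_fn eh_abort_handler;
    eh_handler_fn eh_device_reset_handler;
    eh_handler_fn eh_target_reset_handler;
    eh_handler_fn eh_bus_reset_handler;
    eh_handler_fn eh_host_reset_handler;
};

/* Stub handlers used only to exercise the escalation logic. */
enum eh_result eh_stub_fail(void *cmd) { (void)cmd; return EH_FAILED; }
enum eh_result eh_stub_ok(void *cmd)   { (void)cmd; return EH_SUCCESS; }

/*
 * Try the handlers from least to most disruptive. Returns the index
 * (0 = abort ... 4 = host reset) of the first handler that reports
 * success, or -1 if every implemented handler failed.
 * Unimplemented (NULL) hooks are skipped.
 */
int eh_escalate(const struct eh_template *t, void *cmd)
{
    assert(t != NULL);

    const eh_handler_fn steps[] = {
        t->eh_abort_handler,
        t->eh_device_reset_handler,
        t->eh_target_reset_handler,
        t->eh_bus_reset_handler,
        t->eh_host_reset_handler,
    };

    for (size_t i = 0; i < sizeof(steps) / sizeof(steps[0]); i++) {
        if (steps[i] && steps[i](cmd) == EH_SUCCESS)
            return (int)i;
    }
    return -1;
}
```

Ordering the hooks this way keeps recovery as local as possible: a command abort affects one command, while a host reset disturbs every device behind the adapter.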

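Host adapter: Scsi_Host

Scsi_Host describes a SCSI host adapter, which is usually a PCI expansion card or a SCSI controller chip. A host adapter can have several channels, each of which is in effect one SCSI bus. Each channel can connect several SCSI target nodes; how many depends on the load capacity of the SCSI bus or on limits of the specific SCSI protocol. A real host bus adapter sits on the host I/O bus (usually PCI). At boot the system scans the devices on the PCI bus, and the host adapter is allocated at that point.
The Scsi_Host structure embeds a generic device, which is linked into the device list of the SCSI bus type (scsi_bus_type).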

struct Scsi_Host {
    struct list_head    __devices; // list of devices
    struct list_head    __targets; // list of target nodes

    struct scsi_host_cmd_pool *cmd_pool; // SCSI command pool
    spinlock_t      free_list_lock;   // protects free_list
    struct list_head    free_list; /* backup store of cmd structs: preallocated spare commands */
    struct list_head    starved_list; // list of starved devices

    spinlock_t      default_lock;
    spinlock_t      *host_lock;

    struct mutex        scan_mutex;/* serialize scanning activity */

    struct list_head    eh_cmd_q; // list of SCSI commands that failed
    struct task_struct    * ehandler;  /* Error recovery thread. */
    struct completion     * eh_action; /* Wait for specific actions on the
                          host. */
    wait_queue_head_t       host_wait; // wait queue for SCSI device recovery
    struct scsi_host_template *hostt;  // host adapter template
    struct scsi_transport_template *transportt; // SCSI transport-layer template

    /*
     * Area to keep a shared tag map (if needed, will be
     * NULL if not).
     */
    union {
        struct blk_queue_tag    *bqt;
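        struct blk_mq_tag_set   tag_set; // used when SCSI runs in multi-queue mode
    };
    // number of SCSI commands already dispatched to the host adapter (low-level driver)
    atomic_t host_busy;        /* commands actually active on low-level */
    atomic_t host_blocked;  // blocked counter; counts down from max_host_blocked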

    unsigned int host_failed;      /* commands that failed.
                          protected by host_lock */
    unsigned int host_eh_scheduled;    /* EH scheduled without command */

    unsigned int host_no;  /* Used for IOCTL_GET_IDLUN, /proc/scsi et al. Unique within the system. */

    /* next two fields are used to bound the time spent in error handling */
    int eh_deadline;
    unsigned long last_reset; // time of the last reset


    /*
     * These three parameters can be used to allow for wide scsi,
     * and for host adapters that support multiple busses
     * The last two should be set to 1 more than the actual max id
     * or lun (e.g. 8 for SCSI parallel systems).
     */
    unsigned int max_channel; // highest channel number on the host adapter
    unsigned int max_id;      // highest target id on the host adapter
    u64 max_lun;              // highest LUN on the host adapter

    unsigned int unique_id;

    /*
     * The maximum length of SCSI commands that this host can accept.
     * Probably 12 for most host adapters, but could be 16 for others.
     * or 260 if the driver supports variable length cdbs.
     * For drivers that don't set this field, a value of 12 is
     * assumed.
     */
    unsigned short max_cmd_len;  // longest SCSI command the host adapter can accept
    // the following fields also exist in scsi_host_template and are assigned from the template
    int this_id;
    int can_queue;
    short cmd_per_lun;
    short unsigned int sg_tablesize;
    short unsigned int sg_prot_tablesize;
    unsigned int max_sectors;
    unsigned long dma_boundary;
    /*
     * In scsi-mq mode, the number of hardware queues supported by the LLD.
     *
     * Note: it is assumed that each hardware queue has a queue depth of
     * can_queue. In other words, the total queue depth per host
     * is nr_hw_queues * can_queue.
     */
    unsigned nr_hw_queues; // in scsi-mq mode, the number of hardware queues the low-level driver supports
    /*
     * Used to assign serial numbers to the cmds.
     * Protected by the host lock.
     */
    unsigned long cmd_serial_number;  // counter for command serial numbers

    unsigned active_mode:2;           // whether the host acts as initiator or target
    unsigned unchecked_isa_dma:1;
    unsigned use_clustering:1;

    /*
     * Host has requested that no further requests come through for the
     * time being.
     */
    unsigned host_self_blocked:1; // the low-level driver asked to block this host; the mid layer stops dispatching commands to it

    /*
     * Host uses correct SCSI ordering not PC ordering. The bit is
     * set for the minority of drivers whose authors actually read
     * the spec ;).
     */
    unsigned reverse_ordering:1;

    /* Task mgmt function in progress */
    unsigned tmf_in_progress:1;  // a task management function is running

    /* Asynchronous scan in progress */
    unsigned async_scan:1;       // an asynchronous scan is running

    /* Don't resume host in EH */
    unsigned eh_noresume:1;      // do not resume the host adapter during error handling

    /* The controller does not support WRITE SAME */
    unsigned no_write_same:1;

    unsigned use_blk_mq:1;       // whether SCSI multi-queue mode is used
    unsigned use_cmd_list:1;

    /* Host responded with short (<36 bytes) INQUIRY result */
    unsigned short_inquiry:1;

    /*
     * Optional work queue to be utilized by the transport
     */
    char work_q_name[20];  // name of the work queue used by the SCSI transport layer
    struct workqueue_struct *work_q;

    /*
     * Task management function work queue
     */
    struct workqueue_struct *tmf_work_q; // work queue for task management functions

    /* The transport requires the LUN bits NOT to be stored in CDB[1] */
    unsigned no_scsi2_lun_in_cdb:1;

    /*
     * Value host_blocked counts down from
     */
    unsigned int max_host_blocked; // starting value of the host_blocked countdown

    /* Protection Information */
    unsigned int prot_capabilities;
    unsigned char prot_guard_type;

    /*
     * q used for scsi_tgt msgs, async events or any other requests that
     * need to be processed in userspace
     */
    struct request_queue *uspace_req_q; // request queue for scsi_tgt messages, async events, and other requests handled in user space

    /* legacy crap */
    unsigned long base;
    unsigned long io_port;   // I/O port number
    unsigned char n_io_port;
    unsigned char dma_channel;
    unsigned int  irq;


    enum scsi_host_state shost_state; // host state

    /* ldm bits */ // shost_gendev: embedded generic device, linked into the device list of the SCSI bus type (scsi_bus_type)
    struct device       shost_gendev, shost_dev;
    // shost_dev: embedded class device, linked into the device list of the SCSI host class (shost_class)
    /*
     * List of hosts per template.
     *
     * This is only for use by scsi_module.c for legacy templates.
     * For these access to it is synchronized implicitly by
     * module_init/module_exit.
     */
    struct list_head sht_legacy_list;

    /*
     * Points to the transport data (if any) which is allocated
     * separately
     */
    void *shost_data; // separately allocated transport data, used by the SCSI transport layer

    /*
     * Points to the physical bus device we'd use to do DMA
     * Needed just in case we have virtual hosts.
     */
    struct device *dma_dev;

    /*
     * We should ensure that this is aligned, both for better performance
     * and also because some compilers (m68k) don't automatically force
     * alignment to a long boundary.
     */ // private data of the host adapter
    unsigned long hostdata[0]  /* Used for storage of host specific stuff */
        __attribute__ ((aligned (sizeof(unsigned long))));
};
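
Two points from the listings above can be shown together in a small user-space sketch: several Scsi_Host fields are copied from the template at allocation time, and the driver-private area (hostdata) is a trailing array allocated in the same block as the host itself. The mini_template and mini_host types and the mini_host_alloc and mini_host_priv functions are hypothetical stand-ins with only a handful of fields; calloc plays the role of the kernel allocator.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, heavily reduced counterpart of scsi_host_template. */
struct mini_template {
    const char *name;
    int can_queue;
    int this_id;
    short cmd_per_lun;
    unsigned short sg_tablesize;
};

/* Hypothetical, heavily reduced counterpart of Scsi_Host. */
struct mini_host {
    const struct mini_template *hostt;
    int can_queue;
    int this_id;
    short cmd_per_lun;
    unsigned short sg_tablesize;
    unsigned long hostdata[];   /* driver-private area, sized at allocation */
};

/*
 * Allocate a zeroed host plus privsize bytes of private data in one
 * block, then copy the shared limits from the template.
 * Returns NULL if the allocation fails.
 */
struct mini_host *mini_host_alloc(const struct mini_template *t, size_t privsize)
{
    struct mini_host *h;

    assert(t != NULL);
    h = calloc(1, sizeof(*h) + privsize);
    if (!h)
        return NULL;
    h->hostt = t;
    h->can_queue = t->can_queue;
    h->this_id = t->this_id;
    h->cmd_per_lun = t->cmd_per_lun;
    h->sg_tablesize = t->sg_tablesize;
    return h;
}

/* Return the start of the driver-private area. */
void *mini_host_priv(struct mini_host *h)
{
    assert(h != NULL);
    return h->hostdata;
}
```

Keeping the private data in the same allocation means one free releases both, and the private area inherits the alignment of unsigned long, which is what the alignment comment on hostdata is about.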

Target node: scsi_target

The scsi_target structure embeds a driver-model device, which is linked into the device list of the SCSI bus type scsi_bus_type.

struct scsi_target {
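    struct scsi_device  *starget_sdev_user; // the SCSI device currently doing I/O, or NULL if there is none
    struct list_head    siblings;  // links this target into the host adapter's target list
    struct list_head    devices;   // list of devices that belong to this target
    struct device       dev;       // generic device, used to join the device driver model
    struct kref     reap_ref; /* reference count; last put renders target invisible */
    unsigned int        channel;   // channel number this target sits on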
    unsigned int        id; /* target id ... replace
                     * scsi_device.id eventually */
    unsigned int        create:1; /* signal that it needs to be added */
    unsigned int        single_lun:1;   /* Indicates we should only
                         * allow I/O to one of the luns
                         * for the device at a time. */
    unsigned int        pdt_1f_for_no_lun:1;    /* PDT = 0x1f
                         * means no lun present. */
    unsigned int        no_report_luns:1;   /* Don't use
                         * REPORT LUNS for scanning. */
    unsigned int        expecting_lun_change:1; /* A device has reported
                         * a 3F/0E UA, other devices on
                         * the same target will also. */
    /* commands actually active on LLD. */
    atomic_t        target_busy;
    atomic_t        target_blocked;           // blocked counter; counts down from max_target_blocked

    /*
     * LLDs should set this in the slave_alloc host template callout.
     * If set to zero then there is not limit.
     */
    unsigned int        can_queue;             // number of commands handled at the same time
    unsigned int        max_target_blocked;    // starting value of the target_blocked countdown
#define SCSI_DEFAULT_TARGET_BLOCKED 3

    char            scsi_level;                // SCSI specification level supported
    enum scsi_target_state  state;             // target state
    void            *hostdata; /* available to low-level driver */
    unsigned long       starget_data[0]; /* for the transport (SCSI transport layer) */
    /* starget_data must be the last element!!!! */
} __attribute__((aligned(sizeof(unsigned long))));

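Logical device: scsi_device

scsi_device describes a SCSI logical device and represents a logical unit (LUN) of a SCSI disk. The device behind a scsi_device descriptor may be a SATA/SAS/SCSI disk or an SSD on another storage system. When the operating system scans and finds a logical device attached to a host adapter, it creates a scsi_device structure, which the SCSI upper-level driver uses to communicate with the device.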

struct scsi_device {
    struct Scsi_Host *host;  // host bus adapter this device belongs to
    struct request_queue *request_queue; // request queue

    /* the next two are protected by the host->host_lock */
    struct list_head    siblings;   /* list of all devices on this host */ // links into the host adapter's device list
    struct list_head    same_target_siblings; /* just the devices sharing same target id */ // links into the target's device list

    atomic_t device_busy;       /* commands actually active on LLDD */
    atomic_t device_blocked;    /* Device returned QUEUE_FULL. */

    spinlock_t list_lock;
    struct list_head cmd_list;  /* queue of in use SCSI Command structures */
    struct list_head starved_entry; // links into the host adapter's starved list
    struct scsi_cmnd *current_cmnd; /* currently active command */
    unsigned short queue_depth; /* How deep of a queue we want */
    unsigned short max_queue_depth; /* max queue depth */
    unsigned short last_queue_full_depth; /* These two are used by */
    unsigned short last_queue_full_count; /* scsi_track_queue_full() */
    unsigned long last_queue_full_time; /* last queue full time */
    unsigned long queue_ramp_up_period; /* ramp up period in jiffies */
#define SCSI_DEFAULT_RAMP_UP_PERIOD (120 * HZ)

    unsigned long last_queue_ramp_up;   /* last queue ramp up time */

    unsigned int id, channel; // target id and channel number this scsi_device belongs to
    u64 lun;  // LUN of this device
    unsigned int manufacturer;  /* Manufacturer of device, for using
                     * vendor-specific cmd's */
    unsigned sector_size;   /* hardware sector size in bytes */

    void *hostdata;     /* private data, available to low-level driver */
    char type;          // SCSI device type
    char scsi_level;    // version of the SCSI specification supported, obtained from INQUIRY
    char inq_periph_qual;   /* PQ from INQUIRY data */
    unsigned char inquiry_len;  /* valid bytes in 'inquiry' */
    unsigned char * inquiry;    /* INQUIRY response data */
    const char * vendor;        /* [back_compat] point into 'inquiry' ... */
    const char * model;     /* ... after scan; point to static string */
    const char * rev;       /* ... "nullnullnullnull" before scan */

#define SCSI_VPD_PG_LEN                255
    int vpd_pg83_len;          // length of VPD page 0x83, read with INQUIRY
    unsigned char *vpd_pg83;
    int vpd_pg80_len;          // length of VPD page 0x80, read with INQUIRY
    unsigned char *vpd_pg80;
    unsigned char current_tag;  /* current tag */
    struct scsi_target      *sdev_target;   /* used only for single_lun */

    unsigned int    sdev_bflags; /* black/white flags as also found in
                 * scsi_devinfo.[hc]. For now used only to
                 * pass settings from slave_alloc to scsi
                 * core. */
    unsigned int eh_timeout; /* Error handling timeout */
    unsigned removable:1;
    unsigned changed:1; /* Data invalid due to media change */
    unsigned busy:1;    /* Used to prevent races */
    unsigned lockable:1;    /* Able to prevent media removal */
    unsigned locked:1;      /* Media removal disabled */
    unsigned borken:1;  /* Tell the Seagate driver to be
                 * painfully slow on this device */
    unsigned disconnect:1;  /* can disconnect */
    unsigned soft_reset:1;  /* Uses soft reset option */
    unsigned sdtr:1;    /* Device supports SDTR messages (synchronous data transfer) */
    unsigned wdtr:1;    /* Device supports WDTR messages (16-bit wide data transfer) */
    unsigned ppr:1;     /* Device supports PPR (parallel protocol request) messages */
    unsigned tagged_supported:1;    /* Supports SCSI-II tagged queuing */
    unsigned simple_tags:1; /* simple queue tag messages are enabled */
    unsigned was_reset:1;   /* There was a bus reset on the bus for
                 * this device */
    unsigned expecting_cc_ua:1; /* Expecting a CHECK_CONDITION/UNIT_ATTN
                     * because we did a bus reset. */
    unsigned use_10_for_rw:1; /* first try 10-byte read / write */
    unsigned use_10_for_ms:1; /* first try 10-byte mode sense/select */
    unsigned no_report_opcodes:1;   /* no REPORT SUPPORTED OPERATION CODES */
    unsigned no_write_same:1;   /* no WRITE SAME command */
    unsigned use_16_for_rw:1; /* Use read/write(16) over read/write(10) */
    unsigned skip_ms_page_8:1;  /* do not use MODE SENSE page 0x08 */
    unsigned skip_ms_page_3f:1; /* do not use MODE SENSE page 0x3f */
    unsigned skip_vpd_pages:1;  /* do not read VPD pages */
    unsigned try_vpd_pages:1;   /* attempt to read VPD pages */
    unsigned use_192_bytes_for_3f:1; /* ask for 192 bytes from page 0x3f */
    unsigned no_start_on_add:1; /* do not issue start on add */
    unsigned allow_restart:1; /* issue START_UNIT in error handler */
    unsigned manage_start_stop:1;   /* Let HLD (sd) manage start/stop */
    unsigned start_stop_pwr_cond:1; /* Set power cond. in START_STOP_UNIT */
    unsigned no_uld_attach:1; /* disable connecting to upper level drivers */
    unsigned select_no_atn:1;
    unsigned fix_capacity:1;    /* READ_CAPACITY is too high by 1 */
    unsigned guess_capacity:1;  /* READ_CAPACITY might be too high by 1 */
    unsigned retry_hwerror:1;   /* Retry HARDWARE_ERROR */
    unsigned last_sector_bug:1; /* do not use multisector accesses on
                       SD_LAST_BUGGY_SECTORS */
    unsigned no_read_disc_info:1;   /* Avoid READ_DISC_INFO cmds */
    unsigned no_read_capacity_16:1; /* Avoid READ_CAPACITY_16 cmds */
    unsigned try_rc_10_first:1; /* Try READ_CAPACITY_10 first */
    unsigned is_visible:1;  /* is the device visible in sysfs */
    unsigned wce_default_on:1;  /* Cache is ON by default */
    unsigned no_dif:1;  /* T10 PI (DIF) should be disabled */
    unsigned broken_fua:1;      /* Don't set FUA bit */
    unsigned lun_in_cdb:1;      /* Store LUN bits in CDB[1] */

    atomic_t disk_events_disable_depth; /* disable depth for disk events */

    DECLARE_BITMAP(supported_events, SDEV_EVT_MAXBITS); /* supported events */
    DECLARE_BITMAP(pending_events, SDEV_EVT_MAXBITS); /* pending events */
    struct list_head event_list;    /* asserted events */
    struct work_struct event_work;

    unsigned int max_device_blocked; /* what device_blocked counts down from  */
#define SCSI_DEFAULT_DEVICE_BLOCKED 3

    atomic_t iorequest_cnt;
    atomic_t iodone_cnt;
    atomic_t ioerr_cnt;

    struct device       sdev_gendev, // embedded generic device, linked into the device list of the SCSI bus type (scsi_bus_type)
                sdev_dev; // embedded class device, linked into the device list of the SCSI device class (sdev_class)

    struct execute_work ew; /* used to get process context on put */
    struct work_struct  requeue_work;

    struct scsi_device_handler *handler; // custom device handler
    void            *handler_data;

    enum scsi_device_state sdev_state;  // SCSI device state
    unsigned long       sdev_data[0];   // used by the SCSI transport layer
} __attribute__((aligned(sizeof(unsigned long))));

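The kernel's SCSI command structure: scsi_cmnd

The scsi_cmnd structure is created by the SCSI mid layer and passed to the SCSI low-level driver. A scsi_cmnd is created for every I/O request, but a scsi_cmnd is not necessarily an I/O request. A scsi_cmnd is ultimately turned into a concrete SCSI command. Besides the command descriptor block, scsi_cmnd carries richer information, including the data buffer, the sense data buffer, the completion callback, and the associated block-layer request. It is the context in which the SCSI mid layer executes a SCSI command.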

struct scsi_cmnd {
    struct scsi_device *device;  // descriptor of the SCSI device this command belongs to
    struct list_head list;  /* scsi_cmnd participates in queue lists; links into the device's command list */
    struct list_head eh_entry; /* entry for the host eh_cmd_q */
    struct delayed_work abort_work;
    int eh_eflags;      /* Used by error handlr */

    /*
     * A SCSI Command is assigned a nonzero serial_number before passed
     * to the driver's queue command function.  The serial_number is
     * cleared when scsi_done is entered indicating that the command
     * has been completed.  It is a bug for LLDDs to use this number
     * for purposes other than printk (and even that is only useful
     * for debugging).
     */
    unsigned long serial_number; // unique serial number of the SCSI command

    /*
     * This is set to jiffies as it was when the command was first
     * allocated.  It is used to time how long the command has
     * been outstanding
     */
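    unsigned long jiffies_at_alloc; // jiffies at allocation, used to compute how long the command has been outstanding

    int retries;  // number of times the command has been retried
    int allowed;  // number of retries allowed

    unsigned char prot_op;    // protection operation (DIF and DIX)
    unsigned char prot_type;  // DIF protection type
    unsigned char prot_flags;

    unsigned short cmd_len;   // command length
    enum dma_data_direction sc_data_direction;  // data transfer direction of the command

    /* These elements define the operation we are about to perform */
    unsigned char *cmnd;  // command bytes in the format defined by the SCSI specification (the CDB)


    /* These elements define the operation we ultimately want to perform */
    struct scsi_data_buffer sdb;        // data buffer of the SCSI command
    struct scsi_data_buffer *prot_sdb;  // protection information buffer of the SCSI command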

    unsigned underflow; /* Return error if less than
                   this amount is transferred */

    unsigned transfersize;  /* Transfer unit: how much we are guaranteed to
                   transfer with each SCSI transfer
                   (ie, between disconnect /
                   reconnects.   Probably == sector
                   size */

    struct request *request;    /* The block-layer request we are
                       working on */

#define SCSI_SENSE_BUFFERSIZE   96
    unsigned char *sense_buffer;    // sense data buffer of the SCSI command
                /* obtained by REQUEST SENSE when
                 * CHECK CONDITION is received on original
                 * command (auto-sense) */

    /* Low-level done function - can be used by low-level driver to point
     *        to completion function.  Not used by mid/upper level code. */
    void (*scsi_done) (struct scsi_cmnd *); // called back when the low-level driver has completed the command

    /*
     * The following fields can be written to by the host specific code.
     * Everything else should be left alone.
     */
    struct scsi_pointer SCp;    /* Scratchpad used by some host adapters */

    unsigned char *host_scribble;   /* The host adapter is allowed to
                     * call scsi_malloc and get some memory
                     * and hang it here.  The host adapter
                     * is also expected to call scsi_free
                     * to release this memory.  (The memory
                     * obtained by scsi_malloc is guaranteed
                     * to be at an address < 16Mb). */

    int result;     /* Status code from lower level driver */
    int flags;      /* Command flags */

    unsigned char tag;  /* SCSI-II queued command tag */
};

Driver: scsi_driver

struct scsi_driver {
    struct device_driver    gendrv;  // "inherits" from device_driver

    void (*rescan)(struct device *); // callback invoked when the device is rescanned
    int (*init_command)(struct scsi_cmnd *);
    void (*uninit_command)(struct scsi_cmnd *);
    int (*done)(struct scsi_cmnd *);  // called when the low-level driver completes a SCSI command; computes the number of bytes completed
    int (*eh_action)(struct scsi_cmnd *, int); // error-handling callback
};

Device model

  • scsi_bus_type: the bus type of the SCSI subsystem
struct bus_type scsi_bus_type = {
    .name       = "scsi",   // corresponds to /sys/bus/scsi
    .match      = scsi_bus_match,
    .uevent     = scsi_bus_uevent,
#ifdef CONFIG_PM
    .pm     = &scsi_bus_pm_ops,
#endif
};
EXPORT_SYMBOL_GPL(scsi_bus_type);
  • shost_class: the SCSI host class of the SCSI subsystem
static struct class shost_class = {
    .name       = "scsi_host",  // corresponds to /sys/class/scsi_host
    .dev_release    = scsi_host_cls_release,
};

Initialization

The SCSI subsystem is loaded when the operating system boots. Its entry point is init_scsi, registered with subsys_initcall:

static int __init init_scsi(void)
{
    int error;

    error = scsi_init_queue();  // initialize the memory pools needed for scatter-gather lists
    if (error)
        return error;
    error = scsi_init_procfs(); // initialize the SCSI entries in procfs
    if (error)
        goto cleanup_queue;
    error = scsi_init_devinfo();// set up the dynamic SCSI device info list
    if (error)
        goto cleanup_procfs;
    error = scsi_init_hosts();  // register the shost_class class, creating scsi_host under /sys/class/
    if (error)
        goto cleanup_devlist;
    error = scsi_init_sysctl(); // register the SCSI sysctl table
    if (error)
        goto cleanup_hosts;
    error = scsi_sysfs_register(); // register the scsi_bus_type bus type and the sdev_class class
    if (error)
        goto cleanup_sysctl;

    scsi_netlink_init(); // initialize the SCSI transport netlink interface

    printk(KERN_NOTICE "SCSI subsystem initialized\n");
    return 0;

cleanup_sysctl:
    scsi_exit_sysctl();
cleanup_hosts:
    scsi_exit_hosts();
cleanup_devlist:
    scsi_exit_devinfo();
cleanup_procfs:
    scsi_exit_procfs();
cleanup_queue:
    scsi_exit_queue();
    printk(KERN_ERR "SCSI subsystem failed to initialize, error = %d\n",
           -error);
    return error;
}
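
init_scsi uses the common kernel idiom of goto-based unwinding: each step that fails jumps to a label that undoes only the steps that already succeeded, in reverse order. The following userspace sketch reproduces that control flow with three dummy steps. All names (demo_init, step_init, step_exit, and the demo_* globals) are hypothetical and exist only to make the unwinding order observable.

```c
#include <assert.h>

int demo_active[4];    /* demo_active[i] is 1 while step i is set up */
int demo_undo_log[4];  /* steps undone by the last demo_init(), in order */
int demo_undo_count;

static int step_init(int step, int fail)
{
    if (fail)
        return -1;
    demo_active[step] = 1;
    return 0;
}

static void step_exit(int step)
{
    demo_active[step] = 0;
    demo_undo_log[demo_undo_count++] = step;
}

/* Run steps 1..3 in order. fail_at (1..3) forces that step to fail,
 * 0 means every step succeeds. */
int demo_init(int fail_at)
{
    int error;

    assert(fail_at >= 0 && fail_at <= 3);
    demo_undo_count = 0;
    demo_active[1] = demo_active[2] = demo_active[3] = 0;

    error = step_init(1, fail_at == 1);
    if (error)
        return error;          /* nothing to undo yet */
    error = step_init(2, fail_at == 2);
    if (error)
        goto cleanup_step1;
    error = step_init(3, fail_at == 3);
    if (error)
        goto cleanup_step2;
    return 0;

cleanup_step2:                 /* labels fall through, newest step first */
    step_exit(2);
cleanup_step1:
    step_exit(1);
    return error;
}
```

When the third step fails, the log records step 2 and then step 1, mirroring how a failure in scsi_sysfs_register unwinds through cleanup_sysctl, cleanup_hosts, and so on down to cleanup_queue.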

scsi_init_hosts registers shost_class, the class that SCSI host adapters belong to:

int scsi_init_hosts(void)
{
    return class_register(&shost_class);
}

scsi_sysfs_register registers the SCSI bus type scsi_bus_type and sdev_class, the class that SCSI devices belong to:

int scsi_sysfs_register(void)
{
    int error;

    error = bus_register(&scsi_bus_type);
    if (!error) {
        error = class_register(&sdev_class);
        if (error)
            bus_unregister(&scsi_bus_type);
    }

    return error;
}

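A SCSI low-level driver is written for a host adapter, and when the driver is loaded it must add its host adapter. There are two ways to do this: (a) add it when the PCI subsystem scans the bus and binds the driver; (b) add it manually. All host adapters with a hardware PCI interface use the first way. Adding a host adapter takes two steps:
1. Allocate the host adapter data structure with scsi_host_alloc.
2. Add the host adapter to the system with scsi_add_host.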

struct Scsi_Host *scsi_host_alloc(struct scsi_host_template *sht, int privsize)
{
    struct Scsi_Host *shost;
    gfp_t gfp_mask = GFP_KERNEL;

    if (sht->unchecked_isa_dma && privsize)
        gfp_mask |= __GFP_DMA;
    // allocate the Scsi_Host and the driver-private area in a single allocation
    shost = kzalloc(sizeof(struct Scsi_Host) + privsize, gfp_mask);
    if (!shost)
        return NULL;

    shost->host_lock = &shost->default_lock;
    spin_lock_init(shost->host_lock);
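    shost->shost_state = SHOST_CREATED; // set the initial state
    INIT_LIST_HEAD(&shost->__devices);  // initialize the list of SCSI devices
    INIT_LIST_HEAD(&shost->__targets);  // initialize the list of targets
    INIT_LIST_HEAD(&shost->eh_cmd_q);   // initialize the list of failed SCSI commands awaiting error handling
    INIT_LIST_HEAD(&shost->starved_list);   // initialize the list of starved SCSI devices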
    init_waitqueue_head(&shost->host_wait);
    mutex_init(&shost->scan_mutex);

    /*
     * subtract one because we increment first then return, but we need to
     * know what the next host number was before increment
     */ // host adapter numbers are handed out in increasing order
    shost->host_no = atomic_inc_return(&scsi_host_next_hn) - 1;
    shost->dma_channel = 0xff;

    /* These three are default values which can be overridden */
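    shost->max_channel = 0; // highest channel number defaults to 0
    shost->max_id = 8;      // default maximum number of targets
    shost->max_lun = 8;     // default maximum number of LUNs (scsi_device) per target

    /* Give each shost a default transportt */
    shost->transportt = &blank_transport_template;  // default (blank) SCSI transport template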

    /*
     * All drivers right now should be able to handle 12 byte
     * commands.  Every so often there are requests for 16 byte
     * commands, but individual low-level drivers need to certify that
     * they actually do something sensible with such commands.
     */
    shost->max_cmd_len = 12;  // maximum SCSI command (CDB) length
    shost->hostt = sht;       // attach the host adapter template
    shost->this_id = sht->this_id;
    shost->can_queue = sht->can_queue;
    shost->sg_tablesize = sht->sg_tablesize;
    shost->sg_prot_tablesize = sht->sg_prot_tablesize;
    shost->cmd_per_lun = sht->cmd_per_lun;
    shost->unchecked_isa_dma = sht->unchecked_isa_dma;
    shost->use_clustering = sht->use_clustering;
    shost->no_write_same = sht->no_write_same;

    if (shost_eh_deadline == -1 || !sht->eh_host_reset_handler)
        shost->eh_deadline = -1;
    else if ((ulong) shost_eh_deadline * HZ > INT_MAX) {
        shost_printk(KERN_WARNING, shost,
                 "eh_deadline %u too large, setting to %u\n",
                 shost_eh_deadline, INT_MAX / HZ);
        shost->eh_deadline = INT_MAX;
    } else
        shost->eh_deadline = shost_eh_deadline * HZ;

    if (sht->supported_mode == MODE_UNKNOWN) // the template specifies the HBA mode
        /* means we didn't set it ... default to INITIATOR */
        shost->active_mode = MODE_INITIATOR;  // the host adapter mode defaults to initiator
    else
        shost->active_mode = sht->supported_mode;

    if (sht->max_host_blocked)
        shost->max_host_blocked = sht->max_host_blocked;
    else
        shost->max_host_blocked = SCSI_DEFAULT_HOST_BLOCKED;

    /*
     * If the driver imposes no hard sector transfer limit, start at
     * machine infinity initially.
     */
    if (sht->max_sectors)
        shost->max_sectors = sht->max_sectors;
    else
        shost->max_sectors = SCSI_DEFAULT_MAX_SECTORS;

    /*
     * assume a 4GB boundary, if not set
     */
    if (sht->dma_boundary)
        shost->dma_boundary = sht->dma_boundary;
    else
        shost->dma_boundary = 0xffffffff;  // the default DMA boundary is 4 GB

    shost->use_blk_mq = scsi_use_blk_mq && !shost->hostt->disable_blk_mq;

    device_initialize(&shost->shost_gendev); // initialize the host adapter's embedded generic device
    dev_set_name(&shost->shost_gendev, "host%d", shost->host_no);
    shost->shost_gendev.bus = &scsi_bus_type;   // set the host adapter's bus type
    shost->shost_gendev.type = &scsi_host_type; // set the host adapter's device type

    device_initialize(&shost->shost_dev);    // initialize the host adapter's embedded class device
    shost->shost_dev.parent = &shost->shost_gendev; // the class device's parent is the generic device
    shost->shost_dev.class = &shost_class;   // the class device belongs to shost_class
    dev_set_name(&shost->shost_dev, "host%d", shost->host_no);
    shost->shost_dev.groups = scsi_sysfs_shost_attr_groups;  // set the class device's attribute groups

    shost->ehandler = kthread_run(scsi_error_handler, shost,  // start the host adapter's error-recovery kernel thread
            "scsi_eh_%d", shost->host_no);
    if (IS_ERR(shost->ehandler)) {
        shost_printk(KERN_WARNING, shost,
            "error handler thread failed to spawn, error = %ld\n",
            PTR_ERR(shost->ehandler));
        goto fail_kfree;
    }
    // allocate the task management function (TMF) workqueue
    shost->tmf_work_q = alloc_workqueue("scsi_tmf_%d",
                        WQ_UNBOUND | WQ_MEM_RECLAIM,
                       1, shost->host_no);
    if (!shost->tmf_work_q) {
        shost_printk(KERN_WARNING, shost,
                 "failed to create tmf workq\n");
        goto fail_kthread;
    }
    scsi_proc_hostdir_add(shost->hostt); // add the host adapter's procfs directory, i.e. create /proc/scsi/<host adapter name>
    return shost;

 fail_kthread:
    kthread_stop(shost->ehandler);
 fail_kfree:
    kfree(shost);
    return NULL;
}
EXPORT_SYMBOL(scsi_host_alloc);
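
Two details of scsi_host_alloc are easy to miss in the long listing: the adapter structure and the driver-private area come from one zeroed allocation, and host numbers come from an atomic counter whose pre-increment value is used. The following is a small userspace sketch of both, assuming C11 atomics in place of the kernel's atomic_inc_return and calloc in place of kzalloc. struct demo_host, demo_host_alloc, and demo_host_priv are hypothetical names, not kernel APIs.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Pared-down stand-in for struct Scsi_Host. */
struct demo_host {
    unsigned int host_no;
    unsigned int max_id;
    unsigned int max_lun;
    /* Driver-private area; it lives in the same allocation. */
    unsigned long hostdata[];
};

static atomic_int demo_next_host_no; /* zero-initialized */

/* Allocate the host plus privsize bytes of private data in one block. */
struct demo_host *demo_host_alloc(size_t privsize)
{
    /* calloc zero-fills the block, as kzalloc does in the kernel. */
    struct demo_host *h = calloc(1, sizeof(*h) + privsize);

    if (!h)
        return NULL;
    /* fetch-and-add returns the value before the increment, which is
     * what "atomic_inc_return(...) - 1" computes in the listing. */
    h->host_no = (unsigned int)atomic_fetch_add(&demo_next_host_no, 1);
    h->max_id = 8;   /* defaults that a driver may override */
    h->max_lun = 8;
    return h;
}

/* Return the driver-private area that follows the host structure. */
void *demo_host_priv(struct demo_host *h)
{
    assert(h != NULL);
    return h->hostdata;
}
```

A driver-side caller would request sizeof its own per-adapter state as privsize and reach it through demo_host_priv, so one free releases both the host and the private data.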

scsi_add_host is a thin inline wrapper around scsi_add_host_with_dma:

static inline int __must_check scsi_add_host(struct Scsi_Host *host,
                         struct device *dev) // dev is the parent device
{
    return scsi_add_host_with_dma(host, dev, dev);
}

int scsi_add_host_with_dma(struct Scsi_Host *shost, struct device *dev,
               struct device *dma_dev)
{
    struct scsi_host_template *sht = shost->hostt;
    int error = -EINVAL;

    shost_printk(KERN_INFO, shost, "%s\n",
            sht->info ? sht->info(shost) : sht->name);

    if (!shost->can_queue) {
        shost_printk(KERN_ERR, shost,
                 "can_queue = 0 no longer supported\n");
        goto fail;
    }

    if (shost_use_blk_mq(shost)) {         // if the host adapter uses multi-queue I/O (blk-mq),
        error = scsi_mq_setup_tags(shost); // set up the corresponding multi-queue tags
        if (error)
            goto fail;
    } else {
        shost->bqt = blk_init_tags(shost->can_queue,
                shost->hostt->tag_alloc_policy);
        if (!shost->bqt) {
            error = -ENOMEM;
            goto fail;
        }
    }

    /*
     * Note that we allocate the freelist even for the MQ case for now,
     * as we need a command set aside for scsi_reset_provider.  Having
     * the full host freelist and one command available for that is a
     * little heavy-handed, but avoids introducing a special allocator
     * just for this.  Eventually the structure of scsi_reset_provider
     * will need a major overhaul.
     */ // allocate the buffers that hold SCSI commands and sense data, plus the reserve (backup) command list
    error = scsi_setup_command_freelist(shost);
    if (error)
        goto out_destroy_tags;

    // set the host adapter's parent device, which determines its position in sysfs; the dev argument usually comes from the pci_dev
    if (!shost->shost_gendev.parent)
        shost->shost_gendev.parent = dev ? dev : &platform_bus; // fall back to platform_bus if dev is NULL
    if (!dma_dev)
        dma_dev = shost->shost_gendev.parent;

    shost->dma_dev = dma_dev;

    error = device_add(&shost->shost_gendev);  // add the host adapter's generic device to the system
    if (error)
        goto out_destroy_freelist;

    pm_runtime_set_active(&shost->shost_gendev);
    pm_runtime_enable(&shost->shost_gendev);
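    device_enable_async_suspend(&shost->shost_gendev); // allow asynchronous suspend of the generic device

    scsi_host_set_state(shost, SHOST_RUNNING);  // set the host adapter state
    get_device(shost->shost_gendev.parent);     // take a reference on the generic device's parent

    device_enable_async_suspend(&shost->shost_dev);  // allow asynchronous suspend of the class device

    error = device_add(&shost->shost_dev);    // add the host adapter's class device to the system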
    if (error)
        goto out_del_gendev;

    get_device(&shost->shost_gendev);

    if (shost->transportt->host_size) {  // data area used by the SCSI transport class
        shost->shost_data = kzalloc(shost->transportt->host_size,
                     GFP_KERNEL);
        if (shost->shost_data == NULL) {
            error = -ENOMEM;
            goto out_del_dev;
        }
    }

    if (shost->transportt->create_work_queue) {
        snprintf(shost->work_q_name, sizeof(shost->work_q_name),
             "scsi_wq_%d", shost->host_no);
        shost->work_q = create_singlethread_workqueue( // allocate the workqueue used by the SCSI transport class
                    shost->work_q_name);
        if (!shost->work_q) {
            error = -EINVAL;
            goto out_free_shost_data;
        }
    }

    error = scsi_sysfs_add_host(shost); // add the host adapter to the SCSI subsystem in sysfs
    if (error)
        goto out_destroy_host;

    scsi_proc_host_add(shost);  // add the host adapter's information to procfs
    return error;

 out_destroy_host:
    if (shost->work_q)
        destroy_workqueue(shost->work_q);
 out_free_shost_data:
    kfree(shost->shost_data);
 out_del_dev:
    device_del(&shost->shost_dev);
 out_del_gendev:
    device_del(&shost->shost_gendev);
 out_destroy_freelist:
    scsi_destroy_command_freelist(shost);
 out_destroy_tags:
    if (shost_use_blk_mq(shost))
        scsi_mq_destroy_tags(shost);
 fail:
    return error;
}
EXPORT_SYMBOL(scsi_add_host_with_dma);

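Device probing

During boot, the kernel scans the default PCI root bus, which triggers PCI device enumeration and starts building the PCI device tree. A SCSI host adapter is a device attached to the PCI bus, so the PCI bus driver layer discovers it as a PCI device (PCI devices are enumerated through configuration-space accesses). Once the adapter is found, the operating system loads its driver, which is the low-level driver described above. The driver allocates a SCSI host adapter descriptor from its host adapter template, adds it to the system, and then starts scanning the next-level bus that the adapter exposes: the SCSI bus.

The SCSI mid layer builds an INQUIRY command for each possible ID and LUN in turn and submits these commands to the block I/O subsystem. The block layer eventually calls back into the mid layer's strategy routine, which extracts the SCSI command structure again and passes it to the low-level driver's queuecommand callback.
A target with a given ID that is present on the SCSI bus must respond to INQUIRY on LUN 0. So by sending INQUIRY to LUN 0 of each target ID, or by trying each LUN in turn and checking for a response, the mid layer ends up with the set of logical units attached in the SCSI domain, together with their information.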
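
To make the probe command concrete, the sketch below fills in a 6-byte INQUIRY CDB the same way scsi_probe_lun does later in this article: zero the CDB, set the opcode, and put the allocation length in byte 4. The opcode value 0x12 is the standard INQUIRY operation code. demo_build_inquiry and DEMO_INQUIRY_OPCODE are hypothetical names used only here.

```c
#include <assert.h>
#include <string.h>

#define DEMO_INQUIRY_OPCODE 0x12 /* INQUIRY operation code */

/* Fill a 6-byte INQUIRY CDB that asks for alloc_len bytes of standard
 * INQUIRY data (all other fields, including EVPD, stay 0). */
void demo_build_inquiry(unsigned char cdb[6], unsigned char alloc_len)
{
    assert(cdb != NULL);
    memset(cdb, 0, 6);
    cdb[0] = DEMO_INQUIRY_OPCODE; /* byte 0: operation code */
    cdb[4] = alloc_len;           /* byte 4: allocation length */
}
```

A first probe would typically call demo_build_inquiry(cdb, 36), since 36 bytes is the conservative transfer length the kernel uses on its first INQUIRY pass.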

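How the SCSI bus is actually scanned can be implemented by the host adapter firmware or by the host adapter driver. Only the case where the driver calls scsi_scan_host, the generic scan function provided by the SCSI mid layer, is discussed here.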

void scsi_scan_host(struct Scsi_Host *shost)
{
    struct async_scan_data *data;

    if (strncmp(scsi_scan_type, "none", 4) == 0) // check the scan type; "none" disables scanning
        return;
    if (scsi_autopm_get_host(shost) < 0)
        return;

    data = scsi_prep_async_scan(shost); // prepare an asynchronous scan
    if (!data) {
        do_scsi_scan_host(shost);    // scan synchronously
        scsi_autopm_put_host(shost);
        return;
    }

    /* register with the async subsystem so wait_for_device_probe()
     * will flush this work
     */
    async_schedule(do_scan_async, data);  // scan asynchronously

    /* scsi_autopm_put_host(shost) is called in scsi_finish_async_scan() */
}
EXPORT_SYMBOL(scsi_scan_host);

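scsi_scan_host is the host adapter scan function provided by the SCSI mid layer. A host adapter driver that needs custom scan logic can set the corresponding callbacks in its host adapter template, and scsi_scan_host invokes them to perform the custom scan.
The scsi_scan_type variable selects the scan mode: async, sync, or none. Whether the scan ends up synchronous or asynchronous, the work is done by do_scsi_scan_host: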

static void do_scsi_scan_host(struct Scsi_Host *shost)
{
    if (shost->hostt->scan_finished) {  // use the driver's custom scan
        unsigned long start = jiffies;
        if (shost->hostt->scan_start)
            shost->hostt->scan_start(shost); // custom scan-start callback

        while (!shost->hostt->scan_finished(shost, jiffies - start)) // returns 1 once the custom scan is complete
            msleep(10);
    } else { // generic SCSI scan; SCAN_WILD_CARD means scan all channels, targets, and LUNs
        scsi_scan_host_selected(shost, SCAN_WILD_CARD, SCAN_WILD_CARD,
                SCAN_WILD_CARD, 0);
    }
}
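
The custom-scan branch is a start-then-poll protocol: an optional scan_start callback kicks off the scan, and scan_finished is polled with the elapsed time until it returns nonzero. The userspace sketch below mimics that loop with a simulated clock instead of jiffies and msleep. struct demo_scan_ops, demo_run_custom_scan, and the demo driver callbacks are hypothetical names, and the 30 ms completion time is an arbitrary value chosen for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the two scan callbacks in scsi_host_template. */
struct demo_scan_ops {
    void (*scan_start)(void *ctx);                           /* optional */
    int (*scan_finished)(void *ctx, unsigned long elapsed);  /* nonzero = done */
};

/* Start the scan, then poll until the driver reports completion.
 * Returns the simulated time spent polling, in milliseconds. */
unsigned long demo_run_custom_scan(const struct demo_scan_ops *ops, void *ctx)
{
    unsigned long elapsed = 0;

    assert(ops != NULL && ops->scan_finished != NULL);
    if (ops->scan_start)
        ops->scan_start(ctx);
    while (!ops->scan_finished(ctx, elapsed))
        elapsed += 10; /* stands in for msleep(10) */
    return elapsed;
}

/* A hypothetical driver whose scan completes after 30 simulated ms. */
struct demo_ctx {
    int started;
};

static void demo_start(void *ctx)
{
    ((struct demo_ctx *)ctx)->started = 1;
}

static int demo_finished(void *ctx, unsigned long elapsed)
{
    (void)ctx;
    return elapsed >= 30;
}

const struct demo_scan_ops demo_ops = { demo_start, demo_finished };
```

Because the elapsed time is passed to scan_finished, a driver can also use it to give up after a timeout rather than waiting on hardware indefinitely.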

If the host adapter template provides custom scan callbacks, do_scsi_scan_host calls them. Otherwise it runs the default scan function, scsi_scan_host_selected.

int scsi_scan_host_selected(struct Scsi_Host *shost, unsigned int channel,
                unsigned int id, u64 lun, int rescan)
{
    SCSI_LOG_SCAN_BUS(3, shost_printk (KERN_INFO, shost,
        "%s: <%u:%u:%llu>\n",
        __func__, channel, id, lun));
    // check that channel, id, and lun are valid
    if (((channel != SCAN_WILD_CARD) && (channel > shost->max_channel)) ||
        ((id != SCAN_WILD_CARD) && (id >= shost->max_id)) ||
        ((lun != SCAN_WILD_CARD) && (lun >= shost->max_lun)))
        return -EINVAL;

    mutex_lock(&shost->scan_mutex);
    if (!shost->async_scan)
        scsi_complete_async_scans();
    // check that the Scsi_Host state allows scanning
    if (scsi_host_scan_allowed(shost) && scsi_autopm_get_host(shost) == 0) {
        if (channel == SCAN_WILD_CARD)
            for (channel = 0; channel <= shost->max_channel; // iterate over every channel
                 channel++)
                scsi_scan_channel(shost, channel, id, lun,  // scan one channel
                          rescan);
        else
            scsi_scan_channel(shost, channel, id, lun, rescan); // scan the requested channel
        scsi_autopm_put_host(shost);
    }
    mutex_unlock(&shost->scan_mutex);

    return 0;
}
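
The argument check at the top of scsi_scan_host_selected has one subtlety: max_channel is the highest valid channel number, so the test is "greater than", while max_id and max_lun are counts, so the test is "greater than or equal". The sketch below isolates that check in userspace. struct demo_limits, demo_scan_args_valid, and the two DEMO_WILD_* constants are hypothetical; separate wildcard constants are used for the 32-bit and 64-bit parameters to keep the comparisons explicit.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_WILD_CARD (~0u)    /* wildcard for channel and id */
#define DEMO_WILD_LUN  (~0ull)  /* wildcard for the 64-bit lun */

/* The per-host limits consulted by the check. */
struct demo_limits {
    unsigned int max_channel;   /* highest valid channel number */
    unsigned int max_id;        /* number of target ids */
    unsigned long long max_lun; /* number of LUNs per target */
};

/* Return 1 if the scan request is acceptable, 0 if it is out of range.
 * A wildcard always passes, because it means "scan all of them". */
int demo_scan_args_valid(const struct demo_limits *lim, unsigned int channel,
                         unsigned int id, unsigned long long lun)
{
    assert(lim != NULL);
    if (channel != DEMO_WILD_CARD && channel > lim->max_channel)
        return 0;
    if (id != DEMO_WILD_CARD && id >= lim->max_id)
        return 0;
    if (lun != DEMO_WILD_LUN && lun >= lim->max_lun)
        return 0;
    return 1;
}
```

With the defaults set in scsi_host_alloc (max_channel 0, max_id 8, max_lun 8), a request for channel 0, id 7, lun 0 passes, while channel 1 or id 8 is rejected.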

scsi_scan_host_selected scans the given host adapter. Depending on its arguments it either iterates over all channels or scans one specific channel, in both cases through scsi_scan_channel.

static void scsi_scan_channel(struct Scsi_Host *shost, unsigned int channel,
                  unsigned int id, u64 lun, int rescan)
{
    uint order_id;

    if (id == SCAN_WILD_CARD)
        for (id = 0; id < shost->max_id; ++id) {  // iterate over all targets
            /*
             * XXX adapter drivers when possible (FCP, iSCSI)
             * could modify max_id to match the current max,
             * not the absolute max.
             *
             * XXX add a shost id iterator, so for example,
             * the FC ID can be the same as a target id
             * without a huge overhead of sparse id's.
             */
            if (shost->reverse_ordering)
                /*
                 * Scan from high to low id.
                 */
                order_id = shost->max_id - id - 1;
            else
                order_id = id;
            __scsi_scan_target(&shost->shost_gendev, channel, // scan one target
                    order_id, lun, rescan);
        }
    else
        __scsi_scan_target(&shost->shost_gendev, channel,
                id, lun, rescan);
}
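
The order in which target ids are visited depends on reverse_ordering, and the adapter's own id (this_id) is skipped inside __scsi_scan_target. The sketch below combines those two rules into one userspace function that records the resulting scan order. demo_scan_order is a hypothetical name, and folding the this_id check into the loop is a simplification for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Write the target ids in the order they would be scanned into out[]
 * (which must hold max_id entries) and return how many were written.
 * The adapter's own id, this_id, is never scanned. */
unsigned int demo_scan_order(unsigned int max_id, int reverse_ordering,
                             unsigned int this_id, unsigned int *out)
{
    unsigned int id, n = 0;

    assert(out != NULL);
    for (id = 0; id < max_id; ++id) {
        /* Same computation as order_id in scsi_scan_channel. */
        unsigned int order_id = reverse_ordering ? max_id - id - 1 : id;

        if (order_id == this_id)
            continue; /* do not scan the host adapter itself */
        out[n++] = order_id;
    }
    return n;
}
```

For an 8-id bus where the adapter sits at id 7, forward ordering yields ids 0 through 6, and reverse ordering yields 6 down to 0.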

__scsi_scan_target scans the LUNs inside the given target.

static void __scsi_scan_target(struct device *parent, unsigned int channel,
        unsigned int id, u64 lun, int rescan)
{
    struct Scsi_Host *shost = dev_to_shost(parent);
    int bflags = 0;
    int res;
    struct scsi_target *starget;

    if (shost->this_id == id)
        /*
         * Don't scan the host adapter
         */
        return;
    // allocate and initialize the target structure for the given id
    starget = scsi_alloc_target(parent, channel, id);
    if (!starget)
        return;
    scsi_autopm_get_target(starget);

    if (lun != SCAN_WILD_CARD) {
        /*
         * Scan for a specific host/chan/id/lun.
         */ // probe the requested LUN (scsi_device) in the target and add it to the subsystem
        scsi_probe_and_add_lun(starget, lun, NULL, NULL, rescan, NULL);
        goto out_reap;
    }

    /*
     * Scan LUN 0, if there is some response, scan further. Ideally, we
     * would not configure LUN 0 until all LUNs are scanned.
     */ // probe LUN 0 of the target
    res = scsi_probe_and_add_lun(starget, 0, &bflags, NULL, rescan, NULL);
    if (res == SCSI_SCAN_LUN_PRESENT || res == SCSI_SCAN_TARGET_PRESENT) {
        if (scsi_report_lun_scan(starget, bflags, rescan) != 0) // send REPORT LUNS to LUN 0 of the target and scan the LUNs it reports
            /*
             * The REPORT LUN did not scan the target,
             * do a sequential scan.
             */
            scsi_sequential_lun_scan(starget, bflags,  // REPORT LUNS did not work: fall back to probing LUNs one by one
                         starget->scsi_level, rescan);
    }

 out_reap:
    scsi_autopm_put_target(starget);
    /*
     * paired with scsi_alloc_target(): determine if the target has
     * any children at all and if not, nuke it
     */
    scsi_target_reap(starget);

    put_device(&starget->dev);
}
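
The decision flow in __scsi_scan_target can be summarized as: a specific LUN is probed directly; otherwise LUN 0 is probed first, REPORT LUNS is tried if anything answered, and a sequential scan is the fallback. The sketch below encodes only that decision, with the target's behavior reduced to two flags. struct demo_target, enum demo_strategy, and demo_pick_strategy are hypothetical names, and the model ignores details such as bflags and the scsi_level passed to the sequential scan.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_WILD_LUN (~0ull) /* stands in for SCAN_WILD_CARD as a lun */

enum demo_strategy {
    DEMO_NONE,        /* nothing answered on LUN 0: stop */
    DEMO_SINGLE_LUN,  /* caller asked for one specific LUN */
    DEMO_REPORT_LUNS, /* REPORT LUNS enumerated the target */
    DEMO_SEQUENTIAL   /* fallback: probe LUNs one by one */
};

/* Simplified model of how a target behaves during a scan. */
struct demo_target {
    int lun0_responds;     /* did the probe of LUN 0 get any answer? */
    int report_luns_works; /* does REPORT LUNS succeed on this target? */
};

enum demo_strategy demo_pick_strategy(const struct demo_target *t,
                                      unsigned long long lun)
{
    assert(t != NULL);
    if (lun != DEMO_WILD_LUN)
        return DEMO_SINGLE_LUN;
    if (!t->lun0_responds)
        return DEMO_NONE;
    if (t->report_luns_works)
        return DEMO_REPORT_LUNS;
    return DEMO_SEQUENTIAL;
}
```

Probing LUN 0 first keeps the scan cheap on empty target ids: a single unanswered INQUIRY is enough to skip the whole target.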

When a target is scanned, a scsi_target structure is allocated and initialized. scsi_probe_and_add_lun then probes the LUNs in the target and adds each discovered LUN to the system.

static int scsi_probe_and_add_lun(struct scsi_target *starget,
                  u64 lun, int *bflagsp,
                  struct scsi_device **sdevp, int rescan,
                  void *hostdata)
{
    struct scsi_device *sdev;
    unsigned char *result;
    int bflags, res = SCSI_SCAN_NO_RESPONSE, result_len = 256;
    struct Scsi_Host *shost = dev_to_shost(starget->dev.parent);

    /*
     * The rescan flag is used as an optimization, the first scan of a
     * host adapter calls into here with rescan == 0.
     */
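    sdev = scsi_device_lookup_by_target(starget, lun);  // look up the requested LUN in the target
    if (sdev) {   // the LUN already exists in the target
        if (rescan || !scsi_device_created(sdev)) { // on a rescan, or if the device is already past the created state, report it as present without probing again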
            SCSI_LOG_SCAN_BUS(3, sdev_printk(KERN_INFO, sdev,
                "scsi scan: device exists on %s\n",
                dev_name(&sdev->sdev_gendev)));
            if (sdevp)
                *sdevp = sdev;
            else
                scsi_device_put(sdev);

            if (bflagsp)
                *bflagsp = scsi_get_device_flags(sdev,
                                 sdev->vendor,
                                 sdev->model);
            return SCSI_SCAN_LUN_PRESENT;
        }
        scsi_device_put(sdev);
    } else
        sdev = scsi_alloc_sdev(starget, lun, hostdata); // the LUN does not exist yet: allocate a scsi_device
    if (!sdev)
        goto out;

    result = kmalloc(result_len, GFP_ATOMIC |
            ((shost->unchecked_isa_dma) ? __GFP_DMA : 0));
    if (!result)
        goto out_free_sdev;

    if (scsi_probe_lun(sdev, result, result_len, &bflags)) // probe the device by sending INQUIRY
        goto out_free_result;

    if (bflagsp)
        *bflagsp = bflags;
    /*
     * result contains valid SCSI INQUIRY data.
     */
    if (((result[0] >> 5) == 3) && !(bflags & BLIST_ATTACH_PQ3)) {
        /*
         * For a Peripheral qualifier 3 (011b), the SCSI
         * spec says: The device server is not capable of
         * supporting a physical device on this logical
         * unit.
         *
         * For disks, this implies that there is no
         * logical disk configured at sdev->lun, but there
         * is a target id responding.
         */
        SCSI_LOG_SCAN_BUS(2, sdev_printk(KERN_INFO, sdev, "scsi scan:"
                   " peripheral qualifier of 3, device not"
                   " added\n"))
        if (lun == 0) {
            SCSI_LOG_SCAN_BUS(1, {
                unsigned char vend[9];
                unsigned char mod[17];

                sdev_printk(KERN_INFO, sdev,
                    "scsi scan: consider passing scsi_mod."
                    "dev_flags=%s:%s:0x240 or 0x1000240\n",
                    scsi_inq_str(vend, result, 8, 16),
                    scsi_inq_str(mod, result, 16, 32));
            });

        }

        res = SCSI_SCAN_TARGET_PRESENT;
        goto out_free_result;
    }

    /*
     * Some targets may set slight variations of PQ and PDT to signal
     * that no LUN is present, so don't add sdev in these cases.
     * Two specific examples are:
     * 1) NetApp targets: return PQ=1, PDT=0x1f
     * 2) USB UFI: returns PDT=0x1f, with the PQ bits being "reserved"
     *    in the UFI 1.0 spec (we cannot rely on reserved bits).
     *
     * References:
     * 1) SCSI SPC-3, pp. 145-146
     * PQ=1: "A peripheral device having the specified peripheral
     * device type is not connected to this logical unit. However, the
     * device server is capable of supporting the specified peripheral
     * device type on this logical unit."
     * PDT=0x1f: "Unknown or no device type"
     * 2) USB UFI 1.0, p. 20
     * PDT=00h Direct-access device (floppy)
     * PDT=1Fh none (no FDD connected to the requested logical unit)
     */
    if (((result[0] >> 5) == 1 || starget->pdt_1f_for_no_lun) &&
        (result[0] & 0x1f) == 0x1f &&
        !scsi_is_wlun(lun)) {
        SCSI_LOG_SCAN_BUS(3, sdev_printk(KERN_INFO, sdev,
                    "scsi scan: peripheral device type"
                    " of 31, no device added\n"));
        res = SCSI_SCAN_TARGET_PRESENT;
        goto out_free_result;
    }
    // add the SCSI device to the subsystem
    res = scsi_add_lun(sdev, result, &bflags, shost->async_scan);
    if (res == SCSI_SCAN_LUN_PRESENT) {
        if (bflags & BLIST_KEY) {
            sdev->lockable = 0;
            scsi_unlock_floptical(sdev, result);
        }
    }

 out_free_result:
    kfree(result);
 out_free_sdev:
    if (res == SCSI_SCAN_LUN_PRESENT) {
        if (sdevp) {
            if (scsi_device_get(sdev) == 0) {
                *sdevp = sdev;
            } else {
                __scsi_remove_device(sdev);
                res = SCSI_SCAN_NO_RESPONSE;
            }
        }
    } else
        __scsi_remove_device(sdev);
 out:
    return res;
}
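
The two "target present but no LUN" checks in scsi_probe_and_add_lun both decode byte 0 of the INQUIRY data: bits 7..5 are the peripheral qualifier (PQ) and bits 4..0 are the peripheral device type (PDT). The sketch below shows that decoding in userspace. It is deliberately simplified: it ignores BLIST_ATTACH_PQ3, pdt_1f_for_no_lun, and well-known LUNs, and the names demo_pq, demo_pdt, and demo_classify are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum demo_verdict {
    DEMO_ADD_LUN,     /* a usable logical unit: add it */
    DEMO_TARGET_ONLY  /* the target answered, but no LUN is here */
};

/* Peripheral qualifier: top three bits of INQUIRY byte 0. */
unsigned int demo_pq(const unsigned char *inq)
{
    assert(inq != NULL);
    return inq[0] >> 5;
}

/* Peripheral device type: low five bits of INQUIRY byte 0. */
unsigned int demo_pdt(const unsigned char *inq)
{
    assert(inq != NULL);
    return inq[0] & 0x1f;
}

/* Simplified version of the two checks in the listing above. */
enum demo_verdict demo_classify(const unsigned char *inq)
{
    if (demo_pq(inq) == 3)
        return DEMO_TARGET_ONLY; /* LUN not supported by the device server */
    if (demo_pq(inq) == 1 && demo_pdt(inq) == 0x1f)
        return DEMO_TARGET_ONLY; /* supported, but nothing connected */
    return DEMO_ADD_LUN;
}
```

For example, a byte 0 of 0x00 (PQ 0, direct-access device) is added as a LUN, while 0x7f (PQ 3, PDT 0x1f) only indicates that the target id is responding.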

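As its name suggests, scsi_probe_and_add_lun performs two operations on a LUN, probe and add:
1. Probe the logical unit with scsi_probe_lun, which sends an INQUIRY command to the device.
2. Add the logical unit to the system with scsi_add_lun, based on the data returned by INQUIRY.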

static int scsi_probe_lun(struct scsi_device *sdev, unsigned char *inq_result,
              int result_len, int *bflags)
{
    unsigned char scsi_cmd[MAX_COMMAND_SIZE];
    int first_inquiry_len, try_inquiry_len, next_inquiry_len;
    int response_len = 0;
    int pass, count, result;
    struct scsi_sense_hdr sshdr;

    *bflags = 0;

    /* Perform up to 3 passes.  The first pass uses a conservative
     * transfer length of 36 unless sdev->inquiry_len specifies a
     * different value. */
    first_inquiry_len = sdev->inquiry_len ? sdev->inquiry_len : 36;
    try_inquiry_len = first_inquiry_len;
    pass = 1;

 next_pass:
    SCSI_LOG_SCAN_BUS(3, sdev_printk(KERN_INFO, sdev,
                "scsi scan: INQUIRY pass %d length %d\n",
                pass, try_inquiry_len));

    /* Each pass gets up to three chances to ignore Unit Attention */
    for (count = 0; count < 3; ++count) {
        int resid;

        memset(scsi_cmd, 0, 6);
        scsi_cmd[0] = INQUIRY;      // the opcode is INQUIRY
        scsi_cmd[4] = (unsigned char) try_inquiry_len;

        memset(inq_result, 0, try_inquiry_len);
        // send the SCSI command, with up to 3 retries
        result = scsi_execute_req(sdev,  scsi_cmd, DMA_FROM_DEVICE,
                      inq_result, try_inquiry_len, &sshdr,
                      HZ / 2 + HZ * scsi_inq_timeout, 3,
                      &resid);

        SCSI_LOG_SCAN_BUS(3, sdev_printk(KERN_INFO, sdev,
                "scsi scan: INQUIRY %s with code 0x%x\n",
                result ? "failed" : "successful", result));

        if (result) {
            /*
             * not-ready to ready transition [asc/ascq=0x28/0x0]
             * or power-on, reset [asc/ascq=0x29/0x0], continue.
             * INQUIRY should not yield UNIT_ATTENTION
             * but many buggy devices do so anyway.
             */
            if ((driver_byte(result) & DRIVER_SENSE) &&
                scsi_sense_valid(&sshdr)) {
                if ((sshdr.sense_key == UNIT_ATTENTION) &&
                    ((sshdr.asc == 0x28) ||
                     (sshdr.asc == 0x29)) &&
                    (sshdr.ascq == 0))
                    continue;
            }
        } else {
            /*
             * if nothing was transferred, we try
             * again. It's a workaround for some USB
             * devices.
             */
            if (resid == try_inquiry_len)
                continue;
        }
        break;
    }

    if (result == 0) {
        sanitize_inquiry_string(&inq_result[8], 8);
        sanitize_inquiry_string(&inq_result[16], 16);
        sanitize_inquiry_string(&inq_result[32], 4);

        response_len = inq_result[4] + 5;
        if (response_len > 255)
            response_len = first_inquiry_len;   /* sanity */

        /*
         * Get any flags for this device.
         *
         * XXX add a bflags to scsi_device, and replace the
         * corresponding bit fields in scsi_device, so bflags
         * need not be passed as an argument.
         */
        *bflags = scsi_get_device_flags(sdev, &inq_result[8],
                &inq_result[16]);

        /* When the first pass succeeds we gain information about
         * what larger transfer lengths might work. */
        if (pass == 1) {
            if (BLIST_INQUIRY_36 & *bflags)
                next_inquiry_len = 36;
            else if (BLIST_INQUIRY_58 & *bflags)
                next_inquiry_len = 58;
            else if (sdev->inquiry_len)
                next_inquiry_len = sdev->inquiry_len;
            else
                next_inquiry_len = response_len;

            /* If more data is available perform the second pass */
            if (next_inquiry_len > try_inquiry_len) {
                try_inquiry_len = next_inquiry_len;
                pass = 2;
                goto next_pass;
            }
        }

    } else if (pass == 2) {
        sdev_printk(KERN_INFO, sdev,
                "scsi scan: %d byte inquiry failed.  "
                "Consider BLIST_INQUIRY_36 for this device\n",
                try_inquiry_len);

        /* If this pass failed, the third pass goes back and transfers
         * the same amount as we successfully got in the first pass. */
        try_inquiry_len = first_inquiry_len;
        pass = 3;
        goto next_pass;
    }

    /* If the last transfer attempt got an error, assume the
     * peripheral doesn't exist or is dead. */
    if (result)
        return -EIO;

    /* Don't report any more data than the device says is valid */
    sdev->inquiry_len = min(try_inquiry_len, response_len);

    /*
     * XXX Abort if the response length is less than 36? If less than
     * 32, the lookup of the device flags (above) could be invalid,
     * and it would be possible to take an incorrect action - we do
     * not want to hang because of a short INQUIRY. On the flip side,
     * if the device is spun down or becoming ready (and so it gives a
     * short INQUIRY), an abort here prevents any further use of the
     * device, including spin up.
     *
     * On the whole, the best approach seems to be to assume the first
     * 36 bytes are valid no matter what the device says.  That's
     * better than copying < 36 bytes to the inquiry-result buffer
     * and displaying garbage for the Vendor, Product, or Revision
     * strings.
     */
    if (sdev->inquiry_len < 36) {
        if (!sdev->host->short_inquiry) {
            shost_printk(KERN_INFO, sdev->host,
                    "scsi scan: INQUIRY result too short (%d),"
                    " using 36\n", sdev->inquiry_len);
            sdev->host->short_inquiry = 1;
        }
        sdev->inquiry_len = 36;
    }

    /*
     * Related to the above issue:
     *
     * XXX Devices (disk or all?) should be sent a TEST UNIT READY,
     * and if not ready, sent a START_STOP to start (maybe spin up) and
     * then send the INQUIRY again, since the INQUIRY can change after
     * a device is initialized.
     *
     * Ideally, start a device if explicitly asked to do so.  This
     * assumes that a device is spun up on power on, spun down on
     * request, and then spun up on request.
     */

    /*
     * The scanning code needs to know the scsi_level, even if no
     * device is attached at LUN 0 (SCSI_SCAN_TARGET_PRESENT) so
     * non-zero LUNs can be scanned.
     */
    sdev->scsi_level = inq_result[2] & 0x07;
    if (sdev->scsi_level >= 2 ||
        (sdev->scsi_level == 1 && (inq_result[3] & 0x0f) == 1))
        sdev->scsi_level++;
    sdev->sdev_target->scsi_level = sdev->scsi_level;

    /*
     * If SCSI-2 or lower, and if the transport requires it,
     * store the LUN value in CDB[1].
     */
    sdev->lun_in_cdb = 0;
    if (sdev->scsi_level <= SCSI_2 &&
        sdev->scsi_level != SCSI_UNKNOWN &&
        !sdev->host->no_scsi2_lun_in_cdb)
        sdev->lun_in_cdb = 1;

    return 0;
}


static int scsi_add_lun(struct scsi_device *sdev, unsigned char *inq_result,
        int *bflags, int async)
{
    int ret;

    /*
     * XXX do not save the inquiry, since it can change underneath us,
     * save just vendor/model/rev.
     *
     * Rather than save it and have an ioctl that retrieves the saved
     * value, have an ioctl that executes the same INQUIRY code used
     * in scsi_probe_lun, let user level programs doing INQUIRY
     * scanning run at their own risk, or supply a user level program
     * that can correctly scan.
     */

    /*
     * Copy at least 36 bytes of INQUIRY data, so that we don't
     * dereference unallocated memory when accessing the Vendor,
     * Product, and Revision strings.  Badly behaved devices may set
     * the INQUIRY Additional Length byte to a small value, indicating
     * these strings are invalid, but often they contain plausible data
     * nonetheless.  It doesn't matter if the device sent < 36 bytes
     * total, since scsi_probe_lun() initializes inq_result with 0s.
     */
    sdev->inquiry = kmemdup(inq_result,
                max_t(size_t, sdev->inquiry_len, 36),
                GFP_ATOMIC);
    if (sdev->inquiry == NULL)
        return SCSI_SCAN_NO_RESPONSE;

    sdev->vendor = (char *) (sdev->inquiry + 8); // bytes 8-15: vendor identification
    sdev->model = (char *) (sdev->inquiry + 16); // bytes 16-31: product identification
    sdev->rev = (char *) (sdev->inquiry + 32);   // bytes 32-35: product revision level

    if (strncmp(sdev->vendor, "ATA     ", 8) == 0) {
        /*
         * sata emulation layer device.  This is a hack to work around
         * the SATL power management specifications which state that
         * when the SATL detects the device has gone into standby
         * mode, it shall respond with NOT READY.
         */
        sdev->allow_restart = 1;
    }

    if (*bflags & BLIST_ISROM) {
        sdev->type = TYPE_ROM;
        sdev->removable = 1;
    } else {
        sdev->type = (inq_result[0] & 0x1f);
        sdev->removable = (inq_result[1] & 0x80) >> 7;

        /*
         * some devices may respond with wrong type for
         * well-known logical units. Force well-known type
         * to enumerate them correctly.
         */
        if (scsi_is_wlun(sdev->lun) && sdev->type != TYPE_WLUN) {
            sdev_printk(KERN_WARNING, sdev,
                "%s: correcting incorrect peripheral device type 0x%x for W-LUN 0x%16xhN\n",
                __func__, sdev->type, (unsigned int)sdev->lun);
            sdev->type = TYPE_WLUN;
        }

    }

    if (sdev->type == TYPE_RBC || sdev->type == TYPE_ROM) {
        /* RBC and MMC devices can return SCSI-3 compliance and yet
         * still not support REPORT LUNS, so make them act as
         * BLIST_NOREPORTLUN unless BLIST_REPORTLUN2 is
         * specifically set */
        if ((*bflags & BLIST_REPORTLUN2) == 0)
            *bflags |= BLIST_NOREPORTLUN;
    }

    /*
     * For a peripheral qualifier (PQ) value of 1 (001b), the SCSI
     * spec says: The device server is capable of supporting the
     * specified peripheral device type on this logical unit. However,
     * the physical device is not currently connected to this logical
     * unit.
     *
     * The above is vague, as it implies that we could treat 001 and
     * 011 the same. Stay compatible with previous code, and create a
     * scsi_device for a PQ of 1
     *
     * Don't set the device offline here; rather let the upper
     * level drivers eval the PQ to decide whether they should
     * attach. So remove ((inq_result[0] >> 5) & 7) == 1 check.
     */

    sdev->inq_periph_qual = (inq_result[0] >> 5) & 7;
    sdev->lockable = sdev->removable;
    sdev->soft_reset = (inq_result[7] & 1) && ((inq_result[3] & 7) == 2);

    if (sdev->scsi_level >= SCSI_3 ||
            (sdev->inquiry_len > 56 && inq_result[56] & 0x04))
        sdev->ppr = 1;
    if (inq_result[7] & 0x60)
        sdev->wdtr = 1;
    if (inq_result[7] & 0x10)
        sdev->sdtr = 1;

    sdev_printk(KERN_NOTICE, sdev, "%s %.8s %.16s %.4s PQ: %d "
            "ANSI: %d%s\n", scsi_device_type(sdev->type),
            sdev->vendor, sdev->model, sdev->rev,
            sdev->inq_periph_qual, inq_result[2] & 0x07,
            (inq_result[3] & 0x0f) == 1 ? " CCS" : "");

    if ((sdev->scsi_level >= SCSI_2) && (inq_result[7] & 2) &&
        !(*bflags & BLIST_NOTQ)) {
        sdev->tagged_supported = 1;
        sdev->simple_tags = 1;
    }

    /*
     * Some devices (Texel CD ROM drives) have handshaking problems
     * when used with the Seagate controllers. borken is initialized
     * to 1, and then set it to 0 here.
     */
    if ((*bflags & BLIST_BORKEN) == 0)
        sdev->borken = 0;

    if (*bflags & BLIST_NO_ULD_ATTACH)
        sdev->no_uld_attach = 1;

    /*
     * Apparently some really broken devices (contrary to the SCSI
     * standards) need to be selected without asserting ATN
     */
    if (*bflags & BLIST_SELECT_NO_ATN)
        sdev->select_no_atn = 1;

    /*
     * Maximum 512 sector transfer length
     * broken RA4x00 Compaq Disk Array
     */
    if (*bflags & BLIST_MAX_512)
        blk_queue_max_hw_sectors(sdev->request_queue, 512);
    /*
     * Max 1024 sector transfer length for targets that report incorrect
     * max/optimal lengths and relied on the old block layer safe default
     */
    else if (*bflags & BLIST_MAX_1024)
        blk_queue_max_hw_sectors(sdev->request_queue, 1024);

    /*
     * Some devices may not want to have a start command automatically
     * issued when a device is added.
     */
    if (*bflags & BLIST_NOSTARTONADD)
        sdev->no_start_on_add = 1;

    if (*bflags & BLIST_SINGLELUN)
        scsi_target(sdev)->single_lun = 1;

    sdev->use_10_for_rw = 1;

    if (*bflags & BLIST_MS_SKIP_PAGE_08)
        sdev->skip_ms_page_8 = 1;

    if (*bflags & BLIST_MS_SKIP_PAGE_3F)
        sdev->skip_ms_page_3f = 1;

    if (*bflags & BLIST_USE_10_BYTE_MS)
        sdev->use_10_for_ms = 1;

    /* some devices don't like REPORT SUPPORTED OPERATION CODES
     * and will simply timeout causing sd_mod init to take a very
     * very long time */
    if (*bflags & BLIST_NO_RSOC)
        sdev->no_report_opcodes = 1;

    /* set the device running here so that slave configure
     * may do I/O */
    ret = scsi_device_set_state(sdev, SDEV_RUNNING); // move the device to the running state
    if (ret) {
        ret = scsi_device_set_state(sdev, SDEV_BLOCK);

        if (ret) {
            sdev_printk(KERN_ERR, sdev,
                    "in wrong state %s to complete scan\n",
                    scsi_device_state_name(sdev->sdev_state));
            return SCSI_SCAN_NO_RESPONSE;
        }
    }

    if (*bflags & BLIST_MS_192_BYTES_FOR_3F)
        sdev->use_192_bytes_for_3f = 1;

    if (*bflags & BLIST_NOT_LOCKABLE)
        sdev->lockable = 0;

    if (*bflags & BLIST_RETRY_HWERROR)
        sdev->retry_hwerror = 1;

    if (*bflags & BLIST_NO_DIF)
        sdev->no_dif = 1;

    sdev->eh_timeout = SCSI_DEFAULT_EH_TIMEOUT;

    if (*bflags & BLIST_TRY_VPD_PAGES)
        sdev->try_vpd_pages = 1;
    else if (*bflags & BLIST_SKIP_VPD_PAGES)
        sdev->skip_vpd_pages = 1;

    transport_configure_device(&sdev->sdev_gendev); // configure the LUN in the SCSI transport layer

    if (sdev->host->hostt->slave_configure) {
        ret = sdev->host->hostt->slave_configure(sdev); // callback set in the host adapter template; performs driver-specific initialization of the scsi_device (LUN)
        if (ret) {
            /*
             * if LLDD reports slave not present, don't clutter
             * console with alloc failure messages
             */
            if (ret != -ENXIO) {
                sdev_printk(KERN_ERR, sdev,
                    "failed to configure device\n");
            }
            return SCSI_SCAN_NO_RESPONSE;
        }
    }

    if (sdev->scsi_level >= SCSI_3)
        scsi_attach_vpd(sdev);

    sdev->max_queue_depth = sdev->queue_depth;  // set the maximum queue depth

    /*
     * Ok, the device is now all set up, we can
     * register it and tell the rest of the kernel
     * about it.
     */ // add the scsi_device (LUN) to sysfs
    if (!async && scsi_sysfs_add_sdev(sdev) != 0)
        return SCSI_SCAN_NO_RESPONSE;

    return SCSI_SCAN_LUN_PRESENT;
}
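
To make the byte offsets used by scsi_probe_lun() and scsi_add_lun() easier to follow, here is a minimal userspace sketch that decodes the same fields from a 36-byte standard INQUIRY response. It is only an illustration: struct inq_summary, parse_inquiry() and fill_example_inquiry() are hypothetical names that do not exist in the kernel, and the sample INQUIRY bytes are made up rather than captured from a real device.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical userspace summary of the fields the scan code extracts. */
struct inq_summary {
    unsigned type;        /* peripheral device type: byte 0, bits 0-4 */
    unsigned periph_qual; /* peripheral qualifier:   byte 0, bits 5-7 */
    unsigned removable;   /* RMB flag:               byte 1, bit 7    */
    unsigned scsi_level;  /* version field, adjusted like the kernel  */
    char vendor[9];       /* bytes 8-15,  NUL-terminated copy         */
    char model[17];       /* bytes 16-31, NUL-terminated copy         */
    char rev[5];          /* bytes 32-35, NUL-terminated copy         */
};

/* Decode a standard INQUIRY buffer of at least 36 bytes. */
static void parse_inquiry(const uint8_t *inq, struct inq_summary *out)
{
    out->type = inq[0] & 0x1f;
    out->periph_qual = (inq[0] >> 5) & 7;
    out->removable = (inq[1] & 0x80) >> 7;

    /*
     * Same adjustment as in the listing above: the raw version field is
     * incremented for values >= 2, and for value 1 when the response
     * data format (byte 3, low nibble) is 1 (CCS).
     */
    out->scsi_level = inq[2] & 0x07;
    if (out->scsi_level >= 2 ||
        (out->scsi_level == 1 && (inq[3] & 0x0f) == 1))
        out->scsi_level++;

    /* The strings are fixed-width and space-padded, not NUL-terminated. */
    memcpy(out->vendor, inq + 8, 8);
    out->vendor[8] = '\0';
    memcpy(out->model, inq + 16, 16);
    out->model[16] = '\0';
    memcpy(out->rev, inq + 32, 4);
    out->rev[4] = '\0';
}

/* Build a made-up INQUIRY response for a fixed direct-access device. */
static void fill_example_inquiry(uint8_t inq[36])
{
    memset(inq, 0, 36);
    inq[0] = 0x00;                          /* PQ 0, type 0 (disk)    */
    inq[1] = 0x00;                          /* not removable          */
    inq[2] = 0x05;                          /* raw version field      */
    inq[3] = 0x02;                          /* response data format 2 */
    memcpy(inq + 8, "ATA     ", 8);         /* vendor, space-padded   */
    memcpy(inq + 16, "EXAMPLE DISK    ", 16);
    memcpy(inq + 32, "1.0 ", 4);
}
```

Calling fill_example_inquiry() and then parse_inquiry() on the same buffer yields type 0, a non-removable device, and a scsi_level of 6, because the raw version value 5 is incremented by one. The kernel keeps pointers into its own copy of the INQUIRY buffer instead of making NUL-terminated copies, which is why the listing prints the strings with %.8s, %.16s and %.4s.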