Multi-Scale Testing in Object Detection, with a Source Code Walkthrough (FCOS Multi-Scale Testing)

 

Recently I have been studying the FCOS object detection algorithm, published at ICCV 2019. FCOS performs quite well and its code is well engineered, so I plan to follow it.

FCOS: Fully Convolutional One-Stage Object Detection (ICCV'19)

論文:https://arxiv.org/pdf/1904.01355.pdf

源碼:https://github.com/tianzhi0549/FCOS

------------------------------------   Let's start   ------------------------------------

This post focuses on the multi-scale testing used in FCOS. With multi-scale testing, the FCOS model based on ResNeXt-64x4d-101 and deformable convolutions achieves 49.0% AP on COCO test-dev. I verified this on COCO as well: compared with single-scale testing, multi-scale testing raises AP by roughly 2 points, so it is quite effective. The biggest drawback, of course, is that the inference time increases substantially, which remains a problem to be solved.

Before getting into the FCOS multi-scale testing code, let's briefly review multi-scale training and testing in object detection; anyone who works on detection knows how important multi-scale processing is to the final performance.

The input image size has a considerable impact on detection performance; in fact, multi-scale processing is one of the most effective tricks for improving accuracy. The backbone typically produces feature maps that are tens of times smaller than the original image, so the features of small objects are hard for the detection head to capture. Training with larger images and multiple input sizes improves the detector's robustness to object scale, and introducing multiple scales only at test time still yields part of that gain. [1]
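To make the scale issue concrete, here is a small illustrative snippet (not from the FCOS repo) that computes the feature-map resolution at the FPN strides used by FCOS (P3 to P7 use strides 8, 16, 32, 64, 128): a small object occupies only a handful of cells at the usual input size, but several times more when the input is enlarged.

# Illustrative only: how the input size affects feature-map resolution.
fpn_strides = [8, 16, 32, 64, 128]  # strides of P3-P7 in FCOS

def feature_map_sizes(img_h, img_w):
    # Each FPN level downsamples the input by its stride.
    return [(img_h // s, img_w // s) for s in fpn_strides]

# A 32x32-pixel object covers 4x4 cells on P3 with an 800-pixel short side,
# but 6x6 cells when the same image is resized to a 1200-pixel short side.
for short_side in (800, 1200):
    h, w = short_side, short_side * 4 // 3  # assume a 4:3 image, purely for illustration
    print(short_side, feature_map_sizes(h, w))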

------------------------------------   Let's continue   ------------------------------------

FCOS multi-scale testing runs detection separately on the horizontally flipped image and on copies resized to different scales, merges all the predicted bboxes, and then applies NMS and other post-processing to obtain the final detections. The idea is straightforward, but the gain is significant. The FCOS source code exposes several settings for multi-scale testing:

TEST:
  BBOX_AUG:
    ENABLED: False
    H_FLIP: True
    SCALES: (400, 500, 600, 700, 900, 1000, 1100, 1200)
    MAX_SIZE: 2000
    SCALE_H_FLIP: True

Here, ENABLED is the flag for multi-scale testing: False means single-scale testing, which is faster; set it to True to enable multi-scale testing. The remaining entries are the parameters used during multi-scale testing (a sketch of setting them programmatically follows this list):

H_FLIP: flag for horizontal flipping;

SCALES: the scales the test image is resized to;

MAX_SIZE: the maximum allowed size when resizing the test image;

SCALE_H_FLIP: flag for horizontal flipping at each resized scale.
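These options live under TEST.BBOX_AUG in the YAML config. Below is a minimal sketch (assuming the yacs-based cfg exported by fcos_core.config; the config-file path is only an example) of overriding them in code instead of editing the YAML:

from fcos_core.config import cfg

# Example config path; substitute the config you actually trained with.
cfg.merge_from_file("configs/fcos/fcos_imprv_X_101_64x4d_FPN_2x.yaml")
# yacs merge_from_list takes alternating keys and values.
cfg.merge_from_list([
    "TEST.BBOX_AUG.ENABLED", True,
    "TEST.BBOX_AUG.H_FLIP", True,
    "TEST.BBOX_AUG.SCALES", (400, 500, 600, 700, 900, 1000, 1100, 1200),
    "TEST.BBOX_AUG.MAX_SIZE", 2000,
    "TEST.BBOX_AUG.SCALE_H_FLIP", True,
])
cfg.freeze()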

Next, a brief walkthrough of the multi-scale testing source code [2]. The main function is im_detect_bbox_aug, which performs multi-scale detection in several steps:

1. Predict on the original image: boxlists_i = im_detect_bbox

2. Predict on the horizontally flipped image: boxlists_hf = im_detect_bbox_hflip

3. Predict at each resized scale: boxlists_scl = im_detect_bbox_scale

4. Predict on the flipped image at each resized scale: boxlists_scl_hf = im_detect_bbox_scale (with hflip=True)

The implementation details are in the source code below; the comments are clear, so it is easy to follow.

 

import torch
import torchvision.transforms as TT

from fcos_core.config import cfg
from fcos_core.data import transforms as T
from fcos_core.structures.image_list import to_image_list
from fcos_core.structures.bounding_box import BoxList
from fcos_core.modeling.rpn.fcos.inference import make_fcos_postprocessor


def im_detect_bbox_aug(model, images, device):
    # Collect detections computed under different transformations
    boxlists_ts = []
    for _ in range(len(images)):
        boxlists_ts.append([])

    def add_preds_t(boxlists_t):
        for i, boxlist_t in enumerate(boxlists_t):
            if len(boxlists_ts[i]) == 0:
                # The first one is identity transform, no need to resize the boxlist
                boxlists_ts[i].append(boxlist_t)
            else:
                # Resize the boxlist as the first one
                boxlists_ts[i].append(boxlist_t.resize(boxlists_ts[i][0].size))

    # Compute detections for the original image (identity transform)
    boxlists_i = im_detect_bbox(
        model, images, cfg.INPUT.MIN_SIZE_TEST, cfg.INPUT.MAX_SIZE_TEST, device
    )
    add_preds_t(boxlists_i)

    # Perform detection on the horizontally flipped image
    if cfg.TEST.BBOX_AUG.H_FLIP:
        boxlists_hf = im_detect_bbox_hflip(
            model, images, cfg.INPUT.MIN_SIZE_TEST, cfg.INPUT.MAX_SIZE_TEST, device
        )
        add_preds_t(boxlists_hf)

    # Compute detections at different scales
    for scale in cfg.TEST.BBOX_AUG.SCALES:
        max_size = cfg.TEST.BBOX_AUG.MAX_SIZE
        boxlists_scl = im_detect_bbox_scale(
            model, images, scale, max_size, device
        )
        add_preds_t(boxlists_scl)

        if cfg.TEST.BBOX_AUG.SCALE_H_FLIP:
            boxlists_scl_hf = im_detect_bbox_scale(
                model, images, scale, max_size, device, hflip=True
            )
            add_preds_t(boxlists_scl_hf)

    assert cfg.MODEL.FCOS_ON, "The multi-scale testing only supports FCOS detector"

    # Merge boxlists detected by different bbox aug params
    boxlists = []
    for i, boxlist_ts in enumerate(boxlists_ts):
        bbox = torch.cat([boxlist_t.bbox for boxlist_t in boxlist_ts])
        scores = torch.cat([boxlist_t.get_field('scores') for boxlist_t in boxlist_ts])
        labels = torch.cat([boxlist_t.get_field('labels') for boxlist_t in boxlist_ts])
        boxlist = BoxList(bbox, boxlist_ts[0].size, boxlist_ts[0].mode)
        boxlist.add_field('scores', scores)
        boxlist.add_field('labels', labels)
        boxlists.append(boxlist)

    # Apply NMS and limit the final detections
    post_processor = make_fcos_postprocessor(cfg)
    results = post_processor.select_over_all_levels(boxlists)

    return results


def im_detect_bbox(model, images, target_scale, target_max_size, device):
    """
    Performs bbox detection on the original image.
    """
    transform = TT.Compose([
        T.Resize(target_scale, target_max_size),
        TT.ToTensor(),
        T.Normalize(
            mean=cfg.INPUT.PIXEL_MEAN, std=cfg.INPUT.PIXEL_STD, to_bgr255=cfg.INPUT.TO_BGR255
        )
    ])
    images = [transform(image) for image in images]
    images = to_image_list(images, cfg.DATALOADER.SIZE_DIVISIBILITY)
    return model(images.to(device))


def im_detect_bbox_hflip(model, images, target_scale, target_max_size, device):
    """
    Performs bbox detection on the horizontally flipped image.
    Function signature is the same as for im_detect_bbox.
    """
    transform = TT.Compose([
        T.Resize(target_scale, target_max_size),
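        # Flip probability 1.0 makes this "random" flip deterministic: every image is flipped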
        TT.RandomHorizontalFlip(1.0),
        TT.ToTensor(),
        T.Normalize(
            mean=cfg.INPUT.PIXEL_MEAN, std=cfg.INPUT.PIXEL_STD, to_bgr255=cfg.INPUT.TO_BGR255
        )
    ])
    images = [transform(image) for image in images]
    images = to_image_list(images, cfg.DATALOADER.SIZE_DIVISIBILITY)
    boxlists = model(images.to(device))

    # Invert the detections computed on the flipped image
    boxlists_inv = [boxlist.transpose(0) for boxlist in boxlists]
    return boxlists_inv


def im_detect_bbox_scale(model, images, target_scale, target_max_size, device, hflip=False):
    """
    Computes bbox detections at the given scale.
    Returns predictions in the scaled image space.
    """
    if hflip:
        boxlists_scl = im_detect_bbox_hflip(model, images, target_scale, target_max_size, device)
    else:
        boxlists_scl = im_detect_bbox(model, images, target_scale, target_max_size, device)
    return boxlists_scl
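To close the loop, here is a minimal usage sketch (a hypothetical test loop, not the repo's evaluation code); it assumes model is a trained FCOS model and images is a list of PIL images for one batch:

import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.eval()
with torch.no_grad():
    if cfg.TEST.BBOX_AUG.ENABLED:
        # Multi-scale + horizontal-flip augmentation at test time
        predictions = im_detect_bbox_aug(model, images, device)
    else:
        # Plain single-scale testing
        predictions = im_detect_bbox(
            model, images, cfg.INPUT.MIN_SIZE_TEST, cfg.INPUT.MAX_SIZE_TEST, device
        )
# predictions is a list of BoxList objects (one per image) with 'scores' and 'labels' fields.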

 


References

[1] https://www.cnblogs.com/Terrypython/p/10642091.html

[2] https://github.com/tianzhi0549/FCOS/blob/master/fcos_core/engine/bbox_aug.py
