ECCV 2018
Acquisition of Localization Confidence for Accurate Object Detection
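PreciseRoIPooling code
ECCV 2018 | Megvii oral paper walkthrough: IoU-Net brings localization confidence to object detection
Read the paper yourself first, then come back to the summary below.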
IoU-Net
Problem addressed: NMS keeps the box with the highest classification confidence, but that box is not necessarily the best localized.
Two drawbacks in object localization
- the misalignment between classification confidence and localization accuracy
- the non-monotonic bounding box regression
Joint training
Backbone
- ResNet-FPN
FPN
- Precise RoI Pooling
Head
- the IoU predictor and the R-CNN branches work in parallel, based on the same visual features from the backbone
- the classification and regression branches take 512 RoIs per image from RPNs
Training
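- input images: resized to 800 px on the short side and 1200 px on the long side
- batch size: 16
- learning rate: 0.01
- iterations: 160k
- warm-up: learning rate 0.004 for the first 10k iterations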
Training the IoU predictor
- smooth-L1 loss
- IoU labels: normalized so that the values are distributed over [-1, 1]
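The two training ingredients above can be sketched in a few lines. This is only an illustration: the notes do not say how the labels are normalized, so the linear map from [0.5, 1] to [-1, 1] below is an assumption (it relies on training boxes having IoU >= 0.5), and the function names and the `beta` parameter are hypothetical.

```python
def normalize_iou_label(iou, lo=0.5, hi=1.0):
    """Linearly map an IoU in [lo, hi] to [-1, 1].

    Assumed mapping: the notes only state that labels are normalized
    and distributed over [-1, 1].
    """
    return 2.0 * (iou - lo) / (hi - lo) - 1.0


def smooth_l1(pred, target, beta=1.0):
    """Smooth-L1 loss for a single prediction/target pair.

    Quadratic for small errors (|diff| < beta), linear otherwise.
    """
    diff = abs(pred - target)
    if diff < beta:
        return 0.5 * diff * diff / beta
    return diff - 0.5 * beta


# Example: loss between a predicted score and the normalized label.
label = normalize_iou_label(0.75)
loss = smooth_l1(0.5, label)
```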
Inference
- first apply bounding box regression to the initial coordinates
- apply IoU-guided NMS on all detected bounding boxes
- refine the 100 bounding boxes with the highest classification confidence using the optimization-based algorithm
Predict IoU
IoU predictor
aim
- takes features from the FPN
- estimates the localization accuracy (IoU) for each bounding box
data generation
- generate the bounding boxes and labels for training the IoU-Net by augmenting the ground truth, instead of taking proposals from RPNs
- for every ground-truth bounding box in the training set, manually transform it with a set of randomized parameters to generate a candidate bounding box set
- remove the bounding boxes having an IoU < 0.5 with the matched ground truth
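A minimal sketch of this data generation step, assuming boxes are `(x1, y1, x2, y2)` tuples. The notes do not specify the randomized transform, so the shift/scale jitter, its ranges, and all function names here are hypothetical choices for illustration only.

```python
import random


def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def jitter_box(gt, rng, max_shift=0.3, max_scale=0.3):
    """Randomly shift and rescale a ground-truth box (assumed transform)."""
    w, h = gt[2] - gt[0], gt[3] - gt[1]
    cx = (gt[0] + gt[2]) / 2 + rng.uniform(-max_shift, max_shift) * w
    cy = (gt[1] + gt[3]) / 2 + rng.uniform(-max_shift, max_shift) * h
    nw = w * (1 + rng.uniform(-max_scale, max_scale))
    nh = h * (1 + rng.uniform(-max_scale, max_scale))
    return (cx - nw / 2, cy - nh / 2, cx + nw / 2, cy + nh / 2)


def make_iou_training_set(gt_boxes, per_gt=16, min_iou=0.5, seed=0):
    """Return (candidate_box, iou_label) pairs built from ground-truth boxes.

    Candidates whose IoU with their source ground truth is below
    min_iou are dropped, as described in the list above.
    """
    rng = random.Random(seed)
    samples = []
    for gt in gt_boxes:
        for _ in range(per_gt):
            cand = jitter_box(gt, rng)
            label = iou(cand, gt)
            if label >= min_iou:
                samples.append((cand, label))
    return samples


samples = make_iou_training_set([(0.0, 0.0, 100.0, 100.0)])
```

Each returned label is the true IoU between the jittered box and its ground truth, which is what the IoU predictor is trained to regress.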
feature
- extracted from the output of FPN with the proposed PrRoI-Pooling layers
- then fed into a two-layer feedforward network for the IoU prediction
- use class-aware IoU predictors
IoU-guided NMS
- use the predicted IoU instead of the classification confidence as the ranking keyword for bounding boxes
to determine the classification scores
- select the box having the highest IoU with a ground-truth
- eliminate all other boxes having an overlap greater than the NMS threshold
- for a group of bounding boxes matching the same ground-truth, take the most confident prediction for the class label
- that is, the classification confidence of the highest-IoU box becomes the maximum classification confidence among itself and the boxes it suppresses (those matched to the same ground truth with overlap above the threshold)
Algorithm
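- ① From the bounding box set B, repeatedly select the bounding box with the highest predicted IoU (localization confidence); denote it b_m
- ② Pick out, one by one, the bounding boxes whose IoU with b_m exceeds a given threshold, and denote the highest classification confidence among these boxes (including b_m itself) as s
- ③ Record the pair (b_m, s) in the set D (in essence, this reassigns classification confidences to bounding boxes)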
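The three steps can be sketched in plain Python as follows. This is an illustrative sketch rather than the authors' code: the function name, argument names, and the default threshold of 0.5 are assumptions, and boxes are taken to be `(x1, y1, x2, y2)` tuples.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def iou_guided_nms(boxes, cls_scores, loc_scores, nms_thresh=0.5):
    """Rank by predicted IoU, suppress overlaps, reassign class scores.

    boxes:      list of (x1, y1, x2, y2)
    cls_scores: classification confidence per box
    loc_scores: predicted IoU (localization confidence) per box
    Returns a list of (box, cls_score) pairs, i.e. the set D.
    """
    remaining = list(range(len(boxes)))
    kept = []
    while remaining:
        # Step 1: take the box with the highest predicted IoU.
        m = max(remaining, key=lambda i: loc_scores[i])
        remaining.remove(m)
        s = cls_scores[m]
        survivors = []
        for j in remaining:
            if iou(boxes[m], boxes[j]) > nms_thresh:
                # Step 2: suppressed box passes on its class score if higher.
                s = max(s, cls_scores[j])
            else:
                survivors.append(j)
        remaining = survivors
        # Step 3: record the (box, reassigned class score) pair.
        kept.append((boxes[m], s))
    return kept


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
result = iou_guided_nms(boxes, cls_scores=[0.6, 0.9, 0.8], loc_scores=[0.95, 0.7, 0.85])
```

In this example the first box is kept because it has the highest predicted IoU, but it inherits the classification score 0.9 from the overlapping second box that it suppresses.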
Optimization-based bounding box refinement
- Algorithm
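- For each detected bounding box, use PrPool to extract its features and compute the IoU predicted by IoU-Net; call this IoU PrevScore and its gradient with respect to the box coordinates grad
- Then update the bounding box along grad
- After the update, predict the IoU again; call the result NewScore
- If the difference between PrevScore and NewScore is smaller than an early-stop threshold, or NewScore falls below PrevScore by more than a "localization degeneration tolerance", the bounding box is considered finished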
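The loop above can be sketched for a single box as follows. Several things here are stand-ins, not the method from the notes: `score_fn` is a hypothetical callable playing the role of "PrPool features + IoU predictor", the gradient is estimated by finite differences instead of backpropagation through PrPool, and the step size, thresholds, and the choice to scale each gradient component by the box width or height are assumptions.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def numeric_grad(score_fn, box, eps=1e-3):
    """Central-difference gradient of score_fn w.r.t. the 4 coordinates."""
    grad = []
    for k in range(4):
        hi, lo = list(box), list(box)
        hi[k] += eps
        lo[k] -= eps
        grad.append((score_fn(hi) - score_fn(lo)) / (2 * eps))
    return grad


def refine_box(box, score_fn, step=0.5, max_iters=50,
               early_stop=1e-4, degeneration_tol=-1e-3):
    """Gradient-ascent refinement of one box on a predicted-IoU score."""
    box = list(box)
    new_score = score_fn(box)
    for _ in range(max_iters):
        prev_score = new_score
        grad = numeric_grad(score_fn, box)
        w, h = box[2] - box[0], box[3] - box[1]
        # Scale each gradient component by the box size on its axis.
        box = [c + step * g * s for c, g, s in zip(box, grad, (w, h, w, h))]
        new_score = score_fn(box)
        # Stop when the score barely changes or degrades beyond the tolerance.
        if abs(prev_score - new_score) < early_stop or new_score - prev_score < degeneration_tol:
            break
    return box, new_score


# Toy stand-in for the IoU predictor: the true IoU with a known target box.
target = (10.0, 10.0, 50.0, 50.0)
start = (14.0, 12.0, 56.0, 47.0)
start_score = iou(start, target)
refined, final_score = refine_box(start, lambda b: iou(b, target))
```

With the toy score function the refined box moves toward the target; with a learned predictor the score is only an estimate, which is why the stopping conditions matter.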
PrPool
- continuous
- differentiable with respect to the box coordinates
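Both properties can be read off the PrPool definition. The formulas below restate it as given in the IoU-Net paper, as best recalled; the symbol names ($w_{i,j}$ for the feature value at grid location $(i,j)$, and $(x_1, y_1)$, $(x_2, y_2)$ for the bin corners) should be checked against the paper. The discrete feature map is first made continuous by bilinear interpolation:

$$
f(x, y) = \sum_{i,j} IC(x, y, i, j)\, w_{i,j}, \qquad IC(x, y, i, j) = \max(0,\, 1 - |x - i|)\cdot\max(0,\, 1 - |y - j|)
$$

Pooling over a bin is then the average of $f$ over the bin area:

$$
\mathrm{PrPool}(bin, F) = \frac{\int_{y_1}^{y_2}\int_{x_1}^{x_2} f(x, y)\, dx\, dy}{(x_2 - x_1)(y_2 - y_1)}
$$

Because the bin coordinates appear only as integration limits and in the area term, the output is differentiable with respect to them, for example:

$$
\frac{\partial\, \mathrm{PrPool}(bin, F)}{\partial x_1} = \frac{\mathrm{PrPool}(bin, F)}{x_2 - x_1} - \frac{\int_{y_1}^{y_2} f(x_1, y)\, dy}{(x_2 - x_1)(y_2 - y_1)}
$$

This gradient with respect to the box coordinates is what the optimization-based refinement relies on.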