Pitfalls when fine-tuning Mask R-CNN with the official PyTorch tutorial

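I got this experiment working in the end, but I have to complain about the official tutorial: it does not spell out the prerequisites. This sentence in particular: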

In references/detection/, we have a number of helper functions to simplify training and evaluating detection models. Here, we will use references/detection/engine.py, references/detection/utils.py and references/detection/transforms.py. Just copy them to your folder and use them here.

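This one left me scratching my head. Where is references/detection/? The tutorial does not say. It is a folder in the pytorch/vision repository; here is the link: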
https://github.com/pytorch/vision/tree/master/references/detection

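Once you start, you will find that these 3 files are not enough, because they depend on other files. Those dependencies sit in the same references/detection directory, so copy coco_eval.py and coco_utils.py over as well. With all 5 files copied, the current folder contains engine.py, utils.py, transforms.py, coco_eval.py and coco_utils.py.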
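A quick way to confirm that nothing was forgotten is to check the folder before starting a training run. The following is only a small sketch: the file names are the five listed above, while `REQUIRED_HELPERS` and `missing_helpers` are hypothetical names made up for this example.

```python
from pathlib import Path

# The five helper files named above, all taken from references/detection.
REQUIRED_HELPERS = (
    "engine.py",
    "utils.py",
    "transforms.py",
    "coco_eval.py",
    "coco_utils.py",
)


def missing_helpers(folder="."):
    """Return the helper files that are not present in `folder`."""
    root = Path(folder)
    return [name for name in REQUIRED_HELPERS if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_helpers()
    if missing:
        print("Still missing:", ", ".join(missing))
    else:
        print("All five helper files are in place.")
```

Run it from the folder that holds your training script; an empty result means the copy step is complete.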

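At this point you might think everything is ready, but running the script hits another error: coco_eval.py and coco_utils.py both depend on pycocotools, so the next step is to install pycocotools.
Method 1: on Windows, run the following command (git must already be installed):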

pip install git+https://github.com/philferriere/cocoapi.git#subdirectory=PythonAPI

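Method 2: download the source from https://github.com/philferriere/cocoapi and unzip it. Activate your Anaconda virtual environment, change to the cocoapi\PythonAPI directory, and run the following commands: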

# install pycocotools locally
python setup.py build_ext --inplace
 
# install pycocotools to the Python site-packages
python setup.py build_ext install

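Both methods are for Windows. I installed pycocotools with method 2. The build needs a Visual Studio (VS) environment, so install one if you do not have it yet.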
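To confirm that the install landed in the environment you actually train in, you can ask the interpreter whether it can find the package. This is a minimal sketch using only the standard library; `is_installed` is a hypothetical helper name, not part of pycocotools.

```python
import importlib.util


def is_installed(module_name):
    """Return True if the current interpreter can locate `module_name`."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    if is_installed("pycocotools"):
        print("pycocotools is available")
    else:
        print("pycocotools is missing; install it with one of the methods above")
```

Run it with the same Python interpreter (the same Anaconda environment) that you use for training, otherwise the answer may not apply.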

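By the way, downloading the model weights from inside the program is too slow; downloading the file yourself is much faster. Link:
https://download.pytorch.org/models/maskrcnn_resnet50_fpn_coco-bf2d0c1e.pth
There are two ways to load the weights: first, put the file in the cache directory where the program looks for it; second, load the model yourself with the following code: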

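    import torch
    from torchvision.models.detection import MaskRCNN
    from torchvision.models.detection.backbone_utils import resnet_fpn_backbone

    # load an instance segmentation model pre-trained on COCO
    # no need to download the backbone weights separately: the full
    # checkpoint loaded below already includes them
    pretrained_backbone = False
    backbone = resnet_fpn_backbone('resnet50', pretrained_backbone)
    # 91 classes, matching the COCO checkpoint
    model = MaskRCNN(backbone, num_classes=91)
    state_dict = torch.load("X:/Downloads/maskrcnn_resnet50_fpn_coco-bf2d0c1e.pth")
    model.load_state_dict(state_dict)

    # use the code above in place of this line:
    #model = torchvision.models.detection.maskrcnn_resnet50_fpn(pretrained=True)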
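For the first option (the cache), the file has to sit where the download helper would have saved it, under the same file name as in the URL. The sketch below is an assumption-heavy illustration rather than the official procedure: to my knowledge the cache root is $TORCH_HOME, falling back to $XDG_CACHE_HOME/torch and then ~/.cache/torch, with the checkpoints folder directly under it in older releases and under hub/ in newer ones. Check this against your installed version. `torch_home`, `candidate_checkpoint_dirs` and `place_in_cache` are hypothetical helper names.

```python
import os
import shutil
from pathlib import Path


def torch_home():
    """Cache root: $TORCH_HOME, else $XDG_CACHE_HOME/torch, else ~/.cache/torch (assumed)."""
    default = os.path.join(os.getenv("XDG_CACHE_HOME", "~/.cache"), "torch")
    return Path(os.path.expanduser(os.getenv("TORCH_HOME", default)))


def candidate_checkpoint_dirs():
    """Folders where a pre-downloaded .pth file may be picked up (assumed layout)."""
    home = torch_home()
    return [
        home / "hub" / "checkpoints",  # newer PyTorch releases (assumption)
        home / "checkpoints",          # older releases (assumption)
    ]


def place_in_cache(weights_file):
    """Copy `weights_file` into every candidate folder and return the new paths."""
    copied = []
    for folder in candidate_checkpoint_dirs():
        folder.mkdir(parents=True, exist_ok=True)
        # copy2 keeps the original file name, which must match the URL's file name
        copied.append(Path(shutil.copy2(weights_file, folder)))
    return copied
```

For example, `place_in_cache("X:/Downloads/maskrcnn_resnet50_fpn_coco-bf2d0c1e.pth")` copies the file into both candidate folders. That costs some disk space; once you know which folder your version uses, you can delete the other copy.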

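Finally, running the program hit one more error:
TypeError: 'numpy.float64' object cannot be interpreted as an integer

The main cause is a NumPy version that is too new. I was on 1.18.1, the latest at the time; installing numpy==1.16.0 fixes the problem.
pip install numpy==1.16.0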
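Why the NumPy version matters: as far as I can tell, pycocotools computes the number of IoU thresholds as a float (something like np.round((0.95 - 0.5) / 0.05) + 1) and passes it to np.linspace as the sample count, and NumPy 1.18 insists that this count be a real integer. The sketch below does not touch NumPy at all; it uses the standard library's `operator.index`, which applies the same kind of integer check, to show why a float count fails and an int count passes. `sample_count` is a hypothetical helper for this illustration.

```python
import operator


def sample_count(start, stop, step):
    """Number of evenly spaced samples from start to stop, as a true int."""
    return int(round((stop - start) / step)) + 1


# IoU thresholds 0.50, 0.55, ..., 0.95
count = sample_count(0.5, 0.95, 0.05)

# An int passes the integer check unchanged.
checked = operator.index(count)

# The same value as a float is rejected, with the same kind of message
# as the error above.
try:
    operator.index(float(count))
    message = None
except TypeError as exc:
    message = str(exc)

print(count, checked)
print(message)
```

So an alternative to downgrading NumPy, assuming the failing line really is that np.linspace call, would be to wrap the count in int() in your local copy of pycocotools. I only tested the downgrade.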

Training runs for 10 epochs in total. The output of the last epoch is:

Epoch: [9] [ 0/60] eta: 0:01:20 lr: 0.000005 loss: 0.1457 (0.1457) loss_classifier: 0.0221 (0.0221) loss_box_reg: 0.0074 (0.0074) loss_mask: 0.1059 (0.1059) loss_objectness: 0.0011 (0.0011) loss_rpn_box_reg: 0.0092 (0.0092) time: 1.3427 data: 0.7483 max mem: 3827
Epoch: [9] [10/60] eta: 0:00:36 lr: 0.000005 loss: 0.1534 (0.1581) loss_classifier: 0.0227 (0.0255) loss_box_reg: 0.0074 (0.0129) loss_mask: 0.1105 (0.1122) loss_objectness: 0.0005 (0.0006) loss_rpn_box_reg: 0.0077 (0.0068) time: 0.7343 data: 0.0691 max mem: 3827
Epoch: [9] [20/60] eta: 0:00:27 lr: 0.000005 loss: 0.1534 (0.1573) loss_classifier: 0.0227 (0.0249) loss_box_reg: 0.0075 (0.0119) loss_mask: 0.1105 (0.1125) loss_objectness: 0.0003 (0.0008) loss_rpn_box_reg: 0.0074 (0.0071) time: 0.6563 data: 0.0013 max mem: 3827
Epoch: [9] [30/60] eta: 0:00:20 lr: 0.000005 loss: 0.1487 (0.1530) loss_classifier: 0.0205 (0.0233) loss_box_reg: 0.0085 (0.0109) loss_mask: 0.1018 (0.1110) loss_objectness: 0.0003 (0.0008) loss_rpn_box_reg: 0.0051 (0.0071) time: 0.6521 data: 0.0014 max mem: 3827
Epoch: [9] [40/60] eta: 0:00:13 lr: 0.000005 loss: 0.1528 (0.1558) loss_classifier: 0.0205 (0.0237) loss_box_reg: 0.0073 (0.0113) loss_mask: 0.1077 (0.1129) loss_objectness: 0.0003 (0.0008) loss_rpn_box_reg: 0.0053 (0.0071) time: 0.6462 data: 0.0014 max mem: 3827
Epoch: [9] [50/60] eta: 0:00:06 lr: 0.000005 loss: 0.1555 (0.1604) loss_classifier: 0.0227 (0.0257) loss_box_reg: 0.0094 (0.0121) loss_mask: 0.1198 (0.1146) loss_objectness: 0.0004 (0.0007) loss_rpn_box_reg: 0.0067 (0.0073) time: 0.6517 data: 0.0013 max mem: 3827
Epoch: [9] [59/60] eta: 0:00:00 lr: 0.000005 loss: 0.1520 (0.1601) loss_classifier: 0.0227 (0.0253) loss_box_reg: 0.0098 (0.0120) loss_mask: 0.1155 (0.1149) loss_objectness: 0.0004 (0.0007) loss_rpn_box_reg: 0.0067 (0.0072) time: 0.6718 data: 0.0013 max mem: 3827
Epoch: [9] Total time: 0:00:40 (0.6712 s / it)
creating index...
index created!
Test: [ 0/50] eta: 0:00:43 model_time: 0.1556 (0.1556) evaluator_time: 0.0140 (0.0140) time: 0.8673 data: 0.6937 max mem: 3827
Test: [49/50] eta: 0:00:00 model_time: 0.1177 (0.1196) evaluator_time: 0.0040 (0.0049) time: 0.1286 data: 0.0007 max mem: 3827
Test: Total time: 0:00:07 (0.1441 s / it)
Averaged stats: model_time: 0.1177 (0.1196) evaluator_time: 0.0040 (0.0049)
Accumulating evaluation results...
DONE (t=0.01s).
Accumulating evaluation results...
DONE (t=0.01s).
IoU metric: bbox
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.806
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.994
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.964
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.766
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.811
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.342
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.839
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.839
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.787
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.843
IoU metric: segm
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.767
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.993
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.953
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.587
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.777
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.328
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.795
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.797
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.675
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.805
That’s it!
