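I. Environment setup (based on mmdetection)
Create a Polarmask folder on the server.
Download the PolarMask-master zip: https://github.com/xieenze/PolarMask
Unzip PolarMask-master and move it into the Polarmask folder on the server. (I'm a beginner and not used to git.)
Then follow the installation commands in install.md.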
conda create -n mmlab python=3.7 -y
conda activate mmlab
# Install pytorch and torchvision
# (cat /usr/local/cuda/version.txt showed that the server's CUDA version is 10.0; the original command used 10.1)
conda install pytorch=1.3.1 cudatoolkit=10.0 torchvision=0.4.2 -c pytorch
The download ran for ages and never finished.
So I switched to pip. torch 1.3.1 and torchvision 0.4.2 are the matching versions.
pip install torch===1.3.1 torchvision===0.4.2 -f https://download.pytorch.org/whl/torch_stable.html
Installation succeeded.
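Before moving on, it can be worth confirming that the installed torch/torchvision pair is the one intended. The following is a minimal sketch, not part of the PolarMask instructions: the table holds only the 1.3.1 / 0.4.2 pair from this post, and `base_version`, `pair_matches` and `installed` are hypothetical helper names. On the Python 3.7 environment used here, `importlib.metadata` is not in the standard library, so the sketch falls back to the `importlib_metadata` backport (which would need to be installed separately).

```python
try:
    from importlib.metadata import PackageNotFoundError, version  # Python 3.8+
except ImportError:  # Python 3.7: requires the importlib_metadata backport
    from importlib_metadata import PackageNotFoundError, version

# The only pair taken from this post; extend the table for other releases.
EXPECTED_TORCHVISION = {"1.3.1": "0.4.2"}


def base_version(v):
    """Drop a local build tag such as '+cu100' from a version string."""
    return v.split("+", 1)[0]


def pair_matches(torch_version, torchvision_version):
    """Return True if torchvision is the version expected for this torch."""
    expected = EXPECTED_TORCHVISION.get(base_version(torch_version))
    return expected is not None and expected == base_version(torchvision_version)


def installed(name):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return version(name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    t, tv = installed("torch"), installed("torchvision")
    if t is None or tv is None:
        print("torch or torchvision is not installed in this environment")
    else:
        print(t, tv, "OK" if pair_matches(t, tv) else "MISMATCH")
```

Run it inside the activated mmlab environment so that it inspects the same packages the training script will use.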
# Install the COCO dataset dependency
cd /mnt/lxr/Polarmask/cocoapi-master/PythonAPI/
python setup.py build_ext install
# Install the PolarMask dependencies
cd /mnt/lxr/Polarmask/PolarMask-master/
python setup.py develop
This raised errors because some packages were not installed:
conda install cython
conda install pytest-runner
Finally installed.
- Prepare the COCO dataset directory. I used COCO2014, with this layout:
data
└── coco
    ├── annotations
    ├── train2014
    ├── val2014
    └── test2014
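A missing or misnamed folder in this layout tends to show up only once training starts, so a quick check beforehand may save time. This is a minimal sketch, assuming the script is run from the PolarMask-master directory and that the dataset lives under data/coco as listed above. `missing_dirs` is a hypothetical helper name.

```python
import os

# Sub-directories from the COCO2014 layout listed above.
REQUIRED = ("annotations", "train2014", "val2014", "test2014")


def missing_dirs(coco_root, required=REQUIRED):
    """Return the required sub-directories that are absent under coco_root."""
    return [d for d in required if not os.path.isdir(os.path.join(coco_root, d))]


if __name__ == "__main__":
    missing = missing_dirs(os.path.join("data", "coco"))
    print("layout OK" if not missing else "missing: " + ", ".join(missing))
```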
- Train the code, following these links:
https://blog.csdn.net/PAN_Andy/article/details/105343673
https://blog.csdn.net/qq_36530992/article/details/104672750
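1. In configs/polarmask/4gpu/polar_768_1x_r50.py, change the dataset to COCO2014 and the learning rate to 0.0025 (use a smaller learning rate on a single GPU).
Change device_ids = range(4) to range(1).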
2. Train:
# single GPU
python tools/train.py configs/polarmask/4gpu/polar_768_1x_r50.py
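The learning rate of 0.0025 chosen in step 1 is consistent with the linear scaling rule commonly used with mmdetection configs: scale the learning rate in proportion to the number of GPUs when the images per GPU stay the same. The sketch below is only an illustration. The base value of 0.01 for the 4-GPU config is an assumption not stated in this post, and `scale_lr` is a hypothetical helper.

```python
def scale_lr(base_lr, base_gpus, gpus):
    """Scale the learning rate in proportion to the number of GPUs."""
    if base_gpus <= 0 or gpus <= 0:
        raise ValueError("GPU counts must be positive")
    return base_lr * gpus / base_gpus


if __name__ == "__main__":
    # Assumed: the 4-GPU config ships with lr=0.01 (not stated in this post).
    print(scale_lr(0.01, base_gpus=4, gpus=1))
```

If the images per GPU are changed as well, the total batch size is what matters, so the ratio should be taken over batch sizes rather than GPU counts.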
The training command failed with a series of errors:
1. ipython was missing; install it with pip.
2. No module named 'imagecorruptions'
pip install imagecorruptions
3. ERROR: No matching distribution found for polygon
pip install Polygon3
4. ModuleNotFoundError: No module named 'tqdm'
pip install tqdm
5. ImportError: cannot import name 'get_dist_info' from 'mmcv.runner.utils' (/mnt/miniconda/condaenvs/mmlab/lib/python3.7/site-packages/mmcv-0.5.1-py3.7-linux-x86_64.egg/mmcv/runner/utils.py)
Fix: in /PolarMask/mmdet/datasets/loader/sampler.py, change from mmcv.runner.utils import get_dist_info to from mmcv.runner import get_dist_info.
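Instead of hard-coding one import location, the same fix for error 5 can be written so that it tolerates either mmcv layout. The following is a sketch of that idea, not code from PolarMask or mmcv. `import_first` is a hypothetical helper, and the two module paths are simply the ones mentioned in the error and the fix above.

```python
import importlib


def import_first(attr, module_names):
    """Return `attr` from the first module in module_names that provides it."""
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            continue  # module not importable in this install; try the next one
        if hasattr(module, attr):
            return getattr(module, attr)
    raise ImportError("cannot import %r from any of %r" % (attr, module_names))


if __name__ == "__main__":
    try:
        # Location from the fix first, then the one sampler.py originally used.
        get_dist_info = import_first(
            "get_dist_info", ["mmcv.runner", "mmcv.runner.utils"])
        print("found", get_dist_info)
    except ImportError as exc:
        print(exc)
```

For a one-off setup, editing the single import line as described above is simpler. The helper is mainly useful if the same code has to run against more than one mmcv version.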
Then start training again on a single GPU.
python tools/train.py configs/polarmask/4gpu/polar_768_1x_r50.py
If nothing fails, it starts downloading the pretrained ResNet-50 model.
It prints the URL it downloads from and what it downloads:
https://s3.ap-northeast-2.amazonaws.com/open-mmlab/pretrain/third_party/resnet50_caffe-788b5fa3.pth
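The download was too slow, so I got a resnet50-caffe.pth from CSDN instead. I put my downloaded copy on a cloud drive:
Once downloaded, put it in the cache directory shown in the message and rename it. Mine is /root/.cache/torch/checkpoints/resnet50_caffe-788b5fa3.pth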
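The copy-and-rename step can be scripted, which avoids typos in the long file name. This is a minimal sketch under the assumptions of this post: the target name comes from the download URL above, the cache directory is the one reported on this server (it may differ elsewhere, for example if TORCH_HOME is set), and the source file name `resnet50-caffe.pth` is only the name of the manually downloaded file. `install_checkpoint` is a hypothetical helper.

```python
import os
import shutil

# File name taken from the download URL printed by the training script.
CHECKPOINT_NAME = "resnet50_caffe-788b5fa3.pth"


def install_checkpoint(src, cache_dir, name=CHECKPOINT_NAME):
    """Copy a manually downloaded weight file into cache_dir under `name`."""
    os.makedirs(cache_dir, exist_ok=True)
    dst = os.path.join(cache_dir, name)
    shutil.copyfile(src, dst)
    return dst


if __name__ == "__main__":
    src = "resnet50-caffe.pth"  # local name of the manually downloaded file
    cache = os.path.expanduser("~/.cache/torch/checkpoints")
    if os.path.isfile(src):
        print("copied to", install_checkpoint(src, cache))
    else:
        print("download", src, "first")
```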
python tools/train.py configs/polarmask/4gpu/polar_768_1x_r50.py
Then another error:
NotADirectoryError: [Errno 20] Not a directory: '/mnt/lxr/Polarmask/PolarMask-master/work_dirs/trash'
Fix: delete the work_dirs file in the PolarMask-master directory, then run training again.
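A NotADirectoryError like the one above typically means that some component of the path exists as a regular file, so the directory below it cannot be created. In this case, work_dirs itself was a file. The sketch below only locates such a file and leaves the deletion to you. `blocking_file` is a hypothetical helper, and the path in the usage part assumes the script is run from PolarMask-master.

```python
import os


def blocking_file(path):
    """Return the first component of `path` that is a regular file, else None."""
    current = os.path.abspath(path)
    chain = []
    while True:
        chain.append(current)
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            break
        current = parent
    for candidate in reversed(chain):  # walk from the root downwards
        if os.path.isfile(candidate):
            return candidate
    return None


if __name__ == "__main__":
    blocker = blocking_file(os.path.join("work_dirs", "trash"))
    if blocker is None:
        print("no file is blocking work_dirs/trash")
    else:
        print("remove this file before training:", blocker)
```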
python tools/train.py configs/polarmask/4gpu/polar_768_1x_r50.py
It finally ran through, for 12 epochs.
Relaunch it in the background:
nohup python tools/train.py configs/polarmask/4gpu/polar_768_1x_r50.py >> my1.log 2>&1 &
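exit # remember to exit before closing the window
Close the window and wait for training to finish. On a single GPU it takes a bit over a day.
Test: pass the path of the trained model and the output path to the command, for example:
# single GPU
# test command format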
python tools/test.py configs/polarmask/4gpu/polar_768_1x_r50.py [YOUR_CHECKPOINT_DIR] --out [OUT_DIR]
# e.g.: python tools/test.py configs/polarmask/4gpu/polar_768_1x_r101.py /home/wh/weights/polarmask_r101_1x.pth --out work_dirs/polar101.pkl
# mine
python tools/test.py configs/polarmask/4gpu/polar_768_1x_r50.py work_dirs/trash/epoch_12.pth --out /mnt/lxr/Polarmask/PolarMask-master/work_dirs/out/results.pkl --show
Error:
UnboundLocalError: local variable '_mlvl_bboxes' referenced before assignment
File "tools/test.py", line 224, in <module>
main()
File "tools/test.py", line 184, in main
outputs = single_gpu_test(model, data_loader, args.show)
File "tools/test.py", line 26, in single_gpu_test
result = model(return_loss=False, rescale=not show, **data)
File "/mnt/miniconda/condaenvs/mmlab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 541, in __call__
result = self.forward(*input, **kwargs)
File "/mnt/miniconda/condaenvs/mmlab/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 150, in forward
return self.module(*inputs[0], **kwargs[0])
File "/mnt/miniconda/condaenvs/mmlab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 541, in __call__
result = self.forward(*input, **kwargs)
File "/mnt/lxr/Polarmask/PolarMask-master/mmdet/core/fp16/decorators.py", line 49, in new_func
return old_func(*args, **kwargs)
File "/mnt/lxr/Polarmask/PolarMask-master/mmdet/models/detectors/base.py", line 88, in forward
return self.forward_test(img, img_meta, **kwargs)
File "/mnt/lxr/Polarmask/PolarMask-master/mmdet/models/detectors/base.py", line 79, in forward_test
return self.simple_test(imgs[0], img_metas[0], **kwargs)
File "/mnt/lxr/Polarmask/PolarMask-master/mmdet/models/detectors/polarmask.py", line 65, in simple_test
bbox_list = self.bbox_head.get_bboxes(*bbox_inputs)
File "/mnt/lxr/Polarmask/PolarMask-master/mmdet/core/fp16/decorators.py", line 127, in new_func
return old_func(*args, **kwargs)
File "/mnt/lxr/Polarmask/PolarMask-master/mmdet/models/anchor_heads/polarmask_head.py", line 403, in get_bboxes
scale_factor, cfg, rescale)
File "/mnt/lxr/Polarmask/PolarMask-master/mmdet/models/anchor_heads/polarmask_head.py", line 480, in get_bboxes_single
_mlvl_bboxes,
UnboundLocalError: local variable '_mlvl_bboxes' referenced before assignment
Not solved yet. I asked other people, and none of them had run into this problem.
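One possible lead, offered as a hypothesis rather than a confirmed diagnosis: the traceback shows tools/test.py calling the model with rescale=not show, so passing --show makes rescale False. An UnboundLocalError of this kind usually appears when a variable is assigned only inside a conditional branch and then used unconditionally. Whether get_bboxes_single in polarmask_head.py is actually written this way has not been checked here. The functions below are hypothetical and only reproduce the general pattern.

```python
def rescale_boxes_buggy(boxes, scale_factor, rescale):
    """Hypothetical pattern: _boxes is bound only when rescale is True."""
    if rescale:
        _boxes = [b / scale_factor for b in boxes]
    return _boxes  # raises UnboundLocalError when rescale is False


def rescale_boxes_fixed(boxes, scale_factor, rescale):
    """Same logic with a default, so _boxes is bound on every path."""
    _boxes = list(boxes)
    if rescale:
        _boxes = [b / scale_factor for b in boxes]
    return _boxes


if __name__ == "__main__":
    try:
        rescale_boxes_buggy([10.0, 20.0], 2.0, rescale=False)
    except UnboundLocalError as exc:
        print("reproduced:", exc)
    print(rescale_boxes_fixed([10.0, 20.0], 2.0, rescale=False))
```

If this is indeed the cause, running the test command without --show would be a cheap experiment, since rescale would then be True.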
I gave up: test kept failing and I don't know why, so I went straight to visualization.
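1. In demo/visual.py, change the config path, the model checkpoint path and the test image path. (visual.py is just the contents of visual.ipynb.)
Modify mmdet/apis/inference.py to add a save step, plt.savefig("result.jpg"). The generated image is saved in the root directory.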
python demo/visual.py
As the result shows, it detects most of the objects, and the mask quality is also fairly good.