Training your own data with mmdetection-2.1.0

0. A training error encountered

./tool/dish_train.sh

RuntimeError: Expected to have finished reduction in the prior iteration before starting a new one. This error indicates that your module has parameters that were not used in producing loss. You can enable unused parameter detection by (1) passing the keyword argument `find_unused_parameters=True` to `torch.nn.parallel.DistributedDataParallel`; (2) making sure all `forward` function outputs participate in calculating loss. If you already have done the above two steps, then the distributed data parallel module wasn't able to locate the output tensors in the return value of your module's `forward` function. Please include the loss function and the structure of the return value of `forward` of your module when reporting this issue (e.g. list, dict, iterable). (prepare_for_backward at /pytorch/torch/csrc/distributed/c10d/reducer.cpp:518) 

Solution:

vim mmdet/apis/train.py

    # put model on gpus
    if distributed:
        find_unused_parameters = True #cfg.get('find_unused_parameters', False)
        # Sets the `find_unused_parameters` parameter in
        # torch.nn.parallel.DistributedDataParallel
        model = MMDistributedDataParallel(
            model.cuda(),
            device_ids=[torch.cuda.current_device()],
            broadcast_buffers=False,
            find_unused_parameters=find_unused_parameters)
    else:
        model = MMDataParallel(
            model.cuda(cfg.gpu_ids[0]), device_ids=cfg.gpu_ids)

This is the fix that circulates online, but it is problematic.

Single-GPU users are better off launching training with python train.py directly; that path uses MMDataParallel rather than MMDistributedDataParallel (see the else branch above), so this error never comes up.
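
If you do need distributed training, a less invasive option than patching mmdet/apis/train.py is to leave the original cfg.get('find_unused_parameters', False) line in place and switch the flag on from your training config instead. A minimal sketch, assuming the top-level key is read exactly as in the code shown above:

    # In your training config (for example configs/ms_rcnn/ms_rcnn_r50_fpn_1x_coco.py):
    # this top-level key is read by cfg.get('find_unused_parameters', False)
    # in mmdet/apis/train.py and passed on to MMDistributedDataParallel.
    find_unused_parameters = True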

1. Changing the number of classes

vim configs/ms_rcnn/ms_rcnn_r50_fpn_1x_coco.py

vim configs/_base_/models/mask_rcnn_r50_fpn.py

In this newer version, changing the number of classes means editing both of the files above.

My dataset has only one class, so I set num_classes=1.
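
For reference, here is a minimal sketch of where num_classes sits; the field nesting follows the mmdetection 2.x Mask R-CNN / MS R-CNN defaults, so verify it against your own copies of the two files above. Rather than editing the base files in place, the same change can also be written as a small derived config (the file name ms_rcnn_r50_fpn_1x_dish.py used below is hypothetical):

    # Hypothetical derived config (ms_rcnn_r50_fpn_1x_dish.py) that only
    # overrides the class count and inherits everything else from the
    # stock MS R-CNN config.
    _base_ = './ms_rcnn_r50_fpn_1x_coco.py'

    model = dict(
        roi_head=dict(
            bbox_head=dict(num_classes=1),       # COCO default is 80
            mask_head=dict(num_classes=1),       # keep in sync with bbox_head
            mask_iou_head=dict(num_classes=1)))  # MS R-CNN's extra head, if present in your config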
