Reproducing Faster R-CNN on a Custom Dataset under Python 3.5

I. Overview

  This article walks through reproducing Faster R-CNN under Ubuntu 16.04, Python 3.5, CUDA 8.0, cuDNN 6.0, and Caffe 1.0. The official faster-rcnn code targets Python 2.7, so quite a few changes are needed; this article also covers how to train on your own dataset.

II. Setting up the Faster R-CNN environment

1. Prerequisites

(1) Install cython, python-opencv, and easydict

              by running:

                pip install cython 

                pip install easydict 

                apt-get install python-opencv   (skip this if it is already installed)
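A quick sanity check that all three dependencies are importable (note the apt package installs OpenCV for the system Python, so run the check with the interpreter you will actually use):

    python3 -c "import cython, easydict, cv2; print('OK')"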

2. Building Faster R-CNN

(1) Download py-faster-rcnn. In a folder of your choice, run:

git clone --recursive https://github.com/rbgirshick/py-faster-rcnn.git


(2) Enter py-faster-rcnn/lib and run: make

(3) Enter py-faster-rcnn/caffe-fast-rcnn

Copy the example build configuration by running: cp Makefile.config.example Makefile.config

Then edit Makefile.config just as for the Caffe build recorded in the previous article.

Once Makefile.config is set up, run: make -j8 && make pycaffe

If you have installed Caffe before, you may hit error 1 below; see that entry in the troubleshooting section for the fix.



III. Training and testing with the official dataset

1. Preparing the dataset

(1) Download the VOC2007 dataset:

        https://pan.baidu.com/s/1u50VVcfdmOCWPLDVqPHqzw

Unzip it and place the dataset under py-faster-rcnn/data.


(2) Download the model weights pre-trained on ImageNet (used for initialization):

        https://pan.baidu.com/s/12renKYoytqk9-9bMrI73Lg

Unzip and place the files under py-faster-rcnn/data.



2. Training the model

Note: the end of the training script contains code that tests the trained model; this part fails when run, so it is safest to delete the testing section at the end of the script.

Because py-faster-rcnn was written for Python 2.7, training will trigger many of the errors listed below (errors 2 through 8), e.g. Python 3 requires parentheses around the arguments of print().

Once everything is ready, run the following in the py-faster-rcnn folder:

./experiments/scripts/faster_rcnn_alt_opt.sh 0 ZF pascal_voc

When it finishes, the final model file ZF_faster_rcnn_final.caffemodel appears under py-faster-rcnn/output/faster_rcnn_alt_opt/voc_2007_trainval.



3. Testing the model

(1) Copy the trained ZF_faster_rcnn_final.caffemodel into py-faster-rcnn/data/faster_rcnn_models (create this folder if it does not exist).


(2) Edit py-faster-rcnn/tools/demo.py

Change the image names in im_names = ['1559.jpg','1564.jpg'] to your own image names (test images go in py-faster-rcnn/data/demo).

(3) Run the test. In py-faster-rcnn, execute:

            ./tools/demo.py --net zf


IV. Training and testing on a custom dataset

1. Building the dataset

(1) Normalize the images to be labeled, e.g. resize them all to a fixed size (optional).

(2) Rename the image files of each class. Here is a small hand-written Java program for batch renaming:

import java.io.File;

public class Demo {
	public static void main(String[] args) {
		Demo demo = new Demo();
		demo.test();
	}

	int i = 0;

	// Rename every file in the folder to <class>_<zero-padded index>.<original extension>
	public void test() {
		String path = "E:\\python\\togue\\dataset\\crack\\";
		File f = new File(path);
		File[] files = f.listFiles();
		for (File file : files) {
			String name = file.getName();
			String ext = name.substring(name.lastIndexOf('.') + 1);
			file.renameTo(new File(path + getNum(6) + "." + ext));
		}
	}

	// Build e.g. "crack_000001" for i = 1 and digit = 6
	public String getNum(int digit) {
		StringBuffer sbf = new StringBuffer("crack_");
		i++;
		String num = String.valueOf(i);
		for (int j = 0; j < digit - num.length(); j++) {
			sbf.append("0");
		}
		sbf.append(num); // the original appended the number only while it had fewer than 'digit' digits
		return sbf.toString();
	}
}

It batch-renames the files; see the code comments, it is very simple. This step just gives the images tidy, ordered names.

Recommended file name format: 【class_number.jpg】, e.g. crack_000001.jpg.


(3) Use a labeling tool to annotate each class of images. It is best to put each class in its own folder, label each folder separately so each produces a txt file, and combine the files at the end. (The tool below is borrowed from an unknown author; usage instructions are included.)

Download link:

    https://pan.baidu.com/s/19UFtwfaLtAsIhxtLrDl3hQ

Labeling produces an output.txt file that looks roughly like this:

crack_000001.jpg 1 106 50 143 240
crack_000002.jpg 1 128 29 192 214
crack_000003.jpg 1 106 32 164 256

The first column is the image name, the second is the object class, and the rest are the bounding-box coordinates (top-left and bottom-right corners).
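Each class folder then has its own output.txt. A minimal sketch for concatenating them into a single file (it assumes a hypothetical layout with one sub-folder per class, each containing an output.txt):

import glob

# Concatenate every per-class output.txt into one combined file
with open('output.txt', 'w') as out:
    for txt in sorted(glob.glob('*/output.txt')):
        with open(txt) as f:
            out.write(f.read())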

Once the per-class output.txt files are combined into a single output.txt, convert it into per-image XML annotation files with the following Python code:

from xml.dom.minidom import Document
import os
import os.path

# Folder where the generated PASCAL VOC style XML annotation files are written
xml_path = "E:\\資源共享\\python\\生成測試\\Annotations\\"

if not os.path.exists(xml_path):
    os.mkdir(xml_path)
  
def writeXml(tmp, imgname, w, h, objbud, wxml):  
    doc = Document()  
    # owner  
    annotation = doc.createElement('annotation')  
    doc.appendChild(annotation)  
    # owner  
    folder = doc.createElement('folder')  
    annotation.appendChild(folder)  
    folder_txt = doc.createTextNode("SkinLesion")  
    folder.appendChild(folder_txt)  
  
    filename = doc.createElement('filename')  
    annotation.appendChild(filename)  
    filename_txt = doc.createTextNode(imgname)  
    filename.appendChild(filename_txt)  
    # ones#  
    source = doc.createElement('source')  
    annotation.appendChild(source)  
  
    database = doc.createElement('database')  
    source.appendChild(database)  
    database_txt = doc.createTextNode("The SkinLesion Database")  
    database.appendChild(database_txt)  
  
    annotation_new = doc.createElement('annotation')  
    source.appendChild(annotation_new)  
    annotation_new_txt = doc.createTextNode("SkinLesion")  
    annotation_new.appendChild(annotation_new_txt)  
  
    image = doc.createElement('image')  
    source.appendChild(image)  
    image_txt = doc.createTextNode("flickr")  
    image.appendChild(image_txt)  
    # onee#  
    # twos#  
    size = doc.createElement('size')  
    annotation.appendChild(size)  
  
    width = doc.createElement('width')  
    size.appendChild(width)  
    width_txt = doc.createTextNode(str(w))  
    width.appendChild(width_txt)  
  
    height = doc.createElement('height')  
    size.appendChild(height)  
    height_txt = doc.createTextNode(str(h))  
    height.appendChild(height_txt)  
  
    depth = doc.createElement('depth')  
    size.appendChild(depth)  
    depth_txt = doc.createTextNode("3")  
    depth.appendChild(depth_txt)  
    # twoe#  
    segmented = doc.createElement('segmented')  
    annotation.appendChild(segmented)  
    segmented_txt = doc.createTextNode("0")  
    segmented.appendChild(segmented_txt)  


    for i in range(0, int(len(objbud) / 5)):  
        # threes#  
        object_new = doc.createElement("object")  
        annotation.appendChild(object_new)  
  
        name = doc.createElement('name')  
        object_new.appendChild(name)  
        name_txt = doc.createTextNode(objbud[i * 5])  
        name.appendChild(name_txt)  
  
        pose = doc.createElement('pose')  
        object_new.appendChild(pose)  
        pose_txt = doc.createTextNode("Unspecified")  
        pose.appendChild(pose_txt)  
  
        truncated = doc.createElement('truncated')  
        object_new.appendChild(truncated)  
        truncated_txt = doc.createTextNode("0")  
        truncated.appendChild(truncated_txt)  
  
        difficult = doc.createElement('difficult')  
        object_new.appendChild(difficult)  
        difficult_txt = doc.createTextNode("0")  
        difficult.appendChild(difficult_txt)  
        # threes-1#  
        bndbox = doc.createElement('bndbox')  
        object_new.appendChild(bndbox)  
  
        xmin = doc.createElement('xmin')  
        bndbox.appendChild(xmin)  
        xmin_txt = doc.createTextNode(objbud[i * 5 + 1])  
        xmin.appendChild(xmin_txt)  
  
        ymin = doc.createElement('ymin')  
        bndbox.appendChild(ymin)  
        ymin_txt = doc.createTextNode(objbud[i * 5 + 2])  
        ymin.appendChild(ymin_txt)  
  
        xmax = doc.createElement('xmax')  
        bndbox.appendChild(xmax)  
        xmax_txt = doc.createTextNode(objbud[i * 5 + 3])  
        xmax.appendChild(xmax_txt)  
  
        ymax = doc.createElement('ymax')  
        bndbox.appendChild(ymax)  
        ymax_txt = doc.createTextNode(objbud[i * 5 + 4])  
        ymax.appendChild(ymax_txt)  
        # threee-1#  
        # threee#  
          
    tempfile = tmp + "test.xml"  
    with open(tempfile, "wb+") as f:  
        f.write(doc.toprettyxml(indent="\t", encoding='utf-8'))  
  
    rewrite = open(tempfile, "r")  
    lines = rewrite.read().split('\n')  
    newlines = lines[1:len(lines) - 1]  
      
    fw = open(wxml, "w")  
    for i in range(0, len(newlines)):  
        fw.write(newlines[i] + "\n")  
      
    fw.close()  
    rewrite.close()  
    os.remove(tempfile)  
    return  
    
    
temp = "C:\\temp2\\"  
if not os.path.exists(temp):  
    os.mkdir(temp) 
    
fopen = open("E:\\資源共享\\python\\生成測試\\output.txt", 'r')
lines = fopen.readlines()
for line in lines:
    line = (line.split('\n'))[0]
    obj = line.split(' ')
    image_name = obj[0]
    xml_name = image_name.replace('.jpg', '.xml')
    filename = xml_path + xml_name
    obj = obj[1:]
    # Map the numeric class ids from output.txt to class names
    if obj[0] == '1':
        obj[0] = 'car'
    elif obj[0] == '2':
        obj[0] = 'nocar'
    # NOTE: image width and height are hardcoded to 299 here; change them to your image size
    writeXml(temp, image_name, 299, 299, obj, filename)

os.rmdir(temp)    
-> xml_path is the folder where the generated annotation files are written.

-> fopen = open("E:\\資源共享\\python\\生成測試\\output.txt", 'r') opens the combined output.txt file; change the path to your own.

-> In the mapping near the bottom, class id 1 corresponds to the car class and 2 to the nocar class; extend the mapping for your own classes.

-> The image width and height passed to writeXml are hardcoded to 299; change them to match your images.

Running it produces a standard XML annotation file for each image. The XML looks roughly like this:

<annotation>
	<folder>SkinLesion</folder>
	<filename>car_000001.jpg</filename>
	<source>
		<database>The SkinLesion Database</database>
		<annotation>SkinLesion</annotation>
		<image>flickr</image>
	</source>
	<size>
		<width>299</width>
		<height>299</height>
		<depth>3</depth>
	</size>
	<segmented>0</segmented>
	<object>
		<name>car</name>
		<pose>Unspecified</pose>
		<truncated>0</truncated>
		<difficult>0</difficult>
		<bndbox>
			<xmin>106</xmin>
			<ymin>50</ymin>
			<xmax>143</xmax>
			<ymax>240</ymax>
		</bndbox>
	</object>
</annotation>

(4) Generate the train/test split txt files

Create an ImageSets folder, create a Main folder inside it, and run the following Python code:

import os
import random
import numpy as np

xmlfilepath = 'E:\\資源共享\\python\\生成測試\\Annotations\\' 
txtsavepath =  'E:\\資源共享\\python\\生成測試\\' 

trainval_percent = 0.5  # fraction of all images that go into trainval (the rest become test)
train_percent = 0.5     # fraction of trainval that goes into train (the rest become val)

xmlfile = os.walk(xmlfilepath)  
numOfxml = sum([len(x) for _, _, x in xmlfile])

name_list = list(name for name in  os.listdir(xmlfilepath))
trainval = sorted(list(random.sample(name_list, int(numOfxml * trainval_percent))))
test = np.setdiff1d(np.array(name_list), np.array(trainval))
   
trainvalsize = len(trainval)
t_name_list = list(name for name in  trainval)
train = sorted(list(random.sample(t_name_list, int(trainvalsize * train_percent))))
val = np.setdiff1d(np.array(t_name_list), np.array(train))

ftrainval = open(txtsavepath + "ImageSets\\Main\\trainval.txt", 'w')
ftest = open(txtsavepath + "ImageSets\\Main\\test.txt", 'w')
ftrain = open(txtsavepath + "ImageSets\\Main\\train.txt", 'w')
fval = open(txtsavepath + "ImageSets\\Main\\val.txt", 'w')

for name in  os.listdir(xmlfilepath):
    if name in trainval:  
        ftrainval.write(name.replace(".xml", "") + "\n") 
        if name in train:
             ftrain.write(name.replace(".xml", "") + "\n")
        else:
            fval.write(name.replace(".xml", "") + "\n")
    else:
        ftest.write(name.replace(".xml", "") + "\n")   
        
ftrainval.close() 
ftrain.close() 
fval.close() 
ftest.close()

Here txtsavepath is the root directory where the txt files are written and xmlfilepath is the folder containing the XML annotation files.

Running it produces 4 txt files; each line is an image basename without extension, e.g. crack_000001. (For a reproducible split, call random.seed(...) before sampling.)


(5) Create a VOC2007 folder, create a JPEGImages folder inside it, and put all of the images used for labeling into that folder.


(6) The dataset should end up with the standard VOC2007 layout (the original screenshot is omitted):

    VOC2007/
        Annotations/     # the generated XML files
        ImageSets/Main/  # trainval.txt, test.txt, train.txt, val.txt
        JPEGImages/      # the images

(7) Replace the Annotations, ImageSets and JPEGImages folders in py-faster-rcnn/data/VOCdevkit2007/VOC2007 with the corresponding folders from your dataset.

(8) Download the model weights pre-trained on ImageNet (used for initialization):

               https://pan.baidu.com/s/12renKYoytqk9-9bMrI73Lg

Unzip and place the files under py-faster-rcnn/data.


2. Training the model (the following files need changes)

(1) py-faster-rcnn/models/pascal_voc/ZF/faster_rcnn_alt_opt/stage1_fast_rcnn_train.pt, 3 places:


layer {  
  name: 'data'  
  type: 'Python'  
  top: 'data'  
  top: 'rois'  
  top: 'labels'  
  top: 'bbox_targets'  
  top: 'bbox_inside_weights'  
  top: 'bbox_outside_weights'  
  python_param {  
    module: 'roi_data_layer.layer'  
    layer: 'RoIDataLayer'  
    param_str: "'num_classes': 16" # change to match your dataset: number of classes + 1  
  }  
}  

layer {
  name: "cls_score"
  type: "InnerProduct"
  bottom: "fc7"
  top: "cls_score"
  param { lr_mult: 1.0 }
  param { lr_mult: 2.0 }
  inner_product_param {
    num_output: 16 # change to match your dataset: number of classes + 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}

layer {
  name: "bbox_pred"
  type: "InnerProduct"
  bottom: "fc7"
  top: "bbox_pred"
  param { lr_mult: 1.0 }
  param { lr_mult: 2.0 }
  inner_product_param {
    num_output: 64 # change to match your dataset: (number of classes + 1) * 4
    weight_filler {
      type: "gaussian"
      std: 0.001
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
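For example, with the two-class car/nocar dataset built above, num_classes would be 3 (2 classes + background), cls_score num_output would be 3, and bbox_pred num_output would be 12 (3 × 4); the value 16 in these snippets corresponds to a 15-class dataset.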

(2) py-faster-rcnn/models/pascal_voc/ZF/faster_rcnn_alt_opt/stage1_rpn_train.pt:

layer {
  name: 'input-data'
  type: 'Python'
  top: 'data'
  top: 'im_info'
  top: 'gt_boxes'
  python_param {
    module: 'roi_data_layer.layer'
    layer: 'RoIDataLayer'
    param_str: "'num_classes': 16" # change to match your dataset: number of classes + 1
  }
}

(3) py-faster-rcnn/models/pascal_voc/ZF/faster_rcnn_alt_opt/stage2_fast_rcnn_train.pt:

layer {
  name: 'data'
  type: 'Python'
  top: 'data'
  top: 'rois'
  top: 'labels'
  top: 'bbox_targets'
  top: 'bbox_inside_weights'
  top: 'bbox_outside_weights'
  python_param {
    module: 'roi_data_layer.layer'
    layer: 'RoIDataLayer'
    param_str: "'num_classes': 16" # change to match your dataset: number of classes + 1
  }
}

layer {
  name: "cls_score"
  type: "InnerProduct"
  bottom: "fc7"
  top: "cls_score"
  param { lr_mult: 1.0 }
  param { lr_mult: 2.0 }
  inner_product_param {
    num_output: 16 # change to match your dataset: number of classes + 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}


layer {
  name: "bbox_pred"
  type: "InnerProduct"
  bottom: "fc7"
  top: "bbox_pred"
  param { lr_mult: 1.0 }
  param { lr_mult: 2.0 }
  inner_product_param {
    num_output: 64 # change to match your dataset: (number of classes + 1) * 4
    weight_filler {
      type: "gaussian"
      std: 0.001
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}

(4) py-faster-rcnn/models/pascal_voc/ZF/faster_rcnn_alt_opt/stage2_rpn_train.pt:

layer {
  name: 'input-data'
  type: 'Python'
  top: 'data'
  top: 'im_info'
  top: 'gt_boxes'
  python_param {
    module: 'roi_data_layer.layer'
    layer: 'RoIDataLayer'
    param_str: "'num_classes': 16" # change to match your dataset: number of classes + 1
  }
}

(5) py-faster-rcnn/models/pascal_voc/ZF/faster_rcnn_alt_opt/faster_rcnn_test.pt:

layer {
  name: "cls_score"
  type: "InnerProduct"
  bottom: "fc7"
  top: "cls_score"
  inner_product_param {
    num_output: 16 # change to match your dataset: number of classes + 1
  }
}


layer {
  name: "bbox_pred"
  type: "InnerProduct"
  bottom: "fc7"
  top: "bbox_pred"
  inner_product_param {
    num_output: 64 # change to match your dataset: (number of classes + 1) * 4
  }
}

(6) py-faster-rcnn/lib/datasets/pascal_voc.py:

class pascal_voc(imdb):
    def __init__(self, image_set, year, devkit_path=None):
        imdb.__init__(self, 'voc_' + year + '_' + image_set)
        self._year = year
        self._image_set = image_set
        self._devkit_path = self._get_default_path() if devkit_path is None \
                            else devkit_path
        self._data_path = os.path.join(self._devkit_path, 'VOC' + self._year)
        self._classes = ('__background__', # always index 0
                         'your_label_1', 'your_label_2', 'your_label_3', 'your_label_4')

Here self._data_path = os.path.join(self._devkit_path, 'VOC' + self._year) points to the training set folder. If your custom dataset simply replaces the Annotations, ImageSets and JPEGImages folders inside the original VOC2007 (recommended), this line does not need to change.

self._classes = ('__background__', 'your_label_1', 'your_label_2', 'your_label_3', 'your_label_4')

must be changed to your own labels, and their order matters.

Further down, in

cls = self._class_to_ind[obj.find('name').text.lower().strip()]

.lower() converts each label to lowercase, so the letters in your labels should be lowercase; if they are not, remove .lower() (using all-lowercase labels is recommended).


(7) py-faster-rcnn/lib/datasets/imdb.py: change the append_flipped_images(self) function to:

def append_flipped_images(self):
        num_images = self.num_images
        widths = [PIL.Image.open(self.image_path_at(i)).size[0]
                  for i in range(num_images)]
        for i in range(num_images):
            boxes = self.roidb[i]['boxes'].copy()
            oldx1 = boxes[:, 0].copy()
            oldx2 = boxes[:, 2].copy()
            boxes[:, 0] = widths[i] - oldx2 - 1
            boxes[:, 2] = widths[i] - oldx1 - 1
            assert (boxes[:, 2] >= boxes[:, 0]).all()
            entry = {'boxes' : boxes,
                     'gt_overlaps' : self.roidb[i]['gt_overlaps'],
                     'gt_classes' : self.roidb[i]['gt_classes'],
                     'flipped' : True}
            self.roidb.append(entry)
        self._image_index = self._image_index * 2

If the assert (boxes[:, 2] >= boxes[:, 0]).all() line raises an AssertionError here, see the fix under error 9.

(8) To avoid mixing results with a previous model, delete (or rename) the output folder before training, and also delete the files in py-faster-rcnn/data/cache and in py-faster-rcnn/data/VOCdevkit2007/annotations_cache (if present).

(9) The learning rate and similar settings can be changed in the solver files in py-faster-rcnn/models/pascal_voc/ZF/faster_rcnn_alt_opt. The iteration counts can be changed in py-faster-rcnn/tools/train_faster_rcnn_alt_opt.py: max_iters = [80000, 40000, 80000, 40000]

are the iteration counts for the 4 stages (stage-1 RPN, stage-1 Fast R-CNN, stage-2 RPN, stage-2 Fast R-CNN); set them to whatever you want.

If you change these values, it is best to also update the corresponding solver files (there are 4) in py-faster-rcnn/models/pascal_voc/ZF/faster_rcnn_alt_opt so that stepsize stays smaller than the value chosen above.
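For example, a sketch of the kind of edit involved (the solver file name comes from stock py-faster-rcnn; the values are only illustrative):

# models/pascal_voc/ZF/faster_rcnn_alt_opt/stage1_rpn_solver60k80k.pt
# if max_iters for this stage is lowered to 40000, keep stepsize below it:
stepsize: 30000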

(10) Run the training

As with the official dataset, you can delete the testing part of faster_rcnn_alt_opt.sh, then, in py-faster-rcnn, run:

            ./experiments/scripts/faster_rcnn_alt_opt.sh 0 ZF pascal_voc

When it finishes, the final model file ZF_faster_rcnn_final.caffemodel appears under py-faster-rcnn/output/faster_rcnn_alt_opt/voc_2007_trainval.


3. Testing the trained model

Same as testing with the official dataset.

4. Exposing a callable interface

Since the detector needs to be callable from other programs, demo.py is rewritten so that callers pass an image path on the command line:

#!/usr/bin/env python

# --------------------------------------------------------
# Faster R-CNN
# Copyright (c) 2015 Microsoft
# Licensed under The MIT License [see LICENSE for details]
# Written by Ross Girshick
# --------------------------------------------------------

"""
Demo script showing detections in sample images.

See README.md for installation instructions before running.
"""

import _init_paths
from fast_rcnn.config import cfg
from fast_rcnn.test import im_detect
from fast_rcnn.nms_wrapper import nms
from utils.timer import Timer
import matplotlib.pyplot as plt
import numpy as np
import scipy.io as sio
import caffe, os, sys, cv2
import argparse

CLASSES = ('__background__','nevus','melanoma')  # must match the training labels, in the same order
#VARIABLE = None
NETS = {'vgg16': ('VGG16',
                  'VGG16_faster_rcnn_final.caffemodel'),
        'zf': ('ZF',
                  'ZF_faster_rcnn_final.caffemodel')}

def vis_detections(im, class_name, dets, thresh=0.5):
    """Draw detected bounding boxes."""
    inds = np.where(dets[:, -1] >= thresh)[0]
    if len(inds) == 0:
        return

    im = im[:, :, (2, 1, 0)]
    fig, ax = plt.subplots(figsize=(12, 12))
    ax.imshow(im, aspect='equal')
    for i in inds:
        bbox = dets[i, :4]
        score = dets[i, -1]

        ax.add_patch(
            plt.Rectangle((bbox[0], bbox[1]),
                          bbox[2] - bbox[0],
                          bbox[3] - bbox[1], fill=False,
                          edgecolor='red', linewidth=3.5)
            )
        ax.text(bbox[0], bbox[1] - 2,
                '{:s} {:.3f}'.format(class_name, score),
                bbox=dict(facecolor='blue', alpha=0.5),
                fontsize=14, color='white')
    print("class_name:",class_name,"--score:",score)

    ax.set_title(('{} detections with '
                  'p({} | box) >= {:.1f}').format(class_name, class_name,
                                                  thresh),
                  fontsize=14)
    plt.axis('off')
    plt.tight_layout()
    plt.draw()

def demo(net,_imgpath):
    """Detect object classes in an image using pre-computed object proposals."""

    
    # im_file = os.path.join(cfg.DATA_DIR, 'demo', image_name)
    im_file = os.path.join(_imgpath)
    im = cv2.imread(im_file)

    # Detect all object classes and regress object bounds
    timer = Timer()
    timer.tic()
    scores, boxes = im_detect(net, im)
    timer.toc()
    #print (('Detection took {:.3f}s for ''{:d} object proposals').format(timer.total_time, boxes.shape[0]))

    # Visualize detections for each class
    CONF_THRESH = 0.8
    NMS_THRESH = 0.3
    for cls_ind, cls in enumerate(CLASSES[1:]):
        cls_ind += 1 # because we skipped background
        cls_boxes = boxes[:, 4*cls_ind:4*(cls_ind + 1)]
        cls_scores = scores[:, cls_ind]
        dets = np.hstack((cls_boxes,
                          cls_scores[:, np.newaxis])).astype(np.float32)
        keep = nms(dets, NMS_THRESH)
        dets = dets[keep, :]
        vis_detections(im, cls, dets, thresh=CONF_THRESH)


def parse_args():
    """Parse input arguments."""
    parser = argparse.ArgumentParser(description='Faster R-CNN demo')
    parser.add_argument('--gpu', dest='gpu_id', help='GPU device id to use [0]',
                        default=0, type=int)
    parser.add_argument('--cpu', dest='cpu_mode',
                        help='Use CPU mode (overrides --gpu)',
                        action='store_true')
    parser.add_argument('--net', dest='demo_net', help='Network to use [vgg16]',
                        choices=NETS.keys(), default='zf')
                        #choices=NETS.keys(), default='vgg16')
    parser.add_argument('--imgpath', dest='imgpath', help='Absolute path to detect pictures',default='/usr/develop/repertory/py-faster-rcnn/tools/')
	

    args = parser.parse_args()

    return args

if __name__ == '__main__':
    
    cfg.TEST.HAS_RPN = True  # Use RPN for proposals

    args = parse_args()

    prototxt = os.path.join(cfg.MODELS_DIR, NETS[args.demo_net][0],'faster_rcnn_alt_opt', 'faster_rcnn_test.pt')
    print(cfg.DATA_DIR)
    
    caffemodel = os.path.join(cfg.DATA_DIR, 'faster_rcnn_models',NETS[args.demo_net][1])

    if not os.path.isfile(caffemodel):
        raise IOError(('{:s} not found.\nDid you run ./data/script/'
                       'fetch_faster_rcnn_models.sh?').format(caffemodel))

    if args.cpu_mode:
        caffe.set_mode_cpu()
    else:
        caffe.set_mode_gpu()
        caffe.set_device(args.gpu_id)
        cfg.GPU_ID = args.gpu_id
    net = caffe.Net(prototxt, caffemodel, caffe.TEST)

    #print('\n\nLoaded network {:s}'.format(caffemodel))

    # Warmup on a dummy image
    im = 128 * np.ones((300, 500, 3), dtype=np.uint8)
    for i in range(2):
        _, _= im_detect(net, im)

    #Call detection pictures
    #print("args.imgpath : ",args.imgpath)
    demo(net, args.imgpath)

    #plt.show()

It is invoked like this:

    ./demo.py --imgpath /work/01.jpg        (note: two dashes before imgpath)
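A minimal sketch of how another Python program might call this interface (the repository path and image path here are hypothetical):

import subprocess

# Run the detector on one image and capture its console output
result = subprocess.run(
    ['./tools/demo.py', '--net', 'zf', '--imgpath', '/work/01.jpg'],
    cwd='/usr/develop/repertory/py-faster-rcnn',  # repository root
    stdout=subprocess.PIPE, universal_newlines=True)
print(result.stdout)  # contains the "class_name: ... --score: ..." lines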


V. Troubleshooting

(Because of the version differences, faster-rcnn produces a lot of errors; the following is a partial record.)

1. Error: Caffe itself installs fine under Python 3.5, but building caffe-fast-rcnn keeps failing with messages like:

In file included from ./include/caffe/util/device_alternate.hpp:40:0,
                 from ./include/caffe/common.hpp:19,
                 from ./include/caffe/blob.hpp:8,
                 from ./include/caffe/fast_rcnn_layers.hpp:13,
                 from src/caffe/layers/smooth_L1_loss_layer.cpp:8:
./include/caffe/util/cudnn.hpp: In function ‘const char* cudnnGetErrorString(cudnnStatus_t)’:
./include/caffe/util/cudnn.hpp:21:10: warning: enumeration value ‘CUDNN_STATUS_RUNTIME_PREREQUISITE_MISSING’ not handled in switch [-Wswitch]
   switch (status) {
          ^
./include/caffe/util/cudnn.hpp:21:10: warning: enumeration value ‘CUDNN_STATUS_RUNTIME_IN_PROGRESS’ not handled in switch [-Wswitch]
./include/caffe/util/cudnn.hpp:21:10: warning: enumeration value ‘CUDNN_STATUS_RUNTIME_FP_OVERFLOW’ not handled in switch [-Wswitch]
./include/caffe/util/cudnn.hpp: In function ‘void caffe::cudnn::setConvolutionDesc(cudnnConvolutionStruct**, cudnnTensorDescriptor_t, cudnnFilterDescriptor_t, int, int, int, int)’:

Solution:

Replace all cuDNN-related .h/.cpp/.cu files in caffe-fast-rcnn with the ones from a Caffe tree that builds successfully against your cuDNN:

(1) Copy cudnn.hpp from /usr/develop/repertory/caffe/include/caffe/util into the corresponding folder of caffe-fast-rcnn.

(2) Copy cudnn.cpp from /usr/develop/repertory/caffe/src/caffe/util into the corresponding folder of caffe-fast-rcnn.

(3) From /usr/develop/repertory/caffe/include/caffe/layers, copy

cudnn_conv_layer.hpp, cudnn_deconv_layer.hpp, cudnn_lcn_layer.hpp, cudnn_lrn_layer.hpp, cudnn_pooling_layer.hpp, cudnn_relu_layer.hpp, cudnn_sigmoid_layer.hpp, cudnn_softmax_layer.hpp, cudnn_tanh_layer.hpp

into the corresponding folder of caffe-fast-rcnn.

(4) From /usr/develop/repertory/caffe/src/caffe/layers, copy cudnn_conv_layer.cpp, cudnn_conv_layer.cu, cudnn_deconv_layer.cpp, cudnn_deconv_layer.cu, cudnn_lcn_layer.cpp, cudnn_lcn_layer.cu, cudnn_lrn_layer.cpp, cudnn_lrn_layer.cu, cudnn_pooling_layer.cpp, cudnn_pooling_layer.cu, cudnn_relu_layer.cpp, cudnn_relu_layer.cu, cudnn_sigmoid_layer.cpp, cudnn_sigmoid_layer.cu, cudnn_softmax_layer.cpp, cudnn_softmax_layer.cu, cudnn_tanh_layer.cpp, cudnn_tanh_layer.cu

into the corresponding folder of caffe-fast-rcnn.

Then rebuild.



2. Error: ImportError: /usr/develop/repertory/py-faster-rcnn/tools/../caffe-fast-rcnn/python/caffe/_caffe.so: undefined symbol: _ZN5boost6python6detail11init_moduleER11PyModuleDefPFvvE

Solution:

The Boost version in the Makefile does not match. Boost.Python is essentially a translator between C++ and Python, so if a Python 3 program is linked against the Python 2 version of the library, syntax, definitions, and so on inevitably conflict.

In the Makefile, find the variable PYTHON_LIBRARIES:

PYTHON_LIBRARIES ?= boost_python python2.7  

and change it to: PYTHON_LIBRARIES := boost_python3 python3.5m



3. Error:

File "/usr/develop/repertory/py-faster-rcnn/tools/../lib/datasets/pascal_voc.py", line 16, in <module>
    import cPickle
ImportError: No module named 'cPickle'

Solution:

Python 2 has cPickle, but Python 3 does not; change cPickle to pickle.
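Alternatively, a small compatibility shim at the top of the file works under both versions:

try:
    import cPickle as pickle   # Python 2
except ImportError:
    import pickle              # Python 3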


4. Error:

Traceback (most recent call last):
  File "./tools/train_faster_rcnn_alt_opt.py", line 211, in <module>
    cfg_from_file(args.cfg_file)
  File "/usr/develop/repertory/py-faster-rcnn/tools/../lib/fast_rcnn/config.py", line 263, in cfg_from_file
    _merge_a_into_b(yaml_cfg, __C)
  File "/usr/develop/repertory/py-faster-rcnn/tools/../lib/fast_rcnn/config.py", line 232, in _merge_a_into_b
    for k, v in a.iteritems():
AttributeError: 'EasyDict' object has no attribute 'iteritems'

Solution:

iteritems() was removed in Python 3; change a.iteritems() to a.items().


5. Error: AttributeError: 'EasyDict' object has no attribute 'has_key'

Solution:

The has_key method exists in Python 2 but was removed in Python 3. For example, change:

if dict.has_key(word):

to:

if word in dict:

 


6. Error: NameError: name 'xrange' is not defined

Solution:

Change xrange to range, and make sure the argument of range(x) is an integer: int(x).
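Alternatively, a one-off shim at the top of the affected file avoids renaming every call site:

try:
    xrange                 # Python 2
except NameError:
    xrange = range         # Python 3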

 

7. Error: AttributeError: 'module' object has no attribute 'text_format'

Solution:

Add the line import google.protobuf.text_format near the top of train.py.

8. Error: TypeError: a bytes-like object is required, not 'str'

Solution:

This is a generic Python 3 str/bytes mismatch: text is being written to (or read from) something opened in binary mode, or vice versa. Locate the line in the traceback and either open the file in the matching mode ('rb'/'wb' for pickled data) or encode/decode explicitly; a sketch follows below.
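A hedged sketch, assuming the mismatch comes from loading pickled data (the cache_file/roidb names are illustrative):

# before (fails under Python 3, pickled data is binary):
#   with open(cache_file, 'r') as fid:
#       roidb = pickle.load(fid)

# after: open pickled data in binary mode
with open(cache_file, 'rb') as fid:
    roidb = pickle.load(fid)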



9. Error: during faster-rcnn training, assert (boxes[:, 2] >= boxes[:, 0]).all() fails.

Cause: the top-left coordinate (x, y) of a box may be 0, or the labeled region may extend beyond the image, and faster-rcnn subtracts 1 from Xmin, Ymin, Xmax and Ymax when loading annotations. If Xmin is 0, subtracting 1 wraps around to 65535, because the coordinates are stored as unsigned 16-bit integers.

Solution:

① In lib/datasets/imdb.py, in the append_flipped_images() function, sanitize the data by adding the following right below the line boxes[:, 2] = widths[i] - oldx1 - 1:

for b in range(len(boxes)):
    if boxes[b][2] < boxes[b][0]:
        boxes[b][0] = 0

② In lib/datasets/pascal_voc.py, modify the _load_pascal_annotation() function: remove the "- 1" subtracted from Xmin, Ymin, Xmax and Ymax (the screenshot of the change is omitted; see the sketch below).
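A sketch of the resulting coordinate parsing, assuming the stock py-faster-rcnn code (which reads each value and subtracts 1):

x1 = float(bbox.find('xmin').text)
y1 = float(bbox.find('ymin').text)
x2 = float(bbox.find('xmax').text)
y2 = float(bbox.find('ymax').text)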


③ In lib/fast_rcnn/config.py, disable image flipping:

# Use horizontally-flipped images during training?
__C.TRAIN.USE_FLIPPED = False




