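I have used this for a long time without writing it up, so here are my notes, as usual. The overall procedure follows the flowchart above: install, configure, download the pretrained model, add your own data, modify the model definition, then train and test. Every model follows the same workflow.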
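Installation
1. Download the source code
https://github.com/Orpine/py-R-FCN
(The "py" prefix means this is the Python version.)
2. Install the Caffe dependencies and Caffe
1) Install the dependencies
pip install cython
pip install easydict
apt-get install python-opencv
2) Download Caffe
git clone https://github.com/Microsoft/caffe.git
3) Configure Caffe (Makefile.config; the Python layers used in the prototxt files below need WITH_PYTHON_LAYER := 1)
4) Open a terminal, cd into <your R-FCN path>/lib, and run make
5) Build Caffe's Python interface: make pycaffe
Installation is complete.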
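Before moving on, it can help to confirm that the Python dependencies installed above are importable. The snippet below is a minimal sketch and not part of py-R-FCN: `missing_modules` is a hypothetical helper, and the import names (`Cython`, `easydict`, `cv2`) are my assumption about what the three packages expose.

```python
import importlib


def missing_modules(names):
    """Return the module names from `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            # The package is absent or not visible to this interpreter.
            missing.append(name)
    return missing
```

Calling `missing_modules(["Cython", "easydict", "cv2"])` with the same interpreter that will run the training scripts should return an empty list once everything is installed.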
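Next, download the pretrained models and try a test run:
The download requires a VPN. The archive contains the ResNet-50 and ResNet-101 ImageNet-pretrained models. I have uploaded a copy to Baidu Cloud.
Link: https://pan.baidu.com/s/1-M0r13ULm-8qdq34qfHoPQ
Extraction code: a72o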
Testing
Put the models in the corresponding location in the R-FCN project:
$RFCN_ROOT/data/rfcn_models/resnet50_rfcn_final.caffemodel
$RFCN_ROOT/data/rfcn_models/resnet101_rfcn_final.caffemodel
Open a terminal and run:
cd $RFCN_ROOT
./tools/demo_rfcn.py --net ResNet-50
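If the demo complains that it cannot find a model, a quick path check narrows things down. This is a sketch under the assumption that the two file names listed above are the ones the demo expects; `missing_models` is a hypothetical helper, not something shipped with the repository.

```python
import os

# File names taken from the paths listed above.
EXPECTED_MODELS = (
    "resnet50_rfcn_final.caffemodel",
    "resnet101_rfcn_final.caffemodel",
)


def missing_models(rfcn_root, names=EXPECTED_MODELS):
    """Return the expected model paths under data/rfcn_models that do not exist."""
    model_dir = os.path.join(rfcn_root, "data", "rfcn_models")
    paths = [os.path.join(model_dir, name) for name in names]
    return [path for path in paths if not os.path.isfile(path)]
```

For example, `missing_models(os.environ["RFCN_ROOT"])` lists whichever of the two files still has to be copied into place.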
Training on your own data
Put your dataset under the data folder in the following layout:
VOCdevkit/VOC2007
VOC2007 holds your own data, in three main folders:
JPEGImages, Annotations, ImageSets
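A common slip when assembling this layout is an image with no annotation file. The sketch below assumes the usual VOC naming, where `JPEGImages/<id>.jpg` pairs with `Annotations/<id>.xml`; `unpaired_images` is a hypothetical helper written for this post.

```python
import os


def unpaired_images(voc_root):
    """Return image ids in JPEGImages that have no matching XML in Annotations."""
    def stems(folder, ext):
        # Collect file names without extension for files ending in `ext`.
        path = os.path.join(voc_root, folder)
        return {os.path.splitext(name)[0] for name in os.listdir(path)
                if name.lower().endswith(ext)}

    images = stems("JPEGImages", ".jpg")
    annotations = stems("Annotations", ".xml")
    return sorted(images - annotations)
```

Here `voc_root` is the VOC2007 directory itself, for example `data/VOCdevkit/VOC2007`; an empty result means every image has an annotation.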
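1) Modify the model structure parameters
Your dataset's classes differ from those of the pretrained model, so the number of output neurons differs and has to be set by hand. This touches on the difference between a pretrained model and a trained test model.
A pretrained model holds intermediate network weights that someone else saved; the output neurons at the tail are not stored, so it can be reused after fine-tuning. A trained test model stores every parameter, so you can use it directly only if your classes match the original ones, which is unlikely. In practice the change amounts to editing a few numbers.
The Caffe model definition files to modify are:
<1> Modify class-aware/train_ohem.prototxt
<2> Modify class-aware/test.prototxt
<3> Modify train_agnostic.prototxt
<4> Modify train_agnostic_ohem.prototxt
<5> Modify test_agnostic.prototxt
All of them are under models/pascal_voc. ResNet-50 and ResNet-101 have separate folders, so edit the one you use. This walkthrough uses ResNet-50 with end-to-end training as the example.
Open $RFCN_ROOT/models/pascal_voc/ResNet-50/rfcn_end2end
cls_num = number of classes in the dataset + 1 (background).
For example, 15 classes plus 1 background class gives cls_num = 16.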
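The numbers that appear in the prototxt files all follow from cls_num and the 7x7 score-map grid (group_size: 7). As a small worked sketch, the hypothetical helper below reproduces them; the dictionary keys are names I made up for illustration, not prototxt fields.

```python
def rfcn_layer_sizes(num_object_classes, group_size=7):
    """Compute the class-dependent values for the class-aware prototxt files."""
    cls_num = num_object_classes + 1   # +1 for the background class
    bins = group_size * group_size     # score_maps_size^2
    return {
        "num_classes": cls_num,                      # param_str 'num_classes'
        "rfcn_cls_num_output": cls_num * bins,       # cls_num*(score_maps_size^2)
        "rfcn_bbox_num_output": 4 * cls_num * bins,  # 4*cls_num*(score_maps_size^2)
        "cls_output_dim": cls_num,                   # PSROIPooling output_dim (cls)
        "bbox_output_dim": 4 * cls_num,              # PSROIPooling output_dim (bbox)
    }
```

For 15 object classes this gives 16, 784, 3136, 16 and 64, which are the values used throughout this post.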
<1> Modify class-aware/train_ohem.prototxt
layer {
  name: 'input-data'
  type: 'Python'
  top: 'data'
  top: 'im_info'
  top: 'gt_boxes'
  python_param {
    module: 'roi_data_layer.layer'
    layer: 'RoIDataLayer'
    param_str: "'num_classes': 16" #cls_num
  }
}
layer {
  name: 'roi-data'
  type: 'Python'
  bottom: 'rpn_rois'
  bottom: 'gt_boxes'
  top: 'rois'
  top: 'labels'
  top: 'bbox_targets'
  top: 'bbox_inside_weights'
  top: 'bbox_outside_weights'
  python_param {
    module: 'rpn.proposal_target_layer'
    layer: 'ProposalTargetLayer'
    param_str: "'num_classes': 16" #cls_num
  }
}
layer {
  bottom: "conv_new_1"
  top: "rfcn_cls"
  name: "rfcn_cls"
  type: "Convolution"
  convolution_param {
    num_output: 784 #cls_num*(score_maps_size^2)
    kernel_size: 1
    pad: 0
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
  param {
    lr_mult: 1.0
  }
  param {
    lr_mult: 2.0
  }
}
layer {
  bottom: "conv_new_1"
  top: "rfcn_bbox"
  name: "rfcn_bbox"
  type: "Convolution"
  convolution_param {
    num_output: 3136 #4*cls_num*(score_maps_size^2)
    kernel_size: 1
    pad: 0
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
  param {
    lr_mult: 1.0
  }
  param {
    lr_mult: 2.0
  }
}
layer {
  bottom: "rfcn_cls"
  bottom: "rois"
  top: "psroipooled_cls_rois"
  name: "psroipooled_cls_rois"
  type: "PSROIPooling"
  psroi_pooling_param {
    spatial_scale: 0.0625
    output_dim: 16 #cls_num
    group_size: 7
  }
}
layer {
  bottom: "rfcn_bbox"
  bottom: "rois"
  top: "psroipooled_loc_rois"
  name: "psroipooled_loc_rois"
  type: "PSROIPooling"
  psroi_pooling_param {
    spatial_scale: 0.0625
    output_dim: 64 #4*cls_num
    group_size: 7
  }
}
<2> Modify class-aware/test.prototxt
layer {
  bottom: "conv_new_1"
  top: "rfcn_cls"
  name: "rfcn_cls"
  type: "Convolution"
  convolution_param {
    num_output: 784 #cls_num*(score_maps_size^2)
    kernel_size: 1
    pad: 0
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
  param {
    lr_mult: 1.0
  }
  param {
    lr_mult: 2.0
  }
}
layer {
  bottom: "conv_new_1"
  top: "rfcn_bbox"
  name: "rfcn_bbox"
  type: "Convolution"
  convolution_param {
    num_output: 3136 #4*cls_num*(score_maps_size^2)
    kernel_size: 1
    pad: 0
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
  param {
    lr_mult: 1.0
  }
  param {
    lr_mult: 2.0
  }
}
layer {
  bottom: "rfcn_cls"
  bottom: "rois"
  top: "psroipooled_cls_rois"
  name: "psroipooled_cls_rois"
  type: "PSROIPooling"
  psroi_pooling_param {
    spatial_scale: 0.0625
    output_dim: 16 #cls_num
    group_size: 7
  }
}
layer {
  bottom: "rfcn_bbox"
  bottom: "rois"
  top: "psroipooled_loc_rois"
  name: "psroipooled_loc_rois"
  type: "PSROIPooling"
  psroi_pooling_param {
    spatial_scale: 0.0625
    output_dim: 64 #4*cls_num
    group_size: 7
  }
}
layer {
  name: "cls_prob_reshape"
  type: "Reshape"
  bottom: "cls_prob_pre"
  top: "cls_prob"
  reshape_param {
    shape {
      dim: -1
      dim: 16 #cls_num
    }
  }
}
layer {
  name: "bbox_pred_reshape"
  type: "Reshape"
  bottom: "bbox_pred_pre"
  top: "bbox_pred"
  reshape_param {
    shape {
      dim: -1
      dim: 64 #4*cls_num
    }
  }
}
<3> Modify train_agnostic.prototxt
layer {
  name: 'input-data'
  type: 'Python'
  top: 'data'
  top: 'im_info'
  top: 'gt_boxes'
  python_param {
    module: 'roi_data_layer.layer'
    layer: 'RoIDataLayer'
    param_str: "'num_classes': 16" #cls_num
  }
}
layer {
  bottom: "conv_new_1"
  top: "rfcn_cls"
  name: "rfcn_cls"
  type: "Convolution"
  convolution_param {
    num_output: 784 #cls_num*(score_maps_size^2) ###
    kernel_size: 1
    pad: 0
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
  param {
    lr_mult: 1.0
  }
  param {
    lr_mult: 2.0
  }
}
layer {
  bottom: "rfcn_cls"
  bottom: "rois"
  top: "psroipooled_cls_rois"
  name: "psroipooled_cls_rois"
  type: "PSROIPooling"
  psroi_pooling_param {
    spatial_scale: 0.0625
    output_dim: 16 #cls_num ###
    group_size: 7
  }
}
<4> Modify train_agnostic_ohem.prototxt
layer {
  name: 'input-data'
  type: 'Python'
  top: 'data'
  top: 'im_info'
  top: 'gt_boxes'
  python_param {
    module: 'roi_data_layer.layer'
    layer: 'RoIDataLayer'
    param_str: "'num_classes': 16" #cls_num ###
  }
}
layer {
  bottom: "conv_new_1"
  top: "rfcn_cls"
  name: "rfcn_cls"
  type: "Convolution"
  convolution_param {
    num_output: 784 #cls_num*(score_maps_size^2) ###
    kernel_size: 1
    pad: 0
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
  param {
    lr_mult: 1.0
  }
  param {
    lr_mult: 2.0
  }
}
layer {
  bottom: "rfcn_cls"
  bottom: "rois"
  top: "psroipooled_cls_rois"
  name: "psroipooled_cls_rois"
  type: "PSROIPooling"
  psroi_pooling_param {
    spatial_scale: 0.0625
    output_dim: 16 #cls_num ###
    group_size: 7
  }
}
<5> Modify test_agnostic.prototxt
layer {
  bottom: "conv_new_1"
  top: "rfcn_cls"
  name: "rfcn_cls"
  type: "Convolution"
  convolution_param {
    num_output: 784 #cls_num*(score_maps_size^2) ###
    kernel_size: 1
    pad: 0
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
  param {
    lr_mult: 1.0
  }
  param {
    lr_mult: 2.0
  }
}
layer {
  bottom: "rfcn_cls"
  bottom: "rois"
  top: "psroipooled_cls_rois"
  name: "psroipooled_cls_rois"
  type: "PSROIPooling"
  psroi_pooling_param {
    spatial_scale: 0.0625
    output_dim: 16 #cls_num ###
    group_size: 7
  }
}
layer {
  name: "cls_prob_reshape"
  type: "Reshape"
  bottom: "cls_prob_pre"
  top: "cls_prob"
  reshape_param {
    shape {
      dim: -1
      dim: 16 #cls_num ###
    }
  }
}
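With five files to edit by hand, it is easy to miss one number. The sketch below is a rough text-level check, not a real prototxt parser: `layer_num_output` is a hypothetical helper that assumes, as in the excerpts above, that a layer's `name:` line comes before its `num_output:` line.

```python
import re

QUOTE = "[\"']"  # prototxt strings may use either quote style


def layer_num_output(prototxt_text, layer_name):
    """Return the first num_output that follows `name: "<layer_name>"`, or None.

    Caveat: this is a plain regex search. If the named layer has no
    num_output, the value of a later layer would be returned instead.
    """
    pattern = (r"name:\s*" + QUOTE + re.escape(layer_name) + QUOTE
               + r".*?num_output:\s*(\d+)")
    match = re.search(pattern, prototxt_text, re.DOTALL)
    return int(match.group(1)) if match else None
```

After reading a file with `open(path).read()`, comparing `layer_num_output(text, "rfcn_cls")` against cls_num * 49 catches a value that was left at its old setting.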
2) Modify some of the code
The label names in your dataset are your own as well, so they have to be set by hand.
<1> $RFCN_ROOT/lib/datasets/pascal_voc.py
class pascal_voc(imdb):
    def __init__(self, image_set, year, devkit_path=None):
        imdb.__init__(self, 'voc_' + year + '_' + image_set)
        self._year = year
        self._image_set = image_set
        self._devkit_path = self._get_default_path() if devkit_path is None \
                            else devkit_path
        self._data_path = os.path.join(self._devkit_path, 'VOC' + self._year)
        # Replace the placeholders below with your own label names.
        self._classes = ('__background__',  # always index 0
                         'your_label_1', 'your_label_2', 'your_label_3', 'your_label_4'
                         )
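The length of this tuple, background included, is the cls_num that goes into the prototxt files, so the two have to agree. The standalone sketch below shows the relationship; `build_class_index` is a hypothetical helper and the label names are placeholders.

```python
def build_class_index(labels):
    """Prepend the background class and map every class name to an integer id."""
    classes = ('__background__',) + tuple(labels)  # background is always index 0
    class_to_ind = dict(zip(classes, range(len(classes))))
    return classes, class_to_ind


classes, class_to_ind = build_class_index(
    ['your_label_1', 'your_label_2', 'your_label_3', 'your_label_4'])
cls_num = len(classes)  # 4 labels + background
```

With these four placeholder labels cls_num is 5, so the prototxt values would be 5, 245, 980, 5 and 20 rather than the 16-class numbers used in the example.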
<2>$RFCN_ROOT/lib/datasets/imdb.py
An error is raised here; for the fix, see:
http://blog.csdn.net/xzzppp/article/details/52036794
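The note above does not say which error appears. One failure that is commonly reported for this file is the `assert (boxes[:, 2] >= boxes[:, 0]).all()` check in `append_flipped_images`, which trips when an annotation's x coordinate falls outside the image. As a hedged illustration only (this is not the repository's code, and `flip_boxes` is a hypothetical helper), the sketch below clamps x coordinates before mirroring so the flipped box stays valid.

```python
def flip_boxes(boxes, width):
    """Mirror (x1, y1, x2, y2) boxes horizontally inside an image of `width` pixels."""
    flipped = []
    for x1, y1, x2, y2 in boxes:
        # Clamp x coordinates to the valid 0-based pixel range first.
        x1 = max(0, min(x1, width - 1))
        x2 = max(0, min(x2, width - 1))
        # Mirroring swaps the roles of the left and right edges.
        new_x1 = width - x2 - 1
        new_x2 = width - x1 - 1
        assert new_x2 >= new_x1, "annotation has x1 > x2"
        flipped.append((new_x1, y1, new_x2, y2))
    return flipped
```

For instance, a box whose left edge became -1 after the 1-based to 0-based shift is treated as starting at column 0, so the mirrored box ends at column `width - 1` instead of wrapping around.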
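To change the number of training iterations, edit the ITERS variable in the training script experiments/scripts/rfcn_end2end_ohem.sh.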
3) Training
cd $RFCN_ROOT
./experiments/scripts/rfcn_end2end_ohem.sh 0 ResNet-50 pascal_voc
(The arguments are the GPU id, the network, and the dataset.)
4) Testing
cd $RFCN_ROOT
./tools/demo_rfcn.py --net ResNet-50
References:
https://blog.csdn.net/sinat_30071459/article/details/53202977