TensorFlow Learning: TensorFlow Object Detection API (Part 3: Model Training)

In June 2017, Google open-sourced the TensorFlow Object Detection API. The project implements most of the mainstream deep-learning object detection frameworks in TensorFlow, including Faster R-CNN.

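This series of articles covers:

(1) how to install the TensorFlow Object Detection API (see "TensorFlow Object Detection API Installation");

(2) how to run object detection with an already trained model (see the linked article);

(3) how to train your own model.

If your environment is Windows 10 with CPU only, see "Win10 CPU TensorFlow Object Detection API Installation and Testing".

The earlier parts covered installation and detection with existing models. This part covers training your own model for object detection.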

Step 1: Prepare the training dataset

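As the saying goes, artificial intelligence is seventy percent manual labor: you have to prepare the training dataset and annotate it yourself beforehand. To walk through the process, this article uses the public VOC dataset. VOC2012 is an upgraded version of VOC2007 and contains 11,530 images in total (see the detailed introduction to the VOC dataset).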

It covers 20 classes:

person
bird, cat, cow, dog, horse, sheep
aeroplane, bicycle, boat, bus, car, motorbike, train
bottle, chair, dining table, potted plant, sofa, tv/monitor

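Every annotated image has the labels needed for detection, but only some have segmentation labels.
VOC2007 contains 9,963 annotated images, split into train/val/test, with 24,640 annotated objects in total.
The labels for the VOC2007 test data have been released; those for later years have not (images only, no labels).
For the detection task, the VOC2012 trainval/test sets contain all corresponding images from 2008 to 2011. trainval has 11,540 images with 27,450 objects.
For the segmentation task, the VOC2012 trainval set contains all corresponding images from 2007 to 2011, while test contains only 2008 to 2011. trainval has 2,913 images with 6,929 objects.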

The train/val data has 11,530 images containing 27,450 ROI annotated objects and 6,929 segmentations. (VOC2012 homepage, VOC2012 dataset download link)

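Create a new folder named voc under the object_detection folder and extract the downloaded VOC2012 archive into it. The directory structure is as follows (the extracted data sits under voc/VOCdevkit/VOC2012/, matching the paths used by the commands in this step):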

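As mentioned before, TensorFlow stores its data in .record (TFRecord) format, so the first step is to convert the original JPEG images in VOC into .record files. This is done with object_detection/dataset_tools/create_pascal_tf_record.py. The commands are as follows (they assume your working directory is the object_detection folder, though you can of course change the output path; I use Anaconda, so I opened the Anaconda Prompt and changed into the object_detection folder):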

python create_pascal_tf_record.py --data_dir voc/VOCdevkit/ --year=VOC2012 --set=train --output_path=voc/pascal_train.record 

python create_pascal_tf_record.py --data_dir voc/VOCdevkit/ --year=VOC2012 --set=val --output_path=voc/pascal_val.record

It is recommended to copy create_pascal_tf_record.py from dataset_tools into the object_detection folder and add the following to that Python file:

##add begin
import sys
sys.path.append('E:/DL/tensorflow-models-master/models-master/research')
##add end

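The path passed to sys.path.append is the directory that contains the object_detection folder (in other words, the location of your object_detection folder is added to the module search path). With this in place, running the commands above generates the training and validation datasets in the voc folder.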
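To make the conversion step more concrete, the sketch below shows roughly what happens to each annotation: the XML file is parsed and every bounding box is normalized by the image width and height before it is written to the record. This is a simplified, standard-library illustration and not the code of create_pascal_tf_record.py; the function name parse_voc_annotation and the sample annotation (file name and coordinates) are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation in the PASCAL VOC XML layout (values are invented).
SAMPLE_XML = """<annotation>
  <filename>example.jpg</filename>
  <size><width>500</width><height>400</height><depth>3</depth></size>
  <object>
    <name>dog</name>
    <bndbox><xmin>50</xmin><ymin>100</ymin><xmax>250</xmax><ymax>300</ymax></bndbox>
  </object>
</annotation>"""


def parse_voc_annotation(xml_text):
    """Return (filename, boxes) with box coordinates scaled to [0, 1]."""
    root = ET.fromstring(xml_text)
    width = float(root.findtext("size/width"))
    height = float(root.findtext("size/height"))
    boxes = []
    for obj in root.iter("object"):
        bndbox = obj.find("bndbox")
        boxes.append({
            "name": obj.findtext("name"),
            # x coordinates are divided by the width, y coordinates by the height.
            "xmin": float(bndbox.findtext("xmin")) / width,
            "ymin": float(bndbox.findtext("ymin")) / height,
            "xmax": float(bndbox.findtext("xmax")) / width,
            "ymax": float(bndbox.findtext("ymax")) / height,
        })
    return root.findtext("filename"), boxes


filename, boxes = parse_voc_annotation(SAMPLE_XML)
print(filename, boxes)
```

Running a check like this on a few of your own annotation files before converting can catch malformed XML early.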

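Step 2: Add the label map

This step prepares the mapping between the numeric labels of the VOC dataset and the class names they represent, so that the labels can be resolved later. The API already ships this mapping for VOC as pascal_label_map.pbtxt in the object_detection/data/ folder. VOC has 20 classes.

Its contents are: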

item {
  id: 1
  name: 'aeroplane'
}

item {
  id: 2
  name: 'bicycle'
}

item {
  id: 3
  name: 'bird'
}

item {
  id: 4
  name: 'boat'
}

item {
  id: 5
  name: 'bottle'
}

item {
  id: 6
  name: 'bus'
}

item {
  id: 7
  name: 'car'
}

item {
  id: 8
  name: 'cat'
}

item {
  id: 9
  name: 'chair'
}

item {
  id: 10
  name: 'cow'
}

item {
  id: 11
  name: 'diningtable'
}

item {
  id: 12
  name: 'dog'
}

item {
  id: 13
  name: 'horse'
}

item {
  id: 14
  name: 'motorbike'
}

item {
  id: 15
  name: 'person'
}

item {
  id: 16
  name: 'pottedplant'
}

item {
  id: 17
  name: 'sheep'
}

item {
  id: 18
  name: 'sofa'
}

item {
  id: 19
  name: 'train'
}

item {
  id: 20
  name: 'tvmonitor'
}

Copy pascal_label_map.pbtxt into the voc folder.
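For a quick sanity check of the label map without importing the API, a small regular-expression parser is enough. The sketch below is an assumption-laden shortcut: it only handles the simple layout shown above (id followed by name, one pair per item), and parse_label_map is a hypothetical helper, not part of the Object Detection API (which has its own label map utilities).

```python
import re

# Two items in the same layout as pascal_label_map.pbtxt.
SAMPLE_LABEL_MAP = """
item {
  id: 1
  name: 'aeroplane'
}

item {
  id: 2
  name: 'bicycle'
}
"""

ITEM_PATTERN = re.compile(
    r"item\s*\{\s*id:\s*(\d+)\s*name:\s*['\"]([^'\"]+)['\"]\s*\}"
)


def parse_label_map(pbtxt_text):
    """Return {id: name} for every item block found in the text."""
    return {int(item_id): name for item_id, name in ITEM_PATTERN.findall(pbtxt_text)}


label_map = parse_label_map(SAMPLE_LABEL_MAP)
print(label_map)
```

For the full VOC file the resulting dictionary should contain 20 entries, which matches the num_classes value used later in the config.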

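Step 3: Download the pre-trained model

Next, choose a suitable model to train on the downloaded dataset. This article uses the Faster R-CNN + Inception_ResNet_v2 model pre-trained on COCO (download link: pre-trained model). You can also choose a different model as the starting point; the others are available from the TensorFlow model zoo.

Place the downloaded Faster R-CNN + Inception_ResNet_v2 model in the object_detection/voc/pretrained/ folder. The download is an archive named "faster_rcnn_inception_resnet_v2_atrous_coco_11_06_2017"; extracting it yields the following five files:

The model zoo describes the files as follows: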

a graph proto (graph.pbtxt)
a checkpoint (model.ckpt.data-00000-of-00001, model.ckpt.index, model.ckpt.meta)
a frozen graph proto with weights baked into the graph as constants (frozen_inference_graph.pb) to be used for out of the box inference (try this out in the Jupyter notebook!)
a config file (pipeline.config) which was used to generate the graph. These directly correspond to a config file in the samples/configs directory but often with a modified score threshold. In the case of the heavier Faster R-CNN models, we also provide a version of the model that uses a highly reduced number of proposals for speed.

The purpose of downloading these files is to fine-tune from the official pre-trained model rather than train completely from scratch.
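One detail that is easy to trip over: a checkpoint is referred to by its prefix (for example voc/pretrained/model.ckpt), not by a single file. The sketch below checks that the three checkpoint files from the list above exist for a given prefix. It is a convenience check only; missing_checkpoint_files is a hypothetical helper name, and the single-shard suffix data-00000-of-00001 is taken from the file list above.

```python
from pathlib import Path


def missing_checkpoint_files(prefix):
    """Return the expected checkpoint files for `prefix` that do not exist."""
    expected = [
        prefix + ".index",
        prefix + ".meta",
        prefix + ".data-00000-of-00001",  # assumes a single data shard
    ]
    return [path for path in expected if not Path(path).is_file()]


if __name__ == "__main__":
    # Prefix used for fine-tuning in this article.
    missing = missing_checkpoint_files("voc/pretrained/model.ckpt")
    if missing:
        print("Missing checkpoint files:", missing)
    else:
        print("Checkpoint looks complete.")
```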

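Step 4: Create the configuration file

Under object_detection/samples/configs/, find the configuration file faster_rcnn_inception_resnet_v2_atrous_pets.config. Copy it, rename the copy to voc.config, make a few changes, and place it in the voc folder (as shown in the directory structure figure in this article). The complete modified voc.config is as follows: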

# Faster R-CNN with Inception Resnet v2, Atrous version;
# Configured for Oxford-IIIT Pets Dataset.
# Users should configure the fine_tune_checkpoint field in the train config as
# well as the label_map_path and input_path fields in the train_input_reader and
# eval_input_reader. Search for "PATH_TO_BE_CONFIGURED" to find the fields that
# should be configured.

model {
  faster_rcnn {
    num_classes: 20                        #change from 37 to 20 by csq
    image_resizer {
      keep_aspect_ratio_resizer {
        min_dimension: 600
        max_dimension: 1024
      }
    }
    feature_extractor {
      type: 'faster_rcnn_inception_resnet_v2'
      first_stage_features_stride: 8
    }
    first_stage_anchor_generator {
      grid_anchor_generator {
        scales: [0.25, 0.5, 1.0, 2.0]
        aspect_ratios: [0.5, 1.0, 2.0]
        height_stride: 8
        width_stride: 8
      }
    }
    first_stage_atrous_rate: 2
    first_stage_box_predictor_conv_hyperparams {
      op: CONV
      regularizer {
        l2_regularizer {
          weight: 0.0
        }
      }
      initializer {
        truncated_normal_initializer {
          stddev: 0.01
        }
      }
    }
    first_stage_nms_score_threshold: 0.0
    first_stage_nms_iou_threshold: 0.7
    first_stage_max_proposals: 300
    first_stage_localization_loss_weight: 2.0
    first_stage_objectness_loss_weight: 1.0
    initial_crop_size: 17
    maxpool_kernel_size: 1
    maxpool_stride: 1
    second_stage_box_predictor {
      mask_rcnn_box_predictor {
        use_dropout: false
        dropout_keep_probability: 1.0
        fc_hyperparams {
          op: FC
          regularizer {
            l2_regularizer {
              weight: 0.0
            }
          }
          initializer {
            variance_scaling_initializer {
              factor: 1.0
              uniform: true
              mode: FAN_AVG
            }
          }
        }
      }
    }
    second_stage_post_processing {
      batch_non_max_suppression {
        score_threshold: 0.0
        iou_threshold: 0.6
        max_detections_per_class: 100
        max_total_detections: 100
      }
      score_converter: SOFTMAX
    }
    second_stage_localization_loss_weight: 2.0
    second_stage_classification_loss_weight: 1.0
  }
}

train_config: {
  batch_size: 1
  optimizer {
    momentum_optimizer: {
      learning_rate: {
        manual_step_learning_rate {
          initial_learning_rate: 0.0003
          schedule {
            step: 0
            learning_rate: .0003
          }
          schedule {
            step: 900000
            learning_rate: .00003
          }
          schedule {
            step: 1200000
            learning_rate: .000003
          }
        }
      }
      momentum_optimizer_value: 0.9
    }
    use_moving_average: false
  }
  gradient_clipping_by_norm: 10.0
  fine_tune_checkpoint: "voc/pretrained/model.ckpt"   #change your own path for pre-trained model by csq
  from_detection_checkpoint: true
  # Note: The below line limits the training process to 200K steps, which we
  # empirically found to be sufficient enough to train the pets dataset. This
  # effectively bypasses the learning rate schedule (the learning rate will
  # never decay). Remove the below line to train indefinitely.
  num_steps: 200000
  data_augmentation_options {
    random_horizontal_flip {
    }
  }
}

train_input_reader: {
  tf_record_input_reader {
    input_path: "voc/pascal_train.record"   #change your own path for train data by csq
  }
  label_map_path: "voc/pascal_label_map.pbtxt" #change your own path for label map by csq
}

eval_config: {
  num_examples: 5823                            #change from 1101 to 5823 by csq
  # Note: The below line limits the evaluation process to 10 evaluations.
  # Remove the below line to evaluate indefinitely.
  max_evals: 10
}

eval_input_reader: {
  tf_record_input_reader {
    input_path: "voc/pascal_val.record"        #change your own path for val data by csq
  }
  label_map_path: "voc/pascal_label_map.pbtxt"  #change your own path for label map by csq
  shuffle: false
  num_readers: 1
}
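The comment in train_config says that num_steps: 200000 effectively bypasses the learning rate schedule. The sketch below reproduces that reasoning with the values from the config. It is a plain-Python illustration of a step-based schedule, not the API's implementation; the function name and the assumption that a new rate takes effect once step >= boundary are mine.

```python
def manual_step_learning_rate(step, initial_rate, schedule):
    """Return the learning rate in effect at `step`.

    `schedule` is a list of (boundary_step, rate) pairs sorted by boundary_step.
    Assumption: a new rate takes effect once step >= boundary_step.
    """
    rate = initial_rate
    for boundary_step, new_rate in schedule:
        if step >= boundary_step:
            rate = new_rate
    return rate


# Values copied from the train_config block above.
SCHEDULE = [(0, 0.0003), (900000, 0.00003), (1200000, 0.000003)]
NUM_STEPS = 200000

# Sample the rate every 10,000 steps over a 200K-step run.
rates_seen = {
    manual_step_learning_rate(step, 0.0003, SCHEDULE)
    for step in range(0, NUM_STEPS, 10000)
}
print(rates_seen)
```

Because the first decay boundary (900000) lies far beyond num_steps, only the initial rate is ever used. If you want the decay to matter for a shorter run, the boundaries need to be moved below num_steps.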

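Seven fields were changed in total: 1. the number of classes (num_classes), since VOC has 20; 2. the number of validation images (num_examples), 5823 here (adjust this to your own data); 3. the paths, namely fine_tune_checkpoint for the pre-trained model, the two input_path fields for the training and validation records, and the two label_map_path fields for the label map.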
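If you train on a different split or on your own data, num_examples has to be recomputed. A minimal sketch, assuming the validation image IDs are listed one per line in a text file such as voc/VOCdevkit/VOC2012/ImageSets/Main/val.txt (that path is an assumption based on the standard VOC layout; count_examples is a hypothetical helper):

```python
def count_examples(image_set_path):
    """Count the non-empty lines (one image ID per line) in an image set file."""
    with open(image_set_path, "r") as image_set_file:
        return sum(1 for line in image_set_file if line.strip())


if __name__ == "__main__":
    # Assumed location of the validation list for VOC2012.
    val_list = "voc/VOCdevkit/VOC2012/ImageSets/Main/val.txt"
    print("num_examples:", count_examples(val_list))
```

The printed value is what goes into eval_config.num_examples.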

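Step 5: Start training

With everything above prepared, training can begin. Change into the object_detection folder and check whether train.py exists there. If it does not (this depends on the version you have), look in the legacy folder and copy train.py from there into the object_detection directory. Run the following command to start training: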

python train.py --train_dir voc/train_dir/ --pipeline_config_path voc/voc.config

The training logs and models are saved in train_dir. Training can be monitored with TensorBoard: tensorboard --logdir voc/train_dir/

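*** If training reports insufficient memory or GPU memory, reduce the minimum and maximum dimensions in the config file (min_dimension and max_dimension) proportionally from the current 600 and 1024 to smaller values (for example 300 and 512).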
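To see why halving both dimensions helps, the sketch below estimates the size an image is resized to under the keep_aspect_ratio_resizer settings. It rests on an assumption about the resizer's behaviour: the short side is scaled to min_dimension unless that would push the long side past max_dimension, in which case the long side is scaled to max_dimension. The API's exact rounding may differ, and keep_aspect_ratio_size is a hypothetical helper.

```python
def keep_aspect_ratio_size(height, width, min_dimension, max_dimension):
    """Estimate the (height, width) after an aspect-preserving resize."""
    scale = min_dimension / min(height, width)
    if max(height, width) * scale > max_dimension:
        # The long side would overflow, so it becomes the limiting side.
        scale = max_dimension / max(height, width)
    return round(height * scale), round(width * scale)


# A 375 x 500 image, a common size in VOC.
full = keep_aspect_ratio_size(375, 500, 600, 1024)
half = keep_aspect_ratio_size(375, 500, 300, 512)
print(full, half)
print("pixel ratio:", (full[0] * full[1]) / (half[0] * half[1]))
```

Under these assumptions, halving both limits cuts the number of pixels per image to a quarter, which is where the memory saving comes from. Detection accuracy on small objects may drop as a result.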

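*** If the training command fails with "AttributeError: 'module' object has no attribute 'get_or_create_global_step'", the cause is a TensorFlow version mismatch. Newer TensorFlow 1.x releases provide tf.train.get_or_create_global_step(), whereas older releases only offer tf.contrib.framework.get_or_create_global_step(). Because I was running an older TensorFlow, I edited the location named in the error: the traceback pointed to optimizer_builder.py, which had no get_or_create_global_step() available under tf.train, so I replaced the call in that file with tf.contrib.framework.get_or_create_global_step().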

*** If you get a parsing error, your config file is probably malformed.

The figure below shows the output during a normal training run:

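If the configured number of training steps is too large, reduce it as appropriate. You can also interrupt training midway, inspect the training state in TensorBoard, and then decide whether to stop or continue. I trained for about 55,000 steps, which produced the following in voc/train_dir: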

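The entries underlined in red are the model checkpoints, which are saved at regular intervals; only the five most recent sets of weights are kept. The checkpoint file records the five most recent checkpoints, the files whose names begin with events hold the training records that TensorBoard displays, pipeline.config is the same as the configuration file from Step 4, and graph.pbtxt holds the graph definition.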

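Step 6: Export the trained model

The main purpose of this step is to export the checkpoint files in train_dir so that they can be used for object detection on single images. The object_detection folder contains a script, export_inference_graph.py, that exports the trained model. Change into the object_detection folder and run the following command: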

python export_inference_graph.py \
--input_type image_tensor \
--pipeline_config_path voc/voc.config \
--trained_checkpoint_prefix voc/train_dir/model.ckpt-54747 \
--output_directory voc/export/

Here 54747 is the training step at which your last model checkpoint was saved; you can look it up in the checkpoint file.
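Looking up the step by hand is easy to get wrong, so here is a small sketch that reads it from the contents of the checkpoint file. It assumes the usual text layout with a model_checkpoint_path line followed by all_model_checkpoint_paths lines; the sample text (apart from the step 54747 used in this article) and the helper name latest_checkpoint are made up for illustration.

```python
import re

# Hypothetical contents of voc/train_dir/checkpoint.
SAMPLE_CHECKPOINT_FILE = '''model_checkpoint_path: "model.ckpt-54747"
all_model_checkpoint_paths: "model.ckpt-51000"
all_model_checkpoint_paths: "model.ckpt-54747"
'''


def latest_checkpoint(checkpoint_text):
    """Return (prefix, step) for the most recent checkpoint, or None."""
    match = re.search(
        r'^model_checkpoint_path:\s*"(.*?)"', checkpoint_text, re.MULTILINE
    )
    if match is None:
        return None
    prefix = match.group(1)
    step_match = re.search(r"-(\d+)$", prefix)
    step = int(step_match.group(1)) if step_match else None
    return prefix, step


print(latest_checkpoint(SAMPLE_CHECKPOINT_FILE))
```

The returned prefix, joined with the train_dir path, is the value to pass as --trained_checkpoint_prefix.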

The exported .pb file is written to the voc/export/ folder.

*** If running the command raises "ImportError: cannot import name rewriter_config_pb2", the likely cause is that your TensorFlow version is too old and needs upgrading: run pip install --upgrade tensorflow==1.8 (pin a version, or leave it unpinned). Alternatively, in the API's exporter.py, find rewrite_options = rewriter_config_pb2.RewriterConfig(layout_optimizer=rewriter_config_pb2.RewriterConfig.ON) and replace layout_optimizer with optimize_tensor_layout (reference: "tensorflow/core/protobuf/rewriter_config_pb2.py has updated,but Object Detection API's file not").

*** If you encounter "TypeError: non_max_suppression() got an unexpected keyword argument 'score_threshold'", upgrade TensorFlow; it appears that only version 1.9 or later supports this argument.
