Learning Caffe from Scratch, Part 2: The MNIST Handwritten Digit Recognition Example

Preface

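The previous article covered installing Caffe on Ubuntu 16.04 in a virtual machine. This article works through the MNIST handwritten digit recognition example that ships with Caffe, as a hands-on way to learn how Caffe is used in practice.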

Downloading the data

First, the data/mnist directory contains a script, get_mnist.sh, whose source is as follows:

#!/usr/bin/env sh
# This scripts downloads the mnist data and unzips it.

DIR="$( cd "$(dirname "$0")" ; pwd -P )"
cd "$DIR"

echo "Downloading..."

for fname in train-images-idx3-ubyte train-labels-idx1-ubyte t10k-images-idx3-ubyte t10k-labels-idx1-ubyte
do
    if [ ! -e $fname ]; then
        wget --no-check-certificate http://yann.lecun.com/exdb/mnist/${fname}.gz
        gunzip ${fname}.gz
    fi
done

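As you can see, the script downloads and decompresses four files:

    train-images-idx3-ubyte  //image file for training
    train-labels-idx1-ubyte  //label file for training
    t10k-images-idx3-ubyte   //image file for testing
    t10k-labels-idx1-ubyte   //label file for testing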

Run the following commands to download the data:

cd ~/caffe/data/mnist
./get_mnist.sh

When the script finishes, the four decompressed files are in data/mnist.
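To make the four files less opaque, here is a minimal sketch of how the header of an IDX file can be read. It is not part of Caffe; the function name read_idx_header is my own, and the magic numbers 2051 (images) and 2049 (labels) come from the published MNIST file format rather than from this article. The demo uses a synthetic header instead of a real download.

```python
import struct

IMAGE_MAGIC = 2051  # 0x00000803: unsigned-byte data with 3 dimensions (idx3)
LABEL_MAGIC = 2049  # 0x00000801: unsigned-byte data with 1 dimension (idx1)


def read_idx_header(raw: bytes) -> dict:
    """Parse the big-endian header of an MNIST IDX file held in memory.

    Hypothetical helper for illustration only.
    """
    (magic,) = struct.unpack(">i", raw[:4])
    if magic == IMAGE_MAGIC:
        count, rows, cols = struct.unpack(">iii", raw[4:16])
        return {"kind": "images", "count": count, "rows": rows, "cols": cols}
    if magic == LABEL_MAGIC:
        (count,) = struct.unpack(">i", raw[4:8])
        return {"kind": "labels", "count": count}
    raise ValueError(f"unexpected magic number {magic}")


# Synthetic 16-byte header shaped like the one in t10k-images-idx3-ubyte.
demo_images = struct.pack(">iiii", IMAGE_MAGIC, 10000, 28, 28)
# Synthetic 8-byte header shaped like the one in t10k-labels-idx1-ubyte.
demo_labels = struct.pack(">ii", LABEL_MAGIC, 10000)

print(read_idx_header(demo_images))
print(read_idx_header(demo_labels))
```

To inspect a real file, pass the first 16 bytes of, for example, data/mnist/train-images-idx3-ubyte opened in binary mode.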

Processing the data

Under examples/mnist/ in the Caffe root directory there is a create_mnist.sh script that converts the data:

#!/usr/bin/env sh
# This script converts the mnist data into lmdb/leveldb format,
# depending on the value assigned to $BACKEND.
set -e

EXAMPLE=examples/mnist
DATA=data/mnist
BUILD=build/examples/mnist

BACKEND="lmdb"

echo "Creating ${BACKEND}..."

rm -rf $EXAMPLE/mnist_train_${BACKEND}
rm -rf $EXAMPLE/mnist_test_${BACKEND}

$BUILD/convert_mnist_data.bin $DATA/train-images-idx3-ubyte \
  $DATA/train-labels-idx1-ubyte $EXAMPLE/mnist_train_${BACKEND} --backend=${BACKEND}
$BUILD/convert_mnist_data.bin $DATA/t10k-images-idx3-ubyte \
  $DATA/t10k-labels-idx1-ubyte $EXAMPLE/mnist_test_${BACKEND} --backend=${BACKEND}

echo "Done."

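The script takes the image and label files and generates two LMDB databases, one for training and one for testing. Because it uses relative paths, run it from the Caffe root directory (for example after cd ~/caffe):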

./examples/mnist/create_mnist.sh

Training

Open the training script to inspect it:

vim ./examples/mnist/train_lenet.sh

Its source is as follows:

#!/usr/bin/env sh
set -e

./build/tools/caffe train --solver=examples/mnist/lenet_solver.prototxt $@

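set -e tells the shell to exit immediately if any command returns a non-zero (failure) status. This keeps an early error from snowballing into a fatal one further down the script, when it should have been handled where it occurred. For better readability you can write set -o errexit, which has the same effect as set -e.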
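The effect of set -e is easy to check. The sketch below is my own illustration, not part of the Caffe scripts: it runs the same two-command snippet through sh with and without set -e and compares what gets printed. It assumes a POSIX sh is on the PATH, and the helper name run_sh is hypothetical.

```python
import subprocess


def run_sh(script: str) -> subprocess.CompletedProcess:
    """Run a snippet with sh and capture its output and exit status."""
    return subprocess.run(["sh", "-c", script], capture_output=True, text=True)


# Without set -e the failing command is ignored and the script carries on.
without_e = run_sh("false; echo reached")
# With set -e the shell exits at the failing command, so echo never runs.
with_e = run_sh("set -e; false; echo reached")

print(repr(without_e.stdout), without_e.returncode)
print(repr(with_e.stdout), with_e.returncode)
```

In train_lenet.sh this means that if the caffe binary fails, the script itself reports the failure through its exit status.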

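The special parameters $#, $0, $1, $@, $*, $$ and $? in Linux shell scripts mean the following:

    $#: the number of arguments passed to the script
    $0: the name of the script itself
    $1: the first argument passed to the script
    $2: the second argument passed to the script
    $@: all arguments passed to the script
    $*: all arguments passed to the script
    $$: the process ID of the running script
    $?: the exit status of the previous command; 0 means success and any non-zero value means failure

Unquoted, $@ and $* behave the same. Inside double quotes they differ: "$*" treats all the arguments as a single string, while "$@" keeps each argument as a separate word. For example, with the three arguments a b c, "$*" is the single string "a b c".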
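The difference between the two expansions shows up once an argument contains a space. The following sketch is my own illustration: it asks sh to print each word it sees in square brackets. It assumes a POSIX sh with printf is available; the helper name words_seen and the placeholder script name demo are hypothetical.

```python
import subprocess


def words_seen(expansion: str, args: list) -> list:
    """Return the words a POSIX shell produces for the given expansion.

    printf reuses its format for every word it receives, so each word
    ends up on its own line wrapped in brackets.
    """
    script = r'printf "[%s]\n" ' + expansion
    done = subprocess.run(
        ["sh", "-c", script, "demo", *args],  # "demo" becomes $0
        capture_output=True,
        text=True,
        check=True,
    )
    return done.stdout.splitlines()


args = ["a b", "c"]
quoted_at = words_seen('"$@"', args)    # each argument stays one word
quoted_star = words_seen('"$*"', args)  # all arguments joined into one word
unquoted_at = words_seen("$@", args)    # unquoted: split again on spaces

print(quoted_at)
print(quoted_star)
print(unquoted_at)
```

train_lenet.sh passes an unquoted $@ to caffe, which is fine for simple flags but would split an argument that contains spaces.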

Next, look at the lenet_solver.prototxt file that the script references. The meaning of each parameter is given in the comments:

# The train/test net protocol buffer definition
net: "examples/mnist/lenet_train_test.prototxt"   #network definition file
# test_iter specifies how many forward passes the test should carry out.
# In the case of MNIST, we have test batch size 100 and 100 test iterations,
# covering the full 10,000 testing images.
test_iter: 100   #number of test batches per test pass; to cover the whole test set, test_iter * batch_size must equal the number of test samples (e.g. 5000 samples need test_iter * batch_size = 5000)
# Carry out testing every 500 training iterations.
test_interval: 500  #test interval: run a test pass after every 500 training iterations
# The base learning rate, momentum and the weight decay of the network.
base_lr: 0.01  #base learning rate; each layer's effective learning rate is lr_mult * base_lr
momentum: 0.9   #momentum; 0.9 is a common choice
weight_decay: 0.0005
# The learning rate policy
lr_policy: "inv"  #learning rate policy
gamma: 0.0001   #parameter of the inv policy
power: 0.75		#parameter of the inv policy
# Display every 100 iterations
display: 100   #print progress every 100 training iterations; 0 disables the display
# The maximum number of iterations
max_iter: 10000  #maximum number of iterations; training stops after 10000
# snapshot intermediate results
snapshot: 5000
snapshot_prefix: "examples/mnist/lenet"
# solver mode: CPU or GPU
solver_mode: CPU

Because training runs inside a virtual machine, the last line has been changed from GPU to CPU.
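To get a feel for the solver settings, here is a small sketch of the learning-rate schedule. It assumes the "inv" policy computes base_lr * (1 + gamma * iter)^(-power), which is how Caffe documents it; the function name inv_learning_rate is my own, and the sketch only reproduces the arithmetic, not the solver itself.

```python
def inv_learning_rate(iteration, base_lr=0.01, gamma=0.0001, power=0.75):
    """Learning rate under the assumed 'inv' policy formula:
    base_lr * (1 + gamma * iteration) ** (-power).
    Defaults are the values from lenet_solver.prototxt above.
    """
    return base_lr * (1.0 + gamma * iteration) ** (-power)


# The rate starts at base_lr and decays smoothly towards max_iter.
for iteration in (0, 5000, 10000):
    print(iteration, round(inv_learning_rate(iteration), 6))

# Sanity check on the test settings: 100 batches of 100 images each.
test_iter, test_batch_size = 100, 100
images_per_test_pass = test_iter * test_batch_size
print(images_per_test_pass)
```

The last value confirms the comment in the solver file: one test pass covers all 10,000 test images.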
Next, look at lenet_train_test.prototxt, the network file that lenet_solver.prototxt references. Its parameters are annotated in detail in the comments:

name: "LeNet" 
layer {
  name: "mnist"
  type: "Data"
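  top: "data"   #output blob
  top: "label"  #output blob; a data layer usually has two outputs, data and label
  include {
    phase: TRAIN #this layer is used only in the training phase; without an include block it would be used for both training and testing
  }
  transform_param {
    scale: 0.00390625 #1/256; normalizes pixel values from [0, 255] into [0, 1)
  }
  data_param {
    source: "examples/mnist/mnist_train_lmdb"  #data source
    batch_size: 64 #number of samples per batch; usually a power of two such as 64 or 128
    backend: LMDB #database backend that stores the data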
  }
}
layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "examples/mnist/mnist_test_lmdb"
    batch_size: 100
    backend: LMDB
  }
}


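# First param block (lr_mult: 1): learning-rate multiplier for the weights
# Second param block (lr_mult: 2): learning-rate multiplier for the biases
# Input:  n*c0*w0*h0
# Output: n*c1*w1*h1
# where c1 is num_output, i.e. the number of feature maps produced
# w1=(w0+2*pad-kernel_size)/stride+1
# h1=(h0+2*pad-kernel_size)/stride+1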
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
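    lr_mult: 1	#learning-rate multiplier; the effective learning rate is this value times base_lr from the solver.prototxt configuration file
  }
  param {
    lr_mult: 2	#with two lr_mult entries, the first applies to the weights and the second to the biases; the bias learning rate is commonly twice the weight learning rate
  }
  convolution_param {
    num_output: 20  #number of convolution kernels (filters)
    kernel_size: 5	#kernel size, h x w x d; h x w is 5x5, and the depth d is implicit and equals the depth of the previous layer. For example, if the previous layer is a data layer with 3 RGB channels, then d=3; if it is a convolution layer with 20 feature maps, then d=20
    stride: 1		#kernel stride, default 1
    weight_filler {
      type: "xavier" #weight initialization; the default is "constant" with all values 0. "xavier" (the Xavier algorithm) is a common choice, and "gaussian" is also available
    }
    bias_filler {
      type: "constant" #bias initialization; usually "constant" with all values 0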
    }
  }
}


#A pooling layer slides a window over its input as a convolution layer does, so its output size follows the same formula
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
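    pool: MAX #pooling method, default MAX (within each window, e.g. a 2x2 region, only the largest of the 4 values is kept). Available methods include MAX and AVE (average)
    kernel_size: 2 #size of the pooling window
    stride: 2 #pooling stride, default 1; usually set to 2 so that the 2x2 windows do not overlap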
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 50
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}


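#Fully connected layers re-extract features from the preceding convolution and pooling outputs,
#condensing them into a vector that is then used for tasks such as classification or regression.
#The output is a plain vector; the parameters are configured as in the convolution layer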
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "pool2"
  top: "ip1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 500
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}

layer {
  name: "relu1"
  type: "ReLU"
  bottom: "ip1"
  top: "ip1"
}
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 10
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}



#Report accuracy during the test phase
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "ip2"
  bottom: "label"  #inputs: the ip2 class scores and the label
  top: "accuracy"  #output: the accuracy
  include {
    phase: TEST
  }
}

#SoftmaxWithLoss outputs the loss value
#Softmax outputs likelihood values (per-class probabilities)
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip2"
  bottom: "label"
  top: "loss"
}
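The output-size formula in the comments can be applied layer by layer to see how the blobs shrink on the way to ip1. The sketch below is my own arithmetic, not Caffe code. It assumes MNIST images are 28x28 with a single channel (a fact about the dataset that the listing does not state) and zero padding, since the file sets no pad value. The names out_size and LAYERS are hypothetical.

```python
def out_size(size, kernel, stride=1, pad=0):
    """w1 = (w0 + 2*pad - kernel_size) / stride + 1, as in the comments above."""
    return (size + 2 * pad - kernel) // stride + 1


# (layer name, kernel_size, stride, num_output or None for pooling layers),
# copied from lenet_train_test.prototxt.
LAYERS = [
    ("conv1", 5, 1, 20),
    ("pool1", 2, 2, None),
    ("conv2", 5, 1, 50),
    ("pool2", 2, 2, None),
]

channels, size = 1, 28  # assumed input: one 28x28 grayscale image
shapes = {}
for name, kernel, stride, num_output in LAYERS:
    size = out_size(size, kernel, stride)
    if num_output is not None:  # convolution changes the channel count
        channels = num_output
    shapes[name] = (channels, size, size)
    print(name, shapes[name])

# ip1 sees pool2 flattened into a single vector per image.
ip1_inputs = channels * size * size
print("ip1 inputs:", ip1_inputs)
```

Under these assumptions ip1 maps each flattened pool2 output to 500 values, and ip2 maps those to the 10 digit classes.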

Start training:

./examples/mnist/train_lenet.sh

When training finishes, the weights file lenet_iter_10000.caffemodel is written to examples/mnist/.

Testing

Run:

./build/tools/caffe.bin test \
-model examples/mnist/lenet_train_test.prototxt \
-weights examples/mnist/lenet_iter_10000.caffemodel \
-iterations 100

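The command prints the accuracy and loss for each batch, followed by their averages over the 100 iterations.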

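Drawing the network diagram

The MNIST example has now been trained and tested, and the test results are visible. Its network structure is fairly simple, but a really complex network definition file quickly becomes a headache to read.
Is there a more convenient way to inspect a network file?
Of course there is: draw it as a diagram.
First install the dependencies: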

apt-get install graphviz

pip install pydot

Run the following command to draw the network diagram:

python ./python/draw_net.py  ./examples/mnist/lenet_train_test.prototxt ./caffe_png/mnist.png --rankdir=LR

eog ./caffe_png/mnist.png

Viewed as a diagram like this, the network is much easier to read.
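If graphviz is not available, a rough text summary of the same graph can be pulled out of the prototxt. The sketch below is my own and is not how draw_net.py works (that script parses the file with protobuf). It relies on regular expressions and assumes each layer block lists name and type before any nested block, as the LeNet file does. The names split_layers, layer_edges and SAMPLE are hypothetical.

```python
import re

SAMPLE = '''
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  convolution_param {
    num_output: 20  # comments are stripped before parsing
    weight_filler { type: "xavier" }
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
}
'''


def split_layers(text):
    """Return the body of every top-level layer { ... } block."""
    text = re.sub(r"#.*", "", text)  # drop comments
    blocks = []
    for match in re.finditer(r"\blayer\s*\{", text):
        depth, i = 1, match.end()
        while i < len(text) and depth > 0:  # walk to the matching brace
            if text[i] == "{":
                depth += 1
            elif text[i] == "}":
                depth -= 1
            i += 1
        blocks.append(text[match.end():i - 1])
    return blocks


def layer_edges(text):
    """List (name, type, bottoms, tops) for each layer, in file order."""
    edges = []
    for block in split_layers(text):
        name = re.search(r'name:\s*"([^"]+)"', block).group(1)
        kind = re.search(r'type:\s*"([^"]+)"', block).group(1)
        bottoms = re.findall(r'bottom:\s*"([^"]+)"', block)
        tops = re.findall(r'top:\s*"([^"]+)"', block)
        edges.append((name, kind, bottoms, tops))
    return edges


summary = layer_edges(SAMPLE)
for name, kind, bottoms, tops in summary:
    print(f"{bottoms} -> {name} ({kind}) -> {tops}")
```

To summarize the real network, read examples/mnist/lenet_train_test.prototxt into a string and pass it to layer_edges.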
