Ubuntu 14.04 + GTX 1070 + CUDA 8.0 + cuDNN 5.1 + OpenCV 3.0 + caffe-SSD: a complete setup walkthrough

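For installing Ubuntu 14.04 + GTX 1070 + CUDA 8.0 + cuDNN 5.1 + OpenCV 3.0, see my previous two posts, which cover it in detail.

The previous post went on to set up the Caffe framework. This post focuses on the details of installing and configuring caffe-ssd (at this point the machine already has Ubuntu 14.04 + GTX 1070 + CUDA 8.0 + cuDNN 5.1 + OpenCV 3.0 + Caffe).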

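I. Installing SSD (most of the dependencies were already installed while setting up Caffe)

Create a new folder caffe-ssd under /home (to keep it separate from caffe)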

cd caffe-ssd
git clone https://github.com/weiliu89/caffe.git
cd caffe
git checkout ssd

II. Configuring SSD

cd /home/**(your server's name)/caffe
cp Makefile.config.example Makefile.config
Open Makefile.config and make the following changes:

Uncomment USE_CUDNN := 1

Uncomment OPENCV_VERSION := 3

Uncomment WITH_PYTHON_LAYER := 1

Change
INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include
to
INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include /usr/include/hdf5/serial/
Then edit the Makefile, changing
LIBRARIES += glog gflags protobuf boost_system boost_filesystem boost_regex m hdf5_hl hdf5

to


LIBRARIES += glog gflags protobuf boost_system boost_filesystem boost_regex m hdf5_hl hdf5 opencv_core opencv_highgui opencv_imgproc opencv_imgcodecs
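
The three uncommenting edits to Makefile.config can also be scripted. The following is a minimal sketch rather than part of the original procedure: the helper name `uncomment_flags` is hypothetical, and it assumes the flags appear as `# USE_CUDNN := 1`-style lines, as they do in Makefile.config.example.

```python
import re


def uncomment_flags(text, flags):
    """Strip the leading '#' from lines of the form '# FLAG := value'."""
    names = "|".join(re.escape(flag) for flag in flags)
    pattern = re.compile(r"^#[ \t]*((?:%s)[ \t]*:=)" % names, re.MULTILINE)
    return pattern.sub(r"\1", text)


# Demonstration on an in-memory sample instead of the real file.
sample = (
    "# USE_CUDNN := 1\n"
    "# OPENCV_VERSION := 3\n"
    "# WITH_PYTHON_LAYER := 1\n"
    "# CPU_ONLY := 1\n"
)
print(uncomment_flags(sample, ["USE_CUDNN", "OPENCV_VERSION", "WITH_PYTHON_LAYER"]))
```

To apply it for real, read Makefile.config into a string, pass it through the function, and write the result back. Keeping a backup copy of the file first is advisable.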

Then compile.

I first tried the method below, but it did not work at first, so I gave it up:

cd /home//caffe
mkdir build
cd build
cmake ..
make all -j16
make install
make runtest
make pycaffe

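During cmake .. I found that it was still picking up OpenCV 2.4.8. Adding the following line to CMakeLists.txt in caffe-ssd/caffe

set(OpenCV_DIR /home/shan/opencv-3.0.0/build)
makes it find version 3.0. After that change, I continued with the method below:

Build with the commands given in README.md: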

make -j8
make py
make test -j8
make runtest -j8

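During make -j8 the following errors came up:

1. json_parser_read.hpp:257:264: error: ‘type name’ declared as function returning an array escape
Fix: open the header with sudo gedit /usr/include/boost/property_tree/detail/json_parser_read.hpp and comment out the escape code block at line 257.

2. cannot find -lopencv_videoio

     cannot find -lopenblas

Errors like these mean the linker cannot find files such as libopencv_videoio.so and libopenblas.so.

The fix is to first search the machine for the files (libopencv_videoio and so on).

If they are missing, install the matching development package, for example sudo apt-get install libopencv-videoio-dev (apt package names use hyphens rather than underscores, and whether this package exists depends on the Ubuntu release).

If they are present, copy them to /usr/lib; if that does not help, copy them to /usr/local/lib.

In my case libopencv_videoio.so was present, so I copied it: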

sudo cp /home/shan/opencv-3.0.0/build/lib/libopencv_videoio.so /usr/lib
sudo cp /home/shan/opencv-3.0.0/build/lib/libopencv_videoio.so.3.0 /usr/lib
sudo cp /home/shan/opencv-3.0.0/build/lib/libopencv_videoio.so.3.0.0 /usr/lib
sudo cp /home/shan/opencv-3.0.0/build/lib/libopencv_videoio_pch_dephelp.a /usr/lib

libopenblas, on the other hand, was not on the machine:

sudo apt-get install libopenblas-dev
After that, make -j8 ran without problems.
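
The "search the machine first" step for a `cannot find -l<name>` error can be sketched in Python as follows. This is an illustrative sketch only: the helper name `locate_link_library` and the directory list are assumptions, and the real linker also consults any -L directories set in the Makefile.

```python
import os


def locate_link_library(name, search_dirs):
    """Return the first path to lib<name>.so found in search_dirs, else None."""
    filename = "lib%s.so" % name
    for directory in search_dirs:
        candidate = os.path.join(directory, filename)
        if os.path.exists(candidate):
            return candidate
    return None


# Assumed candidate directories; adjust to the machine at hand.
dirs = ["/usr/lib", "/usr/local/lib", "/usr/lib/x86_64-linux-gnu"]
for lib in ["opencv_videoio", "openblas"]:
    print(lib, "->", locate_link_library(lib, dirs))
```

A result of `None` suggests the library has to be installed or copied into one of the searched directories, as described above.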

make py

make test -j8

make runtest -j8
This fails with: error while loading shared libraries: libcudart.so.8.0: cannot open shared object file: No such file or directory
Fix: first make sure that /etc/profile adds the CUDA 8.0 install path and its library directory to the environment.

sudo gedit /etc/profile  
export PATH=$PATH:/usr/local/cuda-8.0/bin  
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-8.0/lib64  
export LIBRARY_PATH=$LIBRARY_PATH:/usr/local/cuda-8.0/lib64
export PYTHONPATH=~/caffe-ssd/caffe/python:$PYTHONPATH  
export PKG_CONFIG_PATH=/usr/local/cuda-8.0/pkgconfig:/usr/local/lib/pkgconfig:$PKG_CONFIG_PATH
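source /etc/profile

This makes the new settings take effect; then run the tests again.
If the same error still appears, run the following commands to copy the library files to /usr/local/lib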


sudo cp /usr/local/cuda-8.0/lib64/libcudart.so.8.0 /usr/local/lib/libcudart.so.8.0 && sudo ldconfig  
sudo cp /usr/local/cuda-8.0/lib64/libcublas.so.8.0 /usr/local/lib/libcublas.so.8.0 && sudo ldconfig  
sudo cp /usr/local/cuda-8.0/lib64/libcurand.so.8.0 /usr/local/lib/libcurand.so.8.0 && sudo ldconfig  

That still did not work!

sudo cp /usr/local/cuda-8.0/lib64/libcudart.so.8.0 /usr/lib/libcudart.so.8.0 && sudo ldconfig  

Success!
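
To check whether the dynamic loader can now resolve the CUDA runtime without rerunning the whole test suite, a small probe along these lines may help. It is a sketch under assumptions: the helper name `can_load` is hypothetical, and the sonames are the ones that appear in the cp commands above.

```python
import ctypes


def can_load(soname):
    """Return True if the dynamic loader can open the shared library soname."""
    try:
        ctypes.CDLL(soname)
    except OSError:
        return False
    return True


for soname in ["libcudart.so.8.0", "libcublas.so.8.0", "libcurand.so.8.0"]:
    status = "loadable" if can_load(soname) else "NOT found by the loader"
    print(soname, status)
```

The probe inherits LD_LIBRARY_PATH from the shell that starts it, so run it from a terminal where /etc/profile has already been sourced.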

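The next day it stopped working, and I do not know what had changed. I deleted the caffe-ssd checkout, downloaded it again, and this time the make build would not go through, yet the cmake method below compiled fine. Go figure.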

cd /home//caffe
mkdir build
cd build
cmake ..
make all -j16
make install
make runtest
make pycaffe

Next, download the dataset and train.


### Preparation
1. Download [fully convolutional reduced (atrous) VGGNet](https://gist.github.com/weiliu89/2ed6e13bfd5b57cf81d6). By default, we assume the model is stored in `$CAFFE_ROOT/models/VGGNet/`

2. Download VOC2007 and VOC2012 dataset. By default, we assume the data is stored in `$HOME/data/`
  ```Shell
  # Download the data.
  cd $HOME/data
  wget http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar
  wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
  wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
  # Extract the data.
  tar -xvf VOCtrainval_11-May-2012.tar
  tar -xvf VOCtrainval_06-Nov-2007.tar
  tar -xvf VOCtest_06-Nov-2007.tar
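  ```
  Getting the dataset: extracting the archives directly into the data directory is the most reliable route. When I instead copied a previously saved dataset into it, 2011_001232.jpg under VOC2012/JPEGImages could not be copied across.

3. Create the LMDB files.
  ```Shell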
  cd $CAFFE_ROOT
  # Create the trainval.txt, test.txt, and test_name_size.txt in data/VOC0712/
  ./data/VOC0712/create_list.sh
  # You can modify the parameters in create_data.sh if needed.
  # It will create lmdb files for trainval and test with encoded original image:
  #   - $HOME/data/VOCdevkit/VOC0712/lmdb/VOC0712_trainval_lmdb
  #   - $HOME/data/VOCdevkit/VOC0712/lmdb/VOC0712_test_lmdb
  # and make soft links at examples/VOC0712/
  ./data/VOC0712/create_data.sh
  ```
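
Before moving on to training, it can be worth confirming that create_data.sh actually produced the two LMDB directories named in the comments above. The following is a minimal sketch: the helper name `missing_paths` is hypothetical, and the paths assume the default `$HOME/data` location.

```python
import os


def missing_paths(paths):
    """Return the subset of paths that do not exist on disk."""
    return [p for p in paths if not os.path.exists(p)]


# Default LMDB locations, taken from the comments in the shell block above.
lmdb_root = os.path.join(
    os.path.expanduser("~"), "data", "VOCdevkit", "VOC0712", "lmdb"
)
expected = [
    os.path.join(lmdb_root, "VOC0712_trainval_lmdb"),
    os.path.join(lmdb_root, "VOC0712_test_lmdb"),
]

for path in missing_paths(expected):
    print("missing:", path)
```

No output means both directories exist; it does not verify their contents.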
### Training
1. Train your model and evaluate the model on the fly.
  ```Shell
  # It will create model definition files and save snapshot models in:
  #   - $CAFFE_ROOT/models/VGGNet/VOC0712/SSD_300x300/
  # and job file, log file, and the python script in:
  #   - $CAFFE_ROOT/jobs/VGGNet/VOC0712/SSD_300x300/
  # and save temporary evaluation results in:
  #   - $HOME/data/VOCdevkit/results/VOC2007/SSD_300x300/
  # It should reach 72.* mAP at 60k iterations.
  python examples/ssd/ssd_pascal.py  # train the model
  ```

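Open caffe/examples/ssd/ssd_pascal.py and find the line gpus = "0,1,2,3". If your server has a single GPU, delete 1,2,3 so that only "0" remains; with two GPUs, delete 2,3; and so on. If your server has no GPU support, comment out the following lines and the program will train on the CPU. (This is the fix for the error cudaSuccess (10 vs. 0), which indicates an invalid device ordinal.)

    #if num_gpus > 0:
    #  batch_size_per_device = int(math.ceil(float(batch_size) / num_gpus))
    #  iter_size = int(math.ceil(float(accum_batch_size) / (batch_size_per_device * num_gpus)))
    #  solver_mode = P.Solver.GPU
    #  device_id = int(gpulist[0])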

After saving, run in a terminal:
cd  /home/**(your server's name)/caffe
python examples/ssd/ssd_pascal.py

If the error cudaSuccess (2 vs. 0) appears, the GPU has run out of memory. Open caffe/examples/ssd/ssd_pascal.py again, find the line batch_size = 32, and lower the value to 16, 8, or even 4. Save the file and run python examples/ssd/ssd_pascal.py in the terminal again.
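
To see what changing gpus and batch_size does to the per-device batch and the gradient accumulation count, the arithmetic from the ssd_pascal.py lines quoted above can be reproduced in isolation. This is a sketch, not the script itself: the function name `plan_batches` is hypothetical, and accum_batch_size is passed in explicitly as an assumed value.

```python
import math


def plan_batches(gpus, batch_size, accum_batch_size):
    """Return (batch_size_per_device, iter_size) for a comma-separated GPU list."""
    gpulist = gpus.split(",")
    num_gpus = len(gpulist)
    # Split the batch evenly across the GPUs, rounding up.
    batch_size_per_device = int(math.ceil(float(batch_size) / num_gpus))
    # Number of forward/backward passes accumulated per solver update.
    iter_size = int(
        math.ceil(float(accum_batch_size) / (batch_size_per_device * num_gpus))
    )
    return batch_size_per_device, iter_size


print(plan_batches("0,1,2,3", 32, 32))  # (8, 1)
print(plan_batches("0", 32, 32))        # (32, 1)
print(plan_batches("0", 8, 32))         # (8, 4)
```

With a single GPU, lowering batch_size from 32 to 8 cuts the images held in memory per pass to a quarter, while iter_size rises to 4 so that each update still accumulates 32 images.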


  If you don't have time to train your model, you can download a pre-trained model at [here](http://www.cs.unc.edu/~wliu/projects/SSD/models_VGGNet_VOC0712_SSD_300x300.tar.gz).


2. Evaluate the most recent snapshot.
  ```Shell
  # If you would like to test a model you trained, you can do:
  # (this reports the detection result of the trained model; the value is around 0.718)
  python examples/ssd/score_ssd_pascal.py
  ```
3. Test your model using a webcam. Note: press <kbd>esc</kbd> to stop.
  ```Shell
  # If you would like to attach a webcam to a model you trained, you can do:
  # (this demonstrates detection on a live webcam feed)
  python examples/ssd/ssd_pascal_webcam.py
  ```
4. Check out `examples/ssd_detect.ipynb` or `examples/ssd/ssd_detect.cpp` on how to detect objects using a SSD model.

### Models
1. Models trained on VOC0712: [SSD300](http://www.cs.unc.edu/~wliu/projects/SSD/models_VGGNet_VOC0712_SSD_300x300.tar.gz), [SSD500](http://www.cs.unc.edu/~wliu/projects/SSD/models_VGGNet_VOC0712_SSD_500x500.tar.gz)


2. Models trained on MSCOCO trainval35k: [SSD300](http://www.cs.unc.edu/~wliu/projects/SSD/models_VGGNet_coco_SSD_300x300.tar.gz), [SSD500](http://www.cs.unc.edu/~wliu/projects/SSD/models_VGGNet_coco_SSD_500x500.tar.gz)


3. Models trained on ILSVRC2015 trainval1: [SSD300](http://www.cs.unc.edu/~wliu/projects/SSD/models_VGGNet_ilsvrc15_SSD_300x300.tar.gz), [SSD500](http://www.cs.unc.edu/~wliu/projects/SSD/models_VGGNet_ilsvrc15_SSD_500x500.tar.gz) (46.4 mAP on val2)


########################################################################


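Running python examples/ssd/ssd_pascal.py raised a few small problems. I fixed them all by following advice found online but have forgotten the details. As I recall, running

sudo python examples/ssd/ssd_pascal.py  was enough to get training started.


Running python examples/ssd/score_ssd_pascal.py and python examples/ssd/ssd_pascal_webcam.py produced an error:


the dimensions did not match between layers. At that point training had reached nearly 40,000 iterations, so I kept running

python examples/ssd/ssd_pascal.py until it had passed 68,000 iterations, then ran the two scripts again. Success!

With that, all of the configuration work is complete!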
