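For installing Ubuntu 14.04 + GTX 1070 + CUDA 8.0 + cuDNN 5.1 + OpenCV 3.0, see my two earlier posts, which cover it in detail.
The previous post covered configuring the Caffe framework on top of that.
This post focuses on the details of installing and configuring caffe-ssd (at this point the machine is set up with Ubuntu 14.04 + GTX 1070 + CUDA 8.0 + cuDNN 5.1 + OpenCV 3.0 + Caffe).
1. Installing SSD (most of the dependencies were already installed while configuring Caffe)
Create a new folder caffe-ssd under /home (to keep it apart from the existing caffe), then: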
cd caffe-ssd
git clone https://github.com/weiliu89/caffe.git
cd caffe
git checkout ssd
2. Configuring SSD
cd /home/**(your server's name)/caffe
cp Makefile.config.example Makefile.config
Open Makefile.config and make the following changes:
Uncomment USE_CUDNN := 1
Uncomment OPENCV_VERSION := 3
Uncomment WITH_PYTHON_LAYER := 1
Change
INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include
to
INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include /usr/include/hdf5/serial/
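The three "uncomment" edits above can also be scripted. The following is only a minimal sketch: the helper name `uncomment_flags` is made up for illustration, and it assumes each flag appears in Makefile.config as a line of the form `# FLAG := value`.

```python
import re


def uncomment_flags(config_text, flags):
    """Return config_text with '# FLAG := value' lines uncommented for the given flags.

    Hypothetical helper for illustration; it is not part of Caffe.
    """
    lines = []
    for line in config_text.splitlines():
        for flag in flags:
            # Match an optional indent, '#', then the flag name followed by ':='.
            if re.match(r"^\s*#\s*%s\s*:=" % re.escape(flag), line):
                line = re.sub(r"^\s*#\s*", "", line)
                break
        lines.append(line)
    return "\n".join(lines)


if __name__ == "__main__":
    sample = "# USE_CUDNN := 1\n# CPU_ONLY := 1\n# OPENCV_VERSION := 3"
    print(uncomment_flags(sample, ["USE_CUDNN", "OPENCV_VERSION", "WITH_PYTHON_LAYER"]))
```

To apply it to the real file, read Makefile.config into a string, pass it through the function, and write the result back. Keep a backup copy first.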
In the Makefile, change
LIBRARIES += glog gflags protobuf boost_system boost_filesystem boost_regex m hdf5_hl hdf5
to
LIBRARIES += glog gflags protobuf boost_system boost_filesystem boost_regex m hdf5_hl hdf5 opencv_core opencv_highgui opencv_imgproc opencv_imgcodecs
Then compile.
I first tried the CMake approach below. It did not work at first, so I gave up on it:
cd /home//caffe
mkdir build
cd build
cmake ..
make all -j16
make install
make runtest
make pycaffe
During cmake .. I found that it was still picking up OpenCV 2.4.8. To fix that, add the following line to CMakeLists.txt in caffe-ssd/caffe:
set(OpenCV_DIR /home/shan/opencv-3.0.0/build)
With this line, version 3.0 is found. After that change I switched to the method below and compiled following the instructions in README.md:
make -j8
make py
make test -j8
make runtest -j8
The following errors can come up during make -j8:
1. json_parser_read.hpp:257:264: error: 'type name' declared as function returning an array escape
Fix: run sudo gedit /usr/include/boost/property_tree/detail/json_parser_read.hpp and comment out the escape block at line 257.
2. cannot find -lopencv_videoio
cannot find -lopenblas
Errors like these mean the linker cannot find libopencv_videoio.so, libopenblas.so, and so on.
Fix: first search the machine to see whether files such as libopencv_videoio exist.
If they do not, run sudo apt-get install libopencv_videoio-dev
If they do, copy them to /usr/lib; if that does not help, copy them to /usr/local/lib.
In my case libopencv_videoio.so was present, so I copied it:
sudo cp /home/shan/opencv-3.0.0/build/lib/libopencv_videoio.so /usr/lib
sudo cp /home/shan/opencv-3.0.0/build/lib/libopencv_videoio.so.3.0 /usr/lib
sudo cp /home/shan/opencv-3.0.0/build/lib/libopencv_videoio.so.3.0.0 /usr/lib
sudo cp /home/shan/opencv-3.0.0/build/lib/libopencv_videoio_pch_dephelp.a /usr/lib
libopenblas, on the other hand, was not on the machine, so I installed it:
sudo apt-get install libopenblas-dev
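The "search the machine first" step can be sketched in Python. This is only an illustration: `find_library` is a hypothetical helper, and which directories to search is an assumption you should adjust (for example, add your own OpenCV build directory).

```python
import fnmatch
import os


def find_library(pattern, roots):
    """Return sorted paths under the given roots whose file name matches pattern.

    Hypothetical helper for illustration; pattern is a shell-style glob
    such as 'libopencv_videoio.so*'.
    """
    matches = []
    for root in roots:
        # os.walk silently skips roots that do not exist.
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in fnmatch.filter(filenames, pattern):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)
```

For example, `find_library("libopencv_videoio.so*", ["/usr/lib", "/usr/local/lib"])` returns an empty list when the library is not installed in either place, which corresponds to the "cannot find -lopencv_videoio" case above.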
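After that, make -j8 ran without problems. I then ran:
make py
make test -j8
make runtest -j8
This produced the error: error while loading shared libraries: libcudart.so.8.0: cannot open shared object file: No such file or directory
Fix: first make sure /etc/profile includes the CUDA 8.0 installation path and its library directories: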
sudo gedit /etc/profile
export PATH=$PATH:/usr/local/cuda-8.0/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-8.0/lib64
export LIBRARY_PATH=$LIBRARY_PATH:/usr/local/cuda-8.0/lib64
export PYTHONPATH=~/caffe-ssd/caffe/python:$PYTHONPATH
export PKG_CONFIG_PATH=/usr/local/cuda-8.0/pkgconfig:/usr/local/lib/pkgconfig:$PKG_CONFIG_PATH
source /etc/profile
This applies the new settings. Then run the tests again.
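Before re-running the tests, it can help to confirm that the current shell really sees the CUDA library directory. A minimal sketch follows; `dirs_with_library` is a hypothetical helper, and it only inspects a colon-separated path such as LD_LIBRARY_PATH, not the ldconfig cache or the loader's default directories.

```python
import os


def dirs_with_library(lib_name, search_path):
    """Return the directories in a colon-separated path that contain lib_name.

    Hypothetical helper for illustration; it checks plain file existence only.
    """
    found = []
    for directory in search_path.split(":"):
        if directory and os.path.isfile(os.path.join(directory, lib_name)):
            found.append(directory)
    return found


if __name__ == "__main__":
    path = os.environ.get("LD_LIBRARY_PATH", "")
    hits = dirs_with_library("libcudart.so.8.0", path)
    print(hits if hits else "libcudart.so.8.0 not found via LD_LIBRARY_PATH")
```

An empty result in a fresh terminal suggests the exports in /etc/profile have not been picked up by that shell. Note that /etc/profile is read by login shells, so a newly opened terminal tab may not see the change until you log in again or source the file.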
If the same error still appears, run the following commands to copy the libraries into /usr/local/lib:
sudo cp /usr/local/cuda-8.0/lib64/libcudart.so.8.0 /usr/local/lib/libcudart.so.8.0 && sudo ldconfig
sudo cp /usr/local/cuda-8.0/lib64/libcublas.so.8.0 /usr/local/lib/libcublas.so.8.0 && sudo ldconfig
sudo cp /usr/local/cuda-8.0/lib64/libcurand.so.8.0 /usr/local/lib/libcurand.so.8.0 && sudo ldconfig
That still did not work. I then copied libcudart into /usr/lib instead:
sudo cp /usr/local/cuda-8.0/lib64/libcudart.so.8.0 /usr/lib/libcudart.so.8.0 && sudo ldconfig
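Success!
The next day, after some change I cannot identify, it stopped working. I deleted the caffe-ssd folder, downloaded it again, and recompiled, but the build no longer went through. This time, however, the CMake approach below compiled. Go figure.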
cd /home//caffe
mkdir build
cd build
cmake ..
make all -j16
make install
make runtest
make pycaffe
Next, download the datasets and train.
### Preparation
1. Download [fully convolutional reduced (atrous) VGGNet](https://gist.github.com/weiliu89/2ed6e13bfd5b57cf81d6). By default, we assume the model is stored in `$CAFFE_ROOT/models/VGGNet/`
2. Download VOC2007 and VOC2012 dataset. By default, we assume the data is stored in `$HOME/data/`
```Shell
# Download the data.
cd $HOME/data
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
# Extract the data.
tar -xvf VOCtrainval_11-May-2012.tar
tar -xvf VOCtrainval_06-Nov-2007.tar
tar -xvf VOCtest_06-Nov-2007.tar
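# Note: extracting the archives directly into the data directory is the most reliable way to get the dataset.
# When I copied a previously saved dataset into the data directory instead, 2011_001232.jpg under VOC2012/JPEGImages failed to copy.
# 3. Generate the LMDB files.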
cd $CAFFE_ROOT
# Create the trainval.txt, test.txt, and test_name_size.txt in data/VOC0712/
./data/VOC0712/create_list.sh
# You can modify the parameters in create_data.sh if needed.
# It will create lmdb files for trainval and test with encoded original image:
# - $HOME/data/VOCdevkit/VOC0712/lmdb/VOC0712_trainval_lmdb
# - $HOME/data/VOCdevkit/VOC0712/lmdb/VOC0712_test_lmdb
# and make soft links at examples/VOC0712/
./data/VOC0712/create_data.sh
```
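Because a copied dataset turned out to be incomplete in my case, a quick check of the extracted directory layout before running create_list.sh can save time. The sketch below is an assumption-laden illustration: `missing_voc_dirs` is a hypothetical helper, and it only checks that the standard VOCdevkit subdirectories exist, not that every image is present.

```python
import os


def missing_voc_dirs(data_root, years=("VOC2007", "VOC2012")):
    """Return expected VOCdevkit subdirectories that are missing under data_root.

    Hypothetical helper for illustration; data_root corresponds to $HOME/data.
    """
    missing = []
    for year in years:
        for sub in ("Annotations", "ImageSets", "JPEGImages"):
            path = os.path.join(data_root, "VOCdevkit", year, sub)
            if not os.path.isdir(path):
                missing.append(path)
    return missing


if __name__ == "__main__":
    for path in missing_voc_dirs(os.path.expanduser("~/data")):
        print("missing:", path)
```

An empty result means the directory skeleton is in place. It does not catch a single missing file such as the one mentioned above, so re-extracting from the tar archives remains the safer route.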
### Training
1. Train your model and evaluate the model on the fly.
```Shell
# It will create model definition files and save snapshot models in:
# - $CAFFE_ROOT/models/VGGNet/VOC0712/SSD_300x300/
# and job file, log file, and the python script in:
# - $CAFFE_ROOT/jobs/VGGNet/VOC0712/SSD_300x300/
# and save temporary evaluation results in:
# - $HOME/data/VOCdevkit/results/VOC2007/SSD_300x300/
# It should reach 72.* mAP at 60k iterations.
python examples/ssd/ssd_pascal.py  # train the model
```
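Open caffe/examples/ssd/ssd_pascal.py and find the line gpus = "0,1,2,3". If your server has a single GPU, delete 1,2,3; if it has two GPUs, delete 2,3; and so on. If your server has no GPU support, comment out the following lines and the program will train on the CPU. (This is how to resolve the error cudaSuccess (10 vs. 0).)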
#if num_gpus > 0:
#  batch_size_per_device = int(math.ceil(float(batch_size) / num_gpus))
#  iter_size = int(math.ceil(float(accum_batch_size) / (batch_size_per_device * num_gpus)))
#  solver_mode = P.Solver.GPU
#  device_id = int(gpulist[0])
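The rule for editing the gpus string (keep only the device ids your machine actually has) can be written down as a small function. This is a sketch only: `trim_gpus` is a hypothetical helper, not something in ssd_pascal.py, and the number of available GPUs is something you supply yourself (for example from the output of nvidia-smi).

```python
def trim_gpus(gpus, available):
    """Keep only the device ids in a comma-separated string that are below `available`.

    Hypothetical helper for illustration; `available` is the number of GPUs
    physically present in the machine.
    """
    kept = []
    for device in gpus.split(","):
        device = device.strip()
        if device and int(device) < available:
            kept.append(device)
    return ",".join(kept)


if __name__ == "__main__":
    print(trim_gpus("0,1,2,3", 1))  # one GPU
    print(trim_gpus("0,1,2,3", 2))  # two GPUs
```

An empty result means there is no usable GPU, which is the case where the block above is commented out and training falls back to the CPU.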
Save the file, then run in a terminal:
cd /home/**(your server's name)/caffe
python examples/ssd/ssd_pascal.py
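If you hit the error cudaSuccess (2 vs. 0), the GPU has run out of memory for this workload. Open caffe/examples/ssd/ssd_pascal.py again, find the line batch_size = 32, and lower the value to 16, 8, or even 4 (a smaller batch needs less GPU memory per iteration). Save the file and run python examples/ssd/ssd_pascal.py in the terminal again.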
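To see what lowering batch_size does, the two formulas from the commented block above can be evaluated on their own. This is a worked sketch, not the script itself: the function name `solver_batching` is made up, it covers only the GPU case (num_gpus > 0), and the value 32 for accum_batch_size is an assumption about the script's default.

```python
import math


def solver_batching(batch_size, accum_batch_size, num_gpus):
    """Evaluate the per-device batch size and iter_size formulas for num_gpus > 0.

    Hypothetical wrapper for illustration around the two formulas quoted above.
    """
    batch_size_per_device = int(math.ceil(float(batch_size) / num_gpus))
    iter_size = int(math.ceil(float(accum_batch_size) / (batch_size_per_device * num_gpus)))
    return batch_size_per_device, iter_size


if __name__ == "__main__":
    # accum_batch_size = 32 is assumed here; check the value in your copy of the script.
    for batch_size in (32, 16, 8, 4):
        print(batch_size, solver_batching(batch_size, 32, 1))
```

With one GPU and an accumulated batch of 32, a batch_size of 8 gives 8 images per device and an iter_size of 4, so each forward/backward pass holds fewer images in memory while gradients are still accumulated over 32 images per update.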
If you don't have time to train your model, you can download a pre-trained model at [here](http://www.cs.unc.edu/~wliu/projects/SSD/models_VGGNet_VOC0712_SSD_300x300.tar.gz).
2. Evaluate the most recent snapshot.
```Shell
# If you would like to test a model you trained, you can do:
python examples/ssd/score_ssd_pascal.py  # evaluates the trained detector; the score is around 0.718
```
3. Test your model using a webcam. Note: press <kbd>esc</kbd> to stop.
```Shell
# If you would like to attach a webcam to a model you trained, you can do:
python examples/ssd/ssd_pascal_webcam.py  # demonstrates detection on a webcam feed
```
4. Check out `examples/ssd_detect.ipynb` or `examples/ssd/ssd_detect.cpp` on how to detect objects using a SSD model.
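Since the evaluation step works on the most recent snapshot, it is useful to know which iteration that snapshot corresponds to before running the scoring or webcam scripts. A minimal sketch, assuming snapshots follow Caffe's usual `<prefix>_iter_<N>.caffemodel` naming; `latest_snapshot_iter` is a hypothetical helper, not a function from the repository.

```python
import os
import re


def latest_snapshot_iter(snapshot_dir):
    """Return the highest N among files named *_iter_N.caffemodel, or 0 if there are none.

    Hypothetical helper for illustration.
    """
    best = 0
    for name in os.listdir(snapshot_dir):
        match = re.search(r"_iter_(\d+)\.caffemodel$", name)
        if match:
            best = max(best, int(match.group(1)))
    return best
```

Pointing it at the snapshot directory named earlier, `$CAFFE_ROOT/models/VGGNet/VOC0712/SSD_300x300/`, shows how far training has progressed. A result of 0 means no snapshot has been written yet.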
### Models
1. Models trained on VOC0712: [SSD300](http://www.cs.unc.edu/~wliu/projects/SSD/models_VGGNet_VOC0712_SSD_300x300.tar.gz), [SSD500](http://www.cs.unc.edu/~wliu/projects/SSD/models_VGGNet_VOC0712_SSD_500x500.tar.gz)
2. Models trained on MSCOCO trainval35k: [SSD300](http://www.cs.unc.edu/~wliu/projects/SSD/models_VGGNet_coco_SSD_300x300.tar.gz),
[SSD500](http://www.cs.unc.edu/~wliu/projects/SSD/models_VGGNet_coco_SSD_500x500.tar.gz)
3. Models trained on ILSVRC2015 trainval1: [SSD300](http://www.cs.unc.edu/~wliu/projects/SSD/models_VGGNet_ilsvrc15_SSD_300x300.tar.gz),
[SSD500](http://www.cs.unc.edu/~wliu/projects/SSD/models_VGGNet_ilsvrc15_SSD_500x500.tar.gz) (46.4 mAP on val2)
########################################################################
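Running python examples/ssd/ssd_pascal.py raised a few minor problems. I fixed them all by following suggestions found online but no longer remember the details. As far as I recall, running
sudo python examples/ssd/ssd_pascal.py
was enough to get training going.
Running python examples/ssd/score_ssd_pascal.py and python examples/ssd/ssd_pascal_webcam.py then failed with an error:
the dimensions did not match between layers. Training had reached nearly 40,000 iterations at that point. I kept running
python examples/ssd/ssd_pascal.py until it had passed 68,000 iterations, then ran the two scripts again, and they worked!
With that, the whole setup is complete!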