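Google Protocol Buffers (Protobuf for short) is Google's internal standard for exchanging data between programs written in different languages. It is a lightweight, efficient format for storing structured data and can be used to serialize that data. It is well suited to data storage and to use as an RPC data exchange format: a language-neutral, platform-neutral, extensible way of serializing structured data for communication protocols, data storage, and similar uses. The initial release provided APIs for three languages: C++, Java, and Python.
Advantages of Protobuf
Protobuf is similar to XML, but smaller, faster, and simpler. You define your own data structures, then use code produced by the code generator to read and write them. You can even update a data structure without redeploying your programs. Describe the data structure once with Protobuf, and you can read and write your structured data from a variety of languages and data streams.
One especially useful property is good backward compatibility: a data structure can be upgraded without breaking deployed programs that rely on the old data format. A change to a message's structure therefore does not force large-scale refactoring or migration, because adding a new field to a message requires no change to programs that have already been released.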
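To see why an added field need not disturb an already-released program, it may help to look at the wire format. The following is only a rough sketch in plain C++, not code from protobuf itself. It assumes the documented encoding in which each field starts with a varint key equal to (field_number << 3) | wire_type, and it handles only wire types 0 (varint) and 2 (length-delimited). The names ReadVarint and ReadFieldOneSkippingUnknown are made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Read a base-128 varint starting at buf[pos] and advance pos past it.
std::uint64_t ReadVarint(const std::vector<std::uint8_t>& buf, std::size_t& pos) {
    std::uint64_t value = 0;
    int shift = 0;
    while (pos < buf.size()) {
        assert(shift < 64 && "varint is longer than 10 bytes");
        std::uint8_t byte = buf[pos++];
        value |= static_cast<std::uint64_t>(byte & 0x7F) << shift;
        if ((byte & 0x80) == 0) break;  // high bit clear: this was the last byte
        shift += 7;
    }
    return value;
}

// Hypothetical "old" reader: it knows only field 1 (a varint) and skips
// every other field it meets. Only wire types 0 and 2 are handled here.
std::uint64_t ReadFieldOneSkippingUnknown(const std::vector<std::uint8_t>& buf) {
    std::uint64_t field_one = 0;
    std::size_t pos = 0;
    while (pos < buf.size()) {
        std::uint64_t key = ReadVarint(buf, pos);
        std::uint64_t field_number = key >> 3;
        std::uint64_t wire_type = key & 0x7;
        if (wire_type == 0) {
            std::uint64_t value = ReadVarint(buf, pos);
            if (field_number == 1) field_one = value;  // the only known field
        } else if (wire_type == 2) {
            std::uint64_t length = ReadVarint(buf, pos);
            pos += static_cast<std::size_t>(length);   // skip the unknown payload
        } else {
            break;  // other wire types are out of scope for this sketch
        }
    }
    return field_one;
}
```

Called on the bytes 08 96 01 (field 1 = 150), the reader returns 150. It still returns 150 when a newer writer appends 12 07 74 65 73 74 69 6e 67 (field 2 = "testing"), because the unknown field is skipped rather than treated as an error.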
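Protobuf has clearer semantics and needs nothing like an XML parser, because the Protobuf compiler compiles the .proto file into data access classes that serialize and deserialize Protobuf data.
Using Protobuf does not require learning a complex document object model. Its programming model is friendly and easy to learn, and it comes with good documentation and examples. For people who prefer simple things, Protobuf is more attractive than the alternatives.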
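Disadvantages of Protobuf
Protobuf also falls short of XML in some respects. Its feature set is small, so it cannot express complex concepts the way XML can.
XML has become the authoring tool behind many industry standards, whereas Protobuf began as an internal Google tool and is far less general-purpose.
Protobuf is also a poor fit for modeling text-based markup documents such as HTML, because structure cannot easily be interleaved with text. In addition, XML is to some degree self-describing and can be read and edited directly by a person. Protobuf cannot: it is stored in binary form, and without the .proto definition you cannot directly read any of its content.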
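To make the "binary, not human-readable" point concrete, here is a small sketch of the base-128 varint encoding that Protobuf uses for integers on the wire. It is plain C++ written for this post rather than code from the protobuf library, and EncodeVarint and ToHex are illustrative names only.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Encode an unsigned integer as a base-128 varint: 7 payload bits per byte,
// least significant group first, high bit set on every byte except the last.
std::vector<std::uint8_t> EncodeVarint(std::uint64_t value) {
    std::vector<std::uint8_t> out;
    while (value >= 0x80) {
        out.push_back(static_cast<std::uint8_t>((value & 0x7F) | 0x80));
        value >>= 7;
    }
    out.push_back(static_cast<std::uint8_t>(value));
    return out;
}

// Render bytes as lowercase hex separated by single spaces.
std::string ToHex(const std::vector<std::uint8_t>& bytes) {
    static const char kDigits[] = "0123456789abcdef";
    std::string text;
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        if (i > 0) text += ' ';
        text += kDigits[bytes[i] >> 4];
        text += kDigits[bytes[i] & 0x0F];
    }
    return text;
}
```

ToHex(EncodeVarint(300)) gives "ac 02": two bytes that mean nothing to a reader who does not know the encoding and, for a full message, the .proto definition.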
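Why use protobuf
A good software framework should have clearly defined inputs and outputs. A CNN consists of two main parts: the concrete network structure, and the optimization algorithm together with its parameters. A user of the framework only needs to supply two description files to obtain the optimized network, which is very convenient.
The caffe framework uses Google's open-source protobuf tool to describe, parse, and store these two parts, which saves caffe a great deal of implementation code.
Take the object detection demo described earlier, py-faster-rcnn. It is split into a training process and a testing process, and the core files of both are text files in prototxt format.
For example, the training process:
Input:
(1) solver.prototxt. Describes the parameters used during training, such as the training policy, the learning rate schedule, and how often the model is saved.
(2) train.prototxt. Describes the structure of the training network.
(3) test.prototxt. Describes the structure of the test network.
Output:
VGG16.caffemodel: the file that stores the trained network parameters.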
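The protobuf workflow
protobuf is mainly a tool for serializing, storing, and parsing data. In practice it is used as a code generator: it generates standard read/write code for the data structures you define, and users then read, parse, and store data in files through those standard interfaces.
protobuf currently supports C++, Python, Java, and other languages. This post demonstrates the C++ usage found in caffe.
The main steps are:
(1) Write the XXX.proto file. It defines the data structures and the type of each field, such as integer and string types.
(2) Compile XXX.proto with protoc to generate the code that reads and writes those data structures. The generated interfaces are standardized, and the generated files are normally named XXX.pb.cc and XXX.pb.h.
(3) Use the code provided by XXX.pb.cc and XXX.pb.h in your new program.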
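A simple caffe.proto example: writing and parsing
To make the protobuf tool easier to understand later on, this section uses a simplified caffe.proto to parse solver.prototxt and train.prototxt.
Writing caffe.proto: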
syntax = "proto2";
package caffe;  // C++ namespace
message NetParameter {
  optional string name = 1; // consider giving the network a name
  repeated LayerParameter layer = 2; // the layers of the net (ID 100 in the full caffe.proto, so layers are printed last)
}
message SolverParameter {
  optional string train_net = 1;
  optional float base_lr = 2;
  optional string lr_policy = 3;
  optional NetParameter net_param = 4;
}
message ParamSpec {
  optional string name = 1;
  optional float lr_mult = 3 [default = 1.0];
  optional float decay_mult = 4 [default = 1.0];
}
// LayerParameter next available layer-specific ID: 147 (last added: recurrent_param)
message LayerParameter {
  optional string name = 1; // the layer name
  optional string type = 2; // the layer type
  repeated string bottom = 3; // the name of each bottom blob
  repeated string top = 4; // the name of each top blob
  repeated ParamSpec param = 6;
  // Layer type-specific parameters.
  optional ConvolutionParameter convolution_param = 106;
  optional PythonParameter python_param = 130;
}
message ConvolutionParameter {
  optional uint32 num_output = 1; // The number of outputs for the layer
  // Pad, kernel size, and stride are all given as a single value for equal
  // dimensions in all spatial dimensions, or once per spatial dimension.
  repeated uint32 pad = 3; // The padding size; defaults to 0
  repeated uint32 kernel_size = 4; // The kernel size
  repeated uint32 stride = 6; // The stride; defaults to 1
}
message PythonParameter {
  optional string module = 1;
  optional string layer = 2;
  // This value is set to the attribute `param_str` of the `PythonLayer` object
  // in Python before calling the `setup()` method. This could be a number,
  // string, dictionary in Python dict format, JSON, etc. You may parse this
  // string in `setup` method and use it in `forward` and `backward`.
  optional string param_str = 3 [default = ''];
}
...
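For a field declared like lr_mult above, protoc generates accessors along the lines of has_lr_mult(), lr_mult(), set_lr_mult(), and clear_lr_mult(). The class below is a hand-written stand-in, not the generated code, and the name ParamSpecSketch is invented here. It only sketches how a proto2 optional field with [default = 1.0] behaves: reading an unset field yields the default, and has_lr_mult() reports whether a value was set explicitly.

```cpp
// Hand-written stand-in for the accessors protoc would generate for
//   optional float lr_mult = 3 [default = 1.0];
// ParamSpecSketch is a hypothetical name, not a real generated class.
class ParamSpecSketch {
public:
    // True only after set_lr_mult() has been called and not cleared since.
    bool has_lr_mult() const { return has_lr_mult_; }

    // Returns the stored value, or the declared default (1.0) when unset.
    float lr_mult() const { return has_lr_mult_ ? lr_mult_ : 1.0f; }

    // Store a value and remember that it was set explicitly.
    void set_lr_mult(float value) {
        lr_mult_ = value;
        has_lr_mult_ = true;
    }

    // Forget the stored value so that the default is visible again.
    void clear_lr_mult() { has_lr_mult_ = false; }

private:
    bool has_lr_mult_ = false;
    float lr_mult_ = 0.0f;
};
```

Under these semantics, a prototxt file that wants a learning rate multiplier of 0 has to write lr_mult: 0 explicitly, since leaving the field out would read back as 1.0.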
Compile to generate caffe.pb.cc and caffe.pb.h
protoc caffe.proto --cpp_out=.   # generate the .cc and .h files in the current directory
Write the test file main.cpp
#include <fcntl.h>
#include <unistd.h>
#include <iostream>
#include <string>
#include <google/protobuf/io/coded_stream.h>
#include <google/protobuf/io/zero_copy_stream_impl.h>
#include <google/protobuf/text_format.h>
#include "caffe.pb.h"
using namespace caffe;
using namespace std;
using google::protobuf::io::FileInputStream;
using google::protobuf::Message;
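// Parse a text-format (.prototxt) file into the given message.
// Returns false if the file cannot be opened or cannot be parsed.
bool ReadProtoFromTextFile(const char* filename, Message* proto) {
    int fd = open(filename, O_RDONLY);
    if (fd == -1) {  // open() failed, so there is nothing to parse
        return false;
    }
    FileInputStream* input = new FileInputStream(fd);
    bool success = google::protobuf::TextFormat::Parse(input, proto);
    delete input;
    close(fd);
    return success;
}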
int main()
{
    SolverParameter SGD;
    if(!ReadProtoFromTextFile("solver.prototxt", &SGD))
    {
        cout<<"error opening file"<<endl;
        return -1;
    }
    cout<<"hello,world"<<endl;
    cout<<SGD.train_net()<<endl;
    cout<<SGD.base_lr()<<endl;
    cout<<SGD.lr_policy()<<endl;
    NetParameter VGG16;
    if(!ReadProtoFromTextFile("train.prototxt", &VGG16))
    {
        cout<<"error opening file"<<endl;
        return -1;
    }
    cout<<VGG16.name()<<endl;
    return 0;
}
Write the solver and train network description files
Contents of solver.prototxt:
train_net: "/home/bryant/cuda-test/train.prototxt"
base_lr: 0.001
lr_policy: "step"
Contents of train.prototxt:
name: "VGG_ILSVRC_16_layers"
layer {
  name: 'input-data'
  type: 'Python'
  top: 'data'
  top: 'im_info'
  top: 'gt_boxes'
  python_param {
    module: 'roi_data_layer.layer'
    layer: 'RoIDataLayer'
    param_str: "'num_classes': 2"
  }
}
layer {
  name: "conv1_1"
  type: "Convolution"
  bottom: "data"
  top: "conv1_1"
  param {
    lr_mult: 0
    decay_mult: 0
  }
  param {
    lr_mult: 0
    decay_mult: 0
  }
  convolution_param {
    num_output: 64
    pad: 1
    kernel_size: 3
  }
}
Compile and link to produce main
g++ caffe.pb.cc main.cpp -o main -lprotobuf
Output
bryant@bryant:~/cuda-test/src$ ./main
hello,world
/home/bryant/cuda-test/train.prototxt
0.001
step
VGG_ILSVRC_16_layers
bryant@bryant:~/cuda-test/src$
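Compiling proto files with cmake
When building with cmake, the .proto files can be compiled directly from CMakeLists.txt instead of by typing commands in a terminal, which is much more convenient. A FindProtobuf.cmake module is provided for this (the copy shown here carries a Kitware copyright and ships with CMake), and the find_package() command brings it into CMakeLists.txt.
The contents of that file are as follows: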
#.rst:
# FindProtobuf
# ------------
#
# Locate and configure the Google Protocol Buffers library.
#
# The following variables can be set and are optional:
#
# ``PROTOBUF_SRC_ROOT_FOLDER``
# When compiling with MSVC, if this cache variable is set
# the protobuf-default VS project build locations
# (vsprojects/Debug and vsprojects/Release
# or vsprojects/x64/Debug and vsprojects/x64/Release)
# will be searched for libraries and binaries.
# ``PROTOBUF_IMPORT_DIRS``
# List of additional directories to be searched for
# imported .proto files.
#
# Defines the following variables:
#
# ``PROTOBUF_FOUND``
# Found the Google Protocol Buffers library
# (libprotobuf & header files)
# ``PROTOBUF_INCLUDE_DIRS``
# Include directories for Google Protocol Buffers
# ``PROTOBUF_LIBRARIES``
# The protobuf libraries
# ``PROTOBUF_PROTOC_LIBRARIES``
# The protoc libraries
# ``PROTOBUF_LITE_LIBRARIES``
# The protobuf-lite libraries
#
# The following cache variables are also available to set or use:
#
# ``PROTOBUF_LIBRARY``
# The protobuf library
# ``PROTOBUF_PROTOC_LIBRARY``
# The protoc library
# ``PROTOBUF_INCLUDE_DIR``
# The include directory for protocol buffers
# ``PROTOBUF_PROTOC_EXECUTABLE``
# The protoc compiler
# ``PROTOBUF_LIBRARY_DEBUG``
# The protobuf library (debug)
# ``PROTOBUF_PROTOC_LIBRARY_DEBUG``
# The protoc library (debug)
# ``PROTOBUF_LITE_LIBRARY``
# The protobuf lite library
# ``PROTOBUF_LITE_LIBRARY_DEBUG``
# The protobuf lite library (debug)
#
# Example:
#
# .. code-block:: cmake
#
# find_package(Protobuf REQUIRED)
# include_directories(${PROTOBUF_INCLUDE_DIRS})
# include_directories(${CMAKE_CURRENT_BINARY_DIR})
# protobuf_generate_cpp(PROTO_SRCS PROTO_HDRS foo.proto)
# protobuf_generate_python(PROTO_PY foo.proto)
# add_executable(bar bar.cc ${PROTO_SRCS} ${PROTO_HDRS})
# target_link_libraries(bar ${PROTOBUF_LIBRARIES})
#
# .. note::
# The ``protobuf_generate_cpp`` and ``protobuf_generate_python``
# functions and :command:`add_executable` or :command:`add_library`
# calls only work properly within the same directory.
#
# .. command:: protobuf_generate_cpp
#
# Add custom commands to process ``.proto`` files to C++::
#
# protobuf_generate_cpp (<SRCS> <HDRS> [<ARGN>...])
#
# ``SRCS``
# Variable to define with autogenerated source files
# ``HDRS``
# Variable to define with autogenerated header files
# ``ARGN``
# ``.proto`` files
#
# .. command:: protobuf_generate_python
#
# Add custom commands to process ``.proto`` files to Python::
#
# protobuf_generate_python (<PY> [<ARGN>...])
#
# ``PY``
# Variable to define with autogenerated Python files
# ``ARGN``
# ``.proto`` filess
#=============================================================================
# Copyright 2009 Kitware, Inc.
# Copyright 2009-2011 Philip Lowman
# Copyright 2008 Esben Mose Hansen, Ange Optimization ApS
#
# Distributed under the OSI-approved BSD License (the "License");
# see accompanying file Copyright.txt for details.
#
# This software is distributed WITHOUT ANY WARRANTY; without even the
# implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
# See the License for more information.
#=============================================================================
# (To distribute this file outside of CMake, substitute the full
# License text for the above reference.)
function(PROTOBUF_GENERATE_CPP SRCS HDRS)
if(NOT ARGN)
message(SEND_ERROR "Error: PROTOBUF_GENERATE_CPP() called without any proto files")
return()
endif()
if(PROTOBUF_GENERATE_CPP_APPEND_PATH)
# Create an include path for each file specified
foreach(FIL ${ARGN})
get_filename_component(ABS_FIL ${FIL} ABSOLUTE)
get_filename_component(ABS_PATH ${ABS_FIL} PATH)
list(FIND _protobuf_include_path ${ABS_PATH} _contains_already)
if(${_contains_already} EQUAL -1)
list(APPEND _protobuf_include_path -I ${ABS_PATH})
endif()
endforeach()
else()
set(_protobuf_include_path -I ${CMAKE_CURRENT_SOURCE_DIR})
endif()
if(DEFINED PROTOBUF_IMPORT_DIRS)
foreach(DIR ${PROTOBUF_IMPORT_DIRS})
get_filename_component(ABS_PATH ${DIR} ABSOLUTE)
list(FIND _protobuf_include_path ${ABS_PATH} _contains_already)
if(${_contains_already} EQUAL -1)
list(APPEND _protobuf_include_path -I ${ABS_PATH})
endif()
endforeach()
endif()
set(${SRCS})
set(${HDRS})
foreach(FIL ${ARGN})
get_filename_component(ABS_FIL ${FIL} ABSOLUTE)
get_filename_component(FIL_WE ${FIL} NAME_WE)
list(APPEND ${SRCS} "${CMAKE_CURRENT_BINARY_DIR}/${FIL_WE}.pb.cc")
list(APPEND ${HDRS} "${CMAKE_CURRENT_BINARY_DIR}/${FIL_WE}.pb.h")
add_custom_command(
OUTPUT "${CMAKE_CURRENT_BINARY_DIR}/${FIL_WE}.pb.cc"
"${CMAKE_CURRENT_BINARY_DIR}/${FIL_WE}.pb.h"
COMMAND ${PROTOBUF_PROTOC_EXECUTABLE}
ARGS --cpp_out ${CMAKE_CURRENT_BINARY_DIR} ${_protobuf_include_path} ${ABS_FIL}
DEPENDS ${ABS_FIL} ${PROTOBUF_PROTOC_EXECUTABLE}
COMMENT "Running C++ protocol buffer compiler on ${FIL}"
VERBATIM )
endforeach()
set_source_files_properties(${${SRCS}} ${${HDRS}} PROPERTIES GENERATED TRUE)
set(${SRCS} ${${SRCS}} PARENT_SCOPE)
set(${HDRS} ${${HDRS}} PARENT_SCOPE)
endfunction()
function(PROTOBUF_GENERATE_PYTHON SRCS)
if(NOT ARGN)
message(SEND_ERROR "Error: PROTOBUF_GENERATE_PYTHON() called without any proto files")
return()
endif()
if(PROTOBUF_GENERATE_CPP_APPEND_PATH)
# Create an include path for each file specified
foreach(FIL ${ARGN})
get_filename_component(ABS_FIL ${FIL} ABSOLUTE)
get_filename_component(ABS_PATH ${ABS_FIL} PATH)
list(FIND _protobuf_include_path ${ABS_PATH} _contains_already)
if(${_contains_already} EQUAL -1)
list(APPEND _protobuf_include_path -I ${ABS_PATH})
endif()
endforeach()
else()
set(_protobuf_include_path -I ${CMAKE_CURRENT_SOURCE_DIR})
endif()
if(DEFINED PROTOBUF_IMPORT_DIRS)
foreach(DIR ${PROTOBUF_IMPORT_DIRS})
get_filename_component(ABS_PATH ${DIR} ABSOLUTE)
list(FIND _protobuf_include_path ${ABS_PATH} _contains_already)
if(${_contains_already} EQUAL -1)
list(APPEND _protobuf_include_path -I ${ABS_PATH})
endif()
endforeach()
endif()
set(${SRCS})
foreach(FIL ${ARGN})
get_filename_component(ABS_FIL ${FIL} ABSOLUTE)
get_filename_component(FIL_WE ${FIL} NAME_WE)
list(APPEND ${SRCS} "${CMAKE_CURRENT_BINARY_DIR}/${FIL_WE}_pb2.py")
add_custom_command(
OUTPUT "${CMAKE_CURRENT_BINARY_DIR}/${FIL_WE}_pb2.py"
COMMAND ${PROTOBUF_PROTOC_EXECUTABLE} --python_out ${CMAKE_CURRENT_BINARY_DIR} ${_protobuf_include_path} ${ABS_FIL}
DEPENDS ${ABS_FIL} ${PROTOBUF_PROTOC_EXECUTABLE}
COMMENT "Running Python protocol buffer compiler on ${FIL}"
VERBATIM )
endforeach()
set(${SRCS} ${${SRCS}} PARENT_SCOPE)
endfunction()
if(CMAKE_SIZEOF_VOID_P EQUAL 8)
set(_PROTOBUF_ARCH_DIR x64/)
endif()
# Internal function: search for normal library as well as a debug one
# if the debug one is specified also include debug/optimized keywords
# in *_LIBRARIES variable
function(_protobuf_find_libraries name filename)
find_library(${name}_LIBRARY
NAMES ${filename}
PATHS ${PROTOBUF_SRC_ROOT_FOLDER}/vsprojects/${_PROTOBUF_ARCH_DIR}Release)
mark_as_advanced(${name}_LIBRARY)
find_library(${name}_LIBRARY_DEBUG
NAMES ${filename}
PATHS ${PROTOBUF_SRC_ROOT_FOLDER}/vsprojects/${_PROTOBUF_ARCH_DIR}Debug)
mark_as_advanced(${name}_LIBRARY_DEBUG)
if(NOT ${name}_LIBRARY_DEBUG)
# There is no debug library
set(${name}_LIBRARY_DEBUG ${${name}_LIBRARY} PARENT_SCOPE)
set(${name}_LIBRARIES ${${name}_LIBRARY} PARENT_SCOPE)
else()
# There IS a debug library
set(${name}_LIBRARIES
optimized ${${name}_LIBRARY}
debug ${${name}_LIBRARY_DEBUG}
PARENT_SCOPE
)
endif()
endfunction()
# Internal function: find threads library
function(_protobuf_find_threads)
set(CMAKE_THREAD_PREFER_PTHREAD TRUE)
find_package(Threads)
if(Threads_FOUND)
list(APPEND PROTOBUF_LIBRARIES ${CMAKE_THREAD_LIBS_INIT})
set(PROTOBUF_LIBRARIES "${PROTOBUF_LIBRARIES}" PARENT_SCOPE)
endif()
endfunction()
#
# Main.
#
# By default have PROTOBUF_GENERATE_CPP macro pass -I to protoc
# for each directory where a proto file is referenced.
if(NOT DEFINED PROTOBUF_GENERATE_CPP_APPEND_PATH)
set(PROTOBUF_GENERATE_CPP_APPEND_PATH TRUE)
endif()
# Google's provided vcproj files generate libraries with a "lib"
# prefix on Windows
if(MSVC)
set(PROTOBUF_ORIG_FIND_LIBRARY_PREFIXES "${CMAKE_FIND_LIBRARY_PREFIXES}")
set(CMAKE_FIND_LIBRARY_PREFIXES "lib" "")
find_path(PROTOBUF_SRC_ROOT_FOLDER protobuf.pc.in)
endif()
# The Protobuf library
_protobuf_find_libraries(PROTOBUF protobuf)
#DOC "The Google Protocol Buffers RELEASE Library"
_protobuf_find_libraries(PROTOBUF_LITE protobuf-lite)
# The Protobuf Protoc Library
_protobuf_find_libraries(PROTOBUF_PROTOC protoc)
# Restore original find library prefixes
if(MSVC)
set(CMAKE_FIND_LIBRARY_PREFIXES "${PROTOBUF_ORIG_FIND_LIBRARY_PREFIXES}")
endif()
if(UNIX)
_protobuf_find_threads()
endif()
# Find the include directory
find_path(PROTOBUF_INCLUDE_DIR
google/protobuf/service.h
PATHS ${PROTOBUF_SRC_ROOT_FOLDER}/src
)
mark_as_advanced(PROTOBUF_INCLUDE_DIR)
# Find the protoc Executable
find_program(PROTOBUF_PROTOC_EXECUTABLE
NAMES protoc
DOC "The Google Protocol Buffers Compiler"
PATHS
${PROTOBUF_SRC_ROOT_FOLDER}/vsprojects/${_PROTOBUF_ARCH_DIR}Release
${PROTOBUF_SRC_ROOT_FOLDER}/vsprojects/${_PROTOBUF_ARCH_DIR}Debug
)
mark_as_advanced(PROTOBUF_PROTOC_EXECUTABLE)
include(${CMAKE_CURRENT_LIST_DIR}/FindPackageHandleStandardArgs.cmake)
FIND_PACKAGE_HANDLE_STANDARD_ARGS(Protobuf DEFAULT_MSG
PROTOBUF_LIBRARY PROTOBUF_INCLUDE_DIR)
if(PROTOBUF_FOUND)
set(PROTOBUF_INCLUDE_DIRS ${PROTOBUF_INCLUDE_DIR})
endif()
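As shown, the file defines functions such as PROTOBUF_GENERATE_CPP, PROTOBUF_GENERATE_PYTHON, and _protobuf_find_libraries. PROTOBUF_GENERATE_CPP compiles .proto files into C++ source files, and PROTOBUF_GENERATE_PYTHON generates Python files.
For example, to build the example above with cmake and compile the .proto file from CMakeLists.txt, write CMakeLists.txt as follows: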
cmake_minimum_required(VERSION 2.8)
PROJECT (protoTest)
set(CMAKE_CXX_STANDARD 14)
set(SRC_LIST main.cpp)
file(GLOB_RECURSE SRC_PROTOCOL_LIST ${CMAKE_CURRENT_SOURCE_DIR}/*.proto)
message(***********${SRC_PROTOCOL_LIST}***********)
# Find required protobuf package
find_package(Protobuf REQUIRED)
if(PROTOBUF_FOUND)
    message(STATUS "protobuf library found")
else()
    message(FATAL_ERROR "protobuf library is needed but can't be found")
endif()
include_directories(${PROTOBUF_INCLUDE_DIRS})
link_libraries(${PROTOBUF_LIBRARIES})
include_directories(${CMAKE_CURRENT_BINARY_DIR})
PROTOBUF_GENERATE_CPP(PROTO_SRCS PROTO_HDRS ${SRC_PROTOCOL_LIST})
add_executable(protoTest ${SRC_LIST} ${PROTO_SRCS} ${PROTO_HDRS})
target_link_libraries(protoTest ${PROTOBUF_LIBRARIES})
Output
suteng@suteng:~/Documents/findprotobuf/build$ ./protoTest
hello,world
0
Reference: https://blog.csdn.net/piaopiaopiaopiaopiao/article/details/84347377