Using TensorFlow's C++ library for model inference, starting from zero

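Background

We had been using MNN as our inference framework all along, with good results!

Recently we needed to run a model on a server, so we naturally thought of MNN's GPU backends, mainly OpenCL and Vulkan. We had already verified both backends on Android phones, so we assumed that getting them to work on x86 would be just as smooth.

Then reality slapped us in the face, and we only discovered the problems shortly before the project deadline.

In a panic, all we could do was fall back on TensorFlow's C++ library as an emergency fix.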

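Main content

1. Building the TF C++ libraries

We suggest following these two articles:
"TF C++ shared library build, explained from start to finish"
"Building the TensorFlow C++ shared library"
1 Download TensorFlow 1.15.0
2 Install Bazel, version 0.26.0 or lower
3 Run configure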

./configure

4 Build the two libraries

bazel build --config=opt //tensorflow:libtensorflow_cc.so
bazel build --config=opt //tensorflow:libtensorflow_framework.so

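5 Build the other dependencies

6 Organize the libraries and headers

Once the build has finished successfully, it is best to collect the headers and the libraries into separate include and lib folders so they are easy to use later.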

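2. Loading the graph and creating the Session

There are two main steps:

// Load the graph.
// model_path is the path to the model's .pb file.
tensorflow::GraphDef graphdef;
tensorflow::Status status_load = tensorflow::ReadBinaryProto(tensorflow::Env::Default(), model_path, &graphdef);

// Create the Session. NewSession allocates it; calling Create through an
// uninitialized pointer would be undefined behavior.
tensorflow::Session *session = nullptr;
tensorflow::Status status_new = tensorflow::NewSession(tensorflow::SessionOptions(), &session);
tensorflow::Status status_create = session->Create(graphdef);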

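3. Building the input tensor

If we treat model inference as a black-box function, then creating the Session defines that function and inference is a call to it. To get a result we have to build a suitable input, which for TF is:

std::vector<std::pair<std::string, tensorflow::Tensor> > input

So input is a vector whose elements are key-value pairs: the key is the tensor's name and the value is the Tensor.

Building the input therefore takes two steps: construct the Tensor, then construct the key-value pair.
We usually accept two kinds of input: cv::Mat and unsigned char*. Whatever the type, there are two steps: convert the data to float, then copy the values into the Tensor's data buffer. Taking construction from a cv::Mat as the example: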

Tensor EdgeInfer::ReadTensorFromImageMat(cv::Mat img)
{
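    // Assumes img is already resized to mInputWidth x mInputHeight with mChannel channels.
    // convertTo only uses the depth of the requested type, so the channel count is kept.
    img.convertTo(img, CV_32F);
    // Caution: if mInputMean is a plain number, OpenCV converts it to cv::Scalar(mean, 0, 0, 0),
    // so on a multi-channel image only channel 0 is shifted; cv::Scalar::all(mean) covers every channel.
    img = (img - mInputMean)/mInputStd;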
    tensorflow::Tensor input_tensor(tensorflow::DT_FLOAT, tensorflow::TensorShape({1, mInputHeight, mInputWidth, mChannel}));
    auto input_tensor_mapped = input_tensor.tensor<float, 4>();

    const float *source_data = (float *)img.data;

    for (int y = 0; y < mInputHeight; ++y)
    {
        const float *source_row = source_data + (y * mInputWidth * mChannel);
        for (int x = 0; x < mInputWidth; ++x)
        {
            const float *source_pixel = source_row + (x * mChannel);
            for (int c = 0; c < mChannel; ++c)
            {
                const float *source_value = source_pixel + c;
                input_tensor_mapped(0, y, x, c) = *source_value;
            }
        }
    }
    return input_tensor;
}

Then wrap the Tensor in a key-value pair:

input.push_back(std::pair<std::string, Tensor>(node_name, input_tensor));
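The cv::Mat case is shown above; for the unsigned char* case mentioned earlier, the same two steps (convert to float, copy in NHWC order) might look roughly like the sketch below. It deliberately avoids TensorFlow and writes into a std::vector<float> instead of a Tensor so that it stands alone. The function name BytesToNhwcFloats and the mean/stddev parameters are made up for illustration and stand in for mInputMean and mInputStd.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of TensorFlow or OpenCV): converts an
// interleaved 8-bit image in HWC order into normalized floats laid out
// as NHWC with N = 1.
std::vector<float> BytesToNhwcFloats(const unsigned char* pixels,
                                     int height, int width, int channels,
                                     float mean, float stddev) {
    assert(pixels != nullptr && stddev != 0.0f);
    const std::size_t count =
        static_cast<std::size_t>(height) * width * channels;
    std::vector<float> out(count);
    // With N = 1, NHWC and HWC share the same flat layout:
    // index = (y * width + x) * channels + c, so one linear pass is enough.
    for (std::size_t i = 0; i < count; ++i) {
        out[i] = (static_cast<float>(pixels[i]) - mean) / stddev;
    }
    return out;
}
```

With the real library, the resulting floats could then be copied into the tensor's buffer, for example through input_tensor.flat<float>().data(), instead of being kept in a separate vector.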

4. Running the Session

Once input is ready, running the Session is a one-liner:

tensorflow::Status status = session->Run(input, {output_node}, {}, &outputs);

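Parameters:

input: the input constructed above
{output_node}: a vector of the names of the nodes to fetch; their values are returned in outputs
{}: the third parameter holds the names of target nodes; the graph is run up to those nodes, but nothing is returned for them
outputs: the returned tensors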

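5. Getting the output

As mentioned, the returned tensors all arrive in the fourth argument of Session->Run(). As with the input, we usually do not work on the Tensor directly but convert it to a cv::Mat or a float*. Again there are two steps: get the shape and data pointer of each output tensor, then write the data out in the target format (see the articles under References):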

int tfTensor2cvMat(const tensorflow::Tensor& inputTensor, cv::Mat& output)
{
	tensorflow::TensorShape inputTensorShape = inputTensor.shape();
	if (inputTensorShape.dims() != 4)
	{
		return -1;
	}

	int height = inputTensorShape.dim_size(1);
	int width = inputTensorShape.dim_size(2);
	int depth = inputTensorShape.dim_size(3);

	output = cv::Mat(height, width, CV_32FC(depth));
	auto inputTensorMapped = inputTensor.tensor<float, 4>();
	float* data = (float*)output.data;
	for (int y = 0; y < height; ++y)
	{
		float* dataRow = data + (y * width * depth);
		for (int x = 0; x < width; ++x)
		{
			float* dataPixel = dataRow + (x * depth);
			for (int c = 0; c < depth; ++c)
			{
				float* dataValue = dataPixel + c;
				*dataValue = inputTensorMapped(0, y, x, c);
			}
		}
	}
	return 0;
}

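Main problems

1. Runtime C++ library errors caused by the OS version

For example, a shared library built on Ubuntu 18.04 and then run on 16.04 fails with a series of runtime-library errors, all pointing at the C/C++ runtime (glibc and libstdc++). 18.04 ships newer versions of these libraries than 16.04, and a binary built against the newer versions needs symbol versions that the older system does not provide.

Note that the other direction does work: in our own tests, a shared library built on 16.04 also runs fine on 18.04.

One more thing: do not try to upgrade the system runtime libraries. Do not try to upgrade them. Do not try to upgrade them. Just rebuild TensorFlow for the target system.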
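When chasing this kind of mismatch, it can help to print which runtime a binary actually sees on each machine. The sketch below is one way to do that, assuming a Linux system with glibc and libstdc++; the function names are made up, and on other platforms the guards simply fall back to placeholder values.

```cpp
#include <cassert>
#include <cstdlib>   // on glibc systems this pulls in the __GLIBC__ feature macro
#include <string>
#if defined(__GLIBC__)
#include <gnu/libc-version.h>
#endif

// Returns the glibc version the process is running against,
// or "unknown" on platforms that do not use glibc.
std::string RuntimeGlibcVersion() {
#if defined(__GLIBC__)
    return gnu_get_libc_version();
#else
    return "unknown";
#endif
}

// Returns the libstdc++ date stamp (__GLIBCXX__) this binary was compiled
// against, or 0 when a different standard library is in use.
long BuildLibstdcxxStamp() {
#if defined(__GLIBCXX__)
    return __GLIBCXX__;
#else
    return 0L;
#endif
}
```

Printing both values on the build machine and on the deployment machine gives a quick hint as to whether the library was built against a newer runtime than the one it is being run on.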

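2. Error when reading data from the output tensor

This error is very typical and easy to run into, and we could not find a good solution online, so we record the problem and our fix here.

The error is: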

Check failed: NDIMS == dims() (4 vs. 2)Asking for tensor of 4 dimensions from a tensor of 2 dimensions

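At first glance this is an error raised from inside TensorFlow, which seems hard to fix. Tracing through the layers, we found the root cause: the shape we assumed for the tensor did not match its actual shape. Specifically, we treated the tensor as NHWC, but it was really a batch of n one-dimensional vectors (rank 2), and that triggers the error above.

The trace through the source is as follows:

Every tensor is treated as NHWC

auto inputTensorMapped = inputTensor.tensor<float, 4>();

This calls Tensor::tensor()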

template <typename T, size_t NDIMS>
typename TTypes<T, NDIMS>::Tensor Tensor::tensor() {
  CheckTypeAndIsAligned(DataTypeToEnum<T>::v());
  return typename TTypes<T, NDIMS>::Tensor(base<T>(),
                                           shape().AsEigenDSizes<NDIMS>());
}

which calls AsEigenDSizes

template <int NDIMS, typename IndexType>
Eigen::DSizes<IndexType, NDIMS> TensorShape::AsEigenDSizes() const {
  CheckDimsEqual(NDIMS);
  return AsEigenDSizesWithPadding<NDIMS, IndexType>();
}

which calls CheckDimsEqual

void TensorShape::CheckDimsEqual(int NDIMS) const {
  CHECK_EQ(NDIMS, dims()) << "Asking for tensor of " << NDIMS << " dimensions"
                          << " from a tensor of " << dims() << " dimensions";
}

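And there is our familiar error!
The fix is to handle each possible case separately: call tensor.shape() to get the shape, read the rank from shape.dims(), and then handle each rank in its own branch. Note that because Eigen is heavily template-based, the rank in inputTensor.tensor<float, 4>() is a template argument, so it must be a compile-time constant; a value only known at run time cannot be passed there.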
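The pattern of going from a runtime rank to a compile-time template argument can be sketched without TensorFlow as follows. FakeTensor, CountElements and CountElementsAnyRank are hypothetical names used only for illustration: FakeTensor stands in for tensorflow::Tensor, and the assert plays the role of CheckDimsEqual.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for tensorflow::Tensor: a runtime shape plus flat data.
struct FakeTensor {
    std::vector<int> shape;
    std::vector<float> data;
};

// The rank is a template parameter, like NDIMS in Tensor::tensor<T, NDIMS>(),
// so it has to be known at compile time.
template <int NDIMS>
std::size_t CountElements(const FakeTensor& t) {
    assert(static_cast<int>(t.shape.size()) == NDIMS);  // mirrors CheckDimsEqual
    std::size_t n = 1;
    for (int d = 0; d < NDIMS; ++d) {
        n *= static_cast<std::size_t>(t.shape[d]);
    }
    return n;
}

// Bridges the runtime rank to the compile-time argument with one branch
// per supported rank. Returns -1 for ranks this sketch does not handle.
long CountElementsAnyRank(const FakeTensor& t) {
    switch (t.shape.size()) {
        case 1: return static_cast<long>(CountElements<1>(t));
        case 2: return static_cast<long>(CountElements<2>(t));
        case 3: return static_cast<long>(CountElements<3>(t));
        case 4: return static_cast<long>(CountElements<4>(t));
        default: return -1;
    }
}
```

In real code each branch would call inputTensor.tensor<float, N>() with the matching constant N. When only the raw values are needed and the layout does not matter, inputTensor.flat<float>() is another option, since it does not depend on the rank.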

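Summary

We gave a rough introduction to using TensorFlow's shared library to load a model and to build its inputs and read its outputs. On that basis we analyzed two pitfalls you may run into, along with how to deal with them. For the "Asking for tensor of ..." class of errors in particular, we traced the cause in detail and described the fix.

Hope this helps!

References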

https://blog.csdn.net/heiheiya/article/details/89454884
https://zhuanlan.zhihu.com/p/42187985
https://zhuanlan.zhihu.com/p/58570658
https://zhuanlan.zhihu.com/p/91892469
https://blog.csdn.net/u011285477/article/details/93975689#整理庫文件和頭文件
