Pitfalls when converting PyTorch to TensorRT

Original

永恒_一瞬

2020-06-16 11:03

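Converting a PyTorch model to ONNX and then to TensorRT runs into a number of layers that have to be modified before the conversion succeeds. This post summarizes the layers that needed changes.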

PyTorch and TensorRT versions

TensorRT's ONNX parser is built to match specific PyTorch versions. If the versions do not correspond, model conversion may fail with errors such as:

While parsing node number 0 [Conv]:
ERROR: ModelImporter.cpp:288 In function importModel:
[5] Assertion failed: tensors.count(input_name)
[E] failed to parse onnx file
[E] Engine could not be created

The version pairs known to work so far are:

Pytorch	TensorRT
1.2	6.0.1.5
1.3	7.0.0.11
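As a small convenience, the table above can be kept in code and checked before export. This is only a sketch: `KNOWN_PAIRS` and `expected_tensorrt` are hypothetical names, and the mapping contains nothing beyond the two pairs listed in this post.

```python
# Known-good pairs copied from the table above; not an exhaustive or official list.
KNOWN_PAIRS = {"1.2": "6.0.1.5", "1.3": "7.0.0.11"}


def expected_tensorrt(torch_version):
    """Return the TensorRT version paired with a PyTorch version string, or None."""
    # Keep only major.minor, e.g. "1.3.1+cu100" -> "1.3"
    major_minor = ".".join(torch_version.split("+")[0].split(".")[:2])
    return KNOWN_PAIRS.get(major_minor)


print(expected_tensorrt("1.2.0"))
print(expected_tensorrt("1.3.1+cu100"))
print(expected_tensorrt("1.5.0"))  # not in the table
```

In a real script the argument would typically be `torch.__version__`.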

reshape

PyTorch models often need reshape operations. The available operators are:

reshape
view
flatten

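The first two require the complete target shape to be specified, so the code first has to query the dimensions of the input. This works fine within PyTorch, but in TensorRT those dimensions become dynamic values that cannot be optimized, and the conversion fails with: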

While parsing node number xxx [Gather]:
ERROR: onnx2trt_utils.hpp:277 In function convert_axis:
[8] Assertion failed: axis >= 0 && axis < nbDims

This error is fairly common in TensorRT conversion. The current workaround is to replace these operations with flatten on the PyTorch side. Note:

If the result of the reshape is 2-D, [B, C x H x W], use flatten(1);
If the result of the reshape is 3-D, [B, C, H x W], use flatten(2).
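The two rules above can be checked with a short sketch. The tensor sizes are arbitrary example values; the point is that flatten yields the same data as view without reading the trailing shape explicitly.

```python
import torch

x = torch.randn(2, 3, 4, 5)  # example input laid out as [B, C, H, W]
B, C = x.shape[0], x.shape[1]

flat2d = x.flatten(1)  # [B, C*H*W]
flat3d = x.flatten(2)  # [B, C, H*W]
print(flat2d.shape, flat3d.shape)

# Same values as the view-based versions that need explicit sizes
same2d = torch.equal(flat2d, x.view(B, -1))
same3d = torch.equal(flat3d, x.view(B, C, -1))
print(same2d, same3d)
```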

normalize

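In PyTorch, torch.nn.functional.normalize(X) performs the normalization, but TensorRT does not support it (the function appears to be treated as experimental). The normalization can instead be written out step by step, so it can be replaced with: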

X = X.div(X.norm(p=2, dim=1, keepdim=True))
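A quick sanity check that the replacement matches the original call, as a minimal sketch with an arbitrary random input. One known difference: F.normalize clamps the norm from below with a small eps (1e-12 by default), so the two only diverge for rows whose norm is essentially zero.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
X = torch.randn(4, 8)  # example batch of 4 feature vectors

reference = F.normalize(X)  # defaults: p=2, dim=1
manual = X.div(X.norm(p=2, dim=1, keepdim=True))

matches = torch.allclose(reference, manual, atol=1e-6)
print(matches)
```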

MaxPool3d

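MaxPool3d was not supported before TensorRT 6, but support has since been added. However, some networks do not use it on 5-D data [B, C, D, H, W]. Instead they rely on a quirk (arguably a bug) in how PyTorch handles the input and apply the pooling to the last three dimensions of 4-D data [B, C, H, W]. Such an operation triggers input-dimension errors in TensorRT, as well as in earlier ONNX versions: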

RuntimeError: [ONNXRuntimeError] : 1 : GENERAL ERROR : failed:
[ShapeInferenceError] Attribute strides has incorrect size

all concat input tensors must have the same dimensions except on the concatenation axis (0), but dimensions mismatched at input 2 at index 1. Input 0 shape: [xxx], Input 2 shape: [xxx]
While parsing node number xx [Conv]:
ERROR: builtin_op_importers.cpp:788 In function importConv:
[8] Assertion failed: (nbSpatialDims == 2 && kernel_weights.shape.nbDims == 4) || (nbSpatialDims == 3 && kernel_weights.shape.nbDims == 5)

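If the 3-D max pooling really operates over all of the last three dimensions, it may be necessary to apply MaxPool2d and then operate on the channels; this still needs investigation;
If MaxPool3d has kernel_size (1, x, y) and stride (1, m, n), it is equivalent to MaxPool2d;
If MaxPool3d has kernel_size (1, x, y) and stride (L, m, n), it actually subsamples the second dimension (the channels, [:, C, :, :]) and then applies MaxPool2d.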

The third case requires selecting channels from the PyTorch tensor and then applying MaxPool2d. The channel operations found in PyTorch so far are:

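torch.unbind(tensor, dim=0): removes a dimension, returning a tuple of the slices along it
torch.index_select(input, dim, index, out=None): selects slices along one dimension and assembles them into a new tensor. The size of the result along that dimension equals the length of index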

In testing, it is not yet clear whether torch.unbind works for this purpose, while torch.index_select does, so the following can be used:

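import torch
import torch.nn as nn

# Inside a module's forward(): keep every self.step-th channel below self.stop, then pool in 2-D
x = nn.MaxPool2d(kernel_size, stride=stride)(x.index_select(1, torch.tensor(range(0, self.stop, self.step))))

# In practice, with a pretrained model, adding a bare index_select op shifts the indices of the following layers so they no longer match the checkpoint
# The MaxPool3d op therefore has to be replaced by a single module that wraps both MaxPool2d and index_select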
class MaxPool2dStep(nn.Module):
    def __init__(self, kernel_size, stride, step, stop):
        super(MaxPool2dStep, self).__init__()
        self.step = step
        self.stop = stop
        self.block = nn.Sequential(
            nn.MaxPool2d(kernel_size, stride=stride)
        )

    def forward(self, x):
        return self.block(x.index_select(1, torch.tensor(range(0, self.stop, self.step))))
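The second and third cases can be checked numerically with the sketch below. It makes two assumptions: the 4-D use of MaxPool3d is emulated by adding a leading dimension (so the check does not depend on how a given PyTorch version treats 4-D input), and `pool3d_on_4d` is a hypothetical helper name. The wrapper is a compact variant of the class above that builds the index with torch.arange.

```python
import torch
import torch.nn as nn


class MaxPool2dStep(nn.Module):
    """Keep every `step`-th channel below `stop`, then apply MaxPool2d."""

    def __init__(self, kernel_size, stride, step, stop):
        super().__init__()
        self.step = step
        self.stop = stop
        self.pool = nn.MaxPool2d(kernel_size, stride=stride)

    def forward(self, x):
        index = torch.arange(0, self.stop, self.step, device=x.device)
        return self.pool(x.index_select(1, index))


def pool3d_on_4d(x, kernel_size, stride):
    # Emulate MaxPool3d on 4-D data: treat [B, C, H, W] as a single 5-D sample
    return nn.MaxPool3d(kernel_size, stride=stride)(x.unsqueeze(0)).squeeze(0)


torch.manual_seed(0)
x = torch.randn(2, 6, 8, 8)  # example [B, C, H, W]

# Case 2: stride 1 along the channel axis behaves like plain MaxPool2d
case2 = torch.equal(pool3d_on_4d(x, (1, 2, 2), (1, 2, 2)),
                    nn.MaxPool2d(2, stride=2)(x))

# Case 3: stride L=2 along the channel axis is channel subsampling + MaxPool2d
wrapped = MaxPool2dStep(2, 2, step=2, stop=x.shape[1])(x)
case3 = torch.equal(pool3d_on_4d(x, (1, 2, 2), (2, 2, 2)), wrapped)
print(case2, case3, wrapped.shape)
```

Here `step` plays the role of L and `stop` is the channel count C.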

mean (ReduceMean)

torch.mean(x) returns the mean over the whole tensor, but converting it to TensorRT fails with:

While parsing node number xx [ReduceMean]:
ERROR: onnx2trt_utils.hpp:347 In function convert_axis:
[8] Assertion failed: axis >= 0 && axis < nbDims

Several modifications were tried:

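torch.mean(x, 1, keepdim=True). Converts, but the result differs from the unmodified model, because this is a tensor of means taken over the second dimension rather than a single constant;
torch.mean(f_pow).expand(1, 1, f_pow.shape[2], f_pow.shape[3]). The Python result is identical, but it cannot be converted to TensorRT, so the previous variant did not convert simply because the number of dimensions matched;
torch.mean(f_pow.flatten(1), 1, keepdim=True). Converts, and the result is identical; the mean is a tensor of shape [1, 1].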

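The guess, therefore, is that TensorRT does support dividing a tensor by a constant, but handles the batch-size dimension (the first, outermost dimension) differently from the inner data dimensions, so the constant has to keep at least 2 dimensions. This hypothesis still needs further verification.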
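On the PyTorch side, the third variant can be compared against the original call with a short sketch. It assumes batch size 1, as the [1, 1] shape above implies; with a larger batch the flattened version produces one mean per sample and is no longer the global mean. It says nothing about TensorRT's behavior, only that the rewrite preserves the values.

```python
import torch

torch.manual_seed(0)
f_pow = torch.randn(1, 3, 4, 4) ** 2  # example input with batch size 1

scalar_mean = torch.mean(f_pow)                            # 0-d tensor
kept_mean = torch.mean(f_pow.flatten(1), 1, keepdim=True)  # shape [1, 1]
print(scalar_mean.shape, kept_mean.shape)

# Same value, and the same result when used as a divisor (via broadcasting)
same_value = torch.allclose(scalar_mean, kept_mean.squeeze())
same_division = torch.allclose(f_pow / scalar_mean, f_pow / kept_mean)
print(same_value, same_division)
```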

BatchNorm1d

With a model exported to ONNX from PyTorch 1.2.0, serializing to TensorRT fails with:

Integer division by zero

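After adjusting the model, the problem was traced to BatchNorm1d. The last layers of an ArcFace model are usually FC + BN, so the tensor has to be flattened first. Visualizing the exported ONNX model shows, however, that the final BatchNorm1d is actually exported as the ONNX BatchNormalization operation, which requires raising the tensor's dimensionality first and lowering it again afterwards.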

But an operation like "Unsqueeze" causes problems when converting to TensorRT:
Error importing onnx::Unsqueeze #180
Error importing onnx::Unsqueeze layer
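One approach, as in the TensorRT workaround for ONNX "Batchnorm1d", is to implement the final FC as a Conv, so the data does not have to be flattened beforehand; it is flattened only after the BatchNorm operation. This requires adjusting the model structure before training.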
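The FC-as-Conv idea can be sketched as follows. This is an illustration under stated assumptions rather than the linked workaround itself: the sizes are arbitrary, the convolution kernel covers the whole feature map, and the weights are copied from an existing Linear layer only to show that the two paths agree in eval mode.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
B, C, H, W, D = 2, 4, 3, 3, 5  # example sizes; D is the embedding size

# Original head: flatten -> FC -> BatchNorm1d
fc = nn.Linear(C * H * W, D)
bn1d = nn.BatchNorm1d(D)
bn1d.running_mean.uniform_(-1, 1)   # pretend these were learned
bn1d.running_var.uniform_(0.5, 2)

# Rewritten head: Conv covering the full H x W map -> BatchNorm2d -> flatten
conv = nn.Conv2d(C, D, kernel_size=(H, W))
bn2d = nn.BatchNorm2d(D)
with torch.no_grad():
    conv.weight.copy_(fc.weight.view(D, C, H, W))
    conv.bias.copy_(fc.bias)
bn2d.load_state_dict(bn1d.state_dict())

for module in (fc, bn1d, conv, bn2d):
    module.eval()

x = torch.randn(B, C, H, W)
with torch.no_grad():
    original = bn1d(fc(x.flatten(1)))
    rewritten = bn2d(conv(x)).flatten(1)

matches = torch.allclose(original, rewritten, atol=1e-4)
print(matches, rewritten.shape)
```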

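The other approach is to redefine the BatchNorm1d operation. If the model is already trained and changing the structure and retraining is not an option, the BatchNorm1d implementation has to be changed so that the ONNX export does not raise and lower the tensor's dimensionality. Combining the posts "pytorch nn.BatchNorm1d differs from a manual Python implementation: a fix" and "class MyBatchNorm2d(nn.BatchNorm2d)", a custom BatchNorm1d can be written as: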

import torch
import torch.nn as nn

class MyBatchNorm1d(nn.BatchNorm1d):
    def __init__(self, num_features):
        super(MyBatchNorm1d, self).__init__(num_features)
        self.eps = 1e-5
        self.affine = True

    def forward(self, input):
        self._check_input_dim(input)
        # Inference only: normalize with the stored running statistics,
        # written as element-wise ops to avoid the dimension changes in the export
        input = (input - self.running_mean) / (torch.sqrt(self.running_var + self.eps))
        if self.affine:
            # Apply the learned scale and shift
            input = input * self.weight + self.bias
        return input
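Because the custom forward only uses the running statistics, it should match nn.BatchNorm1d in eval mode (not in training mode, where batch statistics are used). A minimal sketch of that check, with random values standing in for a trained layer:

```python
import torch
import torch.nn as nn


class MyBatchNorm1d(nn.BatchNorm1d):
    def forward(self, input):
        self._check_input_dim(input)
        out = (input - self.running_mean) / torch.sqrt(self.running_var + self.eps)
        if self.affine:
            out = out * self.weight + self.bias
        return out


torch.manual_seed(0)
reference = nn.BatchNorm1d(8)
with torch.no_grad():
    # Stand-ins for trained parameters and statistics
    reference.running_mean.uniform_(-1, 1)
    reference.running_var.uniform_(0.5, 2)
    reference.weight.uniform_(0.5, 1.5)
    reference.bias.uniform_(-1, 1)
reference.eval()

# The subclass keeps the same parameter names, so the state dict loads directly
custom = MyBatchNorm1d(8)
custom.load_state_dict(reference.state_dict())
custom.eval()

x = torch.randn(4, 8)
with torch.no_grad():
    matches = torch.allclose(reference(x), custom(x), atol=1e-5)
print(matches)
```

Since the state dict keys are unchanged, a pretrained checkpoint can be loaded after swapping the layer class.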
