Deploying a deep learning model on Docker with TensorFlow Serving

Original

Grack_skw

2020-07-07 00:32

Official documentation: https://www.tensorflow.org/tfx/serving/docker

Official example

Deploying your own model

Official example

First, pull the official TensorFlow Serving image:

docker pull tensorflow/serving

Clone the repository locally:

mkdir -p /tmp/tfserving
cd /tmp/tfserving
git clone https://github.com/tensorflow/serving

Run the image:

docker run -p 8501:8501 \
  --mount type=bind,\
source=/tmp/tfserving/serving/tensorflow_serving/servables/tensorflow/testdata/saved_model_half_plus_two_cpu,\
target=/models/half_plus_two \
  -e MODEL_NAME=half_plus_two -t tensorflow/serving &

The parts of the command are explained below:

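--mount:    bind-mounts a host directory into the container
source:     the host directory that holds the model to deploy, i.e. the source of the mount
target:     the destination of the mount, i.e. the directory inside the Docker container where the model appears
-e:         sets an environment variable in the container; MODEL_NAME tells the server which model under /models to load
-t:         allocates a pseudo-TTY for the container; the image to run, tensorflow/serving, is the argument that follows it
-p:         maps a host port to a container port; 8501 is the port the image uses for the REST API
docker run: creates and starts the container, which in turn starts the model server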
 
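In summary:
         the example model in the source directory is mounted at the target directory inside a container started from the tensorflow/serving image, and the model server is started.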
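
To make the relationship between these options concrete, here is a minimal Python sketch that assembles the same argument list from a host directory and a model name. The helper name build_serving_command and its parameters are hypothetical, introduced only for illustration. It assumes the image's default layout, in which the server looks for the model under /models/<MODEL_NAME>.

```python
def build_serving_command(model_dir, model_name, host_port=8501):
    """Return the docker run argument list used in this post (hypothetical helper)."""
    # The bind mount maps the host model directory onto /models/<model_name>,
    # which is where the tensorflow/serving image looks for the model by default.
    mount = "type=bind,source={},target=/models/{}".format(model_dir, model_name)
    return [
        "docker", "run",
        "-p", "{}:8501".format(host_port),   # host port -> container REST port
        "--mount", mount,
        "-e", "MODEL_NAME={}".format(model_name),
        "-t",                                # allocate a pseudo-TTY
        "tensorflow/serving",                # the image to run
    ]


if __name__ == "__main__":
    cmd = build_serving_command(
        "/tmp/tfserving/serving/tensorflow_serving/servables/tensorflow/testdata/saved_model_half_plus_two_cpu",
        "half_plus_two",
    )
    print(" ".join(cmd))
```

The list form can be passed to subprocess.run as is, which avoids shell-specific quoting and line-continuation differences.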

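At this point an error appears (I was running the command in Windows PowerShell), saying that the & must be wrapped in double quotes: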

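At line:1 char:208
+ ... s/half_plus_three -e MODEL_NAME=half_plus_two -t tensorflow/serving &
+                                                                         ~
The ampersand (&) character is not allowed. The & operator is reserved for future use; wrap an ampersand in double quotation marks ("&") to pass it as part of a string.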
    + CategoryInfo          : ParserError: (:) [], ParentContainsErrorRecordException
    + FullyQualifiedErrorId : AmpersandNotAllowed

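Wrap the & in double quotes. Note that this passes & to the container as a literal argument rather than putting the process in the background; docker run -d is the usual way to run a container detached.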

 docker run -p 8501:8501 --name="half_plus_two" --mount type=bind,source=E:/tmp/tfserving/serving/tensorflow_serving/servables/tensorflow/testdata/saved_model_half_plus_two_cpu,target=/models/half_plus_two -e MODEL_NAME=half_plus_two -t tensorflow/serving "&"

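Next, the query failed on -X. In Windows PowerShell, curl is an alias for Invoke-WebRequest, which does not accept curl's flags: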

Invoke-WebRequest : A parameter cannot be found that matches parameter name 'X'.
At line:1 char:42
+ curl -d '{"instances": [1.0, 2.0, 5.0]}' -X POST http://localhost:850 ...
+                                          ~~
    + CategoryInfo          : InvalidArgument: (:) [Invoke-WebRequest], ParameterBindingException
    + FullyQualifiedErrorId : NamedParameterNotFound,Microsoft.PowerShell.Commands.InvokeWebRequestCommand

Following https://www.shuijingwanwq.com/2018/12/28/3077/, I changed the command to:

curl -Body '{"instances": [1.0, 2.0, 5.0]}' -Uri http://localhost:8501/v1/models/half_plus_two:predict -Method 'POST'

Alternatively, use requests.post from Python:

import requests
url = 'http://localhost:8501/v1/models/half_plus_two:predict'
query_data = '{"instances": [1.0, 2.0, 5.0]}'  # same payload as the curl command above
response = requests.post(url, data=query_data)
print(response.text)

This works. The result is:

StatusCode        : 200
StatusDescription : OK
Content           : {
                        "predictions": [2.5, 3.0, 4.5
                        ]
                    }
RawContent        : HTTP/1.1 200 OK
                    Content-Length: 43
                    Content-Type: application/json
                    Date: Mon, 22 Jun 2020 07:58:23 GMT

                    {
                        "predictions": [2.5, 3.0, 4.5
                        ]
                    }
Forms             : {}
Headers           : {[Content-Length, 43], [Content-Type, application/json], [Date, Mon, 22 Jun 2020 07:58:23 GMT]}
Images            : {}
InputFields       : {}
Links             : {}
ParsedHtml        : mshtml.HTMLDocumentClass
RawContentLength  : 43
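
For scripted use, the request and the response handling can be sketched in Python. This assumes the requests package is installed and that the server started above is listening on localhost:8501. The function names are hypothetical and chosen only for this illustration.

```python
import json


def predict_url(model_name, host="localhost", port=8501):
    """Build the REST predict endpoint for a served model."""
    return "http://{}:{}/v1/models/{}:predict".format(host, port, model_name)


def parse_predictions(body):
    """Extract the "predictions" field from a JSON response body."""
    return json.loads(body)["predictions"]


def query(model_name, instances):
    """POST instances to the server and return its predictions (needs a running server)."""
    import requests  # third-party; imported here so the helpers above work without it
    response = requests.post(predict_url(model_name),
                             data=json.dumps({"instances": instances}))
    response.raise_for_status()  # surface HTTP errors instead of parsing an error body
    return parse_predictions(response.text)


if __name__ == "__main__":
    # Parse the response body shown above without contacting a server.
    sample = '{\n    "predictions": [2.5, 3.0, 4.5\n    ]\n}'
    print(parse_predictions(sample))  # [2.5, 3.0, 4.5]
```

With the container running, query("half_plus_two", [1.0, 2.0, 5.0]) sends the same request as the commands above.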

Deploying your own model

You only need to save your own model in the same layout as the example. The example's file structure is as follows.

The model directory contains a folder named 00000123 (the model version), which holds the assets and variables folders and the saved_model.pb file:

E:.
└─00000123
    │  saved_model.pb
    │
    ├─assets
    │      foo.txt
    │
    └─variables
            variables.data-00000-of-00001
            variables.index

assets contains only foo.txt, whose content is:

asset-file-contents
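
Before mounting a directory, it can help to check that it follows this layout. The sketch below lists the numeric version folders that contain both saved_model.pb and a variables folder. The function name is hypothetical, and it deliberately does not check for assets.

```python
import os


def find_model_versions(model_dir):
    """Return the numeric version subfolders of model_dir that look like a SavedModel."""
    versions = []
    for name in sorted(os.listdir(model_dir)):
        version_dir = os.path.join(model_dir, name)
        has_graph = os.path.isfile(os.path.join(version_dir, "saved_model.pb"))
        has_variables = os.path.isdir(os.path.join(version_dir, "variables"))
        # Only all-digit folder names are treated as versions.
        if name.isdigit() and has_graph and has_variables:
            versions.append(name)
    return versions
```

Calling find_model_versions on the directory passed as source should return at least one entry; for the example model shown above it returns ['00000123'].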

Because I trained the network with Keras, the .h5 model first has to be converted to a TensorFlow .pb model file. The code is:

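# Assumes standalone Keras on TensorFlow 1.x; use tensorflow.keras instead
# if that is what the model was trained with.
import tensorflow as tf
from keras import backend as K
from keras.models import load_model

def model_h52pb():
    model_path = r'D:\work\model\my_model.h5'

    K.set_learning_phase(0)  # inference mode
    # my_acc is the custom metric used in training; it must be defined or imported here
    model = load_model(model_path, custom_objects={'my_acc': my_acc})  # load the .h5 model
    export_path = r'D:\work\model\tf_model'

    with K.get_session() as sess:
        # writes saved_model.pb and the variables folder into export_path
        tf.saved_model.simple_save(
            sess,
            export_path,
            inputs={'input': model.input},
            outputs={t.name: t for t in model.outputs})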

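This produces a variables folder and a saved_model.pb file in exactly the same format as the example; the assets folder can simply be copied over. As in the example, these files sit inside a numeric version folder under the model directory. That takes care of the file layout.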
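
The exported files still have to end up inside a numeric version folder, as in the example tree. A minimal sketch of that step follows. The helper name, the paths in the usage note, and the eight-digit zero padding (copied from the example's 00000123) are assumptions for illustration.

```python
import os
import shutil


def stage_model_version(export_dir, model_dir, version=1):
    """Copy an exported SavedModel into model_dir/<version>/ and return that path."""
    target = os.path.join(model_dir, "{:08d}".format(version))
    # copytree creates the target folder and raises if it already exists,
    # so an existing version is never silently overwritten.
    shutil.copytree(export_dir, target)
    return target
```

For example, stage_model_version(r'D:\work\model\tf_model', r'E:\tmp\tfserving\my_model', 1) would place the export under my_model\00000001, ready to be used as the mount source.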

The official example is started with:

 docker run -p 8501:8501 --mount type=bind,source=E:/tmp/tfserving/serving/tensorflow_serving/servables/tensorflow/testdata/saved_model_half_plus_two_cpu,target=/models/half_plus_two -e MODEL_NAME=half_plus_two -t tensorflow/serving "&"

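source is the path of the model on the host, so just replace it with the path to our own model. target is the model's path inside the container, and MODEL_NAME is needed again when querying. Here the container is named my_model and the model name (MODEL_NAME) is set to my, so the target is /models/my.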

 docker run -p 8501:8501 --name="my_model" --mount type=bind,source=E:/tmp/tfserving/my_model,target=/models/my -e MODEL_NAME=my -t tensorflow/serving "&"

The query can be written like this:

curl -Body '{"instances": [[1.0, 2.0, 5.0]]}' -Uri http://localhost:8501/v1/models/my:predict -Method 'POST'

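Note that the format of instances must match the input format the model can actually process, otherwise prediction will fail. Here the input is a 1×3 matrix, which the model dot-multiplies with [1, 2, 3] and then adds 1.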
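
The shape requirement and the expected value can be sketched in a few lines of Python. The weights [1, 2, 3] and the bias 1 are the ones described above; the function names are hypothetical.

```python
import json


def expected_output(row, weights=(1.0, 2.0, 3.0), bias=1.0):
    """Dot product of one input row with the weights, plus the bias."""
    return sum(x * w for x, w in zip(row, weights)) + bias


def build_payload(rows):
    """Serialize a batch of rows as the "instances" JSON body."""
    return json.dumps({"instances": rows})


if __name__ == "__main__":
    batch = [[1.0, 2.0, 5.0]]          # one row of three features: shape 1x3
    print(build_payload(batch))        # {"instances": [[1.0, 2.0, 5.0]]}
    print(expected_output(batch[0]))   # 21.0
```

The outer list is the batch and each inner list is one input row, which is why this model needs the double brackets while the half_plus_two example took a flat list of scalars.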

The response received is:

StatusCode        : 200
StatusDescription : OK
Content           : {
                        "predictions": [[20.8982468]
                        ]
                    }
RawContent        : HTTP/1.1 200 OK
                    Content-Length: 42
                    Content-Type: application/json
                    Date: Tue, 23 Jun 2020 01:31:10 GMT

                    {
                        "predictions": [[20.8982468]
                        ]
                    }
Forms             : {}
Headers           : {[Content-Length, 42], [Content-Type, application/json], [Date, Tue, 23 Jun 2020 01:31:10 GMT]}
Images            : {}
InputFields       : {}
Links             : {}
ParsedHtml        : mshtml.HTMLDocumentClass
RawContentLength  : 42

The result matches expectations: 1×1 + 2×2 + 5×3 + 1 = 21, and the trained model returns about 20.9. Done!

References:

http://dockone.io/article/9209

https://www.shuijingwanwq.com/2018/12/28/3077/

https://blog.csdn.net/weixin_34343000/article/details/88118667?utm_medium=distribute.pc_relevant.none-task-blog-baidujs-6

https://blog.csdn.net/u011734144/article/details/82107610?utm_medium=distribute.pc_relevant_t0.none-task-blog-BlogCommendFromMachineLearnPai2-1.nonecase&depth_1-utm_source=distribute.pc_relevant_t0.none-task-blog-BlogCommendFromMachineLearnPai2-1.nonecase

https://blog.csdn.net/m0_38088298/article/details/85796658