Transfer Learning using Mobilenet and Keras

This post is a technical blog compiled by AI 研習社. Original title: Transfer Learning using Mobilenet and Keras. Author: Ferhat Culfaz. Translation: 胡瑛皓. Proofreading: 醬番梨. Editing: 菠蘿妹. Original link: https://towardsdatascience.com/transfer-learning-using-mobilenet-and-keras-c75daf7ff299

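This post is presented as a notebook-style walkthrough. We first use Mobilenet to classify images of dogs, then show an image of a blue tit that it fails to classify correctly, and finally retrain Mobilenet with transfer learning so that the image is classified correctly. The example is a binary classifier, but more classes can be added as needed, subject to the limits of your hardware and compute time.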

Mobilenet paper: https://arxiv.org/pdf/1704.04861.pdf

MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications, Howard et al., 2017.

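Mobilenet is a lightweight architecture, and it is the model we will train here. It uses depthwise separable convolutions, which means it performs a single convolution on each colour channel rather than combining all three and flattening the result. This has the effect of filtering the input channels. Or, as the authors of the paper explain: "For MobileNets the depthwise convolution applies a single filter to each input channel. The pointwise convolution then applies a 1×1 convolution to combine the outputs of the depthwise convolution. A standard convolution both filters and combines inputs into a new set of outputs in one step. The depthwise separable convolution splits this into two layers, a separate layer for filtering and a separate layer for combining. This factorization has the effect of drastically reducing computation and model size."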
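
To make the saving concrete, here is a minimal sketch that counts multiply-accumulate operations for one layer in both forms. The function names and the example layer sizes (3×3 kernel, 32 input channels, 64 output channels, 112×112 feature map) are illustrative assumptions, not values taken from this post.

```python
def standard_conv_macs(kernel, in_channels, out_channels, feature_map):
    """Multiply-accumulates for a standard convolution (stride 1, same padding)."""
    return kernel * kernel * in_channels * out_channels * feature_map * feature_map


def separable_conv_macs(kernel, in_channels, out_channels, feature_map):
    """Multiply-accumulates for a depthwise convolution followed by a 1x1 pointwise one."""
    depthwise = kernel * kernel * in_channels * feature_map * feature_map  # filtering step
    pointwise = in_channels * out_channels * feature_map * feature_map     # combining step
    return depthwise + pointwise


# Hypothetical layer sizes, chosen only for illustration.
standard = standard_conv_macs(3, 32, 64, 112)
separable = separable_conv_macs(3, 32, 64, 112)
ratio = separable / standard  # equals 1/out_channels + 1/kernel**2

print(f"standard:  {standard:,} MACs")
print(f"separable: {separable:,} MACs")
print(f"ratio:     {ratio:.4f}")
```

For these sizes the separable form needs roughly one eighth of the operations, and the ratio depends only on the kernel size and the number of output channels.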

Difference between pointwise and depthwise convolutions

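The overall architecture of Mobilenet is as follows, with 30 layers in total:

A convolutional layer with stride 2
A depthwise layer
A pointwise layer that doubles the number of channels
A depthwise layer with stride 2
A pointwise layer that doubles the number of channels

and so on.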
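
The pattern above can be traced with a short sketch that follows the feature-map size and channel count through the network. The block list is my reading of the layer table in the MobileNets paper for the full-width 224×224 model, so treat the exact strides and channel counts as assumptions; the names `SEPARABLE_BLOCKS` and `trace_shapes` are hypothetical.

```python
# Each entry: (stride of the depthwise layer, output channels of the pointwise layer).
# Assumed from the paper's layer table for the full-width model.
SEPARABLE_BLOCKS = [
    (1, 64), (2, 128), (1, 128), (2, 256), (1, 256), (2, 512),
    (1, 512), (1, 512), (1, 512), (1, 512), (1, 512),
    (2, 1024), (1, 1024),
]


def trace_shapes(input_size=224, first_channels=32):
    """Return the (height, width, channels) shape after each stage."""
    size = input_size // 2            # the initial stride-2 convolution halves the size
    channels = first_channels
    shapes = [(size, size, channels)]
    for stride, out_channels in SEPARABLE_BLOCKS:
        size //= stride               # a stride-2 depthwise layer halves the size
        channels = out_channels       # the pointwise layer sets the channel count
        shapes.append((size, size, channels))
    return shapes


shapes = trace_shapes()
for stage, shape in enumerate(shapes):
    print(stage, shape)
```

Under these assumptions the last stage is a 7×7 map with 1024 channels, which is what global average pooling reduces to a 1024-value vector later in this post.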

Mobilenet full architecture

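It is also very low maintenance, so it performs well at high speed. There are many flavours of pre-trained models, and the size of the network in memory and on disk is proportional to the number of parameters. The speed and power consumption of the network are proportional to the number of MACs (Multiply-Accumulates), a measure of the number of fused multiplication and addition operations.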
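
As a rough worked example of "size is proportional to the number of parameters", the sketch below converts a parameter count into megabytes. The figure of about 4.2 million parameters is the count I recall the paper reporting for the full-size model, and 4 bytes per parameter assumes float32 storage; both are assumptions rather than numbers from this post.

```python
def model_size_mb(num_params, bytes_per_param=4):
    """Approximate storage for the weights alone, in megabytes (10**6 bytes)."""
    return num_params * bytes_per_param / 1e6


params = 4_200_000                     # assumed approximate parameter count
size_float32 = model_size_mb(params)   # 4 bytes per weight
size_float16 = model_size_mb(params, bytes_per_param=2)

print(f"float32: {size_float32:.1f} MB")
print(f"float16: {size_float16:.1f} MB")
```

The estimate ignores file-format overhead, so a saved weights file will be slightly larger than the raw number.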

Now let's get onto the code!

The code for this post is available here: https://github.com/ferhat00/Deep-Learning/tree/master/Transfer%20Learning%20CNN

First, load the necessary packages and libraries.

import keras
from keras import backend as K
from keras.layers import Dense, Activation, GlobalAveragePooling2D
from keras.optimizers import Adam
from keras.metrics import categorical_crossentropy
from keras.preprocessing.image import ImageDataGenerator
from keras.preprocessing import image
from keras.models import Model
from keras.applications import imagenet_utils
from keras.applications import MobileNet
from keras.applications.mobilenet import preprocess_input
import numpy as np
from IPython.display import Image

We import the pre-trained model from Keras.

mobile = keras.applications.mobilenet.MobileNet()

def prepare_image(file):
    img_path = ''  # directory prefix for the image files; empty means the working directory
    img = image.load_img(img_path + file, target_size=(224, 224))  # resize to the 224x224 input size
    img_array = image.img_to_array(img)
    img_array_expanded_dims = np.expand_dims(img_array, axis=0)  # add the batch dimension
    return keras.applications.mobilenet.preprocess_input(img_array_expanded_dims)
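
To show what the two array steps in this helper do, here is a small NumPy-only sketch. It assumes the Mobilenet preprocessing maps 8-bit pixel values from [0, 255] to [-1, 1], which is my understanding of the Keras behaviour; `scale_to_unit_range` is a hypothetical stand-in, not the library function.

```python
import numpy as np


def scale_to_unit_range(pixels):
    """Map pixel values in [0, 255] to floats in [-1, 1] (assumed Mobilenet convention)."""
    return pixels.astype("float32") / 127.5 - 1.0


# A fake 224x224 RGB image standing in for a loaded photo.
fake_image = np.zeros((224, 224, 3), dtype="uint8")
fake_image[0, 0] = [0, 128, 255]               # one pixel with dark, mid and bright channels

batch = np.expand_dims(fake_image, axis=0)     # the model expects (batch, height, width, channels)
scaled = scale_to_unit_range(batch)

print(batch.shape)
print(scaled[0, 0, 0])                         # the three channel values of that pixel after scaling
```

The batch dimension is needed even for a single image, because the model always predicts on a batch.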

Let's try it out on images of different dog breeds.

Image(filename='German_Shepherd.jpg')

preprocessed_image = prepare_image('German_Shepherd.jpg')
predictions = mobile.predict(preprocessed_image)
results = imagenet_utils.decode_predictions(predictions)
results

Output:

[[('n02106662', 'German_shepherd', 0.9796372),
  ('n02105162', 'malinois', 0.020184083),
  ('n02091467', 'Norwegian_elkhound', 0.00015799515),
  ('n02116738', 'African_hunting_dog', 5.2901587e-06),
  ('n02105251', 'briard', 3.9127376e-06)]]

Image(filename='labrador1.jpg')

preprocessed_image = prepare_image('labrador1.jpg')
predictions = mobile.predict(preprocessed_image)
results = imagenet_utils.decode_predictions(predictions)
results

Output:

[[('n02099712', 'Labrador_retriever', 0.73073703),
  ('n02087394', 'Rhodesian_ridgeback', 0.03984367),
  ('n02092339', 'Weimaraner', 0.03359009),
  ('n02109047', 'Great_Dane', 0.028944707),
  ('n02110341', 'dalmatian', 0.022403581)]]

Image(filename='poodle1.jpg')

preprocessed_image = prepare_image('poodle1.jpg')
predictions = mobile.predict(preprocessed_image)
results = imagenet_utils.decode_predictions(predictions)
results

Output:

[[('n02113799', 'standard_poodle', 0.5650911),
  ('n02113712', 'miniature_poodle', 0.37279922),
  ('n02102973', 'Irish_water_spaniel', 0.053150617),
  ('n02113624', 'toy_poodle', 0.0072146286),
  ('n02093859', 'Kerry_blue_terrier', 0.0013652634)]]

So far so good. The classifier distinguishes the different dog breeds. Now let's try an image of a bird, in this case a blue tit.

Image(filename='blue_tit.jpg')

Blue tit

preprocessed_image = prepare_image('blue_tit.jpg')
predictions = mobile.predict(preprocessed_image)
results = imagenet_utils.decode_predictions(predictions)
results

Output:

[[('n01592084', 'chickadee', 0.95554715),
  ('n01530575', 'brambling', 0.012973112),
  ('n01828970', 'bee_eater', 0.012916375),
  ('n01532829', 'house_finch', 0.010978725),
  ('n01580077', 'jay', 0.0020677084)]]

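As you can see, the classifier cannot recognise the blue tit. The image was misclassified as a chickadee, a bird native to North America that differs from the blue tit in subtle ways: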

Chickadee
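
One way to see why a confident but wrong answer comes out: a softmax classifier spreads all of its probability over the classes it was trained on, so an image from an unseen class still lands on the closest known one. The sketch below illustrates this with made-up scores and a hypothetical three-label subset; it is not the model's real output.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    shift = max(logits)                              # subtract the max for numerical stability
    exps = [math.exp(value - shift) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


labels = ["chickadee", "brambling", "jay"]           # hypothetical subset of known classes
logits = [6.0, 1.5, 0.5]                             # made-up scores for a blue tit photo

probs = softmax(logits)
best = labels[probs.index(max(probs))]

for label, prob in zip(labels, probs):
    print(f"{label}: {prob:.4f}")
print("predicted:", best)
```

Since there is no blue tit label among the outputs, retraining the top of the network on the new classes is what makes a correct answer possible.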

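Let's now modify the Mobilenet architecture and retrain the top few layers using transfer learning. To do this, we need some images to train on. Here the model will learn blue tits and crows. Rather than downloading the training images by hand, we use Google Image Search to fetch them. The google_images_download package makes this convenient; all we need to do is import it.

Package: https://github.com/hardikvasa/google-images-download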

!pip install google_images_download

from google_images_download import google_images_download

response = google_images_download.googleimagesdownload()

# Download up to 100 JPG images larger than 400x300 for each keyword.
arguments = {"keywords":"blue tit","limit":100,"print_urls":False,"format":"jpg", "size":">400*300"}
paths = response.download(arguments)
arguments = {"keywords":"crow","limit":100,"print_urls":False, "format":"jpg", "size":">400*300"}
paths = response.download(arguments)

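Now let's reuse MobileNet, which is quite lightweight (about 17 MB to download), freeze its base layers, add a few layers on top, and train them. Note that this post trains only a two-class classifier, blue tit versus crow.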

base_model=MobileNet(weights='imagenet',include_top=False) #imports the mobilenet model and discards the last 1000 neuron layer.

x=base_model.output
x=GlobalAveragePooling2D()(x)
x=Dense(1024,activation='relu')(x) #we add dense layers so that the model can learn more complex functions and classify for better results.
x=Dense(1024,activation='relu')(x) #dense layer 2
x=Dense(512,activation='relu')(x) #dense layer 3
preds=Dense(2,activation='softmax')(x) #final layer with softmax activation

model=Model(inputs=base_model.input,outputs=preds) #build the model that the following steps refer to, from the base input to the new predictions
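
For a sense of how much the new head adds, here is a small sketch that counts the weights and biases of the dense layers above. It assumes the pooled output of the base model has 1024 values, which holds for the full-width MobileNet; `dense_params` is a hypothetical helper.

```python
def dense_params(inputs, units):
    """Parameters of a fully connected layer: one weight per input-unit pair plus one bias per unit."""
    return inputs * units + units


# Layer widths of the head: pooled features (assumed 1024), then the Dense sizes from the code above.
widths = [1024, 1024, 1024, 512, 2]

per_layer = [dense_params(a, b) for a, b in zip(widths, widths[1:])]
head_total = sum(per_layer)

print(per_layer)
print(head_total)
```

Most of the head's parameters sit in the two 1024-unit layers, so shrinking those is the first thing to try if the model needs to be smaller.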

Let's take a look at the model architecture.

for i,layer in enumerate(model.layers):
  print(i,layer.name)

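Here we use weights pre-trained on the Imagenet dataset. We make sure the pre-trained weights are not trainable (frozen) and train only the last few dense layers.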

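# Option 1: freeze every layer of the model.
for layer in model.layers:
    layer.trainable=False
# Option 2: set the first 20 layers of the network to be non-trainable and train the rest.
# If both options run in sequence, as written here, option 2 is the one that takes effect.
for layer in model.layers[:20]:
    layer.trainable=False
for layer in model.layers[20:]:
    layer.trainable=True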

Now load the training data into an ImageDataGenerator. Given the path, it automatically feeds the data to training in batches, which simplifies the code.

train_datagen=ImageDataGenerator(preprocessing_function=preprocess_input) #included in our dependencies

train_generator=train_datagen.flow_from_directory('C:/Users/Ferhat/Python Code/Workshop/Tensoorflow transfer learning/downloads',
                                                 target_size=(224,224),
                                                 color_mode='rgb',
                                                 batch_size=32,
                                                 class_mode='categorical',
                                                 shuffle=True)
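
The generator takes its class labels from the subfolder names under the given directory. The sketch below mimics that mapping with the standard library only, assuming (as I understand the Keras behaviour) that subfolders are sorted by name and numbered from zero. The folder names "blue tit" and "crow" are assumed to match the download keywords, and `class_indices_from_directory` is a hypothetical helper.

```python
import os
import tempfile


def class_indices_from_directory(root):
    """Map each class subfolder name to an integer index, in sorted order."""
    classes = sorted(
        name for name in os.listdir(root)
        if os.path.isdir(os.path.join(root, name))
    )
    return {name: index for index, name in enumerate(classes)}


# Build a throwaway directory tree that mirrors the assumed downloads layout.
with tempfile.TemporaryDirectory() as root:
    for keyword in ("crow", "blue tit"):
        os.makedirs(os.path.join(root, keyword))
    class_indices = class_indices_from_directory(root)

print(class_indices)
```

On the real generator the same mapping can be read from `train_generator.class_indices`, which is worth checking before interpreting predictions.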

Compile the model, then start training. On a GTX1070 GPU, training takes less than 2 minutes.

model.compile(optimizer='Adam',loss='categorical_crossentropy',metrics=['accuracy'])
# Adam optimizer
# loss function will be categorical cross entropy
# evaluation metric will be accuracy

step_size_train=train_generator.n//train_generator.batch_size
model.fit_generator(generator=train_generator,
                   steps_per_epoch=step_size_train,
                   epochs=10)

Epoch 1/10
5/5 [==============================] - 5s 952ms/step - loss: 0.9098 - acc: 0.6562
Epoch 2/10
5/5 [==============================] - 3s 563ms/step - loss: 0.0503 - acc: 0.9686
Epoch 3/10
5/5 [==============================] - 3s 687ms/step - loss: 0.0236 - acc: 0.9930
Epoch 4/10
5/5 [==============================] - 4s 716ms/step - loss: 7.5358e-04 - acc: 1.0000
Epoch 5/10
5/5 [==============================] - 3s 522ms/step - loss: 0.0021 - acc: 1.0000
Epoch 6/10
5/5 [==============================] - 4s 780ms/step - loss: 0.0353 - acc: 0.9937
Epoch 7/10
5/5 [==============================] - 3s 654ms/step - loss: 0.0905 - acc: 0.9938
Epoch 8/10
5/5 [==============================] - 4s 890ms/step - loss: 0.0047 - acc: 1.0000
Epoch 9/10
5/5 [==============================] - 3s 649ms/step - loss: 0.0377 - acc: 0.9867
Epoch 10/10
5/5 [==============================] - 5s 929ms/step - loss: 0.0125 - acc: 1.0000
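
The `5/5` in the log is the number of steps per epoch, which the training code computes with floor division. A minimal sketch of that arithmetic follows; the image count of 180 is hypothetical, chosen only because any count from 160 to 191 gives 5 steps at a batch size of 32.

```python
import math


def steps_per_epoch(num_images, batch_size, keep_remainder=False):
    """Number of batches per epoch; floor division drops a final partial batch."""
    if keep_remainder:
        return math.ceil(num_images / batch_size)
    return num_images // batch_size


num_images = 180   # hypothetical count of downloaded images
batch_size = 32

floor_steps = steps_per_epoch(num_images, batch_size)
ceil_steps = steps_per_epoch(num_images, batch_size, keep_remainder=True)

print(floor_steps, ceil_steps)
```

With floor division a few images are skipped in each epoch, which is harmless here because shuffling changes which ones are left out.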

The model is now trained. Let's test it on some independent input images and check the predictions.

import matplotlib.pyplot as plt  # needed only when show=True

def load_image(img_path, show=False):
    # Note: training above used 224x224 inputs scaled by preprocess_input;
    # this helper keeps the original post's 150x150 size and [0, 1] scaling.
    img = image.load_img(img_path, target_size=(150, 150))
    img_tensor = image.img_to_array(img)                    # (height, width, channels)
    img_tensor = np.expand_dims(img_tensor, axis=0)         # (1, height, width, channels): the model expects a batch dimension
    img_tensor /= 255.                                      # imshow expects values in the range [0, 1]

    if show:
        plt.imshow(img_tensor[0])
        plt.axis('off')
        plt.show()

    return img_tensor

#img_path = 'C:/Users/Ferhat/Python Code/Workshop/Tensoorflow transfer learning/blue_tit.jpg'
img_path = 'C:/Users/Ferhat/Python Code/Workshop/Tensoorflow transfer learning/crow.jpg'
new_image = load_image(img_path)

pred = model.predict(new_image)

pred

Output:

array([[4.5191143e-15, 1.0000000e+00]], dtype=float32)

The result shows that the classifier correctly predicted the crow. The blue tit image path is commented out in the code above.

Crow
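
To read a prediction vector like the one above as a class name, pick the index with the highest probability and look it up in the class mapping. The sketch below assumes the mapping `{"blue tit": 0, "crow": 1}`, which follows from alphabetical folder order but is not printed in this post; `top_class` is a hypothetical helper.

```python
pred = [[4.5191143e-15, 1.0]]                      # the prediction shown above, as a plain list

class_indices = {"blue tit": 0, "crow": 1}         # assumed mapping from folder names
index_to_class = {index: name for name, index in class_indices.items()}


def top_class(probabilities):
    """Return the most likely class name and its probability."""
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return index_to_class[best], probabilities[best]


label, confidence = top_class(pred[0])
print(label, confidence)
```

If the folders were named differently, only the `class_indices` dictionary would need to change.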

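The approach in this post can be extended to more image classes, and with more classes and images the model should generalise better. It is a lightweight and quick way to implement transfer learning for CNNs. How well it works depends, of course, on the speed and accuracy you need, the hardware you run on, and the time you can invest.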
