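1. Visualizing Convolutional Layers

Neural network models are generally opaque: it is hard to explain why they make a particular decision or prediction. Convolutional neural networks are designed to work with image data, and their structure and function suggest that they should be easier to understand than other types of neural networks.

Specifically, the model is built from small linear filters (convolution kernels) and the results of applying them, called activation maps or feature maps. Both the filters and the feature maps can be visualized. For example, we can design and understand small filters such as line detectors. Visualizing the filters of a trained convolutional neural network helps in understanding how the model works.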
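
To make the idea of a hand-designed filter concrete, here is a minimal NumPy sketch of a 3×3 vertical line detector applied to a tiny image. The helper name `cross_correlate2d` and the toy image are illustrative assumptions and not part of the article's code. Like a Keras Conv2D layer, the helper slides the filter over the image without flipping it.

```python
import numpy as np

def cross_correlate2d(image, kernel):
    """Slide kernel over image ('valid' mode, stride 1) and return the feature map."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for r in range(out_h):
        for c in range(out_w):
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out

# hand-designed vertical line detector: ones in the center column, zeros elsewhere
vertical_line = np.array([[0.0, 1.0, 0.0],
                          [0.0, 1.0, 0.0],
                          [0.0, 1.0, 0.0]])

# toy 5x5 image with a bright vertical line in column 2
image = np.zeros((5, 5))
image[:, 2] = 1.0

feature_map = cross_correlate2d(image, vertical_line)
print(feature_map)
```

The feature map responds only where the filter's center column lines up with the bright line, which is the sense in which a filter "detects" a pattern.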

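The feature maps produced by applying filters to the input image, and to the feature maps output by earlier layers, give insight into the internal representation the model holds of a specific input at a given point in the network.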

2. VGG Model

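There is no need to fit a model from scratch; we can use an existing, pre-trained image classification model. Keras provides many well-performing image classification models developed by different research groups for the ImageNet Large Scale Visual Recognition Challenge (ILSVRC). One example is the VGG-16 model, which achieved top results in the 2014 competition (first place in localization and second place in classification). It is a good model for visualization because it has a simple, uniform structure of sequentially stacked convolutional and pooling layers, it has 16 learned layers, and it performs very well, which means its filters and the resulting feature maps capture useful features.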

# load vgg model
from keras.applications.vgg16 import VGG16
model = VGG16()
model.summary()

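Running the example loads the model weights into memory and prints a summary of the loaded model. The first time the model is loaded, the weights are downloaded from the internet (about 500 MB). [If the automatic download is slow or fails, open the printed download link directly and, once the file has downloaded, place it in C:\Users\34123\.keras\models, that is, the .keras\models folder under your own user directory.]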

Model: "vgg16"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         (None, 224, 224, 3)       0         
_________________________________________________________________
block1_conv1 (Conv2D)        (None, 224, 224, 64)      1792      
_________________________________________________________________
block1_conv2 (Conv2D)        (None, 224, 224, 64)      36928     
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, 112, 112, 64)      0         
_________________________________________________________________
block2_conv1 (Conv2D)        (None, 112, 112, 128)     73856     
_________________________________________________________________
block2_conv2 (Conv2D)        (None, 112, 112, 128)     147584    
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, 56, 56, 128)       0         
_________________________________________________________________
block3_conv1 (Conv2D)        (None, 56, 56, 256)       295168    
_________________________________________________________________
block3_conv2 (Conv2D)        (None, 56, 56, 256)       590080    
_________________________________________________________________
block3_conv3 (Conv2D)        (None, 56, 56, 256)       590080    
_________________________________________________________________
block3_pool (MaxPooling2D)   (None, 28, 28, 256)       0         
_________________________________________________________________
block4_conv1 (Conv2D)        (None, 28, 28, 512)       1180160   
_________________________________________________________________
block4_conv2 (Conv2D)        (None, 28, 28, 512)       2359808   
_________________________________________________________________
block4_conv3 (Conv2D)        (None, 28, 28, 512)       2359808   
_________________________________________________________________
block4_pool (MaxPooling2D)   (None, 14, 14, 512)       0         
_________________________________________________________________
block5_conv1 (Conv2D)        (None, 14, 14, 512)       2359808   
_________________________________________________________________
block5_conv2 (Conv2D)        (None, 14, 14, 512)       2359808   
_________________________________________________________________
block5_conv3 (Conv2D)        (None, 14, 14, 512)       2359808   
_________________________________________________________________
block5_pool (MaxPooling2D)   (None, 7, 7, 512)         0         
_________________________________________________________________
flatten (Flatten)            (None, 25088)             0         
_________________________________________________________________
fc1 (Dense)                  (None, 4096)              102764544 
_________________________________________________________________
fc2 (Dense)                  (None, 4096)              16781312  
_________________________________________________________________
predictions (Dense)          (None, 1000)              4097000   
=================================================================
Total params: 138,357,544
Trainable params: 138,357,544
Non-trainable params: 0
_________________________________________________________________
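
The parameter counts in the summary can be reproduced by hand, which is a useful check on how the layers are wired. The sketch below assumes 3×3 kernels in every convolutional layer (the filter shapes printed in the next section confirm this) and one bias per filter or unit. The helper names `conv_params` and `dense_params` are hypothetical.

```python
def conv_params(kernel, in_ch, out_ch):
    """Weights (kernel * kernel * in_ch * out_ch) plus one bias per filter."""
    return kernel * kernel * in_ch * out_ch + out_ch

def dense_params(in_units, out_units):
    """One weight per input/output pair plus one bias per output unit."""
    return in_units * out_units + out_units

# (input channels, output channels) for the 13 conv layers, read off the summary
conv_layers = [(3, 64), (64, 64),
               (64, 128), (128, 128),
               (128, 256), (256, 256), (256, 256),
               (256, 512), (512, 512), (512, 512),
               (512, 512), (512, 512), (512, 512)]
# flatten (7 * 7 * 512 = 25088) -> fc1 -> fc2 -> predictions
dense_layers = [(7 * 7 * 512, 4096), (4096, 4096), (4096, 1000)]

conv_total = sum(conv_params(3, i, o) for i, o in conv_layers)
dense_total = sum(dense_params(i, o) for i, o in dense_layers)
total = conv_total + dense_total

print(conv_params(3, 3, 64))   # block1_conv1
print(total)                   # whole model
```

The 13 convolutional layers and 3 dense layers are the 16 learned layers that give VGG-16 its name. Most of the parameters sit in fc1, not in the convolutional filters.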

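3. Visualizing Filters

The simplest visualization is to plot the learned filters of each layer directly.

In neural network terminology the learned filters are simply weights. Because of the filters' two-dimensional structure, however, the weight values have a spatial relationship to one another, so it makes sense to plot each filter as a two-dimensional image.

The first step is to inspect the filters in the model. The model summary printed in the previous section gives the output shape of each layer, that is, the shape of the resulting feature maps. It says nothing about the shape of the filters (weights) in the network, only the total number of weights per layer. All layers of the model can be accessed through the model.layers property.

Every layer has a layer.name property, and the convolutional layers follow a naming convention of the form block#_conv#, where "#" is an integer. We can therefore check the name of each layer and skip any whose name does not contain 'conv'.

Each convolutional layer has two sets of weights: the filters and the biases. Both can be retrieved with the layer.get_weights() function.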

# summarize filter shapes
for layer in model.layers:
    
    # check for convolutional layer
    if 'conv' not in layer.name:
        continue
        
    # get filter weights
    filters, biases = layer.get_weights()
    print(layer.name, filters.shape)

Output:

block1_conv1 (3, 3, 3, 64)
block1_conv2 (3, 3, 64, 64)
block2_conv1 (3, 3, 64, 128)
block2_conv2 (3, 3, 128, 128)
block3_conv1 (3, 3, 128, 256)
block3_conv2 (3, 3, 256, 256)
block3_conv3 (3, 3, 256, 256)
block4_conv1 (3, 3, 256, 512)
block4_conv2 (3, 3, 512, 512)
block4_conv3 (3, 3, 512, 512)
block5_conv1 (3, 3, 512, 512)
block5_conv2 (3, 3, 512, 512)
block5_conv3 (3, 3, 512, 512)

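We can see that all convolutional layers use 3×3 filters.

The first convolutional layer takes an input image with three channels (red, green, and blue), so each of its filters has a depth of 3 (the weights are stored in channels-last format). One such filter can be visualized as a figure with three images, one per channel. Alternatively, the three images can be compressed into a single color image, or we can look only at the first channel and assume the others look the same. Either way, the layer has 63 more filters that we may also want to visualize.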
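
The indexing behind these options is easy to get wrong, so here is a small sketch that uses a random array as a stand-in for the real first-layer weights. The shape (3, 3, 3, 64) matches the block1_conv1 entry above; the random values and variable names are assumptions for illustration only.

```python
import numpy as np

# stand-in for the block1_conv1 weights: (rows, cols, input channels, filters)
rng = np.random.default_rng(0)
filters = rng.normal(scale=0.1, size=(3, 3, 3, 64))

# the last axis selects a filter, the third axis selects an input channel
f = filters[:, :, :, 0]        # first filter, shape (3, 3, 3)
first_channel = f[:, :, 0]     # its 3x3 weights for the first input channel

# option 2 from the text: rescale one filter to 0-1 and treat it as a 3x3 color image
color_view = (f - f.min()) / (f.max() - f.min())

print(f.shape, first_channel.shape, color_view.shape)
```

With matplotlib, `plt.imshow(color_view)` would then draw the filter as one color image, while `plt.imshow(first_channel, cmap='gray')` draws a single channel.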

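First, retrieve the filters from the first convolutional layer (index 1 in model.layers, because index 0 is the input layer):

# retrieve weights from the first convolutional layer
filters, biases = model.layers[1].get_weights()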

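The weight values will most likely be small positive and negative numbers centered around 0.0. We can normalize them to the range 0-1 to make them easy to visualize.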

# normalize filter values to 0-1 so we can visualize them
f_min, f_max = filters.min(), filters.max()
filters = (filters - f_min) / (f_max - f_min)
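
As a quick check of what this normalization does, the sketch below applies the same min-max formula to a handful of made-up weights. The function name `min_max_normalize` and the `eps` guard against a zero range are additions for illustration and are not in the original code.

```python
import numpy as np

def min_max_normalize(weights, eps=1e-12):
    """Rescale to 0-1; eps avoids division by zero when all values are equal."""
    w_min, w_max = weights.min(), weights.max()
    return (weights - w_min) / max(w_max - w_min, eps)

weights = np.array([-0.5, 0.0, 0.25, 0.5])
scaled = min_max_normalize(weights)
print(scaled)
```

Because the minimum and maximum are taken over the whole layer, every filter is drawn on the same scale and brightness can be compared between filters. A zero weight lands at mid-gray only when the weight range happens to be symmetric around zero, as in this toy input.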

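The code below plots the first 6 of the 64 filters, one row per filter and one column per channel.

from keras.applications.vgg16 import VGG16
import matplotlib.pyplot as plt
plt.rcParams['figure.dpi'] = 200

# load the model
model = VGG16()

# retrieve weights from the first convolutional layer (index 1; index 0 is the input layer)
filters, biases = model.layers[1].get_weights()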

# normalize filter values to 0-1 so we can visualize them
f_min, f_max = filters.min(), filters.max()
filters = (filters - f_min) / (f_max - f_min)

# plot first few filters
n_filters, ix = 6, 1
for i in range(n_filters):
    # get the filter
    f = filters[:, :, :, i]
    
    # plot each channel separately
    for j in range(3):
        # specify subplot and turn off axis ticks
        ax = plt.subplot(n_filters, 3, ix)
        ax.set_xticks([])
        ax.set_yticks([])
        
        # plot filter channel in grayscale (use cmap='coolwarm' for a blue-to-red view)
        plt.imshow(f[:, :, j], cmap='gray')
        ix += 1

# show the figure
plt.show()

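Output:

We can see that in some cases the filter is the same across the channels (the first row), and in other cases it differs (the last row). Dark squares (black in the first, grayscale figure; blue in the second, coolwarm figure) indicate small or inhibitory weights, while light squares (white in the first figure; red in the second) indicate large or excitatory weights. With this in mind, the filter in the first row detects a gradient from light in the top left to dark in the bottom right.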

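4. Visualizing Feature Maps

Feature maps capture the result of applying the filters to an input, which is either the input image or another feature map. Visualizing the feature maps for a specific input image helps us understand which features of the input are detected or preserved. The expectation is that feature maps close to the input detect small, fine-grained detail, whereas feature maps close to the output of the model capture more general features.

First, print the output size (the feature map size) of each convolutional layer, together with the index of that layer in the model.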

# summarize feature map size for each conv layer
from keras.applications.vgg16 import VGG16
model = VGG16()

# summarize feature map shapes
for i, layer in enumerate(model.layers):
    
    # check for convolutional layer
    if 'conv' not in layer.name:
        continue
        
    # summarize output shape
    print(i, layer.name, layer.output.shape)

Output:

1 block1_conv1 (None, 224, 224, 64)
2 block1_conv2 (None, 224, 224, 64)
4 block2_conv1 (None, 112, 112, 128)
5 block2_conv2 (None, 112, 112, 128)
7 block3_conv1 (None, 56, 56, 256)
8 block3_conv2 (None, 56, 56, 256)
9 block3_conv3 (None, 56, 56, 256)
11 block4_conv1 (None, 28, 28, 512)
12 block4_conv2 (None, 28, 28, 512)
13 block4_conv3 (None, 28, 28, 512)
15 block5_conv1 (None, 14, 14, 512)
16 block5_conv2 (None, 14, 14, 512)
17 block5_conv3 (None, 14, 14, 512)
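
The pattern in these shapes follows from two rules that can be inferred from the printed output: the 3×3 convolutions keep the spatial size unchanged (which implies 'same' padding), and each pooling layer halves it. A minimal sketch that reproduces the list without loading the model, under those assumptions:

```python
# (number of conv layers, number of filters) for the five VGG-16 blocks
blocks = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]

size = 224  # input height and width
shapes = []
for n_convs, n_filters in blocks:
    # 'same'-padded convolutions keep the spatial size unchanged
    shapes.extend([(size, size, n_filters)] * n_convs)
    # each 2x2 max pool with stride 2 halves the spatial size
    size //= 2

for shape in shapes:
    print(shape)
print("after block5_pool:", size)
```

The spatial size shrinks from 224 to 14 across the convolutional layers while the number of maps grows from 64 to 512, so deeper layers trade resolution for more channels.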

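With the VGG model loaded, we can define a new model that outputs the feature maps of the first convolutional layer (index 1).

from keras.models import Model

# redefine model to output right after the first hidden layer
model = Model(inputs=model.inputs, outputs=model.layers[1].output)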

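Once the model is defined, the image must be loaded at the input size the model expects, in this case 224×224.

from keras.preprocessing.image import load_img

# load the image with the required shape
img = load_img('dog2.jpg', target_size=(224, 224))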

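Next, the PIL image object must be converted to a NumPy array of pixel data and expanded from a 3D array to a 4D array with dimensions [samples, rows, columns, channels], where the number of samples is 1.

from keras.preprocessing.image import img_to_array
from numpy import expand_dims

# convert the image to an array
img = img_to_array(img)
# expand dimensions so that it represents a single 'sample'
img = expand_dims(img, axis=0)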
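
The shape change can be checked without an image file. The sketch below uses an all-zero array as a stand-in for the output of img_to_array; the variable names are illustrative.

```python
import numpy as np

# stand-in for the img_to_array output: one 224x224 image with 3 channels
img = np.zeros((224, 224, 3), dtype="float32")

# add a leading 'samples' axis so the array looks like a batch of one image
batch = np.expand_dims(img, axis=0)
print(img.shape, "->", batch.shape)

# indexing the first sample recovers the original 3D array
restored = batch[0]
```

The model always expects the leading samples axis, even for a single image, which is why the feature maps later come back with a leading dimension of 1 as well.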

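The pixel values then need to be prepared the way the VGG model expects. For VGG-16, preprocess_input converts the image from RGB to BGR and subtracts the ImageNet mean from each channel; it does not rescale the values to 0-1.

from keras.applications.vgg16 import preprocess_input

# prepare the image for VGG (channel reordering and mean subtraction)
img = preprocess_input(img)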
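
For intuition, the following NumPy sketch mimics what this preprocessing step is understood to do for VGG: reverse the channel order from RGB to BGR, then subtract a fixed per-channel mean. The function name `vgg_preprocess_sketch` is hypothetical, and the mean values are the commonly quoted ImageNet means, assumed here to match the Keras behavior. In real code, keep using preprocess_input.

```python
import numpy as np

# per-channel ImageNet means in BGR order (assumed to match Keras for VGG)
IMAGENET_BGR_MEAN = np.array([103.939, 116.779, 123.68], dtype="float32")

def vgg_preprocess_sketch(batch_rgb):
    """Approximate VGG preprocessing: RGB -> BGR, then subtract channel means."""
    batch_bgr = batch_rgb[..., ::-1].astype("float32")
    return batch_bgr - IMAGENET_BGR_MEAN

# a batch holding one 2x2 image whose pixels are all R=10, G=20, B=30
batch = np.empty((1, 2, 2, 3), dtype="float32")
batch[...] = [10.0, 20.0, 30.0]

out = vgg_preprocess_sketch(batch)
print(out[0, 0, 0])
```

One consequence is that the preprocessed values are centered around zero and can be negative, so the preprocessed array should not be plotted as if it were the original image.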

The feature maps are obtained by calling model.predict() with the prepared single image.

# get feature map for first hidden layer
feature_maps = model.predict(img)

The result is a set of 64 feature maps of size 224×224, which model.predict() returns as an array of shape (1, 224, 224, 64). All 64 two-dimensional maps can be plotted in an 8×8 grid of subplots.

# plot all 64 maps in an 8x8 grid
square = 8
ix = 1
for _ in range(square):
    for _ in range(square):
        # specify subplot and turn off axis ticks
        ax = plt.subplot(square, square, ix)
        ax.set_xticks([])
        ax.set_yticks([])
        # plot one feature map in grayscale
        plt.imshow(feature_maps[0, :, :, ix-1], cmap='gray')
        ix += 1
# show the figure
plt.show()
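
An alternative to 64 separate subplots is to tile the maps into one large 2D array and show it with a single imshow call. The sketch below is one possible way to do that; the helper `tile_feature_maps` is hypothetical, and a small synthetic array stands in for the real output of model.predict().

```python
import numpy as np

def tile_feature_maps(feature_maps, grid=8):
    """Arrange the first grid*grid channels of a (1, H, W, C) array into one 2D image."""
    maps = feature_maps[0]                      # drop the batch axis -> (H, W, C)
    h, w, _ = maps.shape
    canvas = np.zeros((grid * h, grid * w), dtype=maps.dtype)
    for idx in range(grid * grid):
        row, col = divmod(idx, grid)            # same left-to-right, top-to-bottom order
        canvas[row * h:(row + 1) * h, col * w:(col + 1) * w] = maps[:, :, idx]
    return canvas

# stand-in for model.predict(img): channel k is filled with the value k
fake_maps = np.broadcast_to(np.arange(64, dtype="float32"), (1, 4, 4, 64))
montage = tile_feature_maps(fake_maps)
print(montage.shape)
```

With real data, `plt.imshow(tile_feature_maps(feature_maps), cmap='gray')` would draw the whole grid at once. Note the design difference: a single image shares one gray scale across all maps, whereas separate subplots each scale their own map independently, so weak maps look fainter in the montage.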

Complete code:

from keras.applications.vgg16 import VGG16
from keras.applications.vgg16 import preprocess_input
from keras.preprocessing.image import load_img
from keras.preprocessing.image import img_to_array
from keras.models import Model
import matplotlib.pyplot as plt
from numpy import expand_dims
plt.rcParams['figure.dpi'] = 200

# load the model
model = VGG16()

# redefine model to output right after the first hidden layer
model = Model(inputs=model.inputs, outputs=model.layers[1].output)
model.summary()

# load the image with the required shape
img = load_img('dog2.jpg', target_size=(224, 224))

# convert the image to an array
img = img_to_array(img)

# expand dimensions so that it represents a single 'sample'
img = expand_dims(img, axis=0)

# prepare the image (e.g. scale pixel values for the vgg)
img = preprocess_input(img)

# get feature map for first hidden layer
feature_maps = model.predict(img)

# plot all 64 maps in an 8x8 grid
square = 8
ix = 1
for _ in range(square):
    for _ in range(square):
        # specify subplot and turn off axis ticks
        ax = plt.subplot(square, square, ix)
        ax.set_xticks([])
        ax.set_yticks([])
        # plot one feature map in grayscale
        plt.imshow(feature_maps[0, :, :, ix-1], cmap='gray')
        ix += 1

# show the figure
plt.show()

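Output:

Feature maps can also be plotted from the output of other convolutional layers. The model has five main blocks (block1, block2, and so on), each ending in a pooling layer. The layer indexes of the last convolutional layer in each block are [2, 5, 9, 13, 17].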
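
These indexes do not have to be counted by hand. A small sketch, using the layer names from the model summary as plain strings so that it runs without loading the model (the helper `last_conv_per_block` is hypothetical):

```python
# layer names in order, as listed by model.summary() for VGG-16
layer_names = [
    "input_1",
    "block1_conv1", "block1_conv2", "block1_pool",
    "block2_conv1", "block2_conv2", "block2_pool",
    "block3_conv1", "block3_conv2", "block3_conv3", "block3_pool",
    "block4_conv1", "block4_conv2", "block4_conv3", "block4_pool",
    "block5_conv1", "block5_conv2", "block5_conv3", "block5_pool",
    "flatten", "fc1", "fc2", "predictions",
]

def last_conv_per_block(names):
    """Return the index of the convolutional layer directly before each pooling layer."""
    return [i - 1 for i, name in enumerate(names)
            if name.endswith("_pool") and "conv" in names[i - 1]]

ixs = last_conv_per_block(layer_names)
print(ixs)
```

With a loaded model, the names would come from `[layer.name for layer in model.layers]`. Passing `outputs=[model.layers[i].output for i in ixs]` to Model should then give a model whose predict() returns one array of feature maps per block; treat that usage as an untested assumption here.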

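We can see that the feature maps close to the model input capture a lot of fine detail in the image, and that the feature maps show less and less detail as we move deeper into the model.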
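
One way to see why detail fades with depth is to estimate how large a patch of the input image each unit "sees" (its receptive field). The sketch below uses the standard recurrence for stacked layers and assumes 3×3 convolutions with stride 1 and 2×2 pooling with stride 2, as in VGG-16. It is an illustrative calculation, not something taken from the original code.

```python
# number of conv layers in each of the five VGG-16 blocks
convs_per_block = [2, 2, 3, 3, 3]

rf = 1     # receptive field size, in input pixels
jump = 1   # distance in input pixels between neighboring units of the current layer
rf_after_last_conv = []
for n_convs in convs_per_block:
    for _ in range(n_convs):
        rf += (3 - 1) * jump      # 3x3 convolution, stride 1
    rf_after_last_conv.append(rf)
    rf += (2 - 1) * jump          # 2x2 max pooling
    jump *= 2                     # stride 2 doubles the spacing

print(rf_after_last_conv)
```

Under these assumptions, a unit in block1_conv2 depends on a 5×5 patch of the image, while a unit in block5_conv3 depends on a 196×196 patch, which is nearly the whole 224×224 input. Deeper maps therefore summarize large regions instead of preserving fine detail.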

The feature maps captured by block1 through block5 are shown in the figures below:

Reference:
https://machinelearningmastery.com/how-to-visualize-filters-and-feature-maps-in-convolutional-neural-networks/
