ZFNet
ZFNet is a convolutional network model published in 2013 by Matthew D. Zeiler and Rob Fergus ("Visualizing and Understanding Convolutional Networks").
Design ideas of ZFNet
Main design contributions
- It can be understood as essentially a fine-tuning of the AlexNet architecture
- It uses a novel visualization technique to give insight into the function of the intermediate feature layers and the operation of the classifier
- The visualization is implemented with deconvolution (transposed convolution) and unpooling
- Based on the visualization results, the network structure is then adjusted further
Adjustments ZFNet made to AlexNet
- Visualization showed that the first layer of AlexNet contains a mix of high-frequency information (image edges, object boundaries, fine detail, noise) and low-frequency information (non-edge regions, image contours), while the mid-frequency band is barely covered.
- Fix: shrink the first-layer filters from 11x11 to 7x7, so the convolution can also extract mid-frequency features.
- The first convolution layer used a stride of 4, which is too large and causes severe aliasing; the learned first-layer filters look poor, unlike later layers whose features show recognizable textures, colors, and so on.
- Fix: reduce the first-layer convolution stride from 4 to 2.
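As a sanity check on these choices, the feature-map sizes implied by each filter size and stride can be traced with the standard convolution output-size formula (a quick sketch; `out_size` is a small helper written here, not a library function):

```python
# Output size of a conv/pool layer: floor((n + 2p - k) / s) + 1
def out_size(n, k, s, p=0):
    return (n + 2 * p - k) // s + 1

n = 224
n = out_size(n, k=7, s=2)        # conv1: 7x7, stride 2 -> 109
n = out_size(n, k=3, s=2)        # pool1: 3x3, stride 2 -> 54
n = out_size(n, k=5, s=2, p=2)   # conv2: 5x5, stride 2 -> 27
n = out_size(n, k=3, s=2)        # pool2 -> 13
# conv3-conv5 are 3x3, stride 1, padding 1, so the size stays 13
n = out_size(n, k=3, s=2)        # pool3 -> 6
print(n * n * 256)               # 9216 features feed the first FC layer
```

This is why the first fully connected layer in the implementations below expects 6x6x256 = 9216 inputs.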
Core architecture of ZFNet
(Architecture diagram not reproduced here; the network ends in a 1000-way softmax classification layer.)
Deconvolution and unpooling
For an explanation of deconvolution and unpooling, see my blog post:
Understanding convolution, pooling, deconvolution and unpooling in image processing
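The basic mechanism can be sketched in a few lines of PyTorch (an illustrative toy, not the paper's exact deconvnet, which reuses the transposed versions of the trained filters): max-unpooling puts each pooled value back at the "switch" location recorded during pooling, and a transposed convolution maps features back toward pixel space.

```python
import torch
import torch.nn as nn

x = torch.randn(1, 3, 8, 8)  # toy input batch
conv = nn.Conv2d(3, 16, kernel_size=3, padding=1)
# return_indices=True keeps the "switches": where each max came from
pool = nn.MaxPool2d(kernel_size=2, stride=2, return_indices=True)

# Forward pass: conv -> relu -> pool, remembering the max locations
feat = torch.relu(conv(x))
pooled, switches = pool(feat)

# Reverse pass for visualization: unpool -> relu -> transposed conv
unpool = nn.MaxUnpool2d(kernel_size=2, stride=2)
deconv = nn.ConvTranspose2d(16, 3, kernel_size=3, padding=1)
recon = deconv(torch.relu(unpool(pooled, switches)))
print(recon.shape)  # torch.Size([1, 3, 8, 8]) -- back at input resolution
```

Chaining such reverse blocks from a chosen activation back to the input is what produces the feature visualizations ZFNet's adjustments were based on.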
Keras implementation of ZFNet
# coding=utf-8
import numpy as np
from keras.models import Sequential
from keras.layers import Dense, Flatten, Dropout, Conv2D, MaxPooling2D

seed = 7
np.random.seed(seed)

model = Sequential()
# Layer 1: 7x7 filters with stride 2 (AlexNet used 11x11 with stride 4)
model.add(Conv2D(96, (7, 7), strides=(2, 2), input_shape=(224, 224, 3),
                 padding='valid', activation='relu', kernel_initializer='uniform'))
model.add(MaxPooling2D(pool_size=(3, 3), strides=(2, 2)))
# Layer 2: 5x5 filters, stride 2
model.add(Conv2D(256, (5, 5), strides=(2, 2), padding='same',
                 activation='relu', kernel_initializer='uniform'))
model.add(MaxPooling2D(pool_size=(3, 3), strides=(2, 2)))
# Layers 3-5: 3x3 filters, stride 1
model.add(Conv2D(384, (3, 3), strides=(1, 1), padding='same',
                 activation='relu', kernel_initializer='uniform'))
model.add(Conv2D(384, (3, 3), strides=(1, 1), padding='same',
                 activation='relu', kernel_initializer='uniform'))
model.add(Conv2D(256, (3, 3), strides=(1, 1), padding='same',
                 activation='relu', kernel_initializer='uniform'))
model.add(MaxPooling2D(pool_size=(3, 3), strides=(2, 2)))
# Classifier: two 4096-unit fully connected layers with dropout, 1000-way softmax
model.add(Flatten())
model.add(Dense(4096, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(4096, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(1000, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='sgd', metrics=['accuracy'])
model.summary()
PyTorch implementation of ZFNet
import torch
import torch.nn as nn
import torch.nn.functional as F

class ZFNet(nn.Module):
    def __init__(self, num_classes):
        super(ZFNet, self).__init__()
        # Layer 1: 7x7 filters, stride 2 (AlexNet: 11x11, stride 4)
        self.conv1 = nn.Conv2d(in_channels=3, out_channels=96, kernel_size=7, stride=2, padding=0)
        self.pool1 = nn.MaxPool2d(kernel_size=3, stride=2, padding=0)
        # Layer 2: stride 2 so the final feature map is 6x6 (stride 1 would give 12x12
        # and break the fc1 input size)
        self.conv2 = nn.Conv2d(in_channels=96, out_channels=256, kernel_size=5, stride=2, padding=2)
        self.pool2 = nn.MaxPool2d(kernel_size=3, stride=2, padding=0)
        # Layers 3-5: 3x3 filters, stride 1
        self.conv3 = nn.Conv2d(in_channels=256, out_channels=384, kernel_size=3, stride=1, padding=1)
        self.conv4 = nn.Conv2d(in_channels=384, out_channels=384, kernel_size=3, stride=1, padding=1)
        self.conv5 = nn.Conv2d(in_channels=384, out_channels=256, kernel_size=3, stride=1, padding=1)
        self.pool3 = nn.MaxPool2d(kernel_size=3, stride=2, padding=0)
        self.fc1 = nn.Linear(6 * 6 * 256, 4096)
        self.fc2 = nn.Linear(4096, 4096)
        self.fc3 = nn.Linear(4096, num_classes)
        # nn.Dropout respects model.train()/model.eval(), unlike a bare F.dropout call
        self.dropout = nn.Dropout(0.5)

    def forward(self, x):
        x = self.pool1(F.relu(self.conv1(x)))
        x = self.pool2(F.relu(self.conv2(x)))
        x = F.relu(self.conv3(x))
        x = F.relu(self.conv4(x))
        x = self.pool3(F.relu(self.conv5(x)))
        x = x.view(-1, 256 * 6 * 6)
        x = self.dropout(F.relu(self.fc1(x)))
        x = self.dropout(F.relu(self.fc2(x)))
        # Return raw logits; nn.CrossEntropyLoss applies log-softmax internally
        return self.fc3(x)
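As a standalone sanity check, the ZFNet convolutional stack can be run as a throwaway nn.Sequential on a dummy 224x224 batch to confirm the 6x6x256 feature map that fc1 expects (note this requires stride 2 in the second convolution):

```python
import torch
import torch.nn as nn

# The ZFNet convolutional stack with the hyperparameters used in this post
features = nn.Sequential(
    nn.Conv2d(3, 96, kernel_size=7, stride=2), nn.ReLU(),
    nn.MaxPool2d(kernel_size=3, stride=2),
    nn.Conv2d(96, 256, kernel_size=5, stride=2, padding=2), nn.ReLU(),
    nn.MaxPool2d(kernel_size=3, stride=2),
    nn.Conv2d(256, 384, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(384, 384, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(384, 256, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(kernel_size=3, stride=2),
)

with torch.no_grad():
    out = features(torch.zeros(1, 3, 224, 224))
print(out.shape)  # torch.Size([1, 256, 6, 6]) -> 9216 inputs to fc1
```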