tf.nn.conv2d() explained (how strides and padding interact)

Original post

2020-06-15 21:59

tf.nn.conv2d() is the TensorFlow function that computes a 2-D convolution, the core operation of a convolutional layer. Its signature is:

def conv2d(input: Any,
           filter: Any,
           strides: Any,
           padding: Any,
           use_cudnn_on_gpu: bool = True,
           data_format: str = "NHWC",
           dilations: List[int] = [1, 1, 1, 1],
           name: Any = None) -> Any

The most important parameters are input, filter, strides, and padding.

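input is the input data. It uses TensorFlow's default layout (NHWC), a 4-D tensor whose dimensions are batch_size (the number of images processed at once), height, width, and depth (also called channels; an RGB image has 3 channels).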

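filter is TensorFlow's name for the convolution kernel, that is, the weight matrix. Note its layout: it is also a 4-D tensor, with dimensions
height (the kernel height),
width (the kernel width),
depth (input channels), which must match the depth of input,
the number of kernels, which is also the number of feature maps produced.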
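
As a small illustration of how the input and filter shapes have to line up, here is a minimal sketch in plain Python (no TensorFlow required). The helper name check_conv_shapes is hypothetical and not part of TensorFlow; it only mirrors the shape rules described above.

```python
def check_conv_shapes(input_shape, filter_shape):
    """Return the number of output channels if the two shapes are compatible.

    input_shape:  [batch, height, width, in_channels]  (NHWC layout)
    filter_shape: [filter_height, filter_width, in_channels, out_channels]
    """
    if len(input_shape) != 4 or len(filter_shape) != 4:
        raise ValueError("both shapes must have four dimensions")
    if filter_shape[2] != input_shape[3]:
        raise ValueError("filter depth must equal the input's channel count")
    # The last filter dimension becomes the depth of the output feature maps.
    return filter_shape[3]


print(check_conv_shapes([64, 43, 43, 3], [3, 3, 3, 64]))  # 64
```

The returned value is the last dimension of the convolution output, which is why every output shape later in this post ends in 64.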

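strides is the step size. Like the parameters above, it is a list of four values, giving
the stride along batch_size,
the stride along height,
the stride along width, and
the stride along depth,
one for each dimension of input. For image input only the middle two values are normally changed, and the first and last are left at 1. The stride partly determines the size of the output feature map.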
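
To make the layout of the four stride values concrete, here is a tiny sketch. The helper make_strides is a hypothetical name used only for illustration; it assumes the NHWC layout described above and keeps the batch and depth strides at 1.

```python
def make_strides(stride_h, stride_w):
    """Build a 4-element strides list in NHWC order: [batch, height, width, depth]."""
    return [1, stride_h, stride_w, 1]


print(make_strides(2, 2))  # [1, 2, 2, 1]
print(make_strides(1, 3))  # [1, 1, 3, 1]
```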

padding selects the padding scheme. The two string values covered here are 'SAME' and 'VALID'.

The remaining parameters are options such as GPU acceleration and matter less here.

The strides and padding parameters are best explained together.

1. padding set to 'SAME'

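The sliding windows do not always tile the image exactly, so in SAME mode the input is zero-padded around its border to make sure every input value is covered. The size of the output feature map is then determined by the stride, and the number of zeros added is whatever is needed to reach that size. Start with the code: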

import tensorflow as tf  # TensorFlow 1.x API

# A batch of 64 random 43x43 images with 3 channels (NHWC)
data=tf.Variable(tf.random_normal([64,43,43,3]),dtype=tf.float32)
# 64 kernels of size 3x3, each spanning the 3 input channels
weight=tf.Variable(tf.random_normal([3,3,3,64]),dtype=tf.float32)

sess=tf.InteractiveSession()
tf.global_variables_initializer().run()

# Same data and kernel, with height/width strides of 1, 2, 3 and 4
conv1=tf.nn.conv2d(data,weight,strides=[1,1,1,1],padding='SAME')
conv2=tf.nn.conv2d(data,weight,strides=[1,2,2,1],padding='SAME')
conv3=tf.nn.conv2d(data,weight,strides=[1,3,3,1],padding='SAME')
conv4=tf.nn.conv2d(data,weight,strides=[1,4,4,1],padding='SAME')

print(conv1)
print(conv2)
print(conv3)
print(conv4)

Output:
Tensor("Conv2D_8:0", shape=(64, 43, 43, 64), dtype=float32)
Tensor("Conv2D_9:0", shape=(64, 22, 22, 64), dtype=float32)
Tensor("Conv2D_10:0", shape=(64, 15, 15, 64), dtype=float32)
Tensor("Conv2D_11:0", shape=(64, 11, 11, 64), dtype=float32)

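The output size follows from the second and third stride values: it is the input size divided by the stride, rounded up.
With strides=[1,1,1,1], the output size equals the input size.
With strides=[1,2,2,1], 43 is not a multiple of 2, so round 43 up to 44 and divide by 2 to get 22.
With strides=[1,3,3,1], 43 is not a multiple of 3, so round 43 up to 45 and divide by 3 to get 15.
With strides=[1,4,4,1], 43 is not a multiple of 4, so round 43 up to 44 and divide by 4 to get 11.
And so on.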
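
The "round up, then divide" rule above is a ceiling division. A minimal sketch in plain Python follows; same_output_size is a hypothetical helper name, not a TensorFlow function.

```python
def same_output_size(input_size, stride):
    """Output height or width in SAME mode: ceil(input_size / stride)."""
    # Integer ceiling division, which avoids floating-point rounding.
    return (input_size + stride - 1) // stride


for stride in (1, 2, 3, 4):
    print(stride, same_output_size(43, stride))
# 1 43
# 2 22
# 3 15
# 4 11
```

These values match the height and width in the four shapes printed above.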

Next, let's check whether the output feature map size depends on the kernel size:

import tensorflow as tf

# Same input as before, but now with a 5x5 kernel
data=tf.Variable(tf.random_normal([64,43,43,3]),dtype=tf.float32)
weight=tf.Variable(tf.random_normal([5,5,3,64]),dtype=tf.float32)

sess=tf.InteractiveSession()
tf.global_variables_initializer().run()

conv1=tf.nn.conv2d(data,weight,strides=[1,1,1,1],padding='SAME')
conv2=tf.nn.conv2d(data,weight,strides=[1,2,2,1],padding='SAME')
conv3=tf.nn.conv2d(data,weight,strides=[1,3,3,1],padding='SAME')
conv4=tf.nn.conv2d(data,weight,strides=[1,4,4,1],padding='SAME')

print(conv1)
print(conv2)
print(conv3)
print(conv4)

Output:
Tensor("Conv2D_16:0", shape=(64, 43, 43, 64), dtype=float32)
Tensor("Conv2D_17:0", shape=(64, 22, 22, 64), dtype=float32)
Tensor("Conv2D_18:0", shape=(64, 15, 15, 64), dtype=float32)
Tensor("Conv2D_19:0", shape=(64, 11, 11, 64), dtype=float32)

This shows that in SAME mode the output size does not depend on the kernel size; for a given input size it depends only on strides.
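
The kernel size does still affect how many zeros are added. The sketch below follows the SAME padding scheme described in TensorFlow's documentation (total padding split between the two sides, with any odd extra zero going at the end); same_padding is a hypothetical helper name written for this post, not a TensorFlow function.

```python
def same_padding(input_size, kernel_size, stride):
    """Return (output_size, pad_before, pad_after) along one dimension in SAME mode."""
    output_size = (input_size + stride - 1) // stride  # ceil(input_size / stride)
    # Zeros needed so that the last window still fits inside the padded input.
    pad_total = max((output_size - 1) * stride + kernel_size - input_size, 0)
    pad_before = pad_total // 2
    pad_after = pad_total - pad_before
    return output_size, pad_before, pad_after


for kernel_size in (3, 5):
    for stride in (1, 2, 3, 4):
        print(kernel_size, stride, same_padding(43, kernel_size, stride))
```

For the 43-wide input used here, a 3x3 kernel with stride 1 gives (43, 1, 1) and a 5x5 kernel with stride 1 gives (43, 2, 2): the padding changes with the kernel, the output size does not.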

2. padding set to 'VALID'

In VALID mode no zeros are added, and any input values the windows cannot reach are simply discarded.

import tensorflow as tf

# 43x43 input with a 5x5 kernel, this time without padding
data=tf.Variable(tf.random_normal([64,43,43,3]),dtype=tf.float32)
weight=tf.Variable(tf.random_normal([5,5,3,64]),dtype=tf.float32)

sess=tf.InteractiveSession()
tf.global_variables_initializer().run()
conv1=tf.nn.conv2d(data,weight,strides=[1,1,1,1],padding='VALID')
conv2=tf.nn.conv2d(data,weight,strides=[1,2,2,1],padding='VALID')
conv3=tf.nn.conv2d(data,weight,strides=[1,3,3,1],padding='VALID')
conv4=tf.nn.conv2d(data,weight,strides=[1,4,4,1],padding='VALID')

print(conv1)
print(conv2)
print(conv3)
print(conv4)

Output:
Tensor("Conv2D_20:0", shape=(64, 39, 39, 64), dtype=float32)
Tensor("Conv2D_21:0", shape=(64, 20, 20, 64), dtype=float32)
Tensor("Conv2D_22:0", shape=(64, 13, 13, 64), dtype=float32)
Tensor("Conv2D_23:0", shape=(64, 10, 10, 64), dtype=float32)
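
To see how much data VALID mode actually discards, the following sketch counts the trailing rows (or columns) that no window reaches. It assumes the usual no-padding output size, floor((input_size - kernel_size) / stride) + 1, which agrees with the shapes printed above; valid_coverage is a hypothetical helper name.

```python
def valid_coverage(input_size, kernel_size, stride):
    """Return (output_size, dropped) along one dimension in VALID mode."""
    output_size = (input_size - kernel_size) // stride + 1
    # Index just past the last input position touched by the final window.
    covered = (output_size - 1) * stride + kernel_size
    dropped = input_size - covered
    return output_size, dropped


for stride in (1, 2, 3, 4):
    print(stride, valid_coverage(43, 5, stride))
# 1 (39, 0)
# 2 (20, 0)
# 3 (13, 2)
# 4 (10, 2)
```

With a 5x5 kernel on a 43-wide input, strides 1 and 2 use every row, while strides 3 and 4 leave the last two rows and columns unused.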

import tensorflow as tf

# Same test again with a 3x3 kernel
data=tf.Variable(tf.random_normal([64,43,43,3]),dtype=tf.float32)
weight=tf.Variable(tf.random_normal([3,3,3,64]),dtype=tf.float32)

sess=tf.InteractiveSession()
tf.global_variables_initializer().run()
conv1=tf.nn.conv2d(data,weight,strides=[1,1,1,1],padding='VALID')
conv2=tf.nn.conv2d(data,weight,strides=[1,2,2,1],padding='VALID')
conv3=tf.nn.conv2d(data,weight,strides=[1,3,3,1],padding='VALID')
conv4=tf.nn.conv2d(data,weight,strides=[1,4,4,1],padding='VALID')

print(conv1)
print(conv2)
print(conv3)
print(conv4)

Output:
Tensor("Conv2D_28:0", shape=(64, 41, 41, 64), dtype=float32)
Tensor("Conv2D_29:0", shape=(64, 21, 21, 64), dtype=float32)
Tensor("Conv2D_30:0", shape=(64, 14, 14, 64), dtype=float32)
Tensor("Conv2D_31:0", shape=(64, 11, 11, 64), dtype=float32)

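Comparing the two sets of results shows that in VALID mode the output size depends on both the kernel size and the stride. The formula is:

output_size = floor((input_size - kernel_size + 2 * padding) / stride) + 1

Since this is VALID mode, padding = 0.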
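
As a quick check, the sketch below evaluates the standard output-size formula, floor((input_size - kernel_size + 2 * padding) / stride) + 1, with padding = 0 and compares it with the shapes printed in the two VALID experiments. The function name conv_output_size is hypothetical and chosen for this example only.

```python
def conv_output_size(input_size, kernel_size, stride, padding=0):
    """Output size along one dimension for an explicit padding width per side."""
    return (input_size - kernel_size + 2 * padding) // stride + 1


# Heights/widths observed above for strides 1 to 4, keyed by kernel size.
observed = {5: [39, 20, 13, 10], 3: [41, 21, 14, 11]}

for kernel_size, sizes in observed.items():
    computed = [conv_output_size(43, kernel_size, s) for s in (1, 2, 3, 4)]
    print(kernel_size, computed, computed == sizes)
# 5 [39, 20, 13, 10] True
# 3 [41, 21, 14, 11] True
```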
