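The link below collects all my notes on HR-Net (human pose estimation). If you spot a mistake, please point it out and I will correct it right away. Anyone interested is welcome to add me on WeChat (a944284742) to discuss technical details. If this post helped you, please leave a like; it is the biggest encouragement for me.
Pose Estimation 1-00: HR-Net (Human Pose Estimation) - Table of Contents - The Latest, No-Blind-Spot Walkthrough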
Preface
The previous post covered in detail the following methods of the class PoseHighResolutionNet in lib/models/pose_hrnet.py:
def __init__(self, cfg, **kwargs):
......
def forward(self, x):
......
At the end we saw that __init__ calls:
def _make_transition_layer(self, num_channels_pre_layer, num_channels_cur_layer):
def _make_stage(self, layer_config, num_inchannels, multi_scale_output=True):
These two functions are relatively complex and also among the most central, so we analyze them below.
_make_transition_layer
# Method of PoseHighResolutionNet; assumes `import torch.nn as nn`
def _make_transition_layer(
        self, num_channels_pre_layer, num_channels_cur_layer):
    """
    :param num_channels_pre_layer: output channels of each parallel branch
        of the previous stage, as a list
        stage=2: num_channels_pre_layer = [256]
        stage=3: num_channels_pre_layer = [32, 64]
        stage=4: num_channels_pre_layer = [32, 64, 128]
    :param num_channels_cur_layer: output channels of each parallel branch
        of the current stage, as a list
        stage=2: num_channels_cur_layer = [32, 64]
        stage=3: num_channels_cur_layer = [32, 64, 128]
        stage=4: num_channels_cur_layer = [32, 64, 128, 256]
    """
    num_branches_cur = len(num_channels_cur_layer)
    num_branches_pre = len(num_channels_pre_layer)

    transition_layers = []
    # Handle every branch of the current stage
    for i in range(num_branches_cur):
        # Branch i already exists in the previous stage
        if i < num_branches_pre:
            # Channel counts differ: use a 3x3 conv (stride 1) to change them
            if num_channels_cur_layer[i] != num_channels_pre_layer[i]:
                transition_layers.append(
                    nn.Sequential(
                        nn.Conv2d(
                            num_channels_pre_layer[i],
                            num_channels_cur_layer[i],
                            3, 1, 1, bias=False
                        ),
                        nn.BatchNorm2d(num_channels_cur_layer[i]),
                        nn.ReLU(inplace=True)
                    )
                )
            # Channel counts match: nothing to do
            else:
                transition_layers.append(None)
        # Branch i is new: build it from the last branch of the previous
        # stage (each stride-2 conv halves the resolution)
        else:
            conv3x3s = []
            for j in range(i+1-num_branches_pre):
                inchannels = num_channels_pre_layer[-1]
                # Only the last conv in the chain changes the channel count
                outchannels = num_channels_cur_layer[i] \
                    if j == i-num_branches_pre else inchannels
                conv3x3s.append(
                    nn.Sequential(
                        nn.Conv2d(
                            inchannels, outchannels, 3, 2, 1, bias=False
                        ),
                        nn.BatchNorm2d(outchannels),
                        nn.ReLU(inplace=True)
                    )
                )
            transition_layers.append(nn.Sequential(*conv3x3s))

    return nn.ModuleList(transition_layers)
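To make the branching logic easier to follow, here is a minimal pure-Python sketch that mirrors the control flow of _make_transition_layer without building any PyTorch layers. The names transition_plan and conv_out_size are hypothetical helpers introduced only for this illustration; they are not part of the HR-Net code. Each planned conv is written as a tuple (kind, in_channels, out_channels, stride).

```python
def transition_plan(pre_channels, cur_channels):
    """Hypothetical helper: list what would be built for each branch.

    None means the branch is passed through unchanged; otherwise the entry
    is a list of (kind, in_channels, out_channels, stride) tuples.
    """
    plan = []
    num_pre = len(pre_channels)
    for i, out_ch in enumerate(cur_channels):
        if i < num_pre:
            # Existing branch: convert channels only when they differ
            if out_ch != pre_channels[i]:
                plan.append([("conv3x3", pre_channels[i], out_ch, 1)])
            else:
                plan.append(None)
        else:
            # New branch: chain of stride-2 convs from the last old branch
            convs = []
            for j in range(i + 1 - num_pre):
                in_ch = pre_channels[-1]
                # Only the final conv of the chain changes the channel count
                o = out_ch if j == i - num_pre else in_ch
                convs.append(("conv3x3", in_ch, o, 2))
            plan.append(convs)
    return plan


def conv_out_size(size, kernel=3, stride=2, padding=1):
    """Standard convolution output-size formula (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


stage2 = transition_plan([256], [32, 64])
stage3 = transition_plan([32, 64], [32, 64, 128])
stage4 = transition_plan([32, 64, 128], [32, 64, 128, 256])
print(stage2)
print(stage3)
print(stage4)
```

With the channel lists from the docstring, only stage 2 needs a channel-changing conv on an existing branch (256 to 32); in stages 3 and 4 the existing branches are passed through as None and a single stride-2 conv creates the new, lower-resolution branch. Assuming a feature map side of 64 (an illustrative value, not taken from the post), conv_out_size(64) gives 32, which is the halving mentioned in the comments.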
_make_stage
def _make_stage(self, layer_config, num_inchannels,
                multi_scale_output=True):
    """
    stage=2: num_inchannels=[32, 64]           multi_scale_output=True
    stage=3: num_inchannels=[32, 64, 128]      multi_scale_output=True
    stage=4: num_inchannels=[32, 64, 128, 256] multi_scale_output=False
    """
    # For stage=2,3,4, num_modules is 1, 4, 3 respectively: the number of
    # HighResolutionModule blocks (where the parallel branches exchange
    # information)
    num_modules = layer_config['NUM_MODULES']
    # For stage=2,3,4, num_branches is 2, 3, 4: the number of parallel
    # branches in the stage
    num_branches = layer_config['NUM_BRANCHES']
    # For stage=2,3,4, num_blocks is [4,4], [4,4,4], [4,4,4,4]: the number
    # of blocks (BasicBlock or Bottleneck) in each branch
    num_blocks = layer_config['NUM_BLOCKS']
    # For stage=2,3,4, num_channels is [32,64], [32,64,128], [32,64,128,256]:
    # the output channels of each parallel branch in the stage
    num_channels = layer_config['NUM_CHANNELS']
    # For stage=2,3,4, block is BasicBlock in each case
    block = blocks_dict[layer_config['BLOCK']]
    # For stage=2,3,4, fuse_method is always SUM: how features are fused
    fuse_method = layer_config['FUSE_METHOD']

    modules = []
    # Create num_modules HighResolutionModule instances
    for i in range(num_modules):
        # multi_scale_output only takes effect in the last
        # HighResolutionModule of the stage
        if not multi_scale_output and i == num_modules - 1:
            reset_multi_scale_output = False
        else:
            reset_multi_scale_output = True

        # Build a HighResolutionModule from these parameters and append it
        # to modules
        modules.append(
            HighResolutionModule(
                num_branches,              # number of parallel branches in this stage
                block,                     # BasicBlock or Bottleneck
                num_blocks,                # number of blocks per branch
                num_inchannels,            # input channels per branch
                num_channels,              # output channels per branch
                fuse_method,               # feature fusion method
                reset_multi_scale_output   # whether to output all scales
            )
        )
        # Output channels of the module just added; they become the input
        # channels of the next module
        num_inchannels = modules[-1].get_num_inchannels()

    return nn.Sequential(*modules), num_inchannels
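The only non-obvious part of _make_stage is how multi_scale_output is turned into a per-module flag. The following pure-Python sketch reproduces just that loop, with no PyTorch involved. The function multi_scale_flags and the STAGE_CFGS dictionary are hypothetical names for this illustration; the values in STAGE_CFGS are the ones quoted in the comments above, and the real settings come from the experiment configuration.

```python
# Hypothetical per-stage settings, copied from the values in the comments above
STAGE_CFGS = {
    2: {"NUM_MODULES": 1, "NUM_BRANCHES": 2, "multi_scale_output": True},
    3: {"NUM_MODULES": 4, "NUM_BRANCHES": 3, "multi_scale_output": True},
    4: {"NUM_MODULES": 3, "NUM_BRANCHES": 4, "multi_scale_output": False},
}


def multi_scale_flags(num_modules, multi_scale_output=True):
    """Hypothetical helper: the flag each HighResolutionModule would get."""
    flags = []
    for i in range(num_modules):
        # Same condition as in _make_stage: only the last module can be
        # switched to single-scale output
        if not multi_scale_output and i == num_modules - 1:
            flags.append(False)
        else:
            flags.append(True)
    return flags


for stage, cfg in STAGE_CFGS.items():
    flags = multi_scale_flags(cfg["NUM_MODULES"], cfg["multi_scale_output"])
    print(stage, flags)
```

Under these assumed settings, every module in stages 2 and 3 keeps multi-scale output, and only the last of the three modules in stage 4 receives False.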
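Summary
With the annotations above, both functions should now be reasonably clear. It is also evident that one more core piece remains: the HighResolutionModule, which is the key to information exchange between the parallel branches. The next post is dedicated to it.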