Hi folks, today let's walk through the idea behind ResNet and a PyTorch implementation of ResNet18.
Principle
ResNet should be familiar to most readers. Short for "residual network", it is an architecture proposed by Kaiming He.
The paper opens with a question: does a deeper neural network always perform better? The answer is no. For instance, very deep networks suffer from the well-known vanishing/exploding gradient problem, although that particular issue is largely addressed by batch normalization.
But what about going even deeper?
As the layer count grows, the loss can actually rise instead of fall (see the figure below), and this degradation is not caused by overfitting.
To build deeper networks without this degradation, the paper starts from an assumption: if the features learned by a shallow network are passed on to deeper layers, the deeper network should perform at least as well as the shallow one (certainly no worse). As long as the output dimensions match, an identity mapping can be used to carry the features forward.
So what does an identity mapping look like? As shown below, it is very simple: the output X of the shallower layers is added to the output of the two or three convolutional layers that follow. Because the two branches produce outputs of the same shape, they can be added element-wise.
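In code, the identity mapping is nothing exotic. Here is a minimal sketch (the channel counts and sizes are illustrative): F(x) is computed by two convolutions, and the input is added back element-wise:

```python
import torch
import torch.nn as nn

# F(x): two 3x3 convolutions that preserve the shape of x.
F = nn.Sequential(
    nn.Conv2d(64, 64, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(64, 64, kernel_size=3, padding=1),
)
x = torch.randn(1, 64, 56, 56)
out = F(x) + x  # the identity mapping: both terms are (1, 64, 56, 56)
print(out.shape)
```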
The structure above is called a building block. The paper proposes resnet18, resnet34, resnet50, resnet101, and resnet152, where the number after "resnet" is the layer count. The 18- and 34-layer networks use the building block on the left of the figure above, while the 50-, 101-, and 152-layer networks use the form on the right.
The five proposed architectures are shown below:
How do we read this table? Take the 18-layer network, ResNet18, as an example.
ResNet18 starts with a convolutional layer and a pooling layer, followed by four "stages". Each stage consists of two building blocks, and each building block contains two convolutional layers, so the four stages contribute 16 layers. The network ends with an average pool and a fully connected layer. Pooling layers have no learnable parameters and are not counted, which gives 1 + 16 + 1 = 18 layers in total.
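The 18-layer tally can be checked with a quick calculation (counting only layers that have learnable weights):

```python
# ResNet18: 1 stem conv + 4 stages x 2 blocks x 2 convs + 1 fc = 18 layers
stem, stages, blocks_per_stage, convs_per_block, fc = 1, 4, 2, 2, 1
total = stem + stages * blocks_per_stage * convs_per_block + fc
print(total)  # 18
```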
Source code
To make things easier to follow, I have trimmed down the PyTorch ResNet source; the code below implements only resnet18 and resnet34.
If you need more, consult the full PyTorch resnet source.
The input image is 224×224. After the first convolution, with stride 2, the output becomes 112×112; a max pool then halves the size again to 56×56. Both are standard CNN operations.
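These sizes follow the usual convolution output formula, floor((n + 2p − k) / s) + 1. A quick check of both stem stages:

```python
def out_size(n, k, s, p):
    # Output size of a conv/pool layer: floor((n + 2p - k) / s) + 1
    return (n + 2 * p - k) // s + 1

after_conv = out_size(224, k=7, s=2, p=3)         # 7x7 conv, stride 2
after_pool = out_size(after_conv, k=3, s=2, p=1)  # 3x3 max pool, stride 2
print(after_conv, after_pool)  # 112 56
```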
It is straightforward to define the following components in __init__:
self.conv1 = nn.Conv2d(1, self.inplanes, kernel_size=7, stride=2, padding=3, bias=False)  # 1 input channel here; the standard ImageNet model uses 3 (RGB)
self.bn1 = nn.BatchNorm2d(self.inplanes)
self.relu = nn.ReLU(inplace=True)
self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
and write the following in the forward() pass:
x = self.conv1(x)
x = self.bn1(x)
x = self.relu(x)
x = self.maxpool(x)
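As a sanity check, the stem above can be run on its own (assembled here as a Sequential so the snippet is self-contained, with the same 1-channel input):

```python
import torch
import torch.nn as nn

# The stem from the snippet above: conv -> BN -> ReLU -> max pool.
stem = nn.Sequential(
    nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False),
    nn.BatchNorm2d(64),
    nn.ReLU(inplace=True),
    nn.MaxPool2d(kernel_size=3, stride=2, padding=1),
)
x = torch.randn(2, 1, 224, 224)
out = stem(x)
print(out.shape)  # (2, 64, 56, 56)
```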
Now look at the small stage block in the table: it is the building block described above. Two building blocks chained together give four convolutional layers. ResNet18 has four such stages, each with four convolutional layers; the stages differ only in feature-map resolution and number of channels. A natural approach is to define a building-block class, chain two of them into one "stage", and call the four stages in the forward pass.
We define one building block as BasicBlock. Since the input and output dimensions can differ, they are passed in as arguments. The key part is the identity mapping: in code, we simply save the input as identity before the forward computation, and after the two convolutions add it to the output. Before the addition, we must check whether the identity needs to be downsampled first, so that its shape matches the output.
class BasicBlock(nn.Module):
    # OutChannal is the number of output channels of both 3x3 convolutions.
    def __init__(self, InChannal, OutChannal, stride=1, downsample=None):
        super(BasicBlock, self).__init__()
        self.conv1 = nn.Conv2d(InChannal, OutChannal, kernel_size=3, stride=stride, bias=False, padding=1)
        self.bn1 = nn.BatchNorm2d(OutChannal)
        self.relu = nn.ReLU(inplace=True)
        self.conv2 = nn.Conv2d(OutChannal, OutChannal, kernel_size=3, bias=False, padding=1)
        self.bn2 = nn.BatchNorm2d(OutChannal)
        self.downsample = downsample
        self.stride = stride

    def forward(self, x):
        identity = x
        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)
        out = self.conv2(out)
        out = self.bn2(out)
        # Project the identity when the main branch changed shape.
        if self.downsample is not None:
            identity = self.downsample(x)
        out += identity  # the residual connection: ResNet's essence
        out = self.relu(out)
        return out
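The downsample branch deserves a closer look. When a block changes resolution or channel count (say 64 → 128 at stride 2), the saved identity no longer matches the main branch, so a 1×1 convolution plus BN projects it. A standalone sketch:

```python
import torch
import torch.nn as nn

# Shortcut projection: a 1x1 conv with stride 2 reshapes the identity so it
# matches a main branch that went 64 -> 128 channels at half the resolution.
downsample = nn.Sequential(
    nn.Conv2d(64, 128, kernel_size=1, stride=2, bias=False),
    nn.BatchNorm2d(128),
)
x = torch.randn(2, 64, 56, 56)
identity = downsample(x)
print(identity.shape)  # (2, 128, 28, 28), ready to be added to the output
```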
At this point things should be getting clearer: we only need two classes, BasicBlock and ResNet. BasicBlock is the basic module that ResNet assembles. With BasicBlock done, we continue with ResNet itself: the next step is to chain two BasicBlocks together into one of the "stages" described above.
Inside the ResNet class we define a helper, _make_layer, that builds a stage with nn.Sequential. When downsampling is needed (i.e. the stride is 2, or the channel count changes), it creates the downsample projection for the first block, then appends the remaining blocks (as many as block_num specifies). As the code shows, every stage except the first downsamples at its first block.
def _make_layer(self, Basicblock, planes, block_num, stride=1):
    downsample = None
    # Project the shortcut when the resolution or channel count changes.
    if stride != 1 or self.inplanes != planes:
        downsample = nn.Sequential(
            nn.Conv2d(self.inplanes, planes, kernel_size=1, stride=stride, bias=False),
            nn.BatchNorm2d(planes),
        )
    layers = []
    # 1. The first block, which may downsample.
    layers.append(Basicblock(self.inplanes, planes, stride, downsample))
    self.inplanes = planes
    # 2. The remaining blocks keep the same shape.
    for i in range(1, block_num):
        layers.append(Basicblock(self.inplanes, planes))
    return nn.Sequential(*layers)
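For ResNet18, _make_layer is called four times with the plan below; reproducing its downsample condition shows that every stage except the first needs the shortcut projection:

```python
# (planes, block_num, stride) for resnet18's four stages
plan = [(64, 2, 1), (128, 2, 2), (256, 2, 2), (512, 2, 2)]

inplanes = 64
flags = []
for planes, block_num, stride in plan:
    # The same condition _make_layer uses to decide on a downsample branch
    flags.append(stride != 1 or inplanes != planes)
    inplanes = planes
print(flags)  # [False, True, True, True]
```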
So the "stages" of resnet18, each built from BasicBlocks, are wrapped up by the _make_layer function above.
Finally, you just "stack the bricks" inside ResNet. For example, resnet18 goes through a convolution, BN, ReLU, and pooling layer, then the four stages, and ends with an average pool and a fully connected layer (the fc outputs logits; softmax is typically folded into the loss function).
First define in __init__:
self.conv1 = nn.Conv2d(1, self.inplanes, kernel_size=7, stride=2, padding=3, bias=False)
self.bn1 = nn.BatchNorm2d(self.inplanes)
self.relu = nn.ReLU(inplace=True)
self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
self.layer1 = self._make_layer(Basicblock, 64, block_num=layers[0], stride=1)
self.layer2 = self._make_layer(Basicblock, 128, block_num=layers[1], stride=2)
self.layer3 = self._make_layer(Basicblock, 256, block_num=layers[2], stride=2)
self.layer4 = self._make_layer(Basicblock, 512, block_num=layers[3], stride=2)
self.avgpool = nn.AdaptiveAvgPool2d((1, 1))
self.fc = nn.Linear(512, num_classes)
then write the forward pass:
def forward(self, x):
    x = self.conv1(x)
    x = self.bn1(x)
    x = self.relu(x)
    x = self.maxpool(x)
    x = self.layer1(x)
    x = self.layer2(x)
    x = self.layer3(x)
    x = self.layer4(x)
    x = self.avgpool(x)
    x = torch.flatten(x, 1)
    x = self.fc(x)
    return x
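One detail worth noting: forward() ends at self.fc and returns raw logits with no softmax, because nn.CrossEntropyLoss applies log-softmax internally. A self-contained sketch of the head (the class count and batch size are illustrative):

```python
import torch
import torch.nn as nn

# The classifier head: global average pool -> flatten -> linear.
avgpool = nn.AdaptiveAvgPool2d((1, 1))
fc = nn.Linear(512, 10)  # 10 classes, just for illustration

features = torch.randn(4, 512, 7, 7)     # like layer4's output for a batch of 4
x = torch.flatten(avgpool(features), 1)  # -> (4, 512)
logits = fc(x)                           # -> (4, 10), no softmax applied
loss = nn.CrossEntropyLoss()(logits, torch.tensor([0, 1, 2, 3]))
print(logits.shape)
```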
Then we define helper functions to build the models. Since ResNet34 uses the same BasicBlock, this works for 34 layers as well.
def _resnet(block, layers, **kwargs):
    model = ResNet(block, layers, **kwargs)
    return model

def resnet18():
    return _resnet(BasicBlock, [2, 2, 2, 2])

def resnet34():
    return _resnet(BasicBlock, [3, 4, 6, 3])
Finally, just call the function:
net = resnet18()
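With a 224×224 input, the four stages take the 56×56 stem output down to 7×7 before the average pool; each stride-2 stage halves the spatial size, which can be verified with the 3×3/padding-1 formula:

```python
def halve(n):
    # Output size of the stride-2, kernel-3, padding-1 conv opening a stage
    return (n + 2 * 1 - 3) // 2 + 1

n = 56  # spatial size after the stem (layer1 keeps it unchanged)
sizes = [n]
for _ in range(3):  # layer2, layer3 and layer4 each downsample once
    n = halve(n)
    sizes.append(n)
print(sizes)  # [56, 28, 14, 7]
```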
And that's a basic ResNet. If you're interested, try writing a resnet50 yourself~
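As a starting point for that exercise: resnet50/101/152 replace BasicBlock with the "bottleneck" block shown on the right of the paper's figure (1×1 reduce, 3×3, 1×1 expand by 4). The sketch below follows the paper's design rather than any particular codebase, so treat the details as illustrative:

```python
import torch
import torch.nn as nn

# Bottleneck block for the deeper ResNets: 1x1 reduce -> 3x3 -> 1x1 expand.
class Bottleneck(nn.Module):
    expansion = 4  # output channels = planes * 4

    def __init__(self, inplanes, planes, stride=1, downsample=None):
        super().__init__()
        self.conv1 = nn.Conv2d(inplanes, planes, kernel_size=1, bias=False)
        self.bn1 = nn.BatchNorm2d(planes)
        self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=stride,
                               padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(planes)
        self.conv3 = nn.Conv2d(planes, planes * self.expansion,
                               kernel_size=1, bias=False)
        self.bn3 = nn.BatchNorm2d(planes * self.expansion)
        self.relu = nn.ReLU(inplace=True)
        self.downsample = downsample

    def forward(self, x):
        identity = x if self.downsample is None else self.downsample(x)
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.relu(self.bn2(self.conv2(out)))
        out = self.bn3(self.conv3(out))
        return self.relu(out + identity)

# The shortcut must project to planes * 4 channels before the addition.
block = Bottleneck(64, 64, downsample=nn.Sequential(
    nn.Conv2d(64, 256, kernel_size=1, bias=False), nn.BatchNorm2d(256)))
out = block(torch.randn(1, 64, 56, 56))
print(out.shape)  # (1, 256, 56, 56)
```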
Here is the complete code:
import torch
import torch.nn as nn


# A BasicBlock has two convolution layers.
class BasicBlock(nn.Module):
    # OutChannal is the number of output channels of both 3x3 convolutions.
    def __init__(self, InChannal, OutChannal, stride=1, downsample=None):
        super(BasicBlock, self).__init__()
        self.conv1 = nn.Conv2d(InChannal, OutChannal, kernel_size=3, stride=stride, bias=False, padding=1)
        self.bn1 = nn.BatchNorm2d(OutChannal)
        self.relu = nn.ReLU(inplace=True)
        self.conv2 = nn.Conv2d(OutChannal, OutChannal, kernel_size=3, bias=False, padding=1)
        self.bn2 = nn.BatchNorm2d(OutChannal)
        self.downsample = downsample
        self.stride = stride

    def forward(self, x):
        identity = x
        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)
        out = self.conv2(out)
        out = self.bn2(out)
        if self.downsample is not None:
            identity = self.downsample(x)
        out += identity  # the residual connection: ResNet's essence
        out = self.relu(out)
        return out


class ResNet(nn.Module):
    def __init__(self, Basicblock, layers, num_classes=1000):  # was Config.NUM_CLASSES in the original project
        super(ResNet, self).__init__()
        self.inplanes = 64
        # 1 input channel here; the standard ImageNet model uses 3 (RGB).
        self.conv1 = nn.Conv2d(1, self.inplanes, kernel_size=7, stride=2, padding=3, bias=False)
        self.bn1 = nn.BatchNorm2d(self.inplanes)
        self.relu = nn.ReLU(inplace=True)
        self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
        self.layer1 = self._make_layer(Basicblock, 64, block_num=layers[0], stride=1)
        self.layer2 = self._make_layer(Basicblock, 128, block_num=layers[1], stride=2)
        self.layer3 = self._make_layer(Basicblock, 256, block_num=layers[2], stride=2)
        self.layer4 = self._make_layer(Basicblock, 512, block_num=layers[3], stride=2)
        self.avgpool = nn.AdaptiveAvgPool2d((1, 1))
        self.fc = nn.Linear(512, num_classes)
        # Conv2d and BatchNorm2d layers are initialized in different ways.
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
            elif isinstance(m, nn.BatchNorm2d):
                nn.init.constant_(m.weight, 1)
                nn.init.constant_(m.bias, 0)
        # Zero-initialize the last BN in each residual branch, which can
        # improve performance (each block then starts out as an identity).
        for m in self.modules():
            if isinstance(m, BasicBlock):
                nn.init.constant_(m.bn2.weight, 0)

    def _make_layer(self, Basicblock, planes, block_num, stride=1):
        downsample = None
        # Project the shortcut when the resolution or channel count changes.
        if stride != 1 or self.inplanes != planes:
            downsample = nn.Sequential(
                nn.Conv2d(self.inplanes, planes, kernel_size=1, stride=stride, bias=False),
                nn.BatchNorm2d(planes),
            )
        layers = []
        # 1. The first block, which may downsample.
        layers.append(Basicblock(self.inplanes, planes, stride, downsample))
        self.inplanes = planes
        # 2. The remaining blocks keep the same shape.
        for i in range(1, block_num):
            layers.append(Basicblock(self.inplanes, planes))
        return nn.Sequential(*layers)

    def forward(self, x):
        x = self.conv1(x)
        x = self.bn1(x)
        x = self.relu(x)
        x = self.maxpool(x)
        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        x = self.layer4(x)
        x = self.avgpool(x)
        x = torch.flatten(x, 1)
        x = self.fc(x)
        return x


def _resnet(block, layers, **kwargs):
    model = ResNet(block, layers, **kwargs)
    return model


def resnet18():
    return _resnet(BasicBlock, [2, 2, 2, 2])


def resnet34():
    return _resnet(BasicBlock, [3, 4, 6, 3])