ResNet Explained, with a PyTorch Implementation (a detailed guide to ResNet's principles and code)

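As the figure shows, the authors trained 20-layer and 56-layer plain networks on CIFAR-10, and the 56-layer network ended up with higher training error and higher test error than the shallower 20-layer one. This is the degradation problem of deep networks that ResNet sets out to solve.
ResNet alleviates this degradation, as the next figure shows.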

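The figure shows the authors' training results on ImageNet. Without the residual structure (left panel), the 34-layer plain network, plain-34, has a higher error than the 18-layer plain-18. With the residual structure, the 34-layer ResNet-34 has a lower error than the 18-layer ResNet-18. In other words, once residual connections are used, the deeper network in this comparison performs better instead of degrading.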

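2. ResNet Principles and Architecture

This section covers the principle behind ResNet and its architecture.
Suppose we want a block of layers to learn a mapping H(x), and that fitting H(x) directly is hard. Learning the residual function F(x) = H(x) - x instead can be much easier: when the desired mapping is close to the identity, the block only has to push F(x) toward 0 rather than fit some specific mapping from scratch. The final mapping is then recovered by adding x back, H(x) = F(x) + x, as shown in the figure.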

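[Figure: residual learning building block]
The output y of this block is therefore

y = F(x) + x

Because the addition requires x and F(x) to have the same dimensions, the general form is the following, where W_s is a projection used to match the dimensions.

y = F(x) + W_s x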
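The paper mentions two ways to match the dimensions: (A) zero-padding the extra channels, or (B) a 1x1 convolution that increases the number of channels.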
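
As a rough illustration of the residual formulation and the two dimension-matching options, the sketch below works on plain Python lists instead of feature maps. The names `residual_block`, `zero_pad_shortcut` and `projection_shortcut` are made up for this example, and the residual function is a toy stand-in for the stacked convolution layers, not a real network.

```python
# Toy illustration of y = F(x) + x and y = F(x) + W_s x on plain lists.
# All names are hypothetical; F is a stand-in for the stacked conv layers.

def add(a, b):
    """Element-wise addition; both vectors must have the same length."""
    if len(a) != len(b):
        raise ValueError("shortcut and residual must have the same dimension")
    return [ai + bi for ai, bi in zip(a, b)]

def zero_pad_shortcut(x, out_dim):
    """Option A: keep x and pad the extra dimensions with zeros."""
    return list(x) + [0.0] * (out_dim - len(x))

def projection_shortcut(x, w_s):
    """Option B: multiply x by a projection matrix W_s (a 1x1 conv does this per pixel)."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in w_s]

def residual_block(x, residual_fn, shortcut_fn=None):
    """Return F(x) + shortcut(x); the shortcut is the identity when shortcut_fn is None."""
    fx = residual_fn(x)
    shortcut = x if shortcut_fn is None else shortcut_fn(x)
    return add(fx, shortcut)

x = [1.0, 2.0]

# If the residual branch outputs zeros, the block is exactly the identity mapping.
same = residual_block(x, lambda v: [0.0] * len(v))

# A residual branch that doubles the dimension (2 -> 4) needs a matching shortcut.
widen = lambda v: [0.5 * vi for vi in v] + [1.0] * len(v)
y_pad = residual_block(x, widen, lambda v: zero_pad_shortcut(v, 4))
w_s = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]]  # 4x2 projection matrix
y_proj = residual_block(x, widen, lambda v: projection_shortcut(v, w_s))

print(same, y_pad, y_proj)
```

Calling `residual_block(x, widen)` without a shortcut function raises the `ValueError`, which is the dimension mismatch that W_s exists to resolve.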

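The paper defines two basic block structures. BasicBlock is used in ResNet-34 and shallower networks, and BottleNeck is used in ResNet-50 and deeper ones. Once these two blocks are understood, a ResNet is simply a stack of them.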

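2.1 BasicBlock Structure

The BasicBlock structure is shown in the figure.
[Figure: BasicBlock structure]
The block consists of two 3x3 convolutional layers, both with 64 channels. Note the skip line, the shortcut connection, which adds the input x to the output.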

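2.2 BottleNeck Structure

The BottleNeck structure is shown in the figure.
[Figure: BottleNeck structure]
The block is a 1x1 convolution, then a 3x3 convolution, then another 1x1 convolution. Note that the number of channels changes inside the block: the first 1x1 convolution reduces the number of channels of the feature map and the last one restores it, so that the output can be added to the identity mapping x. Just as importantly, changing the channel count with 1x1 convolutions reduces the number of parameters, which is why the deeper networks use BottleNeck rather than BasicBlock.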
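
To make the parameter saving concrete, here is a small back-of-the-envelope sketch. It counts only convolution weights (no bias, no BN), and the channel sizes, 256 at the block boundary and 64 inside the bottleneck, are an assumed example rather than values taken from the text; `conv_params` is a helper invented for this illustration.

```python
# Weight count of a convolution without bias: kernel * kernel * in_channels * out_channels.
# The 256/64 channel sizes below are an assumed example.

def conv_params(in_ch, out_ch, kernel):
    return kernel * kernel * in_ch * out_ch

channels = 256     # channels entering and leaving the block
bottleneck = 64    # reduced width inside the BottleNeck (256 / 4)

# BasicBlock-style: two 3x3 convolutions at full width.
basic = 2 * conv_params(channels, channels, 3)

# BottleNeck: 1x1 reduce, 3x3 at reduced width, 1x1 restore.
bottle = (conv_params(channels, bottleneck, 1)
          + conv_params(bottleneck, bottleneck, 3)
          + conv_params(bottleneck, channels, 1))

print(basic, bottle, round(basic / bottle, 1))
```

Under these assumptions the two full-width 3x3 convolutions need roughly 17 times as many weights as the bottleneck design with the same input and output width.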

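2.3 ResNet Structure

With BasicBlock and BottleNeck understood, a ResNet is built by stacking them. The figure shows the ResNet architectures at five different depths.

[Figure: ResNet architectures at five depths]
Each layer group in the figure is made of the BasicBlock or BottleNeck blocks described above. The next figure shows the ResNet-34 architecture; the dashed shortcut lines mark the places where the number of channels changes and the shortcut has to adjust the channels.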
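
The depth in each network's name can be checked by counting weighted layers: one stem convolution, the convolutions inside the stacked blocks, and one fully connected layer. The sketch below does this count. The stage configurations for resnet34 and resnet101 are the ones used in the listing in section 3.3; the other three are assumed from torchvision, and the dictionaries and the `depth` helper are only for illustration.

```python
# Count weighted layers: 1 stem conv + convs inside the blocks + 1 fully connected layer.
# Shortcut (downsample) convolutions are not counted, following the usual naming convention.
CONVS_PER_BLOCK = {"BasicBlock": 2, "BottleNeck": 3}

CONFIGS = {
    "resnet18": ("BasicBlock", [2, 2, 2, 2]),
    "resnet34": ("BasicBlock", [3, 4, 6, 3]),
    "resnet50": ("BottleNeck", [3, 4, 6, 3]),
    "resnet101": ("BottleNeck", [3, 4, 23, 3]),
    "resnet152": ("BottleNeck", [3, 8, 36, 3]),
}

def depth(block, blocks_per_stage):
    return 1 + CONVS_PER_BLOCK[block] * sum(blocks_per_stage) + 1

for name, (block, stages) in CONFIGS.items():
    print(name, depth(block, stages))
```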

3. ResNet Code Walkthrough (PyTorch)

This part goes through the official PyTorch ResNet source code, starting with the BasicBlock and BottleNeck code blocks.

3.1 BasicBlock Code

import torch.nn as nn

# Define BasicBlock (con3x3 is the 3x3 convolution helper defined in section 3.3)
class BasicBlock(nn.Module):
    expansion = 1

    def __init__(self, inplanes, planes, stride=1, downsample=None, groups=1,
                 base_width=64, dilation=1, norm_layer=None):
        super(BasicBlock, self).__init__()
        if norm_layer is None:
            norm_layer = nn.BatchNorm2d
        if groups != 1 or base_width != 64:
            raise ValueError('BasicBlock only supports groups=1 and base_width=64')
        if dilation > 1:
            raise NotImplementedError("Dilation > 1 not supported in BasicBlock")

        # Define the layers of BasicBlock
        self.conv1 = con3x3(inplanes, planes, stride)
        self.bn1 = norm_layer(planes)
        # inplace=True overwrites the input tensor; the default, False, allocates a new tensor for the result
        self.relu = nn.ReLU(inplace=True)
        self.conv2 = con3x3(planes, planes)
        self.bn2 = norm_layer(planes)
        self.downsample = downsample
        self.stride = stride

    # The forward pass connects the layers defined above
    def forward(self, x):
        identity = x  # the residual block has to keep the original input

        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)

        out = self.conv2(out)
        out = self.bn2(out)

        # Make the shortcut match the shape of the convolution output before adding
        if self.downsample is not None:
            identity = self.downsample(x)

        out += identity
        out = self.relu(out)

        return out
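
Whether the shortcut needs a downsample branch depends on how the convolutions change the spatial size. The sketch below applies the output-size formula from the `nn.Conv2d` documentation to the settings used here (3x3 kernel with padding equal to dilation, and a strided 1x1 convolution on the shortcut). The helper name `conv_out_size` and the input size 56 are assumptions made for the example.

```python
# Output size of a convolution along one spatial dimension:
# out = floor((in + 2*padding - dilation*(kernel - 1) - 1) / stride) + 1

def conv_out_size(size, kernel, stride=1, padding=0, dilation=1):
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1

# 3x3 convolution as used in the block: padding equals dilation.
keep = conv_out_size(56, kernel=3, stride=1, padding=1)
half = conv_out_size(56, kernel=3, stride=2, padding=1)
dilated = conv_out_size(56, kernel=3, stride=1, padding=2, dilation=2)

# 1x1 convolution with stride 2, as used on the shortcut when a stage downsamples.
shortcut = conv_out_size(56, kernel=1, stride=2)

print(keep, half, dilated, shortcut)
```

With stride 1 the 3x3 convolution keeps the size, so the identity can be added directly. With stride 2 the main branch halves it, and the shortcut must be halved in the same way before the addition.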

3.2 BottleNeck Code

import torch.nn as nn

# Define the Bottleneck block (the basic block used by ResNet-50 and deeper).
# con1x1 and con3x3 are the convolution helpers defined in section 3.3.
class Bottleneck(nn.Module):
    expansion = 4  # a Bottleneck block outputs planes * 4 channels

    def __init__(self, inplanes, planes, stride=1, downsample=None, groups=1,
                 base_width=64, dilation=1, norm_layer=None):
        super(Bottleneck, self).__init__()
        if norm_layer is None:
            norm_layer = nn.BatchNorm2d
        width = int(planes * (base_width / 64.)) * groups
        # Define the layers of Bottleneck
        self.conv1 = con1x1(inplanes, width)
        self.bn1 = norm_layer(width)
        self.conv2 = con3x3(width, width, stride, groups, dilation)
        self.bn2 = norm_layer(width)
        self.conv3 = con1x1(width, planes * self.expansion)
        self.bn3 = norm_layer(planes * self.expansion)
        self.relu = nn.ReLU(inplace=True)
        self.downsample = downsample
        self.stride = stride

    # Forward pass of Bottleneck
    def forward(self, x):
        identity = x

        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)

        out = self.conv2(out)
        out = self.bn2(out)
        out = self.relu(out)

        out = self.conv3(out)
        out = self.bn3(out)  # no ReLU here: the activation is applied after the shortcut is added

        if self.downsample is not None:
            identity = self.downsample(x)

        out += identity
        out = self.relu(out)

        return out
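
The `width` expression in `__init__` is what lets the same block serve the ResNeXt and Wide ResNet variants listed in `model_urls`. The sketch below evaluates it for three families. The `(groups, width_per_group)` pairs are assumed from torchvision's model definitions and are not stated in this article; `bottleneck_width` is a name made up for the example.

```python
# Width of the 3x3 convolution inside Bottleneck, using the same expression as the listing.
def bottleneck_width(planes, groups=1, base_width=64):
    return int(planes * (base_width / 64.)) * groups

# (groups, width_per_group) per model family; values assumed from torchvision.
variants = {
    "resnet50": (1, 64),
    "resnext50_32x4d": (32, 4),
    "wide_resnet50_2": (1, 128),
}

for name, (groups, base_width) in variants.items():
    widths = [bottleneck_width(p, groups, base_width) for p in (64, 128, 256, 512)]
    print(name, widths)
```

In all three cases the block still outputs `planes * expansion` channels; only the inner 3x3 convolution becomes wider or grouped.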

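3.3 ResNet Code

The code here does not reproduce the official source in full; readers who need the complete source should follow the code link given earlier.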

import torch
import torch.nn as nn
# Needed to download the pretrained weights. Inside torchvision this is a relative import
# (from .utils import load_state_dict_from_url); in a standalone script torch.hub provides the same function.
from torch.hub import load_state_dict_from_url

# Download URLs of the official pretrained models
model_urls = {
    'resnet18': 'https://download.pytorch.org/models/resnet18-5c106cde.pth',
    'resnet34': 'https://download.pytorch.org/models/resnet34-333f7ec4.pth',
    'resnet50': 'https://download.pytorch.org/models/resnet50-19c8e357.pth',
    'resnet101': 'https://download.pytorch.org/models/resnet101-5d3b4d8f.pth',
    'resnet152': 'https://download.pytorch.org/models/resnet152-b121ed2d.pth',
    'resnext50_32x4d': 'https://download.pytorch.org/models/resnext50_32x4d-7cdf4587.pth',
    'resnext101_32x8d': 'https://download.pytorch.org/models/resnext101_32x8d-8ba56ff5.pth',
    'wide_resnet50_2': 'https://download.pytorch.org/models/wide_resnet50_2-95faca4d.pth',
    'wide_resnet101_2': 'https://download.pytorch.org/models/wide_resnet101_2-32ee1156.pth',
}

# Wrap the 3x3 convolution (bias=False because every convolution is followed by a BN layer, which makes the bias redundant)
# For the Conv2d parameters see the PyTorch manual (Chinese): https://pytorch-cn.readthedocs.io/zh/latest/package_references/torch-nn/#_1
def con3x3(in_planes, out_planes, stride=1, groups=1, dilation=1):
    return nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride,
                     padding=dilation, groups=groups, bias=False, dilation=dilation)

# Wrap the 1x1 convolution
def con1x1(in_planes, out_planes, stride=1):
    return nn.Conv2d(in_planes, out_planes, kernel_size=1, stride=stride, bias=False)

# Define BasicBlock
class BasicBlock(nn.Module):
    expansion = 1

    def __init__(self, inplanes, planes, stride=1, downsample=None, groups=1,
                 base_width=64, dilation=1, norm_layer=None):
        super(BasicBlock, self).__init__()
        if norm_layer is None:
            norm_layer = nn.BatchNorm2d
        if groups != 1 or base_width != 64:
            raise ValueError('BasicBlock only supports groups=1 and base_width=64')
        if dilation > 1:
            raise NotImplementedError("Dilation > 1 not supported in BasicBlock")

        # Define the layers of BasicBlock
        self.conv1 = con3x3(inplanes, planes, stride)
        self.bn1 = norm_layer(planes)
        # inplace=True overwrites the input tensor; the default, False, allocates a new tensor for the result
        self.relu = nn.ReLU(inplace=True)
        self.conv2 = con3x3(planes, planes)
        self.bn2 = norm_layer(planes)
        self.downsample = downsample
        self.stride = stride

    # The forward pass connects the layers defined above
    def forward(self, x):
        identity = x  # the residual block has to keep the original input

        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)

        out = self.conv2(out)
        out = self.bn2(out)

        # Make the shortcut match the shape of the convolution output before adding
        if self.downsample is not None:
            identity = self.downsample(x)

        out += identity
        out = self.relu(out)

        return out

# Define the Bottleneck block (the basic block used by ResNet-50 and deeper)
class Bottleneck(nn.Module):
    expansion = 4  # a Bottleneck block outputs planes * 4 channels

    def __init__(self, inplanes, planes, stride=1, downsample=None, groups=1,
                 base_width=64, dilation=1, norm_layer=None):
        super(Bottleneck, self).__init__()
        if norm_layer is None:
            norm_layer = nn.BatchNorm2d
        width = int(planes * (base_width / 64.)) * groups
        # Define the layers of Bottleneck
        self.conv1 = con1x1(inplanes, width)
        self.bn1 = norm_layer(width)
        self.conv2 = con3x3(width, width, stride, groups, dilation)
        self.bn2 = norm_layer(width)
        self.conv3 = con1x1(width, planes * self.expansion)
        self.bn3 = norm_layer(planes * self.expansion)
        self.relu = nn.ReLU(inplace=True)
        self.downsample = downsample
        self.stride = stride

    # Forward pass of Bottleneck
    def forward(self, x):
        identity = x

        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)

        out = self.conv2(out)
        out = self.bn2(out)
        out = self.relu(out)

        out = self.conv3(out)
        out = self.bn3(out)  # no ReLU here: the activation is applied after the shortcut is added

        if self.downsample is not None:
            identity = self.downsample(x)

        out += identity
        out = self.relu(out)

        return out

# Now the main part: define the ResNet class
class ResNet(nn.Module):
    def __init__(self, block, layer, num_classes=1000, zero_init_residual=False,
                 groups=1, width_per_group=64, replace_stride_with_dilation=None,
                 norm_layer=None):
        super(ResNet, self).__init__()
        if norm_layer is None:
            norm_layer = nn.BatchNorm2d
        self._norm_layer = norm_layer

        self.inplanes = 64
        self.dilation = 1
        if replace_stride_with_dilation is None:
            replace_stride_with_dilation = [False, False, False]
        if len(replace_stride_with_dilation) != 3:
            raise ValueError("replace_stride_with_dilation should be None "
                             "or a 3-element tuple, got {}".format(replace_stride_with_dilation))
        self.groups = groups
        self.base_width = width_per_group
        self.conv1 = nn.Conv2d(3, self.inplanes, kernel_size=7, stride=2, padding=3,
                               bias=False)
        self.bn1 = norm_layer(self.inplanes)
        self.relu = nn.ReLU(inplace=True)
        self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
        self.layer1 = self._make_layer(block, 64, layer[0])
        self.layer2 = self._make_layer(block, 128, layer[1], stride=2,
                                       dilate=replace_stride_with_dilation[0])
        self.layer3 = self._make_layer(block, 256, layer[2], stride=2,
                                       dilate=replace_stride_with_dilation[1])
        self.layer4 = self._make_layer(block, 512, layer[3], stride=2,
                                       dilate=replace_stride_with_dilation[2])
        self.avgpool = nn.AdaptiveAvgPool2d((1, 1))
        self.fc = nn.Linear(512 * block.expansion, num_classes)

        # Weight initialization
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
            elif isinstance(m, (nn.BatchNorm2d, nn.GroupNorm)):
                nn.init.constant_(m.weight, 1)
                nn.init.constant_(m.bias, 0)

        # Optionally zero-initialize the last BN of each residual branch,
        # so that every block starts out as an identity mapping
        if zero_init_residual:
            for m in self.modules():
                if isinstance(m, Bottleneck):
                    nn.init.constant_(m.bn3.weight, 0)
                elif isinstance(m, BasicBlock):
                    nn.init.constant_(m.bn2.weight, 0)

    def _make_layer(self, block, planes, blocks, stride=1, dilate=False):
        norm_layer = self._norm_layer
        downsample = None
        previous_dilation = self.dilation
        if dilate:
            self.dilation *= stride
            stride = 1
        # The first block of a stage needs a projection shortcut when the
        # spatial size or the number of channels changes
        if stride != 1 or self.inplanes != planes * block.expansion:
            downsample = nn.Sequential(
                con1x1(self.inplanes, planes * block.expansion, stride),
                norm_layer(planes * block.expansion),
            )

        layers = []
        layers.append(block(self.inplanes, planes, stride, downsample, self.groups,
                            self.base_width, previous_dilation, norm_layer))
        self.inplanes = planes * block.expansion
        # The remaining blocks keep the shape, so they use the identity shortcut
        for _ in range(1, blocks):
            layers.append(block(self.inplanes, planes, groups=self.groups,
                                base_width=self.base_width, dilation=self.dilation,
                                norm_layer=norm_layer))

        return nn.Sequential(*layers)

    def _forward_impl(self, x):
        x = self.conv1(x)
        x = self.bn1(x)
        x = self.relu(x)
        x = self.maxpool(x)

        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        x = self.layer4(x)

        x = self.avgpool(x)
        x = torch.flatten(x, 1)
        x = self.fc(x)

        return x

    def forward(self, x):
        return self._forward_impl(x)

def _resnet(arch, block, layers, pretrained, progress, **kwargs):
    model = ResNet(block, layers, **kwargs)
    if pretrained:
        state_dict = load_state_dict_from_url(model_urls[arch],
                                              progress=progress)
        model.load_state_dict(state_dict)
    return model

def resnet34(pretrained=False, progress=True, **kwargs):
    return _resnet('resnet34', BasicBlock, [3, 4, 6, 3], pretrained, progress,
                   **kwargs)

def resnet101(pretrained=False, progress=True, **kwargs):
    return _resnet('resnet101', Bottleneck, [3, 4, 23, 3], pretrained, progress,
                   **kwargs)
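
To see how `_make_layer` decides where a downsample shortcut is needed, the following sketch traces the channel count and spatial size through the four stages without building any layers. `plan_stages` is a hypothetical helper written for this illustration; it mirrors the condition `stride != 1 or inplanes != planes * expansion`, ignores dilation, and assumes an input size such as 224 for which integer halving matches the exact convolution arithmetic.

```python
# Trace (output channels, spatial size, needs_downsample) for the first block of each stage.
# Hypothetical helper; assumes no dilation and an input size divisible by 32.

def plan_stages(expansion, input_size=224):
    size = input_size // 2   # 7x7 stem convolution, stride 2
    size = size // 2         # 3x3 max pooling, stride 2
    inplanes = 64
    plan = []
    for planes, stride in [(64, 1), (128, 2), (256, 2), (512, 2)]:
        out_channels = planes * expansion
        needs_downsample = stride != 1 or inplanes != out_channels
        size = size // stride
        plan.append((out_channels, size, needs_downsample))
        inplanes = out_channels
    return plan

basic_plan = plan_stages(expansion=1)       # BasicBlock, e.g. resnet34
bottleneck_plan = plan_stages(expansion=4)  # Bottleneck, e.g. resnet101

for row in basic_plan:
    print("BasicBlock", row)
for row in bottleneck_plan:
    print("Bottleneck", row)
```

With BasicBlock the first stage keeps 64 channels at stride 1, so it is the only stage whose first block uses a plain identity shortcut. With Bottleneck even the first stage expands 64 channels to 256, so every stage starts with a projection shortcut. The last entry's channel count, `512 * expansion`, is the input width of the final fully connected layer.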
