Contents
- 1. What problem does ResNet solve?
- 2. ResNet principle and structure
- 2.1 The BasicBlock structure
- 2.2 The BottleNeck structure
- 2.3 The ResNet structure
- 3. ResNet code walkthrough (PyTorch)
- 3.1 BasicBlock code
- 3.2 BottleNeck code
- 3.3 ResNet code
The ResNet material in this post comes from Kaiming He's CVPR 2016 paper "Deep Residual Learning for Image Recognition", and the code comes from the official PyTorch ResNet implementation; follow the corresponding links if you want the originals.
1. What problem does ResNet solve?
ResNet was designed to solve the degradation problem in deep networks: as the network gets deeper, its performance on the dataset actually gets worse. The figure below, taken from the paper, shows this degradation.
[Figure: training error (left) and test error (right) of 20-layer and 56-layer plain networks on CIFAR-10]
As the figure shows, the authors trained a 20-layer and a 56-layer network on CIFAR-10, and the 56-layer network ended up with higher training and test error than the shallower 20-layer one. This is the degradation problem that ResNet sets out to solve.
With ResNet, the degradation goes away, as shown below.
[Figure: ImageNet training curves of the 18-layer and 34-layer plain networks (left) and ResNets (right)]
From the authors' ImageNet training results, without residual connections the 34-layer plain-34 has a higher error than the 18-layer plain-18, whereas with residual connections the 34-layer ResNet-34 has a lower error than the 18-layer ResNet-18. With the ResNet structure, making the network deeper therefore improves performance instead of degrading it.
2. ResNet principle and structure
This section introduces the principle and structure of ResNet.
Suppose the mapping we want a network block to learn is H(x). Learning H(x) directly is hard, but the residual function F(x) = H(x) - x is much easier to learn: when the optimal mapping is close to the identity, the block only needs to push F(x) toward 0 instead of fitting a specific mapping. The final mapping H(x) is then recovered by adding x back, H(x) = F(x) + x, as shown in the figure.
[Figure: a residual building block with a shortcut connection]
The output y of this block is therefore
y = F(x, {Wi}) + x
Since the addition requires x and F(·) to have the same dimensions, the general form is written as follows, where Ws is used to match the dimensions:
y = F(x, {Wi}) + Ws·x
The paper mentions two ways to match the dimensions: (A) increasing the dimensions with zero-padding, and (B) increasing the dimensions with 1x1 convolutions.
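As a minimal sketch of option (B) (the shapes and variable names here are my own, not from the paper): a 1x1 convolution with the same stride as the residual branch projects the shortcut to a matching shape before the addition.

```python
import torch
import torch.nn as nn

# Hypothetical case: the residual branch doubles the channels (64 -> 128)
# and halves the spatial size (stride 2), so the identity shortcut no longer fits
x = torch.randn(1, 64, 56, 56)
residual_branch = nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1)
shortcut = nn.Conv2d(64, 128, kernel_size=1, stride=2, bias=False)  # option (B)

y = residual_branch(x) + shortcut(x)  # shapes now match: both (1, 128, 28, 28)
print(y.shape)
```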
The paper defines two kinds of basic building blocks: the BasicBlock, used for ResNet-34 and shallower networks, and the BottleNeck, used for ResNet-50 and deeper networks. Once you understand these two blocks, a ResNet is just a stack of them.
2.1 The BasicBlock structure
The BasicBlock structure is shown in the figure below.
[Figure: the BasicBlock — two 3x3 convolutions plus a shortcut connection]
As the figure shows, the block consists of two 3x3 convolutional layers, both with 64 channels here. Note the skip line, the shortcut connection, which adds the input x to the output.
2.2 The BottleNeck structure
The BottleNeck structure is shown in the figure below.
[Figure: the BottleNeck — 1x1, 3x3, and 1x1 convolutions plus a shortcut connection]
As the figure shows, the block consists of a 1x1 convolutional layer, then a 3x3 convolutional layer, then another 1x1 convolutional layer. Note that the channel count changes along the way: the 1x1 convolutions change the number of channels in the feature map so that the result can be added to the identity mapping x. Just as importantly, this 1x1 bottleneck design greatly reduces the number of parameters, which is why the deeper networks use BottleNeck rather than BasicBlock (see the count sketched below).
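To make the saving concrete, here is a back-of-the-envelope parameter count (my own arithmetic, ignoring BN parameters and biases) comparing a 256-channel BasicBlock with the 256-64-256 BottleNeck from the paper:

```python
# Conv parameters without bias: kernel_h * kernel_w * in_channels * out_channels
basic = 2 * (3 * 3 * 256 * 256)          # two 3x3 convs, 256 -> 256 -> 256
bottleneck = (1 * 1 * 256 * 64           # 1x1 reduce, 256 -> 64
              + 3 * 3 * 64 * 64          # 3x3 conv,   64 -> 64
              + 1 * 1 * 64 * 256)        # 1x1 expand, 64 -> 256
print(basic, bottleneck)                 # 1179648 vs 69632, roughly 17x fewer
```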
2.3 The ResNet structure
With the BasicBlock and BottleNeck described above, a ResNet is built by simply stacking these blocks. The configurations of the five ResNet variants with different depths are shown below.
[Table: configurations of the five ResNet variants (ResNet-18/34/50/101/152), from the paper]
Each "layer" in the table is one of the BasicBlock or BottleNeck structures described above. The ResNet-34 architecture is shown in the figure below; the dashed shortcut lines mark places where the channel count differs and the shortcut has to be adjusted.
[Figure: the 34-layer ResNet; dashed shortcuts mark dimension changes]
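As a sanity check on the naming (my own arithmetic, not spelled out in the paper): ResNet-34 stacks [3, 4, 6, 3] BasicBlocks across its four stages, each containing two 3x3 convolutions, so counting weighted layers gives:

```python
stage_blocks = [3, 4, 6, 3]   # BasicBlocks per stage in ResNet-34
convs_per_block = 2           # each BasicBlock holds two 3x3 convolutions
depth = 1 + convs_per_block * sum(stage_blocks) + 1  # stem conv + blocks + fc
print(depth)  # 34
```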
3. ResNet code walkthrough (PyTorch)
This part presents the official PyTorch ResNet source code, starting with the BasicBlock and BottleNeck code blocks.
3.1 BasicBlock code
```python
# Define the BasicBlock
class BasicBlock(nn.Module):
    expansion = 1

    def __init__(self, inplanes, planes, stride=1, downsample=None, groups=1,
                 base_width=64, dilation=1, norm_layer=None):
        super(BasicBlock, self).__init__()
        if norm_layer is None:
            norm_layer = nn.BatchNorm2d
        if groups != 1 or base_width != 64:
            raise ValueError('BasicBlock only supports groups=1 and base_width=64')
        if dilation > 1:
            raise NotImplementedError("Dilation > 1 not supported in BasicBlock")
        # Define the layers that make up a BasicBlock
        self.conv1 = conv3x3(inplanes, planes, stride)
        self.bn1 = norm_layer(planes)
        # inplace=True applies the ReLU in place; the default (False) would
        # allocate a new tensor for the result
        self.relu = nn.ReLU(inplace=True)
        self.conv2 = conv3x3(planes, planes)
        self.bn2 = norm_layer(planes)
        self.downsample = downsample
        self.stride = stride

    # The forward pass wires the layers defined above together
    def forward(self, x):
        identity = x  # the residual block has to keep the original input

        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)

        out = self.conv2(out)
        out = self.bn2(out)

        if self.downsample is not None:
            # Make sure the original input has the same dimensions as the
            # convolutional output before the two are added
            identity = self.downsample(x)

        out += identity
        out = self.relu(out)

        return out
```
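As a quick sanity check (this snippet is mine and assumes torch, nn, and the conv3x3 helper from section 3.3 are already in scope): a BasicBlock with matching input and output channels preserves the tensor shape.

```python
block = BasicBlock(inplanes=64, planes=64)
x = torch.randn(1, 64, 56, 56)
print(block(x).shape)  # torch.Size([1, 64, 56, 56])
```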
3.2 BottleNeck code
```python
# Define the Bottleneck block (the building block used from ResNet-50 upward)
class Bottleneck(nn.Module):
    expansion = 4  # a Bottleneck outputs 4 times `planes` channels

    def __init__(self, inplanes, planes, stride=1, downsample=None, groups=1,
                 base_width=64, dilation=1, norm_layer=None):
        super(Bottleneck, self).__init__()
        if norm_layer is None:
            norm_layer = nn.BatchNorm2d
        width = int(planes * (base_width / 64.)) * groups
        # Define the layers that make up a Bottleneck
        self.conv1 = conv1x1(inplanes, width)
        self.bn1 = norm_layer(width)
        self.conv2 = conv3x3(width, width, stride, groups, dilation)
        self.bn2 = norm_layer(width)
        self.conv3 = conv1x1(width, planes * self.expansion)
        self.bn3 = norm_layer(planes * self.expansion)
        self.relu = nn.ReLU(inplace=True)
        self.downsample = downsample
        self.stride = stride

    # The forward pass of the Bottleneck
    def forward(self, x):
        identity = x

        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)

        out = self.conv2(out)
        out = self.bn2(out)
        out = self.relu(out)

        out = self.conv3(out)
        out = self.bn3(out)  # no ReLU here: the activation comes after the addition

        if self.downsample is not None:
            identity = self.downsample(x)

        out += identity
        out = self.relu(out)

        return out
```
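A similar check (again mine, assuming the conv1x1/conv3x3 helpers from section 3.3): since expansion = 4, a Bottleneck built with planes=64 outputs 256 channels, so the shortcut needs a 1x1 projection to match.

```python
downsample = nn.Sequential(conv1x1(64, 256), nn.BatchNorm2d(256))
block = Bottleneck(inplanes=64, planes=64, downsample=downsample)
x = torch.randn(1, 64, 56, 56)
print(block(x).shape)  # torch.Size([1, 256, 56, 56])
```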
3.3 ResNet code
The listing below does not reproduce the official source in full; see the code link above for the complete version.
```python
import torch
import torch.nn as nn
from .utils import load_state_dict_from_url  # needed to load the pretrained models

# Download URLs for the official pretrained models
model_urls = {
    'resnet18': 'https://download.pytorch.org/models/resnet18-5c106cde.pth',
    'resnet34': 'https://download.pytorch.org/models/resnet34-333f7ec4.pth',
    'resnet50': 'https://download.pytorch.org/models/resnet50-19c8e357.pth',
    'resnet101': 'https://download.pytorch.org/models/resnet101-5d3b4d8f.pth',
    'resnet152': 'https://download.pytorch.org/models/resnet152-b121ed2d.pth',
    'resnext50_32x4d': 'https://download.pytorch.org/models/resnext50_32x4d-7cdf4587.pth',
    'resnext101_32x8d': 'https://download.pytorch.org/models/resnext101_32x8d-8ba56ff5.pth',
    'wide_resnet50_2': 'https://download.pytorch.org/models/wide_resnet50_2-95faca4d.pth',
    'wide_resnet101_2': 'https://download.pytorch.org/models/wide_resnet101_2-32ee1156.pth',
}

# Wrap the 3x3 convolution (bias is set to False because every conv here is
# followed by a BN layer, which makes the bias redundant).
# For the Conv2d parameters see the PyTorch docs:
# https://pytorch-cn.readthedocs.io/zh/latest/package_references/torch-nn/#_1
def conv3x3(in_planes, out_planes, stride=1, groups=1, dilation=1):
    return nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride,
                     padding=dilation, groups=groups, bias=False,
                     dilation=dilation)

# Wrap the 1x1 convolution
def conv1x1(in_planes, out_planes, stride=1):
    return nn.Conv2d(in_planes, out_planes, kernel_size=1, stride=stride,
                     bias=False)

# Define the BasicBlock (identical to section 3.1)
class BasicBlock(nn.Module):
    expansion = 1

    def __init__(self, inplanes, planes, stride=1, downsample=None, groups=1,
                 base_width=64, dilation=1, norm_layer=None):
        super(BasicBlock, self).__init__()
        if norm_layer is None:
            norm_layer = nn.BatchNorm2d
        if groups != 1 or base_width != 64:
            raise ValueError('BasicBlock only supports groups=1 and base_width=64')
        if dilation > 1:
            raise NotImplementedError("Dilation > 1 not supported in BasicBlock")
        self.conv1 = conv3x3(inplanes, planes, stride)
        self.bn1 = norm_layer(planes)
        self.relu = nn.ReLU(inplace=True)
        self.conv2 = conv3x3(planes, planes)
        self.bn2 = norm_layer(planes)
        self.downsample = downsample
        self.stride = stride

    def forward(self, x):
        identity = x

        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)

        out = self.conv2(out)
        out = self.bn2(out)

        if self.downsample is not None:
            identity = self.downsample(x)

        out += identity
        out = self.relu(out)

        return out

# Define the Bottleneck block (identical to section 3.2)
class Bottleneck(nn.Module):
    expansion = 4

    def __init__(self, inplanes, planes, stride=1, downsample=None, groups=1,
                 base_width=64, dilation=1, norm_layer=None):
        super(Bottleneck, self).__init__()
        if norm_layer is None:
            norm_layer = nn.BatchNorm2d
        width = int(planes * (base_width / 64.)) * groups
        self.conv1 = conv1x1(inplanes, width)
        self.bn1 = norm_layer(width)
        self.conv2 = conv3x3(width, width, stride, groups, dilation)
        self.bn2 = norm_layer(width)
        self.conv3 = conv1x1(width, planes * self.expansion)
        self.bn3 = norm_layer(planes * self.expansion)
        self.relu = nn.ReLU(inplace=True)
        self.downsample = downsample
        self.stride = stride

    def forward(self, x):
        identity = x

        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)

        out = self.conv2(out)
        out = self.bn2(out)
        out = self.relu(out)

        out = self.conv3(out)
        out = self.bn3(out)

        if self.downsample is not None:
            identity = self.downsample(x)

        out += identity
        out = self.relu(out)

        return out

# Now the main event: the ResNet class itself
class ResNet(nn.Module):

    def __init__(self, block, layers, num_classes=1000, zero_init_residual=False,
                 groups=1, width_per_group=64, replace_stride_with_dilation=None,
                 norm_layer=None):
        super(ResNet, self).__init__()
        if norm_layer is None:
            norm_layer = nn.BatchNorm2d
        self._norm_layer = norm_layer

        self.inplanes = 64
        self.dilation = 1
        if replace_stride_with_dilation is None:
            replace_stride_with_dilation = [False, False, False]
        if len(replace_stride_with_dilation) != 3:
            raise ValueError("replace_stride_with_dilation should be None "
                             "or a 3-element tuple, got {}".format(replace_stride_with_dilation))
        self.groups = groups
        self.base_width = width_per_group
        self.conv1 = nn.Conv2d(3, self.inplanes, kernel_size=7, stride=2,
                               padding=3, bias=False)
        self.bn1 = norm_layer(self.inplanes)
        self.relu = nn.ReLU(inplace=True)
        self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
        self.layer1 = self._make_layer(block, 64, layers[0])
        self.layer2 = self._make_layer(block, 128, layers[1], stride=2,
                                       dilate=replace_stride_with_dilation[0])
        self.layer3 = self._make_layer(block, 256, layers[2], stride=2,
                                       dilate=replace_stride_with_dilation[1])
        self.layer4 = self._make_layer(block, 512, layers[3], stride=2,
                                       dilate=replace_stride_with_dilation[2])
        self.avgpool = nn.AdaptiveAvgPool2d((1, 1))
        self.fc = nn.Linear(512 * block.expansion, num_classes)

        # Weight initialization
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight, mode='fan_out',
                                        nonlinearity='relu')
            elif isinstance(m, (nn.BatchNorm2d, nn.GroupNorm)):
                nn.init.constant_(m.weight, 1)
                nn.init.constant_(m.bias, 0)

        # Optionally zero-initialize the last BN of each residual branch so
        # that every block starts out as an identity mapping
        if zero_init_residual:
            for m in self.modules():
                if isinstance(m, Bottleneck):
                    nn.init.constant_(m.bn3.weight, 0)
                elif isinstance(m, BasicBlock):
                    nn.init.constant_(m.bn2.weight, 0)

    def _make_layer(self, block, planes, blocks, stride=1, dilate=False):
        norm_layer = self._norm_layer
        downsample = None
        previous_dilation = self.dilation
        if dilate:
            self.dilation *= stride
            stride = 1
        if stride != 1 or self.inplanes != planes * block.expansion:
            downsample = nn.Sequential(
                conv1x1(self.inplanes, planes * block.expansion, stride),
                norm_layer(planes * block.expansion),
            )

        layers = []
        layers.append(block(self.inplanes, planes, stride, downsample,
                            self.groups, self.base_width, previous_dilation,
                            norm_layer))
        self.inplanes = planes * block.expansion
        for _ in range(1, blocks):
            layers.append(block(self.inplanes, planes, groups=self.groups,
                                base_width=self.base_width,
                                dilation=self.dilation,
                                norm_layer=norm_layer))

        return nn.Sequential(*layers)

    def _forward_impl(self, x):
        x = self.conv1(x)
        x = self.bn1(x)
        x = self.relu(x)
        x = self.maxpool(x)

        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        x = self.layer4(x)

        x = self.avgpool(x)
        x = torch.flatten(x, 1)
        x = self.fc(x)

        return x

    def forward(self, x):
        return self._forward_impl(x)


def _resnet(arch, block, layers, pretrained, progress, **kwargs):
    model = ResNet(block, layers, **kwargs)
    if pretrained:
        state_dict = load_state_dict_from_url(model_urls[arch],
                                              progress=progress)
        model.load_state_dict(state_dict)
    return model


def resnet34(pretrained=False, progress=True, **kwargs):
    return _resnet('resnet34', BasicBlock, [3, 4, 6, 3], pretrained, progress,
                   **kwargs)


def resnet101(pretrained=False, progress=True, **kwargs):
    return _resnet('resnet101', Bottleneck, [3, 4, 23, 3], pretrained, progress,
                   **kwargs)
```
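With the module above saved inside a package (the relative import of load_state_dict_from_url requires that), a quick smoke test might look like the following; pretrained=False avoids downloading any weights:

```python
model = resnet34(pretrained=False)
x = torch.randn(1, 3, 224, 224)  # a batch with one 224x224 RGB image
print(model(x).shape)            # torch.Size([1, 1000])
```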