The Most Complete MNIST Series (Part 5): AutoEncoders (Plain, Denoising, Contractive) on MNIST in PyTorch

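Contents

1. Autoencoder Theory

1.1 Introduction to Autoencoders
1.2 What Is an Autoencoder
1.3 Other Encoder-Decoder Models
1.4 Why Use Autoencoders
1.5 Characteristics of Autoencoders
1.6 Types of Autoencoders

1.6.1 Plain Autoencoder (Autoencoder)
1.6.2 Sparse Autoencoder
1.6.3 Denoising Autoencoder
1.6.5 Contractive Autoencoder

2. Plain Autoencoder Code

2.1 Encoder.py
2.2 Decoder.py
2.3 Mainnet.py
2.4 GeneralTrain.py
2.5 Training Results
2.6 Detector.py
2.7 Test Results

3. Denoising Autoencoder (Dropout Implementation)

3.1 EncoderNet.py

4. Contractive Autoencoder (L1 Regularization Implementation)

4.1 Encoder_Net.py
4.2 Decoder_Net.py
4.3 L1Train.py
4.4 Training Results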
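1. Autoencoder Theory

1.1 Introduction to Autoencoders

An autoencoder is an unsupervised neural network that learns features automatically from unlabeled data. Its training objective is to reconstruct its own input. The features it learns often describe the data more compactly and usefully than the raw input does, so in deep learning they are commonly used in place of the raw data to get better results. Autoencoders are often grouped with generative models, although a plain autoencoder is not generative in the strict probabilistic sense; the VAE variant is.

1.2 What Is an Autoencoder

What is an autoencoder? The network encodes its own input: put simply, it is trained so that its output equals its input. Take a simple three-layer network as an example:
[Figure: a three-layer autoencoder with an input layer, a hidden layer, and an output layer]
The autoencoder compresses the input in the hidden layer and decompresses it in the output layer. Some information is inevitably lost along the way, but training keeps that loss as small as possible so that the main features of the input are preserved.
The encode-decode process: the input is fed into the network and encoded into a code, which is then decoded into the output. It is much like melting down a sword and recasting it: the result resembles the original but is not exactly the same.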

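1.3 Other Encoder-Decoder Models

seq2seq, Autoencoder, VAE

1.4 Why Use Autoencoders

1. To extract the main features of the data as model input, which reduces computation.
2. To increase data diversity (data augmentation).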

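1.5 Characteristics of Autoencoders

Encoding yields the typical features of the raw data (its essence), which amounts to dimensionality reduction. Readers who know principal component analysis (PCA) will recognize this, since PCA is itself a dimensionality-reduction method. If the autoencoder uses linear activations instead of Sigmoid and is trained with a mean-squared-error loss, its hidden layer learns the same subspace as PCA. Here we fix the number of hidden neurons and train the network so that its output is as close to its input as possible; once training is done, the hidden-layer output for a given input is a reduced-dimension representation of that input. This holds only if the hidden layer has fewer neurons than the input has dimensions; otherwise the representation has more dimensions than the input.
Mathematical form of the autoencoder: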
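A common textbook way to write this is sketched below. It is offered as an assumption about the usual formulation, not as the notation of the original post: the symbols $W$, $b$, $W'$, $b'$ (weights and biases), $f$, $g$ (activation functions), $h$ (the code) and $\hat{x}$ (the reconstruction) are introduced here for illustration.

```latex
\begin{aligned}
h &= f(Wx + b) && \text{(encoder)} \\
\hat{x} &= g(W'h + b') && \text{(decoder)} \\
L(x, \hat{x}) &= \lVert x - \hat{x} \rVert_2^2 && \text{(reconstruction loss)}
\end{aligned}
```

When $h$ has fewer dimensions than $x$, minimizing $L$ forces the code to keep the information that matters most for reconstruction.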

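1.6 Types of Autoencoders

1.6.1 Plain Autoencoder (Autoencoder)

The encoder compresses the input into a code and the decoder reconstructs the input from it; the only training signal is the reconstruction loss.

1.6.2 Sparse Autoencoder

A sparse autoencoder adds a sparsity penalty on the hidden activations, so that only a few hidden units are active for any given input.

1.6.3 Denoising Autoencoder

A denoising autoencoder receives a corrupted copy of the input and is trained to reconstruct the clean original.

1.6.5 Contractive Autoencoder

Loosely, contractive autoencoder = denoising + sparsity. Like both, it regularizes the code so that it is robust to small changes in the input, but it does so with an explicit penalty on how sensitive the encoder output is to the input, rather than by corrupting the input.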
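The standard contractive objective adds a Jacobian penalty to the reconstruction loss. It can be sketched as follows; the notation ($h$ for the encoder output, $\hat{x}$ for the reconstruction, $\lambda$ for the penalty weight, $J_h$ for the Jacobian of the encoder) is an assumption made for illustration and does not appear in the original post.

```latex
L_{\text{CAE}}(x) = \lVert x - \hat{x} \rVert_2^2 + \lambda \, \lVert J_h(x) \rVert_F^2,
\qquad
\lVert J_h(x) \rVert_F^2 = \sum_{i,j} \left( \frac{\partial h_j(x)}{\partial x_i} \right)^2
```

A small Jacobian norm means that nearby inputs map to nearly the same code, which is where the name "contractive" comes from.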

2. Plain Autoencoder Code

Code directory (source files): Encoder.py, Decoder.py, Mainnet.py, GeneralTrain.py, Detector.py

2.1 Encoder.py

import torch
import torch.nn as nn


class EncoderNet(nn.Module):
    """Convolutional encoder: N,1,28,28 image batch -> N,128 code."""

    def __init__(self):
        super(EncoderNet, self).__init__()
        # Conv2d arguments: in_channels, out_channels, kernel_size, stride, padding
        self.conv1 = nn.Sequential(
            nn.Conv2d(1, 3, 3, 2, 1),
            nn.BatchNorm2d(3),
            nn.ReLU(),
        )  # N,3,14,14
        self.conv2 = nn.Sequential(
            nn.Conv2d(3, 6, 3, 2, 1),
            nn.BatchNorm2d(6),
            nn.ReLU(),
        )  # N,6,7,7
        self.fc = nn.Sequential(
            nn.Linear(6 * 7 * 7, 128),
        )  # N,128

    def forward(self, x):
        y1 = self.conv1(x)
        y2 = self.conv2(y1)
        # Flatten the feature maps to N,6*7*7 before the linear layer
        y2 = torch.reshape(y2, [y2.size(0), -1])
        out = self.fc(y2)
        return out
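
The shape comments in the encoder (28 to 14 to 7) follow from the standard output-size formula for a convolution. The sketch below checks them in plain Python. The helper name conv2d_out_size is made up for this illustration and is not a PyTorch function; the formula is the one given in the PyTorch Conv2d documentation.

```python
def conv2d_out_size(size, kernel, stride, padding, dilation=1):
    """Output size of a 2D convolution along one spatial dimension.

    floor((size + 2*padding - dilation*(kernel - 1) - 1) / stride) + 1
    """
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1


# Both encoder convolutions use kernel=3, stride=2, padding=1.
after_conv1 = conv2d_out_size(28, kernel=3, stride=2, padding=1)
after_conv2 = conv2d_out_size(after_conv1, kernel=3, stride=2, padding=1)

# Channels * height * width after the second convolution,
# i.e. the input width of the nn.Linear layer.
flat_features = 6 * after_conv2 * after_conv2

print(after_conv1, after_conv2, flat_features)
```

The last value is the 6*7*7 input width that the linear layer expects, so changing the kernel, stride, or padding of either convolution means updating that number as well.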

2.2 Decoder.py

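import torch
import torch.nn as nn


class DecoderNet(nn.Module):
    """Decoder: N,128 code -> N,1,28,28 reconstructed image batch."""

    def __init__(self):
        super(DecoderNet, self).__init__()
        self.fc = nn.Sequential(
            nn.Linear(128, 6 * 7 * 7),
            nn.BatchNorm1d(6 * 7 * 7),
            nn.ReLU(),
        )  # N,6*7*7, reshaped to N,6,7,7 in forward
        # ConvTranspose2d arguments: in_channels, out_channels, kernel_size, stride, padding
        self.conv1 = nn.Sequential(
            nn.ConvTranspose2d(6, 3, 3, 2, 1, output_padding=1),
            nn.BatchNorm2d(3),
            nn.ReLU(),
        )  # N,3,14,14
        self.conv2 = nn.Sequential(
            nn.ConvTranspose2d(3, 1, 3, 2, 1, output_padding=1),
            nn.ReLU(),
        )  # N,1,28,28

    def forward(self, x):
        y1 = self.fc(x)
        # Restore the N,6,7,7 feature-map layout that the encoder flattened
        y1 = torch.reshape(y1, [y1.size(0), 6, 7, 7])
        y2 = self.conv1(y1)
        out = self.conv2(y2)
        return out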
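
The decoder needs output_padding=1 to get from 7 to 14 to 28. A minimal sketch of the transposed-convolution size formula shows why. The helper name conv_transpose2d_out_size is made up for this illustration; the formula is the one given in the PyTorch ConvTranspose2d documentation.

```python
def conv_transpose2d_out_size(size, kernel, stride, padding,
                              output_padding=0, dilation=1):
    """Output size of a 2D transposed convolution along one spatial dimension.

    (size - 1)*stride - 2*padding + dilation*(kernel - 1) + output_padding + 1
    """
    return ((size - 1) * stride - 2 * padding
            + dilation * (kernel - 1) + output_padding + 1)


# Decoder layers: kernel=3, stride=2, padding=1, output_padding=1.
up1 = conv_transpose2d_out_size(7, kernel=3, stride=2, padding=1, output_padding=1)
up2 = conv_transpose2d_out_size(up1, kernel=3, stride=2, padding=1, output_padding=1)

# Without output_padding the first layer would come out one pixel short.
up1_no_pad = conv_transpose2d_out_size(7, kernel=3, stride=2, padding=1)

print(up1, up2, up1_no_pad)
```

A stride-2 convolution maps both 13 and 14 to 7, so the inverse mapping is ambiguous; output_padding selects the larger of the two candidates.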

2.3 Mainnet.py

from Encoder import EncoderNet
from Decoder import DecoderNet
import torch.nn as nn


class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.encoder = EncoderNet()
        self.decoder = DecoderNet()

    def forward(self, x):
        encoder_out = self.encoder(x)
        decoder_out = self.decoder(encoder_out)
        return decoder_out

2.4 GeneralTrain.py

import os

import torch
import torch.nn as nn
import torch.utils.data as data
from torchvision import transforms
from torchvision.datasets import MNIST
from torchvision.utils import save_image

from Mainnet import Net


class Trainer:
    def __init__(self):
        self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
        self.net = Net().to(self.device)
        self.loss_fn = nn.MSELoss()
        self.opt = torch.optim.Adam(self.net.parameters())
        self.trans = transforms.Compose([
            transforms.ToTensor(),  # scales pixel values to [0, 1]
        ])

    def train(self):
        os.makedirs("params", exist_ok=True)
        os.makedirs("img", exist_ok=True)
        NUM_EPOCHS = 10
        BATCH_SIZE = 100
        mydataset = MNIST(root="./MNIST", train=True, download=True, transform=self.trans)
        dataloader = data.DataLoader(dataset=mydataset, shuffle=True, batch_size=BATCH_SIZE)
        for epochs in range(NUM_EPOCHS):
            for i, (x, y) in enumerate(dataloader):
                img = x.to(self.device)
                out_img = self.net(img)
                # The target is the input itself; the label y is not used
                loss = self.loss_fn(out_img, img)
                self.opt.zero_grad()
                loss.backward()
                self.opt.step()
                if i % 100 == 0:
                    print("epochs:[{}],iteration:[{}]/[{}],loss:{:.3f}".format(epochs, i, len(dataloader), loss.item()))
                    # Save the current batch and its reconstruction as 10x10 grids;
                    # files are overwritten within an epoch
                    fake_image = out_img.detach().cpu()
                    real_image = img.detach().cpu()
                    save_image(fake_image, "./img/epochs-{}-fake_img.jpg".format(epochs), nrow=10)
                    save_image(real_image, "./img/epochs-{}-real_img.jpg".format(epochs), nrow=10)
                    torch.save(self.net.state_dict(), "./params/net.pth")


if __name__ == '__main__':
    t = Trainer()
    t.train()

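2.5 Training Results

Original images:
[Figure: a batch of original MNIST digits]
Generated images:
[Figure: the corresponding reconstructions]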
2.6 Detector.py

import os

import torch
import torch.nn as nn
from torch.utils import data
from torchvision import transforms
from torchvision.datasets import MNIST
from torchvision.utils import save_image

from Mainnet import Net


class Detector:
    def __init__(self):
        self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
        self.loss_fn = nn.MSELoss()
        self.trans = transforms.Compose([
            transforms.ToTensor()
        ])

    def detector(self):
        BATCH_SIZE = 100
        NUM_EPOCHS = 100
        # save_image does not create missing directories
        os.makedirs("test_img", exist_ok=True)
        net = Net().to(self.device)
        # map_location lets a checkpoint trained on GPU load on a CPU-only machine
        net.load_state_dict(torch.load("./params/net.pth", map_location=self.device))
        net.eval()  # BatchNorm uses its running statistics
        test_data = MNIST(root="./MNIST", train=False, download=False, transform=self.trans)
        test_loader = data.DataLoader(dataset=test_data, shuffle=True, batch_size=BATCH_SIZE)

        # No gradients are needed for evaluation. The test set is reshuffled
        # on every pass, so each pass saves a different batch of images.
        with torch.no_grad():
            for epochs in range(NUM_EPOCHS):
                for i, (img, label) in enumerate(test_loader):
                    img = img.to(self.device)
                    out_img = net(img)
                    loss = self.loss_fn(out_img, img)

                    if i % 100 == 0:
                        print("epochs:[{}],loss:{:.3f}".format(epochs, loss.item()))
                        fake_image = out_img.cpu()
                        real_image = img.cpu()
                        save_image(fake_image, "./test_img/epochs-{}-fake_img.jpg".format(epochs), nrow=10)
                        save_image(real_image, "./test_img/epochs-{}-real_img.jpg".format(epochs), nrow=10)


if __name__ == '__main__':
    t = Detector()
    t.detector()

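2.7 Test Results

Original images:
[Figure: a batch of original MNIST test digits]
Generated images:
[Figure: the corresponding reconstructions]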
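3. Denoising Autoencoder (Dropout Implementation)

3.1 EncoderNet.py

① A denoising autoencoder can randomly hide (mask) parts of the input X.
② It can also add noise to the input and then use the original image as the target, so that the network learns to denoise. For details, see my other post: The Most Complete MNIST Series (Part 2): AE Model (Autoencoder) on MNIST with Plain Upsampling, Transposed Convolution, and Denoising.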

import torch
import torch.nn as nn


class EncoderNet(nn.Module):
    def __init__(self):
        super(EncoderNet, self).__init__()
        self.conv1 = nn.Sequential(
            nn.Conv2d(1, 3, 3, 2, 1),
            nn.BatchNorm2d(3),
            nn.ReLU(),
            # nn.Dropout(0.5)
        )  # N,3,14,14
        self.conv2 = nn.Sequential(
            nn.Conv2d(3, 6, 3, 2, 1),
            nn.BatchNorm2d(6),
            nn.ReLU(),
            # nn.Dropout(0.5)
        )  # N,6,7,7
        self.fc = nn.Sequential(
            nn.Linear(6 * 7 * 7, 128),
        )  # N,128

    def forward(self, x):
        # Corrupt the input: each pixel is zeroed with probability 0.5.
        # train=True keeps the corruption active in eval mode as well, and
        # torch.dropout scales the surviving pixels by 1/(1-0.5) = 2.
        x = torch.dropout(x, 0.5, True)
        y1 = self.conv1(x)
        # y1 = torch.dropout(y1,0.5,True)
        y2 = self.conv2(y1)
        y2 = torch.reshape(y2, [y2.size(0), -1])
        out = self.fc(y2)
        return out
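
As a point of comparison, the "random hiding" of item ① can also be written as explicit masking noise applied to the batch before it enters the network. The sketch below is an alternative under stated assumptions, not the method of this post: the function name mask_input and the random stand-in batch are made up for illustration. Unlike torch.dropout, it leaves the surviving pixels at their original values.

```python
import torch


def mask_input(x, drop_prob=0.5):
    """Zero each element of x independently with probability drop_prob.

    Surviving elements keep their original value (no 1/(1-p) rescaling).
    """
    keep = (torch.rand_like(x) >= drop_prob).to(x.dtype)
    return x * keep


torch.manual_seed(0)
clean = torch.rand(4, 1, 28, 28)          # stand-in for a batch of MNIST images
noisy = mask_input(clean, drop_prob=0.5)

# A denoising autoencoder is trained on (noisy input, clean target) pairs,
# e.g. loss = loss_fn(net(noisy), clean)
zero_fraction = (noisy == 0).float().mean().item()
print(noisy.shape, zero_fraction)
```

Keeping the corruption outside the model also makes it easy to switch it off at test time, or to feed deliberately corrupted test images when the goal is to show the denoising effect.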

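4. Contractive Autoencoder (L1 Regularization Implementation)

Code directory (source files): Encoder_Net.py, Decoder_Net.py, L1Train.py

Note that this part penalizes the L1 norm of the network weights. That is a simplification: the standard contractive penalty acts on the Jacobian of the encoder with respect to the input, not on the weights themselves.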

4.1 Encoder_Net.py

import torch
import torch.nn as nn


class Encoder(nn.Module):
    """Wider encoder: N,1,28,28 image batch -> N,128 code in (0, 1)."""

    def __init__(self):
        super().__init__()
        self.layer1 = nn.Sequential(
            nn.Conv2d(1, 128, 3, 2, 1),
            nn.BatchNorm2d(128),
            nn.ReLU(inplace=True)
        )  # N,128,14,14

        self.layer2 = nn.Sequential(
            nn.Conv2d(128, 512, 3, 2, 1),
            nn.BatchNorm2d(512),
            nn.ReLU(inplace=True)
        )  # N,512,7,7

        self.layer3 = nn.Sequential(
            nn.Linear(512 * 7 * 7, 128),
            nn.Sigmoid()
        )  # N,128

    def forward(self, x):
        x = self.layer1(x)
        x = self.layer2(x)
        # Flatten to N,512*7*7 before the linear layer
        x = torch.reshape(x, [x.size(0), -1])
        out = self.layer3(x)

        return out

4.2 Decoder_Net.py

import torch
import torch.nn as nn


class Decoder(nn.Module):
    """Decoder: N,128 code -> N,1,28,28 reconstructed image batch."""

    def __init__(self):
        super().__init__()
        self.layer1 = nn.Sequential(
            nn.Linear(128, 512 * 7 * 7),
            nn.BatchNorm1d(512 * 7 * 7),
            nn.ReLU(inplace=True)
        )  # N,512*7*7
        # The sixth positional argument of ConvTranspose2d is output_padding
        self.layer2 = nn.Sequential(
            nn.ConvTranspose2d(512, 128, 3, 2, 1, 1),
            nn.BatchNorm2d(128),
            nn.ReLU(inplace=True)
        )  # N,128,14,14
        self.layer3 = nn.Sequential(
            nn.ConvTranspose2d(128, 1, 3, 2, 1, 1),
            nn.ReLU(inplace=True)
        )  # N,1,28,28

    def forward(self, x):
        x = self.layer1(x)
        out = torch.reshape(x, [x.size(0), 512, 7, 7])
        out = self.layer2(out)
        out = self.layer3(out)
        return out

4.3 L1Train.py

import os

import torch
import torch.nn as nn
import torch.utils.data as data
from torchvision import transforms
from torchvision.datasets import MNIST
from torchvision.utils import save_image

from Decoder_Net import Decoder
from Encoder_Net import Encoder


class Trainer:
    def __init__(self):
        self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
        self.loss_fn = nn.MSELoss()
        self.trans = transforms.Compose([
            transforms.ToTensor(),
        ])

    def train(self):
        os.makedirs("params", exist_ok=True)
        os.makedirs("img", exist_ok=True)
        NUM_EPOCHS = 100
        BATCH_SIZE = 100

        en_net = Encoder().to(self.device)
        de_net = Decoder().to(self.device)
        # Resume from saved weights if they exist
        if os.path.exists("./params/en_net.pth"):
            en_net.load_state_dict(torch.load("./params/en_net.pth", map_location=self.device))
        if os.path.exists("./params/de_net.pth"):
            de_net.load_state_dict(torch.load("./params/de_net.pth", map_location=self.device))
        en_net_opt = torch.optim.Adam(en_net.parameters())
        de_net_opt = torch.optim.Adam(de_net.parameters())

        mydataset = MNIST(root="./MNIST", train=True, download=True, transform=self.trans)
        dataloader = data.DataLoader(dataset=mydataset, shuffle=True, batch_size=BATCH_SIZE)
        for epochs in range(NUM_EPOCHS):
            for i, (x, y) in enumerate(dataloader):
                img = x.to(self.device)
                feature = en_net(img)
                out_img = de_net(feature)

                # L1 penalty on all parameters. It must be recomputed on every
                # step: it depends on the current parameter values, and the
                # autograd graph of a previous step cannot be backpropagated twice.
                en_L1_loss = sum(p.abs().sum() for p in en_net.parameters())
                de_L1_loss = sum(p.abs().sum() for p in de_net.parameters())

                loss = self.loss_fn(out_img, img)
                losses = loss + 0.0001 * en_L1_loss + 0.0001 * de_L1_loss
                en_net_opt.zero_grad()
                de_net_opt.zero_grad()
                losses.backward()
                en_net_opt.step()
                de_net_opt.step()

                if i % 100 == 0:
                    print("epochs:[{}],iteration:[{}]/[{}],loss:{:.3f}".format(epochs, i, len(dataloader), losses.item()))

                    fake_image = out_img.detach().cpu()
                    real_image = img.detach().cpu()
                    save_image(fake_image, "./img/epochs-{}-fake_img.jpg".format(epochs), nrow=10)
                    save_image(real_image, "./img/epochs-{}-real_img.jpg".format(epochs), nrow=10)
                    torch.save(en_net.state_dict(), "./params/en_net.pth")
                    torch.save(de_net.state_dict(), "./params/de_net.pth")


if __name__ == '__main__':
    t = Trainer()
    t.train()
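
For comparison with the L1 weight penalty used above, the contractive penalty in its usual sense (the squared Frobenius norm of the encoder Jacobian) has a closed form when the encoder is a single fully connected sigmoid layer. The sketch below assumes such a toy encoder, which is not the convolutional encoder of this post. The function name contractive_penalty, the layer sizes, and the weight lam are illustrative choices.

```python
import torch
import torch.nn as nn


def contractive_penalty(linear, h):
    """Squared Frobenius norm of dh/dx for h = sigmoid(linear(x)).

    For a sigmoid layer, dh_j/dx_i = h_j * (1 - h_j) * W[j, i], hence
    ||J||_F^2 = sum_j (h_j * (1 - h_j))^2 * sum_i W[j, i]^2.
    Returns one value per sample, shape (N,).
    """
    dh = h * (1 - h)                              # (N, H)
    w_sq = linear.weight.pow(2).sum(dim=1)        # (H,)
    return (dh.pow(2) * w_sq).sum(dim=1)


torch.manual_seed(0)
enc = nn.Linear(784, 32)      # toy encoder layer (assumed sizes)
dec = nn.Linear(32, 784)      # toy decoder layer
x = torch.rand(8, 784)        # stand-in for a flattened MNIST batch

h = torch.sigmoid(enc(x))
x_hat = torch.sigmoid(dec(h))
lam = 1e-4                    # penalty weight, chosen arbitrarily here
loss = nn.functional.mse_loss(x_hat, x) + lam * contractive_penalty(enc, h).mean()
loss.backward()               # gradients flow through both terms
```

For a deeper encoder such as the convolutional one in this post there is no equally simple closed form; the Jacobian would have to come from autograd, which costs considerably more per step.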

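4.4 Training Results

Original images:
[Figure: a batch of original MNIST digits]
Generated images:
[Figure: the corresponding reconstructions]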