PyTorch Deep Learning in Practice (Part 2): The UNet Semantic Segmentation Network

December 3, 2019, 23:29:08

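Abstract

When studying a deep learning algorithm, it helps to start with the network structure; once that is understood, move on to the loss computation, the training procedure, and so on. This article covers the UNet network structure and the corresponding code; the remaining topics are covered in later installments.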

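I. Preface

This article is part of a PyTorch deep learning tutorial series on semantic segmentation.

The series covers:

Basic PyTorch usage
Semantic segmentation algorithms

If you are not familiar with how semantic segmentation works or how to set up the development environment, see the previous article in this series, "PyTorch Deep Learning in Practice (Part 1): Semantic Segmentation Basics and Environment Setup".

This article uses the Windows environment set up in the previous article:

Development environment: Windows

Language: Python 3.7.4

Framework: PyTorch 1.3.0

CUDA: 10.2

cuDNN: 7.6.0

This article covers the UNet network structure and the corresponding code.

PS: All code in this article can be downloaded from my GitHub. Follows and stars are welcome: click to view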

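II. UNet Network Structure

In semantic segmentation, the pioneering deep-learning-based algorithm is FCN (Fully Convolutional Networks for Semantic Segmentation). UNet follows the principles of FCN and makes modifications that suit it to simple segmentation problems with small datasets.

UNet paper: click to view

When studying a deep learning algorithm, it helps to start with the network structure; once that is understood, move on to the loss computation, the training procedure, and so on. This article focuses on the UNet network structure; the remaining topics are covered in later installments.

1. How the Network Structure Works

UNet was first published at the MICCAI conference in 2015. In a little over four years, the paper has been cited more than 9,700 times.

UNet has become the baseline for most medical image semantic segmentation tasks. It has also inspired a great deal of research on U-shaped network structures, and many papers have proposed improvements based on UNet.

The two defining features of the UNet structure are the U-shaped architecture and skip connections.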

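UNet is a symmetric network: the left side performs downsampling and the right side performs upsampling.

By function, the series of downsampling operations on the left is called the encoder, and the series of upsampling operations on the right is called the decoder.

The skip connections are the four gray parallel lines in the middle. During upsampling, a skip connection fuses in the feature map produced during downsampling.

The fusion operation used by the skip connections is simple: the channels of the feature maps are stacked together, commonly called Concat.

Concat is easy to understand with an example: take book A, 10 cm × 10 cm and 3 cm thick, and book B, 10 cm × 10 cm and 4 cm thick.

Stack book A and book B with their edges aligned. The result is a stack measuring 10 cm × 10 cm and 7 cm thick.

This "stacking" operation is Concat.

The same applies to feature maps. Take a 256*256*64 feature map, that is, a width (w) of 256, a height (h) of 256, and 64 channels (c). Concatenating it with a 256*256*32 feature map yields a 256*256*96 feature map.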
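
The 64 + 32 = 96 channel arithmetic can be checked directly in PyTorch. The snippet below is a minimal sketch, not part of the UNet code; the tensor names are made up for illustration, and it assumes PyTorch's (N, C, H, W) layout, in which the channel dimension is dim=1.

```python
import torch

# Two feature maps in (N, C, H, W) layout: same height and width,
# different channel counts (64 and 32).
a = torch.zeros(1, 64, 256, 256)
b = torch.ones(1, 32, 256, 256)

# Concat: stack the two maps along the channel dimension (dim=1).
merged = torch.cat([a, b], dim=1)
print(merged.shape)  # torch.Size([1, 96, 256, 256])
```

The first 64 channels of merged come from a and the last 32 from b; nothing is added or averaged, the maps are only stacked.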

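In practice, the two feature maps being concatenated do not always have the same size, for example a 256*256*64 feature map and a 240*240*32 feature map.

There are two ways to handle this:

Option 1: Crop the larger 256*256*64 feature map to 240*240*64, for example by discarding 8 pixels on each of the four sides, then concatenate to get a 240*240*96 feature map.

Option 2: Pad the smaller 240*240*32 feature map to 256*256*32, for example by adding 8 pixels on each of the four sides, then concatenate to get a 256*256*96 feature map.

The UNet implementation in this article uses option 2: the smaller feature map is padded with zeros, an ordinary constant padding. (The original UNet paper crops the larger feature map instead, which is option 1.)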
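
Both options can be tried side by side. This is a rough sketch with made-up tensor names, assuming square feature maps whose size difference is even (16 here); it is separate from the Up module shown later.

```python
import torch
import torch.nn.functional as F

big = torch.randn(1, 64, 256, 256)
small = torch.randn(1, 32, 240, 240)

diff = big.size(2) - small.size(2)  # 16
half = diff // 2                    # 8 pixels per side

# Option 1: center-crop the larger map to 240*240, then concatenate.
cropped = big[:, :, half:half + small.size(2), half:half + small.size(3)]
out_crop = torch.cat([cropped, small], dim=1)

# Option 2: zero-pad the smaller map to 256*256, then concatenate.
# For a 4D tensor, F.pad takes [left, right, top, bottom].
padded = F.pad(small, [half, diff - half, half, diff - half])
out_pad = torch.cat([big, padded], dim=1)

print(out_crop.shape)  # torch.Size([1, 96, 240, 240])
print(out_pad.shape)   # torch.Size([1, 96, 256, 256])
```

Cropping throws away border information from the encoder feature map, while padding keeps it and adds a border of zeros to the decoder feature map.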

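2. Code

If you are not familiar with PyTorch, I recommend the official quick-start tutorial. In about an hour you can pick up the basic concepts and learn how to write PyTorch code.

Official PyTorch basics: click to view

We will split the UNet network into several modules and go through them one by one.

DoubleConv module:

Let's start with the two consecutive convolutions.

As the UNet architecture shows, every level performs two consecutive convolutions, in both the downsampling and the upsampling path. Since this pattern repeats many times in UNet, it is worth writing a separate DoubleConv module: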

import torch.nn as nn

class DoubleConv(nn.Module):
    """(convolution => [BN] => ReLU) * 2"""

    def __init__(self, in_channels, out_channels):
        super().__init__()
        self.double_conv = nn.Sequential(
            nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=0),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=0),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(inplace=True)
        )

    def forward(self, x):
        return self.double_conv(x)

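A note on the PyTorch code above: torch.nn.Sequential is a sequential container; modules are added to it in the order they are passed in. Here the order of operations is: convolution -> BN -> ReLU -> convolution -> BN -> ReLU.

The in_channels and out_channels of the DoubleConv module are configurable, so the module can be reused throughout the network.

For the first level of the network in the architecture figure, in_channels is 1 and out_channels is 64.

The input image is 572*572. A 3*3 convolution with stride 1 and padding 0 produces a 570*570 feature map, and a second convolution produces a 568*568 feature map.

Formula: O = (H − F + 2 × P) / S + 1

H is the input feature map size, O is the output feature map size, F is the kernel size, P is the padding, and S is the stride.

Down module:

The UNet network has four downsampling steps. The modular code is as follows: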
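
The formula is easy to turn into a small helper for checking feature map sizes by hand. The function name below is made up for illustration; the division is written as floor division, which is how PyTorch's Conv2d rounds when the stride does not divide evenly.

```python
def conv_output_size(h, f, p=0, s=1):
    """O = (H - F + 2P) / S + 1, rounded down."""
    return (h - f + 2 * p) // s + 1


# The two 3*3 convolutions of the first DoubleConv (stride 1, padding 0).
first = conv_output_size(572, 3)
second = conv_output_size(first, 3)
print(first, second)  # 570 568

# With padding=1 a 3*3 convolution leaves the size unchanged.
print(conv_output_size(572, 3, p=1))  # 572
```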

class Down(nn.Module):
    """Downscaling with maxpool then double conv"""

    def __init__(self, in_channels, out_channels):
        super().__init__()
        self.maxpool_conv = nn.Sequential(
            nn.MaxPool2d(2),
            DoubleConv(in_channels, out_channels)
        )

    def forward(self, x):
        return self.maxpool_conv(x)

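This code is simple: a max-pooling layer performs the downsampling, followed by a DoubleConv module.

That completes the code for the downsampling path, the left half of UNet. Next is the upsampling path on the right half.

Up module:

The upsampling path is, naturally, built around upsampling, but in addition to the usual upsampling operation it also performs feature fusion.

The code for this part is a bit more involved: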

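# these imports belong at the top of unet_parts.py
import torch
import torch.nn as nn
import torch.nn.functional as F


class Up(nn.Module):
    """Upscaling then double conv"""

    def __init__(self, in_channels, out_channels, bilinear=True):
        super().__init__()

        # Both branches halve the channel count so that, after concatenation
        # with the skip feature map, DoubleConv receives in_channels channels.
        if bilinear:
            # nn.Upsample keeps the channel count, so a 1x1 convolution
            # (added here as a fix) reduces it to in_channels // 2
            self.up = nn.Sequential(
                nn.Upsample(scale_factor=2, mode='bilinear', align_corners=True),
                nn.Conv2d(in_channels, in_channels // 2, kernel_size=1)
            )
        else:
            self.up = nn.ConvTranspose2d(in_channels, in_channels // 2, kernel_size=2, stride=2)

        self.conv = DoubleConv(in_channels, out_channels)

    def forward(self, x1, x2):
        # x1: feature map to upsample; x2: skip feature map from the encoder
        x1 = self.up(x1)
        # tensors are NCHW: dim 2 is height, dim 3 is width
        diffY = x2.size(2) - x1.size(2)
        diffX = x2.size(3) - x1.size(3)

        # zero-pad x1 to the size of x2; F.pad order is [left, right, top, bottom]
        x1 = F.pad(x1, [diffX // 2, diffX - diffX // 2,
                        diffY // 2, diffY - diffY // 2])
        # if you have padding issues, see
        # https://github.com/HaiyongJiang/U-Net-Pytorch-Unstructured-Buggy/commit/0e854509c2cea854e247a9c615f175f76fbb2e3a
        # https://github.com/xiaopeng-liao/Pytorch-UNet/commit/8ebac70e633bac59fc22bb5195e513d5832fb3bd
        x = torch.cat([x2, x1], dim=1)
        return self.conv(x)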

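The code is a bit more involved, so let's take it piece by piece. First, the __init__ function defines the upsampling method and uses DoubleConv for the convolutions. Two upsampling methods are defined: Upsample and ConvTranspose2d, that is, bilinear interpolation and deconvolution (more precisely, transposed convolution).

Bilinear interpolation is easy to understand; here is a diagram:

Those familiar with bilinear interpolation will recognize this figure. In short: given the four points Q11, Q12, Q21 and Q22, compute R1 from Q11 and Q21, compute R2 from Q12 and Q22, and finally compute P from R1 and R2. That process is bilinear interpolation.

For a feature map, this amounts to inserting new points between the existing pixels; the value of each new point is determined by the values of its neighboring pixels.

Deconvolution, as the name suggests, is convolution in reverse. Convolution typically makes the feature map smaller, and deconvolution makes it larger. Diagram:

The blue grid at the bottom is the original image, the white dashed squares around it are padding (usually zeros), and the green grid at the top is the image after the convolution.

The diagram shows a 2*2 feature map being turned into a 4*4 feature map.

In the forward function, x1 receives the data to be upsampled and x2 receives the data for feature fusion. The fusion method is the one described above: pad the smaller feature map first, then concatenate.

OutConv module:

The DoubleConv, Down and Up modules above are enough to assemble the main body of UNet. The network output then needs its channels consolidated according to the number of segmentation classes, as shown in the figure below:

The operation is simple: it is just a change in the number of channels. The figure shows the two-class case (2 channels).

Although the operation is simple and is called only once, let's wrap it in a module as well to keep the code tidy.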
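
The Q11/Q21 -> R1, Q12/Q22 -> R2, R1/R2 -> P procedure can be written out in a few lines of plain Python. This is an illustrative sketch only: the function names are made up, and tx and ty are assumed to be the fractional position of P between the corner points, each in the range 0 to 1. It is not how nn.Upsample is implemented internally.

```python
def lerp(v0, v1, t):
    """Linear interpolation between v0 and v1 at fraction t (0..1)."""
    return v0 + (v1 - v0) * t


def bilinear(q11, q21, q12, q22, tx, ty):
    """Interpolate the value at P from the values at four corner points.

    q11 is the value at (x1, y1), q21 at (x2, y1),
    q12 at (x1, y2) and q22 at (x2, y2).
    """
    r1 = lerp(q11, q21, tx)  # along x on the y1 edge
    r2 = lerp(q12, q22, tx)  # along x on the y2 edge
    return lerp(r1, r2, ty)  # along y between R1 and R2


# P at the center of the four points is their average.
p = bilinear(10.0, 20.0, 30.0, 40.0, tx=0.5, ty=0.5)
print(p)  # 25.0
```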
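
The same 2*2 to 4*4 growth can be reproduced with the layer used in the Up module. The sketch below uses kernel_size=2 and stride=2, as in the code above; the helper function and its name are added here for illustration and assume no output padding and no dilation.

```python
import torch
import torch.nn as nn


def transposed_conv_output_size(h, k, s=1, p=0):
    """O = (H - 1) * S - 2P + K, assuming output_padding=0 and dilation=1."""
    return (h - 1) * s - 2 * p + k


# Same settings as the ConvTranspose2d in the Up module, with 1 channel.
up = nn.ConvTranspose2d(in_channels=1, out_channels=1, kernel_size=2, stride=2)
x = torch.randn(1, 1, 2, 2)
y = up(x)

print(y.shape)                                    # torch.Size([1, 1, 4, 4])
print(transposed_conv_output_size(2, k=2, s=2))   # 4
```

A 3*3 kernel with stride 1 also turns a 2*2 input into a 4*4 output (transposed_conv_output_size(2, k=3) returns 4), which may be the configuration drawn in the diagram.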
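
The padding amounts passed to F.pad in forward are a common source of confusion, so here is a small sketch of the arithmetic. The helper name and the tensor sizes are hypothetical; the point is that diff // 2 goes on one side and the remainder, diff - diff // 2, goes on the other, so an odd difference is still covered exactly.

```python
import torch
import torch.nn.functional as F


def split_padding(target, current):
    """Split the size difference into (before, after) padding amounts."""
    diff = target - current
    before = diff // 2
    after = diff - before  # receives the extra pixel when diff is odd
    return before, after


print(split_padding(64, 56))  # (4, 4)
print(split_padding(65, 56))  # (4, 5)

# F.pad on a 4D tensor takes [left, right, top, bottom]: the first pair
# pads the width (last dimension), the second pair pads the height.
x1 = torch.zeros(1, 1, 56, 56)
left, right = split_padding(64, x1.size(3))
top, bottom = split_padding(65, x1.size(2))
padded = F.pad(x1, [left, right, top, bottom])
print(padded.shape)  # torch.Size([1, 1, 65, 64])
```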

class OutConv(nn.Module):
    def __init__(self, in_channels, out_channels):
        super(OutConv, self).__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size=1)

    def forward(self, x):
        return self.conv(x)
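
The channel change performed by this last layer can be checked with a standalone 1*1 convolution. This is a minimal sketch for the two-class case; the 64*64 spatial size and the variable names are arbitrary choices for illustration, and taking argmax over the channel dimension is one common way to turn per-class scores into a label map, not something prescribed by the code above.

```python
import torch
import torch.nn as nn

# A 1*1 convolution mapping 64 feature channels to 2 class channels.
head = nn.Conv2d(64, 2, kernel_size=1)

features = torch.randn(1, 64, 64, 64)  # (N, C, H, W), size chosen arbitrarily
logits = head(features)
print(logits.shape)  # torch.Size([1, 2, 64, 64])

# One class index per pixel: the channel with the highest score.
labels = logits.argmax(dim=1)
print(labels.shape)  # torch.Size([1, 64, 64])
```

Only the channel count changes; the height and width of the feature map are untouched because the kernel covers a single pixel.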

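All the modules UNet needs are now written. We can put the module code above into a file named unet_parts.py, then create unet_model.py and, following the UNet architecture, set the input and output channel counts of each module and the order in which they are called: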

""" Full assembly of the parts to form the complete network """
"""Refer https://github.com/milesial/Pytorch-UNet/blob/master/unet/unet_model.py"""

import torch.nn.functional as F

from unet_parts import *


class UNet(nn.Module):
    def __init__(self, n_channels, n_classes, bilinear=False):
        super(UNet, self).__init__()
        self.n_channels = n_channels
        self.n_classes = n_classes
        self.bilinear = bilinear

        self.inc = DoubleConv(n_channels, 64)
        self.down1 = Down(64, 128)
        self.down2 = Down(128, 256)
        self.down3 = Down(256, 512)
        self.down4 = Down(512, 1024)
        self.up1 = Up(1024, 512, bilinear)
        self.up2 = Up(512, 256, bilinear)
        self.up3 = Up(256, 128, bilinear)
        self.up4 = Up(128, 64, bilinear)
        self.outc = OutConv(64, n_classes)

    def forward(self, x):
        x1 = self.inc(x)
        x2 = self.down1(x1)
        x3 = self.down2(x2)
        x4 = self.down3(x3)
        x5 = self.down4(x4)
        x = self.up1(x5, x4)
        x = self.up2(x, x3)
        x = self.up3(x, x2)
        x = self.up4(x, x1)
        logits = self.outc(x)
        return logits
    
if __name__ == '__main__':
    net = UNet(n_channels=3, n_classes=1)
    print(net)
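
To see how the feature map size changes through this network without running it, the sizes can be traced with plain arithmetic. The sketch below is an approximation written for this article's settings (3*3 convolutions with padding 0, 2*2 max pooling, upsampling by a factor of 2, and x1 padded up to the size of the skip feature map); the function names are made up for illustration.

```python
def double_conv_size(size):
    """Two 3*3 convolutions with padding 0 each remove 2 pixels."""
    return size - 4


def trace_unet_sizes(size):
    """Return (encoder sizes, output size) for a square input."""
    encoder = [double_conv_size(size)]  # inc
    for _ in range(4):                  # down1 .. down4
        encoder.append(double_conv_size(encoder[-1] // 2))

    current = encoder[-1]
    for skip in reversed(encoder[:-1]):  # up1 .. up4
        current = current * 2            # upsampling doubles the size
        current = skip                   # x1 is padded to the skip size
        current = double_conv_size(current)
    return encoder, current


encoder_sizes, output_size = trace_unet_sizes(572)
print(encoder_sizes)  # [568, 280, 136, 64, 28]
print(output_size)    # 564
```

With padding 0 the output is smaller than the input, so a 572*572 image yields a 564*564 prediction under these assumptions.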

Run python unet_model.py. If there are no errors, you will get the following output:

UNet(
  (inc): DoubleConv(
    (double_conv): Sequential(
      (0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1))
      (1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (2): ReLU(inplace=True)
      (3): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1))
      (4): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (5): ReLU(inplace=True)
    )
  )
  (down1): Down(
    (maxpool_conv): Sequential(
      (0): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
      (1): DoubleConv(
        (double_conv): Sequential(
          (0): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1))
          (1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (2): ReLU(inplace=True)
          (3): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1))
          (4): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (5): ReLU(inplace=True)
        )
      )
    )
  )
  (down2): Down(
    (maxpool_conv): Sequential(
      (0): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
      (1): DoubleConv(
        (double_conv): Sequential(
          (0): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1))
          (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (2): ReLU(inplace=True)
          (3): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1))
          (4): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (5): ReLU(inplace=True)
        )
      )
    )
  )
  (down3): Down(
    (maxpool_conv): Sequential(
      (0): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
      (1): DoubleConv(
        (double_conv): Sequential(
          (0): Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1))
          (1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (2): ReLU(inplace=True)
          (3): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1))
          (4): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (5): ReLU(inplace=True)
        )
      )
    )
  )
  (down4): Down(
    (maxpool_conv): Sequential(
      (0): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
      (1): DoubleConv(
        (double_conv): Sequential(
          (0): Conv2d(512, 1024, kernel_size=(3, 3), stride=(1, 1))
          (1): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (2): ReLU(inplace=True)
          (3): Conv2d(1024, 1024, kernel_size=(3, 3), stride=(1, 1))
          (4): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (5): ReLU(inplace=True)
        )
      )
    )
  )
  (up1): Up(
    (up): ConvTranspose2d(1024, 512, kernel_size=(2, 2), stride=(2, 2))
    (conv): DoubleConv(
      (double_conv): Sequential(
        (0): Conv2d(1024, 512, kernel_size=(3, 3), stride=(1, 1))
        (1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU(inplace=True)
        (3): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1))
        (4): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (5): ReLU(inplace=True)
      )
    )
  )
  (up2): Up(
    (up): ConvTranspose2d(512, 256, kernel_size=(2, 2), stride=(2, 2))
    (conv): DoubleConv(
      (double_conv): Sequential(
        (0): Conv2d(512, 256, kernel_size=(3, 3), stride=(1, 1))
        (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU(inplace=True)
        (3): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1))
        (4): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (5): ReLU(inplace=True)
      )
    )
  )
  (up3): Up(
    (up): ConvTranspose2d(256, 128, kernel_size=(2, 2), stride=(2, 2))
    (conv): DoubleConv(
      (double_conv): Sequential(
        (0): Conv2d(256, 128, kernel_size=(3, 3), stride=(1, 1))
        (1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU(inplace=True)
        (3): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1))
        (4): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (5): ReLU(inplace=True)
      )
    )
  )
  (up4): Up(
    (up): ConvTranspose2d(128, 64, kernel_size=(2, 2), stride=(2, 2))
    (conv): DoubleConv(
      (double_conv): Sequential(
        (0): Conv2d(128, 64, kernel_size=(3, 3), stride=(1, 1))
        (1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU(inplace=True)
        (3): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1))
        (4): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (5): ReLU(inplace=True)
      )
    )
  )
  (outc): OutConv(
    (conv): Conv2d(64, 1, kernel_size=(1, 1), stride=(1, 1))
  )
)

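The network is now built. The next step is to train it; the implementation is covered in the next article in this series.

III. Summary

This article explained the UNet network structure and broke the network down into modules.
The next article explains how to write training code that uses the UNet network.

PS: If you found this article helpful, feel free to follow, comment and like!

All code in this article can be downloaded from my GitHub. Follows and stars are welcome: click to view

References: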

https://github.com/milesial/Pytorch-UNet/blob/master/unet/unet_parts.py

https://blog.csdn.net/qq_38906523/article/details/80520950

https://zhuanlan.zhihu.com/p/37618829

Comments: 72 in total (42 from visitors, 30 from the author)

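Bear
April 27, 2020, 8:18 PM, comment #11
In the Up class, __init__ is set up so that upsampling (Upsample) is used when bilinear=True and deconvolution is used when bilinear=False. But when forward calls it, Upsample (bilinear interpolation) is used directly. Is my understanding correct?
- Jack Cui (Admin)
  April 28, 2020, 10:48 AM, level 1
  @Bear Right. __init__ sets up these methods, and in forward you can call whichever one you want.
野鸽出没
May 14, 2020, 3:59 AM, comment #12
When concatenating, I think UNet uses cropping rather than padding. With padding, the dimensions end up matching x2, but shouldn't we crop x2 down to the dimensions of x1?
- Jack Cui (Admin)
  May 14, 2020, 10:57 AM, level 1
  @野鸽出没 I followed the official implementation, which uses pad, so I used pad here as well. There is no need to get hung up on this: the main idea of UNet is the classic U-shaped structure, and pad versus crop can be chosen to suit your needs.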
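double
June 9, 2020, 8:30 PM, comment #13
I think this tutorial is great, especially for beginners! I have a question: in the upsampling module, what does [diffX // 2, diffX - diffX // 2, diffY // 2, diffY - diffY // 2] mean in the F.pad() call?
- Jack Cui (Admin)
  June 9, 2020, 10:06 PM, level 1
  @double It is the size difference between the two layers; diffX // 2 is the amount of padding needed on one side.
  - double
    June 9, 2020, 10:22 PM, level 2
    @Jack Cui I looked it up: the four values passed to pad() are the padding for the left, right, top and bottom. So does F.pad(x1, [diffX // 2, diffX - diffX // 2, diffY // 2, diffY - diffY // 2]) mean: pad the left of x1 by diffX // 2, the right by diffX - diffX // 2, the top by diffY // 2, and the bottom by diffY - diffY // 2? If so, why are the padding amounts diffX // 2 and diffX - diffX // 2? I am a beginner who has only just started with deep learning and there is a lot I don't understand yet, so I would appreciate some guidance.
    - Jack Cui (Admin)
      June 10, 2020, 7:14 PM, level 3
      @double The feature maps have to be adjusted to the same size before their channels can be concatenated.
cx
June 15, 2020, 3:47 PM, comment #14
Impressive! The article is clear and easy to follow. There is something about the last layer of U-Net that I don't quite understand. Shouldn't the last layer have 2 channels, giving two feature maps, which are then fed into Sigmoid() for classification to end up with one feature map? But the code here has n_classes=1, which I don't understand. I would appreciate some guidance.
- Jack Cui (Admin)
  June 15, 2020, 6:38 PM, level 1
  @cx The background label is not counted.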
吃葡萄不吐桃核
July 31, 2020, 7:12 PM, comment #15
The code in your article differs a bit from the code on GitHub. For example, the padding in DoubleConv is 0 in the article and 1 on GitHub.
- Jack Cui (Admin)
  July 31, 2020, 7:17 PM, level 1
  @吃葡萄不吐桃核 The article follows the paper; the GitHub version uses a padding of 1 so that the feature map size stays the same.
锟斤拷锟斤拷锟斤拷女锟斤拷
August 10, 2020, 9:07 PM, comment #16
diffY = torch.tensor([x2.size()[2] - x1.size()[2]])
diffX = torch.tensor([x2.size()[3] - x1.size()[3]])
x1 = F.pad(x1, [diffX // 2, diffX - diffX // 2,
diffY // 2, diffY - diffY // 2])
I don't quite understand this part either. I would appreciate an explanation, thank you.
- Jack Cui (Admin)
  August 11, 2020, 8:27 PM, level 1
  @锟斤拷锟斤拷锟斤拷女锟斤拷 This makes the feature maps being concatenated the same size.
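123456
August 19, 2020, 5:21 PM, comment #17
How exactly did you do the training? I have CUDA, cuDNN and PyTorch all installed, but I can't follow the training details. Don't I need to download UNet? Where exactly is the training run? Where does the dataset go during training?
- Jack Cui (Admin)
  August 19, 2020, 8:12 PM, level 1
  @123456 Please see the next article.
- 123456
  August 19, 2020, 8:16 PM, level 1
  @123456 Hello, I read the third article, but I still don't know how to train. There are no concrete training details. Where do I start the training? Which folder should the downloaded dataset go in? Is training just a matter of running the .py Python program?
123456
August 19, 2020, 8:17 PM, comment #18
Hello, I read the third article, but I still don't know how to train. There are no concrete training details. Where do I start the training? Which folder should the downloaded dataset go in? Is training just a matter of running the .py Python program? Are there more detailed steps?
- Jack Cui (Admin)
  August 19, 2020, 8:20 PM, level 1
  @123456 Everything you mention is covered; please read it carefully:
  https://cuijiahua.com/blog/2020/03/dl-16.html
小孩子
October 14, 2020, 9:55 AM, comment #19
To use deconvolution, is it enough to change the bilinear parameter to false?
小孩子
October 14, 2020, 9:56 AM, comment #20
To use deconvolution instead of bilinear interpolation, is it enough to change the bilinear parameter to false?
- Jack Cui (Admin)
  October 15, 2020, 2:59 PM, level 1
  @小孩子 Yes.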
