[Paper Notes] LSNet: Extremely Light-Weight Siamese Network For Change Detection in Remote Sensing Image
2022-05-15 07:19:57 [m0_61899108]
Paper
Title: LSNET: EXTREMELY LIGHT-WEIGHT SIAMESE NETWORK FOR CHANGE DETECTION OF REMOTE SENSING IMAGE
Submitted to: CVPR 2022
Paper: https://arxiv.org/abs/2201.09156
Code: https://github.com/qaz670756/LSNet
The idea of the paper is fairly simple and consists of two changes. First, the backbone is made lightweight: a Siamese light-weight backbone is built from Context Guide Block (CGB) modules. Second, the pyramid feature-fusion scheme is improved: building on denseFPN, redundant connections are removed and a bottom-up fusion path is added. The large reduction in parameters and computation comes mainly from the lightweight backbone, which replaces standard convolutions with depthwise separable convolutions.
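To see roughly where the savings come from, here is a minimal parameter-count comparison between a standard 3x3 convolution and its depthwise separable counterpart (an illustrative sketch, not code from the paper):

import torch.nn as nn

c_in, c_out, k = 64, 64, 3

# Standard 3x3 convolution: c_in * c_out * k * k weights.
standard = nn.Conv2d(c_in, c_out, k, padding=1, bias=False)

# Depthwise separable: per-channel 3x3 conv, then a 1x1 pointwise conv.
separable = nn.Sequential(
    nn.Conv2d(c_in, c_in, k, padding=1, groups=c_in, bias=False),  # c_in * k * k
    nn.Conv2d(c_in, c_out, 1, bias=False))                         # c_in * c_out

count = lambda m: sum(p.numel() for p in m.parameters())
print(count(standard), count(separable))  # 36864 vs 4672, roughly an 8x reduction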
Experiment Results
Official training parameters:
{
    "patch_size": 256,
    "augmentation": true,
    "num_gpus": 1,
    "num_workers": 8,
    "num_channel": 3,
    "EF": false,
    "epochs": 101,
    "batch_size": 12,
    "learning_rate": 1e-3,
    "model_name": "denseFPN",
    "loss_function": "contra_hybrid",
    "dataset_dir": "data/Real/subset/",
    "weight_dir": "./outputs/",
    "log_dir": "./log/"
}
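For reference, a minimal way to consume such a config in a training script (a sketch; the file name lsnet_config.json is hypothetical):

import json

with open("lsnet_config.json") as f:
    cfg = json.load(f)

print(cfg["model_name"], cfg["learning_rate"])  # denseFPN 0.001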
Abstract
Siamese networks have gradually become the mainstream approach for change detection in remote sensing images (RSI). However, as structures, modules, and training procedures grow more complicated, models become increasingly complex and hard to apply in practice.
This paper proposes an extremely light-weight Siamese network (LSNet) for RSI change detection. It replaces standard convolutions with depthwise separable dilated convolutions and removes redundant dense connections, keeping only the effective feature flows during Siamese feature fusion, which greatly compresses parameters and computation. On the CDD dataset, compared with the first-ranked model, LSNet reduces parameters and computation by 90.35% and 91.34% respectively, with only a 1.5% drop in accuracy.
Introduction
Traditional RSI change detection methods rely on hand-crafted features and time-consuming pre- and post-processing, and struggle to distinguish semantic changes from background noise.
Image pairs can instead be fed directly into a Siamese convolutional network without preprocessing; end-to-end supervised learning is enough to separate semantically changed regions from unchanged ones.
- The paper proposes LSNet, a highly efficient light-weight Siamese network (Figure 1). Its backbone is built from Context Guide Block (CGB) modules, whose core components are depthwise separable dilated convolutions and global feature aggregation. Compared with a ResNet-50 backbone, the LSNet backbone has only 3.97% of the parameters and 32.56% of the computation.
- It also proposes a differential feature pyramid network (diffFPN) for progressive feature-pair difference extraction and resolution recovery (eliminating redundant connections while preserving the feature flow), which finally separates changed image regions from unchanged ones.
Method
LSNet consists of a Siamese backbone (LightSiamese Backbone) and a differential feature pyramid network (diffFPN). The backbone is built from Context Guide Block (CGB) modules; diffFPN performs efficient fusion of the Siamese feature pairs.
Light-Siamese backbone
Images T1 and T2 pass through a weight-sharing Siamese backbone composed of four composite stages (from top to bottom, the stages contain 3/3/8/12 CGB modules respectively), yielding four groups of feature outputs. Each CGB counts as two layers, so the backbone has (3 + 3 + 8 + 12) x 2 = 52 layers in total.
The basic component, the Context Guide Block (CGB), is shown on the right of Figure 2. The input X passes through parallel dilated convolutions to capture local context over different ranges (receptive fields). The dilated convolutions are computed in a depthwise separable manner: the channels are grouped and each convolution operates only within its own group. (Depthwise separable convolutions greatly reduce computation, but the achievable speedup has an upper bound: the bottleneck is memory-access bandwidth rather than arithmetic.)
The block then performs channel interaction and global information extraction: the local and surrounding features are concatenated, normalized and activated, and then refined by a global context extractor that reweights channels (the f_glo branch in the code below).
Differential feature pyramid network
SNUNet proposed a densely connected pyramid feature-fusion scheme, shown in Figure 3(a).
This denseFPN structure has two problems:
- Redundant connections: shallow features such as T_1,0 and T_2,0 are repeatedly fed into d_1,0, d_2,0 and d_3,0, which is inefficient.
- Unreasonable feature flow: in denseFPN, the output layers d_0,0 and d_1,0 contain incomplete features from the backbone.
The paper therefore proposes the diffFPN structure, which removes the redundant connections and adds a bottom-up fusion path, so that all three output layers contain complete backbone features.
Experiment and Results
Dataset and evaluation metrics
Dataset: CDD
Common metrics: precision, recall, F1-score, overall accuracy
Efficiency metrics: F1-P and F1-G (quantifying the effect of unit parameters and unit computation on the F1 score) and F1-Eff (evaluating the overall efficiency of the model)
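The post does not restate the formulas; one plausible reading (an assumption on my part, not taken from the paper) is F1 normalized by parameter count and by computation:

# Assumed definitions; the paper may normalize differently.
def f1_p(f1, params_in_millions):
    # Hypothetical: F1 contributed per million parameters.
    return f1 / params_in_millions

def f1_g(f1, gflops):
    # Hypothetical: F1 contributed per GFLOP of computation.
    return f1 / gflops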
Accuracy and efficiency comparison
Comparing the two modules in parameters and computation on the CDD dataset, the table shows:
- Compared with ResNet-50, the LightSiamese-52 backbone has roughly 1/25 of the parameters and 1/3 of the computation.
- denseFPN suffers from the unreasonable feature flow described above, while diffFPN adds only 0.0709M parameters yet cuts computation by 1.0884 GFLOPs, more than half.
Comparing multiple methods on the CDD dataset, LSNet ranks within the top three on every performance metric.
In the efficiency comparison, the diffFPN-based method achieves the highest F1-P and F1-G.
Combining Table 2 and Figure 3: compared with SNUNet, LSNet reduces parameters and computation by 90.35% and 91.34% respectively, with only a 1.5% drop in accuracy.
Visualization results of LSNet: the predictions are fairly accurate, but edge details still need refinement. Panel (e) shows that the predicted probability is higher along the edges of changed regions than in their interiors, suggesting the network uses region structure as a discriminative feature, which improves its robustness to color and texture changes.
Conclusion
To detect RSI changes efficiently, the paper proposes a light-weight Siamese network consisting of a light Siamese backbone (LightSiamese Backbone) built from Context Guide Block (CGB) modules and a feature-pair fusion module (diffFPN). Results on the challenging CDD dataset show that, compared with other mainstream methods, the approach achieves competitive results with limited parameters and computation, demonstrating its effectiveness.
Core Code
Context Guide Block
import torch
import torch.nn as nn

# GlobalContextExtractor (a squeeze-and-excitation-style global context module)
# comes from the LSNet repo; build_norm_layer is defined later in this post.

class ContextGuidedBlock(nn.Module):
    """Context Guided Block for CGNet.

    This class consists of four components: local feature extractor,
    surrounding feature extractor, joint feature extractor and global
    context extractor.

    Args:
        in_channels (int): Number of input feature channels.
        out_channels (int): Number of output feature channels.
        dilation (int): Dilation rate for surrounding context extractor.
            Default: 2.
        reduction (int): Reduction for global context extractor. Default: 16.
        skip_connect (bool): Add input to output or not. Default: True.
        downsample (bool): Downsample the input to 1/2 or not. Default: False.
        conv_cfg (dict): Config dict for convolution layer.
            Default: None, which means using conv2d.
        norm_cfg (dict): Config dict for normalization layer.
            Default: dict(type='BN', requires_grad=True).
        act_cfg (dict): Config dict for activation layer.
            Default: dict(type='PReLU').
        with_cp (bool): Use checkpoint or not. Using checkpoint will save some
            memory while slowing down the training speed. Default: False.
    """

    def __init__(self,
                 in_channels,
                 out_channels,
                 dilation=2,
                 reduction=16,
                 skip_connect=True,
                 downsample=False,
                 conv_cfg=None,
                 norm_cfg=dict(type='BN', requires_grad=True),
                 act_cfg=dict(type='PReLU'),
                 with_cp=False):
        super(ContextGuidedBlock, self).__init__()
        self.with_cp = with_cp
        self.downsample = downsample

        channels = out_channels // 2
        if 'type' in act_cfg and act_cfg['type'] == 'PReLU':
            act_cfg['num_parameters'] = channels
        kernel_size = 3 if downsample else 1
        stride = 2 if downsample else 1
        padding = (kernel_size - 1) // 2

        # 1x1 conv (strided 3x3 when downsampling) that halves the channels
        self.conv1x1 = nn.Sequential(
            nn.Conv2d(in_channels, channels, kernel_size=kernel_size,
                      stride=stride, padding=padding),
            build_norm_layer(channels),
            nn.PReLU(num_parameters=channels))

        # f_loc: depthwise 3x3 conv for local context
        self.f_loc = nn.Conv2d(channels, channels, kernel_size=3,
                               padding=1, groups=channels, bias=False)
        # f_sur: depthwise dilated 3x3 conv for surrounding context
        self.f_sur = nn.Conv2d(channels, channels, kernel_size=3,
                               padding=dilation, dilation=dilation,
                               groups=channels, bias=False)

        self.bn = build_norm_layer(2 * channels)
        self.activate = nn.PReLU(2 * channels)

        # The original bottleneck from CGNet ("A light-weight context guided
        # network for semantic segmentation") is removed to save computation:
        # if downsample:
        #     self.bottleneck = build_conv_layer(
        #         conv_cfg, 2 * channels, out_channels, kernel_size=1, bias=False)

        self.skip_connect = skip_connect and not downsample
        self.f_glo = GlobalContextExtractor(out_channels, reduction, with_cp)

    def forward(self, x):
        def _inner_forward(x):
            out = self.conv1x1(x)
            loc = self.f_loc(out)
            sur = self.f_sur(out)

            joi_feat = torch.cat([loc, sur], 1)  # the joint feature
            joi_feat = self.bn(joi_feat)
            joi_feat = self.activate(joi_feat)

            # f_glo is employed to refine the joint feature
            out = self.f_glo(joi_feat)

            if self.skip_connect:
                return x + out
            return out

        return _inner_forward(x)
def cgblock(in_ch, out_ch, dilation=2, reduction=8, skip_connect=False):
    # Convenience wrapper that builds a single non-downsampling CG block.
    return nn.Sequential(
        ContextGuidedBlock(in_ch, out_ch,
                           dilation=dilation,
                           reduction=reduction,
                           downsample=False,
                           skip_connect=skip_connect))
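A quick smoke test for the block (a sketch; it assumes the repo's GlobalContextExtractor is importable alongside the helpers shown later in this post):

import torch

block = cgblock(in_ch=32, out_ch=32, dilation=2, reduction=8)
x = torch.randn(2, 32, 64, 64)
print(block(x).shape)  # torch.Size([2, 32, 64, 64]), spatial size preserved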
light_siamese_backbone
class light_siamese_backbone(nn.Module):
    def __init__(self, in_ch=None, num_blocks=None, cur_channels=None,
                 filters=None, dilations=None, reductions=None):
        super(light_siamese_backbone, self).__init__()
        norm_cfg = {'type': 'BN', 'eps': 0.001, 'requires_grad': True}
        act_cfg = {'type': 'PReLU', 'num_parameters': 32}
        self.inject_2x = InputInjection(1)  # down-sample the input, factor=2
        self.inject_4x = InputInjection(2)  # down-sample the input, factor=4

        # stage 0: num_blocks[0] CG blocks at full resolution
        self.stem = nn.ModuleList()
        for i in range(num_blocks[0]):
            self.stem.append(
                ContextGuidedBlock(
                    cur_channels[0], filters[0],
                    dilations[0], reductions[0],
                    skip_connect=(i != 0),
                    downsample=False,
                    norm_cfg=norm_cfg,
                    act_cfg=act_cfg))  # CG block
            cur_channels[0] = filters[0]
        cur_channels[0] += in_ch  # the raw input is concatenated back in
        self.norm_prelu_0 = nn.Sequential(
            build_norm_layer(cur_channels[0]),
            nn.PReLU(cur_channels[0]))

        # stage 1: the first block downsamples by 2
        self.level1 = nn.ModuleList()
        for i in range(num_blocks[1]):
            self.level1.append(
                ContextGuidedBlock(
                    cur_channels[0] if i == 0 else filters[1],
                    filters[1], dilations[1], reductions[1],
                    downsample=(i == 0),
                    norm_cfg=norm_cfg,
                    act_cfg=act_cfg))  # CG block
        cur_channels[1] = 2 * filters[1] + in_ch
        self.norm_prelu_1 = nn.Sequential(
            build_norm_layer(cur_channels[1]),
            nn.PReLU(cur_channels[1]))

        # stage 2: the first block downsamples by 2
        self.level2 = nn.ModuleList()
        for i in range(num_blocks[2]):
            self.level2.append(
                ContextGuidedBlock(
                    cur_channels[1] if i == 0 else filters[2],
                    filters[2], dilations[2], reductions[2],
                    downsample=(i == 0),
                    norm_cfg=norm_cfg,
                    act_cfg=act_cfg))  # CG block
        cur_channels[2] = 2 * filters[2]
        self.norm_prelu_2 = nn.Sequential(
            build_norm_layer(cur_channels[2]),
            nn.PReLU(cur_channels[2]))

        # stage 3: the first block downsamples by 2
        self.level3 = nn.ModuleList()
        for i in range(num_blocks[3]):
            self.level3.append(
                ContextGuidedBlock(
                    cur_channels[2] if i == 0 else filters[3],
                    filters[3], dilations[3], reductions[3],
                    downsample=(i == 0),
                    norm_cfg=norm_cfg,
                    act_cfg=act_cfg))  # CG block
        cur_channels[3] = 2 * filters[3]
        self.norm_prelu_3 = nn.Sequential(
            build_norm_layer(cur_channels[3]),
            nn.PReLU(cur_channels[3]))

    def forward(self, x):
        # x is the two temporal images concatenated along the batch dimension:
        # x = torch.cat([xA, xB], dim=0)

        # stage 0 (full resolution; the raw input is injected directly)
        inp_2x = x
        inp_4x = self.inject_2x(x)
        for layer in self.stem:
            x = layer(x)
        x = self.norm_prelu_0(torch.cat([x, inp_2x], 1))
        x0_0A, x0_0B = x[:x.shape[0] // 2, :, :, :], x[x.shape[0] // 2:, :, :, :]

        # stage 1 (1/2 resolution); down1 keeps the first block's output
        for i, layer in enumerate(self.level1):
            x = layer(x)
            if i == 0:
                down1 = x
        x = self.norm_prelu_1(torch.cat([x, down1, inp_4x], 1))
        x1_0A, x1_0B = x[:x.shape[0] // 2, :, :, :], x[x.shape[0] // 2:, :, :, :]

        # stage 2 (1/4 resolution)
        for i, layer in enumerate(self.level2):
            x = layer(x)
            if i == 0:
                down1 = x
        x = self.norm_prelu_2(torch.cat([x, down1], 1))
        x2_0A, x2_0B = x[:x.shape[0] // 2, :, :, :], x[x.shape[0] // 2:, :, :, :]

        # stage 3 (1/8 resolution)
        for i, layer in enumerate(self.level3):
            x = layer(x)
            if i == 0:
                down1 = x
        x = self.norm_prelu_3(torch.cat([x, down1], 1))
        x3_0A, x3_0B = x[:x.shape[0] // 2, :, :, :], x[x.shape[0] // 2:, :, :, :]

        return [x0_0A, x0_0B, x1_0A, x1_0B, x2_0A, x2_0B, x3_0A, x3_0B]
class InputInjection(nn.Module):
    """Downsampling module for CGNet."""

    def __init__(self, num_downsampling):
        super(InputInjection, self).__init__()
        self.pool = nn.ModuleList()
        for i in range(num_downsampling):
            self.pool.append(nn.AvgPool2d(3, stride=2, padding=1))

    def forward(self, x):
        for pool in self.pool:
            x = pool(x)
        return x

def build_norm_layer(ch):
    # BatchNorm with a slightly larger eps; all parameters kept trainable.
    layer = nn.BatchNorm2d(ch, eps=0.01)
    for param in layer.parameters():
        param.requires_grad = True
    return layer
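A shape check for the backbone (a sketch, using the same hyperparameters that LSNet_diffFPN below passes in; the repo's GlobalContextExtractor is still assumed):

import torch

n1, in_ch = 32, 3
backbone = light_siamese_backbone(
    in_ch=in_ch,
    num_blocks=(3, 3, 8, 12),
    cur_channels=[in_ch, 0, 0, 0],
    filters=(n1, n1 * 2, n1 * 4, n1 * 8, n1 * 16),
    dilations=(1, 2, 4, 8),
    reductions=(4, 8, 16, 32))

xA, xB = torch.randn(2, 3, 256, 256), torch.randn(2, 3, 256, 256)
feats = backbone(torch.cat([xA, xB], dim=0))  # both branches share one batch
for f in feats:
    print(f.shape)  # four A/B feature pairs at 1/1, 1/2, 1/4 and 1/8 resolution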
diffFPN
class diffFPN(nn.Module):
    def __init__(self, cur_channels=None, mid_ch=None,
                 dilations=None, reductions=None,
                 bilinear=True):
        super(diffFPN, self).__init__()
        # lateral convs for unifying channels
        self.lateral_convs = nn.ModuleList()
        for i in range(4):
            self.lateral_convs.append(
                cgblock(cur_channels[i] * 2, mid_ch * 2 ** i,
                        dilations[i], reductions[i]))

        # top-down convs
        self.top_down_convs = nn.ModuleList()
        for i in range(3, 0, -1):
            self.top_down_convs.append(
                cgblock(mid_ch * 2 ** i, mid_ch * 2 ** (i - 1),
                        dilation=dilations[i], reduction=reductions[i]))

        # diff convs
        self.diff_convs = nn.ModuleList()
        for i in range(3):
            self.diff_convs.append(
                cgblock(mid_ch * (3 * 2 ** i), mid_ch * 2 ** i,
                        dilations[i], reductions[i]))
        for i in range(2):
            self.diff_convs.append(
                cgblock(mid_ch * (3 * 2 ** i), mid_ch * 2 ** i,
                        dilations[i], reductions[i]))
        self.diff_convs.append(
            cgblock(mid_ch * 3, mid_ch * 2,
                    dilation=dilations[0], reduction=reductions[0]))

        # up is the repo's 2x upsampling module (a sketch of it follows below)
        self.up2x = up(32, bilinear)

    def forward(self, output):
        # fuse each A/B feature pair and unify channels
        tmp = [self.lateral_convs[i](torch.cat([output[i * 2], output[i * 2 + 1]], dim=1))
               for i in range(4)]

        # top-down path
        for i in range(3, 0, -1):
            tmp[i - 1] += self.up2x(self.top_down_convs[3 - i](tmp[i]))

        # x0_1
        tmp = [self.diff_convs[i](torch.cat([tmp[i], self.up2x(tmp[i + 1])], dim=1))
               for i in [0, 1, 2]]
        x0_1 = tmp[0]
        # x0_2
        tmp = [self.diff_convs[i](torch.cat([tmp[i - 3], self.up2x(tmp[i - 2])], dim=1))
               for i in [3, 4]]
        x0_2 = tmp[0]
        # x0_3
        x0_3 = self.diff_convs[5](torch.cat([tmp[0], self.up2x(tmp[1])], dim=1))
        return x0_1, x0_2, x0_3
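The up module is imported from the repo and not shown in this post; a minimal stand-in consistent with the call up(32, bilinear) might look like this (an assumption, not the repo's exact code). Note that diffFPN applies up2x to feature maps with varying channel counts, so only the parameter-free bilinear path fits every call site here:

class up(nn.Module):
    # Hypothetical 2x upsampling module: bilinear interpolation when
    # bilinear=True, otherwise a learned transposed conv on `ch` channels.
    def __init__(self, ch, bilinear=True):
        super(up, self).__init__()
        if bilinear:
            self.up = nn.Upsample(scale_factor=2, mode='bilinear',
                                  align_corners=True)
        else:
            self.up = nn.ConvTranspose2d(ch, ch, kernel_size=2, stride=2)

    def forward(self, x):
        return self.up(x)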
LSNet_diffFPN
class LSNet_diffFPN(nn.Module):
    # SNUNet-CD with ECAM (leftover comment from the codebase LSNet derives from)
    def __init__(self, in_ch=3, mid_ch=32, out_ch=2, bilinear=True):
        super(LSNet_diffFPN, self).__init__()
        torch.nn.Module.dump_patches = True
        n1 = 32  # the initial number of channels of the feature map
        filters = (n1, n1 * 2, n1 * 4, n1 * 8, n1 * 16)
        num_blocks = (3, 3, 8, 12)
        dilations = (1, 2, 4, 8)
        reductions = (4, 8, 16, 32)
        cur_channels = [0, 0, 0, 0]
        cur_channels[0] = in_ch

        self.backbone = light_siamese_backbone(in_ch=in_ch, num_blocks=num_blocks,
                                               cur_channels=cur_channels,
                                               filters=filters, dilations=dilations,
                                               reductions=reductions)
        # cam_head is the repo's classification head (not shown in this post)
        self.head = cam_head(mid_ch=mid_ch, out_ch=out_ch)
        self.FPN = diffFPN(cur_channels=cur_channels, mid_ch=mid_ch,
                           dilations=dilations, reductions=reductions,
                           bilinear=bilinear)

        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
            elif isinstance(m, (nn.BatchNorm2d, nn.GroupNorm)):
                nn.init.constant_(m.weight, 1)
                nn.init.constant_(m.bias, 0)

    def forward(self, x, debug=False):
        output = self.backbone(x)
        x0_1, x0_2, x0_3 = self.FPN(output)
        out = self.head(x0_1, x0_2, x0_3)
        if debug:
            # print_flops_params is a profiling helper from the repo
            print_flops_params(self.backbone, [x], 'backbone')
            print_flops_params(self.FPN, [output], 'diffFPN')
            print_flops_params(self.head, [x0_1, x0_2, x0_3], 'head')
        # note: x0_3 appears twice, matching the repo's output signature
        return (x0_1, x0_2, x0_3, x0_3, out)
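End-to-end usage (a sketch; it requires the repo's cam_head and GlobalContextExtractor, and the two temporal images must be concatenated along the batch dimension before the forward pass):

import torch

model = LSNet_diffFPN(in_ch=3, mid_ch=32, out_ch=2)
imgA = torch.randn(4, 3, 256, 256)  # T1 patches
imgB = torch.randn(4, 3, 256, 256)  # T2 patches
x = torch.cat([imgA, imgB], dim=0)  # Siamese branches share one batch
x0_1, x0_2, x0_3, _, out = model(x)
print(out.shape)  # final change logits; exact shape depends on cam_head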
Copyright notice
This article was written by [m0_61899108]. Please include a link to the original when reposting:
https://cht.chowdera.com/2022/135/202205142322539306.html