Buildings Detection in VHR SAR Images Using Fully Convolution Neural Networks

Muhammad Shahzad, Michael Maurer, Friedrich Fraundorfer, Yuanyuan Wang, Xiao Xiang Zhu

Research output: Contribution to journal › Article › Research › peer-review

Abstract

This paper addresses the highly challenging problem of automatically detecting man-made structures, especially buildings, in very high-resolution (VHR) synthetic aperture radar (SAR) images. In this context, the paper makes two major contributions. First, it presents a novel and generic workflow that initially classifies spaceborne SAR tomography (TomoSAR) point clouds, generated by processing VHR SAR image stacks using advanced interferometric techniques known as TomoSAR, into buildings and nonbuildings with the aid of auxiliary information (i.e., either openly available 2-D building footprints or an optical image classification scheme) and later back-projects the extracted building points onto the SAR imaging coordinates to automatically produce large-scale benchmark labeled (buildings/nonbuildings) SAR data sets. Second, these labeled data sets (i.e., building masks) are used to construct and train state-of-the-art deep fully convolution neural networks with an additional conditional random field represented as a recurrent neural network to detect building regions in a single VHR SAR image. Such a cascaded formation has been successfully employed in the computer vision and remote sensing fields for optical image classification but, to our knowledge, has not been applied to SAR images. The building detection results are illustrated and validated over a TerraSAR-X VHR spotlight SAR image covering approximately 39 km² (almost the whole city of Berlin), with a mean pixel accuracy of around 93.84%.
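
The labeling and evaluation steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that points are tested against a 2-D footprint polygon with a standard even-odd ray-casting test, that the back-projection into SAR image coordinates has already been done (the pixel indices `rows` and `cols` are hypothetical inputs, since the real mapping depends on the imaging geometry), and that "mean pixel accuracy" means per-class accuracy averaged over the classes, which the abstract does not define. All function names are illustrative.

```python
import numpy as np


def points_in_polygon(xy, polygon):
    """Even-odd ray casting: flag each point in xy (N, 2) lying inside polygon (M, 2)."""
    x, y = xy[:, 0], xy[:, 1]
    inside = np.zeros(len(xy), dtype=bool)
    x0, y0 = polygon[-1]
    for x1, y1 in polygon:
        # True where the edge crosses the horizontal line through the point.
        straddles = (y0 > y) != (y1 > y)
        # x position of that crossing; horizontal edges never straddle,
        # so the division by zero they would cause is masked out.
        with np.errstate(divide="ignore", invalid="ignore"):
            x_cross = x0 + (y - y0) * (x1 - x0) / (y1 - y0)
        inside ^= straddles & (x < x_cross)
        x0, y0 = x1, y1
    return inside


def building_mask(rows, cols, is_building, shape):
    """Rasterize building points into a binary mask (1 = building, 0 = nonbuilding)."""
    mask = np.zeros(shape, dtype=np.uint8)
    mask[rows[is_building], cols[is_building]] = 1
    return mask


def mean_pixel_accuracy(pred, truth, n_classes=2):
    """Average of the per-class pixel accuracies (assumed metric definition)."""
    accuracies = []
    for c in range(n_classes):
        in_class = truth == c
        if in_class.any():
            accuracies.append((pred[in_class] == c).mean())
    return float(np.mean(accuracies))


# Toy data: one square footprint and four points in map coordinates.
footprint = np.array([[0.0, 0.0], [4.0, 0.0], [4.0, 4.0], [0.0, 4.0]])
points = np.array([[1.0, 1.0], [5.0, 5.0], [2.0, 3.0], [-1.0, 2.0]])
is_building = points_in_polygon(points, footprint)

# Hypothetical pixel positions of the same points in the SAR image.
rows = np.array([0, 1, 2, 3])
cols = np.array([0, 1, 2, 3])
truth = building_mask(rows, cols, is_building, shape=(4, 4))

# A prediction that misses one of the two building pixels.
pred = truth.copy()
pred[0, 0] = 0
score = mean_pixel_accuracy(pred, truth)
```

In this toy case the nonbuilding class is predicted perfectly and one of the two building pixels is missed, so the class-averaged accuracy is much lower than the plain fraction of correct pixels. This is why the choice of metric definition matters when building pixels are a minority of the image.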

Language: English
Pages: 1100-1116
Journal: IEEE Transactions on Geoscience and Remote Sensing
Volume: 57
Issue number: 2
DOI: 10.1109/TGRS.2018.2864716
Status: Published - Feb 2019

Fingerprint

Synthetic aperture radar
Convolution
Neural networks
Image classification
Computer vision
TerraSAR-X
Radar imaging
Detection
Recurrent neural networks
Footprint
Tomography
Train
Masks
Remote sensing
Pixels

Keywords

  • Building detection
  • Buildings
  • Feature extraction
  • fully convolution neural networks (CNNs)
  • OpenStreetMap (OSM)
  • Optical distortion
  • Optical imaging
  • Optical interferometry
  • Optical sensors
  • SAR tomography (TomoSAR)
  • Synthetic aperture radar (SAR)
  • TerraSAR-X/TanDEM-X

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
  • Earth and Planetary Sciences (all)

Cite this

Buildings Detection in VHR SAR Images Using Fully Convolution Neural Networks. / Shahzad, Muhammad; Maurer, Michael; Fraundorfer, Friedrich; Wang, Yuanyuan; Zhu, Xiao Xiang.

In: IEEE Transactions on Geoscience and Remote Sensing, Vol. 57, No. 2, 02.2019, p. 1100-1116.

@article{fa5cfa95a28543f2bf679d7eb785a848,
title = "Buildings Detection in VHR SAR Images Using Fully Convolution Neural Networks",
keywords = "Building detection, Buildings, Feature extraction, fully convolution neural networks (CNNs), OpenStreetMap (OSM), Optical distortion, Optical imaging, Optical interferometry, Optical sensors, SAR tomography (TomoSAR), Synthetic aperture radar, synthetic aperture radar (SAR), TerraSAR-X/TanDEM-X.",
author = "Muhammad Shahzad and Michael Maurer and Friedrich Fraundorfer and Yuanyuan Wang and Zhu, {Xiao Xiang}",
year = "2019",
month = "2",
doi = "10.1109/TGRS.2018.2864716",
language = "English",
volume = "57",
pages = "1100--1116",
journal = "IEEE Transactions on Geoscience and Remote Sensing",
issn = "0196-2892",
publisher = "Institute of Electrical and Electronics Engineers",
number = "2",

}

TY - JOUR
T1 - Buildings Detection in VHR SAR Images Using Fully Convolution Neural Networks
AU - Shahzad, Muhammad
AU - Maurer, Michael
AU - Fraundorfer, Friedrich
AU - Wang, Yuanyuan
AU - Zhu, Xiao Xiang
PY - 2019/2
Y1 - 2019/2
KW - Building detection
KW - Buildings
KW - Feature extraction
KW - fully convolution neural networks (CNNs)
KW - OpenStreetMap (OSM)
KW - Optical distortion
KW - Optical imaging
KW - Optical interferometry
KW - Optical sensors
KW - SAR tomography (TomoSAR)
KW - Synthetic aperture radar
KW - synthetic aperture radar (SAR)
KW - TerraSAR-X/TanDEM-X
UR - http://www.scopus.com/inward/record.url?scp=85054625105&partnerID=8YFLogxK
U2 - 10.1109/TGRS.2018.2864716
DO - 10.1109/TGRS.2018.2864716
M3 - Article
VL - 57
SP - 1100
EP - 1116
JO - IEEE Transactions on Geoscience and Remote Sensing
T2 - IEEE Transactions on Geoscience and Remote Sensing
JF - IEEE Transactions on Geoscience and Remote Sensing
SN - 0196-2892
IS - 2
ER -