Abstract
Unlike targets in natural images, targets in aerial images often appear
in arbitrary orientations. However, existing detectors rely on shared
features to both classify and localize targets. This leads to an
inconsistency between classification and regression: the classifier
needs rotation-invariant features, whereas the regressor needs
rotation-sensitive features. To address this problem, we propose a
Spatial Dual Network (SD-Net) composed of two modules: a Spatial
Coordinate Attention Module (SCAM) and a Polarization Dual Pyramid
Module (PDPM). We construct an attention module containing
convolution kernels that slide in both the horizontal and vertical
directions, which enables the module to capture channel correlation
features and global spatial features in different directions. Then, in
the dual pyramid, a polarization function separates the features suited
to classification from those suited to regression and routes them to the
classifier and regressor of the network, respectively, achieving more
refined detection.
Extensive experiments show that, compared with existing detectors, our
method achieves higher performance on two remote sensing datasets
(HRSC2016 and DOTA) while maintaining high efficiency.
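
To make the two ideas above concrete, the following is a minimal,
single-channel sketch in plain Python. It is not the SD-Net
implementation. It assumes fixed averaging kernels in place of learned
convolution weights, combines the horizontal and vertical responses by
a simple sum followed by a sigmoid, and uses a placeholder polarization
function (a steepened sigmoid and its complement), since the abstract
does not define the actual one. All function names are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def transpose(m):
    return [list(col) for col in zip(*m)]


def slide_horizontal(feature, kernel):
    """Slide a 1 x k kernel along each row (zero padding, odd k)."""
    pad = len(kernel) // 2
    out = []
    for row in feature:
        padded = [0.0] * pad + list(row) + [0.0] * pad
        out.append([
            sum(kernel[t] * padded[j + t] for t in range(len(kernel)))
            for j in range(len(row))
        ])
    return out


def slide_vertical(feature, kernel):
    """Slide a k x 1 kernel down each column by reusing the row version."""
    return transpose(slide_horizontal(transpose(feature), kernel))


def directional_attention(feature, kernel):
    """Assumed fusion: sum both directional responses, then squash to (0, 1)."""
    horiz = slide_horizontal(feature, kernel)
    vert = slide_vertical(feature, kernel)
    return [[sigmoid(h + v) for h, v in zip(hr, vr)]
            for hr, vr in zip(horiz, vert)]


def polarize(attention, sharpness=10.0):
    """Placeholder polarization: NOT the function used in the paper.

    Returns one weight map per task so that each position is split
    between the classification and regression branches.
    """
    cls_w = [[sigmoid(sharpness * (a - 0.5)) for a in row] for row in attention]
    reg_w = [[1.0 - c for c in row] for row in cls_w]
    return cls_w, reg_w


def apply_weights(feature, weights):
    return [[f * w for f, w in zip(fr, wr)] for fr, wr in zip(feature, weights)]


# Toy 3 x 3 feature map with a strong response at the centre.
feature = [[0.0, 1.0, 0.0],
           [1.0, 2.0, 1.0],
           [0.0, 1.0, 0.0]]
kernel = [1 / 3, 1 / 3, 1 / 3]  # assumed fixed averaging kernel

attention = directional_attention(feature, kernel)
cls_w, reg_w = polarize(attention)
cls_feat = apply_weights(feature, cls_w)  # would feed the classifier
reg_feat = apply_weights(feature, reg_w)  # would feed the regressor
```

In this toy setting the centre position receives the largest attention
value, and the two weight maps sum to one at every position, so the
original feature is divided between the two task branches rather than
duplicated. A real detector would apply this per channel and per
pyramid level with learned kernels.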