SILVANet: Semantic Instance-Layer Normalization and Attention with Vertical Axis
Abstract
A large receptive field allows a network to exploit more information from an input image, and recent works have expanded the receptive field to achieve high performance, mainly by means of attention. However, attention is extremely expensive to compute. In this paper, we attempt to resolve this problem in another way that still covers a large receptive field. First, we exploit properties of layer and instance normalization, which optimize parameters and features while reducing the additional computational cost. In addition, we analyze the low performance on small objects along the vertical axis and propose vertical self-attention, which applies vertical-direction pooling to the query and key. We achieve a mean Intersection-over-Union (mIoU) of 73.1 at 191 frames per second (FPS), results comparable to the state of the art on the Cityscapes test set.
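
To make the vertical self-attention idea concrete, the following is a minimal PyTorch sketch of one plausible reading of "pooling with vertical direction on query and key": the query and key are average-pooled across the width, so the attention matrix is computed only along the height axis (H x H instead of HW x HW), which is where the computational saving would come from. The module name, projection layers, pooling choice, and residual connection are all illustrative assumptions, not the paper's actual specification.

```python
import torch
import torch.nn as nn


class VerticalSelfAttention(nn.Module):
    """Illustrative sketch of vertical self-attention (assumed design).

    Query and key are pooled across the width so each row of the feature
    map is summarized by a single vector; attention weights are then
    computed over the vertical (height) dimension only.
    """

    def __init__(self, channels: int):
        super().__init__()
        # 1x1 convolutions as query/key/value projections (assumption).
        self.q_proj = nn.Conv2d(channels, channels, kernel_size=1)
        self.k_proj = nn.Conv2d(channels, channels, kernel_size=1)
        self.v_proj = nn.Conv2d(channels, channels, kernel_size=1)
        self.scale = channels ** -0.5

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        # Pool query/key along the horizontal axis: (B, C, H, W) -> (B, C, H).
        q = self.q_proj(x).mean(dim=3)
        k = self.k_proj(x).mean(dim=3)
        v = self.v_proj(x)  # value stays unpooled: (B, C, H, W)

        # Row-to-row attention over the vertical axis: (B, H, H).
        attn = torch.einsum('bci,bcj->bij', q, k) * self.scale
        attn = attn.softmax(dim=-1)

        # Mix whole rows of the value map with the vertical weights.
        out = torch.einsum('bij,bcjw->bciw', attn, v)  # (B, C, H, W)
        return out + x  # residual connection (assumption)


# Usage example: shapes are preserved, attention cost is O(H^2), not O((HW)^2).
if __name__ == '__main__':
    x = torch.randn(2, 64, 32, 64)
    y = VerticalSelfAttention(64)(x)
    print(y.shape)  # torch.Size([2, 64, 32, 64])
```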