4.3 Improvement of the RetinaNet model
⑴ NAM attention mechanism
Liu et al. [23] proposed a normalization-based attention module (NAM) that suppresses insignificant weights. It applies a sparsity penalty to the attention weights, which improves computational efficiency while maintaining comparable performance. Compared with the SE, BAM, and CBAM attention mechanisms on ResNet and MobileNet, the NAM attention mechanism has been shown to achieve higher accuracy.
The NAM attention mechanism is lightweight and efficient; its channel attention sub-module is shown in Figure 6. The input feature map is first normalized, the normalized channels are weighted by their scaling factors, and the result is passed through a sigmoid to produce the output feature map. The channel attention sub-module uses the scaling factor γ from Batch Normalization (BN), as in Equation ⑴. The scaling factor reflects the variance of each channel and thus indicates that channel's weight: the larger the variance, the more the channel varies, the richer the information it contains, and the greater its weight. To suppress unimportant features, a regularization term is added to the loss function.
$$B_{out} = \mathrm{BN}(B_{in}) = \gamma\,\frac{B_{in}-\mu_B}{\sqrt{\sigma_B^{2}+\varepsilon}} + \beta \qquad ⑴$$

where μ_B is the mean of mini-batch B, σ_B is the standard deviation of mini-batch B, ε is a small constant added for numerical stability, and γ and β are trainable affine transformation parameters (scale and shift).
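To make the data flow of the channel attention sub-module concrete, the following is a minimal PyTorch sketch of the normalize, reweight, and sigmoid steps described above. The module name NAMChannelAttention and the choice of nn.BatchNorm2d to hold the scaling factors γ are assumptions made for illustration, not the authors' released implementation.

```python
import torch
import torch.nn as nn


class NAMChannelAttention(nn.Module):
    """Sketch of a NAM-style channel attention sub-module (cf. Figure 6)."""

    def __init__(self, channels):
        super().__init__()
        # BN layer whose per-channel scaling factors (gamma) serve as the
        # channel weights of Equation (1)
        self.bn = nn.BatchNorm2d(channels, affine=True)

    def forward(self, x):
        residual = x
        x = self.bn(x)                        # normalize the input feature map
        gamma = self.bn.weight.abs()
        w = gamma / gamma.sum()               # channel weight from the scaling factor
        x = x * w.view(1, -1, 1, 1)           # reweight each channel
        return torch.sigmoid(x) * residual    # sigmoid gate applied to the input


# usage sketch
if __name__ == "__main__":
    att = NAMChannelAttention(256)
    feat = torch.randn(2, 256, 32, 32)
    out = att(feat)                           # same shape as the input
    print(out.shape)
```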
The improved loss function is shown in Equation ⑵:

$$\mathrm{Loss} = \sum_{(x,y)} l\big(f(x, W),\, y\big) + p\sum g(\gamma) + p\sum g(\lambda) \qquad ⑵$$

where x denotes the input, y the output, and W the network weights; g(·) is the L1-norm penalty function, and p is the penalty factor that balances g(γ) and g(λ).
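As an illustration of how the regularization term in Equation ⑵ could be added during training, the sketch below sums the L1 norm of the BN scaling factors in the attention modules and adds it to the task loss. The function name nam_regularized_loss and the value of p are illustrative assumptions, not part of the original method's code.

```python
import torch
import torch.nn as nn


def nam_regularized_loss(task_loss, model, p=1e-4):
    """Add the L1 sparsity penalty on the attention scaling factors to the task loss.

    task_loss : scalar tensor from the detection loss l(f(x, W), y)
    model     : network containing the NAM attention modules
    p         : penalty factor balancing g(gamma) and g(lambda) (illustrative value)
    """
    penalty = torch.zeros((), device=task_loss.device)
    for m in model.modules():
        # g(.) = L1 norm of the BN scaling factors used by the attention sub-modules
        if isinstance(m, nn.BatchNorm2d):
            penalty = penalty + m.weight.abs().sum()
    return task_loss + p * penalty
```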