4.3 Improvement of RetinaNet model
⑴ NAM attention mechanism
Liu et al. [23] proposed the normalization-based attention module (NAM), which suppresses insignificant weights by applying a sparsity penalty to the attention modules, improving their computational efficiency while maintaining comparable performance. Compared with the SE, BAM, and CBAM attention mechanisms on ResNet and MobileNet, the NAM attention mechanism has been shown to achieve higher accuracy.
The NAM attention mechanism is lightweight and efficient; its channel attention sub-module is shown in Figure 6. The input feature map is batch-normalized, re-weighted channel-wise by the normalized scaling factors, and passed through a sigmoid to produce the output feature map. The channel attention sub-module uses the scaling factor of Batch Normalization (BN), as in Equation (1). The scaling factor measures the variance of each channel and therefore indicates that channel's weight: the larger the variance, the more the channel varies, the richer the information it contains, and the greater its weight. To suppress the unimportant features, a regularization term is added to the loss function.
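Equation (1), referred to above, is the standard batch-normalization transform; written out (with ε the usual small numerical-stability constant, added here by assumption), it is

B_{out} = \mathrm{BN}(B_{in}) = \gamma \, \frac{B_{in} - \mu_B}{\sqrt{\sigma_B^{2} + \epsilon}} + \beta \qquad (1)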
Here, μ_B is the mean of mini-batch B and σ_B is its standard deviation; γ and β are trainable affine transformation parameters (scale and shift).
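As an illustration of this sub-module, a minimal PyTorch-style sketch is given below. The class name NAMChannelAttention and the use of nn.BatchNorm2d as the source of the scaling factors γ are assumptions made for illustration, not the exact implementation used in this work.

```python
import torch
import torch.nn as nn


class NAMChannelAttention(nn.Module):
    """Minimal sketch of a NAM-style channel attention sub-module (hypothetical name)."""

    def __init__(self, channels: int):
        super().__init__()
        # BatchNorm2d stores the per-channel scaling factors gamma in self.bn.weight
        self.bn = nn.BatchNorm2d(channels, affine=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        residual = x
        x = self.bn(x)                         # normalize the input features, Equation (1)
        gamma = self.bn.weight.abs()
        weights = gamma / gamma.sum()          # channel weights from the scaling factors
        x = x * weights.view(1, -1, 1, 1)      # re-weight each channel by its importance
        return torch.sigmoid(x) * residual     # sigmoid gate applied to the input features


# Usage: attend over a 256-channel feature map (e.g. one FPN level of RetinaNet)
att = NAMChannelAttention(256)
out = att(torch.randn(2, 256, 32, 32))         # output shape matches the input
```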
The improved loss function is shown in Equation (2), where x denotes the input, y the output, W the network weights, g(γ) the L1-norm penalty on the scaling factor, and p the penalty factor that balances g(γ) and g(λ).
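A plausible form of Equation (2), following the NAM formulation in [23], is sketched below; l(·) denotes the task loss and λ the scaling factor of the corresponding pixel (spatial) normalization, both taken from [23] rather than stated explicitly above:

\mathrm{Loss} = \sum_{(x,\,y)} l\big(f(x, W),\, y\big) + p \sum g(\gamma) + p \sum g(\lambda) \qquad (2)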