2 Related work

This section reviews the work most relevant to our study. Section 2.1 covers insulator detection, and Section 2.2 covers insulator segmentation, which can be viewed as pixel-level detection of insulators.

2.1 Insulator detection

Object detection predicts a bounding box that indicates a target’s location, together with the target’s category. Similarly, insulator detection and insulator defect detection aim to locate insulators by enclosing them in bounding boxes and to identify their categories or defect categories. Early methods for insulator defect detection combined computer vision and machine learning techniques [1], [13], [14]. These methods relied heavily on hand-crafted features, which were time-consuming to design and required the assistance of experienced experts.
In recent years, deep learning-based detectors have been introduced for insulator detection [17]–[20]. These studies can be divided into one-stage and two-stage detectors. One-stage detectors are typically drawn from the You Only Look Once (YOLO) family of deep neural networks [21]–[24], whereas two-stage detectors include Region-based CNN (RCNN) and its variants [25]–[27].
Various one-stage detectors have been used to identify defective regions of insulators. Yang et al. incorporated a lightweight backbone into the vanilla YOLO v3 architecture to identify missing-cap insulators [19]. The lightweight backbone is based on MobileNet [28] with spatial pyramid pooling [37]. Similarly, a lightweight YOLO v4 was proposed in [20] to balance detection accuracy and detection speed for insulator detection. The two approaches are analogous in that both replace the original backbone with MobileNet. Furthermore, Han et al. presented TinyYOLO v4, which merged a self-attention module into the Feature Pyramid Network (FPN) [38] to enhance channel-level feature fusion [10]. This channel-wise self-attention helps the network learn better feature representations. After the release of YOLO v5, its pipeline was also adopted in insulator detection research. In [39], four versions of YOLO v5 were explored for localizing insulator defects, and the most suitable network architecture was chosen through comparative experiments. Gao et al. modified the YOLO v5 pipeline by incorporating a triplet attention module to improve the detection of small insulator defects [9]. Another attempt to combine attention mechanisms with YOLO v5 was reported in [29]. Lan et al. introduced the Convolutional Block Attention Module (CBAM) to provide more channel and spatial context information for insulator defect detection.
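To make the idea of channel-level attention concrete, the following is a minimal sketch of channel reweighting in plain Python. It is not the module used in [9], [10], or [29]: the function name `channel_attention` is hypothetical, the feature map is assumed to be a nested list indexed as [channel][row][column], and the learned layers that real attention modules place before the gate are omitted.

```python
import math


def channel_attention(feature_map):
    """Reweight the channels of a [channel][row][col] nested-list feature map."""
    # Squeeze: global average pooling yields one descriptor per channel.
    descriptors = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Gate: a sigmoid maps each descriptor to a weight in (0, 1).
    # Assumption: real modules apply small learned layers before this step.
    weights = [1.0 / (1.0 + math.exp(-d)) for d in descriptors]
    # Excite: scale every activation in a channel by that channel's weight.
    reweighted = [
        [[w * v for v in row] for row in channel]
        for channel, w in zip(feature_map, weights)
    ]
    return weights, reweighted
```

Channels with larger average activation receive weights closer to one, so their responses are preserved while weaker channels are suppressed.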
The methods listed above rely on one-stage detectors. Two-stage object detection frameworks have also been applied to insulator detection. In [17], [30], Faster RCNN was used to first roughly localize the regions where insulators are most likely to exist, referred to as “proposals” in the framework. These proposals were then fed into the second-stage network, a multitask head, to refine the localization of the insulators’ defects. Moreover, Tao et al. modeled insulator defect detection as a two-level task comprising insulator localization and defect detection [18]. Their framework consists of two cascaded Faster RCNNs: one with a VGG16 backbone for localization and another with the original Faster RCNN for detecting defective regions. Zhong et al. modified the standard Faster RCNN pipeline to handle arbitrarily oriented insulators [31] by introducing an oriented Region Proposal Network (RPN). In [32], an attention mechanism was introduced into Faster RCNN to detect self-explosion defects of insulators. Specifically, an adaptive receptive field network was proposed and inserted into the FPN backbone.
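The two-stage control flow described above can be sketched as follows. This is an illustrative toy, not Faster RCNN: `Proposal`, `stage_one`, `stage_two`, and `toy_head` are hypothetical names, the objectness scores are assumed to be given, and the rule inside `toy_head` merely stands in for a learned multitask head.

```python
from dataclasses import dataclass


@dataclass
class Proposal:
    box: tuple          # (x_min, y_min, x_max, y_max) in pixels
    objectness: float   # stage-one score that the box contains an object


def stage_one(candidates, keep_top=2):
    """Keep the highest-scoring candidate regions as proposals."""
    ranked = sorted(candidates, key=lambda p: p.objectness, reverse=True)
    return ranked[:keep_top]


def stage_two(proposals, head):
    """Run a per-proposal head that returns (label, score, refined_box)."""
    return [head(p) for p in proposals]


def toy_head(proposal):
    # Hypothetical stand-in for a learned head: label by box width and
    # shrink the box by one pixel per side to mimic box refinement.
    x0, y0, x1, y1 = proposal.box
    label = "insulator" if (x1 - x0) >= 50 else "background"
    return label, proposal.objectness, (x0 + 1, y0 + 1, x1 - 1, y1 - 1)
```

The point of the split is that the second stage only processes the few regions retained by the first, which is what separates these frameworks from one-stage detectors that predict boxes and categories in a single pass.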

2.2 Insulator segmentation

Other studies focus on segmenting insulators or defective regions from the background. In [40], Li et al. proposed a framework with two cascaded networks to detect insulators globally and segment local defects. The segmentation model incorporated an attention mechanism into an improved version of U-Net [41]. Efficient Channel Attention Networks (ECA-Net) was also adopted as the U-Net encoder, providing another example of fusing an attention mechanism into insulator segmentation [42]. Yu et al. introduced fine-grained texture into the SINet architecture and simultaneously improved a positioning network to segment defective regions of insulators [2]. Antwi-Bekoe et al. addressed insulator segmentation with a common instance segmentation framework [43], in which the detection and mask branches together perform instance-level segmentation. Xuan et al. improved the backbone with a squeeze-excitation module and predicted the insulator mask with a spatial attention module, reporting strong results in insulator defect segmentation [44].
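The relationship between segmentation and detection can be illustrated with a short sketch: a binary mask determines a bounding box, whereas a box does not determine a mask. The helper `mask_to_box` is hypothetical and not taken from any cited work. It assumes the mask is a list of rows with nonzero values marking foreground pixels, and it returns inclusive pixel indices.

```python
def mask_to_box(mask):
    """Return (x_min, y_min, x_max, y_max) of the foreground, or None if empty."""
    # Row indices that contain at least one foreground pixel.
    rows = [y for y, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    # Column indices of every foreground pixel.
    cols = [x for row in mask for x, value in enumerate(row) if value]
    return (min(cols), min(rows), max(cols), max(rows))
```

For example, a mask whose foreground pixels lie at (x, y) positions (1, 1), (2, 1), and (2, 2) gives the box (1, 1, 2, 2), which also covers the background pixel at (1, 2) that the mask excludes.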