As depicted in Figure 1, our modified Faster RCNN consists of three components: Module A (the Feature Pyramid Network (FPN) backbone), Module B (the PU-RPN), and Module C (the ROI Head). First, the FPN serves as the feature extractor and computes the feature maps that are fed to Modules B and C. Second, the RPN, the first stage of the detector, performs binary classification, separating insulator regions from background. The resulting insulator regions are called proposals in the Faster RCNN framework, and the ROI Head later classifies them as good insulators or as one of several types of defective insulators. To address incomplete annotation, we introduce the PU learning strategy [45] into the vanilla RPN and denote the result as PU-RPN. Finally, Module C (the ROI Head), the second stage of the detector, uses the proposals to refine the predicted category and bounding-box localization of each insulator. We apply focal loss to the ROI Head to mitigate the effect of sample imbalance. The details of these components are described in the following subsections.
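
To illustrate why focal loss helps with sample imbalance, the following minimal sketch compares it with plain cross-entropy on one well-classified and one poorly classified sample. It assumes the standard form FL(p_t) = -(1 - p_t)^gamma * log(p_t) with gamma = 2 and no class-weighting term; the exact variant and hyperparameters used in our ROI Head are not specified in this section, and the function names and probability vectors are hypothetical.

```python
import math


def cross_entropy(probs, target):
    """Cross-entropy for one sample, given class probabilities."""
    return -math.log(probs[target])


def focal_loss(probs, target, gamma=2.0):
    """Focal loss for one sample: -(1 - p_t)^gamma * log(p_t).

    gamma=2.0 is an assumed default, not a value taken from this paper.
    """
    p_t = probs[target]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


# Hypothetical softmax outputs over three classes; the true class is index 1.
easy = [0.05, 0.90, 0.05]  # confidently correct
hard = [0.60, 0.30, 0.10]  # poorly classified

for name, probs in (("easy", easy), ("hard", hard)):
    ce = cross_entropy(probs, 1)
    fl = focal_loss(probs, 1)
    print(f"{name}: CE={ce:.4f}  FL={fl:.4f}  FL/CE={fl / ce:.2f}")
```

The ratio FL/CE equals (1 - p_t)^gamma, so the abundant, easily classified samples are down-weighted far more strongly than the rare, hard ones.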

3.1 FPN as feature extractor

Object detection scenes contain both large and small targets. Insulator defect detection is no exception: images contain large insulator strings as well as small individual insulators. Larger targets tend to be detected in high-level feature maps, which have low resolution but rich semantic information. Smaller damaged insulators, however, occupy too few pixels to be distinguished in those high-level feature maps. A multi-scale strategy is therefore key to insulator defect detection.
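
The scale argument can be made concrete with a little arithmetic. The sketch below estimates how many feature-map cells a target covers at each pyramid level. The strides (4, 8, 16, 32 for levels P2 to P5) are the usual FPN configuration and are an assumption here, as are the two object sizes and all names in the code.

```python
# Assumed strides for pyramid levels P2..P5 (the common FPN setting;
# the strides used in this paper are not stated in this section).
STRIDES = {"P2": 4, "P3": 8, "P4": 16, "P5": 32}


def footprint(box_w, box_h, stride):
    """Approximate size, in feature-map cells, of a box at a given stride."""
    return box_w / stride, box_h / stride


def footprints(box_w, box_h, strides=STRIDES):
    """Footprint of one box at every pyramid level."""
    return {level: footprint(box_w, box_h, s) for level, s in strides.items()}


# Hypothetical object sizes in input-image pixels (width, height).
insulator_string = (512, 128)
small_defect = (24, 24)

for name, (w, h) in (("string", insulator_string), ("defect", small_defect)):
    for level, (fw, fh) in footprints(w, h).items():
        print(f"{name:>6} @ {level}: {fw:.2f} x {fh:.2f} cells")
```

Under these assumptions the small defect covers less than one cell at the coarsest level, whereas the insulator string still spans several cells there, which is why the finer levels are needed for small targets.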