As depicted in Figure 1, our modified
Faster RCNN consists of three components: Module A (the Feature
Pyramid Network (FPN) backbone), Module B (the PU-RPN), and
Module C (the ROI Head). First, the FPN serves as the feature
extractor: it computes the feature maps that are fed to
Modules B and C. Second, the RPN, the first
stage of the detector, performs binary classification, separating
insulator regions from background. The generated insulator regions are
denoted as proposals in the Faster RCNN framework, and the ROI Head
later classifies them as good insulators or as one of several types of
defective insulators. To cope with incomplete annotation, we incorporate
the positive-unlabeled (PU) learning strategy [45] into the vanilla RPN
and call the result PU-RPN. Finally, Module C (ROI Head), the second stage of the
detector, uses the proposals to refine the predicted category and
bounding-box location of each insulator. We apply focal loss in the ROI Head
to mitigate the effect of sample imbalance. The details of
these components are described in the following subsections.
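To illustrate why focal loss helps with sample imbalance, the following minimal sketch compares it with plain cross-entropy for a single proposal. It is an illustration only, not the implementation used in this work: the function names, the three-class logits, and the focusing parameter gamma = 2.0 are assumptions chosen for the example, and no class weighting term is included.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Standard cross-entropy for one sample with true class index `target`."""
    return -math.log(softmax(logits)[target])


def focal_loss(logits, target, gamma=2.0):
    """Focal loss for one sample: -(1 - p_t)^gamma * log(p_t).

    gamma = 2.0 is an assumed value for illustration; gamma = 0 recovers
    plain cross-entropy.
    """
    p_t = softmax(logits)[target]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


# Hypothetical logits for three classes (e.g. good, defect type 1, defect type 2).
logits = [4.0, 0.0, 0.0]

# Well-classified sample: true class 0 has high probability.
easy_ratio = focal_loss(logits, 0) / cross_entropy(logits, 0)
# Misclassified sample: true class 1 has low probability.
hard_ratio = focal_loss(logits, 1) / cross_entropy(logits, 1)

print(f"easy sample keeps {easy_ratio:.4f} of its cross-entropy loss")
print(f"hard sample keeps {hard_ratio:.4f} of its cross-entropy loss")
```

The modulating factor (1 - p_t)^gamma shrinks the loss of well-classified samples far more than that of misclassified ones, so the many easy samples of a dominant class contribute less to the total loss.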
3.1 FPN as feature extractor
Scenes in object detection contain both large and small targets.
Insulator defect detection is no exception: images contain large
insulator strings as well as small individual insulators. Larger
objects tend to be detected in high-level feature maps, which have low
resolution and rich semantic information, whereas smaller damaged
insulators occupy too few pixels to be distinguished in those
high-level feature maps. A multi-scale strategy is therefore key to
insulator defect detection.
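The pixel argument above can be made concrete with a small arithmetic sketch. The strides 4, 8, 16, and 32 are the values commonly used for FPN levels on a ResNet backbone and are an assumption here, as are the two object sizes; the helper name `footprint_on_levels` is hypothetical.

```python
def footprint_on_levels(width_px, height_px, strides=(4, 8, 16, 32)):
    """Approximate size of an object, in feature-map cells, at each stride.

    A feature map with stride s has one cell per s x s block of input
    pixels, so an object of w x h pixels spans about (w / s) x (h / s) cells.
    """
    return {s: (width_px / s, height_px / s) for s in strides}


# Assumed example sizes in input pixels: a small damaged insulator
# and a large insulator string.
small = footprint_on_levels(24, 24)
large = footprint_on_levels(512, 160)

for stride in (4, 8, 16, 32):
    sw, sh = small[stride]
    lw, lh = large[stride]
    print(f"stride {stride:2d}: small object {sw:.2f} x {sh:.2f} cells, "
          f"large object {lw:.2f} x {lh:.2f} cells")
```

Under these assumptions the small object covers less than one cell at stride 32 but several cells at stride 4, while the large object remains well resolved even at the coarsest level, which is why predictions are made at multiple scales.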