4.1 Dataset

4.1.1 Dataset description

The Insulator Defect Image Dataset (IDID) (https://ieee-dataport.org/competitions/insulator-defect-detection), a widely used insulator dataset, is utilized for model assessment. As shown in Figure 6, the IDID comprises high-resolution aerial images of insulators together with the corresponding annotations. Each annotation consists of a category label and a bounding box. Insulator shells are classified as “Good,” “Broken,” or “FlashDamaged.” Additionally, there exists a fourth category termed “Insulator String,” which refers to a cluster of insulator shells depicted in an image. Figure 6 shows the bounding boxes of the different categories in distinct colors. The IDID training set comprises a total of 1600 aerial images, encompassing 2636 “Good”, 1140 “Broken”, and 2004 “FlashDamaged” insulator shells. These insulator shells collectively form 1788 insulator strings.
However, a portion of the images in IDID have incomplete annotations. Figure 7 displays several aerial images in which some insulators are not annotated; specifically, the yellow boxes in the second row of images mark unlabeled insulators. These missing annotations likely stem from the dense arrangement of insulators and oversight on the part of the annotators. In many studies [], the IDID dataset has nevertheless been treated as a perfectly annotated dataset despite the missing annotations. It is therefore more reasonable to regard IDID as a partially annotated dataset, and the partially annotated scenario studied in this paper is of practical significance.

4.1.2 Dataset split

To eliminate interference from sample imbalance, model evaluation is performed on class-balanced validation and test sets. Broken insulator shells are the least numerous of the four classes; their samples are distributed into the training, validation, and test sets in a ratio of 5:2:3, which yields 228 validation and 342 test samples. Accordingly, we randomly select 228 and 342 samples from each category to form the validation and test sets, respectively. The remaining samples of each category together form the training set, which consists of 1218 insulator strings, 1756 good insulator shells, 570 broken insulator shells, and 1434 flashover-damaged insulator shells. In all our experiments, the random seed is set to 1.
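For concreteness, the split procedure can be reproduced along the following lines. This is a minimal sketch assuming the annotations are grouped by category beforehand; the function and variable names are illustrative, not taken from any released code.

```python
# Minimal sketch of the category-balanced split described above.
# Assumption: samples_by_category maps each category name to a list of
# sample identifiers; these names are hypothetical, for illustration only.
import random

VAL_PER_CLASS = 228   # 20% of the 1140 broken shells (ratio 5:2:3)
TEST_PER_CLASS = 342  # 30% of the 1140 broken shells

def split_by_category(samples_by_category, seed=1):
    rng = random.Random(seed)  # the paper fixes the random seed to 1
    train, val, test = [], [], []
    for category, samples in samples_by_category.items():
        shuffled = samples[:]
        rng.shuffle(shuffled)
        # 228 validation and 342 test samples per category keep the
        # evaluation sets class-balanced; the remainder goes to training.
        val.extend(shuffled[:VAL_PER_CLASS])
        test.extend(shuffled[VAL_PER_CLASS:VAL_PER_CLASS + TEST_PER_CLASS])
        train.extend(shuffled[VAL_PER_CLASS + TEST_PER_CLASS:])
    return train, val, test
```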

4.2 Experimental setup

4.2.1 Implementation details

To verify our method with different proportions of annotations, we randomly remove a portion of the annotations from the training set. The Annotation PerCent (APC) denotes the percentage of annotations that remain after this removal procedure. The other significant hyper-parameters are as follows: the batch size is set to 16, the learning rate to 0.02, the total number of iterations to 10,000, and the validation set is evaluated every 200 iterations. The best model is selected according to the AP metric on the validation set and then applied to the test set. Finally, the data augmentation in our framework comprises horizontal and vertical flips in addition to the default data augmentation strategy of Detectron2 (https://github.com/facebookresearch/detectron2).
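The setup above can be expressed in Detectron2 as follows. The config keys are Detectron2's standard ones; the annotation-removal helper is a hypothetical illustration of the APC procedure, not the authors' released code.

```python
# Hedged sketch of the experimental setup. drop_annotations is an
# illustrative helper (assumption); the config keys are standard Detectron2.
import random
from detectron2.config import get_cfg
import detectron2.data.transforms as T

def drop_annotations(dataset_dicts, apc, seed=1):
    """Randomly keep a fraction `apc` of the bounding-box annotations."""
    rng = random.Random(seed)
    for record in dataset_dicts:  # Detectron2-style dataset dicts
        record["annotations"] = [
            ann for ann in record["annotations"] if rng.random() < apc
        ]
    return dataset_dicts

cfg = get_cfg()
cfg.SOLVER.IMS_PER_BATCH = 16   # batch size
cfg.SOLVER.BASE_LR = 0.02       # learning rate
cfg.SOLVER.MAX_ITER = 10000     # total iterations
cfg.TEST.EVAL_PERIOD = 200      # validation interval (iterations)

# Horizontal and vertical flips on top of Detectron2's default augmentation.
extra_augmentations = [
    T.RandomFlip(prob=0.5, horizontal=True, vertical=False),
    T.RandomFlip(prob=0.5, horizontal=False, vertical=True),
]
```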

4.2.2 Software platform

The experiments were conducted on a Linux server, with Visual Studio Code (VSCode) as the integrated development environment (IDE). PyTorch and Python were selected as the deep learning toolkit and the programming language, respectively. The server hardware mainly comprises two Intel(R) Xeon(R) E5-2680 v4 CPUs (14 cores each, running at 2.4 GHz), 256 GB of memory, and two Nvidia GeForce RTX 3090 GPUs.

4.2.3 Adopted baselines

To verify the effectiveness of our proposed method on incompletely annotated data, we conducted experiments comparing it with other mainstream methods under different APCs (1, 0.7, 0.5, and 0.3). Our proposed framework is a Positive-Unlabeled (PU) framework, which can be viewed as a combination of a Positive-Negative (PN) pipeline and a PU loss. Therefore, we first selected existing mainstream PN object detection algorithms [31], [32] to ablate the influence of the PU loss. Furthermore, we also introduced several PU-based detectors [34], [49] as comparison methods.
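For intuition on the PU-loss component, the sketch below follows the widely used non-negative PU risk estimator of Kiryo et al. with a logistic loss. It is a generic formulation for illustration only, under an assumed class prior; it is not necessarily the exact loss used in our framework or in the compared PU-based detectors.

```python
# Generic non-negative PU risk estimator (illustrative assumption, not the
# paper's exact loss). Unlabeled data are treated as negatives, and the
# contribution of the hidden positives among them is subtracted, clamped at 0.
import torch
import torch.nn.functional as F

def nn_pu_loss(scores_pos, scores_unl, prior, loss_fn=F.softplus):
    """scores_pos: detector scores on labeled positives;
    scores_unl: scores on unlabeled samples;
    prior: assumed fraction of positives among the unlabeled data."""
    risk_pos = prior * loss_fn(-scores_pos).mean()          # positives as positive
    risk_pos_as_neg = prior * loss_fn(scores_pos).mean()     # positives as negative
    risk_unl_as_neg = loss_fn(scores_unl).mean()             # unlabeled as negative
    # Non-negative correction: the estimated negative risk cannot go below 0.
    risk_neg = torch.clamp(risk_unl_as_neg - risk_pos_as_neg, min=0.0)
    return risk_pos + risk_neg
```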
PN-based object detection algorithms typically fall into two frameworks: one-stage and two-stage. Existing one-stage frameworks for insulator detection are usually based on YOLO v3 and YOLO v4 with a MobileNet backbone, abbreviated as M-YOLO v3 [19] and M-YOLO v4 [20], respectively. Since our study does not focus on lightweight computing, we combined DarkNet53 with the aforementioned YOLO frameworks (D-YOLO v3 [23] and D-YOLO v4 [24]), as well as with YOLO v5 (D-YOLO v5) [39]. Because
Table 2: Detection results of our method and other methods under the complete annotations provided by IDID.