The more accurately the class prior must be estimated, the smaller the
search interval must be. In other words, a high-precision class prior
leads to high computational complexity for the grid search method. For
example, if the class prior is searched with an interval of 0.1, the
computational complexity increases tenfold over the original.
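The cost argument above can be illustrated with a minimal sketch. It assumes, for illustration only, that the candidates are evenly spaced over (0, 1] and that each candidate costs one full training run; the helper name `grid_candidates` is hypothetical.

```python
from typing import List


def grid_candidates(interval: float) -> List[float]:
    """Evenly spaced class-prior candidates in (0, 1] for a given interval.

    Assumption: every candidate requires one full training run, so the
    length of the returned list is the cost multiplier of the grid search.
    """
    if not 0.0 < interval <= 1.0:
        raise ValueError("interval must lie in (0, 1]")
    count = round(1.0 / interval)
    # Rounding removes floating-point noise such as 0.30000000000000004.
    return [round((k + 1) * interval, 10) for k in range(count)]


coarse = grid_candidates(0.1)   # 10 candidates, i.e. 10 training runs
fine = grid_candidates(0.01)    # 100 candidates, i.e. 100 training runs
```

Shrinking the interval by a factor of ten multiplies the number of runs by ten, which is why a finer estimate quickly becomes impractical.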
In [34], the computation of the class prior is transformed from a grid
search over the class prior into determining a confidence threshold,
which is used to predict whether the anchors are positive. Suppose a set
of anchors $\{(a_i, p_i)\}_{i=1}^{N}$, where $a_i$ and $p_i$ stand for
the $i$-th anchor box and its probability of being positive. $N$ is the
number of anchors and $p_i$ is obtained by the RPN. The class prior
$\pi$ is computed as

$$\pi = \frac{1}{N} \sum_{i=1}^{N} \mathbb{1}\left[ p_i \ge \tau \right]$$

where $\tau$ denotes the confidence threshold. In conclusion, the
threshold directly determines the number of positive anchors, which in
turn affects the estimation of the class prior. $\tau$ is therefore
itself a hyper-parameter and needs to be estimated by grid search.
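A minimal sketch of the threshold-based estimate may help here. It assumes the class prior is the fraction of anchors whose RPN probability reaches the threshold; whether the comparison is strict or not is an assumption, and the function name and the example scores are hypothetical.

```python
from typing import Sequence


def class_prior_from_threshold(probs: Sequence[float], tau: float) -> float:
    """Fraction of anchors whose positive probability is at least tau.

    Assumption: an anchor counts as positive when p >= tau.
    """
    if len(probs) == 0:
        raise ValueError("probs must contain at least one anchor score")
    num_positive = sum(1 for p in probs if p >= tau)
    return num_positive / len(probs)


# Hypothetical RPN scores for five anchors, not taken from any experiment.
scores = [0.05, 0.15, 0.25, 0.60, 0.90]
prior_low = class_prior_from_threshold(scores, tau=0.2)   # 3 of 5 anchors
prior_high = class_prior_from_threshold(scores, tau=0.5)  # 2 of 5 anchors
```

The same scores yield different priors under different thresholds, which is why the threshold has to be tuned.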
Table 1 provides experimental results for the grid search of the
confidence threshold. Each column reports the AP metrics (details in
Section 4.3) for one confidence threshold, while the rows correspond to
different annotation percentages. As the Annotation PerCent (APC, see
Section 4.2.1) varies from 1 to 0.3, the number of annotated labels
available during training decreases. Table 1 shows that the best
confidence threshold differs across APCs. In [34], the confidence
threshold is fixed, and therefore it should be set to 0.2. This value is
selected because it yields the best performance in the largest number of
settings. To sum up, this fixed-threshold strategy (denoted as Pi-FT)
also requires compute-intensive hyper-parameter optimization.
B. A novel index for class prior
In this section, we offer a novel estimation technique for the class
prior in our PU-RPN. As shown in Figure 5, the predicted results from
two stages, the RPN and the ROI Head, are fused to compute the class
prior $\pi$. Suppose an anchor $a_i$ from a set of $N$ anchors. Its
probability of being predicted positive is denoted by $p_i$. Therefore,
$A = \{(a_i, p_i)\}_{i=1}^{N}$ is a set of anchors with their
probabilities of the positive class. A predicted box output by the ROI
Head is denoted as $b_j$, and the predicted boxes are collected as
$B = \{b_j\}_{j=1}^{M}$. $N$ and $M$ indicate the number of anchors and
predicted boxes, respectively. For an arbitrary anchor $a_i$, we first
match it with the predicted boxes and then determine the matched box
$b_i^{*}$ as

$$b_i^{*} = \arg\max_{b_j \in B} \mathrm{IoU}(a_i, b_j)$$

where $\mathrm{IoU}(\cdot, \cdot)$ is a function computing the IoU
between two boxes. We propose a class prior index for each anchor, i.e.,
the index for the anchor $a_i$ is defined as
The indices of anchors can be expressed as . The class prior is
calculated by
Inspired by [34], an Exponential Moving Average (EMA) strategy is
adopted to stabilize the class prior $\pi$. The momentum $m$ is set to
0.9. The EMA class prior $\bar{\pi}_t$ denotes the class prior after $t$
updates. The initial value $\bar{\pi}_0$ is set to the class prior of
the first batch. Given the current batch's class prior $\pi_t$, the EMA
class prior is updated as

$$\bar{\pi}_t = m \, \bar{\pi}_{t-1} + (1 - m) \, \pi_t$$
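The matching step and the EMA update described in this section can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: boxes are assumed to be in `(x1, y1, x2, y2)` corner format, the EMA uses the standard form with momentum 0.9, and the names `iou`, `match_box`, and `EmaClassPrior` are hypothetical. The per-anchor index itself is not reproduced here.

```python
from typing import Optional, Sequence, Tuple

Box = Tuple[float, float, float, float]  # assumed (x1, y1, x2, y2) format


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_box(anchor: Box, boxes: Sequence[Box]) -> Tuple[int, float]:
    """Position and IoU of the predicted box that overlaps the anchor most."""
    if len(boxes) == 0:
        raise ValueError("at least one predicted box is required")
    overlaps = [iou(anchor, b) for b in boxes]
    best = max(range(len(boxes)), key=overlaps.__getitem__)
    return best, overlaps[best]


class EmaClassPrior:
    """EMA of per-batch class priors, initialized from the first batch."""

    def __init__(self, momentum: float = 0.9) -> None:
        self.momentum = momentum
        self.value: Optional[float] = None

    def update(self, batch_prior: float) -> float:
        if self.value is None:
            # The first batch initializes the running estimate.
            self.value = batch_prior
        else:
            self.value = (self.momentum * self.value
                          + (1.0 - self.momentum) * batch_prior)
        return self.value


# Hypothetical anchor and predicted boxes, for illustration only.
anchor = (0.0, 0.0, 10.0, 10.0)
predicted = [(20.0, 20.0, 30.0, 30.0), (0.0, 0.0, 10.0, 5.0)]
best_index, best_iou = match_box(anchor, predicted)

ema = EmaClassPrior(momentum=0.9)
first = ema.update(0.2)   # initialization from the first batch
second = ema.update(0.4)  # 0.9 * 0.2 + 0.1 * 0.4
```

With a momentum of 0.9, a single batch with an unusual class prior shifts the running estimate by only a tenth of the difference.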
Algorithm 1: Class prior estimation based on Pi-Index for one batch.