Experiments and Results
Data analysis
We started with 626 pregnant women with gestational ages between 22 and
26 weeks who underwent prenatal examinations at the First Hospital of
Jilin University in China from January 2018 to June 2019. There were 90
cases with a lateral ventricular width of 10 mm or more, including 16
cases with a lateral ventricular width greater than 15 mm. These 90
ventriculomegaly cases and the cases with lateral ventricular width
near 10 mm were selected from 22616 pregnant women. The remaining
normal cases were randomly selected from all normal cases. The average
lateral ventricular (LV) width, defined as the larger of the left and
right lateral ventricle widths, of the 626 cases was 7 mm (see Figure
S1(a)). The average gestational age was 23.8 weeks. There were 49212
stored freeze-frame images (Figure S2), a mean of 78.6 stored
freeze-frame images per case. Each frame had a size of 768x576 pixels.
Picking out brain images
Seventy cases were randomly selected as the validation set and another
70 cases were randomly selected as the test set. The remaining 486 cases
formed the training set, which had 2731 brain images and 35687 other
images. The 2731 brain images and the same number of randomly selected
other images were used for classification training. Training was
terminated after 20 epochs, and the model with the best overall
validation accuracy was chosen as the final model. From the 70 test
cases, 376 brain images and 4967 other images were successfully tested,
and the overall test accuracy was 99.8%. The classification accuracy was
100% (376/376) and 99.8% (4955/4967) for the brain images and other
images, respectively. The sensitivity and precision (positive predictive
value) for brain images were 100% (376/376) and 96.9% (376/388),
respectively.
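The figures above can be recomputed from the reported counts. The following is a minimal sketch, not the evaluation code used in the study; the function name `binary_metrics` and the variable names are illustrative assumptions. Brain images are treated as the positive class, with 376 true positives, 0 false negatives, 4967 - 4955 = 12 false positives, and 4955 true negatives.

```python
def binary_metrics(tp, fn, fp, tn):
    """Return accuracy, sensitivity, specificity and precision from counts."""
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # recall of the positive class
        "specificity": tn / (tn + fp),  # recall of the negative class
        "precision": tp / (tp + fp),    # positive predictive value
    }


# Counts taken from the text: 376/376 brain images and 4955/4967 other
# images were classified correctly.
brain_metrics = binary_metrics(tp=376, fn=0, fp=4967 - 4955, tn=4955)
for name, value in brain_metrics.items():
    print(f"{name}: {value:.1%}")
```

With these counts, the ratio 376/388 equals TP / (TP + FP), while TN / (TN + FP) evaluates to 4955/4967.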
Picking out TV and TT planes and localization of brain region
We randomly selected 60 cases as the test set. The remaining 566 cases,
which had 2094 TV-TT plane images and 1044 other images, formed the
training set. The 1044 other images and the same number of randomly
selected TV-TT plane images were used for training. Training was
terminated after 20 epochs, and the last model was chosen as the final
model. In total, 210 TV-TT plane images and 108 other images were
successfully tested. The AP@0.5 and AP@0.75 were both 0.992, the
mAP@[.5:.95] was 0.92, and the mAR@[.5:.95] was 0.945. For each image,
we then took the first detected object, which has the largest
confidence percentage, as the result. The overall test accuracy was
98.1% (312/318). The detection accuracy for the TV-TT plane images and
other images was 97.6% (205/210) and 99.1% (107/108), respectively. The
sensitivity and precision (positive predictive value) for TV-TT plane
images were 97.6% (205/210) and 99.5% (205/206), respectively.
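The selection rule described above, keeping only the detection with the largest score for each image, can be sketched as follows. This is an illustrative sketch rather than the detector's actual interface: the `Detection` type, the `pick_top_detection` helper, the box format, the example scores, and the handling of images with no detection are all assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Detection:
    box: Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max), assumed format
    score: float                            # detector confidence in [0, 1]


def pick_top_detection(detections: List[Detection]) -> Optional[Detection]:
    """Return the highest-scoring detection, or None if there is none."""
    if not detections:
        # Assumption: an image without any detection counts as an "other" image.
        return None
    return max(detections, key=lambda d: d.score)


# Hypothetical detector output for one 768x576 frame.
candidates = [
    Detection(box=(120, 80, 560, 470), score=0.97),
    Detection(box=(300, 200, 420, 330), score=0.41),
]
best = pick_top_detection(candidates)
print(best.box, best.score)
```

If the detector already returns its detections sorted by score, taking the first element gives the same result as the `max` call.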
Predicting the lateral ventricular width
The lateral ventricular width shown in each brain region image was
determined by doctors. Of the 2304 TV-TT planes, 1431 had a confirmed
lateral ventricular width. The other planes either did not show the
lateral ventricle clearly or did not allow the lateral ventricular
width to be determined.
We performed two experiments. The first used all 626 cases,
corresponding to 1431 images with known lateral ventricular width, to
train and test the regression model. The second used the 610 cases with
lateral ventricular width less than 15 mm, corresponding to 1351
images, to train and test the model.
For the first experiment, 60 cases were randomly selected as the test
set, which had 141 images. Another 60 cases were randomly selected as
the validation set, which had 132 images. The remaining 506 cases, which
had 1158 images, formed the training set. Training was terminated after
100 epochs, and the model with the lowest mean squared error (MSE) was
chosen as the final model. The mean absolute error (MAE) on the test set
was 1.01 mm. More than 65% of the test images had an absolute error of
less than 1 mm (Figure 2(a), Figure S3(a) and Table S1).
For the second experiment, 58 cases were randomly selected as the test
set, which had 107 images. Another 58 cases were randomly selected as
the validation set, which had 118 images. The remaining 495 cases, which
had 1124 images, formed the training set. Training was terminated after
100 epochs, and the model with the lowest MSE was chosen as the final
model. The MAE on the test set was 0.54 mm. More than 82% of the test
images had an absolute error of less than 1 mm (Figure 2(b), Figure
S3(b) and Table S2).
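The two image-level summary statistics reported for each experiment, the MAE and the share of images with an absolute error below 1 mm, can be computed as in the sketch below. The helper names and the example widths are hypothetical; the numbers are made up for illustration and are not study data.

```python
def mean_absolute_error(true_mm, pred_mm):
    """Mean of |true - predicted| over all images, in mm."""
    errors = [abs(t - p) for t, p in zip(true_mm, pred_mm)]
    return sum(errors) / len(errors)


def fraction_within(true_mm, pred_mm, tol_mm=1.0):
    """Share of images whose absolute error is strictly below tol_mm."""
    hits = sum(1 for t, p in zip(true_mm, pred_mm) if abs(t - p) < tol_mm)
    return hits / len(true_mm)


# Made-up LV widths in mm, for illustration only.
true_mm = [6.0, 7.5, 9.0, 12.0]
pred_mm = [6.5, 7.0, 10.5, 11.5]

mae = mean_absolute_error(true_mm, pred_mm)
share = fraction_within(true_mm, pred_mm)
print(f"MAE = {mae:.2f} mm, {share:.0%} of images within 1 mm")
```

The MAE averages over all images, so a few large errors can raise it even when most images fall within the 1 mm tolerance.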
We also evaluated how well the two models predict lateral ventricular
width at the case level. For each test case, we set the predicted LV
width to the largest predicted LV width over all its TV and TT planes.
For the first model, 235 TV and TT planes from the 60 test cases were
tested, and the MAE was 1.47 mm (Figure 2(c), Figure S3(c) and Table
S3). For the second model, 203 TV and TT planes from the 58 test cases
were tested, and the MAE was 0.73 mm (Figure 2(d), Figure S3(d) and
Table S4). With a threshold of 10 mm for both models, the sensitivity
was 100% (8/8) and 75% (6/8), and the precision was 57% (8/14) and 86%
(6/7), respectively (Figure 2(c-d)).
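The case-level rule (take the largest plane-level prediction of each case, then apply the 10 mm threshold) can be sketched as follows. The function names, the case identifiers, and the values for `case_b` and `case_c` are hypothetical. `case_a` reuses the three plane predictions and the true width reported in this section for the case with the largest error. Treating a width equal to the threshold as positive is also an assumption.

```python
from collections import defaultdict


def case_level_widths(plane_predictions):
    """Map each case id to the largest predicted LV width (mm) of its planes."""
    per_case = defaultdict(list)
    for case_id, width_mm in plane_predictions:
        per_case[case_id].append(width_mm)
    return {case_id: max(widths) for case_id, widths in per_case.items()}


def sensitivity_precision(true_mm, pred_mm, threshold_mm=10.0):
    """Treat widths >= threshold_mm as positive; return (sensitivity, precision).

    Assumes at least one truly positive and one predicted-positive case.
    """
    tp = fn = fp = 0
    for case_id, true_width in true_mm.items():
        is_pos = true_width >= threshold_mm
        pred_pos = pred_mm[case_id] >= threshold_mm
        if is_pos and pred_pos:
            tp += 1
        elif is_pos:
            fn += 1
        elif pred_pos:
            fp += 1
    return tp / (tp + fn), tp / (tp + fp)


plane_predictions = [
    ("case_a", 4.94), ("case_a", 5.60), ("case_a", 13.6),  # values from the text
    ("case_b", 11.2), ("case_b", 10.4),                    # hypothetical
    ("case_c", 6.1),                                       # hypothetical
]
true_mm = {"case_a": 4.4, "case_b": 11.0, "case_c": 6.0}

case_pred = case_level_widths(plane_predictions)
sens, prec = sensitivity_precision(true_mm, case_pred)
print(case_pred, sens, prec)
```

Because the maximum is taken over planes, a single overestimated plane is enough to push a case above the threshold, which favors sensitivity over precision.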
Figure 2(c) and Figure S3(c) show one case with a large prediction
error of 9.2 mm: the true LV width was 4.4 mm and the predicted width
was 13.6 mm. We analyzed the prediction result of this case. It had
three TV or TT planes, and the predicted LV widths were 4.94 mm, 5.60 mm
and 13.6 mm, respectively (Figure S4). Based on the rule we used, the
predicted LV width of this case was set to 13.6 mm. We found that the
last image (Figure S4(c)) was not a regular TV or TT plane, so the large
prediction error of 9.2 mm does not reflect the model's behavior on
regular planes.
Interpretation of the results using heat maps
We generated heat maps and their corresponding overlay images for all
test images (Figure 3 and Figure 4). The results were all reviewed by an
expert. For the first experiment, 97 out of 141 heat maps were activated
on/around the lateral ventricular regions. Moreover, all 141 heat maps
were activated at the upper-left corner. Figure 3 shows some examples.
For the second experiment, 74 out of 107 heat maps were activated
on/around the lateral ventricular regions, and 28 of them were also
activated on other regions. The other 34 heat maps were not activated
on/around the lateral ventricular regions. Figure 4 shows some examples.
We can see that for images with large lateral ventricular width, the
heat maps were activated on/around the lateral ventricular regions, as
we expected. We performed further analysis to investigate this
phenomenon. Figure S5(a) and Figure S5(c) show the distribution of
lateral ventricular width for images whose heat maps did not activate
the lateral ventricular regions in the first and second experiment,
respectively. Figure S5(b) and Figure S5(d) show the corresponding
distributions for images whose heat maps did activate the lateral
ventricular regions. The mean LV width of the images without activation
was much smaller than that of the images with activation (p<0.001 for
both experiments).
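The text does not name the statistical test behind the reported p-values. As one possible way to compare the mean LV width of the two groups, the sketch below uses a two-sided permutation test on the difference in means. This is a swapped-in technique chosen for illustration, not the analysis used in the study. The function name, the number of permutations, the seed, and the example widths are all assumptions, and the widths are made up.

```python
import random


def permutation_p_value(group_a, group_b, n_permutations=10000, seed=0):
    """Two-sided permutation test on the absolute difference in means."""
    rng = random.Random(seed)
    n_a = len(group_a)
    observed = abs(sum(group_a) / n_a - sum(group_b) / len(group_b))
    pooled = list(group_a) + list(group_b)
    n_b = len(pooled) - n_a
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # random relabeling of the pooled widths
        diff = abs(sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b)
        if diff >= observed:
            extreme += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (extreme + 1) / (n_permutations + 1)


# Made-up LV widths in mm, for illustration only (not study data).
not_activated = [5.1, 5.8, 6.0, 6.3, 6.9, 7.2, 5.5, 6.6]
activated = [8.4, 9.1, 10.2, 11.5, 12.3, 9.8, 13.0, 10.9]

p_value = permutation_p_value(not_activated, activated)
print(f"p = {p_value:.4f}")
```

A permutation test makes no distributional assumption about the widths, which is why it is used here; a t-test or a rank-based test would be a reasonable alternative.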
These results indicate that the regression models can successfully
locate the lateral ventricular regions in images with large lateral
ventricular width and then predict the width from these regions with
small error.