Appendix D
Experimental Result of Blood Type Prediction Using AI
All the data of survey 2 (excluding 141 participants who did not know
their blood type) were stored in Amazon S3, and with Amazon Machine
Learning, 12 characteristics, including gender, age, and marital status,
were used as training data, with blood type as the prediction target.
A multinomial logistic regression algorithm was chosen for the
prediction. We divided the data into five groups of equal sample
size. Each group in turn served as the prediction data, with the
remaining four groups as the training data, and the average of the five
predictions was then calculated. Because the sample sizes of
the AI training data were small (so the prediction errors
might become larger if age were entered in raw 1-year increments),
a dummy variable of 10-year increments was used [20s = 2, 30s = 3, 40s
= 4, 50s = 5]. The accuracy rates were 45.8% (F1 = 0.367) in the
group that had good knowledge of blood type characteristics (542
participants with scores of 3 or higher in both item Q3 (relation)
and item Q4 (knowledge)), and 40.1% (F1 = 0.281) among all 1,859
participants. When gender, age, and marital status were excluded from
both the training and prediction data, the accuracy rates fell to 42.3%
(F1 = 0.334) and 39.6% (F1 = 0.274), respectively. The most common
blood type among Japanese is type A, which accounts for 39.1% of the
total population (Okubo, 1997). Hence, the accuracy rate would be 39.1%
if all the participants were assumed to be type A. Amazon Machine
Learning predicted the blood type with higher accuracy than this value
in all cases (however, only the 45.8% accuracy was significant, at
p = 0.032). Of the 20 individual predictions, 17 were more
accurate than this value, and the binomial test was significant at
p = 0.011. In addition, considering that all F1-scores were
higher than chance (1/4 = 0.25), Amazon’s AI predicted human blood types
with a higher probability than chance.
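
The evaluation procedure described above (10-year age coding, five equal groups each held out once, and comparison against an "everyone is type A" baseline) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the Amazon Machine Learning pipeline used in the study: all function names are hypothetical, the labels are synthetic rather than the survey data, the F1 score is assumed to be a macro average, and a majority-class predictor stands in for the multinomial logistic regression model.

```python
import random
from collections import Counter


def age_decade(age):
    """Map an age in years to a 10-year code (e.g. 27 -> 2, 41 -> 4)."""
    return age // 10


def five_fold_indices(n, k=5, seed=0):
    """Shuffle indices 0..n-1 and deal them into k groups of near-equal size."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1; a class with no true positives scores 0."""
    scores = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        scores.append(2 * tp / (2 * tp + fp + fn) if tp else 0.0)
    return sum(scores) / len(scores)


def cross_validate(y, fit_predict, k=5, seed=0):
    """Hold out each group once, train on the other k-1, average the accuracies."""
    folds = five_fold_indices(len(y), k, seed)
    accs = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(len(y)) if i not in held]
        preds = fit_predict(train, held_out)
        accs.append(accuracy([y[i] for i in held_out], preds))
    return sum(accs) / k


# Synthetic labels only (NOT the survey data): 40% A, 30% O, 20% B, 10% AB.
labels = ["A", "O", "B", "AB"]
y = ["A"] * 400 + ["O"] * 300 + ["B"] * 200 + ["AB"] * 100


def majority_baseline(train_idx, test_idx):
    """Stand-in model: always predict the most common training label."""
    top = Counter(y[i] for i in train_idx).most_common(1)[0][0]
    return [top] * len(test_idx)


baseline_acc = cross_validate(y, majority_baseline)
baseline_f1 = macro_f1(y, ["A"] * len(y), labels)
print(f"baseline accuracy = {baseline_acc:.3f}, macro F1 = {baseline_f1:.3f}")
```

To evaluate an actual classifier, one would replace `majority_baseline` with a function that fits the model on the training indices and returns predicted labels for the held-out indices; the always-type-A accuracy then serves as the reference value that the model must exceed.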