Appendix D

Experimental Results of Blood Type Prediction Using AI

All data from survey 2 (excluding the 141 participants who did not know their blood type) were stored in Amazon S3. Using Amazon Machine Learning, 12 characteristics, including gender, age, and marital status, served as input features for predicting blood type. A multinomial logistic regression algorithm was chosen for the prediction. We divided the data into five groups of equal sample size. Each group in turn served as the prediction data while the remaining four groups served as the training data, and the average of the five predictions was then calculated. Because the training samples were small, using age in raw 1-year increments might have increased the prediction error; age was therefore coded in 10-year increments [20s = 2, 30s = 3, 40s = 4, 50s = 5].

The accuracy rates were 45.8% (F1 = 0.367) in the group with good knowledge of blood type characteristics (542 participants with scores of 3 or higher on both item Q3 (relation) and item Q4 (knowledge)) and 40.1% (F1 = 0.281) among all 1,859 participants. When gender, age, and marital status were excluded from both the training and prediction data, the accuracy rates fell to 42.3% (F1 = 0.334) and 39.6% (F1 = 0.274), respectively. The most common blood type among Japanese is type A, which accounts for 39.1% of the total population (Okubo, 1997). Hence, the accuracy rate would be 39.1% if all participants were assumed to be type A. Amazon Machine Learning predicted blood type with higher accuracy than this baseline in all cases (however, only the 45.8% accuracy was significant, at p = 0.032). Of the 20 individual predictions, 17 were more accurate than this baseline, and the binomial test was significant at p = 0.011. In addition, given that all F1 scores were higher than chance (1/4 = 0.25), Amazon’s AI predicted human blood types with a higher probability than chance.
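The five-group procedure and the 10-year age coding described above can be sketched in Python as follows. This is only an illustrative sketch, not the Amazon Machine Learning pipeline used in the study: the function names (`age_to_decade`, `k_fold_indices`, `cross_validate`, `fit_majority`) are hypothetical, and the majority-class predictor is a placeholder standing in for the multinomial logistic regression model. It corresponds to the "assume everyone is the most common type" baseline, not to the trained classifier.

```python
import random
from collections import Counter


def age_to_decade(age):
    """Code age in 10-year increments (20s -> 2, 30s -> 3, 40s -> 4, 50s -> 5)."""
    return age // 10


def k_fold_indices(n, k=5, seed=0):
    """Shuffle indices 0..n-1 and deal them into k groups of near-equal size."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_majority(train_rows, train_labels):
    """Placeholder model (assumption): always predict the most common training label."""
    top = Counter(train_labels).most_common(1)[0][0]
    return lambda row: top


def cross_validate(rows, labels, fit, k=5, seed=0):
    """Hold out each group once, train on the other k-1, and average the accuracies."""
    accuracies = []
    for held_out in k_fold_indices(len(rows), k, seed):
        held = set(held_out)
        train_idx = [i for i in range(len(rows)) if i not in held]
        predict = fit([rows[i] for i in train_idx],
                      [labels[i] for i in train_idx])
        hits = sum(predict(rows[i]) == labels[i] for i in held_out)
        accuracies.append(hits / len(held_out))
    return sum(accuracies) / len(accuracies), accuracies
```

Any classifier that follows the same `fit(train_rows, train_labels)` convention and returns a per-row prediction function could be passed in place of `fit_majority`. Comparing its averaged accuracy against the majority-class result mirrors the comparison with the 39.1% baseline in the text.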