A Computationally-Inexpensive Strategy in CT Image Data Augmentation for Robust Deep Learning Classification of COVID-19

Yikun Hou; Miguel Navarro-Cía

doi:10.36227/techrxiv.20272764.v1

Abstract

Coronavirus disease 2019 (COVID-19) has spread globally for two years, and chest computed tomography (CT) has been used to diagnose COVID-19 and to identify lung damage in long COVID-19 patients. At the beginning of the epidemic, large, publicly available CT datasets were scarce because of privacy concerns. It is therefore important to classify CT scans correctly when only limited resources are available, a situation that is likely to recur in future pandemics. We followed the transfer learning procedure and limited the hyperparameters so as to use as few computing resources as possible. We used the Advanced Normalisation Tools (ANTs) to synthesise images, treated them as augmented/independent data, and trained EfficientNet on them to investigate the effect of synthetic images. On the COVID-CT dataset, classification accuracy increased from 91.15% to 95.50% and the area under the receiver operating characteristic curve (AUC) from 96.40% to 98.54%. We also customised a small dataset to simulate data collected in the early stages of the outbreak, on which accuracy improved from 85.95% to 94.32% and AUC from 93.21% to 98.61%. This paper provides a feasible solution with a relatively low computational cost for medical image classification when data are scarce and traditional data augmentation may fail.
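For readers less familiar with the two metrics reported in the abstract, the following is a minimal sketch of how accuracy and AUC can be computed for a binary classifier. It is not the authors' evaluation code: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration and are not taken from the paper.

```python
# Illustrative only: toy labels/scores are hypothetical, not data from the paper.

def accuracy(labels, scores, threshold=0.5):
    """Fraction of samples whose thresholded score matches the label."""
    preds = [1 if s >= threshold else 0 for s in scores]
    correct = sum(p == y for p, y in zip(preds, labels))
    return correct / len(labels)


def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random
    negative (ties count as half), i.e. the Mann-Whitney U statistic."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical example: 1 = COVID-19, 0 = non-COVID-19.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]

print(f"accuracy = {accuracy(labels, scores):.4f}")
print(f"AUC      = {roc_auc(labels, scores):.4f}")
```

Accuracy depends on the chosen threshold, whereas AUC summarises ranking quality across all thresholds, which is one reason both are commonly reported together.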

This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.