Key points:
- Artificial intelligence (AI) has achieved outstanding performance
within medical imaging interpretation and triage tasks and has been
successfully used to diagnose carcinoma.
- High sensitivity for malignancy has been achieved by autonomous
classification of endoscopic images with artificial intelligence
technology in a single center.
- It is still unknown whether different laryngoscopic systems would
affect automatic recognition accuracy across multiple clinics.
- An automatic classifier for laryngeal carcinoma based on Faster
R-CNNs (region-based convolutional neural networks) was established.
- High accuracy was also obtained for multiple clinics and multiple
laryngoscopic systems.
Introduction
Laryngeal carcinoma is one of the most common malignant tumors of the
head and neck, with an incidence estimated to exceed 24,500 cases per
year by 2030.1 Survival outcomes in laryngeal cancer are affected by
several factors, including tumor stage,2 subsite,3 age,2,4 treatment
modality,2,5 and comorbidities.3 An early-stage diagnosis is one
of the most crucial factors to decrease the mortality rate and preserve
both laryngeal anatomy and vocal function. The 5-year survival rates are
100% and 80% for patients with stage 0 and stage I laryngeal
carcinoma, respectively, whereas the rate decreases to only 70% for
patients with advanced-stage cancer.6
Currently, optical laryngoscopy is a routine method to diagnose
laryngeal cancer, identify the extent of invasion, and provide
accurate clinical staging. However, physicians still have difficulty
distinguishing early-stage cancer from mucosal abnormalities.7 Thus,
misdiagnoses and missed diagnoses are not rare when laryngoscopy is
used alone. High levels of diagnostic inconsistency have been observed,
even among experts.8
Today, artificial intelligence (AI) has achieved outstanding performance
within medical imaging interpretation and triage tasks and has been
successfully used to diagnose skin cancer,9 lung
cancer,10 glioma,11 and breast
histopathology.12 Among deep-learning algorithms, region-based
convolutional neural networks (R-CNNs) are a popular and efficient
approach to object detection that uses a feature map from a
convolutional neural network.13 Several recent approaches, including
Fast R-CNN, Faster R-CNN, and Mask R-CNN, were developed based on
R-CNNs.14-16 In particular, Faster R-CNN is one of the first end-to-end
two-stage detectors and has displayed remarkable efficiency and
accuracy.17
Recently, two studies on the automatic recognition of laryngoscopy
images based on convolutional neural networks have achieved promising
results. In one study, a binary classifier distinguished benign from
malignant-premalignant lesions with an overall accuracy of 93.0%.18 In
the other, autonomous classification of endoscopic images with
artificial intelligence technology achieved a sensitivity of 89% and a
specificity of 99.33% for malignancy.19 However, both studies were
carried out in a single center. It is still unknown whether the source
of the laryngoscopic images, their resolution, and the use of different
endoscopic systems would affect automatic recognition accuracy.
Therefore, a multicenter clinical trial is essential to determine
whether the AI technique can cope with complex situations in the real
world.
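For readers less familiar with the metrics quoted above, the following minimal sketch shows how sensitivity, specificity, and overall accuracy can be computed from binary predictions, and how the same summary could be broken down by center to look for site-dependent differences. It is an illustration only: the function names, the center labels, and the eight toy records are hypothetical and are not data or code from the cited studies or from the present work.

```python
from collections import defaultdict


def confusion_counts(labels, predictions):
    """Count TP, FP, TN, FN for binary labels (1 = malignant, 0 = benign)."""
    tp = fp = tn = fn = 0
    for y, p in zip(labels, predictions):
        if y == 1 and p == 1:
            tp += 1
        elif y == 0 and p == 1:
            fp += 1
        elif y == 0 and p == 0:
            tn += 1
        else:  # y == 1 and p == 0 (inputs are assumed to be 0 or 1)
            fn += 1
    return tp, fp, tn, fn


def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def summarize(labels, predictions):
    """Sensitivity = TP/(TP+FN), specificity = TN/(TN+FP), accuracy = (TP+TN)/all."""
    tp, fp, tn, fn = confusion_counts(labels, predictions)
    return {
        "sensitivity": safe_ratio(tp, tp + fn),
        "specificity": safe_ratio(tn, tn + fp),
        "accuracy": safe_ratio(tp + tn, tp + fp + tn + fn),
    }


def summarize_by_center(records):
    """Group (center, label, prediction) records and summarize each center."""
    grouped = defaultdict(lambda: ([], []))
    for center, y, p in records:
        grouped[center][0].append(y)
        grouped[center][1].append(p)
    return {center: summarize(ys, ps) for center, (ys, ps) in grouped.items()}


# Hypothetical toy records: (center, true label, predicted label).
records = [
    ("center_A", 1, 1), ("center_A", 1, 0), ("center_A", 0, 0), ("center_A", 0, 0),
    ("center_B", 1, 1), ("center_B", 0, 1), ("center_B", 0, 0), ("center_B", 1, 1),
]
results = summarize_by_center(records)
for center, metrics in sorted(results.items()):
    print(center, metrics)
```

In this toy example both centers reach the same overall accuracy while differing in sensitivity and specificity, which is one reason a per-center breakdown can be more informative than a single pooled figure.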
In this study, we carried out a multicenter experiment on laryngeal
carcinoma detection using an autonomous endoscopic image classifier
based on Faster R-CNN. We established an artificial intelligence system
for the detection of laryngeal carcinoma and evaluated its performance,
aiming to provide a reliable auxiliary tool to diagnose early-stage
laryngeal carcinoma efficiently and to help untrained technicians
accomplish objective and accurate screening.
Methods