RespBERT: A Multi-Site Validation of a Natural Language Processing
Algorithm on Radiology Notes to Identify Acute Respiratory Distress
Syndrome (ARDS)
Abstract
Acute respiratory distress syndrome (ARDS) is a severe organ dysfunction
that is associated with significant mortality and morbidity among
critically ill patients admitted to the Intensive Care Unit (ICU). The
etiology associated with ARDS can be highly heterogeneous, with most
cases being associated with infection or trauma. ARDS is often described
as a clinical syndrome associated with poor oxygenation even in the
presence of mechanical ventilation. The Berlin criteria for ARDS are
the current gold standard for identifying whether patients have
developed ARDS; however, they often require manual adjudication of the
chest radiograph, and tools to automate this process are limited. The
determination of ARDS depends on the presence of bilateral infiltrates
on radiographic images, information that is not typically available in
structured form in the Electronic Medical Record (EMR). Automated
determination of the presence of radiological evidence would enable
robust study of the syndrome by eliminating expensive manual
inspection of the images by physicians. The text of radiology reports
provides an opportunity for Natural Language Processing (NLP) to
determine the status of the lungs and thereby evaluate the imaging
criterion.
We developed an NLP pipeline that analyzes radiology notes of 362
patients satisfying the Sepsis-3 criteria, extracted from the EMR, to
determine a possible ARDS diagnosis. The radiology notes were
de-noised and preprocessed, vectorized with the BERT word-embedding
model, and fitted to a classification layer using transfer learning.
These classification models achieved F1-scores of 74.5% and 64.22% on
the Emory and Grady datasets, respectively.
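As an illustrative sketch of the final step only (the paper's actual
BERT variant, classification head, and hyperparameters are not given in
this abstract): a frozen encoder produces one embedding per note, and a
lightweight classifier is fitted on top. The embeddings and labels below
are random stand-ins, not patient data, and the logistic-regression head
is a hypothetical choice.

```python
# Hedged sketch: fit a classification head on frozen note embeddings.
# Random vectors stand in for BERT [CLS] embeddings (768-dim for
# BERT-base); ARDS labels here are synthetic, not real adjudications.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_notes, dim = 362, 768                    # cohort size from the abstract; BERT-base width
X = rng.normal(size=(n_notes, dim))        # stand-in for precomputed BERT embeddings
w = rng.normal(size=dim)                   # hidden direction generating synthetic labels
y = (X @ w + rng.normal(scale=5.0, size=n_notes) > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
head = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)  # the "classification layer"
print(f"F1 on held-out notes: {f1_score(y_te, head.predict(X_te)):.3f}")
```

In a real transfer-learning setup the encoder weights would come from a
pretrained BERT checkpoint and only the head (or the top layers) would
be updated on the labeled radiology notes.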