Figure 7: Highest weighted patch within the highest
weighted slice for two test set examples, demonstrating that the model
can detect abnormalities throughout the brain.
Discussion
The models presented in this work exploit the hierarchical nature of
neuroimaging data
to accurately classify a range of glioma-related abnormalities. In doing
so, they outperform simpler models which treat separate
sequences, slices, or intra-slice
regions equally. Crucially, this hierarchical approach provides an
intuitive form of model
interpretability. By visualising patch-HAN’s attention weights, it is
possible to localise
abnormalities both across and within individual slices, while
sequence-HAN’s attention
weights provide inter-slice localisation together with
importance scores for
each imaging sequence. Critically, this is achieved without the need for
slice- or pixel-level
annotation during training, requiring only a series-level label which is
applied to all slices.
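
As a minimal sketch of how such attention weights could be turned into a coarse localisation, the snippet below picks the highest weighted slice and then the highest weighted patch within that slice. The function name `top_slice_and_patch`, the nested-list layout of the weights, and the toy values are assumptions made for illustration only; they are not taken from the models described in this work.

```python
from typing import List, Tuple


def top_slice_and_patch(
    slice_weights: List[float],
    patch_weights: List[List[List[float]]],
) -> Tuple[int, Tuple[int, int]]:
    """Return (slice index, (patch row, patch column)) with the largest weights.

    slice_weights[s] is the attention weight of slice s.
    patch_weights[s][r][c] is the attention weight of patch (r, c) in slice s.
    Both layouts are hypothetical and chosen only for this illustration.
    """
    # Inter-slice localisation: the slice with the largest attention weight.
    best_slice = max(range(len(slice_weights)), key=lambda s: slice_weights[s])
    grid = patch_weights[best_slice]
    # Intra-slice localisation: the largest-weight patch within that slice.
    best_patch = max(
        ((r, c) for r in range(len(grid)) for c in range(len(grid[r]))),
        key=lambda rc: grid[rc[0]][rc[1]],
    )
    return best_slice, best_patch


# Toy example: 3 slices, each divided into a 2 x 2 grid of patches.
slice_w = [0.1, 0.7, 0.2]
patch_w = [
    [[0.25, 0.25], [0.25, 0.25]],
    [[0.05, 0.10], [0.80, 0.05]],
    [[0.40, 0.30], [0.20, 0.10]],
]
print(top_slice_and_patch(slice_w, patch_w))  # prints (1, (1, 0))
```

Only the weights produced at inference time are needed for this step, which is consistent with training on series-level labels alone.
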
As such, our approach lends itself to training on large-scale
retrospective hospital image
collections, and is well suited for use as part of a semi-automated
triage system. In the
future, we wish to extend the model to allow incorporation of
non-imaging data such as
patient clinical history. This can be extracted from the free-text
report that accompanies
images on PACS and embedded into a machine-readable representation
(e.g., using the ALARM, BioBERT (Lee et al., 2019), or ClinicalBERT (Alsentzer
et al., 2019) language models) and introduced as an additional
hierarchy. Furthermore, we plan to test the model on a range of
abnormalities where sequences other than T2-weighted
images are important, for example, acute infarct cases for which DWI and
apparent diffusion coefficient (ADC) maps appear particularly
discriminatory. We also plan to combine the two models
presented here so that sequence, slice, and region
importance scores can be determined simultaneously. For this work, however, we were
limited to a single 11 GB GPU, which is not sufficient for training this
larger model.
Conclusion
In this work we introduced a hierarchical attention network to analyse
real-world non-volumetric clinical MRI data, demonstrating that this hierarchical
treatment of neuroimaging data leads to gains in model performance, while coarsely localising
the abnormality
both across and within individual image slices. As such, the model is
suitable for use as
part of a semi-automated triage system, where both model accuracy and
interpretability
are important.