Replacing Attention with Modality-wise Convolution for Energy-Efficient
PPG-based Heart Rate Estimation using Knowledge Distillation
Abstract
Continuous Heart Rate (HR) monitoring based on photoplethysmography
(PPG) sensors is a crucial feature of almost all wrist-worn devices.
However, arm movements introduce Motion Artifacts (MA) that degrade
the accuracy of PPG-based HR tracking. This problem is commonly tackled
by correlating the recorded accelerometer data with the PPG signal in
order to clean it. Automatic fusion techniques based on Deep Learning
(DL) algorithms have been proposed for this purpose, but they are
considered too large and complex to be deployed on wearable devices.
The current work presents a novel and
lightweight DL architecture, PULSE, comprising temporal convolutions
and feature-level multi-head cross-attention to improve the
effectiveness of sensor fusion. Moreover, we propose a relation-based knowledge
distillation mechanism to pass PULSE’s knowledge to a student network
that utilizes modality-wise convolutions to replace the attention module
and mimic the teacher’s performance with 5x fewer parameters. The
teacher and student are evaluated on the most extensive available
dataset, PPG-DaLiA, with PULSE reducing the mean absolute error by 8.2%
compared to the best state-of-the-art model while simultaneously
reducing the inference latency by 1.6x. The student model is further
compressed using post-training quantization and deployed on two
microcontrollers, demonstrating its suitability for real-time execution:
it achieves a close-to-state-of-the-art MAE of 4.81 BPM (+0.40 BPM) with
a 10.9x lower memory footprint (37.9 kB) and 45.9x lower energy
consumption (0.577 mJ).
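
To illustrate the idea behind relation-based knowledge distillation, the
following minimal sketch compares the pairwise distance structure of a
batch of teacher embeddings with that of the student embeddings. It is
an assumption-laden example, not the loss used in this work: the abstract
does not specify which relations are distilled, so a generic
distance-wise formulation is used here, and the function names
`pairwise_distances` and `relation_distillation_loss` are hypothetical.

```python
import math


def pairwise_distances(embeddings):
    """Return the matrix of Euclidean distances between all samples in a
    batch, normalised by the mean off-diagonal distance (assumed scheme)."""
    n = len(embeddings)
    dists = [[math.dist(embeddings[i], embeddings[j]) for j in range(n)]
             for i in range(n)]
    off_diag = [dists[i][j] for i in range(n) for j in range(n) if i != j]
    mean = sum(off_diag) / len(off_diag) if off_diag else 1.0
    if mean == 0.0:
        mean = 1.0  # all samples identical: avoid division by zero
    return [[d / mean for d in row] for row in dists]


def relation_distillation_loss(teacher_emb, student_emb):
    """Mean squared difference between the normalised distance matrices
    of teacher and student (illustrative distance-wise relational loss)."""
    t = pairwise_distances(teacher_emb)
    s = pairwise_distances(student_emb)
    n = len(t)
    return sum((t[i][j] - s[i][j]) ** 2
               for i in range(n) for j in range(n)) / (n * n)


# Teacher and student embeddings may have different widths, because only
# the relations between samples are compared, not the features themselves.
teacher = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
student = [[0.0], [1.0], [2.0]]
loss = relation_distillation_loss(teacher, student)
```

Because only inter-sample relations are matched, a loss of this kind does
not require the student to share the teacher's feature dimensionality,
which is convenient when the student replaces a module such as attention
with a structurally different one.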