
Robust Optimization of Deep Learning Models using Spectral Proximal Method and Saliency Matrix
  • Cherng-Liin Yong,
  • Ban-Hoe Kwan,
  • Danny-Wee-Kiat Ng,
  • Hong Seng Sim
Cherng-Liin Yong
Universiti Tunku Abdul Rahman Lee Kong Chian Fakulti Kejuruteraan dan Sains
Ban-Hoe Kwan
Universiti Tunku Abdul Rahman Lee Kong Chian Fakulti Kejuruteraan dan Sains
Danny-Wee-Kiat Ng
Universiti Tunku Abdul Rahman Lee Kong Chian Fakulti Kejuruteraan dan Sains

Corresponding Author: [email protected]

Hong Seng Sim
Universiti Tunku Abdul Rahman Lee Kong Chian Fakulti Kejuruteraan dan Sains

Abstract

Model generalization refers to a model’s ability to perform well on unseen data. In this paper, we present the Spectral Proximal (SP) method with a saliency matrix, a training technique for deep learning models that aims to improve their generalization ability. The SP method addresses two challenges that can hinder generalization: gradient confusion in deep model structures and the scarcity of training data. It uses a damping matrix to correct errors in the descent direction, and a proximal operator with a saliency matrix to prevent over-fitting. This results in improved performance on image classification (MNIST and CIFAR-10) and object detection (YOLOv7) tasks, as well as better generalization on unseen data. We evaluated the method experimentally across a diverse range of setups, controlling for potential confounding variables. The SP method outperformed the baseline method in the majority of cases.
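
The abstract names the ingredients of the method but does not give its update rule, so the following is only a generic illustration of how a damped gradient step can be combined with a saliency-weighted proximal operator. It is not the authors' algorithm. Everything in it is an assumption made for illustration: the function names, the diagonal form of the damping, the L1 penalty, and the choice of shrinking weights in proportion to `1 - saliency`.

```python
import numpy as np


def soft_threshold(x, tau):
    """Element-wise proximal operator of tau * |x| (tau may be a scalar or array)."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)


def damped_proximal_step(w, grad, damping, saliency, lr=0.1, lam=0.01):
    """One illustrative proximal-gradient step (hypothetical, not the paper's rule).

    w        : current weights
    grad     : gradient of the loss at w
    damping  : positive diagonal entries of an assumed diagonal damping matrix
    saliency : values in [0, 1]; assumed here to protect salient weights
               from shrinkage (penalty weight = 1 - saliency)
    """
    # Gradient step scaled by the inverse of the diagonal damping matrix.
    z = w - lr * grad / damping
    # Threshold of the weighted-L1 proximal operator in the damping-scaled metric.
    tau = lr * lam * (1.0 - saliency) / damping
    return soft_threshold(z, tau)
```

In this sketch, a weight with saliency 1 is left untouched by the proximal operator, while a weight with saliency 0 is shrunk toward zero by the full threshold and set exactly to zero if its magnitude falls below it.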