Abstract
Although reconfigurable intelligent surfaces (RISs) are adopted in
various scenarios to enable a smart radio environment, their real-time
operation remains challenging because it requires costly
full-dimensional channel estimation with offline exhaustive search or
online exhaustive beam training. The application of deep learning (DL)
tools is favored
to enable feasible solutions. In this work, we propose two adversarial
bandit-based schemes that combine low training overhead with energy
efficiency and achieve outstanding performance gains over reference
DL-based reflection beamforming methods. The resulting deep learning
models are also discussed using state-of-the-art model quality
prediction trends.