High spatial resolution remote sensing and machine learning have improved the accuracy and affordability of high-throughput phenotyping. This paper presents an innovative approach to yield prediction that leverages multimodal networks and integrates remote sensing data from two sensing technologies. We explore the synergy among different types of remote sensing data, meteorological data, and management practices, employing advanced machine learning techniques to enhance the accuracy and reliability of yield predictions.

This study focuses on three experiments: one in 2021 and two in 2022, the latter conducted in different environments with distinct management practices. Hyperspectral, LiDAR, and weather-related features served as initial inputs to the LSTM-based recurrent neural network yield prediction models; categorical variables describing management practices were concatenated with the time-series output of the attention network. In each experiment, 80% of the data were used for training and 20% for testing. Three deep learning networks were investigated: (1) a traditional vanilla stacked LSTM network; (2) a stacked LSTM network with a temporal attention mechanism; and (3) a multimodal network for the different remote sensing modalities. The attention weights of each time step were evaluated to determine the importance of each date.

All models produced good predictions, but the attention mechanism coupled with the multimodal network demonstrated the effectiveness of optimally combining multimodal remote sensing data throughout the season, and of deep learning algorithms in optimizing agricultural decision-making. The attention networks also improved interpretability over the growing season, identifying flowering as the most critical period for the models, which is consistent with field-based trials.
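
The temporal attention step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the per-date hidden states have already been produced by a stacked LSTM, uses a simple dot-product score against a learned vector (one of several possible scoring functions), and represents management practices as a one-hot vector. All names (`temporal_attention`, `fuse_with_management`, `score_vector`) and the toy values are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def temporal_attention(hidden_states, score_vector):
    """Pool per-date hidden states into one context vector.

    hidden_states: list of T vectors, one per acquisition date
                   (assumed to come from a stacked LSTM).
    score_vector:  vector of the same length as each hidden state
                   (assumed learned; dot-product scoring is an assumption).
    Returns the context vector and the attention weight of each date.
    """
    scores = [
        sum(h_i * w_i for h_i, w_i in zip(h, score_vector))
        for h in hidden_states
    ]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    context = [
        sum(a * h[d] for a, h in zip(weights, hidden_states))
        for d in range(dim)
    ]
    return context, weights


def fuse_with_management(context, management_one_hot):
    """Concatenate the attention output with categorical management features."""
    return context + management_one_hot


# Toy example: three dates, two hidden units, two management classes.
hidden_states = [[0.1, 0.0], [0.9, 0.4], [0.2, 0.1]]
score_vector = [2.0, 1.0]

context, weights = temporal_attention(hidden_states, score_vector)
features = fuse_with_management(context, [1.0, 0.0])

# The date with the largest attention weight is the one the model
# relies on most; this is how per-date importance can be read off.
most_important_date = max(range(len(weights)), key=weights.__getitem__)
```

In a sketch like this, `features` would feed a final regression layer that outputs yield, and `weights` would be inspected per date to see which part of the season dominates, in the same spirit as the date-importance analysis described above.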