A novel deep convolutional neural network approach for large area satellite time series land cover classification

Hankui Zhang; David Roy; Vitor Martins

doi:10.1002/essoar.10508610.1

Abstract

The state of the practice for large area land cover mapping is the application of supervised classifiers to multi-temporal, multi-spectral optical wavelength data. Deep learning approaches have been developed but are typically applied to individual images, with less research on their application to satellite time series. The recent availability of 30 m Landsat analysis ready data (ARD) has made Landsat data considerably easier to use. However, irregular gaps in the Landsat image time series complicate the application of deep learning to time series. This study proposes a novel solution in which a two-dimensional (2-D) array, with one spectral dimension and one temporal dimension, is derived from each ARD pixel time series and classified using a 2-D convolutional neural network (CNN). Classification results are presented for the entire conterminous United States (CONUS) using one year of Landsat 5 and 7 data. The 30 m USGS National Land Cover Database (NLCD) and USDA Cropland Data Layer (CDL) products are filtered conservatively across the CONUS (leaving 15 NLCD classes and 22 CDL classes) and sampled to construct >3.31 million and >0.48 million training pixels, respectively, selected to have reliable accuracy and reduced spatial autocorrelation. The CNN and a conventional Random Forest (RF) classifier are trained using 10%, 50% and 90% of the training samples and used to classify the remaining samples. Classification experiments are undertaken independently using the NLCD and CDL training data. CNN structures with different numbers of learnable coefficients are used, and their accuracies are compared with the conventional RF results.
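The abstract does not state how the per-pixel 2-D array is derived from an irregular time series, so the following is only a minimal sketch of one possible construction, not the authors' method. It assumes observations are grouped into equal-length temporal bins, each bin is summarized by the per-band median, and empty bins are filled by linear interpolation (or by the nearest filled bin at the ends of the year). The function names, the 12-bin default and the input layout are hypothetical.

```python
from statistics import median


def fill_gaps(row):
    """Fill None entries by linear interpolation between the nearest known
    neighbours; entries before the first or after the last known value take
    the nearest known value. (Assumed gap-filling rule, not from the paper.)"""
    known = [(i, v) for i, v in enumerate(row) if v is not None]
    if not known:
        raise ValueError("no valid observations for this pixel")
    filled = []
    for i, value in enumerate(row):
        if value is not None:
            filled.append(float(value))
            continue
        before = [(j, w) for j, w in known if j < i]
        after = [(j, w) for j, w in known if j > i]
        if before and after:
            (j0, w0), (j1, w1) = before[-1], after[0]
            filled.append(w0 + (w1 - w0) * (i - j0) / (j1 - j0))
        else:
            filled.append(float(before[-1][1] if before else after[0][1]))
    return filled


def pixel_time_series_to_2d(observations, n_bands, n_bins=12, days_in_year=365):
    """Return an n_bands x n_bins list of lists (spectral x temporal).

    observations: iterable of (day_of_year, [value per band]) pairs for one
    pixel, in any order and with irregular gaps (hypothetical input layout).
    """
    binned = [[[] for _ in range(n_bins)] for _ in range(n_bands)]
    for day_of_year, values in observations:
        # Map day 1..365 (or 366) to a bin index 0..n_bins-1.
        t = min((day_of_year - 1) * n_bins // days_in_year, n_bins - 1)
        for b in range(n_bands):
            binned[b][t].append(values[b])

    array = []
    for b in range(n_bands):
        row = [median(vals) if vals else None for vals in binned[b]]
        array.append(fill_gaps(row))
    return array


# Three cloud-free observations of a two-band pixel, summarized in four bins.
example = pixel_time_series_to_2d(
    [(15, [10, 100]), (20, [30, 300]), (200, [50, 500])], n_bands=2, n_bins=4
)
```

Every pixel then yields an array of the same shape regardless of how many observations it has, which is what allows a fixed-size 2-D CNN input.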
The main findings were: (1) CNNs with different structures differed in accuracy by only about 1%, the optimal CNN structure depended on the number of training samples, and increasing the number of CNN learnable coefficients beyond the number of training samples was not helpful; (2) although CNN training was up to two orders of magnitude slower than RF training, CNN classification was an order of magnitude faster; and (3) the CNN provided 2-5% higher accuracy than the RF, which is notable given the large number of classes and that the overall classification accuracies were >80%.
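Finding (1) compares the number of CNN learnable coefficients with the number of training samples. As a rough illustration of how such a comparison could be made, the sketch below counts coefficients with the standard formulas for convolutional and fully connected layers, assuming one bias per output unit and no other learnable terms. The three-layer structure and the 331,000-sample figure (about 10% of 3.31 million) are hypothetical examples, not the structures evaluated in the study.

```python
def conv2d_coefficients(kernel_h, kernel_w, in_channels, out_channels):
    """Weights plus one bias per filter for a 2-D convolutional layer."""
    return (kernel_h * kernel_w * in_channels + 1) * out_channels


def dense_coefficients(n_inputs, n_outputs):
    """Weights plus one bias per output for a fully connected layer."""
    return (n_inputs + 1) * n_outputs


def total_coefficients(layers):
    """Sum coefficients over a list of ("conv2d", kh, kw, cin, cout) and
    ("dense", n_in, n_out) tuples."""
    total = 0
    for kind, *shape in layers:
        if kind == "conv2d":
            total += conv2d_coefficients(*shape)
        elif kind == "dense":
            total += dense_coefficients(*shape)
        else:
            raise ValueError(f"unknown layer kind: {kind}")
    return total


# Hypothetical structure: two 3x3 convolutions, then a 15-class output layer.
hypothetical_cnn = [
    ("conv2d", 3, 3, 1, 32),
    ("conv2d", 3, 3, 32, 64),
    ("dense", 64, 15),
]
n_coefficients = total_coefficients(hypothetical_cnn)

# Illustrative sample count: roughly 10% of 3.31 million training pixels.
n_training_samples = 331_000
exceeds_samples = n_coefficients > n_training_samples
```

For this hypothetical structure the coefficient count stays well below the sample count; a deeper or wider structure can be checked the same way before training.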