Abstract
We classify all-sky images from four seasons, convert the classifications
into time series that capture how the images evolve, and combine them with
information on the onset of geomagnetic substorms. We train a lightweight
classifier on this dataset to predict whether a substorm onset occurs
within the 15-minute interval that follows 30 minutes of auroral
observations. The best classifier
achieves a balanced accuracy of 61%, with a recall of 47% and a
false positive rate of 24%. We show that the classifier is limited by
the strong imbalance in the dataset of approximately 50:1 between
negative and positive events. All software and results are open source
and freely available.
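
The reported metrics can be related to one another with a short calculation.
The sketch below is illustrative only and is not taken from the released
software: it assumes the standard definition of balanced accuracy as the mean
of recall and specificity, and the function names are hypothetical. The
precision estimate additionally assumes that the evaluation data share the
approximate 50:1 ratio of negative to positive events, which the abstract does
not state explicitly.

```python
def balanced_accuracy(recall: float, false_positive_rate: float) -> float:
    """Mean of recall (true positive rate) and specificity (true negative rate)."""
    specificity = 1.0 - false_positive_rate
    return 0.5 * (recall + specificity)


def implied_precision(recall: float, false_positive_rate: float,
                      negatives_per_positive: float) -> float:
    """Precision implied by recall, false positive rate, and class ratio.

    Counts are expressed per single positive event, so the number of
    negatives equals negatives_per_positive.
    """
    true_positives = recall
    false_positives = false_positive_rate * negatives_per_positive
    return true_positives / (true_positives + false_positives)


# Rounded values quoted in the abstract.
recall = 0.47
fpr = 0.24
ratio = 50.0  # assumed negatives per positive in the evaluation data

ba = balanced_accuracy(recall, fpr)
prec = implied_precision(recall, fpr, ratio)

print(f"balanced accuracy from rounded inputs: {ba:.3f}")
print(f"implied precision under the assumed ratio: {prec:.3f}")
```

With the rounded inputs, the balanced accuracy agrees with the reported 61% to
within rounding. Under the assumed class ratio, the implied precision is only a
few percent, which is one way to see how strongly the imbalance constrains the
classifier.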