Compressed sensing and atomistic simulations

In order to obtain frequency-resolved quantities from real-time methods like molecular dynamics or electron dynamics, it is necessary to perform a spectral representation of the time-resolved signal. This is a standard operation that is usually performed using a discrete Fourier transform. Since the resolution of the spectrum is given by the length of the time signal, it is interesting to look for more methods that can provide us a spectrum of similar quality with shorter time series, as this is directly reflected in shorter computation times. Several such methods exist, but a particular one that has been explored in Octopus, due to its general applicability and efficiency, is compressed sensing.

Compressed sensing \cite{Candes_2008} is a general theory aimed at optimizing the amount of sampling required to reconstruct a signal. It is based on the idea of sparsity, a measure of how many zero coefficients a signal has when represented in a certain basis. Compressed sensing has been applied to many problems in experimental sciences \cite{Blumensath_2009,Zhu_2012,Sanders_2012} and technology \cite{Schewe_2006,Harding_2013} in order to perform more accurate measurements. Its ideas, however, can also be applied to computational work.

In order to calculate a spectrum in compressed sensing, we need to solve the so-called basis-pursuit optimization problem \[\label{eq:bp} \min_{\vec{\sigma}} |\vec{\sigma}|_1 \quad \textrm{subject to}\quad {\mathrm{F}}\vec{\sigma}=\vec{\tau}\ ,\] where \(|\sigma|_1=\sum_k|\sigma_k|\) is the standard 1-norm, \(\vec\tau\) is the discretized time series, \(\vec\sigma\) is the frequency-resolved function (the spectrum that we want to calculate) and \(\mathrm{F}\) is the Fourier-transform matrix.

Since \(\vec{\tau}\) is a short signal, its dimension is smaller than the one of \(\vec\sigma\). This implies that the linear equation \({\mathrm{F}}\vec{\sigma}=\vec{\tau}\) is under-determined and has many solutions, in this particular case, all the spectra that are compatible with our short time propagation. From all of these possible solutions, eq. (\ref{eq:bp}) takes the one that has the smallest 1-norm, that corresponds to the solution that has the most zero coefficients. For spectra, this means we are choosing the one with the fewest frequencies, which will tend to be the physical one, as for many cases we know that the spectra is composed of a discrete number of frequencies.

To solve eq. (\ref{eq:bp}) numerically, we have implemented in Octopus the SPGL1 algorithm \cite{van_den_Berg_2009}. The solution typically takes a few minutes, which is two orders of magnitude more expensive than the standard Fourier transform, but this is negligible in comparison with the cost of the time propagation.

By applying compressed sensing to the determination of absorptional or vibrational spectra, it was found that a time signal a fifth of the length can be used in comparison with the standard Fourier transform \cite{Andrade_2012}. This is translated into an impressive factor-of-five reduction in the computational time. This is illustrated in Fig. \ref{fig:cs} where we show a spectrum calculated with compressed sensing from a 10 fs propagation, which has a resolution similar to a Fourier transform spectrum obtained with 50 fs of propagation time.

Moreover, the general conclusion that can be obtained from this work is that in the application of compressed sensing to simulations the reduction in the number of samples that compressed sensing produces in an experimental setup is translated into a reduction of the computational time. This concept inspired studies on how to carry the ideas of compressed sensing into the core of electronic-structure simulations. The first result of this effort is a method to use compressed sensing to reconstruct sparse matrices, that has direct application in the calculation of the Hessian matrix and vibrational frequencies from linear response (as discussed in section \ref{sec:sternheimer}). For this case, our method results in the reduction of the computational time by a factor of three \cite{Sanders_2015}.