Xavier Andrade edited Parallelization.tex over 9 years ago

Commit id: 7baafcf45b47d7dd17c0a5f196bd6ce9535aa392

structure codes must execute efficiently on modern computational platforms. This implies support for massively parallel platforms and modern parallel processors, including graphics processing units (GPUs). Octopus has been shown to perform efficiently on parallel supercomputers, scaling to hundreds of thousands of cores~\cite{Andrade_2012,Alberdi_2014}. The code also has an implementation of GPU acceleration~\cite{Andrade_2012_gpus,Andrade_2012} that has been shown to be competitive in performance with Gaussian DFT running on GPUs~\cite{Andrade_2013}. Performance is important not only for established methods, but also for the implementation of new features. Ideally, developers should be isolated as much as possible from the optimization and parallelization requirements. The simplicity of real-space grids allows us to provide Octopus developers with building blocks that they can use to produce highly efficient code without caring about the details of the implementation. In most cases, these building blocks allow developers to write code that is automatically parallel, efficient, and that can transparently