Programs in Physics & Physical Chemistry
aexl_v1_0.tar.gz (107 Kbytes)
Manuscript Title: Extended Computational Kernels in a Massively Parallel Implementation of the Trotter-Suzuki Approximation
Authors: Peter Wittek, Luca Calderaro
Program title: Trotter-Suzuki-MPI
Catalogue identifier: AEXL_v1_0
Distribution format: tar.gz
Journal reference: Comput. Phys. Commun. 197 (2015) 339
Programming language: C++, CUDA, Python, MATLAB.
Operating system: Linux.
Has the code been vectorised or parallelized?: Yes. Number of processors used: 1-64 in a single node, more in a cluster.
RAM: 5 MBytes-512 GBytes
Keywords: GPU Computing, MPI, Quantum Evolution, Trotter-Suzuki Algorithm, Hybrid Kernel.
External routines: OpenMP, MPI, CUDA
Does the new version supersede the previous version?: Yes. The original version is not held in the CPC Program Library but can be obtained from https://github.com/peterwittek/trotter-suzuki-mpi
Nature of problem:
The evolution of a general quantum system is described by the time-dependent Schrödinger equation. The solution of this equation involves calculating a matrix exponential, which is formally simple, but computer implementations must consider several factors to achieve both high performance and high accuracy.
The Trotter-Suzuki approximation leads to an efficient algorithm for solving the time-dependent Schrödinger equation [1,2]. The implementation uses high-performance parallel kernels in a distributed environment to maximize the computational power of this algorithm [3,4].
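To make the idea concrete, the following is a minimal sketch of a second-order Trotter-Suzuki splitting on a toy two-level system. It is not taken from the Trotter-Suzuki-MPI code: the Hamiltonian H = sigma_x + sigma_z, the function names, and the step counts are all assumptions chosen so that both the exact propagator and the split propagator have closed forms. The sketch approximates exp(-iHt) by repeating the product exp(-iA dt/2) exp(-iB dt) exp(-iA dt/2) and compares it with the exact result.

```python
import math

# Toy Hamiltonian (an assumption for illustration): H = A + B
# with A = sigma_x and B = sigma_z, which do not commute.
SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Z = [[1, 0], [0, -1]]


def matmul(x, y):
    """Multiply two 2x2 complex matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def exp_involution(sigma, theta):
    """Return exp(-i*theta*sigma) for a matrix with sigma*sigma = I.

    For such matrices the exponential is cos(theta)*I - i*sin(theta)*sigma.
    """
    c, s = math.cos(theta), math.sin(theta)
    return [[c * (1 if i == j else 0) - 1j * s * sigma[i][j]
             for j in range(2)] for i in range(2)]


def exact_propagator(t):
    """exp(-i*H*t) in closed form, using (sigma_x + sigma_z)^2 = 2*I."""
    r = math.sqrt(2.0)
    h_unit = [[(SIGMA_X[i][j] + SIGMA_Z[i][j]) / r for j in range(2)]
              for i in range(2)]
    return exp_involution(h_unit, r * t)


def trotter_propagator(t, steps):
    """Second-order splitting: (e^{-iA dt/2} e^{-iB dt} e^{-iA dt/2})^steps."""
    dt = t / steps
    half_a = exp_involution(SIGMA_X, dt / 2)
    full_b = exp_involution(SIGMA_Z, dt)
    one_step = matmul(half_a, matmul(full_b, half_a))
    result = [[1, 0], [0, 1]]
    for _ in range(steps):
        result = matmul(one_step, result)
    return result


def max_error(t, steps):
    """Largest entry-wise deviation between split and exact propagators."""
    exact = exact_propagator(t)
    approx = trotter_propagator(t, steps)
    return max(abs(exact[i][j] - approx[i][j])
               for i in range(2) for j in range(2))


if __name__ == "__main__":
    for steps in (10, 20, 40):
        print(steps, max_error(1.0, steps))
```

Halving the time step should reduce the error by roughly a factor of four, which is the expected behaviour of a second-order scheme. The production kernels apply the same kind of splitting to large discretized wave functions rather than to 2x2 matrices.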
Reasons for new version:
The computational kernels were generalized to address a much wider range of physics problems. Furthermore, the code has been modularized to make development easier, and it now provides both a command-line interface and an application programming interface. High-level wrappers for Python and MATLAB provide further ease of use.
Summary of revisions:
The vectorized CPU kernel must have a tile width that is divisible by two. This constrains the possible matrix sizes for this kernel. For instance, when running twelve MPI processes in a 4x3 configuration, the dimensions must be divisible by six and eight.
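The arithmetic behind this constraint can be sketched as follows. This is a hypothetical helper, not part of the library's interface: it assumes the matrix is split evenly across the process grid and that each per-process tile must have an even extent in both directions, which reproduces the divisors eight and six for a 4x3 grid. Which grid dimension maps to the matrix width is also an assumption.

```python
def grid_divisors(procs_x, procs_y, tile_multiple=2):
    """Return the divisors required of the matrix width and height.

    Assumes an even split over a procs_x by procs_y process grid, with
    every tile extent a multiple of tile_multiple (hypothetical model).
    """
    return procs_x * tile_multiple, procs_y * tile_multiple


def is_valid_size(width, height, procs_x, procs_y, tile_multiple=2):
    """Check whether a width x height matrix fits the assumed constraint."""
    div_x, div_y = grid_divisors(procs_x, procs_y, tile_multiple)
    return width % div_x == 0 and height % div_y == 0


if __name__ == "__main__":
    print(grid_divisors(4, 3))          # divisors for a 4x3 grid
    print(is_valid_size(24, 24, 4, 3))  # 24 is divisible by both 8 and 6
    print(is_valid_size(20, 24, 4, 3))  # 20 is not divisible by 8
```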
The library currently only supports the CPU kernel under Windows. The Python and MATLAB wrappers support the CPU and SSE kernels.
The high-performance kernels were independently extended to study spin dynamics [5]. It remains for future work to include lattice models in this implementation.
The generalization slightly altered the memory access patterns of the computational kernels, yielding a performance penalty of approximately 20% compared to the previous version (Table 1). The scaling properties did not change, and we see near-optimal scaling when increasing the number of nodes. The actual running time depends on the system size, the duration to be simulated, and the computational resources. It can range from a few seconds to several days.
[1] H. Trotter, On the product of semi-groups of operators, Proceedings of the American Mathematical Society 10 (1959) 545-551.
[2] M. Suzuki, Decomposition formulas of exponential operators and Lie exponentials with some applications to quantum mechanics and statistical physics, Journal of Mathematical Physics 26 (1985) 601.
[3] C. Bederián, A. Dente, Boosting quantum evolutions using Trotter-Suzuki algorithms on GPUs, in: Proceedings of HPCLatAm-11, 4th High-Performance Computing Symposium, Córdoba, Argentina, 2011.
[4] P. Wittek, F. Cucchietti, A second-order distributed Trotter-Suzuki solver with a hybrid CPU-GPU kernel, Computer Physics Communications 184 (4) (2013) 1165-1171. doi:10.1016/j.cpc.2012.12.008.
[5] A. D. Dente, C. S. Bederián, P. R. Zangara, H. M. Pastawski, GPU accelerated Trotter-Suzuki solver for quantum spin dynamics, arXiv:1305.0036.