Programs in Physics & Physical Chemistry
aanz_v3_0.tar.gz (6 Kbytes)
Manuscript Title: Update of spherical Bessel transform: FFTW and OpenMP
Authors: Peter Koval, J.D. Talman
Program title: SBT, version number 3
Catalogue identifier: AANZ_v3_0
Distribution format: tar.gz
Journal reference: Comput. Phys. Commun. 181(2010)2212
Programming language: Fortran 90.
Computer: Any computer with a conforming Fortran 90 compiler.
Operating system: Any system with a conforming Fortran 90 compiler.
Has the code been vectorised or parallelized?: OpenMP is used in the subroutine.
Keywords: spherical Bessel functions, shared memory parallelization, FFT.
External routines: FFTW3 (http://www.fftw.org/)
Does the new version supersede the previous version?: No
Nature of problem:
The Fourier transform of a spherically symmetric angular momentum eigenfunction leads to the spherical Bessel transform of the corresponding radial function. Radial functions are often given numerically, on a radial grid. The direct computation of the spherical Bessel transform then requires of order N arithmetical operations for each value of the transform variable, where N is the number of points on the radial grid. At large values of the transform variable, the rapid oscillation of the integrand makes the usual methods of integration impractical. However, if the radial and transform variables are defined on logarithmic grids, this problem is overcome [1]. Moreover, the approach requires only two applications of a fast Fourier transform, so that the operation count for the whole transform is lowered to N log N arithmetical operations.
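To make the cost of the direct approach concrete, here is a minimal Python sketch of a direct quadrature on a logarithmic radial grid. It is not the SBT algorithm. The function names, the grid parameters, the restriction to l = 0, and the normalization (no prefactor in front of the integral) are all assumptions chosen for illustration. The test function f(r) = exp(-r) is used because its l = 0 transform has the closed form 2/(1+k^2)^2.

```python
import math

def spherical_j0(x):
    """Spherical Bessel function j_0(x) = sin(x)/x, with the x -> 0 limit."""
    return 1.0 if abs(x) < 1e-8 else math.sin(x) / x

def log_grid(r_min, r_max, n):
    """Return n points equally spaced in rho = ln(r), and the spacing d_rho."""
    d_rho = math.log(r_max / r_min) / (n - 1)
    return [r_min * math.exp(i * d_rho) for i in range(n)], d_rho

def direct_sbt_l0(f_values, r, d_rho, k):
    """Trapezoidal estimate of int_0^inf j_0(k r) f(r) r^2 dr on a log grid.

    With r = exp(rho) we have dr = r d(rho), so the integrand in rho is
    j_0(k r) f(r) r^3. Each call loops once over all N grid points.
    """
    terms = [spherical_j0(k * ri) * fi * ri**3 for ri, fi in zip(r, f_values)]
    return d_rho * (sum(terms) - 0.5 * (terms[0] + terms[-1]))

# Hypothetical grid and test function, chosen only for this illustration.
r, d_rho = log_grid(1e-5, 60.0, 4000)
f = [math.exp(-ri) for ri in r]

for k in (0.5, 1.0, 2.0):
    exact = 2.0 / (1.0 + k * k) ** 2
    print(k, direct_sbt_l0(f, r, d_rho, k), exact)
```

Every value of k costs one pass over the N grid points, so evaluating the transform on N values of k this way takes of order N^2 operations, compared with N log N for the FFT-based procedure.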
The program applies a procedure that treats the problem as a convolution [1]. The calculation then requires two applications of the fast Fourier transform method.
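The convolution structure can be seen from a change of variables. The following is a sketch under an assumed normalization (no prefactor in front of the integral), with the symbols g, rho and kappa introduced here for illustration. The exact form used in the program may differ, for example in how powers of r are distributed between the function and the kernel.

```latex
% Spherical Bessel transform of a radial function f(r):
g(k) = \int_0^{\infty} j_l(kr)\, f(r)\, r^2 \, dr .

% Logarithmic variables r = e^{\rho}, k = e^{\kappa}, so dr = e^{\rho} d\rho:
g(e^{\kappa}) = \int_{-\infty}^{\infty}
    j_l\!\left(e^{\kappa+\rho}\right) f(e^{\rho})\, e^{3\rho} \, d\rho .
```

The kernel depends on the two variables only through the sum kappa + rho, so the integral is of convolution type. On grids that are uniform in rho and kappa it can therefore be evaluated with Fourier transforms: one to transform the sampled function and one to transform the product back.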
Reasons for new version:
The previous versions of the program use a built-in FFT routine, which can only treat array sizes of 2^n. Employing the widely available FFTW package [2] allows greater flexibility in the grid dimensions, which can be any product of small primes, e.g. 80, 81, 100, etc. Moreover, the FFTW package is much more efficient than the previous FFT routine and can also be used in shared-memory parallelized programs. A number of changes have also been made to make the subroutine thread-safe. We use the OpenMP standard to run the subroutine within OpenMP-parallelized loops.
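The difference between the two size restrictions can be illustrated with a short Python check. The function names and the choice of (2, 3, 5, 7) as the set of "small primes" are assumptions for illustration. FFTW itself accepts arbitrary sizes and is most efficient when the size factors into small primes.

```python
def is_power_of_two(n):
    """True if n = 2^m for some integer m >= 0."""
    return n > 0 and (n & (n - 1)) == 0

def is_product_of_small_primes(n, primes=(2, 3, 5, 7)):
    """True if n has no prime factor outside `primes` (an assumed set)."""
    if n < 1:
        return False
    for p in primes:
        while n % p == 0:
            n //= p
    return n == 1

# 64 is a power of two; 80, 81 and 100 are the example sizes from the text;
# 97 is prime and serves as a counterexample.
for n in (64, 80, 81, 100, 97):
    print(n, is_power_of_two(n), is_product_of_small_primes(n))
```

Of the grid sizes quoted above, none of 80, 81 and 100 is a power of two, so none of them could be handled by the earlier built-in routine.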
Summary of revisions:
Run time is dominated by the FFT (68%). The test calculation of overlap integrals takes 18 seconds with one thread, computing 360000 integrals on an Intel Core2 Quad Q9400 CPU (2.66 GHz) using 256 radial grid points.
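As a rough back-of-the-envelope reading of these figures, assuming the 68% FFT share applies to this same single-thread test run:

```latex
% Average time per overlap integral:
\frac{18\ \mathrm{s}}{360000} = 5 \times 10^{-5}\ \mathrm{s} = 50\ \mu\mathrm{s}

% Time spent in the FFT over the whole run (assumed share of 68%):
0.68 \times 18\ \mathrm{s} \approx 12.2\ \mathrm{s}
```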
[1] J.D. Talman, J. Comp. Phys. 29 (1978) 35-48
[2] M. Frigo and S.G. Johnson, Proceedings of the IEEE 93 (2005) 216-231