Computer Physics Communications Program Library: Programs in Physics & Physical Chemistry |

aebn_v2_0.tar.gz (1668 Kbytes) | ||
---|---|---|

Manuscript Title: Introducing PROFESS 2.0: a parallelized, fully linear scaling program for orbital-free density functional theory calculations | ||

Authors: Linda Hung, Chen Huang, Ilgyou Shin, Gregory S. Ho, Vincent L. Lignères, Emily A. Carter | ||

Program title: PROFESS | ||

Catalogue identifier: AEBN_v2_0 | ||

Distribution format: tar.gz | ||

Journal reference: Comput. Phys. Commun. 181(2010)2208 | ||

Programming language: Fortran 90. | ||

Computer: Intel with ifort; AMD Opteron with pathf90. | ||

Operating system: Linux. | ||

Has the code been vectorised or parallelized?: Yes. Parallelization is implemented through domain decomposition using MPI. | ||

RAM: Problem dependent, but 2 GB is sufficient for up to 10,000 ions. | ||

Keywords: Orbital-free density functional theory, Optimization, Electronic structure. | ||

PACS: 31.15.-p. | ||

Classification: 7.3. | ||

External routines: FFTW 2.1.5 (http://www.fftw.org) | ||

Does the new version supersede the previous version?: Yes | ||

Nature of problem: Given a set of coordinates describing the initial ion positions under periodic boundary conditions, PROFESS recovers the ground state energy, electron density, ion positions, and cell lattice vectors predicted by orbital-free density functional theory. The computation of all terms is effectively linear scaling. Parallelization is implemented through domain decomposition, and up to ~10,000 ions may be included in a calculation on a single processor, limited by RAM. For example, when optimizing the geometry of ~50,000 aluminum ions (plus vacuum) on 48 cores, a single iteration of conjugate gradient (CG) ion geometry optimization takes ~40 minutes of wall time. However, each CG geometry step requires two or more electron density optimizations, so step times will vary. | ||
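
As a rough illustration of domain decomposition, the sketch below splits the planes of a real-space grid into contiguous slabs, one per MPI process. It is a generic even block split written for this note, not PROFESS code: the function names `slab_bounds` and `decompose` are hypothetical, and the actual distribution used by PROFESS is not specified in this record (in practice it would presumably follow whatever layout the parallel FFT library requires).

```python
def slab_bounds(n_planes, n_procs, rank):
    """Return (start, count) of the grid planes owned by `rank`.

    Assumption: a simple even block split along one grid dimension.
    """
    base, extra = divmod(n_planes, n_procs)
    # The first `extra` ranks each take one additional plane.
    count = base + (1 if rank < extra else 0)
    start = rank * base + min(rank, extra)
    return start, count


def decompose(n_planes, n_procs):
    """List the (start, count) slab for every rank, in rank order."""
    return [slab_bounds(n_planes, n_procs, r) for r in range(n_procs)]


if __name__ == "__main__":
    # Hypothetical sizes chosen only for the demonstration.
    for rank, (start, count) in enumerate(decompose(10, 4)):
        print(f"rank {rank}: planes {start}..{start + count - 1}")
```

Each process then stores only its own slab of the density and potential, which is why grid memory per process shrinks as processes are added.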

Solution method: Computes the energy as described in the text, then minimizes it with respect to the electron density, ion positions, and cell lattice vectors. | ||

Reasons for new version: To allow much larger systems to be simulated using PROFESS. | ||

Summary of revisions:
- PROFESS can run in parallel [1]. Parallelization is implemented through domain decomposition using MPI. (However, copies of all ion positions, which take up a relatively small amount of memory, are saved on all processors.) An updated serial version of PROFESS, with some memory-efficient features specific to the use of a single process, can also be compiled from the same code.
- Instead of linking to the FFTW3 library, we use FFTW 2.1.5, which is the most recent version of FFTW that supports MPI parallel transforms.
- Ion-ion and ion-electron calculations can scale as O(N ln N) through the use of cardinal B-splines [1]-[3]. (For ion-ion calculations, this is known as particle-mesh Ewald.)
- The line search during electron density optimization (when using the square root of the electron density as the variational parameter) automatically conserves the total number of electrons in the system, using a line search mixing scheme similar to that in Reference [4].
- The square root density CG and truncated Newton (TN) optimizations are generally more stable.
- Conjugate gradient ion optimization is improved and stable.
- Positions of chosen ions can be held fixed during geometry optimization.
- Variable time steps are used during cell lattice optimization instead of fixed steps.
- The CAT kinetic energy density functional [5] is available.
- A cutoff to avoid divergence in vacuum regions is now an option for some kinetic energy and exchange-correlation functionals (keywords WTV, WGV, CAV, PBEC) [6].
- The PBE exchange-correlation subroutine is more stable.
- Calculations of energy and potential for some functionals are more efficient after removing duplicate computations. (Note: CAT, LQ, and HQ functionals have not yet been consolidated.)
- Density and potential output files have a new format that is more convenient for output from multiple processes. A utility to convert between old and new density formats, as well as Tecplot format, is provided in RhoConvert.f90.
- The interpolation scheme used when reading in pseudopotentials is more accurate.
- WGC kernel integration uses the Runge-Kutta method for better accuracy.
| ||
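
The electron-conserving line search mentioned in the revisions can be sketched as follows. This is a minimal illustration under our own assumptions, not the PROFESS implementation: we assume the "mixing scheme" combines the current square-root density `phi` with a search direction as `cos(theta) * phi + sin(theta) * d_perp`, where `d_perp` has been made orthogonal to `phi` and rescaled to the same norm. Under that assumption the integral of `phi**2` is the same for every `theta`, so a one-dimensional search over `theta` never changes the electron count. All function names and the grid volume element `dv` are hypothetical.

```python
import math


def dot(a, b, dv):
    """Discrete inner product on a uniform grid with volume element dv."""
    return dv * sum(x * y for x, y in zip(a, b))


def electron_count(phi, dv):
    """Number of electrons, i.e. the grid integral of the density phi**2."""
    return dot(phi, phi, dv)


def orthonormalized_direction(phi, d, dv):
    """Project d orthogonal to phi, then rescale it to the norm of phi."""
    n_elec = electron_count(phi, dv)
    overlap = dot(d, phi, dv) / n_elec
    d_perp = [di - overlap * pi for di, pi in zip(d, phi)]
    norm2 = dot(d_perp, d_perp, dv)
    if norm2 == 0.0:
        raise ValueError("search direction is parallel to phi")
    scale = math.sqrt(n_elec / norm2)
    return [scale * di for di in d_perp]


def mix(phi, d_perp, theta):
    """Trial square-root density at mixing angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [c * p + s * d for p, d in zip(phi, d_perp)]


if __name__ == "__main__":
    # Hypothetical four-point grid, for demonstration only.
    phi = [1.0, 2.0, 0.5, 1.5]
    d = [0.3, -0.1, 0.8, 0.2]
    dv = 0.25
    d_perp = orthonormalized_direction(phi, d, dv)
    for theta in (0.0, 0.1, 0.7):
        print(theta, electron_count(mix(phi, d_perp, theta), dv))
```

Because the constraint is built into the parametrization, no Lagrange multiplier or renormalization step is needed inside the line search itself; the optimizer only has to pick the angle that lowers the energy.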

Restrictions: PROFESS cannot use nonlocal (such as ultrasoft) pseudopotentials. A variety of local pseudopotential files are available at the Carter group website (http://www.princeton.edu/mae/people/faculty/carter/homepage/research/local-pseudopotentials/). Also, due to the current state of the kinetic energy functionals, PROFESS is only reliable for main group metals and some properties of semiconductors. | ||

Running time: Problem dependent. The test example provided with the code takes less than a second to run. Timing results for large-scale problems are given in the PROFESS paper and Reference [1]. | ||

References: | ||

[1] | L. Hung and E.A. Carter, Chem. Phys. Lett. 475 (2009) 163. | |

[2] | U. Essmann, L. Perera, M. Berkowitz, T. Darden, L. Hsing, L.G. Pedersen, J. Chem. Phys. 103 (1995) 8577. | |

[3] | N. Choly and E. Kaxiras, Phys. Rev. B 67 (2003) 155101. | |

[4] | H. Jiang and W. Yang, J. Chem. Phys. 121 (2004) 2030. | |

[5] | D. García-Aldea and J.E. Alvarellos, Phys. Rev. A 76 (2007) 052504. | |

[6] | I. Shin, A. Ramasubramaniam, C. Huang, L. Hung, and E.A. Carter, Philos. Mag. 89 (2009) 3195. |
