Computer Physics Communications Program Library
Programs in Physics & Physical Chemistry

aesb_v1_0.tar.gz (23392 Kbytes)
Manuscript Title: PRAND: GPU accelerated parallel random number generation library: Using most reliable algorithms and applying parallelism of modern GPUs and CPUs.
Authors: L.Yu. Barash, L.N. Shchur
Program title: PRAND
Catalogue identifier: AESB_v1_0
Distribution format: tar.gz
Journal reference: Comput. Phys. Commun. 185(2014)1343
Programming language: CUDA C, Fortran.
Computer: PC, workstation, laptop, or server with NVIDIA GPU (tested on Tesla X2070, Fermi C2050, GeForce GT540M) and with Intel or AMD processor.
Operating system: Linux with CUDA version 5.0 or later. Should also run on Mac OS, Windows, or UNIX.
RAM: 4 Mbytes
Keywords: Statistical methods, Monte Carlo, Random numbers, Pseudorandom numbers, Random number generation, GPGPU, Streaming SIMD Extensions.
PACS: 02.70.Uu, 02.50.Ng, 05.45.-a.
Classification: 4.13.

Nature of problem:
Any calculation requiring a uniform pseudorandom number generator, in particular Monte Carlo calculations. Any calculation or simulation requiring uncorrelated parallel streams of uniform pseudorandom numbers.

Solution method:
The library contains implementations of a number of modern and reliable generators: MT19937, MRG32K3A and LFSR113. It also includes new implementations of the method based on the parallel evolution of an ensemble of transformations of the two-dimensional torus: GM19, GM29, GM31, GM61, GM55, GQ58.1, GQ58.3 and GQ58.4. For each generator the library provides single-threaded and multi-threaded implementations for GPU, single-threaded implementations for CPU, and CPU implementations based on the SSE instruction set. In addition, for each of the RNGs the library can jump ahead within the RNG sequence and initialize independent random number streams using the block splitting method.
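The jump-ahead and block splitting ideas mentioned above can be illustrated with a minimal sketch in C. It does not use the PRAND API and does not implement any of the generators listed above; it assumes a toy 64-bit linear congruential generator (with Knuth's MMIX constants) chosen only because its jump-ahead is short to write. The names lcg_next, lcg_jump and stream_start are hypothetical. Jumping n steps is done in O(log n) operations by composing the affine map x -> a*x + c with itself, and block splitting gives stream k a starting state k*block_len steps into the single underlying sequence.

```c
#include <stdint.h>

/* Toy 64-bit LCG, x -> a*x + c (mod 2^64), with Knuth's MMIX constants.
   Illustration only: this is NOT one of the generators shipped in PRAND. */
#define LCG_A 6364136223846793005ULL
#define LCG_C 1442695040888963407ULL

/* One step of the generator. Unsigned overflow wraps modulo 2^64. */
static uint64_t lcg_next(uint64_t x)
{
    return LCG_A * x + LCG_C;
}

/* Jump ahead: return the state reached from x after n steps,
   using O(log n) multiplications instead of n calls to lcg_next. */
static uint64_t lcg_jump(uint64_t x, uint64_t n)
{
    uint64_t acc_a = 1, acc_c = 0;          /* accumulated map, starts as identity */
    uint64_t cur_a = LCG_A, cur_c = LCG_C;  /* map for 2^k steps, starts at k = 0  */

    while (n > 0) {
        if (n & 1) {
            /* Compose the accumulated map with the current 2^k-step map. */
            acc_a = acc_a * cur_a;
            acc_c = acc_c * cur_a + cur_c;
        }
        /* Square the current map: 2^k steps become 2^(k+1) steps.
           cur_c must be updated before cur_a is overwritten. */
        cur_c = (cur_a + 1) * cur_c;
        cur_a = cur_a * cur_a;
        n >>= 1;
    }
    return acc_a * x + acc_c;
}

/* Block splitting: stream k starts k * block_len steps after the seed state,
   so streams do not overlap as long as each draws at most block_len numbers. */
static uint64_t stream_start(uint64_t seed, uint64_t k, uint64_t block_len)
{
    return lcg_jump(seed, k * block_len);
}
```

In this sketch, a thread with index k would call stream_start(seed, k, block_len) once and then advance its own state with lcg_next. The same pattern applies to generators with much longer periods, where the jump-ahead is typically expressed through matrix or polynomial powers rather than a scalar affine map.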

Restrictions:
NVIDIA CUDA Toolkit version 5.0 or later should be installed. To use the GPU implementations, an NVIDIA GPU supporting CUDA and the corresponding NVIDIA driver should be installed. For the SSE implementations of the generators, an Intel or AMD CPU supporting the SSE2 instruction set is required. To use the SSE implementation of LFSR113, the CPU must support the SSE4 instruction set.

Additional comments:
A "serial" version of this program is held in the Library under Catalogue Id. AEIT_v2_0 (RNGSSELIB). It does not require a GPU device or CUDA compiler.

Running time:
The tests and the examples included in the package take about one minute or less to run. Running time is analysed in detail in Sec. 8 of the paper.