Elsevier Science Home
Computer Physics Communications Program Library
Full text online from Science Direct
Programs in Physics & Physical Chemistry
CPC Home

[Licence| Download | New Version Template] aeco_v1_0.tar.gz(40 Kbytes)
Manuscript Title: Accelerating Scientific Computations with Mixed Precision Algorithms
Authors: Marc Baboulin, Alfredo Buttari, Jack Dongarra, Jakub Kurzak, Julie Langou, Julien Langou, Piotr Luszczek, Stanimire Tomov
Program title: ITER-REF
Catalogue identifier: AECO_v1_0
Distribution format: tar.gz
Journal reference: Comput. Phys. Commun. 180(2009)2526
Programming language: FORTRAN 77.
Computer: desktop, server.
Operating system: Unix/Linux.
RAM: 512 Mbytes
Keywords: numerical linear algebra, mixed precision, iterative refinement.
PACS: 02.60.Dc.
Classification: 4.8.

External routines: BLAS (optional)

Nature of problem:
On modern architectures, the performance of 32-bit operations is often at least twice as fast as the performance of 64-bit operations. By using a combination of 32-bit and 64-bit floating point arithmetic, the performance of many dense and sparse linear algebra algorithms can be significantly enhanced while maintaining the 64-bit accuracy of the resulting solution.

Solution method:
Mixed precision algorithms stem from the observation that, in many cases, a single precision solution of a problem can be refined to the point where double precision accuracy is achieved.
A common approach to the solution of linear systems, either dense or sparse, is to perform the LU factorization of the coefficient matrix using Gaussian elimination. First, the coefficient matrix A is factored into the product of a lower triangular matrix L and an upper triangular matrix U. Partial row pivoting is in general used to improve numerical stability resulting in a factorization PA = LU, where P is a permutation matrix. The solution for the system is achieved by first solving Ly = Pb (forward substitution) and then solving Ux = y (backward substitution). Due to round-off errors, the computed solution, x, carries a numerical error magnified by the condition number of the coefficient matrix A.
In order to improve the computed solution, an iterative process can be applied, which produces a correction to the computed solution at each iteration, which then yields the method that is commonly known as the iterative refinement algorithm. Provided that the system is not too ill-conditioned, the algorithm produces a solution correct to the working precision.

Running time: