Programs in Physics & Physical Chemistry
|aeco_v1_0.tar.gz (40 Kbytes)|
|Manuscript Title: Accelerating Scientific Computations with Mixed Precision Algorithms|
|Authors: Marc Baboulin, Alfredo Buttari, Jack Dongarra, Jakub Kurzak, Julie Langou, Julien Langou, Piotr Luszczek, Stanimire Tomov|
|Program title: ITER-REF|
|Catalogue identifier: AECO_v1_0|
|Distribution format: tar.gz|
|Journal reference: Comput. Phys. Commun. 180(2009)2526|
|Programming language: FORTRAN 77.|
|Computer: desktop, server.|
|Operating system: Unix/Linux.|
|RAM: 512 Mbytes|
|Keywords: numerical linear algebra, mixed precision, iterative refinement.|
|External routines: BLAS (optional)|
Nature of problem:
On modern architectures, 32-bit floating point operations are often at least twice as fast as 64-bit operations. By combining 32-bit and 64-bit floating point arithmetic, the performance of many dense and sparse linear algebra algorithms can be significantly enhanced while maintaining the 64-bit accuracy of the resulting solution.
Mixed precision algorithms stem from the observation that, in many cases, a single precision solution of a problem can be refined to the point where double precision accuracy is achieved.
A common approach to solving linear systems, either dense or sparse, is to compute the LU factorization of the coefficient matrix using Gaussian elimination. First, the coefficient matrix A is factored into the product of a lower triangular matrix L and an upper triangular matrix U. Partial row pivoting is generally used to improve numerical stability, resulting in a factorization PA = LU, where P is a permutation matrix. The system is then solved by first solving Ly = Pb (forward substitution) and then solving Ux = y (backward substitution). Due to round-off errors, the computed solution x carries a numerical error that is magnified by the condition number of the coefficient matrix A.
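The factor-then-substitute procedure described above can be sketched in a few lines. This is a minimal illustration in pure Python for a small dense matrix, not the FORTRAN 77 code shipped in the distribution; the function names `lu_factor` and `lu_solve` and the test system are assumptions chosen for the example.

```python
def lu_factor(A):
    """Factor PA = LU by Gaussian elimination with partial row pivoting.

    Returns (LU, perm). LU packs the multipliers of the unit lower
    triangular L below the diagonal and U on and above it; perm[i] is the
    index of the row of A that became row i of PA.
    """
    n = len(A)
    LU = [row[:] for row in A]          # work on a copy, leave A untouched
    perm = list(range(n))
    for k in range(n):
        # Pick the entry of largest magnitude in column k as the pivot.
        p = max(range(k, n), key=lambda i: abs(LU[i][k]))
        if LU[p][k] == 0.0:
            raise ValueError("matrix is singular")
        LU[k], LU[p] = LU[p], LU[k]
        perm[k], perm[p] = perm[p], perm[k]
        for i in range(k + 1, n):
            LU[i][k] /= LU[k][k]        # multiplier, stored as an entry of L
            for j in range(k + 1, n):
                LU[i][j] -= LU[i][k] * LU[k][j]
    return LU, perm


def lu_solve(LU, perm, b):
    """Solve Ax = b given the packed factors of PA = LU."""
    n = len(b)
    # Forward substitution: L y = P b (L has an implicit unit diagonal).
    y = [0.0] * n
    for i in range(n):
        y[i] = b[perm[i]] - sum(LU[i][j] * y[j] for j in range(i))
    # Backward substitution: U x = y.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(LU[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (y[i] - s) / LU[i][i]
    return x


# Hypothetical 3x3 system whose exact solution is x = (1, 1, 1).
A = [[2.0, 1.0, 1.0],
     [4.0, 3.0, 3.0],
     [8.0, 7.0, 9.0]]
b = [4.0, 10.0, 24.0]
LU, perm = lu_factor(A)
x = lu_solve(LU, perm, b)
```

Once the factors are available, each additional right-hand side costs only the two triangular solves, which is what makes reusing the factorization inside a refinement loop cheap.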
To improve the computed solution, an iterative process can be applied that produces a correction to the computed solution at each iteration; this method is commonly known as the iterative refinement algorithm. Provided that the system is not too ill-conditioned, the algorithm produces a solution correct to the working precision.
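A rough sketch of mixed precision iterative refinement follows, under one common arrangement: the factorization and the triangular solves are done in 32-bit arithmetic, while the residual and the solution update are done in 64-bit arithmetic. It is pure Python and emulates single precision by rounding each intermediate result through `struct`; the helper names (`f32`, `lu_factor32`, `lu_solve32`, `refine`), the tolerance, and the test system are assumptions for illustration and are not taken from the ITER-REF sources.

```python
import struct


def f32(v):
    """Round a 64-bit Python float to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", v))[0]


def lu_factor32(A):
    """PA = LU with partial pivoting, every operation rounded to 32 bits."""
    n = len(A)
    LU = [[f32(v) for v in row] for row in A]
    perm = list(range(n))
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(LU[i][k]))
        if LU[p][k] == 0.0:
            raise ValueError("matrix is singular in 32-bit arithmetic")
        LU[k], LU[p] = LU[p], LU[k]
        perm[k], perm[p] = perm[p], perm[k]
        for i in range(k + 1, n):
            LU[i][k] = f32(LU[i][k] / LU[k][k])
            for j in range(k + 1, n):
                LU[i][j] = f32(LU[i][j] - f32(LU[i][k] * LU[k][j]))
    return LU, perm


def lu_solve32(LU, perm, r):
    """Forward and backward substitution, rounded to 32 bits."""
    n = len(r)
    y = [0.0] * n
    for i in range(n):                   # L y = P r
        s = f32(r[perm[i]])
        for j in range(i):
            s = f32(s - f32(LU[i][j] * y[j]))
        y[i] = s
    x = [0.0] * n
    for i in reversed(range(n)):         # U x = y
        s = y[i]
        for j in range(i + 1, n):
            s = f32(s - f32(LU[i][j] * x[j]))
        x[i] = f32(s / LU[i][i])
    return x


def refine(A, b, tol=1e-14, max_iter=30):
    """Return (x, iterations) with x refined toward 64-bit accuracy."""
    LU, perm = lu_factor32(A)            # the expensive step, in 32-bit
    x = lu_solve32(LU, perm, b)          # initial 32-bit solution
    for it in range(max_iter):
        # Residual r = b - A x, computed in 64-bit arithmetic.
        r = [bi - sum(a * xj for a, xj in zip(row, x))
             for row, bi in zip(A, b)]
        if max(abs(v) for v in r) <= tol * max(abs(v) for v in b):
            return x, it
        d = lu_solve32(LU, perm, r)      # correction, reusing 32-bit factors
        x = [xi + di for xi, di in zip(x, d)]   # update in 64-bit
    return x, max_iter


# Hypothetical well-conditioned system; its solution is not exactly
# representable in 32 bits, so at least one refinement step is needed.
A = [[4.0, 1.0, 2.0],
     [1.0, 5.0, 1.0],
     [2.0, 1.0, 6.0]]
b = [1.0, 2.0, 3.0]
x, iters = refine(A, b)
```

The design point is that the cubic-cost factorization is done once in the faster precision, and each refinement step adds only quadratic-cost work. For an ill-conditioned matrix the loop may not converge, in which case a full 64-bit solve would be the fallback.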