Elsevier Science Home
Computer Physics Communications Program Library
Full text online from Science Direct
Programs in Physics & Physical Chemistry
CPC Home

[Licence| Download | New Version Template] aceu_v1_0.gz(17 Kbytes)
Manuscript Title: ILUCG2: subprograms for the solution of a linear asymmetric matrix equation arising from a 9-point discretization. See erratum Comp. Phys. Commun. 31(1984)433.
Authors: A.I. Shestakov, D.V. Anderson
Program title: ILUCG2
Catalogue identifier: ACEU_v1_0
Distribution format: gz
Journal reference: Comput. Phys. Commun. 30(1983)31
Programming language: Fortran.
Computer: CRAY-1.
Operating system: CTSS.
Word size: 64
Keywords: General purpose, Partial differential Equations, Elliptic, Parabolic, Two-dimensional, Implicit, Pre-conditioned, Conjugate gradient Algorithm, Iccg, Ilucg, Asymmetric matrix, Non-symmetric Matrix.
Classification: 4.8.

Nature of problem:
Elliptic and parabolic partial differential equations which arise in plasma physics applications (as well as in others) are solved in two dimensions. Plasma diffusion, equilibria, and phase space transport (Fokker-Planck equation) have been treated by these methods. These problems share the common feature of being stiff and requiring implicit solution techniques. Generally, the resulting matrix equations are asymmetric; we solve them here with the ILUCG2 program. In a subsequent article we describe a simpler and faster algorithm, ICCG2, which should be used when the matrix is symmetric.

Solution method:
A generalization of the incomplete Cholesky conjugate gradient (ICCG) algorithm is used to solve the linear asymmetric matrix equation.

Restrictions:
The discretization of the two-dimensional PDE and its boundary conditions must result in a spatial 9-point operator stencil which can be represented by a block tridiagonal matrix composed of tridiagonal blocks.

Unusual features:
The loops are arranged to vectorize on the Cray-1 (with the CFT compiler) wherever possible. Recursive loops (which cannot be vectorized) are written for optimum scalar speed. Some of these loops were made more efficient by increasing the arithmetic per pass and at the same time reducing the number of passes. This trick significantly reduced loop overhead and made these scalar loops 30% faster.

Running time:
These are problem dependent because ill-conditioned matrices will require more iterations than well conditioned ones. However, the running times per iteration are known. For enough equations (1189 here) the time for the incomplete LU factorization is 5.9 mu s per unknown. Each conjugate gradient iteration requires 4.2 mu s per unknown. In the three test problems the number of iterations required to reach a relative residual error of 10**-10 (in the L2 norm) ranged from 9 to 208.