|[Licence| Download | New Version Template] aemc_v2_0.tar.gz(11 Kbytes)|
|Manuscript Title: Improved algorithm for calculating the Chandrasekhar function|
|Authors: A. Jablonski|
|Program title: CHANDRAS_v2|
|Catalogue identifier: AEMC_v2_0|
Distribution format: tar.gz
|Journal reference: Comput. Phys. Commun. 184(2013)440|
|Programming language: Fortran 90.|
|Computer: Any computer with a Fortran 90 compiler.|
|Operating system: Windows 7, Windows XP, Unix/Linux.|
|RAM: 0.7 Mb|
|Supplementary material: A document containing the Figures mentioned in this summary can be downloaded.|
|Keywords: Chandrasekhar function, Isotropic scattering.|
|Classification: 2.4, 7.2.|
Does the new version supersede the previous version?: Yes
Nature of problem:
An attempt has been made to develop a subroutine that calculates the Chandrasekhar function with high accuracy, of at least 10 decimal places. Simultaneously, this subroutine should be very fast. Both requirements stem from the theory of electron transport in condensed matter.
Two algorithms were developed, each based on a different integral representation of the Chandrasekhar function. The final algorithm is arrived at by mixing these two algorithms selecting the ranges of the argument omega in which performance is the fastest.
Reasons for new version:
Some of the theoretical models describing electron transport in condensed matter need a source of the Chandrasekhar H function values with accuracy of at least 10 decimal places. Additionally, calculations of this function should be possibly fast since frequent calls to a subroutine providing this function are made (e.g. numerical evaluation of a double integral with a complicated integrand containing the H function). Both conditions were satisfied in the algorithm previously published . However, it has been found that a proper selection of the quadrature in an integral representation of the Chandrasekhar function may considerably decrease the running time. By suitable selection of the number of abscissas in the Gauss-Legendre quadrature, execution time was decreased by a factor of more than 20.
Simultaneously, the accuracy of results has not been affected.
Summary of revisions:
- As in previous work , two integral representations of the Chandrasekhar function, H(x,omega), were considered: expression published by Dudariev and Whelan  and expression published by Davidović et al. . The algorithms implementing these representations were designated by A and B, respectively. All integrals in these implementations were previously calculated using the Romberg quadrature. It has been found, however, that the use of the Gauss-Legendre quadrature considerably improved the performance of both algorithms. Two conditions have to be satisfied: (i) The number of
abscissas, N, has to be rather large, and (ii) the abscissas and corresponding weights should be determined with possibly high accuracy. The abscissas and weights are available for N = 16, 20, 24, 32, 40, 48, 64, 80 and 96 with accuracy of 20 decimal places , and all these values
were introduced into a new procedure GAUSS replacing procedure ROMBERG. Due to the fact that the implemented tables are rather extensive, they were recalculated using the Rybicki algorithm (Ref. , pp. 183-184) and rechecked. No errors or misprints were found.
- In the integral representation of the H function derived by Davidović et al. , the positive root nu0 of the so-called dispersion function needs to be calculated with accuracy of at least
10 decimal places (Cf. Ref , pp.61-64 and Ref. , Eqs (5) and (29)). For small values of the argument omega and values of omega close to unity, the non-linear equation in one unknown, nu0, can be solved analytically. New simple analytical expressions were derived
here that can be efficiently used in calculations of the root.
- The above modifications of the code considerably decreased time of calculation of both algorithms A and B. Results are summarized in Fig. 1, (see Supplementary material above). The time of calculations is in fact the CPU time in microseconds for a computer equipped with Inter Xeon processor (3.46 GHz) using Lahey-Fujitsu Fortran v. 7.2.
The shortest execution time averaged over values of the argument x exceeding 0.05 has been observed for algorithm B and the Gauss-Legendre quadrature with the number of abscissas equal to 64 (23.2 microseconds). As compared with the Romberg quadrature, the execution time was shortened by the factor of 22.5. For small x values, below 0.05, both algorithms A
and B are considerably faster if the Gauss-Legendre quadrature is used. For N = 64, the average time of execution of algorithm B is decreased with respect to the Romberg quadrature by a factor close to 30. However, in that range of argument x, the algorithm A exhibits a much faster performance. Furthermore, the average execution time of algorithm A equal to about
100 microseconds is practically independent of the number of abscissas N.
- For the Romberg quadrature, to optimize the performance, the mixed algorithm C was proposed in which algorithm A is used for argument x smaller or equal to x0 = 0.4 while algorithm B is used for x larger than 0.4 . For the Gauss-Legendre quadrature, the limit x0 was found to depend on the number of abscissas N. For each value of N considered, the time of calculations of the H function was determined for pairs of arguments uniformly distributed in ranges 0 <= x <= 0.05 and 0 <= omega <=1, and for pairs of arguments uniformly distributed in ranges 0.05 <= x <= 1 and 0 <= omega <=1. As shown in Fig. 2 (see Supplementary material above) for N = 64, the
algorithm A is faster than the algorithm B for x smaller or equal to 0.00225. Thus, the value of x0 = 0.0225 is proposed for the mixed algorithm C when the Gauss-Legendere quadrature with N = 64 is used. Similar computer experiments performed for other N values are summarized below:
The flag L is one of the input parameters for the subroutine GAUSS. In the programs implementing the algorithms A, B and C (CHANDRA, CHANDRB, and CHANDRC), the Gauss-Legendre quadrature with N = 64 is currently set. As follows from Fig. 1, the algorithm B (and consequently the algorithm C) is the fastest in that case. It is still possible to change the number of abscissas; the flag L has to be modified then in lines 165, 169, 185, 189 and
304 of the program CHANDRAS_v2, and the value of x0 in line 111 has to be adjusted according to the above table.
- The above modifications of the code did not affect the accuracy of the calculated Chandrasekhar function, as compared to the original code . For pairs of arguments shown in Fig. 2, accuracy of the H function, calculated from algorithms A and B, reached at least 12 decimal digits, however in majority of cases, the accuracy is equal to 13 decimals.
Two input parameters for the Chandrasekhar function, x and omega, are restricted to the range: 0 <= x<= 1 and 0 <= omega <= 1, which is sufficient in numerous applications.
Between 15 and 100 microseconds for one pair of arguments of the Chandrasekhar function.
| ||A. Jablonski, Computer Phys. Comm. 183 (2012) 1773.|
| ||S.L. Dudarev, M.J. Whelan, Surface Sci. 311 (1994) L687.|
| ||D.M. Davidović, J. Vukanić, D. Arsenović, Icarus 194 (2008) 389.|
| ||Handbook of Mathematical Functions, Edited by M. Abramowitz and I. A. Stegun, Dover Publications, Inc., New York (1972), pp. 916 - 919.|
| ||W.H. Press, S.A. Teukolsky, W.T. Vetterling, B.P. Flannery, Numerical recipes. The art of scientific computing, Cambridge University Press, Cambridge (2007).|
| ||K.M. Case, P.F. Zweifel, Linear Transport Theory, Addison-Wesley, Reading, MA, 1967, p. 61-65.|