aehc_v2_0.tar.gz (433 Kbytes)
Manuscript Title: CAVE-CL: An OpenCL version of the package for detection and quantitative analysis of internal cavities in a system of overlapping balls: application to proteins.
Authors: Ján Busa Jr., Ján Busa, Shura Hayryan, Chin-Kun Hu, Ming-Chya Wu
Program title: CAVE-CL, CAVE C
Catalogue identifier: AEHC_v2_0
Distribution format: tar.gz
Journal reference: Comput. Phys. Commun. 190(2015)224
Programming language: C, C++, OpenCL.
Computer: PC with GPU.
Operating system: OpenCL compatible systems.
Has the code been vectorised or parallelized?: Parallelized using GPUs. A revised serial version (non GPU) is included in the package as well.
Supplementary material: The figure referred to in point 5 of the "Summary of revisions" section can be viewed here.
Keywords: Proteins, Solvent accessible area, Excluded volume, Cavities, Analytic method, Stereographic projection, GPGPU, OpenCL.
PACS: 82.20.Wt, 02.60.Cb, 02.70.Ns.
Does the new version supersede the previous version?: Yes
Nature of problem:
Molecular structure analysis.
An analytical method that uses the stereographic transformation for exact detection of internal cavities in a system of overlapping balls, together with a numerical algorithm for calculating the volume and the surface area of the cavities.
Reasons for new version:
This work is part of our ongoing effort to modernize the protein-structure algorithms and software packages developed in our research group over the last several years [1-8]. These tools continue to receive considerable attention from researchers and have been used to solve many interesting research problems [9, 10]. One important application, among many others, was found by members of our team.
We therefore think there is a demand for revising and modernizing these tools to make them more efficient. Here we follow an approach used earlier to develop a new version of the CAVE package. The original CAVE package was written in FORTRAN. One reason for the new version is to rewrite it in C, making it more accessible to young researchers who are not familiar with FORTRAN. Another, more important, reason is to exploit contemporary hardware (for example, modern graphics cards) to improve the performance of the package. We also wanted to eliminate the need to recompile the program for every molecule when a series of molecules is processed. For this purpose we have introduced general pdb files as input: once compiled, the program can process any number of input files in succession. Finally, we found it necessary to review the algorithm, to optimize memory usage where possible, and to make the algorithm more efficient.
Summary of revisions:
- Memory usage and language. The whole code has been ported into C and the static arrays have been replaced with dynamic memory allocation. This allows the loading and handling of proteins of arbitrary size.
- Changes in the algorithm. The original method of the North Pole test and molecule rotation has been changed, as was done in earlier work. The implementation details and the benefits of this change are described there, so we do not repeat them here.
- New tool. A module called input_structure, which takes as input a protein structure file in a format compatible with the Protein Data Bank (pdb), has been adopted from earlier work. Using an external tool allows users to create their own mappings of atoms to radii without recompiling either the input_structure module or CAVE itself.
It is the user's responsibility to assign proper radii to each type of atom. One can use any of the published standard sets of radii (see, for example, [13-17]). Alternatively, users can assign their own radii directly in the module input_structure. The radii are assigned in a special file with the extension pds (see the documentation), which consists of lines such as
ATOM CA ALA 2.0
which is read as "the Cα atom of alanine has a radius of 2.0 Angstroms".
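As an illustration of the radius assignment format, the following is a minimal sketch of how one such line could be parsed in C. It assumes four whitespace-separated fields in the order shown in the example line; the type `RadiusEntry` and the function `parse_radius_line` are hypothetical names introduced here and are not taken from the package. The package documentation remains the reference for the actual file format.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record for one radius assignment. The type and field names
   are illustrative and are not taken from the CAVE-CL sources. */
typedef struct {
    char atom[8];     /* atom name, e.g. "CA" */
    char residue[8];  /* residue name, e.g. "ALA" */
    double radius;    /* radius in Angstroms */
} RadiusEntry;

/* Parse one line assumed to have the form "ATOM <atom> <residue> <radius>"
   with whitespace-separated fields. Returns 1 and fills *out on success;
   returns 0 and leaves *out untouched otherwise. */
int parse_radius_line(const char *line, RadiusEntry *out)
{
    char keyword[8];
    RadiusEntry tmp;

    assert(line != NULL && out != NULL);

    /* Width limits (%7s) keep each token inside its 8-byte buffer. */
    if (sscanf(line, "%7s %7s %7s %lf",
               keyword, tmp.atom, tmp.residue, &tmp.radius) != 4)
        return 0;
    if (strcmp(keyword, "ATOM") != 0 || tmp.radius <= 0.0)
        return 0;

    *out = tmp;
    return 1;
}
```

Calling `parse_radius_line("ATOM CA ALA 2.0", &entry)` would fill `entry` with the atom name CA, the residue name ALA, and a radius of 2.0. Rejecting non-positive radii is a design choice of this sketch, not a documented rule of the package.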
- Some computational tricks. In several parts of the program square roots were replaced by squares, and separate calls to the sin and cos functions were replaced by calls to sincos, allowing for further speed-up (in comparison to the original FORTRAN version).
The typical relative error between the results obtained by the original (FORTRAN), C, and OpenCL versions was between 10^-8 and 10^-10, and it never exceeded 10^-5. Small differences in the results can be due to the implementation of the compiler, especially in the case of OpenCL, and to the implementation of arithmetic by the GPU vendor.
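To make the square-root replacement concrete, here is a minimal sketch of the general idea in C. The summary does not say which parts of the program were changed, so the overlap test below is only an assumed example; the `Ball` type and the function `balls_overlap` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical ball type: centre coordinates and radius. */
typedef struct {
    double x, y, z;
    double r;
} Ball;

/* Returns 1 if the two balls overlap, i.e. if the distance between their
   centres is smaller than the sum of their radii, and 0 otherwise.
   Both sides of that inequality are non-negative, so comparing the squared
   distance with the squared sum of radii gives the same answer without
   computing a square root. */
int balls_overlap(const Ball *a, const Ball *b)
{
    assert(a != NULL && b != NULL);

    double dx = a->x - b->x;
    double dy = a->y - b->y;
    double dz = a->z - b->z;
    double dist2 = dx * dx + dy * dy + dz * dz;  /* squared distance */
    double rsum = a->r + b->r;

    return dist2 < rsum * rsum;
}
```

The same reasoning applies to any comparison between non-negative lengths. Tangent balls (distance exactly equal to the sum of the radii) are treated as non-overlapping in this sketch.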
- OpenCL implementation and testing results. OpenCL is an open standard for parallel programming in heterogeneous systems. It is becoming increasingly popular and has proved to be an efficient tool for computations in different fields (see, for example, the most recent [19, 20] and the references therein). Table 1 shows the speed-up of the C and OpenCL implementations of CAVE compared with the FORTRAN version. We compare against results obtained with both the free GNU FORTRAN compiler (g77) and the commercial (and faster) ifort. Speed-up is calculated as the ratio of the execution time of the FORTRAN version to that of the C or OpenCL version. Execution times are measured in seconds.
One could expect greater speed-ups, but not all of the algorithm could be parallelized. Only about 1/3 of the program was parallelized, and the effect is visible for proteins with 2000 atoms or more when the calculation time of the FORTRAN version exceeds approximately 10 seconds. The rest of the code is sequential, and parallelizing it would require an entirely new algorithm, which may be the subject of future work. Figure 1 shows the speed-up as a function of the number of neighbors. It clearly indicates that the effect of parallelization is stronger for proteins with many neighbors. This is also why the effect is weaker for proteins with a testing sphere radius of 0: in that case most cavities are enclosed by only a few (around 4 to 8) spheres, whereas with a testing sphere radius of 1.2 there are easily 35 or more enclosing spheres.
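The limit imposed by the sequential part can be estimated with Amdahl's law. The estimate below rests on an assumption: that "about 1/3 of the program" can be read as a fraction p = 1/3 of the serial run time, which the summary does not state explicitly. Under that reading, with N parallel workers,

```latex
S(N) = \frac{1}{(1-p) + p/N}, \qquad
\lim_{N\to\infty} S(N) = \frac{1}{1-p}, \qquad
p = \tfrac{1}{3} \;\Rightarrow\; S_{\max} = \frac{1}{1 - 1/3} = 1.5 .
```

This bound would apply to the OpenCL version relative to the serial C version only. Speed-ups measured against the FORTRAN version also include the gains from the C port itself, so they are not limited by this figure.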
Overall, the C version is a good choice for general proteins (and a testing sphere radius of 0), while OpenCL is suited to larger proteins and longer computation times. A 0 in the name of a protein means that no probe radius has been added to the atomic radii; in the other cases 1.2 Å was added to all atomic radii.
All results were obtained on a computer with Intel Core 2 Duo E8500 CPU running at 3.16 GHz with 4GB RAM and GPU NVIDIA GTX470 and a computer with Intel Xeon X5450 CPU running at 3.00 GHz with 32GB RAM and dedicated NVIDIA C1060 GPU card.
When choosing a GPU, it is important to consider its double-precision performance. Consumer-oriented GPUs usually have intentionally reduced double-precision performance, so results can be similar even if a newer generation of GPUs is used. For instance, in 2010 the double-precision performance of NVIDIA GPUs (except for highly specialized GPUs for scientific computing) was 1/8 of the single-precision performance. Nowadays (2014) this ratio is 1/24, meaning that in double precision GPUs from 2010 can be about as fast as current GPUs (except for special editions of GPUs or dedicated cards).
Table 1: Speed-up of C and OpenCL versions of the program CAVE when compared
to the original (FORTRAN) version calculated using GNU Fortran (gfort) and Intel
The running time depends on the size of the molecule under consideration. All test examples run in under 1 minute, usually in under 30 seconds.
The work was supported by Grant MOST 103-2120-M-001-005.
S. Hayryan, C.-K. Hu, S.-Y. Hu, R.-J. Shang, J. Comput. Chem. 22 (2001) 1287.
F. Eisenmenger, U.H.E. Hansmann, S. Hayryan, C.-K. Hu, Comput. Phys. Commun. 138 (2001) 192.
F. Eisenmenger, U.H.E. Hansmann, S. Hayryan, C.-K. Hu, Comput. Phys. Commun. 174 (2006) 422.
S. Hayryan, C.-K. Hu, J. Skřivánek, E. Hayryan, I. Pokorný, J. Comput. Chem. 26 (2005) 334.
J. Buša, J. Džurina, E. Hayryan, S. Hayryan, C.-K. Hu, J. Plavka, I. Pokorný, J. Skřivánek, M.-C. Wu, Comput. Phys. Commun. 165 (2005)
J. Buša, S. Hayryan, C.-K. Hu, J. Skřivánek, M.-C. Wu, J. Comput. Chem. 30 (2009) 346.
J. Buša, S. Hayryan, M.-C. Wu, J. Skřivánek, C.-K. Hu, Comput. Phys. Commun. 181 (2010) 2116.
J. Busa Jr., S. Hayryan, M.-C. Wu, J. Busa, C.-K. Hu, Comput. Phys. Commun. 183 (2012) 2494-2497.
H. L. Chen, et al., Proteins: Structure, Function, and Bioinformatics 78
P. Kota, et al., Bioinformatics 27 (2011) 2209-2215.
M.-C. Wu, M. S. Li, W.-J. Ma, M. Kouza, C.-K. Hu, EPL 96 (2011)
B. Lee, F. M. Richards, J. Mol. Biol. 55 (1971) 379.
F. M. Richards, Annu. Rev. Biophys. Bioeng. 6 (1977) 151.
A. Shrake, J. A. Rupley, J. Mol. Biol. 79 (1973) 351.
A. A. Rashin, M. Iofin, B. Honig, Biochemistry 25 (1986) 3619.
C. Chothia, Nature 248 (1974) 338.
M. Molero-Armenta, U. Iturraran-Viveros, S. Aparicio, et al., Comput. Phys. Commun. 185 (2014) 2683.
M. Bach, V. Lindenstruth, O. Philipsen, et al., Comput. Phys. Commun. 184 (2013) 2042.