Computer Physics Communications Program Library
Programs in Physics & Physical Chemistry

aehc_v2_0.tar.gz (433 Kbytes)
Manuscript Title: CAVE-CL: An OpenCL version of the package for detection and quantitative analysis of internal cavities in a system of overlapping balls: application to proteins.
Authors: Ján Busa Jr., Ján Busa, Shura Hayryan, Chin-Kun Hu, Ming-Chya Wu
Program title: CAVE-CL, CAVE C
Catalogue identifier: AEHC_v2_0
Distribution format: tar.gz
Journal reference: Comput. Phys. Commun. 190(2015)224
Programming language: C, C++, OpenCL.
Computer: PC with GPU.
Operating system: OpenCL compatible systems.
Has the code been vectorised or parallelized?: Parallelized using GPUs. A revised serial version (non GPU) is included in the package as well.
Supplementary material: The figure referred to in point 5 of the "Summary of revisions" section can be viewed here.
Keywords: Proteins, Solvent accessible area, Excluded volume, Cavities, Analytic method, Stereographic projection, GPGPU, OpenCL.
PACS: 82.20.Wt, 02.60.Cb, 02.70.Ns.
Classification: 16.1.

Does the new version supersede the previous version?: Yes

Nature of problem:
Molecular structure analysis.

Solution method:
An analytical method that uses the stereographic transformation for exact detection of internal cavities in a system of overlapping balls, combined with a numerical algorithm for calculating the volume and the surface area of the cavities.
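
As a rough illustration of the stereographic transformation mentioned above, the following C sketch projects a point of the unit sphere from the north pole (0, 0, 1) onto the plane z = 0 and maps it back. The choice of projection plane, the unit radius, and the names Point3, Point2, stereo_project, and stereo_unproject are assumptions made for this example; the convention actually used in CAVE may differ (see [4, 7]).

```c
#include <assert.h>

typedef struct { double x, y, z; } Point3;   /* point on the unit sphere */
typedef struct { double u, v; } Point2;      /* point in the plane z = 0 */

/* Project a point of the unit sphere from the north pole (0, 0, 1)
   onto the plane z = 0. The north pole itself has no finite image. */
Point2 stereo_project(Point3 p)
{
    assert(p.z < 1.0);                       /* exclude the north pole */
    Point2 q = { p.x / (1.0 - p.z), p.y / (1.0 - p.z) };
    return q;
}

/* Inverse map: take a plane point back onto the unit sphere. */
Point3 stereo_unproject(Point2 q)
{
    double s = q.u * q.u + q.v * q.v;        /* squared distance from origin */
    Point3 p = { 2.0 * q.u / (s + 1.0),
                 2.0 * q.v / (s + 1.0),
                 (s - 1.0) / (s + 1.0) };
    return p;
}
```

Under this map, circles on the sphere become circles or straight lines in the plane, which is the property that makes an analytic treatment of intersecting spheres tractable.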

Reasons for new version:
This work is in line with our global efforts to modernize the protein structure related algorithms and software packages developed in our research group during the last several years [1-8]. These tools continue to receive considerable attention from researchers and have been used to solve many interesting research problems [9, 10]. Among many others, one important application has been found by members of our team [11].
Therefore, we think that there is a demand for a revision and modernization of these tools to make them more efficient. Here we follow the approach used earlier in [8] to develop a new version of the CAVE package [7]. The original CAVE package was written in FORTRAN. One reason for the new version is to rewrite it in C to make it friendlier to young researchers who are not familiar with FORTRAN. Another, more important reason is to exploit contemporary hardware (for example, modern graphics cards) to improve the performance of the package. We also wanted to eliminate the need to re-compile the program for every molecule when processing a series of molecules. For this purpose we have introduced general pdb files as input: once compiled, the program can process any number of input files in succession. Finally, we found it necessary to go through the algorithm to optimize the memory usage where possible and to make the algorithm more efficient.

Summary of revisions:

  1. Memory usage and language. The whole code has been ported to C and the static arrays have been replaced with dynamically allocated memory. This allows proteins of arbitrary size to be loaded and handled.
  2. Changes in the algorithm. As in [8], the original method of North Pole test and molecule rotation [4] has been changed. The details of the implementation and the benefits of this change are described in [8] and are not repeated here.
  3. New tool. A module called input_structure, which takes as input a protein structure file in a format compatible with the Protein Data Bank (pdb) [12], has been adopted from [8]. Using an external tool allows users to create their own mappings of atoms and radii without re-compiling the module input_structure itself or CAVE.
    It is the user's responsibility to assign proper radii to each type of atom. One can use any of the published standard sets of radii (see, for example, [13-17]). Alternatively, the user can assign their own values for the radii directly in the module input_structure. The radii are assigned in a special file with extension pds (see the documentation), which consists of lines like this: ATOM CA ALA 2.0 which is read as "the CA atom of alanine has a radius of 2.0 Angstroms".
  4. Some computational tricks. In several parts of the program, square roots were replaced by second powers, and separate calls to the sin and cos functions were replaced by single calls to sincos, allowing for further speed-up (in comparison to the original FORTRAN version).
    The typical relative error between the results obtained by the original (FORTRAN), C, and OpenCL versions was between 10^-8 and 10^-10, and it never exceeded 10^-5. Small differences in the results can be due to the compiler implementation, especially in the case of OpenCL, and to the implementation of arithmetic by the GPU vendor.
  5. OpenCL implementation and testing results. OpenCL [18] is an open standard for parallel programming in heterogeneous systems. It is becoming increasingly popular and has proved to be an efficient tool for computations in different fields (see, for example, the recent works [19, 20] and the references therein). Table 1 shows the speed-up of the C and OpenCL implementations of CAVE compared to the FORTRAN version. We compare against results obtained with both the free GNU FORTRAN compiler (g77) and the commercial (and faster) ifort. The speed-up is calculated as the ratio of the execution time of the original FORTRAN version to that of the C or OpenCL version. Execution times are measured in seconds. One could expect greater speed-ups, but not the whole algorithm could be parallelized. Only about 1/3 of the whole program was parallelized, and the effect is visible for proteins with 2000 atoms or more if the calculation time of the FORTRAN version is higher than approximately 10 seconds. The rest of the code is sequential, and its parallelization would require an entirely new algorithm, which may be the subject of future work. Figure 1 shows the speed-up as a function of the number of neighbors. It clearly indicates that the effect of parallelization is stronger for proteins with many neighbors. This is also the reason why the effect is weaker for proteins with a testing sphere radius of 0: most of the cavities in that case are enclosed by only a few (around 4 to 8) spheres, while with a testing sphere radius of 1.2 there are easily 35 or more enclosing spheres.
    Overall, the C version is a good choice for general proteins (and a testing sphere radius of 0), while OpenCL is suited to larger proteins and longer computation times. A 0 at the end of a protein name means that no probe radius has been added to the atomic radii. In the other cases 1.2 Å was added to all atomic radii.
    All results were obtained on a computer with Intel Core 2 Duo E8500 CPU running at 3.16 GHz with 4GB RAM and GPU NVIDIA GTX470 and a computer with Intel Xeon X5450 CPU running at 3.00 GHz with 32GB RAM and dedicated NVIDIA C1060 GPU card.
    When choosing a GPU, it is important to check its double precision performance. Consumer-oriented GPUs usually have intentionally reduced double precision performance, so results can be similar even if a newer generation of GPUs is used. For instance, in 2010 the double precision performance of NVIDIA GPUs (except for highly specialized GPUs for scientific computing) was 1/8 of their single precision performance. Nowadays (2014) this ratio is 1/24, meaning that GPUs from 2010 are as fast in double precision as current GPUs (except for special editions of GPUs or dedicated cards).
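
A line of the radii file described in item 3, such as ATOM CA ALA 2.0, could be parsed as sketched below. The struct RadiusRule, the function parse_radius_line, and the field widths are hypothetical names and choices made for this example; the actual input_structure module may read the file differently (see its documentation).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef struct {
    char atom[8];      /* atom name, e.g. "CA" */
    char residue[8];   /* residue name, e.g. "ALA" */
    double radius;     /* radius in Angstroms */
} RadiusRule;

/* Parse one line of the form "ATOM <atom> <residue> <radius>".
   Returns 1 on success and 0 if the line does not match. */
int parse_radius_line(const char *line, RadiusRule *rule)
{
    char tag[8];
    assert(line != NULL && rule != NULL);
    /* %7s limits each token to the 8-byte buffers declared above. */
    if (sscanf(line, "%7s %7s %7s %lf",
               tag, rule->atom, rule->residue, &rule->radius) != 4)
        return 0;
    if (strcmp(tag, "ATOM") != 0 || rule->radius <= 0.0)
        return 0;
    return 1;
}
```

Calling parse_radius_line("ATOM CA ALA 2.0", &rule) fills rule with the atom name CA, the residue name ALA, and the radius 2.0; lines with a different leading keyword or a missing field are rejected.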


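The first trick in item 4 can be illustrated with an overlap test for two balls: instead of comparing the distance between the centres, which needs a square root, with the sum of the radii, one compares the squared distance with the squared sum. This is a generic sketch with hypothetical names (Ball, balls_overlap), not code taken from CAVE.

```c
#include <assert.h>

typedef struct { double x, y, z, r; } Ball;   /* centre and radius */

/* Return 1 if the two balls overlap (centre distance < r1 + r2).
   Both sides of the inequality are non-negative, so squaring them
   keeps the comparison valid and avoids a call to sqrt(). */
int balls_overlap(const Ball *a, const Ball *b)
{
    assert(a->r >= 0.0 && b->r >= 0.0);
    double dx = a->x - b->x;
    double dy = a->y - b->y;
    double dz = a->z - b->z;
    double d2 = dx * dx + dy * dy + dz * dz;  /* squared centre distance */
    double rs = a->r + b->r;                  /* sum of the radii */
    return d2 < rs * rs;
}
```

The second trick relies on sincos, which in C is a GNU extension rather than part of the ISO standard, so portable code may need a fallback to separate sin and cos calls.
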
    Table 1: Speed-up of C and OpenCL versions of the program CAVE when compared to the original (FORTRAN) version calculated using GNU Fortran (gfort) and Intel FORTRAN (ifort).

    protein    number      time         speed-up    speed-up    speed-up
    PDB ID     of atoms    gfort [s]    ifort       C           OpenCL
    1AUW0      13872       345.98       1.56        1.94        1.97
    1AUW       13872       235.10       1.70        1.75        3.32
    1DJ30       6580        61.37       1.55        1.95        1.97
    1DJ3        6580        68.80       1.81        1.70        3.12
    1I0A0      13675       358.88       1.57        1.97        1.97
    1I0A       13675       254.81       1.67        1.85        3.44
    1LD40      15648       305.43       1.55        1.87        1.91
    1LD4       15648       551.28       1.58        1.47        2.53
    1M3Z0      11292       181.31       1.54        1.92        1.95
    1M3Z       11292       141.04       1.75        1.74        3.06
    1OJX0      19368       503.02       1.62        1.90        1.90
    1OJX       19368       339.28       1.69        1.60        3.12
    1OK40      19358       541.84       1.67        1.92        1.93
    1OK4       19358       414.96       2.32        2.39        4.10
    1QI60      14384       292.94       1.55        1.90        1.93
    1QI6       14384       237.80       1.67        1.82        3.15
    1QNW0       7168        72.24       1.55        1.90        1.93
    1QNW        7168        71.35       1.85        1.67        3.11
    1S3Q0      15358       376.79       1.57        1.95        1.98
    1S3Q       15358       349.48       1.76        2.03        3.27
    1UPA0      16272       390.91       1.55        1.90        1.93
    1UPA       16272       289.55       1.67        1.84        3.09
    1YLO0      15318       306.27       1.55        1.88        1.92
    1YLO       15318       326.21       1.61        1.90        3.14
    2MYS0       6287        51.37       1.55        1.98        1.97
    2MYS        6287        62.21       1.78        1.72        2.65
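
Because the speed-up is defined as the FORTRAN time divided by the time of the compared version, an approximate running time for a given version can be recovered from Table 1 by dividing the gfort time by the listed speed-up. The sketch below assumes that every speed-up in a row is relative to the gfort time in that row; the function names speedup and implied_time are hypothetical.

```c
#include <assert.h>

/* Speed-up of a new version relative to a reference run. */
double speedup(double t_reference, double t_new)
{
    assert(t_reference > 0.0 && t_new > 0.0);
    return t_reference / t_new;
}

/* Running time implied by a reference time and a listed speed-up. */
double implied_time(double t_reference, double s)
{
    assert(t_reference > 0.0 && s > 0.0);
    return t_reference / s;
}
```

For example, for 1AUW with the 1.2 Å probe radius, a gfort time of 235.10 s and an OpenCL speed-up of 3.32 give implied_time(235.10, 3.32), which is roughly 70.8 s.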

Running time:
Depends on the size of the molecule under consideration. All test examples run under 1 minute, usually under 30 seconds.

The work was supported by Grant MOST 103-2120-M-001-005.

References:
[1] S. Hayryan, C.-K. Hu, S.-Y. Hu, R.-J. Shang, J. Comput. Chem. 22 (2001) 1287.
[2] F. Eisenmenger, U.H.E. Hansmann, S. Hayryan, C.-K. Hu, Comput. Phys. Commun. 138 (2001) 192.
[3] F. Eisenmenger, U.H.E. Hansmann, S. Hayryan, C.-K. Hu, Comput. Phys. Commun. 174 (2006) 422.
[4] S. Hayryan, C.-K. Hu, J. Skřivánek, E. Hayryan, I. Pokorný, J. Comput. Chem. 26 (2005) 334.
[5] J. Buša, J. Džurina, E. Hayryan, S. Hayryan, C.-K. Hu, J. Plavka, I. Pokorný, J. Skřivánek, M.-C. Wu, Comput. Phys. Commun. 165 (2005) 59.
[6] J. Buša, S. Hayryan, C.-K. Hu, J. Skřivánek, M.-C. Wu, J. Comput. Chem. 30 (2009) 346.
[7] J. Buša, S. Hayryan, M.-C. Wu, J. Skřivánek, C.-K. Hu, Comput. Phys. Commun. 181 (2010) 2116.
[8] J. Busa Jr., S. Hayryan, M.-C. Wu, J. Busa, C.-K. Hu, Comput. Phys. Commun. 183 (2012) 2494.
[9] H. L. Chen, et al., Proteins: Structure, Function, and Bioinformatics 78 (2010) 2973.
[10] P. Kota, et al., Bioinformatics 27 (2011) 2209-2215.
[11] M.-C. Wu, M. S. Li, W.-J. Ma, M. Kouza, C.-K. Hu, EPL 96 (2011) 68005.
[12] http://www.rcsb.org
[13] B. Lee, F. M. Richards, J. Mol. Biol. 55 (1971) 379.
[14] F. M. Richards, Annu. Rev. Biophys. Bioeng. 6 (1977) 151.
[15] A. Shrake, J. A. Rupley, J. Mol. Biol. 79 (1973) 351.
[16] A. A. Rashin, M. Iofin, B. Honig, Biochemistry 25 (1986) 3619.
[17] C. Chothia, Nature 248 (1974) 338.
[18] http://www.khronos.org/opencl/
[19] M. Molero-Armenta, U. Iturraran-Viveros, S. Aparicio, et al., Comput. Phys. Commun. 185 (2014) 2683.
[20] M. Bach, V. Lindenstruth, O. Philipsen, et al., Comput. Phys. Commun. 184 (2013) 2042.