Elsevier Science Home
Computer Physics Communications Program Library
Full text online from Science Direct
Programs in Physics & Physical Chemistry
CPC Home

[Licence| Download | New Version Template] aeua_v1_3.tar.gz(617 Kbytes)
Manuscript Title: OptQC v1.3: An (updated) optimised parallel quantum compiler
Authors: T. Loke, J. B. Wang
Program title: OptQC v1.3
Catalogue identifier: AEUA_v1_3
Distribution format: tar.gz
Journal reference: Comput. Phys. Commun. 207(2016)531
Programming language: Fortran, MPI.
Computer: Any computer with Fortran compiler (not gfortran4.9 or earlier) and MPI library.
Operating system: Linux.
Keywords: Quantum circuit, Quantum compiler, Optimization.
Classification: 4.15.

External routines: Intel MKL LAPACK routines and MPI routines.

Does the new version supersede the previous version?: Yes

Nature of problem:
It aims to minimise the number of quantum gates required to implement a given unitary operation.

Solution method:
It utilises a descending random walk to select permutation matrices P and Q for a given unitary matrix U such that the number of gates in the quantum circuit of U = QT PT U'PQ is minimised, where U' is equivalent to U up to a permutation. The decomposition of a unitary operator is performed by recursively applying the cosine-sine decomposition.

Reasons for new version:
Simulated annealing process was found to give suboptimal results compared to a normal descending random walk. Computation time was also bloated by the necessity of running the CS decomposition thrice (for U', P and PT) for each iteration of the optimization process.

Summary of revisions:

  • Simulated annealing process was replaced by a descending random walk (equivalent to setting the threshold value β to 0), due to poor convergence of the simulated annealing process, and due to the lack of pathological instances of local minima in this particular search space.
  • Introduced an iterative method for generating the general permutation matrix P , in which we start with P = I and build up the quantum circuit (and modify P correspondingly) by adding gates onto the existing circuit. This removes the requirement of running the CS decomposition on P and PT to find the quantum circuit implementation, since it is already known by construction.
  • Added a synchronization mechanism between threads (after some prescribed number of iterations) in which the current state of the top 10% processes with the fittest solutions are copied over to the remaining 90%. This works so as to discard the less fit solutions and focus the searching algorithm in the state space with the fittest solutions.

Additional comments:
The program contains some Fortran2003 features and will not compile with gcc4.9 or earlier.

Running time:
As before, running time increases with the size of the unitary matrix, as well as the prescribed maximum number of iterations for qubit permutation selection and the descending random walk. All simulation results presented in this paper are obtained from running the program on the Fornax supercomputer managed by iVEC@UWA with Intel Xeon X5650 CPUs. A comparison of running times are also given in Table 1.

Matrix m N0 Before Update After Update
* * * Nmin CPU (mins) Nmin CPU (mins)
Random real unitary 8 29 22 0.333 18 0.250
S8 graph 16 34 23 1.083 21 0.467
3 rd generation 3-Cayley tree 42 996 300 13.583 216 8.233
Quantum Fourier transform 64 4095 3508 20.95 3144 13.250
Shor's algorithm 8 128 8285 5621 75.700 4085 42.917

Table 1: Result summary for the decomposition of various matrices: m is the dimension of the given unitary matrix U , N0 denotes the number of gates required to implement U, and Nmin is the total number of gates obtained after the optimization process (post-reduction).

References:
[1] T. Loke, J.B. Wang, Y.H. Chen, OptQC: An optimized parallel quantum compiler, Computer Physics Communications 185 (2014) 3307.