Quantum interior point methods
Rough overview (in words)
Interior point methods (IPMs) are a type of efficient classical algorithm for solving convex optimization problems such as linear programs (LPs), second-order cone programs (SOCPs), and semidefinite programs (SDPs). IPMs are the basis for effective optimization software tools (e.g., [1, 2]), which are widely used for solving convex optimization problems that arise in industry. They are called interior point methods because, in contrast to the simplex method, they iteratively generate a sequence of points that lie in the interior of the convex region; this sequence of points is guaranteed to rapidly approach the optimal point (which, when it exists and the objective function is convex, is guaranteed to lie at the boundary of the convex region). At each iteration, the next point is produced by solving a system of linear equations. See, e.g., [3, 4] for context on how IPMs fit into the history of methods for optimization.
Quantum interior point methods (QIPMs), first introduced in [5], are quantum algorithms that are identical to classical IPMs, except that they determine the next point using a quantum linear system solver combined with quantum state tomography.
Classical IPMs are generally efficient in the sense that they can solve convex optimization problems in time scaling as a polynomial in the number of variables. The exact degree of the polynomial depends on which kind of convex optimization problem is being solved, as well as certain choices about the IPM. Due to the need for quantum state tomography, QIPMs will also require time that scales at least linearly in the number of variables; thus, the best one can hope for is a polynomial speedup over classical IPMs. The exact runtime of the quantum algorithm depends on instance-specific parameters, such as the condition number of matrices that appear during the course of the algorithm, which makes it difficult to determine whether a speedup can exist.
Rough overview (in math)
For simplicity, we focus on LPs, the simplest kind of optimization problem where QIPMs can be applied. An LP is specified by an \(m \times n\) matrix \(A\), an \(n\)-dimensional vector \(c\), and an \(m\)-dimensional vector \(b\), and it is given by
where \(\langle u,v\rangle\) denotes the standard dot product between vectors \(u\) and \(v\).
The function \(\langle c, x\rangle\) is called the objective function, and a point \(x\) is called feasible if it satisfies \(Ax=b\) and \(x_i \geq 0\) for all \(i\). Inequality constraints of the form \(Ax \leq b\) can be handled by introducing slack variables. We denote the feasible point that optimizes the objective function by \(x^*\).
An important concept in mathematical optimization is duality, where given one optimization problem, an equivalent "dual" optimization problem can be generated through the method of Lagrange multipliers (see [6, Section 5]). The dual of the LP in Eq. \(\eqref{eq:LP-QIPM}\) is given by
Alternatively, one can drop the \(s\) variable and constraints that \(s_i\) are positive, and simply write \(A^\intercal y \leq c\). Denote the optimal feasible points for the dual by \((y^*,s^*)\).
It can be shown that the optimal point lies at the boundary of the feasible region and satisfies the relationship \(x_is_i = 0\) for all \(i\). A key concept in IPMs is the central path, a set of points parameterized by \(\mu > 0\). The central point with parameter \(\mu\) is the feasible point for which \(x_is_i = \mu\) for all \(i\). In general, this point will be in the interior of the feasible region, but as \(\mu \rightarrow 0\), the central path approaches the optimal point on the boundary.
The most effective classical IPMs are "primal-dual path-following methods," which generate a length-\(T\) sequence of primal-dual point pairs \((x^{(t)},y^{(t)},s^{(t)}) \in \mathbb{R}^n \times \mathbb{R}^m \times \mathbb{R}^n\) for \(t=0,\ldots,T-1\) that approximately follows the central path toward the optimum. Given \((x^{(t)},y^{(t)},s^{(t)})\), the point \((x^{(t+1)},y^{(t+1)},s^{(t+1)}) = (x^{(t)}+\Delta x,y^{(t)} + \Delta y,s^{(t)} + \Delta s)\) is formed by solving the following linear system of equations, which is called the Newton system, as it corresponds to one iteration of Newton's method.
where \(\sigma<1\), \(\mathbf{1}\) denotes the all 1s vector, and \(S = \text{diag}(s^{(t)}), X = \text{diag}(x^{(t)})\) are diagonal \(n \times n\) matrices formed from the entries of \(s^{(t)}\) and \(x^{(t)}\). Note that there are alternative ways to formulate the Newton system (see, e.g., [7, 8]). To understand Eq. \(\eqref{eq:Newton_system}\), note that if the point \((x^{(t)}, y^{(t)}, s^{(t)})\) is feasible, then the first two entries on the right-hand-side are zero. Furthermore, if it is on the central path, then \(Xs^{(t)} = \frac{x^{(t)\intercal} s^{(t)}}{n} \mathbf{1}\), so if we were to choose \(\sigma=1\), then the entire right-hand-side would be zero, and the solution to the system would be \(\Delta x = \Delta y = \Delta s = 0\). If instead we set \(\sigma = 1-\delta\) for sufficiently small \(\delta\), the solution will correspond to taking a small step along the central path in the direction of decreasing \(\mu\). Technically, we do not exactly follow the central path, but it can be guaranteed that the sequence of points stays within a small neighborhood of it. As \(\mu \rightarrow 0\), the central path approaches the optimal point \((x^*,y^*,s^*)\), so by following the path toward \(\mu = 0\), a classical or quantum IPM can guarantee success.
The classical IPM can solve the Newton system exactly using Gaussian elimination in \(\mathcal{O}\left( n^3 \right)\) operations, or it can solve the system approximately using a variety of iterative solvers such as conjugate gradient descent or the Kaczmarz method [9]. In contrast, the QIPM solves the Newton system by using a quantum linear system solver to repeatedly prepare the \(\mathcal{O}\left( \log(n) \right)\)-qubit state \(\ket{\Delta x, \Delta y, \Delta s}\) whose amplitudes encode the solution to the Newton system. By preparing many copies, the algorithm can perform (pure state) quantum state tomography to yield an estimate \((\overline{\Delta x}, \overline{\Delta y}, \overline{\Delta s})\) for the amplitudes \((\Delta x, \Delta y, \Delta s)\) to some desired precision \(\xi\) (in 2-norm), i.e.
Due to the tomography step, the QIPM is only able to generate solutions to the Newton system that are inexact. There has been some question in the literature whether the fastest IPMs still work even when inexact solutions are used, as this causes intermediate points to be (slightly) infeasible [7]. However, if \(\xi\) is sufficiently small, the method appears to work empirically even using the inexact solutions that would be output by a quantum solver [10]. Alternatively, there exist workarounds [7] that ensure feasibility is maintained even when linear systems are solved inexactly, at the expense of some additional classical cost.
The IPMs and QIPMs for SOCPs [11, 8] are quite similar to the one for LPs described above: the main difference is that the matrices \(X\) and \(S\) are no longer strictly diagonal matrices. QIPMs have also been proposed for SDPs [5, 7, 12], which are more complex but have more expressive power; here, additional considerations must be taken to guarantee that the intermediate solutions continue to be symmetric even after experiencing errors due to tomography.
Dominant resource cost (gates/qubits)
The outer loop of QIPMs is purely classical; at each iteration a small step is taken to form the next point in the sequence. For LP, SOCP, and SDP, the number of iterations \(T\) required to yield a point for which the objective function is within \(\epsilon\) of optimal is \(\mathcal{O}(\sqrt{n}\log(1/\epsilon))\). The main cost of each iteration is solving the Newton system.
The QIPM solves the Newton system by preparing many copies of the state corresponding to the solution to the linear system. This state can be prepared in time \(\text{polylog}(n)\cdot \zeta \kappa\), where \(\kappa\) is the condition number of the matrix in Eq. \(\eqref{eq:Newton_system}\) and \(\zeta\) is the ratio \(\lVert \cdot \rVert_F / \lVert \cdot \rVert\) of the Frobenius and spectral norms of the matrix, assuming that one can perform a block-encoding of the Newton matrix in \(\text{polylog}(n)\) time, a task that requires access to large-scale quantum random access memory (QRAM). For LP and SOCP, the number of copies that must be prepared scales as \(\mathcal{O}\left( n/\xi^2 \right)\) when using the basic version (see [5, Section 4] and [10, Section IVD]) of pure state tomography that simply measures each copy in the computational basis. A more recent and complex version of tomography [13] can achieve this task using \(\mathcal{O}\left( n/\xi \right)\) copies along with additional gates. For SDP, since the variables are matrices rather than vectors, the number of copies is \(\mathcal{O}\left( n^2/\xi^2 \right)\) or \(\mathcal{O}\left( n^2/\xi \right)\). Overall, using the more efficient version of tomography and ignoring the additional gates, the runtime of the QIPM is expected to scale as
where \(\kappa\) denotes the maximum condition number, \(\zeta\) the maximum ratio of Frobenius to spectral norm, and \(\xi\) the minimum tomographic precision required across all iterations. In the worst case, it may be necessary to take \(\xi\) as small as \(\mathcal{O}\left( 1/\kappa \right)\), and \(\zeta\) can be as large as \(\sqrt{n}\) (SOCP/LP) or \(n\) (SDP). The hidden constant prefactors are dependent primarily on the implementation of the quantum linear system solver and tomography. It is clear that the viability of the QIPM is highly dependent on the value and scaling of the parameters \(\kappa\) and \(\xi\). Unfortunately, it is believed that for some LP/SOCP/SDP instances, the value of \(\kappa\) will diverge as the target precision \(\epsilon\) is made smaller, perhaps as \(\mathcal{O}\left( 1/\epsilon \right)\) [11, 7], although this may not be the case in every instance [12].
The QIPM only requires a register of \(\mathcal{O}\left( \log(n) \right)\) qubits to hold the solution of the linear system; however, achieving the runtimes quoted requires queries to QRAM. In this case, the explicit QRAM circuits that achieve shallow depths of \(\mathcal{O}\left( \log(n) \right)\) necessarily require \(\mathcal{O}\left( n^2 \right)\) total gates across \(\mathcal{O}\left( n^2 \right)\) total qubits.
Caveats
There are several important caveats that must be considered when evaluating a speedup claimed by QIPM.
- Even in a best case scenario, the quantum speedup is at most polynomial (and even subquadratic). Since quantum computation requires significant constant-factor overheads due to slower clock speeds and error correction, the value of \(n\) for which a QIPM would be faster than a classical IPM on actual hardware is likely to be large (see [10] for further discussion).
- Since \(n\) must be large for a quantum speedup to be obtained, a very large QRAM, corresponding to millions or billions of (logical) qubits, would be needed for any speedup to be realized.
- QIPMs are most effective when the condition number \(\kappa\) is relatively small since they rely on quantum linear system solvers. However, when \(\kappa\) is small, iterative classical methods may also be effective, limiting the advantage of the quantum algorithm. In particular, a linear system with \(\mathcal{O}\left( n \right)\) constraints on \(n\) variables can be solved to error \(\xi\) in time \(\mathcal{O}\left( n\zeta^2 \kappa^2\log(1/\xi) \right)\) using the randomized Kaczmarz method [9]. The QIPM performs this task in time \(\mathcal{O}\left( n \zeta \kappa/\xi \right)\). Even if \(\xi = \Omega(1)\), this limits the magnitude of the quantum speedup to \(\mathcal{O}\left( \zeta \kappa \right)\). Thus, for the quantum speedup to be maximized, \(\kappa\) can be neither too small nor too large.
- If the matrices that define the convex problem have a certain structure (e.g. sparsity), this could be exploited to potentially reduce the overhead from block-encoding—in particular, the value of \(\zeta\) and the size of the QRAM required. However, this can help the quantum algorithm only to a limited extent, as the vectors \((\Delta x, \Delta y, \Delta s)\) will still be dense and reading out estimates for all \(\mathcal{O}\left( n \right)\) amplitudes with quantum tomography will be necessary.
Example use cases
- Portfolio optimization, the canonical optimization problem that appears in finance, can be formulated as an SOCP and solved with a QIPM; a study of the condition number of the matrices that appear in this application was consistent with a small quantum speedup [14]; however, a follow-up study did not replicate this finding [10] and also pointed out that in any case large constant-factor overheads would make achieving practical advantage challenging.
- Support vector machines, a common task in machine learning, can be reduced to SOCP and solved with a QIPM; a study of the condition number of the matrices that appear in this application was consistent with a small quantum speedup [11].
- Sample-efficient protocols for mixed-state tomography reduce the problem of reconstructing an estimate of the quantum state to solving an SDP. This SDP could be solved with a QIPM (note that the tomography needed within the QIPM is always on pure states and does not require solving an SDP, thus avoiding an issue of circular logic).
- Nonconvex optimization is often solved approximately by relaxing the problem into a convex problem like an SDP. For example, the \(\mathsf{MAX}\)-\(\mathsf{CUT}\) problem is acombinatorial optimization problem over the nonconvex space \(\{+1,-1\}^n\), but by solving the associated SDP relaxation and rounding, an approximate solution can be obtained.
Further reading
- See Boyd and Vandenberghe [6] for an accessible book on convex optimization including (classical) interior point methods.
- QIPMs are an active area of research. A QIPM for LP and SDP was originally proposed by Kerenidis and Prakash in [5]. This was followed up by a QIPM for SOCP in [11], along with numerical simulations for specific applications [11, 14]. Later, [7] pointed out a potential error in the convergence analysis of previous works, and they presented two possible workarounds called the "inexact-infeasible" and "inexact-feasible" IPMs. Note also the work in [12] for another way to avoid this issue, giving a QIPM for SDP.
Bibliography
-
Alexander Domahidi, Eric Chu, and Stephen Boyd. Ecos: an socp solver for embedded systems. In 2013 European Control Conference (ECC), volume, 3071–3076. 2013. doi:10.23919/ECC.2013.6669541.
-
Erling D. Andersen and Knud D. Andersen. The Mosek Interior Point Optimizer for Linear Programming: An Implementation of the Homogeneous Algorithm, pages 197–232. Springer US, Boston, MA, 2000. URL: https://doi.org/10.1007/978-1-4757-3216-0\_8, doi:10.1007/978-1-4757-3216-0\_8.
-
Margaret Wright. The interior-point revolution in optimization: history, recent developments, and lasting consequences. Bulletin of the American Mathematical Society, 42(1):39–56, 2005. doi:10.1090/S0273-0979-04-01040-7.
-
Arkadi S Nemirovski and Michael J Todd. Interior-point methods for optimization. Acta Numerica, 17:191–234, 2008. doi:10.1017/S0962492906370018.
-
Iordanis Kerenidis and Anupam Prakash. A quantum interior point method for lps and sdps. ACM Transactions on Quantum Computing, 2020. arXiv: https://arxiv.org/abs/1808.09266. doi:10.1145/3406306.
-
S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, 2004. URL: https://web.stanford.edu/\textasciitilde boyd/cvxbook/bv\_cvxbook.pdf.
-
Brandon Augustino, Giacomo Nannicini, Tamás Terlaky, and Luis F. Zuluaga. Quantum interior point methods for semidefinite optimization. Quantum, 7:1110, 9 2023. arXiv: https://arxiv.org/abs/2112.06025. URL: https://doi.org/10.22331/q-2023-09-11-1110, doi:10.22331/q-2023-09-11-1110.
-
Brandon Augustino, Tamás Terlaky, and Luis F Zuluaga. An inexact-feasible quantum interior point method for second-order cone optimization. Technical Report 21T-009, Department of Industrial and Systems Engineering, Lehigh University, 2022.
-
Thomas Strohmer and Roman Vershynin. A randomized kaczmarz algorithm with exponential convergence. Journal of Fourier Analysis and Applications, 15(2):262–278, 2009. arXiv: https://arxiv.org/abs/math/0702226. doi:10.1007/s00041-008-9030-4.
-
Alexander M Dalzell, B David Clader, Grant Salton, Mario Berta, Cedric Yen-Yu Lin, David A Bader, Nikitas Stamatopoulos, Martin J A Schuetz, Fernando G S L Brandão, Helmut G Katzgraber, and others. End-to-end resource analysis for quantum interior point methods and portfolio optimization. PRX Quantum, pages to appear, 2023. arXiv: https://arxiv.org/abs/2211.12489.
-
Iordanis Kerenidis, Anupam Prakash, and Dániel Szilágyi. Quantum algorithms for second-order cone programming and support vector machines. Quantum, 5:427, 2021. arXiv: https://arxiv.org/abs/1908.06720. doi:10.22331/q-2021-04-08-427.
-
Baihe Huang, Shunhua Jiang, Zhao Song, Runzhou Tao, and Ruizhe Zhang. A faster quantum algorithm for semidefinite programming via robust ipm framework. arXiv: https://arxiv.org/abs/2207.11154, 2022.
-
Joran van Apeldoorn, Arjan Cornelissen, András Gilyén, and Giacomo Nannicini. Quantum tomography using state-preparation unitaries. In Proceedings of the 34th ACM-SIAM Symposium on Discrete Algorithms (SODA), 1265–1318. 2023. arXiv: https://arxiv.org/abs/2207.08800. doi:10.1137/1.9781611977554.ch47.
-
Iordanis Kerenidis, Anupam Prakash, and Dániel Szilágyi. Quantum algorithms for portfolio optimization. In Proceedings of the 1st ACM Conference on Advances in Financial Technologies, AFT '19, 147–155. New York, NY, USA, 2019. Association for Computing Machinery. arXiv: https://arxiv.org/abs/1908.08040. URL: https://doi.org/10.1145/3318041.3355465, doi:10.1145/3318041.3355465.