Amplitude estimation

Rough overview (in words)

Given a quantum subroutine that succeeds with unknown success probability, amplitude estimation performs quantum phase estimation on the operator used in amplitude amplification to learn the magnitude of the success amplitude. While the algorithm is referred to as amplitude estimation, it is often the success probability that we wish to compute, and the complexity of the algorithm is often presented accordingly. For example, the original paper introducing amplitude estimation [1] uses the variable \(a\) to denote the success probability. Here we denote the amplitude by \(a\) and the success probability by \(p = |a|^2\). The algorithm provides a quadratic speedup over classical methods for estimating \(p\).

Rough overview (in math)

We are given an initial state \(\ket{\psi_0}\), a target state \(\ket{\psi_g}\), and a unitary \(U\) (and its inverse \(U^\dag\)) such that

\[\begin{equation} U\ket{\psi_0} = \ket{\psi} = a \ket{\psi_g} + b\ket{\psi_b} \end{equation}\]

where \(\ket{\psi_b}\) is a state orthogonal to the target state. We assume that we can mark the target state \(\ket{\psi_g}\) (i.e., that we have the ability to reflect about it). Thus, \(|a|^2\) is the success probability of applying \(U\) and measuring \(\ket{\psi_g}\). We are also given the ability to implement the reflection operator around the initial state, \(R_{\psi_0} = I - 2\ket{\psi_0}\bra{\psi_0}\), and an operation that, when restricted to the subspace spanned by \(\{ \ket{\psi_g}, \ket{\psi_b} \}\), acts as the reflection around the target state, \(R_{\psi_g} = I - 2\ket{\psi_g}\bra{\psi_g}\). We can then estimate the success probability by performing quantum phase estimation on the operator \(W = - U R_{\psi_0} U^\dag R_{\psi_g}\), starting from the initial state \(U \ket{\psi_0} = \ket{\psi}\). The standard analysis [1] proceeds by letting \(|a|=\sin(\theta)\) and \(|b|=\cos(\theta)\) (thus the phases of \(a\) and \(b\) are absorbed into \(\ket{\psi_g}\) and \(\ket{\psi_b}\) and are not determined by the following procedure) and showing that the 2D subspace spanned by \(\{\ket{\psi_g}, \ket{\psi_b}\}\) is invariant under \(W\), where it acts as the rotation operator

\[\begin{equation} W = \left[\begin{array}{cc} \cos(2\theta) & -\sin(2\theta) \\ \sin(2\theta) & \cos(2\theta) \end{array}\right]. \end{equation}\]

This operator has eigenvalues \(e^{\pm 2 i \theta}\), and we can estimate \(\theta\) to additive error \(\epsilon\) through quantum phase estimation. The estimate for \(\theta\) can be converted into an estimate for \(|a|\), or for the success probability \(p=|a|^2\), which is often the quantity of interest.
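The 2D-subspace picture above is easy to check numerically. The following sketch (a plain numpy calculation on the \(2\times 2\) restriction of \(W\), not a circuit simulation; the amplitude \(a\) is assumed real and positive) verifies that \(W\) has eigenvalues \(e^{\pm 2i\theta}\) and that \(|a|\) and \(p\) can be recovered from the eigenphase:

```python
import numpy as np

a = 0.6                       # success amplitude (assumed real, positive)
theta = np.arcsin(a)          # |a| = sin(theta)

# W restricted to span{|psi_g>, |psi_b>} acts as a rotation by 2*theta
W = np.array([[np.cos(2 * theta), -np.sin(2 * theta)],
              [np.sin(2 * theta),  np.cos(2 * theta)]])

# the eigenphases of W are +/- 2*theta
phases = np.angle(np.linalg.eigvals(W))   # values in (-pi, pi]
theta_est = np.abs(phases[0]) / 2         # either branch works

a_est = np.sin(theta_est)                 # estimate of |a|
p_est = a_est**2                          # estimate of the success probability
```

In the full algorithm, the eigenphase is of course obtained by quantum phase estimation rather than by classical diagonalization; the point here is only the relation between the spectrum of \(W\) and \(\theta\).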

Dominant resource cost (gates/qubits)

The classical approach for learning the probability \(p\) to precision \(\epsilon\) requires \(M = \mathcal{O}\left( 1/\epsilon^2 \right)\) samples, where the basic idea is to perform \(M\) incoherent repetitions of applying \(U\) and measuring in the \(\{\ket{\psi_g}, \ket{\psi_b}\}\) basis. Amplitude estimation provides a quadratic speedup, learning the probability (and amplitude) with \(M = \mathcal{O}\left( 1/\epsilon \right)\) applications of \(W\). The textbook variant succeeds with constant probability, which can be boosted to \(1-\delta\) with \(\mathcal{O}\left( \log(1/\delta) \right)\) overhead through standard methods (e.g., probability amplification by majority voting).

More precisely, we can follow the analysis in [1] to show that to learn \(|a|\) to error \(\epsilon\), we require \(M\) controlled applications of the walk operator \(W\), where \(M\) satisfies (see footnote 1)

\[\begin{equation} \label{eq:AE_amplitude_bound} \epsilon \leq \frac{\pi \sqrt{1-a^2}}{M} + \frac{a\pi^2}{2M^2}. \end{equation}\]

The algorithm succeeds with probability \(8/\pi^2\). We see that for \(a \approx 1-\mathcal{O}\left( \epsilon \right)\), a further quadratic improvement is obtained (i.e., \(M = \mathcal{O}\left( 1/\sqrt{\epsilon} \right)\)).

To learn the success probability \(p=|a|^2\) to error \(\epsilon\) we require \(M\) controlled applications of the walk operator \(W\) where \(M\) satisfies [1]

\[\begin{equation} \label{eq:AE_probability_bound} \epsilon \leq \frac{2\pi \sqrt{p(1-p)}}{M} + \frac{\pi^2}{M^2}. \end{equation}\]

The algorithm succeeds with probability \(8/\pi^2\). Similar to above, if \(p \approx \mathcal{O}\left( \epsilon \right)\) or \(p \approx 1-\mathcal{O}\left( \epsilon \right)\), we have that \(M = \mathcal{O}\left( 1/\sqrt{\epsilon} \right)\) (see footnote 2).
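The probability bound above can be inverted to find how many calls to \(W\) suffice for a target error. The snippet below is an illustrative calculation (the function name `required_M` and the chosen values of \(\epsilon\) and \(p\) are ours): it solves the quadratic in \(M\) and shows the \(\mathcal{O}(1/\epsilon)\) scaling for generic \(p\) alongside the improved \(\mathcal{O}(1/\sqrt{\epsilon})\) scaling when \(p \approx \epsilon\).

```python
import math

def required_M(p, eps):
    # Smallest (real-valued) M with eps = 2*pi*sqrt(p(1-p))/M + pi^2/M^2,
    # found by solving the quadratic eps*M^2 - 2*pi*sqrt(p(1-p))*M - pi^2 = 0.
    b = 2 * math.pi * math.sqrt(p * (1 - p))
    return (b + math.sqrt(b**2 + 4 * math.pi**2 * eps)) / (2 * eps)

eps = 1e-4
M_generic = required_M(0.5, eps)   # generic p: M scales as 1/eps
M_small_p = required_M(eps, eps)   # p ~ eps: M scales as 1/sqrt(eps)
```

For \(\epsilon = 10^{-4}\), this gives \(M \approx 3.1 \times 10^4\) at \(p = 1/2\) but only \(M \approx 7.6 \times 10^2\) at \(p = \epsilon\), consistent with the quoted scalings.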

A common setting is the case where \(\ket{\psi_0} = \ket{\bar{0}}\), and \(U\) acts on \(n\) register qubits and \(k\) ancilla qubits such that \(U\ket{\bar{0}} = a\ket{\psi_g}\ket{\bar{0}}_k + b \ket{\psi_b}\ket{\bar{0}^\perp}_k\). In this case, the reflection operators are simple to implement, and \(W\) can be controlled by making these reflections controlled (adding another control qubit to a multicontrol-CZ gate). We require \(\log(M)\) ancilla qubits for phase estimation (which can be reduced using modern variants, see below and [2]).
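Because \(W\) preserves the 2D invariant subspace, textbook amplitude estimation can be simulated end to end by running phase estimation directly on the \(2\times 2\) rotation. The sketch below is a numpy toy model, not a gate-level circuit (the choices \(m = 8\) and \(p_{\mathrm{true}} = 0.3\) are arbitrary, and `np.fft.fft` stands in for the inverse QFT on the counting register): it applies the controlled powers of \(W\), and reads off \(\tilde{p} = \sin^2(\pi y / 2^m)\) from the most likely outcome \(y\).

```python
import numpy as np

p_true = 0.3
theta = np.arcsin(np.sqrt(p_true))           # p = sin^2(theta)
W = np.array([[np.cos(2 * theta), -np.sin(2 * theta)],
              [np.sin(2 * theta),  np.cos(2 * theta)]])

m = 8                                        # counting qubits
N = 2**m                                     # N controlled applications of W

# |psi> = sin(theta)|psi_g> + cos(theta)|psi_b> in the invariant subspace
psi = np.array([np.sin(theta), np.cos(theta)])

# After the controlled powers, the joint state is (1/sqrt(N)) sum_k |k> W^k |psi>
state = np.zeros((N, 2), dtype=complex)
Wk = np.eye(2)
for k in range(N):
    state[k] = Wk @ psi / np.sqrt(N)
    Wk = W @ Wk

# Inverse QFT on the counting register (np.fft.fft uses e^{-2*pi*i*k*y/N},
# matching the inverse QFT up to the 1/sqrt(N) normalization added here)
state = np.fft.fft(state, axis=0) / np.sqrt(N)

# Marginal distribution over counting-register outcomes y
probs = np.sum(np.abs(state)**2, axis=1)
y = int(np.argmax(probs))                    # most likely outcome

theta_est = np.pi * y / N                    # eigenphase 2*theta ~ 2*pi*y/N
p_est = np.sin(theta_est)**2                 # same value for either +/- branch
```

Both eigenphase branches \(\pm 2\theta\) produce outcomes related by \(y \mapsto N - y\), and \(\sin^2(\pi y/N)\) is invariant under this map, so either peak yields the same estimate of \(p\).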

Caveats

The textbook version of amplitude estimation described above produces biased estimates of \(|a|\) and \(p\). This is partly inherited from the biased nature of textbook quantum phase estimation (see caveats). However, even if unbiased variants of phase estimation are used, the amplitude and probability estimates are not immediately unbiased, as they are obtained by applying nonlinear functions to the estimate of the phase. Unbiased variants of amplitude [2] and probability estimation [3] have been developed to address this.

The variant of amplitude estimation described above is also "destructive" in the sense that the output state is collapsed onto one of the states \(\frac{1}{\sqrt{2}}(\pm i \ket{\psi_g} + \ket{\psi_b})\), which coincides with neither \(\ket{\psi_0}\) nor \(\ket{\psi}\). A nondestructive variant may be desired if the initial state is expensive to prepare and we require coherent or incoherent repetitions of amplitude estimation. Nondestructive variants have been developed in [4, 5, 2].

Example use cases

  • Approximate counting of solutions marked by an oracle (e.g., topological data analysis, combinatorial optimization).
  • Amplitude estimation provides a quadratic speedup for Monte Carlo estimation [6, 7], with applications including the pricing of financial assets. The general idea is to prepare a state \(\ket{\psi} = \sum_x \sqrt{p(x) f(x)} \ket{x}\ket{0} + \ket{\phi 0^\perp}\), where \(f(x) \in [0,1]\) and the second register of the (unnormalized) state \(\ket{\phi 0^\perp}\) is orthogonal to \(\ket{0}\). The expectation value \(\mathbb{E}[f(x)] = \sum_x p(x) f(x)\) that we wish to evaluate by Monte Carlo sampling then equals the probability of measuring the second register in state \(\ket{0}\), so amplitude estimation provides a quadratic speedup for estimating this quantity.
  • A special case of amplitude estimation is overlap estimation [8], where the goal is to measure \(\bra{\psi_0} U \ket{\psi_0} = \braket{\psi_0}{\psi}\). This can be viewed as an application of amplitude estimation where \(\ket{\psi_g} = \ket{\psi_0}\). As a result, we only require the ability to implement \(R_{\psi_0} = I - 2\ket{\psi_0}\bra{\psi_0}\) and \(U, U^\dag\) (or, equivalently, \(R_{\psi_0}\) and \(R_{\psi}\)). Note that in overlap estimation one additionally wants to determine the phase of \(a\), which can be obtained by applying amplitude estimation to a controlled variant of \(U\), as outlined in [8]. Overlap estimation can be used for estimating observables, e.g., in quantum chemistry.
  • A generalization of amplitude estimation, via the quantum gradient algorithm, forms a core subroutine in some approaches for quantum state tomography [3]. Pure state tomography can be thought of as a generalization of amplitude estimation, in which we seek to learn all amplitudes individually, rather than only a single aggregate quantity.
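To make the Monte Carlo use case above concrete, the following sketch (an illustrative numpy check, with a randomly chosen distribution \(p(x)\) and function \(f(x) \in [0,1]\)) verifies that, in the encoded state, the probability of measuring the second register in \(\ket{0}\) equals \(\mathbb{E}[f(x)]\):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8
p = rng.random(n)
p /= p.sum()                      # distribution p(x)
f = rng.random(n)                 # f(x) in [0, 1]

# Amplitudes of |x>|0> and |x>|1> in the Monte Carlo encoding
amp0 = np.sqrt(p * f)             # sqrt(p(x) f(x)) |x>|0>
amp1 = np.sqrt(p * (1 - f))       # the orthogonal branch (|phi 0^perp> above)

norm = np.sum(amp0**2) + np.sum(amp1**2)   # the encoded state is normalized
prob_zero = np.sum(amp0**2)                # Pr[second register reads |0>]
expectation = np.sum(p * f)                # E[f(x)]
```

Estimating `prob_zero` by amplitude estimation then recovers \(\mathbb{E}[f(x)]\) with the quadratic speedup discussed above.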

Further reading

  • Variants of amplitude estimation using fewer ancilla qubits (including ancilla-free approaches), or with depth-repetition tradeoffs have been proposed. For a summary of these approaches and their unification within the QSVT framework, see [2].

Bibliography

  1. Gilles Brassard, Peter Høyer, Michele Mosca, and Alain Tapp. Quantum amplitude amplification and estimation. In Quantum Computation and Quantum Information: A Millennium Volume, volume 305 of Contemporary Mathematics Series, pages 53–74. AMS, 2002. doi:10.1090/conm/305/05215.

  2. Patrick Rall and Bryce Fuller. Amplitude estimation from quantum signal processing. Quantum, 7:937, 3 2023. arXiv: https://arxiv.org/abs/2207.08628. URL: https://doi.org/10.22331/q-2023-03-02-937, doi:10.22331/q-2023-03-02-937.

  3. Joran van Apeldoorn, Arjan Cornelissen, András Gilyén, and Giacomo Nannicini. Quantum tomography using state-preparation unitaries. In Proceedings of the 34th ACM-SIAM Symposium on Discrete Algorithms (SODA), 1265–1318. 2023. arXiv: https://arxiv.org/abs/2207.08800. doi:10.1137/1.9781611977554.ch47.

  4. Aram W. Harrow and Annie Y. Wei. Adaptive quantum simulated annealing for Bayesian inference and estimating partition functions. In Proceedings of the 31st ACM-SIAM Symposium on Discrete Algorithms (SODA), 193–212. 2020. arXiv: https://arxiv.org/abs/1907.09965. URL: https://epubs.siam.org/doi/abs/10.1137/1.9781611975994.12, doi:10.1137/1.9781611975994.12.

  5. Arjan Cornelissen and Yassine Hamoudi. A sublinear-time quantum algorithm for approximating partition functions. In Proceedings of the 34th ACM-SIAM Symposium on Discrete Algorithms (SODA), 1245–1264. 2023. arXiv: https://arxiv.org/abs/2207.08643. URL: https://epubs.siam.org/doi/abs/10.1137/1.9781611977554.ch46, doi:10.1137/1.9781611977554.ch46.

  6. Ashley Montanaro. Quantum speedup of Monte Carlo methods. Proceedings of the Royal Society A, 2015. arXiv: https://arxiv.org/abs/1504.06987. doi:10.1098/rspa.2015.0301.

  7. Robin Kothari and Ryan O'Donnell. Mean estimation when you have the source code; or, quantum Monte Carlo methods. In Proceedings of the 34th ACM-SIAM Symposium on Discrete Algorithms (SODA), 1186–1215. 2023. arXiv: https://arxiv.org/abs/2208.07544. URL: https://epubs.siam.org/doi/abs/10.1137/1.9781611977554.ch44, doi:10.1137/1.9781611977554.ch44.

  8. Emanuel Knill, Gerardo Ortiz, and Rolando D. Somma. Optimal quantum measurements of expectation values of observables. Physical Review A, 75:012328, 1 2007. arXiv: https://arxiv.org/abs/quant-ph/0607019. URL: https://link.aps.org/doi/10.1103/PhysRevA.75.012328, doi:10.1103/PhysRevA.75.012328.


  1. Specifically, Lemma 7 of [1] shows that if \(\theta = \arcsin(|a|)\) and \(\tilde{\theta} = \arcsin(|\tilde{a}|)\), then \(|\theta-\tilde{\theta}|\leq \eta\) implies \(|a^2-\tilde{a}^2| \leq 2\eta\sqrt{a^2(1-a^2)}+\eta^2\). This is easily adapted to show that it also implies \(|a-\tilde{a}|\leq \eta\sqrt{1-a^2} + a\eta^2/2\). They show that with probability at least \(8/\pi^2\), \(\theta\) is learned up to additive error at most \(\eta = \pi/M\) with \(M\) calls to \(W\), which together with the above expressions implies Eqs. [eq:AE_amplitude_bound] and [eq:AE_probability_bound].

  2. We can compare to the classical approach of estimating \(p\) by flipping a \(p\)-biased coin \(M\) times. Letting \(\tilde{p}\) denote the estimate, which has mean \(p\) and variance \(p(1-p)/M\), Chebyshev's inequality implies that \(|p-\tilde{p}|\leq \epsilon\) with probability at least \(8/\pi^2\) as long as \(M \geq C p(1-p)/\epsilon^2\) where \(C = 1/(1-8/\pi^2)\). Thus, when \(p \approx \mathcal{O}\left( \epsilon \right)\) or \(p\approx 1-\mathcal{O}\left( \epsilon \right)\), the classical approach achieves \(M \sim 1/\epsilon\), and the quantum speedup is never more than quadratic.
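The classical baseline in the footnote above is easy to verify empirically. The sketch below (a quick numpy simulation; the trial count, seed, and values of \(p\) and \(\epsilon\) are chosen arbitrarily) flips a \(p\)-biased coin \(M = \lceil C p(1-p)/\epsilon^2 \rceil\) times per estimate and checks that \(|p-\tilde{p}| \leq \epsilon\) holds with frequency at least \(8/\pi^2 \approx 0.81\):

```python
import numpy as np

rng = np.random.default_rng(1)
p, eps = 0.3, 0.05
C = 1 / (1 - 8 / np.pi**2)                    # constant from Chebyshev's inequality
M = int(np.ceil(C * p * (1 - p) / eps**2))    # coin flips per estimate

trials = 2000
estimates = rng.binomial(M, p, size=trials) / M   # p_tilde = (# heads) / M
success_rate = np.mean(np.abs(estimates - p) <= eps)
```

Since Chebyshev's inequality is loose, the observed success rate typically exceeds \(8/\pi^2\) by a comfortable margin.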