Characterizing Quantum Gates Via Randomized Benchmarking
…connection in special cases between the error rate provided by this protocol and the error strength measured using the diamond norm distance.

…regardless of the number of qubits comprising the system. We provide a detailed proof that our protocol requires at most O(n²) quantum gates, O(n⁴) cost in classical pre-processing (to select each gate sequence), and a number of single-shot repetitions that is independent of n. As well, we give a thorough explanation of the perturbative expansion of the time- and gate-dependent errors about the average error that leads to the fitting models for the observed fidelity decay. Our zeroth order model directly shows that for time-independent and gate-independent errors the fidelity decay is indeed modeled by an exponential decay, and the decay rate produces an estimate for the average error rate of the noise.

We derive the first order fitting model, which takes into account the first-order correction terms in the perturbative expansion, and provide a detailed explanation of the conditions under which this is a sufficient model of the fidelity decay curve. The fitting formula shows that gate-dependent errors can lead to a deviation from the exponential decay (defining a partial test for such effects in the noise), which was illustrated via numerical examples in [25]. State-preparation and measurement errors appear as independent fit parameters in the fitting models, and we discuss when the protocol is robust against these errors. In the case of Pauli errors we give some novel preliminary results regarding the relationship between the benchmarking average error rate and the more common diamond norm error measure [33, 34] used in fault-tolerant theory.

The paper is structured as follows: In Section II we discuss notation and background material. In Section III A we discuss the proposed protocol, and then in Section III B we present the perturbative expansion and expressions for the zeroth and first order fitting models. Section IV provides a sufficient condition for neglecting higher order terms in the model, as well as a simple case for when the benchmarking scheme fails. We also discuss when the protocol is robust against state preparation and measurement errors. Section V discusses the relationship between the error rate given by the benchmarking scheme and other measures of error commonly used in quantum information. Section VI provides a detailed proof that our protocol is scalable in the number of qubits comprising the system, and a discussion with concluding remarks is contained in Section VII.

Let us first set some notation. Suppose we have an n-qubit quantum system, so that the Hilbert space H representing the system has dimension d = 2^n. Thus H is isomorphic to C^d, and both will generically refer to the Hilbert space of a d-dimensional quantum system throughout the presentation. The set of linear operators on H will be denoted by L(H). The set of pure states is represented by complex projective space CP^{d−1}, and the set of all mixed states in L(H), denoted by D(H), is given by the set of non-negative, trace-1 linear operators on H. Unless otherwise stated, we will only be concerned with quantum operations with the same input and output spaces. The set of linear superoperators mapping L(H) into itself is denoted by T(H), with the set of quantum channels (completely positive, trace-preserving linear maps) contained in T(H) denoted by S(H).

There are various methods for quantifying the distance between quantum operations; we briefly describe those that will be of use to us. Good references for many of the topics in this section are [35–37].

A. Diamond Norm, Average Gate Fidelity and Minimum Gate Fidelity

One method of quantifying the distance between two linear superoperators E1, E2 ∈ T(H) is given by the diamond norm distance, ‖E1 − E2‖⋄. The diamond norm of an arbitrary linear superoperator R : L(C^m) → L(C^n) is defined as,

‖R‖⋄ = sup_{k∈ℕ} ‖R ⊗ I_k‖₁,   (2.1)

where ‖·‖₁ on superoperators is defined to be the ∞-norm induced by the trace norm ‖·‖₁ on L(C^m) and L(C^n). It is known that the supremum occurs for k = m, and so,

‖R‖⋄ = ‖R ⊗ I_m‖₁ = max_{A : ‖A‖₁ ≤ 1} ‖(R ⊗ I_m)(A)‖₁,   (2.2)

where A ∈ L(C^m ⊗ C^m). Hence for E1, E2 ∈ T(H),

‖E1 − E2‖⋄ = ‖(E1 − E2) ⊗ I_d‖₁.   (2.3)

The diamond norm distance is commonly used in quantum information due to its operational meaning of being related to the optimal probability for distinguishing E1 and E2 using a binary outcome POVM and single input state (allowing for ancillas) [38].

Another method for quantifying the distance between linear superoperators is given by the ‖·‖^H_{1→1} norm, defined for a linear superoperator R : L(C^m) → L(C^n) as,

‖R‖^H_{1→1} = max_{A = A†, ‖A‖₁ ≤ 1} ‖R(A)‖₁,   (2.4)

where A ∈ L(C^m). One can see that ‖·‖^H_{1→1} is just ‖·‖₁ (which is also denoted ‖·‖_{1→1}) restricted to Hermitian inputs. This norm is less common in quantum information due to its lack of operational meaning; however, it is a weaker measure of distance than the diamond norm since, for any linear superoperator R : L(C^m) → L(C^n), ‖R‖^H_{1→1} ≤ ‖R‖⋄. This will be of much use to us later when we consider neglecting higher order effects in the benchmarking scheme.
F_{E1,E2}(ρ) = F(E1(ρ), E2(ρ)) = [ tr √( √(E1(ρ)) E2(ρ) √(E1(ρ)) ) ]².   (2.5)

…where we recall the definition of ‖·‖^H_{1→1} in Eq. (2.4). The second inequality is clear since,

min_{|ψ⟩∈H⊗H} F( E1 ⊗ I(|ψ⟩⟨ψ|), E2 ⊗ I(|ψ⟩⟨ψ|) ) ≤ min_{|φ⟩∈H} F( E1 ⊗ I(|φ⟩⟨φ| ⊗ |φ⟩⟨φ|), E2 ⊗ I(|φ⟩⟨φ| ⊗ |φ⟩⟨φ|) )
  = min_{|φ⟩∈H} [ tr √( √(E1(|φ⟩⟨φ|) ⊗ |φ⟩⟨φ|) (E2(|φ⟩⟨φ|) ⊗ |φ⟩⟨φ|) √(E1(|φ⟩⟨φ|) ⊗ |φ⟩⟨φ|) ) ]²
  = min_{|φ⟩∈H} [ tr √( √(E1(|φ⟩⟨φ|)) E2(|φ⟩⟨φ|) √(E1(|φ⟩⟨φ|)) ) ⊗ |φ⟩⟨φ| ]².

F^min_{E1,E2} ≥ 1 − ‖E1 − E2‖⋄.   (2.21)

B. The Clifford Group and t-Designs

The Clifford group on n qubits, denoted Clif_n, is defined as the normalizer of the Pauli group P_n and is generated by the phase (S), Hadamard (H) and controlled-NOT (CNOT) gates. Clif_n plays an important role in many areas of quantum information such as universality [45], stabilizer code theory/fault-tolerance [46] and noise estimation [17].

One extremely useful property of Clif_n, especially for noise estimation, is that the uniform probability distribution over Clif_n comprises a unitary 2-design [17]. A unitary t-design is defined as follows, …

W(Λ)(ρ) := (1/|Clif_n|) Σ_{j=1}^{|Clif_n|} C_j Λ(C_j† ρ C_j) C_j† = ∫_{U(d)} U Λ(U† ρ U) U† dU.   (2.24)

As shown in [18, 40], ∫_{U(d)} U Λ(U† ρ U) U† dU produces the unique depolarizing channel Λ_d with the same average fidelity as Λ. Hence if F_{Λ,I} is the average fidelity of Λ, and Λ_d is given by

Λ_d(ρ) = pρ + (1 − p) 𝟙/d,   (2.25)

then, …
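The quantities above are easy to check numerically. The sketch below is our own illustration (plain NumPy, not code from the paper): it evaluates the fidelity of Eq. (2.5) and the depolarizing channel of Eq. (2.25). For a pure input state ρ, F(ρ, Λ_d(ρ)) = p + (1 − p)/d, which is exactly the average fidelity of Λ_d quoted in the text.

```python
import numpy as np

def sqrtm_psd(A):
    """Square root of a Hermitian positive semidefinite matrix."""
    w, V = np.linalg.eigh(A)
    return V @ np.diag(np.sqrt(np.clip(w, 0, None))) @ V.conj().T

def fidelity(rho, sigma):
    """State fidelity of Eq. (2.5): F = (tr sqrt(sqrt(rho) sigma sqrt(rho)))^2."""
    s = sqrtm_psd(rho)
    return np.trace(sqrtm_psd(s @ sigma @ s)).real ** 2

def depolarize(rho, p):
    """Depolarizing channel of Eq. (2.25): rho -> p*rho + (1 - p)*I/d."""
    d = rho.shape[0]
    return p * rho + (1 - p) * np.eye(d) / d

rho = np.array([[1, 0], [0, 0]], dtype=complex)  # pure |0><0|, d = 2
# For a pure input, F(rho, Lambda_d(rho)) = p + (1 - p)/d.
print(fidelity(rho, depolarize(rho, 0.7)))
```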
…where i_m is the m-tuple (i₁, …, i_m) (which we sometimes also denote by ī_m), and i_{m+1} is uniquely determined by i_m.

Step 2. For each of the K_m sequences, measure the survival probability Tr[E_ψ S_{ī_m}(ρ_ψ)]. Here ρ_ψ is a quantum state that takes into account errors in preparing |ψ⟩⟨ψ|, and E_ψ is the POVM element that takes into account measurement errors. In the ideal (noise-free) case ρ_ψ = E_ψ = |ψ⟩⟨ψ|.

Step 3. Average over the K_m random realizations to find the averaged sequence fidelity,

F_seq(m, ψ) = Tr[E_ψ S_{K_m}(ρ_ψ)].   (3.6)

…

F_g(m, ψ) = Tr[E_ψ S_m(ρ_ψ)],   (3.10)

where we define the exact average of the sequences to be,

S_m = (1/|Clif_n|^m) Σ_{(i₁,…,i_m)} Λ_{i_{m+1},m+1} ◦ C_{i_{m+1}} ◦ … ◦ Λ_{i₁,1} ◦ C_{i₁}.   (3.11)

Hence the fitting functions by which we model the behavior of F_seq(m, ψ) are derived in terms of F_g(m, ψ) (see Sec. III B). Note that since F_g(m, ψ) is the uniform average over all sequences, we can sum over each index independently,

F_g(m, ψ) = (1/|Clif_n|^m) Σ_{i₁,…,i_m} tr[ Λ_{i_{m+1},m+1} ◦ C_{i_{m+1}} ◦ Λ_{i_m,m} ◦ C_{i_m} ◦ … ◦ Λ_{i₁,1} ◦ C_{i₁}(ρ_ψ) E_ψ ].   (3.12)

In order to prepare for the next section, where we derive the above fitting models, we write F_g(m, ψ) in a more intuitive form. We first re-write Λ_{i_{m+1},m+1} ◦ C_{i_{m+1}} ◦ Λ_{i_m,m} ◦ C_{i_m} ◦ … ◦ Λ_{i₁,1} ◦ C_{i₁} by inductively defining new uniformly…
2. Define D_{i₂} uniquely by the equation C_{i₂} = D_{i₂} ◦ D_{i₁}†, i.e. D_{i₂} = C_{i₂} ◦ C_{i₁} = ◦_{s=1}^{2} C_{i_s}.

S_{ī_m} = Λ_{i_{m+1},m+1} ◦ D_{i_m}† ◦ Λ_{i_m,m} ◦ D_{i_m} ◦ … ◦ D_{i₁}† ◦ Λ_{i₁,1} ◦ D_{i₁}.   (3.16)

S_{ī_m} ≡ Λ_{i_{m+1},m+1} ◦ C_{i_{m+1}} ◦ … ◦ Λ_{i_j,j} ◦ C_{i_j} ◦ … ◦ Λ_{i₁,1} ◦ C_{i₁}
  = Λ_{i_{m+1},m+1} ◦ D_{i_m}† ◦ Λ_{i_m,m} ◦ D_{i_m} ◦ … ◦ D_{i₁}† ◦ Λ_{i₁,1} ◦ D_{i₁}
  = Λ ◦ D_{i_m}† ◦ Λ ◦ D_{i_m} ◦ … ◦ D_{i₁}† ◦ Λ ◦ D_{i₁} + δΛ_{i_{m+1},m+1} ◦ D_{i_m}† ◦ Λ ◦ D_{i_m} ◦ … ◦ D_{i₁}† ◦ Λ ◦ D_{i₁}
  + … + Λ ◦ D_{i_m}† ◦ Λ ◦ D_{i_m} ◦ … ◦ D_{i_j}† ◦ δΛ_{i_j,j} ◦ D_{i_j} ◦ … ◦ D_{i₁}† ◦ Λ ◦ D_{i₁}
  + … + Λ ◦ D_{i_m}† ◦ Λ ◦ D_{i_m} ◦ … ◦ D_{i₁}† ◦ δΛ_{i₁,1} ◦ D_{i₁} + O(δΛ²_{i_j,j}).   (3.18)

We define

S^{(0)}_{ī_m} := Λ ◦ D_{i_m}† ◦ Λ ◦ D_{i_m} ◦ … ◦ D_{i₁}† ◦ Λ ◦ D_{i₁},   (3.19)

S^{(1)}_{ī_m} := δΛ_{i_{m+1},m+1} ◦ D_{i_m}† ◦ Λ ◦ D_{i_m} ◦ … ◦ D_{i₁}† ◦ Λ ◦ D_{i₁}
  + … + Λ ◦ D_{i_m}† ◦ Λ ◦ D_{i_m} ◦ … ◦ D_{i_j}† ◦ δΛ_{i_j,j} ◦ D_{i_j} ◦ … ◦ D_{i₁}† ◦ Λ ◦ D_{i₁}
  + … + Λ ◦ D_{i_m}† ◦ Λ ◦ D_{i_m} ◦ … ◦ D_{i₁}† ◦ δΛ_{i₁,1} ◦ D_{i₁}.   (3.20)
F^{(k)}_g(m, ψ) := Σ_{j=0}^{k} tr[ S^{(j)}_m(ρ_ψ) E_ψ ],   (3.22)

so that,

S_m = Σ_{k=0}^{m+1} S^{(k)}_m,   (3.23)

and

F_g(m, ψ) = F^{(m+1)}_g(m, |ψ⟩) = Σ_{j=0}^{m+1} tr[ S^{(j)}_m(ρ_ψ) E_ψ ].   (3.24)

1. Zeroth Order Model

…

A₀ := Tr[ E_ψ Λ( ρ_ψ − 𝟙/d ) ]   (3.27)

and

B₀ := Tr[ E_ψ Λ( 𝟙/d ) ].   (3.28)

Hence, assuming the simplest (ideal) scenario, where the noise operator at each step is independent of the applied gate (and is also time-invariant), F_g(m, ψ) = F^{(0)}_g(m, |ψ⟩) = A₀ p^m + B₀ decays exponentially in the sequence length m.

2. First Order Model

To find F^{(1)}_g(m, |ψ⟩) we note that in the definition of S^{(1)}_{ī_m} given by Eq. (3.20) there are (m+1 choose 1) = m + 1 first-order perturbation terms which contain the gate dependence. First, we consider the m − 1 terms with j ∈ {2, …, m}. For each such j, averaging over the {i₁, …, i_m} gives a term of the form,
sums are independent. More precisely, the above can be written as,

(1/|Clif_n|²) Σ_{i_{j−1}, i_j} Λ ◦ Λ_d^{m−j} ◦ D_{i_{j−1}}† ◦ C_{i_j}† ◦ δΛ_{i_j,j} ◦ C_{i_j} ◦ Λ ◦ D_{i_{j−1}} ◦ [ (1/|Clif_n|^{j−2}) Σ_{i_{j−2},…,i₁} D_{i_{j−2}}† ◦ Λ ◦ D_{i_{j−2}} ◦ … ◦ D_{i₁}† ◦ Λ ◦ D_{i₁} ],

where Q_j := (1/|Clif_n|) Σ_i C_i† ◦ Λ_{i,j} ◦ C_i and the subscript d represents the depolarization of the operator within brackets. Using the fact that depolarizing channels commute, we get,

…

For the term with j = 1, averaging over i₁, …, i_m gives a term of the form,

(1/|Clif_n|) Σ_{i₁} Λ ◦ Λ_d^{m−1} ◦ D_{i₁}† ◦ δΛ_{i₁,1} ◦ D_{i₁} = Λ ◦ Λ_d^{m−1} ◦ (Q₁ − Λ_d),   (3.32)

where

Q₁ = (1/|Clif_n|) Σ_i C_i† ◦ Λ_{i,1} ◦ C_i.   (3.33)

Lastly, for the term with j = m + 1, averaging gives,

(1/|Clif_n|^m) Σ_{i₁…i_m} δΛ_{i_{m+1},m+1} ◦ D_{i_m}† ◦ Λ ◦ D_{i_m} ◦ … ◦ D_{i₁}† ◦ Λ ◦ D_{i₁}
  = (1/|Clif_n|^{m−1}) Σ_{i₁…i_{m−1}} [ (1/|Clif_n|) Σ_{i_m} δΛ_{i_{m+1},m+1} ◦ D_{i_m}† ◦ Λ ◦ D_{i_m} ] ◦ … ◦ D_{i₁}† ◦ Λ ◦ D_{i₁}.   (3.34)

Since Clif_n is a group, if i₁, …, i_{m−1} is fixed, averaging over the i_m index runs through every Clifford element with equal frequency in the D_{i_m} random variable. Since Λ_{i_{m+1},m+1} is just the error associated with the gate D_{i_m}†, (1/|Clif_n|) Σ_{i_m} δΛ_{i_{m+1},m+1} ◦ D_{i_m}† ◦ Λ ◦ D_{i_m} is independent of the i₁, …, i_{m−1} indices. Hence we can define

R_{m+1} := (1/|Clif_n|) Σ_{i_m} Λ_{i_{m+1},m+1} ◦ D_{i_m}† ◦ Λ ◦ D_{i_m}
  = (1/|Clif_n|) Σ_i Λ_{i′,m+1} ◦ C_i† ◦ Λ ◦ C_i,   (3.35)

where Λ_{i′,m+1} denotes the error that arises when the Clifford operation C_i† is applied at the final time-step m + 1. Again, using the group property of Clif_n, we have,

R_{m+1} = (1/|Clif_n|) Σ_i Λ_{i,m+1} ◦ C_i ◦ Λ ◦ C_i†.   (3.36)

This decoupling of R_{m+1} allows us to write,

(1/|Clif_n|^{m−1}) Σ_{i₁…i_{m−1}} [ (1/|Clif_n|) Σ_{i_m} δΛ_{i_{m+1},m+1} ◦ D_{i_m}† ◦ Λ ◦ D_{i_m} ] ◦ … ◦ D_{i₁}† ◦ Λ ◦ D_{i₁}
  = (R_{m+1} − Λ ◦ Λ_d) ◦ Λ_d^{m−1}.   (3.37)

Altogether,

S^{(0)}_m + S^{(1)}_m = Λ ◦ Λ_d^m + (R_{m+1} − Λ ◦ Λ_d) ◦ Λ_d^{m−1} + Σ_{j=2}^{m} Λ ◦ [ (Q_j ◦ Λ)_d − Λ_d² ] ◦ Λ_d^{m−2} + Λ ◦ Λ_d^{m−1} ◦ (Q₁ − Λ_d)
  = R_{m+1} ◦ Λ_d^{m−1} + Σ_{j=2}^{m} Λ ◦ (Q_j ◦ Λ)_d ◦ Λ_d^{m−2} + Λ ◦ Λ_d^{m−1} ◦ Q₁ − m (Λ ◦ Λ_d^m).   (3.38)

To calculate F^{(1)}_g(m, |ψ⟩) := tr[ (S^{(0)}_m + S^{(1)}_m)(ρ_ψ) E_ψ ] we have,

tr( Λ ◦ Λ_d^m (ρ_ψ) E_ψ ) = A₀ p^m + B₀,   (3.42)

F^{(1)}_g(m, |ψ⟩) = G_{1,m+1} p^{m−1} + H_{1,m+1} + Σ_{j=2}^{m} ( A₀ q_j p^{m−2} + B₀ ) + A_{1,1} p^{m−1} + B₀ − m ( A₀ p^m + B₀ )
  = p^{m−1} ( G_{1,m+1} + A_{1,1} − A₀ p ) + (m − 1) A₀ p^{m−2} ( (Σ_{j=2}^{m} q_j)/(m − 1) − p² ) + H_{1,m+1}.   (3.43)
‖ (1/|Clif_n|^m) Σ_{ī_m} Λ ◦ D_{i_m}† ◦ Λ ◦ D_{i_m} ◦ … ◦ D_{i_{j₂}}† ◦ δΛ_{i_{j₂}} ◦ D_{i_{j₂}} ◦ … ◦ D_{i_{j₁}}† ◦ δΛ_{i_{j₁}} ◦ D_{i_{j₁}} ◦ … ◦ D_{i₁}† ◦ Λ ◦ D_{i₁} ‖^H_{1→1}
  ≤ ‖Λ‖^H_{1→1} (1/|Clif_n|^m) Σ_{ī_m} ‖ D_{i_m}† ◦ Λ ◦ D_{i_m} ‖^H_{1→1} … ‖ D_{i_{j₂}}† ◦ δΛ_{i_{j₂}} ◦ D_{i_{j₂}} ‖^H_{1→1} … ‖ D_{i_{j₁}}† ◦ δΛ_{i_{j₁}} ◦ D_{i_{j₁}} ‖^H_{1→1} … ‖ D_{i₁}† ◦ Λ ◦ D_{i₁} ‖^H_{1→1}
  = ( ‖Λ‖^H_{1→1} )^{m−1} [ (1/|Clif_n|) Σ_{i_{j₂}} ‖ D_{i_{j₂}}† ◦ δΛ_{i_{j₂}} ◦ D_{i_{j₂}} ‖^H_{1→1} ] [ (1/|Clif_n|) Σ_{i_{j₁}} ‖ D_{i_{j₁}}† ◦ δΛ_{i_{j₁}} ◦ D_{i_{j₁}} ‖^H_{1→1} ]
  ≤ [ (1/|Clif_n|) Σ_{i_{j₂}} ‖ D_{i_{j₂}}† ◦ δΛ_{i_{j₂}} ◦ D_{i_{j₂}} ‖^H_{1→1} ] [ (1/|Clif_n|) Σ_{i_{j₁}} ‖ D_{i_{j₁}}† ◦ δΛ_{i_{j₁}} ◦ D_{i_{j₁}} ‖^H_{1→1} ],

where we define the time-dependent variation in the noise,

γ_j := (1/|Clif_n|) Σ_i ‖ Λ_{i,j} − Λ ‖^H_{1→1}.   (4.3)

Summing over all j₁, j₂ with j₂ > j₁ gives,

‖ S^{(2)}_m ‖^H_{1→1} = ‖ (1/|Clif_n|^m) Σ_{ī_m} S^{(2)}_{ī_m} ‖^H_{1→1}
  = ‖ (1/|Clif_n|^m) Σ_{ī_m} Σ_{j₂>j₁} Λ ◦ D_{i_m}† ◦ Λ ◦ D_{i_m} ◦ … ◦ D_{i_{j₂}}† ◦ δΛ_{i_{j₂}} ◦ D_{i_{j₂}} ◦ … ◦ D_{i_{j₁}}† ◦ δΛ_{i_{j₁}} ◦ D_{i_{j₁}} ◦ … ◦ D_{i₁}† ◦ Λ ◦ D_{i₁} ‖^H_{1→1}
  ≤ (1/|Clif_n|^m) Σ_{j₂>j₁} Σ_{ī_m} ‖ Λ ◦ D_{i_m}† ◦ Λ ◦ D_{i_m} ◦ … ◦ D_{i_{j₂}}† ◦ δΛ_{i_{j₂}} ◦ D_{i_{j₂}} ◦ … ◦ D_{i_{j₁}}† ◦ δΛ_{i_{j₁}} ◦ D_{i_{j₁}} ◦ … ◦ D_{i₁}† ◦ Λ ◦ D_{i₁} ‖^H_{1→1}
  ≤ Σ_{j₂>j₁} γ_{j₂} γ_{j₁}.   (4.4)

In terms of the fidelity, we thus have from Eqs. (4.1) and (4.4),

| F^{(2)}_g(m, |ψ⟩) − F^{(1)}_g(m, |ψ⟩) | ≤ Σ_{j₂>j₁} γ_{j₂} γ_{j₁}.   (4.5)

Bounding each γ_j by a constant γ gives,

Σ_{j₂>j₁} γ² = ((m + 1) m / 2) γ².   (4.6)

…for any superoperator norm ‖·‖ that satisfies the properties listed above, the following inequality holds,

| F^{(k+1)}_g(m, ψ) − F^{(k)}_g(m, ψ) | ≤ (m+1 choose k) γ^k,   (4.11)

where

γ := (1/|Clif_n|) Σ_i ‖ Λ_i − Λ ‖.   (4.12)
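Eqs. (4.5)–(4.6) give a quick numerical criterion for when the first order model suffices. The helper below is our own illustration (not from the paper): it evaluates the bound (m + 1)m γ²/2 on the neglected second-order contribution, assuming the average noise variation γ_j is at most γ at every time step.

```python
def second_order_bound(m, gamma):
    """Bound on |F_g^(2) - F_g^(1)| from Eqs. (4.5)-(4.6), assuming
    gamma_j <= gamma for every time step j."""
    return (m + 1) * m / 2 * gamma ** 2

# With gamma ~ 1e-3, sequences of length m = 100 pick up at most ~5e-3
# error from the neglected second-order terms.
print(second_order_bound(100, 1e-3))
```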
it is unlikely to arise in practice. This discussion shows that a constant fidelity decay curve can only occur in extreme cases, and so it is safe to assume the protocol is independent of state preparation and measurement errors.

In terms of connections between the average error rate r and relevant fault-tolerant measures of error, it is natural to ask how the error rate r between Λ and I is related to the diamond norm between Λ and I. In general an explicit relationship will be impossible to obtain; however, we show that in certain cases that are relevant in various fault-tolerant noise models we can obtain such a relationship. First we give a new proof of a previously established result [38] for calculating the diamond norm distance between generalized Pauli channels. The proof we present here illustrates how one can apply a semidefinite program to calculate the diamond norm distance between quantum channels [49]. Ideally, this proof technique could be used to either explicitly calculate or place bounds on the diamond norm distance between more general classes of quantum channels. This could allow for obtaining further relationships between r and the diamond norm distance which hold in more general cases.

To prove Eq. (5.4) using the semidefinite program in [49], first note that Φ = E1 − E2 has action,

Φ(ρ) = Σ_{i=0}^{d²−1} (q_i − r_i) P_i ρ P_i†.   (5.5)

Primal problem: Maximize ⟨J(Φ), W⟩ subject to W ≤ 𝟙_d ⊗ ρ, W ∈ Pos(C^d ⊗ C^d), ρ ∈ D(C^d).

Dual problem: Minimize ‖tr₁(Z)‖_∞ subject to Z ≥ J(Φ), Z ∈ Pos(C^d ⊗ C^d),

where J(Φ) is the Choi matrix [50] of Φ. If α and β are the solutions to the primal and dual problems, then the case α = β is called strong duality. It is shown in [49] that the above semidefinite program always has the property of strong duality, and the solution to the program is α = ½‖E1 − E2‖⋄. Note also that it is always the case that α ≤ β.

By definition,

J(Φ) = d (Φ ⊗ I)(|ψ₀⟩⟨ψ₀|) = d Σ_{i=0}^{d²−1} (q_i − r_i) (P_i ⊗ 𝟙) |ψ₀⟩⟨ψ₀| (P_i† ⊗ 𝟙).   (5.6)
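For generalized Pauli channels, the program evaluates to half the ℓ₁ distance between the Pauli probability vectors, i.e. ‖E1 − E2‖⋄ = ‖q − r‖₁ (Eq. (5.4)). The sketch below is our own numerical check of this formula against the depolarizing-channel corollary of Eq. (5.9), using the Pauli probability vector of a depolarizing channel implied by Eq. (5.10).

```python
import numpy as np

def depolarizing_pauli_probs(p, d):
    """Pauli probability vector of a depolarizing channel with fidelity
    parameter p: q0 = ((d^2 - 1)p + 1)/d^2 as in Eq. (5.10), with the
    remaining d^2 - 1 Pauli terms sharing the rest equally."""
    q0 = ((d ** 2 - 1) * p + 1) / d ** 2
    return np.array([q0] + [(1 - p) / d ** 2] * (d ** 2 - 1))

def diamond_distance_pauli(q, r):
    """||E1 - E2||_diamond = ||q - r||_1 for generalized Pauli channels."""
    return float(np.abs(np.asarray(q) - np.asarray(r)).sum())

d, p1, p2 = 2, 0.95, 0.90
q = depolarizing_pauli_probs(p1, d)
r = depolarizing_pauli_probs(p2, d)
# Agrees with Eq. (5.9): 2|p1 - p2|(d^2 - 1)/d^2.
assert np.isclose(diamond_distance_pauli(q, r),
                  2 * abs(p1 - p2) * (d ** 2 - 1) / d ** 2)
```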
As a simple corollary to Eq. (5.4), note that if E1 and E2 are depolarizing channels with fidelity parameters p₁ and p₂ respectively, then,

‖E1 − E2‖⋄ = 2|p₁ − p₂|(d² − 1)/d².   (5.9)

To see this, note that

q₀ = ( (d + 1) F_{E1,I} − 1 ) / d = ( (d + 1)( p₁ + (1 − p₁)/d ) − 1 ) / d = ( (d² − 1) p₁ + 1 ) / d²,   (5.10)

and similarly,

r₀ = ( (d² − 1) p₂ + 1 ) / d².   (5.11)

Hence,

‖E1 − E2‖⋄ = ‖v‖₁ = |q₀ − r₀| + Σ_{i=1}^{d²−1} |q_i − r_i|
  = | ( (d² − 1) p₁ + 1 )/d² − ( (d² − 1) p₂ + 1 )/d² | + (d² − 1) | (1 − p₁)/d² − (1 − p₂)/d² |
  = 2 (d² − 1) |p₁ − p₂| / d².   (5.14)

We know that q₀ is related to the average fidelity of E1, F_{E1,I}, by

F_{E1,I} = ( q₀ d + 1 ) / ( d + 1 ),   (5.16)

and so,

‖E1 − I‖⋄ = 2 (d + 1)(1 − F_{E1,I}) / d.   (5.17)

Therefore, in the case of randomized benchmarking (where we define the error rate r = 1 − F_{Λ,I}), if Λ is a generalized Pauli channel, r and ‖Λ − I‖⋄ are related by,

‖Λ − I‖⋄ = 2 (d + 1) r / d.   (5.18)

…over |Clif_n|^m sequences (i₁, …, i_m); exactly computing this average for each m is clearly inefficient. The benchmarking protocol requires choosing a sequence at random, evaluating the above fidelity, repeating for many sequences, and taking the average of the results.

2. Uniform sampling: Since the size of the Clifford group scales as 2^{O(n²)}, sampling directly from a list of all Clifford elements becomes impossible for large n (writing down every element is inefficient in n).

3. Implementing Clifford operations: In practice, one can only implement a generating set for the Clifford group. Hence even if random sampling can be accomplished, there must be a scalable method for implementing each Clifford using only this generating set.

Solution to 1: Let S_k(m, |ψ⟩) be the normalized k-fold sum of the random variable F^{ī_m}_g(m, |ψ⟩), and note that E[S_k(m, |ψ⟩)] = F_g(m, ψ). A probabilistic bound on |S_k(m, |ψ⟩) − F_g(m, ψ)| is given by Hoeffding's inequality,

P( |S_k(m, |ψ⟩) − F_g(m, ψ)| ≥ ε ) ≤ 2 e^{ −2(kε)² / ( k(b−a)² ) } = 2 e^{ −2kε² / (b−a)² },   (6.2)

where [a, b] is the range of F^{ī_m}_g(m, |ψ⟩). Since F^{ī_m}_g(m, |ψ⟩) is a fidelity, it must lie in [0, 1] (in reality it will lie in a much smaller interval; for now we continue to assume it lies in [a, b] ⊆ [0, 1]). Suppose we want

P( |S_k(m, |ψ⟩) − F_g(m, ψ)| ≥ ε ) ≤ δ,   (6.3)

where ε represents the accuracy of the estimate and 1 − δ represents the desired confidence level. We can find how many trials one needs to perform to obtain this accuracy by setting δ = 2 e^{ −2kε²/(b−a)² } and solving for k,

k = ln(2/δ) (b − a)² / (2ε²).   (6.4)

Note that k is explicitly independent of m and n, which provides a solution to 1.

It is instructive to obtain an estimate of the size of k for realistic parameter values of δ and ε. Since 1 − δ represents our desired confidence level, we set δ = 0.05. Fault-tolerance provides a wide range for the error tolerance of a physical (0-level) gate in the fault-tolerant construction. The value of the error tolerance depends on both the coding scheme as well as the noise model, and typical values lie somewhere between 10⁻⁶ and 10⁻². Let us assume that the physical gates have errors on the order of 10⁻⁴. Intuitively, since the fidelity curve decays in sequence length, it is reasonable to assume that ε can be relaxed as m grows large. Similarly, b − a can be assumed to be relatively small for small values of m but will converge to 1 − 1/d as m grows large. As a result, both b − a and ε have an implicit dependence on m, and this implicit dependence is advantageous when choosing ε for large values of m. Let us assume m = 100 and a fidelity decay curve that is well-approximated by an exponential. Then we expect fidelity values on the order of 0.99 at this value of m, and so we take ε = 10⁻³, b − a = 0.2. With these values for ε, δ and b − a we get,

k = ln(2/0.05) (0.2)² / ( 2 (10⁻³)² ) ∼ 7 × 10⁴.   (6.5)

While this number is large, it is independent of n and thus compares favourably with quantum process tomography, which scales as 16ⁿ. As a direct comparison, performing process tomography on a 4-qubit system already requires 65536 measurements.

Solution to 2: For the second problem we present a method to scalably sample uniformly from the full Clifford group that utilizes the symplectic representation of the Clifford group (see Refs. [51, 52]). Since the Clifford group is the normalizer of the Pauli group, every Clifford element is completely determined by its action under conjugation on the Pauli group. In particular, since the Pauli group is generated by the set of all X_i and Z_i (the label i refers to X or Z being in the i'th position with identity operators elsewhere), an element of the Clifford group is completely determined by its action on this set. In the symplectic representation, this corresponds to each Clifford element Q being associated uniquely with a 2n by 2n binary symplectic matrix C and a length-2n binary vector h which records negative signs in the images of the X_i and Z_i. The only constraints on Q are that commutation relations and Hermiticity of the generating set must be preserved under Q. Hence we can construct a random Clifford element Q by inductively constructing a random symplectic matrix C and vector h.

Since h corresponds to keeping track of negative signs, the binary entries of h can be chosen uniformly at random. C is inductively constructed column by column, where the first n columns correspond to the images of X₁ through X_n, and the last n columns correspond to the images of Z₁ through Z_n (all of which are written in binary notation as in [52]). Preservation of commutation relations is phrased through the symplectic inner product, and so at each step one chooses the new column by finding a random solution to a system of linear equations which represents the inner product conditions. Since randomly choosing 2n elements of the Pauli group that satisfy the required commutation relations is equivalent to inductively choosing random solutions to 2n sets of linear equations (which requires O(n³) operations), we can produce a random Clifford element in O(n⁴) (classical) operations.

Solution to 3: Any Clifford element can be decomposed into a sequence of O(n²) one- and two-qubit generators in O(n²) time [52] (alternatively, there are slower methods which produce a "canonical" decomposition into O(n²/log n) generators [53]). We describe this method, which again utilizes the symplectic representation of the…
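The arithmetic behind Eq. (6.5) is easy to reproduce. The snippet below (ours, purely illustrative) evaluates Eq. (6.4) for the parameter values quoted in the text.

```python
import math

def num_repetitions(epsilon, delta, spread):
    """Eq. (6.4): k = ln(2/delta) * (b - a)^2 / (2 * epsilon^2)."""
    return math.log(2 / delta) * spread ** 2 / (2 * epsilon ** 2)

# delta = 0.05, epsilon = 1e-3, b - a = 0.2, as in the text.
k = num_repetitions(epsilon=1e-3, delta=0.05, spread=0.2)
print(round(k))   # ~ 7 x 10^4, matching Eq. (6.5)
```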
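The inductive construction in the Solution to 2 can be prototyped directly for small n. The sketch below is our own illustration, not the O(n³)-per-column method of [52]: each column of C is chosen as a random solution of the symplectic inner-product constraints with the columns already fixed, found here by brute-force enumeration over all 2^{2n} binary vectors (with linear independence enforced so the construction can always continue), and the result is verified to satisfy CᵀJC = J (mod 2).

```python
import itertools
import random
import numpy as np

def gf2_rank(rows):
    """Rank of a binary matrix over GF(2), by Gaussian elimination."""
    a = np.array(rows, dtype=int) % 2
    rank = 0
    for c in range(a.shape[1]):
        pivots = [r for r in range(rank, a.shape[0]) if a[r, c]]
        if not pivots:
            continue
        a[[rank, pivots[0]]] = a[[pivots[0], rank]]
        for r in range(a.shape[0]):
            if r != rank and a[r, c]:
                a[r] = (a[r] + a[rank]) % 2
        rank += 1
    return rank

def symplectic_form(n):
    """J with <u, v> = u^T J v (mod 2); the columns of C must satisfy
    <c_a, c_b> = J[a, b] (x-images pairwise commute, x_i pairs with z_i)."""
    z, i = np.zeros((n, n), dtype=int), np.eye(n, dtype=int)
    return np.block([[z, i], [i, z]])

def random_symplectic(n, rng):
    """Inductively build a random 2n x 2n binary symplectic matrix."""
    J = symplectic_form(n)
    candidates = [np.array(v) for v in itertools.product((0, 1), repeat=2 * n)]
    cols = []
    for k in range(2 * n):
        # Random solution of the inner-product constraints with the columns
        # chosen so far, kept linearly independent over GF(2).
        valid = [v for v in candidates
                 if gf2_rank(cols + [v]) == k + 1
                 and all((cols[a] @ J @ v) % 2 == J[a, k] for a in range(k))]
        cols.append(rng.choice(valid))
    return np.column_stack(cols)

C = random_symplectic(2, random.Random(0))
J = symplectic_form(2)
assert np.array_equal((C.T @ J @ C) % 2, J)   # C is symplectic over GF(2)
```

The scalable version replaces the brute-force enumeration with a direct random solution of each binary linear system, giving the O(n³)-per-column (O(n⁴) total) cost stated in the text.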
…to be benchmarked for universal quantum computation are Clifford gates.

Various interesting questions and comments arise from the benchmarking analysis presented here. First, there is a key point to emphasize regarding the zeroth and first order fitting models. As depicted in [25], there exist physically relevant noise models for which, when the true value of the depolarization fidelity parameter p is used, the first order model fits the experimental data much better than the zeroth order model. However, it may be the case that a least squares fitting procedure using the functional form of the zeroth order model produces a very good fit to the experimental data, albeit producing an incorrect value for p. Therefore, in order to obtain a more accurate value for p, one should always use the first order fitting model unless prior knowledge of the noise indicates that it is effectively gate-independent.

It will be useful to obtain a better understanding of when a least squares fitting procedure using the zeroth order model produces a value for p that is close to its true value. Clearly, in the gate-independent case the zeroth order model fits the fidelity decay curve exactly. Moreover, for weakly gate-dependent noise one can see from our continuity argument that the zeroth order model is still a sufficient fitting function for the fidelity decay curve. Hence the most interesting case to analyze is when there is a non-negligible amount of gate-dependence in the noise and the condition for using the first order model to fit the decay curve is satisfied. A useful test that would indicate gate-dependence in the noise, and thus the validity of the value of p obtained from fitting to the zeroth order model, is to perform the least squares fitting procedure using both the zeroth and first order fitting models. If the estimates of p obtained in each case differ significantly, then the zeroth order model must be a poor choice of fitting function even though it may fit the data well. In this case the noise must have a strong gate-dependence, because otherwise q − p² would be small, which implies the two fitting functions would produce similar estimates for p.

An interesting question is how to extract a meaningful average error rate over a generating set of the Clifford group, for instance G_n defined previously, from the average error rate r over the entire Clifford group. One might argue that benchmarking a generating set for the Clifford group is sufficient for benchmarking the full Clifford group; however, it is entirely plausible that noise correlations between the n physical qubits create large errors on elements of Clif_n, even when the errors on the generating set can be controlled [55]. In fact, an assumption that is often made in fault-tolerant estimates is that the correlation in noise between qubits is either small or can be ignored.

With regards to scalability, while we have shown the protocol itself is scalable in n, a useful direction for further research would be an analysis of how the sufficient condition of weak average variation of the noise depends on n. As previously noted, the noise associated to a multi-qubit Clifford element is given by the noise associated to the sequence of generators comprising the Clifford. A determination of whether these noise operators continue to satisfy the sufficient condition when it is met for small numbers of qubits will be useful for understanding the applicability of the protocol.

Rigorous fault-tolerant analyses sometimes invoke the diamond norm as a measure of the error strength rather than the weaker characterization provided by the average fidelity. Hence it is desirable to find relationships between these two quantities that are more general than the special case of random Pauli errors presented here. As mentioned above, the semidefinite program we have used to deduce the relationship appears to be a promising tool for further research in this area. From the expression given in Eq. (2.2), one can see that the diamond norm is essentially a "worst-case" maximization over input (entangled) states. In quantum computation, the measure of accessible states (states that can be reached in polynomial time using a generating set for the unitary group) is equal to 0. Hence there is a high probability that the maximization criterion demanded by the diamond norm is a much stronger condition than necessary for understanding the strength of the errors affecting the computation. This point becomes even more relevant for an algorithm-specific (i.e. non-universal) quantum computer. An interesting direction of further research is to provide precise conditions for when the average fidelity provides an indication of, or bound on, the error strength in terms of stronger characterizations such as the diamond norm.

Additionally, if one were able to obtain an estimate of the minimum gate fidelity from knowledge of the average fidelity, they could use the direct relationship between the minimum gate fidelity and diamond norm given by Eq. (2.21) to obtain information about the error strength in terms of the diamond norm. A result that may be useful in this direction of research is the "concentration of measure effect" of the gate fidelity, which implies that as n increases, the measure of the set of states which produce a fidelity close to the minimum yet far from the average is exponentially small in n [41, 42].
[1] P. Shor, in Proceedings of the 35th Annual Symposium on Foundations of Computer Science (FOCS) (IEEE Press, Los Alamitos, CA, 1994).
[2] A. Harrow, A. Hassidim, and S. Lloyd, Phys. Rev. Lett. 103, 150502 (2009).
[3] R. Feynman, International Journal of Theoretical Physics 21 (1982).
[4] S. Lloyd, Science 273, 1073 (1996).
[5] P. Shor, Phys. Rev. A 52, R2493 (1995).
[6] A. Calderbank and P. Shor, Phys. Rev. A 54, 1098…