Characterizing Quantum Gates Via Randomized Benchmarking
…connection in special cases between the error rate provided by this protocol and the error strength measured using the diamond norm distance.

…regardless of the number of qubits comprising the system. We provide a detailed proof that our protocol requires at most O(n²) quantum gates, O(n⁴) cost in classical pre-processing (to select each gate sequence), and a number of single-shot repetitions that is independent of n. As well, we give a thorough explanation of the perturbative expansion of the time- and gate-dependent errors about the average error that leads to the fitting models for the observed fidelity decay. Our zeroth order model directly shows that for time-independent and gate-independent errors the fidelity decay is indeed modeled by an exponential decay, and the decay rate produces an estimate for the average error rate of the noise.

We derive the first order fitting model, which takes into account the first-order correction terms in the perturbative expansion, and provide a detailed explanation of the conditions under which this is a sufficient model of the fidelity decay curve. The fitting formula shows that gate-dependent errors can lead to a deviation from the exponential decay (defining a partial test for such effects in the noise), which was illustrated via numerical examples in [25]. State-preparation and measurement errors appear as independent fit parameters in the fitting models, and we discuss when the protocol is robust against these errors. In the case of Pauli errors we give some novel preliminary results regarding the relationship between the benchmarking average error rate and the more common diamond norm error measure [33, 34] used in fault-tolerant theory.

The paper is structured as follows: In Section II we discuss notation and background material. In Section III A we discuss the proposed protocol, and then in Section III B we present the perturbative expansion and expressions for the zeroth and first order fitting models. Section IV provides a sufficient condition for neglecting higher order terms in the model, as well as a simple case for when the benchmarking scheme fails. We also discuss when the protocol is robust against state preparation and measurement errors. Section V discusses the relationship between the error rate given by the benchmarking scheme and other measures of error commonly used in quantum information. Section VI provides a detailed proof that our protocol is scalable in the number of qubits comprising the system, and a discussion with concluding remarks is contained in Section VII.

Let us first set some notation. Suppose we have an n-qubit quantum system, so that the Hilbert space H representing the system has dimension d = 2^n. Thus H is isomorphic to C^d, and both will generically refer to the Hilbert space of a d-dimensional quantum system throughout the presentation. The set of linear operators on H will be denoted by L(H). The set of pure states is represented by complex projective space CP^{d−1}, and the set of all mixed states in L(H), denoted by D(H), is given by the set of non-negative, trace-1 linear operators on H. Unless otherwise stated, we will only be concerned with quantum operations with the same input and output spaces. The set of linear superoperators mapping L(H) into itself is denoted by T(H), with the set of quantum channels (completely positive, trace-preserving linear maps) contained in T(H) denoted by S(H).

There are various methods for quantifying the distance between quantum operations; we briefly describe those that will be of use to us. Good references for many of the topics in this section are [35–37].

A. Diamond Norm, Average Gate Fidelity and Minimum Gate Fidelity

One method of quantifying the distance between two linear superoperators E1, E2 ∈ T(H) is given by the diamond norm distance, ‖E1 − E2‖⋄. The diamond norm of an arbitrary linear superoperator R : L(C^m) → L(C^n) is defined as,

‖R‖⋄ = sup_{k∈ℕ} ‖R ⊗ I_k‖₁,   (2.1)

where ‖·‖₁ on superoperators is defined to be the ∞-norm induced by the trace norm ‖·‖₁ on L(C^m) and L(C^n). It is known that the supremum occurs for k = m, and so,

‖R‖⋄ = ‖R ⊗ I_m‖₁ = max_{A : ‖A‖₁ ≤ 1} ‖(R ⊗ I_m)(A)‖₁,   (2.2)

where A ∈ L(C^m ⊗ C^m). Hence for E1, E2 ∈ T(H),

‖E1 − E2‖⋄ = ‖(E1 − E2) ⊗ I_d‖₁.   (2.3)

The diamond norm distance is commonly used in quantum information due to its operational meaning of being related to the optimal probability for distinguishing E1 and E2 using a binary outcome POVM and single input state (allowing for ancillas) [38].

Another method for quantifying the distance between linear superoperators is given by the ‖·‖^H_{1→1} norm, defined for a linear superoperator R : L(C^m) → L(C^n) as,

‖R‖^H_{1→1} = max_{A = A†, ‖A‖₁ ≤ 1} ‖R(A)‖₁,   (2.4)

where A ∈ L(C^m). One can see that ‖·‖^H_{1→1} is just ‖·‖₁ (which is also denoted ‖·‖_{1→1}) restricted to Hermitian inputs. This norm is less common in quantum information due to its lack of operational meaning; however, it is a weaker measure of distance than the diamond norm since, for any linear superoperator R : L(C^m) → L(C^n), ‖R‖^H_{1→1} ≤ ‖R‖⋄. This will be of much use to us later when we consider neglecting higher order effects in the benchmarking scheme.
F_{E1,E2}(ρ) = F(E1(ρ), E2(ρ)) = [ tr √( √(E1(ρ)) E2(ρ) √(E1(ρ)) ) ]².   (2.5)

…where we recall the definition of ‖·‖^H_{1→1} in Eq. (2.4). The second inequality is clear since,

min_{|ψ⟩∈H⊗H} F( E1 ⊗ I(|ψ⟩⟨ψ|), E2 ⊗ I(|ψ⟩⟨ψ|) ) ≤ min_{|φ⟩∈H} F( E1 ⊗ I(|φ⟩⟨φ| ⊗ |φ⟩⟨φ|), E2 ⊗ I(|φ⟩⟨φ| ⊗ |φ⟩⟨φ|) )
  = min_{|φ⟩∈H} [ tr √( √(E1(|φ⟩⟨φ|) ⊗ |φ⟩⟨φ|) (E2(|φ⟩⟨φ|) ⊗ |φ⟩⟨φ|) √(E1(|φ⟩⟨φ|) ⊗ |φ⟩⟨φ|) ) ]²
  = min_{|φ⟩∈H} [ tr √( √(E1(|φ⟩⟨φ|)) E2(|φ⟩⟨φ|) √(E1(|φ⟩⟨φ|)) ) ⊗ |φ⟩⟨φ| ]².

F^min_{E1,E2} ≥ 1 − ‖E1 − E2‖⋄.   (2.21)

B. The Clifford Group and t-Designs

The Clifford group on n qubits, denoted Clif_n, is defined as the normalizer of the Pauli group P_n and is generated by the phase (S), Hadamard (H) and controlled-NOT (CNOT) gates. Clif_n plays an important role in many areas of quantum information such as universality [45], stabilizer code theory/fault-tolerance [46] and noise estimation [17].

One extremely useful property of Clif_n, especially for noise estimation, is that the uniform probability distribution over Clif_n comprises a unitary 2-design [17]. A unitary t-design is defined as follows, …

W(Λ)(ρ) := (1/|Clif_n|) Σ_{j=1}^{|Clif_n|} C_j Λ(C_j† ρ C_j) C_j† = ∫_{U(d)} U Λ(U† ρ U) U† dU.   (2.24)

As shown in [18, 40], ∫_{U(d)} U Λ(U† ρ U) U† dU produces the unique depolarizing channel Λ_d with the same average fidelity as Λ. Hence if F_{Λ,I} is the average fidelity of Λ, and Λ_d is given by

Λ_d(ρ) = pρ + (1 − p) 𝟙/d,   (2.25)

then, …
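The quantities above are easy to check numerically. The sketch below is our own illustration (plain NumPy, not code from the paper): it evaluates the fidelity of Eq. (2.5) and the depolarizing channel of Eq. (2.25). For a pure input state ρ, F(ρ, Λ_d(ρ)) = p + (1 − p)/d, which is exactly the average fidelity of Λ_d quoted in the text.

```python
import numpy as np

def sqrtm_psd(A):
    """Square root of a Hermitian positive semidefinite matrix."""
    w, V = np.linalg.eigh(A)
    return V @ np.diag(np.sqrt(np.clip(w, 0, None))) @ V.conj().T

def fidelity(rho, sigma):
    """State fidelity of Eq. (2.5): F = (tr sqrt(sqrt(rho) sigma sqrt(rho)))^2."""
    s = sqrtm_psd(rho)
    return np.trace(sqrtm_psd(s @ sigma @ s)).real ** 2

def depolarize(rho, p):
    """Depolarizing channel of Eq. (2.25): rho -> p*rho + (1 - p)*I/d."""
    d = rho.shape[0]
    return p * rho + (1 - p) * np.eye(d) / d

rho = np.array([[1, 0], [0, 0]], dtype=complex)  # pure |0><0|, d = 2
# For a pure input, F(rho, Lambda_d(rho)) = p + (1 - p)/d.
print(fidelity(rho, depolarize(rho, 0.7)))
```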
…where i_m is the m-tuple (i₁, …, i_m) (which we sometimes also denote by ī_m), and i_{m+1} is uniquely determined by i_m.

Step 2. For each of the K_m sequences, measure the survival probability Tr[E_ψ S_{ī_m}(ρ_ψ)]. Here ρ_ψ is a quantum state that takes into account errors in preparing |ψ⟩⟨ψ|, and E_ψ is the POVM element that takes into account measurement errors. In the ideal (noise-free) case ρ_ψ = E_ψ = |ψ⟩⟨ψ|.

Step 3. Average over the K_m random realizations to find the averaged sequence fidelity,

F_seq(m, ψ) = Tr[E_ψ S_{K_m}(ρ_ψ)].   (3.6)

…

F_g(m, ψ) = Tr[E_ψ S_m(ρ_ψ)],   (3.10)

where we define the exact average of the sequences to be,

S_m = (1/|Clif_n|^m) Σ_{(i₁,…,i_m)} Λ_{i_{m+1},m+1} ◦ C_{i_{m+1}} ◦ … ◦ Λ_{i₁,1} ◦ C_{i₁}.   (3.11)

Hence the fitting functions by which we model the behavior of F_seq(m, ψ) are derived in terms of F_g(m, ψ) (see Sec. III B). Note that since F_g(m, ψ) is the uniform average over all sequences, we can sum over each index independently,

F_g(m, ψ) = (1/|Clif_n|^m) Σ_{i₁,…,i_m} tr[ Λ_{i_{m+1},m+1} ◦ C_{i_{m+1}} ◦ Λ_{i_m,m} ◦ C_{i_m} ◦ … ◦ Λ_{i₁,1} ◦ C_{i₁}(ρ_ψ) E_ψ ].   (3.12)

In order to prepare for the next section, where we derive the above fitting models, we write F_g(m, ψ) in a more intuitive form. We first re-write Λ_{i_{m+1},m+1} ◦ C_{i_{m+1}} ◦ Λ_{i_m,m} ◦ C_{i_m} ◦ … ◦ Λ_{i₁,1} ◦ C_{i₁} by inductively defining new uniformly…
2. Define D_{i₂} uniquely by the equation C_{i₂} = D_{i₂} ◦ D_{i₁}†, i.e. D_{i₂} = C_{i₂} ◦ C_{i₁} = ◦_{s=1}^{2} C_{i_s}.

S_{ī_m} = Λ_{i_{m+1},m+1} ◦ D_{i_m}† ◦ Λ_{i_m,m} ◦ D_{i_m} ◦ … ◦ D_{i₁}† ◦ Λ_{i₁,1} ◦ D_{i₁}.   (3.16)

S_{ī_m} ≡ Λ_{i_{m+1},m+1} ◦ C_{i_{m+1}} ◦ … ◦ Λ_{i_j,j} ◦ C_{i_j} ◦ … ◦ Λ_{i₁,1} ◦ C_{i₁}
  = Λ_{i_{m+1},m+1} ◦ D_{i_m}† ◦ Λ_{i_m,m} ◦ D_{i_m} ◦ … ◦ D_{i₁}† ◦ Λ_{i₁,1} ◦ D_{i₁}
  = Λ ◦ D_{i_m}† ◦ Λ ◦ D_{i_m} ◦ … ◦ D_{i₁}† ◦ Λ ◦ D_{i₁} + δΛ_{i_{m+1},m+1} ◦ D_{i_m}† ◦ Λ ◦ D_{i_m} ◦ … ◦ D_{i₁}† ◦ Λ ◦ D_{i₁}
  + … + Λ ◦ D_{i_m}† ◦ Λ ◦ D_{i_m} ◦ … ◦ D_{i_j}† ◦ δΛ_{i_j,j} ◦ D_{i_j} ◦ … ◦ D_{i₁}† ◦ Λ ◦ D_{i₁}
  + … + Λ ◦ D_{i_m}† ◦ Λ ◦ D_{i_m} ◦ … ◦ D_{i₁}† ◦ δΛ_{i₁,1} ◦ D_{i₁} + O(δΛ²_{i_j,j}).   (3.18)

We define

S^{(0)}_{ī_m} := Λ ◦ D_{i_m}† ◦ Λ ◦ D_{i_m} ◦ … ◦ D_{i₁}† ◦ Λ ◦ D_{i₁},   (3.19)

S^{(1)}_{ī_m} := δΛ_{i_{m+1},m+1} ◦ D_{i_m}† ◦ Λ ◦ D_{i_m} ◦ … ◦ D_{i₁}† ◦ Λ ◦ D_{i₁}
  + … + Λ ◦ D_{i_m}† ◦ Λ ◦ D_{i_m} ◦ … ◦ D_{i_j}† ◦ δΛ_{i_j,j} ◦ D_{i_j} ◦ … ◦ D_{i₁}† ◦ Λ ◦ D_{i₁}
  + … + Λ ◦ D_{i_m}† ◦ Λ ◦ D_{i_m} ◦ … ◦ D_{i₁}† ◦ δΛ_{i₁,1} ◦ D_{i₁}.   (3.20)
F^{(k)}_g(m, ψ) := Σ_{j=0}^{k} tr[ S^{(j)}_m(ρ_ψ) E_ψ ],   (3.22)

so that,

S_m = Σ_{k=0}^{m+1} S^{(k)}_m,   (3.23)

and

F_g(m, ψ) = F^{(m+1)}_g(m, |ψ⟩) = Σ_{j=0}^{m+1} tr[ S^{(j)}_m(ρ_ψ) E_ψ ].   (3.24)

1. Zeroth Order Model

…

A₀ := Tr[ E_ψ Λ( ρ_ψ − 𝟙/d ) ]   (3.27)

and

B₀ := Tr[ E_ψ Λ( 𝟙/d ) ].   (3.28)

Hence, assuming the simplest (ideal) scenario, where the noise operator at each step is independent of the applied gate (and is also time-invariant), F_g(m, ψ) = F^{(0)}_g(m, |ψ⟩) = A₀ p^m + B₀ decays exponentially in the sequence length m.

2. First Order Model

To find F^{(1)}_g(m, |ψ⟩) we note that in the definition of S^{(1)}_{ī_m} given by Eq. (3.20) there are (m+1 choose 1) = m + 1 first-order perturbation terms which contain the gate dependence. First, we consider the m − 1 terms with j ∈ {2, …, m}. For each such j, averaging over the {i₁, …, i_m} gives a term of the form,
sums are independent. More precisely, the above can be written as,

(1/|Clif_n|²) Σ_{i_{j−1}, i_j} Λ ◦ Λ_d^{m−j} ◦ D_{i_{j−1}}† ◦ C_{i_j}† ◦ δΛ_{i_j,j} ◦ C_{i_j} ◦ Λ ◦ D_{i_{j−1}} ◦ [ (1/|Clif_n|^{j−2}) Σ_{i_{j−2},…,i₁} D_{i_{j−2}}† ◦ Λ ◦ D_{i_{j−2}} ◦ … ◦ D_{i₁}† ◦ Λ ◦ D_{i₁} ],

where Q_j := (1/|Clif_n|) Σ_i C_i† ◦ Λ_{i,j} ◦ C_i and the subscript d represents the depolarization of the operator within brackets. Using the fact that depolarizing channels commute, we get,

…

For the term with j = 1, averaging over i₁, …, i_m gives a term of the form,

(1/|Clif_n|) Σ_{i₁} Λ ◦ Λ_d^{m−1} ◦ D_{i₁}† ◦ δΛ_{i₁,1} ◦ D_{i₁} = Λ ◦ Λ_d^{m−1} ◦ (Q₁ − Λ_d),   (3.32)

where

Q₁ = (1/|Clif_n|) Σ_i C_i† ◦ Λ_{i,1} ◦ C_i.   (3.33)

Lastly, for the term with j = m + 1, averaging gives,

(1/|Clif_n|^m) Σ_{i₁…i_m} δΛ_{i_{m+1},m+1} ◦ D_{i_m}† ◦ Λ ◦ D_{i_m} ◦ … ◦ D_{i₁}† ◦ Λ ◦ D_{i₁}
  = (1/|Clif_n|^{m−1}) Σ_{i₁…i_{m−1}} [ (1/|Clif_n|) Σ_{i_m} δΛ_{i_{m+1},m+1} ◦ D_{i_m}† ◦ Λ ◦ D_{i_m} ] ◦ … ◦ D_{i₁}† ◦ Λ ◦ D_{i₁}.   (3.34)

Since Clif_n is a group, if i₁, …, i_{m−1} is fixed, averaging over the i_m index runs through every Clifford element with equal frequency in the D_{i_m} random variable. Since Λ_{i_{m+1},m+1} is just the error associated with the gate D_{i_m}†, (1/|Clif_n|) Σ_{i_m} δΛ_{i_{m+1},m+1} ◦ D_{i_m}† ◦ Λ ◦ D_{i_m} is independent of the i₁, …, i_{m−1} indices. Hence we can define

R_{m+1} := (1/|Clif_n|) Σ_{i_m} Λ_{i_{m+1},m+1} ◦ D_{i_m}† ◦ Λ ◦ D_{i_m}
  = (1/|Clif_n|) Σ_i Λ_{i′,m+1} ◦ C_i† ◦ Λ ◦ C_i,   (3.35)

where Λ_{i′,m+1} denotes the error that arises when the Clifford operation C_i† is applied at the final time-step m + 1. Again, using the group property of Clif_n, we have,

R_{m+1} = (1/|Clif_n|) Σ_i Λ_{i,m+1} ◦ C_i ◦ Λ ◦ C_i†.   (3.36)

This decoupling of R_{m+1} allows us to write,

(1/|Clif_n|^{m−1}) Σ_{i₁…i_{m−1}} [ (1/|Clif_n|) Σ_{i_m} δΛ_{i_{m+1},m+1} ◦ D_{i_m}† ◦ Λ ◦ D_{i_m} ] ◦ … ◦ D_{i₁}† ◦ Λ ◦ D_{i₁}
  = (R_{m+1} − Λ ◦ Λ_d) ◦ Λ_d^{m−1}.   (3.37)

Altogether,

S^{(0)}_m + S^{(1)}_m = Λ ◦ Λ_d^m + (R_{m+1} − Λ ◦ Λ_d) ◦ Λ_d^{m−1} + Σ_{j=2}^{m} Λ ◦ [ (Q_j ◦ Λ)_d − Λ_d² ] ◦ Λ_d^{m−2} + Λ ◦ Λ_d^{m−1} ◦ (Q₁ − Λ_d)
  = R_{m+1} ◦ Λ_d^{m−1} + Σ_{j=2}^{m} Λ ◦ (Q_j ◦ Λ)_d ◦ Λ_d^{m−2} + Λ ◦ Λ_d^{m−1} ◦ Q₁ − m (Λ ◦ Λ_d^m).   (3.38)

To calculate F^{(1)}_g(m, |ψ⟩) := tr[ (S^{(0)}_m + S^{(1)}_m)(ρ_ψ) E_ψ ] we have,

tr( Λ ◦ Λ_d^m (ρ_ψ) E_ψ ) = A₀ p^m + B₀,   (3.42)

F^{(1)}_g(m, |ψ⟩) = G_{1,m+1} p^{m−1} + H_{1,m+1} + Σ_{j=2}^{m} ( A₀ q_j p^{m−2} + B₀ ) + A_{1,1} p^{m−1} + B₀ − m ( A₀ p^m + B₀ )
  = p^{m−1} ( G_{1,m+1} + A_{1,1} − A₀ p ) + (m − 1) A₀ p^{m−2} ( (Σ_{j=2}^{m} q_j)/(m − 1) − p² ) + H_{1,m+1}.   (3.43)
‖ (1/|Clif_n|^m) Σ_{ī_m} Λ ◦ D_{i_m}† ◦ Λ ◦ D_{i_m} ◦ … ◦ D_{i_{j₂}}† ◦ δΛ_{i_{j₂}} ◦ D_{i_{j₂}} ◦ … ◦ D_{i_{j₁}}† ◦ δΛ_{i_{j₁}} ◦ D_{i_{j₁}} ◦ … ◦ D_{i₁}† ◦ Λ ◦ D_{i₁} ‖^H_{1→1}
  ≤ ‖Λ‖^H_{1→1} (1/|Clif_n|^m) Σ_{ī_m} ‖ D_{i_m}† ◦ Λ ◦ D_{i_m} ‖^H_{1→1} … ‖ D_{i_{j₂}}† ◦ δΛ_{i_{j₂}} ◦ D_{i_{j₂}} ‖^H_{1→1} … ‖ D_{i_{j₁}}† ◦ δΛ_{i_{j₁}} ◦ D_{i_{j₁}} ‖^H_{1→1} … ‖ D_{i₁}† ◦ Λ ◦ D_{i₁} ‖^H_{1→1}
  = ( ‖Λ‖^H_{1→1} )^{m−1} [ (1/|Clif_n|) Σ_{i_{j₂}} ‖ D_{i_{j₂}}† ◦ δΛ_{i_{j₂}} ◦ D_{i_{j₂}} ‖^H_{1→1} ] [ (1/|Clif_n|) Σ_{i_{j₁}} ‖ D_{i_{j₁}}† ◦ δΛ_{i_{j₁}} ◦ D_{i_{j₁}} ‖^H_{1→1} ]
  ≤ [ (1/|Clif_n|) Σ_{i_{j₂}} ‖ D_{i_{j₂}}† ◦ δΛ_{i_{j₂}} ◦ D_{i_{j₂}} ‖^H_{1→1} ] [ (1/|Clif_n|) Σ_{i_{j₁}} ‖ D_{i_{j₁}}† ◦ δΛ_{i_{j₁}} ◦ D_{i_{j₁}} ‖^H_{1→1} ],

where we define the time-dependent variation in the noise,

γ_j := (1/|Clif_n|) Σ_i ‖ Λ_{i,j} − Λ ‖^H_{1→1}.   (4.3)

Summing over all j₁, j₂ with j₂ > j₁ gives,

‖ S^{(2)}_m ‖^H_{1→1} = ‖ (1/|Clif_n|^m) Σ_{ī_m} S^{(2)}_{ī_m} ‖^H_{1→1}
  = ‖ (1/|Clif_n|^m) Σ_{ī_m} Σ_{j₂>j₁} Λ ◦ D_{i_m}† ◦ Λ ◦ D_{i_m} ◦ … ◦ D_{i_{j₂}}† ◦ δΛ_{i_{j₂}} ◦ D_{i_{j₂}} ◦ … ◦ D_{i_{j₁}}† ◦ δΛ_{i_{j₁}} ◦ D_{i_{j₁}} ◦ … ◦ D_{i₁}† ◦ Λ ◦ D_{i₁} ‖^H_{1→1}
  ≤ (1/|Clif_n|^m) Σ_{j₂>j₁} Σ_{ī_m} ‖ Λ ◦ D_{i_m}† ◦ Λ ◦ D_{i_m} ◦ … ◦ D_{i_{j₂}}† ◦ δΛ_{i_{j₂}} ◦ D_{i_{j₂}} ◦ … ◦ D_{i_{j₁}}† ◦ δΛ_{i_{j₁}} ◦ D_{i_{j₁}} ◦ … ◦ D_{i₁}† ◦ Λ ◦ D_{i₁} ‖^H_{1→1}
  ≤ Σ_{j₂>j₁} γ_{j₂} γ_{j₁}.   (4.4)

In terms of the fidelity, we thus have from Eqs. (4.1) and (4.4),

| F^{(2)}_g(m, |ψ⟩) − F^{(1)}_g(m, |ψ⟩) | ≤ Σ_{j₂>j₁} γ_{j₂} γ_{j₁}.   (4.5)

Bounding each γ_j by a constant γ gives,

Σ_{j₂>j₁} γ² = ((m + 1) m / 2) γ².   (4.6)

…for any superoperator norm ‖·‖ that satisfies the properties listed above, the following inequality holds,

| F^{(k+1)}_g(m, ψ) − F^{(k)}_g(m, ψ) | ≤ (m+1 choose k) γ^k,   (4.11)

where

γ := (1/|Clif_n|) Σ_i ‖ Λ_i − Λ ‖.   (4.12)
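Eqs. (4.5)–(4.6) give a quick numerical criterion for when the first order model suffices. The helper below is our own illustration (not from the paper): it evaluates the bound (m + 1)m γ²/2 on the neglected second-order contribution, assuming the average noise variation γ_j is at most γ at every time step.

```python
def second_order_bound(m, gamma):
    """Bound on |F_g^(2) - F_g^(1)| from Eqs. (4.5)-(4.6), assuming
    gamma_j <= gamma for every time step j."""
    return (m + 1) * m / 2 * gamma ** 2

# With gamma ~ 1e-3, sequences of length m = 100 pick up at most ~5e-3
# error from the neglected second-order terms.
print(second_order_bound(100, 1e-3))
```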
it is unlikely to arise in practice. This discussion shows that a constant fidelity decay curve can only occur in extreme cases, and so it is safe to assume the protocol is independent of state preparation and measurement errors.

In terms of connections between the average error rate r and relevant fault-tolerant measures of error, it is natural to ask how the error rate r between Λ and I is related to the diamond norm between Λ and I. In general an explicit relationship will be impossible to obtain; however, we show that in certain cases that are relevant in various fault-tolerant noise models we can obtain such a relationship. First we give a new proof of a previously established result [38] for calculating the diamond norm distance between generalized Pauli channels. The proof we present here illustrates how one can apply a semidefinite program to calculate the diamond norm distance between quantum channels [49]. Ideally, this proof technique could be used to either explicitly calculate or place bounds on the diamond norm distance between more general classes of quantum channels. This could allow for obtaining further relationships between r and the diamond norm distance which hold in more general cases.

To prove Eq. (5.4) using the semidefinite program in [49], first note that Φ = E1 − E2 has action,

Φ(ρ) = Σ_{i=0}^{d²−1} (q_i − r_i) P_i ρ P_i†.   (5.5)

Primal problem: Maximize ⟨J(Φ), W⟩ subject to W ≤ 𝟙_d ⊗ ρ, W ∈ Pos(C^d ⊗ C^d), ρ ∈ D(C^d).

Dual problem: Minimize ‖tr₁(Z)‖_∞ subject to Z ≥ J(Φ), Z ∈ Pos(C^d ⊗ C^d),

where J(Φ) is the Choi matrix [50] of Φ. If α and β are the solutions to the primal and dual problems, then the case α = β is called strong duality. It is shown in [49] that the above semidefinite program always has the property of strong duality, and the solution to the program is α = ½‖E1 − E2‖⋄. Note also that it is always the case that α ≤ β.

By definition,

J(Φ) = d (Φ ⊗ I)(|ψ₀⟩⟨ψ₀|) = d Σ_{i=0}^{d²−1} (q_i − r_i) (P_i ⊗ 𝟙) |ψ₀⟩⟨ψ₀| (P_i† ⊗ 𝟙).   (5.6)
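For generalized Pauli channels, the program evaluates to half the ℓ₁ distance between the Pauli probability vectors, i.e. ‖E1 − E2‖⋄ = ‖q − r‖₁ (Eq. (5.4)). The sketch below is our own numerical check of this formula against the depolarizing-channel corollary of Eq. (5.9), using the Pauli probability vector of a depolarizing channel implied by Eq. (5.10).

```python
import numpy as np

def depolarizing_pauli_probs(p, d):
    """Pauli probability vector of a depolarizing channel with fidelity
    parameter p: q0 = ((d^2 - 1)p + 1)/d^2 as in Eq. (5.10), with the
    remaining d^2 - 1 Pauli terms sharing the rest equally."""
    q0 = ((d ** 2 - 1) * p + 1) / d ** 2
    return np.array([q0] + [(1 - p) / d ** 2] * (d ** 2 - 1))

def diamond_distance_pauli(q, r):
    """||E1 - E2||_diamond = ||q - r||_1 for generalized Pauli channels."""
    return float(np.abs(np.asarray(q) - np.asarray(r)).sum())

d, p1, p2 = 2, 0.95, 0.90
q = depolarizing_pauli_probs(p1, d)
r = depolarizing_pauli_probs(p2, d)
# Agrees with Eq. (5.9): 2|p1 - p2|(d^2 - 1)/d^2.
assert np.isclose(diamond_distance_pauli(q, r),
                  2 * abs(p1 - p2) * (d ** 2 - 1) / d ** 2)
```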
As a simple corollary to Eq. (5.4), note that if E1 and E2 are depolarizing channels with fidelity parameters p₁ and p₂ respectively, then,

‖E1 − E2‖⋄ = 2|p₁ − p₂|(d² − 1)/d².   (5.9)

To see this, note that

q₀ = ( (d + 1) F_{E1,I} − 1 ) / d = ( (d + 1)( p₁ + (1 − p₁)/d ) − 1 ) / d = ( (d² − 1) p₁ + 1 ) / d²,   (5.10)

and similarly,

r₀ = ( (d² − 1) p₂ + 1 ) / d².   (5.11)

Hence,

‖E1 − E2‖⋄ = ‖v‖₁ = |q₀ − r₀| + Σ_{i=1}^{d²−1} |q_i − r_i|
  = | ( (d² − 1) p₁ + 1 )/d² − ( (d² − 1) p₂ + 1 )/d² | + (d² − 1) | (1 − p₁)/d² − (1 − p₂)/d² |
  = 2 (d² − 1) |p₁ − p₂| / d².   (5.14)

We know that q₀ is related to the average fidelity of E1, F_{E1,I}, by

F_{E1,I} = ( q₀ d + 1 ) / ( d + 1 ),   (5.16)

and so,

‖E1 − I‖⋄ = 2 (d + 1)(1 − F_{E1,I}) / d.   (5.17)

Therefore, in the case of randomized benchmarking (where we define the error rate r = 1 − F_{Λ,I}), if Λ is a generalized Pauli channel, r and ‖Λ − I‖⋄ are related by,

‖Λ − I‖⋄ = 2 (d + 1) r / d.   (5.18)

…over |Clif_n|^m sequences (i₁, …, i_m); exactly computing this average for each m is clearly inefficient. The benchmarking protocol requires choosing a sequence at random, evaluating the above fidelity, repeating for many sequences, and taking the average of the results.

2. Uniform sampling: Since the size of the Clifford group scales as 2^{O(n²)}, sampling directly from a list of all Clifford elements becomes impossible for large n (writing down every element is inefficient in n).

3. Implementing Clifford operations: In practice, one can only implement a generating set for the Clifford group. Hence even if random sampling can be accomplished, there must be a scalable method for implementing each Clifford using only this generating set.

Solution to 1: Let S_k(m, |ψ⟩) be the normalized k-fold sum of the random variable F^{ī_m}_g(m, |ψ⟩), and note that E[S_k(m, |ψ⟩)] = F_g(m, ψ). A probabilistic bound on |S_k(m, |ψ⟩) − F_g(m, ψ)| is given by Hoeffding's inequality,

P( |S_k(m, |ψ⟩) − F_g(m, ψ)| ≥ ε ) ≤ 2 e^{ −2(kε)² / ( k(b−a)² ) } = 2 e^{ −2kε² / (b−a)² },   (6.2)

where [a, b] is the range of F^{ī_m}_g(m, |ψ⟩). Since F^{ī_m}_g(m, |ψ⟩) is a fidelity, it must lie in [0, 1] (in reality it will lie in a much smaller interval; for now we continue to assume it lies in [a, b] ⊆ [0, 1]). Suppose we want

P( |S_k(m, |ψ⟩) − F_g(m, ψ)| ≥ ε ) ≤ δ,   (6.3)

where ε represents the accuracy of the estimate and 1 − δ represents the desired confidence level. We can find how many trials one needs to perform to obtain this accuracy by setting δ = 2 e^{ −2kε²/(b−a)² } and solving for k,

k = ln(2/δ) (b − a)² / (2ε²).   (6.4)

Note that k is explicitly independent of m and n, which provides a solution to 1.

It is instructive to obtain an estimate of the size of k for realistic parameter values of δ and ε. Since 1 − δ represents our desired confidence level, we set δ = 0.05. Fault-tolerance provides a wide range for the error tolerance of a physical (0-level) gate in the fault-tolerant construction. The value of the error tolerance depends on both the coding scheme as well as the noise model, and typical values lie somewhere between 10⁻⁶ and 10⁻². Let us assume that the physical gates have errors on the order of 10⁻⁴. Intuitively, since the fidelity curve decays in sequence length, it is reasonable to assume that ε can be relaxed as m grows large. Similarly, b − a can be assumed to be relatively small for small values of m but will converge to 1 − 1/d as m grows large. As a result, both b − a and ε have an implicit dependence on m, and this implicit dependence is advantageous when choosing ε for large values of m. Let us assume m = 100 and a fidelity decay curve that is well-approximated by an exponential. Then we expect fidelity values on the order of 0.99 at this value of m, and so we take ε = 10⁻³, b − a = 0.2. With these values for ε, δ and b − a we get,

k = ln(2/0.05) (0.2)² / ( 2 (10⁻³)² ) ∼ 7 × 10⁴.   (6.5)

While this number is large, it is independent of n and thus compares favourably with quantum process tomography, which scales as 16ⁿ. As a direct comparison, performing process tomography on a 4-qubit system already requires 65536 measurements.

Solution to 2: For the second problem we present a method to scalably sample uniformly from the full Clifford group that utilizes the symplectic representation of the Clifford group (see Refs. [51, 52]). Since the Clifford group is the normalizer of the Pauli group, every Clifford element is completely determined by its action under conjugation on the Pauli group. In particular, since the Pauli group is generated by the set of all X_i and Z_i (the label i refers to X or Z being in the i'th position with identity operators elsewhere), an element of the Clifford group is completely determined by its action on this set. In the symplectic representation, this corresponds to each Clifford element Q being associated uniquely with a 2n by 2n binary symplectic matrix C and a length-2n binary vector h which records negative signs in the images of the X_i and Z_i. The only constraints on Q are that commutation relations and Hermiticity of the generating set must be preserved under Q. Hence we can construct a random Clifford element Q by inductively constructing a random symplectic matrix C and vector h.

Since h corresponds to keeping track of negative signs, the binary entries of h can be chosen uniformly at random. C is inductively constructed column by column, where the first n columns correspond to the images of X₁ through X_n, and the last n columns correspond to the images of Z₁ through Z_n (all of which are written in binary notation as in [52]). Preservation of commutation relations is phrased through the symplectic inner product, and so at each step one chooses the new column by finding a random solution to a system of linear equations which represents the inner product conditions. Since randomly choosing 2n elements of the Pauli group that satisfy the required commutation relations is equivalent to inductively choosing random solutions to 2n sets of linear equations (which requires O(n³) operations), we can produce a random Clifford element in O(n⁴) (classical) operations.

Solution to 3: Any Clifford element can be decomposed into a sequence of O(n²) one- and two-qubit generators in O(n²) time [52] (alternatively, there are slower methods which produce a "canonical" decomposition into O(n²/log n) generators [53]). We describe this method, which again utilizes the symplectic representation of the…
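The arithmetic behind Eq. (6.5) is easy to reproduce. The snippet below (ours, purely illustrative) evaluates Eq. (6.4) for the parameter values quoted in the text.

```python
import math

def num_repetitions(epsilon, delta, spread):
    """Eq. (6.4): k = ln(2/delta) * (b - a)^2 / (2 * epsilon^2)."""
    return math.log(2 / delta) * spread ** 2 / (2 * epsilon ** 2)

# delta = 0.05, epsilon = 1e-3, b - a = 0.2, as in the text.
k = num_repetitions(epsilon=1e-3, delta=0.05, spread=0.2)
print(round(k))   # ~ 7 x 10^4, matching Eq. (6.5)
```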
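The inductive construction in the Solution to 2 can be prototyped directly for small n. The sketch below is our own illustration, not the O(n³)-per-column method of [52]: each column of C is chosen as a random solution of the symplectic inner-product constraints with the columns already fixed, found here by brute-force enumeration over all 2^{2n} binary vectors (with linear independence enforced so the construction can always continue), and the result is verified to satisfy CᵀJC = J (mod 2).

```python
import itertools
import random
import numpy as np

def gf2_rank(rows):
    """Rank of a binary matrix over GF(2), by Gaussian elimination."""
    a = np.array(rows, dtype=int) % 2
    rank = 0
    for c in range(a.shape[1]):
        pivots = [r for r in range(rank, a.shape[0]) if a[r, c]]
        if not pivots:
            continue
        a[[rank, pivots[0]]] = a[[pivots[0], rank]]
        for r in range(a.shape[0]):
            if r != rank and a[r, c]:
                a[r] = (a[r] + a[rank]) % 2
        rank += 1
    return rank

def symplectic_form(n):
    """J with <u, v> = u^T J v (mod 2); the columns of C must satisfy
    <c_a, c_b> = J[a, b] (x-images pairwise commute, x_i pairs with z_i)."""
    z, i = np.zeros((n, n), dtype=int), np.eye(n, dtype=int)
    return np.block([[z, i], [i, z]])

def random_symplectic(n, rng):
    """Inductively build a random 2n x 2n binary symplectic matrix."""
    J = symplectic_form(n)
    candidates = [np.array(v) for v in itertools.product((0, 1), repeat=2 * n)]
    cols = []
    for k in range(2 * n):
        # Random solution of the inner-product constraints with the columns
        # chosen so far, kept linearly independent over GF(2).
        valid = [v for v in candidates
                 if gf2_rank(cols + [v]) == k + 1
                 and all((cols[a] @ J @ v) % 2 == J[a, k] for a in range(k))]
        cols.append(rng.choice(valid))
    return np.column_stack(cols)

C = random_symplectic(2, random.Random(0))
J = symplectic_form(2)
assert np.array_equal((C.T @ J @ C) % 2, J)   # C is symplectic over GF(2)
```

The scalable version replaces the brute-force enumeration with a direct random solution of each binary linear system, giving the O(n³)-per-column (O(n⁴) total) cost stated in the text.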
…to be benchmarked for universal quantum computation are Clifford gates.

Various interesting questions and comments arise from the benchmarking analysis presented here. First, there is a key point to emphasize regarding the zeroth and first order fitting models. As depicted in [25], there exist physically relevant noise models for which, when the true value of the depolarization fidelity parameter p is used, the first order model fits the experimental data much better than the zeroth order model. However, it may be the case that a least squares fitting procedure using the functional form of the zeroth order model produces a very good fit to the experimental data, albeit producing an incorrect value for p. Therefore, in order to obtain a more accurate value for p, one should always use the first order fitting model unless prior knowledge of the noise indicates that it is effectively gate-independent.

It will be useful to obtain a better understanding of when a least squares fitting procedure using the zeroth order model produces a value for p that is close to its true value. Clearly, in the gate-independent case the zeroth order model fits the fidelity decay curve exactly. Moreover, for weakly gate-dependent noise one can see from our continuity argument that the zeroth order model is still a sufficient fitting function for the fidelity decay curve. Hence the most interesting case to analyze is when there is a non-negligible amount of gate-dependence in the noise and the condition for using the first order model to fit the decay curve is satisfied. A useful test that would indicate gate-dependence in the noise, and thus the validity of the value of p obtained from fitting to the zeroth order model, is to perform the least squares fitting procedure using both the zeroth and first order fitting models. If the estimates of p obtained in each case differ significantly, then the zeroth order model must be a poor choice of fitting function even though it may fit the data well. In this case the noise must have a strong gate-dependence, because otherwise q − p² would be small, which implies the two fitting functions would produce similar estimates for p.

An interesting question is how to extract a meaningful average error rate over a generating set of the Clifford group, for instance G_n defined previously, from the average error rate r over the entire Clifford group. One might argue that benchmarking a generating set for the Clifford group is sufficient for benchmarking the full Clifford group; however, it is entirely plausible that noise correlations between the n physical qubits create large errors on elements of Clif_n, even when the errors on the generating set can be controlled [55]. In fact, an assumption that is often made in fault-tolerant estimates is that the correlation in noise between qubits is either small or can be ignored.

With regards to scalability, while we have shown the protocol itself is scalable in n, a useful direction for further research would be an analysis of how the sufficient condition of weak average variation of the noise depends on n. As previously noted, the noise associated to a multi-qubit Clifford element is given by the noise associated to the sequence of generators comprising the Clifford. A determination of whether these noise operators continue to satisfy the sufficient condition when it is met for small numbers of qubits will be useful for understanding the applicability of the protocol.

Rigorous fault-tolerant analyses sometimes invoke the diamond norm as a measure of the error strength rather than the weaker characterization provided by the average fidelity. Hence it is desirable to find relationships between these two quantities that are more general than the special case of random Pauli errors presented here. As mentioned above, the semidefinite program we have used to deduce the relationship appears to be a promising tool for further research in this area. From the expression given in Eq. (2.2), one can see that the diamond norm is essentially a "worst-case" maximization over input (entangled) states. In quantum computation, the measure of accessible states (states that can be reached in polynomial time using a generating set for the unitary group) is equal to 0. Hence there is a high probability that the maximization criterion demanded by the diamond norm is a much stronger condition than necessary for understanding the strength of the errors affecting the computation. This point becomes even more relevant for an algorithm-specific (i.e. non-universal) quantum computer. An interesting direction of further research is to provide precise conditions for when the average fidelity provides an indication of, or bound on, the error strength in terms of stronger characterizations such as the diamond norm.

Additionally, if one were able to obtain an estimate of the minimum gate fidelity from knowledge of the average fidelity, they could use the direct relationship between the minimum gate fidelity and diamond norm given by Eq. (2.21) to obtain information about the error strength in terms of the diamond norm. A result that may be useful in this direction of research is the "concentration of measure effect" of the gate fidelity, which implies that as n increases, the measure of the set of states which produce a fidelity close to the minimum yet far from the average is exponentially small in n [41, 42].
[1] P. Shor, in Proceedings of the 35th Annual Symposium on Foundations of Computer Science (FOCS) (IEEE Press, Los Alamitos, CA, 1994).
[2] A. Harrow, A. Hassidim, and S. Lloyd, Phys. Rev. Lett. 103, 150502 (2009).
[3] R. Feynman, International Journal of Theoretical Physics 21 (1982).
[4] S. Lloyd, Science 273, 1073 (1996).
[5] P. Shor, Phys. Rev. A 52, R2493 (1995).
[6] A. Calderbank and P. Shor, Phys. Rev. A 54, 1098…