Problema de Golbach - Matematicas

MAJOR ARCS FOR GOLDBACHS PROBLEM
arXiv:1305.2897v1 [math.NT] 13 May 2013
H. A. HELFGOTT
Abstract. The ternary Goldbach conjecture, or three-primes problem, asserts that every odd integer n greater than 5 is the sum of three primes. The
present paper proves this conjecture.
Both the ternary Goldbach conjecture and the binary, or strong, Goldbach
conjecture had their origin in an exchange of letters between Euler and Goldbach in 1742. We will follow an approach based on the circle method, the
large sieve and exponential sums, supplemented by rigorous computations, including a verification of zeros of L-functions due to D. Platt. The improved
estimates on exponential sums are proven in a twin paper by the author.
Contents
1. Introduction
1.1. Results
1.2. History
1.3. Main ideas
1.4. Acknowledgments
2. Preliminaries
2.1. Notation
2.2. Dirichlet characters and L functions
2.3. Fourier transforms
2.4. Mellin transforms
3. Preparatory work on major arcs
3.1. Decomposition of S (, x) by characters
3.2. The integral over the major arcs
4. Optimizing and coordinating smoothing functions
4.1. The symmetric smoothing function
4.2. The approximation + to
4.3. The smoothing function
5. Mellin transforms and smoothing functions
5.1. Exponential sums and L functions
5.2. How to choose a smoothing function?
5.3. The Mellin transform of the twisted Gaussian
5.4. Totals
6. Explicit formulas
6.1. A general explicit formula
2
6.2. Sums and decay for (t) = t2 et /2 and (t)
6.3. Sums and decay for + (t)
6.4. A verification of zeros and its consequences
7. The integral of the triple product over the minor arcs
7.1. The L2 norm over arcs: variations on the large sieve for primes
7.2. Bounding the quotient in the large sieve for primes
7.3. Putting together 2 bounds over arcs and bounds
1
2
2
2
4
7
8
8
8
9
9
10
11
12
22
23
24
30
36
36
37
38
53
55
56
61
66
78
81
81
86
92
H. A. HELFGOTT
8. Conclusion
8.1. The 2 norm over the major arcs
8.2. The total major-arcs contribution
8.3. Minor-arc totals
8.4. Conclusion: proof of main theorem
Appendix A. Sums over primes
Appendix B. Sums involving (q)
Appendix C. Validated numerics
C.1. Integrals of a smoothing function
C.2. Extrema via bisection and truncated series
References
102
102
103
108
114
115
117
120
120
122
130
1. Introduction
1.1. Results. The ternary Goldbach conjecture (or three-prime problem) states
that every odd number n greater than 5 can be written as the sum of three
primes. Both the ternary Goldbach conjecture and the (stronger) binary Goldbach conjecture (stating that every even number greater than 2 can be written
as the sum of two primes) have their origin in the correspondence between Euler
and Goldbach (1742).
I. M. Vinogradov [Vin37] showed in 1937 that the ternary Goldbach conjecture
is true for all n above a large constant C. Unfortunately, while the value of C has
been improved several times since then, it has always remained much too large
(C = e3100 , [LW02]) for a mechanical verification up to C to be even remotely
feasible.
The present paper proves the ternary Goldbach conjecture.
Main Theorem. Every odd integer n greater than 5 can be expressed as the sum
of three primes.
The proof given here works for all n C = 1030 . The main theorem has
been checked deterministically by computer for all n < 1030 (and indeed for all
n 8.875 1030 ) [HP].
We are able to set major arcs to be few and narrow because the minor-arc
estimates in [Hel] are very strong; we are forced to take them to be few and
narrow because of the kind of L-function bounds we will rely upon.
At issue are
(1) a fuller use of the close relation between the circle method and the large
sieve;
(2) a combination of different smoothings for different tasks;
(3) the verification of GRH up to a bounded height for all conductors q
150000 and all even conductors q 300000 (due to David Platt [Plab]);
(4) better bounds for exponential sums, as in [Hel].
All major computations including D. Platts work in [Plab] have been
conducted rigorously, using interval arithmetic.
1.2. History. The following brief remarks are here to provide some background;
no claim to completeness is made. Results on exponential sums over the primes
are discussed more specifically in [Hel, 1].
1.2.1. Results towards the ternary Goldbach conjecture. Hardy and Littlewood
[HL23] proved that every odd number larger than a constant C is the sum of three
primes, conditionally on the generalized Riemann Hypothesis. This showed, as
they said, that the problem was not unergreifbar (as it had been called by Landau
in [Lan12]).
Vinogradov [Vin37] made the result unconditional. An explicit value for C
15
(namely, C = 33 ) was first found by Borodzin in 1939. This value was improved
to C = 3.33 1043000 by J.-R. Chen and T. Z. Wang [CW89] and to C = 2 101346
by M.-Ch. Liu and T. Wang [LW02]. (J.-R. Chen had also proven that every
large enough even number is either the sum of two primes or the sum p1 + p2 p3
of a prime p1 and the product p2 p3 of two primes.)
In [DEtRZ97], the ternary Goldbach conjecture was proven for all n conditionally on the generalized Riemann hypothesis.
1.2.2. Checking Goldbach for small n. Numerical verifications of the binary Goldbach conjecture for small n were published already in the late nineteenth century; see [Dic66, Ch. XVIII]. Richstein [Ric01] showed that every even integer
4 n 4 1014 is the sum of two primes. Oliveira e Silva, Herzog and Pardi
[OeSHP13] have proven that every even integer 4 n 4 1018 is the sum of two
primes.
The question is then until what point one can establish the ternary Goldbach
conjecture using [OeSHP13]. Clearly, if one can show that every interval of length
4 1018 within [1, N ] contains a prime, then [OeSHP13] implies that every odd
number between 7 and N can be written as the sum of three primes. This was
used in a first version of [Hel] to show that the best existing result on prime
gaps ([RS03], with [Plaa] as input) implies that every odd number between 7 and
1.23 1027 is the sum of three primes. A more explicit approach to prime gaps
[HP] now shows that every odd integer 7 n 8.875694 1030 is the sum of three
primes.
1.2.3. Work on Schnirelmans constant. Schnirelmans constant is a term for
the smallest k such that every integer n > 1 is the sum of at most k primes.
(Thus, Goldbachs binary and ternary conjecture, taken together, are equivalent
to the statement that Schnirelmans constant is 3.) In 1930, Schnirelman [Sch33]
showed that Schnirelmans constant k is finite, developing in the process some of
the bases of what is now called additive or arithmetic combinatorics.
In 1969, Klimov proved that k 6 109 ; he later improved this result to
k 115 [KPS72]
(with G. Z. Piltay and T. A. Sheptiskaya) and k 55. Results
by Vaughan [Vau77] (k = 27), Deshouillers [Des77] (k = 26) and Riesel-Vaughan
[RV83] (k = 19) then followed.
Ramare showed in 1995 that every even n > 1 is the sum of at most 6 primes
[Ram95]. Recently, Tao [Tao] established that every odd number n > 1 is the
sum of at most 5 primes. These results imply that k 6 and k 5, respectively.
The present paper implies that k 4.
1.2.4. Other approaches. Since [HL23] and [Vin37], the main line of attack on
the problem has gone through exponential sums. There are proofs based on
cancellation in other kinds of sums ([HB85], [IK04, 19]), but they have not
been made to yield practical estimates. The same goes for proofs based on other
principles, such as that of Schnirelmans result or the recent work of X. Shao
H. A. HELFGOTT
[Sha]. (It deserves to be underlined that [Sha] establishes Vinogradovs threeprime result without using L-function estimates at all; the constant C is, however,
extremely large.)
1.3. Main ideas. We will limit the discussion here to the general setup, the
estimates for major arcs and the efficient usage of exponential-sum estimates on
the minor arcs. The development of new exponential-sum estimates is the subject
of [Hel].
In the circle method, the number of representations of a number N as the
sum of three primes is represented as an integral over the circle R/Z, which is
partitioned into major arcs M and minor arcs m = (R/Z) \ M:
Z
X
(S(, x))3 e(N )d
(n1 )(n2 )(n3 ) =
(1.1)
n1 +n2 +n3 =N
R/Z
(S(, x)) e(N )d +
(S(, x))3 e(N )d,

m
Px
where S(, x) =
n=1 (n)e(n). The aim is to show that the sum of the
integral over M and the integral over m is positive; this will prove the threeprimes theorem.
The major arcs M = Mr0 consist of intervals (a/q cr0 /qx, a/q + cr0 /qx)
around the rationals a/q, q r0 , where c is a constant. In previous work1, r0
grew with x; in our setup, r0 is a constant. Smoothing changes the left side of
(1.1) into a weighted sum, but, since we aim at an existence result rather than
at an asymptotic for the number of representations p1 + p2 + p3 of N , this is
obviously acceptable.
Typically,
work on major arcs yields rather precise estimates on the integral
R
over M in (1.1), whereas work
on minor arcs gives upper bounds on the absolute
R
value of the integral over m in (1.1). Exponential-sum estimates, such as those
in [Hel], provide upper bounds for maxm |S(, x)|.
1.3.1. Major arc bounds. We will be working with smoothed sums
(1.2)
S (, x) =
(n)(n)e(n/x)(n/x).
n=1
Our integral will actually be of the form

Z
S+ (, x)2 S (, x)e(N )d,
(1.3)
M
where + and are two different smoothing functions to be discussed soon.

Estimating the sums (1.2) on M reduces to estimating the sums
(1.4)
S (/x, x) =
(n)(n)e(n/x)(n/x)
n=1
for varying among all Dirichlet characters modulo q r0 and for || cr0 /q,
i.e., || small. Using estimates on (1.4) efficiently in the estimation of (1.3) is
a delicate task; this is the subject of 3. Let us now focus on how to obtain
estimates on (1.4).
1Ramar
es work [Ram10] is in principle strong enough to allow r0 to be an unspecified large
constant. Taos work [Tao] reaches this standard only for x of moderate size.
Sums such as (1.4) are estimated using Dirichlet L-functions L(s, ). An explicit formula gives an expression
(1.5)
S, (/x, x) = Iq=1 b()x
F ()x + small error,
where Iq=1 = 1 if q = 1 and Iq=1 = 0 otherwise. Here runs over the complex
numbers with L(, ) = 0 and 0 < () < 1 (non-trivial zeros). The function
F is the Mellin transform of e(t)(t) (see 2.4).
The questions are then: where are the non-trivial zeros of L(s, )? How fast
does F () decay as () ?
Write = (s), = (s). The belief is, of course, that = 1/2 for every
non-trivial zero (Generalized Riemann Hypothesis), but this is far from proven.
Most work to date has used zero-free regions of the form 1 1/C log q| |, C
a constant. This is a classical zero-free region, going back, qualitatively, to de la
Vallee-Poussin (1899). The best values of C known are due to McCurley [McC84]
and Kadiri [Kad05].
These regions seem too narrow to yield a proof of the three-primes theorem.
What we will use instead is a finite verification of GRH up to Tq , i.e., a computation showing that, for every Dirichlet character of conductor q r0 (r0 a
constant, as above), every non-trivial zero = + i with | | Tq satisfies
() = 1/2. Such verifications go back to Riemann; modern computer-based
methods are descended in part from a paper by Turing [Tur53]. (See the historical article [Boo06].) In his thesis [Pla11], D. Platt gave a rigorous verification for
r0 = 105 , Tq = 108 /q. In coordination with the present work, he has extended
this to
all odd q 3 105 , with Tq = 108 /q,
all even q 4 105 , with Tq = max(108 /q, 200 + 7.5 107 /q).
This was a major computational effort, involving, in particular, a fast implementation of interval arithmetic (used for the sake of rigor).
What remains to discuss, then, is how to choose in such a way F () decreases
fast enough as | | increases, so that (1.5) gives a good estimate. We cannot hope
for F () to start decreasing consistently before | | is at least as large as a multiple
of 2||. Since varies within (cr0 /q, cr0 /q), this explains why Tq is taken
inversely proportional to q in the above. As we will work with r0 150000, we
also see that we have little margin for maneuver: we want F () to be extremely
small already for, say, | | 80||. We also have a Scylla-and-Charybdis situation,
courtesy of the uncertainty principle: roughly speaking, F () cannot decrease
faster than exponentially on | |/|| both for || 1 and for large.
The most delicate case is that of large, since then | |/|| is small. It turns
out we can manage to get decay that is much faster than exponential for large,
while no slower than exponential for small. This we will achieve by working
2
with smoothing functions based on the (one-sided) Gaussian (t) = et /2 .
2
The Mellin transform of the twisted Gaussian e(t)et /2 is a parabolic cylinder
function U (a, z) with z purely imaginary. Since fully explicit estimates for U (a, z),
z imaginary, have not been worked in the literature, we will have to derive them
ourselves. This is the subject of 5.
H. A. HELFGOTT
We still have some freedom in choosing + and , though they will have to be
2
based on (t) = et /2 . The main term in our estimate for (1.3) is of the form

Z Z
N
(t1 + t2 ) dt1 dt2 ,
+ (t1 )+ (t2 )
(1.6)
C0
x
0
0
where C0 is a constant. Our upper bound for the minor-arc integral, on the other
hand, will be proportional to |+ |22 | |1 . The question is then how to make (1.6)
divided by |+ |22 | |1 as large as possible. A little thought will show that it is
best for + to be symmetric, or nearly symmetric, around t = 1 (say), and for
be concentrated on a much shorter interval than + , while x is set to be x/2 or
slightly less.
It is easy to construct a function of the form t 7 h(t) (t) symmetric around
t = 1, with support on [0, 2]. We will define + = hH (t) (t), where hH is an
approximation to h that is band-limited in the Mellin sense. This will mean that
the decay properties of the Mellin transform of e(t)+ (t) will be like those of
e(t) (t), i.e., very good.
How to choose ? The bounds in [Hel] were derived for 2 = (2I[1/2,1] ) M
(2I[1/2,1] ), which is nice to deal with in the context of combinatorially flavored
analytic number theory, but it has a Mellin transform that decays much too
slowly.2 The solution is to use a smoothing that is, so to speak, Janus-faced, viz.,
2
= (2 M )(t), where (t) = t2 et /2 and is a large constant. We estimate
sums of type S (, x) by estimating S2 (, x) if S2 (, x) if lies on a minor arc,
or by estimating S (, x) if lies on a major arc. (The Mellin transform of is
just a shift of that of .) This is possible because 2 has support bounded away
from zero, while is also concentrated away from 0.
1.3.2. Minor arc bounds: exponential sums and the large sieve. Let mr be the
complement of Mr . In particular, m = mr0 is the complement of M = Mr0 . Exponential sum-estimates, such as those in [Hel], give bounds on maxmr |S(, x)|
that decrease with r.
We need to do better than
Z
Z

S(, x)3 e(N ) d (max |S(, x)| )
|S(, x)|2 d
m
m

m
Z
(1.7)
2
2
|S(, x)| d ,
(max |S(, x)| ) |S|2
m
as this inequality involves a loss of a factor of log x (because |S|22 x log x).
Fortunately, minor arc estimates are valid not just for a fixed r0 , but for the
complement of Mr , where r can vary within a broad range. By partial summation,
these estimates can be combined with upper bounds for
Z
Z
2
|S(, x)|2 d.
|S(, x)| d
Mr
M r0
Giving an estimate for the integral over Mr0 (r0 a constant) will be part of our
task over the major arcs. The question is how to give an upper bound for the
integral over Mr that is valid and non-trivial over a broad range of r.
2This parallels the situation in the transition from Hardy and Littlewood [HL23] to Vinogradov [Vin37]. Hardy and Littlewood used the smoothing (t) = et , whereas Vinogradov
used the brusque (non-)smoothing (t) = I[0,1] . Arguably, this is not just a case of technological
decay; I[0,1] has compact support and is otherwise easy to deal with in the minor-arc regime.
The answer lies in the deep relation between the circle method and the large
sieve. (This was obviously not available to Vinogradov in 1937; the large sieve
is a slightly later development (Linnik [Lin41], 1941) that was optimized and
fully understood later still.) A large sieve is, in essence, an inequality giving a
discretized version of Plancherels identity. Large sieves for primes show that the
inequality can be sharpened for sequences of prime support, provided that, on
the Fourier side, the sum over frequencies is shortened. The idea here is that
this kind of improvement can be adapted back to the continuous context, so as
to give upper bounds on the L2 norms of exponential sums with prime support
when is restricted
to special subsets of the circle. Such an L2 norm is nothing
R
other than Mr |S(, x)|2 d.
The first version of [Hel] used an idea of Heath-Browns3 that can indeed be
understood in this framework. In 7.1, we shall prove a better bound, based on a
large sieve for primes due to Ramare [Ram09]. We will re-derive this sieve using
an idea of Selbergs. We will then make it fully explicit in the crucial range (7.2).
(This, incidentally, also gives fully explicit estimates for Ramares large sieve in
its original discrete context, making it the best large sieve for primes in a wide
range.)
R
The outcome is that Mr |S(, x)|2 d is bounded roughly by 2x log r, rather
than by x log x (or by 2e x log r, as was the case when Heath-Browns idea was
used). The lack of a factor of log x makes it possible to work with r0 equal to a
constant, as we have done; the factor of e reduces the need for computations by
more than an order of magnitude.
1.4. Acknowledgments. The author is very thankful to D. Platt, who, working in close coordination with him, provided GRH verifications in the necessary
ranges, and also helped him with the usage of interval arithmetic. He is also
much indebted to O. Ramare for his help and feedback, especially regarding 7
and Appendix B. Special thanks are also due to A. Booker, B. Green, R. HeathBrown, H. Kadiri, T. Tao and M. Watkins for discussions on Goldbachs problem
and related issues.
Warm thanks are also due to A. C
ordoba and J. Cilleruelo, for discussions on
the method of stationary phase, and to V. Blomer and N. Temme, for further help
with parabolic cylinder functions. Additional references were graciously provided
by R. Bryant, S. Huntsman and I. Rezvyakova.
Travel and other expenses were funded in part by the Adams Prize and the
Philip Leverhulme Prize. The authors work on the problem started at the Universite de Montreal (CRM) in 2006; he is grateful to both the Universite de Montreal
and the Ecole

Normale Superieure for providing pleasant working environments.
The present work would most likely not have been possible without free and
publicly available software: PARI, Maxima, Gnuplot, VNODE-LP, PROFIL /
BIAS, SAGE, and, of course, LATEX, Emacs, the gcc compiler and GNU/Linux in
general. Some exploratory work was done in SAGE and Mathematica. Rigorous
calculations used either D. Platts interval-arithmetic package (based in part on
Crlibm) or the PROFIL/BIAS interval arithmetic package underlying VNODELP.
The calculations contained in this paper used a nearly trivial amount of resources; they were all carried out on the authors desktop computers at home
3Communicated by Heath-Brown to the author, and by the author to Tao, as acknowledged
in [Tao]. The idea is based on a lemma by Montgomery (as in, e.g., [IK04, Lemma 7.15]).
H. A. HELFGOTT
and work. However, D. Platts computations [Plab] used a significant amount

of resources, kindly donated to D. Platt and the author by several institutions.
This crucial help was provided by MesoPSL (affiliated with the Observatoire de
Paris and Paris Sciences et Lettres), Universite de Paris VI/VII (UPMC - DSI P
ole Calcul), University of Warwick (thanks to Bill Hart), University of Bristol,
France Grilles (French National Grid Infrastructure, DIRAC instance), Universite de Lyon 1 and Universite de Bordeaux 1. Both D. Platt and the author would
like to thank the donating organizations, their technical staff, and all academics
who helped to make these resources available to us.
2. Preliminaries
2.1. Notation. As is usual, we write for the Moebius function, for the von
Mangoldt function. We let (n) be the number of divisors of an integer n and
(n) the number of prime divisors. For p prime, n a non-zero integer, we define
vp (n) to be the largest non-negative integer such that p |n.
We write (a, b) for the greatest common divisor of a and b. If there is any risk
of
with the pair (a, b), we write gcd(a, b). Denote by (a, b ) the divisor
Q confusion
vp (a) of a. (Thus, a/(a, b ) is coprime to b, and is in fact the maximal
p|b p
divisor of a with this property.)
As is customary, we write e(x) for e2ix . We write |f |r for the Lr norm of a
function f . Given x R, we let
if x > 0,
1
sgn(x) = 0
if x = 0,
1 if x < 0.
We write O (R) to mean a quantity at most R in absolute value.
2.2. Dirichlet characters and L functions. A Dirichlet character : Z C

of modulus q is a character of (Z/qZ) lifted to Z with the convention that
(n) = 0 when (n, q) 6= 1. Again by convention, there is a Dirichlet character of
modulus q = 1, namely, the trivial character T : Z C defined by T (n) = 1
for every n Z.
If is a character modulo q and is a character modulo q |q such that (n) =
(n) for all n coprime to q, we say that induces . A character is primitive if

it is not induced by any character of smaller modulus. Given a character , we
write for the (uniquely defined) primitive character inducing . If a character
mod q is induced by the trivial character T , we say that is principal and
write 0 for (provided the modulus q is clear from the context). In other words,
0 (n) = 1 when (n, q) = 1 and 0 (n) = 0 when (n, q) = 0.
A Dirichlet L-function
P L(s, ) ( a Dirichlet character) is defined as the analytic continuation of n (n)ns to the entire complex plane; there is a pole at
s = 1 if is principal.
A non-trivial zero of L(s, ) is any s C such that L(s, ) = 0 and 0 < (s) <
1. (In particular, a zero at s = 0 is called trivial, even though its contribution
can be a little tricky to work out. The same would go for the other zeros with
(s) = 0 occuring for non-primitive, though we will avoid this issue by working
mainly with primitive.) The zeros that occur at (some) negative integers are
called trivial zeros.
The critical line is the line (s) = 1/2 in the complex plane. Thus, the generalized Riemann hypothesis for Dirichlet L-functions reads: for every Dirichlet
character , all non-trivial zeros of L(s, ) lie on the critical line. Verifiable finite
versions of the generalized Riemann hypothesis generally read: for every Dirichlet
character of modulus q Q, all non-trivial zeros of L(s, ) with |(s)| f (q)
lie on the critical line (where f : Z R+ is some given function).
2.3. Fourier transforms. The Fourier transform on R is normalized as follows:
Z
b
e(xt)f (x)dx
f (t) =
for f : R C.
The trivial bound is |fb| |f |1 . Integration by parts gives that, if f is
differentiable k times outside finitely many points, then
!
!
d
(k) |
(k) |
|
f
|f
1
(2.1)
fb(t) = O
= O
.
2t
(2t)k
It could happen that |f (k) |1 = , in which case (2.1) is trivial (but not false).
In practice, we require f (k) L1 . In a typical situation, f is differentiable k
times except at x1 , x2 , . . . , xk , where it is differentiable only (k 2) times; the
contribution of xi (say) to |f (k) |1 is then | limxx+ f (k1) (x)limxx f (k1) (x)|.
i
2.4. Mellin transforms. The Mellin transform of a function : (0, ) C is

Z
(x)xs1 dx.
(2.2)
M (s) :=
0
In general, M (f M g) = M f M g and
Z +i
1
M f (z)M g(s z)dz
(2.3)
M (f g)(s) =
2i i
[GR00, 17.32]
provided that z and sz are within the strips on which M f and M g (respectively)
are well-defined.
The Mellin transform is an isometry, in the sense that
Z
Z
1
2 2 dt
|f (t)| t
(2.4)
=
|M f ( + it)|2 dt.
t
2
0
provided that + iR is within the strip on which M f is defined. We also know

that, for general f ,sear
(2.5)
M (tf (t))(s) = s M f (s),
M ((log t)f (t))(s) = (M f ) (s)
(as in, e.g., [BBO10, Table 1.11]).

Since (see, e.g., [BBO10, Table 11.3] or [GR00, 16.43])
(M I[a,b] )(s) =
bs a s
,
s
we see that
(2.6)
M 2 (s) =
1 2s
s
2
M 4 (s) =
1 2s
s
4
10
H. A. HELFGOTT
Let fz = ezt , where (z) > 0. Then

Z
Z
1 t
zt s1
e dt
e t dt = s
(M f )(s) =
z 0
0
Z z
Z
1
1
(s)
= s
eu us1 du = s
et ts1 dt = s ,
z 0
z 0
z
where the next-to-last step holds by contour integration, and the last step holds
by the definition of the Gamma function (s).
3. Preparatory work on major arcs
Let
(3.1)
S (, x) =
j
(n)e(n)(n/x),
where R/Z, is the von Mangoldt function and : R C is of fast enough

decay for the sum to converge.
Our ultimate goal is to bound from below
X
(3.2)
(n1 )(n2 )(n3 )1 (n/x)2 (n/x)3 (n/x),
n1 +n2 +n3 =N
where 1 , 2 , 3 : R C. As can be readily seen, (3.2) equals

Z
S1 (, x)S2 (, x)S3 (, x)e(N )d.
(3.3)
R/Z
In the circle method, the set R/Z gets partitioned into the set of major arcs
M and the set of minor arcs m; the contribution of each of the two sets to the
integral (3.3) is evaluated separately.
Our object here is to treat the major arcs: we wish to estimate
Z
S1 (, x)S2 (, x)S3 (, x)e(N )d
(3.4)
M
for M = M0 ,r , where
(3.5)

[ [ a
[
0 r a
0 r
M0 ,r =
, +
q
2qx q
2qx
qr a mod q
q odd (a,q)=1
[ a 0 r a 0 r
, +
q
qx q
qx
q2r a mod q
q even (a,q)=1
and 0 > 0, r 1 are given.

In other words, our major arcs will be few (that is, a constant number) and
narrow. While [LW02] used relatively narrow major arcs as well, their number,
as in all previous proofs of Vinogradovs result, is not bounded by a constant.
(In his proof of the five-primes theorem, [Tao] is able to take a single major arc
around 0; this is not possible here.)
What we are about to see is the general framework of the major arcs. This is
naturally the place where the overlap with the existing literature is largest. Two
important differences can nevertheless be singled out.
The most obvious one is the presence of smoothing. At this point, it improves and simplifies error terms, but it also means that we will later need
estimates for exponential sums on major arcs, and not just at the middle
of each major arc. (If there is smoothing, we cannot use summation by
11
parts to reduce the problem of estimating sums to a problem of counting

primes in arithmetic progressions, or weighted by characters.)
Since our L-function estimates for exponential sums will give bounds that
are better than the trivial one by only a constant even if it is a rather
large constant we need to be especially careful when estimating error
terms, finding cancellation when possible.
3.1. Decomposition of S (, x) by characters. What follows is largely classical; compare to [HL23] or, say, [Dav67, 26]. The only difference from the
literature lies in the treatment of n non-coprime to q.
Write (, b) for the Gauss sum
X
(3.6)
(, b) =
(a)e(ab/q)
a mod q
associated to a b Z/qZ and a Dirichlet character with modulus q. We let

() = (, 1). If (b, q) = 1, then (, b) = (b1 ) ().
Recall that denotes
the primitive character inducing a given Dirichlet charP
acter . Writing mod q for a sum over all characters of (Z/qZ) ), we see
that, for any a0 Z/qZ,
(3.7)
X
X X
1
1
(, b) (a0 ) =
(a)e(ab/q) (a0 )
(q)
(q)
mod q
mod q a mod q
(a,q)=1
X e(ab/q)
X e(ab/q) X
(a1 a0 ) =
(q)
(q)
mod q
a mod q
(a,q)=1
a mod q
(a,q)=1
(a1 a0 ),
mod q
P
where q = q/ gcd(q, a
). Now, mod q (a1 a0 ) = 0 unless a = a0 (in which
0
P
case mod q (a1 a0 ) = (q )). Thus, (3.7) equals
(q )
(q)
(q )
e(ab/q) =
(q)
a mod q
(a,q)=1
aa0 mod q

a
(q )
(q)
0b
k
(k,q/q )=1
k mod q/q
(k,q/q )=1
mod q/q
kb
q/q
(a0 + kq )b
q
(q )
e
(q)
a0 b
q
(q/q )
provided that (b, q) = 1. (We are evaluating a Ramanujan sum in the last step.)
Hence, for = a/q + /x, q x, (a, q) = 1,
X
1 X
(, a)
(n)(n)e(n/x)(n/x)
(q)
n
equals
X ((q, n ))
n
((q, n ))
(n)e(n)(n/x).
12
H. A. HELFGOTT
Since (a, q) = 1, (, a) = (a) (). The factor ((q, n ))/((q, n )) equals 1

when (n, q) = 1; the absolute value of the factor is at most 1 for every n. Clearly
n X
X p
X
=
log p
(n)
.
x
x
n
1
p|q
(n,q)6=1
Recalling the definition (3.1) of S (, x), we conclude that

(3.8)

X
X
X p
1
,
S (, x) =
(a) ()S,
, x + O 2
log p
(q)
x
x
mod q
where
(3.9)
p|q
S, (, x) =
(n)(n)e(n)(n/x).
Hence S1 (, x)S2 (, x)S3 (, x)e(N ) equals

1 XXX
(1 ) (2 ) (3 )1 (a)2 (a)3 (a)e(N a/q)
3
(3.10) (q) 1 2 3
S1 ,1 (/x, x)S2 ,2 (/x, x)S3 ,3 (/x, x)e(N/x)
plus an error term of absolute value at most
(3.11)
3 Y
X
j=1 j 6=j
|Sj (, x)|
X
p|q
log p
p
x
We will later see that the integral of (3.11) over S 1 is negligible for our choices
of j , it will, in fact, be of size O(x(log x)A ), A a constant. (In (3.10), we have
reduced our problems to estimating S, (/x, x) for primitive; a more obvious
way of reaching the same goal would have multiplied made (3.11) worse by a
factor of about q. The error term O(x(log x)A ) should be compared to the
main term, which will be of size about a constant times x2 .)
3.2. The integral over the major arcs. We are to estimate the integral (3.4),
where the major arcs M0 ,r are defined as in (3.5). We will use 1 = 2 = + ,
3 (t) = (t), where + and will be set later.
We can write
Z
(t/x)e(t/x)dt + O (err, (, x)) x
S, (/x, x) = S (/x, x) =
(3.12)
0
= b() x + O (err,T (, x)) x
for = T the trivial character, and
(3.13)
S, (/x) = O (err, (, x)) x
for primitive and non-trivial. The estimation of the error terms err will come
later; let us focus on (a) obtaining the contribution of the main term, (b) using
estimates on the error terms efficiently.
The main term: three principal characters. The main contribution will be
given by the term in (3.10) with 1 = 2 = 3 = 0 , where 0 is the principal
character mod q.
13
The sum (0 , n) is a Ramanujan sum; as is well-known (see, e.g., [IK04,

(3.2)]),
(3.14)
(0 , n) =
(q/d)d.
d|(q,n)
This simplifies to (q/(q, n))((q, n)) for q square-free. The special case n = 1
gives us that (0 ) = (q).
Thus, the term in (3.10) with 1 = 2 = 3 = 0 equals
e(N a/q)
(q)3 S+ ,0 (/x, x)2 S ,0 (/x, x)e(N/x),
(q)3
(3.15)
where, of course, S,0 (, x) = S (, x) (since 0 is the trivial character). Summing (3.15) for = a/q + /x and a going over all residues mod q coprime to q,
we obtain

q
(q,N
) ((q, N ))
(q)3 S+ ,0 (/x, x)2 S ,0 (/x, x)e(N/x).
(q)3
The integral of (3.15) over all of M = M0 ,r (see (3.5)) thus equals
(3.16)
Z 0 r
X ((q, N ))
2qx
2
S2+ ,0 (, x)S ,0 (, x)e(N )d
(q) ((q, N ))
3
0 r
(q)
qr
q odd
2qx
Z 0
X ((q, N ))
qx
2
+
(q) ((q, N ))
S2+ ,0 (, x)S ,0 (, x)e(N )d.
3
r
(q)
0
r
q2r
q even
qx
The main term in (3.16) is

(3.17)
Z 0 r
X ((q, N ))
2qx
2
3
(q) ((q, N ))
(c
+ (x))2 b (x)e(N )d
x
3
0 r
(q)
qr
q odd
2qx
Z 0
X ((q, N ))
qx
2
(c
+ (x))2 b (x)e(N )d.
+x
(q) ((q, N ))
3
r
(q)
0
r
q2r
q even
qx
We would like to complete both the sum and the integral. Before, we should
say that we will want to be able to use smoothing functions + whose Fourier
transform are not easy to deal with directly. All we want to require is that there
be a smoothing function , easier to deal with, such that be close to + in 2
norm.
Assume, then, that
|+ |2 0 | |,
14
H. A. HELFGOTT
(3)
where is thrice differentiable outside finitely many points and satisfies

L1 . Then (3.17) equals
(3.18)
Z 0 r
X ((q, N ))
2qx
2
3
(q) ((q, N ))
(b (x))2 b (x)e(N )d
x
3
0 r
(q)
qr
q odd
2qx
Z 0
X ((q, N ))
qx
2
+x
(q) ((q, N ))
(b (x))2 b (x)e(N )d.
3
r
(q)
0
r
q2r
q even
plus
(3.19)
qx
X (q)2 Z
q
(q)2
|(c
+ ()) (b ()) ||b ()|d .
Here (3.19) is bounded by 2.82643x2 (by (B.4)) times

sZ
Z
2
|c
+ () + b ()|2 d
|c
+ () b ()| d
|b ()|
| |1 |c
+ b |2 |c
+ + b |2 = | |1 |+ |2 |+ + |2
| |1 |+ |2 (2| |2 + |+ |2 ) = | |1 | |22 (2 + 0 )0 .
Now, (3.18) equals

(3.20) Z
x3
= x3
x3
(b (x))2 b (x)e(N )
X

0 r
q
min 2||x
,r
(q,2)
(q)2 =1
(b (x))2 b (x)e(N )d
(b (x))2 b (x)e(N )
The last line in (3.20) is bounded4 by

Z
2
(3.21)
x |b |
|b ()|2
((q, N ))
((q, N ))d
(q)3

X ((q, N ))
q1
(q)3
X

0 r
q
>min 2||x
,r
(q,2)
(q)2 =1
q
>min
(q,2)
0 r
,r
2||
(q)2 ((q, N ))
((q, N ))
((q, N ))d.
(q)3

(q)2
d.
(q)2
By (2.1) (with k = 3), (B.11) and (B.12), this is at most

!2
Z 0 /2
Z
(3)
4.31004
|
|
8.62008||
1
x2 | |1
|b |2
d + 2x2 | |1
d
3
r
(2)
0 r
0 /2
0 /2
!
(3) 2
|
|
x2
.
| |1 4.310040 | |21 + 0.00113 5 1
r
0
4This is obviously crude, in that we are bounding ((q, N ))/(q) by 1. We are doing so in
order to avoid a potentially harmful dependence on N .
15
It is easy to see that

X ((q, N ))
q1
(q)3
(q)2 ((q, N )) =
Y
p|N
1
(p 1)2
Y
1+
pN
1
(p 1)3
Expanding the integral implicit in the definition of fb,

Z
(3.22)
(b (x))2 b (x)e(N )d =

Z Z
1
N
(t1 + t2 ) dt1 dt2 .
(t1 ) (t2 )
x 0
x
0
(This is standard. One rigorous way to obtain (3.22) is to approximate the integral over (, ) by an integral with a smooth weight, at different scales;
as the scale becomes broader, the Fourier transform of the weight approximates
(as a distribution) the function. Apply Plancherel.)
Hence, (3.17) equals

N
(t1 + t2 ) dt1 dt2
(t1 ) (t2 )
x
x
0
0
Y

Y
1
1
1+
1
(p 1)2
(p 1)3
2
(3.23)
p|N
pN
(the main term) plus
(3.24)
2
2.82643| |2 (2 + 0 ) 0 +
4.310040 | |21 + 0.00113

r
(3)
| |21
05
2
| |1 x
Here (3.23) is just as in the classical case [IK04, (19.10)], except for the fact
that a factor of 1/2 has been replaced by a double integral. We will later see
how to choose our smoothing functions (and x, in terms of N ) so as to make the
double integral as large as possible.
What remains to estimate is the contribution of all the terms of the form
err, (, x) in (3.12) and (3.13). Let us first deal with another matter bounding
the 2 norm of |S (, x)|2 over the major arcs.
The 2 norm. We can always bound the integral of |S (, x)|2 on the whole
circle by Plancherel. If we only want the integral on certain arcs, we use the
bound in Prop. 7.2 (based on work by Ramare). If these arcs are really the major
arcs that is, the arcs on which we have useful analytic estimates then we can
hope to get better bounds using L-functions. This will be useful both to estimate
the error terms in this section and to make the use of Ramares bounds more
efficient later.
16
H. A. HELFGOTT
By (3.8),
X
a mod q
gcd(a,q)=1

2

S a + ,

q
x
1 XX
)
(
()
(q)2
a mod q
gcd(a,q)=1
(a) (a) S, (/x, x)S, (/x, x)

2
+ O 2(1 + q)(log x)2 || max |S (, x)| + (1 + q)(log x)2 ||
1 X
2
| ()| |S, (/x, x)|2 + Kq,1 (2|S (0, x)| + Kq,1 ),
=
(q)
where
Kq,1 = (1 +
q)(log x)2 || .
As is well-known (see, e.g., [IK04, Lem. 3.1])
() =
q
q
q
q
( ),
where q is the modulus of (i.e., the conductor of ), and

| ( )| =
q.
Using the expressions (3.12) and (3.13), we obtain

X a 2 2 (q)
S
+ , x =
|b
()x + O (err,T (, x) x)|2

q
x
(q)
a mod q
(a,q)=1
1
(q)
6=T
q
q
q O

| err, (, x)|2 x2 + Kq,1 (2|S (0, x)| + Kq,1 )

|b
()|2 + O (|err,T (, x)(2||1 + err,T (, x))|)
(q)

2
2
+ O q max | err, (, x)| x + Kq,2 x ,
=
2 (q)x2
6=T
where Kq,2 = Kq,1 (2|S (0, x)|/x + Kq,1 /x).
17
Thus, the integral of |S (, x)|2 over M (see (3.5)) is

X Z
qr a mod q
q odd (a,q)=1
0 r
a
+ 2qx
q
0 r
a
2qx
q
|S (, x)| d +
X Z
q2r a mod q
q even (a,q)=1
a 0 r
+ qx
q
0 r
a
qx
q
|S (, x)|2 d
0 r
0 r
X 2 (q)x2 Z qx
X 2 (q)x2 Z 2qx
2
|b
(x)| d +
|b
(x)|2 d
=
0 r
0 r
(q)
(q)
qr
q odd
+ O
(3.25)
q2r
q even
2qx
qx

X 2 (q)x2 gcd(q, 2)0 r
ET, 0 r (2||1 + ET, 0 r )
(q)
qx
2
2
q
X 0 rx
O q
+
q
qr
q odd
max
Kq,2
| err, (, x)| +
x
mod q
6=T
||0 r/2q
X 20 rx
Kq,2
2
,
+
O q max | err, (, x)| +
mod
q
q
x
q2r
q even
where
6=T
||0 r/q
ET,s = max | err,T (, x)|

||s
and T is the trivial character. If all we want is an upper bound, we can simply
remark that
0
0
X 2 (q) Z qx
X 2 (q) Z 2qx
|b
(x)|2 d + x
|b
(x)|2 d
x
(q) 0 r
(q) 0 r
r
qr
q odd
2qx
q2r
q even
qx
X 2 (q)
X 2 (q)
X 2 (q)
2
2
|b
+
.
|
=
2||
2
(q)
(q) 2
(q)
q2r
q even
qr
q odd
qr
q odd
If we also need a lower bound, we proceed as follows.

Again, we will work with an approximation such that (a) | |2 is small,
(3)
(b) is thrice differentiable outside finitely many points, (c) L1 . By (B.6),
0
X 2 (q) Z 2qx
|b
(x)|2 d
x
(q) 0 r
r
qr
q odd
2qx
0 r

X 2 (q) Z 2q
2
1
|b ()| d + O
=
log r + 0.85 (|b |22 |b
|22 ).
(q) 0 r
2
qr
q odd
2q
18
H. A. HELFGOTT
Also,
0
0
X 2 (q) Z 2qx
X 2 (q) Z qx
2
|b
(x)| d = x
|b
(x)|2 d.
x
(q) 0 r
(q) 0 r
r
q2r
q even
qr
q odd
qx
2qx
By (2.1) and Plancherel,

Z
0 r
2q
|b ()| d =
0 r
2q
|b ()| d O
(3)
= | |22 + O
| |21 q 5
5 6 (0 r)5
Hence
X 2 (q)
(q)
qr
q odd
0 r
2q
0 r
2q
|b ()|2 d = | |22
X 2 (q)
(q)
qr
q odd
Using (B.13), we get that
0 r
2q
!
(3)
| |21
d
(2)6
X 2 (q) |(3) |21 q 5
.
+O
(q) 5 6 (0 r)5
qr
q odd
(3)
X 2 (q) |(3) |2 q 5
1 X 2 (q)q | |21
1
(q) 5 6 (0 r)5
r
(q) 5 6 05
qr
qr
q odd
q odd

(3)
log r 0.425
| |21
+
.
0.64787 +
4r
r
5 6 05
Going back to (3.25), we use (B.2) to bound
X 2 (q)x2 gcd(q, 2)0 r
q
(q)
qx
2.59147 0 rx.
We also note that

X 1
X 2 X1 X 1
X1
+
=
+
q
q
q
q
r 2q
qr
q odd
q2r
q even
qr
q 2
qr
r
log 2e2 r.
2
We have proven the following result.
2 log er log
Lemma 3.1. Let : [0, ) R be in L1 L . Let S (, x) be as in (3.1) and

let M = M0 ,r be as in (3.5). Let : [0, ) R be thrice differentiable outside
(3)
finitely many points. Assume L1 .
Assume r 182. Then
(3.26)

Z
ET,0 r/2
2
|S (, x)| d = Lr,0 x + O 5.190 xr ET, 0 r ||1 +

2
2
M

3 log r
2
2
E,r,
+
r(log
2e
r)K
+ O 0 xr 2 +
0
r,2 ,
0
2
where
(3.27)
E,r,0 =
max
mod q
qrgcd(q,2)
||gcd(q,2)0 r/2q
Kr,2 = (1 +
q| err, (, x)|,
19
ET,s = max | err,T (, x)|,
2r)(log x)2 || (2|S (0, x)|/x + (1 +
||s
2r)(log x)2 || /x)
and Lr,0 satisfies both

Lr,0 2||22
(3.28)
X 2 (q)
(q)
qr
q odd
and
Lr,0 = 2| |22
(3.29)
X 2 (q)
+ O (log r + 1.7) (|b |22 |b
|22 )
(q)
qr
q odd
(3)
+ O
2| |21
5 6 05
!

log r 0.425
+
.
0.64787 +
4r
r
The error term xrET,0 r will be very small, since it will be estimated using
the Riemann zeta function; the error term involving Kr,2 will be completely
2
negligible. The term involving xr(r + 1)E,r,
; we see that it constrains us to
0
have | err, (x, N )| less than a constant times 1/r if we do not want the main
term in the bound (3.26) to be overwhelmed.
3.2.1. The triple product and its error terms. There are at least two ways we can
evaluate (3.4). One is to substitute (3.10) into (3.4). The disadvantages here
are that (a) this can give rise to pages-long formulae, (b) this gives error terms
proportional to xr| err, (x, N )|, meaning that, to win, we would have to show
that | err, (x, N )| is much smaller than 1/r. What we will do instead is to use
our 2 estimate (3.26) in order to bound
the contribution of non-principal terms.
This will give us a gain of almost r on the error terms; in other words, towin,
it will be enough to show later that | err, (x, N )| is much smaller than 1/ r.
The contribution of the error terms in S3 (, x) (that is, all terms involving
the quantities err, in expressions (3.12) and (3.13)) to (3.4) is
X
X
X 1
3 (a)e(N a/q)
(3 )
(q)
qr
q odd
(3.30)
+
q2r
q even
a mod q
(a,q)=1
3 mod q
1
(q)
0 r
2qx
0 r
2qx
(3 )
S+ ( + a/q, x)2 err ,3 (x, x)e(N )d

X
3 (a)e(N a/q)
a mod q
(a,q)=1
3 mod q
0 r
qx
0 r
qx
S+ ( + a/q, x)2 err ,3 (x, x)e(N )d.
20
H. A. HELFGOTT
We should also remember the terms in (3.11); we can integrate them over all of
R/Z, and obtain that they contribute at most
Z
3 Y
X
X
X p
2
|Sj (, x)| max
log p
j
d
qr
x
R/Z
j=1 j 6=j
p|q

p
2
|Sj (, x)|2 max
log p
j
qr
x
j=1 j 6=j
1
p|q
X
X p
2
2
=2
(n)+ (n/x) log r max
pr
x
n
1
sX
X
X p
2
2
2
2
(n)+ (n/x)
(n) (n/x) log r max
+4
pr
x
n
n
3 Y
X
by Cauchy-Schwarz and Plancherel.

The absolute value of (3.30) is at most
0 r
X X Z 2qx

S ( + a/q, x)2 d
q
+
r
0
2qx
qr a mod q
q odd (a,q)=1
(3.31)
X Z
q
0 r
qx
0 r
qx
q2r a mod q
q even (a,q)=1
M0 ,r
max
mod q
||0 r/2q
| err , (, x)|

S ( + a/q, x)2 d max | err , (, x)|
+
mod q
||0 r/q

S ()2 d
+
max
mod q
qrgcd(q,2)
||gcd(q,2)0 r/q
q| err , (, x)|.
We can bound the integral of |S+ ()|2 by (3.26).

What about the contribution of the error part of S2 (, x)? We can obviously
proceed in the same way, except that, to avoid double-counting, S3 (, x) needs
to be replaced by
1
(q)
(0 )b3 () x =
b3 () x,
(q)
(q)
(3.32)
which is its main term (coming from (3.12)). Instead of having an 2 norm as
in (3.31), we have the square-root
of a product of two squares of 2 norms (by
R
Cauchy-Schwarz), namely, M |S+ ()|2 d and
0
0
X 2 (q) Z 2qx
X 2 (q) Z qx
2
|
b
(x)x|
d
+
|b (x)x|2 d
(q)2 0 r
(q)2 0 r
r
(3.33)
qr
q odd
q2r
q even
2qx
x|b |22
X 2 (q)
q
(q)2
qx
By (B.4), the sum over q is at most 2.82643.

As for the contribution of the error part of S1 (, x), we bound it in the same
way, using solely the 2 norm in (3.33) (and replacing both S2 (, x) and S3 (, x)
by expressions as in (3.32)).
21
The total of the error terms is thus

x
(3.34)
+x
where A = (1/x)
max
q | err , (, x)| A
max
q | err+ , (, x)|( A + B+ ) B ,
mod q
qrgcd(q,2)
||gcd(q,2)0 r/q
mod q
qrgcd(q,2)
||gcd(q,2)0 r/q
2
M |S+ (, x)| d
(bounded as in (3.26)) and
B = 2.82643| |22 ,
(3.35)
B+ = 2.82643|+ |22 .
In conclusion, we have proven

L
Proposition 3.2. Let x 1. Let + , : [0, ) R. Assume + C 2 , +
2
1
2
and + , L L . Let : [0, ) R be thrice differentiable outside finitely
(3)
many points. Assume
P L1 and |+ |2 < 0 | |2 , where 0 0.
Let S (, x) = n (n)e(n)(n/x). Let err, , primitive, be given as in
(3.12) and (3.13). Let 0 > 0, r 1. Let M = M0 ,r be as in (3.5).
Then, for any N 0,
equals
(3.36)
S+ (, x)2 S (, x)e(N )d
C0 C , x2 + 2.82643| |22 (2 + 0 ) 0 +
4.310040 | |21 + 0.0012
(3)
| |21
05
| |1 x2
p
+ O (E ,r,0 A+ + E+ ,r,0 1.6812( A+ + 1.6812|+ |2 )| |2 ) x2

q
+ O 2Z+2 ,2 (x)LS (x, r) x + 4 Z+2 ,2 (x)Z2 ,2 (x)LS+ (x, r) x ,
where
(3.37)
Y

1
1
,
C0 =
1
1+
(p 1)2
(p 1)3
pN
p|N

Z Z
N
(t1 ) (t2 )
C , =
(t1 + t2 ) dt1 dt2 ,
x
0
0
Y
22
H. A. HELFGOTT
(3.38)
E,r,0 =
max
mod q
qgcd(q,2)r
||gcd(q,2)0 r/2q
q | err, (, x)|,
ET,s = max | err,T (, x)|,

||s/q

ET,0 r/2
A = L,r,0 ||22 + 5.190 r ET, 0 r |1 | +
2
2

3 log r
2
E,r,
+ 0 rx1 (log 2e2 r)Kr,2
+ 0 r 2 +
0
2
X 2 (q)
L,r,0 2||22
(q)
qr
q odd
Kr,2 = (1 + 2r)(log x)2 || (2Z,1 (x)/x + (1 + 2r)(log x)2 || /x),

X p
1X k
Z,k (x) =
(n)(n/x),
LS (x, r) = log r max
,
pr
x n
x
1
and err, is as in (3.12) and (3.13).

Here is how to read these expressions. The error term in the first line of (3.36)
will be small provided that 0 is small and r is large. The third line of (3.36) will
be negligible, as will be the term 20 r(log er)Kr,2 in the definition of A . (Clearly,
Z,k (x) (log x)k1 and LS (x, q) (q) log x for any of rapid decay.)
It remains to estimate the second line of (3.36), and this includes estimating
A . We see that we will have to give very good bounds for E,r,0 when = + or
= . (The same goes for ET+ ,r0 , for which the same method will work (and
give even better bounds).) We also see that we want to make C0 C+ , x2 as large
as possible; it will be competing not just with the error terms here, but, more
importantly, with the bounds from the minor arcs, which will be proportional to
|+ |22 | |1 .
4. Optimizing and coordinating smoothing functions
One of our goals is to maximize the quantity C , in (3.37) relative to
| |22 | |1 . One way to do this is to ensure that (a) is concentrated on a very
short5 interval [0, ), (b) is supported on the interval [0, 2], and is symmetric
around t = 1, meaning that (t) (2 t). Then, for x N/2, the integral

Z Z
N
(t1 + t2 ) dt1 dt2
(t1 ) (t2 )
x
0
0
in (3.37) should be approximately equal to

Z
Z
N
(t)
(4.1)
| |1
(t)2 dt = | |1 | |22 ,
t dt = | |1
x
0
0
provided that 0 (t) 0 for all t. It is easy to check (using Cauchy-Schwarz in
the second step) that this is essentially optimal. (We will redo this rigorously in
a little while.)
At the same time, the fact is that major-arc estimates are best for smoothing
functions of a particular form, and we have minor-arc estimates from [Hel] for
5This is an idea due to Bourgain in a related context [Bou99].
23
a different specific smoothing 2 . The issue, then, is how do we choose and

as above so that we can
is concentrated on [0, ),
is supported on [0, 2] and symmetric around t = 1,
we can give minor-arc and major-arc estimates for ,
we can give major-arc estimates for a function + close to in 2 norm?
4.1. The symmetric smoothing function . We will later work with a
smoothing function whose Mellin transform decreases very rapidly. Because
of this rapid decay, we will be able to give strong results based on an explicit
formula for . The issue is how to define , given , so that is symmetric
around t = 1 (i.e., (2 x) (x)) and is very small for x > 2.
2
We will later set (t) = et /2 . Let
(
t3 (2 t)3 et/21/2 if t [0, 2],
(4.2)
h : t 7
0
otherwise
We define : R R by
(4.3)
(
2
t3 (2 t)3 e(t1) /2
(t) = h(t) (t) =
0
if t [0, 2],
otherwise.
It is clear that is symmetric around t = 1 for t [0, 2].

4.1.1. The product (t) ( t). We now should go back and redo rigorously
what we discussed informally around (4.1). More precisely, we wish to estimate
Z
Z
(t) (2 + t)dt
(t) ( t)dt =
(4.4)
() =
for 2 close to 2. In this, it will be useful that the Cauchy-Schwarz inequality

degrades slowly, in the following sense.
Lemma 4.1. Let V be a real vector space with an inner product h, i. Then, for
any v, w V with |w v|2 |v|2 /2,
hv, wi = |v|2 |w|2 + O (2.71|v w|22 ).
Proof. By a truncated Taylor expansion,
x x2
1
1+x=1+ +
max
2
2 0t1 4(1 (tx)2 )3/2
2
x
x
=1+ +O
2
23/2
for |x| 1/2. Hence, for = |w v|2 /|v|2 ,

s

2 2
+ 2
2 hwv,vi
2hw v, vi + |w v|22
|w|2
|v|22
(2 + )
+O
= 1+
=1+
|v|2
2
|v|22
23/2

1 (5/2)2
|w v|22
hw v, vi
2
=1++O
+ O 2.71
.
=1+
+ 3/2
2
|v|22
|v|22
2
Multiplying by |v|22 , we obtain that

|v|2 |w|2 = |v|22 + hw v, vi + O 2.71|w v|22 = hv, wi + O 2.71|w v|22 .
24
H. A. HELFGOTT
Applying Lemma 4.1 to (4.4), we obtain that

Z
(t) ((2 ) + t)dt
( )() =
sZ
sZ
+O
(4.5)
2.71
= | |22 + O
=
| |22
| |22
| (t)|2 dt
+O
| (t) ((2 ) + t)| dt
2.71

| ((2 ) + t)|2 dt
Z
2.71(2 )
2
+ O (2.71(2 )

(r + t) dr
2 Z
0
2
| |2 ).
2
dt

(r + t)2 dtdr
We will be working with supported on the non-negative reals; we recall that

is supported on [0, 2]. Hence
(4.6)

Z Z
Z N
x
N
N
(t1 ) (t2 )
(t1 + t2 ) dt1 dt2 =
d
( )()
x
x
0
0
0

Z N
x
N
2
2 2
=
d
(| |2 + O (2.71(2 ) | |2 ))
x
0
!
Z N
Z N
= | |22
()d + 2.71| |22 O
((2 N/x) + )2 ()d ,
provided that N/x 2. We see that it will be wise to set N/x very slightly larger
than 2. As we said before, will be scaled so that it is concentrated on a small
interval [0, ).
4.2. The approximation + to . We will define + : [0, ) R by
(4.7)
+ (t) = hH (t) (t),
where hH (t) is an approximation to h(t) that is band-limited in the Mellin sense.

Band-limited here means that the restriction of the Mellin transform to the imaginary axis has compact support [iH, iH], where H > 0 is a constant. Then,
since
Z i
1
M hH (r)M (s r)dr
(M + )(s) = (M (hH ))(s) =
2i i
(4.8)
Z iH
1
M h(r)M (s r)dr
=
2i iH
(see (2.3)), the Mellin transform M + will have decay properties similar to those
of M as s .
Let
(
IH (s) =
1
0
if |(s)| H,
otherwise.
25
The inverse Mellin transform of IH is

Z iH
1 sin(H log y)
1 y s iH
1
1
|iH =
.
y s ds =
(4.9)
(M IH )(y) =
2i iH
2i log y
log y
It is easy to check that the Mellin transform of this is indeed identical to [iH,iH]
on the imaginary axis: (4.9) is the Dirichlet kernel under a change of variables.
Now, in general, the Mellin transform of f M g is M f M g. We define
Z

sin(H log y) dy
h(ty 1 )
hH (t) = h M 1 IH (t) =
t
log y
y
2
(4.10)
Z 2
t
sin(H log y) dy
=
h(ty)
log y
y
0
and obtain that the Mellin transform of hH (t), on the imaginary axis, equals the
restriction of M h to the interval [iH, iH]. We may adopt the convention that
hH (t) = h(t) = 0 for t < 0.
4.2.1. The difference + in 2 norm. By (4.3) and (4.7),
Z
2
|hH (t) (t) h(t) (t)|2 dt
|+ |2 =
0
Z
(4.11)
dt
2
|hH (t) h(t)|2 .
max | (t)| t
t0
t
0
Since the Mellin transform is an isometry (i.e., (2.4) holds),

Z
Z
Z
1
1
2
2 dt
=
|M hH (it) M h(it)| dt =
|M h(it)|2 dt.
|hH (t) h(t)|
t
2
H
0
2
The maximum maxt0 | (t)| t is 1/ 2e.
Now, consider any : [0, ) C that (a) has compact support (or fast decay),
(b) satisfies (t) = O(t3 ) near the origin and (c) is quadruply differentiable
outside a finite set of points. We can show by integration by parts (cf. (2.1))
that, for (s) > 2,
Z
Z
Z
xs
xs+1
s dx
(x) dx =
(x)
=
dx
(x)x
M (s) =
x
s
s(s + 1)
0
0
0
Z
Z
xs+2
xs+3
=
(3) (x)
dx = lim
dx,
(4) (x)
s(s + 1)(s + 2)
s(s + 1)(s + 2)(s + 3)
t0+ t
0
where (4) (x) is understood in the sense of distributions at the finitely many
places where it is not well-defined
R as a function.
Let s = it, = h. Let Ck = 0 |h(k) (x)|xk1 dx for 0 k 4. Then

C4
(4.12)
M h(it) = O
.
|t||t + i||t + 2i||t + 3i|
Hence
(4.13)
and so
1
=
|hH (t) h(t)|
t
2 dt
C4
|+ |2
7
1
|M h(it)| dt
1
2e
1/4
1
.
H 7/2
C42
C42
dt
t6
7H 7
26
H. A. HELFGOTT
By (C.7), C4 = 3920.8817036284 + O(1010 ). Thus

(4.14)
|+ |2
547.5562
.
H 7/2
It will also be useful to bound

Z

2

(+ (t) (t)) log t dt .

0
This is at most
Z
Z
2
|hH (t) (t) h(t) )|2 | log t|dt
(+ (t) (t)) | log t|dt
0
0

Z
dt
2
max | (t)| t| log t|
|hH (t) h(t)|2 .
t0
t
0
Now
2
max | (t)|2 t| log t| = min
(t)t log t 0.3301223,
t0
t[0,1]
where we find the minimum by the bisection method (carried out rigorously, as
in Appendix C.2, with 30 iterations), Hence, by (4.13),
Z
480.394
(+ (t) (t))2 | log t|dt
(4.15)
.
H 7/2
0
***
As we said before, (M hH )(it) is just the truncation of (M h)(it) to the interval
[H, H]. We can write down M h explicitly:
M h = e1/2 (1)s (8(s + 3, 2) + 12(s + 4, 2) + 6(s + 5, 2) + (s + 6, 2)),
where (s, x) is the (lower) incomplete Gamma function
Z x
et ts1 dt.
(s, x) =
0
However, it is easier to deal with M h by means of bounds and approximations.

Besides (4.12), note we have also derived

C2
C3
C1
,
,
.
(4.16)
M h(it) = O min C0 ,
|t| |t||t + i| |t||t + i||t + 2i|
By (C.2), (C.3), (C.4), (C.6) and (C.7),
C0 1.622284,
C3 131.3399,
C1 3.580004,
C4 3920.882.
C2 15.27957,
(We will compute rigorously far more precise bounds in Appendix C.1, but these
bounds are all we could need.)
4.2.2. Norms involving + . Let us now bound some norms involving + . Relatively crude bounds will suffice in most cases.
First, by (4.14),
(4.17)
|+ |2 | |2 + |+ |2 0.80013 +
where we obtain
(4.18)
| |2 =
547.5562
,
H 7/2
0.640205997 . . . = 0.8001287 . . .
27
by symbolic integration. We can use the same idea to bound |+ (t)tr |2 , r 1/2:
|+ (t)tr |2 | (t)tr |2 + |(+ )(t)tr |2
s
r
max |
| (t)t |2 +
For example,
t0
(t)|2 t2r+1
|hH (t) h(t)|2
dt
t
r
C4,1 1
.
| (t)tr |2 + max | (t)|2 t2r+1
t0
7 H 7/2
| (t)t0.7 |2 0.80691,
| (t)t0.3 |2 0.66168,
max | (t)|2 t20.7+1 0.37486,

t0
max | (t)|2 t2(0.3)+1 0.5934.

t0
Thus
511.92
,
7/2
H
(4.19)
644.08
|+ (t)t0.3 |2 0.66168 +
.
H 7/2
equals (s 1)(M )(s 1). Since the Mellin
The Mellin transform of +
+
transform is an isometry in the sense of (2.4),
Z 1 +i
Z 1 +i

2
2
2
1
1
2

M (+ )(s) ds =
|s M + (s)|2 ds.
|+ |2 =
1
1
2i i
2i i
|+ (t)t0.7 |2 0.80691 +
Recall that + (t) = hH (t) (t). Thus, by (2.3), M + (1/2 + it) equals 1/2
times the (additive) convolution of M hH (it) and M (1/2 + it). Therefore, for
s = 1/2 + it,
Z
|s| H
|s| |M + (s)| =
M h(ir)M (s ir)dr
2 H
Z H
3
(4.20)
|ir 1||M h(ir)| |s ir||M (s ir)|dr
2 H
3
(f g)(t),
=
2
where f (t) = |ir 1||M hH (it)| and g(t) = | 1/2 + it||M (1/2 + it)|. (Since
(1/2 + i(t r)) + (1 + ir) = 1/2 + it = s, either either | 1/2 + i(t r)| |s|/3 or
|1+ir| 2|s|/3; hence |sir||ir1| = |1/2+i(tr)||1+ir| |s|/3.) By Youngs
inequality (in a special case that follows from Cauchy-Schwarz), |f g|2 |f |1 |g|2 .
Again by Cauchy-Schwarz and Plancherel (i.e., isometry),
Z
2 Z H
2

2

|f |1 =
|ir 1||M hH (ir)|dr =
|ir 1||M h(ir)|dr
2H
= 4H
|ir 1| |M h(ir)| dr = 2H
H
Z
0
|(th) (t)|2
|M ((th) )(ir)|2 dr
dt
4H 3.79234.
t
Yet again by Plancherel,

Z
Z 1 +i
2
2
2
2
|s| |M (s)| ds =
|g|2 =
12 i
1
+i
2
1
i
2
|(M (
))(s)|2 ds
2
|
|2
.
4
28
H. A. HELFGOTT
Hence
3
3/2
1
1/4
3.79234 0.87531 H.
4H
(4.21) |+
|2 |f g|2
2
2 2
2 3/2
t | , > 0, in the same way. The only difference is that the
We can bound |+
2
integral is now taken on (1/2 + i, 1/2 + + i) rather than (1/2
i, 1/2 + i). For example, if = 0.7, then |s||M + (s)| = (6/2)(f g)(t),
where f (t) = |ir 1||M hH (it)| and g(t) = |0.2 + it||M ,1 (0.9 + it)|. Here
Z 1.2i
t0.7 |22 0.55091

))(s)|2 ds = |
|(M (
|g|22 =
1.2i
and so
(4.22)
1
3
0.7
|+
t |2 4H 3.79234 0.55091 1.95201 H.
2
By isometry and (2.5),

|+
log |22
1
=
2i
1
+i
2
1
|M (+ log)(s)| ds =
2i
2
1
i
2
1
+i
2
1
i
2
|(M + ) (s)|2 ds.
Now, (M + ) (1/2 + it) equals 1/2 times the additive convolution of M hH (it)
and (M ) (1/2 + it). Hence, by Youngs inequality, |(M + ) (1/2 + it)|2
(1/2)|M hH (it)|1 |(M ) (1/2 + it)|2 . By the definition of hH , Cauchy-Schwarz
and isometry,
sZ
Z H
H
|M h(ir)|2 dr
|M h(ir)|dr 2H
|M hH (it)|1 =
H
2H
sZ
|M h(ir)|2 dr
2H 2|h(t)/ t|2 .
Again by isometry and (2.5),

|(M ) (1/2 + it)|2 =
2| log |2 .
Hence
2
H
2H|h(t)/ t|2 | log |2 = |h(t)/ t|2 | log |2 .
|+ log |2
3/2
(2)
Since
(4.23)
|h(t)/ t|2 1.40927,
| log |2 1.39554,
we get that
(4.24)
|M hH (ir)|1 1.99301 2H
and
(4.25)
|+ log |2 1.10959 H.
29
Let us bound |+ (t)t |1 for (1, ). By Cauchy-Schwarz and Plancherel,
|+ (t)t |1 = | (t)hH (t)t |1 | (t)t+1/2 |2 |hH (t)/ t|2

sZ
dt
= | (t)t+1/2 |2
|hH (t)|2
t
0
s Z
i
1
|M hH (ir)|2 dr
= | (t)t+1/2 |2
2 i
s Z
1
| (t)t+1/2 |2
|M h(ir)|2 dr | (t)t+1/2 |2 |h(t)/ t|2 .
2
Since
| t+1/2 |2 =
sZ
t2
t2+1 dt =
and |h(t)/ t|2 is as in (4.23), we conclude that

p
(4.26)
|+ (t)t |1 0.99651 ( + 1),
( + 1)
2
|+ |1 0.996505.
Let us now get a bound for |+ | . Recall that + (t) = hH (t) (t). Clearly
(4.27)
|+ | = |hH (t) (t)| | | + |(h(t) hH (t)) (t)|

h(t) hH (t)
| (t)t| .
| | +

t
Taking derivatives, we easily see that
| (t)t| = e1/2 .
| | = (1) = 1,
It remains to bound |(h(t) hH (t))/t| . By (4.10),

Z
Z
t
sin w
1 sin(H log y) dy
h(ty )
h w/H
(4.28)
hH (t) =
=
dw.
t
log y
y
w
e
H log 2
2
The sine integral

Si(x) =
x
0
sin t
dt
t
is defined for all x; it tends to /2 as x + and to /2 as x . We

apply integration by parts to the second integral in (4.28), and obtain

Z
d
1
t
hH (t) h(t) =
Si(w)dw h(t)
h w/H
H log 2 dw
e

Z t

1 d
t
Si(w)
=
dw
h w/H
0
dw
2
e

Z

t
d
1 0
Si(w) +
dw.
h w/H
H log 2 dw
2
e
t
Now

tew/H
d
t
=
h
dw
H
ew/H

h

t|h | .

ew/H
Hew/H
t
30
H. A. HELFGOTT
Integration by parts easily yields the bounds | Si(x) /2| < 2/x for x > 0 and
| Si(x) + /2| < 2/|x| for x < 0; we also know that Si(x) > 0 for x > 0 and
Si(x) < 0 for x < 0. Hence
Z 1
Z w/H !
2e
2t|h |
|hH (t) h(t)|
dw +
dw
H
w
0 2
1

t|h |
4
=
1 + E1 (1/H) ,
H
where E1 is the exponential integral
E1 (z) =
By [AS64],
0 < E1 (1/H) <
et
dt.
t
log(H + 1)
< log H.
e1/H
Hence
1 + 4 log H
|hH (t) h(t)|
< |h |
t
H
(4.29)
and so

4
h(t) hH (t)

< 1 + e1/2 |h | 1 + log H .
|+ | 1 + e

t
H
(t) = 0 within (0, 2) are t =

3 1, t = 21 3. A quick check
The roots of h
gives that |h ( 21 3)| > |h ( 3 1)|. Hence
(4.30)
|h | = |h ( 21 3)| 3.65234.
1/2
We have proven
1 + 4 log H
log H
< 1 + 2.21526
.
H
H
We will need another bound of this kind, namely, for + log t. We start as in
(4.27):
|+ log t| | log t| + |(h(t) hH (t)) (t) log t|
(4.31)
|+ | < 1 + e1/2 3.65234
1+
| log t| + |(h hH (t))/t| | (t)t log t| .

By the bisection method with 30 iterations,
| (t) log t| 0.279491,
Hence, by (4.29) and (4.30),
| (t)t log t| 0.346491.

1+
log H
.
H
4.3. The smoothing function . Here the challenge is to define a smoothing
function that is good both for minor-arc estimates and for major-arc estimates.
The two regimes tend to favor different kinds of smoothing function. For minorarc estimates, both [Tao] and [Hel] use
(4.32)
(4.33)
|+ log t| 0.2795 + 1.26551
2 (t) = 4 max(log 2 | log 2t|, 0) = ((2I[1/2,1] ) M (2I[1/2,1] ))(t),
where I[1/2,1] (t) is 1 if t [1/2, 1] and 0 otherwise. For major-arc estimates, we

will use a function based on
2
= et /2 .
31
(For example, we can give major-arc estimates for + , which is based on (by
2
(4.7)). We will actually use here the function t2 et /2 , whose Mellin transform is
M (s + 2) (by, e.g., [BBO10, Table 11.1]).)
We will follow the simple expedient of convolving the two smoothing functions,
one good for minor arcs, the other one for major arcs. In general, let 1 , 2 :
[0, ) C. It is easy to use bounds on sums of the form
X
(4.34)
Sf,1 (x) =
f (n)1 (n/x)
n
to bound sums of the form Sf,1 M 2 :

n
X
Sf,1 M 2 =
f (n)(1 M 2 )
x
n
Z X
Z
(4.35)
n
dw
dw
f (n)1
=
2 (w)
Sf,1 (wx)2 (w) .
=
wx
w
w
0
0
n
The same holds, of course, if 1 and 2 are switched, since 1 M 2 = 2 M 1 .

The only objection is that the bounds on (4.34) that we input might not be valid,
or non-trivial, when the argument wx of Sf,1 (wx) is very small. Because of this,
it is important that the functions 1 , 2 vanish at 0, and desirable that their first
and second derivatives do so as well.
Let us see how this works out in practice for 1 = 2 . Here 2 : [0, ) R is
given by
(4.36)
2 = 1 M 1 = 4 max(log 2 | log 2t|, 0),
where 1 = 2 I[1/2,1] . Bounding the sums S2 (, x) on the minor arcs was the
main subject of [Hel].
Before we use [Hel, Main Thm.], we need an easy lemma so as to simplify its
statement.
Lemma 4.2. For any q 1 and any r max(3, q),
q
< (r),
(q)
where
(4.37)
(r) = e log log r +
2.50637
.
log log r
Proof. Since (r) is increasing for r 27, the statement follows immediately for
q 27 by [RS62, Thm. 15]:
q
< (q) (r).
(q)
For r < 27, it is clear that q/(q) 2 3/(1 2) = 3; it is also easy to see that
(r) > e 2.50637 > 3 for all r > e.

It is time to quote the main theorem in [Hel]. Let x x0 , x0 = 2.16 1020 .
Let 2 = a/q + /x, q Q, gcd(a, q) = 1, |/x| 1/qQ, where Q = (3/4)x2/3 .
Then, if 3 q x1/3 /6, [Hel, Main Thm.] gives us that

||
q x,
(4.38)
|S2 (, x)| gx max 1,
8
32
H. A. HELFGOTT
where
(4.39)
p
(Rx,2r log 2r + 0.5) (r) + 2.5 Lr
+
+ 3.2x1/6 ,
gx (r) =
r
2r
with
Rx,t = 0.27125 log
(4.40)
+ 0.41415
9x1/3
2 log 2.004t

16 80
80
111
,
+ log 2 9 t 9 +
+
9
5
1+

7 13
Lt = (t) log 2 4 t 4
log 4t
(We are using Lemma 4.2 to bound all terms 1/(q) appearing in [Hel, Main
Thm.]; we are also using the obvious fact that, for 0 q fixed and 0 < a < b, 0a q b
is minimal when 0 is minimal.) If q > x1/3 /6, then, again by [Hel, Main Thm.],
(4.41)
|S2 (, x)| h(x)x,
where
(4.42)
h(x) = 0.2727x1/6 (log x)3/2 + 1218x1/3 log x.
We will work with x varying within a range, and so we must pay some attention
to the dependence of (4.38) and (4.41) on x.
Lemma 4.3. Let gx (r) be as in (4.39) and h(x) as in (4.42). Then
(
h(x) if x < (6r)3
x 7
gx (r) if x (6r)3
is a decreasing function of x for r 3 fixed and x 21.
Proof. It is clear from the definitions that x 7 h(x) (for x 21) and x 7 gx,0 (r)
are both decreasing. Thus, we simply have to show that h(x1 ) gx1 ,0 (r) for
x1 = (6r)3 . Since x1 (6 11)3 > e12.5 ,
Rx1 ,2r 0.27125 log(0.065 log x1 + 1.056) + 0.41415
0.27125 log((0.065 + 0.0845) log x1 ) + 0.41415 0.27215 log log x1 .
Hence
1/3
Rx1 ,2r log 2r + 0.5 0.27215 log log x1 log x1 0.27215 log 12.5 log 3 + 0.5
0.09072 log log x1 log x1 0.255.
At the same time,
1/3
(4.43)
2.50637
x1
+
e log log x1 e log 3 + 1.9521
6
log log r
e log log x1
(r) = e log log
for r 37, and we also get (r) e log log x1 for r [11, 37] by the bisection
method (carried out rigorously, as in Appendix C.2, with 10 iterations). Hence
p
(Rx1 ,2r log 2r + 0.5) (r) + 2.5
p
(0.09072 log log x1 log x1 0.255) e log log x1 + 2.5
0.1211 log x1 (log log x1 )3/2 + 2,
33
and so
p
(Rx1 ,2r log 2r + 0.5) (r) + 2.5
1/6
(0.21 log x1 (log log x1 )3/2 + 3.47)x1 .

2r
Now, by (4.43),

80
7
16
80
111
1/3
1/3
Lr e log log x1 log 2 4 (x1 /6)13/4 +
+ log 2 9 (x1 /6) 9 +
9
5

80
13
log x1 + 4.28 +
log x + 7.51.
e log log x1
12
27
It is clear that
4.28e log log x1 + 80
1/3
27 log x1 + 7.51
< 1218x1
log x1 .
1/3
x1 /6
for x1 e.
It remains to show that
13 1/6
e x1
log x1 log log x1
12
is less than 0.2727(log x1 )3/2 for x1 large enough. Since t 7 (log t)3/2 /t1/2 is
decreasing for t > e3 , we see that
(4.44)
0.21 log x1 (log log x1 )3/2 + 3.47 + 3.2 +
1/6
0.21 log x1 (log log x1 )3/2 + 6.67 + 13

12 e x1
0.2727(log x1 )3/2
log x1 log log x1
<1
for all x1 e34 , simply because it is true for x = e34 > ee .

1/3
We conclude that h(x1 ) gx1 ,0 (r) = gx1 ,0 (x1 /6) for x1 e34 . We check
1/3
that h(x1 ) gx1 ,0 (x1 /6) for all x1 [5832, e34 ] as well by the bisection method
(applied to [5832, 583200] and to [583200, e34 ] with 30 iterations in the latter
interval, with 20 initial iterations).

Lemma 4.4. Let Rx,r be as in (4.39). Then t Ret ,r (r) is convex-up for
t 3 log 6r.
Proof. Since t et/6 and t t are clearly convex-up, all we have to do is to

show that t Ret ,r is convex-up. In general, since

f f (f )2
f
=
,
(log f ) =
f
f2
a function of the form (log f ) is convex-up exactly when f f (f )2 0. If

f (t) = 1 + a/(t b), we have f f (f )2 0 whenever
(t + a b) (2a) a2 ,
i.e., a2 + 2at 2ab, and that certainly happens when t b. In our case, b =
3 log(2.004r/9), and so t 3 log 6r implies t b.
Proposition 4.5. Let x Kx0 , x0 = 2.16 1020 , K 1. Let S (, x) be as

in (3.1). Let = 2 M , where 2 is as in (4.36) and : [0, ) [0, ) is
continuous and in L1 .
Let 2 = a/q + /x, q Q, gcd(a, q) = 1, |/x| 1/qQ, where Q = (3/4)x2/3 .
If q (x/K)1/3 /6, then

||
q ||1 x,
(4.45)
S (, x) gx, max 1,
8
34
H. A. HELFGOTT
where
(4.46)
p
(Rx,K,,2r log 2r + 0.5) (r) + 2.5 Lr
+ 3.2K 1/6 x1/6 ,

+
gx, (r) =
r
2r
C,2 /||1
Rx,K,,t = Rx,t + (Rx/K,t Rx,t )
log K
with Rx,t and Lr are as in (4.40), and

Z 1
(w) log w dw.
(4.47)
C,2,K =
1/K
If q > (x/K)1/3 /6, then

|S (, x)| h (x/K) ||1 x,
where
h (x) = h(x) + C,0,K /||1 ,
Z 1/K
|(w)|dw
C,0,K = 1.04488
(4.48)
and h(x) is as in (4.42).

Proof. By (4.35),
S (, x) =
1/K
dw
S2 (, wx)(w)
+
w
1/K
S2 (, wx)(w)
dw
.
w
We bound the first integral by the trivial estimate |S2 (, wx)| |S2 (0, wx)|
and Cor. A.3:
Z 1/K
Z 1/K
dw
dw
wx(w)
1.04488
|S2 (0, wx)|(x)
w
w
0
0
Z 1/K
(w)dw.
= 1.04488x
0
If w 1/K, then wx x0 , and we can use (4.38) or (4.41). If q > (x/K)1/3 /6,
then |S2 (, wx)| h(x/K)wx by (4.41); moreover, |S2 (, y)| h(y)y for
x/K y < (6q)3 (by (4.41)) and |S2 (, y)| gy,1 (r) for y (6q)3 (by (4.38)).
Thus, Lemma 4.3 gives us that
Z
Z
dw
dw
|S2 (, wx)|(w)
h(x/K)wx (w)
w
w
1/K
1/K
Z
(w)dw h(x/K)||1 x.
= h(x/K)x
1/K
If q (x/K)1/3 /6, we always use (4.38). We can use the coarse bound
Z
dw
3.2K 1/6 ||1 x5/6
3.2x1/6 wx (w)
w
1/K
Since Lr does not depend on x,
Z
dw
Lr
Lr
wx (w)
||1 x.
w
r
1/K r
35
By Lemma 4.4 and q (x/K)1/3 /6, y 7 Rey ,t is convex-up and decreasing for
y [log(x/K), ). Hence

( log w
log w
Rx,t if w < 1,
R
+
1
1
1
x/K,t
log K
Rwx,t log K
Rx,t
if w 1.
Therefore
Z
dw
Rwx,t wx (w)
w
1/K
!
!
Z 1
Z
log w
log w
Rx,t (w)xdw
Rx,t x(w)dw +
1 Rx/K,t + 1
1
log K
log K
1/K
1
Z
Z 1
x
(w)dw + (Rx/K,t Rx,t )
Rx,t x
(w) log wdw
log K 1/K
1/K

C,2
Rx,t ||1 + (Rx/K,t Rx,t )
x,
log K
where
C,2,K =
(w) log w dw.

1/K
Lemma 4.6. Let x > K (6e)3 , K 1. Let = 2 M , where 2 is as in

(4.36) and : [0, ) [0, ) is continuous and in L1 . Let gx, be as in (4.46).
Then gx, (r) is a decreasing function of r for r 175.
Proof. Taking derivatives, we can easily see that
log r
(log r)2 log log r
log log r
, r 7
, r 7
r
r
r
are decreasing for r 20. The same is true if log log r is replaced by (r), since
(r)/ log log r is a decreasing function for r e. Since (C,2 /||
1 )/ log K 1,
we see that it is enough to prove that r 7 Ry,t log 2r log log r/ 2r is decreasing
on r for y = x and y = x/K (under the assumption that r 175).
Looking at (4.40) and at (4.49), it remains only to check that
!r
log 8r
log log r
(4.50)
r 7 log 1 +
9x1/3
r
2 log
(4.49)
r 7
4.008r
is decreasing on r for r 175. Taking logarithms, and then derivatives, we see

that we have to show that

1+
1/3
1
+ logr8r
r
22
log 8r
2

log 1 +
log 8r
2
+
1
1
< ,
2r log r log log r
2r
9x
where = log 4.008r
. Since r x1/3 /6, log 54/4.008 > 2.6. Thus, it is enough
to ensure that
2/2.6
1

+
< 1.
(4.51)
log 8r
log 8r
log r log log r
1 + 2 log 1 + 2
Since this is true for r = 175 and the left side is decreasing on r, the inequality
is true for all r 175.
36
H. A. HELFGOTT

Lemma 4.7. Let x 1024 . Let : [0, ) [0, ) be continuous and in L1 .
Let gx, (r) and h(x) be as in (4.46) and (4.42), respectively. Then

2 0.275
gx,
x
h(x/ log x).
3
Proof. We can bound gx, (r) from below by
p
(Rx,r log 2r + 0.5) (r) + 2.5
gmx (r) =
2r
0.275
Let r = (2/3)x
. Using the assumption that x 1024 , we see that

8 0.275
log 3 x
+ 0.41415 0.67086.

Rx,r = 0.27125 log 1 +
1
93/2
0.275
3
2 log 2.004 x
Using x 1024 again, we get that
2.50637
5.72864.
log log r
Since log 2r = 0.275 log x + log 4/3, we conclude that
0.44156 log x + 4.1486
p
.
gmx (r)
4/3 x0.1375
(r) = e log log r +
Recall that
0.2727(log x)3/2 1218 log x

+
.
x1/6
x1/3
It is easy to check that (1/x0.1375 )/((log x)3/2 /x1/6 ) is increasing for x e51.43
(and hence for x 1024 ) and that (1/x0.1375 )/((log x)/x1/3 ) is increasing for
x e5.2 (and hence, again, for x 1024 ). Since
h(x) =
0.44156 log x + 2
0.2727(log x)3/2
,
>
x0.1375
x1/6
for x 1024 , we are done.
2.1486
1218 log x
>
0.1375
x
x1/3
5. Mellin transforms and smoothing functions

5.1. Exponential sums and L functions. We must show how to estimate
expressions of the form
X
S, (/x, x) =
(n)(n)e(n/x)(n/x),
n
R+
0
R+
0
where :
is a smoothing function and is bounded by a large constant.

We must also choose . Let f (t) = e(t)(t). By Mellin inversion,
Z 2+i
X
1
L (s, )
(n)(n)f (n/x) =
(5.1)
F (s)xs ds,
2i 2i
L(s, )
n=1
where F is the Mellin transform of f :

Z
dt
f (t)ts .
(5.2)
F =
t
0
The standard procedure here (already used in [HL23]) is to shift the line of
integration in (5.1) to the left, picking up the contributions of the zeros of L(s, )
37
along the way. (Once the line of integration is far enough to the left, the terms ts
within (5.1) become very small, and so the value of the integral ought to become
very small, too.)
We will assume we know the zeros of L(s, ) up to a certain height H0
meaning, in particular, that we know that non-trivial zeros with (s) H0 lie
on the critical line (s) = 1/2 but we have little control over the zeros above
H0 . (The best zero-free regions available are not by themselves strong enough
for our purposes.) Thus, we want to choose so that the Mellin transform F
decays very rapidly both for = 0 and for non-zero and bounded.
5.2. How to choose a smoothing function? The method of stationary phase
([Olv74, 4.11], [Won01, II.3])) suggests that the main contribution to (5.2)
should come when the phase has derivative 0. The phase part of (5.2) is
e(t) = t(s) = e2it+ log t
(where we write s = + i ); clearly,
(2t + log t) = 2 +
=0
t
when t = /2. This is meaningful when t 0, i.e., sgn( ) 6= sgn(). The

contribution of t = /2 to (5.2) is then
(5.3)
(t)e(t)t
s1
+i 1
multiplied by a width approximately equal to a constant divided by

p
|(2it + log t) | =
The absolute value of (5.3) is

(5.4)
2||
| /t2 | = p .
| |
1

.
2 2
In other words, if sgn( ) 6= sgn() and is not too small, asking that F ( +i )
decay rapidly as | | amounts to asking that (t) decay rapidly as t 0.
Thus, if we ask for F ( + i ) to decay rapidly as | | for all moderate , we
are requesting that
(1) (t) decay rapidly as t ,
(2) the Mellin transform F0 ( + i ) decay rapidly as .
Requirement (2) is there because we also need to consider F ( + it) for very
small, and, in particular, for = 0.
There is clearly an uncertainty-principle issue here; one cannot do arbitrarily
well in both aspects at the same time. Once we are conscious of this, the choice
(t) = et in Hardy-Littlewood actually looks fairly good: obviously, (t) = et
decays exponentially, and its Mellin transform (s + i ) also decays exponentially
as . Moreover, for this choice of , the Mellin transform F (s) can be
written explicitly: F (s) = (s)/(1 2i)s .
38
H. A. HELFGOTT
It is not hard to work out an explicit formula6 for (t) = et . However, it is

not hard to see that, for F (s) as above, F (1/2 + it) decays like et/2|| , just as
we expected from (5.4). This is a little too slow for our purposes: we will often
have to work with relatively large , and we would like to have to check the zeroes
of L functions only up to relatively low heights t. We will settle for a different
choice of : the Gaussian.
2
The decay of the Gaussian smoothing function (t) = et /2 is much faster than
exponential. Its Mellin transform is (s/2), which decays exponentially as (s)
. Moreover, the Mellin transform F (s) ( 6= 0), while not an elementary or
very commonly occurring function, equals (after a change of variables) a relatively
well-studied special function, namely, a parabolic cylinder function U (a, z) (or,
in Whittakers [Whi03] notation, Da1/2 (z)).
For not too small, the main term will indeed work out to be proportional to
2
e( /2) /2 , as the method of stationary phase indicated. This is, of course, much
better than e /2|| . The cost is that the Mellin transform (s/2) for = 0
now decays like e(/4)| | rather than e(/2)| | . This we can certainly afford
the main concern was e| |/ for 105 , say.
5.3. The Mellin transform of the twisted Gaussian. We wish to approximate the Mellin transform
Z
2
dt
et /2 e(t)ts ,
F (s) =
t
0
where R. The parabolic cylinder function U : C2 C is given by

Z
2
1 2
1
ez /4

U (a, z) =
ta 2 e 2 t zt dt
1
2 +a 0
for (a) > 1/2; this can be extended to all a, z C either by analytic continuation or by other integral representations ([AS64, 19.5], [Tem10, 12.5(i)]).
Hence

1
(i)2
(5.5)
F (s) = e
(s)U s , 2i .
2
The second argument of U is purely imaginary; it would be otherwise if a Gaussian

of non-zero mean were chosen.
Let us briefly discuss the state of knowledge up to date on Mellin transforms of
2
twisted Gaussian smoothings, that is, et /2 multiplied by an additive character
e(t). As we have just seen, these Mellin transforms are precisely the parabolic
cylinder functions U (a, z).
The function U (a, z) has been well-studied for a and z real; see, e.g., [Tem10].
Less attention has been paid to the more general case of a and z complex. The
most notable exception is by far the work of Olver [Olv58], [Olv59], [Olv61],
[Olv65]; he gave asymptotic series for U (a, z), a, z C. These were asymptotic
series in the sense of Poincare, and thus not in general convergent; they would
solve our problem if and only if they came with error term bounds. Unfortunately,
6There may be a minor gap in the literature in this respect. The explicit formula given in
[HL23, Lemma 4] does not make all constants explicit. The constants and trivial-zero terms
were fully worked out for q = 1 by [Wig20] (cited in [MV07, Exercise 12.1.1.8(c)]; the sign of
hyp,q (z) there seems to be off). As was pointed out by Landau (see [Har66, p. 628]), [HL23,
Lemma 4] actually has mistaken terms for non-primitive. (The author thanks R. C. Vaughan
for this information and the references.)
39
it would seem that all fully explicit error terms in the literature are either for a
and z real, or for a and z outside our range of interest (see both Olvers work and
[TV03].) The bounds in [Olv61] involve non-explicit constants. Thus, we will
have to find expressions with explicit error bounds ourselves. Our case is that of
a in the critical strip, z purely imaginary.
Gaussian smoothing has been used before in number theory; see, notably,
[HB79]. What is new here is that we will derive fully explicit bounds on the Mellin
transform of the twisted Gaussian. This means that the Gaussian smoothing will
be a real option in explicit work on exponential sums in number theory from now
on.
5.3.1. General approach and situation. We will use the saddle-point method (see,
e.g., [dB81, 5], [Olv74, 4.7], [Won01, II.4]) to obtain bounds with an optimal
leading-order term and small error terms. (We used the stationary-phase method
solely as an exploratory tool.)
What do we expect to obtain? Both the asymptotic expressions in [Olv59] and
the bounds in [Olv61] make clear that, if the sign of = (s) is different from that
of , there will a change in behavior when gets to be of size about (2)2 . This is
unsurprising, given our discussion using stationary phase: for |(a)| smaller than
a constant times |(z)|2 , the term proportional to e(/4)| | = e|(a)|/2 should
be dominant, whereas for |(a)| much larger than a constant times |(z)|2 , the
2
1
term proportional to e 2 ( 2 ) should be dominant.
5.3.2. Setup. We write
(5.6)
(u) =
u2
(2i)u i log u
2
for u real or complex, so that

F (s) =
e(u) u
du
.
u
We will be able to shift the contour of integration as we wish, provided that

it starts at 0 and ends at a point at infinity while keeping within the sector
arg(u) (/4, /4).
We wish to find a saddle point. At a saddle point, (u) = 0. This means that
i
= 0,
i.e.,
u2 + iu i = 0,
u
where = 2. The solutions to (u) = 0 are thus
i 2 + 4i
(5.8)
u0 =
.
2
The second derivative at u0 is

1
1
(5.9)
(u0 ) = 2 u20 + i = 2 (iu0 + 2i ).
u0
u0
(5.7)
u 2i
Assign the names u0,+ , u0, to the roots in (5.8) according to the sign in front
of the square-root (where the square-root is defined so as to have argument in
(/2, /2]).
We assume without loss of generality that 0. We shall also assume at first
that 0 (i.e., 0), as the case < 0 is much easier.
40
H. A. HELFGOTT
5.3.3. The saddle point. Let us start by estimating

s y/2
(5.10)
u0,+ e = |u0,+ | e arg(u0,+ ) ey/2 ,
where y = (iu0 ). (This is the main part of the contribution of the saddle
point, without the factor that depends on the contour.) We have
!
r

2
2
p
4i
i
i
i + 2 + 4i
(5.11)
y=
1 + 2 .
=
2
2
2
Solving a quadratic equation, we get that

r
r
r
4i
j() 1
j() + 1
+i
,
(5.12)
1 + 2 =
2
2
where j() = (1 + 2 )1/2 and = 4 /2 . Thus

!
r
2
j() + 1
y=
1 .
2
2
Let us now compute the argument of u0,+ :

p
p
arg(u0,+ ) = arg i + 2 + 4i = arg i + 1 + i
!
r
r
1 + j()
1 + j()
= arg i +
+i
2
2
1+j()
1
(5.13)
2
= arcsin s q

q
1+j()
1
2 1+j()
2
2
v
u
u1
= arcsin t
2
(by cos 2 = 1 2 sin2 ). Thus
s
!
1
2
2
= arccos
1 + j()
2
1 + j()
s
!!
r
2
y
2
j() + 1
1
arg(u0,+ ) + = arccos
2
1 + j() 2
2
2
(5.14)

2(() 1)
1
,
= arccos
()
2
p
where () = (1 + j())/2.
It is clear that

1
2(() 1)
(5.15)
lim arccos
()
2
whereas
(5.16)
as 0+ .
arccos
2(() 1)

=
()
2 4
4
41
1
We are still missing a factor of |u0,+ | (from (5.10)),
p a factor of |u0,+ | (from
the invariant differential du/u) and a factor of 1/ | (u0,+ )| (from the passage
by the saddle-point along a path of steepest descent). By (5.9), this is
|u |1
p 0,+
=
| (u0,+ )|
|u0,+ |1
|u0,+ |
p
p
=
.
1
|iu0,+ + 2i |
|iu0,+ + 2i |
|u0,+ |
By (5.8) and (5.12),

!
r

i + 2 + 4i r 1 + j()
1 + j()

+
1 i
|u0,+ | =
=

2

2
2
2
s
r
1 + j()
1 + j() 1 + j()
(5.17)
=
+
+12
2
2
2
2
s
r
p
1 + j()
=
1 + j() 2
()2 ().
=
2
2
2
Proceeding as in (5.11), we obtain that

!
r

i
4i

i + 1 + 2 + 2i
|iu0,+ + 2i | =

2

r
r
2

2
2
i
j()
+
1
j()
1

= + 2i +

2

2
2
2
2
v
(5.18)
!2
!2
u
r
r
j()
+
1
j()
1
2 u
+
= t 1 +
2
2
2
s
r
r
2
j()
+
1
j() 1
=
2
.
j() + 2 + 1 2
2
2
2
p
p
Since j() 1 = / j() + 1, this means that
(5.19)
v
s
u
2
u
2
|iu0,+ + 2i | = tj() + 2 + 1
(j() + 1 + 2 )
2
j() + 1
2 p
j() + j()2 (())1 (j() + j()2 )
2
p
2 j() p
2 p
2
1
2() j()(1 (()) ) =

()2 ().
=
2
2
Hence
|
|u0,+
p
=
|iu0,+ + 2i |

p
2 ()
()
1
2
=
1
1
(j())1/4
2 2 4 j() 4
(()2 ())1/4
21/4
3
2 2 4
2
41
2
2

1
2
j() 4
(() ())
(()2 ()) 2 4
It remains to determine the direction of steepest descent at the saddle-point

u0,+ . Let v C point in that direction. Then, by definition, v 2 (u0,+ ) is real
42
H. A. HELFGOTT
and positive, where is as in 5.6. Thus arg(v) = arg( (u0,+ ))/2. By (5.9),
arg( (u0,+ )) = arg(iu0,+ + 2i ) 2 arg(u0,+ ).
Starting as in (5.18), we obtain that
and
(5.20)
arg(iu0,+ + 2i ) = arctan
q
1 +
j1
2
j+1
2
j1
2

j+1
2
1 +
j1
2
j+1
2
p
p
2(j 1) + 2(j + 1)
=
=
j1
1 + j+1
2
p

q
2
j 2 1 + (j + 1)
+ j+1
+ 1 ( + (j + 1))
=
=
j1
j1
(j + 1)(1 + j/)
2( + j)
(1 + j/)
=
=
.
=
j1
1+
Hence, by (5.13),
arg( (u0,+ )) = arctan
2( + j)
arccos ()1 .
Therefore, the direction of steepest descent is

(5.21)
1
2v(v + j)
arg( (u0,+ ))
= arg(u0,+ ) arctan
2
2
= arg(u0,+ ) arctan ,
arg(v) =
where
(5.22)
= tan
1
2v(v + j)
arctan
.
2
Since
1
1
=
tan =
2
sin tan
we see that
(5.23)
1+
1
1
,
tan2 tan
1+ 2
2
4 ( + j)
2( + j)
Recall as well that
cos =
2
1 + cos
,
2
sin =
2
0 = arg(u0,+ ) =
1 cos
.
2
Hence, if we let
(5.24)
1
1
arccos
,
2
()
we get that
cos 0 = cos
(5.25)
sin 0 = sin
1
1
arccos
2
()
1
1
arccos
2
()
=
=
43
1
1
+
,
2 2()
1
1
.
2 2()
We will prove now the useful inequality

(5.26)
arctan > 0 ,
i.e., arg(v) < 0. By (5.21), (5.22) and

p (5.24), this is equivalent to arccos(1/)
2
arctan 2(
+ j)/. Since tan = 1/ cos 1, we know that arccos(1/) =
arctan 2 1; thus, in order to prove (5.26), it is enough to check that
p
2( + j)
2 1
.
This is easy, since j > and 2 1 < < 2.

5.3.4. The contour. We must now choose the contour of integration. First, let
us discuss our options. By (C.9), 0.79837; moreover, it is easy to show
that tends to 1 when either 0+ or . This means that neither the
simplest contour (a straight ray from the origin within the first quadrant) nor
what is arguably the second simplest contour (leaving the origin on a ray within
the first quadrant, then sliding down a circle centered at the origin, passing past
the saddle point until you reach the x-axis, which you then follow to infinity)
are acceptable: either contour passes through the saddle point on a direction
close to 45 degrees (= arctan(1)) off from the direction of steepest descent. (The
saddle-point method allows in principle for any direction less than 45 degrees off
from the direction of steepest descent, but the bounds can degrade rapidly by
more than a constant factor when 45 degrees are approached.)
It is thus best to use a curve that goes through the saddle point u+,0 in the
direction of steepest descent. We thus should use an element of a two-parameter
family of curves. The curve should also have a simple description in terms of
polar coordinates.
We decide that our contour C will be a limacon of Pascal. (Excentric circles
would have been another reasonable choice.) Let C be parameterized by

c
p
1
x = r2 y2
(5.27)
y = r + c0 r,
for r [(c0 1)/c1 , c0 /c1 ], where c0 and c1 are parameters to be set later.
The curve goes from (0, (c0 1)/c1 ) to (c0 /c1 , 0), and stays within the first
quadrant.7 In order for the curve to go through the point u0,+ , we must have
c1 r0
+ c0 = sin 0 ,
(5.28)
where
p
()2 (),
r0 = |u0,+ | =
(5.29)
2
7Because c 1, by (C.21).
0
44
H. A. HELFGOTT
and 0 and sin 0 are as in (5.24) and (5.25). We must also check that the curve
C goes through u0,+ in the direction of steepest descent. The argument of the
point (x, y) is

c r
y
1
+ c0 .
= arcsin = arcsin
r
Hence

d arcsin c1r + c0
c1 /
d
c1 r
=
=r
=r
.
r
dr
dr
cos
cos arcsin c1r + c0
This means that, if v is tangent to C at the point u0,+ ,
tan(arg(v) arg(u0,+ )) = r
d
c1 r0
=
,
dr
cos 0
and so, by (5.21),

(5.30)
c1 =
cos 0
,
r0
where is as in (5.22). In consequence,

c1 r0
c0 =
+ sin 0 = (cos 0 ) + sin 0 ,
and so, by (5.25),

r
(5.31)
c1 =
1 + 1/
,
2
c0 =
1
1
+
+
2 2
1
1
.
2 2
Incidentally, we have also shown that the arc-length infinitesimal is

s
v
r

u
(c1 r/)2
d 2
r2
u1 +
dr
=
dr = 1 +
(5.32) |du| = 1 + r
2 dr.

t
dr
cos2
c0
2
2
c1
c
1
The contour will be as follows: first we go out of the origin along a straight
radial segment C1 ; then we meet the curve C, and we follow it clockwise for a
segment C2 , with the saddle-point roughly at its midpoint; then we follow another
radial ray C3 up to infinity. For small, C3 will just be the x-axis. Both C1 and
C3 will be contained within the first quadrant; we will specify them later.
5.3.5. The integral over the main contour segment C2 . We recall that
u2
+ iu i log u.
2
Our aim is now to bound the integral
Z
e((u)) u1 du
(5.33)
(u) =
C2
over the main contour segment C2 . We will proceed as follows. First, we will
parameterize the integral using a real variable , with the value = 0 corresponding to the saddle point u = u0,+ . We will bound ((u)) from below by
an expression of the form ((u0,+ )) + 2 . We then bound |u|1 |du/d| from
above by a constant. This procedure will give a bound that is larger than the
true value by at most a (very moderate) constant factor.
45
For u = x + iy (or (r, ) in polar coordinates), (5.33) gives us
(5.34)
r 2 2y 2
y
x2 y 2
y + =
y + arcsin
2
2
r
2
4
=
0 () = 2 2 0 (),
((u)) =
where, by (5.27), (5.28), and (5.31),
( + 0 )2
(1 2(sin 0 c1 )2 )
2
arcsin(sin 0 c1 )
+ 0
(sin 0 c1 ) +
,
0 () =
and
(5.35)
r r0
,
0 =
r0
.
By (5.27), (5.28) and (5.35),

y
= c0 c1 ( + 0 ) = sin 0 c1
(5.36)
r
and so
(5.37)
c0 c1 0 = sin 0 .
The variable will range within an interval

1 sin 0 sin 0
(5.38)
[0 , 1 ]
,
.
c1
c1
(Here = (1 sin 0 )/(c1 ) corresponds to the intersection with the y-axis, and
= (sin 0 )/(c1 ) corresponds to the intersection with the x-axis.)
We work out the expansions around 0 of
(5.39)
2 cos 20
( + 0 )2
(1 2(sin 0 c1 )2 ) = 0
+ (0 cos 20 + 202 c1 sin 0 )
2
2

cos 20
2 2 2
+
+ 4c1 0 sin 0 c1 0 2
2
+ 2(0 c21 2 + c1 sin 0 ) 3 c21 2 4 ,

0 sin 0
sin 0
+ 0
(sin 0 c1 ) =
+
+ c1 0 + c1 2 ,
0
1 X Pk (sin 0 ) (c1 )k k
arcsin(sin 0 c1 )
=
+
4
4 4
k!
(cos 0 )2k1
k=1

0
c1
1
(c1 )2 sin 0 2
=
+
+
+ ... ,
4 4 cos 0
2(cos 0 )3
2
where P1 (t) = 1 and P
k+1 (t) = Pk (t)(1t )+(2k1)tP (t) for k 1. (This follows
from (arcsin z) = 1/ 1 z 2 ; it is easy to show that (arcsin z)(k) = Pk (z)(1
z 2 )(k1/2) .)
P
k
We sum these three expressions and obtain a series 0 () =
k ak . We
already know that
(1) a0 equals the value of ((u))/(2 2 ) at the saddle point u0,+ ,
(2) a1 = 0,
46
H. A. HELFGOTT
(3)
1
a2 =
2
2
dr
d
2
2
2

du
du
1
| (u0,+ )| |r=r0 = | (u0,+ )| |r=r0 .

dr
2
dr
Here, as we know from (5.9), (5.19) and (5.17),

q
r
j() 2
2

2j()
|
iu
+
2i
|
2
0,+
=
| (u0,+ )| =
=
,
2
2
2
|u0,+ |
2
2 ( )
and, by (5.31) and (5.32),
v

u
|du|
du
u
|r=r =
|
=
t1 +
r=r
0
0
|dr|
dr
(5.40)
=
=
Thus,
1
a2 =
2
2
c21
c2
1 + 12
r02
c0
c1
r0
2
2
2 ( )
1
1
2 + 2
2 =
1+
1 + c21
c21
r02
2 1 sin2 0
2
1 + 1/
1 + 2 .
2j()
(1 + 2 ),
2
where is as in (5.23).
Let us simplify our expression for 0 () somewhat. We can replace the third
series in (5.39) by a truncated Taylor series ending at k = 2, namely,

0
1
(c1 )2 sin 1 2
c1
arcsin(sin 0 c1 )
=
+
+
4
4 4 cos 0
2(cos 1 )3
for some 1 between 0 and . Then 1 [0, /2], and so
0
1 c1
arcsin(sin 0 c1 )
.
4
4 4 cos 0
Since
R() = c21 2 2 + 2(sin 0 c1 0 )c1
is a quadratic with negative leading coefficient, its minimum within [0 , 1 ] (see

(5.38)) is bounded from below by min(R((1 sin 0 )/(c1 )), R((sin 0 )/(c1 ))).
We compare

sin 0
R
= 2c3 sin 0 sin2 0 ,
c1
where c3 = sin 0 c1 0 , and

1 sin 0
= 2c3 (1 sin 0 ) (1 sin 0 )2
R
c1
.
= 2c3 sin 0 sin2 0 2c3 1 + 2 sin 0
The question is whether

1 sin 0
sin 0
R
R
= 2c3 1 + 2 sin 0
c1
c1
= 2(sin 0 c1 0 ) 1 + 2 sin 0
= 2c1 0 1
is positive. It is:
2
=
2
47
1 + 1/
,
2
2
and, as we know from (C.9), > 0.79837 is greater than 1/ 2 = 0.70710 . . . .

Hence, by (5.37),

sin 0
R() R
= 2c3 sin 0 sin2 0 = sin2 0 2c1 0 sin 0
c1
c1 0
= c1
c1 0 =
= sin2 0 2(c0 sin 0 ) sin 0 = 3 sin2 0 2c0 sin 0
= 3 sin2 0 2((cos 0 ) + sin 0 ) sin 0 = sin2 0 (sin 20 ) .
We conclude that
(5.41)
0 ()
((u0,+ ))
+ 2 ,
2 2
where
(5.42)
= a2
1 (c1 )2 sin 0
+ sin2 0 (sin 20 ) .
4 2(cos 0 )3
We can simplify this further, using

)2 sin
1 (c1
1 + 1/
0
2
= 2
4 2(cos 0 )3
8
=
1
2

1
1 3/2
2 + 2
2
p
1 1/
2
p
=
4 2 1 + 1/
2
2
=
= p
= q
4 2 1
4 j+1 j1
4 2 /4
2
2
and (by (5.25))
1
2
1
1
2 1
sin 20 = 2 sin 0 cos 0 = 2
2 =
4 4
/2
1
=
.
=
=
2
(j + 1)/2
j+1
Therefore (again by (5.25))

r
1
2 1
1
2j
2
(5.43)
=
(1
+
.
2
2
2
2 2 j + 1
Now recall that our task is to bound the integral

Z 1
Z

2 2 0 ()
1 du dr
((u))
1
e
(( + 0 ))
e
|u|
|du| =
dr d d
0
C2

(5.44)
Z 1

((u0,+ ))
2 2 2
1 du
() e
e
( + 0 )
dr d.
0
(We are using (5.34) and (5.41).) Since u0,+ is a solution to equation (5.7), we
see from (5.6) that
!
u20,+
+ iu0,+ i log u0,+
((u0,+ )) =
2

iu0,+ i
1
+
+ arg(u0,+ ) = (iu0,+ ) + arg(u0 , +).
=
2
2
2
48
H. A. HELFGOTT
We defined y = (iu0 ) (after (5.10)), and we computed y/2 arg(u0,+ ) in

(5.14). This gives us

arccos
e((u0 ,+)) = e
If 1, we can bound
( + 0 )1
(5.45)
2(1)
1

(
01
(0 + 0 )1
if 0,
if < 0,
provided that 0 + 0 > 0 (as will be the case). If > 1, then

(
01
if 0,
1
( + 0 )
1
(1 + 0 )
if > 0.
By (5.32),
s
r
2
du
(c
r/)
(c1 ( + 0 ))2
1
= 1+
=
1
+
dr
cos2
1 (sin 0 c1 )2
(This diverges as /2; this is main reason why we cannot actually follow the
curve all the way to the y-axis.) Since we are aiming at a bound that is tight
only up to an order of magnitude, we can be quite brutal here, as we were when
using (5.45): we bound (c1 r/)2 from above by its value when the curve meets
the x-axis (i.e., when r = c0 /c1 ). We bound cos2 from below by its value when
= 1 . We obtain
s
s
2
du
c20
c
0
= 1+
1
+
=
,
dr
1 (sin 0 c1 1 )2
cos2
where is the value of when = 1 .
Finally, we complete the integral in (5.44), we split it in two (depending on
whether 0 or < 0) and use
Z
Z
/2
1
2 2 2
2
d
e
e d = .
0

0
Therefore,
(5.46)
Z
e((u)) |u|1 |du|
C2

/2
c20
1
1
+
(
+
1+
= () e
j
0
0
cos2

1 ! s
arccos 1 2(1)
2
2
1
c0 e
j
=
1+
r0
,
1+ 1+
2
2
0
cos

arccos
1
2(1)
where j = 0 if 1 and j = 1 if > 1. We can set 1 = (sin 0 )/(c1 ). We

can also express 0 + 0 in terms of :
(5.47)
(c0 sin ) c1
c0 sin
r
=
=
.
0 + 0 =
c1
49
Since 0 = r0 /() (by (5.35)) and r0 is as in (5.29),

p
()2 ()
.
0 =
2
Definition (5.22) implies immediately that 1. Thus, by (5.31),
p
(5.48)
c1 0 = 2(1 + 1/) 2 2,
while, by (C.9),
(5.49)
c1 0 =
2(1 + 1/) 0.79837
By (5.47) and (5.48),

0 1
0
c1 0
2
(5.50)
1+
=
=
,
0
0 + 0
c0 sin
c0 sin
whereas

1/ 2
sin 0
1
1.62628.
1+
=1+
1+
0
c1 0
0.79837 2
We will now use some (rigorous) numerical bounds, proven in Appendix C.2.
First of all, by (C.21), c0 > 1 for all > 0; this assures us that c0 sin > 0,
and so the last expression in (5.50) is well defined. By (5.47), this also shows that
0 + 0 > 0, i.e., the curve C stays within the first quadrant for 0 /2, as
we said before.
We would also like to have an upper bound for
s

1
c20
(5.51)
,
1+
cos2

using (5.43). With this in mind, we finally choose :
(5.52)
= .
4
Thus, by (C.36),
s
s
c20
1
1 + 2c20 p
1+
min(5,
0.86)
min(
5, 0.93 ).
2
cos
We also get
2
2
7.82843.
c0 sin
1 1/ 2
Finally, by (C.39),
(
r
/6
if 4,
2

1
1
2
if > 4
2 23/2 1 23/2
2
and so, since = 4 /, = 2 and (1 1/23/2 ) 2/3, (5.29) gives us

(

2
2
if 2
=
min
, .
r0 23
3
if > 2
3
We conclude that

Z
arccos 1 2(1)
((u))
1
2
,
e
|u|
|du| = C, e
(5.53)
C2
50
H. A. HELFGOTT
where
(5.54)
C,

1

3/2
3.3
1
1
1 + max 7.83
, 1.63
= min 2,
min( /, )

for all > 0, > 0 and all . By reflection on the x-axis, the same bound holds
for < 0, < 0 and all . Lastly, (5.53) is also valid for = 0, provided we
replace (5.54) and the exponent of (5.53) by their limits as 0+ .
5.3.6. The integral over the rest of the contour. It remains to complete the contour. Since we have set = /4, C1 will be a segment of the ray at 45 degrees
from the x-axis, counterclockwise
(i.e., y = x, x 0). The segment will go from
(0, 0) up to (x, y) = (r / 2, r / 2), where, by (5.27),

y
c1
1
=
= r + c0 ,
r
and so
r =
c1
(5.55)
1
c0
2
Let w = (1 + i)/ 2. Looking at (5.6), we see that

(5.56)
Z
Z

u2 /2
s1
(u) 1

e
e(u)u
du
=
e
u
du

C1
C1
Z
Z
e((u)) |u|1 |du| =
e((tw)) t1 dt,
C1
where (u) is as in (5.33). Here

2

t

t
= + ,
i + iwt i log t + i
(5.57)
((tw)) =
2
4
4
2
and, by (C.40),

r
+ 0.076392 + =
0.30556 > 0.4798.
4
4
4
2
Consider first the case 1. Then

Z r
Z
1
((tw)) 1
e
t
dt r
0
4
2
dt r
e
r

4
2
By (5.55) and (C.40),
(5.58)
Hence, for 1,
(5.59)
C1
u2 /2
e(u)u
/2
s1

du /2 e0.4798 .
Assume now that 0 < 1, s 6= 0. We can see that it is wise to start by an

integration by parts, so as to avoid convergence problems arising from the term
t1 within the integral as 0+ . We have
Z
Z
us
2
us wr
u2 /2
s1
u2 /2
eu /2 e(u)
du.
e
e(u)u du = e
e(u) |0
s
s
C1
C1
51
By (5.57),

r
u2 /2
r
us wr

((wr )) r
4
e
2
|
e
=
e
e(u)
0

s
|s|
As for the integral,

(5.60)
Z
Z
us
us
2
u2 /2
(u + i)eu /2iu du
du =
e
e(u)
s
s
C1
C1
Z
Z
1
i
2
u2 /2
s+1
=
e
e(u)u du
eu /2 e(u)us du.
s C1
s C1
Hence, by (5.56) and (5.57),
Z
Z r
Z r
us
t
t

4 +1
u2 /2

2
t
dt +
e
e(u)
du
e
e 2 4 t dt

s
|s| 0
|s| 0
C1
!Z
+1
r t
r
r

e 2 4 dt
+
|s|
|s|
0
!
+1
r
+ r
r
2
min
, r e 2 4
+2
r
r
2r
e 2 4.
+
By (5.58),
+2
r
(1 + 2)r
2 +1 + (1 + 2) 2
+
.
We conclude that

2 +1 + (1 + 2) 2 r +
2 /2
u
s1

e 2 4
e
e(u)u
du

C1
!
1+ 2
2 e0.4798
1+
when [0, 1); by (5.59), this is true for 1 as well.

Now let us examine the contribution of the last segment C3 of the contour.
Since C2 hits the x-axis at c0 /c1 , we define C3 to be the segment of the x-axis
going from x = c0 /c1 till x = . Then
Z
Z
Z

2
dx
2
dx
2
dt

x /2
s
t /2
s
=

e
e(x)x
ex /2 x .
e
e(t)t

(5.61)

c0
c0
t
x
x
C3
c1
c1
Now

2
2
2
ex /2 x2 = ex /2 x1 ( 2)ex /2 x3

2
2
2
ex /2 (x2 + ( 2)x4 ) = ex /2 x1 ( 2)( 4)ex /2 x5
52
H. A. HELFGOTT
and so on, implying that

Z
2
dx
ex /2 x
x
t
x

x2 /2
e
x2 + ( 2)x4
2
x
+ ( 2)x4 + ( 2)( 4)x6
and so on. By (C.43),
c0
min
c1
,
4
if 0 2,
if 2 4.
if 4 6,
We conclude that

Z

2 25

2 /2
dt
min 21 ( ) , 32
t
s
P min , 5

,
e
e(t)t
e

t
4
C3
where we can set P (x) = x2 if [0, 2], P (x) = x2 +(2)x4 if [2, 4]

and P (x) = x2 + ( 2)x4 + . . . + ( 2k)x2(k+1) if [2k, 2(k + 1)].
***
We have left the case < 0 for the very end. In this case, we can afford to use
a straight ray from the origin as our contour of integration. Let C be the ray at
angle /4 from the origin, i.e., y = (tan(/4 ))x, x > 0, where > 0 is
small. Write v = e(/4)i . The integral to be estimated is
Z
2
eu /2 e(u)us1 du.
I=
C
Let us try = 0 first. Much as in (5.56) and (5.57), we obtain, for < 0,
Z
Z t
dt
+ 4 1
2
t
dt = e 4
e||t/ 2 t
e
|I|
t
0
0
! Z
!
(5.62)
2
2
dt
= e 4
et t =
() e 4
||
t
||
0
for > 0. Recall that () 1 for 0 < < 1 (because () = ( + 1) and

() 1 for all [1, 2]; the inequality () 1 for [1, 2] can in turn be
proven by (1) = (2) = 1, (1) < 0 < (2) and the convexity of ()). We
see that, while (5.62) is very good in most cases, it poses problems when either
or is close to 0.
Let us first deal with the issue of small. For general and 0,

Z t2
2 sin 2t cos( 4 )+( 4 ) 1
t
dt
e
|I|
0
( 4 )
e
=
( 4 )
e( 4 )
t
=
t
(sin 2)/2
t2 sin 2 dt
e
2/21
(sin 2)/2
ey y 2
t2
e 2 t
dt
t
dy
=
2/2 (/2)e 4 .
/2
y
2(sin 2)
Here we can choose = (arcsin 2/ )/2 (for 2). Then 2 (/2) (2/ ) =
/ , and so
(5.63)
|I|
e/2 /2
e 2
4
/2
4
(/2)
2
(/2)e
2
2(2/ )/2
53
The only issue that remains is that may be close to 0, in which case (/2)
can be large. We can resolve this, as before, by doing an integration by parts. In
general, for 1 < < 1, s 6= 0:
Z

us
2
eu /2 e(u)
|I| e
du
s
C
Z
us
2
(u + i)eu /2iu du
=
s
C
Z
Z
1
2
i
2
u /2
s+1
=
e
e(u)u du +
eu /2 e(u)us du.
s C
s C
u2 /2
(5.64)
us
e(u) |v
s 0
Now we apply (5.62) with s + 1 and s + 2 instead of s, and get that

1
|I| =
|s|
!+2
2
||
( + 2) e 4 +
||
|s|
!

2
4
+ 2 e 4 .
2
||
!+1
2
( + 1) e 4
||
Alternatively, we may apply (5.63) and obtain
1 e/2
|| e/2
(( + 2)/2) (+2)/2 e 4 +
(( + 1)/2) (+1)/2 e 4
|s| 2
|s| 2

/2
/2
||
e
1+
e 4
2
|I|
for [0, 1], where we are using the facts that (s)
(s) 1 for s [1, 2].
for s [1/2, 1] and
5.4. Totals. Summing (5.53) with the bounds obtained in 5.3.6, we obtain our
final estimate. Recall that we can reduce the case < 0 to the case > 0 by
reflection. We have proven the following statement.
2 /2
Proposition 5.1. Let f (t) = et

of f , i.e.,
F (s) =
e(t), R. Let F be the Mellin transform
2 /2
et
e(t)ts
dt
,
t
where R. Let s = + i , 0, 6= 0. Let = 2. Then, if sgn() 6=

sgn( ),
(5.65)
|F (s)| C0,, e
| |
()2
0.4798
+ C1,, e

| |

2 25
min 18 (
) , 32 | |
+ C2,, e
54
where
(5.66)
H. A. HELFGOTT

2(() 1)
1
arccos
,
()
p !

| |
= min 2, 2||
1 + max(7.831 , 1.631 )
1
E() =
2
C0,,
3.3
C1,,
C2,,
min
s
p
!
1+ 2
1 + 1 + 2
2
= 1+
,
,
() =
2

| | 5 p
= P min
,
| |
,
2|| 4
3/2
| | p
| |
2|| ,
where P (x) = x2 if [0, 2], P (x) = x2 + ( 2)x4 if (2, 4] and

P (x) = x2 + ( 2)x4 + . . . + ( 2k)x2(k+1) if (2k, 2(k + 1)].
If sgn() = sgn( ) (or = 0) and | | 2,
e 4 | | ,
|F (s)| C,
(5.67)
where
C,
e/2 /2
2
1 + 23/2 ||
(/2)
| |
for [0, 1],

for > 0 arbitrary.
2
The terms in (5.65) other than C0,, eE(| |/() )| | are usually very small.
In practice, we will apply Prop. 5.1 when | |/2|| is larger than a moderate
constant (say 8) and | | is larger than a somewhat larger constant (say 100).
Thus, C0,, will be bounded.
2
For comparison, the Mellin transform of et /2 (i.e., F0 = M f0 ) is 2s/21 (s/2),
which decays like e(/4)| | . For very small (e.g., | | < 2), it can make sense to
use the trivial bound
Z
2/2
2
dt
et /2 t = 2/21 (/2)
(5.68)
|F (s)| F0 () =
t
0
for (0, 1]. Alternatively, we could use integration by parts (much as in (5.64)),
followed by the trivial bound:
Z
us
F (s + 2) 2i
2
eu /2 e(u)
(5.69)
F (s) =
du =
F (s + 1),
s
s
s
0
and so

r
+1
+2
+ 2 2 1 |2| +1
2 2 1 +2
1 + 2||
2
2
(5.70)
|F (s)|
|s|
2
|s|
for 0 1, since 2x (x) 2 for x [1/2, 3/2].

It will be useful to have simple approximations to E() in (5.66).
Lemma 5.2. Let E() and () be as in (5.66). Then
5 3
1
(5.71)
E()
8
384
for all > 0. We can also write
sin 2

,
(5.72)
E() =
4
2
4(1 + sin )
55
where = arcsin 1/().

Clearly, (5.71) is useful for small, whereas (5.72) is useful for large (since
then is close to 0). Taking derivatives, we see that (5.72) implies that E() is
decreasing on ; thus, E() is increasing on . Note that (5.71) gives us that

!

5
2
1 2
| |
.
1
| |
(5.73)
E
()2
2 2
48 4 ||2
Proof. Let = arccos 1/(). Then () = 1/(cos ), whereas
p
2
1 + 2 = 2 2 () 1 =
1,
cos2
s
r
2
2
4
4
(5.74)
=
1 1 =
cos2
cos4 cos2
2 sin
2 1 cos2
=
.
=
2
cos
cos2
Thus
(5.75)

2 cos1 1
(1 cos ) cos
(1 cos2 ) cos
=
2E() =
=
2 sin
sin
sin (1 + cos )
cos2
=
sin 2
sin cos
=
.
1 + cos
4 cos2 2
By (C.44) and (5.74), this implies that

2E()
53
,
4 24 8
giving us (5.71).
To obtain (5.72), simply define = /2 ; the desired inequality follows
from the last two steps of (5.75).

Let us end by a remark that may be relevant to applications outside number
theory. By (5.5), Proposition 5.1 gives us bounds on the parabolic cylinder function U (a, z) for z purely imaginary and |(a)| 1/2. The bounds are useful when
|(a)| is at least somewhat larger than |(a)| (i.e., when | | is large compared to
). As we have seen in the above, extending the result to broader bands for a is
not too hard integration by parts can be used to push a to the right.
6. Explicit formulas
An explicit formula is an expression restating a sum such as S, (/x, x) as a
sum of the Mellin transform G (s) over the zeros of the L function L(s, ). More
specifically, for us, G (s) is the Mellin transform of (s)e(s) for some smoothing
function and some R. We want a formula whose error terms are good both
for very close or equal to 0 and for farther away from 0. (Indeed, our choices
of will be made so that F (s) decays rapidly in both cases.)
2
We will derive two explicit formulas: one for (t) = t2 et /(2) , and the other
for = + , where + is defined as in (4.7) for some value of H > 0 to be set
2
later. As we have already discussed, both functions are variants of (t) = et /2 .
Thus, our expressions for G will be based on our study of the Mellin transform
F of e(t) in 5.
56
H. A. HELFGOTT
6.1. A general explicit formula.

1
+
Lemma 6.1. Let : R+
0 R be in C . Let x R , R. Let be a primitive
character mod q, q 1.
Write G (s) for the Mellin transform of (t)e(t). Assume that G is holomorphic on {s : 1/2 (s) 1 + }. Then

X
X
n (n/x) = Iq=1 b()x

G ()x
(n)(n)e
x
(6.1)
n=1
+ O (|G (0)|) + O (log q + 6.01) (| |2 + 2||||2 ) x1/2 ,
where
Iq=1
(
1
=
0
if q = 1,
if q =
6 1.
and
P the norms ||2 , | |2 are taken with respect to the usual measure dt. The sum
is a sum over all non-trivial zeros of L(s, ).
Proof. Since G (s) is defined for (s) [1/2, 1 + ],

Z 1++i
X
1
L (s, )
(n)(n)e(n/x)(n/x) =
G (s)xs ds
2i 1+i
L(s, )
n=1
(see [HL23, Lemma 1] or, e.g., [MV07, p. 144]). We shift the line of integration
to (s) = 1/2. If q = 1 or (1) = 1, then L(s, ) does not have a zero at
s = 0; if q > 1 and (1) = 1, then L(s, ) has a simple zero at 0 (as discussed
in, e.g., [Dav67, 19]). Thus, we obtain
Z 2+i
X
1
L (s, )
G (s)xs ds = Iq=1 G (1)x
G ()x I1, G (0)
2i 2i
L(s, )
(6.2)
Z 1/2+i
L (s, )
1
G (s)xs ds,
2i 1/2i L(s, )
where I1, = 1 if q > 1 and (1) = 1 and I1, = 0 otherwise. Of course,
Z
(t)e(t)dt = b().
G (1) = M ((t)e(t))(1) =
0
It is time to estimate the integral on the right side of (6.2). By the functional
equation (as in, e.g., [IK04, Thm. 4.15]),

s+
1s+
1
1
L (1 s, )
L (s, )
= log
,

(6.3)
L(s, )
q
2
2
2
2
L(1 s, )
where (s) = (s)/(s) and
(
0
=
1
if (1) = 1
if (1) = 1.
By (1 x) (x) = cot x (immediate from (s)(1 s) = / sin s) and

(s) + (s + 1/2) = 2((2s) log 2) (Legendre),

(s + )
s+
1s+
1
.
+
= (1 s) + log 2 + cot
(6.4)
2
2
2
2
2
57
Now, if (z) = 3/2, then |t2 + z 2 | 9/4 for all real t. Hence, by [OLBC10,
(5.9.15)] and [GR00, (3.411.1)],
Z
tdt
1
2
(z) = log z
2
2
2z
(t + z )(e2t 1)
0
!
Z
tdt
1
+ 2 O
= log z
9 2t
2z
1)
0
4 (e
Z

1
tdt
8
(6.5)
= log z
+ O
2z 9
e2t 1

0
8
1
1
+ O
(2)(2)
= log z
2z 9
(2)2

1
1
10
= log z
+O
= log z + O
.
2z
27
27
Thus, in particular, (1 s) = log(3/2 i ) + O (10/27), where we write s =

1/2 + i . Now

i

e 4 2 + e 4 i+ 2

(s
+
)
=
cot
e 4 i 2 e 4 i+ 2 = 1.

2
Since (s) = 1/2, a comparison of Dirichlet series gives

L (1 s, ) | (3/2)|

(6.6)
L(1 s, ) |(3/2)| 1.50524,
where (3/2) and (3/2) can be evaluated by Euler-Maclaurin. Therefore, (6.3)

and (6.4) give us that, for s = 1/2 + i ,

L (s, )
log q + log 3 + i + 10 + log 2 + + 1.50524

27

L(s, )
2
2

(6.7)

q 1
9

log + log 2 +
+ 4.1396.
2
4
Recall that we must bound the integral on the right side of (6.2). The absolute
value of the integral is at most x1/2 times

Z 1 +i

L (s, )
2
1

G (s) ds.
(6.8)

2 1 i L(s, )
2
By Cauchy-Schwarz, this is at most

v
v
u Z 1 +i
u Z 1 +i
2
u
u 1

2
2
L
(s,
)
1
t

|ds| t 1
|G (s)s|2 |ds|
2 1 i L(s, ) s
2 1 i
2
By (6.7),
v
v
uZ 1 +i
uZ 1 +i

(s, ) 1 2
u
u

log q 2
2
2
L
t
t

L(s, ) s |ds|
s |ds|
21 i
12 i
v
uZ

u 1 log 2 + 9 + 4.1396 + log 2
t
2
4
+
d
1
2
4 +
2 log q + 226.844,
58
H. A. HELFGOTT
where we compute the last integral numerically.8

By (2.5), G (s)s is the Mellin transform of
(6.9)
d(e(t)(t))
= 2ite(t)(t) te(t) (t)
dt
Hence, by Plancherel (as in (2.4)),

(6.10)
v
sZ
u Z 1 +i
u 1
2
t
|G (s)s|2 |ds| =
|2ite(t)(t) te(t) (t)|2 t2 dt
2 1 i
0
2
sZ
sZ
= 2||
|(t)|2 dt +
| (t)|2 dt.
Thus, (6.8) is at most

log q +
226.844
2

| |2 + 2||||2 .
It now remains to bound the sum G ()x in (6.1). Clearly

X
X

G ()x
|G ()| x() .

Recall that these are sums over the non-trivial zeros of L(s, ).
We first prove a general lemma on sums of values of functions on the non-trivial
zeros of L(s, ).
Lemma 6.2. Let f : R+ C be piecewise C 1 . Assume limt f (t)t log t = 0.
Then, for any y 1,
Z
X
qT
1
dT
f (T ) log
f (()) =
2 y
2
(6.11)
where
(6.12)
non-trivial
()>y

Z

1

+ O |f (y)|g (y) +
f (T ) g (T )dT ,
2
y
g (T ) = 0.5 log qT + 17.7
If f is real-valued and decreasing on [y, ), the second line of (6.11) equals

Z
f (T )
1
dT .
O
4 y
T
Proof. Write N (T, ) for the number of non-trivial zeros of L(s, ) with |(s)|
T . Write N + (T, ) for the number of (necessarily non-trivial) zeros of L(s, )
8By a rigorous integration from = 100000 to = 100000 using VNODE-LP [Ned06],
which runs on the PROFIL/BIAS interval arithmetic package[Kn

u99].
59
with 0 < (s) T . Then, for any f : R+ C with f piecewise differentiable

and limt f (t)N (T, ) = 0,
Z
X
f (T ) dN + (T, )
f (()) =
y
:()>y
1
=
2
f (T )(N + (T, ) N + (y, ))dT
f (T )(N (T, ) N (y, ))dT.
Now, by [Ros41, Thms. 1719] and [McC84, Thm. 2.1] (see also [Tru, Thm. 1]),
T
qT
log
+ O (g (T ))
2e
for T 1, where g (T ) is as in (6.12). (This is a classical formula; the references
serve to prove the explicit form (6.12) for the error term g (T ).)
Thus, for y 1,

Z
X
T
qT
y
qy
1
log
log
f (T )
dT
f (()) =
2 y
2e
2e
:()>y
(6.14)

Z

1

f (T ) g (T )dT .
+ O |f (y)|g (y) +
2
y
(6.13)
N (T, ) =
Here
(6.15)
f (T )
T
qT
y
qy
log
log
2e
2e
1
dT =
2
f (T ) log
qT
dT.
2
If f is real-valued and decreasing (and so, by limt f (t) = 0, non-negative),

Z
Z

f (T )g (T )dT
f (T ) g (T )dT = f (y)g (y)
|f (y)|g (y) +
y
= 0.5
since
g (T )
f (T )
dT,
T
0.5/T for all T T0 .
Lemma 6.3. Let : R+

0 R be such that both (t) and (log t)(t) lie in L1 L2
(with respect to dt). Let R. Let G (s) be the Mellin transform of (t)e(t).
Let be a primitive character mod q, q 1. Let T0 1. Assume that all
non-trivial zeros of L(s, ) with |()| T0 lie on the critical line. Then
X
|G ()|
non-trivial
|()|T0
is at most
p
p
(||2 + | log |2 ) T0 log qT0 + (17.21| log |2 (log 2 e)||2 ) T0

+ (t)/ t (1.32 log q + 34.5)
1
Proof. For s = 1/2 + i , we have the trivial bound

Z

dt
|(t)|t1/2 = (t)/ t ,
(6.16)
|G (s)|
t
1
0
60
H. A. HELFGOTT
where F is as in (6.19). We also have the trivial bound

(6.17)
Z
Z

dt
s dt
(log t)(t)t1

|(log
t)(t)|t
=
|G (s)| =
(log t)(t)t
1
t
t
0
0
for s = + i .
Let us start by bounding the contribution of very low-lying zeros (|()| 1).
By (6.13) and (6.12),
N (1, ) =
1
q
log
+ O (0.5 log q + 17.7) = O (0.819 log q + 16.8).
2e
Therefore,
X
non-trivial
|()|1

|G ()| (t)t1/2 (0.819 log q + 16.8).
1
Let us now consider zeros with |()| > 1. Apply Lemma 6.2 with y = 1 and
(
|G (1/2 + it)| if t T0 ,
f (t) =
0
if t > T0 .
This gives us that
X
:1<|()|T0
(6.18)
T0
qT
dT
2
1

Z
+ O |f (1)|g (1) +
|f (T )| g (T ) dT ,
f (()) =
f (T ) log
where we are using the fact that f ( + i ) = f ( i ) (because is real-valued).

By Cauchy-Schwarz,
s
s

Z
Z
Z
qT
qT 2
1 T0
1 T0
1 T0
2
dT .
f (T ) log
dT
|f (T )| dT
log
1
2
1
1
2
Now
2
Z
Z
Z

1
1 T0
2
G 1 + iT dT
|e(t)(t)|2 dt = ||22
|f (T )| dT

1
2
2
0
by Plancherel (as in (2.4)). We also have

Z qT0
Z T0
2
2
qT 2
dT
log
(log t)2 dt
2
q
1
0
Hence
1
T0
qT
f (T ) log
dT
2
s
qT0
log
2e
2
qT0
log
2e
+ 1 ||2
Again by Cauchy-Schwarz,
Z
|f (T )| g (T ) dT
1
2
|f (T )|2 dT
2
+1
T0
T0 .
T0 .
|g (T )|2 dT .
61
Since |f (T )| = |G (1/2 + iT )| and (M ) (s) is the Mellin transform of log(t)

e(t)(t) (by (2.5)),
Z
1
|f (T )|2 dT = |(t) log(t)|2 .
2
Much as before,
Z
Z T0
|g (T )|2 dT
T0
(0.5 log qT + 17.7)2 dT
= (0.25(log qT0 )2 + 17.2(log qT0 ) + 296.09)T0 .
Summing, we obtain
Z
Z
qT
1 T0
|f (T )| g (T ) dT
dT +
f (T ) log
1
2
1

p
1
log qT0
qT0
+
+ 17.21 |(t)(log t)|2
log
||2 +
T0
2e 2
2
Finally, by (6.16) and (6.12),

|f (1)|g (1) (t)/ t (0.5 log q + 17.7).
1
By (6.18) and the assumption that all non-trivial zeros with |()| T0 lie on
the line (s) = 1/2, we conclude that
X
p
|G ()| (||2 + | log |2 ) T0 log qT0
non-trivial
1<|()|T0
+ (17.21| log |2 (log 2 e)||2 ) T0

+ (t)/ t (0.5 log q + 17.7).
1
6.2. Sums and decay for (t) = t2 et /2 and (t). Let

(
2
t2 et /2 if t 0,
(t) =
0
if t < 0,
(6.19)
2
t
F (s) = (M (e 2 e(t)))(s),
G (s) = (M ((t)e(t)))(s).
Then, by the definition of the Mellin transform,
G (s) = F (s + 2).
Hence
|G (0)| |(M )(0)| =
and
||22 =
3
,
8
| log |22 0.16364,
2 /2
tet
dt = 1
7
,
16
21/4 (1/4)
1.07791.
|(t)/ t|1 =
4
| |22 =
62
H. A. HELFGOTT
2
Lemma 6.4. Let (t) = t2 et /2 . Let x R+ , R. Let be a primitive

character mod q, q 1. Assume that all non-trivial zeros of L(s, ) with
|()| T0 satisfy (s) = 1/2. Assume that T0 max(4 2 ||, 100).
Write G (s) for the Mellin transform of (t)e(t). Then
!
T2
X
qT0
0.1065 0 2
0.1598T0
()
.
3.5e
+ 2.56e
|G ()| T0 log
2
non-trivial
|()|>T0
Here we have preferred a bound with a simple form. It is probably feasible to

derive from the results in 5 a bound essentially proportional to eE()T0 , where
= T0 /()2 and E() is as in (5.66). (This would behave as e(/4) for large
2
and as e0.125(T0 /()) for small.)
Proof. First of all,
X
non-trivial
|()|>T0
|G ()| =
non-trivial
()>T0
(|F ( + 2)| + |F ((1 ) + 2)|) ,
where we are using G () = F ( + 2) and the functional equation (which implies

that non-trivial zeros come in pairs , 1 ).
By Prop. 5.1, given = + i , [0, 1], max(1, 2||), we obtain that
|F ( + 2)| + |F ((1 ) + 2)|
is at most
E
C0,, e
| |
()2
+ C1, e
where E() is as in (5.66),
C0,, 2 (1 + 1.632 )
C1,
0.4798
min

2 25
min 81 (
) , 32 | |
+ C2, e
2
| | p
| |
2|| ,
3/2
3.251 min
+ C e 4 | | ,

2 2
, | |
3
!

3
1+ 2
| | 5 p
1+
| | 2 , C2, min
,
| | + 1,
| |
|| 4
e/2 | |3/2
2
and = 2.
For T0 100,
(6.20)
and so
!
1+ 2
e/2 ( 0.4798)| |
1+
+
e 4
1.025
| |
2
C1, e0.4798| | + C e 4 | | 1.025| |3/2 e0.4798| | .
It is clear that this is a decreasing function of | | for | | > 3/(2 0.4798) (and
hence for | | 100).
We bound
(
E(1.5) 0.1598
if 1.5,
(6.21)
E() E(1.5)
1.5 0.1065 if < 1.5.
This holds for 1.5 because E() is increasing on , for 1.19 because of
(5.71), and for [1.19, 1.5] by the bisection method (with 20 iterations).
63
The bound (6.21) implies immediately that

E
(6.22)
| |
()2
Since T0 max(4 2 ||, 100),

1
e 8 ( ) 0.055e0.1598 3 ( ) ,
(6.23)
and

2
0.1598 min 23 (
) ,| |
25
e 32 1.1 1028 e0.1598 ,

2

0.055 1
1
| |
| |
2 | | 2
+1
+
0.0039
0.055
||
4
2 4
3

p
1
5
5 1
+
| | 0.0075| |.
0.055
| | + 1 0.055
4
4 10 100

Hence
(6.24)
C0,, e
| |
()2

2 25
min 81 (
) , 32 | |
+ C2, e
C0,,

2
0.1598 min 32 (
) ,| |
where

2 2
= 3.26 min
, | | .
3
It is clear that the right side of (6.24) is a decreasing function of when
!

2 2
1
min
, | |
3
0.1598
C0,,
(and hence when T0 max(4 2 ||, 100)).

We thus have
X
X
|G ()|
where
(6.25)
non-trivial
|()|>T0
f ( ) =
C0,,
non-trivial
()>T0

2
0.1598 min 23 (
) ,| |
f (()),
+ 1.025| |3/2 e0.478| |
is a decreasing function of for T0 .

We can now apply Lemma 6.2. We obtain that

Z
X
1
qT
1
f (())
f (T )
log
+
dT.
2
2
4T
T0
non-trivial
()>T0
If || 4, then the condition T0 4 2 || implies ()2 , and so

min(( /)2 , | |) = | |. In that case, the contribution of the term in (6.25)
is at most
involving C0,,

Z
1
qT
1
3.26
log
+
T e0.1598T dT
(6.26)
2
2
4T
T0
is at most
If || > 4, the contribution of C0,,

Z
qT
1
1
log
+
T e0.1598T dT
3.26
(6.27)
2
4T
max(T0 ,()2 ) 2
64
H. A. HELFGOTT
plus (if T0 < ()2 )

2
Z ()2
2
T
qT
1
1
0.1598 23 T 2
() dT
log
+
e
3.26
2
2
4T 2 2
T0

Z
(6.28)
q||t
1
1
2
log
+
t2 e0.1065t dt.
3.26||
T0
2
2
4||t
||
For any y 1, c, c1 > 0,

Z
Z
2
t2 ect dt <

1
y
1
2
ct2
+
dt
=
e
ecy ,
2 t2
2y
4c
2c
4c
y
y

Z
Z
at log et log et
a
2
2
ct2
2
dt
(t log t + c1 t) e
t log t +
2 ect dt
2c
2c
4c t
y
y
t2 +
(2cy + a) log y + a cy2

e
,
4c2
where
a=
log ey
2c
y log ey
4c12 y
2c
c1 y +
c1 y + 4c21y2
1
+ y log ey
.
y
12
2c
4c y
Setting c = 0.1065, c1 = 1/(2||) 8 and y = T0 /(||) 4, we obtain

Z
q||t
1
1
2
log
+
t2 e0.1065t dt
T0
2
2
4||t
||

T0 2
T0
q||
1
1
0.1065 ||
log
+
2
2
2c|| 4c2 4

T0
T0
2c
+
a
log ||
+ a 0.1065 T0 2
1
||
||
+
e
2
4c2
and
4
1
1
8 + 40.10652 (4)2
+ 4 log 4e
a
0.088.
1
4
20.1065 40.10652 4

T0 2
0.1065 ||
times
Multiplying by 3.26||, we get that (6.28) is at most e

T0
eT0
q||
+ 2.44T0 log
+ 3.2|| log
(2.44T0 + 2.86||) log
2
||
||
!
(6.29)
1
1 + log T0 /||
qT0
qT0
T0 log
2.56T0 log
,
2.44 + 3.2
T0 /||
2
2
where we are using several times the assumption that T0 4 2 ||.
Let us now go back to (6.26). For any y 1, c, c1 > 0,

Z
1
y
ct
+
ecy ,
te dt =
c c2
y

Z
Z
c1 ct
a1
a
1
t log t +
e dt
t+
log t 2 ect dt
t
c
c c t
y
y
y
a cy
+
e
log y,
c c2
where
a=
log y
c1
1
c + c + y
log y
1
c c2 y
65
Setting c = 0.1598, c1 = /2, y = T0 100, we obtain that

Z
1
qT
1
log
+
T e0.1598T dT
2
2
4T
T0

T0
T0
1
a
q
1
+ 2 +
+ 2 log T0 e0.1598T0
log
2
2
c
c
c
c
and
a
/2
log T0
1
0.1598 + 0.1598 + T0
log T0
1
0.1598 0.15982 T0
1.235.
Multiplying by 3.26 and simplifying, we obtain that (6.26) is at most

qT0 0.1148T0
e
.
2
Obviously, (6.27) is bounded above by (6.26), and hence it is bounded above by
(6.30) as well.
(6.30)
3.5T0 log
Proposition 6.5. Let (t) = t2 et /2 . Let x R+ , R. Let be a primitive

character mod q, q 1. Assume that all non-trivial zeros of L(s, ) with
|()| T0 lie on the critical line. Assume that T0 max(4 2 ||, 100).
Then
(6.31)
(

X
b()x + O (err, (, x)) x if q = 1,
n (n/x) =
(n)(n)e
x
O (err, (, x)) x
if q > 1,
n=1
where
!
T02
qT0
0.1065
()2
3.5e0.1598T0 + 2.56e
err, (, x) = T0 log
2

p
p
+ 1.22 T0 log qT0 + 5.056 T0 + 1.423 log q + 38.19 x1/2
+ (log q + 6.01) (0.89 + 5.13||) x3/2 .
Proof. Immediate from Lemma 6.1, Lemma 6.3 and Lemma 6.4.
Now that we have Prop. 6.5, we can derive from it similar bounds for a smoothing defined as the multiplicative convolution of with something else just as
we discussed at the beginning of 4.3.
2
Corollary 6.6. Let (t) = t2 et /2 , 1 = 2 I[1/2,1] , 2 = 1 M 1 . Let =

2 M . Let x R+ , R. Let be a primitive character mod q, q 1.
Assume that all non-trivial zeros of L(s, ) with |()| T0 lie on the critical
line. Assume that T0 max(4 2 ||, 100).
Then
(6.32)
(

X
b ()x + O (err , (, x)) x if q = 1,
n (n/x) =
(n)(n)e
x
O (err , (, x)) x
if q > 1,
n=1
66
H. A. HELFGOTT
where
(6.33)
!
T2
qT0
0.1065 0 2
0.1598T0
()
3.5e
+ 0.0074 e
err, (, x) = T0 log
2

p
p
1
+ 1.22 T0 log qT0 + 5.056 T0 + 1.423 log q + 38.19 x 2
+ (log q + 6.01) (0.89 + 2.7||) x3/2 .
Proof. The left side of (6.32) equals

Z X
dw
n
n
2 (w)
(n)(n)e
x
wx
w
0 n=1
=
1X
1
4
(n)(n)e
n=1
wn
wx

dw
n
2 (w) ,
wx
w
since 2 is supported on [1/4, 1]. By Prop. 6.5, the main term (if q = 1)
contributes
Z
Z 1
dw
b(w)2 (w)dw
=x
b(w)xw 2 (w)
1
w
0
4
Z Z
Z Z
dt
r
e(r) 2 (w)dw
=x
(t)e(wt)dt2 (w)dw = x
w
w
0
0

Z Z
r
dw
=x
2 (w)
e(r)dr = b () x.
w
w
0
The error term is
Z 1
Z 1
dw
=x
err, (w, xw) wx 2 (w)
err, (w, xw)2 (w)dw.
(6.34)
1
1
w
4
4
R
R
Since w 2 (w)dw = 1, w w2 (w)dw = (3/4) log 2 and9

Z
0.1065(4)2 12 1
w
2 (w)dw 0.002866,
e
w
we see that (6.34) implies (6.33).
6.3. Sums and decay for + (t). We will work with

(6.35)
2 /2
(t) = + (t) = hH (t) (t) = hH (t)et
where hH is as in (4.10). Due to the sharp truncation in the Mellin transform

M hH (see 4.2) and the pole of M (s) at s = 0, the Mellin transform M + (s)
of + (t) has unpleasant singularities at s = iH. In consequence, we must use a
different contour of integration from the one we used before. This will require us
to rework our explicit formula (Lemma 6.1) somewhat. We will need to assume
that the non-trivial zeros of L(s, ) lie on the critical line up to a height T0 ; we
would have needed to make the same assumption later anyhow.
9By rigorous integration from 1/4 to 1/2 and from 1/2 to 1 using VNODE-LP [Ned06].
67
1
Lemma 6.7. Let : R+
0 R be in C . Let x 1, R. Let be a primitive
character mod q, q 1.
Let H 3/2, T0 > H + 1. Assume that all non-trivial zeros of L(s, ) with
|()| T0 satisfy (s) = 1/2.
Write G (s) for the Mellin transform of (t)e(t). Assume that G is holomorphic on {s : 1/5 (s) 1 + } {s : 1/2 (s) 1/5, |(s)| T0 1}.
Then

X
X
n (n/x) = Iq=1 b()x

G ()x
(n)(n)e
x
n=1
(6.36)
+ O ((7.91 log q + 82.7) (2|||(t)t7/10 |2 + | (t)t7/10 |2 )) x1/5

7 log H 7 log q
+ O
+
+ 11.04 max |G ( + iH )|x
10
2
[ 21 , 15 ]

log q + log H + 6.71
+O
(2||||2 + | |2 ) x1/2 ,
H
where
Iq=1
(
1
=
0
if q = 1,
if q =
6 1,
H = T0 1
and
P the norms ||2 , | |2 are taken with respect to the usual measure dt. The sum
is a sum over all non-trivial zeros of L(s, ).
Proof. We start just as in the proof of Lem. 6.1, except we shift the integral only
up to (s) = 1/5 in the central interval, and push it up to (s) = 1/2 only in
the tails:
(6.37)
n=1
(n)(n)e(n/x)(n/x) = Iq=1 b()x
1
2i
L (s, )
L(s, )
G ()x
G (s)xs ds,
where I1, = 1 if q > 1 and (1) = 1 and I1, = 0 otherwise, and C is the
contour consisting of
(1) a segment C1 = [1/5 iH , 1/5 + iH ], where H = T0 1 > H,
(2) the union C2 of two horizontal segments:

1
1
1
1
C2 = iH , iH + iH , + iH ,
2
5
2
5

(3) the union C3 of two rays

1
1
1
1
C3 = i, iH + iH , + i .
2
2
2
2
We can still use (6.3) to estimate L (s, )/L(s, ) (given L (1s, )/L(1s, )).
We estimate (z) for (z) 4/5 much as in (6.5): since |t2 + z 2 | 16/25 for t
68
H. A. HELFGOTT
real,
1
+ 2 O
(z) = log z
2z
1
2z
1
= log z
2z
= log z
tdt
16 2t
25 (e
1)

1
50
+
O
(2)(2)
16
(2)2

25
145
+O
= log z + O
192
192
0
For s with (s) = 1/5,

e 10
i 2
10
i+ 2
i 2
10
s
+
e
e

1
+
2
=
cot = i

e 10 2 e 10 i+ 2
2
e 10 i 2 e 10 i+ 2

e 10 i + e 10

i

i
=
3.07769.
cot

e 10 e 10 i
10
(The inequality is clear for 0; the case < 0 follows by symmetry.) Similarly,

(s
+
1)
= cot 3 0.32492.
cot

2
5
For (s) arbitrary and | | = |(s)| H ,

e 2 H + e 2 H

s
H

.
cot H
= coth
e 2
2
2
e 2 H
We now must estimate L (s, )/L(s, ), where (s) 4/5 and |(s)| H .
By a lemma of Landaus (see, e.g., [MV07, Lemma 6.3], where the constants
are easily made explicit) based on the Borel-Caratheodory Lemma (as in [MV07,
Lemma 6.2]), any function f analytic and zero-free on a disc Cs0 ,R = {s : |ss0|
R} of radius R > 0 around s0 satisfies
f (s)
= O
f (s)
2R log M/|f (s0 )|

(R r)2
for all s with |s s0 | r, where 0 < r < R and M is the maximum of |f (z)|
on Cs0 ,R . We set s0 = 3/2 + i (where = (s)), r = 1/2 + 1/5 = 7/10, and
let R 1 , using the assumption that L(s, ) has no non-trivial zeroes off the
critical line with imaginary part H + 1 = T0 .
We obtain

maxsCs0 ,1 |L(s, )|
200
L (s, )
= O
log
.
(6.38)
L(s, )
9
|L(s0 , )|
Clearly,
|L(s0 , )|
Y
Y (1 p3 )1
(3)
(1 + p3/2 )1 =
0.46013.
=
3/2
1
(3/2)
(1 p
)
p
p
By partial summation, for s = + it with 1/2 and any N Z+ ,
X
X
L(s, ) =
(m)ns
(m) (N + 1)s
nN
nN +1
mN
mn
(m) (ns (n + 1)s+1 )
!

N 11/2
1
= O 3 N + M (q)/ N ,
+N
+ M (q)N
=O
1 1/2
P

where M (q) = maxn mn (m). We set N = M (q)/3, and obtain
p
|L(s, )| 2M (q)N 1/2 = 2 3 M (q).
We can afford to use the trivial bound M (q) q. We conclude that

!

2
3
q
L (s, )
100
=O
= O 8 log
log q + 44.8596 .
L(s, )
(3)/(3/2)
9
Therefore, by (6.3) and (6.4),
(6.39)

L (s, )
145
= log log(1 s) + O
L(s, )
q
192

+ O (4 log q + 44.8596)
cot
+ log 2 + O
2
10
= log(1 s) + O (5 log q + 52.2871)
for (s) = 1/5, |(s)| H ,
(6.40)

L (s, )
145
= log log(1 s) + O
L(s, )
q
192

H
coth
+ O (4 log q + 44.8596)
+ log 2 + O
2
2
= log(1 s) + O (5 log q + 49.1654).
for (s) [1/2, 1/5], 1 |(s)| H , and (as in (6.7))
L (s, )
log(1 s) + O (log q + 5.2844).
L(s, )
(6.41)
Let 1 j 3. By Cauchy-Schwarz,
Z

1

L (s, )

G (s)ds

2 Cj L(s, )

is at most
1
2
Cj
s Z

L (s, ) 1 2
1

L(s, ) s |ds| 2
Cj
|G (s)s|2 |ds|.
By (6.39) and (6.40),

sZ
sZ
sZ

L (s, ) 2 |ds|
5 log q 2
| log |1 s|| + c2,j 2

|ds|,

s |ds| +

|s|2

s
Cj L(s, )
Cj
Cj
where c1,1 = 5, c1,2 = 5, c1,3 = 1, c2,1 = 52.2871, c2,2 = 49.1654, c2,3 = 5.2844.
69
70
H. A. HELFGOTT
For j = 1,

5 log q 2
2
1
2

s |ds| = (5 log q) 10 tan 5H 125(log q) ,
C1

Z H 1
Z
2 + 16 + c 2
| log |1 s|| + c2,j 2
log
2,j
2
25
|ds|

d 42949.3,

1
s
+
2
H
C1
25
Z
where we have computed the last integral numerically. As before, G (s)s is the
Mellin transform of (6.9). Hence
v
s Z
u Z 1 +i
u 1
5
1
2
|G (s)s| |ds| t
|G (s)s|2 |ds|
1
2 C1
2 i
5
sZ
|2ite(t)(t) te(t) (t)|2 t3/5 dt
2|||(t)t7/10 |2 + | (t)t7/10 |2 .
Hence,
Z
1

2
is at most
C1

L (s, )
G (s)ds
L(s, )
(7.9057 log q + 82.678) (2|||(t)t7/10 |2 + | (t)t7/10 |2 ).
For j = 3,

Z
log q 2
|ds|
2(log q)2

|ds| = 2(log q)2
<
,
s
2
H
1/2+iH |s|
C3
2

Z 1
Z
9
2

| log |1 s|| + c2,j 2
2 log + 4 + c2,j
|ds| 2

d

1
2
s
H
C3
4 +
2

Z log + log 2 + c2,j
2
d
2
2
H
Z
2(log H + 6.71)2
.
H
provided that H 3/2. Now, as in (6.10),

s Z
1
|G (s)s|2 |ds| 2||||2 + | |2 .
2 C3
Hence,
Z
1

2
C1

log q + log H + 6.71
L (s, )
G (s)ds
(2||||2 + | |2 ).
L(s, )
H
Lastly, for j = 2,

Z
L (s, )
1

|ds|
2 C2 L(s, )
1
2
+ 51
(log |1 s| + O (5 log q + c2,2 ))
7(log H + 5 log q + 49.51198)
.
10
71
Lemma 6.8. Let = + be as in (6.35) for some H 5. Let x R+ ,

R. Let be a primitive character mod q, q 1. Assume that all nontrivial zeros of L(s, ) with |()| T0 satisfy (s) = 1/2, where T0
H + max(4 2 ||, H/2, 100).
Write G (s) for the Mellin transform of (t)e(t). Then
!
(T H)2
X
qT0
0.1065 0 2
0.1598(T0 H)
()
.
3.44e
+ 0.63||e
|G ()| H log
2
non-trivial
|()|>T0
Proof. Clearly,
non-trivial
|()|>T0
|G ()| =
non-trivial
()>T0
(|G ()| + |G (1 )|) .

2
Let F be as in (6.19). Then, since + (t)e(t) = hH (t)et /2 e(t), where hH is as

in (4.10), we see by (2.3) that
Z H
1
G (s) =
M h(ir)F (s ir)dr,
2 H
where F is as in (6.19), and so, since |M h(ir)| = |M h(ir)|,

Z H
1
|M h(ir)|(|F (ir)|+|F (1(ir))|)dr.
(6.42) |G ()|+|G (1 )|
2 H
We now proceed much as in the proof of Lem. 6.4. By Prop. 5.1, given s =
+ i , [0, 1], 4 2 max(1, ||),
|F ()| + |F (1 )|
is at most
E
C0,, e
| |
()2
0.4798
+ C1, e
where E() is as in (5.66),
C0,, 2 (1 + 7.831 )
C2, min
1
4
| |
||
2
3/2
2

2 25
min 81 (
) , 32 | |
+ C2, e
1
4.217,
!1
25
,
, | |
16
For T0 H 100,
!
1
1+ 2
e/2 1/2
2 +
1+
2
and so
C1, =
e/2 1/2
2
2 3/2 ||
1+ p
| |
+ C e 4 | | ,
!
1
1+ 2
2,
1+
!
2 3/2 ||
1+ p
.
| |
e( 4 0.4798)| |
1.025| |1/2 + 1.5 1012 || 0.033| |
C1, e0.4798| | + C e 4 | | 0.033| |e0.4798| | .

It is clear that this is increasing for | | 100.
We bound E() as in (6.21). Inequalities (6.22) and (6.23) still hold. We also
see that, for | | T0 H max(4 2 ||, 100),
0.055 C2, 0.055 min(4 2 , 2500/16)1 0.0014.
72
H. A. HELFGOTT
Thus,
E
C0,, e
| |
()2

2 25
min 81 (
) , 32 | |
+ C2, e

2
0.1598 min 23 (
) ,| |
4.22e
and so |F ()| + |F (1 )| g( ), where

2
0.1598 min 32 (
) ,| |
g( ) = 4.22e
+ 0.033| |e0.4798| |
is decreasing for T0 H. Recall that, by (4.24),

1
2
|M hH (ir)|dr 1.99301
H
.
2
Therefore, using (6.42), we conclude that

|G ()| + |G (1 )| f ( ),
for = + i , > 0, where
f ( ) = 0.7951 H g( H)
is decreasing for T0 (because g( ) is decreasing for T0 H).
We apply Lemma 6.2, and get that

Z
X
qT
1
1
log
+
dT
|G ()|
f (T )
2
2
4T
T0
non-trivial
|()|>T0
= 0.7951 H
T0
g(T H)
qT
1
1
log
+
2
2
4T
dT.
We continue as in the proof of Lemma 6.4, only the integrals are somewhat
simpler this time. For any c > 0,

Z
qT
1
1
c(T H)
e
log
+
dT
2
2
4T
T0
(6.43)

1
1
qT0
1
1
log
+
+
ec(T0 H) .
2
2c
2
2c
4c T0
At the same time,

Z
H 2
1
qT0
1
23 c( T
)
log
+
dT
e
2
2
4T
T0

||
qT0

Z
H 2
2 log 2 +
||
q||t
1
23 c(t
)
=
log
+
e
dt
4 T0 H
T0
2
2
4t
3 c ||
||
4T0
||
since T0 4 2 e > 2e. For c1 > 0, since

1 + c11T0
1
qT
T + 1/c1
+
+
log
2c1
2
4c1
2c21
ec1 (T H) ,
23 c
T0 H
2
73
we see that
Z
Te
T0
(6.44)

qT
1
1
log
+
dT
2
2
4T
!

1 + cT10
qT0 1
1
ec1 (T0 H)
+
T0 log
+
2c
2
c
4c
!

1 + c11T0
1
qT0
1
+
T0 log
+
ec1 (T0 H) .
2c1
2
c1
4c1
c1 (T H)
We set c = 0.1598, c1 = 0.4798. Since T0 max(100 + H, 3H/2), the ratio of the

right side of (6.44) to the right side of (6.43) is at most
max
c c2
,
c1 c21

1
e(c1 c0 )(T0 H)
T0 +
c1

c
1
3(T0 H) +
e(c1 c0 )(T0 H) 1.3 1012 .
c1
c1
We also see that

0.7951 4.22
1
1
1 2c
1
2 + 4c
+
0
2c T0 log qT
2
and, since T0 H 4 2 ||,

0.7951 4.22
1
2
0
log qT
2 +
4 T0 H
3 c ||
4T0
3.35533
1
2
0.62992 log
3.4303
4T0 log
4
3c
T0
2
qT0
2
qT0
.
2
We conclude that

Z
qT
1
1
log
+
dT
0.7951
g(T H)
2
2
4T
T0
qT0
log
log
0.1065
3.44e0.1598(T0 H) + 0.63||e
(T0 H)2
()2
.
Proposition 6.9. Let = + be as in (6.35) for some H 50. Let x 103 ,

R. Let be a primitive character mod q, q 1. Assume that all nontrivial zeros of L(s, ) with |()| T0 lie on the critical line, where T0
H + max(4 2 ||, H/2, 100).
Then
(6.45)
(

X
c
+ ()x + O err+ , (, x) x if q = 1,

n + (n/x) =
(n)(n)e
err
x
O
(,
x)
x
if q > 1,
,
+
n=1
74
H. A. HELFGOTT
where
(6.46)
!
(T H)2
qT0
0.1065 0 2
0.1598(T0 H)
()
3.44e
+ 0.63||e
err+ , (, x) = H log
2
p
1
+O (((0.641 + 1.11 H) log qT0 + (1.5 + 19.1 H)) T0 + 1.65 log q + 44)x 2

4
+O ||(40.2 log q + 420) + H(0.015 log T0 + 15.6 log q + 163) x 5 .
Proof. We apply Lemmas 6.7, Lemma 6.3 and Lemma 6.8. We bound the norms
involving + using the estimates in 4.2.2. The error terms in (6.36) total at most
((7.91 log q + 82.7)(5.074|| + 1.953 H)) x1/5

+(0.23 log T + 1.12 log q + 11.04)x1/5 max |G ( + i(T 1))|
(6.47)
[ 12 , 15 ]
+(0.05 log q + 0.542)(4.027|| + 0.876 H)x1/2
Since x 103 , the last line of (6.47) is easily absorbed into the first line of (6.47)
(by a change in the last significant digits). Much as in the proof of Lem. 6.8, we
bound
Z
1
|G ( + i(T 1))|
|M hH (ir)|dr max |F ( + i )|
T 1H
2
0.3172 H max |F ( + i )|,

T 1H
where F is as in (6.19). We can obtain an easy bound for F applicable for

arbitrary as follows:

2

d
t2
e(t)
(s) = M (t2 e(t) + 2ite(t)) (t) (s),
sF (s) = M t
e
dt
and so
dt
| (t)t+1 |1 + 2||| (t)t |1
t
0

+1
+1
/2
2
+1 +2
=2
.
||
2
2
|sF (s)|
|t + 2i| (t)t+1
Since [1/2, 1/5] and x 103 , a quick verification gives that x 2/2 (/2+1)
and x 2(+1)/2 || (( + 1)/2) are maximal when = 1/5. Hence
|F (s)|x
and so
1.01964 + 7.09119|| 1/5 1.01964 + 7.09119|| 1/5

x
x 99
2
|s|
100 max(4 ||, 100)
(0.0103 + 0.18144)x1/5 0.19174x1/5 ,
|G ( + i(T 1))| 0.3172 H 0.19174 0.061 H.

We conclude that (6.47) is at most
(5.075|| + 1.954 H) (7.91 log q + 82.7) x1/5
(6.48)
+ 0.061 H (0.23 log T + 1.12 log q + 11.04) x1/5
(||(40.2 log q + 420) + H(0.015 log T + 15.6 log q + 163))x1/5 .

75
Finally, let us prove a simple result that will allow us to compute a key 2
norm.
Proposition 6.10. Let = + be as in (6.35), H 50. Let x 106 . Assume
that all non-trivial zeros of the Riemann zeta function (s) with |()| T0 lie
on the critical line, where T0 2H + max(H, 100).
Then
Z
X
2
2
+
(t) log xt dt + O (err2 ,+ ) x log x,
(n)(log n)+
(n/x) = x
(6.49)
0
n=1
where
err2 ,+
(6.50)
p
(log T0 )2
+ 0.224 log T0 H T0 e(T0 2H)/4
= 0.311
log x
p
+ (6.2 H + 5.3) T0 log T0 x1/2 + 419.3 Hx4/5 .
Proof. We will need to consider two smoothing functions, namely, +,0 (t) =
+ (t)2 and +,1 = + (t)2 log t. Clearly,
2
(n)(log n)+
(n/x) = (log x)
(n)+,0 (n/x) +
Since + (t) = hH
2
(t)et /2 ,
(n)+,1 (n/x).
n=1
n=1
n=1
+,1 (r) = h2H (t)(log t)et .
+,0 (r) = h2H (t)et ,
Let +,2 = (log x)+,0 + +,1 .

2
The Mellin transform of et is (s/2)/2; by (2.5), this implies that the Mellin
2
transform of (log t)et is (s/2)/4. Hence, by (2.3),

Z
s ir
1
2
M hH (ir) Fx
dr,
(6.51)
M +,0 (s) =
4
2
where
1
Fx (s) = (log x)(s) + (s).
2
(6.52)
Moreover,
(6.53)
M h2H (ir)
1
=
2
M hH (iu)M hH (i(r u))du,
and so M h2H (ir) is supported on [2H, 2H]. We also see that |M h2H (ir)|1
|M hH (ir)|21 /2. We know that |M hH (ir)|21 /2 1.993012 H by (4.24).
Hence
(6.54)
Z
1
|M (h2H )(ir)|dr max |Fx ((s ir)/2)|
|M +,0 (s)|
4
|r|2H
1.993012
H max |Fx ((s ir)/2)| 0.3161H max |Fx ((s ir)/2)|.
4
|r|2H
|r|2H
By [OLBC10, 5.6.9] (Stirling with explicit constants),
(6.55)
|(s)| 2|s|(s)1/2 e|(s)|/2 e1/6|z| ,
and so
(6.56)
p
|(s)| 2.511 |(s)|e|(s)|/2
76
H. A. HELFGOTT
for s C with 1 (s) 1 and |(s)| 99. Moreover, by [OLBC10, 5.11.2]

and the remarks at the beginning of [OLBC10, 5.11(ii)],

(s)
1
1
= log s
+O
(s)
2s
cos3 /2
for | arg(s)| < ( (, )). Again, s = + i with 1 1 and | | 99,
this gives us
(s)
= log | | + O (3.02).
(s)
Hence, under the same conditions on s,
1
log | | + 1.51)(s)
2
p
1
2.511((log x) + log | | + 1.51) | |e| |/2 .
2
|Fx (s)| ((log x) +
(6.57)
Thus, by (6.54),
r
1
| |
He(| |2H)/4
(6.58) |M +,2 ()| 0.7938H((log x) + log | | + 1.17)
2
2
for = + i with | | T0 1 2H + 99 and 1 1.
We now apply Lemma 6.7 with = +,2 , = 0 and trivial. We obtain that
(6.59)
X
(n)+,2 (n/x)
n=1
Z
+ O
+ O
+,2 (t)dt x

(t)t7/10 |2 x1/5
M +,2 ()x + O 82.7|+,2

7 log T0
+ 11.04 max |M +,2 ( + i(T0 1))| x
10
[ 12 , 15 ]
!
log T0 + 6.71
p
|+,2 |2 x1/2
(T0 1)
2 , we can bound
Since +,2 = (log xt)+
2 3/10
|+,2
(t)t7/10 |2 |2+ +
(log xt)t7/10 |2 + |+
t
|2
7/10
2(|+ | log x + |+ (t) log t| )|+
t
|2 + |+ | |+ t3/10 |2
2(1.26499 log x + 0.43088) 1.95201 H + 1.26499 0.66241
4.93855 H log x + 1.68217 H + 0.83795
by (4.31), (4.32), (4.22) and (4.19), using the assumption H 50. Similarly,
|+,2
|2 2(|+ | log x + |+ (t) log t| )|+
|2 + |+ | |+ |2
2(1.26499 log x + 0.43088) 0.87531 H + 1.26499 0.80075
2.21452 H log x + 0.75431 H + 1.01295
77
by (4.32), (4.21) and (4.19). Hence, since H 50 and x 106 ,

log T0 + 6.71
|+,2 |2 x(1/2+1/5)
82.7|+,2
(t)t7/10 |2 x1/5 + p
(T0 1)
(408.5 H log x + 139.2 H + 69.3)x1/5
+ (1.07 H log x + 0.363 H + 0.487)x1/2 419.29 Hx1/5 log x.

We bound M +,2 by (6.58):
(6.60)

7 log T0
+ 11.04 |M +,2 ( + i(T0 1))|
10
1
(0.1769 log T0 + 8.7628)(log x + log T0 + 1.17)H
2
T0
He(T0 12H)/4 .
2
Since T0 3(T0 2H), H T0 /3 and T0 2H 100, this gives us that

7 log T0
+ 11.04 |M +,2 ( + i(T0 1))| 3.54 1030 log x + 1.542 1029
10
5 1030 log x
for 1 1 and x 106 . Thus, the error terms in (6.59) total at most
419.3 Hx1/5 log x.

It is time to bound the contribution of the zeros. The contribution of the zeros
up to T0 gets bounded by Lemma 6.3:
X
p
|M +,2 ()| (|+,2 |2 + |+,2 log |2 ) T0 log T0
non-trivial
|()|T0
+ 17.21|+,2 log |2

T0 + 34.5 +,2 (t)/ t
2 , we bound the norms here as follows:

Since +,2 = (log xt)+
|+,2 |2 (|+ | log x + |+ log | )|+ |2

1.014 log x + 0.346 1.04 log x,
|+,2 log |2 (|+ | log x + |+ log | )|+ log |2
(1.404 log x + 0.479) H 1.44 H log x,
|+,2 / t|1 (|+ | log x + |+ log | )|+ / t|1

1.679 log x + 0.572 1.721 log x,
where we use the bounds
|+ |
1.265, |+ log | 0.431, |+ |2 0.8008,
|+ log |2 1.1096 H, |(t)/ t|1 1.3267 from (4.31), (4.32) and (4.17) (with
H 50) and (4.26). (We also use the assumption x 106 .) Since T0 200, this
means that
X
p
|M +,2 ()| (6.2 H + 5.3) T0 log T0 log x.

non-trivial
|()|T0
78
H. A. HELFGOTT
To bound the contribution of the zeros beyond T0 , we apply Lemma 6.2, and
get that

Z
X
T
1
1
log
+
dT,
|M +,2 ()|
(6.61)
f (T )
2
2 4T
T0
non-trivial
|()|>T0
where
r
1
| | (| |2H)/4
f (T ) = 1.5876H ((log x) + log | | + 1.17)
e
2
2
(see (6.58)). Since T T0 200, we know that ((1/2) log(T /2) + 1/4T )
0.216 log T . In general,

Z
T0
4 p
2/
2
2
2 T /4
2
T (log T ) e
dT
T0 (log T0 ) + (log e T0 ) e 4 ,
T0
T
p

Z0
T0
4
2/
T /4
2
T (log T )e
dT
T0 (log T0 ) + log e T0 e 4 ,
T0
T0

p
Z
T
4
2/
0
T eT /4 dT =
T0 +
e 4 ;
T
0
T0
T0 (log T0 )2 eT0 /4 ,
for T0 200, the quantities on the right
are
at
most
1.281
1.279 T0 (log T0 )eT0 /4 , and 1.278 T0 eT0 /4 , respectively. Thus, (6.61) gives
us that
X
|M +,2 ()| 1.281 (0.216 log T0 ) f (T0 )
non-trivial
|()|>T0
(0.311(log T0 )2 + 0.224(log x)(log T0 ))H
T0 e(T0 2H)/4 .
6.4. A verification of zeros and its consequences. David Platt verified in

his doctoral thesis [Pla11], that, for every primitive character of conductor
q 105 , all the non-trivial zeroes of L(s, ) with imaginary part 108 /q lie on
the critical line, i.e., have real part exactly 1/2. (We call this a GRH verification
up to 108 /q.)
In work undertaken in coordination with the present project [Plab], Platt has
extended this computations to
all odd q 3 105 , with Tq = 108 /q,
all even q 4 105 , with Tq = max(108 /q, 200 + 7.5 107 /q).
The method used was rigorous; its implementation uses interval arithmetic.
Let us see what this verification gives when used as an input to Cor. 6.6 and
Prop. 6.9. Since we intend to apply these results to the estimation of (3.36), our
main goal is to bound
q | err, (, x)|
(6.62)
E,r,0 =
max
mod q
qgcd(q,2)r
||gcd(q,2)0 r/2q
for = and = + . In the case of + , we will be able to assume x x+ = 1029 .

In the case of , we will prefer to work with a smaller x; thus, we make only the
assumption x x = 1026 . Since we will use Platts input, we set r = 150000.
79
(Note that Platts calculations really allow us to go up to r = 200000.) We also

set 0 = 8.
In general,
4r
q gcd(q, 2) r 2r, ||
.
q/ gcd(q, 2)
To work with Cor. 6.6, we set
T0 =
5 107
.
q/ gcd(q, 2)
Thus
T0
1000
5 107
=
,
150000
3
5 107
1000
T0
=
= 26.525823 . . .
4r
12
and so
0.1065
3.5 e0.1598T0 + 0.0074 e
T02
()2
2.575 1023 .
Since there are no primitive characters of modulus 2, q 4r. Examining

(6.33), we obtain
108
108
q err , log
2.575 1023
q
2

8
8
8
+ 1.22 10 log 10 + 5.056 10 + 1.423 300000 log 300000 + 38.19 300000
1 +
3/2
x 2 + (log 300000 + 6.01) (0.89 300000 + 2.7 4 300000) x
4.743 1014 + 3.0604 108 + 6.035 1032 = 3.061 108 .

To work with Prop. 6.9, we set
T0 = H +
3.75 107
,
q/ gcd(q, 2)
H = 200.
Thus
T0 H
3.75 107
= 250,
150000
3.75 107
750
T0 H
=
= 19.89436 . . .
4r
12
and also
qT0 2r H + 7.5 107 1.35 108 .
Hence
3.44
0.1598(T0 H)
2re
0.1065
+ 0.63 4r e
(T0 H)2
()2
1.953 1013 .
80
H. A. HELFGOTT
Examining (6.46), we get
q err+ , (, x) 200 log 1.35 108 1.953 1013

1

+ (16.339 log 1.35 108 + 271.62) 1.35 108 + 64.81 2r x+ 2
4

+ 3708r + 200r 360.01 x+ 5

5.1707 1011 + 2.1331 108 + 3.5219 1015 2.139 108 .
We record our final conclusions: for E,r,0 defined as in (6.62) and r = 150000,
E ,r,8 3.061 108 ,
(6.63)
E+ ,r,8 2.139 108 ,
where we assume x x = 1026 when bounding E ,r,8 and x x+ = 1029 when

bounding E+ ,r,8 .
Let us optimize things a little more carefully for the trivial character T . We
will make the stronger assumption x x1 = 4.51029 . We wish to bound ET+ ,4r ,
where
ET,s = max | err,T (, x)|.
||s
We will go up to a height T0 = H + 600000 t, where H = 200 and t 10. Then

200 + 600000t
T0 H
t.
4r
Hence
0.1598(T0 H)
3.44e
0.1065
+ 0.63||e
(T0 H)2
()2
101300000 + 156000e0.1065t .
Looking at (6.46), we get

T0 1300000
2
200 log
10
+ 378000e0.1065t
2

p
1/2
+ O (16.339 log T0 + 271.615) T0 + 44 x1

4/5
+ O 2.52 108 + 200(0.015 log T0 + 163) x1 .
ET+ ,4r
We choose t = 20; this gives T0 3.77 107 , which is certainly within the checked
range. We obtain
(6.64)
ET+ ,4r 5.122 109 .
for r = 150000 and x x1 = 4.5 1029 .

Lastly, let us look at the sum estimated in (6.49). Here it will be enough
to go up to just T0 = 3H = 600. We make, again, the strong assumption
x x1 = 4.5 1029 . We look at (6.50) and obtain

(log 600)2
+ 0.224 log 600 200 600e50
err2 ,+ 0.311
log x1
(6.65)
1/2
4/5
+ 419.3 200x1
+ (6.2 200 + 5.3) 600 log 600 x1
4.8 1065 + 2.17 1011 + 1.13 1020 2.2 1011 .
81
It remains only to estimate the integral in (6.49). First of all,

Z
Z
2
2 (t) log xt dt
+ (t) log xt dt =
0
0
Z
Z
(+ (t) (t))2 log xt dt.
(+ (t) (t)) (t) log xt dt +
+2
0
The main term will be given by

Z

2 (t) log xt dt = 0.64020599736635 + O 1014 log x
0

0.021094778698867 + O 1015 ,
where the
were computed rigorously using VNODE-LP [Ned06]. (The
R integrals
2
integral 0 (t)dt can also be computed symbolically.) By Cauchy-Schwarz and
the triangle inequality,
Z
(+ (t) (t)) (t) log xt dt |+ |2 | (t) log xt|2
0
|+ |2 (| |2 log x + | log |2 )
547.56
(0.80013 log x + 0.04574)
H 7/2
3.873 106 log x + 2.214 107 ,
where we are using (4.14) and evaluate | log |2 rigorously as above. (We are
also using the assumption x x1 to bound 1/ log x.) By (4.14) and (4.15),
Z
547.56
480.394
(+ (t) (t))2 log xt dt
log x +
7/2
H
H 7/2
0
6
4.8398 10 log x + 4.25 106 .
We conclude that
(6.66)
Z
2
+
(t) log xt dt
= (0.640206 + O (1.2589 105 )) log x 0.0210948 + O (8.7042 106 )
We add to this the error term 2.2 1011 log x from (6.65), and simplify using the
assumption x x1 . We obtain:
(6.67)
n=1
2
(n/x) = (0.6402 + O (2 105 ))x log x 0.0211x.
(n)(log n)+
7. The integral of the triple product over the minor arcs

7.1. The L2 norm over arcs: variations
on the large sieve for primes.
R
We are trying to estimate an integral R/Z |S()|3 d. Rather than bound it
by |S| |S|22 , we can use
R the fact that large (major) values of S() have to
be multiplied only by M |S()|2 d, where M is a union
R (small in measure) of
minor Rarcs. Now, can we give an upper bound for M |S()|2 d better than
|S|22 = R/Z |S()|2 d?
The first version of [Hel] gave an estimate on that integral using a technique due
to Heath-Brown, which in turn rests on an inequality of Montgomerys ([Mon71,
(3.9)]; see also, e.g., [IK04, Lem. 7.15]). The technique was communicated by
82
H. A. HELFGOTT
Heath-Brown to the present author, who communicated it to Tao ([Tao, Lem. 4.6]
and adjoining comments). We will be able to do better than that estimate here.
The role played by Montgomerys inequality in Heath-Browns method is played
here by a result of Ramares ([Ram09, Thm. 2.1]; see also [Ram09, Thm. 5.2]).
The following Proposition is based on Ramares result, or rather on one possible
proof of it. Instead of using the result as stated in [Ram09], we will actually be
using elements of the proof of [Bom74, Thm. 7A], credited to Selberg. Simply
integrating Ramares inequality would give a non-trivial if slightly worse bound.
on the primes. Assume
Proposition 7.1. Let {an }
n=1 , an C, be supported
x.
Let
Q0 1, 0 1 be such
that {an } is in L1 L2 and
that
a
=
0
for
n
n
p
2
that 0 Q0 x/2; set Q = x/20 Q0 . Let
(7.1)
M=
[ a 0 r a 0 r
, +
.
q
qx q
qx
qr a mod q
(a,q)=1
Let S() =
n an e(n)
for R/Z. Then
|S()|2 d
max max
qQ0 sQ0 /q
Gq (Q0 /sq)
Gq (Q/sq)
X
n
|an |2 ,
where
(7.2)
X 2 (r)
.
(r)
Gq (R) =
rR
(r,q)=1
Proof. By (7.1),
(7.3)
|S()| d =
X Z
qQ0
0 Q0
qx
0 Q0
qx

2
X a

d.
S
+
a mod q
(a,q)=1
Thanks to the last equations of [Bom74, p. 24] and [Bom74, p. 25],

X a 2
= 1
S

q
(q)
a mod q
(a,q)=1
X
q |q
(q ,q/q )=1
2 (q/q )=1
2

X X

an (n)

mod q
83
for every q x, where we use the assumption that n is prime and > x (and
thus coprime to q) when an =
6 0. Hence
2

Z 0 Q0
Z

X
X
qx
1 X

2
an e(n)(n) d
q
|S()| d =

0 Q0 (q)

n
qQ0
q
(q )
q
(q )
q Q0
q Q0
q |q
(q ,q/q )=1
2 (q/q )=1
2 (r)
(r)
rQ0 /q
(r,q )=1
0 Q0
q x
0 Q0
q x
qx
2

X X

an e(n)(n) d

0 Q0
q rx
Q
q0 rx0 mod q

0
Q
r q0 min 1, ||x
2 (r)
(r)
2

X X

an e(n)(n) d

mod q
(r,q )=1
Here || 0 Q0 /q x implies (Q0 /q)0 /||x 1. Therefore,

Z
Gq (Q0 /sq )
2
max
|S()| d max
,
(7.4)
q Q0 sQ0 /q Gq (Q/sq )
M
where
=
q Q0
q
(q )
0 Q0
q x
0 Q0
q x

0
r qQ min 1, ||x
2 (r)
(r)
mod q
(r,q )=1
qQ
2

X X

a
e(n)(n)
d

n

2

0 Q

X X
X 2 (r) Z qrx
q

an e(n)(n) d.

(q)
(r) 0 Q
n
rQ/q
(r,q)=1
qrx
mod q
As stated in the proof of [Bom74, Thm. 7A],

(r)(n) ()cr (n) =
qr
X
(b)e2in qr
b=1
(b,qr)=1
for primitive of modulus q. Here cr (n) stands for the Ramanujan sum
X
cr (n) =
e2nu/r .
u mod r
(u,r)=1
For n coprime to r, cr (n) = (r). Since is primitive, | ()| =
r x coprime to q,
2

2 qr

X
X

b

.
(b)S
an e(n)(n) =
q

n
qr
b=1

(b,qr)=1
q. Hence, for
84
H. A. HELFGOTT
Thus,
=
X X 2 (r)
(rq)
qQ rQ/q
(r,q)=1
qQ
1
(q)
XZ
0 Q
qx
qQ
0 Q
qx
0 Q
qx
0 Q
qx

2

qr
X X
b

(b)S
d

qr
mod q b=1

(b,qr)=1
0 Q
qrx
Q
0
qrx
2

q

X X
b

(b)S
d

q

mod q b=1

(b,q)=1
2
q
X

S b d.

q
b=1
(b,q)=1
Let us now p
check that the intervals (b/q 0 Q/qx, b/q + 0 Q/qx) do not overlap.
Since Q = x/20 , we see that 0 Q/qx = 1/2qQ. The difference between two
distinct fractions b/q, b /q is at least 1/qq . For q, q Q, 1/qq 1/2qQ +
1/2Qq . Hence the intervals around b/q and b /q do not overlap. We conclude
that
2 X
Z

S b =
|an |2 ,

q
R/Z
n
and so, by (7.4), we are done.
We will actually use Prop. 7.1 in the slightly modified form given by the following statement.
on the primes. Assume
Proposition 7.2. Let {an }
that {an } is in L1 L2 and

that
a
=
0
for
n
x.
Let
Q0 1, 0 1 be such
n
p
2
that 0 Q0 x/2;
P set Q = x/20 Q0 . Let M = M0 ,Q0 be as in (3.5).
Let S() = n an e(n) for R/Z. Then
Z
Gq (2Q0 /sq) X
|S()|2 d max max

|an |2 ,
q2Q
G
(2Q/sq)
s2Q
/q
0
0
q
M ,Q
n
0
q even
where
(7.5)
X 2 (r)
.
(r)
Gq (R) =
rR
(r,q)=1
Proof. By (3.5),
Z
|S()| d =
X Z
qQ0
q odd
0 Q0
2qx
0 Q0
2qx
X Z
0 Q0
qx
qQ0
q even
0 Q0
qx
2
X a

d
S
+
a mod q
(a,q)=1

2
X a

S
+ d.

q
a mod q
(a,q)=1
85
R
We proceed as in the proof of Prop. 7.1. We still have (7.3). Hence M |S()|2 d
equals
2

Q

X q Z 2q0 x0
X
2 (r) X X

a
e(n)(n)
d

n
)
0 Q0

(q
(r)

Q
q Q0
q odd
q 2Q0
q even
2q x
mod q
0
r q0 min 1, 2||x
(r,2q )=1
q
(q )
0 Q0
q x
0 Q0
q x
2Q0
q

0
min 1, 2||x
2 (r)
(r)
2

X X

an e(n)(n) d.

mod q
(r,q )=1
(The sum with q odd and r even is equal to the first sum; hence the factor of 2
in front.) Therefore,
G2q (Q0 /sq )
max
|S()|2 d max
21
q Q0 sQ0 /q G2q (Q/sq )

M
(7.6)
q odd
+ max
max
q 2Q0 s2Q0 /q
q even
where
1 =
qQ
q odd
qQ
q odd
2 =
q2Q
q even
q
(q)
rQ/q
(r,2q)=1
2 (r)
(r)
0
2qrx
mod q
qrx
mod q
0
X 2 (r) Z qrx
q
(q)
(r) 0 Q
Q
r2Q/q
(r,q)=1
qrx
2

X X

an e(n)(n) d.

r2Q/q
(r,q)=1
r even
Gq (2Q0 /sq )
2 ,
Gq (2Q/sq )
2

X X

an e(n)(n) d

0 Q
2qrx
0
X 2 (r) Z qrx
q
(q)
(r) 0 Q
2

X X

an e(n)(n) d.

mod q
The two expressions within parentheses in (7.6) are actually equal.

Much as before, using [Bom74, Thm. 7A], we obtain that
0 Q
2
q
X
X 1 Z 2qx

S b d,
1

(q) 0 Q
q
qQ
q odd
1 + 2
q2Q
q even
2qx
1
(q)
0 Q
qx
0 Q
qx
b=1
(b,q)=1
2
q
X

S b d.

q
b=1
(b,q)=1
Let us now check that the intervals of integration (b/q 0 Q/2qx, b/q + 0 Q/2qx)
(for q odd), (b/q 0 Q/qx, b/q + 0 Q/qx) (for q even) do not overlap. Recall
that 0 Q/qx = 1/2qQ. The absolute value of the difference between two distinct
fractions b/q, b /q is at least 1/qq . For q, q Q odd, this is larger than 1/4qQ +
86
H. A. HELFGOTT
1/4Qq , and so the intervals do not overlap. For q Q odd and q 2Q even (or
vice versa), 1/qq 1/4qQ + 1/2Qq , and so, again the intervals do not overlap.
If q Q and q Q are both even, then |b/q b /q | is actually 2/qq . Clearly,
2/qq 1/2qQ + 1/2Qq , and so again there is no overlap. We conclude that
2 X
Z

S b =
21 + 2
|an |2 .

q
R/Z
n
7.2. Bounding the quotient in the large sieve for primes. The estimate
given by Proposition 7.1 involves the quotient
Gq (Q0 /sq)
,
(7.7)
max max
qQ0 sQ0 /q Gq (Q/sq)
where Gq is as in (7.2). The appearance of such a quotient (at least for s = 1) is
typical of Ramares version of the large sieve for primes; see, e.g., [Ram09]. We
will see how to bound such a quotient in a way that is essentially optimal, not
just asymptotically, but also in the ranges that are most relevant to us. (This
includes, for example, Q0 106 , Q 1015 .)
As the present work shows, Ramares work gives bounds that are, in some
contexts, better than those of other large sieves for primes by a constant factor
(approaching e = 1.78107 . . . ). Thus, giving a fully explicit and nearly optimal
bound for (7.7) is a task of clear general relevance, besides being needed for our
main goal.
We will obtain bounds for Gq (Q0 /sq)/Gq (Q/sq) when Q0 2 1010 , Q Q20 .
As we shall see, our bounds will be best when s = q = 1 or, sometimes, when
s = 1 and q = 2 instead.
P
Write G(R) for G1 (R) = rR 2 (r)/(r). We will need several estimates for
Gq (R) and G(R). As stated in [Ram95, Lemma 3.4],
(7.8)
G(R) log R + 1.4709
for R 1. By [MV73, Lem. 7],

(7.9)
G(R) log R + 1.07
for R 6. There is also the trivial bound

X 2 (r) Y
X 2 (r)
1 1
1
G(R) =
(r)
r
p
rR
rR
p|r
(7.10)
X1
X 2 (r) Y X 1
> log R.
=
r
pj
r
rR
p|r j1
rR
The following bound, also well-known and easy,

q
(7.11)
G(R)
Gq (R) G(Rq),
(q)
P
can be obtained by multiplying Gq (R) = rR:(r,q)=1 2 (r)/(r) term-by-term
Q
by q/(q) = p|q (1 + 1/(p)).
We will also use Ramares estimate from [Ram95, Lem. 3.4]:

X
log p
(d)
log R + cE +
+ O 7.284R1/3 f1 (d)
(7.12)
Gd (R) =
d
p
p|d
for all d Z+ and all R 1, where

(7.13)
f1 (d) =
Y
p|d
and
(7.14)
p1/3 + p2/3
(1 + p2/3 ) 1 +
p(p 1)
cE = +
X
p2
by [RS62, (2.11)].
If R 182, then
(7.15)
87
!1
log p
= 1.3325822 . . .
p(p 1)
log R + 1.312 G(R) log R + 1.354,
where the upper bound is valid for R 120. This is true by (7.12) for R 4 107 ;
we check (7.15) for 120 R 4 107 by a numerical computation.10 Similarly,
for R 200,
log R + 1.661
log R + 1.698
(7.16)
G2 (R)
2
2
by (7.12) for R 1.6108 , and by a numerical computation for 200 R 1.6108 .
Write = (log Q0 )/(log Q) 1. We obtain immediately from (7.16) that
(7.17)
log Q0 + 1.661
G2 (Q0 )
G2 (Q)
log Q + 1.698
for Q, Q0 200.
Let us start by giving an easy bound, off from the truth by a factor of about
e (like some other versions of the large sieve). First, we need a simple explicit
lemma.
Lemma 7.3. Let m 1, q 1. Then
Y
p
(7.18)
e (log(m + log q) + 0.65771).
p1
p|qpm
Q
Proof. Let P = pmp|q p. Then, by [RS75, (5.1)],
P
Y
Pq
p = qe pm log p qe(1+0 )m ,
pm
where 0 = 0.001102. Now, by [RS62, (3.42)],

2.50637
2.50637
n
e log log n +
e log log x +
(n)
log log n
log log x
m
for all x n 27. Hence, if qe 27,

P
2.50637
e log((1 + 0 )m + log q) +
(P)
log(m + log q)

2.50637/e
e log(m + log q) + 0 +
.
log(m + log q)
Thus (7.18) holds when m + log q 8.53, since then 0 + (2.50637/e )/ log(m +
log q) 0.65771. We verify all choices of m, q 1 with m + log q 8.53 computationally; the worst case is that of m = 1, q = 6, which give the value 0.65771
in (7.18).

10Using D. Platts implementation [Pla11] of double-precision interval arithmetic based on
Lambovs [Lam08] ideas.
88
H. A. HELFGOTT
Here is the promised easy bound.

Lemma 7.4. Let Q0 1, Q 182Q0 . Let q Q0 , s Q0 /q, q an integer.
Then

Q0
e log sq + log q + 1.172

Gq (Q0 /sq)
e log Q0 + 1.172
.
Q
Gq (Q/sq)
log Q0 + 1.31
log QQ0 + 1.31
Q
Proof. Let P = pQ0/sqp|q p. Then
and so
(7.19)
Gq (Q0 /sq)GP (Q/Q0 ) Gq (Q/sq)

Gq (Q0 /sq)
1
.
Gq (Q/sq)
GP (Q/Q0 )
Now the lower bound in (7.11) gives us that, for d = P, R = Q/Q0 ,

GP (Q/Q0 )
(P)
G(Q/Q0 ).
P
By Lem. 7.3,

P
Q0
e log
+ log q + 0.658 .
(P)
sq
Hence, using (7.9), we get that

log Q0 + log q + 1.172
e
sq
Gq (Q0 /sq)
P/(P)
(7.20)
,
Gq (Q/sq)
G(Q/Q0 )
log Q + 1.31
Q0
since Q/Q0 120. Since

Q0
Q0
1
1
Q0
+ log q = 2 + =
1
0,
sq
sq
q
q
sq
the rightmost expression of (7.20) is maximal for q = 1.
We will use Lemma 7.4 when Q0 > 2 1010 , since then the numerical bounds
we will derive are not available. As we will now see, we can also use Lemma 7.4
to obtain a bound that is useful when sq is large compared to Q0 , even when
Q0 2 1010 .
Lemma 7.5. Let Q0 1, Q 200Q0 . Let q Q0 , s Q0 /q, q an even integer.
Let = (log Q0 )/ log Q 2/3. If
2Q0
(1)e
log q,
1.1617 Q0
sq
then
(7.21)
Gq (2Q0 /sq)
log Q0 + 1.698
.
Gq (2Q/sq)
log Q
Proof. Apply Lemma 7.4. By (7.20), we see that (7.21) will hold provided that

log QQ0 + 1.31
2Q0
e log
+ log q + 1.172
(log Q0 + 1.698)
sq
log Q
(7.22)
(log Q0 + 1.698)(log Q0 1.31)
.
log Q0 + 1.698
log Q
89
Since = (log Q0 )/ log Q,

(log Q0 + 1.698)(log Q0 1.31)
log Q
1.698(log Q0 1.31)
= log Q0 + 1.698 (log Q0 1.31)
log Q
(1 ) log Q0 + 1.698 + 1.31 1.698,
log Q0 + 1.698
and so (7.22) will hold provided that

2Q0
e log
+ log q + 1.172 (1 ) log Q0 + 1.698 + 1.31 1.698.
sq
For all [0, 2/3],
2
1.698 + 1.31 1.698 1.172 0.526 0.388 0.267.
3
Hence it is enough that
2Q0
(1)e
+ log q ee ((1) log Q0 +0.267) = c Q0
,
sq
where c = exp(exp() 0.153) = 1.16172 . . . .
Proposition 7.6. Let Q0 105 , Q 200Q0 . Let = (log Q0 )/ log Q. Assume

1 = 0.55. Then, for every even and positive q 2Q0 and every s
[1, 2Q0 /q],
log Q0 + 1.698
Gq (2Q0 /sq)
.
Gq (2Q/sq)
log Q
(7.23)
Proof. Define errq,R so that

(7.24)
Gq (R) =
(q)
log R + cE +
q
X log p
p|q
+ errq,R .
By (7.11) and (7.16), Gq (2Q/sq) ((q)/q)(log 2Q/sq + 1.661). Hence, (7.23)

will hold when
(7.25)
X log p
2Q0
log Q0 + 1.698
q
log
+ cE +
+
errq, 2Q0 (log 2Q/sq + 1.661)
.
sq
p
(q)
log Q
sq
p|q
We know that log sq log 2. The right side of (7.25) equals

log Q0 + 1.698 (log
log Q0 + 1.698
sq
1.661)
.
2
log Q
Note that

log Q0 + 1.698
1.698
1.698
=+
= 1+
.
log Q
log Q
log Q0
Thus, (7.25) will be true provided that

1.698
sq
1.661) + (1.698 cE )
log sq 1 +
(log
log Q0
2
X log p
q
+
errq, 2Q0
p
(q)
sq
p|q
90
H. A. HELFGOTT
Moreover, if this is true for = 1 , it is true for all 1 . Thus, it is enough to

check that

sq
1.698
1.661) + 0.365
(log
log sq 1 1 +
log Q0,min
2
(7.26)
X log p
q
+
errq, 2Q0 ,
p
(q)
sq
p|q
where Q0,min = 105 . Since Q0,min = 105 and 1 0.87, we know that 1 (1 +
1.698/ log Q0,min ) < 1.14751 < 1. Now all that remains to do is to take the
maximum mq,R,1 of errq,R over all R satisfying
R > 1.1617 max(Rq, Q0,min )(11 )e
(7.27)
log q
(since all smaller R are covered by Lemma 7.5) and verify that
(7.28)
mq,R,1
(q)
(q)
q
for
(q) = (1 1.14751 ) log q + 2.701351 + 0.365
X log p
p|q
(Values of s larger than 1 make the conditions stricter and the conclusions weaker,
thus making the task only easier.)
Ramares bound (7.12) implies that
errq,R 7.284R1/3 f1 (q),
with f1 (q) as in (7.13). This is enough when

q 7.284f1 (q) 3
.
(7.29)
R (q) =
(q)
(q)
The first question is: for which q does (7.27) imply (7.29)? It is easy to see
that (7.27) implies
R1(11 )e
> 1.1617q (11 )e
and so, assuming 1.1617q (11 )e

(7.30)
(q) = 1.1617q (11 )e
log q
R(11 )e
> 1.1617q (11 )e
log q
log q > 0, we get that R > (q), where

log q
(1.1617q (11 )e
log q)
1
1(11 )e
1
1(11 )e
Thus, if (q) is greater than the quantity (q) given in (7.29), we are done.
and p (log p)/p are decreasing functions of p for
Now, (p/(p 1)) f1 (p) Q
p 3. Hence, for even q < pp0 p, p0 a prime,
X log p
(7.31)
(q) (1 1.14751 ) log q + 2.271
p
p<p
0
and
(7.32)
(q)
p<p0
Q
7.284 p<p0 f1 (p)
p
P
p 1 (1 1.14751 ) log q + 2.701351 + 0.365 p<p
0
log p
p
!3
For instance, since 1 = 0.55,

3
51.273
0.3688 log q+0.3414 3

58.958
(q)
0.3688 log q+0.2051

3
66.508
0.3688 log q+0.0889
for q <
for q <
for q <
Q
Q
Q
p23 p
= 223092870,
p29 p
= 6469693230,
p31 p
= 200560490130.
91
for q > 128,

Since q 7 1.1617q (11 )e log q is increasing
Q
Q so is q 7 (q).
A comparison for q = 3.59 107 , q = p23 p and q = p29 p shows that
Q
Q
(q) > (q) for q [3.59 107 , p31 p). For q p31 p, we know that
1

1.1617q (10.55)e log q 1(10.55)e 298.03 log q
and so
(q)
1.1617q (10.55)e
1
298.03
1
1(10.55)e
1

1(10.55)e
1.222q 0.338 .
1.1616q (10.55)e
By [RS62, (3.14), (3.24), (3.30)] and a numerical computation for p1 < 106 ,
X log p
Y
X
< log p1 , log
p =
log p > 0.8p1 ,
p
pp1
pp1
pp1

Y p
1
(log p1 ) < 1.94 log p1

<e 1+
2
p1
log
p
1
pp1
Q
for all p1 31. This implies, in particular, that, for q = pp1 p, (q) >
1.222e0.80.338p1 = 1.222e0.2704p1 .
We also have
!1
Y
p1/3 + p2/3
0.1489
1+
p(p 1)
p31
and
log f1
pp1
Hence
p =
log(1 + p
2/3
) + log
p31
pp1
1/3
0.729p1
7.284 0.1489e0.729p1
1.94 log p1
0.3688 0.8p1 log p1
2.025e0.2217p1
p1/3 + p2/3
1+
p(p 1)
!1
+ log 0.1489.
1/3
(q)
!3

1/3 3
1.265e0.729p1
0.2217p1 < 1.222e0.2704p1 for p 31, it follows that

for p1 31. Since 2.025e
1
Q
(q) > (q) for q = pp1 p, p1 31. Looking at (7.31) and (7.32), we see
Q
that this implies that (q) > (q) holds for all q p31 p. Together with our
previous calculations, this gives us that (q) > (q) for all q 3.59 107 .
Now, for q < 3.59 107 even, we need to check that the maximum mq,R,1 of
errq,R over all (q) R < (q) satisfies (7.28). Since log R is increasing on R and
92
H. A. HELFGOTT
Gq (R) depends only on R, we can tell from (7.24) that, since we are taking the
maximum of errq,R , it is enough to check integer values of R. We check all integers
R in [(q), (q)) for all even q < 3.59 107 by an explicit computation.11

Finally, we have the trivial bound
Gq (Q0 /sq)
1,
Gq (Q/sq)
(7.33)
which we shall use for Q0 close to Q.

on the primes. Assume that
Corollary 7.7. Let {an }
5
{an } is in L1 L2 and that an =
p0 for n x. Let Q0 10 , 0 1 be such
2
that (200Q0 ) x/20 ; set Q = x/20 and = (log Q0 )/ log Q. Let M0 ,Q0 be
as in (3.5). P
Let S() = n an e(n) for R/Z. Then, if 0.55,
Z
Z
log Q0 + 1.698
2
|S()|2 d.
|S()| d
log Q
R/Z
M ,Q
0
P
Here, of course, R/Z |S()|2 d = n |an |2 (Plancherel). If > 0.55, we will
use the trivial bound
Z
Z
|S()|2 d
|S()|2 d
(7.34)
R/Z
M0 ,r
Proof. Immediate from Prop. 7.2, Lem. 7.4 and Prop. 7.6.
7.3. Putting together 2 bounds over arcs and bounds. First, we need a
simple lemma essentially a way to obtain upper bounds by means of summation
by parts.
+
Lemma 7.8. Let f, g : {a, a + 1, . . . , b} R+
0 , where a, b Z . Assume that,
for all x [a, b],
X
(7.35)
f (n) F (x),
anx
where F : [a, b] R is continuous, piecewise differentiable and non-decreasing.

Then
Z b
b
X
(max g(n)) F (u)du.
f (n) g(n) (max g(n)) F (a) +
na
n=a
Proof. Let S(n) =

(7.36)
b
X
n=a
Pn
m=a f (m).
nu
Then, by partial summation,
f (n) g(n) S(b)g(b) +
b1
X
n=a
S(n)(g(n) g(n + 1)).
11Here, as elsewhere in this section, numerical computations were carried out by the author
in C; all floating-point operations used D. Platts interval arithmetic package.
93
Let h(x) = maxxnb g(n). Then h is non-increasing. Hence (7.35) and (7.36)
imply that
b
X
n=a
f (n)g(n)
b
X
f (n)h(n)
n=a
S(b)h(b) +
F (b)h(b) +
P
In general, for n C, A(x) =

differentiable on [a, x],
Z
X
n F (x) = A(x)F (x)
anx
b1
X
n=a
b1
X
n=a
S(n)(h(n) h(n + 1))

F (n)(h(n) h(n + 1)).
anx n
x
and F continuous and piecewise
A(u)F (u)du.
Applying this with n = h(n)h(n+1) and A(x) =

1), we obtain
b1
X
(Abel summation)
anx n
= h(a)h(x+
F (n)(h(n) h(n + 1))
n=a
= (h(a) h(b))F (b 1)
b1
= h(a)F (a) h(b)F (b 1) +

= h(a)F (a) h(b)F (b 1) +
= h(a)F (a) h(b)F (b) +
(h(a) h(u + 1))F (u)du
b1
b1
n=a
h(u)F (u)du
h(u)F (u)du,
since h(u + 1) = h(u) for u

/ Z. Hence
b
X
h(u + 1)F (u)du
f (n)g(n) h(a)F (a) +
h(u)F (u)du.
We will now seeRour main application of Lemma 7.8. We have to bound an integral of the form M ,r |S1 ()|2 |S2 ()|d, where M0 ,r is a union of arcs defined
0
R
as in (3.5). Our inputs are (a) a bound on integrals of the form M ,r |S1 ()|2 d,
0
(b) a bound on |S2 ()| for (R/Z) \ M0 ,r . The input of type (a) is what we
derived in 7.1 and 7.2; the input of type (b) is a minor-arcs bound, and as such
is the main subject of [Hel].
P
1
Proposition 7.9. Let S1 () =
n an e(n), an C, {an } in L . Let S2 :
R/Z C be continuous. Define M0 ,r as in (3.5).
Let r0 be a positive integer not greater than r1 . Let H : [r0 , r1 ] R+ be a nondecreasing continuous function, continuous and differentiable almost everywhere,
94
H. A. HELFGOTT
such that
P
(7.37)
1
|an |2
M0 ,r+1
|S1 ()|2 d H(r)
for some 0 x/2r12 and all r [r0 , r1 ]. Assume, moreover, that H(r1 ) = 1. Let
g : [r0 , r1 ] R+ be a non-increasing function such that
(7.38)
max
|S2 ()| g(r)
(R/Z)\M0 ,r
for all r [r0 , r1 ] and 0 as above.

Then
Z
1
P
|S1 ()|2 |S2 ()|d
2
|a
|
n
(R/Z)\M
n
0 ,r0
(7.39)
Z r1
g(r)H (r)dr,
g(r0 ) (H(r0 ) I0 ) +
r0
where
1
I0 = P
2
n |an |
(7.40)
M0 ,r0
|S1 ()|2 d.
The condition 0 x/2r12 is there just to ensure that the arcs in the definition
of M0 ,r do not overlap for r r1 .
Proof. For r0 r < r1 , let
Let
1
f (r) = P
2
n |an |
Then, by (7.38),
1
f (r1 ) = P
2
n |an |
1
2
n |an |
(R/Z)\M0 ,r0
M0 ,r+1 \M0 ,r
(R/Z)\M0 ,r1
|S1 ()|2 d.
|S1 ()|2 d.
|S1 ()|2 |S2 ()|d
r1
X
f (r)g(r).
r=r0
By (7.37),
(7.41)
r0 rx
1
f (r) = P
2
n |an |
=
M0 ,x+1 \M0 ,r0
1
P
2
n |an |
M0 ,x+1
for x [r0 , r1 ). Moreover,

Z
X
1
f (r) = P
2
(R/Z)\M
n |an |
0 ,r0
r0 rr1
1
P
2
n |an |
R/Z
|S1 ()|2 d
!
|S1 ()|2 d
I0 H(x) I0
|S1 ()|2
2
|S1 ()|
I0 = 1 I0 = H(r1 ) I0 .
95
We let F (x) = H(x) I0 and apply Lemma 7.8 with a = r0 , b = r1 . We obtain

that
Z r1
r1
X
(max g(r))F (u) du
f (r)g(r) (max g(r))F (r0 ) +
rr0
r=r0
ru
r0
g(r0 )(H(r0 ) I0 ) +
r1
g(u)H (u) du.
r0
Theorem 7.10 (Total of minor arcs). Let x 1024 , where 4 1750. Let
X
(7.42)
S (, x) =
(n)e(n)(n/x).
n
Let (t) = (2 M )(t), where 2 is as in (4.36) and : [0, ) [0, )

is continuous and in L1 . Let + : [0, ) [0, ) be a bounded, piecewise
differentiable function with limt + (t) = 0. Let M0 ,r be as in (3.5) with
0 = 8. Let 105 r0 < r1 , where r1 = (2/3)(x/)0.55/2 .
Let
Z
|S (, x)||S+ (, x)|2 d.
Zr0 =
(R/Z)\M8,r0
Then
Zr 0
where
S=
(7.43)
p> x
||1 x
(M + T ) +
S (0, x) E
!2
2
(log p)2 +
(n/x),
T = C,3 (log x) (S ( J E)2 ),

Z
|S+ (, x)|2 d,
J=
M8,r0

E = (C+ ,0 + C+ ,2 ) log x + (2C+ ,0 + C+ ,1 ) x1/2 ,
C+ ,0 = 0.7131
(7.44)
and
(7.45)
C+ ,1 = 0.7131
1
(sup + (r))2 dt,
t rt
log t
(sup + (r))2 dt,
t rt
C+ ,2 = 0.51941|+ |2 ,
Z
1.04488 1/K
|(w)|dw
C,3 (K) =
||1
0
(log(r0 + 1)) + 1.698

p
M = g(r0 )
S ( J E)2
log x/16

Z r1
2
g(r)
dr + 0.45g(r1 ) S
+
x
log 16
r
r0
where g(r) = gx/, (r) with K = log(x/) (see (4.46)).
96
H. A. HELFGOTT
Proof. Let y = x/. Let Q = (3/4)y 2/3 , as in [Hel, Main Thm.] (applied with
y instead of x). Let (R/Z) \ M8,r , where r r0 and y is used instead of x
to define M8,r (see (3.5)). There exists an approximation 2 = a/q + /y with
q Q, ||/y 1/qQ. Thus, = a /q + /2y, where either a /q = a/2q or
a /q = (a + q)/2q holds. (In particular, if q is odd, then q = q; if q is even,
then q may be q or 2q.)
There are three cases:
(1) q r. Then either (a) q is odd and q r or (b) q is even and q 2r.
Since is not in M8,r , then, by definition (3.5), ||/y 0 r/q = 8r/q. In
particular, || 8.
Thus, by Prop. 4.5 and Lemma 4.6,

||
q ||1 y gy, (r) ||1 y.
(7.46)
|S (, x)| = |S2 M (, y)| gy,
8
(2) r < q y 1/3 /6. Then, by Prop. 4.5 and Lemma 4.6,
(7.47)

||
|S (, x)| = |S2 M (, y)| gy, max
, 1 q ||1 y gy, (r) ||1 y.
8
(3) q > y 1/3 /6. Again by Prop. 4.5,
(7.48)
Let
y

|S (, x)| = |S2 M (, y)| h
||1 + C,3 (K) y,
K
where h(x) is as in (4.42). (Note that C,3 (K), as in (7.44), equals
C,0,K /||1 , where C,0,K is as in (4.48).) We set K = log y. Since
y = x/ 1024 , it follows that y/K = y/ log y > 2.16 1020 .
2 0.55
r1 = y 2 ,
3
(
gx, (r)
g(r) =
gx, (r1 )
if r r1 ,
if r > r1 .
Since > 4, we see that r1 (x/16)0.55/2 . By Lemma 4.6, g(r) is a decreasing function; moreover, by Lemma 4.7, gy, (r1 ) h(y/ log y), and so g(r)
h(y/ log y) for all r. Thus, we have shown that
(7.49)
|S (y, )| (g(r) + C,3 (log y)) ||1 y
for all (R/Z) \ M8,r .

We first need to undertake the fairly dull task of getting non-prime or small n
out of the sum defining S+ (, x). Write
X
S1,+ (, x) =
(log p)e(p) (p/x),
p> x
S2,+ (, x) =
(n)e(n)+ (n/x) +
n non-prime
n> x
(n)e(n)+ (n/x).
n x
By the triangle inequality (with weights |S+ (, x)|),

sZ
|S (, x)||S+ (, x)|2 d
(R/Z)\M8,r0
2
X
j=1
sZ
(R/Z)\M8,r0
|S (, x)||Sj,+ (, x)|2 d.
Clearly,
Z
(R/Z)\M8,r0
|S (, x)||S2,+ (, x)|2 d
max |S (, x)|
(R/Z)
n=1
97
(n) (n/x)
R/Z
|S2,+ (, x)|2 d
(n)2 + (n/x)2 +
n non-prime
n x
(n)2 + (n/x)2 .
Let + (z) = suptz + (t). Since + (t) tends to 0 as t , so does + . By

[RS62, Thm. 13], partial summation and integration by parts,
X
X
(n)2 + (n/x)2
(n)2 + (n/x)2
n non-prime
0.7131

2
(n)2
+ (t/x) dt
nt
n non-prime
Z
2
log e t
+ 2
t
n non-prime

(log t) 1.4262 t + 2 (t/x) dt

Z
t
2 + log tx 2
+ (t)dt
x,
dt 0.7131
x
t
0
while, by [RS62, Thm. 12],

X
X
1
(n)
(n)2 + (n/x)2 |+ |2 (log x)
2
nr1
n x
2
0.51941|+ | x log x.
This shows that
Z
X
(n) (n/x) E = S (0, x) E,
|S (, x)||S2,+ (, x)|2 d
(R/Z)\M8,r0
n=1
where E is as in (7.43).
It remains to bound
Z
(7.50)
(R/Z)\M8,r0
|S (, x)||S1,+ (, x)|2 d.
We wish to apply Prop. 7.9. Corollary 7.7 gives us an input of type (7.37); we
have just derived a bound (7.43) that provides an input of type (7.38). More
precisely, by Corollary 7.7, (7.37) holds with
( (log(r+1))+1.698
if r < r1 ,
log x/16
H(r) =
1
if r r1 .
Since r1 = (2/3)y 0.275 and 1750,
log((2/3)(x/)0.275 + 1) + 1.698
p
log x/16
!
0.275
0.275 log 23 + 1.698 + 0.275 log 16
p
1
+
= 0.45.
0.5
0.5
log x/16
lim H(r) lim H(r) = 1
rr1+
rr1
98
H. A. HELFGOTT
We also have (7.38) with

(7.51)
(g(r) + C,3 (log y)) ||1 y
instead of g(r) (by (7.49)). Here (7.51) is a decreasing function of r because g(r)
is, as we already checked. Hence, Prop. 7.9 gives us that (7.50) is at most
(7.52)
Z r1
9g(r1 )
1
g(r)
p
dr +
g(r0 )(H(r0 ) I0 ) + (1 I0 ) C,3 (log x) +
r
+
1
20
log x/16 r0
P
times ||1 y p>x (log p)2 2 (p/x), where
Z
1
(7.53)
I0 = P
|S1,+ (, x)|2 d.
2 2
p> x (log p) + (n/x) M8,r0
By the triangle inequality,

sZ
M8,r0
|S1,+ (, x)|2 d =
sZ
M8,r0
sZ
M8,r0
M8,r0
|S+ (, x)|2 d
sZ
R/Z
and so we are done.
(, x)|2
|S+
As we already showed,
Z
|S2,+ (, x)|2 d =
Thus,
|S+ (, x) S2,+ (, x)|2 d
n non-prime
or n x
sZ
M8,r0
sZ
R/Z
|S2,+ (, x)|2 d
|S2,+ (, x)|2 d.
(n)2 + (n/x)2 E.
I0 S ( J E)2 ,
We now should estimate the integral in (7.45). It is easy to see that

(7.54)
Z
Z
Z
log r
2
1
1
log er0
1
dr = 1/2 ,
dr =
,
dr = ,
2
2
3/2
r
r0
r0
r0
r0 r
r0 r
r0
Z
Z
Z r1
log r
2 log e2 r0
log 2r
2 log 2e2 r0
r+
1
dr
=
dr
=
dr = log
,
,
,
3/2
r0
r0
r0
r 3/2
r0 r
r0
r0 r
Z
Z
(log 2r)3
(log 2r)2
2P2 (log 2r0 )
2P3 (log 2r0 )
,
,
dr
=
dr
=
1/2
r0
r 3/2
r 3/2
r0
r0
r
0
where
(7.55)
P2 (t) = t2 + 4t + 8,
P3 (t) = t3 + 6t2 + 24t + 48.
We must also estimate the integrals

Z r1 p
Z r1
(r)
(r)
(7.56)
dr,
dr,
3/2
r2
r
r0
r0
r1
r0
(r) log r
dr,
r2
r1
r0
(r)
dr,
r 3/2
99
Clearly, (r) e log log r = 2.50637/ log log r is decreasing on r. Hence, for
r 105 ,
(r) e log log r + c ,
where c = 1.025742. Let F (t) = e log t + c . Then F (t) = e /t2 < 0. Hence
p
d2 F (t)
F (t)
(F (t))2
p
<0
=
dt2
2 F (t) 4(F (t))3/2
p
for all t > 0. In other words, F (t) is convex-down, and so we can bound
p
p

F (t) from above by F (t0 ) + F (t0 ) (t t0 ), for any t t0 > 0. Hence, for
r r0 105 ,
p
p
p
p
d F (t)
r
|t=log r0 log
(r) F (log r) F (log r0 ) +
dt
r0
r
p
log
e
r0
= F (log r0 ) + p
.
F (log r0 ) 2 log r0
Thus, by (7.54),
(7.57)p

Z
p
(r)
e
log e2 r0
1
e
p
F
(log
r
)
2
dr
0
F (log r0 )
r0
r0
r 3/2
F (log r0 ) log r0
r0
p

2 F (log r0 )
e
=
1
.
r0
F (log r0 ) log r0
The other integrals in (7.56) are easier. Just as in (7.57), we extend the range
of integration to [r0 , ]. Using (7.54), we obtain

Z
Z
c
(r)
F (log r)
log log r0
dr
dr = e
+ E1 (log r0 ) + ,
2
2
r
r
r
r0
0
r
r0

Z 0
c log er0
(1 + log r0 ) log log r0 + 1
(r) log r
dr e
+ E1 (log r0 ) +
,
2
r
r0
r0
r0
where E1 is the exponential integral
E1 (z) =
et
dt.
t
By [OLBC10, (6.8.2)],
1
1
E1 (log r)
.
r(log r + 1)
r log r
Hence
r0
e (log log r0 + 1/ log r0 ) + c

(r)
dr
,
r2
r0
r0

log log r + 1
e
0
log r0 + c
(r) log r
dr
log er0 .
r2
r0
100
H. A. HELFGOTT
Finally,
Z
r1
r0

2 log log r
(r)
2c
2c
log r
r1
e
|r0 +
+ 2E1
2
r0
r1
r
r 3/2

1
1
log log r0 log log r1

2e
+ 2c
r0
r1
r0
r1

1
1
.
+ 4e
r0 log r0
r1 (log r1 + 2)
It is time to estimate
Z
(7.58)
r1
r0
Ry,2r log 2r
r 3/2
(r)
dr,
where z = y or z = y/ log y (and y = x/, as before). By Cauchy-Schwarz, (7.58)

is at most
sZ
sZ
r1
r1
(Rz,2r log 2r)2
(r)
dr
dr.
3/2
3/2
r
r0
r0 r
We have already bounded the second integral. Let us look at Rz,t (defined in
+ 0.41415, where
(4.40)). We can write Rz,t = 0.27125Rz,t
!
log
4t
.
(7.59)
Rz,t
= log 1 +
9z 1/3
2 log 2.004t
Clearly,
Rz,e
1+
t /4 = log
t/2
log
36z 1/3
2.004
Now, for f (t) = log(c + at/(b t)) and t [0, b),

f (t) =
c+
ab

at
bt
(b t)2
f (t) =
ab((a 2c)(b 2t) 2ct)

.

2
at
4
c + bt (b t)
In our case, a = 1/2, c = 1 and b = log 36z 1/3 log(2.004) > 0. Hence, for t < b,

b 3
b
3
b t > 0,
ab((a 2c)(b 2t) 2ct) =
2t + (b 2t) =
2
2
2 2
and so f (t) > 0. In other words, t Rz,e

t /4 is convex-up for t < b, i.e., for
et /4 < 9z 1/3 /2.004. It is easy to check that, since we are assuming y 1024 ,
4 0.55
9
2r1 = y 2 <
3
2.004
y
log y
1/3
9z 1/3
.
2.004
We conclude that r Rz,2r

is convex-up on log 8r for r r1 , and hence so is
r Rz,r . Thus, for r [r0 , r1 ],
(7.60)
2
2
Rz,2r
Rz,2r
log r/r0
log r1 /r
2
+ Rz,2r
.
1
log r1 /r0
log r1 /r0
101
Therefore, by (7.54),

Z r1
(Rz,2r log 2r)2
dr
log r/r0
log r1 /r
2
2
+ Rz,2r1
(log 2r)2 3/2
dr
Rz,2r0
3/2
log
r
/r
log
r
/r
r
r
1 0
1 0
r0
r0

2
2Rz,2r0
P3 (log 2r0 ) P3 (log 2r1 )
P2 (log 2r0 ) P2 (log 2r1 )
log 2r1
=
log rr10
r0
r1
r0
r1

2
2Rz,2r1
P2 (log 2r0 ) P2 (log 2r1 )
P3 (log 2r0 ) P3 (log 2r1 )
+
log
2r
0
log rr10
r0
r1
r0
r1
!

P2 (log 2r0 ) P2 (log 2r1 )
log 2r0 2
2
2
(R
R
)
= 2 Rz,2r0
z,2r1
x,2r0
log rr01
r0
r1

2
2
Rz,2r
Rz,2r
P3 (log 2r0 ) P3 (log 2r1 )
1
0
+2
r1
log r0
r0
r1

P2 (log 2r0 ) P2 (log 2r1 )
2
= 2Rz,2r
0
r0
r1

2
2
Rz,2r1 Rz,2r0 P2 (log 2r0 ) P3 (log 2r1 ) (log 2r0 )P2 (log 2r1 )
+2
,
log rr10
r0
r1
Z
r1
where P2 (t) and P3 (t) are as in (7.55), and P2 (t) = P3 (t)tP2 (t) = 2t2 +16t+48.
Putting all terms together, we conclude that
Z r1
g(r)
(7.61)
dr f0 (r0 , y) + f1 (r0 ) + f2 (r0 , y),
r
r0
where
f0 (r0 , y) =
(7.62)
(1 c )
I0,r0 ,r1 ,y + c
I0,r0 ,r1 ,
y
log y
s
2
I1,r0
r0

F (log r0 )
e
5
1
f1 (r0 ) =
+
F (log r0 ) log r0
2r0
2r0

13
80
1
log er0 + 10.102 Jr0 +
log er0 + 23.433
+
r0
4
9
f2 (r0 , y) = 3.2
(log y)1/6
r1
log ,
1/6
r0
y
where F (t) = e log t + c , c = 1.025742, y = x/ (as usual),

(7.63)

P2 (log 2r0 ) P2 (log 2r1 )
2
I0,r0 ,r1 ,z = Rz,2r0
r0
r1

2
2
Rz,2r1 Rz,2r0 P2 (log 2r0 ) P3 (log 2r1 ) (log 2r0 )P2 (log 2r1 )
log rr10
r0
r1
Jr = F (log r) +
e
,
log r
and C,2,K is as in (4.47).
I1,r = F (log r) +
2e
,
log r
c =
C,2,log y /||1
log log y
102
H. A. HELFGOTT
8. Conclusion
We now need to gather all results, using the smoothing functions
+ = h200 (t) (t)
(as in (4.7), with H = 200) and
= (2 M )(t)
(as in Thm. 7.10, with 1750 to be set soon). We define (t) = t2 et /2 . Just
2
like before, hH = h200 is as in (4.10), (t) = ,1 (t) = et /2 , and 2 = 1 M 1 ,
where 1 = 2 I[1/2,1/2] .
We fix a value for r0 , namely, r0 = 150000. Our results will have to be valid
for any x x1 , where x1 is fixed. We set x1 = 4.5 1029 , since we want a result
valid for N 1030 , and, as was discussed in (4.1), we will work with x1 slightly
smaller than N/2. (We will later see that N 1030 implies the stronger lower
bound x 4.9 1029 .)
8.1. The 2 norm over the major arcs. We apply Lemma 3.1 with = +
and as in (4.3). Let us first work out the error terms. Recall that 0 = 8. We
use the bound on E+ ,r,0 given in (6.63), and the bound on ET+ ,0 r/2 in (6.64):
E+ ,r,0 2.139 108 ,
ET+ ,0 r/2 5.122 109 .
We also need to bound a few norms: by (4.17), (4.18), (4.26), (4.31) and (4.30),
|+ |1 0.996505,
547.5562
0.80014,
2007/2
1 + 4 log 200
1.0858.
1 + 2.21526
200
|+ |2 0.80013 +
(8.1)
|+ |
By (3.12),
S+ (0, x) = c
+ (0) x + O err+ ,T (0, x) x
(|+ |1 + ET+ ,0 r/2 )x 0.99651x.

Hence, we can bound Kr,2 in (3.27):
Kr,2 = (1 + 300000)(log x)2 1.14146
(2 0.99651 + (1 + 300000)(log x)2 1.14146/x)

= 1248.32(log x)2 2.7 1014 x
for x 1020 . We also have
5.190 r ET
0 r
+, 2
and
||1 +
ET
0 r
+, 2
!!
0.031789

3 log r
2
2
0 r
2+
E+ ,r,0 + (log 2e r)Kr,2 4.844 107 .
2
We recall from (4.18) and (4.14) that
(8.2)
and
(8.3)
0.8001287 | |2 0.8001288
|+ |2
547.5562
4.84 106 .
H 7/2
103
Symbolic integration gives

| |22 = 2.7375292 . . .
(8.4)
(2)
(3)
We bound | |1 using the fact that (as we can tell by taking derivatives) (t)
increases from 0 at t = 0 to a maximum within [0, 1/2], and then decreases to
(2)
(1) = 7, only to increase to a maximum within [3/2, 2] (equal to that within
[0, 1/2]) and then decrease to 0 at t = 2:
(2)
(3)
(8.5)
(2)
(2)
| |1 = 2 max (t) 2 (1) + 2 max (t)

t[3/2,2]
t[0,1/2]
= 4 max
t[0,1/2]
(2)
(t)
+ 14 4 4.6255653 + 14 32.503,
where we compute the maximum by the bisection method with 30 iterations

(using interval arithmetic, as always).
We evaluate explicitly
X 2 (q)
= 13.597558346 . . .
(q)
qr
q odd
Looking at (3.29) and Lr,0 (log r + 1.7)|+ |22 ,, we conclude that

Lr,0 13.597558347 0.80012872 8.70524,
Lr,0 13.597558348 0.80012882 + O ((log r + 1.7) 4.84 106 )

log r 0.425
+
8.70516.
+ O 1.341 105 0.64787 +
4r
r
Lemma 3.1 thus gives us that

Z

S (, x)2 d = (8.7052 + O (0.0004))x + O (0.03179)x
+
M8,r0
(8.6)
= (8.7052 + O (0.0322))x.
8.2. The total major-arcs contribution. First of all, we must bound from
below
Y

Y
1
1
.
1+
(8.7)
C0 =
1
(p 1)2
(p 1)3
pN
p|N
The only prime that we know does not divide N is 2. Thus, we use the bound

Y
1
C0 2
1
1.3203236.
(p 1)2
p>2
The other main constant is C , , which we defined in (3.37) and already

started to estimate in (4.6):
!
Z N
Z
(8.8) C , = | |22
()d + 2.71| |22 O
((2 N/x) + )2 ()d
104
H. A. HELFGOTT
2 /2
provided that N 2x. Recall that = (2 M )(t), where (t) = t2 et

Therefore,
Z N/x
Z 1
Z N/x
Z N/x
dw
2 (w)
(2 )()d =
()d =
w
w
0
1/4
0
0
Z
Z 1
|2 |1 ||1
1
=
()ddw.
2 (w)
1/4
N/xw
Now
y 2 /2
() = ye
Z
+ 2
t2
e
y/ 2
by [OLBC10, (7.8.3)]. Hence

Z
Z
()d
N/xw
()d <
dt <

2
y+
y
1
2 +
ey
e2
2 /2
and so, since |2 |1 = 1,

Z 1
Z N/x
1
||1
2
2 (w)dw 2 + 2 e2
()d
1/4
0
(8.9)

1
2
||1 2 + 2 e2 .
Let us now focus on the second integral in (8.8). Write N/x = 2 + c1 /. Then
the integral equals
Z 2+c1 /
Z
1
2
(c1 / + ) ()d 3
(u c1 )2 (2 M )(u) du
0
0
Z
Z 1
1
(vw c1 )2 (v)dvdw
2 (w)
= 3
1/4
0
r
r
Z 1
1
2
2
= 3
2 (w) 3
w 2 2c1 w + c1
dw
1/4
2
2
r
r

1 49 9
2
c1 +
c .
= 3
48 2 4
2 1
It is thus best to choose c1 = (9/4)/ 2 = 0.89762 . . . . Looking up | |22 in (8.4),

we obtain
Z N
x
2
((2 N/x) + )2 ()d
2.71| |2
0
r

1 49 (9/4)2
1.9986

.
7.413 3
48 2
3
2 2
We conclude that
C ,
Setting

1
1.9986
1
2
2
2
||1 | |2 | |2 2 + 2 e2
.
3
= 49
and using (4.18), we obtain

(8.10)
C ,
1
(||1 | |22 0.000833).
105
p
Here it is useful to note that ||1 = 2 , and so, by (4.18), ||1 | |22 = 0.80237 . . . .
We have finally chosen x in terms of N :
(8.11)
x=
N
N
N
=
= 0.49724 . . . N.
c1 =
49/36
1
2+
2 + 361 2
2 + 2 49
Thus, we see that, since we are assuming N 1030 , we in fact have x 4.9724 . . .
1029 , and so, in particular,
x
1028 .
(8.12)
x 4.9 1029 ,
Let us continue with our determination of the major-arcs total. We should

compute the quantities in (3.38). We already have bounds for E+ ,r,0 , A+ ,
L,r,0 and Kr,2 . From (6.63), we have
(8.13)
E ,r,8
3.061 108
,
where the factor of comes from the scaling in (t) = (2 M )(t) (which in
effect divides x by ). It remains only to bound the more harmless terms of type
Z,2 and LS .
By (3.12),
1X 2
1X
2
Z+2 ,2 =
(n)+
(n/x)
(n)+ (n/x) (+ (n/x) log n)
x n
x n
1X
(8.14)
(n)+ (n/x) (|+ (t) log+ (t/2)| + |+ | log 2x)
x n
(|+ |1 + | err+ ,T (0, x)|) (|+ (t) log+ (t/2)| + |+ | log 2x),
where log+ (t) := max(0, log t). Proceeding as in (4.27), we see that
|+ (t) log+ (t/2)| | (t) log+ (t/2)| + |h hH | | (t) log+ (t/2)|
= |h hH | | (t) log+ (t/2)| .
Since
| (t) log+ (t/2)| max | (t)(t/2 1)| 0.012
t2
and |h hH | < 0.1415 (by (4.29) and (4.30)), we see that |+ (t) log+ (t/2)| <
0.0017. We also have the bounds on |+ |1 and |+ | in (8.1) and the bound
| err+ ,T (0, x)| ET+ ,0 r/2 5.122 109 from (6.64). Hence
Z+2 ,2 0.99651 (0.0017 + 1.0858 log 2x) 0.002 + 1.09 log x.
Similarly,
Z2 ,2 =
1X 2
1X
(n)2 (n/x)
(n) (n/x) ( (n/x) log n)
x n
x n
(| |1 + | err ,T (0, x)|) (| (t) log+ (t)| + | | log(x/)).

Clearly,
(8.15)
| | = |2 M |

2 (t)
|| 1.92182 2 1.414.

t 1
e
106
H. A. HELFGOTT
and, since log+ is non-decreasing and 2 is supported on a subset of [0, 1],

| (t) log+ (t)| = |(2 M ) log+ | |2 M ( log+ )|

2 (t)
| log+ | 1.92182 0.762313 1.46503

t
1
where we bound | log | by the bisection method with 25 iterations. We

already know that
p
/2
|2 |1 ||1
||1
(8.16)
| |1 =
=
=
.
We conclude that
(8.17)
p
Z2 ,2 ( /2/49 + 3.061 108 )(1.46503 + (2/e) log(x/49)) 0.0189 log x.
We have bounds for | | and |+ | . We can also bound
| t| = |(2 M ) t| |2 |1 | t| 33/2 e3/2
and
|+ (t) t| | (t) t| + |h hH | | (t) t|
(8.18)
1.0648 + 0.1415e1/2 1.1507,
where we bound | (t) t| 1.0648 by the bisection method (20 iterations).

We can now bound LS (x, r) for = , + :
X p
LS (x, r) = log r max

pr
x
1
X | t|
log x
= (log r) max
||
+
pr log p
p /x
(log r) max
pr
and so
(8.19)
1
p x
| t|
log x
|| +
log p
1 1/p
(log r)(log x)
|| + 2(log r)| t| ,
log 2
LS

2
3/2
log x + 2 (3/e)
log r (1.0615 log x + 2.3189) log r,
e log 2
LS+

1.14146
log x + 2 1.1507 log r (1.6468 log x + 2.3014) log r.
log 2
We can now start to put together all terms in (3.36). Let 0 = |+ |2 /| |2 .

Then, by (8.3),
0 | |2 |+ |2 4.84 106 .
Thus,
2.82643| |22 (2 + 0 ) 0 +
4.310040 | |21 + 0.0012

r
(3)
| |21
05
is at most
107
2.82643 4.84 106 (2 0.80013 + 4.84 106 )

2
4.311 8 0.64021 + 0.0012 32.503

85
0.0001691
+
150000
by (4.18) and (8.5). Recall that | |1 is given by (8.16).
Since = (2 M )(x),
|2 (t)/ t|2 ||2

|2 M |22
2
| |2 =
q
3
32
3
1.53659
8 3 (log 2)
.
=
The second line of (3.36) is thus at most x2 times
3.061 108
1.53659
8
8.739 + 2.139 10 1.6812( 8.739 + 1.6812 0.80014)
6
1.6097 10
.
where we are using the bound A+ 8.739 we obtained in (8.6). (We are also
using the bounds on norms in (8.1).)
Using the bounds (8.14), (8.17) and (8.19), we see that the third line of (3.36)
is at most
2 (0.0189 log x (1.0615 log x + 2.3189) log 150000) x
p
+ 4 (0.002 + 1.14 log x) 0.0189 log x(1.6468 log x + 2.3014)(log 150000)x
13(log x)2 x,
where we just use the very weak assumption x 1010 , though we can by now
assume (8.12)).
We conclude that, for r = 150000, the integral over the major arcs
Z
S+ (, x)2 S (, x)e(N )d
M8,r
is
(8.20)
!
6
/2
1.6097
10
x2 +
x2 + 14(log x)2 x
C0 C0 , x2 + O 0.0001691

0.0002136x2
= C0 C0 , x2 + O
= C0 C0 , x2 + O 4.359 106 x2 ,
where C0 and C0 , are as in (3.37). Notice that C0 C0 , x2 is the expected

asymptotic for the integral over all of R/Z.
Moreover, by (8.9) and (8.10),
C0 C0 ,
Hence
(8.21)
1.0594003 0.000833
1.05857
1.3203236||1 | |22 0.000833
=
.
49
Z
M8,r
S+ (, x)2 S (, x)e(N )d
1.05856 2
x ,
where, as usual, = 49. This is our total major-arc bound.
108
H. A. HELFGOTT
8.3. Minor-arc totals. We need to estimate the quantities E, S, T , J, M in

Theorem 7.10. Let us start by bounding the constants in (7.44). The constants
C+ ,j , j = 0, 1, 2, will appear only in the minor term A2 , and so crude bounds on
them will do.
By (8.1) and (8.18),

1.1507
sup + (r) min 1.0858,
t
rt
for all t 0. Thus,
C+ ,0 = 0.7131
0.7131
Z
Similarly,
C+ ,1 0.7131
1
0
2
sup + (r) dt
rt
1.08582
+
t
1.15072
dt
t5/2
2
Z
sup + (r) dt 0.7131
rt
2.31092.
1.15072 log t
dt 0.41965.
t5/2
Immediately from (8.1),

C+ ,2 = 0.51941|+ |2 0.61235.
We get
E ((2.488 + 0.677) log x + (2 2.488 + 0.42)) x1/2
(8.22)
(3.165 log x + 5.396) x1/2 3.31 1013 x,
where E is defined as in (7.43), and where we are using the assumption x

4.5 1029 made at the beginning. Using (8.13) and (8.16), we see that
x
p
/2 + O (3.061 108 )
.
S (0, x) = (| |1 + O (ET ,0 ))x =
Hence
S (0, x) E 5.84 1014
(8.23)
We can bound
(8.24)
X
n
x2
.
2
(n)(log n)+
(n/x) = 0.64022x log x 0.0211x
2
by (6.67). Let us now estimate T . Recall that (t) = t2 et /2 . Since

Z u
Z u
Z u
u3
2 t2 /2
t2 dt =
t e
dt
(t)dt =
,
3
0
0
0
we can bound
Z 1

x 1.04488 log x/ 2 t2 /2
0.2779
= p
.
C,3 log
t e
dt
(log x/)3
/2 0
By (8.6), we already know that J = (8.7052 + O (0.0322))x. Hence
( J E)2 = ( (8.7052 + O (0.0322))x 3.302 1013 x)2

(8.25)
8.672999x,
109
and so
x
(S ( J E)2 )
T = C,3 log
0.2779
((0.6402 + O (2 105 ))x log x 0.0211x 8.672999x)
(log x/)3
x log x
x
0.17792
2.41609
3
(log x/)
(log x/)3
x
x
0.17792
1.72365
.
2
(log x/)
(log x/)3
for = 49. Since x/ 1028 , this implies that
T 3.638 105 x.
(8.26)
It remains to estimate M . Let us first look at g(r0 ); here g = gx/, , where

gx/, is defined as in (4.46). Write y = x/. We must estimate the constant
C,2,K defined in (4.48):
Z 1
Z 1
Z 1
2
w2 ew /2 log wdw
(w) log wdw
(w) log wdw
C,2,K =
0
1/K
0.093426,
where
again we use VNODE-LP for rigorous numerical integration. Since ||1 =
p
/2, this implies that
C,2 /||1
0.07455
log K
log log y
(8.27)
and so
(8.28)
Ry,K,,t

0.07455
0.07455
R
+
1
Ry,t .
=
log log y y/K,t
log log y
Let t = 2r0 = 300000; we recall that K = log y. Recall from (8.12) that
y = x/ 1028 ; thus, y/K 1.5511 1026 and log log y 4.16624. Going back
to the definition of Rx,t in (4.40), we see that
!
log(8 150000)
+ 0.41415 0.55394,
(8.29) Ry,,2r0 0.27125 log 1 +
9(1028 )1/3
2 log 2.0042150000
(8.30)
Ry/K,2r0 0.27125 log
and so
Ry,K,,2r0
Using
1+
log(8 150000)
2 log
9(1.55111026 )1/3
2.0042150000
+ 0.41415 0.5703,

0.07455
0.07455
0.5703 + 1
0.55394 0.55424.
4.16624
4.16624
(r) = e log log r +
2.50637
5.42506,
log log r
we see from (4.40) that

16
13
80
7
80
111
394.316.
+ log 2 9 150000 9 +
Lr0 = 5.42506 log 2 4 150000 4 +
9
5
110
H. A. HELFGOTT
Going back to (4.46), we sum up and obtain that

log y 1/6
(0.55424 log 300000 + 0.5) 5.42506 + 2.5 394.316
+
+ 3.2
g(r0 ) =
150000
y
2 150000
0.039182.
Using again the bound x 4.9 1029 from (8.12), we obtain
log(150000 + 1) + 1.698
p
S ( J E)2
log x/16
13.6164
1
(0.64022x log x 0.0211x) 8.673x
log
x log 4
2
13.6164 1.33393x 8.673x 9.49046x.
Therefore,
(8.31)
g(r0 )
log(150000 + 1) + 1.698
p
S ( J E)2
log x/16
0.039182 9.49046x
0.37186x.
This is one of the main terms.

Let r1 = (2/3)y 0.55/2 , where, as usual, y = x/ and = 49. Then
Ry,2r1
(8.32)

2 0.55
2
log
8
y
3
= 0.27125 log 1 +
+ 0.41415
9y 1/3
2 log
0.55
2.0042 23 y 2
!
16
0.55
2 log y + log 3
+ 0.41415
= 0.27125 log 1 +
27/8
7
log
y
+
2
log
60
1.002

33
0.27125 log 1 +
+ 0.41415 0.74266.
14
Similarly, for K = log y (as usual),

(8.33)

2 0.55
2
log
8
y
3
Ry/K,2r1 = 0.27125 log 1 +

+ 0.41415
9(y/K)1/3
2 log
0.55
2
2.0042 3 y
= 0.27125 log
1+
7
60
log y
0.55
16
2 log y + log 3
27/8
23 log log y + 2 log 1.002
+ 0.41415.
Let
f (t) =
16
0.55
2 t + log 3
27/8
2
7
60 t 3 log t + 2 log 1.002
33
+
=
14
2
3
7
60 t
log t c
32 log t + 2 log
27/8
1.002
111
where c = (33/14) log((27/8)/1.002) log 16/3. Then

27/8
23 log t + 2 log 1.002
23 log t c
f (t) =

2
27/8
7
2
t
log
t
+
2
log
60
3
1.002

27/8
7
2
7
log
t
(c
+
2/3)
+
c
2
log
90
60
1.002 3t
.
=

27/8 2
7
2
t
log
t
+
2
log
60
3
1.002
2
3t
7
60 t
7
60
2
3t
Hence f (t) < 0 for t 25. It follows that Ry/K,2r1 (with K = log y and r1
varying with y) is decreasing on y for y e25 , and thus, of course, for y 1028 .
We thus obtain from (8.33) that
(8.34)
Ry/K,2r1 R
1028
log 1028
,2r1
0.76994.
By (8.28), we conclude that

Ry,K,,2r1

0.07455
0.07455
0.76994 + 1
0.74266 0.74315.
4.16624
4.16624
Since r1 = (2/3)y 0.55/2 and (r) is increasing for r 27, we know that
(8.35)
2.50637
(r1 ) (y 0.55/2 ) = e log log y 0.55/2 +
log log y 0.55/2
2.50637
2
= e log log y +
2 e log 0.55 e log log y 1.42763
log log y log 0.55
for y 1028 . Hence, (4.40) gives us that
Lr1

32
80
2 3 44 111
(e log log y 1.42763) log 13 y +
+ log 16 y 9 +
9
5
34
39
3.859 log y log log y + 1.7957 log y + 15.65 log log y + 15.1
25
13
6
(4.29002 log y + 19.28) log log y.
Moreover,
p
p
p
1.42763
(r1 ) = e log log y 1.42763 e log log y
2 e log log y
and so
p
4 11
(0.74315 log y 40 + 0.5) (r1 )
3
p
1.42763 (0.2044 log y + 0.7138)
(0.2044 log y + 0.7138) e log log y

2 e log log y
p
(0.2728 log y + 0.9527) log log y 3.64.
112
H. A. HELFGOTT
Therefore, by (4.46),
gy, (r1 )
(0.2728 log y + 0.9527) log log y 1.14

q
4 11
40
3y
(4.29002 log y + 19.28) log log y
2
3y
11
40
0.251 log y log log y

11
y 80
3.2(log y)1/6
y 1/6
By (8.24) and y = x/ = x/49, we conclude that

(8.36)

(0.64022 log x 0.0211)x
0.45g(r1 )S 0.45
11
y 80
(0.64022 log y + 2.471)x 0.09186x.

11
y 80
It remains only to bound
2S
x
log 16
r1
r0
g(r)
dr
r
in the expression (7.45) for M . We will use the bound on the integral given
in (7.61). The easiest term to bound there is f1 (r0 ), defined in (7.62), since it
depends only on r0 : for r0 = 150000,
f1 (r0 ) = 0.0161322 . . . .
It is also not hard to bound f2 (r0 , x), also defined in (7.62):

2 11
40
(log y)1/6
(log y)1/6 11
3x
f2 (r0 , y) = 3.2
log
3.2
log
y
+
0.6648
log
r
0 ,
r0
40
y 1/6
y 1/6
and so, since r0 = 150000 and y 1028 ,
f2 (r0 , y) 0.000895.
Let us now look at the terms I1,r , c in (7.63). We already saw in (8.27) that
c =
0.07455
C,2 /||1
0.0179.
log K
log log y
Since F (t) = e log t + c with c = 1.025742,

(8.37)
I1,r0 = F (log r0 ) +
2e
= 5.42938 . . .
log r0
It thus remains only to estimate I0,r0 ,r1 ,z for z = y and z = y/K, where K = log y.
We already know that
Ry,2r0 0.55394,
Ry,2r1 0.74266,
Ry/K,2r0 0.5703,
Ry/K,2r1 0.76994
by (8.29), (8.30), (8.32) and (8.34). We also have the trivial bound Rz,t 0.41415
valid for any z and t for which Rz,t is defined.
113
Omitting negative terms from (7.63), we easily get the following bound, crude
but useful enough:
I0,r0 ,r1 ,z
2
Rz,2r
0
2
0.414152 P2 (log 2r0 )
P2 (log 2r0 ) Rz,2r
1
+
,
r0
log rr10
r0
where P2 (t) = t2 + 4t + 8 and P2 (t) = 2t2 + 16t + 48. For z = y and r0 = 150000,
this gives
P2 (log 2r0 ) 0.742662 0.414152 P2 (log 2r0 )
+
0.55
r0
r0
2
log 2y3r0
0.55722
;
0.17232 + 11
40 log y log 225000
I0,r0 ,r1 ,y 0.553942
for z = y/K, we proceed in the same way, and obtain

I0,r0 ,r1 ,y/K 0.18265 +
This gives us
(8.38)
11
40
0.61773
.
log y log 225000
q
p
(1 c ) I0,r0 ,r1 ,y + c I0,r0 ,r1 , y
log y
s
0.9821
0.17232 +
+ 0.0179
0.18265 +
11
40
11
40
0.55722
log y log 225000
0.61773
.
log y log 225000
In particular, since y 1028 ,

q
p
(1 c ) I0,r0 ,r1 ,y + c I0,r0 ,r1 ,
Therefore,
f0 (r0 , y) 0.52514
y
log y
0.52514.
2
5.42939 0.087932.
r0
By (7.61), we conclude that

Z r1
g(r)
dr 0.087932 + 0.061323 + 0.000895 0.15015.
r
r0
By (8.24) and x 4.9 1029 ,
2S
2(0.64022x log x 0.0211x)
1.33393x.
x
log 16
log x log 16
Hence
(8.39)
2S
x
log 16
r1
r0
g(r)
dr 0.20029x.
r
Putting (8.31), (8.36) and (8.39), we conclude that the quantity M defined in
(7.45) is bounded by
(8.40)
M 0.37186x + 0.09186x + 0.20029x 0.66401x.
114
H. A. HELFGOTT
Gathering the terms from (8.23), (8.26) and (8.40), we see that Theorem 7.10
states that the minor-arc total
Z
|S (, x)||S+ (, x)|2 d
Zr0 =
(R/Z)\M8,r0
is bounded by
Zr 0
(8.41)

p
||1 x
(M + T ) +
||1 (0.6022 + 3.638
0.83226
S (0, x) E
!2
x
x
+ 5.84 1014
105 )
x2
2
for r0 = 150000, x 4.9 1029 , where we use yet again the fact that ||1 =
This is our total minor-arc bound.
/2.
8.4. Conclusion: proof of main theorem. As we have known from the start,
X
(n1 )(n2 )(n3 )+ (n1 )+ (n2 ) (n3 )
(8.42)
n1 +n2 +n3 =N
R/Z
S+ (, x)2 S (, x)e(N )d.
We have just shown that, assuming N 1030 , N odd,

Z
S+ (, x)2 S (, x)e(N )d
R/Z
Z
S+ (, x)2 S (, x)e(N )d
=
M8,r0
+O
(R/Z)\M8,r0
|S+ (, x)| |S (, x)|d

x2
x2
x2
1.05856 + O 0.83226
0.2263
for r0 = 150000, where x = N/(2+1/(36 2)), as in (8.11). (We are using (8.21)
2
and (8.41).) Recall that = 49 and (t) = (2 M )(t), where (t) = t2 et /2 .
It only remains to show that the contribution of terms with n1 , n2 or n3 nonprime to the sum in (8.42) is negligible. (Let us take out n1 , n2 , n3 equal to 2 as
well, since some prefer to state the ternary Goldbach conjecture as follows: every
odd number 9 is the sum of three odd primes.) Clearly
X
(n1 )(n2 )(n3 )+ (n1 )+ (n2 ) (n3 )
n1 +n2 +n3 =N
n1 , n2 or n3 even or non-prime
(8.43)
3|+ |2 | |
(n1 )(n2 )(n3 )
n1 +n2 +n3 =N
n1 even or non-prime
3|+ |2 | | (log N )
n1 N non-prime
or n1 = 2
(n1 )
n2 N
(n2 ).
115
By (8.1) and (8.15), |+ | 1.0858 and | | 1.414. By [RS62, Thms. 12

and 13],
X
(n1 ) < 1.4262 N ,

n1 N non-prime
or n1 = 2
n2 N non-prime
or n2 = 2
(n2 ) = 1.03883 1.4262N 1.48158N.
Hence, the sum on the first line of (8.43) is at most

10.568N 3/2 log N.
Thus, for N 1030 odd,
X
(n1 )(n2 )(n3 )+ (n1 )+ (n2 ) (n3 )
n1 +n2 +n3 =N
n1 , n2 , n3 odd primes
x2
1.0568N 3/2 log N
0.001141N 2 8 1014 N 2 0.00114N 2

0.2263
by = 49 and (8.11). Since 0.00114N 2 > 0, this shows that every odd number
N 1030 can be written as the sum of three odd primes.
Since the ternary Goldbach conjecture has already been checked for all N
30
10 [HP], we conclude that every odd number N > 7 can be written as the
sum of three odd primes, and every odd number N > 5 can be written as the
sum of three primes. The main theorem is hereby proven: the ternary Goldbach
conjecture is true.
Appendix A. Sums over primes
P
Here we treat some sums of the type
n (n)(n), where has compact
support. Since the sums are over all integers (not just an arithmetic progression)
and there is no phase e(n) involved, the treatment is relatively straightforward.
The following is standard.
Lemma A.1 (Explicit formula). Let : [1, ) C be continuous and piecewise
C 1 with L1 ; let it also be of compact support contained in [1, ). Then

Z
X
X
1
(x)dx
(M )(),
(A.1)
(n)(n) =
1
x(x2 1)
1
n
where runs over the non-trivial zeros of (s).
The non-trivial zeros of (s) are, of course, those in the critical strip 0 <
(s) < 1.
Remark. Lemma A.1 appears as exercise 5 in [IK04, 5.5]; the condition there
that be smooth can be relaxed, since already the weaker assumption that
be in L1 implies that the Mellin transform (M )(
P + it) decays quadratically on t
as t , thereby guaranteeing that the sum (M )() converges absolutely.
Lemma A.2. Let x 10. Let 2 be as in (4.33). Assume that all non-trivial
zeros of (s) with |(s)| T0 lie on the critical line.
116
H. A. HELFGOTT
Then

n
0
X
log eT
9/4 6.03
9.7
1/2
2
=
x
+
O
0.135x
+
+
+
x.
(n)
(A.2)
2
2
x
x
T
2
T
0
0
n
In particular, with T0 = 3.061 1010 in the assumption, we have, for x 2000,

n
X
(n)2
= (1 + O ())x + O (0.135x1/2 ),
x
n
where = 2.73 1010 .
The assumption that all non-trivial zeros up to T0 = 3.061 1010 lie on the
critical line was proven rigorously in [Plaa]; higher values of T0 have been reached
elsewhere ([Wed03], [GD04]).
Proof. By Lemma A.1,
Z
n Z t
X
X
2 (t/x)
2
=
dt
(M )(),
dt
(n)2
x
x
t(t2 1)
1
1
where (u) = R2 (u/x) and runs over all non-trivial zeros of (s). Since 2 is
non-negative, 1 2 (t/x)dt = x|2 |1 = x, while

!

Z 1
Z
2 (t)
2 (t/x)
9.61114
dt
=
O
dt
=
O
.
2 2
t(t2 1)
x2
1/4 tx (t 1/100)
1
By (2.6),
X
X
X 1 2 2
(M )() =
M 2 ()x =
x = S1 (x)2S1 (x/2)+S1 (x/4),
where
(A.3)
Sm (x) =
X x
.
m+1
Setting aside the contribution of all with |()| T0 and all with |()| > T0
and (s) 1/2, and using the symmetry provided by the functional equation,
we obtain
X 1
X
1
|Sm (x)| x1/2
+
x
m+1
m+1
||
||
x1/2
1
||m+1
|()|>T0
|()|>1/2
|()|>T0
1
||m+1
We bound the first sum by [Ros41, Lemma 17] and the second sum by [RS03,
Lemma 2]. We obtain

2.68
1
eT0
+
+ m x1/2 ,
(A.4)
|Sm (x)|
x log
2mT0m T0m+1
2
where 1 = 0.0463, 2 = 0.00167 and 3 = 0.0000744.
Hence

X

3
2.68 9x
eT0
1

+ 2
log
+
+ 2 1 x1/2 .
(M )() x

2T0
4
2
2
T0
117
For T0 = 3.061 1010 and x 2000, we obtain

n
X
(n)2
= (1 + O ())x + O (0.135x1/2 ),
x
n
where = 2.73 1010 .
Corollary A.3. Let 2 be as in (4.33). Assume that all non-trivial zeros of (s)
with |(s)| T0 , T0 = 3.061 1010 , lie on the critical line. Then, for all x 1,

n
X
min (1 + )x + 0.2x1/2 , 1.04488x ,
(A.5)
(n)2
x
n
where = 2.73 1010 .
Proof. Immediate from Lemma A.2 for

P x 2000. For x < 2000, we use computation as follows. Since |2 | = 16 and x/4nx (n) x for all x 0, computing
P
nx (n)2 (n/x) only for x (1/1000)Z [0, 2000] results in an inaccuracy of
at most (16 0.0005/0.9995)x 0.00801x. This resolves the matter at all points
outside (205, 207) (for the first estimate) or outside (9.5, 10.5) and (13.5, 14.5)
(for the second estimate). In those intervals, the prime powers n involved do not
change (since whether x/4 < n x depends only on n and [x]), and thus we can
find the maximum of the sum in (A.5) just by taking derivatives.

Appendix B. Sums involving (q)
We need estimates for several sums involving
P (q) in the denominator.
The easiest are convergent sums, such as q 2 (q)/((q)q). We can express
Q
this as p (1 + 1/(p(p 1))). This is a convergent product, and the main task is
to bound a tail: for r an integer,
X
X
Y
1
1
1
1
= .
(B.1)
log
1+
p(p 1)
p(p 1) n>r n(n 1)
r
p>r
p>r
A quick computation12 now suffices to give
(B.2)
2.591461
X gcd(q, 2)2 (q)

(q)q
< 2.591463
and so
(B.3)
1.295730
X 2 (q)
< 1.295732,
(q)q
q odd
since the expression bounded in (B.3) is exactly half of that bounded in (B.2).
Again using (B.1), we get that
(B.4)
2.826419
X 2 (q)
q
(q)2
< 2.826421.
In what follows, we will use values for convergent sums obtained in much the
same way an easy tail bound followed by a computation.
12Using D. Platts integer arithmetic package.
118
H. A. HELFGOTT
By [Ram95, Lemma 3.4],

X 2 (q)
= log r + cE + O (7.284r 1/3 ),
(q)
qr

(B.5)
X 2 (q)
1
log 2
=
log r + cE +
+ O (4.899r 1/3 ),
(q)
2
2
qr
q odd
where
cE = +
X
p
log p
= 1.332582275 + O (109 /3)
p(p 1)
by [RS62, (2.11)]. As we already said in (7.15), this, supplemented by a computation for r 4 107 , gives
X 2 (q)
log r + 1.312
log r + 1.354
(q)
qr
for r 182. In the same way, we get that

X 2 (q)
1
1
(B.6)
log r + 0.83
log r + 0.85.
2
(q)
2
qr
q odd
for r 195. (The numerical verification here goes up to 1.38108 ; for r > 3.18108 ,
use B.6.)
Clearly
X 2 (q)
X 2 (q)
=
.
(B.7)
(q)
(q)
qr
q odd
q2r
q even
We wish to obtain bounds for the sums

X 2 (q)
X 2 (q)
,
,
(q)2
(q)2
qr
qr
q odd
X 2 (q)
,
(q)2
qr
q even
where N Z+ and r 1. To do this, it will be helpful to express some of the

quantities within these sums as convolutions.13 For q squarefree and j 1,
X fj (b)
2 (q)q j1
(B.8)
=
,
(q)j
a
ab=q
where fj is the multiplicative function defined by
pj (p 1)j
,
fj (pk ) = 0
(p 1)j p
We will also find the following estimate useful.
fj (p) =
for k 2.
Lemma B.1. Let j 2 be an integer and A a positive real. Let m 1 be an

integer. Then

X 2 (a)
(j)/(2j) Y
1 1
.
1+ j
(B.9)
aj
Aj1
p
aA
(a,m)=1
p|m
13The author would like to thank O. Ramar

e for teaching him this technique.
119
It is useful to note that (2)/(4) = 15/ 2 = 1.519817 . . . and (3)/(6) =

1.181564 . . . .
Proof. The right side of (B.9) decreases as A increases, while the left side depends
only on A. Hence, it is enough to prove (B.9) when A is an integer.
For A = 1, (B.9) is an equality. Let

1 1
(j) Y
.
1+ j
C=
(2j)
p
p|m
Let A 2. Since
aA
(a,m)=1
X
2 (a)
=
C
aj
a<A
(a,m)=1
2 (a)
aj
and
C=
2 (a)
<
aj
2 (a)
1
1
+ j +
,
aj
A
(j 1)Aj1
a
(a,m)=1
a<A
(a,m)=1
a<A
(a,m)=1
2 (a)
1
+ j +
j
a
A
1
dt
tj
we obtain
X
aA
(a,m)=1
X
1
Aj1 1
2 (a)
=
C
+
aj
Aj1
Aj1
Aj1 1
C
< j1 +
A
Aj1
C
1
j1 + j1
A
A

a<A
(a,m)=1
2 (a)
aj

Aj1
1
1
+
A j1

1 .
1
1
+
j
A
(j 1)Aj1
1
Aj1

a<A
(a,m)=1
2 (a)
aj
Since (1 1/A)(1/A + 1) < 1 and 1/A + 1/(j 1) 1 for j 3, we obtain that

1
1
1
+
1 j1
<1
A
A j1
for all integers j 2, and so the statement follows.
We now obtain easily the estimates we want: by (B.8) and Lemma B.1 (with
j = 2 and m = 1),
(B.10)
X 2 (q)
qr
X X f2 (b) 2 (q) X f2 (b) X 2 (a)
(q)2
a
q
b
a2
qr ab=q
b1
ar/b

15 Y
2p 1
6.7345
(2)/(4) X
2
1+
.
f2 (b) =
2
r
r p
(p 1) p
r
=
b1
120
H. A. HELFGOTT
Similarly, by (B.8) and Lemma B.1 (with j = 2 and m = 2),
(B.11)
X f2 (b) X 2 (a)
X 2 (q)
(2)/(4) 1 X
f2 (b)
=
(q)2
b
a2
1 + 1/22 r
qr
q odd
b1
b odd
b odd
ar/b
a odd

12 1 Y
2p 1
2.15502
= 2
1+
2
r
(p 1) p
r
p>2
X 2 (q)
X 2 (q)
4.31004
=
.
2
2
(q)
(q)
r
(B.12)
qr
q even
qr/2
q odd
Lastly,
(B.13)

X 2 (q)q
X 1 r
X
X 1 X
X 1
2 (q)
=
=
+1
2 (q)
(q)
(d)
(d)
2(d) d
qr
q odd
qr
q odd
dr
d odd
d|q
qr
d|q
q odd
dr
d odd
1 X 1
log r
r X 1
+
0.64787r +
+ 0.425,
2
(d)d 2
(d)
4
d odd
dr
d odd
where we are using (B.3) and (B.6).

Appendix C. Validated numerics
C.1. Integrals of a smoothing function. Let
(
x3 (2 x)3 ex1/2 if t [0, 2],
(C.1)
h : t 7
0
otherwise
Clearly, h(0) = h (0) = h (0) = h(2) = h (2) = h (2) = 0, and h(x), h (x) and
h (x) are all continuous. We are interested in computing
Z
|h(k) (x)|xk1 dx
Ck =
0
for 0 k 4. (If k = 4, the integral is to be understood in the sense of

distributions.)
Rigorous numerical integration14 gives that
(C.2)
1.6222831573801406 C0 1.6222831573801515.
We will compute Ck , 1 k 4, in a somewhat different way.15

The function (x3 (2 x)3 ex1/2 ) = ((x3 (2 x)3 ) + x3 (2 x)3 )ex1/2
has the
3 (2 x)3 ) + x3 (2 x)3 , namely, 0, 2,

same
zeros
as
H
(x)
=
(x
10
2 and
1
( 10 + 2); only the first three of these four roots lie in the interval [0, 2].
14By VNODE-LP [Ned06], running on PROFIL/BIAS [Kn
u99].
15It does not seem possible to use [Ned06] directly on an integrand involving the abs function,
due to the fact that it is not differentiable at the origin.
121
(x)) is + within (0, 10 2) and within

The
sign
of
H
(x)
(and
hence
of
h
1
( 10 2, 2). Hence
Z
|h (x)|dx = |h( 10 2) h(0)| + |h(2) h( 10 2)|

C1 =
(C.3)
0

= 2h( 10 2) = 3.58000383169 + O 1012 .
The situation with (x3 (2 x)3 ex1/2 ) is similar: it has zeros at the roots of
H2 (x) = 0, where H2 (x) = H1 (x) + H

k (x) +
in general, Hk+1 (x) = H
1 (x) (and,
Hk (x)).The roots within [0, 2] are 0, 3 1, 21 3 and 2. Write 2,1 = 3 1,

2,2 = 21 3. The sign of H2 (x) (and hence of h (x)) is first +, then , then
+. Hence
(C.4)
Z 2
Z 2,2
Z 2,1
Z
h (x)xdx +
h (x)xdx
h (x)xdx
|h (x)|xdx =
C2 =
=h
+h
2,1
(x)x|0 2,1
(x)x|22,2
2,1
0
2
h (x)dx
2,2
(x)x|2,1
2,2
2,2
h (x)dx
2,1
h (x)dx
2,2
= 2h (2,1 )2,1 2h(2,1 ) 2h (2,2 )2,2 + 2h(2,2 )

= 15.27956091266 + O 1011 .
To compute C3 , we proceed in the same way, except now we must find the roots
numerically. It is enough to find (candidates for) the roots using any available
tool16 and then check rigorously that the sign does change around the purported
roots. In this way, we check that the roots 3,1 , 3,2 , 3,3 of H3 (x) = 0 lie within
the intervals
[0.366931547524632, 0.366931547524633],
[1.233580882085861, 1.233580882085862],
[1.847147885393624, 1.847147885393625],
respectively. The sign of H3 (x) on the interval [0, 2] is first +, then , then +,
then . Proceeding as before, we obtain that
Z
C3 =
|h (x)|x2 dx
0
(C.5)
=2
3
X
(1)j+1 (h (3,j )23,j 2h (3,j )3,j + 2h(3,j ))
j=1
and so interval arithmetic gives us

(C.6)

C3 = 131.3398196149 + O 1010 .
The treatment of the integral in C4 is very similar, at least as first. The roots
4,1 , 4,2 of H4 (x) = 0 lie within the intervals
[0.866114027542349, 0.86611402754235],
[1.640243631518005, 1.640243631518006].
16Routine find root in SAGE was used here.
122
H. A. HELFGOTT
The sign of H4 (x) on the interval [0, 2] is first , then +, then . Using integration
by parts as before, we obtain
Z 2

(4) 3
h (x) x dx = lim h(3) (x)x3 lim h(3) (x)x3
x0+
+2
j=1
Now
x2

(1)j h(3) (4,j )34,j 3h (4,j )24,j + 6h (4,j )4,j 6h(4,j )
2
X

= 2199.91310061863 + O 3 1011 .
Z
Hence
(C.7)
|h(4) (x)x3 |dx = lim |h(3) (2 + ) h(3) (2 )| 23 = 48 e3/2 23 .

0+

C4, = 48 e3/2 23 = 3920.8817036284 + O 1010 .
C.2. Extrema via bisection and truncated series. Let f : I R, I R.

We wish to find the minima and maxima of f in I rigorously.
The bisection method (as described in, e.g., [Tuc11, 5.2]) can be used to show
that the minimum (or maximum) of f on a compact interval I lies within an
interval (usually a very small one). We will need to complement it by other
arguments if either (a) I is not compact, or (b) we want to know the minimum
or maximum exactly.
p
As in 5.3, let j() = (1 + 2 )1/2 and () = (1 + j())/2 for 0. Let ,
cos 0 , sin 0 , c0 and c1 be understood as one-variable real-valued functions on ,
given by (3.13), (5.25) and (5.31).
First, let us bound () from below. By the bisection method17 applied with
32 iterations,
0.798375987 min () 0.798375989.
010
p
p
Since j() and () j()/2 /2,
= ,
0
3/2
2()(() + j())
2
2
and so
(C.8)
() 1
1
1 .
2()(() + j())
2
Hence () 0.8418 for 20. We conclude that
(C.9)
0.798375987 min () 0.798375989.

0
Now let us bound c0 () from below. For 8,

s
r
1
1
1
1
1
sin 0 =
,
2 2
2
2
2
whereas cos 0 1/ 2 for all 0. Hence, by (C.9)

0.7983 1
+ > 1.06
(C.10)
c0 ()
2
2
17Implemented by the author from the description in [Tuc11, p. 8788], using D. Platts
interval arithmetic package.
123
for 8. The bisection method applied with 28 iterations gives us that

max c0 () 1 + 5 108 > 1.
(C.11)
0.018
It remains to study c0 () for [0, 0.01]. The method we are about to give
actually works for all [0, 1].
Since

1
1
,
1+x =
1+x =
,
4(1 + x)3/2
2 1+x

3/4
1
1/2
1
1
=
,
,
=
=
3/2
3/2
2(1 + x)
(1 + x)
(1 + x)5/2
1+x
1+x
a truncated Taylor expansion gives us that, for x 0,
1
1
1
1 + x x2 1 + x 1 + x
2
8
2
(C.12)
1
3
1
1
1 x + x2 .
1 x
2
2
8
1+x
Hence, for 0,
(C.13)
1 + 2 /2 4 /8 j() 1 + 2 /2,
1 + 2 /8 54 /128 + 6 /256 8 /2048 () 1 + 2 /8,
and so
() 1 + 2 /8 54 /128
(C.14)
for 8. We also get from (C.12) that
(C.15)
Hence
1
=q
()
1 j() 1 3
1
+
2
2
8
j() 1
2
1 + j()1
2
2

4
2
74
1
3 4
1
+
,
1
+
2 4
16
8 16
8
128
1 j() 1
1
1
2
1
=q
1 .
()
2
2
8
1 + j()1
2
sin 0 =
(C.16)
sin 0
while
(C.17)
cos 0 =
s
r
1
1
2 2()
74
=
16 256
4
7 2
,
16
= ,
16
4
1
1
+
2 2()
2
1 ,
16
cos 0
2
74
+
,
16 256
By (C.13) and (C.15),

(C.18)
2
1 2 /8
2( + j)
2 2 + 52 /8
2
1 32
2
32
33
.
4
64
124
H. A. HELFGOTT
Assuming 0 1,
1
1+
52
16
94
64
52 94
+
+
1
16
64
52 94
16
64
2 !
52 614
+
,
16
256
and so, by (C.14) and (C.15),

2
7
1 8 + 128
2( + j)
2 2 + 52 214
8 2 128 4

7
52 464
+
+
1
1
4
8
128
16
256

2
35 4
81 6 161 8
7
73 355
+

+ 14
+
.
1
4
16
128
2048
2
4
64
512
Hence, we obtain
2
() = 1 +
2( + j)
2( + j)

2

2
73 355
1 2
1 33
+
1+
2 4
64
8 16
4
64
512
(C.19)

2
3
5

3
7
1
96
35
1 +
+
+
+ 13
4
4 32
64
256 2048
512
2
2
3
4
7
165

+
,
1 +
4 32
64
2048
where, in the last line, we use again the assumption 1.
For x [1/4, 0],
1
x
1
x2
x2
1+x 1+ x
=
1
+
2
2 4(1 1/4)3/2
2 33/2
x2
1
1
1 + x.
1+x 1+ x
2
8
2
Hence
(C.20)
r
2
2
4
74
2
74
1
cos 0 1
3/2
+
1
+
32 3 256
16 256
32 512

49
7 2
4 sin 0
1 3/2
4
32
4
3 256
for 1. Therefore,
c0 () = () cos 0 + sin 0

2
73
165 4
4
2
1 +
+
3/2
1
4 32
64
2048
32 3 256
7 3
49
3/2
+
5
4 128
3 1024 !
!
!
7 3 3
7
167
3
3
3
+
+
+
+
4
1+
16
2304 2048
2048 192
147456
1+
3
0.09494 ,
16
125
where we are again using 1. We conclude that, for all (0, 1/2],
c0 () > 1.
Together with (C.10) and (C.11), this shows that
(C.21)
c0 () > 1
> 0.
It is easy to check that c0 (0) = 1.

(The truncated-Taylor series estimates above could themselves have been done
automatically; see [Tuc11, Ch. 4] (automatic differentiation). The footnote in
[Tuc11, p. 72] (referring to the work of Berz and Makino [BM98] on Taylor
models) seems particularly relevant here. We have preferred to do matters by
hand in the above.)
Now let us examine (), given as in (5.43). Let us first focus on the case of
large. We can use the lower bound (C.8) on (). To obtain a good upper
bound on (), we need to get truncated series expansions on 1/ for and j.
These are:
(C.22)
r

p
1
1
1
j() = 2 + 1 = 1 + 2 1 + 2 = + ,
2
2
r
r
r r
r

1
1
1
1
1+j
1
1+ + 2
() =
+ +
=
1+
,
2
2 2 4
2
2
2
2
p
p
together with the trivial bounds j() and () j()/2 /2. By
(C.22),

q
1 + 2
1
1

2 q =
2 q
q
2
2
2
1
1
+
1
+
1
+
2
2
2
2
2
(C.23)

q
2
2
1+
8
2

q + 3/2
=
2

1+ 2
1 2 +
+ 1
22
for 15, and so

(C.24)
j
2+
2

for 15. In fact, the bisection method (applied with 20 iterations, including 10
initial iterations after which the possibility of finding a minimum within each
interval is tested) shows that (C.23) (and hence (C.24)) holds for all 1. By
(C.22),
(C.25)

2( + j)
2 1 +
q
1
2 1+
2
1
1

1
2 1 + 2 +
1
2
++
1
2
1
1
1

2 2
23/2
for 16. (Again, (C.25) is also true for 1 16 by the bisection method; it
is trivially true for [0, 1], since the last term of (C.25) is then negative.) We
126
H. A. HELFGOTT
also have the easy upper bound

1
1
1
q
q
=
+
(C.26)

2( + j)
2 + 1
2 2 (2)3/2
2 2 ( 2 + )
valid for 1/2.
Hence, by (C.12), (C.25) and (C.26),
s
2

= 1+
2( + j)
2( + j)

1
1
1
1
1
1 2
1
1

+
+
1 +
1+
3/2
2
2 2
2 2
2
2
for 3. Again,
we use the bisection method (with 20 iterations) on [1/2, 3],
and note that 1/ 2 < 1/ for < 1/2; we thus obtain
1
1
(C.27)
1 +
2
for all > 0.
We recall (5.43) and the lower bounds (C.24) and (C.8). We get
s
r

2 !
1
1
8
1
1 2
1
1+ 1
1
+
2+
2
2
2
2

1
1
1
1
+
1 +
2
2 + 1
2
!

1
1
1
2
1
2
5
(C.28)
1+
2 +
1 +
2
2
2 4
2 2

1
1
1
1
+ 1 1 + 2 1 +
2
2
2
1
1
1
1
9
37
1
2+
3 1+
1+
5/2
2 4 8
2 16
2
for 2. This implies that () > 1 for 11. (Since our estimates always
give an error of at most O(1/ ), we also get lim () = 1.) The bisection
method (with 20 iterations, including 6 initial iterations) gives that () > 1 also
holds for 1 11.
Let us now look at what happens for 1. From (C.19), we get the simpler
bound
33
2
+
1
(C.29)
1 +
4 32
32
4
valid for 1, implying that
234
2 113
+
+
2
8
64
1024
for 1. We also have, by (5.23) and (C.18),

2

33
1
1 2
1+
1+
2 2( + j)
2( + j)
2 4
4
64
(C.30)
2
3
2
3
5

+
1 +
1 +
4 32
64
4
64
2 1
127
for 1. (This immediately implies the easy bound 1, which follows

anyhow from (5.22) for all 0.)
By (C.13),
1 + 2 /2 4 /8
1 + 2 /2 4 /8
8
j
2

2
4
2
2
2
2

1 + 8
1 + 8
8 + 64
for 1. Therefore, by (5.43),

r

2
1 1
34
2 113
1
52
8
1
+
+
+
+
2
2
8
64
128
2
4
64
2
2 2
2

2
3
2
2
11
3
1
4 3 15
33
7
1 +
+
1+ +
4
32
64
2
2
32
2
2
64
64
4 3

2
for 1. This implies the bound () > 1 for all 1. Conversely, ()
4/ 3/2 follows from () > 1 for > 8/5. We check () 4/ 3/2 for
[1, 8/5] by the bijection method (5 iterations).
We conclude that, for all > 0,

4 3
(C.31)
max 1,
.
2
This bound has the right asymptotics for 0+ and +.
Let us now bound c0 from above. By (C.20) and (C.30),

74
2
52
+
1
+
c0 () = () cos 0 + sin 0 1 +
4
64
32 512
4
(C.32)
32
3
234
75
356
2
1+
+
+
+ 15 1 +
64
128 2048 2048
2
15
for 1. Since 1 and 0 [0, /4] [0, /2], the bound
(C.33)
c0 () cos 0 + sin 0 2
holds for all 0. By (C.27), we also know that, for 2,

1
1
cos 0 + sin 0
c0 () 1 +
2
s
(C.34)

1
1
1 2
9
1 +
+1 2 1 +
.
2
2 2 16
From (C.31) and (C.33), we obtain that

1
1 + 2c20 1 (1 + 2 2) = 5
(C.35)
for all 0. At the same time, (C.31) and (C.32) imply that

1
4 3 1
42 24
+ 2
1 + 2c20
3+
2
15
15

1

4
3
3
42
3
1+
+
=
1
1+
4
8
45
675
4
2
for 0.4. Hence (1 + 2c20 )/ 0.86 for < 0.29. The bisection method
(20 iterations, starting by splitting the range into 28 equal intervals) shows that
128
H. A. HELFGOTT
(1 + 2c20 )/ 0.86 also holds for 0.29 6; for > 6, the same inequality
holds by (C.35).
We have thus shown that
(C.36)
1 + 2c20
min(5, 0.86)
for all > 0.

p
Now we wish to bound ( 2 )/2 from below. By (C.14) and (C.13),
2

2
54
2
1+
2 1 +
8
128
8
(C.37)

2
2 54
52
1
2
2 54
4
=1+
1+
4
64
128 8
8
8
64
for 1, and so
r
52
,
1
2
4
8
and this is greater than /6p
for 1/3. The bisection method (20 iterations, 5
initial steps) confirms that ( 2 )/2 > /6 also holds for 2/3 < 4. On
the other hand, by (C.22) and 2 = (1 + j)/2 (1 + )/2,
v
q

u
r
r
u +1 1 + 1

s
2
2
t 2
2
1
1
2
1+
1+
2
2
2
2
(C.38)
s
r
r

1
1
2
1
+
3/2
1
=
1
2
2
2
2
2
2
p
for 4. We check by the bisection method (20 iterations) that ( 2 )/2
/2 1/23/2 also holds for all 0 4.

We conclude that
(
r
/6
if 4,
2
(C.39)

1
2
for all .
2 23/2
We still have a few other inequalities to check. Let us first derive an easy lower
bound on c1 () for large: by (C.8), (C.23) and (C.12),
r
r

s

1
1
1 + 1/
1
2
8
1
1
+
c1 () =
2
2
3/2
2
2
r

r
1
1
3
2
1
2
1+
1
1
=
4
2 4
2
for 1. Together with (C.34), this implies that, for 2,

1
9
1 + 9
2
1 12 + 8
2
16
c0 1/ 2
1
2 2
,
=
q
2 2 1 3
2c1
2 2 1 3
again for 1. This is 1/ 8 for 8. Hence it is 1/ 8 25 < 0.071 for

25.
129
Let us now look at small. By (C.13),

2

2
2 54
2 94
2
1+
1+
+
=
8
8
32
8
32
for any > 0. Hence, by (C.15) and (C.29),

s
r

5 2
4
1 + 1/
2 2 /8
,
1
c1 () =
1
1
4
2
2
4
4
4
+ 9
8
32
whereas, for 1,
c0 () = () cos 0 + sin 0 1 + sin 0 1 + /4
by (C.20). Thus
1 + 4 12
c0 1/ 2
0.0584
2c1
2 4 1 54 2 1 4
for 0.1.
We check the remaining interval [0.1, 25] (or [0.1, 8], if we aim at the
bound 1/ 8) by the bisection method (with 24 iterations, including 12 initial
iterations or 15 iterations and 10 initial iterations, in the case of [0.1, 8]) and
obtain that
c0 1/ 2
0.0763896
0.0763895 max
0
2c1
(C.40)
1
c0 1/ 2
.
sup
2
1
0
In the same way, we see that
1
1
c0
3 0.171
c1
1 4
for 36 and
1 + 4
c0
0.267
c1
4 1 54 2 1 4
for 0.1. The bisection method applied to [0.1, 36] with 24 iterations (including
12 initial iterations) now gives
c0
0.29888.
(C.41)
0.29887 max
>0 c1
We would also like a lower bound for c0 /c1 . For c0 , we can use the lower bound
c0 1 given by (C.21). By (C.15), (C.30) and (C.37),
s
r

2
74
2 8 + 128
1 + 1/
52
c1 () =
1 +
2
2 /8 54 /64
4
64

2
2
4
5
5
4
1+
1 +
<
16
4
64
for 1/4. Thus, c0 /(c1 ) 1/4 for [0, 1/4]. The bisection method (with 20
iterations, including 10 initial iterations) gives us that c0 /(c1 ) 1/4 also holds
for [1/4, 6.2]. Hence
c0
c1
4
for 6.2.
130
H. A. HELFGOTT
Now consider the case of large . By and 1,

p
( 2 )/
1 1 1/ 2
1/
c0
p
.
(C.42)
p
q
c1
1+1/
2 1 + 1/
1 + 1/
(This is off from optimal by a factor of about 2.) For 200, (C.42) implies
that c0 /(c1 ) 0.6405. The bisection method (with 20 iterations, including
5 initial iterations) gives us c0 /(c1 ) 5/8 = 0.625 for [6.2, 200]. We

conclude that

5
c0
,
min
(C.43)
.
c1
4 8
Finally, we verify an inequality that will be useful for the estimation of a crucial
exponent in one of the main intermediate results (Prop. 5.1). We wish to show
that, for all [0, /2],
(C.44)
sin
sin 2
5 sin3
4 cos2 2
2 cos2 24 cos6
The left side is positive for all (0, /2], since cos2 /2 1/ 2 and (sin 2)/2
is less than 2/2 = . The right side is negative for > 1 (since it is negative
for = 1, and (sin )/(cos )2 is increasing on ). Hence, it is enough to check
(C.44) for [0, 1]. The two sides of (C.44) are equal for = 0; moreover,
the first four derivatives also match at = 0. We take the fifth derivatives of
both sides; the bisection method (running on [0, 1] with 20 iterations, including
10 initial iterations) gives us that the fifth derivative of the left side minus the
fifth derivative of the right side is always positive on [0, 1] (and minimal at 0,
where it equals 30.5 + O 109 ).
References
[AS64]
[BBO10]
[BM98]
[Bom74]
[Boo06]
[Bou99]
[CW89]
[Dav67]
[dB81]
[Des77]
M. Abramowitz and I. A. Stegun. Handbook of mathematical functions with formulas, graphs, and mathematical tables, volume 55 of National Bureau of Standards
Applied Mathematics Series. For sale by the Superintendent of Documents, U.S.
Government Printing Office, Washington, D.C., 1964.
J. Bertrand, P. Bertrand, and J.-P. Ovarlez. Mellin transform. In A. D. Poularikas,
editor, Transforms and applications handbook. CRC Press, Boca Raton, FL, 2010.
M. Berz and K. Makino. Verified integration of ODEs and flows using differential algebraic methods on high-order Taylor models. Reliab. Comput., 4(4):361369,
1998.
E. Bombieri. Le grand crible dans la theorie analytique des nombres. Societe
Mathematique de France, Paris, 1974. Avec une sommaire en anglais, Asterisque,
No. 18.
A. R. Booker. Turing and the Riemann hypothesis. Notices Amer. Math. Soc.,
53(10):12081211, 2006.
J. Bourgain. On triples in arithmetic progression. Geom. Funct. Anal., 9(5):968984,
1999.
J. R. Chen and T. Z. Wang. On the Goldbach problem. Acta Math. Sinica,
32(5):702718, 1989.
H. Davenport. Multiplicative number theory, volume 1966 of Lectures given at the
University of Michigan, Winter Term. Markham Publishing Co., Chicago, Ill., 1967.
N. G. de Bruijn. Asymptotic methods in analysis. Dover Publications Inc., New
York, third edition, 1981.
J.-M. Deshouillers. Sur la constante de Snirel

man. In Seminaire Delange-PisotPoitou, 17e annee: (1975/76), Theorie des nombres: Fac. 2, Exp. No. G16, page 6.
Secretariat Math., Paris, 1977.
131
[DEtRZ97] J.-M. Deshouillers, G. Effinger, H. te Riele, and D. Zinoviev. A complete Vinogradov

3-primes theorem under the Riemann hypothesis. Electron. Res. Announc. Amer.
Math. Soc., 3:99104, 1997.
[Dic66]
L. E. Dickson. History of the theory of numbers. Vol. I: Divisibility and primality.
Chelsea Publishing Co., New York, 1966.
[GD04]
X. Gourdon and P. Demichel. The first 1013 zeros of the Riemann zeta function, and zeros computation at very large height.
http://numbers.computation.free.fr/Constants/Miscellaneous/zetazeros1e13-1e24.pdf,
2004.
[GR00]
I. S. Gradshteyn and I. M. Ryzhik. Table of integrals, series, and products. Academic Press Inc., San Diego, CA, sixth edition, 2000. Translated from the Russian,
Translation edited and with a preface by Alan Jeffrey and Daniel Zwillinger.
[Har66]
G. H. Hardy. Collected papers of G. H. Hardy (Including Joint papers with J. E.
Littlewood and others). Vol. I. Edited by a committee appointed by the London
Mathematical Society. Clarendon Press, Oxford, 1966.
[HB79]
D. R. Heath-Brown. The fourth power moment of the Riemann zeta function. Proc.
London Math. Soc. (3), 38(3):385422, 1979.
[HB85]
D. R. Heath-Brown. The ternary Goldbach problem. Rev. Mat. Iberoamericana,
1(1):4559, 1985.
[Hel]
H. A. Helfgott. Minor arcs for Goldbachs problem. Preprint. Available as
arXiv:1205.5252.
[HL23]
G. H. Hardy and J. E. Littlewood. Some problems of Partitio numerorum; III: On
the expression of a number as a sum of primes. Acta Math., 44(1):170, 1923.
[HP]
H. A. Helfgott and D. Platt. Numerical verification of ternary Goldbach. Preprint.
[IK04]
H. Iwaniec and E. Kowalski. Analytic number theory, volume 53 of American Mathematical Society Colloquium Publications. American Mathematical Society, Providence, RI, 2004.
[Kad05]
H. Kadiri. Une region explicite sans zeros pour la fonction de Riemann. Acta
Arith., 117(4):303339, 2005.
[Kn
u99]
O. Kn
uppel. PROFIL/BIAS, February 1999. version 2.
[KPS72]
N. I. Klimov, G. Z. Pil tja, and T. A. Septickaja.
An estimate of the absolute
constant in the Goldbach-Snirel

man problem. In Studies in number theory, No. 4
(Russian), pages 3551. Izdat. Saratov. Univ., Saratov, 1972.
[Lam08]
B. Lambov. Interval arithmetic using SSE-2. In Reliable Implementation of Real
Number Algorithms: Theory and Practice. International Seminar Dagstuhl Castle,
Germany, January 8-13, 2006, volume 5045 of Lecture Notes in Computer Science,
pages 102113. Springer, Berlin, 2008.
[Lan12]
E. Landau. Gel
oste und ungel
oste Probleme aus der Theorie der Primzahlverteilung
und der Riemannschen Zetafunktion. In Proceedings of the fifth Itnernational Congress of Mathematicians, volume 1, pages 93108. Cambridge, 1912.
[Lin41]
U. V. Linnik. The large sieve.. C. R. (Doklady) Acad. Sci. URSS (N.S.), 30:292
294, 1941.
[LW02]
M.-Ch. Liu and T. Wang. On the Vinogradov bound in the three primes Goldbach
conjecture. Acta Arith., 105(2):133175, 2002.
[McC84]
K. S. McCurley. Explicit estimates for the error term in the prime number theorem
for arithmetic progressions. Math. Comp., 42(165):265285, 1984.
[Mon71]
H. L. Montgomery. Topics in multiplicative number theory. Lecture Notes in Mathematics, Vol. 227. Springer-Verlag, Berlin, 1971.
[MV73]
H. L. Montgomery and R. C. Vaughan. The large sieve. Mathematika, 20:119134,
1973.
[MV07]
H. L. Montgomery and R. C. Vaughan. Multiplicative number theory. I. Classical
theory, volume 97 of Cambridge Studies in Advanced Mathematics. Cambridge University Press, Cambridge, 2007.
[Ned06]
N. S. Nedialkov. Vnode-lp: a validated solver for initial value problems in ordinary
differential equations, July 2006. version 0.3.
[OeSHP13] T. Oliveira e Silva, S. Herzog, and S. Pardi. Empirical verification of the even
goldbach conjecture, and computation of prime gaps, up to 4 1018 . Accepted for
publication in Math. Comp., 2013.
132
H. A. HELFGOTT
[OLBC10] F. W. J. Olver, D. W. Lozier, R. F. Boisvert, and Ch. W. Clark, editors. NIST handbook of mathematical functions. U.S. Department of Commerce National Institute
of Standards and Technology, Washington, DC, 2010. With 1 CD-ROM (Windows,
Macintosh and UNIX).
[Olv58]
F. W. J. Olver. Uniform asymptotic expansions of solutions of linear second-order
differential equations for large values of a parameter. Philos. Trans. Roy. Soc. London. Ser. A, 250:479517, 1958.
[Olv59]
F. W. J. Olver. Uniform asymptotic expansions for Weber parabolic cylinder functions of large orders. J. Res. Nat. Bur. Standards Sect. B, 63B:131169, 1959.
[Olv61]
F. W. J. Olver. Two inequalities for parabolic cylinder functions. Proc. Cambridge
Philos. Soc., 57:811822, 1961.
[Olv65]
F. W. J. Olver. On the asymptotic solution of second-order differential equations
having an irregular singularity of rank one, with an application to Whittaker functions. J. Soc. Indust. Appl. Math. Ser. B Numer. Anal., 2:225243, 1965.
[Olv74]
F. W. J. Olver. Asymptotics and special functions. Academic Press [A subsidiary of
Harcourt Brace Jovanovich, Publishers], New York-London, 1974. Computer Science
and Applied Mathematics.
[Plaa]
D. Platt. Computing (x) analytically. Preprint. Available as arXiv:1203.5712.
[Plab]
D. Platt. Numerical computations concerning grh. Preprint.
[Pla11]
D. Platt. Computing degree 1 L-functions rigorously. PhD thesis, Bristol University,
2011.
[Ram95]
O. Ramare. On Snirel
mans constant. Ann. Scuola Norm. Sup. Pisa Cl. Sci. (4),
22(4):645706, 1995.
[Ram09]
O. Ramare. Arithmetical aspects of the large sieve inequality, volume 1 of HarishChandra Research Institute Lecture Notes. Hindustan Book Agency, New Delhi,
2009. With the collaboration of D. S. Ramana.
[Ram10]
O. Ramare. On Bombieris asymptotic sieve. J. Number Theory, 130(5):11551189,
2010.
[Ric01]
J. Richstein. Verifying the Goldbach conjecture up to 4 1014 . Math. Comp.,
70(236):17451749 (electronic), 2001.
[Ros41]
B. Rosser. Explicit bounds for some functions of prime numbers. Amer. J. Math.,
63:211232, 1941.
[RS62]
J. B. Rosser and L. Schoenfeld. Approximate formulas for some functions of prime
numbers. Illinois J. Math., 6:6494, 1962.
[RS75]
J. B. Rosser and L. Schoenfeld. Sharper bounds for the Chebyshev functions (x)
and (x). Math. Comp., 29:243269, 1975. Collection of articles dedicated to Derrick
Henry Lehmer on the occasion of his seventieth birthday.
[RS03]
O. Ramare and Y. Saouter. Short effective intervals containing primes. J. Number
Theory, 98(1):1033, 2003.
[RV83]
H. Riesel and R. C. Vaughan. On sums of primes. Ark. Mat., 21(1):4674, 1983.
[Sch33]
L. Schnirelmann. Uber
additive Eigenschaften von Zahlen. Math. Ann., 107(1):649
690, 1933.
[Sha]
X. Shao. A density version of the Vinogradov three prime theorem. Preprint. Available as arXiv:1206.6139.
[Tao]
T. Tao. Every odd number greater than 1 is the sum of at most five primes. Preprint.
Available as arXiv:1201.6656.
[Tem10]
N. M. Temme. Parabolic cylinder functions. In NIST handbook of mathematical
functions, pages 303319. U.S. Dept. Commerce, Washington, DC, 2010.
[Tru]
T. S. Trudgian. An improved upper bound for the error in the zero-counting formulae
for Dirichlet L-functions and Dedekind zeta-functions. Preprint.
[Tuc11]
W. Tucker. Validated numerics: A short introduction to rigorous computations.
Princeton University Press, Princeton, NJ, 2011.
[Tur53]
A. M. Turing. Some calculations of the Riemann zeta-function. Proc. London Math.
Soc. (3), 3:99117, 1953.
[TV03]
N. M. Temme and R. Vidunas. Parabolic cylinder functions: examples of error
bounds for asymptotic expansions. Anal. Appl. (Singap.), 1(3):265288, 2003.
[Vau77]
R. C. Vaughan. On the estimation of Schnirelmans constant. J. Reine Angew.
Math., 290:93108, 1977.
[Vin37]
[Wed03]
[Whi03]
[Wig20]
[Won01]
133
I. M. Vinogradov. Representation of an odd number as a sum of three primes. Dokl.

Akad. Nauk. SSR, 15:291294, 1937.
S. Wedeniwski. ZetaGrid - Computational verification of the Riemann hypothesis.
Conference in Number Theory in honour of Professor H. C. Williams, Banff, Alberta,
Canada, May 2003.
E. T. Whittaker. On the functions associated with the parabolic cylinder in harmonic
analysis. Proc. London Math. Soc., 35:417427, 1903.
S. Wigert. Sur la theorie de la fonction (s) de Riemann. Ark. Mat., 14:117, 1920.
R. Wong. Asymptotic approximations of integrals, volume 34 of Classics in Applied
Mathematics. Society for Industrial and Applied Mathematics (SIAM), Philadelphia,
PA, 2001. Corrected reprint of the 1989 original.
Harald Helfgott, Ecole

Normale Sup
erieure, D
epartement de Math
ematiques,
45 rue dUlm, F-75230 Paris, France
E-mail address: harald.helfgott@ens.fr

Problema de Golbach - Matematicas

Uploaded by

Copyright:

Available Formats

You might also like

Problema de Golbach - Matematicas

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Problema de Golbach - Matematicas

Uploaded by

Copyright:

Available Formats

MAJOR ARCS FOR GOLDBACHS PROBLEM

arXiv:1305.2897v1 [math.NT] 13 May 2013

MAJOR ARCS FOR GOLDBACHS PROBLEM

(S(, x)) e(N )d +

(S(, x))3 e(N )d,

Our integral will actually be of the form

where + and are two different smoothing functions to be discussed soon.

MAJOR ARCS FOR GOLDBACHS PROBLEM

S, (/x, x) = Iq=1 b()x

F ()x + small error,

MAJOR ARCS FOR GOLDBACHS PROBLEM

and the Ecole

and work. However, D. Platts computations [Plab] used a significant amount

2.2. Dirichlet characters and L functions. A Dirichlet character : Z C

(n) for all n coprime to q, we say that induces . A character is primitive if

MAJOR ARCS FOR GOLDBACHS PROBLEM

2.4. Mellin transforms. The Mellin transform of a function : (0, ) C is

provided that + iR is within the strip on which M f is defined. We also know

M (tf (t))(s) = s M f (s),

M ((log t)f (t))(s) = (M f ) (s)

(as in, e.g., [BBO10, Table 1.11]).

Let fz = ezt , where (z) > 0. Then

where R/Z, is the von Mangoldt function and : R C is of fast enough

where 1 , 2 , 3 : R C. As can be readily seen, (3.2) equals

and 0 > 0, r 1 are given.

MAJOR ARCS FOR GOLDBACHS PROBLEM

parts to reduce the problem of estimating sums to a problem of counting

associated to a b Z/qZ and a Dirichlet character with modulus q. We let

Since (a, q) = 1, (, a) = (a) (). The factor ((q, n ))/((q, n )) equals 1

Recalling the definition (3.1) of S (, x), we conclude that

Hence S1 (, x)S2 (, x)S3 (, x)e(N ) equals

S, (/x) = O (err, (, x)) x

MAJOR ARCS FOR GOLDBACHS PROBLEM

The sum (0 , n) is a Ramanujan sum; as is well-known (see, e.g., [IK04,

The main term in (3.16) is

where is thrice differentiable outside finitely many points and satisfies

Here (3.19) is bounded by 2.82643x2 (by (B.4)) times

Now, (3.18) equals

The last line in (3.20) is bounded4 by

By (2.1) (with k = 3), (B.11) and (B.12), this is at most

order to avoid a potentially harmful dependence on N .

MAJOR ARCS FOR GOLDBACHS PROBLEM

It is easy to see that

Expanding the integral implicit in the definition of fb,

(the main term) plus

4.310040 | |21 + 0.00113

(a) (a) S, (/x, x)S, (/x, x)

+ O 2(1 + q)(log x)2 || max |S (, x)| + (1 + q)(log x)2 ||

As is well-known (see, e.g., [IK04, Lem. 3.1])

where q is the modulus of (i.e., the conductor of ), and

Using the expressions (3.12) and (3.13), we obtain

where Kq,2 = Kq,1 (2|S (0, x)|/x + Kq,1 /x).

MAJOR ARCS FOR GOLDBACHS PROBLEM

Thus, the integral of |S (, x)|2 over M (see (3.5)) is

ET,s = max | err,T (, x)|

If we also need a lower bound, we proceed as follows.

By (2.1) and Plancherel,

Using (B.13), we get that