
Analytic Number Theory

Valentin Blomer
22 November 2022

Only for personal use!


Warning: these lecture notes are likely to contain misprints.

Contents
0 Introduction
1 Arithmetic functions and Dirichlet series
2 Fourier analysis and Poisson summation
3 The functional equation
4 The Mellin transform
5 Zeros of L-functions and prime number theorems

0 Introduction
Recommended literature:
• J. Brüdern, Einführung in die analytische Zahlentheorie
• H. Davenport, Multiplicative Number Theory
• H. Iwaniec, E. Kowalski, Analytic Number Theory
Perhaps also useful:
• H. Montgomery, R. Vaughan, Multiplicative Number Theory I. Classical Theory
• G. Tenenbaum, Introduction to Analytic and Probabilistic Number Theory

Analytic number theory is the art of counting arithmetic objects. Typical questions include:
• How many primes are there up to x?
• How many primes are there up to x in a given arithmetic progression a (mod q)?
• How many primes are there in a short interval (x, x + y] with y much less than x?¹
• In how many ways can one write a number as a sum of 2 (3, 4, . . .) primes?
• The same questions for squarefree numbers, sums of two squares, etc.
• How many numbers up to x are there with exactly k prime factors?
• How many numbers up to x are there with no prime factor ≥ y?
• What can one say about the distribution of quadratic non-residues, or primitive roots modulo p?
• How many abelian groups are there of order less than x?


One can also study more complicated algebraic objects: class groups, elliptic
curves etc.

We will introduce analytic techniques to answer such questions. Often the starting point for a counting problem is an analytic expression for a δ-function.

¹ This is related to the previous question: two numbers are in the same progression modulo q (a prime, say) if they are q-adically close, and they are in a short interval if they are archimedically close.

Example 1
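We use the notation $e(x) = e^{2\pi i x}$. For n ∈ Z we have
\[ \int_0^1 e(\alpha n)\,d\alpha = \delta_{n=0}. \]

Hence the number of integral solutions to $n = x_1^2 + x_2^2 + x_3^2 + x_4^2$ (sums of four squares) is
\[ \sum_{x_1,\dots,x_4 \in \mathbb{Z}} \int_0^1 e\big(\alpha(x_1^2 + \dots + x_4^2 - n)\big)\,d\alpha = \int_0^1 e(-\alpha n) \Big( \sum_{|x| \le \sqrt{n}} e(\alpha x^2) \Big)^4 d\alpha. \]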

We have now “reduced” the counting problem to evaluating a single integral. In order to do this, we observe that the exponential sum $\sum_x e(\alpha x^2)$ is typically quite small due to cancellation unless α is close to a rational number a/q with small denominator, in which case it is determined by the behaviour of squares modulo q. Hence one could hope that the main contribution to the integral comes from those α that are close to rational points with small denominator, and that the remaining parts are negligible. This indeed gives an asymptotic formula with error term. It is also an instance of a local-global principle, since the global asymptotic formula depends on the local behaviour in residue classes. This method is called the circle method.
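
The orthogonality idea of Example 1 can be tried out numerically. The following is only a sketch under an extra assumption that is not in the notes: the integral over α is replaced by an average over the N points α = a/N with N = 4n + 1, which detects the condition exactly because the integers $x_1^2 + \dots + x_4^2 - n$ in question lie in [−n, 3n]. The function names are illustrative.

```python
import cmath
from math import isqrt


def r4_brute(n):
    """Count (x1, x2, x3, x4) in Z^4 with x1^2 + ... + x4^2 = n by enumeration."""
    m = isqrt(n)
    rng = range(-m, m + 1)
    return sum(1 for a in rng for b in rng for c in rng for d in rng
               if a * a + b * b + c * c + d * d == n)


def r4_orthogonality(n):
    """Same count via (1/N) * sum_a e(-a n / N) * S(a/N)^4,
    where S(alpha) = sum_{|x| <= sqrt(n)} e(alpha x^2).

    Assumption of this sketch: the integral over [0, 1] is discretised
    with N = 4n + 1 sample points, which is exact for these frequencies.
    """
    m = isqrt(n)
    N = 4 * n + 1
    total = 0
    for a in range(N):
        S = sum(cmath.exp(2j * cmath.pi * a * x * x / N) for x in range(-m, m + 1))
        total += cmath.exp(-2j * cmath.pi * a * n / N) * S ** 4
    return round((total / N).real)


if __name__ == "__main__":
    for n in range(1, 8):
        print(n, r4_brute(n), r4_orthogonality(n))
```

Both functions return the same counts; the second one is of course not the circle method itself, only the exact identity it starts from.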

Example 2
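For natural numbers n we have
\[ \sum_{d \mid n} \mu(d) = \delta_{n=1}, \]
where µ denotes the Möbius function
\[ \mu(n) = \begin{cases} (-1)^{\#\{\text{prime divisors of } n\}}, & n \text{ squarefree},\\ 0, & \text{otherwise}, \end{cases} \]
see (1.4). This inclusion-exclusion formula is essentially the sieve of Eratosthenes: let P be the product of all primes $\le \sqrt{x}$. Then, counting also n = 1, we have
\[ 1 + \sum_{\sqrt{x} < p \le x} 1 = \sum_{\substack{n \le x\\ (n,P)=1}} 1 = \sum_{n \le x} \sum_{d \mid (n,P)} \mu(d) = \sum_{d \mid P} \mu(d) \sum_{\substack{n \le x\\ n \equiv 0 \ (\mathrm{mod}\ d)}} 1 = \sum_{d \mid P} \mu(d) \Big[\frac{x}{d}\Big]. \]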

Ignoring error terms, we could hope that this is approximately
\[ x \sum_{d \mid P} \frac{\mu(d)}{d} = x \prod_{p \le \sqrt{x}} \Big(1 - \frac{1}{p}\Big). \]

One can show (Mertens' theorem) that this is asymptotic to $2e^{-\gamma} x/\log x$, where γ = 0.577 . . . is the Euler constant. Hence our heuristic argument gives the correct order of magnitude, but the wrong constant. In particular, it is not allowed to ignore error terms. Nevertheless, this method can be refined to prove interesting results (not for counting primes, though), and leads to sieve theory.
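
A small numerical sketch of Example 2 (function names are illustrative, and the enumeration of all divisors of P is only feasible for small x). It evaluates the exact Legendre sum $\sum_{d \mid P} \mu(d)[x/d]$, which counts the integers n ≤ x coprime to P, that is, n = 1 together with the primes in (√x, x], and compares it with the heuristic product.

```python
from itertools import combinations
from math import isqrt


def primes_up_to(n):
    """Sieve of Eratosthenes: list of primes <= n (n >= 1)."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, isqrt(n) + 1):
        if sieve[i]:
            sieve[i * i::i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(n + 1) if sieve[i]]


def legendre_count(x):
    """sum over d | P of mu(d) * floor(x / d), P = product of primes <= sqrt(x)."""
    small = primes_up_to(isqrt(x))
    total = 0
    for k in range(len(small) + 1):
        for subset in combinations(small, k):
            d = 1
            for p in subset:
                d *= p
            total += (-1) ** k * (x // d)   # mu(d) = (-1)^k for squarefree d
    return total


def heuristic(x):
    """x * product over p <= sqrt(x) of (1 - 1/p), i.e. the sum without floors."""
    value = float(x)
    for p in primes_up_to(isqrt(x)):
        value *= 1 - 1 / p
    return value


if __name__ == "__main__":
    for x in (100, 200, 400):
        print(x, legendre_count(x), round(heuristic(x), 2))
```

The exact sum has $2^{\pi(\sqrt{x})}$ terms, each contributing an error of size O(1) when the floor is dropped, which is why the error terms cannot simply be ignored.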

Example 3
For c ∈ R let (c) be the vertical path c + it, −∞ < t < ∞. Then for c > 0 one has
\[ \frac{1}{2\pi i} \int_{(c)} y^{-s} \frac{ds}{s} = \begin{cases} 1, & 0 < y < 1,\\ 0, & y > 1. \end{cases} \]
(This is a non-trivial formula that we will prove later in (4.6).) The complex contour integral is only conditionally convergent, but we ignore this for the moment. Hence if $a_n$ is any sequence of complex numbers, we have (ignoring convergence)
\[ \sum_{n \le x} a_n = \sum_n a_n \delta_{n/x \le 1} = \sum_n a_n \frac{1}{2\pi i} \int_{(c)} \Big(\frac{n}{x}\Big)^{-s} \frac{ds}{s} = \frac{1}{2\pi i} \int_{(c)} \sum_n \frac{a_n}{n^s}\, x^s \frac{ds}{s}. \]

Again our counting problem is “reduced” to evaluating one integral, which in turn requires some understanding of the analytic properties of the associated Dirichlet series $\sum_n a_n n^{-s}$. As a toy problem, we have
\[ \sum_{n \le x} 1 = \frac{1}{2\pi i} \int_{(c)} \zeta(s) x^s \frac{ds}{s}, \qquad c > 1. \]

By Cauchy’s integral theorem/residue theorem, this integral is independent of the path, as long as it doesn’t cross a pole. If it does, we need to pick up the residue. We shift the contour to the left to some 0 < c < 1. We pick up the pole of the zeta-function at s = 1, getting
\[ \sum_{n \le x} 1 = x + \frac{1}{2\pi i} \int_{(c)} \zeta(s) x^s \frac{ds}{s} \overset{?}{=} x + O(x^c). \]

There are serious convergence issues, but they can be overcome in many applications. We will use this idea, for instance, to count primes up to x and prove the prime number theorem.

1 Arithmetic functions and Dirichlet series
(1.1) Definition
An arithmetic function is a map f : N → C. An arithmetic function f ≠ 0 is multiplicative resp. completely multiplicative if f(nm) = f(n)f(m) for all (m, n) ∈ N² with (m, n) = 1 resp. for all (m, n) ∈ N². The set A of arithmetic functions is a ring (an integral domain) with respect to pointwise addition and Dirichlet convolution
\[ (f * g)(n) := \sum_{d \mid n} f(d)\, g(n/d). \]

The identity element with respect to multiplication is the function η with η(1) = 1, η(n) = 0 for n > 1. (To show that there are no zero divisors, let r, s be minimal with f(r) ≠ 0 ≠ g(s) for 0 ≠ f, g ∈ A. Then (f ∗ g)(rs) = f(r)g(s) ≠ 0.)

(1.2) Remarks
a) Since f(n) = f(1 · n) = f(1)f(n) for a multiplicative function f, we automatically have f(1) = 1.
b) Multiplicative functions are determined by their values on prime powers, completely multiplicative functions by their values on primes.
c) The units of A are precisely the functions f with f(1) ≠ 0: if f ∗ g = η, then 1 = η(1) = f(1)g(1). Conversely, if f(1) ≠ 0, define its inverse g recursively by g(1) = 1/f(1) and
\[ g(n) = -\frac{1}{f(1)} \sum_{\substack{d \mid n\\ d > 1}} f(d)\, g(n/d). \]
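
The convolution and the recursive inverse from (1.2c) are easy to experiment with. The sketch below stores an arithmetic function as a list indexed by n (index 0 unused); the function names and this representation are choices made here for illustration. Inverting the constant function 1 recovers the Möbius function.

```python
from fractions import Fraction


def dirichlet_convolve(f, g, N):
    """Return the list h with h[n] = (f * g)(n) for 1 <= n <= N."""
    h = [0] * (N + 1)
    for d in range(1, N + 1):
        for m in range(1, N // d + 1):
            h[d * m] += f[d] * g[m]
    return h


def dirichlet_inverse(f, N):
    """Inverse of f under Dirichlet convolution (requires f[1] != 0), using
    g(1) = 1/f(1) and g(n) = -(1/f(1)) * sum_{d | n, d > 1} f(d) g(n/d)."""
    g = [Fraction(0)] * (N + 1)
    g[1] = Fraction(1) / f[1]
    for n in range(2, N + 1):
        s = sum(f[d] * g[n // d] for d in range(2, n + 1) if n % d == 0)
        g[n] = -s / f[1]
    return g


if __name__ == "__main__":
    N = 12
    one = [0] + [1] * N                 # the constant function 1
    mu = dirichlet_inverse(one, N)      # Moebius function
    print([int(v) for v in mu[1:]])
    print(dirichlet_convolve(one, mu, N)[1:])   # eta = (1, 0, 0, ...)
```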

(1.3) Lemma
The set of multiplicative functions forms a subgroup of A∗.

Proof. For (n, m) = 1 we have a bijection {d | nm} ↔ {d1 | n} × {d2 | m}. Let f, g be multiplicative (hence invertible). Then
\[ (f * g)(nm) = \sum_{d \mid nm} f(d)\, g\Big(\frac{nm}{d}\Big) = \sum_{d_1 \mid n} \sum_{d_2 \mid m} f(d_1 d_2)\, g\Big(\frac{nm}{d_1 d_2}\Big) = \sum_{d_1 \mid n} f(d_1)\, g\Big(\frac{n}{d_1}\Big) \sum_{d_2 \mid m} f(d_2)\, g\Big(\frac{m}{d_2}\Big) = (f * g)(n) \cdot (f * g)(m) \]
for (n, m) = 1, so f ∗ g is multiplicative. To show that $f^{-1}$ is multiplicative, define
\[ h\Big( \prod_p p^{e_p} \Big) := \prod_p f^{-1}(p^{e_p}). \]
Then h is multiplicative and $h = f^{-1}$ on prime powers, so f ∗ h = η on prime powers, and hence by multiplicativity everywhere. We conclude that $f^{-1} = f^{-1} * \eta = f^{-1} * f * h = h$ is multiplicative.

(1.4) Examples
• the function 1(n) = 1 for all n ∈ N is completely multiplicative;
• the divisor function $\tau(n) = \sum_{d \mid n} 1 = (1 * 1)(n)$ is not completely multiplicative. We have $\tau(p^e) = e + 1$;
• the iterated divisor function $\tau_k(n) = (1 * \dots * 1)(n) = \sum_{a_1 \cdots a_k = n} 1$ with $\tau_k(p^e) = \binom{k+e-1}{e}$;
• Euler's φ-function φ(n). We have (φ ∗ 1)(n) = n (check on prime powers);
• the functions $\omega(n) = \#\{p \mid n : p \text{ prime}\}$ and $\Omega(n) = \sum_{p^k \,\|\, n} k$ are not multiplicative;
• the von Mangoldt function
\[ \Lambda(n) = \begin{cases} \log p, & n = p^k,\ k \ge 1,\\ 0, & \text{otherwise}, \end{cases} \]
satisfies (Λ ∗ 1)(n) = log n: for $n = \prod_p p^{e_p}$ we have
\[ \sum_{d \mid n} \Lambda(d) = \sum_p e_p \log p = \sum_p \log p^{e_p} = \log n; \]
• let f ∈ Z[x] be a polynomial and $\rho_f(n) = |\{x \in \mathbb{Z}/n\mathbb{Z} \mid f(x) \equiv 0 \ (\mathrm{mod}\ n)\}|$. By the Chinese remainder theorem, $\rho_f$ is multiplicative;
• the Möbius function
\[ \mu(n) = \begin{cases} (-1)^{\omega(n)}, & n \text{ squarefree},\\ 0, & \text{otherwise}, \end{cases} \]
satisfies µ ∗ 1 = η (check on prime powers), so $\sum_{d \mid n} \mu(d) = \delta_{n=1}$.

(1.5) Theorem (summation by parts)


Let y ∈ N, x ∈ R, x > y. Let g : [y, x] → C be continuously differentiable, and let $a_n \in \mathbb{C}$. Put $A(t) = \sum_{y \le n \le t} a_n$. Then
\[ \sum_{y \le n \le x} a_n g(n) = A(x) g(x) - \int_y^x A(t) g'(t)\,dt. \]

Proof. We have
\[ A(x) g(x) - \sum_{y \le n \le x} a_n g(n) = \sum_{y \le n \le x} a_n \big(g(x) - g(n)\big) = \sum_{y \le n \le x} a_n \int_n^x g'(\xi)\,d\xi = \int_y^x g'(\xi) \sum_{y \le n \le \xi} a_n \,d\xi. \]

(1.6) Examples
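a) The limit
\[ \gamma := \lim_{M \to \infty} \Big( \sum_{m \le M} \frac{1}{m} - \log M \Big) = 0.577\ldots \]
exists since, writing t = [t] + {t}, we have
\[ \sum_{n \le x} \frac{1}{n} = \frac{[x]}{x} + \int_1^x \frac{[t]}{t^2}\,dt = \frac{x - \{x\}}{x} + \int_1^x \frac{t - \{t\}}{t^2}\,dt \]
\[ = \frac{x - \{x\}}{x} + \log x - \int_1^\infty \frac{\{t\}}{t^2}\,dt + \int_x^\infty \frac{\{t\}}{t^2}\,dt = \log x + \underbrace{1 - \int_1^\infty \frac{\{t\}}{t^2}\,dt}_{=:\gamma} + O\Big(\frac{1}{x}\Big). \]

b) We have
\[ \sum_{n \le x} \tau(n) = x \log x + (2\gamma - 1)x + O(x^{1/2}), \]
i.e. “on average” a number n has about log n divisors.

Proof: We have
\[ \sum_{n \le x} \tau(n) = \sum_{ab \le x} 1 = \sum_{\substack{ab \le x\\ a > \sqrt{x}}} 1 + \sum_{\substack{ab \le x\\ b > \sqrt{x}}} 1 + \sum_{\substack{ab \le x\\ a, b \le \sqrt{x}}} 1 = 2 \sum_{\substack{ab \le x\\ a > \sqrt{x}}} 1 + [\sqrt{x}]^2. \]
We have $[\sqrt{x}]^2 = x + O(\sqrt{x})$ and
\[ \sum_{\substack{ab \le x\\ a > \sqrt{x}}} 1 = \sum_{b \le \sqrt{x}} \sum_{\sqrt{x} < a \le x/b} 1 = \sum_{b \le \sqrt{x}} \Big( \frac{x}{b} - \sqrt{x} + O(1) \Big) \]
\[ = x \Big( \log \sqrt{x} + \gamma + O\Big(\frac{1}{\sqrt{x}}\Big) \Big) - \sqrt{x}\,[\sqrt{x}] + O(\sqrt{x}) = x \Big( \frac{1}{2} \log x + \gamma - 1 \Big) + O(\sqrt{x}). \]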
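
The hyperbola decomposition in the proof of b) also gives a fast exact formula: for integer x, counting the pairs with a > √x as $\sum_{b \le \sqrt{x}} ([x/b] - [\sqrt{x}])$ yields $\sum_{n \le x} \tau(n) = 2\sum_{b \le \sqrt{x}} [x/b] - [\sqrt{x}]^2$. The sketch below (illustrative function names, Euler's constant hard-coded) checks this against a direct count and compares with the main term.

```python
from math import isqrt, log

EULER_GAMMA = 0.5772156649015329


def divisor_sum_naive(x):
    """sum_{n <= x} tau(n), computing tau by a sieve over divisors."""
    tau = [0] * (x + 1)
    for d in range(1, x + 1):
        for m in range(d, x + 1, d):
            tau[m] += 1
    return sum(tau)


def divisor_sum_hyperbola(x):
    """Same sum via 2 * sum_{b <= sqrt(x)} floor(x / b) - floor(sqrt(x))^2."""
    s = isqrt(x)
    return 2 * sum(x // b for b in range(1, s + 1)) - s * s


def main_term(x):
    """x log x + (2 gamma - 1) x."""
    return x * log(x) + (2 * EULER_GAMMA - 1) * x


if __name__ == "__main__":
    for x in (10 ** 3, 10 ** 5, 10 ** 7):
        exact = divisor_sum_hyperbola(x)
        print(x, exact, round(exact - main_term(x), 2), round(x ** 0.5, 1))
```

The hyperbola version needs only about √x steps, which makes it practical to watch the size of the error term for fairly large x.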

(1.7) Lemma
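For k ∈ N, ε > 0, there exists a constant C(k, ε) > 0 such that $\tau_k(n) \le C(k, \varepsilon)\, n^\varepsilon$.

Proof. Let ε > 0. We have
\[ \frac{\tau_k(n)}{n^\varepsilon} = \prod_{p^\alpha \,\|\, n} \frac{\binom{k+\alpha-1}{\alpha}}{p^{\alpha\varepsilon}}. \]
If $p > C_0(k, \varepsilon)$ for some sufficiently large $C_0(k, \varepsilon)$, then $p^{\alpha\varepsilon} > (\alpha + k)^k$. We can drop these factors and obtain
\[ \frac{\tau_k(n)}{n^\varepsilon} \le \prod_{\substack{p^\alpha \,\|\, n\\ p \le C_0(k,\varepsilon)}} \frac{\binom{k+\alpha-1}{\alpha}}{p^{\alpha\varepsilon}} \le \Big( \max_{\alpha \ge 1} \frac{\binom{k+\alpha-1}{\alpha}}{2^{\alpha\varepsilon}} \Big)^{C_0(k,\varepsilon)} =: C(k, \varepsilon). \]
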
(1.8) Definition and Lemma
Let f be an arithmetic function. The Dirichlet series attached to f is the formal series $\sum_{n=1}^\infty f(n) n^{-s}$. If f, g are two functions whose Dirichlet series converge absolutely for some s ∈ C, then
a)
\[ \sum_{n=1}^\infty \frac{(f * g)(n)}{n^s} = \sum_{n=1}^\infty \frac{f(n)}{n^s} \sum_{n=1}^\infty \frac{g(n)}{n^s}; \]
b)
\[ \sum_{n=1}^\infty \frac{f(n)}{n^s} = \prod_p \sum_{k=0}^\infty \frac{f(p^k)}{p^{ks}} \]
if f is multiplicative (Euler product).

Proof. a) We have
\[ \sum_{n=1}^\infty \frac{(f * g)(n)}{n^s} = \sum_{n=1}^\infty \frac{1}{n^s} \sum_{ab = n} f(a) g(b) = \sum_{a=1}^\infty \frac{f(a)}{a^s} \sum_{b=1}^\infty \frac{g(b)}{b^s}. \]
b) Multiply out the right hand side and use uniqueness of prime factorization.

(1.9) Examples
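a) For $\Re s > 1$ we have
\[ \zeta(s) := \sum_{n=1}^\infty \frac{1}{n^s} = \prod_p \Big(1 - \frac{1}{p^s}\Big)^{-1} \ne 0. \]
The Riemann zeta function has meromorphic continuation to $\Re s > 0$ with only a simple pole at s = 1: for N ∈ N we have by partial summation
\[ \sum_{n \le N} n^{-s} = [N] N^{-s} + s \int_1^N [x] x^{-s-1}\,dx. \]
For $\Re s > 1$ we have
\[ \zeta(s) = s \int_1^\infty (x - \{x\}) x^{-s-1}\,dx = \frac{s}{s-1} - s \int_1^\infty \{x\} x^{-s-1}\,dx. \]
The right hand side exists for $\Re s > 0$ (except at s = 1) and yields the desired continuation.
b) By (1.4) we have
\[ \sum_{n=1}^\infty \frac{\tau(n)}{n^s} = \zeta(s)^2, \quad \Re s > 1, \qquad \sum_{n=1}^\infty \frac{\phi(n)}{n^s} = \frac{\zeta(s-1)}{\zeta(s)}, \quad \Re s > 2, \]
\[ -\frac{\zeta'(s)}{\zeta(s)} = \sum_{n=1}^\infty \frac{\Lambda(n)}{n^s}, \quad \Re s > 1, \qquad \sum_{n=1}^\infty \frac{\mu(n)}{n^s} = \frac{1}{\zeta(s)}, \quad \Re s > 1. \]
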
(1.10) Theorem
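If $F(s) = \sum_{n=1}^\infty a_n n^{-s}$ is convergent for some $s_0 \in \mathbb{C}$, then it converges for all s ∈ C with $\Re s > \Re s_0$, uniformly on compacta, and defines a holomorphic function in this region.

Proof. We show that the series converges uniformly in $|\arg(s - s_0)| \le \pi/2 - \eta$ for any η > 0. By partial summation we have
\[ \sum_{n=M}^N \frac{a_n}{n^s} = \sum_{n=M}^N \frac{a_n}{n^{s_0}}\, n^{s_0 - s} = \sum_{n=M}^N \frac{a_n}{n^{s_0}}\, N^{s_0 - s} + (s - s_0) \int_M^N \sum_{M \le n \le t} \frac{a_n}{n^{s_0}}\, t^{s_0 - s - 1}\,dt. \]
For any ε > 0 there is M = M(ε) such that $|\sum_{M \le n \le X} a_n n^{-s_0}| \le \varepsilon$ for all X > M. Since $\Re s \ge \Re s_0$ we have
\[ \Big| \sum_{n=M}^N \frac{a_n}{n^s} \Big| \le \varepsilon \Big( 1 + |s - s_0| \int_M^N t^{\Re(s_0 - s) - 1}\,dt \Big) \le \varepsilon \Big( 1 + \frac{|s - s_0|}{\Re(s - s_0)} \Big) \le \varepsilon \Big( 1 + \frac{1}{\sin \eta} \Big). \]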

(1.11) Theorem
There is a number $\sigma_0 \in \mathbb{R} \cup \{\pm\infty\}$ (abscissa of convergence) such that a Dirichlet series converges for all s with $\Re s > \sigma_0$ and diverges for all s with $\Re s < \sigma_0$. If $\sigma_0 > 0$, i.e. if $\sum_n a_n$ diverges, it is given by
\[ \sigma_0 = \limsup_{N \to \infty} \frac{\log |\sum_{n=1}^N a_n|}{\log N} = \inf \Big\{ \alpha \;\Big|\; \sum_{n=1}^N a_n = O(N^\alpha) \Big\}. \]

Remark: The assumption $\sigma_0 > 0$ is no loss of generality and can always be arranged by a suitable shift $a_n \mapsto a_n n^\alpha$.

Proof. The existence of $\sigma_0$ follows from (1.10). The second equality is obvious. We call the right hand side γ. It remains to show $\gamma = \sigma_0$. Let $\sigma > \sigma_0$. Then $\sum_n a_n n^{-\sigma}$ is convergent, hence by partial summation
\[ \sum_{n=1}^N a_n = \sum_{n=1}^N \frac{a_n}{n^\sigma}\, N^\sigma - \sigma \int_1^N \sum_{n \le t} \frac{a_n}{n^\sigma}\, t^{\sigma - 1}\,dt = O(N^\sigma), \]
so that γ ≤ σ. Since $\sigma > \sigma_0$ was arbitrary, we obtain $\gamma \le \sigma_0$.
Now let σ > γ. Choose γ < α < σ so that $\sum_{n=1}^N a_n = O(N^\alpha)$. We have
\[ \sum_{n=1}^N \frac{a_n}{n^\sigma} = \sum_{n=1}^N a_n\, N^{-\sigma} + \sigma \int_1^N \sum_{n \le t} a_n\, t^{-\sigma - 1}\,dt =: f(N), \]
say, and $\lim_{N \to \infty} f(N)$ exists, since for N > M we have
\[ f(N) - f(M) = O(M^{\alpha - \sigma}) + \int_M^N O(t^\alpha) \cdot t^{-\sigma - 1}\,dt = O(M^{\alpha - \sigma}) \to 0 \]
for M → ∞. Hence $\sigma \ge \sigma_0$ and therefore also $\gamma \ge \sigma_0$.

(1.12) Corollary
Let $\sigma_0$ be the abscissa of convergence of $\sum_n a_n n^{-s}$ and $\sigma_1$ the abscissa of convergence of $\sum_n |a_n| n^{-s}$. Then $\sigma_0 \le \sigma_1 \le \sigma_0 + 1$.

Proof. Exercise

(1.13) Theorem (Landau)


Let $F(s) = \sum_n a_n n^{-s}$ be a Dirichlet series with nonnegative coefficients and let $\sigma_0$ be its abscissa of convergence. Then F has a singularity (i.e. is not holomorphic) at $s = \sigma_0$.

Proof. Wlog let $\sigma_0 = 0$, and assume that F is holomorphic at s = 0. Then it is holomorphic in a neighbourhood of 0, and hence we can expand it in a Taylor series about 1 with radius of convergence > 1. Hence there is some δ > 0 such that the following series converges:
\[ \sum_{k=0}^\infty \frac{(-1-\delta)^k}{k!} F^{(k)}(1) = \sum_{k=0}^\infty \frac{(1+\delta)^k}{k!} \sum_{n=1}^\infty \frac{(\log n)^k a_n}{n} = \sum_{n=1}^\infty \frac{a_n}{n} \sum_{k=0}^\infty \frac{(1+\delta)^k (\log n)^k}{k!} = \sum_{n=1}^\infty \frac{a_n}{n}\, e^{(1+\delta)\log n} = \sum_{n=1}^\infty a_n n^\delta. \]
Here everything is absolutely convergent, but by the assumption $\sigma_0 = 0$ the right hand side diverges, a contradiction.

(1.14) Theorem (identity theorem)


Let $F(s) = \sum_n a_n n^{-s}$ and $G(s) = \sum_n b_n n^{-s}$ be two Dirichlet series that converge in $\Re s > c$ for some c ∈ R and that are identical in this region. Then $a_n = b_n$ for all n.

Proof. If not, then let m be the smallest index with $a_m \ne b_m$. For σ > c we have
\[ 0 = m^\sigma \Big( \sum_{n=1}^\infty \frac{a_n}{n^\sigma} - \sum_{n=1}^\infty \frac{b_n}{n^\sigma} \Big) = a_m - b_m + \sum_{n=m+1}^\infty a_n \Big(\frac{m}{n}\Big)^\sigma - \sum_{n=m+1}^\infty b_n \Big(\frac{m}{n}\Big)^\sigma. \]
For σ → ∞ this converges to $a_m - b_m \ne 0$ (limit and summation may be interchanged by uniform convergence), a contradiction.

(1.15) Example
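What is the “probability” that two numbers are coprime? By (1.4) we have
\[ \frac{1}{x^2} \sum_{\substack{n, m \le x\\ (n,m)=1}} 1 = \frac{1}{x^2} \sum_{n, m \le x} \sum_{d \mid (n,m)} \mu(d) = \frac{1}{x^2} \sum_{d \le x} \mu(d) \sum_{\substack{n \le x\\ d \mid n}} \sum_{\substack{m \le x\\ d \mid m}} 1 \]
\[ = \frac{1}{x^2} \sum_{d \le x} \mu(d) \Big( \frac{x}{d} + O(1) \Big)^2 = \sum_{d \le x} \frac{\mu(d)}{d^2} + O\Big( \frac{1}{x} \sum_{d \le x} \frac{1}{d} + \frac{1}{x^2} \sum_{d \le x} 1 \Big) \]
\[ = \sum_{d=1}^\infty \frac{\mu(d)}{d^2} + O\Big( \sum_{d > x} \frac{1}{d^2} + \frac{\log x}{x} + \frac{1}{x} \Big) = \frac{1}{\zeta(2)} + O\Big( \frac{\log x}{x} \Big) \to \frac{6}{\pi^2}. \]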
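
The computation in (1.15) can be checked directly. This sketch (illustrative names; the Möbius values come from a simple sieve written for this example) counts coprime pairs once by brute force and once through the exact identity $\sum_{d \le x} \mu(d) [x/d]^2$, and compares the density with 6/π².

```python
from math import gcd, pi


def mobius_up_to(N):
    """List mu with mu[n] = Moebius function of n for 1 <= n <= N."""
    mu = [1] * (N + 1)
    mu[0] = 0
    is_composite = [False] * (N + 1)
    for p in range(2, N + 1):
        if not is_composite[p]:
            for m in range(2 * p, N + 1, p):
                is_composite[m] = True
            for m in range(p, N + 1, p):
                mu[m] = -mu[m]
            for m in range(p * p, N + 1, p * p):
                mu[m] = 0
    return mu


def coprime_pairs_brute(x):
    """#{(n, m) : 1 <= n, m <= x, gcd(n, m) = 1} by enumeration."""
    return sum(1 for n in range(1, x + 1) for m in range(1, x + 1) if gcd(n, m) == 1)


def coprime_pairs_mobius(x):
    """Same count as sum_{d <= x} mu(d) * floor(x / d)^2."""
    mu = mobius_up_to(x)
    return sum(mu[d] * (x // d) ** 2 for d in range(1, x + 1))


if __name__ == "__main__":
    for x in (10, 100, 1000):
        print(x, coprime_pairs_mobius(x) / x ** 2, 6 / pi ** 2)
```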

(1.16) Notation (Vinogradov symbols)


For complex-valued functions f, g with g ≥ 0 we write f ≪ g if f = O(g). If in addition f ≥ 0, we write f ≫ g if g = O(f) and f ≍ g if f = O(g) and g = O(f).

Appendix: characters
(1.17) Definition
Let q ∈ N. A homomorphism $\chi : (\mathbb{Z}/q\mathbb{Z})^* \to S^1$ is called a Dirichlet character. We extend χ to a completely multiplicative q-periodic function $\mathbb{Z} \to S^1 \cup \{0\}$ by putting χ(n) = 0 for (n, q) ≠ 1 and call this again a Dirichlet character. The function
\[ L(s, \chi) := \sum_{n=1}^\infty \chi(n) n^{-s} = \prod_p \big(1 - \chi(p) p^{-s}\big)^{-1} \]
is called Dirichlet L-function. We write χ (mod q) for the set of characters $\chi \in \widehat{(\mathbb{Z}/q\mathbb{Z})^*}$.

(1.18) Remarks and Examples


a) Since $(\mathbb{Z}/q\mathbb{Z})^*$ is a finite abelian group, it is isomorphic to its dual group (this is easy to see for cyclic groups and then follows in general), so there are φ(q) Dirichlet characters modulo q, and they form a group. We call the trivial character $\chi_0$, so $\chi_0(n) = \delta_{(n,q)=1}$.
b) If p is an odd prime, then the Legendre symbol (·/p) is a quadratic (= order 2) Dirichlet character mod p (and the only one).
c) We have the orthogonality relations
\[ \sum_{\chi \ (\mathrm{mod}\ q)} \chi(n) = \begin{cases} \phi(q), & n \equiv 1 \ (\mathrm{mod}\ q),\\ 0, & \text{otherwise}, \end{cases} \qquad \sum_{n \ (\mathrm{mod}\ q)} \chi(n) = \begin{cases} \phi(q), & \chi = \chi_0,\\ 0, & \text{otherwise}. \end{cases} \]
To show for instance the second formula for χ ≠ χ0, pick m with (m, q) = 1 and χ(m) ≠ 1, then
\[ \sum_{n \ (\mathrm{mod}\ q)} \chi(n) = \sum_{n \ (\mathrm{mod}\ q)} \chi(nm) = \chi(m) \sum_{n \ (\mathrm{mod}\ q)} \chi(n). \]
d) Let $r(n) = \#\{(a, b) \in \mathbb{Z}^2 \mid n = a^2 + b^2\} = \#\{\alpha \in \mathbb{Z}[i] \mid N\alpha = n\} = 4\,\#\{\mathfrak{a} \subseteq \mathbb{Z}[i] \mid N\mathfrak{a} = n\}$, since Z[i] is a principal ideal domain and has 4 units {±1, ±i}. The discriminant is −4. The norm is multiplicative; primes p ≡ 1 (mod 4) are split, p = 2 is ramified and primes p ≡ 3 (mod 4) are inert. Hence
\[ \frac{1}{4}\, r(p^e) = \begin{cases} e + 1, & p \equiv 1 \ (\mathrm{mod}\ 4),\\ 1, & p = 2,\\ \frac{1}{2}\big((-1)^e + 1\big), & p \equiv 3 \ (\mathrm{mod}\ 4) \end{cases} \;=\; \sum_{0 \le \alpha \le e} \chi_{-4}(p^\alpha), \]
where $\chi_{-4}$ is the unique non-trivial character modulo 4. We conclude $r = 4\chi_{-4} * 1$ and $\sum_n r(n) n^{-s} = 4\zeta(s) L(s, \chi_{-4})$.
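
The identity $r = 4\chi_{-4} * 1$ from (1.18d) can be verified for small n. The sketch below uses illustrative function names and encodes $\chi_{-4}$ by its values 0, 1, 0, −1 on the residues 0, 1, 2, 3 modulo 4.

```python
from math import isqrt


def chi_minus4(d):
    """The non-trivial character modulo 4: values 0, 1, 0, -1 on d mod 4."""
    return (0, 1, 0, -1)[d % 4]


def r_brute(n):
    """#{(a, b) in Z^2 : a^2 + b^2 = n} by enumeration."""
    m = isqrt(n)
    return sum(1 for a in range(-m, m + 1) for b in range(-m, m + 1)
               if a * a + b * b == n)


def r_formula(n):
    """4 * sum over d | n of chi_{-4}(d)."""
    return 4 * sum(chi_minus4(d) for d in range(1, n + 1) if n % d == 0)


if __name__ == "__main__":
    for n in (1, 2, 3, 5, 25, 65):
        print(n, r_brute(n), r_formula(n))
```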

(1.19) Definition
a) Let χ modulo q be a Dirichlet character. We say that χ has quasiperiod d if χ(n) = χ(m) whenever n ≡ m (mod d) and (nm, q) = 1. The smallest quasiperiod is called the conductor of χ.
b) If q∗ | q, then χ mod q is induced by χ∗ mod q∗ if χ(n) = χ∗(n) for all (n, q) = 1. A character χ mod q is called primitive if it is not induced by a character χ∗ mod q∗ with q∗ < q.

Example: The nontrivial character χ mod 3 with values 1, −1, 0, 1, −1, 0, . . . induces the character 1, 0, 0, 0, −1, 0, 1, 0, 0, 0, −1, 0, . . . mod 6.

Remark: If χ mod q is induced by χ∗ mod q∗, then
\[ L(s, \chi) = \prod_{p \nmid q} \Big(1 - \frac{\chi(p)}{p^s}\Big)^{-1} = \prod_{p \nmid q} \Big(1 - \frac{\chi^*(p)}{p^s}\Big)^{-1} = \prod_{p \mid q} \Big(1 - \frac{\chi^*(p)}{p^s}\Big) L(s, \chi^*). \]

(1.20) Lemma
a) The conductor of a character χ mod q is a divisor of q.
b) Every character χ mod q is induced by a unique character χ∗ mod q∗ where q∗ is the conductor of χ. This character χ∗ is primitive.

Proof. a) Let d be a quasiperiod of χ and let g = (d, q). Let n ≡ m (mod g) and (nm, q) = 1. Then n − m = dx + qy for some x, y ∈ Z. Hence χ(m) = χ(m + qy) = χ(n − dx) = χ(n), since n − dx = m + qy is coprime to q. Hence g is a quasiperiod of χ, and in particular the smallest quasiperiod divides q.
b) Let q∗ be the conductor of χ. For (n, q∗) = 1 define χ∗(n) = χ(n + kq∗) for any k ∈ Z such that (n + kq∗, q) = 1, and χ∗(n) = 0 for (n, q∗) > 1. Such a k exists, for instance
\[ k = \prod_{\substack{p \mid q\\ p \nmid q^* n}} p \]
does the job³. By definition of the conductor, this definition is independent of the choice of k (as long as (n + kq∗, q) = 1), and defines a q∗-periodic function χ∗ supported on (n, q∗) = 1 that inherits complete multiplicativity from χ:
\[ \chi^*(n)\chi^*(m) = \chi\big((n + k_1 q^*)(m + k_2 q^*)\big) = \chi\big(\underbrace{nm + (k_1 m + k_2 n + k_1 k_2 q^*) q^*}_{\text{coprime to } q}\big) = \chi^*(nm) \]
for (nm, q∗) = 1. Hence χ∗ is a character mod q∗, and it is primitive, since if it had a smaller quasiperiod, then χ would have the same quasiperiod, contradicting the minimality of the conductor. Finally χ∗ is unique, since any χ1 mod q∗ inducing χ satisfies χ1(n) = χ(n + kq∗) = χ∗(n) for (n, q∗) = 1 and k as above.

(1.21) Lemma
If a, b ∈ Z and χ is a primitive character modulo q, then
\[ \frac{1}{q} \sum_{c \ (\mathrm{mod}\ q)} \chi(ac + b) = \chi(b)\, \delta_{q \mid a}. \]

Proof. First we observe that we can replace a by (a, q) without changing the sum: indeed, if we write d = (a, q), a′ = a/d, q′ = q/d with (a′, q′) = 1, then
\[ \sum_{c \ (\mathrm{mod}\ q)} \chi(ac + b) = \sum_{c \ (\mathrm{mod}\ q)} \chi(a' d c + b) = d \sum_{c \ (\mathrm{mod}\ q')} \chi(d(a'c) + b) = d \sum_{c \ (\mathrm{mod}\ q')} \chi(dc + b) = \sum_{c \ (\mathrm{mod}\ q)} \chi(dc + b). \]
Hence we can assume wlog that a | q.

The claim of the lemma is obvious if q | a. Assume now q ∤ a, i.e., a is a proper divisor of q, and let S be the sum in question. For any x ∈ Z such that (1 + ax, q) = 1 the map c ↦ c′ := c(1 + ax) + bx is a bijection of Z/qZ. Hence
\[ \chi(1 + ax)\, S = \frac{1}{q} \sum_{c \ (\mathrm{mod}\ q)} \chi\big((ac + b)(1 + ax)\big) = \frac{1}{q} \sum_{c' \ (\mathrm{mod}\ q)} \chi(ac' + b) = S. \]
Assume that χ(1 + ax) = 1 for all x ∈ Z with (1 + ax, q) = 1. Let (n, q) = 1 and fix an integer n̄ such that n̄n ≡ 1 (mod q). Then χ(n + ay) = χ(n + nn̄ay) = χ(n)χ(1 + an̄y) = χ(n) for all y ∈ Z with (1 + an̄y, q) = (n + ay, q) = 1, so that by definition χ has quasiperiod a < q, a contradiction. Hence there exists x with χ(1 + ax) ≠ 1, and so S = 0.

³ To check this, show that every p | q divides exactly one of n and kq∗, and therefore does not divide n + kq∗.

(1.22) Definition
For a character χ modulo q we define the Gauß sum
\[ \tau(\chi) = \sum_{h \ (\mathrm{mod}\ q)} \chi(h)\, e(h/q), \]
where generally $e(x) = e^{2\pi i x}$.

(1.23) Theorem
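Let χ be a character modulo q, a ∈ Z.
a) If (a, q) = 1, then⁴
\[ \chi(a)\tau(\bar\chi) = \sum_{h \ (\mathrm{mod}\ q)} \bar\chi(h)\, e(ha/q). \]
b) If χ is primitive, this holds for all a ∈ Z.
c) If χ is primitive, then $|\tau(\chi)| = \sqrt{q}$.

Proof. a) Change variables h ↦ ha.
b) Let d = (a, q) > 1. Then
\[ \sum_{h \ (\mathrm{mod}\ q)} \bar\chi(h)\, e\Big(\frac{ha}{q}\Big) = \sum_{r \ (\mathrm{mod}\ q/d)} \sum_{s \ (\mathrm{mod}\ d)} \bar\chi(sq/d + r)\, e\Big(\frac{(sq/d + r)a}{q}\Big) \]
\[ = \sum_{r \ (\mathrm{mod}\ q/d)} e\Big(\frac{ra}{q}\Big) \sum_{s \ (\mathrm{mod}\ d)} \bar\chi(sq/d + r) = \sum_{r \ (\mathrm{mod}\ q/d)} e\Big(\frac{ra}{q}\Big) \frac{d}{q} \sum_{s \ (\mathrm{mod}\ q)} \bar\chi(sq/d + r) = 0 \]
by (1.21) since d > 1 and χ̄ is primitive.
c) We have
\[ \phi(q) |\tau(\chi)|^2 = \sum_{n \ (\mathrm{mod}\ q)} |\chi(n)|^2 |\tau(\bar\chi)|^2 = \sum_{n \ (\mathrm{mod}\ q)} \sum_{a, b \ (\mathrm{mod}\ q)} \bar\chi(a)\chi(b)\, e\Big(\frac{n(a - b)}{q}\Big) = q \sum_{a \ (\mathrm{mod}\ q)} |\chi(a)|^2 = q\phi(q). \]
Note in the first step that $\tau(\chi) = \chi(-1)\overline{\tau(\bar\chi)}$, so that |τ(χ)| = |τ(χ̄)|.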
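
Theorem (1.23) can be illustrated with the Legendre symbol, which is a primitive real character modulo an odd prime p, so that χ̄ = χ. The following sketch (illustrative names; the Legendre symbol is computed by Euler's criterion) checks both the twisted-sum identity for all a, including a ≡ 0, and the absolute value of the Gauß sum.

```python
import cmath
from math import sqrt


def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1


def gauss_sum(chi, q):
    """tau(chi) = sum over h mod q of chi(h) e(h/q)."""
    return sum(chi(h) * cmath.exp(2j * cmath.pi * h / q) for h in range(q))


def twisted_sum(chi_bar, q, a):
    """sum over h mod q of chi_bar(h) e(ha/q)."""
    return sum(chi_bar(h) * cmath.exp(2j * cmath.pi * h * a / q) for h in range(q))


if __name__ == "__main__":
    for p in (3, 5, 7, 11, 13):
        chi = lambda h, p=p: legendre(h, p)
        print(p, abs(gauss_sum(chi, p)), sqrt(p))
```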

(1.24) Definition
A Dirichlet character is called even if χ(−1) = 1 and odd if χ(−1) = −1.

⁴ Note that χ̄ is a primitive character if χ is a primitive character, and |τ(χ̄)| = |τ(χ)|.

2 Fourier analysis and Poisson summation
The main result of this section is the Poisson summation formula. We give two
arithmetic applications. More will follow later.

(2.1) Definition
A Schwartz class function on R is a $C^\infty$-function f : R → C such that $f^{(n)}(x) \ll_{n,A} (1 + |x|)^{-A}$ for all n ∈ N0 and all A > 0. We denote the vector space of such functions by S(R).

(2.2) Definition
For $f \in L^1(\mathbb{R})$ we define the Fourier transform $\hat f \in L^\infty(\mathbb{R})$ by
\[ \mathcal{F}(f)(y) = \hat f(y) := \int_{\mathbb{R}} f(x)\, e(-xy)\,dx. \]

(2.3) Lemma
Let $f, g \in L^1(\mathbb{R})$, c ∈ R.
a) The Fourier transform is a linear operator.
b) If h(x) = f(x − c), then $\hat h(y) = e(-cy)\hat f(y)$.
c) If h(x) = f(cx) with c ≠ 0, then $\hat h(y) = |c|^{-1}\hat f(y/c)$.
d) If $h(x) = (f * g)(x) = \int_{\mathbb{R}} f(t) g(x - t)\,dt$, then $h \in L^1(\mathbb{R})$ and $\hat h(y) = \hat f(y)\hat g(y)$.
e) We have $\int_{\mathbb{R}} f(x)\hat g(x)\,dx = \int_{\mathbb{R}} \hat f(x) g(x)\,dx$.
f) The function $f(x) = e^{-\pi x^2}$ is self-dual with respect to the Fourier transform.

Proof. a) - e) Change of variables and Fubini.
f) We have
\[ \int_{\mathbb{R}} e^{-\pi x^2} e(-xy)\,dx = \int_{\mathbb{R}} e^{-\pi(x + iy)^2} e^{-\pi y^2}\,dx = e^{-\pi y^2} \int_{\Im x = y} e^{-\pi x^2}\,dx. \]
In this complex contour integral we shift the line of integration to $\Im x = 0$, getting
\[ \int_{\mathbb{R}} e^{-\pi x^2} e(-xy)\,dx = e^{-\pi y^2} \int_{\mathbb{R}} e^{-\pi x^2}\,dx = e^{-\pi y^2}. \]

(2.4) Theorem
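a) If f ∈ S(R), then $\hat f \in S(\mathbb{R})$. More precisely, if $f^{(r)}(x) \ll (1 + |x|)^{-n-1-\delta}$ for all r ≤ k and some δ > 0, then $\hat f^{(n)}(y) \ll (1 + |y|)^{-k}$.
b) If $f \in C^1(\mathbb{R}) \cap L^1(\mathbb{R})$ and $\hat f \in L^1(\mathbb{R})$, in particular if $f, f', f'' \ll (1 + |x|)^{-2}$ (by part a) with n = 0, k = 2, δ = 1), the Fourier inversion formula holds:
\[ f(x) = \int_{\mathbb{R}} \hat f(y)\, e(xy)\,dy = \hat{\hat f}(-x). \]
c) If f ∈ S(R), Parseval’s identity holds:
\[ \int_{\mathbb{R}} |f(x)|^2\,dx = \int_{\mathbb{R}} |\hat f(y)|^2\,dy. \]

Proof. a) Differentiation under the integral sign and integration by parts give
\[ \hat f^{(n)}(y) \ll \frac{1}{|y|^r} \int_{\mathbb{R}} \big| (x^n f(x))^{(r)} \big|\,dx \ll \frac{1}{|y|^r} \]
for all r ≤ k.
b) By Lebesgue’s dominated convergence theorem we have
\[ \int_{\mathbb{R}} \hat f(y)\, e(xy)\,dy = \lim_{\varepsilon \to 0} \int_{\mathbb{R}} e^{-\pi\varepsilon^2 y^2} \hat f(y)\, e(xy)\,dy = \lim_{\varepsilon \to 0} \int_{\mathbb{R}} \hat f(y) g_\varepsilon(y)\,dy, \]
where $g_\varepsilon(y) = g_\varepsilon(y; x) = e^{-\pi\varepsilon^2 y^2} e(xy)$. By (2.3b, c, f) we compute
\[ \hat g_\varepsilon(t) = \frac{1}{\varepsilon}\, e^{-\pi (x - t)^2/\varepsilon^2}, \]
so that by (2.3e)
\[ \int_{\mathbb{R}} \hat f(y)\, e(xy)\,dy = \lim_{\varepsilon \to 0} \int_{\mathbb{R}} f(y)\, \frac{1}{\varepsilon}\, e^{-\pi (x - y)^2/\varepsilon^2}\,dy \overset{f \in C^1(\mathbb{R})}{=} \lim_{\varepsilon \to 0} \int_{\mathbb{R}} \big( f(x) + O(|y - x|) \big) \frac{1}{\varepsilon}\, e^{-\pi (x - y)^2/\varepsilon^2}\,dy = \lim_{\varepsilon \to 0} \big( f(x) + O(\varepsilon) \big) = f(x). \]
c) Apply (2.3e) with $f(x) = \overline{\hat g(x)} = \hat{\bar g}(-x)$, so that $\hat f(x) = \bar g(x)$ by part b).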

(2.5) Remarks
a) Theorem 2.4b, c holds under less restrictive assumptions on f. A standard condition in b) is $f \in L^1(\mathbb{R}) \cap C(\mathbb{R})$ and $\hat f \in L^1(\mathbb{R})$, and $f \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$ in c).
b) The results in (2.3), (2.4) generalize in an obvious way to functions on $\mathbb{R}^n$.
c) The Fourier transform translates smoothness into decay conditions (and vice versa), and it translates small support into large support. In particular, there is an uncertainty principle: the supports of f and $\hat f$ cannot be small simultaneously.

(2.6) Theorem (Poisson summation)


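Let f ∈ S(R). Then
\[ \sum_{n \in \mathbb{Z}} f(n) = \sum_{n \in \mathbb{Z}} \hat f(n). \]

Proof. Let $F(x) = \sum_{n \in \mathbb{Z}} f(x + n)$ and expand this 1-periodic function into a Fourier series: $F(x) = \sum_{m \in \mathbb{Z}} a(m) e(mx)$, where
\[ a(m) = \int_0^1 F(x)\, e(-mx)\,dx = \int_0^1 \sum_{n \in \mathbb{Z}} f(x + n)\, e(-mx)\,dx = \int_{-\infty}^\infty f(x)\, e(-mx)\,dx = \hat f(m). \]
Now put x = 0.
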
(2.7) Corollary
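Let f ∈ S(R), a, q ∈ N, χ a primitive Dirichlet character modulo q.
a) We have
\[ \sum_{\substack{m \in \mathbb{Z}\\ m \equiv a \ (\mathrm{mod}\ q)}} f(m) = \frac{1}{q} \sum_{m \in \mathbb{Z}} \hat f\Big(\frac{m}{q}\Big) e\Big(\frac{am}{q}\Big). \]
b) We have
\[ \sum_{m \in \mathbb{Z}} f(m)\chi(m) = \frac{\tau(\chi)}{q} \sum_{m \in \mathbb{Z}} \hat f\Big(\frac{m}{q}\Big) \bar\chi(m). \]

Proof. Exercise.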

(2.8) Corollary
The Theta function $\Theta(x) = \sum_{n \in \mathbb{Z}} e^{-\pi n^2 x}$ satisfies the functional equation $\Theta(1/x) = \sqrt{x}\,\Theta(x)$.
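
The functional equation of the Theta function is Poisson summation applied to a dilated Gaussian (by (2.3c, f)), and it is easy to confirm numerically for x > 0. In this sketch the series is truncated after a fixed number of terms, which is an assumption that is harmless for the moderate values of x used here.

```python
from math import exp, pi, sqrt


def theta(x, terms=50):
    """Truncated Theta(x) = sum over n in Z of exp(-pi n^2 x), for x > 0."""
    return 1 + 2 * sum(exp(-pi * n * n * x) for n in range(1, terms + 1))


if __name__ == "__main__":
    for x in (0.1, 0.5, 1.0, 2.0, 7.3):
        print(x, theta(1 / x), sqrt(x) * theta(x))
```

For small x the left side converges quickly while the right side needs many terms, which is the practical content of the identity.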

(2.9) Lemma
Let X and 0 < Z < Y be three real numbers. There exists a smooth, non-
negative function f satisfying the following properties:
a) the support of f is contained in [X − Z, X + Y + Z]
b) f = 1 on [X, X + Y ];
c) $\|f^{(j)}\|_1 \ll_j Z^{1-j}$ for all j ∈ N.

Proof. Exercise.

(2.10) Theorem (Pólya-Vinogradov)


Let χ be a non-principal Dirichlet character modulo q, M ∈ Z, N ≥ 2. Then
\[ \sum_{M < n \le M + N} \chi(n) \ll q^{1/2} \log q. \]

Proof. Let us first assume that χ is primitive (and q ≥ 2). We would like to apply Poisson summation. To this end we need to smooth out the characteristic function of the interval (M, M + N]. Let w be a smooth bounded function with support on [M − 1, M + N + 1] such that w = 1 on (M, M + N] and $\|w^{(j)}\|_1 \ll_j 1$ for all j ∈ N. Then we have by (2.7b) that
\[ \sum_{M < n \le M + N} \chi(n) = \sum_{n \in \mathbb{Z}} \chi(n) w(n) + O(1) = \frac{\tau(\chi)}{q} \sum_{n \in \mathbb{Z}} \bar\chi(n) \int_{\mathbb{R}} w(x)\, e\Big(-\frac{nx}{q}\Big)\,dx + O(1). \]
Note that, since q ≥ 2, χ(0) = 0. Integration by parts shows
\[ \int_{\mathbb{R}} w(x)\, e\Big(-\frac{nx}{q}\Big)\,dx \ll_j \Big(\frac{q}{|n|}\Big)^j \]
for all j ∈ N. Hence
\[ \sum_{M < n \le M + N} \chi(n) \ll 1 + \frac{1}{\sqrt{q}} \Big( \sum_{n \le q} \frac{q}{n} + \sum_{n > q} \frac{q^2}{n^2} \Big) \ll q^{1/2} \log q. \]

If χ is induced by χ1 modulo q1 with q = q1 r, then
\[ \sum_{M < n \le M + N} \chi(n) = \sum_{\substack{M < n \le M + N\\ (n, r) = 1}} \chi_1(n) = \sum_{M < n \le M + N} \chi_1(n) \sum_{d \mid (n, r)} \mu(d) \]
\[ = \sum_{d \mid r} \mu(d) \sum_{\substack{M < n \le M + N\\ d \mid n}} \chi_1(n) = \sum_{d \mid r} \mu(d)\chi_1(d) \sum_{M/d < n \le (M + N)/d} \chi_1(n). \]
By what we have proved, this is
\[ \ll \tau(r)\, q_1^{1/2} \log q_1 \ll q^{1/2} \log q. \]

Remark. This is nontrivial for $N \gg q^{1/2} \log q$.

Application: let q be a prime. Then any interval of length ≥ cq 1/2 log q with c
sufficiently large contains a quadratic residue and a quadratic non-residue.
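
For the Legendre symbol modulo a prime p one can look at the largest incomplete character sum directly. Since the sum over a full period vanishes, every interval sum is a difference of two partial sums S(0), ..., S(p − 1), so the maximum over all intervals equals max S − min S. The sketch below uses illustrative names, and the comparison with √p log p (implied constant 1) is only an experiment for these primes, not a statement from the notes.

```python
from math import log, sqrt


def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1


def max_interval_sum(p):
    """max over all intervals (M, M + N] of |sum of (n/p)|."""
    prefix, running = [0], 0
    for n in range(1, p):
        running += legendre(n, p)
        prefix.append(running)
    return max(prefix) - min(prefix)


if __name__ == "__main__":
    for p in (101, 1009):
        print(p, max_interval_sum(p), round(sqrt(p) * log(p), 1))
```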

(2.11) Lemma
For k ∈ Z, x ∈ R define the Bessel function
\[ J_k(x) = \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{-ik\phi + ix \sin\phi}\,d\phi. \]
It satisfies the following properties:
a) For θ ∈ [0, 2π], x ∈ R we have
\[ e^{ix \sin\theta} = \sum_{k \in \mathbb{Z}} J_k(x) e^{ik\theta}. \]
b) We have $2J_k'(x) = J_{k-1}(x) - J_{k+1}(x)$ and $\frac{2k}{x} J_k(x) = J_{k+1}(x) + J_{k-1}(x)$.
c) We have $(x^k J_k(x))' = x^k J_{k-1}(x)$.
d) If F : (0, ∞) → C is a smooth compactly supported function and α > 0, then
\[ \int_0^\infty F(x) J_k(\alpha\sqrt{x})\,dx = \Big(-\frac{2}{\alpha}\Big)^j \int_0^\infty \frac{d^j}{dx^j}\big( F(x) x^{-k/2} \big)\, x^{(k+j)/2} J_{k+j}(\alpha\sqrt{x})\,dx \]
for all j ∈ N0.
e) We have $J_k(x) \ll_k \min(1, |x|^{-1/2})$.

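Proof. a) Follows directly from the theory of Fourier series.
b) Differentiate the generating series with respect to x:
\[ \frac{e^{i\theta} - e^{-i\theta}}{2}\, e^{ix \sin\theta} = \sum_{k \in \mathbb{Z}} J_k'(x) e^{ik\theta} \]
and θ:
\[ x\, \frac{e^{i\theta} + e^{-i\theta}}{2}\, e^{ix \sin\theta} = \sum_{k \in \mathbb{Z}} k J_k(x) e^{ik\theta} \]
and compare coefficients.
c) Add the two equations in part b) and multiply by $x^k/2$, getting $x^k J_k'(x) + k x^{k-1} J_k(x) = x^k J_{k-1}(x)$.
d) Integrate by parts j times using the formula in c) in the form
\[ \frac{d}{dx}\Big( x^{r/2} J_r(\alpha\sqrt{x}) \Big) = \alpha^{-r} \frac{d}{dx}\Big( (\alpha\sqrt{x})^r J_r(\alpha\sqrt{x}) \Big) = \alpha^{-r}\, \frac{\alpha}{2\sqrt{x}}\, (\alpha\sqrt{x})^r J_{r-1}(\alpha\sqrt{x}) = \frac{\alpha}{2}\, x^{(r-1)/2} J_{r-1}(\alpha\sqrt{x}). \]
e) The bound $|J_k(x)| \le 1$ is obvious. To show the second bound, let 0 < α < 1/10 be a parameter. Let us assume that |x|α > 100k. Then integration by parts shows
\[ \int_\alpha^{\pi/2 - \alpha} e^{-ik\phi + ix \sin\phi}\,d\phi = \int_\alpha^{\pi/2 - \alpha} \frac{(-ik + ix \cos\phi)\, e^{-ik\phi + ix \sin\phi}}{-ik + ix \cos\phi}\,d\phi \]
\[ = \Big[ \frac{e^{-ik\phi + ix \sin\phi}}{-ik + ix \cos\phi} \Big]_{\phi = \alpha}^{\pi/2 - \alpha} - \int_\alpha^{\pi/2 - \alpha} \frac{d}{d\phi}\Big( \frac{1}{-ik + ix \cos\phi} \Big) e^{-ik\phi + ix \sin\phi}\,d\phi \]
\[ \ll \frac{1}{|x|\alpha} + \int_\alpha^{\pi/2 - \alpha} \Big| \frac{d}{d\phi}\Big( \frac{1}{-ik + ix \cos\phi} \Big) \Big|\,d\phi \ll \frac{1}{|x|\alpha}, \]
since the integrand has no sign change in the range of integration. Hence
\[ \int_0^{\pi/2} e^{-ik\phi + ix \sin\phi}\,d\phi \ll \alpha + (\alpha|x|)^{-1} \]
for any 0 < α < 1/10 such that |x|α > 100k. We can estimate the other regions of the integral in the same way. Choosing $\alpha \asymp 1/\sqrt{|x|}$ and assuming wlog that x is sufficiently large (i.e. $x \gg k^2$), we obtain the desired bound.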

(2.12) Theorem (Hardy, Sierpiński)


The number of lattice points in a circle about the origin of radius $R^{1/2}$ is $\pi R + O_\varepsilon(R^{1/3+\varepsilon})$ for every ε > 0.

Remark: This improves what one gets from the elementary Lipschitz principle: $\pi R + O(R^{1/2})$.

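Proof. Let $1 \le T \le R^{1/2}$ be another parameter and let w be a smooth bounded function with support on [0, R + T] such that w = 1 on [0, R] and $\|w^{(j)}\|_1 \ll_j T^{1-j}$ for all j ∈ N. Let r(n) be as in (1.18d). We have
\[ r(n) = 4 \sum_{d \mid n} \chi_{-4}(d) \ll \tau(n) \ll n^\varepsilon. \]
Hence
\[ \sum_{n \le R} r(n) = \sum_{n} r(n) w(n) + O(T R^\varepsilon) = \sum_{x, y \in \mathbb{Z}} w(x^2 + y^2) + O(T R^\varepsilon) = \sum_{n, m \in \mathbb{Z}} \int_{\mathbb{R}^2} w(x^2 + y^2)\, e(-nx - my)\,dx\,dy + O(T R^\varepsilon). \]
The term with n = m = 0 is
\[ \int_{\mathbb{R}^2} w(x^2 + y^2)\,dx\,dy = 2\pi \int_0^\infty w(r^2)\, r\,dr = \pi \int_0^{T + R} w(r)\,dr = \pi R + O(T). \]
In the other terms we make an orthogonal change of variables $\binom{x}{y} \mapsto S \binom{x}{y}$ with S ∈ SO(2) such that $(n, m) S^T = (0, -\sqrt{n^2 + m^2})$. Then
\[ \sum_{(n,m) \ne 0} \int_{\mathbb{R}^2} w(x^2 + y^2)\, e\Big( -(n, m) \binom{x}{y} \Big)\,dx\,dy = \sum_{(n,m) \ne 0} \int_0^\infty w(r^2) \int_0^{2\pi} e\big( \sqrt{n^2 + m^2}\, r \sin\phi \big)\,d\phi\; r\,dr = \pi \sum_{\ell \ge 1} r(\ell) \int_0^\infty w(r) J_0(2\pi\sqrt{r\ell})\,dr. \]
By Lemma 2.11d) and e) with k = 0 we obtain
\[ \int_0^\infty w(r) J_0(2\pi\sqrt{r\ell})\,dr \ll_j \ell^{-j/2} \int_R^{R+T} T^{-j} r^{j/2} \min\big(1, (r\ell)^{-1/4}\big)\,dr \ll \ell^{-j/2}\, T^{1-j} R^{j/2} (R\ell)^{-1/4} \]
for any j ∈ N. Hence (taking j = 1 for small ℓ and j = 2 for large ℓ) the sum over ℓ is at most
\[ \sum_{\ell \le R/T^2} \frac{r(\ell)}{\ell^{3/4}}\, R^{1/4} + \sum_{\ell \ge R/T^2} \frac{r(\ell)}{\ell^{5/4}}\, \frac{R^{3/4}}{T} \ll \frac{R^{1/2+\varepsilon}}{T^{1/2}}. \]
Choosing $T = R^{1/3}$ gives the result.

Remark: The proof has shown the summation formula
\[ \sum_n r(n) w(n) = \pi \int_0^\infty w(x)\,dx + \sum_n r(n) \check{w}(n) \]
with
\[ \check{w}(n) = \pi \int_0^\infty w(x) J_0(2\pi\sqrt{xn})\,dx. \]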
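
The lattice point count in (2.12) is cheap to compute exactly, which makes it easy to compare the actual error with the exponents 1/2 and 1/3. A minimal sketch with an illustrative function name, counting column by column:

```python
from math import isqrt, pi


def lattice_points(R):
    """#{(x, y) in Z^2 : x^2 + y^2 <= R} for an integer R >= 0."""
    m = isqrt(R)
    # For each x there are 2 * floor(sqrt(R - x^2)) + 1 admissible y.
    return sum(2 * isqrt(R - x * x) + 1 for x in range(-m, m + 1))


if __name__ == "__main__":
    for R in (10 ** 2, 10 ** 4, 10 ** 6, 10 ** 8):
        error = lattice_points(R) - pi * R
        print(R, round(error, 2), round(R ** 0.5, 1), round(R ** (1 / 3), 1))
```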

3 The functional equation
In this section we use Poisson summation to derive an integral representation of the Riemann zeta-function that (a) continues the zeta-function to the entire plane and (b) proves the functional equation. We introduce a general class of L-functions sharing similar properties with the Riemann zeta-function. For such L-functions we show the “approximate functional equation” as a tool for understanding L-functions in the critical strip. Applications include the convexity bound and bounds for moments.

(3.1) Definition
For $\Re z > 0$ we define the Gamma function by
\[ \Gamma(z) = \int_0^\infty e^{-t} t^z \frac{dt}{t}. \]

(3.2) Theorem
The Gamma function can be continued meromorphically to C with simple poles at −n, n ∈ N0, with residue $(-1)^n/n!$. It satisfies zΓ(z) = Γ(z + 1) and in particular Γ(n) = (n − 1)! for n ∈ N. It is rapidly decaying on vertical lines, i.e. $\Gamma(x + iy) \ll_{x,A} (1 + |y|)^{-A}$ away from poles, i.e. if $\min_{n \in \mathbb{N}_0} |x + iy + n| \ge 1/100$.

Proof. The formula zΓ(z) = Γ(z + 1) follows from partial integration in $\Re z > 0$. This gives inductively meromorphic continuation (and hence the recurrence relation everywhere) and the location of poles. The decay property follows from repeated integration by parts.

(3.3) Theorem
The Gamma function satisfies the following properties:
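a) Γ is zero-free.
b) $\Gamma(1/2) = \sqrt{\pi}$.
c) $\Gamma(z)\Gamma(1 - z) = \pi/\sin(\pi z)$.
d) $\Gamma(z)\Gamma(z + 1/2) = \sqrt{\pi}\, 2^{1-2z}\, \Gamma(2z)$.
e) $\Gamma(\bar z) = \overline{\Gamma(z)}$.
f) Stirling’s formula:
\[ \log\Gamma(z) = \Big(z - \frac{1}{2}\Big) \log z - z + \frac{1}{2} \log(2\pi) + O_\delta(|z|^{-1}), \]
or
\[ \Gamma(z) = \Big(\frac{2\pi}{z}\Big)^{1/2} \Big(\frac{z}{e}\Big)^z \Big( 1 + O_\delta\Big(\frac{1}{|z|}\Big) \Big), \]
or
\[ \frac{\Gamma'}{\Gamma}(z) = \log z + O_\delta(|z|^{-1}), \]
whenever |arg(z)| < π − δ for some δ > 0.

Proof. Known from complex analysis. Part e) is obvious, part b) follows from c).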
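
Stirling's formula from (3.3f) can be checked on the positive real axis with the standard library. The sketch below (illustrative function name) compares `math.lgamma` with the main terms; for real z > 0 the difference is known to lie between 0 and 1/(12z), consistent with the $O(|z|^{-1})$ error term.

```python
from math import lgamma, log, pi


def stirling_log_gamma(z):
    """Main terms of Stirling's formula: (z - 1/2) log z - z + (1/2) log(2 pi)."""
    return (z - 0.5) * log(z) - z + 0.5 * log(2 * pi)


if __name__ == "__main__":
    for z in (5.0, 10.0, 100.0, 1000.0):
        diff = lgamma(z) - stirling_log_gamma(z)
        print(z, diff, 1 / (12 * z))
```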

(3.4) Corollary
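a) For x, y ∈ R we have
\[ |\Gamma(x + iy)| \ll_{x,\delta} e^{-\frac{\pi}{2}|y|} (1 + |y|)^{x - \frac{1}{2}}, \qquad |x + iy + n| \ge \delta \text{ for all } n \in \mathbb{N}_0. \]
b) Suppose that $\Re z \ge -A$, $\Re(z + w) \ge \delta$, $|\Re w| \le A$ for some A, δ > 0. Then there exists a constant c > 0 (depending on A and δ) such that
\[ \frac{\Gamma(z + w)}{\Gamma(z)} \ll_{A,\delta} (1 + |z|)^{\Re(w)} \exp(c|w|). \]

Proof. a) It suffices to assume that |y| > 1. Then by Stirling’s formula we have
\[ |\Gamma(x + iy)| \ll_x \frac{1}{|y|^{1/2}} \exp\Big( \Re\Big( (x + iy)\big( \log\sqrt{x^2 + y^2} + i \arg(x + iy) \big) \Big) \Big) \ll_x \frac{1}{|y|^{1/2}}\, e^{x(\log|y| + O_x(1/y^2)) - |y|(\frac{\pi}{2} + O_x(1/|y|))}. \]
b) Let us first assume that $\Re z \ge 1/2$. Then we can apply Stirling’s formula to Γ(z + w) and Γ(z). We have
\[ |\log(z + w) - \log z| = \Big| \int_z^{z+w} \frac{dt}{t} \Big| \le |w| \Big( \frac{1}{\Re(z + w)} + \frac{1}{\Re z} \Big). \]
Therefore
\[ \Big| \frac{\Gamma(z + w)}{\Gamma(z)} \Big| = \exp\{ \Re(\log\Gamma(z + w) - \log\Gamma(z)) \} \]
\[ \ll_\delta \exp\Big\{ \Re\Big( (z + w - 1/2)\Big( \log z + O\Big( |w| \Big( \frac{1}{\Re(z + w)} + \frac{1}{\Re z} \Big) \Big) \Big) - w - (z - 1/2)\log z \Big) \Big\} \]
\[ \ll \exp\{ \Re(w \log z) + O_{A,\delta}(|w|) \} = \exp\{ \Re(w)\log|z| + O_{A,\delta}(|w|) \} = |z|^{\Re w} \exp\{ O_{A,\delta}(|w|) \}. \]
If $\Re z > -A$, |z| ≥ δ, we conclude from what we have already shown that
\[ \frac{\Gamma(z + w)}{\Gamma(z)} = \frac{z(z + 1) \cdots (z + [A] + 1)\, \Gamma(z + w)}{\Gamma(z + [A] + 2)} \ll_{\delta,A} |z|^{[A]+2} \cdot |z|^{\Re w - [A] - 2} \exp(O_{A,\delta}(|w|)). \]
Finally if |z| ≤ δ, there is nothing to show.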

(3.5) Definition
An entire function f : C → C is called of order β if $f(z) \ll \exp(|z|^{\beta+\varepsilon})$ for all ε > 0. By slight abuse of notation we call a meromorphic function f with finitely many poles at $z = z_j$ of order β if $f(z) \prod_j (z - z_j)$ is of order β.

(3.6) Main Theorem (Riemann)
Let $Z(s) := \Gamma(s/2)\pi^{-s/2}\zeta(s)$ be the completed zeta function. It can be continued
to a meromorphic function of order 1 with simple poles at $s=0$ and
$s=1$ and functional equation $Z(s)=Z(1-s)$. Its zeros are contained in the
region $0\le\Re s\le1$.
The zeta function $\zeta(s)$ can be continued meromorphically to all of $\mathbb{C}$; it is holomorphic
except for a simple pole at $s=1$ with residue 1. In $\Re s<0$ it has zeros
precisely at $-2,-4,-6,\ldots$ which are all simple.

Remarks. The factor $\Gamma(s/2)\pi^{-s/2}$ is sometimes called the local Euler factor at
$\infty$. In $\Re s>1$ we understand the zeta function from its series/product representation.
This is translated into the region $\Re s<0$ by the functional equation.
The hard part will be to understand the zeta function in the “critical strip”
$0<\Re s<1$. The Riemann hypothesis states that all non-trivial zeros $\rho$ of the
Riemann zeta function satisfy $\Re\rho=1/2$.

Proof. We have
\[ Z(s)=\Gamma(s/2)\pi^{-s/2}\zeta(s)=\pi^{-s/2}\sum_{n=1}^\infty\frac{1}{n^s}\int_0^\infty e^{-\pi n^2y}(\pi n^2y)^{\frac s2-1}\pi n^2\,dy \]
in $\Re s>1$. Let
\[ \omega(x)=\sum_{n=1}^\infty e^{-\pi n^2x}=(\Theta(x)-1)/2. \]

It follows from (2.8) that
\[ \omega\Big(\frac1x\Big)=\sqrt{x}\,\omega(x)+\frac{\sqrt{x}-1}{2}. \]

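We obtain
\[ Z(s)=\int_0^\infty\omega(x)x^{\frac s2}\frac{dx}{x}=\int_1^\infty\omega(x)x^{\frac s2}\frac{dx}{x}+\int_1^\infty\omega\Big(\frac1x\Big)x^{-\frac s2}\frac{dx}{x} \]
\[ =-\frac1s+\frac{1}{s-1}+\int_1^\infty\omega(x)\big(x^{\frac s2}+x^{\frac{1-s}{2}}\big)\frac{dx}{x}. \]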

As $\omega$ is rapidly decaying, the integral is absolutely and locally uniformly
convergent for all $s\in\mathbb{C}$ and hence defines a holomorphic function. The functional
equation is now obvious. The only poles of $Z$ are at 0 and 1 and they are simple.
There are no zeros of $Z(s)$ in $\Re s>1$ by the product expansion (and the fact
that the Gamma function has no zeros) and hence no zeros in $\Re s<0$ by the
functional equation. Stirling’s formula implies $\Gamma(s/2)\ll e^{|(s/2)\log s|}$ in $\Re s\ge1/2$,
and $(s-1)\zeta(s)\ll|s|^2$ follows from (1.16) in $\Re s\ge1/2$. Hence $Z(s)\ll e^{|s\log s|}$
in $\Re s\ge1/2$, and by the functional equation we conclude that $Z$ is of order 1.
It follows that the only pole of $\zeta$ is at $s=1$ with residue $(\Gamma(1/2)\pi^{-1/2})^{-1}=1$.

The statement about the zeros of $\zeta$ in $\Re s<0$ follows from
\[ \zeta(s)=\frac{\zeta(1-s)\,\Gamma((1-s)/2)\,\pi^{-1/2+s}}{\Gamma(s/2)} \]
and the fact that $\Gamma$ and $\zeta$ have no zeros in $\Re s>1$.
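
The transformation law $\omega(1/x)=\sqrt{x}\,\omega(x)+(\sqrt{x}-1)/2$ that drives this proof is easy to test numerically. The sketch below is an illustration only (the function names and the truncation length are assumptions, not part of the notes); it sums the rapidly convergent series for $\omega$ and reports the difference of the two sides.

```python
import math

def omega(x: float, terms: int = 200) -> float:
    """omega(x) = sum_{n>=1} exp(-pi n^2 x), truncated after `terms` terms."""
    return sum(math.exp(-math.pi * n * n * x) for n in range(1, terms + 1))

def transformation_defect(x: float) -> float:
    """omega(1/x) - (sqrt(x)*omega(x) + (sqrt(x)-1)/2); zero up to rounding."""
    root = math.sqrt(x)
    return omega(1.0 / x) - (root * omega(x) + (root - 1.0) / 2.0)

if __name__ == "__main__":
    for x in (0.05, 0.5, 2.0, 7.0):
        print(x, transformation_defect(x))
```

For small $x$ the left side is tiny while the two terms on the right nearly cancel, which is exactly the cancellation that produces the poles $-1/s+1/(s-1)$ in the proof.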

(3.7) Theorem
Let $\chi$ be a primitive character modulo $q>1$. Let $\kappa=0$ if $\chi$ is even and $\kappa=1$
if $\chi$ is odd. Then the completed L-function
\[ \Lambda(s,\chi)=\Big(\frac q\pi\Big)^{s/2}\Gamma\Big(\frac{s+\kappa}{2}\Big)L(s,\chi) \]
is entire of order 1 and satisfies the functional equation $\Lambda(s,\chi)=\epsilon(\chi)\Lambda(1-s,\bar\chi)$
where $\epsilon(\chi)=i^{-\kappa}\tau(\chi)q^{-1/2}$. In particular, in $\Re s<0$ we have $L(s,\chi)=0$ if and
only if $s\in\{-2,-4,\ldots\}$ ($\chi$ even) or $s\in\{-1,-3,\ldots\}$ ($\chi$ odd).

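Proof. We copy the proof of (3.6) with some modifications. In $\Re s>1$ we have
\[ \Lambda(s,\chi)=\Big(\frac q\pi\Big)^{s/2}\Gamma\Big(\frac{s+\kappa}{2}\Big)L(s,\chi) \]
\[ =\Big(\frac q\pi\Big)^{s/2}\sum_{n=1}^\infty\frac{\chi(n)}{n^s}\int_0^\infty e^{-\pi n^2y/q}(\pi n^2y/q)^{\frac{s+\kappa}{2}-1}\,\pi n^2/q\,dy \]
\[ =\Big(\frac\pi q\Big)^{\kappa/2}\int_0^\infty\sum_{n=1}^\infty\chi(n)n^\kappa y^{\frac{s+\kappa}{2}-1}e^{-\pi n^2y/q}\,dy. \]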

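We define
\[ \Theta(x,\chi):=\sum_{n\in\mathbb{Z}}\chi(n)n^\kappa e^{-\pi xn^2/q}. \]
If $f(t)=t^\kappa e^{-\pi t^2x/q}$, then one checks that
\[ \widehat f\Big(\frac tq\Big)=i^{-\kappa}\Big(\frac tx\Big)^\kappa\Big(\frac qx\Big)^{1/2}e^{-\pi t^2/(xq)} \]
for $\kappa\in\{0,1\}$. Hence by (2.7b) we have the functional equation
\[ \Theta(x,\chi)=\frac{i^{-\kappa}\tau(\chi)}{x^{\kappa+1/2}\sqrt q}\,\Theta(1/x,\bar\chi). \]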
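Since $\chi(n)n^\kappa$ is even and $\chi(0)=0$, we obtain
\[ 2\Lambda(s,\chi)=\Big(\frac\pi q\Big)^{\kappa/2}\int_0^\infty\Theta(y,\chi)y^{\frac{s+\kappa}{2}}\frac{dy}{y} \]
\[ =\Big(\frac\pi q\Big)^{\kappa/2}\Big(\int_1^\infty\Theta(y,\chi)y^{\frac{s+\kappa}{2}}\frac{dy}{y}+\int_1^\infty\Theta(1/y,\chi)y^{-\frac{s+\kappa}{2}}\frac{dy}{y}\Big) \]
\[ =\Big(\frac\pi q\Big)^{\kappa/2}\Big(\int_1^\infty\Theta(y,\chi)y^{\frac{s+\kappa}{2}}\frac{dy}{y}+\frac{i^{-\kappa}\tau(\chi)}{\sqrt q}\int_1^\infty\Theta(y,\bar\chi)y^{\frac{1-s+\kappa}{2}}\frac{dy}{y}\Big). \]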
The integrals on the right hand side define entire functions. Replacing s by 1 − s
and χ by χ̄ shows the functional equation. The statements about the order of
Λ(s, χ) and the zeros of L(s, χ) follow as in (3.6).

(3.8) Definition

An L-function of degree $d\in\mathbb{N}$ in the class $S$ is a Dirichlet series $L(s)=\sum_n a_nn^{-s}$
with the following properties:
a) it is absolutely convergent in $\Re s>1$ and can be expanded in an Euler
product
\[ L(s)=\prod_p\prod_{j=1}^d\Big(1-\frac{\alpha_j(p)}{p^s}\Big)^{-1} \]
for certain $\alpha_j(p)\in\mathbb{C}$ satisfying $|\alpha_j(p)|<p$;
b) there exist
• an integer $N\ge1$ (conductor) satisfying $\alpha_j(p)\ne0$ for $p\nmid N$,
• a complex number $\eta\in S^1$ (root number)
• and $d$ complex numbers $\kappa_j$ with $\Re\kappa_j\ge-1/2$ that are either real or come
in pairs of conjugate complex numbers
such that
\[ \Lambda(s):=N^{s/2}\prod_{j=1}^d\pi^{-s/2}\Gamma\Big(\frac{s+\kappa_j}{2}\Big)L(s) \]
can be continued meromorphically to a function of order 1 with poles at most
at 0 and 1 and satisfies the functional equation $\Lambda(s)=\eta\,\overline{\Lambda(1-\bar s)}$.

We write
\[ L_\infty(s)=\prod_{j=1}^d\pi^{-s/2}\Gamma\Big(\frac{s+\kappa_j}{2}\Big) \]
and
\[ \bar\Lambda(s):=\overline{\Lambda(\bar s)}=N^{s/2}\prod_{j=1}^d\pi^{-s/2}\Gamma\Big(\frac{s+\kappa_j}{2}\Big)\sum_n\bar a_nn^{-s}. \]
We write
\[ C_0(s):=\prod_{j=1}^d(|s+\kappa_j|+2),\qquad C(s):=N\prod_{j=1}^d(|s+\kappa_j|+2), \]
and call $C(s)$ the analytic conductor of the L-function. It is a measure of its
complexity at the point $s$.

Remarks. Clearly the $\zeta$-function and Dirichlet L-functions are in $S$ with $d=1$.
For a Dirichlet L-function attached to a primitive character of conductor $q$, the
analytic conductor is $\asymp q(|s|+1)$.
By condition a) the coefficients $a_n$ are multiplicative. One can show that the
sequence $a_{p^k}$ satisfies a $d$-term recurrence (exercise).
Since the L-function is absolutely convergent in $\Re s>1$, we have by (1.11) that
$\sum_{n\le x}|a_n|\ll x^{1+\varepsilon}$.
The product of two L-functions of degrees $d_1$ and $d_2$ is an L-function of degree $d_1+d_2$.
An L-function is called primitive if it doesn’t factor into L-functions of lower
degree. Currently we only know primitive L-functions of degree 1. (One can
show that Dirichlet L-functions and the Riemann zeta-function are the only
L-functions of degree 1). We will get to know primitive L-functions of degree 2
much later.
All our subsequent estimates will be uniform in N , αj (p) and κj .
Notation: We write s = σ + it.

(3.9) Lemma
We have
\[ \frac{L_\infty(1-s)}{L_\infty(s)}N^{1/2-s}\ll_{A,\delta}C(s)^{1/2-\sigma},\qquad|\sigma|\le A,\ \Re(1-s+\kappa_j)\ge\delta \]
and
\[ \frac{L_\infty(s+u)}{L_\infty(s)}\ll_{A,\delta}e^{\frac c2d|u|}C_0(s)^{\Re u/2},\qquad\Re s\ge-A,\ |\Re u|\le A,\ \Re(s+u+\kappa_j)\ge\delta \]
where $c$ is the constant from (3.4b).

Proof. To prove the first statement, apply (3.3e) and (3.4b) with
\[ z+w=\frac{1-\bar s+\kappa_j}{2},\qquad z=\frac{s+\kappa_j}{2},\qquad w=\frac12-\Re s \]
and recall that the $\kappa_j$ are either real or come in complex conjugate pairs. The
second part follows similarly.

We would like to investigate L-functions in the “critical strip” $0<\Re s<1$. We
cannot use the series representation, but a certain smoothly truncated version
does the job.

(3.10) Theorem (Approximate Functional Equation)


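Let $L(s)$ be an L-function as in (3.8). Let $G(u)$ be any even function which is
holomorphic and bounded in $|\Re u|<4$ and normalized by $G(0)=1$. Let $X>0$.
Then for $0<\sigma<1$ we have
\[ L(s)=\sum_n\frac{a_n}{n^s}V_s\Big(\frac{n}{X\sqrt N}\Big)+\epsilon(s)\sum_n\frac{\bar a_n}{n^{1-s}}V_{1-s}\Big(\frac{nX}{\sqrt N}\Big)-R \]
where⁵
\[ V_s(y)=\frac{1}{2\pi i}\int_{(3)}y^{-u}G(u)\frac{L_\infty(s+u)}{L_\infty(s)}\frac{du}{u}, \]
\[ \epsilon(s)=\eta N^{1/2-s}\frac{L_\infty(1-s)}{L_\infty(s)}, \]
\[ R=\Big(\operatorname*{res}_{u=1-s}+\operatorname*{res}_{u=-s}\Big)\frac{\Lambda(s+u)}{N^{s/2}L_\infty(s)}\frac{G(u)}{u}X^u. \]
In particular, $R=0$ if $\Lambda$ is entire and $|\epsilon(s)|=1$ for $\Re s=1/2$.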

Remarks: The function $V_s$ is smooth and rapidly decaying as $y\to\infty$. This
can be seen by differentiating under the integral sign and shifting the contour
to the far right (at least if $G(u)$ is holomorphic for sufficiently large real part of
$u$). We will estimate it more precisely in (3.11) for a suitable test function $G$.
We will almost always choose $X=1$ in order to balance the two terms.

Proof. Let
\[ I(s)=\frac{1}{2\pi i}\int_{(3)}X^u\Lambda(s+u)G(u)\frac{du}{u}. \]
The integral is absolutely convergent by the rapid decay of $L_\infty$. We move the
line of integration to $\Re u=-3$, picking up possible poles of $\Lambda$ at $u\in\{1-s,-s\}$
and a pole at $u=0$. This gives
\[ I(s)=\frac{1}{2\pi i}\int_{(-3)}X^u\Lambda(s+u)G(u)\frac{du}{u}+\Lambda(s)+RN^{s/2}L_\infty(s). \]
In the first integral we apply the functional equation and change variables $u\mapsto-u$
getting
\[ I(s)=-\frac{\eta}{2\pi i}\int_{(3)}X^{-u}\bar\Lambda(1-s+u)G(u)\frac{du}{u}+\Lambda(s)+RN^{s/2}L_\infty(s). \]
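Now we open both integrals:
\[ I(s)=\frac{1}{2\pi i}\int_{(3)}X^uN^{(s+u)/2}L_\infty(s+u)\sum_n\frac{a_n}{n^{s+u}}G(u)\frac{du}{u}=N^{s/2}L_\infty(s)\sum_n\frac{a_n}{n^s}V_s\Big(\frac{n}{X\sqrt N}\Big) \]
and
\[ \frac{\eta}{2\pi i}\int_{(3)}X^{-u}\bar\Lambda(1-s+u)G(u)\frac{du}{u}=\eta N^{(1-s)/2}L_\infty(1-s)\sum_n\frac{\bar a_n}{n^{1-s}}V_{1-s}\Big(\frac{nX}{\sqrt N}\Big) \]
\[ =\epsilon(s)N^{s/2}L_\infty(s)\sum_n\frac{\bar a_n}{n^{1-s}}V_{1-s}\Big(\frac{nX}{\sqrt N}\Big), \]
and the result follows after dividing by $N^{s/2}L_\infty(s)$.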


5 As in Example 3 in the introduction we write (c) for the path (c − i∞, c + i∞).

(3.11) Lemma
Let $0\le\sigma\le1$ and suppose that $\sigma+\Re\kappa_j>0$ for all $j$. There is a choice of $G$
such that
\[ y^aV_s^{(a)}(y)\ll_{a,A}\Big(1+\frac{y}{C_0(s)^{1/2}}\Big)^{-A} \]
for all $a\in\mathbb{N}_0$, $A\ge1$.

Remark: In other words, $V_s$ is bounded, and all its derivatives are rapidly
decaying once $y$ is bigger than $C_0(s)^{1/2}$. Hence the “effective length” of the
approximate functional equation is $C(s)^{1/2}$.
Proof. We choose $G(u)=e^{u^2}$. Then $G$ is even, holomorphic and satisfies
$G(u)\ll_{\Re u}e^{-|u|^2}$. Now
\[ y^aV_s^{(a)}(y)=\frac{1}{2\pi i}\int_{(3)}y^{-u}G(u)(-u)\cdots(-u-a+1)\frac{L_\infty(s+u)}{L_\infty(s)}\frac{du}{u}. \]
We use (3.9b) to estimate the $\Gamma$-quotient. Moving the line of integration to
$\Re u=-\delta$ for some sufficiently small $\delta$, we pick up only the pole at $u=0$
with residue 1 if $a=0$, and no pole if $a>0$, and the remaining integral
is $\ll_ay^\delta C_0(s)^{-\delta/2}$. Moving the line of integration to $\Re u=A$, we bound the
integral by $\ll_{A,a}y^{-A}C_0(s)^{A/2}$. Combining these two estimates completes the
proof.

(3.12) Corollary (Convexity bound)


Let $L$ be an L-function as in (3.8), and assume that $\sum_{n\le x}|a_n|\ll x^{1+\varepsilon}$ with
an implied constant independent of $L$. If $R\ne0$ in (3.10), assume that $L$ is the
Riemann zeta function. Then for $0\le\sigma\le1$ (and $|s-1|\ge\delta$ in the case of the
Riemann zeta-function) we have
\[ L(s)\ll_{\delta,\varepsilon}C(s)^{\frac{1-\sigma}{2}+\varepsilon} \]
for every $\varepsilon>0$.

Proof. There are two ways to prove this:

a) Using the Phragmén-Lindelöf principle: Let $f$ be a function holomorphic
on an open neighbourhood of a strip $a\le\sigma\le b$, such that $|f(s)|\ll\exp(|s|^A)$
for some $A\ge0$ and $a\le\sigma\le b$. Assume that
\[ |f(a+it)|\le M_a(1+|t|)^\alpha,\qquad|f(b+it)|\le M_b(1+|t|)^\beta\qquad(t\in\mathbb{R}). \]
Then
\[ |f(\sigma+it)|\le M_a^{\ell(\sigma)}M_b^{1-\ell(\sigma)}(1+|t|)^{\alpha\ell(\sigma)+\beta(1-\ell(\sigma))} \]
where $\ell(\sigma)=\frac{\sigma-b}{a-b}$ is the linear function with $\ell(a)=1$, $\ell(b)=0$.

It follows from the assumption that $L(s)\ll1$ in $\Re s\ge1+\varepsilon$, and from the
functional equation along with (3.9a) we obtain $L(s)\ll C(s)^{1/2-\sigma+\varepsilon}$ in $\Re s\le-\varepsilon$.
The Phragmén-Lindelöf principle now gives the claim. In the case of the Riemann
zeta-function we consider $(s-1)\zeta(s)$ to have a holomorphic function.

b) Using the approximate functional equation: For simplicity we assume
$\min(\sigma,1-\sigma)+\Re\kappa_j>0$ for all $j$. We choose $X=1$ in (3.10) and estimate
trivially using (3.11) getting (recall (3.9a))
\[ L(s)\ll\sum_n\frac{|a_n|}{n^\sigma}\Big(1+\frac{n}{C(s)^{1/2}}\Big)^{-10}+C(s)^{1/2-\sigma}\sum_n\frac{|a_n|}{n^{1-\sigma}}\Big(1+\frac{n}{C(s)^{1/2}}\Big)^{-10}+|R|. \]
Note that $C(s)\asymp C(1-s)$ for bounded $\Re s$ since the $\kappa_j$ are either real or come
in pairs of complex conjugates. Since $\sum_{n\le x}|a_n|\ll x^{1+\varepsilon}$, we see by partial
summation after splitting the sum at $n=C(s)^{1/2}$ that the first two sums are
$\ll C(s)^{(1-\sigma)/2+\varepsilon}$, as desired. In the case of the Riemann zeta-function we
estimate $R\ll1$ and the claim follows.

Some details: First we explain why $C(s)\asymp C(1-s)$. Since $|x+iy|\asymp|x|+|y|$, we have for
$s=\sigma+it$ that
\[ C(s)=N\prod_{j=1}^d(|s+\kappa_j|+2)\asymp N\prod_{j=1}^d(1+|\sigma+\Re\kappa_j|+|t+\Im\kappa_j|)\asymp N\prod_{j=1}^d(1+|\Re\kappa_j|+|t+\Im\kappa_j|) \]
for fixed $\sigma$. Indeed:
\[ \tfrac12(|x|+|y|)\le\max(|x|,|y|)\le|x+iy|\le|x|+|y|, \]
so that $|x+iy|\asymp|x|+|y|$, and for $1+|t+\Im\kappa_j|=:T\ge1$ one has
\[ T+|\sigma+\Re\kappa_j|\le T+|\sigma|+|\Re\kappa_j|\le(1+|\sigma|)(T+|\Re\kappa_j|), \]
\[ T+|\sigma+\Re\kappa_j|\ge\max(1,T+|\Re\kappa_j|-|\sigma|)\ge\begin{cases}\frac12(T+|\Re\kappa_j|),&|\sigma|\le\frac12(T+|\Re\kappa_j|),\\[2pt]1\ge\frac{1}{2|\sigma|}(T+|\Re\kappa_j|),&|\sigma|\ge\frac12(T+|\Re\kappa_j|),\end{cases} \]
so $1+|\sigma+\Re\kappa_j|+|t+\Im\kappa_j|\asymp_\sigma1+|\Re\kappa_j|+|t+\Im\kappa_j|$. The same argument shows
\[ C(1-s)\asymp N\prod_{j=1}^d(1+|\Re\kappa_j|+|-t+\Im\kappa_j|), \]
and the two products coincide, because either $\Im\kappa_j=0$, or there exists a pair of $\kappa$’s with
imaginary parts of opposite signs.
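Summation by parts: For notational simplicity let us write $C:=C(s)^{1/2}$ and
$A(t):=\sum_{n\le t}|a_n|\ll t^{1+\varepsilon}$. First we estimate
\[ \sum_{n\le C}\frac{|a_n|}{n^\sigma}\Big(1+\frac nC\Big)^{-10}\le\sum_{n\le C}\frac{|a_n|}{n^\sigma}=\frac{A(C)}{C^\sigma}+\sigma\int_1^C\frac{A(t)}{t^{\sigma+1}}dt\ll C^{1+\varepsilon-\sigma}+\int_1^Ct^{-\sigma+\varepsilon}dt\ll C^{1+\varepsilon-\sigma}. \]
Next we estimate in the same way
\[ \sum_{n\ge C}\frac{|a_n|}{n^\sigma}\Big(1+\frac nC\Big)^{-10}\le\sum_{n\ge C}\frac{|a_n|}{n^\sigma}\frac{C^{10}}{n^{10}}\le C^{10}(\sigma+10)\int_C^\infty\frac{A(t)}{t^{\sigma+10+1}}dt\ll C^{1+\varepsilon-\sigma}. \]
In total we see that the first sum is $\ll C^{1-\sigma+\varepsilon}=C(s)^{(1-\sigma+\varepsilon)/2}$, as desired.

For the second term we replace $\sigma$ with $1-\sigma$ getting
\[ C(s)^{1/2-\sigma}\sum_n\frac{|a_n|}{n^{1-\sigma}}\Big(1+\frac{n}{C(s)^{1/2}}\Big)^{-10}\ll C(s)^{1/2-\sigma}\cdot C(s)^{(\sigma+\varepsilon)/2}=C(s)^{(1-\sigma+\varepsilon)/2}. \]
Residue: both terms are similar, let’s consider the residue at $u=1-s$. We have $d=N=1$,
$\operatorname{res}_{s=1}\Lambda(s)=1$ and the bound $G(u)\ll\exp(|u|^2)$. Hence we bound the residue by
\[ \frac{1}{L_\infty(s)}\frac{G(1-s)}{1-s}\ll1. \]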

(3.13) Remark
The Generalized Lindelöf Hypothesis states that $L(s)\ll C(s)^{\max(1/2-\sigma,0)+\varepsilon}$ everywhere
(away from poles), but this is currently not known. For L-functions of
degree 1 the best known bound is
\[ L(s)\ll C(s)^{1/6+\varepsilon},\qquad\Re s=1/2, \]
whereas (3.12) gives the exponent 1/4. This uses deep results from algebraic
geometry and the theory of automorphic forms.

(3.14) Corollary
We have
\[ \sum_{\substack{\chi\ (\mathrm{mod}\ q)\\\chi\ \mathrm{primitive}}}|L(1/2,\chi)|^2\ll q^{1+\varepsilon}. \]

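Proof. If one sums the approximate functional equation over a family, one has
to be very careful which parameters depend on the family. In our case, the root
number $\epsilon(s)$ depends on $\chi$ and the weight function $V$ depends on the parity of
$\chi$. Therefore we sum over even and odd characters separately and eliminate the
root number by a baby form of Cauchy-Schwarz ($|a+b|^2\le2(|a|^2+|b|^2)$):
\[ \sum_{\substack{\chi\ (\mathrm{mod}\ q)\\\chi\ \mathrm{primitive,\ even}}}|L(1/2,\chi)|^2\le4\sum_{\substack{\chi\ (\mathrm{mod}\ q)\\\chi\ \mathrm{primitive,\ even}}}\Big|\sum_n\frac{\chi(n)}{n^{1/2}}V\Big(\frac{n}{\sqrt q}\Big)\Big|^2. \]
Here we wrote $V=V_{1/2}$ and observe that $R=0$ in (3.10). This is bounded by
\[ \le4\sum_{\chi\ (\mathrm{mod}\ q)}\Big|\sum_n\frac{\chi(n)}{n^{1/2}}V\Big(\frac{n}{\sqrt q}\Big)\Big|^2=4\sum_{\chi\ (\mathrm{mod}\ q)}\sum_{n,m}\frac{\chi(n)\bar\chi(m)}{(nm)^{1/2}}V\Big(\frac{n}{\sqrt q}\Big)\bar V\Big(\frac{m}{\sqrt q}\Big) \]
\[ =4\phi(q)\sum_{\substack{n\equiv m\ (\mathrm{mod}\ q)\\(nm,q)=1}}\frac{1}{(nm)^{1/2}}V\Big(\frac{n}{\sqrt q}\Big)\bar V\Big(\frac{m}{\sqrt q}\Big). \]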
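By symmetry we can assume $m\ge n$, so that we are left with bounding
\[ q\sum_n\frac{1}{n^{1/2}}\Big|V\Big(\frac{n}{\sqrt q}\Big)\Big|\sum_{\substack{m\ge n\\m\equiv n\ (\mathrm{mod}\ q)}}\frac{1}{\sqrt m}\Big(1+\frac{m}{\sqrt q}\Big)^{-1} \]
\[ \ll q\sum_n\frac{1}{n^{1/2}}\Big|V\Big(\frac{n}{\sqrt q}\Big)\Big|\Big(\frac{1}{\sqrt n}+\sum_{k\ge1}\frac{q^{1/2}}{(kq)^{3/2}}\Big)\ll q\sum_n\frac{1}{n^{1/2}}\Big(1+\frac{n}{\sqrt q}\Big)^{-1}\Big(\frac{1}{\sqrt n}+\frac1q\Big). \]
Now it’s a matter of book-keeping. The above is bounded by
\[ q\Big(\sum_{n\le\sqrt q}\Big(\frac1n+\frac{1}{n^{1/2}q}\Big)+\sum_{n\ge\sqrt q}\Big(\frac{\sqrt q}{n^2}+\frac{1}{n^{3/2}q^{1/2}}\Big)\Big)\ll q^{1+\varepsilon}. \]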

The odd characters can be estimated in the same way.

Remark: In fact, the same argument shows the stronger result
\[ \sum_{\chi\ (\mathrm{mod}\ q)}|L(1/2,\chi)|^4\ll q^{1+\varepsilon} \]
(exercise). Here one can drop all but one term and recover the convexity bound
$L(1/2,\chi)\ll q^{1/4+\varepsilon}$.

(3.15) Convention
From now on, the symbol $\varepsilon$ means a sufficiently small number, but not necessarily
the same at each occurrence. For instance, we may write $x^\varepsilon\log x\ll x^\varepsilon$
etc.

(3.16) Theorem
For $T>1$, $1/2\le\sigma<1$ one has
\[ \int_0^T|\zeta(\sigma+it)|^2dt\ll T^{1+\varepsilon}. \]

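Proof. By (3.10) and (3.9) we have
\[ \int_0^T|\zeta(\sigma+it)|^2dt\ll I_1+I_2+I_3 \]
where
\[ I_1=\int_0^T\Big|\sum_n\frac{1}{n^{\sigma+it}}V_{\sigma+it}(n)\Big|^2dt,\qquad I_3=\int_0^T|R|^2dt, \]
\[ I_2=\int_0^T\Big|\sum_n\frac{1}{n^{1-\sigma-it}}V_{1-\sigma-it}(n)\,(1+|t|)^{1/2-\sigma}\Big|^2dt. \]
We have
\[ R\ll\Big(\frac{|G(1-\sigma-it)|}{|1-\sigma-it|}+\frac{|G(-\sigma-it)|}{|-\sigma-it|}\Big)\Big|\Gamma\Big(\frac{\sigma+it}{2}\Big)\Big|^{-1}\ll e^{-|t|}, \]
so $I_3\ll1$.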
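Next we investigate $I_1$. We always assume $0\le t\le T$ and $1/2\le\sigma<1$. By (3.11) we have
\[ \sum_{n>T^{1/2+\varepsilon}}\frac{1}{n^{\sigma+it}}V_{\sigma+it}(n)\ll\sum_{n>T^{1/2+\varepsilon}}\frac{1}{n^\sigma}\Big(\frac{n}{T^{1/2}}\Big)^{-A}\ll T^{\frac A2-(\sigma+A-1)(\frac12+\varepsilon)}\ll T^{-\varepsilon A/2}, \]
the last step being valid as soon as $A\ge2/\varepsilon$.
Choosing $A=2/\varepsilon$, we obtain
\[ I_1=\int_0^T\Big|\sum_{n\le T^{1/2+\varepsilon}}\frac{1}{n^{\sigma+it}}V_{\sigma+it}(n)+O\Big(\frac1T\Big)\Big|^2dt\ll\int_0^T\Big|\sum_{n\le T^{1/2+\varepsilon}}\frac{1}{n^{\sigma+it}}V_{\sigma+it}(n)\Big|^2dt+O\Big(\frac1T\Big). \]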

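In the definition of $V_s$ we shift the contour to $\Re u=\varepsilon$ getting
\[ I_1\ll\int_0^T\Big|\int_{(\varepsilon)}\frac{G(u)\Gamma(\frac{s+u}{2})}{u\,\pi^{u/2}\,\Gamma(\frac s2)}\sum_{n\le T^{1/2+\varepsilon}}\frac{1}{n^{\sigma+it+u}}\,du\Big|^2dt+O(T^{-1}). \]
By (3.4b) and the rapid decay of $G$, we obtain
\[ I_1\ll\int_0^T\Big(\int_{(\varepsilon)}T^{\varepsilon/2}e^{-c|u|/2}\Big|\sum_{n\le T^{1/2+\varepsilon}}\frac{1}{n^{\sigma+it+u}}\Big|\,|du|\Big)^2dt+T^{-1} \]
and by Cauchy-Schwarz we can estimate
\[ I_1\ll T^\varepsilon\int_0^T\Big(\int_{(\varepsilon)}e^{-c|u|/2}|du|\Big)\Big(\int_{(\varepsilon)}e^{-c|u|/2}\Big|\sum_{n\le T^{1/2+\varepsilon}}\frac{1}{n^{\sigma+it+u}}\Big|^2|du|\Big)dt+T^{-1} \]
\[ \ll T^\varepsilon\int_{-\infty}^\infty e^{-c|x|/2}\int_0^T\Big|\sum_{n\le T^{1/2+\varepsilon}}\frac{1}{n^{\sigma+\varepsilon+it+ix}}\Big|^2dt\,dx+T^{-1}. \]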

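The $t$-integral equals
\[ \sum_{n,m\le T^{1/2+\varepsilon}}\frac{1}{(nm)^{\sigma+\varepsilon}}\Big(\frac mn\Big)^{ix}\int_0^T\Big(\frac mn\Big)^{it}dt\ll\sum_{n,m\le T^{1/2+\varepsilon}}\frac{1}{(nm)^{\sigma+\varepsilon}}\min\Big(T,\frac{1}{|\log m/n|}\Big) \]
\[ \ll T\sum_{n\le T^{1/2+\varepsilon}}\frac{1}{n^{2(\sigma+\varepsilon)}}+\sum_{n<m\le T^{1/2+\varepsilon}}\frac{1}{(nm)^{\sigma+\varepsilon}|\log m/n|} \]
\[ \ll T+\sum_{n\le T^{1/2+\varepsilon}}\frac{1}{n^{2(\sigma+\varepsilon)}}\sum_{h\le T^{1/2+\varepsilon}}\frac{1}{\log(1+h/n)} \]
\[ \ll T+\sum_{n\le T^{1/2+\varepsilon}}\frac{n}{n^{2(\sigma+\varepsilon)}}\log T\ll T+T^{(1/2+\varepsilon)(2-2\sigma-2\varepsilon)}\ll T. \]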

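The estimation of $I_2$ is very similar. In the final $t$-integral we obtain
\[ \int_0^T\Big|\sum_{n\le T^{1/2+\varepsilon}}\frac{1}{n^{1-\sigma+\varepsilon+it+ix}}\Big|^2(1+|t|)^{1-2\sigma}dt\ll T^{1-2\sigma}\sum_{n,m\le T^{1/2+\varepsilon}}\frac{1}{(nm)^{1-\sigma+\varepsilon}}\min\Big(T,\frac{1}{|\log m/n|}\Big) \]
\[ \ll T^{1-2\sigma}\Big(T^{1+(1/2+\varepsilon)(2\sigma-1-2\varepsilon)}+\sum_{n\le T^{1/2+\varepsilon}}\frac{1}{n^{2(1-\sigma+\varepsilon)}}\sum_{h\le T^{1/2+\varepsilon}}\frac{1}{\log(1+h/n)}\Big) \]
\[ \ll T+T^{1-2\sigma}(\log T)\,T^{(1/2+\varepsilon)(2\sigma-2\varepsilon)}\ll T. \]
This completes the proof.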

Remark: As in (3.14), the proof shows really a fourth moment estimate
\[ \int_0^T|\zeta(\sigma+it)|^4dt\ll T^{1+\varepsilon}. \]

It is a major unsolved problem to estimate higher moments.

4 The Mellin transform
In this section we get to know a new kind of integral transform that is particu-
larly useful for counting purposes.

(4.1) Definition
Let $f:(0,\infty)\to\mathbb{C}$ be a piecewise continuous function satisfying $f(x)\ll x^{-a}$
for $x\to0$ and $f(x)\ll x^{-b}$ for $x\to\infty$ for some real numbers $a<b$ (possibly
$\pm\infty$). The Mellin transform of $f$ is the function
\[ \mathcal{M}(f)(s)=\widehat f(s):=\int_0^\infty f(x)x^s\frac{dx}{x}. \]
It is a holomorphic function in $a<\Re s<b$.

(4.2) Remarks
We have $\mathcal{F}(f)(y)=\mathcal{M}(f\circ\log)(-2\pi iy)$ and $\mathcal{M}(f)(s)=\mathcal{F}(f\circ\exp)(-s/(2\pi i))$.
Hence the Mellin transform is, up to scaling by $-2\pi i$, the image of the Fourier
transform under the homomorphism $\exp:(\mathbb{R},+)\to(\mathbb{R}_{>0},\cdot)$. The maps $x\mapsto x^s$
are quasi-characters of the multiplicative group.
We have $\mathcal{M}(e^{-x})(s)=\Gamma(s)$ with $a=0$, $b=\infty$.
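
The identity $\mathcal{M}(e^{-x})(s)=\Gamma(s)$ and the change of variables $x=e^t$ that links the Mellin and Fourier transforms can be illustrated numerically. The sketch below is only an illustration under stated assumptions: it handles real $s>0$, the integration window and step count are ad hoc choices, and the function name is made up. After substituting $x=e^t$ the integrand $e^{st-e^t}$ decays rapidly in both directions, so a plain trapezoidal rule is already very accurate.

```python
import math

def mellin_exp(s: float, lo: float = -40.0, hi: float = 6.0, steps: int = 46000) -> float:
    """Approximate int_0^inf e^{-x} x^s dx/x for real s > 0 via x = e^t (trapezoidal rule)."""
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        t = lo + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        # After x = e^t the measure dx/x becomes dt and the integrand is exp(s*t - e^t).
        total += weight * math.exp(s * t - math.exp(t))
    return total * h

if __name__ == "__main__":
    for s in (1.0, 2.5, 5.0):
        print(s, mellin_exp(s), math.gamma(s))
```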

(4.3) Lemma
Let $f:(0,\infty)\to\mathbb{C}$ be a piecewise continuous function satisfying $f(x)\ll x^{-a}$
for $x\to0$ and $f(x)\ll x^{-b}$ for $x\to\infty$ for some real numbers $a<b$. Assume
that for $x\to0$ we have
\[ f(x)\sim\sum_{m=0}^\infty\sum_{n=0}^{N(m)}c_{nm}(\log x)^nx^{r_m} \]
for certain $c_{nm}\in\mathbb{R}$, $N(m)\in\mathbb{N}_0$, $c_{N(m),m}\ne0$, $\Re(r_m)\to\infty$, i.e.
\[ f(x)=\sum_{\Re(r_m)<M}\sum_{n=0}^{N(m)}c_{nm}(\log x)^nx^{r_m}+O(x^M)\qquad(x\to0) \]
for all $M\in\mathbb{R}$. Then the Mellin transform $\widehat f(s)$ can be continued meromorphically
to $\Re s<b$ with poles at $s=-r_m$ of order $N(m)+1$.

Remark: This is consistent with what we know about the Gamma function
(3.2). A similar result holds if f has an asymptotic expansion at ∞.

Proof. For $M\in\mathbb{R}$ let $f=g_M+h_M$ with
\[ g_M(x)=\begin{cases}0,&x\ge1,\\\displaystyle\sum_{\Re(r_m)\le M}\sum_{n=0}^{N(m)}c_{nm}(\log x)^nx^{r_m},&x<1\end{cases} \]
and $h_M=f-g_M$. Then
\[ \widehat h_M(s)=\int_1^\infty f(x)x^s\frac{dx}{x}+\int_0^1(f-g_M)(x)x^s\frac{dx}{x} \]
is holomorphic in $-M<\Re s<b$. On the other hand, for $\Re s$ sufficiently large
one has
\[ \widehat g_M(s)=\sum_{\Re(r_m)\le M}\sum_{n=0}^{N(m)}\frac{c_{nm}(-1)^nn!}{(s+r_m)^{n+1}} \]
which can be continued meromorphically to the entire plane. Since $M$ was arbitrary,
the claim follows.

(4.4) Lemma
Let $a'\le a<b\le b'$. Let $f,h:(0,\infty)\to\mathbb{C}$ be piecewise continuous functions
satisfying $f(x),h(x)\ll x^{-a}$ for $x\to0$ and $f(x),h(x)\ll x^{-b}$ for $x\to\infty$, and
assume that $\widehat f$ can be continued to $a'<\Re s<b'$.
a) For $u\in\mathbb{R}$, $v>0$, the Mellin transform of $g(x)=x^uf(x^v)$ can be continued
to $-u+va'<\Re s<-u+vb'$ and satisfies $\widehat g(s)=v^{-1}\widehat f((s+u)/v)$.
b) If $f$ is continuously differentiable, then the Mellin transform of $g=f'$ can
be continued to $a'+1<\Re s<b'+1$ and satisfies $\widehat g(s)=(1-s)\widehat f(s-1)$.
c) If $f\in C_c^N$, then $\widehat f(s)=O_{\Re s,N}((1+|\Im s|)^{-N})$ for $s\in\mathbb{C}$.
d) If $(f*h)(x):=\int_0^\infty f(t)h(x/t)\,dt/t$, then $\widehat{f*h}=\widehat f\,\widehat h$ in $a<\Re s<b$.

Proof. Exercise

Remark: In particular, the Mellin transform of a smooth, compactly supported
function is entire and rapidly decaying on vertical lines.

(4.5) Theorem (Inversion formula)


Let $f:(0,\infty)\to\mathbb{C}$ be smooth and rapidly decaying at $\infty$. Assume that $f(x)\ll x^{-a}$
as $x\to0$. Then
\[ f(x)=\frac{1}{2\pi i}\int_{(c)}\widehat f(s)x^{-s}ds \]
where $c$ is a real number $>a$.

Examples and remarks: We have $e^{-x}=\frac{1}{2\pi i}\int_{(1)}\Gamma(s)x^{-s}ds$.
In (3.10) the weight function is given by an inverse Mellin transform. One typically
estimates Mellin integrals by partial integration and inverse Mellin integrals
by shifting the contour.

Proof. Replacing $f(x)$ by $x^Nf(x)$, we can assume that $a<0$, so that we can
choose $c=0$. Now the statement follows simply from the corresponding inversion
formula for the Fourier transform (2.4):
\[ \frac{1}{2\pi i}\int_{(0)}\mathcal{M}(f)(s)x^{-s}ds=\frac{1}{2\pi i}\int_{(0)}\mathcal{F}(f\circ\exp)\Big(-\frac{s}{2\pi i}\Big)x^{-s}ds \]
\[ =\int_{-\infty}^\infty\mathcal{F}(f\circ\exp)(s)\exp(2\pi is\log x)\,ds=f(\exp(\log x))=f(x). \]

(4.6) Theorem (Perron’s formula)


Let
\[ \chi(x)=\begin{cases}1,&0<x<1,\\1/2,&x=1,\\0,&x>1.\end{cases} \]
Then $\widehat\chi(s)=1/s$ and $\chi(x)=\frac{1}{2\pi i}\int_{(c)}\frac{x^{-s}}{s}ds$ for $c>0$. More precisely:
\[ \frac{1}{2\pi i}\int_{c-iT}^{c+iT}\frac{x^{-s}}{s}ds=\chi(x)+\begin{cases}O(cT^{-1}),&x=1,\\O(x^{-c}\min(1,(T|\log x|)^{-1})),&x\ne1.\end{cases} \]
The implied constants are absolute.

Proof. We distinguish 3 cases.

a) Let $x=1$. Then
\[ \frac{1}{2\pi i}\int_{c-iT}^{c+iT}\frac{ds}{s}=\frac{1}{2\pi}\int_{-T}^T\frac{dt}{c+it}=\frac1\pi\int_0^T\frac{c}{c^2+t^2}dt=\frac1\pi\int_0^{T/c}\frac{dt}{1+t^2}=\frac12+O\Big(\frac cT\Big). \]
b) Let now $x>1$. Let $r>c$. Let $Q$ be the boundary of the rectangle with
vertices $c\pm iT$, $r\pm iT$. By Cauchy’s integral theorem we have
\[ \int_Qx^{-s}\frac{ds}{s}=0. \]
We estimate the two horizontal lines by
\[ \le2\int_c^\infty x^{-\sigma}\frac{d\sigma}{|\sigma+iT|}\le\frac{2x^{-c}}{T|\log x|}, \]
and the line on the right by
\[ \le x^{-r}\int_{-T}^T\frac{dt}{|r+it|}\ll\frac{Tx^{-r}}{r}\to0,\qquad r\to\infty. \]
Alternatively, let $C$ be the circle about 0 with radius $R=\sqrt{c^2+T^2}$. Let $C_0$ be
the portion to the right of the segment $[c-iT,c+iT]$. Then
\[ \Big|\frac{1}{2\pi i}\int_{c-iT}^{c+iT}\frac{x^{-s}}{s}ds\Big|=\Big|\frac{1}{2\pi i}\int_{C_0}\frac{x^{-s}}{s}ds\Big|\le R\,\frac{x^{-c}}{R}=x^{-c}. \]
This completes the proof for $x>1$.

c) The case $x<1$ is similar with $Q$ being the rectangle with vertices $c\pm iT$,
$-r\pm iT$ and $C_1$ the portion of $C$ to the left of the segment $[c-iT,c+iT]$. Here
one picks up the residue at $s=0$.
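
The truncated Perron integral can be evaluated numerically to see the cutoff function $\chi$ emerge. The following is a rough sketch (midpoint rule on the segment $s=c+it$, $|t|\le T$); the function name, the default parameters and the step count are illustrative assumptions. For $x\ne1$ the deviation from $\chi(x)$ should be at most of size $x^{-c}/(T|\log x|)$, and for $x=1$ of size $c/T$.

```python
import cmath
import math

def truncated_perron(x: float, c: float = 1.0, T: float = 200.0, steps: int = 200_000) -> float:
    """(1/(2 pi i)) * int_{c-iT}^{c+iT} x^{-s}/s ds, by the midpoint rule in s = c + it."""
    h = 2.0 * T / steps
    log_x = math.log(x)
    total = 0.0 + 0.0j
    for k in range(steps):
        t = -T + (k + 0.5) * h
        s = complex(c, t)
        total += cmath.exp(-s * log_x) / s
    # ds = i dt, so the prefactor 1/(2 pi i) turns into 1/(2 pi).
    return (total * h / (2.0 * math.pi)).real

if __name__ == "__main__":
    for x in (0.5, 1.0, 2.0):
        print(x, truncated_perron(x))
```

Increasing $T$ sharpens the step at $x=1$, in line with the error term $O(x^{-c}\min(1,(T|\log x|)^{-1}))$.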

(4.7) Theorem
Let $x\ge2$. Let $0<c\ll1$, $T\ge2+c$, possibly depending on $x$. Let $a_n$
be a sequence of complex numbers and assume that $\sum_na_nn^{-s}$ is absolutely
convergent at $s=c$. Then
\[ \sum_{n\le x}a_n=\frac{1}{2\pi i}\int_{c-iT}^{c+iT}\sum_n\frac{a_n}{n^s}x^s\frac{ds}{s}+O\Big(\frac{x^c}{T}\sum_n\frac{|a_n|}{n^c}+\max_{\frac34x\le n\le\frac54x}|a_n|\Big(1+\frac{x\log x}{T}\Big)\Big). \]

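Proof. Let $A_x:=\max_{\frac34x\le n\le\frac54x}|a_n|$. By (4.6) we have
\[ \sum_{n\le x}a_n=\sum_na_n\chi(n/x)+O(A_x)=\frac{1}{2\pi i}\int_{c-iT}^{c+iT}\sum_n\frac{a_n}{n^s}x^s\frac{ds}{s}+E \]
where
\[ E\ll A_x+x^c\sum_n\frac{|a_n|}{n^c}\min\big(1,(T|\log(n/x)|)^{-1}\big) \]
\[ \ll A_x+\frac{x^c}{T}\sum_n\frac{|a_n|}{n^c}+\frac1T\sum_{2\le|n-x|\le\frac14x}A_x|\log(x/n)|^{-1} \]
\[ \ll A_x+\frac{x^c}{T}\sum_n\frac{|a_n|}{n^c}+\frac{A_x}{T}\sum_{2\le|n-x|\le\frac14x}\Big|\frac xn-1\Big|^{-1} \]
\[ \ll A_x+\frac{x^c}{T}\sum_n\frac{|a_n|}{n^c}+\frac{A_x}{T}\sum_{2\le|n-x|\le\frac14x}\frac{x}{|x-n|} \]
and the claim follows.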

Remark: The truncation is not necessary if one allows a smooth weight function,
e.g.
\[ \sum_na_ne^{-n/x}=\frac{1}{2\pi i}\int_{(c)}\sum_n\frac{a_n}{n^s}\Gamma(s)x^sds. \]
(4.8) Example
We have $\sum_{n\le x}\mu^2(n)=\frac{1}{\zeta(2)}x+O(x^{1/2+\varepsilon})$.

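Proof. We choose in Perron’s formula⁶ $c=1+1/\log x$ and $T=x^\alpha$ with some
$0<\alpha<1$ to be chosen later. Then $x^c\ll x$ and $\sum_nn^{-c}\ll\log x$, so the error
term in (4.7) is $O(x\log x/T)$. Note that
\[ \sum_n\frac{\mu^2(n)}{n^s}=\prod_p\Big(1+\frac{1}{p^s}\Big)=\frac{\zeta(s)}{\zeta(2s)} \]
for $\Re s>1$. Therefore,
\[ \sum_{n\le x}\mu^2(n)=\frac{1}{2\pi i}\int_{c-iT}^{c+iT}\frac{\zeta(s)}{\zeta(2s)}x^s\frac{ds}{s}+O\Big(\frac{x\log x}{T}\Big). \]
We shift the contour to $\Re s=1/2+\varepsilon$. In this region $1/\zeta(2s)=\sum\mu(n)n^{-2s}$ is
holomorphic and bounded (by absolute convergence). Hence
\[ \sum_{n\le x}\mu^2(n)=\frac{x}{\zeta(2)}+\int_{1/2+\varepsilon-iT}^{1/2+\varepsilon+iT}(\ldots)+O\Big(\int_{1/2+\varepsilon\pm iT}^{c\pm iT}(\ldots)\Big)+O\Big(\frac{x\log x}{T}\Big). \]
We estimate the first integral by Cauchy-Schwarz and (3.16):
\[ \int_{1/2+\varepsilon-iT}^{1/2+\varepsilon+iT}(\ldots)\ll x^{1/2+\varepsilon}\int_0^T\frac{|\zeta(1/2+\varepsilon+it)|}{1+t}dt \]
\[ \ll x^{1/2+\varepsilon}\Big(\int_0^T\frac{|\zeta(1/2+\varepsilon+it)|^2}{1+t}dt\Big)^{1/2}\Big(\int_0^T\frac{1}{1+t}dt\Big)^{1/2}\ll x^{1/2+O(\varepsilon)}. \]
(Cut the first integral into $O(\log T)$ dyadic pieces $A\le t\le2A$, $1\le A\le T$.)

Using the convexity bound, the horizontal integrals can be estimated by
\[ \int_{1/2+\varepsilon\pm iT}^{c\pm iT}(\ldots)\ll\frac1T\int_{1/2+\varepsilon}^cx^\sigma|\zeta(\sigma+iT)|d\sigma\ll\frac1T\int_{1/2+\varepsilon}^cx^\sigma T^{\max((1-\sigma)/2,0)+\varepsilon}d\sigma \]
\[ \ll\frac{x^\varepsilon}{T^{1/2}}\int_{1/2+\varepsilon}^c\Big(\frac{x}{T^{1/2}}\Big)^\sigma d\sigma\ll\frac{x^\varepsilon}{T^{1/2}}\Big(\frac{x}{T^{1/2}}\Big)^c\ll\frac{x^{1+\varepsilon}}{T}. \]
Choosing $T=x^{1/2}$ completes the proof.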

Remark: We see that the error term is intimately linked to the location of zeros
of the Riemann zeta function, because the factor $1/\zeta(2s)$ limits a contour shift
further to the left. Any improvement in the error term (beyond removing the $\varepsilon$
which is not hard) is equivalent to a quasi-Riemann hypothesis.
6 $c=1+\varepsilon$ would do the job equally well.
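
The asymptotic $\sum_{n\le x}\mu^2(n)=x/\zeta(2)+O(x^{1/2+\varepsilon})$ can be compared with an exact count. The sketch below is an illustration, not part of the notes: it counts squarefree integers with a simple sieve (the function name is made up) and uses the classical value $\zeta(2)=\pi^2/6$ for the main term.

```python
import math

def squarefree_count(x: int) -> int:
    """Number of squarefree n <= x, by sieving out multiples of d^2 for d >= 2."""
    is_squarefree = [True] * (x + 1)
    d = 2
    while d * d <= x:
        for multiple in range(d * d, x + 1, d * d):
            is_squarefree[multiple] = False
        d += 1
    return sum(is_squarefree[1:])

if __name__ == "__main__":
    for x in (10**2, 10**3, 10**4, 10**5):
        main_term = 6.0 * x / math.pi**2  # x / zeta(2)
        print(x, squarefree_count(x), main_term)
```

In this range the observed difference from the main term is far smaller than $x^{1/2}$.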

(4.9) Example
Let $k\in\mathbb{N}$, $k\ge2$. There is a polynomial $P_k$ of degree $k-1$ and a constant
$\delta=\delta_k>0$, such that
\[ \sum_{n\le x}d_k(n)=xP_k(\log x)+O(x^{1-\delta}). \]

Proof. Again we choose $c=1+1/\log x$ and $T=x^\alpha$ with $0<\alpha\le2/k$.
By (1.7) we have $d_k(n)\ll n^\varepsilon$. As before the error term in Perron’s formula is
$O(x^{1+\varepsilon}T^{-1})$, and we have
\[ \sum_{n\le x}d_k(n)=\operatorname*{res}_{s=1}\frac{\zeta^k(s)x^s}{s}+\int_{1/2-iT}^{1/2+iT}(\ldots)+O\Big(\int_{1/2\pm iT}^{c\pm iT}(\ldots)\Big)+O\Big(\frac{x^{1+\varepsilon}}{T}\Big). \]
For the error we simply use the convexity bound:
\[ \int_{1/2-iT}^{1/2+iT}(\ldots)\ll x^{1/2}\int_0^T\frac{(1+t)^{k/4}}{(1+t)}dt\ll x^{1/2}T^{k/4} \]
and
\[ \int_{1/2\pm iT}^{c\pm iT}(\ldots)\ll\frac1T\int_{1/2}^cx^\sigma T^{\frac k2(1-\sigma)+\varepsilon}d\sigma\ll\frac{x^{1+\varepsilon}}{T}. \]
Balancing $x^{1/2}T^{k/4}$ against $x/T$, we choose
\[ T=x^{\frac12(\frac k4+1)^{-1}}=x^{2/(k+4)}, \]
so that
\[ \sum_{n\le x}d_k(n)=\operatorname*{res}_{s=1}\frac{\zeta^k(s)x^s}{s}+O(x^{1-\delta+\varepsilon}) \]
with $\delta=2/(k+4)$. Now we use the Taylor expansions
\[ \zeta^k(s)=\frac{1}{(s-1)^k}+\frac{\alpha}{(s-1)^{k-1}}+\ldots, \]
\[ \frac1s=1-(s-1)+\ldots, \]
\[ x^s=x+x\log x\,(s-1)+\frac12x(\log x)^2(s-1)^2+\ldots \]
to complete the proof.

Remark: Assuming the Lindelöf hypothesis (cf. (3.13)), we could get the error
term $O(x^{1/2+\varepsilon})$ for all $k$.
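
For $k=2$ the main term can be made explicit and compared with an exact count. The sketch below is an illustration only: it uses the classical Laurent expansion $\zeta(s)=1/(s-1)+\gamma+\ldots$ (with $\gamma$ the Euler-Mascheroni constant, a fact not stated in these notes), which gives $P_2(\log x)=\log x+2\gamma-1$ as the residue of $\zeta^2(s)x^s/s$ at $s=1$. The function names are made up.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def divisor_summatory(x: int) -> int:
    """sum_{n<=x} d(n), counted as the number of pairs (m, k) with m*k <= x."""
    return sum(x // m for m in range(1, x + 1))

def dirichlet_main_term(x: int) -> float:
    """x * P_2(log x) with P_2(L) = L + 2*gamma - 1."""
    return x * (math.log(x) + 2.0 * EULER_GAMMA - 1.0)

if __name__ == "__main__":
    for x in (10**2, 10**3, 10**4):
        print(x, divisor_summatory(x), dirichlet_main_term(x))
```

The observed error is of size about $x^{1/2}$ or smaller in this range, well within the bound obtained from the convexity estimate.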

5 Zeros of L-functions and prime number theorems
In this section we study the location and density of zeros of L-functions and prove
the prime number theorem (5.6) and the prime number theorem in arithmetic
progressions (5.14).

(5.1) Theorem (facts from complex analysis)


Let $f$ be an entire function of finite order $\alpha$. Let $z_1,z_2,\ldots$ denote the non-zero
zeros of $f$ (with multiplicity) ordered by absolute value.
a) If $\beta>\alpha$, then
\[ \sum_{|z_j|\le R}1\ll R^\beta\qquad\text{and}\qquad\sum_j|z_j|^{-\beta}<\infty. \]
b) We have the product expansion
\[ f(z)=z^{\operatorname{ord}_{z=0}f}\,e^{g(z)}\prod_{j=1}^\infty\Big(1-\frac{z}{z_j}\Big)e^{J(z/z_j)} \]
where $g$ is a polynomial of degree at most $[\alpha]$ and
\[ J(w)=\sum_{j=1}^{[\alpha]}\frac{w^j}{j}. \]

(5.2) Corollary (partial fraction decomposition)


Let $L$ be an L-function as in (3.8) and let $-r\in\mathbb{Z}$ be the order of $\Lambda$ at $s=0,1$.
Then
\[ -\frac{L'}{L}(s)=\frac12\log N+\frac{L_\infty'}{L_\infty}(s)-b+\frac rs+\frac{r}{s-1}-\sum_{\rho\ne0,1}\Big(\frac{1}{s-\rho}+\frac1\rho\Big) \]
where the sum is over all zeros $\rho$ of $\Lambda$ (with multiplicity), and $b$ is some complex
number (depending on the L-function).

Proof. Since $(s(1-s))^r\Lambda(s)$ is an entire function of order 1, this follows from
(5.1b) with $g(z)=a+bz$ after taking the logarithmic derivative and noting that
\[ \frac{\frac{d}{ds}\big((s(1-s))^r\Lambda(s)\big)}{(s(1-s))^r\Lambda(s)}=\frac{L'}{L}(s)+\frac{L_\infty'}{L_\infty}(s)+\frac12\log N+\frac rs+\frac{r}{s-1} \]
and
\[ \frac{\frac{d}{ds}\big((1-s/\rho)e^{s/\rho}\big)}{(1-s/\rho)e^{s/\rho}}=\frac{1}{s-\rho}+\frac1\rho. \]
(5.3) Important technical lemma
Keep the above notation, in particular let $L$ be a general L-function, let $\rho\ne0,1$
be a zero of $\Lambda$, and let $b$ be as in (5.2). Then
a) $\Re(b)=-\sum_\rho\Re\rho^{-1}$. (Note that this sum is not absolutely convergent, but
$\lim_{x\to\infty}\sum_{|\rho|\le x}\Re\rho^{-1}$ exists.)

b) [density of zeros] For $T\ge1$ one has
\[ \#\{\rho=\beta+i\gamma:|\gamma-T|\le1\}\ll\log C(iT)=\log N+\sum_{j=1}^d\log(2+|\kappa_j+iT|). \]
c) [approximate partial fraction decomposition] For $-1/2\le\Re s\le2$, one has
\[ \frac{L'}{L}(s)=\sum_{|s-\rho|<1}\frac{1}{s-\rho}+\sum_{|s+\kappa_j|<1}\frac{1}{s+\kappa_j}-\frac rs-\frac{r}{s-1}+O(\log C(s)). \]

Proof. a) We apply the logarithmic derivative to the functional equation
\[ (s(1-s))^r\Lambda(s)=\eta(s(1-s))^r\bar\Lambda(1-s) \]
getting
\[ b+\bar b=-\sum_{\rho\ne0,1}\Big(\frac{1}{s-\rho}+\frac{1}{1-s-\bar\rho}+\frac1\rho+\frac{1}{\bar\rho}\Big)=-\sum_{\rho\ne0,1}\Big(\frac{1}{s-\rho}+\frac{1}{1-s-\bar\rho}\Big)-\sum_{\rho\ne0,1}\Big(\frac1\rho+\frac{1}{\bar\rho}\Big). \]
Note that both sums on the right hand side are absolutely convergent since
\[ \frac{1}{s-\rho}+\frac{1}{1-s-\bar\rho}=\frac{1-2\Re\rho}{(s-\rho)(1-s-\bar\rho)}\ll_s\frac{1}{|\rho|^2} \]
and $1/\rho+1/\bar\rho\ll1/|\rho|^2$. Since both $\rho$ and $1-\bar\rho$ are zeros, the first sum vanishes.
b) Let $s=3+iT$. We have the crude bound
\[ \frac{L'}{L}(s)=-\sum_p\sum_{j=1}^d\frac{\alpha_j(p)\log p}{p^s}\ll d\sum_p\frac{p}{p^3}\ll d\ll\log C(iT). \]
By Stirling’s formula,
\[ \frac12\log N+\frac{L_\infty'}{L_\infty}(s)\ll\log C(iT). \]
Finally, $r/s+r/(1-s)\ll r\ll1$. Taking the real part in (5.2) and using part
a), we see
\[ \sum_\rho\Re\rho^{-1}-\sum_\rho\Re\Big(\frac{1}{s-\rho}+\frac1\rho\Big)\ll\log C(iT) \]
for $s=3+iT$. Write $\rho=\beta+i\gamma$. Since
\[ \Re\frac{1}{s-\rho}=\Re\frac{1}{(3-\beta)+i(T-\gamma)}\gg\frac{1}{1+(T-\gamma)^2}, \]
we can rearrange the absolutely convergent sum to get
\[ \sum_\rho\frac{1}{1+(T-\gamma)^2}\ll\log C(iT) \]
and so in particular $\#\{\rho=\beta+i\gamma:|\gamma-T|\le1\}\ll\log C(iT)$.


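c) Let $s=\sigma+it$ with $-1/2\le\sigma\le2$. We have trivially
\[ -\frac{L'}{L}(s)=-\frac{L'}{L}(s)+\frac{L'}{L}(3+it)+O(1). \]
We apply (5.2) to both fractions on the right hand side getting with the previous
analysis
\[ -\frac{L'}{L}(s)=\frac{L_\infty'}{L_\infty}(s)+\frac rs+\frac{r}{s-1}-\sum_{\rho\ne0,1}\Big(\frac{1}{s-\rho}-\frac{1}{3+it-\rho}\Big)+O(\log C(it)). \]
By the third last display we have
\[ \sum_{|s-\rho|\ge1}\Big(\frac{1}{s-\rho}-\frac{1}{3+it-\rho}\Big)=\sum_{|s-\rho|\ge1}\frac{3-\sigma}{(s-\rho)(3+it-\rho)}\ll\sum_\rho\frac{1}{1+(t-\gamma)^2}\ll\log C(it), \]
by the statement of part b) we have
\[ \sum_{|s-\rho|\le1}\frac{1}{3+it-\rho}\ll\log C(it) \]
and by Stirling’s formula we have $\frac{L_\infty'}{L_\infty}(s)\ll\log C(it)$ away from poles, in other
words
\[ \frac{L_\infty'}{L_\infty}(s)=-\sum_{|s+\kappa_j|<1}\frac{1}{s+\kappa_j}+O(\log C(it)), \]
since each factor $\Gamma((s+\kappa_j)/2)$ contributes a simple pole with residue $-1$ at $s=-\kappa_j$
to the logarithmic derivative. This completes the proof.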

(5.4) Theorem (“explicit formula”)


Let $w$ be a smooth and compactly supported function on $[2,\infty)$. If $\Lambda$ denotes
the von Mangoldt function, then
\[ \sum_n\Lambda(n)w(n)=-\sum_{0\le\Re\rho\le1}\widehat w(\rho)+\int_0^\infty\Big(1-\frac{1}{(x-1)x(x+1)}\Big)w(x)\,dx \]
where $\rho$ runs over all nontrivial zeros (with multiplicity) of the Riemann zeta
function and $\widehat w$ is the Mellin transform of $w$.

Remark: This shows a close connection between primes and zeros of the Rie-
mann zeta function.

Sketch of Proof. By Mellin inversion we have
\[ \sum_n\Lambda(n)w(n)=\frac{1}{2\pi i}\int_{(2)}-\frac{\zeta'}{\zeta}(s)\widehat w(s)\,ds. \]
We shift the contour to the far left. We pick up contributions at the non-trivial
zeros of the Riemann zeta function, at $s=1$ we obtain $\widehat w(1)=\int_0^\infty w(x)\,dx$, and
the trivial zeros contribute
\[ -\sum_{n\ge1}\widehat w(-2n)=-\sum_{n\ge1}\int_0^\infty w(x)x^{-2n}\frac{dx}{x}=-\int_0^\infty w(x)\frac{1}{x^2-1}\frac{dx}{x}. \]
Exercise: make the contour shift rigorous using the rapid decay of $\widehat w$ on vertical
lines and (5.3).

(5.5) Theorem (zero-free region for ζ)


There exists a constant $C>0$ such that for all zeros $\rho=\beta+i\gamma$ of the Riemann
zeta function the inequality
\[ \Re\rho<1-\frac{C}{\log(2+|\gamma|)} \]
holds.

Proof. Since $\zeta$ has a pole at $s=1$, it is zero-free in a neighbourhood of 1, say,
in $|s-1|\le C_0$. We start with an ingenious trick of Hadamard and de la Vallée
Poussin. For all $\alpha\in\mathbb{R}$ we have
\[ 3+4\cos\alpha+\cos(2\alpha)=2(1+\cos\alpha)^2\ge0. \]
Let $s=\sigma+it$ with $\sigma>1$. Then
\[ \Re\Big(-3\frac{\zeta'}{\zeta}(\sigma)-4\frac{\zeta'}{\zeta}(\sigma+it)-\frac{\zeta'}{\zeta}(\sigma+2it)\Big)=\sum_n\frac{\Lambda(n)}{n^\sigma}\big(3+4\cos(t\log n)+\cos(2t\log n)\big)\ge0. \]
Let us now assume $1<\sigma<2$ and $t\gg1$. Then (5.3c) simplifies as
\[ -\frac{\zeta'}{\zeta}(s)=-\sum_{|s-\rho|<1}\frac{1}{s-\rho}+O(\log|s|). \]
Since $\Re(s-\rho)^{-1}>0$, we conclude
\[ -\Re\frac{\zeta'}{\zeta}(s)\le-\Re\frac{1}{s-\rho}+O(\log|s|)\le O(\log|s|) \]
for any zero $\rho$ with $|s-\rho|<1$. Moreover,
\[ -\frac{\zeta'}{\zeta}(\sigma)=\frac{1}{\sigma-1}+O(1). \]
Now fix a zero $\rho$ with $t=\Im\rho\ge C_0/2$. Then
\[ 0\le\Re\Big(-3\frac{\zeta'}{\zeta}(\sigma)-4\frac{\zeta'}{\zeta}(\sigma+it)-\frac{\zeta'}{\zeta}(\sigma+2it)\Big) \]
\[ \le\frac{3}{\sigma-1}+O(1)-\Re\frac{4}{\sigma+it-\Re\rho-it}+O(\log(2+t)) \]
\[ \le\frac{3}{\sigma-1}-\frac{4}{\sigma-\Re\rho}+C_1\log(2+t) \]
for some constant $C_1>0$. Now we choose $\sigma=1+\delta/\log(2+t)$ for some $\delta>0$.
This gives
\[ \Re\rho<1+\frac{\delta}{\log(2+t)}-\frac{4\delta}{(3+C_1\delta)\log(t+2)}=1-\frac{1}{15C_1\log(2+t)} \]
upon choosing $\delta=1/(3C_1)$. (Notice that the inequality $3<4$ is absolutely
crucial!) This shows the theorem for all $\rho$ with $\Im\rho>C_0/2$ and by symmetry
for all $\rho$ with $\Im\rho<-C_0/2$. Together with the region $|s-1|\le C_0$ this covers
everything claimed in the Theorem if $C$ is sufficiently small.
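
Two elementary ingredients of this proof can be checked mechanically: the trigonometric identity $3+4\cos\alpha+\cos2\alpha=2(1+\cos\alpha)^2$, and the arithmetic showing that $\delta=1/(3C_1)$ turns $\delta-4\delta/(3+C_1\delta)$ into $-1/(15C_1)$. The sketch below is an illustration with made-up function names; exact rational arithmetic is used for the second check, and the sample value of $C_1$ is arbitrary.

```python
import math
from fractions import Fraction

def trig_combination(alpha: float) -> float:
    """3 + 4 cos(alpha) + cos(2 alpha), the non-negative combination used in the proof."""
    return 3.0 + 4.0 * math.cos(alpha) + math.cos(2.0 * alpha)

def zero_free_gain(c1: Fraction) -> Fraction:
    """Coefficient of 1/log(2+t) in the bound for Re(rho) - 1 when delta = 1/(3*C1)."""
    delta = Fraction(1) / (3 * c1)
    return delta - 4 * delta / (3 + c1 * delta)

if __name__ == "__main__":
    print(min(trig_combination(0.01 * k) for k in range(629)))
    print(zero_free_gain(Fraction(7, 2)))  # equals -1/(15*C1) for C1 = 7/2
```

Replacing the coefficients 3 and 4 by equal numbers would make the gain non-negative for every $\delta$, which is one way to see why the strict inequality $3<4$ matters.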

(5.6) Main Theorem (Prime Number Theorem)


There exists a constant $c>0$ such that
\[ \psi(x)=\sum_{n\le x}\Lambda(n)=x+O\big(x\exp(-c\sqrt{\log x})\big). \]
Corollary (Exercise): There exists a constant $c>0$ such that
\[ \pi(x)=\sum_{p\le x}1=\int_2^x\frac{dt}{\log t}+O\big(x\exp(-c\sqrt{\log x})\big). \]

Proof. We apply Perron’s formula (4.7) with $c=1+1/\log x$ and some $10<T\le x^{1/2}$
to be chosen later. This gives
\[ \sum_{n\le x}\Lambda(n)=\frac{1}{2\pi i}\int_{c-iT}^{c+iT}-\frac{\zeta'}{\zeta}(s)\frac{x^s}{s}ds+O\Big(\frac xT\sum_n\frac{\log n}{n^{1+1/\log x}}+\log x\,\frac{x\log x}{T}\Big). \]
The error term is $O(x(\log x)^2/T)$. We integrate over the rectangle with vertices
$c\pm iT$, $c'\pm iT$ where
\[ c'=1-\frac{C}{2\log(2+T)} \]
with $C$ as in (5.5). Then (5.3) implies that
\[ \frac{\zeta'}{\zeta}(s)\ll(\log T)^2 \]
on the edges of the rectangle. Hence the integral over the three segments
$[c-iT,c'-iT]$, $[c'-iT,c'+iT]$, $[c'+iT,c+iT]$ is
\[ \ll\frac xT(\log T)^2+(\log T)^3x^{c'}, \]
where the extra factor $\log T$ on the vertical segment comes from integrating $1/|s|$, and we obtain
\[ \sum_{n\le x}\Lambda(n)=x+O\Big(\frac{x(\log x)^2}{T}+(\log T)^3x^{1-C'/\log T}\Big). \]
The proof is completed by choosing $T=\exp(\sqrt{\log x})$.
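
To get a feeling for the statement $\psi(x)=x+O(x\exp(-c\sqrt{\log x}))$, one can compute $\psi(x)$ directly for moderate $x$. The sketch below is an illustration only (the function name is made up): it sieves the primes up to $x$ and adds $\log p$ for every prime power $p^k\le x$.

```python
import math

def chebyshev_psi(x: int) -> float:
    """psi(x) = sum of log p over all prime powers p^k <= x (x >= 1)."""
    is_prime = [True] * (x + 1)
    is_prime[0:2] = [False, False]
    total = 0.0
    for p in range(2, x + 1):
        if not is_prime[p]:
            continue
        for multiple in range(p * p, x + 1, p):
            is_prime[multiple] = False
        power = p
        while power <= x:
            total += math.log(p)  # one contribution Lambda(p^k) = log p per prime power
            power *= p
    return total

if __name__ == "__main__":
    for x in (10**2, 10**3, 10**4, 10**5):
        print(x, chebyshev_psi(x), chebyshev_psi(x) - x)
```

In this range the difference $\psi(x)-x$ is of size comparable to $\sqrt{x}$, consistent with (though of course not a proof of) the behaviour discussed in (5.7).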

(5.7) Theorem
The Riemann hypothesis is true if and only if $\psi(x)=x+O(x^{1/2+\varepsilon})$ for all $\varepsilon>0$.

Proof. “$\Longrightarrow$”: Choose $c'=1/2+\varepsilon$ and $T=x^{1/2}$ in the preceding proof.

“$\Longleftarrow$”: By partial summation we have (initially in $\Re s>1$)
\[ -\frac{\zeta'}{\zeta}(s)=\sum_{n=1}^\infty\Lambda(n)n^{-s}=s\int_1^\infty\psi(x)x^{-s-1}dx. \]
By assumption we can write $\psi(x)=x+R(x)$ with $R(x)=O(x^{1/2+\varepsilon})$, so that
\[ -\frac{\zeta'}{\zeta}(s)=\frac{s}{s-1}+s\int_1^\infty R(x)x^{-s-1}dx. \]
The integral on the right hand side is holomorphic in $\Re s>1/2$, hence $\zeta$ is
zero-free in this region.

(5.8) Corollary
We have
\[ \sum_{p\le x}\frac{\log p}{p}=\log x+c_1+O\big(\exp(-c_0\sqrt{\log x})\big),\qquad\sum_{p\le x}\frac1p=\log\log x+c_2+O\big(\exp(-c_0\sqrt{\log x})\big) \]
for constants $c_1,c_2\in\mathbb{R}$, $c_0>0$.

Remark: With weaker error terms $O(1/\log x)$, this can be proved without the
prime number theorem in an elementary way.

Proof. Partial summation.
(5.9) Corollary
a) We have
\[ \prod_{p\le x}\Big(1-\frac1p\Big)=\frac{C}{\log x}\Big(1+O\big(\exp(-c_0\sqrt{\log x})\big)\Big) \]
for a suitable constant $C\in\mathbb{R}$.
b) We have $\phi(n)\gg n(\log\log n)^{-1}$.
c) We have $\sum_{n\le x}\omega(n)=x\log\log x+c_2x+O(x/\log x)$.

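Proof. We have
\[ \prod_{p\le x}\Big(1-\frac1p\Big)=\exp\Big(\sum_{p\le x}\log\Big(1-\frac1p\Big)\Big)=\exp\Big(-\sum_{p\le x}\frac1p-\sum_{p\le x}\sum_{k\ge2}\frac{1}{kp^k}\Big) \]
\[ =\exp\Big(-\log\log x-c_2+O\big(\exp(-c_0\sqrt{\log x})\big)-\sum_p\sum_{k\ge2}\frac{1}{kp^k}+O\Big(\sum_{p>x}\underbrace{\sum_{k\ge2}\frac{1}{p^k}}_{=(p(p-1))^{-1}}\Big)\Big) \]
\[ =\frac{C}{\log x}\exp\Big(O\big(\exp(-c_0\sqrt{\log x})\big)\Big), \]
and the result follows.
b) Exercise.
c) We have
\[ \sum_{n\le x}\sum_{p\mid n}1=\sum_{p\le x}\Big(\frac xp+O(1)\Big)=x\log\log x+c_2x+O(x/\log x). \]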
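
Part a) can be illustrated numerically. The sketch below is not part of the notes; it uses the classical fact (Mertens' third theorem) that the constant is $C=e^{-\gamma}$ with $\gamma$ the Euler-Mascheroni constant, and the helper names are made up. It compares $\log x\cdot\prod_{p\le x}(1-1/p)$ with $e^{-\gamma}$.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def primes_up_to(x: int) -> list:
    """All primes p <= x via the sieve of Eratosthenes (x >= 1)."""
    sieve = bytearray([1]) * (x + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(x**0.5) + 1):
        if sieve[p]:
            # Zero out p^2, p^2 + p, ... in one slice assignment.
            sieve[p * p :: p] = bytearray(len(range(p * p, x + 1, p)))
    return [n for n in range(x + 1) if sieve[n]]

def mertens_product(x: int) -> float:
    """prod_{p <= x} (1 - 1/p)."""
    product = 1.0
    for p in primes_up_to(x):
        product *= 1.0 - 1.0 / p
    return product

if __name__ == "__main__":
    for x in (10**2, 10**3, 10**4, 10**5):
        print(x, mertens_product(x) * math.log(x), math.exp(-EULER_GAMMA))
```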
