Murphy
Introduction
Remarks:
1. There are several ways of defining elliptic curves. We have chosen the
definition above because it is the most concrete, and requires no further
explanation.
2. An alternative definition is that an elliptic curve over a field k is a
non-singular cubic curve over k containing at least one point defined
over k.
By a cubic curve we mean a curve defined by a cubic polynomial
$ax^3 + bx^2y + cxy^2 + dy^3 + ex^2 + fxy + gy^2 + hx + iy + j = 0$.
We will see in Chapter 2 exactly what is meant by non-singular; but
informally it means that the curve does not cross itself like
$y^2 = x^3 + x^2$,
or have a cusp like
$y^2 = x^3$.
We shall see too that the curve
$y^2 = x^3 + ax^2 + bx + c$
is non-singular precisely when the cubic on the right is separable, ie has
distinct roots.
428–99 1–1
3. The additive group on an elliptic curve is most naturally seen in this
context; if P, Q, R are three points defined over k (ie with coordinates
in k) on the cubic curve then P + Q + R = 0 if and only if P, Q, R are
collinear.
Note one subtle (and important) point about this definition: if P, Q
are two points on the curve defined over k then the line PQ meets the
curve in a third point defined over k. This follows from the fact that if
two of the roots α, β of a cubic polynomial $Ax^3 + Bx^2 + Cx + D$ with
coefficients in k lie in k, then so does the third root γ, since
$\alpha + \beta + \gamma = -B/A$.
It follows that the points defined over k form a group. Since a curve
defined over k is also defined over any extension field K ⊃ k, there is
a group E(K) defined for each such field.
In particular, in the rational case k = Q which specially concerns us
we can consider the groups over Q, R and C, as well as over the p-adic
fields Qp which we shall introduce in Chapter ??. Each of these groups
tells us something about the elliptic curve we are studying.
4. We’ve skated over one difficulty; the line P Q may not meet the curve
again. We have to pass from affine to projective geometry, in effect
adding a line at infinity where P Q can meet the curve in this case. All
this will be detailed in Chapter 2.
5. There is an even more general definition. To every curve there corre-
sponds a non-negative integer g, the genus of the curve. An elliptic
curve over k is a curve of genus 1 over k containing at least one point
defined over k.
(The reason for adding the condition that the curve must contain a
point over k is that the set of points defined over k form an abelian
group, as we have said; and a group, by definition, must be non-empty.)
Lines and conics are curves of genus 0. Such curves are said to be
rational, since the points on the curve can be parametrised by rational
functions, at least if k is algebraically closed. For example, the circle
$x^2 + y^2 = 1$ can be parametrised by
$x = \frac{t^2 - 1}{t^2 + 1}, \quad y = \frac{2t}{t^2 + 1}$.
From this point of view, elliptic curves are the least complicated curves
after the conics studied by the ancient Greeks.
Our earlier definitions of an elliptic curve were set in the plane; but
this definition — an elliptic curve is a curve of genus 1 — extends to
curves in any number of dimensions.
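The rational parametrisation of the circle quoted above can be checked with exact arithmetic. The following is a minimal sketch; the function name `circle_point` is an illustrative choice, not part of the notes.

```python
from fractions import Fraction

def circle_point(t):
    """Point (x, y) given by x = (t^2 - 1)/(t^2 + 1), y = 2t/(t^2 + 1)."""
    t = Fraction(t)
    denom = t * t + 1
    return (t * t - 1) / denom, 2 * t / denom

# Every rational parameter t gives a rational point on x^2 + y^2 = 1.
for t in (0, 1, 2, Fraction(1, 3), -5):
    x, y = circle_point(t)
    assert x * x + y * y == 1
```

Because `Fraction` is exact, the check is an identity test rather than a floating-point approximation.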
$y^2 = x^3 + ax^2 + bx + c$
$y^2 = x^3 + bx + c$.
We shall say that the curve in this case is in Weierstrass reduced form,
or just reduced form.
$y^2 + c_1xy + c_3y = x^3 + c_2x^2 + c_4x + c_6$.
(We shall see in due course the reason for this rather curious numbering
of the coefficients. Note that there is no coefficient $c_5$.)
We shall say that the curve in this case is in Weierstrass general form.
Note that if the characteristic of k is not 2 then we can bring the
equation above to standard form by ‘completing the square’ on the
left:
$(y + c_1x/2 + c_3/2)^2 = x^3 + (c_2 + c_1^2/4)x^2 + (c_4 + c_1c_3/2)x + (c_6 + c_3^2/4)$.
The familiar trigonometric functions cos x, sin x, tan x, etc, are singly
periodic functions f (x) of a real variable:
f (x + 2π) = f (x).
f (z + ω1 ) = f (z), f (z + ω2 ) = f (z).
It turns out (as we shall see in Chapter 8) that all such functions can be
expressed in terms of one such function, Weierstrass’ elliptic function
(This is where the term elliptic comes from; because of this relation
the function ϕ(z) can be used to compute integrals around an ellipse.)
We see from this equation that the points (ϕ(z), ϕ′(z)/2) parametrise
the elliptic curve
$y^2 = x^3 + bx + c$,
where b = B/4, c = C/4, much as (cos t, sin t) parametrises the
circle $x^2 + y^2 = 1$. It turns out that every elliptic curve over C can
be parametrized by a Weierstrass elliptic function in this way; and this
provides a powerful analytical tool for studying elliptic curves.
$p(x) = x^3 + ax^2 + bx + c$
on the right hand side of our equation should be separable, ie should have
distinct roots, it is useful to establish a criterion for this.
Definition 1.2 Suppose the polynomial
$f(x) = x^n + c_1x^{n-1} + \cdots + c_n$
Equivalently,
$D(f) = (-1)^{n(n-1)/2} \prod_{i \ne j} (\alpha_i - \alpha_j)$,
Proposition 1.2 The polynomial f(x) has a multiple root if and only if f(x)
and its derivative f′(x) have a factor in common:
Then
$f'(x) = (x - \alpha)^{r-1} \left( r\,g(x) + (x - \alpha)g'(x) \right)$.
Thus if r > 1,
$(x - \alpha) \mid \gcd(f(x), f'(x))$.
Conversely, suppose this is so. If
$f(x) = (x - \alpha)g(x)$
then
$f'(x) = g(x) + (x - \alpha)g'(x)$
and so
$(x - \alpha) \mid f'(x) \implies (x - \alpha) \mid g(x) \implies (x - \alpha)^2 \mid f(x)$.
◀
As this suggests, the discriminant of a polynomial is closely related to the
resultant of two polynomials, which tells us if those polynomials have a root
in common.
have roots
$\alpha_1, \dots, \alpha_m$ and $\beta_1, \dots, \beta_n$,
respectively. Then the resultant R(f, g) of f and g is defined to be
$R(f, g) = \prod_{1 \le i \le m,\ 1 \le j \le n} (\alpha_i - \beta_j)$.
Proposition 1.3 The polynomials f (x), g(x) have a root in common if and
only if R(f, g) = 0.
Now
f (x) = (x − α1 ) · · · (x − αm ), g(x) = (x − β1 ) · · · (x − βn ).
Thus
R(f, g) = g(α1 )g(α2 ) · · · g(αm ).
Since the expression on the right is symmetric in α1 , . . . , αm , it follows that
R(f, g) can be expressed as a polynomial in the coefficients of f and g.
Proposition 1.4 The resultant R(f, g) can be expressed as an (m + n) ×
(m + n) determinant:
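$R(f, g) = \det \begin{pmatrix}
1 & a_1 & a_2 & \dots & a_m & 0 & \dots & 0 \\
0 & 1 & a_1 & \dots & a_{m-1} & a_m & \dots & 0 \\
\vdots & & & & & & & \vdots \\
0 & 0 & 0 & \dots & \dots & \dots & a_{m-1} & a_m \\
1 & b_1 & b_2 & \dots & b_n & 0 & \dots & 0 \\
0 & 1 & b_1 & \dots & b_{n-1} & b_n & \dots & 0 \\
\vdots & & & & & & & \vdots \\
0 & 0 & 0 & \dots & \dots & \dots & b_{n-1} & b_n
\end{pmatrix}$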
Proof ▶ Let us denote this determinant by S(f, g). Suppose f(x) and g(x)
have a root, say t, in common. Consider the m + n equations
$t^{n-1} f(t) = 0$
$t^{n-2} f(t) = 0$
...
$f(t) = 0$
$t^{m-1} g(t) = 0$
$t^{m-2} g(t) = 0$
...
$g(t) = 0$
αi − βj = 0 (1 ≤ i ≤ m, 1 ≤ j ≤ n)
Let us apply this argument to the polynomials f(x), f′(x). We have seen
that f(x) has a repeated root if D(f) = 0; and we have also seen that f(x)
has a repeated root if R(f, f′) = 0. It is not surprising therefore to find that
there is a relation between these entities.
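Proof ▶ On differentiating
$f(x) = \prod_i (x - \alpha_i)$
and setting $x = \alpha_j$,
$f'(\alpha_j) = \prod_{i \ne j} (\alpha_j - \alpha_i)$.
It follows that
$$\begin{aligned}
R(f, f') &= \prod_j f'(\alpha_j) \\
&= \prod_{i \ne j} (\alpha_j - \alpha_i) \\
&= (-1)^{n(n-1)/2} \prod_{j < i} (\alpha_j - \alpha_i)^2 \\
&= (-1)^{n(n-1)/2} D(f).
\end{aligned}$$
In other words,
$D(f) = (-1)^{n(n-1)/2} R(f, f')$.
◀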
Now we can apply this result to our cubic. First we consider the reduced
case.
$f(x) = x^3 + bx + c$
is
$D(f) = -(4b^3 + 27c^2)$.
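Proof ▶ We have
$f'(x) = 3x^2 + b$,
and so
$$\begin{aligned}
D(f) &= -R(f, f') \\
&= -\det \begin{pmatrix}
1 & 0 & b & c & 0 \\
0 & 1 & 0 & b & c \\
3 & 0 & b & 0 & 0 \\
0 & 3 & 0 & b & 0 \\
0 & 0 & 3 & 0 & b
\end{pmatrix} \\
&= -4b^3 - 27c^2.
\end{aligned}$$
◀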
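The determinant in this proof can be checked numerically. The sketch below expands the 5 × 5 matrix for integer values of b and c and compares the result with $-4b^3 - 27c^2$; the helper names `det` and `discriminant_reduced` are illustrative assumptions, and the Laplace expansion is chosen only because it needs no external library.

```python
def det(matrix):
    """Determinant by Laplace expansion along the first row (fine for 5x5)."""
    if len(matrix) == 1:
        return matrix[0][0]
    total = 0
    for col, entry in enumerate(matrix[0]):
        if entry == 0:
            continue
        minor = [row[:col] + row[col + 1:] for row in matrix[1:]]
        total += (-1) ** col * entry * det(minor)
    return total

def discriminant_reduced(b, c):
    """D(f) = -R(f, f') for f = x^3 + b x + c, using the matrix in the proof."""
    sylvester = [
        [1, 0, b, c, 0],
        [0, 1, 0, b, c],
        [3, 0, b, 0, 0],
        [0, 3, 0, b, 0],
        [0, 0, 3, 0, b],
    ]
    return -det(sylvester)

# Compare with the closed formula on a small grid of integer coefficients.
for b in range(-3, 4):
    for c in range(-3, 4):
        assert discriminant_reduced(b, c) == -(4 * b**3 + 27 * c**2)
```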
It is probably a good idea to remember the discriminant in this reduced
case, but not the more general case we turn to now.
$f(x) = x^3 + ax^2 + bx + c$
is
$D(f) = -4a^3c + a^2b^2 + 18abc - 4b^3 - 27c^2$.
Proof ▶ We could determine this in the same way, by computing the determinant
$D(f) = -\det \begin{pmatrix}
1 & a & b & c & 0 \\
0 & 1 & a & b & c \\
3 & 2a & b & 0 & 0 \\
0 & 3 & 2a & b & 0 \\
0 & 0 & 3 & 2a & b
\end{pmatrix}$.
1.2 Weights
The transformation
$x \mapsto d^2x, \quad y \mapsto d^3y$
leaves our equation in standard form, taking
$y^2 = x^3 + ax^2 + bx + c$
into
$y^2 = x^3 + a'x^2 + b'x + c'$
where
$a' = d^2a, \quad b' = d^4b, \quad c' = d^6c$.
We may say that the terms a, b, c have weights 2, 4, 6 respectively. The
various invariants we shall meet — in particular the discriminant defined
above — are all homogeneous, ie consist of terms of the same weight. This
offers a valuable check on the sometimes complicated formulae we shall
encounter.
In particular, we see that the discriminant is of weight 12. So it could not
contain, for example, a term $a^2b$, since that has weight 8.
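The weight check can be tried out directly. The sketch below uses the standard five-term discriminant of $x^3 + ax^2 + bx + c$ (including the $a^2b^2$ term) and confirms that rescaling a, b, c with weights 2, 4, 6 multiplies it by $d^{12}$; the function names are illustrative assumptions.

```python
def discriminant(a, b, c):
    """Discriminant of x^3 + a x^2 + b x + c (every term has weight 12)."""
    return -4 * a**3 * c + a**2 * b**2 + 18 * a * b * c - 4 * b**3 - 27 * c**2

def rescale(a, b, c, d):
    """Apply the weights 2, 4, 6 to the coefficients a, b, c."""
    return d**2 * a, d**4 * b, d**6 * c

# Homogeneity of weight 12: D(d^2 a, d^4 b, d^6 c) = d^12 D(a, b, c).
for d in (2, 3, -5):
    assert discriminant(*rescale(1, 2, 3, d)) == d**12 * discriminant(1, 2, 3)
```

As a sanity check on the formula itself, the cubic (x − 1)(x − 2)(x − 3) has a = −6, b = 11, c = −6, and its discriminant is the squared product of root differences, which is 4.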
Chapter 1
Introduction
f (x, y) = 0,
$f(x, y) = a_1x^3 + a_2x^2y + a_3xy^2 + a_4y^3 + a_5x^2 + a_6xy + a_7y^2 + a_8x + a_9y + a_{10}$.
Let Γ(k) denote the set of points P = (x, y) ∈ Γ defined over k, ie with
coordinates x, y ∈ k.
Suppose P, Q ∈ Γ(k). Let ℓ be the line PQ if P ≠ Q, or the tangent at
P if P = Q. Then ℓ meets Γ in a third point R ∈ Γ(k).
For if ℓ is the line
y = mx + d
then
$m = \frac{y_2 - y_1}{x_2 - x_1}$
if P ≠ Q; while
$m = -\frac{(\partial f/\partial x)_P}{(\partial f/\partial y)_P}$
if P = Q. In either case,
$m \in k$;
and so also
$d = y_1 - mx_1 \in k$.
But PQ meets Γ where
$u(x) = f(x, mx + d) = 0$.
Now u(x) is a cubic polynomial, say
$u(x) = b_0x^3 + b_1x^2 + b_2x + b_3$,
with coefficients $b_0, b_1, b_2, b_3 \in k$.
If the roots of this equation are $x_1, x_2, x_3$ then
$x_1 + x_2 + x_3 = -\frac{b_1}{b_0} \in k$.
Thus
$x_1, x_2 \in k \implies x_3 = -(x_1 + x_2 + b_1/b_0) \in k$.
Since
$y_3 = mx_3 + d \in k$,
it follows that
$R = (x_3, y_3) \in \Gamma(k)$,
as we claimed.
We set
R = P ∗ Q.
Evidently this binary operation is commutative:
Q ∗ P = P ∗ Q. (∗1)
Moreover, the relation between P, Q, R is symmetric:
R = P ∗ Q =⇒ P = Q ∗ R =⇒ Q = R ∗ P.
In other words,
P ∗ (P ∗ Q) = Q. (∗2)
It follows from this that
P ∗ Q = P ∗ R ⇐⇒ Q = R. (∗3)
We have skated round two problems in the discussion above:
1. The line PQ may not meet the curve Γ again, since the coefficient of $x^3$
in the polynomial u(x) may vanish, leaving a quadratic with the two
solutions $x_1, x_2$.
For example, consider the curve
$x^2 = y^3 + 1$.
The points P = (3, 2), Q = (−3, 2) lie on this curve; but the line
y = 2
1.2 Addition
The operation ∗ is not associative. For if it were it would follow from (∗2)
that if S = P ∗ P then
S ∗ Q = (P ∗ P ) ∗ Q = P ∗ (P ∗ Q) = Q
P + Q = O ∗ (P ∗ Q)
P + (Q + R) = (P + Q) + R
for all P, Q, R ∈ Γ. That is far from obvious.
It is clear however that O is a neutral (or zero) element with respect to
this operation:
O + P = O ∗ (O ∗ P ) = P,
by (∗2). Moreover, if we set
S =O∗O
then the point
P′ = S ∗ P
is the additive inverse of P. For
P′ ∗ P = (S ∗ P) ∗ P = S
and so
P′ + P = O ∗ S = O ∗ (O ∗ O) = O.
Thus we may write
−P = S ∗ P.
It follows that if the operation is associative then it defines an abelian
group on Γ(k).
It might seem surprising that we can choose any point O ∈ Γ as the
neutral (or zero) point. However, that is not really so. For if we have an
abelian group structure on a set A then we can take any element a ∈ A and
define a new abelian group structure on A by the operation
x † y = x + y − a.
(x † y) † z = x + y + z − 2a = x † (y † z).
Moreover
x † a = x + a − a = x,
so the element a is the new zero element; and if we set
x′ = −x + 2a
then
x † x′ = x + (−x + 2a) − a = a,
ie x′ is the inverse of x with respect to the new operation.
In effect, all that we have done is to ‘move the origin’ from 0 to a, through
the transformation
x ↦ x − a.
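The 'moved origin' construction is easy to test on a small example. The sketch below takes A to be the cyclic group Z/12 and a = 5 (both arbitrary choices made for illustration) and checks that x † y = x + y − a is associative, has a as its zero, and has −x + 2a as the inverse of x.

```python
N = 12  # the underlying group is Z/12 (an arbitrary illustrative choice)
A = 5   # the element a chosen as the new zero

def dagger(x, y):
    """The shifted operation x † y = x + y - a, computed in Z/N."""
    return (x + y - A) % N

def dagger_inverse(x):
    """The inverse of x for the shifted operation, namely -x + 2a."""
    return (-x + 2 * A) % N

elements = range(N)
for x in elements:
    assert dagger(x, A) == x                  # a is the new zero element
    assert dagger(x, dagger_inverse(x)) == A  # -x + 2a is the inverse of x
    for y in elements:
        for z in elements:
            assert dagger(dagger(x, y), z) == dagger(x, dagger(y, z))
```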
1.3 The choice of O
Recall that we can choose any point O ∈ Γ(k) as the zero point of our group.
What is the best choice?
We saw that
−P = S ∗ P,
where S = O ∗ O (ie S is the point where the tangent at O meets Γ again).
It turns out that life is much simpler if we can choose O so that S = O,
ie
O ∗ O = O.
That is, the tangent at O meets Γ in three coincident points: O, O, O. In
other words, O is a point of inflexion (or flex ) on Γ.
For then, as we have seen,
−P = O ∗ P.
P + Q + R = 0 ⇐⇒ P, Q, R are collinear.
R = P ∗ Q =⇒ O ∗ R = O ∗ (P ∗ Q)
=⇒ −R = P + Q
=⇒ P + Q + R = 0.
Conversely, if P + Q + R = 0 then
P + Q + R = 0 =⇒ −R = P + Q
=⇒ O ∗ R = O ∗ (P ∗ Q)
=⇒ R = P ∗ Q
=⇒ P, Q, R collinear
Γ : $x^3 + 2y^3 = 4$
$X^3 + 2Y^3 = 4Z^3$.
1.4 Associativity
There are several ways of showing that our addition is associative. But since
we defined addition geometrically, it is appropriate to give a geometric proof
of associativity. For the moment, we merely sketch the proof; we shall fill in
the details in Chapter 3.
We want to show that
P + (Q + R) = (P + Q) + R
ie
Since
O ∗ X = O ∗ Y ⇐⇒ X = Y,
we can ‘hive off’ the last O∗; it is sufficient to show that
P ∗ (Q + R) = (P + Q) ∗ R.
(X ∗ Y ) ∗ (Z ∗ T ) = (X ∗ Z) ∗ (Y ∗ T ). (∗4)
To see that this follows from the associative law, note first that it is sufficient
to prove the result in any extension of the ground field k; so we may assume
that k is algebraically closed. In that case we can certainly find a point of
inflexion O ∈ E; and on taking this as our zero point,
X ∗ Y = O ∗ (X + Y ) = −(X + Y ).
Thus
(X ∗ Y ) ∗ (Z ∗ T ) = X + Y + Z + T = (X ∗ Z) ∗ (Y ∗ T ).
(P + Q) ∗ R = (Q + R) ∗ P.
Γ : $a_1x^3 + a_2x^2y + a_3xy^2 + a_4y^3 + a_5x^2 + a_6xy + a_7y^2 + a_8x + a_9y + a_{10} = 0$.
If we choose these two points on the line ℓ = P1P2, say, then this line
will meet Γ in 4 points, and so will lie wholly in Γ, which must therefore be
degenerate:
Γ = ℓC,
where C is a conic.
But this conic C must pass through the 6 points P3 , P4 , P5 , P6 , P7 , P8 .
Now a general conic is defined by 6 coefficients:
C : b1 x2 + b2 xy + b3 y 2 + b4 x + b5 y + b6 = 0.
C = µ1 C1 + µ2 C2 ;
and we could find a conic in this pencil passing through any further point
R. But now if we choose R on ℓ = Q1Q2, say, then the line ℓ meets C in 3
points, and so lies wholly in C. Thus C is degenerate:
C = ℓm,
Γ = ℓC.
Thus all the cubics in our pencil must be degenerate. But that is impossible,
since we supposed that there was a non-degenerate cubic (the elliptic curve
E) passing through the 8 points.
We have shown, therefore, that d = 2, ie the pencil of cubics through the
8 points takes the form
Γ = λ1 Γ1 + λ2 Γ2 .
Now Γ1 and Γ2 meet in at most 9 points. For on eliminating y say from
the equations for Γ1 and Γ2 we obtain a polynomial equation of degree 9 in x,
to which the x-coordinates of P1, . . . , P8 provide 8 solutions. It follows that
there is a 9th solution, giving a 9th common point P9 on Γ1 and Γ2 . (It also
follows — although we make no use of this — that if P1 , . . . , P8 ∈ Γ(k) then
P9 ∈ Γ(k), by the same argument we used to show that if P, Q ∈ Γ(k) then
P ∗ Q ∈ Γ(k).)
We have proved (more-or-less) the remarkable result that given any 8
points P1 , . . . , P8 (no 3 of which are collinear) there exists a unique 9th
point P9 with the property that every cubic Γ through P1 , . . . , P8 also passes
through P9 .
To prove the associative law, we apply this result to the 8 points
X, Y, Z, T, X ∗ Y, X ∗ Z, Y ∗ T, Z ∗ T.
These points all lie on the elliptic curve E, of course, and they also lie on 2
sets of 3 lines, as follows
        ℓ        m        n
ℓ′      X        Y        X ∗ Y
m′      Z        T        Z ∗ T
n′      X ∗ Z    Y ∗ T
E, ℓmn, ℓ′m′n′.
Each of these passes through the 8 points, and so belongs to the pencil defined
by those points. Hence
E = λ ℓmn + λ′ ℓ′m′n′
for some λ, λ′ ∈ k.
Moreover, E and ℓmn meet in the further point
(X ∗ Y) ∗ (Z ∗ T) ∈ E ∩ ℓmn;
(X ∗ Z) ∗ (Y ∗ T) ∈ E ∩ ℓ′m′n′;
(X ∗ Y ) ∗ (Z ∗ T ) = (X ∗ Z) ∗ (Y ∗ T ).
This establishes the identity (∗4), and so the associativity of our addition.
Chapter 2
$v = \rho u \quad (\rho \in k^\times)$.
dim PV = dim V − 1.
Each point of $P^n(k)$ is represented by a set of n + 1 homogeneous coordinates
$[X_1, \dots, X_n, X_{n+1}]$.
$[X_1, \dots, X_n, 0]$
We identify the affine plane $k^2$ with the subset Z ≠ 0 of $P^2(k)$, by the map
aX + bY + cZ = 0
aX + bY + cZ = 0
defines a line in the projective plane $P^2(k)$. Each such line except for the line
at infinity Z = 0 intersects the affine subspace $k^2 \subset P^2(k)$ in an affine line.
Any 2 distinct projective lines
aX + bY + cZ = 0, a′X + b′Y + c′Z = 0
intersect in a point; while any 2 distinct points in P2 (k) define a unique pro-
jective line. This perfect duality between points and lines (or in n dimensions,
between points and (n − 1)-dimensional subspaces) is a minor advantage of
projective geometry.
Two affine lines are parallel if and only if the corresponding projective
lines meet on the line at infinity.
t:V →V
induces a map
t̄ : PV → PV,
where PV is the corresponding projective space. Such a map is called a
projective transformation.
Two linear maps t, ρt ($\rho \in k^\times$) define the same projective transformation.
Thus the projective transformations form the projective group
In particular
$PGL(n, k) = GL(n + 1, k)/k^\times$.
If P1 , P2 , P3 , P4 are 4 points in the projective plane, no 3 of which are
collinear, and Q1, Q2, Q3, Q4 is a second similar set, then there is a unique
projective transformation sending
P1 ↦ Q1, P2 ↦ Q2, P3 ↦ Q3, P4 ↦ Q4.
Pi = [Xi , Yi , Zi ] (i = 1, 2, 3, 4)
then
for some ai ∈ k; and the ai are all non-zero since no 3 of the points are
collinear. But now we can take ai [Xi , Yi , Zi ] to represent Pi ; and then
$P_4 = P_1 + P_2 + P_3$.
$Q_4 = Q_1 + Q_2 + Q_3$.
$P = \lambda_1P_1 + \lambda_2P_2 + \lambda_3P_3$.
$P \mapsto Q = \lambda_1Q_1 + \lambda_2Q_2 + \lambda_3Q_3$.
F (x, y) = 0.
Thus it makes sense to speak of the points in projective space $P^n(k)$ satisfying
the equation $P(X_1, \dots, X_n, X_{n+1}) = 0$ if (and only if) P is homogeneous.
If $p(x_1, \dots, x_n)$ is a polynomial on $k^n$ of degree d then the corresponding
homogeneous polynomial is
$p(x, y) = y^2 - x^3 - ax^2 - bx - c$
of degree 3 is
$P(X, Y, Z) = Y^2Z - X^3 - aX^2Z - bXZ^2 - cZ^3$.
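The relation between the affine cubic p and its homogeneous form P can be tested on sample values. The following sketch (with the coefficients passed as explicit arguments, a choice made here for illustration) checks that P restricts to p on Z = 1 and that P is homogeneous of degree 3.

```python
def p(x, y, a, b, c):
    """Affine polynomial y^2 - x^3 - a x^2 - b x - c."""
    return y**2 - x**3 - a * x**2 - b * x - c

def P(X, Y, Z, a, b, c):
    """Homogeneous form Y^2 Z - X^3 - a X^2 Z - b X Z^2 - c Z^3."""
    return Y**2 * Z - X**3 - a * X**2 * Z - b * X * Z**2 - c * Z**3

a, b, c = 2, -3, 5
for (x, y) in [(0, 0), (1, 2), (-3, 4), (7, -1)]:
    # Setting Z = 1 recovers the affine polynomial.
    assert P(x, y, 1, a, b, c) == p(x, y, a, b, c)
    # Scaling all three coordinates by t scales P by t^3.
    for t in (2, -3):
        assert P(t * x, t * y, t, a, b, c) == t**3 * P(x, y, 1, a, b, c)
```

Homogeneity is what makes the vanishing of P independent of the choice of homogeneous coordinates for a point.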
f (x, y) = 0.
$\frac{\partial F}{\partial X} X + \frac{\partial F}{\partial Y} Y + \frac{\partial F}{\partial Z} Z = 0$,
where the partial differential coefficients are computed at the point $[X_0, Y_0, Z_0]$.
Let us verify that this is indeed the projective line corresponding to the
usual tangent, if P is a point in the affine plane.
First note an important identity satisfied by the partial differential
coefficients of a homogeneous polynomial F(x, y, z). If F is of degree d then
$x \frac{\partial F}{\partial x} + y \frac{\partial F}{\partial y} + z \frac{\partial F}{\partial z} = d\,F$.
This tangent is defined unless
$\frac{\partial F}{\partial X} = \frac{\partial F}{\partial Y} = \frac{\partial F}{\partial Z} = 0$.
In this case we say that P is a singular point on the curve. A curve is said to
be non-singular if it contains no singular points, either in k or in any extension
field of k.
We say that the curve F(X, Y, Z) = 0 is degenerate if the polynomial F
factorises:
F (X, Y, Z) = G(X, Y, Z)H(X, Y, Z).
A degenerate curve is always singular. For the points where the constituents
meet,
G(X, Y, Z) = H(X, Y, Z) = 0,
are necessarily singular.
$y^2 + c_1xy + c_3y = x^3 + c_2x^2 + c_4x + c_6 \quad (c_1, c_2, c_3, c_4, c_6 \in k)$
$y^2 + c_1xy + c_3y = x^3 + c_2x^2 + c_4x + c_6$
meets the line at infinity in just one point, [0, 1, 0]. This is a point of
inflection on the curve, and is non-singular.
Proof ▶ The homogeneous form of the curve in this case is
$F(X, Y, Z) = Y^2Z + c_1XYZ + c_3YZ^2 - X^3 - c_2X^2Z - c_4XZ^2 - c_6Z^3 = 0$.
This meets the line at infinity Z = 0 where
$X^3 = 0$,
ie thrice at the point [0, 1, 0], which is thus a point of inflection. To see that
this point is non-singular, note that
$\frac{\partial F}{\partial Z} = Y^2 + c_1XY + 2c_3YZ - c_2X^2 - 2c_4XZ - 3c_6Z^2 = 1$
at [0, 1, 0], since all the terms except the first vanish. ◀
Now suppose char(k) = 2. We shall establish a condition on the coeffi-
cients ci for non-singularity.
We have seen that the point [0, 1, 0] on the line at infinity is non-singular.
So any singular point is in the affine plane.
In characteristic 2, −1 = 1, 2 = 0, 3 = 1, etc; so we have
$\frac{\partial F}{\partial X} = c_1YZ + X^2 + c_4Z^2$,
$\frac{\partial F}{\partial Y} = c_1XZ + c_3Z^2 = Z(c_1X + c_3Z)$,
$\frac{\partial F}{\partial Z} = Y^2 + c_1XY + c_2X^2 + c_6Z^2$.
We may not be able to solve these equations in k, but we can always solve
them in an extension of k, for example in its algebraic closure k̄. Thus we
have established what we said earlier; the curve
$y^2 = x^3 + ax^2 + bx + c$
x = c3 /c1 .
$y^2 + c_1xy + c_3y = x^3 + c_2x^2 + c_4x + c_6$
2.7 The discriminant of an elliptic curve
We have established two conditions for non-singularity: the condition above
when char(k) = 2, and the condition that if char(k) ≠ 2 then the curve
$y^2 = x^3 + ax^2 + bx + c$
$y^2 + c_1xy + c_3y = x^3 + c_2x^2 + c_4x + c_6$
$y^2 = x^3 + ax^2 + bx + c$
with
$a = c_2 + c_1^2/4, \quad b = c_4 + c_1c_3/2, \quad c = c_6 + c_3^2/4$.
We know that the curve is non-singular in this case if
∆(c1 , c2 , c3 , c4 , c6 ).
coefficients. To see that this is so, consider
$$\begin{aligned}
2^6\Delta = {} & -2^8 (c_2 + c_1^2/4)^3 (c_6 + c_3^2/4) \\
& + 2^6 (c_2 + c_1^2/4)^2 (c_4 + c_1c_3/2)^2 \\
& + 2^7 3^2 (c_2 + c_1^2/4)(c_4 + c_1c_3/2)(c_6 + c_3^2/4) \\
& - 2^8 (c_4 + c_1c_3/2)^3 \\
& - 2^6 3^3 (c_6 + c_3^2/4)^2 \\
= {} & -(4c_2 + c_1^2)^3 (4c_6 + c_3^2) \\
& + (4c_2 + c_1^2)^2 (2c_4 + c_1c_3)^2 \\
& + 2^2 3^2 (4c_2 + c_1^2)(2c_4 + c_1c_3)(4c_6 + c_3^2) \\
& - 2^5 (2c_4 + c_1c_3)^3 \\
& - 2^2 3^3 (4c_6 + c_3^2)^2.
\end{aligned}$$
Working modulo 4,
$2^6\Delta \equiv -c_1^6c_3^2 + c_1^6c_3^2 \equiv 0 \bmod 4$.
Thus $2^4\Delta$ is a polynomial with integral coefficients.
Thus in characteristic 2
$D(E) = 2^4\Delta = c_1^6c_6 + c_1^5c_3c_4 + c_1^4c_2c_3^2 + c_1^4c_4^2 + c_1^3c_3^3 + c_3^4$,
which is exactly the polynomial which we showed vanished if and only if the
curve is singular. ◀
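Both claims in this argument can be spot-checked with integer arithmetic: that $2^6\Delta$ is divisible by 4, and that $2^4\Delta$ reduces modulo 2 to the characteristic-2 polynomial. The sketch below evaluates the second form of $2^6\Delta$ at random integer coefficients; the function names and the choice of random sampling are assumptions made for illustration, and a numerical check of this kind is evidence rather than a proof.

```python
import random

def delta_times_64(c1, c2, c3, c4, c6):
    """2^6 * Delta, written in terms of 4a, 2b, 4c as in the second expansion."""
    A = 4 * c2 + c1**2      # 4a
    B = 2 * c4 + c1 * c3    # 2b
    C = 4 * c6 + c3**2      # 4c
    return -A**3 * C + A**2 * B**2 + 36 * A * B * C - 32 * B**3 - 108 * C**2

def char2_polynomial(c1, c2, c3, c4, c6):
    """The polynomial claimed to equal 2^4 * Delta in characteristic 2."""
    return (c1**6 * c6 + c1**5 * c3 * c4 + c1**4 * c2 * c3**2
            + c1**4 * c4**2 + c1**3 * c3**3 + c3**4)

rng = random.Random(0)
for _ in range(200):
    cs = [rng.randint(-20, 20) for _ in range(5)]
    value = delta_times_64(*cs)
    assert value % 4 == 0                                  # 2^4 * Delta is integral
    assert (value // 4 - char2_polynomial(*cs)) % 2 == 0   # agreement mod 2
```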
P ∈ Γ1 ∩ Γ2 .
aX + bY + cZ = 0;
which is a homogeneous equation of degree d in u, v.
If now $P = (u_0, v_0) \in \Lambda \cap \Gamma$ then $uv_0 - vu_0$ is a factor of H(u, v). We
define the intersection number I(P; Λ, Γ) to be the multiplicity of this factor
in H(u, v).
It is readily verified that this number is independent of the choice of
points P1 , P2 ∈ Λ.
If the ground field k is algebraically closed then H(u, v) factorises com-
pletely into linear factors; and it follows that the sum of the intersection
numbers is equal to the degree:
$\sum_{P \in \Lambda \cap \Gamma} I(P; \Lambda, \Gamma) = \deg \Gamma$.
where G is of degree d−1. In this case the intersection numbers are undefined.
I(P ; Λ, Γ) ≥ 2.
the coefficient of $u^{d-1}v$ is 0; while the coefficient of $u^d$ is 0 since F(P) = 0.
Thus H has a double zero at v = 0, ie
I(P ; Λ, Γ) ≥ 2.
Remarks:
$\min_{\Lambda \ni P} I(P; \Lambda, \Gamma) \ge 2$.
Proof ▶ We may assume that the field k we are working over is infinite; for
otherwise we can pass to an infinite extension of k (for example, the algebraic
closure k̄ of k, or the field k(t) of rational functions over k).
Let the curves be given by the homogeneous equations
F1 (X, Y, Z) = 0, F2 (X, Y, Z) = 0,
of degrees n1 , n2 .
Suppose the curves have n1 n2 + 1 points in common, say
P0 , P1 , . . . , Pn1 n2 .
We can find a line aX + bY + cZ = 0 not passing through any of these points; and
we can take this line as the line at infinity. Thus we may assume that the
n1 n2 + 1 points are all in the affine plane k 2 . In this way we can reduce the
problem to the affine case, in which the curves are given by affine equations
f1 (x, y) = 0, f2 (x, y) = 0,
where f1 , f2 are non-homogeneous polynomials of degrees ≤ n1 , n2 .
By making a further change of coordinates, if necessary, we may assume
that the n1 n2 + 1 points
Pi = (xi , yi )
have distinct x-coordinates and distinct y-coordinates.
Now let us regard f1 , f2 as polynomials in y, and let us compute their
resultant R(f1 , f2 ). This is a polynomial of degree ≤ n1 n2 in x.
For each xi the polynomials f1 (xi , y), f2 (xi , y) have a factor y − yi in
common. It follows that the resultant R(x) must vanish for these values of
x. Thus R(x) has more roots than its degree, and so must vanish identically.
But that implies that the polynomials f1 (x, y), f2 (x, y) have a factor in
common, say
$f_1(x, y) = m(x, y)g_1(x, y), \quad f_2(x, y) = m(x, y)g_2(x, y)$.
$F_1(X, Y, Z) = M(X, Y, Z)G_1(X, Y, Z), \quad F_2(X, Y, Z) = M(X, Y, Z)G_2(X, Y, Z)$.
Remarks:
1. If the curves have a factor in common, and if the field we are working
over is infinite, then of course the curves have an infinity of points in
common.
Proof ▶ When we eliminate Z say as above (in the proof of Bezout’s Theorem)
we are left with a homogeneous polynomial over k of degree $n_1n_2$ in
X, Y. We know that this polynomial has $(n_1n_2 - 1)$ roots in k. It follows
that the last root is also in k, by the homogeneous analogue of the fact that
the sum of the roots of the polynomial
$t^d + a_1t^{d-1} + \cdots + a_d = 0$
is equal to $-a_1$. ◀
In effect we have used a particular case of this result (with n1 = 1, n2 = 3)
in our assertion that if P, Q ∈ E then P ∗ Q ∈ E; the line P Q meets E in two
points over k, so it meets E in a third point over k.
$y^2 + c_1xy + c_3y = x^3 + c_2x^2 + c_4x + c_6$,
or in homogeneous form,
$F(X, Y, Z) = Y^2Z + c_1XYZ + c_3YZ^2 - X^3 - c_2X^2Z - c_4XZ^2 - c_6Z^3 = 0$.
$X^3 = 0$,
ie thrice at the point [0, 1, 0]. Thus the line at infinity is the tangent to the
curve at [0, 1, 0]; but more than that, [0, 1, 0] is a point of inflection.
Γ : F (X, Y, Z) = 0
I(P ; Λ, Γ) ≥ 3.
Γ : F (X, Y, Z) = 0,
where F (X, Y, Z) is a homogeneous polynomial of degree ≥ 2. Then P is a
point of inflection on Γ if and only if it satisfies the hessian equation
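$H(X, Y, Z) \equiv \det \begin{pmatrix}
\frac{\partial^2 F}{\partial X^2} & \frac{\partial^2 F}{\partial X \partial Y} & \frac{\partial^2 F}{\partial X \partial Z} \\
\frac{\partial^2 F}{\partial X \partial Y} & \frac{\partial^2 F}{\partial Y^2} & \frac{\partial^2 F}{\partial Y \partial Z} \\
\frac{\partial^2 F}{\partial X \partial Z} & \frac{\partial^2 F}{\partial Y \partial Z} & \frac{\partial^2 F}{\partial Z^2}
\end{pmatrix} = 0$.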
uP + vQ = [uX + vX′, uY + vY′, uZ + vZ′].
Remark: It would be surprising if this were not so; for in that case we would
have defined in an intrinsic way a second line passing through any point P
of a curve. One might think of the normal to the curve at P . But angle is
not a projective invariant, so this would not make sense.
Proof of Lemma ▷ To avoid confusion, let us for a moment set $P = [X_0, Y_0, Z_0]$.
Then the tangent to M (X, Y, Z) = 0 at P is
L(X, Y, Z) = 0 =⇒ M (X, Y, Z) = 0.
Lemma 2 The conic
det A = 0,
where
$A = \begin{pmatrix} a & h & g \\ h & b & f \\ g & f & c \end{pmatrix}$.
Let the lines L1 = 0, L2 = 0 meet in the point (X0 , Y0 , Z0 ). Then the tangent
at (X0 , Y0 , Z0 ) is undefined. Thus
Av0 = 0,
where
$v_0 = \begin{pmatrix} X_0 \\ Y_0 \\ Z_0 \end{pmatrix}$.
Hence A is singular, ie det A = 0.
Conversely, suppose det A = 0. Then we can find X0 , Y0 , Z0 satisfying
the equation Av0 = 0. It follows that the tangent to Γ at any point P passes
through P0 = [X0 , Y0 , Z0 ]. But now take any point P . The tangent at P cuts
the conic C(X, Y, Z) = 0 twice at P and at P0 . But a line can only cut a
conic twice. It follows that the line P0 P lies wholly in the conic, which must
thus degenerate into 2 lines. ◁
Putting this together, if P is a flex, then the conic M (X, Y, Z) = 0 is
degenerate and so H(X, Y, Z) = 0.
Conversely, if H(X, Y, Z) = 0 then M (X, Y, Z) = 0 is degenerate. Since
the tangent to this conic at P is L(X, Y, Z) = 0, this line must be one of the
lines making up the conic:
As we saw, the point [0, 1, 0] is a flex on an elliptic curve given by Weier-
strass’ equation. We shall always take this point as the zero element O of
the group on the curve. The other flexes are just the points of order 3 in the
group. Thus flexes play an important rôle in the theory.
The hessian curve of a cubic is itself a cubic. But 2 cubics meet in at most
9 points — as may be seen by considering the resultant of the 2 polynomials,
which is a homogeneous polynomial of degree 9 in 2 variables. It follows that
an elliptic curve has at most 9 flexes.
We shall see that an elliptic curve over the reals R has at most 3 flexes;
and the same is therefore true of an elliptic curve over the rationals Q (which
is our main focus of interest).
Severi (1879–1961) and the Italian School studied general algebraic va-
rieties, that is, the points satisfying a set of polynomial equations.
André Weil (1906–1998) In his seminal work, The Foundations of Alge-
braic Geometry, Weil provided a secure foundation for the work of the
Italian school, and extended it to varieties over finite and other fields,
not just C.
Chapter 3
P + Q + R = 0 ⇐⇒ P, Q, R are collinear.
But as we shall see, this is not quite sufficient to define the group structure.
Also, since this is the basis for the entire theory of elliptic curves we need to
ensure that we are on a firm foundation.
Proposition 3.1 Suppose P, Q are points on the elliptic curve E(k). Then
the line P Q (or the tangent at P if P = Q) meets E(k) again at a unique
point R.
lX + mY + nZ = 0.
$a_0X^3 + a_1X^2Y + a_2XY^2 + a_3Y^3 = 0$.
Remark: When we speak of a root of a homogeneous polynomial in X, Y we
mean of course the ratio X0 : Y0 ; and when we say that the root is in k we
mean that we can find X0 , Y0 ∈ k in this ratio.
The proposition that if n − 1 roots of a polynomial p(x) ∈ k[x] lie in k
then so does the nth root carries over unchanged to the homogeneous case.
since the tangent at O meets E again at O, as O is a point of inflection.
However, it is far from evident that the operation is associative:
(P + Q) + R = P + (Q + R)?
We shall prove this important result in the next Chapter. But for the moment
we shall assume that it is true, and look at some concrete examples of the
group on an elliptic curve.
First though, let us get an explicit expression for −P when
P = (x0 , y0 ) = [x0 , y0 , 1].
The line OP is
X − x0 Z = 0,
since this certainly goes through P and O = [0, 1, 0].
In affine terms this is the line
x = x0 ,
ie the line through P parallel to the y-axis.
Suppose the elliptic curve is in standard form
$y^2 = x^3 + ax^2 + bx + c$.
In this case the line x = x0 meets the curve again at the point (x0 , −y0 ).
Thus if the elliptic curve is given in standard form then
−(x, y) = (x, −y).
ie
y1 = −y0 − c1 x0 − c3 .
Thus
−(x, y) = (x, −y − c1 x − c3 ).
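The negation formula can be checked on a concrete curve. The sketch below uses the curve $y^2 - y = x^3 - x$ (so $c_1 = 0$, $c_3 = -1$, $c_2 = 0$, $c_4 = -1$, $c_6 = 0$) and confirms that the negative of a point on the curve again lies on the curve; the helper names `on_curve` and `negate` are illustrative assumptions.

```python
def on_curve(point, c1, c2, c3, c4, c6):
    """Test whether (x, y) satisfies y^2 + c1 xy + c3 y = x^3 + c2 x^2 + c4 x + c6."""
    x, y = point
    return y * y + c1 * x * y + c3 * y == x**3 + c2 * x * x + c4 * x + c6

def negate(point, c1, c3):
    """-(x, y) = (x, -y - c1 x - c3): the other point on the vertical line through (x, y)."""
    x, y = point
    return (x, -y - c1 * x - c3)

coeffs = dict(c1=0, c2=0, c3=-1, c4=-1, c6=0)   # the curve y^2 - y = x^3 - x
P = (2, 3)
assert on_curve(P, **coeffs)
minus_P = negate(P, coeffs["c1"], coeffs["c3"])
assert minus_P == (2, -2)
assert on_curve(minus_P, **coeffs)
assert negate(minus_P, coeffs["c1"], coeffs["c3"]) == P   # negating twice returns P
```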
3.2 Examples
1. Consider the curve
E(Q) : $y^2 = x^3 + 1$
over the rationals Q. There are 5 obvious points on this curve:
P = (−1, 0), Q = (0, 1), −Q = (0, −1), R = (2, 3), −R = (2, −3).
y = mx + c.
The slope m is
$m = \frac{1 - 0}{0 - (-1)} = 1$.
Thus P Q is the line
y = x + 1.
$(x + 1)^2 = x^3 + 1$.
$-1 + 0 + x_2 = 1$,
ie
$x_2 = 2$.
Thus $y_2 = x_2 + 1 = 3$, ie
P ∗ Q = (2, 3) = R.
It follows that
P + Q = −R = (2, −3).
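The chord computation above can be mechanised. The following is a minimal sketch for curves in standard form $y^2 = x^3 + ax^2 + bx + c$, with O taken as the point at infinity; the function names `star` and `add`, and the use of `None` to stand for O, are conventions chosen here for illustration. The points are assumed to lie on the curve, which is why c does not appear.

```python
from fractions import Fraction

def star(P, Q, a, b):
    """Third intersection P * Q of the chord PQ (or the tangent, if P == Q)
    with y^2 = x^3 + a x^2 + b x + c. Returns None when the line is vertical,
    so that the third point is the point at infinity."""
    (x1, y1), (x2, y2) = P, Q
    if P == Q:
        if y1 == 0:
            return None
        m = Fraction(3 * x1 * x1 + 2 * a * x1 + b, 2 * y1)  # slope of the tangent
    else:
        if x1 == x2:
            return None
        m = Fraction(y2 - y1, x2 - x1)                      # slope of the chord
    d = y1 - m * x1
    # Substituting y = m x + d gives x^3 + (a - m^2) x^2 + ... = 0,
    # so the three roots sum to m^2 - a.
    x3 = m * m - a - x1 - x2
    return (x3, m * x3 + d)

def add(P, Q, a, b):
    """P + Q = -(P * Q), using -(x, y) = (x, -y) in standard form."""
    R = star(P, Q, a, b)
    return None if R is None else (R[0], -R[1])

# The curve y^2 = x^3 + 1 has a = b = 0.
P, Q, R = (-1, 0), (0, 1), (2, 3)
assert star(P, Q, 0, 0) == R
assert add(P, Q, 0, 0) == (2, -3)
```

The same function handles the tangent case, so it can also be used to follow the doubling computations for R and Q later in this example.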
ie
$\frac{dy}{dx} = \frac{3x^2}{2y}$.
In particular the slope at R is
$m = \frac{12}{6} = 2$;
6
and so the tangent at R is
y = 2x − 1.
$(2x + c)^2 = x^3 + 1$, with c = −1.
Two of the roots of this are 2, 2 from R (twice). Thus if the other root
is $x_2$ then (from the coefficient of $x^2$)
$2 + 2 + x_2 = 2^2$,
ie
$x_2 = 0$.
Thus $y_2 = 2x_2 - 1 = -1$, ie
R ∗ R = (0, −1) = −Q,
and so
2R = Q.
Note that
−P = (−1, 0) = P,
ie
2P = 0;
In fact it is clear that the point P = (x0 , y0 ) on the curve
$y^2 = x^3 + ax^2 + bx + c$
$y^2 + c_1xy + c_3y = x^3 + c_2x^2 + c_4x + c_6$.
$-P = (x_0, -y_0 - c_1x_0 - c_3)$.
$2y_0 + c_1x_0 + c_3 = 0$,
$2y + c_1x + c_3 = 0$.
In either case, the line meets the curve in 0, 1 or 3 points. Thus there
are either 0, 1 or 3 points of order 2 on an elliptic curve.
Finally, let us determine 2Q. The slope at Q is
$m = \frac{0}{2} = 0$.
Thus the tangent at Q is y = 1. If this meets the curve again at $(x_2, y_2)$
then
$0 + 0 + x_2 = 0^2$,
ie
$x_2 = 0$.
Hence
Q ∗ Q = Q,
ie
2Q = −Q,
ie
3Q = 0.
$y^2 = x^3 + 1$
except (x, y) = (−1, 0), (0, ±1), (2, ±3). However, this will require con-
siderable apparatus to establish.
The group on the elliptic curve in this case is finite. There is no known
algorithm to determine whether the group on a general elliptic curve
over Q is finite or infinite. There are techniques which are likely to
work in any given case, but there is no guarantee that they will work.
One important property of the group is known: Mordell’s Theorem
states that the group on an elliptic curve E over Q is finitely-generated.
In other words, there are points P1 , . . . , Pr ∈ E such that every rational
point P ∈ E is expressible in the form
$P = n_1P_1 + \cdots + n_rP_r$.
Our main aim in the first part of the course is to prove Mordell’s The-
orem.
E(F5) : $y^2 = x^3 + 1$
but now over the finite field F5. The curve is still non-singular, since
D = −27 = 3
in F5.
We can easily find all the points on the curve. We have to find all (x, y)
with 0 ≤ x, y ≤ 4, or if we prefer x, y ∈ {0, ±1, ±2}, for which
$y^2 \equiv x^3 + 1 \bmod 5$.
We see that there are 6 points in the group, including the zero point
O = [0, 1, 0]:
O, (0, ±1), (2, ±2), (−1, 0).
There is only one abelian group of order 6, namely the cyclic group
C6 = Z/(6). Thus
E(F5 ) = C6 .
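The enumeration of points is a finite search and can be carried out by brute force. The sketch below lists the affine points of $y^2 = x^3 + ax^2 + bx + c$ over a prime field, using representatives 0, ..., p − 1 (so that −1 appears as 4 and −2 as 3 when p = 5); the function name `affine_points` is an illustrative assumption.

```python
def affine_points(p, a, b, c):
    """All (x, y) with 0 <= x, y < p and y^2 = x^3 + a x^2 + b x + c (mod p)."""
    return [(x, y) for x in range(p) for y in range(p)
            if (y * y - (x**3 + a * x * x + b * x + c)) % p == 0]

points = affine_points(5, 0, 0, 1)   # the curve y^2 = x^3 + 1 over F_5
group_order = len(points) + 1        # add 1 for the zero point O = [0, 1, 0]

assert sorted(points) == [(0, 1), (0, 4), (2, 2), (2, 3), (4, 0)]
assert group_order == 6
```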
There is just one element of order 2, namely P = (−1, 0), since this is
the only point of the curve on the x-axis y = 0.
Let us determine the order of Q = (0, 1). The method is exactly the
same as in the rational case. As there, the slope of the curve is given
by
$\frac{dy}{dx} = \frac{3x^2}{2y}$.
In particular, the slope at Q is m = 0, so that the tangent at Q is
y = 1.
Q ∗ Q = Q,
and so
Q + Q = −Q,
ie
3Q = 0.
ie
E(Fp) : y^2 = x^3 + ax^2 + bx + c
The values of ap for the same equation but different primes p have
remarkable and mysterious properties, related to modular forms and
Fermat’s Last Theorem, which have still not been elucidated.
That is well beyond the scope of this course (although we shall have
something to say about modular forms), but there is one related topic
that we shall deal with.
It turns out that any elliptic curve E(Q) over the rationals can be
‘reduced mod p’ to give a curve E(Fp) over the finite field Fp. This curve
may be singular for a finite set of so-called ‘bad’ primes (for that
particular curve), but it will remain an elliptic curve for the remaining primes.
Furthermore it will emerge that there is a natural homomorphism
E(Q) → E(Fp)
for each of these ‘good’ primes p; and the study of these homomorphisms
is one of the many tools we shall have to hand for studying the
curve E(Q).
3. Let us look now at the elliptic curve
E(Q) : y^2 = x^3 − 2x.
We see that this contains the points
P = (0, 0), Q = (2, 2), −Q = (2, −2).
We know that P has order 2.
Let us determine 2Q. The slope is given by
2y dy/dx = 3x^2 − 2,
ie
dy/dx = (3x^2 − 2)/(2y).
At Q,
m = 10/4 = 5/2.
Thus the tangent at Q is
(y − 2) = (5/2)(x − 2),
ie
5x − 2y − 6 = 0.
This meets the curve again where
2 + 2 + x = m^2 = 25/4,
ie at x = 9/4, y = 21/8,
and so
2Q = (9/4, −21/8).
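The doubling of Q = (2, 2) can be reproduced with exact rational arithmetic. This is a minimal sketch for curves of the form y^2 = x^3 + bx + c; the function name double is hypothetical, and it assumes y ≠ 0.

```python
from fractions import Fraction

def double(P, b):
    """Double P on y^2 = x^3 + b*x + c (c does not enter the formulas)."""
    x, y = P
    m = (3 * x * x + b) / (2 * y)   # slope of the tangent at P
    x3 = m * m - 2 * x              # the three roots sum to m^2
    y3 = m * (x - x3) - y           # reflect the third intersection in the x-axis
    return (x3, y3)

Q = (Fraction(2), Fraction(2))
print(double(Q, -2))
```

The non-integral coordinates of the result are what the text uses to show that Q has infinite order.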
We shall show later that a point (x, y) of finite order on the elliptic
curve
y^2 + c1 xy + c3 y = x^3 + c2 x^2 + c4 x + c6
necessarily has integer coordinates x, y ∈ Z. (This is quite difficult to
prove — though not as difficult as Mordell’s Theorem! Essentially we
have to show that as we successively double the point, 2Q, 4Q, 8Q, . . . ,
the denominator of the slope m gets larger and larger.)
It will follow from this that the point Q is of infinite order. In particular
the group E(Q) in this case is infinite.
E(Q) : y^2 − y = x^3 − x.
(y − 1/2)^2 = x^3 − x + 1/4,
ie
y1^2 = x^3 − x + 1/4
x2 = a^2 x, y2 = a^3 y1,
since the coefficients of y^2 and x^3 will still be the same after such a
change. In the present case, if we take a = 2 the equation becomes
x2 = 4x, y2 = 8y − 4.
E(Q) : y^2 + c1 xy + c3 y = x^3 + c2 x^2 + c4 x + c6
y′^2 = x′^3 + ax′ + b
E(Q) : y^2 − y = x^3 − x.
P = (0, 0), Q = (1, 0), R = (−1, 0), S = (0, 1), T = (1, 1), U = (−1, 1).
If P = (x, y) ∈ E then
−P = (x, 1 − y).
Thus
−P = S, −Q = T, −R = U.
Let us determine P + Q. The line P Q has slope
m = 0/1 = 0;
so P Q is the line
y = 0.
This meets the curve again at (−1, 0). Thus
P + Q = −(−1, 0) = (−1, 1),
ie
P + Q = U.
5. Finally, let us look at the same equation over the field F2 :
E(F2) : y^2 − y = x^3 − x.
First we must verify that this is an elliptic curve, ie that the curve
remains non-singular under ‘reduction mod 2’.
The curve takes the homogeneous form (remember that in characteristic
2, −x = x, so that we do not need to worry about sign):
F(X, Y, Z) ≡ Y^2 Z + Y Z^2 + X^3 + XZ^2 = 0.
Hence
∂F/∂X = X^2 + Z^2,
∂F/∂Y = Z^2,
∂F/∂Z = Y^2.
Thus at a singular point, Y = Z = 0, ie the point would be [1, 0, 0],
which is not on the curve.
The projective plane P^2(F2) contains just 7 points: 4 points in the
affine plane F2^2, and 3 points on the line at infinity. (In general, the
projective plane P^2(Fq), over a finite field with q elements, contains
q^2 + q + 1 points.)
It is trivial to see that E(F2 ) contains just 5 points: all 4 affine points
(0, 0), (0, 1), (1, 0), (1, 1) together with the point O = [0, 1, 0] at infinity.
The only abelian (or non-abelian) group with 5 elements is the cyclic
group of order 5. Thus
E(F2 ) = C5 .
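The point count over F2 is small enough to check directly. A minimal sketch, with variable names chosen here for illustration:

```python
p = 2
# Affine solutions of y^2 - y = x^3 - x with coordinates in F2.
affine = [(x, y) for x in range(p) for y in range(p)
          if (y * y - y - (x**3 - x)) % p == 0]
group_order = len(affine) + 1   # +1 for O = [0, 1, 0]
print(affine, group_order)
```

All four affine points satisfy the equation because t^2 = t and t^3 = t for every t in F2, so both sides vanish identically.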
x † y = x + y − a.
This operation is evidently commutative; and it is associative, since
(x † y) † z = x + y + z − 2a = x † (y † z).
x † a = x + a − a = x;
x † (2a − x) = a,
P + Q + R = 0 ⇐⇒ P, Q, R are collinear.
P † Q † R = P + Q + R − 2A = −2A,
so
P † Q † R = A ⇐⇒ 3A = 0 ⇐⇒ A is a point of inflection.
E(k) : y 2 + c1 xy + c3 y = x3 + c2 x2 + c4 x + c6
Chapter 4
Proof I We have
P + (Q + R) = O ∗ (P ∗ (Q + R)) , (P + Q) + R = O ∗ ((P + Q) ∗ R) .
Since
O ∗ (O ∗ P ) = P,
it follows that
O ∗ A = O ∗ B ⇐⇒ A = B.
Thus it is sufficient to show that
P ∗ (Q + R) = (P + Q) ∗ R,
ie
P ∗ (O ∗ (Q ∗ R)) = (O ∗ (P ∗ Q)) ∗ R.
Lemma 3 The associative law holds if and only if
(P ∗ Q) ∗ (R ∗ S) = (P ∗ R) ∗ (Q ∗ S)
for any four points P, Q, R, S ∈ E(k).
Proof of Lemma B Suppose the associative law holds, so that E(k) is an
additive group. Recall that
P ∗ Q = −(P + Q).
Thus
(P ∗ Q) ∗ (R ∗ S) = − ((P ∗ Q) + (R ∗ S))
= − (−(P + Q) − (R + S))
= (P + Q) + (R + S).
Similarly,
(P ∗ R) ∗ (Q ∗ S) = (P + R) + (Q + S)
= (P + Q) + (R + S)
= (P ∗ Q) ∗ (R ∗ S).
X1 = P, X2 = Q, X3 = R, X4 = S,
X5 = P ∗ Q, X6 = R ∗ S, X7 = P ∗ R, X8 = Q ∗ S,
X9 = X5 ∗ X6 , X10 = X7 ∗ X8 .
ℓ1 = X1 X2 X5, ℓ2 = X3 X4 X6, ℓ3 = X1 X3 X7, ℓ4 = X2 X4 X8, ℓ5 = X5 X6 X9, ℓ6 = X7 X8 X10.
Γ1 : F1 (X, Y, Z) = 0, Γ2 : F2 (X, Y, Z) = 0,
are two cubic curves. By the pencil defined by Γ1 , Γ2 we mean the family of
cubic curves
Γr,s : rF1 (X, Y, Z) + sF2 (X, Y, Z) = 0.
This is a one-dimensional pencil, since each cubic in the family is determined
by the ratio [r, s]. More generally, we can consider two-dimensional pencils
etc.
Note that a general cubic Γ (we are not concerned with singularity or
non-singularity for the moment) is defined by 10 coefficients:
The cubic is unchanged if we multiply all the coefficients by the same scalar
ρ ∈ k^×, so we may say that the cubics form a projective space of dimension
9.
We can always find a cubic passing through any 9 points, since m
simultaneous homogeneous linear equations in n > m unknowns always have a
non-zero solution.
In general there will be just one such cubic; but there may well be more
than one for some sets of 9 points.
Note that three lines ℓ, m, n define a cubic
Γ = ℓmn.
Chapter 5
d(x, y) = |x − y|.
‖x‖_p = p^(−e).
Note that all integers are quite small in the p-adic valuation:
x ∈ Z =⇒ ‖x‖_p ≤ 1.
High powers of p are very small:
p^n → 0 as n → ∞.
x ↦ ‖x‖ : k → R
d(x, y) = ‖x − y‖
x1 + · · · + xn = 0 (x1, . . . , xn ∈ Qp)
no term can dominate, ie at least two of the xi must attain max ‖xi‖_p.
To emphasize the analogy between the p-adic valuation and the familiar
valuation |x| we sometimes write
‖x‖_∞ = |x|.
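For rational x the p-adic valuation is easy to compute exactly. The sketch below is illustrative only; the names ord_p and norm_p are hypothetical, and the convention norm_p(0) = 0 is assumed.

```python
from fractions import Fraction

def ord_p(x, p):
    """Exponent e with x = p^e * (m/n), where p divides neither m nor n."""
    x = Fraction(x)
    if x == 0:
        raise ValueError("ord_p(0) is infinite")
    num, den, e = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        e += 1
    while den % p == 0:
        den //= p
        e -= 1
    return e

def norm_p(x, p):
    """p-adic valuation p^(-e) of a rational x, as an exact Fraction."""
    x = Fraction(x)
    return Fraction(0) if x == 0 else Fraction(p) ** (-ord_p(x, p))

print(norm_p(18, 3), norm_p(Fraction(1, 3), 3), norm_p(3**10, 3))
```

Integers always have valuation at most 1, and high powers of p have very small valuation, as noted in the text.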
5.2 p-adic numbers
The reals R can be constructed from the rationals Q by completing the latter
with respect to the valuation |x|. In this construction each Cauchy sequence
{xi ∈ Q : |xi − xj | → 0 as i, j → ∞}
defines a real number, with 2 sequences defining the same number if |xi −yi | →
0.
(There are 2 very different ways of constructing R from Q: by completing
Q, as above; or alternatively, by the use of Dedekind sections. In this each
real number corresponds to a partition of Q into 2 subsets L, R where
l ∈ L, r ∈ R =⇒ l < r.
Q ⊂ Qp .
R = Q∞ .
The numbers x ∈ Qp with ‖x‖_p ≤ 1 are called p-adic integers. The p-adic
integers form a ring, denoted by Zp. For if x, y ∈ Zp then by property (3)
above,
‖x + y‖_p ≤ max(‖x‖_p, ‖y‖_p) ≤ 1,
and so x + y ∈ Zp. Similarly, by property (1),
‖xy‖_p = ‖x‖_p ‖y‖_p ≤ 1,
and so xy ∈ Zp.
Evidently
Z ⊂ Zp.
More generally,
x = m/n ∈ Zp
if p ∤ n. (We sometimes say that a rational number x of this form is
p-integral.) In other words,
Q ∩ Zp = {m/n : p ∤ n}.
Evidently the p-integral numbers form a sub-ring of Q.
Concretely, each element x ∈ Zp is uniquely expressible in the form
x = c0 + c1 p + c2 p^2 + · · · (0 ≤ ci < p).
ie
1/2 − 2 ≡ 1 · 3 mod 3^2.
Thus
1/2 ≡ 2 + 1 · 3 mod 3^2
For the next step,
(1/3)(−1/2 − 1) = −1/2 ≡ 1 mod 3
giving
1/2 ≡ 2 + 1 · 3 + 1 · 3^2 mod 3^3
It is clear that this pattern will be repeated indefinitely. Thus
1/2 = 2 + 3 + 3^2 + 3^3 + · · · .
To check this,
2 + 3 + 3^2 + · · · = 1 + (1 + 3 + 3^2 + · · · )
= 1 + 1/(1 − 3)
= 1 − 1/2
= 1/2.
As another illustration, let us expand 3/5 ∈ Q7. We have
3/5 ≡ 2 mod 7
(1/7)(3/5 − 2) = −1/5 ≡ 4 mod 7
(1/7)(−1/5 − 4) = −3/5 ≡ 5 mod 7
(1/7)(−3/5 − 5) = −4/5 ≡ 2 mod 7
(1/7)(−4/5 − 2) = −2/5 ≡ 1 mod 7
(1/7)(−2/5 − 1) = −1/5 ≡ 4 mod 7
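The digit-by-digit procedure used in these two expansions can be sketched as follows. The function name padic_digits is hypothetical, and the three-argument pow with exponent −1 assumes Python 3.8 or later.

```python
from fractions import Fraction

def padic_digits(x, p, n):
    """First n digits c0, c1, ... of a p-integral rational x = c0 + c1*p + ..."""
    x = Fraction(x)
    if x.denominator % p == 0:
        raise ValueError("x is not p-integral")
    digits = []
    for _ in range(n):
        # Residue of x mod p: numerator times the inverse of the denominator.
        c = (x.numerator * pow(x.denominator, -1, p)) % p
        digits.append(c)
        x = (x - c) / p   # subtract the digit and divide by p, as in the text
    return digits

print(padic_digits(Fraction(1, 2), 3, 6))
print(padic_digits(Fraction(3, 5), 7, 6))
```

The second expansion becomes periodic once the remainder −1/5 recurs, which is what happens for every rational number.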
x = c + yp,
x = 1 + yp.
x^(−1) = 1 − yp + y^2 p^2 − y^3 p^3 + · · · .
dc ≡ 1 mod p.
Then
dx ≡ dc ≡ 1 mod p,
say
dx = 1 + py,
x^(−1) = d(1 − yp + y^2 p^2 − · · · ).
Thus the elements x ∈ Zp with ‖x‖_p = 1 are all units in Zp, ie they have
inverses in Zp; and all such units are of this form. These units form the
multiplicative group
Zp^× = {x ∈ Zp : ‖x‖_p = 1}.
5.3 In the p-adic neighbourhood of 0
Recall that an elliptic curve E(k) can be brought to Weierstrassian form
y^2 + c1 xy + c3 y = x^3 + c2 x^2 + c4 x + c6
if and only if it has a flex defined over k. This is not in general true for
elliptic curves over Qp. For example, the curve
X^3 + pY^3 + p^2 Z^3 = 0
has no points at all (let alone flexes) defined over Qp . For if [X, Y, Z] were a
point on this curve then
a+b+c=0
then two (at least) of a, b, c must have the same p-adic value, by Corollary 3
to Proposition 5.1.
On the other hand, Qp is of characteristic 0; so if E(Qp ) is Weierstrassian
— as we shall always assume, for reasons given earlier — then it can be
brought to standard form
y^2 = x^3 + bx + c.
In spite of this, there is some advantage in working with the general
Weierstrassian equation, since (as we shall see in Chapter 6) this allows us to
apply the results of this Chapter to study the integer points (that is, points
with integer coordinates) on elliptic curves over Q given in general
Weierstrassian form. Such an equation over Q can of course be reduced to standard
form; but the reduction may well transform integer to non-integer points.
As in the real case, we study the curve in the neighbourhood of 0 = [0, 1, 0]
by taking coordinates X, Z, where
E(Qp) : Z + c1 XZ + c3 Z^2 = X^3 + c2 X^2 Z + c4 XZ^2 + c6 Z^3.
Proposition 5.2 If P ∈ E(Qp ) then
kXkp = pe .
‖c1 XZ‖_p, ‖c3 Z^2‖_p, ‖c2 X^2 Z‖_p, ‖c4 XZ^2‖_p, ‖c6 Z^3‖_p < ‖Z‖_p.
‖Z‖_p = ‖X^3‖_p = ‖X‖_p^3.
Z = X^3 − c1 X^4 + (c1^2 + c2)X^5 + · · · .
F(X, Z(X)) = 0
identically, where
F(X, Z) = Z + c1 XZ + c3 Z^2 − (X^3 + c2 X^2 Z + c4 XZ^2 + c6 Z^3).
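The leading terms of Z(X) can be recovered by iterating the equation as a fixed-point problem, Z ← X^3 + c2 X^2 Z + c4 XZ^2 + c6 Z^3 − c1 XZ − c3 Z^2, on truncated power series. This is only a sketch with numeric integer coefficients; the helper names mul and z_series are hypothetical, and the iteration scheme is an assumption of this sketch rather than the method of the notes.

```python
def mul(f, g, n):
    """Product of two power series (coefficient lists, lowest first) mod X^(n+1)."""
    h = [0] * (n + 1)
    for i, a in enumerate(f):
        if a == 0:
            continue
        for j, b in enumerate(g):
            if i + j > n:
                break
            h[i + j] += a * b
    return h

def z_series(c1, c2, c3, c4, c6, n):
    """Coefficients of Z(X) up to X^n (n >= 3), starting the iteration from Z = 0."""
    X = [0, 1] + [0] * (n - 1)
    X2 = mul(X, X, n)
    X3 = mul(X2, X, n)
    Z = [0] * (n + 1)
    for _ in range(n):   # each pass fixes at least one more coefficient
        Z2, XZ = mul(Z, Z, n), mul(X, Z, n)
        Z3, X2Z, XZ2 = mul(Z2, Z, n), mul(X2, Z, n), mul(X, Z2, n)
        Z = [X3[k] + c2 * X2Z[k] + c4 * XZ2[k] + c6 * Z3[k]
             - c1 * XZ[k] - c3 * Z2[k] for k in range(n + 1)]
    return Z

print(z_series(2, 3, 5, 7, 11, 5))
```

With integer ci every coefficient stays an integer, which is why the series converges p-adically whenever ‖X‖_p < 1.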
Remember too that in the p-adic valuation integers are small,
x ∈ Z =⇒ ‖x‖_p ≤ 1.
Thus a power-series
a0 + a1 x + a2 x^2 + · · ·
where ai ∈ Z (or more generally, ai ∈ Zp) will converge for all x with
‖x‖_p < 1.
1. only odd powers of X appear, ie di = 0 for i odd;
2. d2 = a, d4 = a^2 + b, d6 = a^3 + 3ab + c;
4. the coefficient d2i has weight i, given that a, b, c are ascribed weights
2, 4, 6 respectively;
Z = X^3 + aX^2 Z + bXZ^2 + cZ^3
Proposition 5.4 Suppose ‖X1‖_p, ‖X2‖_p < 1. Then we can express S(X1, X2)
as a double power-series in X1, X2,
S(X1, X2) = X1 + X2 + c1 X1 X2 + · · ·
= Σ_i Si(X1, X2)
= Σ_{i,j} sij X1^i X2^j
where
2. S1(X1, X2) = X1 + X2, S2(X1, X2) = c1 X1 X2;
Proof I As in the real case, let the line
P1 P2 : Z = M X + D
P3 = P1 ∗ P2 .
Hence
X1 + X2 + X3 = −(coeff of X^2)/(coeff of X^3)
= (c1 M + c3 M^2 − (c2 + 2c4 M + 3c6 M^2)D) / (1 + c2 M + c4 M^2 + c6 M^3).
Now
M = (Z2 − Z1)/(X2 − X1)
= (X2^3 − X1^3)/(X2 − X1) − c1 (X2^4 − X1^4)/(X2 − X1) + · · ·
= X1^2 + X1 X2 + X2^2 − c1 (X1^3 + X1^2 X2 + X1 X2^2 + X2^3) + · · · ,
D = (X2 Z1 − X1 Z2)/(X2 − X1)
= −X1 X2 ((X2^2 − X1^2)/(X2 − X1) − c1 (X2^3 − X1^3)/(X2 − X1) + · · · )
= −X1 X2 (X1 + X2 − c1 (X1^2 + X1 X2 + X2^2) + · · · ).
or more precisely,
Hence
X1 + X2 + X3 ≡ 0 mod p^2.
More precisely,
ie
In particular,
‖X3‖_p ≤ p^(−1),
and so
‖Z3‖_p = ‖M X3 + D‖_p ≤ p^(−3),
ie
P1, P2 ∈ E(p) =⇒ P3 ∈ E(p).
Recall that
P1 + P2 = O ∗ (P1 ∗ P2) = O ∗ P3.
By our formulae above, with O, X3 in place of X1, X2,
or more precisely
Hence
X(P1 + P2) ≡ X1 + X2 mod p^2,
or more precisely
J
Finally, we turn to the normal coordinate function θ(X), defined as in
the real case by
dθ/dX = 1/(∂F/∂Z)
= 1/(1 + c1 X + 2c3 Z − c2 X^2 − 2c4 XZ − 3c6 Z^2).
Proposition 5.5 Suppose ‖X‖_p < 1. Then we can express θ as a
power-series in X,
θ = X − (c1/2) X^2 + · · ·
= Σ tn X^(n+1)
where
1. t1 = 1, t2 = −c1/2;
3. ti is of weight i.
Proof I Since
dθ/dX = 1/(1 + c1 X + 2c3 Z − c2 X^2 − 2c4 XZ − 3c6 Z^2)
= 1 − (c1 X + 2c3 Z − c2 X^2 − 2c4 XZ − 3c6 Z^2)
+ (c1 X + 2c3 Z − c2 X^2 − 2c4 XZ − 3c6 Z^2)^2 + · · ·
the coefficients in the power-series for dθ/dX are integral polynomials in the
ci. It follows on integration that the coefficients ti in the power-series for
θ(X) have at worst denominator i.
It remains to show that this power series converges for ‖X‖_p < 1.
Then
p^e | i =⇒ p^e ≤ i
=⇒ ‖1/i‖_p ≤ i.
C
If now ‖X‖_p < 1 then
‖X‖_p ≤ 1/p;
and so
‖ti X^i‖_p ≤ i/p^i,
which tends to 0 as i → ∞. The power-series is therefore convergent. J
Note that
p^i ≥ 2^i = (1 + 1)^i > i^2/2
if i ≥ 2, while if p is odd, ‖1/2‖_p = 1. Thus
So if p is odd,
while if p = 2,
That is why in our discussion below the argument often applies to P ∈ E(p)
if p is odd, while if p = 2 we have to restrict P to E(2^2).
E(p^e)(Qp)
θ : E(p^e)(Qp) → p^e Zp
Proof I The identity
which we established in the real case, must still hold; and we conclude from
it, as before, that
θ(P1 + P2 ) = θ(P1 ) + θ(P2 )
whenever
P1, P2 ∈ E(p^e)(Qp).
It follows from this that E(p^e) is a subgroup; and that
θ : E(p^e) → p^e Zp
is a homomorphism, provided e ≥ 2 if p = 2.
Since
θ(X) = X − c1 X^2/2 + · · · ,
we have
‖θ(X)‖_p = ‖X‖_p
for all ‖X‖_p ≤ p^(−e). In particular
θ(X) = 0 ⇐⇒ X = 0.
Hence θ is injective.
It is also surjective, as the following Lemma will show.
p^n Zp (n = 0, 1, 2, . . . ),
together with {0}. In particular, every closed subgroup of Zp, apart from {0},
is in fact open.
Z̄ = Zp.
xr = c0 + c1 p + · · · + cr p^r.
Now suppose S is a closed subgroup of Zp. Let s ∈ S be an element of
maximal p-adic valuation, say
‖s‖_p = p^(−e).
Then
s = p^e u
where u is a unit in Zp, with inverse v, say. Given any ε > 0, we can find
n ∈ Z such that
‖v − n‖_p < ε.
Then
ns − p^e = p^e (nu − 1)
= p^e u(n − v);
and so
‖ns − p^e‖_p < ε.
Since ns ∈ S and S is closed, it follows that
p^e ∈ S.
Hence the closure of p^e Z lies in S, ie
p^e Zp ⊂ S.
Since s was a maximal element in S, it follows that
S = p^e Zp.
C
It follows from this Lemma that im θ is one of the subgroups p^m Zp. But
since
‖X‖_p = p^(−e) =⇒ ‖θ(X)‖_p = p^(−e),
im θ must in fact be p^e Zp, ie θ is surjective.
A continuous bijective map from a compact space to a Hausdorff space is
necessarily a homeomorphism. (This follows from the fact that the image of
every closed, and therefore compact, subset is compact, and therefore closed.)
In particular, θ establishes an isomorphism
E(p^e) ≅ p^e Zp ≅ Zp.
J
It follows from this Theorem that E(pe ) is torsion-free, since Zp is torsion-
free. Thus there are no points of finite order on E close to O, a result which
we shall exploit in the next Chapter.
5.4 The Structure of E(Qp)
We shall not use the following result, but include it for the sake of complete-
ness.
Theorem 5.2 Let F ⊂ E(Qp ) be the torsion subgroup of the elliptic curve
E(Qp ). Then
E(Qp) ≅ F ⊕ Zp.
p^n P ∈ E(p)
for some n > 0 since p^n P → O and E(p) is an open neighbourhood of O.
Hence the open subgroups p^(−n) E(p) cover Ep. Since Ep is compact, it follows
that p^(−n) E(p) ⊃ Ep for some n, ie
p^n Ep ⊂ E(p) ≅ Zp.
But by Lemma 5 to Theorem 5.1, the only closed subgroups of Zp are the
p^e Zp, which correspond under this isomorphism to the subgroups E(p^e) of E(p).
We conclude that
p^n Ep = E(p^e)
for some e. C
Lemma 7 Suppose A is a finite p-group; and suppose gcd(m, p) = 1. Then
the map ψ : A → A under which
a 7→ ma
is an isomorphism.
ma = 0.
a 7→ ma
is an isomorphism.
mP = 0.
By Lemma 1,
p^n Ep ⊂ E(p^2) ≅ Zp
for some n.
But Zp is torsion-free. Thus
mP = 0 =⇒ m(p^n P) = 0 =⇒ p^n P = 0.
Hence
order(P) | m, p^n =⇒ order(P) = 1 =⇒ P = 0
since gcd(m, p^n) = 1. Thus
ker ψ = 0,
ie ψ is injective.
Now suppose P ∈ Ep. We have to show that P = mQ for some Q ∈ Ep.
Since Ep/p^n Ep is a finite p-group we can find Q ∈ Ep such that
mQ ≡ P mod p^n Ep
ie
mQ = P + R,
where
R ∈ p^n Ep ≅ Zp.
Now the map
P ↦ mP : Zp → Zp
is certainly an isomorphism, since m is a unit in Zp with inverse m^(−1) ∈ Zp.
In particular we can find S ∈ p^n Ep with
mS = R.
P = mQ − R = mQ − mS = m(Q − S).
P ∈ Fp0 ∩ Ep ,
say
mP = O,
where gcd(m, p) = 1.
On considering p mod m as an element of the finite group
p^r ≡ 1 mod m
and so
p^n P → O =⇒ P = O.
Now suppose P ∈ E. Since E is compact, and Ep is open, E/Ep is finite
(eg since E must be covered by a finite number of Ep-cosets). Let the order
of this finite group be m p^e, where gcd(m, p) = 1.
We can find u, v ∈ Z such that
um + vp^e = 1;
and then
P = Q + R,
where
Q = u(mP), R = v(p^e P).
Now
p^e Q = u(m p^e P) ∈ Ep.
Hence
p^n Q → 0 as n → ∞
ie
Q ∈ Ep.
mR = mS,
and so
T = R − S ∈ Fp0 .
P = T + (Q + S),
Lemma 10 Fp ⊂ Ep .
Proof of Lemma B Suppose
P = Q + R ∈ Fp,
p^n P = 0 =⇒ p^n Q = 0, p^n R = 0,
p^n Q = 0 =⇒ order(Q) | p^n =⇒ order(Q) = 1 =⇒ Q = 0,
P = R ∈ Ep.
C
It remains to split Ep into Fp and a subgroup isomorphic to Zp .
Consider the surjection
ψ : Ep → E(p^e) ≅ Zp.
ψ(P1) = P0;
and let
E1 = \overline{⟨P1⟩}
be the closure in Ep of the subgroup generated by P1. We shall show that
the restriction
ψ1 = ψ | E1 : E1 → E(p^e)
is an isomorphism, so that
E1 ≅ E(p^e) ≅ Zp.
By definition, Q is the limit of points in ⟨P1⟩, say
ni P1 → Q,
where ni ∈ Z. But then, since ψ is continuous,
ni P0 → ψ(Q) = 0.
Hence
ni → 0
in Zp. But then it follows that
ni P1 → 0
in Ep, since
⋂ p^n Ep = 0.
Hence Q = 0, ie ker ψ1 = 0.
It remains to show that
Ep = Fp ⊕ E1 .
Suppose P ∈ Ep . Then
ψ(P ) = ψ(Q),
for some Q ∈ E1 . In other words,
pn (P − Q) = 0.
Thus
R = P − Q ∈ Fp
On the other hand,
Fp ∩ E1 = 0,
since as we have seen,
E1 ≅ E(p^e) ≅ Zp,
and Zp is torsion-free.
We have shown therefore that
E = Fp′ ⊕ Ep
= Fp′ ⊕ (Fp ⊕ E1)
= (Fp′ ⊕ Fp) ⊕ E1
= F ⊕ E1
≅ F ⊕ Zp.
J
ni → x =⇒ ni P → xP.
Chapter 6
The only point of finite order in this subgroup is 0 (since Zp has no other
elements of finite order).
It follows that any coset
P + E(p) (Qp )
contains at most one element of finite order. For if there were two, say P, Q,
then P − Q would be a point of finite order in the subgroup.
But E(Qp) is compact, since it is a closed subspace of the compact space
P^2(Qp). Hence it can be covered by a finite number of cosets
Since each coset contains at most 1 point of finite order, the number of such
points is finite. J
Remark: We shall prove in Chapter 8 the much deeper result that the group
E(Q) of an elliptic curve over Q is finitely-generated (Mordell’s Theorem),
from which the finiteness of F follows, as shown in Appendix A.
E(R) ≅ T or T ⊕ Z/(2).
Since
E(Q) ⊂ E(R),
it follows that
F ⊂ T or T ⊕ Z/(2).
Lemma Every finite subgroup of T is cyclic; and there is just one such
subgroup of each order n.
T = R/Z
is
F = Q/Z.
For if t̄ ∈ T is of order n then nt ∈ Z, say nt = m, ie t = m/n ∈ Q.
Conversely, if t ∈ Q, say t = m/n, then nt̄ = 0, and so t̄ ∈ F .
Suppose
A ⊂ Q/Z
is a finite subgroup ≠ 0. Since each t̄ ∈ T has a unique representative
t ∈ [−1/2, 1/2), A has a smallest representative t = m/n > 0, where we may
assume that m, n > 0, gcd(m, n) = 1.
In fact m = 1; for we can find u, v ∈ Z such that
um + vn = 1,
and then
1/n = u(m/n) + v,
ie
1/n ≡ u(m/n) mod Z
Thus
1/n ∈ A.
Since 1/n ≤ m/n, this must be our minimal representative: m = 1.
Now every element t̄ ∈ A must be of the form m/n; for otherwise we
could find a representative
Moreover, our argument shows that this is the only subgroup of T of order
n. C
Since this is the only subgroup of T of order n we can write
Z/(n) ⊂ T
without ambiguity, identifying
A ⊂ T ⊕ Z/(2).
A ∩ T = Z/(n).
Thus
Z/(n) ⊂ A ⊂ Z/(n) ⊕ Z/(2).
Since Z/(n) is of index 2 in Z/(n) ⊕ Z/(2) it follows that
If n is odd then
Z/(n) ⊕ Z/(2) ≅ Z/(2n)
by the Chinese Remainder Theorem. Thus either A is cyclic or else
A ≅ Z/(n) ⊕ Z/(2)
with n even. J
Mazur has shown that in fact the torsion group of an elliptic curve can
only be one of a small number of groups, namely
Proof I If P = (x, y) then −P = (x, −y). Thus 2P = 0, ie −P = P , if and
only if y = 0.
Thus there are as many elements of order 2 as there are roots of f(x) =
x^3 + ax^2 + bx + c in Q. But if 2 roots α, β ∈ Q then the third root γ ∈ Q,
since
α + β + γ = −a.
J
In determining whether
p(x) = x3 + ax2 + bx + c
pa = 0, pb = 0 =⇒ p(a + b) = 0.
We can consider this subgroup as a vector space over the finite field GF(p).
Proposition 6.4 If p is an odd prime then there are either no points of order
p on the elliptic curve E(Q), or else there are exactly p − 1 such elements,
forming with 0 the group Z/(p).
Proof I Suppose P has order 3, ie
P + P + P = 0.
From the definition of addition, this means that the tangent at P meets E in
3 coincident points P, P, P . In other words, P is a point of inflexion.
It follows from the previous Proposition that there are either 0 or 2 such
flexes. J
Remark: The point 0 is of course a flex (by choice); so there are either 1 or
3 flexes on the elliptic curve E(Q) given by a general Weierstrass equation.
Proof I
‖x‖_p ≤ 1 ⇐⇒ ‖y‖_p ≤ 1.
Proof of Lemma B If ‖x‖_p ≤ 1 but ‖y‖_p > 1 then y^2 will dominate the
equation. On the other hand, if ‖x‖_p > 1 but ‖y‖_p ≤ 1 then x^3 will dominate
the equation. C
On combining these results for all primes,
x ∈ Z ⇐⇒ y ∈ Z.
(This last result is easily proved directly; for if x ∈ Z then the equation
for E can be regarded as a monic quadratic equation for y with integral
coefficients; and any rational solution for y is therefore integral; and similarly
if y ∈ Z then the equation for E can be regarded as a monic cubic equation
for x with integral coefficients; and any rational solution for x is therefore
integral.)
Proof of Lemma B The equation of the curve in (X, Z)-coordinates is
Z + c1 XZ + c3 Z^2 = X^3 + c2 X^2 Z + c4 XZ^2 + c6 Z^3.
Suppose P ∉ E(p), ie
‖X‖_p ≥ 1 or ‖Z‖_p ≥ 1.
In fact
‖X‖_p ≥ 1 =⇒ ‖Z‖_p ≥ 1;
for if ‖X‖_p ≥ 1 but ‖Z‖_p < 1 then X^3 would dominate the equation. Thus
‖Z‖_p ≥ 1
in either case.
Since y = 1/Z,
‖Z‖_p ≥ 1 =⇒ ‖y‖_p ≤ 1.
Hence
x, y ∈ Zp
by Lemma 1. C
2. E(2^2) is torsion-free.
E(p) ≅ Zp (p odd), E(2^2) ≅ Z2,
as we saw in Chapter 5. C
Proof of Lemma B Suppose P = (X, Z). Recall that although E(2) was
defined as
E(2) = {(X, Z) ∈ E : ‖X‖_2, ‖Z‖_2 ≤ 2^(−1)},
it follows from the equation
Z(1 + c1 X + c3 Z) = X^3 + c2 X^2 Z + c4 XZ^2 + c6 Z^3
that
(X, Z) ∈ E(2) =⇒ ‖Z‖_2 ≤ 2^(−3).
(More generally, although E(p^e) is defined as
in fact
(X, Z) ∈ E(p^e) =⇒ ‖Z‖_p ≤ p^(−3e)
by induction on e.)
The tangent at P is
Z = MX + D
where
M = −(∂F/∂X)/(∂F/∂Z)
= −(c1 Z − (3X^2 + 2c2 XZ + c4 Z^2)) / (1 + c1 X + 2c3 Z − (c2 X^2 + 2c4 XZ + 3c6 Z^2)).
The term 3X^2 dominates the numerator, while the term 1 dominates the
denominator. It follows that
‖M‖_2 ≤ 2^(−2).
Hence
‖D‖_2 = ‖Z − MX‖_2 ≤ 2^(−3).
The tangent meets E where
(MX + D)(1 + c1 X + c3 (MX + D))
= X^3 + c2 X^2 (MX + D) + c4 X(MX + D)^2 + c6 (MX + D)^3.
2X + X1 = −(coeff of X^2)/(coeff of X^3)
= (c1 M + c3 M^2 − (c2 + 2c4 M + 3c6 M^2)D) / (1 + c2 M + c4 M^2 + c6 M^3).
Hence
‖X1‖_2 ≤ 2^(−2).
Since
‖Z1‖_2 = ‖M X1 + D‖_2 ≤ 2^(−4),
it follows that
(X1, Z1) ∈ E(2^2).
We conclude that
2P = −(X1, Z1) ∈ E(2^2),
since E(2^2) is a subgroup of E. C
Now suppose P = (x, y) ∈ E(Q) is of finite order.
For each odd prime p,
P ∉ E(p)
by Lemma 3. Thus
x, y ∈ Zp
by Lemma 2.
Since 2P is of finite order,
P ∈ E(2) =⇒ 2P ∈ E(2^2) =⇒ 2P = 0,
x, y ∈ Z2 ,
by Lemma refIntegrality.
Putting these results together, we conclude that either 2P = 0 or else
x, y ∈ Zp for all p =⇒ x, y ∈ Z.
y^2 = x^3 + ax^2 + bx + c
then x, y ∈ Z.
2P = 0 =⇒ y = 0 =⇒ x^3 + ax^2 + bx + c = 0.
E(Q) : y^2 + c1 xy + c3 y = x^3 + c2 x^2 + c4 x + c6
then
−P = (x, −y − c1 x − c3 ).
For by definition, −P is the point where the line OP meets the curve
again. But the lines through O are just the lines
x=c
parallel to the y-axis (together with the line Z = 0 at infinity). This is clear
if we take the line in homogeneous form
lX + mY + nZ = 0.
x = X/Z = −n/l.
−P = (x, y1 ).
y^2 + y(c1 x + c3) − (x^3 + c2 x^2 + c4 x + c6).
Hence
y + y1 = −(c1 x + c3 ),
ie
y1 = −y − c1 x − c3 .
It follows that
2P = 0 ⇐⇒ −P = P
⇐⇒ y = −y − c1 x − c3
⇐⇒ 2y + c1 x + c3 = 0.
E(Q) : y^2 + xy = x^3 + 4x^2 + x.
2y + x = 0.
This meets the curve where
x^2/4 − x^2/2 = x^3 + 4x^2 + x,
ie
4x^3 + 17x^2 + 4x = 0.
This has roots 0, −1/4, −4. Thus the curve has three points of order 2,
namely (0, 0), (−1/4, 1/8), (−4, 2).
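The points of order 2 in this example can be checked exactly: each lies on the line 2y + x = 0 and on the curve. A minimal sketch, with the names on_curve and order_two chosen here for illustration:

```python
from fractions import Fraction

def on_curve(x, y):
    """Membership test for y^2 + xy = x^3 + 4x^2 + x."""
    return y * y + x * y == x**3 + 4 * x**2 + x

# Roots of 4x^3 + 17x^2 + 4x = x(4x + 1)(x + 4).
roots = [Fraction(0), Fraction(-1, 4), Fraction(-4)]
order_two = [(x, -x / 2) for x in roots]   # y is forced by 2y + x = 0
checks = [on_curve(x, y) and 4 * x**3 + 17 * x**2 + 4 * x == 0
          for x, y in order_two]
print(order_two, checks)
```

Note that the point with x = −1/4 is not integral; the integrality result for points of finite order on a general Weierstrass equation excludes the case 2P = 0.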
y^2 = f(x),
where
f(x) ≡ x^3 + ax^2 + bx + c (a, b, c ∈ Z);
and suppose P = [x, y, 1] ∈ E is a point of finite order. Then either y = 0,
or
y^2 | ∆(f),
where
∆ = 4a^3 c − a^2 b^2 − 18abc + 4b^3 + 27c^2
is (up to sign) the discriminant of f(x).
y | ∆(f),
since this brings out the basic idea in a more direct way.
Suppose P = (x, y) is a point of finite order. Then so is 2P = (x1 , y1 ).
Thus by Proposition ,
x, y, x1 , y1 ∈ Z.
Recall that
2x + x1 = −a + m^2,
where
m = f′(x)/(2y).
Since a ∈ Z, it follows that
m^2 ∈ Z =⇒ m ∈ Z =⇒ 2y | f′(x).
On the other hand
y | f(x)
since y^2 = f(x). Thus
y | f(x), f′(x).
Recall that the resultant R(f, g) of two polynomials
f(x) = a0 x^m + a1 x^(m−1) + · · · + am, g(x) = b0 x^n + b1 x^(n−1) + · · · + bn
is the determinant of the (m + n) × (m + n) matrix
          | a0 a1 a2 . . . am    0  . . . 0  |
          | 0  a0 a1 . . . am−1  am . . . 0  |
          | . . .                            |
          | 0  0  0  . . . . . . am−1     am |
R(f, g) = | b0 b1 b2 . . . bn    0  . . . 0  |
          | 0  b0 b1 . . . bn−1  bn . . . 0  |
          | . . .                            |
          | 0  0  0  . . . . . . bn−1     bn |
We saw earlier that R(f, g) = 0 is a necessary and sufficient condition
for f (x), g(x) to have a root in common. Our present use of the resultant,
though related, is more subtle.
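The determinant definition can be tried out on small cases. The sketch below builds the matrix with n shifted rows of the coefficients of f followed by m shifted rows of those of g, and evaluates the determinant by exact Gaussian elimination; the function names are hypothetical.

```python
from fractions import Fraction

def sylvester(f, g):
    """Sylvester matrix of f, g given as coefficient lists, highest power first."""
    m, n = len(f) - 1, len(g) - 1
    rows = [[0] * i + f + [0] * (n - 1 - i) for i in range(n)]
    rows += [[0] * i + g + [0] * (m - 1 - i) for i in range(m)]
    return rows

def det(M):
    """Determinant by Gaussian elimination over the rationals."""
    M = [[Fraction(v) for v in row] for row in M]
    n, d = len(M), Fraction(1)
    for k in range(n):
        pivot = next((r for r in range(k, n) if M[r][k] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != k:
            M[k], M[pivot] = M[pivot], M[k]
            d = -d                      # a row swap changes the sign
        d *= M[k][k]
        for r in range(k + 1, n):
            factor = M[r][k] / M[k][k]
            for c in range(k, n):
                M[r][c] -= factor * M[k][c]
    return d

def resultant(f, g):
    return det(sylvester(f, g))

# f = x^3 + 1 with f' = 3x^2, and f = x^3 + x with f' = 3x^2 + 1.
print(resultant([1, 0, 0, 1], [3, 0, 0]), resultant([1, 0, 1, 0], [3, 0, 1]))
```

For f = x^3 + bx + c these values agree with 4b^3 + 27c^2, that is, with the discriminant up to sign.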
Lemma 1 Suppose f (x), g(x) ∈ Z[x]. Then there exist polynomials u(x), v(x) ∈
Z[x] such that
u(x)f (x) + v(x)g(x) = R(f, g).
It is readily verified that if
The existence of such integers follows at once from the following Sub-
lemma. (For simplicity we prove the result with det A as first coordinate
rather than last; but it is easy to see that this does not matter.)
where the Ai1’s are the corresponding co-factors. On the other hand, if i ≠ 1
then
a1i A11 + a2i A21 + · · · + ani An1 = 0,
since this is the determinant of a matrix with two identical columns.
Thus the vector
v = (A11, A21, . . . , An1)
has the required property. C
C
We apply this Lemma to the polynomials f(x), f′(x), recalling that
R(f, f′) = −D(f).
Hence
y | f(x), f′(x) =⇒ y | D.
Turning now to the full result, suppose again that P = (x, y) is of finite
order, and that 2P = (x1, y1). We know that x, y, x1, y1 ∈ Z.
x(2P) = m^2 − a − 2x,
where
m = f′(x)/(2y).
Thus
x(2P) = (f′(x)^2 − 4y^2 (2x + a)) / (4y^2)
= ((3x^2 + 2ax + b)^2 − 4(x^3 + ax^2 + bx + c)(2x + a)) / (4y^2),
which yields the given result on simplification. C
It follows from the lemma that
y^2 | g(x);
Thus
y^2 | f(x), g(x)
since y^2 = f(x).
Lemma 3 There exist polynomials u(x), v(x) ∈ Z[x] of degrees 3, 2 such that
Proof of Lemma B For simplicity we are going to prove the result in the case
a = 0. We leave it to the reader to establish the general result.
Let us see if we can find u(x), v(x) ∈ Q[x] of the form
u(x) = x^3 + Bx + C, v(x) = x^2 + D
such that
u(x)f(x) − v(x)g(x) = const.
The coefficients of x^6 and x^5 on the left both vanish. Equating the
coefficients of x^4, x^3, x^2, x yields
x^4 : b + B = −2b + D =⇒ D = B + 3b
x^3 : c + C = −8c =⇒ C = −9c
x^2 : Bb = b^2 − 2Db =⇒ 2D + B = b
x : Bc + Cb = −8Dc =⇒ B − 9b = −8D.
B = −5b/3, D = 4b/3.
−5b/3 − 9b = −32b/3,
which is an identity.
Accordingly, we take
u(x) = 3x^3 − 5bx − 27c, v(x) = 3x^2 + 4b;
and then
u(x)f(x) − v(x)g(x) = −27c^2 − 4b^3 = D,
as required. C
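The identity can be verified numerically for sample values of b and c. This sketch assumes a = 0 and takes g(x) = x^4 − 2bx^2 − 8cx + b^2, which is inferred here from the coefficient equations above; it also assumes the scaling u = 3(x^3 + Bx + C), v = 3(x^2 + D) with B = −5b/3, C = −9c, D = 4b/3. All function names are hypothetical.

```python
def poly_mul(f, g):
    """Product of polynomials given as coefficient lists, lowest power first."""
    h = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            h[i + j] += a * b
    return h

def poly_sub(f, g):
    n = max(len(f), len(g))
    f = f + [0] * (n - len(f))
    g = g + [0] * (n - len(g))
    return [a - b for a, b in zip(f, g)]

def combination(b, c):
    """Coefficients of u*f - v*g for f = x^3 + b*x + c."""
    f = [c, b, 0, 1]
    g = [b * b, -8 * c, -2 * b, 0, 1]    # assumed form of g(x)
    u = [-27 * c, -5 * b, 0, 3]          # 3x^3 - 5bx - 27c
    v = [4 * b, 0, 3]                    # 3x^2 + 4b
    return poly_sub(poly_mul(u, f), poly_mul(v, g))

print(combination(2, 3))
```

Every coefficient except the constant term cancels, and the constant term equals −4b^3 − 27c^2.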
The result now follows as before; since x, y ∈ Z,
y^2 | f(x), g(x) =⇒ y^2 | D.
R(f, g) = −D^2,
6.5 Examples
In these examples we compute the torsion group F of various elliptic curves
E(Q).
E(Q) : y^2 = x^3 + 1.
p(x) = x^3 + bx + c
is
D = −(4b^3 + 27c^2).
y = 0, ±1, ±3.
If y = ±1 then x = 0, giving the two points (0, ±1).
If y = ±3 then x^3 = 8, giving the two points (2, ±3).
It remains to determine which of these points (0, ±1), (2, ±3) is of finite
order, remembering that the Nagell-Lutz condition y^2 | D is necessary
(if y ≠ 0) but by no means sufficient.
The tangent at P = (0, 1) has slope
m = p′(x)/(2y) = 3x^2/(2y) = 0.
Thus the tangent at P is
y = 1.
This meets E where
x^3 = 0,
ie thrice at P. In other words P is a flex, and so of order 3.
Turning to the point (2, 3) we have
m = 3x^2/(2y) = 2,
and so the tangent at this point is
y = 2x − 1,
which meets E again at (0, −1). Thus
2(2, 3) = −(0, −1) = (0, 1).
We conclude that (2, 3) (and (2, −3) = −(2, 3)) are of order 6, and
F = Z/(6).
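The orders found in this example can be checked by implementing the chord-and-tangent law directly. The following is a sketch for curves y^2 = x^3 + ax^2 + bx + c with exact rational arithmetic; the names add and order and the cutoff bound are choices made here, not part of the notes.

```python
from fractions import Fraction

O = None  # the zero point [0, 1, 0]

def add(P, Q, a, b):
    """P + Q on y^2 = x^3 + a*x^2 + b*x + c (c does not enter the formulas)."""
    if P is O:
        return Q
    if Q is O:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and y1 == -y2:
        return O                                   # Q = -P
    if P == Q:
        m = (3 * x1 * x1 + 2 * a * x1 + b) / (2 * y1)   # tangent slope
    else:
        m = (y2 - y1) / (x2 - x1)                       # chord slope
    x3 = m * m - a - x1 - x2
    return (x3, m * (x1 - x3) - y1)

def order(P, a, b, bound=12):
    """Order of P, or None if no multiple up to `bound` is the zero point."""
    Q, n = P, 1
    while Q is not O:
        if n >= bound:
            return None
        Q, n = add(Q, P, a, b), n + 1
    return n

pt = lambda x, y: (Fraction(x), Fraction(y))
print(order(pt(0, 1), 0, 0), order(pt(-1, 0), 0, 0), order(pt(2, 3), 0, 0))
```

For y^2 = x^3 + 1 this reports orders 3, 2 and 6, matching F = Z/(6).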
3. Suppose F is the torsion subgroup of
E(Q) : y 2 = x3 + x
We have
D = −4,
and so
y = 0, ±1, ±2.
x^n + a1 x^(n−1) + · · · + an
y1^2 = x^3 − x + 1/4.
Now we can make the coefficients integral by the transformation
y2 = 2^3 y1, x2 = 2^2 x,
giving
y2^2 = x2^3 − 2^4 x2 + 2^6/4,
since the coefficient of x has weight 4, while the constant coefficient has
weight 6. (In practice it is probably easier to apply this transformation
first, and then complete the square; that way our coefficients always
remain integral.) Our new equation is
with discriminant
D = −(4 · (−2^4)^3 + 27 · (2^4)^2)
= 2^8 (64 − 27)
= 2^8 · 37.
If y2 = 0 then
x2^3 − 16x2 + 16 = 0.
But
16 | x2^3 =⇒ 4 | x2
=⇒ 32 | x2^3, 16x2
=⇒ 32 | 16,
Finally, if y2 = ±4 then
dy/dx = (3x^2 − 1)/(2y − 1).
Thus the tangent at P has slope m = 1, and so is
y = x.
y = −2x + 2,
ie
x^3 − 4x^2 + 5x − 2 = 0.
We know this has two roots equal to 1. The third root must satisfy
2 + x = 4,
ie
x = 2.
At this point
y = −2x + 2 = −2.
We know that this point (2, −2) is not of finite order, by Nagell-Lutz.
It follows that (1, 0) is of infinite order. Hence so is (0, 0) since 2(0, 0) =
(1, 0); and so too are (1, 1) = −(1, 0) and (0, 1) = −(0, 0).
It remains to consider the points (−1, 0) and (−1, 1) = −(−1, 0). Note
that if these are of finite order then they must be of order 3 (since there
would be just 3 points in F), ie they would be flexes.
The tangent at P = (−1, 0) has slope m = −2, and so is
y = −2x − 2.
This meets the curve where
x^3 − 4x^2 − 11x − 6 = 0.
We know that this has two roots −1. Hence the third root is given by
−2 + x = 4,
ie
x = 6,
y = −2x − 2 = −14.
So
2(−1, 0) = −(6, −14).
Again, we know by Nagell-Lutz that this point is of infinite order, and
so therefore is (−1, 0) and (−1, 1) = −(−1, 0).
To verify that P = (2, −2), for example, is not of finite order, we may
note that the tangent at this point has slope
m = −11/5.
But the tangent
y = mx + d
at P meets the curve again where
2 · 2 + x1 = m^2.
F = {0}.
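The doublings on y^2 − y = x^3 − x can be checked mechanically using the tangent construction for the general Weierstrass equation. This is a sketch only; the function name double is hypothetical and it assumes 2y + c1 x + c3 ≠ 0, ie that 2P ≠ 0.

```python
from fractions import Fraction

def double(P, c1, c2, c3, c4, c6):
    """2P on y^2 + c1*x*y + c3*y = x^3 + c2*x^2 + c4*x + c6.

    c6 is unused; it is kept so the arguments mirror the equation.
    """
    x, y = Fraction(P[0]), Fraction(P[1])
    m = (3 * x * x + 2 * c2 * x + c4 - c1 * y) / (2 * y + c1 * x + c3)
    x3 = m * m + c1 * m - c2 - 2 * x      # third root of the cubic in x
    y_line = y + m * (x3 - x)             # third intersection P * P
    return (x3, -y_line - c1 * x3 - c3)   # negate, using -P = (x, -y - c1*x - c3)

curve = (0, 0, -1, -1, 0)                 # y^2 - y = x^3 - x
for P in [(0, 0), (1, 0), (-1, 0), (2, 3)]:
    print(P, "->", double(P, *curve))
```

The last doubling produces a point with denominator 25 in its x-coordinate, so the points in this chain cannot have finite order.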
Chapter 7
Reduction modulo p
Proof I We can ensure that the coordinates Xi are all integral, by multiplying
by the lcm of the denominators; and then we can ensure that not all the Xi
are divisible by p by dividing by the highest power of p dividing all the Xi.
It remains to show that the resulting point P̄ ∈ P^n(GF(p)) is uniquely
determined by the point P. Suppose we have two such expressions for P:
P = [X0, X1, . . . , Xn] = [X0′, X1′, . . . , Xn′].
Then
[X0′, X1′, . . . , Xn′] = ρ[X0, X1, . . . , Xn]
for some ρ ∈ Q^×. Let
ρ = r/s,
where gcd(r, s) = 1. Then
sXi′ = rXi
for all i. Clearly p ∤ r; for otherwise p | Xi′ for all i. Similarly p ∤ s. But then
s̄X̄i′ = r̄X̄i
ie
[X̄0′, X̄1′, . . . , X̄n′] = ρ̄[X̄0, X̄1, . . . , X̄n],
where ρ̄ = r̄/s̄.
Thus the two representations of P give the same point P̄. J
P^n(Q) → P^n(GF(p)) : P ↦ P̃
reduction modulo p.
It is not necessary to choose integral coordinates for reduction; it is
sufficient that they be p-integral, that is, of the form c = a/b, where a, b are
integers with p ∤ b. Note that if c is p-integral then the ‘remainder’ c̃ = ã/b̃
modulo p is well-defined. The following result is readily verified.
P = [X0 , . . . , Xn ],
where X0 , . . . , Xn are p-integral but X̄0 , . . . , X̄n do not all vanish. Then
P̄ = [X̄0 , . . . , X̄n ].
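Reduction of a rational projective point can be sketched as follows: clear denominators, remove the common factor, then reduce each coordinate. The function name reduce_point is hypothetical, and scaling the last non-zero residue to 1 is a normalisation chosen here so that equal points compare equal; the three-argument pow with exponent −1 assumes Python 3.8 or later.

```python
from fractions import Fraction
from functools import reduce
from math import gcd

def reduce_point(coords, p):
    """Reduce a point of P^n(Q) modulo the prime p."""
    coords = [Fraction(c) for c in coords]
    if not any(coords):
        raise ValueError("[0, ..., 0] is not a projective point")
    lcm_den = reduce(lambda a, b: a * b // gcd(a, b),
                     (c.denominator for c in coords), 1)
    ints = [int(c * lcm_den) for c in coords]      # integral coordinates
    content = reduce(gcd, ints)
    ints = [v // content for v in ints]            # now not all divisible by p
    residues = [v % p for v in ints]
    last = next(v for v in reversed(residues) if v != 0)
    inv = pow(last, -1, p)
    return tuple((v * inv) % p for v in residues)

P = [Fraction(1, 2), 3, Fraction(5, 4)]
same_P = [Fraction(3, 14), Fraction(9, 7), Fraction(15, 28)]  # P scaled by 3/7
print(reduce_point(P, 3), reduce_point(same_P, 3))
```

Both representations of the point give the same reduction, as the uniqueness argument requires.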
P ∈ Γ =⇒ P̄ ∈ Γ̄.
aX + bY + cZ = 0.
We can ensure that a, b, c are integral, by multiplying by the lcm of their
denominators, and we can ensure that a, b, c are not all divisible by p, by
dividing a, b, c by a suitable power of p; and then we set
`¯ : āX + b̄Y + c̄Z = 0.
If now P = [X, Y, Z] where X, Y, Z are all integers, but not all are divisible
by p, then
aX + bY + cZ = 0 =⇒ āX̄ + b̄Ȳ + c̄Z̄ = 0.
Thus P̄ lies on the line
ℓ̄ : āX + b̄Y + c̄Z = 0.
Now suppose Γ is a curve in P2 (Q), given by the homogeneous polynomial
equation
F (X, Y, Z) = 0.
We can ensure that all the coefficients of F are integral, but not all divisible
by p; and then we can define the polynomial
F̄(X, Y, Z) ∈ GFp[X, Y, Z],
by taking each coefficient of F mod p.
Suppose P = [X, Y, Z] where X, Y, Z ∈ Z but not all are divisible by p.
Then
P ∈ Γ ⇐⇒ F (X, Y, Z) = 0 =⇒ F̄ (X̄, Ȳ , Z̄) = 0 ⇐⇒ P̄ ∈ Γ̄.
J
Proposition 7.4 The reduction Ẽ of E modulo p is good if and only if p ≠ 2
and
p ∤ D,
where
D = −4a^3c + a^2b^2 + 18abc − 4b^3 − 27c^2
is the discriminant of the polynomial p(x) = x^3 + ax^2 + bx + c.
Theorem 7.1 Suppose the elliptic curve E(Q) has good reduction modulo p.
Then the map
E(Q) → E(GFp) : P ↦ P̃
is a homomorphism.
Proof I The zero point on E certainly maps into the zero point on Ẽ:
[0, 1, 0] ↦ [0, 1̃, 0].
Suppose the 3 points P, Q, R ∈ E(Q) satisfy
P + Q + R = 0.
In other words P, Q, R lie on a line
l : ax + by + cz = 0.
Let l̃ be the reduction of l modulo p. Evidently l̃ is a line in P2 (GFp), which
contains P̃ , Q̃, R̃ by Proposition ??.
We need to be a little careful at this point. If P̃ , Q̃, R̃ are distinct then it
follows that
P̃ + Q̃ + R̃ = 0.
But can we be certain of this conclusion if 2 or all 3 of these points coincide?
It’s not difficult to see that we can.
Lemma 4 Suppose the line l meets the curve Γ ⊂ P2 (Q) of degree n in the
n rational points P1 , . . . , Pn (each repeated according to multiplicity). Then
l̃ meets Γ̃ in P̃1 , . . . , P̃n (each repeated according to multiplicity).
Proof of Lemma B Choose 2 points
for some c ∈ Q.
C
Thus
P + Q + R = 0 =⇒ P̄ + Q̄ + R̄ = 0.
Since it is readily verified that
−P = −P̄ ,
E(Q) : y^2 = x^3 + ax^2 + bx + c
has good reduction at the prime p. Let T ⊂ E(Q) be the torsion subgroup
(formed by the points of finite order). Then the reduction map
ρ : E(Q) → E(GFp),
Proof I We know by the Nagell-Lutz Theorem 6.1 that the non-zero points
P = (X, Y ) ∈ T
P̃ = [X̃, Ỹ , 1] = (X̃, Ỹ )
ker ρ = {0},
and so ρ is injective. J
7.2 An example
By Theorem 7.2, the torsion subgroup T of E(Q) has an isomorphic image
in E(GFp) for every good prime p. We can often exploit this to determine
T.
In general, the Nagell-Lutz Theorem provides a surer method of deter-
mining T . But there may be cases where the method below is quicker.
As an illustration, let us look at the curve
E(Q) : y^2 = x^3 + x + 1.
Since
D = −31.
E has good reduction at all odd primes p except 31.
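The discriminant test can be checked mechanically. The following is a minimal sketch, assuming the criterion of Proposition 7.4 as stated above; the function names are illustrative and not part of the notes.

```python
def cubic_discriminant(a, b, c):
    """Discriminant of x^3 + a x^2 + b x + c (formula of Proposition 7.4)."""
    return -4*a**3*c + a**2*b**2 + 18*a*b*c - 4*b**3 - 27*c**2

def has_good_reduction(a, b, c, p):
    """Criterion of Proposition 7.4: p is odd and p does not divide D."""
    return p != 2 and cubic_discriminant(a, b, c) % p != 0

# The curve y^2 = x^3 + x + 1 has a = 0, b = 1, c = 1.
D = cubic_discriminant(0, 1, 1)
primes = (2, 3, 5, 7, 11, 13, 29, 31, 37)
bad = [p for p in primes if not has_good_reduction(0, 1, 1, p)]
print(D, bad)
```

Among the primes listed, only 2 and 31 fail the test, in agreement with the statement above.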
Consider first reduction at p = 3. If (x, y) ∈ E(GF3) then x^3 + x + 1 must
be a quadratic residue modulo 3, ie
x^3 + x + 1 = 0 or 1 mod 3.
This does not hold if x = 2 = −1; but it does hold in the other 2 cases
x = 0 and x = 1.
We know that the point (X, Y ) has order 2 if and only if Y = 0. In this case
there is just 1 such point, namely (1, 0). Thus E(GF3) is of order 4, and has
1 element of order 2. Consequently,
E(GF3) ≅ Z/(4).
Now consider the curve defined by the same equation over GF5. We have
x^3 + x + 1 = 0, 1 or 4 mod 5.
This does not hold if x = 1 mod 5. The other cases yield the points:
Thus
|E(GF5)| = 9,
and so
E(GF5) = Z/(3) ⊕ Z/(3) or Z/(9).
We leave it to the reader to determine which is the case.
This does not affect our present purpose, since in either case
by Lagrange’s Theorem.
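The two point counts are small enough to confirm by brute force. The sketch below (helper names are hypothetical) enumerates the affine points over GF3 and GF5 and adds the point at infinity; since the torsion subgroup embeds in both reduced groups, its order must divide the gcd of the two counts.

```python
from math import gcd

def affine_points(a, b, c, p):
    """All (x, y) in GF(p)^2 with y^2 = x^3 + a x^2 + b x + c."""
    return [(x, y) for x in range(p) for y in range(p)
            if (y*y - (x**3 + a*x*x + b*x + c)) % p == 0]

def group_order(a, b, c, p):
    """Number of points on the reduced curve, including O = [0, 1, 0]."""
    return len(affine_points(a, b, c, p)) + 1

n3 = group_order(0, 1, 1, 3)     # y^2 = x^3 + x + 1 over GF3
n5 = group_order(0, 1, 1, 5)     # the same equation over GF5
torsion_bound = gcd(n3, n5)      # the order of T divides this number
print(n3, n5, torsion_bound)
```

Because 4 and 9 are coprime, the bound is 1, so the torsion subgroup of this curve is trivial.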
Γ = `C,
Proof I
Proof of Lemma B We may assume (after a suitable projective transformation)
that the equation has no terms of the first order:
Γ : ax^2 + 2hxy + by^2 + O(x^3, y^3) = 0.
But any line y = mx through P meets Γ where
(a + 2hm + bm^2)x^2 + O(x^3) = 0,
with a double root (at least) at x = 0, ie at (0, 0). C
Now suppose P, Q are singularities. Then the line P Q meets Γ twice at
P and twice at Q, by the Lemma. Thus the line meets Γ four times, which
is impossible. Hence there is at most one singularity. J
Singularities on cubic curves divide into two kinds: nodes and cusps.
These are distinguished as follows: Let us move the singularity to (0, 0).
Then
F(X, Y, Z) = aX^2 + 2hXY + bY^2 + O(X, Y)^3.
This root α ∈ k. For if α is a double root then gcd(p(x), p′(x)) = x − α,
and we can compute this gcd by Euclid’s algorithm within the ring k[x];
while if α is a triple root then 3α = −b.
Thus we may assume that α = 0, after the transformation x ↦ x − α.
Our equation now takes the form
y^2 = x^3 + ax^2.
Note that the second-order terms are y^2 − ax^2. This has distinct factors
unless a = 0. Thus by the definition above, the singularity is a cusp if a = 0,
and a node if a ≠ 0. (This accords with the look of the curve if k = R.)
Let us consider the case where the singularity is a cusp first. Our equation
is
y^2 = x^3.
We parametrize Γ \ {(0, 0)} by the map
\[ k \to \Gamma : t \mapsto \begin{cases} (t^{-2}, t^{-3}) & \text{if } t \neq 0, \\ [0, 1, 0] & \text{if } t = 0. \end{cases} \]
In other words,
P(t) = [t, 1, t^3]
for all t ∈ k.
Suppose the points P, Q, R with parameters p, q, r lie on the line
aX + bY + cZ = 0.
at + b + ct^3 = 0.
p + q + r = 0.
P + Q + R = 0 ⇐⇒ p + q + r = 0.
−P (t) = P (−t).
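The collinearity criterion for the cuspidal cubic can be tested with exact rational arithmetic. This is a small sketch, assuming the parametrization P(t) = [t, 1, t^3] above; the helper `det3` is introduced only for the check.

```python
from fractions import Fraction

def P(t):
    """Homogeneous coordinates [X, Y, Z] = [t, 1, t^3] of the point with parameter t."""
    return (t, Fraction(1), t**3)

def det3(u, v, w):
    """3x3 determinant; it vanishes exactly when the three points are collinear."""
    return (u[0]*(v[1]*w[2] - v[2]*w[1])
            - u[1]*(v[0]*w[2] - v[2]*w[0])
            + u[2]*(v[0]*w[1] - v[1]*w[0]))

def on_cusp(pt):
    """Homogeneous form Y^2 Z = X^3 of the curve y^2 = x^3."""
    X, Y, Z = pt
    return Y*Y*Z == X**3

p, q = Fraction(1, 2), Fraction(3)
collinear = det3(P(p), P(q), P(-p - q))          # parameters sum to 0
not_collinear = det3(P(p), P(q), P(Fraction(1)))  # parameters sum to 9/2
print(collinear, not_collinear)
```

The first determinant vanishes and the second does not, as the relation p + q + r = 0 predicts for distinct parameters.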
It follows that the map
and so
\[ \frac{(t+1)^2}{(t-1)^2}\,x^2 = x^3 + x^2, \]
ie
\[ x = \frac{4t}{(t-1)^2} \]
and
\[ y = \frac{4t(t+1)}{(t-1)^3}. \]
In homogeneous terms
k → Γ : t → P (t)
is bijective, with t = 0 corresponding to the singular point (0, 0). Thus we
have a one-one correspondence between t ∈ k× and P ∈ Γ \ {(0, 0)}.
Suppose the points P, Q, R with parameters p, q, r lie on the line
aX + bY + cZ = 0.
pqr = 1.
Thus
P + Q + R = 0 ⇐⇒ pqr = 1.
In addition, it is readily verified that
−P (t) = P (1/t).
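The multiplicative rule for the nodal cubic y^2 = x^3 + x^2 can be checked in the same way. The sketch below uses the affine parametrization x = 4t/(t − 1)^2, y = 4t(t + 1)/(t − 1)^3 derived above; the function names are illustrative.

```python
from fractions import Fraction

def node_point(t):
    """Affine point (x, y) on y^2 = x^3 + x^2 with parameter t (t not 0 or 1)."""
    x = 4*t / (t - 1)**2
    y = 4*t*(t + 1) / (t - 1)**3
    return x, y

def collinear(A, B, C):
    """True when the affine points A, B, C lie on one line."""
    (x1, y1), (x2, y2), (x3, y3) = A, B, C
    return (x2 - x1)*(y3 - y1) == (x3 - x1)*(y2 - y1)

p, q = Fraction(2), Fraction(3)
P, Q = node_point(p), node_point(q)
R_good = node_point(1 / (p*q))     # third parameter chosen so that pqr = 1
R_bad = node_point(Fraction(5))    # here pqr = 30
print(collinear(P, Q, R_good), collinear(P, Q, R_bad))
```

Only the triple with pqr = 1 is collinear, and replacing t by 1/t reflects the point in the x-axis.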
E(GFp) : y^2 = x^3 + ax^2 + bx + c.
If (x, y) ∈ E then
p(x) = x^3 + ax^2 + bx + c
must be a quadratic residue mod p. Of the numbers {1, 2, . . . , p − 1} just
(p − 1)/2 are quadratic residues, namely
Thus if the values of p(x) mod p are randomly distributed, the expectation
would be that p(x) = 0 for one x, and that p(x) would be a quadratic residue
for (p−1)/2 values of x. The former would give one point (x, 0) on the curve;
each of the latter would give two points (x, ±y). Thus the expected number
of points is
\[ 1 + 2\cdot\frac{p-1}{2} = p. \]
To this must be added the point O = [0, 1, 0], giving p + 1 points in all.
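The heuristic count can be compared with actual counts for small primes. This is a rough sketch (the function names are hypothetical) that tallies points exactly as in the argument above: one point when p(x) = 0, two when p(x) is a non-zero square, plus the point O.

```python
def quadratic_residues(p):
    """Non-zero squares modulo an odd prime p."""
    return {(y*y) % p for y in range(1, p)}

def count_points(a, b, c, p):
    """Points on y^2 = x^3 + a x^2 + b x + c over GF(p), including O."""
    squares = quadratic_residues(p)
    total = 1                                   # the point O = [0, 1, 0]
    for x in range(p):
        v = (x**3 + a*x*x + b*x + c) % p
        if v == 0:
            total += 1                          # one point (x, 0)
        elif v in squares:
            total += 2                          # two points (x, y) and (x, -y)
    return total

for p in (7, 11, 13, 101):
    # prime, number of non-zero residues, actual count, heuristic p + 1
    print(p, len(quadratic_residues(p)), count_points(0, 1, 1, p), p + 1)
```

The actual counts fluctuate around p + 1; the printed table shows how far each one is from the heuristic value.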
5 ≤ ‖E(GF7)‖ ≤ 11.
Chapter 8
f(z) = c0 + c1 (z − z0 ) + c2 (z − z0 )^2 + · · · .
This power-series will be dominated by its first non-zero term, and it is easy
to deduce that
0 < |z − z0 | < C ⟹ f(z) ≠ c0
for some constant C > 0. It follows that there is no non-zero period with
|ω| < C. J
Note that as an abelian group, C ≅ R^2.
U = ⟨s1 , . . . , sm−1 ⟩.
S0 = S ∩ U
s = λ1 t1 + · · · + λm−1 tm−1 + λm sm .
say.
But since S is a discrete subgroup, it has only a finite number of elements
in the compact disk |v| ≤ R. Thus we need only consider a finite number
of elements s ∈ S when minimizing |λm |; and so the minimum is certainly
attained, at tm say.
Now suppose s ∈ S. Evidently t1 , . . . , tm−1 , tm is a basis for V . Hence
s = µ1 t1 + · · · + µm−1 tm−1 + µm tm .
s0 = s − nm tm
= µ1 t1 + · · · + µm−1 tm−1 + (µm − nm )tm
where
\[ \det\begin{pmatrix} a & b \\ c & d \end{pmatrix} = \pm 1. \]
Proof I Suppose first that λ/µ ∈ Q, say
λ/µ = m/n.
Then
nλ = mµ,
ie λ, µ are not linearly independent.
(Alternatively, we may suppose that gcd(m, n) = 1. Then there exist
a, b ∈ Z such that
am + bn = 1
Thus
aλ + bµ = µ/n ∈ Λ,
and
λ = m(µ/n), µ = n(µ/n),
ie λ and µ are both multiples of a smaller period.)
Now suppose that
λ/µ ∈ R \ Q,
ie the ratio is real but irrational.
Proof of Lemma B Choose N with 1/N < ε. For x ∈ R, let {x} denote the
fractional part of x, ie
{x} = x − [x].
Consider the N + 1 fractional parts
By the Pigeon-Hole Principle, two of the fractional parts, say {rα}, {sα},
must lie in the same subinterval. But then
ie
ie
|mα − n| < ε,
where m = r − s, n = [rα] − [sα]. C
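The pigeon-hole argument is constructive, and can be run directly. The following is a minimal sketch under the assumption that floating-point arithmetic is accurate enough for the chosen α and N; the function name `approximate` is illustrative.

```python
from math import floor, sqrt

def approximate(alpha, N):
    """Find integers m, n with 0 < m <= N and |m*alpha - n| < 1/N."""
    seen = {}                                  # subinterval index -> multiplier r
    for r in range(N + 1):
        frac = r*alpha - floor(r*alpha)        # the fractional part {r*alpha}
        box = min(int(frac * N), N - 1)        # which of the N subintervals of [0, 1)
        if box in seen:
            s = seen[box]                      # {r*alpha} and {s*alpha} share a box
            return r - s, floor(r*alpha) - floor(s*alpha)
        seen[box] = r
    raise AssertionError("unreachable: N + 1 values in N boxes")

m, n = approximate(sqrt(2), 10)
print(m, n, abs(m*sqrt(2) - n))
```

The returned pair is m = r − s, n = [rα] − [sα], exactly as in the proof.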
By the Lemma, we can find m, n ∈ Z such that
|m(λ/µ) − n| < ε.
Hence
8.2 Applications of Cauchy’s Theorem
Let us recall some fundamental results from complex analysis:
\[ |f'(a)| \le \frac{1}{2\pi}\cdot\frac{2\pi R\,c}{R^2} = \frac{c}{R}. \]
Since R is arbitrary it follows that f′(a) = 0 for all a, and so f(z) is
constant.
5. Suppose the meromorphic function f(z) has zeros at a1 , a2 , . . . , ar and
poles at b1 , b2 , . . . , bs inside C; and suppose f(z) has no poles or zeros
on C. Then
\[ \frac{1}{2\pi i}\int_C \frac{f'(z)}{f(z)}\,dz = r - s, \]
with the understanding that poles and zeros are counted with appropriate
multiplicity, eg a double zero is counted twice. For the function
f′(z)/f(z) has a simple pole with residue d at a zero of order d, and a
simple pole with residue −d at a pole of order d.
6. With the same assumptions,
\[ \frac{1}{2\pi i}\int_C z\,\frac{f'(z)}{f(z)}\,dz = (a_1 + \cdots + a_r) - (b_1 + \cdots + b_s). \]
For if f(z) has a zero at a of order m then zf′(z)/f(z) has a simple
pole at a with residue ma; while if f(z) has a pole at b of order n then
zf′(z)/f(z) has a simple pole at b with residue −nb.
is holomorphic in U , with
\[ f'(z) = \sum u_n'(z). \]
Notice that this is much simpler to prove than the corresponding result
for real functions, using the fact that
\[ f(a) = \frac{1}{2\pi i}\int_C \frac{f(z)}{z - a}\,dz, \]
Proposition 8.4 An elliptic function f (z) with no poles is necessarily con-
stant.
Proof I By the Proposition, the residue c at a single pole must vanish. But
a simple pole cannot have zero residue. J
Thus an elliptic function has to have at least 2 poles (or a double pole)
in each fundamental parallelogram.
c1 + · · · + cr = 0.
Note that in this case the poles are not counted according to their multi-
plicity.
Proof I This follows at once from the fact that
\[ \frac{1}{2\pi i}\int_\Pi f(z)\,dz = 0. \]
J
Proposition 8.7 Suppose f(z) is an elliptic function; and suppose Π is a
fundamental parallelogram, containing no poles or zeros of f(z) on its boundary.
Let the zeros of f(z) inside Π be a1 , . . . , ar , and let the poles inside Π
be b1 , . . . , br (each repeated according to its multiplicity). Then
a1 + · · · + ar ≡ b1 + · · · + br mod Λ.
Proof I Let λ, µ be a basis for the lattice Λ, so that
ω = mλ + nµ (m, n ∈ Z).
Q(x, y) − C1 (x2 + y 2 )
Q(x, y) ≤ (A + B + C)(x2 + y 2 ).
C
Geometrically, this Lemma states that concentric circles can be drawn
inside and outside an ellipse.
This converges if and only if 1 − 2e < −1, ie e > 1.
To see that S and I converge or diverge together, we note that if m ≥
0, n ≥ 0 then
1 1 1
≤ 2 ≤
((m + 1)2 2
+ (n + 1) )e 2
(x + y )e (m + y 2 )e
2
1 1 2|z| 3
| − | ≤ .
(z − ω)2 ω 2 |ω|
Since Σ 1/|ω|^3 is convergent, it follows that the series
\[ \sum_{|\omega| \ge 2C} \left( \frac{1}{(z-\omega)^2} - \frac{1}{\omega^2} \right) \]
Definition 8.6 The Weierstrass elliptic function ϕ(z) with respect to the
lattice Λ ⊂ C is defined by
\[ \varphi(z) = \frac{1}{z^2} + \sum_{\omega \in \Lambda,\ \omega \neq 0} \left( \frac{1}{(z-\omega)^2} - \frac{1}{\omega^2} \right). \]
f (z + ω0 ) = f (z).
The result would be obvious if we could separate ϕ(z) into a variable part
1/z^2 + Σ 1/(z − ω)^2 and a constant part Σ 1/ω^2. Unfortunately these 2
parts do not converge separately, so a more careful approach, which we
sketch below, is required.
Given ε > 0, choose R so large that
\[ \sum_{|\omega| \ge R} \frac{1}{|\omega|^3} < \epsilon \quad\text{and}\quad \sum_{|\omega| \ge R} \frac{1}{|z-\omega|^3} < \epsilon; \]
and let
ϕ(z) = F(z) + R(z),
where
\[ F(z) = \frac{1}{z^2} + \sum_{\omega \neq 0,\ |\omega| \le R+|z|+|\omega_0|} \left( \frac{1}{(z-\omega)^2} - \frac{1}{\omega^2} \right) \]
and
\[ R(z) = \sum_{|\omega| > R+|z|+|\omega_0|} \left( \frac{1}{(z-\omega)^2} - \frac{1}{\omega^2} \right). \]
Then
8.4 The Field of Elliptic Functions
Proposition 8.11 ϕ(z) is even.
Proposition 8.12 The elliptic functions with respect to Λ form a field
over C, of which the even functions form a sub-field.
Proof I If f(z), g(z) are elliptic with respect to Λ, then so are f(z) ± g(z),
f(z)g(z) and f(z)/g(z); and the same is true if f(z), g(z) are even. J
Proposition 8.13 An odd elliptic function f(z) has a pole or zero at every
semilattice point σ.
2σ = ω ∈ Λ
ie
−σ = σ − ω,
it follows that
f (−σ) = f (σ − ω) = f (σ).
On the other hand, since f(z) is odd,
f (−σ) = −f (σ).
Hence
f (σ) = 0,
ie σ is a zero of f (z). J
Proof I Suppose σ is a zero of f(z). Since f^(1)(z) = f′(z) is odd, f^(2)(z) = f″(z)
is even, f^(3)(z) is odd, etc,
Thus the first n for which f^(n)(σ) ≠ 0 is even. Hence the order of the zero
is even.
If f(z) has a pole at σ then the result follows on considering 1/f(z). J
Theorem 8.1 The field k of even elliptic functions with respect to Λ is
generated over C by the Weierstrass elliptic function: k = C(ϕ(z)). In other
words, every even elliptic function f(z) is expressible as a rational function of
ϕ(z):
\[ f(z) = \frac{P(\varphi(z))}{Q(\varphi(z))}, \]
where P, Q are polynomials.
Proof I If f(z) has a pole or zero at 0, it must have even multiplicity since
f(z) is even. Thus we can find e ∈ Z such that
Proposition 8.14 Every elliptic function f (z) is expressible in the form
is even, and so
f(z) = F(z) + ϕ′(z)H(z),
where F (z) and H(z) are both even elliptic functions. The result now follows
from the previous Proposition. J
K = C(ϕ(z), ϕ′(z)).
Proof I The function on the left has a 6-fold pole at z = 0, and double zeros
at each semilattice point. The function on the right also has a 6-fold pole at
z = 0. Consider the function f(z) = ϕ(z) − ϕ(ei ). This has a zero at ei ; and
it is a double zero since f′(ei ) = ϕ′(ei ) = 0.
Thus the function on the right has exactly the same poles and zeros as
the function on the left. Hence they differ only by a multiplicative constant
(since their ratio has no poles or zeros).
The value of this constant follows on considering the coefficients of 1/z^6
on both sides:
\[ \varphi(z) = \frac{1}{z^2} + h(z) \implies \varphi'(z) = -\frac{2}{z^3} + O(z) \implies \varphi'(z)^2 = \frac{4}{z^6} + O\!\left(\frac{1}{z^2}\right). \]
J
Theorem 8.2 The functional equation satisfied by ϕ(z) takes the form
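where
\[ g_2 = \sum_{\omega \in \Lambda,\ \omega \neq 0} \frac{1}{\omega^4}, \qquad g_3 = \sum_{\omega \in \Lambda,\ \omega \neq 0} \frac{1}{\omega^6}. \]
\[ {} = \frac{1}{z^2} + 3g_2 z^2 + 5g_3 z^4 + O(z^6). \]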
Differentiating,
\[ \varphi'(z) = -\frac{2}{z^3} + 6g_2 z + 20g_3 z^3 + O(z^5). \]
Thus
\[ \varphi'(z)^2 = \frac{4}{z^6} - \frac{24g_2}{z^2} - 80g_3 + O(z^2), \]
while
\[ \varphi(z)^3 = \frac{1}{z^6} + \frac{9g_2}{z^2} + 15g_3 + O(z^2), \]
and
\[ \varphi(z)^2 = \frac{1}{z^4} + 6g_2 + O(z^2). \]
Substituting in the functional equation,
\[ \frac{4}{z^6} - \frac{24g_2}{z^2} - 80g_3 = \frac{4}{z^6} + \frac{36g_2}{z^2} + 60g_3 + \frac{a}{z^4} + 6ag_2 + \frac{b}{z^2} + c + O(z^2). \]
Comparing coefficients,
a = 0, b = −60g2 , c = −140g3 ,
as stated. J
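The coefficient comparison can be replayed with truncated Laurent series. The sketch below represents a series as a dictionary from exponents to rational coefficients and treats g2, g3 as arbitrary test values; these helper functions are assumptions made for the check, not part of the notes.

```python
from fractions import Fraction

def mul(f, g, cutoff):
    """Multiply Laurent polynomials {exponent: coeff}, dropping exponents >= cutoff."""
    out = {}
    for i, a in f.items():
        for j, b in g.items():
            if i + j < cutoff:
                out[i + j] = out.get(i + j, 0) + a*b
    return out

def add(f, g, scale=1):
    """Return f + scale*g."""
    out = dict(f)
    for j, b in g.items():
        out[j] = out.get(j, 0) + scale*b
    return out

g2, g3 = Fraction(7, 3), Fraction(-5, 11)        # arbitrary test values
phi = {-2: Fraction(1), 2: 3*g2, 4: 5*g3}        # phi(z) up to O(z^6)
dphi = {e - 1: e*c for e, c in phi.items()}      # term-by-term derivative

CUT = 2                                          # only terms below z^2 are reliable
lhs = mul(dphi, dphi, CUT)                       # phi'(z)^2
cube = mul(mul(phi, phi, 6), phi, CUT)           # phi(z)^3
rhs = add(add({0: -140*g3}, cube, 4), phi, -60*g2)
diff = {e: c for e, c in add(lhs, rhs, -1).items() if e < CUT and c != 0}
print(diff)
```

An empty dictionary means that ϕ′(z)^2 and 4ϕ(z)^3 − 60g2 ϕ(z) − 140g3 agree in every term below z^2, which is the content of the comparison above.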
Proof I Suppose (x, y) = [x, y, 1] ∈ E. Consider the elliptic function
f (z) = ϕ(z) − x.
This has a double pole at the points of Λ, and so has two zeros in any
fundamental parallelogram Π. Since f (z) is even, the two zeros are ±a mod
Λ. But there are just two points (x, ±y) on E with a given x-coordinate. It
follows that each point (x, y) ∈ E arises from some z ∈ C, ie Φ is surjective.
Since ϕ(z) and ϕ′(z) are both doubly-periodic,
z1 ≡ z2 mod Λ ⟹ Φ(z1 ) = Φ(z2 ).
Conversely, if ϕ(z1 ) = ϕ(z2 ) then the argument above shows that z1 ≡
±z2 mod Λ. Since ϕ′(−z) = −ϕ′(z), it follows that
Φ(z1 ) = Φ(z2 ) ⟹ z1 ≡ z2 mod Λ.
The map Φ is certainly continuous at all points z ∉ Λ, since ϕ(z) and
ϕ′(z) are both differentiable, and so a fortiori continuous. It remains to show
that Φ is continuous at 0. In the neighbourhood of 0 ∈ E,
\[ (\varphi(z), \varphi'(z)) = \left( \frac{1}{z^2} + \cdots,\ -\frac{2}{z^3} + \cdots \right). \]
Changing to X, Z coordinates, where [x, y, 1] = [X, 1, Z], ie
\[ X = \frac{x}{y}, \qquad Z = \frac{1}{y}, \]
we see that
\[ X = -\frac{1}{2}z + O(z^3), \qquad Z = -\frac{1}{2}z^3 + O(z^5). \]
It follows that Φ is continuous at 0, and so at the other points of Λ. J
8.7 The Addition Formula
Suppose u, v ∈ C \ Λ, with u ≢ v mod Λ. Then we can find A, B, C ∈ C such
that
This has a triple pole (at most) at each lattice point z ∈ Λ. Hence it has 3
zeros a1 , a2 , a3 in any fundamental parallelogram Π, satisfying
a1 + a2 + a3 ≡ 0 mod Λ,
Aϕ(u + v) − Bϕ′(u + v) + C = 0.
Thus, eliminating A, B, C,
ϕ(u + v) −ϕ0 (u + v) 1
This expresses Φ(u + v) = (ϕ(u + v), ϕ′(u + v)) in terms of Φ(u) and Φ(v).
u + v + w = 0.
In other words the 3 points Φ(u), Φ(v), Φ(w) lie on the line
Ax + By + C = 0.
Φ(u) = [0, 1, 0], Φ(v) = [ϕ(v), ϕ′(v), 1], Φ(w) = [ϕ(v), −ϕ′(v), 1]
E(C) ≅ T^2.
In one sense this result is of little practical value, since we already know
that
E(R) = T1 or T1 ⊕ Z/(2),
and this gives us more information about E(Q). For example, the result for
E(R) tells us that the torsion subgroup F , formed by the points of E(Q) of
finite order, is either cyclic Z/(n), or else of the form Z/(2) ⊕ Z/(n). The
result for E(C) only tells us that F is either cyclic Z/(n), or else of the form
Z/(m) ⊕ Z/(n).
Perhaps the main interest of the complex case is that it explains in a
natural way why there is a group structure on E.
It is natural to ask: Does every elliptic curve over C arise in this way from
some lattice Λ?
Suppose s ∈ C× . Consider the lattice
sΛ = {sω : ω ∈ Λ}.
In particular, sΛ gives rise to the elliptic curve
y^2 = x^3 − 15s^{−4} g2 (Λ)x − 35s^{−6} g3 (Λ).
But this is just the equation we get if we make the transformation
x ↦ s^{−2} x, y ↦ s^{−3} y,
since the coefficients of x and 1 in the Weierstrass equation have weights 4
and 6, respectively. Thus similar lattices give rise to projectively equivalent
elliptic curves.
In effect, therefore, we are only concerned with lattices up to similarity.
In other words, we are concerned with the ratio
τ = λ/µ
rather than with the basis elements λ, µ themselves. (For the lattice ⟨1, τ⟩ is
similar to the lattice ⟨λ, µ⟩.)
Recall that τ ∉ R. Thus τ either lies in the upper half-plane
H = {z ∈ C : ℑ(z) > 0}
or else in the lower half-plane −H. It is convenient to restrict ourselves to
bases λ, µ with λ/µ ∈ H. Let us say that the basis is positive in this case.
(Note that just one of λ, µ and −λ, µ is positive; so we can always make a
basis positive by replacing λ with −λ if necessary.)
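Recall that if λ′, µ′ is another basis then
\[ \begin{pmatrix} \lambda' \\ \mu' \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} \lambda \\ \mu \end{pmatrix}, \]
where a, b, c, d ∈ Z and ad − bc = ±1. On setting τ′ = λ′/µ′ this becomes
\[ \tau' = \frac{a\tau + b}{c\tau + d}. \]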
The following result, although apparently rather technical, will prove very
useful.
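Proof I We have
\[ \begin{aligned} \Im(\tau') &= \frac{1}{2i}\left( \tau' - \bar{\tau}' \right) \\ &= \frac{1}{2i}\left( \frac{a\tau + b}{c\tau + d} - \frac{a\bar{\tau} + b}{c\bar{\tau} + d} \right) \\ &= \frac{1}{2i}\,\frac{(ad - bc)(\tau - \bar{\tau})}{(c\tau + d)(c\bar{\tau} + d)} \\ &= \frac{\det T}{|c\tau + d|^2}\,\Im(\tau). \end{aligned} \]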
J
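The identity just proved is easy to spot-check numerically. The following sketch uses Python's built-in complex numbers; the matrix entries and the sample point are arbitrary choices made for illustration.

```python
def mobius(a, b, c, d, tau):
    """Image of tau under the transformation with matrix (a b; c d)."""
    return (a*tau + b) / (c*tau + d)

a, b, c, d = 2, 1, 5, 3                  # integer matrix with ad - bc = 1
tau = complex(0.3, 0.7)                  # a sample point of the upper half-plane
lhs = mobius(a, b, c, d, tau).imag
rhs = (a*d - b*c) * tau.imag / abs(c*tau + d)**2
print(lhs, rhs)

# A matrix of determinant -1 sends the upper half-plane to the lower one.
flipped = mobius(0, 1, 1, 0, tau).imag
```

The two printed values agree up to rounding, and the sign of the determinant decides which half-plane the image lies in.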
\[ z \mapsto Tz = \frac{az + b}{cz + d}. \]
Notice that the matrices ±T define the same transformation.
G = SL(2, Z)/{±I}.
Thus T corresponds to the translation
z 7→ z + 1,
z 7→ −1/z.
G = hS, T i.
Proof I We have
S 2 = −I
= I,
t2 − t + 1 = 0.
Hence ST satisfies
(t + 1)(t2 − t + 1) = t3 + 1 = 0,
ie
(ST )3 = −I
= I,
Notice that we have included half the boundary of F, just as we did (and
for much the same reason) with the fundamental parallelogram Π for a lattice
Λ.
Notice too that F contains the points −ω^2 and i; these will play a special
rôle in what follows.
z0 = gz ∈ F (g ∈ G).
Remark: Note that we are not saying g ∈ G is unique (we shall deal with
that question shortly); only that z0 is unique.
Proof I The idea is to find a transform gz maximising ℑ(gz). By Proposition
11.2, if
\[ gz = \frac{az + b}{cz + d} \]
then
\[ \Im(gz) = \frac{1}{|cz + d|^2}\,\Im(z). \]
For a fixed z ∈ H, the points
{cz + d : c, d ∈ Z}
form a lattice (with basis 1,z). There are only a finite number of lattice
points inside the disk |z| ≤ 1, ie there are only a finite number of c, d ∈ Z
with
|cz + d| ≤ 1.
It follows that ℑ(gz) can only take a finite number of values ≥ ℑ(z). In
particular there must be a maximum such value, attained say at g0 z.
Now translation z ↦ z + r does not affect ℑ(z), so the maximal value is
also attained at each point T^r g0 z.
But we can choose r so that z0 = T^r (g0 z) lies in the strip
\[ S = \{ z \in H : -\tfrac{1}{2} < \Re(z) \le \tfrac{1}{2} \}. \]
We claim that this transform z0 ∈ F, or else |z0 | = 1 and Sz0 ∈ F.
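The proof suggests a practical reduction procedure: translate into the strip, and apply S whenever the point lies inside the unit circle. The sketch below is one way to carry this out with exact rational coordinates; it is the usual iterative variant rather than a literal transcription of the maximising argument, and the function names are hypothetical.

```python
from fractions import Fraction
from math import floor

def in_F(x, y):
    """Membership in F: -1/2 < Re z <= 1/2, |z| >= 1, and Re z >= 0 when |z| = 1."""
    n = x*x + y*y
    return (y > 0 and -Fraction(1, 2) < x <= Fraction(1, 2)
            and (n > 1 or (n == 1 and x >= 0)))

def reduce_to_F(x, y):
    """Move z = x + iy (y > 0, rational) into F using T: z -> z + 1 and S: z -> -1/z."""
    x, y = Fraction(x), Fraction(y)
    while True:
        x += floor(Fraction(1, 2) - x)         # power of T landing in the strip
        n = x*x + y*y                          # |z|^2
        if n > 1 or (n == 1 and x >= 0):
            return x, y
        x, y = -x / n, y / n                   # apply S; Im z grows when |z| < 1

print(reduce_to_F(5, 2), reduce_to_F(Fraction(2, 5), Fraction(1, 5)))
```

Each application of S with |z| < 1 strictly increases the imaginary part, which can take only finitely many values, so the loop terminates.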
Proof of Lemma B If z = re^{iθ} then Sz = −1/z and so
\[ \Im(Sz) = \frac{1}{r}\sin\theta > r\sin\theta = \Im(z). \]
C
In particular, |z0 | ≥ 1; for otherwise ℑ(Sz0 ) > ℑ(z0 ), contradicting the
maximality of ℑ(z0 ). If |z0 | > 1 then z0 ∈ F; while if |z0 | = 1 then either
ℜ(z0 ) ≥ 0, in which case z0 ∈ F, or else ℜ(z0 ) < 0 in which case Sz0 ∈ F.
Now suppose z, gz ∈ F. We may assume (swapping z,gz if necessary)
that
ℑ(gz) ≥ ℑ(z).
By Proposition 11.2, this implies that
|cz + d| ≤ 1.
Hence
|d| ≤ 1.
The problem is reduced to just 4 cases: (c, d) = (1, 0), (0, 1), (1, 1), (1, −1).
If c = 0 then g is a translation
gz = z + r;
Now
z ∈ F =⇒ Sz ∈ S.
Hence a = 0, ie g = S. But it is clear that
z, Sz ∈ F =⇒ |z| = 1;
|cz + d|
|z ± 1| ≥ 1,
S(x) = {g ∈ G : gx = x}.
G = hS, T i.
Proof I Let
H = hS, T i
be the subgroup of G generated by S, T .
On examining the proof of Proposition 8.3 it is clear that the argument
holds equally well with H replacing G. In particular, if z ∈ H then we can
find a transform
hz ∈ F (h ∈ H).
Now suppose g ∈ G. Choose any z ∈ F except −ω^2 or i, and consider
the transform gz. By Theorem 8.3 we can find h ∈ H such that
h(gz) ∈ F.
hgz = z;
and therefore
hg ∈ S(z) = {I},
ie
hg = I =⇒ g = h−1 ∈ H.
Remark: Note that it would not make sense to speak of a function of odd
weight, since cz + d is only determined up to ±1.
Proposition 8.21 The meromorphic function f (z) on H is weakly modular
of weight 2k if and only if
(a + c)z + (b + d)
T (gz) = gz + 1 = ;
cz + d
while
f (T gz) = f (gz)
= (cz + d)−2k f (z),
The map
Θ : z ↦ q = e^{2πiz}
maps H onto the interior of the disk
D = {z : |z| < 1}
Θ(z1 ) = Θ(z2 ) ⇐⇒ z2 − z1 ∈ Z.
f(z) = g(e^{2πiz}).
Definition 8.12 The weakly modular function f (z) is said to have a pole (or
zero) of order m at ∞ if that is true of g(q) at q = 0. It is said to be regular
at ∞ if it does not have a pole there; and in that case we set f (∞) = g(0).
Conversely, we can recover the modular function from the lattice function by
Thus Gk (z) corresponds to the lattice function
\[ g_k(\Lambda) = \sum_{\omega \in \Lambda,\ \omega \neq 0} \frac{1}{\omega^{2k}}. \]
Gk (∞) = 2ζ(2k).
Gk (z) → 2ζ(2k) as z → ∞.
It follows from this that g(q) is regular at q = 0, with g(0) = 2ζ(2k). (For
the coefficient a_{−n} in the Laurent series is given by
\[ a_{-n} = \frac{1}{2\pi i}\int_C q^{n-1} g(q)\,dq \]
round a small circle C with centre 0, and this vanishes as the radius of the
circle tends to 0.) J
Proposition 8.23 A modular function has only a finite number of poles and
zeros in F.
Proof I The function g(q) has an expansion
{z ∈ H : =(z) > er }.
On the other hand, f (z) has only a finite number of poles or zeros in the
compact set
{z ∈ F̄ : =(z) ≤ er }.
It follows that f (z) has only a finite number of poles or zeros in F. J
1. vu (f + g) ≥ min(vu (f ), vu (g)),
2. vu (f g) = vu (f ) + vu (g).
1 1 X k
vω (f ) + vi (f ) + vz (f ) = .
3 2 6
z6=ω,−ω 2 ,i
Proof I Let
\[ I = \frac{1}{2\pi i}\int_\Gamma \frac{f'(z)}{f(z)}\,dz, \]
where Γ runs round the boundary of F, truncated at the top. More precisely,
Γ = A + B + C + D + E,
where A is the line joining −ω^2 to 1/2 + Ri, B is the line joining 1/2 + Ri to
−1/2 + Ri, C is the line joining −1/2 + Ri to ω, D is the circular arc joining
ω to i, and E is the circular arc joining i to −ω^2.
Let us assume for the moment that f(z) has no poles or zeros on Γ, and
also that R is so large that all the poles or zeros of f(z) inside F are inside
Γ.
As we know, if f(z) has a pole or zero at u ∈ H then f′(z)/f(z) has a
simple pole at u with residue vu (f ). It follows that
\[ I = \sum_{u \in F} v_u(f). \]
f 0 (z)
Z Z
1
for dz.
X 2πi X f (z)
1 2k 1
f (Sz) = 2k
f (z) =⇒ f 0 (Sz) = − 2k+1 f (z) + 2k f 0 (z)
z z z
f 0 (Sz) 2k f 0 (z)
=⇒ =− + .
f (Sz) z f (z)
(In effect, f 0 (z)/f (z) = d/dz(log f (z)).) Thus the main parts of the
integral cancel out, leaving
Z Z Z
1 2k
+ = dz
D E 2πi D z
2k π/2
Z
= iθdθ
2πi 2π/3
2 1
=k −
3 2
k
=
6
3. Finally, on B we have
f (z) = g(e2πiz ).
Changing variable from z to q = e2πiz ,
f 0 (z) g 0 (q)
= 2πiq , dz = 2πiq dq,
f (z) g(q)
and so
g 0 (q)
Z Z
1
= dq,
B 2πi γ g(q)
where q runs round the small circle
γ : q = e−2πR e2πx
But as we observed,
\[ I = \sum_{u \in F} v_u(f). \]
Thus
\[ \sum_{u \in F} v_u(f) + v_\infty(f) = \frac{k}{6}, \]
as required.
It remains to deal with the case where f (z) has one or more poles or zeros
on Γ.
f (Sz) = z 2k f (z).
B 0 = B 00 + δ, C 0 = C 00 + δ1 ,
where B 00 , C 00 are slightly curtailed versions of B, C. By our argument
in the main case,
Z Z Z Z
k
+ = + +O() = + O().
B 00 C 00 B C 6
In the neighbourhood of ω,
\[ \frac{f'(z)}{f(z)} = \frac{v_\omega(f)}{z - \omega} + h(z), \]
Similarly
f 0 (z)
Z
1 vω (f )
dz = − .
2πi γ1 f (z) 6
Also Z Z
+ = 0,
A0 C0
as before. Putting the parts together,
k 1
I= − v∞ (f ) − vω (f ),
6 2
and so
1 X k
vω (f ) + vu (f ) = ,
3 u
6
as required.
Γ0 = A + B + C + D00 + δ + E 00 .
Then Z Z
k
+ E 00 = + O(),
D00 6
as in the previous case; while
\[ \frac{f'(z)}{f(z)} = \frac{v_i(f)}{z - i} + h(z) \]
as required.
Proposition 8.24 There are no modular forms of weight < 0; and the only
modular forms of weight 0 are the constants.
Proof I For a modular form f (z), vu (f ) ≥ 0 for all u. Thus if f (z) were of
weight < 0, then the left-hand side of the identity in the Theorem would be
≥ 0, while the right-hand side would be < 0.
Similarly, if k = 0 then the only way the identity could be satisfied is if
vu (f ) = 0 for all u (including ∞). But then f (z) − f (∞) is a modular form
of weight 0 with v∞ (f ) > 0, which is a contradiction unless the function is
identically zero, ie f (z) = f (∞) is constant. J
we have
\[ \frac{a}{3} + \frac{b}{2} + c = \frac{1}{6}, \]
with a, b ∈ N, which is manifestly impossible. J
It is easy enough to prove this directly; since Sω = −ω^2,
\[ G(-\omega^2) = \frac{1}{\omega^4}\,G(\omega) = \omega^2 G(\omega), \]
while since −ω^2 = ω + 1,
G(−ω^2) = G(ω),
Similarly, since Si = i,
\[ G_3(i) = \frac{1}{i^6}\,G_3(i) = -G_3(i). \]
Recall that the discriminant ∆(E) of the elliptic curve
y^2 = x^3 + bx + c
was defined to be
∆ = 2^4 D,
where
D = −(4b^3 + 27c^2)
is the discriminant of the polynomial on the right. (The factor 2^4 was introduced
to allow the discriminant of the general Weierstrassian elliptic curve
y^2 + c1 xy + c3 y = x^3 + c2 x^2 + c4 x + c6
is
∆(E) = 2^4 3^3 5^2 (20g2^3 − 49g3^2).
(The scalar factor is irrelevant for our present purposes, and is only retained
for consistency.)
Proof I It is clear that ∆(z) is a modular form of weight 12. We know that
the elliptic curve
E : y^2 = x^3 − 15g2 x − 35g3
is non-singular. (Recall the argument: If the curve had a singularity, it would
be a point (α, 0) on the line of symmetry y = 0, where α is a double root
of the polynomial on the right. But we have seen that this polynomial has
three distinct roots corresponding to the semilattice points of the lattice Λ
in question.)
But now our formula gives
v∞ (∆) = 1,
ie ∆(z) has a simple zero at ∞. J
Remark: A modular form f (z) with f (∞) = 0 is called a cusp form.
Proposition 8.30 The modular forms are generated by G2 (z) and G3 (z).
More precisely, a modular form of weight 2k is a linear combination of the
modular forms
G2 (z)a G3 (z)b ,
where
2a + 3b = k.
2a + 3b = 4
is a = 2, b = 0, while the only solution of
2a + 3b = 5
is a = 1, b = 1. The result follows as in Propositions 8.26 and 8.27. C
Proof of Lemma B If k is even, a = k/2, b = 0 is a solution; while if k is
odd, a = (k − 3)/2, b = 1 is a solution. C
Now suppose f (z) is a modular form of weight 2k, where k ≥ 6. By the
last Lemma, we can find a, b such that 2a + 3b = k. Let
where we choose
f (∞) f (∞)
ρ= a b
=
G2 (∞) G3 (∞) ζ(4)a ζ(6)b
so that
h(∞) = 0.
Then h(z) is a modular form of weight 2k with h(∞) = 0.
But now
\[ k(z) = \frac{h(z)}{\Delta(z)} \]
is a modular form of weight 2k − 12; for the zero of h(z) at ∞ cancels out
the zero of ∆(z) at ∞, and ∆(z) has no other zeros.
It follows by our inductive hypothesis that k(z) is a linear combination
of the monomial functions
\[ G_2(z)^{a'} G_3(z)^{b'} \qquad (2a' + 3b' = k - 6). \]
Hence
g(z) = ∆(z)k(z)
is a linear combination of the functions
\[ G_2(z)^{a''} G_3(z)^{b''} \qquad (2a'' + 3b'' = k); \]
and so therefore is
f(z) = g(z) + ρ G2 (z)^a G3 (z)^b .
J
Proposition 8.31 The functions G2 (z)a G3 (z)b with 2a+3b = k form a basis
for the modular forms of weight 2k.
Proof I Suppose there were a linear relation between these monomial func-
tions. The relation of lowest weight must be of the form
λG2 (z)3c + · · · + µG3 (z)2c = 0.
(For otherwise we could divide the relation by G2 (z) or G3 (z).)
But now taking z = i, −ω 2 ,
µG3 (ω)2c = 0 =⇒ µ = 0, λG2 (ω)3c = 0 =⇒ λ = 0.
J
The modular forms constitute a graded algebra
M = (Mk )k∈N ,
where Mk is the space of modular forms of weight 2k. It follows from the
Proposition above that this algebra is the polynomial algebra generated by
G2 and G3 :
M = C[G2 , G3 ].
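Since the monomials G2^a G3^b with 2a + 3b = k form a basis, the dimension of Mk is just the number of such exponent pairs. A short sketch of the count follows; the function name is illustrative only.

```python
def exponent_pairs(k):
    """All (a, b) with a, b >= 0 and 2a + 3b = k."""
    return [((k - 3*b) // 2, b) for b in range(k // 3 + 1) if (k - 3*b) % 2 == 0]

# Number of basis monomials G2^a G3^b for each weight 2k, k = 0..12.
dims = {k: len(exponent_pairs(k)) for k in range(0, 13)}
for k, d in dims.items():
    print(2*k, d, exponent_pairs(k))
```

For k = 4 and k = 5 the list contains the single pair quoted earlier, and k = 6 is the first value with two pairs.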
Remark: The scalar factor is of no significance for our present purpose. (It
is chosen so that j(z) has residue 1 at ∞.)
Proof I This follows at once from the properties of G2 (z) and ∆(z) (Propo-
sitions 8.26 and 8.29). J
Proof I The modular function j(z) − c is of weight 0, and has a simple pole
at ∞. It follows from the Modular Counting Theorem that f (z) either has
a triple zero at −ω 2 , or else a simple zero at some other point.
In any case, there is just one zero in F. J
Recall that each modular function has an associated lattice function.
Definition 8.19 For each lattice Λ = ⟨λ, µ⟩ we set
J(Λ) = j(λ/µ).
Thus
\[ J(\Lambda) = 2^6 3^3\,\frac{g_2^3}{\Delta} = \frac{2^2}{5^2}\cdot\frac{g_2^3}{20g_2^3 - 49g_3^2}. \]
Theorem 8.6 Each elliptic curve
E(C) : y^2 = x^3 + bx + c
j(z0 ) = C.
Let
Λ0 = ⟨1, z0 ⟩;
and let
E(C) : y^2 = x^3 + b0 x + c0
be the elliptic curve associated to Λ0 . Then
\[ \frac{b_0^3}{4b_0^3 + 27c_0^2} = j(z_0) = C = \frac{b^3}{4b^3 + 27c^2}. \]
We know that the denominators do not vanish, since the curves are
non-singular. Hence
b0 = 0 ⇐⇒ b = 0.
Suppose for the moment this is not so. Then
\[ \frac{4b_0^3 + 27c_0^2}{b_0^3} = \frac{4b^3 + 27c^2}{b^3} \implies \frac{c_0^2}{b_0^3} = \frac{c^2}{b^3}. \]
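Evidently
c0 = 0 ⇐⇒ c = 0.
Suppose this too is not so. Then
\[ \left( \frac{b}{b_0} \right)^3 = \left( \frac{c}{c_0} \right)^2. \]
Let
\[ \frac{b}{b_0} = \beta, \qquad \frac{c}{c_0} = \gamma, \qquad \rho = \frac{\gamma}{\beta}. \]
Then γ^2 = β^3, and so
\[ \rho^2 = \frac{\gamma^2}{\beta^2} = \frac{\beta^3}{\beta^2} = \beta, \qquad \rho^3 = \frac{\gamma^3}{\beta^3} = \frac{\gamma^3}{\gamma^2} = \gamma. \]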
Thus
b = ρ^2 b0 , c = ρ^3 c0 .
Let s^2 = ρ. Then
b = s^4 b0 , c = s^6 c0 .
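The construction of ρ can be traced on a concrete pair of curves. The sketch below builds a second curve from (b0, c0) with a known scaling s and then recovers ρ = γ/β as in the argument; the function name and the sample values are assumptions made for illustration.

```python
from fractions import Fraction

def rescale_factor(b0, c0, b, c):
    """Given (b/b0)^3 == (c/c0)^2 with all four non-zero, return rho = gamma/beta."""
    beta, gamma = Fraction(b, b0), Fraction(c, c0)
    assert gamma**2 == beta**3, "the two curves do not have the same invariant"
    return gamma / beta

b0, c0 = Fraction(-3), Fraction(5)       # sample coefficients of y^2 = x^3 + b0 x + c0
s = Fraction(2, 3)                       # a known scaling, used to build the second curve
b, c = s**4 * b0, s**6 * c0
rho = rescale_factor(b0, c0, b, c)
print(rho, rho == s**2)
```

The recovered ρ equals s^2, so b = ρ^2 b0 and c = ρ^3 c0 as claimed.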
It follows that the given curve is defined by the lattice
y^2 = x^3 + c0 , y^2 = x^3 + c.
The transformation
x → s^2 x, y → s^3 y
will take the first curve into the second provided we choose s so that
c = s^6 c0 .
y^2 = x^3 + b0 x, y^2 = x^3 + bx,
and the transformation will take the first curve into the second provided we
choose s so that
b = s^4 b0 .
Now suppose the given curve E is also defined by the lattice
b3
j(z 0 ) = J(Λ0 ) = 22 33 53 = J(Λ) = j(z0 ).
22 b3 − 33 c2
Hence, by Proposition 8.32,
z = gz
for some g ∈ G, say the transformation
az + b
gz = .
cz + d
It follows that the lattices
Λ0 = sΛ.
But then
s^4 b = b, s^6 c = c.
If b, c 6= 0 this implies that s2 = 1, so that s = ±1 and the lattices are
the same.
If b = 0 then
s = ±1, ±ω, ±ω 2 .
But
j(z) = 0 =⇒ z = gω.
Thus the lattice is similar to
Λ0 = {m + nω : m, n ∈ Z},
ωΛ0 = Λ0 , −ω 2 Λ0 = Λ0 ,
G3 (z0 ) = 0 =⇒ z0 = i,
so that the lattice is similar to
Λ0 = {m + ni : m, n ∈ Z},
E(k) : y^2 + c1 xy + c3 y = x^3 + c2 x^2 + c4 x + c6
over all fields k, by exactly the same method by which we extended the
definition of the discriminant ∆(E) to all such curves.
The j-invariant turns out to have an important rôle in the classification
of elliptic curves over a general field k. But that is another story.
Chapter 9
Mordell’s Theorem
9.2 The Idea of the Proof
Suppose E(Q) is finitely-generated. Then the group
E(Q)/2E(Q)
\[ A = F \oplus rZ = F \oplus \underbrace{Z \oplus \cdots \oplus Z}_{r \text{ summands}} \]
where F is finite and r = rank(A). Suppose there are 2^s elements of order
dividing 2 in A. Then
It follows that
A/2A = F/2F ⊕ Z/(2) ⊕ · · · ⊕ Z/(2),
each direct summand Z in A contributing one copy of Z/(2). It remains to
determine F/2F .
Consider the homomorphism
φ : F → F : x 7→ 2x.
and moreover,
‖E(Q)/2E(Q)‖ = 2^{r+s}.
where r is the rank of E(Q) and s = 0, 1 or 2 according as the cubic f (x) has
0,1 or 3 roots in Q.
The converse, unfortunately, is not true: an abelian group A may have
A/2A finite without A being finitely-generated. For example,
Q/2Q = 0,
since every rational is expressible as twice another rational; but Q is not
finitely-generated as an abelian group.
So the condition (that E/2E be finite) is necessary but not sufficient.
However, it allows us to start a process of “infinite descent”, as follows.
Let the points E1 , . . . , Em be representatives of the cosets in E/2E; and
suppose P ∈ E. Then
P − Ei ∈ 2E
for some i, say
P − Ei0 = 2P1 .
We can apply the same argument to P1 :
P1 − Ei1 = 2P2 ;
and we can continue in this way
P2 − Ei2 = 2P3 ,
P3 − Ei3 = 2P4 .
...
We expect the points P1 , P2 , . . . defined in this way by successive ‘halving’
to descend the curve in some sense. But what exactly do we mean by ‘descend’?
When infinite descent is applied to integral solutions of an equation,
the meaning is clear: the coordinates become smaller. But we are dealing
with rational points. We need some notion of the simplicity of a rational
number q = m/n. We therefore define the height of q ∈ Q to be
H(q) = max(|m|, |n|),
if q = m/n in its lowest terms. Now our task is clear; we have to show that
the points P1 , P2 , . . . are descending in the sense that the heights of their
coordinates are decreasing.
Actually, we shall find it sufficient, and much simpler, to consider the
x-coordinate.
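The height just defined is easy to compute. The following is a minimal sketch; the function names `height` and `log_height` are illustrative, and the logarithmic height h = log H is assumed to be the quantity written h(·) later in the chapter.

```python
from fractions import Fraction
from math import log

def height(q) -> int:
    """H(q) = max(|m|, |n|) for q = m/n in lowest terms."""
    q = Fraction(q)  # Fraction always stores m/n in lowest terms with n > 0
    return max(abs(q.numerator), abs(q.denominator))

def log_height(q) -> float:
    """h(q) = log H(q); assumed to be the h(.) used in the later lemmas."""
    return log(height(q))

for q in (Fraction(3), Fraction(9, 4), Fraction(-6, 8)):
    print(q, height(q), round(log_height(q), 3))
```

Note that -6/8 is reduced to -3/4 before the height is taken, so its height is 4, not 8.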
Thus the proof has 2 quite separate parts, which we might call the algebraic
or group-theoretic part, and the topological or valuation-theoretic part.
9.3 When can a Point be ‘Halved’ ?
Recall our mammoth formula for the ‘double’ of a point (X, Y) ∈ E:
\[
2(X, Y) = \left( \frac{X^4 - 2bX^2 - 8cX + b^2 - 4ac}{4Y^2},\
\frac{X^6 + 2aX^5 + 5bX^4 + 20cX^3 + (20ac - 5b^2)X^2 + (8a^2c - 2ab^2 - 4bc)X + 4abc - b^3 - 8c^2}{8Y^3} \right).
\]
If c = 0 we may observe that the x-coordinate is a perfect square
\[ \left( \frac{X^2 - b}{2Y} \right)^2. \]
At first sight this seems a pure fluke. But it turns out to be the hinge of our
argument.
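This observation can be checked numerically. The sketch below uses exact rational arithmetic; the curve y^2 = x^3 + 2x^2 − 8x and the point (−1, 3) are chosen purely for illustration, and the function names are hypothetical.

```python
from fractions import Fraction

def double_x(a, b, c, X):
    """x-coordinate of 2(X, Y) on y^2 = x^3 + a x^2 + b x + c, using Y^2 = f(X)."""
    X = Fraction(X)
    Y2 = X**3 + a * X**2 + b * X + c
    if Y2 == 0:
        raise ValueError("point of order 2: the double is the point at infinity")
    return (X**4 - 2 * b * X**2 - 8 * c * X + b * b - 4 * a * c) / (4 * Y2)

def double_x_square_form(b, X, Y):
    """When c = 0 the same x-coordinate, written as the square ((X^2 - b)/(2Y))^2."""
    X, Y = Fraction(X), Fraction(Y)
    return ((X * X - b) / (2 * Y)) ** 2

# Illustrative curve (an assumption): y^2 = x^3 + 2x^2 - 8x, with P = (-1, 3).
a, b, c = 2, -8, 0
print(double_x(a, b, c, -1), double_x_square_form(b, -1, 3))
```

Both expressions give 9/4 = (3/2)^2 for this point.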
Suppose the line
y = mx + d
meets E in the 3 points
It follows that
x1 + x2 + x3 = m^2 − a,
x2 x3 + x3 x1 + x1 x2 = b − 2md,
x1 x2 x3 = d^2 − c.
The last of these equations is the one that concerns us now. Suppose again
that c = 0. Then the equation becomes
x1 x2 x3 = d^2.
P + Q + R = 0 ⟹ x1 x2 x3 = d^2.
Remembering that P and −P have the same x-coordinate, it follows that if
P = 2Q then the x-coordinate of P is a square.
This was on the assumption that c = 0. Geometrically, this means that
(0, 0) ∈ E. Now (0, 0) is a point of order 2. But any point (α, 0) ∈ E of order
2 can be brought to (0, 0) by the coordinate-change x 7→ x − α.
Thus the only assumption we are making is that E(K) possesses a point
of order 2. In fact, returning to the original coordinates, we can express the
result as follows: Suppose (α, 0) ∈ E, where α is a root of
f (x) = x3 + ax2 + bx + c.
Then
P = (X, Y) ∈ 2E(K) ⟹ X − α = θ^2
in K.
But there is nothing special about the root α. Suppose now that all 3
roots α, β, γ ∈ K. Then our argument shows that
P = (X, Y) ∈ 2E(K) ⟹ X − α, X − β, X − γ ∈ K^2,
that is,
X − α = α′^2, X − β = β′^2, X − γ = γ′^2,
where α′, β′, γ′ ∈ K.
This brings us to the main result in the algebraic half of the proof of
Mordell’s Theorem.
E(K) : y 2 = x3 + ax2 + bx + c
f (x) = x3 + ax2 + bx + c
P = (X, Y ) ∈ 2E(K) ⇐⇒ X − α, X − β, X − γ ∈ K 2 .
Remark: Note that any 2 of these conditions implies the third, since
Proof I To simplify the presentation, let us make the coordinate-change
x ↦ x − X. (This is not the same as the earlier coordinate-change making
c = 0.) The given point P is now (0, Y), and we have to show that
say
α = −α′^2, β = −β′^2, γ = −γ′^2,
where α′, β′, γ′ ∈ K.
(We have already seen that this condition is necessary. Our argument
will re-prove that, and show that the condition is also sufficient.)
By definition, P = 2Q if the tangent to E at −Q passes through P . Let
us therefore determine all the tangents that can be drawn from P to E.
The general line through P = (0, Y ) is
y = mx + Y.
(mx + Y)^2 = x^3 + ax^2 + bx + c.
Y^2 = c.
x^2 + (a − m^2)x + (b − 2mY) = 0.
The line will be a tangent if this quadratic has coincident roots. The condition
for this is that
(a − m^2)^2 = 4(b − 2mY).
This is a quartic for m; so in general 4 tangents can be drawn to E from any
point P ∈ E.
It is easy to see why there are 4 tangents. Let
These give rise to the 4 tangents passing through P . In particular we see that
if one tangent is defined over K then so are all 4. (Note that the tangents
must be distinct, since A, B, C are distinct.) Thus if our quartic has one root
in K then all its roots must lie in K.
We should say, that 4 tangents can be drawn over C. For there is no
reason to suppose that the roots of the quartic will lie in K. In fact, that is
exactly what we have to determine.
For if Q ∈ E(K) then our line P Q is defined over K, and so m ∈ K.
Conversely, if m ∈ K and the line is tangent to E then the point Q = (ξ, η)
at which it touches has coordinates in K. For the roots of our equation
(mx + Y)^2 = x^3 + ax^2 + bx + c
Q(x)^2 = L(x)^2,
λ^3 − 2aλ^2 + 4bλ − 8c = 0.
Miracle! This is almost our original cubic f(x) (in the equation y^2 = f(x)).
In fact the equation can be written
f(−λ/2) = 0.
It follows that its 3 solutions are
λ = −2α, −2β, −2γ.
We can take λ to have any of these values. Suppose we take
λ = −2α.
Then our quartic for m takes the form
(m^2 − a + λ)^2 = (2λ)(m − 2Y/λ)^2.
Thus if our quartic has a solution in K, which we know is the case if
P = 2Q, then λ/2 = −α must be a square. Similarly, taking the other 2
values for λ, it follows that −β and −γ must also be squares:
−α = α′^2, −β = β′^2, −γ = γ′^2.
Conversely suppose that this is the case. Then we can take
λ = −2α = 2α′^2,
and our quartic for m splits into 2 quadratics
m^2 − a + λ = ±2α′(m − 2Y/λ).
Note that since α + β + γ = −a,
−a + λ = −α + β + γ
= α′^2 − β′^2 − γ′^2.
Furthermore
Y^2 = c = −αβγ = α′^2 β′^2 γ′^2,
so that
Y = ±α′β′γ′.
We can take the + sign without loss of generality, since the signs of α′, β′, γ′
were arbitrary anyway.
Thus our quadratics become
m^2 + α′^2 − β′^2 − γ′^2 = ±2(α′m − β′γ′).
In other words,
(m ∓ α′)^2 = (β′ ∓ γ′)^2.
We conclude that the 4 tangents through P are y = mx + Y, where
m = α′ + β′ − γ′, α′ − β′ + γ′, −α′ + β′ + γ′, −α′ − β′ − γ′.
In particular, we see that if −α, −β, −γ are perfect squares in K then m ∈ K
and P = 2Q. J
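Over K = Q the criterion is directly testable. The sketch below checks whether X − α, X − β, X − γ are all rational squares; the helper names are hypothetical, and the curve y^2 = x(x − 2)(x + 4) with roots 0, 2, −4 is an illustrative choice.

```python
from fractions import Fraction
from math import isqrt

def is_rational_square(q) -> bool:
    """True if q is the square of a rational number."""
    q = Fraction(q)
    if q < 0:
        return False
    n, d = q.numerator, q.denominator
    # In lowest terms, n/d is a square iff n and d are both perfect squares.
    return isqrt(n) ** 2 == n and isqrt(d) ** 2 == d

def passes_halving_test(X, roots) -> bool:
    """Check X - r is a rational square for every root r of f (all roots rational)."""
    return all(is_rational_square(Fraction(X) - r) for r in roots)

roots = (0, 2, -4)                                   # f(x) = x(x - 2)(x + 4)
print(passes_halving_test(Fraction(9, 4), roots))    # x-coordinate of 2(-1, 3)
print(passes_halving_test(-1, roots))                # x-coordinate of (-1, 3) itself
```

Here 9/4, 1/4 and 25/4 are all squares, as the Proposition requires for a doubled point, whereas the point (−1, 3) fails at the first root because −1 is not a square.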
9.4 The 3 Homomorphisms
Recall that if
are 3 points of
E(K) : y 2 = x3 + ax2 + bx
then
P + Q + R = 0 ⟹ x1 x2 x3 ∈ K^2.
It would have been nicer if we could have said
P + Q + R = 0 ⟹ x1 x2 x3 ∈ (K^×)^2,
P + Q + R = 0 ⟹ x1 x2 x3 ≡ 1 mod (K^×)^2.
Θ : E(K) → K^×/(K^×)^2.
x1 x2 x3 = 0,
x2 x3 + x3 x1 + x1 x2 = b − 2md.
x2 x3 = b
y 2 = x3 + ax2 + bx + c.
Then the map
Θ : E(K) → K^×/(K^×)^2
defined by
\[ P \mapsto \begin{cases} X & \text{if } P = (X, Y) \neq (0, 0), \\ b & \text{if } P = (0, 0), \\ 1 & \text{if } P = 0 = [0, 1, 0] \end{cases} \]
is a homomorphism
P + Q = 0 =⇒ Θ(P )Θ(Q) = 1
in all cases.
It is sufficient therefore to show that
P + Q + R = 0 =⇒ Θ(P )Θ(Q)Θ(R) = 1
x2 x3 = b,
and so
Θ(Q)Θ(R) = b.
Thus, since Θ(D) = b,
Θ(P)Θ(Q)Θ(R) = b^2 = 1
in K^×/(K^×)^2. J
We were assuming in this Proposition that c = 0. To convert back to the
general case, we note that if α is a root of f(x) then the coordinate-change
x ↦ x − α takes f(x) into x^3 + a′x^2 + b′x, where
y^2 = x^3 + ax^2 + bx + c;
and suppose
A = (α, 0)
is a point of order 2 on E(K). Then the map
Θ_α : E(K) → K^×/(K^×)^2
defined by
\[ P \mapsto \begin{cases} X - \alpha & \text{if } P = (X, Y) \neq A, \\ 3\alpha^2 + 2a\alpha + b & \text{if } P = A, \\ 1 & \text{if } P = 0 = [0, 1, 0], \end{cases} \]
is a homomorphism.
Note that we have 3 homomorphisms, corresponding to the 3 roots α, β, γ
of f (x). We can re-state Proposition 9.2 as follows.
E(K) : y 2 = x3 + ax2 + bx + c
f (x) = x3 + ax2 + bx + c
and similarly for the other 2 kernels — each is contained in the intersection
of the other two. Thus
Proof I By the Proposition (and the following Remark),
Φ : A → A/C.
ker Φ_B = B ∩ C.
B/(B ∩ C) ≅ im Φ_B ⊂ A/C.
Hence
‖B/(B ∩ C)‖ ≤ ‖A/C‖,
and the result follows. C
Applying the Lemma with B = ker Θα , C = ker Θβ we deduce that E/2E
is finite if and only if
E/ker Θ_α ≅ im Θ_α and E/ker Θ_β ≅ im Θ_β
are both finite; and the same is true if α, β are replaced by α, γ or β, γ. J
9.5 The Finiteness of the Images
We have to prove that im Θα , im Θβ , im Θγ (or at least two of them) are
finite. It is sufficient to prove the result for one of them; and we can again
suppose for simplicity that c = 0.
Then im Θ is finite.
Proof I Suppose
P = (x, y) ∈ E,
where y ≠ 0.
p^e ‖ x, p^f ‖ y.
If e < 0 then the right-hand side is dominated by x^3, and so f < 0 and
2f = 3e.
On the other hand, if e > 0 then
p ∤ x^2 + ax + b
since we are supposing that p ∤ b. Thus
2f = e.
In either case (or if e = 0) e is even. C
Lemma 10 We can find a finite number of elements x_1, . . . , x_r ∈ K such
that x̄_k ∈ im Θ, and for each x with x̄ ∈ im Θ we have
⟨x x_k^{−1}⟩ = a^2
Proof of Lemma B By the last Lemma, the only prime ideals p appearing
to an odd power in x are the finite number dividing b. Suppose these prime
ideals are p_1, . . . , p_s. Consider the 2^s ideals
say
a_1, . . . , a_{2^s}.
According to the last Lemma, if x̄ ∈ im Θ then
⟨x⟩ = a_k b^2
⟨x_k⟩ = a_k b^2.
C
If we are working over Q it follows that
x x_k^{−1} = ±X^2,
and so
x ≡ ±x_k mod (K^×)^2
for some k. Hence
im Θ = {±x_1, . . . , ±x_r}.
Thus im Θ is finite, and so the result is established: E(Q)/2E(Q) is finite.
For a general number field K we have a little more work to do.
Let
S = ⟨x̄_1, . . . , x̄_r⟩
be the subgroup of K^×/(K^×)^2 generated by x̄_1, . . . , x̄_r. This subgroup is
finite, since each element of K^×/(K^×)^2 has order 2.
Let T be the subgroup of K^×/(K^×)^2
T = {x̄ ∈ im Θ : ⟨x⟩ = a^2}.
im Θ ⊂ ST.
‖ST‖ = ‖ST/T‖ ‖T‖.
Φ : G → G/T.
ker Φ_S = S ∩ T, im Φ_S = ST/T.
ST/T ≅ S/(S ∩ T),
and so
‖ST/T‖ divides ‖S‖.
C
Lemma 12 Let
S = {x ∈ K^× : ⟨x⟩ = a^2}.
Then S ⊃ (K^×)^2; and the quotient-group
S/(K^×)^2
is finite.
⟨x^2⟩ = ⟨x⟩^2.
By the finiteness of the class number we can find a finite number of ideals
a_1, . . . , a_h such that for any ideal a one of the ideals a a_i is principal, say
a a_i = ⟨a⟩.
It follows that
= e11 · · · emm η 2 (e1 , . . . , em ∈ {0, 1}),
where η ∈ U (K).
Putting all this together, we have
In other words,
x ≡ e11 · · · emm ai mod (K × )2 .
There are only a finite number of elements e11 · · · emm ai . We conclude that
the quotient-group
S/(K × )2
is finite. C
Since
T ⊂ S/(K × )2
it follows that T is finite. Hence
im Θ = ST
is finite. J
y 2 = x3 + ax2 + bx + c (a, b, c ∈ Q)
then
E(Q)/2E(Q)
is finite.
where E1 , . . . , Em ∈ E.
Recall our “plan for infinite descent”. Suppose P ∈ E. Then
P − E ∈ 2E
for some E ∈ {E1 , . . . , Em }, say
P − Ei0 = 2P1 .
Then similarly
P1 − Ei1 = 2P2
P2 − Ei2 = 2P3
...
The points P = P0 , P1 , P2 , · · · ∈ E(Q) — derived by repeated halving —
represent our infinite descent. But in what sense are they descending? We
need some notion of the ‘height’ of a point on E.
\[ x_1 x_2 = \frac{m_1 m_2}{n_1 n_2}, \quad x_1^n = \frac{m_1^n}{n_1^n}, \quad x_1 + x_2 = \frac{m_1 n_2 + m_2 n_1}{n_1 n_2}, \quad x_1^{-1} = \frac{n_1}{m_1}. \]
The result follows at once. C
If n1 and n2 have a large common factor — which will usually be the
case for us — the result for x1 + x2 can be greatly improved, as the following
result illustrates.
Lemma 14 Suppose
\[ X = \frac{f(x)}{g(x)}, \]
where f (x) and g(x) are polynomials of degrees d and e. Then
h(X) ≤ max(d, e)h(x) + C,
for some constant C.
Proof of Lemma B We can assume that d = e. For suppose d < e. Then we
can replace f (x) by f (x) + g(x). This replaces X by X + 1; but that does
not affect the result, since
from the estimate in the last Lemma for h(x1 + x2 ). If e < d we can apply
the same argument after replacing X by X −1 .
We may also assume that the coefficients of f (x), g(x) are integral, say
where ai , bj ∈ Z. Then
\[ X = \frac{a_0 m^d + a_1 m^{d-1} n + \cdots + a_d n^d}{b_0 m^d + b_1 m^{d-1} n + \cdots + b_d n^d} = \frac{M}{N}, \]
say. Thus
and so
h(X) ≤ dh(x) + C.
C
We define the height of a point P = (x, y) to be the height of its x-
coordinate.
We want to show that our infinite descent is descending in the sense that
Lemma 15 For any constant C > 0, there are only a finite number of points
P ∈ E(Q) with
h(P ) ≤ C.
Proof of Lemma B There are at most 4e^{2C} + 1 rationals with h(x) ≤ C,
since both denominator and numerator must be chosen from {−N, −N +
1, . . . , N − 1, N} where N = [e^C].
For each such x there are at most 2 values of y such that (x, y) ∈ E. C
h(P + P0 ) ≤ 2h(P ) + C.
P + P0 + Q = 0,
h(−P ) = h(P ).
y = mx + d.
Then
\[ m = \frac{y - y_0}{x - x_0}. \]
The line meets the curve where
Hence
x + x_0 + X = m^2 − a.
Thus
\[
\begin{aligned}
X &= \frac{(y - y_0)^2 - (x + x_0 + a)(x - x_0)^2}{(x - x_0)^2} \\
  &= \frac{y^2 - 2y_0 y + y_0^2 - x^3 - (a - x_0)x^2 + (x_0^2 + 2ax_0)x - ax_0^2 - x_0^3}{(x - x_0)^2} \\
  &= \frac{-2y_0 y + x_0 x^2 + (b + 2ax_0 + x_0^2)x + (c + y_0^2 - ax_0^2 - x_0^3)}{(x - x_0)^2},
\end{aligned}
\]
since y^2 = x^3 + ax^2 + bx + c.
The point is that
\[ X = \frac{Ay + Bx^2 + Cx + D}{Ex^2 + Fx + G} \]
for some integers A, B, C, D, E, F, G depending only on P_0.
If x = m/n then
This allows us to apply the argument in the proof of the last Lemma. We
have
\[ X = \frac{A n^2 y + B m^2 + C m n + D n^2}{E m^2 + F m n + G n^2} = \frac{M}{N}, \]
where
|N| ≤ (|E| + |F| + |G|)H(x)^2.
It follows that
H(X) ≤ CH(x)^2.
from which the result follows. C
h(2P ) ≥ 4h(P ) − C
for all P ∈ E.
Proof of Lemma B Suppose P = (x, y), 2P = (X, Y ). Let the tangent at P
be
y = mx + d.
If the elliptic curve E(Q) has equation
y 2 = x3 + ax2 + bx + c
then
\[ 2y \frac{dy}{dx} = 3x^2 + 2ax + b = f'(x), \]
and so
\[ m = \frac{f'(x)}{2y}. \]
The tangent meets E where
2x + X = m^2 − a;
and so
\[ X = m^2 - a - 2x = \frac{f'(x)^2 - 4(a + 2x)y^2}{4y^2} = \frac{f'(x)^2 - 4(a + 2x)f(x)}{4f(x)}. \]
h(X) ≤ 4h(x) + C.
Sublemma Suppose
\[ X = \frac{f(x)}{g(x)}, \]
where f (x), g(x) are polynomials of degrees d, e, with gcd(f (x), g(x)) = 1.
Then
h(X) ≥ max(d, e)h(x) − C
for some constant C.
where ai , bj ∈ Z.
Let F (x, z), G(x, z) be the corresponding homogeneous forms, ie
If x = m/n then
\[ X = \frac{F(m, n)}{G(m, n)}. \]
We have to show that this is almost in its lowest terms.
Since gcd(f (x), g(x)) = 1, we can find polynomials u(x), v(x) ∈ Q[x] such
that
u(x)f (x) + v(x)g(x) = 1.
On ‘multiplying out’ the denominators of the coefficients, and passing to the
homogeneous forms, we obtain polynomials U (x, z), V (x, z) ∈ Z[x, z] such
that
U(x, z)F(x, z) + V(x, z)G(x, z) = Az^N
where A is a non-zero integer, and N ∈ N.
In particular,
It follows that
gcd(F(m, n), G(m, n)) | An^N.
On the other hand
gcd(F(m, n), n) | a_0 m^d.
Since gcd(m, n) = 1 this implies that
It follows that
gcd(F(m, n), n^N) | a_0^N,
and so
gcd(F(m, n), An^N) | Aa_0^N.
Hence
gcd(F(m, n), G(m, n)) | Aa_0^N.
\[ X = \frac{F(m, n)}{G(m, n)} = \frac{M}{N}, \]
say, is almost in its lowest terms. It only remains to show that the numerator
or denominator is of the correct order of magnitude. This is ‘trivial but not
obvious’.
Let
M (x) = max(|f (x)|, |g(x)|).
Since
f (x)
→ a0 as x → ∞
xd
there exist constants C1 > 0, C2 > 0 such that
M (x) ≥ C1 |x|d
for |x| ≥ C2 .
On the other hand, since f (x), g(x) have no root in common, there is a
constant C3 > 0 such that
M (x) ≥ C3
for |x| ≤ C2 . It follows that
for |x| ≤ C2 .
Putting these together,
M(x) ≥ C_4 |x|^d
for all x, with C_4 = min(C_1, C_3 C_2^{−d}). On setting x = m/n, and multiplying
out, this gives
max(M, N) ≥ C_5 |n|^d.
We conclude that
max(M, N) ≥ C_6 H(x)^d,
with C_6 = min(C_4, C_5). Since we know that
gcd(M, N) ≤ Aa_0^N,
we conclude that if
\[ X = \frac{M'}{N'} \]
in its lowest terms then
with C_7 = C_6/(Aa_0^N) > 0; and so finally,
h(X) ≥ dh(x) − C.
C
In particular, applying this to our formula for 2P , we have shown that
h(2P ) ≥ 4h(P ) − C.
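The growth of heights under doubling can be seen numerically. The sketch below repeatedly doubles a point using the x-coordinate formula and prints h = log H at each step; the curve y^2 = x^3 + 2x^2 − 8x, the starting point with x = −1, and the function names are illustrative assumptions.

```python
from fractions import Fraction
from math import log

def double_x(a, b, c, X):
    """x-coordinate of 2P on y^2 = x^3 + a x^2 + b x + c, where P has x-coordinate X."""
    fX = X**3 + a * X**2 + b * X + c          # this is Y^2
    return (X**4 - 2 * b * X**2 - 8 * c * X + b * b - 4 * a * c) / (4 * fX)

def h(q):
    """Logarithmic height of a rational number in lowest terms."""
    return log(max(abs(q.numerator), abs(q.denominator)))

a, b, c = 2, -8, 0                 # illustrative curve (an assumption)
xs = [Fraction(-1)]                # x-coordinate of P = (-1, 3)
for _ in range(3):
    xs.append(double_x(a, b, c, xs[-1]))

for k, x in enumerate(xs):
    print(f"x(2^{k} P) has height h = {h(x):.3f}")
```

After the first step the printed heights grow by a factor of roughly 4 per doubling, which is the behaviour the two inequalities h(2P) ≤ 4h(P) + C and h(2P) ≥ 4h(P) − C describe.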
Pi − Ej = 2Pi+1 ,
h(Pi − Ej ) ≥ 4h(Pi+1 ) − c1 .
But by Lemma 16 (and the fact that h(−P ) = h(P )),
h(Pi − Ej) ≤ 2h(Pi) + c2.
Combining these,
2h(Pi ) + c2 ≥ 4h(Pi+1 ) − c1 .
Hence
\[ h(P_{i+1}) \le \tfrac{1}{2} h(P_i) + c_3 \]
with c3 = (c1 + c2 )/4.
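The recursion can be unrolled explicitly. The following worked derivation uses only the inequality above, writing P_0 = P:

```latex
\begin{align*}
h(P_n) &\le \tfrac12 h(P_{n-1}) + c_3
        \le \tfrac14 h(P_{n-2}) + \left(1 + \tfrac12\right) c_3 \le \cdots \\
       &\le 2^{-n} h(P_0) + \left(1 + \tfrac12 + \cdots + 2^{-(n-1)}\right) c_3
        < 2^{-n} h(P_0) + 2c_3 .
\end{align*}
```

So as soon as 2^{-n} h(P_0) ≤ 1 we have h(P_n) < 2c_3 + 1: the descent enters a set of bounded height, which contains only finitely many points by Lemma 15.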
We have shown therefore that
P1 , . . . , P n .
Our infinite descent must lead to one of these points. We see therefore that
any point P ∈ E is expressible in the form
P = u_1 E_1 + · · · + u_m E_m + P_i,
where u_1, . . . , u_m ∈ N.
We conclude that E(Q) is generated by the points E1 , . . . , Em , P1 , . . . , Pn .
9.8 The formula for rank(E)
Since we now know that E is finitely-generated, it follows from the Structure
Theorem for Finitely Generated Abelian Groups that
E = Z ⊕ · · · ⊕ Z ⊕ Z/(p_1^{e_1}) ⊕ · · · ⊕ Z/(p_s^{e_s}),
where there are r = rank(E) copies of Z.
Then
‖E/2E‖ = 2^s,
where
s = r + d.
Proof I If
A = A1 ⊕ · · · ⊕ Am
then
2A = 2A1 ⊕ · · · ⊕ 2Am
and so
Lemma 19 If A = Z/(p^e), where p is odd, then
A/2A = 0.
θ : A → A : a ↦ 2a.
Then
ker θ = {a ∈ A : 2a = 0} = 0,
since by Lagrange’s Theorem there are no elements of order 2 in A. Hence θ
is injective, and so surjective, ie 2A = A, and A/2A = 0. C
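For small cyclic groups the index of 2A in A can simply be counted. The sketch below does this by brute force; the function names are hypothetical, and the direct-sum rule it uses is the one stated at the start of this proof (2A splits as the sum of the 2A_i).

```python
def index_of_2A_cyclic(n: int) -> int:
    """|A/2A| for A = Z/(n), found by listing the doubles 2a mod n."""
    doubles = {(2 * a) % n for a in range(n)}
    return n // len(doubles)

def index_of_2A(orders) -> int:
    """|A/2A| for A = Z/(n_1) + ... + Z/(n_k): the indices multiply."""
    result = 1
    for n in orders:
        result *= index_of_2A_cyclic(n)
    return result

print(index_of_2A_cyclic(9), index_of_2A_cyclic(25))   # odd prime powers
print(index_of_2A_cyclic(8))                           # a power of 2
print(index_of_2A([8, 9, 4]))                          # two 2-power factors
```

Odd prime powers contribute a factor 1 and each 2-power factor contributes a factor 2, so the last line prints 4.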
From the two Lemmas it follows that the number of copies of Z/(2) in
A = A_1 ⊕ A_2 ⊕ · · · ⊕ A_m
9.9 The square-free part
Each rational x ∈ Q× is uniquely expressible in the form
x = dy 2 ,
where y ∈ Q× and d is a square-free integer. Explicitly, if
x = ±2^{e_2} 3^{e_3} 5^{e_5} · · ·
then
d = ±2^{ε_2} 3^{ε_3} 5^{ε_5} · · ·
where each ε_p ∈ {0, 1} is given by
ε_p ≡ e_p mod 2.
For example,
x = 2/3 ↦ d = 6, x = −3/4 ↦ d = −3.
We may call d the square-free part of x.
Thus each x̄ ∈ Q× /Q×2 is represented by a unique square-free integer d,
establishing an isomorphism
Q× /Q×2 ←→ D,
where D is the group formed by the square-free integers under multiplication
modulo squares, eg
2 · 6 = 3, −3 · 6 = −2.
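The square-free part and the multiplication in D are easy to compute by trial division. This is a minimal sketch with hypothetical function names; it relies on m/n = (mn)/n^2, so that m/n and mn have the same square-free part.

```python
from fractions import Fraction

def squarefree_part_int(n: int) -> int:
    """Square-free d with n = d * k^2, for a non-zero integer n."""
    if n == 0:
        raise ValueError("zero has no square-free part")
    sign = -1 if n < 0 else 1
    n = abs(n)
    d, p = 1, 2
    while p * p <= n:
        e = 0
        while n % p == 0:
            n //= p
            e += 1
        if e % 2:            # keep p only if it occurs to an odd power
            d *= p
        p += 1
    return sign * d * n      # the leftover n is 1 or a single prime

def squarefree_part(x) -> int:
    """Square-free part of a non-zero rational x = m/n."""
    x = Fraction(x)
    return squarefree_part_int(x.numerator * x.denominator)

def multiply_mod_squares(d1: int, d2: int) -> int:
    """Product of two square-free integers in the group D."""
    return squarefree_part_int(d1 * d2)

print(squarefree_part(Fraction(2, 3)), squarefree_part(Fraction(-3, 4)))
print(multiply_mod_squares(2, 6), multiply_mod_squares(-3, 6))
```

The output reproduces the examples in the text: 6 and −3 for the square-free parts, then 3 and −2 for the two products.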
Let us see how to use this to compute the rank. Recall that
E/2E ≅ im Θ
where
Θ = θ_α × θ_β × θ_γ,
with θ_α, for example, given by
\[ P = (x, y) \mapsto \begin{cases} x - \alpha & \text{if } x \neq \alpha, \\ p'(\alpha) & \text{if } x = \alpha. \end{cases} \]
If P = (x, y) is on the elliptic curve
E(Q) : y 2 = x3 + ax2 + bx + c (a, b, c ∈ Z)
then
\[ x = \frac{m}{t^2}, \quad y = \frac{M}{t^3} \]
where m, M, t ∈ Z with gcd(m, t) = 1 = gcd(M, t) and t > 0.
9.10 An example
Consider the elliptic curve
E(Q) : y^2 = x^3 − x.
Here
α = 0, β = 1, γ = −1,
so that
im Θ ⊂ S = {(d, e, f ) : d | 1, e | 2, f | 2}.
‖E/2E‖ ≤ 2^5,
and so
rank E ≤ 3.
However, we can restrict the range of im Θ much more than this. In the
first place, since
x(x − 1)(x + 1) = y 2 ,
it follows that def is a perfect square, say
def = g 2 .
This implies firstly that def > 0, and secondly that each prime p dividing
any of d, e, f must in fact divide just two of them. This reduces the number
of cases to 8:
(d, e, f ) = (1, 1, 1), (1, −1, −1), (−1, 1, −1), (−1, −1, 1), (1, 2, 2), (1, −2, −2), (−1, 2, −2), (−1, −
m = du^2, m − t^2 = ev^2, m + t^2 = fw^2,
it follows that
while
(d, e, f ) = (1, 1, 1), (−1, −1, 1), (1, 2, 2), (−1, −2, 2).
Thus
‖E/2E‖ ≤ 4
Since d = 2 (as there are 3 points of order 2),
‖E/2E‖ = 2^(r+d) ≥ 4.
We conclude that
rank E = 0.
Here
p′(0) = −4, p′(2) = 8, p′(−2) = 8,
and so
E/2E = im Θ ⊂ {(d, e, f) : d, e, f | 2}
The group on the right contains 2^6 elements, since each of d, e, f can take
the values ±1, ±2.
But as before, the condition
def = g^2
Secondly, the factor 2 occurs in none, or just two, of d, e, f . This reduces the
choice to
(d, e, f ) = (1, 1, 1), (−1, −1, 1), (1, 2, 2), (−1, −2, 2), (2, 1, 2), (−2, −1, 2), (2, 2, 1), (−2, −2, 1).
Thus the rank is either 0 or 1. Can we reduce the choice further, and
reduce the rank to 0? or conversely, can we find a point of infinite order on
the curve, and so show that the rank is 1?
Note that it only necessary to eliminate one case; for we know that
kE/2Ek = 2s ≥ 4, since there are 3 points of order 2 (and so d = 2).
Suppose
(d, e, f ) = (−1, −1, 1).
In this case,
m = −u^2, m − 2t^2 = −v^2, m + 2t^2 = w^2.
Thus
v^2 − u^2 = 2t^2 = u^2 + w^2.
Now a^2 ≡ 0 or 1 mod 4 according as a is even or odd. Since v^2 − u^2 is even
it follows that u, v are both even or both odd; and in either case v^2 − u^2 ≡ 0 mod 4.
So t is even, and therefore u, v must both be odd, since gcd(m, t) = 1 =
gcd(m − 2t^2, t).
The point
P = (−1, 3) ∈ E.
(We chose α, β, γ to give this result.)
The slope at P is
\[ \frac{dy}{dx} = \frac{3x^2 + 4x - 8}{2y} = -\frac{3}{2} \]
at P . It follows that P is of infinite order (since 2P has non-integral coordi-
nates). Thus
r = rank(E) ≥ 1.
We have
p′(0) = −8, p′(2) = 12, p′(−4) = 24.
Thus
im Θ ⊂ S = {(d, e, f) : d | 2, e | 6, f | 6; def = g^2}.
Note that any two of d, e, f determine the third since eg f = de (modulo
squares).
If
P = (m/t2 , M/t3 ) 7→ (d, e, f )
then
m = du2 , m − 2t2 = ev 2 , m + 4t2 = f w2 .
Thus
d > 0 =⇒ m > 0 =⇒ f > 0 =⇒ e > 0,
while
d < 0 =⇒ m < 0 =⇒ e < 0 =⇒ f > 0.
(So f > 0 in all cases.)
It follows that
kSk = 16,
with
S = {d = ±1, ±2, f = 1, 2, 3, 6}.
It follows that s ≤ 4, and so
rank(E) = s − d = s − 2 ≤ 2.
Thus rank(E) = 1 or 2.
In order to prove that rank(E) = 1 it is sufficient to show that one of the
16 elements of S does not lie in im Θ. For ‖im Θ‖ is a power of 2, so if it is < 16
it must be ≤ 8.
Let us take the element (−1, −1, 1). Suppose this arises from a point
P = (m/t2 , M/t3 ), where for the moment we assume that P is not of order
2. Then
m = −u^2, m − 2t^2 = −v^2, m + 4t^2 = w^2.
Thus
2t^2 = v^2 − u^2, 4t^2 = u^2 + w^2.
From the second equation,
u^2 + w^2 ≡ 0 mod 4 ⟹ u, w even,
since a2 ≡ 0 or 1 mod 4 according as a is even or odd. It follows that t is
odd, since
gcd(m, t) = 1 =⇒ gcd(u, t) = 1.
But then t^2 ≡ 1 mod 4, and so
v^2 − u^2 ≡ 2 mod 4,
which is impossible.
(Alternatively, adding the two equations,
6t^2 = v^2 + w^2.
Thus
v^2 + w^2 ≡ 0 mod 3 ⟹ v ≡ w ≡ 0 mod 3
⟹ t ≡ 0 mod 3
⟹ u ≡ 0 mod 3,
Thus
m = du^2, m + t^2 = ev^2, m − 14t^2 = fw^2.
In particular,
while
while of course
0 = [0, 1, 0] 7→ (1, 1, 1).
Thus the torsion group gives rise to the subgroup
D = {(1, 1, 1), (−14, 1, −14), (−1, 15, −15), (14, 15, 14 · 15)}.
whence
dim im Θ = dim D =⇒ im Θ = D.
For our subspace U let us take those vectors with 3rd and 5th components
0, ie
U = {(d, e, f ) ∈ S : d = ±1, ±2, e = 1, 3}.
We see at once that U ∩D = {(1, 1, 1)} (the zero element of our vector space),
so U is — as required — complementary to D. It is sufficient therefore to
show that no element of U apart from (1, 1, 1) can be in im Θ. (This reduces
the number of cases to be considered from 28 to 7.)
ie
t^2 = u^2 + v^2, 14t^2 = w^2 − u^2.
ie
t^2 = v^2 − 2u^2, 7t^2 = u^2 − w^2.
From the second equation t must be even, since otherwise 7t^2 ≡ 3 mod
4, and u^2 − w^2 cannot be ≡ 3 mod 4.
But then from the first equation, v is even and so u is even, contradicting
gcd(u, t) = 1.
ie
t^2 = v^2 + 2u^2, 7t^2 = w^2 − u^2.
As in the last case, from the second equation t must be even, and then
from the first equation, so must v and u, contradicting gcd(u, t) = 1.
4. (1, 3, 3): in this case
m = u^2, m + t^2 = 3v^2, m − 14t^2 = 3w^2,
ie
t^2 = 3v^2 − u^2, 14t^2 = u^2 − 3w^2.
u^2 − 3w^2 ≡ 0 mod 7.
ie
t^2 = 3v^2 + u^2, 14t^2 = 3w^2 − u^2.
As in the last case, since 3 is not a quadratic residue mod 7, the second
equation implies that 7 | u, w, t, contradicting gcd(u, t) = 1.
6. (2, 3, 6): in this case
ie
7. (−2, 3, −6): in this case
ie
We conclude that
im Θ = D,
ie
rank E = 0.
Chapter 10
Mordell Revisited
10.1 Introduction
There is an alternative way of proving Mordell’s Theorem, by ‘factorising’
the doubling map
E → E : P 7→ 2P ;
although the factors are not, admittedly, homomorphisms from E to itself,
but involve a ‘twin’ elliptic curve Ē. The resulting computations are much
simpler. Moreover, the use of algebraic numbers is avoided if f (x) has one
rational root. (In the previous method, algebraic numbers were avoided if
two — and therefore all three — of the roots of f (x) are rational.)
The only disadvantage of this alternative method is that it requires either
an act of faith, in which ‘magic’ formulae are pulled out of a hat; or else a
rather lengthy digression into elliptic curves over C.
z ↦ (ϕ(z), ϕ′(z)/2)
Φ : C/Λ ↔ E(C),
where E(C) is the curve
y^2 = x^3 + bx + c,
with coefficients
b = −15g_2, c = −35g_3,
where
\[ g_r = \sum_{\omega \in \Lambda,\ \omega \neq 0} \frac{1}{\omega^{2r}}. \]
Under this correspondence, the ‘doubling’ homomorphism
Φ : P 7→ 2P
φ : z mod Λ 7→ 2z mod Λ.
φ = θ3 θ2 θ1 ,
where
The map θ3 is just the isomorphism (x, y) ↦ (x/4, y/8) associated to the
similarity (1/2)Λ → Λ; it is convenient to combine it with θ2.
Let Ē(C) be the elliptic curve associated to the lattice Λ̄:
Ē = C/Λ̄.
Θ : E → Ē, Θ̄ : Ē → E.
Note that
Φ̄ = ΘΘ̄ : Ē → Ē
is also a doubling map, this time on Ē, being given by the composition
Similarly Ē is parametrised by
Proposition 10.1 Let Λ = ⟨ω_1, ω_2⟩, and let Λ̄ = ⟨ω_1/2, ω_2⟩. Then, writing
ϕ(z) for ϕ_Λ(z),
\[ \varphi_{\bar\Lambda}(z) = \frac{\varphi(z)^2 - \alpha\varphi(z) + 3\alpha^2 + b}{\varphi(z) - \alpha}, \]
where α = ϕ(ω_1/2), and b is the coefficient in the functional equation
Proof I Since Λ̄ ⊂ Λ, ϕΛ̄ (z) is elliptic with respect to Λ. It is also even.
Hence it is a rational function of ϕ(z),
has periods ω1 /2, ω2 , and so is elliptic with respect to Λ̄. Since it has a double
pole at 0, and no other poles inside Π1 ,
has a double pole at the points of Λ, and no other poles. Since F (z) is even,
it follows that
F (z) = Cϕ(z) + D
for some constants C, D. To determine these constants we expand F (z)
around z = 0.
By Taylor’s theorem,
\[ \varphi(z + \omega_1/2) = \varphi(\omega_1/2) + \tfrac{1}{2}\varphi''(\omega_1/2)z^2 + \tfrac{1}{24}\varphi''''(\omega_1/2)z^4 + O(z^6). \]
On differentiating the functional equation
we deduce that
ϕ''(z) = 2(3ϕ(z)^2 + b).
Differentiating twice more,
In particular,
ϕ''(ω_1/2) = 2(3α^2 + b), ϕ''''(ω_1/2) = 24α(3α^2 + b).
Thus
ϕ(z + ω_1/2) = α + (3α^2 + b)z^2 + α(3α^2 + b)z^4 + O(z^6)
in the neighbourhood of z = 0. It follows that
\[ F(z) = \left( \alpha + (3\alpha^2 + b)z^2 \right)\left( \frac{1}{z^2} - \alpha \right) + O(z^2) = \frac{\alpha}{z^2} + (2\alpha^2 + b) + O(z^2). \]
Hence
F(z) = αϕ(z) + 2α^2 + b.
We conclude that
\[ \varphi_{\bar\Lambda}(z) = \varphi(z) - \alpha + \frac{\alpha\varphi(z) + 2\alpha^2 + b}{\varphi(z) - \alpha} = \frac{\varphi(z)^2 - \alpha\varphi(z) + 3\alpha^2 + b}{\varphi(z) - \alpha}. \]
J
Corollary 18 The derivative of ϕΛ̄ (z) is given by:
if x 6= α, while
Θ(α, 0) = O.
But what is the curve Ē? Recall that
\[ \varphi(z) = \frac{1}{z^2} + 3g_2 z^2 + 5g_3 z^4 + O(z^6) = \frac{1}{z^2} - \frac{b}{5} z^2 - \frac{c}{7} z^4 + O(z^6). \]
Similarly
\[ \varphi_{\bar\Lambda}(z) = \frac{1}{z^2} - \frac{b_1}{5} z^2 - \frac{c_1}{7} z^4 + O(z^6). \]
Thus we can determine b_1, c_1 by looking at the expansion of ϕ_Λ̄(z) around
z = 0. From above,
b̄ = −15α^2 − 4b,
c̄ = −21α^3 − 7αb + c
= −14α^3 + 8c,
since
α^3 + bα + c = 0.
The relation between Λ̄ and (1/2)Λ is exactly the same as that between Λ
and Λ̄, except that ω_1/2 is replaced by ω_2/2. More precisely,
α = ϕ(ω1 /2)
is replaced by
E(C) : y 2 = x3 + bx + c
Φ = Θ̄Θ
where
Θ : E → Ẽ, Θ̄ : Ẽ → E
are the maps
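\[ \Theta(x, y) = \left( \frac{x^2 - \alpha x + 3\alpha^2 + b}{x - \alpha},\ \frac{x^2 - 2\alpha x - 2\alpha^2 - b}{(x - \alpha)^2}\, y \right) \]
if (x, y) ≠ (α, 0), while Θ(α, 0) = Θ(O) = O; and
\[ \bar\Theta(\bar x, \bar y) = \left( \frac{1}{4} \cdot \frac{\bar x^2 + 2\alpha\bar x - 3\alpha^2 - 4b}{\bar x + 2\alpha},\ \frac{1}{8} \cdot \frac{\bar x^2 + 4\alpha\bar x + 7\alpha^2 + 4b}{(\bar x + 2\alpha)^2}\, \bar y \right) \]
if (x̄, ȳ) ≠ (−2α, 0), while Θ̄(−2α, 0) = Θ̄(O) = O.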
We want to transform Ē into constant-free format. At first sight there
might seem some ambiguity in this, since it involves choosing a point of
order 2 on Ē. However, we know the point we want: (α1 , 0) = (−2α, 0) =
(−2ȧ/3, 0). Our transformation must bring this to (0, 0), and is therefore
ẋ = x + 2α = x + 2ȧ/3.
where
ȧ1 = −2ȧ,
ḃ1 = 4ȧ2 /3 + ḃ1
= ȧ2 − 4ḃ,
where
(ẋ + ȧ/3)2 − ȧ(ẋ + ȧ/3)/3 + 3(ȧ/3)2 − ȧ2 /3 + ḃ
x̃ = + 2ȧ/3
(ẋ + ȧ/3 − ȧ/3)2
ẋ2 + ȧẋ + ḃ
= ,
ẋ2
(ẋ + ȧ/3)2 − 2ȧ/3(ẋ + ȧ/3) − 2(ȧ/3)2 + ȧ2 /3 − ḃ
ỹ = ẏ
(ẋ + ȧ/3 − ȧ/3)2
ẋ2 − ḃ
= ẏ.
ẋ2
We derive Θ̄˙ from this by substituting b̄˙ = ȧ2 − 4ḃ for ḃ, and dividing the
x- and y-coordinates by 4 and 8, respectively:
!
2 2 2
˙ ẋ, ẏ) = ẏ ẋ − ȧ + 4ḃ
Θ̄( , ẏ .
4ẋ2 8ẋ2
Definition 10.1 To each elliptic curve
E : y^2 = x^3 + ax^2 + bx.
Ē : y^2 = x^3 + āx^2 + b̄x,
where
ā = −2a, b̄ = a^2 − 4b.
E : y 2 = x3 + ax2 + bx
over the field K. Let Ē(K) be the associated elliptic curve, and let the maps
Θ : E → Ē, Θ̄ : Ē → E
be defined by
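\[ \Theta(x, y) = \left( \frac{y^2}{x^2},\ \frac{x^2 - b}{x^2}\, y \right) \]
if x ≠ 0, while Θ(O) = Θ(T) = O for T = (0, 0),
\[ \bar\Theta(\bar x, \bar y) = \left( \frac{\bar y^2}{4\bar x^2},\ \frac{\bar x^2 - \bar b}{8\bar x^2}\, \bar y \right) \]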
10.4 Divide and rule
Recall that our main aim is to show that if K = Q then [E : 2E] is finite.
The splitting of the doubling map allows us to divide this task.
φS : A → B → B/φS.
Evidently
im φS = φA/φS,
while
ker φS ⊃ S.
By the first isomorphism theorem,
φA/φS ∼
= A/ ker φS .
Hence
[φA : φS] = [A : ker φS ] ≤ [A : S].
J
Proposition 10.4 [E : 2E] and [Ē : 2Ē] are both finite if and only if [Ē :
im Θ] and [E : im Θ̄] are both finite.
Proof I We have
[E : 2E] = [E : Θ̄ΘE]
= [E : Θ̄Ē][Θ̄Ē : Θ̄ΘE]
≤ [E : Θ̄Ē][Ē : ΘE],
by Proposition 10.3 J
10.5 Characterisation of the image
Proposition 10.5 If P̄ = (x̄, ȳ) ∈ Ē with x̄ ≠ 0 then
P̄ ∈ im Θ ⟺ x̄ ∈ K^2.
P ∈ im Θ̄ ⟺ x ∈ K^2.
\[ \bar x = \frac{y^2}{x^2} \in K^2. \]
Conversely, suppose (x̄, ȳ) ∈ Ē; and suppose
x̄ = w^2,
\[ \frac{y^2}{x^2} = w^2. \]
We may suppose that
y = wx,
on taking −P if y = −wx.
Substituting y = wx in the equation for E,
w^2 x^2 = x^3 + ax^2 + bx.
x^2 + (a − w^2)x + b = 0.
(a − w^2)^2 − 4b ∈ K^2,
ie
ie
x̄^2 + āx̄ + b̄ ∈ K^2.
By hypothesis, x̄ ∈ K^2. Hence
x̄^2 + āx̄ + b̄ ∈ K^2,
under which
\[ P \mapsto \begin{cases} x \bmod K^{\times 2} & \text{if } P = (x, y) \text{ with } x \neq 0, \\ b \bmod K^{\times 2} & \text{if } P = T = (0, 0), \\ 1 \bmod K^{\times 2} & \text{if } P = O \end{cases} \]
is a homomorphism.
Proof I Trivially,
χ(−P) = χ(P) = 1/χ(P),
since x = 1/x for all x ∈ K^×/K^×2 (ie all elements are of order 1 or 2).
Now suppose
P + Q + R = 0,
ie P, Q, R are collinear. We have to show that
χ(P )χ(Q)χ(R) = 1.
Q + R = 0 =⇒ χ(Q)χ(R) = 1.
Suppose none of the points is O. Let the line P QR be y = mx + d. This
line meets E where
(mx + d)^2 = x^3 + ax^2 + bx.
The roots of this are the x-coordinates of P, Q, R, say x1, x2, x3. Thus
x1 x2 x3 = d^2.
If none of x1, x2, x3 is zero, then
χ(P)χ(Q)χ(R) = x1 x2 x3 ≡ 1 mod K^×2,
as required.
Finally, suppose one of x1, x2, x3 is 0, say x1 = 0, ie P = T = (0, 0). Then
d = 0, and the remaining two points satisfy the quadratic
m^2 x = x^2 + ax + b.
Thus
x2 x3 = b.
Now χ(T) = b (by what may have seemed an arbitrary definition, but whose
purpose is now apparent); so
χ(P)χ(Q)χ(R) = b x2 x3 = b^2 ≡ 1 mod K^×2.
Thus in all cases
P + Q + R = 0 =⇒ χ(P )χ(Q)χ(R) = 1.
Hence χ is a homomorphism. J
Now we can re-state Proposition 10.5 as
Proposition 10.8 [E : 2E] and [Ē : 2Ē] are both finite if and only if im χ and
im χ̄ are both finite.
10.7 The rational case
So far we have been working over a general field K. Now let us turn to the
rational case K = Q. Note that since T = (0, 0) ∈ E, we are assuming that
our elliptic curve contains a rational point of order 2.
and let
χ : E → Q^×/Q^×2
be the associated homomorphism under which
P = (x, y) ↦ x mod Q^×2.
where b1 | b.
Let
b1 = gcd(m, m^2 + ae^2 m + be^4)
Then
b1 = gcd(m, be^4)
= gcd(m, b),
b = b1 b2, m = b1 m1,
Hence b1^2 | n^2, and so b1 | n, say
n = b1 n1.
Thus
n1^2 = m1(b1 m1^2 + ae^2 m1 + b2 e^4).
The two factors on the right are co-prime, since we took out their common
factor. Hence
m1 = U^2, b1 m1^2 + ae^2 m1 + b2 e^4 = V^2.
For future reference we note that this implies
b1 U^4 + ae^2 U^2 + b2 e^4 = V^2,
where b1 | b̄.
[E : 2E] ≤ [E : im Θ̄][Ē : im Θ]
= ‖im χ‖ · ‖im χ̄‖.
But these two images are finite, by Proposition 10.9 and its Corollary. J
10.8 Determining the rank of E
We know that
E = F ⊕ Zr ,
where F is the torsion subgroup of E, and r is its rank. It follows that
A = Z/(p_1^{e_1}) ⊕ · · · ⊕ Z/(p_r^{e_r})
= C_1 ⊕ · · · ⊕ C_r,
a 7→ 2a.
Then
ker φ = {a ∈ A : 2a = 0}.
If p ≠ 2 then A has no elements of order 2, by Lagrange’s Theorem.
Hence ker φ = 0, and so
2A = A,
ie every element a ∈ A is of the form a = 2b for some b ∈ A.
On the other hand, if p = 2 then Z/(2^e) has just one element of order 2,
namely 2^{e−1} mod 2^e. Thus ‖ker φ‖ = 2; and so
[A : 2A] = 2.
Corollary If
A = Z/(p_1^{e_1}) ⊕ · · · ⊕ Z/(p_r^{e_r})
then
[A : 2A] = 2^d,
where d is the number of factors with p_i = 2.
[A : 2A] = 2^d,
Proof I As we saw above, the factor Z/(pe ) contains just one element of
order 2 if p = 2 and none otherwise. But the element
a = a1 ⊕ · · · ⊕ ar
Then
\[ [F : 2F] = \begin{cases} 4 & \text{if } \bar b \in \mathbb{Q}^2, \\ 2 & \text{if } \bar b \notin \mathbb{Q}^2. \end{cases} \]
Proof I P ∈ E is of order 2 if P = (α, 0), where α is a root of
x3 + ax2 + bx = 0.
One root is α = 0; the other two are the roots of the quadratic
x2 + ax + b = 0.
This has rational roots if and only if
a^2 − 4b = b̄ ∈ Q^2.
Thus E has 3 or 1 points of order 2, and so [F : 2F ] = 4 or 2, according as b̄
is or is not a perfect square. J
We proved that [E : 2E] is finite by showing that
[E : 2E] = [E : Θ̄ΘE]
= [E : Θ̄Ē][Θ̄Ē : Θ̄(ΘE)]
≤ [E : Θ̄Ē][Ē : ΘE].
But now we need a slightly more precise result in place of Proposition 10.3.
Corollary 20 We have
\[ [E : 2E] = \frac{[E : \operatorname{im} \bar\Theta]\,[\bar E : \operatorname{im} \Theta]}{[\ker \bar\Theta \cap \operatorname{im} \Theta]} \]
Proof I This follows at once from the definitions of Θ, Θ̄, since Θ(x, y) is
finite (ie Z 6= 0) if x 6= 0; and Θ̄(x̄, ȳ) is finite if x̄ 6= 0. J
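$$[E : 2E] = \frac{[E : \operatorname{im}\bar\Theta]\,[\bar E : \operatorname{im}\Theta]}{d},$$
where
$$d = \begin{cases} 1 & \text{if } \bar b \in \mathbb{Q}^2, \\ 2 & \text{if } \bar b \notin \mathbb{Q}^2. \end{cases}$$
T̄ ∈ im Θ.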
Theorem 10.3 If the rank of the elliptic curve
$$E(\mathbb{Q}) : y^2 = x^3 + ax^2 + bx$$
is r then
$$2^r = \frac{\|\operatorname{im}\chi\| \cdot \|\operatorname{im}\bar\chi\|}{4}.$$
Proof ▶ If
$$E = F \oplus \mathbb{Z}^r$$
then as we saw
$$2^r = \frac{[E : 2E]}{[F : 2F]}.$$
The result now follows at once from Propositions 10.14 and 10.11. ◀
10.9 An example
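Consider the elliptic curve
$$E(\mathbb{Q}) : y^2 = x^3 + x.$$
The associated curve is
$$\bar E(\mathbb{Q}) : y^2 = x^3 - 4x.$$
Thus
$$b = 1, \qquad \bar b = -4.$$
If the rank of E is r then
$$2^r = \frac{\|\operatorname{im}\chi\| \cdot \|\operatorname{im}\bar\chi\|}{4}$$
by Theorem 10.3. We have to determine ‖im χ‖, ‖im χ̄‖.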
Let us consider
$$\chi : E \to \mathbb{Q}^\times / \mathbb{Q}^{\times 2}$$
first. We know that the elements of im χ are of the form
$$b_1 \bmod \mathbb{Q}^{\times 2},$$
$$b_1 = \pm 1.$$
Certainly 1 = χ(O) ∈ im χ. We have to determine if −1 ∈ im χ.
We saw in the proof of Proposition 10.9 that if this is so then we can find
e, U, V with e ≥ 1, gcd(U, V ) = 1 satisfying
$$b_1 U^4 + b_2 e^4 = V^2,$$
ie
$$-U^4 - e^4 = V^2,$$
which is clearly impossible. Hence −1 ∉ im χ, and so
$$\operatorname{im}\chi = \{1\}.$$
Turning to χ̄, we have b̄ = −4, and so $b_1 = \pm 1, \pm 2$. (We can omit
$b_1 = \pm 4$, since we are working modulo squares.) We know that 1 ∈ im χ̄.
Also
$$\bar\chi(\bar T) = \bar b = -4 \equiv -1,$$
where T̄ = (0, 0). Thus −1 ∈ im χ̄.
It remains to determine if $b_1 = \pm 2$ ∈ im χ̄. (Note that if one is in the
image then so is the other, since im χ̄ is a subgroup containing −1.) For
$b_1 = 2$, $b_2 = -2$, we have to solve the equation
$$2U^4 - 2e^4 = V^2.$$
This has the trivial solution (e, U, V) = (1, 1, 0) (corresponding to the point
P = (2, 0) ∈ Ē).
We conclude that
im χ̄ = {±1, ±2}.
(Note that once we knew that im χ = {1}, it followed from Theorem 10.3
that k im χ̄k ≥ 4; so in fact it was clear that im χ̄ = {±1, ±2}.)
Hence
$$2^r = \frac{1 \cdot 4}{4} = 1,$$
ie E is of rank 0, that is, E(Q) is finite.
Now we can find E = F easily, by the Nagell-Lutz Theorem. We have
$$D = -4.$$
Hence y = 0, ±1, ±2. But the equations
$$x^3 + x - 1 = 0, \qquad x^3 + x - 4 = 0$$
have no rational solutions. Hence the only rational points on E are O and the single point
(0, 0) of order 2 (the only rational root of $x^3 + x$ is x = 0):
$$E = \{O, (0, 0)\}.$$
10.10 Another example
If b is not a perfect square, then 1 = χ(O), b = χ(T) are distinct elements
of im χ. Similarly, if b̄ is not a perfect square, then 1, b̄ are distinct elements
of im χ̄.
Thus if neither b nor b̄ is a perfect square then these elements alone
contribute 4 to ‖im χ‖ · ‖im χ̄‖; so by Theorem 10.3 any further element in
either of these images ensures that the rank is ≥ 1.
Consider the elliptic curve
$$E(\mathbb{Q}) : y^2 = x^3 + 3x.$$
Thus
$$b = 3, \qquad \bar b = -12.$$
We know that 3 = χ(T) ∈ im χ. On the other hand −1, −3 ∉ im χ, since
$$b_1 U^4 + b_2 e^4 < 0$$
$$-U^4 + 12e^4 = V^2$$
$$U^4 \equiv 1 \bmod 4,$$
and so
$$-U^4 + 12e^4 \equiv -1 \bmod 4.$$
Since −1 is not a square mod 4, the equation has no solution, and −1 ∉ im χ̄.
Thus ‖im χ̄‖ = 2 or 4.
The equation for $b_1 = -2$, $b_2 = 6$ is
$$-2U^4 + 6e^4 = V^2,$$
which has the obvious solution (e, U, V) = (1, 1, 2). Thus −2 ∈ im χ̄. It
follows that
$$\operatorname{im}\bar\chi = \{1, -2, -3, 6\}.$$
In particular ‖im χ̄‖ = 4, and so
$$\operatorname{rank}(E) = 1.$$
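Deciding whether a given $b_1$ lies in the image comes down to finding, or ruling out, a solution of $b_1 U^4 + a e^2 U^2 + b_2 e^4 = V^2$. A small search can find solutions but can never prove that none exist; that still needs a congruence or sign argument as in the text. The sketch below is illustrative only: the function name `quartic_solution` and the search bound are our own choices, and we assume the normalisation gcd(U, e) = 1.

```python
from math import gcd, isqrt

def quartic_solution(b1, a, b2, bound):
    """Search for (e, U, V) with b1*U^4 + a*e^2*U^2 + b2*e^4 = V^2.

    Only 1 <= e, U <= bound with gcd(U, e) = 1 are tried, so a result of
    None says nothing beyond the search range.
    """
    for e in range(1, bound + 1):
        for U in range(1, bound + 1):
            if gcd(U, e) != 1:
                continue
            rhs = b1 * U**4 + a * e * e * U * U + b2 * e**4
            if rhs >= 0 and isqrt(rhs) ** 2 == rhs:
                return (e, U, isqrt(rhs))
    return None

print(quartic_solution(-2, 0, 6, 5))    # the case b1 = -2, b2 = 6 for y^2 = x^3 - 12x
print(quartic_solution(-1, 0, 12, 20))  # the case b1 = -1, b2 = 12
```

The first call recovers the solution (e, U, V) = (1, 1, 2) quoted above; the second finds nothing, in line with the argument mod 4.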
Also
$$y^2 = x^3 + 3x = x(x^2 + 3) \implies x \ge 0.$$
It is readily verified that the only possible points of finite order are: O, (0, 0), (1, ±2), (3, ±6).
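The list of candidates comes from the Nagell-Lutz criterion: a point of finite order has integer coordinates, and either y = 0 or $y^2$ divides the discriminant, which for $x^3 + bx$ is $D = -4b^3$. A minimal sketch of that enumeration for curves of the special form $y^2 = x^3 + bx$ follows; the function name `nagell_lutz_candidates` is hypothetical. Note that the criterion gives candidates only, and a candidate may still have infinite order.

```python
from math import isqrt

def nagell_lutz_candidates(b):
    """Affine integer points on y^2 = x^3 + b*x with y = 0 or y^2 | D."""
    D = -4 * b**3
    found = set()
    for y in range(0, isqrt(abs(D)) + 1):
        if y != 0 and D % (y * y) != 0:
            continue
        # x divides y^2 when y != 0; when y = 0 either x = 0 or x^2 = -b.
        bound = max(y * y, abs(b))
        for x in range(-bound, bound + 1):
            if x**3 + b * x == y * y:
                found.update({(x, y), (x, -y)})
    return sorted(found)

print(nagell_lutz_candidates(3))
```

For b = 3 this returns the five affine candidates listed above.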
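We can use the ‘factors of double’ to simplify computation of 2P. (Alternatively,
we could find where the tangent at P meets the curve again, in
the usual way.) Let S = (1, 2). Then
$$\Theta(S) = \left( \frac{2^2}{1^2},\ \frac{1^2 - 3}{1^2} \cdot 2 \right) = (4, -4),$$
and so
$$2S = \bar\Theta\Theta(S) = \bar\Theta(4, -4) = \left( \frac{1}{4} \cdot \frac{16}{16},\ -\frac{1}{8} \cdot \frac{4^2 + 12}{4^2} \cdot 4 \right) = \left( \frac{1}{4},\ -\frac{7}{8} \right).$$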
it follows that
and so
(3, 6) = T − S.
Thus
F = {O, T }.
It is an interesting — if long-winded — exercise to show that T and S
together generate E:
$$E(\mathbb{Q}) = \langle T \rangle \oplus \langle S \rangle \cong \mathbb{Z}/(2) \oplus \mathbb{Z}.$$
$$P = nS \quad\text{or}\quad P = T + nS.$$
10.11 Computing the rank — II
Recall that we associate to the elliptic curve
$$E : y^2 = x^3 + ax^2 + bx$$
the curve
$$E_1 : y^2 = x^3 + a_1 x^2 + b_1 x,$$
where
$$a_1 = -2a, \qquad b_1 = a^2 - 4b.$$
The map E → E : P ↦ 2P factorises into two homomorphisms
$$\Theta : E \to E_1, \qquad \Phi : E_1 \to E,$$
defined by
$$\Theta(x, y) = \left( \frac{x^2 + ax + b}{x},\ \frac{x^2 - b}{x^2}\, y \right), \qquad \Phi(x_1, y_1) = \left( \frac{x_1^2 + a_1 x_1 + b_1}{4x_1},\ \frac{x_1^2 - b_1}{8x_1^2}\, y_1 \right),$$
except that in each case the point (0, 0) of order 2 maps to 0. (Thus each
homomorphism has kernel {0, (0, 0)}, since every affine point apart from (0, 0)
maps to an affine point.)
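The factorisation 2P = Φ(Θ(P)) can be checked numerically with exact rational arithmetic. The sketch below is only an illustration: the function names are ours, the point S = (1, 2) on $y^2 = x^3 + 3x$ is used as test data, and the comparison is against the usual tangent construction for a curve $y^2 = x^3 + ax^2 + bx$.

```python
from fractions import Fraction as Fr

def theta(P, a, b):
    """Theta: E -> E1 on an affine point with x != 0."""
    x, y = P
    return ((x * x + a * x + b) / x, y * (x * x - b) / (x * x))

def phi(P1, a1, b1):
    """Phi: E1 -> E on an affine point with x1 != 0."""
    x1, y1 = P1
    return ((x1 * x1 + a1 * x1 + b1) / (4 * x1),
            y1 * (x1 * x1 - b1) / (8 * x1 * x1))

def double_by_tangent(P, a, b):
    """2P on y^2 = x^3 + a x^2 + b x via the tangent line (y != 0)."""
    x, y = P
    m = (3 * x * x + 2 * a * x + b) / (2 * y)
    x3 = m * m - a - 2 * x
    return (x3, -(y + m * (x3 - x)))

a, b = 0, 3
a1, b1 = -2 * a, a * a - 4 * b
S = (Fr(1), Fr(2))
two_S = phi(theta(S, a, b), a1, b1)
print(two_S, two_S == double_by_tangent(S, a, b))
```

Using `Fraction` rather than floating point keeps the comparison exact.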
It follows (by a little elementary group theory) that
$$\chi : E \to \mathbb{Q}^\times/\mathbb{Q}^{\times 2}, \qquad \chi_1 : E_1 \to \mathbb{Q}^\times/\mathbb{Q}^{\times 2}$$
defined by
then
im Θ = ker χ1 , im Φ = ker χ.
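It follows that
$$[E : 2E] = \frac{\|\operatorname{im}\chi\|\,\|\operatorname{im}\chi_1\|}{e},$$
where
$$e = \begin{cases} 1 & \text{if } b_1 \text{ is a perfect square,} \\ 2 & \text{otherwise.} \end{cases}$$
Since r = rank E is given by
$$2^{r+d} = [E : 2E],$$
$$P = (x, y) \in E : y^2 = x^3 + ax^2 + bx + c,$$
$$E(\mathbb{Q}) : y^2 = x^3 + ax^2 + bx$$
$$d u^4 + a u^2 t^2 + d' t^4 = v^2.$$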
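Proof ▶ Suppose
$$P = \left( \frac{du^2}{t^2},\ \frac{M}{t^3} \right) \in E.$$
Then
$$\frac{M^2}{t^6} = \left( \frac{du^2}{t^2} \right)^3 + a\,\frac{d^2 u^4}{t^4} + b\,\frac{du^2}{t^2}.$$
Thus
$$d u^4 + a u^2 t^2 + d' t^4 = v^2.$$
$$p \mid v, t \implies p^2 \mid du^4 \implies p \mid u,$$
contradicting gcd(u, t) = 1. ◀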
10.12 Example
Consider the elliptic curve
$$y^2 = x^3 + 1$$
over the rationals. There is one point of order 2 on the curve, namely D =
(−1, 0).
(The point P = (2, 3) is also on the curve. Since
$$\frac{dy}{dx} = \frac{3x^2}{2y} = \frac{12}{6} = 2$$
at this point, the tangent at P cuts E again at (X, Y), where
$$2 + 2 + X = 2^2,$$
ie
$$X = 0.$$
$$E : y^2 = x^3 - 3x^2 + 3x$$
Thus
$$a = -3, \qquad b = 3,$$
and so
$$a_1 = 6, \qquad b_1 = -3,$$
ie the associated curve is
$$E_1 : y^2 = x^3 + 6x^2 - 3x.$$
Since there is just one point of order 2 on E, and $b_1$ is not a perfect square,
$$2^{r+1} = \frac{\|\operatorname{im}\chi\|\,\|\operatorname{im}\chi_1\|}{2},$$
We start by computing ‖im χ‖. Since d | 3,
$$\operatorname{im}\chi \subset \{\pm 1, \pm 3\}.$$
Since (0, 0) ↦ 3,
$$\operatorname{im}\chi = \{1, 3\} \text{ or } \{\pm 1, \pm 3\}.$$
Suppose d = −1. Then d′ = −3, and we are looking for solutions of
$$-u^4 - 3u^2 t^2 - 3t^4 = v^2.$$
Since the left-hand side is negative while the right-hand side is non-negative, there
is no such solution. Hence
$$\operatorname{im}\chi = \{1, 3\}.$$
Turning to im $\chi_1$, we again have d | 3, and so
$$\operatorname{im}\chi_1 \subset \{\pm 1, \pm 3\}.$$
Again, consider d = −1. Now d′ = 3, and we are looking for solutions of
$$-u^4 + 6u^2 t^2 + 3t^4 = v^2.$$
$$3^2 \mid u^4,\ u^2 t^2,\ v^2 \implies 3^2 \mid 3t^4 \implies 3 \mid t,$$
$$\operatorname{rank} E = r = 0.$$
$$E_1 : y^2 = x^3 + 4x,$$
$$\operatorname{im}\chi \subset \{\pm 1\}.$$
In fact, since (0, 0) ↦ −1,
$$\operatorname{im}\chi = \{\pm 1\}.$$
Turning to im $\chi_1$, since d | 4 ⟹ d | 2 (as d is square-free),
$$\operatorname{im}\chi_1 \subset \{\pm 1, \pm 2\}.$$
$$-u^4 - 4t^4 = v^2,$$
which is impossible, since the left-hand side is negative, while the right-hand
side is non-negative. Thus
$$\operatorname{im}\chi_1 = \{1, 2\}.$$
We conclude that
$$2^{r+2} = 2 \cdot 2,$$
whence
$$\operatorname{rank} E = r = 0.$$
which we already saw (in the last Chapter) has rank 1, with the point P =
(−1, 3) having infinite order.
Since
$$a_1 = -2a = -4, \qquad b_1 = a^2 - 4b = 36,$$
the associated curve is
$$E_1 : y^2 = x^3 - 4x^2 + 36x.$$
$$2^{r+2} = \|\operatorname{im}\chi\|\,\|\operatorname{im}\chi_1\|.$$
If d ∈ im χ then d | −8. Thus
$$\operatorname{im}\chi \subset \{\pm 1, \pm 2\}.$$
$$\operatorname{im}\chi = \{\pm 1, \pm 2\}.$$
The point (0, 0) ↦ 1 (since 36 ≡ 1 modulo squares), which is not much help.
Consider d = −1. In this case d′ = −36, and we have to solve the equation
$$3 \mid u, v \implies 3^2 \mid 12t^4 \implies 3 \mid t$$
while
$$3 \mid v, t \implies 3^2 \mid 3u^4 \implies 3 \mid u,$$
$$\operatorname{rank} E = 1.$$
Chapter 11
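Recall that
$$SL(2, \mathbb{R}) = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} : a, b, c, d \in \mathbb{R},\ ad - bc = 1 \right\}.$$
By analogy we set
$$SL(2, \mathbb{Z}) = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} : a, b, c, d \in \mathbb{Z},\ ad - bc = 1 \right\}.$$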
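Proof ▶ Suppose
$$X = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in Z(SL(2, \mathbb{Z})).$$
Let
$$S = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \qquad T = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}.$$
Then
$$SX = XS \implies \begin{pmatrix} -c & -d \\ a & b \end{pmatrix} = \begin{pmatrix} b & -a \\ d & -c \end{pmatrix} \implies a = d,\ b = -c;$$
while
$$TX = XT \implies \begin{pmatrix} a + c & b + d \\ c & d \end{pmatrix} = \begin{pmatrix} a & a + b \\ c & c + d \end{pmatrix} \implies c = 0.$$
Thus
$$b = c = 0 \implies X = \pm I.$$
◀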
Definition 11.1 The modular group Γ is the quotient-group
$$\Gamma = SL(2, \mathbb{Z})/\{\pm I\}.$$
$$H = \{z \in \mathbb{C} : \Im(z) > 0\}$$
by
$$gz = \frac{az + b}{cz + d}$$
if g = X̄, where
$$X = \begin{pmatrix} a & b \\ c & d \end{pmatrix}.$$
This action is faithful, ie g ∈ Γ acts trivially only if g = e. This allows us
to identify g ∈ Γ with the corresponding transformation of H.
$$\Gamma = \langle s, t \rangle.$$
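Step B We have
$$XT^r = \begin{pmatrix} a & b + ra \\ c & d + rc \end{pmatrix}.$$
We can choose r so that
$$|b + ra| \le |a|/2.$$
$$|b| \le |a|/2.$$
Step C We have
$$T^r X = \begin{pmatrix} a + rc & b + rd \\ c & d \end{pmatrix}.$$
We can choose r so that
$$|b + rd| \le |d|/2.$$
$$|b| \le |d|/2.$$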
Note that in each of these steps, |b| + |c| is either reduced or at worst left
unchanged. We may suppose therefore that we reach a stage where none of
the steps leads to any “improvement”, ie our matrix entries satisfy
Hence
|bc| ≤ |ad|/4.
But
$$\begin{aligned} ad - bc = 1 &\implies |ad| - 1 \le |bc| \\ &\implies |ad| - 1 \le |ad|/4 \\ &\implies |ad| \le 4/3 \\ &\implies |ad| = 1 \\ &\implies |bc| \le 1/4 \\ &\implies |bc| = 0 \\ &\implies b = c = 0. \end{aligned}$$
Accordingly, we have found ‘words’ $W_1, W_2$ in $S, T, T^{-1}$ such that
$$W_1 X W_2 = \pm I.$$
It follows that
$$X = \pm W_1^{-1} W_2^{-1}.$$
Since $-I = S^2$, we have expressed X as a word in $S, T, T^{-1}$. Thus S, T
generate SL(2, Z); and so s, t generate Γ. ◀
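The fact that S and T generate SL(2, Z) can be made effective. The sketch below does not follow the reduction steps of the proof above; it swaps in the closely related Euclidean algorithm on the first column, which is shorter to code. The function names and the letter `t` for $T^{-1}$ are our own conventions.

```python
def mul(A, B):
    """Product of two 2x2 integer matrices given as nested tuples."""
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

GENS = {"S": ((0, -1), (1, 0)), "T": ((1, 1), (0, 1)), "t": ((1, -1), (0, 1))}

def power_of_T(n):
    """Letters spelling T^n ('t' stands for the inverse of T)."""
    return ["T"] * n if n >= 0 else ["t"] * (-n)

def word_for(X):
    """Return letters in S, T, t whose ordered product equals X."""
    (a, b), (c, d) = X
    if a * d - b * c != 1:
        raise ValueError("matrix is not in SL(2, Z)")
    word = []
    while c != 0:
        q = a // c
        a, b = a - q * c, b - q * d      # replace X by T^-q X
        a, b, c, d = -c, -d, a, b        # then by S T^-q X
        word += power_of_T(q) + ["S"] * 3  # undo: T^q S^-1, and S^-1 = S^3
    # Now c = 0, so a = d = 1 or a = d = -1.
    word += power_of_T(b) if a == 1 else ["S", "S"] + power_of_T(-b)
    return word

def evaluate(word):
    """Multiply out a word in the generators."""
    M = ((1, 0), (0, 1))
    for letter in word:
        M = mul(M, GENS[letter])
    return M

X = ((7, -3), (5, -2))
print("".join(word_for(X)), evaluate(word_for(X)) == X)
```

Each pass through the loop strictly reduces |c|, so the procedure terminates, just as |b| + |c| cannot keep decreasing in the proof.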
Corollary 21 Γ is generated by s, u:
$$\Gamma = \langle s, u \rangle.$$
where
0 ≤ i0 , in ≤ 2, 1 ≤ ij , in ≤ 2 (0 < j < n).
Evidently
g, h ∈ Γ+ =⇒ gh ∈ Γ+ .
Now
0 −1 0 −1 0 −1
SU = =
1 0 1 1 1 1
J
$$X \equiv Y \pmod{m}$$
as a shorthand for
$$X_{ij} \equiv Y_{ij} \pmod{m}$$
for all i, j.
It is easy to see that
under which
$$X \mapsto X \bmod n$$
is a ring-homomorphism.
$$z \mapsto \frac{az + b}{cz + d}$$
with
$$a \equiv d \equiv 1 \pmod{n}, \qquad b \equiv c \equiv 0 \pmod{n}.$$
Proof I
J
Appendix A
The Structure of
Finitely-Generated Abelian
Groups
Proposition A.1 If
0→A→B→C→0
is an exact sequence of abelian groups then B is finitely-generated if and only
if A and C are both finitely-generated.
say. Thus
$$A = \langle a, a_1, \ldots, a_s \rangle.$$
Conversely, suppose A is generated by $\{a_1, \ldots, a_r\}$, and C is generated
by $\{b_1, \ldots, b_s\}$, where $b_1, \ldots, b_s \in B$. Then it is readily verified that B is
generated by $\{a_1, \ldots, a_r, b_1, \ldots, b_s\}$. ◀
Proposition A.4 The quotient-group A/F is torsion-free.
$$p^m a = 0, \qquad p^n b = 0,$$
$$F = \bigoplus_p A_p.$$
$$n = p_1^{e_1} \cdots p_r^{e_r};$$
and set
$$m_i = n / p_i^{e_i}.$$
Then $\gcd(m_1, \ldots, m_r) = 1$, and so we can find $n_1, \ldots, n_r$ such that
$$m_1 n_1 + \cdots + m_r n_r = 1.$$
Thus
$$a = a_1 + \cdots + a_r,$$
where
$$a_i = m_i n_i a.$$
But
$$p_i^{e_i} a_i = (p_i^{e_i} m_i) n_i a = n n_i a = 0$$
(since na = 0). Hence
$$a_i \in A_{p_i}.$$
Thus A is the sum of the subgroups $A_p$.
To see that this sum is direct, suppose
$$a_1 + \cdots + a_r = 0,$$
$$p_i^{e_i} a_i = 0.$$
Let
$$m_i = p_1^{e_1} \cdots p_{i-1}^{e_{i-1}}\, p_{i+1}^{e_{i+1}} \cdots p_r^{e_r}.$$
Then
$$m_i a_j = 0 \quad \text{if } i \ne j.$$
Thus (multiplying the given relation by $m_i$),
$$m_i a_i = 0.$$
$$m m_i + n p_i^{e_i} = 1.$$
But then
$$a_i = m(m_i a_i) + n(p_i^{e_i} a_i) = 0.$$
We conclude that A is the direct sum of its p-components $A_p$. ◀
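The splitting $a = \sum m_i n_i a$ is entirely constructive. As an illustration only, the sketch below carries it out in the cyclic group Z/(n); the helper names (`factorise`, `ext_gcd`, `bezout`, `p_components`) are ours, and the coefficients $n_i$ returned depend on how the Bezout relation is found, so the test data checks properties rather than particular values.

```python
def factorise(n):
    """Prime factorisation of n > 1 by trial division, as {p: e}."""
    factors, p = {}, 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def ext_gcd(a, b):
    """Return (g, x, y) with a*x + b*y = g = gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, x, y = ext_gcd(b, a % b)
    return g, y, x - (a // b) * y

def bezout(ms):
    """Return (g, ns) with sum(m * n for m, n in zip(ms, ns)) = g."""
    g, coeffs = ms[0], [1]
    for m in ms[1:]:
        g, x, y = ext_gcd(g, m)
        coeffs = [c * x for c in coeffs] + [y]
    return g, coeffs

def p_components(a, n):
    """Split a in Z/(n) as a sum of elements a_p of p-power order."""
    factors = factorise(n)
    ms = [n // p**e for p, e in factors.items()]
    _, ns = bezout(ms)           # gcd of the m_i is 1
    return {p: (m * k * a) % n for p, m, k in zip(factors, ms, ns)}

print(p_components(7, 360))
```

Each component $a_p$ is killed by the full power of p dividing n, and the components sum back to a.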
Theorem A.1 Suppose A is a finite abelian p-group (ie each element is of
order $p^e$ for some e). Then A can be expressed as a direct sum of cyclic
p-groups:
$$A = \mathbb{Z}/(p^{e_1}) \oplus \cdots \oplus \mathbb{Z}/(p^{e_r}).$$
Moreover the powers $p^{e_1}, \ldots, p^{e_r}$ are uniquely determined by A.
pA = {pa : a ∈ A}.
$$pA = A \implies p^n A = A,$$
$$n_1 a_1 + \cdots + n_r a_r = 0.$$
$$m_1 (pa_1) + \cdots + m_r (pa_r) = 0,$$
$$n_1 (pa_1) + \cdots + n_r (pa_r) = 0,$$
and so $p n_i a_i = 0$ for all i. But if $p \nmid n_i$ this implies that $pa_i = 0$. (For the
order of $a_i$ is a power of p, say $p^e$; while $p^e \mid n_i p$ implies that e ≤ 1.) But
this contradicts our choice of $pa_i$ as a generator of a direct summand of pA.
Thus the subgroup B ⊂ A is expressed as a direct sum
$$B = \langle a_1 \rangle \oplus \cdots \oplus \langle a_r \rangle.$$
Let
$$K = \{a \in A : pa = 0\}.$$
Then
A = B + K.
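For suppose a ∈ A. Then pa ∈ pA, and so
$$pa = n_1 (pa_1) + \cdots + n_r (pa_r)$$
$$p(a - n_1 a_1 - \cdots - n_r a_r) = 0,$$
and so
$$a - n_1 a_1 - \cdots - n_r a_r = k \in K.$$
Hence
$$a = (n_1 a_1 + \cdots + n_r a_r) + k \in B + K.$$
If B = A then all is done. If not, then K ⊄ B, and so we can find
$k_1 \in K$, $k_1 \notin B$. Now the sum
$$B_1 = B + \langle k_1 \rangle$$
is direct. For $\langle k_1 \rangle$ is a cyclic group of order p, and so has no proper non-trivial subgroups.
Thus
$$B \cap \langle k_1 \rangle = \{0\},$$
and so
$$B_1 = B \oplus \langle k_1 \rangle.$$
If now $B_1 = A$ we are done. If not we can repeat the construction, by
choosing $k_2 \in K$, $k_2 \notin B_1$. As before, this gives us a direct sum
$$B_2 = B_1 \oplus \langle k_2 \rangle = B \oplus \langle k_1 \rangle \oplus \langle k_2 \rangle.$$
Continuing in this way, the construction must end after a finite number
of steps (since A is finite):
$$A = B_s = B \oplus \langle k_1 \rangle \oplus \cdots \oplus \langle k_s \rangle = \langle a_1 \rangle \oplus \cdots \oplus \langle a_r \rangle \oplus \langle k_1 \rangle \oplus \cdots \oplus \langle k_s \rangle.$$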
It remains to show that the powers $p^{e_1}, \ldots, p^{e_r}$ are uniquely determined
by A. This follows easily by induction. For if A has the form given in the
theorem then
$$pA = \mathbb{Z}/(p^{e_1 - 1}) \oplus \cdots \oplus \mathbb{Z}/(p^{e_r - 1}).$$
Thus if e > 1 then $\mathbb{Z}/(p^e)$ occurs as often in A as $\mathbb{Z}/(p^{e-1})$ does in pA. It
only remains to deal with the factors Z/(p). But the number of these is now
determined by the order ‖A‖ of the group. ◀
Remark: It is important to note that if we think of A as a direct sum of cyclic
subgroups, then the orders of these subgroups are uniquely determined, by
the theorem; but the actual subgroups themselves are not in general uniquely
determined. In fact the only case in which they are uniquely determined (for
a finite p-group A) is if A is itself cyclic,
A = Z/(pe ),
A = Z/(pe ) ⊕ Z/(pf ).
Remarks:
1. Concretely, we construct V from A as follows. Each element v ∈ V is
of the form
v = λa (λ ∈ Q, a ∈ A).
Two elements
$$v = \lambda a, \qquad w = \mu b$$
are equal if we can find m, n, N such that
$$\lambda = \frac{m}{N}, \qquad \mu = \frac{n}{N}, \qquad ma = nb.$$
In other words, a linear relation
λ1 v 1 + · · · + λr v r = 0
holds in V if when multiplied by some integer N with N λ1 , . . . , N λr ∈ Z
it yields a relation that holds in A.
2. We can put this in a more general setting. Recall that a module M
over a ring R (not necessarily commutative, but with identity element
1) is defined by giving an abelian group M on which R acts so that
(a) λ(µm) = (λµ)m;
(b) (λ + µ)m = λm + µm;
(c) λ(m + n) = λm + λn;
(d) 1m = m.
There are 2 special cases of importance. Firstly, a module over a field
k is just a vector space over k. Thus the concept of a module may be
seen as a natural generalisation of that of a vector space, in which the
scalars are allowed to form a ring.
Secondly, a module over the integers Z is just an abelian group.
Suppose
φ:R→S
is a ring-homomorphism. Then each R-module M gives rise to an S-
module N , where
N = S ⊗R M.
Concretely, each element n ∈ N is expressible as a sum
$$n = s_1 m_1 + \cdots + s_r m_r,$$
with addition and scalar multiplication being defined in the natural
way. We have a natural map
$$M \to N : m \mapsto 1 \cdot m.$$
3. In the language of categories and functors, we have a covariant functor
F :A→V
Definition A.5 The rank r(A) of the abelian group A is defined to be the
dimension of V :
r(A) = dimQ V.
r(A) ≤ n.
$$A \to V : a \mapsto 1 \cdot a$$
$$A = r\mathbb{Z} = \mathbb{Z} \oplus \cdots \oplus \mathbb{Z}.$$
We derive a Z-basis b1 , . . . , br for A as follows. Choose b1 to be the
smallest positive multiple of a1 in A:
$$b_1 = \lambda_1 a_1 \in A.$$
$$b_2 = \mu_1 a_1 + \lambda_2 a_2 \in A.$$
$$b_i = \mu_1 a_1 + \cdots + \mu_{i-1} a_{i-1} + \lambda_i a_i \in A.$$
$$a = \rho_{r,1} a_1 + \cdots + \rho_{r,r} a_r,$$
Continuing in this fashion, we find finally that
$$a = n_r b_r + n_{r-1} b_{r-1} + \cdots + n_1 b_1,$$
A = Ze ⊕ Zf,
A = F ⊕ P.
Q = A/F
is torsion-free.
Proof of Lemma B For suppose ā ∈ Q (where a ∈ A) has finite order, say
nā = 0, for some n > 0. In other words, na ∈ F . But then m(na) = 0 for
some m > 0. Thus a is of finite order, ie a ∈ F , and so ā = 0. C
It follows from Proposition ?? that Q is a direct sum of copies of Z:
Q = Z ⊕ · · · ⊕ Z.
1. B ∩ C = {0};
$$a = n_1 a_1 + \cdots + n_r a_r$$
$$n n_1 a_1 + \cdots + n n_r a_r = 0.$$
$$n n_1 e_1 + \cdots + n n_r e_r = 0.$$
$$\bar a = m_1 e_1 + \cdots + m_r e_r,$$
$$a - m_1 a_1 - \cdots - m_r a_r = f \in F.$$
Thus
a = f + m1 a1 + · · · + mr ar ∈ F + P.
It follows that
A = F ⊕ P.
J
Remark: If we think of the Theorem as expressing A as a direct sum of cyclic
subgroups, then in general these subgroups will not be unique, although their
orders (pe or ∞) will be.
The only case in which the expression will be unique is if A is cyclic. For
if that is so then either A = Z or else A is a finite cyclic group Z/(n). In
this last case each p-component Ap is also cyclic, since every subgroup of a
cyclic abelian group is cyclic. Thus the expression for A as a direct sum in
the Theorem is just the splitting of A into its p-components Ap ; and we know
that this is unique.
Conversely, if A is not cyclic, then either
In each of these cases we have seen above that the splitting is not unique.
Appendix B
It follows that x and y cannot both be odd; for then we would have $z^2 \equiv 2 \bmod 4$,
which is impossible. Thus just one of x and y is even; and so z
must be odd. We can assume without loss of generality that x is even, say
x = 2X. Our equation can then be written
$$4X^2 = z^2 - y^2 = (z + y)(z - y).$$
We know that 2 | z + y, 2 | z − y, since y, z are both odd. On the other hand
no other factor can divide z + y and z − y:
gcd(z + y, z − y) = 2.
For
d | z + y, z − y =⇒ d | 2y, 2z.
It follows that
$$z + y = 2u^2, \qquad z - y = 2v^2, \qquad x = 2uv.$$
Thus
$$(x, y, z) = (2uv,\ u^2 - v^2,\ u^2 + v^2),$$
where gcd(u, v) = 1. Note that just one of u, v must be odd; for if both were
odd, x, y, z would all be even.
Every Pythagorean triple arises in this way from a unique pair (u, v) with
gcd(u, v) = 1, u > v > 0, and just one of u, v odd. The uniqueness follows
from the fact that
$$(u + v)^2 = z + x, \qquad (u - v)^2 = z - x.$$
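The parametrisation gives a direct way to list primitive Pythagorean triples. A minimal sketch follows; the function name `primitive_triples` and the cut-off parameter `z_max` are our own.

```python
from math import gcd, isqrt

def primitive_triples(z_max):
    """Triples (2uv, u^2 - v^2, u^2 + v^2) with hypotenuse at most z_max."""
    triples = []
    for u in range(2, isqrt(z_max) + 1):
        for v in range(1, u):
            # gcd(u, v) = 1 and just one of u, v odd
            if gcd(u, v) == 1 and (u - v) % 2 == 1:
                x, y, z = 2 * u * v, u * u - v * v, u * u + v * v
                if z <= z_max:
                    triples.append((x, y, z))
    return triples

for x, y, z in primitive_triples(30):
    print((x, y, z), x * x + y * y == z * z)
```

The first component is always the even one, matching the convention x = 2uv in the text.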
$$x^4 + y^4 = z^2.$$
If we can show that this has no solution in non-zero integers, then the same
will be true a fortiori of Fermat’s equation with n = 4.
Suppose (x, y, z) is a solution of this equation. As before we may and
shall suppose that x, y, z > 0 and gcd(x, y, z) = 1. Evidently $(x^2, y^2, z)$ is
then a Pythagorean triple, and so can be expressed in the form (swapping
x, y if necessary)
$$x^2 = 2ab, \qquad y^2 = a^2 - b^2, \qquad z = a^2 + b^2,$$
where a, b are positive integers with gcd(a, b) = 1. Since x is even, $4 \mid x^2$,
and therefore just one of a and b must be even.
If a were even and b were odd, then $a^2 - b^2 \equiv 3 \bmod 4$, so the second
equation $y^2 = a^2 - b^2$ would be untenable. Thus b is even, and so from the
first equation $x^2 = 2ab$ we can write
$$a = u^2, \qquad b = 2v^2, \qquad x = 2uv,$$
where gcd(u, v) = 1, and u, v > 0.
The second equation now reads
$$y^2 = u^4 - 4v^4.$$
Thus
$$4v^4 + y^2 = u^4,$$
and so $(2v^2, y, u^2)$ is a Pythagorean triple. It follows that we can write
$$2v^2 = 2st, \qquad y = s^2 - t^2, \qquad u^2 = s^2 + t^2,$$
where gcd(s, t) = 1. From the first equation we can write
$$s = X^2, \qquad t = Y^2, \qquad v = XY,$$
where gcd(X, Y) = 1, and X, Y > 0; and so on writing Z for u the third
equation reads
$$X^4 + Y^4 = Z^2,$$
which is just the equation we started from. So from any solution (x, y, z) of
the equation
$$x^4 + y^4 = z^2$$
with gcd(x, y, z) = 1, x, y > 0 and x even, we obtain a second solution
(X, Y, Z) with gcd(X, Y, Z) = 1, X, Y > 0 and X even, where
$$x = 2uv = 2XYZ,$$
$$y = s^2 - t^2 = X^4 - Y^4,$$
$$z = a^2 + b^2 = u^4 + 4v^4 = Z^4 + 4X^4 Y^4.$$
The new solution is evidently smaller than the first in every sense. In
particular,
$$Z < z^{1/4};$$
so our infinite chain must (rapidly) lead to a contradiction, and Fermat’s
Last Theorem is proved for n = 4.
Appendix C
$$x^p + y^p + z^p = 0$$
$$f(x) = x^n + a_1 x^{n-1} + \cdots + a_n = 0$$
Proof ▶ If α satisfies the equation f(x) = 0 then −α satisfies f(−x) = 0,
while 1/α satisfies $x^n f(1/x) = 0$ (where n is the degree of f(x)). It follows
that −α and 1/α are both algebraic. Thus it is sufficient to show that if α, β
are algebraic then so are α + β, αβ.
Suppose α satisfies the equation
$$f(x) \equiv x^m + a_1 x^{m-1} + \cdots + a_m = 0,$$
$$g(x) \equiv x^n + b_1 x^{n-1} + \cdots + b_n = 0.$$
$$\alpha + \beta,\ \alpha\beta \in V.$$
$$1, \theta, \theta^2, \ldots, \theta^{mn}$$
are necessarily linearly dependent (over Q), since dim V ≤ mn. In other
words θ satisfies a polynomial equation of degree ≤ mn. Thus each element
θ ∈ V is algebraic. In particular α + β and αβ are algebraic. J
$$f(x) = x^n + a_1 x^{n-1} + \cdots + a_n = 0$$
Proposition C.2 The algebraic integers form a ring $\bar{\mathbb{Z}} \subset \bar{\mathbb{Q}}$. That is, if
α, β are algebraic integers, then so are α + β, α − β and αβ.
Suppose α satisfies the equation
α + β, αβ ∈ V.
$$S_1 \subset S_2 \subset S_3 \subset \cdots \subset M$$
$$\theta^N \in \langle 1, \theta, \ldots, \theta^{N-1} \rangle.$$
$$\theta^N = a_1 \theta^{N-1} + a_2 \theta^{N-2} + \cdots.$$
Proof ▶ Suppose c = m/n, where gcd(m, n) = 1; and suppose c satisfies the
equation
$$x^d + a_1 x^{d-1} + \cdots + a_d = 0 \qquad (a_i \in \mathbb{Z}).$$
Then
$$m^d + a_1 m^{d-1} n + \cdots + a_d n^d = 0.$$
Since n divides every term after the first, it follows that $n \mid m^d$. But that is
incompatible with gcd(m, n) = 1, unless n = 1, ie c ∈ Z. ◀
C.3.1 Automorphisms and norms
The conjugacy automorphism
$$z \mapsto \bar z : \mathbb{C} \to \mathbb{C}$$
$$\omega \mapsto \bar\omega = \omega^2,$$
$$a + \omega b \mapsto a + \omega^2 b = (a - b) - b\omega.$$
a + ωb (a, b ∈ Z)
since the algebraic integers are closed under addition and multiplication.
Proposition C.5 The algebraic integers in Q(ω) are just the elements of
Z[ω].
Proof I Suppose
ξ = a + ωb (a, b ∈ Q)
is an algebraic integer. Then so is its conjugate
$$\bar\xi = a + \omega^2 b = (a - b) - \omega b.$$
Hence
ξ + ξ¯ = 2a − b
is an algebraic integer. Since this number is rational, it follows that
2a − b ∈ Z.
Similarly
ωξ = −b + ω(a − b)
is an algebraic integer, and so by the previous argument
−2b − (a − b) = −a − b ∈ Z.
We deduce that
3a, 3b ∈ Z;
say
$$a = \frac{r}{3}, \qquad b = \frac{s}{3},$$
where r, s ∈ Z.
But we also know that
$$N(\xi) = \xi\bar\xi = a^2 - ab + b^2$$
$$r^2 - rs + s^2 \equiv 0 \bmod 9.$$
$$\pm 1,\ \pm\omega,\ \pm\omega^2.$$
Proof ▶ Suppose ε is a unit. Then
$$N(\epsilon) N(\epsilon^{-1}) = 1.$$
It follows that
$$N(\epsilon) = 1.$$
Conversely, if N(ε) = 1 then ε is a unit, since
$$N(\epsilon) = a^2 - ab + b^2 = 1.$$
$$\pi = \alpha\beta \qquad (\alpha, \beta \in \mathbb{Z}[\omega])$$
either α or β is a unit.
If π is a prime then so is επ for any unit ε. Two primes that differ only
by a unit factor are said to be equivalent, and we write
$$\pi \equiv \pi' = \epsilon\pi.$$
The Euclidean Algorithm This is a procedure for determining the greatest
common divisor gcd(a, b) = d of a, b ∈ Z. We start by dividing a
by b:
$$a = q_1 b + r_1,$$
where $|r_1| < |b|$. Now we divide b by the remainder $r_1$:
$$b = q_2 r_1 + r_2,$$
where $|r_2| < |r_1|$. We continue in this way, successively dividing remainders:
$$r_1 = q_3 r_2 + r_3,$$
$$r_2 = q_4 r_3 + r_4,$$
...
$$r_{n-1} = q_{n+1} r_n.$$
$$d = \gcd(a, b) = r_n.$$
For $d \mid r_{n-1}$, from the last line of the algorithm. Hence $d \mid r_{n-2}$ from
the previous line; and so, working up the algorithm,
$$d \mid r_{n-3}, r_{n-4}, \ldots, r_1, b, a.$$
$$e \mid a, b, r_1, r_2, \ldots, r_n.$$
Thus
$$e \mid a, b \implies e \mid d.$$
au + bv = d The Euclidean Algorithm has one important consequence that
is not immediately obvious. Let us say that e is expressed linearly in
terms of c, d if we have an expression
e = cx + dy
with x, y ∈ Z.
The last line but one of the algorithm expresses $d = r_n$ linearly in terms
of $r_{n-1}$ and $r_{n-2}$, say
$$d = r_{n-1} x_1 + r_{n-2} y_1.$$
The previous line expresses $r_{n-1}$ in terms of $r_{n-2}$ and $r_{n-3}$, allowing us
to express d linearly in terms of $r_{n-2}$ and $r_{n-3}$, say
$$d = r_{n-2} x_2 + r_{n-3} y_2.$$
$$d = r_{n-3} x_3 + r_{n-4} y_3$$
...
$$d = r_2 x_{n-2} + r_1 y_{n-2}$$
$$d = r_1 x_{n-1} + b y_{n-1}$$
and finally
$$d = b x_n + a y_n.$$
Thus d is expressed linearly in terms of a, b:
d = au + bv
for some u, v ∈ Z.
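In practice the coefficients u, v are usually accumulated on the way down rather than by back-substitution. The following sketch (the function name `extended_euclid` is ours) keeps, for each remainder, an expression of it as a linear combination of a and b; it uses Python's floor division, so for positive inputs the remainders are the usual non-negative ones.

```python
def extended_euclid(a, b):
    """Return (d, u, v) with d = gcd(a, b) = a*u + b*v."""
    # Invariants: r0 = a*u0 + b*v0 and r1 = a*u1 + b*v1.
    r0, r1 = a, b
    u0, v0, u1, v1 = 1, 0, 0, 1
    while r1 != 0:
        q = r0 // r1
        r0, r1 = r1, r0 - q * r1
        u0, u1 = u1, u0 - q * u1
        v0, v1 = v1, v0 - q * v1
    return r0, u0, v0

d, u, v = extended_euclid(240, 46)
print(d, u, v, 240 * u + 46 * v)
```

The back-substitution described in the text and this forward version produce a valid pair (u, v); such pairs are not unique.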
The Lemma Suppose p is a prime number. Then
p | ab =⇒ p | a or p | b.
pu + av = 1.
px + by = 1.
Multiplying these relations together
Now if p | ab then p divides all the terms on the right, and we deduce
that p | 1, which is absurd.
By repeated use of the lemma above, the first factor $p_1$ on the left
must occur on the right. Dividing both sides by $p_1$, we can apply the
inductive hypothesis to show that the factors, with one $p_1$ removed,
are the same up to order. Hence they are the same with the $p_1$ restored
to both sides.
Now we see that the entire argument rests upon Division with Remainder.
Wherever this exists we will have unique factorisation.
One place where this holds is the ring k[x] of polynomials over a field k,
since we can divide one polynomial by another,
leaving a remainder r(x) of lower degree than g(x). It follows by our argument
that there is unique factorisation into prime (or irreducible) polynomials in
k[x]. Note that the degree in this case plays the rôle of the absolute value
|n| in the case of Z above. The essential point is that it must be a positive
integer, to ensure that our reduction process ends.
Proof ▶ We can certainly divide α by β in Q(ω), say
$$\frac{\alpha}{\beta} = r + \omega s \qquad (r, s \in \mathbb{Q}).$$
Now let us choose m, n to be the nearest integers to r, s, so that
$$|r - m| \le \frac{1}{2}, \qquad |s - n| \le \frac{1}{2}.$$
Set
γ = m + ωn ∈ Z[ω];
and let
θ = (r − m) + ω(s − n) ∈ Q(ω).
Then
$$x^3 + y^3 + z^3 = 0.$$
Suppose that x ≡ 1 mod 3, say x = 1 + 3a. Then
$$x^3 = (1 + 3a)^3 = 1 + 3^2 a + 3^3 a^2 + 3^3 a^3 \equiv 1 \bmod 3^2.$$
Similarly
$$x \equiv -1 \bmod 3 \implies x^3 \equiv -1 \bmod 3^2.$$
It follows that one (and just one) of x, y, z must be divisible by 3, since
otherwise we would have an impossible congruence
$$\pm 1 \pm 1 \pm 1 \equiv 0 \bmod 3^2.$$
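Both facts used here are finite checks modulo 9, so they can be confirmed by direct enumeration. A small illustrative snippet (the variable names are ours):

```python
from itertools import product

# Cubes of the residues prime to 3, taken mod 9.
unit_cubes = {pow(x, 3, 9) for x in range(9) if x % 3 != 0}

# All eight sums (+-1) + (+-1) + (+-1), taken mod 9.
sign_sums = {sum(signs) % 9 for signs in product((1, -1), repeat=3)}

print(unit_cubes)        # only 1 and 8, ie +1 and -1 mod 9
print(0 in sign_sums)    # no choice of signs gives 0 mod 9
```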
Our aim is to extend this idea to solutions in Z[ω], with the prime Π
playing the rôle of 3 (recalling that $\Pi^2 \equiv 3$).
We note in the first place that there are just 3 residue classes in Z[ω]
modulo Π, represented by 0, 1, and −1. (For the number of residues
modulo α is N(α), and N(Π) = 3.)
$$x^3 \equiv 1 \bmod \Pi^4.$$
Proof ▶ Suppose
$$x = 1 + \Pi\alpha.$$
Then
$$x^3 = (1 + \Pi\alpha)^3 = 1 + 3\Pi\alpha + 3\Pi^2\alpha^2 + \Pi^3\alpha^3 \equiv 1 - \omega^2\Pi^3\alpha + \Pi^3\alpha^3 \bmod \Pi^4,$$
Corollary If x ≡ −1 mod Π then
$$x^3 \equiv -1 \bmod \Pi^4.$$
$$x^3 + y^3 + z^3 = 0,$$
where we are now looking for solutions in Z[ω] (although this will, of course,
include solutions in Z). We assume as usual that gcd(x, y, z) = 1.
One of x, y, z must be divisible by Π. For otherwise, by the Lemma and
Corollary above, we will have an impossible congruence
$$\pm 1 \pm 1 \pm 1 \equiv 0 \bmod \Pi^4.$$
Thus
$$\Pi^2 \mid y + z \implies \Pi \parallel y + \omega z,$$
where $\pi^e \parallel \alpha$ means that $\pi^e \mid \alpha$ but $\pi^{e+1} \nmid \alpha$. Similarly
$$\Pi^2 \mid y + z \implies \Pi \parallel y + \omega^2 z.$$
It follows that
$$\Pi^4 \mid y + z, \qquad \Pi \parallel y + \omega z, \qquad \Pi \parallel y + \omega^2 z.$$
Thus it follows from unique factorisation that
$$y + z \equiv \Pi^4 X^3, \qquad y + \omega z \equiv \Pi Y^3, \qquad y + \omega^2 z \equiv \Pi Z^3,$$
where gcd(ΠX, Y, Z) = 1. But
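$$(y + z) + \omega(y + \omega z) + \omega^2 (y + \omega^2 z) = 0.$$
$$\epsilon_1 \Pi^3 X^3 + \epsilon_2 Y^3 + \epsilon_3 Z^3 = 0,$$
$$\pm 1 \pm \epsilon_3 \equiv 0 \bmod \Pi^3.$$
$$\epsilon \Pi^3 X^3 + Y^3 + Z^3 = 0.$$
$$\epsilon \Pi^3 x^3 + y^3 + z^3 = 0$$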
Proof ▶ Since Π ∤ y, z,
$$y^3, z^3 \equiv \pm 1 \bmod \Pi^4.$$
Thus
$$\epsilon \Pi^3 x^3 \pm 1 \pm 1 \equiv 0 \bmod \Pi^4.$$
The only way this congruence can be satisfied is if Π | x, say x = Πx′. Then
$$\epsilon \Pi^6 x'^3 = -(y^3 + z^3) = -(y + z)(y + \omega z)(y + \omega^2 z).$$
Our earlier argument still holds; the introduction of the unit ε makes no
difference. After replacing z by ωz or $\omega^2 z$, if necessary, we have
$$y + z \equiv \Pi^4 X^3, \qquad y + \omega z \equiv \Pi Y^3, \qquad y + \omega^2 z \equiv \Pi Z^3,$$
$$\epsilon_1 \Pi^3 X^3 + \epsilon_2 Y^3 + \epsilon_3 Z^3 = 0,$$
where $\epsilon_1, \epsilon_2, \epsilon_3$ are units. Dividing by $\epsilon_2$ we have
$$\epsilon \Pi^3 X^3 + Y^3 + \epsilon' Z^3 = 0.$$
x = ΠXY Z.
Thus
N (x) = 3N(X)N (Y )N (Z),
and so
max(N (x), N (y), N (z)) > max(N (X), N (Y ), N (Z)).
J
Appendix H
H.2 Elliptic curve factorisation
Let n, as before, be a large composite integer that we wish to factorise.
Suppose p is a prime factor of n. Let
$$E(\mathbb{Q}) : y^2 = x^3 + bx + c \qquad (b, c \in \mathbb{Z})$$
be an elliptic curve over Q. Unless we are very unlucky (or very lucky) p will
be a good prime for E, ie the curve
$$E(\mathbb{F}_p) : y^2 = x^3 + bx + c$$
over the finite field $\mathbb{F}_p$ is still elliptic. (We say lucky because p is a bad prime
if and only if
$$p \mid \Delta = -(4b^3 + 27c^2).$$
Thus if p is a bad prime,
d = gcd(∆, n) > 1;
so if we wished we could compute this gcd at the outset. However, the prob-
ability of p being bad is so small that this is probably not worth considering.)
Suppose the curve $E(\mathbb{F}_p)$ contains N points. By Hasse’s Theorem,
$$p + 1 - 2\sqrt{p} < N < p + 1 + 2\sqrt{p}.$$
Then
N | k.
Suppose P ∈ E(Q). We express P in homogeneous coordinates:
P = [X, Y, Z],
where X, Y, Z ∈ Z.
It is a straightforward matter to find a formula for the sum of two points:
In effect, we simply have to dress up our usual computation
$$x_1 + x_2 + x_3 = m^2, \qquad y_3 = m x_3 + c$$
in homogeneous form.
As a special case, this gives a formula for the double of a point:
2[X, Y, Z] = [X1 , Y1 , Z1 ],
rP = [Xr , Yr , Zr ]
for any r ∈ N.
Now let
$$P_p = [X \bmod p,\ Y \bmod p,\ Z \bmod p]$$
be the point of $E(\mathbb{F}_p)$ corresponding to P ∈ E(Q). By Lagrange’s Theorem,
$$N P_p = 0,$$
and therefore
$$k P_p = 0.$$
$$kP = [X_k, Y_k, Z_k]$$
$$Z_k \equiv 0 \bmod p.$$
(We also have $X_k \equiv 0 \bmod p$. However, this follows from the result for $Z_k$
since the only point of $E(\mathbb{F}_p)$ on the line at infinity Z = 0 is O = [0, 1, 0].)
It follows that
$$d = \gcd(Z_k, n) > 1;$$
and unless we are very unlucky this will give us a proper factor of n.
Note that in constructing $Z_k$ for this purpose we can work throughout
mod n.
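The method can be sketched in a few lines. The version below is a simplified illustration and not the homogeneous-coordinate computation described above: it works with affine coordinates mod n, where the analogue of $Z_k \equiv 0 \bmod p$ is a slope denominator that fails to be invertible mod n, and the gcd of that denominator with n is the factor. All names (`FactorFound`, `ecm_try`, `ecm`), the choice of point (0, 1) and the bounds are our own assumptions; the multiplier k is taken to be `bound` factorial, applied one factor at a time. It needs Python 3.8 or later for `pow(x, -1, n)`.

```python
from math import gcd

class FactorFound(Exception):
    """Raised when a denominator shares a factor with n."""
    def __init__(self, d):
        self.d = d

def inv_mod(x, n):
    d = gcd(x % n, n)
    if d != 1:
        raise FactorFound(d)
    return pow(x, -1, n)

def add(P, Q, b, n):
    """Affine addition on y^2 = x^3 + b x + c mod n; None stands for O."""
    if P is None:
        return Q
    if Q is None:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % n == 0:
        return None
    if P == Q:
        m = (3 * x1 * x1 + b) * inv_mod(2 * y1, n) % n
    else:
        m = (y2 - y1) * inv_mod(x2 - x1, n) % n
    x3 = (m * m - x1 - x2) % n
    return (x3, (m * (x1 - x3) - y1) % n)

def multiply(k, P, b, n):
    """kP by repeated doubling."""
    R = None
    while k:
        if k & 1:
            R = add(R, P, b, n)
        P = add(P, P, b, n)
        k >>= 1
    return R

def ecm_try(n, x, y, b, bound):
    """Try one curve through (x, y); c = y^2 - x^3 - b x is implicit."""
    P = (x % n, y % n)
    try:
        for r in range(2, bound + 1):
            P = multiply(r, P, b, n)
            if P is None:
                return None
    except FactorFound as hit:
        return hit.d if hit.d < n else None
    return None

def ecm(n, curves=50, bound=100):
    """Run through the curves y^2 = x^3 + b x + 1, each with the point (0, 1)."""
    for b in range(1, curves + 1):
        d = ecm_try(n, 0, 1, b, bound)
        if d:
            return d
    return None

print(ecm(455839))   # 455839 = 599 * 761
```

Fixing the point first and letting it determine c is the same shortcut mentioned at the end of this section. A serious implementation would use projective or Montgomery arithmetic and a multiplier built from prime powers up to a smoothness bound.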
This method has one very large advantage over Pollard’s p − 1 method;
by changing the coefficients b, c in the elliptic curve we change N, which
probably ranges at random over the interval $(p + 1 - 2\sqrt{p},\ p + 1 + 2\sqrt{p})$. This
allows us many chances of finding a ‘smooth’ N, while Pollard’s method only
gives us the one chance p − 1.
Analysis shows that if we have some idea of the size of p then it pays to
choose b of order $\sqrt{p}$, and move on to another elliptic curve if this fails.
Incidentally, it is easier to choose the point P = [X, Y, Z] first, and then
find b, c so that the elliptic curve contains this point, rather than choosing
the curve and then looking for a rational point on it.