Concise Encyclopedia of Coding Theory
Edited by
W. Cary Huffman
Loyola University Chicago, USA
Jon-Lark Kim
Sogang University, Republic of Korea
Patrick Solé
Aix-Marseille University, Marseilles, France
First edition published 2021
by CRC Press
6000 Broken Sound Parkway NW, Suite 300, Boca Raton, FL 33487-2742
Paintings, front, and back cover images used with permission from Gayle Imamura-Huffman.
Reasonable efforts have been made to publish reliable data and information, but the author and publisher cannot
assume responsibility for the validity of all materials or the consequences of their use. The authors and publishers
have attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright
holders if permission to publish in this form has not been obtained. If any copyright material has not been acknowl-
edged please write and let us know so we may rectify in any future reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or
utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including pho-
tocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission
from the publishers.
For permission to photocopy or use material electronically from this work, access www.copyright.com or contact the
Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. For works that are
not available on CCC please contact mpkbookspermissions@tandf.co.uk
Trademark notice: Product or corporate names may be trademarks or registered trademarks, and are used only for
identification and explanation without intent to infringe.
Preface xxiii
Contributors xxix
I Coding Fundamentals 1
1 Basics of Coding Theory 3
W. Cary Huffman, Jon-Lark Kim, and Patrick Solé
1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2 Finite Fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.3 Codes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.4 Generator and Parity Check Matrices . . . . . . . . . . . . . . . . . . . . 8
1.5 Orthogonality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.6 Distance and Weight . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.7 Puncturing, Extending, and Shortening Codes . . . . . . . . . . . . . . . 12
1.8 Equivalence and Automorphisms . . . . . . . . . . . . . . . . . . . . . . . 13
1.9 Bounds on Codes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
1.9.1 The Sphere Packing Bound . . . . . . . . . . . . . . . . . . . . . . 16
1.9.2 The Singleton Bound . . . . . . . . . . . . . . . . . . . . . . . . . 17
1.9.3 The Plotkin Bound . . . . . . . . . . . . . . . . . . . . . . . . . . 17
1.9.4 The Griesmer Bound . . . . . . . . . . . . . . . . . . . . . . . . . 18
1.9.5 The Linear Programming Bound . . . . . . . . . . . . . . . . . . 18
1.9.6 The Gilbert Bound . . . . . . . . . . . . . . . . . . . . . . . . . . 19
1.9.7 The Varshamov Bound . . . . . . . . . . . . . . . . . . . . . . . . 20
1.9.8 Asymptotic Bounds . . . . . . . . . . . . . . . . . . . . . . . . . . 20
1.10 Hamming Codes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
1.11 Reed–Muller Codes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
1.12 Cyclic Codes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
1.13 Golay Codes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
1.14 BCH and Reed–Solomon Codes . . . . . . . . . . . . . . . . . . . . . . . 30
1.15 Weight Distributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
1.16 Encoding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
1.17 Decoding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
1.18 Shannon’s Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
4 Self-Dual Codes 79
Stefka Bouyuklieva
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
4.2 Weight Enumerators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
4.3 Bounds for the Minimum Distance . . . . . . . . . . . . . . . . . . . . . . 84
4.4 Construction Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
4.4.1 Gluing Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
4.4.2 Circulant Constructions . . . . . . . . . . . . . . . . . . . . . . . 89
4.4.3 Subtraction Procedure . . . . . . . . . . . . . . . . . . . . . . . . 90
4.4.4 Recursive Constructions . . . . . . . . . . . . . . . . . . . . . . . 90
4.4.5 Constructions of Codes with Prescribed Automorphisms . . . . . 92
4.5 Enumeration and Classification . . . . . . . . . . . . . . . . . . . . . . . . 93
9.5 Additive Codes Which Are Cyclic in the Monomial Sense . . . . . . . . . 193
9.5.1 The Linear Case m = 1 . . . . . . . . . . . . . . . . . . . . . . . . 194
9.5.2 The General Case m ≥ 1 . . . . . . . . . . . . . . . . . . . . . . . 195
20.4.5 Codes with Few Weights from Semi-Bent Boolean Functions . . . 482
20.4.6 Linear Codes from Quadratic Boolean Functions . . . . . . . . . . 483
20.4.7 Binary Codes CDf with Three Weights . . . . . . . . . . . . . . . 484
20.4.8 Binary Codes CDf with Four Weights . . . . . . . . . . . . . . . . 484
20.4.9 Binary Codes CDf with at Most Five Weights . . . . . . . . . . . 485
20.4.10 A Class of Two-Weight Binary Codes from the Preimage of a Type of Boolean Function . . . 486
20.4.11 Binary Codes from Boolean Functions Whose Supports are Relative Difference Sets . . . 487
20.4.12 Binary Codes with Few Weights from Plateaued Boolean Functions 487
20.4.13 Binary Codes with Few Weights from Almost Bent Functions . . 488
20.4.14 Binary Codes CD(G) from Functions on F2m of the Form G(x) = F (x) + F (x + 1) + 1 . . . 489
20.4.15 Binary Codes from the Images of Certain Functions on F2m . . . 489
20.5 Constructions of Cyclic Codes from Functions: The Sequence Approach . 490
20.5.1 A Generic Construction of Cyclic Codes with Polynomials . . . . 490
20.5.2 Binary Cyclic Codes from APN Functions . . . . . . . . . . . . . 492
20.5.3 Non-Binary Cyclic Codes from Monomials and Trinomials . . . . 495
20.5.4 Cyclic Codes from Dickson Polynomials . . . . . . . . . . . . . . . 498
20.6 Codes with Few Weights from p-Ary Functions with p Odd . . . . . . . . 503
20.6.1 Codes with Few Weights from p-Ary Weakly Regular Bent Functions Based on the First Generic Construction . . . 503
20.6.2 Linear Codes with Few Weights from Cyclotomic Classes and Weakly Regular Bent Functions . . . 504
20.6.3 Codes with Few Weights from p-Ary Weakly Regular Bent Functions Based on the Second Generic Construction . . . 507
20.6.4 Codes with Few Weights from p-Ary Weakly Regular Plateaued Functions Based on the First Generic Construction . . . 508
20.6.5 Codes with Few Weights from p-Ary Weakly Regular Plateaued Functions Based on the Second Generic Construction . . . 510
20.7 Optimal Linear Locally Recoverable Codes from p-Ary Functions . . . . . 521
20.7.1 Constructions of r-Good Polynomials for Optimal LRC Codes . . 523
20.7.1.1 Good Polynomials from Power Functions . . . . . . . . . 523
20.7.1.2 Good Polynomials from Linearized Functions . . . . . . 523
20.7.1.3 Good Polynomials from Function Composition . . . . . 523
20.7.1.4 Good Polynomials from Dickson Polynomials of the First Kind . . . 524
20.7.1.5 Good Polynomials from the Composition of Functions Involving Dickson Polynomials . . . 525
Bibliography 823
Index 941
Preface
codes, reversible cyclic codes, BCH codes, duadic codes, punctured generalized Reed–Muller
codes, and a new generalization of the punctured binary Reed–Muller codes.
Shannon’s proof of the existence of good codes is non-constructive and therefore of little
use for applications, where one needs one or all codes with specific (small) parameters.
Techniques for constructing and classifying codes are considered in Chapter 3, written by
Patric R. J. Östergård, with an emphasis on computational methods. Some classes of codes
are discussed in more detail: perfect codes, MDS codes, and general binary codes.
Self-dual codes form one of the important classes of linear codes because of their rich
algebraic structure and their close connections with other combinatorial configurations like
block designs, lattices, graphs, etc. Topics covered in Chapter 4, by Stefka Bouyuklieva,
include construction methods, results on enumeration and classification, and bounds for
the minimum distance of self-dual codes over fields of size 2, 3, and 4.
Combinatorial designs often arise in codes that are optimal with respect to certain
bounds and are used in some decoding algorithms. Chapter 5, written by Vladimir D.
Tonchev, summarizes links between combinatorial designs and perfect codes, optimal codes
meeting the restricted Johnson Bound, and linear codes admitting majority logic decoding.
Chapters 1–5 explore codes over fields; Chapter 6, by Steven T. Dougherty, introduces
codes over rings. The chapter begins with a discussion of quaternary codes over the integers
modulo 4 and their Gray map, which popularized the study of codes over more general
rings. It then discusses codes over rings in a very broad sense describing families of rings,
the Chinese Remainder Theorem applied to codes, generating matrices, and bounds. It also
gives a description of the MacWilliams Identities for codes over rings.
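The Gray map mentioned above is concrete enough to illustrate. The following is a minimal sketch (ours, not taken from Chapter 6) of the standard Gray map from the integers modulo 4 to binary pairs, applied coordinatewise to a quaternary codeword.

```python
# Standard Gray map on Z4: 0 -> 00, 1 -> 01, 2 -> 11, 3 -> 10.
# Applied coordinatewise, it turns a quaternary code of length n into a
# (generally nonlinear) binary code of length 2n.
GRAY = {0: (0, 0), 1: (0, 1), 2: (1, 1), 3: (1, 0)}

def gray_map(codeword):
    """Map a tuple over Z4 to a binary tuple of twice the length."""
    return tuple(bit for symbol in codeword for bit in GRAY[symbol])

print(gray_map((0, 1, 2, 3)))  # -> (0, 0, 0, 1, 1, 1, 1, 0)
```

Note that images of adjacent elements of Z4 differ in a single bit, which is what makes the map compatible with distance arguments for codes over Z4.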
Quasi-cyclic codes form an important class of algebraic codes that includes cyclic codes
as a special subclass. Chapter 7, coauthored by Cem Güneri, San Ling, and Buket Özkaya,
focuses on the algebraic structure of quasi-cyclic codes. Based on these structural properties,
some asymptotic results, minimum distance bounds, and further applications, such as the
trace representation and characterization of certain subfamilies of quasi-cyclic codes, are
elaborated upon.
Cyclic and quasi-cyclic codes are studied as ideals in ordinary polynomial rings. Chap-
ter 8, by Heide Gluesing-Luerssen, is a survey devoted to skew-polynomial rings and skew-
cyclic block codes. After discussing some relevant algebraic properties of skew polynomials,
the basic notions of skew-cyclic codes, such as generator polynomial, parity check polyno-
mial, and duality are investigated. The basic tool is a skew-circulant matrix. The chapter
concludes with results on constructions of skew-BCH codes.
The coauthors Jürgen Bierbrauer, Stefano Marcugini, and Fernanda Pambianco of Chap-
ter 9 develop the theory of cyclic additive codes, both in the permutation sense and in the
monomial sense when the code length is coprime to the characteristic of the underlying field.
This generalizes the classical theory of cyclic and constacyclic codes, respectively, from the
category of linear codes to the category of additive codes. The cyclic quantum codes corre-
spond to a special case when the codes are self-orthogonal with respect to the symplectic
bilinear form.
Up to this point the codes considered are block codes where codewords all have fixed
length. That is no longer the case in Chapter 10, coauthored by Julia Lieb, Raquel Pinto,
and Joachim Rosenthal. This chapter provides a survey of convolutional codes stressing the
connections to module theory and systems theory. Constructions of codes with maximal
possible distance and distance profile are provided.
Chapter 11, written by Elisa Gorla, provides a mathematical introduction to rank-
metric codes, beginning with the definition of the rank metric and the corresponding codes,
whose elements can be either vectors or matrices. This is followed by the definition of code
equivalence and the notion of support for a codeword and for a code. This chapter treats
some of the basic concepts in the mathematical theory of rank-metric codes: duality, weight
enumerators and the MacWilliams Identities, higher rank weights, MRD codes, optimal
anticodes, and q-polymatroids associated to a rank-metric code.
The final two chapters of Part I deal with the important technique of linear programming
and a related generalization to produce bounds. As described in Chapter 12, coauthored by
Peter Boyvalenkov and Danyo Danev, general linear programming methods imply universal
bounds for codes and designs. The explanation is organized in the Levenshtein framework
extended with recent developments on universal bounds for the energy of codes, including
the concept of universal optimality. The exposition is done separately for codes in Hamming
spaces and for spherical codes.
Linear programming bounds, initially developed by Delsarte, belong to the most power-
ful and flexible methods to obtain bounds for extremal problems in coding theory. In recent
years, after the pioneering work of Schrijver, semidefinite programming bounds have been
developed with two aims: to strengthen linear programming bounds and to find bounds
for more general spaces. Chapter 13, by Frank Vallentin, introduces semidefinite programming bounds with an emphasis on error-correcting codes and their relation to semidefinite programming hierarchies for difficult combinatorial optimization problems.
The next eight chapters make up Part II of the Concise Encyclopedia of Coding Theory,
where the focus is on specific families of codes. The codes presented fall into two categories,
with some overlap. Some of them are generalizations of classical codes from Part I. The rest
have a direct connection to algebraic, geometric, or graph theoretic structures. They all are
interesting theoretically and often possess properties useful for application.
There are many problems in coding theory which are equivalent to geometrical prob-
lems in Galois geometries. Certain formulations of some of the classical codes have direct
connections to geometry. Chapter 14, written by Leo Storme, describes a number of the
many links between coding theory and Galois geometries, and shows how these two research
areas influence and stimulate each other.
Chapter 15, written by Alain Couvreur and Hugues Randriambololona, surveys the
development of the theory of algebraic geometry codes since their discovery in the late
1970s. The authors summarize the major results on various problems such as asymptotic
parameters, improved estimates on the minimum distance, and decoding algorithms. In
addition, the chapter describes various modern applications of these codes such as public-
key cryptography, algebraic complexity theory, multiparty computation, and distributed
storage.
Very often the parameters of good/optimal linear codes can be realized by group codes,
that is, ideals in a group algebra FG where G is a finite group and F is a finite field. Such
codes, the topic of Chapter 16 written by Wolfgang Willems, carry more algebraic structure
than linear codes alone, which leads to an easier analysis of the codes. In particular, the full
machinery of representation theory of finite groups can be applied to prove interesting
coding theoretical properties.
Chapter 17, coauthored by Hai Q. Dinh and Sergio R. López-Permouth, discusses foun-
dational and theoretical aspects of the role of finite rings as alphabets in coding theory, with
a concentration on the class of constacyclic codes over finite commutative chain rings. The
chapter surveys both the simple-root and repeated-root cases. Several directions in which
the notion of constacyclicity has been extended are also presented.
The next three chapters focus on codes with few weights; such codes have applications
delineated throughout this encyclopedia. As described in Chapter 18, written by Minjia Shi,
one important construction technique for few-weight codes is to use trace codes. For example
the simplex code, a one-weight code, can be constructed as a trace code by using finite field
extensions. In recent years, this technique has been refined by using ring extensions of a
finite field coupled with a linear Gray map. Moreover, these image codes can be applied to
secret sharing schemes.
Codes with few weights often have an interesting geometric structure. Chapter 19, writ-
ten by Andries E. Brouwer, focuses specifically on codes with exactly two nonzero weights.
The chapter discusses the relationship between two-weight linear codes, strongly regular
graphs, and 2-character subsets of a projective space.
Functions in general, and more specifically cryptographic functions, that is, highly nonlinear
functions (PN, APN, bent, AB, plateaued), have important applications in coding
theory since they are used to construct optimal linear codes and linear codes useful for
applications such as secret sharing, two-party computation, and storage. The ultimate goal
of Chapter 20, by Sihem Mesnager, is to provide an overview and insights into linear codes
with good parameters that are constructed from functions and polynomials over finite fields
using multiple approaches.
Chapter 21 by Christine A. Kelley, the concluding chapter of Part II, gives an overview of
graph-based codes and iterative message-passing decoding algorithms for these codes. Some
important classes of low-density parity check codes, such as finite geometry codes, expander
codes, protograph codes, and spatially coupled codes are discussed. Moreover, analysis tech-
niques of the decoding algorithm for both the finite length case and the asymptotic length
case are summarized. While the area of codes over graphs is vast, a few other families such
as repeat accumulate and turbo-like codes are briefly mentioned.
The final chapter of Part II provides a natural bridge to the applications in Part III
as codes from graphs were designed to facilitate communication. The thirteen chapters of
Part III examine several applications that fall into two categories, again with some overlap.
Some of the applications present codes developed for specific uses; other applications use
codes to produce other structures that themselves become the main application.
The first chapter in Part III is again a bridge between the previous and successive
chapters as it has a distinct theoretical slant but its content is useful as well in applications.
Chapter 22, written by Marcelo Firer, gives an account of many metrics used in the context
of coding theory, mainly for decoding purposes. The chapter stresses the role of some
metric-related invariants and aspects that are eclipsed in the usual setting of the Hamming
metric.
Chapter 23, written by Alfred Wassermann, examines algorithms for computer construc-
tion of “good” linear codes and methods to determine the minimum distance and weight
enumerator of a linear code. For code construction the focus is on the geometric view: a
linear code can be seen as a suitable set of points in a projective geometry. The search then
reduces to an integer linear programming problem. The chapter explores how the search
space can be much reduced by prescribing a group of symmetries and how to construct
special code types such as self-orthogonal codes or LCD codes.
In Chapter 24 by Swastik Kopparty we will see some algorithmic ideas based on polyno-
mial interpolation for decoding algebraic codes, applied to generalized Reed–Solomon and
interleaved generalized Reed–Solomon codes. These ideas will power decoding algorithms
that can decode algebraic codes beyond half the minimum distance.
The theory of pseudo-noise sequences has close connections with coding theory, cryptog-
raphy, combinatorics, and discrete mathematics. Chapter 25, coauthored by Tor Helleseth
and Chunlei Li, gives a brief introduction of two kinds of pseudo-noise sequences, namely
sequences with low correlation and shift register sequences with maximum periods, which
are of particular interest in modern communication systems.
Lattice coding is presented in Chapter 26, written by Frédérique Oggier, in the context
of Gaussian and fading channels, where channel models are presented. Lattice constructions
from both quadratic fields and linear codes are described. Some variations of lattice coding
are also discussed.
A bridge between classical coding theory and quantum error control was firmly put in
place via the stabilizer formalism, allowing the capabilities of a quantum stabilizer code to
be inferred from the properties of the corresponding classical codes. Well-researched tools
and the wealth of results in classical coding theory often translate nicely to the design of
good quantum codes, the subject of Chapter 27 by Martianus Frederic Ezerman. Research
problems triggered by error-control issues in the quantum setup revive and expand studies
on specific aspects of classical codes, which were previously overlooked or deemed not so
interesting.
Chapter 28 on space-time coding, written by Frédérique Oggier, defines what space-time
coding actually is and what are variations of space-time coding problems. The chapter also
provides channel models and design criteria, together with several examples.
Chapter 29, by Frank R. Kschischang, describes error-correcting network codes for
packet networks employing random linear network coding. In such networks, packets sent
into the network by the transmitter are regarded as a basis for a vector space over a finite
field, and the network provides the receiver with random linear combinations of the trans-
mitted vectors, possibly also combined with noise vectors. Unlike classical coding theory—
where codes are collections of well-separated vectors, each of them a point of some ambient
vector space—here codes are collections of well-separated vector spaces, each of them a sub-
space of some ambient vector space. The chapter provides appropriate coding metrics for
such subspace codes, and describes various bounds and constructions, focusing particularly
on the case of constant-dimension codes whose codewords all have the same dimension.
Erasure codes have attained a position of importance for many streaming and file down-
load applications on the internet. Chapter 30, written by Ian F. Blake, outlines the devel-
opment of these codes from simple erasure correcting codes to the important Raptor codes.
Various decoding algorithms for these codes are developed and illustrated.
Chapter 31, with coauthors Vinayak Ramkumar, Myna Vajha, S. B. Balaji, M. Nikhil
Krishnan, Birenjith Sasidharan, and P. Vijay Kumar, deals with the topic of designing
reliable and efficient codes for the storage and retrieval of large quantities of data over
storage devices that are prone to failure. Historically, the traditional objective has been one
of ensuring reliability against data loss while minimizing storage overhead. More recently,
a third concern has surfaced, namely, the need to efficiently recover from the failure of a
single storage unit corresponding to recovery from the erasure of a single code symbol. The
authors explain how coding theory has evolved to tackle this fresh challenge.
Polar codes are error-correcting codes that achieve the symmetric capacity of discrete
input memoryless channels with a polynomial encoding and decoding complexity. Chapter 32,
coauthored by Noam Presman and Simon Litsyn, provides a general presentation
of polar codes and their associated algorithms. At the same time, most of the examples in
the chapter use Arıkan's original basic (u + v, v) construction due to its simplicity and
wide applicability.
While one thinks of coding theory as the major tool to reveal correct information after
errors in that information have been introduced, the final two chapters of this encyclopedia
address the opposite problem: using coding theory as a tool to hide information. Chapter 33,
by Cunsheng Ding, first gives a brief introduction to secret sharing schemes, and then
introduces two constructions of secret sharing schemes with linear codes. It also documents
a construction of multisecret sharing schemes with linear codes. Basic results about these
secret sharing schemes are presented in this chapter.
Chapter 34, written by Philippe Gaborit and Jean-Christophe Deneuville, gives a gen-
eral overview of basic tools used for code-based cryptography. The security of the main diffi-
cult problem for code-based cryptography, the Syndrome Decoding problem, is considered,
together with its quasi-cyclic variations. The current state-of-the-art for the cryptographic
primitives of encryption, signature, and authentication is the main focus of the chapter.
The editors of the Concise Encyclopedia of Coding Theory thank the 48 other authors
for sharing their expertise to make this project come to pass. Their cooperation and patience
were invaluable to us. We also thank Gayle Imamura-Huffman for lending her transparent
watercolor Coded Information and opaque watercolor Linear Subspaces for use on the front
and back covers of this encyclopedia. Additionally we thank the editorial staff at CRC
Press/Taylor and Francis Group: Sarfraz Khan, who helped us begin this project; Callum
Fraser, who became the Mathematical Editor at CRC Press as the project progressed; and
Robin Lloyd-Starkes, who is Project Editor. Also with CRC Press, we thank Mansi Kabra
for handling permissions and copyrights and Kevin Craig who assisted with the cover design.
We thank Meeta Singh, Senior Project Manager at KnowledgeWorks Global Ltd., and her
team for production of this encyclopedia. And of course, we sincerely thank our families for
their support and encouragement throughout this journey.
– W. Cary Huffman
– Jon-Lark Kim
– Patrick Solé
Contributors
Coding Fundamentals
Chapter 1
Basics of Coding Theory
W. Cary Huffman
Loyola University, Chicago
Jon-Lark Kim
Sogang University
Patrick Solé
CNRS, Aix-Marseille Université
1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2 Finite Fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.3 Codes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.4 Generator and Parity Check Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.5 Orthogonality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.6 Distance and Weight . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.7 Puncturing, Extending, and Shortening Codes . . . . . . . . . . . . . . . . . . 12
1.8 Equivalence and Automorphisms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.9 Bounds on Codes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
1.9.1 The Sphere Packing Bound . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
1.9.2 The Singleton Bound . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
1.9.3 The Plotkin Bound . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
1.9.4 The Griesmer Bound . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
1.9.5 The Linear Programming Bound . . . . . . . . . . . . . . . . . . . . . . . . 18
1.9.6 The Gilbert Bound . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
1.9.7 The Varshamov Bound . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
1.9.8 Asymptotic Bounds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
1.10 Hamming Codes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
1.11 Reed–Muller Codes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
1.12 Cyclic Codes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
1.13 Golay Codes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
1.14 BCH and Reed–Solomon Codes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
1.15 Weight Distributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
1.16 Encoding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
1.17 Decoding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
1.18 Shannon’s Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
1.1 Introduction
Coding theory had its genesis in the late 1940s with the publication of works by Claude
Shannon, Marcel Golay, and Richard Hamming.

[Figure 1.1: A simple communication channel. A message x = x1 · · · xk from the source
is turned by the encoder into a codeword c = c1 · · · cn and sent over the channel; noise
adds an error vector e = e1 · · · en, and the decoder processes the received vector y = c + e
to produce a message estimate x̃ for the receiver.]

In 1948 Shannon published a landmark paper, A mathematical theory of communication
[1661], which marked the beginning of both
information theory and coding theory. Given a communication channel, over which informa-
tion is transmitted and possibly corrupted, Shannon identified a number called the ‘channel
capacity’ and proved that arbitrarily reliable communication is possible at any rate below
the channel capacity. For example, when transmitting images of planets from deep space, it
is impractical to retransmit the images that have been altered by noise during transmission.
Shannon’s Theorem guarantees that the data can be encoded before transmission so that
the altered data can be decoded to the original, up to a specified degree of accuracy. Other
examples of communication channels include wireless communication devices and storage
systems such as DVDs or Blu-ray discs. In 1947 Hamming developed a code, now bearing
his name, in an attempt to correct errors that arose in the Bell Telephone Laboratories’
mechanical relay computer; his work was circulated through a series of memoranda at Bell
Labs and eventually published in [895]. Both Shannon [1661] and Golay [820] published
Hamming’s code, with Golay generalizing it. Additionally, Golay presented two of the four
codes that now bear his name. A monograph by T. M. Thompson [1801] traces the early
development of coding theory.
A simple communication channel is illustrated in Figure 1.1. At the source a mes-
sage, denoted x = x1 · · · xk in the figure, is to be sent. If no modification is made to x and
it is transmitted directly over the channel, any noise would distort x so that it could not
be recovered. The basic idea of coding theory is to embellish the message by adding some
redundancy so that hopefully the original message can be recovered after reception even if
noise corrupts the embellished message during transmission. The redundancy is added by
the encoder and the embellished message, called a codeword c = c1 · · · cn in the figure, is
sent over the channel where noise in the form of an error vector e = e1 · · · en distorts the
codeword producing a received vector y.1 The received vector is then sent to be decoded
where the errors are removed. The redundancy is then stripped off, and an estimate x e of
the original message is produced. Hopefully x e = x. (There is a one-to-one correspondence
1 Generally message and codeword symbols will come from a finite field F or a finite ring R. Messages
will be ‘vectors’ in Fk (or Rk ), and codewords will be ‘vectors’ in Fn (or Rn ). If c entered the channel and
y exited the channel, the difference y − c is what we have termed the error vector e in Figure 1.1. While
this is the normal scenario, other ambient spaces from which codes arise occur in this encyclopedia.
between codewords and messages. In many cases, the real interest is not in the message x
but the codeword c. With this point of view, the job of the decoder is to obtain an estimate
ỹ from y and hope that ỹ = c.) For example in deep space communication, the message
source is the satellite, the channel is outer space, the decoder is hardware at a ground sta-
tion on Earth, and the receiver is the people or computer processing the information; of
course, messages travel from Earth to the satellite as well. For a DVD or Blu-ray disc, the
message source is the voice, music, video, or data to be placed on the disc, the channel is
the disc itself, the decoder is the DVD or Blu-ray player, and the receiver is the listener or
viewer.
Shannon’s Theorem guarantees that the hope of successful recovery will be fulfilled a
certain percentage of the time. With the right encoding based on the characteristics of the
channel, this percentage can be made as high as desired, although not 100%. The proof of
Shannon’s Theorem is probabilistic and nonconstructive. No specific codes were produced
in the proof that give the desired accuracy for a given channel. Shannon’s Theorem only
guarantees their existence. In essence, the goal of coding theory is to produce codes that
fulfill the conditions of Shannon’s Theorem and make reliable communication possible.
There are numerous texts, ranging from introductory to research-level books, on coding
theory including (but certainly not limited to) [170, 209, 896, 1008, 1323, 1505, 1506, 1520,
1521, 1602, 1836]. There are two books, [169] edited by E. R. Berlekamp and [212] edited by
I. F. Blake, in which early papers in the development of coding theory have been reprinted.
Definition 1.2.1 A field F is a nonempty set with two binary operations, denoted + and
·, satisfying the following properties.
(a) For all α, β, γ ∈ F, α+β ∈ F, α·β ∈ F, α+β = β+α, α·β = β·α, α+(β+γ) = (α+β)+γ,
α · (β · γ) = (α · β) · γ, and α · (β + γ) = α · β + α · γ.
(b) F possesses an additive identity or zero, denoted 0, and a multiplicative identity
or unity, denoted 1, such that α + 0 = α and α · 1 = α for all α ∈ Fq .
(c) For all α ∈ F and all β ∈ F with β 6= 0, there exists α0 ∈ F, called the additive inverse
of α, and β ∗ ∈ F, called the multiplicative inverse of β, such that α + α0 = 0 and
β · β ∗ = 1.
The additive inverse of α will be denoted −α, and the multiplicative inverse of β will be
denoted β^{−1}. Usually the multiplication operation will be suppressed; that is, α · β will be
denoted αβ. If n is a positive integer and α ∈ F, nα = α + α + · · · + α (n times), α^n = αα · · · α
(n times), and α^{−n} = α^{−1}α^{−1} · · · α^{−1} (n times when α ≠ 0). Also α^0 = 1 if α ≠ 0. The
usual rules of exponentiation hold. If F is a finite set with q elements, F is called a finite
field of order q and denoted Fq .
Example 1.2.2 Fields include the rational numbers Q, the real numbers R, and the com-
plex numbers C. Finite fields include Zp , the set of integers modulo p, where p is a prime.
The following theorem gives some of the basic properties of finite fields.
Theorem 1.2.3 Let Fq be a finite field with q elements. The following hold.
(a) Fq is unique up to isomorphism.
(b) q = pm for some prime p and some positive integer m.
(c) Fq contains the subfield Fp = Zp .
(d) Fq is a vector space over Fp of dimension m.
(e) pα = 0 for all α ∈ Fq .
(f) If α, β ∈ Fq, (α + β)^p = α^p + β^p.
(g) There exists an element γ ∈ Fq with the following properties.
(i) Fq = {0, 1 = γ^0, γ, . . . , γ^{q−2}} and γ^{q−1} = 1,
(ii) {1 = γ^0, γ, . . . , γ^{m−1}} is a basis of the vector space Fq over Fp, and
(iii) there exist a_0, a_1, . . . , a_{m−1} ∈ Fp such that
γ^m = a_0 + a_1 γ + · · · + a_{m−1} γ^{m−1}. (1.1)
(h) For all α ∈ Fq, α^q = α.
Definition 1.2.4 In Theorem 1.2.3, p is called the characteristic of Fq . The element γ is
called a primitive element of Fq .
Remark 1.2.5 Using Theorem 1.2.3(f), the map σ_p : Fq → Fq given by σ_p(α) = α^p is
an automorphism of Fq, called the Frobenius automorphism of Fq. Once one primitive
element γ of Fq is found, the remaining primitive elements of Fq are precisely γ^d where
gcd(d, q − 1) = 1.
The key to constructing a finite field is to find a primitive element γ in Fq and the
equation (1.1). We do not describe this process here, but refer the reader to the texts
mentioned at the beginning of the section. Assuming γ is found and the equation (1.1) is
known, we can construct addition and multiplication tables for Fq . This is done by writing
every element of Fq in two forms. The first form takes advantage of Theorem 1.2.3(g)(ii).
Every element α ∈ Fq is written uniquely in the form
α = a_0 γ^0 + a_1 γ + a_2 γ^2 + · · · + a_{m−1} γ^{m−1} with a_i ∈ Fp = Zp for 0 ≤ i ≤ m − 1,
which we abbreviate α = a_0 a_1 a_2 · · · a_{m−1}, a vector in Z_p^m. Addition in Fq is accomplished
by ordinary vector addition in Z_p^m. To each α ∈ Fq, with α ≠ 0, we associate a second form:
α = γ^i for some i with 0 ≤ i ≤ q − 2. Multiplication is accomplished by γ^i γ^j = γ^{i+j}, where
the exponent i + j is computed modulo q − 1 since γ^{q−1} = 1.
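This two-forms bookkeeping can be sketched in a few lines of Python. As a minimal illustration (not part of the text), we build F_8 = F_2(γ) using the primitive element relation γ^3 = 1 + γ, i.e., equation (1.1) with p = 2, m = 3, and (a_0, a_1, a_2) = (1, 1, 0):

```python
# Sketch of the two representations of F_8 with gamma^3 = 1 + gamma.
p, m = 2, 3
q = p ** m
recursion = (1, 1, 0)  # coefficients a_0, ..., a_{m-1} from equation (1.1)

# Vector form of gamma^i for 0 <= i <= q-2, via repeated multiplication by gamma.
vec = {0: (1, 0, 0)}  # gamma^0 = 1
for i in range(1, q - 1):
    a = vec[i - 1]
    carry = a[m - 1]           # coefficient of gamma^{m-1}, to be reduced by (1.1)
    shifted = (0,) + a[:m - 1]  # multiply the rest by gamma (shift coefficients up)
    vec[i] = tuple((shifted[j] + carry * recursion[j]) % p for j in range(m))

# Addition uses the vector form; multiplication uses the exponent form mod q-1.
def add(u, v):
    return tuple((u[j] + v[j]) % p for j in range(m))

def mul_exp(i, j):
    return (i + j) % (q - 1)

# Example: gamma + gamma^2 equals gamma^k for some k; find it from the tables.
s = add(vec[1], vec[2])
k = next(i for i in vec if vec[i] == s)
```

Since the q − 1 powers of γ are distinct and nonzero, the two forms together give complete addition and multiplication tables; here the lookup finds γ + γ^2 = γ^4.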
1.3 Codes
In this section we introduce the concept of codes over finite fields. We begin with some
notation.
The set of n-tuples with entries in Fq forms an n-dimensional vector space, denoted
F_q^n = {x_1 x_2 · · · x_n | x_i ∈ Fq, 1 ≤ i ≤ n}, under componentwise addition of n-tuples and
componentwise multiplication of n-tuples by scalars in Fq. The vectors in F_q^n will often be
denoted using bold Roman characters x = x_1 x_2 · · · x_n. The vector 0 = 00 · · · 0 is the zero
vector in F_q^n.
For positive integers m and n, F_q^{m×n} denotes the set of all m × n matrices with
entries in Fq. The matrix in F_q^{m×n} with all entries 0 is the zero matrix, denoted 0_{m×n}.
The identity matrix of F_q^{n×n} will be denoted I_n. If A ∈ F_q^{m×n}, A^T ∈ F_q^{n×m} will denote
the transpose of A. If x ∈ F_q^m, x^T will denote x as a column vector of length m, that is,
an m × 1 matrix. The column vector 0^T and the m × 1 matrix 0_{m×1} are the same.
If S is any finite set, its order or size is denoted |S|.
Definition 1.3.1 A subset C ⊆ Fnq is called a code of length n over Fq ; Fq is called the
alphabet of C, and Fnq is the ambient space of C. Codes over Fq are also called q-ary
codes. If the alphabet is F2 , C is binary. If the alphabet is F3 , C is ternary. The vectors
in C are the codewords of C. If C has M codewords (that is, |C| = M), C is denoted an
(n, M)q code, or, more simply, an (n, M) code when the alphabet Fq is understood. If C is a
linear subspace of F_q^n, that is, C is closed under vector addition and scalar multiplication, C
is called a linear code of length n over Fq . If the dimension of the linear code C is k, C is
denoted an [n, k]q code, or, more simply, an [n, k] code. An (n, M )q code that is also linear
is an [n, k]q code where M = q k . An (n, M )q code may be referred to as an unrestricted
code; a specific unrestricted code may be either linear or nonlinear. When referring to a
code, expressions such as (n, M ), (n, M )q , [n, k], or [n, k]q are called the parameters of
the code.
Example 1.3.2 Let C = {1100, 1010, 1001, 0110, 0101, 0011} ⊆ F42 . Then C is a (4, 6)2
binary nonlinear code. Let C1 = C ∪ {0000, 1111}. Then C1 is a (4, 8)2 binary linear code.
As C1 is a subspace of F42 of dimension 3, C1 is also a [4, 3]2 code.
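The claims of Example 1.3.2 can be verified by brute force. A small Python sketch (illustrative only): over F_2 the scalars are 0 and 1, so closure under addition suffices for linearity.

```python
# Verify Example 1.3.2: C is not linear, while C1 = C ∪ {0000, 1111} is.
C = {(1,1,0,0), (1,0,1,0), (1,0,0,1), (0,1,1,0), (0,1,0,1), (0,0,1,1)}
C1 = C | {(0,0,0,0), (1,1,1,1)}

def is_linear(code):
    # Over F_2, a nonempty code is linear iff it is closed under addition.
    return all(tuple((x[i] + y[i]) % 2 for i in range(4)) in code
               for x in code for y in code)
```

Here `is_linear(C)` fails already because 1100 + 1100 = 0000 is not in C, while C1 is exactly the 8 even-weight vectors of F_2^4 and is closed under addition.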
Remark 1.3.3 Basic development of linear codes is found in papers by D. Slepian [1722,
1723, 1724]. In some chapters of this book, codes will be considered where the alphabet is
not necessarily a field but rather a ring R. In these situations, the vector space Fnq will be
replaced by an R-module such as Rn = {x1 x2 · · · xn | xi ∈ R, 1 ≤ i ≤ n}, and a code will
be considered linear if it is an R-submodule of that R-module. See for example Chapters 6,
17, and 18.
(b) If HG^T = 0_{(n−k)×k}, then G is a generator matrix for C if and only if H is a parity
check matrix for C.
Definition 1.4.6 Let C be an [n, k]q linear code with generator matrix G ∈ F_q^{k×n}. For
any set of k independent columns of G, the corresponding set of coordinates forms an
information set for C; the remaining n − k coordinates form a redundancy set for C.
If G has the form G = [I_k | A], G is in standard form, in which case {1, 2, . . . , k} is an
information set with {k + 1, k + 2, . . . , n} the corresponding redundancy set.
Theorem 1.4.7 ([1602, Chapter 2.3]) If G = [I_k | A] is a generator matrix of an [n, k]q
code C, then H = [−A^T | I_{n−k}] is a parity check matrix for C.
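This construction is mechanical enough to sketch directly. A minimal Python illustration over F_2 (where −A^T = A^T), using the standard-form generator matrix of the [7, 4]_2 Hamming code from Example 1.4.9; the helper names are ours, not the book's:

```python
# Sketch of Theorem 1.4.7 over F_2: from G = [I_k | A], build H = [-A^T | I_{n-k}]
# and check that H G^T = 0.
def parity_check_from_standard(G, k, n):
    A = [row[k:] for row in G]  # the k x (n-k) block A
    H = []
    for i in range(n - k):
        row = [A[j][i] for j in range(k)]                   # i-th row of A^T
        row += [1 if c == i else 0 for c in range(n - k)]   # i-th row of I_{n-k}
        H.append(row)
    return H

def prod_mod2(H, G):
    # Entries of H G^T over F_2; all are 0 when H is a parity check matrix for G.
    return [[sum(h[t] * g[t] for t in range(len(h))) % 2 for g in G] for h in H]

# Standard-form generator matrix of the [7,4]_2 Hamming code.
G = [[1,0,0,0,0,1,1],
     [0,1,0,0,1,0,1],
     [0,0,1,0,1,1,0],
     [0,0,0,1,1,1,1]]
H = parity_check_from_standard(G, 4, 7)
```

The resulting H is exactly the matrix H3,2 displayed in Example 1.4.9.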
Example 1.4.8 Continuing with Examples 1.3.2 and 1.4.2, the matrix G1 is in standard
form. Applying Theorem 1.4.7 to G1 , we get the parity check matrix H1 of Example 1.4.2.
The matrices G1′ and G1″ both row reduce to G1; so all three are generator matrices of the
same code, consistent with Remark 1.4.3. Any subset of {1, 2, 3, 4} of size 3 is an information
set for C1. The fact that H1G1^T = H1(G1′)^T = H1(G1″)^T = 0_{1×3} is consistent with Theorem 1.4.5.
Finally, let C2 = {0000, 1100, 0011, 1111} be the [4, 2]2 linear subcode of C1 . C2 does not
have a generator matrix in standard form; the only information sets for C2 are {1, 3}, {1, 4},
{2, 3}, and {2, 4}.
Example 1.4.9 Generator and parity check matrices for the [7, 4]2 binary linear Hamming
code H3,2 are
        [1 0 0 0 0 1 1]
G3,2 =  [0 1 0 0 1 0 1]   and   H3,2 =  [0 1 1 1 1 0 0]
        [0 0 1 0 1 1 0]                 [1 0 1 1 0 1 0] ,
        [0 0 0 1 1 1 1]                 [1 1 0 1 0 0 1]
respectively. G3,2 is in standard form. Two information sets for H3,2 are {1, 2, 3, 4} and
{1, 2, 3, 5} with corresponding redundancy sets {5, 6, 7} and {4, 6, 7}. The set {2, 3, 4, 5} is
not an information set. More general Hamming codes Hm,q are defined in Section 1.10.
1.5 Orthogonality
There is a natural inner product on Fnq that often proves useful in the study of codes.2
Definition 1.5.1 The ordinary inner product, also called the Euclidean inner prod-
uct, on F_q^n is defined by x · y = Σ_{i=1}^{n} x_i y_i where x = x_1 x_2 · · · x_n and y = y_1 y_2 · · · y_n. Two
vectors x, y ∈ F_q^n are orthogonal if x · y = 0. If C is an [n, k]q code, its dual code is
C⊥ = {x ∈ F_q^n | x · c = 0 for all c ∈ C}. C is self-orthogonal if C ⊆ C⊥ and self-dual if C = C⊥.
Theorem 1.5.2 ([1323, Chapter 1.8]) Let C be an [n, k]q code with generator and parity
check matrices G and H, respectively. Then C⊥ is an [n, n − k]q code with generator and
parity check matrices H and G, respectively. Additionally (C⊥)⊥ = C. Furthermore C is
self-dual if and only if C is self-orthogonal and k = n/2.
Example 1.5.3 C2 from Example 1.4.8 is a [4, 2]2 self-dual code with generator and parity
check matrices both equal to
[1 1 0 0]
[0 0 1 1] .
The dual of the Hamming [7, 4]2 code in Example 1.4.9 is a [7, 3]2 code H⊥3,2. H3,2 is a
generator matrix of H⊥3,2. As every row of H3,2 is orthogonal to itself and every other row
of H3,2, H⊥3,2 is self-orthogonal. As H⊥3,2 has dimension 3 and (H⊥3,2)⊥ = H3,2 has dimension
4, H⊥3,2 is not self-dual.
Definition 1.6.1 The (Hamming) distance between two vectors x, y ∈ Fnq , denoted
dH (x, y), is the number of coordinates in which x and y differ. The (Hamming) weight
of x ∈ Fnq , denoted wtH (x), is the number of coordinates in which x is nonzero.
where x ⋆ y is the vector in F_2^n which has 1s precisely in those coordinates where both
x and y have 1s.
(g) If x, y ∈ F_2^n, then wt_H(x ⋆ y) ≡ x · y (mod 2). In particular, wt_H(x) ≡ x · x (mod 2).
3 There are other notions of distance and weight used in coding theory. See for example Chapters 6, 7,
The minimum (Hamming) weight of a nonzero code C is the smallest weight of its
nonzero codewords. The (Hamming) weight distribution of C is the list A0(C), A1(C), . . . , An(C)
where, for 0 ≤ i ≤ n, Ai (C) is the number of codewords of weight i. If C is understood,
the distance and weight distributions of C are denoted B0 , B1 , . . . , Bn and A0 , A1 , . . . , An ,
respectively.
Example 1.6.5 Let C be the (4, 6)2 code in Example 1.3.2. Its distance distribution is
B0(C) = B4(C) = 1, B2(C) = 4, B1(C) = B3(C) = 0, and its minimum distance is 2. In par-
ticular C is a (4, 6, 2)2 code. The weight distribution of C is A2(C) = 6 with Ai(C) = 0 other-
wise; its minimum weight is also 2. Let C′ = 1000 + C = {0100, 0010, 0001, 1110, 1101, 1011}.
The distance distribution of C′ agrees with the distance distribution of C making C′ a
(4, 6, 2)2 code. However, the weight distribution of C′ is A1(C′) = A3(C′) = 3 with Ai(C′) = 0
otherwise; the minimum weight of C′ is 1.
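Both distributions are easy to recompute by hand or machine. A small Python sketch (names are ours): for an unrestricted code, B_i is (1/M) times the number of ordered codeword pairs at distance i, and A_i counts codewords of weight i.

```python
# Recompute the distributions of Example 1.6.5 for C and C' = 1000 + C.
def dH(x, y):
    # Hamming distance: number of coordinates where x and y differ.
    return sum(a != b for a, b in zip(x, y))

def distributions(code):
    n, M = len(next(iter(code))), len(code)
    B = [sum(dH(x, y) == i for x in code for y in code) / M for i in range(n + 1)]
    A = [sum(sum(x) == i for x in code) for i in range(n + 1)]  # binary weights
    return B, A

C = {(1,1,0,0), (1,0,1,0), (1,0,0,1), (0,1,1,0), (0,1,0,1), (0,0,1,1)}
Cp = {tuple((1 if i == 0 else 0) ^ c[i] for i in range(4)) for c in C}  # 1000 + C

B_C, A_C = distributions(C)
B_Cp, A_Cp = distributions(Cp)
```

The translate Cp has the same distance distribution as C but a different weight distribution, exactly as the example asserts.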
Theorem 1.6.6 ([1008, Chapter 1.4]) Let C be an [n, k, d]q linear code with k > 0. The
following hold.
(a) The minimum distance and minimum weight of C are the same.
(b) Ai (C) = Bi (C) for 0 ≤ i ≤ n.
(c) Σ_{i=0}^{n} Ai(C) = q^k.
Binary vectors possess an important relationship between weights and inner products. If
x, y ∈ Fn2 and each have even weight, Theorem 1.6.2(f) implies x + y also has even weight.
If x, y ∈ Fn2 are orthogonal and each have weights a multiple of 4, Theorem 1.6.2(f) and (g)
show that x + y has weight a multiple of 4. This leads to the following definition for binary
codes.
Definition 1.6.8 Let C be a binary linear code. C is called even if all of its codewords
have even weight. C is called doubly-even if all of its codewords have weights a multiple
of 4. An even binary code that is not doubly-even is singly-even.
Remark 1.6.9 By Theorem 1.6.6(e), self-orthogonal binary linear codes are even. The
converse is not true; code C1 from Example 1.3.2 is even but not self-orthogonal. Doubly-
even binary linear codes must be self-orthogonal by Theorem 1.6.2(f) and (g). There are
self-orthogonal binary codes that are singly-even; code C2 from Examples 1.4.8 and 1.5.3 is
singly-even and self-dual.
Example 1.6.10 Let H3,2 be the [7, 4]2 binary Hamming code of Example 1.4.9. With
Ai = Ai(H3,2), A0 = A7 = 1, A3 = A4 = 7, and A1 = A2 = A5 = A6 = 0, illustrating
Theorem 1.6.6(d) and (e), and showing H3,2 is a [7, 4, 3]2 code. The [7, 3]2 dual code H⊥3,2
is self-orthogonal by Example 1.5.3 and hence even. Also by self-orthogonality, H⊥3,2 ⊆
(H⊥3,2)⊥ = H3,2; the weight distribution of H3,2 shows that the 8 codewords of weights 0
and 4 must be precisely the codewords of H⊥3,2. In particular, H⊥3,2 is a doubly-even [7, 3, 4]2
code. H⊥3,2 is called a simplex code, described further in Section 1.10.
The minimum weight of a linear code is determined by a parity check matrix for the
code; see [1008, Corollary 1.4.14 and Theorem 1.4.15].
Theorem 1.6.11 A linear code has minimum weight d if and only if its parity check matrix
has a set of d linearly dependent columns but no set of d−1 linearly dependent columns. Also,
if C is an [n, k, d]q code, then every n − d + 1 coordinate positions contain an information
set; furthermore, d is the largest number with this property.
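The column condition of Theorem 1.6.11 can be checked by brute force. A minimal Python sketch for the binary case (over F_2, a set of columns is linearly dependent exactly when some nonempty subset of them sums to 0), applied to the Hamming parity check matrix of Example 1.4.9:

```python
# Sketch of Theorem 1.6.11 over F_2: the minimum weight equals the smallest
# number of parity-check columns summing to zero.
from itertools import combinations

H = [[0,1,1,1,1,0,0],
     [1,0,1,1,0,1,0],
     [1,1,0,1,0,0,1]]   # parity check matrix of the [7,4]_2 Hamming code
cols = list(zip(*H))

def min_dependent(cols):
    # Search subsets of columns in increasing size for one that sums to 0 mod 2.
    rows = len(cols[0])
    for size in range(1, len(cols) + 1):
        for subset in combinations(cols, size):
            if all(sum(col[r] for col in subset) % 2 == 0 for r in range(rows)):
                return size

d = min_dependent(cols)
```

No column is zero and no two columns are equal, but columns 1, 2, and 3 sum to zero, so the minimum weight is 3, matching Example 1.6.10.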
Definition 1.7.1 Let C be an [n, k, d]q linear code with generator matrix G and parity
check matrix H.
(a) For some i with 1 ≤ i ≤ n, let C∗ be the codewords of C with the ith component
deleted. The resulting code, called a punctured code, is an [n − 1, k∗, d∗] code. If
d > 1, k∗ = k, and d∗ = d unless C has a minimum weight codeword that is nonzero
on coordinate i, in which case d∗ = d − 1. If d = 1, k∗ = k and d∗ = 1 unless C has
a weight 1 codeword that is nonzero on coordinate i, in which case k∗ = k − 1 and
d∗ ≥ 1 as long as C∗ is nonzero. A generator matrix G∗ for C∗ is obtained from G by
deleting column i; G∗ will have dependent rows if d∗ = 1 and k∗ = k − 1. Puncturing
is often done on multiple coordinates in an analogous manner, one coordinate at a
time.
(b) Define Ĉ = {c_1 c_2 · · · c_{n+1} ∈ F_q^{n+1} | c_1 c_2 · · · c_n ∈ C where Σ_{i=1}^{n+1} c_i = 0}, called the
extended code. This is an [n + 1, k, d̂]q code where d̂ = d or d + 1. A generator
matrix Ĝ for Ĉ is obtained by adding a column on the right of G so that every row
sum in this k × (n + 1) matrix is 0. A parity check matrix Ĥ for Ĉ is
Ĥ = [ 1 · · · 1   1 ]
    [     H       0 ]
where the 0 denotes a column of n − k 0s.
(c) Let S be any set of s coordinates. Let C(S) be all codewords in C that are zero on
S. Puncturing C(S) on S results in the [n − s, kS , dS ]q shortened code CS where
dS ≥ d. If C ⊥ has minimum weight d⊥ and s < d⊥ , then kS = k − s.
Example 1.7.2 Let H3,2 be the [7, 4, 3]2 binary Hamming code of Examples 1.4.9 and
1.6.10. Extending this code, we obtain Ĥ3,2 with generator and parity check matrices
         [1 0 0 0 0 1 1 1]                [1 1 1 1 1 1 1 1]
Ĝ3,2 =   [0 1 0 0 1 0 1 1]   and Ĥ3,2 =  [0 1 1 1 1 0 0 0] ,
         [0 0 1 0 1 1 0 1]                [1 0 1 1 0 1 0 0]
         [0 0 0 1 1 1 1 0]                [1 1 0 1 0 0 1 0]
respectively. Given the weight distribution of H3,2 found in Example 1.6.10, the weight
distribution of Ĥ3,2 must be A0(Ĥ3,2) = A8(Ĥ3,2) = 1, A4(Ĥ3,2) = 14, and Ai(Ĥ3,2) = 0
otherwise, implying Ĥ3,2 is doubly-even and self-dual; see Remark 1.6.9. Certainly if Ĥ3,2
is punctured on its right-most coordinate, the resulting code is H3,2.
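The weight distribution claim can be confirmed by enumerating the 16 codewords from the extended generator matrix; a short Python sketch (illustrative only):

```python
# Check Example 1.7.2: every codeword of the extended Hamming code has weight
# divisible by 4, with weight distribution A_0 = A_8 = 1 and A_4 = 14.
from itertools import product

G_ext = [[1,0,0,0,0,1,1,1],
         [0,1,0,0,1,0,1,1],
         [0,0,1,0,1,1,0,1],
         [0,0,0,1,1,1,1,0]]

codewords = set()
for coeffs in product([0, 1], repeat=4):
    # Codeword = F_2-linear combination of the rows selected by coeffs.
    word = tuple(sum(c * row[j] for c, row in zip(coeffs, G_ext)) % 2
                 for j in range(8))
    codewords.add(word)

weights = sorted(sum(w) for w in codewords)
A = [weights.count(i) for i in range(9)]
```

All weights being multiples of 4 also illustrates the doubly-even property cited from Remark 1.6.9.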
There is a relationship between punctured and shortened codes via dual codes.
Remark 1.8.2 Applying a permutation matrix to a code simply permutes the coordinates;
applying a monomial matrix permutes and re-scales coordinates. Applying either a permu-
tation or monomial matrix to a vector does not change its weight. Also applying either
a permutation or monomial matrix to two vectors does not change the distance between
these two vectors. There is a third more general concept of equivalence, involving semi-linear
transformations, where two linear codes C1 and C2 over Fq are equivalent provided one
can be obtained from the other by permuting and re-scaling coordinates and then applying
an automorphism of the field Fq . Note that applying such maps to a vector or to a pair of
vectors preserves the weight of the vector and the distance between the two vectors, respec-
tively; see [1008, Section 1.7] for further discussion of this type of equivalence. There are
other concepts of equivalence that arise when the code may not be linear but has some spe-
cific algebraic structure (e.g., additive codes over Fq that are closed under vector addition
but not necessarily closed under scalar multiplication). The common theme when defining
equivalence of such codes is to use a set of maps which preserve distance between the two
vectors, which preserve the algebraic structure under consideration, and which form a group
under composition of these maps. We will follow this theme when we define equivalence of
unrestricted codes at the end of this section.
Remark 1.8.3 Let C1 and C2 be linear codes over Fq of length n. Define C1 ∼P C2 to mean
C1 is permutation equivalent to C2 ; similarly define C1 ∼M C2 to mean C1 is monomially
equivalent to C2 . Then both ∼P and ∼M are equivalence relations on the set of all linear
codes over Fq of length n; that is, both are reflexive, symmetric, and transitive. If q = 2,
the concepts of permutation and monomial equivalence are the same; if q > 2, they may
not be. Furthermore, two permutation or monomially equivalent codes have the same size,
weight and distance distributions, and minimum weight and distance. If two linear codes
are permutation equivalent and one code is self-orthogonal, so is the other; this may not be
true of two monomially equivalent codes.
Row reducing a generator matrix of a linear code to reduced echelon form and then
permuting columns yields the following result.
Theorem 1.8.4 Let C be a linear [n, k, d]q code with k ≥ 1. There is a code permutation
equivalent to C with a generator matrix in standard form.
Example 1.8.5 Let C be an [8, 4, 4]2 binary linear code. By Theorem 1.8.4, C is permuta-
tion equivalent to a code with generator matrix G = [I_4 | A]. A straightforward argument
using minimum weight 4 shows that the columns of A can be permuted so that the resulting
generator matrix is Ĝ3,2 from Example 1.7.2. This verifies that C is permutation equivalent
to Ĥ3,2.
The following is a generalization of a result of MacWilliams [1318]; see also [229, 1876].
This result motivated Definition 1.8.1.
Theorem 1.8.6 (MacWilliams Extension) There is a weight preserving linear trans-
formation between equal length linear codes C1 and C2 over Fq if and only if C1 and C2 are
monomially equivalent. Furthermore, the linear transformation agrees with the associated
monomial transformation on every codeword in C1 .
Definition 1.8.7 Let C be a linear code over Fq of length n. If CP = C for some permutation
matrix P ∈ F_q^{n×n}, then P is a permutation automorphism of C; the set of all permutation
automorphisms of C is a group under matrix multiplication, denoted PAut(C). Similarly, if
CM = C for some monomial matrix M ∈ F_q^{n×n}, then M is a monomial automorphism of
C; the set of all monomial automorphisms of C is a matrix group, denoted MAut(C). Clearly
PAut(C) ⊆ MAut(C).
We now consider when two unrestricted codes are equivalent. It should be noted that, in
this definition, a linear code may end up being equivalent to a nonlinear code. See Chapter 3
for more on this general equivalence.
Definition 1.8.8 Let C1 and C2 be unrestricted codes of length n over Fq of the same size.
Then C1 is equivalent to C2 provided the codewords of C2 are the images under a map
of the codewords of C1 where the map is a permutation of coordinates together with n
permutations of the alphabet Fq , independently within each coordinate.4
Definition 1.9.1 For positive integers n and d, Aq(n, d) is the largest number of code-
words in an (n, M, d)q code, linear or nonlinear. Bq(n, d) is the largest number of code-
words in an [n, k, d]q linear code. An (n, M, d)q code is optimal provided M = Aq(n, d);
an [n, k, d]q linear code is optimal if q k = Bq (n, d). The concept of ‘optimal’ can also be
used in other contexts. Given n and d, kq (n, d) denotes the largest dimension of a linear code
over Fq of length n and minimum weight d; an [n, kq (n, d), d]q code could be called ‘optimal
in dimension’. Notice that kq (n, d) = logq Bq (n, d). Similarly, dq (n, k) denotes the largest
minimum distance of a linear code over Fq of length n and dimension k; an [n, k, dq (n, k)]q
may be called ‘optimal in distance’. Analogously, nq (k, d) denotes the smallest length of a
linear code over Fq of dimension k and minimum weight d; an [nq (k, d), k, d]q code might
be called ‘optimal in length’.5
Clearly Bq (n, d) ≤ Aq (n, d). On-line tables relating parameters of various types of codes
are maintained by M. Grassl [845].
The following basic properties of Aq (n, d) and Bq (n, d) are easily derived; see [1008,
Chapter 2.1].
5 For even n, a self-dual [n, n/2, d]q code over Fq with largest minimum weight d is sometimes called an
‘optimal q-ary self-dual code of length n’. Optimal codes are explored in chapters such as 2–5, 11, 12, 16,
18, 20, and 23.
Definition 1.9.3 The sphere of radius r centered at u ∈ Fnq is the set Sq,n,r (u) =
{v ∈ Fnq | dH (u, v) ≤ r} of all vectors in Fnq whose distance from u is at most r.
The next result is the basis of the Sphere Packing Bound; part (a) is a direct count and
part (b) follows from the triangle inequality of Theorem 1.6.2.
Proof: Let C be an (n, M, d)q code. By Theorem 1.9.5, the spheres of radius t = ⌊(d − 1)/2⌋
centered at distinct codewords are disjoint, and each such sphere has a total of
α = Σ_{i=0}^{t} (n choose i)(q − 1)^i vectors.
Thus Mα cannot exceed the number q^n of vectors in F_q^n. The result is now clear.
Remark 1.9.7 The Sphere Packing Bound is an upper bound on the size of a code given its
length and minimum distance. Additionally the Sphere Packing Bound produces an upper
bound on the minimum distance d of an (n, M)q code in the following sense. Given n, M,
and q, compute the smallest positive integer s with M > q^n / (Σ_{i=0}^{s} (n choose i)(q − 1)^i); for an (n, M, d)q
Example 1.9.9 The code H3,2 of Examples 1.4.9 and 1.6.10 is a [7, 4, 3]2 code. So in this
case t = (3 − 1)/2 = 1 and q^n / (Σ_{i=0}^{t} (n choose i)(q − 1)^i) = 2^7/(1 + 7) = 2^4, yielding
equality in the Sphere Packing Bound. So H3,2 is perfect.
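The arithmetic of this example can be replayed directly; a small Python sketch (the helper name `sphere_size` is ours, not the book's):

```python
# Numerical check of Example 1.9.9: the [7,4,3]_2 Hamming code meets the
# Sphere Packing Bound with equality, i.e., it is perfect.
from math import comb

def sphere_size(q, n, t):
    # Number of vectors of F_q^n within Hamming distance t of a fixed center.
    return sum(comb(n, i) * (q - 1) ** i for i in range(t + 1))

q, n, M, d = 2, 7, 16, 3
t = (d - 1) // 2
```

With t = 1 each sphere holds 1 + 7 = 8 vectors, and 16 · 8 = 2^7 fills the ambient space exactly.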
Remark 1.9.11 In addition to providing an upper bound on code size, the Singleton Bound
yields the upper bound d ≤ n − logq (M ) + 1 on the minimum distance of an (n, M, d)q code.
Definition 1.9.12 A code for which equality holds in the Singleton Bound is called max-
imum distance separable (MDS). No code of length n and minimum distance d has
more codewords than an MDS code with parameters n and d; equivalently, no code of length
n with M codewords has a larger minimum distance than an MDS code with parameters n
and M . MDS codes are discussed in Chapters 3, 6, 8, 14, and 33.
Theorem 1.9.13 C is an [n, k, n−k +1]q MDS code if and only if C ⊥ is an [n, n−k, k +1]q
MDS code.
Example 1.9.14 Let H2,3 be the [4, 2]3 ternary linear code with generator matrix
G2,3 = [ 1 0 1  1 ]
       [ 0 1 1 −1 ] .
Examining inner products of the rows of G2,3, we see that H2,3 is self-orthogonal of dimen-
sion half its length; so it is self-dual. Using Theorem 1.6.2(h), A0(H2,3) = 1, A3(H2,3) = 8,
and Ai(H2,3) = 0 otherwise. In particular H2,3 is a [4, 2, 3]3 code and hence is MDS.
Theorem 1.9.16 (Generalized Plotkin Bound) If an (n, M, d)q code exists, then
M(M − 1)d ≤ 2n Σ_{i=0}^{q−2} Σ_{j=i+1}^{q−1} M_i M_j
where M_i = ⌊(M + i)/q⌋.
Example 1.9.17 The Sphere Packing Bound yields A2(17, 9) ≤ ⌊131 072/3 214⌋ and
A2(18, 10) ≤ ⌊262 144/4 048⌋; so A2(17, 9) ≤ 40 and A2(18, 10) ≤ 64. The Singleton Bound
produces A2(17, 9) ≤ 512 and A2(18, 10) ≤ 512. The Binary Plotkin Bound gives A2(17, 9) ≤ 18
and A2(18, 10) ≤ 10. Using Theorem 1.9.2(e), the Plotkin Bound is best with A2(18, 10) =
A2(17, 9) ≤ 10. According to [845], there is a (18, 10, 10)2 code implying A2(18, 10) =
A2(17, 9) = 10.
Theorem 1.9.18 (Griesmer Bound) Let C be an [n, k, d]q linear code with k ≥ 1. Then
n ≥ Σ_{i=0}^{k−1} ⌈d/q^i⌉.
Remark 1.9.19 One can interpret the Griesmer Bound as an upper bound on the code
size given its length and minimum weight. Specifically, Bq(n, d) ≤ q^k where k is the largest
positive integer such that n ≥ Σ_{i=0}^{k−1} ⌈d/q^i⌉. This bound can also be interpreted as a lower
bound on the length of a linear code of given dimension and minimum weight; that is,
nq(k, d) ≥ Σ_{i=0}^{k−1} ⌈d/q^i⌉. Finally, the Griesmer Bound can be understood as an upper bound
on the minimum weight given the code length and dimension; given n and k, dq(n, k) is at
most the largest d for which the bound holds.
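The three readings of the bound are easy to mechanize. A minimal Python sketch (function names are ours), exercised on the binary [7, 4, 3] Hamming parameters, which meet the bound with equality:

```python
# The Griesmer Bound n >= sum_{i=0}^{k-1} ceil(d / q^i), read three ways.
from math import ceil

def griesmer_length(q, k, d):
    # Lower bound on the length n of an [n, k, d]_q code.
    return sum(ceil(d / q ** i) for i in range(k))

def griesmer_max_k(q, n, d):
    # Largest k whose Griesmer length does not exceed n (so B_q(n,d) <= q^k).
    k = 0
    while griesmer_length(q, k + 1, d) <= n:
        k += 1
    return k

def griesmer_max_d(q, n, k):
    # Largest d for which the bound still holds, an upper bound on d_q(n, k).
    d = 0
    while griesmer_length(q, k, d + 1) <= n:
        d += 1
    return d
```

For q = 2, k = 4, d = 3 the sum is 3 + 2 + 1 + 1 = 7, so length 7 is the smallest possible, and the other two readings recover k = 4 and d = 3 from the remaining parameters.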
Let C be an (n, M, d)q code with distance distribution Bi(C) for 0 ≤ i ≤ n. By Re-
mark 1.6.7, M = Σ_{i=0}^{n} Bi(C), B0(C) = 1, and Bi(C) = 0 for 1 ≤ i ≤ d − 1. Although Bi(C)
may not be an integer, Bi(C) ≥ 0. By the Delsarte–MacWilliams Inequalities, we also have
Σ_{i=0}^{n} Bi(C) K_k^{(n,q)}(i) ≥ 0 for 0 ≤ k ≤ n. As K_0^{(n,q)}(i) = 1, the 0th Delsarte–MacWilliams
Inequality is merely Σ_{i=0}^{n} Bi(C) ≥ 0, which is clearly already true. If q = 2, there are addi-
tional inequalities that hold. When q = 2, it is straightforward to show that Bn(C) ≤ 1. Fur-
thermore when q = 2 and d is even, we may also assume that Bi(C) = 0 when i is odd by The-
orem 1.9.2(f). Properties of binomial coefficients show that K_k^{(n,2)}(i) = (−1)^i K_{n−k}^{(n,2)}(i); thus
the kth Delsarte–MacWilliams Inequality is the same as the (n−k)th Delsarte–MacWilliams
Inequality because Bi(C) = 0 when i is odd. This discussion leads to the linear program
that is set up to establish an upper bound on Aq(n, d).
Sometimes additional constraints can be added to the linear program and reduce the
size of max{Σ_{i=0}^{n} Bi}. Linear Programming Bounds will be considered in more detail in
Chapters 12 and 13.
Bq(n, d) ≥ q^{n − ⌈log_q(1 + Σ_{i=0}^{d−2} (n−1 choose i)(q − 1)^i)⌉}.
Definition 1.9.26 The information rate, or simply rate, of an (n, M, d)q code is defined
to be log_q(M)/n. If the code is actually an [n, k, d]q linear code, its rate is k/n, measuring the
number of information coordinates relative to the total number of coordinates. In either the
linear or nonlinear case, the higher the rate, the higher the proportion of coordinates in a
codeword that actually contain information rather than redundancy. The ratio d/n is called
the relative distance of the code; as we will see later, the relative distance is a measure
of the error-correcting capability of the code relative to its length.
Each asymptotic bound will be either an upper or lower bound on the largest possible
rate for a family of (possibly nonlinear) codes over Fq of lengths going to infinity with
relative distances approaching δ. The function, called the asymptotic normalized rate
function, that determines this rate is
As the exact value of αq (δ) is unknown, we desire upper and lower bounds on this function.
An upper bound would indicate that all families with relative distances approaching δ have
rates, in the limit, at most this upper bound. A lower bound indicates that there exists a
family of codes of lengths approaching infinity and relative distances approaching δ whose
rates are at least this bound. Three of the bounds in the next theorem involve the entropy
function.
Discussion and proofs of the asymptotic bounds can be found in [1008, 1323, 1505, 1836].
The MRRW Bound, named after the authors of [1365] who developed the bound, is the
Asymptotic Linear Programming Bound. The MRRW Bound has been improved by M.
Aaltonen [2] in the case q > 2.
The parameters of the Hamming codes in fact determine the code. That Hm,q is perfect
follows by direct computation from Definition 1.9.8.
Definition 1.11.1 For i ∈ {1, 2}, let Ci be linear codes both of length n over Fq . The
(u | u + v) construction produces the linear code C of length 2n given by C = {(u, u + v) |
u ∈ C1 , v ∈ C2 }.
Remark 1.11.2 Let Ci, for i ∈ {1, 2}, be [n, ki, di]q codes with generator and parity check
matrices Gi and Hi, respectively. C obtained by the (u | u + v) construction is a [2n, k1 +
k2, min{2d1, d2}]q code with generator and parity check matrices
G = [ G1         G1 ]   and   H = [ H1    0_{(n−k1)×n} ] .   (1.2)
    [ 0_{k2×n}   G2 ]             [ −H2   H2           ]
Definition 1.11.3 Let r and m be integers with 0 ≤ r ≤ m and 1 ≤ m. The rth or-
der binary Reed–Muller (RM) code of length 2^m, denoted RM(r, m), is defined
recursively. The code RM(0, m) = {0, 1}, the [2^m, 1, 2^m]2 binary repetition code, and
RM(m, m) = F_2^{2^m}, a [2^m, 2^m, 1]2 code. For 1 ≤ r < m, define
RM(r, m) = {(u, u + v) | u ∈ RM(r, m − 1), v ∈ RM(r − 1, m − 1)}.
Remark 1.11.4 Let G(r, m) be a generator matrix of RM(r, m). By Definition 1.11.3,
G(0, m) = [1 1 · · · 1] and G(m, m) = I_{2^m}. By Definition 1.11.3 and (1.2), for 1 ≤ r < m,
G(r, m) = [ G(r, m − 1)   G(r, m − 1)     ]
          [ O             G(r − 1, m − 1) ] .
Using the definition of Reed–Muller codes and properties from the (u | u + v) construc-
tion, along with induction, the following hold; see [1008, Theorem 1.10.1].
Theorem 1.11.6 Let r and m be integers with 0 ≤ r ≤ m and 1 ≤ m. The following hold.
(a) RM(i, m) ⊆ RM(j, m) if 0 ≤ i ≤ j ≤ m.
(b) The dimension of RM(r, m) equals (m choose 0) + (m choose 1) + · · · + (m choose r).
(c) The minimum weight of RM(r, m) equals 2^{m−r}.
(d) RM(m, m)⊥ = {0}, and if 0 ≤ r < m, then RM(r, m)⊥ = RM(m − r − 1, m).
Remark 1.11.7 Theorem 1.11.6(a) is sometimes called the nesting property of Reed–
Muller codes. As observed in Example 1.11.5, RM(1, 3) = Ĥ3,2. Using Theorem 1.11.6(d),
it can be shown that RM(m − 2, m) = Ĥm,2; see [1008, Exercise 61]. By Theorem 1.11.6(d),
RM(m − 1, m) = RM(0, m)⊥. Since RM(0, m) = {0, 1}, RM(m − 1, m) must be all even
weight vectors in F_2^{2^m}, a fact observed for m = 2 and m = 3 in Example 1.11.5.
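The recursive definition translates directly into code. A short Python sketch (illustrative only) builds RM(r, m) as a set of codewords via the (u | u + v) construction and checks the dimension, minimum weight, and nesting claims of Theorem 1.11.6 for small m:

```python
# Recursive (u | u+v) construction of binary Reed-Muller codes (Definition 1.11.3).
from itertools import product

def rm(r, m):
    if r == 0:
        return {(0,) * 2**m, (1,) * 2**m}            # repetition code
    if r == m:
        return set(product((0, 1), repeat=2**m))     # all of F_2^{2^m}
    return {u + tuple((a + b) % 2 for a, b in zip(u, v))
            for u in rm(r, m - 1) for v in rm(r - 1, m - 1)}

C = rm(1, 3)                          # should be the [8, 4, 4] extended Hamming code
dim = len(C).bit_length() - 1         # log_2 |C|
min_wt = min(sum(c) for c in C if any(c))
```

RM(1, 3) comes out as a [8, 4, 4] code with 16 codewords, agreeing with its identification as the extended Hamming code, and RM(0, 3) ⊆ RM(1, 3) illustrates the nesting property.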
1957 to 1959. The 1961 book by W. W. Peterson [1505] compiled extensive results about
cyclic codes and laid the framework for much of the present-day theory. In 1972 this book
was expanded and published jointly by Peterson and E. J. Weldon [1506].
Up to this point, the coordinates of Fnq have been denoted {1, 2, . . . , n}. For cyclic codes,
the coordinates of Fnq will be denoted {0, 1, . . . , n − 1}.
Definition 1.12.1 Let C be a code of length n over Fq . C is cyclic provided that for all
c = c0 c1 · · · cn−1 ∈ C, the cyclic shift c′ = cn−1 c0 · · · cn−2 ∈ C.
Remark 1.12.2 The cyclic shift described in Definition 1.12.1 is cyclic shift to the right
by one position with wrap-around. The code C is cyclic if and only if P ∈ PAut(C) where
the permutation matrix P = [pi,j ] is defined by pi,i+1 = 1 for 0 ≤ i ≤ n − 2, pn−1,0 = 1,
and pi,j = 0 otherwise. Cyclic codes are closed under cyclic shifts with wrap-around of any
amount and in either the left or right directions.
Example 1.12.3 Let C be the [7, 4]2 code with generator matrix

    G = [ 1 1 0 1 0 0 0 ]
        [ 0 1 1 0 1 0 0 ]
        [ 0 0 1 1 0 1 0 ]
        [ 0 0 0 1 1 0 1 ].
Labeling the rows of G as r1 , r2 , r3 , r4 top to bottom, we see that r5 = 1000110 = r1 +r2 +r3 ,
r6 = 0100011 = r2 + r3 + r4 , and r7 = 1010001 = r1 + r2 + r4 . Since C is spanned by
{r1 , r2 , . . . , r7 } and this list is closed under cyclic shifts, C must be a cyclic code. By row
reducing G, we obtain another generator matrix

    G′ = [ 1 0 0 0 1 1 0 ]
         [ 0 1 0 0 0 1 1 ]
         [ 0 0 1 0 1 1 1 ]
         [ 0 0 0 1 1 0 1 ].
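The closure argument in this example is easy to verify mechanically: span the rows of G over F2 and check that the cyclic shift of every codeword is again a codeword. A short Python sketch:

```python
from itertools import product

def cyclic_shift(c):
    """Shift right by one position with wrap-around, as in Definition 1.12.1."""
    return c[-1:] + c[:-1]

# The rows of G from Example 1.12.3.
rows = [(1,1,0,1,0,0,0), (0,1,1,0,1,0,0), (0,0,1,1,0,1,0), (0,0,0,1,1,0,1)]
code = {tuple(sum(a * r[j] for a, r in zip(cf, rows)) % 2 for j in range(7))
        for cf in product((0, 1), repeat=4)}

assert len(code) == 16                                   # a [7, 4] code
assert all(cyclic_shift(c) in code for c in code)        # C is cyclic
```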
While cyclic codes can be nonlinear, throughout this section we will examine only those
that are linear. To study linear cyclic codes it is useful to consider elements of Fnq as
polynomials inside a certain quotient ring of polynomials. In that framework, linear cyclic
codes are precisely the ideals of that quotient ring. We now establish the framework.
(e) (Unique Coset Representatives) Let p(x) ∈ Fq [x] be nonzero. The distinct cosets
of the quotient ring Fq [x]/⟨p(x)⟩ are uniquely representable as a(x) + ⟨p(x)⟩ where
a(x) = 0 or deg(a(x)) < deg(p(x)); Fq [x]/⟨p(x)⟩ has order q^{deg(p(x))} . The quotient
ring Fq [x]/⟨p(x)⟩ is also a vector space over Fq of dimension deg(p(x)).
(f) If p(x) is irreducible over Fq , then Fq [x]/⟨p(x)⟩ is a field.
there are significant differences. When gcd(n, q) ≠ 1, cyclic codes are called repeated-root cyclic codes.
Repeated-root cyclic codes were first examined in full generality in [369, 1835].
26 Concise Encyclopedia of Coding Theory
Remark 1.12.8 A splitting field of xn − 1 over Fq is Fqt where t is the size of the q-
cyclotomic coset of 1 modulo n.
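Cyclotomic cosets are easy to compute: the q-cyclotomic coset of s modulo n is the orbit of s under repeated multiplication by q. A Python sketch (the helper name is ours) that recovers the data used in the surrounding examples:

```python
def cyclotomic_cosets(q, n):
    """All q-cyclotomic cosets modulo n, keyed by smallest representative
    (gcd(n, q) = 1 assumed)."""
    seen, cosets = set(), {}
    for s in range(n):
        if s not in seen:
            coset, i = [], s
            while i not in coset:
                coset.append(i)
                i = (i * q) % n
            cosets[s] = coset
            seen.update(coset)
    return cosets

cosets = cyclotomic_cosets(2, 7)
assert cosets == {0: [0], 1: [1, 2, 4], 3: [3, 6, 5]}
# Remark 1.12.8: the splitting field of x^7 - 1 over F_2 is F_{2^t}, t = |C_1| = 3.
assert len(cosets[1]) == 3
```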
Theorem 1.12.9 ([1323, Chapter 7.5]) Let α be a primitive nth root of unity in the
splitting field Fqt of x^n − 1 over Fq . For 0 ≤ s < n define M_{α^s}(x) = \prod_{i ∈ C_s} (x − α^i). Then
M_{α^s}(x) ∈ Fq [x] and is irreducible over Fq . Furthermore, the unique factorization of x^n − 1
into monic irreducible polynomials over Fq is given by x^n − 1 = \prod_s M_{α^s}(x) where s runs
through a set of representatives of all distinct q-cyclotomic cosets modulo n.
Example 1.12.10 The 2-cyclotomic cosets modulo 7 are C0 = {0}, C1 = {1, 2, 4}, and
C3 = {3, 6, 5}. By Remark 1.12.8, F23 = F8 is the splitting field of x7 − 1 over F2 . In
the notation of Table 1.3, α = δ is a primitive 7th root of unity. In the notation of The-
orem 1.12.9, M_{α^0}(x) = −1 + x = 1 + x, M_α(x) = (x − α)(x − α^2)(x − α^4) = 1 + x + x^3 ,
M_{α^3}(x) = (x − α^3)(x − α^6)(x − α^5) = 1 + x^2 + x^3 , and x^7 − 1 = M_{α^0}(x)M_α(x)M_{α^3}(x).
Using Theorem 1.12.9, we have the following basic theorem [1008, Theorem 4.2.1] de-
scribing the structure of cyclic codes over Fq . We remark that all of this theorem except
part (g) is valid when gcd(n, q) ≠ 1. We note that if a(x), b(x) ∈ Fq [x], then a(x) divides
b(x), denoted a(x) | b(x), means that there exists c(x) ∈ Fq [x] such that b(x) = a(x)c(x).
Theorem 1.12.11 Let C be a nonzero linear cyclic code over Fq of length n viewed as an
ideal of Fq [x]/⟨x^n − 1⟩. There exists a polynomial g(x) ∈ C with the following properties.
(a) g(x) is the unique monic polynomial of minimum degree in C.
(b) C = ⟨g(x)⟩ in Fq [x]/⟨x^n − 1⟩.
(c) g(x) | (x^n − 1).
With k = n − deg(g(x)), let g(x) = \sum_{i=0}^{n−k} g_i x^i where g_{n−k} = 1. Then
(d) the dimension of C is k and {g(x), xg(x), . . . , x^{k−1} g(x)} is a basis for C,
(e) every element of C is uniquely expressible as a product g(x)f (x) where f (x) = 0 or
deg(f (x)) < k,
(f) a generator matrix G of C is

    G = [ g0   g1   · · ·  gn−k  0     · · ·  0    ]
        [ 0    g0   g1     · · · gn−k  · · ·  0    ]
        [ .                                   .    ]
        [ 0    · · ·  0    g0    g1    · · ·  gn−k ],

whose k rows correspond to g(x), xg(x), . . . , x^{k−1} g(x), and
Basics of Coding Theory 27
(g) if α is a primitive nth root of unity in the splitting field Fqt of x^n − 1 over Fq , then

    g(x) = \prod_s M_{α^s}(x),

where s runs through a subset of representatives of the q-cyclotomic cosets modulo n.
Definition 1.12.12 The polynomial g(x) in Theorem 1.12.11 is the generator polyno-
mial of C. By convention, the cyclic code C = {0} has generator polynomial g(x) = xn − 1.
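Theorem 1.12.11(b)–(d) can be illustrated computationally. The sketch below (helper names are ours) generates ⟨g(x)⟩ in F2[x]/⟨x^7 − 1⟩ for g(x) = 1 + x + x^3 and confirms the dimension and the ideal property, multiplication by x being exactly a cyclic shift:

```python
from itertools import product

n = 7
g = (1, 1, 0, 1)  # g(x) = 1 + x + x^3, coefficients listed from low degree to high

def poly_mul_mod(a, b, n):
    """Multiply two F_2 polynomials modulo x^n - 1 (so x^n wraps around to 1)."""
    out = [0] * n
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if ai and bj:
                out[(i + j) % n] ^= 1
    return tuple(out)

k = n - (len(g) - 1)  # Theorem 1.12.11(d): dim C = n - deg(g)
# Basis {g(x), x g(x), ..., x^{k-1} g(x)}
basis = [poly_mul_mod(g, tuple(int(t == s) for t in range(k)), n) for s in range(k)]
code = {tuple(sum(c * b[j] for c, b in zip(coef, basis)) % 2 for j in range(n))
        for coef in product((0, 1), repeat=k)}

assert len(code) == 2 ** k == 16
# Ideal property: multiplying any codeword by x (a cyclic shift) stays in the code.
assert all(poly_mul_mod(c, (0, 1), n) in code for c in code)
```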
Corollary 1.12.13 There are 2m linear cyclic codes of length n (including the zero code)
over Fq where m is the number of q-cyclotomic cosets modulo n.
Corollary 1.12.14 If g1 (x) and g2 (x) are generator polynomials of C1 and C2 , respectively,
and if g1 (x) | g2 (x), then C2 ⊆ C1 .
Definition 1.12.15 Let C be a linear cyclic code of length n over Fq with generator poly-
nomial g(x) | (x^n − 1). Let α be a fixed primitive nth root of unity in a splitting field of x^n − 1
over Fq . By Theorems 1.12.9 and 1.12.11, g(x) = \prod_s \prod_{i ∈ C_s} (x − α^i) where s runs through
some subset of representatives of the q-cyclotomic cosets Cs modulo n. Let T = \bigcup_s C_s be
the union of these q-cyclotomic cosets. The roots of unity {αi | i ∈ T } are called the zeros
of C; {αi | 0 ≤ i < n, i 6∈ T } are the nonzeros of C. The set T is called the defining set
of C relative to α.
Remark 1.12.16 In Definition 1.12.15, if you change the primitive nth root of unity, you
change the defining set T ; so T is computed relative to a fixed primitive root of unity.
Remark 1.12.17 Corollary 1.12.14 can be translated into the language of defining sets: If
T1 and T2 are defining sets of C1 and C2 , respectively, relative to the same primitive root of
unity, and if T1 ⊆ T2 , then C2 ⊆ C1 .
Example 1.12.18 Continuing with Example 1.12.10, Table 1.5 describes the 23 = 8 bi-
nary cyclic codes of length 7. The code with g(x) = 1 + x + x3 is H3,2 as discussed in
Example 1.12.3. The code with g(x) = 1 + x2 + x3 is permutation equivalent to H3,2 . The
code of dimension k = 1 is the binary repetition code {0,1}.
The dual code of a cyclic code is also cyclic. We can determine its generator polynomial
and defining set; see [1323, Chapter 7.4].
Theorem 1.12.19 Let C be an [n, k]q cyclic code with generator polynomial g(x). Define
h(x) = (x^n − 1)/g(x). Then C ⊥ is cyclic with generator polynomial

    g ⊥ (x) = x^k h(x^{−1}) / h(0).

Let α be a primitive nth root of unity in a splitting field of x^n − 1 over Fq . If T is the defining set of
C relative to α, the defining set of C ⊥ is T ⊥ = {0, 1, . . . , n − 1} \ (−1)T mod n.
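Theorem 1.12.19 can be exercised on the running [7, 4] example: divide x^7 − 1 by g(x) = 1 + x + x^3 over F2 and reverse the coefficients of h(x) to obtain g⊥(x). The division routine below is our own minimal helper; since deg h = k and h(0) = 1 over F2 here, reversing h is exactly x^k h(x^{−1})/h(0):

```python
def polydiv_gf2(num, den):
    """Exact quotient num/den over F_2; coefficients listed low degree first."""
    num = list(num)
    q = [0] * (len(num) - len(den) + 1)
    for shift in range(len(q) - 1, -1, -1):
        if num[shift + len(den) - 1]:
            q[shift] = 1
            for i, d in enumerate(den):
                num[shift + i] ^= d
    assert not any(num), "division was not exact"
    return q

n, g = 7, [1, 1, 0, 1]                    # g(x) = 1 + x + x^3
xn_minus_1 = [1] + [0] * (n - 1) + [1]    # x^7 - 1 = 1 + x^7 over F_2
h = polydiv_gf2(xn_minus_1, g)            # h(x) = (x^7 - 1)/g(x)
g_dual = h[::-1]                          # x^k h(x^{-1}); h(0) = 1 over F_2 here
assert h == [1, 1, 1, 0, 1]               # h(x) = 1 + x + x^2 + x^4
assert g_dual == [1, 0, 1, 1, 1]          # g^⊥(x) = 1 + x^2 + x^3 + x^4
```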
TABLE 1.5: The [7, k, d]2 cyclic codes with generator polynomial g(x) and defining set T
relative to α

    k   d   g(x)                                    T
    0   —   x^7 − 1 = 1 + x^7                       {0, 1, 2, 3, 4, 5, 6}
    1   7   1 + x + x^2 + x^3 + x^4 + x^5 + x^6     {1, 2, 3, 4, 5, 6}
    3   4   1 + x^2 + x^3 + x^4                     {0, 1, 2, 4}
    3   4   1 + x + x^2 + x^4                       {0, 3, 5, 6}
    4   3   1 + x + x^3                             {1, 2, 4}
    4   3   1 + x^2 + x^3                           {3, 5, 6}
    6   2   1 + x                                   {0}
    7   1   1                                       ∅
Example 1.12.20 Continuing with Example 1.12.18, the dual of any code in Table 1.5
must be a code in the table. By comparing dimensions, the codes of dimension 0 and
7 in the table are duals of each other as are the codes of dimension 1 and 6; this is
confirmed by examining the defining sets and using Theorem 1.12.19. By this theorem,
{0, 1, 2, 3, 4, 5, 6} \ (−1){1, 2, 4} mod 7 = {0, 1, 2, 3, 4, 5, 6} \ {6, 5, 3} = {0, 1, 2, 4} showing
that the codes with defining sets {0, 1, 2, 4} and {1, 2, 4} are duals of each other. By Re-
mark 1.12.17, as {1, 2, 4} ⊆ {0, 1, 2, 4}, ⟨1 + x + x^3 ⟩⊥ = ⟨1 + x^2 + x^3 + x^4 ⟩ ⊆ ⟨1 + x + x^3 ⟩, a fact
we already observed in Example 1.6.10. Similarly, the codes with defining sets {0, 3, 5, 6}
and {3, 5, 6} are duals of each other.
The following is a somewhat surprising fact about cyclic self-orthogonal binary codes;
see [1008, Theorem 4.4.18].

Theorem 1.12.21 Let C be a self-orthogonal binary cyclic code. Then C is doubly-even.
Definition 1.12.23 Quasi-cyclic codes are a natural generalization of cyclic codes. Let C
be a code of length n and ℓ a positive integer dividing n. C is ℓ-quasi-cyclic provided
whenever c0 c1 · · · cn−1 ∈ C then cn−ℓ cn−ℓ+1 · · · cn−1 c0 · · · cn−ℓ−2 cn−ℓ−1 ∈ C. Cyclic codes
are 1-quasi-cyclic codes. Quasi-cyclic codes will be studied in Chapter 7.
See Chapter 2 and Sections 8.6 and 20.5 for more on cyclic codes over fields.
Example 1.13.1 There are three 2-cyclotomic cosets modulo 23 with sizes 1, 11, and 11:
C0 = {0}, C1 = {1, 2, 4, 8, 16, 9, 18, 13, 3, 6, 12}, C5 = {5, 10, 20, 17, 11, 22, 21, 19, 15, 7, 14}.
Theorem 1.12.9 implies that, over F2 , x23 − 1 = x23 + 1 factors into 3 monic irreducible
polynomials of degrees 1, 11, and 11. These irreducible factors are b0 (x) = 1 + x, b1 (x) =
1 + x + x5 + x6 + x7 + x9 + x11 , and b2 (x) = 1 + x2 + x4 + x5 + x6 + x10 + x11 . There are
8 binary linear cyclic codes of length 23 by Corollary 1.12.13. By Theorem 1.12.11(d), the
codes C1 = hb1 (x)i and C2 = hb2 (x)i are [23, 12]2 codes. The map that fixes coordinate 0
and switches coordinates i and 23 − i for 1 ≤ i ≤ 11 leads to a permutation matrix P where
C1 P = C2 . Any code permutation equivalent to C1 is termed the [23, 12]2 binary Golay
code of length 23 and is denoted G23 . The splitting field of x23 − 1 over F2 is F211 . In F211
there is a primitive 23rd root of unity α where the defining sets of C1 and C2 are C1 and
C5 , respectively, relative to α. Another primitive 23rd root of unity is β = α5 ; relative to β,
the defining sets of C1 and C2 are C5 and C1 , respectively. By Theorem 1.12.19 the defining
set of C1⊥ relative to α is C0 ∪ C1 implying by Remark 1.12.17 that C1⊥ ⊆ C1 showing C1⊥ is
self-orthogonal. Using Theorem 1.12.21, C1⊥ is the doubly-even [23, 11]2 code consisting of
all codewords in C1 of even weight.
Example 1.13.2 The 3-cyclotomic cosets modulo 11 are C0 = {0}, C1 = {1, 3, 9, 5, 4},
and C2 = {2, 6, 7, 10, 8} of sizes 1, 5, and 5, respectively. Theorem 1.12.9 implies that,
over F3 , x11 − 1 factors into 3 monic irreducible polynomials of degrees 1, 5, and 5. These
irreducible factors are t0 (x) = −1 + x, t1 (x) = −1 + x2 − x3 + x4 + x5 , and t2 (x) =
−1−x+x2 −x3 +x5 . There are 8 ternary linear cyclic codes of length 11 by Corollary 1.12.13.
By Theorem 1.12.11(d), the codes C1 = ht1 (x)i and C2 = ht2 (x)i are [11, 6]3 codes. The map
that fixes coordinate 0 and switches coordinates i and 11 − i for 1 ≤ i ≤ 5 leads to a
permutation matrix P where C1 P = C2 . Any code monomially equivalent to C1 is termed
the [11, 6]3 ternary Golay code of length 11 and is denoted G11 . The splitting field of
x11 − 1 over F3 is F35 . In F35 there is a primitive 11th root of unity α where the defining
sets of C1 and C2 are C1 and C2 , respectively, relative to α. Another primitive 11th root of
unity is β = α2 ; relative to β, the defining sets of C1 and C2 are C2 and C1 , respectively.
Definition 1.13.3 G23 can be extended as in Section 1.7 to a [24, 12]2 code Ĝ23 , denoted
G24 , and called the binary Golay code of length 24. Similarly G11 can be extended to a
[12, 6]3 code Ĝ11 , denoted G12 , and called the ternary Golay code of length 12.
Remark 1.13.4 The automorphism groups of the four Golay codes involve the Mathieu
groups Mp for p ∈ {11, 12, 23, 24} discovered by Émile Mathieu [1352, 1353]. These four
permutation groups on p points are simple groups that are 4-fold transitive when p ∈ {11, 23}
and 5-fold transitive when p ∈ {12, 24}. Properties of these groups and their relationship to Golay
codes can be found in [442].
The following two theorems give basic properties of the four Golay codes. Parts (a), (b),
and (c) of each theorem can be found in most standard coding theory texts. The uniqueness
of these codes in part (d) of each theorem is a culmination of work in [525, 1514, 1518, 1732]
with a self-contained proof in [1008, Chapter 10]. Part (e) of each theorem follows by direct
computation from Definition 1.9.8. The automorphism groups in part (f) of each theorem
can be found in [434], which is also [442, Chapter 10].
Theorem 1.13.5 The following properties hold for the binary Golay codes.
(a) G23 has minimum distance 7 and weight distribution A0 = A23 = 1, A7 = A16 = 253,
A8 = A15 = 506, A11 = A12 = 1288, and Ai = 0 otherwise.
(b) G24 has minimum distance 8 and weight distribution A0 = A24 = 1, A8 = A16 = 759,
A12 = 2576, and Ai = 0 otherwise.
(c) G24 is doubly-even and self-dual.
(d) Both a (23, M )2 and a (24, M )2 , possibly nonlinear, binary code each containing 0
with M ≥ 212 codewords and minimum distance 7 and 8, respectively, are unique
up to permutation equivalence. They are the [23, 12, 7]2 and [24, 12, 8]2 binary Golay
codes.
(e) G23 is perfect.
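The weight distribution and perfection claims for G23 can be confirmed by enumerating all 2^12 codewords of the cyclic code ⟨b1(x)⟩ from Example 1.13.1. A Python sketch using bitmask arithmetic over F2:

```python
from math import comb
from collections import Counter

n, k = 23, 12
b1 = sum(1 << i for i in (0, 1, 5, 6, 7, 9, 11))   # b1(x) from Example 1.13.1

weights = Counter()
for m in range(2 ** k):
    c = 0
    for j in range(k):
        if (m >> j) & 1:
            c ^= b1 << j                            # c(x) = m(x) b1(x), deg < 23
    weights[bin(c).count("1")] += 1

# Theorem 1.13.5(a): weight distribution of G_23
assert weights[7] == weights[16] == 253
assert weights[8] == weights[15] == 506
assert weights[11] == weights[12] == 1288
assert min(w for w in weights if w > 0) == 7        # minimum distance 7
# Theorem 1.13.5(e): G_23 is perfect -- spheres of radius 3 fill F_2^23
assert 2 ** k * sum(comb(n, i) for i in range(4)) == 2 ** n
```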
Theorem 1.13.6 The following properties hold for the ternary Golay codes.
(a) G11 has minimum distance 5 and weight distribution A0 = 1, A5 = A6 = 132,
A8 = 330, A9 = 110, A11 = 24, and Ai = 0 otherwise.
(b) G12 has minimum distance 6 and weight distribution A0 = 1, A6 = 264, A9 = 440,
A12 = 24, and Ai = 0 otherwise.
(c) G12 is self-dual.
(d) Both an (11, M )3 and a (12, M )3 , possibly nonlinear, ternary code each containing 0
with M ≥ 3^6 codewords and minimum distance 5 and 6, respectively, are unique up
to equivalence. They are the [11, 6, 5]3 and [12, 6, 6]3 ternary Golay codes.
(e) G11 is perfect.
(f) MAut(G11 ) = M̃11 and MAut(G12 ) = M̃12 where M̃11 and M̃12 are isomorphic to the
double covers, or the non-splitting central extensions by a center of order 2, of M11
and M12 .
Remark 1.13.7 If there is equality for given parameters of a code in a bound from Sec-
tion 1.9, we say the code meets the bound. Perfect codes are those meeting the Sphere
Packing Bound; MDS codes are those meeting the Singleton Bound. Using Theorem 1.10.5,
direct computation shows that the [(q m − 1)/(q − 1), m, q m−1 ]q simplex code meets the
Griesmer Bound, as do G11 and G12 . Neither G23 nor G24 meets the Griesmer Bound.
Definition 1.14.1 Let T ⊆ {0, 1, . . . , n − 1}. A set of s consecutive elements of T is a
set of the form {b, b + 1, . . . , b + s − 1} mod n ⊆ T.
Remark 1.14.2 When considering the notion of consecutive, wrap-around is allowed. For
example if n = 10, {8, 9, 0, 1} is a consecutive set in T = {0, 1, 2, 5, 8, 9}.
TABLE 1.6: The [7, k, d]2 BCH codes with defining set T relative to α, b, and Bose distance δ

    k   d   T                                                   b   δ
    0   —   {0, 1, 2, 3, 4, 5, 6} = C1 ∪ C2 ∪ · · · ∪ C6 ∪ C0  1   —
    1   7   {1, 2, 3, 4, 5, 6} = C1 ∪ C2 ∪ · · · ∪ C6          1   7
    3   4   {0, 1, 2, 4} = C0 ∪ C1 ∪ C2                        0   4
    3   4   {0, 3, 5, 6} = C5 ∪ C6 ∪ C0                        5   4
    4   3   {1, 2, 4} = C1 ∪ C2                                1   3
    4   3   {3, 5, 6} = C5 ∪ C6                                5   3
    6   2   {0} = C0                                           0   2
Rather surprisingly, the existence of consecutive elements in the defining set of a cyclic
code determines a lower bound, called the BCH Bound, on the minimum distance of the
code. A proof of the following can be found in [1323, Chapter 7.6].
Theorem 1.14.3 (BCH Bound) Let C be a linear cyclic code of length n over Fq and
minimum distance d with defining set T relative to some primitive nth root of unity. Assume
T contains δ − 1 consecutive elements for some integer δ ≥ 2. Then d ≥ δ.
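The BCH Bound is easy to test against the true minimum distances of the codes in Table 1.5. The sketch below (helper functions are ours) computes, for each code, the longest run of consecutive elements in its defining set T and checks d ≥ δ by exhaustive search:

```python
def min_distance(g, n):
    """True minimum weight of the binary cyclic code <g(x)> of length n."""
    k = n - (len(g) - 1)
    best = n
    for m in range(1, 2 ** k):
        c = [0] * n
        for j in range(k):
            if (m >> j) & 1:
                for i, gi in enumerate(g):
                    c[(i + j) % n] ^= gi
        best = min(best, sum(c))
    return best

def longest_run(T, n):
    """Longest run of consecutive elements of T modulo n (wrap-around allowed)."""
    best = cur = 0
    for i in range(2 * n):                 # go around twice to catch wrapped runs
        cur = cur + 1 if i % n in T else 0
        best = max(best, min(cur, n))
    return best

# (g(x) coefficients low degree first, defining set T) from Table 1.5
codes = [
    ([1, 0, 1, 1, 1], {0, 1, 2, 4}), ([1, 1, 1, 0, 1], {0, 3, 5, 6}),
    ([1, 1, 0, 1], {1, 2, 4}),       ([1, 0, 1, 1], {3, 5, 6}),
    ([1, 1], {0}),
]
for g, T in codes:
    delta = longest_run(T, 7) + 1          # BCH Bound: d >= delta
    assert min_distance(g, 7) >= delta
```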
Definition 1.14.4 Let δ ≥ 2 be an integer. A BCH code over Fq of length n and designed
distance δ is a linear cyclic code with defining set

    T = Cb ∪ Cb+1 ∪ · · · ∪ Cb+δ−2

relative to some primitive nth root of unity where Ci is the q-cyclotomic coset modulo n
containing i. As T contains the consecutive set {b, b+1, . . . , b+δ−2}, this code has minimum
distance at least δ by the BCH Bound. If b = 1, the code is narrow-sense; if n = q^t − 1
for some t, the code is primitive.
Definition 1.14.5 Sometimes a BCH code can have more than one designed distance; the
largest designed distance is called the Bose distance.
Example 1.14.6 Consider the eight [7, k, d]2 binary cyclic codes from Example 1.12.18 and
presented in Table 1.5. All except the code with defining set T = ∅ are BCH codes as seen
in Table 1.6. As 7 = 23 − 1, all these BCH codes are primitive. Technically, the zero code
is primitive with designed distance 8; of course distance in the zero code is meaningless.
Of the six remaining codes, three are narrow-sense. Notice that the code with defining set
{1, 2, 4} is narrow-sense with two designed distances 2 and 3 as {1, 2, 4} = C1 = C1 ∪ C2 ;
the Bose distance is 3. The code with defining set {1, 2, 3, 4, 5, 6} is narrow-sense with
designed distances 4 through 7 as {1, 2, 3, 4, 5, 6} = C1 ∪ C2 ∪ C3 = C1 ∪ C2 ∪ C3 ∪ C4 =
C1 ∪ C2 ∪ C3 ∪ C4 ∪ C5 = C1 ∪ C2 ∪ C3 ∪ C4 ∪ C5 ∪ C6 ; the Bose distance is 7. The Bose
designed distance and the true minimum distance are the same for the six nonzero BCH
codes.
Example 1.14.7 In the notation of Example 1.13.1, G23 has defining set T = C1 which
contains 4 consecutive elements {1, 2, 3, 4}. By the BCH Bound, G23 has minimum weight
at least 5; its true minimum weight is 7 from Theorem 1.13.5(a). As the defining set of G23
is T = C1 = C1 ∪ C2 ∪ C3 ∪ C4 , G23 is a narrow-sense8 BCH code of Bose designed distance
8 G23 is permutation equivalent to the BCH code with designed distance 5 and defining set C5 = C19 =
C19 ∪ C20 ∪ C21 ∪ C22 ; in this formulation G23 is not narrow-sense.
δ = 5 with b = 1. Similarly, G11 of Example 1.13.2 is a BCH code viewed in several ways.
G11 is a narrow-sense BCH code with b = 1, δ = 2 and defining set C1 = {1, 3, 4, 5, 9}. It is
also a BCH code with b = 3, δ = 2, 3, or 4 as {1, 3, 4, 5, 9} = C3 = C3 ∪ C4 = C3 ∪ C4 ∪ C5 .
The Bose distance of G11 is 4 while its true minimum distance is 5 from Theorem 1.13.6(a).
At about the same time as BCH codes appeared in the literature, I. S. Reed and G.
Solomon [1582] published their work on the codes that now bear their names. These codes,
which are now commonly presented as a special case of BCH codes, were actually first
constructed by K. A. Bush [319] in 1952 in the context of orthogonal arrays. Because
of their burst error-correction capabilities, Reed–Solomon codes are used to improve the
reliability of compact discs, digital audio tapes, and other data storage systems.
Example 1.14.10 Using Table 1.1, γ is both a primitive element of F9 and a primitive 8th
root of unity. Let C be the narrow-sense Reed–Solomon code over F9 of length 8 and designed
distance δ = 4. Then C has defining set {1, 2, 3} relative to γ and generator polynomial
g(x) = (x−γ)(x−γ 2 )(x−γ 3 ) = γ 2 +γx+γ 3 x2 +x3 . C is an [8, 5, 4]9 code. By Theorem 1.12.19,
C ⊥ has defining set T ⊥ = {0, 1, . . . , 7} \ (−1){1, 2, 3} mod 8 = {0, 1, . . . , 7} \ {7, 6, 5} =
{0, 1, 2, 3, 4} and hence generator polynomial g ⊥ (x) = (x−1)(x−γ)(x−γ 2 )(x−γ 3 )(x−γ 4 ) =
(x2 − 1)g(x) = γ 6 + γ 5 x + γ 5 x2 + γ 7 x3 + γ 3 x4 + x5 . So C ⊥ is an [8, 3, 6]9 non-narrow-sense
Reed–Solomon code with b = 0 and designed distance 6, consistent with Theorem 1.14.9(d).
As T ⊆ T ⊥ , C ⊥ ⊆ C by Remark 1.12.17.
The original formulation of Reed and Solomon for the narrow-sense Reed–Solomon codes
is different from that of Definition 1.14.8. This alternative formulation of narrow-sense
Reed–Solomon codes is of particular importance because it is the basis for the definitions
9 While this is a common definition of Reed–Solomon codes, there are other codes of lengths different
from q − 1 that are also called Reed–Solomon codes. See Remark 15.3.21.
of generalized Reed–Solomon codes, Goppa codes, and algebraic geometry codes; see Chap-
ters 15 and 24.
For this formulation, let Pk,q = {p(x) ∈ Fq [x] | p(x) = 0 or deg(p(x)) < k} when k ≥ 0.
See [1008, Theorem 5.2.3] for a proof of the following.
Theorem 1.14.11 Let n = q − 1 and let α be a primitive nth root of unity in Fq . For
0 < k ≤ n = q − 1, let

    RS_k(α) = { (p(α^0 ), p(α), . . . , p(α^{q−2} )) ∈ F_q^n | p(x) ∈ P_{k,q} }.

Then RS_k(α) is the narrow-sense [q − 1, k, q − k]q Reed–Solomon code.
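Theorem 1.14.11 can be illustrated over a prime field, where the arithmetic is just integer arithmetic mod q. The sketch below builds RS_3(α) for q = 7 with α = 3 (a primitive element chosen by us for illustration) and verifies the [6, 3, 4]_7 parameters by exhaustive enumeration:

```python
from itertools import product

q, alpha, k = 7, 3, 3                      # 3 is a primitive element of F_7
n = q - 1
points = [pow(alpha, i, q) for i in range(n)]   # alpha^0, ..., alpha^{q-2}

def evaluate(p, x):
    """Evaluate p(x) over F_q; coefficients listed low degree first."""
    return sum(c * x ** i for i, c in enumerate(p)) % q

code = {tuple(evaluate(p, x) for x in points)
        for p in product(range(q), repeat=k)}

assert len(code) == q ** k                 # evaluation is injective: dimension k
d = min(sum(c != 0 for c in cw) for cw in code if any(cw))
assert d == n - k + 1 == 4                 # a [6, 3, 4]_7 MDS code
```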
In general, extending an MDS code may not produce an MDS code; however extending
a narrow-sense Reed–Solomon code does produce an MDS code. With the notation of The-
orem 1.14.11, Fq = {0, 1 = α^0 , α, α^2 , . . . , α^{q−2} } and, when q ≥ 3, \sum_{i=0}^{q−2} α^i = 0. Using this,
it is straightforward to show that if q ≥ 3, k < q − 1, and p(x) ∈ P_{k,q} , then \sum_{β ∈ Fq} p(β) = 0.
This leads to the following result.
Theorem 1.14.12 With the notation of Theorem 1.14.11 and 0 < k < n = q − 1,

    \widehat{RS}_k(α) = { (p(α^0 ), p(α), . . . , p(α^{q−2} ), p(0)) ∈ F_q^{n+1} | p(x) ∈ P_{k,q} }

is a [q, k, q − k + 1]q MDS code.
Remark 1.14.13 The code RS_{q−1}(α) omitted from consideration in Theorem 1.14.12
equals F_q^{q−1} . Its extension is not as given in Theorem 1.14.12; however \widehat{RS}_{q−1}(α) is still a
[q, q − 1, 2]q MDS code.
Definition 1.15.1 Let C be a linear code of length n over Fq with weight distribution
Ai (C) for 0 ≤ i ≤ n. Let x and y be independent indeterminates over Fq . The (Hamming)
weight enumerator of C is defined to be
    Hwe_C(x, y) = \sum_{i=0}^{n} A_i(C) x^i y^{n−i} .
Definition 1.15.2 The Stirling numbers S(r, ν) of the second kind are defined for
nonnegative integers r, ν by the equation
    S(r, ν) = \frac{1}{ν!} \sum_{i=0}^{ν} (−1)^{ν−i} \binom{ν}{i} i^r ;
ν!S(r, ν) is the number of ways to distribute r distinct objects into ν distinct boxes with
no box left empty.
The next theorem gives six equivalent formulations of the MacWilliams Identities or
MacWilliams Equations. The fourth in the list involves the Krawtchouk polynomials;
see Definition 1.9.21. The last two are the Pless Power Moments. One proof of the
equivalence of these identities is found in [1008, Chapter 7.2].
Theorem 1.15.3 (MacWilliams Identities and Pless Power Moments) Let C be a
linear [n, k]q code and C ⊥ its [n, n − k]q dual code. Let Ai = Ai (C) and A⊥i = Ai (C ⊥ ), for
0 ≤ i ≤ n, be the weight distributions of C and C ⊥ , respectively. The following are equivalent.

(a) \sum_{j=0}^{n−ν} \binom{n−j}{ν} A_j = q^{k−ν} \sum_{j=0}^{ν} \binom{n−j}{n−ν} A_j^⊥  for 0 ≤ ν ≤ n.

(b) \sum_{j=ν}^{n} \binom{j}{ν} A_j = q^{k−ν} \sum_{j=0}^{ν} (−1)^j (q − 1)^{ν−j} \binom{n−j}{n−ν} A_j^⊥  for 0 ≤ ν ≤ n.

(c) Hwe_{C^⊥}(x, y) = \frac{1}{|C|} Hwe_C(y − x, y + (q − 1)x).

(d) A_j^⊥ = \frac{1}{|C|} \sum_{i=0}^{n} A_i K_j^{(n,q)}(i)  for 0 ≤ j ≤ n.

(e) \sum_{j=0}^{n} j^r A_j = \sum_{j=0}^{\min\{n,r\}} (−1)^j A_j^⊥ \Big[ \sum_{ν=j}^{r} ν!\, S(r, ν)\, q^{k−ν} (q − 1)^{ν−j} \binom{n−j}{n−ν} \Big]  for 0 ≤ r.

(f) \sum_{j=0}^{n} (n − j)^r A_j = \sum_{j=0}^{\min\{n,r\}} A_j^⊥ \Big[ \sum_{ν=j}^{r} ν!\, S(r, ν)\, q^{k−ν} \binom{n−j}{n−ν} \Big]  for 0 ≤ r.
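Identity (d) can be spot-checked on the [7, 4] Hamming code of Example 1.12.3: brute-force the weight distributions of the code and its dual, then compare against the Krawtchouk sum. A Python sketch (helper names are ours):

```python
from itertools import product
from math import comb

n = 7
rows = [(1,1,0,1,0,0,0), (0,1,1,0,1,0,0), (0,0,1,1,0,1,0), (0,0,0,1,1,0,1)]
code = {tuple(sum(a * r[j] for a, r in zip(cf, rows)) % 2 for j in range(n))
        for cf in product((0, 1), repeat=4)}
A = [sum(sum(c) == i for c in code) for i in range(n + 1)]

# Dual code by brute force: all vectors orthogonal to every codeword.
dual = [y for y in product((0, 1), repeat=n)
        if all(sum(a * b for a, b in zip(y, c)) % 2 == 0 for c in code)]
A_dual = [sum(sum(y) == i for y in dual) for i in range(n + 1)]

def krawtchouk(j, i, n):
    """Binary Krawtchouk polynomial K_j^{(n,2)}(i)."""
    return sum((-1) ** l * comb(i, l) * comb(n - i, j - l) for l in range(j + 1))

# MacWilliams identity (d): |C| * A_j^perp = sum_i A_i K_j(i)
for j in range(n + 1):
    assert A_dual[j] * len(code) == sum(A[i] * krawtchouk(j, i, n)
                                        for i in range(n + 1))
```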
The unique solution to this system is A6 = 264, A9 = 440, and A12 = 24. Thus the weight
enumerator of G12 is

    Hwe_{G12}(x, y) = y^{12} + 264 x^6 y^6 + 440 x^9 y^3 + 24 x^{12} .
The MacWilliams Identities can be used to find the weight distribution of an MDS
code as found, for example, in [1323, Theorem 6 of Chapter 11]. A resulting corollary gives
bounding relations on the length, dimension, and field size.
Theorem 1.15.7 Let C be an [n, k, d]q MDS code over Fq . The weight distribution of C is
given by A0 (C) = 1, Ai (C) = 0 for 1 ≤ i < d = n − k + 1, and

    A_i(C) = \binom{n}{i} \sum_{j=0}^{i−d} (−1)^j \binom{i}{j} \big( q^{i+1−d−j} − 1 \big)

for d ≤ i ≤ n.
This corollary becomes a foundation for the MDS Conjecture 3.3.21 in Chapter 3.
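The formula in Theorem 1.15.7 is straightforward to evaluate. The sketch below computes the weight distribution of a [6, 3, 4]_7 MDS code (e.g., the Reed–Solomon code of Theorem 1.14.11) and sanity-checks that the weights account for all q^k codewords:

```python
from math import comb

def mds_weights(n, k, q):
    """Weight distribution of an [n, k, d]_q MDS code from Theorem 1.15.7."""
    d = n - k + 1
    A = [0] * (n + 1)
    A[0] = 1
    for i in range(d, n + 1):
        A[i] = comb(n, i) * sum((-1) ** j * comb(i, j) * (q ** (i + 1 - d - j) - 1)
                                for j in range(i - d + 1))
    return A

A = mds_weights(6, 3, 7)                  # e.g. the [6, 3, 4]_7 Reed-Solomon code
assert A == [1, 0, 0, 0, 90, 108, 144]
assert sum(A) == 7 ** 3                   # all q^k codewords accounted for
```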
1.16 Encoding
Figure 1.1 shows a simple communication channel that includes a component called an
encoder, in which a message is encoded to produce a codeword. In this section we examine
two encoding processes.
As in Figure 1.1, a message is any of the q k possible k-tuples x ∈ Fkq . The encoder will
convert x to an n-tuple c from a code C over Fq with q k codewords; that codeword will then
be transmitted over the communication channel.
Suppose that C is an [n, k, d]q linear code with generator matrix G and parity check
matrix H. We first describe an encoder that uses the generator matrix G. The most common
way to encode the message x is as x 7→ c = xG. If G is replaced by another generator matrix,
the encoding of x will, of course, be different. A nice relationship exists between message
and codeword if G is in standard form [Ik | A]. In that case the first k coordinates of
the codeword c are the information symbols x in order; the remaining n − k symbols are
the parity check symbols, that is, the redundancy added to x in order to help recover x
if errors occur during transmission. A similar relationship between message and codeword
can exist even if G is not in standard form. Specifically, suppose there exist column indices
i1 , i2 , . . . , ik such that the k×k matrix consisting of these k columns of G is the k×k identity
matrix. In that case the message is found in the k coordinates i1 , i2 , . . . , ik of the codeword
scrambled but otherwise unchanged; that is, the message symbol xj is in component ij of the
codeword. If this occurs where the message is embedded in the codeword, we say that the
encoder is a systematic encoder of C. We can always force an encoder to be systematic.
For example, if G is row reduced to a matrix G1 in reduced row echelon form, G1 remains
a generator matrix of C by Remark 1.4.3; the encoding x 7→ c = xG1 is systematic as G1
has k columns which together form Ik . Another way to force an encoder to be systematic
is as follows. By Theorem 1.8.4, C is permutation equivalent to an [n, k, d]q code C 0 with
generator matrix G0 in standard form. If the code C 0 is used in place of C, the encoder
x 7→ xG0 is a systematic encoder of C 0 .
Example 1.16.1 Let C be the [7, 4, 3]2 binary Hamming code H3,2 with generator matrix
G3,2 from Example 1.4.9. Encoding x = x1 x2 x3 x4 ∈ F42 as xG3,2 produces the codeword
c = x1 x2 x3 x4 (x2 + x3 + x4 )(x1 + x3 + x4 )(x1 + x2 + x4 ).
Example 1.16.2 Let C be an [n, k, d]q cyclic code with generator polynomial g(x) and
generator matrix G obtained from cyclic shifts of g(x) as in Theorem 1.12.11(f). Suppose
the message m = m0 m1 · · · mk−1 is to be encoded as c = mG. Using the polynomial m(x) =
m0 +m1 x+· · ·+mk−1 xk−1 to represent the message m and c(x) = c0 +c1 x+· · ·+cn−1 xn−1
to represent the codeword c, it is a simple calculation to show c(x) = m(x)g(x). Generally,
this encoding is not systematic. Recall from Examples 1.12.3 and 1.12.18 that the Hamming
[7, 4, 3]2 code H3,2 has a cyclic form with generator polynomial g(x) = 1 + x + x3 . The
nonsystematic encoder m(x) 7→ c(x) = m(x)g(x) yields c(x) = m0 + (m0 + m1 )x + (m1 +
m2 )x2 + (m0 + m2 + m3 )x3 + (m1 + m3 )x4 + m2 x5 + m3 x6 .
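The encoding c(x) = m(x)g(x) of this example can be carried out directly. The sketch below encodes one sample message and checks the coefficients against the displayed expansion:

```python
def encode_cyclic(m, g, n):
    """Nonsystematic cyclic encoding c(x) = m(x) g(x) over F_2 (coefficients
    listed low degree first)."""
    c = [0] * n
    for i, mi in enumerate(m):
        for j, gj in enumerate(g):
            if mi and gj:
                c[(i + j) % n] ^= 1
    return c

g = [1, 1, 0, 1]                          # g(x) = 1 + x + x^3
m = [1, 0, 1, 1]                          # sample message m(x) = 1 + x^2 + x^3
c = encode_cyclic(m, g, 7)

# Coefficients match the expansion in Example 1.16.2 with m0..m3 = 1, 0, 1, 1:
m0, m1, m2, m3 = m
assert c == [m0, (m0 + m1) % 2, (m1 + m2) % 2, (m0 + m2 + m3) % 2,
             (m1 + m3) % 2, m2, m3]
```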
Thus c5 = x2 + x3 + x4 , c6 = x1 + x3 + x4 , and c7 = x1 + x2 + x4 , consistent with
Example 1.16.1.
Example 1.16.4 Let C be an [n, k, d]q cyclic code with generator polynomial g(x). In
Example 1.16.2 a nonsystematic encoder was described that encodes a cyclic code using
g(x). There is a systematic encoder of C using the generator polynomial g ⊥ (x) of C ⊥ .
By Theorem 1.12.19, g ⊥ (x) = xk h(x−1 )/h(0) = h00 + h01 x + · · · + h0k−1 xk−1 + h0k xk where
h(x) = (xn −1)/g(x) and h0k = 1. Let H, which is a parity check matrix for C, be determined
from the shifts of g ⊥ (x) as follows:

    H = [ h′0   h′1   h′2   · · ·  h′k    0     · · ·  0   ]        g ⊥ (x)
        [ 0     h′0   h′1   · · ·  h′k−1  h′k   · · ·  0   ]   ↔    xg ⊥ (x)
        [ .                                            .   ]        .
        [ 0     0     0     h′0    · · ·  · · ·       h′k  ]        x^{n−k−1} g ⊥ (x).
Examining the generator matrix G for C in Theorem 1.12.11(f), {0, 1, . . . , k − 1} is an infor-
mation set for C. Let c = c0 c1 · · · cn−1 ∈ C; so c0 c1 · · · ck−1 can be considered the associated
message. The redundancy components ck ck+1 · · · cn−1 are determined from Hc^T = 0^T and
can be computed in the order i = k, k + 1, . . . , n − 1 where
    c_i = − \sum_{j=0}^{k−1} h′_j c_{i−k+j} .        (1.3)
Example 1.16.5 We apply the systematic encoding of Example 1.16.4 to the cyclic version
of the Hamming [7, 4, 3]2 code H3,2 with generator polynomial g(x) = 1 + x + x3 ; see
Example 1.12.18. By Example 1.12.20, g ⊥ (x) = 1 + x2 + x3 + x4 and (1.3) yields c4 =
c0 + c2 + c3 , c5 = c1 + c3 + c4 , and c6 = c2 + c4 + c5 . In terms of the information bits
c0 c1 c2 c3 , we have c4 = c0 + c2 + c3 , c5 = c0 + c1 + c2 , and c6 = c1 + c2 + c3 .
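The recursion (1.3) gives a very simple encoder. The sketch below implements it for the running example and checks the parity bits against the expressions derived above:

```python
def systematic_encode(message, g_dual, n, k):
    """Systematic cyclic encoding: information in c_0..c_{k-1}, parity bits
    computed from recursion (1.3) using the coefficients h'_j of g_dual."""
    c = list(message) + [0] * (n - k)
    for i in range(k, n):
        c[i] = -sum(g_dual[j] * c[i - k + j] for j in range(k)) % 2
    return c

g_dual = [1, 0, 1, 1, 1]                  # g^⊥(x) = 1 + x^2 + x^3 + x^4
c = systematic_encode([1, 0, 1, 1], g_dual, 7, 4)

c0, c1, c2, c3 = 1, 0, 1, 1
# Parity bits agree with Example 1.16.5
assert c[4:] == [(c0 + c2 + c3) % 2, (c0 + c1 + c2) % 2, (c1 + c2 + c3) % 2]
```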
As discussed in Section 1.1, sometimes the receiver is interested only in the sent code-
words rather than the sent messages. However, if there is interest in the actual message, a
question arises as to how to recover the message from a codeword. If the encoder x 7→ xG
is systematic, it is straightforward to recover the message. What can be done otherwise?
Because G has independent rows, there is an n × k matrix K such that GK = Ik ; K is
called a right inverse for G. A right inverse is not necessarily unique. As c = xG, the
message x = xGK = cK.
Example 1.16.6 In Example 1.16.2, we encoded the message m0 m1 m2 m3 using the
[7, 4, 3]2 cyclic version of H3,2 with generator polynomial g(x) = 1 + x + x^3 . The resulting
codeword was c = m0 , m0 + m1 , m1 + m2 , m0 + m2 + m3 , m1 + m3 , m2 , m3 . The generator
matrix G obtained from g(x) as in Theorem 1.12.11(f) has right inverse

    K = [ 1 0 0 0 ]
        [ 0 1 0 0 ]
        [ 0 1 0 0 ]
        [ 0 1 0 0 ]
        [ 0 1 0 0 ]
        [ 0 0 1 0 ]
        [ 0 0 0 1 ].
Computing cK gives m0 m1 m2 m3 as expected.
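The claim GK = I4, and the resulting message recovery (mG)K = m, can be verified directly over F2:

```python
from itertools import product

# G from cyclic shifts of g(x) = 1 + x + x^3 (Theorem 1.12.11(f)), K from above.
G = [(1,1,0,1,0,0,0), (0,1,1,0,1,0,0), (0,0,1,1,0,1,0), (0,0,0,1,1,0,1)]
K = [(1,0,0,0), (0,1,0,0), (0,1,0,0), (0,1,0,0), (0,1,0,0), (0,0,1,0), (0,0,0,1)]

def matmul_gf2(A, B):
    """Matrix product over F_2."""
    return [[sum(a * b for a, b in zip(row, col)) % 2 for col in zip(*B)]
            for row in A]

# GK = I_4, so K is a right inverse of G.
assert matmul_gf2(G, K) == [[1,0,0,0], [0,1,0,0], [0,0,1,0], [0,0,0,1]]

# Message recovery: for every message m, (mG)K = m.
for m in product((0, 1), repeat=4):
    c = matmul_gf2([m], G)
    assert matmul_gf2(c, K) == [list(m)]
```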
FIGURE 1.2: The binary symmetric channel with crossover probability %: each sent bit
(0 or 1) is received correctly with probability 1 − % and flipped with probability %.
1.17 Decoding
Decoding is the process of determining which codeword c was sent when a vector y is
received. Decoding is generally more complex than encoding. Decoding algorithms usually
vary with the type of code being used. In this section we discuss only the basics of hard-
decision decoding. Decoding is discussed more in depth in Chapters 15, 21, 24, 28–30, and
32.
Definition 1.17.1 A hard-decision decoder is a decoder that inputs ‘hard’ data from
the channel (e.g., elements from Fq ) and outputs hard data to the receiver. A soft-decision
decoder is one which inputs ‘soft’ data from the channel (e.g., estimates of the symbols
with attached probabilities) and generally outputs hard data.
Initially we restrict our discussion to the decoding of binary codes. To set the stage for
decoding, we begin with one possible mathematical model of a channel that transmits binary
data. Before stating the model, we establish some notation. If E is an event, Prob(E) is
the probability that E occurs. If E1 and E2 are events, Prob(E1 | E2 ) is the conditional
probability that E1 occurs given that E2 occurs. The model for transmitting binary data
we explore is called the binary symmetric channel (BSC) with crossover probability
% as illustrated in Figure 1.2. In a BSC, we have the following conditional probabilities: For
y, c ∈ F2 ,
    Prob(y is received | c is sent) = { 1 − %   if y = c,        (1.4)
                                      { %       if y ≠ c.
In a BSC we also assume that the probability of error in one bit is independent of previous
bits. We will assume that it is more likely that a bit is received correctly than in error; so
% < 1/2.10
Let C be a binary code of length n. Assume that c ∈ C is sent and y ∈ F_2^n is received
and decoded as c̃ ∈ C. Of course the hope is that c̃ = c; otherwise the decoder has made a
decoding error. So Prob(c | y) is the probability that the codeword c is sent given that y
is received, and Prob(y | c) is the probability that y is received given that the codeword c
10 While % is usually very small, if % > 1/2, the probability that a bit is received in error is higher than
the probability that it is received correctly. One strategy is to then immediately interchange 0 and 1 at
the receiving end. This converts the BSC with crossover probability % to a BSC with crossover probability
1 − % < 1/2. This of course does not help if % = 1/2; in this case communication is not possible.
is sent. These quantities are related by Bayes' Rule

    Prob(c | y) = Prob(y | c) Prob(c) / Prob(y),

where Prob(c) is the probability that c is sent and Prob(y) is the probability that y is
received. There are two natural means by which a decoder can make a choice based on
these two probabilities. First the decoder could decode y as c̃ ∈ C where Prob(c̃ | y) is max-
imum; such a decoder is called a maximum a posteriori probability (MAP) decoder.
Symbolically a MAP decoder makes the decision

    c̃ = arg max_{c ∈ C} Prob(c | y).

Here arg max_{c ∈ C} Prob(c | y) is the argument c of the probability function Prob(c | y) that
maximizes this probability. Alternately the decoder could decode y as c̃ ∈ C where
Prob(y | c̃) is maximum; such a decoder is called a maximum likelihood (ML) decoder.
Symbolically an ML decoder makes the decision

    c̃ = arg max_{c ∈ C} Prob(y | c).
Definition 1.17.2 If a decoder decodes a received vector y as the codeword c with dH (y, c)
minimized, the decoder is called a nearest neighbor decoder.
From this discussion, on a BSC, maximum likelihood and nearest neighbor decoding are
the same. We can certainly perform nearest neighbor decoding on any code over any field.
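Nearest neighbor decoding by exhaustive search is immediate to implement. The sketch below decodes every single-bit corruption of every codeword of the [7, 4, 3]2 code of Example 1.12.3 and confirms each error is corrected:

```python
from itertools import product

def hamming_distance(u, v):
    return sum(a != b for a, b in zip(u, v))

def nearest_neighbor_decode(y, code):
    """Decode y to a codeword at minimum Hamming distance (ML decoding on a BSC)."""
    return min(code, key=lambda c: hamming_distance(y, c))

rows = [(1,1,0,1,0,0,0), (0,1,1,0,1,0,0), (0,0,1,1,0,1,0), (0,0,0,1,1,0,1)]
code = {tuple(sum(a * r[j] for a, r in zip(cf, rows)) % 2 for j in range(7))
        for cf in product((0, 1), repeat=4)}

# d = 3, so t = 1: every single-bit error is corrected.
for c in code:
    for pos in range(7):
        y = list(c)
        y[pos] ^= 1
        assert nearest_neighbor_decode(tuple(y), code) == c
```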
Before presenting an example of a nearest neighbor decoder, we need to establish the
relationship between the minimum distance of a code and the error-correcting capability of
the code under nearest neighbor decoding. Notice this theorem is valid for any code, linear
or not, over any finite field.
Theorem 1.17.3 Let C be an (n, M, d)q code and let t = ⌊(d−1)/2⌋. If c ∈ C is sent and
y ∈ F_q^n is received with dH (y, c) ≤ t, then c is the unique codeword of C nearest to y; in
particular, nearest neighbor decoding corrects up to t errors.

Proof: By definition y ∈ Sq,n,t (c), the sphere of radius t in F_q^n centered at c. By The-
orem 1.9.5(b), spheres of radius t centered at codewords are pairwise disjoint; hence if
y ∈ Sq,n,t (c1 ) with c1 ∈ C, then c = c1 .
Remark 1.17.5 By Theorem 1.17.3, an (n, M, d)q code C is t-error-correcting for any
t ≤ ⌊(d−1)/2⌋. Furthermore, when M > 1 and t = ⌊(d−1)/2⌋, there exist two distinct codewords such
that the spheres of radius t + 1 about them are not disjoint; if this were not the case, the
minimum distance of C would in fact be larger than d. Thus when M > 1 and t = ⌊(d−1)/2⌋, C is not
(t + 1)-error-correcting.11
The nearest neighbor decoding problem for an (n, M, d)q code becomes one of finding an
efficient algorithm that will correct up to t = ⌊(d − 1)/2⌋ errors. An obvious decoding algorithm
is to examine all codewords until one is found with distance t or less from the received
vector. This is a realistic decoding algorithm only for M small. Another obvious algorithm
is to make a table consisting of a nearest codeword for each of the q^n vectors in F_q^n and then
look up a received vector in the table to decode it. This is impractical if q^n is very large.
For an [n, k, d]q linear code, we can devise an algorithm using a table with q^{n−k} rather
than q^n entries where one can find the nearest codeword by looking up one of these q^{n−k}
entries. This general nearest neighbor decoding algorithm for linear codes is called syndrome
decoding, which is the subject of the remainder of the section.
Definition 1.17.6 Let C be an [n, k, d]q linear code. For y ∈ F_q^n, the coset of C with coset
representative y is y + C = {y + c | c ∈ C}. The weight of the coset y + C is the smallest
weight of a vector in the coset, and any vector of this smallest weight in the coset is called
a coset leader.
The next result follows from the theory of finite groups as a linear code is a group under
addition.
Theorem 1.17.7 Let C be an [n, k, d]q linear code. The following hold for y, y′, e ∈ F_q^n.
(a) y + C = y′ + C if and only if y − y′ ∈ C.
(b) Cosets of C all have size q^k.
(c) Cosets of C are either equal or disjoint. There are q^{n−k} distinct cosets of C and they
partition F_q^n.
(d) If e is a coset representative of y + C, then e + C = y + C. In particular, if e is a coset
leader of y + C, then e + C = y + C.
(e) Any coset of weight at most t = ⌊(d − 1)/2⌋ has a unique coset leader.
Let C be an [n, k, d]q code; fix a parity check matrix H of C. For y ∈ F_q^n, syn(y) = Hy^T
is called the syndrome of y. Syndromes are column vectors in F_q^{n−k}. The code C consists
of all vectors whose syndrome equals 0^T. As H has rank n − k, every column vector in F_q^{n−k}
is a syndrome. From Theorem 1.17.7, if y, y′ ∈ F_q^n are in the same coset of C, then y − y′ =
c ∈ C. Therefore syn(y) = Hy^T = H(y′ + c)^T = Hy′^T + Hc^T = Hy′^T + 0^T = syn(y′).
Conversely, if syn(y) = syn(y′), then H(y − y′)^T = 0^T and so y − y′ ∈ C. Thus we have
the following theorem.
11 In the trivial case where M = 1, C is n-error-correcting as every received vector decodes to the only
codeword in C. However, since the information rate (Definition 1.9.26) of such a code is 0, it is never used
in practice.
Theorem 1.17.8 Two vectors belong to the same coset if and only if they have the same
syndrome.
Example 1.17.10 Let C be the [6, 3, 3]2 binary code with parity check matrix

        [ 0 1 1 1 0 0 ]
    H = [ 1 0 1 0 1 0 ] .
        [ 1 1 0 0 0 1 ]
Notice that the coset with syndrome 111^T has weight 2 and does not have a unique coset
leader. This coset has two other coset leaders: 010010 and 001001. All other cosets have
unique coset leaders by Theorem 1.17.7(e). We analyze three received vectors.
• Now suppose that y = 101000 is received. Then syn(y) = 101^T and y is decoded as
y − 010000 = 111000. This was the sent codeword provided only 1 error was made.
• Finally suppose that y = 111111 is received. Then syn(y) = 111^T and y is decoded
as y − 100100 = 011011 and at least 2 errors were made in transmission. If exactly 2
errors were made, and we had chosen one of the other two possible coset leaders for
the table, y would have been decoded as y −010010 = 101101 or y −001001 = 110110.
For this code, any received vector where 0 or 1 errors were made would be decoded correctly.
If 2 errors were made, the decoder would decode the received vector to one of three possible
equally likely codewords; there is no way to determine which was actually sent. If more than
2 errors were made, the decoder would always decode the received vector incorrectly.
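The decoding in this example can be sketched in a few lines of Python (a sketch, not from the text). The coset leader table is built by scanning vectors in order of increasing weight, as in Step 1 of the syndrome decoding procedure; ties among equal-weight leaders of the weight-2 coset are broken arbitrarily by iteration order, so that coset may get any of its three leaders.

```python
# A sketch of syndrome decoding for the [6,3,3]_2 code of Example 1.17.10.
from itertools import product

H = [(0, 1, 1, 1, 0, 0),
     (1, 0, 1, 0, 1, 0),
     (1, 1, 0, 0, 0, 1)]

def syndrome(y):
    """syn(y) = H y^T over F_2, returned as a tuple."""
    return tuple(sum(h * b for h, b in zip(row, y)) % 2 for row in H)

# Tabulate a coset leader (a minimum weight vector) for each of the
# 2^(n-k) = 8 syndromes; vectors are scanned in order of increasing weight.
leader = {}
for e in sorted(product((0, 1), repeat=6), key=sum):
    leader.setdefault(syndrome(e), e)

# Decode by finding the leader of the received vector's coset and subtracting it.
def decode(y):
    e = leader[syndrome(y)]
    return tuple((a - b) % 2 for a, b in zip(y, e))

print(decode((1, 0, 1, 0, 0, 0)))  # (1, 1, 1, 0, 0, 0), as in the example
```

Decoding a vector at distance 1 from a codeword, such as 101000, recovers the codeword 111000 found above.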
Example 1.17.11 Nearest neighbor decoding of the binary Hamming code H_{m,2} is par-
ticularly easy. The parity check matrix for this code consists of the 2^m − 1 nonzero binary
m-tuples of column length m; these can be viewed as the binary expansions of the integers
1, 2, . . . , 2^m − 1. Choose the parity check matrix H for H_{m,2} where column i is the associated
binary m-tuple expansion of i. Step 1 of the Syndrome Decoding Algorithm 1.17.9 can be
skipped and the algorithm becomes the following: If y is received, compute s = syn(y). If
s = 0^T, decode y as the codeword y. Otherwise s represents the binary expansion of some
integer i; the nearest codeword c to y is obtained from y by adding 1 to the i-th bit.
As an illustration, the parity check matrix to use for H_{4,2} is

        [ 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 ]
    H = [ 0 0 0 1 1 1 1 0 0 0 0 1 1 1 1 ] .
        [ 0 1 1 0 0 1 1 0 0 1 1 0 0 1 1 ]
        [ 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 ]
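A Python sketch of this decoder (an assumption of the sketch: bit i of y, 1-indexed, corresponds to column i of H, matching the column convention above). The syndrome, read as an integer, names the position of the single flipped bit.

```python
# Decoding the binary Hamming code H_{m,2} via the integer value of the
# syndrome (a sketch; bit i of y corresponds to column i of H, 1-indexed).
m = 4
n = 2**m - 1  # 15

def syndrome_int(y):
    """Value of syn(y) read as a binary integer: the XOR of the positions i
    (i.e., of the columns of H) where y has a 1."""
    s = 0
    for i in range(1, n + 1):
        if y[i - 1]:
            s ^= i
    return s

def decode(y):
    y = list(y)
    i = syndrome_int(y)
    if i != 0:       # nonzero syndrome: add 1 to the i-th bit
        y[i - 1] ^= 1
    return tuple(y)

codeword = (1,) * 15        # the all-ones vector is a codeword of H_{4,2}
received = list(codeword)
received[6] ^= 1            # one error, in position 7
print(decode(received) == codeword)  # True
```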
by (1.5). Therefore the probability that the syndrome decoder makes a correct decision
averaged over all equally likely transmitted codewords is Σ_{i=0}^{n} α_i ρ^i (1 − ρ)^{n−i} where α_i is
the number of coset leaders of weight i. Thus

    Perr = 1 − Σ_{i=0}^{n} α_i ρ^i (1 − ρ)^{n−i}.    (1.6)
Example 1.18.1 Suppose binary messages of length k are sent unencoded over a BSC
with crossover probability ρ. This in effect is the same as transmitting codewords from the
[k, k, 1]2 code C = F_2^k. This code has a unique coset, the code itself, and its leader is the
zero codeword of weight 0. Hence α_0 = 1 and α_i = 0 for i > 0. Therefore (1.6) shows that
the probability of decoder error is

    Perr = 1 − (1 − ρ)^k.

This is precisely what we expect as the probability of no decoding error is the probability
(1 − ρ)^k that the k bits are received without error. For instance if ρ = 0.01 and k = 4, Perr
without coding the length 4 messages is 0.03940399.
by (1.6). For example if ρ = 0.01, Perr = 0.00203104 · · · , significantly lower than the word
error rate for unencoded transmissions of binary messages of length 4 found in Example 1.18.1.
For comparison, when transmitting 10 000 unencoded binary messages each of length 4, one
can expect about 394 to be received in error. On the other hand, when transmitting 10 000
binary messages each of length 4 encoded to length 7 codewords from H3,2 , one can expect
about 20 to be decoded in error.
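Equation (1.6) is easy to evaluate once the coset leader weight distribution α_i is known; the short Python check below (a sketch, not from the text) reproduces both error rates quoted above.

```python
# Word-error rate (1.6) from the coset leader weight distribution alpha_i.
def word_error_rate(n, alpha, rho):
    """Perr = 1 - sum_i alpha_i * rho^i * (1 - rho)^(n - i)."""
    return 1 - sum(a * rho**i * (1 - rho)**(n - i) for i, a in enumerate(alpha))

rho = 0.01
# Unencoded length-4 messages (Example 1.18.1): the only coset leader is 0.
print(word_error_rate(4, [1], rho))      # about 0.03940399
# The [7,4] Hamming code H_{3,2}: leaders are 0 and the 7 weight-1 vectors.
print(word_error_rate(7, [1, 7], rho))   # about 0.00203104
```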
Definition 1.18.3 For a BSC with crossover probability ρ, the capacity of the channel
is

    C(ρ) = 1 + ρ log2 ρ + (1 − ρ) log2(1 − ρ).

The capacity C(ρ) = 1 − H2(ρ) where H2(ρ) is the binary entropy function defined in more
generality in Section 1.9.8.12 See Figure 1.3.
The next theorem is Shannon's Theorem for binary symmetric channels. Shannon's
original theorem was stated for nonlinear codes but was later shown to be valid for linear codes
as well. The theorem also holds for other channels provided channel capacity is appropri-
ately defined. For discussion and proofs of various versions of Shannon’s Theorem, see for
example [467, 1314]. For binary symmetric channels, Shannon’s Theorem is as follows.
Theorem 1.18.4 (Shannon) Let δ > 0 and R < C(ρ). Then for large enough n, there
exists an [n, k]2 binary linear code C with k/n ≥ R such that Perr < δ when C is used for
communication over a BSC with crossover probability ρ. Furthermore no such code exists if
R > C(ρ).
12 When q = 2, the domain of the entropy function H2(x) can be extended from 0 ≤ x < 1/2 to 0 ≤ x < 1.
[Figure 1.3: the capacity R = C(ρ) of the binary symmetric channel plotted against the crossover probability ρ; the curve falls from R = 1 at ρ = 0 to R = 0 at ρ = 1/2 and climbs back to R = 1 at ρ = 1.]
Theorem indicates that communication is not possible. This is not surprising; when ρ = 1/2,
whether a binary symbol is received correctly or incorrectly is essentially determined by a
coin flip. See Footnote 10.
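The capacity formula of Definition 1.18.3 is easy to check numerically; in the sketch below, the convention C(0) = C(1) = 1 (taking x log2 x → 0) is an assumption of the sketch.

```python
from math import log2

def capacity(rho):
    """BSC capacity C(rho) = 1 + rho*log2(rho) + (1 - rho)*log2(1 - rho)."""
    if rho in (0, 1):
        return 1.0   # limiting value, taking x*log2(x) -> 0
    return 1 + rho * log2(rho) + (1 - rho) * log2(1 - rho)

print(capacity(0.5))   # 0.0: no rate is achievable at crossover 1/2
print(capacity(0.01))  # about 0.92: rates just below this are achievable
```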
Remark 1.18.6 Recall that k/n is the information rate of the code as in Definition 1.9.26.
The proof of Shannon’s Theorem is nonconstructive, but the theorem does guarantee that
good codes exist with information rates just under channel capacity and decoding error rates
arbitrarily small; unfortunately these codes may have to be extremely long. The objective
becomes to find codes with a large number of codewords (to send many messages), large
minimum distance (to correct many errors), and short length (to minimize transmission
time or storage space). These goals conflict as seen in Section 1.9.
Chapter 2
Cyclic Codes over Finite Fields
Cunsheng Ding
The Hong Kong University of Science and Technology
Theorems 2.2.1 and 2.2.3 work for all linear codes, including cyclic codes. Their proofs
can be found in [1008, Section 3.8]. We shall need them later.
Cyclic Codes over Finite Fields 47
Theorem 2.3.1 Let C be a cyclic code of length n over Fq with generator polynomial g(x).
Let h(x) = (x^n − 1)/g(x). Then gcd(g(x), h(x)) = 1, since it is assumed that gcd(n, q) = 1.
Employing the Extended Euclidean Algorithm, one computes two polynomials a(x) ∈ Fq[x]
and b(x) ∈ Fq[x] such that 1 = a(x)g(x) + b(x)h(x). Then e(x) = a(x)g(x) mod (x^n − 1) is
the generating idempotent of C.
The polynomial h(x) in Theorem 2.3.1 is called the parity check polynomial of C.
Given the generating idempotent of a cyclic code, one obtains the generator polynomial of
this code as follows [1008, Theorem 4.3.3].
Theorem 2.3.2 Let C be a cyclic code over Fq with generating idempotent e(x). Then the
generator polynomial of C is given by g(x) = gcd(e(x), x^n − 1), which is computed in Fq[x].
Example 2.3.3 The cyclic code C of length 11 over F3 with generator polynomial g(x) =
x^5 + x^4 + 2x^3 + x^2 + 2 has parameters [11, 6, 5] and parity check polynomial h(x) = x^6 +
2x^5 + 2x^4 + 2x^3 + x^2 + 1.
Let a(x) = 2x^5 + x^4 + x^2 and b(x) = x^4 + x^3 + 1. It is then easily verified that 1 =
a(x)g(x) + b(x)h(x). Hence

    e(x) = a(x)g(x) mod (x^11 − 1) = 2x^10 + 2x^8 + 2x^7 + 2x^6 + 2x^2,

which is the generating idempotent of C. On the other hand, we have g(x) = gcd(e(x), x^11 −
1).
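The computation in Example 2.3.3 can be replayed with elementary polynomial arithmetic over F_3 (a sketch; polynomials are coefficient lists indexed by degree):

```python
# Verify Example 2.3.3: over F_3 with n = 11, check the Bezout identity and
# compute the generating idempotent e(x) = a(x)g(x) mod (x^11 - 1).
p = 3  # the field F_3

def polymul(f, g):
    """Multiply two polynomials (coefficient lists, index = degree) over F_p."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % p
    return out

def polyadd(f, g):
    n = max(len(f), len(g))
    f = f + [0] * (n - len(f)); g = g + [0] * (n - len(g))
    return [(a + b) % p for a, b in zip(f, g)]

def mod_xn_minus_1(f, n):
    """Reduce mod x^n - 1: fold the coefficient of x^k onto x^(k mod n)."""
    out = [0] * n
    for k, c in enumerate(f):
        out[k % n] = (out[k % n] + c) % p
    return out

g = [2, 0, 1, 2, 1, 1]          # g(x) = x^5 + x^4 + 2x^3 + x^2 + 2
h = [1, 0, 1, 2, 2, 2, 1]       # h(x) = (x^11 - 1)/g(x)
a = [0, 0, 1, 0, 1, 2]          # a(x) = 2x^5 + x^4 + x^2
b = [1, 0, 0, 1, 1]             # b(x) = x^4 + x^3 + 1

# Check the Bezout identity 1 = a(x)g(x) + b(x)h(x).
one = polyadd(polymul(a, g), polymul(b, h))
assert one[0] == 1 and all(c == 0 for c in one[1:])

e = mod_xn_minus_1(polymul(a, g), 11)
assert mod_xn_minus_1(polymul(e, e), 11) == e   # e(x) is idempotent
print(e)  # coefficients of e(x) = 2x^2 + 2x^6 + 2x^7 + 2x^8 + 2x^10
```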
A generator matrix of a cyclic code can be derived from its generating idempotent as
follows [1008, Theorem 4.3.6].
Theorem 2.3.4 If C is an [n, κ] cyclic code with generating idempotent e(x) = Σ_{i=0}^{n−1} e_i x^i,
then the κ × n matrix

    [ e_0         e_1         e_2         · · ·   e_{n−2}     e_{n−1}  ]
    [ e_{n−1}     e_0         e_1         · · ·   e_{n−3}     e_{n−2}  ]
    [   ...         ...         ...        ...      ...         ...    ]
    [ e_{n−κ+1}   e_{n−κ+2}   e_{n−κ+3}   · · ·   e_{n−κ−1}   e_{n−κ}  ]

is a generator matrix of C.
Theorem 2.3.5 Let C be a cyclic code of length n over Fq with parity check polynomial
h(x). Let J be a subset of Zn such that

    h^⊥(x) = ∏_{j∈J} M_{α^j}(x),

where h^⊥(x) is the reciprocal of h(x). Then C consists of all the following codewords:

    c_a(x) = Σ_{i=0}^{n−1} Tr_{q^m/q}(f_a(α^i)) x^i

where

    f_a(x) = Σ_{j∈J} a_j x^j for a_j ∈ F_{q^m}.
The trace and generator polynomial approaches are the most popular and effective ways
for constructing and analysing cyclic codes. In particular, the trace approach allows the use
of exponential sums for the determination of the weight distribution of cyclic codes. A lot
of progress in this direction has been made in the past decade [560]. A less investigated
fundamental approach to cyclic codes is the q-polynomial method developed in [562]. Another
fundamental construction uses sequences as described in Section 20.5.
The following theorem says that every projective linear code is a punctured code of a
special cyclic code [1914].
Theorem 2.3.6 Every linear code C of length n over Fq with minimum distance of C^⊥ at
least 3 is a punctured code of the cyclic code

    { (Tr_{q^m/q}(aγ^0), Tr_{q^m/q}(aγ^1), . . . , Tr_{q^m/q}(aγ^{q^m−2})) | a ∈ F_{q^m} }.
minimum distances. Some of the bounds are easy to use, while others are hard to employ.
Below we introduce a few effective lower bounds on the minimum distances of cyclic codes.
Let C be a cyclic code of length n over Fq with generator polynomial
    g(x) = ∏_{i∈T} (x − α^i)
where T is the union of some q-cyclotomic cosets modulo n, and is called the defining set
of C relative to α. The following is a simple but very useful lower bound ([248] and [968]).
Theorem 2.4.1 (BCH Bound) Let C be a cyclic code of length n over Fq with defining
set T and minimum distance d. Assume T contains δ − 1 consecutive integers for some
integer δ. Then d ≥ δ.
The BCH Bound depends on the choice of the primitive nth root of unity α. Different
choices of the primitive root may yield different lower bounds. When applying the BCH
Bound, it is crucial to choose the right primitive root. However, it is an open problem how
to choose such a primitive root. In many cases the BCH Bound may be far from the actual
minimum distance. In such cases, the lower bound given in the following theorem may be
much better. It was discovered by Hartmann and Tzeng [914]. To introduce this bound, we
define
A + B = {a + b | a ∈ A, b ∈ B},
where A and B are two subsets of the ring Zn , n is a positive integer, and + denotes the
integer addition modulo n.
Theorem 2.4.2 (Hartmann–Tzeng Bound) Let C be a cyclic code of length n over Fq
with defining set T and minimum distance d. Let A be a set of δ − 1 consecutive elements of
T and B(b, s) = {jb mod n | 0 ≤ j ≤ s}, where gcd(b, n) < δ. If A + B(b, s) ⊆ T for some
b and s, then d ≥ δ + s.
When s = 0, the Hartmann–Tzeng Bound becomes the BCH Bound. Other lower bounds
can be found in [1838]. As cyclic codes are a special case of quasi-cyclic codes, bounds on
such codes found in Section 7.4 can be applied to cyclic codes.
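As a small sketch (Python, not from the text), the BCH Bound can be applied mechanically by scanning the defining set for its longest run of consecutive residues modulo n:

```python
# Apply the BCH Bound to the defining set T of a cyclic code by finding the
# longest run of consecutive elements of T modulo n.
def cyclotomic_coset(i, n, q):
    coset, j = set(), i % n
    while j not in coset:
        coset.add(j)
        j = (j * q) % n
    return coset

def bch_bound(T, n):
    """A delta such that T contains delta - 1 consecutive residues mod n."""
    best, run = 0, 0
    for i in range(2 * n):          # wrap around once to catch runs mod n
        run = run + 1 if i % n in T else 0
        best = max(best, min(run, n))
    return best + 1

# Defining set of the narrow-sense binary BCH code of length 15 with delta = 5:
T = cyclotomic_coset(1, 15, 2) | cyclotomic_coset(3, 15, 2)
print(sorted(T))        # [1, 2, 3, 4, 6, 8, 9, 12]
print(bch_bound(T, 15)) # 5: the [15, 7, 5] code appearing in Example 2.6.9
```

Here the bound d ≥ 5 is attained, since the run {1, 2, 3, 4} of length 4 is the longest in T.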
Example 2.5.2 The celebrated Golay codes introduced in Section 1.13 are also irreducible
cyclic codes and the binary [24, 12, 8] extended Golay code was used on the Voyager 1 and
Voyager 2 missions to Jupiter, Saturn, and their moons.
relative to the primitive nth root of unity α, where Ci is the q-cyclotomic coset modulo n
containing i.
When b = 1, the code C(q,n,δ,b) with defining set in (2.2) is called a narrow-sense
BCH code. If n = q m − 1, then C(q,n,δ,b) is referred to as a primitive BCH code. The
Reed–Solomon code introduced in Section 1.14 is a primitive BCH code.
Sometimes T (b1 , δ1 ) = T (b2 , δ2 ) for two distinct pairs (b1 , δ1 ) and (b2 , δ2 ). The maxi-
mum designed distance of a BCH code is defined to be the largest δ such that the set
T (b, δ) in (2.2) defines the code for some b ≥ 0. The maximum designed distance of a BCH
code is also called the Bose distance.
Given the canonical factorization of x^n − 1 over Fq in (2.1), we know that the total
number of nonzero cyclic codes of length n over Fq is 2^{t+1} − 1. Then the following two
natural questions arise:
1. How many of the 2^{t+1} − 1 cyclic codes are BCH codes?
2. Which of the 2^{t+1} − 1 cyclic codes are BCH codes?
The first question is open. Regarding the second question, we have the next result whose
proof is straightforward.
Theorem 2.6.1 A cyclic code of length n over Fq with defining set T ⊆ Zn is a BCH code
if and only if there exists an integer δ with 2 ≤ δ ≤ n, an integer b with −(n−1) ≤ b ≤ n−1,
and an element a ∈ Zn such that gcd(n, a) = 1 and

    aT mod n = ∪_{i=0}^{δ−2} C_{i+b}.
In general it is not easy to use Theorem 2.6.1 to check whether a cyclic code is a BCH code.
In addition, it appears hard to answer the first question above with Theorem 2.6.1.
Definition 2.6.2 A family of codes is asymptotically good, provided that there exists
an infinite subset of [ni , ki , di ] codes from this family with limi→∞ ni = ∞ such that both
lim inf i→∞ ki /ni > 0 and lim inf i→∞ di /ni > 0.
The family of primitive BCH codes over Fq is asymptotically bad in the following sense
[1262].
Theorem 2.6.3 If C_i are [n_i, k_i, d_i]q primitive BCH codes for i = 1, 2, . . . with lim_{i→∞} n_i = ∞,
then either lim inf_{i→∞} k_i/n_i = 0 or lim inf_{i→∞} d_i/n_i = 0.
Despite this asymptotic property, binary primitive BCH codes of length up to 127 are
among the best linear codes known [549]. It is questionable whether the definition of asymptotic
badness above makes sense for applications.
Theorem 2.6.4 Let C be the narrow-sense primitive BCH code of length n = q^m − 1 over
Fq with designed distance δ. Then the minimum weight of C is its minimum odd-like weight.
Theorem 2.6.5 For any h with 1 ≤ h ≤ m − 1, the narrow-sense primitive BCH code
C(q, q^m−1, q^h−1, 1) has minimum distance d = q^h − 1.
It is easy to prove the following result [1247], which is a generalization of the classical
result for the narrow-sense primitive case.
Theorem 2.6.6 The code C(q,n,δ,b) has minimum distance d = δ if δ divides gcd(n, b − 1).
Although it is notoriously difficult to determine the minimum distance of a BCH code in
general, in a small number of cases other than those dealt with in the three theorems
above, the minimum distance of the BCH code C(q,n,δ,b) is known. In several cases, the
weight distributions of some BCH codes are also known. Detailed information can be found in
[551, 555, 1245, 1246, 1247, 1277].
Theorem 2.6.8 Let C be an [n, κ] BCH code over Fq of designed distance δ. Then the
following statements hold.
(a) κ ≥ n − ord_n(q)(δ − 1).
(b) If q = 2 and C is a narrow-sense BCH code, then δ can be assumed odd; furthermore,
if δ = 2w + 1, then κ ≥ n − ord_n(q)w.
The bounds in Theorem 2.6.8 cannot be improved in the general case, as demonstrated
by the following example. However, in some special cases, they can be improved.
Example 2.6.9 Note that m = ord_15(2) = 4, and the 2-cyclotomic cosets modulo 15 are
C0 = {0}, C1 = {1, 2, 4, 8}, C3 = {3, 6, 9, 12}, C5 = {5, 10}, and C7 = {7, 11, 13, 14}.
code. In this case, the actual minimum weight is equal to the designed distance, and the
dimension reaches the bound in Theorem 2.6.8(b).
When (b, δ) = (2, 3), the defining set T(b, δ) = {1, 2, 3, 4, 6, 8, 9, 12}, and the binary
cyclic code has parameters [15, 7, 5] and generator polynomial x^8 + x^7 + x^6 + x^4 + 1. In this
case, the actual minimum weight is more than the designed distance, and the dimension
achieves the bound in Theorem 2.6.8(a).
When (b, δ) = (1, 5), the defining set T(b, δ) = {1, 2, 3, 4, 6, 8, 9, 12}, and the binary cyclic
code has parameters [15, 7, 5] and generator polynomial x^8 + x^7 + x^6 + x^4 + 1. In this case,
the actual minimum weight is equal to the designed distance, and the dimension is larger
than the bound in Theorem 2.6.8(a). Note that the three pairs (b1, δ1) = (2, 3), (b2, δ2) =
(2, 4), and (b3, δ3) = (1, 5) define the same binary cyclic code with generator polynomial
x^8 + x^7 + x^6 + x^4 + 1. Hence the maximum designed distance of this [15, 7, 5] cyclic code is
5.
When (b, δ) = (3, 4), the defining set T(b, δ) = {1, 2, 3, 4, 5, 6, 8, 9, 10, 12}, and the binary
cyclic code has parameters [15, 5, 7] and generator polynomial x^10 + x^8 + x^5 + x^4 + x^2 + x + 1.
In this case, the actual minimum weight is more than the designed distance, and the dimension
is larger than the bound in Theorem 2.6.8(a).
Theorem 2.6.10 Let gcd(n, q) = 1 and q^⌊m/2⌋ < n ≤ q^m − 1, where m = ord_n(q). Let
2 ≤ δ ≤ min{⌊nq^⌈m/2⌉/(q^m − 1)⌋, n}. Then
for i ∈ {1, 2}. Since both S1 and S2 are unions of q-cyclotomic cosets modulo n, both g1(x)
and g2(x) are polynomials over Fq. The pair of cyclic codes C1 and C2 of length n over Fq
with generator polynomials g1(x) and g2(x) are called odd-like duadic codes, and the
pair of cyclic codes C̃1 and C̃2 of length n over Fq with generator polynomials g̃1(x) and
g̃2(x) are called even-like duadic codes.
By definition, C1 and C2 have parameters [n, (n + 1)/2] and C̃1 and C̃2 have parameters
[n, (n − 1)/2]. For odd-like duadic codes, we have the following result [1008, Theorem 6.5.2].
Theorem 2.7.1 (Square Root Bound) Let C1 and C2 be a pair of odd-like duadic codes
of length n over Fq. Let d_o be their (common) minimum odd-like weight. Then the following
hold.
(a) d_o^2 ≥ n.
(b) If the splitting defining the duadic codes is given by μ = −1, then d_o^2 − d_o + 1 ≥ n.
(c) Suppose d_o^2 − d_o + 1 = n, where d_o > 2, and assume that the splitting defining the
duadic codes is given by μ = −1. Then d_o is the minimum weight of both C1 and C2.
S1 = {1, 2, 4, 8, 9, 11, 15, 16, 18, 22, 23, 25, 29, 30, 32, 36, 37, 39, 43, 44, 46} ∪ {7, 14, 28}
and
S2 = {1, 2, . . . , 48} \ S1 .
It is easily seen that (S1 , S2 , −1) is a splitting of Z49 . The pair of odd-like duadic codes
C1 and C2 defined by this splitting have parameters [49, 25, 4] and generator polynomials
and
x^24 + x^23 + x^21 + x^17 + x^16 + x^14 + x^3 + x^2 + 1,
respectively. The minimum weight of the two codes is even (i.e., 4), while the minimum
odd-like weight in the two codes is 9. Note the lower bound on do given in Theorem 2.7.1
is 7.
Duadic codes of prime lengths are of special interest as they include the quadratic residue
codes, which are defined as follows.
Definition 2.7.3 Let n be an odd prime and q be a quadratic residue modulo n. Denote by
S1 and S2 the set of quadratic residues and the set of quadratic non-residues, respectively.
Let µ be an element of S2 . Then (S1 , S2 , µ) is a splitting of Zn . The corresponding four
duadic codes C1 , C2 , Ce1 , Ce2 are called quadratic residue codes.
It is known that the automorphism groups of the extended odd-like quadratic residue
codes are transitive. Hence, their minimum weight codewords must be odd-like. We then
have the following.
Theorem 2.7.4 (Square Root Bound) Let n be an odd prime and q be a quadratic
residue modulo n. Let C1 and C2 be a pair of odd-like quadratic residue codes of length
n over Fq. Let d be their (common) minimum weight. Then the following hold.
(a) d^2 ≥ n.
(b) If −1 is a quadratic non-residue, then d^2 − d + 1 ≥ n.
(c) Suppose d^2 − d + 1 = n, where d > 2, and assume that −1 is a quadratic non-residue.
Then d is the minimum weight of both C1 and C2.
The Golay codes C1 and C2 introduced in Section 1.13 of Chapter 1 are the odd-like
quadratic residue binary codes of length 23 and have parameters [23, 12, 7]2. The
corresponding even-like quadratic residue codes have parameters [23, 11, 8]2. The ternary Golay
codes described in Section 1.13 are also quadratic residue codes.
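For the length-23 binary case just mentioned, the splitting of Definition 2.7.3 and the bound of Theorem 2.7.4(b) are easy to verify directly (a sketch, not from the text):

```python
# Quadratic residue splitting mod n = 23 (Definition 2.7.3) and the Square
# Root Bound check for the [23, 12, 7] binary Golay code.
n = 23
S1 = sorted({pow(i, 2, n) for i in range(1, n)})   # quadratic residues
S2 = sorted(set(range(1, n)) - set(S1))            # quadratic non-residues
print(S1)  # [1, 2, 3, 4, 6, 8, 9, 12, 13, 16, 18]

assert 2 % n in S1          # q = 2 is a quadratic residue mod 23
assert (n - 1) in S2        # -1 = 22 is a non-residue, so part (b) applies
d = 7                       # minimum weight of the [23, 12, 7] Golay code
assert d * d - d + 1 >= n   # 43 >= 23, consistent with Theorem 2.7.4(b)
```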
It is very hard to determine the minimum distance of quadratic residue codes. However,
the Square Root Bound on their minimum distances is good enough. Quadratic residue
codes are interesting partly because their extended codes hold 2-designs and 3-designs.
Quadratic residue codes of length the product of two primes were introduced in [546].
Under certain conditions the extended odd-like duadic codes are self-dual [1220]. Duadic
codes have a number of interesting properties. For further information on the existence,
constructions, and properties of duadic codes, the reader is referred to [1008, Chapter 6],
[546], [559], and [565].
where the sum is taken over the ring of integers, and is called the q-weight of j.
Let ℓ be a positive integer with 1 ≤ ℓ < (q−1)m. The ℓth order punctured generalized
Reed–Muller code RMq(ℓ, m)* over Fq is the cyclic code of length n = q^m − 1 with
generator polynomial

    g(x) = ∏_{1 ≤ j ≤ n−1, ω_q(j) < (q−1)m−ℓ} (x − α^j),

where α is a generator of F*_{q^m}. Since ω_q(j) is a constant function on each q-cyclotomic coset
modulo n = q^m − 1, g(x) is a polynomial over Fq.
The parameters of the punctured generalized Reed–Muller code RMq (`, m)∗ are known
and summarized in the next theorem [71, Section 5.5].
Theorem 2.8.1 For any ℓ with 0 ≤ ℓ < (q − 1)m, RMq(ℓ, m)* is a cyclic code over Fq
with length n = q^m − 1, dimension

    κ = Σ_{i=0}^{ℓ} Σ_{j=0}^{m} (−1)^j (m choose j) ((i − jq + m − 1) choose (i − jq))
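The dimension formula of Theorem 2.8.1 evaluates easily; in the sketch below, binomial terms with a negative lower index are treated as zero (a standard convention assumed here), and the values for q = 2 reproduce the familiar binary Reed–Muller dimensions Σ_{i≤ℓ} (m choose i).

```python
from math import comb

def grm_dimension(q, m, ell):
    """Dimension of RM_q(ell, m)* from Theorem 2.8.1; binomial terms with a
    negative lower index i - j*q are zero and are skipped."""
    total = 0
    for i in range(ell + 1):
        for j in range(m + 1):
            if i - j * q >= 0:
                total += (-1) ** j * comb(m, j) * comb(i - j * q + m - 1, i - j * q)
    return total

print(grm_dimension(2, 4, 1))  # 5  = C(4,0) + C(4,1)
print(grm_dimension(2, 4, 2))  # 11 = C(4,0) + C(4,1) + C(4,2)
```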
Since wt_H(a) is a constant function on each q-cyclotomic coset modulo n, g(q,m,h)(x) is a
polynomial over Fq. By definition, g(q,m,h)(x) is a divisor of x^n − 1.
Let f(q, m, h) denote the cyclic code over Fq with length n and generator polynomial
g(q,m,h)(x). By definition, g(q,m,m)(x) = (x^n − 1)/(x − 1). Therefore, the code f(q, m, m) is
trivial, as it has parameters [n, 1, n] and is spanned by the all-1 vector. Below we consider
the code f(q, m, h) for 1 ≤ h ≤ m − 1 only.
Example 2.9.2 The following is a list of examples of the code f(q, m, h).
1. When (q, m, h) = (3, 3, 1), f(q, m, h) has parameters [26, 20, 4], and is distance-
optimal.
2. When (q, m, h) = (3, 4, 1), f(q, m, h) has parameters [80, 72, 4], and is distance-
optimal.
3. When (q, m, h) = (3, 4, 2), f(q, m, h) has parameters [80, 48, 13], and its minimum
distance is one less than that of the best known code, which has parameters [80, 48, 14].
4. When (q, m, h) = (3, 4, 3), the code f(q, m, h) has parameters [80, 16, 40], which are
the best parameters known.
5. When (q, m, h) = (4, 3, 1), the code f(q, m, h) has parameters [63, 54, 5], which are
the best parameters known.
An interesting fact about the newly defined family of codes f(q, m, h) is the following.
Corollary 2.9.3 Let m ≥ 2. Then the ternary code f(3, m, 1) has parameters [3^m − 1, 3^m −
1 − 2m, 4] and is distance-optimal.
We have also the next two special cases in which the parameters of the code f(q, m, h)
are known.
Theorem 2.9.4 Let m be even. Then the cyclic code f(q, m, 1) has parameters [q^m − 1, q^m −
1 − m(q − 1), q + 1].
Theorem 2.9.5 Let m ≥ 2. Then the cyclic code f(q, m, m − 1) has parameters [q^m −
1, (q − 1)^m, (q^m − 1)/(q − 1)].
The following theorem gives information on the parameters of the dual code f(q, m, h)^⊥.

Theorem 2.9.6 Let m ≥ 2 and 1 ≤ h ≤ m − 1. The dual code f(q, m, h)^⊥ has parameters
[q^m − 1, κ^⊥, d^⊥], where

    κ^⊥ = Σ_{i=1}^{h} (m choose i) (q − 1)^i.

The minimum distance d^⊥ of f(q, m, h)^⊥ is bounded below by

    d^⊥ ≥ q^{m−h} + q − 2.
When q = 2, the lower bound on the minimum distance d^⊥ of f(q, m, h)^⊥ given in
Theorem 2.9.6 is achieved. Experimental data shows that the lower bound on d^⊥ in Theorem
2.9.6 is not tight for q > 2. It is open how to improve it or determine the exact minimum
distance.
The code f(q, m, h) is clearly different from the punctured generalized Reed–Muller
code. It is open if f(q, m, h) has a geometric interpretation. For a proof of the results
introduced above and further properties of the cyclic code f(q, m, h), the reader is referred
to [561].
Example 2.10.8 Let (n, q) = (15, 2). The 2-cyclotomic cosets modulo 15 are
C0 = {0}, C1 = {1, 2, 4, 8}, C3 = {3, 6, 9, 12}, C5 = {5, 10}, and C7 = {7, 11, 13, 14}.
We also have

    x^15 − 1 = M_{α^0}(x) M_{α^1}(x) M_{α^3}(x) M_{α^5}(x) M_{α^7}(x),

where

    M_{α^0}(x) = x + 1,
    M_{α^1}(x) = x^4 + x + 1,
    M_{α^3}(x) = x^4 + x^3 + x^2 + x + 1,
    M_{α^5}(x) = x^2 + x + 1,
    M_{α^7}(x) = x^4 + x^3 + 1.

Note that M_{α^i}(x) is self-reciprocal for i ∈ {0, 3, 5} while M_{α^1}(x) and M_{α^7}(x) are recipro-
cals of each other. In this case,
But
    Π_{(n,q)} = {0, 1, 3, 5}.
Hence, there are 16 reversible binary cyclic codes of length 15, including the zero code and
the code F_2^15.
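The factorization in Example 2.10.8 can be confirmed by multiplying the five minimal polynomials over F_2 (a quick sketch, with coefficient lists indexed by degree):

```python
# Check the factorization of x^15 - 1 over F_2 in Example 2.10.8.
def polymul2(f, g):
    """Product of two binary polynomials (coefficient lists, index = degree)."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] ^= a & b
    return out

factors = [
    [1, 1],              # x + 1
    [1, 1, 0, 0, 1],     # x^4 + x + 1
    [1, 1, 1, 1, 1],     # x^4 + x^3 + x^2 + x + 1
    [1, 1, 1],           # x^2 + x + 1
    [1, 0, 0, 1, 1],     # x^4 + x^3 + 1
]
prod = [1]
for f in factors:
    prod = polymul2(prod, f)
# Over F_2, x^15 - 1 = x^15 + 1.
assert prod == [1] + [0] * 14 + [1]
print("x^15 - 1 factors as claimed")
```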
Corollary 2.10.9 Let q be an even prime power and n = q^m − 1. If m is odd, then the only
self-reciprocal irreducible divisor of x^n − 1 over Fq is x − 1. If m is an odd prime, then the
total number of reversible cyclic codes of length n over Fq is equal to 2^{(q^m + (m−1)q)/(2m)}, including
the zero code and the code F_q^n.
Corollary 2.10.10 Let q be an odd prime power and n = q^m − 1. If m is odd, then the
only self-reciprocal irreducible divisors of x^n − 1 over Fq are x − 1 and x + 1. If m is an
odd prime, then the total number of reversible cyclic codes of length n over Fq is equal to
2^{(q^m + (m−1)q + m)/(2m)}, including the zero code and the code F_q^n.
The following three theorems follow directly from Theorem 2.10.3 and the definition of
BCH codes, and can be viewed as corollaries of Theorem 2.10.3.
Theorem 2.10.11 The BCH code C(q,n,δ,b) is reversible if b = −t and the designed distance
is δ = 2t + 2 for any nonnegative integer t.
Theorem 2.10.12 The BCH code C(q,n,δ,b) is reversible if n is odd, b = (n − t)/2 and the
designed distance is δ = t + 2 for any odd integer t with 1 ≤ t ≤ n − 2.
Theorem 2.10.13 The BCH code C(q,n,δ,b) is reversible if n is even, b = (n − 2t)/2 and
the designed distance is δ = 2t + 2 for any integer t with 0 ≤ t ≤ n/2.
The dimensions of some of the reversible BCH codes described in Theorems 2.10.11,
2.10.12, and 2.10.13 were determined in [1236]. Two families of reversible BCH codes were
studied in [1247]. Non-primitive reversible cyclic codes were also treated in [1236]. There
are many reversible cyclic codes and it is easy to construct them. However, determining
their parameters is a difficult problem in general. There are clearly reversible cyclic codes
with both good and bad parameters.
Chapter 3
Construction and Classification of Codes
Patric R. J. Östergård
Aalto University
3.1 Introduction
3.2 Equivalence and Isomorphism
3.2.1 Prescribing Symmetries
3.2.2 Determining Symmetries
3.3 Some Central Classes of Codes
3.3.1 Perfect Codes
3.3.2 MDS Codes
3.3.3 Binary Error-Correcting Codes
3.1 Introduction
Shannon’s seminal work [1661] showed in a nonconstructive way that good codes exist,
as discussed in Section 1.1, and marked the birth of coding and information theory. One of
the main goals of coding theory is to construct codes that are as good as possible for given
parameters and types of codes. There is a mathematical as well as an engineering
dimension of the construction problem. Whereas mathematicians may wish to study arbitrary
parameters and the behavior of codes when the length tends to infinity, the parameters of
a code used in some engineering applications are necessarily fixed and bounded. Depending
on one's philosophical viewpoint, one may say that codes are discovered rather than
constructed. Hence, perhaps the most fundamental question in coding theory can also be
phrased in terms of existence: Does a code with given parameters exist or not?
If the answer to an existence question is affirmative, one may further wish to carry out a
classification of those codes, that is, to determine all possible such codes up to symmetry
(equivalence, isomorphism). Also classification has both a theoretical and a practical side.
Classified codes can be studied to gain more insight into codes, to state or refute conjectures,
and so on. But they also form an exhaustive set of candidate codes, whose performance can
be tested in actual applications.
Remark 3.1.1 A nonexistence result is intrinsically a classification result, where the
outcome is the empty set.
61
A lot of (manual and/or computational) resources may be required for the study of
existence or classification of codes with certain parameters. However, whereas verifying
correctness of a (positive) outcome is fast and straightforward for a constructed code, it can
be as time-consuming as the original work for a set of classified codes.
Validation of classification results is considered in [1090, Chap. 10]. Especially
double-counting techniques have turned out to be useful. If we know the total number of codes—for
example, through a mass formula—then we can check whether the orbit-stabilizer theorem
applied to a set of classified codes leads to the same number; see for example Theorem 4.5.2.
This approach can even be used to show that a set of codes found in a non-exhaustive search
is complete. Lam and Thiel [1192] showed how double-counting can be integrated into
classification algorithms when no mass formula is available (which is the typical situation).
The main classical reference in coding theory, The Theory of Error-Correcting Codes
by MacWilliams and Sloane [1323], considers the existence problem extensively and is still,
many decades after the publication of the first edition, a good source of information. The
theme of classifying codes, on the other hand, has flourished much later, in part because of
its high dependence on computational resources; the monograph [1090] provides an in-depth
treatment of this theme.
In this chapter, a general overview of existence and classification problems will be given,
and some highlights from the past will be singled out. The chapter touches upon several of
the problems and problem areas presented in the list of major open problems in algebraic
coding theory in [630].
Remark 3.2.1 For binary linear codes, all three types of equivalence coincide.
Construction and Classification of Codes 63
Example 3.2.2 In some special cases—in particular, for perfect codes—quite large unre-
stricted codes have been classified, such as the (23, 4096, 7)2 code (the binary Golay code is
unique [1732]; see also [525]) and the (15, 2048, 3)2 codes (with the parameters of a Hamming
code; there are 5983 such codes [1472]).
But what can be done if we go beyond parameters for which the size of an optimal
code can be determined and the optimal codes can be classified? Analytical upper bounds
and constructive lower bounds on the size of codes can still be used. One way to speed up
computer-aided constructive techniques—some of which are discussed in Chapter 23—is to
restrict the search by imposing a structure on the codes. This is a two-edged sword: the
search space is reduced, but good codes might not have that particular structure. Hence
some experience is of great help in tuning the search. A very common approach is that of
prescribing symmetries (automorphisms).
Remark 3.2.3 In the discussion of groups in the context of automorphism groups of codes,
we are not only interested in the abstract group but in the group and its action. This is
implicitly understood in the sequel when talking about one particular group or all groups
of certain orders. For example, “prescribing a group” means “prescribing a group and its
action” and “considering all groups” means “considering all groups and all possible actions
of those groups”.
Remark 3.2.4 An [n, k]_q linear code can be viewed as an unrestricted code which contains
the all-zero codeword and has a particular group of automorphisms G of order q^k, which
only permutes elements of the alphabet, individually within each coordinate.
Example 3.2.5 Consider the [4, 2]_2 binary linear code generated by
$$G = \begin{pmatrix} 1 & 0 & 1 & 0 \\ 0 & 1 & 1 & 1 \end{pmatrix}.$$
If this code contains some word c, then it also contains, for example, c+1010. Adding 0 to a
coordinate value means applying a value permutation that is the identity permutation, and
adding 1 to a coordinate value means applying the value permutation 0 ↔ 1. Permuting
elements of the alphabet, individually within each coordinate, is indeed covered by the
definition of equivalence of unrestricted codes.
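The closure property described in the example is easy to check by machine; the following sketch generates the code from G and verifies that translation by the codeword 1010 maps the code onto itself, acting coordinate-wise as a value permutation.

```python
from itertools import product

# Generator matrix of the [4, 2]_2 code from Example 3.2.5.
G = [(1, 0, 1, 0), (0, 1, 1, 1)]

# All F_2-linear combinations of the rows of G.
code = {tuple(sum(a * g[i] for a, g in zip(coeffs, G)) % 2 for i in range(4))
        for coeffs in product((0, 1), repeat=2)}
assert len(code) == 4

# Translating every word by the codeword 1010 applies, coordinate-wise,
# either the identity (where the bit is 0) or the swap 0 <-> 1 (where it is 1).
shift = (1, 0, 1, 0)
translated = {tuple((c + s) % 2 for c, s in zip(word, shift)) for word in code}
assert translated == code   # this value permutation is an automorphism
```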
Unrestricted codes that consist of cosets of a linear code can in a similar manner be
considered in the framework of prescribed groups of automorphisms. Let H be an (n − k) × n
parity check matrix for an [n, k, d_0]_q linear code¹, let x, y ∈ F_q^{n−k}, and let S ⊆ F_q^{n−k}.
Then, if wt(t) is the Hamming weight of t, we define
$$d^H(x, y) = \min\{\,\mathrm{wt}(t) \mid Ht^T = (x - y)^T,\ t \in \mathbb{F}_q^n\,\},$$
$$d^H(S) = \min_{a, b \in S,\ a \neq b} d^H(a, b).$$
The following theorem [1470, Theorem 1] shows that the situation here is a generalization
of the standard approach of finding arbitrary codes with minimum distance 2r + 1 by packing
Hamming balls of radius r. Here we are packing more general geometrical objects given by
the columns of H, and we get the standard approach when k = 0 and thereby H = I_n.
Theorem 3.2.6 Let H be an (n − k) × n parity check matrix for an [n, k, d_0]_q linear code,
and let S ⊆ F_q^{n−k}. Then the code C = {c ∈ F_q^n | Hc^T ∈ S} has minimum distance
$$\min\{\,d^H(S),\ d_0\,\}.$$
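Theorem 3.2.6 can be checked by brute force on small instances. The sketch below uses an illustrative parity check matrix of our own choosing for a binary [5, 2, 3] code and compares min{d^H(S), d_0} with the minimum distance of C computed directly for a few choices of S.

```python
from itertools import combinations, product

q, n = 2, 5
# Rows of a parity check matrix H of a [5, 2, 3]_2 code (our example).
H = [(1, 0, 0, 1, 1), (0, 1, 0, 1, 0), (0, 0, 1, 0, 1)]

def syndrome(t):
    return tuple(sum(h * x for h, x in zip(row, t)) % q for row in H)

words = list(product(range(q), repeat=n))

def wt(t):
    return sum(x != 0 for x in t)

# d_0: minimum distance of the linear code {c : Hc^T = 0}.
kernel = [w for w in words if syndrome(w) == (0, 0, 0)]
d0 = min(wt(w) for w in kernel if any(w))

def dH(s):   # d^H(x, y) depends only on the syndrome difference s = x - y
    return min(wt(t) for t in words if syndrome(t) == s)

for S in [{(0, 0, 0)}, {(0, 0, 0), (0, 1, 1)}, {(1, 1, 1), (0, 1, 1)}]:
    C = [w for w in words if syndrome(w) in S]
    true_d = min(wt(tuple((a - b) % q for a, b in zip(x, y)))
                 for x, y in combinations(C, 2))
    pred = min([d0] + [dH(tuple((a - b) % q for a, b in zip(x, y)))
                       for x, y in combinations(S, 2)])
    assert true_d == pred   # the theorem's formula matches the brute force
```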
When searching for unrestricted codes, one may consider any subgroup of S_q ≀ S_n. Due
to the large number of (conjugacy classes of) subgroups, it is typically necessary to restrict
the set of subgroups considered. The subgroups should not be too small, for then the search
would not be limited enough, but not too big either. Experiments and experience, especially
regarding automorphism groups of known good codes, are helpful in the process of sifting
candidate groups. For unrestricted codes, Theorem 3.2.6 has turned out to be very useful.
Groups that act transitively on the n coordinates or even on the qn coordinate–value pairs
have also been considered with success [1186].
Arguably the principal automorphism for good linear as well as unrestricted codes is a
cyclic permutation of the coordinates. Codes with such an automorphism are called cyclic.
Cyclic linear codes—which have a nice algebraic structure as they can be viewed as ideals
of certain quotient rings—are discussed in Section 1.12 and Chapter 2.
We have seen a generalization of cyclic codes to l-quasi-cyclic codes in Definition 1.12.23,
and these are covered in Chapter 7; cyclic codes are 1-quasi-cyclic. Other possible automor-
phisms that are generalizations of the cyclic ones and that are common amongst good codes
are given by the following definition. The definition applies to both linear and unrestricted
codes. For linear codes, these types of codes can be considered algebraically. See also Chap-
ter 17.
Definition 3.2.7 Fix α ∈ F_q^∗. A code C is called α-constacyclic (or α-twisted)
if c_0 c_1 · · · c_{n−1} ∈ C implies that (αc_{n−1}) c_0 c_1 · · · c_{n−2} ∈ C. A 1-constacyclic code is
cyclic. A code C is (α, l)-quasi-twisted if l divides n and c_0 c_1 · · · c_{n−1} ∈ C implies
that (αc_{n−l})(αc_{n−l+1}) · · · (αc_{n−1}) c_0 c_1 · · · c_{n−l−1} ∈ C. An (α, 1)-quasi-twisted code is α-
constacyclic (or α-twisted).
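A small sanity check of the definition: applying the α-constacyclic shift n times wraps every coordinate around exactly once, so the net effect is coordinate-wise multiplication by α. The sketch below verifies this over F_3 (parameters are our choice).

```python
q, n, alpha = 3, 4, 2   # F_3, length 4, alpha = 2

def twist(c):
    # alpha-constacyclic shift: c_0...c_{n-1} -> (alpha*c_{n-1}) c_0 ... c_{n-2}
    return ((alpha * c[-1]) % q,) + c[:-1]

c = (1, 0, 2, 1)
w = c
for _ in range(n):
    w = twist(w)
# After n shifts every coordinate has wrapped around exactly once,
# picking up one factor of alpha on the way.
assert w == tuple((alpha * x) % q for x in c)
```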
Important types of codes with a large automorphism group include quadratic residue
codes.
Definition 3.2.8 Let p be an odd prime, let Q be the set of quadratic residues modulo p,
let q ∈ Q be a prime, and let ζ be a primitive pth root of unity in some finite extension field
of Fq . A quadratic residue code of length p over Fq is a cyclic code with generator
polynomial
$$f(x) = \prod_{j \in Q} (x - \zeta^j).$$
¹This means that H has full rank, which is a reasonable assumption. We get essentially the same main
theorem when H does not have full rank, but then some details regarding distances have to be tuned.
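For p = 7 and q = 2 the construction in Definition 3.2.8 can be carried out explicitly: Q = {1, 2, 4} and ζ can be taken in F_8. The sketch below (our own implementation, representing F_8 as binary polynomials modulo x³ + x + 1, so that the residue class of x is a primitive 7th root of unity) multiplies out the product and recovers the generator polynomial x³ + x + 1 of the [7, 4, 3]_2 binary Hamming code; cf. Example 3.2.10.

```python
MOD = 0b1011          # x^3 + x + 1, irreducible over F_2; F_8 = F_2[x]/(MOD)

def gf8_mul(a, b):
    """Multiply two F_8 elements encoded as 3-bit integers."""
    r = 0
    for i in range(3):            # carryless (polynomial) multiplication
        if (b >> i) & 1:
            r ^= a << i
    for i in (4, 3):              # reduce modulo x^3 + x + 1
        if (r >> i) & 1:
            r ^= MOD << (i - 3)
    return r

def gf8_pow(a, e):
    r = 1
    for _ in range(e):
        r = gf8_mul(r, a)
    return r

zeta = 0b010                      # the class of x: a primitive 7th root of unity
Q = [1, 2, 4]                     # quadratic residues modulo 7

# f(x) = prod_{j in Q} (x - zeta^j); over F_2, -zeta^j = zeta^j.
f = [1]                           # coefficient list, constant term first
for j in Q:
    r = gf8_pow(zeta, j)
    g = [0] * (len(f) + 1)
    for i, c in enumerate(f):
        g[i + 1] ^= c             # contribution of x * f
        g[i] ^= gf8_mul(r, c)     # contribution of r * f (addition is XOR)
    f = g

# The coefficients collapse into F_2: f(x) = 1 + x + x^3.
assert f == [1, 1, 0, 1]
```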
Remark 3.2.9 The quadratic residue codes in Definition 3.2.8 can be generalized to codes
whose alphabet size is a prime power and to codes whose length is a prime power.
The dimension of a quadratic residue code of length p is (p + 1)/2. The minimum distance
is known to be at least √p, which can be further strengthened for various parameters; see
Theorem 2.7.4. The true minimum distance typically has to be determined on a case-by-case
basis.
Example 3.2.10 The [7, 4, 3]2 binary Hamming code, the [23, 12, 7]2 binary Golay code,
and the [11, 6, 5]3 ternary Golay code are quadratic residue codes.
Gleason and Prange showed that the automorphism group of an extended quadratic
residue code is rather large. The result was originally published only in a laboratory research
report, but it has later been discussed in most coding theory textbooks, including [1323,
Chap. 16]. See also [210, 1002]. When extending quadratic residue codes, the standard
Definition 1.7.1 can be used for q = 2, 3, but a different definition has to be used for general
values of q; see, for example, [1323, Chap. 16] for details.
Remark 3.2.12 The extensions of the three codes in Example 3.2.10 are exceptional
examples of extended quadratic residue codes whose automorphism group has PSL_2(p) as a
proper subgroup.
There is a two-way interaction between codes and groups: known good codes (or fam-
ilies of good codes) can be studied to find their automorphism groups, and groups can be
prescribed to find good codes with such automorphisms.
A search for codes with prescribed automorphisms that has a negative outcome leads to
results of scientific value only if the search is exhaustive and the parameters are of general
interest. For the most important open existence problems, smaller and smaller orders of
automorphisms and groups of automorphisms are commonly considered in a sequence of
publications. The hope in each and every such study is clearly to find codes with the
prescribed groups of automorphisms.
Remark 3.2.13 If the answer to a general existence problem is negative, then even a proof
that a code cannot have nontrivial automorphisms does not essentially bring us closer to a
nonexistence proof. However, this is not work in vain, but forms a strong basis for making
a nonexistence conjecture.
Example 3.2.14 Neil Sloane [1726] asked in 1973 whether there is a self-dual doubly-
even [72, 36, 16]2 code. This would be the third code in a sequence of self-dual doubly-even
[24m, 12m, 4m + 4]2 codes, which begins with the extended binary Golay code (m = 1)
and the extended binary quadratic residue code of length 48 (m = 2). The latter code is the
unique such code [991]; the extended binary Golay code is unique as a general linear code
[1514] and even as an unrestricted code [1732].
Jessie MacWilliams, in her review of [1726] for Mathematical Reviews, mentions that
the author had offered $10 for a solution and added another $10 to the prize money. It
was thus clear from the very beginning that this is an important and interesting problem.
Almost half a century later, the problem is still open, no less fascinating, and arguably the
most important open specific case in the theory of linear codes.
Conway and Pless [437] showed that the possible prime orders for automorphisms of
a self-dual doubly-even [72, 36, 16]2 code are 2, 3, 5, 7, 11, 17, and 23. Starting from the
biggest order, the largest four orders have been eliminated in [1516], [1525], [1010], and
[723], respectively. After further elimination of various groups of orders 2^i 3^j 5^k, i, j, k ≥ 0 in
[238, 239, 241, 274, 280, 1422, 1922, 1930, 1931], the groups of order at most 5, except for
the cyclic group of order 4, remain. Moreover, automorphisms of order 2 and 3 cannot have
fixed points [273, 274], and automorphisms of order 5 must have exactly two fixed points
[437]. See also [240] and Section 4.3.
The two questions are closely related, since if we are able to find all possible maps
between two codes, C1 and C2 , we can answer both of them (the latter by letting C1 = C2 =
C).
Invariants can be very useful in studying the first question.
Definition 3.2.15 An invariant is a property of a code that depends only on the abstract
structure, that is, two equivalent (isomorphic) codes necessarily have the same value of an
invariant.
Remark 3.2.16 Two inequivalent (non-isomorphic) codes may or may not have the same
value of an invariant.
Example 3.2.17 The distance distribution is an invariant of codes. This invariant can
be used to show that the two unrestricted (4, 3, 2)3 codes C1 = {0000, 0120, 2121} and
C2 = {1000, 1111, 2112} are inequivalent. Actually, to distinguish these codes an even less
sensitive invariant suffices: the number of pairs of codewords with mutual Hamming distance
3 is 0 for C1 and 1 for C2 .
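The computation in Example 3.2.17 is immediate to reproduce:

```python
from itertools import combinations

C1 = ["0000", "0120", "2121"]
C2 = ["1000", "1111", "2112"]

def dist(x, y):
    return sum(a != b for a, b in zip(x, y))

def distance_distribution(code):
    """Multiset of pairwise Hamming distances: an invariant of the code."""
    counts = {}
    for x, y in combinations(code, 2):
        d = dist(x, y)
        counts[d] = counts.get(d, 0) + 1
    return counts

d1, d2 = distance_distribution(C1), distance_distribution(C2)
assert d1.get(3, 0) == 0 and d2.get(3, 0) == 1   # as claimed in the example
assert d1 != d2                                  # hence C1 and C2 are inequivalent
```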
Since invariants are only occasionally able to provide the right answer to the first question
above, alternative techniques are needed for providing the answer in all possible situations.
One such technique relies on producing canonical representatives of codes.
Definition 3.2.18 Let S be a set of possible codes, and let r : S → S be a map with the
properties that (i) r(C1 ) = r(C2 ) if and only if C1 and C2 are equivalent (isomorphic) and
(ii) r(C) = r(r(C)). The canonical representative (or canonical form) of a code C ∈ S
with respect to this map is r(C).
To test whether two codes are equivalent (isomorphic), it suffices to test their canonical
representatives for equality.
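For toy parameters a map r in the sense of Definition 3.2.18 can be realized by brute force, taking r(C) to be the lexicographically smallest image of C over the whole equivalence group (one simple choice of ours; practical canonical-form algorithms are far more sophisticated, see [1090]):

```python
from itertools import permutations, product

q, n = 3, 4

def canonical(code):
    # r(C): lexicographically smallest image of C under all equivalence maps
    # (independent value permutations per coordinate followed by a coordinate
    # permutation).  Brute force, so only sensible for toy parameters.
    best = None
    for cp in permutations(range(n)):
        for vps in product(permutations(range(q)), repeat=n):
            image = tuple(sorted(tuple(vps[i][w[cp[i]]] for i in range(n))
                                 for w in code))
            if best is None or image < best:
                best = image
    return best

# The two inequivalent codes from Example 3.2.17.
C1 = [(0, 0, 0, 0), (0, 1, 2, 0), (2, 1, 2, 1)]
C2 = [(1, 0, 0, 0), (1, 1, 1, 1), (2, 1, 1, 2)]

r1, r2 = canonical(C1), canonical(C2)
assert r1 != r2                  # inequivalent codes, different canonical forms
assert canonical(r1) == r1       # property (ii): r(r(C)) = r(C)

# An equivalent copy of C1: exchange coordinates 1 and 2, and apply the
# value permutation 0 <-> 2 in the old first coordinate.
C1_equiv = [(w[1], 2 - w[0], w[2], w[3]) for w in C1]
assert canonical(C1_equiv) == r1   # equivalent codes share one representative
```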
Remark 3.2.19 The number of equivalence (isomorphism) classes that can be handled
when comparing canonical representatives is limited by the amount of computer memory
available. However, in the context of classifying codes, there are actually methods that do
not require any comparison between codes; see [1090].
For particular codes with special properties, algebraic and combinatorial techniques can
be used to determine the automorphism group of a code.
Example 3.2.20 The automorphism group of the [2^m − 1, 2^m − m − 1, 3]_2 binary Hamming
code is isomorphic to the general linear group GL_m(2).
Remark 3.2.21 Determining the automorphism group essentially consists of two parts:
finding the set of all automorphisms and proving that the set is complete. The latter task
is trivial in cases where the automorphism group is a maximal subgroup of the group of all
possible symmetries.
In the general case, algorithmic tools are required to answer the questions above. In
the early 1980s, Leon [1217] published an algorithm for computing automorphism groups of
linear codes over arbitrary fields, considering monomial equivalence. More recently, Feulner
[721] developed an algorithm for computing automorphism groups and canonical represen-
tatives of linear codes, considering equivalence.
Algorithms similar to those developed by Leon and Feulner could also be developed
for unrestricted codes. However, the task of developing such tailored algorithms is rather
tedious. An alternative, convenient approach for unrestricted codes and their various sub-
classes is to map the codes to colored graphs, which can then be considered in the
framework of graph isomorphism. In particular, the nauty graph automorphism software
[1370] can then be used.
Maps from codes to colored graphs are discussed in [1090, Chap. 3]. For unrestricted
codes over Fq with length n and size M , one may consider the following graph with M + qn
vertices. Take one vertex for each codeword and a complete graph with q vertices for each
coordinate. Insert edges so that the neighbors of a codeword vertex show the values in the
respective coordinates. One may color the graph so that no automorphism can map a vertex
of one of the two types to the other (although the structure of the graph is such that for
nontrivial codes no such maps are possible even if the graph is uncolored).
Example 3.2.22 For the two codes in Example 3.2.17, one may construct the graphs in
Figure 3.1. The leftmost vertex in each triangle corresponds to value 0, and the other
two vertices correspond to 1 and 2 in a clockwise manner. Now we can also use graph
invariants, such as the distribution of the degrees of the vertices, to see that the graphs are
non-isomorphic, which in turn implies that the codes are inequivalent.
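A sketch of this graph construction, reduced to the degree-sequence invariant used in Example 3.2.22 (nauty is not needed to distinguish these two toy codes):

```python
q, n = 3, 4
C1 = ["0000", "0120", "2121"]
C2 = ["1000", "1111", "2112"]

def code_graph_degrees(code):
    """Build the graph with one vertex per codeword and a q-clique per
    coordinate; a codeword vertex is joined to the vertex of its value in
    each coordinate.  Return the sorted degree sequence, a graph invariant."""
    M = len(code)
    deg = {}
    for i in range(M):
        deg[i] = n        # one edge into each of the n coordinate cliques
    for c in range(n):
        for v in range(q):
            # q - 1 clique edges plus one edge per codeword taking value v here
            deg[(c, v)] = (q - 1) + sum(int(w[c]) == v for w in code)
    return sorted(deg.values())

# Different degree sequences => non-isomorphic graphs => inequivalent codes.
assert code_graph_degrees(C1) != code_graph_degrees(C2)
```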
Remark 3.2.23 The number of codewords of linear codes grows exponentially as a function
of the dimension of the codes. Hence we do not wish to map individual codewords to
vertices of some graph, so an approach similar to that for unrestricted codes is not practical.
However, various other approaches have been proposed which partly rely on nauty; see
[483, 1467, 1608] and [1090, Chap. 7]. Further algorithms for classifying linear codes are
presented in [190].
consider upper bounds and are about nonexistence of codes. Lower bounds, on the other
hand, are typically obtained by constructing explicit codes. Especially for small parameters,
many best known codes have been obtained on a case-by-case basis. One possible approach
for finding such codes is that of prescribing symmetries—as discussed in Section 3.2.1—and
carrying out a computer search; see Chapter 23.
In some rare situations, there exist codes that attain some general upper bounds. For
such parameters, the problem of finding the size of an optimal code is then settled. When
this occurs and the upper bound is the Sphere Packing Bound, we get perfect codes (Defi-
nition 1.9.8), and when the upper bound is the Singleton Bound, we get maximum distance
separable (MDS) codes (Definition 1.9.12). In this section we will take a glance at these two
types of codes as well as general binary linear and unrestricted codes.
There are also some families of trivial perfect codes: codes containing one word, codes
containing all codewords in the space, and (n, 2, n)2 codes for odd n. If the order of the
alphabet q is a prime power, these are in fact the only sets of parameters for which (linear
and unrestricted) perfect codes exist [1805, 1949].
Theorem 3.3.1 The nontrivial perfect linear codes over Fq , where q is a prime power, are
precisely the Hamming codes with parameters (3.1) and the Golay codes with parameters
(3.2). A nontrivial perfect unrestricted code (over Fq , q a prime power) that is not equivalent
to a linear code has the same length, size, and minimum distance as a Hamming code (3.1).
Although the remarkable Theorem 3.3.1 gives us a rather solid understanding of perfect
codes, there are still many open problems in this area, including the following (a code with
different alphabet sizes for different coordinates is called mixed):
Research Problem 3.3.2 Solve the existence problem for perfect codes when the size of
the alphabet is not a prime power.
Research Problem 3.3.3 Solve the existence problem for perfect mixed codes.
Research Problem 3.3.4 Classify perfect codes, especially for the parameters covered by
Theorem 3.3.1.
Since Theorem 3.3.1 covers alphabet sizes that are prime powers, that is, exactly the sizes
for which finite fields and linear codes exist, Research Problems 3.3.2 to 3.3.4 are essentially
about unrestricted codes (although many codes studied for Research Problem 3.3.3 have
clear algebraic structures and close connections to linear codes).
The parameters (3.3) are feasible also when the alphabet size is not a prime power, but
for those alphabets there are also other parameters to consider.
Example 3.3.7 The Sphere Packing Bound cannot be used to prove nonexistence of perfect
(19, 2^14 3^18, 3)_6 codes. To prove nonexistence in this case, we may use the necessary condition
[931, Theorem 1] that
$$1 + n(q - 1) \mid q^{\,1 + (n-1)/q}.$$
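Both halves of Example 3.3.7 are one-line computations:

```python
# Parameters of Example 3.3.7: a putative (19, 2^14 * 3^18, 3)_6 perfect code.
q, n, M = 6, 19, 2**14 * 3**18

# The Sphere Packing Bound is attained with equality, so it cannot rule
# the code out:
ball = 1 + n * (q - 1)                   # |B(c, 1)| for minimum distance 3
assert M * ball == q**n

# The divisibility condition 1 + n(q-1) | q^(1+(n-1)/q) fails:
assert (n - 1) % q == 0                  # the exponent is an integer here
assert q**(1 + (n - 1) // q) % ball != 0 # 96 does not divide 6^4 = 1296
```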
The tricky instances for general alphabet sizes seem to be those of type (3.3), and
this is perhaps the reason behind the pessimism in the quote of Remark 3.3.6. In fact, a
nonexistence proof [826, Theorem 6] for (7, 6^5, 3)_6 codes is the only result so far for (3.3).
The next open case for m = 2, that is, n = q + 1, is here singled out as a particularly
interesting question.
Research Problem 3.3.8 Solve the existence problem for (11, 10^9, 3)_{10} codes.
Remark 3.3.9 Codes with m = 2 (n = q +1) in (3.3) are of special interest because if there
exists such a code, then there exists an infinite sequence of perfect codes with parameters
(3.3) over the same alphabet [1509].
Theorem 3.3.10 Let q be a prime power, and assume that there is a perfect code with
minimum distance 3 and length n over F_{q^m} F^{(2)} · · · F^{(n)}. Then there is a perfect code with
minimum distance 3 and length n − 1 + (q^m − 1)/(q − 1) over F_q^{(q^m−1)/(q−1)} F^{(2)} · · · F^{(n)}.
To see this, partition F_q^{(q^m−1)/(q−1)} into the q^m cosets of a Hamming code, each of which
is a perfect code with minimum distance 3, and index them by the elements of F_{q^m}:
$$\mathbb{F}_q^{(q^m-1)/(q-1)} = C_0 \cup C_1 \cup \cdots \cup C_{q^m-1}.$$
For each codeword x_1 x_2 · · · x_n ∈ F_{q^m} F^{(2)} · · · F^{(n)} in the original code, include the words
c x_2 x_3 · · · x_n, c ∈ C_{x_1}, in the new code. To show that the obtained code is perfect, it suffices
to observe that the minimum distance of the new code is 3 (there are two cases to
consider: codewords that come from different codewords, respectively the same codeword,
of the original code) and that its size attains the Sphere Packing Bound.
Example 3.3.11 Starting from a perfect Hamming code over F_4 of length 5 and applying
Theorem 3.3.10 repeatedly, we get perfect codes with minimum distance 3 over F_2^3 F_4^4,
F_2^6 F_4^3, F_2^9 F_4^2, F_2^{12} F_4, and F_2^{15}. The last code again has the parameters of a
Hamming code.
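The parameters in Example 3.3.11 can be checked against the mixed-alphabet Sphere Packing Bound at every step of the chain (a sketch of ours; for a mixed alphabet the bound replaces n(q − 1) by the sum of q_i − 1 over the coordinates):

```python
# q = 2, m = 2: each application of Theorem 3.3.10 trades one F_4 coordinate
# for (2^2 - 1)/(2 - 1) = 3 binary coordinates and doubles the code size.
M = 4**3                                 # size of the Hamming code over F_4, length 5
for k in range(6):                       # k = number of F_4 coordinates replaced
    alphabet_sizes = [2] * (3 * k) + [4] * (5 - k)
    space = 1
    for s in alphabet_sizes:
        space *= s                       # size of the ambient mixed space
    ball = 1 + sum(s - 1 for s in alphabet_sizes)   # radius-1 ball, mixed alphabet
    assert (M * 2**k) * ball == space    # Sphere Packing Bound met with equality
```

The final step (k = 5) is the binary (15, 2048, 3)_2 Hamming parameters, closing the chain.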
In Example 3.3.11, we started from a well-known perfect code, but in this way only
specific families of perfect mixed codes can be obtained. What about the general case? In
the general case, there are both existence and nonexistence results. For nonexistence results,
see, for example, [1588, 1842] and their references.
When the alphabet sizes are powers of a prime power q, mixed perfect codes with
minimum distance 3 are known to exist for a wide range of parameters. Actually, for all
those parameters, such perfect codes can be constructed via a vector space partition.
Let {C_1, C_2, . . . , C_m} be a partition of F_q^n. We now define a map ϕ from the direct product
C_1 × C_2 × · · · × C_m to F_q^n by
$$\varphi : (c_1, c_2, \ldots, c_m) \mapsto c_1 + c_2 + \cdots + c_m.$$
Example 3.3.14 Consider partitions of F_2^3. By taking the seven [3, 1]_2 codes in the
partition, Theorem 3.3.13 gives the binary Hamming code of length 7. With C_1 =
{000, 100, 010, 110}, C_2 = {000, 001}, C_3 = {000, 101}, C_4 = {000, 011}, and C_5 = {000, 111},
we get a perfect code over F_4 F_2^4.
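Since the statement of Theorem 3.3.13 is not reproduced above, the following sketch relies on our reading of the construction: identify each part C_i with an alphabet and take as codewords the tuples mapped to 0 by ϕ. For the partition of F_2^3 into its seven one-dimensional subspaces this indeed yields the [7, 4, 3]_2 Hamming code, as Example 3.3.14 states.

```python
from itertools import combinations, product

# The seven nonzero vectors of F_2^3; each spans one [3,1]_2 code of the partition.
lines = [(0,0,1), (0,1,0), (0,1,1), (1,0,0), (1,0,1), (1,1,0), (1,1,1)]

# Identify C_i = {000, v_i} with the alphabet {0, 1}; a word a_1...a_7 belongs
# to the code iff phi(a_1 v_1, ..., a_7 v_7) = sum_i a_i v_i = 0 in F_2^3.
code = [a for a in product((0, 1), repeat=7)
        if all(sum(a[i] * lines[i][j] for i in range(7)) % 2 == 0
               for j in range(3))]

assert len(code) == 2**4                 # a [7, 4]_2 code
d = min(sum(x != y for x, y in zip(u, v)) for u, v in combinations(code, 2))
assert d == 3                            # the Hamming parameters [7, 4, 3]_2
assert len(code) * (1 + 7) == 2**7       # perfect: Sphere Packing equality
```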
Remark 3.3.15 The term partition in Definition 3.3.12 is justifiable when the all-zero
word is ignored. Such partitions into subspaces (linear codes) should not be confused with
partitions into arbitrary subsets (unrestricted codes), such as those considered in the proof
of Theorem 3.3.10.
Remark 3.3.16 Partitions of vector spaces into t-dimensional subspaces are known as t-
spreads in finite geometry. Partitions of vector spaces are further related to q-analogs of
combinatorial designs.
Partitions of vector spaces have been studied extensively—see, for example, [667, 669,
1214, 1637] for selected results on this problem obtained during the last decade. Any exis-
tence results in this context give perfect codes, but our current knowledge does not allow us
to use nonexistence results to deduce nonexistence of mixed codes with the related param-
eters. Mixed codes obtained by Theorem 3.3.13 have a “linear” structure, and one central
question is about the existence of such codes. The next problem is [929, Problem 4].
Research Problem 3.3.17 Are there parameters for which perfect mixed codes with min-
imum distance 3 exist but which cannot be obtained via partitions of vector spaces (Theo-
rem 3.3.13)?
Remark 3.3.20 As the codewords of an MDS code with dimension k form an orthogonal
array with strength k and index 1, such codes are systematic and any k coordinates can be
used for the message symbols.
In a paper [319] published by Bush in 1952, the framework of orthogonal arrays is used
to construct objects that we now know as Reed–Solomon codes. In that study it is also
shown that for linear codes over Fq with k > q, n ≤ k + 1 is a necessary condition for an
[n, k, n − k + 1]q MDS code to exist, and that there are [k + 1, k, 2]q MDS codes. Such codes,
and generally codes with parameters [n, 1, n]q , [n, n − 1, 2]q , and [n, n, 1]q , are called trivial
MDS codes.
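A Reed–Solomon code is obtained by evaluating all polynomials of degree less than k at n distinct field elements; the sketch below verifies by brute force that a small instance over F_7 attains the Singleton Bound (the parameters n = 5, k = 2 are our choice).

```python
from itertools import product

# A small Reed-Solomon code over F_7: evaluate all polynomials of degree < k
# at n distinct field elements.
q, n, k = 7, 5, 2
points = list(range(n))                  # 5 distinct elements of F_7

code = set()
for coeffs in product(range(q), repeat=k):
    code.add(tuple(sum(c * pow(x, i, q) for i, c in enumerate(coeffs)) % q
                   for x in points))

assert len(code) == q**k                 # a (5, 49)_7 code of dimension 2
# For a linear code the minimum distance equals the minimum nonzero weight.
d = min(sum(x != 0 for x in w) for w in code if any(w))
assert d == n - k + 1                    # attains the Singleton Bound: MDS
```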
For k ≤ q, on the other hand, the following MDS Conjecture related to a question by
Segre [1638] in 1955 is still open.
Conjecture 3.3.21 (MDS) If k ≤ q, then a linear [n, k, n − k + 1]_q MDS code exists
exactly when n ≤ q + 1, unless q = 2^h and k = 3 or k = q − 1, in which case it exists exactly
when n ≤ q + 2.
Remark 3.3.22 MDS codes are typically discussed in the linear case, but the parameters of
the codes in Conjecture 3.3.21 conjecturally also cover the parameters for which unrestricted
MDS codes exist.
The MDS Conjecture has throughout the years been proved for various sets of parameters
k and q; see [962] for a survey. A major breakthrough was made in 2012 by Ball [104],
who proved that the MDS Conjecture holds when q is a prime; see Theorem 14.2.17. The
conjecture is still open for non-prime q; see [104, 106] and their references for some related
results. Much less is known in the more difficult case of unrestricted MDS codes, where the
MDS Conjecture has been proved for q ≤ 8 [1142, 1143].
MDS codes, linear and unrestricted, have been classified for various parameters. For
example, a general result is that for prime q, [q + 1, k, q − k + 2]q MDS codes—which have
the maximum value of n given k and q [104]—are Reed–Solomon codes. However, when
q is not a prime, examples of [q + 1, k, q − k + 2]q MDS codes that are not equivalent to
Reed–Solomon codes are known [811, 957].
Unrestricted MDS codes have been classified for small values of n and q. For the smallest
nontrivial case, (4, 9, 3)3 MDS codes, Taussky and Todd [1794] claimed uniqueness of the
Hamming code in 1948 but omitted the details—see [1074] for a complete proof. That code
is the first nontrivial member of a family of perfect 1-error-correcting codes with varying q
and smallest possible length, that is, with parameters (q + 1, q q−1 , 3)q . We have earlier seen
these parameters in Research Problem 3.3.8.
Actually, for q ≤ 5 there is a unique unrestricted MDS code for each set of admissible
parameters. The numbers of equivalence classes of unrestricted (n, 7^k, d)_7 MDS codes and
(n, 8^k, d)_8 MDS codes are given in Tables 3.1 and 3.2, respectively, for n ≥ k + 2. See
[35, 664, 1142, 1143] and their references for details.
TABLE 3.1: Equivalence classes of (n, 7^k, d)_7 MDS codes
n\k  2  3  4  5  6
 4   4
 5   1  1
 6   1  3  1
 7   1  1  1  1
 8   1  1  1  1  1
TABLE 3.2: Equivalence classes of (n, 8^k, d)_8 MDS codes
n\k     2      3     4   5   6   7
 4   2165
 5     39  12484
 6      1     39    14
 7      1      2     2   8
 8      1      2     1   2   4
 9      1      2     1   1   2   4
10             1                 1
Remark 3.3.24 The existence of a (q + 1, q^2, q)_q MDS code is equivalent to the existence
of a projective plane of order q and an affine plane of order q.
TABLE 3.3: A_2(n, d) for n ≤ 15 and odd d ≤ 11
n\d     3    5   7  9 11
 1      1    1   1  1  1
 2      1    1   1  1  1
 3      2    1   1  1  1
 4      2    1   1  1  1
 5      4    2   1  1  1
 6      8    2   1  1  1
 7     16    2   2  1  1
 8     20    4   2  1  1
 9     40    6   2  2  1
10     72   12   2  2  1
11    144   24   4  2  2
12    256   32   4  2  2
13    512   64   8  2  2
14   1024  128  16  4  2
15   2048  256  32  4  2
One may focus on either of the parities of d—even or odd—not only when determining
the values of A2 (n, d) but also when classifying codes. Namely, codes with odd minimum
distance can be extended to codes with even minimum distance, and codes with even min-
imum distance can be punctured to codes with odd minimum distance. These operations
can be carried out in several ways (but may still lead to equivalent codes).
When puncturing an arbitrary binary code, the minimum distance decreases by at most
one—if the length of the original code is n, this can be done in n ways. One may find
all possible ways of extending a code with odd minimum distance by finding all proper
2-colorings of a certain graph; see [1090, Sect. 7.1.4] for details. In particular, one possible
extension is obtained by adding a parity check symbol. Even parity is typically considered,
but odd parity gives equivalent codes. Codes extended with a parity check symbol further
have even distance between all pairs of codewords, cf. Theorem 1.9.2(f).
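Extension by a parity check symbol and the corresponding puncturing are easy to exercise on the [7, 4, 3]_2 Hamming code:

```python
from itertools import combinations, product

# Build the [7,4,3]_2 Hamming code from its parity check (columns = 1..7 in binary).
H_cols = [tuple((j >> i) & 1 for i in range(3)) for j in range(1, 8)]
code = [w for w in product((0, 1), repeat=7)
        if all(sum(w[j] * H_cols[j][i] for j in range(7)) % 2 == 0
               for i in range(3))]

def dmin(C):
    return min(sum(x != y for x, y in zip(u, v)) for u, v in combinations(C, 2))

assert len(code) == 16
assert dmin(code) == 3                   # odd minimum distance

# Extend with an (even) parity check symbol.
extended = [w + (sum(w) % 2,) for w in code]
assert dmin(extended) == 4
assert all(sum(x != y for x, y in zip(u, v)) % 2 == 0
           for u, v in combinations(extended, 2))   # all distances even

# Puncturing the extension in the added coordinate recovers the original code.
assert [w[:-1] for w in extended] == code
```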
For parameters outside the range of Table 3.3, the exact value of A2 (n, d) is known only
for sporadic parameters, including those of Hamming codes, the binary Golay code, and
Preparata codes [1541]. For more on Preparata codes, see Chapters 5 and 6.
TABLE 3.4: Equivalence classes of (n, A2 (n, d), d)2 codes for n ≤ 15 and odd d ≤ 11
n\d 3 5 7 9 11
1 1 1 1 1 1
2 1 1 1 1 1
3 1 1 1 1 1
4 2 1 1 1 1
5 1 1 1 1 1
6 1 2 1 1 1
7 1 3 1 1 1
8 5 1 2 1 1
9 1 1 3 1 1
10 562 1 4 2 1
11 7398 1 1 3 1
12 237610 2 9 4 2
13 117823 1 6 5 3
14 38408 1 10 1 4
15 5983 1 5 9 5
It is known that the binary Golay code leads to optimal codes even after shortening it
six times [1469].
Since we further know [1468] that A2 (16, 7) = 36, we actually know A2 (n, 7) for all
n ≤ 23 (and consequently A2 (n, 8) for all n ≤ 24).
In the general case, we have lower and upper bounds on A2 (n, d). For codes with param-
eters just outside the range of Table 3.3, examples of techniques for finding these bounds
include, for lower bounds, conducting a computer search for codes with prescribed auto-
morphisms [1186] and, for upper bounds, employing semidefinite programming [804, 1634].
See Chapter 13 for details of the latter method.
For fixed (small) d, the relative distance d/n → 0 as n → ∞. In this case, the asymptotic
behavior is conveniently discussed in terms of the density of a code.
The asymptotic behavior of 1-error-correcting codes has been solved for prime-power
alphabets [1068]. For clarity we focus on the binary case.
1. Hamming codes over Fq , where q is a prime power. The parameters of these are given
by (3.1).
2. The following construction of a code C′ over F_{2^i} from a code C over F_q, where q > 2^i.
Ignoring the algebraic structure of the alphabet, label the elements of F_{2^i} so that they
form a subset of the elements of F_q, and take exactly those codewords of C that have
all coordinate values in F_{2^i}.
3. Transform a code over F_{2^i} to a code over F_2 using the construction in the proof of
Theorem 3.3.10.
4. Lengthen an (n, M, 3)_2 code to a (2n + 1, 2^n M, 3)_2 code.
5. Shorten codes to obtain codes with all possible parameters.
The Hamming codes have density 1, the constructions (3) and (4) do not affect the
density of a code, and constructions (2) and (5) decrease the density (in the way they are
used in practice). By putting the pieces together carefully, the desired result is reached.
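Construction (4) can be realized, for example, as C′ = {(u, u + v, parity(u)) : u ∈ F_2^n, v ∈ C}; this is one standard construction achieving these parameters, not necessarily the exact one intended above. A sketch starting from the (3, 2, 3)_2 repetition code:

```python
from itertools import combinations, product

# Seed: the (3, 2, 3)_2 repetition code.
C = [(0, 0, 0), (1, 1, 1)]
n = 3

# Lengthen: (u, u + v, parity(u)) for u in F_2^n, v in C gives a
# (2n + 1, 2^n * |C|, 3)_2 code.
lengthened = {u + tuple((a + b) % 2 for a, b in zip(u, v)) + (sum(u) % 2,)
              for u in product((0, 1), repeat=n) for v in C}

assert len(lengthened) == 2**n * len(C)          # a (7, 16, 3)_2 code
d = min(sum(x != y for x, y in zip(a, b))
        for a, b in combinations(lengthened, 2))
assert d == 3
```

The result matches A_2(7, 3) = 16 from Table 3.3, so the lengthened code is optimal.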
For 2-error-correcting codes, that is, codes with minimum distance 5, the density of
Preparata codes tends to 1 as n → ∞. However, this only shows that the limit superior is 1.
For a relative distance approaching δ as n → ∞, the asymptotic behavior of A2 (n, δn)
is considered in Section 1.9.8 in terms of upper bounds on the asymptotic normalized rate
function. It was not until 1972 that Justesen [1063] published the first construction of
codes—now called Justesen codes—with a lower bound on the asymptotic normalized rate
function that is greater than 0. All these bounds apply to linear as well as unrestricted
codes.
TABLE 3.5: k_2(n, d) for n ≤ 17 and odd d ≤ 17
n\d   3  5  7  9 11 13 15 17
 3    1
 4    1
 5    2  1
 6    3  1
 7    4  1  1
 8    4  2  1
 9    5  2  1  1
10    6  3  1  1
11    7  4  2  1  1
12    8  4  2  1  1
13    9  5  3  1  1  1
14   10  6  4  2  1  1
15   11  7  5  2  1  1  1
16   11  8  5  2  1  1  1
17   12  9  6  3  2  1  1  1
TABLE 3.6: Equivalence classes of [n, k_2(n, d), ≥ d]_2 codes for n ≤ 17 and odd d ≤ 17
n\d 3 5 7 9 11 13 15 17
3 1
4 2
5 1 1
6 1 2
7 1 3 1
8 6 1 2
9 5 4 3 1
10 4 2 4 2
11 3 1 1 3 1
12 2 14 4 4 2
13 1 15 1 5 3 1
14 1 11 1 1 4 2
15 1 6 1 4 5 3 1
16 145 1 7 9 6 4 2
17 129 1 3 2 1 5 3 1
Chapter 4
Self-Dual Codes
Stefka Bouyuklieva
Veliko Tarnovo University
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
4.2 Weight Enumerators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
4.3 Bounds for the Minimum Distance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
4.4 Construction Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
4.4.1 Gluing Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
4.4.2 Circulant Constructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
4.4.3 Subtraction Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
4.4.4 Recursive Constructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
4.4.5 Constructions of Codes with Prescribed
Automorphisms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
4.5 Enumeration and Classification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
4.1 Introduction
Definition 4.1.1 If C is an [n, k]q linear code over the field Fq , its dual or orthogonal
code C^⊥ is the set of vectors which are orthogonal to all codewords of C:
$$C^\perp = \{\, u \in \mathbb{F}_q^n \mid (u, v) = 0 \text{ for all } v \in C \,\},$$
where (u, v) is an inner product in the vector space F_q^n.
Remark 4.1.4 Any self-dual code of length n over a finite field F_q has dimension k = n/2.
Therefore self-dual codes over a field exist only for even lengths.
Definition 4.1.5 A code C is divisible by ∆ provided all codewords have weights divisible
by an integer ∆, called a divisor of C; the code is called divisible if it has a divisor ∆ > 1.
Definition 4.1.6 If all codewords in a linear code have even weights, the code is even. If
all codewords in a binary linear code have weights divisible by 4, the code is doubly-even.
Binary self-orthogonal codes that are not doubly-even are called singly-even.
Remark 4.1.7 Any binary self-orthogonal code is even but there are even binary codes
which are not self-orthogonal. It is easy to see that any binary doubly-even code is self-
orthogonal. If C is a binary singly-even self-orthogonal code, then C is even but it contains
at least one codeword of weight ≡ 2 (mod 4).
Remark 4.1.8 Definition 4.1.6 repeats Definition 1.6.8 but we give it here for completeness.
Remark 1.6.9 and Example 1.6.10 also give some information about binary self-orthogonal
codes.
Theorem 4.1.9 (Gleason–Pierce–Ward [1008]) Let C be an [n, n/2] divisible code over
the field Fq with divisor ∆ ≥ 2. Then one (or more) of the following holds.
(a) q = 2, ∆ = 2, or
(b) q = 2, ∆ = 4, and C is self-dual, or
(c) q = 3, ∆ = 3, and C is self-dual, or
(d) q = 4, ∆ = 2, and C is Hermitian self-dual, or
(e) ∆ = 2, and C is equivalent to the code over F_q with generator matrix [ I_{n/2} | I_{n/2} ].
Remark 4.1.10 Note that in case (b), the code is doubly-even self-dual, and these codes
are also called Type II codes; the singly-even self-dual codes are called Type I codes.
Theorem 4.1.12 ([1008, Theorem 9.1.3]) There exists a Euclidean self-dual code over
Fq of even length n if and only if (−1)n/2 is a square in Fq . Furthermore, if n is even
and (−1)n/2 is not a square in Fq , then the dimension of a maximal self-orthogonal code of
length n is (n/2) − 1. If n is odd, then the dimension of a maximal self-orthogonal code of
length n is (n − 1)/2.
The structure of self-dual codes plays a major role in their study. The Balance
Principle gives very important information about this structure. Theorem 4.1.14 is the basis for
most of the construction methods presented in Section 4.4.
4.2 Weight Enumerators

Definition 4.2.1 A linear code C is called formally self-dual if C and its dual code C⊥
have the same weight enumerator, HweC(x, y) = HweC⊥(x, y). A linear code is isodual if it
is equivalent to its dual code.
Remark 4.2.2 Any isodual code is also formally self-dual, but there are formally self-dual
codes that are neither isodual nor self-dual. The smallest length for which a formally self-
dual code is not isodual is 14, and there are 28 such codes amongst 6 weight enumerators
[867]. Any self-dual code is also isodual and formally self-dual.
is isodual. Its weight enumerator is HweC (x, y) = y 6 + 4x3 y 3 + 3x4 y 2 , and its automorphism
group has order 24. Obviously, this code is not self-dual as it contains codewords with odd
weight.
Theorem 4.2.4 (Gleason [1008]) Let C be an [n, n/2] code over the field F_q, and let

g1(x, y) = y^2 + x^2,
g2(x, y) = y^8 + 14x^4y^4 + x^8,
g3(x, y) = y^24 + 759x^8y^16 + 2576x^12y^12 + 759x^16y^8 + x^24,
g4(x, y) = y^4 + 8x^3y,
g5(x, y) = y^12 + 264x^6y^6 + 440x^9y^3 + 24x^12,
g6(x, y) = y^2 + 3x^2, and
g7(x, y) = y^6 + 45x^4y^2 + 18x^6.
(c) If q = 3 and C is a self-dual code, then HweC(x, y) = ∑_{i=0}^{⌊n/12⌋} a_i g4(x, y)^{n/4−3i} g5(x, y)^i.

In all cases, all coefficients a_i are rational and ∑_i a_i = 1.
Remark 4.2.5 In Theorem 4.2.4(a), if we let g2′(x, y) = x^2 y^2 (x^2 − y^2)^2, then we can use
g2′(x, y) in place of g2(x, y); so

HweC(x, y) = ∑_{i=0}^{⌊n/8⌋} a_i g1(x, y)^{n/2−4i} g2(x, y)^i
           = ∑_{i=0}^{⌊n/8⌋} c_i (x^2 + y^2)^{n/2−4i} (x^2 y^2 (x^2 − y^2)^2)^i.

Using g2′(x, y) in place of g2(x, y) is often more convenient in applications because the
coefficients c_i are integers.
Remark 4.2.6 The polynomial g1(x, y) is the weight enumerator of the binary repetition code i2 of length 2,
and g2(x, y) is the weight enumerator of the [8, 4, 4] extended binary Hamming code Ĥ3,2.
The weight enumerators of the extended binary and ternary Golay codes are g3(x, y) and
g5(x, y). The ternary [4, 2, 3] Hamming code H2,3 of Examples 1.9.14 and 1.10.3 (often
called the tetracode) has weight enumerator g4(x, y), the repetition code of length 2 over
F4 has weight enumerator g6(x, y), and the [6, 3, 4] hexacode over F4 = {0, 1, ω, ω^2} with
generator matrix

[ 1 0 0 1 ω ω
  0 1 0 ω 1 ω
  0 0 1 ω ω 1 ]

has weight enumerator g7(x, y).
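The weight enumerator g4 of the tetracode can be confirmed by direct enumeration over F3. The following sketch is our illustration; the generator matrix below is a standard [I | A] form for a [4, 2, 3] ternary code equivalent to H2,3, and may differ from the matrix used in Examples 1.9.14 and 1.10.3:

```python
from itertools import product

# Assumed generator matrix of a [4,2,3] ternary code equivalent to the tetracode
G = [(1, 0, 1, 1), (0, 1, 1, 2)]

def codewords(gen, q):
    # All F_q linear combinations of the generator rows
    k, n = len(gen), len(gen[0])
    return {tuple(sum(c * g[j] for c, g in zip(coeffs, gen)) % q
                  for j in range(n))
            for coeffs in product(range(q), repeat=k)}

# Weight distribution A_w: number of codewords of Hamming weight w
dist = {}
for v in codewords(G, 3):
    w = sum(1 for x in v if x != 0)
    dist[w] = dist.get(w, 0) + 1

# g4(x, y) = y^4 + 8 x^3 y : one codeword of weight 0, eight of weight 3
assert dist == {0: 1, 3: 8}
```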
Remark 4.2.7 The Gleason polynomials g3, g5, and g7 can be replaced by g3′(x, y) =
x^4 y^4 (x^4 − y^4)^4, g5′(x, y) = x^3 (x^3 − y^3)^3, and g7′(x, y) = x^2 (x^2 − y^2)^2 in the corresponding
formulas in Gleason's Theorem.
Lemma 4.2.8 ([1555, Lemma 1]) Let C be a binary singly-even self-orthogonal code, and
let C0 be its subset of doubly-even codewords. Then C0 is a linear subcode of index 2 in C.
Remark 4.2.10 The term 'shadow' was introduced by Conway and Sloane [440] in order
to prove a new upper bound for the minimum distance of binary singly-even self-dual
codes, as well as some restrictions on their weight enumerators.
Theorem 4.2.11 ([440]) Let S be the shadow of the singly-even self-dual code C of length
n and minimum distance d. The dual code C0⊥ of the doubly-even subcode C0 of C is a union
of four cosets of C0; i.e., C0⊥ = C0 ∪ C1 ∪ C2 ∪ C3, where C = C0 ∪ C2. The following hold.
(a) S = C0⊥ \ C = C1 ∪ C3 .
(b) The sum of any two vectors in S is a codeword in C. More precisely, if u, v ∈ C1 , then
u + v ∈ C0 ; if u ∈ C1 , v ∈ C3 , then u + v ∈ C2 ; if u, v ∈ C3 , then u + v ∈ C0 .
(c) Let HweS(x, y) = ∑_{r=0}^{n} A_r(S) x^r y^{n−r} be the weight enumerator of the shadow S. Then

    HweS(x, y) = HweC( i(y − x)/√2, (y + x)/√2 ),

    where HweC(x, y) is the weight enumerator of C and i = √−1. Furthermore,
    (i) A_r(S) = A_{n−r}(S) for r = 0, 1, . . . , n;
    (ii) A_r(S) = 0 for any r ≢ n/2 (mod 4);
    (iii) A_0(S) = 0;
    (iv) A_r(S) ≤ 1 for r < d/2; and
    (v) A_{d/2}(S) ≤ 2n/d.
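The substitution in part (c) can be sanity-checked numerically on the smallest singly-even self-dual code, C = i2 = {00, 11}: its doubly-even subcode is C0 = {00}, so S = {01, 10} and HweS(x, y) = 2xy. A sketch using Python complex arithmetic (our illustration, not from [440]):

```python
# Weight enumerators of C = {00, 11} and of its shadow S = {01, 10}
hwe_C = lambda x, y: y**2 + x**2
hwe_S_expected = lambda x, y: 2 * x * y

r2 = 2 ** 0.5
for x, y in [(0.3, 1.7), (2.0, -1.0), (1 + 2j, 0.5j)]:
    # Substitution from Theorem 4.2.11(c): x -> i(y - x)/sqrt(2), y -> (y + x)/sqrt(2)
    lhs = hwe_C(1j * (y - x) / r2, (y + x) / r2)
    assert abs(lhs - hwe_S_expected(x, y)) < 1e-9
```

Algebraically, ((y + x)/√2)^2 + (i(y − x)/√2)^2 = [(y + x)^2 − (y − x)^2]/2 = 2xy, as expected.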
Theorem 4.2.12 ([440]) If the weight enumerator of a self-dual code C of length n is given
by

HweC(x, y) = ∑_{j=0}^{⌊n/8⌋} c_j (x^2 + y^2)^{n/2−4j} (x^2 y^2 (x^2 − y^2)^2)^j,
Example 4.2.13 By [440, Section V], a consequence of Theorem 4.2.12 is that the largest
minimum weight of a self-dual code of length 42 is 8. Additionally, a self-dual [42, 21, 8]
code C and its shadow S have the following weight enumerators, respectively,
HweC(x, y) = y^42 + (84 + 8β)x^8y^34 + (1449 − 24β)x^10y^32 + ··· ,
HweS(x, y) = βx^5y^37 + (896 − 8β)x^9y^33 + ··· ;     (4.1)

HweC(x, y) = y^42 + 164x^8y^34 + 697x^10y^32 + 15088x^12y^30 + ··· ,
HweS(x, y) = xy^41 + 861x^9y^33 + ··· ;     (4.2)
where β is a nonnegative integer [440]. A self-dual [42, 21, 8] code with weight enumerator
HweC (x, y) given by (4.1) exists if and only if β ∈ {0, 1, . . . , 22, 24, 26, 28, 32, 42}; see [279].
A self-dual [42, 21, 8] code with weight enumerator HweC (x, y) given by (4.2) is also known;
see [1005].
Remark 4.2.14 Similar definitions for a shadow can be given in some other cases. For
example, a shadow is defined if C is an additive trace-Hermitian self-orthogonal code over
F4 and C0 is its subcode with even Hamming weights, or if C is a self-orthogonal code over
Zm (m even) and C0 is the subcode with Euclidean norms divisible by 2m. Definitions and
theorems for these codes and their shadows are given in [1555].
Remark 4.2.15 Extremal behavior of these parameters leads to interesting problems. In
[86] the minimum weights of a code and its shadow are considered simultaneously. In
[283], the authors study binary self-dual codes having a shadow with the smallest possible
minimum weight.
Theorem 4.2.16 ([86]) Let C be a self-dual binary code, assumed not to be doubly-even,
of minimum weight d, and let S be its shadow, of minimum weight s. Then 2d+s ≤ 4+n/2,
unless n ≡ 22 (mod 24) and d = 4bn/24c + 6, in which case 2d + s ≤ 8 + n/2.
Definition 4.2.17 ([86]) Let C be a binary self-dual [n, n/2, d] code whose shadow S has
minimum distance s. If 2d + s = 4 + n/2 or 2d + s = 8 + n/2, C is said to be s-extremal.
Definition 4.2.18 ([283]) Let C be a binary self-dual code of length n = 24m + 8l + 2r,
l = 0, 1, 2, r = 0, 1, 2, 3. Suppose the shadow of C has minimum weight s. Then C is a code
with minimal shadow if: (i) s = r when r > 0, and (ii) s = 4 when r = 0.
Remark 4.2.19 Some s-extremal codes and codes with minimal shadow support 1- and 2-
designs; see Chapter 5 for appropriate terminology. More properties of the s-extremal codes
are given in [86]. Connections between self-dual codes with minimal shadow, combinatorial
designs and secret sharing schemes are presented in [282].
4.3 Bounds for the Minimum Distance

Remark 4.3.2 Recall from Remark 4.1.10 that Type I codes are binary singly-even self-
dual codes, and Type II codes are binary doubly-even self-dual codes. We say that a ternary
self-dual code is Type III, and a code over F4 that is self-dual under the Hermitian inner
product is Type IV.
Theorem 4.3.3 (Zhang [1555, 1943]) Let HweC (x, y) be the weight enumerator of a for-
mally self-dual code C of Type I, or a self-dual code C of Type II, III or IV meeting the
corresponding bound from Theorem 4.3.1. Let c = 2, 4, 3, 2, respectively, and µ = ⌊n/8⌋,
⌊n/24⌋, ⌊n/12⌋, ⌊n/6⌋. Then the coefficient A_{c(µ+2)} in HweC(x, y) is negative if and only if
Type I: n = 8i (i ≥ 4), n = 8i + 2 (i ≥ 5), n = 8i + 4 (i ≥ 6), n = 8i + 6 (i ≥ 7).
Type II: n = 24i (i ≥ 154), n = 24i + 8 (i ≥ 159), n = 24i + 16 (i ≥ 164). In particular,
C cannot exist for n > 3928.
Type III: n = 12i (i ≥ 70), n = 12i + 4 (i ≥ 75), n = 12i + 8 (i ≥ 78). So code C cannot
exist for n > 932.
Type IV: n = 6i (i ≥ 17), n = 6i + 2 (i ≥ 20), n = 6i + 4 (i ≥ 22). In particular, C
cannot exist for n > 130.
Theorem 4.3.4 ([1008]) There exist self-dual binary codes of length n meeting the bound
of Theorem 4.3.1(a) if and only if n = 2, 4, 6, 8, 12, 14, 22, and 24. There exist even
formally self-dual binary codes of length n meeting the bound of Theorem 4.3.1(a) if and
only if n is even with n ≤ 30, n 6= 16, and n 6= 26. For all these codes, the weight enumerator
is uniquely determined by the length.
Remark 4.3.5 The formally self-dual even codes meeting the bound of Theorem 4.3.1(a)
are called extremal. These extremal codes are classified. The number of all inequivalent
extremal formally self-dual even codes is given in Table 4.1 ([275, Table 1]). Table 1 in [912]
lists the number of all inequivalent extremal even codes, but for length 30 the number is
given by ≥ 6; the classification of the [30, 15, 8] formally self-dual even codes is completed in
[275]. In [867], Gulliver and Östergård classified all binary optimal linear [n, n/2] codes up
to length 28. Moreover, they proved that at least one linear [n, n/2, dmax(n) ] code is formally
self-dual for lengths n ≤ 64. The smallest length for which a formally self-dual code is not
isodual is 14, and there are 28 such codes amongst 6 weight enumerators. Simonis [1715]
showed that the extended quadratic residue code of length 18 is the unique [18, 9, 6] code,
up to equivalence. Since its dual distance is 6, this code is isodual. Jaffe [1031] proved that
there is a unique linear [28, 14, 8] code. In [728] the authors showed that there are exactly 7
inequivalent [20, 10, 6] even extremal formally self-dual binary codes and that there are over
1000 inequivalent [22, 11, 6] even extremal formally self-dual binary codes. The classification
for length 22 was completed in [912]. The formally self-dual codes of smaller lengths were
constructed and classified in [82, 188, 727, 1105].
Conway and Sloane [440] obtained a new upper bound for the minimum distance of
singly-even self-dual codes, which was later improved by Rains [1553].
Theorem 4.3.6 (Rains [1553]) If C is a binary self-dual [n, n/2, d] code, then

d ≤ 4⌊n/24⌋ + 4 if n ≢ 22 (mod 24),
d ≤ 4⌊n/24⌋ + 6 if n ≡ 22 (mod 24).
Definition 4.3.7 A self-dual code is called extremal if it meets the bound in Theorem
4.3.6. The code is called optimal if there exists no self-dual code with larger minimum
distance for the same length.
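The Rains bound is straightforward to tabulate; the following sketch (the helper name is ours) evaluates it for a few lengths mentioned in this section:

```python
def extremal_bound(n):
    # Rains bound on the minimum distance of a binary self-dual [n, n/2] code
    return 4 * (n // 24) + (6 if n % 24 == 22 else 4)

# The extended Golay code [24,12,8] and the extended QR code [48,24,12] are extremal
assert extremal_bound(24) == 8
assert extremal_bound(48) == 12
assert extremal_bound(72) == 16   # existence of an extremal [72,36,16] code is open
assert extremal_bound(46) == 10   # n = 46 ≡ 22 (mod 24), so the bound is 4*1 + 6
```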
Remark 4.3.8 All extremal self-dual codes are optimal but the opposite is not true. For
example, there are no extremal self-dual codes of length n = 26. The situation with the
optimal self-dual binary codes of length n for n = 2, 4, . . . , 72 is presented in Table 4.2 which
updates [1008, Table 9.1], [1423, Table 11.2], and [1555, Table X]. In the table, dI is the
largest minimum weight for which a Type I code is known to exist while dII is the largest
minimum weight for which a Type II code is known to exist. The superscript E indicates
that the code is extremal; the superscript O indicates the code is not extremal but optimal.
The number of inequivalent Type I and II codes of the given minimum weight is listed under
numI and numII , respectively. The optimal codes of length 34 have been classified in [205]
and [21] independently, the extremal codes of length 36 have been classified in [21]. The
papers [22] and [277] have presented the classification of the [38, 19, 8] codes using different
methods. The singly-even and doubly-even extremal self-dual codes of length 40 have been
classified in [277] and [189], respectively. For lengths 42 and 44, the extremal self-dual codes
having automorphisms of odd prime order have been classified (see [284, 285]). For the
larger lengths we give the same bound as in [1423]; nevertheless, many new codes have since been
constructed.
Conjecture 4.3.9 Extremal binary self-dual codes of lengths n ≡ 2, 4, 6, and 10 (mod 24)
do not exist.
Remark 4.3.10 ([1008]) The [8, 4, 4] extended Hamming code Ĥ3,2 of Example 1.7.2 is
the unique Type II code of length 8 indicated in Table 4.2. Two of the five [32, 16, 8]
Type II codes are extended quadratic residue and Reed–Muller codes. The [24, 12, 8] and
[48, 24, 12] binary extended quadratic residue codes are extremal Type II codes, but the
binary extended quadratic residue code of length 72 has minimum weight 12 and hence is
not extremal. The first two lengths for which the existence of an extremal Type II code is
undecided are 72 and 96. For lengths beyond 96, extremal Type II codes are known to exist
only for lengths 104 and 136.
Remark 4.3.11 The extremal self-dual codes of length a multiple of 24 are of particular
interest. According to Theorem 4.3.6 these codes are doubly-even. Moreover, for any nonzero
weight w in such a code, the codewords of weight w hold a 5-design; see Chapter 5. The
extended Golay code G24 is the only [24, 12, 8] code and the extended quadratic residue code
q48 is the only [48, 24, 12] self-dual doubly-even code. The automorphism group of G24 is
the 5-transitive Mathieu group M24 of order 210 · 33 · 5 · 7 · 11 · 23; see Theorem 1.13.5. The
automorphism group of q48 is isomorphic to the projective special linear group PSL2 (47)
and has order 24 · 3 · 23 · 47. Both M24 and PSL2 (47) are non-abelian simple groups, and so
in particular are not solvable. These two codes are the only known extremal self-dual codes
of length a multiple of 24. The existence of an extremal code of length 72 is a long-standing
open problem; see Example 3.2.14 for a discussion. A series of papers investigates the
structure of its automorphism group excluding most of the subgroups of the symmetric group
S72 [238, 239, 241, 273, 274, 280, 437, 723, 1010, 1422, 1516, 1525, 1930, 1922, 1931]. The
following theorem summarizes the works of Pless, Conway, Thompson, Huffman, Yorgov,
Bouyuklieva, O’Brien, Willems, Yankov, Feulner, Nebe, Borello, and Dalla Volta.
Theorem 4.3.12 Let C be a self-dual [72, 36, 16] code. Then PAut(C) is trivial or it is
isomorphic to C2 , C3 , C2 × C2 , or C5 where Ci is the cyclic group of order i.
Remark 4.3.13 The possible automorphism groups of a putative extremal self-dual code
of length 72 are abelian and very small. So this code is almost a rigid object (i.e., without
symmetries) and it might be very difficult to find, if it exists.
Remark 4.3.14 Extremal ternary self-dual (Type III) codes do not exist for lengths n =
72, 96, 120, and all n ≥ 144. Extremal Type III codes of length n are known to exist for
all n divisible by 4 with 4 ≤ n ≤ 64; all other lengths through 140 except 72, 96, and 120
remain undecided. The known results about the extremal ternary codes are summarized in
[1005, Table 6].
Remark 4.3.15 By [1943], no extremal Type IV codes exist for lengths n = 102, 108,
114, 120, 122, 126, 128, and n ≥ 132. The existence of extremal codes of even lengths
32 ≤ n ≤ 130 except the lengths given above is unknown. We refer to [1005] for more
details about the extremal and optimal Type IV codes.
For the definition of t-designs used in the next theorem, see Chapter 5.
Theorem 4.3.16 ([1008]) The following results on t-designs hold in extremal codes of
Types II, III, and IV.
(a) Let C be a [24m + 8µ, 12m + 4µ, 4m + 4] extremal Type II code for µ = 0, 1, or 2. Then
codewords of any fixed weight except 0 hold t-designs for the following parameters:
(i) t = 5 if µ = 0 and m ≥ 1,
(ii) t = 3 if µ = 1 and m ≥ 0, and
(iii) t = 1 if µ = 2 and m ≥ 0.
(b) Let C be a [12m + 4µ, 6m + 2µ, 3m + 3] extremal Type III code for µ = 0, 1, or 2.
Then codewords of any fixed weight i with 3m + 3 ≤ i ≤ 6m + 3 hold t-designs for the
following parameters:
(i) t = 5 if µ = 0 and m ≥ 1,
(ii) t = 3 if µ = 1 and m ≥ 0, and
(iii) t = 1 if µ = 2 and m ≥ 0.
(c) Let C be a [6m + 2µ, 3m + µ, 2m + 2] extremal Type IV code for µ = 0, 1, or 2. Then
codewords of any fixed weight i with 2m + 2 ≤ i ≤ 3m + 2 hold t-designs for the
following parameters:
(i) t = 5 if µ = 0 and m ≥ 2,
(ii) t = 3 if µ = 1 and m ≥ 1, and
(iii) t = 1 if µ = 2 and m ≥ 0.
4.4 Construction Methods

4.4.1 Gluing Theory

Example 4.4.2 Construct the self-dual [14, 7, 4] code using gluing. We choose two gluing
components C1 ≅ C2 ≅ e7, where e7 is the [7, 3, 4] code H3,2⊥, the dual of the Hamming code
H3,2; see Example 1.5.3. We need one glue vector. Since (0101010) + e7 is a coset in e7⊥
with minimum weight 3, we can take for a glue vector v = (0101010|0101010). So we have
a generator matrix of the code in the form

G = [ 0001111 0000000
      0110011 0000000
      1010101 0000000
      0000000 0001111
      0000000 0110011
      0000000 1010101
      0101010 0101010 ].
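The claimed properties of this glued code can be verified mechanically. The following sketch (our illustration, not from the book) checks that the seven rows above are pairwise orthogonal over F2 and that the 2^7 codewords they span have minimum weight 4:

```python
from itertools import product

rows = ["00011110000000", "01100110000000", "10101010000000",
        "00000000001111", "00000000110011", "00000001010101",
        "01010100101010"]
G = [tuple(int(c) for c in r) for r in rows]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v)) % 2

# Self-orthogonality: every pair of rows (a row with itself included) is orthogonal
assert all(dot(G[i], G[j]) == 0 for i in range(7) for j in range(7))

# Enumerate the 2^7 codewords and check the minimum weight is 4
code = {tuple(sum(c * g[j] for c, g in zip(cs, G)) % 2 for j in range(14))
        for cs in product([0, 1], repeat=7)}
assert len(code) == 128                       # dimension 7 = n/2, so C = C-perp
assert min(sum(v) for v in code if any(v)) == 4
```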
4.4.2 Circulant Constructions

A circulant matrix is a square matrix in which each row is the cyclic shift (by one
position to the right) of the row above it:

A = [ a1  a2  ···  am
      am  a1  ···  am−1
      ⋮
      a2  a3  ···  a1 ].

A bordered circulant matrix has the form

B = [ α  β  ···  β
      γ
      ⋮       A
      γ ],

where A is an m × m circulant matrix.
Definition 4.4.4 We say that a code has a double circulant generator matrix or a
bordered double circulant generator matrix provided it has a generator matrix of the
form
[ I_m | A ]  or  [ I_{m+1} | B ],
where A is an m × m circulant matrix and B is an (m + 1) × (m + 1) bordered circulant
matrix, respectively. A code has a double circulant construction or a bordered double
circulant construction provided it has a double circulant or bordered double circulant
generator matrix, respectively.
Proposition 4.4.5 A code with a double circulant construction is isodual. A code with a
bordered double circulant construction is isodual provided the bordered matrix used in the
construction satisfies either β = γ = 0 or both β and γ are nonzero.
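As a small illustration of the pure double circulant construction (our sketch, with an assumed first row (0, 1, 1, 1)), the matrix [ I4 | A ] with A = circulant(0, 1, 1, 1) generates a binary self-dual [8, 4, 4] code, since A Aᵀ = I over F2:

```python
from itertools import product

def circulant(first_row):
    m = len(first_row)
    return [tuple(first_row[(j - i) % m] for j in range(m)) for i in range(m)]

# Double circulant generator matrix [ I | A ] over F_2 with A = circulant(0,1,1,1)
A = circulant((0, 1, 1, 1))
m = 4
G = [tuple(int(i == r) for i in range(m)) + A[r] for r in range(m)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v)) % 2

# [I | A] generates a self-dual code iff A A^T = I over F_2, i.e., the rows of G
# are pairwise orthogonal and each row has even weight
assert all(dot(G[i], G[j]) == 0 for i in range(m) for j in range(m))

code = {tuple(sum(c * g[k] for c, g in zip(cs, G)) % 2 for k in range(2 * m))
        for cs in product([0, 1], repeat=m)}
assert min(sum(v) for v in code if any(v)) == 4   # an [8,4,4] self-dual code
```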
90 Concise Encyclopedia of Coding Theory
Remark 4.4.6 All extremal binary self-dual codes of lengths up to 90 with circulant and
bordered circulant constructions are classified in a series of papers by Gulliver and Harada
[864, 865, 905].
Remark 4.4.7 For the structure of the quasi-cyclic self-dual codes we refer to the tetralogy
of papers of Ling, Niederreiter, and Solé [1268, 1269, 1271, 1272] and Chapter 7.
Remark 4.4.8 Double negacirculant (DN) codes are the analogs in odd characteristic of
double circulant codes. Four-negacirculant codes are introduced in [906]. Self-dual negacir-
culant and other generalizations of quasi-cyclic codes are studied by many authors; see, for
example, [29, 866, 906].
Remark 4.4.10 If a self-dual code has length n > 2 and minimum distance d > 2, then no
two of its coordinates are equal. On the other hand, if a self-dual code of length n > 2 has
minimum distance d = 2, then it has two coordinates which are equal.
Remark 4.4.11 Any self-dual [n, n/2, 2] code Cn for n > 2 is decomposable as Cn =
i2 ⊕ Cn−2, where Cn−2 is a self-dual code of length n − 2 and i2 is the binary repetition code
of length 2. Furthermore, up to equivalence, the number of self-dual [n, n/2, 2] codes is
equal to the number of all self-dual [n − 2, n/2 − 1] codes for n > 2.
G = [ 1   0   x
      y1  y1
      ⋮   ⋮   G0
      yn  yn ]
Theorem 4.4.17 ([1119]) Any self-dual code C over F2 of length n with minimum weight
d > 2 is obtained from some self-dual code C0 of length n − 2 (up to equivalence) by the
construction in Theorem 4.4.16.
Remark 4.4.18 Choosing G0 = [ I_n | A ] and x = (x1 x2 ··· x_{n/2} 1 1 ··· 1) in Theorem
4.4.16, we obtain the Harada Construction as a corollary.
Remark 4.4.19 Let C be a binary self-dual code and Cd be its subcode generated by the
set of codewords with minimum weight d. If Gd is a generator matrix of Cd with rank k,
and GE is an (n/2 − k) × n matrix such that Gd stacked over GE is a generator matrix of C,
then the matrix

[ 1          0          x
  1          1
  ⋮          ⋮          Gd
  1          1
  a1         a1
  ⋮          ⋮          GE
  a_{n/2−k}  a_{n/2−k}     ],
where ai ∈ F2, 1 ≤ i ≤ n/2 − k, and the even weight vector (1, 0, x) ∈ F2^{n+2} is orthogonal to all
other rows of the matrix, generates a self-dual code. To obtain only [n + 2, n/2 + 1, d + 2]
codes, the authors of [22] take only those vectors x ∈ C ⊥ \ C for which the constructed code
has minimum distance d + 2.
and say that σ is of type p-(c, f ). We present the main theorems about the structure of
such a code. This structure has been used by many authors in order to construct optimal
self-dual codes with different parameters.
Theorem 4.4.20 ([999]) Let C be a binary [n, n/2] code with automorphism σ from (4.3).
Let Ω1 = {1, 2, . . . , p}, . . . , Ωc = {(c − 1)p + 1, (c − 1)p + 2, . . . , cp} denote the cycles of σ,
and let Ωc+1 = {cp + 1}, . . . , Ωc+f = {cp + f = n} be the fixed points of σ. Define
Theorem 4.4.21 ([1928]) Let C be a binary [n, n/2] code with automorphism σ from (4.3).
Let π : Fσ(C) → F2^{c+f} be the projection map, where, for v ∈ Fσ(C), (π(v))_i = v_j for
some j ∈ Ω_i, i = 1, 2, . . . , c + f. Let E (respectively P) be the set of all even-weight vec-
tors in F2^p (respectively even-weight polynomials in F2[x]/⟨x^p − 1⟩). Define ϕ0 : E → P by
ϕ0(v0 v1 ··· v_{p−1}) = v0 + v1 x + ··· + v_{p−1} x^{p−1}. Let Eσ(C)* be Eσ(C) punctured on all the
fixed points of σ. Define ϕ : Eσ(C)* → P^c by ϕ(v) = (ϕ0(v|Ω1), ϕ0(v|Ω2), . . . , ϕ0(v|Ωc)) for
v ∈ Eσ(C)* ⊆ E^c. Then C is self-dual if and only if the following two conditions hold:

(a) Cπ = π(Fσ(C)) is a binary self-dual code of length c + f, and

(b) for every two vectors u, v ∈ Cϕ = ϕ(Eσ(C)*), we have ∑_{i=1}^{c} u_i(x) v_i(x^{−1}) = 0, where
u_i(x) = ϕ0(u|Ω_i) and v_i(x) = ϕ0(v|Ω_i) for i = 1, 2, . . . , c.
Theorem 4.4.22 ([999]) Let 2 be a primitive root modulo p. Then the binary code C with
an automorphism σ is self-dual if and only if the following two conditions hold:
(a) Cπ is a self-dual binary code of length c + f , and
(b) Cϕ = ϕ(Eσ(C)*) is a self-dual code of length c over the field P under the inner product

    (u, v) = ∑_{i=1}^{c} u_i v_i^{2^{(p−1)/2}}.
Example 4.4.23 Consider the extended Hamming code Ĥ3,2 of Example 1.7.2. It has
parameters [8, 4, 4] and automorphism group of order 7 · 3 · 2^6. Let σ be an automorphism
of Ĥ3,2 of type 3-(2, 2). Then Cπ is the binary self-dual [4, 2, 2] code with generator matrix

[ 1 0 1 0
  0 1 0 1 ],

and Cϕ is the [2, 1, 2]4 code with generator matrix [ e  e ], where e is
the identity element of the field P ≅ F4. Here P = {0, x + x^2, 1 + x^2, 1 + x} is the field
F4 = {0, 1, ω, ω^2} with 1 ↔ e = x + x^2, ω ↔ 1 + x^2, and ω^2 ↔ 1 + x. Hence we can write a
generator matrix of Ĥ3,2 in the form

[ 111 000 1 0
  000 111 0 1
  011 011 0 0
  101 101 0 0 ].

The first two rows of this matrix come from the generator matrix for Cπ with the left two
columns replaced by 000 and 111 corresponding to the two 3-cycles of σ. The bottom two
rows come from the row of the generator matrix for Cϕ followed by that row multiplied by
ω ↔ 1 + x^2.
4.5 Enumeration and Classification

There are mass formulas for all families of self-dual and also of self-orthogonal codes. Detailed
information is presented in [1008, 1555]. See also Proposition 7.5.1.
Theorem 4.5.2 We have the following mass formulas.

(a) For self-dual binary codes of even length n,

    ∑_j n!/|PAut(Cj)| = ∏_{i=1}^{n/2−1} (2^i + 1).
In each case, the summation is over all j, where {Cj } is a complete set of representatives
of inequivalent codes of the given type. The automorphism group ΓAut(Cj ) is the set of all
semi-linear monomial transformations from Fn4 to Fn4 that fix Cj ; see [1008, Section 1.7].
Example 4.5.3 Consider the binary self-dual codes of length 8. The code i2^4, which is the
direct sum of 4 [2, 1, 2]2 repetition codes, and the extended Hamming code Ĥ3,2 are of this
type. Since PAut(i2^4) ≅ Z2 ≀ S4 and |PAut(Ĥ3,2)| = 7 · 3 · 64, we have

8!/(2^4 · 4!) + 8!/(7 · 3 · 64) = 105 + 30 = 135 = (2 + 1)(4 + 1)(8 + 1).
This proves, by Theorem 4.5.2(a), that these two codes are all of the inequivalent binary
self-dual codes of length 8.
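The arithmetic in this example is easy to replay (our sketch; the automorphism group orders are taken from the example above):

```python
from math import factorial, prod

n = 8
rhs = prod(2**i + 1 for i in range(1, n // 2))   # (2+1)(4+1)(8+1) = 135

# |PAut(i_2^4)| = |Z_2 wr S_4| = 2^4 * 4! = 384, |PAut(extended Hamming)| = 7*3*64 = 1344
lhs = factorial(8) // (2**4 * factorial(4)) + factorial(8) // (7 * 3 * 64)

assert lhs == 105 + 30 == rhs == 135
```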
Lemma 4.5.4 (Thompson [1800]) Let n be an even positive integer. Let HweC(x) =
HweC(x, 1) where HweC(x, y) is the weight enumerator of a code C. Then

∑_C HweC(x) = (1 + x^n) ∏_{i=1}^{n/2−1} (2^i + 1) + ∑_{j=1}^{n/2−1} \binom{n}{2j} ∏_{i=1}^{n/2−2} (2^i + 1) x^{2j},

where the sum on the left is over all binary self-dual codes C of length n.
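Lemma 4.5.4 can be verified exhaustively for the smallest interesting case n = 4, where there are exactly three self-dual codes. A brute-force sketch (our illustration):

```python
from itertools import product, combinations
from math import comb, prod

n = 4
vectors = list(product([0, 1], repeat=n))

def span(rows):
    return {tuple(sum(c * g[j] for c, g in zip(cs, rows)) % 2 for j in range(n))
            for cs in product([0, 1], repeat=len(rows))}

def is_self_dual(code):
    # Self-orthogonal and of dimension n/2
    return len(code) == 2 ** (n // 2) and all(
        sum(a * b for a, b in zip(u, v)) % 2 == 0 for u in code for v in code)

# All distinct self-dual codes of length 4, found as spans of vector pairs
codes = [c for c in {frozenset(span(pair)) for pair in combinations(vectors, 2)}
         if is_self_dual(c)]
assert len(codes) == 3

# Left side of Lemma 4.5.4: sum of weight enumerators, as coefficient lists
total = [0] * (n + 1)
for c in codes:
    for v in c:
        total[sum(v)] += 1

# Right side of Lemma 4.5.4
rhs = [0] * (n + 1)
p1 = prod(2**i + 1 for i in range(1, n // 2))      # = 3
p2 = prod(2**i + 1 for i in range(1, n // 2 - 1))  # empty product = 1
rhs[0] += p1
rhs[n] += p1
for j in range(1, n // 2):
    rhs[2 * j] += comb(n, 2 * j) * p2

assert total == rhs == [3, 0, 6, 0, 3]
```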
Theorem 4.5.5 ([910]) Let n and d be even positive integers and let U be a family of
inequivalent binary self-dual codes of length n and minimum weight at most d. Then U is
a complete set of representatives for equivalence classes of self-dual codes of length n and
minimum weight at most d if and only if
∑_{C∈U} (n!/|PAut(C)|) · |{v ∈ C | wtH(v) = d}| = \binom{n}{d} ∏_{i=1}^{n/2−2} (2^i + 1).
Remark 4.5.6 The last formula makes it possible to verify the classification even when only
the self-dual codes of length n and minimum distance ≤ d < 4⌊n/24⌋ + 4 have been constructed.
The classification of binary self-dual codes began in the seventies in the work of Vera
Pless [1515], where she classified the codes of length n ≤ 20. The self-dual codes of lengths
22 and 24 were classified by Pless and Sloane [1523], and for lengths n = 26, 28, 30, by
Conway and Pless [436]. The classification of the doubly-even self-dual codes of length 32
was presented in the paper [436]. All these results are summarized in the survey of Conway,
Pless and Sloane [439]. To construct the codes, the authors used gluing theory.
The classification of binary self-dual codes of length 34 was given in [205]. Harada and
Munemasa in [910] completed the classification of the self-dual codes of length 36 and
created a database of self-dual codes [908]. The doubly-even self-dual codes of length 40
were classified by Betsumiya, Harada and Munemasa [189]. Moreover, using these codes,
they classified the optimal self-dual [38, 19, 8] codes. All binary self-dual codes of lengths 38
and 40 were classified in [276] and [269], respectively.
Table 4.3 summarizes the classification of binary self-dual codes of length n ≤ 40. The
number of all inequivalent singly-even (Type I) and doubly-even (Type II) codes is denoted
by #I and #II, respectively. In the table dmax,I (dmax,II) is the largest minimum distance
for which singly-even (doubly-even) self-dual codes of the corresponding length exist, and
#max,I (#max,II) is their number.
Remark 4.5.7 Actually, the construction itself is easy; the equivalence test is the difficult
part of the classification. For that reason, at larger lengths the recursive constructions are
mostly used heuristically, to build examples of codes with particular properties. There
are also many partial classifications, namely classifications of self-dual codes with special
properties, for example self-dual codes invariant under a given permutation, or self-dual
codes connected with combinatorial designs with given parameters. These classifications
are not recursive, but they use codes of smaller lengths; that is why the full classification
is very important in these cases, too.
Remark 4.5.8 In [276], the authors present an algorithm for generating binary self-dual
codes which gives as output exactly one representative of every equivalence class. To de-
velop this algorithm, the authors use an approach introduced by Brendan McKay known
as isomorph-free exhaustive generation [1369]. The constructive part of the algorithm
is no different from the other recursive constructions for self-dual codes, but a completely
different technique is used to take only one representative of each equivalence class. This
approach dramatically speeds up the generation of the inequivalent codes of a given
length. Its special feature is that there is practically no equivalence test for the objects.
Conjecture 4.5.9 Denote by SDk the number of all inequivalent binary self-dual codes of
dimension k. The sequence {a_k = SD_k (2k)! / ∏_{i=1}^{k−1} (2^i + 1)} is decreasing for k ≥ 10.
Remark 4.5.10 For classification results for families of self-dual codes over larger fields
we refer to [1005, 1008, 1555].
Chapter 5
Codes and Designs
Vladimir D. Tonchev
Michigan Technological University
5.1 Introduction 97
5.2 Designs Supported by Codes 98
5.3 Perfect Codes and Designs 99
5.4 The Assmus–Mattson Theorem 101
5.5 Designs from Codes Meeting the Johnson Bound 102
5.6 Designs and Majority Logic Decoding 105
5.7 Concluding Remarks 109
5.1 Introduction
Combinatorial designs often arise in codes that are optimal with respect to certain
bounds. This chapter summarizes some important links between codes and combinatorial
designs. Other references that discuss various relations between codes and designs are [71,
1008, 1814, 1816].
Section 5.2 discusses designs associated with supports of codewords of given nonzero weight,
and how the regularity of such designs is related to transitivity properties of the automorphism
group of the code.
In Section 5.3, a necessary and sufficient condition for a code to be perfect, that is, to
attain the Sphere Packing Bound, is given in terms of a design supported by the codewords
of minimum weight. In particular, the codewords of minimum weight in any binary perfect
single-error-correcting code (linear or nonlinear) that contains the zero vector, support a
Steiner triple system, while the minimum weight codewords in the extended code support
a Steiner quadruple system.
The celebrated Assmus–Mattson Theorem that provides sufficient conditions for the
codewords of given weight in a linear code to support a t-design is discussed in Section 5.4,
together with examples of designs obtained from extremal self-dual codes.
Section 5.5 contains Delsarte’s generalization of the Assmus–Mattson Theorem for non-
linear codes. Delsarte’s Theorem yields designs from nonlinear binary codes meeting the
Johnson Bound, and 3-designs from the Preparata, Kerdock and Goethals codes in partic-
ular. Two infinite classes of 4-designs obtained from Preparata codes are also presented.
The main topic of Section 5.6 is combinatorial designs supported by linear codes that
can be used for majority logic decoding of their dual codes. Hamada’s Conjecture about
the minimum p-rank of designs derived from a finite projective or affine geometry, as well
as the proved cases of the conjecture and the known counter-examples, are discussed.
The final section, Section 5.7, gives references to other links between codes and designs that are
not covered in this chapter.
Definition 5.1.1 A combinatorial design D (or a design for short) is a pair of a finite
set X = {x_i}_{i=1}^v of points and a collection B = {B_j}_{j=1}^b of subsets B_j ⊆ X called blocks
[187].
Any t-(v, k, λ) design is also an s-(v, k, λ_s) design for every 0 ≤ s ≤ t, where λ_s is given
by

  λ_s = λ \binom{v−s}{t−s} / \binom{k−s}{t−s}.   (5.1)

In particular, the number of blocks b is equal to

  b = λ_0 = λ \binom{v}{t} / \binom{k}{t}.   (5.2)
The incidence matrix of a design is a (0, 1)-matrix A = (Ai,j ), with rows labeled by the
blocks and columns labeled by the points, such that Ai,j = 1 if the ith block contains the
j th point, and Ai,j = 0 otherwise. A design is simple if there are no repeated blocks, or
equivalently, if all rows of its incidence matrix are distinct.
Two designs are isomorphic if there is a bijection between their point sets that maps
the blocks of the first design to blocks of the second design. An automorphism of a
design is a permutation of the points that preserves the collection of blocks. The set of all
automorphisms of a design D is a permutation group called the automorphism group of
D.
The complementary design D of a design D is obtained by replacing every block of
D by its complement.
A Steiner design (or Steiner system) is a t-design with λ = 1. An alternative notation
used for a Steiner t-(v, k, 1) design is S(t, k, v). A Steiner triple system STS(v) is a Steiner
2-design with block size 3. A Steiner quadruple system SQS(v) is a Steiner 3-design with
block size 4.
Remark 5.2.2 If C is a linear code over a finite field of order q > 2, and c is a codeword
of weight w > 0, all q − 1 nonzero scalar multiples of c have the same support. To avoid
repeated blocks, we associate only one block with all scalar multiples of c. Suppose that
D is a t-(n, w, λ) design supported by a linear q-ary code C. It follows that the number of
blocks b of D is smaller than or equal to A_w/(q − 1), where A_w is the number of codewords
of weight w. If the support of every codeword of weight w is a block of D, then we have
b = A_w/(q − 1).
Theorem 5.2.3 If a code is invariant under a monomial group that acts t-transitively or
t-homogeneously on the set of coordinates, the supports of the codewords of any nonzero
weight form a t-design.
Corollary 5.2.4 If C is a cyclic code of length n, the supports of all codewords of any
nonzero weight w form a 1-design.
Example 5.2.5 The binary Hamming code of length 7 is a linear cyclic code with generator
polynomial g(x) = 1 + x + x^3 and weight distribution

  A_0 = A_7 = 1,  A_3 = A_4 = 7.
By Corollary 5.2.4 and equations (5.1) and (5.4), the sets of codewords of weight 3 and 4
support 1-designs with parameters 1-(7, 3, 3) and 1-(7, 4, 4), respectively. The full automor-
phism group of the code is of order 168 and acts 2-transitively on the set of coordinates.
By Theorem 5.2.3, the 1-designs supported by the codewords of weight 3 and 4 are actually
2-designs, with parameters 2-(7, 3, 1) and 2-(7, 4, 2), respectively. The incidence matrix of
the 2-(7, 3, 1) design supported by the codewords of weight 3 is
  1 1 0 1 0 0 0
  0 1 1 0 1 0 0
  0 0 1 1 0 1 0
  0 0 0 1 1 0 1
  1 0 0 0 1 1 0
  0 1 0 0 0 1 1
  1 0 1 0 0 0 1 .
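Example 5.2.5 is small enough to verify by brute force. The following sketch (Python, purely illustrative) enumerates the 16 codewords spanned by the cyclic shifts of g(x) = 1 + x + x^3, recomputes the weight distribution, and checks that the weight-3 supports form a 2-(7, 3, 1) design.

```python
from itertools import combinations, product

n = 7
g = (1, 1, 0, 1, 0, 0, 0)  # coefficients of g(x) = 1 + x + x^3
# The four cyclic shifts g, xg, x^2 g, x^3 g span the [7, 4] code.
shifts = [tuple(g[(i - s) % n] for i in range(n)) for s in range(4)]

code = {tuple(sum(c * row[i] for c, row in zip(coeffs, shifts)) % 2
              for i in range(n))
        for coeffs in product((0, 1), repeat=4)}

wd = {}
for word in code:
    wd[sum(word)] = wd.get(sum(word), 0) + 1
print({w: wd[w] for w in sorted(wd)})  # {0: 1, 3: 7, 4: 7, 7: 1}

# Supports of the weight-3 codewords: every pair of points lies in exactly one.
blocks = [frozenset(i for i in range(n) if word[i])
          for word in code if sum(word) == 3]
for pair in combinations(range(n), 2):
    assert sum(set(pair) <= b for b in blocks) == 1
```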
Example 5.2.6 The [24, 12, 8] extended binary Golay code G24 is invariant under the 5-
transitive Mathieu group M24, while the [12, 6, 6] extended ternary Golay code G12 is
invariant under M̃12, a central extension of the Mathieu group M12 that acts 5-transitively
on the coordinates; see Section 1.13. Both codes support 5-designs by Theorem 5.2.3.
Example 5.2.7 The binary Reed–Muller code RM(r, m) of length 2^m (m ≥ 2) and order
r (1 ≤ r ≤ m − 1) is invariant under the 3-transitive group of all affine transformations
of the m-dimensional binary vector space F_2^m. Consequently, the codewords of any nonzero
weight w < 2^m support a 3-design.
Theorem 5.4.1 (Assmus–Mattson) Let C be an [n, k, d] linear code over F_q, and let C⊥
be the dual [n, n − k, d⊥] code. Denote by n_0 the largest integer smaller than or equal to n
such that

  n_0 − ⌊(n_0 + q − 2)/(q − 1)⌋ < d,

and define n_0⊥ similarly for the dual code C⊥. Suppose that for some integer t with 0 < t < d,
there are at most d − t nonzero weights w in C⊥ such that w ≤ n − t. Then

(a) the codewords in C of any weight u with d ≤ u ≤ n_0 support a t-design;

(b) the codewords in C⊥ of any weight u with d⊥ ≤ u ≤ min{n − t, n_0⊥} support a t-design.
The largest value of t for any known t-design supported by a code via the Assmus–
Mattson Theorem is t = 5. All such 5-designs come from self-dual codes.
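The hypothesis of the Assmus–Mattson Theorem can be tested numerically once the nonzero dual weights are known. A small sketch (Python; the helper `largest_am_t` is an ad hoc name) recovers t = 5 for both extended Golay codes, which are self-dual:

```python
def largest_am_t(n, d, dual_weights):
    # Largest t with 0 < t < d such that at most d - t nonzero dual
    # weights w satisfy w <= n - t (the Assmus-Mattson hypothesis).
    ts = [t for t in range(1, d)
          if sum(1 for w in dual_weights if 0 < w <= n - t) <= d - t]
    return max(ts) if ts else 0

print(largest_am_t(24, 8, [8, 12, 16, 24]))  # 5: extended binary Golay G24
print(largest_am_t(12, 6, [6, 9, 12]))       # 5: extended ternary Golay G12
```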
An [n, k] code C is self-orthogonal if C ⊆ C ⊥ and self-dual if C = C ⊥ . The length of a
self-dual code is even and n = 2k. A code is even if all weights are even, and doubly-even
if it is binary and all weights are divisible by 4. A binary code is singly-even if it is even
but not doubly-even.
The next theorem gives an upper bound on the minimum distance of a self-dual code
over F_q for q = 2, 3, and 4. We note that the ordinary inner product is used to define the
dual code of a binary or ternary code, while when q = 4, we use the Hermitian inner
product (x, y), defined by the equation

  (x, y) = Σ_{i=1}^{n} x_i y_i^2

(conjugation in F_4 is given by x̄ = x^2). A code over F_4 is self-orthogonal with respect to
the Hermitian inner product if and only if it is even.
A self-dual code whose distance d achieves equality in Theorem 5.4.3 is called extremal.1
Extremal self-dual codes can exist only for finitely many values of n, but the spectrum is
unknown; see Theorem 4.3.3. The smallest n ≡ 0 (mod 24) for which the existence of a
binary doubly-even self-dual code is unknown is n = 72. A data base of self-dual codes is
available online at [908]; see also Chapter 4.
Self-dual extremal codes yield t-designs with t ≤ 5 via the Assmus–Mattson Theorem
5.4.1; see also Theorem 4.3.16.
1 With improved bounds, another meaning of the term ‘extremal’ has developed over time for binary
self-dual codes; see Section 4.3 and, in particular, Definition 4.3.7 for this alternate meaning of ‘extremal’.
Remark 5.4.5 The weights of codewords that support designs depend on the value of n0 ;
see Remark 5.4.2.
Example 5.4.6 The nonzero weights in the [24, 12, 8] extended binary self-dual Golay code
G24 are 8, 12, 16, and 24. The nonzero weights in the [12, 6, 6] extended ternary self-dual
Golay code G12 are 6, 9, and 12. Both codes support 5-designs by the Assmus–Mattson
Theorem 5.4.1.
A table with parameters of t-designs obtained from self-dual codes is given in [1816,
Table 1.61, page 683].
Remark 5.4.7 There are extremal binary self-dual codes that do not satisfy the conditions
of the Assmus–Mattson Theorem 5.4.1 or Theorem 5.4.4, but some of their punctured codes
support 2-designs. An example of this phenomenon is a [50, 25, 10] extremal binary code
C50 discovered in [1009] with Hamming weight enumerator (5.6).
All minimum weight codewords of C50 share one nonzero position, and the codewords of
minimum weight of the punctured code C50* with respect to this position support a 2-(49, 9, 6)
design D, having the additional property that every two distinct blocks intersect each other
in either one or three points.
A 2-design that has only two distinct block intersection numbers is called quasi-
symmetric. Quasi-symmetric designs are related to other interesting combinatorial struc-
tures, such as strongly regular graphs and association schemes; see [308, Section VII.11] and
[1814, Section 6].
More [50, 25, 10] binary self-dual codes with weight enumerator (5.6) whose punctured
codes support quasi-symmetric 2-(49, 9, 6) designs were later found in [278, 909].
The MacWilliams transform {B′_i}_{i=0}^{n} of the distance distribution {B_i}_{i=0}^{n} of an
(n, M, d)_2 code C is defined by the identity

  2^n Σ_{j=0}^{n} B_j x^j = M Σ_{i=0}^{n} B′_i (1 + x)^{n−i} (1 − x)^i.

Let σ_1 < σ_2 < · · · < σ_{s′} be the nonzero subscripts i for which B′_i ≠ 0. Then d′ = σ_1 is
called the dual distance of C, and s′ is the external distance of C.

Remark 5.5.1 If {B_i}_{i=0}^{n} is the distance distribution of a binary linear code C, then
{B′_i}_{i=0}^{n} is the weight distribution of the dual code C⊥.
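Expanding (1 + x)^{n−i}(1 − x)^i and comparing coefficients shows that the transform can be computed with Krawtchouk polynomials as B′_k = (1/M) Σ_j B_j K_k(j). A minimal sketch (Python; helper names ours):

```python
from math import comb

def krawtchouk(n, k, x):
    # K_k(x) = sum_i (-1)^i C(x, i) C(n - x, k - i)
    return sum((-1)**i * comb(x, i) * comb(n - x, k - i) for i in range(k + 1))

def macwilliams_transform(B, M):
    # B' with  sum_i B'_i (1+x)^(n-i) (1-x)^i = (2^n / M) sum_j B_j x^j.
    # For admissible distributions each sum below is divisible by M.
    n = len(B) - 1
    return [sum(Bj * krawtchouk(n, k, j) for j, Bj in enumerate(B)) // M
            for k in range(n + 1)]

# Distance distribution of the [7, 4, 3] Hamming code (M = 16):
print(macwilliams_transform([1, 0, 0, 7, 7, 0, 0, 1], 16))
# [1, 0, 0, 0, 7, 0, 0, 0]
```

By Remark 5.5.1 the output is the weight distribution of the dual [7, 3, 4] simplex code, whose only nonzero weight is 4.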
The following theorem, due to Delsarte, generalizes the Assmus–Mattson Theorem 5.4.1
to nonlinear codes containing the zero vector 0.
Theorem 5.5.2 (Delsarte [521]) Let s be the number of distinct nonzero distances in an
(n, M, d)_2 code C with distance distribution {B_i}_{i=0}^{n} such that 0 ∈ C. Let s̄ = s if B_n = 0
and s̄ = s − 1 if B_n = 1.

(a) If s̄ < d′, then the codewords of any weight w ≥ d′ − s̄ in C support a (d′ − s̄)-design.

(b) If s′ < d, the codewords of any fixed weight in C support a (d − s′)-design.
Theorem 5.5.3 (Johnson Bound) For any (n, M, d)_2 code with e = ⌊(d − 1)/2⌋,

  M ≤ 2^n / ( Σ_{i=0}^{e} \binom{n}{i} + (\binom{n}{e} / ⌊n/(e+1)⌋) ( (n−e)/(e+1) − ⌊(n−e)/(e+1)⌋ ) ).

A code for which equality holds in Theorem 5.5.3 is called nearly perfect. It follows
from Delsarte's Theorem 5.5.2 that nearly perfect codes support designs.

Theorem 5.5.4 Let C be a nearly perfect binary code of length n and minimum distance
d = 2e + 1 that contains the zero vector.

(a) The codewords of any fixed nonzero weight in C support an e-design.

(b) The minimum weight codewords in the extended code Ĉ support an (e + 1)-(n + 1, 2e +
2, ⌊(n − e)/(e + 1)⌋) design.
Example 5.5.5 ([1446]) The Nordstrom–Robinson code N R is a nonlinear (16, 256, 6)2
code obtained as the union of eight carefully chosen cosets of the first order Reed–Muller
code RM(1, 4) of length 16. For a detailed description of how this code can be obtained
from the extended binary Golay code G24, see [1008, Section 2.3.4]. The weight and distance
distribution of N R is

  B_0 = B_16 = 1,  B_6 = B_10 = 112,  B_8 = 30,

and it coincides with its MacWilliams transform: B′_i = B_i for 0 ≤ i ≤ 16. Hence, d = d′ = 6
and s = s′ = 4; by Theorem 5.5.2(a) (or Theorem 5.5.4(b)), the nonzero codewords support
3-designs with parameters 3-(16, 6, 4), 3-(16, 8, 3), and 3-(16, 10, 24).
The punctured code N R∗ of length 15 and minimum distance 5 meets the Johnson
Bound 5.5.3, and supports 2-designs by Theorem 5.5.4(a).
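That N R* meets the Johnson Bound can be confirmed by evaluating the right-hand side of Theorem 5.5.3 exactly; the sketch below (Python with rational arithmetic, assuming the standard form of the bound) returns 256 for n = 15 and e = 2.

```python
from fractions import Fraction
from math import comb

def johnson_bound(n, e):
    # Right-hand side of the Johnson Bound for binary codes with d = 2e + 1.
    denom = Fraction(sum(comb(n, i) for i in range(e + 1)))
    denom += Fraction(comb(n, e), n // (e + 1)) * \
             (Fraction(n - e, e + 1) - (n - e) // (e + 1))
    return Fraction(2**n) / denom

# NR* is a (15, 256, 5)_2 code, so e = 2:
print(johnson_bound(15, 2))  # 256
```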
Remark 5.5.6 The Nordstrom–Robinson code is the first member of two infinite classes of
nonlinear binary codes that support 3-designs: the Preparata codes [1541] and the Kerdock
codes [1107] (see also Section 20.4.4). The Preparata code P_m has length n = 2^{2m}
(m ≥ 2), contains 2^{n−4m} codewords, and has minimum distance d = 6. The Kerdock code
K_m has length n = 2^{2m} (m ≥ 2), contains 2^{4m} codewords, and has minimum distance
2^{2m−1} − 2^{m−1}. The codes P_m and K_m contain the zero vector, and their distance
distributions are related by the MacWilliams transform as if the codes were duals of each other.
Two decades after the discovery of these codes, as well as some nonlinear binary codes found
by Goethals [816], it was proved by Hammons, Kumar, Calderbank, Sloane, and Solé [898]
that appropriate versions of Pm and Km , as well as the Goethals codes, can be obtained
from pairs of dual codes that are linear over the ring Z4 ; see Chapter 6.
Parameters of 3-designs supported by Preparata, Kerdock, and Goethals codes are listed
in Table 5.1.
The 3-designs supported by the minimum weight codewords in the Preparata code Pm
have been used for the construction of infinite classes of 4-designs and Steiner 3-designs.
Theorem 5.5.7 ([81]) Let P_m be the 3-(4^m, 6, (4^m − 4)/3) design supported by the code-
words of minimum weight in the Preparata code P_m.

(a) The 4-subsets of coordinate indices that are not covered by any block of P_m are the
blocks of a Steiner system S_m with parameters 3-(4^m, 4, 1).

(b) The 3-(4^m, 4, 2) design S_m* consisting of two identical copies of S_m is extendable to a
4-(4^m + 1, 5, 2) design D_m by adding one new point to all blocks of S_m*, and adding as
further new blocks all 5-subsets that are contained in blocks of P_m.
The Preparata codes yield yet another infinite class of 4-designs, namely Steiner 4-
designs with blocks of two different sizes.
Theorem 5.5.11 ([1813]) The minimum weight vectors in the Preparata code P_m, together
with the blocks of the Steiner system S_m from Theorem 5.5.7 extended by one new point,
yield a Steiner 4-(4^m, {5, 6}, 1) design.
Theorem 5.6.1 (Rudolph [1609]) If the dual code C⊥ of a linear code C supports a 2-
(n, w, λ) design, then C can correct by majority logic decoding up to e errors, where

  e ≤ ⌊(r + λ − 1)/(2λ)⌋

and r = λ(n − 1)/(w − 1) is the replication number of the design.
Example 5.6.2 The maximum value of the minimum distance of a binary linear code of
length 28 and dimension 9 is 10 [845]. A [28, 9, 10] binary linear code C can be obtained
as the null space of the 63 × 28 incidence matrix A of a 2-(28, 4, 1) design D, known as
the Ree unital [306], or the design associated with a maximal (28, 4)-arc in the projective
plane of order 8 [1817]. The rank over F2 (or the 2-rank) of A is 19; so any set of 19 linearly
independent rows of A is a parity check matrix of C. Since C ⊥ supports D, the code C
can correct by majority logic decoding up to e ≤ b(9 + 1 − 1)/2c = 4 errors, which is the
maximum number of errors that a code with minimum distance 10 can correct.
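The computation in Example 5.6.2 generalizes directly. A minimal sketch of the bound in Theorem 5.6.1 (Python; the function name is ours, and r = λ(n − 1)/(w − 1) is the replication number of the design):

```python
def rudolph_errors(n, w, lam):
    # e <= floor((r + lam - 1) / (2 lam)), with r = lam (n - 1)/(w - 1)
    # the replication number of the dual 2-(n, w, lam) design.
    r = lam * (n - 1) // (w - 1)
    return (r + lam - 1) // (2 * lam)

# Ree unital: a 2-(28, 4, 1) design supported by the dual of a [28, 9, 10] code.
print(rudolph_errors(28, 4, 1))  # 4
```

For the Fano plane 2-(7, 3, 1), the same computation gives e ≤ 1, matching the single-error-correcting [7, 4, 3] Hamming code.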
Remark 5.6.3 There are more than 4000 known non-isomorphic 2-(28, 4, 1) designs [428,
1168], with 2-ranks ranging from 19 to 26 [1033]. However, up to isomorphism, the Ree
unital is the unique 2-(28, 4, 1) design of minimum 2-rank 19 [1368].
Theorem 5.6.1 can be improved if the code supports a t-design with t > 2.
Theorem 5.6.4 (Rahman–Blake [1552]) Assume that the dual code of a linear code C
supports a t-(n, w, λ) design D with t ≥ 2. Let A_l = Σ_{j=0}^{l−1} (−1)^j \binom{l−1}{j} λ_{j+2}, where
λ_i = λ \binom{n−i}{t−i} / \binom{w−i}{t−i} is the number of blocks of D containing a set of i fixed points for 0 ≤ i ≤ t.
Define further

  A′_l = A_l if l ≤ t − 1,   A′_l = A_{t−1} if l > t − 1.

Then C can correct up to l errors by majority logic decoding, where l is the largest integer
that satisfies the inequality

  Σ_{i=1}^{l} A′_i ≤ ⌊(λ_1 + A′_l − 1)/2⌋.
Theorem 5.6.5 ([889]) Let A be the incidence matrix of a 2-(n, w, λ) design. Let r =
λ(n − 1)/(w − 1) and p be a prime.
(a) If p does not divide r(r − λ), then rankp A = n.
(b) If p divides r but does not divide r − λ, then rankp A ≥ n − 1.
(c) If rankp A < n − 1, then p divides r − λ.
Example 5.6.6 Up to isomorphism, there are exactly four 2-(8, 4, 3) designs [428, page
27]. Since r = 3(8 − 1)/(4 − 1) = 7 and r − λ = 7 − 3 = 4, the only primes p for which the
p-rank of a 2-(8, 4, 3) design can be smaller than 8 are p = 7 and p = 2. The 2-ranks of the
four 2-(8, 4, 3) designs are 4, 5, 6, and 7. The binary code spanned by the incidence matrix
of the design of minimum 2-rank 4 is a self-dual [8, 4, 4] code equivalent to the first order
Reed–Muller code RM(1, 3) of length 8.
Most of the known majority-logic decodable codes are based on designs arising from
finite geometry, a notable class of such codes being the Reed–Muller codes. We refer to
any design having as points and blocks the points and subspaces of a given dimension of a
finite affine or projective geometry (see Chapter 14) as a geometric design. We denote by
P Gd (n, q) (respectively AGd (n, q)) the geometric design having as blocks the d-dimensional
subspaces of the n-dimensional projective space PG(n, q) (respectively the n-dimensional
affine space AG(n, q)) over F_q. PG_d(n, q) is a 2-(v, k, λ) design with parameters

  v = (q^{n+1} − 1)/(q − 1),   k = (q^{d+1} − 1)/(q − 1),   λ = \binom{n−1}{d−1}_q,

where

  \binom{n}{i}_q = ((q^n − 1)(q^{n−1} − 1) · · · (q^{n−i+1} − 1)) / ((q^i − 1)(q^{i−1} − 1) · · · (q − 1))

denotes the Gaussian binomial coefficient. AG_d(n, q) is a 2-(v, k, λ) design with parameters

  v = q^n,   k = q^d,   λ = \binom{n−1}{d−1}_q.

In the special case q = 2, AG_d(n, 2) is also a 3-design (see, e.g., [1812, Theorem 1.4.13])
with parameters

  v = 2^n,   k = 2^d,   λ_3 = \binom{n−2}{d−2}_2.

In particular, the planes in AG(n, 2) are the blocks of a Steiner quadruple system SQS(2^n).
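The parameters of the geometric designs can be computed from Gaussian binomial coefficients. The sketch below (Python; helper names ours) checks that PG_1(2, 2) is the Fano plane 2-(7, 3, 1) and that PG_2(4, 2) has the parameters 2-(31, 7, 7) that reappear in Example 5.6.12.

```python
def gaussian(n, i, q):
    # Gaussian binomial coefficient [n choose i]_q (always an integer).
    num = den = 1
    for j in range(i):
        num *= q**(n - j) - 1
        den *= q**(j + 1) - 1
    return num // den

def pg_params(d, n, q):
    # (v, k, lambda) of the 2-design PG_d(n, q) of d-subspaces of PG(n, q).
    v = (q**(n + 1) - 1) // (q - 1)
    k = (q**(d + 1) - 1) // (q - 1)
    return v, k, gaussian(n - 1, d - 1, q)

print(pg_params(1, 2, 2))  # (7, 3, 1): the Fano plane
print(pg_params(2, 4, 2))  # (31, 7, 7)
```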
Remark 5.6.7 The codewords of minimum weight in the binary Reed–Muller code
RM(r, n) of length 2^n and order r with 1 ≤ r ≤ n − 1 support a 3-design isomorphic
to AG_{n−r}(n, 2) [71, Section 5.3].
Codes and Designs 107
Theorem 5.6.8
(a) ([1056]) The number of non-isomorphic Steiner designs having the parameters 2-
((q n+1 − 1)/(q − 1), q + 1, 1) of P G1 (n, q) grows exponentially with linear growth of
n.
(b) ([1059]) The number of non-isomorphic designs having the parameters of P Gd (n, q),
2 ≤ d ≤ n − 1, grows exponentially with linear growth of n.
(c) ([411]) The number of non-isomorphic designs having the parameters of AGd (n, q),
2 ≤ d ≤ n − 1, grows exponentially with linear growth of n.
Remark 5.6.9 The 2-(8, 4, 3) design of minimum 2-rank 4 from Example 5.6.6 is in fact
a 3-(8, 4, 1) design and is isomorphic to the geometric design AG2 (3, 2) of the points and
planes in AG(3, 2).
If q = p^s where p is a prime number, the q-rank of a (0, 1)-matrix is equal to its p-rank.
The p-ranks of the incidence matrices of geometric designs were computed in the 1960s and
1970s, starting with PG_1(2, p^s) (Graham and MacWilliams [844], MacWilliams and Mann
[1322]) and continuing with PG_{n−1}(n, q) and AG_{n−1}(n, q) (Smith [1731], Goethals and
Delsarte [818]); a formula for the p-rank of PG_d(n, q) and AG_d(n, q) for any d in the range
1 ≤ d ≤ n − 1 was found by Hamada [889].
In [889], Hamada made the following conjecture.
Conjecture 5.6.10 (Hamada) The p-rank of any design D having the parameters of a
geometric design G in PG(n, q) or AG(n, q) (q = p^m) is greater than or equal to the p-rank
of G, with equality if and only if D is isomorphic to G.
Remark 5.6.11 Hamada’s Conjecture has been proved in the following cases:
• PG_{n−1}(n, 2) and AG_{n−1}(n, 2) (Hamada and Ohmori [892]);
• PG_1(n, 2) and AG_1(n, 3) (Doyen, Hubaut and Vandensavel [640]);
• AG_2(n, 2) (Teirlinck [1795]).
In all proven cases of the conjecture, the geometric designs not only have minimum p-
rank, but are also the unique designs, up to isomorphism, of minimum p-rank for the given
parameters.
Although there are no known designs that have the same parameters, but smaller p-rank
than a geometric design, there are non-geometric designs that have the same p-rank as a
geometric design with the same parameters; hence they provide counter-examples to the
“only if” part of Hamada’s Conjecture 5.6.10. Throughout the remainder of this section,
we list all known designs that are counter-examples to Hamada’s Conjecture and examine
related questions.
Example 5.6.12 ([1811]) The codewords of minimum weight 8 in a [32, 16, 8] binary
doubly-even self-dual code support a 3-design by Theorem 5.4.4 with parameters 3-(32, 8, 7).
There are exactly five inequivalent such codes [436], and each code is generated by its set
of codewords of weight 8. Thus, the five 3-(32, 8, 7) designs supported by the five [32, 16, 8]
extremal self-dual codes have 2-rank 16 and are pairwise non-isomorphic. One of these de-
signs, namely the design supported by the second order Reed–Muller code RM(2, 5), is
isomorphic to the geometric design AG3 (5, 2).
Any [31, 16, 7] code punctured from a [32, 16, 8] doubly-even self-dual code supports a
2-(31, 7, 7) design and is spanned by its set of 155 codewords of minimum weight. Since all
[32, 16, 8] doubly-even self-dual codes admit transitive automorphism groups, their punc-
tured codes support a total of five pairwise non-isomorphic 2-(31, 7, 7) designs, all having
2-rank 16. One of these designs, namely the one supported by the punctured Reed–Muller
code RM(2, 5)∗ , is isomorphic to the geometric design P G2 (4, 2).
Example 5.6.13 ([907]) A symmetric (4, 4)-net is a 1-(64, 16, 16) design N = (V, B) with
the following properties:
• The point set V is partitioned into 16 disjoint subsets of size 4 (called point groups)
so that every two points that belong to different groups appear together in four blocks,
while two points belonging to the same group do not appear together in any block.
• The block set B is partitioned into 16 disjoint subsets of size 4 (called block groups)
so that every two blocks that belong to different groups share exactly four points,
while any two blocks from the same group are disjoint.
All non-isomorphic symmetric (4, 4)-nets admitting an automorphism group of order 4 that
acts regularly on every point group and every block group were enumerated in [907], and the
binary linear codes of length 64 spanned by their incidence matrices were analyzed. Three
of these codes have parameters [64, 16, 16] and support three non-isomorphic 2-(64, 16, 5)
designs, each having 2-rank 16. One of these three designs is isomorphic to the geometric
design AG2 (3, 4), while the other two designs are counter-examples to the “only if” part of
Hamada’s Conjecture 5.6.10.
Alternative constructions of the two non-geometric 2-(64, 16, 5) designs with 2-rank 16
were given later in [1355, 1817].
Definition 5.6.14 A generalized incidence matrix of a design D over Fq is any matrix
obtained from the (0, 1)-incidence matrix A of D by replacing nonzero entries of A with
nonzero elements from Fq .
A revised version of Hamada’s Conjecture that uses generalized incidence matrices in-
stead of (0,1)-incidence matrices, was proved by Tonchev [1815] for the complementary
designs of the classical geometric designs with blocks being hyperplanes in a projective or
affine space.
Theorem 5.6.15 ([1815]) The q-rank d of a generalized incidence matrix over F_q of a
2-((q^{n+1} − 1)/(q − 1), q^n, q^{n−1}(q − 1)) design (n ≥ 2) is greater than or equal to n + 1.
The equality d = n + 1 holds if and only if the design is isomorphic to the complementary
design of the geometric design PG_{n−1}(n, q).
Theorem 5.6.16 ([1815]) The q-rank d of a generalized incidence matrix over F_q of a
2-(q^n, q^n − q^{n−1}, q^n − q^{n−1} − 1) design (n ≥ 2) is greater than or equal to n + 1. The equality
d = n + 1 holds if and only if the design is isomorphic to the complementary design of the
geometric design AG_{n−1}(n, q).
An infinite class of non-geometric designs that share their parameters and p-rank with
geometric designs was discovered in [1058] by generalizing some properties of one of the
2-(31, 7, 7) designs with 2-rank 16 from Example 5.6.12. These designs were constructed by
using polarities in PG(2d − 1, q) and are called polarity designs.
Theorem 5.6.17 ([1058]) For every prime p ≥ 2 and every integer d ≥ 2, there exists a
polarity 2-design P having the same parameters and the same p-rank as P Gd (2d, p), but P
is not isomorphic to P Gd (2d, p).
Theorem 5.6.18 ([412]) The linear code over Fp that is the null space of the incidence
matrix of the polarity design P from Theorem 5.6.17 corrects by majority logic decoding
the same number of errors as the code that is the null space of the incidence matrix of
P Gd (2d, p).
The following theorem describes an infinite class of non-geometric designs having the
same parameters and 2-rank as certain designs in binary affine geometry.
Remark 5.6.20 The designs from Theorem 5.6.19 generalize properties of one of the five
3-(16, 8, 7) designs from Example 5.6.12, and were constructed by modifying the method
from [1058].
The first few codes of the infinite class of codes from Theorem 5.6.19(b) have the same
weight distribution as the Reed–Muller code RM(r, 2r + 1) of length 2^{2r+1} and order r
[911]. This motivates the following.

Conjecture 5.6.21 For all r, the code from Theorem 5.6.19(b) has the same weight dis-
tribution as the Reed–Muller code RM(r, 2r + 1) of length 2^{2r+1} and order r.
The equivalence between 2-designs and certain binary constant weight codes that meet the
restricted Johnson Bound, as well as the equivalence of Steiner systems and certain constant
weight binary codes that meet the unrestricted Johnson Bound, are reviewed in [1814,
Section 3]. A thorough discussion of quasi-symmetric designs and their links to self-
orthogonal and self-dual codes is given in [1814, Section 9].
Chapter 6
Codes over Rings
Steven T. Dougherty
University of Scranton
6.1 Introduction
In this chapter, we expand the collection of acceptable alphabets for codes and study
codes over finite commutative rings. Although there were some early papers about codes
over some finite commutative rings, a major interest in codes over rings did not really
develop until the landmark paper about quaternary codes, The Z4 -linearity of Kerdock,
Preparata, Goethals, and related codes [898]. In that paper, the authors showed that some
well known nonlinear binary codes could actually be understood as the images of linear
codes over the ring Z4 under a nonlinear Gray map. This explained why certain binary
codes acted like linear codes in so many ways yet were not linear in the classical sense for
binary codes. Following this important paper, a flurry of activity occurred studying codes
over the ring Z4 . For example, see Wan’s text on codes over Z4 [1861]. The next step was to
consider codes over the other rings of order 4 together with their respective Gray maps; see
[621, 622]. Other papers followed that found interesting applications for codes over rings.
For example, in [236] the notion of Type II codes over Z4 is introduced and a very simple
construction of the Leech lattice is given, and in [115] self-dual codes over Z2k were used
to construct real unimodular lattices which in turn were used to construct modular forms.
In [332], codes over the p-adics were considered and a unified approach to cyclic codes
was given. The acceptable alphabet for codes continued to grow to chain rings [979, 980],
principal ideal rings [1449], Galois rings [1862, 1863], and finally to finite Frobenius rings
[1905, 1906].
In extending coding theory to codes over rings, several important things must be con-
sidered. To begin, linear codes over fields are vector spaces, and the most important weight
is the Hamming weight. When switching to codes over rings, we must make new definitions
for linear codes and determine which weight is the most important one for the particular
application. For example, when considering quaternary codes which give rise to interesting
binary codes via the Gray map, the Lee weight is the most important weight (see [898]), and
the homogeneous weight generalizes this weight to chain rings. However, when considering
self-dual codes over Z2k used to produce real unimodular lattices, the Euclidean weight is
the most important weight; see [115]. Additionally, when determining a generator matrix
for a code over a field and determining the cardinality of a code from that matrix, the
standard definition of linear independence is used along with the classical results of linear
algebra. For codes over rings, we need new results to determine when a set of generators
is minimal and to determine the cardinality of the code generated by them. Moreover, we
must determine what is the largest class of rings that we can use as alphabets for codes.
While any set can be used as an alphabet for coding theory, in general, we want alphabets
for which the fundamental theorems of coding theory apply, like the MacWilliams theorems
(Theorem 1.8.6 and Theorem 1.15.3). Therefore, we generally restrict ourselves to rings
for which we have analogs of these theorems. In [1906], the class of rings for which both
MacWilliams theorems hold was shown to be the class of finite Frobenius rings.
In [1906], codes over both commutative and noncommutative rings were considered.
For noncommutative rings the situation is much different than for commutative rings. For
example, there is not a unique orthogonal. Rather, there is a left orthogonal and a right
orthogonal. Moreover, a linear code can either be a left submodule of the ambient space
or a right submodule of the ambient space. In this chapter, we shall restrict ourselves to
commutative rings since codes over commutative rings have been widely studied while the
study of codes over noncommutative rings is still in its infancy.
Throughout this chapter, we shall take the definition of ring that says a ring has a
multiplicative unity.
Every linear code C over Z4 is permutation-equivalent to a code with a generator matrix
of the form

  G = [ I_{k_0}    A       B  ]
      [   O     2I_{k_1}  2C  ],

where A and C are matrices over Z2 and B is a matrix over Z4. It follows immediately
that if C is generated by this matrix, then |C| = 4^{k_0} 2^{k_1}. In this case, we say that the linear
code has type 4^{k_0} 2^{k_1}. We note that for codes over fields of cardinality q, all linear codes
have cardinality of the form q^k for some integer k, but not all quaternary linear codes have
cardinality of the form 4^k. For example, the code C = {0, 2} of type 4^0 2^1 has cardinality 2,
which is not an integer power of 4.
The usual inner product is attached to the ambient space and is defined as v · w = Σ v_i w_i.
The orthogonal is defined as C⊥ = {v | v · w = 0 for all w ∈ C}. It follows immediately
that if C is a linear code of length n of type 4^{k_0} 2^{k_1}, then C⊥ is a linear code of type
4^{n−k_0−k_1} 2^{k_1}. Additionally, we have that |C||C⊥| = 4^n.
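For small lengths these statements can be verified exhaustively. A brute-force sketch (Python, illustrative; the generators (1, 1) and (0, 2) are our choice) over Z4 with n = 2:

```python
from itertools import product

def span_z4(gens, n):
    # All Z4-linear combinations of the generator rows.
    return {tuple(sum(c * g[i] for c, g in zip(coeffs, gens)) % 4
                  for i in range(n))
            for coeffs in product(range(4), repeat=len(gens))}

n = 2
C = span_z4([(1, 1), (0, 2)], n)           # a code of type 4^1 2^1
dual = {w for w in product(range(4), repeat=n)
        if all(sum(v[i] * w[i] for i in range(n)) % 4 == 0 for v in C)}
print(len(C), len(dual), len(C) * len(dual) == 4**n)  # 8 2 True
```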
the octacode, see [898]. See Chapters 3, 5, 12, and 20 for more on the Kerdock and Preparata
codes.
We note that if K(C) = C, then R(C) = C as well. Any code generated by a matrix of
the form ( 2I_k  2A ) satisfies this relation.
6.4 Rings
Following an intense study of quaternary codes, the question arose as to what is the
largest class of rings that can be used as alphabets for algebraic coding theory. Specifically,
the question is: what is the largest class of rings such that both MacWilliams theorems
(Theorem 1.8.6 and Theorem 1.15.3) have analogs for codes over this class of rings? Without
these theorems many of the techniques used in coding theory, which make it a powerful tool
in mathematics, would not work. Therefore, we really want to restrict ourselves to rings
where we have these theorems. This question was answered in the landmark paper [1906]
by J. A. Wood. The class of rings identified as the answer is the class
of Frobenius rings. Namely, Wood showed that if a ring is Frobenius then analogs of both
MacWilliams theorems hold for codes over that ring. Since that paper, a great deal of
work has been done studying codes over finite Frobenius rings and, in general, it is a tacit
assumption in papers about codes over rings that the alphabets will be Frobenius rings.
Examples of Frobenius rings include Zn, chain rings, and Galois rings. Not all finite
commutative rings are Frobenius. For example, the ring F2[u, v]/⟨u^2, v^2, uv⟩ is not Frobenius.
We shall show why this is true in Example 6.5.6.
We shall give a further characterization of Frobenius rings in terms of generating characters.
We restrict ourselves to commutative rings. In this case, we have that R̂ as a left R-module
and R̂ as a right R-module are the same, and we do not have to distinguish between left
and right modules. Let R be a Frobenius ring and let φ : R → R̂ be the module isomorphism.
Set χ = φ(1), so that φ(r) = χ_r for r ∈ R, where χ_r(a) = χ(ra). This character χ is called a
generating character for R̂. The following theorem gives a characterization of Frobenius
rings in terms of the generating character. See [1906] for a proof.
Theorem 6.4.4 The character χ for the finite commutative ring R is a generating char-
acter if and only if the kernel of χ contains no nontrivial ideals.
For a finite Frobenius ring R, we can make a character table by labeling the rows and
columns of the matrix with the elements of R and putting in the a-th row and b-th column
the value of χ(ab), where χ is the generating character. Given that χ is a generating
character, this value corresponds to χ_a(b), where χ_a is the character of R̂ that corresponds
to the element a ∈ R. We can give the character table for Z4, where i is the complex fourth
root of unity, that is, i = √−1:

        0    1    2    3
   0    1    1    1    1
   1    1    i   −1   −i
   2    1   −1    1   −1
   3    1   −i   −1    i .
Compare this to the character table for F4 = {0, 1, ω, 1 + ω} with ω^2 = 1 + ω, which is:

          0     1     ω    1+ω
    0     1     1     1     1
    1     1    −1    −1     1
    ω     1    −1     1    −1
  1+ω     1     1    −1    −1 .
We note that the character table for a ring is not unique as we can take different
generating characters for the ring. This non-uniqueness will not pose any problems in their
coding theory applications.
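Theorem 6.4.4 can be illustrated concretely for Z4, whose characters are χ_u(a) = i^{ua}. The sketch below (Python; it works with exponents to avoid complex arithmetic) confirms which characters are generating.

```python
def kernel(u, n=4):
    # Kernel of the character a -> i^(u*a) of Z_n: the elements sent to 1.
    return {a for a in range(n) if (u * a) % n == 0}

ideals_z4 = [{0, 2}, {0, 1, 2, 3}]  # the nonzero ideals <2> and <1> of Z4

def is_generating(u):
    # Theorem 6.4.4: generating iff the kernel contains no nontrivial ideal.
    return not any(I <= kernel(u) for I in ideals_z4)

print(is_generating(1), is_generating(3))  # True True: chi(a) = i^a, i^(3a)
print(is_generating(2))                    # False: kernel {0, 2} is the ideal <2>
```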
For this general class of rings we make the following definitions. Let R be a finite com-
mutative Frobenius ring. Then a linear code over R of length n is an R-submodule of R^n.
The ambient space R^n is equipped with the usual inner product, namely v · w = Σ v_i w_i. The
orthogonal defined with respect to this inner product is C⊥ = {w | w · v = 0 for all v ∈ C}.
The code C⊥ is linear even if C is not. If C ⊆ C⊥, the code is said to be self-orthogonal, and
if C = C⊥, then the code is said to be self-dual. For a description of self-dual codes over
Frobenius rings see [628], and for an encyclopedic description of self-dual codes see [1555].
Two codes C and D are said to be equivalent if D can be formed from C by a com-
bination of permutations of the coordinates and multiplication of coordinates by units of
R. Multiplication by non-units does not give an equivalent code; for example, consider the
code of length 2 with generator matrix [1 1] over Z9. This code has 9 codewords and is
{aa | a ∈ Z9}. Multiplying the first and second coordinates by 3 gives the code {00, 33, 66},
which is substantially different as the cardinality is different.
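This example is small enough to check by brute force. The Python sketch below (our own illustration, not from the text) scales a coordinate by the unit 2, which permutes Z9 and preserves the code size, and then scales both coordinates by the non-unit 3, which collapses the code.

```python
# The length-2 code over Z9 with generator matrix [1 1].
code = {(a, a) for a in range(9)}

# 2 is a unit mod 9: scaling a coordinate by it preserves cardinality.
by_unit = {(2 * a % 9, b) for (a, b) in code}

# 3 is a non-unit mod 9: scaling both coordinates collapses the code.
by_nonunit = {(3 * a % 9, 3 * b % 9) for (a, b) in code}
```

The unit-scaled code still has 9 codewords, while the non-unit scaling leaves only {00, 33, 66}.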
In terms of coding theory, any ideal is a linear code of length 1 over the ring.
Recall the following definitions from the theory of rings. These definitions apply to
all rings, but we are only concerned with commutative rings. Moreover, they apply to non-
Frobenius and Frobenius rings as well; for example a local ring may or may not be Frobenius.
However, in terms of coding theory we are only concerned with Frobenius rings.
A chain ring R is necessarily a local principal ideal ring and its maximal ideal is m = ⟨a⟩
for some a ∈ R. It follows that the ideals of R form the chain

R = ⟨a^0⟩ ⊋ ⟨a⟩ ⊋ ⟨a^2⟩ ⊋ · · · ⊋ ⟨a^e⟩ = {0}.

Here, the integer e is known as the index of nilpotency. If p is a prime, then Z_{p^e} is a
chain ring with maximal ideal ⟨p⟩ and index of nilpotency e. In this ring, the ideals are of
the form ⟨p^i⟩ for i = 0, 1, 2, . . . , e, where i = 0 and i = e give the trivial ideals, that is, the
entire ring and the ideal consisting of only zero. Here the ideal ⟨p^i⟩ has orthogonal ⟨p^{e−i}⟩.
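These orthogonality relations are easy to verify computationally. A small Python check for Z8 (p = 2, e = 3; the helper names are ours):

```python
def ideal(g, n=8):
    """The principal ideal <g> of Z_n, as a set of residues."""
    return {g * a % n for a in range(n)}

def orthogonal(ideal_set, n=8):
    """Orthogonal of a length-1 code (an ideal): all w with w*x = 0 for every x."""
    return {w for w in range(n) if all(w * x % n == 0 for x in ideal_set)}
```

In Z8 one finds ⟨2⟩⊥ = ⟨4⟩ = ⟨2^{3−1}⟩ and ⟨4⟩⊥ = ⟨2⟩, matching ⟨p^i⟩⊥ = ⟨p^{e−i}⟩.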
Example 6.4.6 The ring Z_n is a principal ideal ring for all n > 1. Consider the ring
Z12. This ring has ideals ⟨0⟩, ⟨2⟩, ⟨3⟩, ⟨4⟩, ⟨6⟩, and ⟨1⟩ = Z12, with ⟨0⟩⊥ = Z12, ⟨2⟩⊥ = ⟨6⟩, and
⟨3⟩⊥ = ⟨4⟩. The ring Z12 is neither a local ring nor a chain ring.
Example 6.4.7 Consider the ring R = F2[u, v]/⟨u^2, v^2, uv − vu⟩. The ring R has 16 ele-
ments with maximal ideal m = {0, u, v, uv, u + v, u + uv, v + uv, u + v + uv}. The ring is a
local ring that is neither a chain ring nor a principal ideal ring, since the maximal ideal is ⟨u, v⟩,
which is not principal.
There are four rings of order 4; they are all commutative and Frobenius. Namely, they
are Z4, which is a chain ring; F4, which is a finite field; F2[u]/⟨u^2⟩, which is a chain ring; and
F2[v]/⟨v^2 + v⟩ ≅ F2 × F2, which is not a local ring but is a principal ideal ring. Each of these
rings is equipped with a Gray map to F2^2; with the conventions used later in this section, they
may be given as follows:

  Z4:            0 ↦ 00, 1 ↦ 01, 2 ↦ 11, 3 ↦ 10;
  F4:            α + βω ↦ (α, β);
  F2[u]/⟨u^2⟩:    α + βu ↦ (β, α + β);
  F2[v]/⟨v^2 + v⟩: α + βv ↦ (α, α + β).

The Lee weight on each of these rings is defined as the Hamming weight of the
image under the respective Gray map. We note that only the Gray map for Z4 is nonlinear.
Each of these rings generalizes to a family of Frobenius rings with a corresponding
family of Gray maps. For F4 the generalization is the natural generalization to the finite fields
F_{2^r}, where the Gray map simply writes each element of the finite field in its canonical
representation as an element of the vector space F2^r.
The generalization of Z4 is the family of rings Z_{2^k}. These rings are all chain rings. We
shall describe the Gray map as given in [620]. Let 1_i denote the all-one vector of length i and
let 0_i denote the all-zero vector of length i. Then we define the Gray map φ_k : Z_{2^k} → Z2^{2^{k−1}}
by

  φ_k(i) = 0_{2^{k−1}−i} 1_i                   if 0 ≤ i ≤ 2^{k−1},
  φ_k(i) = 1_{2^{k−1}} + φ_k(i − 2^{k−1})      if i > 2^{k−1}.

We note that this map is a nonlinear map.
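The recursion translates directly into code. The Python sketch below returns φ_k(i) as a list of bits; for k = 2 it reproduces the classical Z4 Gray map, and the nonlinearity is visible already there since φ2(1 + 1) ≠ φ2(1) + φ2(1).

```python
def gray(i, k):
    """Gray map phi_k : Z_{2^k} -> F_2^{2^(k-1)}, per the recursive definition."""
    half = 2 ** (k - 1)
    i %= 2 ** k
    if i <= half:
        # 0_{2^(k-1)-i} followed by 1_i
        return [0] * (half - i) + [1] * i
    # 1_{2^(k-1)} + phi_k(i - 2^(k-1)), addition over F_2
    return [b ^ 1 for b in gray(i - half, k)]
```

For k = 2 this gives 0 ↦ 00, 1 ↦ 01, 2 ↦ 11, 3 ↦ 10, matching the list of Gray maps above.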
Any linear code over Z_{2^k} is equivalent to a code with a generator matrix of the following
form, where the elements in A_{i,j} are from Z_{2^k}:

      [ I_{δ_0}   A_{0,1}    A_{0,2}    A_{0,3}   · · ·   A_{0,k}   ]
      [   0      2I_{δ_1}   2A_{1,2}   2A_{1,3}   · · ·   2A_{1,k}  ]
  G = [   0        0        4I_{δ_2}   4A_{2,3}   · · ·   4A_{2,k}  ]        (6.2)
      [   :        :           :          :         :        :      ]
      [   0        0           0      · · ·   0   2^{k−1} I_{δ_{k−1}}   2^{k−1} A_{k−1,k} ]
The matrix G is said to be in standard form and such a code C is said to have type
{δ_0, δ_1, . . . , δ_{k−1}}. It is immediate that a code C with this generator matrix has
∏_{i=0}^{k−1} (2^{k−i})^{δ_i} vectors. It follows that the type of C⊥ is
{n − Σ_{i=0}^{k−1} δ_i, δ_{k−1}, δ_{k−2}, . . . , δ_1}, where n is the
length of the code. The following can be found in [620].
Theorem 6.4.8 Let C be a linear code over Z_{2^k} of type {δ_0, δ_1, . . . , δ_{k−1}}. If m =
dim(Ker(φ_k(C))), then

  m ∈ { Σ_{i=0}^{k−1} δ_i,  Σ_{i=0}^{k−1} δ_i + 1,  . . . ,  Σ_{i=0}^{k−1} δ_i + δ_{k−2} − 2,  Σ_{i=0}^{k−1} δ_i + δ_{k−2} }.
Moreover, there exists such a code C for any m in the interval. If C has length n, let
s = n − Σ_{i=0}^{k−1} δ_i. If r is the dimension of ⟨φ_k(C)⟩, then

  r ∈ { Σ_{i=0}^{k−1} 2^{k−(i+1)} δ_i,  Σ_{i=0}^{k−1} 2^{k−(i+1)} δ_i + 1,  . . . ,  2^{k−1} δ_0 + (2^{k−1} − 1)(Σ_{i=1}^{k−1} δ_i) + s }.
The ring F2[u]/⟨u^2⟩ can be extended to an infinite family of rings R_k as in [626, 638, 639].
Let R_k = F2[u_1, u_2, . . . , u_k]/⟨u_i^2 = 0, u_iu_j = u_ju_i⟩; for k ≥ 1, the ring R_k is a finite
commutative ring.
We can describe the representation of the elements in the ring R_k. Take a subset A ⊆
{1, 2, . . . , k} and let

  u_A := ∏_{i∈A} u_i,

with the convention that u_∅ = 1. Any element of R_k can then be written as
Σ_{A⊆{1,2,...,k}} c_A u_A with c_A ∈ F2. We note that such an element is a unit if and only if
c_∅ = 1. The ring R_k is a Frobenius local ring with maximal ideal ⟨u_1, u_2, . . . , u_k⟩ and
|R_k| = 2^{2^k}. This ring is neither a principal ideal ring nor a chain ring when k ≥ 2.
The Gray map ψ_k on R_k can be defined inductively, building on the Gray map given
previously, recognizing R_1 as F2[u_1]/⟨u_1^2⟩ with Gray map ψ_1. Recall that ψ_1(α + βu_1) =
(β, α + β) for α, β ∈ F2. Then define

  ψ_k(α + βu_k) = ( ψ_{k−1}(β), ψ_{k−1}(α) + ψ_{k−1}(β) )

for α, β ∈ R_{k−1}. This map is a linear map and in fact preserves orthogonality.
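To make the recursion concrete, the Python sketch below encodes an element of R_k as a nested pair (α, β) standing for α + βu_k, with a bare bit for the base case F2; the encoding is our assumption for illustration, not the book's notation.

```python
def psi(x):
    """Gray map psi_k on R_k.  An element of R_k is a nested pair
    (alpha, beta) meaning alpha + beta*u_k; an element of F_2 is a bare bit."""
    if isinstance(x, int):
        return [x % 2]                      # base case: identity on F_2
    alpha, beta = x
    pa, pb = psi(alpha), psi(beta)
    # psi_k(alpha + beta*u_k) = (psi_{k-1}(beta), psi_{k-1}(alpha) + psi_{k-1}(beta))
    return pb + [(a + b) % 2 for a, b in zip(pa, pb)]
```

One checks ψ_1(α + βu_1) = (β, α + β) and, unlike φ_k, additivity: the image of a sum is the componentwise XOR of the images.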
A binary self-dual code is called Type II if all of its codewords have Hamming weights
congruent to 0 (mod 4). If a binary self-dual code is not Type II, it is Type I. Generalizing
these concepts, a self-dual code over Rk is Type II if all of its codewords have Lee weights
congruent to 0 (mod 4); otherwise it is Type I. Here, the Lee weight of a vector in Rkn is
the Hamming weight of its image under the Gray map ψk . The following can be found in
[639].
Theorem 6.4.9 If C is a self-dual code over R_k of length n, then ψ_k(C) is a binary self-dual code of
length 2^k n. If C is a Type II code, then ψ_k(C) is Type II, and if C is Type I, then ψ_k(C) is
Type I.
The following can be found in [626]. We note that quasi-cyclic codes are introduced in
Definition 1.12.23 and studied in Chapter 7.
Theorem 6.4.10 Let C be a cyclic code of length n over the ring R_k. Then ψ_k(C) is a
2^k-quasi-cyclic binary linear code of length 2^k n.
The ring F2[v]/⟨v^2 + v⟩ can be extended to an infinite family of rings A_k as in [373]. Let
A_k = F2[v_1, v_2, . . . , v_k]/⟨v_i^2 = v_i, v_iv_j = v_jv_i⟩. For all k, the ring A_k is a finite commutative
Frobenius ring.
We can describe the representation of the elements in the ring A_k. Take a subset B ⊆
{1, 2, . . . , k}; then let v_B = ∏_{i∈B} v_i, with the convention that v_∅ = 1. Any element of A_k
can be represented as

  Σ_{B⊆{1,2,...,k}} α_B v_B   with α_B ∈ F2.
The ring A_k has characteristic 2 and cardinality 2^{2^k} and is not a local ring. For example, if
k = 1 then ⟨v_1⟩ ≠ ⟨1 + v_1⟩ and both ideals are maximal.
The Gray map Θ_k on A_k can be defined inductively in a manner similar to the case
for R_k, building on the Gray map given previously, recognizing A_1 as F2[v_1]/⟨v_1^2 + v_1⟩ with
Gray map Θ_1. Recall that Θ_1(α + βv_1) = (α, α + β) for α, β ∈ F2. Then define

  Θ_k(α + βv_k) = ( Θ_{k−1}(α), Θ_{k−1}(α) + Θ_{k−1}(β) )

for α, β ∈ A_{k−1}.
Theorem 6.4.11 ([373]) If C is a self-dual code over Ak , then Θk (C) is a binary self-dual
code.
A natural consequence of this theorem is the following result, which decomposes rings
into components that are more easily studied.
Example 6.4.14 Let n be a positive integer with n > 1. The Fundamental Theorem of
Arithmetic gives that n = p_1^{e_1} p_2^{e_2} · · · p_s^{e_s} where p_i is a prime, p_i ≠ p_j when i ≠ j, and e_i ≥ 1.
Then the Chinese Remainder Theorem shows that

  Z_n ≅ Z_{p_1^{e_1}} × Z_{p_2^{e_2}} × · · · × Z_{p_s^{e_s}}.
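For instance, Z12 ≅ Z4 × Z3. A minimal Python check (our own illustration) that a ↦ (a mod 4, a mod 3) is a bijection respecting multiplication:

```python
def crt(a):
    """The CRT map Z12 -> Z4 x Z3."""
    return (a % 4, a % 3)

def pair_mul(x, y):
    """Componentwise product in Z4 x Z3."""
    return ((x[0] * y[0]) % 4, (x[1] * y[1]) % 3)

is_bijective = len({crt(a) for a in range(12)}) == 12
is_multiplicative = all(crt(a * b % 12) == pair_mul(crt(a), crt(b))
                        for a in range(12) for b in range(12))
```

The additive check is analogous, so the map is a ring isomorphism.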
With the notation above, let n be a positive integer. Extend the definition CRT : R_1 ×
R_2 × · · · × R_s → R to CRT : R_1^n × R_2^n × · · · × R_s^n → R^n as follows. Let v_i = v_{i,1}v_{i,2} · · · v_{i,n} ∈ R_i^n
for 1 ≤ i ≤ s; thus (v_1, v_2, . . . , v_s) ∈ R_1^n × R_2^n × · · · × R_s^n. Set w_j = (v_{1,j}, v_{2,j}, . . . , v_{s,j}) ∈
R_1 × R_2 × · · · × R_s for 1 ≤ j ≤ n; hence CRT(w_j) ∈ R. Define

  CRT(v_1, v_2, . . . , v_s) = (CRT(w_1), CRT(w_2), . . . , CRT(w_n)).

Finally, let C_i be a code of length n over R_i. Define the code CRT(C_1, C_2, . . . , C_s) over R of
length n by

  CRT(C_1, C_2, . . . , C_s) = {CRT(v_1, v_2, . . . , v_s) | v_i ∈ C_i for 1 ≤ i ≤ s}.
The rank of a code C over a ring R is the minimum number of generators of C. A code
C is free over a ring R if it is isomorphic to R^k for some k. The following results appear
in [627]. Let R = CRT(R_1, R_2, . . . , R_s), let C_i be a code over R_i with C_i⊥ its orthogonal in R_i^n,
and let C = CRT(C_1, C_2, . . . , C_s).
(a) We have that C⊥ = CRT(C_1⊥, C_2⊥, . . . , C_s⊥).
(b) If each C_i is self-dual, that is C_i = C_i⊥ for all i, then C = C⊥.
Given these results, it is clear that in most cases, one need only study codes over local
rings to get general results for codes over all finite commutative rings.
An analog of the MacWilliams Identities exists for codes over rings
if the ring is Frobenius. Without this fundamental result, many of the techniques that give
coding theory its power are no longer applicable. Therefore, we generally only study codes over
Frobenius rings.
We begin with the definition of various weight enumerators to give analogs to Theo-
rem 1.15.3(c).
Definition 6.5.1 Let C be a code over an alphabet A = {a_0, a_1, . . . , a_{r−1}} with the con-
vention that if A is a finite abelian group, a_0 = 0. The complete weight enumerator for
the code C is defined as

  cwe_C(x_{a_0}, x_{a_1}, . . . , x_{a_{r−1}}) = Σ_{c∈C} ∏_{i=0}^{r−1} x_{a_i}^{n_{a_i}(c)},

where n_{a_i}(c) is the number of coordinates of c equal to a_i. For the symmetrized weight
enumerator swe, the MacWilliams Identities take the form

  swe_{C⊥}(x_{g_0}, x_{g_1}, . . . , x_{g_{s−1}}) = (1/|C|) swe_C(S · (x_{g_0}, x_{g_1}, . . . , x_{g_{s−1}})^T)

for a matrix S determined by the generating character.
Note that we are not saying that there is a unique way to express the MacWilliams
Identities since it depends on the generating character which is not unique for a given ring.
However, different matrices will still give the same weight enumerator for the orthogonal.
Example 6.5.3 We shall give the specific MacWilliams Identities for codes over Z4 as an
example. Let C be a linear code over Z4. Then

  cwe_{C⊥}(x_0, x_1, x_2, x_3) = (1/|C|) cwe_C(x_0 + x_1 + x_2 + x_3,  x_0 + ix_1 − x_2 − ix_3,
                                              x_0 − x_1 + x_2 − x_3,  x_0 − ix_1 − x_2 + ix_3),

and

  swe_{C⊥}(x_0, x_1, x_2) = (1/|C|) swe_C(x_0 + 2x_1 + x_2,  x_0 − x_2,  x_0 − 2x_1 + x_2).
The following was first proved by MacWilliams in [1318, 1319] for codes over finite fields.
Here we can extend the result to codes over finite commutative Frobenius rings.
Theorem 6.5.4 Let R be a finite commutative Frobenius ring with |R| = r. Let C be a
linear code over R. Then

  Hwe_{C⊥}(x, y) = (1/|C|) Hwe_C(y − x, y + (r − 1)x).        (6.3)
One of the most significant corollaries of the MacWilliams Identities is the following.
Corollary 6.5.5 Let C be a linear code of length n over a finite commutative Frobenius
ring R. Then |C||C ⊥ | = |R|n .
Example 6.5.6 Consider the ring R = F2[u, v]/⟨u^2, v^2, uv⟩, which we claimed was not
Frobenius after Theorem 6.4.1. This ring has 8 elements {0, 1, u, v, u + v, 1 + u, 1 + v, 1 + u + v}.
Consider the ideal a = ⟨u, v⟩. Then a = {0, u, v, u + v}. Its orthogonal is a⊥ = a. But
|a||a⊥| = 4 · 4 = 16 ≠ 8 = |R|; so the ring R is not Frobenius.
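The failure of Corollary 6.5.5 here can be checked mechanically. In this Python sketch an element c_0 + c_1u + c_2v is a triple of bits, and the relations u^2 = v^2 = uv = 0 leave only the cross products with a constant factor.

```python
from itertools import product

def mul(a, b):
    """Product in F2[u,v]/<u^2, v^2, uv>: only terms with at most one of u, v
    and a constant factor survive."""
    return ((a[0] * b[0]) % 2,
            (a[0] * b[1] + a[1] * b[0]) % 2,
            (a[0] * b[2] + a[2] * b[0]) % 2)

R = list(product([0, 1], repeat=3))                   # the 8 ring elements
ideal = [(0, 0, 0), (0, 1, 0), (0, 0, 1), (0, 1, 1)]  # a = <u, v>
orth = [r for r in R if all(mul(r, x) == (0, 0, 0) for x in ideal)]
```

One finds a⊥ = a, so |a||a⊥| = 16 ≠ 8 = |R|; the ring cannot be Frobenius.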
We shall give definitions for modular independent and independent which will be used
to describe a basis for a code over a ring, namely a minimal generating set for the code.
These definitions first appeared in [1490] for codes over Zn and were expanded to codes
over Frobenius rings in [632]. We begin with the definition for modular independence.
Definition 6.6.2 Let R be a finite local commutative ring with unique maximal ideal m,
and let v_1, v_2, . . . , v_k be vectors in R^n. Then v_1, v_2, . . . , v_k are modular independent if
and only if Σ_{j=1}^{k} α_j v_j = 0 implies that α_j ∈ m for all j.
It follows that for a finite chain ring (which is local by definition), this definition is
enough to find a minimal generating set. Moreover, we can put the generator matrix in a
form quite similar to the standard form for codes over fields. Namely, we have the following
theorem which is a generalization of (6.2). See [980, 1447, 1448] for a complete description.
Theorem 6.6.4 Let R be a finite commutative chain ring with maximal ideal ⟨γ⟩ and index
of nilpotency e. Let C be a linear code over R. Then C is permutation equivalent to a linear
code over R with generator matrix

  [ I_{k_0}   A_{0,1}    A_{0,2}     A_{0,3}    · · ·   A_{0,e}   ]
  [   0      γI_{k_1}   γA_{1,2}    γA_{1,3}    · · ·   γA_{1,e}  ]
  [   0        0        γ^2 I_{k_2} γ^2 A_{2,3} · · ·   γ^2 A_{2,e} ]
  [   :        :           :           :          :        :      ]
  [   0        0           0       · · ·   0   γ^{e−1} I_{k_{e−1}}   γ^{e−1} A_{e−1,e} ]

where the A_{i,j} are arbitrary matrices with elements from the ring R, and I_{k_j} is the k_j × k_j
identity matrix.
A code with generator matrix of this form is said to have type {k0 , k1 , . . . , ke−1 }. A
simple counting argument gives the following corollary.
Corollary 6.6.5 Let R be a finite chain ring with maximal ideal ⟨γ⟩. Let C be a code over
R of type {k_0, k_1, . . . , k_{e−1}}. Then

  |C| = |R/⟨γ⟩|^{Σ_{i=0}^{e−1} (e−i)k_i}.
For codes over chain rings, the situation is quite similar to codes over finite fields.
Namely, we have a standard form for the generator matrix and from that matrix we can
directly compute the cardinality of the code. For codes over other rings the situation is not
as simple.
We now use the previous definition of modular independence for local rings to give a
definition for the most general class of rings for which we define codes. Write
R = CRT(R_1, R_2, . . . , R_s), where each R_i is local; vectors in R^n are said to be modular
independent if their projections to R_i^n are modular independent for some i.
It is very important to note here that we are not saying that the projections to the local
components of the ring must be modular independent for all i, but rather for some i. We
only need one projection to be modular independent for the definition to be fulfilled.
Example 6.6.7 Let us return to the codewords in Example 6.6.1. The vectors 20 and 05
project to 00 and 01 over Z2 and to 20 and 00 over Z5 . Hence these vectors are not modular
independent over Z10 .
We now present a concept needed to formulate the definition for a basis of a code over
a ring: vectors v_1, v_2, . . . , v_k in R^n are called independent if Σ_{j=1}^{k} α_j v_j = 0 implies
that α_j v_j = 0 for all j.
Example 6.6.9 Via the Chinese Remainder Theorem, Z45 is isomorphic to Z9 × Z5. Con-
sider the vectors (44, 7) and (5, 7) in Z45^2. Projecting to Z9, these vectors are (8, 7) and
(5, 7), which are modular independent as follows. If α_1(8, 7) + α_2(5, 7) = (0, 0) in Z9^2, then
7α_1 + 7α_2 = 0 and 8α_1 + 5α_2 = 0, implying α_1 = −α_2 and 3α_1 = 0. Thus α_1, α_2 ∈ ⟨3⟩. As
⟨3⟩ is the maximal ideal of Z9, the vectors are modular independent over one projection and
hence over Z45. However, 15(44, 7) + 30(5, 7) = (0, 0), with neither 15(44, 7) nor 30(5, 7) the
zero vector, and so these vectors are not independent.
Next we give an example where independence does not imply modular independence.
Example 6.6.10 Consider the vectors (9, 0) and (0, 5) in Z245 . Suppose α1 (9, 0)+α2 (0, 5) =
(0, 0) with α1 , α2 ∈ Z45 . Then 9α1 = 0 and 5α2 = 0 implying α1 is a multiple of 5 and α2
is a multiple of 9. In either case, both α1 (9, 0) and α2 (0, 5) are the zero vector. Hence these
vectors are independent. However, projecting to Z9 , these vectors are (0, 0) and (0, 5) and
projecting to Z5 , these vectors are (4, 0) and (0, 0). Neither set is modular independent.
Therefore the vectors are not modular independent over Z45 .
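Both examples are small enough to settle by exhaustive search. A Python sketch over Z45 (the helper names are ours):

```python
from itertools import product

N = 45

def scaled(a, v):
    """The vector a*v over Z_N."""
    return tuple(a * x % N for x in v)

def independent(v1, v2):
    """True when a1*v1 + a2*v2 = 0 forces both summands a_j*v_j to be zero."""
    zero = (0,) * len(v1)
    for a1, a2 in product(range(N), repeat=2):
        s = tuple((x + y) % N for x, y in zip(scaled(a1, v1), scaled(a2, v2)))
        if s == zero and (scaled(a1, v1) != zero or scaled(a2, v2) != zero):
            return False
    return True
```

(9, 0) and (0, 5) come out independent, while (44, 7) and (5, 7) do not: 15(44, 7) and 30(5, 7) are nonzero summands adding to zero.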
Definition 6.6.11 Let C be a code over a finite commutative Frobenius ring R. The code-
words v1 , v2 , . . . , vk are called a basis of C if they are independent, modular independent,
and generate C.
(c) If R is a principal ideal ring and C is an arbitrary code over R, then any basis for C
contains exactly r codewords, where r is the rank of C.
Note that over a local ring, the definition of independence implies modular independence.
Namely, if v_1, v_2, . . . , v_k are independent over a local ring R with maximal ideal m, then
Σ_{j=1}^{k} α_j v_j = 0 implies each α_j v_j = 0. Note that the non-units of R are precisely the
elements in m. If α_j is a unit of R, then v_j = 0, a contradiction. So α_j is not a unit of R,
implying that α_j ∈ m, which shows v_1, v_2, . . . , v_k are modular independent.
Any code C of length n over an alphabet of size q satisfies the Singleton Bound |C| ≤ q^{n−d_H(C)+1},
where d_H(C) is the minimum Hamming distance of C. Codes meeting this bound are called
maximum distance separable (MDS) codes. See Singleton's early paper [1717] for
details. The classification of such codes is at present an open question and has proven to be
quite intractable; see [630] for a complete description. For linear codes over fields this bound
becomes d_H(C) ≤ n − k + 1, where k is the dimension of the code as a vector space. Since
a code over a ring is not a vector space, this algebraic version does not apply. An algebraic
version of this bound has been proven for codes over principal ideal rings. There are additional
results for more general classes of rings, but they are not as easy to state. Recall that the
rank of a code over a principal ideal ring is the minimum number of generators of
that code.
Theorem 6.7.1 ([1682]) Let C be a linear code over a finite commutative Frobenius prin-
cipal ideal ring. Then
dH (C) ≤ n − rank(C) + 1.
A code meeting this bound is said to be a maximum distance with respect to rank
(MDR) code.1 An MDR code is not necessarily an MDS code as in the following example.
Example 6.7.2 Let R be a finite principal ideal ring. Let a be any nontrivial ideal in R.
We know that the ideal has a single generator since all the ideals are principally generated.
Therefore the rank of a is 1. Its minimum Hamming weight and length are also 1. Then
since 1 = 1 − 1 + 1 the code is an MDR code. This code is, however, not an MDS code since
it does not have cardinality |R|.
Recall that a code C is a free code if it is isomorphic to Rk for some natural number k.
The following theorem determines when an MDR code is MDS.
Theorem 6.7.3 ([627]) Let R be a finite commutative Frobenius ring and let C be a code
over R. The code C is an MDS code if and only if C is an MDR code and C is free.
1 The concept of MDR codes is not to be confused with MRD codes arising in the study of rank-metric
codes where MRD codes also satisfy an appropriate Singleton-type bound; see Chapter 11.
One can also use the Chinese Remainder Theorem to construct MDR codes as explained
in the following theorem.
Theorem 6.7.4 ([627]) Let R = CRT(R1 , R2 , . . . , Rs ) and let Ci be a code over Ri for
1 ≤ i ≤ s. If Ci is an MDR code for each i, then C = CRT(C1 , C2 , . . . , Cs ) is an MDR code.
If Ci is an MDS code of the same rank for each i, then C = CRT(C1 , C2 , . . . , Cs ) is an MDS
code.
Various other results can be found for different weights in terms of what constitutes a
maximal code; see [634] for examples. These results indicate that we are trying to find codes
which are maximal for certain weights given their parameters. Specifically, we are trying
to find codes over a ring R which have the largest minimum weight (for a given weight)
given its length and type. Essentially, this is a broader question than the main problem
of coding theory which seeks to find the largest minimum Hamming distance for a given
length and dimension over a field. In terms of rings, this is generalized for various weights
and for various forms of generators.
6.8 Conclusion
Codes over rings have become an integral part of algebraic coding theory. Their study
started with the realization that certain important binary codes were the images of linear
quaternary codes under a Gray map, and it expanded greatly when it was shown that both
MacWilliams theorems generalize to codes over Frobenius rings.
In this chapter, we have laid out foundational results for the study of codes over rings
that parallel classical coding theory. These results make it possible to expand the uses of
classical coding theory in applications as well as in number theory, the geometry of numbers,
ring theory and discrete mathematics. The interested reader can find a complete description
of codes over rings in [619, 1668, 1735].
Chapter 7
Quasi-Cyclic Codes
Cem Güneri
Sabancı University
San Ling
Nanyang Technological University
Buket Özkaya
Nanyang Technological University
7.1 Introduction
Cyclic codes are among the most useful and well-studied code families for various reasons,
such as effective encoding and decoding. These desirable properties are due to the algebraic
structure of cyclic codes, which is rather simple. Namely, a cyclic code can be viewed as
an ideal in a certain quotient ring obtained from a polynomial ring with coefficients from a
finite field.
It is then natural for coding theorists to search for generalizations of cyclic codes, which
also have nice properties. This chapter is devoted to one of the first such generalizations,
namely the family of quasi-cyclic (QC) codes. Algebraically, QC codes are modules rather
than ideals. As desired, QC codes turned out to be a very useful generalization and their
investigation continues intensively today.
There is a vast literature on QC codes and it is impossible to touch upon all aspects in
a chapter. This chapter is constrained to some of the most fundamental aspects to which
the authors also made contributions over the years. We start with the algebraic structure
of QC codes, present how the vectorial and algebraic descriptions are related to each other,
and also discuss the code generators. Then we focus on the decomposition of QC codes, via
the Chinese Remainder Theorem (CRT decomposition) and concatenation. In fact, these
decompositions turn out to be equivalent in some sense, which is also mentioned in the
chapter. Decomposition of a QC code yields a trace representation. Moreover, the dual QC
code can be described in the decomposed form, which has important consequences of its
own. For instance, self-dual and linear complementary dual QC codes can be characterized
in terms of their CRT decompositions. Moreover, the asymptotic results presented in the
chapter heavily rely on the decomposed structure of QC codes. Three different general
minimum distance bounds on QC codes are presented here, with fairly complete proofs or
detailed elaborations. A relation between QC codes and convolutional codes is also shown
in the end.
then being invariant under a shift by ℓ units amounts to being closed under the row shift
where each row is moved downward one row and the bottom row is moved to the top.
Consider the principal ideal I = ⟨x^m − 1⟩ of F_q[x] and define the quotient ring R :=
F_q[x]/I. If T represents the shift-by-1 operator on F_q^{mℓ}, then its action on v ∈ F_q^{mℓ} will be
denoted by T · v. Hence, F_q^{mℓ} has an F_q[x]-module structure given by the multiplication

  F_q[x] × F_q^{mℓ} → F_q^{mℓ},   (a(x), v) ↦ a(T) · v.
Given a QC code C ⊆ R^ℓ, it follows that the preimage Ψ^{−1}(C) = C̃ of C in F_q[x]^ℓ is an F_q[x]-
submodule containing K̃ = {(x^m − 1)e_j | 0 ≤ j ≤ ℓ − 1}, where e_j denotes the standard
basis vector of length ℓ with 1 at the coordinate j and 0 elsewhere. Throughout, the tilde ˜
represents structures over F_q[x].
Since C̃ is a submodule of the finitely generated free module F_q[x]^ℓ over the principal
ideal domain F_q[x] and it contains K̃, it has a generating set of the form
generating C̃. By using elementary row operations, we may triangularize M so that another
generating set can be obtained from the rows of an upper-triangular ℓ × ℓ matrix G̃(x) with entries
in F_q[x], where G̃(x) satisfies the following conditions (see [1191, Theorem 2.1]):
Since each f_i(x) divides x^m − 1, its roots are powers of some fixed primitive mth root of
unity ξ in an extension field of F_q. For each i = 1, 2, . . . , s, let u_i be the smallest nonnegative
integer such that f_i(ξ^{u_i}) = 0. Since the f_i(x)'s are irreducible, the direct summands in (7.5)
are field extensions of F_q. If E_i := F_q[x]/⟨f_i(x)⟩ for 1 ≤ i ≤ s, then we have

  R ≅ E_1 ⊕ · · · ⊕ E_s,   a(x) ↦ ( a(ξ^{u_1}), . . . , a(ξ^{u_s}) ).        (7.6)

The constituent codes of C are then

  C_i = span_{E_i}{ ( a_{b,0}(ξ^{u_i}), . . . , a_{b,ℓ−1}(ξ^{u_i}) ) | 1 ≤ b ≤ r }.        (7.8)
Note that each extension field E_i above is isomorphic to a minimal cyclic code of length
m over F_q, namely, the cyclic code whose check polynomial is f_i(x). If we denote by θ_i
the generating primitive idempotent for the minimal cyclic code in consideration (see Sec-
tion 2.3), then the isomorphism is given by the maps

  ϕ_i : ⟨θ_i⟩ → E_i,  a(x) ↦ a(ξ^{u_i})   and   ψ_i : E_i → ⟨θ_i⟩,  δ ↦ Σ_{k=0}^{m−1} a_k x^k,        (7.9)

where

  a_k = (1/m) Tr_{E_i/F_q}(δ ξ^{−k u_i}).
Here, Tr_{E_i/F_q} denotes the trace map from E_i onto F_q; if [E_i : F_q] = e_i, then Tr_{E_i/F_q}(x) =
Tr_{q^{e_i}/q}(x) = x + x^q + · · · + x^{q^{e_i−1}}. If C_i is a length ℓ linear code over E_i, we denote its
concatenation with ⟨θ_i⟩ by ⟨θ_i⟩C_i, and the concatenation is carried out by the map ψ_i,
extended to E_i^ℓ. In other words, ψ_i is applied to each symbol of a codeword in C_i to
produce an element of ⟨θ_i⟩^ℓ.
Jensen gave the following concatenated description for QC codes.
Theorem 7.3.1 ([1042])
(a) Let C be an R-submodule of R^ℓ (i.e., a QC code). Then for some subset I of {1, . . . , s},
there exist linear codes C_i of length ℓ over E_i, which can be explicitly described, such
that

  C = ⊕_{i∈I} ⟨θ_i⟩C_i.

(b) Conversely, let C_i be a linear code in E_i^ℓ for each i ∈ I ⊆ {1, . . . , s}. Then

  C = ⊕_{i∈I} ⟨θ_i⟩C_i

is a QC code of length mℓ and index ℓ over F_q.
7.3.2 Applications
In this section, we present some constructions and characterizations of QC codes using
their CRT decomposition.
where λ_i = (λ_{i,0}, . . . , λ_{i,ℓ−1}) ∈ C_i for all i. Since mC = C, every codeword in C can still be
written in the form of (7.10) with the constant 1/m removed. Note that the row shift invariance
of codewords amounts to being closed under multiplication by ξ^{−1} in this representation.
Let F := F_q(ξ^{u_1}, . . . , ξ^{u_s}) be the splitting field of x^m − 1 (i.e., the smallest field containing
all the E_i's), and let k_1, . . . , k_s ∈ F with Tr_{F/E_i}(k_i) = 1 for each i. We can now unify the
traces and rewrite c ∈ C as follows (see [869]):

  c = ( Tr_{F/F_q}( Σ_{i=1}^{s} k_i λ_{i,t} ξ^{−j u_i} ) )_{0≤t≤ℓ−1, 0≤j≤m−1}.        (7.11)
Note that the case ℓ = 1 gives us the trace representation of a cyclic code of length m, in
the sense of the following formulation.

Theorem 7.3.2 ([1899, Proposition 2.1]) Let gcd(m, q) = 1 and let ξ be a primitive
mth root of unity in some extension field F of F_q. Assume that u_1, . . . , u_s are nonnegative
integers and let C be a q-ary cyclic code of length m such that the basic set of zeros of
its dual is BZ(C⊥) = {ξ^{u_1}, ξ^{u_2}, . . . , ξ^{u_s}} (i.e., the generator polynomial of C⊥ is ∏_i m_i(x),
where m_i(x) ∈ F_q[x] is the minimal polynomial of ξ^{u_i} over F_q). Then

  C = { ( Σ_{i=1}^{s} Tr_{E_i/F_q}( λ_i ξ^{−j u_i} ) )_{0≤j≤m−1} | λ_i ∈ E_i for 1 ≤ i ≤ s },

where E_i = F_q(ξ^{u_i}).

Hence, the columns of any codeword in the QC code C viewed as in (7.11) lie in the cyclic
code D ⊆ F_q^m with BZ(D⊥) = {ξ^{−u_1}, . . . , ξ^{−u_s}}; see [868, Proposition 4.2].
Example 7.3.3 Let m = 3 and q ≡ 2 (mod 3), so that x^3 − 1 factors into x − 1 and
x^2 + x + 1 over F_q. By (7.5) and (7.6), we obtain

  R = F_q[x]/⟨x^3 − 1⟩ ≅ F_q[x]/⟨x − 1⟩ ⊕ F_q[x]/⟨x^2 + x + 1⟩ ≅ F_q ⊕ F_{q^2}.

Therefore, any QC code C of length 3ℓ and index ℓ has two linear constituents C_1 ⊆ F_q^ℓ and
C_2 ⊆ F_{q^2}^ℓ such that C ≅ C_1 ⊕ C_2. By (7.11), the codewords of C, viewed as 3 × ℓ arrays,
have rows of the form (see [1269, Theorem 6.7])

  z + 2a − b,
  z − a + 2b,
  z − a − b,

where z ∈ C_1 and a + bα ∈ C_2, with α ∈ F_{q^2} a root of x^2 + x + 1.
To decompose the dual, factor x^m − 1 into irreducible polynomials over F_q as

  x^m − 1 = g_1(x) · · · g_n(x) h_1(x)h_1^*(x) · · · h_p(x)h_p^*(x),        (7.13)

where g_i(x) is self-reciprocal for all 1 ≤ i ≤ n, and h_t^*(x) = x^{deg(h_t)} h_t(x^{−1}) denotes the
reciprocal polynomial of h_t(x) for all 1 ≤ t ≤ p. Note that s = n + 2p since (7.4) and (7.13)
must be identical.
Let G_i := F_q[x]/⟨g_i(x)⟩, H_t′ := F_q[x]/⟨h_t(x)⟩, and H_t″ := F_q[x]/⟨h_t^*(x)⟩ for each i and t.
By the CRT, the decomposition of R in (7.5) now becomes

  R ≅ ( ⊕_{i=1}^{n} G_i ) ⊕ ( ⊕_{t=1}^{p} (H_t′ ⊕ H_t″) ),

which implies

  R^ℓ ≅ ( ⊕_{i=1}^{n} G_i^ℓ ) ⊕ ( ⊕_{t=1}^{p} ((H_t′)^ℓ ⊕ (H_t″)^ℓ) ).
For each i with G_i ≠ F_q, equip G_i^ℓ with the inner product

  ⟨c, d⟩ = Σ_{j=0}^{ℓ−1} c_j d_j^{√|G_i|},

where c = (c_0, . . . , c_{ℓ−1}), d = (d_0, . . . , d_{ℓ−1}) ∈ G_i^ℓ. This is the Hermitian inner product. For
the two exceptions, in which case the corresponding field G_i is F_q, we equip G_i^ℓ with the
usual Euclidean inner product. With the appropriate inner product, Hermitian or Euclidean,
on G_i^ℓ, ⊥G denotes the dual on G_i^ℓ. For each 1 ≤ t ≤ p, (H_t′)^ℓ = (H_t″)^ℓ is also equipped with the
Euclidean inner product; ⊥e denotes the dual on (H_t′)^ℓ = (H_t″)^ℓ.
The dual of a QC code is also QC and the following result is immediate.
Proposition 7.3.5 ([1269, Theorem 4.2]) Let C be a QC code with CRT decomposition
as in (7.14). Then its dual code C⊥ (under the Euclidean inner product) is of the form

  C⊥ = ( ⊕_{i=1}^{n} C_i^{⊥G} ) ⊕ ( ⊕_{t=1}^{p} ( C_t″^{⊥e} ⊕ C_t′^{⊥e} ) ).        (7.16)
Clearly Φ(c) lies in some cyclic code for any c ∈ C. We now define the smallest such cyclic
code as Ĉ. First, we equivalently extend the map Φ above to the polynomial description of
codewords as in (7.2). If C has generating set {f_1, . . . , f_r}, where f_k = ( f_0^{(k)}(x), . . . , f_{ℓ−1}^{(k)}(x) ) ∈ F_q[x]^ℓ for each
k ∈ {1, . . . , r}, then Ĉ = ⟨gcd(f̄_1(x), . . . , f̄_r(x), x^m − 1)⟩ with f̄_k = Φ(f_k) ∈ F_{q^ℓ}[x] for all
k ∈ {1, . . . , r} [1189].
Next, we consider the q-ary linear code of length ℓ that is generated by the rows of
the codewords in C, which are represented as m × ℓ arrays as in (7.1). Namely, let B
be the linear block code of length ℓ over F_q generated by { f_{k,i} | k ∈ {1, . . . , r}, i ∈
{0, . . . , m − 1} } ⊆ F_q^ℓ, where each f_{k,i} := ( f_{i,0}^{(k)}, . . . , f_{i,ℓ−1}^{(k)} ) ∈ F_q^ℓ is the vector of the ith
coefficients of the polynomials f_j^{(k)}(x) = f_{0,j}^{(k)} + f_{1,j}^{(k)} x + · · · + f_{m−1,j}^{(k)} x^{m−1}, for all k ∈ {1, . . . , r}
and j ∈ {0, . . . , ℓ − 1}.
Since the image of any codeword c(x) ∈ C under the map Φ is an element of the cyclic
code Ĉ over F_{q^ℓ}, there are at least d_H(Ĉ) nonzero rows in each nonzero codeword of C. For
any i ∈ {0, . . . , m − 1}, the ith row c_i = (c_{i,0}, . . . , c_{i,ℓ−1}) of a codeword c ∈ C can be viewed
as a codeword in B; therefore, a nonzero c_i has weight at least d_H(B). Hence we have shown
the following.

Theorem 7.4.2 (Lally Bound [1189, Theorem 5]) Let C be an r-generator QC code
of length mℓ and index ℓ over F_q with generating set {f_1, . . . , f_r} ⊆ F_q[x]^ℓ. Let the cyclic
code Ĉ ⊆ F_{q^ℓ}^m and the linear code B ⊆ F_q^ℓ be defined as above. Then

  d_H(C) ≥ d_H(Ĉ) d_H(B).
The roots of x^m − 1 are of the form 1, ξ, . . . , ξ^{m−1}, where ξ is a fixed primitive mth root
of unity. Henceforth, let Ω := {ξ^k | 0 ≤ k ≤ m − 1} be the set of all mth roots of unity.
Recall that the splitting field of x^m − 1, denoted by F, is the smallest extension field of F_q
that contains Ω. Given the cyclic code C = ⟨g(x)⟩, the set L := {ξ^k | g(ξ^k) = 0} ⊆ Ω of
roots of its generator polynomial is called the zeros of C. Note that ξ^k ∈ L implies ξ^{qk} ∈ L
for each k. A nonempty subset E ⊆ Ω is said to be consecutive if there exist integers e,
n, and δ with e ≥ 0, δ ≥ 2, n > 0, and gcd(m, n) = 1 such that

  E = { ξ^{e+jn} | 0 ≤ j ≤ δ − 2 }.
We now describe the Roos Bound for cyclic codes. For ∅ ≠ P ⊆ Ω, let C_P denote
any cyclic code of length m over F_q whose set of zeros contains P. Let d_P denote the least
possible minimum distance of any such C_P. In particular, d_P = d_H(C_P)
when P is the set of zeros of C_P.
Theorem 7.4.3 (Roos Bound [1594, Theorem 2]) Let N and M be two nonempty
subsets of Ω. If there exists a consecutive set M′ containing M such that |M′| ≤ |M| + d_N − 2,
then we have d_{MN} ≥ |M| + d_N − 1, where MN := {εϑ | ε ∈ M, ϑ ∈ N}.
Remark 7.4.5 In particular, the case M = {1} in Corollary 7.4.4 yields the BCH Bound
for the associated cyclic code. Taking M′ = M in the corollary yields the Hartmann–Tzeng
Bound.
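As an illustration of the BCH Bound, the Python sketch below builds the binary [7, 4] cyclic code with generator polynomial g(x) = 1 + x + x^3, whose zeros include a consecutive pair ξ, ξ^2 and thus give designed distance 3, and confirms the true minimum distance by exhaustion. This is an assumed example, not one from the text.

```python
from itertools import product

def polymul_mod(a, b, m=7):
    """Multiply binary polynomials (coefficient lists, low degree first)
    modulo x^m - 1."""
    out = [0] * m
    for i, ai in enumerate(a):
        if ai:
            for j, bj in enumerate(b):
                out[(i + j) % m] ^= bj
    return out

g = [1, 1, 0, 1]                                  # g(x) = 1 + x + x^3
codewords = [polymul_mod(list(msg), g) for msg in product([0, 1], repeat=4)]
min_dist = min(sum(c) for c in codewords if any(c))
```

The exhaustive minimum distance is 3, meeting the BCH Bound from the two consecutive zeros; this is the binary Hamming code.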
Another improvement to the Hartmann–Tzeng Bound for cyclic codes, known as the
Shift Bound, was given by van Lint and Wilson in [1838]. To describe the Shift Bound, we
need the notion of an independent set, which can be constructed over any field in a recursive
way.
Let S be a subset of some field K of any characteristic. One inductively defines a family
of finite subsets of K, called independent with respect to S, as follows:
Condition 1: ∅ is independent with respect to S.
Condition 2: If A ⊆ S is independent with respect to S, then A ∪ {b} is independent with
respect to S for all b ∈ K \ S.
Condition 3: If A is independent with respect to S and c ∈ K∗ = K \ {0}, then cA is
independent with respect to S.
We define the weight of a polynomial f (x) ∈ K[x], denoted by wtH (f ), as the number
of nonzero coefficients in f (x).
Theorem 7.4.6 (Shift Bound [1838, Theorem 11]) Let 0 ≠ f(x) ∈ K[x] and let S :=
{θ ∈ K | f(θ) = 0}. Then wt_H(f) ≥ |A| for every subset A of K that is independent with
respect to S.
The minimum distance bound for a given cyclic code follows by considering the weights
of its codewords c(x) ∈ C and the independent sets with respect to subsets of its zeros L.
Observe that, in this case, the universe of the independent sets is Ω, not F, because all of
the possible roots of the codewords are contained in Ω. Moreover, we choose b from Ω \ P
in Condition 2 above, where P ⊆ L, and c in Condition 3 is of the form ξ^k ∈ F*, for some
0 ≤ k ≤ m − 1.
and we define an eigenvalue of C to be a root β of det(G̃(x)). Note that, since g_{j,j}(x) |
(x^m − 1) for each 0 ≤ j ≤ ℓ − 1, all eigenvalues of C are elements of Ω; i.e., β = ξ^k for some
k ∈ {0, . . . , m − 1}. The algebraic multiplicity of β is the largest integer a such that
(x − β)^a | det(G̃(x)). The geometric multiplicity of β is the dimension of the null space
of G̃(β). This null space, denoted by V_β, is called the eigenspace of β. In other words,

$$V_\beta := \big\{ v \in \mathsf{F}^\ell \mid \widetilde{G}(\beta) v^T = 0_\ell^T \big\},$$

where F is the splitting field of x^m − 1, as before, and T denotes the transpose. It was
shown in [1645] that, for a given QC code and the associated G̃(x) ∈ F_q[x]^{ℓ×ℓ}, the algebraic
multiplicity a of an eigenvalue β is equal to its geometric multiplicity dim_F(V_β).
Lemma 7.4.8 ([1645, Lemma 1]) The algebraic multiplicity of any eigenvalue of a QC
code C is equal to its geometric multiplicity.
From this point on, we let Ω̄ ⊆ Ω denote the nonempty set of all eigenvalues of C, where
|Ω̄| = t > 0. Note that Ω̄ = ∅ if and only if the diagonal elements g_{j,j}(x) in G̃(x) are constant
and C is the trivial full space code. Choose an arbitrary eigenvalue β_i ∈ Ω̄ with multiplicity
n_i for some i ∈ {1, . . . , t}. Let {v_{i,0}, . . . , v_{i,n_i−1}} be a basis for the corresponding eigenspace
V_i. Consider the matrix

$$V_i := \begin{pmatrix} v_{i,0} \\ \vdots \\ v_{i,n_i-1} \end{pmatrix} = \begin{pmatrix} v_{i,0,0} & \cdots & v_{i,0,\ell-1} \\ \vdots & \ddots & \vdots \\ v_{i,n_i-1,0} & \cdots & v_{i,n_i-1,\ell-1} \end{pmatrix} \tag{7.19}$$
having the basis elements as its rows. We let

$$H_i := (1, \beta_i, \ldots, \beta_i^{m-1}) \otimes V_i \quad\text{and}\quad H := \begin{pmatrix} H_1 \\ \vdots \\ H_t \end{pmatrix} = \begin{pmatrix} V_1 & \beta_1 V_1 & \cdots & \beta_1^{m-1} V_1 \\ \vdots & \vdots & \ddots & \vdots \\ V_t & \beta_t V_t & \cdots & \beta_t^{m-1} V_t \end{pmatrix}. \tag{7.20}$$

Observe that H has $n := \sum_{i=1}^{t} n_i$ rows. We also have that $n = \sum_{j=0}^{\ell-1} \deg(g_{j,j}(x))$ by Lemma
7.4.8. The rows of H are linearly independent, a fact shown in [1645, Lemma 2]. This proves
the following lemma.
The BCH-like minimum distance bound of Semenov and Trifonov for a given QC code in
[1645, Theorem 2] was expressed in terms of the size of a consecutive subset of eigenvalues
in Ω and the minimum distance of the common eigencode related to this consecutive subset.
Zeh and Ling generalized their approach and derived an HT-like bound in [1940, Theorem
1] without using the parity check matrix in their proof. The eigencode, however, is still
needed. In the next section, we will prove the analogs of these bounds for QC codes in
terms of the Roos and Shift Bounds.
Theorem 7.4.13 allows us to use any minimum distance bound derived for cyclic codes
based on their zeros. The following special cases are immediate after the preparation that
we have done in Section 7.4.3.1; compare this to Theorems 7.4.3 and 7.4.6.
Proof: For part (a), let N = {ξ^{u_1}, . . . , ξ^{u_r}} and M = {ξ^{v_1}, . . . , ξ^{v_s}} be such that there exists
a consecutive set M′ = {ξ^z | v_1 ≤ z ≤ v_s} ⊆ Ω containing M with |M′| ≤ |M| + d_N − 2.
We define the matrices

$$\widetilde{H}_N := \begin{pmatrix} 1 & \xi^{u_1} & \xi^{2u_1} & \cdots & \xi^{(m-1)u_1} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & \xi^{u_r} & \xi^{2u_r} & \cdots & \xi^{(m-1)u_r} \end{pmatrix} \quad\text{and}\quad \widetilde{H}_M := \begin{pmatrix} 1 & \xi^{v_1} & \xi^{2v_1} & \cdots & \xi^{(m-1)v_1} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & \xi^{v_s} & \xi^{2v_s} & \cdots & \xi^{(m-1)v_s} \end{pmatrix},$$

as well as

$$\widetilde{H}_{T_A} := \begin{pmatrix} 1 & \xi^{w_1} & \xi^{2w_1} & \cdots & \xi^{(m-1)w_1} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & \xi^{w_y} & \xi^{2w_y} & \cdots & \xi^{(m-1)w_y} \end{pmatrix}.$$
Let V_{T_A} be the matrix corresponding to a basis of V_{T_A}, which is the intersection of the
eigenspaces belonging to the eigenvalues in T_A. Let C_{T_A} be the eigencode corresponding to
the eigenspace V_{T_A}. We again set Ĥ_{T_A} := H̃_{T_A} ⊗ V_{T_A}, and the result follows in a similar way
by using the Shift Bound in Theorem 7.4.6.
Remark 7.4.15 We can obtain the BCH-like bound in [1645, Theorem 2] and the HT-like
bound in [1940, Theorem 1] by using Remarks 7.4.5 and 7.4.7.
7.5 Asymptotics
Let (C_i)_{i≥1} be a sequence of linear codes over F_q, and let N_i, d_i, and k_i denote respectively
the length, minimum distance, and dimension of C_i, for all i. Assume that lim_{i→∞} N_i = ∞.
Let

$$\delta := \liminf_{i \to \infty} \frac{d_i}{N_i} \quad\text{and}\quad R := \liminf_{i \to \infty} \frac{k_i}{N_i}$$

denote the relative distance and the relative rate of the sequence (C_i)_{i≥1}. Both R and δ
are finite as they are limits of bounded quantities. If Rδ ≠ 0, then (C_i)_{i≥1} is called an
asymptotically good sequence of codes.
We will require the celebrated q-ary entropy function

$$H_q(y) := y \log_q(q-1) - y \log_q(y) - (1-y)\log_q(1-y),$$

defined for 0 < y < (q−1)/q and of constant use in estimating binomial coefficients of large
arguments [1323, pages 309–310]. The asymptotic Gilbert–Varshamov Bound (see [1323,
Chapter 17, Theorem 30] or Theorem 1.9.28) states that, for every q and 0 < δ < 1 − 1/q,
there exists an infinite family of q-ary codes with limit rate

$$R \ge 1 - H_q(\delta).$$
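As a quick numerical illustration (the function name is ours), the q-ary entropy and the resulting Gilbert–Varshamov rate can be evaluated directly; for q = 2, a relative distance δ ≈ 0.11 corresponds to a guaranteed rate of roughly 1/2, the regime relevant for the self-dual codes discussed next:

```python
import math

def Hq(y, q):
    """q-ary entropy H_q(y) = y log_q(q-1) - y log_q(y) - (1-y) log_q(1-y),
    defined for 0 < y < (q-1)/q."""
    return (y * math.log(q - 1, q)
            - y * math.log(y, q)
            - (1 - y) * math.log(1 - y, q))

# Gilbert-Varshamov limit rate R >= 1 - H_q(delta):
print(round(1 - Hq(0.11, 2), 2))  # -> 0.5
```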
equipped with the Hermitian inner product (7.15). Self-duality in the following discussion
is with respect to these respective inner products. A binary self-dual code is said to be of
Type II if and only if all its weights are multiples of 4 and of Type I otherwise. We first
recall some background material on mass formulas for self-dual codes over F2 and F4 . See
also Theorem 4.5.2.
Proposition 7.5.1 Let ℓ be an even positive integer.

(b) Let v ∈ F_2^ℓ with even Hamming weight, other than 0 and 1. The number of self-dual
binary codes of length ℓ containing v is given by

$$M(2, \ell) = \prod_{i=1}^{\ell/2 - 2} (2^i + 1).$$

(d) The number of self-dual F_4-codes of length ℓ containing a given nonzero codeword of
length ℓ and even Hamming weight is given by

$$M(4, \ell) = \prod_{i=0}^{\ell/2 - 2} (2^{2i+1} + 1).$$
Proof: Parts (a) and (c) are well-known facts, found for example in [1008, 1555]. Part (b)
is an immediate consequence of [1324, Theorem 2.1] with s = 2, noting that every self-dual
binary code must contain the all-one vector 1. Part (d) follows from [438, Theorem 1] with
n1 = ` and k1 = 1.
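The two mass-formula products are easy to evaluate; a small sketch with our own helper names, following the products in Proposition 7.5.1(b) and (d):

```python
from math import prod

def M2(l):
    """Prop. 7.5.1(b): number of self-dual binary codes of length l containing
    a fixed admissible even-weight word: prod_{i=1}^{l/2-2} (2^i + 1)."""
    return prod(2**i + 1 for i in range(1, l // 2 - 1))

def M4(l):
    """Prop. 7.5.1(d): the analogous count over F4: prod_{i=0}^{l/2-2} (2^(2i+1) + 1)."""
    return prod(2**(2 * i + 1) + 1 for i in range(0, l // 2 - 1))

print(M2(8), M4(8))  # -> 15 891
```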
(b) Let v ∈ F_2^ℓ with Hamming weight divisible by 4, other than 0 and 1. The number of
Type II binary self-dual codes of length ℓ containing v is given by

$$S(2, \ell) = 2 \prod_{i=1}^{\ell/2 - 3} (2^i + 1).$$
Proof: Part (a) is found in [1008, 1555], and part (b) is exactly [1324, Corollary 2.4].
Let C1 denote a binary code of length ` and let C2 be a code over F4 of length `. We
construct a binary QC code C of length 3` and index ` whose codewords are of the form
given in (7.12). It is easy to check that C is self-dual if and only if both C1 and C2 are
self-dual, and C is of Type II if and only if C1 is of Type II and C2 is self-dual.
We assume henceforth that C is a binary self-dual QC code constructed in the above
way. Any codeword c ∈ C must necessarily have even Hamming weight. Suppose that c
corresponds to the pair (c1 , c2 ), where c1 ∈ C1 and c2 ∈ C2 . Since C1 and C2 are self-dual,
it follows that c1 and c2 must both have even Hamming weights. When c ≠ 0_{3ℓ}, there are
three possibilities for (c1, c2):

Case 1: c1 ≠ 0_ℓ, c2 ≠ 0_ℓ;
Case 2: c1 = 0_ℓ, c2 ≠ 0_ℓ; and
Case 3: c1 ≠ 0_ℓ, c2 = 0_ℓ.
We count the number of codewords c in each of these cases for a given even weight d.

For Case 2, if the Hamming weight of c is d, then C ≅ C1 ⊕ C2 implies that c2 has
Hamming weight d/2. Since c2 has even Hamming weight, it follows that d is divisible by 4
in order for this case to occur. It is easy to see that the number A_2(ℓ, d) of such words c is
bounded above by $\binom{\ell}{d/2} 3^{d/2}$ where 4 | d. For d not divisible by 4, set A_2(ℓ, d) = 0.

The argument to obtain the number of words of Case 3 is similar. It is easy to show
that the number A_3(ℓ, d) of such words is bounded above by $\binom{\ell}{d/3}$ where 6 | d. When d is
not divisible by 6, set A_3(ℓ, d) = 0.

The total number of vectors in F_2^{3ℓ} of weight d is $\binom{3\ell}{d}$; so for Case 1, we have

$$A_1(\ell, d) \le \binom{3\ell}{d} - A_2(\ell, d) - A_3(\ell, d).$$
Theorem 7.5.3 ([1270, Theorem 3.1]) Let ℓ be an even integer and let d be the largest
even integer such that

$$\sum_{\substack{e < d \\ e \equiv 0 \bmod 2}} \binom{3\ell}{e} + \big(2^{\ell/2 - 1} + 1\big) \sum_{\substack{e < d \\ e \equiv 0 \bmod 4}} \binom{\ell}{e/2} 3^{e/2} + \big(2^{\ell - 1} + 1\big) \sum_{\substack{e < d \\ e \equiv 0 \bmod 6}} \binom{\ell}{e/3} < \big(2^{\ell/2 - 1} + 1\big)\big(2^{\ell - 1} + 1\big).$$

Then there exists a self-dual binary QC code of length 3ℓ and index ℓ with minimum distance
at least d.
Proof: Multiplying both sides of the inequality by M (2, `)M (4, `), and applying the above
upper bounds for Ai (`, d) (i = 1, 2, 3), we see that the inequality in Theorem 7.5.3 implies
that the number of self-dual binary QC codes of length 3` and index ` with minimum
distance < d is strictly less than the total number of self-dual binary QC codes of length 3`
and index `.
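The counting condition of Theorem 7.5.3 can be evaluated directly for small ℓ. In this sketch (helper names are ours; we take each sum to run over positive e only, excluding e = 0), we search for the largest even d satisfying the inequality:

```python
from math import comb

def largest_d(l):
    """Largest even d for which the counting inequality of Theorem 7.5.3 holds,
    with each sum taken over positive e < d meeting its congruence condition."""
    rhs = (2**(l // 2 - 1) + 1) * (2**(l - 1) + 1)

    def lhs(d):
        s1 = sum(comb(3 * l, e) for e in range(2, d, 2))
        s2 = sum(comb(l, e // 2) * 3**(e // 2) for e in range(4, d, 4))
        s3 = sum(comb(l, e // 3) for e in range(6, d, 6))
        return s1 + (2**(l // 2 - 1) + 1) * s2 + (2**(l - 1) + 1) * s3

    d = 2
    while lhs(d + 2) < rhs:
        d += 2
    return d

print(largest_d(6))  # -> 4: a self-dual binary QC code of length 18 and index 6
                     #    with minimum distance at least 4 is guaranteed
```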
If we are interested only in Type II QC codes, using Proposition 7.5.2, we easily see that
the number of Type II binary QC codes of length 3ℓ and index ℓ with minimum weight
< d is bounded above by

$$\sum_{\substack{e < d \\ e \equiv 0 \bmod 4}} \Big( A_1(\ell, e) S(2, \ell) M(4, \ell) + A_2(\ell, e) T(2, \ell) M(4, \ell) + A_3(\ell, e) S(2, \ell) N(4, \ell) \Big).$$
Using an argument similar to that for Theorem 7.5.3, we obtain the following result.
Theorem 7.5.4 ([1270, Theorem 3.2]) Let ℓ be divisible by 8 and let d be the largest
multiple of 4 such that

$$\sum_{\substack{e < d \\ e \equiv 0 \bmod 4}} \binom{3\ell}{e} + \big(2^{\ell/2 - 2} + 1\big) \sum_{\substack{e < d \\ e \equiv 0 \bmod 4}} \binom{\ell}{e/2} 3^{e/2} + \big(2^{\ell - 1} + 1\big) \sum_{\substack{e < d \\ e \equiv 0 \bmod 12}} \binom{\ell}{e/3} < \big(2^{\ell/2 - 2} + 1\big)\big(2^{\ell - 1} + 1\big).$$

Then there exists a Type II binary QC code of length 3ℓ and index ℓ with minimum distance
at least d.
We are now in a position to state and prove the asymptotic version of Theorems 7.5.3
and 7.5.4.
Theorem 7.5.5 ([1270, Theorem 4.1]) There exists an infinite family of binary self-dual
QC codes C_i of length 3ℓ_i and of distance d_i with lim_{i→∞} ℓ_i = ∞ such that
δ = liminf_{i→∞} d_i/(3ℓ_i) exists and is bounded below by
Proof: The right-hand side of the inequality of Theorem 7.5.3 is plainly of the order of 2^{3ℓ/2}
for large ℓ. We compare this in turn to each of the three summands on the left-hand side
(at the price of a more stringent inequality, congruence conditions on the summation range
are neglected). By [1323, Chapter 10, Corollary 9], for large ℓ (with µ = δ and n = ℓ),
the first and third summands are of order 2^{3ℓH_2(δ)} and 2^{ℓ + ℓH_2(δ)}, respectively. They both
are of the order of the right-hand side for H_2(δ) = 1/2. By [1323, Chapter 10, Lemma
7], for large ℓ (with λ = δ and n = ℓ), the second summand is of order 2^{ℓ f(3δ/2)} for
f(t) := 0.5 + t log_2(3) + H_2(t), which is of the order of the right-hand side for

δ = 0.1762 · · · ,
Proof: Since we neglected the congruence conditions in the preceding analysis, the calcu-
lations are exactly the same but using Theorem 7.5.4.
Theorem 7.5.7 ([871, Theorems 3.3 and 3.7]) Let q be a power of a prime and let
m ≥ 2 be relatively prime to q. Then there exists an asymptotically good sequence of q-ary
QCCD codes.
Proof: Let ξ be a primitive mth root of unity over Fq . There are two possibilities.
(1) 1 and −1 are not in the same q-cyclotomic coset modulo m.
(2) 1 and −1 are in the same q-cyclotomic coset modulo m.
In case (1), ξ and ξ^{−1} have distinct minimal polynomials h′(x) and h″(x), respectively,
over F_q. Let H′ = F_q[x]/⟨h′(x)⟩ and H″ = F_q[x]/⟨h″(x)⟩. Recall that these fields are equal:
H′ = F_q(ξ) = F_q(ξ^{−1}) = H″. We denote both by H. Let θ′ and θ″ denote the primitive
idempotents corresponding to the q-ary length m minimal cyclic codes with check poly-
nomials h′(x) and h″(x), respectively. Let (C_i)_{i≥1} be an asymptotically good sequence of
(Euclidean) LCD codes over H (such a sequence exists by [1347]), where each C_i has pa-
rameters [ℓ_i, k_i, d_i]. For all i ≥ 1, define the q-ary QC code D_i as in [871]; it satisfies

$$d_H(D_i) \ge \min\big\{ d_H(\langle\theta'\rangle)\, d_i,\; d_H(\langle\theta'\rangle \oplus \langle\theta''\rangle)\, d_i \big\} \ge d_H(\langle\theta'\rangle \oplus \langle\theta''\rangle)\, d_i.$$
For the sequence of QCCD codes (D_i)_{i≥1}, the relative rate is

$$R = \liminf_{i \to \infty} \frac{2ek_i}{m\ell_i} = \frac{2e}{m} \liminf_{i \to \infty} \frac{k_i}{\ell_i},$$

and this quantity is positive since (C_i)_{i≥1} is asymptotically good. For the relative distance,
we have

$$\delta = \liminf_{i \to \infty} \frac{d_H(D_i)}{m\ell_i} \ge d_H(\langle\theta'\rangle \oplus \langle\theta''\rangle) \liminf_{i \to \infty} \frac{d_i}{\ell_i}.$$
$$D_i := \langle\theta\rangle C_i.$$
If e := [G : Fq ], then the length and the dimension of Di are m`i and eki , respectively. By
the Jensen Bound of Theorem 7.4.1, we have
dH (Di ) ≥ dH (hθi)dH (Ci ).
As in part (a) we conclude that (Di )i≥1 is asymptotically good since (Ci )i≥1 is asymptotically
good.
$$\Psi : F_q[x] \longrightarrow R, \qquad f(x) \longmapsto f'(x) := f(x) \bmod (x^m - 1).$$

It is clear that for a given (ℓ, k) convolutional code C and any m > 1, there is a natural QC
code C′, of length mℓ and index ℓ, related to it as shown below. Note that we denote the
map from C to C′ also by Ψ:

$$\Psi : C \longrightarrow C', \qquad c(x) = (c_0(x), \ldots, c_{\ell-1}(x)) \longmapsto c'(x) = (c_0'(x), \ldots, c_{\ell-1}'(x)).$$

Lally [1190] showed that the minimum distance of the QC code C′ above is a lower bound
on the free distance of the convolutional code C.

Theorem 7.6.3 ([1190, Theorem 2]) Let C be an (ℓ, k) convolutional code over F_q and
let C′ be the related QC code in R^ℓ. Then d_free(C) ≥ d_H(C′).
Proof: Let c(x) be a codeword in C and set c′(x) = Ψ(c(x)) ∈ C′. We consider the two possi-
bilities. First, if c′(x) ≠ 0, then wt_H(c(x)) ≥ wt_H(c′(x)). For the second possibility, suppose
c′(x) = 0. Let γ ≥ 1 be the maximal positive integer such that (x^m − 1)^γ divides each coordi-
nate of c(x). Write c(x) = (x^m − 1)^γ (v_0(x), . . . , v_{ℓ−1}(x)) and set v(x) = (v_0(x), . . . , v_{ℓ−1}(x)).
Then v(x) is a codeword of C, and for v′(x) := Ψ(v(x)) ∈ C′ \ {0}, we have wt_H(c(x)) ≥ wt_H(v′(x))
by the first possibility. Combining the two possibilities, for any c(x) ∈ C, there exists a
nonzero v′(x) ∈ C′ such that wt_H(c(x)) ≥ wt_H(v′(x)). This proves that d_free(C) ≥ d_H(C′).
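The reduction Ψ underlying this proof is easy to emulate for binary coefficients; a sketch (helper names are ours) that folds one polynomial coordinate modulo x^m − 1 and checks the weight inequality on a toy example:

```python
def reduce_mod(coeffs, m):
    """Psi on one coordinate: fold a binary polynomial (list of 0/1 coefficients,
    index = degree) modulo x^m - 1, i.e., x^i maps to x^(i mod m)."""
    out = [0] * m
    for i, c in enumerate(coeffs):
        out[i % m] ^= c
    return out

def wt(v):
    """Hamming weight: number of nonzero coefficients."""
    return sum(1 for c in v if c)

# c(x) = 1 + x + x^3 + x^4 = (1 + x)(x^3 + 1) is divisible by x^3 - 1 over F_2,
# so it reduces to 0: the second possibility in the proof above.
c = [1, 1, 0, 1, 1]
print(reduce_mod(c, 3), wt(c))  # -> [0, 0, 0] 4
```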
Remark 7.6.4 Note that Lally uses an alternative module description of convolutional and
QC codes in [1190]. Namely, a basis {1, α, . . . , α^{ℓ−1}} of F_{q^ℓ} over F_q is fixed and the F_q[x]-
modules F_q[x]^ℓ and F_{q^ℓ}[x] are identified via the map Φ in (7.17). With this identification, a
length ℓ convolutional code is viewed as an F_q[x]-module in F_{q^ℓ}[x] and a length mℓ, index
ℓ QC code is viewed as an F_q[x]-module in F_{q^ℓ}[x]/⟨x^m − 1⟩. However, all of Lally's findings
can be translated to the module descriptions that we have been using for convolutional and
QC codes, and this is how they are presented in Theorem 7.6.3.
Acknowledgement
The third author thanks Frederic Ezerman, Markus Grassl, and Patrick Solé for their
valuable comments and discussions. The authors are also grateful to Cary Huffman for his
great help in improving this chapter. The second and third authors are supported by NTU
Research Grant M4080456.
Chapter 8
Introduction to Skew-Polynomial Rings
and Skew-Cyclic Codes
Heide Gluesing-Luerssen
University of Kentucky
8.1 Introduction
In classical block coding theory, cyclic codes are the most studied class of linear codes
with additional algebraic structure. This additional structure turns out to be highly benefi-
cial from a coding-theoretical point of view. Not only does it allow the design of codes with
large minimum distance, it also gives rise to very efficient algebraic decoding algorithms.
For further details we refer to Chapter 1 of this encyclopedia, to [1008, Ch. 4 and 5] in the
textbook by Huffman/Pless, and to the vast literature on this topic.
Initiated by Boucher/Ulmer in [256, 258, 259], the notion of cyclicity has been generalized
in various ways to skew-cyclicity during the last decade. In more precise terms the quotient
space F[x]/(x^n − 1), which is the ambient space for classical cyclic codes, is replaced by
F[x; σ]/•(x^n − 1), where F[x; σ] is the skew-polynomial ring induced by an automorphism σ
of F (see Definition 8.2.1), and •(x^n − 1) is the left ideal generated by x^n − 1. Further
generalizations are obtained by replacing the modulus x^n − 1 by x^n − a, leading to skew-
constacyclic codes, or even more general polynomials f of degree n. In any case, the quotient
is isomorphic as a left F-vector space to F^n, and thus we may consider linear codes in F^n
as subspaces of the quotient.
This allows us to define skew-cyclic codes. A linear code in F^n is (σ, f)-skew-cyclic if
it is a left submodule of F[x; σ]/•(f). As in the classical case, every such submodule is
generated by a right divisor of the modulus f. If f = x^n − a or even f = x^n − 1, the
resulting codes are called (σ, a)-skew-constacyclic or σ-skew-cyclic, respectively. The first
most striking difference when compared to the classical case is that the modulus x^n − 1 has
in general far more right divisors in F[x; σ] than in F[x]. As a consequence, a polynomial may
have more roots than its degree suggests. All of this implies that the family of skew-cyclic
codes of given length is far larger than that of cyclic codes.
While these basic definitions are straightforward, a detailed study of the algebraic and
coding-theoretic properties of skew-cyclic codes requires an understanding of the skew-
polynomial ring F[x; σ]. In Sections 8.2, 8.4, and 8.5 we will present the theory of skew
polynomials as it is needed for our study of skew-cyclic codes. This entails division properties
in the ring F[x; σ], evaluations of skew polynomials and their (right) roots, and algebraic sets
along with a skew version of Vandermonde matrices. Division and factorization properties
were studied in detail by Ore, who introduced skew-polynomial rings in the 1930’s in his
seminal paper [1466]. Evaluations of skew polynomials were first considered by Lam [1193]
in the 1980’s and then further investigated by Lam and Leroy; see [1193, 1194, 1195, 1196,
1223]. For Sections 8.2, 8.4, and 8.5, we will closely follow these sources. We will add plenty
of examples illustrating the differences to commutative polynomial rings. In Section 8.3
we will briefly present the close relation between skew polynomials over a finite field and
linearized polynomials, which play a crucial role in the area of rank-metric codes.
The material in Sections 8.2, 8.4, and 8.5 provides us with the right toolbox to study
skew-cyclic codes and their generalizations. In Sections 8.7 and 8.8 we will derive the alge-
braic theory of (σ, f )-skew-cyclic codes and specialize to skew-constacyclic codes whenever
necessary. We will do so by introducing skew circulant matrices because their row spaces
are the codes in question. As a guideline to skew circulants we give a brief approach to
classical cyclic codes via classical circulants in Section 8.6. Among other things we will
see in Section 8.8 that the dual of a (σ, x^n − a)-skew-constacyclic code is (σ, x^n − a^{−1})-
skew-constacyclic, and a generator polynomial arises as a certain kind of reciprocal of the
generator polynomial of the primal code. This result appeared first in [258] and was later
derived in more conceptual terms in [736].
In Section 8.9 we will report on constructions of skew-cyclic codes with designed min-
imum distance. The results are taken from the very recent papers [260, 828, 1791]. They
amount to essentially two kinds of skew-BCH codes. For the first kind the generator poly-
nomial has right roots that appear as consecutive powers of a suitable element in a field
extension (similar to classical BCH codes), whereas for the second kind it has right roots
that are consecutive Frobenius powers of a certain element. Both cases can be generalized
to Hartmann–Tzeng form as for classical cyclic codes. The theory of skew Vandermonde
matrices will be a natural tool in the discussion.
We wish to stress that in this survey we restrict ourselves to skew-cyclic codes derived
from skew polynomials of automorphism type over fields. More general situations have been
studied in the literature; see Remark 8.2.2. In addition, some results have been obtained
for quasi-skew-cyclic codes. We will not survey this material either. Finally, we also do not
discuss decoding algorithms for the codes of Section 8.9 in this survey.
Definition 8.2.1 Let F be any field and σ ∈ Aut(F). The skew-polynomial ring F[x; σ]
is defined as the set $\{\sum_{i=0}^{N} f_i x^i \mid N \in \mathbb{N}_0,\ f_i \in F\}$ endowed with the usual addition and
with the multiplication induced by the rule

$$x a = \sigma(a)\, x \quad \text{for all } a \in F, \tag{8.1}$$

along with distributivity and associativity. Then (F[x; σ], +, ·) is a ring with identity x^0 = 1.
Its elements are called skew polynomials or simply polynomials.
If σ = id, then F [x; σ] = F [x], the classical commutative polynomial ring over F . We
refer to this special case as the commutative case and commutative polynomials. In the
general case, the additive groups of F [x; σ] and F [x] are identical, whereas multiplication
in F [x; σ] is given by
$$\Big(\sum_{i=0}^{N} f_i x^i\Big)\Big(\sum_{j=0}^{M} g_j x^j\Big) = \sum_{i,j} f_i\, \sigma^i(g_j)\, x^{i+j}.$$
Note that the set of skew polynomials may also be written as $\{\sum_{i=0}^{N} x^i f_i \mid N \in \mathbb{N}_0,\ f_i \in F\}$,
i.e., with coefficients on the right of x. The only rule we have to obey is to apply σ when
moving coefficients from the right to the left of x, and thus σ^{−1} for the other direction. We
will always write polynomials as $\sum_{i=0}^{N} f_i x^i$, and coefficients are meant to be left coefficients.
As a consequence, the leading coefficient of a polynomial is meant to be its left leading
coefficient.
Note that F [x; σ] is a left and right vector space over F , but these two vector space
structures are not identical.
Remark 8.2.2 Skew polynomial rings are commonly introduced and studied in much more
generality. One may replace the coefficient field F by a division algebra or even a noncom-
mutative ring; one may consider (ring) endomorphisms σ instead of automorphisms; and
one may introduce a σ-derivation, say δ, which then turns (8.1) into xa = σ(a)x + δ(a).
All of this is standard in the literature of skew-polynomial rings. For simplicity of this pre-
sentation we restrict ourselves to skew-polynomial rings as in Definition 8.2.1. However, we
wish to point to the article [260] for examples showing that a σ-derivation may indeed lead
to skew-cyclic codes with better minimum distance than what can be achieved with the
aid of an automorphism alone. In addition, we will not discuss skew-polynomial rings with
coefficients from a finite ring.
Remark 8.2.3 Consider the skew-polynomial ring F[x; σ], and let K ⊆ F be the fixed
field of σ. If σ has finite order, say m, the center of F[x; σ] is given by the commutative
polynomial ring K[x^m]. This is easily seen by using the fact that any f in the center satisfies
xf = fx and af = fa for all a ∈ F. If σ has infinite order, the center is K.
Example 8.2.4 Consider the field C of complex numbers, and let σ be complex conjuga-
tion. Then the center of C[x; σ] is the commutative polynomial ring R[x^2]. Furthermore,
R[x] is a subring of both C[x; σ] and C[x], which shows that a skew-polynomial ring may
be a subring of skew-polynomial rings with different automorphisms.
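The center computation in Example 8.2.4 can be probed numerically. In this sketch (the representation and names are ours, not from the text), a skew polynomial over C is a dict mapping degree to coefficient, multiplied with the rule x·a = σ(a)·x where σ is complex conjugation; x fails to commute with i, while x² does commute:

```python
def skew_mul(f, g):
    """Multiply skew polynomials over C with sigma = complex conjugation.
    A polynomial is a dict {degree: coefficient}; x*a = conj(a)*x."""
    h = {}
    for i, a in f.items():
        for j, b in g.items():
            bb = b.conjugate() if i % 2 else b   # sigma^i(b), sigma has order 2
            h[i + j] = h.get(i + j, 0) + a * bb
    return {k: v for k, v in h.items() if v != 0}

x, x2, i_ = {1: 1}, {2: 1}, {0: 1j}
print(skew_mul(x, i_) == {1: -1j})            # True: x*i = -i*x, so x is not central
print(skew_mul(x2, i_) == skew_mul(i_, x2))   # True: x^2 commutes with i
```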
Later we will restrict ourselves to skew-polynomial rings over finite fields. The following
situation essentially covers all such cases because each automorphism is a power of the
Frobenius automorphism over the prime field. Throughout, the automorphism F_{q^m} → F_{q^m}
given by c ↦ c^q is simply called the q-Frobenius.
Example 8.2.5 Consider the skew-polynomial ring F[x; σ] where F = F_{q^m} and σ ∈ Aut(F)
is the q-Frobenius. Then σ has order m and fixed field F_q, and the center of F[x; σ] is F_q[x^m].
Let us return to general skew-polynomial rings F [x; σ] and fix the following standard
notions.
Definition 8.2.6 The degree of skew polynomials is defined in the usual way as the largest
exponent of x appearing in the polynomial, and deg(0) := −∞. This does not depend on
the side where we place the coefficients because σ is an automorphism, and we obtain
deg(f + g) ≤ max {deg(f ), deg(g)} and deg(f g) = deg(f ) + deg(g).
As a consequence, the group of units of F [x; σ] is given by F ∗ = F \ {0}. A nonzero
polynomial is monic if its leading coefficient is 1. Again, this does not depend on the
sidedness of the coefficients because σ(1) = 1. We say that g is a right divisor of f and
write g |r f if f = hg for some h ∈ F [x; σ]. A polynomial f ∈ F [x; σ] \ F is irreducible if
all its right (hence left) divisors are units or polynomials of the same degree as f . Clearly,
polynomials of degree 1 are irreducible.
Example 8.2.7 Let F = F_4 = {0, 1, ω, ω^2}, where ω^2 = ω + 1, and let σ be the 2-Frobenius.
Then σ^{−1} = σ. In F_4[x; σ] we have:

(1) x^2 + 1 = (x + 1)(x + 1) = (x + ω^2)(x + ω) = (x + ω)(x + ω^2). Thus, a skew polynomial
may have more linear factors than its degree suggests.

(2) (x^2 + ωx + ω)(x + ω) = x^3 + ω^2 x + ω^2, and thus x + ω is a right divisor of x^3 + ω^2 x + ω^2.
It is not a left divisor. This is easily seen by computing

$$(x + \omega)(f_2 x^2 + f_1 x + f_0) = f_2^2 x^3 + (\omega f_2 + f_1^2) x^2 + (\omega f_1 + f_0^2) x + \omega f_0$$

and comparing coefficients with those of x^3 + ω^2 x + ω^2.

(3) The polynomial x^{14} + 1 ∈ F_4[x; σ] has 599 nontrivial monic right divisors. On the other
hand, in the commutative polynomial ring F_4[x] the same polynomial has only 25
nontrivial monic right divisors.
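The products in Example 8.2.7 can be checked mechanically; a minimal sketch, with our own integer encoding of F4 (the tables and names are not from the text):

```python
# F4 = {0, 1, w, w^2} encoded as integers 0, 1, 2, 3 (bit pairs b1*w + b0, with
# w^2 = w + 1); addition is XOR, multiplication and the 2-Frobenius are tables.
MUL  = [[0, 0, 0, 0], [0, 1, 2, 3], [0, 2, 3, 1], [0, 3, 1, 2]]
FROB = [0, 1, 3, 2]                      # sigma(a) = a^2

def frob(a, i):
    """sigma^i(a); sigma has order 2."""
    return a if i % 2 == 0 else FROB[a]

def skew_mul(f, g):
    """Product in F4[x; sigma]: f, g are coefficient lists (f[i] = coefficient
    of x^i), multiplied with the rule x*a = sigma(a)*x."""
    h = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            h[i + j] ^= MUL[a][frob(b, i)]
    return h

w, w2 = 2, 3
print(skew_mul([1, 1], [1, 1]))      # -> [1, 0, 1]    : (x+1)(x+1) = x^2 + 1
print(skew_mul([w2, 1], [w, 1]))     # -> [1, 0, 1]    : (x+w^2)(x+w) = x^2 + 1
print(skew_mul([w, w, 1], [w, 1]))   # -> [3, 3, 0, 1] : x^3 + w^2 x + w^2
```

The first two lines reproduce two of the distinct factorizations of x² + 1 from part (1); the third reproduces part (2).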
Just as for commutative polynomials, one can carry out division with remainder in
F[x; σ] if one takes sidedness of the coefficients into account. This is spelled out in (a)
below. The proof is entirely analogous to the commutative case. Indeed, if deg(f) = m ≥
deg(g) = ℓ and the leading coefficients of f and g are f_m and g_ℓ, respectively, then the
polynomial $f - f_m\, \sigma^{m-\ell}(g_\ell^{-1})\, x^{m-\ell} g$ has degree less than m. This allows one to proceed
until a remainder of degree less than ℓ is obtained. The rest of the theorem formulates the
familiar consequences of division with remainder.
Theorem 8.2.8 ([1466, pp. 483–486]) F [x; σ] is a left Euclidean domain and a right
Euclidean domain. More precisely, we have the following.
(a) Right division with remainder: For all f, g ∈ F[x; σ] with g ≠ 0 there exist unique
polynomials s, r ∈ F[x; σ] such that f = sg + r and deg(r) < deg(g). If r = 0, then g
is a right divisor of f.
(b) For any two polynomials f1 , f2 ∈ F [x; σ], not both zero, there exists a unique monic
polynomial d ∈ F [x; σ] such that d |r f1 , d |r f2 and such that whenever h ∈ F [x; σ]
satisfies h |r f1 and h |r f2 then h |r d. The polynomial d is called the greatest com-
mon right divisor of f1 and f2 , denoted by gcrd(f1 , f2 ). It satisfies a right Bezout
identity, that is,
d = uf1 + vf2 for some u, v ∈ F [x; σ].
We may choose u, v such that deg(u) < deg(f2 ) and, consequently, deg(v) < deg(f1 ).
This is a consequence of the Euclidean algorithm; see also [802, Sec. 2]. If d = 1, we
call f1 , f2 relatively right-prime.
(c) For any two nonzero polynomials f1 , f2 ∈ F [x; σ], there exists a unique monic poly-
nomial ` ∈ F [x; σ] such that fi |r `, i = 1, 2, and such that whenever h ∈ F [x; σ]
satisfies fi |r h, i = 1, 2, then ` |r h. The polynomial ` is called the least common
left multiple of f1 and f2 , denoted by lclm(f1 , f2 ). Moreover, ` = uf1 = vf2 for
some u, v ∈ F [x; σ] with deg(u) ≤ deg(f2 ) and deg(v) ≤ deg(f1 ).
(d) For all nonzero f_1, f_2 ∈ F[x; σ],

$$\deg\big(\mathrm{lclm}(f_1, f_2)\big) + \deg\big(\mathrm{gcrd}(f_1, f_2)\big) = \deg(f_1) + \deg(f_2).$$
We refer to [356, 802] for various algorithms for fast computations in F[x; σ] for a finite
field F, in particular for factoring skew polynomials into irreducibles.
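Right division with remainder (Theorem 8.2.8(a)) follows the leading-term cancellation step quoted above. A sketch over F4 (our own encoding, restated here to be self-contained: elements 0, 1, ω, ω² as integers 0–3, addition = XOR):

```python
MUL  = [[0, 0, 0, 0], [0, 1, 2, 3], [0, 2, 3, 1], [0, 3, 1, 2]]
FROB = [0, 1, 3, 2]          # sigma(a) = a^2
INV  = [None, 1, 3, 2]       # multiplicative inverses in F4

def frob(a, i):
    return a if i % 2 == 0 else FROB[a]

def deg(p):
    return max((i for i, c in enumerate(p) if c), default=-1)

def rdivmod(f, g):
    """Return (s, r) with f = s*g + r and deg(r) < deg(g) in F4[x; sigma],
    iterating the step f <- f - f_m sigma^(m-l)(g_l^(-1)) x^(m-l) g from the
    text; g must be nonzero."""
    f = f[:]
    dl = deg(g)
    s = [0] * max(len(f) - dl, 1)
    while deg(f) >= dl:
        dm = deg(f)
        c = MUL[f[dm]][frob(INV[g[dl]], dm - dl)]
        s[dm - dl] ^= c
        for j, b in enumerate(g):            # subtract (c x^(dm-dl)) * g
            f[dm - dl + j] ^= MUL[c][frob(b, dm - dl)]
    return s, f

# Example 8.2.7(2): divide x^3 + w^2 x + w^2 on the right by x + w.
print(rdivmod([3, 3, 0, 1], [2, 1]))  # -> ([2, 2, 1], [0, 0, 0, 0]):
                                      #    quotient x^2 + wx + w, remainder 0
```

The zero remainder confirms that x + ω is a right divisor, as asserted in Example 8.2.7(2).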
Exactly as in the commutative case, the above leads to the following consequence.
Theorem 8.2.9 Let I ⊆ F [x; σ] be a left ideal (that is, (I, +) is a subgroup of (F [x; σ], +)
and I is closed with respect to left multiplication by elements from F [x; σ]). Then I is
principal; i.e., there exists f ∈ I such that I = F[x; σ]f = {gf | g ∈ F[x; σ]}. For brevity
we will use the notation •(f) for F[x; σ]f and call it the left ideal generated by f. An
analogous statement is true for right ideals. Thus, F [x; σ] is a left principal ideal ring
and a right principal ideal ring.
Recall the center of F [x; σ] from Remark 8.2.3. It is clear that for any polynomial f
in the center, the ideal • (f ) is two-sided, i.e., a left ideal and a right ideal. Polynomials
generating two-sided ideals are closely related to central elements.
Remark 8.2.10 Let σ have order m. An element f ∈ F[x; σ] is called two-sided if the ideal
•(f) is two-sided, i.e., •(f) = (f)•. It is not hard to see [1029, Thm. 1.1.22] that the two-sided
elements of F[x; σ] are exactly the polynomials of the form cx^t g with c ∈ F, t ∈ N_0, and g ∈ Z,
where Z = K[x^m] is the center of F[x; σ]. As a special case, for any a ∈ F*, the polynomial
x^n − a is two-sided if and only if it is central if and only if m | n and σ(a) = a.
With respect to many properties addressed thus far, the skew-polynomial ring F[x; σ]
behaves similarly to the commutative polynomial ring F[x]. However, a main difference is
that in F[x; σ], where σ ≠ id, polynomials do not factor uniquely (up to order) into irre-
ducible polynomials. We have seen this already in Example 8.2.7(1). Of course, irreducible
factorizations of non-units still exist, which is a consequence of the boundedness of the
degree by zero. In order to formulate a uniqueness result, we need the notion of similarity
defined in [1466]. The equivalence of (a) and (b) below is straightforward. The equivalence
to (c) can be found in [1029, Prop. 1.2.8].
If (a), hence (b) and (c), holds true, the polynomials f, g are called (left) similar.
In the commutative ring F [x], two polynomials are thus similar if and only if they differ
by a constant factor. In general, similar polynomials have the same degree (see Theorem 8.2.8(d)). Part (c) shows that similarity is indeed an equivalence relation on F[x; σ]
and does not depend on the sidedness; i.e., f, g are right similar if and only if they are left
similar (see also [1466, Thm. 13, p. 489] without resorting to (c)). Part (c) above is the
similarity notion for left ideals as introduced and discussed by Cohn [427, Sec. 3.2]. It leads
to a simple criterion for similarity, which we will present next.
For a monic polynomial $f = \sum_{i=0}^{n-1} f_i x^i + x^n \in F[x; \sigma]$ define the ordinary companion
matrix

$$C_f = \begin{pmatrix} & 1 & & \\ & & \ddots & \\ & & & 1 \\ -f_0 & -f_1 & \cdots & -f_{n-1} \end{pmatrix} \in F^{n \times n}. \tag{8.2}$$
Consider the map L_x given by left multiplication by x in the left F[x; σ]-module M_f :=
F[x; σ]/•(f). This map is σ-semilinear, i.e., L_x(at) = σ(a)L_x(t) for all a ∈ F and
t ∈ M_f. Furthermore, the rows of C_f are the coefficient vectors of L_x(x^i) for i =
0, . . . , n − 1. All of this shows that $L_x(\sum_{i=0}^{n-1} a_i x^i) = \sum_{i=0}^{n-1} b_i x^i$, where $(b_0, \ldots, b_{n-1}) =
(\sigma(a_0), \ldots, \sigma(a_{n-1}))\, C_f$. In this sense, C_f is the matrix representation of the semilinear
map L_x with respect to the basis {1, x, . . . , x^{n−1}}.
If g is another monic polynomial of degree n, then both Mf and Mg are n-dimensional
over F and thus isomorphic F -vector spaces. They are isomorphic as left F [x; σ]-modules
if we can find a left F [x; σ]-linear isomorphism. Along with Definition 8.2.11(c) this easily
leads to the following criterion for similarity (see also [1196, Thm. 4.9]).
In [356, Prop. 2.1.17] a different criterion is presented for finite fields F in terms of the
reduced norm.
Now we are ready to formulate the uniqueness result for irreducible factorizations.
The reader should be aware that the converse of the above statement is not true: it is
not hard to find (monic) polynomials fi , gi , i = 1, 2, such that fi and gi are similar for
i = 1, 2, but f1 f2 and g1 g2 are not similar (and thus certainly not equal).
We close this section with the following example illustrating yet some further challenging
features in factorizations of skew polynomials.
Example 8.2.14 Consider F4[x;σ] as in Example 8.2.7. In (1) we had seen that x^2 + 1 = (x + ω)(x + ω^2). Right-multiplying the first factor by ω^2 and left-multiplying the second one by ω does not change the product, and thus
\[
x^2 + 1 = (x + ω)(x + ω^2) = (ωx + 1)(ωx + 1).
\]
Hence we have two factorizations of x^2 + 1 into linear polynomials. While in the first factorization the two factors are relatively right-prime, the two factors of the second factorization
Introduction to Skew-Polynomial Rings and Skew-Cyclic Codes 157
are identical. In the first factorization the linear factors are monic, whereas in the second
one they are normalized such that their constant coefficients are 1. The reader is invited to
check that Theorem 8.2.13 is indeed true: choose π = id and h_1 = ω, h_2 = ωx. Examples like this make it impossible to define multiplicities of roots for skew polynomials in a meaningful way. We will comment on this at the end of Section 8.5.
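The two factorizations can be checked mechanically. The following sketch (our own; F_4 is encoded as the integers 0–3 with ω = 2, an assumption of the sketch) multiplies out both factorizations using the commutation rule xa = σ(a)x:

```python
# F4 encoded as 0, 1, 2, 3 (w = 2, w^2 = 3 = w + 1); addition is XOR.
EXP, LOG = [1, 2, 3], {1: 0, 2: 1, 3: 2}

def mul(a, b):
    return 0 if 0 in (a, b) else EXP[(LOG[a] + LOG[b]) % 3]

def frob(a):
    return mul(a, a)             # sigma = 2-Frobenius

def skewmul(f, g):
    """Multiply skew polynomials given as coefficient lists (f[i] is the
    x^i coefficient), using the rule x * a = sigma(a) * x."""
    res = [0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            c = gj
            for _ in range(i):   # moving gj past x^i turns it into sigma^i(gj)
                c = frob(c)
            res[i + j] ^= mul(fi, c)
    return res

w, w2 = 2, 3
lhs = skewmul([w, 1], [w2, 1])   # (x + w)(x + w^2)
rhs = skewmul([1, w], [1, w])    # (w x + 1)^2
assert lhs == rhs == [1, 0, 1]   # both equal x^2 + 1
print("x^2 + 1 = (x + w)(x + w^2) = (wx + 1)^2 in F4[x;sigma]")
```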
We refer to [1466] for further results on decompositions of skew polynomials, most notably completely reducible polynomials, i.e., polynomials that arise as the least common left multiple of irreducible polynomials and thus generalize square-free commutative polynomials.
Polynomials of this type are called q-linearized because for any f ∈ L the associated map
F −→ F, a 7−→ f (a) is Fq -linear. Linearized polynomials have been well studied in the
literature and a nice overview of the basic properties can be found in [1254, Ch. 3.4]. In
particular, (L, +, ◦) is a (noncommutative) ring, where + is the usual addition and ◦ the
composition of polynomials. The rings F[x; σ] and L are isomorphic in the obvious way.
Indeed, the map
\[
Λ : F[x;σ] \longrightarrow L, \qquad \sum_{i=0}^{N} g_i x^i \longmapsto \sum_{i=0}^{N} g_i y^{q^i} \tag{8.3}
\]
is a ring isomorphism between F[x;σ] and (L, +, ◦). The only interesting part of the proof is the multiplicativity of Λ. But that follows from Λ(ax^i bx^j) = Λ(aσ^i(b)x^{i+j}) = Λ(ab^{q^i}x^{i+j}) = ab^{q^i}y^{q^{i+j}} = ay^{q^i} ◦ by^{q^j} for all a, b ∈ F and i, j ∈ N_0. As a consequence, L inherits all
the properties of F[x; σ] presented in the previous section. We can go even further. The
polynomial y^{q^m} − y ∈ L induces the zero map on F = F_{q^m}. Its pre-image under Λ is x^m − 1, which is in the center of F[x;σ]; see Example 8.2.5. As a consequence, the left ideal generated by y^{q^m} − y is two-sided and gives rise to the quotient ring L/(y^{q^m} − y). Since the latter ring has cardinality q^{m^2}, this tells us that the quotient is isomorphic to the space of all F_q-linear maps on F_{q^m}. Thus,
\[
F[x;σ]/(x^m − 1) \cong L/(y^{q^m} − y) \cong F_q^{m \times m}.
\]
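The isomorphism Λ can be illustrated numerically. In the sketch below (our own encoding of F_4 = F_{2^2}, so q = 2 and m = 2; the sample polynomials are arbitrary choices) we verify that Λ turns the skew product into composition of the associated 2-linearized maps:

```python
# F4 as ints 0..3 (w = 2), addition XOR; sigma(a) = a^2 (q = 2, m = 2).
EXP, LOG = [1, 2, 3], {1: 0, 2: 1, 3: 2}
def mul(a, b): return 0 if 0 in (a, b) else EXP[(LOG[a] + LOG[b]) % 3]
def frob(a): return mul(a, a)

def skewmul(f, g):
    # coefficient lists, rule x * a = sigma(a) * x
    res = [0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            c = gj
            for _ in range(i):
                c = frob(c)
            res[i + j] ^= mul(fi, c)
    return res

def lam_eval(f, a):
    # Lambda(f)(a) = sum_i f_i a^(2^i): the 2-linearized polynomial of f
    out, p = 0, a
    for fi in f:
        out ^= mul(fi, p)
        p = frob(p)              # a^(2^i) -> a^(2^(i+1))
    return out

f, g = [2, 1], [3, 0, 1]         # w + x and w^2 + x^2, say
for a in range(4):
    # Lambda(fg) = Lambda(f) o Lambda(g) as maps on F4
    assert lam_eval(skewmul(f, g), a) == lam_eval(f, lam_eval(g, a))
print("Lambda turns skew multiplication into composition")
```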
Clearly, the second map is given by g + (y^{q^m} − y) \longmapsto [g]_B^B, where [g]_B^B denotes the matrix representation of the map g with respect to a chosen basis B of F_{q^m} over F_q. Fix the basis B = (b_0, ..., b_{m−1}) and define its Moore matrix S = (b_j^{q^i})_{i,j=0}^{m−1} (called the Wronskian in [1194, (4.11)]). The linear independence of b_0, ..., b_{m−1} implies that S is in GL_m(F_{q^m}) (see
[1254, Cor. 2.38]). Furthermore, in [1909, Lem. 4.1] it is shown that for any g = \sum_{i=0}^{m−1} g_i y^{q^i} we have S^{−1}[g]_B^B S = D_g, where
\[
D_g = \begin{pmatrix}
g_0 & g_1 & \cdots & g_{m-2} & g_{m-1} \\
g_{m-1}^q & g_0^q & \cdots & g_{m-3}^q & g_{m-2}^q \\
\vdots & \vdots & & \vdots & \vdots \\
g_1^{q^{m-1}} & g_2^{q^{m-1}} & \cdots & g_{m-1}^{q^{m-1}} & g_0^{q^{m-1}}
\end{pmatrix} \tag{8.4}
\]
is the Dickson matrix of g. This matrix is also known as q-circulant and generalizes the
notion of a classical circulant matrix; see (8.8). The isomorphisms above allow us to define
the Dickson matrix of g ∈ F[x; σ] as Dg := DΛ(g) so that we obtain the ring embedding
Example 8.2.7(1) shows that a polynomial of degree N may have more than N roots.
If the field F is infinite, it may even have infinitely many roots. For instance, in C[x;σ], where σ is complex conjugation, the polynomial f = x^2 − 1 splits as (x + \bar{a})(x − a) for any a on the unit circle, and thus has exactly the complex numbers on the unit circle as roots.
The next example shows yet another surprising phenomenon, namely even if f splits into
linear factors, it may only have one root.
The evaluation f(a) can be computed explicitly without resorting to division with remainder.
Definition 8.4.3 For any i ∈ N_0 define N_i : F −→ F by N_0(a) = 1 and N_i(a) = \prod_{j=0}^{i−1} σ^j(a) for i > 0. We call N_i the ith norm on F.
Thus, N1 (a) = a and Ni+1 (a) = Ni (a)σ i (a) for all a ∈ F . For F = Fqm and σ the
q-Frobenius, Nm is simply the field norm of F over Fq . Note that in the commutative case,
i.e., σ = id, we simply have Ni (a) = ai , and thus the following result generalizes evaluation
of commutative polynomials.
Proposition 8.4.4 ([1194, Lem. 2.4] or [1193, Eq. (11) and Thm. 3]) Let f = \sum_{i=0}^{N} f_i x^i ∈ F[x;σ] and a ∈ F. Then
\[
f(a) = \sum_{i=0}^{N} f_i N_i(a).
\]
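As a quick illustration of evaluation via norms (a sketch under our own encoding of F_4 as the integers 0–3 with ω = 2), the polynomial f = x^2 + 1 from Example 8.2.7(1) evaluates to f(a) = N_2(a) + 1 = aσ(a) + 1, and indeed every element of F_4^* is a root, so this degree-2 polynomial has three roots:

```python
# F4 as ints 0..3 (w = 2); sigma = squaring. N_0(a) = 1, N_i(a) = a sigma(a) ... sigma^(i-1)(a).
EXP, LOG = [1, 2, 3], {1: 0, 2: 1, 3: 2}
def mul(a, b): return 0 if 0 in (a, b) else EXP[(LOG[a] + LOG[b]) % 3]
def frob(a): return mul(a, a)

def norm(i, a):
    # N_i(a) = prod_{j=0}^{i-1} sigma^j(a)
    n = 1
    for j in range(i):
        c = a
        for _ in range(j):
            c = frob(c)
        n = mul(n, c)
    return n

def ev(f, a):
    # f(a) = sum_i f_i N_i(a), as in Proposition 8.4.4
    out = 0
    for i, fi in enumerate(f):
        out ^= mul(fi, norm(i, a))
    return out

f = [1, 0, 1]                    # x^2 + 1
roots = [a for a in range(1, 4) if ev(f, a) == 0]
assert roots == [1, 2, 3]        # all of F4^*: three roots of a degree-2 polynomial
print("x^2 + 1 has all of F4^* as roots")
```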
Now that we have a notion of roots for skew polynomials f ∈ F [x; σ] we may wonder
about the relation to the roots of the associated linearized polynomial Λ(f ) introduced in
Section 8.3. The latter are simply commutative polynomials and thus the ordinary notion
of roots applies.
Remark 8.4.5 Consider F [x; σ] = F[x; σ] as in Example 8.2.5, and let g ∈ F[x; σ] and
Λ(g) ∈ F[y] be as in (8.3). Note that Λ(g)(0) is always 0.
(a) For any q we have the following relation between the roots of g and Λ(g): for any b ∈ F^*,
\[
g(b^{q−1}) = 0 \iff Λ(g)(b) = 0;
\]
see also [828, Lem. A.3]. In order to see this, note that Λ(g)(b) = 0 implies Λ(g)(αb) = 0 for all α ∈ F_q, which means that the linearized polynomial y^q − b^{q−1}y is a divisor of Λ(g) in (F[y], +, ·). But then it is also a divisor of Λ(g) in the ring (L, +, ◦) (see [1254, Thm. 3.62]), i.e., Λ(g) = G ◦ (y^q − b^{q−1}y) for some G ∈ L. Applying the ring homomorphism Λ^{−1} shows that x − b^{q−1} is a right divisor of g.
(b) For q = 2, part (a) shows that the set of nonzero roots of Λ(g) coincides with the set
of nonzero roots of g.
(c) If q ≠ 2, the roots of g do not agree with the roots of Λ(g). To see this, take for instance g = x − a for some nonzero a ∈ F_{q^m}. Then a is a left and right root of g. The associated linearized polynomial is Λ(g) = y^q − ay = y(y^{q−1} − a), and it may or may not have nonzero roots. For instance, if F_{q^m} = F_{3^2}, the equation y^2 = a has two
distinct roots for four values of a (the nonzero squares) and no roots for the other
four values of a. The discrepancy between roots of skew polynomials and roots of
linearized polynomials is, of course, also related to the fact that multiplication in L
is composition whereas a root c corresponds to a factor y − c in the ordinary sense
(which is not even a linearized polynomial).
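Part (a) can be verified exhaustively for linear g over F_9 (a sketch; the model F_9 = F_3[t]/(t^2 + 1), with elements stored as pairs (a, b) representing a + bt, is our own choice):

```python
# F9 = F3[t]/(t^2 + 1); q = 3, sigma(x) = x^3. For g = x - a, Remark 8.4.5(a) says
# g(b^(q-1)) = 0  <=>  Lambda(g)(b) = b^q - ab = 0, for every b in F9^*.
ELTS = [(i, j) for i in range(3) for j in range(3)]

def add(x, y): return ((x[0] + y[0]) % 3, (x[1] + y[1]) % 3)
def mul(x, y): return ((x[0]*y[0] - x[1]*y[1]) % 3, (x[0]*y[1] + x[1]*y[0]) % 3)
def neg(x): return ((-x[0]) % 3, (-x[1]) % 3)
def cube(x): return mul(mul(x, x), x)          # sigma(x) = x^3

units = [x for x in ELTS if x != (0, 0)]
for a in units:
    for b in units:
        # g(b^2) = N_1(b^2) - a = b^2 - a  (N_1 is the identity)
        lhs = add(mul(b, b), neg(a)) == (0, 0)
        # Lambda(g)(b) = b^3 - a b
        rhs = add(cube(b), neg(mul(a, b))) == (0, 0)
        assert lhs == rhs
print("g(b^(q-1)) = 0 iff Lambda(g)(b) = 0 for linear g over F9")
```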
There is an obvious relation between the right roots of a skew polynomial over a finite
field and the roots of a different associated commutative polynomial. Consider again the
skew-polynomial ring F[x;σ] as in Example 8.2.5. Then the ith norm is given by
\[
N_i(a) = a^{q^0 + q^1 + \cdots + q^{i-1}} = a^{[[i]]}, \quad \text{where } [[i]] := \frac{q^i - 1}{q - 1} \text{ for } i \geq 0. \tag{8.5}
\]
Proposition 8.4.4 allows us to translate the evaluation of skew polynomials into evaluation
of commutative polynomials (see also [389, p. 278]).
Properties of the polynomial Pf can be found in [1223, Sec. 2]. Unfortunately, the map
f 7−→ Pf does not behave well under multiplication and therefore the above result is of
limited use.
Let us return to the general case with an arbitrary field F. Having defined evaluation of polynomials, one may study properties of the evaluation map F[x;σ] −→ F, f \longmapsto f(a), for a fixed a ∈ F.
Clearly, this map is additive and left F -linear, but unlike in the commutative case, it is not
multiplicative. As an extreme case of this non-multiplicativity note that f = x − a satisfies
f (a) = 0, whereas (f b)(a) 6= 0 for any b ∈ F not fixed by σ. Yet, evaluation is close to being
multiplicative. Before we can make this precise we need the following definition.
Definition 8.4.7 Let a ∈ F and c ∈ F^*. The (σ-)conjugate of a by c is a^c := σ(c)ac^{−1}, and the σ-conjugacy class of a is
\[
∆(a) = \{a^c \mid c ∈ F^*\}.
\]
Since the automorphism is fixed throughout this text, we will drop the prefix σ. The reader should be aware of the ambiguity of the notation. For instance, for c = −1 ∈ F, the notation a^c does not represent the inverse of a; in fact, in this case a^c = a. More generally, a^{−c} = a^c for any c ∈ F^*.
It is easy to see that conjugacy defines an equivalence relation. Moreover, ∆(0) = {0},
and if σ = id, then ∆(a) = {a} for all a ∈ F. Furthermore, if a ≠ 0, then a^c = a if and only if c is in the fixed field of σ. The relevance of conjugacy for us stems from the equivalence
\[
b = a^c \iff (x − b)c = σ(c)(x − a),
\]
which tells us that, up to conjugacy, linear factors can be reordered. Furthermore, as a special case of Proposition 8.2.12 we have
\[
x − a \text{ and } x − b \text{ are similar in the sense of Definition 8.2.11} \iff a \text{ and } b \text{ are conjugate.}
\]
Example 8.4.8
(a) For any finite field F = F_{q^m} with q-Frobenius σ, the identity a^c = c^{q−1}a implies that the nonzero conjugacy classes are given by the cosets of ∆(1) = {c^{q−1} | c ∈ F^*} in F^*. Thus, for q = 2 the conjugacy classes are {0} and ∆(1) = F^*, whereas for q = 3 there are two nonzero conjugacy classes, one of which consists of the squares of F^* and the other of the non-squares.
(b) For C with complex conjugation, the nonzero conjugacy classes are exactly the circles
about the origin.
Remark 8.4.9 Obviously, the conjugacy classes are the orbits of the group action F ∗ ×
F −→ F where (c, a) 7−→ ac , and the stabilizer of any nonzero a is the multiplicative group
of the fixed field of σ. Therefore, in the case where F = Fqm and σ is the q-Frobenius,
the nonzero conjugacy classes have size (q m − 1)/(q − 1). This also shows that there are q
conjugacy classes (including {0}).
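A short computation illustrates the remark for F_9 with q = 3 (the model F_9 = F_3[t]/(t^2 + 1), elements as pairs (a, b) for a + bt, is our own choice):

```python
# F9 = F3[t]/(t^2 + 1); q = 3, sigma(x) = x^3, so a^c = sigma(c) a c^{-1} = c^{q-1} a = c^2 a.
ELTS = [(a, b) for a in range(3) for b in range(3)]

def mul(x, y):
    return ((x[0]*y[0] - x[1]*y[1]) % 3, (x[0]*y[1] + x[1]*y[0]) % 3)

def conj(a, c):
    # conjugate a^c = c^2 a for the 3-Frobenius
    return mul(mul(c, c), a)

units = [x for x in ELTS if x != (0, 0)]
classes = {frozenset(conj(a, c) for c in units) for a in units}

# q = 3: two nonzero classes of size (9 - 1)/(3 - 1) = 4; with {0} that is q classes in total
assert all(len(cl) == 4 for cl in classes)
assert len(classes) == 2
print("F9 has 2 nonzero conjugacy classes of size 4 each")
```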
Now we can formulate the product theorem. It appeared first in [1193, Thm. 2] and was
later extended to more general skew-polynomial rings in [1194, Thm. 2.7]. The proof follows
by direct computations using Proposition 8.4.4 and properties of the norms Ni .
We have seen already that the number of roots of a skew polynomial may vastly exceed
its degree. Taking conjugacy into account, however, provides us with the following gener-
alization of the commutative case. It follows quickly from the previous result by inducting
on the degree.
Theorem 8.4.11 ([1193, Thm. 4]) Let f ∈ F [x; σ] have degree N . Then the roots of f
lie in at most N distinct conjugacy classes. Furthermore, if f = (x − a1 ) · · · (x − aN ) for
some ai ∈ F and f (a) = 0, then a is conjugate to some ai .
Note that for F = Fqm with q = 2, the theorem does not provide any insight because
in this case there is only one conjugacy class. This also shows that the converse of The-
orem 8.4.11 is not true: not every conjugate of some ai is a root of f (take N = 1 for
instance).
Even more, the above theorem does not state that every conjugacy class ∆(ai ) contains
a root of f . This is indeed in general not the case, and an example can be found in the
skew-polynomial ring Q(t)[x; σ], where σ is the Q-algebra automorphism given by t 7→ t + 1.
However, for finite fields the last statement is in fact true.
Theorem 8.4.12 Consider the ring F[x; σ] as in Example 8.2.5, and let f = (x−a1 ) · · · (x−
aN ) for some ai ∈ F∗ . Then each conjugacy class ∆(ai ) contains a root of f .
Proof: It suffices to show that ∆(a_1) contains a root of f, which means that f is of the form f = (x − b_1) ··· (x − b_{N−1})(x − a_1^c) for some b_1, ..., b_{N−1}, c ∈ F^*. Since (a_1^{c_1})^{c_2} = a_1^{c_1c_2}, the case N = 2 is sufficient. Thus, let f = (x − b)(x − a) for some a, b ∈ F^*. If a ∈ ∆(b) there is nothing to prove. Thus let a ∉ ∆(b). Let now c ∈ F^*. Theorem 8.4.10 implies f(b^c) = ((b^c)^{b^c − a} − b)(b^c − a), and thus f(b^c) = 0 if and only if (b^c)^{b^c − a} = b. It is easy to see that the latter is equivalent to σ(σ(c)b − ac) = σ(c)b − ac, which in turn is equivalent to σ(c)b − ac ∈ F_q. Thus we have to establish the existence of some c ∈ F^* such that σ(c)b − ac ∈ F_q. Consider the F_q-linear map Ψ_{a,b} : F −→ F, c \longmapsto σ(c)b − ac. If we can show that Ψ_{a,b} is injective, then it is bijective and we are done. Suppose there exists d ∈ ker Ψ_{a,b} \ {0}. Then σ(d)b − ad = d^q b − ad = 0; hence d^{q−1} = a/b. But then 1 = d^{q^m−1} = (d^{q−1})^{[[m]]} = (a/b)^{[[m]]}. This shows that Ψ_{a,b} is injective if (a/b)^{[[m]]} ≠ 1. On the other hand, if (a/b)^{[[m]]} = 1, then the order of a/b in F^*, say t, is a divisor of [[m]]. Furthermore, a/b = ω^{k(q^m−1)/t} for some k and a primitive element ω. Writing [[m]] = ts, we conclude a/b = (ω^{ks})^{q−1}. But this means that a and b are conjugate (see Example 8.4.8(a)), a contradiction.
Similar reasoning shows that the previous result is also true in the skew-polynomial ring
C[x; σ] with complex conjugation σ.
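Theorem 8.4.12 can be spot-checked computationally. The sketch below (our own model F_9 = F_3[t]/(t^2 + 1); the elements a = 1 and b = 1 + t are our own choice of non-conjugate elements) builds f = (x − b)(x − a), finds all roots via Proposition 8.4.4, and confirms that both conjugacy classes contain a root:

```python
# F9 = F3[t]/(t^2 + 1) as pairs (a, b); q = 3, sigma(x) = x^3.
ELTS = [(i, j) for i in range(3) for j in range(3)]
def add(x, y): return ((x[0] + y[0]) % 3, (x[1] + y[1]) % 3)
def mul(x, y): return ((x[0]*y[0] - x[1]*y[1]) % 3, (x[0]*y[1] + x[1]*y[0]) % 3)
def neg(x): return ((-x[0]) % 3, (-x[1]) % 3)
def frob(x): return mul(mul(x, x), x)          # sigma(x) = x^3

units = [x for x in ELTS if x != (0, 0)]
def cls(a):
    # Delta(a) = {c^2 a | c in F9^*}
    return {mul(mul(c, c), a) for c in units}

a, b = (1, 0), (1, 1)            # 1 and 1 + t lie in different conjugacy classes
assert cls(a) != cls(b)

# f = (x - b)(x - a) = x^2 - (sigma(a) + b) x + ba
f1 = neg(add(frob(a), b))
f0 = mul(b, a)

def ev(c):
    # f(c) = N_2(c) + f1 N_1(c) + f0, with N_2(c) = c sigma(c)
    return add(add(mul(c, frob(c)), mul(f1, c)), f0)

roots = [c for c in units if ev(c) == (0, 0)]
# Theorem 8.4.12: both Delta(a) and Delta(b) contain a root of f
assert any(r in cls(a) for r in roots) and any(r in cls(b) for r in roots)
print("each conjugacy class of the linear factors contains a root")
```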
We now turn to algebraic sets. Obviously, in the commutative case, i.e., σ = id, algebraic
sets are exactly the finite sets. This is not the case for skew polynomials as we have seen
right after Definition 8.4.1 for C[x; σ] with complex conjugation σ. From (8.6) we deduce
that every set A = {a, b} of cardinality 2 has rank 2, whereas Example 8.2.7(1) provides us
with a set of cardinality 3 and rank 2. In general, a finite set A = {a1 , . . . , an } has minimal
polynomial mA = lclm(x−a1 , . . . , x−an ). Using induction on the cardinality and the degree
formula in Proposition 8.2.8(d), one obtains immediately:
Proposition 8.5.3 (see also [1193, Prop. 6]) Let A = {a1 , . . . , an } ⊆ F . Then A is
algebraic and rk(A) =: r ≤ |A|. Furthermore, there exist distinct b1 , . . . , br ∈ A such that
mA = lclm(x − b1 , . . . , x − br ).
Example 8.5.4 ([1223, Rem. 2.4]) Let A = F_{p^r}, where p is prime and r ∈ N. Then m_A = lclm(x − a | a ∈ F_{p^r}) = x^{r(p−1)+1} − x. Thus, rk(F_{p^r}) = r(p − 1) + 1.
Example 8.5.5 ([260, Prop. 4]) Let Fqs be an extension field of F = Fqm and con-
sider Fqs [x; σ] with q-Frobenius σ. Fix an element a ∈ Fqs and set A = {τ (a) | τ ∈
Aut(Fqs | Fqm )}. Then the minimal polynomial mA is in F[x; σ] and is the nonzero monic
polynomial of smallest degree in F[x; σ] with (right) root a. It is called the σ-minimal
polynomial of a over F.
The minimal polynomial of an algebraic set always factors linearly. The following result
comes closest to the commutative case.
Proposition 8.5.6 ([1193, Lem. 5]) Let A ⊆ F be an algebraic set of rank r. Then its
minimal polynomial is of the form mA = (x − a1 ) · · · (x − ar ), where each ai is conjugate to
some a ∈ A.
This result is indeed the best possible in the sense that the roots of the linear factors
need not be in A.
Example 8.5.7
(a) Consider F_{3^3}[x;σ] with 3-Frobenius σ and primitive element β satisfying β^3 + 2β + 1 = 0. Let A = {β^{14}, β^{25}}. Then m_A = x^2 + βx + β = (x − β^{13})(x − β^{14}) = (x − β^2)(x − β^{25}), and thus m_A is not the product of two linear terms with roots in A. With the aid of Example 8.4.8 one concludes that β^{13} is conjugate to β^{25} and β^2 is conjugate to β^{14}. Finally, A = V(m_A); that is, m_A has no further roots in F_{3^3}.
(b) The linear factors x − a_i in Proposition 8.5.6 need not be distinct. Consider for instance F_{2^4}[x;σ] with 2-Frobenius σ and primitive element γ satisfying γ^4 + γ + 1 = 0. The polynomial
Note that the matrix on the right hand side is the Moore matrix of a0 , . . . , am−1 (see
Section 8.3). Since the latter is invertible thanks to the linear independence of a0 , . . . , am−1
([1254, Cor. 2.38]), the same is true for the Vandermonde matrix on the left hand side. Let us
now apply this to a normal basis {γ, γ^q, ..., γ^{q^{m−1}}} of F_{q^m}. Then the Vandermonde matrix above is V_m(b, b^q, ..., b^{q^{m−1}}), where b = γ^{q−1}. It is easy to see that N_i(b^{q^j}) = (γ^{−1})^{q^j} γ^{q^{i+j}}, which in turn yields that b^{q^j} is a right root of x^m − 1 ∈ F[x;σ] for all j = 0, ..., m−1. As a consequence, x^m − 1 = lclm(x − b, x − b^q, ..., x − b^{q^{m−1}}) thanks to Theorem 8.5.8. Note that the obvious right root 1 of x^m − 1 does not appear in the list b, b^q, ..., b^{q^{m−1}}. Theorem 8.5.8 also shows that for any subset {j_1, j_2, ..., j_r} ⊆ {0, 1, ..., m−1} the polynomial lclm(x − b^{q^{j_1}}, x − b^{q^{j_2}}, ..., x − b^{q^{j_r}}) has degree r (see also [827, Lem. 3.1]). Finally, consider a polynomial x^m − a ∈ F[x;σ] for some a ∈ F^* and suppose c ∈ F is a root of x^m − a. Using the multiplicativity of the maps N_i, one easily deduces that x^m − a = lclm(x − cb, x − cb^q, ..., x − cb^{q^{m−1}}).
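The smallest instance of this construction can be checked directly (our own choice: F_4 over F_2 with normal basis {ω, ω^2}, so γ = ω and b = γ^{q−1} = ω, and modulus x^2 − 1):

```python
# F4 as ints 0..3 (w = 2); q = 2, m = 2. {w, w^2} is a normal basis of F4 over F2,
# so b = gamma^(q-1) = w and b^q = w^2 should be right roots of x^2 - 1 in F4[x;sigma].
EXP, LOG = [1, 2, 3], {1: 0, 2: 1, 3: 2}
def mul(a, b): return 0 if 0 in (a, b) else EXP[(LOG[a] + LOG[b]) % 3]
def frob(a): return mul(a, a)

def ev_xm_minus_1(c):
    # (x^2 - 1)(c) = N_2(c) - 1 = c sigma(c) + 1 in characteristic 2
    return mul(c, frob(c)) ^ 1

gamma = 2                        # w
b = gamma                        # gamma^(q-1) = gamma
assert ev_xm_minus_1(b) == 0 and ev_xm_minus_1(frob(b)) == 0
assert ev_xm_minus_1(1) == 0     # the obvious right root 1, absent from the list b, b^q
print("b and b^q are right roots of x^2 - 1")
```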
Part (a) as well as the first equivalence in (b) are clear; see also Proposition 8.2.8(d). The second part of (b) requires more work. We now present some strong properties of W-polynomials. More machinery is needed to derive most of them, especially the Φ-transform and λ-transform introduced by Lam and Leroy in [1195], and we refer to the excellent presentation in [1195, Sec. 4 and 5] for further details.
also Section 1.12 and Chapter 2 of this encyclopedia), and is presented here solely as a guideline for the skew case in the next sections.
Let us fix a finite field F = Fq and a length n. Usually, one requires that q and n be
relatively prime, but that is not needed for the algebraic theory. It only plays a role when it
comes to distance considerations. We consider the quotient ring R = F[x]/(xn − 1), which
as an F-vector space is isomorphic to F^n via the map
\[
p : F^n \longrightarrow R \quad \text{with} \quad (g_0, g_1, ..., g_{n-1}) \longmapsto \overline{\sum_{i=0}^{n-1} g_i x^i},
\]
where \overline{\,\cdot\,} denotes cosets. Set v := p^{−1} (and think of these maps as vectorization and polynomialization).
A cyclic code in Fn is, by definition, a subspace of the form C = v(I), where I is
an ideal in R. This means that C is invariant under the cyclic shift (a0 , . . . , an−1 ) 7−→
(an−1 , a0 , . . . , an−2 ). Usually, the code is identified with its corresponding ideal so that its
codewords are polynomials of degree less than n. It follows from basic algebra that R is a
principal ideal ring and every ideal is of the form (g), where g is a monic divisor of xn − 1.
Such a generator is unique and called the generator polynomial of the cyclic code (g). We
conclude that the number of cyclic codes of length n over F equals the number of divisors
of xn − 1. If gh = xn − 1, then h is the parity check polynomial of C. For all of this see
also Theorem 1.12.11 of Section 1.12 in Chapter 1 of this encyclopedia.
Before summarizing some well-known facts about cyclic codes we introduce circulant
matrices. For any g = (g_0, g_1, ..., g_{n−1}) ∈ F^n define the circulant
\[
Γ(g) := \begin{pmatrix}
g_0 & g_1 & \cdots & g_{n-2} & g_{n-1} \\
g_{n-1} & g_0 & \cdots & g_{n-3} & g_{n-2} \\
\vdots & \vdots & & \vdots & \vdots \\
g_2 & g_3 & \cdots & g_0 & g_1 \\
g_1 & g_2 & \cdots & g_{n-1} & g_0
\end{pmatrix}. \tag{8.8}
\]
Indexing by i = 0, 1, ..., n − 1, we see that the ith row of Γ(g) is given by v(\overline{x^i \sum_{j=0}^{n-1} g_j x^j}).
The set Circ := {Γ(g) | g ∈ Fn } is an n-dimensional subspace of Fn×n . We also define
\[
ρ : F[x] \longrightarrow F[x], \qquad g = \sum_{i=0}^{r} g_i x^i \longmapsto x^r g(x^{-1}) = \sum_{i=0}^{r} g_i x^{r-i} \quad (\text{where } g_r \neq 0). \tag{8.9}
\]
circulants Γ(φ(g)) and Γ(ρ(g)) have the same row space and the same column space. In
fact, the left (respectively right) factor Γ(xr ) simply permutes the rows (respectively
columns) of Γ(φ(g)).
(e) If g is a divisor of x^n − 1, then so is ρ(g). But the representative of φ(g) of degree less than n is in general not a divisor of x^n − 1. Thus, while the involution g(x) \longmapsto g(x^{n−1}) is the appropriate map for transposition of circulants, it does not behave well when it comes to divisors of x^n − 1.
Now we review the basic algebraic properties of cyclic codes in the terminology of cir-
culant matrices; for further details see for instance [1008, Sec. 4.1, 4.2] or Section 1.12 of
this encyclopedia and specifically Theorem 1.12.11.
Remark 8.6.2 Let x^n − 1 = hg, where g = \sum_{i=0}^{r} g_i x^i and h = \sum_{i=0}^{k} h_i x^i are monic of degree r and k, respectively. Let C be the cyclic code C = v((g)) ⊆ F^n.
(a) The ideal (g) has dimension k = n − r as an F-vector space and g, xg, ..., x^{k−1}g is a basis. Thanks to the isomorphism v, this implies that C is the row space of the circulant Γ(g) and actually of its first k rows (see also Remark 8.6.1(b)). Since deg(g) = r, these first rows have the form
\[
G = \begin{pmatrix} v(g) \\ v(xg) \\ \vdots \\ v(x^{k-1}g) \end{pmatrix}
  = \begin{pmatrix}
g_0 & g_1 & \cdots & g_r & & & \\
 & g_0 & g_1 & \cdots & g_r & & \\
 & & \ddots & & & \ddots & \\
 & & & g_0 & g_1 & \cdots & g_r
\end{pmatrix}.
\]
(b) C = {v ∈ F^n | Γ(φ(h))v^T = 0}. By Remark 8.6.1(b), the last n − k rows of Γ(φ(h)) form a basis of the row space of this matrix. Writing h = \sum_{i=0}^{k} h_i x^i, all of this implies C = {v ∈ F^n | Hv^T = 0}, where
\[
H = \begin{pmatrix}
h_k & h_{k-1} & \cdots & h_0 & & & \\
 & h_k & h_{k-1} & \cdots & h_0 & & \\
 & & \ddots & & & \ddots & \\
 & & & h_k & h_{k-1} & \cdots & h_0
\end{pmatrix} \in F^{(n-k) \times n},
\]
which is known as the parity check matrix of C. This is also the submatrix consisting of the first n − k rows of Γ(ρ(h)), where ρ(h) is the reciprocal of h.
(c) ρ(h)ρ(g) = 1 − xn , and the dual code C ⊥ (see Section 1.5 of Chapter 1) is cyclic with
generator and parity check polynomial ρ(h)/h0 and ρ(g)/g0 , respectively.
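The construction of G and H can be illustrated with the classical binary [7,4] cyclic code (our own example; g = 1 + x + x^3 is a standard choice of generator polynomial, with h = (x^7 − 1)/g):

```python
# Binary cyclic code of length 7 with g = 1 + x + x^3 and h = 1 + x + x^2 + x^4,
# so that g*h = x^7 - 1 over F2. Build G and H as in Remark 8.6.2 and check G H^T = 0.
n, g, h = 7, [1, 1, 0, 1], [1, 1, 1, 0, 1]
r, k = len(g) - 1, len(h) - 1            # r = 3, k = 4

# polynomial product over F2 (convolution mod 2)
prod = [0] * (len(g) + len(h) - 1)
for i, gi in enumerate(g):
    for j, hj in enumerate(h):
        prod[i + j] ^= gi & hj
assert prod == [1, 0, 0, 0, 0, 0, 0, 1]  # x^7 + 1 = x^7 - 1 over F2

# k x n generator matrix: shifted copies of g
G = [[0] * i + g + [0] * (n - r - 1 - i) for i in range(k)]
# (n - k) x n parity check matrix: shifted copies of (h_k, ..., h_0)
hrev = h[::-1]
H = [[0] * i + hrev + [0] * (n - k - 1 - i) for i in range(n - k)]

for grow in G:
    for hrow in H:
        assert sum(x & y for x, y in zip(grow, hrow)) % 2 == 0   # G H^T = 0
print("G and H are generator and parity check matrices of the [7,4] code")
```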
\[
p_f : F^n \longrightarrow R_f \quad \text{where} \quad (c_0, ..., c_{n-1}) \longmapsto \overline{\sum_{i=0}^{n-1} c_i x^i}.
\]
It is crucial that the coefficients c_i appear on the left of x, because this turns p_f into an isomorphism of (left) F-vector spaces. This map will relate codes in F^n to submodules in R_f. Again we set v_f := p_f^{−1}. The map v_f coincides with the map φ given in [262, Prop. 3], where
it is defined with the aid of a semi-linear map based on the companion matrix of f ; see (8.2).
The following facts about submodules of Rf are straightforward generalizations of the
commutative case, Theorem 1.12.11(a)–(c) in Chapter 1, and are proven in exactly the same
way (with the aid of right division with remainder in F[x; σ]). We use the notation • (g) for
the left submodule {zg | z ∈ F[x; σ]} of Rf generated by g.
The following definition of skew-cyclic codes was first cast in [256] for the case where f
is a central polynomial of the form f = xn − 1, i.e., σ n = id. In the form below, the
definition appeared in [258]. A different, yet equivalent, definition was cast in [262, Def. 3].
Note the generality of the setting. It includes for instance the case f = xn , for which even
in the commutative case the terminology ‘cyclic’ may be questionable because reduction
modulo f = xn simply means truncating the given polynomial at power xn . Yet, the basic
part of the algebraic theory, presented in this section, applies indeed to this generality, and
only in the next section will we restrict ourselves to skew-constacyclic codes.
\[
Γ_f^σ(g) := \begin{pmatrix} v_f(g) \\ v_f(xg) \\ \vdots \\ v_f(x^{n-2}g) \\ v_f(x^{n-1}g) \end{pmatrix} \in F^{n \times n}.
\]
In the case where f = x^n − a, a ∈ F^*, we write Γ_a^σ instead of Γ_{x^n−a}^σ. We call any matrix of the form Γ_f^σ(g) a skew circulant.
One may regard Γ_f^σ(g) as the matrix representation of the left F-linear map on R_f given by right multiplication by g with respect to the basis {x^0, ..., x^{n−1}}. If f = x^n − a, the skew circulant of g can be given explicitly. For g = \sum_{i=0}^{n-1} g_i x^i we have
\[
Γ_a^σ(g) = \begin{pmatrix}
g_0 & g_1 & \cdots & g_{n-1} \\
σ(g_{n-1})a & σ(g_0) & \cdots & σ(g_{n-2}) \\
\vdots & \vdots & & \vdots \\
σ^{n-1}(g_1)a & σ^{n-1}(g_2)σ(a) & \cdots & σ^{n-1}(g_0)
\end{pmatrix}.
\]
If f = x^n − 1, then Γ_1^σ(g) = D_g, the Dickson matrix in (8.4), and this specializes to the classical circulant Γ(g) in (8.8) if σ = id. Furthermore, for any monic f the skew circulant
Γ_f^σ(x) equals C_f, the companion matrix of f in (8.2). Note also that modulo x^n − a
\[
Γ_a^σ(x) = \begin{pmatrix}
 & 1 & & \\
 & & \ddots & \\
 & & & 1 \\
a & & &
\end{pmatrix}
\quad \text{and} \quad
Γ_a^σ(x^2) = \begin{pmatrix}
 & & 1 & & \\
 & & & \ddots & \\
 & & & & 1 \\
a & & & & \\
 & σ(a) & & &
\end{pmatrix}. \tag{8.11}
\]
Example 8.7.4 Let f = x7 + α ∈ F8 [x; σ], where α3 + α + 1 = 0 and σ is the 2-Frobenius.
Let g = x^4 + αx^3 + α^5x^2 + α. Then g is a right divisor of f and
\[
Γ := Γ_f^σ(g) = \begin{pmatrix}
α & 0 & α^5 & α & 1 & 0 & 0 \\
0 & α^2 & 0 & α^3 & α^2 & 1 & 0 \\
0 & 0 & α^4 & 0 & α^6 & α^4 & 1 \\
α & 0 & 0 & α & 0 & α^5 & α \\
α^3 & α^2 & 0 & 0 & α^2 & 0 & α^3 \\
1 & α^6 & α^4 & 0 & 0 & α^4 & 0 \\
0 & 1 & α^5 & α & 0 & 0 & α
\end{pmatrix}.
\]
The first row is simply the vector of left coefficients of g. The second and third rows of Γ are the cyclic shift of the previous row followed by the map σ applied entrywise. Thus, the first three rows do not depend on f. Only in the last four rows, where deg(x^i g) is at least 7, does reduction modulo •(f) kick in.
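The example can be reproduced by iterating "multiply by x and reduce modulo f". The sketch below (our own encoding of F_8 = F_2[α]/(α^3 + α + 1) with α = 2 as a 3-bit integer) rebuilds Γ row by row:

```python
# F8 = F2[alpha]/(alpha^3 + alpha + 1), elements as 3-bit ints, alpha = 2; addition is XOR.
EXP = [1, 2, 4, 3, 6, 7, 5]             # alpha^0 .. alpha^6
LOG = {v: i for i, v in enumerate(EXP)}
def mul(a, b): return 0 if 0 in (a, b) else EXP[(LOG[a] + LOG[b]) % 7]
def frob(a): return mul(a, a)           # sigma = 2-Frobenius

A = 2                                    # alpha
n = 7                                    # modulus f = x^7 + alpha
g = [A, 0, EXP[5], A, 1, 0, 0]           # alpha + alpha^5 x^2 + alpha x^3 + x^4

def step(row):
    # multiply by x: coefficients move up one slot after applying sigma,
    # and the overflow sigma(c_6) x^7 reduces via x^7 = alpha (char 2)
    return [mul(frob(row[-1]), A)] + [frob(c) for c in row[:-1]]

rows, row = [], g
for _ in range(n):
    rows.append(row)
    row = step(row)

a = [EXP[i] for i in range(7)]           # a[i] = alpha^i, shorthand
expected = [
    [a[1], 0,    a[5], a[1], 1,    0,    0   ],
    [0,    a[2], 0,    a[3], a[2], 1,    0   ],
    [0,    0,    a[4], 0,    a[6], a[4], 1   ],
    [a[1], 0,    0,    a[1], 0,    a[5], a[1]],
    [a[3], a[2], 0,    0,    a[2], 0,    a[3]],
    [1,    a[6], a[4], 0,    0,    a[4], 0   ],
    [0,    1,    a[5], a[1], 0,    0,    a[1]],
]
assert rows == expected
print("skew circulant of Example 8.7.4 reproduced")
```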
Now one obtains the straightforward analog of Remark 8.6.2(a): every (σ, f )-skew-cyclic
code has a generator matrix that reflects the skew-cyclic structure. Consider a general
modulus f of degree n. For any matrix G we use the notation rowsp(G) for the row span
of G.
Proposition 8.7.5 (see also [259], [736, Cor. 2.4]) Let M = •(g) ⊆ R_f, where g = \sum_{i=0}^{r} g_i x^i ∈ F[x;σ] has degree r. Then:
Theorem 8.7.7 (see also [736, Thm. 3.6]) Let f ∈ F[x;σ] be two-sided; thus R_f is a ring. Then
\[
Γ_f^σ(gg') = Γ_f^σ(g)Γ_f^σ(g') \quad \text{for all } g, g' ∈ F[x;σ].
\]
Hence Γ_f^σ is an F_q-algebra isomorphism between R_f and the subring Γ_f^σ(R_f) of F^{n×n} consisting of the (σ, f)-circulants.
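Theorem 8.7.7 can be verified exhaustively in the smallest interesting case (a sketch under our own encoding of F_4 as the integers 0–3 with ω = 2; the modulus f = x^2 − 1 = x^2 + 1 is two-sided since σ^2 = id makes it central):

```python
# F4 as ints 0..3 (w = 2), addition XOR; sigma = squaring has order 2,
# so f = x^2 - 1 = x^2 + 1 is central and hence two-sided.
EXP, LOG = [1, 2, 3], {1: 0, 2: 1, 3: 2}
def mul(a, b): return 0 if 0 in (a, b) else EXP[(LOG[a] + LOG[b]) % 3]
def frob(a): return mul(a, a)

def redmul(g, gp):
    # product of g = g0 + g1 x and gp = h0 + h1 x in R_f, using x^2 = 1
    g0, g1 = g
    h0, h1 = gp
    c0 = mul(g0, h0) ^ mul(g1, frob(h1))   # g1 x h1 x = g1 sigma(h1) x^2
    c1 = mul(g0, h1) ^ mul(g1, frob(h0))
    return (c0, c1)

def circ(g):
    # skew circulant: rows v_f(g) and v_f(xg); x(g0 + g1 x) = sigma(g1) + sigma(g0) x
    g0, g1 = g
    return [[g0, g1], [frob(g1), frob(g0)]]

def matmul(M, N):
    return [[mul(M[i][0], N[0][j]) ^ mul(M[i][1], N[1][j]) for j in range(2)]
            for i in range(2)]

pairs = [(i, j) for i in range(4) for j in range(4)]
for g in pairs:
    for gp in pairs:
        assert circ(redmul(g, gp)) == matmul(circ(g), circ(gp))
print("Gamma is multiplicative for the two-sided modulus x^2 + 1")
```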
This result does not generalize if f is not two-sided. For instance, (8.11) shows that Γ_a^σ(x^2) ≠ Γ_a^σ(x)^2 if σ(a) ≠ a.
The following consequence for (σ, f )-skew-constacyclic codes is immediate. In [262,
Cor. 1] the matrix Γσf (h0 ) appearing below is called the control matrix of the code C. This
is not to be confused with the parity check matrix to which we will turn later.
Corollary 8.7.8 Let f ∈ F[x; σ] be two-sided and f = hg = gh0 for some g, h, h0 ∈ F[x; σ].
Then Γσf (g)Γσf (h0 ) = 0 and the code C = vf (• (g)) = rowsp(Γσf (g)) is the left kernel of the
skew circulant Γσf (h0 ).
It is not hard to see that actually the two-sidedness of f along with f = hg implies the
existence of h0 such that f = gh0 .
Now that we have a natural notion of generator matrix for a skew-cyclic code it remains
to discuss whether such a code also has a parity check matrix that reflects the skew-cyclic
structure. We have shown in Remark 8.6.2(b) that in the commutative case the parity
check matrix hinges on two facts: (i) the product of circulants is again a circulant, (ii) the
transpose of a circulant is a circulant. Theorem 8.7.7 shows that property (i) carries through
to the noncommutative case if the modulus is two-sided (and thus also to the commutative
case for arbitrary moduli f instead of xn − 1). But if f is not two-sided, then even in the
skew-constacyclic case (i.e., moduli of the form f = xn − a), the product of two (σ, f )-
circulants is not a (σ, f )-circulant in general. However, we will encounter a proxy of such
multiplicativity in the next section that fully serves our purposes.
Transposition of (σ, f )-circulants is an even bigger obstacle. For general modulus f the
transpose of a (σ, f )-circulant need not be a (σ 0 , f 0 )-circulant for any automorphism σ 0
and any modulus f 0 of the same degree as f . This is actually not very surprising because
even in the commutative case the transpose of a circulant in the sense of Definition 8.7.3
need not be a circulant. A trivial example is f = xn and g = xr , but examples also exist
for polynomials f with nonzero constant term. The following (noncommutative) example
illustrates this.
The matrix G has rank 1 and thus generates a 1-dimensional (σ, f)-skew-cyclic code C = v_f(•(g)). Suppose the transpose G^T is a (σ′, f′)-circulant for some automorphism σ′ and f′ ∈ F[x;σ′] of degree 3, say G^T = Γ_{f′}^{σ′}(g′). Then clearly g′ is given by the first column of G; hence g′ = ω + ω^2x + ωx^2. Using for instance SageMath one checks that G^T ≠ Γ_{f′}^{σ′}(g′) for any automorphism σ′ of F_4 and any f′ ∈ F_4[x;σ′] of degree 3 (even non-monic). Furthermore, there exists no skew circulant H = Γ_{f′}^{σ′}(h) such that rank(H) = 2 and GH^T = 0. This means there is no analog of Remark 8.6.2(b),(c): C does not have a skew circulant parity check matrix and C^⊥ is not (σ′, f′)-skew-cyclic for any (σ′, f′).
In the next section we restrict ourselves to skew-constacyclic codes and will see that in
that case these obstacles can be overcome.
We close this section by presenting a different type of parity check matrix, namely a
generalization of the Vandermonde type parity check matrix for classical cyclic codes. Recall
W-polynomials from Section 8.5. The following result is immediate with the definition of
the skew Vandermonde matrix in (8.7).
Theorem 8.7.10 ([262, Prop. 4]) Let f ∈ F[x; σ] be any monic modulus of degree n and
g ∈ F[x; σ] be a monic right divisor of f of degree r. Suppose g is a W-polynomial. Thus we
may write g = lclm(x − a1 , . . . , x − ar ) for distinct a1 , . . . , ar ∈ F; see Theorem 8.5.13(a)(ii).
Let M = Vn (a1 , . . . , ar ) ∈ Fn×r be the skew Vandermonde matrix. Then the cyclic code
C = v(• (g)) is given by
\[
ρ_l : F[x;σ] \longrightarrow F[x;σ], \qquad \sum_{i=0}^{r} g_i x^i \longmapsto \sum_{i=0}^{r} x^{r-i} g_i = \sum_{i=0}^{r} σ^i(g_{r-i}) x^i \quad (\text{where } g_r \neq 0).
\]
Theorem 8.8.1 ([736, Thm. 5.3]) Let x^n − a = hg. Set c = σ^n(g_0)ag_0^{−1}, where g_0 is the constant coefficient of g. Then x^n − c = σ^n(g)h and
\[
Γ_a^σ(g'g) = Γ_c^σ(g')Γ_a^σ(g) \quad \text{for all } g' ∈ F[x;σ].
\]
Note that c defined in the theorem is the conjugate a^{g_0} with respect to the automorphism σ^n in the sense of Definition 8.4.7. If c = a (i.e., σ^n(g_0) = g_0) we have the much nicer formula Γ_a^σ(g′g) = Γ_a^σ(g′)Γ_a^σ(g), which may be regarded as a generalization of the two-sided case in Theorem 8.7.7. However, the above result holds true only for right divisors g of x^n − a. Check, for instance, with the aid of (8.11) that Γ_a^σ(x(x + 1)) ≠ Γ_b^σ(x)Γ_a^σ(x + 1) for any b ≠ 0 unless a = σ(a) = b.
The above product formula plays a central role in the following quite technical result. It
tells us that the transpose of a (σ, xn − a)-circulant is a (σ, xn − a0 )-circulant for a suitable
constant a0 .
Theorem 8.8.2 ([736, Thm. 5.6]) Suppose x^n − a = hg for some g, h ∈ F[x;σ] of degree r and k = n − r, respectively. Again set c = σ^n(g_0)ag_0^{−1}, where g_0 is the constant coefficient of g. Then
\[
Γ_a^σ(g)^T = Γ_{c^{-1}}^σ(g^\#) = Γ_{σ^k(c^{-1})}^σ(g^◦)Γ_{c^{-1}}^σ(x^k),
\]
where g^\# = aσ^k(ρ_l(g))x^k and g^◦ = aσ^k(ρ_l(g)). Furthermore, g^◦ is a right divisor of the modulus x^n − σ^k(c^{−1}).
The result generalizes Remark 8.6.1(c) and (d): if f = x^n − 1 = hg and σ = id, then c = 1, g^◦ = ρ(g) and thus g^\# = ρ(g)x^k. Furthermore, in general and analogously to Remark 8.6.1(e), g^◦ is a right divisor of the modulus x^n − σ^k(c^{−1}), whereas the representative of g^\# of degree less than n is in general not a divisor of x^n − c^{−1}. This is the reason why
we provide two formulas pertaining to the transpose of a skew circulant Γσa (g). The first
one above is interesting in itself as it tells us that the transpose is again a skew circulant.
The second formula says that the skew-constacyclic code • (g), i.e., the row space of Γσa (g),
equals the row space of a transposed skew circulant where the representing polynomial is a
right divisor of the modulus. In all these cases it is crucial that g is a right divisor of the
modulus xn − a for otherwise the transpose of Γσa (g) is not a skew circulant in general [736,
Ex. 5.7].
Now we are ready to derive a parity check matrix reflecting the skew-constacyclic struc-
ture of the code. The second part of the following theorem appeared first, proven differently,
in [258, Thm. 8]. The mere (σ, a−1 )-skew-constacyclicity of C ⊥ can also be shown directly
with the aid of (8.10); see [1827, Thm. 2.4].
Theorem 8.8.3 ([736, Cor. 4.4, Thm. 5.8 and Thm. 6.1]) Let x^n − a = hg, where deg(g) = r and deg(h) = k = n − r. Set h^◦ := ρ_l(σ^{−n}(h)). Then h^◦ is a right divisor of x^n − a^{−1}. Consider the (σ, a)-skew-constacyclic code C = v_{x^n−a}(•(g)). Then Γ_a^σ(g)Γ_{a^{−1}}^σ(h^◦)^T = 0 and rank(Γ_{a^{−1}}^σ(h^◦)) = n − k. Hence
\[
C = rowsp(Γ_a^σ(g)) = \{c ∈ F^n \mid Γ_{a^{-1}}^σ(h^◦)c^T = 0\},
\]
and the (n − k) × n submatrix consisting of the first n − k rows of Γ_{a^{−1}}^σ(h^◦) is a parity check matrix of C. As a consequence, the dual code C^⊥ is (σ, a^{−1})-skew-constacyclic with (non-monic) generator polynomial h^◦ and generator matrix and parity check matrix given by the first n − k rows of Γ_{a^{−1}}^σ(h^◦) and the first k rows of Γ_a^σ(g), respectively.
Clearly, this parity check matrix of C has a form analogous to (8.12) and thus reflects
the skew-constacyclic structure of C. As in Proposition 8.7.5, its row space equals the row
space of the entire skew circulant Γσa−1 (h◦ ). In the commutative cyclic case, where xn − 1 =
hg = gh, we have a−1 = a = 1 and h◦ = ρ(h) and thus recover Remark 8.6.2(b).
Having an understanding of the dual of skew-constacyclic codes, we can address self-
duality. The following corollary is immediate.
Corollary 8.8.4 ([258, Prop. 13], [261, Cor. 6], [736, Cor. 6.2]) If there exists a (σ, a)-skew-constacyclic self-dual code of length n, then n is even and a = ±1. More specifically, let n be even and consider the modulus x^n − ε, where ε ∈ {1, −1}. Then there exists a self-dual skew-constacyclic code of length n if and only if there exists a polynomial h ∈ F[x;σ] such that x^n − ε = hh^◦. In this case the self-dual code is given by C = v_{x^n−ε}(•(h^◦)).
176 Concise Encyclopedia of Coding Theory
Theorem 8.9.1 ([260, Thm. 4]) Fix b, δ ∈ N. Suppose there exists some α ∈ F̄ (the algebraic closure of F) such that

g(α^b) = g(α^{b+1}) = · · · = g(α^{b+δ−2}) = 0 and α^{[[0]]}, α^{[[1]]}, . . . , α^{[[n−1]]} are distinct.

Then the code C = vf(•(g)) has minimum distance at least δ. If g is the smallest degree monic polynomial with roots α^b, . . . , α^{b+δ−2}, then C is called an (n, q^m, α, b, δ)-skew-BCH code of the first kind.
Corollary 8.9.2 ([260, Thm. 5]) Consider the situation of Theorem 8.9.1 and where α ∈ F. Then the polynomial g′ := lclm(x − α^b, . . . , x − α^{b+δ−2}) is in F[x; σ] and of degree δ − 1. Thus for any left multiple f′ of g′ of degree n, the skew-cyclic code v_{f′}(•(g′)) has dimension n − δ + 1 and is MDS. It is called an (n, q^m, α, b, δ)-skew-RS code of the first kind.
Theorem 8.9.1 follows immediately from the fact that the code is contained in the left
kernel of the skew Vandermonde matrix (see (8.7))
Vn(α^b, . . . , α^{b+δ−2}) =
[  1                  · · ·   1
   (α^{[[1]]})^b      · · ·   (α^{[[1]]})^{b+δ−2}
   ⋮                          ⋮
   (α^{[[n−1]]})^b    · · ·   (α^{[[n−1]]})^{b+δ−2}  ].
The columns of this matrix consist of consecutive [[i]]-powers of αb , . . . , αb+δ−2 (which simply
accounts for skew-polynomial evaluation), while the rows consist of ordinary consecutive
powers of α[[0]] , . . . , α[[n−1]] . The latter together with the fact that α[[0]] , α[[1]] , . . . , α[[n−1]] are
distinct, guarantees that any (δ − 1) × (δ − 1)-minor of Vn (αb , . . . , αb+δ−2 ) is nonzero, which
establishes the stated designed distance.
There exist some precursors of Theorem 8.9.1. In [256, Prop. 2] the case where f = xm −1
(thus n = m), q = 2, b = 0, and α is a primitive element of F was considered. It was the first
appearance of skew-BCH codes. Subsequently, in [389, Prop. 2] the modulus f was relaxed
to a two-sided polynomial of degree n, and α to a primitive element of a field extension
Fqs , where n ≤ (q − 1)s. In the same paper, examples of such codes were constructed by
translating the above situation into the realm of linearized polynomials.
Theorem 8.9.1 has been generalized to the following form.
Theorem 8.9.3 ([1791, Thm. 4.10]) Let f have a nonzero constant coefficient. Suppose there exist δ, t1, t2 ∈ N and b, ν ∈ N0 and some α ∈ F̄ such that

(a) g(α^{b+t1 i+t2 j}) = 0 for i = 0, . . . , δ − 2, j = 0, . . . , ν,

(b) (α^{tℓ})^{[[i]]} ≠ 1 for i = 1, . . . , n − 1, ℓ = 1, 2 (if ν = 0, the condition on α^{t2} is omitted).

Then the code vf(•(g)) ⊆ F^n has minimum distance at least δ + ν. It may also be called an (n, q^m, α, b, t1, t2, δ)-skew-BCH code of the first kind.
Note that for ν = 0 and t1 = 1 this result reduces to Theorem 8.9.1 because Condition (b)
is equivalent to α[[0]] , α[[1]] , . . . , α[[n−1]] being distinct.
Since the constructed code has dimension n − deg(g), it remains to investigate how to find the smallest degree monic polynomial g that satisfies (a) above and has degree at most n.
Recall from Proposition 8.7.5(c) that the modulus f does not play a role in the generator
matrix of the resulting code vf (• (g)) ⊆ Fn . Thus, once g is found, any monic left multiple f
of degree n suffices. The polynomial g is obtained as follows, which is a direct consequence
of Example 8.5.5.
Remark 8.9.4 Consider the situation of Theorem 8.9.3 and suppose α is in the field ex-
tension Fqms of F = Fqm . Set T = {b + t1 i + t2 j | i = 0, . . . , δ − 2, j = 0, . . . , ν} and
A = {τ (αt ) | τ ∈ Aut(Fqms | F), t ∈ T }. Then g := mA is in F[x; σ] and is the smallest
degree monic polynomial satisfying (a) of Theorem 8.9.3.
Example 8.9.5 Consider the field extension F212 | F26 . Let α be the primitive element
of F212 with minimal polynomial x12 + x7 + x6 + x5 + x3 + x + 1 and let γ = α65 , which
is thus a primitive element of F26. As always, σ is the 2-Frobenius. Let b = 0, t1 = 23, t2 = 1, δ = 4, ν = 0. Since ν = 0, Condition (b) of Theorem 8.9.3 amounts to (α^23)^{[[i]]} ≠ 1 for i = 1, . . . , n − 1. Since i = 12 is the smallest positive integer such that (α^23)^{[[i]]} = 1, we can construct skew-BCH codes up to length 12. Condition (a) and Remark 8.9.4 show that the desired g in F26[x; σ] is given by g = mA, where

A = {α^0, α^23, α^46, (α^0)^{2^6}, (α^23)^{2^6}, (α^46)^{2^6}} ⊆ F212.
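Condition (b) here reduces to pure integer arithmetic: since q = 2, we have [[i]] = 1 + 2 + · · · + 2^{i−1} = 2^i − 1, and, assuming α is primitive of order 2^12 − 1 = 4095 as stated above, (α^23)^{[[i]]} = α^{23(2^i−1)} equals 1 exactly when 4095 divides 23(2^i − 1). A minimal sketch of this check:

```python
from math import gcd

# Data of Example 8.9.5: alpha is primitive in F_{2^12}, so ord(alpha) = 4095,
# and (alpha^t)^[[i]] = alpha^(t*(2^i - 1)) is 1 iff 4095 | t*(2^i - 1).
order_alpha = 2**12 - 1  # 4095
t1 = 23

def smallest_i(order, t, bound=200):
    """Smallest i >= 1 with (alpha^t)^{2^i - 1} = 1, for alpha of the given order."""
    for i in range(1, bound):
        if (t * (2**i - 1)) % order == 0:
            return i
    return None

print(gcd(order_alpha, t1))          # 1: 23 is coprime to 4095
print(smallest_i(order_alpha, t1))   # 12: skew-BCH codes exist up to length 12
```

Since gcd(4095, 23) = 1, divisibility of 23(2^i − 1) by 4095 forces 4095 | 2^i − 1, which first happens at i = 12.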
We now turn to skew-BCH codes of the second kind. The codes presented next are σ-
cyclic, i.e., skew-cyclic with respect to the central modulus xn − 1, and a priori defined over
a field extension Fqn of F = Fqm . The result generalizes the Hartmann–Tzeng Bound for
classical cyclic codes (see Theorem 2.4.2).
Theorem 8.9.6 ([828, Thm. 3.3] and [1339, Cor. 5]) Let σ be the q-Frobenius on Fqn. Let f = x^n − 1 and g ∈ Fqn[x; σ] be a right divisor of f. Let α ∈ Fqn be such that α, α^q, . . . , α^{q^{n−1}} is a normal basis of Fqn over Fq and set β = α^{−1}σ(α) = α^{q−1}. Suppose there exist δ, t1, t2 ∈ N and b, ν ∈ N0 such that gcd(n, t1) = 1 and gcd(n, t2) < δ and

g(β^{q^{b+it1+jt2}}) = 0 for i = 0, . . . , δ − 2, j = 0, . . . , ν.

Then the (σ, f)-skew-cyclic code vf(•(g)) has minimum distance at least δ + ν.
The version given in [1339, Cor. 5] is slightly different from the above. The difference is
explained in [828, Rem. A.6].
We have seen already in Example 8.5.10 that x^n − 1 = lclm(x − β^{q^t} | t = 0, 1, . . . , n − 1). Therefore the root condition on g does not clash with the condition that g be a right divisor of f.
Let us now assume that n = ms so that Fqn is a field extension of Fqm. In [828] it is shown how to obtain from the code of the previous theorem a code over the subfield F = Fqm with the same designed minimum distance δ + ν. Thus, as for skew-BCH codes of the first kind, we want to find the smallest degree monic polynomial g in Fqm[x; σ] with
the desired roots. This can again be achieved with the aid of Remark 8.9.4, where we simply have to replace the set T by T̃ = {q^{b+t1 i+t2 j} | i = 0, . . . , δ − 2, j = 0, . . . , ν}. Noting that Aut(Fqms | Fqm) = {τ0, . . . , τ_{s−1}}, where τℓ(a) = a^{q^{ℓm}}, we conclude that the set A = {τ(β^t) | t ∈ T̃, τ ∈ Aut(Fqms | Fqm)} is given by

A = {β^{q^{b+t1 i+t2 j+ℓm}} | i = 0, . . . , δ − 2, j = 0, . . . , ν, ℓ = 0, . . . , s − 1}.
Introduction to Skew-Polynomial Rings and Skew-Cyclic Codes 179
All of this can simply be described in terms of the q-exponents within the cyclic group Cms
of order ms. Consider Cs , the cyclic group of order s, as a subgroup of Cms . Furthermore, let
X0 = Cs , X1 , . . . , Xm−1 be the cosets of Cs in Cms . Then Remark 8.9.4 and Theorem 8.9.6
lead to the following.
Theorem 8.9.7 ([828, Thm. 4.5]) Let n = ms and consider the situation of Theorem 8.9.6. Consider the set S = {b + it1 + jt2 | i = 0, . . . , δ − 2, j = 0, . . . , ν} as a subset of Cms (this is well-defined since σ^{ms} = id). Define S̄ as the smallest union of cosets Xi containing S. Then the polynomial g′ = lclm(x − β^{q^t} | t ∈ S̄) is in F[x; σ]. Thus it defines a (σ, f)-skew-cyclic code C = vf(•(g′)) of length n = ms over F. The code C has minimum distance at least δ + ν and is called an (n, q^m, α, b, t1, t2, δ)-skew-BCH code of the second kind.
The last part follows from the fact that g′ is a left multiple of g from Theorem 8.9.6 and thus generates a code contained in the code from that theorem.
There is a connection between the above and q-cyclotomic spaces defined in [1339,
Sec. 3.2]. In particular, the q-polynomial of an element β defined in [1339, Lem. 3] is the
linearized version of the σ-minimal polynomial of β as discussed earlier in Example 8.5.5.
Further details on the connection are given in [828, Prop. A.7].
Example 8.9.8 As in Example 8.9.5, consider F212 | F26 with the same primitive element
α and the same data γ = α65 , b = 0, t1 = 23, t2 = 1, δ = 4, ν = 0. The element α5 generates
a normal basis of F212 over F2. Thus, β = α^{−5}σ(α^5) = α^5. We have to consider the set S = {b + it1 | i = 0, 1, 2} = {0, 11, 10} and find the smallest union of cosets of C2 in C12 containing S. This is given by S̄ = {0, 6, 11, 5, 10, 4}. Then

lclm(x − (α^5)^{q^t} | t ∈ S̄) = x^6 + γ^61 x^5 + γ^41 x^4 + γ^4 x^3 + γ^20 x^2 + γ^46 x + γ^7 ∈ F26[x; σ]

generates a skew-cyclic code over F26 of length 12 and designed minimum distance 4. It has dimension 6 and its actual minimum distance is 6. Thus the code is not MDS.
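The coset-closure step S ↦ S̄ is easy to automate. In the sketch below (our own helper, not from the text) the subgroup Cs of order s in Cms ≅ Z/msZ is generated by m, so the coset of k is {k + jm mod ms | j = 0, . . . , s − 1}:

```python
def closure(S, m, s):
    """Smallest union of cosets of C_s (order-s subgroup of C_ms = Z/msZ) containing S."""
    ms = m * s
    out = set()
    for k in S:
        # the coset of k under C_s = <m> inside Z/msZ
        out.update((k + j * m) % ms for j in range(s))
    return sorted(out)

# Example 8.9.8: m = 6, s = 2, b = 0, t1 = 23, so S = {i*23 mod 12 | i = 0, 1, 2}.
S = sorted(i * 23 % 12 for i in range(3))
print(S)                 # [0, 10, 11]
print(closure(S, 6, 2))  # [0, 4, 5, 6, 10, 11], the set S-bar of the example
```

The closure has 6 elements, so deg(g′) = 6 and the code has dimension 12 − 6 = 6, matching the example.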
To our knowledge, no general comparison of the two kinds of skew-BCH codes has been
conducted so far.
Finally, we briefly address evaluation codes in the skew polynomial setting. Recall that
the evaluation below, p(αi ), is carried out according to Definition 8.4.1.
By Theorem 8.5.8 the rank of the skew Vandermonde matrix equals the degree of the
minimal polynomial of the set {α1 , . . . , αn }, which is given by lclm(x − αi | i = 1, 2, . . . , n).
Hence the rank condition above is equivalent to deg(lclm(x − αi | i = 1, 2, . . . , n)) = n.
In the classical case where σ = id, this is equivalent to α1 , . . . , αn being distinct, and the
code Eid,α1 ,...,αn is a generalized [n, k] Reed–Solomon code; see [1008, Sec. 5.3].
The proof of Theorem 8.9.9 follows easily as in the classical case of generalized Reed–
Solomon codes with the aid of Theorem 8.5.8. It is well-known that in many cases classical
generalized Reed–Solomon codes are cyclic, e.g., Eid,1,α,...,αn−1 is cyclic if α is a primitive
element of F and n ≤ |F|. However, no such statement holds true for skew-polynomial
evaluation codes. Indeed, it is not hard to find examples of codes Eσ,1,α,...,αn−1 , where α is a
primitive element of F, that are not (σ, f )-skew cyclic for any monic modulus f of degree n.
The same is true for the evaluation codes Eσ,1,α[[1]] ,...,α[[n−1]] .
We close this chapter by mentioning that many of the papers cited above also present
decoding algorithms for the codes constructed therein. We refer to the above literature on
this important topic.
Acknowledgement
The author was partially supported by grant #422479 from the Simons Foundation.
Chapter 9
Additive Cyclic Codes
Jürgen Bierbrauer
Michigan Technological University
Stefano Marcugini
Università degli Studi di Perugia
Fernanda Pambianco
Università degli Studi di Perugia
9.1 Introduction
We develop the theory of additive cyclic codes in the case when the length of the code
is coprime to the characteristic of the field. The definition of an additive code and some
basic relations between a code defined over an extension field and its subfield and trace
codes are recalled in Section 9.2. Section 9.3 discusses the notions of cyclicity of a code.
In Section 9.4 we develop the theory of additive codes which are cyclic in the permutation
sense. This contains the classical theory of cyclic linear codes. We discuss when cyclic codes
are equivalent and consider the special case of cyclic quantum codes. The general theory
of additive codes which are cyclic in the monomial sense is in Section 9.5. It is proved in
Corollary 9.5.10 that each additive code over F4 which is cyclic in the monomial sense either
is cyclic in the (more restricted) permutation sense or is linear over F4 .
Special cases of this theory were published in earlier work [196, 197, 198, 199, 201, 663].
Clearly this generalizes the notion of linear codes. Linear codes correspond to the special case of Definition 9.2.1 when m = 1. The alphabet is seen not as the field Fqm but as an m-dimensional vector space over the subfield Fq. Observe that the dimension k in Definition 9.2.1 is not necessarily an integer. It may have m in the denominator. As an example, we will encounter additive [7, 3.5]4 codes over F4. Such a code has 4^{3.5} = 2^7 = 128 codewords. We
will describe the general theory of additive codes which are cyclic in the monomial sense
under the general assumption that the characteristic p of the underlying field Fq is coprime
to the length n of the code. The main effect of the assumption is that the action of a
(cyclic) group of order n on a vector space over a field of characteristic p is completely
reducible, by Maschke’s Theorem; see Definition 9.3.5 and Theorem 9.3.6. We will make use
of basic relations between linear codes over a finite field F and its trace codes and subfield
codes over a subfield K ⊂ F. These basic facts and fundamental notions like the Galois
group, cyclotomic cosets and Galois closedness are introduced in [199, Chapter 12]. Recall
in particular Delsarte’s Theorem [199, Theorem 12.14] relating codes over F, their duals,
the trace codes and the subfield codes as well as [199, Theorem 12.17] which states among
other things that dimF (U ) = dimK (TrF/K (U )) provided the F-linear code U is Galois closed
with respect to the subfield K.
The notion of permutation equivalence can be used for all codes of fixed block length n.
It uses the symmetric group Sn as the group of motions. Two codes are equivalent if they
are in the same orbit under Sn . The stabilizer under Sn is the permutation automorphism
group of the code; see Section 1.8. An additive code is called cyclic in the permutation sense
if it admits an n-cycle in its permutation automorphism group.
In Chapters 2 and 17, respectively, the notions of cyclic linear codes and the more
general notion of constacyclic linear codes are studied. In the present chapter we consider
generalizations of those concepts from the category of linear codes to the category of additive
codes. Here the concept of an additive code which is cyclic in the permutation sense is a
direct generalization of the concept of a cyclic linear code. The corresponding generalization
of the concept of a constacyclic linear code is the notion of an additive code which is cyclic
in the monomial sense. It is based on the following rather general notion of equivalence.
Definition 9.3.2 q-linear q^m-ary codes C and D are monomially equivalent if there exist a permutation π on n objects and elements Ai ∈ GLm(q) such that

D = {(A0 xπ(0), A1 xπ(1), . . . , A_{n−1} xπ(n−1)) | x = (x0, x1, . . . , x_{n−1}) ∈ C}.

Here we view GLm(q) as the group of nonsingular Fq-linear transformations on E = F_q^m, with Ai xπ(i) the image Ai(xπ(i)) of xπ(i) under Ai.
The group of motions is the wreath product GLm(q) ≀ Sn of GLm(q) and Sn, of order |GLm(q)|^n × n!; two additive codes are equivalent in the monomial sense if they are in the same orbit under the action of this larger group. The elements of the wreath product are described by (n + 1)-tuples (A0, . . . , A_{n−1}, π) where the coefficients Ai ∈ GLm(q) and the permutation π ∈ Sn. Write the elements of the monomial group as g(A0, . . . , A_{n−1}, π); observe the rule: first the entries are permuted according to the permutation π and then the matrices Ai are applied in the coordinates.
As the n-cycles are conjugate in Sn, it is clear that each code which is cyclic in the permutation sense is permutation equivalent to a code which is invariant under the permutation π = (0, 1, . . . , n − 1). In what follows, n-tuples (x0, x1, . . . , x_{n−1}) or (x1, x2, . . . , xn) will often be denoted (xi) where the subscripts are determined by the context.
Proposition 9.3.4 Let C be a q-linear q m -ary code of length n and cyclic in the monomial
sense. Let I be the identity transformation of GLm (q). Then the following hold.
(a) C is monomially equivalent to a code invariant under g(I, . . . , I, A, π) where π =
(0, 1, . . . , n − 1).
(b) If C is invariant under g(A0 , A1 , . . . , An−1 , (0, 1, . . . , n − 1)) and A0 A1 · · · An−1 = I,
then C is monomially equivalent to a code that is cyclic in the permutation sense.
(c) If C is invariant under g(I, . . . , I, A, (0, 1, . . . , n − 1)) and the order of A is coprime to n, then C is monomially equivalent to a code that is cyclic in the permutation sense.
Definition 9.3.5 Let the group G act on a vector space V . An irreducible submodule
A ⊆ V is a nonzero G-submodule which does not contain a proper G-submodule (different
from {0} and from A itself). The action of G is completely reducible if for every G-
submodule A ⊂ V there is a direct complement B of A which is a G-submodule of V .
If the action is completely reducible, it suffices in a way to know the irreducibles, as every
G-submodule is a direct sum of irreducibles. There is still work to do as the representation
of a module as a direct sum of irreducibles may not be unique. However the situation is
a lot simpler than in the cases where complete reducibility is not satisfied. This is where
Maschke’s Theorem comes in.
Theorem 9.3.6 (Maschke) In the situation of Definition 9.3.5, let the order of G be
coprime to the characteristic p of the underlying field. Then the action of G is completely
reducible.
See also [1203, page 666]. This fundamental theorem is the reason why in the theory of
cyclic codes it is often assumed that gcd(n, p) = 1. We will apply it in the study of additive
codes which are cyclic in the permutation sense.
Definition 9.4.1 Let P be the space of polynomials in F[X] of degree < n along with the zero polynomial. Then P is an n-dimensional F-vector space. Let A ⊆ Z/nZ be a set of exponents. The F-vector space P(A) consists of the zero polynomial and the polynomials in F[X] of degree < n all of whose monomials have degrees in A. The Galois closure Ã of A is the union of all q-cyclotomic cosets that intersect A nontrivially.
Theorem 9.4.2 In case m = 1 with gcd(n, p) = 1, the irreducible cyclic codes in the
permutation sense are precisely the V1 (Z) = TrF/Fq (Ev(P(Z))) of Fq -dimension |Z| where
Z is a q-cyclotomic coset. Each cyclic code can be written as a direct sum of irreducibles in
precisely one way. The total number of cyclic codes is 2t , where t is the number of different
q-cyclotomic cosets.
Theorem 9.4.2 is of course well known in the classical theory of linear cyclic codes. In
order to be self-contained we will prove it in the remainder of this subsection.
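The q-cyclotomic cosets, and hence the count 2^t of Theorem 9.4.2, are straightforward to compute. A minimal sketch (our own helper, assuming gcd(n, q) = 1):

```python
def cyclotomic_cosets(q, n):
    """Partition Z/nZ into q-cyclotomic cosets {z, qz, q^2 z, ...} (assumes gcd(n, q) = 1)."""
    seen, cosets = set(), []
    for z in range(n):
        if z in seen:
            continue
        coset, x = [], z
        while x not in coset:
            coset.append(x)
            x = x * q % n
        seen.update(coset)
        cosets.append(sorted(coset))
    return cosets

# q = 2, n = 7: three cosets, hence 2^3 = 8 binary cyclic codes of length 7.
cosets = cyclotomic_cosets(2, 7)
print(cosets)            # [[0], [1, 2, 4], [3, 5, 6]]
print(2 ** len(cosets))  # 8
```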
It follows from basic properties of the trace that in the description of a code C ⊆ V1 by
codewords TrF/Fq (Ev(p(X))), it suffices to use polynomials of the form p(X) = a1 X z1 +
a2 X z2 + · · · where z1 , z2 , . . . are representatives of different q-cyclotomic cosets. Such a
polynomial is said to be in standard form, for the moment.
Lemma 9.4.3 Let Z be a q-cyclotomic coset, z ∈ Z, |Z| = s, and L = Fqs. The Fq-vector space ⟨W^z⟩ generated by the u^z where u ∈ W is L. The codeword TrF/Fq(Ev(aX^z)) where a ∈ F is identically zero if and only if a ∈ L^⊥, where duality is with respect to the trace form. Here, for a field extension Fq ⊂ F = Fqr, the trace form is defined by ⟨x, y⟩ = TrF/Fq(xy) where x, y ∈ F.
Proof: The entry of TrF/Fq (Ev(p(X))) in coordinate αi is TrF/Fq (a1 αiz1 + a2 αiz2 + · · · ).
After a cyclic shift we obtain a codeword whose entry in coordinate αi is TrF/Fq (a1 α(i+1)z1 +
a2 α(i+1)z2 + · · · ) = TrF/Fq (a1 αz1 αiz1 + a2 αz2 αiz2 + · · · ) which is the trace of the evaluation
of a1 αz1 X z1 + a2 αz2 X z2 + · · · . The first claim follows by repeated application, and the
second is obvious.
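For intuition, the trace form of the smallest interesting case F8 | F2 can be tabulated directly. The modulus x^3 + x + 1 below is our own choice of model for F8 (the text fixes no specific one); elements are bit-polynomials 0–7:

```python
MOD = 0b1011  # x^3 + x + 1, a primitive polynomial defining F_8 = F_2[x]/(x^3 + x + 1)

def mul(a, b):  # multiplication in F_8 (carry-less multiply with reduction)
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0b1000:
            a ^= MOD
        b >>= 1
    return r

def tr(a):  # Tr_{F_8/F_2}(a) = a + a^2 + a^4
    a2 = mul(a, a)
    return a ^ a2 ^ mul(a2, a2)

# The trace form <x, y> = Tr(xy) is F_2-valued and nondegenerate:
assert all(tr(a) in (0, 1) for a in range(8))
assert all(any(tr(mul(x, y)) == 1 for y in range(8)) for x in range(1, 8))
print(sorted(a for a in range(8) if tr(a) == 0))  # kernel of the trace: [0, 2, 4, 6]
```

Nondegeneracy is exactly what makes the orthogonality criterion "a ∈ L^⊥" meaningful.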
Let us complete the proof of Theorem 9.4.2. Let C be an irreducible cyclic code and TrF/Fq(Ev(p(X))) ∈ C where p(X) is in standard form. Assume p(X) is not a monomial. Let B be defined as in Lemma 9.4.4 and p(X) = a1 X^{z1} + a2 X^{z2} + · · · ∈ B. Lemma 9.4.4 implies that a1 α^{lz1} X^{z1} + a2 α^{lz2} X^{z2} + · · · ∈ B for all l. Assume at first |Z1| ≠ |Z2|. Choose l such that α^{lz1} = 1 and α^{lz2} ≠ 1. By subtraction we have a2(α^{lz2} − 1)X^{z2} + · · · ∈ B. The cyclic code generated by the trace-evaluation of this polynomial has trivial projection to V1(Z1). As this is not true of C, it follows from irreducibility that this codeword must be the zero codeword; hence a2(α^{lz2} − 1) ∈ L2^⊥. As 0 ≠ α^{lz2} − 1 ∈ L2 (see Lemma 9.4.3), this implies a2 ∈ L2^⊥, a contradiction. Assume now |Z1| = |Z2| = s; let L = Fqs. Use Lemma 9.4.4. Let cl ∈ Fq be such that Σ_l cl α^{lz1} = 0. The same argument as above shows that Σ_l cl α^{lz2} = 0. We have β = α^{z1} ∈ L, and L is the smallest subfield of F containing β. Also α^{z2} = β^j where j is coprime to the order of β. Let Σ_{l=0}^{s} cl X^l be the minimal polynomial of β. We just saw that Σ_{l=0}^{s} cl β^{jl} = 0. This shows that the mapping x ↦ x^j is a field automorphism of L. This implies that z1 and z2 are in the same cyclotomic coset, a contradiction.
We have seen that a polynomial in standard form whose trace-evaluation generates an irreducible cyclic code is necessarily a monomial aX^z where a ∉ L^⊥, using the by now standard terminology (z ∈ Z, |Z| = s, L = Fqs, and ⊥ with respect to the trace form).
Lemma 9.4.4 shows that aαlz X z ∈ B for all l. As the αlz generate L (see Lemma 9.4.3),
it follows that auX z ∈ B for all u ∈ L. By Lemma 9.4.4 again we have in fact C =
TrF/Fq (Ev(aLX z )). Consider the map u 7→ TrF/Fq (Ev(auX z )) from L onto C. The kernel
of this map is 0, and so C has dimension s. As it is contained in V1 (Z) of dimension s, we
have equality.
This completes the proof of Theorem 9.4.2. As a result we obtain an algorithmic description of all codewords of all cyclic linear codes in the permutation sense of length n, where gcd(n, p) = 1. The codes are in bijection with sets of cyclotomic cosets. Let Z1 ∪ · · · ∪ Zℓ be such a set. Let zk ∈ Zk, sk = |Zk|, and Lk = F_{q^{sk}}. The dimension of the code is Σ_{k=1}^{ℓ} sk. Its generic codeword w(x1, . . . , xℓ), where xk ∈ Lk, has entry

Σ_{k=1}^{ℓ} TrLk/Fq(xk α^{izk})

in coordinate i.
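As a small worked instance of this trace-evaluation description (our own example, with F8 realized via the primitive polynomial x^3 + x + 1, a choice not fixed in the text): for q = 2, n = 7, and the single coset Z = {1, 2, 4}, the codewords with entries Tr_{F8/F2}(x α^i), x ∈ F8, form the [7, 3] simplex code, cyclic and of constant nonzero weight 4.

```python
MOD = 0b1011  # x^3 + x + 1 defines F_8; alpha = class of x is primitive

def mul(a, b):  # multiplication in F_8
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0b1000:
            a ^= MOD
        b >>= 1
    return r

def tr(a):  # Tr_{F_8/F_2}(a) = a + a^2 + a^4
    a2 = mul(a, a)
    return a ^ a2 ^ mul(a2, a2)

alpha = 0b010
powers = [1]
for _ in range(6):
    powers.append(mul(powers[-1], alpha))  # alpha^0, ..., alpha^6

# Codewords w(x)_i = Tr(x * alpha^i) for x in F_8, coset Z = {1, 2, 4}.
code = {tuple(tr(mul(x, p)) for p in powers) for x in range(8)}
print(len(code))                          # 8 codewords: F_2-dimension |Z| = 3
print({sum(w) for w in code if any(w)})   # {4}: the simplex code
```

A cyclic shift of w(x) is w(αx), so the code is visibly cyclic.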
Lemma 9.4.5 Let C be an irreducible additive cyclic code. The q-cyclotomic cosets deter-
mined by the irreducible linear cyclic codes πj (C) are all identical.
Proof: We can assume m = 2 and TrF/Fq(Ev(p1(X), p2(X))) ∈ C where p1(X) = a1 X^{z1}, p2(X) = a2 X^{z2} and ai ∉ Li^⊥, with the notation used in the previous subsection. The proof
is similar to the main portion of the proof of Theorem 9.4.2. Lemma 9.4.4 shows that
(a1 αlz1 X z1 , a2 αlz2 X z2 ) ∈ B for all l, where B is the Fq -linear space of tuples of polynomials
whose trace-evaluation is in C. Assume z1 , z2 are in different q-cyclotomic cosets Z1 , Z2 ,
of lengths s1 , s2 . If s1 6= s2 , the same argument applies as in the proof of Theorem 9.4.2.
If s1 = s2 , the argument used in the proof of Theorem 9.4.2 shows Z1 = Z2 , another
contradiction.
Proof: Such a codeword can be written as TrF/Fq (Ev(a1 X z , a2 X z , . . . , am X z )). The entry
in outer coordinate i and inner coordinate j is TrF/Fq (aj αiz ). Lemma 9.4.4 shows that
the cyclic code C generated by this codeword is the span of the codewords with entry
TrF/Fq (aj αlz αiz ) in the same position. As the αlz generate L = Fqs (see Lemma 9.4.3) it
follows that C consists of the codewords with entry TrF/Fq (aj uαiz ) in the (i, j)-coordinate,
where u ∈ L. It is now clear that this code is the cyclic code generated by any of its
nonzero codewords; in other words C is irreducible. By the transitivity of the trace we
have TrF/Fq (aj uαiz ) = TrL/Fq (bj uαiz ) where bj = TrF/L (aj ). The fact that the code is
nonzero means that not all the bj vanish. The map u 7→ TrF/Fq (Ev(a1 uX z , . . . , am uX z ))
is injective, and so dim(C) = s.
Corollary 9.4.7 The total number of irreducible permutation cyclic q-linear q^m-ary codes of length n coprime to the characteristic is Σ_Z (q^{ms} − 1)/(q^s − 1), where Z varies over the q-cyclotomic cosets and s = |Z|.
Proof: In fact, Vm (Z) has Fq -dimension ms, and each of the irreducible subcodes has q s − 1
nonzero codewords.
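The formula of Corollary 9.4.7 is easy to evaluate; the sketch below reproduces the count 3 + 9 + 9 = 21 used later in Example 9.4.17 (q = 2, m = 2, n = 7):

```python
def cyclotomic_cosets(q, n):
    seen, cosets = set(), []
    for z in range(n):
        if z not in seen:
            c, x = [], z
            while x not in c:
                c.append(x)
                x = x * q % n
            seen.update(c)
            cosets.append(c)
    return cosets

def count_irreducible(q, m, n):
    # Corollary 9.4.7: sum over cosets Z of (q^(m*s) - 1)/(q^s - 1) with s = |Z|
    return sum((q**(m * len(Z)) - 1) // (q**len(Z) - 1) for Z in cyclotomic_cosets(q, n))

print(count_irreducible(2, 2, 7))  # 21 = 3 + 9 + 9
print(count_irreducible(2, 1, 7))  # 3: in the linear case, one per coset
```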
Observe that this is true also in the linear case m = 1: the number of irreducible cyclic codes is the number of q-cyclotomic cosets in this case; see Theorem 9.4.2. More importantly we obtain a parametric description of the irreducible codes. Such a code is described by the following data: a q-cyclotomic coset Z with representative z ∈ Z, where |Z| = s and L = Fqs, together with a point P = (b1 : · · · : bm) ∈ PG(m − 1, L). The codewords are then parameterized by x ∈ L. The entry in outer coordinate i and inner coordinate j is TrL/Fq(bj x α^{iz}). Denote this code by C(z, P). This leads to a parametric description of all cyclic additive codes, not just the irreducible ones.
where sj is column j of B and x · sj is the ordinary dot product. The image of w(x) under one cyclic shift is w(α^z x). What happens if we use a different basis (z′_l) for the same subspace U? Then z′_l = Σ_{t=1}^{k} a_{lt} z_t and A = [a_{lt}] is invertible. The image of w(x) under this operation (replacing z_l by z′_l) has (i, j)-entry TrL/Fq(Σ_{l=1}^{k} x_l Σ_{t=1}^{k} a_{lt} b_{tj} α^{iz}). This describes w(x′) where x′_l = Σ_{t=1}^{k} x_t a_{tl}; in other words x′ = xA ∈ L^k. We have seen that C(z, U) is indeed independent of the choice of an encoding matrix. Basic properties of the trace show that the dependence on the choice of representative z ∈ Z is given by C(qz, U^q) = C(z, U).

Here is a concrete expression for the weights of constituent codes: codeword w(x) has entry w(x)_i = 0 in outer coordinate i if and only if x · sj ∈ (α^{iz})^⊥ for all j, with respect to the TrL/Fq-form.
Finally, each cyclic code can be written in a unique way as the direct sum of its constituent codes: C = ⊕_Z C(z, U_Z), where Z varies over the cyclotomic cosets and z is a fixed representative of Z.
Definition 9.4.9 Let N(m, q) be the total number of subspaces of the vector space F_q^m.

Corollary 9.4.10 The total number of permutation cyclic q-linear q^m-ary codes of length n coprime to the characteristic is ∏_Z N(m, q^{|Z|}).
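N(m, q) can be computed from Gaussian binomial coefficients, N(m, q) = Σ_k [m choose k]_q. The totals below (e.g. 605 for q = 2, m = 2, n = 7) are our own computations, not values stated in the text:

```python
def num_subspaces(m, q):
    """N(m, q): number of subspaces of F_q^m, summing Gaussian binomial coefficients."""
    total = 0
    for k in range(m + 1):
        num = den = 1
        for i in range(k):
            num *= q**m - q**i
            den *= q**k - q**i
        total += num // den
    return total

def cyclotomic_cosets(q, n):
    seen, cosets = set(), []
    for z in range(n):
        if z not in seen:
            c, x = [], z
            while x not in c:
                c.append(x)
                x = x * q % n
            seen.update(c)
            cosets.append(c)
    return cosets

# Corollary 9.4.10 for q = 2, m = 2, n = 7: cosets of sizes 1, 3, 3 give
# N(2, 2) * N(2, 8)^2 = 5 * 11 * 11 cyclic additive codes.
total = 1
for Z in cyclotomic_cosets(2, 7):
    total *= num_subspaces(2, 2**len(Z))
print(total)  # 605
```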
Here the first section of 3 lines corresponds to C(1, (1, 0)F8), the second section to C(−1, (0, 1)F8); in both cases k = 1, L = F8 = F2(ε) defined by ε^3 + ε^2 + 1 = 0, and α = ε.
Example 9.4.12 In the case q = 2, m = 2, and n = 13 there are only two 2-cyclotomic
cosets, of lengths 1 and 12. An optimal cyclic [13, 6.5, 6]4 code is obtained. This was first
described by Danielsen and Parker in [484]. A generator matrix can be written in the form ( I | A ) where I is the 13 × 13 identity matrix and
A =
0 11 10 11 00 10 11
1 10 01 00 00 01 11
1 01 00 01 11 11 10
1 00 01 10 00 11 01
0 10 11 11 01 01 00
1 11 01 01 10 01 00
0 00 10 11 11 01 01
0 01 11 01 01 10 01
1 10 01 10 11 10 10
1 10 00 11 01 00 01
0 10 00 10 10 01 01
1 11 11 00 11 00 11
1 10 11 00 10 11 10 .
For small and moderate length, the parameters of the best cyclic additive codes over F4
have the tendency to coincide with the best parameters of additive codes over F4 in general;
see [199, Section 18.3], and [200, 202].
9.4.3 Equivalence
When are two additive cyclic codes ⊕_Z C(z, U_Z) and ⊕_Z C(z, U′_Z) equivalent? Two such situations are easy to see. Here is the first.
L
Proposition 9.4.15 Let L C ∈ GLm (q). Then the additive cyclic code C = Z C(z, UZ ) is
monomially equivalent to Z C(z, UZ C).
Proof: The generic codeword w(x) of the second code above has entry TrL/Fq((x · sj)α^{itz}) in outer coordinate i and inner coordinate j (see Equation 9.1). This is the entry of the first code in outer coordinate i/t mod n and inner coordinate j.
Proposition 9.4.16 describes an action of the group of units in Z/nZ. Here is an example.
Example 9.4.17 There are precisely three inequivalent length 7 irreducible additive codes
over F4 which are cyclic in the permutation sense. This can be seen as follows. This is the
case q = 2, m = 2, and n = 7. The total number of irreducible cyclic codes is 3 + 9 + 9 = 21
(once PG(1, 2) and twice PG(1, 8)). The number of inequivalent such codes is at most
1 + 2 = 3. In fact, the three codes C(0, U ) where U ∈ PG(1, 2) are all equivalent by
Proposition 9.4.15. By Proposition 9.4.16 it suffices to consider Z(1), the 2-cyclotomic coset
containing 1, for the remaining irreducible codes. The number of possibly inequivalent such
irreducible codes is the number of orbits of PGL2 (2) on PG(1, 8). There are two such orbits.
The corresponding codes are C(1, (0, 1)) and C(1, (1, ε)) where ε ∉ F2. It is in fact obvious that those irreducible codes are pairwise inequivalent. Clearly the first one is inequivalent to the others as it has binary dimension 1 (the repetition code ⟨(11)^7⟩). The second code has w(x)_i = (0, TrF8/F2(xα^i)), of constant weight 4. The third of those irreducible codes has w(x)_i = (TrF8/F2(xα^i), TrF8/F2(xα^{i+1})), clearly not of constant weight.
Lemma 9.4.18 For q^r > 2, let W = ⟨α⟩ ⊂ F = Fqr be the cyclic subgroup of order n, and let l be an integer not a multiple of n. Then Σ_{i=0}^{n−1} α^{il} = 0.

Proof: Let S = Σ_{i=0}^{n−1} α^{il}. Then α^l S = S and α^l ≠ 1. It follows that (1 − α^l)S = 0, which implies S = 0.
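The lemma can be illustrated numerically over a prime field (our own toy data: p = 13, and α = 5 generates the subgroup W of order n = 4 in F13^*):

```python
# W = <alpha> of order n = 4 inside F_13^*: the powers of 5 are 5, 12, 8, 1.
p, alpha, n = 13, 5, 4
assert pow(alpha, n, p) == 1 and all(pow(alpha, i, p) != 1 for i in range(1, n))

for l in range(1, 9):
    s = sum(pow(alpha, i * l, p) for i in range(n)) % p
    print(l, s)  # s = 0 unless l is a multiple of n (then s = n = 4)
```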
Proposition 9.4.19 With the previous notation, V_{m,F}(Z)^⊥ = ⊕_{Z′≠−Z} V_{m,F}(Z′) and V_m(Z)^⊥ = ⊕_{Z′≠−Z} V_m(Z′).
Proof: As the dimensions are right, for the first equality it suffices to show that, with respect to ⟨·, ·⟩, V_{m,F}(Z) is orthogonal to V_{m,F}(Z′) when Z′ ≠ −Z; an analogous statement holds for the second equality. Concretely this means Σ_{k,l=1}^{m} a_{kl} Σ_{i=0}^{n−1} p_k(α^i) q_l(α^i) = 0 where p_k(X) ∈ P(Z), q_l(X) ∈ P(Z′). By bilinearity it suffices to prove this for fixed k, l and p_k(X) = X^z, q_l(X) = X^{z′} where z ∈ Z, z′ ∈ Z′. Let j = z + z′. Then j is not a multiple of n. We need to show Σ_{i=0}^{n−1} α^{i(z+z′)} = 0. This follows from Lemma 9.4.18, proving the first equality. The proof of the second equality is analogous, using the definition of the trace.
Observe that the result of Proposition 9.4.19 is independent of the bilinear form h·, ·i.
The information contained in the bilinear form will determine the duality relations between
subspaces of Vm,F (Z) and Vm,F (−Z) as well as between subspaces of Vm (Z) and Vm (−Z). In
order to decide self-orthogonality, the following formula is helpful; see [199, Lemma 18.54].
Lemma 9.4.20 Let Z be a nonzero q-cyclotomic coset, |Z| = s, z ∈ Z, and L = Fqs. Then

Σ_{u∈W} TrF/Fq(a u^z) TrF/Fq(b u^{−z}) = n × TrF/Fq(a TrF/L(b)).
Proof: Let S be the sum on the left. Writing out the definition of the trace, S = Σ_{i,j=0}^{r−1} a^{q^i} b^{q^j} Σ_{u∈W} u^{(q^i−q^j)z}. The inner sum vanishes if (q^i − q^j)z is not divisible by n. In the contrary case the inner sum is n. The latter case occurs if and only if i ≡ j (mod s), say j = i − ρs mod r with ρ = 0, . . . , r/s − 1. It follows that S = n Σ_{i=0}^{r−1} a^{q^i} Σ_{ρ=0}^{r/s−1} b^{q^{i−ρs}} = n Σ_{i=0}^{r−1} (a TrF/L(b))^{q^i} = n TrF/Fq(a TrF/L(b)).
An interesting special case concerns the cyclic quantum stabilizer codes over an arbitrary ground field Fq. In this case m = 2, and the bilinear form is the symplectic form ⟨(x1, x2), (y1, y2)⟩ = x1 y2 − x2 y1; in particular ⟨x, x⟩ = 0 always. By definition such a code is a (cyclic) q-ary quantum stabilizer code if it is self-orthogonal with respect to the symplectic form. The basic result to decide self-orthogonality is the following.
Proposition 9.4.21 When m = 2, gcd(n, p) = 1, and ⟨·, ·⟩ is the symplectic form, irreducible codes C(z, P) and C(−z, P′) are orthogonal if and only if P = P′.
Proof: Observe that Z and −Z have the same length s, L = Fqs, and P, P′ ∈ PG(1, L). It suffices to show that C(z, P) and C(−z, P) are orthogonal. Let P = (v1 : v2). Writing out the symplectic product of typical vectors, we need to show

Σ_{u∈W} TrF/Fq(a v1 u^z) TrF/Fq(b v2 u^{−z}) = Σ_{u∈W} TrF/Fq(a v2 u^z) TrF/Fq(b v1 u^{−z})

for a, b ∈ F. Lemma 9.4.20 shows that the sum on the left equals n × TrF/Fq(a v1 TrF/L(b v2)) = n × TrF/Fq(a v1 v2 TrF/L(b)), as v2 ∈ L, and the same expression is obtained for the sum on the right.
Theorem 9.4.22 The number of additive cyclic q^2-ary quantum stabilizer codes is

∏_{Z=−Z, s=1} (q + 2) · ∏_{Z=−Z, s>1} (q^{s/2} + 2) · ∏_{Z≠−Z} (3q^s + 6).

Here s = |Z| and the last product is over all pairs {Z, −Z} of q-cyclotomic cosets such that Z ≠ −Z.
Proof: Let C = Σ_Z S_Z where S_Z = C(z, U_Z). This is self-orthogonal if and only if S_Z and S_{−Z} are orthogonal for each Z. Consider first the generic case Z ≠ −Z. If S_Z or S_{−Z} is {0}, then there is no restriction on the other. If S_Z = C(z, L^2), then S_{−Z} = {0}. Consider the case Z = −Z and s > 1. Then s = 2i. Either S_Z = {0} or S_Z = C(z, P) is a self-orthogonal irreducible code where P ∈ PG(1, L). The dual of S_Z in its constituent is C(−z, P) by Proposition 9.4.21. As −z = zq^i, we have C(−z, P) = C(z, P^{q^i}). This equals S_Z if and only if P ∈ PG(1, L′) where L′ = F_{q^i}. There are q^i + 1 choices for P. The case Z = −Z and s = 1 contributes q + 2 self-orthogonal and q + 1 self-dual codes.
Example 9.4.23 In the case q = 2 and n = 7, we obtain 4 × 30 = 120 quantum codes
total and 3 × 11 = 33 self-dual ones.
Example 9.4.24 For q = 2 and n = 15, the number of quantum codes is 4 × 4 × 6 × 54,
and there are 3 × 3 × 5 × 19 self-dual quantum codes.
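The counts in Examples 9.4.23 and 9.4.24 can be reproduced mechanically from Theorem 9.4.22. A small sketch (function names are mine) that enumerates the q-cyclotomic cosets mod n and multiplies the corresponding factors:

```python
def cyclotomic_cosets(q, n):
    """All q-cyclotomic cosets of Z/nZ (requires gcd(n, q) = 1)."""
    seen, out = set(), []
    for a in range(n):
        if a in seen:
            continue
        coset, x = set(), a
        while x not in coset:
            coset.add(x)
            x = x * q % n
        seen |= coset
        out.append(frozenset(coset))
    return out

def count_quantum_codes(q, n):
    """Evaluate the product of Theorem 9.4.22 over cosets / coset pairs."""
    total, used = 1, set()
    for Z in cyclotomic_cosets(q, n):
        if Z in used:
            continue
        minus_Z = frozenset(-x % n for x in Z)
        s = len(Z)
        if Z == minus_Z:
            total *= (q + 2) if s == 1 else q ** (s // 2) + 2
        else:
            total *= 3 * q ** s + 6   # one factor per pair {Z, -Z}
            used.add(minus_Z)
        used.add(Z)
    return total
```

For q = 2 this gives 120 codes at n = 7 and 5184 = 4 · 4 · 6 · 54 at n = 15, matching the two examples.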
For more on quantum codes, see Chapter 27.
Theorem 9.5.3 Let gcd(n, p) = 1, and let g = g(1, . . . , 1, γ, (0, 1, . . . , n − 1)) where γ ∈ F∗q
with ord(γ) = u. The codes which are invariant under the group generated by g are the
contractions of the cyclic codes of length un in the permutation sense defined by cyclotomic
cosets consisting of numbers which are 1 mod u. If there are t such cyclotomic cosets in
Z/unZ, then the number of codes invariant under g is 2t .
Proof: Observe that gcd(un, p) = 1. By Lemma 9.5.2 the codes in question are the contractions of the length un cyclic codes all of whose codewords (xi) satisfy xi+n = γxi for all i. Let Z be a q-cyclotomic coset, z ∈ Z, |Z| = s, and L = Fqs . A typical codeword of the irreducible cyclic code defined by Z has entry xi = TrL/Fq (aβ^{iz}), where a ∈ L. As β^n = γ, it follows that xi+n = TrL/Fq (aβ^{(i+n)z}) = γ^z xi . We must have γ^z = γ, i.e., z ≡ 1 (mod u).
Here is an algorithmic view concerning linear length n q-ary codes which are invariant under g = g(1, . . . , 1, γ, (0, 1, . . . , n − 1)) where ord(γ) = u. Let r0 = ordun (q) and F′ = F_{q^{r0}}. Let β ∈ F′ have order un such that γ = β^n, and let α = β^u. Consider the q-cyclotomic cosets Z1 , . . . , Zt in Z/unZ whose elements are 1 mod u. Observe that |Z1 ∪ · · · ∪ Zt | = n. Let sk = |Zk |, zk ∈ Zk , and Lk = F_{q^{sk}} ⊆ F′. Then C ⊆ ⊕Ck where the Ck are the irreducible γ-constacyclic codes of length n with dim(Ck ) = sk . The codewords of Ck are w(x) where x ∈ Lk and the entry of w(x) in coordinate i is TrLk/Fq (xβ^{izk}). The image of w(x) under the monomial operation g(1, . . . , 1, γ, (0, 1, . . . , n − 1)) is w(β^{zk} x). After n applications this leads to w(γ^{zk} x) = w(γx) = γw(x), as expected.
Example 9.5.4 Consider q = 4, u = 3, and n = 15. Then F = F16 whereas F′ = F_{4^6}. The 4-cyclotomic cosets in Z/45Z consisting of numbers divisible by 3 are in bijection with the nine 4-cyclotomic cosets mod 15. The contractions of the cyclic length 45 codes defined by cyclotomic cosets all of whose elements are divisible by 3 reproduce precisely the length 15 cyclic codes in the permutation sense. There are only three 4-cyclotomic cosets in Z/45Z all of whose elements are 1 mod 3. They are
{1, 4, 16, 19, 31, 34}, {7, 13, 22, 28, 37, 43}, {10, 25, 40}.
It follows that there are 2³ = 8 constacyclic linear codes over F4 with constant γ = β^15 of order 3.
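The three cosets listed above can be checked directly. A small sketch (helper name is mine) that computes the 4-cyclotomic cosets of Z/45Z and keeps those whose elements are 1 mod 3:

```python
def q_cyclotomic_cosets(q, n):
    seen, out = set(), []
    for a in range(n):
        if a not in seen:
            coset, x = [], a
            while x not in coset:
                coset.append(x)
                x = x * q % n
            seen.update(coset)
            out.append(sorted(coset))
    return out

# cosets in Z/45Z all of whose elements are congruent to 1 mod 3
special = [Z for Z in q_cyclotomic_cosets(4, 45) if all(z % 3 == 1 for z in Z)]
```

There are exactly three such cosets, giving the 2³ = 8 invariant codes.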
Example 9.5.5 Consider q = 8, u = 7, and n = 21. Then F′ = F_{2^42} = F_{8^14}. The constant is γ = β^21 of order 7. We have un = 7 × 21 = 147, and we have to consider the action of q = 8 (the Galois group Gal(F′|F8 ) has order 14) on the 21 elements which are 1 mod 7.
The corresponding cyclotomic cosets are
Z(1) = {1, 8, 64, 71, −20, −13, 43, 50, −41, −34, 22, 29, 85, −55}
of length 14 and
Z(15) = {15, −27, 78, 36, −6, −48, 57}
of length 7. It follows that there are precisely two irreducible γ-constacyclic F8 -linear codes
of length 21, of dimensions 14 and 7.
Example 9.5.7 The second family of constacyclic codes over F4 from [801] occurs in the case u = 3 and n = (4^f − 1)/3 for arbitrary f where the 4-cyclotomic cosets are those generated by 1 and −2. This yields a 2f-dimensional code over F4 of dual distance 5 and therefore [(4^f − 1)/3, (4^f − 1)/3 − 2f, 5]_4 codes for arbitrary f.
Proof: The additive cyclic length un codes are direct sums over cyclotomic cosets Z of codes parameterized by subspaces UZ ⊂ L^m, using the customary terminology. Such a code is in IA (Fq^{mn}) if and only if each codeword x = (xi) (where i = 0, . . . , un − 1 and xi ∈ Fq^m) satisfies xi+n = Axi . Fix Z and ask for which subspace U this is satisfied. Let A = [ajj′ ] ∈ GLm (q) and U = P = (b1 : · · · : bm ) ∈ PG(m − 1, L) a point. The stability condition is

    TrL/Fq (xbj β^{(i+n)z}) = ∑_{j′=1}^{m} ajj′ TrL/Fq (xbj′ β^{iz})
Proposition 9.5.9 Let C be a q-linear q^m-ary length n code which is invariant under g(I, . . . , I, A, (0, 1, . . . , n − 1)), where gcd(n, p) = 1 and A of order u = q^m − 1 represents multiplication by a primitive element γ in F_{q^m}. Then C is F_{q^m}-linear.
Proof: It suffices to prove this for irreducible constacyclic codes contr(C(z, P)). The generic codeword w(x) has been described above. Applying the matrix A in each outer coordinate i = 0, . . . , n − 1 yields the codeword w(γ^z x), which is still in the code.
Proposition 9.5.9 applies in particular in the cases q = 2, m = 2, u = 3 and q = 2,
m = 3, u = 7.
Corollary 9.5.10 Each additive code over F4 , which is cyclic in the monomial sense, either
is linear over F4 or is equivalent to a cyclic code in the permutation sense.
Convolutional Codes

Julia Lieb
Universität Zürich

Raquel Pinto
Universidade de Aveiro

Joachim Rosenthal
Universität Zürich
10.1 Introduction
Convolutional codes were introduced by Peter Elias [674] in 1955. They can be seen as a generalization of block codes. In order to motivate this generalization, consider a k × n generator matrix G whose row space generates an [n, k] block code C. Denote by Fq the finite field with q elements. In case a sequence of message words mi ∈ Fq^k, i = 1, . . . , N, has to be encoded, one would transmit the sequence of codewords ci = mi G ∈ Fq^n, i = 1, . . . , N.
Arranging the message words as the coefficients of a polynomial vector m(z) = m1 z + · · · + mN z^N, the whole encoding process using the block code C would be compactly described as c(z) = m(z)G.
Instead of using the constant matrix G as an encoding map, Elias suggested using more
general polynomial matrices of the form G(z) whose entries consist of elements of the
polynomial ring Fq [z].
There are natural connections to automata theory and systems theory, and this was first
recognized by Massey and Sain in 1967 [1351]. These connections have always been fruitful
in the development of the theory on convolutional codes; the reader might also consult the
survey [1596].
In the 1970s Forney [740, 741, 742, 743] developed a mathematical theory which allowed the processing of an infinite set of message blocks having the form m(z) := ∑_{i=0}^{∞} mi z^i ∈ Fq[[z]]^k. Note that the quotient field of the ring of formal power series Fq[[z]] is the field of formal Laurent series Fq((z)), and in the theory of Forney, convolutional codes were defined as k-dimensional linear subspaces of the n-dimensional vector space Fq((z))^n that also possess a k × n polynomial generator matrix G(z) ∈ Fq[z]^{k×n}. The theory of convolutional codes as first developed by Forney can also be found in the monograph by Piret [1511] and in the textbook by Johannesson and Zigangirov [1051]; McEliece also provides a survey [1363].
In this chapter our starting point is message words of finite length, i.e., polynomial vectors of the form m(z) := ∑_{i=0}^{N} mi z^i ∈ Fq[z]^k which get processed by a polynomial matrix G(z) ∈ Fq[z]^{k×n}. The resulting code then becomes in a natural way a rank k module over the polynomial ring Fq[z]. The connection to discrete time linear systems by duality is then also natural, as was first shown by Rosenthal, Schumacher, and York [1597].
By Cramer’s rule and elementary properties of determinants, it follows that U (z) is uni-
modular if and only if det(U (z)) ∈ F∗q := Fq \ {0}.
Assume that G(z) and G̃(z) are both generator matrices of the same code C = rowspaceR (G(z)) = rowspaceR (G̃(z)). Then we can immediately show that there is a unimodular matrix U(z) such that

    G̃(z) = U(z)G(z).

Note that this induces an equivalence relation on the set of k × n generator matrices: G(z) and G̃(z) are equivalent if and only if G̃(z) = U(z)G(z) for some unimodular matrix U(z).
A canonical form for such an equivalence relation is the column Hermite form.
Definition 10.2.1 ([788, 1073]) Let G(z) ∈ R^{k×n} with k ≤ n. Then there exists a unimodular matrix U(z) ∈ R^{k×k} such that

    H(z) = U(z)G(z) =
        [ h11(z)  h12(z)  · · ·  h1k(z)  h1,k+1(z)  · · ·  h1n(z) ]
        [         h22(z)  · · ·  h2k(z)  h2,k+1(z)  · · ·  h2n(z) ]
        [                   ⋱      ⋮         ⋮                ⋮   ]
        [                        hkk(z)  hk,k+1(z)  · · ·  hkn(z) ]

where hii(z), i = 1, 2, . . . , k, are monic polynomials such that deg hii > deg hji for j < i. H(z) is the (unique) column Hermite form of G(z).
Other equivalence relations are induced by right multiplication with a unimodular ma-
trix or by right and left multiplication with unimodular matrices. Canonical forms of such
equivalence relations are the row Hermite form and the Smith form, respectively.
Definition 10.2.2 ([788, 1073]) Let G(z) ∈ R^{k×n} with k ≤ n. Then there exists a unimodular matrix U(z) ∈ R^{n×n} such that

    H(z) = G(z)U(z) =
        [ h11(z)                         0  · · ·  0 ]
        [ h21(z)  h22(z)                 ⋮          ⋮ ]
        [   ⋮               ⋱                        ]
        [ hk1(z)  hk2(z)  · · ·  hkk(z)  0  · · ·  0 ]

where hii(z), i = 1, 2, . . . , k, are monic polynomials such that deg hii > deg hij for j < i. H(z) is the (unique) row Hermite form of G(z).
Definition 10.2.3 ([788, 1073]) Let G(z) ∈ R^{k×n} with k ≤ n. Then there exist unimodular matrices U(z) ∈ R^{k×k} and V(z) ∈ R^{n×n} such that

    S(z) = U(z)G(z)V(z) =
        [ γ1(z)                        0  · · ·  0 ]
        [        γ2(z)                 ⋮          ⋮ ]
        [                 ⋱                        ]
        [                      γk(z)   0  · · ·  0 ]

where γi(z), i = 1, 2, . . . , k, are monic polynomials such that γi(z) | γi+1(z) for i = 1, 2, . . . , k − 1. These polynomials are uniquely determined by G(z) and are called the invariant polynomials of G(z). S(z) is the Smith form of G(z).
Since two equivalent generator matrices differ by left multiplication with a unimodular matrix, they have equal k × k (full size) minors, up to multiplication by a constant. The maximal degree of the full size minors of a generator matrix of a convolutional code C (called its internal degree) is called the degree (or complexity) of C, and it is usually denoted by δ. A convolutional code of rate k/n and degree δ is also called an (n, k, δ) convolutional code. Throughout this chapter L := ⌊δ/k⌋ + ⌊δ/(n − k)⌋.
For i = 1, . . . , k, the largest degree of any entry in row i of a matrix G(z) ∈ Rk×n is called
the ith row degree νi . It is obvious that if G(z) is a generator matrix and ν1 , ν2 , . . . , νk are
the row degrees of G(z), then δ ≤ ν1 + ν2 + · · · + νk . The sum of the row degrees of G(z) is
called its external degree. If the internal degree and the external degree coincide, G(z) is
said to be row reduced, and it is called a minimal generator matrix. Thus, the degree of
the code can be equivalently defined as the external degree of a minimal generator matrix
of C.
Lemma 10.2.4 ([741, 1073]) Let G(z) ∈ R^{k×n} with row degrees ν1, ν2, . . . , νk, and let [G]hr be the highest row degree coefficient matrix, defined as the matrix whose ith row consists of the coefficients of z^{νi} in the ith row of G(z). Then δ = ν1 + ν2 + · · · + νk, i.e., G(z) is row reduced, if and only if [G]hr has full row rank.
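Lemma 10.2.4 is easy to test mechanically. In the sketch below (conventions are mine) polynomials in F2[z] are encoded as integer bitmasks, with bit i holding the coefficient of z^i; the matrix used is the generator matrix of Example 10.2.7 below.

```python
def row_degree(row):
    # degree of the largest-degree entry of a nonzero row over F2[z]
    return max(p.bit_length() - 1 for p in row if p)

def highest_row_coeff_matrix(G):
    # [G]hr: i-th row = coefficients of z^{nu_i} in the i-th row of G(z)
    return [[(p >> row_degree(row)) & 1 for p in row] for row in G]

def f2_rank(M):
    # Gaussian elimination over F2
    M, r = [row[:] for row in M], 0
    for c in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if M[i][c]), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        for i in range(len(M)):
            if i != r and M[i][c]:
                M[i] = [a ^ b for a, b in zip(M[i], M[r])]
        r += 1
    return r

# G(z) = [[1, 1, z], [z^2, 1, z + 1]] over F2, row degrees (1, 2)
G = [[0b1, 0b1, 0b10], [0b100, 0b1, 0b11]]
```

Here [G]hr is [[0, 0, 1], [1, 0, 0]], which has full row rank 2, so G(z) is row reduced and the code degree is 1 + 2 = 3.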
Let G(z) be a row reduced generator matrix with row degrees ν1, ν2, . . . , νk and c(z) = u(z)G(z) where u(z) = [u1(z) u2(z) · · · uk(z)] ∈ R^k. Obviously, deg c(z) ≤ max_{i : ui(z) ≠ 0} {νi + deg ui(z)}. Let Λ = max_{i : ui(z) ≠ 0} {νi + deg ui(z)} and write ui(z) = αi z^{Λ−νi} + ri(z) with deg ri(z) < Λ − νi for i = 1, 2, . . . , k. Then

    c(z) = ( [α1 z^{Λ−ν1}  α2 z^{Λ−ν2}  · · ·  αk z^{Λ−νk}] + [r1(z)  r2(z)  · · ·  rk(z)] )
           × ( diag(z^{ν1}, z^{ν2}, . . . , z^{νk}) [G]hr + Grem(z) ),

where Grem(z) ∈ R^{k×n} has ith row degree smaller than νi for i = 1, 2, . . . , k. The coefficient of c(z) of degree Λ is given by cΛ = [α1 α2 · · · αk][G]hr, which is different from zero since αi ≠ 0 for some i ∈ {1, 2, . . . , k} and [G]hr has full row rank, i.e.,

    deg(u(z)G(z)) = max_{i : ui(z) ≠ 0} {νi + deg ui(z)}.    (10.1)
Equality (10.1) is called the predictable degree property, and it is an equivalent char-
acterization of the row reduced matrices [741, 1073].
Given a generator matrix G(z), there always exists a row reduced generator matrix
equivalent to G(z) [1073]. That is, all convolutional codes admit minimal generator matrices.
If G1 (z) and G2 (z) are two equivalent generator matrices, each row of G1 (z) belongs to the
image of G2 (z) and vice-versa. Then, if G1 (z) and G2 (z) are row reduced matrices, the
predictable degree property implies that G1 (z) and G2 (z) have the same row degrees, up
to row permutation.
Another important property of polynomial matrices is left (or right) primeness.
Definition 10.2.5 A polynomial matrix G(z) ∈ R^{k×n}, with k ≤ n, is left prime if in all factorizations G(z) = ∆(z)Ḡ(z), with ∆(z) ∈ R^{k×k} and Ḡ(z) ∈ R^{k×n}, the left factor ∆(z) is unimodular.
Left prime matrices admit several very useful characterizations. Some of these charac-
terizations are presented in the next theorem.
Theorem 10.2.6 ([1073]) Let G(z) ∈ R^{k×n} with k ≤ n. The following are equivalent.

(a) G(z) is left prime.

(b) The Smith form of G(z) is [Ik 0k×(n−k)].

(c) The row Hermite form of G(z) is [Ik 0k×(n−k)].

(d) G(z) admits a right n × k polynomial inverse.

(e) G(z) can be completed to a unimodular matrix, i.e., there exists L(z) ∈ R^{(n−k)×n} such that

    [ G(z) ]
    [ L(z) ]

is unimodular.
Example 10.2.7 Let us consider the binary field, i.e., q = 2. The convolutional code C of rate 2/3 with generator matrix

    G(z) = [ 1    1    z   ]
           [ z²   1   z+1  ]

is noncatastrophic, since G(z) is left prime by Theorem 10.2.6(d) as G(z) admits the right polynomial inverse

    [  0    0 ]
    [ z+1   z ]
    [  1    1 ].

The highest coefficient matrix of G(z), [G]hr, has full row rank and, consequently, G(z) is row reduced by Lemma 10.2.4. The degree of C is then equal to the sum of the row degrees of G(z), which is 3. Therefore C is a (3, 2, 3) binary convolutional code.
On the other hand, the convolutional code C̃ with generator matrix

    G̃(z) = [ 1+z+z²   z   1+z² ] = [ 1+z   1 ] G(z)
            [   z²     1    z+1 ]   [  0    1 ]

is catastrophic: in the factorization above the left factor has determinant 1 + z, which is not a unit, so G̃(z) is not left prime.
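The primeness claim of Example 10.2.7 reduces to one polynomial matrix product over F2[z], which can be checked directly (bitmask encoding of F2[z] is an assumed convention of mine):

```python
def pmul(a, b):
    # carry-less multiplication = product in F2[z] with bitmask coefficients
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r

G = [[0b1, 0b1, 0b10],        # [1    1    z  ]
     [0b100, 0b1, 0b11]]      # [z^2  1   z+1 ]
Ginv = [[0, 0],
        [0b11, 0b10],         # [z+1  z]
        [0b1, 0b1]]           # [ 1   1]

product = [[pmul(G[i][0], Ginv[0][j])
            ^ pmul(G[i][1], Ginv[1][j])
            ^ pmul(G[i][2], Ginv[2][j]) for j in range(2)] for i in range(2)]
```

The product equals the 2 × 2 identity, so G(z) has a right polynomial inverse and is left prime by Theorem 10.2.6(d).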
A full row rank matrix H(z) ∈ R^{(n−k)×n} such that C = kerR H(z) = {c(z) ∈ R^n | H(z)c(z)^T = 0^T} is called a parity check matrix of C, analogous to the block code case. It was shown that if a convolutional code is noncatastrophic, then it admits a parity check matrix. But the converse is also true.
Theorem 10.2.8 ([1932]) Let C be a convolutional code of rate k/n. Then there exists a full row rank polynomial matrix H(z) ∈ R^{(n−k)×n} such that C = kerR H(z) if and only if C is noncatastrophic.
Proof: Let us assume that C admits a parity check matrix H(z) ∈ R^{(n−k)×n} and let us write H(z) = X(z)H̃(z), where X(z) ∈ R^{(n−k)×(n−k)} is full rank and H̃(z) ∈ R^{(n−k)×n} is left prime. Then there exists a matrix L(z) ∈ R^{k×n} such that [L(z)^T H̃(z)^T] is unimodular, and therefore

    [ G(z) ] [ L(z)^T  H̃(z)^T ] = In,
    [ N(z) ]

for some left prime matrices N(z) ∈ R^{(n−k)×n} and G(z) ∈ R^{k×n}. Then H̃(z)G(z)^T = 0(n−k)×k and consequently also H(z)G(z)^T = 0(n−k)×k. It is clear that C = rowspaceR G(z), and therefore C is noncatastrophic.
Remark 10.2.9 If C̃ is catastrophic (i.e., its generator matrices are not left prime), we can still obtain a left prime matrix H(z) ∈ R^{(n−k)×n} such that C̃ ⊊ kerR H(z). If G̃(z) is a generator matrix of C̃, we can write G̃(z) = [∆(z) 0k×(n−k)]U(z) with ∆(z) ∈ R^{k×k}, where [∆(z) 0k×(n−k)] is the row Hermite form of G̃(z) and U(z) is a unimodular matrix. Then G̃(z) = ∆(z)U1(z) where U1(z) is the submatrix of U(z) consisting of its first k rows. This means, by Theorem 10.2.6(e), that U1(z) is left prime. The matrix H(z) is a parity check matrix of the convolutional code C = rowspaceR U1(z), and C̃ ⊊ C.
Example 10.2.10 Consider the convolutional codes C and C̃ from Example 10.2.7. The matrix H(z) = [1  1+z+z³  1+z²] is a parity check matrix of C. Since C̃ is catastrophic, it does not admit a parity check matrix. However, since C̃ ⊊ C, it follows that C̃ ⊊ kerR H(z).
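That H(z) = [1  1+z+z³  1+z²] annihilates the rows of G(z) from Example 10.2.7 is again one F2[z] computation (same bitmask convention as assumed before):

```python
def pmul(a, b):
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r

G = [[0b1, 0b1, 0b10], [0b100, 0b1, 0b11]]   # Example 10.2.7
H = [0b1, 0b1011, 0b101]                     # [1, 1+z+z^3, 1+z^2]

# H(z) * G(z)^T: one syndrome polynomial per row of G
syndromes = [pmul(H[0], row[0]) ^ pmul(H[1], row[1]) ^ pmul(H[2], row[2])
             for row in G]
```

Both syndromes vanish, so rowspaceR G(z) ⊆ kerR H(z).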
The dual of a noncatastrophic convolutional code is also noncatastrophic. The left prime parity check matrices of C are the generator matrices of C⊥ and vice-versa. The degree of a noncatastrophic code and of its dual are the same. This result is a consequence of the following lemma and Theorem 10.2.6.
Lemma 10.2.11 ([741]) Let H(z) ∈ R(n−k)×n and G(z) ∈ Rk×n be a left prime parity
check matrix and a generator matrix of a noncatastrophic convolutional code, respectively.
Given a full size minor of G(z) consisting of the columns i1 , i2 , . . . , ik , let us define the com-
plementary full size minor of H(z) as the minor consisting of the complementary columns,
i.e., by the columns {1, 2, . . . , n}\{i1 , i2 , . . . , ik }. Then the full size minors of G(z) are equal
to the complementary full size minors of H(z), up to multiplication by a nonzero constant.
Proof: For simplicity, let us consider i1 = 1, i2 = 2, . . . , ik = k, i.e., the full size minor of G(z), M1(G), consisting of the first k columns:

    M1(G) = det [ G(z)              ]
                [ 0(n−k)×k    In−k  ].

Multiplying on the right by the unimodular matrix [L(z)^T H(z)^T] of the proof of Theorem 10.2.8 yields

    M1(G) det([L(z)^T H(z)^T]) = det [ Ik      0k×(n−k) ]    (10.3)
                                     [ Q(z)    H̃(z)^T   ]

where Q(z) ∈ R^{(n−k)×k} and H̃(z) is the submatrix of H(z) consisting of its last n − k columns, i.e., M̄1(H) = det H̃(z) is the complementary minor to M1(G), and from (10.3) we conclude that M1(G) = αM̄1(H), where α = det([L(z)^T H(z)^T])^{−1} belongs to F∗q. Applying the same reasoning we conclude that all the full size minors of G(z) are equal to α times the complementary full size minors of H(z).
Therefore if G(z) is a generator matrix and H(z) is a parity check matrix of a noncatas-
trophic convolutional code C, they have the same maximal degree of the full size minors,
which means that C and C ⊥ have the same degree.
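Lemma 10.2.11 can be observed concretely on the pair G(z), H(z) of Examples 10.2.7 and 10.2.10: the 2 × 2 minors of G(z) equal the complementary 1 × 1 minors of H(z), here with constant α = 1. A sketch under the same assumed bitmask convention for F2[z]:

```python
def pmul(a, b):
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r

G = [[0b1, 0b1, 0b10], [0b100, 0b1, 0b11]]
H = [0b1, 0b1011, 0b101]

def minor(c1, c2):
    # det of the 2x2 submatrix of G(z) on columns c1, c2 (over F2, minus = plus)
    return pmul(G[0][c1], G[1][c2]) ^ pmul(G[0][c2], G[1][c1])
```

For instance, the minor on columns {2, 3} is the constant 1, matching the complementary entry H[0] = 1.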
Definition 10.2.12 The (Hamming) weight wtH(c) of c ∈ Fq^n is defined as the number of nonzero components of c, and the weight of a polynomial vector c(z) = ∑_{t=0}^{deg c(z)} ct z^t ∈ R^n is defined as wtH(c(z)) = ∑_{t=0}^{deg c(z)} wtH(ct). The (Hamming) distance between c1, c2 ∈ Fq^n is defined as dH(c1, c2) = wtH(c1 − c2); correspondingly, the distance between c1(z), c2(z) ∈ R^n is defined as dH(c1(z), c2(z)) = wtH(c1(z) − c2(z)).
During transmission of information over a q-ary symmetric channel, errors may occur; i.e., information symbols can be replaced by other symbols of Fq in a symmetric way. After channel transmission, a convolutional code C can detect up to s errors in any received word w(z) if dfree(C) ≥ s + 1 and can correct up to t errors in w(z) if dfree(C) ≥ 2t + 1, which gives the following theorem.

Theorem 10.2.14 Let C be a convolutional code with free distance d. Then C can always detect d − 1 errors and correct ⌊(d − 1)/2⌋ errors.
As convolutional codes are linear, the difference between two codewords is also a code-
word which gives the following equivalent definition of free distance.
Example 10.2.16 Consider the convolutional code C defined in Example 10.2.7. Since [1 1 z] is a codeword of C with weight 3, it follows that dfree(C) ≤ 3. Let w(z) ∈ C with w(z) ≠ 0. So w(z) = ∑_{i=ℓ0}^{ℓ1} wi z^i for some ℓ0, ℓ1 ∈ N0 such that wℓ0 ≠ 0 and wℓ1 ≠ 0. Then wℓ0 is a nonzero element of rowspaceFq G(0), and consequently wℓ0 has weight at least 2. Moreover, the predictable degree property (10.1) of G(z) implies that ℓ1 > ℓ0, and therefore w(z) must have weight at least 3. Hence, we conclude that C has free distance 3.
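The conclusion of Example 10.2.16 can be corroborated by brute force: enumerating all message pairs of degree at most 3 never produces a nonzero codeword of weight below 3 (this enumeration only bounds the minimum over a subset, but every codeword it produces has weight at least dfree, and [1 1 z] shows weight 3 is attained). Bitmask F2[z] encoding as assumed before:

```python
def pmul(a, b):
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r

def weight(p):
    # Hamming weight of a polynomial = number of nonzero coefficients
    return bin(p).count("1")

G = [[0b1, 0b1, 0b10], [0b100, 0b1, 0b11]]

min_weight = min(
    sum(weight(pmul(u1, G[0][j]) ^ pmul(u2, G[1][j])) for j in range(3))
    for u1 in range(16) for u2 in range(16) if u1 or u2)
```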
Besides the free distance, convolutional codes also possess another notion of distance,
the so-called column distances. These distances have an important role in transmission of
information over an erasure channel, which is a suitable model for many communication
channels, in particular packet switched networks such as the internet. In this kind of channel,
each symbol either arrives correctly or does not arrive at all, in which case it is called an erasure. In Section 10.5.1 decoding over this type of channel will be analyzed in detail.
Column distances have important characterizations in terms of the generator matrices
of the code, but also in terms of its parity check matrices if the code is noncatastrophic. For
this reason, we will consider throughout this section noncatastrophic convolutional codes.
    Gcj := [ G0  G1  · · ·  Gj   ]          Hcj := [ H0                  ]
           [     G0  · · ·  Gj−1 ]   and           [ H1   H0             ]
           [           ⋱     ⋮   ]                 [  ⋮          ⋱       ]
           [                 G0  ]                 [ Hj   Hj−1  · · · H0 ]

Then if c(z) = ∑_{i∈N0} ci z^i is a codeword of C, it follows that

    [c0 c1 · · · cj] = [u0 u1 · · · uj]Gcj    (10.4)

and Hcj [c0 c1 · · · cj]^T = 0^T. Note that since G(z) is left prime, G0 has full row rank, and therefore c0 ≠ 0 in (10.4) if and only if u0 ≠ 0. Consequently,

    dcj(C) = min_{u0 ≠ 0} wtH([u0 · · · uj]Gcj) = min { wtH(c[0,j](z)) | Hcj [c0 · · · cj]^T = 0^T, c0 ≠ 0 }.
In the following, we will provide criteria to check whether a convolutional code is MDP.
Theorem 10.2.25 ([809]) Let C be a convolutional code with generator matrix G(z) = ∑_{i=0}^{μ} Gi z^i ∈ R^{k×n} and with left prime parity check matrix H(z) = ∑_{i=0}^{ν} Hi z^i ∈ R^{(n−k)×n}. The following statements are equivalent.

(a) C is MDP.

(b) GL := [ G0  · · ·  GL ]
          [       ⋱    ⋮  ]
          [            G0 ],
with Gi = 0k×n for i > μ, has the property that every full size minor formed by columns with indices 1 ≤ j1 < · · · < j(L+1)k ≤ (L + 1)n which fulfill jsk ≤ sn for s = 1, . . . , L is nonzero.

(c) HL := [ H0            0 ]
          [  ⋮    ⋱         ]
          [ HL  · · ·   H0  ],
with Hi = 0(n−k)×n for i > ν, has the property that every full size minor formed by columns with indices 1 ≤ j1 < · · · < j(L+1)(n−k) ≤ (L + 1)n which fulfill js(n−k) ≤ sn for s = 1, . . . , L is nonzero.
The property of being an MDP convolutional code is invariant under duality as shown
in the following theorem.
Theorem 10.2.26 ([809]) An (n, k, δ) convolutional code is MDP if and only if its dual
code, which is an (n, n − k, δ) convolutional code, is MDP.
MDP convolutional codes are very efficient for decoding over the erasure channel. Next
we introduce two special classes of MDP convolutional codes, the reverse MDP convolutional
codes and the complete MDP convolutional codes, which are especially suited to deal with
particular patterns of erasures. Section 10.5.1 is devoted to decoding over this channel, and
the decoding capabilities of these codes will be analyzed in more detail.
Definition 10.2.27 ([1014]) Let C be an (n, k, δ) convolutional code with left prime row reduced generator matrix G(z) = [gij(z)]. Set ḡij(z) := z^{νi} gij(z^{−1}) where νi is the ith row degree of G(z). Then the code C̄ with generator matrix Ḡ(z) = [ḡij(z)] is also an (n, k, δ) convolutional code, called the reverse code to C. It follows that c0 + · · · + cd z^d ∈ C if and only if cd + · · · + c0 z^d ∈ C̄.
Remark 10.2.29 ([1810]) Let C be an (n, k, δ) MDP convolutional code such that (n − k) | δ, and let H(z) = H0 + · · · + Hν z^ν, with Hν ≠ 0, be a left prime and row reduced parity check matrix of C. Then the reverse code C̄ has parity check matrix H̄(z) = Hν + · · · + H0 z^ν. Moreover, C is reverse MDP if and only if every full size minor of the matrix

    H̄L := [ Hν  · · ·  Hν−L ]
           [       ⋱     ⋮   ]
           [  0         Hν   ]

formed from the columns with indices j1, . . . , j(L+1)(n−k) with js(n−k)+1 > sn for s = 1, . . . , L is nonzero.
Theorem 10.2.30 ([1810]) An (n, k, δ) reverse MDP convolutional code exists over a suf-
ficiently large base field.
Definition 10.2.31 Let C be an (n, k, δ) convolutional code with left prime parity check matrix H(z) = H0 + · · · + Hν z^ν. The matrix

    H := [ Hν  · · ·  H0            0  ]
         [       ⋱           ⋱        ]  ∈ Fq^{(L+1)(n−k)×(ν+L+1)n}
         [  0    Hν   · · ·      H0   ]

is called a partial parity check matrix of the code. Moreover, C is called a complete MDP convolutional code if every full size minor of H that is formed by columns j1, . . . , j(L+1)(n−k) with j(n−k)s+1 > sn and j(n−k)s ≤ sn + νn for s = 1, . . . , L is nonzero.
Remark 10.2.32
(a) [1810] Every complete MDP convolutional code is a reverse MDP convolutional code.
(b) [1255] A complete (n, k, δ) MDP convolutional code exists over a sufficiently large
base field if and only if (n − k) | δ.
Theorem 10.3.1 ([1064]) For n ≥ 2 and |Fq| ≥ n + 1, set sj := ⌈(j − 1)(|Fq| − 1)/n⌉ for j = 2, . . . , n and

    δ := (2/9)|Fq|  if n = 2,
         (1/3)|Fq|  if 3 ≤ n ≤ 5,
         (1/2)|Fq|  if n ≥ 6.

Moreover, let α be a primitive element of Fq, and set g1(x) := ∏_{k=1}^{δ} (x − α^k) and gj(x) := g1(xα^{−sj}) for j = 2, . . . , n. Then G(z) = [g1(z) · · · gn(z)] is the generator matrix of an (n, 1, δ) MDS convolutional code with free distance equal to n(δ + 1).
The second construction works for the same field size as the first one but contains a
restriction on the degree of the code, which is different from the restriction on the degree
in the first construction.
The construction of the preceding theorem can be explained in several steps.
Theorem 10.3.6 ([809]) For every b ∈ N the not trivially zero minors of the Toeplitz matrix

    [    1                      0   · · ·  0 ]
    [  C(b,1)       1                      ⋮ ]
    [    ⋮                ⋱         ⋱      0 ]
    [ C(b,b−1)  · · ·   C(b,1)         1     ] ∈ Z^{b×b}

(where C(b, i) denotes the binomial coefficient) are all positive. Hence for each b ∈ N there exists a smallest prime number pb such that this matrix is superregular over the prime field Fpb.
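The positivity claim of Theorem 10.3.6 can be spot-checked for small b. The sketch below (names are mine) builds the matrix, takes "not trivially zero" to mean the minors whose column indices satisfy cs ≤ rs — those not forced to vanish by lower triangularity — and evaluates them over Z:

```python
from math import comb
from itertools import combinations

def binomial_toeplitz(b):
    # lower triangular Toeplitz matrix with entries binom(b, i - j)
    return [[comb(b, i - j) if i >= j else 0 for j in range(b)] for i in range(b)]

def det(M):
    # integer determinant by cofactor expansion (fine for small sizes)
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def nontrivial_minors(b):
    T = binomial_toeplitz(b)
    for size in range(1, b + 1):
        for rows in combinations(range(b), size):
            for cols in combinations(range(b), size):
                if all(c <= r for r, c in zip(rows, cols)):
                    yield det([[T[r][c] for c in cols] for r in rows])
```

For b up to 4 every such minor is indeed positive.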
The second construction for MDP convolutional codes also requires large field sizes but
has the advantage that it works for arbitrary characteristic of the underlying field as well
as for arbitrary code parameters.
Theorem 10.3.7 ([39]) Let n, k, δ be given integers and m := max{n − k, k}. Let α be a primitive element of a finite field FpN, where p is prime and N is an integer, and define

    Ti := [ α^{2^{im}}         α^{2^{im+1}}     · · ·  α^{2^{(i+1)m−1}} ]
          [ α^{2^{im+1}}       α^{2^{im+2}}     · · ·  α^{2^{(i+1)m}}   ]
          [       ⋮                  ⋮                        ⋮         ]
          [ α^{2^{(i+1)m−1}}   α^{2^{(i+1)m}}   · · ·  α^{2^{(i+2)m−2}} ]

for i = 0, 1, . . . , L = ⌊δ/k⌋ + ⌊δ/(n − k)⌋. Define

    T(T0, . . . , TL) := [ T0            0 ]
                         [  ⋮    ⋱        ] ∈ FpN^{(L+1)m×(L+1)m}.
                         [ TL  · · ·   T0 ]
The following theorem provides a construction for (n, k, δ) MDP convolutional codes
with (n − k) | δ using the superregular matrices from the preceding theorem.
Theorem 10.3.8 ([39]) Let n, k, δ be integers such that (n − k) | δ with ν = δ/(n − k), and let Tl = [t^l_{ij}], i, j = 1, 2, . . . , m, for l = 0, 1, 2, . . . , L be the matrices of Theorem 10.3.7. Define H̄l = [t^l_{ij}] for i = 1, 2, . . . , n − k, j = 1, 2, . . . , k and l = 0, 1, 2, . . . , L. If q = pN for N ∈ N and |Fq| ≥ p^{2m(L+1)+n−2}, let C be the convolutional code C = kerR [A(z) B(z)] where A(z) = ∑_{i=0}^{ν} Ai z^i ∈ R^{(n−k)×(n−k)} and B(z) = ∑_{i=0}^{ν} Bi z^i ∈ R^{(n−k)×k} are determined as follows:

• A0 = In−k and Ai ∈ Fq^{(n−k)×(n−k)}, for i = 1, . . . , ν, are obtained by solving the equations

    [Aν · · · A1] [ H̄L−ν  · · ·  H̄1 ]
                  [   ⋮           ⋮  ] = −[H̄L · · · H̄ν+1].
                  [ H̄L−1  · · ·  H̄ν ]

• Bi = A0H̄i + A1H̄i−1 + · · · + AiH̄0 for i = 0, . . . , ν.
In [1414] this construction was generalized to arbitrary code parameters, where (n−k) | δ
is not necessary. In this way, the authors of [1414] obtained constructions of (n, k, δ) convo-
lutional codes that are both MDP and sMDS convolutional codes. Other constructions for
convolutional codes can be found in [809]. There are two general constructions of complete
MDP convolutional codes, similar to the constructions for MDP convolutional codes given
in Theorem 10.3.5 and Theorem 10.3.8, presented in the following two theorems.
Theorem 10.3.9 ([1255]) Let n, k, δ ∈ N such that k < n and (n − k) | δ, and let ν = δ/(n − k). Then H(z) = ∑_{i=0}^{ν} Hi z^i with

    H0 = [ C(νn+k, k)     · · ·  C(νn+k, 1)   1   0  · · ·  0 ]
         [      ⋮                                   ⋱       ⋮ ]
         [ C(νn+k, n−1)          · · ·        C(νn+k, 1)   1  ],

    Hi = [ C(νn+k, in+k)       · · ·  C(νn+k, (i−1)n+k+1) ]
         [      ⋮                           ⋮             ]    for i = 1, . . . , ν − 1, and
         [ C(νn+k, (i+1)n−1)   · · ·  C(νn+k, in)         ]

    Hν = [ 1          C(νn+k, 1)   · · ·         C(νn+k, n−1) ]
         [       ⋱                                     ⋮      ]
         [ 0  · · ·  0    1    C(νn+k, 1)  · · ·  C(νn+k, k)  ]

(where C(a, b) denotes the binomial coefficient) is the parity check matrix of an (n, k, δ) complete MDP convolutional code if the characteristic of the base field is greater than

    C(νn+k, ⌊(νn+k)/2⌋)^{(n−k)(L+1)} · ((n − k)(L + 1))^{(n−k)(L+1)/2},

where L = ⌊δ/k⌋ + ⌊δ/(n − k)⌋.
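The three displayed coefficient matrices appear to follow one pattern: reading the rows off, the (r, j) entry of Hi is the binomial coefficient C(νn+k, in+k+r−j) (this uniform reading is an inference of mine, consistent with all three displays via the symmetry C(N, a) = C(N, N−a)). A sketch generating them under that assumption:

```python
from math import comb

def partial_check_coeffs(n, k, delta):
    # assumes (n - k) | delta; entry pattern C(nu*n+k, i*n+k+r-j) is an assumption
    nu = delta // (n - k)
    N = nu * n + k
    def c(a):
        return comb(N, a) if 0 <= a <= N else 0
    return [[[c(i * n + k + r - j) for j in range(1, n + 1)]
             for r in range(1, n - k + 1)] for i in range(nu + 1)]

# smallest interesting case: (n, k, delta) = (3, 1, 2), so nu = 1 and N = 4
H = partial_check_coeffs(3, 1, 2)
```

Here H[0] = [[4, 1, 0], [6, 4, 1]] shows the stated shape of H0 (first row ending in zeros), and H[1] = [[1, 4, 6], [0, 1, 4]] shows the stated shape of Hν.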
Convolutional codes can be described by a discrete-time linear system over Fq of the form

    xt+1 = xt A + ut B
    yt = xt C + ut D          (10.5)
    ct = [ut yt],

where (A, B, C, D) ∈ Fq^{s×s} × Fq^{k×s} × Fq^{s×(n−k)} × Fq^{k×(n−k)}, t ∈ N0 and s, k, n ∈ N with n > k. The system (10.5) will be represented by Σ = (A, B, C, D), and the integer s is called its dimension. We call xt ∈ Fq^s the state vector, ut ∈ Fq^k the information vector, yt ∈ Fq^{n−k} the parity vector and ct ∈ Fq^n the code vector. We consider that the initial state is the zero vector, i.e., x0 = 0.
The input, state, and output sequences (trajectories), {ut}t∈N0, {xt}t∈N0, and {yt}t∈N0, respectively, can be represented as formal power series:

    u(z) = ∑_{t∈N0} ut z^t ∈ Fq[[z]]^k,
    x(z) = ∑_{t∈N0} xt z^t ∈ Fq[[z]]^s, and
    y(z) = ∑_{t∈N0} yt z^t ∈ Fq[[z]]^{n−k}.
A trajectory {xt, ut, yt}t∈N0 satisfies the first two equations of (10.5) if and only if

    [x(z) u(z) y(z)] E(z) = 0_{1×(n−k+s)},   where   E(z) := [ Is − Az     −C   ]
                                                             [  −Bz        −D   ]
                                                             [ 0(n−k)×s   In−k  ].
In order to obtain the codewords of a convolutional code by means of the system (10.5) we must only consider the polynomial input-output trajectories c(z) = [u(z) y(z)] of the system. Moreover, we discard the input-output trajectories c(z) whose corresponding state trajectory x(z) has infinite weight, since this would make the system remain indefinitely excited. Thus, we restrict to polynomial input-output trajectories whose corresponding state trajectory is also polynomial. We will call these input-output trajectories finite-weight input-output trajectories. The set of finite-weight input-output trajectories of the system (10.5) forms a submodule of R^n of rank k, and thus it is a convolutional code of rate k/n, denoted by C(A, B, C, D). The system Σ = (A, B, C, D) is said to be an input-state-output (ISO) representation of the code.
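Running the updating equations of (10.5) is straightforward. The sketch below simulates the recursion over F2 for a tiny assumed system Σ = (A, B, C, D) with s = k = 1 and n = 2; the matrices are illustrative choices of mine, not taken from the text.

```python
def vecmat(v, M):
    # row vector times matrix over F2
    return [sum(v[i] & M[i][j] for i in range(len(v))) % 2 for j in range(len(M[0]))]

def vecadd(u, v):
    return [a ^ b for a, b in zip(u, v)]

def encode(A, B, C, D, inputs):
    # x_0 = 0; emits the code vectors c_t = (u_t, y_t)
    x, out = [0] * len(A), []
    for u in inputs:
        y = vecadd(vecmat(x, C), vecmat(u, D))
        out.append(u + y)
        x = vecadd(vecmat(x, A), vecmat(u, B))
    return out

A, B, C, D = [[1]], [[1]], [[1]], [[1]]   # hypothetical 1-dimensional system
codeword = encode(A, B, C, D, [[1], [0], [0]])
```

A single input 1 at t = 0 excites the state, which then keeps feeding the parity stream: codeword == [[1, 1], [0, 1], [0, 1]].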
Since C(A, B, C, D) is a submodule of rank k, there exists a matrix G(z) ∈ Rk×n such
that C(A, B, C, D) = rowspaceR G(z). In fact, if X(z) ∈ Rk×s and G(z) ∈ Rk×n are such
that [X(z) G(z)]E(z) = 0k×(n−k+s) and [X(z) G(z)] is left prime, then G(z) is a generator
matrix for C.
Conversely, for each convolutional code C of rate k/n, there exists (A, B, C, D) ∈ Fq^{s×s} × Fq^{k×s} × Fq^{s×(n−k)} × Fq^{k×(n−k)} such that C = C(A, B, C, D), if one allows permutation of the coordinates of the codewords.
The following lemma gives equivalent conditions for reachability and observability, and
it is called the Popov, Belevitch and Hautus (PBH) criterion.
Lemma 10.4.2 ([1073]) Let A ∈ Fq^{s×s}, B ∈ Fq^{k×s}, and C ∈ Fq^{s×(n−k)}.

(a) (A, B) is reachable if and only if

    [ z^{−1}Is − A ]
    [      B       ]

is right prime in the indeterminate z^{−1}.

(b) (A, C) is observable if and only if [z^{−1}Is − A   C] is left prime in the indeterminate z^{−1}.
T
If Σ = (A, B, C, D) is not reachable, then rank Φ(A, B) = δ < s. Let T = [ T1 ; T2 ] ∈ Fq^{s×s} be an invertible matrix such that the rows of T1 form a basis of rowspaceFq Φ(A, B). Then B = [B1 0k×(s−δ)] T, for some B1 ∈ Fq^{k×δ}. Moreover, since rowspaceFq Φ(A, B) is A-invariant (by the Cayley–Hamilton Theorem), it follows that the rows of T1A belong to rowspaceFq Φ(A, B) and therefore

    TA = [ A1  0δ×(s−δ) ] T,
         [ A2     A3    ]

for some A1 ∈ Fq^{δ×δ}, A2 ∈ Fq^{(s−δ)×δ}, and A3 ∈ Fq^{(s−δ)×(s−δ)}. Then

    S^{−1}AS = [ A1  0δ×(s−δ) ],   BS = [B1  0k×(s−δ)],   S^{−1}C = [ C1 ]    (10.6)
               [ A2     A3    ]                                     [ C2 ]

where C1 ∈ Fq^{δ×(n−k)} and S = T^{−1}. Moreover,

    Φ(A, B)S = [ B1            0k×(s−δ) ]
               [ B1A1          0k×(s−δ) ]
               [   ⋮               ⋮     ]
               [ B1A1^{s−1}    0k×(s−δ) ]

has rank δ. It follows from the Cayley–Hamilton Theorem that Φ(A1, B1) also has rank δ, i.e., the system Σ1 = (A1, B1, C1, D) is reachable. The partitioning (10.6) is called the Kalman (controllable) canonical form [1073].
The system Σ̃ = (S^{−1}AS, BS, S^{−1}C, D) has updating equations

    x^{(1)}_{t+1} = x^{(1)}_t A1 + x^{(2)}_t A2 + ut B1
    x^{(2)}_{t+1} = x^{(2)}_t A3
    yt = x^{(1)}_t C1 + x^{(2)}_t C2 + ut D.
The reachability together with the observability of the system influence the properties of the corresponding code, as the next theorem shows: choosing C ∈ Fq^{δ×(n−1)} with columns c1, c2, . . . , cn−1 such that det(sIδ − (A − ciB)) = ∏_{k=1}^{δ} (s − α^{ri+k}), where the ri are chosen such that {α^{ri+1}, α^{ri+2}, . . . , α^{ri+δ}} ∩ {α^{rj+1}, α^{rj+2}, . . . , α^{rj+δ}} = ∅ for i ≠ j, yields an MDS convolutional code C(A, B, C, D).
ISO representations could also be helpful for the construction of MDP convolutional
codes, using the following criterion for being MDP.
Theorem 10.4.6 ([1015, Corollary 1.1]) The matrices (A, B, C, D) generate an MDP convolutional code if and only if the matrix

    FL := [ D    BC   BAC  · · ·  BA^{L−1}C ]
          [ 0     D    BC          ⋮        ]
          [ ⋮          ⋱     ⋱     BC       ]
          [ 0   · · ·  0           D        ]

with L = ⌊δ/k⌋ + ⌊δ/(n − k)⌋ has the property that every minor which is not trivially zero is nonzero.
The next theorem establishes the number of erasures that can be corrected by a block
code.
Theorem 10.5.2 Let C be a block code with minimum distance d. Then C can correct at
most d − 1 erasures.
Let C be a block code of rate k/n and minimum distance d with parity check matrix H0 ∈ Fq^{(n−k)×n}. Let c ∈ Fq^n be a received codeword of C after transmission over an erasure channel with at most d − 1 erasures. Then H0 c^T = 0^T. Denote by c^{(e)} the vector consisting of the components of c that are erased during transmission and by c^{(r)} the vector consisting of the components of c that are received (correctly). Moreover, denote by H0^{(e)} the matrix consisting of the columns of H0 whose indices correspond to the indices of the erased components of c and by H0^{(r)} the matrix consisting of the other columns of H0. Then the equation H0 c^T = 0^T is equivalent to the system of linear equations H0^{(e)} (c^{(e)})^T = −H0^{(r)} (c^{(r)})^T with the erased components as unknowns. Since H0^{(e)} has full column rank, the system H0^{(e)} (c^{(e)})^T = −H0^{(r)} (c^{(r)})^T has a unique solution and the erasures are recovered.
Clearly, the decoding is optimal if as many of these linear equations as possible are linearly independent for as many erasure patterns as possible. This is the case if all full size minors of $H_0$ are nonzero, which is true if and only if C is MDS; thus the maximal number of erasures that a block code of rate $k/n$ can correct is $n - k$.
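The recovery procedure just described amounts to one linear solve. Below is a minimal sketch over $\mathbb{F}_2$ (where the minus sign on the right-hand side disappears), using the parity check matrix of the [7, 4, 3] Hamming code as a stand-in; the function names are our own, not from the text:

```python
def solve_gf2(A, b):
    """Solve A x = b over GF(2) by Gaussian elimination.
    A: list of rows (lists of 0/1), b: list of 0/1.
    Assumes A has full column rank, so the solution is unique."""
    rows, cols = len(A), len(A[0])
    M = [A[i][:] + [b[i]] for i in range(rows)]   # augmented matrix
    pivot_row, pivots = 0, []
    for c in range(cols):
        r = next((i for i in range(pivot_row, rows) if M[i][c]), None)
        if r is None:
            continue
        M[pivot_row], M[r] = M[r], M[pivot_row]
        for i in range(rows):
            if i != pivot_row and M[i][c]:
                M[i] = [x ^ y for x, y in zip(M[i], M[pivot_row])]
        pivots.append(c)
        pivot_row += 1
    x = [0] * cols
    for r, c in enumerate(pivots):
        x[c] = M[r][cols]
    return x

def recover_erasures(H, received):
    """received: list with 0/1 for known symbols and None for erasures."""
    erased = [j for j, s in enumerate(received) if s is None]
    He = [[row[j] for j in erased] for row in H]  # columns at erased positions
    # right-hand side H^(r) c^(r); over GF(2) the minus sign vanishes
    rhs = [sum(row[j] * received[j] for j in range(len(received))
               if received[j] is not None) % 2 for row in H]
    vals = solve_gf2(He, rhs)
    c = received[:]
    for j, v in zip(erased, vals):
        c[j] = v
    return c

# [7,4,3] Hamming code: minimum distance 3, so 2 erasures are correctable
H = [[1, 0, 1, 0, 1, 0, 1],
     [0, 1, 1, 0, 0, 1, 1],
     [0, 0, 0, 1, 1, 1, 1]]
codeword = [1, 1, 0, 0, 1, 1, 0]   # satisfies H c^T = 0
rx = codeword[:]
rx[0] = rx[4] = None               # two erasures
print(recover_erasures(H, rx))     # -> [1, 1, 0, 0, 1, 1, 0]
```

With d = 3, any two erased coordinates pick out two distinct nonzero columns of H, which are linearly independent, so the system has the unique solution that restores the codeword.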
(1)
one has Hj [ct−ν · · · ct+j ]T = 0T , where ci = 0 for i ∈
/ {0, . . . , deg(c)}. Denote by Hj the
(1)
matrix consisting of the first νn columns of Hj . Then Hj = [Hj Hjc ],
and consequently,
to recover the erasures in ct , . . . , ct+j , one has to consider the window [ct−ν · · · ct+j ] and
solve the linear system
(e) (e) (e) (r) (r) (r) (1)
Hj [ct · · · ct+j ]T = −Hj [ct · · · ct+j ]T − Hj [ct−ν · · · ct−1 ]T ,
(e) (r)
where ci and ci denote the erased and correctly received components of ci , respectively,
(e) (r)
and Hj and Hj denote the corresponding columns of Hjc . The erasures are recovered if
(e)
and only if the system has a unique solution, i.e., if and only if Hj has full column rank.
Hence, bearing in mind Theorem 10.2.19, one obtains the following theorem, which
relates the capability to correct erasures by forward decoding (i.e., decoding from left to
right) with the column distances of the convolutional code.
Theorem 10.5.3 ([1810]) Let C be an (n, k, δ) convolutional code. If in any sliding window
of length (j + 1)n at most dcj (C) − 1 erasures occur, then full error correction from left to
right is possible.
Clearly, the best situation is if one has an MDP convolutional code, i.e., dcj (C) − 1 =
(j + 1)(n − k) for j = 0, . . . , L. Define the recovery rate as the ratio of the number of
erasures and the total number of symbols in a window. In [1810], it has been shown that the
recovery rate $\frac{n-k}{n}$ of an MDP convolutional code is the maximal recovery rate one could get over an erasure channel.
Reverse MDP convolutional codes have the MDP property forward and backward and
hence also the erasure correcting capability described in Theorem 10.5.3 from left to right
and from right to left, and therefore they can recover even more situations than MDP
convolutional codes [1810]. There are bursts of erasures that cannot be forward decoded
but could be skipped and afterwards be decoded from right to left (backward); see [1810]
for examples.
If patterns of erasures occur that do not fulfill the conditions of Theorem 10.5.3, neither
forward nor backward, one has a block that could not be recovered and gets lost in the
recovering process. In order to continue recovering, one needs to find a block of νn correct
symbols, a so-called guard space, preceding a block that fulfills the conditions of Theorem
10.5.3. Complete MDP convolutional codes have the additional advantage that to compute
a guard space, it is not necessary to have a large sequence of correct symbols [1810]. Instead
it suffices to have a window with a certain percentage of correct symbols as the following
theorem states.
Theorem 10.5.4 ([1810]) Let C be an $(n, k, \delta)$ complete MDP convolutional code and $L = \left\lfloor\frac{\delta}{k}\right\rfloor + \left\lfloor\frac{\delta}{n-k}\right\rfloor$. If in a window of size $(L + \nu + 1)n$ there are not more than $(L+1)(n-k)$ erasures, and if they are distributed in such a way that between positions 1 and $sn$ and between positions $(L+\nu+1)n$ and $(L+\nu+1)n - sn + 1$, for $s = 1, \dots, L+1$, there are not more than $s(n-k)$ erasures, then full correction of all symbols in this interval will be possible. In particular a new guard space can be computed.
The way of decoding over the erasure channel described here could be used for any
convolutional code. However, to correct as many erasures as possible it is optimal to use
MDP or even complete MDP convolutional codes. For algorithms to do that we refer to
[1810].
MDP convolutional codes are able to recover patterns of erasures that MDS block codes,
with the same recovery rate, cannot recover, as illustrated in the next example.
Example 10.5.6 ([1810]) Consider the received sequence $w = (w_0, w_1, \dots, w_{100})$ with $w_i \in \mathbb{F}_q^2$ and with 120 erasures in $w_i$, $i \in \{0, 1, \dots, 29\} \cup \{70, 71, \dots, 99\}$. Let us assume that w is a received word of an MDS block code of rate 101/202. Such a code can correct up to 101 symbols in a sequence of 202 symbols, i.e., it has a recovery rate of 50%. Thus, since w has 120 erasures it cannot be recovered.
Let us assume now that $w(z) = \sum_{i=0}^{100} w_i z^i$ is a received word of a (2, 1, 50) MDP convolutional code C. C has the same recovery rate, but it is able to correct the erasures in
w(z). Note that dcj (C) − 1 = j + 1, for j = 0, 1, . . . , L with L = 100. To recover the whole
sequence one can consider a window with the first 120 symbols (i.e., the sequence consisting
of w0 , w1 , . . . , w59 ). Since dc59 (C) − 1 = 60, the first 60 symbols can be recovered by applying
the decoding algorithm described in this section. Afterwards, take another window with
the symbols w41 , w42 , . . . , w100 . Similarly, this window has 120 symbols and 60 erasures,
which means that these erasures can also be recovered, and the whole corrected sequence is
obtained.
Assume that a convolutional code C and a received word $c^r(z) = \sum_{i\in\mathbb{N}_0} c_i^r z^i$, which should be decoded, are given. Take a minimal ISO representation $\Sigma = (A, B, C, D)$ of dimension $\delta$ of C and set $x_0 = 0$.
Step 1: Set t = 0, dmin = ∞ and spmin = ∅, and assign to the initial node the label
(d = 0, sp = ∅). Go to Step 2.
Step 2: For each node x2 at time instant t + 1 do the following: For each of the predecessors
x1 at time instant t with label (d, sp) and d < dmin , compute the sum d+d((u, y), ct ),
where (u, y) is the label of the arc (x1 , x2 ) in the state-transition diagram, determine
the minimum of these sums d, and assign to x2 the label (d, sp), where sp is the
shortest path from x = 0, at time instant zero, to x2 . If there are several sums with
the minimal value, then there are several paths from x = 0, at time instant zero, to
x2 with distance d; in this case select randomly one of these paths. If x2 = 0 and
d < dmin , then set dmin = d and spmin = sp. Go to Step 3.
Step 3: If at time instant t + 1, all the nodes x with label (d, sp) are such that d ≥ dmin ,
then STOP and the result of the decoding is the codeword corresponding to the
path spmin . Otherwise set t = t + 1 and go to Step 2.
The complexity of this algorithm grows with the number of states, at each time instant,
in the trellis diagram. The set of states is X = Fδq , and therefore it has q δ elements. This
algorithm is practical only for codes with small degree and defined in fields of very small
size.
Instead of using the linear systems representation, the Viterbi Decoding Algorithm could
also be operated using the trellis of the convolutional code; see, e.g., [533].
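The label propagation of Steps 1–3 is dynamic programming over the state-transition diagram. The sketch below is a simplified fixed-length variant: it processes the whole received sequence and then reads off the best path returning to the zero state, without the early-stopping test of Step 3. The encoder, a memory-1 rate-1/2 binary one, is our own toy example, not taken from the text:

```python
def hamming(a, b):
    """Hamming distance between two output symbols (tuples of bits)."""
    return sum(x != y for x, y in zip(a, b))

def viterbi_decode(transitions, received, start=0):
    """transitions: state -> list of (input, output, next_state).
    Returns (distance, inputs) of a minimum-distance trellis path that
    starts in `start` and ends in `start` after len(received) steps."""
    labels = {start: (0, [])}   # state -> (accumulated distance, best path)
    for r in received:
        new_labels = {}
        for s, (d, path) in labels.items():
            for u, out, s2 in transitions[s]:
                cand = (d + hamming(out, r), path + [u])
                if s2 not in new_labels or cand[0] < new_labels[s2][0]:
                    new_labels[s2] = cand
        labels = new_labels
    return labels[start]

# memory-1, rate-1/2 binary encoder: output (u, u XOR state), next state = u
trans = {s: [(u, (u, u ^ s), u) for u in (0, 1)] for s in (0, 1)}

msg = [1, 0, 1, 1, 0]                # final 0 returns the encoder to state 0
sent, s = [], 0
for u in msg:
    sent.append((u, u ^ s))
    s = u
rx = sent[:]
rx[1] = (rx[1][0] ^ 1, rx[1][1])     # flip one bit during 'transmission'
dist, decoded = viterbi_decode(trans, rx)
print(dist, decoded)                 # -> 1 [1, 0, 1, 1, 0]
```

The single flipped bit is corrected because this toy encoder has free distance 3. Since the state space is $\mathbb{F}_q^\delta$, the per-step work grows like $q^\delta$, which matches the complexity remark above.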
or equivalently, if $\det(U(z_1, z_2)) \in \mathbb{F}_q^*$. Using the same reasoning as in Section 10.2.1 for one-dimensional (1D) convolutional codes, the matrices $G(z_1, z_2)$ and $\widetilde{G}(z_1, z_2)$ in $R^{k\times n}$ are said to be equivalent if they are generator matrices of the same code, which happens if and only if
$$\widetilde{G}(z_1, z_2) = U(z_1, z_2)\, G(z_1, z_2).$$
Example 10.6.1 For any finite field, the 2D convolutional code with generator matrix
$$G(z_1, z_2) = \begin{bmatrix} 1 & z_1 & 0 \\ 1 & z_2 & 1 \end{bmatrix}$$
Another important property of a 1D convolutional code is the existence (or not) of prime
generator matrices. When we consider polynomial matrices in two indeterminates, there are
two different notions of primeness: factor-primeness and zero-primeness [1177, 1234, 1402,
1592, 1933].
The following lemmas give characterizations of left factor-primeness and left zero-
primeness. Compare them to Theorem 10.2.6.
Lemma 10.6.4 Let G(z1 , z2 ) ∈ Rk×n , with n ≥ k. Then the following are equivalent.
(a) $G(z_1, z_2)$ is $\ell$FP.
with $(A_1, A_2, B_1, B_2, C, D) \in \mathbb{F}_q^{s\times s}\times\mathbb{F}_q^{s\times s}\times\mathbb{F}_q^{k\times s}\times\mathbb{F}_q^{k\times s}\times\mathbb{F}_q^{s\times(n-k)}\times\mathbb{F}_q^{k\times(n-k)}$, $i, j \in \mathbb{N}_0$ and $s, k, n \in \mathbb{N}$ with $n > k$. The system (10.7) will be represented by $\Sigma = (A_1, A_2, B_1, B_2, C, D)$, and the integer $s$ is called its dimension. We call $x_{i,j} \in \mathbb{F}_q^s$ the local state vector, $u_{i,j} \in \mathbb{F}_q^k$ the information vector, $y_{i,j} \in \mathbb{F}_q^{n-k}$ the parity vector, and $c_{i,j} \in \mathbb{F}_q^n$ the code vector. Moreover, the input and the local state have past finite support, i.e., $u_{i,j} = 0$ and $x_{i,j} = 0$ for $i < 0$ or $j < 0$, and we consider zero initial conditions, i.e., $x_{0,0} = 0$.
The input, the local state, and the output 2D sequences (trajectories) of the system,
{ui,j }(i,j)∈N20 , {xi,j }(i,j)∈N20 , {yi,j }(i,j)∈N20 , respectively, can be represented as formal power
series in the indeterminates z1 , z2 :
$$u(z_1, z_2) = \sum_{(i,j)\in\mathbb{N}_0^2} u_{i,j}\, z_1^i z_2^j \in \mathbb{F}_q[[z_1, z_2]]^k,$$
$$x(z_1, z_2) = \sum_{(i,j)\in\mathbb{N}_0^2} x_{i,j}\, z_1^i z_2^j \in \mathbb{F}_q[[z_1, z_2]]^s, \quad\text{and}$$
$$y(z_1, z_2) = \sum_{(i,j)\in\mathbb{N}_0^2} y_{i,j}\, z_1^i z_2^j \in \mathbb{F}_q[[z_1, z_2]]^{n-k}.$$
consists of the (x(z1 , z2 ), u(z1 , z2 ), y(z1 , z2 )) trajectories of the system, and we say that an
input-output trajectory has corresponding state trajectory x(z1 , z2 ) if
$$[\,x(z_1, z_2) \;\; u(z_1, z_2) \;\; y(z_1, z_2)\,]\, E(z_1, z_2) = 0_{1\times(n-k+s)}.$$
For the same reasons stated for 1D convolutional codes in Section 10.4, we only consider
the finite-weight input-output trajectories of the system (10.7) to obtain a 2D convolutional
code, i.e, the polynomial trajectories (u(z1 , z2 ), y(z1 , z2 )) with corresponding state trajectory
x(z1 , z2 ) also polynomial.
Theorem 10.6.8 ([1412]) The set of finite-weight input-output trajectories of the system
(10.7) is a 2D convolutional code of rate k/n.
The 2D convolutional code whose codewords are the finite-weight input-output tra-
jectories of the system (10.7) is denoted by C(A1 , A2 , B1 , B2 , C, D). The system Σ =
$$R_k = \begin{bmatrix} B_1A_1^{k-1} \\ B_1(A_1^{\,k-2}\Delta^{1} A_2) + B_2(A_1^{\,k-1}\Delta^{0} A_2) \\ B_1(A_1^{\,k-3}\Delta^{2} A_2) + B_2(A_1^{\,k-2}\Delta^{1} A_2) \\ \vdots \\ B_1(A_1^{\,0}\Delta^{k-1} A_2) + B_2(A_1^{\,1}\Delta^{k-2} A_2) \\ B_2A_2^{k-1} \end{bmatrix}$$
is left factor-prime.
There also exists a notion of local observability that will not be considered here. There
are 2D systems that are locally reachable (observable) but not modally reachable (observ-
able) and vice-versa [739]. The next lemma shows the influence of these properties on the
corresponding convolutional code.
sequence spaces such as $A^{\mathbb{Z}}$, $A^{\mathbb{N}}$, or $A^{\mathbb{N}_0}$. In order to be consistent with the rest of the chapter we will work in the sequel with $A^{\mathbb{N}_0}$, i.e., with the nonnegative time axis $\mathbb{N}_0$.
Let $k$ be a natural number. A block over the alphabet $A$ is defined as a finite string $\beta = x_1x_2\cdots x_k$ consisting of the $k$ elements $x_i \in A$, $i = 1, \dots, k$. If $w(z) = \sum_i w_i z^i \in A[[z]]$ is a sequence, one says that the block $\beta$ occurs in $w$ if there is some integer $j$ such that $\beta = w_jw_{j+1}\cdots w_{k+j-1}$. If $X \subseteq A[[z]]$ is any subset, we denote by $\mathcal{B}(X)$ the set of blocks which occur in some element of $X$.
As we will explain in this section one can view observable convolutional codes as the dual
of linear, compact, irreducible and shift-invariant subsets of Fnq [[z]]. In order to establish
this result we will have to explain the basic definitions from symbolic dynamics.
For this consider a set F of blocks. It is possible that this set is infinite.
Definition 10.7.1 The subset X ⊂ A[[z]] consisting of all sequences w(z) which do not
contain any of the (forbidden) blocks of F is called a shift space.
Let In be the n × n identity matrix acting on A = Fnq . The shift map σ extends to the shift
map
σIn : A[[z]] −→ A[[z]].
One says that X ⊂ A[[z]] is a shift-invariant set if (σIn )(X) ⊂ X.
Shift spaces can be characterized in a topological manner.
Definition 10.7.2 Let $v(z) = \sum_i v_i z^i$ and $w(z) = \sum_i w_i z^i$ be elements of $A[[z]]$. Let $d_H(x, y)$ be the Hamming distance between elements in $A$. One defines the distance between the sequences $v(z)$ and $w(z)$ as
$$d(v(z), w(z)) := \sum_{i\in\mathbb{N}_0} 2^{-i}\, d_H(v_i, w_i).$$
Note that in this metric two elements v(z), w(z) are ‘close’ if they coincide in a large
number of elements in the beginning of the sequence. The function d(·, ·) defines a metric
on the sequence space AN0 . The metric introduced in Definition 10.7.2 is equivalent to the
metric described in [1264, Example 6.1.10]. The following result can be found in [1264,
Theorem 6.1.21].
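On finite prefixes the metric of Definition 10.7.2 can be evaluated exactly. A small sketch, with toy sequences of our own choosing and exact rational arithmetic via `fractions`:

```python
from fractions import Fraction

def d_H(a, b):
    """Hamming distance between two symbols of A = F_q^n (as tuples)."""
    return sum(x != y for x, y in zip(a, b))

def d(v, w):
    """Evaluate d(v, w) = sum_i 2^(-i) d_H(v_i, w_i) on finite prefixes."""
    return sum(Fraction(d_H(vi, wi), 2 ** i)
               for i, (vi, wi) in enumerate(zip(v, w)))

# two sequences over A = F_2^2 agreeing in coefficients 0..2, differing at i = 3
v = [(0, 0), (1, 1), (0, 1), (1, 0)]
w = [(0, 0), (1, 1), (0, 1), (0, 0)]
print(d(v, w))  # -> 1/8
```

The later the first disagreement, the smaller the distance, which is exactly the "close if they coincide at the beginning of the sequence" behavior described above.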
Theorem 10.7.3 A subset of A[[z]] is a shift space if and only if it is shift-invariant and
compact.
Definition 10.7.4 A shift space X ⊂ A[[z]] is called irreducible if for every ordered pair
of blocks β, γ of B(X), there is a block µ such that the concatenated block βµγ is in B(X).
Of particular interest are shift spaces which have a kernel representation. For this let
P (z) be an r × n matrix having entries in the polynomial ring Fq [z]. Define the set
Subsets B ⊆ A[[z]] having the particular form (10.8) can be characterized in a purely
topological manner.
Theorem 10.7.5 A subset B ⊆ A[[z]] has a kernel representation of the form (10.8) if and
only if B is a linear, irreducible, compact and shift invariant subset of A[[z]].
Subsets of the form (10.8) also appear prominently in the behavioral theory of linear
systems championed by Jan Willems. The following theorem was proven by Willems [1892,
Theorem 5]. Note that a metric space is called complete if every Cauchy sequence converges
in this space.
Theorem 10.7.6 A subset B ⊂ Fnq [[z]] is linear, time-invariant, and complete if and only
if B has a representation of the form (10.8).
There is a small difference in the above notions as a subset B ⊂ A[[z]] which is linear,
irreducible, compact, and shift invariant is automatically also a ‘controllable behavior’ in
the sense of Willems.
We are now in a position to connect to convolutional codes (polynomial modules) using Pontryagin duality. For this consider the bilinear form
$$(p(z), w(z)) := \sum_{i\in\mathbb{N}_0} \langle p_i, w_i\rangle \quad\text{for } p(z)\in\mathbb{F}_q^n[z],\ w(z)\in A[[z]],$$
where $\langle\cdot,\cdot\rangle$ represents the standard dot product on $A = \mathbb{F}_q^n$. As the sum has only finitely many nonzero terms, the bilinear form $(\cdot,\cdot)$ is well defined and nondegenerate. Using this bilinear form, for a subset $C$ of $\mathbb{F}_q^n[z]$ one defines the annihilator
$$C^\perp := \{w(z)\in A[[z]] \mid (c(z), w(z)) = 0 \text{ for all } c(z)\in C\},$$
and, analogously, the annihilator $B^\perp \subseteq \mathbb{F}_q^n[z]$ of a subset $B$ of $A[[z]]$. The relation between these two annihilator operations is given by the following theorem, which was derived and proven in [1597].
Theorem 10.7.7 If C ⊆ Fnq [z] is a convolutional code with generator matrix G(z), then
C ⊥ is a linear, left-shift-invariant, and complete behavior with kernel representation P (z) =
G(z)T . Conversely, if B ⊆ Fnq [[z]] is a linear, left-shift-invariant, and complete behavior
with kernel representation P (z), then B ⊥ is a convolutional code with generator matrix
G(z) = P (z)T . Moreover C ⊆ Fnq [z] is noncatastrophic if and only if C ⊥ is a controllable
behavior.
Acknowledgement
This work is supported by The Center for Research and Development in Mathematics and Applications (CIDMA) through the Portuguese Foundation for Science and Technology (FCT - Fundação para a Ciência e a Tecnologia), references UIDB/04106/2020 and UIDP/04106/2020, by the Swiss National Science Foundation grant n. 188430, and by the German Research Foundation grant LI 3101/1-1.
Chapter 11
Rank-Metric Codes
Elisa Gorla
Université de Neuchâtel
Every vector rank-metric code can be regarded as a rank-metric code, up to the choice
of a basis of Fqm over Fq .
Definition 11.1.3 Let $\Gamma = \{\gamma_1, \gamma_2, \dots, \gamma_m\}$ be a basis of $\mathbb{F}_{q^m}$ over $\mathbb{F}_q$ and let $v \in \mathbb{F}_{q^m}^n$. Define $\Gamma(v) \in \mathbb{F}_q^{n\times m}$ via the identity
$$v_i = \sum_{j=1}^{m} \Gamma_{ij}(v)\,\gamma_j \quad\text{for } i = 1, \dots, n.$$
Γ(C) = {Γ(v) | v ∈ C}
Example 11.1.4 Let $\mathbb{F}_8 = \mathbb{F}_2[\alpha]/(\alpha^3 + \alpha + 1)$ and let $C$ be the vector rank-metric code $C = \langle(1, \alpha)\rangle \subseteq \mathbb{F}_8^2$. Let $\gamma_1 = 1$, $\gamma_2 = \alpha$, $\gamma_3 = \alpha^2$. Then $\Gamma = \{\gamma_1, \gamma_2, \gamma_3\}$ is a basis of $\mathbb{F}_8$ over $\mathbb{F}_2$ and
$$\Gamma(1, \alpha) = \begin{bmatrix} 1&0&0 \\ 0&1&0 \end{bmatrix}, \quad \Gamma(\alpha, \alpha^2) = \begin{bmatrix} 0&1&0 \\ 0&0&1 \end{bmatrix}, \quad \Gamma(\alpha^2, \alpha+1) = \begin{bmatrix} 0&0&1 \\ 1&1&0 \end{bmatrix}.$$
Hence
$$\Gamma(C) = \left\langle \begin{bmatrix} 1&0&0 \\ 0&1&0 \end{bmatrix}, \begin{bmatrix} 0&1&0 \\ 0&0&1 \end{bmatrix}, \begin{bmatrix} 0&0&1 \\ 1&1&0 \end{bmatrix} \right\rangle \subseteq \mathbb{F}_2^{2\times 3}.$$
The image Γ(C) of a vector rank-metric code C via Γ as defined above is a rank-metric
code, whose parameters are determined by those of C. The proof of the next proposition is
easy and may be found, e.g., in [842, Section 1].
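The map $\Gamma$ is mechanical once arithmetic in $\mathbb{F}_{q^m}$ is in place. A sketch for $\mathbb{F}_8 = \mathbb{F}_2[\alpha]/(\alpha^3+\alpha+1)$ with the basis $\{1, \alpha, \alpha^2\}$, where an element is stored as three bits (the bit-packed representation and helper names are our choices); it reproduces the matrices of Example 11.1.4:

```python
# F_8 = F_2[x]/(x^3 + x + 1); an element is the 3-bit integer b0 + b1*a + b2*a^2
def gf8_mul(x, y):
    """Multiply two F_8 elements given as 3-bit integers."""
    r = 0
    for i in range(3):          # schoolbook polynomial multiplication
        if (y >> i) & 1:
            r ^= x << i
    for i in (4, 3):            # reduce modulo a^3 = a + 1
        if (r >> i) & 1:
            r ^= (1 << i) | (0b011 << (i - 3))
    return r

def Gamma(v):
    """Expand v in F_8^n over the basis {1, a, a^2}: an n x 3 matrix over F_2."""
    return [[(vi >> j) & 1 for j in range(3)] for vi in v]

a = 0b010                       # the class of x, i.e. alpha
v = (0b001, a)                  # the codeword (1, alpha)
print(Gamma(v))                                # -> [[1, 0, 0], [0, 1, 0]]
print(Gamma([gf8_mul(a, vi) for vi in v]))     # -> [[0, 1, 0], [0, 0, 1]]
```

The second line expands $\alpha\cdot(1, \alpha) = (\alpha, \alpha^2)$, matching the second matrix of the example.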
Some authors define a notion of equivalence for vector rank-metric codes as follows.
Notice however that Definition 11.1.3 allows us to apply the notion of equivalence from
Definition 11.1.6 to vector rank-metric codes. It is therefore natural to ask whether the
rank-metric codes associated to equivalent vector rank-metric codes are also equivalent. It
is easy to show that the answer is affirmative.
Theorem 11.1.10 ([162]) Let ψ : Fnqm → Fnqm be an Fqm -linear isometry with respect to
the rank metric. Then there exist α ∈ F∗qm and B ∈ GLn (Fq ) such that ψ(v) = αvB for all
v ∈ Fnqm .
Notation 11.1.11 For a vector rank-metric code C ⊆ Fnqm and B ∈ GLn (Fq ), let
CB = {vB | v ∈ C} ⊆ Fnqm .
C T = {M T | M ∈ C} ⊆ Fm×n
q .
The MacWilliams Extension Theorem 1.8.6 is a classical result in the theory of linear
block codes in Fnq with the Hamming distance. It essentially says that any linear isometry
of block codes can be extended to a linear isometry of the ambient space Fnq . It is natural
to ask whether an analog of the MacWilliams Extension Theorem holds for rank-metric
codes. In other words, given rank-metric codes $C, D \subseteq \mathbb{F}_q^{n\times m}$ and an $\mathbb{F}_q$-linear isometry $f: C \to D$, one may ask whether there exists an $\mathbb{F}_q$-linear isometry $\varphi: \mathbb{F}_q^{n\times m} \to \mathbb{F}_q^{n\times m}$ such that $\varphi|_C = f$. The answer is no, as the next example shows. More counterexamples can be found in [131] and in the preprint [501, Section 7].
Example 11.1.12 ([131, Example 2.9(a)]) Denote by $0_{2\times 1}$ the zero matrix of size $2\times 1$ and let
$$C = \left\{ [A \;\; 0_{2\times 1}] \mid A \in \mathbb{F}_2^{2\times 2} \right\} \subseteq \mathbb{F}_2^{2\times 3}.$$
Let $\varphi: C \to \mathbb{F}_2^{2\times 3}$ be defined by $\varphi([A \;\; 0_{2\times 1}]) = [A^T \;\; 0_{2\times 1}]$. Then $\varphi$ is an $\mathbb{F}_2$-linear isometry defined on $C$ which is not the restriction to $C$ of an $\mathbb{F}_2$-linear isometry of $\mathbb{F}_2^{2\times 3}$. In fact, there is no choice for $\varphi\left(\begin{bmatrix} 0&0&1 \\ 0&0&0 \end{bmatrix}\right)$ that preserves the property that $\varphi$ is an $\mathbb{F}_2$-linear isometry.
Definition 11.2.1 ([1061, Definition 2.1]) Let C ⊆ Fnqm be a vector rank-metric code,
and let Γ = {γ1 , γ2 , . . . , γm } be a basis of Fqm over Fq . The support of v ∈ C is the
Fq -linear space
supp(v) = colsp(Γ(v)) ⊆ Fnq .
The support of a subcode $D \subseteq C$ is
$$\operatorname{supp}(D) = \sum_{v\in D} \operatorname{supp}(v) \subseteq \mathbb{F}_q^n.$$
Notice that supp(v) does not depend on the choice of the basis $\Gamma$, since if $\Gamma'$ is another basis of $\mathbb{F}_{q^m}$ over $\mathbb{F}_q$, then there exists a $B \in \mathrm{GL}_m(\mathbb{F}_q)$ such that $\Gamma(v) = \Gamma'(v)B$. This also implies that supp(D) does not depend on the choice of $\Gamma$. See also [841, Proposition 1.13].
In the context of rank-metric codes, we define the support as follows.

Definition 11.2.2 Let $C \subseteq \mathbb{F}_q^{n\times m}$ be a rank-metric code. The support of $M \in C$ is
$$\operatorname{supp}(M) = \operatorname{colsp}(M) \subseteq \mathbb{F}_q^n \ \text{ if } n \le m, \quad\text{and}\quad \operatorname{supp}(M) = \operatorname{rowsp}(M) \subseteq \mathbb{F}_q^m \ \text{ if } n > m.$$
Notice that, if n ≤ m, Definition 11.2.2 agrees with Definition 11.2.1, when restricted to
rank-metric codes associated to vector rank-metric codes. Precisely, if C ⊆ Fnqm is a vector
rank-metric code and Γ = {γ1 , . . . , γm } is a basis of Fqm over Fq , then
supp(Γ(v)) = supp(v)
We wish to stress that, in the context of matrices, taking the support of the transpose yields a different, but well-behaved, notion of support of a matrix. Below we make a few remarks on different possible notions of support and on why we choose to adopt Definition 11.2.2. It is clear that, depending on the application or on the information that one wishes to encode, one may also choose to work with different notions of support.
Remark 11.2.4 If n = m, then one may define a notion of support by considering row spaces instead of column spaces. This yields a different, but substantially equivalent, notion of support. A different, but possibly interesting, notion of support for a square matrix would be defining the support of $M \in \mathbb{F}_q^{n\times n}$ to be the pair of vector spaces $(\operatorname{rowsp}(M), \operatorname{colsp}(M))$. This is connected to the definition of generalized weights (see Section 11.5) and to the approach taken in [841] for studying generalized weights of square matrices via q-polymatroids (see Section 11.6).
Remark 11.2.5 For any value of m, n, one may define a different notion of support as follows: for a rank-metric code $C \subseteq \mathbb{F}_q^{n\times m}$ and for $M \in C$, let
$$\operatorname{supp}(M) = \operatorname{rowsp}(M) \subseteq \mathbb{F}_q^m \ \text{ if } n \le m \qquad (11.1)$$
and let
$$\operatorname{supp}(M) = \operatorname{colsp}(M) \subseteq \mathbb{F}_q^n \ \text{ if } n > m. \qquad (11.2)$$
Then the support of a subcode $D \subseteq C$ is
$$\operatorname{supp}(D) = \sum_{M\in D} \operatorname{supp}(M) \subseteq \mathbb{F}_q^{\max\{m,n\}}. \qquad (11.3)$$
This definition yields a different notion of support from that of Definition 11.2.2; in particular it takes values in $\mathbb{F}_q^{\max\{n,m\}}$. One can check that both notions of support are regular in the sense of [1575]. However, the definition of support in (11.1), (11.2), and (11.3) for $n \ne m$ yields an empty extremality theory in the sense of [1575, Section 7] and a series of redundant MacWilliams Identities. Moreover, in [841] we showed that, for $n < m$, the q-polymatroid determined by the supports as in Definition 11.2.2 allows one to easily recover the generalized weights of the code, while the q-polymatroid determined by the supports as in (11.1), (11.2), and (11.3) does not. With these in mind, we choose to adopt Definition 11.2.2.
Remark 11.2.6 Some authors choose to work with a definition of support which is the
same for every value of m, n; e.g., in [1341] the authors define
$$\operatorname{supp}(M) = \operatorname{colsp}(M) \quad\text{for any } m, n \text{ and any } M \in \mathbb{F}_q^{n\times m}. \qquad (11.4)$$
This notion of support agrees with Definition 11.2.2 for n ≤ m and with the definition
discussed in the previous remark for n > m. For all the reasons discussed in the previous
remark, for n > m we prefer Definition 11.2.2 to this definition. Notice however that the
definition of support in (11.4) is compatible with the notion of generalized matrix weights
as defined by Martínez-Peñas and Matsumoto (see Definition 11.5.12). In this chapter,
however, we define generalized weights as in Definition 11.5.7, which is compatible with the
notion of support as by Definition 11.2.2.
Given a definition of support, it is natural to consider the subcodes of a code which are
supported on a fixed vector space.
Definition 11.2.7 Let $C \subseteq \mathbb{F}_q^{n\times m}$ be a rank-metric code. Let $V \subseteq \mathbb{F}_q^{\min\{m,n\}}$ be a vector subspace. The subcode of C supported on V is
$$C(V) = \{M \in C \mid \operatorname{supp}(M) \subseteq V\}.$$
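Over $\mathbb{F}_2$ the spaces involved are small enough to enumerate directly. The sketch below (helper names and the toy code are ours) computes supports as column spaces for $n \le m$ and row spaces for $n > m$, in the spirit of Definition 11.2.2, and filters a handful of matrices for the subcode C(V):

```python
def span_f2(vectors):
    """The F_2-span of a nonempty list of 0/1 tuples, as a frozenset."""
    space = {(0,) * len(vectors[0])}
    for v in vectors:
        # adding one generator at a time keeps `space` a subspace
        space |= {tuple(a ^ b for a, b in zip(v, w)) for w in space}
    return frozenset(space)

def supp(M):
    """Support of a matrix over F_2: column space if n <= m, row space if n > m."""
    n, m = len(M), len(M[0])
    if n <= m:
        vecs = [tuple(M[i][j] for i in range(n)) for j in range(m)]  # columns
    else:
        vecs = [tuple(row) for row in M]                             # rows
    return span_f2(vecs)

# a tiny collection of matrices in F_2^{2x3} and the subspace V = <e_1> of F_2^2
C = [
    [[0, 0, 0], [0, 0, 0]],
    [[1, 0, 0], [0, 1, 0]],
    [[1, 1, 0], [0, 0, 0]],
]
V = [(1, 0)]
# indices of the matrices lying in C(V) = {M in C : supp(M) contained in <V>}
print([i for i, M in enumerate(C) if supp(M) <= span_f2(V)])  # -> [0, 2]
```

The full-rank matrix (index 1) has support all of $\mathbb{F}_2^2$ and is excluded, while the zero matrix and the matrix whose columns are multiples of $e_1$ survive.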
Analogous definitions can be given for vector rank-metric codes, using the corresponding
rank distance.
dmin (0) = n + 1.
for any vector rank-metric code $0 \ne C \subseteq \mathbb{F}_{q^m}^n$ and any basis $\Gamma$ of $\mathbb{F}_{q^m}$ over $\mathbb{F}_q$.
Definition 11.3.4 Let C ⊆ Fnqm be a vector rank-metric code. The maximum rank of C
is
max rank(C) = max {rank(v) | v ∈ C}.
Both the minimum distance and the maximum rank of a code are related to the other
invariants of the code. The first inequality in the next theorem goes under the name of
Singleton Bound and was proved by Delsarte in [523, Theorem 5.4]. The second goes under
the name of Anticode Bound. The Anticode Bound was proved by Meshulam in [1375,
Theorem 1] in the square case, but one can check that the proof also works in the rectangular
case. The proof by Meshulam relies heavily on a result by König [1144] (see also [888,
Theorem 5.1.4]). Finally, a coding-theoretic proof of the Anticode Bound was given by
Ravagnani in [1574, Proposition 47].
$$\dim(C) \le \max\{n, m\}\,(\min\{n, m\} - d_{\min}(C) + 1)$$
and
$$\dim(C) \le \max\{n, m\} \cdot \operatorname{maxrank}(C).$$
Remark 11.3.6 For a vector rank-metric code C ⊆ Fnqm , the Singleton Bound can be
stated as
dimFqm (C) ≤ n − dmin (C) + 1.
This bound appeared in [760, Corollary of Lemma 1], under the assumption that n ≤ m.
However, it is easy to check that the bound holds for any n, m.
Remark 11.3.7 For a vector rank-metric code $C \subseteq \mathbb{F}_{q^m}^n$ with $\dim_{\mathbb{F}_{q^m}}(C) \le m$, the Anticode Bound can be stated as
$$\dim_{\mathbb{F}_{q^m}}(C) \le \operatorname{maxrank}(C). \qquad (11.5)$$
The bound was proved in [1573, Proposition 11], under the assumption that n ≤ m. Using
the same type of arguments however, one can easily prove the bound in the more general
form stated here. Notice moreover that, if dimFqm (C) > m, then
dimFqm (C) > m ≥ max rank(C).
In particular, the inequality (11.5) cannot hold if dimFqm (C) > m.
The codes whose invariants meet the bounds of Theorem 11.3.5 go under the names of
MRD codes and optimal anticodes, respectively. They have both been extensively studied.
Definition 11.3.8 A vector rank-metric code C ⊆ Fnqm is a maximum rank distance
(MRD) code if
dimFqm (C) = n − dmin (C) + 1.
It is an optimal vector anticode if
dimFqm (C) = max rank(C).
A rank-metric code $C \subseteq \mathbb{F}_q^{n\times m}$ is a maximum rank distance (MRD) code if
$$\dim(C) = \max\{n, m\}\,(\min\{m, n\} - d_{\min}(C) + 1).$$
It is an optimal anticode if
dim(C) = max {n, m} · max rank(C).
Remark 11.3.9 Let $C \subseteq \mathbb{F}_q^{n\times m}$ be a rank-metric code and let $C^T \subseteq \mathbb{F}_q^{m\times n}$ be the code obtained from C by transposition. The following are immediate consequences of the definitions:
tions:
• C is MRD if and only if C T is MRD,
• C is an optimal anticode if and only if C T is an optimal anticode.
Several examples and constructions of MRD codes are given in Chapter 14.
Remark 11.3.10 Notice that 0 and Fnqm are the only MRD vector rank-metric codes, if
n > m. In fact, if a vector rank-metric code C exists with
dimFqm (C) = n − dmin (C) + 1 < n,
then $d_{\min}(C) > 1$. Assume that $C \ne 0$ and let $\Gamma$ be a basis of $\mathbb{F}_{q^m}$ over $\mathbb{F}_q$. By Proposition 11.1.5, $\dim(\Gamma(C)) = m \dim_{\mathbb{F}_{q^m}}(C)$ and $d_{\min}(C) = d_{\min}(\Gamma(C))$. Using the Singleton Bound, $\Gamma(C)$ is a rank-metric code of dimension
$$\dim(\Gamma(C)) = m\,(n - d_{\min}(\Gamma(C)) + 1) \le n\,(m - d_{\min}(\Gamma(C)) + 1),$$
contradicting the assumption that n > m.
234 Concise Encyclopedia of Coding Theory
Optimal anticodes were characterized by de Seguins Pazzis, who proved that, up to code
equivalence, they are exactly the standard optimal anticodes of Example 11.3.11.
Theorem 11.3.15 ([511, Theorem 4 and Theorem 6]) Recalling Definition 11.2.7, the optimal anticodes of $\mathbb{F}_q^{n\times m}$ with respect to the rank metric are exactly the following codes:
• $\mathbb{F}_q^{n\times m}(V)$ for some $V \subseteq \mathbb{F}_q^{\min\{m,n\}}$, if $m \ne n$,
• $\mathbb{F}_q^{n\times n}(V)$ and $\mathbb{F}_q^{n\times n}(V)^T$ for some $V \subseteq \mathbb{F}_q^n$, if $m = n$.
In particular, every optimal anticode is equivalent to a standard optimal anticode.
In particular, every optimal anticode is equivalent to a standard optimal anticode.
Proof: The only part of the statement which is not contained in the proof of [511, Theorem 4 and Theorem 6] is the claim that every optimal anticode is equivalent to a standard optimal anticode. Notice that the standard optimal anticodes are $\mathbb{F}_q^{n\times m}(E_\ell)$ and $\mathbb{F}_q^{n\times n}(E_\ell)^T$, where $E_\ell = \langle e_1, \dots, e_\ell\rangle$ and $\ell = 0, \dots, \min\{m, n\}$.
If $n \le m$, let $A \in \mathrm{GL}_n(\mathbb{F}_q)$ be a matrix whose first $k = \dim(V)$ columns are a basis of V and let $B \in \mathrm{GL}_m(\mathbb{F}_q)$ be any matrix. If $n > m$, let $A \in \mathrm{GL}_n(\mathbb{F}_q)$ be any matrix and let $B \in \mathrm{GL}_m(\mathbb{F}_q)$ be a matrix whose first $k = \dim(V)$ rows are a basis of V. Then
$$A\,\mathbb{F}_q^{n\times m}(E_k)\,B = \mathbb{F}_q^{n\times m}(V). \qquad (11.6)$$
Indeed, for $n \le m$ and $M \in \mathbb{F}_q^{n\times m}(E_k)$,
$$\operatorname{colsp}(AMB) = \operatorname{colsp}(AM) \subseteq V,$$
while for $n > m$,
$$\operatorname{rowsp}(AMB) = \operatorname{rowsp}(MB) \subseteq V.$$
Moreover,
$$B^T\,\mathbb{F}_q^{n\times n}(E_k)^T\,A^T = \mathbb{F}_q^{n\times n}(V)^T.$$
Therefore $\mathbb{F}_q^{n\times n}(V)^T$ is equivalent to the standard optimal anticode $\mathbb{F}_q^{n\times n}(E_k)^T$.
Optimal vector anticodes were characterized by Ravagnani in [1573, Theorem 18], under
the assumption that n ≤ m. One can also show that, up to code equivalence, optimal
vector anticodes are exactly the standard optimal vector anticodes of Example 11.3.12.
Notice that a vector rank-metric code C ⊆ Fnqm with dimFqm (C) > m cannot be an optimal
vector anticode by Remark 11.3.13. Hence we may assume without loss of generality that
dimFqm (C) ≤ m.
Theorem 11.3.16 Let C ⊆ Fnqm be a vector rank-metric code with k = dimFqm (C) ≤ m.
The following are equivalent.
(a) C is an optimal vector anticode,
(b) C has a basis consisting of vectors with entries in Fq ,
(c) $C \sim \langle e_1, e_2, \dots, e_k\rangle$, where $e_i$ denotes the $i$th vector of the standard basis of $\mathbb{F}_{q^m}^n$.
$$C^\perp = \{M \in \mathbb{F}_q^{n\times m} \mid \operatorname{Tr}(MN^T) = 0 \text{ for all } N \in C\},$$
The usual scalar product of Fnqm induces a notion of dual for vector rank-metric codes.
The two notions of dual code are compatible with the definition of associated rank-metric
code with respect to a basis of Fqm over Fq , for a suitable choice of bases.
Proposition 11.4.4 ([1574, Theorem 21]) Let C ⊆ Fnqm be a vector rank-metric code
and let Γ, Γ0 be orthogonal bases of Fqm over Fq . Then
Γ(C)⊥ = Γ0 (C ⊥ ).
Example 11.4.5 Let C be the vector rank-metric code $C = \langle(1, \alpha)\rangle \subseteq \mathbb{F}_8^2$, where $\mathbb{F}_8 = \mathbb{F}_2[\alpha]/(\alpha^3 + \alpha + 1)$. Its dual code is $C^\perp = \langle(1, \alpha^2 + 1)\rangle \subseteq \mathbb{F}_8^2$. Let $\Gamma = \{\gamma_1 = 1, \gamma_2 = \alpha, \gamma_3 = \alpha^2\}$ be an $\mathbb{F}_2$-basis of $\mathbb{F}_8$. The rank-metric code associated to C with respect to $\Gamma$ is
$$\Gamma(C) = \left\langle \begin{bmatrix} 1&0&0 \\ 0&1&0 \end{bmatrix}, \begin{bmatrix} 0&1&0 \\ 0&0&1 \end{bmatrix}, \begin{bmatrix} 0&0&1 \\ 1&1&0 \end{bmatrix} \right\rangle \subseteq \mathbb{F}_2^{2\times 3}.$$
Its dual code is
$$\Gamma(C)^\perp = \left\langle \begin{bmatrix} 0&0&1 \\ 1&0&0 \end{bmatrix}, \begin{bmatrix} 1&0&1 \\ 0&1&0 \end{bmatrix}, \begin{bmatrix} 0&1&0 \\ 0&0&1 \end{bmatrix} \right\rangle \subseteq \mathbb{F}_2^{2\times 3}.$$
The orthogonal basis of $\Gamma$ is $\Gamma' = \{\gamma_1' = 1, \gamma_2' = \alpha^2, \gamma_3' = \alpha\}$. The rank-metric code associated to $C^\perp$ with respect to $\Gamma'$ is
$$\Gamma'(C^\perp) = \left\langle \begin{bmatrix} 1&0&0 \\ 1&1&0 \end{bmatrix}, \begin{bmatrix} 0&0&1 \\ 1&0&0 \end{bmatrix}, \begin{bmatrix} 0&1&0 \\ 0&0&1 \end{bmatrix} \right\rangle.$$
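The duality in Example 11.4.5 can be checked numerically: over $\mathbb{F}_2$ the trace form $\operatorname{Tr}(MN^T)$ is just the entrywise dot product, and every generator of $\Gamma(C)$ must pair to zero with every generator of $\Gamma(C)^\perp$. A sketch with the matrices transcribed from the example:

```python
def tr_form(M, N):
    """Tr(M N^T) over F_2: the sum of entrywise products mod 2."""
    return sum(a * b for rm, rn in zip(M, N) for a, b in zip(rm, rn)) % 2

# generators transcribed from Example 11.4.5
GC     = [[[1,0,0],[0,1,0]], [[0,1,0],[0,0,1]], [[0,0,1],[1,1,0]]]   # Gamma(C)
GCperp = [[[0,0,1],[1,0,0]], [[1,0,1],[0,1,0]], [[0,1,0],[0,0,1]]]   # Gamma(C)^perp
print(all(tr_form(M, N) == 0 for M in GC for N in GCperp))  # -> True
```

Since pairing vanishes on generators and the form is bilinear, it vanishes on the full spans, confirming that the two codes annihilate each other.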
Clearly $A_0(C) = 1$, $d_{\min}(C) = \min\{i \mid A_i(C) \ne 0,\ i \ne 0\}$, and $\operatorname{maxrank}(C) = \max\{i \mid A_i(C) \ne 0\}$.
The relations between the weight distribution of C and $C^\perp$ go under the name of MacWilliams Identities and were first proved in [523, Theorem 3.3]. A different proof, inspired by [1575, Theorem 27], was given in [842, Theorem 2]. Another proof was given in [1683, Proposition 15]. The following formulation is analogous to Theorem 1.15.3(d).
From the MacWilliams Identities, one can derive a number of nontrivial consequences.
Here we give two relevant ones, starting with the celebrated result of Delsarte, which states
that the dual of an MRD code is MRD. Compare this result to Theorem 1.9.13.
The weight distribution of an MRD code was first computed by Delsarte in [523, Theorem 5.6] and can be derived from the MacWilliams Identities via a standard computation.
An analogous result can be obtained for dually quasi-MRD codes.
Discussing the family of dually quasi-MRD codes is beyond the scope of this chapter. The
definition however is motivated by Proposition 11.4.6, which shows that dually quasi-MRD
codes are exactly the non-MRD codes which maximize the quantity dmin (C) + dmin (C ⊥ ).
We refer the interested reader to [500] for a discussion of the properties of codes which are
close to being MRD in this sense. The weight distribution of a dually quasi-MRD code was
computed in [500, Corollary 28]. Below we give a statement that covers both MRD and
dually quasi-MRD codes. Once again, compare this to Theorem 1.15.7.
Theorem 11.4.15 Let $C \subseteq \mathbb{F}_q^{n\times m}$ be an MRD or dually quasi-MRD rank-metric code. Let $d = d_{\min}(C)$. Then $A_0(C) = 1$, $A_i(C) = 0$ for $i = 1, \dots, d-1$, and
$$A_i(C) = \begin{bmatrix} \min\{m,n\} \\ i \end{bmatrix}_q \sum_{j=0}^{i-d} (-1)^j q^{\binom{j}{2}} \begin{bmatrix} i \\ j \end{bmatrix}_q \left( q^{\dim(C) - \max\{m,n\}(\min\{m,n\} + j - i)} - 1 \right).$$
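As a sanity check of this weight distribution, take the code $\Gamma(C)$ of Example 11.1.4: it is MRD with $n = 2$, $m = 3$, $q = 2$, $\dim(C) = 3$ and $d = 2$, and the formula predicts $A_2(C) = \begin{bmatrix}2\\2\end{bmatrix}_2(2^3 - 1) = 7$. Brute-force enumeration of the 8 codewords (helper names ours) agrees:

```python
from itertools import product

def rank_f2(M):
    """Rank of a 0/1 matrix over F_2 via Gaussian elimination."""
    M = [row[:] for row in M]
    r = 0
    for c in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if M[i][c]), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        for i in range(len(M)):
            if i != r and M[i][c]:
                M[i] = [x ^ y for x, y in zip(M[i], M[r])]
        r += 1
    return r

def add(M, N):
    return [[a ^ b for a, b in zip(rm, rn)] for rm, rn in zip(M, N)]

# generators of Gamma(C) from Example 11.1.4
gens = [[[1,0,0],[0,1,0]], [[0,1,0],[0,0,1]], [[0,0,1],[1,1,0]]]
zero = [[0,0,0],[0,0,0]]

code = []
for coeffs in product((0, 1), repeat=3):   # all F_2-linear combinations
    M = zero
    for c, G in zip(coeffs, gens):
        if c:
            M = add(M, G)
    code.append(M)

A = [0, 0, 0]                              # A_0, A_1, A_2
for M in code:
    A[rank_f2(M)] += 1
print(A)  # -> [1, 0, 7]
```

Every nonzero codeword has rank 2, as expected for an MRD code with $d_{\min} = 2$.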
The analog of Theorem 11.4.13 for optimal anticodes was proved by Ravagnani.
Theorem 11.4.16 ([1574, Theorem 54]) Let $C \subseteq \mathbb{F}_q^{n\times m}$ be a rank-metric code. Then $C$ is an optimal anticode if and only if $C^\perp$ is an optimal anticode.
Ravagnani also proved the next interesting result, relating MRD codes and optimal
anticodes.
$$C + D = \mathbb{F}_q^{n\times m}$$
Definition 11.5.1 ([1459, Definition 1]) Let $n \le m$ and let $C \subseteq \mathbb{F}_{q^m}^n$ be a vector rank-metric code. For $i = 1, 2, \dots, \dim_{\mathbb{F}_{q^m}}(C)$ the generalized weights of C are
$$w_i(C) = \min_{D}\bigl\{ \max_{v}\,\{\dim \operatorname{supp}(v) \mid v \in D,\ v \ne 0\} \;\bigm|\; D \subseteq C,\ \dim_{\mathbb{F}_{q^m}}(D) = i \bigr\}.$$
A definition of relative generalized weights for vector rank-metric codes was given by Kurihara, Matsumoto, and Uyematsu. Let $\phi: \mathbb{F}_{q^m}^n \to \mathbb{F}_{q^m}^n$ be the Frobenius endomorphism defined by $\phi(v_1, \dots, v_n) = (v_1^q, \dots, v_n^q)$.
Definition 11.5.2 ([1178, Definition 2]) Let $C \subseteq \mathbb{F}_{q^m}^n$ be a vector rank-metric code, and let $D \subsetneq C$ be a proper subcode. Let $\phi: \mathbb{F}_{q^m}^n \to \mathbb{F}_{q^m}^n$ be the Frobenius endomorphism. For $i = 1, 2, \dots, \dim_{\mathbb{F}_{q^m}}(C) - \dim_{\mathbb{F}_{q^m}}(D)$ the relative generalized weights of C and D are
$$w_i(C, D) = \min\{\dim \operatorname{supp}(V) \mid V \subseteq \mathbb{F}_{q^m}^n,\ \phi(V) = V,\ \dim_{\mathbb{F}_{q^m}}(C\cap V) - \dim_{\mathbb{F}_{q^m}}(D\cap V) \ge i\}.$$
In particular, for i = 1, 2, . . . , dimFqm (C) the relative generalized weights of C and 0 are
In [645], Ducoat proposed and studied the following modification of Definition 11.5.1.
For any $D \subseteq \mathbb{F}_{q^m}^n$, let $D^* = D + \phi(D) + \cdots + \phi^{m-1}(D)$, where $\phi$ denotes the Frobenius endomorphism. $D^*$ is the smallest $\mathbb{F}_{q^m}$-linear space containing $D$ which is fixed by $\phi$.
Definition 11.5.3 Let C ⊆ Fnqm be a vector rank-metric code. For i = 1, 2, . . . , dimFqm (C)
the generalized weights of C are
Notice that, although the definition by Ducoat does not assume n ≤ m, most of the
results that he establishes in [645] do.
It was shown by Ducoat in [645, Proposition II.1] for $n \le m$, and by Jurrius and Pellikaan in [1061, Theorem 5.4] for any $n, m$, that the relative generalized weights of $C$ and $0$ agree with the generalized weights of $C$; i.e., $w_i(C, 0) = w_i(C)$ for all $i$.
Moreover, it follows from [1061, Theorem 5.2 and Theorem 5.8] that Definition 11.5.1 and
Definition 11.5.3 are equivalent for n ≤ m.
The next definition is the natural analog of the definition of generalized rank weights
for linear block codes, endowed with the Hamming distance. It was given by Jurrius and
Pellikaan, who in [1061, Theorem 5.2] proved that it is equivalent to Definition 11.5.1 if
n ≤ m. In [1061, Theorem 5.8] they proved that it is equivalent to Definition 11.5.3.
Definition 11.5.4 ([1061, Definition 2.5]) Let C ⊆ Fnqm be a vector rank-metric code.
For i = 1, 2, . . . , dimFqm (C) the generalized weights of C are
The next result provides another equivalent definition of generalized weights for vector
rank-metric codes. It was proved by Ravagnani under the assumption n ≤ m, but it can
easily be extended to arbitrary n, m as follows.
Theorem 11.5.5 ([1573, Corollary 19]) Let C ⊆ Fnqm be a vector rank-metric code of
dimension dimFqm (C) ≤ m. Then for i = 1, 2, . . . , dimFqm (C),
$$w_i(C) = \min \{\dim_{\mathbb{F}_{q^m}}(A) \mid A \subseteq \mathbb{F}_{q^m}^n \text{ an optimal vector anticode with } \dim_{\mathbb{F}_{q^m}}(C \cap A) \ge i\}.$$
Remark 11.5.6 Let $C \subseteq \mathbb{F}_{q^m}^n$ be a vector rank-metric code of dimension $\dim_{\mathbb{F}_{q^m}}(C) > m$ and let $m < i \le \dim_{\mathbb{F}_{q^m}}(C)$. Then the quantity
$$\min \{\dim_{\mathbb{F}_{q^m}}(A) \mid A \subseteq \mathbb{F}_{q^m}^n \text{ an optimal vector anticode with } \dim_{\mathbb{F}_{q^m}}(C \cap A) \ge i\}$$
cannot be equal to $w_i(C)$ since, by Remark 11.3.13, there exists no optimal vector anticode $A$ with $\dim_{\mathbb{F}_{q^m}}(A) \ge \dim_{\mathbb{F}_{q^m}}(C \cap A) \ge i > m$.
Rank-Metric Codes 241
The first definition of generalized weights for the larger class of rank-metric codes was
given by Ravagnani.
Definition 11.5.7 ([1573, Definition 23]) Let $C \subseteq \mathbb{F}_q^{n\times m}$ be a rank-metric code. For $i = 1, 2, \ldots, \dim(C)$ the generalized weights of $C$ are
$$d_i(C) = \frac{1}{\max\{m,n\}} \min \{\dim(A) \mid A \subseteq \mathbb{F}_q^{n\times m} \text{ an optimal anticode with } \dim(C \cap A) \ge i\}.$$
Remark 11.5.8 The characterization of optimal anticodes from Theorem 11.3.15, together with the observation that
$$\dim(\mathbb{F}_q^{n\times m}(V)) = \max\{n,m\} \cdot \dim(V),$$
yields an explicit form of Definition 11.5.7. Notice that, for $m = n$, this definition of generalized weights is coherent with the definition of support given in Remark 11.2.4.
The next statement appears as a theorem in [1573, Theorem 28]. However, the proof is
incomplete. Therefore, we state it here as a conjecture.
Conjecture 11.5.9 ([1573, Theorem 28]) Let n ≤ m and let C ⊆ Fnqm be a vector rank-
metric code. Let Γ = {γ1 , . . . , γm } be a basis of Fqm over Fq . Then
Remark 11.5.10 In particular, under the assumptions of Conjecture 11.5.9, this conjecture
implies that
$$d_{m(i-1)+1}(\Gamma(C)) = \cdots = d_{mi-1}(\Gamma(C)) = d_{mi}(\Gamma(C))$$
for i = 1, 2, . . . , dimFqm (C).
One can easily find an example that shows that the equality in Conjecture 11.5.9 does
not hold if n > m.
Example 11.5.11 Let $C = \langle(1, 0, 0)\rangle \subseteq \mathbb{F}_4^3$, where $\mathbb{F}_4 = \mathbb{F}_2[\alpha]/(\alpha^2 + \alpha + 1)$. Then $\Gamma(C)$ has $d_1(\Gamma(C)) = d_{\min}(\Gamma(C)) = 1$. Since $\Gamma(C)$ is not an optimal anticode, any optimal anticode $A$ with $\dim(\Gamma(C) \cap A) \ge 2$ satisfies $A \supsetneq \Gamma(C)$. Hence $A$ must have $\operatorname{max\,rank}(A) = 2$ and $\dim(A) = 6$. Therefore $d_2(\Gamma(C)) = 2$.
The $i$th generalized matrix weight of $C$ is the $i$th relative generalized matrix weight of $C$ and $0$; i.e., for $i = 1, 2, \ldots, \dim(C)$
$$\delta_i(C) = \min \{\dim(V) \mid V \subseteq \mathbb{F}_q^n,\ \dim(C \cap \{M \in \mathbb{F}_q^{n\times m} \mid \operatorname{colsp}(M) \subseteq V\}) \ge i\}.$$
The next result compares Definition 11.5.7 and Definition 11.5.12. It follows easily from Theorem 11.3.15 and appears in the literature as [1341, Theorem 9]. Notice that the assumption that $n \le m$ is missing throughout [1341, Section VIII.C]. As a consequence, the statement of [1341, Theorem 9] claims that Definition 11.5.7 and Definition 11.5.12 agree for $m \neq n$; however the result is proved only for $n < m$.
Proof: The thesis follows from Remark 11.5.8, after observing that for $n \le m$ one has
$$C(V) = C \cap \{M \in \mathbb{F}_q^{n\times m} \mid \operatorname{colsp}(M) \subseteq V\}.$$
One can easily find examples that show that Definition 11.5.7 and Definition 11.5.12 do
not agree in the case m = n.
In fact, one can also find examples that show that Definition 11.5.7 and Definition 11.5.12
do not agree in the case m < n. This implies in particular that the part of the statement
of [1341, Theorem 9] concerning the case m < n is incorrect.
hence δ3 (C) = 3.
The code $C \subseteq \mathbb{F}_2^{3\times 2}$ of Example 11.5.16 has $d_3(C) < \delta_3(C)$. One may wonder whether $d_i(C) \le \delta_i(C)$ for all $i$, for a rank-metric code $C \subseteq \mathbb{F}_q^{n\times m}$ with $n > m$. The answer turns out to be negative, as the next example shows.
hence δ2 (C) = 1.
In fact, as an easy consequence of Theorem 11.3.15 and of Remark 11.3.9, one obtains
the following result.
where the first equality follows from Proposition 11.5.13, the second from Remark 11.5.8, and the third from (11.7). If $n = m$, then
$$d_i(C) = \min \left\{\dim(V) \mid V \subseteq \mathbb{F}_q^n,\ \max\{\dim(C(V)), \dim(C^T(V))\} \ge i\right\} = \min\{\delta_i(C), \delta_i(C^T)\},$$
where the first equality follows from Remark 11.5.8 and the second from (11.7).
As in the case of generalized weights, one can relate the generalized weights of a vector
rank-metric code and the generalized matrix weights of its associated rank-metric code. In
fact more is true, since the relative versions of the weights can also be related, and the
assumption n ≤ m is not needed.
We conclude this section with a few results on generalized weights. The next theorem
establishes some properties of the sequence of generalized weights of a rank-metric code.
Theorem 11.5.19 allows one to compute the generalized weights of MRD codes and
optimal anticodes. In Section 11.4 we stated analogous results for the weight distribution
of MRD codes and optimal anticodes.
Corollary 11.5.20 ([1573, Corollary 31]) Let $C \subseteq \mathbb{F}_q^{n\times m}$ be a rank-metric code of dimension $\dim(C) = \ell = k \cdot \max\{m,n\}$. The following are equivalent.
(a) $C$ is MRD.
(b) $d_i(C) = \min\{m,n\} - k + \lceil i/\max\{m,n\} \rceil$ for $i = 1, 2, \ldots, \ell$.
Corollary 11.5.21 ([1573, Corollary 32]) Let $C \subseteq \mathbb{F}_q^{n\times m}$ be a rank-metric code of dimension $\dim(C) = \ell = k \cdot \max\{m,n\}$. The following are equivalent.
(a) $C$ is an optimal anticode.
(b) $d_\ell(C) = k$.
(c) $d_i(C) = \lceil i/\max\{m,n\} \rceil$ for $i = 1, 2, \ldots, \ell$.
Corollary 11.5.22 Let $C \subseteq \mathbb{F}_q^{n\times m}$ be a rank-metric code. Then $C$ is both MRD and an optimal anticode if and only if $C = 0$ or $C = \mathbb{F}_q^{n\times m}$.
A similar result can be obtained for dually quasi-MRD codes. It follows from [500, Corollary 18] that the dimension of a dually quasi-MRD code $C \subseteq \mathbb{F}_q^{n\times m}$ is not divisible by $\max\{m,n\}$. Therefore, in the next result we make this assumption without loss of generality.
Corollary 11.5.23 ([500, Theorem 22]) Let $C \subseteq \mathbb{F}_q^{n\times m}$ be a rank-metric code of dimension $\dim(C) = \ell = k \cdot \max\{m,n\} + r$, with $k \ge 0$ and $0 < r < \max\{m,n\}$. The following are equivalent.
(a) $C$ is dually quasi-MRD.
(b) $d_1(C) = \min\{m,n\} - k$ and $d_{r+1}(C) = \min\{m,n\} + 1 - k$.
Moreover, if C is dually quasi-MRD, then its generalized weights are:
We already observed that, while there is no easy relation between the minimum distance
of C and C ⊥ , the weight distribution of C determines the weight distribution of C ⊥ . The
next result shows that the generalized weights of C determine the generalized weights of C ⊥ .
and
$$\overline{W}_i(C) = \left\{ \min\{m,n\} + 1 - d_{i+j\cdot\max\{m,n\}}(C) \;\middle|\; j \ge 0,\ 1 \le i + j\cdot\max\{m,n\} \le \ell \right\}.$$
Remark 11.6.3 Definition 11.6.2 is slightly different from the definition of (q, r)-poly-
matroid given by Shiromoto in [1683, Definition 2]. However, a (q, r)-polymatroid (E, ρ)
as defined by Shiromoto corresponds to the q-polymatroid (E, ρ/r) according to Defini-
tion 11.6.2. Moreover, a q-polymatroid whose rank function takes values in Q corresponds
to a (q, r)-polymatroid as defined by Shiromoto up to multiplying the rank function by an
r which clears denominators.
Example 11.6.4 The pair (F`q , dim(·)) is a q-matroid, where dim(·) denotes the function
that associates to a vector space its dimension.
$$\rho_U(V) = \dim(V) - \dim(V \cap U^\perp)$$
Definition 11.6.6 ([841, Definition 4.4]) Let (F`q , ρ1 ) and (F`q , ρ2 ) be q-polymatroids.
We say that (F`q , ρ1 ) and (F`q , ρ2 ) are equivalent if there exists an Fq -linear isomorphism
ϕ : F`q → F`q such that ρ1 (V ) = ρ2 (ϕ(V )) for all V ⊆ F`q . In this case we write (F`q , ρ1 ) ∼
(F`q , ρ2 ).
Definition 11.6.7 ([841, Definition 4.5]) Let $P = (\mathbb{F}_q^\ell, \rho)$ be a q-polymatroid. For all subspaces $V \subseteq \mathbb{F}_q^\ell$ define
$$\rho^*(V) = \dim(V) - \rho(\mathbb{F}_q^\ell) + \rho(V^\perp),$$
where $V^\perp$ is the dual of $V$ with respect to the standard inner product on $\mathbb{F}_q^\ell$. We call $P^* = (\mathbb{F}_q^\ell, \rho^*)$ the dual of the q-polymatroid $P$.
Theorem 11.6.8 ([841, Theorem 4.6 and Proposition 4.7]) Let P, Q be q-poly-
matroids. Then the following hold.
(a) P ∗ is a q-polymatroid.
(b) P ∗∗ = P .
(c) If P ∼ Q, then P ∗ ∼ Q∗ .
One can associate q-polymatroids to rank-metric codes as follows. In [841, Theorem 5.4]
it is shown that these are indeed q-polymatroids according to Definition 11.6.2.
Notice that this is slightly different from what is done in [841], where a pair of q-
polymatroids is associated to each rank-metric code. In this chapter, we choose to present
the material of [841] differently, in order to stress the following facts (stated following the
notation [841, Notation 5.3]):
(1) For $n < m$ the q-polymatroid that contains all the relevant information on $C$ is $(\mathbb{F}_q^n, \rho_c(C, \cdot))$.
(2) For $n > m$ the q-polymatroid that contains all the relevant information on $C$ is $(\mathbb{F}_q^m, \rho_r(C, \cdot))$.
(3) For n = m one needs to consider both (Fnq , ρc (C, ·)) and (Fnq , ρr (C, ·)), at least if
one wishes to have the property that equivalent codes have equivalent associated q-
polymatroids.
The code of [841, Example 2.10] shows that the definition of an associated (q, n)-
polymatroid given by Shiromoto for a rank-metric code C ⊆ Matn×n (Fq ) is not equivalence-
invariant.
Let $(\mathbb{F}_2^2, \rho_1)$ and $(\mathbb{F}_2^2, \rho_2)$ be the $(q, 2)$-polymatroids associated to $C$ and $C^T$ respectively, according to [1683, Proposition 3]. By definition, for any $V \subseteq \mathbb{F}_2^2$
$$\rho_1(V) = 2 - \dim(C(V^\perp)) = \dim(V)$$
and
$$\rho_2(V) = 2 - \dim(C^T(V^\perp)) = \begin{cases} \dim(V) & \text{if } V = 0,\ \langle(1,1)\rangle,\ \mathbb{F}_2^2, \\ 2 & \text{if } V = \langle(1,0)\rangle,\ \langle(0,1)\rangle. \end{cases}$$
The natural notion of equivalence for (q, r)-polymatroids is the following: (F`q , ρ1 ) and
(F`q , ρ2 ) are equivalent if there exists an Fq -linear isomorphism ϕ : F`q → F`q such that
ρ1 (V ) = ρ2 (ϕ(V )) for all V ⊆ F`q . Clearly (F22 , ρ1 ) and (F22 , ρ2 ) are not equivalent with
respect to such a notion of equivalence.
The interest in associating q-polymatroids to rank-metric codes comes from the fact that
many invariants of rank-metric codes can be computed from the associated q-polymatroids.
In fact, one could think of (equivalence classes of) q-polymatroids as invariants of the rank-
metric codes to which they are associated, since equivalent codes are associated to equivalent
q-polymatroids.
One can also show that the q-polymatroid(s) associated to the rank-metric code Γ(C)
associated to a vector rank-metric code C ⊆ Fnqm do not depend on the choice of the basis
Γ.
Proposition 11.6.13 ([1062, Corollary 4.7], [841, Proposition 6.10]) Let C ⊆ Fnqm
be a vector rank-metric code, and let Γ, Γ0 be Fq -bases of Fqm . Then
In the rest of this section, we discuss how to recover various invariants of rank-metric
codes from the associated q-polymatroids. We start with the simplest invariants, namely
the dimension and the minimum distance.
Proposition 11.6.14 ([841, Proposition 6.1 and Corollary 6.3]) Let $C \subseteq \mathbb{F}_q^{n\times m}$ be a rank-metric code. Then
$$\dim(C) = \max\{m,n\} \cdot \rho_C\big(\mathbb{F}_q^{\min\{m,n\}}\big)$$
and
$$d_{\min}(C) = \min\{m,n\} + 1 - \delta,$$
where
$$\delta = \min \left\{ k \;\middle|\; \rho_C(V) = \frac{\dim(C)}{\max\{m,n\}} \text{ for all } V \subseteq \mathbb{F}_q^{\min\{m,n\}} \text{ with } \dim(V) = k \right\}.$$
The next result shows how one can compute the generalized weights of a rank-metric
code from its associated q-polymatroid(s).
Theorem 11.6.15 ([841, Theorem 7.1]) Let $C \subseteq \mathbb{F}_q^{n\times m}$ be a rank-metric code. For $i = 1, 2, \ldots, \dim(C)$ let
If $n \neq m$, then
$$d_i(C) = d_i(P(C)) \quad \text{for } i = 1, 2, \ldots, \dim(C).$$
If $n = m$, then
Theorem 11.6.17 ([1683, Theorem 14]) Let $C \subseteq \mathbb{F}_q^{n\times m}$ be a rank-metric code of dimension $\dim(C) = \ell$. Assume that $n \le m$. Then
$$\operatorname{rwe}_C(x, y) = x^{n-\ell/m} \sum_{V \subseteq \mathbb{F}_q^n} (q^m x)^{\rho_C(\mathbb{F}_q^n) - \rho_C(V)}\, x^{-(\dim(V) - \rho_C(V))} \prod_{i=0}^{\dim(V)-1} (y - q^i x).$$
Finally, we state two results that show that the property of being MRD or an optimal
anticode can be characterized in terms of the associated q-polymatroid(s).
Theorem 11.6.18 ([841, Theorem 6.4 and Corollary 6.6]) Let $C \subseteq \mathbb{F}_q^{n\times m}$ be a rank-metric code with minimum distance $d_{\min}(C) = d$. The following are equivalent.
(a) $C$ is MRD.
(b) $\rho_C(V) = \dim(V)$ for all $V \subseteq \mathbb{F}_q^{\min\{m,n\}}$ with $\dim(V) \le \min\{m,n\} - d + 1$.
(c) $\rho_C(V) = \dim(V)$ for some $V \subseteq \mathbb{F}_q^{\min\{m,n\}}$ with $\dim(V) = \min\{m,n\} - d + 1$.
In particular, if $C$ is MRD with $d_{\min}(C) = d$, then $P(C) = \big(\mathbb{F}_q^{\min\{m,n\}}, \rho_C\big)$ where
$$\rho_C(V) = \begin{cases} \min\{m,n\} - d + 1 & \text{if } \dim(V) \ge \min\{m,n\} - d + 1, \\ \dim(V) & \text{if } \dim(V) \le \min\{m,n\} - d + 1. \end{cases}$$
The corresponding result for optimal anticodes is the following. Notice that the q-polymatroids associated to MRD codes or optimal anticodes are in fact q-matroids.
Theorem 11.6.19 ([841, Theorem 7.2 and Corollary 6.6]) Let $C \subseteq \mathbb{F}_q^{n\times m}$ be a rank-metric code with $r = \operatorname{max\,rank}(C)$. The following are equivalent.
(a) $C$ is an optimal anticode.
(b) $\{\rho_C(V) \mid V \subseteq \mathbb{F}_q^{\min\{m,n\}}\} = \{0, 1, \ldots, r\}$, or $\{\rho_{C^T}(V) \mid V \subseteq \mathbb{F}_q^n\} = \{0, 1, \ldots, r\}$ and $m = n$.
(c) $\rho_C\big(\mathbb{F}_q^{\min\{m,n\}}\big) = r$, or $\rho_{C^T}(\mathbb{F}_q^n) = r$ and $m = n$.
In particular, if $C$ is an optimal anticode with $r = \operatorname{max\,rank}(C)$, let
Theorem 11.6.20 Let $C \subseteq \mathbb{F}_q^{n\times m}$ be a rank-metric code and let $\mathcal{C} \subseteq \mathbb{F}_{q^m}^n$ be a vector rank-metric code. Let $\Gamma$ be a basis of $\mathbb{F}_{q^m}$ over $\mathbb{F}_q$. Then
Acknowledgement
Part of this chapter was written while the author was participating in the Nonlinear
Algebra program at ICERM in Fall 2018. The author wishes to thank ICERM and Brown
University for an excellent working environment.
Chapter 12
Linear Programming Bounds
Peter Boyvalenkov
Institute of Mathematics and Informatics, Bulgarian Academy of Sciences
Danyo Danev
Linköping University
where $\binom{z}{j} := z(z-1)\cdots(z-j+1)/j!$ for $z \in \mathbb{R}$.
Theorem 12.1.2 ([1229]) The polynomials $K_i^{(n,q)}(z)$ satisfy the three-term recurrence relation
$$(i+1)K_{i+1}^{(n,q)}(z) = \big(i + (q-1)(n-i) - qz\big)K_i^{(n,q)}(z) - (q-1)(n-i+1)K_{i-1}^{(n,q)}(z)$$
with initial conditions $K_0^{(n,q)}(z) = 1$ and $K_1^{(n,q)}(z) = n(q-1) - qz$.
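At integer arguments the recurrence is easy to implement. The sketch below (function names ours) builds $K_i^{(n,q)}(z)$ from the three-term recurrence of Theorem 12.1.2 and checks it against the standard closed form $K_i^{(n,q)}(z) = \sum_j (-1)^j (q-1)^{i-j}\binom{z}{j}\binom{n-z}{i-j}$ at integer points.

```python
from math import comb

def kraw_rec(n, q, i, z):
    """K_i^{(n,q)}(z) at an integer z, via the Theorem 12.1.2 recurrence."""
    k_prev, k_cur = 1, n*(q - 1) - q*z          # K_0, K_1
    if i == 0:
        return k_prev
    for t in range(1, i):
        # exact integer division: (t+1) K_{t+1} is the bracketed expression
        k_prev, k_cur = k_cur, ((t + (q - 1)*(n - t) - q*z)*k_cur
                                - (q - 1)*(n - t + 1)*k_prev) // (t + 1)
    return k_cur

def kraw_closed(n, q, i, z):
    """Standard closed form, valid for integer z in {0, ..., n}."""
    return sum((-1)**j * (q - 1)**(i - j) * comb(z, j) * comb(n - z, i - j)
               for j in range(i + 1))

assert all(kraw_rec(6, q, i, z) == kraw_closed(6, q, i, z)
           for q in (2, 3, 4) for i in range(7) for z in range(7))
print(kraw_rec(6, 2, 2, 1))  # 5, since K_2^{(6,2)}(z) = 2z^2 - 12z + 15
```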
where $\delta(i)$ is the Dirac-delta measure at $i \in \{0, 1, \ldots, n\}$, and the form
$$\langle f, g \rangle = \int f(t)g(t)\, d\mu_n(t) \tag{12.2}$$
define an inner product over the class $\mathcal{P}_n$ of real polynomials of degree less than or equal to $n$.
Theorem 12.1.4 (Orthogonality Relations [1229]) Under the inner product (12.2) the Krawtchouk polynomials satisfy
$$\sum_{u=0}^{n} K_i^{(n,q)}(u) K_j^{(n,q)}(u) (q-1)^u \binom{n}{u} = \delta_{i,j}\, q^n (q-1)^i \binom{n}{i}.$$
Theorem 12.1.5 (Expansion [1229]) If $f(z) = \sum_{i=0}^{n} f_i K_i^{(n,q)}(z)$, then
$$f_i = q^{-n} (q-1)^{-i} \binom{n}{i}^{-1} \sum_{u=0}^{n} f(u) K_i^{(n,q)}(u) (q-1)^u \binom{n}{u} = q^{-n} \sum_{u=0}^{n} f(u) K_u^{(n,q)}(i).$$
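Both identities can be verified numerically. The sketch below (helper names ours) checks the orthogonality relation of Theorem 12.1.4 on a small grid and recovers the coefficients of a polynomial from its integer-point values via the second expansion formula of Theorem 12.1.5.

```python
from math import comb

def K(n, q, i, z):
    """Krawtchouk K_i^{(n,q)}(z), standard closed form at integer z."""
    return sum((-1)**j * (q - 1)**(i - j) * comb(z, j) * comb(n - z, i - j)
               for j in range(i + 1))

n, q = 5, 3
# Theorem 12.1.4: sum_u K_i(u) K_j(u) (q-1)^u C(n,u) = delta_ij q^n (q-1)^i C(n,i)
for i in range(n + 1):
    for j in range(n + 1):
        lhs = sum(K(n, q, i, u)*K(n, q, j, u)*(q - 1)**u*comb(n, u)
                  for u in range(n + 1))
        rhs = (q**n)*(q - 1)**i*comb(n, i) if i == j else 0
        assert lhs == rhs

# Theorem 12.1.5, second form: f_i = q^{-n} sum_u f(u) K_u^{(n,q)}(i)
coeffs = [0, 2, 0, 5, 0, 1]                    # an arbitrary f = 2K_1 + 5K_3 + K_5
f = lambda z: sum(c*K(n, q, i, z) for i, c in enumerate(coeffs))
recovered = [sum(f(u)*K(n, q, u, i) for u in range(n + 1))//q**n
             for i in range(n + 1)]
assert recovered == coeffs
print("ok")
```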
Theorem 12.1.9 (Christoffel–Darboux Formula [1229]) For any real $z$ and $w$ and for any $i = 0, 1, \ldots, n$
$$(w - z)\, T_i^{n,q}(z, w) = \frac{i+1}{q(q-1)^i \binom{n}{i}} \left( K_{i+1}^{(n,q)}(z) K_i^{(n,q)}(w) - K_{i+1}^{(n,q)}(w) K_i^{(n,q)}(z) \right).$$
is called the dual to $f(z)$. Note that the dual to $\hat{f}(z)$ is $f(z)$ and also that $\hat{f}(i) = q^{n/2} f_i$ for any $i \in \{0, 1, \ldots, n\}$.
In Definition 1.9.1 in Chapter 1, $A_q(n, d)$ is defined for codes over $\mathbb{F}_q$. We now extend that definition to codes over an arbitrary alphabet $F_q$ of size $q$, as only the alphabet size is important and not its structure.
Theorem 12.2.4 (Linear Programming Bound for Codes [519, 521]) Let the real polynomial $f(z) = \sum_{i=0}^{n} f_i K_i^{(n,q)}(z)$ satisfy the conditions
(A1) $f_0 > 0$, $f_i \ge 0$ for $i = 1, 2, \ldots, n$;
(A2) $f(i) \le 0$ for $i = d, d+1, \ldots, n$.
Then $A_q(n, d) \le f(0)/f_0$.
Remark 12.2.5 The polynomial $f$ from Theorem 12.2.4 is implicit in Theorem 1.9.23(a) from Chapter 1 as $\sum_{i=0}^{n} B_i K_i^{(n,q)}(i)$, and it is normalized for $f_0$ (or $B_0$) to be equal to 1; note also the normalization of the Krawtchouk polynomials. Thus the bound $f(0)/f_0$ appears as $\sum_{i=0}^{n} B_i$ in Chapter 1.
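Delsarte's linear programming bound gives $A_q(n,d) \le f(0)/f_0$ for any polynomial with nonnegative Krawtchouk coefficients ($f_0 > 0$) that is nonpositive on $\{d, \ldots, n\}$. A minimal sketch (our own setup, not from the chapter): the classical choice $f(z) = d - z$, whose Krawtchouk expansion is $f_0 = d - n(q-1)/q$, $f_1 = 1/q$, recovers the Plotkin Bound $qd/(qd - (q-1)n)$.

```python
from fractions import Fraction
from math import comb

def K(n, q, i, z):
    """Krawtchouk K_i^{(n,q)}(z) at integer z (closed form)."""
    return sum((-1)**j * (q - 1)**(i - j) * comb(z, j) * comb(n - z, i - j)
               for j in range(i + 1))

n, q, d = 6, 2, 4
f = lambda z: d - z           # satisfies (A2): f(i) <= 0 for i = d, ..., n
# Krawtchouk coefficients via the second expansion formula of Theorem 12.1.5
coeffs = [Fraction(sum(f(u)*K(n, q, u, i) for u in range(n + 1)), q**n)
          for i in range(n + 1)]
assert coeffs[0] == Fraction(d) - Fraction(n*(q - 1), q)   # f_0 > 0 here
assert coeffs[1] == Fraction(1, q) and all(c == 0 for c in coeffs[2:])
bound = f(0) / coeffs[0]      # f(0)/f_0 = qd/(qd - (q-1)n), the Plotkin bound
print(bound)                  # 4, and indeed A_2(6,4) = 4
```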
Theorem 12.2.6 (Linear Programming Bound for Designs [519, 521]) Let the real polynomial $f(z) = \sum_{i=0}^{n} f_i K_i^{(n,q)}(z)$ satisfy the conditions
(B1) f0 > 0, fi ≤ 0 for i = τ + 1, τ + 2, . . . , n;
(B2) f (0) > 0, f (i) ≥ 0 for i = 1, 2, . . . , n.
Then $B_q(n, \tau) \ge f(0)/f_0$. Equality holds for designs $C \subseteq F_q^n$ with $\tau(C) = \tau$, distance distribution $(B_0, B_1, \ldots, B_n)$, dual distance distribution $(B_0', B_1', \ldots, B_n')$ and polynomial $f(z)$ such that $B_i f(i) = 0$ and $B_i' f_i = 0$ for every $i = 1, 2, \ldots, n$.
Remark 12.2.7 The Rao Bound in Theorem 12.3.3 and Levenshtein Bound in Theorem
12.3.10 below are obtained by suitable polynomials in Theorems 12.2.4 and 12.2.6, respec-
tively. Examples of codes attaining these bounds are listed in Table 12.1 presented later.
Theorem 12.2.8 (Duality [1231]) A real polynomial $f(z)$ satisfies the conditions (A1) and (A2) if and only if its dual polynomial $\hat{f}$ satisfies the conditions (B1) and (B2). Moreover,
$$\frac{f(0)}{f_0} \cdot \frac{\hat{f}(0)}{\hat{f}_0} = q^n.$$
Remark 12.2.9 Rephrased, the duality means that for any polynomial f which is good
for linear programming for codes, its dual fb is good for linear programming for designs (and
conversely). Thus we obtain, in a sense, bounds for free. In particular, the duality justifies
the pairs of bounds in Theorems 12.3.1, 12.3.3 and 12.3.10 below.
Definition 12.2.10 For a code $C \subseteq F_q^n$ and a function $h : \{1, 2, \ldots, n\} \to \mathbb{R}$ the potential energy of $C$ with respect to $h$ is defined to be
$$E_h(C) := \sum_{x, y \in C,\ x \neq y} h(d_H(x, y)).$$
Theorem 12.2.12 (Linear Programming Bound for Energy of Codes [426]) Let $n$ and $q$ be fixed, $h : (0, n] \to (0, +\infty)$ be a function, and $M \in \{2, 3, \ldots, q^n\}$. Let the real polynomial $f(z) = \sum_{i=0}^{n} f_i K_i^{(n,q)}(z)$ satisfy the conditions
(D1) $f_0 > 0$, $f_i \ge 0$ for $i = 1, 2, \ldots, n$;
(D2) $f(0) > 0$, $f(i) \le h(i)$ for $i = 1, 2, \ldots, n$.
Then $E_h(n, M; q) \ge M(f_0 M - f(0))$. Equality holds for codes $C \subseteq F_q^n$ with distance distribution $(B_0, B_1, \ldots, B_n)$, dual distance distribution $(B_0', B_1', \ldots, B_n')$ and polynomials $f(z)$, such that $B_i[f(i) - h(i)] = 0$ and $B_i' f_i = 0$ for every $i = 1, 2, \ldots, n$.
Remark 12.2.13 Theorems 12.2.4, 12.2.6 and 12.2.12 can be applied with the usual sim-
plex method for quite large parameters. For instance, the website [1610] (Delsarte, a.k.a.
Linear Programming (LP), upper bounds) offers a tool for computation of bounds via inte-
ger LP with Theorem 12.2.4. Several websites maintain tables of best known bounds (lower
and upper) for codes of relatively small lengths; see, e.g., [305].
Theorem 12.3.1 (Singleton Bound [1717]) For any code $C \subseteq F_q^n$ with minimum distance $d$ and dual distance $d'$
$$q^{d'-1} \le |C| \le q^{n-d+1}. \tag{12.3}$$
The bounds (12.3) can be attained only simultaneously, and this happens if and only if $d + d' = n + 2$ and all possible distances are realized (so the attaining code is an MDS code).
The Sphere Packing or Hamming Bound presented in Theorem 1.9.6 is another upper
bound on the code size, given q, n, and d. Rao gave a lower bound on the code size, given
q, n, and d0 . These bounds are combined in the following theorem as they are connected by
the duality.
Theorem 12.3.3 (Rao Bound [1566] and Hamming (Sphere Packing) Bound [895]) For any code $C \subseteq F_q^n$ with minimum distance $d$ and dual distance $d'$
$$H^{n,q}(d') \le |C| \le \frac{q^n}{H^{n,q}(d)}. \tag{12.4}$$
Definition 12.3.4 Codes attaining the upper bound in (12.4) are called perfect. Designs
attaining the lower bound in (12.4) are called tight.
Theorem 12.3.5 Let $C \subseteq F_q^n$ with $|C| > 1$. The following hold.
(a) $C$ is a tight design if and only if $d' = 2s - \delta + 1$.
(b) $C$ is a perfect code if and only if $d = 2s' - \delta' + 1$.
Definition 12.3.6 For fixed $n$, $q$, and $i$ denote by $\xi_i^{n,q}$ the smallest root of the Krawtchouk polynomial $K_i^{(n,q)}(z)$. In the case of $i = 0$ set $\xi_0^{n,q} = n + 1$.
Theorem 12.3.10 (Levenshtein Bound [1227, 1229, 1230]) For a code $C \subseteq F_q^n$ with minimum distance $d$ and dual distance $d'$
$$\frac{q^n}{L^{n,q}(d')} \le |C| \le L^{n,q}(d). \tag{12.5}$$
The lower (respectively upper) bound is attained if and only if $d \ge \max\{2s' - \delta', 2\}$ (respectively $d' \ge \max\{2s - \delta, 2\}$).
Example 12.3.11 In the first three relevant intervals, the Levenshtein (upper) Bound is given by
$$A_q(n, d) \le \frac{qd}{qd - (q-1)n} = L_1^{n,q}(d)$$
(which is the Plotkin Bound discussed in Section 1.9.3) if $n - \frac{n-1}{q} = \xi_1^{n-1,q} + 1 \le d \le \xi_0^{n-2,q} + 1 = n$,
$$A_q(n, d) \le \frac{q^2 d}{qd - (q-1)(n-1)} = L_2^{n,q}(d)$$
if $\xi_1^{n-2,q} + 1 \le d \le \xi_1^{n-1,q} + 1$, and
$$A_q(n, d) \le \frac{qd\,(n(q-1)+1)\,(n(q-1) - qd + 2 - q)}{qd\,(2n(q-1) - q + 2 - qd) - (n-1)(q-1)^2} = L_3^{n,q}(d)$$
if $\xi_2^{n-1,q} + 1 \le d \le \xi_1^{n-2,q} + 1$.
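The first two bounds are easy to evaluate. A short sketch (function names ours, following the displayed formulas for $L_1^{n,q}$ and $L_2^{n,q}$): at $d = n$ the Plotkin Bound gives $q$, at $(n,d,q) = (8,5,2)$ it gives 5, and $L_2^{8,2}(4) = 16$ is attained, since $A_2(8,4) = A_2(7,3) = 16$ (the binary Hamming code). The two bounds also agree at the shared endpoint $d = \xi_1^{n-1,q} + 1$.

```python
def L1(n, d, q):
    """Plotkin-type first Levenshtein bound, valid for d >= n - (n-1)/q."""
    return q*d / (q*d - (q - 1)*n)

def L2(n, d, q):
    """Second Levenshtein bound, valid on [xi_1^{n-2,q}+1, xi_1^{n-1,q}+1]."""
    return q*q*d / (q*d - (q - 1)*(n - 1))

assert L1(8, 8, 2) == 2.0               # at d = n the bound equals q
assert L1(8, 5, 2) == 5.0               # A_2(8,5) <= 5 (true value is 4)
assert L2(8, 4, 2) == 16.0              # attained: A_2(8,4) = 16
assert L1(8, 4.5, 2) == L2(8, 4.5, 2) == 9.0   # continuity at xi_1^{7,2}+1 = 4.5
print("ok")
```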
Remark 12.3.12 It is also worth noting two important values of the Levenshtein Bound:
$$L^{n,q}\big(\xi_{k-1+\varepsilon}^{n-2+\varepsilon,q} + 1\big) = H^{n,q}(2k - 1 + \varepsilon) \quad \text{for } \varepsilon \in \{0, 1\}.$$
So at the ends of the intervals $\big[\xi_k^{n-1-\varepsilon,q} + 1,\ \xi_{k-1+\varepsilon}^{n-2+\varepsilon,q} + 1\big]$ the Levenshtein Bound coincides with the corresponding Rao Bound.
Recall the kernels Tin,q from Definition 12.1.8 and the parameters k = k(d) and ε = ε(d)
from Lemma 12.3.7(b). The next theorem gives a Gauss–Jacobi quadrature formula (12.6),
introduced by Levenshtein in [1228, 1229], which is instrumental in proofs of Theorem
12.3.16 and Theorem 12.3.21(a). Theorem 12.3.13 also introduces parameters needed for
the universal bound (12.8) below.
and
$$\rho_{k+1}^{(d)} = \frac{q^{-\varepsilon} K_k^{(n-\varepsilon,q)}(d)}{K_k^{(n-\varepsilon,q)}(d)\, K_{k-1}^{(n-1-\varepsilon,q)}(-1) - K_{k-1}^{(n-1-\varepsilon,q)}(d-1)\, K_k^{(n-\varepsilon,q)}(0)}.$$
to obtain the bounds (12.5). It was shown in [289] that its zeros $\alpha_1, \ldots, \alpha_{k+\varepsilon}$ strongly suggest the optimal choice of nodes for the simplex method of Theorem 12.2.4 (equivalently, Theorem 1.9.23(a) from Chapter 1). Computational experiments show that simple replacement of any double zero $\alpha_i \in (j, j+1)$ of Levenshtein's polynomial (12.7) by two simple zeros $j$ and $j+1$ gives in most cases (conjecture: for every sufficiently large rate $d/n$) the best result that can ever be obtained from Theorem 12.2.4.
Remark 12.3.15 The assertions of Theorem 12.3.13 remain true when $d \in [1, n]$ is a continuous variable. In particular, the quadrature formula (12.6) holds true for any real polynomial $f(z)$ of degree at most $2k - 1 + \varepsilon$ with well defined roots of $g_d(z)$ and corresponding positive $\rho_i^{(d)}$ for $i = 1, 2, \ldots, k + 1$.
Theorem 12.3.16 ([1228, 1703]) The upper bound in (12.5) cannot be improved by a
real polynomial f of degree at most 2k − 1 + ε satisfying (A1) and the condition f (z) ≤ 0
for z ∈ [d, n].
where $k = k(d)$, $\varepsilon = \varepsilon(d)$, and $\rho_i^{(d)}$ are as in Theorem 12.3.13. Note that $P_j^{n,q} = 0$ for $j \le 2k - 1 + \varepsilon$. Note also that $K_j^{(n,q)}(0) = (q-1)^j \binom{n}{j}$.
Theorem 12.3.18 ([287, 1230]) The upper bound in (12.5) can be improved by a real
polynomial f of degree at least 2k + ε satisfying (A1) and the condition f (z) ≤ 0 for
z ∈ [d, n] if and only if Pjn,q (d) < 0 for some j ≥ 2k + ε.
where the parameters $\alpha_i$ and $\rho_i^{(d)}$ are determined as in Theorem 12.3.13. Equality is attained if and only if there exists a code with $M$ codewords which attains the upper bound in (12.5).
Theorem 12.3.21 ([291]) Let h be a strictly completely monotone function. The bound
(12.8):
The conditions for attaining the bounds (12.5) and (12.8) coincide. Thus, any code
which attains the upper bound in (12.5) for its cardinality (and therefore the lower bound
for (12.8) for its energy) is universally optimal; see Table 12.1. We summarize this in the
next theorem.
Theorem 12.3.23 ([291, 426]) All codes which attain the upper bound in (12.5) are uni-
versally optimal.
Theorem 12.3.24 ([426]) Let C ⊆ Fqn be a code and h : (0, n] → R be any function such
that the bound in Theorem 12.2.12 is attained. Let c ∈ C. Then
Example 12.3.25 The Kerdock codes $K_m \subset \mathbb{F}_2^{2^{2m}}$ [1107] are nonlinear codes existing for lengths $n = 2^{2m}$. Their cardinality is $|K_m| = n^2 = 2^{4m}$ and their distance (weight) distribution is as follows:
$$B_{2^{2m-1} - 2^{m-1}} = 2^{2m}(2^{2m-1} - 1), \quad B_{2^{2m-1}} = 2^{2m+1} - 2, \quad B_{2^{2m-1} + 2^{m-1}} = 2^{2m}(2^{2m-1} - 1),$$
$B_0 = B_n = 1$, and $B_i = 0$ for $i \notin \{0,\ 2^{2m-1} - 2^{m-1},\ 2^{2m-1},\ 2^{2m-1} + 2^{m-1},\ n\}$.
It is easy to check that the Kerdock codes are asymptotically optimal with respect to the
upper bound in (12.5) and the bound (12.8) as they are very close to the bounds already
for small m.
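As a quick consistency check of the stated distribution (our own sketch), the five nonzero entries must sum to the cardinality $|K_m| = 2^{4m}$:

```python
for m in range(2, 8):
    n = 2**(2*m)                      # code length
    w = 2**(2*m - 1)                  # middle weight
    e = 2**(m - 1)                    # offset of the two side weights
    B = {0: 1, n: 1,
         w: 2**(2*m + 1) - 2,
         w - e: 2**(2*m) * (2**(2*m - 1) - 1),
         w + e: 2**(2*m) * (2**(2*m - 1) - 1)}
    assert sum(B.values()) == n**2    # |K_m| = n^2 = 2^{4m}
print("ok")
```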
Definition 12.3.26 For any code C ⊆ Fqn , ρ(C) := max {dH (x, C) | x ∈ Fqn } is called the
covering radius of C. Here dH (x, C) := min {dH (x, c) | c ∈ C}.
Theorem 12.3.27 (Delsarte Bound [521]) For any code $C \subseteq F_q^n$ with external distance $s'$, $\rho(C) \le s'$.
Theorem 12.3.28 (Tietäväinen Bound [1807, 1808]) For any code $C \subseteq F_q^n$ with dual distance $d' = 2k - 1 + \varepsilon$, where $k$ is a positive integer and $\varepsilon \in \{0, 1\}$,
$$\rho(C) \le \xi_k^{n-1+\varepsilon,q}.$$
TABLE 12.1: Parameters of known codes attaining (simultaneously) the upper bound in
(12.5) (see also [286]) and the lower bound (12.8). This table appears for (12.5) in [1228].
Here the column for the external distance s0 is added. All codes in the table are universally
optimal.
$$\langle x, y \rangle := x_1 y_1 + x_2 y_2 + \cdots + x_n y_n.$$
Note that on $S^{n-1}$ the distance and the inner product are connected by
$$\langle x, y \rangle = 1 - \frac{d_E^2(x, y)}{2}.$$
Definition 12.4.2 An $(n, M, s)$-spherical code is a nonempty finite set $C \subset S^{n-1}$ with cardinality $|C| = M$ and maximal inner product
$$s = s(C) := \max\{\langle x, y \rangle \mid x, y \in C,\ x \neq y\}.$$
The minimum distance $d = d_E(C) := \min\{d_E(x, y) \mid x, y \in C,\ x \neq y\}$ and the maximal inner product are connected by
$$s = 1 - \frac{d^2}{2}.$$
Definition 12.4.3 For fixed $n$ and $s$ denote by $A(n, s)$ the maximal cardinality of an $(n, M, s)$-spherical code.
For $a, b \in \{0, 1\}$, let $P_i^{a,b}(t)$ denote the Jacobi polynomials with parameters
$$(\alpha, \beta) = \left(a + \frac{n-3}{2},\ b + \frac{n-3}{2}\right),$$
normalized by $P_i^{\alpha,\beta}(1) = 1$. When $(a, b) = (0, 0)$ we get the Gegenbauer polynomials and use the $(n)$ indexing instead of $0,0$. Denote by $t_i^{a,b}$ the greatest zero of the polynomial $P_i^{a,b}(t)$ and define $t_0^{1,1} = -1$. Let
$$r_i := \frac{2i + n - 2}{i + n - 2} \binom{i + n - 2}{i}.$$
(n)
Theorem 12.4.5 ([8, 1766]) The Gegenbauer polynomials {Pi (t)}∞
i=0 satisfy the recur-
rence relations
(n) (n) (n)
(i + n − 2)Pi+1 (t) = (2i + n − 2)tPi (t) − iPi−1 (t) for i = 1, 2, . . .
(n) (n)
where P0 (t) = 1 and P1 (t) = t.
Theorem 12.4.6 ([8, 1766]) The Gegenbauer polynomials are orthogonal on $[-1, 1]$ with respect to the measure
$$d\mu(t) := \gamma_n (1 - t^2)^{\frac{n-3}{2}}\, dt \quad \text{for } t \in [-1, 1]$$
where $\gamma_n := \Gamma\big(\tfrac{n}{2}\big) \big/ \sqrt{\pi}\, \Gamma\big(\tfrac{n-1}{2}\big)$ is a normalizing constant.
Theorem 12.4.7 (Linear Programming Bound for Spherical Codes [527, 1067]) Let $n \ge 2$ and $f(t)$ be a real polynomial such that
(A1) $f(t) \le 0$ for $-1 \le t \le s$, and
(A2) the coefficients in the Gegenbauer expansion $f(t) = \sum_{i=0}^{\deg(f)} f_i P_i^{(n)}(t)$ satisfy $f_0 > 0$, $f_i \ge 0$ for $i = 1, 2, \ldots, \deg(f)$.
Then $A(n, s) \le f(1)/f_0$.
Theorem 12.4.8 (Levenshtein Bound [1226, 1228]) For the quantity $A(n, s)$ we have
$$A(n, s) \le L_\tau(n, s) := \left(1 - \frac{P_{k-1+\varepsilon}^{1,0}(s)}{P_k^{0,\varepsilon}(s)}\right) \sum_{i=0}^{k-1+\varepsilon} r_i \quad \text{for all } s \in I_\tau. \tag{12.9}$$
Example 12.4.9 The first three bounds in (12.9) are $A(n, s) \le (s-1)/s$ for $s \in [-1, -1/n]$,
$$A(n, s) \le \frac{2n(1 - s)}{1 - ns}$$
for $s \in [-1/n, 0]$, and
$$A(n, s) \le \frac{n(1 - s)(2 + (n+1)s)}{1 - ns^2}$$
for $s \in \left[0,\ \frac{\sqrt{n+3} - 1}{n+2}\right]$.
(σn is the normalized (n − 1)-dimensional Hausdorff measure) holds for all polynomials
p(x) = p(x1 , x2 , . . . , xn ) of degree at most τ .
Theorem 12.4.11 (Linear Programming Bound for Spherical Designs [527]) Let
n ≥ 2, τ ≥ 1 and f (t) be a real polynomial such that
(B1) f (t) ≥ 0 for −1 ≤ t ≤ 1, and
(B2) the coefficients in the Gegenbauer expansion $f(t) = \sum_{i=0}^{\deg(f)} f_i P_i^{(n)}(t)$ satisfy $f_0 > 0$, $f_i \le 0$ for $i = \tau + 1, \ldots, \deg(f)$.
Definition 12.4.13 A spherical τ -design on Sn−1 is called tight if it attains the bound
(12.10).
Theorem 12.4.14 ([113, 114]) Let n ≥ 3. Tight spherical τ -designs on Sn−1 exist for
τ = 1, 2 and 3 for every n, and possibly for τ = 4, 5, 7, and 11. Tight spherical 4-designs
on Sn−1 exist for n = 6, 22, and possibly for n = m2 − 3, where m ≥ 7 is an odd integer.
Tight spherical 5-designs on Sn−1 exist for n = 3, 7, 23, and possibly for n = m2 − 2, where
m ≥ 7 is an odd integer. Tight spherical 7-designs on Sn−1 exist for n = 8, 23, and possibly
for n = 3m2 − 4, where m ≥ 4 is an integer. Tight spherical 11-designs on Sn−1 exist only
for n = 24.
Remark 12.4.15 ([527]) Tight spherical 4- and 5-designs coexist and are known for $m = 3$ and $5$ only. Tight spherical 7-designs are known for $m = 2$ and $3$ only.
See also [111, 112, 232] for general theory and [263, 709, 1734] for other bounds for
designs.
Theorem 12.4.16 ([1226, 1227, 1228, 1230]) The bounds (12.9) and (12.10) are re-
lated by the equalities
Definition 12.4.17 Given an (extended real-valued) function $h : [-1, 1] \to [0, +\infty]$, the $h$-energy (or potential energy) of $C$ is given by
$$E_h(C) := \sum_{x, y \in C,\ x \neq y} h(\langle x, y \rangle).$$
Definition 12.4.20 A real valued extended function $h : [-1, 1] \to (0, +\infty]$ is called absolutely monotone if $h^{(k)}(t) \ge 0$ for every $t \in [-1, 1)$ and every integer $k \ge 0$, and $h(1) = \lim_{t \to 1^-} h(t)$.
where $r_i^{a,b} = 1 \big/\, c^{a,b} \int_{-1}^{1} \big(P_i^{a,b}(t)\big)^2 (1-t)^a (1+t)^b\, d\mu(t)$, $c^{1,0} = c^{0,0} = \gamma_n$, and $c^{1,1} = \gamma_{n+2}$ (see Theorem 12.4.6 for relevant notation). Note that $r_i^{0,0} = r_i$.
Let $\alpha_1 < \alpha_2 < \cdots < \alpha_{k+\varepsilon}$ be the roots of the polynomial used for obtaining the Levenshtein Bound $L_\tau(n, s)$ with $s = \alpha_{k+\varepsilon}$, $\tau = 2k - 1 + \varepsilon$, $\varepsilon \in \{0, 1\}$, and let
$$\rho_1 := \frac{T_k^{0,0}(s, 1)}{T_k^{0,0}(-1, -1)\, T_k^{0,0}(s, 1) - T_k^{0,0}(-1, 1)\, T_k^{0,0}(s, -1)} \quad \text{for } \varepsilon = 1, \text{ and}$$
$$\rho_{i+\varepsilon} := \frac{1}{c^{1,\varepsilon} (1 + \alpha_{i+\varepsilon})^\varepsilon (1 - \alpha_{i+\varepsilon})\, T_{k-1}^{1,\varepsilon}(\alpha_{i+\varepsilon}, \alpha_{i+\varepsilon})} \quad \text{for } i = 1, 2, \ldots, k.$$
Theorem 12.4.22 ([290]) Let $n \ge 2$, $\tau = 2k - 1 + \varepsilon \ge 1$, $\varepsilon \in \{0, 1\}$, and $h$ be absolutely monotone in $[-1, 1]$. For $M \in (D(n, \tau), D(n, \tau + 1)]$ let $s$ be the largest root of the equation $M = L_\tau(n, s)$ and $\alpha_1 < \alpha_2 < \cdots < \alpha_{k+\varepsilon} = s$, $\rho_1, \rho_2, \ldots, \rho_{k+\varepsilon}$ be as in Definition 12.4.21. Then
$$E_h(n, M) \ge M^2 \sum_{i=1}^{k+\varepsilon} \rho_i h(\alpha_i). \tag{12.11}$$
If an $(n, M, s)$ code attains (12.11), then it is a spherical $\tau$-design and its inner products form the set $\{\alpha_1, \alpha_2, \ldots, \alpha_{k+\varepsilon}\}$.
See also [245, Chapter 5]. Note that α1 , α2 , . . . , αk+ε = s are in fact the roots of the
equation M = Lτ (n, s).
Remark 12.4.23 The conditions for attaining the bounds (12.9) and (12.11) coincide.
Thus a spherical (n, M, s) code attains (12.9) if and only if it attains (12.11). In particular,
every tight spherical design attains (12.11).
Theorem 12.4.24 ([1703] for (12.9), [290] for (12.11)) The bounds (12.9) and (12.11)
cannot be improved by using in Theorem 12.4.7 and 12.4.19, respectively, polynomials of the
same or lower degree.
Theorem 12.4.25 ([288] for (12.9), [290] for (12.11)) For $j > \tau = 2k - 1 + \varepsilon$ and $\varepsilon \in \{0, 1\}$, let
$$Q_j^{(n)} := \frac{1}{L_\tau(n, \alpha_{k+\varepsilon})} + \sum_{i=1}^{k+\varepsilon} \rho_i P_j^{(n)}(\alpha_i).$$
The bounds (12.9) and (12.11) (where $M = L_\tau(n, \alpha_{k+\varepsilon})$ for (12.11)) can be (simultaneously) improved if and only if $Q_j^{(n)} < 0$ for some $j > \tau$.
Theorem 12.4.27 ([424]) Every spherical code which is a spherical $(2k-1)$-design and which admits exactly $k$ inner products between distinct points is universally optimal. The 600-cell (the unique $(4, 120, (1+\sqrt{5})/4)$-spherical code) is universally optimal.
Remark 12.4.28 Any code which attains (12.9) (and (12.11)) is universally optimal. It
is unknown if there exists a spherical code, apart from the 600-cell, which is universally
optimal but does not attain (12.9) (and (12.11)). Tables with all known codes which attain
(12.9) (and (12.11)) can be found in [424, 1228, 1230].
Example 12.4.29 Consider the case $(n, M) = (4, 24)$. The well known code $D_4$ (the $D_4$ root system; equivalently, the set of vertices of the regular 24-cell) is optimal in the sense that it realizes the fourth kissing number [1411]. However, this code is not universally optimal [421], despite having energy which is very close to the bound (12.11). For example, with the Newtonian potential $h(t) = \frac{1}{2(1-t)}$ it has energy 334, while (12.11) gives 333 (which can be improved to $\approx 333.157$).
Conjecture 12.4.30 Every universally optimal spherical code attains the Linear Program-
ming Bound of Theorem 12.4.19.
Example 12.4.31 (Example 12.3.25 continued) A standard construction (see [442, Chapter 5]) maps binary codes from F_2^n to the sphere S^{n−1}: the coordinates 0 and 1 are replaced
by ±1/√n, respectively. Denote this map by x ↦ x̄. The inner product ⟨x̄, ȳ⟩ on S^{n−1} and
the Hamming distance d_H(x, y) in F_2^n are connected by

⟨x̄, ȳ⟩ = 1 − 2d_H(x, y)/n.

Thus the weights of the Kerdock code K_m correspond to the inner products 1, 1/√n, 0, −1/√n, −1, respectively.
The image of K_m in S^{2^m−1} is asymptotically optimal with respect to both bounds
(12.9) and (12.11). For example, its energy coincides with the asymptotic of (12.11) (obtained by a polynomial of degree 5).
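The identity ⟨x̄, ȳ⟩ = 1 − 2d_H(x, y)/n can be checked exhaustively for small n (a sketch; the choice n = 4 is illustrative, not from the text):

```python
from itertools import product
from math import isclose, sqrt

n = 4
# Map F_2^n to S^{n-1}: coordinate 0 becomes +1/sqrt(n), coordinate 1 becomes -1/sqrt(n).
to_sphere = lambda x: tuple((1 if b == 0 else -1) / sqrt(n) for b in x)

words = list(product((0, 1), repeat=n))
for x in words:
    xs = to_sphere(x)
    assert isclose(sum(a * a for a in xs), 1.0)   # the image lies on S^{n-1}
    for y in words:
        d = sum(a != b for a, b in zip(x, y))     # Hamming distance d_H(x, y)
        inner = sum(a * b for a, b in zip(xs, to_sphere(y)))
        assert isclose(inner, 1 - 2 * d / n)      # <xbar, ybar> = 1 - 2 d_H / n
```
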
Theorem 12.5.2 (Linear Programming Bound for Binomial Moments [68]) Let
C ⊆ F_2^n be a code with distance distribution (B_0, B_1, ..., B_n), let w ∈ {1, 2, ..., n}, and let

B_w := Σ_{i=1}^w (n−i choose n−w) B_i.

Let the real polynomial f(z) = Σ_{i=0}^n f_i K_i^{(n,2)}(z) satisfy the conditions

(i) f_i ≥ 0 for i = 1, 2, ..., n, and
(ii) f(j) ≤ (n−j choose n−w) for j = 1, 2, ..., n.

Then B_w ≥ f_0|C| − f(0).
The final bound in this chapter is an upper bound on the size of a quantum code.
Quantum codes are the subject of Chapter 27.
Theorem 12.5.3 (Linear Programming Bound for Quantum Codes [70]; see also
[1188]) Let f(z) = Σ_{i=0}^n f_i K_i^{(n,4)}(z) be a polynomial satisfying the conditions

(i) f_i ≥ 0 for every i = 0, 1, ..., n,
(ii) f(z) > 0 for every z = 0, 1, ..., d − 1, and
(iii) f(z) ≤ 0 for every z = d, d + 1, ..., n.

Then every ((n, K)) quantum code of minimum distance d satisfies

K ≤ (1/2^n) max_{j∈{0,1,...,d−1}} f(j)/f_j.
Acknowledgement
A significant part of the work of the first author on this chapter was done during his stay
(August-December 2018) as a visiting professor in the Department of Mathematical Sciences
at Purdue University Fort Wayne. His research was supported, in part, by a Bulgarian NSF
contract DN02/2-2016.
Chapter 13
Semidefinite Programming Bounds for
Error-Correcting Codes
Frank Vallentin
Universität zu Köln
13.1 Introduction
Linear programming bounds belong to the most powerful and flexible methods to obtain
bounds for extremal problems in coding theory as described in Chapter 12. Initially, Del-
sarte [519] developed linear programming bounds in the algebraic framework of association
schemes. A central example in Delsarte’s theory is finding upper bounds for the parame-
ter A2 (n, d), the largest number of codewords in a binary code of length n with minimum
Hamming distance d.
The application of linear programming bounds led to the best known asymptotic bounds,
such as the MRRW Bound [1365]. It was realized that linear programming bounds are also
applicable to finite and infinite two-point homogeneous spaces [442, Chapter 9]. These are
metric spaces in which the symmetry group acts transitively on pairs of points having the
same distance. So one can treat metric spaces like the q-ary Hamming space Fnq , the sphere,
real/complex/quaternionic projective space, or Euclidean space [423].
In recent years, semidefinite programming bounds have been developed with two aims:
to strengthen linear programming bounds and to find bounds for more general spaces.
Semidefinite programs are convex optimization problems which can be solved efficiently
and which are a vast generalization of linear programs. The optimization variable of a
semidefinite program is a positive semidefinite matrix whereas it is a nonnegative vector for
a linear program.
Schrijver [1634] was the first who applied semidefinite programming bounds to improve
the known upper bounds for A2 (n, d) for many parameters n and d. The underlying idea
is that linear programming bounds only exploit constraints involving pairs of codewords,
whereas semidefinite programming bounds can exploit constraints between triples, quadru-
ples, . . . of codewords.
This chapter introduces semidefinite programming bounds with an emphasis on error-
correcting codes. The structure of the chapter is as follows.
In Section 13.2 the basic theory of linear and semidefinite programming is reviewed in
the framework of conic programming.
Semidefinite programming bounds can be viewed as semidefinite programming hierar-
chies for difficult combinatorial optimization problems. One can express the computation of
A2 (n, d) as finding the independence number of an appropriate graph G(n, d) and apply the
Lasserre hierarchy to find upper bounds for A2 (n, d). This approach is explained in Section
13.3.
The graph G(n, d) has exponentially many vertices and the Lasserre hierarchy for G(n, d)
employs matrices whose rows and columns are indexed by all t-element subsets of G(n, d);
so a computation of the semidefinite programs is not directly possible. However, the graph
has many symmetries and these symmetries can be exploited to substantially reduce the
size of the semidefinite programs. The technique of symmetry reduction is the subject of
Section 13.4. There, this technique is applied to the graph G(n, d) and the result of Schrijver
is explained.
After Schrijver’s breakthrough result, semidefinite programming bounds were developed
for different settings. These developments are reviewed in Section 13.5.
A conic program asks to optimize a linear functional over the intersection of a proper convex
cone with an affine subspace. See Nemirovski [1429] and the references therein for a detailed
overview of conic programming.
Let E be an n-dimensional real or complex vector space equipped with a real-valued
inner product h·, ·iE : E × E → R.
Definition 13.2.1 A set K ⊆ E is called a (convex) cone if for all x, y ∈ K and all
nonnegative numbers α, β ∈ R_+ one has αx + βy ∈ K. A convex cone K is called pointed
if K ∩ (−K) = {0}. A convex cone is called proper if it is pointed, closed, and full-dimensional. The dual cone of a convex cone K is given by

K* = {y ∈ E | ⟨x, y⟩_E ≥ 0 for all x ∈ K}.
The simplest convex cones are finitely generated cones; the vectors x_1, ..., x_N ∈ E
determine the finitely generated cone K by

K = cone{x_1, ..., x_N} = { Σ_{i=1}^N α_i x_i | α_1, ..., α_N ≥ 0 }.

A proper convex cone K defines a partial order on E by

x ⪰_K y if and only if x − y ∈ K.
To define a conic program, we fix the space E, a proper convex cone K ⊆ E, an
m-dimensional vector space F with a real-valued inner product ⟨·,·⟩_F : F × F → R, a linear
operator A : E → F with adjoint A^T : F → E, and vectors b ∈ F and c ∈ E. The primal
conic program and its dual are

p* = sup{ ⟨x, c⟩_E | x ∈ K, Ax = b }  and  d* = inf{ ⟨b, y⟩_F | y ∈ F, A^T y − c ∈ K* }.

The vector x ∈ E is the optimization variable of the primal conic program; the vector
y ∈ F is the optimization variable of the dual. A vector x is called a feasible solution for
the primal if x ∈ K and Ax = b. It is called a strictly feasible solution if additionally x
lies in the interior of K. It is called an optimal solution if x is feasible and p* = ⟨x, c⟩_E.
Similarly, a vector y is called feasible for the dual if A^T y − c ∈ K*, and it is called strictly
feasible if A^T y − c lies in the interior of K*. It is called optimal if y is feasible and d* = ⟨b, y⟩_F.
The Bipolar Theorem (see for example Barvinok [134] or Simon [1714]) states that
(K ∗ )∗ = K when K is a proper convex cone. From this it follows easily that taking the dual
of the dual conic program gives a conic program which is equivalent to the primal.
Duality theory of conic programs looks at the (close) relationship between the primal
and dual conic programs. In particular duality can be used to systematically find upper
bounds for the primal program and lower bounds for the dual program.
(c) strong duality: Suppose that primal and dual conic programs both have a strictly
feasible solution. Then p∗ = d∗ and both primal and dual possess an optimal solution.
Definition 13.2.4 The nonnegative orthant is the following proper convex cone:

R^n_+ = {x ∈ R^n | x_1 ≥ 0, ..., x_n ≥ 0}.

The nonnegative orthant is self-dual, (R^n_+)* = R^n_+. So, for a matrix A ∈ R^{m×n}, a vector
b ∈ R^m, and a vector c ∈ R^n, we get the primal linear program

p* = sup{ c^T x | x ∈ R^n_+, Ax = b }

and its dual

d* = inf{ b^T y | y ∈ R^m, A^T y − c ∈ R^n_+ }.

For semidefinite programs we work in the space S^n of symmetric n × n matrices, equipped
with the trace inner product ⟨X, Y⟩_Tr = Tr(X^T Y), where Tr denotes the trace of a matrix, that is, the sum of the diagonal entries. For the
convex cone K we choose the cone of positive semidefinite matrices.
Definition 13.2.5 The cone of positive semidefinite matrices is the proper convex
cone

S^n_+ = {X ∈ S^n | X is positive semidefinite}.
Let us recall that a matrix X is positive semidefinite if and only if for all x ∈ Rn we
have x^T Xx ≥ 0. Alternatively, looking at a spectral decomposition of X given by

X = Σ_{i=1}^n λ_i u_i u_i^T,

the matrix X is positive semidefinite if and only if all eigenvalues λ_i are nonnegative.
The feasible region of a semidefinite program, the intersection of the cone of positive
semidefinite matrices with an affine subspace, is called a spectrahedron.
Spectrahedra are generalizations of polyhedra. They are central objects in convex algebraic geometry; see [221].
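The eigenvalue characterization of positive semidefiniteness can be illustrated in the 2 × 2 case (a sketch; the trace/determinant formulas used are standard linear algebra, not from the text):

```python
import itertools, math

def eig2(a, b, c):
    """Eigenvalues of the symmetric 2x2 matrix [[a, b], [b, c]]."""
    tr, det = a + c, a * c - b * b
    disc = math.sqrt(tr * tr - 4 * det)  # discriminant is nonnegative for symmetric matrices
    return (tr - disc) / 2, (tr + disc) / 2

def is_psd(a, b, c):
    """X is positive semidefinite iff all eigenvalues are nonnegative."""
    return min(eig2(a, b, c)) >= -1e-12

# X = [[2, 1], [1, 2]] is psd; Y = [[1, 2], [2, 1]] is not.
assert is_psd(2, 1, 2) and not is_psd(1, 2, 1)

# Equivalent characterization: x^T X x >= 0 for all x (sampled on a grid).
for x0, x1 in itertools.product(range(-5, 6), repeat=2):
    assert 2*x0*x0 + 2*x0*x1 + 2*x1*x1 >= 0       # x^T X x stays nonnegative
assert any(x0*x0 + 4*x0*x1 + x1*x1 < 0            # x^T Y x goes negative
           for x0, x1 in itertools.product(range(-5, 6), repeat=2))
```
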
Under mild technical assumptions one can solve semidefinite programming problems in
polynomial time. The following theorem was proved in Grötschel, Lovász, and Schrijver [857]
using the Ellipsoid Algorithm and by de Klerk and Vallentin [499] using the Interior-point
Algorithm.
Theorem 13.2.7 Consider the primal semidefinite program (13.1) with rational input C,
A_1, ..., A_m, and b_1, ..., b_m. Suppose we know a rational point X_0 ∈ F and positive rational
numbers r, R so that

B(X_0, r) ⊆ F ⊆ B(X_0, R),

where B(X_0, r) is the ball of radius r, centered at X_0, in the affine subspace determined by
the constraints ⟨A_i, X⟩_Tr = b_i, i = 1, ..., m.
For every positive rational number ε > 0 one can find in polynomial time a rational matrix
X* ∈ F such that

⟨C, X*⟩_Tr − p* ≤ ε,

where the polynomial is in n, m, log_2(R/r), log_2(1/ε), and the bit size of the data X_0, C,
A_1, ..., A_m, and b_1, ..., b_m.
Theorem 13.3.6 The Lasserre Bound of G of order t forms a hierarchy of stronger and
stronger upper bounds for the independence number of G. In particular,

α(G) ≤ las_n(G) ≤ · · · ≤ las_2(G) ≤ las_1(G)

holds.

Proof: To show that α(G) ≤ las_t(G) for every 1 ≤ t ≤ n we construct a feasible solution
y ∈ R_+^{P_{2t}(V)} from any independent set I of G. This feasible solution will satisfy |I| = Σ_{i∈V} y_i
and the desired inequality follows. For this, we simply set y to be equal to the characteristic
vector χ^I ∈ R_+^{P_{2t}(V)} defined by

χ^I_J = 1 if J ⊆ I, and χ^I_J = 0 otherwise.
Clearly, y satisfies the conditions y_∅ = 1 and y_{ij} = 0 if i and j are adjacent. The moment
matrix M_t(y) is positive semidefinite because it is a rank-one matrix of the form (note the
slight abuse of notation here: χ^I is now interpreted as a vector in R^{P_t(V)})

M_t(y) = χ^I (χ^I)^T, where M_t(y)_{J,J'} = y_{J∪J'} = χ^I_J χ^I_{J'} and χ^I ∈ R_+^{P_t(V)}.

Since M_t(y) occurs as a principal submatrix of M_{t+1}(y), the inequality las_{t+1}(G) ≤
las_t(G) follows.
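The rank-one structure of the moment matrix in this proof can be checked on a toy instance (a sketch; the path graph on three vertices and the order t = 1 are illustrative choices, not from the text):

```python
from itertools import combinations

# Hypothetical small instance: the path graph on V = {0, 1, 2} with
# edges {0,1} and {1,2}; I = {0, 2} is a maximum independent set.
V, E, I = [0, 1, 2], [{0, 1}, {1, 2}], {0, 2}

# P_1(V): subsets of V of size at most 1 (they index M_1(y)).
P1 = [frozenset(s) for k in (0, 1) for s in combinations(V, k)]

# Characteristic vector of I: chi_J = 1 iff J is a subset of I.
def chi(J):
    return 1 if J <= I else 0

# y_J = chi_J; the moment matrix M_1(y)_{J,J'} = y_{J union J'}.
M = [[chi(J | Jp) for Jp in P1] for J in P1]

# M_1(y) equals the rank-one matrix chi * chi^T, hence it is psd.
for a, J in enumerate(P1):
    for b, Jp in enumerate(P1):
        assert M[a][b] == chi(J) * chi(Jp)

# Feasibility: y_empty = 1, y_{ij} = 0 on edges, and the objective
# sum_i y_i recovers |I|.
assert chi(frozenset()) == 1
assert all(chi(frozenset(e)) == 0 for e in E)
assert sum(chi(frozenset([i])) for i in V) == len(I) == 2
```
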
One can show, using the Schur complement for block matrices (for A positive definite, the
block matrix (A, B; B^T, C) satisfies

(A, B; B^T, C) ⪰ 0 ⟺ C − B^T A^{−1} B ⪰ 0),
that the first step of the Lasserre Bound coincides with the Lovász ϑ-number of G, a
famous graph parameter which Lovász [1288] introduced to determine the Shannon capacity
Θ(C5 ) of the cycle graph C5 . Determining the Shannon capacity of a given graph is a very
difficult problem and has applications to the zero-error capacity of a noisy channel; see
Shannon [1663]. For instance, the value of Θ(C7 ) is currently not known.
Theorem 13.3.7 Let G = (V, E) be a graph. We have las_1(G) = ϑ′(G) where ϑ′(G) is
defined as the solution of the following semidefinite program:

ϑ′(G) = max{ Σ_{i,j∈V} X_{i,j} | X ∈ S^V_+, X_{i,j} ≥ 0 for all i, j ∈ V,
Tr(X) = 1, X_{i,j} = 0 if {i, j} ∈ E }.
Technically, the parameter ϑ′(G) is a slight variation of the original Lovász ϑ-number
as introduced in [1288]. The difference is that in the definition of ϑ(G) one omits the
nonnegativity condition X_{i,j} ≥ 0 for all i, j ∈ V.
Schrijver [1632] and independently McEliece, Rodemich, and Rumsey [1364] realized
that ϑ′(G) is nothing other than the Delsarte Linear Programming Bound in the special
case of the graph G = G(n, d), which was defined in Example 13.3.2. We will provide a
proof of this fact in Section 13.4.3.
An important feature of the Lasserre Bound is that it does not lose information. If the
step of the hierarchy is high enough, we can exactly determine the independence number of
G.
Theorem 13.3.8 For every graph G the Lasserre Bound of G of order t = α(G) is exact;
that means las_t(G) = α(G) for every t ≥ α(G).

Proof: (sketch) First we show that the hierarchy becomes stationary after α(G) steps. Let
J ⊆ V be a set of vertices which contains an edge {i, j} ∈ E with i, j ∈ J. Let y ∈ R^{P_{2t}(V)}
be a feasible solution of las_t(G) with 2t ≥ |J|. Then y_J = 0, which can be seen as follows.
Write J = J_1 ∪ J_2 with |J_1|, |J_2| ≤ t and {i, j} ⊆ J_1. First, consider the 2 × 2
principal submatrix of the positive semidefinite matrix M_t(y) whose rows and columns are
indexed by {i, j} and J_1:

(y_{ij}, y_{J_1}; y_{J_1}, y_{J_1}) ⪰ 0 ⟹ y_{J_1} = 0,

where we applied the constraint y_{ij} = 0. Then, consider the 2 × 2 principal
submatrix of M_t(y) indexed by J_1 and J_2:

(y_{J_1}, y_J; y_J, y_{J_2}) ⪰ 0 ⟹ y_J = 0.

Hence,

las_t(G) = las_{t+1}(G) = · · · = las_n(G) for t ≥ α(G).
The next step is showing that vectors y ∈ R^{P_n(V)}, indexed by the full power set P_n(V),
which determine a positive semidefinite moment matrix M_n(y), form a finitely generated
cone:

M_n(y) ⪰ 0 ⟺ y ∈ cone{χ^I | I ⊆ V},

where the χ^I are the characteristic vectors. Sufficiency follows easily from χ^I_{J∪J'} = χ^I_J χ^I_{J'}.
For necessity, we first observe that the characteristic vectors form a basis of R^{P_n(V)}. Let
(ψ^J)_{J∈P_n(V)} be its dual basis; it satisfies (χ^I)^T ψ^J = δ_{I,J}. Let y be so that M_n(y) is positive
semidefinite and write y in terms of the basis

y = Σ_{I∈P_n(V)} α_I χ^I with α_I ∈ R.

Then, for every J ∈ P_n(V),

0 ≤ (ψ^J)^T M_n(y) ψ^J = α_J.
Now we finish the proof. Let y ∈ R^{P_n(V)} be a feasible solution of las_n(G). Then from
the previous arguments we see

y = Σ_{I independent} α_I χ^I with α_I ≥ 0.

Since y_∅ = 1 we have Σ_I α_I = 1, and hence Σ_{i∈V} y_i = Σ_I α_I |I| ≤ α(G), as desired. □
Theorem 13.4.3 Let A ⊆ C^{n×n} be a matrix ∗-algebra. Then there are natural numbers d,
m_1, ..., m_d such that there is a ∗-isomorphism between A and a direct sum of full matrix
∗-algebras

ϕ : A → ⊕_{k=1}^d C^{m_k×m_k}.
Here a ∗-isomorphism is a bijective linear map between two matrix ∗-algebras which re-
spects multiplication and taking the conjugate transpose.
An elementary proof of Theorem 13.4.3, which also shows how to find a ∗-isomorphism
ϕ algorithmically, is presented in [87]. An alternative proof is given in [1829, Section 3] in
the framework of representation theory of finite groups; see also [84, 1828].
Now we want to apply Theorem 13.4.3 to block diagonalize the Γ-invariant semidefinite
program (13.4). Let A = (C^{n×n})^Γ be the matrix ∗-algebra of Γ-invariant matrices. Let ϕ be
a ∗-isomorphism as in Theorem 13.4.3; then ϕ preserves positive semidefiniteness. Hence,
applying ϕ to the optimization variable turns (13.4) into an equivalent semidefinite program
over block diagonal matrices.
Thus, instead of dealing with one (potentially big) matrix of size n × n one only has to
work with d (hopefully small) block diagonal matrices of size m1 , . . . , md . This reduces the
dimension from n2 to m21 + · · · + m2d . Many practical semidefinite programming solvers can
take advantage of this block structure and numerical calculations can become much faster.
However, finding an explicit ∗-isomorphism is usually a nontrivial task, especially if one is
interested in parameterized families of matrix ∗-algebras.
with r = 0, ..., n; here B_r denotes the 0/1 matrix with (B_r)_{x,y} = 1 if and only if
d_H(x, y) = r. So, ϑ′(G(n, d)) in the form of (13.4) is the following semidefinite program
in n + 1 variables:

max{ 2^n Σ_{r=0}^n (n choose r) x_r | x_0 = 1/2^n, x_1 = · · · = x_{d−1} = 0,
x_d, ..., x_n ≥ 0, Σ_{r=0}^n x_r B_r ⪰ 0 }.

Finding a simultaneous block diagonalization of the B_r's is easy since they pairwise commute.
Indeed, for a ∈ F_2^n let χ_a ∈ R^{F_2^n} denote the character defined by (χ_a)_x = (−1)^{a·x};
these vectors are common eigenvectors of the matrices B_r:

(B_r χ_a)_x = Σ_{y∈F_2^n} (B_r)_{x,y} (χ_a)_y = Σ_{y∈F_2^n} (B_r)_{x,y} (χ_a)_{y−x} (χ_a)_x
= Σ_{y∈F_2^n : d_H(x,y)=r} (χ_a)_{y−x} (χ_a)_x = ( Σ_{y∈F_2^n : d_H(0,y)=r} (χ_a)_y ) (χ_a)_x.

The binary Krawtchouk polynomial

K_r^{(n,2)}(z) = Σ_{j=0}^r (−1)^j (z choose j) (n−z choose r−j)

enters through

Σ_{y∈F_2^n : d_H(0,y)=r} (χ_a)_y = K_r^{(n,2)}(d_H(0, a)).

Hence every χ_a is an eigenvector of B_r with eigenvalue K_r^{(n,2)}(d_H(0, a)), and the
simultaneous block diagonalization is a ∗-isomorphism onto a direct sum of n + 1 blocks of
size 1 × 1 (so m_0 = · · · = m_n = 1) defined by these eigenvalues.
So the semidefinite program ϑ′(G(n, d)) degenerates to the following linear program:

max{ 2^n Σ_{r=0}^n (n choose r) x_r | x_0 = 1/2^n, x_1 = · · · = x_{d−1} = 0,
x_d, ..., x_n ≥ 0, Σ_{r=0}^n x_r K_r^{(n,2)}(j) ≥ 0 for j = 0, ..., n }.
This is the Delsarte Linear Programming Bound; see also Theorem 1.9.23.
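Both the eigenvector computation and the resulting linear program can be verified numerically (a sketch; the feasibility of x_r = B_r/(2^n (n choose r)) for a code with distance distribution (B_r) is a standard observation, and the [7, 4, 3]_2 Hamming code is an illustrative instance, showing the LP optimum for (n, d) = (7, 3) is at least |C| = 16):

```python
from itertools import product
from math import comb

def krawtchouk(n, r, z):
    """Binary Krawtchouk polynomial K_r^{(n,2)}(z)."""
    return sum((-1)**j * comb(z, j) * comb(n - z, r - j) for j in range(r + 1))

dist = lambda x, y: sum(a != b for a, b in zip(x, y))

# Check 1 (n = 5): chi_a is an eigenvector of every B_r with eigenvalue
# K_r^{(n,2)}(wt(a)); here (B_r chi_a)_x is computed as a row sum.
n = 5
words = list(product((0, 1), repeat=n))
for a in words:
    chi = {x: (-1)**sum(ai * xi for ai, xi in zip(a, x)) for x in words}
    for r in range(n + 1):
        for x in words:
            row = sum(chi[y] for y in words if dist(x, y) == r)
            assert row == krawtchouk(n, r, dist((0,) * n, a)) * chi[x]

# Check 2 (n = 7): the distance distribution (B_r) of the [7,4,3] Hamming
# code gives a feasible point x_r = B_r / (2^n * binom(n, r)) of the linear
# program, with objective value 2^n * sum_r binom(n,r) x_r = |C| = 16.
n = 7
hamming = [c for c in product((0, 1), repeat=n)
           if all(sum(c[i] for i in range(n) if (i + 1) >> b & 1) % 2 == 0
                  for b in range(3))]
assert len(hamming) == 16
B = [sum(1 for c in hamming for cp in hamming if dist(c, cp) == r) / 16
     for r in range(n + 1)]
x = [B[r] / (2**n * comb(n, r)) for r in range(n + 1)]
assert x[0] == 1 / 2**n and x[1] == 0 and x[2] == 0      # x_0 fixed, x_1 = x_2 = 0
assert all(sum(x[r] * krawtchouk(n, r, j) for r in range(n + 1)) >= -1e-9
           for j in range(n + 1))                         # LP constraints hold
assert abs(2**n * sum(comb(n, r) * x[r] for r in range(n + 1)) - 16) < 1e-9
```
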
The t-th step of Lasserre's hierarchy is a 2t-point bound. The relation between k-point bounds and Lasserre's
hierarchy was first made explicit by Laurent [1206] in the case of bounds for binary codes;
see also Gijswijt [803], who discusses the symmetry reduction needed to compute k-point
bounds for block codes, and de Laat and Vallentin [506], who consider k-point bounds for
compact topological packing graphs.
Schrijver's bound for binary codes [1634] is a 3-point bound. Essentially, it looks at
principal submatrices M_a ∈ R^{F_2^n × F_2^n} of the matrix M_2(y), one for each fixed word a ∈ F_2^n.
The group which leaves the corresponding semidefinite program invariant is the stabilizer
of a codeword in Aut(G(n, d)), which is the symmetric group permuting the n coordinates
of Fn2 .
The algebra A_n ⊆ R^{F_2^n × F_2^n} invariant under this group action is called the Terwilliger
algebra of the binary Hamming scheme. Schrijver determined a block diagonalization
of the Terwilliger algebra which we recall here.
For nonnegative integers i, j, t, with t ≤ i, j and i + j ≤ n + t, the matrices

(B^t_{i,j})_{x,y} = 1 if wt_H(x) = i, wt_H(y) = j, d_H(x, y) = i + j − 2t, and (B^t_{i,j})_{x,y} = 0 otherwise,

form a basis of A_n. Hence, dim A_n = (n+3 choose 3). The desired ∗-isomorphism

ϕ : A_n → ⊕_{k=0}^{⌊n/2⌋} C^{(n−2k+1)×(n−2k+1)}

is such that the k-th summand of ϕ(B^t_{i,j}) is the matrix with entries

(n−2k choose i−k)^{−1/2} (n−2k choose j−k)^{−1/2} β^t_{i,j,k}, i, j = k, ..., n − k,

for k = 0, ..., ⌊n/2⌋.
Schrijver determined the ∗-isomorphism from first principles using linear algebra. Later,
Vallentin [1829] used representation theory of finite groups to derive an alternative proof.
Here the connection to the orthogonal Hahn and Krawtchouk polynomials becomes visible.
Another constructive proof of the explicit block diagonalization of An was given by
Srinivasan [1748]; see also Martin and Tanaka [1334].
Semidefinite programming bounds in this spirit have been developed for sphere packings
by Cohn and Elkies [423], for binary sphere and spherical cap packings by de Laat, Oliveira,
and Vallentin [504], for translative packings of convex bodies by Dostert, Guzmán, Oliveira,
and Vallentin [618], and for packings of congruent copies of a convex body by Oliveira and
Vallentin [508].
Semidefinite programming bounds have also been developed for generalized distances
and list-decoding radii of binary codes by Bachoc and Zémor [90], for permutation codes by
Bogaerts and Dukes [228], for mixed binary/ternary codes by Litjens [1274], for subsets of
coherent configurations by Hobart [965] and Hobart and Williford [966], for ordered codes
by Trinker [1820], for energy minimization on S2 by de Laat [503], and for spherical two-
distance sets and for equiangular lines by Barg and Yu [129] and by Machado, de Laat,
Oliveira, and Vallentin [505]. They have been used by Brouwer and Polak [311] to prove the
uniqueness of several constant weight codes.
In extremal combinatorics, (weighted) vector space versions of the Erdős–Ko–Rado
Theorem for cross intersecting families have been proved using semidefinite programming
bounds by Suda and Tanaka [1760] and by Suda, Tanaka and Tokushige [1761]; see also the
survey by Frankl and Tokushige [752].
Another coding theory application of the symmetry reduction technique includes new
approaches to the Assmus–Mattson Theorem by Tanaka [1777] and by Morales and Tanaka
[1399].
Acknowledgement
The author was partially supported by the SFB/TRR 191 “Symplectic Structures in
Geometry, Algebra and Dynamics”, funded by the DFG. He also gratefully acknowledges
support by DFG grant VA 359/1-1. This project has received funding from the European
Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie
grant agreement number 764759.
Part II
Families of Codes
Chapter 14
Coding Theory and Galois Geometries
Leo Storme
Ghent University
(b) The equivalence class of the nonzero vector v is the point P (v), or is simply denoted
by P when the explicit reference to v is not required.
(c) The nonzero vector v is called a coordinate vector for P (v). Note that every
nonzero multiple ρv, ρ ∈ Fq \ {0}, is also a coordinate vector for P (v).
Definition 14.1.3 Consider V (n + 1, q) and its corresponding projective space PG(n, q).
(a) For any m = −1, 0, 1, . . . , n, an m-dimensional subspace, also called m-space, of
PG(n, q) is a set of points for which the union of all the corresponding coordinate
vectors, together with the zero vector, form an (m + 1)-dimensional vector subspace
of V (n + 1, q). We will denote an m-space by Πm .
(b) In particular, the −1-dimensional subspace is the empty set of PG(n, q), and a 0-
dimensional subspace is a point. A 1-dimensional subspace is called a (projective)
line, a 2-dimensional subspace is called a (projective) plane, and a 3-dimensional
subspace is called a (projective) solid. An (n − 1)-dimensional subspace of PG(n, q)
is called a hyperplane. An (n − r)-dimensional subspace of PG(n, q) is also called a
subspace of codimension r.
Definition 14.1.4
(a) Let θ_n = (q^{n+1} − 1)/(q − 1).
(b) Let

[t choose s]_q = ((q^t − 1)(q^{t−1} − 1) · · · (q^{t−s+1} − 1)) / ((q^s − 1)(q^{s−1} − 1) · · · (q − 1)).

The number [t choose s]_q is called a q-ary Gaussian coefficient.
Theorem 14.1.5
(a) The projective space PG(n, q) contains θ_n points.
(b) The projective space PG(n, q) contains [n+1 choose t+1]_q different t-spaces.
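Both counts are easy to confirm for small parameters by brute force (a sketch; PG(3, 2) is an illustrative choice, with lines enumerated as spans of pairs of nonzero vectors of F_2^4):

```python
from itertools import product, combinations

def theta(n, q):
    """Number of points of PG(n, q) = (q^(n+1) - 1)/(q - 1)."""
    return (q**(n + 1) - 1) // (q - 1)

def gaussian(t, s, q):
    """q-ary Gaussian coefficient [t choose s]_q."""
    num = den = 1
    for i in range(s):
        num *= q**(t - i) - 1
        den *= q**(i + 1) - 1
    return num // den

# PG(3, 2) has theta_3 = 15 points.
assert theta(3, 2) == 15

# Lines of PG(3, 2) are 2-dimensional subspaces of F_2^4; count them by
# brute force as distinct spans {0, u, v, u+v} of pairs of nonzero vectors.
nonzero = [v for v in product((0, 1), repeat=4) if any(v)]
add = lambda u, v: tuple((a + b) % 2 for a, b in zip(u, v))
lines = {frozenset([(0, 0, 0, 0), u, v, add(u, v)])
         for u, v in combinations(nonzero, 2)}
assert len(lines) == gaussian(4, 2, 2) == 35   # Theorem 14.1.5(b): n = 3, t = 1
```
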
Remark 14.1.6 In PG(n, q), every hyperplane is a set of points P (X) whose coordinate
vectors X = (X0 , . . . , Xn ) satisfy a linear equation
u0 X0 + u1 X1 + · · · + un Xn = 0,
with (u_0, ..., u_n) ∈ F_q^{n+1} \ {(0, ..., 0)} fixed.
Definition 14.1.7
(a) If a point P lies in a subspace Π of PG(n, q), then we say that P is incident with Π,
and vice versa, that Π is incident with P .
(b) If Πr and Πs are subspaces of PG(n, q), then the intersection Πr ∩ Πs of Πr and Πs
is the set of points belonging to both the spaces Πr and Πs .
(c) If Πr and Πs are subspaces of PG(n, q), then the span hΠr , Πs i of Πr and Πs is the
smallest subspace of PG(n, q) containing both Πr and Πs .
Substructures and subsets of a finite projective space are sometimes similar to each
other. This is expressed via collineations of a finite projective space. A collineation α of
PG(n, q) is defined by an invertible matrix A ∈ GL_{n+1}(q) together with a field automorphism σ of F_q, acting on coordinate vectors by

α : (x_0, x_1, ..., x_n)^T ↦ A (σ(x_0), σ(x_1), ..., σ(x_n))^T.

The collineations of PG(n, q) form the group PΓL_{n+1}(q).
Definition 14.1.11 Two sets S and S 0 of spaces contained in PG(n, q) are called projec-
tively equivalent to each other if and only if there is a collineation α ∈ PΓLn+1 (q) which
maps S onto S 0 , i.e., α(S) = S 0 .
Definition 14.1.12
(a) A partial (k − 1)-spread in PG(n − 1, q) is a set of pairwise disjoint (k − 1)-spaces.
(b) A (k − 1)-spread in PG(n − 1, q) is a set of pairwise disjoint (k − 1)-spaces which
partitions the point set of PG(n − 1, q).
The question regarding the existence of (k − 1)-spreads in PG(n − 1, q) has been completely solved: a (k − 1)-spread of PG(n − 1, q) exists if and only if k divides n.
Regarding the other cases, where k does not divide n, so that (k − 1)-spreads in PG(n −
1, q) do not exist, the following results on the largest partial (k − 1)-spreads in PG(n − 1, q)
have been found.
Theorem 14.1.14
(a) ([195]) Let n ≡ r (mod k). Then, for all q, PG(n − 1, q) contains a partial (k − 1)-spread of size at least (q^n − q^k(q^r − 1) − 1)/(q^k − 1).
(b) ([193, 194]) Let n ≡ 1 (mod k). Then, for all q, the largest partial (k − 1)-spreads of PG(n − 1, q) have size equal to (q^n − q^k(q − 1) − 1)/(q^k − 1).
(c) ([1415]) Let n ≡ r (mod k). For all q, if k > (q^r − 1)/(q − 1), then the largest partial (k − 1)-spreads of PG(n − 1, q) have size equal to (q^n − q^k(q^r − 1) − 1)/(q^k − 1).
(d) ([668]) Let n ≡ c (mod 3) with 0 ≤ c ≤ 2. Then the largest size of the partial 2-spreads in PG(n − 1, 2) is equal to (2^n − 2^c)/7 − c.
(e) ([1179]) Let n ≡ 2 (mod k). Then, for k ≥ 4 and n ≥ 2k + 2, the largest partial (k − 1)-spreads of PG(n − 1, 2) have size equal to (2^n − 3 · 2^k − 1)/(2^k − 1).
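When k divides n, a spread can be found by a small backtracking search (a sketch; the instance PG(3, 2) with k = 2, where a 1-spread consists of (2^4 − 1)/(2^2 − 1) = 5 lines, is an illustrative choice):

```python
from itertools import combinations, product

# Points of PG(3, 2) = nonzero vectors of F_2^4; a line is a 2-dim subspace,
# viewed projectively as its 3 nonzero vectors {u, v, u+v}.
points = [v for v in product((0, 1), repeat=4) if any(v)]
add = lambda u, v: tuple((a + b) % 2 for a, b in zip(u, v))
lines = {frozenset([u, v, add(u, v)]) for u, v in combinations(points, 2)}

def extend(partial, covered):
    """Backtracking: grow a partial spread until all 15 points are covered."""
    if len(covered) == len(points):
        return partial
    p = next(pt for pt in points if pt not in covered)   # first uncovered point
    for line in lines:
        if p in line and not (line & covered):
            result = extend(partial + [line], covered | line)
            if result:
                return result
    return None

spread = extend([], frozenset())
assert spread is not None and len(spread) == 5           # (q^n - 1)/(q^k - 1) = 5
assert frozenset().union(*spread) == frozenset(points)   # the lines partition PG(3, 2)
```
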
Let C be a linear [n, k]_q code with parity check matrix H = (H_1 · · · H_n) that does
not have zero columns. Then the columns H_1, ..., H_n define points of the projective space
PG(n − k − 1, q). If s columns out of H_1, ..., H_n define the same projective point P, then we
consider this point P to have weight w(P) = s. Then a multiset (K, w) = {H_1, ..., H_n}
of size n of PG(n − k − 1, q) is obtained, generating the space PG(n − k − 1, q).
Similarly, from a multiset (K, w) = {H_1, ..., H_n} of size n of PG(n − k − 1, q) generating the space PG(n − k − 1, q), it is possible to construct a parity check matrix
H = (H_1 · · · H_n) of a linear [n, k]_q code.
This second link leads to the following theorem, also mentioned in Theorem 1.6.11.
Theorem 14.2.2 Let (K, w) = {H_1, ..., H_n} be a multiset of size n of PG(n − k − 1, q),
generating the space PG(n − k − 1, q). Let C be the linear [n, k]_q code with parity check matrix
H = (H_1 · · · H_n). Then d_H(C) = d if and only if every d − 1 points of the multiset
(K, w) = {H_1, ..., H_n} are linearly independent, and there exists at least one subset of d
points of the multiset (K, w) = {H_1, ..., H_n} which is linearly dependent.
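Theorem 14.2.2 can be checked directly on a small example (a sketch; taking all seven points of PG(2, 2) as columns of H yields the [7, 4, 3]_2 Hamming code, an illustrative instance):

```python
from itertools import combinations, product

# Columns: all seven points of PG(2, 2), i.e., nonzero vectors of F_2^3.
H = [v for v in product((0, 1), repeat=3) if any(v)]

def dependent(cols):
    """Columns over F_2 are dependent iff some nonempty subset sums to zero."""
    return any(all(sum(c[i] for c in sub) % 2 == 0 for i in range(3))
               for r in range(1, len(cols) + 1)
               for sub in combinations(cols, r))

# Every 2 points of PG(2, 2) are independent, but some 3 are dependent
# (three collinear points); hence d_H(C) = 3 by Theorem 14.2.2.
assert not any(dependent(c) for c in combinations(H, 2))
assert any(dependent(c) for c in combinations(H, 3))

# Cross-check on the code itself: the minimum weight of the kernel of H.
codewords = [c for c in product((0, 1), repeat=7)
             if all(sum(c[j] * H[j][i] for j in range(7)) % 2 == 0
                    for i in range(3))]
assert len(codewords) == 16
assert min(sum(c) for c in codewords if any(c)) == 3
```
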
Note that because a generator matrix and a parity check matrix of a linear code are
a parity check matrix and a generator matrix, respectively, of the dual linear code, two
arcs are dual to each other if and only if they define two linear MDS codes, via generator
matrices, which are dual codes to each other.
It is known that the dual code of a [q+1, k, q+2−k]q GDRS code is a [q+1, q+1−k, k+1]q
GDRS code. This follows for instance from the standard form of a generator matrix of a
GDRS code, described in [1603]. This translates to the following equivalent geometric result.
Note that Section 1.14 also discusses Reed–Solomon codes. For instance, in Theorem
1.14.12, the Reed–Solomon code of length q defined by a generator matrix whose columns
consist of the q points (1, t, ..., t^{k−1}), t ∈ F_q, was discussed.
Many researchers in finite geometry have obtained results on arcs in finite projective
spaces. These results on arcs translate immediately into corresponding results on linear
MDS codes. Depending on the theorem, we present these results in the language of arcs or
in the language of MDS codes, but we state the first theorems in both languages. The first
result of the next theorem was proven by B. Segre. This characterization result was found as
a theorem on arcs, and motivated many researchers to also start investigating problems in
finite geometry. The links with the coding theoretic results on MDS codes then also inspired
these researchers to work on coding theoretic problems, and, vice-versa, inspired researchers
in coding theory to also investigate problems in finite geometry.
Theorem 14.2.9
(a) ([1639]) Every (q + 1)-arc in PG(2, q), q odd, and every (q + 1)-arc in PG(q − 3, q),
q odd, is equal to a normal rational curve.
For q odd, every [q + 1, 3, q − 1]q MDS code and every [q + 1, q − 2, 4]q MDS code is
GDRS.
(b) ([1638]) Every (q+1)-arc in PG(3, q), q > 3 odd, and every (q+1)-arc in PG(q−4, q),
q > 3 odd, is equal to a normal rational curve.
For q > 3 odd, every [q + 1, 4, q − 2]q MDS code and every [q + 1, q − 3, 5]q MDS code
is GDRS.
(c) ([364]) Every (q + 1)-arc in PG(4, q), q = 2h , h ≥ 3, and every (q + 1)-arc in
PG(q − 5, q), q = 2h , h ≥ 3, is equal to a normal rational curve.
For q = 2h , h ≥ 3, every [q + 1, 5, q − 3]q MDS code and every [q + 1, q − 4, 6]q MDS
code is GDRS.
The only known example of a (q + 1)-arc in PG(n, q), q odd, not equal to a normal
rational curve is a 10-arc in PG(4, 9), discovered by Glynn.
We now present an important result on MDS codes over finite fields Fq of even order,
and immediately discuss the geometrically equivalent result.
Theorem 14.2.11
(a) For q even, there exist [q + 2, 3, q]q MDS codes.
(b) In PG(2, q), q even, there exist (q + 2)-arcs.
Definition 14.2.12 A hyperoval is a (q + 2)-arc in PG(2, q), q even.
Definition 14.2.13 A permutation polynomial F over the finite field Fq is a polynomial
for which the mapping x 7→ F (x), x ∈ Fq , is a bijection over the finite field Fq .
Theorem 14.2.14 (Segre [1640]) Any (q + 2)-arc of PG(2, q), q = 2h and h > 1, is
projectively equivalent to a (q + 2)-arc {(1, t, F (t)) | t ∈ Fq } ∪ {(0, 1, 0), (0, 0, 1)}, where F
is a permutation polynomial over Fq of degree at most q − 2 satisfying F (0) = 0, F (1) = 1,
and such that for each s in F_q

F_s(X) = (F(X + s) + F(s))/X

is a permutation polynomial over F_q satisfying F_s(0) = 0.
All polynomials F (X) of this type are called o-polynomials. Examples of o-polynomials
can be found in Section 20.4.15.
Remark 14.2.15 Let q be even. The classical example of a hyperoval in PG(2, q) is the
regular hyperoval, which is defined in the following way. Every conic C in PG(2, q), q even,
has a nucleus N. The nucleus N of a conic C is the unique point belonging to all the lines
sharing exactly one point with this conic C in PG(2, q), i.e., to all the tangent lines of C. The union of a conic C and its nucleus
N forms a hyperoval, called a regular hyperoval. In Theorem 14.2.14, F(t) = t^2 is an
example of an o-polynomial defining a regular hyperoval.
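The regular hyperoval can be verified to be a (q + 2)-arc in a small case (a sketch; GF(4) is implemented here as polynomials over F_2 modulo X^2 + X + 1, and determinants detect collinearity — implementation choices for illustration, not from the text):

```python
from itertools import combinations

Q, MOD = 4, 0b111   # GF(4) as bitmasks of polynomials over F_2 mod x^2 + x + 1

def gf_mul(a, b):
    """Carry-less multiplication followed by reduction modulo x^2 + x + 1."""
    p = 0
    for i in range(2):
        if b >> i & 1:
            p ^= a << i
    for i in (3, 2):
        if p >> i & 1:
            p ^= MOD << (i - 2)
    return p

# Regular hyperoval: conic points (1, t, t^2) plus (0, 1, 0) and (0, 0, 1),
# matching Theorem 14.2.14 with the o-polynomial F(t) = t^2.
hyperoval = [(1, t, gf_mul(t, t)) for t in range(Q)] + [(0, 1, 0), (0, 0, 1)]
assert len(hyperoval) == Q + 2   # a (q + 2)-arc has q + 2 points

def det3(m):
    """3x3 determinant over GF(4) (characteristic 2, so signs disappear)."""
    d = 0
    for c0, c1, c2 in ((0, 1, 2), (1, 2, 0), (2, 0, 1)):
        d ^= gf_mul(m[0][c0], gf_mul(m[1][c1], m[2][c2]))
        d ^= gf_mul(m[2][c0], gf_mul(m[1][c1], m[0][c2]))
    return d

# No three points of the hyperoval are collinear (nonzero determinant).
assert all(det3(triple) != 0 for triple in combinations(hyperoval, 3))
```
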
Other infinite classes of hyperovals are known: the translation hyperovals, the Segre
hyperoval, the two Glynn hyperovals, the Payne hyperoval, the Cherowitzo hyperoval,
the Subiaco hyperovals, and also the Adelaide hyperoval. We refer to [961] for the corre-
sponding o-polynomials. There is also one sporadic hyperoval in PG(2, 32) known [1463].
Remark 14.2.16 The MDS Conjecture on linear MDS codes states that the maximal
length n for a linear [n, k, n − k + 1]q MDS code, with 2 ≤ k ≤ q − 1, is at most q + 1,
up to two exceptional cases. This MDS Conjecture is summarized in Table 14.1; see also
Conjecture 3.3.21.
TABLE 14.1: The MDS Conjecture for linear [n, k, n − k + 1]_q MDS codes

    Conditions                        n ≤
    k ≥ q                             k + 1
    k ∈ {3, q − 1} and q even         q + 2
    in all other cases                q + 1
The next theorem, by S. Ball, proves the MDS Conjecture on linear MDS codes over
prime fields, and is one of the nicest contributions of Galois geometries to coding theory.
Because of its great relevance, we state the result both in the language of MDS codes and
in the language of arcs.
We end this section with the complete classification of the (q + 1)-arcs in PG(3, q),
q = 2h , h ≥ 3.
Definition 14.2.19 Linear [g_q(k, d), k, d]_q codes, i.e., codes meeting the Griesmer Bound, are
called Griesmer codes.
Remark 14.2.21 An (f, m; N, q)-minihyper (F, w) is also known in the literature on Galois
geometries under the name of weighted m-fold blocking set of size f with respect
to the hyperplanes of PG(N, q).
The link between minihypers in PG(k − 1, q) and linear [g_q(k, d), k, d]_q Griesmer codes
begins with the so-called q-ary expansion of d. Choose the unique positive integer s such
that (s − 1)q^{k−1} < d ≤ sq^{k−1}. Then there is a unique q-ary expansion d = sq^{k−1} − Σ_{i=1}^h q^{λ_i}
where
• 0 ≤ λ_1 ≤ · · · ≤ λ_h < k − 1, and
• at most q − 1 of the integers λ_i are equal to any given value.
Using this expression for d, the Griesmer Bound for a linear [n, k, d]_q code is

n ≥ sθ_{k−1} − Σ_{i=1}^h θ_{λ_i},

where θ_n = (q^{n+1} − 1)/(q − 1).
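The expression above can be compared with the usual form g_q(k, d) = Σ_{i=0}^{k−1} ⌈d/q^i⌉ of the Griesmer Bound (a sketch; the [7, 4, 3]_2 Hamming code parameters are an illustrative instance):

```python
from math import ceil

def griesmer(k, d, q):
    """Griesmer bound g_q(k, d) = sum of ceil(d / q^i), i = 0, ..., k-1."""
    return sum(ceil(d / q**i) for i in range(k))

def theta(n, q):
    return (q**(n + 1) - 1) // (q - 1)

def q_ary_expansion(k, d, q):
    """Return s and the exponents lambda_i with d = s*q^(k-1) - sum q^(lambda_i)."""
    s = ceil(d / q**(k - 1))
    rest, lam = s * q**(k - 1) - d, []
    for e in range(k - 2, -1, -1):        # greedy: subtract largest powers first
        while rest >= q**e:
            lam.append(e)
            rest -= q**e
    return s, sorted(lam)

# [7, 4, 3]_2: g_2(4, 3) = 3 + 2 + 1 + 1 = 7, met by the Hamming code.
assert griesmer(4, 3, 2) == 7

# The expansion d = s*q^(k-1) - sum q^(lambda_i) recovers the same bound
# as n >= s*theta_{k-1} - sum theta_{lambda_i}.
k, d, q = 4, 3, 2
s, lam = q_ary_expansion(k, d, q)
assert s == 1 and lam == [0, 2]           # 3 = 8 - (4 + 1)
assert s * theta(k - 1, q) - sum(theta(l, q) for l in lam) == griesmer(k, d, q)
```
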
When d = sq^{k−1} − Σ_{i=1}^h q^{λ_i}, there is a one-to-one correspondence between the set
of all inequivalent [g_q(k, d), k, d]_q Griesmer codes and the set of all projectively distinct
(Σ_{i=1}^h θ_{λ_i}, Σ_{i=1}^h θ_{λ_i−1}; k − 1, q)-minihypers (F, w), where the maximum weight for a point
P in PG(k − 1, q) is equal to s.
Belov, Logačev, and Sandimirov [157] gave a general construction method for Griesmer
codes. We describe this construction method via the minihypers.
Construction 14.2.22 (Belov–Logačev–Sandimirov) Consider in PG(k − 1, q) a multiset of ε_0 points, ε_1 lines, ε_2 planes, ε_3 solids, ..., ε_{k−2} (k − 2)-spaces, with 0 ≤ ε_i ≤ q − 1
for i = 0, ..., k − 2. Then such a multiset defines a (Σ_{i=0}^{k−2} ε_i θ_i, Σ_{i=0}^{k−2} ε_i θ_{i−1}; k − 1, q)-minihyper
(F, w), where the weight w(R) of a point R of PG(k − 1, q) equals the number of objects in
the description of the multiset above in which it is contained.
One of the main contributions of finite geometries to the problem of linear codes meeting
the Griesmer Bound is the characterization of Griesmer codes via the characterization of
the corresponding minihypers. We mention two particular such characterization results.
Theorem 14.2.23 ([495]) A $\left(\sum_{i=0}^{k-2} \epsilon_i\theta_i, \sum_{i=0}^{k-2} \epsilon_i\theta_{i-1};\, k-1, q\right)$-minihyper $(F,w)$, where $\sum_{i=0}^{k-2} \epsilon_i < \sqrt{q} + 1$, is a multiset of $\epsilon_0$ points, $\epsilon_1$ lines, \dots, $\epsilon_{k-3}$ $(k-3)$-spaces, $\epsilon_{k-2}$ hyperplanes, and so is of Belov–Logačev–Sandimirov type.
Theorem 14.2.24 ([1199]) An $\left(x\theta_{k-2}, x\theta_{k-3};\, k-1, q\right)$-minihyper $(F,w)$, $q = p^h$, $p$ prime, $h \ge 1$, with $x \le q - q/p$, is a multiset of $x$ hyperplanes, and so is of Belov–Logačev–Sandimirov type.
Definition 14.3.1 Let Fd be the set of the homogeneous polynomials of degree d over
the finite field Fq in the variables X0 , . . . , Xn . An algebraic hypersurface Φ of degree
d in PG(n, q) is the set of points of PG(n, q) whose coordinates satisfy a homogeneous
polynomial f in Fd , i.e., Φ = {P ∈ PG(n, q) | f (P ) = 0} for given f ∈ Fd .
Definition 14.3.2 Let PG(n, q) = {P0 , . . . , Pθn −1 }, where we normalize the coordinates of
the points Pi by making the leftmost nonzero coordinate equal to one. Then the dth order
q-ary projective Reed–Muller code PRMq (d, n) is
$$\mathrm{PRM}_q(d,n) = \left\{ \left(f(P_0), \dots, f(P_{\theta_n - 1})\right) \mid f \in F_d \cup \{0\} \right\}.$$
294 Concise Encyclopedia of Coding Theory
We now present a geometric link between codewords of PRMq (d, n) and algebraic
hypersurfaces of degree d in PG(n, q). The weight of a codeword is the number of nonzero
positions in this codeword. A point Pi of PG(n, q) belongs to an algebraic hypersurface Φ
if and only if its coordinates satisfy the equation f (X0 , . . . , Xn ) = 0 of Φ. Hence, Pi ∈ Φ if
and only if f (Pi ) = 0. Since a nonzero codeword of minimum weight of PRMq (d, n) is in
fact a nonzero codeword of PRMq (d, n) with the largest number of zeroes, the preceding
equivalence leads to the following theorem.
Theorem 14.3.3 The nonzero codewords of minimum weight of PRMq (d, n) correspond
to the algebraic hypersurfaces of degree d having the largest number of points in PG(n, q).
In the context of projective Reed–Muller codes, finite geometries contributed to the
characterization of algebraic hypersurfaces of the largest size. As an example, we present
the following results.
Theorem 14.3.4
(a) (Serre [1654]) The minimum weight codewords of the code PRMq (d, n), for
d ≤ q − 1, are defined by the algebraic hypersurfaces of degree d which are the
union of d hyperplanes, passing through a common (n − 2)-space of PG(n, q). So
dH (PRMq (d, n)) = q n − (d − 1)q n−1 for d ≤ q − 1.
(b) (Sørensen [1741]) Let d − 1 = r(q − 1) + s, with 0 ≤ s < q − 1. For d ≤ n(q − 1),
dH (PRMq (d, n)) = (q − s)q n−r−1 .
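Serre's value of the minimum distance is easy to verify by brute force in a small case. The sketch below (our own illustration, not from the text) enumerates all homogeneous quadrics over $\mathbb{F}_3$ and computes the minimum nonzero weight of PRM$_3(2,2)$; a minimum weight codeword corresponds to a union of two lines through a common point, which covers $2q+1 = 7$ of the 13 points of PG$(2,3)$:

```python
from itertools import product
from math import prod

q, n, d = 3, 2, 2   # PRM_3(2, 2): plane quadrics over F_3

# Normalized points of PG(n, q): leftmost nonzero coordinate equal to 1.
points = [P for P in product(range(q), repeat=n + 1)
          if any(P) and P[next(i for i, c in enumerate(P) if c)] == 1]
assert len(points) == (q**(n + 1) - 1) // (q - 1) == 13   # theta_n

# Exponent vectors of the monomials of total degree d in X0, ..., Xn.
monomials = [e for e in product(range(d + 1), repeat=n + 1) if sum(e) == d]

def ev(coeffs, P):
    # evaluate the homogeneous polynomial with these coefficients at P
    return sum(c * prod(x**a for x, a in zip(P, e))
               for c, e in zip(coeffs, monomials)) % q

nonzero_weights = {sum(ev(f, P) != 0 for P in points)
                   for f in product(range(q), repeat=len(monomials))} - {0}

# Serre: d_H(PRM_q(d, n)) = q^n - (d-1) q^(n-1) for d <= q - 1
assert min(nonzero_weights) == q**n - (d - 1) * q**(n - 1) == 6
```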
Theorem 14.3.5 (Sboui [1626])
(a) The codewords of the second smallest nonzero weight of the code PRMq (d, n), with
5 ≤ d ≤ 3q + 2, are defined by the algebraic hypersurfaces $A_{d,2}$ of degree d which are the
union of d hyperplanes, d − 1 of which meet in a common (n − 2)-space Πn−2 and with
the dth hyperplane not passing through this common (n − 2)-space Πn−2 . This second
smallest nonzero weight is equal to q n − (d − 1)q n−1 + (d − 2)q n−2 .
(b) The codewords of the third smallest nonzero weight of the code PRMq (d, n), with
5 ≤ d ≤ 3q + 2, are defined by the algebraic hypersurfaces $A_{d,3}$ of degree d which are the
union of d hyperplanes, d − 2 of which meet in a common (n − 2)-space K1 , and where
the last two hyperplanes Hd−1 and Hd meet in an (n − 2)-space K2 , different from
K1 , such that K2 is contained in exactly one of the d − 2 hyperplanes passing through
K1 . This third smallest nonzero weight is equal to q n − (d − 1)q n−1 + 2(d − 3)q n−2 .
Theorem 14.3.6 (Bartoli–Sboui–Storme [133]) Let c be a nonzero codeword of the code PRM$_q(d,n)$, where $d < 3\sqrt{q}$, satisfying
$$\mathrm{wt}_H(c) < \begin{cases}
q^n - \dfrac{r+d-4}{2}\,q^{n-1} - \left((d-r+1)^2 + d - 1 + 2\sqrt{q}\right)q^{n-2} \\
\quad - \left(2d - r + 2\sqrt{q}\right)q^{n-3} - \left(d - 1 + 2\sqrt{q}\right)\dfrac{q^{n-3}-1}{q-1}, & \text{when } d-r+1 \text{ is odd}, \\[8pt]
q^n - \dfrac{d+r-3}{2}\,q^{n-1} - \left(\dfrac{(d-r+1)^2}{2} + d\right)q^{n-2} \\
\quad - \left(\dfrac{3d-r+1}{2} - d\right)\dfrac{q^{n-3}-1}{q-1} + \dfrac{r+d}{2}\,q^{n-3}, & \text{when } d-r+1 \text{ is even}.
\end{cases}$$
Definition 14.4.2 Let q = ph , p prime, h ≥ 1, and let s < t. The code Cs,t (n, q) is the
linear code over the prime field Fp generated by the incidence matrix Gs,t of PG(n, q).
Definition 14.4.3 Let s < t. Let ∆s,t denote the incidence system whose points and
blocks are the s-spaces and t-spaces in PG(n, q), respectively, and the incidence is inclusion.
This means that a block consists of all the s-spaces of PG(n, q) contained in a fixed t-space
of PG(n, q).
The fundamental parameters of the linear codes Cs,t (n, q) have been determined.
Theorem 14.4.4
(a) The length of $C_{s,t}(n,q)$ is the Gaussian binomial coefficient $\begin{bmatrix} n+1 \\ s+1 \end{bmatrix}_q$.
(b) ([1024]) The dimension of the p-ary code $C_{0,t}(n,q)$, $q = p^h$, $p$ prime, $h \ge 1$, is
$$1 + \sum_{i=1}^{n-t} \;\sum_{\substack{1 \le r_1, \dots, r_{h-1} \le n-t \\ r_0 = r_h = i}} \;\prod_{j=0}^{h-1} \;\sum_{s=0}^{\lfloor (pr_{j+1} - r_j)/p \rfloor} (-1)^s \binom{n+1}{s} \binom{n + pr_{j+1} - r_j - ps}{n}.$$
(c) ([818]) The dimension of the p-ary code $C_{0,n-1}(n,q)$, $q = p^h$, $p$ prime, $h \ge 1$, is
$$\binom{n+p-1}{n}^{h} + 1.$$
Intensive research has been done in Galois geometries to characterize small weight code-
words in the codes arising from the incidence matrices of finite projective spaces.
Theorem 14.4.6
(a) ([71]) The minimum weight of the code C0,t (n, q) is θt and the minimum weight code-
words are the scalar multiples of the incidence vectors of the t-spaces of PG(n, q).
(b) ([1533]) The second smallest nonzero weight of the code C0,n−1 (n, q) is 2q n−1 and
every codeword of C0,n−1 (n, q) of weight 2q n−1 is a scalar multiple of the difference of
two hyperplanes of PG(n, q).
Remark 14.4.7 The code C0,1 (2, p), p prime, contains a particular codeword c of weight
3p − 3, which is not a linear combination of three lines; see [91]. Let (X, Y, Z) be the
coordinates for the points of PG(2, p). For a point Q of PG(2, p), we denote the coordinate
position of this point Q in a codeword c by cQ .
Consider the three lines $\ell_1 : X = 0$, $\ell_2 : Y = 0$, and $\ell_3 : X = Y$ through the point $P(0,0,1)$. Consider also the line $m : Z = 0$ not through the point $P$. The support of the particular codeword c will consist of the points of $\left(\bigcup_{i=1}^{3} \ell_i\right) \setminus (m \cup \{P\})$. For this particular codeword c, the coordinate values are
$$c_Q = \begin{cases} t & \text{if } Q = (0,1,t) \text{ or } Q = (1,0,t),\ t \in \mathbb{F}_p, \\ -t & \text{if } Q = (1,1,t),\ t \in \mathbb{F}_p, \\ 0 & \text{for all the other points of } \mathrm{PG}(2,p). \end{cases}$$
Then c is a codeword of the code C0,1 (2, p), p prime, of weight 3p − 3. Up to projective
equivalence, such a particular codeword is called a special codeword. Let c be such a
special codeword, with its support contained in the lines $\ell_1, \ell_2, \ell_3$. Then we call a codeword of the form $c + \sum_{i=1}^{3} \alpha_i \ell_i$, with $\alpha_i \in \mathbb{F}_p$, an extended special codeword.
(b) Let c be a codeword of the p-ary code $C_{0,1}(2,q)$ with $q = p^h$, $p$ prime, $h > 2$, and $q > 27$. If $\mathrm{wt}_H(c) < (\sqrt{q}+1)(q+1-\sqrt{q})$, then c is a linear combination of exactly $\frac{\mathrm{wt}_H(c)}{q+1}$ lines.
(c) Let c be a codeword of the p-ary code $C_{0,1}(2,q)$ with $q = p^2$, $p$ prime, and $q > 27$. If $\mathrm{wt}_H(c) < \frac{(p-1)(p-4)(p^2+1)}{2p-1}$, then c is a linear combination of exactly $\frac{\mathrm{wt}_H(c)}{q+1}$ lines.
Remark 14.4.10 Table 14.2 presents known values and bounds on the minimum distance
of linear codes arising from incidence matrices of projective spaces PG(n, q), q = ph , p
prime, h ≥ 1. We refer the reader to [1208] for the exact references to the results in this
table.
TABLE 14.2: Known values and bounds on the minimum distance of linear codes from
projective spaces PG(n, q), q = ph , p prime, h ≥ 1
14.5.1 Definitions
Definition 14.5.1 A subspace code C is a set of subspaces of a projective space PG(n −
1, q).
Definition 14.5.2
(a) The subspace distance dS (U, V ) between two subspaces U and V of PG(n − 1, q) is
equal to
dS (U, V ) = dim U + dim V − 2 dim(U ∩ V ).
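The subspace distance can be computed from ranks alone, since $\dim(U \cap V) = \dim U + \dim V - \dim(U+V)$; as noted in the footnote below, the value is the same whether projective or vector-space dimensions are used. A small GF(2) sketch (our own illustration, with vector-space dimensions and subspaces given by spanning sets of bitmask vectors):

```python
def rank_gf2(rows):
    # rank of a list of GF(2) vectors given as integer bitmasks
    rows, r = list(rows), 0
    nbits = max((v.bit_length() for v in rows), default=0)
    for bit in reversed(range(nbits)):
        piv = next((i for i in range(r, len(rows)) if rows[i] >> bit & 1), None)
        if piv is None:
            continue
        rows[r], rows[piv] = rows[piv], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i] >> bit & 1:
                rows[i] ^= rows[r]
        r += 1
    return r

def subspace_distance(U, V):
    # d_S(U, V) = dim U + dim V - 2 dim(U ∩ V)
    #           = 2 dim(U + V) - dim U - dim V,
    # which avoids computing the intersection explicitly
    du, dv, dsum = rank_gf2(U), rank_gf2(V), rank_gf2(U + V)
    return 2 * dsum - du - dv

# two 2-dimensional subspaces of F_2^4 (lines of PG(3, 2)) meeting in a point
U = [0b1000, 0b0100]   # span{e1, e2}
V = [0b1000, 0b0010]   # span{e1, e3}
assert subspace_distance(U, V) == 2
assert subspace_distance(U, U) == 0
```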
Definition 14.5.3
(a) A constant-dimension code, also called a Grassmannian code, is a set of sub-
spaces of the same dimension, i.e., (k − 1)-spaces of PG(n − 1, q).
(b) The minimum subspace distance dS (C) of a constant-dimension code is always even.
We denote this by dS (C) = 2δ. An (n, 2δ, k)q constant-dimension code is a constant-
dimension code of (k − 1)-spaces in PG(n − 1, q) with minimum subspace distance
2δ.
(c) The parameter Aq (n, 2δ, k) denotes the largest number of codewords in an (n, 2δ, k)q
constant-dimension code.1
We now present a lower bound on Aq (n, 2δ, k). This lower bound follows from the theory
of rank-metric codes and the lifting procedure, described in Section 14.5.3, where we also
state this lower bound in Theorem 14.5.22. We refer to Section 14.5.3 and to Theorem
14.5.22 for more detailed information on this lower bound.
An (n, 2k, k)q subspace code is a set of (k − 1)-spaces in PG(n − 1, q) which are pairwise
disjoint. Hence, the codewords of an (n, 2k, k)q code form a partial (k − 1)-spread in PG(n −
1, q). This leads to the following result linking the Johnson Bound for (n, 2k, k)q codes to
the problem of finding the largest partial (k − 1)-spreads in PG(n − 1, q).
Theorem 14.5.6 The value Aq (n, 2k, k) equals the size of the largest partial (k − 1)-spread
in PG(n − 1, q).
The preceding theorem has motivated a lot of recent research on partial (k − 1)-spreads
[668, 1179, 1415], mentioned in Section 14.1.2. It is another example of how the two research
areas coding theory and Galois geometries interact with and influence each other.
1 Chapter 29 examines subspace codes from the non-projective point of view. In Section 29.5.2, the
subspace distance between non-projective spaces U and V is also defined as dS (U, V ) = dim U + dim V −
2 dim(U ∩V ). As the projective dimension of a subspace is 1 less than the non-projective dimension, the value
of dS (U, V ) is the same in either point of view. In Section 29.5.2, another distance, the injection distance
dI (·, ·) is defined. If U and V are of equal dimension, in either perspective, dS (U, V ) = 2dI (U, V ) implying
dS (C) = 2dI (C) for constant-dimension codes C. In Section 29.6, Aq (n, d, k) denotes the largest constant-
dimension code in F_q^n where the codewords have dimension k and the code has minimum injection distance d. Therefore Aq (n, 2δ, k) in this section equals Aq (n, δ, k) in that notation, and the Johnson Bound of Theorem 14.5.4(b) and Theorem 29.6.5
are identical. The reader might compare the lower bound of Theorem 14.5.5 to the upper bound of (29.11)
in Chapter 29. (Note that the notation (n, 2δ, k)q used in this section is the same as (n, d, k)q with d = δ in
Section 29.6.)
As a particular application, for t = 1, the preceding theorem implies that
every projective point of PG(n − 1, q) belongs to exactly one codeword of this (n, 2k, k)q
subspace code C. Hence, the (k − 1)-spaces, which form the codewords of this subspace code
C, form a (k − 1)-spread of PG(n − 1, q), and this leads to the following theorem, also related
to Theorem 14.5.6.
Theorem 14.5.10 A 1-(n, k, 1)q q-Steiner system exists if and only if a (k − 1)-spread in
PG(n − 1, q) exists, i.e., if and only if k divides n.
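A quick numerical check (our own illustration, with names not from the text): when $k$ divides $n$, a $(k-1)$-spread partitions the $\theta_{n-1}$ points of PG$(n-1,q)$ into disjoint $(k-1)$-spaces of $\theta_{k-1}$ points each, so by Theorem 14.5.6 its size gives $A_q(n, 2k, k)$.

```python
def theta(m, q):
    # number of points of PG(m, q)
    return (q**(m + 1) - 1) // (q - 1)

q, n, k = 2, 6, 3   # k divides n, so a (k-1)-spread of PG(n-1, q) exists
# size of the spread = theta_{n-1} / theta_{k-1} = (q^n - 1)/(q^k - 1)
size, remainder = divmod(theta(n - 1, q), theta(k - 1, q))
assert remainder == 0
assert size == (q**n - 1) // (q**k - 1) == 9   # = A_2(6, 6, 3)
```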
A main research question was if a nontrivial q-Steiner system exists. A major break-
through in the theory of q-Steiner systems, and in the theory of subspace codes, was obtained
when indeed the first example of a nontrivial q-Steiner system was found.
Theorem 14.5.11 ([295]) There exists a 2-(13, 3, 1)$_2$ Steiner system; i.e.,
$$A_2(13,4,3) = \begin{bmatrix} 13 \\ 2 \end{bmatrix}_2 \Big/ \begin{bmatrix} 3 \\ 2 \end{bmatrix}_2 = 1\,597\,245.$$
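The count in the theorem above follows from a double count: every line of PG$(12,2)$ lies in exactly one block, and each block (a plane of PG$(12,2)$) carries $\begin{bmatrix} 3 \\ 2 \end{bmatrix}_2 = 7$ lines. A short Python check (our own illustration):

```python
def gaussian_binomial(n, k, q):
    # [n choose k]_q = prod_{i=0}^{k-1} (q^(n-i) - 1)/(q^(i+1) - 1),
    # the number of k-dimensional subspaces of F_q^n
    num = den = 1
    for i in range(k):
        num *= q**(n - i) - 1
        den *= q**(i + 1) - 1
    return num // den

# Every one of the [13 choose 2]_2 lines lies in exactly one block,
# and each block carries [3 choose 2]_2 = 7 lines.
blocks = gaussian_binomial(13, 2, 2) // gaussian_binomial(3, 2, 2)
assert blocks == 1_597_245
```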
Designs arising from projective space are also examined in Section 5.6.
Similar to [n, k, d]q linear codes, there exists a Singleton Bound for rank-metric codes;
see also Theorem 11.3.5.
Let B be a basis of $\mathbb{F}_{q^n}$ over $\mathbb{F}_q$. We can write an element $a \in \mathbb{F}_{q^n}$ as a column vector $v(a)$ in $\mathbb{F}_q^n$ with respect to this basis B. Let $\{\beta_1, \dots, \beta_n\}$ be a set of n linearly independent elements of $\mathbb{F}_{q^n}$ over $\mathbb{F}_q$; then the Gabidulin code $G_{n,n,k}$ is the set of $n \times n$ matrices
$$G_{n,n,k} = \left\{ \left[\, v(f(\beta_1)) \;\cdots\; v(f(\beta_n)) \,\right]^T \mid f \in \mathcal{G}_{n,n,k} \right\}.$$
Theorem 14.5.17 (Gabidulin [760]) The Gabidulin codes Gn,n,k are [n×n, nk, n−k+1]q
MRD codes.
Theorem 14.5.19 (Gabidulin [475]) The punctured Gabidulin codes Gm,n,k , m < n, are
[n × m, nk, m − k + 1]q MRD codes.
We now present other examples of MRD codes via their corresponding sets of linearized
polynomials.
Theorem 14.5.20 The following sets of linearized polynomials define [n × n, nk, n − k + 1]q
MRD codes.
(a) ([1171]) $G_{k,s} = \{a_0x + a_1x^{q^s} + \cdots + a_{k-1}x^{q^{s(k-1)}} \mid a_0, \dots, a_{k-1} \in \mathbb{F}_{q^n}\}$, with $n, k, s$ positive integers and $\gcd(n,s) = 1$, defines a generalized Gabidulin code.
(b) ([1667]) $H_k(\nu,h) = \{a_0x + a_1x^{q} + \cdots + a_{k-1}x^{q^{k-1}} + \nu a_0^{q^h} x^{q^k} \mid a_0, \dots, a_{k-1} \in \mathbb{F}_{q^n}\}$, with $n, k, h$ positive integers where $k < n$ and $\nu \in \mathbb{F}_{q^n}$ such that $\nu^{\frac{q^n-1}{q-1}} \ne (-1)^{nk}$, defines a twisted Gabidulin code.
(c) ([1303, 1667]) $H_{k,s}(\nu,h) = \{a_0x + a_1x^{q^s} + \cdots + a_{k-1}x^{q^{s(k-1)}} + \nu a_0^{q^h} x^{q^{sk}} \mid a_0, \dots, a_{k-1} \in \mathbb{F}_{q^n}\}$, with $n, k, s, h$ positive integers where $\gcd(n,s) = 1$, $k < n$, and $\nu \in \mathbb{F}_{q^n}$ such that $\nu^{\frac{q^{ns}-1}{q^s-1}} \ne (-1)^{nk}$, defines a generalized twisted Gabidulin code.
We now present the lifting procedure which transforms rank-metric codes into subspace
codes. One of the most important consequences of this lifting procedure is the lower bound
on the Johnson Bound, stated in Theorem 14.5.22, also mentioned in Theorem 14.5.5.
The known rank-metric codes, for instance mentioned in the preceding theorems, can
now be used in this lifting procedure to construct subspace codes with specified minimum
distance 2δ. This leads to the proof of Theorem 14.5.5, also stated in the next theorem.
Remark 14.5.24 Note that the r-dimensional vector space V over the finite field Fqn is
isomorphic to the finite field Fqrn , considered as an r-dimensional vector space over the
finite field Fqn . This is used in the next definition.
Definition 14.5.25 An $\mathbb{F}_q$-subspace U of $\mathbb{F}_{q^{rn}}$ is called scattered with respect to the finite field $\mathbb{F}_{q^n}$ if $\dim_{\mathbb{F}_q}(U \cap \langle x \rangle_{\mathbb{F}_{q^n}}) \le 1$ for each $x \in \mathbb{F}_{q^{rn}} \setminus \{0\}$; that is, if it defines a scattered $\mathbb{F}_q$-linear set in PG$(r-1, q^n)$, where $\mathbb{F}_{q^{rn}}$ is considered as an r-dimensional vector space over $\mathbb{F}_{q^n}$.
Definition 14.5.26 A scattered Fq -linear set of Λ of the highest possible rank is called a
maximum scattered Fq -linear set of Λ.
Theorem 14.5.27 ([223]) The rank of a scattered Fq -linear set of PG(r − 1, q n ), rn even,
is at most rn/2.
Definition 14.5.29 A finite (right) quasifield Q is a finite algebra with at least two
elements, and two binary operations + and ◦, satisfying the following axioms:
Definition 14.5.30 A finite semifield S is a finite quasifield which also satisfies the left
distributivity law a ◦ (b + c) = a ◦ b + a ◦ c for all a, b, c ∈ S.
We identify the elements of a finite semifield, which is n-dimensional over some finite
field Fq , with the elements of the field extension Fqn .
Example 14.5.31 The Dickson commutative semifields S are defined in the following
way. Consider the finite field Fpm with p odd, m > 1, σ a nontrivial automorphism of
Fpm over Fp , and f a non-square in Fpm . Then the Dickson semifield has elements a + λb,
a, b ∈ Fpm , where λ is a symbol not in Fpm . The addition in the Dickson semifield is
componentwise and the multiplication is defined by
Remark 14.5.32 For an element a of the semifield S = (Fqn , ◦), we can define the endo-
morphism of left multiplication Ma ∈ EndFq (Fqn ) by Ma (b) = a ◦ b. We then can define the
spread set of the semifield S to be the set C(S) = {Ma | a ∈ S} ⊂ EndFq (Fqn ). Since S is a
finite-dimensional division algebra, every $M_a$, $a \in S \setminus \{0\}$, is invertible. Moreover, $C(S)$ is an $\mathbb{F}_q$-subspace of $\mathrm{End}_{\mathbb{F}_q}(\mathbb{F}_{q^n})$ since $M_{\lambda a+b} = \lambda M_a + M_b$ for every $\lambda \in \mathbb{F}_q$ and every $a, b \in \mathbb{F}_{q^n}$.
Theorem 14.5.33 ([501]) MRD codes in $\mathbb{F}_q^{n \times n}$, which contain the zero and the identity matrix, with minimum distance n correspond to finite quasifields Q with $\mathbb{F}_q \le \ker(Q)$ and $\dim_{\mathbb{F}_q}(Q) = n$.
Theorem 14.5.34 ([501]) Additively closed MRD codes in $\mathbb{F}_q^{n \times n}$, which contain the identity matrix, with minimum distance n correspond to finite semifields S with $\mathbb{F}_q \le \ker(S)$ and $\dim_{\mathbb{F}_q}(S) = n$.
$$(x_1y_1, x_1y_2, x_1y_3, \dots, x_ny_n) \mapsto \begin{pmatrix} x_1y_1 & x_1y_2 & \cdots & x_1y_n \\ x_2y_1 & x_2y_2 & \cdots & x_2y_n \\ \vdots & \vdots & & \vdots \\ x_ny_1 & x_ny_2 & \cdots & x_ny_n \end{pmatrix}$$
Theorem 14.5.36 ([963, Theorem 25.1.2]) The Segre variety $S_{n-1,n-1}$ of PG$(n^2-1,q)$ is the set of points of PG$(n^2-1,q)$ for which the corresponding matrix has rank 1.
Definition 14.5.37 An exterior set of points of PG$(n^2-1,q)$ with respect to the Segre variety $S_{n-1,n-1}$ in PG$(n^2-1,q)$ is a set of points E of PG$(n^2-1,q) \setminus S_{n-1,n-1}$ of size $\theta_{n^2-n-1}$ such that every line through two different points of E is skew to the Segre variety $S_{n-1,n-1}$.
Theorem 14.5.38 ([448]) An exterior set with respect to $S_{n-1,n-1}$ defines an MRD code in $\mathbb{F}_q^{n \times n}$ with minimum distance 2, which is closed under scalar multiplication in $\mathbb{F}_q$. Conversely, every MRD code in $\mathbb{F}_q^{n \times n}$ with minimum distance 2, which is closed under scalar multiplication in $\mathbb{F}_q$, defines an exterior set with respect to $S_{n-1,n-1}$.
The Gabidulin code Gn,n,n−1 is the best known MRD code with minimum distance 2
and so defines an exterior set with respect to Sn−1,n−1 .
Cossidente, Marino, and Pavese [448] constructed an exterior set with respect to the
Segre variety S2,2 . This construction was generalized to arbitrary n ≥ 3 by Durante and
Siciliano [653].
Definition 14.6.1
(a) Let S be a set of points in PG(n, q) and let P ∈ S. A tangent hyperplane TP (S)
to the set S in the point P is a hyperplane only sharing P with S.
(b) A strong representative system S in PG(n, q) is a set of points such that every
point P ∈ S belongs to at least one tangent hyperplane TP (S) to S.
Definition 14.6.2 A unital U of PG$(2,q)$, q square, is a set of $q\sqrt{q} + 1$ points of PG$(2,q)$ such that every line of PG$(2,q)$ intersects U in either 1 or $\sqrt{q} + 1$ points.
Example 14.6.3 The classical example of a unital in PG$(2,q)$, q square, is the set of points of a nonsingular Hermitian curve H. This is an absolutely irreducible algebraic curve of degree $\sqrt{q} + 1$ over $\mathbb{F}_q$ having standard equation $X_0^{\sqrt{q}+1} + X_1^{\sqrt{q}+1} + X_2^{\sqrt{q}+1} = 0$. Also other examples of unitals in PG$(2,q)$, q square, called Buekenhout–Metz unitals, are known [318, 1383].
Definition 14.6.4 An ovoid O of PG$(3,q)$ is a set of $q^2 + 1$ points of PG$(3,q)$, such that every plane of PG$(3,q)$ intersects O in either 1 or $q + 1$ points.
Example 14.6.5 The classical example of an ovoid in PG(3, q) is the set of points of a
nonsingular elliptic quadric Q. This is a nonsingular quadric in PG(3, q) having standard
equation f (X0 , X1 ) + X2 X3 = 0, where f (X0 , X1 ) is a homogeneous quadratic polynomial,
irreducible over Fq . Next to the elliptic quadric, one other type of ovoid is known. This is
the Tits ovoid in PG(3, q), q = 22h+1 , h ≥ 1 [1809].
The next theorem presents upper bounds on the size of strong representative systems of
PG(n, q).
The preceding upper bounds were obtained via geometric arguments. However, there
is an alternative way to obtain upper bounds on the size of strong representative systems
in PG(n, q) by using coding theoretic results. The next theorem follows from the ideas
developed in [225, Section 4].
By applying Theorem 14.4.4(c) to the result of the preceding theorem, the following
result on strong representative systems is obtained.
The preceding theorem and corollary motivate the title of this subsection. The result
of this theorem and corollary is, for particular primes p and exponents h, better than the
geometric bound of Theorem 14.6.7. So, here, results from coding theory, applied to a
geometric problem, lead to better results than the geometric techniques.
The preceding theorem and corollary are a particular example of the relevance of coding
theoretic research for Galois geometries, again showing the close relationship between the
research areas of coding theory and Galois geometries.
Chapter 15
Algebraic Geometry Codes and Some
Applications
Alain Couvreur
Inria and LIX, École Polytechnique
Hugues Randriambololona
ANSSI Laboratoire de cryptographie and Télécom Paris
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 308
History . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 308
Applications of Algebraic Geometry Codes . . . . . . . . . . . . . . . . . . . . . . 310
Organization of the Chapter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 310
15.1 Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 311
15.2 Curves and Function Fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 312
15.2.1 Curves, Points, Function Fields, and Places . . . . . . . . . . . . 312
15.2.2 Divisors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 313
15.2.3 Morphisms of Curves and Pullbacks . . . . . . . . . . . . . . . . . . . . 314
15.2.4 Differential Forms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 314
15.2.4.1 Canonical Divisors . . . . . . . . . . . . . . . . . . . . . . . . . 315
15.2.4.2 Residues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 315
15.2.5 Genus and the Riemann–Roch Theorem . . . . . . . . . . . . . . . . 315
15.3 Basics on Algebraic Geometry Codes . . . . . . . . . . . . . . . . . . . . . . . . . . . . 316
15.3.1 Algebraic Geometry Codes, Definitions, and Elementary
Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 316
15.3.2 Genus 0, Generalized Reed–Solomon and Classical
Goppa Codes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 319
15.3.2.1 The CL Description . . . . . . . . . . . . . . . . . . . . . . . . 320
15.3.2.2 The CΩ Description . . . . . . . . . . . . . . . . . . . . . . . . 321
15.4 Asymptotic Parameters of Algebraic Geometry Codes . . . . . . . . . . 323
15.4.1 Preamble . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 323
15.4.2 The Tsfasman–Vlăduţ–Zink Bound . . . . . . . . . . . . . . . . . . . . . 324
15.4.3 Subfield Subcodes and the Katsman–Tsfasman–Wirtz
Bound . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 326
15.4.4 Nonlinear Codes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 326
15.5 Improved Lower Bounds for the Minimum Distance . . . . . . . . . . . . 326
15.5.1 Floor Bounds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 328
15.5.2 Order Bounds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 330
15.5.3 Further Bounds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 332
15.5.4 Geometric Bounds for Codes from Embedded Curves . . 332
15.6 Decoding Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 333
15.6.1 Decoding Below Half the Designed Distance . . . . . . . . . . . . 333
15.6.1.1 The Basic Algorithm . . . . . . . . . . . . . . . . . . . . . . . 333
15.6.1.2
Getting Rid of Algebraic Geometry:
Error-Correcting Pairs . . . . . . . . . . . . . . . . . . . . . 336
15.6.1.3 Reducing the Gap to Half the Designed
Distance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 337
15.6.1.4 Decoding Up to Half the Designed
Distance: The Feng–Rao Algorithm and
Error-Correcting Arrays . . . . . . . . . . . . . . . . . . . 337
15.6.2 List Decoding and the Guruswami–Sudan Algorithm . . 339
15.7 Application to Public-Key Cryptography: A McEliece-Type
Cryptosystem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 340
15.7.1 History . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 340
15.7.2 McEliece’s Original Proposal Using Binary Classical
Goppa Codes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 342
15.7.3 Advantages and Drawbacks of the McEliece Scheme . . . 342
15.7.4 Janwa and Moreno’s Proposals Using AG Codes . . . . . . . 342
15.7.5 Security . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 343
15.7.5.1 Concatenated Codes Are Not Secure . . . . . . 343
15.7.5.2 Algebraic Geometry Codes Are Not Secure 343
15.7.5.3 Conclusion: Only Subfield Subcodes of AG
Codes Remain Unbroken . . . . . . . . . . . . . . . . . . 343
15.8 Applications Related to the ?-Product: Frameproof Codes,
Multiplication Algorithms, and Secret Sharing . . . . . . . . . . . . . . . . . 344
15.8.1 The ?-Product from the Perspective of AG Codes . . . . . . 344
15.8.1.1 Basic Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . 344
15.8.1.2 Dimension of ?-Products . . . . . . . . . . . . . . . . . . . 345
15.8.1.3 Joint Bounds on Dimension and Distance . 347
15.8.1.4 Automorphisms . . . . . . . . . . . . . . . . . . . . . . . . . . . . 347
15.8.2 Frameproof Codes and Separating Systems . . . . . . . . . . . . . 348
15.8.3 Multiplication Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 349
15.8.4 Arithmetic Secret Sharing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 352
15.9 Application to Distributed Storage Systems: Locally
Recoverable Codes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 354
15.9.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 354
15.9.2 A Bound on the Parameters Involving Locality . . . . . . . . . 356
15.9.3 Tamo–Barg Codes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 356
15.9.4 Locally Recoverable Codes from Coverings of Algebraic
Curves: Barg–Tamo–Vlăduţ Codes . . . . . . . . . . . . . . . . . . . . . . 357
15.9.5 Improvement: Locally Recoverable Codes with Higher
Local Distance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 359
15.9.6 Fibre Products of Curves and the Availability Problem 359
Acknowledgement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 361
Introduction
The theory of algebraic geometry codes is a fascinating topic at the confluence of number theory and algebraic geometry on one side, and of computer science, involving coding theory, combinatorics, information theory, and algorithms, on the other side.
History
The beginning of the story dates back to the early 1970s, when the Russian mathematician V. D. Goppa proposed code constructions from algebraic curves, using rational functions or differential forms on algebraic curves over finite fields [836, 837, 838]. In particular, this geometric point of view permits one to regard Goppa's original construction of
codes from rational functions [834, 835] (see [168] for a description of these codes in English)
as codes from differential forms on the projective line.
Shortly after, there appeared one of the most striking results in the history of coding
theory, which is probably at the origin of the remarkable success of algebraic geometry
codes. In 1982, Tsfasman, Vlăduţ, and Zink [1822] related the existence of a sequence of
curves whose numbers of rational points go to infinity and grow linearly with respect to
the curves’ genera to the existence of sequences of asymptotically good codes. Next, using
sequences of modular curves and Shimura curves, they proved the existence of sequences of
codes over a field Fq where q = p2 or p4 , for p a prime number, whose asymptotic rate R
and asymptotic relative distance δ satisfy
$$R \ge 1 - \delta - \frac{1}{\sqrt{q} - 1}.$$
In an independent work and using comparable arguments, Ihara [1021] proved a similar
result over any field $\mathbb{F}_q$ where q is a square. For this reason, the ratio
$$\limsup_{g \to +\infty} \frac{\max |X(\mathbb{F}_q)|}{g},$$
where the maximum is taken over the set of curves X of genus g over Fq , is usually referred
to as the Ihara constant and denoted by A(q). Further, Vlăduţ and Drinfeld [1857] proved
an upper bound for the Ihara constant showing that the families of curves exhibited in
[1021, 1822] are optimal.
The groundbreaking aspect of such a result appears when q is a square and q ≥ 49,
since, for such a parameter, the Tsfasman–Vlăduţ–Zink Bound is better than the asymp-
totic Gilbert–Varshamov Bound. Roughly speaking, this result asserts that some algebraic
geometry codes are better than random codes, while the opposite statement was commonly
believed in the coding theory community.
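This comparison is easy to make concrete. The following sketch (our own numerical illustration) evaluates both asymptotic bounds at $q = 49$ and $\delta = 1/2$, using the standard q-ary entropy function $h_q$ for the Gilbert–Varshamov Bound:

```python
from math import log

def entropy_q(delta, q):
    # q-ary entropy function h_q(delta)
    return (delta * log(q - 1, q) - delta * log(delta, q)
            - (1 - delta) * log(1 - delta, q))

q, delta = 49, 0.5
gv_rate = 1 - entropy_q(delta, q)           # asymptotic Gilbert-Varshamov rate
tvz_rate = 1 - delta - 1 / (q**0.5 - 1)     # Tsfasman-Vladut-Zink rate
assert tvz_rate > gv_rate                   # the TVZ bound wins at q = 49
```

Here the TVZ rate is $1/3 \approx 0.333$, while the Gilbert–Varshamov rate is roughly $0.325$, so for this alphabet size some algebraic geometry codes provably beat the random-coding guarantee.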
For this breakthrough, Tsfasman, Vlăduţ, and Zink received the prestigious Information
Theory Society Paper Award, and their result motivated an intense development of the
theory of algebraic geometry codes. The community explored various sides of this theory in
the following decades. First, the question of producing sequences of curves with maximal
Ihara constant, or the estimate of the Ihara constant A(q) when q is not a square, became a
challenging new problem in number theory. In particular, some constructions based on class
field theory gave lower bounds for A(q) when q is no longer a square. Second, in 1995, García
and Stichtenoth [790] obtained new optimal sequences of curves (i.e., reaching the Drinfeld–
Vlăduţ Bound) using a much more elementary construction called recursive towers. Apart
from the asymptotic questions, many authors improved, in some specific cases, Goppa’s
estimate for the minimum distance of algebraic geometry codes. Such results permitted the
construction of new codes of given length whose parameters beat the tables of best known
codes; see, for instance, [846]. Third, another fruitful direction is on the algorithmic side
with the development of polynomial time decoding algorithms correcting up to half the
designed distance and even further using list decoding. This chapter presents known results
on improved bounds on the minimum distance and discusses unique and list decoding. The
asymptotic performances of algebraic geometry codes are also quickly surveyed without
providing an in-depth study of the Ihara constant and the construction of optimal towers.
The latter topic, being much too rich, would require a separate treatment which we decided
not to develop in the present survey.
It should also be noted that codes may be constructed from higher dimensional varieties.
This subject, also of deep interest, will not be discussed in the present chapter. We refer the
interested reader to [1276] for a survey on this question, to [1474] for a decoding algorithm,
and to [460] for a first attempt toward good asymptotic constructions of codes from surfaces.
variants: generalized Reed–Solomon codes, extended Reed–Solomon codes, doubly extended Reed–Solomon
codes, etc. In this chapter, we refer to Reed–Solomon or generalized Reed–Solomon codes as the algebraic
geometry codes from a curve of genus 0. See Section 15.3.2 for further details.
This topic is not developed in the present survey, as it would require a separate treatment. Next, Section 15.5 is devoted to various improvements of the Goppa designed distance. Section 15.6 surveys the various decoding algorithms, starting with algorithms correcting up to half the designed distance and then moving to the more recent developments in list decoding, which permit one to exceed this threshold. Finally, the last three sections present various applications of
algebraic geometry codes. Namely, we study their possible use for post-quantum public
key cryptography in Section 15.7. Thanks to their nice behavior with respect to the so-
called ?-product, algebraic geometry codes have applications to algebraic complexity theory,
secret sharing and multiparty computation, which are presented in Section 15.8. Finally,
applications to distributed storage with the algebraic geometric constructions of locally
recoverable codes are presented in Section 15.9.
15.1 Notation
For any prime power q, the finite field with q elements is denoted by Fq and its algebraic
closure by F̄q. Given any ring R, the group of invertible elements of R is denoted by R×.
In particular, for a field F, the group F× is nothing but F \ {0}.
Unless otherwise specified, any code in this chapter is linear. The vector space Fnq is
equipped with the Hamming weight denoted by wtH (·), and the Hamming distance
between two vectors x, y is denoted by dH (x, y). Given a linear code C ⊆ Fnq , as is usual in
the literature, the fundamental parameters of C are listed as a triple of the form [n, k, d],
where n denotes its block length, k its dimension as an Fq -space, and d its minimum
Hamming distance, which is sometimes also referred to as dH (C). When the minimum
distance is unknown, we sometimes denote the known parameters by [n, k].
Another important notion in the sequel is that of the ⋆-product, also called the Hadamard
product. The ⋆-product is nothing but the componentwise multiplication in Fq^n: for x =
(x1, . . . , xn) and y = (y1, . . . , yn), we have
x ⋆ y := (x1 y1, . . . , xn yn).
This notion extends to codes: given two linear codes C, C′ ⊆ Fq^n we let
C ⋆ C′ := spanFq {c ⋆ c′ | c ∈ C, c′ ∈ C′}
be the linear span of the pairwise products of codewords from C and C′. Observe that
C ⋆ C′ ⊆ Fq^n is again a linear code since we take the linear span. Then the square of C is
defined as
C^⋆2 := C ⋆ C.
Recall that Fq^n is equipped with the Euclidean inner product (or bilinear form) defined
as
⟨·, ·⟩Eucl : Fq^n × Fq^n → Fq, (x, y) ↦ Σ_{i=1}^n xi yi.
The ⋆-product and the Euclidean inner product are related by the following adjunction
property.
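These definitions are easy to experiment with. The toy script below (our own, over the prime field F_5, so arithmetic is plain modular arithmetic) computes a square C^⋆2 and also checks numerically the well-known adjunction identity ⟨x ⋆ y, z⟩_Eucl = ⟨x, y ⋆ z⟩_Eucl relating the ⋆-product to the Euclidean inner product; all names in the script are ours.

```python
P = 5  # work over F_5

def star(x, y):
    """Componentwise (Schur/Hadamard) product x * y."""
    return tuple(a * b % P for a, b in zip(x, y))

def span(gens, n):
    """The F_5-linear span of the generators (incremental closure, fine for tiny codes)."""
    S = {(0,) * n}
    for g in gens:
        S = {tuple((w[i] + c * g[i]) % P for i in range(n)) for w in S for c in range(P)}
    return S

n = 4
C = span([(1, 1, 1, 1), (0, 1, 2, 3)], n)            # a [4, 2] code over F_5
C2 = span({star(c, cp) for c in C for cp in C}, n)   # C^{*2}: span of pairwise products

assert len(C) == 5 ** 2   # dim C = 2
assert len(C2) == 5 ** 3  # here dim C^{*2} = 3 > dim C

# Adjunction with the Euclidean inner product: <x * y, z> = <x, y * z>.
def eucl(x, y):
    return sum(a * b for a, b in zip(x, y)) % P

x, y, z = (1, 2, 3, 4), (2, 0, 1, 3), (4, 4, 1, 0)
assert eucl(star(x, y), z) == eucl(x, star(y, z))
```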
Any such F is of the form F = Frac(Fq[x, y]/(P(x, y))) where P ∈ Fq[x, y] is absolutely
irreducible, i.e., irreducible even when regarded as an element of F̄q[x, y].
Any such curve can be obtained as the projective completion and desingularization of an
affine plane curve of the form {P (x, y) = 0} where P is an absolutely irreducible polynomial
in the two indeterminates x and y over Fq .
From this observation we see that Definitions 15.2.1 and 15.2.2 are essentially equivalent.
This can be made more precise.
Theorem 15.2.3 There is an equivalence of categories between:
• algebraic function fields, with field morphisms, over Fq , and
• curves, with dominant (surjective) morphisms, over Fq .
The proof can be found for example in [915, § I.6]. In one direction, to each curve X one
associates its field of rational functions F = Fq (X), which is an algebraic function field. We
then have a natural correspondence between:
• places, or discrete valuations of F ;
• Galois orbits in the set X(F̄q) of points of X with coordinates in the algebraic closure
of Fq;
• closed points of the topological space of X seen as a scheme.
15.2.2 Divisors
The divisor group Div(X) is the free abelian group generated by the set of closed
points of X or equivalently by the set of places of its function field Fq (X). Thus, a divisor
is a formal sum
D = ΣP nP P
where P ranges over closed points and vP(D) := nP ∈ Z are almost all zero. The support
of D is the finite set supp(D) of those P with nP ≠ 0. The degree of D is
deg(D) := ΣP nP deg(P).
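The bookkeeping behind divisors is easy to mirror in code. The following minimal sketch (all names are ours and purely illustrative) stores a divisor as a dictionary {place: nP} and carries the degrees deg(P) of the places separately.

```python
# A divisor is a finite formal sum sum_P n_P * P; here it is a dict {place: n_P}.

def divisor_degree(D, place_degrees):
    """deg(D) = sum of n_P * deg(P) over the support."""
    return sum(n * place_degrees[Pl] for Pl, n in D.items())

def support(D):
    """The finitely many places with n_P != 0."""
    return {Pl for Pl, n in D.items() if n != 0}

def add_divisors(D1, D2):
    """Divisors form a group under coefficientwise addition."""
    out = dict(D1)
    for Pl, n in D2.items():
        out[Pl] = out.get(Pl, 0) + n
    return {Pl: n for Pl, n in out.items() if n != 0}

# Two hypothetical places: a rational place P1 (degree 1) and a place Q of degree 2.
degs = {"P1": 1, "Q": 2}
D = {"P1": 3, "Q": -1}                               # D = 3*P1 - Q
assert divisor_degree(D, degs) == 3 * 1 + (-1) * 2   # deg(D) = 1
assert support(D) == {"P1", "Q"}
assert divisor_degree(add_divisors(D, {"P1": -3, "Q": 1}), degs) == 0
```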
It has finite dimension ℓ(D) := dimFq L(D). Actually ℓ(D) only depends on the linear
equivalence class of D in Cl(X).
To a divisor D = ΣP nP P, one can associate an invertible sheaf (or line bundle) O(D)
on X, generated locally at each P by tP^{−nP}, for tP a uniformizer at P. There is then a
natural identification
L(D) = Γ(X, O(D))
between the Riemann–Roch space of D and the space of global sections of O(D). Conversely,
given an invertible sheaf L on X, any choice of a nonzero rational section s of L defines a
divisor D = div(s) with L ≅ O(D). Another choice of s gives a linearly equivalent D. From
this we get an isomorphism
Cl(X) ≅ Pic(X)
where Pic(X), the Picard group of X, is the group of isomorphism classes of invertible
sheaves on X equipped with the tensor product.
where eQ|Pi denotes the ramification index at Q; see [1755, Def. 3.1.5].
The divisor φ∗ D is sometimes also called the conorm of D; see [1755, Def. 3.1.8]. In
addition, it is well-known that
Equivalently, ΩF is the space of rational sections of the invertible sheaf Ω1X , called the
canonical sheaf, or the sheaf of differentials of X, generated locally at each P by the
differential dtP , for tP a uniformizer at P .
For any divisor D we set
Ω(D) := Γ(X, Ω¹X ⊗ O(−D)) = {ω ∈ ΩF \ {0} | div(ω) ≥ D} ∪ {0}.
Remark 15.2.5 Beware of the sign change compared to the definition of the L(D) space.
This choice of notation could seem unnatural, but it is somehow standard in the literature
on algebraic geometry codes (see, e.g., [1755]), and is related to Serre duality.
15.2.4.2 Residues
Given a rational differential ωP at P and a uniformizer tP we have a local Laurent series
expansion
ωP = a−N tP^{−N} dtP + · · · + a−1 tP^{−1} dtP + ηP
where N is the order of the pole of ωP at P and ηP is regular at P, i.e., vP(ηP) ≥ 0. Then
g = ℓ(KX).
ℓ(D) ≥ deg(D) + 1 − g.
In addition, Riemann–Roch spaces satisfy the following properties; see, e.g., Cor. 1.4.12(b)
and Th. 1.5.17 of [1755].
Proposition 15.2.7 Let D be a divisor on a curve X such that deg(D) < 0. Then L(D) =
{0}.
ℓ(D) = deg(D) + 1 − g.
Definition 15.3.1 The evaluation code, or function code, CL(X, P, G) is the image of
the map
L(G) → Fq^n, f ↦ (f(P1), . . . , f(Pn)).
Definition 15.3.2 The residue code, or differential code, CΩ(X, P, G) is the image of
the map
Ω(G − DP) → Fq^n, ω ↦ (resP1(ω), . . . , resPn(ω))
where again DP = P1 + · · · + Pn.
The two constructions are dual to each other; see, for instance, [1755, Th. 2.2.8].
Theorem 15.3.3 The evaluation code and the residue code are duals of each other:
Definition 15.3.4 Two codes C1, C2 ⊆ Fq^n are diagonally equivalent (under a ∈ (Fq^×)^n)
if
C2 = C1 ⋆ a
or equivalently if
G2 = G1 Da
where G1 , G2 are generator matrices of C1 , C2 respectively, and Da is the n × n diagonal
matrix whose diagonal entries are the entries of a.
Remark 15.3.5 Diagonally equivalent codes are isometric with respect to the Hamming
distance.
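This isometry is immediate to check numerically. The following toy script (ours, over F_7) verifies that multiplying componentwise by a vector a with invertible entries preserves Hamming weights, hence Hamming distances.

```python
P = 7

def wt(x):
    """Hamming weight: number of nonzero coordinates."""
    return sum(1 for v in x if v % P != 0)

def star(x, a):
    """Componentwise product, i.e., the diagonal action of a."""
    return tuple(u * v % P for u, v in zip(x, a))

a = (1, 3, 2, 6, 5)   # all entries in F_7^x, so no nonzero coordinate is killed
for c in [(0, 1, 0, 4, 0), (2, 2, 2, 2, 2), (0, 0, 0, 0, 0)]:
    assert wt(star(c, a)) == wt(c)
```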
Lemma 15.3.6 Let G1 ∼ G2 be two linearly equivalent divisors on X, both with support
disjoint from P = {P1 , . . . , Pn }. Then CL (X, P, G1 ) and CL (X, P, G2 ) are diagonally equiv-
alent under a = (h(P1 ), . . . , h(Pn )) where h is any choice of function with div(h) = G1 −G2 .
Likewise CΩ (X, P, G1 ) and CΩ (X, P, G2 ) are diagonally equivalent under a−1 .
Remark 15.3.7 In Lemma 15.3.6, the choice of h is not unique. More precisely, h may
be replaced by any nonzero scalar multiple λh of h. This would replace a by λa, which
has no consequence, since the codes are linear and hence globally invariant by a scalar
multiplication.
Remark 15.3.8 If we accept codes defined only up to diagonal equivalence, then we can
relax the condition that supp(G) is disjoint from P in Definitions 15.3.1 and 15.3.2. Indeed,
if supp(G) is not disjoint from P, then by the weak approximation theorem [1755, Th. 1.3.1]
we can find G0 ∼ G with support disjoint from P, and then we can use CL (X, P, G0 ) in place
of CL (X, P, G), and CΩ (X, P, G0 ) in place of CΩ (X, P, G). Lemma 15.3.6 then shows that,
up to diagonal equivalence, these codes do not depend on the choice of G0 .
In summary, the usual restriction “the support of G should avoid the Pi ’s” in the defi-
nition of CL (X, P, G) can always be lifted at the cost of some technical clarifications.
Remark 15.3.11 The canonical divisor providing the equality between the CΩ and the
CL constructions is the divisor of a differential form having simple poles with residue equal to 1 at all
the Pi ’s. The existence of such a differential is a consequence of the weak approximation
theorem; see [1755, Lem. 2.2.9 & Prop. 2.2.10].
Theorem 15.3.12 ([1755, Th. 2.2.2 & 2.2.7]) The evaluation code CL (X, P, G) is a
linear code of length n = |P| = deg(DP ) and dimension
k = ℓ(G) − ℓ(G − DP).
k = ℓ(G) ≥ deg(G) + 1 − g,
k = ℓ(G) = deg(G) + 1 − g,
where g is the genus of X. Its minimum distance d = dH (CL (X, P, G)) satisfies
d ≥ d∗Gop := n − deg(G)
k = n + g − 1 − deg(G)
Remark 15.3.15 In the sequel, both quantities d∗Gop and dGop are referred to as the
Goppa Bound or the Goppa designed distance. They do not provide a priori the ac-
tual minimum distance but yield a lower bound. In addition, as we will see in Section 15.6,
correcting errors up to half these lower bounds will be considered as good “targets” for
decoding.
c ↦ cPσ ⋆ v
ϕσ : Fq(X) → Fq(X), f ↦ f ◦ σ
induces an isomorphism L(G) ≅ L(σ∗G).
φσ : CL(X, P, G) → CL(X, P, σ∗G), (f(P1), . . . , f(Pn)) ↦ (f(σ(P1)), . . . , f(σ(Pn))),
which, by definition of Pσ, is nothing but the map c ↦ cPσ. Next, Lemma 15.3.6 yields an
isomorphism
ψ : CL(X, P, σ∗G) → CL(X, P, G), c ↦ c ⋆ v.
The composition map ψ ◦ φσ provides an automorphism of CL(X, P, G) which is the map
c ↦ cPσ ⋆ v.
Remark 15.3.18 In this chapter, according to the usual notation in algebraic geometry,
the projective space of dimension m is denoted as P^m. In particular, its set of rational points
P^m(Fq) is the finite set sometimes denoted as PG(m, q) in the literature of finite geometries
and combinatorics.
where, by convention, the zero polynomial has degree −∞. A Reed–Solomon (RS) code
is a GRS code with y = (1, . . . , 1) and is denoted as RSk(x).
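The MDS property of GRS codes discussed below can be illustrated by brute force on a toy instance. The script and its names are ours; GRS_k(x, y) is taken here in its standard evaluation form {(y1 f(x1), . . . , yn f(xn)) : deg f < k}.

```python
from itertools import product

P = 7  # prime field F_7

def grs_codewords(x, y, k):
    """All codewords (y_i * f(x_i))_i for f ranging over polynomials of degree < k."""
    n = len(x)
    for coeffs in product(range(P), repeat=k):
        f = lambda t: sum(c * pow(t, i, P) for i, c in enumerate(coeffs)) % P
        yield tuple(y[i] * f(x[i]) % P for i in range(n))

x = (0, 1, 2, 3, 4, 5)   # distinct evaluation points
y = (1, 3, 1, 2, 6, 5)   # nonzero column multipliers
k = 2
words = set(grs_codewords(x, y, k))
n = len(x)
d = min(sum(1 for v in c if v) for c in words if any(c))   # minimum weight
assert len(words) == P ** k   # |C| = q^k, i.e., dimension k
assert d == n - k + 1         # MDS: meets the Singleton Bound
```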
Remark 15.3.21 Beware that our definition of Reed–Solomon codes slightly differs from
the most usual one in the literature and in particular may differ from that of other chapters of
this encyclopedia. Indeed, most of the references define Reed–Solomon codes as cyclic codes
of length q − 1, i.e., as a particular case of BCH codes. For instance, see Definition 1.14.8,
[1008, § 5.2], [1323, § 10.2], [1602, § 5.2], or [1836, Def. 6.8.1]. Note that this commonly
accepted definition is not exactly the historical one made by Reed and Solomon themselves
in [1582], where they introduced a code of length n = 2m over F2m which is not cyclic.
Further, Reed–Solomon codes “of length q” are sometimes referred to as extended Reed–
Solomon codes. Next, using Remark 15.3.8, one can actually define generalized Reed–
Solomon codes of length q+1, corresponding to the codes called (generalized) doubly extended
Reed–Solomon codes in the literature; see also Definition 14.2.6.
Generalized Reed–Solomon codes are known to have length n, dimension k, and mini-
mum distance d = n − k + 1; see Lemma 24.1.2. That is to say, such codes are maximum
distance separable (MDS), i.e., they meet the Singleton Bound of Theorem 1.9.10 asserting
that for any code of length n and dimension k and minimum distance d, we always have
k + d ≤ n + 1. In addition, many algebraic constructions of codes such as BCH codes, Goppa
codes, or Srivastava codes derive from some particular GRS codes by applying the subfield
subcode operation.
Definition 15.3.22 Consider a finite field Fq and its degree m extension F_{q^m} for some
positive integer m. Let C ⊆ F_{q^m}^n be a linear code; the subfield subcode of C is the code
C ∩ Fq^n.
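The subfield subcode operation can be made concrete on a tiny example. The script below is entirely ours: it realizes F_4 = {0, 1, w, w+1} as the integers 0..3 (addition = XOR, multiplication from w² = w + 1) and intersects a small [3, 2] code over F_4 with F_2³.

```python
from itertools import product

def f4_mul(a, b):
    """Multiplication in F_4, encoded 0, 1, w -> 2, w+1 -> 3, from w^2 = w + 1."""
    MUL = [[0, 0, 0, 0], [0, 1, 2, 3], [0, 2, 3, 1], [0, 3, 1, 2]]
    return MUL[a][b]

def f4_add(a, b):
    return a ^ b   # addition in characteristic 2 is XOR

G = [(1, 1, 2), (0, 2, 3)]   # generator rows of a [3, 2] code over F_4

codewords = set()
for c0, c1 in product(range(4), repeat=2):
    row = tuple(f4_add(f4_mul(c0, G[0][i]), f4_mul(c1, G[1][i])) for i in range(3))
    codewords.add(row)

# the subfield subcode: codewords with all coordinates in F_2 = {0, 1}
subfield = {c for c in codewords if all(v in (0, 1) for v in c)}

# It is F_2-linear: closed under componentwise XOR and contains 0.
for u in subfield:
    for v in subfield:
        assert tuple(a ^ b for a, b in zip(u, v)) in subfield
assert (0, 0, 0) in subfield
```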
The operation defined above is of particular interest for public-key cryptography applica-
tions. All these codes fit in a broader class called alternant codes. See for instance [1323,
Chap. 12, Fig. 12.1].
In some sense, algebraic geometry codes are natural generalizations of generalized Reed–
Solomon codes, the latter being algebraic geometry codes from the projective line P1 . Let
us start with the case of Reed–Solomon codes.
Proof: By definition, the Riemann–Roch space L((k − 1)P∞ ) is the space of rational func-
tions with a pole of order less than k at infinity, which is nothing but the space of polynomials
of degree less than k.
More generally, the equivalence between GRS codes and algebraic geometry codes from
P1 is summarized in the next theorem.
Theorem 15.3.24 Any generalized Reed–Solomon code is an AG code from P1 whose eval-
uation points avoid P∞ . Conversely, any such AG code is a GRS code. More precisely:
(a) For any pair (x, y) as in Definition 15.3.19, we have GRS k (x, y) = CL (P1 , P, G), with
where h is the Lagrange interpolation polynomial satisfying deg(h) < n and h(xi ) = yi
for any 1 ≤ i ≤ n.
(b) Conversely, for any ordered n-tuple P of distinct rational points of P1 \ {P∞ } with
coordinates (x1 : 1), . . . , (xn : 1) and any divisor G of P1 of degree k − 1 whose support
avoids P, we have CL (P1 , P, G) = GRS k (x, y) where
Proof: Using Remark 15.3.20, it suffices to prove that CL(P1, P, G) = RSk(x) ⋆ y. This
last equality is deduced from Proposition 15.3.23 together with Lemma 15.3.6 since (k −
1)P∞ − G = div(h), proving (a).
Conversely, since deg(G) = k − 1 and the genus of P1 is 0, then from Corollary 15.2.8, we
deduce that the space L(G − (k − 1)P∞ ) has dimension 1. Let us take any nonzero function
f in this space. Then by definition,
and, for degree reasons, the above inequality is an equality. Set y = (f (x1 ), . . . , f (xn )).
Again from Proposition 15.3.23 and Lemma 15.3.6, we deduce that
Remark 15.3.25 Theorem 15.3.24 asserts that any AG code from the projective line is
diagonally equivalent to a one-point code, i.e., an AG code whose divisor G is supported
by a single point. This fact can be observed directly, without introducing the notion of Reed–
Solomon codes, by using a classical result in algebraic geometry asserting that on P1, two
divisors are linearly equivalent if and only if they have the same degree.
The major interest of this definition lies in the case of a proper subfield K ⊊ Fq, for
which the corresponding code benefits from a better estimate of its parameters compared
to general alternant codes [1323, 1763]. This estimate comes with an efficient decoding
algorithm called the Patterson Algorithm [1493] correcting up to half the designed minimum
distance. However, in what follows, we are mainly interested in the relationship between this
construction and that of CΩ codes and, for this reason, we will focus on the case Fq = K.
Remark 15.3.27 Note that the terminology might be misleading here. Algebraic geometry
codes are sometimes referred to as Goppa codes, while classical Goppa codes are in general
not algebraic geometry codes from P1, since K may be different from Fq. These codes
are subfield subcodes (see Definition 15.3.22) of some algebraic geometry codes from P1.
For this reason and to avoid any confusion, in the present chapter, we refer to algebraic
geometry codes when speaking about CL and CΩ codes, and to Goppa codes or classical
Goppa codes when dealing with codes as defined in (15.6) with K ⊊ Fq.
Theorem 15.3.28 Denote by P = (P1 , . . . , Pn ) = ((x1 : 1), . . . , (xn : 1)). The code
Γ(x, f, Fq ) equals CΩ (P1 , P, (f )0 − P∞ ) where (f )0 is the effective divisor given by the zeroes
of the polynomial f counted with multiplicity.
The previous discussion shows that the poles of η are simple and are contained in
{P1 , . . . , Pn , P∞ }. In addition, the two forms have simple poles and the same residue at
any of the points P1 , . . . , Pn , and by the residue formula [1755, Cor. 4.3.3] they also should
have the same residue at P∞ . Therefore, the differential form η −ω has no pole on P1 . More-
over, since the degree of a canonical divisor is 2g − 2 = −2, a nonzero rational differential
form on P1 should have poles. Therefore η = ω, and we deduce that the rational function
Σi resPi(ω)/(x − xi) vanishes on (f)0 or, equivalently, that
Σ_{i=1}^n resPi(ω)/(x − xi) ≡ 0 mod (f).
This yields the inclusion CΩ(P1, P, (f)0 − P∞) ⊆ Γ(x, f, Fq) and concludes the proof.
where F_{q^m}[X]_{<k} denotes the subspace of polynomials of degree less than k, and
(f(X), Xf(X))(x) := (f(x1), x1 f(x1), f(x2), x2 f(x2), . . . , f(xn), xn f(xn)).
The image of this evaluation map is a [2n, k] code over F_{q^m}, and identifying F_{q^m} with F_q^m
this becomes a [2nm, km] code C over Fq.
Theorem 15.4.1 Let 0 < R < 1/2. Then as m → ∞ and k/n → 2R, the codes C are asymp-
totically good, with asymptotic rate R and asymptotic relative minimum distance at least
(1 − 2R)Hq^{−1}(1/2).
The idea of the proof is that, for ε > 0 and m → ∞, the number of vectors in Fq^{2m} of
relative weight less than Hq^{−1}(1/2 − ε) is roughly q^{2m(1/2−ε)} = q^{m(1−2ε)}, which is exponentially
negligible compared to n = q^m − 1. Moreover, if such a vector, seen in F_{q^m} × F_{q^m}, is of the
form (α, xi α) with α ≠ 0, then it uniquely determines xi. Now if f has degree k ≤ 2Rn, there are at
least (1 − 2R)n values of xi such that f(xi) ≠ 0, and then, except for a negligible fraction
of them, (f(xi), xi f(xi)) has weight at least 2mHq^{−1}(1/2 − ε) in Fq^{2m}. The conclusion follows.
A(q) := lim sup_{g(X)→∞} n(X)/g(X)
where X ranges over all curves over Fq, and n(X) = |X(Fq)| is the number of rational
points of X.
We then readily get:
Theorem 15.4.3 Assume A(q) > 1. Then, for any R, δ > 0 satisfying
R + δ = 1 − 1/A(q),
there exist asymptotically good algebraic geometry codes with asymptotic rate at least R and
asymptotic relative minimum distance at least δ.
For this result to be meaningful, we need estimates on A(q). Let us start with upper
bounds. First, the well-known Hasse–Weil Bound [1755, Th. 5.2.3] implies A(q) ≤ 2√q. Sev-
eral improvements were proposed, culminating with the following, known as the Drinfeld–
Vlăduţ Bound.
Theorem 15.4.4 ([1857]) For any q we have A(q) ≤ √q − 1.
On the other hand, lower bounds on A(q) combine with Theorem 15.4.3 to give lower
bounds on codes.
• When q = p^{2m} (m ≥ 1) is a square, we have A(q) ≥ √q − 1. This was first proved
by Ihara in [1021], then independently in [1822], using modular curves if m = 1,
and Shimura curves for general m. Observe that Ihara’s lower bound matches the
Drinfeld–Vlăduţ Bound, so we actually get equality: A(q) = √q − 1. Other more
effective constructions matching the Drinfeld–Vlăduţ Bound were later proposed, for
instance in [790]. These constructions use recursive towers of curves, although it was
observed by Elkies [676, 677] that they in fact yield modular curves.
Combined with Theorem 15.4.3, Ihara’s lower bound gives asymptotically good codes
with
R + δ ≥ 1 − 1/(√q − 1).
This is known as the Tsfasman–Vlăduţ–Zink (TVZ) Bound [1822]. For q ≥ 49,
it is shown that this TVZ Bound beats the Gilbert–Varshamov Bound on a certain
interval, as illustrated in Figure 15.1.
• When q = p^{2m+1} (m ≥ 1) is a non-prime odd power of a prime, Bassa, Beelen, García,
and Stichtenoth [135] show
A(p^{2m+1}) ≥ 2(p^{m+1} − 1) / (p + 1 + (p − 1)/(p^m − 1)) = ( (1/2)((p^m − 1)^{−1} + (p^{m+1} − 1)^{−1}) )^{−1}.
The proof is constructive and uses a recursive tower of curves, although these curves
can also be interpreted in terms of Drinfeld modular varieties.
• For general q, Serre [1653] shows A(q) ≥ c log(q) > 0 for a certain constant c which
can be taken as c = 1/96 (see [1437, Th. 5.2.9]). When q = p is prime, this is often the
best one knows.
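The claim about q ≥ 49 can be checked numerically. The short script below is our own; it compares the TVZ line R = 1 − 1/(√q − 1) − δ with the Gilbert–Varshamov curve R = 1 − H_q(δ) for q = 64, where H_q is the standard q-ary entropy function.

```python
import math

q = 64

def Hq(x):
    """q-ary entropy function, for 0 < x < 1 - 1/q."""
    return (x * math.log(q - 1, q)
            - x * math.log(x, q)
            - (1 - x) * math.log(1 - x, q))

def tvz(delta):
    return 1 - 1 / (math.sqrt(q) - 1) - delta

def gv(delta):
    return 1 - Hq(delta)

# Around delta = 0.5 the TVZ bound is strictly better than GV for q = 64 ...
assert tvz(0.5) > gv(0.5)
# ... but near the endpoints GV wins again, so the improvement holds on an interval only.
assert tvz(0.01) < gv(0.01)
```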
[Figure 15.1: Comparison in the (δ, R)-plane of the TVZ Bound, the Plotkin Bound, and the Gilbert–Varshamov Bound.]
There are some explicit lower bounds for A(p) where p = 2, 3, 5, . . . is a small prime.
However these bounds cannot be used in Theorem 15.4.3, which requires A(q) > 1. Indeed,
the Drinfeld–Vlăduţ Bound actually shows A(p) < 1 for p = 2 or 3.
Theorem 15.4.5 For any even positive integer ℓ, there exists a sequence of codes over Fq,
defined as subfield subcodes of codes over F_{q^ℓ}, whose asymptotic parameters (R, δ) satisfy
In particular, when ℓ → ∞ the asymptotic parameters of such codes reach the Gilbert–
Varshamov Bound for δ ∼ 0.
Theorem 15.4.6 For any prime power q, there exist asymptotically good nonlinear codes
over Fq with asymptotic relative minimum distance δ and asymptotic rate
R ≥ 1 − δ − 1/A(q) + log_q(1 + 1/q³).
This holds for any δ > 0 such that this quantity is positive.
k + d ≥ n + 1 − g. (15.7)
Therefore, the parameters of an algebraic geometry code are “at distance at most g” from the
Singleton Bound. However, the Goppa Bound is not always reached and improvements may
exist under some hypotheses. The present section is devoted to such possible improvements.
It should be emphasized that this concerns only “finite length” codes; the statements to
follow do not provide any improvement on the asymptotic performances of (linear) algebraic
geometry codes presented in Section 15.4.
First, let us try to understand why the Goppa Bound may not be reached by proving
the following lemma.
Lemma 15.5.1 The code CL (X, P, G) has minimum distance d = d∗Gop = n − deg(G) if
and only if there exists f ∈ L(G) such that the positive divisor div(f ) + G is of the form
Pi1 + · · · + Pis where the Pij’s are distinct points among P1 , . . . , Pn .
Proof: Suppose there exists such an f ∈ L(G) satisfying div(f ) + G = Pi1 + · · · + Pis .
Since principal divisors have degree 0, then s = deg(G). Consequently, the corresponding
codeword vanishes at positions with index i1 , . . . , is and hence has weight n − deg(G). Thus,
such a code reaches the Goppa Bound.
Conversely, if the Goppa Bound is reached, then there exists f ∈ L(G) vanishing at
s = deg(G) distinct points Pi1 , . . . , Pis among the Pi ’s. Hence
Remark 15.5.2 In algebraic geometry, the complete linear system or complete lin-
ear series associated to a divisor G, denoted by |G|, is the set of positive divisors linearly
equivalent to G. Such a set is parameterized by the projective space P(L(G)). Using this
language, one can claim that the code CL (X, P, G) reaches the Goppa Bound if and only if
|G| contains a reduced divisor supported by the Pi ’s.
For instance, one can provide examples of codes from elliptic curves (i.e., curves with
g = 1) that are MDS, i.e., reach the Singleton Bound.
Example 15.5.3 Recall that given an elliptic curve E over a finite field Fq with a fixed
rational point OE , the set of rational points has a natural structure of an abelian group with
zero element OE given by the chord-tangent group law; see [1713, § III.2]. The addition law
will be denoted by ⊕. This group law can also be deduced from the linear equivalence of
divisors (see (15.1)) as follows:
P ⊕Q=R ⇐⇒ (P − OE ) + (Q − OE ) ∼ (R − OE ) (15.8)
We claim that C is MDS. Indeed, from the Singleton Bound and (15.7), the minimum
distance of the code is either n − k or n − k + 1. Suppose the minimum distance is n − k;
then, from Lemma 15.5.1, there exists f ∈ L(G) and k distinct points Pi1 , . . . , Pik among
P1 , . . . , Pn such that
div(f ) = Pi1 + · · · + Pik − G.
Therefore
where the last equivalence is a consequence of (15.8) and (15.9), and contradicts the assertion
r > n(n + 1)/2. Thus, C is MDS.
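The chord-tangent group law used in this example can be sketched concretely. The script below is our own toy illustration (the curve y² = x³ + x + 1 over F_5 is not from the chapter); O denotes the point at infinity, the zero of the group.

```python
p, a, b = 5, 1, 1    # y^2 = x^3 + x + 1 over F_5 (nonsingular)
O = None             # the point at infinity, neutral element of the group law

def ec_add(P, Q):
    """Chord-tangent addition P (+) Q."""
    if P is O:
        return Q
    if Q is O:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return O                                           # P (+) (-P) = O
    if P == Q:
        lam = (3 * x1 * x1 + a) * pow(2 * y1, -1, p) % p   # tangent slope
    else:
        lam = (y2 - y1) * pow(x2 - x1, -1, p) % p          # chord slope
    x3 = (lam * lam - x1 - x2) % p
    y3 = (lam * (x1 - x3) - y1) % p
    return (x3, y3)

points = [O] + [(x, y) for x in range(p) for y in range(p)
                if (y * y - x ** 3 - a * x - b) % p == 0]
assert len(points) == 9   # this toy curve has 9 rational points (including O)

# Every point has order dividing the group order 9.
for Pt in points:
    acc = O
    for _ in range(9):
        acc = ec_add(acc, Pt)
    assert acc is O
```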
Remark 15.5.4 The existence of MDS codes from elliptic curves is further discussed in
[1821, § 4.4.2].
for some strictly decreasing sequence of divisors (Gi )i and the iterated estimates of
the minimum distance of the sets CL (X, P, Gi ) \ CL (X, P, Gi+1 ).
A nice overview of many improved bounds in the literature is given in [656].
Thanks to Lemma 15.3.10, we know that any CL code is a CΩ one. Therefore, always
choosing the most convenient point of view, we alternate between improved lower bounds
for the minimum distance of CL and CΩ codes. Our point is to provide lower bounds for the
minimum distance that improve the Goppa designed distance
dGop := deg(G − KX) = deg(G) + 2 − 2g,
Remark 15.5.5 Actually, the notion of base point depends only on the divisor class. Thus,
if A0 ∼ A, then P is also a base point of A0 .
Remark 15.5.6 From the Riemann–Roch Theorem, any divisor A such that deg(A) >
2g − 1 has no base points.
If a divisor G has a base point P outside the set P, then CL (X, P, G) = CL (X, P, G − P )
which implies that the minimum distance of this code satisfies d ≥ n − deg(G) + 1 instead
of n − deg(G). Similarly, for CΩ codes, if G − KX has a base point, we have the following.
Lemma 15.5.7 ([656, Lem. 1.3]) Let P be a base point of G − KX where P is disjoint
from the elements of P. Then
Remark 15.5.8 According to Remark 15.3.8, one can get rid of the hypothesis that P is
disjoint from the elements of P. In such a situation, the codes CL (X, P, G) and CL (X, P, G−
P ) are only diagonally equivalent, but the result on the minimum distance still holds.
More generally, the floor ⌊A⌋ of a divisor A is the divisor A′ of smallest degree such
that L(A′) = L(A). Such a divisor satisfies ⌊A⌋ ≤ A [1325, Prop. 2.1] and we have these
bounds:
Theorem 15.5.9 (Maharaj, Matthews, Pirsic [1325, Th. 2.9]) The code CL(X, P, G)
has dimension k and minimum distance d which satisfy
k ≥ deg(G) + 1 − g and d ≥ n − deg(⌊G⌋).
This principle has been used and improved in several references such as [658, 872, 1304,
1325]. The so-called ABZ Bound due to Duursma and Park [658, Th. 2.4], inspired by
the AB Bound of van Lint and Wilson [1838, Th. 5], permits one to deduce many other
bounds.
Theorem 15.5.10 (Duursma, Park [658, Th. 2.4]) Let G = A + B + Z for a divisor
Z ≥ 0 whose support is disjoint from P. Then
Remark 15.5.11 The Goppa designed distance can be deduced from the ABZ Bound by
choosing A = G and B = Z = 0. Indeed, we get
Several floor bounds can be deduced from Theorem 15.5.10, such as the Lundell–
McCullough (LM) Bound and the Güneri–Stichtenoth–Taşkin (GST) Bound.
Remark 15.5.13 The case A = B was previously proved by Maharaj, Matthews, and
Pirsic in [1325, Th. 2.10].
Theorem 15.5.14 (Güneri, Stichtenoth, Taşkin [872, Th. 2.4]) Let A, B, C, and Z
be divisors on X satisfying:
• The support of A + B + C + Z is disjoint from P.
Proof: This is proved in [656, Cor. 2.6] as another consequence of Theorem 15.5.10.
Example 15.5.15 This example is borrowed from [656]. Calculations have been verified
using Magma [250]. Consider the Suzuki curve X over F8 defined by the affine equation
y^8 + y = x^2(x^8 + x).
This curve is known to have genus 14 and 65 rational points. We set P and Q to be the
places above the points with respective homogeneous coordinates (0 : 1 : 0) and (0 : 0 : 1).
Let P contain all the rational points of X except P and Q. Set G = 22P + 6Q. The code
CΩ (X, P, G) has length 63 and dimension 48. According to the Goppa Bound, its minimum
distance satisfies
d ≥ dGop = 2.
If we set A = 16P, B = 5P + 4Q, and Z = P + 2Q, a calculation gives ℓ(A) = ℓ(A + Z) = 6
and ℓ(B) = ℓ(B + Z) = 1. Then the Lundell–McCullough Bound may be applied and yields
d ≥ dLM = 5.
Next, taking A = 14P + 2Q, B = 8P + 4Q, C = 8P , and Z = 2Q, one can check that the
conditions of Theorem 15.5.14 are satisfied and that
d ≥ dGST = 6.
Actually, Theorem 15.5.10 gives this lower bound d ≥ 6 using A = 14P , B = 8P , and
Z = 6Q.
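The point count quoted for this curve is easy to confirm by brute force: since u^8 = u for every u ∈ F8 and the characteristic is 2, every affine pair (x, y) ∈ F8² satisfies the equation, giving 64 affine points; the place at infinity then brings the total to 65. A small script of our own doing the exhaustive check, with F8 realized as F2[t]/(t³ + t + 1) and elements encoded as 3-bit integers:

```python
def f8_mul(a, b):
    """Carry-less multiplication followed by reduction modulo t^3 + t + 1 (0b1011)."""
    r = 0
    for i in range(3):
        if (b >> i) & 1:
            r ^= a << i
    for i in (4, 3):
        if (r >> i) & 1:
            r ^= 0b1011 << (i - 3)
    return r

def f8_pow(a, e):
    r = 1
    for _ in range(e):
        r = f8_mul(r, a)
    return r

# Count affine F_8-solutions of y^8 + y = x^2 (x^8 + x); addition in char 2 is XOR.
count = sum(1 for x in range(8) for y in range(8)
            if f8_pow(y, 8) ^ y == f8_mul(f8_mul(x, x), f8_pow(x, 8) ^ x))

assert count == 64          # all 64 affine pairs satisfy the equation
assert count + 1 == 65      # adding the place at infinity matches the quoted count
```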
Definition 15.5.16 Given a divisor F on a curve X and a point P, the non-gaps semi-
group of F at P, denoted by ν(F, P), is defined as
ν(F, P) := {j ∈ Z | L(F + (j − 1)P) ≠ L(F + jP)}.
Remark 15.5.26 Actually, Theorem 15.5.25 applies not only to codes from embedded
curves but to duals of CL codes from arbitrary dimensional complete intersections in a
projective space.
In [457, Th. 4.1], it is proved that, for plane curves, Theorem 15.5.25 provides a nontrivial
lower bound even in cases where the Goppa Bound is negative and hence irrelevant.
Example 15.5.27 This example is borrowed from [457, Ex. 4.3]. Consider the finite field
F64 and the curve X with homogeneous equation
Remark 15.6.1 In the sequel, we suppose that for any divisor A on the given curve X,
bases of the spaces L(A) and Ω(A) can be efficiently computed. It is worth noting that the
effective computation of Riemann–Roch spaces is a difficult algorithmic problem of deep
interest but which requires an independent treatment. For references on this topic, we refer
the reader to, for instance, [950, 1210].
Recall from Theorem 15.3.12 that d∗Gop = n − deg(G) denotes the designed distance of C.
By definition, there exists a function f ∈ L(G) such that c = (f(P1), . . . , f(Pn)). Denote
by {i1, . . . , iw} ⊆ {1, . . . , n} the support of e, that is to say, the set of indices corresponding
to the nonzero entries of e, i.e., the positions at which errors occurred.
A decoding algorithm correcting t errors takes as inputs (C, y) and returns either c
(or equivalently e) if wtH (e) ≤ t or “?” if wtH (e) is too large. The first algorithms in the
literature [1065, 1720] rest on the calculation of an error-locating function. To this end, one
introduces an extra divisor F whose support avoids P and whose additional properties are
to be decided later. The point is to compute a nonzero function λ ∈ L(F) “locating the error
positions”, i.e., such that λ(Pi1) = · · · = λ(Piw) = 0. Once such a function is computed, its
zero locus provides a subset of indices J ⊆ {1, . . . , n} such that {i1, . . . , iw} ⊆ J. If this set
J is small enough, then its knowledge allows the decoding to proceed by solving a linear
system as suggested by the following statement. This is a classical result of coding theory:
once the errors are located, decoding reduces to correcting erasures.
Proposition 15.6.2 Let H be a parity check matrix for C and let J ⊆ {1, . . . , n} be a set
which contains the support of e and satisfies |J| < dH(C). Then e is the unique solution of the system
H e^T = H y^T,   e_i = 0 for all i ∈ {1, . . . , n} \ J.   (15.11)
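This erasure-filling step is easy to demonstrate on a toy code. The sketch below is our own (it uses the binary [7, 4, 3] Hamming code, not a code from the chapter): once a small superset J of the error support is known, e is recovered from y by brute force over the few candidates supported on J, and, as the proposition asserts, the solution is unique.

```python
from itertools import product

H = [(1, 0, 1, 0, 1, 0, 1),   # parity check matrix of the [7, 4, 3] Hamming code
     (0, 1, 1, 0, 0, 1, 1),
     (0, 0, 0, 1, 1, 1, 1)]

def syndrome(v):
    return tuple(sum(h[i] * v[i] for i in range(7)) % 2 for h in H)

c = (1, 0, 1, 1, 0, 1, 0)        # a codeword: its syndrome is zero
assert syndrome(c) == (0, 0, 0)

e_true = (0, 0, 0, 0, 1, 0, 0)   # one error, at position 4
y = tuple((u + v) % 2 for u, v in zip(c, e_true))
J = [2, 4]                       # known superset of the error support, |J| = 2 < d = 3

solutions = []
for bits in product((0, 1), repeat=len(J)):
    e = [0] * 7
    for j, bval in zip(J, bits):
        e[j] = bval
    # e is a valid solution iff y - e is a codeword, i.e., has zero syndrome
    if syndrome(tuple((y[i] + e[i]) % 2 for i in range(7))) == (0, 0, 0):
        solutions.append(tuple(e))

assert solutions == [e_true]     # unique solution, as guaranteed by |J| < d_H(C)
```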
Proof: If λ ∈ L(F − De ), then λ vanishes at the error points and the result is a consequence
of (15.12).
Ky = L(F − De ).
Proof: Inclusion ⊇ is given by Lemma 15.6.3. Conversely, if (λ(P1)y1, . . . , λ(Pn)yn) ∈
CL(X, P, G + F), then, since (λ(P1)f(P1), . . . , λ(Pn)f(Pn)) ∈ CL(X, P, G + F), this implies
that u := (λ(P1)e1, . . . , λ(Pn)en) also lies in this code. In addition, wtH(u) ≤ wtH(e) ≤ t.
On the other hand, from Theorem 15.3.12, dH(CL(X, P, G+F)) ≥ n−deg(G+F). Therefore,
wtH(u) is less than the code’s minimum distance, and hence u = (λ(P1)e1, . . . , λ(Pn)en) =
0. Thus, λ vanishes at any position where e does not. Hence λ ∈ L(F − De).
The previous statements provide the necessary material to describe a decoding algorithm
for the code CL (X, P, G). Mainly, the algorithm consists of:
1. computing Ky ;
2. taking a nonzero function λ in Ky ;
3. computing the zeroes of λ among the Pi ’s, which, hopefully, should provide a set
localizing the errors; and
4. finding e by solving a linear system using Proposition 15.6.2.
More precisely the pseudo-code of the complete procedure is given in Algorithm 15.6.5.
Theorem 15.6.6 If t ≤ (d∗Gop − 1)/2 − g/2 and deg(F) = t + g, then Algorithm 15.6.5 is correct
and returns the solution e in O(n^ω) operations in Fq, where ω is the complexity exponent
of linear algebraic operations (in particular ω ≤ 3).
Proof: If deg(F ) ≥ t+g, then deg(F −De ) ≥ g and hence, by the Riemann–Roch Theorem,
`(F − De ) > 0. In particular, from Lemma 15.6.3, Ky 6= {0}. Next, by assumption on t and
deg(F ), we get
2t + g ≤ d∗Gop − 1 =⇒ t ≤ d∗Gop − deg(F ) − 1 (15.14)
which, from Proposition 15.6.4, yields the equality Ky = L(F − De ). Therefore, one can
compute L(F − De ). Take any nonzero function λ in this space. It remains to prove that the
conditions of Proposition 15.6.2 are satisfied, i.e., that the zero locus of λ in {P1 , . . . , Pn }
is not too large. But, from (15.14), λ ∈ L(F ), implying its zero divisor (λ)0 has degree
at most deg(F ). Since 0 ≤ t ≤ d∗Gop − deg(F ) − 1, we immediately see that deg((λ)0 ) ≤
deg(F ) < d∗Gop which proves that the solution of the system (15.11) will provide e as the
unique possible solution.
About the complexity, the computation of Ky and the solution of the system (15.11)
are nothing but the solution of linear systems with O(n) equations and O(n) unknowns.
The other operations in the algorithm are negligible.
336 Concise Encyclopedia of Coding Theory
• The modified algorithm [1720] and the extended modified algorithm [654] combine the basic algorithm with an iterative search for a relevant choice of the extra divisor F. These algorithms permit a reduction of the gap g/2 to about g/4; see [971, Rem. 4.11 & 4.15].
• In [1499, 1856] the question of the existence of an extra divisor F for which the basic
algorithm corrects up to half the designed distance is discussed. It is, in particular,
proved that such an F exists and can be chosen in a set of O(n) divisors as soon as
q ≥ 37.
• Finally, an iterative approach to finding an F achieving half the designed distance
was proposed by Ehrhard [665].
All the above contributions are summarized with further details in [971, § 4 to 7].
Definition 15.6.12 An array of codes for a code C is a triple of sequences of codes
(Ai )0≤i≤u , (Bj )0≤j≤v , (Cr )w≤r≤n such that
(A1) C = Cw ;
(A2) for all i, j, r, dim Ai = i, dim Bj = j, and dim Cr = n − r;
(A3) the sequences (Ai )i and (Bj )j are increasing, while (Cr )r is decreasing; and
(A4) for all i, j, we define r̂(i, j) to be the least integer r such that Ai ⋆ Bj ⊆ Cr⊥. For any i, j ≥ 1, if a ∈ Ai \ Ai−1 and b ∈ Bj \ Bj−1 and r = r̂(i, j) > w, then a ⋆ b ∈ Cr⊥ \ Cr−1⊥.
Note that the function r̂(i, j) is increasing in the variables i and j but not necessarily strictly increasing. This motivates the following definition of a well-behaving pair, a terminology borrowed from that of well-behaving sequences [799].

dH(Cr) ≥ min { n̂r′ | r ≤ r′ ≤ n − 1 }.
Remark 15.6.15 Of course, Theorem 15.6.14 is almost the same as Theorem 15.5.20. The slight difference is in the quantity n̂r, which is very close to the nr introduced in (15.10) from Section 15.5.2. The difference lies in the fact that nr is defined at the level of function algebras while n̂r is defined at the level of codes. On function algebras over curves, valuations guarantee the strict growth of the function r(i, j) that could be defined in this context. When working at the level of codes, we must restrict to a subset of pairs (the well-behaving ones) to keep this strictly increasing behavior.
This requirement of strict increase is necessary to prove Theorem 15.6.14. The bound on the minimum distance is obtained by relating the Hamming weight of some codeword to the rank of a given matrix. The well-behaving pairs provide the positions of some pivots in this matrix; thus, their number gives a lower bound for the rank. Of course, two pivots cannot lie on the same row or column, hence the requirement of strict increase. Note that in [1501, Th. 4.2], the strict increase requirement is lacking, which makes the proof not completely correct. Replacing general pairs by well-behaving ones, as is done in [462], fixes the proof.
Definition 15.6.16 An array of codes (Ai)1≤i≤u, (Bj)1≤j≤v, (Cr)w≤r≤n for a code C is said to be a t-error-correcting array for C if

(i) Cw = C and

(ii) t ≤ (dH(Cw) − 1)/2.
Theorem 15.6.17 (See [1501, Th. 4.4]) If a code C has a t-error-correcting array, then
it has a decoding algorithm correcting up to t errors in O(nω ) operations in Fq .
yw+1 = cw+1 + e,

where cw+1 ∈ Cw+1, noting that the error is unchanged. Applying this process iteratively until cn = 0 yields the error.
Remark 15.6.18 Actually, as noticed in [1501], the process may be stopped before reaching cn. Indeed, after r ≥ g iterations, one obtains a new decoding problem yr = cr + e where Cr benefits from a t-error-correcting pair; at this step one can switch to the error-correcting pair algorithm, which corrects up to (dH(Cr) − 1)/2 − g/2 ≥ (dH(Cw) − 1)/2 errors, and get e.
t = (dGop − 1)/2.
Remark 15.6.20 Of course, a similar result holds for evaluation codes by replacing G by
K − G + DP .
Remark 15.6.21 A simple choice of an error-correcting array for CΩ (X, P, G) can be ob-
tained by choosing a rational point P of X and considering the following sequences to
construct the Ai ’s, the Bj ’s, and the Cr ’s:
where (µi )i and (νj )j , respectively, denote the 0- and G-non-gap sequences at P ; see Defi-
nition 15.5.16.
Q = Q0 + Q1Y + · · · + QℓY^ℓ

satisfying:

(i) for any j ∈ {0, . . . , ℓ}, Qj ∈ L(F + (ℓ − j)G), and

(ii) for any i ∈ {1, . . . , n}, the function Q vanishes at (Pi, yi) with multiplicity at least s.
Root finding: Compute the roots f1, . . . , fm (m ≤ ℓ) of Q(Y) lying in Fq(X) and output the list of codewords of the form (fi(P1), . . . , fi(Pn)) that are at distance at most t from y.
Remark 15.6.24 In [876, § 6.3.4], a polynomial time algorithm to perform the root finding
step is presented. This algorithm consists in a reduction of Q modulo some place of Fq (X)
of large enough degree, then factoring the reduced polynomial using the Berlekamp or the
Cantor–Zassenhaus Algorithm, followed by a “lifting” of the roots in Fq (Y ), under the
assumption that they lie in a given Riemann–Roch space.
Theorem 15.6.25 (See, e.g., [148, Lem. 2.3.2 & 2.3.3]) Let

t ≤ n − n(s + 1)/(2(ℓ + 1)) − ℓ deg(G)/(2s) − g/s.

Then, for any extra divisor F satisfying

ns(s + 1)/(2(ℓ + 1)) + ℓ deg(G)/2 + g − 1 < deg(F + ℓG) < s(n − t),

the Guruswami–Sudan Algorithm 15.6.22 succeeds in returning in polynomial time the full list of codewords in CL(X, P, G) at distance less than or equal to t from y.
Remark 15.6.26 The case with no multiplicity, i.e., the case s = 1, was considered before Guruswami and Sudan by Sudan for Reed–Solomon codes [1762] (see Algorithm 24.3.1) and by Shokrollahi and Wasserman [1693] for algebraic geometry codes.
Remark 15.6.27 When ℓ = s = 1, the algorithm should yield the basic algorithm. However, Theorem 15.6.25 yields a decoding radius of ⌊(d∗Gop − 1)/2⌋ − g: a defect of g instead of g/2. Further analysis shows that, whenever g > 0, the linear system whose solution is the polynomial Q never has full rank. On the other hand, the analysis of the decoding radius is done without taking this rank defect into account, leading to a slightly pessimistic estimate. A further analysis of the rank of the system in the general case might provide a slight improvement of the Guruswami–Sudan decoding radius.
Problem 15.7.1 Let C ⊆ Fnq be a code, t ≤ n be a positive integer, and y ∈ Fnq . Decide
whether or not there exists c ∈ C whose Hamming distance to y is less than or equal to t.
Note that NP-completeness only asserts hardness in the worst case. However, this problem is commonly believed by the community to be “difficult on average”; by this, we mean that most instances of the problem seem difficult.
This result motivated McEliece to propose a code-based encryption scheme² whose security relies on the hardness of this problem [1361]. Roughly speaking, the McEliece scheme can be described as follows.
More formally, let Fn,k be a family of codes of fixed length n and dimension k. Consider
a set S of secrets together with a surjective map C : S → Fn,k such that for any s ∈ S, the
code C(s) has a decoding algorithm Dec(s) depending on s that corrects up to t errors. See
Figure 15.2.
ycipher := mG + e.
FIGURE 15.2: McEliece scheme.
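To make the scheme concrete, here is a toy instantiation in which the binary [7, 4] Hamming code plays the role of the secret decodable code (a real deployment uses far larger codes such as classical Goppa codes; all helper names below are ours, not from the chapter):

```python
import random

# Toy McEliece with the [7,4] Hamming code (corrects t = 1 error).
G = [[1,0,0,0,0,1,1],
     [0,1,0,0,1,0,1],
     [0,0,1,0,1,1,0],
     [0,0,0,1,1,1,1]]
H = [[0,1,1,1,1,0,0],      # parity-check matrix: H * G^T = 0 over GF(2)
     [1,0,1,1,0,1,0],
     [1,1,0,1,0,0,1]]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) % 2
             for col in zip(*B)] for row in A]

def inv2(M):
    """Inverse of a binary matrix by Gauss-Jordan (StopIteration if singular)."""
    n = len(M)
    A = [row[:] + [int(i == j) for j in range(n)] for i, row in enumerate(M)]
    for c in range(n):
        piv = next(i for i in range(c, n) if A[i][c])
        A[c], A[piv] = A[piv], A[c]
        for i in range(n):
            if i != c and A[i][c]:
                A[i] = [(x + y) % 2 for x, y in zip(A[i], A[c])]
    return [row[n:] for row in A]

def keygen(rng):
    while True:                           # draw scramblers until S is invertible
        S = [[rng.randint(0, 1) for _ in range(4)] for _ in range(4)]
        try:
            Sinv = inv2(S)
            break
        except StopIteration:
            continue
    perm = list(range(7))
    rng.shuffle(perm)
    Gpub = [[row[perm[j]] for j in range(7)] for row in matmul(S, G)]
    return Gpub, (Sinv, perm)

def encrypt(Gpub, m, err_pos):
    # real scheme: e is a random weight-t error; here t = 1 at err_pos
    y = [sum(m[i] * Gpub[i][j] for i in range(4)) % 2 for j in range(7)]
    y[err_pos] ^= 1
    return y

def decrypt(sk, y):
    Sinv, perm = sk
    z = [0] * 7
    for j in range(7):                    # undo the column permutation
        z[perm[j]] = y[j]
    s = [sum(H[r][j] * z[j] for j in range(7)) % 2 for r in range(3)]
    if any(s):                            # syndrome = column of H at error position
        pos = next(j for j in range(7) if [H[r][j] for r in range(3)] == s)
        z[pos] ^= 1
    mS = z[:4]                            # G is systematic in its first 4 columns
    return [sum(mS[i] * Sinv[i][j] for i in range(4)) % 2 for j in range(4)]
```

The public key Gpub = S·G·P reveals (one hopes) nothing exploitable about the secret decoder, while the legitimate receiver undoes P, syndrome-decodes, and undoes S.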
of pairs (x, y) where x is an ordered n-tuple of distinct elements of Fq and y an ordered n-tuple of nonzero elements of Fq. Then the map C is C : (x, y) ↦ GRSk(x, y).
where P is the sequence of points ((x1 : 1), . . . , (xn : 1)) and G = (f )0 + P∞ . Therefore,
classical Goppa codes are subfield subcodes of AG codes from P1 .
³A cryptosystem is claimed to have λ bits of security if the best known attack would cost more than 2^λ
15.7.5 Security
For any instantiation of the McEliece scheme, one should distinguish two kinds of attacks:
1. Message recovery attacks consist in trying to recover the plain text from the
data of the cipher text. Such an attack rests on generic decoding algorithms, such as
Prange’s Information Set Decoding [1540] and its improvements [146, 336, 646, 1212,
1356, 1357, 1752].
2. Key recovery attacks consist in recovering the secret key from the data of the
public key and rest on ad hoc methods depending on the family of codes composing
the set of public keys.
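To make the first kind of attack concrete, here is a bare-bones sketch of Prange's plain information set decoding over F2 (our own naming; real attacks use the refinements cited above): repeatedly pick k random coordinates, hope they carry no errors, re-encode, and keep the candidate codeword whose distance to y is at most t.

```python
import random

def solve2(A, b):
    """Solve the square system A x = b over GF(2); None if A is singular."""
    n = len(A)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for c in range(n):
        piv = next((i for i in range(c, n) if M[i][c]), None)
        if piv is None:
            return None
        M[c], M[piv] = M[piv], M[c]
        for i in range(n):
            if i != c and M[i][c]:
                M[i] = [(u + v) % 2 for u, v in zip(M[i], M[c])]
    return [M[i][n] for i in range(n)]

def prange_isd(G, y, t, rng, tries=500):
    """Prange's information set decoding: G is a k x n generator matrix."""
    k, n = len(G), len(G[0])
    for _ in range(tries):
        info = rng.sample(range(n), k)          # guessed error-free information set
        A = [[G[i][j] for i in range(k)] for j in info]
        m = solve2(A, [y[j] for j in info])
        if m is None:
            continue                            # not an information set; retry
        c = [sum(m[i] * G[i][j] for i in range(k)) % 2 for j in range(n)]
        if sum(ci != yi for ci, yi in zip(c, y)) <= t:
            return m, c                         # guess was (effectively) correct
    return None
```

Each iteration succeeds with probability roughly C(n−t, k)/C(n, k), which is why the improvements [146, 336, 646, 1212, 1356, 1357, 1752] matter so much at cryptographic sizes.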
We conclude this section with a discussion of the security of Janwa and Moreno’s three
proposals with respect to key-recovery attacks. It is explained in particular that proposals
(JM1) and (JM2) are not secure.
and

dH(C⋆(t+1)) ≤ dH(C⋆t).

(b) ([1564, Cor. 2.33]) If for some r we have dim(C⋆(r+1)) = dim(C⋆r), then also dim(C⋆(r+i)) = dim(C⋆r) for all i ≥ 0.
The smallest r = r(C) such that dim(C⋆(r+1)) = dim(C⋆r) is called the regularity of C [1564, Def. 1.5]. Thus, dim(C⋆t) strictly increases for t < r and then stabilizes.
If C is an [n, k] code and C′ is an [n, k′] code, we have a short exact sequence [1564, § 1.10]

0 −→ I(C, C′) −→ C ⊗ C′ −π∆→ C ⋆ C′ −→ 0   (15.16)

where C ⊗ C′ is the usual [n², kk′] (tensor) product code that identifies with the space of n × n matrices all of whose columns are in C and all of whose rows are in C′, I(C, C′) is the subspace of such matrices that are zero on the diagonal, and π∆ is the projection onto the diagonal.
Equivalently, let p1, . . . , pn (respectively, p′1, . . . , p′n) be the columns of a full rank generator matrix of C (respectively, of C′), and consider the evaluation map

Bilin(Fq^k × Fq^k′ → Fq) −→ Fq^n,  b ↦ (b(p1, p′1), . . . , b(pn, p′n)),

where Bilin(Fq^k × Fq^k′ → Fq) denotes the space of Fq-bilinear forms on Fq^k × Fq^k′. Then C ⋆ C′ is the image of this evaluation map, and I(C, C′) is its kernel.
Similar to (15.16), for any t ≥ 2, we have a natural short exact sequence [1332, Sec. 3]

0 −→ It(C) −→ S^t C −→ C⋆t −→ 0   (15.17)

where S^t C is the t-th symmetric power of C; this sequence actually defines It(C) as the kernel of the natural map S^t C → C⋆t.
Equivalently and more concretely, let p1, . . . , pn be the columns of a generator matrix of C. Denote by Fq[x1, . . . , xk]t the space of degree t homogeneous polynomials in k variables and consider the evaluation map

Fq[x1, . . . , xk]t −→ Fq^n,  f ↦ (f(p1), . . . , f(pn)).

Then C⋆t is the image of this evaluation map, and It(C) is its kernel, the space of degree t homogeneous forms vanishing at p1, . . . , pn.
When C is an AG code, Márquez-Corbella, Martínez-Moro, and Pellikaan showed how I2(C) allows one to retrieve the underlying curve of an AG code:
Proposition 15.8.2 ([1332, Prop. 8 & Th. 12]) Let C = CL(X, P, G) be an evaluation code, and assume that g = g(X), n = |P|, and m = deg(G) satisfy

2g + 2 ≤ m < n/2.

Then I2(C) is a set of quadratic equations defining X embedded in P^(k−1).
and one expects that if C and C′ are independent random codes, then, with high probability, this inequality becomes an equality: dim(C ⋆ C′) = min {n, kk′}. We refer to [1563, Th. 16–18] for precise statements and proofs. By contrast, we observe that for AG codes C = CL(X, P, G) and C′ = CL(X, P, G′), with the same evaluation point sequence P, we
can get much stronger upper bounds. For instance, if 2g − 2 < deg(G + G′) < n, then Theorem 15.3.12 together with (15.15) gives dim(C ⋆ C′) ≤ k + k′ − 1 + g.
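This contrast between random and evaluation codes is easy to observe computationally. A minimal sketch over a prime field (helper names are ours): for Reed–Solomon codes with a common evaluation sequence, products of basis monomials x^i · x^j again have the form x^(i+j), so dim(C ⋆ C′) = k + k′ − 1, far below min{n, kk′}.

```python
p = 11

def rank_mod(rows, p):
    """Rank of a list of vectors over GF(p) (Gaussian elimination)."""
    rows = [r[:] for r in rows]
    rank, col, ncols = 0, 0, len(rows[0])
    while rank < len(rows) and col < ncols:
        piv = next((i for i in range(rank, len(rows)) if rows[i][col] % p), None)
        if piv is None:
            col += 1
            continue
        rows[rank], rows[piv] = rows[piv], rows[rank]
        inv = pow(rows[rank][col], p - 2, p)
        rows[rank] = [x * inv % p for x in rows[rank]]
        for i in range(len(rows)):
            if i != rank and rows[i][col] % p:
                f = rows[i][col]
                rows[i] = [(x - f * y) % p for x, y in zip(rows[i], rows[rank])]
        rank += 1
        col += 1
    return rank

def star_dim(B1, B2, p):
    """dim(C1 * C2): rank of the coordinatewise products of basis vectors."""
    prods = [[a * b % p for a, b in zip(u, v)] for u in B1 for v in B2]
    return rank_mod(prods, p)

xs = list(range(p))                       # n = 11 common evaluation points
def rs_basis(k):
    return [[pow(x, i, p) for x in xs] for i in range(k)]

# Reed-Solomon: dim(RS_3 * RS_4) = 3 + 4 - 1 = 6, well below min(n, k*k') = 11.
print(star_dim(rs_basis(3), rs_basis(4), p))   # -> 6
```

Replacing the two bases by random matrices over GF(11) typically yields the generic value min{n, kk′} = 11 instead.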
Likewise from (15.17) it follows that

dim(C⋆t) ≤ min { n, (k + t − 1 choose t) },

Remark 15.8.3 ([1563, Prop. 19]) When t > q, we always have the strict inequality dim(C⋆t) < (k + t − 1 choose t) because of the extra relations c⋆q ⋆ c′ = c ⋆ c′⋆q.
Concerning lower bounds, from now on we will make the simplifying assumption that
all our codes have full support, i.e., any generator matrix has no zero column.
It is a subalgebra of Fnq and admits a basis made of the characteristic vectors of the supports
of the indecomposable components of C ([1129, Th. 1.2] and [1564, Prop. 2.11]). In particular,
dim(Stab(C)) is equal to the number of indecomposable components of C.
Mirandola and Zémor [1390] then established a coding analog of Kneser’s Theorem from
additive combinatorics [1790, Th. 5.5] in which Stab(C) plays the same role as the stabilizer
of a subgroup.
Pursuing the analogy with additive combinatorics, they also obtained the following characterization of the cases of equality in Proposition 15.8.4, which one might see as an analog of Vosper's Theorem [1790, Th. 5.9].
Theorem 15.8.6 ([1390, Th. 23]) Assume both C and C′ are MDS, with k, k′ ≥ 2, k + k′ ≤ n − 1, and

dim(C ⋆ C′) = k + k′ − 1.

Then C and C′ are (generalized) Reed–Solomon codes with a common evaluation point sequence.
Remark 15.8.7 The symmetric version (C = C′) of Theorem 15.8.6 can actually be regarded as a direct consequence of Castelnuovo's Lemma [61, § III.2, p. 120], asserting that for n ≥ 2k + 1, any n points in general position in P^(k−1) imposing 2k − 1 independent conditions on quadrics lie on a Veronese embedding of P¹.
Theorem 15.8.8 ([1561]) Over any finite field Fq, there exist asymptotically good codes C whose squares C⋆2 are also asymptotically good.

In particular, for q = 2 and for any 0 < δ < 0.003536 and R ≤ 0.001872 − 0.5294δ, there exist codes C of asymptotic rate at least R whose squares C⋆2 have asymptotic relative minimum distance at least δ.
In finite length, other constructions have been studied by Cascudo and co-authors that
give codes such that both C and C ?2 have good parameters. This includes cyclic codes [357]
and matrix-product codes [362].
Concerning upper bounds, the following result is sometimes called the Product Singleton Bound.
Theorem 15.8.9 ([1562]) Let t ≥ 2 be an integer and C1, . . . , Ct ⊆ Fq^n linear codes with full support. Then there exist codewords c1 ∈ C1, . . . , ct ∈ Ct whose product has Hamming weight

0 < wtH(c1 ⋆ · · · ⋆ ct) ≤ max {t − 1, n − (k1 + · · · + kt) + t},

where ki = dim(Ci). In particular we have

dH(C1 ⋆ · · · ⋆ Ct) ≤ max {t − 1, n − (k1 + · · · + kt) + t}.
Mirandola and Zémor describe cases of equality for t = 2. Essentially these are either
pairs of (generalized) Reed–Solomon codes with a common evaluation point sequence, or
pairs made of a code and its dual (up to diagonal equivalence). See [1390, Sec. V] for further
details.
15.8.1.4 Automorphisms
Let C be a linear code of regularity r(C), and let t ≤ t′ be two integers. Assume one of the following two conditions holds:

(i) t | t′ or

(ii) t′ ≥ r(C).

Then, from [1564, § 2.50–2.55], it follows that |Aut(C⋆t)| divides |Aut(C⋆t′)|.

This makes one wonder whether one could compare Aut(C⋆t) and Aut(C⋆t′) for arbitrary t ≤ t′. For instance, do we always have |Aut(C⋆2)| ≤ |Aut(C⋆3)|?
Motivated by Proposition 15.3.16, Couvreur and Ritzenthaler tested this question
against AG codes and eventually showed that the answer is negative.
Example 15.8.10 ([466]) Let E be the elliptic curve defined by the Weierstrass equation y² = x³ + 1 over F7 with point at infinity OE. Set P = E(F7), G = 3OE, and consider the evaluation code C = CL(E, E(F7), 3OE). So C has generator matrix

[ 0 0 1 1 2 2 3 4 4 5 6 0 ]
[ 1 6 3 4 3 4 0 3 4 0 0 1 ]
[ 1 1 1 1 1 1 1 1 1 1 1 0 ].

Computer aided calculations show that C⋆2 = CL(E, E(F7), 6OE) has

|Aut(C⋆2)| = 432,  |Aut(C⋆3)| = 108.

So in particular

|Aut(C⋆2)| > |Aut(C⋆3)|.
c1, . . . , cs ≠ 0  =⇒  c1 ⋆ · · · ⋆ cs ≠ 0.
Some elementary combinatorial constructions of frameproof codes can be found in [206, 419].
In [1915] Xing considers asymptotic bounds, and in particular, constructions from AG codes. The starting observation of his main result is that a sufficient condition for an evaluation code CL(X, P, G) to be s-frameproof is
`(sG − DP ) = 0, (15.18)
where DP is defined in (15.3). Condition (15.18) is perhaps the simplest instance of what
was later called a Riemann–Roch equation [361].
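On P¹ with G = (k − 1)P∞ and deg(DP) = n, condition (15.18) simply reads s(k − 1) < n: a product of s nonzero codewords of an [n, k] Reed–Solomon code then has at most s(k − 1) < n zeros, so it cannot vanish identically. A brute-force check of this on a toy example (our own code, not from the chapter):

```python
from itertools import product

p = 7
xs = list(range(p))                 # n = 7 evaluation points on the affine line
k, s = 2, 3                         # s(k - 1) = 3 < 7 = n, so 3-frameproof

def poly_eval(coeffs, x):
    y = 0
    for c in reversed(coeffs):
        y = (y * x + c) % p
    return y

codewords = [[poly_eval(f, x) for x in xs] for f in product(range(p), repeat=k)]
nonzero = [c for c in codewords if any(c)]

# The displayed condition: any s nonzero codewords have a coordinatewise
# product that is itself nonzero.
ok = all(any(a * b * c % p for a, b, c in zip(u, v, w))
         for u in nonzero for v in nonzero for w in nonzero)
print(ok)  # -> True
```

When s(k − 1) ≥ n the condition can genuinely fail: three cubics over F7 whose zero sets cover all seven points have identically zero product.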
Xing uses a counting argument in the divisor class group of the curve to prove the
existence of solutions to (15.18) with deg(sG − DP ) ≈ (1 − 2 logq s)g. This leads to the
following result.
Theorem 15.8.11 ([1915]) For 2 ≤ s ≤ A(q), there exist s-frameproof codes over Fq of length going to infinity and asymptotic rate at least

1/s − 1/A(q) + (1 − 2 log_q s)/(sA(q)).
In this bound, the 2 logq s term reflects the possible s-torsion in the class group that hinders
the counting argument.
For s = 2, an alternative method is proposed in [1560] that allows one to construct
solutions to (15.18) up to deg(2G − DP ) ≈ g − 1, which is best possible. This gives:
Theorem 15.8.12 ([1560]) If A(q) ≥ 4, one can construct 2-frameproof codes over Fq of length going to infinity and asymptotic rate at least

1/2 − 1/(2A(q)).
This result has a somewhat unexpected application. In [683], Erdős and Füredi studied a certain problem in combinatorial geometry. It led them to introduce certain configurations, which admit the following equivalent descriptions:
• sets of M vertices of the unit cube {0, 1}^n in the Euclidean space R^n, any three x, y, z of which form an acute angle: ⟨x − z, y − z⟩Eucl > 0;
• sets of M points in the binary Hamming space Fn2 , any three x, y, z of which satisfy
the strict triangle inequality: dH (x, y) < dH (x, z) + dH (y, z);
• sets of M binary vectors of length n, any three x, y, z of which have a position i where xi = yi ≠ zi;
• sets of M subsets in an n-set, no three A, B, C of which satisfy A ∩ B ⊆ C ⊆ A ∪ B.
In the context of coding theory, such a configuration is also called a binary (2, 1)-
separating system of length n and size M . A random coding argument shows that there
exist such separating systems of length n → ∞ and size
M ≈ (2/√3)^n ≈ 1.1547005^n.
However, combining Theorem 15.8.12 with a convenient concatenation argument provides
a dramatic(?) improvement.
Theorem 15.8.13 ([1560]) One can construct binary (2, 1)-separating systems of length
n → ∞ and size
M ≈ (11^(3/50))^n ≈ 1.1547382^n.
This is an instance of a construction based on AG codes beating a random coding argument over the binary field!
Definition 15.8.16 The bilinear complexity µq(k) (respectively, the symmetric bilinear complexity µq^sym(k)) of Fq^k over Fq is the smallest possible length of a bilinear multiplication algorithm (respectively, a symmetric bilinear multiplication algorithm) for Fq^k over Fq.

Obviously we always have µq(k) ≤ µq^sym(k) ≤ k². For k ≤ 2q + 1 this is easily improved to µq(k) ≤ µq^sym(k) ≤ 2k − 1, using the Fourier transform, or equivalently, Reed–Solomon codes.
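The smallest interesting case, k = 2, already shows the idea: µq(2) ≤ 3, since multiplying a0 + a1w and b0 + b1w in Fq² = Fq[w]/(w² − d) needs only the three base-field products a0b0, a1b1, and (a0 + a1)(b0 + b1), i.e., evaluation at the points 0, ∞, 1 followed by interpolation (this is Karatsuba's trick). A quick check over F7 (the modulus w² − 3 is our choice of irreducible):

```python
p, d = 7, 3                  # F_49 = F_7[w]/(w^2 - 3); 3 is a non-square mod 7

def mul_3_bilinear(a, b):
    """Multiply a0 + a1*w and b0 + b1*w using only 3 base-field products."""
    m0 = a[0] * b[0] % p                        # evaluation at 0
    m_inf = a[1] * b[1] % p                     # evaluation at infinity
    m1 = (a[0] + a[1]) * (b[0] + b[1]) % p      # evaluation at 1
    c0 = (m0 + d * m_inf) % p                   # interpolate, reduce w^2 -> d
    c1 = (m1 - m0 - m_inf) % p
    return [c0, c1]

def mul_naive(a, b):
    """Schoolbook multiplication: 4 base-field products."""
    return [(a[0] * b[0] + d * a[1] * b[1]) % p,
            (a[0] * b[1] + a[1] * b[0]) % p]
```

The three products m0, m1, m_inf are symmetric in a and b, so this also witnesses µq^sym(2) ≤ 3; Chudnovsky–Chudnovsky's insight was to replace the projective line by curves with many points so that the same evaluation–interpolation idea scales linearly in k.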
In [408], Chudnovsky and Chudnovsky highlighted further links between multiplication
algorithms and codes. Then, using a construction similar to AG codes, they were able to
prove the first linear asymptotic upper bound on µq (k).
Theorem 15.8.17 ([408, Th. 7.7]) If q ≥ 25 is a square, then

lim sup_{k→∞} (1/k) µq(k) ≤ 2 (1 + 1/(√q − 3)).

This result was originally stated for bilinear complexity, but the proof also works for symmetric bilinear complexity, since it provides symmetric algorithms. So we actually get

lim sup_{k→∞} (1/k) µq^sym(k) ≤ 2 (1 + 1/(√q − 3))

for q ≥ 25 a square.
Chudnovsky and Chudnovsky proceeded by evaluation-interpolation on curves. Suppose
we are given a curve X over Fq , together with a collection P = {P1 , . . . , Pn } of n distinct
rational points, a point Q of degree k, and a suitably chosen auxiliary divisor G. Also
assume:
Step 3: By Lagrange interpolation, find a function h in L(2G) that takes these values
h(P1 ) = y1 , . . . , h(Pn ) = yn , and then evaluate h at Q.
Then (Eval2) ensures that we necessarily have h = fx fx′; so the last evaluation in Step 3 gives h(Q) = xx′ as desired.
In [1697], Shparlinski, Tsfasman, and Vlăduţ propose several improvements. First, they correct certain imprecise statements in [408] or give additional details to the proofs. For instance, regarding the choice of the curves to which the method is applied in order to get Theorem 15.8.17, they provide an explicit description of a family of Shimura curves Xi over Fq, for q a square, that satisfy
• (optimality) |Xi(Fq)|/g(Xi) → A(q) = √q − 1 and

• (density) g(Xi+1)/g(Xi) → 1.
They further show how to deduce linearity of the complexity over an arbitrary finite
field.
Lemma 15.8.18 ([1697, Cor. 1.3]) Set Mq := lim sup_{k→∞} (1/k) µq(k). Then, for any prime power q and integer m, we have

Mq ≤ µq(m) M_{q^m}.
Mq < +∞

for all q. As above, this result was originally stated only for bilinear complexity, but the proof also works for symmetric bilinear complexity; so we likewise get

Mq^sym := lim sup_{k→∞} (1/k) µq^sym(k) < +∞.
Finally, they devise a possible improvement on Theorem 15.8.17. For this they observe
that for given P and Q, the choice of the auxiliary divisor G satisfying (Eval1) and (Eval2)
above reduces to the solution of the following system of two Riemann–Roch equations:
ℓ(KX − G + Q) = 0,
ℓ(2G − DP) = 0.   (15.19)
In order to get the best possible parameters, one would like to set P = X(Fq ), and find
a solution with deg(KX − G + Q) and deg(2G − DP ) close to g − 1. The authors propose
a method to achieve this, but it was later observed that their argument is incomplete. A
corrected method was then proposed in [1559], relying on the tools introduced in the proof
of Theorem 15.8.12. Combining all these ideas then gives:
Theorem 15.8.19 ([1559, Th. 6.4]) If q ≥ 49 is a square, then

Mq^sym ≤ 2 (1 + 1/(√q − 2)).
In this same work, it is also observed that if one does not insist on having symmetric
algorithms, then (15.19) can be replaced with an asymmetric variant: for given P and Q,
find two divisors G and G′ satisfying

ℓ(KX − G + Q) = 0,
ℓ(KX − G′ + Q) = 0,
ℓ(G + G′ − DP) = 0.
The study of bilinear multiplication algorithms over finite fields is a very active area of
research and we covered only one specific aspect. Other research directions include:
• Give asymptotic bounds for non-square q, or for very small q = 2, 3, 4, . . ..
• Instead of asymptotic bounds valid for k → ∞, give uniform bounds valid for all k.
• Study multiplication algorithms in more general finite dimensional algebras, not only
extension fields.
• Give effective constructions of multiplication algorithms.
For a more exhaustive survey of recent results, see [108].
The importance of multiplicative linear secret sharing schemes perhaps comes from a result in [468] showing that these schemes can serve as a basis for secure multiparty computation protocols. In [1026] it is also shown that certain two-party protocols, for instance a zero-knowledge proof, admit communication-efficient implementations in which one player, for instance the verifier, has to simulate “in her head” a multiparty computation with a large number of players. This last result makes it very desirable to overcome the limitation on the number of players in Shamir's scheme.
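Shamir's scheme over Fq is exactly the Reed–Solomon instance of this picture, and its multiplicativity is easy to demonstrate: a degree-t sharing of two secrets can be multiplied share by share, and the product secret is recovered from any 2t + 1 shares by Lagrange interpolation at 0. A toy sketch (our own code, not the chapter's general construction):

```python
import random

p = 101

def share(secret, t, xs, rng):
    """Degree-t Shamir sharing of `secret` at distinct nonzero points xs."""
    coeffs = [secret] + [rng.randrange(p) for _ in range(t)]
    return [sum(c * pow(x, i, p) for i, c in enumerate(coeffs)) % p for x in xs]

def reconstruct(xs, shares):
    """Lagrange interpolation at 0 over GF(p)."""
    total = 0
    for j, (xj, yj) in enumerate(zip(xs, shares)):
        num = den = 1
        for m, xm in enumerate(xs):
            if m != j:
                num = num * (-xm) % p
                den = den * (xj - xm) % p
        total = (total + yj * num * pow(den, p - 2, p)) % p
    return total

rng = random.Random(7)
t, n = 2, 7
xs = list(range(1, n + 1))
sa = share(33, t, xs, rng)
sb = share(15, t, xs, rng)
prod = [a * b % p for a, b in zip(sa, sb)]      # share-by-share product
# the product polynomial has degree 2t, so 2t + 1 = 5 shares suffice:
print(reconstruct(xs[:5], prod[:5]))            # -> 91 (= 33 * 15 mod 101)
```

The limitation the text refers to is visible here: one needs n ≥ 2t + 1 distinct nonzero evaluation points, hence n < q, which is precisely what evaluation on curves with many rational points removes.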
In the same way that AG codes provide a generalization of RS codes of arbitrary length,
one can construct linear secret sharing schemes with an arbitrary number of players by
using evaluation of functions on an algebraic curve. Moreover, under certain conditions,
these schemes also admit good multiplicative properties. This was first studied by Chen
and Cramer in [394] and then refined and generalized in several works such as [358] and
[360]. We follow the presentation of the latter.
Given a finite field Fq and integers k, n ≥ 1, we equip the vector space Fq^k × Fq^n with the linear projection maps
Definition 15.8.21 An (n, t, d, r)-arithmetic secret sharing scheme for Fq^k over Fq is a linear subspace C ⊆ Fq^k × Fq^n with the following properties:

(i) (t-disconnectedness) for any subset B ⊆ {1, . . . , n} of cardinality |B| = t, the projection map

π0,B : C −→ Fq^k × πB(C),  v ↦ (v0, vB),

is surjective;

(ii) (d-th power r-reconstruction) for any subset B ⊆ {1, . . . , n} of cardinality |B| = r, we have

(ker πB) ∩ C⋆d ⊆ (ker π0) ∩ C⋆d.

Moreover, we say that C has uniformity if in (i) we have πB(C) = Fq^|B| for all B with |B| = t.
Then C = CL (X, S, G) is an (n, t, d, r)-arithmetic secret sharing scheme for Fkq with unifor-
mity.
In [360] a method is developed to solve (15.20) using some control on the d-torsion of the class group of the curve. It is not known whether this method is optimal, but it gives arithmetic secret sharing schemes with the best parameters to date.⁵ It leads to:
Theorem 15.8.23 ([360]) For all prime powers q except perhaps for q = 2, 3, 4, 5, 7, 11, 13,
there is an infinite family of (n, t, 2, n − t)-arithmetic secret sharing schemes for Fkq over Fq
with uniformity, where n is unbounded, k = Ω(n), and t = Ω(n).
It should be noted that, if one is ready to drop the uniformity condition, then the corresponding result holds for all q. This can be proved using a concatenation argument [358].
The literature on arithmetic secret sharing is rapidly evolving, and we cannot cover
all recent developments. For further references on the topic, together with details on the
connection with multiparty computation, we recommend the book [469].
with (15.20), even in the case d = 2, because the number of equations in the system is too high.
Definition 15.9.3 A code C ⊆ Fnq is locally recoverable with locality ` if, for any
i ∈ {1, . . . , n}, there exists at least one subset A(i) ⊆ {1, . . . , n} containing i such that
|A(i)| ≤ ` + 1 and C|A(i) has minimum distance ≥ 2.
Definition 15.9.4 In the context of Definition 15.9.3, a subset A(i) is called a recovery
set of C for i.
Remark 15.9.5 Note that the sets A(1), . . . , A(n) need not be distinct. In addition, we emphasize that, for a given position i ∈ {1, . . . , n}, there might exist more than one recovery set for i; this is actually the point of the notion of availability discussed further in Section 15.9.6.
On the other hand, in most of the examples presented in Sections 15.9.3, 15.9.4,
and 15.9.5, we consider a partition A1 t· · ·tAs of {1, . . . , n} such that for any i ∈ {1, . . . , n},
the unique recovery set for i is the unique subset Aj containing i.
Remark 15.9.6 One can prove that Definition 15.9.3 is equivalent to the following one.
For any i ∈ {1, . . . , n}, there exists at least one codeword c(i) ∈ C ⊥ of weight less than or
equal to ` + 1 and whose ith entry is nonzero.
Let us give some practical motivation for this definition. Suppose we distribute data
on n servers. Our file (or a part of it) is an element m ∈ Fkq that has been encoded as a
codeword c = (c1 , . . . , cn ) ∈ C, where C is a code with locality `. For any i ∈ {1, . . . , n}
the element ci ∈ Fq is stored in the ith server. In distributed storage systems, data should
be recoverable at any moment, even if failures or maintenance operations are performed.
For this reason, when a machine fails, the data it contains should be recovered and saved
on another machine. To perform such operations efficiently, we wish to limit the number of
machines from which data is downloaded; this is where codes with small locality come into play. Suppose the i-th server fails. Then we need to reconstruct ci from the knowledge of the cj's for j ≠ i, and the objective is to do so from as few of the cj's as possible. From Remark 15.9.6, there exists d = (d1, . . . , dn) ∈ C⊥ of weight less than or equal to ℓ + 1 with di ≠ 0. Then the support of d is {i, i1, . . . , is} with s ≤ ℓ and

ci = −(1/di) ∑_{j=1}^{s} c_{i_j} d_{i_j}.
Consequently, ci can be recovered after downloading data from at most ℓ distinct servers. The smaller ℓ is, the more efficient the recovery process.
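The recovery rule above is a one-liner to implement. A tiny numerical illustration (our own toy example, not from the chapter):

```python
p = 7

# A [6, 4] toy code over GF(7): its dual contains the two weight-3 words below,
# giving locality 2 with recovery sets {0, 1, 2} and {3, 4, 5}.
duals = [[1, 2, 4, 0, 0, 0],
         [0, 0, 0, 3, 1, 5]]

def recover(c, i, d):
    """Recover the (erased) symbol c[i] from a dual word d with d[i] != 0,
    via the rule c_i = -(1/d_i) * sum over the rest of the support of c_j d_j."""
    s = sum(c[j] * d[j] for j in range(len(d)) if j != i) % p
    return (-s) * pow(d[i], p - 2, p) % p

# c satisfies both parity checks: 1*3 + 2*2 + 4*0 = 7 and 3*1 + 1*4 + 5*0 = 7,
# both zero mod 7, so c is a codeword.
c = [3, 2, 0, 1, 4, 0]
print(recover(c, 1, duals[0]))   # -> 2, the value of the "erased" symbol c[1]
```

Only the two other symbols in the recovery set of position 1 are read; the second dual word plays the same role for positions 3, 4, 5.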
Remark 15.9.7 Note that the literature on distributed storage actually involves two dis-
tinct families of codes:
• locally recoverable codes which are the purpose of the present section (see also Sec-
tions 20.7, 31.3, and 31.4);
• regenerating codes which are not discussed in the present chapter but are examined
in Section 31.2.
We also refer the interested reader to [542] for an introduction to regenerating codes and
to [1430] for a more geometric presentation of them.
Then, for k divisible by ℓ, define the [n, k] Tamo–Barg code of locality ℓ as the code

C := { (f(x1), . . . , f(xn)) | f(X) = ∑_{i=0}^{ℓ−1} ∑_{j=0}^{k/ℓ−1} aij X^i g(X)^j }.   (15.22)
This code has length n and dimension k. In addition, the polynomials that are evaluated to generate codewords have degree at most

deg(f) ≤ ℓ − 1 + (ℓ + 1)(k/ℓ − 1) = k + k/ℓ − 2.

To get a lower bound for its minimum distance, it suffices to embed this code into a larger code whose minimum distance is known, namely a Reed–Solomon code

C ⊆ { (f(x1), . . . , f(xn)) | deg(f) ≤ k + k/ℓ − 2 } = CL(P¹, A, (k + k/ℓ − 2)P∞),

whose minimum distance equals n − k − k/ℓ + 2. Finally, it remains to be shown that the code has locality ℓ, which will explain the rationale behind the construction. Suppose we wish to recover a symbol cr from a given codeword c ∈ C. The index r ∈ As for some integer s; since g is constant on As and identically equal to some constant γs, the restriction to As of any polynomial f as in (15.22) coincides with the polynomial ∑_{i,j} aij γs^j x^i, which is a
⁶Such polynomials are called ℓ-good or (ℓ, n)-good polynomials; they are discussed extensively in Sec-
Remark 15.9.11 It is well known that, given a Galois cover φ : Y → X, the Galois group
Γ acts on Y and the orbits are the pre-images of points of X.
Tamo and Barg’s construction can be generalized in terms of curve morphisms. Indeed,
the situation of Example 15.9.9 can be interpreted as follows. The elements x1, . . . , xn at which polynomials are evaluated are now regarded as a sequence of rational points P1, . . . , Pn of P¹ that form a disjoint union of orbits under the action of the automorphism σ : (x : y) ↦ (ζx : y). Next, the polynomial g induces a cyclic cover φ : P¹ −→ P¹ with Galois group generated by σ. The orbits can be regarded as fibres of rational points of P¹ that split totally.
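A concrete genus-0 instance of this construction (our own worked example): over F13, take ℓ = 2, g(x) = x³, which is constant on the four multiplicative cosets of {1, 3, 9} (the orbits of x ↦ 3x), A = F13*, and k = 4. Each symbol is then recoverable from the two other symbols in its coset by linear interpolation, since f restricted to a coset is a degree-≤ 1 polynomial.

```python
p = 13
cosets = [[1, 3, 9], [2, 6, 5], [4, 12, 10], [7, 8, 11]]   # orbits of x -> 3x
xs = [x for cs in cosets for x in cs]                      # n = 12 points

def g(x):
    return pow(x, 3, p)      # constant on each coset: (3x)^3 = 27 x^3 = x^3 mod 13

def encode(a):
    """a[i][j] -> f(x) = sum a_ij x^i g(x)^j, i, j in {0, 1}: k = 4, locality 2."""
    return [sum(a[i][j] * pow(x, i, p) * pow(g(x), j, p)
                for i in range(2) for j in range(2)) % p for x in xs]

def recover(c, pos):
    """Recover c[pos] from the other two symbols in its coset: on a coset,
    f coincides with a degree <= 1 polynomial, so interpolate linearly."""
    q = pos // 3
    others = [r for r in range(3 * q, 3 * q + 3) if r != pos]
    (x1, y1), (x2, y2) = [(xs[r], c[r]) for r in others]
    slope = (y2 - y1) * pow(x2 - x1, p - 2, p) % p
    return (y1 + slope * (xs[pos] - x1)) % p
```

Here n = 12, k = 4, ℓ = 2, and the embedding into a Reed–Solomon code of degree k + k/ℓ − 2 = 4 gives minimum distance at least n − k − k/ℓ + 2 = 8.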
Similarly to Reed–Solomon codes, for a given ground field, the Tamo–Barg approach
permits one to generate optimal codes with respect to (15.21), but the code length will be
bounded from above by the number of rational points of P1 , i.e., by q + 1. If one wishes to
create longer good locally recoverable codes, the construction can be generalized as proposed
in [128]. Consider
extension Fq(Y)/Fq(X) whose pole locus avoids⁷ the points Pi,j, and let G be a divisor on X. Then one can construct the code

C := { (f(P1,1), . . . , f(P_{n/(ℓ+1), ℓ+1})) | f = ∑_{i=0}^{ℓ−1} (φ*fi) · x^i,  fi ∈ L(G) },   (15.23)
Remark 15.9.13 Since x is a primitive element of the extension Fq(Y)/Fq(X), its restriction to a fibre is injective. Indeed, suppose that for two distinct points R1, R2 in a given fibre we had x(R1) = x(R2). For any f ∈ Fq(X), φ*f(R1) = φ*f(R2); also, Fq(Y) is generated as an algebra by φ*Fq(X) and x. Together these two facts imply that no function in Fq(Y) takes distinct values at R1 and R2, a contradiction.
Theorem 15.9.14 ([128, Th. 3.1]) The Barg–Tamo–Vlăduţ code defined in (15.23) has
locality ` and parameters [n, k, d] with
Proof: The dimension is a consequence of the definition and of the Riemann–Roch Theorem.
The proof of the locality is the very same as that of Tamo–Barg codes given in Section 15.9.3.
For the minimum distance, observe that the code C is contained in the code CL(Y, P, (ℓ − 1)(x)∞ + φ*G). Therefore, it suffices to bound from below the minimum distance of this code to get a lower bound for the minimum distance of C. From (15.2), deg(φ*G) = (ℓ + 1) deg(G) and, from Theorem 15.3.12, the code CL(Y, P, (ℓ − 1)(x)∞ + φ*G) has minimum distance at least n − (ℓ − 1) deg((x)∞) − (ℓ + 1) deg(G). Finally, from [1401, Lem. 2.2], we get deg((x)∞) = deg(φ), which concludes the proof.
Remark 15.9.15 The above proof is more or less that of [128, Th. 3.1]. We chose to reproduce it here in order to describe the general strategy for evaluating the parameters of such locally recoverable codes constructed from curves. Namely:
• the dimension is obtained by applying the Riemann–Roch Theorem on X together
with an elementary count of monomials;
• the minimum distance is obtained by observing that the constructed LRC is contained
in an actual algebraic geometry code to which the Goppa Bound (Theorem 15.3.12)
can be applied.
Remark 15.9.16 See [128, § IV.A] for examples of LRC from the Hermitian curve.
7 Here again, this avoiding condition can be relaxed thanks to Remark 15.3.8.
Algebraic Geometry Codes and Some Applications 359
Definition 15.9.17 The local distance ρ of a locally recoverable code C is defined as
\[ \rho \;\stackrel{\text{def}}{=}\; \min_{i,\,A(i)} d_H\!\big(C|_{A(i)}\big), \]
where i ∈ {1, . . . , n} and A(i) ranges over all the recovery sets of C for i (which may be non-unique according to Remark 15.9.5).
The codes introduced in (15.23) have local distance 2. Actually, improving the local
distance permits a reduction in the amount of requested symbols for the recovery of a given
symbol as suggested by the following statement.
Lemma 15.9.18 Let C ⊆ Fnq be a locally recoverable code with local distance ρ and recovery
sets of cardinality ` + 1. For a codeword c ∈ C, any symbol ci of c can be recovered from
any (` − ρ + 2)-tuple of other symbols in the same recovery set.
Proof: Let A(i) ⊆ {1, . . . , n} be a recovery set for the position i. The restricted code C|A(i)
has length ` + 1 and minimum distance ρ. Therefore, by definition of the minimum distance
for any I ⊆ A(i) \ {i} with |I| = (ℓ + 1) − (ρ − 1) = ℓ − ρ + 2, the restriction map C|A(i) −→ C|I
is injective.
The local distance can be increased by reducing the degree in x of the evaluated functions, that is to say, by considering a code
\[
C' \;\stackrel{\text{def}}{=}\; \left\{ \Big( f(P_{1,1}),\, \ldots,\, f(P_{\frac{n}{\ell+1},\,\ell+1}) \Big) \;\middle|\; f = \sum_{i=0}^{s-1} (\varphi^{*} f_i)\, x^{i},\ f_i \in L(G) \right\} \tag{15.24}
\]
for some integer 0 ≤ s ≤ `. Here again, the code restricted to a recovery set is nothing but
an [` + 1, s, ` − s + 2] Reed–Solomon code. In such a code, a codeword is entirely determined
by any s-tuple of its entries.
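Concretely, recovery inside a single recovery set amounts to Lagrange interpolation of a low-degree polynomial. The following Python sketch recovers an erased symbol of such a local [ℓ + 1, s] Reed–Solomon code from s surviving symbols; the prime field, evaluation points, and local polynomial are illustrative choices of ours, not taken from the text.

```python
# Sketch: recovering one erased symbol of an [l+1, s] Reed-Solomon restriction
# from any s surviving symbols, via Lagrange interpolation over F_p.
# The field F_13 and the evaluation points are toy choices for illustration.

p = 13  # illustrative prime

def interpolate(points, x0, p):
    """Lagrange-interpolate the pairs in `points` and evaluate at x0 (mod p)."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * ((x0 - xj) % p) % p
                den = den * ((xi - xj) % p) % p
        # pow(den, p-2, p) is the inverse of den modulo the prime p
        total = (total + yi * num * pow(den, p - 2, p)) % p
    return total

# A recovery set of size l+1 = 5 carrying a local polynomial of degree < s = 3.
xs = [1, 2, 3, 4, 5]
f = lambda x: (4 + 2 * x + 7 * x * x) % p   # degree < s, coefficients arbitrary
symbols = [f(x) for x in xs]

# Erase the symbol at x = 3 and recover it from any s = 3 other symbols.
survivors = [(1, symbols[0]), (4, symbols[3]), (5, symbols[4])]
recovered = interpolate(survivors, 3, p)
assert recovered == symbols[2]
```

Any other choice of s survivors in the same recovery set recovers the symbol equally well, which is exactly the statement that a codeword of the local code is determined by any s-tuple of its entries.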
Theorem 15.9.19 The Tamo–Barg–Vlăduţ code C′ defined in (15.24) has length n and locality ℓ, with local distance ρ, dimension k, and minimum distance d satisfying

ρ = ℓ − s + 2,   k = s · dim L(G),   and   d ≥ n − (s − 1) deg(φ) − (ℓ + 1) deg(G).
Definition 15.9.20 The availability of a locally recoverable code is the minimum over
all positions i ∈ {1, . . . , n} of the number of recovery sets for i.
360 Concise Encyclopedia of Coding Theory
Practically, a distributed storage system with a large availability benefits from more
flexibility in choosing the servers contacted for recovering the contents of a given one.
The availability of an LRC is a positive integer and all the previous constructions had
availability 1. Constructing LRC with multiple recovery sets is a natural problem.
• For Reed–Solomon-like LRC, this problem is discussed in [1772, § IV] by considering
two distinct group actions on Fq . The cosets with respect to these group actions
provide two partitions of the support yielding LRC with availability 2.
• In the curve case, constructions of LRC with availability 2 from a curve X together
with two distinct morphisms from this curve to other curves Y (1) , Y (2) is considered
in [122, § 6] and [128, § V]; the case of availability t ≥ 2 is treated in [926].
The construction can be realized from the bottom using the notion of a fibre product.
Given three curves Y (1) , Y (2) , and X with morphisms
φ1 φ2
Y (1) −→ X and Y (2) −→ X,
then the fibre product Y(1) ×X Y(2) is defined as
\[
\big(Y^{(1)} \times_X Y^{(2)}\big)(\mathbb{F}_q) \;\stackrel{\text{def}}{=}\; \left\{ (P_1, P_2) \in \big(Y^{(1)} \times Y^{(2)}\big)(\mathbb{F}_q) \;\middle|\; \varphi_1(P_1) = \varphi_2(P_2) \right\}.
\]
It comes with two canonical projections ψ1 and ψ2 onto Y(1) and Y(2) respectively, fitting into the commutative diagram
\[
\begin{array}{ccc}
 & Y^{(1)} \times_X Y^{(2)} & \\
 \stackrel{\psi_1}{\swarrow} & & \stackrel{\psi_2}{\searrow} \\
 Y^{(1)} & & Y^{(2)} \\
 \stackrel{\varphi_1}{\searrow} & & \stackrel{\varphi_2}{\swarrow} \\
 & X &
\end{array}
\]
where we denote by Φ the morphism Φ = φ1 ∘ ψ1 = φ2 ∘ ψ2 : Y(1) ×X Y(2) −→ X.
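On the level of rational points, the fibre product is a plain set-theoretic construction, as the following Python sketch illustrates; the point sets and the two covering maps are toy choices of ours, not taken from the text.

```python
# Minimal sketch of the fibre product on points: given two maps
# phi1 : Y1 -> X and phi2 : Y2 -> X of finite point sets, the fibre
# product consists of the pairs that agree on X.

def fibre_product(Y1, Y2, phi1, phi2):
    return [(P1, P2) for P1 in Y1 for P2 in Y2 if phi1(P1) == phi2(P2)]

# Toy covers of X = {0, 1, 2}: phi1 of degree 2, phi2 of degree 3.
Y1 = range(6)
Y2 = range(9)
phi1 = lambda P: P % 3
phi2 = lambda P: P % 3

pts = fibre_product(Y1, Y2, phi1, phi2)
assert len(pts) == 18            # 3 points of X, each with 2*3 = 6 pairs above it

# The two canonical projections psi1, psi2 are just the coordinate maps:
psi1 = lambda pair: pair[0]
assert {phi1(psi1(q)) for q in pts} == {0, 1, 2}
```

A point of X that splits totally in both covers thus acquires deg(φ1) · deg(φ2) pre-images in the fibre product, which is where the evaluation points P1, . . . , Pn below come from.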
The construction of LRC with availability 2 can be done as follows. Let `1 , `2 be integers
such that deg(φ1 ) = `1 + 1, deg(φ2 ) = `2 + 1 (hence deg(ψ1 ) = `2 + 1 and deg(ψ2 ) = `1 + 1),
and let
• x1 and x2 be respective primitive elements of the extensions Fq (Y (1) )/Fq (X) and
Fq (Y (2) )/Fq (X);
• G be a divisor on X;
• Q1 , . . . , Qs be rational points of X that are totally split in Φ and denote by P1 , . . . , Pn
their pre-images.
Similar to the previous cases, either we suppose that the supports of G and the points
Q1 , . . . , Qs avoid the image under φ1 of the pole locus of x1 and the image under φ2 of the
pole locus of x2 , or we use Remark 15.3.8.
With the above data, we define a locally recoverable code as follows:
\[
C \;\stackrel{\text{def}}{=}\; \left\{ \big( f(P_1), \ldots, f(P_n) \big) \;\middle|\; f = \sum_{i=0}^{\ell_1 - 1} \sum_{j=0}^{\ell_2 - 1} (\Phi^{*} h_{i,j})\, \psi_1^{*}(x_1)^{i}\, \psi_2^{*}(x_2)^{j},\ h_{i,j} \in L(G) \right\}. \tag{15.25}
\]
Let i ∈ {1, . . . , n}. The point Pi is associated to two recovery sets, namely, ψ1^{−1}({ψ1(Pi)}) and ψ2^{−1}({ψ2(Pi)}), which have respective cardinalities ℓ2 + 1 and ℓ1 + 1.
Theorem 15.9.21 The code C of (15.25) has availability 2 with localities (ℓ1, ℓ2). Its parameters [n, k, d] satisfy

k = ℓ1 ℓ2 · dim L(G)   and   d ≥ n − (ℓ1 + 1)(ℓ2 + 1)(deg(G) + ℓ1 + ℓ2 − 2).
Proof: The minimum distance comes from the fact that the code is a subcode of CL(Y(1) ×X Y(2), P, Φ*(G) + (ℓ1 − 1)ψ1*((x1)∞) + (ℓ2 − 1)ψ2*((x2)∞)). The dimension is a direct consequence of the definition of the code. For further details, see, for instance, [926, Th. 3.1].
Remark 15.9.22 See [926, § 5, 6, 7] for examples of LRC with availability ≥ 2 from
Giulietti–Korchmaros curves, Suzuki curves, and Artin–Schreier curves.
Acknowledgement
To write Section 15.5, the authors followed advice from Iwan Duursma. Remark 15.6.27 was communicated by Peter Beelen. The authors warmly thank these colleagues for their help.
Chapter 16
Codes in Group Algebras
Wolfgang Willems
Otto-von-Guericke Universität and Universidad del Norte
16.1 Introduction
Group codes, i.e., ideals in a group algebra FG, where F is a finite field and G a finite group, serve as a fairly good illustration of the fact that the more algebraic structure the ambient space carries, the better one understands the code. Group codes entered the picture in the sixties of the last century. Already in 1967 Berman [174] discovered that binary Reed–Muller codes are ideals in a group algebra where the underlying group is an elementary abelian 2-group. At the same time MacWilliams investigated ideals in a group algebra over a dihedral group of order 2n [1320]. In [1321] she proposed that one should “look for a class of groups, not cyclic, which produce codes with some desirable practical properties.”
Since that time many papers have dealt with group codes, and there are mainly two
reasons for that. First of all, there are numerous good codes in the class of group codes.
Second, methods from representation theory seem to be extremely powerful when dealing
with coding theoretic properties.
Throughout this chapter, unless stated otherwise, we denote by G a finite group and by F = Fq a finite field of size q and characteristic p. The notation we shall use in both coding theory and representation theory is standard. For coding theory we refer to Chapter 1 in this encyclopedia and for representation theory to the textbooks [43, 1200] and [1013, Chapter VII].
In the following we denote by Sn, An the symmetric, respectively alternating, group on n letters, and by Cn, Dn a cyclic, respectively dihedral, group of order n. The notation n | m for natural numbers n and m means that n divides m.
In the following, A always denotes a finite dimensional algebra over the field F. By an
A-module M we mean a right A-module of finite dimension over F; by an ideal I of A a
right ideal in A. To be brief we write I ≤ A. Recall that an A-module M satisfies
• m1 = m for all m ∈ M ,
• (m + m′)a = ma + m′a for all m, m′ ∈ M and a ∈ A,
• m(a + a′) = ma + ma′ for all m ∈ M and a, a′ ∈ A,
• (ma)a′ = m(aa′) for all m ∈ M and a, a′ ∈ A.
Note that the algebra structure of A makes A into an A-module which we call the
regular module of A. By a submodule of an A-module M we mean an F-linear subspace
of M which is invariant under the action of A.
In the following we do not distinguish between isomorphic A-modules. This means that
we usually suppress ‘up to isomorphism’.
Theorem 16.2.4 ([1200]) A has only finitely many irreducible modules; i.e., the number
of pairwise non-isomorphic irreducible A-modules is finite.
The Jacobson radical J(A) is a 2-sided nilpotent ideal in A, where nilpotent means that there exists an n ∈ N such that J(A)^n = 0. It can also be characterized as the intersection of all maximal right or left ideals in A.
Theorem 16.2.9
(a) (Jordan–Hölder) For all chains 0 = M0 < M1 < · · · < Mt = M of submodules of M with Mi/Mi−1 irreducible, the number t is the same and the composition factors Mi/Mi−1 are the same up to isomorphism including multiplicities.
(b) (Krull–Schmidt) For all decompositions
M = M1 ⊕ · · · ⊕ Ms
of M into a direct sum of indecomposable modules Mi , the number s is the same and
the Mi are unique up to isomorphism and labeling.
Definition 16.2.10 For any subset C ⊆ M, where M is an A-module, the right annihilator Annr(C) is defined by

Annr(C) = {a ∈ A | ca = 0 for all c ∈ C}.

Note that the right (left) annihilators are right (left) ideals in A.
Since the Jacobson radical is the intersection of all maximal right ideals of A, we have
J(A/J(A)) = 0. Thus, by Theorem 16.2.7, A/J(A) is a completely reducible A/J(A)-
module, hence a completely reducible A-module since J(A) annihilates the irreducible A-
modules. Moreover, each irreducible A-module occurs as a factor module of A/J(A). This
can be seen as follows: If M is an irreducible A-module and 0 ≠ m ∈ M, then the A-linear map α : A −→ M defined by a ↦ ma for a ∈ A is an epimorphism whose kernel Annr(m) is a maximal right ideal of A. Thus A/Annr(m) ≅ M and M is an epimorphic image of A/J(A) since J(A) ≤ Annr(m).
In general, an irreducible A-module does not need to be isomorphic to a minimal ideal
of A as Example 16.2.8 shows.
Definition 16.2.11
(a) An element 0 ≠ e ∈ A is called an idempotent if e² = e.
(b) Two idempotents e, f ∈ A are orthogonal if ef = 0 = f e.
(c) If an idempotent e cannot be written as e = e1 + e2 with orthogonal idempotents e1
and e2 , then we say that e is primitive.
Definition 16.2.15
(a) An A-module P is called projective if P is a direct summand of A^n for some n, i.e., if there exists an A-module P′ such that

P ⊕ P′ ≅ A ⊕ · · · ⊕ A

as A-modules.
(b) If e = e² ∈ A is a primitive idempotent, then we call P = eA a PIM (projective indecomposable module). Note that the module eA is indeed projective as A = eA ⊕ (1 − e)A.
We say that P is the projective cover of M since P and M determine each other up to
isomorphism. In particular, there is a one-to-one correspondence between the isomorphism
classes of PIM’s of A and the isomorphism classes of irreducible A-modules.
Definition 16.2.17
(a) Since A is of finite dimension over F, we may decompose A into a direct sum

A = B0 ⊕ B1 ⊕ · · · ⊕ Bs

of indecomposable 2-sided ideals Bi, which are called the blocks of A.
(b) For an indecomposable A-module M, any decomposition

M = M̃0 ⊕ · · · ⊕ M̃s

with A-modules M̃i ⊆ MBi implies that there is exactly one index i such that M = M̃i and M̃j = 0 for j ≠ i. We say that M belongs to the block Bi.
Note that a block of A is an algebra over F where the block idempotent serves as the
identity.
The vector space FG = {Σ_{g∈G} a_g g | a_g ∈ F}, with multiplication induced by the multiplication in G and extended distributively, is an F-algebra which is called a group algebra, or more precisely the group algebra of G over F.
Example 16.3.2 Suppose that the characteristic of F is p and G is a p-group; i.e., the
order |G| of G is a power of p. Then the trivial module is the only irreducible FG-module.
Furthermore, the regular FG-module is indecomposable and in the case |G| > 1 not irre-
ducible.
Suppose that char F = p and p | |G|. If a = Σ_{g∈G} g, then a² = |G|a = 0, and J = aFG is a nilpotent ideal since a ∈ Z(FG). If M is an irreducible FG-module, then obviously MJ = 0 or MJ = M since MJ is a submodule of M. As J is nilpotent the second case also leads to MJ = 0. Thus 0 ≠ J ≤ J(FG) by Definition 16.2.5, which means that FG is not semisimple.
Thus, if the characteristic p of the field F divides |G|, then there are irreducible FG-
modules whose PIM’s are indecomposable but not irreducible.
Again let char F = p, and let B be a p-block of FG with block idempotent f = Σ_{g∈G} a_g g ∈ Z(FG). By a result of R. Brauer [1200], there exists a g ∈ G with a_g ≠ 0 and a Sylow p-subgroup δ(B) of CG(g) = {x ∈ G | xg = gx} such that for all h ∈ G with a_h ≠ 0 the Sylow p-subgroup of CG(h) is contained in a G-conjugate of δ(B), i.e., in δ(B)^x = x^{−1}δ(B)x for some x ∈ G. The p-subgroup δ(B) of G is unique up to conjugation.
Definition 16.3.5 Up to conjugation the p-subgroup δ(B) of G is called the defect group of B. If |δ(B)| = p^{d(B)}, then we call d(B) the defect of B.
Recall that for any prime p a finite group G always has subgroups of order |G|_p, which are called Sylow p-subgroups, and that all of them are conjugate [43, Section 7]. By definition |G|_p is the highest power of p dividing |G|.
Definition 16.3.7 Let M be an FG-module. The vector space HomF(M, F) of all F-linear maps from M to F becomes an FG-module by

m(αg) = (mg^{−1})α

for m ∈ M, g ∈ G and α ∈ HomF(M, F). This module is denoted by M* and called the dual module of M. If M ≅ M*, we say that M is a self-dual FG-module.
Observe that the trivial FG-module is always self-dual. Furthermore, for any FG-module
M we have dim M = dim M ∗ .
Example 16.3.8 Let F = F4 and let G be a cyclic group of order 3 generated by g. Clearly, F contains a primitive third root of unity, say ω. The 1-dimensional vector space M = ⟨m⟩ becomes an FG-module via the action mg = ωm. On M* = ⟨m*⟩ the generator g acts via m*g = ω^{−1}m*. Thus M ≇ M*.
For a = Σ_{g∈G} a_g g ∈ FG, the support of a is

supp(a) = {g ∈ G | a_g ≠ 0}

and the Hamming weight of a is wtH(a) = |supp(a)|.
Definition 16.4.2 On the group algebra FG there is a distance dH , also called the Ham-
ming distance, which is given by
dH (a, b) = wtH (a − b)
for a, b ∈ FG. Note that dH satisfies the axioms of a metric; see Remark 1.6.3.
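As an illustration, the support, weight, and distance just defined can be modelled in a few lines of Python; the choice of group C5 and field F3 below is ours, purely for illustration.

```python
# A small model of elements of FG as dictionaries {group element: coefficient},
# with the support, Hamming weight, and Hamming distance of Definition 16.4.2.
# The group C_5 (written additively) and the field F_3 are toy choices.

p, n = 3, 5   # coefficients in F_3, group C_5

def supp(a):
    """Support: group elements carrying a nonzero coefficient mod p."""
    return {g for g, c in a.items() if c % p != 0}

def wt(a):
    """Hamming weight: size of the support."""
    return len(supp(a))

def d_H(a, b):
    """Hamming distance d_H(a, b) = wt(a - b)."""
    diff = {g: (a.get(g, 0) - b.get(g, 0)) % p for g in range(n)}
    return wt(diff)

a = {0: 1, 2: 2}
b = {0: 1, 3: 1}
assert supp(a) == {0, 2}
assert wt(a) == 2
assert d_H(a, b) == 2     # a and b differ exactly at the positions 2 and 3
assert d_H(a, a) == 0
```

The last assertion reflects one metric axiom; symmetry and the triangle inequality follow just as for the usual Hamming metric on F^n.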
Definition 16.4.3 A right ideal C in FG endowed with the distance dH is called a group
code, or in case we want to refer to the underlying group, a G-code. If G is abelian, we
call C an abelian group code.
To focus on right ideals as group codes is only for convention. All results easily transfer
to left ideals.
In general it is hard to determine all ideals, i.e., group codes, in a given group algebra
FG.
\[
FG \;=\; \bigoplus_{M \,\cong\, \overline{M}} P(M) \;\oplus\; \bigoplus_{M \,\not\cong\, \overline{M}} \big( P(M) \oplus P(\overline{M}) \big)
\]
Remark 16.4.6 In this remark a G-code or group code means a 2-sided ideal in FG.
(a) [175] If G = AB with abelian subgroups A, B ≤ G, then any G-code is an abelian
group code (up to equivalence). In particular, a given G-code may also be realized as
an H-code where G and H are non-isomorphic groups.
(b) [791, 792, 793] For |G| < 128 and |G| 6∈ {24, 48, 50, 60, 64, 72, 96, 108, 120} all G-codes
are abelian group codes. Over F5 there are S4 -codes which are not abelian group codes.
Over the binary field there are SL2 (3)-codes which are optimal and non-abelian group
codes.
Next we consider the natural F-linear action of the symmetric group Sn on F^n defined on the standard basis ei = (0, . . . , 0, 1, 0, . . . , 0) (where the 1 is at position i) by

e_i σ = e_{iσ}.

The permutation automorphism group of a code C of length n is

PAut(C) = {σ ∈ Sn | σ ∈ Aut(C)}.
Theorem 16.4.7 ([175]) Let C be a linear code over F of length n, and let G be a group of
order n. Then, up to permutation equivalence, C is a G-code if and only if G is isomorphic
to a regular subgroup of PAut(C).
In order to see the ‘if’ part we may assume that G ≤ PAut(C). Since G is transitive of order n, for each j ∈ {1, . . . , n} there exists exactly one g ∈ G such that j = 1g. The F-linear map φ : F^n −→ FG defined by φ(ej) = φ(e_{1g}) = g shows that φ(C) is a (right) ideal of FG.
There is a slight extension of Theorem 16.4.7 to monomial equivalence in [1424].
Example 16.4.8 The linear code C = {(c1, . . . , cn) | ci ∈ F, Σ_{i=1}^{n} ci = 0} is a G-code for any group G of order n. This may be seen as follows. First note that α : FG → F defined by α(Σ_{g∈G} a_g g) = Σ_{g∈G} a_g is G-linear. Since C is permutation equivalent to the kernel of α, which is an ideal in FG, we are done.
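The argument of Example 16.4.8 can be checked mechanically in a small case; the following Python sketch (with an illustrative choice of G = C4 and F = F5) verifies that the kernel of the augmentation map is stable under multiplication by group algebra elements.

```python
# Sketch of Example 16.4.8 for the cyclic group G = C4 over F_5: the
# augmentation map alpha sums the coefficients, and its kernel (the sum-zero
# code) is a 2-sided ideal of FG. The group and field sizes are toy choices.
# Group elements are residues mod 4; elements of FG are coefficient lists.

p, n = 5, 4

def mul(a, b):
    """Multiply two elements of F_p[C_n], given as coefficient lists."""
    c = [0] * n
    for i in range(n):
        for j in range(n):
            c[(i + j) % n] = (c[(i + j) % n] + a[i] * b[j]) % p
    return c

alpha = lambda a: sum(a) % p   # the augmentation map FG -> F

a = [1, 4, 0, 0]               # alpha(a) = 1 + 4 = 0, so a lies in the kernel
g = [0, 1, 0, 0]               # the group element x

assert alpha(a) == 0
assert alpha(mul(a, g)) == 0   # the kernel is stable under right multiplication
assert alpha(mul(g, a)) == 0   # ... and under left multiplication
```

This works because α is an algebra homomorphism, so α(ab) = α(a)α(b) vanishes whenever one factor lies in the kernel.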
Instead of looking at elementary abelian p-groups, we may look more generally at arbitrary p-groups. Using Jennings’ description [1041] of the radical powers, Faldum proved that this larger class of group codes does not lead to better parameters.
Remark 16.4.13 Many codes which are optimal in the class of linear codes are group
codes. Here we mention a few examples over the binary field.
(a) [792] There is a [24, 6, 10] group code for G = SL2 (3).
(b) [1858] There is a [32, 17, 8] group code for G = C4 ≀ C2, G = D8 Y D8 with amalgamated centers, and G = (C2 × C2 × C2) : (C2 × C2).
(c) [435] There is a [45, 13, 16] abelian group code for G = C3 × C15 .
Remark 16.4.14 Geometric Goppa codes arising from curves with many rational points
often turn out to be group codes.
(a) Using the Klein quartic x3 y + y 3 z + z 3 x = 0, Hansen proved in [900] that a [21, 10, 9]8
code exists as a group code in F8 G, where G is a Frobenius group of order 21. Note
that 9 is the largest minimum distance of a known [21, 10]8 code as stated in [845].
(b) In [901] there are geometric Goppa codes constructed from Hermitian curves. Some
of them are group codes in Fq2 G, where G is a non-abelian group of order q 3 and q is
a power of p.
(c) According to [902] a similar result holds true for the Suzuki curve in which the codes
are group codes over Fq for the Sylow 2-subgroup of the simple Suzuki group Sz(q)
of order q 2 (q 2 + 1)(q − 1) where q = 22n+1 and n ∈ N.
Observe that along with C, the dual code C⊥ is a group code as well, since

⟨c, c⊥g⟩ = ⟨cg^{−1}, c⊥⟩ = 0

for c ∈ C, c⊥ ∈ C⊥ and g ∈ G. For a = Σ_{g∈G} a_g g ∈ FG we put â = Σ_{g∈G} a_g g^{−1}. Note that this map is an anti-algebra automorphism of FG. Thus, if C is a right ideal in FG, then Ĉ = {ĉ | c ∈ C} is a left ideal.
For a group code C there exists a powerful connection between the dual code C⊥ and the dual module C* given in Definition 16.3.7.
Proposition 16.4.20 ([482, 1893]) If C ≤ FG is a group code, then FG/C⊥ ≅ C* as FG-modules.
Corollary 16.4.21 If C ⊥ = C ≤ FG is a self-dual group code, then the multiplicity of every
irreducible self-dual FG-module as a composition factor in a Jordan–Hölder series of FG is
even.
Proof: If X is a composition factor of C = C⊥, then X* is a composition factor of C*. Since FG/C ≅ C* by Proposition 16.4.20, we see that X has even multiplicity in FG if X ≅ X*. Note that if X is a composition factor of FG/C ≅ C*, then X* is a composition factor of C** ≅ C.
The statement of Corollary 16.4.21 is one of the main tools when studying self-dual
group codes.
Proposition 16.4.22 ([1464]) If C = aFG, then C* ≅ âFG. In particular, dim aFG = dim FGa.
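The equality dim aFG = dim FGa can be observed directly in a small non-commutative group algebra; the Python sketch below does this in F2[S3] for one illustrative element a of our choosing.

```python
# A small check of the dimension equality of Proposition 16.4.22 in the
# non-commutative group algebra F_2[S3], with one illustrative element a.
# Permutations are tuples of images of (0, 1, 2).

from itertools import permutations, product

G = list(permutations(range(3)))          # S3, |G| = 6
idx = {g: i for i, g in enumerate(G)}

def compose(g, h):                        # (g h)(x) = g(h(x))
    return tuple(g[h[x]] for x in range(3))

def mul(a, b):
    """Multiply in F_2[S3]; elements are 6-tuples of bits indexed by G."""
    c = [0] * 6
    for i, g in enumerate(G):
        for j, h in enumerate(G):
            c[idx[compose(g, h)]] ^= a[i] & b[j]
    return tuple(c)

a = tuple(1 if i in (0, 1) else 0 for i in range(6))   # a = sum of two elements
FG = list(product((0, 1), repeat=6))

right_ideal = {mul(a, b) for b in FG}     # aFG, the image of b |-> ab
left_ideal = {mul(b, a) for b in FG}      # FGa, the image of b |-> ba

# Both are F_2-subspaces, so equal sizes mean equal dimensions:
assert len(right_ideal) == len(left_ideal)
```

The two ideals are generally different as subspaces; only their dimensions must agree.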
Theorem 16.5.4 ([1893]) The group algebra FG contains a self-dual group code if and
only if the characteristic of F is 2 and 2 | |G|.
Remark 16.5.5 ([176]) The ternary extended self-dual Golay code G12 is not a group
code, but an ideal in a twisted group algebra of the alternating group G = A4 over F3 .
In [1727] Sloane and Thompson proved that a binary self-dual group code is never
doubly-even provided the Sylow 2-subgroup of the underlying group is cyclic. More generally,
we have the following.
Theorem 16.5.6 ([1342]) Let F = F2 and let T be a Sylow 2-subgroup of G. Then FG
contains a self-dual doubly-even group code if and only if T is neither cyclic nor a Klein
four group.
If C is a G-code, then G is a subgroup of the symmetric group S|G| . In the case that F
is binary and C = C ⊥ is doubly-even, we can say more.
Theorem 16.5.7 ([873]) If C = C ⊥ ≤ Fn2 is a doubly-even linear code, then PAut(C) is
contained in the alternating group An . In particular, a binary self-dual doubly-even G-code
implies G ≤ A|G| (up to isomorphism).
Remark 16.5.8 In [1043] an enumeration of self-dual cyclic codes over F2m is given. This
has been extended in [1050] to the case where the underlying group is abelian with a cyclic
Sylow 2-subgroup.
Note that 1/|H| ∈ F exists since H is a p′-group. Furthermore, H̃ is an idempotent lying in the center Z(FG) of FG. If e = e² ∈ FG is a primitive idempotent, then H̃e = e or H̃e = 0. The second case means that eFG ∩ H̃FG = 0. In order to understand the first case let T be a right transversal of H in G; i.e., G = ∪_{t∈T} Ht (disjoint union). Thus

eFG = H̃eFG ⊆ H̃FG = ⊕_{t∈T} H̃Ft.
t∈T
In the next result we use the notation O_{p′}(G), which denotes the largest normal subgroup of G whose order is not divisible by p.
Theorem 16.6.7 ([378]) Let A be an abelian group with char F ∤ |A| and let e be a primitive idempotent in FA. If H̃e = 0 for all subgroups H ≠ 1 of A, then A is cyclic.
Corollary 16.6.8 Let A be an abelian group with char F ∤ |A|, but not cyclic. Then any minimal group code in FA is a |U|-repetition code for some subgroup 1 ≠ U ≤ A.
Theorem 16.6.9 ([378]) Every minimal abelian group code in a semisimple group algebra
is permutation equivalent to a cyclic code.
LCD codes were first considered by Massey in [1347]. The present-day interest in LCP
and LCD codes arises from the fact that they can be used in protection against side channel
and fault injection attacks; see [304, 353, 1431]. In this context the security of a linear com-
plementary pair (C, D) can be measured by its security parameter min {dH (C), dH (D⊥ )}.
Clearly, if D = C ⊥ , i.e., C is an LCD code, then the security parameter is dH (C).
LCD codes are asymptotically good [1347], and they achieve the Gilbert–Varshamov
Bound [1648]. Applying Proposition 16.2.12 we immediately get the following.
Proposition 16.7.2 If (C, D) is an LCP of group codes, then C = eFG and D = (1 − e)FG for some e = e² ∈ FG.
Proof: Part (a) is just Definition 16.2.15; part (b) is Dickson’s Theorem 16.6.3 using Propo-
sition 16.7.2; and part (c) follows from Proposition 16.4.20.
Theorem 16.7.4 ([242]) Let (C, D) be an LCP of codes in FG, where C and D are 2-sided
ideals of FG. Then C and D⊥ are permutation equivalent. In particular (C, D) has security
parameter dH (C).
If C and D are nD cyclic codes, then the codes can be expressed as ideals in F[x1, . . . , xn]/⟨x1^{m1} − 1, . . . , xn^{mn} − 1⟩. In this case a proof of Theorem 16.7.4, which uses
e = 1 + a + a² + a⁴ + b + a²b + a⁵b + a⁶b,

but dH(C) = dH((1 − e)FG) = 2. We would like to mention here that C and D are quasi-cyclic codes.
Theorem 16.7.6 ([502]) A group code C ≤ FG is an LCD code if and only if C = eFG with e² = e = ê ∈ FG. In this case C⊥ = (1 − e)FG.
Example 16.7.7 Suppose that char F = p and H is a subgroup of G with p ∤ |H|. Then e = (1/|H|) Σ_{h∈H} h satisfies e² = e = ê. Thus C = eFG is an LCD group code. In the case that H is a p-complement in G, i.e., G = HT where T is a Sylow p-subgroup of G, we have dim C = |G|_p and dH(C) = |H| = |G|_{p′}. This follows directly from C = eFG = eFT.
Theorem 16.7.8 ([502]) Let char F = 2 and let C = eFG with e² = e = ê. Then C is symplectic, i.e., ⟨c, c⟩ = 0 for all c ∈ C, if and only if 1 ∉ supp(e).
Thus, if C is a binary symplectic LCD group code, then C is singly-even. Observe that C is never doubly-even. This can be seen as follows. If C were doubly-even, then C ⊆ C⊥, a contradiction to the assumption that C ≠ 0 is an LCD group code.
For a group code C the vector space Ĉ is a left ideal. Thus, if C is self-adjoint, then C is a 2-sided ideal.
Theorem 16.7.10 ([502]) A group code C ≤ FG is a self-adjoint LCD code if and only if C = eFG where e² = e = ê and e ∈ Z(FG).
Note that a self-adjoint group code C 6= 0 is a direct sum of blocks of the group algebra;
see Definition 16.2.17.
Furthermore, as a consequence of Theorem 16.7.10 we immediately obtain an early result
of Yang and Massey.
Corollary 16.7.11 ([1920]) If g(x) is the generator polynomial of an [n, k] cyclic code C (the characteristic of F and n not necessarily coprime), then C is an LCD code if and only if g(x) is a self-reciprocal polynomial and gcd(g(x), (x^n − 1)/g(x)) = 1.
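Corollary 16.7.11 is easy to test computationally; the Python sketch below encodes binary polynomials as bitmasks and checks the criterion for two standard length-7 cyclic codes (a toy setup of ours, not taken from the text).

```python
# Check of the LCD criterion of Corollary 16.7.11 over F_2: a cyclic code
# with generator g is LCD iff g is self-reciprocal and gcd(g, (x^n-1)/g) = 1.
# Polynomials over F_2 are encoded as bitmasks (bit i = coefficient of x^i).

def deg(f):
    return f.bit_length() - 1

def polydiv(f, g):
    """Quotient and remainder of f by g over F_2."""
    q = 0
    while f and deg(f) >= deg(g):
        shift = deg(f) - deg(g)
        q ^= 1 << shift
        f ^= g << shift
    return q, f

def polygcd(f, g):
    while g:
        f, g = g, polydiv(f, g)[1]
    return f

def reciprocal(f):
    """Reverse the coefficients (valid when the constant term is nonzero)."""
    return int(bin(f)[2:][::-1], 2)

def is_lcd_cyclic(g, n):
    xn1 = (1 << n) | 1                  # x^n - 1 = x^n + 1 over F_2
    q, r = polydiv(xn1, g)
    assert r == 0, "g must divide x^n - 1"
    return reciprocal(g) == g and polygcd(g, q) == 1

# The [7, 1] repetition code (g = 1 + x + ... + x^6) is LCD; the [7, 4]
# Hamming code (g = 1 + x + x^3) is not, its generator not being self-reciprocal.
assert is_lcd_cyclic(0b1111111, 7)
assert not is_lcd_cyclic(0b1011, 7)
```

Note that the criterion deliberately does not require gcd(n, char F) = 1, matching the parenthetical remark in the corollary.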
Example 16.7.12 Let G = A5 and let F = F2 . Furthermore, let e denote the sum of all
elements of G of order 3 and 5. Obviously ê = e² = e ∈ Z(F2G). Thus C = eF2G is a
self-adjoint LCD group code. It is the unique 2-block of defect 0 in FG. Its parameters are
[60, 16, 18]. According to Grassl’s list [845] a best known binary [60, 16] code has minimum
distance 20. The principal 2-block B0 (G) = (1 − e)F2 G has parameters [60, 44, 5] whereas
the best known binary [60, 44] code has minimum distance 6.
Remark 16.7.13 Let q = p^m where p is a prime and let n = q − 1. Then there are Reed–Solomon LCD codes over Fq of length n for all dimensions k with 0 < k < n. Using Corollary 16.7.11, a construction is given in [353] in the case p = 2. It also works for p odd.
Proof: First note that the conditions C < FG and C MDS imply that C ⊥ is MDS. Thus
dH (C) = |G| − dim C + 1 and dH (C ⊥ ) = dim C + 1. Furthermore, C ⊥ = (1 − e)FG. Hence
which proves the first part of the theorem. The second statement follows by elementary
group theory.
Question 16.7.15 Does the existence of an LCD MDS group code C ≤ FG imply that
char F - |G|? The answer is yes if G is abelian; see [502].
Remember that a linear code C is r-divisible if r | wtH (c) for all c ∈ C. In the following
we denote by ∆(C) the greatest common divisor of all weights of codewords in C. Sometimes,
∆(C) is called the divisor of C.
where the sum runs over representatives of the one dimensional subspaces of C. Thus ∆(C)_{p′} divides |G|.
Example 16.8.2 Let C be the [(p^k − 1)/(p − 1), k, p^{k−1}] simplex code over the prime field Fp where k ≥ 2 and gcd(k, p − 1) = 1. Then C is a group code in FpG where G is cyclic of order (p^k − 1)/(p − 1) (see Example 16.4.10(a)). Clearly, ∆(C) = ∆(C)_p = p^{k−1}, but ∆(C) ∤ |G|.
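Example 16.8.2 can be verified by brute force for p = 2 and k = 3; the following Python sketch (our illustrative instance) lists all nonzero-codeword weights of the [7, 3, 4] binary simplex code and computes its divisor.

```python
# Verification of Example 16.8.2 for p = 2, k = 3: every nonzero codeword of
# the [7, 3, 4] binary simplex code has weight p^(k-1) = 4, so the divisor is
# 4, which does not divide |G| = 7. Pure Python, brute-force enumeration.

from math import gcd
from itertools import product

p, k = 2, 3
n = (p**k - 1) // (p - 1)          # length 7

# Columns of the generator matrix: all nonzero vectors of F_2^3.
cols = [v for v in product(range(p), repeat=k) if any(v)]
assert len(cols) == n

weights = set()
for msg in product(range(p), repeat=k):
    if any(msg):
        word = [sum(m * c for m, c in zip(msg, col)) % p for col in cols]
        weights.add(sum(1 for x in word if x))

delta = 0
for w in weights:
    delta = gcd(delta, w)          # divisor = gcd of all nonzero weights

assert weights == {p**(k - 1)}     # every nonzero word has weight 4
assert delta == 4 and n % delta != 0
```

The same enumeration, run for larger p and k, exhibits the constant-weight property of simplex codes that makes their divisor a pure p-power.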
Theorem 16.8.3 ([482, 1873]) Let char F = p and let 0 ≠ C ≤ FG be a group code. Then the following hold true.
(a) |KM(C)| divides the weights of all codewords c ∈ C.
(b) ∆(C)_{p′} = |KM(C)|_{p′}.
In contrast to the p′-part, the p-part of ∆(C) does not seem to be easy to calculate. As a first result we would like to mention a theorem of McEliece.
Theorem 16.8.4 ([1360]) Let C be a cyclic code over Fp of length n where the prime p does not divide n. Let x denote the cyclic shift of order n, and let E be the set of eigenvalues of the action of x on C. Then ∆(C)_p = p^e, where m = (p − 1)(e + 1) is the smallest multiple of p − 1 for which a product of m members of E (repetitions allowed) is equal to 1.
Recall that the standard bilinear form ⟨· , ·⟩ on FG satisfies

⟨xz, y⟩ = ⟨x, yẑ⟩

for all z ∈ FG. We say that a group code C ≤ FG satisfies Condition (E) if the following holds true.
Condition (E): Whenever f ∈ C* there exists η ∈ EndFG(C) such that f(c) = ⟨cη, 1⟩ for all c ∈ C.
In general it is hard to see whether a given right ideal C satisfies (E) or not. A 2-sided ideal
in FG always satisfies the condition according to [1874].
Theorem 16.8.5 ([1874]) Let C = eFpG where p is again a prime and e = e² ≠ 0. Suppose that C satisfies Condition (E). Then ∆(C)_p = p^{r−1} where r is the least positive integer for which C has a nontrivial G-invariant multilinear form of degree r(p − 1).
For an extension of Theorem 16.8.5 to fields Fps , which requires more definitions, we
refer the reader directly to [1874].
Example 16.8.6
(a) Let C = P0 be the projective cover of the trivial module in FG where char F = 2. Since the restriction ⟨· , ·⟩|P0 of the bilinear form to P0 is non-degenerate, we see that P0 is an LCD code. Thus P0 = eFG where e² = e = ê, according to Theorem 16.7.6. It follows that

⟨e, e⟩ = ⟨eê, 1⟩ = ⟨e, 1⟩.
By [502, Proposition 3.6], we know that ⟨e, 1⟩ = 1F. Suppose for a moment that F = F2. Thus wtH(e) is odd since wtH(e)1F = ⟨e, e⟩. This implies ∆(P0)_2 = 1 if we are working over the prime field F2. Since the codewords of the projective cover of the trivial module over F2 are also codewords in the projective cover of the trivial module over any extension field F = F2^s, we get ∆(P0)_2 = 1 for every field F of characteristic 2.
(b) Suppose in (a) that char F = p and that G has a p-complement H. Let T be a Sylow p-subgroup of G. If e = (1/|H|) Σ_{h∈H} h, then e² = e = ê and P0 = eFG = eFT is the projective cover of the trivial module. Thus ∆(P0) = |H| = |G|_{p′}.
(c) [482] Let C = B0 be the principal p-block of FG where again char F = p. Let K(B0) = {g ∈ G | gb = b for all b ∈ B0}. By [1013, Chapter VII, Section 14], we have K(B0) = O_{p′}(G), where O_{p′}(G) is the largest normal subgroup of G of order prime to p. Since B0 is a left FG-module, KM(B0) is a normal subgroup of G. Furthermore, KM(B0)/K(B0) is a p′-group. Thus we obtain KM(B0) = O_{p′}(G). Suppose again for a moment that F = Fp and note that B0 satisfies Condition (E). Thus, by Theorem 16.8.5, we have r = 1. Therefore ∆(B0)_p = 1 in this case. Now recall that the principal p-block over Fp is contained in the principal p-block over any extension field. Thus we get ∆(B0) = |O_{p′}(G)| for every field of characteristic p.
Question 16.8.7 Is ∆(P0)_p = 1 for the projective cover P0 of the trivial module in odd characteristic p?
Based on a theorem of Ax [78] the following explicit example has been carried out.
Example 16.8.8 ([526, 1875]) Let RMq(r, m) denote the rth order generalized Reed–Muller code over Fq where q is a power of the prime p, as described in Example 16.4.11. Then ∆(RMq(r, m))_p = q^{⌈m/r⌉−1}.
For irreducible group codes C in semisimple abelian group algebras over fields of characteristic p, the p′-part of the divisor determines the equivalence class of C.
Theorem 16.8.9 ([1873]) Let C1 and C2 be irreducible group codes in a semisimple abelian group algebra FG where char F = p. Then C1 and C2 are equivalent if and only if ∆(C1)_{p′} = ∆(C2)_{p′}.
Remark 16.8.10 The assertion in Theorem 16.8.9 is not true in general for non-abelian groups. For instance, we may take a field F of characteristic p ≠ 2, 3 and G = S3. There are two irreducible FG-modules of dimension 1, namely T = (Σ_{g∈G} g)F and S = (Σ_{g∈G} sgn(g) g)F. Clearly, T ≇ S, but ∆(T) = ∆(S) = 6.
Example 16.9.2
(a) Let e = e² ∈ A. Then the ideal eA is checkable, which can be seen as follows. Clearly, eA ≤ Annr(A(1 − e)). Since any

0 ≠ (1 − e)b ∈ (1 − e)A

is not in Annr(A(1 − e)), we have eA = Annr(A(1 − e)) = Annr(1 − e).
(b) LCD group codes C are checkable since C = eFG with e = e² = ê.
(c) All cyclic codes are checkable via the control polynomial.
(d) Maximal group codes in FG are checkable since they are right annihilators of minimal
left ideals which obviously are principal.
(e) A semisimple algebra is code-checkable, in particular group algebras FG if char F ∤ |G|.
(f) A p-block B of defect 0 in FG is code-checkable, since B is a simple algebra.
Example 16.9.4 The extended binary Golay code is checkable in F2 G for all groups G
mentioned in Example 16.5.1, since it is always constructed as a principal ideal. The same
holds true for the binary self-dual [48, 24, 12] code.
Example 16.9.5 Let char F = p and let G be a finite non-cyclic p-group. Let C = (Σ_{g∈G} g)F be the trivial FG-module in FG. One easily computes that C⊥ = Σ_{g∈G} (1 − g)F, which is the Jacobson radical of FG. Since G is not cyclic, it has a factor group of type (p, p) which implies that C⊥ is not a principal ideal. Thus C is not checkable. Note that in this example FG consists only of the principal p-block B0(G) which is the projective cover of the trivial module and not uniserial.
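Example 16.9.5 can be confirmed by exhaustive computation in the smallest case G = C2 × C2 over F2; the Python sketch below (an illustrative verification of ours) checks that no principal ideal of FG equals the radical.

```python
# Verification of Example 16.9.5 for G = C2 x C2 over F_2: the dual of the
# trivial-module code is the radical (all sum-zero elements of FG), and no
# single element generates it as an ideal, so the code is not checkable.
# Elements of FG are 4-tuples of bits indexed by G = {(0,0),(0,1),(1,0),(1,1)}.

from itertools import product

G = list(product((0, 1), repeat=2))

def mul(a, b):
    """Multiply in F_2[C2 x C2] (exponents add componentwise mod 2)."""
    c = {g: 0 for g in G}
    for i, g in enumerate(G):
        for j, h in enumerate(G):
            gh = ((g[0] + h[0]) % 2, (g[1] + h[1]) % 2)
            c[gh] ^= a[i] & b[j]
    return tuple(c[g] for g in G)

elements = list(product((0, 1), repeat=4))
radical = {a for a in elements if sum(a) % 2 == 0}   # sum-zero elements
assert len(radical) == 8                             # an F_2-space of dim 3

# Each principal ideal a*FG is the image of the linear map b |-> a*b;
# none of the 16 candidates equals the radical:
principal = [{mul(a, b) for b in elements} for a in elements]
assert not any(ideal == radical for ideal in principal)
```

For the cyclic group C4 the analogous computation does find a generator of the radical, which is consistent with the role of the factor group of type (p, p) in the argument above.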
Definition 16.9.6 For an F-algebra A a module M is called uniserial if M has only one
Jordan–Hölder series; i.e., the submodules Mi in Theorem 16.2.9(a) are unique.
Theorem 16.9.7 ([243]) Let char F = p and let B be a p-block of FG. Then the following
are equivalent.
(a) All right ideals in B are checkable; i.e., B is code-checkable.
(b) All left ideals in B are principal.
(c) B contains only one irreducible left module, say L, and its projective cover P (L) is
uniserial.
The condition that B contains only one irreducible left module whose projective cover
is uniserial is obviously equivalent to the fact that B contains only one irreducible right
module whose projective cover is uniserial. Thus we may interchange left and right in (a)
and (b) of Theorem 16.9.7.
Theorem 16.9.9 ([243]) Let char F = p and let B0 (G) be the principal p-block of FG.
Then the following are equivalent.
(a) B0 (G) is code-checkable.
(b) G is p-nilpotent with cyclic Sylow p-subgroups.
(c) FG is code-checkable.
The equivalence of (b) and (c) is already contained in [1492], and in [734] for nilpotent
groups, i.e., for groups which are p-nilpotent for all primes p dividing the order of the group.
Definition 16.9.10 Let p be a prime. A finite group G is called p-solvable if all non-abelian composition factors of G are p′-groups.
Corollary 16.9.11 ([243]) Let G be a p-solvable group and let B be a p-block of G. Then
the following are equivalent.
(a) B is code-checkable.
(b) The defect group of B is cyclic and B contains only one irreducible left (right) module.
It turned out that in numerous cases the parameters of checkable group codes are as good
as the best known linear codes [845]. Even more, some earlier bounds have been improved
via group codes.
Remark 16.9.12 ([1049]) There is a checkable [36, 28, 6] group code in F_5(C_6 × C_6) and
a checkable [72, 62, 6] group code in F_5(C_6 × C_12). In both cases the minimum distance is
improved by 1 from an earlier lower bound given in [845].
Theorem 16.10.1 ([482]) Let C be a 2-sided ideal in FG with 0 ≠ C ≠ FG. Then d_H(C) ≥
3 if and only if KM(Ann(C)) = 1 (see Definition 16.8.1).
Corollary 16.10.2 Let G be a simple group. If FG has more than one block, then any sum
of blocks different from FG has minimum distance at least 3.
Note that all simple non-abelian groups have more than one p-block unless p = 2 and
G = M_22 or M_24 (Mathieu groups).
For cyclic codes, i.e., group codes for cyclic groups G, and Reed–Muller codes, i.e., group
codes for elementary abelian p-groups, there are several decoding algorithms which work
quite effectively. To our knowledge, almost nothing seems to be known for general groups
G.
382 Concise Encyclopedia of Coding Theory
Syndrome Decoding 16.10.3 Let C = ∩_{i=1}^{m} Ann_r(a_i) with a_i ∈ FG. Furthermore
suppose that d_H(C) ≥ 2t + 1. Suppose that c ∈ C is transmitted and v = c + e is received with
an error vector e of weight wt_H(e) ≤ t. If we write v = c′ + e′ with c′ ∈ C and wt_H(e′) ≤ t,
then we get a_i v = a_i e = a_i e′ for all i. Thus
e − e′ ∈ ∩_{i=1}^{m} Ann_r(a_i) = C,
implying e − e′ = 0. This shows that the syndromes a_i v uniquely determine the error
vector, hence the transmitted codeword c.
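As a concrete illustration of the syndrome idea above, the following Python sketch works in F_2[C_7], where the [7, 4, 3] binary Hamming code arises cyclically as C = Ann_r(h) for h(x) = (x^7 − 1)/g(x). The realization and all names (conv, decode) are ours, chosen for illustration; they are not part of the text.

```python
from itertools import product

n = 7

def conv(a, b):
    """Multiply two elements of F2[C7] = F2[x]/<x^7 - 1> (cyclic convolution mod 2)."""
    c = [0] * n
    for i in range(n):
        if a[i]:
            for j in range(n):
                if b[j]:
                    c[(i + j) % n] ^= 1
    return tuple(c)

g = (1, 1, 0, 1, 0, 0, 0)  # g(x) = 1 + x + x^3, generator of the [7,4,3] Hamming code
h = (1, 1, 1, 0, 1, 0, 0)  # h(x) = (x^7 - 1)/g(x); the code is C = Ann_r(h)

code = {conv(m, g) for m in product((0, 1), repeat=n)}   # C = <g>, 16 codewords
assert all(conv(c, h) == (0,) * n for c in code)         # C annihilates h

# Syndrome table: the syndrome h*v = h*e determines any error of weight <= t = 1.
table = {(0,) * n: (0,) * n}
for i in range(n):
    e = tuple(1 if j == i else 0 for j in range(n))
    table[conv(e, h)] = e

def decode(v):
    """Correct up to one error: look up e from the syndrome h*v, then subtract it."""
    e = table[conv(v, h)]
    return tuple(v[i] ^ e[i] for i in range(n))
```

For example, flipping one coordinate of a codeword and calling `decode` recovers the codeword, exactly as the uniqueness argument above guarantees.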
Remark 16.10.4 The above decoding algorithm has been proposed in [176] to decode the
binary extended Golay code, and in [900] to decode a [21, 16, 3]_8 group code in F_8G where
G is a Frobenius group of order 21.
A quasi-group code C of index ℓ over the field F is a submodule of the FG-module FG^ℓ
where G is any finite group. Obviously, for ℓ = 1, C is a group code. If H is a subgroup
of G, then we may write G = ∪_{i=1}^{s} g_i H where s = |G : H|. Thus
FG = ⊕_{i=1}^{s} g_i FH ≅ FH^s.
In particular, a G-code C is a quasi-group code of index |G : H| for any subgroup H of G. In the
case that H is a cyclic group, C is a quasi-cyclic code in the sense of classical coding theory.
For decoding quasi-cyclic codes we refer to [116] and [1940].
In [251] Bossert proposed a decoding algorithm for cyclic codes using codewords of the
dual code. A modification to general group codes is possible and straightforward. How
efficiently such an algorithm works is admittedly open.
Research Problem 16.10.5 Find a decoding algorithm which works effectively for group
codes or at least some classes of group codes, which are not cyclic.
A partial answer in the positive to the above question has recently been given by Shi,
Wu, and Solé for additive cyclic codes. By this we mean a code in F_{q^l}^n (l ≥ 2) which is linear
over the field F_q and invariant under cyclic shifts.
Theorem 16.11.2 ([1678]) The class of additive cyclic codes over F_{q^l} (l ≥ 2) of rate ≥ 1/l
is asymptotically good.
Codes in Group Algebras 383
Theorem 16.11.3 ([145, 244]) The class of group codes in code-checkable group algebras
is asymptotically good over every finite field.
In contrast to the field case, a cyclic code in R[x]/⟨x^n − 1⟩ may not be generated by
g ∈ R[x] with g | (x^n − 1).
Theorem 16.12.2 ([851]) Let C ≤ R[x]/⟨x^n − 1⟩ be a cyclic code. Then the following are
equivalent.
(a) There exists a divisor g of x^n − 1 in RG such that C = gR[x]/⟨x^n − 1⟩.
(b) C has a complement in R[x]/⟨x^n − 1⟩ as an R-module.
Remark 16.12.3 Already in 1972, Blake constructed in [211] group codes in Z_m C_n where
m is a product of different primes and gcd(m, n) = 1. More on cyclic codes over rings can
be found in [213, 332, 441, 1742, 1743] and Chapter 6.
Theorem 16.12.6 ([1893]) Let m = 2^s p_1^{n_1} · · · p_r^{n_r} with pairwise different odd primes p_i.
Then Z_m G contains a self-dual group code if and only if all the n_i are even and s or |G| is
even.
Remark 16.12.7 In [1012] structure theorems for group ring codes are presented via cohomology. In [1556] a characterization of abelian group codes over Z_{p^r} is given by using a
generalized Discrete Fourier Transform. As a consequence, in both articles a weaker version
of Theorem 16.12.6 is obtained.
If π is a set of primes, then O_{π′}(G) denotes the largest normal subgroup of G whose
order has no prime factor in π.
Theorem 16.12.8 ([7, 617]) Let R be a finite commutative semisimple ring and let G be
abelian. If π denotes the set of non-invertible primes in R, then the following are equivalent.
(a) Every ideal in RG is principal.
(b) RG is code-checkable.
(c) G/O_{π′}(G) is a cyclic π-group.
Chapter 17
Constacyclic Codes over Finite
Commutative Chain Rings
Hai Q. Dinh
Kent State University
Sergio R. López-Permouth
Ohio University
17.1 Introduction
Noise is unavoidable. You cannot get rid of it; all you can do is manage it. Codes are
used for data compression, cryptography, error-correction, and more recently for network
coding. The study of codes has grown into an important subject that lies at the intersection
of various disciplines, including information theory, electrical engineering, mathematics, and
computer science. Its purpose is designing efficient, reliable, and secure data transmission
methods.
The expression ‘algebraic coding theory’ refers to the approach to coding theory where
alphabets and ambient spaces are enhanced with algebraic structures to facilitate the design,
analysis, and decoding of the codes produced. As with coding theory in general, algebraic
coding theory is divided in two major subcategories of study: block codes and convolutional
codes. The study of block codes focuses on three important parameters: code length, total
number of codewords, and minimum distance between codewords. Originally, work centered
on the Hamming distance. Recently, fueled by an incremental use of finite rings as alphabets,
the use of other distances such as the Lee distance, the uniform distance, and Euclidean
distance has increased.
The algebraic theory of error-correcting codes has traditionally taken place in the setting
of vector spaces over finite fields. Codes over finite rings were first considered in the early
1970s. But they might have initially been considered mostly as a mathematical curiosity
and their study was limited to only a handful of publications.
Some of the highlights of that period include the work of Blake [211], who, in 1972,
showed how to construct codes over Zm from cyclic codes over Fp where p is a prime factor
of m. He then focused on studying the structure of codes over Zpr ; see [213]. In 1977, Spiegel
[1742, 1743] generalized those results to codes over Zm , where m is an arbitrary positive
integer.
A boost for codes over finite rings came when connections with ring-linearity were found
for well-known families of nonlinear codes that at the time of the discovery had more
codewords than all comparable linear codes. The nonlinear codes by Kerdock, Preparata,
Nordstrom–Robinson, Goethals, and Delsarte–Goethals [312, 524, 816, 817, 1107, 1323,
1446, 1541] exhibited great error-correcting capabilities and remarkable structures. For in-
stance, the fact that the weight distributions of Kerdock and Preparata codes relate to each
other as if they were linear duals (via the MacWilliams Identities) became a tantalizing
enigma that was only settled when ring-alphabet linear codes were reintroduced for the
purpose. Several researchers have investigated these and other codes with the same weight
distributions [94, 342, 1076, 1077, 1078, 1080, 1834]. The families of Kerdock and Preparata
codes exist for all lengths n = 4^m ≥ 16, and at length 16 they coincide, providing the
Nordstrom–Robinson code [817, 1446, 1732]; this code is the unique binary code of length
16 consisting of 256 codewords with minimum Hamming distance 6. In [329, 898] (see also [441, 442]),
it has also been shown that the Nordstrom–Robinson code is equivalent to a quaternary
code which is self-dual. From that point on, codes over finite rings in general and over Z4 in
particular, have gained considerable prominence in the literature. There are now numerous
research papers on this subject and there is at least one book [1861] devoted to the study
of codes over Z4 .
The discovery in the early 1990s that these nonlinear binary codes are actually equivalent
to linear codes over the ring of integers modulo four, the so-called quaternary codes1 ,
propelled the importance of codes over rings and prompted many more publications such as
[329, 441, 898, 1425, 1426, 1522, 1524]. A code of length n over a ring R is linear provided
it is a submodule of Rn . Nechaev pointed out that the Kerdock codes are, in fact, linear
cyclic codes over Z4 in [1426]. Furthermore, the intriguing relationship between the weight
distributions of Kerdock and Preparata codes, a relation that is akin to that between the
weight distributions of a linear code and its dual, was explained by Calderbank, Hammons,
Kumar, Sloane, and Solé [329, 898] when they showed in 1993 that these well-known codes
are in fact equivalent to a pair of mutually dual linear codes over the ring Z4 .
More precisely, the relation between the Kerdock and Preparata codes and the corresponding ring-linear counterparts over Z_4 is due to an isometry between them, induced by
the Gray map µ : Z_4 → Z_2^2 sending 0 to 00, 1 to 01, 2 to 11, and 3 to 10. The isometry
relates codes over Z4 equipped with the so-called Lee metric to the Kerdock and Preparata
codes with the standard Hamming metric. Consequently, from its inception, the theory of
codes over rings was not only about the introduction of an alternate algebraic structure for
the alphabet of the codes under consideration but also of a different metric for them. In
addition to the Lee metric, other alternative metrics have been considered; see Chapter 22.
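A quick way to see the Gray isometry concretely is to implement µ and check that the Lee weight of a vector over Z_4 equals the Hamming weight of its Gray image. This is a minimal sketch; the helper names are ours.

```python
# The Gray map mu : Z4 -> Z2^2 (0 -> 00, 1 -> 01, 2 -> 11, 3 -> 10).
MU = {0: (0, 0), 1: (0, 1), 2: (1, 1), 3: (1, 0)}

def gray(v):
    """Extend mu coordinate-wise to vectors over Z4."""
    return tuple(bit for z in v for bit in MU[z])

def lee_wt(v):
    """Lee weight on Z4: values 0, 1, 2, 1 for z = 0, 1, 2, 3."""
    return sum(min(z, 4 - z) for z in v)

def ham_wt(v):
    return sum(1 for b in v if b != 0)

# The isometry: Lee weight over Z4 = Hamming weight of the Gray image.
for v in [(0, 1, 2, 3), (2, 2, 0, 1), (3, 3, 3, 3)]:
    assert lee_wt(v) == ham_wt(gray(v))
```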
Cyclic codes have been one of the most important classes of codes for at least two reasons.
First, cyclic codes can be efficiently encoded using shift registers; this explains their preferred
role in engineering. Secondly, from a more theoretical perspective, cyclic codes are easily
characterized as the ideals of a specific ring: F[x]/⟨x^n − 1⟩. This ring is a quotient of the (infinite)
1 In the coding theory literature, the term ‘quaternary codes’ sometimes is used for codes over the finite
field F4 . Throughout this chapter, including references, unless otherwise stated, by ‘quaternary codes’ we
mean codes over Z4 .
ring F[x] of polynomials with coefficients in the alphabet field F, a principal ideal domain.
It is this characterization that makes the notion of cyclic codes suitable to be generalized in
various ways. The concepts of negacyclic and constacyclic codes, for example, are the result
of focusing, respectively, on codes that correspond to ideals of the quotient rings F[x]/⟨x^n + 1⟩
and F[x]/⟨x^n − λ⟩ (where λ ∈ F \ {0}) of F[x]. In fact, the most general such generalization is the
notion of a polycyclic code, namely, those codes that correspond to ideals of some quotient
ring F[x]/⟨f(x)⟩ of F[x]; see [1287].
All of the notions above can easily be extended to the finite ring alphabet case by re-
placing the finite field F by the finite ring R in each definition. Those concepts, when R is a
chain ring, are the main subject of our survey. Section 17.2 discusses chain rings and their
archetypical example, the Galois rings, in some detail; that section also serves as an opportu-
nity to briefly visit various alternative metrics which often serve the purposes of ring-linear
coding theory in a better way. Section 17.3 surveys what can be said about constacyclic
codes in full generality over finite commutative rings without further assumptions on the
alphabet. Staying true to the purpose of this chapter, we focus on chain rings alphabets in
the next three sections 17.4, 17.5, and 17.6. Section 17.4 deals with the simple-root case for
arbitrary chain rings. Considering the repeated-root case requires specific techniques for the
various chain rings considered. With one remarkable exception (F_2 + uF_2 where u^2 = 0),
Section 17.5 focuses mostly on Galois rings. Section 17.6 deals with the case of the chain
ring R = F_{p^m} + uF_{p^m} where u^2 = 0. R consists of all p^m-ary polynomials of degree 0
and 1 in the indeterminate u; it is closed under p^m-ary polynomial addition and multiplication
modulo u^2. Thus, R = F_{p^m}[u]/⟨u^2⟩ = {a + ub | a, b ∈ F_{p^m}} is a chain ring with maximal ideal
uF_{p^m}. The ring R has precisely p^m(p^m − 1) units, which are of the forms α + uβ and γ,
where α, β, γ are nonzero elements of the field F_{p^m}.
independent considerations of various subcases which are thus presented in corresponding
subsections. In Section 17.7, we close by considering various directions in which the study of
constacyclic codes has been extended. Notions considered include polycyclic and sequential
codes; Theorem 17.7.3 highlights a theoretical reason for the centrality of the notion of
constacyclicity.
This chapter updates and extends [585]. Note that general codes over rings are examined
in Chapter 6.
A chain ring need not be commutative; the smallest noncommutative chain ring has order 16 [1127].
alphabet for constacyclic codes. Various decoding schemes for codes over Galois rings have
been considered in [322, 323, 324, 325].
The following equivalent conditions are well-known for the class of finite commutative
chain rings; see [584, Proposition 2.1].
Proposition 17.2.1 For a finite commutative ring R the following conditions are equiva-
lent.
(a) R is a local ring and the maximal ideal M of R is principal,
(b) R is a local principal ideal ring,
(c) R is a chain ring.
Let ζ be a fixed generator of the maximal ideal M of a finite commutative chain ring R.
Then ζ is nilpotent, and we denote its nilpotency index by t. The ideals of R form a chain:
R = ⟨ζ^0⟩ ⊋ ⟨ζ^1⟩ ⊋ · · · ⊋ ⟨ζ^{t−1}⟩ ⊋ ⟨ζ^t⟩ = ⟨0⟩.
Let R̄ denote the field R/M, called the residue field of R. The natural epimorphism of R
onto R̄ extends naturally to ¯ : R[x] → R̄[x] such that r ↦ r̄ = r + M for all r ∈ R and the
variable x ∈ R[x] maps to x ∈ R̄[x].
The following theorem summarizes various known general facts about finite commutative
chain rings; see [1359].
Proposition 17.2.2 Let R be a finite commutative chain ring with maximal ideal M = ⟨ζ⟩,
and let t be the nilpotency of ζ. Then
(a) there exists a prime p such that |R|, |R̄|, and the characteristics of R and R̄ are powers
of p,
(b) for i = 1, 2, . . . , t, the quotients ⟨ζ^{i−1}⟩/⟨ζ^i⟩ are isomorphic as R-modules and as R̄-vector
spaces, and
(c) for i = 0, 1, . . . , t, |⟨ζ^i⟩| = |R̄|^{t−i}. In particular, |R| = |R̄|^t.
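These facts can be checked on the smallest nontrivial example, Z_8, which is a chain ring with ζ = 2, nilpotency t = 3, and residue field F_2. The following sketch (names are ours) verifies the ideal chain and the sizes in Proposition 17.2.2(c).

```python
# Z8 as a finite commutative chain ring: maximal ideal <2>, zeta = 2, t = 3,
# residue field Z8/<2> = F2 of size q = 2.
R = list(range(8))
t, q = 3, 2

def ideal(gen):
    """The principal ideal <gen> in Z8."""
    return {(gen * r) % 8 for r in R}

chain = [ideal(2 ** i) for i in range(t + 1)]   # <1> ) <2> ) <4> ) <0>
assert chain[0] == set(R) and chain[t] == {0}
for i in range(t):
    assert chain[i + 1] < chain[i]              # strict containments in the chain
for i in range(t + 1):
    assert len(chain[i]) == q ** (t - i)        # |<zeta^i>| = |residue field|^(t-i)
```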
We continue with a family of rings that constitute a general yet rather concrete family
of finite commutative chain rings; they are the so-called Galois rings. While they can be
characterized in pure theoretical terms, they also have a very specific construction which
resembles that of finite (Galois) fields; undoubtedly, that is the reason for their name.
The theory of Galois rings was first developed by W. Krull in 1924. A finite ring R with
identity 1 is said to be a Galois ring if the set of its zero divisors forms a principal ideal
⟨p · 1⟩, where p is a prime integer and p · 1 = 1 + · · · + 1 denotes the p-fold addition of copies
of 1. Chapter 14 of [1864] is devoted to a characterization of Galois rings which amounts to
showing that they must be one of the rings in the following definition.
The Galois ring GR(p^a, m) is defined as the quotient ring GR(p^a, m) = Z_{p^a}[z]/⟨h(z)⟩,
where h(z) is a monic basic irreducible polynomial of degree m in Z_{p^a}[z]. A polynomial h(z)
is said to be basic irreducible if its reduction h̄(z) is irreducible over the residue field.
Note that if a = 1, then GR(p, m) = GF(p^m) = F_{p^m}, and if m = 1, then GR(p^a, 1) =
Z_{p^a}. We gather here some well-known facts about Galois rings; see [898, 1008, 1359].
Proposition 17.2.4 Let GR(p^a, m) = Z_{p^a}[z]/⟨h(z)⟩ be a Galois ring. Then the following hold.
(a) Each ideal of GR(p^a, m) is of the form ⟨p^k⟩ = p^k GR(p^a, m), for 0 ≤ k ≤ a. In
particular, GR(p^a, m) is a chain ring with maximal ideal ⟨p⟩ = pGR(p^a, m), and
residue field F_{p^m}.
(b) For 0 ≤ i ≤ a, |p^i GR(p^a, m)| = p^{m(a−i)}.
(c) Each element of GR(p^a, m) can be represented as up^k, where u is a unit and 0 ≤ k ≤ a;
in this representation k is unique and u is unique modulo ⟨p^{a−k}⟩.
(d) h(z) has a root ξ, which is also a primitive (p^m − 1)th root of unity. The set
T_m = {0, 1, ξ, ξ^2, . . . , ξ^{p^m − 2}}
is a complete set of representatives of the cosets GR(p^a, m)/pGR(p^a, m) = F_{p^m} in GR(p^a, m). Each
element r ∈ GR(p^a, m) can be written uniquely as
r = ξ_0 + ξ_1 p + · · · + ξ_{a−1} p^{a−1},
with ξ_i ∈ T_m, 0 ≤ i ≤ a − 1.
(e) There is a natural injective ring homomorphism GR(p^a, m) → GR(p^a, md) for any
positive integer d.
(f) There is a natural surjective ring homomorphism GR(p^a, m) → GR(p^{a−1}, m) with
kernel ⟨p^{a−1}⟩.
(g) Each subring of GR(p^a, m) is a Galois ring of the form GR(p^a, l), where l divides m.
Conversely, if l divides m, then GR(p^a, m) contains a unique copy of GR(p^a, l). That
means the number of subrings of GR(p^a, m) is the number of positive divisors of m.
The set T_m in Proposition 17.2.4(d) is called a Teichmüller set of GR(p^a, m).
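For the simplest case m = 1, the Teichmüller representation can be verified directly in Z_9 = GR(3^2, 1), where ξ = 8 is a primitive (3 − 1)th root of unity. This is a sketch with variable names of our choosing.

```python
# GR(3^2, 1) = Z9: xi = 8 satisfies 8^2 = 64 = 1 (mod 9) and 8 != 1,
# so T = {0, 1, xi} is a Teichmuller set, and every r in Z9 is uniquely
# r = xi_0 + 3 * xi_1 with xi_0, xi_1 in T.
p, a = 3, 2
xi = 8
assert pow(xi, p - 1, p ** a) == 1

T = [0, 1, xi]
reps = {(x0 + p * x1) % (p ** a): (x0, x1) for x0 in T for x1 in T}
assert len(reps) == p ** a   # all 9 elements of Z9 are hit exactly once
```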
We conclude this section by mentioning three important distances used for codes over
specific classes of finite chain rings, namely, the homogeneous, Lee, and Euclidean distances.
We will provide the exact values of these distances for several classes of codes in Section
17.5. In this chapter, we make the convention that the zero code has distance 0, for any
distance.
The homogeneous weight was first introduced in [430] (see also [431, 432]) over integer
residue rings, and later over finite Frobenius rings. This weight has numerous applications for
codes over finite rings, such as constructing extensions of the Gray isometry to finite chain
rings [853, 932, 978], or providing a combinatorial approach to MacWilliams equivalence
theorems [1317, 1318, 1906] for codes over finite Frobenius rings [854]. The homogeneous
distance of codes over the Galois rings GR(2^a, m) is defined as follows.
Let a ≥ 2; the homogeneous weight on GR(2^a, m) is the weight function wt_hom :
GR(2^a, m) → N given by
wt_hom(r) = 0 if r = 0,
wt_hom(r) = (2^m − 1) 2^{m(a−2)} if r ∈ GR(2^a, m) \ 2^{a−1}GR(2^a, m),
wt_hom(r) = 2^{m(a−1)} if r ∈ 2^{a−1}GR(2^a, m) \ {0}.
The homogeneous weight of a vector (c_0, c_1, . . . , c_{n−1}) of length n over GR(2^a, m) is the
rational sum of the homogeneous weights of its components:
wt_hom((c_0, c_1, . . . , c_{n−1})) = Σ_{i=0}^{n−1} wt_hom(c_i).
The homogeneous distance (or minimum homogeneous weight) d_hom of a linear code
C is the minimum homogeneous weight of nonzero codewords of C:
d_hom(C) = min{wt_hom(c) | c ∈ C, c ≠ 0}.
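Specializing the homogeneous weight to m = 1 and a = 3, i.e., the ring Z_8, it can be computed and checked directly. The sketch below, with names of our choosing, evaluates the weight and the homogeneous distance of a small illustrative Z_8-linear code.

```python
# Homogeneous weight on GR(2^a, m) for m = 1, a = 3 (the ring Z8):
# wt(0) = 0, wt(4) = 2^(a-1) = 4, and wt(r) = 2^(a-2) = 2 otherwise.
a, m = 3, 1

def wt_hom(r):
    q = 2 ** m
    if r == 0:
        return 0
    if r % 2 ** (a - 1) == 0:        # r in 2^(a-1) GR(2^a, m) \ {0}
        return q ** (a - 1)
    return (q - 1) * q ** (a - 2)    # r outside 2^(a-1) GR(2^a, m)

def d_hom(code):
    """Minimum homogeneous weight of a nonzero codeword (codewords = tuples over Z8)."""
    return min(sum(wt_hom(x) for x in c) for c in code if any(c))

# The Z8-linear code generated by (2, 6): weights 4, 8, 4, so d_hom = 4.
assert d_hom([(0, 0), (2, 6), (4, 4), (6, 2)]) == 4
```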
The Lee distance, named after its originator [1211], is a good alternative to the Hamming
distance in algebraic coding theory, especially for codes over Z4 . For instance, the Lee
distance plays an important role in constructing an isometry between binary and quaternary
codes via the Gray map in landmark papers [329, 898] of the theory of codes over rings.
Classically, for codes over finite fields, Berlekamp’s negacyclic codes [167, 170], the class of
cyclic codes investigated in [404], and the class of alternant codes discussed in [1605], are
examples of codes designed with the Lee metric in mind.
Let z ∈ Z_{2^a}; the Lee value of z, denoted by |z|_L, is given as
|z|_L = z if 0 ≤ z ≤ 2^{a−1}, and |z|_L = 2^a − z if 2^{a−1} < z ≤ 2^a − 1.
The Lee weight of a vector (c_0, c_1, . . . , c_{n−1}) of length n over Z_{2^a} is the rational sum of
the Lee values of its components:
wt_L((c_0, c_1, . . . , c_{n−1})) = Σ_{i=0}^{n−1} |c_i|_L.
The Lee distance (or minimum Lee weight) d_L of a linear code C is the minimum Lee
weight of nonzero codewords of C:
d_L(C) = min{wt_L(c) | c ∈ C, c ≠ 0}.
As codes over Z4 have gained more prominence, interesting connections with binary
codes and unimodular lattices were found with relations to codes over Z2k ; see [115]. The
connection between codes over Z4 and unimodular lattices prompted the definition of the
Euclidean weight of codewords of length n over Z4 [235, 236], and more generally, over Z2k
[115, 624, 625].
Let z ∈ Z_{2^a}; the Euclidean weight of z, denoted by |z|_E, is given as
|z|_E = z^2 if 0 ≤ z ≤ 2^{a−1}, and |z|_E = (2^a − z)^2 if 2^{a−1} < z ≤ 2^a − 1.
The Euclidean weight of a vector (c_0, c_1, . . . , c_{n−1}) of length n over Z_{2^a} is the rational
sum of the Euclidean weights of its components:
wt_E((c_0, c_1, . . . , c_{n−1})) = Σ_{i=0}^{n−1} |c_i|_E.
The Euclidean distance (or minimum Euclidean weight) d_E of a linear code C is the
minimum Euclidean weight of nonzero codewords of C:
d_E(C) = min{wt_E(c) | c ∈ C, c ≠ 0}.
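Both weights, specialized to Z_4 (a = 2), can be sketched and checked on the quaternary repetition-style code; the function names below are ours.

```python
# Lee and Euclidean values on Z_{2^a}, here for a = 2 (the ring Z4).
a = 2
M = 2 ** a

def lee(z):
    """Lee value: z for 0 <= z <= 2^(a-1), else 2^a - z."""
    z %= M
    return z if z <= M // 2 else M - z

def euclid(z):
    """Euclidean value: z^2 for 0 <= z <= 2^(a-1), else (2^a - z)^2."""
    z %= M
    return z * z if z <= M // 2 else (M - z) ** 2

def d_wt(code, wt):
    """Minimum weight of a nonzero codeword for the given value function."""
    return min(sum(wt(x) for x in c) for c in code if any(c))

# The Z4 code {(0,0), (1,1), (2,2), (3,3)} has d_L = 2 and d_E = 2.
repetition = [(0, 0), (1, 1), (2, 2), (3, 3)]
assert d_wt(repetition, lee) == 2 and d_wt(repetition, euclid) == 2
```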
The cyclic shift τ and the negacyclic shift ν on R^n are defined by
τ(x_0, x_1, . . . , x_{n−1}) = (x_{n−1}, x_0, x_1, . . . , x_{n−2})
and
ν(x_0, x_1, . . . , x_{n−1}) = (−x_{n−1}, x_0, x_1, . . . , x_{n−2}).
A code C is called cyclic if τ(C) = C, and C is called negacyclic if ν(C) = C. More generally,
if λ is a unit of the ring R, then the λ-constacyclic (λ-twisted) shift τ_λ on R^n is the shift
τ_λ(x_0, x_1, . . . , x_{n−1}) = (λx_{n−1}, x_0, x_1, . . . , x_{n−2}),
and C is λ-constacyclic if τ_λ(C) = C; equivalently, C S_λ ⊆ C, where S_λ ∈ R^{n×n} is the matrix
whose first row block is (0 | I_{n−1}) and whose last row is (λ, 0, . . . , 0).
In light of this definition, when λ = 1, λ-constacyclic codes are cyclic codes, and when
λ = −1, λ-constacyclic codes are just negacyclic codes.
Each codeword c = (c_0, c_1, . . . , c_{n−1}) is customarily identified with its polynomial representation c(x) = c_0 + c_1 x + · · · + c_{n−1} x^{n−1}, and the code C is in turn identified with
the set of all polynomial representations of its codewords. Then in the ring R[x]/⟨x^n − λ⟩, xc(x)
corresponds to a λ-constacyclic shift of c(x). From that, the following fact is well-known
and straightforward.
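The correspondence between multiplication by x and the λ-constacyclic shift can be verified exhaustively for a small case, say n = 4 over Z_4 with λ = 3 = −1 (the negacyclic case). This is a sketch with our own names.

```python
from itertools import product

# Over Z4 with lambda = 3, multiplying c(x) by x in R[x]/<x^n - lambda>
# realizes the lambda-constacyclic shift tau_lambda.
n, lam, M = 4, 3, 4

def tau(v):
    """lambda-constacyclic shift: (x0,...,x_{n-1}) -> (lam*x_{n-1}, x0, ..., x_{n-2})."""
    return ((lam * v[-1]) % M,) + v[:-1]

def times_x(v):
    """Coefficients of x * c(x) reduced modulo x^n - lam over Z4."""
    w = [0] + list(v)                   # multiply by x: shift coefficients up
    w[0] = (w[0] + lam * w.pop()) % M   # reduce: the x^n term folds back as lam * x^0
    return tuple(w)

for v in product(range(M), repeat=n):
    assert tau(v) == times_x(v)
```

Setting `lam = 1` in the same check recovers the ordinary cyclic shift.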
Under the ordinary inner product, the dual of a cyclic code is a cyclic code, and the
dual of a negacyclic code is a negacyclic code. In general, we have the following implication
of the dual of a λ-constacyclic code.
Proposition 17.3.2 ([578]) Under the ordinary inner product, the dual of a λ-constacyclic code is a λ^{−1}-constacyclic code.
For a polynomial f(x) of degree k in R[x], its reciprocal polynomial is f^*(x) = x^k f(1/x).
Note that (f^*)^* = f if and only if the constant term of f is nonzero, if and only if deg(f) =
deg(f^*). For A ⊆ R[x], we let A^* = {f^*(x) | f(x) ∈ A}. It is easy to see that if A is an
ideal of R[x], then A^* is also an ideal of R[x].
Proposition 17.3.3 ([591, Propositions 3.3, 3.4]) Let R be a finite commutative ring,
and λ a unit of R.
(a) Let a(x) = a_0 + a_1 x + · · · + a_{n−1} x^{n−1} and b(x) = b_0 + b_1 x + · · · + b_{n−1} x^{n−1} be in
R[x]. Then a(x)b(x) = 0 in R[x]/⟨x^n − λ⟩ if and only if (a_0, a_1, . . . , a_{n−1}) is orthogonal to
(b_{n−1}, b_{n−2}, . . . , b_0) and all its λ^{−1}-constacyclic shifts.
When studying λ-constacyclic codes over finite fields, most researchers assume that the
code length n is not divisible by the characteristic p of the field. This ensures that x^n − λ, and
hence the generator polynomial of any λ-constacyclic code, will have no multiple factors, and
hence no repeated roots in an extension field. The case when the code length n is divisible
by the characteristic p of the field yields the so-called repeated-root codes, which were
first studied in 1967 by Berman [173], and then in the 1970s and 1980s by several authors
such as Massey et al. [1350], Falkner et al. [702], and Roth and Seroussi [1604]. However,
repeated-root codes over finite fields were investigated in the most generality in the 1990s
by Castagnoli [369] and van Lint [1835], where they showed that repeated-root cyclic codes
have a concatenated construction, and are asymptotically bad. Nevertheless, such codes are
optimal in a few cases and that motivates further study of the class.
Repeated-root constacyclic codes over a class of finite chain rings have been extensively
studied over the last few years by many researchers, such as Abualrub and Oehmke [9, 10,
11], Blackford [207, 208], Dinh [573, 574, 575, 576, 578, 580], Ling et al. [631, 1112, 1268],
Sălăgean et al. [1449, 1611], etc. To distinguish the two cases, codes where the code length
is not divisible by the characteristic p of the residue field R̄ are called simple-root codes.
We will consider this class of codes in Section 17.4 and the class of repeated-root codes in
Sections 17.5 and 17.6.
The dual notions of polycyclic and sequential codes were introduced in [1287]. In addition
to being generalizations of constacyclicity, they serve to characterize precisely that concept
in terms of a symmetry criterion. We mention this result as Theorem 17.7.3 at the end of
this chapter.
on with a different proof by Kanwar and López-Permouth [1081] in 1997. In 1999, with a
different technique, Norton and Sălăgean [1448] extended the structure theorems given in
[332] and [1081] to cyclic codes over finite chain rings; they used an elementary approach
which did not appeal to commutative algebra as [332] and [1081] did.
Let R be a finite chain ring with maximal ideal ⟨ζ⟩, and let t be the nilpotency of ζ. For
a linear code C of length n over R, the submodule quotient of C by r ∈ R is the code
(C : r) = {e ∈ R^n | er ∈ C}, giving the tower
C = (C : ζ^0) ⊆ · · · ⊆ (C : ζ^i) ⊆ · · · ⊆ (C : ζ^{t−1}).
We denote by γ(C) the number of rows of a generator matrix in standard form of C, and by
γ_i(C) the number of rows divisible by ζ^i but not by ζ^{i+1}. Equivalently, γ_0(C) = dim(C), and
γ_i(C) = dim (C : ζ^i) − dim (C : ζ^{i−1}), for 1 ≤ i ≤ t − 1. Obviously, γ(C) = Σ_{i=0}^{t−1} γ_i(C).
For a linear code C of length n over a finite chain ring R, the information on generator
matrices, parity check matrices, and the sizes of C, its dual C^⊥, and its projection to the residue
field R̄ is given as follows.
Theorem 17.4.1 ([1448, Lemma 3.4, Theorems 3.5, 3.10]) Let C be a linear code
of length n over a finite chain ring R with generator matrix G in standard form (17.1)
associated to the matrix A = [A_0; A_1; A_2; . . . ; A_{t−1}] (blocks stacked vertically).
Then
(a) For 0 ≤ i ≤ t − 1, (C : ζ^i) has generator matrix [A_0; A_1; . . . ; A_i],
and dim (C : ζ^i) = k_0 + k_1 + · · · + k_i.
(b) If E_0 ⊆ E_1 ⊆ · · · ⊆ E_{t−1} are linear codes of length n over R̄, then there is a code D
of length n over R such that (D : ζ^i) = E_i, for 0 ≤ i ≤ t − 1.
(c) The parameters k0 , k1 , . . . , kt−1 are the same for any generator matrix G in standard
form for C.
(d) Any codeword c ∈ C can be written uniquely as
c = (v_0, v_1, . . . , v_{t−1}) G.
A parity check matrix for C is the matrix B = [B_0; B_1; . . . ; B_{t−1}], with blocks B_i
stacked vertically.
The set {ζ^{a_0} g_{a_0}, ζ^{a_1} g_{a_1}, . . . , ζ^{a_k} g_{a_k}} is said to be a generating set in standard form
of the cyclic code C if the following conditions hold:
• C = ⟨ζ^{a_0} g_{a_0}, ζ^{a_1} g_{a_1}, . . . , ζ^{a_k} g_{a_k}⟩,
• 0 ≤ k < t,
• 0 ≤ a_0 < a_1 < · · · < a_k < t,
• g_{a_i} ∈ R[x] is monic for 0 ≤ i ≤ k,
• deg(g_{a_i}) > deg(g_{a_{i+1}}) for 0 ≤ i ≤ k − 1, and
• g_{a_k} | g_{a_{k−1}} | · · · | g_{a_0} | (x^n − 1).
The existence and uniqueness of a generating set in standard form of a cyclic code were
proven by Calderbank and Sloane [332] in 1995 for the alphabet Z_{p^a}; in 2000, that was
extended to the general case of any chain ring R by Norton and Sălăgean [1448].
Proposition 17.4.2 ([332, Theorem 6], [1448, Theorem 4.4]) Any nonzero cyclic
code C over a finite chain ring R has a unique generating set in standard form.
Theorem 17.4.3 ([1448, Theorems 4.5, 4.9]) Let C be a cyclic code with generating set
{ζ^{a_0} g_{a_0}, ζ^{a_1} g_{a_1}, . . . , ζ^{a_k} g_{a_k}} in standard form. The following hold.
(a) If, for 0 ≤ i ≤ k, d_i = deg(g_{a_i}), and by convention d_{−1} = n and d_{k+1} = 0, then
T = ∪_{i=0}^{k} {ζ^{a_i} g_{a_i} x^{d_{i−1}−d_i−1}, . . . , ζ^{a_i} g_{a_i} x, ζ^{a_i} g_{a_i}}
generates C as an R-module, and every codeword of C can be written uniquely as
Σ_{i=0}^{k} h_i ζ^{a_i} g_{a_i} with
h_i ∈ (R/Rζ^{t−a_i})[x] ≅ (Rζ^{a_i})[x],
In 2004, Dinh and López-Permouth [584] generalized the methods of [332, 1081] for
simple-root cyclic codes over Z_{p^a} to obtain the structures of simple-root cyclic and self-dual
cyclic codes over finite chain rings R. The strategy was independent of the approach in
[1448] and the results were more detailed.
Since the code length n and the characteristic p of the residue field R̄ are coprime, x^n − 1
factors uniquely into a product of monic basic irreducible pairwise coprime polynomials in
R[x]. The ambient ring R[x]/⟨x^n − 1⟩ can be decomposed as a direct sum of chain rings. So any
cyclic code of length n over R, viewed as an ideal of this ambient ring R[x]/⟨x^n − 1⟩, is represented
as a direct sum of ideals from those chain rings. A polynomial f ∈ R[x] is said to be regular
if it is not a zero divisor.
Theorem 17.4.4 ([584, Lemma 3.1, Theorem 3.2, Corollary 3.3]) Let R be a finite
chain ring with maximal ideal ⟨ζ⟩ and t the nilpotency of ζ. The following hold.
(a) If f is a regular basic irreducible polynomial of the ring R[x], then R[x]/⟨f⟩ is also a chain
ring whose ideals are ⟨ζ^i⟩, 0 ≤ i ≤ t.
(b) Let x^n − 1 = f_1 f_2 · · · f_r be a representation of x^n − 1 as a product of monic basic
irreducible pairwise coprime polynomials in R[x]. Then R[x]/⟨x^n − 1⟩ can be represented as
a direct sum of chain rings R[x]/⟨f_i⟩:
R[x]/⟨x^n − 1⟩ ≅ ⊕_{i=1}^{r} R[x]/⟨f_i⟩.
(c) Each cyclic code of length n over R, i.e., each ideal of R[x]/⟨x^n − 1⟩, is a direct sum of ideals
of the form ⟨ζ^j f̂_i⟩, where 0 ≤ j ≤ t, 1 ≤ i ≤ r.
(d) The number of cyclic codes over R of length n is (t + 1)^r.
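Part (d) can be checked numerically: in the simple-root case, r equals the number of p-cyclotomic cosets modulo n, since each coset corresponds to one monic basic irreducible factor of x^n − 1. The following sketch (function name is ours) counts the cosets and applies the formula.

```python
# Count p-cyclotomic cosets mod n: orbits of s -> p*s (mod n) on {0, ..., n-1}.
# In the simple-root case this is r, the number of basic irreducible factors
# of x^n - 1, and the number of cyclic codes over a chain ring is (t+1)^r.
def num_cyclotomic_cosets(p, n):
    seen, r = set(), 0
    for s in range(n):
        if s not in seen:
            r += 1
            c = s
            while c not in seen:
                seen.add(c)
                c = (c * p) % n
    return r

# Over Z4 (t = 2, residue field F2), length n = 7: x^7 - 1 has r = 3 basic
# irreducible factors, so there are (2 + 1)^3 = 27 cyclic codes.
r = num_cyclotomic_cosets(2, 7)
assert r == 3 and (2 + 1) ** r == 27
```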
For each cyclic code C, using the decomposition above, a unique set of pairwise coprime
monic polynomials that generates C is constructed, which in turn provides the sizes of C and
its dual C^⊥, and a set of generators for C^⊥. The set of pairwise coprime monic polynomial
generators of C also gives a single generator of C; that implies R[x]/⟨x^n − 1⟩ is a principal ideal
ring. Two elements a, b of a ring R are said to be associate if there is an invertible element
u of R such that a = bu, i.e., ⟨a⟩ = ⟨b⟩.
Theorem 17.4.5 ([584, Theorems 3.4, 3.5, 3.6, 3.8, 3.10, 4.1]) Let R be a finite
chain ring with maximal ideal ⟨ζ⟩ and t the nilpotency of ζ. Let C be a cyclic code of length
n over R. The following hold.
(a) There exists a unique family of pairwise coprime monic polynomials F_0, F_1, . . . , F_t in
R[x] such that F_0 F_1 · · · F_t = x^n − 1 and C = ⟨F̂_1, ζ F̂_2, . . . , ζ^{t−1} F̂_t⟩.
(b) The number of codewords in C is
|C| = |R̄|^{Σ_{i=0}^{t−1} (t−i) deg F_{i+1}}.
(c) There exist polynomials g_0, g_1, . . . , g_{t−1} in R[x] such that C = ⟨g_0, ζg_1, . . . , ζ^{t−1}g_{t−1}⟩
and
g_{t−1} | g_{t−2} | · · · | g_1 | g_0 | (x^n − 1).
(d) Let F = F̂_1 + ζ F̂_2 + · · · + ζ^{t−1} F̂_t. Then F is a generating polynomial of C, i.e., C = ⟨F⟩.
In particular, R[x]/⟨x^n − 1⟩ is a principal ideal ring.
(e) C^⊥ = ⟨F̂_0^*, ζ F̂_t^*, . . . , ζ^{t−1} F̂_2^*⟩ and
|C^⊥| = |R̄|^{Σ_{i=1}^{t} i deg F_{i+1}},
where F_{t+1} = F_0.
(f) Let G = F̂_0^* + ζ F̂_t^* + · · · + ζ^{t−1} F̂_2^*. Then G is a generating polynomial of C^⊥, i.e.,
C^⊥ = ⟨G⟩.
(g) C is self-dual if and only if F_i is an associate of F_j^* for all i, j ∈ {0, . . . , t} such that
i + j ≡ 1 (mod t + 1).
If the nilpotency t of ζ is even, then ⟨ζ^{t/2}⟩ is a cyclic self-dual code, which is the so-called
trivial self-dual code. Using the structure of cyclic codes above, a necessary and sufficient
condition for the existence of nontrivial self-dual cyclic codes was obtained.
Theorem 17.4.6 ([584, Theorems 4.3, 4.4]) Assuming that t is an even integer, the
following conditions are equivalent.
(a) Nontrivial self-dual cyclic codes exist.
(b) There exists a basic irreducible factor f ∈ R[x] of x^n − 1 such that f and f* are not
associates.
(c) p^i ≢ −1 (mod n) for all positive integers i.
Theorem 17.4.7 ([1524, Theorem 4], [584, Theorem 4.5]) Let R be a finite chain
ring with maximal ideal ⟨ζ⟩ where |R| = 2^{lt}, |R̄| = 2^l, and t is the nilpotency of ζ. If t is
even and n is odd, then nontrivial self-dual cyclic codes of length n over R exist if and only
if n is divisible by any of the following:
• a prime τ ≡ 7 (mod 8), or
• a prime τ ≡ 1 (mod 8), where the order of 2 (mod τ) is odd, or
• distinct odd primes ϱ and σ such that the order of 2 (mod ϱ) is 2^ς i and the order of
2 (mod σ) is 2^ς j, where i is odd, j is even, and ς ≥ 1.
398 Concise Encyclopedia of Coding Theory
There are cases where p^i ≡ −1 (mod n) for some integer i, which leads to the nonexistence
of nontrivial self-dual cyclic codes for certain values of n and p. Recall that for
relatively prime integers a and m, a is called a quadratic residue or quadratic non-residue
of m according to whether the congruence x^2 ≡ a (mod m) has a solution or
not. We refer to [584, Appendix] for important properties of quadratic residues and related
concepts.
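The definition lends itself to a direct computational check (a minimal sketch of ours; for a prime modulus one would normally use Euler's criterion instead of brute force):

```python
def is_quadratic_residue(a, m):
    # a is a quadratic residue of m if x^2 ≡ a (mod m) is solvable (gcd(a, m) = 1 assumed)
    return any(x * x % m == a % m for x in range(m))

# quadratic residues modulo 7: the nonzero squares 1, 4, 2, 2, 4, 1 give {1, 2, 4}
print(sorted({x for x in range(1, 7) if is_quadratic_residue(x, 7)}))
```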
Theorem 17.4.8 ([584, Corollaries 4.6, 4.7, 4.8]) Let R be a finite chain ring with
maximal ideal ⟨ζ⟩, |R| = p^{lt}, where |R̄| = p^l, and t is the nilpotency of ζ such that t is even.
The following hold.
(a) If n is a prime, then nontrivial self-dual cyclic codes of length n over R do not exist
in the following cases:
• p = 2, n ≡ 3, 5 (mod 8),
• p = 3, n ≡ 5, 7 (mod 12),
• p = 5, n ≡ 3, 7, 13, 17 (mod 20),
• p = 7, n ≡ 5, 11, 13, 15, 17, 23 (mod 28),
• p = 11, n ≡ 3, 13, 15, 17, 21, 23, 27, 29, 31, 41 (mod 44).
(b) If n is an odd prime different from p, and p is a quadratic non-residue of n^k, where
k ≥ 1, then nontrivial self-dual cyclic codes of length n over R do not exist.
(c) If n is an odd prime, then nontrivial self-dual cyclic codes of length n over R do not
exist in the following cases:
• p ≡ 1 (mod 4), and there exists a positive integer k such that gcd(p, 4n^k) = 1
and p is a quadratic non-residue of 4n^k,
• p ≡ 1 (mod 8), and there exist positive integers i, j with i > 2 such that
gcd(p, 2^i n^j) = 1 and p is a quadratic non-residue of 2^i n^j.
Furthermore, let m = 2^{k_0} p_1^{k_1} ··· p_r^{k_r} be the prime factorization of some integer m > 1
with k_0 ≥ 2. Assume that gcd(p, m) = 1, p is a quadratic non-residue of m, and
p ≡ 1 (mod 4) if k_0 = 2,   p ≡ 1 (mod 8) if k_0 ≥ 3.
Then there exists an integer i ∈ {1, 2, ..., r} such that nontrivial self-dual cyclic codes
of length p_i over R do not exist.
Remark 17.4.9
(a) All results in this section for simple-root cyclic codes also hold for simple-root nega-
cyclic codes, reformulated accordingly. We obtain valid results if we replace “cyclic”
by “negacyclic” and “xn − 1” by “xn + 1”.
(b) Most of the techniques that Dinh and López-Permouth [584] used for simple-root
cyclic codes over finite chain rings (Theorems 17.4.4–17.4.8) are the most general
form of the techniques that were first introduced by Pless et al. [1522, 1524] in 1996
for simple-root cyclic codes over Z4. Those were previously extended to the setting
of simple-root cyclic codes over Z_{p^m} by Kanwar and López-Permouth [1081] in 1997,
and simple-root cyclic codes over Galois rings GR(p^a, m) by Wan [1862] in 1999.
(c) As shown by Hammons et al. [898], well-known nonlinear binary codes can be
constructed from quaternary linear codes using the Gray map. The Gray map
G : Z4^n → Z2^{2n} is defined as follows: each c ∈ Z4^n is uniquely represented as
c = a + 2b with a, b ∈ Z2^n; define G(c) = (b, a ⊕ b), where ⊕ is the componentwise
addition of vectors modulo 2. The Gray map is significant because it is an isometry in
the sense that the Lee weight of c is equal to the Hamming weight of G(c). The Gray
map also preserves duality, since for any linear code C over Z4, G(C) and G(C⊥) are
formally dual, i.e., their Hamming weight enumerators are MacWilliams transforms
of each other.
However, the Gray map does not preserve linearity; in fact, the Gray image of a linear
code is usually not linear. It was shown in [898] that for a Z4-linear cyclic code C of odd
length, its Gray image G(C) is linear if and only if for any codewords c1, c2 ∈ C,
2(c1 ⋆ c2) ∈ C, where ⋆ is the componentwise multiplication of vectors called the
Hadamard product, that is, a ⋆ b = (a_0 b_0, ..., a_{n−1} b_{n−1}). Indeed, binary nonlinear
codes having better parameters than their linear counterparts have been constructed
via the Gray map.
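The Gray map and its isometry property are easy to verify mechanically (a small sketch of ours; the map G(c) = (b, a ⊕ b) and the Lee weight are as defined above):

```python
from itertools import product

def gray(c):
    # G(c) = (b, a XOR b), where c = a + 2b componentwise over Z4
    a = [x % 2 for x in c]
    b = [x // 2 for x in c]
    return tuple(b) + tuple((ai + bi) % 2 for ai, bi in zip(a, b))

def lee_weight(c):
    return sum(min(x, 4 - x) for x in c)

# isometry: the Lee weight over Z4 equals the Hamming weight of the binary Gray image
for c in product(range(4), repeat=3):
    assert lee_weight(c) == sum(gray(c))
print("Gray map is a Lee/Hamming isometry on Z4^3")
```

Note that G(1) + G(1) = (0, 0) while G(1 + 1) = G(2) = (1, 1), illustrating that the image of a linear code need not be linear.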
Wolfmann [1901, 1902] showed that the Gray image of a simple-root linear negacyclic
code over Z4 is a (not necessarily linear) cyclic binary code. He classified all Z4 -linear
negacyclic codes of odd length and provided a method to determine all linear binary
cyclic codes of length 2n (n is odd) that are images of negacyclic codes under the
Gray map. Therefore, the Gray image of a simple-root negacyclic code over Z4 is
permutation equivalent to a binary cyclic code under the Nechaev permutation.
Equivalently, A is a cyclic code of length n over R if and only if ξ(A) is a negacyclic code
of length n over R.
It was shown by Sălăgean in [1611] that repeated-root cyclic and negacyclic codes over
finite chain rings are not in general principally generated.
Proposition 17.5.2 ([1611, Theorem 3.4]) Let R be a finite chain ring whose residue
field has characteristic p. For p | n the following hold.
(a) R[x]/⟨x^n − 1⟩ is not a principal ideal ring.
(b) If p is odd, or p = 2 and R is not a Galois ring, then R[x]/⟨x^n + 1⟩ is not a principal
ideal ring.
(c) If p = 2 and R is a Galois ring, then R[x]/⟨x^n + 1⟩ is a principal ideal ring.
The description of generators of ideals of R[x] by Gröbner bases was developed in [1427,
1428, 1449] for a chain ring R. Sălăgean et al. [1449, 1611] used Gröbner bases to obtain
the structure of repeated-root cyclic codes over finite chain rings, and furthermore provide
generating matrices, sizes, and Hamming distances of such codes.
Theorem 17.5.3 ([1449, Theorem 4.2], [1611, Theorems 4.1, 5.1, 6.1]) Let R be a
finite chain ring with maximal ideal ⟨ζ⟩ and t be the nilpotency of ζ. If C is a nonzero cyclic
code of length n over R, then
(a) C admits a set of generators C = ⟨ζ^{a_0} g_{a_0}, ζ^{a_1} g_{a_1}, ..., ζ^{a_k} g_{a_k}⟩ such that
(i) 0 ≤ k < t,
(ii) 0 ≤ a_0 < a_1 < ··· < a_k < t,
(iii) g_{a_i} ∈ R[x] is monic for 0 ≤ i ≤ k,
(iv) deg(g_{a_i}) > deg(g_{a_{i+1}}) for 0 ≤ i ≤ k − 1,
(v) for 0 ≤ i ≤ k, ζ^{a_{i+1}} g_{a_i} ∈ ⟨ζ^{a_{i+1}} g_{a_{i+1}}, ..., ζ^{a_k} g_{a_k}⟩ in R[x], and
(vi) ζ^{a_0}(x^n − 1) ∈ ⟨ζ^{a_0} g_{a_0}, ζ^{a_1} g_{a_1}, ..., ζ^{a_k} g_{a_k}⟩ in R[x].
(b) This set {ζ^{a_0} g_{a_0}, ζ^{a_1} g_{a_1}, ..., ζ^{a_k} g_{a_k}} of generators is a strong Gröbner basis. It is
not necessarily unique. However, the cardinality k + 1 of the basis, the degrees of its
polynomials, and the exponents a_0, a_1, ..., a_k are unique.
(c) Let d_i = deg(g_{a_i}) for 0 ≤ i ≤ k, and d_{−1} = n. Then the matrix consisting of the rows
corresponding to the codewords ζ^{a_i} x^j g_{a_i}, with 0 ≤ i ≤ k and 0 ≤ j ≤ d_{i−1} − d_i − 1, is
a generator matrix for C.
(d) The number of codewords in C is
|C| = |R̄|^{Σ_{i=0}^{k} (t−a_i)(d_{i−1} − d_i)}.
In fact, Theorem 17.5.3(a) provides a structure theorem for both simple-root and
repeated-root cyclic codes. Conditions (v) and (vi) are weaker than the divisibility chain
g_{a_k} | g_{a_{k−1}} | ··· | g_{a_0} | (x^n − 1).
In the simple-root case, conditions (v) and (vi) can be replaced by this stronger condition,
as in Proposition 17.4.2, giving a structure theorem for simple-root cyclic codes.
For repeated-root cyclic codes, conditions (v) and (vi) cannot be improved
in general; [1449, Example 3.3] gave cyclic codes for which no set of generators of the form
given in Theorem 17.5.3(a) has the property g_{a_k} | g_{a_{k−1}} | ··· | g_{a_0} | (x^n − 1).
Most of the research on repeated-root codes concentrated on the situation where the
chain ring is a Galois ring, i.e., R = GR(p^a, m). In this case, using polynomial representation,
it is easy to show that the ideals ⟨x − 1, p⟩ and ⟨x + 1, p⟩ are the sets of non-invertible elements
of GR(p^a, m)[x]/⟨x^{p^s} − 1⟩ and GR(p^a, m)[x]/⟨x^{p^s} + 1⟩, respectively. Therefore,
GR(p^a, m)[x]/⟨x^{p^s} − 1⟩ and GR(p^a, m)[x]/⟨x^{p^s} + 1⟩ are local
rings whose maximal ideals are ⟨x − 1, p⟩ and ⟨x + 1, p⟩, respectively. When a ≥ 2, GR(p^a, m)
is not a field; Proposition 17.5.2 gives us information on the ambient rings of cyclic and
negacyclic codes of length p^s over GR(p^a, m).
When a = 1, the Galois ring GR(p^a, m) is the Galois field F_{p^m}. Dinh [577] showed that
the ambient rings F_{p^m}[x]/⟨x^{p^s} − 1⟩ and F_{p^m}[x]/⟨x^{p^s} + 1⟩ are chain rings, and used this to establish the structure
of cyclic and negacyclic codes of length p^s over F_{p^m}, as well as the Hamming distances of
all such codes. These results were then generalized by Dinh [579] to all constacyclic codes.

Theorem 17.5.5 ([577, 579]) For any nonzero element λ of F_{p^m}, there exists λ_0 ∈ F_{p^m}
such that λ_0^{p^s} = λ. The ambient ring F_{p^m}[x]/⟨x^{p^s} − λ⟩ is a chain ring with maximal ideal ⟨x − λ_0⟩.
The λ-constacyclic codes of length p^s over F_{p^m} are precisely the ideals C_i = ⟨(x − λ_0)^i⟩
of F_{p^m}[x]/⟨x^{p^s} − λ⟩, for i ∈ {0, 1, ..., p^s}. The λ-constacyclic code C_i has p^{m(p^s − i)} codewords. Its
dual code is the λ^{−1}-constacyclic code ⟨(x − λ_0^{−1})^{p^s − i}⟩ ⊆ F_{p^m}[x]/⟨x^{p^s} − λ^{−1}⟩ with p^{mi} codewords. The
λ-constacyclic code C_i has Hamming distance
dH(C_i) =
  1 if i = 0,
  β + 2 if βp^{s−1} + 1 ≤ i ≤ (β + 1)p^{s−1}, where 0 ≤ β ≤ p − 2,
  (t + 1)p^k if p^s − p^{s−k} + (t − 1)p^{s−k−1} + 1 ≤ i ≤ p^s − p^{s−k} + tp^{s−k−1},
    where 1 ≤ t ≤ p − 1 and 1 ≤ k ≤ s − 1,
  0 if i = p^s.
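The piecewise formula can be checked against a brute-force minimum-weight computation in a small case (our own illustrative check, with p = 2, m = 1, s = 2 and λ = 1, i.e., binary cyclic codes ⟨(x − 1)^i⟩ of length 4):

```python
from itertools import product

p, s = 2, 2
n = p ** s

def dH_formula(i):
    # the piecewise Hamming-distance formula of Theorem 17.5.5
    if i == 0:
        return 1
    if i == n:
        return 0
    for beta in range(p - 1):
        if beta * p**(s - 1) + 1 <= i <= (beta + 1) * p**(s - 1):
            return beta + 2
    for k in range(1, s):
        for t in range(1, p):
            if n - p**(s - k) + (t - 1) * p**(s - k - 1) + 1 <= i <= n - p**(s - k) + t * p**(s - k - 1):
                return (t + 1) * p**k
    raise ValueError(i)

def mulmod(f, g):
    # multiply in F_p[x]/(x^n - 1)
    out = [0] * n
    for a, fa in enumerate(f):
        for b, gb in enumerate(g):
            out[(a + b) % n] = (out[(a + b) % n] + fa * gb) % p
    return tuple(out)

def dH_bruteforce(i):
    gen = (1,) + (0,) * (n - 1)
    xm1 = ((-1) % p, 1) + (0,) * (n - 2)     # x - lambda_0 with lambda_0 = 1
    for _ in range(i):
        gen = mulmod(gen, xm1)
    if i == n:                                # (x-1)^n = 0 in the ambient ring
        return 0
    code = {mulmod(f, gen) for f in product(range(p), repeat=n)}
    return min(sum(c != 0 for c in w) for w in code if any(w))

assert all(dH_formula(i) == dH_bruteforce(i) for i in range(n + 1))
print([dH_formula(i) for i in range(n + 1)])   # [1, 2, 2, 4, 0]
```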
When m = 1, the Galois ring GR(2^a, m) is the ring Z_{2^a}. The Hamming,
homogeneous, Lee, and Euclidean distances of all negacyclic codes of length 2^s over Z_{2^a}
were established in [576].
Theorem 17.5.7 ([576]) Let C be a negacyclic code of length 2^s over Z_{2^a}. Then C = C_i =
⟨(x + 1)^i⟩ ⊆ Z_{2^a}[x]/⟨x^{2^s} + 1⟩, for i ∈ {0, 1, ..., 2^s a}, and the Hamming distance dH(C_i), homogeneous
distance dhom(C_i), Lee distance dL(C_i), and Euclidean distance dE(C_i) of C are determined
as follows:

dH(C_i) =
  0 if i = 2^s a,
  1 if 0 ≤ i ≤ 2^s(a − 1),
  2 if 2^s(a − 1) + 1 ≤ i ≤ 2^s(a − 1) + 2^{s−1},
  2^{k+1} if 2^s(a − 1) + Σ_{j=1}^{k} 2^{s−j} + 1 ≤ i ≤ 2^s(a − 1) + Σ_{j=1}^{k+1} 2^{s−j}
    for 1 ≤ k ≤ s − 1.

dhom(C_i) =
  0 if i = 2^s a,
  2^{a−2} if 0 ≤ i ≤ 2^s(a − 2),
  2^{a−1} if 2^s(a − 2) + 1 ≤ i ≤ 2^s(a − 1),
  2^a if 2^s(a − 1) + 1 ≤ i ≤ 2^s(a − 1) + 2^{s−1},
  2^{a+k} if 2^s(a − 1) + Σ_{j=1}^{k} 2^{s−j} + 1 ≤ i ≤ 2^s(a − 1) + Σ_{j=1}^{k+1} 2^{s−j}
    for 1 ≤ k ≤ s − 1.

dL(C_i) =
  0 if i = 2^s a,
  1 if i = 0,
  2 if 1 ≤ i ≤ 2^s,
  2^{l+1} if 2^s l + 1 ≤ i ≤ 2^s(l + 1) for 1 ≤ l ≤ a − 2,
  2^a if 2^s(a − 1) + 1 ≤ i ≤ 2^s(a − 1) + 2^{s−1},
  2^{a+k} if 2^s(a − 1) + Σ_{j=1}^{k} 2^{s−j} + 1 ≤ i ≤ 2^s(a − 1) + Σ_{j=1}^{k+1} 2^{s−j}
    for 1 ≤ k ≤ s − 1.

dE(C_i) =
  0 if i = 2^s a,
  1 if i = 0,
  2^{2l+1} if 2^s l + 1 ≤ i ≤ 2^s l + 2^{s−1} for 0 ≤ l ≤ a − 2,
  2^{2l+2} if 2^s l + 2^{s−1} + 1 ≤ i ≤ 2^s(l + 1) for 0 ≤ l ≤ a − 2,
  2^{2a−1} if 2^s(a − 1) + 1 ≤ i ≤ 2^s(a − 1) + 2^{s−1},
  2^{2a+k−1} if 2^s(a − 1) + Σ_{j=1}^{k} 2^{s−j} + 1 ≤ i ≤ 2^s(a − 1) + Σ_{j=1}^{k+1} 2^{s−j}
    for 1 ≤ k ≤ s − 1.
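For the smallest nontrivial parameters a = s = 2 (negacyclic codes of length 4 over Z4), the Hamming- and Lee-distance branches of the theorem predict dH = 1,1,1,1,1,2,2,4,0 and dL = 1,2,2,2,2,4,4,8,0 for i = 0, ..., 8; a brute-force enumeration (our own illustrative check) agrees:

```python
from itertools import product

a, s = 2, 2                   # negacyclic codes of length 2^s = 4 over Z_{2^a} = Z4
n = 2 ** s

def mulneg(f, g):
    # multiply in Z4[x]/(x^n + 1), where x^n = -1
    out = [0] * n
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            k, sign = (i + j) % n, (-1) ** ((i + j) // n)
            out[k] = (out[k] + sign * fi * gj) % 4
    return tuple(out)

def code(i):
    # the ideal C_i = <(x+1)^i> in Z4[x]/(x^n + 1)
    gen = (1, 0, 0, 0)
    for _ in range(i):
        gen = mulneg(gen, (1, 1, 0, 0))
    return {mulneg(f, gen) for f in product(range(4), repeat=n)}

def dmin(C, wt):
    nz = [w for w in C if any(w)]
    return min(map(wt, nz)) if nz else 0

wH = lambda w: sum(c != 0 for c in w)            # Hamming weight
wL = lambda w: sum(min(c, 4 - c) for c in w)     # Lee weight

print([dmin(code(i), wH) for i in range(n * a + 1)])   # [1, 1, 1, 1, 1, 2, 2, 4, 0]
print([dmin(code(i), wL) for i in range(n * a + 1)])   # [1, 2, 2, 2, 2, 4, 4, 8, 0]
```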
In the special case when the alphabet is Z4 , or its Galois extension GR(4, m), repeated-
root cyclic and negacyclic codes have been studied in more detail. Among other partial
results, the structures of negacyclic and cyclic codes over Z4 of any length were provided,
respectively, by Blackford in 2003 [208] and by Dougherty and Ling in 2006 [631].
The Discrete Fourier Transform is a useful tool to study structures of codes; for instance,
it was used by Blackford [207, 208], and Dougherty and Ling [631] to recover a tuple c from
its Mattson–Solomon polynomial. In 2003, Blackford used the Discrete Fourier Transform
to give a decomposition of the ambient ring Z4[x]/⟨x^{2^a n} + 1⟩ of negacyclic codes of length 2^a n over Z4 as
a direct sum of the rings GR(4, m_i)[u]/⟨u^{2^a} + 1⟩. The rings GR(4, m_i)[u]/⟨u^{2^a} + 1⟩ are the ambient rings of negacyclic codes
of length 2^a over GR(4, m_i), which were shown to be chain rings by Blackford, and later by
Dinh [573] for the more general case over GR(2^z, m_i).
Theorem 17.5.8 ([208, Lemma 2, Theorem 1]) Let n be an odd positive integer and a
be any nonnegative integer. Let I denote a complete set of representatives of the 2-cyclotomic
cosets modulo n, and for each i ∈ I, let mi be the size of the 2-cyclotomic coset containing
i. The following hold.
(b) The map φ from Z4[x]/⟨x^{2^a n} + 1⟩ to ⊕_{i∈I} GR(4, m_i)[u]/⟨u^{2^a} + 1⟩ given by
φ(c(x)) = [ĉ_i]_{i∈I},
where (ĉ_0, ĉ_1, ..., ĉ_{n−1}) is the Discrete Fourier Transform of c(x), is a ring isomorphism.
(c) Each negacyclic code of length 2^a n over Z4, i.e., an ideal of the ring Z4[x]/⟨x^{2^a n} + 1⟩, is
isomorphic to ⊕_{i∈I} C_i, where C_i is an ideal of GR(4, m_i)[u]/⟨u^{2^a} + 1⟩ (such ideals are given in
part (a)).
Theorem 17.5.9 ([208, Theorems 2, 3]) Let C be a negacyclic code of length 2^a n over
Z4 with n odd, i.e., an ideal of the ring Z4[x]/⟨x^{2^a n} + 1⟩. The following hold.
(a) C = ⟨g(x)⟩, where g(x) = ∏_{i=0}^{2^{a+1}} [g_i(x)]^i and the {g_i(x)} are monic pairwise coprime
divisors of x^n − 1 in Z4[x].
(b) Any codeword of C is equivalent to a (2^a n)-tuple of the form (b_0 | b_1 | ··· | b_{2^a−1}),
where
b_i = Σ_{j=0}^{2^a−1} C(j, i) a_j, with the binomial coefficients C(j, i) reduced modulo 2,
and
a_j ∈ ⟨g_{j+1} ··· g_{2^{a+1}} + 2 g_{j+2^a+1} ··· g_{2^{a+1}}⟩ ⊆ Z4[x]/⟨x^n − 1⟩.
We now turn our attention to repeated-root cyclic codes over Z4. In 2003, Abualrub and
Oehmke [10, 11] classified cyclic codes of length 2^k over Z4 by their generators, and in 2004
they derived a mass formula for the number of such codes [9]. In 2006, Dougherty and Ling
[631] generalized that to give a classification of cyclic codes of length 2^k over the Galois ring
GR(4, m).
Theorem 17.5.10 ([631, Lemma 2.3, Theorem 2.6]) Let η be a primitive (2^m − 1)th
root of unity in GR(4, m) with Teichmüller set T_m = {0, 1, η, η^2, ..., η^{2^m−2}}. Then the
ambient ring GR(4, m)[u]/⟨u^{2^k} − 1⟩ is a local ring with maximal ideal ⟨2, u − 1⟩ and residue field F_{2^m}.
Cyclic codes of length 2^k over GR(4, m), i.e., ideals of GR(4, m)[u]/⟨u^{2^k} − 1⟩, are as follows:
• ⟨0⟩, ⟨1⟩,
• ⟨2(u − 1)^i⟩ where 0 ≤ i ≤ 2^k − 1,
• ⟨(u − 1)^i + 2 Σ_{j=0}^{i−1} s_j (u − 1)^j⟩ where 1 ≤ i ≤ 2^k − 1 and s_j ∈ T_m, or equivalently,
⟨(u − 1)^i + 2(u − 1)^t h(u)⟩ where 0 ≤ t ≤ i − 1 and h(u) is 0 or a unit,
• ⟨2(u − 1)^l, (u − 1)^i + 2 Σ_{j=0}^{i−1} s_j (u − 1)^j⟩ where 1 ≤ i ≤ 2^k − 1, l < i, and s_j ∈ T_m, or
equivalently, ⟨2(u − 1)^l, (u − 1)^i + 2(u − 1)^t h(u)⟩, where 0 ≤ t ≤ i − 1 and h(u) is 0
or a unit.
Furthermore, the number of such cyclic codes is
N(m) = 5 + 2^{2^{k−1} m} + 2^m(5 · 2^m − 1) · (2^{m(2^{k−1}−1)} − 1)/(2^m − 1)^2 − 4 · (2^{k−1} − 1)/(2^m − 1).
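At k = 1 the formula above collapses to N(m) = 5 + 2^m, which can be cross-checked by enumerating the ideals of Z4[u]/⟨u^2 − 1⟩ listed in the theorem (an illustrative check of ours; the formula is coded as reconstructed above, with exact rational arithmetic to catch any non-integer evaluation):

```python
from fractions import Fraction
from itertools import product

def N(m, k):
    # count of cyclic codes of length 2^k over GR(4, m), as stated above
    f = (5 + Fraction(2) ** (2 ** (k - 1) * m)
         + 2**m * (5 * 2**m - 1) * Fraction(2 ** (m * (2 ** (k - 1) - 1)) - 1, (2**m - 1) ** 2)
         - 4 * Fraction(2 ** (k - 1) - 1, 2**m - 1))
    assert f.denominator == 1          # a count must be an integer
    return int(f)

# enumerate the classified ideals of Z4[u]/(u^2 - 1)  (k = 1, m = 1)
L = 2
def mul(f, g):
    out = [0] * L
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            out[(i + j) % L] = (out[(i + j) % L] + fi * gj) % 4
    return tuple(out)

def ideal(gens):
    shifts = [mul(g, tuple(int(t == j) for t in range(L))) for g in gens for j in range(L)]
    return frozenset(tuple(sum(c * s[t] for c, s in zip(cs, shifts)) % 4 for t in range(L))
                     for cs in product(range(4), repeat=len(shifts)))

um1 = (3, 1)                                   # u - 1
cands = {ideal([(0, 0)]), ideal([(1, 0)]),     # <0>, <1>
         ideal([(2, 0)]), ideal([(2, 2)])}     # <2(u-1)^i>, i = 0, 1
for s0 in (0, 1):                              # <(u-1) + 2 s0> and <2, (u-1) + 2 s0>
    g = ((um1[0] + 2 * s0) % 4, um1[1])
    cands.add(ideal([g]))
    cands.add(ideal([(2, 0), g]))
print(len(cands), N(1, 1))                     # both are 7 = 5 + 2^1
```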
In 2003, using the Discrete Fourier Transform, Blackford [207] gave the structure of
cyclic codes of length 2n (n odd) over Z4. Later, in 2006, Dougherty and Ling [631]
generalized that to obtain a description of cyclic codes of any length over Z4 as a direct
sum of cyclic codes of length 2^k over GR(4, m_i).
Theorem 17.5.11 ([207, Theorem 2], [631, Theorem 3.2, Corollaries 3.3, 3.4]) Let
n be an odd positive integer and k any nonnegative integer. Let I denote a complete set of
representatives of the 2-cyclotomic cosets modulo n, and for each i ∈ I, let m_i be the size
of the 2-cyclotomic coset containing i. The following hold.
(a) The map γ from Z4[x]/⟨x^{2^k n} − 1⟩ to ⊕_{i∈I} GR(4, m_i)[u]/⟨u^{2^k} − 1⟩ given by
γ(c(x)) = [ĉ_i]_{i∈I},
where (ĉ_0, ĉ_1, ..., ĉ_{n−1}) is the Discrete Fourier Transform of c(x), is a ring isomorphism.
(b) Each cyclic code of length 2^k n over Z4, i.e., an ideal of the ring Z4[x]/⟨x^{2^k n} − 1⟩, is isomorphic
to ⊕_{i∈I} C_i, where C_i is an ideal of GR(4, m_i)[u]/⟨u^{2^k} − 1⟩ (such ideals are classified in Theorem
17.5.10).
(c) The number of distinct cyclic codes of length 2^k n over Z4 is ∏_{i∈I} N(m_i), where N(m_i)
is the number of cyclic codes of length 2^k over GR(4, m_i), which is given in Theorem
17.5.10.
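The 2-cyclotomic cosets and their sizes m_i are easy to compute (a small sketch of ours). For n = 7 the cosets are {0}, {1, 2, 4}, {3, 5, 6}, so the number of cyclic codes of length 2^k · 7 over Z4 would be N(1)·N(3)^2:

```python
def cyclotomic_cosets(n, q=2):
    # partition {0, 1, ..., n-1} into q-cyclotomic cosets {i, iq, iq^2, ...} mod n
    seen, cosets = set(), []
    for i in range(n):
        if i not in seen:
            c, j = set(), i
            while j not in c:
                c.add(j)
                j = j * q % n
            cosets.append(sorted(c))
            seen |= c
    return cosets

print(cyclotomic_cosets(7))   # [[0], [1, 2, 4], [3, 5, 6]]
```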
This decomposition of cyclic codes was then used to completely determine the generators
of all cyclic codes and their sizes; see [631] for specific details about the polynomials involved
in the next result.
Theorem 17.5.12 ([631, Theorems 4.2, 4.3]) Let n be an odd positive integer and k
any nonnegative integer, and let C be a cyclic code of length 2^k n over Z4, i.e., C is an ideal
of the ring Z4[x]/⟨x^{2^k n} − 1⟩. The following hold.
(a) C = ⟨2p(x^{2^k})(∏_{i=0}^{2^k−1} q_i(x)^i) + (∏_{i=1}^{2^k−1} ∏_T r_{i,T}(x)^T)(∏_{i=1}^{2^k−1} ∏_{l=0}^{i−1} s_{i,l}(x)^l)⟩,
where
x^n − 1 = p(x)(∏_{i=0}^{2^k−1} q_i(x))(∏_{i=1}^{2^k−1} ∏_T r_{i,T}(x))(∏_{i=1}^{2^k−1} ∏_{l=0}^{i−1} s_{i,l}(x)) y(x),
r̃_{i,T}(x) = r_{i,T}(x) mod 2 and s̃_{i,l}(x) = s_{i,l}(x) mod 2, and for each i the product ∏_T
is taken over all possible values of T as follows:
• if 1 ≤ i ≤ 2^{k−1}, then T = i,
• if 2^{k−1} < i < 2^{k−1} + t (t > 0), then T = 2^{k−1},
• if i = 2^{k−1} + t (t > 0), then 2^{k−1} ≤ T ≤ i,
• if i > 2^{k−1} + t (t > 0), then T = 2^{k−1} or T = 2^k − i + t.
There are four finite commutative rings of four elements, namely, the Galois field F4,
the ring Z4 of integers modulo 4, the ring F2 + uF2 where u^2 = 0, and the ring F2 + vF2
where v^2 = v. The first three are chain rings, while the last one, F2 + vF2, is not. Indeed,
F2 + vF2 ≅ F2 × F2, which is not even a local ring. The ring F2 + uF2 consists of all binary
polynomials of degree 0 and 1 in the indeterminate u; it is closed under binary polynomial
addition and multiplication modulo u^2. Thus, F2 + uF2 = F2[u]/⟨u^2⟩ = {0, 1, u, u + 1} is
a chain ring with maximal ideal {0, u}. The addition in F2 + uF2 is similar to that of the
Galois field F4 = {0, 1, ξ, ξ^2 = ξ + 1}, where u is replaced by ξ. The multiplication in
F2 + uF2 is similar to the multiplication in the ring Z4, where u is replaced by 2. In fact,
(F2 + uF2, +) ≅ (F4, +), and (F2 + uF2, ∗) ≅ (Z4, ∗). Thus, F2 + uF2 lies between F4 and
Z4, in the sense that it is additively analogous to F4, and multiplicatively analogous to
Z4. In 2009, Dinh [578] established the structure of all constacyclic codes of length 2^s over
F_{2^m} + uF_{2^m}, for any positive integer m. Of course, over F_{2^m} + uF_{2^m}, cyclic and negacyclic
codes coincide; their structure and sizes are as follows.
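The "in between" position of F2 + uF2 can be checked exhaustively (a small sketch of ours; elements are written as pairs (a, b) = a + bu, and the bijection 0↦0, 1↦1, u↦2, 1+u↦3 realizes the multiplicative correspondence with Z4):

```python
from itertools import product

# F2 + uF2: pairs (a, b) = a + b*u with u^2 = 0, coefficients mod 2
add_u = lambda x, y: ((x[0] + y[0]) % 2, (x[1] + y[1]) % 2)
mul_u = lambda x, y: ((x[0] * y[0]) % 2, (x[0] * y[1] + x[1] * y[0]) % 2)

to_z4 = {(0, 0): 0, (1, 0): 1, (0, 1): 2, (1, 1): 3}
# the bijection matches multiplication in Z4 ...
for x, y in product(to_z4, repeat=2):
    assert to_z4[mul_u(x, y)] == (to_z4[x] * to_z4[y]) % 4
# ... while addition is componentwise mod 2, i.e., the addition of F4 on the basis {1, xi}
print("(F2+uF2, *) matches (Z4, *); (F2+uF2, +) is the Klein four-group, as in F4")
```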
Theorem 17.5.13 ([578])
(a) The ring (F_{2^m} + uF_{2^m})[x]/⟨x^{2^s} + 1⟩ is a local ring with maximal ideal ⟨u, x + 1⟩, but it is not a
chain ring.
(b) Cyclic codes of length 2^s over F_{2^m} + uF_{2^m} are precisely the ideals of the ring
(F_{2^m} + uF_{2^m})[x]/⟨x^{2^s} + 1⟩, which are ⟨0⟩, ⟨1⟩, ⟨u(x + 1)^i⟩, ⟨(x + 1)^i⟩, ⟨(x + 1)^i + u(x + 1)^t h(x)⟩,
and ⟨(x + 1)^i + u(x + 1)^t h(x), u(x + 1)^κ⟩, with the ranges of i, t, κ, and h(x) as in (d) below.
(c) The number of distinct cyclic codes of length 2^s over F_{2^m} + uF_{2^m} is
[2^{m(2^{s−1}−1)}(2^{2m} + 2^m + 2) − 2^{2m+1} − 2]/(2^m − 1)^2 + [6 · 2^{m(2^s−1)} − 2^{s+1} − 1]/(2^m − 1)
+ 2^{m2^{s−1}} + 4 · 2^{m(2^{s−1}−1)} + 3 · 2^{s−1} − 1.
(d) Let C be a cyclic code of length 2^s over F_{2^m} + uF_{2^m}, as classified in (b). Then the
number of codewords n_C of C is given as follows.
• If C = ⟨0⟩, then n_C = 1.
• If C = ⟨1⟩, then n_C = 2^{m2^{s+1}}.
• If C = ⟨u(x + 1)^i⟩ where 0 ≤ i ≤ 2^s − 1, then n_C = 2^{m(2^s−i)}.
• If C = ⟨(x + 1)^i⟩ where 1 ≤ i ≤ 2^s − 1, then n_C = 2^{2m(2^s−i)}.
• If C = ⟨(x + 1)^i + u(x + 1)^t h(x)⟩ where 1 ≤ i ≤ 2^s − 1, 0 ≤ t < i, and h(x) is a
unit, then
n_C = 2^{2m(2^s−i)} if 1 ≤ i ≤ 2^{s−1} + t/2,
n_C = 2^{m(2^s−t)} if 2^{s−1} + t/2 < i ≤ 2^s − 1.
• If C = ⟨(x + 1)^i + u(x + 1)^t h(x), u(x + 1)^κ⟩ where 1 ≤ i ≤ 2^s − 1, 0 ≤ t < i,
either h(x) is 0 or h(x) is a unit, and
κ < T = i if h(x) = 0,   κ < T = min{i, 2^s − i + t} if h(x) ≠ 0,
then n_C = 2^{m(2^{s+1}−i−κ)}.
The ring R = F_{p^m} + uF_{p^m} has precisely p^m(p^m − 1) units, which are of the forms α + uβ and γ, where α, β, γ are nonzero
elements of the field F_{p^m}.
Theorem 17.6.1 ([579]) Let λ be a unit of the ring R, i.e., λ is of the form α + uβ or γ,
where α, β, γ are nonzero elements of the field F_{p^m}.
(a) If λ = α + uβ, then the ambient ring ℜ(p^s, α + uβ) = R[x]/⟨x^{p^s} − (α + uβ)⟩ is a chain ring
with maximal ideal ⟨α_0 x − 1⟩, where α_0^{p^s} = α^{−1}, and ⟨(α_0 x − 1)^{p^s}⟩ = ⟨u⟩. The (α + uβ)-constacyclic
codes of length p^s over R are the ideals ⟨(α_0 x − 1)^i⟩, 0 ≤ i ≤ 2p^s, of the chain ring
ℜ(p^s, α + uβ).
(b) If λ = γ, then the ambient ring ℜ(p^s, γ) = R[x]/⟨x^{p^s} − γ⟩ is a local ring with maximal
ideal ⟨u, x − γ_0⟩, where γ_0^{p^s} = γ, but it is not a chain ring. The γ-constacyclic codes of length p^s over
R are the ideals of this local ring.
Using the structure in Theorem 17.6.1, the Hamming distances of all constacyclic codes
of length p^s over R were established in [579, 586]. Recall that F_{p^m} is a subring of R. In this
subsection, we denote by ⟨(x − γ_0)^i⟩_F the code of length p^s generated by (x − γ_0)^i over F_{p^m};
i.e., ⟨(x − γ_0)^i⟩_F is the ideal ⟨(x − γ_0)^i⟩ of F_{p^m}[x]/⟨x^{p^s} − γ⟩.
Theorem 17.6.2 ([579, 586]) Let the notations and λ-constacyclic codes C of length p^s
over R be as given in Theorem 17.6.1.
(a) If λ = α + uβ, then
dH(C) =
  1 if 0 ≤ i ≤ p^s,
  (e + 1)p^k if 2p^s − p^{s−k} + (e − 1)p^{s−k−1} + 1 ≤ i ≤ 2p^s − p^{s−k} + e p^{s−k−1}
    for 1 ≤ e ≤ p − 1 and 0 ≤ k ≤ s − 1,
  0 if i = 2p^s.
Here α_0 ∈ F_{p^m} is such that α_0^{p^s} = α^{−1}, which is determined as follows. Since α is a nonzero element of
the field F_{p^m}, α^{−p^m} = α^{−1}, implying α^{−p^{km}} = α^{−1} for any nonnegative integer k. Using the positive
integers s and m as dividend and divisor, by the division algorithm, there exist nonnegative integers q_m
and r_m such that s = q_m m + r_m, and 0 ≤ r_m ≤ m − 1. Let α_0 = α^{−p^{(q_m+1)m−s}} = α^{−p^{m−r_m}}. Then
α_0^{p^s} = α^{−p^{(q_m+1)m}} = α^{−1}.
Here γ_0 ∈ F_{p^m} is such that γ_0^{p^s} = γ, which is determined similarly to α_0 in part (a), as follows. Let
γ_0 = γ^{p^{(q_m+1)m−s}} = γ^{p^{m−r_m}}. Then γ_0^{p^s} = γ^{p^{(q_m+1)m}} = γ.
(b) If λ = γ, then
dH(C) = dH(⟨(x − γ_0)^i⟩_F) =
  1 if i = 0,
  (e + 1)p^k if p^s − p^{s−k} + (e − 1)p^{s−k−1} + 1 ≤ i ≤ p^s − p^{s−k} + e p^{s−k−1}
    for 1 ≤ e ≤ p − 1 and 0 ≤ k ≤ s − 1,
  0 if i = p^s.
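The recipe for α_0 is easy to follow computationally. Below is a minimal sketch of ours for the prime-field case m = 1 (so F_p and α^{−p^{km}} = α^{−1} by Fermat's little theorem), with illustrative values p = 5, s = 3, α = 2:

```python
p, m, s = 5, 1, 3
alpha = 2
q_m, r_m = divmod(s, m)                               # s = q_m*m + r_m, 0 <= r_m <= m-1
alpha0 = pow(alpha, -(p ** ((q_m + 1) * m - s)), p)   # alpha0 = alpha^(-p^(m - r_m))
assert pow(alpha0, p ** s, p) == pow(alpha, -1, p)    # alpha0^(p^s) = alpha^(-1)
print(alpha0)
```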
With the development of high-density data storage technologies, symbol-pair codes were
proposed to protect efficiently against a certain number of pair-errors. Let Ξ be the code
alphabet consisting of q elements. Then each element c ∈ Ξ is called a symbol. In symbol-pair
read channels, a codeword (c_0, c_1, ..., c_{n−1}) is read as ((c_0, c_1), (c_1, c_2), ..., (c_{n−1}, c_0)).
Suppose that x = (x_0, x_1, ..., x_{n−1}) is a vector in Ξ^n. The symbol-pair vector of x is defined
as π(x) = ((x_0, x_1), (x_1, x_2), ..., (x_{n−1}, x_0)). Hence, each vector has a unique symbol-pair
vector π(x) ∈ (Ξ, Ξ)^n. An important parameter of symbol-pair codes is the symbol-pair
distance. In 2010, Cassuto and Blaum [365] gave the definition of the symbol-pair distance
as the Hamming distance over the alphabet (Ξ, Ξ). Given x = (x_0, x_1, ..., x_{n−1}) and y =
(y_0, y_1, ..., y_{n−1}), the symbol-pair distance between x and y is defined as
d_sp(x, y) = dH(π(x), π(y)) = |{i | (x_i, x_{i+1}) ≠ (y_i, y_{i+1})}|,
with indices taken modulo n. The symbol-pair weight of a vector x is defined as the Hamming weight of its symbol-pair
vector π(x):
wt_sp(x) = wtH(π(x)) = |{i | (x_i, x_{i+1}) ≠ (0, 0), 0 ≤ i ≤ n − 1, x_n = x_0}|.
If the code C is linear, its symbol-pair distance equals the minimum symbol-pair weight of the
nonzero codewords of C, namely d_sp(C) = min{wt_sp(x) | x ∈ C, x ≠ 0}.
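These definitions translate directly into code (a small sketch of ours, with an illustrative vector):

```python
def pi(x):
    # symbol-pair vector ((x0,x1), (x1,x2), ..., (x_{n-1}, x0))
    n = len(x)
    return tuple((x[i], x[(i + 1) % n]) for i in range(n))

def wt_sp(x):
    return sum(pair != (0, 0) for pair in pi(x))

def d_sp(x, y):
    return sum(a != b for a, b in zip(pi(x), pi(y)))

x = (1, 0, 0, 2, 0)
print(pi(x), wt_sp(x))   # four of the five pairs are nonzero, so wt_sp(x) = 4
```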
In [365], Cassuto and Blaum studied the model of symbol-pair read channels. They
provided constructions and decoding methods for symbol-pair codes. Then in [366] they considered
lower and upper bounds on such codes; moreover, they gave bounds and asymptotics
on the size of optimal symbol-pair codes. They also established the relationship between
the symbol-pair distance and the Hamming distance in [366, Theorem 2]. In 2011, by using
algebraic methods, Cassuto and Litsyn [367] constructed cyclic symbol-pair codes. Applying
the Discrete Fourier Transform to the coefficients of codeword polynomials c(x) ∈ F_q[x]/⟨x^n − 1⟩
and the BCH Bound, Cassuto and Litsyn proved that the symbol-pair distance of a cyclic
code C with Hamming distance dH(C) is at least dH(C) + 2 [367, Theorem 10]. This implies
that lower bounds on the symbol-pair distances of cyclic codes can be computed by
determining the Hamming distances of the cyclic codes. In addition, if g(x) is a generator
polynomial of a cyclic code C of prime length n, g(x) has at least m roots in F_{q^t}, and
dH(C) ≤ min{2m − n + 2, m − 1}, then the symbol-pair distance of the
cyclic code C is at least dH(C) + 3 [367, Theorem 11]. Cassuto and Litsyn also constructed
symbol-pair codes from cyclic codes and showed that there exist symbol-pair codes with
rates strictly higher than those of codes in the Hamming metric with the same relative distance.
More recently, Yaakobi et al. [1917] gave a lower bound on the symbol-pair
distance for binary cyclic codes as follows: for a given linear cyclic code C with Hamming
distance dH(C), the symbol-pair distance is at least dH(C) + ⌈dH(C)/2⌉ [1917, Theorem
4]. This shows that the lower bound on the symbol-pair distance proved by Yaakobi et
al. [1917, Theorem 4] is better than the result of Cassuto and Litsyn [367, Theorem 10]
for binary cyclic codes. However, the algorithms introduced by Cassuto et al. and Yaakobi
et al. for decoding symbol-pair codes could not correct all pair-errors within the pair-error-correcting
capability. By giving the definition of a parity check matrix for symbol-pair
codes, Hirotomo et al. [956] proposed a new syndrome decoding algorithm for symbol-pair
codes that improved on the algorithms of Cassuto et al. [367] and Yaakobi et al. [1917] for
decoding symbol-pair codes. In particular, in 2015, Kai et al. [1071] extended the result of
Cassuto and Litsyn [367, Theorem 10] to the case of simple-root constacyclic codes.
Recently, the symbol-pair distance distribution of all constacyclic codes of length p^s over
F_{p^m} was established by Dinh et al. in [587].
Theorem 17.6.3 ([587]) Given a λ-constacyclic code of length p^s over F_{p^m}, it has
the form C_i = ⟨(x − λ_0)^i⟩ ⊆ F_{p^m}[x]/⟨x^{p^s} − λ⟩ for i ∈ {0, 1, ..., p^s}, where λ_0 ∈ F_{p^m} is such that λ_0^{p^s} = λ
(such a λ_0 exists and can be determined as in Theorem 17.6.1). The symbol-pair distance
d_sp(C_i) of C_i is completely determined as follows:
d_sp(C_i) =
  2 if i = 0,
  3p^k if i = p^s − p^{s−k} + 1 for 0 ≤ k ≤ s − 2,
  4p^k if p^s − p^{s−k} + 2 ≤ i ≤ p^s − p^{s−k} + p^{s−k−1}
    for 0 ≤ k ≤ s − 2,
  2(δ + 2)p^k if p^s − p^{s−k} + δp^{s−k−1} + 1 ≤ i ≤ p^s − p^{s−k} + (δ + 1)p^{s−k−1}
    for 0 ≤ k ≤ s − 2 and 1 ≤ δ ≤ p − 2,
  (δ + 2)p^{s−1} if i = p^s − p + δ for 0 ≤ δ ≤ p − 2,
  p^s if i = p^s − 1,
  0 if i = p^s.
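In the smallest binary case p = 2, m = 1, s = 2 (cyclic codes ⟨(x − 1)^i⟩ of length 4 over F2), the theorem predicts d_sp = 2, 3, 4, 4, 0 for i = 0, ..., 4; a brute-force computation (our own illustrative check) agrees:

```python
from itertools import product

p, s = 2, 2
n = p ** s          # lambda = 1 over F_2: cyclic codes C_i = <(x-1)^i> of length 4

def mulmod(f, g):
    out = [0] * n
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            out[(i + j) % n] = (out[(i + j) % n] + fi * gj) % p
    return tuple(out)

def wt_sp(w):
    return sum((w[i], w[(i + 1) % n]) != (0, 0) for i in range(n))

def dsp_bruteforce(i):
    gen = (1,) + (0,) * (n - 1)
    xm1 = ((-1) % p, 1) + (0,) * (n - 2)
    for _ in range(i):
        gen = mulmod(gen, xm1)
    words = [w for w in {mulmod(f, gen) for f in product(range(p), repeat=n)} if any(w)]
    return min(map(wt_sp, words)) if words else 0

print([dsp_bruteforce(i) for i in range(n + 1)])   # [2, 3, 4, 4, 0]
```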
Using the Hamming and symbol-pair distances of constacyclic codes of length p^s over
F_{p^m} (see Theorems 17.5.5 and 17.6.3), Dinh et al. [586] obtained the symbol-pair distances
of all such codes over R.
Theorem 17.6.4 ([586]) Let the notations and λ-constacyclic codes C of length p^s over
R be as listed in Theorem 17.6.1.
(a) If λ = α + uβ, then
d_sp(C) =
  2 if 0 ≤ i ≤ p^s,
  3p^k if i = 2p^s − p^{s−k} + 1 for 0 ≤ k ≤ s − 2,
  4p^k if 2p^s − p^{s−k} + 2 ≤ i ≤ 2p^s − p^{s−k} + p^{s−k−1}
    for 0 ≤ k ≤ s − 2,
  2(δ + 2)p^k if 2p^s − p^{s−k} + δp^{s−k−1} + 1 ≤ i ≤ 2p^s − p^{s−k} + (δ + 1)p^{s−k−1}
    for 0 ≤ k ≤ s − 2 and 1 ≤ δ ≤ p − 2,
  (δ + 2)p^{s−1} if i = 2p^s − p + δ for 0 ≤ δ ≤ p − 2,
  p^s if i = 2p^s − 1,
  0 if i = 2p^s.
which contains p^{2mi} codewords. Moreover, the ideal ⟨u⟩ of ℜ(2p^s, α + uβ) is the unique self-dual
(α + uβ)-constacyclic code of length 2p^s over R.
(c) If λ is not a square in R and λ = γ ∈ F*_{p^m}, then the γ-constacyclic codes are classified
by categorizing the ideals of the local ring ℜ(2p^s, γ) = R[x]/⟨x^{2p^s} − γ⟩ into 4 distinct types,
where γ_0^{p^s} = γ, as follows:
Type 1: (trivial ideals) ⟨0⟩, ⟨1⟩,
Type 2: (principal ideals with nonmonic polynomial generators)
⟨u(x^2 − γ_0)^i⟩ where 0 ≤ i ≤ p^s − 1,
Type 3: (principal ideals with monic polynomial generators)
⟨(x^2 − γ_0)^i + u(x^2 − γ_0)^t h(x)⟩,
Type 4: (nonprincipal ideals)
⟨(x^2 − γ_0)^i + u(x^2 − γ_0)^t h(x), u(x^2 − γ_0)^κ⟩.
Furthermore, the number of distinct γ-constacyclic codes of length 2p^s over R, i.e.,
distinct ideals of the ring ℜ(2p^s, γ), is
[2(p^{2m} + 1)p^{m(p^s−1)} − 2p^{4m} − 2]/(p^{2m} − 1)^2 + [(2p^{2m} + 3)p^{m(p^s−1)} − 2p^s − 1]/(p^{2m} − 1)
+ p^{m(p^s−1)} + 2.
When γ ∈ F*_{p^m}, it is easy to see that for any γ-constacyclic code C of length 2p^s over
R, its residue code Res(C) and torsion code Tor(C) are γ-constacyclic codes of length 2p^s
over F_{p^m}. By [581], each γ-constacyclic code of length 2p^s over F_{p^m} is an ideal of the form
⟨(x^2 − γ_0)^i⟩ of the finite chain ring F_{p^m}[x]/⟨x^{2p^s} − γ⟩, where 0 ≤ i ≤ p^s, and each code ⟨(x^2 − γ_0)^i⟩
contains p^{2m(p^s−i)} codewords. Therefore, we can determine the size of all γ-constacyclic
codes of length 2p^s over R in Theorem 17.6.5(c) by multiplying the sizes of Res(C) and
Tor(C) in each case.
Theorem 17.6.6 ([390, Section 3]) Let γ ∈ F*_{p^m} and C be a γ-constacyclic code of length
2p^s over R as in Theorem 17.6.5(c). Then the number of codewords of C, denoted by n_C, is
determined as follows.
• If C = ⟨0⟩, then n_C = 1.
• If C = ⟨1⟩, then n_C = p^{4mp^s}.
• If C = ⟨u(x^2 − γ_0)^i⟩ where 0 ≤ i ≤ p^s − 1, then n_C = p^{2m(p^s−i)}.
• If C = ⟨(x^2 − γ_0)^i⟩ where 1 ≤ i ≤ p^s − 1, then n_C = p^{4m(p^s−i)}.
• If C = ⟨(x^2 − γ_0)^i + u(x^2 − γ_0)^t h(x)⟩ where 1 ≤ i ≤ p^s − 1, 0 ≤ t < i, and h(x) is a
unit, then
n_C = p^{4m(p^s−i)} if 1 ≤ i ≤ p^{s−1} + t/2,
n_C = p^{2m(p^s−t)} if p^{s−1} + t/2 < i ≤ p^s − 1.
• If h(x) is a unit and (p^s + t)/2 < i ≤ p^s − 1, then ann(C)* = ⟨b(x), u(x^2 − γ_0^{−1})^{p^s−i}⟩,
where
b(x) = (−γ_0)^{i−t}(x^2 − γ_0^{−1})^{i−t} − u Σ_{j=0}^{p^s−i−1} (b_j x + a_j)(−γ_0)^j (x^2 − γ_0^{−1})^j x^{2i−2t−2j−1},
and
c(x) = (−γ_0)^{i−t}(x^2 − γ_0^{−1})^{p^s−ω} − u(x^2 − γ_0^{−1})^{p^s−i−ω+t} × Σ_{j=0}^{ω−t−1} (b_j x + a_j)(−γ_0)^j (x^2 − γ_0^{−1})^j x^{2i−2t−2j−1}.
codes are ⟨(x^4 − α_0)^i⟩ for 0 ≤ i ≤ 2p^s. For the remaining case, where the unit λ = γ is not
a square for a nonzero element γ of F_{p^m}, the ambient ring (F_{p^m} + uF_{p^m})[x]/⟨x^{4p^s} − γ⟩ is a local ring
with unique maximal ideal ⟨x^4 − γ_0, u⟩. Such λ-constacyclic codes are then classified into 4
distinct types of ideals, and the detailed structures of the ideals in each type are provided.
The key result in [582] was that, when the unit λ is not a square in R, any nonzero
polynomial of degree < 4 over F_{p^m} is invertible in the ambient ring R[x]/⟨x^{4p^s} − λ⟩ of constacyclic
codes of length 4p^s; see [582, Propositions 4.1 and 5.1]. This fact was then used to obtain
the algebraic structure of all constacyclic codes of length 4p^s over F_{p^m} + uF_{p^m} and their
duals. However, [582, Section 6] explained that this fact is no longer true in the case p^m ≡ 3
(mod 4).
In 2018, Dinh et al. [589, 590] completed the case p^m ≡ 3 (mod 4) by showing that, if
the unit λ is not a square, then x^4 − λ_0 can be decomposed into a product of two irreducible
coprime quadratic polynomials, namely x^2 + γx + γ^2/2 and x^2 − γx + γ^2/2, where λ_0^{p^s} = λ
and γ^4 = −4λ_0. The quotient rings R[x]/⟨(x^2 + γx + γ^2/2)^{p^s}⟩ and R[x]/⟨(x^2 − γx + γ^2/2)^{p^s}⟩ are local non-chain
rings, and their ideals, which are completely classified, provide the structure of all
λ-constacyclic codes and their duals.
So a λ-constacyclic code of length 4p^s over R, i.e., an ideal C of ℜ(4p^s, λ), is represented
as a direct sum C = C_+ ⊕ C_−, where C_+ and C_− are ideals of ℜ(2p^s, −δ)
and ℜ(2p^s, δ), respectively. Thus, the classification, detailed structure, and number of
codewords of constacyclic codes C of length 4p^s over R can be obtained from those of
the direct summands C_+ and C_− in Theorem 17.6.5. Also C⊥ = C_+⊥ ⊕ C_−⊥.
(b) λ = α + uβ, with α, β ∈ F*_{p^m}, is not a square in R, where p^m ≡ 1 (mod 4). Let
α_0^{p^s} = α. Then the ambient ring
ℜ(4p^s, α + uβ) = R[x]/⟨x^{4p^s} − (α + uβ)⟩
Type 2: (principal ideals with nonmonic polynomial generators)
⟨u(x^4 − γ_0)^i⟩ where 0 ≤ i ≤ p^s − 1,
Type 3: (principal ideals with monic polynomial generators)
⟨(x^4 − γ_0)^i + u(x^4 − γ_0)^t h(x)⟩
(d) λ = α + uβ, with α, β ∈ F*_{p^m}, is not a square in R, where p^m ≡ 3 (mod 4). Let α_0^{p^s} = α
and α_0 = −4η^4. Then x^4 − α_0 = (x^2 + 2ηx + 2η^2)(x^2 − 2ηx + 2η^2) is a factorization
of x^4 − α_0 into monic coprime irreducible factors. The ambient ring
ℜ(4p^s, α + uβ) = R[x]/⟨x^{4p^s} − (α + uβ)⟩
is a principal ideal ring with ideals ⟨(x^2 + 2ηx + 2η^2)^i (x^2 − 2ηx + 2η^2)^j⟩ where 0 ≤
i, j ≤ 2p^s. Equivalently, each (α + uβ)-constacyclic code of length 4p^s over R has the
form
C = ⟨(x^2 + 2ηx + 2η^2)^i (x^2 − 2ηx + 2η^2)^j⟩
for 0 ≤ i, j ≤ 2p^s, which contains p^{m(8p^s−2i−2j)} codewords. Its dual C⊥ is the (α^{−1} −
uα^{−2}β)-constacyclic code
C⊥ = ⟨(x^2 + η^{−1}x + η^{−2}/2)^{2p^s−i} (x^2 − η^{−1}x + η^{−2}/2)^{2p^s−j}⟩
where 0 ≤ i ≤ p^s − 1,
Type 3: (principal ideals with monic polynomial generators)
⟨(x^2 + δγx + γ^2/2)^i + u(x^2 + δγx + γ^2/2)^t h(x)⟩,
where T is the smallest integer such that
u(x^2 + δγx + γ^2/2)^T ∈ ⟨(x^2 + δγx + γ^2/2)^i + u(x^2 + δγx + γ^2/2)^t h(x)⟩;
such T can be determined as
T = i if h(x) = 0,   T = min{i, p^s − i + t} if h(x) ≠ 0.
Theorem 17.6.9 ([589]) Let γ ∈ F*_{p^m}, and let C be a γ-constacyclic code of length 4p^s over R as in Theorem 17.6.8(c). Then the number of codewords of C, denoted n_C, is determined as follows.
• If C = ⟨0⟩, then n_C = 1.
• If C = ⟨1⟩, then n_C = p^{8mp^s}.
• If C = ⟨u(x^4 − γ_0)^i⟩ where 0 ≤ i ≤ p^s − 1, then n_C = p^{4m(p^s − i)}.
• If C = ⟨(x^4 − γ_0)^i⟩ where 1 ≤ i ≤ p^s − 1, then n_C = p^{8m(p^s − i)}.
• If C = ⟨(x^4 − γ_0)^i + u(x^4 − γ_0)^t h(x)⟩ where 1 ≤ i ≤ p^s − 1, 0 ≤ t < i, and h(x) is a unit, then n_C = p^{8m(p^s − i)} if 1 ≤ i ≤ p^{s−1} + t/2, and n_C = p^{4m(p^s − i)} if p^{s−1} + t/2 < i ≤ p^s − 1.
that deg(v_j(x)) < deg(f_j(x)) = d_j and v_j(x)F_j(x) + w_j(x)f_j(x) = 1. This implies

    v_j(x)^{p^s} F_j(x)^{p^s} + w_j(x)^{p^s} f_j(x)^{p^s} = (v_j(x)F_j(x) + w_j(x)f_j(x))^{p^s} = 1.

Denote
• ℜ(np^s, λ) = R[x]/⟨x^{np^s} − λ⟩;
• A = F_{p^m}[x]/⟨x^{np^s} − λ⟩ and A[u]/⟨u^2⟩ = A + uA (u^2 = 0);
• K_j = F_{p^m}[x]/⟨f_j(x)^{p^s}⟩ and K_j[u]/⟨u^2⟩ = K_j + uK_j (u^2 = 0);
• ε_j(x) ∈ A defined by

    ε_j(x) = v_j(x)^{p^s} F_j(x)^{p^s} = 1 − w_j(x)^{p^s} f_j(x)^{p^s} (mod x^{np^s} − λ).
Theorem 17.6.10 ([341, Theorem 2.5]) Let C ⊆ ℜ(np^s, λ). Then the following statements are equivalent.
(a) C is a λ-constacyclic code over R of length np^s, i.e., an ideal of ℜ(np^s, λ).
(b) C is an ideal of A + uA.
Constacyclic Codes over Finite Commutative Chain Rings 419
where b(x) ∈ f_j(x)^{⌈(p^s−k)/2⌉−1} (K_j/⟨f_j(x)^{p^s−k−1}⟩) and 1 ≤ k ≤ p^s − 1.

Case III: p^s + 1 ideals: C_j = ⟨f_j(x)^k⟩ with |C_j| = p^{2md_j(p^s−k)}, 0 ≤ k ≤ p^s.

where b(x) ∈ f_j(x)^{⌈t/2⌉−1} (K_j/⟨f_j(x)^{t−1}⟩), 1 ≤ t ≤ p^s − 1.

Case V: Σ_{k=1}^{p^s−2} Σ_{t=1}^{p^s−k−1} p^{(t−⌈t/2⌉)md_j} ideals:

    C_j = ⟨f_j(x)^{k+1} b(x) + uf_j(x)^k, f_j(x)^{k+t}⟩ with |C_j| = p^{md_j(2p^s−2k−t)},

where b(x) ∈ f_j(x)^{⌈t/2⌉−1} (K_j/⟨f_j(x)^{t−1}⟩), 1 ≤ t ≤ p^s − k − 1 and 1 ≤ k ≤ p^s − 2.
Moreover, let N_{(p^m, d_j, p^s)} be the number of ideals in K_j + uK_j. Then

    N_{(p^m, d_j, p^s)} = Σ_{i=0}^{2^{s−1}} (1 + 4i) 2^{(2^{s−1} − i)md_j}  if p = 2, and
    N_{(p^m, d_j, p^s)} = Σ_{i=0}^{(p^s−1)/2} (3 + 4i) p^{((p^s−1)/2 − i)md_j}  if p is odd.
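The two finite sums above are easy to evaluate; the following sketch (the helper name `num_ideals` and the reading of the reconstructed sums are assumptions, not from the source) counts the ideals of K_j + uK_j for given p, m, d_j, s:

```python
def num_ideals(p, m, dj, s):
    """Number of ideals N_{(p^m, d_j, p^s)} of K_j + uK_j, following the
    displayed sums: sum over i of (1+4i)*2^((2^(s-1)-i)*m*dj) when p = 2,
    and of (3+4i)*p^(((p^s-1)/2 - i)*m*dj) when p is odd."""
    if p == 2:
        top = 2 ** (s - 1)
        return sum((1 + 4 * i) * 2 ** ((top - i) * m * dj) for i in range(top + 1))
    top = (p ** s - 1) // 2
    return sum((3 + 4 * i) * p ** ((top - i) * m * dj) for i in range(top + 1))
```

For example, with p = 2, m = d_j = s = 1 the sum has two terms, 1·2^1 + 5·2^0 = 7.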
For the rest of this subsection, we adopt the following notation and definitions. Let λ, λ_0 ∈ F*_{p^m} satisfy λ_0^{p^s} = λ. Recall that f*(x) = x^{deg(f)} f(x^{−1}) is the reciprocal polynomial of f(x) ∈ F_{p^m}[x]; see Section 17.3. Note that (fg)* = f*g*. Denote
• Â = F_{p^m}[x]/⟨x^{np^s} − λ^{−1}⟩ and Â[u]/⟨u^2⟩ = Â + uÂ (u^2 = 0);
• K̂_j = F_{p^m}[x]/⟨f_j*(x)^{p^s}⟩ and K̂_j[u]/⟨u^2⟩ = K̂_j + uK̂_j (u^2 = 0);
j
for all a(x) ∈ A. Then one can easily verify that τ is a ring isomorphism from A onto Ab
and can be extended to a ring isomorphism from A + uA onto Ab + uAb in the natural way,
namely
τ : ρ(x) 7→ ρ(x−1 ) = a(x−1 ) + ub(x−1 )
for all ρ(x) = a(x) + ub(x) with a(x), b(x) ∈ A.
Clearly, for 1 ≤ j ≤ r, f_j*(x)^{p^s} is a divisor of x^{np^s} − λ^{−1} in F_{p^m}[x]. This implies x^{np^s} ≡ λ^{−1} (mod f_j*(x)^{p^s}). Hence x^{−1} = λx^{np^s−1} in the rings K̂_j and K̂_j + uK̂_j as well. Moreover, we define
• ε̂_j(x) = τ(ε_j(x)) = ε_j(x^{−1}) = ε_j(λx^{np^s−1}) (mod x^{np^s} − λ^{−1});
• Φ̂ : (K̂_1 + uK̂_1) × ··· × (K̂_r + uK̂_r) → Â + uÂ via

    Φ̂ : (ξ_1 + uη_1, …, ξ_r + uη_r) ↦ Σ_{j=1}^r ε̂_j(x)(ξ_j + uη_j) (mod x^{np^s} − λ^{−1})

for all ξ_j, η_j ∈ K̂_j, j = 1, …, r;
• τ_j : K_j + uK_j → K̂_j + uK̂_j via

    τ_j : ξ ↦ a(x^{−1}) + ub(x^{−1}) = a(λx^{np^s−1}) + ub(λx^{np^s−1}) (mod f_j*(x)^{p^s}),
If τ(a(x))b(x) = 0 in Â + uÂ, then a · b = Σ_{i=0}^{np^s−1} a_i b_i = 0.
The dual of each λ-constacyclic code of length nps over R can be obtained from its
canonical form decomposition.
Theorem 17.6.13 ([341, Theorem 4.4]) Let C be a λ-constacyclic code of length np^s over R with canonical form decomposition C = ⊕_{j=1}^r ε_j(x)C_j (mod x^{np^s} − λ), where C_j is an ideal of K_j + uK_j given in Theorem 17.6.11. Then the dual code C^⊥ of C is a λ^{−1}-constacyclic code, where D̂_j is an ideal of K̂_j + uK̂_j given by one of the following five cases:

    b(x) ∈ f_j(x)^{⌈p^s/2⌉−1} (K_j/⟨f_j(x)^{p^s−1}⟩).
In the rest of this subsection we determine all self-dual cyclic codes and negacyclic codes of length np^s over R. Let ν ∈ {1, −1} and ℜ(np^s, ν) = R[x]/⟨x^{np^s} − ν⟩. Then C is a cyclic, respectively negacyclic, code if and only if C is an ideal of the ring ℜ(np^s, ν), i.e., a ν-constacyclic code over R of length np^s, where ν = 1, respectively ν = −1.
Using the notation above, as ν^{−1} = ν, we have ℜ(np^s, ν) = ℜ(np^s, ν^{−1}) and A = Â = F_{p^m}[x]/⟨x^{np^s} − ν⟩. Hence the map τ : A → A defined by τ(a(x)) = a(x^{−1}) = a(νx^{np^s−1}) (mod x^{np^s} − ν) for all a(x) ∈ A is a ring automorphism on A satisfying τ^{−1} = τ.
We then have

    x^{np^s} − ν = f_1(x)^{p^s} ··· f_r(x)^{p^s} and
    x^{np^s} − ν = −ν(x^{np^s} − ν)* = −νf_1*(x)^{p^s} ··· f_r*(x)^{p^s}.
Since f_1(x), …, f_r(x) are pairwise coprime monic irreducible polynomials in F_{p^m}[x], for each 1 ≤ j ≤ r there is a unique integer j′, 1 ≤ j′ ≤ r, such that f_j*(x) = δ_j f_{j′}(x) where δ_j = f_j(0)^{−1} ∈ F*_{p^m}. In the following, we still denote the bijection j ↦ j′ on the set {1, …, r} by τ, i.e.,

    f_j*(x) = δ_j f_{τ(j)}(x).
Whether τ denotes the automorphism of A or this map on {1, . . . , r} is determined by
context. The next lemma shows the compatibility of the two uses of τ .
(b) After a suitable rearrangement of f_1(x), …, f_r(x), there are nonnegative integers ρ and ϵ such that ρ + 2ϵ = r, τ(j) = j for all j = 1, …, ρ, and τ(ρ + i) = ρ + ϵ + i and τ(ρ + ϵ + i) = ρ + i for all i = 1, …, ϵ.
(c) For any 1 ≤ j ≤ r, τ (εj (x)) = εbj (x) = εj (x−1 ) = ετ (j) (x) in the ring A.
(d) For any 1 ≤ j ≤ r, the map τj : Kj + uKj → Kτ (j) + uKτ (j) defined by
for all a(x), b(x) ∈ Kj , is a ring isomorphism from Kj + uKj onto Kτ (j) + uKτ (j) with
inverse τj−1 = ττ (j) . Moreover, for any ξ ∈ Kj + uKj we have
By Theorem 17.6.13 and Lemma 17.6.14, we have the following decomposition for dual codes of cyclic and negacyclic codes over R of length np^s.

Corollary 17.6.15 ([341, Corollary]) Let ν ∈ {1, −1} and let C be a ν-constacyclic code over R of length np^s with canonical form decomposition C = ⊕_{j=1}^r ε_j(x)C_j (mod x^{np^s} − ν), where C_j is an ideal of K_j + uK_j given in Theorem 17.6.11. Then the dual code C^⊥ of C is a ν-constacyclic code over R of length np^s with canonical form decomposition

    C^⊥ = ⊕_{j=1}^r ε_{τ(j)}(x)D_{τ(j)} (mod x^{np^s} − ν),

where D_{τ(j)} is an ideal of K_{τ(j)} + uK_{τ(j)} given by one of the following five cases:

Case I: If C_j = ⟨f_j(x)b(x) + u⟩ where b(x) ∈ f_j(x)^{⌈p^s/2⌉−1} (K_j/⟨f_j(x)^{p^s−1}⟩), then
    D_{τ(j)} = ⟨−νδ_j x^{p^s n − d_j} f_{τ(j)}(x)b(x^{−1}) + u⟩.

Case II: If C_j = ⟨f_j(x)^{k+1} b(x) + uf_j(x)^k⟩ where b(x) ∈ f_j(x)^{⌈(p^s−k)/2⌉−1} (K_j/⟨f_j(x)^{p^s−k−1}⟩) and 1 ≤ k ≤ p^s − 1, then
    D_{τ(j)} = ⟨−νδ_j x^{p^s n − d_j} f_{τ(j)}(x)b(x^{−1}) + u, f_{τ(j)}(x)^{p^s−k}⟩.

Case III: If C_j = ⟨f_j(x)^k⟩ where 0 ≤ k ≤ p^s, then D_{τ(j)} = ⟨f_{τ(j)}(x)^{p^s−k}⟩.

Case IV: If C_j = ⟨f_j(x)b(x) + u, f_j(x)^t⟩ where b(x) ∈ f_j(x)^{⌈t/2⌉−1} (K_j/⟨f_j(x)^{t−1}⟩) and 1 ≤ t ≤ p^s − 1, then
    D_{τ(j)} = ⟨−νδ_j x^{p^s n − d_j} f_{τ(j)}(x)^{p^s−t+1} b(x^{−1}) + uf_{τ(j)}(x)^{p^s−t}⟩.
Using these structures, all self-dual cyclic and negacyclic codes of length np^s over R can be determined as follows.

Theorem 17.6.16 ([341, Theorem 5.3]) Let ν ∈ {−1, 1} and x^{−1} = νx^{np^s−1}. All distinct self-dual ν-constacyclic codes of length np^s over R have decomposition

    C = ⊕_{j=1}^r ε_j(x)C_j (mod x^{np^s} − ν),
where C_j is an ideal of K_j + uK_j given by one of the following two cases.

Case II: If j = ρ + i where 1 ≤ i ≤ ϵ, the pair (C_j, C_{j+ϵ}) of ideals is given by one of the following five subcases.

II-1: C_j = ⟨f_j(x)b(x) + u⟩ and C_{j+ϵ} = ⟨−νδ_j x^{p^s n − d_j} f_{j+ϵ}(x)b(x^{−1}) + u⟩ where
    b(x) ∈ f_j(x)^{⌈p^s/2⌉−1} (K_j/⟨f_j(x)^{p^s−1}⟩).

II-2: C_j = ⟨f_j(x)^{k+1} b(x) + uf_j(x)^k⟩ and
    C_{j+ϵ} = ⟨−νδ_j x^{p^s n − d_j} f_{j+ϵ}(x)b(x^{−1}) + u, f_{j+ϵ}(x)^{p^s−k}⟩
where b(x) ∈ f_j(x)^{⌈(p^s−k)/2⌉−1} (K_j/⟨f_j(x)^{p^s−k−1}⟩) and 1 ≤ k ≤ p^s − 1.

II-3: C_j = ⟨f_j(x)^k⟩ and C_{j+ϵ} = ⟨f_{j+ϵ}(x)^{p^s−k}⟩ where 0 ≤ k ≤ p^s.

II-4: C_j = ⟨f_j(x)b(x) + u, f_j(x)^t⟩ and
    C_{j+ϵ} = ⟨−νδ_j x^{p^s n − d_j} f_{j+ϵ}(x)^{p^s−t+1} b(x^{−1}) + uf_{j+ϵ}(x)^{p^s−t}⟩
where b(x) ∈ f_j(x)^{⌈t/2⌉−1} (K_j/⟨f_j(x)^{t−1}⟩) and 1 ≤ t ≤ p^s − 1.

II-5: C_j = ⟨f_j(x)^{k+1} b(x) + uf_j(x)^k, f_j(x)^{k+t}⟩ and
    C_{j+ϵ} = ⟨−νδ_j x^{p^s n − d_j} f_{j+ϵ}(x)^{p^s−k−t+1} b(x^{−1}) + uf_{j+ϵ}(x)^{p^s−k−t}, f_{j+ϵ}(x)^{p^s−k}⟩
where b(x) ∈ f_j(x)^{⌈t/2⌉−1} (K_j/⟨f_j(x)^{t−1}⟩), 1 ≤ t ≤ p^s − k − 1, and 1 ≤ k ≤ p^s − 2.
17.7 Extensions
In this section we briefly mention a few alternative directions in which the theories
studied here have been extended.
In information theory, the traditional way to analyze noisy channels is to divide the message into information units called symbols, and writing and reading are usually assumed to act on individual symbols. However, in some of today's emerging storage technologies, symbols can only be written and read in possibly overlapping groups. Motivated by this, symbol-pair read channels were first studied by Cassuto and Blaum [365, 366], where the outputs of the read process are pairs of consecutive symbols. Their results were later generalized by Yaakobi et al. [1918] to b-symbol read channels, where each read operation returns a consecutive sequence of b symbols.
Let Ξ be an alphabet of size q, whose elements are called symbols. Suppose that x = (x_0, x_1, …, x_{n−1}) is a vector in Ξ^n. In [543], Ding et al. defined the b-symbol read vector of x as

    π_b(x) = ((x_0, …, x_{b−1}), (x_1, …, x_b), …, (x_{n−1}, x_0, …, x_{b−2})),

where b ≥ 1. For any two vectors x and y, the b-distance between x and y is defined as

    d_b(x, y) = |{0 ≤ i ≤ n − 1 | (x_i, …, x_{i+b−1}) ≠ (y_i, …, y_{i+b−1})}|,

where the subscripts are reduced modulo n. Accordingly, the b-weight of a vector x is defined as

    wt_b(x) = |{0 ≤ i ≤ n − 1 | (x_i, …, x_{i+b−1}) ≠ (0, …, 0)}|,

where the subscripts are reduced modulo n.
In light of this definition, if b = 1, the b-distance is the Hamming distance, and if b = 2, the b-distance is the symbol-pair distance. The b-distance of a code C is defined to be the minimum b-distance between distinct codewords of C. Clearly, if C is a linear b-symbol code, the minimum b-weight and the b-distance coincide; i.e., d_b(C) = min{wt_b(c) | c ∈ C, c ≠ 0}.
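The definitions above can be sketched directly; the following minimal Python sketch (function names and the integer alphabet are assumptions for illustration) computes the b-distance and b-weight with subscripts reduced modulo n:

```python
def b_distance(x, y, b):
    """b-distance: number of indices i (0 <= i < n) at which the length-b
    windows (x_i,...,x_{i+b-1}) and (y_i,...,y_{i+b-1}), subscripts mod n,
    differ.  b = 1 gives the Hamming distance, b = 2 the symbol-pair distance."""
    n = len(x)
    return sum(
        tuple(x[(i + j) % n] for j in range(b))
        != tuple(y[(i + j) % n] for j in range(b))
        for i in range(n)
    )

def b_weight(x, b):
    """b-weight: b-distance of x from the zero vector."""
    return b_distance(x, [0] * len(x), b)
```

For x = (1, 0, 0, 0), the 1-distance to the zero vector is 1, while the 2-distance is 2, since the windows starting at positions 0 and n − 1 both contain the nonzero symbol.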
Recently, Dinh et al. [599] provided the b-distance of all constacyclic codes of length p^s over F_{p^m} when 1 ≤ b ≤ ⌊p/2⌋.
Note that when b = 1 and b = 2, the b-symbol distance is the Hamming distance and the symbol-pair distance, respectively. Namely, Theorem 17.7.1 gives exactly the Hamming distance and the symbol-pair distance of all λ-constacyclic codes of length p^s over F_{p^m}, presented in Theorems 17.5.5 and 17.6.3, respectively.
Certain specific cyclic codes can be constructed as cyclic DNA codes for use in DNA computing. The idea of DNA computing, introduced by Head in [927], is to link genetic data analysis with scientific computation in order to handle computationally difficult problems. Adleman [13] initiated experimental work on DNA computing by solving an instance of an NP-complete problem with DNA molecules. Four different constraints on DNA codes are considered.
• The Hamming constraint: for all x, y ∈ C with x ≠ y, d_H(x, y) ≥ d, for some prescribed minimum distance d.
A code C is said to be a quasi-cyclic code of index l if C is closed under the cyclic shift by l symbols τ^l, i.e., if τ^l(C) = C, and C is called a λ-quasi-twisted code of index l if it is closed under the λ-twisted shift by l symbols, i.e., τ_λ^l(C) = C. Of course, when λ = 1, a λ-quasi-twisted code of index l is just a quasi-cyclic code of index l, and it becomes a λ-constacyclic code if l = 1. It is easy to see that a code of length n is λ-quasi-twisted (quasi-cyclic) of index l if and only if it is λ-quasi-twisted (quasi-cyclic) of index gcd(l, n). Therefore, without loss of generality, one need only consider λ-quasi-twisted (quasi-cyclic) codes of index l where l is a divisor of the length n.
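The λ-twisted shift by l symbols can be sketched as follows, here over the integers modulo a prime for illustration (the function name and the modular setting are assumptions, not from the text):

```python
def twisted_shift(a, l, lam, p):
    """lambda-twisted shift by l symbols over Z_p: the last l coordinates,
    each multiplied by lam, wrap around to the front.  lam = 1 gives the
    quasi-cyclic shift; l = 1 gives the lambda-constacyclic shift."""
    n = len(a)
    l %= n
    return [(lam * c) % p for c in a[n - l:]] + [c % p for c in a[:n - l]]
```

A code C is λ-quasi-twisted of index l exactly when this map sends C onto itself.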
Quasi-cyclic codes over finite fields, the topic of Chapter 7, have a rich history in and
of themselves. Connections between quasi-cyclic block codes and convolutional codes, the
subject of Chapter 10, are found in [686, 1736] and Section 7.6. Beginning in the 1990s
many new quasi-cyclic linear codes over finite fields and finite rings have been discovered;
see, for example, [79, 398, 485, 486, 862, 863, 1269, 1271, 1701].
Another variation that yields interesting results both for codes over fields and codes
over rings is when one starts with a noncommutative ambient space for codes rather than
the usual commutative setting of quotient rings of the polynomial ring F[x]. Specifically,
consider the codes that are ideals of quotient rings of the skew polynomial ring R[x; σ]
(where σ is an automorphism of the ring R). These are the skew-cyclic codes. They
have the property that if (a0 , a1 , . . . , an−1 ) is a codeword in a skew-cyclic code C, then
(σ(an−1 ), σ(a0 ), . . . , σ(an−2 )) is also a codeword in C. Of course when σ is the identity, this
produces the normal cyclic shift. This approach, introduced in [256] for skew-cyclic codes
over finite fields, was later extended to skew-cyclic and skew-constacyclic codes over finite
rings; see for example [257, 588]. Chapter 8 is devoted to the study of codes over skew
polynomial rings.
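The skew-cyclic shift described above can be sketched generically, with the automorphism σ passed in as a function; the GF(4) squaring table below (Frobenius x ↦ x^2, with 0, 1, ω, ω + 1 encoded as 0, 1, 2, 3) is an assumed encoding for illustration:

```python
def skew_cyclic_shift(a, sigma):
    """(a_0,...,a_{n-1}) -> (sigma(a_{n-1}), sigma(a_0), ..., sigma(a_{n-2}))."""
    return [sigma(x) for x in [a[-1]] + list(a[:-1])]

# Frobenius automorphism x -> x^2 on GF(4) = {0, 1, w, w+1} ~ {0, 1, 2, 3}:
# since w^2 = w + 1 and (w + 1)^2 = w, squaring swaps 2 and 3.
frobenius = {0: 0, 1: 1, 2: 3, 3: 2}.get
```

When σ is the identity, this reduces to the usual cyclic shift.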
If quotients of a multivariable polynomial ring R[x_1, …, x_n] are used as ambient spaces for codes, one gets the so-called multivariable codes. The study of multivariable codes goes back to the work of Poli in [1531, 1532], where multivariable codes over finite fields were first introduced and studied. There, ideals of R[x, y, z]/⟨t_1(x), t_2(y), t_3(z)⟩, where R is a finite field, were considered. This notion was then extended by Martínez-Moro and Rúa in [1337, 1338], where R is assumed to be a finite chain ring.
Finally, there are the notions of polycyclic codes and sequential codes, introduced in [1287] and [987], respectively. A linear code C of length n is right polycyclic if there exists an n-tuple c = (c_0, c_1, …, c_{n−1}) ∈ F^n, where F is a finite field, such that for every codeword (a_0, a_1, …, a_{n−1}) ∈ C, (0, a_0, a_1, …, a_{n−2}) + a_{n−1}(c_0, c_1, …, c_{n−1}) ∈ C. Left polycyclic is defined similarly, and C is bi-polycyclic if it is both left and right polycyclic. Polycyclicity of codes is clearly a generalization of cyclicity, as a λ-constacyclic code is right polycyclic induced by c = (λ, 0, …, 0), and left polycyclic using d = (0, …, 0, λ^{−1}). So indeed a λ-constacyclic code is bi-polycyclic.
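The defining closure condition can be sketched as a shift map over F_p (the function name and prime-field setting are assumptions for illustration); C is right polycyclic induced by c precisely when this map sends codewords to codewords:

```python
def right_polycyclic_shift(a, c, p):
    """(a_0,...,a_{n-1}) -> (0, a_0,...,a_{n-2}) + a_{n-1}*(c_0,...,c_{n-1})
    over Z_p.  With c = (lam, 0,...,0) this is the lam-constacyclic shift."""
    shifted = [0] + list(a[:-1])
    return [(s + a[-1] * ci) % p for s, ci in zip(shifted, c)]
```

For example, with c = (λ, 0, 0) the image of (a_0, a_1, a_2) is (λa_2, a_0, a_1), the λ-constacyclic shift.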
As with cyclic and constacyclic codes, polycyclic codes may be understood in terms of ideals in a quotient ring of a polynomial ring. Given c = (c_0, c_1, …, c_{n−1}) ∈ F^n, let f(x) = x^n − c(x) where c(x) = c_0 + c_1x + ··· + c_{n−1}x^{n−1}. Then the F-linear isomorphism ρ : F^n → F[x]/⟨f(x)⟩ = R_n sending a = (a_0, a_1, …, a_{n−1}) ∈ F^n to the polynomial a_0 + a_1x + ··· + a_{n−1}x^{n−1} associates the right polycyclic codes induced by c with the ideals of R_n.
Similarly, when C is a left polycyclic code, a slightly different isomorphism identifies the left polycyclic codes induced by c with ideals of the corresponding ambient ring. As before, let c = (c_0, c_1, …, c_{n−1}) ∈ F^n, but this time let c′(x) = c_0x^{n−1} + c_1x^{n−2} + ··· + c_{n−1}. Then let f′(x) = x^n − c′(x) and consider γ : F^n → F[x]/⟨f′(x)⟩ = L_n defined via γ : (a_0, a_1, …, a_{n−1}) ↦ a_0x^{n−1} + ··· + a_{n−2}x + a_{n−1}. In this setting, very much like before, one can see that γ(C) is an ideal of L_n.
Since all ideals of F[x] are principal, the same is true in F[x]/⟨f(x)⟩; thus the ambient space F[x]/⟨f(x)⟩ is a principal ideal ring. Furthermore, following the usual arguments used in the theory of cyclic codes, one easily sees that every polycyclic code C of dimension k has a monic polynomial g(x) of minimum degree n − k belonging to the code. This polynomial, called a generator polynomial of C, is a factor of f(x). Also, a generator polynomial of a code is unique up to associates in the sense that if g_1(x) ∈ F[x] has degree n − k, then g_1(x) is in the code generated by g(x) if and only if g_1(x) = ag(x) for some 0 ≠ a ∈ F.
As with cyclic codes, using the generator polynomial of a polycyclic code C, one can read-
ily construct a generator matrix for it. It turns out that this property in fact characterizes
polycyclic codes, as pointed out in [1287, Theorem 2.3].
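As with cyclic codes, the rows g(x), xg(x), …, x^{k−1}g(x), each of degree less than n, yield a generator matrix. A minimal sketch, listing coefficient vectors with the constant term first (the function name and this convention are assumptions):

```python
def polycyclic_generator_matrix(g, n):
    """k x n generator matrix of the polycyclic code with generator
    polynomial g (coefficient list, constant term first), k = n - deg(g).
    Row i holds the coefficients of x^i * g(x); since i <= k - 1, these
    products have degree < n and need no reduction modulo f(x)."""
    k = n - (len(g) - 1)
    return [[0] * i + list(g) + [0] * (n - len(g) - i) for i in range(k)]
```

For g(x) = 1 + x and n = 4 this gives the familiar staircase of shifted copies of (1, 1).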
with g_{n−k} ≠ 0. In this case ρ(C) = ⟨g_0 + g_1x + ··· + g_{n−k}x^{n−k}⟩ is an ideal of R_n = F[x]/⟨f(x)⟩. The same criterion, but requiring g_0 ≠ 0 instead of g_{n−k} ≠ 0, serves to characterize left polycyclic codes. In the latter case, γ(C) = ⟨g_{n−k} + g_{n−k−1}x + ··· + g_0x^{n−k}⟩ is an ideal of L_n = F[x]/⟨f′(x)⟩.
Theorem 17.7.3 Let C be a code of length n over the finite field F. Then
18 Weight Distribution of Trace Codes over Finite Rings

Minjia Shi
Anhui University

18.1 Introduction
Few-weight codes, codes with a small number of distinct codeword weights, were first studied for their intrinsic mathematical appeal. The study of various two-weight codes is described in Chapter 19; few-weight codes appear prominently in Chapter 20. Few-weight codes are also widely used in secret sharing schemes, as introduced in Chapter 33. Ding et al. [560, 568, 570, 572, 1935, 1946] constructed several few-weight trace codes over finite fields and studied their applications. Inspired by these articles, many scholars have further studied trace codes over certain finite rings. Using the trace function is an effective method for constructing few-weight codes: over various alphabets and with various defining sets (defining sets are introduced below), many classes of interesting few-weight codes have been constructed. Under a suitable Gray map, some of the image codes are optimal and their parameters are new.
18.2 Preliminaries
Trace codes play an important role in the theory of error-correcting codes, and many authors have considered trace codes over finite rings [1307, 1308, 1669, 1670, 1671, 1672, 1673, 1674, 1675, 1676, 1677, 1679]. These finite rings are

    R_k = F_2[u_1, u_2, …, u_k]/⟨u_i^2 = 0, u_iu_j = u_ju_i⟩, (Type I)
    R_k(p) = F_p[u_1, u_2, …, u_k]/⟨u_i^2 = 0, u_iu_j = u_ju_i⟩, and
    R(k, p, u^k = a) = F_p + uF_p + ··· + u^{k−1}F_p, u^k = a, a ∈ F_p. (Type II)
The ring Rk is also considered in Chapter 6.
Let R denote one of the three finite rings given above, and let ℛ be the ring extension of R obtained by replacing the field F_p in R by F_{p^m}. These extensions will be denoted by ℛ_k, ℛ_k(p) and ℛ(k, p, u^k = a), respectively. For q a prime power, the trace function Tr_{q^s/q}(·) from F_{q^s} down to F_q is defined by Tr_{q^s/q}(x) = Σ_{i=0}^{s−1} x^{q^i} for all x ∈ F_{q^s}. For each of the three extensions we can define a generalized trace function Tr(·) from ℛ down to R, as given in the above references. For example, for the ring R(k, p, u^k = a), the generalized trace function from ℛ(k, p, u^k = a) down to R(k, p, u^k = a) is defined by Tr(a_0 + a_1u + ··· + a_{k−1}u^{k−1}) = Tr_{p^m/p}(a_0) + Tr_{p^m/p}(a_1)u + ··· + Tr_{p^m/p}(a_{k−1})u^{k−1} for a_i ∈ F_{p^m}, i = 0, 1, …, k − 1. In the other two cases, the generalized trace function is defined similarly.
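A small sketch of the field trace for q = 2: elements of F_{2^m} are encoded as integers whose bit i is the coefficient of x^i, and Tr_{2^m/2}(x) = x + x^2 + ··· + x^{2^{m−1}} is computed by repeated squaring (this bitwise encoding and the helper names are assumptions for illustration):

```python
def gf2m_mul(a, b, mod_poly, m):
    """Multiply a, b in GF(2^m): carry-less product reduced modulo the
    irreducible polynomial mod_poly (bit i = coefficient of x^i)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> m) & 1:
            a ^= mod_poly
    return r

def trace_gf2m(x, mod_poly, m):
    """Tr_{2^m/2}(x) = x + x^2 + ... + x^(2^(m-1)), an element of F_2."""
    t, y = 0, x
    for _ in range(m):
        t ^= y
        y = gf2m_mul(y, y, mod_poly, m)
    return t
```

Over GF(4) with modulus x^2 + x + 1 (0b111), this gives Tr(0) = Tr(1) = 0 and Tr(ω) = Tr(ω + 1) = 1.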
Let L ⊆ ℛ*, where ℛ* denotes the group of units in ℛ. For r ∈ ℛ, the vector Ev(r) is obtained by applying the evaluation map to r: Ev(r) = (Tr(rx))_{x∈L}. Define the code C(m, p, L) of length |L| by

    C(m, p, L) = {Ev(r) | r ∈ ℛ} = {(Tr(rx))_{x∈L} | r ∈ ℛ},

where the subset L is called the defining set of C(m, p, L).
Theorem 18.3.1 ([1672]) For a ∈ R_1, the Lee weight distribution of the code C(m, 2, R*_1) is as follows.
(a) If a = 0, then wt_L(Ev(a)) = 0.
(b) If a ∈ M \ {0}, then wt_L(Ev(a)) = 2^{2m}.
(c) If a ∈ R*_1, then wt_L(Ev(a)) = 2^{2m} − 2^m.
Theorem 18.3.1 shows that C(m, 2, R*_1) is a two-weight code of length 2^{2m} − 2^m over R_1. By Theorem 18.3.1 together with the sizes of M and R*_1, φ_1(C(m, 2, R*_1)) is a binary two-weight linear code of length n = 2^{2m+1} − 2^{m+1} and dimension 2m with Hamming weight enumerator

    Hwe_{φ_1(C(m,2,R*_1))}(x, y) = y^n + (2^{2m} − 2^m)x^{n/2}y^{n/2} + (2^m − 1)x^{2^{2m}}y^{n−2^{2m}}.
Remark 18.3.2 These parameters are reminiscent of those of the MacDonald codes [1312, Table 6, page 54], but they are not the same. However, if we concatenate a MacDonald code of type 13 in the sense of [1312, Table 6, page 54] (k = 2m, u = m) with a length 2 repetition code (i.e., replace 0 by 00 and 1 by 11), then we obtain a code with the same parameters as φ_1(C(m, 2, R*_1)). It is in fact an equivalent code, obtained by the Gray map from a simplex code over F_2 + uF_2 = R_1. See the construction in [981, page 18, equation (31)] with K = 2m + 1, U = m + 1.
By the Griesmer Bound ([855] and Theorem 1.9.18), the parameters of an [n, k, d]_q linear code satisfy Σ_{i=0}^{k−1} ⌈d/q^i⌉ ≤ n. Theorem 18.3.1 and the Griesmer Bound yield the following result.
Theorem 18.3.3 ([1672]) For any m ≥ 2, the code φ1 (C(m, 2, R∗1 )) is optimal for a given
length and dimension.
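The Griesmer sum is easy to evaluate; a minimal sketch (the helper name `griesmer_sum` is an assumption) that can be checked against the binary codes discussed below:

```python
def griesmer_sum(k, d, q):
    """Minimum length allowed by the Griesmer Bound:
    sum_{i=0}^{k-1} ceil(d / q^i), using integer ceiling division."""
    return sum(-(-d // q**i) for i in range(k))
```

For a [24, 4, 12] binary code the bound gives 12 + 6 + 3 + 2 = 23 ≤ 24, and for a [112, 6, 56]_2 code it gives 111 ≤ 112.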
We find that φ_1(C(2, 2, R*_1)) is a [24, 4, 12] binary code. This code is optimal for its length and dimension by [845], but it is different from the Magma BKLC(F_2, 24, 4) [250], which has a different Hamming weight distribution.
We also see that φ_1(C(3, 2, R*_1)) is a [112, 6, 56]_2 binary code with Hamming weight enumerator

    Hwe_{φ_1(C(3,2,R*_1))}(x, y) = y^{112} + 56x^{56}y^{56} + 7x^{64}y^{48}.

This code is optimal for a given length and dimension, but it is different from the Magma BKLC(F_2, 112, 6) code, as Hwe_{BKLC(F_2,112,6)}(x, y) = y^{112} + 60x^{56}y^{56} + 3x^{64}y^{48}. The Magma code has a fourteen point construction in [845].
Theorem 18.3.4 ([1672]) The minimum Lee distance of C(m, 2, R∗1 )⊥ is 2, for all m ≥ 2.
We say that a vector x covers a vector y if supp(x) contains supp(y). A minimal codeword of a linear code C is a nonzero codeword x that covers no nonzero codeword other than its scalar multiples. A code in which every nonzero codeword is minimal is called a minimal code. In Chapter 33, minimal codes are discussed in relation to secret sharing schemes.
In general, determining the minimal codewords of a given linear code is a difficult task, known as the covering problem. However, there is a numerical condition on the weights of the code, derived in [67], that is easy to check.
Theorem 18.3.5 ([67]) Denote by w_0 and w_∞ the minimum and maximum nonzero Hamming weights of a linear code C over F_q. If w_0/w_∞ > (q − 1)/q, then every nonzero codeword of C is minimal; hence C is minimal.
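This sufficient condition is a one-line check on a code's nonzero weights (the function name is an assumption; integer arithmetic avoids the division):

```python
def ashikhmin_barg_minimal(nonzero_weights, q):
    """Sufficient condition of Theorem 18.3.5: if w0/winf > (q-1)/q for the
    minimum and maximum nonzero weights, the code is minimal."""
    w0, winf = min(nonzero_weights), max(nonzero_weights)
    return q * w0 > (q - 1) * winf
```

For the binary two-weight [24, 4, 12] code above, the nonzero weights are 12 and 16, and 12/16 > 1/2, so the condition holds.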
Theorem 18.3.6 ([1672]) φ1 (C(m, 2, R∗1 )) is a minimal code.
Theorem 18.3.7 ([1673]) For a ∈ R_2, the Lee weight of codewords in the code C(m, 2, R*_2) is given below.
(a) If a = 0, then wt_L(Ev(a)) = 0.
(b) If a = αuv, where α ∈ F*_{2^m}, then wt_L(Ev(a)) = 2^{4m+1}.
(c) If a ∈ M \ {αuv | α ∈ F_{2^m}}, then wt_L(Ev(a)) = 2^{4m+1} − 2^{3m+1}.
(d) If a ∈ R*_2, then wt_L(Ev(a)) = 2^{4m+1} − 2^{3m+1}.
In Theorem 18.3.7, we have constructed a binary code φ_2(C(m, 2, R*_2)) of length n = 2^{3m+2}(2^m − 1) and dimension 4m, with two nonzero weights ω_1 < ω_2 of values

    ω_1 = 2^{3m+1}(2^m − 1), ω_2 = 2^{4m+1},

and respective frequencies f_1, f_2 given by

    f_1 = 2^{4m} − 2^m, f_2 = 2^m − 1.
Weight Distribution of Trace Codes over Finite Rings 433
Remark 18.3.8 These parameters are reminiscent of those of the family SU1 in [330]. However, identifying the weights we found with those of SU1 forces ℓ = 4m + 2 and t = 3m + 2. These values lead to a different f_1. Moreover, the code dimension is only 4m < ℓ. Thus, our family is not of the form SU1.
Theorem 18.3.9 ([1673]) For any m ≥ 2, the code φ2 (C(m, 2, R∗2 )) is optimal for a given
length and dimension.
Theorem 18.3.10 ([1673]) The minimum Lee distance of C(m, 2, R∗2 )⊥ is 2, for all m ≥
2.
• L_2 ⊂ R*_k with L_2 ≅ D × F_{2^m} × ··· × F_{2^m}, with 2^k − 1 copies of F_{2^m}.

For c ∈ R_k^n define the Lee weight of c as wt_H(φ_k^3(c)), where φ_k^3 : R_k^n → F_2^{2^k n} is the distance preserving Gray map defined recursively in [638]. When k = 1, φ_1^3 is the map defined previously; i.e., φ_1^3 = φ_1. For k ≥ 2, write c = c_1 + u_kc_2 with c_1, c_2 ∈ R_{k−1}^n, and define φ_k^3(c) = (φ_{k−1}^3(c_2), φ_{k−1}^3(c_1) + φ_{k−1}^3(c_2)). Notice that φ_2^3 = φ^2 as defined earlier.
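The recursion can be sketched directly; here an element of R_k is encoded as a nested pair (c1, c2) meaning c1 + u_k·c2, with an element of F_2 encoded as an int (this nested-pair encoding is an assumption for illustration):

```python
def gray(c):
    """Recursive Gray map: phi(c1 + u_k*c2) = (phi(c2), phi(c1) + phi(c2)),
    with base case the identity on F_2.  For k = 1 this reduces to the map
    a + u*b -> (b, a + b)."""
    if isinstance(c, int):
        return [c % 2]
    c1, c2 = c
    g1, g2 = gray(c1), gray(c2)
    return g2 + [(x + y) % 2 for x, y in zip(g1, g2)]
```

For example, gray((a, b)) returns [b, a + b], matching φ_1, and gray(((1, 0), (0, 1))), encoding 1 + u_2u_1 ∈ R_2, has a Gray image of length 4.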
Let M be the maximal ideal of R_k given by

    M = { Σ_A r_A u_A | A ⊆ {1, …, k}, r_∅ = 0 }.

As before, R*_k = R_k \ M.
Theorem 18.3.12 ([1670]) For any integer k ≥ 3, denote the integer range {1, 2, …, k} by [k]. For a ∈ R_k, the weight distribution of the trace code C(m, 2, L_1) is as follows.
(a) If a = 0, then wt_L(Ev(a)) = 0.
(b) If a = αu_{[k]}, where α ∈ F*_{2^m}, then wt_L(Ev(a)) = 2^{2^k m + k − 1}.
(c) If a ∈ M \ {αu_{[k]} | α ∈ F_{2^m}}, then wt_L(Ev(a)) = (2^m − 1)2^{m(2^k − 1) + k − 1}.
(d) If a ∈ R*_k, then wt_L(Ev(a)) = (2^m − 1)2^{m(2^k − 1) + k − 1}.
Therefore, we obtain a binary two-weight code φ_k^3(C(m, 2, L_1)) of length (2^m − 1)2^{m(2^k − 1) + k} and dimension 2^k m. More details are shown in Table 18.1.
TABLE 18.1: Hamming weight distribution of φ3k (C(m, 2, L1 )); see Theorem 18.3.12
Remark 18.3.13 We compare the two-weight code φ_k^3(C(m, 2, L_1)) with the MacDonald codes in [1312]. It can be seen as a concatenation of the MacDonald code of type 13 (k = 2^k m, u = 2^k m − m) in [1312, Table 6, page 54] with the [2^k, 1, 2^k]_2 repetition code. In fact, it is an equivalent code obtained by the Gray map φ_k^3 from a simplex code over R_k. Note that the parameters in Table 18.1 include those of [1672, 1673] as special cases.
Theorem 18.3.14 ([1670]) Let m be even. If 1 ≤ N_0 < 2^{m/2} + 1, then C(m, 2, L_2) is a (|L_2|, 2^{m2^k}, d_L) linear code over R_k which has at most N_0 + 1 different nonzero Lee weights, where

    (1/N_0) 2^{m(2^k − 1) + k − 1} (2^m − (N_0 − 1)2^{m/2}) ≤ d_L ≤ (1/N_0) 2^{m(2^k − 1) + k − 1} (2^m − 1).
Theorem 18.3.15 ([1670]) Let m be even and N_0 > 2. Assume that there exists a positive integer l such that 2^l ≡ −1 (mod N_0) and 2l divides m. Set t = m/(2l). The linear code φ_k^3(C(m, 2, L_2)) is a three-weight code provided that 2^{m/2} + (−1)^t(N_0 − 1) > 0, and its weight distribution is given in Table 18.2.
TABLE 18.2: Hamming weight distribution of φ_k^3(C(m, 2, L_2)); see Theorem 18.3.15
The choice of the defining set L_2 is inspired by [945], but φ_k^3(C(m, 2, L_2)) is different from the codes constructed in [945]. As the length, minimum distance, and weight distribution of the three-weight code φ_k^3(C(m, 2, L_2)) are determined by the values of N_0, k, and m, it is difficult to compare it with other three-weight codes in full generality.
Note that when N_0 = 1, L_1 = L_2 and so C(m, 2, L_2) = C(m, 2, L_1).
(c) If a = αu, where α ∈ N, then wt_L(Ev(a)) = (p − 1)((p^{2m−1} − p^{m−1}) − p^{m−1}(ϵ(p)p^{m/2} + 1)).
(d) If a ∈ R_1(p)*, then wt_L(Ev(a)) = (p − 1)(p^{2m−1} − p^{m−1}).

    w_2 = (p − 1)(p^{2m−1} − p^{m−1}),
    w_3 = (p − 1)((p^{2m−1} − p^{m−1}) + p^{m−1}(p^{m/2} − 1)),
Theorem 18.3.19 ([1676]) Assume m is odd and p ≡ 3 (mod 4). For a ∈ R_1(p), the Lee weight of codewords of C(m, p, Q + uF_{p^m}) is the following.
(a) If a = 0, then wt_L(Ev(a)) = 0.
(b) If a = αu, where α ∈ F*_{p^m}, then wt_L(Ev(a)) = (p − 1)p^{2m−1}.

    f_1 = p^{2m} − p^m, f_2 = p^m − 1.
Theorem 18.3.20 ([1676]) If m is odd and p ≡ 3 (mod 4), then φ1 (C(m, p, Q + uFpm ))
meets the Griesmer Bound with equality.
Theorem 18.3.22 ([1676]) For m ≥ 1 odd and p ≡ 3 (mod 4), φ1 (C(m, p, Q + uFpm )) is
a minimal code.
Theorem 18.3.23 ([1308]) Let D_l be an l-subset of F*_q, where 0 < l < q. Assume that K_1 = D_l + uF_r with u^2 = 0 and r = q^m. Then C(m, q, K_1) is a linear code of length lr over F_q + uF_q, and its Lee weight distribution is listed in Table 18.3.

Theorem 18.3.24 ([1308]) With the notation of Theorem 18.3.23, φ_1(C(m, q, K_1)) is a [2q^m l, m + 1, 2q^{m−1}l(q − 1)] linear code over F_q with the weight distribution in Table 18.4. Furthermore, φ_1(C(m, q, K_1)) meets the Griesmer Bound with equality if 2l < q, and φ_1(C(m, q, K_1)) is optimal for a given length and dimension if 2l ≥ q.
Let r = q^m and s = q^k, where m > 1 and k > 1 are integers with k | m. In [1307], Luo et al. considered the following defining sets:
(i) K_3 = D_c + uF_r, where D_c = {x ∈ F_s | Tr_{s/q}(x) = c} for all c ∈ F_q;
(ii) K_4 = (F_s \ F_q) + uF_r;
(iii) K_5 = SQ_{F_s\F_q} + uF_r, where SQ_{F_s\F_q} denotes the set of all squares in F_s \ F_q;
Similar to case (ii), the Gray map φ_2 from R_2(p) to F_p^4 is also given by the analogous formula, where a, b, c, d ∈ F_p.
Theorem 18.3.26 ([1674]) Assume m ≡ 2 (mod 4). Let ϵ(p) = (−1)^{(p+1)/2}. Let Q be the set of nonzero squares in F_{p^m} and N the set of non-squares in F_{p^m}. For a ∈ R_2(p), the Lee weight of codewords of C(m, p, L_1) is given below.
(a) If a = 0, then wt_L(Ev(a)) = 0.
(b) If a = αuv, where α ∈ Q, then wt_L(Ev(a)) = 2(p − 1)(p^{4m−1} − ϵ(p)p^{(7m−2)/2}).
(c) If a = αuv, where α ∈ N, then wt_L(Ev(a)) = 2(p − 1)(p^{4m−1} + ϵ(p)p^{(7m−2)/2}).
(d) If a ∈ R_2(p) \ {αuv | α ∈ F_{p^m}}, then wt_L(Ev(a)) = 2(p − 1)(p^{4m−1} − p^{3m−1}).
We have constructed a class of p-ary linear codes of length 2p^{4m} − 2p^{3m} and dimension 4m, with three nonzero weights w_1 < w_2 < w_3 of values

    w_1 = 2(p − 1)(p^{4m−1} − p^{(7m−2)/2}),
    w_2 = 2(p − 1)(p^{4m−1} − p^{3m−1}),
    w_3 = 2(p − 1)(p^{4m−1} + p^{(7m−2)/2}),
Theorem 18.3.27 ([1674]) Assume m is odd and p ≡ 3 (mod 4). For a ∈ R_2(p), the Lee weight of codewords of C(m, p, L_1) is given below.
(a) If a = 0, then wt_L(Ev(a)) = 0.
(b) If a = αuv, where α ∈ F*_{p^m}, then wt_L(Ev(a)) = 2(p^{4m} − p^{4m−1}).
(c) If a ∈ R_2(p) \ {αuv | α ∈ F_{p^m}}, then wt_L(Ev(a)) = 2(p − 1)(p^{4m−1} − p^{3m−1}).
We obtain a class of p-ary two-weight linear [2p^{4m} − 2p^{3m}, 4m] codes with two nonzero weights w_1 < w_2, with frequencies

    f_1 = p^m − 1, f_2 = p^{4m} − p^m.
Theorem 18.3.28 ([1674]) If m is odd and p ≡ 3 (mod 4), then φ2 (C(m, p, L1 )) is opti-
mal, meeting the Griesmer Bound.
In fact, by Theorem 18.3.29, we can obtain the MacDonald codes [981] with parameters [(p^{K′} − p^{U′})/(p − 1), K′, p^{K′−1} − p^{U′−1}]. As N_2 = 1, |L_2| = np^{3m} = (N/N_1)p^{3m} = ((p^m − 1)/(p − 1))p^{3m}. Thus φ_2(C(m, p, L_2)) is a p-ary code with parameters [(4p^{4m} − 4p^{3m})/(p − 1), 4m, 4p^{4m−1} − 4p^{3m−1}]; these parameters are a 4-fold of those of the MacDonald codes with K′ = 4m and U′ = 3m. Also φ_2(C(m, p, L_3)) is a p-ary code with parameters [4p^{4m} − 4p^{3m}, 4m, 4(p − 1)(p^{4m−1} − p^{3m−1})]; these parameters are a 4(p − 1)-fold of those of the MacDonald codes with K′ = 4m and U′ = 3m. These trace codes can be described geometrically as analogs of the simplex codes over F_p + uF_p + vF_p + uvF_p [981].
By Theorem 18.3.29, φ_2(C(m, p, L_2)) is a two-weight linear code over F_p of length 4(p^m − 1)p^{3m}/(p − 1) and dimension 4m. The Hamming weight distribution of φ_2(C(m, p, L_2)) is given in Table 18.7.
TABLE 18.7: Hamming weight distribution of φ2 (C(m, p, L2 )); see Theorem 18.3.29
TABLE 18.8: Hamming weight distribution of φ2 (C(m, p, L3 )); see Theorem 18.3.29
Comparing with the parameters in [1674], it is not hard to see that the corresponding dimensions and frequencies are the same. However, the corresponding lengths and weights are different; for example, the weights are double those in [1674, Theorem 5].
Next, we consider the case N2 > 1.
Theorem 18.3.30 ([1675]) Let m be even, or let m be odd and p ≡ 3 (mod 4). If 1 < N_2 < p^{m/2} + 1, then C(m, p, L_2) is a (|L_2|, p^{4m}, d′_L) linear code over R_2(p) which has at most N_2 + 1 nonzero Lee weights, where

    4p^{3m−1}(p^m − (N_2 − 1)p^{m/2})/N_2 ≤ d_L(C(m, p, L_2)) ≤ 4p^{3m−1}(p^m − 1)/N_2.
If there exists a positive integer l such that p^l ≡ −1 (mod N_2), then the weight distribution of C(m, p, L_2) is given in the following result.

Theorem 18.3.31 ([1675]) Let m be even and N_2 = gcd(N, (p^m − 1)/(p − 1)) > 2 with N | (p^m − 1). Assume that there exists a positive integer l such that p^l ≡ −1 (mod N_2) and 2l divides m. Set t = m/(2l).
(a) Assuming that p, t, and (p^l + 1)/N2 are odd, the linear code C(m, p, L2) is a three-weight linear code, where N2 is even and N2 < p^{m/2} + 1. The weights of C(m, p, L2) are presented in Table 18.9.
(b) In all other cases, the linear code C(m, p, L2) is a three-weight linear code, where p^{m/2} + (−1)^t(N2 − 1) > 0. The weights of C(m, p, L2) are presented in Table 18.10.
Theorem 18.3.32 ([1675]) The minimum Lee distance of C(m, p, L1 )⊥ is 2 for all m > 1.
Theorem 18.3.33 ([1675]) The Gray image φ2 (C(m, p, L1 )) satisfies the following prop-
erties.
(a) For m > 2 with m even, φ2 (C(m, p, L1 )) is a minimal code.
(b) For m > 1 with m odd and p ≡ 3 (mod 4), φ2 (C(m, p, L1 )) is a minimal code.
Theorem 18.3.35 ([1675]) Let m > 1 and p be an odd prime. We have the following.
(a) If m is even or m is odd and p ≡ 3 (mod 4), then φ2 (C(m, p, L2 )) is a minimal code
when N2 = 1.
(b) φ2 (C(m, p, L3 )) is a minimal code.
Weight Distribution of Trace Codes over Finite Rings 441
For instance, when p = k = 2, it is easy to check that the Gray map φ4 coincides with the Gray map of [1672]. As an additional example, when p = k = 3, we write φ4(a0 + a1u + a2u^2) = (b0, b1, b2, . . . , b8). According to the above definition, we have 0 ≤ i ≤ 2, 0 ≤ j ≤ 2, and ∑_{l=1}^{k−2} p^{l−1} i a_l = p^0 i a_1 = i a_1. Then we get
b0 = a2 , b1 = a2 + a0 , b2 = a2 + 2a0 , b3 = a2 + a1 , b4 = a2 + a1 + a0
b5 = a2 + a1 + 2a0 , b6 = a2 + 2a1 , b7 = a2 + 2a1 + a0 , and b8 = a2 + 2a1 + 2a0 .
It is easy to extend the Gray map φ4 from R(k, p, u^k = 0)^n to F_p^{p^{k−1}n}, and we also know from [1680] that φ4 is injective and linear.
The homogeneous weight of an element x ∈ R(k, p, u^k = 0) is defined as follows:
wt_hom(x) = 0 if x = 0;  p^{k−1} if x = αu^{k−1} with α ∈ F_p^*;  and (p − 1)p^{k−2} otherwise.
The homogeneous weight of a codeword c = (c1, c2, . . . , cn) of R(k, p, u^k = 0)^n is defined as wt_hom(c) = ∑_{i=1}^{n} wt_hom(ci). For any x, y ∈ R(k, p, u^k = 0)^n, the homogeneous distance d_hom is given by d_hom(x, y) = wt_hom(x − y). As was observed in [1680], φ4 is a distance-preserving isometry from (R(k, p, u^k = 0)^n, d_hom) to (F_p^{p^{k−1}n}, d_H), where d_hom and d_H denote the homogeneous and Hamming distances in R(k, p, u^k = 0)^n and F_p^{p^{k−1}n}, respectively. This means if C is a linear code over R(k, p, u^k = 0) with parameters (n, p^t, d), then φ4(C) is a linear code with parameters [p^{k−1}n, t, d] over F_p. Note that when p = k = 2, the homogeneous weight is none other than the Lee weight, which was considered in [1672].
Shi et al. [1677] studied three different defining sets of C(m, p, L) over R(k, p, uk = 0).
For the first defining set, Q denotes the nonzero squares in Fpm and N the non-squares
in Fpm . For the third defining set, we need additional notation (analogous to that found
in Section 18.3.4). Let ξ be a primitive element of F_{p^m}. For N′ a positive integer such that N′ | (p^m − 1), let N1′ = lcm(N′, (p^m − 1)/(p − 1)) and N2′ = gcd(N′, (p^m − 1)/(p − 1)). Define n′ = (p^m − 1)/N1′ and D′ = {ξ^{N′(j−1)} | j = 1, 2, . . . , n′}. The three defining sets are as follows:
• L1 = Q + uFpm + · · · + uk−1 Fpm ;
• L2 = R(k, p, uk = 0)∗ ;
• L3 = D′ + uF_{p^m} + · · · + u^{k−1}F_{p^m}.
Theorem 18.4.1 ([1677]) Let m be even, p ≡ 1 (mod 4), and N1 = (pm −1)p(k−1)(m+1) /2.
For a ∈ R(k, p, uk = 0), the homogeneous weight distribution of codewords in C(m, p, L1 ) is
given below.
(a) If a = 0, then wthom (Ev(a)) = 0.
(b) If a = αu^{k−1}, where α ∈ Q, then wt_hom(Ev(a)) = ((p − 1)/p)(N1 + p^{(k−1)(m+1)}(p^{m/2} + 1)/2).
(c) If a = αu^{k−1}, where α ∈ N, then wt_hom(Ev(a)) = ((p − 1)/p)(N1 − p^{(k−1)(m+1)}(p^{m/2} + 1)/2).
(d) If a ∈ R(k, p, u^k = 0) \ {αu^{k−1} | α ∈ F_{p^m}}, then wt_hom(Ev(a)) = ((p − 1)/p)N1.
Theorem 18.4.2 ([1677]) Let m be odd, p ≡ 3 (mod 4), and N1 = (pm −1)p(k−1)(m+1) /2.
For a ∈ R(k, p, uk = 0), the homogeneous weight distribution of codewords in C(m, p, L1 ) is
given below.
(a) If a = 0, then wthom (Ev(a)) = 0.
(b) If a = αu^{k−1}, where α ∈ F*_{p^m}, then wt_hom(Ev(a)) = ((p − 1)/p)(N1 + p^{(k−1)(m+1)}/2).
(c) If a ∈ R(k, p, u^k = 0) \ {αu^{k−1} | α ∈ F_{p^m}}, then wt_hom(Ev(a)) = ((p − 1)/p)N1.
It is necessary to distinguish the case k = 2 of Theorems 18.4.1 and 18.4.2 from the case treated in [1676]. Although the ring and the defining set are the same as in [1676], the codes are not the same, because the Gray maps are different. We list their homogeneous weight distributions in Tables 18.11 and 18.12 to show the difference.
TABLES 18.11 and 18.12 (fragment): nonzero weights p^{m−1}(p^{m+1} − p^m) and p^m(p^{m+1} − p^m)/2, respectively, each with frequency p^m − 1.
According to Tables 18.11 and 18.12, it is easy to see that the corresponding dimension
and frequency are the same. However, the nonzero weights of the codes are different. Fur-
thermore, we can check that the corresponding nonzero weights and lengths have constant
ratio in both tables. For example, in Table 18.12, we have
[(p^{m+1} − p^m)(p^m − 1)/2] / [(p^m − p^{m−1})(p^m − 1)] = [p^m(p^{m+1} − p^m)/2] / [p^{m−1}(p^{m+1} − p^m)] = p/2,
and we know the length of the codes has the same proportional relationship.
Theorem 18.4.3 ([1677]) Let N2 = (pm − 1)p(k−1)(m+1) . For a ∈ R(k, p, uk = 0), the
homogeneous weight distribution of codewords in C(m, p, L2 ) is given below.
(a) If a = 0, then wthom (Ev(a)) = 0.
(b) If a = αu^{k−1}, where α ∈ F*_{p^m}, then wt_hom(Ev(a)) = ((p − 1)/p)(N2 + p^{(k−1)(m+1)}).
(c) If a ∈ R(k, p, u^k = 0) \ {αu^{k−1} | α ∈ F_{p^m}}, then wt_hom(Ev(a)) = ((p − 1)/p)N2.
Theorem 18.4.4 ([1677]) Suppose N20 = 1 and either m is even or m is odd with p ≡ 3
(mod 4). For a ∈ R(k, p, uk = 0), the homogeneous weight distribution of codewords in
C(m, p, L3 ) is given below.
(a) If a = 0, then wthom (Ev(a)) = 0.
(b) If a = αuk−1 , where α ∈ F∗pm , then wthom (Ev(a)) = pk(m+1)−2 .
Theorem 18.4.6 ([1677]) Let m be even and N2′ > 2. Assume there exists a positive integer k′ such that p^{k′} ≡ −1 (mod N2′) and 2k′ divides m. Let t = m/(2k′).
(a) If N2′ is even and p, t, and (p^{k′} + 1)/N2′ are odd, then the code C(m, p, L3) is a three-weight linear code provided that N2′ < p^{m/2} + 1, and its homogeneous weight distribution is given in Table 18.13.
(b) In all other cases, the code C(m, p, L3) is a three-weight linear code provided that p^{m/2} + (−1)^t(N2′ − 1) > 0, and its homogeneous weight distribution is given in Table 18.14.
Theorem 18.4.7 ([1677]) If m is odd and p ≡ 3 (mod 4), then the code φ4 (C(m, p, L1 ))
is optimal for a given length and dimension whenever
m ≥ max{ k, ⌊(p^{k−1} − 2k + 1)/(2(k − 1))⌋ + 1 }.
Theorem 18.4.8 ([1677]) The code φ4 (C(m, p, L2 )) is optimal for a given length and di-
mension if
m ≥ max{ k, ⌊(p^{k−1} − k)/(k − 1)⌋ + 1 }.
Theorem 18.4.9 ([1677]) Assume N2′ = 1 and either m is even, or m is odd and p ≡ 3 (mod 4).
Then the code φ4 (C(m, p, L3 )) is optimal for a given length and dimension if
m ≥ max{ k, ⌊(p^{k−1} − p(k − 1) + k − 2)/((p − 1)(k − 1))⌋ + 1 }.
(d) If a = β(1 − u), where β ∈ F∗pm , then wtL (Ev(a)) = (p − 1)(p2m−1 − pm−1 ).
(e) If a = αu + β(1 − u) ∈ R(2, p, u2 = u)∗ , where α ∈ Q, then
Theorem 18.5.2 ([1669]) Assume m is odd and p ≡ 3 (mod 4). For a ∈ R(2, p, u2 = u),
the Lee weight of codewords of C(m, p, Q + uF∗pm ) is given below.
(a) If a = 0, then wtL (Ev(a)) = 0.
(b) If a = βu with β ∈ F∗pm , then wtL (Ev(a)) = (p − 1)(p2m−1 − pm−1 ).
(c) If a = β(1 − u) with β ∈ F∗pm , then wtL (Ev(a)) = (p − 1)(p2m−1 − pm−1 ).
(d) If a ∈ R(2, p, u2 = u)∗ , then wtL (Ev(a)) = (p − 1)(p2m−1 − 2pm−1 ).
Thus we obtain a family of p-ary two-weight codes of parameters [p2m − 2pm + 1, 2m], with
weight distribution as given in Table 18.15. The parameters are different from those in
[330, 1672, 1673, 1676].
Theorem 18.5.3 ([1669]) For a ∈ R(2, p, u2 = u), the Lee weight distribution of code-
words of C(m, p, F∗pm + uF∗pm ) is given below.
(a) If a = 0, then wtL (Ev(a)) = 0.
(b) If a = αu with α ∈ F∗pm , then wtL (Ev(a)) = 2(p − 1)(p2m−1 − pm−1 ).
(c) If a = β(1 − u) with β ∈ F∗pm , then wtL (Ev(a)) = 2(p − 1)(p2m−1 − pm−1 ).
(d) If a ∈ R(2, p, u2 = u)∗ , then wtL (Ev(a)) = 2(p − 1)(p2m−1 − 2pm−1 ).
The code φ5(C(m, p, F*_{p^m} + uF*_{p^m})) is a two-weight linear code over F_p of length 2(p^m − 1)^2 and dimension 2m. The Hamming weight distribution is given in Table 18.16. Note that the
parameters are different from those in [330, 1307, 1308, 1312].
TABLE 18.16: Hamming weight distribution of φ5 (C(m, p, F∗pm + uF∗pm )); see Theo-
rem 18.5.3
Hamming weight Frequency
0 1
2(p − 1)(p2m−1 − 2pm−1 ) p2m − 2pm + 1
2(p − 1)(p2m−1 − pm−1 ) 2pm − 2
Theorem 18.5.4 ([1669]) Assume m ≥ 3 is odd and p ≡ 3 (mod 4). Then φ5 (C(m, p, Q+
uF∗pm )) is optimal for a given length and dimension.
18.5.2 R(3, 2, u3 = 1)
Shi et al. [1679] considered the ring R(3, 2, u3 = 1) = F2 + uF2 + u2 F2 with u3 = 1. The
Gray map φ6 from R(3, 2, u3 = 1) to F32 is defined by
φ6(a1 + a2u + a3u^2) = (a1, a2, a3).
Theorem 18.5.6 ([1679]) Suppose m is odd. For a ∈ R(3, 2, u3 = 1), the Lee weight
distribution of codewords in C(m, 2, R(3, 2, u3 = 1)∗ ) is given below.
(a) If a = 0, then wtL (Ev(a)) = 0.
(b) If a ∈ R(3, 2, u3 = 1)∗ , then wtL (Ev(a)) = 3(23m−1 − 22m−1 − 2m−1 ).
(c) If a ∈ R(3, 2, u^3 = 1) \ (R(3, 2, u^3 = 1)* ∪ {0}) and a = a1 + a2u + a3u^2, then
(i) if (a1 + a2, a1 + a3) = (0, 0) and a1 + a2 + a3 ≠ 0, then wtL(Ev(a)) = 3(2^{3m−1} − 2^{m−1});
(ii) if (a1 + a2, a1 + a3) ≠ (0, 0) and a1 + a2 + a3 = 0, then wtL(Ev(a)) = 3(2^{3m−1} − 2^{2m−1}).
Next, the optimality of their binary images is given in the following theorem.
Theorem 18.5.8 ([1679]) For any odd m > 1, the code φ6 (C(m, 2, R(3, 2, u3 = 1)∗ )) is
optimal for a given length and dimension.
18.5.3 R(3, 3, u3 = 1)
Using the same Gray map as φ6, Shi et al. [1671] considered the ring R(3, 3, u^3 = 1) = F_3 + uF_3 + u^2F_3 and its extension F_{3^m} + uF_{3^m} + u^2F_{3^m} with u^3 = 1. As usual, Q is the set of nonzero squares in F_{3^m}, and N is the set of non-squares in F_{3^m}.
Theorem 18.5.9 ([1671]) Suppose m ≡ 2 (mod 4). For a ∈ R(3, 3, u3 = 1), the Lee
weight distribution of codewords in C(m, 3, Q + uF3m + u2 F3m ) is given below.
(a) If a = 0, then wtL (Ev(a)) = 0.
(b) If a = α(u − 1)^2, where α ∈ Q, then wtL(Ev(a)) = 3^{3m} − 3^{5m/2}.
(c) If a = α(u − 1)^2, where α ∈ N, then wtL(Ev(a)) = 3^{3m} + 3^{5m/2}.
(d) If a ∈ R(3, 3, u^3 = 1) \ ⟨(u − 1)^2⟩, then wtL(Ev(a)) = 3^{3m} − 3^{2m}.
Theorem 18.5.10 ([1671]) Suppose m is odd. For a ∈ R(3, 3, u3 = 1), the Lee weight
distribution of codewords in C(m, 3, Q + uF3m + u2 F3m ) is given below.
(a) If a = 0, then wtL (Ev(a)) = 0.
(b) If a = α(u − 1)^2 with α ∈ F*_{3^m}, then wtL(Ev(a)) = 3^{3m}.
(c) If a ∈ R(3, 3, u^3 = 1) \ ⟨(u − 1)^2⟩, then wtL(Ev(a)) = 3^{3m} − 3^{2m}.
Theorem 18.5.11 ([1671]) For a ∈ R(3, 3, u3 = 1), the Lee weight distribution of code-
words in C(m, 3, R(3, 3, u3 = 1)∗ ) is given below.
(a) If a = 0, then wtL (Ev(a)) = 0.
Theorem 18.5.12 ([1671]) The codes φ6(C(m, 3, L)) of length 3|L| are optimal for the given length and dimension for the following L:
(a) m is odd and L = Q + uF3m + u2 F3m in Theorem 18.5.10;
(b) m is a positive integer and L = R(3, 3, u3 = 1)∗ in Theorem 18.5.11.
18.6 Conclusion
Inspired by trace codes over finite fields, many scholars have studied several classes of
few-weight codes over finite rings. This chapter has explored some of these codes. With the
proper definition of the Gray map, images of these trace codes under the Gray map have
led to few-weight linear codes over finite fields, many of which turn out to be optimal.
Chapter 19
Two-Weight Codes
Andries E. Brouwer
Eindhoven University of Technology
19.1 Generalities
A linear code C with length n, dimension m, and minimum distance d over the
field Fq (in short, an [n, m, d]q code) is an m-dimensional subspace of the vector space Fnq
such that any two distinct codewords (elements of C) differ in at least d coordinates. A
generator matrix for C is an m × n matrix M such that its rows span C. The dual code
C⊥ of C is the (n − m)-dimensional code consisting of the vectors orthogonal to all of C for the inner product (u, v) = ∑_i u_i v_i.
The weight wtH (c) of the codeword c is its number of nonzero coordinates. A weight of
C is the weight of some codeword in C. A two-weight code is a linear code with exactly
two nonzero weights. (Codes with few weights appear in Chapters 18 and 20.) The weight enumerator of C is the polynomial ∑_i f_i x^i where the coefficient f_i of x^i is the number of words of weight i in C. (This equals Hwe_C(x, 1) from Definition 1.15.1.)
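As a concrete illustration (our own example, not from the text), the coefficients f_i can be computed by brute force from a generator matrix; here for the ternary tetracode, a [4, 2, 3]_3 code with weight enumerator 1 + 8x^3:

```python
from itertools import product
from collections import Counter

def weight_distribution(G, q):
    """Return {i: f_i}, the number of codewords of each weight i, for the
    code over F_q (q prime) spanned by the rows of G."""
    m, n = len(G), len(G[0])
    dist = Counter()
    for msg in product(range(q), repeat=m):              # all q^m messages
        cw = [sum(u * row[j] for u, row in zip(msg, G)) % q for j in range(n)]
        dist[sum(1 for c in cw if c != 0)] += 1          # Hamming weight
    return dict(dist)

# tetracode: every nonzero codeword has weight 3
print(weight_distribution([[1, 0, 1, 1], [0, 1, 1, 2]], 3))   # {0: 1, 3: 8}
```

This brute-force approach is only feasible for small q^m, but it is a convenient sanity check for the examples in this chapter.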
19.2.1 Weights
Let Z be the (multi)set of columns of M, so that V = ⟨Z⟩. We can extend any codeword u = (u(z))_{z∈Z} ∈ C to a linear functional on V. Let X be the projective (multi)set in PV determined by the columns of M. For u ≠ 0, let Hu be the hyperplane in PV defined by u(z) = 0. The weight of the codeword u is wtH(u) = n − |X ∩ Hu|.
Therefore searching for a code with large minimum distance is the same as searching for
a projective (multi)set such that all hyperplane intersections are small. Searching for a code
with few different weights is the same as searching for a projective (multi)set such that its
hyperplane sections only have a few different sizes.
19.3 Graphs
Let Γ be a graph with vertex set S of size s, undirected, without loops or multiple
edges. For x, y ∈ S we write x = y, x ∼ y, or x ≁ y when the vertices x and y are
equal, adjacent, or nonadjacent, respectively. The adjacency matrix of Γ is the matrix
A of order s with rows and columns indexed by S, where Axy = 1 if x ∼ y and Axy = 0
otherwise. The spectrum of Γ is the spectrum of A, that is, its (multi)set of eigenvalues.
Let V∗ be the dual vector space to V, that is, the space of F_q-linear maps from V to F_q. Then the characters χ of V are of the form χ_a(x) = ζ^{tr(a(x))}, with a ∈ V∗. Now
∑_{λ∈F_q} χ_a(λx) = q if a(x) = 0, and 0 otherwise.
It follows that ∑_{d∈D} χ_a(d) = q|H_a ∩ X| − |X|, where H_a is the hyperplane {⟨x⟩ | a(x) = 0} in PV.
This can be formulated in terms of coding theory. To the set X corresponds a (projective)
linear code C of length n and dimension m. Each a ∈ V ∗ gives rise to the vector (a(x))x∈X
indexed by X, and the collection of all these vectors is the code C. A codeword a of weight
w corresponds to a hyperplane Ha that meets X in n − w points, and hence to an eigenvalue
q(n − w) − n = k − qw. The number of codewords of weight w in C equals the multiplicity
of the eigenvalue k − qw of Γ.
The eigenvalues of Γ, that is, the eigenvalues of A, are the valency k and the two solutions of x^2 + (µ − λ)x + µ − k = 0.
In the above setting, with a graph Γ on a vector space, with adjacency defined by a
projective set X as difference set, the graph Γ will be strongly regular precisely when |H ∩X|
takes only two different values for hyperplanes H, that is, when the code corresponding to
X is a two-weight code.
This 1-1-1 correspondence between projective two-weight codes, projective sets that meet
the hyperplanes in two cardinalities (these are known as 2-character sets), and strongly
regular graphs defined on a vector space by a projective difference set, is due to Delsarte
[518].
The more general case of a code C with dual C⊥ of minimum distance at least 2 corresponds to a multiset X. Brouwer and van Eupen [313] give a 1-1 correspondence between arbitrary projective codes and arbitrary two-weight codes. See Section 19.9 below.
A survey of two-weight codes was given by Calderbank and Kantor [330]. Additional
families and examples are given in [446, 447]. In [270] numerical data (such as the number
of non-isomorphic codes and the order of the automorphism group) is given for small cases.
19.3.4 Parameters
Let V be a vector space of dimension m over Fq . Let X be a subset of size n of the point
set of P V that meets hyperplanes in either m1 or m2 points, where m1 > m2 . Let f1 and
f2 be the numbers of such hyperplanes. Then f1 and f2 satisfy
f1 + f2 = (q^m − 1)/(q − 1),
f1 m1 + f2 m2 = n(q^{m−1} − 1)/(q − 1),
f1 m1(m1 − 1) + f2 m2(m2 − 1) = n(n − 1)(q^{m−2} − 1)/(q − 1)
and it follows that
where r, s are the eigenvalues of Γ other than k (with r ≥ 0 > s) and f, g their multiplicities.
Example 19.3.1 The hyperoval in PG(2, 4) (m = 3, q = 4, n = 6) gives the linear [6, 3, 4]_4 code (with weight enumerator 1 + 45x^4 + 18x^6), but also corresponds to a strongly regular graph with parameters (v, k, λ, µ) = (64, 18, 2, 6) and spectrum 18^1 2^45 (−6)^18, where exponents denote multiplicities.
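The counting equations of Section 19.3.4 reproduce this weight enumerator (a quick check of our own): with q = 4, m = 3, n = 6, m1 = 2, m2 = 0, each of the f_i hyperplanes accounts for q − 1 codewords of weight n − m_i:

```python
from fractions import Fraction

q, m, n, m1, m2 = 4, 3, 6, 2, 0
H = Fraction(q**m - 1, q - 1)               # number of hyperplanes: f1 + f2 = 21
S = Fraction(n * (q**(m - 1) - 1), q - 1)   # f1*m1 + f2*m2 = 30
f1 = (S - m2 * H) / (m1 - m2)               # 15 hyperplanes meet the hyperoval in 2 points
f2 = H - f1                                 # 6 hyperplanes miss it entirely
# (q - 1)*f1 = 45 codewords of weight 4 and (q - 1)*f2 = 18 of weight 6:
# exactly the weight enumerator 1 + 45x^4 + 18x^6 of Example 19.3.1
```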
Two-Weight Codes 453
It is not often useful, but one can also check the definition of strong regularity directly.
The graph Γ defined by the difference set X will be strongly regular with constants λ, µ if
and only if each point outside X is collinear with µ ordered pairs of points of X, while each
point p inside X is collinear with λ − (q − 2) ordered pairs of points of X \ {p}.
19.3.5 Complement
Passing from a 2-character set X to its complement corresponds to passing from the
strongly regular graph to its complement. The two-weight codes involved have a more com-
plicated relation and will look very different, with different lengths and minimum distances.
Example 19.3.2 The dual of the ternary Golay code is an [11, 5, 6]_3 code with weights 6 and 9. It corresponds to an 11-set in PG(4, 3) such that hyperplanes meet it in 5 or 2 points. Its complement is a 110-set in PG(4, 3) such that hyperplanes meet it in 35 or 38 points. It corresponds to a [110, 5, 72]_3 code with weights 72 and 75.
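The arithmetic of this example is easy to reproduce (a sanity check, not in the text):

```python
q, m = 3, 5
points = (q**m - 1) // (q - 1)          # 121 points in PG(4, 3)
hyp = (q**(m - 1) - 1) // (q - 1)       # 40 points in each hyperplane
n, m1, m2 = 11, 5, 2                    # the 11-set is met in 5 or 2 points
n_c = points - n                        # complement: 110 points
c1, c2 = hyp - m1, hyp - m2             # complement is met in 35 or 38 points
w1, w2 = n_c - c2, n_c - c1             # weights 72 and 75
```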
19.3.6 Duality
Suppose X is a subset of the point set of P V that meets hyperplanes in either m1 or
m2 points. We find a subset Y of the point set of the dual space P V ∗ consisting of the
hyperplanes that meet X in m1 points. Also Y is a 2-character set. If each point of P V
is on n1 or n2 hyperplanes in Y, with n1 > n2, then
n2 = [n(q^{m−2} − 1) − m2(q^{m−1} − 1)] / [(q − 1)(m1 − m2)]
and (m1 − m2)(n1 − n2) = q^{m−2}. It follows that the difference of the weights in a projective
two-weight code is a power of the characteristic. (This is a special case of the duality for
translation association schemes. See [519, Section 2.6] and [309, Section 2.10B].)
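Plugging the dual ternary Golay data of Example 19.3.2 into these formulas (our own check):

```python
q, m, n, m1, m2 = 3, 5, 11, 5, 2
num = n * (q**(m - 2) - 1) - m2 * (q**(m - 1) - 1)
n2, rem = divmod(num, (q - 1) * (m1 - m2))
assert rem == 0                          # the formula is integral here: n2 = 21
n1 = n2 + q**(m - 2) // (m1 - m2)        # (m1 - m2)(n1 - n2) = q^{m-2} gives n1 = 30
# n1 - n2 = 9 = 3^2 is indeed a power of the characteristic
```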
To a pair of complementary sets or graphs belongs a dual pair of complementary sets
or graphs. The valencies k, v − k − 1 of the dual graph are the multiplicities f1 , f2 of the
graph.
Let C be the two-weight code belonging to X. Then the graph belonging to Y has vertex
set C, where codewords are joined when their difference has weight w1 .
Example 19.3.3 For the [11, 5, 6]_3 code of Example 19.3.2 (with weight enumerator 1 + 132x^6 + 110x^9) the corresponding strongly regular graph has parameters (v, k, λ, µ) = (243, 22, 1, 2) and spectrum 22^1 4^132 (−5)^110. One of the two dual graphs has parameters (v, k, λ, µ) = (243, 110, 37, 60) and spectrum 110^1 2^220 (−25)^22. The corresponding two-weight code is a [55, 5, 36]_3 code with weights 36 and 45.
((q − 1)/(r − 1)) m_i + ((q/r − 1)/(r − 1))(n − m_i) now. We see that a q-ary code of dimension m, length n, and weights w_i becomes an r-ary code of dimension me, length n(q − 1)/(r − 1), and weights w_i q/r.
Conjecture 19.4.1 (Schmidt–White [1627, Conj. 4.4]; see also [798, Conj. 1.2])
Let F be a finite field of order q = pf , p prime. Suppose 1 < e | (q − 1)/(p − 1) and let D be
the subgroup of F ∗ of index e. If the Cayley graph on F with difference set D is strongly
regular, then one of the following holds:
In each of the exceptional cases the graph is strongly regular. These graphs correspond
to two-weight codes over Fp .
Since F* has a partition into cosets of D, the point set of the projective space PF is partitioned into isomorphic copies of the two-intersection set X = {⟨d⟩ | d ∈ D}. See also [568, 1565, 1848].
19.5 Cyclotomy
More generally, the difference set D can be a union of cosets of a subgroup of F ∗ , for
some finite field F. Let F = F_q where q = p^f, p is prime, and let e | (q − 1), say q = em + 1.
Let K ⊆ F∗q be the subgroup of the eth powers (so that |K| = m). Let α be a primitive
element of F_q. For J ⊆ {0, 1, . . . , e − 1} put u := |J| and D := D_J := ∪_{j∈J} α^j K = {α^{ie+j} | j ∈ J, 0 ≤ i < m}. Define a graph Γ = Γ_J with vertex set F_q and edges (x, y)
whenever y − x ∈ D. Note that Γ will be undirected if q is even or e | (q − 1)/2.
As before, the eigenvalues of Γ are the sums θ(χ) = ∑_{d∈D} χ(d) for the characters χ of F. Their explicit determination requires some theory of Gauss sums. Let us write Aχ = θ(χ)χ.
Clearly, θ(1) = mu, the valency of Γ. Now assume χ ≠ 1. Then χ = χ_g for some g, where
χ_g(α^j) = exp((2πi/p) tr(α^{j+g}))
and
∑_{i=0}^{e−1} µ_i(x) = e if µ(x) = 1, and 0 otherwise.
Hence,
θ(χ_g) = ∑_{d∈D} χ_g(d) = ∑_{j∈J} ∑_{y∈K} χ_{j+g}(y) = (1/e) ∑_{j∈J} ∑_{x∈F_q*} χ_{j+g}(x) ∑_{i=0}^{e−1} µ_i(x)
       = (1/e) ∑_{j∈J} (−1 + ∑_{i=1}^{e−1} ∑_{x≠0} χ_{j+g}(x)µ_i(x)) = (1/e) ∑_{j∈J} (−1 + ∑_{i=1}^{e−1} µ_{−i}(α^{j+g}) G_i),
where
ε = −1 if e is even and (p^l + 1)/e is odd, and ε = +1 otherwise.
Theorem 19.5.2 ([141, 314]) Let q = p^f, p prime, f = 2lt, and e | (p^l + 1) | (q − 1). Let u = |J|, 1 ≤ u ≤ e − 1. Then the graphs Γ_J are strongly regular with eigenvalues
k = ((q − 1)/e)u with multiplicity 1,
(u/e)(−1 + (−1)^t √q) with multiplicity q − 1 − k,
(u/e)(−1 + (−1)^t √q) + (−1)^{t+1} √q with multiplicity k.
This will yield two-weight codes over F_r in case K is invariant under multiplication by nonzero elements in F_r, i.e., when e | (q − 1)/(r − 1). This is always true for r = p^l, but also happens, for example, when q = p^{2lt}, r = p^{lt}, e | (p^l + 1), and t is odd.
19.5.3 Generalizations
The examples given by de Lange and by Ikuta and Munemasa [1022, 1023] (p = 2,
f = 20, e = 75, J = {0, 3, 6, 9, 12} and p = 2, f = 21, e = 49, J = {0, 1, 2, 3, 4, 5, 6})
and the sporadic cases of the Schmidt–White Conjecture 19.4.1 were generalized by Feng
and Xiang [717], Ge, Xiang, and Yuan [798], Momihara [1394], and Wu [1910], who found
several further infinite families of strongly regular graphs. See also [1395].
Lemma 19.6.1 Let H be a subgroup of ΓL(1, q). Then H = ⟨t_{b,0}⟩ for suitable b, or H = ⟨t_{b,0}, t_{c,s}⟩ for suitable b, c, s, where s | r and c^{(q−1)/(p^s−1)} ∈ ⟨b⟩.
Proof: The subgroup of all elements t_{a,0} in H is cyclic and has a generator t_{b,0}. If this was not all of H, then H/⟨t_{b,0}⟩ is cyclic again, and has a generator t_{c,s} with s | r. Since (t_{c,s})^i = t_{c^j, is} where j = 1 + p^s + p^{2s} + · · · + p^{(i−1)s}, it follows for i = r/s that c^{(q−1)/(p^s−1)} ∈ ⟨b⟩.
Theorem 19.6.2 H = ⟨t_{b,0}⟩ has two orbits if and only if q is odd and H consists precisely of the elements t_{a,0} with a a square in F_q*.
Proof: Let b have multiplicative order m. Then m | (q − 1), and ⟨t_{b,0}⟩ has d orbits, where d = (q − 1)/m.
Let b have order m and put d = (q − 1)/m. Choose a primitive element ω ∈ F∗q with
b = ω d . Let c = ω e .
Theorem 19.6.3 H = ⟨t_{b,0}, t_{c,s}⟩, where s | r and d | e(q − 1)/(p^s − 1), has exactly two orbits of different lengths n1, n2, where n1 < n2 and n1 + n2 = q − 1, if and only if n1 = m1 m, with
(a) the prime divisors of m1 divide p^s − 1,
(b) v := (q − 1)/n1 is an odd prime, and p^{m1 s} is a primitive root mod v,
(c) gcd(e, m1 ) = 1, and
(d) m1 s(v − 1) | r.
That settled the case of two orbits of different lengths. Next consider that of two orbits
of equal length. As before, let b have order m and put d = (q − 1)/m. Choose a primitive
element ω ∈ F∗q with b = ω d . Let c = ω e .
Theorem 19.6.4 H = ⟨t_{b,0}, t_{c,s}⟩, where s | r and d | e(q − 1)/(p^s − 1), has exactly two orbits of the same length (q − 1)/2, where (q − 1)/2 = m1 m, if and only if
(a) the prime divisors of 2m1 divide p^s − 1,
(b) no odd prime divisor of m1 divides e,
(c) m1 s | r, and
(d) one of the following cases applies:
(i) m1 is even, p^s ≡ 3 (mod 8), and e is odd,
(ii) m1 ≡ 2 (mod 4), p^s ≡ 7 (mod 8), and e is odd,
(iii) m1 is even, p^s ≡ 1 (mod 4), and e ≡ 2 (mod 4), or
(iv) m1 is odd and e is even.
The graphs from Theorem 19.6.2 are the Paley graphs. The Van Lint–Schrijver Con-
struction from Section 19.5.1 is the special case of Theorem 19.6.3 where s = 1, e = 0,
m1 = 1.
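A direct check (our own) that the Paley graph from Theorem 19.6.2 is strongly regular, for q = 13 (note that −1 is a square mod 13, so adjacency is symmetric):

```python
q = 13
D = {(x * x) % q for x in range(1, q)}       # nonzero squares mod 13
adj = lambda x, y: (x - y) % q in D

k = sum(adj(0, y) for y in range(1, q))      # valency: 6
common = {True: set(), False: set()}         # common neighbours of (non)adjacent pairs
for x in range(q):
    for y in range(x + 1, q):
        c = sum(1 for z in range(q) if adj(x, z) and adj(y, z))
        common[adj(x, y)].add(c)
# common[True] == {2} (lambda) and common[False] == {3} (mu): srg(13, 6, 2, 3)
```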
19.7.1 Subspaces
(i) Easy examples are subspaces of PV. A subspace with vector space dimension i (projective dimension i − 1), where 1 ≤ i ≤ m − 1, has size n = (q^i − 1)/(q − 1) and meets hyperplanes in either m1 = (q^i − 1)/(q − 1) or m2 = (q^{i−1} − 1)/(q − 1) points. Here m1 − m2 = q^{i−1} can take many values.
(ii) If m = 2l is even, we can take the union of any family of pairwise disjoint l-subspaces. A hyperplane will contain either 0 or 1 of these, so that n = ((q^l − 1)/(q − 1))u, m1 = ((q^{l−1} − 1)/(q − 1))u + q^{l−1}, and m2 = ((q^{l−1} − 1)/(q − 1))u, where u is the size of the family, 1 ≤ u ≤ q^l.
Clearly, one has a lot of freedom choosing this family of pairwise disjoint l-subspaces,
and one obtains exponentially many non-isomorphic graphs with the same parameters
(cf. [1079]). There are many further constructions with these parameters; see, e.g.,
Section 19.7.2(ii) below, the alternating forms graphs on F_q^5 (with u = q^2 + 1; see [309, Theorem 9.5.6]), and [223, 224, 453, 497, 513].
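Construction (ii) is easy to check by brute force in the smallest case (our own example): q = 2, m = 4 (so l = 2), and u = 2 disjoint 2-dimensional subspaces of F_2^4, for which the formulas predict m1 = u(q^{l−1} − 1)/(q − 1) + q^{l−1} = 4 and m2 = 2:

```python
from itertools import product

V = [v for v in product(range(2), repeat=4) if any(v)]   # nonzero vectors = points
def span2(b1, b2):                                       # projective points of <b1, b2>
    return {b1, b2, tuple(a ^ b for a, b in zip(b1, b2))}

X = span2((1,0,0,0), (0,1,0,0)) | span2((0,0,1,0), (0,0,0,1))   # two disjoint planes
sizes = {sum(1 for x in X if sum(h[i] * x[i] for i in range(4)) % 2 == 0)
         for h in V}                                     # hyperplane section sizes
print(sorted(sizes))                                     # [2, 4]
```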
19.7.2 Quadrics
(i) Let X = Q be the point set of a non-degenerate quadric in P V , and let H denote a
hyperplane. Intersections Q∩H are quadrics in H, and in the cases where there is only
one type of non-degenerate quadric in H, there are two intersection sizes, dependent
on whether H is tangent or not.
In more detail, if m is even, then n = |Q| = (q^{m−1} − 1)/(q − 1) + εq^{m/2−1}, with ε = 1 for a hyperbolic quadric and ε = −1 for an elliptic quadric. A non-degenerate hyperplane meets Q in m1 = (q^{m−2} − 1)/(q − 1) points, and a tangent hyperplane meets Q in m2 = (q^{m−2} − 1)/(q − 1) + εq^{m/2−1} points. (Here we dropped the convention that m1 > m2.) The corresponding weights are w1 = q^{m−2} + εq^{m/2−1} and w2 = q^{m−2}. The corresponding graphs are known as the affine polar graphs VO^ε(m, q).
In the special case m = 4, ε = −1 one has n = q 2 + 1, m1 = q + 1, m2 = 1, and not
only the elliptic quadrics but also the Tits ovoids have these parameters.
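This special case can be verified by brute force (our own check, using the elliptic form Q(x) = x0^2 + x1^2 + x2 x3 over F_3, a choice of model; x^2 + y^2 is anisotropic over F_3 since −1 is a non-square):

```python
from itertools import product

q = 3
pts = [v for v in product(range(q), repeat=4)
       if any(v) and v[min(i for i in range(4) if v[i])] == 1]   # projective reps
Q = lambda x: (x[0]**2 + x[1]**2 + x[2] * x[3]) % q
X = [x for x in pts if Q(x) == 0]                                # the elliptic quadric
sizes = {sum(1 for x in X if sum(h[i] * x[i] for i in range(4)) % q == 0)
         for h in pts}                                           # hyperplane sections
print(len(X), sorted(sizes))     # 10 [1, 4]: n = q^2 + 1, m1 = q + 1, m2 = 1
```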
(ii) The above construction with ε = 1 has the same parameters as the subspaces construction in Section 19.7.1(ii) with u = q^{m/2−1} + 1. Brouwer et al. [310] gave a common generalization of both by taking (for m = 2l) the disjoint union of pairwise disjoint l-spaces and non-degenerate hyperbolic quadrics, where possibly a number of pairwise disjoint l-spaces contained in some of the hyperbolic quadrics is removed.
(iii) For odd q and even m, consider a non-degenerate quadric Q of type ε = ±1 in V , the
m-dimensional vector space over Fq . The non-isotropic points fall into two classes of
equal size, depending on whether Q(x) is a square or not. Both sets are (isomorphic)
2-character sets.
Let X be the set of non-isotropic projective points x where Q(x) is a nonzero square (this is well-defined). Then |X| = (q^{m−1} − εq^{m/2−1})/2 and m1, m2 = q^{m/2−1}(q^{m/2−1} ± 1)/2 (independent of ε). The corresponding graphs are known as VNO^ε(m, q).
(iv) In Brouwer [307] a construction for two-weight codes is given by taking a quadric
defined over a small field and cutting out a quadric defined over a larger field. Let
F1 = Fr , and F = Fq , where r = q e for some e > 1. Let V1 be a vector space of
dimension d over F1 , where d is even, and write V for V1 regarded as a vector space
of dimension de over F . Let tr : F1 → F be the trace map. Let Q1 : V1 → F1 be a
non-degenerate quadratic form on V1 . Then Q = tr ◦ Q1 is a non-degenerate quadratic
form on V . Let X = {x ∈ P V | Q(x) = 0 and Q1 (x) 6= 0}. Write ε = 1 (ε = −1) if Q
is hyperbolic (elliptic).
Proposition 19.7.1 In the situation described in (iv), the corresponding two-weight code has dimension de, length n = |X| = (q^{e−1} − 1)(q^{de−e} − εq^{de/2−e})/(q − 1), and weights w1 = (q^{e−1} − 1)q^{de−e−1} and w2 = (q^{e−1} − 1)q^{de−e−1} − εq^{de/2−1}.
For example, when q = e = 2, d = 4, ε = −1, this yields a projective binary [68, 8] code
with weights 32, 40. This construction was generalized in Hamilton [894].
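The binary example can be read off directly from the proposition's formulas (a quick arithmetic check of our own):

```python
q, e, d, eps = 2, 2, 4, -1
n = (q**(e - 1) - 1) * (q**(d*e - e) - eps * q**(d*e // 2 - e)) // (q - 1)
w1 = (q**(e - 1) - 1) * q**(d*e - e - 1)
w2 = w1 - eps * q**(d*e // 2 - 1)
print(n, w1, w2)      # 68 32 40: the projective binary [68, 8] code with weights 32, 40
```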
where ε = (−1)^m. If we view V as a vector space of dimension 2m over F_r, the same set X now has n = (r^m − ε)(r^{m−1} + ε)/(r − 1), w2 = r^{2m−2}, and w1 − w2 = εr^{m−1}, as expected, since the form is a non-degenerate quadratic form in 2m dimensions over F_r. Thus, the graphs that one gets here are also graphs one gets from quadratic forms, but the codes here are defined over a larger field.
q  m  n  w1  w2 − w1  Comments
2 9 73 32 8 Fiedler & Klin [726]; [1136]
2 9 219 96 16 dual
2 10 198 96 16 Kohnert [1136]
2 11 276 128 16 Conway-Smith 211 .M24 rank 3 graph
2 11 759 352 32 dual; [819]
2 12 65i 32i 32 Kohnert [1136] (12 ≤ i ≤ 31, i ≠ 19)
2 24 98280 47104 2048 Rodrigues [1593]
4 5 11i 8i 8 Dissett [600] (7 ≤ i ≤ 14, i ≠ 8)
4 6 78 56 8 Hill [952]
4 6 429 320 32 dual
4 6 147 96 16 [307]; Cossidente et al. [445]
4 6 210 144 16 Cossidente et al. [445]
4 6 273 192 16 Section 19.7.1; De Wispelaere and Van Maldeghem [512]
4 6 315 224 16 [307]; Cossidente et al. [445]
8 4 39 32 4 De Lange [507]
8 4 273 224 16 dual
16 3 78 72 4 De Resmini and Migliori [510]
3 5 11 6 3 dual of the ternary Golay code
3 5 55 36 9 dual
3 6 56 36 9 Games graph, Hill cap [951]
3 6 84 54 9 Gulliver [861]; [1343]
3 6 98 63 9 Gulliver [861]; [1343]
3 6 154 99 9 Van Eupen [1831]; [860]
3 8 82i 54i 27 Kohnert [1136] (8 ≤ i ≤ 12)
3 8 41i 27i 27 Kohnert [1136] (26 ≤ i ≤ 39)
3 8 1435 945 27 de Lange [507]
3 12 32760 21627 243 Liebeck [1256] 312 .2.Suz rank 3 graph
9 3 35 30 3 De Resmini [509]
9 3 42 36 3 Penttila and Royle [1502]
9 4 287 252 9 de Lange [507]
5 4 39 30 5 Dissett [600]; [270]
5 6 1890 1500 25 Liebeck [1256] 56 .4.J2 rank 3 graph
125 3 829 820 5 Batten and Dover [138]
125 3 7461 7400 25 dual
343 3 3189 3178 7 Batten and Dover [138]
343 3 28701 28616 49 dual
Example 19.9.1 If we start with the unique [16, 5, 9]3 -code, with weight enumerator 1 +
116x9 + 114x12 + 12x15 , and take α = 1/3, β = −3, we find a [69, 5, 45]3 -code with weight
enumerator 1 + 210x45 + 32x54 . With α = −1/3, β = 5, we find a [173, 5, 108]3 -code with
weight enumerator 1 + 32x108 + 210x117 .
Remark 19.9.2
• In both directions there is a choice: pick α, β or pick w2 ∈ {w1 , w2 }. The correspon-
dence is 1-1 in the sense that if C ∗ is a Brouwer and van Eupen dual (BvE-dual) of
C, then C is a BvE-dual of C ∗ . If the projective code C one starts with has only two
different weights, then one can choose α, β so that Y becomes a set and the BvE-dual
coincides with the Delsarte dual.
• For another introduction and further examples, see Hill and Kolev [954].
• In Section 19.9.1, the degree 1 polynomial p(w) = αw + β was used. One can use
higher degree polynomials when more information about subcodes is available. See
the last section of [313] and Dodunekov and Simonis [611].
• See also [267, 1032] and [1768, Lemma 5.1].
Chapter 20
Linear Codes from Functions
Sihem Mesnager
University of Paris VIII and Télécom Paris
20.1 Introduction
Vectorial multi-output Boolean functions are functions from the finite field F_{2^n} of order 2^n to the finite field F_{2^m}, for given positive integers n and m. These functions are called
(n, m)-functions and include the single-output Boolean functions, which correspond to the
case m = 1. The (n, m)-functions are a mathematical object but naturally appear in differ-
ent areas of computer science and engineering since any such mapping can be understood
as a transformation substituting a sequence of n bits (zeros and ones) with a sequence of
m bits according to a given prescription. Moreover, (n, m)-functions play an important role
in symmetric cryptography, usually referred to as an “S-box” or “substitution box” in this
context; they are fundamental parts of block ciphers by providing confusion, a requirement
mentioned by C. E. Shannon [1662], which is necessary to withstand known (and hopefully
future) attacks. When they are used as S-boxes in block ciphers, the number m of their
output bits is less than or equal to the number n of input bits, most often. Such func-
tions can also be used in stream ciphers, with m significantly smaller than n, in the place
of Boolean functions to speed up the ciphers. Researchers have defined various properties
which measure the resistance of an (n, m)-function to different kinds of cryptanalysis, in-
cluding nonlinearity, differential uniformity, boomerang uniformity, algebraic degree, and
so forth. Vectorial Boolean functions have been extended and studied in any characteristic.
An excellent survey on Boolean and vectorial Boolean functions can be found in [346, 347].
We also highly recommend the outstanding book of C. Carlet [349] dealing with Boolean
and vectorial functions for cryptography and coding theory.
Linear codes are important in communication systems, consumer electronics, data stor-
age systems, etc. Those with good parameters can be employed in data storage systems
and communication systems. Determining the (Hamming) weights of linear codes is useful
for secret sharing, authentication codes, association schemes, strongly regular graphs, etc.
Cryptographic functions and codes have important applications in data communication and
data storage. These two areas are closely related and have had fascinating interplay. Cryp-
tographic functions (e.g., highly nonlinear functions, PN, APN, bent, AB, plateaued) have
important applications in coding theory. For instance, bent functions and almost bent (AB)
functions have been employed to construct optimal linear codes.
In the past decades, a lot of progress has been made in the construction of linear codes
from cryptographic functions and polynomials defined over finite fields. The main aim of
this chapter is to give an overview of linear codes from functions and polynomials using
different approaches. It is quite fascinating that vectorial functions can be nicely involved
to construct codes with good parameters and play a major role in these constructions. A
recent survey in this direction is [1244a].
The chapter is structured as follows. In Section 20.2, we present some background on
vectorial functions and related notions (some parameters of vectorial functions and tools for
handling vectorial functions). In Section 20.3, generic constructions of linear codes and their
extensions are presented. The core of the chapter starts in Section 20.4. We shall present
constructions of binary linear codes and highlight the role of some Boolean and vectorial
Boolean functions in the construction of codes with few weights. Section 20.5 is devoted to
the construction of cyclic codes using a sequence approach. We shall see for example that
APN functions play an important role in deriving such codes. Section 20.6 deals with p-ary
codes from functions over finite fields in odd characteristic p. We shall present constructions
of p-ary codes from specific functions based on the algebraic approaches presented in Section
20.3. Finally, in Section 20.7, we present linear locally recoverable codes (LRC) and highlight
the role of special functions in the design of optimal LRC.
20.2 Preliminaries
For a finite set A, #A denotes the cardinality of A and A∗ denotes the set A \ {0}. For
a complex number z, |z| denotes its absolute value.
Tr_{q^r/q}(x) := Σ_{i=0}^{r−1} x^{q^i} = x + x^q + x^{q^2} + · · · + x^{q^{r−1}}.
The trace function from Fqr = Fpn to its prime subfield Fp is called the absolute trace
function.
Recall that the (relative) trace function Trqr /q is Fq -linear. It satisfies the transitiv-
ity property in a chain of extension fields. More precisely, let Fph be a finite field, let
Fpm be a finite extension of Fph and Fpr be a finite extension of Fpm . Then Trpr /ph (x) =
Trpm /ph (Trpr /pm (x)) for all x ∈ Fpr .
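As a small numerical sketch (our own illustration, not from the text), the absolute trace Tr_{2^n/2}(x) = x + x^2 + · · · + x^{2^{n−1}} can be computed directly from the definition above once multiplication in F_{2^n} is available; the representation of field elements as bit-vectors and the modulus x^4 + x + 1 for F_{16} are arbitrary choices.

```python
def gf2_mul(a, b, mod, n):
    """Carry-less product of a and b in F_2[x], reduced modulo `mod`."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> n) & 1:
            a ^= mod
    return r

def trace(x, mod, n):
    """Absolute trace Tr_{2^n/2}(x) = x + x^2 + x^4 + ... + x^(2^(n-1))."""
    t, y = 0, x
    for _ in range(n):
        t ^= y
        y = gf2_mul(y, y, mod, n)   # Frobenius map y -> y^2
    return t

n, mod = 4, 0b10011                 # F_16 with modulus x^4 + x + 1 (our choice)
traces = [trace(x, mod, n) for x in range(2 ** n)]
assert sum(traces) == 2 ** (n - 1)  # the trace is a balanced F_2-linear form
```

Since the trace is a nonzero F_2-linear form, its kernel has size 2^{n−1}, which the final assertion checks.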
Every p-ary function f : F_{p^n} → F_p can be written uniquely in the trace representation
f(x) = Σ_{j∈Γ_n} Tr_{p^{o(j)}/p}(A_j x^j) + A_{p^n−1} x^{p^n−1},
where
• Γn is the set of integers obtained by choosing the smallest element, called the coset
leader, in each p-cyclotomic coset modulo pn − 1;
• o(j) is the size of the p-cyclotomic coset containing j;
• Aj ∈ Fpo(j) ;
• Apn −1 ∈ Fp .
Note that o(j) divides n. The algebraic degree of f, coming from its unique trace representation, is max {w_p(j) | A_j ≠ 0}, where w_p(j) is the number of nonzero entries in the p-ary expansion of j.
If we do not identify F_p^n with F_{p^n}, a p-ary function f has a representation as a unique
multinomial in x_1, . . . , x_n, where the variables x_i occur with exponent at most p − 1. This
is called the multivariate representation or algebraic normal form (ANF) of f . The
algebraic degree of f is the global (total) degree of its multivariate representation.
Identifying F_{p^{2m}} with F_{p^m} × F_{p^m}, a function f over F_{p^m} × F_{p^m} also admits a bivariate representation
f(x, y) = Σ_{0≤i,j≤p^m−1} A_{i,j} x^i y^j,
which is a unique polynomial in two variables, representing f, with A_{i,j} ∈ F_{p^m}. The algebraic degree of f is max {w_p(i) + w_p(j) | A_{i,j} ≠ 0}.
A p-ary linear function f (x) over Fpn has the form Trpn /p (ax) for some unique a ∈
Fpn ; linear functions include the identically zero function. A p-ary function whose algebraic
degree equals 1 (respectively 2) is called affine (respectively quadratic).
Example 20.2.4 Let p = 2. Any Boolean affine function f(x) over F_{2^n} can be written
uniquely as f(x) = Tr_{2^n/2}(ax + b) where a, b ∈ F_{2^n} with a ≠ 0; these include the nonzero
Boolean linear functions. Any Boolean quadratic function f(x) over F_{2^n} can be written
uniquely in the form
f(x) = Σ_{k=1}^{⌊n/2⌋} Tr_{2^n/2}(a_k x^{2^k+1}) + Tr_{2^n/2}(ax + b).
The Walsh transform of a p-ary function f : F_{p^n} → F_p at b ∈ F_{p^n} is
W_f(b) := Σ_{x∈F_{p^n}} ξ_p^{f(x) − Tr_{p^n/p}(bx)},
where ξ_p = e^{2πi/p} is a complex primitive p-th root of unity. Note that the notion of a Walsh transform refers to a scalar product; it is convenient to choose the isomorphism such that the canonical scalar product “·” in F_{p^n} coincides with the trace of the product bx, i.e., b · x := Tr_{p^n/p}(bx).
Example 20.2.6 Let p = 2. Consider a quadratic Boolean function from F_{2^m} to F_2 of the
form
f(x) = Σ_{i=0}^{⌊m/2⌋} Tr_{2^m/2}(f_i x^{2^i+1}).
From its Walsh transform, the function f can be recovered by the inverse transform
ξ_p^{f(x)} = (1/p^n) Σ_{b∈F_{p^n}} W_f(b) ξ_p^{Tr_{p^n/p}(bx)}.
Definition 20.2.7 Let F be a vectorial function from F_{p^n} into F_{p^m}. The linear combinations of the coordinates of F are the p-ary functions F_u : x ↦ u · F(x) := Tr_{p^m/p}(uF(x)),
u ∈ F_{p^m}, where F_0 is the null function. The functions F_u (u ≠ 0) are called the components
of F.
The nonlinearity nl(f) of a Boolean function f over F_2^n is its minimum Hamming distance to the set of affine functions,
nl(f) = min_{g affine} d_H(f, g),
where d_H(f, g) is the Hamming distance between f and g; that is, d_H(f, g) := #{x ∈ F_2^n |
f(x) ≠ g(x)}. The relationship between nonlinearity and the Walsh spectrum of f is
nl(f) = 2^{n−1} − (1/2) max_{ω∈F_2^n} |W_f(ω)|.
By Parseval’s Identity (20.1) with p = 2, it can be shown that max {|W_f(ω)| | ω ∈ F_2^n} ≥ 2^{n/2},
which implies that nl(f) ≤ 2^{n−1} − 2^{n/2−1}.
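These facts can be checked numerically; the sketch below (our own illustration, with the convention W_f(w) = Σ_x (−1)^{f(x)+w·x} and the classical bent function x_1x_2 + x_3x_4 as example) verifies Parseval's Identity and computes the nonlinearity from the Walsh spectrum.

```python
from itertools import product

def dot(w, x):
    """Canonical scalar product on F_2^n."""
    return sum(wi & xi for wi, xi in zip(w, x)) % 2

def walsh(f, n):
    """W_f(w) = sum_x (-1)^(f(x) + w.x) for all w in F_2^n."""
    pts = list(product((0, 1), repeat=n))
    return {w: sum((-1) ** (f(x) ^ dot(w, x)) for x in pts) for w in pts}

n = 4
f = lambda x: (x[0] & x[1]) ^ (x[2] & x[3])     # a classical bent function
W = walsh(f, n)
assert sum(v * v for v in W.values()) == 2 ** (2 * n)      # Parseval's Identity
nl = 2 ** (n - 1) - max(abs(v) for v in W.values()) // 2   # nonlinearity
assert nl == 2 ** (n - 1) - 2 ** (n // 2 - 1)              # the bound is attained
```

Because the chosen f is bent, all Walsh values have absolute value 2^{n/2} = 4, so the covering radius bound is met with equality.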
Bent Boolean functions f defined over F2n are then those of maximum Hamming distance
to the set of affine Boolean functions. They exist only when n is even.
The notion of bent function was introduced by Rothaus [1606] and attracted a lot of
research for more than four decades. Such functions are extremal combinatorial objects
with several areas of application, such as coding theory, maximum length sequences, and
cryptography. A recent survey on bent functions can be found in [355] as well as the book
[1378].
Bent functions can be characterized by means of the Walsh transform as follows; this
characterization is independent of the choice of the inner product on F2n .
Proposition 20.2.10 (see, e.g., [538]) Let n be an even integer and f be a Boolean function defined over F_{2^n}. Then f is bent if and only if W_f(ω) ∈ {2^{n/2}, −2^{n/2}}
for all ω ∈ F_{2^n}.
Let p = 2. It is observed in Rothaus’ paper [1606] and developed in Dillon’s thesis [538]
that a Boolean function f : F2m → F2 is bent if and only if its support is a difference set.
More precisely, a function f from F2m to F2 is bent if and only if its support Df := {x ∈
F2m | f (x) = 1} is a difference set in (F2m , +) with parameters
(2^m, 2^{m−1} ± 2^{(m−2)/2}, 2^{m−2} ± 2^{(m−2)/2}).
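This characterization can be verified on a small example (our own choice of bent function, m = 4, elements of F_2^4 encoded as integers): the support of x_1x_2 + x_3x_4 is a (16, 6, 2) difference set.

```python
from collections import Counter

# f(x1,x2,x3,x4) = x1*x2 + x3*x4, a bent function on F_2^4
f = lambda x: ((x & 1) & (x >> 1 & 1)) ^ ((x >> 2 & 1) & (x >> 3 & 1))
D = [x for x in range(16) if f(x)]                      # the support D_f
diffs = Counter(a ^ b for a in D for b in D if a != b)  # differences in (F_2^4, +)
assert len(D) == 6                  # 2^(m-1) - 2^((m-2)/2) = 8 - 2
assert set(diffs.values()) == {2}   # each nonzero element occurs lambda = 2 times
assert len(diffs) == 15             # every nonzero element of F_2^4 occurs
```

Here λ = 2^{m−2} − 2^{(m−2)/2} = 2, consistent with the counting identity k(k − 1) = λ(2^m − 1) for a difference set: 6 · 5 = 2 · 15.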
The nonlinearity of an (n, m)-function F is defined as
N(F) := min_{v∈F_2^m\{0}} nl(v · F),
where v · F denotes the component function x ↦ v · F(x) given by the usual inner product on F_2^m and nl(·) denotes the nonlinearity of
Boolean functions. From the Covering Radius Bound, we have N(F) ≤ 2^{n−1} − 2^{n/2−1}. The
functions achieving this bound are called (n, m)-bent functions and are characterized by
the fact that all nonzero component functions of F are bent Boolean functions. In [1451],
it is shown that (n, m)-bent functions exist only if n is even and m ≤ n/2.
The generalization to vectorial functions of the notion of nonlinearity, as introduced by
Nyberg [1450, 1452] and further studied by Chabaud and Vaudenay [377], can be expressed
by means of the Walsh transform as follows:
N(F) = 2^{n−1} − (1/2) max_{v∈F_2^m\{0}, u∈F_2^n} |W_F(u, v)|,  where  W_F(u, v) = Σ_{x∈F_2^n} (−1)^{v·F(x)+u·x}.
Bent (n, m)-functions do not exist if m > n/2. For m ≥ n − 1, the following bound has
been found by Sidelńikov and re-discovered by Chabaud and Vaudenay in [377] and is now
called the Sidelńikov–Chabaud–Vaudenay Bound:
N(F) ≤ 2^{n−1} − (1/2) √(3 · 2^n − 2 − 2 (2^n − 1)(2^{n−1} − 1)/(2^m − 1)).   (20.2)
Definition 20.2.13 ([377]) The (n, n)-functions F which achieve (20.2) with equality are
called almost bent (AB).
Proposition 20.2.14 Let F be a function from F_{2^n} to F_{2^n}. Then F is AB if and only
if W_F(a, b) := Σ_{x∈F_{2^n}} (−1)^{Tr_{2^n/2}(aF(x)+bx)} takes the values 0 or ±2^{(n+1)/2} for every pair
(a, b) ∈ F_{2^n}^* × F_{2^n}.
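As a hedged sketch, this characterization can be tested exhaustively for the Gold function F(x) = x^3 over F_{2^5}, which is AB for odd m (see the list in Section 20.4.13); the modulus x^5 + x^2 + 1 chosen for F_32 is our own.

```python
def mul(a, b, mod=0b100101, n=5):   # F_32 = F_2[x]/(x^5 + x^2 + 1), our choice
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> n) & 1:
            a ^= mod
    return r

def tr(x):
    """Absolute trace of F_32 over F_2."""
    t, y = 0, x
    for _ in range(5):
        t ^= y
        y = mul(y, y)
    return t

cube = lambda x: mul(mul(x, x), x)  # the Gold function F(x) = x^3
vals = {sum((-1) ** tr(mul(a, cube(x)) ^ mul(b, x)) for x in range(32))
        for a in range(1, 32) for b in range(32)}
assert vals == {0, 8, -8}           # exactly the values 0 and +/-2^((n+1)/2)
```

The exhaustive loop computes W_F(a, b) for every pair with a ≠ 0 and confirms that only the values 0 and ±2^{(5+1)/2} = ±8 occur.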
TABLE 20.2: Walsh spectrum of s-plateaued Boolean functions f for f(0) = 0 and n + s even

    W_f(ω)            Number of ω's
    2^{(n+s)/2}       2^{n−s−1} + 2^{(n−s−2)/2}
    0                 2^n − 2^{n−s}
    −2^{(n+s)/2}      2^{n−s−1} − 2^{(n−s−2)/2}
In odd characteristic, bent functions are classified into regular bent and weakly regular
bent functions.
Definition 20.2.17 A bent function f is called regular bent if for every b ∈ F_{p^m},
p^{−m/2} W_f(b) equals ξ_p^{f^*(b)} for some p-ary function f^* : F_{p^m} → F_p. The bent function f
is called weakly regular bent if there exists a complex number u with |u| = 1 and a p-ary
function f^* such that u p^{−m/2} W_f(b) = ξ_p^{f^*(b)} for all b ∈ F_{p^m}. The function f^*(x) is called the
dual of f(x).
For a weakly regular bent function f, one can write W_f(b) = ε_f (√(p^*))^m ξ_p^{f^*(b)}, where ε_f = ±1 is called the sign of the Walsh transform of f(x) and p^* = (−1)^{(p−1)/2} p.
Moreover, the Walsh transform coefficients of a p-ary bent function f with odd p satisfy
p^{−m/2} W_f(b) = ±ξ_p^{f^*(b)} if m is even, or m is odd and p ≡ 1 (mod 4); and
p^{−m/2} W_f(b) = ±i ξ_p^{f^*(b)} if m is odd and p ≡ 3 (mod 4),
where i is a complex primitive 4th root of unity; see [1176]. Therefore, regular bent functions
can only be found for even m and for odd m with p ≡ 1 (mod 4). Moreover, for a weakly
regular bent function, the constant u in Definition 20.2.17 can only be equal to ±1 or ±i.
For more information about those functions, we invite the reader to consult the book [1378].
We summarize in Table 20.3 all known weakly regular bent functions over Fpm with odd
characteristic p.
TABLE 20.3: Known weakly regular bent functions over F_{p^m} for p odd

    Bent function                                                                  m           p
    Σ_{i=0}^{⌊m/2⌋} Tr_{p^m/p}(a_i x^{p^i+1})                                      arbitrary   arbitrary
    Σ_{i=0}^{p^k−1} Tr_{p^m/p}(a_i x^{i(p^k−1)}) + Tr_{p^l/p}(δ x^{(p^m−1)/e}),
      with e | (p^k + 1) and l = min {i | e | (p^i − 1) and i | m}                  m = 2k      arbitrary
    Tr_{p^m/p}(a x^{(3^m−1)/4 + 3^k + 1})                                          m = 2k      p = 3
    Tr_{p^m/p}(x^{p^{3k}+p^{2k}−p^k+1} + x^2)                                      m = 4k      arbitrary
    Tr_{p^m/p}(a x^{(3^i+1)/2}), with i odd, gcd(i, m) = 1                         arbitrary   p = 3
Similarly to bent functions, plateaued functions are classified in odd characteristic into
regular plateaued and weakly regular plateaued functions. The absolute Walsh distribution of
plateaued functions follows from Parseval’s Identity (20.1); see, e.g., [1376]. More precisely,
we have the following result.
Proposition 20.2.18 Let f : F_{p^n} → F_p be an s-plateaued function. Then, for ω ∈ F_{p^n},
|W_f(ω)|^2 takes the value p^{n+s} exactly p^{n−s} times and the value 0 exactly p^n − p^{n−s} times.
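The sketch below (our own example, p = 2) verifies Proposition 20.2.18 for the 1-plateaued Boolean function x_1x_2 on F_2^3: its squared Walsh values take p^{n+s} = 16 exactly p^{n−s} = 4 times, and 0 otherwise.

```python
from itertools import product

n, s = 3, 1
f = lambda x: x[0] & x[1]        # a 1-plateaued Boolean function on F_2^3
pts = list(product((0, 1), repeat=n))
# W_f(w) = sum_x (-1)^(f(x) + w.x)
W = [sum((-1) ** (f(x) ^ (sum(a & b for a, b in zip(w, x)) % 2)) for x in pts)
     for w in pts]
sq = [v * v for v in W]
assert sq.count(2 ** (n + s)) == 2 ** (n - s)    # 4 values equal to 16
assert sq.count(0) == 2 ** n - 2 ** (n - s)      # the remaining 4 are 0
```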
Definition 20.2.19 The support of a vectorial function f : F_{q^n} → F_{q^m} is the set
supp(f) := {x ∈ F_{q^n} | f(x) ≠ 0}.
In 2016, Hyun, Lee, and Lee [1018] showed that the Walsh transform coefficients of p-ary
s-plateaued f satisfy
W_f(ω) ∈ {0, ±p^{(n+s)/2} ξ_p^{g(ω)}} if n + s is even, or n + s is odd and p ≡ 1 (mod 4);
W_f(ω) ∈ {0, ±i p^{(n+s)/2} ξ_p^{g(ω)}} if n + s is odd and p ≡ 3 (mod 4),
where i is a complex primitive 4th root of unity and g is a p-ary function over Fpn with
g(ω) = 0 for all ω ∈ Fpn \ supp(Wf ). It is worth noting that by the definition of g : Fpn →
Fp , it can be regarded as a mapping from supp(Wf ) to Fp such that g(ω) = 0 for all
ω ∈ Fpn \ supp(Wf ). In 2017, Mesnager, Özbudak, and Sınak [1380, 1381] introduced the
notion of weakly regular plateaued functions in odd characteristic, which covers a nontrivial
subclass of the class of plateaued functions.
Definition 20.2.20 ([1380, 1381]) A plateaued function f : F_{p^n} → F_p is called weakly
regular p-ary s-plateaued if there exists a complex number u, not dependent on ω, such
that |u| = 1 and W_f(ω) ∈ {0, u p^{(n+s)/2} ξ_p^{g(ω)}} for all ω ∈ F_{p^n}, where g is a p-ary function
over F_{p^n} with g(ω) = 0 for all ω ∈ F_{p^n} \ supp(W_f); otherwise, f is called (non)-weakly
regular.
The following lemma has, in particular, a significant role in finding the Hamming weights
of the codewords of a linear code. It can be proven in a similar way to the case of s = 0;
see, e.g., [1379].
Lemma 20.2.21 Let p be an odd prime and f : F_{p^n} → F_p be a weakly regular s-plateaued
function. For all ω ∈ supp(W_f),
W_f(ω) = ε_f (√(p^*))^{n+s} ξ_p^{g(ω)},
where ε_f = ±1 is the sign of the Walsh transform of f and p^* = (−1)^{(p−1)/2} p.
Remark 20.2.22 Notice that the notion of (non)-weakly regular 0-plateaued functions
coincides with that of (non)-weakly regular bent functions. Indeed, if we have |W_f(ω)|^2 ∈
{0, p^n} for all ω ∈ F_{p^n}, then by Parseval’s Identity, we have p^{2n} = p^n #supp(W_f), and
thereby, #supp(W_f) = p^n. Hence, a (non)-weakly regular 0-plateaued function is a (non)-
weakly regular bent function.
Definition 20.2.26 ([186, 1453, 1454]) An (n, n)-function F is called almost perfect
nonlinear (APN) if it is differentially 2-uniform; that is, if for every a ∈ F_2^n \ {0} and every
b ∈ F_2^n, the equation F(x) + F(x + a) = b has 0 or 2 solutions.
Since (n, m)-functions have differential uniformity at least 2^{n−m}, with equality possible only
when n is even and m ≤ n/2, and strictly larger differential uniformity when n is odd or
m > n/2, we shall use the term ‘APN function’ only when m = n. An interesting article
dealing with Boolean, vectorial plateaued, and APN functions is [348].
Both planar and APN functions over F_{q^m} for odd q exist. The following is a summary
of known APN monomials F(x) = x^d over F_{2^m}:
• [822, 1453] d = 2^i + 1 with gcd(i, m) = 1 (Gold function – degree 2);
• [1039, 1087] d = 2^{2i} − 2^i + 1 with gcd(i, m) = 1 (Kasami function – degree i + 1);
• [607] d = 2^t + 3 with m = 2t + 1 (Welch function – degree 3);
• [606] d = 2^t + 2^{t/2} − 1 if t is even and d = 2^t + 2^{(3t+1)/2} − 1 if t is odd, where m = 2t + 1
(Niho function – degree (t + 2)/2 if t is even and t + 1 if t is odd);
• [186, 1453] d = 2^{2t} − 1 with m = 2t + 1 (Inverse function – degree m − 1);
• [609] d = 2^{4t} + 2^{3t} + 2^{2t} + 2^t − 1 with m = 5t (Dobbertin function – degree t + 3).
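The defining 2-uniformity can be verified exhaustively for small fields; the sketch below (our own, with modulus x^5 + x^2 + 1 for F_32) checks that the Gold function x^3 is APN over F_{2^5}: for every a ≠ 0, each attained value of F(x) + F(x + a) has exactly two preimages.

```python
from collections import Counter

def mul(a, b, mod=0b100101, n=5):   # F_32 = F_2[x]/(x^5 + x^2 + 1), our choice
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> n) & 1:
            a ^= mod
    return r

cube = lambda x: mul(mul(x, x), x)  # Gold function x^3, gcd(1, 5) = 1
# tabulate b = F(x) + F(x + a) over all x for each a != 0
counts = Counter((a, cube(x) ^ cube(x ^ a)) for a in range(1, 32) for x in range(32))
assert set(counts.values()) == {2}  # every equation F(x)+F(x+a)=b has 0 or 2 roots
```

Values b that never occur simply do not appear in the counter, so all recorded multiplicities being 2 is exactly the APN property.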
The following is a summary of known APN monomials F(x) = x^d over F_{p^m} where p is
odd:
• d = 3, p > 3;
• d = p^m − 2, p > 2, and p^m ≡ 2 (mod 3);
• d = (p^m − 3)/2, p^m ≡ 3, 7 (mod 20), p^m > 7, p^m ≠ 27, and m is odd;
• d = (p^m + 1)/4 + (p^m − 1)/2, p^m ≡ 3 (mod 8);
• d = (p^m + 1)/4, p^m ≡ 7 (mod 8);
• d = (2p^m − 1)/3, p^m ≡ 2 (mod 3);
• d = 3^m − 3, p = 3, and m is odd;
• d = p^h + 2, p^h ≡ 1 (mod 3), and m = 2h;
• d = (5^h + 1)/2, p = 5, and gcd(2m, h) = 1;
• d = (3^{(m+1)/4} − 1)(3^{(m+1)/2} + 1), p = 3, and m ≡ 3 (mod 4);
• p = 3 and d = (3^{(m+1)/2} − 1)/2 if m ≡ 3 (mod 4) or d = (3^{(m+1)/2} − 1)/2 + (3^m − 1)/2 if m ≡ 1
(mod 4);
• p = 3 and d = (3^{m+1} − 1)/8 if m ≡ 3 (mod 4) or d = (3^{m+1} − 1)/8 + (3^m − 1)/2 if m ≡ 1 (mod 4);
• d = (5^m − 1)/4 + (5^{(m+1)/2} − 1)/2, p = 5, and m is odd.
The following is a list of some known planar functions over F_{p^m} where p is odd:
• f(x) = x^2;
• f(x) = x^{p^h+1} where m/gcd(m, h) is odd;
• f(x) = x^{(3^h+1)/2} where p = 3 and gcd(m, h) = 1;
• f(x) = x^{10} − ux^6 − u^2x^2, where p = 3, u ∈ F_{p^m}, and m is odd.
Dickson polynomials over F_{q^m} are defined by
D_{h,a}(x) = Σ_{i=0}^{⌊h/2⌋} (h/(h−i)) \binom{h−i}{i} (−a)^i x^{h−2i},
where a ∈ F_{q^m}, and h ≥ 0 is called the order of the polynomial. This family is referred to
as the Dickson polynomials of the first kind.
Dickson polynomials of the second kind over F_{q^m} are defined by
E_{h,a}(x) = Σ_{i=0}^{⌊h/2⌋} \binom{h−i}{i} (−a)^i x^{h−2i},
where a ∈ F_{q^m}, and h ≥ 0 is called the order of the polynomial. We invite the reader to
consult the excellent book [1253] for detailed information about Dickson polynomials.
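As a sketch (standard properties, not from the text), Dickson polynomials of the first kind also satisfy the recurrence D_{0,a} = 2, D_{1,a} = x, D_{h,a} = x D_{h−1,a} − a D_{h−2,a}, and the functional equation D_{h,a}(y + a/y) = y^h + (a/y)^h; both are easy to check numerically over the integers.

```python
def dickson(h, x, a):
    """Dickson polynomial of the first kind D_{h,a}(x), via the recurrence."""
    d0, d1 = 2, x
    if h == 0:
        return d0
    for _ in range(h - 1):
        d0, d1 = d1, x * d1 - a * d0   # D_h = x*D_{h-1} - a*D_{h-2}
    return d1

# functional equation at a = 2, y = 2: x = y + a/y = 3, so D_h(3) = 2^h + 1
for h in range(8):
    assert dickson(h, 3, 2) == 2 ** h + 1
```

The recurrence is the usual way to evaluate D_{h,a} efficiently; the closed-form coefficients above can be recovered by expanding it.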
20.3 Generic Constructions of Linear Codes from Functions

The first generic construction associates to a polynomial f from F_q to F_q, where q = p^m, the code
C(f) = { ( Tr_{q/p}(af(x) + bx) )_{x∈F_q} | a, b ∈ F_q }.
The resulting code C(f) is a linear code over F_p of length q, and its dimension is bounded
above by 2m; this bound is reached when the nonlinearity of the vectorial function f is
larger than 0, which happens in many cases.
One can also define a code C^*(f) over F_p involving a polynomial f from F_q to F_q, where
q = p^m, which vanishes at 0, defined by
C^*(f) = { ( Tr_{q/p}(af(x) + bx) )_{x∈F_q^*} | a, b ∈ F_q }.
The resulting code C^*(f) is a linear code of length q − 1, and its dimension is also bounded
above by 2m, which is reached in many cases.
As mentioned in the literature (see, e.g., [552]), the first generic construction has a
long history and its importance is supported by Delsarte’s Theorem [522]. In the binary
case, that is, when p = 2, the first generic construction allows a kind of coding-theoretic
characterization of special cryptographic functions such as APN functions, almost bent
functions, and semi-bent functions; see [350].
More generally, but in a more complex way, for any α, β ∈ F_{q^r}, define
f_{α,β} : F_{q^r} → F_q,  x ↦ f_{α,β}(x) := Tr_{q^r/q}(αΨ(x) − βx),
where Ψ is a mapping from F_{q^r} to F_{q^r} such that Ψ(0) = 0. One can define a linear code C_Ψ
over F_q as
C_Ψ = { c̃_{α,β} = ( f_{α,β}(ζ_1), f_{α,β}(ζ_2), . . . , f_{α,β}(ζ_{q^r−1}) ) | α, β ∈ F_{q^r} },   (20.4)
where ζ_1, . . . , ζ_{q^r−1} are the nonzero elements of F_{q^r}.
Proposition 20.3.1 ([1379]) The linear code C_Ψ has length q^r − 1. If no component
Ψ_u : x ↦ Tr_{q^r/q}(uΨ(x)), u ≠ 0, of the mapping Ψ is linear over F_p, then C_Ψ has dimension 2r.
Hence, c̃_{α,β} = 0 implies that the component of Ψ associated with α ≠ 0 is linear or null
and coincides with x ↦ Tr_{q^r/q}(βx). Therefore, to ensure that the null codeword appears
only once (for α = β = 0), it suffices that no component function of Ψ is identically equal
to 0 or linear. Furthermore, this implies that all the codewords c̃_{α,β} are pairwise distinct.
In this case, the size of the code is q^{2r} and its dimension thus equals 2r.
The following statement shows that the weight distribution of the code CΨ of length
q r − 1 can be expressed by means of the Walsh transform of some absolute trace functions
over Fpm involving the map Ψ.
Proposition 20.3.2 ([1379]) We keep the notation above. Set q^r = p^m. Let a ∈ F_{p^m} and
let Ψ be a mapping from F_{q^r} to F_{q^r} such that Ψ(0) = 0. Let us denote by ψ_a the mapping
from F_{p^m} to F_p defined as
ψ_a(x) = Tr_{p^m/p}(aΨ(x)).
For c̃_{α,β} ∈ C_Ψ, we have
wt_H(c̃_{α,β}) = p^m − (1/q) Σ_{ω∈F_q} W_{ψ_{ωα}}(ωβ).
Proof: Note that Ψ(0) = 0 implies that f_{α,β}(0) = 0. Now let c̃_{α,β} be a codeword of C_Ψ.
Then
wt_H(c̃_{α,β}) = #{x ∈ F_{q^r}^* | f_{α,β}(x) ≠ 0} = p^m − (1/q) Σ_{x∈F_{q^r}} Σ_{ω∈F_q} ξ_p^{Tr_{q/p}(ω f_{α,β}(x))}.
The latter equality comes from the fact that the sum of characters equals q if f_{α,β}(x) = 0
and 0 otherwise. Moreover, using the transitivity property of the trace function Tr_{q^r/p} and
the fact that Tr_{q^r/q} is F_q-linear, we have
wt_H(c̃_{α,β}) = p^m − (1/q) Σ_{ω∈F_q} Σ_{x∈F_{q^r}} ξ_p^{Tr_{q^r/p}(ωαΨ(x) − ωβx)}.
The second generic construction uses a defining set D = {d_1, d_2, . . . , d_n} ⊆ F_q and the code
C_D = { c_x = ( Tr_{q/p}(xd_1), Tr_{q/p}(xd_2), . . . , Tr_{q/p}(xd_n) ) | x ∈ F_q },
where χ_1 is the canonical additive character of F_q, aD denotes the set {ad | d ∈ D}, and
χ_1(S) := Σ_{x∈S} χ_1(x) for any subset S of F_q. Therefore,
wt_H(c_x) = ((p − 1)/p) n − (1/p) Σ_{y∈F_p^*} χ_1(yxD).
In practice, the defining set could be the support or the co-support (i.e., the complement
of the support) of a function defined over a finite field.
II. Generalization II: Let F : F_{2^k} → F_{2^s} and let D be the support of Tr_{2^s/2}(λF(x)) where
λ ∈ F_{2^s}^*. A linear code C_D over F_2 is defined by
C_D = { ( Tr_{2^k/2}(xd) + Tr_{2^s/2}(yF(d)) )_{d∈D} | x ∈ F_{2^k}, y ∈ F_{2^s} }.
III. Generalization III: Let F_i : F_{p^k} → F_{p^k} for i = 1, 2, . . . , t, where t is a positive integer.
Define a linear code C_D as follows:
C_D = { ( Tr_{p^k/p}(a_1x_1 + · · · + a_tx_t) )_{(a_1,...,a_t)∈D} | x_1, . . . , x_t ∈ F_{p^k} }.
Note that the linear code defined as (20.5) coincides with the one given in the original
defining set construction if F (x) = x.
20.4 Binary Codes with Few Weights from Boolean Functions and
Vectorial Boolean Functions
20.4.1 A First Example of Codes from Boolean Functions: Reed–Muller
Codes
Reed–Muller codes, introduced by D. E. Muller and I. S. Reed in 1954, are one of the
best-understood families of codes; see, e.g., [1323]. Except for first-order Reed–Muller codes
and for codes of small lengths, their minimum distance is lower than that of BCH codes. But
they have very efficient decoding algorithms, they contain nonlinear subcodes with optimal
parameters together with efficient decoding algorithms, and they give a useful framework
for the study of Boolean functions in cryptography. Reed–Muller codes can be defined in
terms of Boolean functions defined over F_2^m.
For every 0 ≤ r ≤ m, the Reed–Muller code of order r is the set of all Boolean functions
of algebraic degrees at most r. More precisely, it is the linear code of all binary words
of length 2m corresponding to the last columns of the truth-tables of these functions; see
[1323]. The Reed–Muller codes are nested: RM(0, m) ⊂ RM(1, m) ⊂ · · · ⊂ RM(m, m).
For every 0 ≤ r ≤ m − 1, the dual code of RM(r, m) is the (m − r − 1)th -order Reed–Muller
code RM(m − r − 1, m); see Theorem 1.11.6.
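The evaluation-vector definition and the duality statement can be sketched concretely (our own illustration): for m = 3, RM(1, 3) is generated by the value vectors of the monomials of degree at most 1, has dimension 1 + 3 = 4, and RM(1, 3)^⊥ = RM(3 − 1 − 1, 3) = RM(1, 3), i.e., the code is self-dual (it is the [8, 4, 4] extended Hamming code).

```python
from itertools import combinations, product

def rm_generators(r, m):
    """Evaluation vectors of all monomials of degree <= r on F_2^m."""
    pts = list(product((0, 1), repeat=m))
    gens = []
    for deg in range(r + 1):
        for mono in combinations(range(m), deg):
            gens.append(tuple(int(all(x[i] for i in mono)) for x in pts))
    return gens

G = rm_generators(1, 3)          # generators of RM(1,3), length 2^3 = 8
assert len(G) == 1 + 3           # dim RM(1,3) = 4
# self-duality: every pair of generators is orthogonal over F_2
assert all(sum(u[i] & v[i] for i in range(8)) % 2 == 0 for u in G for v in G)
```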
The binary code C_f has length 2^m − 1 and dimension m + 1 if f(x) ≠ Tr_{2^m/2}(wx) for
all w ∈ F_{2^m}. In addition, the weight distribution of C_f is given by the multiset union
{ (2^m − W_f(w))/2 | w ∈ F_{2^m} } ∪ { 2^{m−1} | w ∈ F_{2^m}^* } ∪ { 0 }.
Theorem 20.4.1 ([550]) Let f be a function from F_{2^m} to F_2, let D_f be the support
of f, and set n_f := #D_f. If 2n_f + W_f(w) ≠ 0 for all w ∈ F_{2^m}^*, then C_{D_f} is a binary linear code with length n_f
and dimension m, and its weight distribution is given by the multiset
{ (2n_f + W_f(w))/4 | w ∈ F_{2^m}^* } ∪ { 0 }.   (20.6)
The determination of the weight distribution of the binary linear code C_{D_f} is equivalent
to that of the Walsh spectrum of the Boolean function f satisfying 2n_f + W_f(w) ≠ 0 for
all w ∈ F_{2^m}^*. As highlighted by Ding, when the Boolean function f is selected properly, the
code CDf could have only a few weights and may have good parameters.
Theorem 20.4.1 can be generalized into the following statement whose proof is the same
as that of Theorem 20.4.1.
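The weight formula of Theorem 20.4.1 can be sketched on a small example (our own, m = 4, elements of F_2^4 encoded as integers, inner product in place of the trace): for the bent function f = x_1x_2 + x_3x_4, n_f = 6 and the weights (2n_f + W_f(w))/4 = (12 ± 4)/4 give a two-weight code.

```python
# f(x1,x2,x3,x4) = x1*x2 + x3*x4, bent on F_2^4
f = lambda x: ((x & 1) & (x >> 1 & 1)) ^ ((x >> 2 & 1) & (x >> 3 & 1))
D = [x for x in range(16) if f(x)]                 # defining set: support of f
dot = lambda w, d: bin(w & d).count("1") % 2       # pairing on F_2^4
# codeword c_w = (w.d)_{d in D}; its weight is #{d in D : w.d = 1}
weights = {sum(dot(w, d) for d in D) for w in range(1, 16)}
assert len(D) == 6
assert weights == {2, 4}    # the two weights (2*6 +/- 4)/4
```

This matches Wolfmann's two-weight result for bent functions discussed below in Theorem 20.4.3.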
Theorem 20.4.2 ([550]) Let f be a function from F_{2^m} to F_2, and let D_f be the support
of f. Let e_w denote the multiplicity of the element (2n_f + W_f(w))/4 and e the multiplicity of 0 in
the multiset {2n_f + W_f(w) | w ∈ F_{2^m}^*}. Then C_{D_f} is a binary linear code with length n_f
and dimension m − log_2 e, and the weight distribution of C_{D_f} is given by
(2n_f + W_f(w))/4  with frequency  e_w/e
for all (2n_f + W_f(w))/4 in the multiset of (20.6).
are bent quadratic, except for u = 0. Moreover, the sum of any two distinct functions f_u is
bent. Since we know that the sum of a bent function and an affine function is bent, the code
K_{m/2} has minimum distance 2^{m−1} − 2^{m/2−1}, which is the best possible minimum distance
for a code equal to a union of cosets of RM(1, m), according to the Covering Radius Bound.
In fact, as shown by Delsarte [520], 2^{m−1} − 2^{m/2−1} is the best possible minimum distance
for any code of length 2^m and size 2^{2m}. Interesting developments about these nonlinear
codes with their relationship to bent Boolean functions can be found in [345, 346]. Later,
more codes, especially with few weights, were constructed from bent Boolean functions.
We highlight that Wolfmann [1900] was the first to construct two-weight codes from bent
functions.
Recall that a function f from F_{2^m} to F_2 is bent if and only if its support D_f = {x ∈
F_{2^m} | f(x) = 1} is a difference set in (F_{2^m}, +) with parameters
(2^m, 2^{m−1} ± 2^{(m−2)/2}, 2^{m−2} ± 2^{(m−2)/2}).
Wolfmann established for the first time a link between bent Boolean functions and two-
weight binary codes.
Theorem 20.4.3 ([1900]) Let f be a Boolean function from F_{2^m} to F_2 with f(0) = 0, where
m is even and m ≥ 4. Then the code C_{D_f}, obtained from the second generic construction
and whose defining set is D_f, is an [n_f, m, (n_f − 2^{(m−2)/2})/2]_2 two-weight binary code with
weight distribution given by Table 20.4 if and only if f is bent.
Consequently, any bent function can be plugged into Theorem 20.4.3 to obtain a two-weight
binary linear code.
Now we are interested in constructing binary linear codes based on the first generic
construction. With Ψ : F_{2^m} → F_{2^m} where Ψ(0) = 0, let C_Ψ be the linear code defined by
(20.4). Set ψ_1(x) = Tr_{2^m/2}(Ψ(x)). Let us define a subcode C_{ψ_1} of C_Ψ as follows:
C_{ψ_1} = { c̃_{α,β} = ( f_{α,β}(ζ_1), f_{α,β}(ζ_2), . . . , f_{α,β}(ζ_{2^m−1}) ) | α ∈ F_2, β ∈ F_{2^m} }.   (20.7)
Theorem 20.4.4 ([1379]) Let C_{ψ_1} be the linear code defined by (20.7) whose codewords
are denoted by c̃_{α,β}. Let m be an even integer. Assume that the function ψ_1 = Tr_{2^m/2}(Ψ)
is bent. We denote by ψ_1^* its dual function. Then C_{ψ_1} is a three-weight code with weights
as follows. We have wt_H(c̃_{0,0}) = 0, and for β ≠ 0, wt_H(c̃_{0,β}) = 2^m − 2^{m−1}. Moreover, the
Hamming weight of c̃_{1,β} for β ∈ F_{2^m} is wt_H(c̃_{1,β}) = 2^{m−1} − (−1)^{ψ_1^*(β)} 2^{m/2−1}. The weight
distribution of C_{ψ_1}, which is of dimension m + 1, is in Table 20.5.
Theorem 20.4.5 The code C_{D_f} with defining set D_f, of length n_f = 2^{m−1} − W_f(0)/2 (thus 2^{m−1} if W_f(0) = 0), is a three-weight binary linear code
if and only if the function f is semi-bent.
All semi-bent functions can be plugged into Theorem 20.4.5 to obtain three-weight binary
linear codes. The known semi-bent functions are given in [1378, Chapter 17].
Let
f(x) = Σ_{i=0}^{⌊m/2⌋} Tr_{2^m/2}(f_i x^{2^i+1})   (20.8)
be a quadratic Boolean function from F_{2^m} to F_2, where f_i ∈ F_{2^m}, and let r_f denote the rank of f.
Let D_f be the support of f. By definition, we have
n_f = #D_f = 2^{m−1} − W_f(0)/2 = { 2^{m−1} if W_f(0) = 0;  2^{m−1} − 2^{m−1−r_f/2} if W_f(0) = 2^{m−r_f/2};  2^{m−1} + 2^{m−1−r_f/2} if W_f(0) = −2^{m−r_f/2} }.   (20.9)
Let CDf be the code based on the second generic construction with defining set Df . From
Theorem 20.4.1 and Table 20.1, Ding has deduced the following result.
Theorem 20.4.6 ([550]) Let f be a quadratic Boolean function of the form in (20.8) such
that r_f > 2. Then C_{D_f} is a three-weight binary code with length n_f given in (20.9), dimension
m, and the weight distribution in Table 20.7, where
(ε_1, ε_2, ε_3) = (1, 0, 0) if W_f(0) = 0;  (0, 1, 0) if W_f(0) = 2^{m−r_f/2};  (0, 0, 1) if W_f(0) = −2^{m−r_f/2}.
As mentioned by Ding in [552], the code CDf in Theorem 20.4.6 defined by any quadratic
Boolean function f is different from any subcode of the second-order Reed–Muller code, due
to the difference in their lengths as well as their weight distributions.
Theorem 20.4.7 ([552]) Let m ≥ 4 be even. Let f(x) = Tr_{2^m/2}(x^d) where d is as given
above. Then the code C_{D_f} with defining set D_f is a three-weight code with parameters
[2^{m−1}, m, 2^{m−2} − 2^{(m−2)/2}]_2 and weight distribution given in Table 20.9.
• [1438] When d = (2^{m/2} + 1)(2^{m/4} − 1) + 2 and m ≡ 0 (mod 4), the code C_{D_f} has
length 2^{m−1} and dimension m, and the weight distribution of C_{D_f} is deduced from
Theorem 20.4.1 and Table 20.11.
• [605] When d = (2^{(m+2)h/2} − 1)/(2^h − 1) and m ≡ 0 (mod 4), where 1 ≤ h < m and gcd(h, m) = 1,
the code C_{D_f} has length 2^{m−1} and dimension m, and the weight distribution of C_{D_f}
is deduced from Theorem 20.4.1 and Table 20.10.
• [1438] When d = (2^{(m+2)h/2} − 1)/(2^h − 1) and m ≡ 2 (mod 4), where 1 ≤ h < m and gcd(h, m) =
1, the code C_{D_f} has length 2^{m−1} − 2^{m/2} and dimension m, and the weight distribution
of C_{D_f} is deduced from Theorem 20.4.1 and Table 20.12. Note that in this case,
gcd(d, 2^m − 1) = 3.
• [941] When d = (2^m + 2^{h+1} − 2^{m/2+1} − 1)/(2^h − 1) and m ≡ 0 (mod 4), where 2h divides m/2, the
code C_{D_f} has length 2^{m−1} and dimension m, and the weight distribution of C_{D_f} is
deduced from Theorem 20.4.1 and Table 20.10.
• When d = (2^{m/2} − 1)s + 1 with s ≡ 2^h(2^h ± 1)^{−1} (mod 2^{m/2} + 1), where e_2(h) <
e_2(m/2) and e_2(h) denotes the highest power of 2 dividing h, the parameters and
the weight distribution of the code C_{D_f} can be deduced from Theorem 20.4.1 and the
results in [610].
• Let d be any integer such that 1 ≤ d ≤ 2^m − 2 and d(2^ℓ + 1) ≡ 2^h (mod 2^m − 1) for
some positive integers ℓ and h. Then the parameters and the weight distribution of
the code C_{D_f} can be deduced from Theorem 20.4.1 and the results in [936].
All these cases of d above are derived from the cross-correlation of a binary maximum-
length sequence with its d-decimation version.
• When f(x) = Tr_{2^m/2}(x^{2^{m/2}+3}) where m ≥ 6 and is even, C_{D_f} is a five-weight code
with length 2^{m−1} and dimension m, and its weight distribution can be derived from
[934].
• When f(x) = Tr_{2^m/2}(ax^{(2^m−1)/3}) with Tr_{2^m/4}(a) ≠ 0 where m is even, C_{D_f} is a two-weight code with length (2^{m+2} − 4)/6 and dimension m, and its weight distribution
can be derived from [1237].
• [340] When f(x) = Tr_{2^m/2}(λx^{2^{m/2}+1}) + Tr_{2^m/2}(x)Tr_{2^m/2}(μx^{2^{m/2}−1}) where m is even,
μ ∈ F_{2^{m/2}}^*, and λ ∈ F_{2^m} with λ + λ^{2^{m/2}} = 1, C_{D_f} is a five-weight code.
• [340] When f(x) = (1 + Tr_{2^m/2}(x))Tr_{2^m/2}(λx^{2^{m/2}+1}) + Tr_{2^m/2}(x)Tr_{2^m/2}(μx^{2^{m/2}−1})
where m is even, μ ∈ F_{2^{m/2}}^*, and λ ∈ F_{2^m} with λ + λ^{2^{m/2}} = 1, C_{D_f} is a five-weight code.
Some Boolean functions f documented in [1549] also give binary linear codes CDf with five
weights.
Theorem 20.4.8 ([1872]) Let r^m ≥ 9 and let D be defined in (20.10). Then C_D is a two-weight binary code of length (q − 1)(r^m − r + 1)/r^m, dimension (r − 1)r^{m−1}, and weight
distribution given in Table 20.13.
It is known (see, e.g., [552, 1900]) that any relative difference set D of size n in (F_2^m, +)
defines a binary code C_D, where D is the support of a Boolean function on F_{2^m}, with at
most the following four weights:
(n ± √n)/2,  (n ± √(n − λℓ))/2.
20.4.12 Binary Codes with Few Weights from Plateaued Boolean Functions
Let Ψ : F_{2^m} → F_{2^m} be a mapping with Ψ(0) = 0. Set ψ_1(x) = Tr_{2^m/2}(Ψ(x)). Let f_{α,β}(x)
be the Boolean function defined by
f_{α,β}(x) := Tr_{2^m/2}(αΨ(x) − βx),
and let
C_{ψ_1} = { c̃_{α,β} = ( f_{α,β}(ζ_1), f_{α,β}(ζ_2), . . . , f_{α,β}(ζ_{2^m−1}) ) },   (20.11)
where ζ_1, . . . , ζ_{2^m−1} are the elements of F_{2^m}^*. The linear code C_{ψ_1} of length 2^m − 1 over F_2
defined by (20.11) is a k-dimensional subspace of F_2^{2^m−1}, where k = 2m if we assume that Ψ
has no linear component.
Assume that ψ1 is an s-plateaued Boolean function where m + s is an even integer with
0 ≤ s ≤ m − 2. The following theorem (compare to Theorem 20.4.4) gives the Hamming
weights of the codewords and the weight distribution of the binary three-weight linear code
Cψ1 defined by (20.11)
Theorem 20.4.10 ([1381]) With Cψ1 as in (20.11), under the above assumptions, the
Hamming weights of the codewords and the weight distribution of Cψ1 are as in Table 20.14.
Example 20.4.11 Let Ψ(x) = ζ^{18}x^5 + ζ^2x^3 be the mapping over F_{2^5}, where F_{2^5}^* = ⟨ζ⟩
with ζ^5 + ζ^2 + 1 = 0. Then ψ_1(x) = Tr_{2^5/2}(Ψ(x)) is a 3-plateaued Boolean function, and
the set C_{ψ_1} in (20.11) is a binary three-weight linear code with parameters [31, 6, 8]_2 and
weight enumerator Hwe_{C_{ψ_1}}(x, y) = y^{31} + 3x^8y^{23} + 59x^{16}y^{15} + x^{24}y^7; see Definition 1.15.1.
20.4.13 Binary Codes with Few Weights from Almost Bent Functions
Recall by Proposition 20.2.14 that a function F from F_{2^m} to F_{2^m} is almost bent (AB)
if and only if W_F(a, b) = Σ_{x∈F_{2^m}} (−1)^{Tr_{2^m/2}(aF(x)+bx)} takes the values 0 or ±2^{(m+1)/2} for
every pair (a, b) ∈ F_{2^m}^* × F_{2^m}.
A characterization of AB functions by the weight distribution of related codes has been
obtained in [350].
Theorem 20.4.12 ([350]) Let F be any vectorial (Boolean) function from F_{2^m} to F_{2^m}
such that F(0) = 0. Let H be the matrix
H = [ 1      α       α^2       · · ·   α^{2^m−2}
      F(1)   F(α)    F(α^2)    · · ·   F(α^{2^m−2}) ],
where α is a primitive element of the field F_{2^m}, and where each symbol stands for the column
of its coordinates with respect to a basis of the F_2-vector space F_{2^m}. Let C_F be the linear
code admitting H as a parity check matrix. Then F is AB if and only if C_F^⊥ (i.e., the code
admitting H as a generator matrix) has Hamming weights
0, 2^{m−1} − 2^{(m−1)/2}, 2^{m−1}, 2^{m−1} + 2^{(m−1)/2}.
By definition, almost bent functions over F_{2^m} exist only for m odd. Moreover, by definition, for every almost bent function F, the sum W_F(1, 0) ∈ {0, ±2^{(m+1)/2}}. Ding has
deduced the following result.
Theorem 20.4.13 ([550]) Let F be an almost bent function from F_{2^m} to F_{2^m} (where m is
odd) such that F(0) = 0. Define f := Tr_{2^m/2}(F). Then C_{D_f} is an [n_f, m, (n_f − 2^{(m−1)/2})/2]_2
three-weight binary code with the weight distribution as given by Table 20.15, where
n_f = #D_f = { 2^{m−1} + 2^{(m−1)/2} if W_F(1, 0) = −2^{(m+1)/2};  2^{m−1} − 2^{(m−1)/2} if W_F(1, 0) = 2^{(m+1)/2};  2^{m−1} if W_F(1, 0) = 0 }.
The list of the known AB power functions F(x) = x^d on F_{2^m} for odd m is given below
(see, e.g., [1378, Chapter 12]):
• [822] d = 2^h + 1 where gcd(m, h) = 1;
• [1087] d = 2^{2h} − 2^h + 1 where h ≥ 2 and gcd(m, h) = 1;
• [1087] d = 2^{(m−1)/2} + 3;
• [974, 986] d = 2^{(m−1)/2} + 2^{(m−1)/4} − 1 where m ≡ 1 (mod 4);
• [974, 986] d = 2^{(m−1)/2} + 2^{(3m−1)/4} − 1 where m ≡ 3 (mod 4).
The length of each code C_{D_f} obtained from this list is equal to n_f = 2^{m−1}, and the weight
distribution of the code is given in Table 20.15.
Linear Codes from Functions 489
20.4.14 Binary Codes C_{D(G)} from Functions on F_{2^m} of the Form G(x) = F(x) + F(x + 1) + 1
Definition 20.4.14 Let A and B be two finite sets, and let f be a mapping from A to B. Then f is called a 2-to-1 mapping if one of the following two cases holds:
(i) #A is even, and every b ∈ B has either 2 or 0 preimages under f; or
(ii) #A is odd, and all but one b ∈ B have either 2 or 0 preimages under f, while the exceptional element has exactly one preimage.
We define

G(x) = F(x) + F(x + 1) + 1.

By the definition of APNness, G is 2-to-1 for APN functions F(x) over F_{2^m}. For example, G(x) = x^{2^{2h}−2^h+1} + (x + 1)^{2^{2h}−2^h+1} + 1 is 2^s-to-1, where s = gcd(h, m); see [948]. In this subsection, we consider the code C_{D(G)} based on the second generic construction whose defining set is D(G).
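For instance (an illustrative check, not from the original text), with the APN Gold function F(x) = x^3 one has, in characteristic 2, G(x) = x^3 + (x + 1)^3 + 1 = x^2 + x, which is visibly 2-to-1 since x and x + 1 share the same image. A short sketch confirming the preimage counts over F_{2^5} (field elements as 5-bit integers modulo the assumed primitive polynomial x^5 + x^2 + 1):

```python
# Sketch: check that G(x) = F(x) + F(x+1) + 1 is 2-to-1 for the APN
# function F(x) = x^3 over GF(2^5) (illustrative assumptions).
from collections import Counter

M, POLY = 5, 0b100101  # x^5 + x^2 + 1

def gf_mul(a, b):
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & (1 << M):
            a ^= POLY
    return r

F = lambda x: gf_mul(x, gf_mul(x, x))  # x^3
G = lambda x: F(x) ^ F(x ^ 1) ^ 1      # addition in GF(2^m) is XOR

counts = Counter(G(x) for x in range(1 << M))
# #A = 2^5 is even: every image must have exactly 2 preimages (case (i)).
print(set(counts.values()))  # expected: {2}
```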
Theorem 20.4.15 ([552]) Let F(x) = x^{2^h+1}, and let gcd(h, m) = 1. Then C_{D(G)} is a one-weight code with parameters [2^{m−1}, m − 1, 2^{m−2}]_2.
From Ding [552], we have the following comments on other APN monomials.
• Let F(x) = x^{2^m−2}. Then C_{D(G)} is a binary code of length 2^{m−1}, dimension m, and has at most m weights. The weights are determined by the Kloosterman sums.³
• For the Niho function F(x) = x^{2^{(m−1)/2}+2^{(m−1)/4}−1} where m ≡ 1 (mod 4), the code C_{D(G)} has length 2^{m−1} and dimension m, but many weights.
• For the Niho function F(x) = x^{2^{(m−1)/2}+2^{(3m−1)/4}−1} where m ≡ 3 (mod 4), the code C_{D(G)} has length 2^{m−1} and dimension m, but many weights.
According to Ding, it would be extremely difficult to determine the weight distribution of
the code CD(G) for these three classes of APN monomials.
We consider again the codes C_{D(F)} and C_{D(F)*} based on the second generic construction. As highlighted by Ding [552], it is difficult to determine in general the parameters of these codes except in certain special cases. If 0 ∉ D(F), then the two codes C_{D(F)} and C_{D(F)*} are the same. Otherwise, the length of the code C_{D(F)*} is one less than that of the code C_{D(F)}, but the two codes have the same weight distribution. Ding has provided more information about the codes C_{D(F)*} in [552].
³The binary Kloosterman sum K_m is defined as follows: for a ∈ F_{2^m},

K_m(a) = Σ_{x∈F_{2^m}} (−1)^{Tr_{2^m/2}(ax + 1/x)},

where 1/x is interpreted as x^{2^m−2}, so that 1/0 = 0.
F(z) = z^{1/2} + (1 / Tr_{2^n/2^m}(v)) ( Tr_{2^n/2^m}(v^r)(z + 1) + Tr_{2^n/2^m}((vz + v^{2^m})^r) (z + Tr_{2^n/2^m}(v) z^{1/2} + 1)^{1−r} ).
Cyclic codes have wide applications and have also been employed to construct other mathematical structures such as quantum codes [1799], frequency hopping sequences [569], etc. Thus, it is of great interest to construct cyclic codes with good parameters.
Let q be a positive power of a prime p. One way of constructing cyclic codes over F_q of length n is to use the generator polynomial

(x^n − 1) / gcd(S^n(x), x^n − 1)    (20.12)

where S^n(x) = Σ_{i=0}^{n−1} s_i x^i ∈ F_q[x] and s^∞ = (s_i)_{i=0}^∞ is a sequence of period n over F_q. We call the cyclic code C_s with generator polynomial (20.12) the code defined by the sequence s^∞, and the sequence s^∞ the defining sequence of C_s.
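Over F_2 the polynomial arithmetic in (20.12) can be carried out with integers as bit-vectors. The sketch below (illustrative; the period-7 defining sequence is an arbitrary assumed choice) computes the generator polynomial of C_s and checks that it divides x^7 − 1.

```python
# Sketch: generator polynomial (x^n - 1) / gcd(S^n(x), x^n - 1) over GF(2),
# with polynomials stored as bit-vectors (bit i = coefficient of x^i).
def pdeg(p):
    return p.bit_length() - 1

def pmod(a, b):
    """Remainder of a divided by b over GF(2)."""
    while a and pdeg(a) >= pdeg(b):
        a ^= b << (pdeg(a) - pdeg(b))
    return a

def pgcd(a, b):
    while b:
        a, b = b, pmod(a, b)
    return a

def pdiv(a, b):
    """Quotient of a by b over GF(2) (assumes b divides a)."""
    q = 0
    while a and pdeg(a) >= pdeg(b):
        sh = pdeg(a) - pdeg(b)
        q |= 1 << sh
        a ^= b << sh
    return q

n = 7
s = [1, 0, 0, 1, 0, 1, 1]                    # one period of s^infty (assumed)
Sn = sum(si << i for i, si in enumerate(s))  # S^n(x)
xn1 = (1 << n) | 1                           # x^n - 1 = x^n + 1 over GF(2)
g = pdiv(xn1, pgcd(Sn, xn1))                 # generator polynomial of C_s
assert pmod(xn1, g) == 0                     # g(x) divides x^n - 1
print(bin(g), "dimension k =", n - pdeg(g))
```

For this particular sequence the generator polynomial comes out as x^3 + x^2 + 1, one of the primitive cubic factors of x^7 + 1, so C_s here is a [7, 4] cyclic code.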
It can be seen that every cyclic code of length n over Fq can be expressed as Cs for
some sequence s∞ of period n over Fq . For this reason, this construction of cyclic codes
is said to be fundamental. In the past decade, impressive progress, in particular thanks to
C. Ding, in the construction of cyclic codes with this approach has been made; see, e.g.,
[545, 547, 548, 549, 571, 1779, 1886].
Let s^∞ = (s_i)_{i=0}^∞ be a sequence of period n over F_q. A polynomial c(x) = Σ_{i=0}^{ℓ} c_i x^i ∈ F_q[x] with c_0 = 1 is called a characteristic polynomial of s^∞ if −s_j = c_1 s_{j−1} + c_2 s_{j−2} + · · · + c_ℓ s_{j−ℓ} for all j ≥ ℓ.
The characteristic polynomial with the smallest degree is called the minimal polynomial
of s∞ . The degree of the minimal polynomial is referred to as the linear span or linear
complexity of s∞ . Since we require that the constant term of any characteristic polynomial
be 1, the minimal polynomial of any periodic sequence s∞ must be unique. In addition, any
characteristic polynomial must be a multiple of the minimal polynomial.
For periodic sequences, there are a few ways to determine their linear span and minimal polynomials. One of them is given in the following lemma [567, page 87, Theorem 5.3]: the minimal polynomial M_s(x) of a periodic sequence s^∞ of period n over F_q is

(x^n − 1) / gcd(x^n − 1, S^n(x)).
The other one is given in the following lemma ([52, Theorem 3], [940]).
Lemma 20.5.2 ([571]) Any sequence s^∞ over F_q of period q^m − 1 has a unique expansion of the form

s_t = Σ_{i=0}^{q^m−2} c_i α^{it}  for all t ≥ 0,

where α is a generator of F*_{q^m} and each c_i ∈ F_{q^m}.
It should be noted that in some references the reciprocal of Ms (x) is called the minimal
polynomial of the sequence s∞ . Lemma 20.5.2 is a modified version of the original one [940].
Given a polynomial F(x) over F_{q^m}, one can define its associate sequence s^∞ by

s_i = Tr_{q^m/q}(F(α^i + 1))  for all i ≥ 0,    (20.13)

where α is a generator of F*_{q^m}. Any method for constructing an [n, k] cyclic code over F_q corresponds to the selection of a divisor of x^n − 1 over F_q of degree n − k, which is employed as the generator polynomial. The minimum weight d_H(C_s) and other parameters of the cyclic code C_s are determined by its generator polynomial, and hence by the polynomial F(x). It is unnecessary to require that F be highly nonlinear in order to obtain cyclic codes C_s defined by F with good parameters: both linear and highly nonlinear polynomials F can give optimal cyclic codes C_s when they are plugged into the above generic construction. As demonstrated by Ding (see, e.g., [544, 548]), it is possible to construct optimal cyclic codes meeting some bound on the parameters of linear cyclic codes, or cyclic codes with good parameters. But this generic construction may also produce codes with bad parameters; see for instance the discussion of Ding in [548].
A characterization of APN functions by the minimum distance of related codes has been obtained in [350].

Theorem 20.5.3 ([350]) Let F be any vectorial (Boolean) function from F_{2^m} to F_{2^m} such that F(0) = 0. Let H be the matrix

H = [ 1      α      α^2      · · ·   α^{2^m−2}
      F(1)   F(α)   F(α^2)   · · ·   F(α^{2^m−2}) ]

where α is a primitive element of the field F_{2^m}, and where each symbol stands for the column of its coordinates with respect to a basis of the F_2-vector space F_{2^m}. Let C_F be the linear code with parity check matrix H. Then F is APN if and only if C_F has minimum distance 5.
Note that APN and perfect nonlinear (PN) functions over finite fields have been employed to construct a number of classes of cyclic codes. APN monomials over F_{2^m} were employed to construct binary [2^m − 1, 2^m − 1 − 2m, 5]_2 cyclic codes by Carlet, Charpin, and Zinoviev [350], and PN polynomials over F_{p^m}, with p an odd prime, were used to define [p^m − 1, p^m − 1 − 2m, d]_p non-binary codes by Carlet, Ding, and Yuan [352]. The dimension of the codes obtained from these two constructions in [350] and [352] is always equal to p^m − 1 − 2m. In [548] Ding employed both PN polynomials and APN monomials to construct cyclic codes. His constructions using these polynomials are different from those in [352] as the dimension of the codes obtained in [548] is usually different from p^m − 1 − 2m.
In fact, the parameters of the cyclic codes from APN or PN polynomials obtained in [548]
do not depend on the APN or PN property of these polynomials. Further, Ding and Zhou
have constructed in [571] a number of families of binary cyclic codes with monomials and
trinomials of special types.
Ding [544, 548, 554] and Ding and Zhou [571] provided several constructions of cyclic
codes from polynomials based on the generic construction of cyclic codes with polynomials;
see Section 20.5.1. Below we give a brief description of state-of-the-art cyclic codes con-
structed from some APN functions. To get more information about those codes, we invite
the reader to consult the excellent survey [554].
Recalling Section 20.5.1, we adopt the following notation unless otherwise stated:
• p is a prime, q is a positive power of p, and m is a positive integer.
• N_p(x) is a function defined by N_p(x) = 0 if x ≡ 0 (mod p) and N_p(x) = 1 otherwise.
• α is a generator of F*_{q^m}, the multiplicative group of F_{q^m}.
• m_a(x) is the minimal polynomial of a ∈ F_{q^m} over F_q.
• s^∞ is the sequence given by (20.13).
• L_s is the linear span of s^∞.
• M_s is the minimal polynomial of s^∞.
• C_s is the cyclic code over F_q of length n = q^m − 1 with defining sequence s^∞.
Theorem 20.5.4 ([544, 571]) Let m = 2t + 1 ≥ 7 and let F(x) be the APN function F(x) = x^{2^t+3} defined over F_{2^{2t+1}}. The sequence s^∞ has L_s = 5m + 1 and

M_s(x) = (x − 1) m_{α^{−1}}(x) m_{α^{−3}}(x) m_{α^{−(2^t+1)}}(x) m_{α^{−(2^t+2)}}(x) m_{α^{−(2^t+3)}}(x).

C_s is a binary [2^m − 1, 2^m − 2 − 5m, d]_2 cyclic code with d ≥ 8 and generator polynomial M_s(x).
κ^{(h)}_{2j+1} = 1                                     if j = 2^{h−1} − 1,
κ^{(h)}_{2j+1} = ⌈log_2((2^h − 1)/(2j + 1))⌉ mod 2     if 0 ≤ j < 2^{h−1} − 1.
Theorem 20.5.5 ([544, 571]) Let F(x) be the APN function F(x) = x^{2^h−1} with 2 ≤ h ≤ m/2. The sequence s^∞ has linear span

L_s = m(2^h + (−1)^{h−1})/3          if m is even,
L_s = (m(2^h + (−1)^{h−1}) + 3)/3     if m is odd,
Theorem 20.5.6 ([544, 571]) Let m ≡ 1 (mod 4) with m ≥ 9. Let F(x) be the APN function F(x) = x^e where e = 2^{(m−1)/2} + 2^{(m−1)/4} − 1. The sequence s^∞ has linear span

L_s = (m(2^{(m+7)/4} + (−1)^{(m−5)/4}) + 3)/3         if m ≡ 1 (mod 8),
L_s = (m(2^{(m+7)/4} + (−1)^{(m−5)/4} − 6) + 3)/3     if m ≡ 5 (mod 8)
if m ≡ 1 (mod 8), and

M_s(x) = (x − 1) ∏_{i=1}^{2^{(m−1)/4}−1} m_{α^{−i−2^{(m−1)/2}}}(x) ∏_{3≤2j+1≤2^{(m−1)/4}−1, κ^{((m−1)/4)}_{2j+1}=1} m_{α^{−2j−1}}(x)

if h is odd, and

M_s(x) = (x − 1) ∏_{i=1}^{2^h−1} m_{α^{−i−2^{m−h}}}(x) ∏_{3≤2j+1≤2^h−1, κ^{(h)}_{2j+1}=1} m_{α^{−2j−1}}(x)

if h is even.
The code Cs in Theorems 20.5.5, 20.5.6, and 20.5.7 could be optimal in some cases [571].
We examine one final APN monomial.
Theorem 20.5.8 ([544, 571]) Let F be the APN function F(x) = x^{2^m−2} over F_{2^m}. Let C_i be the 2-cyclotomic coset modulo 2^m − 1 containing i, ρ_i the total number of even integers in C_i, and ℓ_i = #C_i. Let Γ be the set of coset leaders modulo n = 2^m − 1. Define

ν_i = (mρ_i / ℓ_i) mod 2.

The sequence s^∞ has linear span L_s = (n + 1)/2 and minimal polynomial

M_s(x) = ∏_{j∈Γ, ν_j=1} m_{α^{−j}}(x).
C_s is a binary [2^m − 1, 2^{m−1} − 1, d]_2 cyclic code with generator polynomial M_s(x). If m is odd, then d ≥ d_1 where d_1 is the smallest even integer satisfying d_1^2 − d_1 + 1 ≥ 2^m − 1; moreover, the dual code C_s^⊥ is a binary [2^m − 1, 2^{m−1}, d^⊥]_2 cyclic code where (d^⊥)^2 − d^⊥ + 1 ≥ 2^m − 1.
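The quantities in Theorem 20.5.8 are easy to tabulate; the sketch below (an illustration, not from the text) computes the 2-cyclotomic cosets modulo 2^m − 1 for m = 5, evaluates ν_i on the coset leaders, and confirms that the degrees of the selected minimal polynomials add up to L_s = (n + 1)/2 = 16.

```python
# Sketch: verify L_s = (n + 1)/2 in Theorem 20.5.8 for m = 5 (illustrative).
m = 5
n = 2 ** m - 1  # 31

seen, cosets = set(), []
for i in range(n):
    if i in seen:
        continue
    c, j = [], i
    while j not in c:
        c.append(j)
        j = (2 * j) % n
    seen.update(c)
    cosets.append(c)  # c[0] is the coset leader (smallest element)

Ls = 0
for c in cosets:
    rho = sum(1 for x in c if x % 2 == 0)  # number of even integers in C_i
    ell = len(c)
    nu = (m * rho // ell) % 2
    if nu == 1:
        Ls += ell  # deg m_{alpha^{-j}} equals the coset size
print(Ls)  # expected: (n + 1) // 2 = 16
```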
When F(x) = x^{q^m−2} and q > 2, the dimension of the code C_s over F_q was settled in [1779], but no lower bound on the minimum distance of C_s has been developed.
Theorem 20.5.9 ([548]) Let m and q be odd, and let κ ≥ 0. Let C_s be the code defined by the sequence s^∞ given by (20.13) where F(x) = x^{q^κ+1}. Then the linear span L_s of s^∞ is equal to 2m + N_p(m), and the minimal polynomial M_s(x) of s^∞ is

M_s(x) = (x − 1)^{N_p(m)} m_{α^{−1}}(x) m_{α^{−(q^κ+1)}}(x).
Before proceeding, we need a notation that will be used in the next result and later in the chapter:

• For x ∈ F_{q^m}, δ(x) = 0 if Tr_{q^m/q}(x) = 0, and δ(x) = 1 otherwise.
Ding [548] studied the code C_s obtained from the trinomial F(x) = x^{10} − ux^6 − u^2x^2 when q = 3, m is odd, and u ∈ F_{3^m}. We present the results in the next theorem and example. The lower bounds on d in the theorem come from the BCH Bound while the upper bounds on d follow from the Sphere Packing Bound.
Theorem 20.5.11 ([548]) Let m be odd and q = 3. Let s^∞ be the sequence in (20.13) where F(x) = x^{10} − ux^6 − u^2x^2 with u ∈ F_{3^m}. Then the linear span L_s of s^∞ is

L_s = 2m + δ(u^2 + u − 1)   if u^6 + u = 0,
L_s = 3m + δ(u^2 + u − 1)   otherwise,

and the minimal polynomial M_s(x) of s^∞ is

M_s(x) = (x − 1)^{δ(u^2+u−1)} m_{α^{−1}}(x) m_{α^{−10}}(x)               if u^6 + u = 0,
M_s(x) = (x − 1)^{δ(u^2+u−1)} m_{α^{−1}}(x) m_{α^{−2}}(x) m_{α^{−10}}(x)  otherwise.

The code C_s defined by the sequence s^∞ has generator polynomial M_s(x) and parameters [3^m − 1, 3^m − 1 − L_s, d]_3 where

5 ≤ d ≤ 8  if u^6 + u ≠ 0 and δ(u^2 + u − 1) = 1,
4 ≤ d ≤ 6  if u^6 + u ≠ 0 and δ(u^2 + u − 1) = 0,
3 ≤ d ≤ 6  if u^6 + u = 0 and δ(u^2 + u − 1) = 1,
3 ≤ d ≤ 4  if u^6 + u = 0 and δ(u^2 + u − 1) = 0.
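The upper bounds quoted from the Sphere Packing Bound are straightforward to recompute. The sketch below (an illustration, not part of the text) finds, for given n, k, q, the largest d consistent with Σ_{i=0}^{⌊(d−1)/2⌋} C(n, i)(q − 1)^i ≤ q^{n−k}, and reproduces the bounds d ≤ 8 and d ≤ 6 for the two q = 3, m = 3 parameter sets of Theorem 20.5.11.

```python
# Sketch: Sphere Packing (Hamming) upper bound on the minimum distance of an
# [n, k]_q code (illustrative check of the bounds in Theorem 20.5.11).
from math import comb

def sphere_packing_bound(n, k, q):
    """Largest d with sum_{i <= floor((d-1)/2)} C(n,i)(q-1)^i <= q^(n-k)."""
    budget = q ** (n - k)
    t, total = 0, 1  # t = packing radius, total = ball size
    while True:
        nxt = total + comb(n, t + 1) * (q - 1) ** (t + 1)
        if nxt > budget:
            break
        t, total = t + 1, nxt
    return 2 * t + 2  # floor((d-1)/2) <= t  ==>  d <= 2t + 2

# q = 3, m = 3: n = 26; L_s = 3m + delta = 10 or 9, so k = 16 or 17.
print(sphere_packing_bound(26, 16, 3))  # expected 8
print(sphere_packing_bound(26, 17, 3))  # expected 6
```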
and

M_s(x) = (x − 1)^{N_p(m)} m_{α^{−1}}(x)^{N_p(h)} ∏_{1≤u≤h−1, N_p(h−u)=1} m_{α^{−(q^0+q^u)}}(x) × ∏_{t=2}^{h−1} ∏_{t≤u≤h−1, N_p(h−u)=1} ∏_{1≤i_1<···<i_{t−1}<u} m_{α^{−(q^0 + Σ_{j=1}^{t−1} q^{i_j} + q^u)}}(x).
The code C_s defined by the sequence s^∞ has parameters [q^m − 1, q^m − 1 − L_s, d]_q and generator polynomial M_s(x).
As a corollary of Theorem 20.5.13, Ding has proved the following result when h = 3.
Corollary 20.5.14 ([548]) Let h = 3 and m ≥ 6. The code C_s of Theorem 20.5.13 has parameters [q^m − 1, q^m − 1 − L_s, d]_q where

L_s = 4m + N_p(m)   if p ≠ 3,
L_s = 3m + N_p(m)   if p = 3.

In addition,

3 ≤ d ≤ 8  if p = 3 and N_p(m) = 1,
3 ≤ d ≤ 6  if p = 3 and N_p(m) = 0,
3 ≤ d ≤ 8  if p > 3.
In the case q = 3, Ding [548] also studied the code C_s defined by the sequence s^∞ of (20.13) with F(x) = x^{(3^h+1)/2} where h is odd with 1 ≤ h ≤ m/2 and gcd(m, h) = 1. When h = 1, the code C_s becomes a special case of a code in Theorem 20.5.9. The following theorem provides information on the code C_s when h ≥ 3.
and

M_s(x) = (x − 1)^{N_3(m)} m_{α^{−1}}(x)^{N_3(h+1)} m_{α^{−2}}(x) ∏_{t=1}^{h−1} ∏_{1≤i_1<···<i_t≤h−1} m_{α^{−(2 + Σ_{j=1}^{t} 3^{i_j})}}(x) × ∏_{1≤u≤h−1, N_3(h−u+1)=1} m_{α^{−(1+3^u)}}(x) ∏_{t=2}^{h−1} ∏_{t≤i_t≤h−1, N_3(h−i_t+1)=1} ∏_{1≤i_1<···<i_{t−1}<i_t} m_{α^{−(1 + Σ_{j=1}^{t} 3^{i_j})}}(x).
As a corollary of Theorem 20.5.15, Ding proved the following result on the code Cs when
h = 3; the lower bound on d follows from the BCH Bound and the upper bound follows
from the Sphere Packing Bound.
Corollary 20.5.16 ([548]) The code C_s of Theorem 20.5.15 with h = 3 has parameters [3^m − 1, 3^m − 1 − L_s, d]_3 where L_s = 7m + N_3(m) and 4 + N_3(m) ≤ d ≤ 16. C_s has generator polynomial M_s(x) given by

M_s(x) = (x − 1)^{N_3(m)} m_{α^{−1}}(x) m_{α^{−2}}(x) m_{α^{−5}}(x) m_{α^{−10}}(x) m_{α^{−11}}(x) m_{α^{−13}}(x) m_{α^{−14}}(x).
In [548], Ding also provided a list of nice open problems related to cyclic codes from
monomials and trinomials over finite fields.
Theorem 20.5.17 ([553, 554]) The cyclic code C_s over F_q defined by the sequence in (20.13) arising from the Dickson polynomial F(x) = D_{p^u,a}(x) = x^{p^u} has generator polynomial M_s(x) = (x − 1)^{δ(1)} m_{α^{−p^u}}(x) and parameters [q^m − 1, q^m − 1 − m − δ(1), d]_q where

d = 4  if q = 2 and δ(1) = 1,
d = 3  if q = 2 and δ(1) = 0,
d = 3  if q > 2 and δ(1) = 1,
d = 2  if q > 2 and δ(1) = 0.
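The Dickson polynomials of the first kind used in this and the following theorems satisfy D_0(x, a) = 2, D_1(x, a) = x, and D_n(x, a) = x·D_{n−1}(x, a) − a·D_{n−2}(x, a). The sketch below (an illustration, not from the text) generates them symbolically, with a polynomial in x and a stored as a dict mapping (i, j) to the coefficient of x^i a^j, and recovers the explicit forms quoted in the surrounding theorems (e.g., D_5 = x^5 − 5ax^3 + 5a^2x).

```python
# Sketch: Dickson polynomials D_n(x, a) via the recurrence
# D_n = x*D_{n-1} - a*D_{n-2}, represented as {(i, j): coeff of x^i a^j}.
def dickson(n):
    d0, d1 = {(0, 0): 2}, {(1, 0): 1}  # D_0 = 2, D_1 = x
    if n == 0:
        return d0
    for _ in range(n - 1):
        nxt = {}
        for (i, j), c in d1.items():          # x * D_{k-1}
            nxt[(i + 1, j)] = nxt.get((i + 1, j), 0) + c
        for (i, j), c in d0.items():          # - a * D_{k-2}
            nxt[(i, j + 1)] = nxt.get((i, j + 1), 0) - c
        d0, d1 = d1, {k: v for k, v in nxt.items() if v}
    return d1

print(dickson(2))  # x^2 - 2a
print(dickson(3))  # x^3 - 3ax
print(dickson(4))  # x^4 - 4ax^2 + 2a^2
print(dickson(5))  # x^5 - 5ax^3 + 5a^2x
```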
Theorem 20.5.18 ([553, 554]) Let p > 2 and m ≥ 3. The code C_s defined by the sequence in (20.13) arising from the Dickson polynomial F(x) = D_{2,a}(x) = x^2 − 2a with a ∈ F_{q^m} has parameters [q^m − 1, q^m − 1 − 2m − δ(1 − 2a), d]_q and generator polynomial M_s(x), where

d = 4  if q = 3 and δ(1 − 2a) = 0,
d = 5  if q = 3 and δ(1 − 2a) = 1,
d = 3  if q > 3 and δ(1 − 2a) = 0,
d = 4  if q > 3 and δ(1 − 2a) = 1.
Again using the Sphere Packing Bound, the code in Theorem 20.5.18 is either optimal
or almost optimal for all m ≥ 2.
The cyclic codes C_s arising from the Dickson polynomial D_{3,a}(x) = x^3 − 3ax were studied by Ding, who distinguished among the three cases p = 2, p = 3, and p ≥ 5. The case p = 3 was addressed previously in Theorem 20.5.17, so we give information for only the two remaining cases. The following theorem provides information on the code C_s when q = p = 2.
Theorem 20.5.19 ([553, 554]) Let q = p = 2 and let m ≥ 4. Let s^∞ be the sequence in (20.13) where F(x) = D_{3,a}(x) = x^3 − 3ax = x^3 + ax with a ∈ F_{2^m}. Then the minimal polynomial M_s(x) of s^∞ is given in [553].
As highlighted by Ding, when a = 0 and δ(1) = 1, the code is equivalent to the even-
weight subcode of the Hamming code. When a = 1, the code Cs is a double-error correcting
binary BCH code or its even-weight subcode. Theorem 20.5.19 shows that well-known classes
of cyclic codes can be constructed with Dickson polynomials of order 3. The code is either
optimal or almost optimal.
In the case where q = pt with p ≥ 5 or p = 2 and t ≥ 2, Ding has given the following
result.
Theorem 20.5.20 ([553, 554]) Let q = p^t where p ≥ 5, or p = 2 and t ≥ 2. Let s^∞ be the sequence in (20.13) where F(x) = D_{3,a}(x) = x^3 − 3ax with a ∈ F_{q^m}. Then the minimal polynomial M_s(x) of s^∞ is

M_s(x) = (x − 1)^{δ(−2)} m_{α^{−3}}(x) m_{α^{−2}}(x)                   if a = 1,
M_s(x) = (x − 1)^{δ(1−3a)} m_{α^{−3}}(x) m_{α^{−2}}(x) m_{α^{−1}}(x)   if a ≠ 1,

and the linear span L_s of s^∞ is

L_s = δ(−2) + 2m       if a = 1,
L_s = δ(1 − 3a) + 3m   if a ≠ 1.

Moreover, the code C_s defined by the sequence s^∞ has generator polynomial M_s(x) and parameters [q^m − 1, q^m − 1 − L_s, d]_q where

d ≥ 3  if a = 1,
d ≥ 4  if a ≠ 1 and δ(1 − 3a) = 0,
d ≥ 5  if a ≠ 1 and δ(1 − 3a) = 1,
d ≥ 5  if a ≠ 1 and δ(1 − 3a) = 0 and q = 4,
d ≥ 6  if a ≠ 1 and δ(1 − 3a) = 1 and q = 4.
The code C_s of Theorem 20.5.20 is either a BCH code or the even-like subcode of a BCH code. Ding has shown that the code is either optimal or almost optimal.
Similarly, Ding studied cyclic codes coming from the Dickson polynomial D_{4,a}(x) = x^4 − 4ax^2 + 2a^2, where he distinguished among the three cases p = 2, p = 3, and p ≥ 5. The case p = 2 was covered by a code presented in Theorem 20.5.17. The following result gives information about the cyclic code C_s when q = p = 3.
Theorem 20.5.21 ([553, 554]) Let q = p = 3 and m ≥ 3. Let s^∞ be the sequence in (20.13) where F(x) = D_{4,a}(x) = x^4 − 4ax^2 + 2a^2 with a ∈ F_{3^m}. Then the minimal polynomial M_s(x) of s^∞ is given in [553].
Moreover, the code C_s defined by the sequence s^∞ has generator polynomial M_s(x) and parameters [3^m − 1, 3^m − 1 − L_s, d]_3 where

d = 2  if a = 1,
d = 3  if a = 0 and m ≡ 0 (mod 6),
d ≥ 4  if a = 0 and m ≢ 0 (mod 6),
d ≥ 5  if a^2 ≠ a and δ(1 − a − a^2) = 0,
d = 6  if a^2 ≠ a and δ(1 − a − a^2) = 1.
Ding showed that when a = 1, the code in Theorem 20.5.21 is neither optimal nor almost
optimal. The code is either optimal or almost optimal in all other cases.
In the case where q = pt , with p ≥ 5 or p = 3 and t ≥ 2, Ding proved the following
result about the code Cs .
Theorem 20.5.22 ([553, 554]) Let m ≥ 2 and q = p^t where p ≥ 5, or p = 3 and t ≥ 2. Let s^∞ be the sequence in (20.13) where F(x) = D_{4,a}(x) = x^4 − 4ax^2 + 2a^2 with a ∈ F_{q^m}. Then the minimal polynomial M_s(x) of s^∞ is

M_s(x) = (x − 1)^{δ(1)} m_{α^{−4}}(x) m_{α^{−3}}(x) m_{α^{−1}}(x)   if a = 3/2,
M_s(x) = (x − 1)^{δ(1)} m_{α^{−4}}(x) m_{α^{−3}}(x) m_{α^{−2}}(x)   if a = 1/2,
M_s(x) = (x − 1)^{δ(1−4a+2a^2)} ∏_{i=1}^{4} m_{α^{−i}}(x)            if a ∉ {3/2, 1/2}.
As indicated by Ding, the code C_s of Theorem 20.5.22 is either optimal or almost optimal, except in the cases that a ∈ {3/2, 1/2}.
Ding studied cyclic codes C_s determined by the Dickson polynomial D_{5,a}(x) = x^5 − 5ax^3 + 5a^2x, distinguishing among the three cases: p = 2 (with three subcases), p = 3 (with two subcases), and p ≥ 7. The case p = 5 was included in Theorem 20.5.17.
When q = p = 2, the following result describes parameters of the code Cs .
Theorem 20.5.23 ([553, 554]) Let q = p = 2 and m ≥ 5. Let s^∞ be the sequence in (20.13) where F(x) = D_{5,a}(x) = x^5 − 5ax^3 + 5a^2x with a ∈ F_{2^m}. Then the minimal polynomial M_s(x) of s^∞ is

M_s(x) = (x − 1)^{δ(1)} m_{α^{−5}}(x)   if a = 0.
Moreover, the code C_s defined by the sequence s^∞ has generator polynomial M_s(x) and parameters [2^m − 1, 2^m − 1 − L_s, d]_2 where

d = 2  if a = 0 and δ(1) = 0 and 5 | (2^m − 1),
d = 3  if a = 0 and δ(1) = 0 and 5 ∤ (2^m − 1),
d = 4  if a = 0 and δ(1) = 1,
d ≥ 3  if 1 + a + a^3 = 0 and δ(1) = 0,
d ≥ 4  if 1 + a + a^3 = 0 and δ(1) = 1,
d ≥ 7  if a + a^2 + a^4 ≠ 0 and δ(1) = 0,
d = 8  if a + a^2 + a^4 ≠ 0 and δ(1) = 1.
As shown by Ding, the code in Theorem 20.5.23 is either optimal or almost optimal. The code is not a BCH code when 1 + a + a^3 = 0 and is a BCH code in the remaining cases.
For the case where (p, q) = (2, 4), Ding has proved the following result on the code Cs .
Moreover, the code C_s defined by the sequence s^∞ has generator polynomial M_s(x) and parameters [4^m − 1, 4^m − 1 − L_s, d]_4 where

d = 2  if a = 0 and δ(1) = 0 and 5 | (4^m − 1),
d = 3  if a = 0 and 5 ∤ (4^m − 1),
d ≥ 3  if a = 1,
d ≥ 6  if a + a^2 ≠ 0 and δ(1) = 0,
d ≥ 7  if a + a^2 ≠ 0 and δ(1) = 1.
Examples of the code in Theorem 20.5.24 are documented in [553], and many of them
are optimal.
We turn to the case where (p, q) = (2, 2t ) with t ≥ 3.
Moreover, the code C_s defined by the sequence s^∞ has generator polynomial M_s(x) and parameters [q^m − 1, q^m − 1 − L_s, d]_q where

d ≥ 3  if a = 0 and δ(1) = 0,
d ≥ 4  if a = 0 and δ(1) = 1,
d ≥ 5  if 1 + a + a^2 = 0,
d ≥ 6  if a + a^2 + a^3 ≠ 0 and δ(1) = 0,
d ≥ 7  if a + a^2 + a^3 ≠ 0 and δ(1) = 1.
Examples of the code in Theorem 20.5.25 can be found in [553], and many of them are
optimal. The code in Theorem 20.5.25 is not a BCH code when a = 0 and is a BCH code
otherwise.
When q = p = 3, Ding has stated the following result.
Theorem 20.5.26 ([553, 554]) Let q = p = 3 and m ≥ 3. Let s^∞ be the sequence in (20.13) where F(x) = D_{5,a}(x) = x^5 − 5ax^3 + 5a^2x with a ∈ F_{3^m}. Then the minimal polynomial M_s(x) of s^∞ is

M_s(x) = (x − 1)^{δ(1+a+2a^2)} m_{α^{−5}}(x) m_{α^{−4}}(x) m_{α^{−2}}(x)   if a − a^6 = 0,
M_s(x) = (x − 1)^{δ(1+a+2a^2)} ∏_{i=2}^{5} m_{α^{−i}}(x)                    if a − a^6 ≠ 0,

and the linear span L_s of s^∞ is

L_s = δ(1 + a + 2a^2) + 3m   if a − a^6 = 0,
L_s = δ(1 + a + 2a^2) + 4m   if a − a^6 ≠ 0.

Moreover, the code C_s defined by the sequence s^∞ has generator polynomial M_s(x) and parameters [3^m − 1, 3^m − 1 − L_s, d]_3 where

d ≥ 4  if a − a^6 = 0,
d ≥ 7  if a − a^6 ≠ 0 and δ(1 + a + 2a^2) = 0,
d ≥ 8  if a − a^6 ≠ 0 and δ(1 + a + 2a^2) = 1.
Examples of the code in Theorem 20.5.26 are described in [553], and some of them are
optimal.
The case (p, q) = (3, 3t ) where t ≥ 2 is stated in the following theorem.
Theorem 20.5.27 ([553, 554]) Let p = 3, q = 3^t with t ≥ 2, and m ≥ 2. Let s^∞ be the sequence in (20.13) where F(x) = D_{5,a}(x) = x^5 − 5ax^3 + 5a^2x with a ∈ F_{q^m}. Then the minimal polynomial M_s(x) of s^∞ is

M_s(x) = (x − 1)^{δ(1)} m_{α^{−5}}(x) m_{α^{−4}}(x) m_{α^{−2}}(x) m_{α^{−1}}(x)     if 1 + a = 0,
M_s(x) = (x − 1)^{δ(a−1)} m_{α^{−5}}(x) m_{α^{−4}}(x) m_{α^{−3}}(x) m_{α^{−2}}(x)   if 1 + a^2 = 0,
M_s(x) = (x − 1)^{δ(1+a+2a^2)} ∏_{i=1}^{5} m_{α^{−i}}(x)                             if (a + 1)(a^2 + 1) ≠ 0.

Moreover, the code C_s defined by the sequence s^∞ has generator polynomial M_s(x) and parameters [q^m − 1, q^m − 1 − L_s, d]_q where

d ≥ 3  if a = −1 and δ(1) = 0,
d ≥ 4  if a = −1 and δ(1) = 1,
d ≥ 5  if a^2 = −1 and δ(a − 1) = 0,
d ≥ 6  if a^2 = −1 and δ(a − 1) = 1,
d ≥ 6  if (a + 1)(a^2 + 1) ≠ 0 and δ(1 + a + 2a^2) = 0,
d ≥ 7  if (a + 1)(a^2 + 1) ≠ 0 and δ(1 + a + 2a^2) = 1.
Examples of the code in Theorem 20.5.27 are available in [553], and some of them are
optimal. The code Cs is a BCH code, except in the case that a = −1.
Finally, in the case p ≥ 7, Ding has shown the following result.
Theorem 20.5.28 ([553, 554]) Let p ≥ 7 and m ≥ 2. Let s^∞ be the sequence in (20.13) where F(x) = D_{5,a}(x) = x^5 − 5ax^3 + 5a^2x with a ∈ F_{q^m}. Then the minimal polynomial M_s(x) of s^∞ is

M_s(x) = (x − 1)^{δ(1−5a+5a^2)} m_{α^{−5}}(x) m_{α^{−4}}(x) m_{α^{−2}}(x) m_{α^{−1}}(x)   if a = 2,
M_s(x) = (x − 1)^{δ(1−5a+5a^2)} m_{α^{−5}}(x) m_{α^{−4}}(x) m_{α^{−3}}(x) m_{α^{−1}}(x)   if a = 2/3,
M_s(x) = (x − 1)^{δ(1−5a+5a^2)} m_{α^{−5}}(x) m_{α^{−4}}(x) m_{α^{−3}}(x) m_{α^{−2}}(x)   if a^2 − 3a + 1 = 0,
M_s(x) = (x − 1)^{δ(1−5a+5a^2)} ∏_{i=1}^{5} m_{α^{−i}}(x)                                  otherwise.
Examples of the code in Theorem 20.5.28 can be found in [553], and some of them are
optimal. The code Cs is a BCH code, except in the cases a ∈ {2, 2/3}.
20.6 Codes with Few Weights from p-Ary Functions with p Odd
20.6.1 Codes with Few Weights from p-Ary Weakly Regular Bent Func-
tions Based on the First Generic Construction
Let p be an odd prime, Ψ : F_{p^m} → F_{p^m} with Ψ(0) = 0, and ψ_1(x) := Tr_{p^m/p}(Ψ(x)). Let C_Ψ be the code defined by (20.4). Analogous to (20.11), define the subcode C_{ψ_1} of C_Ψ by

C_{ψ_1} := { c̃_{α,β} = ( f_{α,β}(ζ_1), f_{α,β}(ζ_2), . . . , f_{α,β}(ζ_{p^m−1}) ) | α ∈ F_p, β ∈ F_{p^m} }.    (20.14)
TABLE 20.16: Weight distribution of Cψ1 in Theorem 20.6.2 for m even and p odd
TABLE 20.17: Weight distribution of Cψ1 in Theorem 20.6.2 for m and p odd
The following theorem presents the weight distribution of the code Cψ1 .
Theorem 20.6.2 ([1379]) Let Cψ1 be the three-weight code in Theorem 20.6.1. Then Cψ1
has dimension m + 1; its Hamming weight distribution is given in Table 20.16 when m is
even and in Table 20.17 when m is odd.
20.6.2 Linear Codes with Few Weights from Cyclotomic Classes and
Weakly Regular Bent Functions
Let p be an odd prime and q = p^m. Recently (2020), Y. Wu, N. Li and X. Zeng [1911] generalized the construction of linear codes of length n defined by

C = { (Tr_{q/p}(ax))_{x∈D} | a ∈ F_q },
where D is the defining set of C. More specifically, they investigated codes C_D of the form

C_D = { c(a, b) = (Tr_{q/p}(ax + by))_{(x,y)∈D} | a, b ∈ F_q }    (20.15)

where D ⊆ F_q^2, by employing weakly regular bent functions (Definition 20.2.17) and cyclotomic classes over finite fields.
Let α be a generator of F*_q.
Definition 20.6.3 Let e be a positive integer with e | (q − 1). Define C_i = α^i ⟨α^e⟩ for i = 0, 1, . . . , e − 1, where ⟨α^e⟩ denotes the subgroup of F*_q generated by α^e. The cosets C_i are called the cyclotomic classes of order e with respect to F_q. The Gauss periods are defined by

η_i = Σ_{x∈C_i} ξ_p^{Tr_{q/p}(x)},

where ξ_p = e^{2πi/p} is the complex primitive p-th root of unity.
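For a quick sanity check (an illustration, not from the text), one can compute the order-2 cyclotomic classes and Gauss periods in a prime field F_p, where C_0 is the set of nonzero squares: since every nonzero element lies in exactly one class, η_0 + η_1 = Σ_{x≠0} ξ_p^x = −1.

```python
# Sketch: Gauss periods of order e = 2 in the prime field F_11 (illustrative).
import cmath

p = 11
xi = cmath.exp(2j * cmath.pi / p)        # primitive p-th root of unity
g = 2                                    # 2 generates F_11^* (an assumption)
C = [set(), set()]
for k in range(p - 1):
    C[k % 2].add(pow(g, k, p))           # cyclotomic classes C_0, C_1

eta = [sum(xi ** x for x in Ci) for Ci in C]
print(abs(eta[0] + eta[1] + 1) < 1e-9)   # eta_0 + eta_1 = -1, so prints True
```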
Concretely, Wu et al. first considered the linear codes of the form (20.15) by selecting
the defining set as
D = {(x, y) ∈ F_q^2 | x ∈ C_i, y ∈ C_j},    (20.16)
where Ci , Cj are any two cyclotomic classes of order e. It has been seen that the weights of
CD depend on the values of Gauss periods of order e which are normally difficult to calculate.
By using the semi-primitive case of cyclotomic classes of order e, they showed that CD is
a five-weight linear code and determined its weight distribution according to Gauss sums.
As a special case, when e = 2 and m is odd, CD is proved to be a two-weight linear code.
Moreover, Wu et al. derived new two-weight linear codes and three-weight linear codes of
the form (20.15) from weakly regular bent functions. The defining set D for this case is of
the form
D = { (x, y) ∈ F_q^2 \ {(0, 0)} | f(x) + g(y) = 0 },    (20.17)
and the following two cases are considered to construct linear codes:
(1) f (x) = Trq/p (x) and g(y) is a weakly regular bent function, or
(2) both f (x) and g(y) are weakly regular bent functions.
By calculating Weil sums and using the properties of weakly regular bent functions, it was
shown that a class of three-weight linear codes can be obtained from the first case, and the
second case leads to both two-weight and three-weight linear codes. We present below the
main results in [1911].
We first consider linear codes obtained from cyclotomic classes. When D is as defined
in (20.16), let
N_{a,b}(0) = #{(x, y) ∈ D | Tr_{q/p}(ax + by) = 0}.

Then the Hamming weight of the codeword c(a, b) ∈ C_D can be expressed as

wt_H(c(a, b)) = #D − N_{a,b}(0).

For the special case e = 2, Wu et al. have derived the following result.
TABLE 20.19: Weight distribution of CD in Theorem 20.6.5 for m even, p odd, and e = 2
Theorem 20.6.5 ([1911]) Let the linear code C_D be defined by (20.15) and (20.16) with e = 2. If m is even, then C_D is a five-weight [(p^m − 1)^2/4, 2m]_p code over F_p with the weight distribution in Table 20.19; otherwise C_D is a two-weight [(p^m − 1)^2/4, 2m]_p code over F_p with the weight distribution in Table 20.20.
TABLE 20.20: Weight distribution of CD in Theorem 20.6.5 for m odd, p odd, and e = 2
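As a tiny instance of the m odd case of Theorem 20.6.5 (an illustration, not from the text; p = 5, m = 1, so Tr_{q/p} is the identity), one can enumerate the codewords c(a, b) over D = C_0 × C_0 with C_0 = {1, 4} the nonzero squares mod 5 and observe exactly two nonzero weights.

```python
# Sketch: the e = 2 construction over F_5 (m = 1, trace = identity);
# D = C_0 x C_0 with C_0 the nonzero squares mod 5 (illustrative).
p = 5
C0 = {x * x % p for x in range(1, p)}          # {1, 4}
D = [(x, y) for x in C0 for y in C0]           # defining set, #D = 4

weights = set()
for a in range(p):
    for b in range(p):
        w = sum((a * x + b * y) % p != 0 for (x, y) in D)
        weights.add(w)
print(sorted(weights - {0}))  # expected: two nonzero weights [2, 4]
```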
TABLE 20.21: Weight distribution of CD in Theorem 20.6.6 for m even and p odd
This construction is valid when both f(x) and g(y) are in RF. Let ε_g denote the sign of the Walsh transform of g ∈ RF (see (20.3)) and set p^∗ = (−1)^{(p−1)/2} p.
Theorem 20.6.6 ([1911]) Let f(x) = Tr_{q/p}(x) and assume g(y) is a weakly regular bent function in RF. Then the linear code C_D defined by (20.15) and (20.17) is a three-weight [p^{2m−1} − 1, 2m]_p code over F_p with the weight distribution in Table 20.21 when m is even and in Table 20.22 when m is odd.
Theorem 20.6.7 ([1911]) Let f(x) and g(y) be any two weakly regular bent functions in RF, where l_f, l_g, ε_f, and ε_g are defined as before. If l_f ≠ l_g and m is even, then the linear code C_D defined by (20.15) and (20.17) is a three-weight [p^{2m−1} + ((p − 1)/p) ε_f ε_g (p^∗)^m − 1, 2m]_p code over F_p with the weight distribution given in Table 20.23. If l_f = l_g, then C_D is a two-weight [p^{2m−1} + ((p − 1)/p) ε_f ε_g (p^∗)^m − 1, 2m]_p code over F_p with the weight distribution in Table 20.24.
20.6.3 Codes with Few Weights from p-Ary Weakly Regular Bent Func-
tions Based on the Second Generic Construction
Zhou, Li, Fan and Helleseth [1947] derived several classes of p-ary linear codes with two or three weights constructed from quadratic bent functions over F_p where p is an odd prime. Let Q be a quadratic bent function from F_{p^m} to F_p. Define D_Q = {x ∈ F*_{p^m} | Q(x) = 0}. Then if m is odd, we have #D_Q = p^{m−1} − 1, and if m is even, we have #D_Q = p^{m−1} + ε(p − 1)p^{(m−2)/2} − 1 where ε ∈ {−1, 1}. Denote by C_{D_Q} the linear code based on the second generic construction whose defining set is D_Q.

Theorem 20.6.8 ([1947]) With the above notation the following hold. If m is odd, then C_{D_Q} is a three-weight linear code with parameters [p^{m−1} − 1, m]_p. If m is even, then C_{D_Q} is a two-weight linear code with parameters [p^{m−1} + ε(p − 1)p^{(m−2)/2} − 1, m]_p. The weight distributions and the description of ε are given in [1947].
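As a minimal illustration of the m even case (not from the original text; it assumes the representation F_9 = F_3[t]/(t^2 + 1) and the quadratic bent function Q(x) = Tr_{9/3}(x^2), with Tr_{9/3}(a + bt) = 2a), the sketch below builds D_Q and the code C_{D_Q} and checks that exactly two nonzero weights occur.

```python
# Sketch: C_{D_Q} for the quadratic bent function Q(x) = Tr(x^2) on F_9,
# with F_9 = F_3[t]/(t^2 + 1) represented as pairs (a, b) = a + b*t
# (illustrative choices, not from the text).
def mul(u, v):
    a, b = u
    c, d = v
    return ((a * c - b * d) % 3, (a * d + b * c) % 3)  # t^2 = -1

def tr(u):                      # Tr_{9/3}(a + bt) = (a + bt) + (a - bt) = 2a
    return (2 * u[0]) % 3

F9 = [(a, b) for a in range(3) for b in range(3)]
DQ = [x for x in F9 if x != (0, 0) and tr(mul(x, x)) == 0]

weights = set()
for beta in F9:
    w = sum(tr(mul(beta, d)) != 0 for d in DQ)
    weights.add(w)
print(len(DQ), sorted(weights - {0}))  # expected: 4 [2, 4]
```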
Tang, Li, Qi, Zhou and Helleseth [1778] generalized their approach to weakly regular
bent functions. More precisely, they derived linear codes with two or three weights from the
subclass RF of p-ary weakly regular bent functions defined in Section 20.6.2.
Given a p-ary function f : F_{p^m} → F_p in RF, define D_f := {x ∈ F_{p^m} | f(x) = 0} and denote by C_{D_f} the related linear code based on the second generic construction whose defining set is D_f.
Theorem 20.6.9 ([1778]) Let f be a function in RF. If m is odd, then C_{D_f} is a three-weight linear code with parameters [p^{m−1} − 1, m]_p. If m is even, then C_{D_f} is a two-weight linear code with parameters [p^{m−1} − 1 + ε(p − 1)p^{(m−2)/2}, m]_p where ε is the sign of the Walsh transform of f. The weight distributions are given in [1778].
20.6.4 Codes with Few Weights from p-Ary Weakly Regular Plateaued
Functions Based on the First Generic Construction
Let p be an odd prime. Let Ψ : Fpm → Fpm be a mapping with Ψ(0) = 0. We examine
the code Cψ1 given by (20.14) where ψ1 (x) = Trpm /p (Ψ(x)) is a weakly regular s-plateaued
p-ary function with 0 ≤ s ≤ m − 2.
Throughout this subsection η_0 = (−1)^{(p−1)/2}. The following theorem describes the Hamming weights of the codewords of C_{ψ_1}.
Theorem 20.6.10 ([1381]) Let C_{ψ_1} be the linear p-ary code defined by (20.14). Assume that ψ_1 = Tr_{p^m/p}(Ψ) is weakly regular p-ary s-plateaued with 0 ≤ s ≤ m − 2. Let g be a p-ary function over supp(W_{ψ_1}) as described in [1381] and [1018, Theorem 2]. Let ε = ±1 be the sign of W_{ψ_1}. For α ∈ F*_p, ᾱ satisfies αᾱ = 1. The weight distribution of C_{ψ_1} is as follows.

(a) wt_H(c̃_{0,0}) = 0 and wt_H(c̃_{0,β}) = p^m − p^{m−1} for β ∈ F*_{p^m}.

(b) For α ∈ F*_p and β ∈ F_{p^m}, if ᾱβ ∉ supp(W_{ψ_1}), then wt_H(c̃_{α,β}) = p^m − p^{m−1}.

(c) For α ∈ F*_p and β ∈ F_{p^m}, if ᾱβ ∈ supp(W_{ψ_1}) and m + s is odd, then

wt_H(c̃_{α,β}) = p^m − p^{m−1} − ε η_0^{(m+s+1)/2} η_0^{g(ᾱβ)} p^{(m+s−1)/2}.
One can determine the weight distribution of the code C_{ψ_1} addressed by Theorem 20.6.10. We present the cases m + s even and m + s odd separately.
Theorem 20.6.11 ([1381]) Let C_{ψ_1} be the three-weight code of Theorem 20.6.10. If m + s is even, then C_{ψ_1} is a [p^m − 1, m + 1, d]_p code over F_p where

d = p^m − p^{m−1} − (p − 1)p^{(m+s−2)/2}   if ε η_0^{(m+s)/2} = 1,
d = p^m − p^{m−1} − p^{(m+s−2)/2}           if ε η_0^{(m+s)/2} = −1,

and weight distribution given in Table 20.25, noting again that ε = ±1 is the sign of W_{ψ_1}.
TABLE 20.25: Weight distribution of Cψ1 in Theorem 20.6.11 for m + s even and p odd
Example 20.6.12 Let Ψ : F_{3^3} → F_{3^3} be the map defined by Ψ(x) = ζ^{22}x^{13} + ζ^7x^4 + ζx^2 where F*_{3^3} = ⟨ζ⟩ with ζ^3 + 2ζ + 1 = 0. The function ψ_1(x) = Tr_{3^3/3}(Ψ(x)) is weakly regular ternary 1-plateaued with

W_{ψ_1}(ω) ∈ { 0, ε3^2 ξ_3^{g(ω)} } = {0, −9, −9ξ_3, −9ξ_3^2}

for all ω ∈ F_{3^3}, where ε = −1 and g is an unbalanced function over F_{3^3}. The code C_{ψ_1} of Theorem 20.6.11 is a three-weight ternary linear [26, 4, 15]_3 code with Hamming weight enumerator Hwe_{C_{ψ_1}}(x, y) = y^{26} + 16x^{15}y^{11} + 62x^{18}y^8 + 2x^{24}y^2. This was verified by Magma [250].
The following theorem determines the weight distribution of Cψ1 in Theorem 20.6.10
when m + s is an odd integer.
Theorem 20.6.13 ([1381]) Let C_{ψ_1} be the code of Theorem 20.6.10. If m + s is odd, then C_{ψ_1} is a three-weight [p^m − 1, m + 1, p^m − p^{m−1} − p^{(m+s−1)/2}]_p code over F_p. C_{ψ_1} has weight distribution given in Table 20.26, noting again that ε = ±1 is the sign of W_{ψ_1}. The number of minimum weight codewords depends on the sign of η_0^{(m+s+1)/2} = ±1.
TABLE 20.26: Weight distribution of Cψ1 in Theorem 20.6.13 for m + s and p odd
20.6.5 Codes with Few Weights from p-Ary Weakly Regular Plateaued
Functions Based on the Second Generic Construction
Let f be a p-ary function from Fq to Fp where q = pn and p is an odd prime. Define the
set
Df = {x ∈ F∗q | f (x) = 0}. (20.18)
Let us denote m = #Df and Df = {d1 , d2 , . . . , dm }. Recall that the second generic con-
struction of a linear code CDf with defining set Df is
    C_{D_f} = {c_β = (Tr_{p^n/p}(βd_1), Tr_{p^n/p}(βd_2), . . . , Tr_{p^n/p}(βd_m)) | β ∈ F_q}.
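To make the construction concrete, here is a small self-contained sketch (our own illustration, not from the text): it builds C_{D_f} for the quadratic function f(x) = Tr_{9/3}(x^2) on F_9, realized as F_3[z]/(z^2 + 1), and counts Hamming weights by brute force. This toy instance happens to produce a two-weight code.

```python
# Brute-force sketch of the second generic construction C_{D_f} over F_9,
# with f(x) = Tr_{9/3}(x^2); F_9 is realized as F_3[z]/(z^2 + 1).
from collections import Counter

# elements of F_9 as pairs (a, b) representing a + b*z with z^2 = 2
elems = [(a, b) for a in range(3) for b in range(3)]

def mul(x, y):
    a, b = x
    c, d = y
    # (a + bz)(c + dz) = (ac + 2bd) + (ad + bc)z, since z^2 = 2 in F_3
    return ((a * c + 2 * b * d) % 3, (a * d + b * c) % 3)

def tr(x):
    # Tr_{9/3}(a + bz) = (a + bz) + (a + bz)^3 = 2a, an element of F_3
    return (2 * x[0]) % 3

# defining set D_f = {x in F_9^* : f(x) = 0}
Df = [x for x in elems if x != (0, 0) and tr(mul(x, x)) == 0]

# codewords c_beta = (Tr(beta d_1), ..., Tr(beta d_m)) for beta in F_9
weights = Counter()
for beta in elems:
    cw = [tr(mul(beta, d)) for d in Df]
    weights[sum(1 for v in cw if v != 0)] += 1

print(sorted(weights.items()))   # → [(0, 1), (2, 4), (4, 4)]
```

Here D_f has m = 4 elements, so the ternary code has length 4, and its nonzero codewords take only the weights 2 and 4.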
Let us define the subset of the set of weakly regular unbalanced plateaued functions that
we shall use to construct linear codes. Similar to the class RF, denote by WRP the set of
those weakly regular p-ary s-plateaued unbalanced functions f : Fq → Fp , where 0 ≤ s ≤ n,
with f (0) = 0 such that there exists an even positive integer hf with gcd(hf − 1, p − 1) = 1
where f (ax) = ahf f (x) for any a ∈ F∗p and x ∈ Fq .
Throughout this subsection, as earlier, η_0 = (−1)^{(p−1)/2} and p* = (−1)^{(p−1)/2} p.
Theorem 20.6.14 ([1382]) Let n + s be an even integer and f ∈ WRP. Then C_{D_f} is a three-weight linear code over F_p with parameters [p^{n−1} − 1 + ε η_0 (p − 1)(√p*)^{n+s−2}, n]_p. The Hamming weights of the codewords and the weight distribution of C_{D_f} are as in Table 20.27, where ε = ±1 is the sign of W_f.
TABLE 20.27: Weight distribution of CDf in Theorem 20.6.14 for n + s even and p odd
Example 20.6.15 The function f : F_{3^8} → F_3 defined by f(x) = Tr_{3^8/3}(ζx^4 + ζ^816 x^2), where F*_{3^8} = ⟨ζ⟩ with ζ^8 + 2ζ^5 + ζ^4 + 2ζ^2 + 2ζ + 2 = 0, is a quadratic 2-plateaued unbalanced function in the set WRP such that
    W_f(β) ∈ {0, ε η_0^5 3^5 ξ_3^{g(β)}} = {0, 243, 243ξ_3, 243ξ_3^2} for all β ∈ F_{3^8},
where g is an unbalanced 3-ary function with g(0) = 0. Also ε = η_0 = −1. Then C_{D_f} is a three-weight ternary linear [2348, 8, 1458]_3 code with Hamming weight enumerator Hwe_{C_{D_f}}(x, y) = y^2348 + 260x^1458 y^890 + 5832x^1566 y^782 + 468x^1620 y^728, verified by Magma [250].
TABLE 20.28: Weight distribution of CDf in Theorem 20.6.16 for n + s and p odd
Example 20.6.17 The function f : F_{3^3} → F_3 defined by f(x) = Tr_{3^3/3}(x^4 + x^2), where F*_{3^3} = ⟨ζ⟩ with ζ^3 + 2ζ + 1 = 0, is a quadratic bent function (s = 0) in the set WRP with
    W_f(β) ∈ {i3√3, i3√3 ξ_3, i3√3 ξ_3^2} = {6ξ_3 + 3, −3ξ_3 − 6, −3ξ_3 + 3} for all β ∈ F_{3^3},
where ε = η_0 = −1. Then C_{D_f} is a three-weight ternary linear [8, 3, 4]_3 code with Hamming weight enumerator Hwe_{C_{D_f}}(x, y) = y^8 + 12x^4 y^4 + 8x^6 y^2 + 6x^8, verified by Magma [250].
Let f ∈ WRP. For any x ∈ F_q, f(x) = 0 if and only if f(ax) = 0 for every a ∈ F*_p. Then one can choose a subset D̄_f of the defining set D_f defined by (20.18) such that ∪_{a∈F*_p} aD̄_f is a partition of D_f; namely, D_f = F*_p D̄_f = {ad | a ∈ F*_p and d ∈ D̄_f}, where for each pair of distinct elements d_1, d_2 ∈ D̄_f we have d_1 d_2^{−1} ∉ F*_p. This implies that the linear code C_{D_f} can be punctured into a shorter linear code C_{D̄_f}, which is based on the second generic construction with the defining set D̄_f. Notice that for β ∈ F*_q, #{x ∈ D_f | f(x) = 0 and Tr_{q/p}(βx) = 0} = (p − 1)#{x ∈ D̄_f | f(x) = 0 and Tr_{q/p}(βx) = 0}.
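The partition into scalar classes can be seen in a small self-contained computation (our own illustration, again with f(x) = Tr_{9/3}(x^2) on F_9 = F_3[z]/(z^2 + 1)): keeping one representative per class {d, 2d} halves the length, and every Hamming weight is divided by p − 1 = 2.

```python
# Puncturing illustration over F_9 with f(x) = Tr_{9/3}(x^2): D_f is a
# union of F_3^*-scaling classes; keeping one representative per class
# gives the punctured code, with every weight divided by p - 1 = 2.
p = 3
elems = [(a, b) for a in range(3) for b in range(3)]
mul = lambda x, y: ((x[0]*y[0] + 2*x[1]*y[1]) % 3, (x[0]*y[1] + x[1]*y[0]) % 3)
tr = lambda x: (2 * x[0]) % 3                  # Tr_{9/3}(a + bz) = 2a
Df = [x for x in elems if x != (0, 0) and tr(mul(x, x)) == 0]

# one representative per scaling class {d, 2d}
Dbar, seen = [], set()
for d in Df:
    if d not in seen:
        Dbar.append(d)
        seen.update({d, ((2 * d[0]) % 3, (2 * d[1]) % 3)})

for beta in elems:
    w_full = sum(tr(mul(beta, d)) != 0 for d in Df)
    w_punc = sum(tr(mul(beta, d)) != 0 for d in Dbar)
    assert w_full == 2 * w_punc                # weights scale by p - 1

print(len(Df), len(Dbar))   # → 4 2
```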
Corollary 20.6.18 ([1382]) The punctured version C_{D̄_f} of the code C_{D_f} in Theorem 20.6.14 is a three-weight linear code with parameters [(p^{n−1} − 1)/(p − 1) + ε η_0 (√p*)^{n+s−2}, n]_p whose weight distribution is listed in Table 20.29, where ε = ±1 is the sign of W_f.
TABLE 20.29: Weight distribution of CDf in Corollary 20.6.18 for n + s even and p odd
Example 20.6.19 The punctured version C_{D̄_f} of C_{D_f} in Example 20.6.15 is a three-weight ternary [1174, 8, 729]_3 linear code with Hamming weight enumerator Hwe_{C_{D̄_f}}(x, y) = y^1174 + 260x^729 y^445 + 5832x^783 y^391 + 468x^810 y^364.
Corollary 20.6.20 ([1382]) The punctured version C_{D̄_f} of the code C_{D_f} in Theorem 20.6.16 is a three-weight linear code with parameters [(p^{n−1} − 1)/(p − 1), n, p^{n−2} − p^{(n+s−3)/2}]_p whose weight distribution is listed in Table 20.30, where ε = ±1 is the sign of W_f.
TABLE 20.30: Weight distribution of CDf in Corollary 20.6.20 for n + s and p odd
Example 20.6.21 The punctured version C_{D̄_f} of C_{D_f} in Example 20.6.17 is a three-weight ternary [4, 3, 2]_3 linear code with Hamming weight enumerator Hwe_{C_{D̄_f}}(x, y) = y^4 + 12x^2 y^2 + 8x^3 y + 6x^4, as verified by Magma [250]. This code is optimal by the Singleton Bound.
We can work on the Walsh support of a weakly regular plateaued function f to define a subcode of each code constructed above. Consider therefore a linear code C̄_{D_f} involving D_f defined by
    C̄_{D_f} = {c_β = (Tr_{p^n/p}(βd_1), Tr_{p^n/p}(βd_2), . . . , Tr_{p^n/p}(βd_m)) | β ∈ supp(W_f)}   (20.19)
which is a subcode of C_{D_f}. Hence, the following codes in Corollaries 20.6.22, 20.6.23, 20.6.24, and 20.6.25 are subcodes of the codes in Theorems 20.6.14, 20.6.16 and Corollaries 20.6.18 and 20.6.20, respectively. Notice that their parameters are directly derived from those of the corresponding codes.
Corollary 20.6.22 ([1382]) The subcode C̄_{D_f} of the code C_{D_f} in Theorem 20.6.14 is a two-weight linear code with parameters [p^{n−1} − 1 + ε η_0 (p − 1)(√p*)^{n+s−2}, n − s]_p whose weight distribution is listed in Table 20.31.
TABLE 20.31: Weight distribution of C Df in Corollary 20.6.22 for n + s even and p odd
Corollary 20.6.23 ([1382]) The subcode C̄_{D_f} of the code C_{D_f} in Theorem 20.6.16 is the three-weight linear code with parameters [p^{n−1} − 1, n − s, (p − 1)(p^{n−2} − p^{(n+s−3)/2})]_p whose weight distribution is listed in Table 20.32.
Corollary 20.6.24 ([1382]) The subcode C̄_{D_f} of the code C_{D_f} in Corollary 20.6.18 is a two-weight linear code with parameters [(p^{n−1} − 1)/(p − 1) + ε η_0 (√p*)^{n+s−2}, n − s]_p whose weight distribution is listed in Table 20.33.
TABLE 20.33: Weight distribution of C Df in Corollary 20.6.24 for n + s even and p odd
Corollary 20.6.25 ([1382]) The subcode C̄_{D_f} of the code C_{D_f} in Corollary 20.6.20 is a three-weight linear code with parameters [(p^{n−1} − 1)/(p − 1), n − s, p^{n−2} − p^{(n+s−3)/2}]_p whose weight distribution is listed in Table 20.34.
Below, we show that one can construct other linear codes. We shall push further the use of
weakly regular plateaued functions in the recent construction methods proposed by Tang et
al. [1778]. Let f : Fq → Fp be a p-ary function. Define the sets Df,sq = {x ∈ Fq | f (x) ∈ SQ}
and D_{f,nsq} = {x ∈ F_q | f(x) ∈ NSQ}, where SQ and NSQ denote, respectively, the sets of
squares and non-squares in F*_p. With a similar definition to the linear code C_{D_f}, define a linear code C_{D_{f,sq}} involving D_{f,sq} = {d'_1, d'_2, . . . , d'_m} where
    C_{D_{f,sq}} = {c_β = (Tr_{p^n/p}(βd'_1), . . . , Tr_{p^n/p}(βd'_m)) | β ∈ F_q},   (20.20)
and a linear code C_{D_{f,nsq}} involving D_{f,nsq} = {d''_1, d''_2, . . . , d''_t} where
    C_{D_{f,nsq}} = {c_β = (Tr_{p^n/p}(βd''_1), . . . , Tr_{p^n/p}(βd''_t)) | β ∈ F_q}.   (20.21)
Clearly, the codes C_{D_{f,sq}} and C_{D_{f,nsq}} have length m and t, respectively, and dimension at most n.
Theorem 20.6.26 ([1382]) Let n + s be an even integer and f ∈ WRP. Then C_{D_{f,sq}} is a three-weight linear code with parameters [((p − 1)/2)(p^{n−1} − ε η_0 (√p*)^{n+s−2}), n]_p where ε = ±1 is the sign of W_f. The Hamming weights of the codewords and the weight distribution of C_{D_{f,sq}} are in Table 20.35.
TABLE 20.35: Weight distribution of CDf,sq in Theorem 20.6.26 for n + s even and p odd
Remark 20.6.27 In Theorem 20.6.26, if η_0^{(n+s)/2} = 1 and p = 3, then we have the condition 0 ≤ s ≤ n − 4; otherwise, 0 ≤ s ≤ n − 2 and n ≥ 3. This guarantees that we have wt_H(c_β) > 0 for each β ∈ F*_q, which confirms that the code C_{D_{f,sq}} has dimension n.
Theorem 20.6.29 ([1382]) Let n + s be an odd integer and f ∈ WRP. Then C_{D_{f,sq}} is a three-weight linear code with parameters [((p − 1)/2)(p^{n−1} + ε (√p*)^{n+s−1}), n]_p where ε = ±1 is the sign of W_f. The Hamming weights of the codewords and the weight distribution of C_{D_{f,sq}} are in Table 20.36 and Table 20.37 when p ≡ 1 (mod 4) and p ≡ 3 (mod 4), respectively.
Remark 20.6.30 In Theorem 20.6.29, if p ≡ 3 (mod 4) and η_0^{(n+s−1)/2} = −1 or p ≡ 1 (mod 4) and ε = −1, then we have the condition 0 ≤ s ≤ n − 3; otherwise, 0 ≤ s ≤ n − 1 and n ≥ 2. This guarantees that we have wt_H(c_β) > 0 for each β ∈ F*_q, which confirms that the code C_{D_{f,sq}} has dimension n.
TABLE 20.36: Weight distribution of C_{D_{f,sq}} in Theorem 20.6.29 for p ≡ 1 (mod 4) and n + s odd

    Hamming weight w                                    Multiplicity A_w
    0                                                   1
    ((p−1)^2/2) p^{n−2}                                 p^{n−s−1} − 1
    ((p−1)/2)((p−1)p^{n−2} + ε(p+1)√(p^{n+s−3}))        ((p−1)/2)(p^{n−s−1} + ε√(p^{n−s−1}))
    ((p−1)^2/2)(p^{n−2} + ε√(p^{n+s−3}))                p^n − p^{n−s} + ((p−1)/2)(p^{n−s−1} − ε√(p^{n−s−1}))
TABLE 20.37: Weight distribution of C_{D_{f,sq}} in Theorem 20.6.29 for p ≡ 3 (mod 4) and n + s odd

    Hamming weight w                                    Multiplicity A_w
    0                                                   1
    ((p−1)^2/2) p^{n−2}                                 p^{n−s−1} − 1
    ((p−1)^2/2)(p^{n−2} − ε√(p*^{n+s−3}))               p^n − p^{n−s} + ((p−1)/2)(p^{n−s−1} + ε(−1)^n √(p*^{n−s−1}))
    ((p−1)/2)((p−1)p^{n−2} − ε(p+1)√(p*^{n+s−3}))       ((p−1)/2)(p^{n−s−1} − ε(−1)^n √(p*^{n−s−1}))
Theorem 20.6.31 ([1382]) Let n + s be an odd integer and f ∈ WRP. Then C_{D_{f,nsq}} is a three-weight linear code with parameters [((p − 1)/2)(p^{n−1} − ε (√p*)^{n+s−1}), n]_p where ε = ±1 is the sign of W_f. The Hamming weights of the codewords and the weight distribution of C_{D_{f,nsq}} are in Table 20.38 and Table 20.39 when p ≡ 1 (mod 4) and p ≡ 3 (mod 4), respectively.
TABLE 20.38: Weight distribution of C_{D_{f,nsq}} in Theorem 20.6.31 for p ≡ 1 (mod 4) and n + s odd

    Hamming weight w                                    Multiplicity A_w
    0                                                   1
    ((p−1)^2/2) p^{n−2}                                 p^{n−s−1} − 1
    ((p−1)^2/2)(p^{n−2} − ε√(p^{n+s−3}))                p^n − p^{n−s} + ((p−1)/2)(p^{n−s−1} + ε√(p^{n−s−1}))
    ((p−1)/2)((p−1)p^{n−2} − ε(p+1)√(p^{n+s−3}))        ((p−1)/2)(p^{n−s−1} − ε√(p^{n−s−1}))
TABLE 20.39: Weight distribution of C_{D_{f,nsq}} in Theorem 20.6.31 for p ≡ 3 (mod 4) and n + s odd

    Hamming weight w                                    Multiplicity A_w
    0                                                   1
    ((p−1)^2/2) p^{n−2}                                 p^{n−s−1} − 1
    ((p−1)/2)((p−1)p^{n−2} + ε(p+1)√(p*^{n+s−3}))       ((p−1)/2)(p^{n−s−1} + ε(−1)^n √(p*^{n−s−1}))
    ((p−1)^2/2)(p^{n−2} + ε√(p*^{n+s−3}))               p^n − p^{n−s} + ((p−1)/2)(p^{n−s−1} − ε(−1)^n √(p*^{n−s−1}))
Remark 20.6.32 In Theorem 20.6.31, if p ≡ 3 (mod 4) and η_0^{(n+s−1)/2} = 1 or p ≡ 1 (mod 4) and ε = 1, then we have the condition 0 ≤ s ≤ n − 3; otherwise, 0 ≤ s ≤ n − 1 and n ≥ 2. This guarantees that we have wt_H(c_β) > 0 for each β ∈ F*_q, which confirms that the code C_{D_{f,nsq}} has dimension n.
Example 20.6.33 The function f : F_{3^6} → F_3 defined as f(x) = Tr_{3^6/3}(ζx^4 + ζ^27 x^2), where F*_{3^6} = ⟨ζ⟩ with ζ^6 + 2ζ^4 + ζ^2 + 2ζ + 2 = 0, is a quadratic 1-plateaued unbalanced function in the set WRP with
    W_f(β) ∈ {0, i27√3, i27√3 ξ_3, i27√3 ξ_3^2} = {0, 54ξ_3 + 27, −27ξ_3 − 54, −27ξ_3 + 27}
for all β ∈ F_{3^6}, where ε = η_0 = −1 and g is an unbalanced ternary function with g(0) = 0. Then C_{D_{f,nsq}} is a three-weight ternary linear [216, 6, 126]_3 code with Hamming weight enumerator Hwe_{C_{D_{f,nsq}}}(x, y) = y^216 + 72x^126 y^90 + 576x^144 y^72 + 80x^162 y^54, again verified by Magma [250].
Remark 20.6.34 Let n + s be an even integer and f ∈ WRP. Then CDf,nsq is a three-
weight linear code with the same parameters and weight distribution as CDf,sq of Theorem
20.6.26.
From each of the codes constructed above, we can obtain a shorter linear code as done
earlier in this section. Indeed, the code CDf,sq can be punctured into a shorter one whose
weight distribution is derived from that of CDf,sq . Let f ∈ WRP. For any x ∈ Fq , f (x)
is a quadratic residue (respectively quadratic non-residue) in F∗p if and only if f (ax) is a
quadratic residue (respectively quadratic non-residue) in F*_p for every a ∈ F*_p. Thus one can choose a subset D̄_{f,sq} of D_{f,sq} such that ∪_{a∈F*_p} aD̄_{f,sq} is a partition of D_{f,sq}. A similar subset D̄_{f,nsq} of D_{f,nsq} can be chosen. So one can easily obtain the punctured versions C_{D̄_{f,sq}} and C_{D̄_{f,nsq}} of the codes C_{D_{f,sq}} and C_{D_{f,nsq}}, respectively, whose parameters are derived directly from those of the original codes. Corollaries 20.6.35, 20.6.37, and 20.6.38 below follow directly from Theorems 20.6.26, 20.6.29, and 20.6.31, respectively.
Corollary 20.6.35 ([1382]) The punctured version C_{D̄_{f,sq}} of the code C_{D_{f,sq}} in Theorem 20.6.26 is a three-weight linear code with parameters [(1/2)(p^{n−1} − ε η_0 (√p*)^{n+s−2}), n]_p and weight distribution in Table 20.40.
TABLE 20.40: Weight distribution of CDf,sq in Corollary 20.6.35 for n + s even and p odd
Example 20.6.36 The punctured version C_{D̄_{f,sq}} of C_{D_{f,sq}} in Example 20.6.28 is a three-weight ternary linear [45, 5, 27]_3 code.
Corollary 20.6.37 ([1382]) The punctured version C_{D̄_{f,sq}} of the code C_{D_{f,sq}} in Theorem 20.6.29 is a three-weight linear code with parameters [(1/2)(p^{n−1} + ε (√p*)^{n+s−1}), n]_p and weight distribution in Table 20.41 and Table 20.42 when p ≡ 1 (mod 4) and p ≡ 3 (mod 4), respectively.
TABLE 20.41: Weight distribution of C_{D̄_{f,sq}} in Corollary 20.6.37 for p ≡ 1 (mod 4) and n + s odd

    Hamming weight w                                    Multiplicity A_w
    0                                                   1
    ((p−1)/2) p^{n−2}                                   p^{n−s−1} − 1
    (1/2)((p−1)p^{n−2} + ε(p+1)√(p^{n+s−3}))            ((p−1)/2)(p^{n−s−1} + ε√(p^{n−s−1}))
    ((p−1)/2)(p^{n−2} + ε√(p^{n+s−3}))                  p^n − p^{n−s} + ((p−1)/2)(p^{n−s−1} − ε√(p^{n−s−1}))
TABLE 20.42: Weight distribution of C_{D̄_{f,sq}} in Corollary 20.6.37 for p ≡ 3 (mod 4) and n + s odd

    Hamming weight w                                    Multiplicity A_w
    0                                                   1
    ((p−1)/2) p^{n−2}                                   p^{n−s−1} − 1
    ((p−1)/2)(p^{n−2} − ε√(p*^{n+s−3}))                 p^n − p^{n−s} + ((p−1)/2)(p^{n−s−1} + ε(−1)^n √(p*^{n−s−1}))
    (1/2)((p−1)p^{n−2} − ε(p+1)√(p*^{n+s−3}))           ((p−1)/2)(p^{n−s−1} − ε(−1)^n √(p*^{n−s−1}))
Corollary 20.6.38 ([1382]) The punctured version C_{D̄_{f,nsq}} of the code C_{D_{f,nsq}} in Theorem 20.6.31 is a three-weight linear code with parameters [(1/2)(p^{n−1} − ε (√p*)^{n+s−1}), n]_p and weight distribution in Table 20.43 and Table 20.44 when p ≡ 1 (mod 4) and p ≡ 3 (mod 4), respectively.
TABLE 20.43: Weight distribution of C_{D̄_{f,nsq}} in Corollary 20.6.38 for p ≡ 1 (mod 4) and n + s odd

    Hamming weight w                                    Multiplicity A_w
    0                                                   1
    ((p−1)/2) p^{n−2}                                   p^{n−s−1} − 1
    ((p−1)/2)(p^{n−2} − ε√(p^{n+s−3}))                  p^n − p^{n−s} + ((p−1)/2)(p^{n−s−1} + ε√(p^{n−s−1}))
    (1/2)((p−1)p^{n−2} − ε(p+1)√(p^{n+s−3}))            ((p−1)/2)(p^{n−s−1} − ε√(p^{n−s−1}))
Example 20.6.39 The punctured version C_{D̄_{f,nsq}} of C_{D_{f,nsq}} in Example 20.6.33 is a three-weight ternary linear [108, 6, 63]_3 code with Hamming weight enumerator Hwe_{C_{D̄_{f,nsq}}}(x, y) = y^108 + 72x^63 y^45 + 576x^72 y^36 + 80x^81 y^27. This code is minimal by Theorem 20.6.47 and is almost optimal by the Griesmer Bound.
TABLE 20.44: Weight distribution of C_{D̄_{f,nsq}} in Corollary 20.6.38 for p ≡ 3 (mod 4) and n + s odd

    Hamming weight w                                    Multiplicity A_w
    0                                                   1
    ((p−1)/2) p^{n−2}                                   p^{n−s−1} − 1
    (1/2)((p−1)p^{n−2} + ε(p+1)√(p*^{n+s−3}))           ((p−1)/2)(p^{n−s−1} + ε(−1)^n √(p*^{n−s−1}))
    ((p−1)/2)(p^{n−2} + ε√(p*^{n+s−3}))                 p^n − p^{n−s} + ((p−1)/2)(p^{n−s−1} − ε(−1)^n √(p*^{n−s−1}))
With a definition similar to that of the subcode C̄_{D_f}, we have a linear code C̄_{D_{f,sq}} involving D_{f,sq} defined by
    C̄_{D_{f,sq}} = {c_β = (Tr_{p^n/p}(βd'_1), . . . , Tr_{p^n/p}(βd'_m)) | β ∈ supp(W_f)},
which is the subcode of C_{D_{f,sq}} defined by (20.20). The following codes in Corollaries 20.6.40, 20.6.41, 20.6.42, and 20.6.43 are subcodes of the codes in Theorems 20.6.26, 20.6.29 and Corollaries 20.6.35 and 20.6.37, respectively.
Corollary 20.6.40 ([1382]) The subcode C̄_{D_{f,sq}} of the code C_{D_{f,sq}} in Theorem 20.6.26 is a two-weight linear code with parameters [((p − 1)/2)(p^{n−1} − ε η_0 (√p*)^{n+s−2}), n − s]_p and weight distribution in Table 20.45.
TABLE 20.45: Weight distribution of C Df,sq in Corollary 20.6.40 for n + s even and p odd
Corollary 20.6.41 ([1382]) The subcode C̄_{D_{f,sq}} of the code C_{D_{f,sq}} in Theorem 20.6.29 is a three-weight linear code with parameters [((p − 1)/2)(p^{n−1} + ε (√p*)^{n+s−1}), n − s]_p and weight distribution in Table 20.46.
TABLE 20.46: Weight distribution of C Df,sq in Corollary 20.6.41 for n + s and p odd
TABLE 20.47: Weight distribution of C Df,sq in Corollary 20.6.42 for n + s even and p odd
Corollary 20.6.42 ([1382]) The subcode C̄_{D̄_{f,sq}} of the code C_{D̄_{f,sq}} in Corollary 20.6.35 is a two-weight linear code with parameters [(1/2)(p^{n−1} − ε η_0 (√p*)^{n+s−2}), n − s]_p and weight distribution in Table 20.47.
Corollary 20.6.43 ([1382]) The subcode C̄_{D̄_{f,sq}} of the code C_{D̄_{f,sq}} in Corollary 20.6.37 is a three-weight linear code with parameters [(1/2)(p^{n−1} + ε (√p*)^{n+s−1}), n − s]_p and weight distribution in Table 20.48.
TABLE 20.48: Weight distribution of C Df,sq in Corollary 20.6.43 for n + s and p odd
With a similar definition for the subcode C̄_{D_f} defined by (20.19), we have a linear code C̄_{D_{f,nsq}} involving D_{f,nsq} defined by
    C̄_{D_{f,nsq}} = {c_β = (Tr_{p^n/p}(βd''_1), . . . , Tr_{p^n/p}(βd''_t)) | β ∈ supp(W_f)},
which is the subcode of C_{D_{f,nsq}} defined by (20.21). Hence, the following codes in Corollaries 20.6.44 and 20.6.45 are subcodes of the codes constructed in Theorem 20.6.31 and Corollary 20.6.38, respectively.
Corollary 20.6.44 ([1382]) The subcode C̄_{D_{f,nsq}} of the code C_{D_{f,nsq}} in Theorem 20.6.31 is a three-weight linear code with parameters [((p − 1)/2)(p^{n−1} − ε (√p*)^{n+s−1}), n − s]_p and weight distribution in Table 20.49.
Corollary 20.6.45 ([1382]) The subcode C̄_{D̄_{f,nsq}} of the code C_{D̄_{f,nsq}} in Corollary 20.6.38 is a three-weight linear code with parameters [(1/2)(p^{n−1} − ε (√p*)^{n+s−1}), n − s]_p and weight distribution in Table 20.50.
When we assume only weakly regular bentness (that is, s = 0) in this subsection, we obviously recover the linear codes obtained by Tang et al. [1778].
Codes with few weights from functions have some applications to the design of minimal
codes for secret sharing schemes and for two-party computation.
TABLE 20.49: Weight distribution of C Df,nsq in Corollary 20.6.44 for n + s and p odd
TABLE 20.50: Weight distribution of C Df,nsq in Corollary 20.6.45 for n + s and p odd
Definition 20.6.46 A vector a covers another vector b if the support of a contains the
support of b. A nonzero codeword a of the code C defined over Fq is said to be minimal if
a covers only the codewords βa for every β ∈ Fq , but no other nonzero codeword of C. The
code C is said to be minimal if every nonzero codeword of C is minimal.
It is worth noting that determining the minimality of a linear code over finite fields has
been an attractive research topic in coding theory. Almost all the codes presented in the
previous sections are minimal, since they satisfy the sufficient condition of minimality given
by Ashikhmin and Barg in the theorem below.
Theorem 20.6.47 ([67]) All nonzero codewords of a linear code C over F_q are minimal if
    (q − 1)/q < wt_min/wt_max,
where wt_min and wt_max denote the minimum and maximum weights of nonzero codewords in C, respectively.
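For instance, the ternary [2348, 8] code of Example 20.6.15 has nonzero weights 1458, 1566, and 1620, and the sufficient condition is easily checked:

```python
# Check the Ashikhmin-Barg sufficient condition for the code of Example 20.6.15
q = 3
weights = [1458, 1566, 1620]
ab_sufficient = (q - 1) / q < min(weights) / max(weights)
print(ab_sufficient)   # → True, so every nonzero codeword is minimal
```

Here 2/3 ≈ 0.667 while wt_min/wt_max = 1458/1620 = 0.9, so the condition holds comfortably.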
In [943], Ding, Heng, and Zhou presented a necessary and sufficient condition for linear
codes over finite fields to be minimal; see also [556] for minimal binary codes.
Theorem 20.6.48 ([943]) Let C be a linear code in F_q^n. Then C is minimal if and only if
    Σ_{β∈F*_q} wt_H(a + βb) ≠ (q − 1)wt_H(a) − wt_H(b)
for any F_q-linearly independent codewords a, b ∈ C.
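As a small illustration (ours, not taken from [943]), the criterion can be checked by brute force on the ternary tetracode, a [4, 2, 3]_3 code; the generator matrix below is one standard choice.

```python
# Brute-force check of the Ding-Heng-Zhou minimality criterion on the
# ternary tetracode with generator rows g1 = (1,0,1,1), g2 = (0,1,1,2).
q = 3
G = [(1, 0, 1, 1), (0, 1, 1, 2)]

def scale_add(a, b, s):          # a + s*b coordinatewise over F_3
    return tuple((x + s * y) % q for x, y in zip(a, b))

codewords = [tuple((i * u + j * v) % q for u, v in zip(*G))
             for i in range(q) for j in range(q)]
wt = lambda c: sum(1 for x in c if x != 0)

def independent(a, b):           # a is not a scalar multiple of b
    return all(a != tuple((s * y) % q for y in b) for s in range(q))

# C is minimal iff for all linearly independent a, b:
#   sum_{beta in F_q^*} wt(a + beta*b) != (q-1)*wt(a) - wt(b)
minimal = all(
    sum(wt(scale_add(a, b, s)) for s in range(1, q)) != (q - 1) * wt(a) - wt(b)
    for a in codewords for b in codewords
    if wt(a) and wt(b) and independent(a, b)
)
print(minimal)   # → True: the tetracode is minimal
```

Every nonzero tetracode word has weight 3, so the left side is always 6 while the right side is 3, and the criterion holds for every pair.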
Minimal codes are also explored in Chapters 18 and 19. In [556, 943] the authors pre-
sented an infinite family of minimal linear codes not satisfying the sufficient condition given
by Ashikhmin and Barg [67] showing then that this condition is not necessary. See more on
the connection between linear codes and secret sharing in Section 33.3.1.
Since minimal codes for secret sharing schemes have been nicely developed in [942], we
only deal here with minimal codes for two-party computation.
Secure two-party computation (2PC) is a sub-problem of secure multi-party computation
(MPC) that has received special attention by researchers because of its close relation to
many cryptographic tasks. The goal of 2PC is to create a generic protocol that allows two
parties to jointly compute an arbitrary function on their inputs without sharing the value
of their inputs with the opposing party. One of the most well-known examples of 2PC is
Yao’s millionaire problem, in which two parties, Alice and Bob, are millionaires who wish
to determine who is wealthier without revealing their wealth. Formally, Alice has wealth a,
Bob has wealth b, and they wish to compute a ≥ b without revealing the values a or b.
In [374], Chabanne, Cohen and Patey have introduced a protocol for secure two-party
computation of linear functions in the semi-honest model, based on coding techniques. Their
protocol uses q-ary minimal codes. We refer the reader to [374] for the actual details.
Definition 20.7.1 A code C ⊆ F_q^n is a locally recoverable code (LRC) with locality r if for every i ∈ {1, . . . , n}, there exists a subset R_i ⊆ {1, . . . , n} \ {i} with |R_i| ≤ r and a function φ_i such that, for every codeword c = (c_1, c_2, . . . , c_n) ∈ C,
    c_i = φ_i({c_j | j ∈ R_i}).
LRCs have recently been a very attractive subject in research in coding theory, due to
their theoretical appeal, and to applications in large-scale distributed storage systems, where
a single storage node erasure is considered as a frequent error event. Distributed storage
systems are the subject of Chapter 31 with applications of LRCs in such systems detailed
in Section 31.3. Distributed storage systems and LRCs involving algebraic geometry codes
are presented in Section 15.9.
The parameters of an (n, k, r)_q LRC have been studied; the upper bound on d is sometimes called the Singleton Bound for LRCs.
Theorem 20.7.2 ([830, 1489]) Let C be an (n, k, r)_q LRC of cardinality q^k over an alphabet of size q; then the minimum distance d and rate k/n of C satisfy
    d ≤ n − k − ⌈k/r⌉ + 2   and   k/n ≤ r/(r + 1).
Note that if r = k, then the upper bound given in the above theorem coincides with the
well-known Singleton Bound, d ≤ n − k + 1.
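As a quick illustration, the (9, 4, 2)_13 LRC constructed in Example 20.7.6 below attains this bound:

```python
# Singleton-type bound for an (n, k, r) = (9, 4, 2) locally recoverable code
from math import ceil

n, k, r = 9, 4, 2
d_bound = n - k - ceil(k / r) + 2
print(d_bound)                 # → 5
print(k / n <= r / (r + 1))    # rate bound: 4/9 <= 2/3 → True
```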
In view of the above upper bound on the minimum distance, optimal and almost optimal LRCs are defined as follows.
Definition 20.7.3 LRCs for which d = n − k − dk/re + 2 are called optimal codes. The
code is almost optimal when the minimum distance differs by at most 1 from the optimal
value.
522 Concise Encyclopedia of Coding Theory
We now recall the concept of r-good polynomials, which is the key ingredient for con-
structing optimal linear LRCs.
Definition 20.7.4 Let q be a power of a prime and n be a positive integer such that q ≥ n. A polynomial F over F_q is said to be an r-good polynomial if
1. the degree of F is r + 1, and
2. there exists a partition A = {A_1, . . . , A_{n/(r+1)}} of a set A ⊆ F_q of size n into sets of size r + 1 such that F is constant on each set A_i in the partition.
Good polynomials produce optimal LRCs according to the construction due to Tamo
and Barg [1772], yielding so-called Tamo–Barg codes, given in the next theorem.
Theorem 20.7.5 ([1772]) For r ≥ 1, let g(x) be an r-good polynomial over F_{p^s}. Set n = (r + 1)l and k = rt, where t ≤ l. For a = (a_{ij})_{i=0,...,r−1; j=0,...,t−1} ∈ F_{p^s}^{rt} = F_{p^s}^k, let
    f_a(x) = Σ_{i=0}^{r−1} Σ_{j=0}^{t−1} a_{ij} g(x)^j x^i.
If {A_1, . . . , A_l} is the partition associated to g(x), set A = ∪_{i=1}^{l} A_i and define
    C = {(f_a(β))_{β∈A} | a ∈ F_{p^s}^k}.
In the above theorem, the elements of the set A are called localizations, a ∈ F_{p^s}^k is the information vector for (f_a(β))_{β∈A}, and the elements f_a(β), β ∈ A, are called symbols of the codeword. The local recovery is accomplished as follows. Let c = (c_β)_{β∈A} ∈ C where c_β = f_a(β) for some a ∈ F_{p^s}^k. Suppose that the erased symbol in c corresponds to the localization α ∈ A_j where A_j is one of the sets in the partition A. Let {c_β | β ∈ A_j \ {α}} denote the remaining r symbols in the localizations of the set A_j. To find the value c_α = f_a(α), find the unique polynomial δ(x) of degree less than r such that δ(β) = c_β for all β ∈ A_j \ {α}; that is,
    δ(x) = Σ_{β∈A_j\{α}} c_β ∏_{β'∈A_j\{α,β}} (x − β')/(β − β')   (20.22)
and set c_α = δ(α). Hence, to find one erased symbol, we need to perform polynomial interpolation from r known symbols in its recovery set.
Example 20.7.6 ([1772]) We construct a (9, 4, 2)_13 LRC over F_13 with g(x) = x^3. Since
    g(1) = g(3) = g(9) = 1,   g(2) = g(6) = g(5) = 8,   g(4) = g(12) = g(10) = 12,
let A = A_1 ∪ A_2 ∪ A_3 where A_1 = {1, 3, 9}, A_2 = {2, 6, 5}, A_3 = {4, 12, 10}. If the information vector is a = (a_00, a_01, a_10, a_11), then f_a(x) = a_00 + a_01 g(x) + x(a_10 + a_11 g(x)) = a_00 + a_10 x + a_01 x^3 + a_11 x^4. Let c = (f_a(β))_{β∈A} with a = (1, 1, 1, 1). So c = (f_a(1), f_a(3), f_a(9) | f_a(2), f_a(6), f_a(5) | f_a(4), f_a(12), f_a(10)) = (4, 8, 7 | 1, 11, 2 | 0, 0, 0). Suppose f_a(1) is erased. This erased symbol can be recovered by accessing the 2 other codeword symbols at the localizations 3 and 9. Using (20.22), we find that δ(x) = 2x + 2 and compute δ(1) = 4, which is the required value.
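The recovery step of Example 20.7.6 can be replayed in a few lines (a sketch of ours, not the authors' code), using Lagrange interpolation (20.22) over F_13:

```python
# Local recovery of the erased symbol f_a(1) in Example 20.7.6 over F_13
p = 13
f = lambda x: (1 + x + x**3 + x**4) % p      # f_a for a = (1, 1, 1, 1)
alpha = 1                                     # erased localization in A_1 = {1, 3, 9}
known = {3: f(3), 9: f(9)}                    # the two surviving symbols

inv = lambda x: pow(x, p - 2, p)              # inverse in F_13 via Fermat

def delta(x):
    # Lagrange interpolation (20.22) through the surviving points
    total = 0
    pts = list(known)
    for b in pts:
        term = known[b]
        for b2 in pts:
            if b2 != b:
                term = term * (x - b2) % p * inv(b - b2) % p
        total = (total + term) % p
    return total

print(delta(alpha))   # → 4, which equals f_a(1)
```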
    G_γ(x) = γx^m
is an r-good polynomial, where γ ∈ F*_{p^s}. Note that F*_{p^s} can be split into pairwise disjoint multiplicative cosets of the form bU_m, where b ∈ F*_{p^s} and U_m = {x ∈ F_{p^s} | x^m = 1}. Observe next that, for every x ∈ bU_m, G_γ(x) = G_γ(b); see [1772, Proposition 3.2].
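A quick numerical check (our own, taking γ = 1, m = 3, and q = 13) confirms this coset constancy:

```python
# G(x) = x^3 is constant on each multiplicative coset b*U_3 of F_13^*
q, m = 13, 3
U = [u for u in range(1, q) if pow(u, m, q) == 1]   # U_3 = [1, 3, 9]
for b in range(1, q):
    values = {pow(b * u, m, q) for u in U}
    assert len(values) == 1      # G takes a single value on b*U_3
print(len(U), "elements per coset;", (q - 1) // m, "distinct cosets")
```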
Next, the polynomial F below is an r-good polynomial with r = mp^t − 1, where e is a divisor of t such that p^e ≡ 1 (mod m), and a_i ∈ F_{p^s} satisfy Σ_{i=0}^{t/e} a_i = 0 with a_0 ≠ 0 and a_{t/e} ≠ 0. In fact, this (mp^t − 1)-good polynomial can be written as
    F(x) = ∏_{i=1}^{m} ∏_{k∈K} (x + k + α_i) = ∏_{k∈K} (x + k)^m = (Σ_{i=0}^{t/e} a_i x^{p^{ei}})^m,
where K is the set of roots of the linearized polynomial Σ_{i=0}^{t/e} a_i x^{p^{ei}}. Note that K is an additive subgroup of F_{p^s} that is closed under multiplication by F_{p^e}, and U_m ⊆ F_{p^e} ⊆ K; F_{p^e} ⊆ K follows because 1 ∈ K as Σ_{i=0}^{t/e} a_i = 0. See [1772, Theorem 3.3] for more details.
For the functions H and I, let Gγ (x) = γxm and Fa be defined as in (20.23), and assume
that Fps contains all the roots of Fa .
5 See Example 15.9.9 for a similar use of a power function to construct an algebraic geometry LRC.
Set H(x) = F_a(G_γ(x)) = Σ_{i=0}^{t} a_i γ^{p^i} x^{mp^i} where t > 0. Then H is an r-good polynomial over F_{p^s} if and only if {b ∈ F_{p^s} \ E_a | b + E_a ⊆ Im(G_γ)} is nonempty, where E_a = {x ∈ F_{p^s} | F_a(x) = 0} and Im(G_γ) = {G_γ(x) | x ∈ F_{p^s}}. Let cU_m be a multiplicative coset of U_m where c ∈ F*_{p^s}. Then the r-good polynomial H(x) is constant on ∪_{i=1}^{p^t} x_i U_m, where each x_i ∈ F*_{p^s}, and (γx_1^m, . . . , γx_{p^t}^m) is an additive coset of the form b + E_a. See [1280, Theorem 8] for more details.
Finally set I(x) = G_1(F_a(x)) = (Σ_{i=0}^{t} a_i x^{p^i})^m. Then I is an r-good polynomial over F_{p^s} if and only if {b ∈ F*_{p^s} | bU_m ⊆ Im(F_a)} is nonempty, where Im(F_a) = {F_a(x) | x ∈ F_{p^s}}. The r-good polynomial I is constant on ∪_{i=1}^{m} (x_i + E_a) where E_a = {x ∈ F_{p^s} | F_a(x) = 0}, and (F_a(x_1), . . . , F_a(x_m)) is a coset of U_m. Again see [1280] for details.
In what follows let U_m = {x ∈ F_q | x^m = 1}. We consider separately the cases q odd and q even; see [1281].
For q odd, let b ∈ F*_q and let m ≥ 3 be an integer satisfying m | (q − 1). Then the Dickson polynomial D_{m,b}(x) is an (m − 1)-good polynomial, as we now describe. Suppose b ∈ ξ^l U_m for some l ∈ S = {0, 1, . . . , m − 1}, where ξ is a primitive element of F_q and ξ^i U_m is the multiplicative coset of U_m in F*_q. Then pairwise disjoint subsets of F_q with cardinality m such that D_{m,b} is constant on each subset include
    D_i = {u + b·u^{−1} | u ∈ ξ^i U_m} for i ∈ I ⊆ S,
where
    I = {0, 1, . . . , l/2 − 1, l + 1, l + 2, . . . , l/2 + (q−1)/(2m) − 1}      if l is even and (q−1)/m is even,
        {0, 1, . . . , l/2 − 1, l + 1, l + 2, . . . , l/2 + (q−1)/(2m) − 1/2}    if l is even and (q−1)/m is odd,
        {0, 1, . . . , (l−1)/2, l + 1, l + 2, . . . , l/2 + (q−1)/(2m) − 1/2}    if l is odd and (q−1)/m is even,
        {0, 1, . . . , (l−1)/2, l + 1, l + 2, . . . , l/2 + (q−1)/(2m) − 1}      if l is odd and (q−1)/m is odd.   (20.25)
The Dickson polynomial D_{m,b}(x) is constant on exactly l_{D_{m,b}} pairwise disjoint subsets with cardinality m where
    l_{D_{m,b}} = (q−1)/(2m) − 1     if l is even and (q−1)/m is even,
                  (q−1−m)/(2m)       if l is even and (q−1)/m is odd,
                  (q−1)/(2m)         if l is odd and (q−1)/m is even,
                  (q−1−m)/(2m)       if l is odd and (q−1)/m is odd,
where, for q even, the corresponding index set is
    I = {0, 1, . . . , l/2 − 1, l + 1, l + 2, . . . , l/2 + (q−1)/(2m) − 1/2}    if l is even,
        {0, 1, . . . , (l−1)/2, l + 1, l + 2, . . . , l/2 + (q−1)/(2m) − 1}      if l is odd.   (20.26)
We now present several examples of new r-good polynomials H over Fps with r = mpt −1
where m > 1, gcd(m, p) = 1, pt 6≡ 1 (mod m), and ps ≡ 1 (mod m). Let lH be the number
of pairwise disjoint subsets with cardinality r + 1 on which H is constant. The examples
were found using Magma [250].
Example 20.7.7 Let F_a(x) = x^{p^t} − a^{p^t−1} x where a is a primitive element of F_{p^s}. Let p = 3, t = 1, and m = 4.
Very recently, Micheli [1385] provided a Galois theoretical framework which allows pro-
duction of good polynomials and showed that the construction of good polynomials can be
reduced to a Galois theoretical problem over global function fields.
Acknowledgement
The author is deeply grateful to Claude Carlet for his very careful reading, valuable
comments and precious information (in particular on Kerdock codes) and to Cunsheng
Ding for his great help in bringing material in this chapter and suggesting improvements.
She is also indebted to Cary Huffman for his precious comments and great help in improving
this chapter.
Chapter 21
Codes over Graphs
Christine A. Kelley
University of Nebraska-Lincoln
21.1 Introduction
Due to advances in graph-based decoding algorithms, codes over graphs have been shown
to be capacity-achieving on many communication channels and are now ubiquitous in in-
dustry applications such as hard disk drives, flash drives, wireless standards (IEEE 802.11n,
IEEE 802.16e, WiMax) [292, 409, 1589, 1590, 1681, 1712], and deep space communication.
While any linear code gives rise to a natural graph representation, the phrase “codes over
graphs” is typically reserved for codes whose corresponding graph representation aids in
the decoding or understanding of those codes. Perhaps the most common example is the
family of low-density parity check (LDPC) codes [787] that have become standard error-
correcting codes in modern wireless and storage devices. LDPC codes are characterized by
having sparse graph representations that allow for efficient and practical implementation
of their iterative decoders. Although the concepts of iterative decoding on sparse graph
representations of codes was introduced much earlier [787, 1784], they remained largely
unexplored until [1316] when improvements in computational power made them more at-
tractive for longer practical block lengths. Other prominent families of graph-based codes
include turbo codes, repeat accumulate codes, fountain codes, Raptor codes, low-density
generator-matrix codes, and others [183, 604, 1044, 1292, 1296, 1687]. Fountain and Raptor
codes are described in Chapter 30.
The following definition presents terminology that we will use in this chapter.
Definition 21.1.1 A graph G is a pair (V, E) where V is a set of points called vertices,
or nodes, and E is a collection of sets of cardinality two of vertices from V called edges.
Unless otherwise specified, the graphs in this chapter are assumed to be simple, meaning
527
that there are no loops or multiedges. Namely, if {vi , vj } ∈ E, then vi 6= vj , and the pairs
in E are distinct. A graph is bipartite if the vertices can be partitioned in two sets V1 and
V2 such that any edge in E contains one element from V1 and one from V2 . A vertex v has
degree r if v belongs to r pairs in E. A graph is r-regular, or regular, if every vertex has
the same degree r. A subgraph of G is a graph (V 0 , E 0 ) where V 0 ⊆ V and E 0 is a set of
unordered pairs of V 0 such that E 0 ⊆ E. If S is a subset of V , the subgraph induced by S
in G is the pair (S, ES ) where {vi , vj } ∈ ES if and only if {vi , vj } ∈ E and vi , vj ∈ S.
Definition 21.2.1 A low-density parity check (LDPC) code C is a linear block code
that is specified by a parity check matrix H that is sparse in the number of its nonzero
entries (thereby giving its name). An LDPC code is said to be (j, r)-regular, or simply regular, if the parity check matrix H has a fixed number j of nonzero entries per column and a fixed number r of nonzero entries per row, and is said to be irregular otherwise.
In another landmark paper, Tanner [1784] initiated the area of designing codes from
graphs and generalized the constraints that could be used in linear codes. Consequently,
the resulting graph representation for linear codes is commonly referred to as a Tanner
graph, described next.
Codes over Graphs 529
FIGURE 21.1: Code rate versus minimum distance/block-length ratio d/n for (j, r)-regular LDPC code ensembles (3, 6), (3, 5), (4, 6), (3, 4), (4, 5), and (5, 6), compared with the equiprobable ensemble of parity check codes.
Definition 21.2.3 An LDPC code C with a parity check matrix H may be represented by
a bipartite graph G, called a Tanner graph, with vertex set V ∪ W where V ∩ W = ∅. The
vertices in V are the variable nodes and correspond to the columns of H; the vertices
in W are the check nodes or constraint nodes and correspond to the rows of H. Two
nodes vt ∈ V and ws ∈ W are connected by an edge if and only if the (s, t)th entry of H is
nonzero. The value of the (s, t)th entry acts as a weight for the corresponding edge. We will
restrict our attention to binary LDPC codes, where each edge implicitly has 1 as a weight.
If the LDPC code is (j, r)-regular, then the degree of each variable node is j and the degree
of each check node is r. The check nodes of an LDPC code represent simple parity check
constraints; i.e., the values of the adjacent variable nodes sum to 0 mod 2.
While early constructions of LDPC codes focused on the regular case, irregular codes
have been shown to be better in decoding performance. The degree distribution of a Tanner
graph is used to analyze the code.
Definition 21.2.4 The degree distribution of the variable nodes and the constraint
nodes of a Tanner graph G are denoted by polynomials λ(x) and ρ(x), respectively. The
variable node degree distribution λ(x) is defined by λ(x) = Σ_{i=2}^{d_v} λ_i x^{i−1}, and the constraint
node degree distribution ρ(x) is defined by ρ(x) = Σ_{i=2}^{d_c} ρ_i x^{i−1}, where λ_i (respectively ρ_i ) is
the fraction of edges in the Tanner graph that are connected to degree i variable (respectively
check) nodes, and dv and dc are the maximum variable node and check node degrees,
respectively. For example, a (3, 6)-regular LDPC code has degree distribution λ(x) = x2
and ρ(x) = x5 . Since degree 1 nodes have been shown to cause poor performance, we assume
there are no degree 1 nodes in the Tanner graph and omit them in the polynomials described
above.
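The degree distribution pair can be read off directly from a parity check matrix by counting edges. A minimal sketch (the function name and the toy matrix are ours):

```python
import numpy as np

def degree_distributions(H):
    """Edge-perspective degree distributions: lam[i] (resp. rho[i]) is the fraction
    of Tanner graph edges incident to degree-i variable (resp. check) nodes."""
    E = H.sum()                  # each nonzero entry of H is one edge
    col_deg = H.sum(axis=0)      # variable node degrees
    row_deg = H.sum(axis=1)      # check node degrees
    lam = {int(i): float(col_deg[col_deg == i].sum() / E) for i in np.unique(col_deg)}
    rho = {int(i): float(row_deg[row_deg == i].sum() / E) for i in np.unique(row_deg)}
    return lam, rho

# A (3,6)-regular code has lambda(x) = x^2 and rho(x) = x^5, i.e., lambda_3 = rho_6 = 1.
H = np.ones((3, 6), dtype=int)   # toy matrix with column weight 3 and row weight 6
print(degree_distributions(H))   # ({3: 1.0}, {6: 1.0})
```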
Example 21.2.5 For an LDPC code with a Tanner graph shown in Figure 21.2, the variable
nodes on the left are labeled v1 , v2 , . . . , v7 and the constraint nodes on the right are labeled
w1 , w2 , w3 , w4 .
530 Concise Encyclopedia of Coding Theory
FIGURE 21.2: A parity check matrix H for a code C on the left with its corresponding
Tanner graph representation on the right. The variable nodes are represented by shaded
circles, and the constraint nodes are represented by squares.
H =
1 1 1 1 0 0 1
1 0 1 0 1 1 0
1 1 0 1 1 1 0
0 1 0 1 1 0 1

[Tanner graph: variable nodes v1 , . . . , v7 on the left (circles) joined to check nodes w1 , . . . , w4 on the right (squares) wherever the corresponding entry of H is 1.]
Remark 21.2.6 For non-binary LDPC codes over Fps , the edge weights act as coefficients
in the corresponding parity check equations, and the sum is set to 0 in the field. For example,
if the first row of a parity check matrix for an LDPC code over F3 were (0 1 2 0 0 1), then
the corresponding equation at the first constraint w1 would be v2 + 2v3 + v6 = 0 mod 3.
For more detail on non-binary LDPC codes, we refer the reader to [490].
Remark 21.2.7 The Tanner graph may be seen as representing the corresponding code C
in the following way. Any binary vector x = x1 x2 · · · xn that is input to the variable node
set (i.e., vi is assigned value xi for i = 1, . . . , n) is a codeword of C if and only if all of
the constraints w1 , . . . , wm are simultaneously satisfied. This coincides with the definition
of parity check matrix since xH T = 0 if and only if x ∈ C. Moreover, while a parity check
matrix (or equivalently, a Tanner graph) represents a unique code, a code may have many
parity check matrix (and Tanner graph) representations.
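The codeword test in Remark 21.2.7 is a one-line computation. A small sketch using the parity check matrix of Figure 21.2 (the function name is ours):

```python
import numpy as np

# Parity check matrix of Figure 21.2: rows are checks w1..w4, columns are v1..v7.
H = np.array([[1, 1, 1, 1, 0, 0, 1],
              [1, 0, 1, 0, 1, 1, 0],
              [1, 1, 0, 1, 1, 1, 0],
              [0, 1, 0, 1, 1, 0, 1]])

def is_codeword(H, x):
    """x is a codeword iff every parity check is satisfied, i.e., x H^T = 0 mod 2."""
    return not np.any((H @ x) % 2)

print(is_codeword(H, np.array([0, 1, 0, 1, 0, 0, 0])))  # True: v2 + v4 satisfies all checks
print(is_codeword(H, np.array([1, 0, 0, 0, 0, 0, 0])))  # False
```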
Example 21.2.9 Suppose the Tanner graph in Figure 21.2 represents a GLDPC code so
that the parity check constraints represent more general constraints. The constraint w1 has
degree 5 and variable node neighbors v1 , v2 , v3 , v4 and v7 . Suppose w1 represents a linear
code Cw1 of length 5 with check equations v1 +v2 +v7 = 0 mod 2 and v1 +v3 +v4 = 0 mod 2.
Then, assuming some ordering on the edges to v1 , v2 , v3 , v4 , v7 , the constraint node w1
is satisfied if and only if the values assigned to v1 , v2 , v3 , v4 , v7 form a codeword in Cw1
(i.e., satisfy the two equations). Thus, for a generalized LDPC code, there are more overall
constraints imposed by the same set of constraint nodes in the Tanner graph. Consequently,
while the matrix in Figure 21.2 is still an incidence matrix of the Tanner graph for the
GLDPC code, it is no longer a parity check matrix of the GLDPC code. Note further that
the code rate of a GLDPC code is typically smaller than the code rate of a simple LDPC
code for the same Tanner graph representation.
With the Tanner graph representation of a code’s parity check matrix, the iterative
decoder from [787] may be seen as a message-passing graph-based decoder that exchanges
information between variable and check nodes in the graph [1784]. The structure of the graph
affects the success of the decoder. Recall that the girth of a graph is the smallest number of
edges in any cycle in the graph, and in a bipartite graph the girth is always even. Moreover,
the diameter of a graph is the largest distance between any two vertices, where distance
is measured by the number of edges in the shortest path between them. For decoding,
the girth corresponds to the number of iterations for which the messages propagated are
independent; thus much research has focused on designing codes with large girth so as to
prevent bad information from reinforcing variable node beliefs. Indeed, iterative decoding
is optimal on cycle-free graphs [1889]. In contrast, a small graph diameter may be desired
to allow information from any node in the graph to reach any other node in the fewest
iterations. Despite this intuition, there exist special cases of codes with small girth and/or
large diameter that still perform well, suggesting that other factors such as overall cycle
structure or graph topology may also play a role.
Remark 21.2.10 Since a girth of 4 is the smallest possible in a bipartite (simple) graph,
it is common practice to remove all 4-cycles in the construction of LDPC codes. Linear
codes that have Tanner graph representations without 4-cycles are characterized in [1181]
in terms of the parameters and minimum distance of the dual code. Moreover, while cycle-
free representations ensure independent messages during iterative decoding, codes having
such representations were shown to have poor minimum distance [693]. In particular, Etzion,
Trachtenberg, and Vardy showed the following.
Theorem 21.2.11 ([693]) If an [n, k, d] linear code C with code rate R has a cycle-free
Tanner graph representation, then for R ≥ 0.5, d ≤ 2, and for R < 0.5,
d ≤ n/(k + 1) + (n + 1)/(k + 1) < 2/R.
Using the girth, the minimum distance of LDPC codes with column weight j (equiva-
lently, having variable node degree j) may be bounded as follows.
Theorem 21.2.12 (Tree Bound [1784]) Suppose a Tanner graph G of an LDPC code C
has girth g and variable node degree j. Then the minimum distance d satisfies
d ≥ 1 + j + j(j − 1) + j(j − 1)^2 + · · · + j(j − 1)^{(g−6)/4} if g/2 is odd;
d ≥ 1 + j + j(j − 1) + j(j − 1)^2 + · · · + j(j − 1)^{(g−8)/4} + (j − 1)^{(g−4)/4} if g/2 is even.
A similar result for GLDPC codes was also given in [1784]. These results are shown by
enumerating the Tanner graph of the code from a variable node as a tree until the layer in
which nodes may not be distinct (due to girth), and examining how many variable nodes
must have a value 1 in a minimum weight codeword.
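The Tree Bound is straightforward to evaluate. A minimal sketch (the function name is ours):

```python
def tree_bound(g, j):
    """Tree Bound (Theorem 21.2.12) on the minimum distance of an LDPC code with
    Tanner graph girth g (even, >= 6) and variable node degree j >= 2."""
    if (g // 2) % 2 == 1:  # g/2 odd
        return 1 + j * sum((j - 1) ** i for i in range((g - 6) // 4 + 1))
    # g/2 even: one extra term (j-1)^{(g-4)/4}
    return (1 + j * sum((j - 1) ** i for i in range((g - 8) // 4 + 1))
            + (j - 1) ** ((g - 4) // 4))

print(tree_bound(6, 3), tree_bound(8, 3), tree_bound(10, 3))  # 4 6 10
```

For girth 6 the bound reduces to d ≥ j + 1, and for girth 8 to d ≥ 2j.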
Remark 21.2.13 Simple LDPC codes with j = 2 form a special case referred to as
cycle codes. Their minimum distance d can be shown to be g/2, where g is the girth of
the corresponding Tanner graph. Since the girth is known to increase only logarithmically
in the block length [516], cycle codes are known to be bad (i.e., d/n → 0 as n → ∞).
Thus, if the variable nodes have degree 2, it is typical to use generalized LDPC codes with
constraints from a strong linear code at the check nodes to improve the minimum distance
and performance.
LDPC codes are often constructed by randomly choosing the positions and values of
the nonzero entries of H. In a random construction, many desirable features for practical
implementation are lost, such as efficient representation, simple analysis, and easy encoding.
However, with some algebraic structure in the code, some of these features may be retained.
The theory of designing and decoding LDPC codes has attracted widespread interest in the
coding theory community; see for example [409, 1121, 1209, 1589, 1590, 1591]. For example,
designing graphs for codes that have properties such as large girth, good expansion (see
Section 21.5), small diameter, or optimizing other combinatorial objects in a graph to obtain
better codes is of fundamental interest.
21.3 Decoding
A codeword transmitted over a communication channel is typically corrupted by noise.
A decoder takes the received (noisy) word r as input and tries to estimate the codeword
that was transmitted. A maximum likelihood decoder (see Section 1.17) finds the codeword
that is most likely to have been transmitted given the received word. In iterative message-passing
decoding, at the end of each iteration the variable node symbols are estimated according to an
estimation rule. If the estimates of all of the variable nodes correspond to a codeword, the
iterative decoder stops and outputs that codeword; otherwise another iteration is performed.
This continues until the decoder either produces an output or some specified number of
iterations is reached.
Let µch (v) be the channel/received message for the variable node v. Let µ^{(t)}_{c→v} be the
message from check node c to variable node v at a particular decoding iteration t, and let
µ^{(t)}_{v→c} be the message from variable node v to check node c at decoding iteration t.
Let N (v) denote the set of check node neighbors of a variable node v in the Tanner graph
G representing the parity check matrix H, and let N (c) denote the set of variable node
neighbors of a check node c in G. Let N (v) \ {c} denote the set of check node neighbors of
v excluding check node c. We define N (c) \ {v} similarly.
Variable node update rule (Gallager A): The message sent from a variable node v to a check
node c in iteration t + 1 is given by
µ^{(t+1)}_{v→c} = µ^{(t)}_{c′→v} if the messages µ^{(t)}_{c′→v} are identical for all c′ ∈ N (v) \ {c},
and µ^{(t+1)}_{v→c} = µch (v) otherwise.
Variable node update rule (Gallager B): The variable node update sets µ^{(t+1)}_{v→c} = x if, for
some chosen threshold b ∈ {1, 2, . . . , |N (v)| − 1}, at least b of the messages from the check
nodes in N (v) \ {c} are equal to the value x; otherwise it sets µ^{(t+1)}_{v→c} = µch (v).
Estimation: The variable nodes are estimated similarly to the variable node update rule
at the end of each iteration t. If the estimated values of the variable nodes correspond
to a valid codeword in the LDPC code, decoding stops. Otherwise, another round of
decoding is performed until a maximum specified limit of decoding iterations is reached.
Example 21.3.2 In Figure 21.3, the update rule at a check node (on the left) and at a
variable node (on the right) are shown. At the check node, the outgoing message is simply
the XOR of the incoming messages. At the variable node, since all the incoming check node
to variable node messages agree and equal to 1 in this example, the outgoing message is
also 1.
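The two Gallager A update rules can be sketched as follows (the function names are ours, and the sample values are illustrative):

```python
from functools import reduce
from operator import xor

def gallager_a_check(incoming_bits):
    """Check-to-variable message: XOR of the other incoming bits."""
    return reduce(xor, incoming_bits, 0)

def gallager_a_variable(channel_bit, incoming_bits):
    """Variable-to-check message: the common value if all other incoming check
    messages agree, otherwise the channel bit."""
    if incoming_bits and all(b == incoming_bits[0] for b in incoming_bits):
        return incoming_bits[0]
    return channel_bit

print(gallager_a_check([1, 0, 1]))        # 0: XOR of the incoming messages
print(gallager_a_variable(0, [1, 1, 1]))  # 1: unanimous check messages override the channel
print(gallager_a_variable(0, [1, 0, 1]))  # 0: disagreement -> fall back to the channel value
```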
We now turn to the soft decision Sum-Product Algorithm (SPA) [787, 1170, 1590]. A soft
decision decoder such as this uses likelihood values for the messages. For binary codes, these
likelihood values may be of the form of probability pairs (p0 , p1 ), where p0 is the probability
of a message symbol representing value 0 and p1 is the probability of a message symbol
FIGURE 21.3: Check node update on the left and variable node update on the right for
the Gallager A Algorithm
representing value 1, or the likelihood values may be of the form of log-likelihood ratios
(LLRs) (for example log(p0 /p1 )). While either representation may be used, for practical
implementation, the log representation, referred to as log-domain decoding, is preferred
over the probability representation, referred to as probability-domain decoding, due to
numerical precision issues that occur in practice. A log-domain description of the SPA is
presented below. We refer the reader to [787] for a probability-domain description of the
SPA. Before we proceed,
we modify our notation to represent soft information.
Let µch (v) = log( Prob(v = 0 | r) / Prob(v = 1 | r) ) be the LLR associated with the variable
node v based on the received word r from the channel. Let µ^{(t)}_{v→c} be the LLR message
sent from a variable node v to a check node c in decoding iteration t and let µ^{(t)}_{c→v} be
the LLR message sent from a check node c to a variable node v in decoding iteration t. Note
that µ^{(t)}_{v→c} represents the LLR for the value associated with variable node v based on
information available at node v from all neighboring nodes excluding node c, and µ^{(t)}_{c→v}
represents the LLR for the value associated with node v based on information available at
node c from all its neighboring nodes excluding node v.
Algorithm 21.3.3 (Sum-Product (SPA))
Initialization: For the first decoding iteration, t = 0, set µ^{(0)}_{v→c} = µch (v) for all c ∈ N (v).
Check node update rule: The absolute value of the message sent from a check node c to a
variable node v in iteration t is given by
tanh( |µ^{(t)}_{c→v}| / 2 ) = ∏_{v′ ∈ N (c)\{v}} tanh( |µ^{(t)}_{v′→c}| / 2 ).
The sign value of the message sent from a check node c to a variable node v in iteration
t is given by
sign( µ^{(t)}_{c→v} ) = ∏_{v′ ∈ N (c)\{v}} sign( µ^{(t)}_{v′→c} ).
Variable node update rule: The message sent from a variable node v to a check node c in
iteration t + 1 is given by
µ^{(t+1)}_{v→c} = µch (v) + Σ_{c′ ∈ N (v)\{c}} µ^{(t)}_{c′→v}.
FIGURE 21.4: Check node update on the left and variable node update on the right for
the Sum-Product Algorithm in the probability-domain
Estimation: The variable nodes are estimated at the end of each iteration t as follows:
v̂ = 0 if µch (v) + Σ_{c′ ∈ N (v)} µ^{(t)}_{c′→v} > 0, and v̂ = 1 otherwise.
If the estimated values of the variable nodes correspond to a valid codeword in the
LDPC code, decoding stops. Otherwise, another round of decoding is performed until
a maximum specified limit of decoding iterations is reached.
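A minimal log-domain sketch of the two SPA update rules (the function names are ours). The combined tanh form below is equivalent to the separate magnitude and sign rules, since tanh is an odd function:

```python
import math

def spa_check_update(incoming_llrs):
    """Check-to-variable LLR: 2*atanh of the product of tanh(m/2) over the other
    incoming messages; the sign works out to the product of the incoming signs."""
    p = 1.0
    for m in incoming_llrs:
        p *= math.tanh(m / 2.0)
    p = max(min(p, 1 - 1e-12), -(1 - 1e-12))  # numerical guard before atanh
    return 2.0 * math.atanh(p)

def spa_variable_update(channel_llr, incoming_llrs):
    """Variable-to-check LLR: channel LLR plus the other incoming check LLRs."""
    return channel_llr + sum(incoming_llrs)

out = spa_check_update([2.0, -1.0])
print(out < 0, abs(out) < 1.0)  # True True: sign = product of signs; magnitude below the minimum
print(spa_variable_update(0.5, [1.0, -2.0]))  # -0.5
```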
Example 21.3.4 In Figure 21.4, the update rule at a check node (on the left) and at a
variable node (on the right) are shown. Instead of using LLRs as messages, the messages
are shown in the probability-domain. The first coordinate of the message represents the
probability of the adjacent variable node being a 0 and the second coordinate represents
the probability of the adjacent variable node being a 1. At the check node, the outgoing
message is a probability pair that depends on the probabilities of the incoming messages.
In the example shown, the probability that the variable node v is a 0 given the incoming
messages at c (as shown in the figure) is 0.558 and the probability that it is a 1 is 0.442.
Specifically, the probability of the outgoing message being a 0 (respectively a 1) is the sum of
the product of the probabilities of an even (respectively odd) number of incoming messages
being 1. The variable node update is also shown using messages in the probability-domain.
The outgoing message is obtained by a component-wise product of the incoming messages
and the channel message and scaled by a normalization factor.
Initialization, the variable node update rule, and estimation are as in the SPA.
Check node update rule: |µ^{(t)}_{c→v}| = min_{v′ ∈ N (c)\{v}} |µ^{(t)}_{v′→c}|. The sign value of the check node to
variable node message remains the same as for the SPA.
Most practical implementations of the MSA [396, 1591] use this approximation in the check
node updates along with scaling factors to improve decoder convergence.
FIGURE 21.5: Check node update on the left and variable node update on the right for
the Min-Sum Algorithm using LLRs
Example 21.3.6 In Figure 21.5, the update rule at a check node (on the left) and at a
variable node (on the right) are shown with the messages as LLRs. At the check node, the
outgoing message has magnitude that is the minimum of the magnitudes of the incoming
message and has a sign that is the product of the signs of the incoming messages. At a
variable node, the outgoing message is simply a sum of the incoming messages and the
channel message.
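A minimal sketch of the min-sum updates (the function names are ours; the sample values follow our reading of Figure 21.5):

```python
def minsum_check_update(incoming_llrs):
    """Min-sum check-to-variable message: minimum incoming magnitude with the
    product of the incoming signs."""
    mag = min(abs(m) for m in incoming_llrs)
    neg = sum(1 for m in incoming_llrs if m < 0)
    return -mag if neg % 2 else mag

def minsum_variable_update(channel_llr, incoming_llrs):
    """Variable-to-check message: channel LLR plus the other incoming LLRs."""
    return channel_llr + sum(incoming_llrs)

print(minsum_check_update([7, -6, 4]))      # -4: magnitude min(7,6,4), sign (+)(-)(+)
print(minsum_variable_update(6, [-6, -4]))  # -4: simple sum of channel and incoming LLRs
```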
Linear programming (LP) decoding for LDPC codes was proposed in [710, 711] and
can be applied to any linear code, with complexity depending on the sparsity of the repre-
sentation. By taking the convex hull of all codewords, a codeword polytope may be obtained
whose vertices are precisely the codewords of the code. ML decoding can be expressed as
an integer linear programming problem, and its output is a vertex of this polytope. For the
additive white Gaussian noise channel, the ML decoding essentially finds the codeword that
minimizes the cost function
ĉ = arg min_{c∈C} c r^T .
When the code constraints are relaxed, the convex hull results in a new polytope, called the
pseudocodeword polytope P. In this case, the linear programming decoder complexity
is significantly reduced, and the LP decoding problem finds the vertex of P that minimizes
the same cost function. The vertices of this polytope with all integer coordinates are codewords of C.
of LP decoding is a vertex of this polytope P, and when the LP decoder finds an all integer
coordinate vertex as the solution, its estimate coincides with that of the ML decoder. Thus,
although LP decoding is in general more complex than other iterative decoding methods
such as the SPA, it has the advantage that when it decodes to a codeword, the codeword is
guaranteed to be the maximum-likelihood codeword [710, 711].
Remark 21.3.7 When decoding GLDPC codes, iterative decoding is performed between
the variable and constraint nodes, but at the constraint nodes, typically ML decoding of
the code representing the constraint node is performed. This is feasible since the degree of
the constraint node, and hence the block length of the code represented by the constraint,
is typically small. For soft decision decoding, the decoding at the constraint node may be
performed efficiently using trellis-based decoding [1897].
[FIGURE 21.6: Codeword failure rate versus signal to noise ratio (SNR (dB)), showing the DE threshold, the waterfall region, the error floor region, and the channel capacity limit.]
Remark 21.3.8 A typical performance plot of an iterative decoder for an LDPC code is
shown in Figure 21.6. The plot shows the failure rate of the decoder as a function of the
channel quality. In the case of the AWGN channel, the channel quality is measured in terms
of signal to noise ratio (SNR). As the SNR increases, the decoder failure rate decreases.
There are two prominent regions: the waterfall region where the decoder failure rate drops
steeply with increasing SNR and the error floor region, where the decoder failure rate
drops slowly with increasing SNR. While the waterfall region typically starts at an SNR
value that is dependent on the degree distribution of the LDPC code [1590] and the chosen
iterative decoder, the error floor region tends to start at an SNR value that is dependent
on the distance properties of the code, the combinatorial structures in the Tanner graph,
and the chosen iterative decoder.
Finite length LDPC codes when decoded via iterative message-passing decoders may
sometimes fail to converge to a codeword, or sometimes converge to an incorrect codeword.
Depending on the channel and choice of iterative decoder, these types of failure events have
been attributed to combinatorial structures in the underlying LDPC Tanner graph called
stopping sets [534], trapping sets [1387, 1846], pseudocodewords [710, 1099, 1100, 1102, 1158],
and absorbing sets [612]. Substantial research has been conducted to analyze these structures
and to optimize them in LDPC code design [612, 1387, 1846].
Stopping sets characterize all error events that prevent the iterative decoder from con-
verging to a codeword on an erasure channel [534]. If the variable nodes of a stopping set are
erased, then the iterative decoder cannot recover any of these variable nodes and is “stuck”
at the stopping set. This gives a simple code design criterion, namely to maximize the size
of the smallest stopping set, denoted smin , in the corresponding graph representation. Since
the support of any codeword is itself a stopping set, smin ≤ d.
Example 21.3.10 In Figure 21.7, the set S = {v1 , v3 , v6 } forms a stopping set of size 3 in
the Tanner graph. Note that each neighboring constraint node for these variable nodes is
connected to S more than once. If the received values at these variable nodes are erased, an
iterative decoder cannot recover the values of the erased nodes. The set {v2 , v4 } in Figure
21.7 is an example of a stopping set of smallest size. Note that {v1 , v6 } is not a stopping
set since w1 is only adjacent to v1 .
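The stopping set condition, namely that every neighboring check node of S is connected to S at least twice, is easy to test. A small sketch, assuming Figure 21.7 uses the parity check matrix of Figure 21.2 (the adjacencies described in this example match it; the function name is ours):

```python
import numpy as np

# Parity check matrix of Figure 21.2, which matches the adjacencies of Figure 21.7.
H = np.array([[1, 1, 1, 1, 0, 0, 1],
              [1, 0, 1, 0, 1, 1, 0],
              [1, 1, 0, 1, 1, 1, 0],
              [0, 1, 0, 1, 1, 0, 1]])

def is_stopping_set(H, S):
    """S (0-indexed variable nodes) is a stopping set iff every check node with
    a neighbor in S has at least two neighbors in S."""
    counts = H[:, sorted(S)].sum(axis=1)
    return bool(np.all((counts == 0) | (counts >= 2)))

print(is_stopping_set(H, {0, 2, 5}))  # True:  {v1, v3, v6}
print(is_stopping_set(H, {1, 3}))     # True:  {v2, v4}, a smallest stopping set
print(is_stopping_set(H, {0, 5}))     # False: w1 meets {v1, v6} only once
```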
[FIGURE 21.7: A Tanner graph with variable nodes v1 , . . . , v7 and check nodes w1 , . . . , w4 .]
Example 21.3.11 Consider the Tanner graph in Figure 21.7. In Figure 21.8, the received
word r = 011???0 is input to the variable nodes, shown in the left-most graph. Recall
that any bit that is received is guaranteed to be correct for the binary erasure channel
(BEC). In the second graph, the information at w1 is used to recover the value at v4 . This
allows w4 to have enough information to recover the value at v5 , shown in the third graph,
which in turn enables w3 to recover the value at v6 . It is worth noting that if the set of
erased variable nodes does not contain a stopping set, the decoder will successfully recover
the transmitted word regardless of the number of initially erased nodes; in particular, this
number can exceed d − 1. Indeed, the minimum distance of the corresponding code in this
example is 2.
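The recovery process in this example can be sketched as a peeling decoder (the function name is ours; we again assume Figure 21.7 shares the parity check matrix of Figure 21.2):

```python
import numpy as np

H = np.array([[1, 1, 1, 1, 0, 0, 1],
              [1, 0, 1, 0, 1, 1, 0],
              [1, 1, 0, 1, 1, 1, 0],
              [0, 1, 0, 1, 1, 0, 1]])

def peel_bec(H, r):
    """BEC peeling decoder: repeatedly find a check with exactly one erased
    neighbor (None) and solve its parity equation for that variable."""
    r = list(r)
    progress = True
    while progress:
        progress = False
        for row in H:
            nbrs = np.flatnonzero(row)
            erased = [v for v in nbrs if r[v] is None]
            if len(erased) == 1:
                v = erased[0]
                r[v] = sum(r[u] for u in nbrs if u != v) % 2
                progress = True
    return r

print(peel_bec(H, [0, 1, 1, None, None, None, 0]))  # [0, 1, 1, 0, 1, 0, 0]
# Erasing the stopping set {v1, v3, v6} leaves the decoder stuck:
print(peel_bec(H, [None, 1, None, 0, 1, None, 0]))  # the three Nones remain
```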
For other channels such as the binary symmetric channel (BSC) or the additive white
Gaussian noise (AWGN) channel, if certain subsets of variable nodes receive incorrect mes-
sages from the channel, the iterative decoder may eventually get stuck at a so-called trapping
[FIGURE 21.8: Four snapshots of iterative BEC decoding of r = 011???0 on the Tanner graph of Figure 21.7: w1 first recovers v4 , then w4 recovers v5 , and finally w3 recovers v6 .]
set and not be able to correct the errors on the variable nodes of this trapping set. Combina-
torially, an (a, b)-trapping set is a subset S of variable nodes in the Tanner graph G where
|S| = a and whose induced subgraph in G contains b odd degree check nodes. However,
whether this set causes an error or not depends on the corresponding choice of decoder
[1846]. To remove this dependence on the decoder choice, absorbing sets were introduced in
[612] to characterize failure on these channels in a purely combinatorial way.
Definition 21.3.12 An (a, b)-absorbing set is an (a, b)-trapping set S in a Tanner graph
G with the additional property that every variable node in S has strictly more even-degree
than odd-degree check node neighbors in the subgraph induced by S in G [612]. More refined notions of
absorbing sets, such as fully absorbing and elementary absorbing sets, have also been defined
to describe varying degrees of harmfulness [612].
Intuitively, if the information at the variable nodes in an absorbing set is incorrect, their
estimates are unlikely to change since each neighboring check node of the absorbing set
connects an even number of times to these erroneous nodes.
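The combinatorial conditions can be checked directly. A small sketch on the matrix of Figure 21.2 (the function name is ours):

```python
import numpy as np

H = np.array([[1, 1, 1, 1, 0, 0, 1],
              [1, 0, 1, 0, 1, 1, 0],
              [1, 1, 0, 1, 1, 1, 0],
              [0, 1, 0, 1, 1, 0, 1]])

def classify_set(H, S):
    """Return (a, b, absorbing) for a variable node subset S (0-indexed): a = |S|,
    b = number of check nodes of odd degree in the induced subgraph, and whether
    every node of S has strictly more even- than odd-degree check neighbors."""
    S = sorted(S)
    induced_deg = H[:, S].sum(axis=1)
    odd = induced_deg % 2 == 1
    b = int(odd.sum())
    absorbing = all(
        2 * int(odd[np.flatnonzero(H[:, v])].sum()) < len(np.flatnonzero(H[:, v]))
        for v in S)
    return len(S), b, absorbing

print(classify_set(H, {1, 3}))     # (2, 0, True): a codeword support is an (a, 0) absorbing set
print(classify_set(H, {0, 2, 5}))  # (3, 1, False)
```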
Remark 21.3.13 In general, small absorbing sets are regarded as the most problematic.
Removing these is one method to obtain low error-floor performance [151, 612, 1182]. The
smallest (a, b)-absorbing set of the Tanner graph of a code is the one with the smallest a,
and the smallest corresponding b for that given a, for which an absorbing set exists [612].
During message-passing iterative decoding, if the shortest path length from variable
node vi to another variable node vj is 2t, then it takes t iterations for information to prop-
agate from vi to vj . This message propagation can be better visualized using the decoder’s
computation tree for each node.
Remark 21.3.15 Recall that iterative decoding is optimal on a cycle-free graph [1889].
Using this observation, Wiberg and others showed that the iterative decoder attempts to
converge to the closest codeword solution for codewords on the tree. The codewords on
the computation tree contain all the codewords of the original LDPC code as well as other
valid configurations that are not codewords of the LDPC code. Valid configurations on the
computation tree are referred to as pseudocodewords. For the case of LP decoding, the
set of pseudocodewords is shown to be all codewords existing in Tanner graphs that are
obtained from all possible lifts (for any lift degree and choice of permutations in the lift) of
the original Tanner graph; see [710, 711, 1158]. Pseudocodewords for an LP decoder can be
described more accurately as belonging to polyhedra and bounds on the parameters of the
pseudocodewords can be derived using the structure of the LDPC Tanner graph.
Remark 21.3.16 Since the parity check matrix for a code is not unique, the choice of which
parity check matrix to use is an important one. Depending on which matrix is chosen,
the corresponding graph may have different properties. Several papers have investigated
the effect of redundancy of the parity check matrix on decoder performance. In general,
the higher the redundancy, the better the performance of the iterative decoder due to
the reduction of bad combinatorial structures (e.g., small stopping sets, absorbing sets,
pseudocodewords) that typically results [1636, 1950]. For example, in [1636] it is shown that
with enough redundancy, the smallest stopping set size can match the minimum distance
of the code. However, higher redundancy comes at a cost of increased decoder complexity.
An LDPC code, denoted a PG-LDPC code, is obtained by taking the points to lines
incidence matrix of PG(m, q) and using that for its parity check matrix. For m = 2 and
q = 2^s , the corresponding PG-LDPC code has block length n = 2^{2s} + 2^s + 1, dimension
k = 2^{2s} + 2^s − 3^s , and minimum distance dmin = 2^s + 2.
Example 21.4.2 An incidence matrix for the points and lines of PG(2, 2) over F2 (i.e., the
projective plane) is shown in Figure 21.9, along with the corresponding Tanner graph. The
geometry is commonly known as the Fano plane. The corresponding PG-LDPC code is the
[7, 3, 4]2 simplex code, the dual of the Hamming code H3,2 .
FIGURE 21.9: A Fano plane PG(2,2) on the left, the corresponding incidence matrix in
the middle, and the LDPC Tanner graph on the right
H =
1 1 0 1 0 0 0
0 1 1 0 1 0 0
0 0 1 1 0 1 0
0 0 0 1 1 0 1
1 0 0 0 1 1 0
0 1 0 0 0 1 1
1 0 1 0 0 0 1
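The incidence properties of PG(2, 2) can be verified numerically. A small sketch (the assertions are ours):

```python
import numpy as np

# Incidence matrix of the Fano plane PG(2, 2) from Figure 21.9 (rows = lines).
H = np.array([[1, 1, 0, 1, 0, 0, 0],
              [0, 1, 1, 0, 1, 0, 0],
              [0, 0, 1, 1, 0, 1, 0],
              [0, 0, 0, 1, 1, 0, 1],
              [1, 0, 0, 0, 1, 1, 0],
              [0, 1, 0, 0, 0, 1, 1],
              [1, 0, 1, 0, 0, 0, 1]])

# Every line contains 3 points and every point lies on 3 lines ...
assert (H.sum(axis=1) == 3).all() and (H.sum(axis=0) == 3).all()
# ... and any two distinct lines meet in exactly one point.
G = H @ H.T
assert all(G[i, j] == 1 for i in range(7) for j in range(7) if i != j)
print("PG(2,2) incidence checks pass")
```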
Definition 21.4.3 The m-dimensional finite Euclidean geometry over Fq without the
origin point and the lines containing it, denoted EG0 (m, q), has the following parameters
[1159]: There are q^m − 1 points and (q^{m−1} − 1)(q^m − 1)/(q − 1) lines. Each line contains q points and each
point is on (q^m − q)/(q − 1) lines. Moreover, any two points have exactly one line in common and
any two lines either have one point in common or are parallel (i.e., they have no points in
common).
An LDPC code, denoted an EG-LDPC code, is obtained by taking the points to lines
incidence matrix of EG0 (m, q) and using that for its parity check matrix. For m = 2
and q = 2^s , the corresponding EG-LDPC code has block length n = 2^{2s} − 1, dimension
k = 2^{2s} − 3^s , and minimum distance dmin = 2^s + 1.
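The parameter formulas for the two-dimensional PG- and EG-LDPC codes can be sketched as follows (the function names are ours):

```python
def pg_ldpc_params(s):
    """[n, k, d] of the 2-dimensional PG-LDPC code (m = 2, q = 2^s)."""
    q = 2 ** s
    return q * q + q + 1, q * q + q - 3 ** s, q + 2

def eg_ldpc_params(s):
    """[n, k, d] of the 2-dimensional EG-LDPC code (m = 2, q = 2^s)."""
    q = 2 ** s
    return q * q - 1, q * q - 3 ** s, q + 1

print(pg_ldpc_params(1))  # (7, 3, 4): the simplex code of Example 21.4.2
print(eg_ldpc_params(2))  # (15, 7, 5)
```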
There are a wide variety of finite geometry codes. For example, a µ-dimensional subspace
of a finite geometry is called a µ-flat. By taking the incidence matrix of µ1 -flats and µ2 -flats
from an m-dimensional finite geometry, where 0 ≤ µ1 < µ2 ≤ m, LDPC codes have also
been obtained [1780].
Remark 21.4.4 FG-LDPC codes have the advantage of being cyclic or quasi-cyclic; see
Section 1.12 and Chapters 2 and 7. In a cyclic code, each row of the parity check matrix
is a cyclic shift of the preceding row by one position. For quasi-cyclic codes, the parity
check matrix can be seen as an array of circulant submatrices. The cyclic or
quasi-cyclic structure of these codes is particularly desirable for practical implementation
where identical processing units can be run in parallel to improve the efficiency of encoding
and decoding operations. Many codes used in industry have this property.
Remark 21.4.5 The parity check matrices for FG-LDPC codes have a large number of re-
dundant rows, which may positively impact their decoding performance; see Remark 21.3.16.
In particular, the redundancy allows decoding to proceed even when some nodes become
faulty. Moreover, the geometric structure of these codes has led to concrete results on stop-
ping sets and pseudocodewords [1101, 1912], and trapping sets of FG-LDPC codes [536].
More recently, absorbing sets of FG-LDPC codes were analyzed in [150, 1261, 1278].
Remark 21.4.6 Incidence matrices from a wide variety of discrete structures may be used
to obtain LDPC codes that have practical advantages. This may include a more compact
representation, easier implementation, and/or theoretical guarantees on the corresponding
code parameters using the discrete structure’s properties. For example, Steiner systems,
combinatorial block designs (examined in Chapter 5), partial geometries, generalized quad-
rangles, generalized polygons, and Latin squares have all been explored for LDPC code
design [1052, 1053, 1054, 1101, 1284, 1859].
Graphs whose second largest eigenvalue µ2 (in absolute value) of the corresponding
adjacency matrix is small compared to the largest eigenvalue µ1 are known to be good
expanders. Recall that the adjacency matrix of an r-regular connected graph has r as its
largest eigenvalue, and for a (j, r)-regular bipartite graph, the largest eigenvalue is √(jr).
We start by mentioning two bounds on the minimum distance, called the Bit-Oriented
and Parity-Oriented Bounds, that rely on the connectivity properties of the Tanner graph.
Theorem 21.5.3 (Bit-Oriented Bound [1787]) Let G be a regular connected bipartite graph with
n variable nodes of uniform degree j and m constraint nodes of uniform degree r, and let each
constraint node represent constraints from an [r, kr, εr] code¹. Then the minimum distance
d of the GLDPC code C with parity check matrix H represented by the Tanner graph G
satisfies
d ≥ n(jεr − µ2) / (jr − µ2),
where µ2 is the second largest eigenvalue of H^T H.
¹An [a, b, c] code here has length a, dimension b, and minimum distance c, so the
code rate of the code is b/a. For simple LDPC codes, the check nodes represent simple [r, r − 1, 2] parity
check codes.
In both cases, the smaller the second largest eigenvalue µ2 , the better the bound on the
minimum distance.
Constructing graphs with good expansion is a hard problem [982]. Since most known
constructions yield regular graphs, we start by showing a common technique to convert a
regular graph into a bipartite graph that may be used to represent a code.
Definition 21.5.5 Let G be a graph with vertex set V and edge set E. Then the edge-vertex
incidence graph of G is the bipartite graph with vertex set V ∪ E and edge set
{ {v, e} | v ∈ V and e ∈ E are incident in G }.
Example 21.5.7 A 3-regular graph on 8 vertices and its corresponding (2, 3)-regular edge-
vertex incidence graph are shown in Figure 21.10. Since edge-vertex incidence graphs are
2-left regular by construction, it is standard to use edge-vertex incidence graphs as Tanner
graphs for GLDPC codes; see Remark 21.2.13.
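The edge-vertex incidence construction takes only a few lines of code. A sketch on K4, the 3-regular graph on 4 vertices, rather than the 8-vertex graph of Figure 21.10 (the function name is ours):

```python
def edge_vertex_incidence(edges):
    """Tanner graph from a graph's edge list: each edge becomes a degree-2
    variable node, each vertex a check node joined to its incident edges."""
    checks = {}
    for e, (u, v) in enumerate(edges):
        checks.setdefault(u, []).append(e)
        checks.setdefault(v, []).append(e)
    return checks  # vertex (check node) -> list of incident edges (variable nodes)

# K4 is 3-regular, so the edge-vertex incidence graph is (2, 3)-regular.
K4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
checks = edge_vertex_incidence(K4)
print(all(len(es) == 3 for es in checks.values()))  # True: every check node has degree 3
```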
FIGURE 21.10: A (2, 3)-regular edge-vertex incidence graph on the right corresponding
to the 3-regular graph on the left
[Figure 21.10 residue: the 3-regular graph has 8 vertices, which become the degree-3 check nodes w1 , . . . , w8 ; its 12 edges become the degree-2 variable nodes v1 , . . . , v12 of the edge-vertex incidence graph.]
Definition 21.5.8 Expander codes are families of graph-based codes whose correspond-
ing Tanner graphs are obtained from expander graphs. If the expander graph is not bipartite,
then the code is represented by the edge-vertex incidence graph of the expander graph (and
inherits some level of expansion). If the expander graph is bipartite, then the code may be
a simple LDPC code, or a GLDPC code with code constraints used at the check nodes, or
a GLDPC code represented by the edge-vertex incidence graph of the bipartite expander
graph with possibly multiple types of code constraints at the vertices.
We now state a particularly useful result describing the expansion of an r-regular graph
due to Alon and Chung.
544 Concise Encyclopedia of Coding Theory
Theorem 21.5.9 ([40]) Let G be an r-regular graph on n vertices and let µ2 be the second largest eigenvalue (in absolute value) of its adjacency matrix. Then every subset S of γn vertices contains at most

    (nr/2)(γ² + (µ2/r)(γ − γ²))

edges in the subgraph induced by S in G.
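Theorem 21.5.9 can be checked numerically on a small example. The sketch below (an illustration, not part of the text) uses the Petersen graph, which is 3-regular on 10 vertices with µ2 = 2, and verifies the edge-count bound for every subset of 4 vertices:

```python
import itertools
import numpy as np

# Petersen graph: outer 5-cycle, inner 5-cycle (pentagram), and spokes.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0),
         (5, 7), (7, 9), (9, 6), (6, 8), (8, 5),
         (0, 5), (1, 6), (2, 7), (3, 8), (4, 9)]
n, r = 10, 3
A = np.zeros((n, n))
for u, v in edges:
    A[u, v] = A[v, u] = 1

# Eigenvalues of the Petersen graph are 3, 1, -2, so mu2 = 2 in absolute value.
mu2 = sorted(abs(np.linalg.eigvalsh(A)))[-2]
assert abs(mu2 - 2.0) < 1e-9

# Every subset S with |S| = gamma*n = 4 induces at most
# (n*r/2)*(gamma**2 + (mu2/r)*(gamma - gamma**2)) = 4.8 edges.
gamma = 0.4
bound = (n * r / 2) * (gamma ** 2 + (mu2 / r) * (gamma - gamma ** 2))
for S in itertools.combinations(range(n), 4):
    S = set(S)
    assert sum(1 for u, v in edges if u in S and v in S) <= bound + 1e-9
```

Here the bound evaluates to 4.8, while the Petersen graph (girth 5) never induces more than 3 edges on 4 vertices.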
Using Theorem 21.5.9, Sipser and Spielman show that the guaranteed error correction
capabilities of a code may be improved when the underlying graph is a good expander.
Theorem 21.5.10 ([1719]) Let C be a GLDPC code formed by taking the edge-vertex incidence graph of an (n, r, µ2)-expander. Let the check nodes represent constraints of an [r, kr, εr] linear block code. Then the resulting GLDPC code has block length N = nr/2, rate R ≥ 2k − 1, and minimum distance

    d ≥ N ((ε − µ2/r) / (1 − µ2/r))².
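The guarantees of Theorem 21.5.10 are easy to tabulate. The helper below is a sketch with hypothetical names, and the expander parameters are made-up illustrative values rather than a known construction:

```python
def gldpc_params(n, r, mu2, k, eps):
    """Return (block length N, rate bound R, distance bound d) for the GLDPC
    code on the edge-vertex incidence graph of an (n, r, mu2)-expander with
    [r, k*r, eps*r] check codes (Theorem 21.5.10)."""
    N = n * r // 2
    R = 2 * k - 1
    d = N * ((eps - mu2 / r) / (1 - mu2 / r)) ** 2
    return N, R, d

# A Ramanujan-quality graph has mu2 <= 2*sqrt(r - 1); with r = 64 this gives
# mu2/r ~ 0.25, so a check code of relative distance eps = 0.5 already yields
# a distance guarantee that is linear in N.
N, R, d = gldpc_params(1000, 64, 15.9, 0.75, 0.5)
assert N == 32000 and R == 0.5
assert d > 0.1 * N
```

The example illustrates why µ2 matters: the guarantee is nontrivial only when ε exceeds µ2/r, i.e., when the check code is strong relative to the graph's expansion.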
Remark 21.5.11 A family of highly expanding graphs known as Ramanujan graphs [1291]
was constructed with excellent graph properties that surpassed the parameters predicted
for random graphs. One notable family is obtained by taking the Cayley graph of projective
special linear groups over Fq with a particular choice of generators [1291]. The description
of these graphs and their analysis relies on deep results from mathematics using tools from
graph theory, number theory, and representation theory of groups [491]. Sipser and Spielman
show that graphs with good expansion lead to LDPC codes with minimum distance growing
linearly with the block length [1719]. Indeed, [1719] gives a construction of one of the few
explicit algebraic families of asymptotically good codes known to date, using the graphs from
[1291] and Theorem 21.5.10. Cayley graphs have also been used in other notable LDPC code
constructions [1330].
We now consider expander codes that result from bipartite expander graphs directly
(and not from their edge-vertex incidence graphs).
Definition 21.5.12 Let 0 < α < 1 and 0 < δ < 1. A (j, r)-regular bipartite graph G with n variable nodes of degree j and m check nodes of degree r is an (αn, δj)-expander if, for every subset S of variable nodes with |S| < αn, the set of neighbors of S, denoted N(S), satisfies |N(S)| ≥ δj|S|.
Theorem 21.5.13 ([1719]) Let G be an (αn, δj)-expander.

(a) If δ > 1/2, then the simple LDPC code represented by the Tanner graph G has minimum distance d ≥ αn and code rate R ≥ 1 − j/r.

(b) If the check nodes represent constraints from an [r, kr, εr] code and if δ > 1/(εr), then the GLDPC code represented by the Tanner graph G has minimum distance d ≥ αn and code rate R ≥ 1 − j(1 − k).
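For small graphs, the expansion condition of Definition 21.5.12 can be verified by exhaustive search. The sketch below (illustrative helper, not from the text) takes as variable nodes the 6 edges of K4 and as check nodes its 4 vertices, a tiny (2, 3)-regular bipartite graph:

```python
import itertools

var_adj = [{0, 1}, {0, 2}, {0, 3}, {1, 2}, {1, 3}, {2, 3}]   # j = 2

def is_expander(var_adj, alpha, delta, j):
    """True if every set S of variable nodes with |S| < alpha*n has at
    least delta*j*|S| distinct check-node neighbors."""
    n = len(var_adj)
    for size in range(1, n + 1):
        if size >= alpha * n:
            break
        for S in itertools.combinations(range(n), size):
            if len(set().union(*(var_adj[v] for v in S))) < delta * j * size:
                return False
    return True

# Two distinct edges of K4 always cover at least 3 vertices, so expansion
# with delta = 3/4 holds for |S| < 3, but delta = 1 fails whenever two
# edges share a vertex.
assert is_expander(var_adj, alpha=0.5, delta=0.75, j=2)
assert not is_expander(var_adj, alpha=0.5, delta=1.0, j=2)
```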
Finally, we consider the case of taking the edge-vertex incidence graph of a bipartite
expander graph for a code.
Definition 21.5.14 A (j, r)-regular bipartite graph G on n left vertices and m right vertices
is a (j, r, n, m, µ2 )-expander if the second largest eigenvalue of A(G) (in absolute value) is
µ2 .
Remark 21.5.15 Let G be a (j, r, n, m, µ2)-expander and consider the edge-vertex incidence graph of G. The corresponding GLDPC code is formed by using the constraints of a [j, k1j, ε1j] linear block code at each degree j vertex and the constraints of an [r, k2r, ε2r] linear block code at each degree r vertex. The resulting GLDPC code has block length N = nj = mr and rate R ≥ k1 + k2 − 1.
Remark 21.6.2 Note that the protograph used in a protograph code construction may
have multiple edges between the same pair of vertices. Hence, the incidence matrix may
have entries of value greater than one. Typically, the lifted graph that is used to represent
the code no longer retains the multiple edges by the choice of permutations used.
Example 21.6.3 In Figure 21.11, a protograph is shown on the left with its corresponding
incidence matrix. Note that the top right node (represented by the first row) has 2 edges to
the top left node, no edges to the second left node, and a single edge to each of the remaining
left nodes. An example of a degree 3 lift is shown on the right with its corresponding parity
check matrix. The permutation matrices correspond to the assignments made to the edges
of the protograph while constructing the lift, and indicate how the vertices in the clouds
connect to each other. If the permutation σ is assigned to the edge {vi , cj } in the protograph
(with vi a left node and cj a right node), then vi,k is connected to cj,σ(k) in the lift for
k = 1, . . . , 3. Note that the local structure at each node belonging to the cloud of v (or the
cloud of c) in the lift is the same as that for v (or for c) in the protograph.
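The lifting procedure just described can be sketched in code. The protograph and shift values below are illustrative choices, not the ones in Figure 21.11, and only 0/1 protograph entries are handled (a double edge would require a sum of two distinct permutation matrices):

```python
import numpy as np

def lift(B, shifts, L):
    """Replace each 1-entry of B by a cyclic-shift permutation matrix and
    each 0-entry by the L x L zero block."""
    I = np.eye(L, dtype=int)
    zero = np.zeros((L, L), dtype=int)
    return np.block([[np.roll(I, shifts[i][j], axis=1) if B[i][j] else zero
                      for j in range(len(B[0]))] for i in range(len(B))])

B = [[1, 1, 1],
     [1, 1, 0]]
H = lift(B, shifts=[[0, 1, 2], [2, 0, 0]], L=3)

assert H.shape == (6, 9)
# Every variable node in the cloud of v_j keeps the degree of v_j in B,
# since each permutation block contributes exactly one 1 per column.
assert all(H[:, c].sum() == sum(row[c // 3] for row in B) for c in range(9))
```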
Definition 21.6.4 Array-based LDPC codes are defined as LDPC codes with parity check matrices composed of blocks of circulant matrices² and were introduced in [703]. That is, the parity check matrix H is composed of a j × r array of circulant matrices, each of size ℓ × ℓ. The overall block length is rℓ and the code rate of the corresponding LDPC code is R ≥ 1 − j/r. Typically, each circulant matrix in the array is a cyclically shifted identity matrix, where the shift is chosen either randomly or algebraically.
Example 21.6.5 The Tanner–Sridhara–Fuja (TSF) codes introduced in [1788] are one family of array-based codes with algebraically chosen circulant matrices. Let ℓ be a prime, and let a and b be two elements of the multiplicative group of Fℓ, with orders j and r, respectively. Then the parity check matrix H is defined as

    H = [ I_{1,ℓ}         I_{b,ℓ}          I_{b²,ℓ}          ···  I_{b^{r−1},ℓ}
          I_{a,ℓ}         I_{ab,ℓ}         I_{ab²,ℓ}         ···  I_{ab^{r−1},ℓ}
          ⋮               ⋮                ⋮                      ⋮
          I_{a^{j−1},ℓ}   I_{a^{j−1}b,ℓ}   I_{a^{j−1}b²,ℓ}   ···  I_{a^{j−1}b^{r−1},ℓ} ],

where I_{a^i b^k, ℓ} is the ℓ × ℓ identity matrix cyclically shifted to the left by a^i b^k positions, and a^i b^k is interpreted as an element of the integers Zℓ = {0, 1, . . . , ℓ − 1}.
One popular code is the [155, 64, 20] code that is obtained by choosing ℓ = 31, a = 5, b = 2 and j = 3, r = 5 in the above construction. The matrix is given by

    H = [ I_{1,31}   I_{2,31}   I_{4,31}   I_{8,31}   I_{16,31}
          I_{5,31}   I_{10,31}  I_{20,31}  I_{9,31}   I_{18,31}
          I_{25,31}  I_{19,31}  I_{7,31}   I_{14,31}  I_{28,31} ],

where I_{x,31} is the 31 × 31 identity matrix cyclically shifted to the left by x positions. This code has block length 155, dimension 64, minimum distance 20, and the corresponding Tanner graph has a girth of 8.
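This parity check matrix is small enough to build and check directly. The sketch below constructs H from shifted identity matrices and recovers the stated dimension 64 as 155 − rank(H) over GF(2) (the rank is 91, two less than the 93 rows, since the identities in each block-row sum to the all-ones vector):

```python
import numpy as np

def shifted_identity(x, ell):
    """ell x ell identity matrix cyclically shifted to the left by x."""
    return np.roll(np.eye(ell, dtype=int), -x, axis=1)

def gf2_rank(M):
    """Rank of a 0/1 matrix over GF(2) via Gaussian elimination."""
    M = M.copy() % 2
    rank = 0
    for col in range(M.shape[1]):
        pivot = next((r for r in range(rank, M.shape[0]) if M[r, col]), None)
        if pivot is None:
            continue
        M[[rank, pivot]] = M[[pivot, rank]]       # swap pivot row into place
        for r in range(M.shape[0]):
            if r != rank and M[r, col]:
                M[r] ^= M[rank]                   # eliminate the column
        rank += 1
    return rank

ell, a, b, j, r = 31, 5, 2, 3, 5
H = np.block([[shifted_identity((a**i * b**k) % ell, ell) for k in range(r)]
              for i in range(j)])
assert H.shape == (93, 155)
assert 155 - gf2_rank(H) == 64   # the stated dimension of the TSF code
```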
Furthermore, several graph automorphism properties were shown to exist within the
Tanner graph for these codes [1785] that provide an algebraic framework for analyzing and
understanding the structure and properties of the codes.
Remark 21.6.6 It was shown in [1103] that all array-based LDPC codes and several other
classes of protograph LDPC codes may be obtained and analyzed in the voltage graph
framework. Moreover, tools from voltage graph theory can be used to obtain (often simpler)
proofs of code graph properties.
The array-based LDPC codes as defined above also belong to the class of quasi-cyclic (QC) codes. One interesting feature of quasi-cyclic array-based codes is that they may be used to obtain convolutional codes. A binary ℓ × ℓ circulant matrix may be represented by a polynomial in the ring F2[X] mod (X^ℓ − 1). Viewing these polynomials as elements in F2[X], a convolutional code is obtained. In terms of the Tanner graph, the convolutional code can be seen as an "unwrapping" of the Tanner graph of the QC code within each circulant [1789]. Also, by Theorem 7.6.3, the free distance of the convolutional code is bounded below by the minimum distance of the original QC code. LDPC convolutional codes were later shown to belong to the family of spatially-coupled LDPC codes [1172]. The latter are currently the most interesting class of LDPC codes capable of achieving near capacity performance on many communication channels and will be discussed in Section 21.8.
Array-based codes built from circulant matrices have a limited minimum distance. Specifically, the following result has been shown.

Theorem 21.6.7 Let C be an LDPC code whose parity check matrix contains a jℓ × (j + 1)ℓ submatrix forming a j × (j + 1) grid of nonzero pairwise commuting ℓ × ℓ matrices. Then the minimum distance of C is at most (j + 1)!.

² A circulant matrix is a square matrix in which each row is a cyclic shift of the previous row.

It is well known that circulant matrices commute under multiplication. Thus, the minimum distance bound holds for all array-based LDPC codes with the property described in Theorem 21.6.7, namely that the parity check matrix contains a jℓ × (j + 1)ℓ submatrix having a j × (j + 1) grid of nonzero circulants. A generalization of this theorem is given next [1730]. Protograph codes whose permutation assignments generate a commutative permutation group also have limitations on properties such as Tanner graph girth [1103].
Theorem 21.6.8 ([1730]) Let C be an LDPC code with a parity check matrix H composed of a j × r grid of ℓ × ℓ permutation matrices that commute and is derived from a degree ℓ lift of a protograph represented by a j × r parity check matrix B. Then the minimum distance of C is bounded above by

    d ≤ min*_{S ⊆ {1,2,...,r}, |S| = j+1}  Σ_{i∈S} perm(B_{S\{i}}),

where perm(B_{S\{i}}) denotes the permanent of the matrix consisting of the j columns of B indexed by S \ {i}, and the min* operator returns the smallest nonzero value from a set.
Despite the upper bounds on the minimum distance in Theorems 21.6.7 and 21.6.8,
array-based LDPC codes are prominent in industry applications as their structure lends
itself to efficient practical implementation.
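The bound of Theorem 21.6.8 can be evaluated by brute force for small protographs. The sketch below uses hypothetical helper names; for an all-ones 3 × 5 protograph it returns (j + 1)! = 24, consistent with the classical (j + 1)! limitation for codes built from commuting circulants:

```python
import itertools
import math

def permanent(M):
    """Permanent of a small square matrix by summing over permutations."""
    n = len(M)
    return sum(math.prod(M[i][sigma[i]] for i in range(n))
               for sigma in itertools.permutations(range(n)))

def distance_bound(B):
    """min* over (j+1)-subsets S of columns of sum_{i in S} perm(B_{S\\{i}}),
    where min* returns the smallest nonzero value (Theorem 21.6.8)."""
    j, r = len(B), len(B[0])
    values = []
    for S in itertools.combinations(range(r), j + 1):
        total = sum(permanent([[row[c] for c in S if c != i] for row in B])
                    for i in S)
        if total > 0:
            values.append(total)
    return min(values)

# All-ones 3 x 5 protograph: each permanent is 3! = 6, each subset S
# contributes 4 terms, so the bound is 24 = (j + 1)!.
assert distance_bound([[1] * 5 for _ in range(3)]) == 24
```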
(c) There is a deterministic way of computing the probability density function of the mes-
sages exchanged in a cycle-free Tanner graph for t decoding iterations.
Remark 21.7.2 Regarding (c), if the probability density function (pdf) of the wrong/incorrect messages³ in the graph tends to a delta function at ±∞ as the number of iterations t → ∞, then the iterative decoder is likely to converge for the specified channel quality.
Hence, a density evolution (DE) threshold is defined as the worst channel quality that
allows the pdfs of the incorrect messages to tend to a delta function at ±∞ as t, n → ∞.
This quantity is sometimes referred to as the capacity of the LDPC graph with the specified
degree distribution on the variable and check nodes of the graph under the chosen iterative
message-passing decoder. Thus, the performance of LDPC codes via iterative sum-product
decoding is typically limited by their density evolution thresholds [1589, 1590].
Definition 21.7.3 The density evolution (DE) threshold of an LDPC code is the worst
case channel condition at which the code may still converge under sum-product decoding,
and relates primarily to the degree distribution (λ(x), ρ(x)) of the vertices in the code’s
Tanner graph.
Remark 21.7.4 In Figure 21.6, the DE threshold for a long block length LDPC code may
be seen as the channel SNR where the waterfall region begins. The channel capacity is an
SNR value that is at most the DE threshold. Closing the gap between the DE threshold
and the channel capacity requires optimizing the degree distribution of the Tanner graph
for the code [1591].
The binary erasure channel (BEC) is the simplest case to analyze. For the case of the BEC, let ε_ch be the probability of a symbol being erased by the channel; see Figure 30.2, where ε_ch = p. Let ε(t) be the overall probability of sending an erasure message from a variable node to a check node in iteration t, and let γ(t) be the overall probability of sending an erasure message from a check node to a variable node in iteration t. Further, let γ_i(t) be the probability of sending an erasure message from a degree i check node to a neighboring variable node in iteration t. Similarly, let ε_i(t) be the probability of sending an erasure message from a degree i variable node to a neighboring check node in iteration t. Then we can derive the following:

    γ_i(t) = 1 − (1 − ε(t))^{i−1},

since if any other incoming message to a check node is an erasure message, the outgoing message from that check node will also be an erasure message. Averaging over all degrees of a check node, we get

    γ(t) = Σ_i ρ_i γ_i(t) = Σ_i ρ_i (1 − (1 − ε(t))^{i−1}) = 1 − ρ(1 − ε(t)).
³ For example, a wrong message would be one with the incorrect sign compared to the true value of the corresponding transmitted bit.
TABLE 21.1: DE thresholds for (j, r)-regular LDPC codes on the BEC
Combining the above equations yields the general recursion for the probability of sending an erasure message during iteration t + 1 as

    ε(t + 1) = ε_ch λ(1 − ρ(1 − ε(t))).

Thus, the DE threshold is derived as the largest value of ε_ch for which ε(t + 1) → 0 as t → ∞. Sometimes the DE threshold is written as

    ε* = sup { ε_ch ∈ [0, 1] | ε(t) → 0 as t → ∞ }.

A fixed point in the DE analysis is defined by the solutions to the equation

    x = ε_ch λ(1 − ρ(1 − x)).

These are the points where the pdfs do not change with increasing iterations. A table of the DE thresholds for different regular LDPC codes over the binary erasure channel (BEC) is given in Table 21.1. The Shannon capacity ε_cap of the channel for the specified LDPC code rate is also given to compare against their DE threshold.
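For a (j, r)-regular ensemble the DE recursion takes a one-line form, and the threshold can be located numerically by bisection. The sketch below uses illustrative iteration counts and tolerances:

```python
def de_threshold(j, r, iters=2000, tol=1e-7):
    """Largest erasure probability eps_ch for which the (j, r)-regular BEC
    recursion eps <- eps_ch * (1 - (1 - eps)**(r-1))**(j-1) drives eps to 0."""
    def converges(eps_ch):
        eps = eps_ch
        for _ in range(iters):
            eps = eps_ch * (1 - (1 - eps) ** (r - 1)) ** (j - 1)
        return eps < 1e-6

    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if converges(mid):
            lo = mid
        else:
            hi = mid
    return lo

# The classical threshold for the (3, 6)-regular ensemble is about 0.4294,
# to be compared with the Shannon limit 1 - R = 0.5 at rate 1/2.
threshold = de_threshold(3, 6)
assert 0.42 < threshold < 0.43
```

The gap between the computed threshold and 0.5 is exactly the gap that irregular degree distributions are designed to close.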
Remark 21.7.5 As noted in Table 21.1, regular LDPC codes typically have DE thresholds that are not close to the channel capacity. Reversing the problem, one can ask for which degree distributions λ(·) and ρ(·) the DE threshold approaches the channel capacity. By optimizing irregular degree distributions and designing irregular codes, [1589] showed that one can come very close to the channel capacity on many communication channels.
the reader to [183]. The main design problem for optimizing the performance of turbo codes
is in the choice of component codes and permutations (called interleavers) used. For ex-
ample, the free distances of the component convolutional codes limit the minimum distance
of the overall turbo codes; consequently, these codes tend to have a poor distance distri-
bution. A decoding threshold similar to a DE threshold may be obtained using a process
called EXtrinsic Information Transfer (EXIT) analysis [1797, 1798] that measures how the
quality of each component code’s output improves based on the information gained from
the other component codes during iterative decoding. Typically for block lengths less than
1000 bits, turbo codes tend to outperform LDPC codes, whereas for longer block lengths,
LDPC codes tend to be superior.
FIGURE 21.12: On the top, a base Tanner graph to be coupled to form an SC-protograph. Variable nodes are denoted by • and check nodes by ■. On the bottom, an SC-protograph resulting from randomly edge-spreading L copies of the Tanner graph on top, with coupling width w = 1, and applying the same map at each position.
meaning the DE threshold is the best possible. Further, increasing the variable node degrees
allows the MAP threshold to approach capacity of the channel. While the results were
rigorously proven for the binary erasure channel (BEC), spatial-coupling has been shown to
be effective on a wide range of other channels [682, 1172, 1173, 1216, 1745]. While spatially-
coupled LDPC codes may be designed using several methods including edge-spreading an
initial Tanner graph to obtain an SC-protograph (that is then lifted to obtain an SC-
LDPC code) as in Figure 21.12, or using a so-called edge-cutting vector [685], all of these
methods can be seen as special cases of a single protograph code with algebraically chosen
permutations [149].
Part III
Applications
Chapter 22
Alternative Metrics
Marcelo Firer
State University of Campinas (Unicamp)
22.1 Introduction
The main scope of this chapter is metrics defined for coding and decoding purposes,
mainly for block codes. Despite the fact that metrics are nearly ubiquitous in coding theory,
it is very common to find them as a general background, with the role of metric invariants
not always clearly stated. As a simple but eloquent example, the role of the minimum
distance and the packing radius of a code are commonly interchanged. While the minimum
distance d(C) of a code C ensures that any error of size at most d(C) − 1 can be detected,
the packing radius R(C) ensures that any error of size at most R(C) can be corrected. These
statements are true for any metric (considering additive errors) and the interchange between
these two figures of merit follows from the fact that, in the most relevant metric in coding theory, the Hamming metric, the packing radius is determined by the minimum distance via the famous formula R(C) = ⌊(d_H(C) − 1)/2⌋. This is not the case for general metrics, where not
only may one of the invariants fail to be determined by the other, but determining the packing radius may be an intractable task, even in the case of a code with only two codewords; see, for example, [613]. A second example of such a situation is the description of equivalence of codes, found in nearly every textbook on the subject, which seldom mentions the fact that differing by a combination of a permutation of the coordinates and multiplication by an invertible diagonal matrix means that the two codes are images of one another under a linear isometry of the Hamming space. What is natural for the Hamming metric may become a "sloppy" approach when considering different metrics.
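The Hamming-metric relation between the two invariants is easy to confirm by brute force on a toy code. The sketch below (illustrative, not from the text) computes the packing radius directly from ball disjointness and compares it with ⌊(d − 1)/2⌋; for other metrics the analogous computation need not reproduce this floor formula:

```python
import itertools

def hamming(x, y):
    return sum(a != b for a, b in zip(x, y))

def packing_radius(code, space, dist):
    """Largest r such that the radius-r balls around codewords are disjoint."""
    r = 0
    while all(not any(dist(x, c1) <= r + 1 and dist(x, c2) <= r + 1
                      for x in space)
              for c1, c2 in itertools.combinations(code, 2)):
        r += 1
    return r

space = list(itertools.product([0, 1], repeat=5))
code = [(0, 0, 0, 0, 0), (1, 1, 1, 1, 1)]     # binary repetition code

d = min(hamming(x, y) for x, y in itertools.combinations(code, 2))
assert d == 5
assert packing_radius(code, space, hamming) == (d - 1) // 2 == 2
```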
For this reason, besides properly defining a (non-exhaustive) list of metrics that are
alternative to the Hamming metric, we will focus our attention on the structural aspects of
the metric space (X , d). By structural, in this context, we mean aspects that are relevant
for coding theory: the relation between the minimum distance and the packing radius,
the weight (or distance) enumerator of the ambient space (or codes), the existence of a
MacWilliams-type identity, a useful description of the group of (linear) isometries of (X , d)
and, of course, bounds for the existence of codes with given parameters. What will be
mainly missing is the search for codes attaining or approaching the bounds and particular
interesting decoding algorithms, among other things, mainly due to space restrictions.
The most important metric in coding theory is the Hamming metric dH . Being an
alternative metric means (in the context of coding theory) being an alternative to the
Hamming metric (or to the Lee metric). The prominence of the Hamming metric is due
mainly to the fact that it is matched to the binary symmetric channel, in the sense that (considering codewords to be equiprobable)

    arg max_{c∈C} P_BSC(c | x) = arg min_{c∈C} d_H(x, c)

for every code C ⊆ F_2^n and every x ∈ F_2^n, where P_BSC(c | x) := ϱ^{d_H(x,c)} (1 − ϱ)^{n−d_H(x,c)} is the probabilistic model of the channel; see Section 1.17.
Since errors are always described in a probabilistic model, being matched to a channel is a crucial property of a metric. Despite the fact that any metric can be matched to a channel (the converse affirmation is not true; see [732]), only some of the alternative metrics arose matched to a specific channel. Others are considered to be "suitable" for correcting some types of errors that are characteristic of the channel, despite the fact that the metric and the channel are not matched. Finally, some of the metrics were studied for their intrinsic interest, shedding some light on the role of metric invariants and waiting on a theoretical shelf to acquire more practical application. This recently happened to the rank metric, introduced by È. M. Gabidulin in 1985 [760], which became outstandingly relevant after the burst in research on network coding; see Chapter 29.
All the distance functions surveyed in this chapter actually determine a metric, in the sense that they satisfy the symmetry and positivity axioms (which entitles them to be called distances, according to the terminology used in [532, Chapter 1]) and also the triangle inequality. At this point we should remark that the triangle inequality, often the only property that is not obvious, is a somewhat superfluous property in the context of coding theory. Given two distances (or metrics) d1 and d2 defined on
Fnq , they should be considered equivalent if they determine the same minimum distance
decoding for every possible code C and every possible received message x, in the sense that
arg min_{c∈C} {d1(x, c)} = arg min_{c∈C} {d2(x, c)}. Since F_q^n is a finite space, given a distance d it is always possible to find a metric d′ that is equivalent to d. For more details on this and related subjects see [614].
WD Weight defined: It is defined by a weight: d(x, y) = wt(x − y). This means that the metric is invariant under translations, in the sense that d(x + z, y + z) = d(x, y) for all x, y, z ∈ F_q^n. The invariance by translation makes the metric suitable for decoding linear codes, allowing, for instance, syndrome decoding.

AP Additive Property: It is additive, in the sense that the weight wt may be considered to be defined on the alphabet F_q and wt(x) = Σ_{i=1}^{n} wt(x_i), for x = x1x2 · · · xn ∈ F_q^n. The additive property is a condition for a metric to be matched to a memoryless channel.

RS Respect to support: This means that whenever supp(x) ⊆ supp(y), we have wt(x) ≤ wt(y). Respecting the support is an essential condition for a metric to be suitable for decoding over a channel with additive noise.
Nearly all the alternative metrics presented in this chapter are considered in the context
of block coding over an alphabet, and they may be essentially distinguished in the way they
differ from the Hamming metric in the alphabet, in the spreading over the coordinates or
in both ways. They can be classified into some different types. There are metrics that look
at the way errors spread over the bits, ignoring any additional structure on the alphabet.
This is the case of the metrics generated by subspaces and the poset metrics, introduced
in Sections 22.2 and 22.3, respectively. In Section 22.4 we move into a different direction,
considering metrics that “dig” into the structure of the alphabet and spread additively over
the bits. These are natural generalizations of the Lee metric. In Section 22.5 we consider
two families of metrics that move in these two directions simultaneously; they consider
different weights on the alphabet but are non-additive. Up to this point, all the metrics are
defined by a weight, hence invariant by translations. In the next three sections, we approach
metrics that are not weight-defined: the metrics for asymmetric channels (Section 22.6),
editing metrics, defined over strings of different length (Section 22.7) and metrics defined
on a permutation group (Section 22.8).
Before we start presenting the alternative metrics, we shall establish some notations and
conventions:
• Given x = x1x2 · · · xn ∈ F_q^n, its support is denoted by supp(x) := {i ∈ [n] | x_i ≠ 0}, where [n] := {1, 2, . . . , n}.

• e_i is the vector with supp(e_i) = {i} and the i-th coordinate equal to 1. We shall refer to β = {e1, e2, . . . , en} as the standard basis of F_q^n.
• Since we are considering many different metrics, we will denote each distance func-
tion by a subscript: the Hamming metric is denoted by dH , the Lee metric by dL ,
and so on. The subscript is used also in the notation for the minimum distance of
a code C: dH (C), dL (C), etc.
• Given a metric d on F_q^n, we denote by B_d^n(r, x) = {y ∈ F_q^n | d(x, y) ≤ r} and S_d^n(r, x) = {y ∈ F_q^n | d(x, y) = r} the metric ball and sphere, with cardinalities b_d^n(r) and s_d^n(r) respectively, the center x being immaterial if the metric is weight-defined.

• The packing radius R_d(C) of a code C contained in a space (for instance F_q^n) endowed with a metric d is defined by R_d(C) = max{ r | B_d^n(r, x) ∩ B_d^n(r, y) = ∅ for all x, y ∈ C, x ≠ y }.
Definition 22.2.1 The F-weight wt_F(x) of 0 ≠ x ∈ F_q^n is the minimum size of a set I ⊆ [m] such that x ∈ span(∪_{i∈I} F_i), and wt_F(0) = 0. The F-distance is d_F(x, y) = wt_F(x − y).

Without loss of generality, since span(∪_{i∈[m]} F_i) = span(∪_{i∈[m]} span(F_i)), we may assume that each F_i is a vector subspace of F_q^n; hence

    wt_F(x) = min{ |I| : I ⊆ [m], x = Σ_{i∈I} x_i with x_i ∈ F_i }.
The Hamming metric is a particular case of subspace metric, which happens when m = n
and Fi = span(ei ) for each i ∈ [n]. The well known rank metric on the space of all k × n
matrices over Fq , which was introduced in [760], can also be viewed as a particular instance
of a projective metric (a special type of subspace metric, to be defined in Section 22.2.1),
where F is the set of all k × n matrices of rank 1. This metric became nearly ubiquitous in
network coding after Silva, Kschischang, and Koetter’s work [1711]. Despite its importance,
the rank metric will not be explored in this chapter. Many of its important features can
be found in Chapter 29, devoted to network coding, and in Chapter 11, where rank-metric
codes are examined.
As far as this author was able to track, the subspace metrics, in their full generality,
are unexplored in the professional literature. However, something is known about some
particular cases, namely projective and combinatorial metrics, most of it due to the work of Gabidulin and co-authors. We present some of these results on particular subclasses in the sequel.
is called a projective metric. In this situation, we can identify each F_i with a nonzero vector f_i ∈ F_i. We denote by d_F(C) the d_F-minimum distance of a code C. For an [n, k]_q linear code C, the usual Singleton Bound d_F(C) ≤ n − k + 1 was proved to hold (see [765]).
We remark that, for a projective metric, since {f1, f2, . . . , fm} generates F_q^n, we have m ≥ n. Considering the linear map ϕ : F_q^m → F_q^n determined by ϕ(e_i) = f_i, its kernel P := ker(ϕ) ⊆ F_q^m is an [m, m − n]_q linear code called the parent code of F. Given an [m, m − n]_q linear code C, the columns of a parity check matrix determine a family F_C of basic sets such that C is the parent code of F_C. The parent code helps to compute the F-weight of a vector. Given x ∈ F_q^n, the inverse image ϕ^{−1}(x) is a coset of the parent code P. If y ∈ ϕ^{−1}(x) is a coset leader (a vector of minimal Hamming weight), we have wt_F(x) = wt_H(y) [767, Proposition 2].
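The coset-leader computation of [767, Proposition 2] can be illustrated on a tiny family. The generating vectors f_i below are an arbitrary example (n = 3, m = 4 over F_2), not taken from the text:

```python
import itertools

f = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 0)]

def phi(y):
    """The linear map phi: F_2^4 -> F_2^3 sending e_i to f_i."""
    acc = (0, 0, 0)
    for yi, fi in zip(y, f):
        if yi:
            acc = tuple((a + b) % 2 for a, b in zip(acc, fi))
    return acc

def wtF(x):
    """Minimum Hamming weight in the coset phi^{-1}(x) of the parent code."""
    return min(sum(y) for y in itertools.product([0, 1], repeat=len(f))
               if phi(y) == x)

assert wtF((0, 0, 0)) == 0
assert wtF((1, 1, 0)) == 1   # the single generator f_4 suffices
assert wtF((1, 1, 1)) == 2   # e.g. f_4 + e_3
```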
This simple relation between the F-weight of a vector and the Hamming weight of the parental coset is used to produce a class of Gilbert–Varshamov Bounds for codes with an F-metric. We let L_i(P) be the number of cosets of P with Hamming weight i. Then we have that L_i(P) = s^m_{d_H}(i) for every i less than or equal to the packing radius of the parent code P. This simple remark is the key to prove the following Gilbert–Varshamov Bound.
Theorem 22.2.2 ([767, Theorem 3.1]) Let P_m be a sequence of [m, n]_2 linear codes with L_i(P_m) ≃ (m over i), where ≃ means asymptotic behavior and (m over i) denotes the binomial coefficient. Suppose that P_m attains the Gilbert–Varshamov Bound, and let F_m be the family of basic sets determined by P_m. Then there exists a code C ⊆ F_q^n with F_m-distance d = d_{F_m}(C) and M elements, where

    M ≃ 2^n / Σ_{i=0}^{d} (m over i).
The parent code of F is used to produce a family of codes attaining the Singleton Bound,
a generalization of Reed–Solomon codes, using concatenation of Vandermonde matrices to
generate the basic sets (see [765, Section 3]) and also to produce a family of dF -perfect
codes (see [768] for the construction).
The phase-rotation metric is a special instance of a projective metric when m = n + 1,
Fi = span(ei ) for i ≤ n and Fn+1 = span(1) where 1 := 11 · · · 1. It is suitable for decoding
in a binary channel in which errors are a cascade of a phase-inversion channel (where every
bit of the input message is flipped) and a binary symmetric channel. The actual model
of the channel depends on the probability of the two types of errors, and hence the term
“suitable” is not to be confused with the matching property.
It is easy to see that, in this case, wtF (x) = min {wtH (x), n + 1 − wtH (x)}. In [763], the
authors provide constructions of perfect codes relative to the phase-rotation metric, based
on the construction of perfect codes with the Hamming metric and also show how to reduce
the decision decoding for the phase-rotation metric to decoding for the Hamming metric.
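The closed form for the phase-rotation weight can be confirmed by brute force against the projective-metric definition. The sketch below (an illustration for n = 5 over F_2) takes the generators e_1, . . . , e_n and the all-ones vector, and checks wt_F(x) = min{wt_H(x), n + 1 − wt_H(x)} for all x:

```python
import itertools

n = 5
gens = [tuple(int(j == i) for j in range(n)) for i in range(n)] + [(1,) * n]

def add(u, v):
    return tuple((a + b) % 2 for a, b in zip(u, v))

def wt_proj(x):
    """Least number of generators summing to x (the projective F-weight)."""
    if not any(x):
        return 0
    for size in range(1, len(gens) + 1):
        for subset in itertools.combinations(gens, size):
            acc = (0,) * n
            for g in subset:
                acc = add(acc, g)
            if acc == x:
                return size

for x in itertools.product([0, 1], repeat=n):
    assert wt_proj(x) == min(sum(x), n + 1 - sum(x))
```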
We remark that the combinatorial metric can be expressed in a simpler way, as introduced in [759]:

    wt_F(x) = min{ |I| : I ⊆ [m], supp(x) ⊆ ∪_{i∈I} A_i }.

When considering a combinatorial metric, we shall identify a basic set with its support; hence we call each A_i also a basic set and denote F = {A1, A2, . . . , Am}. We note that, if F = { {i} | i ∈ [n] }, then d_F is the Hamming metric.
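In this support-covering form, wt_F is a small set-cover computation. The sketch below (hypothetical helper, brute force over subfamilies, fine for small m) evaluates the weight for the 2-burst family of windows of two consecutive coordinates, an instance of the burst metrics discussed later in this section:

```python
import itertools

def wt_comb(x, basic_sets):
    """Least number of basic sets covering supp(x); None if not coverable."""
    supp = {i for i, xi in enumerate(x) if xi}
    if not supp:
        return 0
    m = len(basic_sets)
    for size in range(1, m + 1):
        for I in itertools.combinations(range(m), size):
            if supp <= set().union(*(basic_sets[i] for i in I)):
                return size
    return None

F2 = [{i, i + 1} for i in range(5)]                 # 2-burst family on [6]
assert wt_comb((1, 1, 0, 0, 0, 0), F2) == 1         # one burst of length 2
assert wt_comb((1, 0, 0, 0, 1, 1), F2) == 2         # two separate bursts
assert wt_comb((1, 0, 1, 0, 1, 0), F2) == 3         # scattered bits need 3
```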
There are many particular instances of combinatorial metrics that have been used for
particular coding necessities, and there is a substantial literature devoted to codes with
“good” properties according to such particular instances, despite the fact that the actual
metric is hardly mentioned. However, since its introduction in 1973 by Gabidulin, the struc-
tural aspects of the metric were left nearly untouched.
A Singleton-type bound was determined in 1996 [252] for codes which are not necessarily linear. We present here its linear version: given an [n, k]_q linear code C and a covering F, its minimum distance d_F(C) is bounded by

    n (d_F(C) − 1)/D ≤ ⌈ n (d_F(C) − 1)/D ⌉ ≤ n − k,

where ⌈·⌉ is the ceiling function and D is the minimum number of basic sets needed to cover [n]. We remark that D ≤ n and equality in the bounds holds if and only if d_F is the Hamming metric, and, in this case, we get the usual Singleton Bound.
Recently, combinatorial metrics started to be explored in a more systematic way.
In [1510] the authors give a necessary and sufficient condition for the existence of a
MacWilliams Identity, that is, conditions on F to ensure that the dF -weight distribution of
a code C determines the dF -weight distribution of the dual code C ⊥ .
In the same work, there is a description of the group GL(Fnq , dF ) of linear dF -isometries.
We briefly explain how it is obtained.
First of all, we say that a permutation σ ∈ Sn preserves the covering if σ(A) ∈ F, for
all A ∈ F. Each such permutation induces a linear isometry Tσ (x) := xσ(1) xσ(2) · · · xσ(n) .
The set of all such isometries is a group denoted by GF .
Next, let F^i := {A ∈ F | i ∈ A}. This defines an equivalence relation on [n]: i ∼_F j if and only if F^i = F^j. Let s be the number of equivalence classes and denote by ⟦i⟧ the equivalence class of i. We construct an s × m matrix M = (m_{⟦i⟧j}), the incidence matrix of this equivalence relation:

    m_{⟦i⟧j} = 1 if ⟦i⟧ ⊆ A_j, and m_{⟦i⟧j} = 0 otherwise.
Given an n × n matrix B = (b_{xy}) with entries in F_q, let us consider the (i, j) block (submatrix) B_{ij} = (b_{xy})_{x∈⟦i⟧, y∈⟦j⟧}. We say that B respects M if, for ⟦i⟧ = ⟦j⟧, the block B_{ij} = B_{ii} = (b_{xy})_{x,y∈⟦i⟧} is an invertible matrix and, for ⟦i⟧ ≠ ⟦j⟧, B_{ij} ≠ 0 (B_{ij} a nonzero matrix) implies that supp(v_j) ⊆ supp(v_i), where v_k is the k-th row of M. Each matrix B respecting M determines a linear d_F-isometry, and the set K_M of all such matrices is also a group.
Theorem 22.2.4 For q > 2, the group GL(F_q^n, d_F) of linear d_F-isometries is the semi-direct product GL(F_q^n, d_F) = G_F ⋉ K_M. For q = 2, we have the inclusion G_F ⋉ K_M ⊆ GL(F_2^n, d_F), and equality may or may not hold, depending on F.
There are some special instances of combinatorial metrics that deserve to be mentioned,
namely the block metrics, burst metrics, and 2-dimensional burst metrics.
Theorem 22.2.5 ([716, Theorem 4.3]) Let C ⊆ F_q^n be a linear code and F be a partition of [n] with m basic sets, each having cardinality r. Let

    Fwe_C(x, y) = Σ_{c∈C} x^{wt_F(c)} y^{m − wt_F(c)}

be the F-weight enumerator of the code C. Then

    Fwe_{C⊥}(x, y) = (1/|C|) Fwe_C(y − x, y + (q^r − 1)x).
The authors also establish necessary and sufficient conditions on a parity check matrix for
a code to be perfect or MDS. Also, algebraic geometry codes are explored with this metric.
Theorem 22.2.6 ([303, Theorems 1, 2 and 4]) Let C be a linear code with minimum burst distance d_F(C), where F = F[b]. Then C corrects

(a) any pattern of ⌊(d_F(C) − 1)/2⌋ error bursts of length b;
As an example, it is possible to prove that a code C can correct any burst error of length b if and only if 3 ≤ min{ d_{F[b]}(x, y) | x, y ∈ C, x ≠ y }; that is, the error correction capability (the packing radius) is determined by the minimum distance, and this happens in a way similar to the Hamming metric. The Reiger Bound [1585], for example, can be expressed in terms of the packing radius.
A different instance where the b-burst metric arises is in high density storage media.
In these storage devices, the density does not permit reading the bits one-by-one, but only in sequences of b bits at a time. This is what is called the b-symbol read channel.
Recently, Cassuto and Blaum ([366], for the case b = 2) and Yaakobi et al. ([1918], for
b > 2), proposed coding and decoding schemes for the b-symbol read channel. The decoding
scheme is a nearest-neighbor decoding, which considers a metric that is actually the b-burst
metric. Many bounds on such coding schemes and construction of MDS codes can be found
in the literature; see for example [543]. We should remark that all these works treat a particular encoding approach and decoding with respect to the metric as a single, undistinguished scheme.
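The b-symbol read interpretation can be made concrete: reading a vector cyclically in windows of b symbols and taking the Hamming distance of the resulting read vectors gives a distance of b-burst type. The cyclic convention and names below are assumptions of this sketch.

```python
def b_read(x, b):
    """Cyclic b-symbol read vector of x: position i exposes (x_i, ..., x_{i+b-1})."""
    n = len(x)
    return [tuple(x[(i + j) % n] for j in range(b)) for i in range(n)]

def d_b(x, y, b):
    # Hamming distance between the two b-symbol read vectors
    return sum(u != v for u, v in zip(b_read(x, b), b_read(y, b)))

x, y = (0, 0, 0, 0, 0), (0, 1, 0, 0, 0)
assert d_b(x, y, 1) == 1   # b = 1 recovers the Hamming distance
assert d_b(x, y, 2) == 2   # a single symbol error corrupts two 2-symbol reads
```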
where r ∈ [n_1], s ∈ [n_2] and the entries are considered (mod n_1, mod n_2). A comprehensive account of the subject can be found in the survey [217] on array codes.
generated by subspaces (Section 22.2), in the sense that the Hamming metric is the only
intersection of these two families.
Codes with a poset metric, or simply poset codes, were investigated along many different
paths, including conditions (on the poset) to turn a given family of codes into perfect or
MDS codes. The fact that there are relatively many codes with such properties is what
first attracted attention to poset metrics. For perfect or MDS codes with poset metrics see
[315, 637, 1016, 1019, 1035, 1118, 1550], among others.
Considering more structural results, one can find in [1487] a description of the group of linear isometries of the metric space (F_q^n, d_P) as a semi-direct product of groups. Each element σ of the group Aut(P) of poset automorphisms (permutations of [n] that preserve the order) determines a linear isometry T_σ(x) = x_σ(1) x_σ(2) ··· x_σ(n). The set of all such isometries is a group, also denoted by Aut(P). We let M_P be the order matrix of P; that is, M_P = (m_ij) is an n × n matrix with m_ij = 1 if i ≼_P j and m_ij = 0 otherwise. An n × n matrix A = (a_ij) with entries in F_q such that a_ii ≠ 0 and supp(A) ⊆ supp(M_P) determines a linear isometry, and the family of all such maps is a group denoted by G_P.
Theorem 22.3.1 ([1487, Corollary 1.3]) The group GL(F_q^n, d_P) of linear isometries of (F_q^n, d_P) is the semi-direct product GL(F_q^n, d_P) = Aut(P) ⋉ G_P.
MacWilliams-type identities are explored in [405], revealing a deep understanding of the most important duality result in coding theory. First of all, it has been noticed since [1035] that, when looking at the dual code C⊥, we are actually not interested in its d_P-weight distribution but in its d_P⊥-weight distribution, determined by the opposite poset P⊥ defined by i ≼_P⊥ j ⟺ j ≼_P i. The innovation proposed in [405] is to note that
the metric invariant is not necessarily the weight distribution, but rather a distribution of
elements in classes determined by a suitable equivalence relation on ideals of P . The authors
identify three very basic and different equivalence relations of ideals. Given two poset ideals
I, J ⊆ [n] we establish the following equivalence relations.
EC : (I, J) ∈ EC if they have the same cardinality.
ES : (I, J) ∈ ES if they are isomorphic as sub-posets.
EH : Given a subgroup H ⊆ Aut(P ), (I, J) ∈ EH if there is σ ∈ H such that σ(I) = J.
It is easy to see that EH ⊆ ES ⊆ EC .
Given an equivalence relation E over the set I(P) of ideals of P, for each class I ∈ I(P)/E, the I-sphere S^P_{I,E} centered at 0 with respect to E is

    S^P_{I,E} := {x ∈ F_q^n | (⟨supp(x)⟩_P, I) ∈ E}.
If E is an equivalence relation on I(P), we denote by E⊥ the corresponding equivalence relation on I(P⊥), where E may be either E_C, E_S, or E_H. Given an ideal I ∈ I(P), we denote its complement by I^c = [n] \ I; note that I^c is an ideal of P⊥. We say that E⊥ is the dual relation of E.
We define A^P_{I,E}(C) := |S^P_{I,E} ∩ C| and call W_{P,E}(C) := [A_{I,E}(C)]_{I∈I(P)/E} the spectrum of C with respect to E. We remark that, for E = E_C, the spectrum of C consists of the coefficients of the weight enumerator polynomial of C.
Definition 22.3.2 Let P be a poset, E an equivalence relation on I(P ) and suppose that
E ⊥ is the dual relation on I(P ⊥ ). The relation E is a MacWilliams equivalence relation
if, given linear codes C1 and C2 , WP,E (C1 ) = WP,E (C2 ) implies WP ⊥ ,E ⊥ (C1⊥ ) = WP ⊥ ,E ⊥ (C2⊥ ).
For the case E = E_C, being a MacWilliams equivalence relation implies the existence of a MacWilliams Identity in the usual sense. In general, whether E is a MacWilliams equivalence relation depends on the choice of P.
As for the last item, a poset P is called hierarchical if [n] can be partitioned into a disjoint union [n] = H_1 ∪ H_2 ∪ ··· ∪ H_r in such a way that i ≺_P j if and only if i ∈ H_{h_i}, j ∈ H_{h_j} and h_i < h_j. This family of posets includes the minimal poset (an anti-chain) and the maximal one (a chain, defined shortly). Hierarchical poset metrics are a true generalization of the Hamming metric, in the sense that many well-known results concerning the Hamming metric are actually characteristic of hierarchical poset metrics.
All these properties (and some more in [1313]) are a direct consequence of the canonical
decomposition of codes with hierarchical posets [712], which allows the transformation of
problems with hierarchical poset metrics into problems with the well-studied Hamming
metric.
Another relevant class of poset metrics is the family of NRT metrics, defined by a poset which is a disjoint union of chains. A poset P is a chain, or total order, when (up to relabeling) 1 ≼_P 2 ≼_P ··· ≼_P n. Let (P, [n]) be a disjoint union of m chains, each of length l; that is, n = ml and the relations in P are jl+1 ≼_P jl+2 ≼_P ··· ≼_P (j+1)l, for
j = 0, 1, . . . , m − 1. This is the original instance explored by Niederreiter in a sequence of
three papers [1433, 1434, 1435] and later, in 1997, by Rosenbloom and Tsfasman [1595],
with an information-theoretic approach. After these three main authors, it is known as the
NRT metric. Many coding-theoretic questions have been investigated with respect to the
NRT metric, including MacWilliams Duality [636], MDS codes [635, 1721], structure and
decoding of linear codes [1480, 1486], and coverings [370]. In [127] the authors show the
connection of NRT metric codes with simple models of information transmission channels
and introduce a simple invariant that characterizes the orbits of vectors under the group
of linear isometries (the shape of a vector). For the particular case of a single chain (n = l, m = 1), d_P is an ultra-metric and, given a code C with minimum distance d_{d_P}(C), it is possible to detect and correct every error of P-weight at most d_{d_P}(C) − 1. Despite some strangeness that ultra-metrics may cause, this instance is simple and well understood; see [1486].
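For a single chain the P-weight is simply the position of the last nonzero coordinate, and the strong (ultrametric) triangle inequality can be verified exhaustively on a small example. A minimal sketch, assuming F_2 coordinates and 1-based chain labels:

```python
from itertools import product

def wt_chain(x):
    """NRT weight for a single chain 1 < 2 < ... < n: the largest index in the
    ideal generated by supp(x), i.e. the last nonzero position (0 for zero)."""
    nz = [i + 1 for i, xi in enumerate(x) if xi]
    return max(nz) if nz else 0

def d_chain(x, y, q=2):
    return wt_chain([(a - b) % q for a, b in zip(x, y)])

# the strong triangle (ultrametric) inequality holds for all triples in F_2^3
for x, y, z in product(product(range(2), repeat=3), repeat=3):
    assert d_chain(x, z) <= max(d_chain(x, y), d_chain(y, z))
```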
An extensive and updated survey of poset metrics in the context of coding theory can be
found in [731]. In the following two subsections we will present some recent generalizations
of the poset metrics.
In [45], the authors give a characterization of the group of linear isometries of (Fnq , d(P,π) ).
Necessary and sufficient conditions are provided for such a metric to admit a MacWilliams
Identity. Perfect and MDS codes with poset-block metrics are investigated in [45] and [487].
It is worth noting an interesting feature of this family of metrics: it combines a poset metric, which increases the weights of vectors (hence "shrinking" the metric balls), with a block metric, which decreases the weights (hence "blowing up" the balls).
and the MacWilliams Identity. It is worth noting that the conditions for the validity of the Extension Theorem and the MacWilliams Identity do not coincide. As far as the author is aware, this is the first instance where the hypotheses for the MacWilliams Extension Theorem and the MacWilliams Identity differ.
A different formulation of this digraph metric can be obtained by considering weights
on a poset, as defined in [1017], where perfect codes start to be explored in this context.
We remark that codes over rings of integers can be considered as codes on a flat torus,
a generalization of spherical codes; see, for example, [454, 1336].
Most of the results using metrics over rings of integers are constructions of codes that are
able to correct a certain number of errors, along with decoding algorithms; see [997, 998].
It is also worth noting that these metrics can be obtained as path-metrics on appropriate
circulant graphs, as presented in [1336].
The l-dimensional Lee weight of x ∈ F_q^n is defined additively by wt_{l,φ}(x) := Σ_{i=1}^n wt_{l,φ}(x_i). The weight of the difference, d_{l,φ}(x, y) = wt_{l,φ}(x − y), is the l-dimensional Lee distance determined by φ.
The l-dimensional Lee distance is a metric which generalizes some known metrics. This
is the case for the Lee metric, which is obtained by setting l = 1 and φ(e1 ) = 1. Also the
Mannheim metric can be obtained as a 2-dimensional Lee metric: for p ≡ 1 (mod 4), the
equation x2 ≡ −1 (mod p) has a solution x = a ∈ Fp . For l = 2, we define φ(e1 ) = 1 and
φ(e2 ) = a and one gets the Mannheim metric (Section 22.4.1) as a particular instance of a
2-dimensional Lee metric.
In [1440] there are constructions of codes correcting any error e of weight wt_{l,φ}(e) = 1 (for q odd) and wt_{l,φ}(e) = 2 (for the case q = p^m = 4n + 1 with p > 5).
for x, y ∈ Z_q^n. The KS-distance determines a metric which generalizes the Hamming metric (for the partition B_H defined by B_1 = {1, 2, ..., q − 1}) and the Lee metric (for the partition B_L defined by B_i = {i, q − i}, for every 1 ≤ i ≤ ⌊q/2⌋).
Considering Znq as a module over Zq , we say that C ⊆ Znq is linear over Zq if it is a
Zq -submodule of Znq ; see Chapter 6. Since Zq is commutative, it has invariant basis number,
and we denote by k the rank of C. Considering a KS-metric, the main existing bound is a
generalization of the Hamming Bound, and it applies for linear codes over Zq .
Theorem 22.4.1 ([1094, Theorem 1]) Let C ⊂ Z_q^n be a linear code of length n over Z_q that corrects all errors e with wt_B(e) ≤ r_1 and all errors f consisting of bursts of length at most b < n/2 and wt_B(f) ≤ r_2, with 1 < r_1 < r_2 < (m − 1)b. Then

    n − k ≥ log_q [ b_n^{d_B}(r_1) + Σ_{i=1}^{b} (n − i + 1)(b_i^{d_B}(r_2) − b_i^{d_B}(r_1)) ].
In more recent works [796, 797], there is a refinement of the bounds, by considering
errors of limited pattern, that is, by possibly limiting the range of errors in each coordinate:
wtB (xi ) ≤ d. It is worth noting that a generalization of the Kaushik–Sharma metric can be
done for any finite group (not necessarily cyclic), as presented in [137].
To define an order relation on a multiset one should consider the Cartesian product of
multisets M1 (X , c1 ) and M2 (X , c2 ): it is the mset with underlying set X × X and counting
function c defined as the product, c((a, b)) = c1 (a)c2 (b), that is,
M1 × M2 := {mn/(m/a, n/b) | m/a ∈ M1 , n/b ∈ M2 }.
A submultiset (submset) of M = (X, c) is a multiset S = (X, c′) such that c′(x) ≤ c(x) for all x ∈ X. We denote it by S ⊑ M. A submset R = (M × M, c_×) ⊑ M × M is called an mset relation on M if c_×(m/a, n/b) = mn for all (m/a, n/b) ∈ R. If we consider, for example, the multiset M = {4/a, 2/b}, we have that S = {5/(4/a, 2/a), 8/(4/a, 2/b)} is a submset of M × M, but it is not an mset relation, since 5 ≠ 4 · 2, whereas R = {2/(2/a, 1/b), 6/(2/b, 3/a)} is an mset relation.
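The defining condition c_×(m/a, n/b) = mn is easy to test mechanically. The sketch below encodes each pair (m/a, n/b) with its count in a dict and reproduces the two examples above; the encoding, and the fact that it skips checking counts against M, are simplifying assumptions of this sketch.

```python
def is_mset_relation(R, M):
    """R: dict {(m, a, n, b): count} for pairs (m/a, n/b); M: dict {element: count}.
    R is an mset relation on M when each pair (m/a, n/b) occurs exactly m*n times."""
    for (m, a, n, b), count in R.items():
        if a not in M or b not in M:
            return False
        if count != m * n:
            return False
    return True

M = {'a': 4, 'b': 2}
S = {(4, 'a', 2, 'a'): 5, (4, 'a', 2, 'b'): 8}   # 5 != 4*2: not an mset relation
R = {(2, 'a', 1, 'b'): 2, (2, 'b', 3, 'a'): 6}   # 2 = 2*1 and 6 = 2*3
assert not is_mset_relation(S, M)
assert is_mset_relation(R, M)
```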
Definition 22.5.1 Let M be a multiset. A partially ordered mset relation (or pomset
relation) R on M is a mset relation satisfying
(a) (reflexivity) (m/a)R(m/a), for all m/a ∈ M ;
(b) (anti-symmetry) (m/a)R(n/b) and (n/b)R(m/a) implies m = n and a = b;
(c) (transitivity) (m/a)R(n/b)R(k/c) implies (m/a)R(k/c).
The pair P = (M, R), where M is an mset and R is a pomset relation, is a partially ordered mset (or pomset). Given P = (M, R), a submset I ⊑ M is called a pomset ideal if m/a ∈ I and (n/b)R(k/a), with k > 0 and b ≠ a, implies n/b ∈ I. Given a submset S ⊑ M, we denote by ⟨S⟩_P the ideal generated by S, that is, the smallest ideal of P containing S.
Consider the mset M = {r/1, r/2, ..., r/n} ∈ M_r([n]) where r := ⌊q/2⌋ is the integer part of q/2. The Lee support of a vector x ∈ Z_q^n is supp_L(x) := {k/i | k = wt_L(x_i), k ≠ 0}, where wt_L(x_i) = min{x_i, q − x_i} is the usual Lee weight.
The P-weight and P-distance on Z_q^n are defined as

    wt_P(x) := |⟨supp_L(x)⟩_P| and d_P(x, y) := wt_P(x − y),

respectively.
The P-distance satisfies the metric axioms. In the case that P is an anti-chain, that is, any pair of distinct points m/a, n/b ∈ M with a ≠ b are not comparable (neither (m/a)R(n/b) nor (n/b)R(m/a)), we have that ⟨supp_L(x)⟩_P = supp_L(x) and so wt_P(x) = Σ_i wt_L(x_i) = wt_L(x); therefore the pomset metric is a generalization of the Lee metric. In the case M = {1/1, 1/2, ..., 1/n} we have a poset, hence generalizing also the poset metrics.
Not much is known about pomset metrics, only what is presented in [812]: the authors
generalize for pomsets the basic operations known for posets (direct and ordinal sum, direct
and ordinal products), and study the behavior of the minimum distance under some of these
operations, producing either closed expressions or bounds.
where ⌈x⌉ is the ceiling function and ∗ stands for any of the following cases:
Considering the m-spotty metrics, most of the attention was devoted to the weight
distribution of a code: MacWilliams Identities were obtained for the Hamming m-spotty
weight wtm,H ([1765] for the binary case, [1481] for finite fields in general, and [1665] over
rings), the Lee spotty weight wtm,L in [1699], and the RT spotty weight wtm,RT in [1482].
{x | N (c1 , x) ≤ r} ∩ {x | N (c2 , x) ≤ r} = ∅
Theorem 22.6.1 ([433, Theorem 3]) Let C ⊂ Fn2 be a code with minimum distance
da (C). Then C is capable of correcting r0 or fewer 0-errors and r1 or fewer 1-errors, where
r0 and r1 are fixed and r0 + r1 < da (C). In particular, it can correct (da (C) − 1) or fewer
0-errors or (da (C) − 1) or fewer 1-errors.
This distance was recently introduced in [1113], where it is used to define a metric on the so-called l-gram profile of x ∈ ⟦q⟧^n. It turns out to be an interesting distance for the types
of errors that may occur in the DNA storage channel, which includes substitution errors (in
the synthesis and sequencing processes) and coverage errors (which may happen during the
DNA fragmentation process). For details see [1113].
Theorem 22.7.2 ([973, Theorems 4.1 and 4.2]) A code C ⊆ X* is capable of correcting e insertions/deletions if and only if d_ID(C) > 2e, and it is capable of correcting i insertions and d deletions if and only if d*_ID(C) > 2(d + i).
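The distance d_ID itself is computable by the classical longest-common-subsequence (LCS) recursion, via the standard identity d_ID(x, y) = |x| + |y| − 2·LCS(x, y). A minimal sketch (memoized recursion, suitable only for short strings):

```python
from functools import lru_cache

def d_id(x, y):
    """Insertion/deletion distance: minimum number of insertions and deletions
    transforming x into y, computed as |x| + |y| - 2*LCS(x, y)."""
    @lru_cache(maxsize=None)
    def lcs(i, j):
        if i == 0 or j == 0:
            return 0
        if x[i - 1] == y[j - 1]:
            return lcs(i - 1, j - 1) + 1
        return max(lcs(i - 1, j), lcs(i, j - 1))
    return len(x) + len(y) - 2 * lcs(len(x), len(y))

assert d_id("abc", "abc") == 0
assert d_id("abc", "ac") == 1      # one deletion
assert d_id("abc", "axbc") == 1    # one insertion
assert d_id("ab", "ba") == 2       # delete one symbol and reinsert it
```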
The structure of the edit space, that is, the structure of X ∗ equipped with the
metric dIDS , is studied in [990]. The group GL(X ∗ , dIDS ) is described as the product
GL(X ∗ , dIDS ) = hγi × Sq where γ : X ∗ → X ∗ is the reversion map γ(x1 x2 · · · xm ) =
xm xm−1 · · · x1 , hγi is the group generated by γ and Sq is the group of permutations of
the alphabet X (|X | = q), acting as usual on every position of a string, σ(x1 x2 · · · xm ) =
xσ(1) xσ(2) · · · xσ(m) .
It was proved in [476, Theorem 2] that a code capable of correcting a deletions and b insertions has size asymptotically bounded above by

    q^{n+b} / [ (q − 1)^{a+b} · C(n, a+b) · C(a+b, b) ],

where C(·, ·) denotes a binomial coefficient.
The case E = {I, D, S} is also approached in [1225] where Levenshtein gives the bounds for codes capable of correcting one insertion-deletion-substitution error:

    2^{n−1}/n ≤ A_IDS(n, 1)_2 ≤ 2^n/(n + 1).
As a general relation, useful for studying bounds, we can find in [990]:
Tables of known values of AIDS (n, t)q and AID (n, t)q (for small values of n, t) can be found
in [988] for q = 2 and [989] for q = 3.
Theorem 22.8.1 ([124]) Let B_dτ(R) be the dτ-ball of radius R. The following hold.

(a) For n − 1 < t < n(n − 1)/2, A_dτ(n, t) ≤ ⌊3/2 + √(n(n − 1) − 2t + 1/4)⌋!.

(b) n!/|B_dτ(2r)| ≤ A_dτ(n, 2r + 1) ≤ n!/|B_dτ(r)|.
Obstruction conditions for the existence of perfect codes with the Kendall-tau metric
can be found in [320].
It is worth remarking that the set S = {(1, 2), (2, 3), ..., (n − 1, n)} of adjacent transpositions is a set of generators of S_n, minimal if considering only transpositions as generators.
Moreover, a transposition σ = (i, j) is its own inverse (σ = σ −1 ). It follows that the Kendall-
tau metric is just the graph metric in the Cayley graph of Sn determined by the generator
set S; the vertices are the elements of the group and two vertices are connected by an edge
if and only if they differ by a generator. This is actually the Coxeter group structure. The
group Sn acts with no fixed points on the subspace V = {x ∈ Rn | x1 +x2 +· · ·+xn = 0} and
there is a simplicial structure on the unit sphere of V , known as the Coxeter complex, on
which Sn acts simply transitively. This action makes Coxeter groups suitable for spherical
coding; see Definition 12.4.2 for the definition of a spherical code. Such group codes, a generalization of Slepian's permutation modulation codes, are studied in [1393]. See also Chapter 16.
There are other relevant metrics on S_n. Let G_n := Z_2 × Z_3 × ··· × Z_n and consider the embedding S_n → G_n which associates to σ ∈ S_n the element x_σ = x_σ(1) x_σ(2) ··· x_σ(n − 1) ∈ G_n, where x_σ is defined by

    x_σ(i) = |{j | j < i + 1, σ(j) > σ(i + 1)}|,

for i = 1, 2, ..., n − 1. The ℓ_1 metric on G_n, d_ℓ1(x, y) = Σ_{i=1}^{n−1} |x(i) − y(i)|, induces a metric on S_n: d_ℓ1(σ, π) := d_ℓ1(x_σ, x_π). This metric is related to the Kendall-tau metric by the relation dτ(σ, π) ≥ d_ℓ1(σ, π). More on permutation metrics can be found in [124].
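Both metrics are easy to implement, and the stated relation dτ(σ, π) ≥ d_ℓ1(σ, π) can be verified exhaustively on a small symmetric group. The 0-indexed conventions below are assumptions of this sketch.

```python
from itertools import combinations, permutations

def d_tau(s, p):
    """Kendall-tau distance: number of pairs ordered differently by s and p."""
    return sum((s[i] - s[j]) * (p[i] - p[j]) < 0
               for i, j in combinations(range(len(s)), 2))

def inv_vector(s):
    """x_s(i) = |{j <= i : s(j) > s(i+1)}|, an element of Z_2 x ... x Z_n."""
    return [sum(s[j] > s[i] for j in range(i)) for i in range(1, len(s))]

def d_l1(s, p):
    return sum(abs(a - b) for a, b in zip(inv_vector(s), inv_vector(p)))

# d_tau dominates the induced l1 metric on all of S_4
for s in permutations(range(4)):
    for p in permutations(range(4)):
        assert d_tau(s, p) >= d_l1(s, p)
```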
Chapter 23
Algorithmic Methods
Alfred Wassermann
Universität Bayreuth
23.1 Introduction
In this chapter, we study algorithms for computer construction of “good” linear [n, k]q
codes and methods to determine the minimum distance and the weight distribution of a
linear code. While we will focus on construction methods in this chapter, for methods for
decoding received words, we refer to Section 1.17 of Chapter 1 and Chapters 15, 21, 24, and
32.
In the search for good codes, coding theorists always want to satisfy two conflicting goals:
1. The redundancy of the code should be as small as possible, i.e., the ratio k/n should
be as large as possible, and
2. the code should allow one to correct as many errors as possible; i.e., the minimum
distance should be as big as possible.
As a consequence, a typical approach to find good codes over a finite field Fq is to fix
two parameters from [n, k, d] and optimize the third one.
First, we note that in this chapter we claim to have found a certain code as soon as we
have a generator matrix and know the parameters n, k, d. Fortunately, the construction
575
method described in this chapter also reveals the weight distribution as soon as a code
has been found. In Section 23.8 we study a state-of-the-art algorithm for computing the
minimum distance of linear codes in general. In special cases, this algorithm can be modified
easily to determine the weight distribution of the code.
When it comes to code construction by searching for a suitable generator matrix, we
immediately realize the following: a code in general will have many generator matrices, and
permuting columns and multiplying columns by nonzero scalars do not change fundamental
properties of the code. One possible approach to find a suitable generator matrix for given
parameters n and d is to build up the generator matrix row by row, thereby avoiding
isomorphic copies. This method is described in detail in [266, 268, 1180]. In a sense, by
doing this we fix n and d and try to maximize k.
In this chapter we will focus on an alternative approach and fix k and d and try to minimize the length n, which means finding a minimal set of columns of a generator matrix such that the resulting code has minimum distance at least d. Another formulation of the
problem would be to fix k and n and try to maximize d.
Both approaches—row-by-row or column-by-column—have been used successfully. The
latter approach, which is the approach that will be described in this chapter, is to select
the columns of a generator matrix appropriately. In order to do this an equivalence between
the existence of a linear code with a prescribed minimum distance and the existence of a
solution of a certain system of Diophantine linear equations is exploited. Since the number
of candidates for the possible columns of the generator matrix is equal to (q^k − 1)/(q − 1), for
interesting cases of linear codes the search has to be restricted, for example to codes having
a prescribed symmetry. Thus, by applying group theory the number of Diophantine linear
equations often reduces to a size which can be handled by today’s computer hardware.
Additionally, the search can be tailored to special types of codes like two-weight, self-orthogonal, or LCD codes; see Section 23.6.
As before, a linear [n, k]_q code is a k-dimensional subspace of the n-dimensional vector space F_q^n over the finite field F_q with q elements. The q^k codewords of length n are the elements of the subspace; they are written as row vectors. The (Hamming) weight wt_H(c)
of a codeword c is defined to be the number of nonzero entries of c, and the minimum
distance dH (C) of a code C is the minimum of all weights of the nonzero codewords in C.
For the purpose of error correcting codes we are interested in codes with high minimum
distance d as these allow the correction of ⌊(d − 1)/2⌋ errors. On the other hand we are interested in codes of small length n. As already stated, high minimum distance and small length are contrary goals for the optimization of codes. A linear code C is called optimal if there is no linear [n, k, d_H(C) + 1]_q code.
The definitive reference for lower and upper bounds on the maximum minimum distance of a linear code of fixed length n and dimension k is the collection of online tables [845]. In most cases the best upper bound is given by the Griesmer Bound [855]; see Theorem 1.9.18 and
Remark 1.9.19. In many cases there is a gap between the minimum distance of the best
known linear [n, k]q code and the limit given by these upper bounds.
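The Griesmer Bound itself is a one-line computation: a linear [n, k, d]_q code must satisfy n ≥ Σ_{i=0}^{k−1} ⌈d/q^i⌉ (see Theorem 1.9.18). A minimal sketch:

```python
def griesmer(k, d, q):
    """Griesmer Bound: minimal length n admitted for a linear [n, k, d]_q code,
    n >= sum_{i=0}^{k-1} ceil(d / q^i), using integer ceiling division."""
    return sum(-(-d // q**i) for i in range(k))

assert griesmer(3, 4, 2) == 4 + 2 + 1          # met by the [7,3,4]_2 simplex code
assert griesmer(4, 8, 2) == 8 + 4 + 2 + 1      # met by the [15,4,8]_2 simplex code
```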
{G1 , G2 , . . . , Gn }
of column vectors G_i ∈ F_q^k. Since we do not exclude that columns can be scalar multiples of
other columns, we view the generator matrix as a multiset. This view of a code as a multiset
of points makes it worthwhile to have a short excursion into finite projective geometry.
1 In Chapter 14, "dimension" of a projective subspace is the projective dimension, which is one less than the dimension of the corresponding vector subspace.
where a point P = ⟨P⟩ is incident with the hyperplane H = h^⊥, i.e., P ∈ H, if and only if P · h = 0 in F_q.
Example 23.3.1 F_2^3 has (2^3 − 1)/(2 − 1) = 7 points and 7 hyperplanes, i.e., 1-dimensional and 2-dimensional subspaces, respectively. Indexing columns by the points P_1, ..., P_7 and rows by the hyperplanes, the incidence matrix is

    M_3,2 =
                 P1    P2    P3    P4    P5    P6    P7
               ⟨100⟩ ⟨010⟩ ⟨001⟩ ⟨110⟩ ⟨011⟩ ⟨101⟩ ⟨111⟩
      100^⊥     0     1     1     0     1     0     0
      010^⊥     1     0     1     0     0     1     0
      001^⊥     1     1     0     1     0     0     0
      110^⊥     0     0     1     1     0     0     1        (23.1)
      011^⊥     1     0     0     0     1     0     1
      101^⊥     0     1     0     0     0     1     1
      111^⊥     0     0     0     1     1     1     0
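The incidence matrix (23.1) can be reproduced mechanically from the rule that P lies on h^⊥ if and only if P · h = 0. The sketch below uses the point order P_1, ..., P_7 of the example (conventions are assumptions of this sketch):

```python
# points in the order P1, ..., P7 of Example 23.3.1; hyperplanes h^perp likewise
pts = [(1,0,0), (0,1,0), (0,0,1), (1,1,0), (0,1,1), (1,0,1), (1,1,1)]

def incident(P, h):
    # P lies on h^perp iff P . h = 0 over F_2
    return int(sum(p * c for p, c in zip(P, h)) % 2 == 0)

M32 = [[incident(P, h) for P in pts] for h in pts]
assert M32[0] == [0, 1, 1, 0, 1, 0, 0]          # the row labeled 100^perp
assert all(sum(row) == 3 for row in M32)        # every hyperplane holds 3 points
```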
Now, we are ready to construct linear [n, k]q codes by searching for suitable (multi)sets
of points in V .
The codewords of an [n, k]_q code C are the linear combinations of rows of a generator matrix G of C; i.e., for every codeword c = c_1 c_2 ··· c_n ∈ C, there exists a vector h = h_1 h_2 ··· h_k ∈ V such that

    c = c_1 c_2 ··· c_n = hG.

If G_i denotes the i-th column of the matrix G, then for i = 1, ..., n we have the inner products

    h · G_i^T = c_i.

If we take G_i^T as base vector of the point ⟨G_i^T⟩ in V, it follows that

    c_i = 0 if and only if ⟨G_i^T⟩ ∈ h^⊥.
The Hamming weight of the codeword c is the number of its nonzero entries. Now the
following lemma is obvious.
Lemma 23.3.2 If a generator matrix of an [n, k]_q code C is viewed as a multiset P of points in V, then the following hold.
(a) Every codeword c of C corresponds to a hyperplane h⊥ of V .
(b) The Hamming weight of c is equal to the number of points in P which are not contained in h^⊥; i.e., it is equal to the number of zeros in the row labeled by h^⊥ among the columns of the matrix M_k,q labeled by P.
(c) The [n, k]q code defined by a multiset P has minimum weight at least d if every hy-
perplane contains at most n − d points of P.
Example 23.3.4 Continuing Example 23.3.1, we can take the projective point set P = {P_1, P_2, ..., P_7} of all points of F_2^3 as columns for our generator matrix. Since the row sum of each row in the incidence matrix (23.1) is equal to 3, for every hyperplane there are 7 − 3 = 4 points of P not contained in that hyperplane. It follows that every nonzero codeword has weight 4, and we have constructed a [7, 3, 4]_2 code. Taking the base vectors of the points as columns of the generator matrix G gives

        [ 1 0 0 1 0 1 1 ]
    G = [ 0 1 0 1 1 0 1 ].
        [ 0 0 1 0 1 1 1 ]
This is the well known simplex code; see also Definition 1.10.2.
Choosing points P_1, P_2, P_3, and P_7, we get a code with the generator matrix

        [ 1 0 0 1 ]
    G = [ 0 1 0 1 ].
        [ 0 0 1 1 ]

Summing up the corresponding columns of the incidence matrix, we immediately see that this gives a code with 6 codewords of weight 2 and one codeword of weight 4; i.e., it is a [4, 3, 2]_2 code.
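Lemma 23.3.2(b) turns the weight computation into counting points off hyperplanes. The sketch below does this for both point sets of the example (parameters fixed to k = 3, q = 2 as an assumption of this sketch):

```python
from itertools import product

def weights_from_points(points, k=3, q=2):
    """Weight of the codeword attached to h^perp = number of chosen points
    not lying on the hyperplane (Lemma 23.3.2), sorted over all hyperplanes."""
    hyperplanes = [h for h in product(range(q), repeat=k) if any(h)]
    return sorted(sum(sum(p * c for p, c in zip(P, h)) % q != 0 for P in points)
                  for h in hyperplanes)

simplex = [(1,0,0), (0,1,0), (0,0,1), (1,1,0), (0,1,1), (1,0,1), (1,1,1)]
assert weights_from_points(simplex) == [4] * 7            # the [7,3,4]_2 simplex code
sub = [(1,0,0), (0,1,0), (0,0,1), (1,1,1)]
assert weights_from_points(sub) == [2, 2, 2, 2, 2, 2, 4]  # the [4,3,2]_2 code
```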
We note that every linear code C can be described by projective point sets in many
different ways. For example, each step of the Gaussian elimination process applied on a
generator matrix will give a different projective point set without changing the code. On
the other hand, if a projective point set and its order as columns of the generator matrix
is given, the row reduced echelon form of the generator matrix defines a canonical
projective point set describing the code.
The special case when a code C is described by a set of projective points instead of a
multiset can be read from the weight distribution of the dual code C ⊥ . There is a codeword
of weight 2 in the dual code if and only if there is a linear combination of two columns of
the generator matrix of C which equals the zero vector; i.e., two columns of the generator
matrix are nonzero scalar multiples of each other. If a code C is described by a set rather
than a multiset of projective points, C is called projective.
Proposition 23.3.5 For a linear [n, k]q -code C the following statements are equivalent.
(a) C is a projective code.
(b) C ⊥ does not contain words of weight 2.
(c) The columns of a generator matrix of C correspond to different projective points.
Remark 23.3.6 The relation between projective point sets and linear codes is well studied;
see [611]. MacWilliams and Sloane mention the connection in the special case of MDS codes
[1323, Chapter 11, Section 6].
The complement of a projective point set defining an [n, k, d]_q code is called a minihyper. A (b, t)-minihyper is a multiset of b points of V such that every hyperplane of V contains at least t of these points and there is at least one hyperplane which contains exactly t points; see further [293, 720, 891, 1344].
The above considerations lead us to the following theorem, which shows the equivalence between the construction of codes with a prescribed minimum distance and solving a system of Diophantine linear equations. In the next theorem and subsequently, I_t is the t × t identity matrix, and N := (q^k − 1)/(q − 1) denotes the number of projective points of F_q^k.

Theorem 23.3.7 There is an [n, k, d′]_q code with minimum distance d′ ≥ d if and only if there are column vectors x ∈ {0, ..., n}^N and y ∈ {0, ..., n − d}^N satisfying simultaneously

    • ( M_k,q | I_N ) · (x; y) = (n − d)1^T, i.e., M_k,q x + y = (n − d)1^T,     (23.2)

    • Σ_{i=1}^{N} x_i = n.     (23.3)
Remark 23.3.8 The codes constructed in Theorem 23.3.7 are projective if the solution vector x is restricted to {0, 1}^N, where N = (q^k − 1)/(q − 1).
Example 23.3.9 Continuing with Example 23.3.4, the [7, 3, 4]_2 code is obtained by choosing x_1 = x_2 = ··· = x_7 = 1 and y_1 = y_2 = ··· = y_7 = 0 in Theorem 23.3.7. The [4, 3, 2]_2 code is obtained with the choice x_1 = x_2 = x_3 = x_7 = 1, x_4 = x_5 = x_6 = 0, y_1 = y_2 = ··· = y_6 = 0, and y_7 = 2.
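The conditions of Theorem 23.3.7 can be verified mechanically. The sketch below checks the all-ones solution for the [7, 3, 4]_2 simplex code, rebuilding the incidence matrix from the point order of Example 23.3.1 and reading y as the slack in M_3,2 x + y = (n − d)1 (conventions assumed by this sketch):

```python
pts = [(1,0,0), (0,1,0), (0,0,1), (1,1,0), (0,1,1), (1,0,1), (1,1,1)]  # P1..P7
M = [[int(sum(p * c for p, c in zip(P, H)) % 2 == 0) for P in pts] for H in pts]

n, d = 7, 4
x = [1] * 7          # take every projective point exactly once
y = [0] * 7          # slack variables
# (23.2): M x + y = (n - d) 1,   (23.3): sum x_i = n
assert all(sum(m * xi for m, xi in zip(row, x)) + yi == n - d
           for row, yi in zip(M, y))
assert sum(x) == n
```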
For this brute force approach to the code search, clearly the number of rows, respectively columns, of M_k,q, namely (q^k − 1)/(q − 1), is the limiting factor of the construction. Solving the system of Diophantine linear equations is only possible for small values of k and q. Therefore, we need to apply a well-known method, also described in [293], to shrink the matrix M_k,q to the much smaller M^Γ_k,q by prescribing a subgroup Γ of the general linear group GL_k(q) as a group of automorphisms. This approach will be described in the following sections.
Leaving the field automorphisms aside, which are trivial if q is a prime, the fundamental theorem of projective geometry says that every automorphism π : F_q^k → F_q^k is an element of GL_k(q); i.e., it can be expressed as an invertible k × k matrix M_π over F_q. For more details on the action of GL_k(q) on the subspace lattice of PG(k, q), see [296, 297] and [190, Chapter 8].
The set of all automorphisms of a set of subspaces U forms a group Γ. Let Γ ≤ GLk (q)
be a group represented by k × k matrices over Fq . An action of the group Γ (from the left)
on a nonempty set U is defined by a mapping
Γ × U → U : (π, U ) 7→ π(U )
Two subspaces are equivalent, U ∼_Γ U′, if and only if there exists π ∈ Γ such that π(U) = U′; that this is an equivalence relation relies on the fact that π(U) = U′ if and only if U = π^{−1}(U′). The equivalence class
Γ(U ) = {π(U ) | π ∈ Γ}
    M_π P = ( M_π P_1^T  M_π P_2^T  ···  M_π P_n^T ).

Since π(P) = P, the base vectors of points in P are mapped to base vectors of points in P. That means there are scalars α_1, α_2, ..., α_n ∈ F_q and a permutation σ ∈ Sym(n) of the indices 1, 2, ..., n such that

    M_π P = ( α_1 P_σ(1)^T  α_2 P_σ(2)^T  ···  α_n P_σ(n)^T ) = P P_σ diag(α),

where P_σ is the permutation matrix induced by the permutation σ and diag(α) is the diagonal matrix with diagonal entries α_1, α_2, ..., α_n. But that means P_σ diag(α) is an automorphism of the code C (see Definition 1.8.7), and we have shown the following theorem.
Theorem 23.3.11 Let Γ ≤ GLk (q) be a group of automorphisms of a projective point
(multi)set P where P defines a linear [n, k]q code C. Then Γ induces an automorphism
group Γ0 of C.
(b) The incidence between a point P and a hyperplane H is invariant under the action of
Γ. That means if P ∈ H, then π(P ) ∈ π(H) for all π ∈ Γ.
(c) The number of Γ-orbits on the set of points, together with their sizes, is equal to the number and sizes of the Γ-orbits on the set of hyperplanes.
(d) If Γ(H) is a Γ-orbit on hyperplanes and Γ(P ) is a Γ-orbit on points, then the cardi-
nality of {Q ∈ Γ(P ) | Q ∈ H} is independent of the choice of the representative H of
the orbit Γ(H).
(e) If the action of the group Γ on a projective point set consists of a single orbit, i.e., the
action is transitive, then the corresponding code is cyclic.
There are several methods available to compute the full automorphism group of a linear
code, to test two codes for isomorphism, or to partition a set of linear codes into isomorphism
classes.
One possibility to tackle these problems is to use the general methods described in
Section 3.2. For linear codes, specific algorithms have been developed; see [265, 266, 1217,
1219, 1650] and the references therein. A very fast algorithm, which is available in various
computer algebra systems as well as a stand-alone program, is codecan by Feulner [721, 722].
Theorem 23.4.1 Let Γ be a subgroup of GL_k(q), let ω_1, ..., ω_m be the orbits of Γ on the projective points of V, and let Ω_1, ..., Ω_m be the orbits of Γ on the set of hyperplanes of V, together with a transversal H_1 ∈ Ω_1, H_2 ∈ Ω_2, ..., H_m ∈ Ω_m. Let M^Γ_k,q = (m^Γ_i,j) be the m × m matrix with integer entries

    m^Γ_i,j := |{P ∈ ω_j | P ∈ H_i}|,

for 1 ≤ i, j ≤ m.
There is an [n, k, d′]_q code with minimum distance d′ ≥ d such that a generator matrix of this code has Γ as a group of automorphisms if and only if there are vectors x with x_i ∈ {0, ..., ⌊n/|ω_i|⌋}, 1 ≤ i ≤ m, and y ∈ {0, ..., n − d}^m satisfying simultaneously

    • ( M^Γ_k,q | I_m ) · (x; y) = (n − d)1^T,     (23.4)

    • Σ_{j=1}^{m} |ω_j| · x_j = n.     (23.5)
Algorithmic Methods 583
Remark 23.4.2 As in Section 23.3, the codes constructed in Theorem 23.4.1 are projective
if the solution vector x is restricted to {0, 1}m .
For algorithms to compute the matrix MΓk,q , see [1090, 1162]; a more advanced approach
is described in [190, Chapter 8] and [1106, Chapter 9].
Note that for the computation of the hyperplane orbits the group action for the group
element M ∈ Γ is realized by a multiplication from the right of the matrix M to a row
vector h, i.e., hM .
Example 23.4.3 Continuing Example 23.3.4, we again search for a [4, 3, 2]2 code. But now
we prescribe a cyclic group Γ of order 3, generated by the matrix

    [0 1 0]
    [0 0 1]
    [1 0 0] .

The action of Γ on the projective points of F2^3 yields the following three orbits:

    ω1 = {100, 010, 001},   ω2 = {110, 011, 101},   ω3 = {111}.

For example, entry mΓ1,2 = 1 is computed as follows. ω2 = {110, 011, 101} and one represen-
tative of Ω1 is the hyperplane H = 100⊥ . For the members of ω2 we have that 110, 101 ∉ H
and 011 ∈ H. Therefore it follows that mΓ1,2 = 1.
The resulting system of Diophantine equations (23.4) and (23.5) is

    [2 1 0 1 0 0]  [x1]    [2]
    [1 1 1 0 1 0]  [x2]    [2]
    [0 3 0 0 0 1]  [x3] =  [2]
    [3 3 1 0 0 0]  [y1]    [4]
                   [y2]
                   [y3]

where x1 , x2 ∈ {0, 1}, x3 ∈ {0, 1, . . . , 4}, and yi ∈ {0, 1, 2}, 1 ≤ i ≤ 3. The unique
solution vector of this system of equations is

    (x | y) = (1, 0, 1 | 0, 0, 2)T ,
which means that the columns of the generator matrix will be composed of the union of the
orbits ω1 and ω3 . As in Example 23.3.4, we obtain the following generator matrix
1 0 0 1
0 1 0 1
0 0 1 1
of an optimal linear [4, 3, 2]2 code. From the solution vector y, we can immediately read
off the weight distribution of the code. There are two entries of y equal to zero. The
corresponding orbits on hyperplanes both have size 3. Therefore, the code has (q − 1) · (3 + 3) =
6 codewords of weight d + 0 = 2. Further, y3 = 2. The hyperplane orbit Ω3 has size 1, which
means that the code has exactly (q − 1) · 1 = 1 codeword of weight d + 2 = 4.
Since the solution vector x = (x1 , x2 , x3 ) has only entries from {0, 1}, we know that the
resulting code is projective.
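These computations are easy to replay by machine. The sketch below is our own Python, not part of the text; the group acts on row vectors by v ↦ vM, and, as noted above, hyperplane normals transform the same way. It rebuilds the orbits and the matrix MΓ3,2 and checks the solution of (23.4) and (23.5):

```python
from itertools import product

def orbits(points, gens, act):
    """Partition `points` into orbits under the group generated by `gens`."""
    seen, orbs = set(), []
    for p in points:
        if p in seen:
            continue
        orb, stack = {p}, [p]
        while stack:
            v = stack.pop()
            for g in gens:
                w = act(v, g)
                if w not in orb:
                    orb.add(w)
                    stack.append(w)
        seen |= orb
        orbs.append(sorted(orb))
    return orbs

# right action v -> v*M over F_2
def act(v, M):
    return tuple(sum(v[i] * M[i][j] for i in range(3)) % 2 for j in range(3))

cyc = [((0, 1, 0), (0, 0, 1), (1, 0, 0))]          # generator of Gamma, order 3
pts = [p for p in product((0, 1), repeat=3) if any(p)]

point_orbits = orbits(pts, cyc, act)               # [w1, w2, w3]
hyp_orbits = point_orbits                          # normals h of h^perp transform alike

# m^Gamma_{i,j} = |{P in w_j : P in H_i}| for one representative H_i per orbit
M = [[sum(1 for P in wj if sum(a * b for a, b in zip(P, Wi[0])) % 2 == 0)
      for wj in point_orbits] for Wi in hyp_orbits]

# check the solution x = (1,0,1), y = (0,0,2) of (23.4)/(23.5) with n = 4, d = 2
x, y, n, d = (1, 0, 1), (0, 0, 2), 4, 2
lhs = [sum(M[i][j] * x[j] for j in range(3)) + y[i] for i in range(3)]
```

The same `orbits` helper works for any finite matrix group once `act` is adjusted to the right modulus.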
Example 23.4.4 Here, we are searching for a linear [14, 3, 9]3 code over F3 . Such a code is
optimal. First note that a code with these parameters cannot be projective, since there are
exactly (3^3 − 1)/(3 − 1) = 13 one-dimensional subspaces of F3^3. Therefore, at least one
column has to occur at least two times in a generator matrix of such a code. The incidence
matrix M3,3 has size 13 × 13. The 1-dimensional subspaces are ⟨001⟩, ⟨010⟩, ⟨011⟩, ⟨012⟩,
⟨100⟩, ⟨101⟩, ⟨102⟩, ⟨110⟩, ⟨111⟩, ⟨112⟩, ⟨120⟩, ⟨121⟩, ⟨122⟩.
Now we prescribe the complete monomial group Γ = Mon3 (3); see [294]. This group is
generated by the matrices

    [0 1 0]   [0 1 0]   [2 0 0]   [1 0 0]   [1 0 0]
    [1 0 0] , [0 0 1] , [0 1 0] , [0 2 0] , [0 1 0] ,
    [0 0 1]   [1 0 0]   [0 0 1]   [0 0 1]   [0 0 2]

and has order 48. The action of Γ on the projective points of F3^3 yields the following three
orbits:

    ω1 = {⟨001⟩, ⟨010⟩, ⟨100⟩},
    ω2 = {⟨011⟩, ⟨012⟩, ⟨101⟩, ⟨102⟩, ⟨110⟩, ⟨120⟩},
    ω3 = {⟨111⟩, ⟨112⟩, ⟨121⟩, ⟨122⟩}.

As an example, we show how to compute the entry mΓ1,1 = 2: ω1 = {⟨001⟩, ⟨010⟩, ⟨100⟩} and
one representative of Ω1 is the hyperplane H = 001⊥ . For the members of ω1 we have that
⟨001⟩ ∉ H and ⟨010⟩, ⟨100⟩ ∈ H. It follows that mΓ1,1 = 2.
Having computed MΓ3,3 , the corresponding system of Diophantine equations (23.4) and
(23.5) is

    [2 2 0 1 0 0]  [x1]    [ 5]
    [1 1 2 0 1 0]  [x2]    [ 5]
    [0 3 1 0 0 1]  [x3] =  [ 5]
    [3 6 4 0 0 0]  [y1]    [14]
                   [y2]
                   [y3]

where x1 ∈ {0, 1, 2, 3, 4}, x2 ∈ {0, 1, 2}, x3 ∈ {0, 1, 2, 3}, and yi ∈ {0, 1, 2, 3, 4, 5}, 1 ≤ i ≤ 3.
The unique solution vector of this system of equations is

    (x | y) = (0, 1, 2 | 3, 0, 0)T ,
which means that the columns of the orbit ω2 occur once in the corresponding generator
matrix and the columns of the orbit ω3 occur twice. Thus, we obtain the generator matrix

    [0 1 1 0 1 1 1 1 1 1 1 1 1 1]
    [1 0 1 1 0 2 1 1 2 2 2 2 1 1]
    [1 1 0 2 2 0 1 1 2 2 1 1 2 2]
of an optimal linear [14, 3, 9]3 code. From the solution vector y, we can immediately read off
the weight distribution of the code: there are (q − 1) · 3 = 6 codewords of weight d + 3 = 12
and (q − 1) · (6 + 4) = 20 codewords of weight d + 0 = 9.
Since x3 > 1, we know that the code is not projective.
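Since 3^3 = 27 messages is a tiny search space, the claimed parameters of this generator matrix can be verified by brute force; the following check is ours, not part of the text:

```python
from itertools import product

G = [[0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
     [1, 0, 1, 1, 0, 2, 1, 1, 2, 2, 2, 2, 1, 1],
     [1, 1, 0, 2, 2, 0, 1, 1, 2, 2, 1, 1, 2, 2]]

# weight distribution of the code generated by G over F_3
weights = {}
for m in product(range(3), repeat=3):
    if m == (0, 0, 0):
        continue
    c = [sum(mi * gi for mi, gi in zip(m, col)) % 3 for col in zip(*G)]
    w = sum(1 for ci in c if ci)
    weights[w] = weights.get(w, 0) + 1
```

The minimum of the recorded weights is the minimum distance, and the counts match the weight distribution read off from y above.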
     n    d
    96   81
    97   82
   102   87
   103   88
   108   92
In [299], in more than 30 cases a code with a minimum distance as good as the current
lower bound could be constructed. In situations where the number of orbits generated by
a permutation group was too big, the group was enlarged by adding a diagonal matrix
as a generator to get fewer orbits. For example, in the case of q = 9 and k = 4, a double
transposition and a diagonal matrix were used to reduce the system of equations to 59 rows,
which gave the following 3 new [n, 4, d]9 codes:
     n     d
   100    86
   128   110
   130   112
Remark 23.4.5 The approach to prescribe an automorphism group and solve a resulting
Diophantine linear system to construct a linear code is very general and can be used in a
wide variety of other contexts.
For example, in [1116, 1117] it was used to find very good linear codes over rings. In
[300, 1139] subspace codes were found with the same approach; see Chapter 14 for more
information about subspace codes.
Even more generally, a code can be regarded as an incidence structure, represented by an
incidence matrix whose row sums are within a certain range. Now the theory of incidence
preserving group actions [1106] allows one to shrink the incidence matrix to a smaller
one whose rows, respectively columns, correspond to orbits of a group and whose row
sums stay in the same range. This construction is a general approach that works for many
discrete structures. Kramer and Mesner were among the first to apply this method in design
theory [1161]. Subsequently, it has become known as the Kramer–Mesner Method. It allowed
the construction of record-breaking structures such as, for example, combinatorial designs [192],
q-analogs of designs [295, 296, 297], and (n, r)-arcs in PG(2, q) [298].
code has exactly two intersection numbers with the hyperplanes. Let w1 and w2 be the two
possible weights of the linear code; without loss of generality assume that w1 < w2 . Then,
if q is a power of the prime p, it is well known that there are integers j and t such that
w1 = p^j t and w2 = p^j (t + 1); see [330, Corollary 5.5].
It is easy to generalize Theorem 23.3.7 to linear two-weight codes; see [1136].
Theorem 23.6.1 Let w1 < w2 . There is a two-weight [n, k, w1 ]q code with nonzero weights
w1 and w2 if and only if there are column vectors x and y of length (q^k − 1)/(q − 1), the
number of projective points, with entries xi ∈ {0, . . . , n}, yi ∈ {0, 1}, and 00 · · · 0 ≠ y ≠
11 · · · 1, satisfying simultaneously

    • ( Mk,q | diag(w2 − w1 ) ) (x, y)T = (n − w1 )1T ,

    • Σi xi = n,

where diag(w2 − w1 ) is the diagonal matrix of size (q^k − 1)/(q − 1) with all diagonal entries
equal to w2 − w1 .
Remark 23.6.2 Prescribing a group of automorphisms in the search for two-weight codes
is also straightforward: In Theorem 23.4.1 the identity matrix Im in equation (23.4) has to
be replaced by the m × m diagonal matrix diag(w2 − w1 ) and the entries of the column
vector y from Theorem 23.4.1 have to be restricted to {0, 1}.
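As a small illustration of Theorem 23.6.1 (our own example, not from the text), the even-weight [4, 3, 2]2 code with generator columns 100, 010, 001, 111 is a two-weight code with w1 = 2 and w2 = 4; solving the bulleted condition for y indeed gives a 0/1 vector that is neither all-zero nor all-one:

```python
from itertools import product

pts = [p for p in product((0, 1), repeat=3) if any(p)]   # the 7 projective points
cols = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 1)]      # generator columns
n, w1, w2 = len(cols), 2, 4

x = [cols.count(p) for p in pts]                         # column multiplicities
# M_{3,2}: rows = hyperplanes h^perp, columns = points, 1 iff the point lies on it
M = [[1 if sum(a * b for a, b in zip(p, h)) % 2 == 0 else 0 for p in pts]
     for h in pts]

# solve ( M | diag(w2-w1) ) (x,y)^T = (n-w1) 1^T for y, row by row
y = [((n - w1) - sum(mij * xj for mij, xj in zip(row, x))) // (w2 - w1)
     for row in M]
```

Exactly one entry of y equals 1 (for the hyperplane 111⊥), matching the single weight-4 codeword of this code.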
Another straightforward modification of the equations allows one to restrict the search
to linear three-weight codes, i.e., linear codes attaining three different nonzero weights
w1 < w2 < w3 . In the special case that the three weights are symmetric around w2 , i.e.,
there is a δ > 0 such that w1 + δ = w2 = w3 − δ, condition (23.2) reduces to

    • ( Mk,q | diag(δ) ) (x, y)T = (n − w1 )1T ,

where x has the same restrictions as in Theorem 23.6.1 and y is required to be a vector of
length (q^k − 1)/(q − 1) with entries from {0, 1, 2}. The analogous modification applies to
∆-divisible codes, i.e., codes whose nonzero weights all lie in {d, d + ∆, d + 2∆, . . .}: there,
x has the same restrictions as in Theorem 23.6.1 and y is required to be a vector of length
(q^k − 1)/(q − 1) with entries from {0, 1, . . . , ⌊(n − d)/∆⌋}.
Prescribing a group of automorphisms Γ in the search for divisible codes is straightforward:
in the above condition Mk,q has to be replaced by the matrix MΓk,q .
Note that, since the Gram matrix is symmetric, it suffices to consider the k(k + 1)/2 entries
Qi,j , 1 ≤ i ≤ j ≤ k. With this notation the additional equations over Fq can therefore be
written as

    Σs xs Ps^(i) Ps^(j) = Qi,j   for 1 ≤ i ≤ j ≤ k ,

where ⟨Ps ⟩ runs through all (q^k − 1)/(q − 1) projective points. These equations guarantee
that the pairwise inner products of the rows of a generator matrix built from the chosen
columns have the required values. Combining this with the result from Section 23.3 we get
the following theorem.
Theorem 23.6.4 Let q ∈ {2, 3} and Q be a symmetric k × k matrix over Fq . There exists
an [n, k, d0 ]q code with minimum distance d0 ≥ d and a generator matrix with Gram matrix
Q if and only if there are column vectors x and y of length (q^k − 1)/(q − 1) with entries
xi ∈ {0, . . . , n} and yi ∈ {0, . . . , n − d} satisfying the system of equations

    • ( Mk,q | I ) (x, y)T = (n − d)1T ,

    • Σi xi = n,

    • Σs xs Ps^(i) Ps^(j) ≡ Qi,j (mod q)   for 1 ≤ i ≤ j ≤ k ,

where I is the identity matrix of size (q^k − 1)/(q − 1) and ⟨Ps ⟩ runs through all projective
points.
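For q = 2 the extra condition can be checked directly: summing xs Ps^(i) Ps^(j) over all points reproduces the Gram matrix of the generator matrix modulo q. The snippet below (ours) does this for the [4, 3, 2]2 code with columns 100, 010, 001, 111:

```python
from itertools import product

q = 2
pts = [p for p in product(range(q), repeat=3) if any(p)]
cols = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 1)]
x = [cols.count(p) for p in pts]          # point multiplicities of the columns

# left-hand side of the congruence: sum_s x_s P_s^(i) P_s^(j) mod q
lhs = [[sum(xs * P[i] * P[j] for xs, P in zip(x, pts)) % q for j in range(3)]
       for i in range(3)]

# Gram matrix G G^T of the generator matrix whose columns are `cols`
G = list(zip(*cols))                      # 3 x 4 matrix, rows over F_2
gram = [[sum(a * b for a, b in zip(G[i], G[j])) % q for j in range(3)]
        for i in range(3)]
```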
Additionally prescribing a group of automorphisms in the search for linear codes with a
prescribed Gram matrix is straightforward.
Theorem 23.6.5 Let Γ be a subgroup of GLk (q) and Q be a symmetric k × k matrix over
Fq . Let ω1 , . . . , ωm be the orbits of Γ on the projective points of V , and let Ω1 , . . . , Ωm be
the orbits of Γ on the set of hyperplanes of V together with a transversal H1 , H2 , . . . , Hm .
Let MΓk,q = (mΓi,j ) be the m × m matrix with integer entries

    mΓi,j := |{P ∈ ωj | P ∈ Hi }|,   for 1 ≤ i, j ≤ m.

Further, let PΓk,q = (pΓ(i,j),s ) be the k(k + 1)/2 × m matrix with entries

    pΓ(i,j),s = Σ_{P ∈ ωs} P^(i) P^(j)   for 1 ≤ i ≤ j ≤ k and 1 ≤ s ≤ m .

There is an [n, k, d0 ]q code with minimum distance d0 ≥ d such that a generator matrix of this
code has Γ as a group of automorphisms and a generator matrix has Gram matrix Q if and
only if there are vectors x with xi ∈ {0, . . . , ⌊n/|ωi |⌋} for 1 ≤ i ≤ m, y ∈ {0, . . . , n − d}m ,
and z ∈ Z^{k(k+1)/2} satisfying

    ( MΓk,q              Im    0     )  (x)     ( (n − d)1T )
    ( PΓk,q              0     −q I  )  (y)  =  (     Q     )
    ( |ω1 | · · · |ωm |  0     0     )  (z)     (     n     )

where the zero blocks and the identity matrix I have the appropriate sizes, and Q is here
read as the column vector of length k(k + 1)/2 consisting of the entries Qi,j , 1 ≤ i ≤ j ≤ k.
Note that in the above system of equations the integer variables zi , 0 ≤ i < k(k + 1)/2, are
implicitly bounded by the restrictions on the vectors x and y.
As for Theorem 23.6.4, in the case of q ∉ {2, 3} the above conditions involving the entries
of the Gram matrix are sufficient but not necessary for a code together with a generator
matrix with given Gram matrix to exist.
The dual code C⊥ of a linear [n, k]q code C is an [n, n − k]q code. A self-orthogonal linear
[n, k]q code is a k-dimensional subspace C of the n-dimensional vector space Fq^n over the
finite field Fq with the additional requirement that C ⊆ C⊥ .
This means that a code C is self-orthogonal if and only if it holds over Fq that

    v · w = 0   for all v, w ∈ C .
It is known that if G is a generator matrix of C and G(0) , G(1) , . . . , G(k−1) are the rows of
G, then C is self-orthogonal if and only if

    G(i) · G(j) = Σ_{0 ≤ s < n} Gs^(i) Gs^(j) = 0   for all 0 ≤ i ≤ j ≤ k − 1.
In other words, a linear [n, k]q code C with generator matrix G is self-orthogonal if and only
if the Gram matrix GGT = 0k×k over Fq . This enables us to apply the method from Section
23.6.4 to the search for self-orthogonal codes for q = 2, 3.
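The test GGT = 0 over Fq takes only a few lines; the sketch below (ours) confirms it for the extended binary [8, 4, 4] Hamming code, which is even self-dual, and rejects a code that is not self-orthogonal:

```python
def is_self_orthogonal(G, q):
    """Check G * G^T == 0 over F_q for a generator matrix G (list of rows)."""
    return all(sum(a * b for a, b in zip(ri, rj)) % q == 0
               for ri in G for rj in G)

# extended [8,4,4] Hamming code, G = (I | A) with A = J - I
H8 = [[1, 0, 0, 0, 0, 1, 1, 1],
      [0, 1, 0, 0, 1, 0, 1, 1],
      [0, 0, 1, 0, 1, 1, 0, 1],
      [0, 0, 0, 1, 1, 1, 1, 0]]

# not self-orthogonal: the two rows have inner product 1 over F_2
triv = [[1, 0, 1],
        [0, 1, 1]]
```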
Example 23.6.6 In [1140] the described method allowed the authors to construct self-
orthogonal codes for the following parameters, which were improvements of the bounds (for
general linear codes) in [845].
• Parameters of optimal codes: [177, 10, 84]2 , [38, 7, 21]3 , [191, 6, 126]3 , [202, 6, 132]3 ,
[219, 6, 144]3 , [60, 7, 36]3 .
• Parameters of codes which are improvements to the previous bounds in [845] but
which are not optimal codes: [175, 10, 82]2 , [140, 11, 64]2 , [61, 7, 36]3 , [188, 7, 120]3 ,
[243, 7, 156]3 .
The time needed to compute the matrices MΓk,q of Section 23.4 is small compared to the
time needed to solve the corresponding system of Diophantine equations. The computation
times depend heavily on the number of orbits and on the number n − d, which is the upper
bound for part of the variables. Also, the number k(k − 1)/2 of equations added to ensure
self-orthogonal solutions comes into play. All this is shown in Table 23.1, which gives
detailed information for the six optimal codes.
TABLE 23.1: Code parameters, number of orbits, and solving time for optimal self-
orthogonal codes

    Code              # Orbits   Time      n − d   k(k−1)/2
    [177, 10, 84]2     51        < 3h       93      45
    [38, 7, 21]3      101        < 100s     17      21
    [60, 7, 36]3       67        < 100s     24      21
    [202, 6, 132]3     44        < 10s      70      15
    [219, 6, 144]3     38        < 10s      75      15
    [191, 6, 126]3     44        < 10s      65      15
One further challenge is to choose a group such that the reduction is large enough to
TABLE 23.2: Parameters of all self-orthogonal [n, 9, d]2 codes which could be constructed
in [1140] with the method described in this chapter and whose minimum distance is at least
as high as that of the best known codes in [845]. An entry 30 − 33 means that codes of
length 30, 31, 32, and 33 could be constructed.
n d n d n d n d n d
21 − 25 8 70 − 74 32 118 − 121 56 175 − 176 84 222 − 223 108
30 − 33 12 80 − 81 36 127 − 128 60 182 − 184 88 226 110
38 − 42 16 84 38 135 − 138 64 189, 191 92 228 − 233 112
45 18 86 − 90 40 142 66 194 94 238 − 240 116
47 − 50 20 93 42 144 − 145 68 196 − 201 96 243 118
53 22 95 − 97 44 148 70 205 98 245 − 248 120
55 − 58 24 100 46 150 − 154 72 207 − 209 100 250 122
63 − 65 28 102 − 106 48 159 − 160 76 212 102 252 124
68 30 111 − 113 52 166 − 170 80 214 − 216 104 256 128
get a system which can be handled by the solving algorithm but which, on the other hand,
is also a group of automorphisms of a self-orthogonal code with high minimum distance.
The least requirement on the group is that there exist point orbits under the action of the
group with length at most n.
For more details on promising groups for the construction of general linear codes, see [299,
1483]. However, in the case of self-orthogonal codes, cyclic groups (i.e., groups generated by
one element) seem to be especially good. For example, in the case of q = 2 and k = 9, all
the distance-optimal self-orthogonal codes in [1140] were found by using cyclic groups.
In many cases it is possible to find self-orthogonal codes which meet the minimum weight
of the best known linear codes. Of course this is possible only in the case of even weight
(in the binary case) or weight d with d ≡ 0 mod 3 (in the ternary case). This situation is
reflected in Table 23.2 for the case q = 2 and k = 9 from [1140].
An alternative approach is to classify self-orthogonal codes by reversing the operation
of going from a code to its residual code; see [271]. This means that instead of searching
for suitable columns of a generator matrix, one builds up generator matrices row by row.
In Chapter 4 construction methods for self-dual codes are studied.
A linear code C is a linear complementary dual (LCD) code if C ∩ C⊥ = {0}.

Theorem 23.6.7 (Massey [1347]) If G is a generator matrix for the linear [n, k]q code C,
then C is an LCD code if and only if the k × k Gram matrix GGT is nonsingular. Moreover,
if C is an LCD code, then GT (GGT )−1 G is the orthogonal projection from Fq^n to C.

Remark 23.6.8 GT (GGT )−1 is known as the Moore–Penrose inverse of the matrix G. It
exists if and only if the Gram matrix GGT is invertible.
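Over a prime field, Theorem 23.6.7 and Remark 23.6.8 can be tested with elementary modular matrix arithmetic. The sketch below is ours (the helper names are not from the text): it inverts the Gram matrix by Gauss–Jordan elimination, builds the projection GT (GGT )−1 G, and checks that it is idempotent and fixes a codeword:

```python
def mat_mul(A, B, q):
    return [[sum(a * b for a, b in zip(row, col)) % q for col in zip(*B)]
            for row in A]

def inv_mod(A, q):
    """Inverse of a square matrix over F_q (q prime), or None if singular."""
    n = len(A)
    M = [list(row) + [int(i == j) for j in range(n)] for i, row in enumerate(A)]
    for c in range(n):
        piv = next((r for r in range(c, n) if M[r][c] % q), None)
        if piv is None:
            return None
        M[c], M[piv] = M[piv], M[c]
        inv = pow(M[c][c] % q, q - 2, q)
        M[c] = [v * inv % q for v in M[c]]
        for r in range(n):
            if r != c and M[r][c] % q:
                f = M[r][c] % q
                M[r] = [(v - f * w) % q for v, w in zip(M[r], M[c])]
    return [row[n:] for row in M]

def lcd_projection(G, q):
    """G^T (G G^T)^{-1} G over F_q, or None if the code is not LCD."""
    Gt = [list(col) for col in zip(*G)]
    gram_inv = inv_mod(mat_mul(G, Gt, q), q)
    if gram_inv is None:
        return None
    return mat_mul(mat_mul(Gt, gram_inv, q), G, q)

G = [[1, 0, 1],
     [0, 1, 1]]                        # an LCD code over F_2: det(GG^T) = 1
P = lcd_projection(G, 2)
c = [(a + b) % 2 for a, b in zip(*G)]  # wrong axis on purpose? no: rows summed below
c = [(a + b) % 2 for a, b in zip(G[0], G[1])]   # the codeword (1, 1, 0)
Pc = [sum(pij * cj for pij, cj in zip(row, c)) % 2 for row in P]
```

The repetition code with G = (1 1) has singular Gram matrix over F2 and is correctly reported as not LCD.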
Carlet and Guilley [353] used LCD codes as a countermeasure to side-channel attacks
in cryptography. LCD codes are also used in the construction of self-dual codes and lattices
[1421]. In [687] decoding properties of LCD codes were studied.
There are many constructions of LCD codes. For example, Yang and Massey [1920]
proved that a cyclic code C is LCD if and only if its generator polynomial is self-reciprocal
and satisfies an additional condition on its irreducible factors; see also Section 2.10.
Sendrier [1648] showed that LCD codes meet the Gilbert–Varshamov Bound. Dougherty et
al. [629] describe constructions of LCD codes from codes over rings, orthogonal matrices,
and block designs. For other constructions, see Chapter 16 and Galindo et al. [785].
At first sight it appears that in an exhaustive search for all optimal LCD codes we have
to prescribe all symmetric matrices from GLk (q). Fortunately, the set of Gram matrices
which have to be examined can be reduced to a much smaller set. To see this we make a
short excursion to quadratic forms.
A symmetric matrix A ∈ Fq^{k×k} gives rise to a quadratic form QA , which is the mapping
QA : Fq^k → Fq defined by QA (x) = xAxT . If a generator matrix P is transformed by a
basis change M ∈ GLk (q) into P′ = M P , the Gram matrix transforms as

    P′ P′T = (M P )(M P )T = M (P P T )M T ,

so the Gram matrices of equivalent codes are congruent symmetric matrices. It therefore
suffices to prescribe one representative from each congruence class, and these classes are
given by the classification of quadratic forms over Fq .
This classification enables an exhaustive search for the [n, k]2 LCD codes with the largest
possible minimum distance for small values of k, i.e., k < 6.
In the search for LCD codes for larger values of k one strategy could be to restrict the
set of prescribed invertible Gram matrices to, for example, permutation matrices.
TABLE 23.3: Table of parameters of binary LCD codes with n < 50. The entries in the
table are the largest values of d for which an [n, k, d]2 LCD code has been found with the
methods of this chapter. The search was exhaustive for k < 6. Bold face entries indicate
that the code is an optimal [n, k, d]2 code for general linear codes.
n/k 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49
1 1
2 1 1
3 3 2 1
4 3 2 1 1
5 5 2 2 2 1
6 3 3 2 2 1 1
7 7 4 3 2 2 2 1
8 7 5 3 3 2 2 1 1
9 9 6 4 4 3 2 2 2 1
10 9 6 5 4 3 3 2 2 1 1
11 11 6 5 4 4 4 2 2 2 2 1
12 11 7 6 5 4 4 3 2 2 2 1 1
13 13 8 6 6 5 4 4 3 2 2 2 2
14 13 9 7 6 5 5 4 4 3 2 2 2
15 15 10 7 6 6 6 4 4 4 3 2 2 2 2
16 15 10 8 7 6 6 5 4 4 3 3 2 2 2
17 17 10 9 8 7 6 6 4 4 4 3 3 2 2 2 2
18 17 11 9 8 7 7 6 4 5 3 2 3 3 2 2 2
19 19 12 10 9 8 8 6 6 6 4 4 3 3 2 2 2 2 2
20 19 13 10 10 9 8 6 6 6 5 2 4 3 3 3 2 2 2
21 21 14 11 10 9 8 8 7 6 6 5 2 4 4 3 3 2 2 2 2
22 21 14 11 10 10 9 8 7 6 6 5 4 2 4 3 3 2 2 2 2
23 23 14 12 11 10 10 9 8 6 6 6 6 3 2 4 4 3 2 2 2 2 2
24 23 15 13 12 11 10 9 8 6 6 6 6 2 2 2 4 3 3 3 2 2 2
25 25 16 13 12 11 10 10 8 7 6 6 2 2 2 3 4 4 4 2 2 2 2 2 2
26 25 17 14 12 12 11 10 10 8 6 6 6 2 2 2 4 4 2 3 2 2 2 2
27 27 18 14 13 12 12 11 10 8 8 6 4 3 2 4 4 3 3 3 2 2 2 2 2
28 27 18 15 14 13 12 11 10 9 8 6 8 4 2 4 4 2 3 3 2 2 2 2
29 29 18 15 14 13 12 12 10 10 8 8 4 2 2 3 4 3 3 2 2 2 2 2 2
30 29 19 16 14 14 13 12 12 10 9 8 6 4 3 4 4 4 2 3 2 2 2 2
31 31 20 17 15 14 14 12 12 10 10 9 8 3 4 3 4 3 2 3 2 2 2 2 2
32 31 21 17 16 15 14 13 12 10 10 10 8 2 4 4 2 3 3 2 2 2 2 2
33 33 22 18 16 15 14 13 12 11 10 10 8 3 4 4 3 3 2 2 2 2 2 2 2
34 33 22 18 17 16 14 14 12 12 12 11 8 2 5 4 3 2 3 2 2 2 2 2
35 35 22 19 18 16 16 14 14 12 11 12 8 3 2 4 2 3 3 2 2 2 2 2 2
36 35 23 19 18 17 16 15 14 13 12 12 8 2 2 4 4 4 3 3 2 2 2 2
37 37 24 20 18 17 16 15 15 13 12 12 8 2 2 3 4 3 2 3 2 2 2 2 2
38 37 25 21 19 18 17 16 15 14 12 12 8 2 2 4 2 2 3 3 2 2 2 2
39 39 26 21 20 18 18 17 16 14 14 12 8 2 2 3 3 3 3 3 2 2 2 2 2
40 39 26 22 20 19 18 17 16 14 13 12 8 2 2 3 3 3 3 2 2 2 2 2
41 41 26 22 20 19 19 18 16 14 15 12 8 2 2 3 2 3 3 3 2 2 2 2 2
42 41 27 23 21 20 20 18 18 16 15 12 8 2 2 4 3 3 3 3 2 2 2 2
43 43 28 23 22 20 20 19 18 16 14 14 8 2 4 2 3 3 3 2 2 2 2 2 2
44 43 29 24 22 21 20 19 18 16 14 14 8 2 4 2 2 2 2 2 2 2 2 2
45 45 30 25 22 21 20 20 18 18 16 14 8 2 2 4 4 2 2 2 2 2 2 2 2
46 45 30 25 23 22 21 20 18 18 17 16 8 2 2 4 3 2 2 3 2 2 2 2
47 47 30 26 24 22 22 21 18 18 16 16 8 2 2 3 2 2 3 3 2 2 2 2 2
48 47 31 26 24 23 22 21 20 19 18 16 8 2 2 2 3 3 3 2 2 2 2 2
49 49 32 27 25 23 22 22 20 20 16 16 8 2 2 2 4 3 3 2 2 2 2 2 2
Example 23.6.11 Table 23.3 contains the results of a computer search for q = 2, n < 50,
k < 13 carried out for this chapter. The search was exhaustive for k < 6 using Corol-
lary 23.6.10. The entries in the table are the largest values of d for which an [n, k, d]2 LCD
code has been found with the methods of this chapter. Bold face entries indicate that the
code meets the general upper bound for linear codes; i.e., it is an optimal [n, k, d]2 code.
The entries for values of k ≥ 13 were determined by additionally computing the minimum
distance of the corresponding dual codes.
Suppose we want to extend a given [n, k, d]q code C by ℓ additional columns of a generator
matrix such that all minimum weight codewords of C are increased to weight at least d + 1.
This is the case if the system of equations (23.2) and (23.3) is adapted to

    • ( M′k,q | I ) (x, y)T = (ℓ − 1)1T ,

    • Σi xi = ℓ ,

where M′k,q is the submatrix of Mk,q whose rows are labeled by the hyperplanes correspond-
ing to weight d codewords of C. The columns are labeled by the points which are allowed
as additional columns of a generator matrix. In other words, we search for an additional ℓ
columns of a matrix such that for every minimum weight codeword at most ℓ − 1 zeros are
added. See [1135, 1138] for further information.
Above is the formulation of the problem as a packing problem. If the ones and zeros of the
matrix M′k,q are flipped, then one arrives at a covering problem; i.e., for every hyperplane
corresponding to a minimum weight codeword, the weight has to increase by at least 1. In
this formulation, the very efficient approach of the Dancing Links Algorithm [1133] can be
applied.
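As a toy instance of this extension problem (our own example, not from the text): deleting the column 111 from the binary simplex code leaves a [6, 3, 3]2 code, and a brute-force search for a single additional column (ℓ = 1) that lifts the minimum distance back to 4 finds exactly the deleted column:

```python
from itertools import product

msgs = [m for m in product((0, 1), repeat=3) if any(m)]
pts = msgs                                     # the 7 projective points of PG(2, 2)

def min_dist(cols):
    """Minimum weight of the binary code whose generator matrix has these columns."""
    return min(sum(sum(a * b for a, b in zip(m, c)) % 2 for c in cols)
               for m in msgs)

cols6 = [p for p in pts if p != (1, 1, 1)]     # simplex code minus the column 111
d6 = min_dist(cols6)                           # a [6,3,3]_2 code

# try every point as the single additional column
good = [v for v in pts if min_dist(cols6 + [v]) == 4]
```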
In this way one obtains generator matrices of the shape

    G(i) = ( A(i) |  Iri  B(i) )
           (      |  0    0   ) .

This means that the ranks of the information sets of the matrices G(1) , G(2) , . . . , G(m) are
r1 , r2 , . . . , rm , respectively, and the matrices A(i) consist of Σ_{j=1}^{i−1} rj columns. So from now
on we will work with a code equivalent to C which has m consecutive disjoint information
sets. Nevertheless, we will call this code C.
Let Gj^(i) denote the j th row of G(i) . In the second step, for each t = 1, 2, . . . , k, we do
the following. For each of the matrices G(i) , 1 ≤ i ≤ m, we compute all possible linear
combinations of t rows of G(i) and determine their weights. That means that we search
for the codewords with minimum weight in the set of all codewords of C which are linear
combinations of t rows j1 , j2 , . . . , jt of G(i) :

    c = a1 Gj1^(i) + a2 Gj2^(i) + · · · + at Gjt^(i) .

Let dt denote the smallest weight found after completing enumeration step t. Then

    d̄t = Σ_{i=1}^m max{0, t + 1 − (k − ri )}

is a lower bound for the weight of codewords which are the linear combination (with nonzero
coefficients) of at least t + 1 rows of the generator matrices G(i) . The enumeration can be
stopped as soon as d̄t ≥ dt ; the minimum distance of C is then dt .
This algorithm is a generalization of an idea of Brouwer for cyclic codes. Also, for self-
dual codes this algorithm seems to be common knowledge. In that case there are exactly
two disjoint information sets, each of rank k.
Example 23.8.1 We apply the minimum distance algorithm to the [7, 3]2 code with gen-
erator matrix

           [1 0 0 1 0 1 1]
    G(1) = [0 1 0 1 1 0 1] .
           [0 0 1 0 1 1 1]

Computing consecutive disjoint information sets gives the following two generator matrices
G(2) and G(3) :

           [0 1 1 1 0 0 1]
    G(2) = [1 1 0 0 1 0 1]   and
           [1 1 1 0 0 1 0]

           [0 1 1 1 0 0 1]
    G(3) = [1 0 1 1 1 0 0] .
           [1 1 1 0 0 1 0]
In the first enumeration step t = 1 of the algorithm we simply have to determine the weights
of the rows of the three generator matrices. All the rows have weight four, whence d1 = 4.
For the lower bound after completing t = 1 we have

    d̄1 = (2 − 0) + (2 − 0) + (2 − (3 − 1)) = 4 .

Hence d = d1 = 4 is the minimum distance of C.
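The example can be replayed in a few lines of our own code, with codewords stored as bit vectors; the ranks of the three information sets are (r1, r2, r3) = (3, 3, 1):

```python
G1 = [0b1001011, 0b0101101, 0b0010111]   # rows of G(1) as 7-bit integers
G2 = [0b0111001, 0b1100101, 0b1110010]   # rows of G(2)
G3 = [0b0111001, 0b1011100, 0b1110010]   # rows of G(3)

def weight(v):
    return bin(v).count("1")

def codeword(G, mask):
    """XOR of the rows of G selected by the bits of `mask`."""
    v = 0
    for i, row in enumerate(G):
        if mask >> i & 1:
            v ^= row
    return v

# brute-force minimum distance from G(1)
d_true = min(weight(codeword(G1, m)) for m in range(1, 2 ** len(G1)))

# enumeration step t = 1: minimum row weight over the three matrices
d1 = min(weight(r) for M in (G1, G2, G3) for r in M)

# lower bound after t = 1, with information-set ranks (3, 3, 1)
lb1 = sum(max(0, 2 - (3 - r)) for r in (3, 3, 1))
```

Since lb1 ≥ d1, the enumeration stops after t = 1 with minimum distance 4.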
If n = 2k and if the two disjoint information sets have full rank k, a slight variation
of this algorithm enables one to count the number of codewords of a fixed weight w of the
code C as follows.
Step 1: For all values of t with 1 ≤ t ≤ w/2 and all possible choices of aj ∈ Fq \ {0},
1 ≤ j ≤ t, count the number of codewords c of weight w which are the linear
combination of t rows of G(1) :

    c = a1 Gj1^(1) + a2 Gj2^(1) + · · · + at Gjt^(1) .

Step 2: For all values of t with 1 ≤ t < w/2 and all possible choices of aj ∈ Fq \ {0},
1 ≤ j ≤ t, count the number of codewords c of weight w which are the linear
combination of t rows of G(2) :

    c = a1 Gj1^(2) + a2 Gj2^(2) + · · · + at Gjt^(2) .

The sum of the two counts is the number of codewords of weight w in C.
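The two-step count can be compared against full enumeration on the self-dual extended Hamming [8, 4, 4] code, where G(1) = (I | A) and G(2) = (A | I) with A = J − I generate the same code. The sketch is ours and assumes, as described above, that the first step runs over t ≤ w/2 rows of G(1) and the second over t < w/2 rows of G(2):

```python
from itertools import combinations

# rows as 8-bit integers; both matrices generate the same [8,4,4] code
G1 = [0b10000111, 0b01001011, 0b00101101, 0b00011110]   # (I | A)
G2 = [0b01111000, 0b10110100, 0b11010010, 0b11100001]   # (A | I)

def combos_count(G, ts, w):
    """Count weight-w XOR combinations of t rows of G, for each t in ts."""
    cnt = 0
    for t in ts:
        for rows in combinations(G, t):
            v = 0
            for r in rows:
                v ^= r
            if bin(v).count("1") == w:
                cnt += 1
    return cnt

w = 4
two_step = (combos_count(G1, range(1, w // 2 + 1), w)            # t <= w/2
            + combos_count(G2, range(1, (w - 1) // 2 + 1), w))   # t <  w/2

# full enumeration for comparison: all nonzero combinations of G1's rows
brute = combos_count(G1, range(1, len(G1) + 1), w)
```

Every weight-4 codeword is counted exactly once: its support meets one of the two information sets in at most w/2 positions, and the strict inequality in the second step avoids double counting the balanced case.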
When it comes to implementation of the minimum distance algorithm some details are
important:
• For a fixed selection j1 , j2 , . . . , jt of rows of a generator matrix, all linear combinations
a1 Gj1^(1) + a2 Gj2^(1) + · · · + at Gjt^(1) and a1 Gj1^(2) + a2 Gj2^(2) + · · · + at Gjt^(2) with aj ∈ Fq \ {0},
1 ≤ j ≤ t, have to be tested. An obvious optimization is to enumerate the linear
combinations of the rows of the generator matrices up to scalar multiples; i.e., we can
fix a1 = 1, and for each selection of t rows only (q − 1)^{t−1} linear combinations have
to be tested. The resulting number of codewords then has to be multiplied by (q − 1).
• For a fast implementation of this algorithm, especially for the cases q = 2 or 3,
working with bit vectors is crucial. Counting the number of 1-bits in a vector is
usually called popcount. In some processors it is available in hardware; for software
implementations see for example [1132, 1878].
• The generation of all combinations of t rows j1 , j2 , . . . , jt out of the k rows of the
generator matrices G(1) and G(2) , respectively, can be realized with a revolving door
algorithm like the algorithm of Nijenhuis and Wilf [1439]. It has the property that in
each step only one row in j1 , j2 , . . . , jt is exchanged. The resulting sequence of t-tuples
has the interesting property that the combination j1 , j2 , . . . , jt of the rows of the
generator matrix is visited after exactly

    N = C(jt + 1, t) − C(jt−1 + 1, t − 1) + · · · + (−1)^t C(j2 + 1, 2) − (−1)^t C(j1 + 1, 1) − [t odd]

other combinations have been visited, where C(a, b) denotes the binomial coefficient;
see Lüneburg [1305] and Knuth [1132]. This enables an easy parallelization of the
problem: for 1 ≤ t ≤ w/2 the work of C(k, t) enumeration steps can be divided into
⌊C(k, t)/T ⌋ chunks of size T . The rth job then handles the rth of these chunks; its
first combination can be determined directly from the ranking formula above.
The minimum distance algorithm of this section has been generalized to nonlinear
codes [1853] and to linear codes over rings [1115].
An alternative approach to compute the minimum distance of a linear code over Fq ,
q ∈ {2, 3}, is based on lattice basis reduction [1881]. While this method may not be able
to compute the exact minimum distance for large codes, it usually determines good upper
bounds on the minimum distance of the code. This can be of importance in cryptographic
applications. The complexity of approximating the minimum distance of linear codes is
discussed in [650].
Chapter 24
Interpolation Decoding
Swastik Kopparty
Rutgers University
24.1 Introduction
In this chapter we will see some beautiful algorithmic ideas based on polynomial interpo-
lation for decoding algebraic codes. Here we will focus only on (Generalized) Reed–Solomon
codes, but these ideas extend very naturally to the important class of algebraic geometry
codes studied in Chapter 15.
Let us quickly recall the polynomial-evaluation based definition of (narrow-sense) Reed–
Solomon codes from Theorem 1.14.11. Following the notation from that chapter, let q be a
prime power, let α ∈ Fq be a generator of the multiplicative group Fq* = Fq \ {0} of Fq , and
define1

    Pk,q = {p(x) ∈ Fq [x] | deg(p) < k}   and
    RSk (α) = { (p(α^0 ), p(α^1 ), p(α^2 ), . . . , p(α^{q−2} )) | p(x) ∈ Pk,q } ⊆ Fq^{q−1} .
Informally, the codewords of RS k (α) are the vectors of all evaluations of a polynomial of
degree < k at all the nonzero points of Fq . See specifically Theorem 1.14.11.
More generally, we can take an arbitrary s = (s1 , . . . , sn ) ∈ Fnq with n ≤ q and the
si distinct. Consider evaluations of polynomials of degree < k at all the points of s. This
1 We will follow the convention that the degree of the 0 polynomial is −∞.
E = {u ∈ S | p(u) ≠ r(u)} be the set of error locations. Let Z(x) ∈ Fq [x] be the polynomial
given by

    Z(x) = Π_{u∈E} (x − u).
Z(x) is called the error-locating polynomial. We do not know Z(x); indeed, finding Z(x)
is as hard as finding p(x). Nevertheless, thinking about Z(x) will motivate our algorithm.
The crux of the Berlekamp–Welch Algorithm is the following identity. For each u ∈ S,
we have

    Z(u) · r(u) = Z(u) · p(u).          (24.1)

Indeed, if u ∉ E, then r(u) = p(u) and so Z(u) · r(u) = Z(u) · p(u). Otherwise if u ∈ E,
then Z(u) = 0, and so (24.1) holds.
Let W (x) ∈ Fq [x] be the polynomial Z(x)·p(x). Observe that deg(Z) ≤ e and deg(W ) <
k + e. This motivates the following idea: let us try to search for the polynomials Z and W
by solving the linear equations
Z(u) · r(u) = W (u)
for each u ∈ S. Hopefully we will then recover p(x) as W (x)/Z(x). This algorithm actually
works, but its analysis is more subtle than the naive arguments given above suggest. We
now formally present the algorithm.
Input: r : S → Fq , e, and k.
Step 1: Interpolation: Let a0 , a1 , . . . , ae and b0 , b1 , . . . , bk+e−1 be indeterminates. Con-
sider the following system of homogeneous linear equations in these indeterminates,
one equation for each u ∈ S:

    ( Σ_{i=0}^e ai u^i ) · r(u) = Σ_{j=0}^{k+e−1} bj u^j .          (24.2)

Find a nonzero solution (a0 , . . . , ae , b0 , . . . , bk+e−1 ) to this system.

Step 2: Division: Let A(x) = Σ_{i=0}^e ai x^i and B(x) = Σ_{j=0}^{k+e−1} bj x^j .
If A(x) divides B(x), then return p(x) = B(x)/A(x); otherwise return FAIL.
Theorem 24.2.2 Suppose r(x) is within Hamming distance e = ⌊(n − k)/2⌋ of some
codeword p(x) of GRSk (s) where s = (s1 , . . . , sn ) ∈ Fq^n with the si distinct. Then with
S = {s1 , . . . , sn } the Berlekamp–Welch Algorithm RSDecode will output that codeword p(x).
Before analyzing the correctness of the algorithm, we make some observations about
its running time. The main operations involved are solving systems of linear equations and
polynomial division. By well-known algorithms, both of these can be solved in polynomial
time (in fact, time O(n^3 log^2 q)). With more advanced algebraic algorithms, the running
time can even be made O(n(log q · log n)^3 ). This is nearly-linear time, and almost as fast as
noiseless polynomial interpolation!
Claim 1: In Step 1 there does indeed exist a nonzero solution (a0 , . . . , ae , b0 , . . . , bk+e−1 ) ∈
Fq^{k+2e+1} .

Let Z(x) be the error-locating polynomial, and let W (x) = Z(x) · p(x). Note
that Z(x) is a nonzero polynomial. Taking a0 , . . . , ae to be the coefficients of
Z(x), and taking b0 , . . . , bk+e−1 to be the coefficients of W (x), the identity (24.1)
immediately implies that this (a0 , . . . , ae , b0 , . . . , bk+e−1 ) is a nonzero solution to
the system of equations (24.2).
Claim 2: In Step 2 the polynomial A(x) does divide B(x) and B(x)/A(x) = p(x).

Take A(x), B(x) and consider the polynomial H(x) = B(x) − p(x)A(x) ∈ Fq [x].
We know that for every u ∉ E, we have

    H(u) = B(u) − p(u)A(u) = B(u) − r(u)A(u) = 0,

where the last equality follows from the fact that (a0 , . . . , ae , b0 , . . . , bk+e−1 ) sat-
isfies (24.2). Observe that H(x) is a polynomial of degree at most k + e − 1.
By the above, we have that H(u) vanishes for at least n − e > k + e − 1 val-
ues of u ∈ S. By Lemma 24.1.1, this implies that H(x) is the identically zero
polynomial. Thus B(x) = p(x)A(x) and the claim follows.
It is worth noting that there may be multiple nonzero solutions to the system of equa-
tions (24.2). Claim 2 is about every such nonzero solution!
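The whole algorithm fits in a page of code. The sketch below is our own implementation over a prime field Fp (all function names are ours): it sets up the homogeneous system (24.2), extracts one nonzero solution by Gaussian elimination, and performs the polynomial division of Step 2:

```python
def _nullspace_vector(rows, m, p):
    """Some nonzero solution of the homogeneous system rows . v = 0 over F_p."""
    M = [row[:] for row in rows]
    pivots, r = {}, 0
    for c in range(m):
        pr = next((i for i in range(r, len(M)) if M[i][c]), None)
        if pr is None:
            continue
        M[r], M[pr] = M[pr], M[r]
        inv = pow(M[r][c], p - 2, p)
        M[r] = [v * inv % p for v in M[r]]
        for i in range(len(M)):
            if i != r and M[i][c]:
                f = M[i][c]
                M[i] = [(v - f * w) % p for v, w in zip(M[i], M[r])]
        pivots[c] = r
        r += 1
    free = [c for c in range(m) if c not in pivots]
    v = [0] * m
    v[free[0]] = 1                        # a free column exists: kernel is nonzero
    for c, i in pivots.items():
        v[c] = -M[i][free[0]] % p
    return v

def _polydiv(B, A, p):
    """Quotient and remainder of B / A over F_p, coefficients low-to-high."""
    B, A = B[:], A[:]
    while A and A[-1] == 0:
        A.pop()
    Q = [0] * max(0, len(B) - len(A) + 1)
    inv = pow(A[-1], p - 2, p)
    for i in range(len(B) - len(A), -1, -1):
        c = B[i + len(A) - 1] * inv % p
        Q[i] = c
        for j, a in enumerate(A):
            B[i + j] = (B[i + j] - c * a) % p
    return Q, B

def rs_decode(p, xs, received, k):
    """Berlekamp-Welch: recover the deg<k polynomial from <= (n-k)//2 errors."""
    n = len(xs)
    e = (n - k) // 2
    rows = [[pow(u, i, p) * ru % p for i in range(e + 1)]
            + [-pow(u, j, p) % p for j in range(k + e)]
            for u, ru in zip(xs, received)]
    sol = _nullspace_vector(rows, k + 2 * e + 1, p)
    A, B = sol[:e + 1], sol[e + 1:]
    if not any(A):
        return None
    Q, rem = _polydiv(B, A, p)
    if any(rem):
        return None                       # more than e errors
    while Q and Q[-1] == 0:
        Q.pop()
    return Q + [0] * (k - len(Q))         # coefficients of p(x), low-to-high
```

With n = 12 evaluation points over F13 and k = 4, this corrects e = 4 errors, matching Theorem 24.2.2.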
Another way of viewing the Berlekamp–Welch Algorithm is through the lens of rational
function interpolation. Roughly, we start off trying to find a rational function B(x)/A(x)
such that deg(A), deg(B) are both small, and B(u)/A(u) = r(u) for all u ∈ S. We have to
be careful about what we mean by division when the denominator is 0.
Definition 24.2.3 Let S ⊆ Fq and let r : S → Fq be a function. We say that the rational
function B(x)/A(x) ∈ Fq (x) interpolates r(x) if for every u ∈ S, either A(u) ≠ 0 and
B(u)/A(u) = r(u), or A(u) = 0 and B(u) = 0.
The Berlekamp–Welch Algorithm is essentially based on the fact that rational interpolation
can be efficiently solved using linear algebra, and the analysis basically shows any low degree
rational interpolation of r(x) is (after division) the nearby polynomial p(x).
4 When we formally analyze the algorithm, everything will be elementary and we will not use Bézout's theorem.
Factor Q(x, y) into its irreducible factors over Fq . Let L be the set of all p(x) for
which y−p(x) is an irreducible factor of Q(x, y). These are our candidate codewords.
Now output those codewords of L which are within distance e of the received word
r.
Theorem 24.3.2 Let s = (s1 , . . . , sn ) ∈ Fq^n with the si distinct and S = {s1 , . . . , sn }.
If e < n − 2√(nk) − k, then the Sudan Algorithm RSListDecodeV1 outputs the list of all
codewords in GRSk (s) that are within Hamming distance e of r. Furthermore, this list has
size at most √(n/k).
To understand how remarkable the above theorem is, consider the setting k = (0.01)n
and e = (0.75)n. Then the theorem says that the Sudan Algorithm can find all codewords
within distance (0.75)n of any given received word r. Note that most entries of r may be
wrong; yet we can find the corrected codeword!
Before analyzing the correctness of the algorithm, we discuss the running time. Factoring
of bivariate polynomials of degree d over Fq can be done in time (d log q)O(1) by a randomized
algorithm (or (dq)O(1) by a deterministic algorithm). Thus the entire Sudan Algorithm can
be made to run in polynomial time.
Thus the first step of the algorithm succeeds in finding a nonzero solution.
Now we examine the second step. By construction, the polynomial Q(x, y) has the
property that Q(u, r(u)) = 0 for all u ∈ S. Consider the polynomial H(x) = Q(x, p(x)). For
any point u ∈ S \ E, we have H(u) = Q(u, p(u)) = Q(u, r(u)) = 0. Furthermore, the degree
of H is at most I + (k − 1)J. Thus H(x) is a polynomial of degree at most

    I + (k − 1)J < (√(nk) + 1) + (k − 1)(√(n/k) + 1) < 2√(nk) + k,

while H vanishes at all of the at least n − e > 2√(nk) + k points of S \ E.
By Lemma 24.1.1, we conclude that H(x) = Q(x, p(x)) must be the zero polynomial.
Finally, we use the fact that if Q(x, y) is such that Q(x, p(x)) = 0, then the
bivariate polynomial y − p(x) divides Q(x, y). This is a form of the so-called Factor Theorem.5
So y − p(x) will appear in the list of factors of Q, and thus p(x) will appear in the output of
the Sudan Algorithm RSListDecodeV1, as desired. The number of such factors is at most
the y-degree of Q, which is at most J = √(n/k). This completes the proof of Theorem 24.3.2.
We make some remarks about the argument that we just saw.
Remark 24.3.3
• The precise algebraic problem that we need to solve in the second step of the algorithm
is root finding rather than factoring. There are faster and simpler algorithms for root
finding than for general factoring.
• A slightly cleverer choice of monomials6 used in the polynomial Q(x, y) leads to an
improvement of the decoding radius to n − √(2nk). This is the error-correction perfor-
mance of the original Sudan Algorithm. This is larger than half the minimum distance
for all k < n/3.
• The significance of the Sudan Algorithm is that it showed for the first time that it
is possible to efficiently list-decode positive rate codes beyond what is possible for
unique decoding. For a code of rate R, the classical Singleton Bound Theorem 1.9.10
implies that it is not possible to uniquely decode from more than ((1 − R)/2)-fraction
errors. We just saw that Reed–Solomon codes of rate R can be efficiently decoded
from a (1 − 2√R − R)-fraction of errors, which for small R is larger than (1 − R)/2.
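The fractions in the last remark can be compared numerically; the following quick Python check (at the hypothetical rate R = 0.01) also includes the Guruswami–Sudan radius 1 − √R from Section 24.4.

```python
import math

# Comparing the error fractions discussed in this chapter at rate R = 0.01
# (a numeric illustration; R = 0.01 is an arbitrary small rate).
R = 0.01
unique = (1 - R) / 2                 # unique decoding, (1 - R)/2
v1 = 1 - 2 * math.sqrt(R) - R        # RSListDecodeV1, 1 - 2*sqrt(R) - R
sudan = 1 - math.sqrt(2 * R)         # original Sudan Algorithm, 1 - sqrt(2R)
gs = 1 - math.sqrt(R)                # Guruswami-Sudan, 1 - sqrt(R)
print(unique, v1, sudan, gs)
```

For this rate the chain 0.495 < 0.79 < 0.858… < 0.9 reproduces the k = (0.01)n, e = (0.75)n discussion above: even RSListDecodeV1 decodes well beyond the unique-decoding radius.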
24.4.1 Preparations
We state below some simple properties of multiplicity. We omit the (easy) proofs.
Lemma 24.4.3 Suppose H(x) ∈ F[x] and u ∈ F are such that mult(H, u) ≥ M . Then
(x − u)M | H(x).
Lemma 24.4.4 Let H(x) ∈ F[x] be a nonzero polynomial of degree at most d. Then

    ∑_{u∈F} mult(H, u) ≤ d.
Lemma 24.4.5 Suppose Q(x, y) ∈ F[x, y] and p(x) ∈ F[x]. Suppose u ∈ F is such that
mult(Q, (u, p(u))) ≥ M . Then, letting H(x) = Q(x, p(x)), we have mult(H, u) ≥ M .
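These lemmas are easy to check experimentally. The following toy Python example (over F_7, with an illustrative polynomial) computes mult(H, u) as the number of initial zero coefficients of H(x + u) and verifies Lemma 24.4.4.

```python
from math import comb

# Toy check of Lemmas 24.4.3 and 24.4.4 over F_7 (illustrative example):
# H(x) = (x - 1)^2 (x - 2) = x^3 - 4x^2 + 5x - 2, reduced mod 7 below.
p = 7
H = [5, 5, 3, 1]    # coefficients, low degree first

def mult(H, u):
    """mult(H, u): number of initial zero coefficients of H(x + u) over F_p."""
    n = len(H)
    # coefficient of x^j in H(x + u) is sum_i H[i] * C(i, j) * u^(i - j)
    shifted = [sum(H[i] * comb(i, j) * pow(u, i - j, p) for i in range(j, n)) % p
               for j in range(n)]
    m = 0
    while m < n and shifted[m] == 0:
        m += 1
    return m

print([mult(H, u) for u in range(p)])     # mult 2 at u = 1, mult 1 at u = 2
print(sum(mult(H, u) for u in range(p)))  # total 3 <= deg H = 3 (Lemma 24.4.4)
```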
We will be dealing with certain weighted degrees of bivariate polynomials. For a mono-
mial xi y j , its (α, β)-weighted degree is defined to be αi + βj.
For each (i, j) ∈ T_D, we let a_ij be an indeterminate. Let Q(x, y) ∈ Fq[x, y] be the
polynomial all of whose monomials, with coefficients a_ij, are in M_D:

    Q(x, y) = ∑_{(i,j)∈T_D} a_ij x^i y^j.
Consider the system of homogeneous linear equations on the a_ij obtained by imposing,
for each u ∈ S, the condition that Q vanishes at (u, r(u)) with multiplicity at least M.
Find a nonzero solution (a_ij)_{(i,j)∈T_D} ∈ F_q^{|T_D|}. If no nonzero solution exists, then
return FAIL.
Step 2: Polynomial Algebra: Factor Q(x, y) into its irreducible factors over Fq . Let L
be the set of all p(x) for which y − p(x) is an irreducible factor of Q(x, y). These
are our candidate codewords. Now output those codewords of L which are within
distance e of the received word r.
Theorem 24.4.7 Let s = (s1, . . . , sn) ∈ F_q^n with the si distinct and S = {s1, . . . , sn}. If
e < n − √(nk), then the Guruswami–Sudan Algorithm RSListDecodeV2 outputs the list of all
codewords in GRS_k(s) that are within Hamming distance e of r. Furthermore, this list has
size at most 2√(nk).
How many equations are there? Asking that Q vanishes at a point (u, r(u)) with multiplicity
at least M is the same as asking that the M(M + 1)/2 coefficients of the monomials of total
degree less than M in the polynomial Q(x + u, y + r(u)) vanish. Thus the total number of
equations in the system is n · M(M + 1)/2. We now check that

    D²/(2(k − 1)) = nkM²/(2(k − 1)) = n · (M²/2) · (k/(k − 1)) > n · M(M + 1)/2,

where the last inequality uses the fact that (M + 1)/M < k/(k − 1) since M = k. Thus the
system of equations has a nonzero solution, and the first step of the algorithm succeeds.
Now we consider the second step. By construction, the polynomial Q(x, y) has the prop-
erty that mult(Q, (u, r(u))) ≥ M for all u ∈ S. Consider the polynomial H(x) = Q(x, p(x)).
For any point u ∈ S \ E, we have that p(u) = r(u), and thus by Lemma 24.4.5,
mult(H, u) ≥ M . Furthermore, the degree of H is at most D. Thus H(x) is a polyno-
mial of degree at most D whose total multiplicity of vanishing at all points in S is at
least

    (|S| − |E|) · M ≥ (n − e) · M > (n − (n − √(nk))) · M = √(nk) · M = D.
By Lemma 24.4.4, we conclude that H(x) must be the zero polynomial.
Again, since Q(x, p(x)) = H(x) = 0, we get that y − p(x) divides Q(x, y). So y − p(x)
will appear in the list of factors of Q, and thus p(x) will appear in the output of the
Guruswami–Sudan Algorithm RSListDecodeV2, as desired. The number of such factors is
at most the y-degree of Q, which is at most D/(k − 1) ≤ k√(nk)/(k − 1) ≤ 2√(nk),
completing the proof of Theorem 24.4.7.
This means that a polynomial that vanishes with high multiplicity at all the points of a set
of interest is a very special polynomial, and presumably its fate is more strongly tied to
that of the set.
There have been many successful uses of the idea of interpolation in several areas of
mathematics, and taking multiplicities into account is often useful [660, 1628, 1629].
Another way to view this is as follows. To get codewords of the interleaved Reed–Solomon
code IRS q (k, S, t), we take a t × |S| matrix whose rows are codewords of GRS k (s), with
s = (s1, . . . , sn) where S = {s1, . . . , sn}, and view each column of the matrix as a single
symbol in Σ = Ftq . The strings in Σ|S| so obtained are the codewords of IRS q (k, S, t).
Clearly, interleaved Reed–Solomon codes are very closely related to Reed–Solomon codes.
It is easy to see that the rate of IRS q (k, S, t) equals k/|S|, and the minimum Hamming
distance of IRS q (k, S, t) equals |S| − k + 1, exactly as in the case of Reed–Solomon codes.
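The matrix view of interleaved codewords can be sketched in a few lines of Python (the field F_11, t = 2, and the two polynomials are illustrative choices):

```python
# Toy interleaved Reed-Solomon codeword over F_11 with t = 2 (illustrative
# parameters): stack t GRS codewords as rows, then read columns as symbols.
p, k = 11, 2
S = [0, 1, 2, 3, 4, 5]
polys = [[3, 1], [4, 7]]             # p1(x) = 3 + x, p2(x) = 4 + 7x, deg < k

def ev(c, x):
    return sum(a * pow(x, i, p) for i, a in enumerate(c)) % p

rows = [[ev(c, u) for u in S] for c in polys]
codeword = list(zip(*rows))          # column u is the symbol (p1(u), p2(u)) in F_11^2
print(codeword)
```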
We now turn our attention to decoding algorithms. The problem of decoding interleaved
Reed–Solomon codes from e errors is the following natural-looking question. Let r : S → Ftq
be the received word. We will sometimes think of r as a tuple of t functions r(1) , r(2) , . . . , r(t) ,
where r(i) : S → Fq is the ith output coordinate of r. Our goal is to find one/many/all
t-tuples of polynomials p = (p1 , . . . , pt ) ∈ (Fq [x])t with deg(pi ) < k such that r(u) =
(p1 (u), . . . , pt (u)) for all but at most e values of u ∈ S; in other words dH (p, r) ≤ e where
dH (·, ·) denotes Hamming distance on Σ|S| .
Clearly, if dH(p, r) ≤ e, then we have that for all i, dH(pi, r^(i)) ≤ e, where this latter
dH(·, ·) denotes Hamming distance on F_q^{|S|}. Note that the converse does not hold: the sets of
coordinates u ∈ S where pi and r^(i) agree can look very different for different i. When e is at
most (|S| − k)/2 (half the minimum distance of the code), then the above observation gives
an easy algorithm to decode interleaved Reed–Solomon codes from adversarial errors: for
each i, decode r(i) , using a standard Reed–Solomon decoder such as the Berlekamp–Welch
Algorithm, to find the unique polynomial pi of degree < k that is within Hamming distance
e of it. Taking these pi together to form a t-tuple p, we get a candidate codeword p of
IRS q (k, S, t), and then we check whether or not dH (p, r) ≤ e. 7
We will now describe an algorithm that can decode from a significantly larger number of
random errors. Specifically, the model for generating r is as follows. There is some unknown
7 A more sophisticated variation using the Guruswami–Sudan Algorithm RSListDecodeV2 can extend this
algorithm to decode interleaved Reed–Solomon codes from any e ≤ n − √(nk) errors. Further improvements
seem difficult: decoding from an even larger number of errors in the worst case would need a breakthrough
on decoding of standard Reed–Solomon codes.
t-tuple of polynomials p = (p1 , . . . , pt ) ∈ (Fq [x])t where deg(pi ) < k. There is an unknown
set J ⊆ S (the set of error locations) of size at most e. Based on these unknowns, the
received word r : S → Ftq is generated by setting
    r(u) = (p1(u), p2(u), . . . , pt(u))    if u ∉ J,
    r(u) = a uniformly random element of F_q^t    if u ∈ J.    (24.3)
Note that the number of errors (i.e., the number of u ∈ S for which p(u) ≠ r(u)) is always
at most |J|, and is with high probability equal to |J|, but it could be smaller.
In this setting, we would like to design a decoding algorithm that takes r as input and
recovers p, the underlying codeword. Below we give a natural (given the previous sections)
algorithm for this due to Bleichenbacher, Kiayias, and Yung [220]. A clever and nontrivial
analysis shows that this algorithm succeeds with high probability even with the number of
errors e being as large as (t/(t + 1)) · (n − k), which for large t is close to the minimum distance
of the code (twice the worst-case unique decoding radius!). This result is very surprising.
The key idea is to run a Berlekamp–Welch type decoding algorithm for each of the r(i) ,
while taking into account the fact that the error locations, and hence the error-locating
polynomials, are the same.
Solve the system to find any nonzero solution (a0, . . . , ae, b_{1,0}, . . . , b_{t,k+e−1}) ∈
F_q^{(e+1)+t(k+e)}. If there is no nonzero solution, then return FAIL.

Step 2: Polynomial Algebra: For the solution found in the previous step, define
A(x), B1(x), B2(x), . . . , Bt(x) ∈ Fq[x] by

    A(x) = ∑_{i=0}^{e} a_i x^i    and    B_ℓ(x) = ∑_{j=0}^{k+e−1} b_{ℓ,j} x^j.
The list-decodability of Reed–Solomon codes is still open. For all we know, Reed–
Solomon codes themselves may be list-decodable up to list decoding capacity; this cor-
responds to list-decoding from O(k) agreements, instead of the √(nk) agreements that the
Guruswami–Sudan Algorithm needs. The best negative result in this direction is by Ben-
Sasson, Kopparty, and Radhakrishnan [159], who show that the list size may become su-
perpolynomial for Reed–Solomon codes of vanishing rate.
Interpolation methods have a long history in mathematics. Some striking classical appli-
cations include the Thue–Siegel–Roth theorems on diophantine approximation, the Gelfond–
Schneider–Baker theorems on transcendental numbers [93], and the Stepanov–Bombieri–
Schmidt proofs of the Weil Bounds [1629]. More recent applications include the results of
Dvir, Lev, Kopparty, Saraf, and Sudan [659, 660, 1146, 1612] on the Kakeya problems over
finite fields, the work of Guth and Katz [884, 885] in combinatorial geometry, and the results
of Croot, Lev, Pach, Ellenberg, and Gijswijt on the cap set problem [472, 679].
Chapter 25
Pseudo-Noise Sequences
Tor Helleseth
University of Bergen
Chunlei Li
University of Bergen
25.1 Introduction
Pseudo-noise sequences (PN sequences), also referred to as pseudo-random se-
quences, are sequences that are deterministically generated but possess some favorable
properties that one would expect to find in randomly generated sequences. Applications
of PN sequences include signal synchronization, navigation, radar ranging, random number
generation, spread-spectrum communications, multi-path resolution, cryptography, and sig-
nal identification in multiple-access communication systems [824, 825]. Interested readers
are referred to comprehensive investigations and surveys of the design and applications of
PN sequences in [794, 824, 825, 940, 1616, 1617, 1850]. This chapter aims to provide a brief
overview of the topics of sequences with low correlation and/or maximal periods.
Correlation of sequences is a measure of their similarity or relatedness. In communication
and engineering systems, there are a large number of problems that require sets of sequences
with one or both of the following properties: (i) each sequence in the set is easy to distinguish
from a time-shifted version of itself; (ii) each sequence is easy to distinguish from (a possibly
time-shifted version of) every other sequence in the set. The first property is important for
applications such as ranging systems, radar systems, and spread spectrum communications
systems; and the second is desirable for simultaneous ranging to several targets, multiple-
terminal system identification, and code-division multiple-access (CDMA) communications
systems [1616, 1617]. There are strong ties between low correlation sequence design and the
theory of error-correcting codes. Interestingly, the dual of an efficient error-correcting code
will often yield a family of sequences with desirable correlation properties and vice versa
[940]. The first part of this chapter will be dedicated to individual sequences and sets of
sequences with low periodic correlation.
Feedback shift registers are one of the most popular methods to generate PN sequences
in an easy, efficient, and low-cost manner. Due to their intrinsic combinatorial structure,
shift register sequences with the maximal possible length are of particular interest. They are
widely used in the areas of error-correcting codes, combinatorics, graph theory, and cryptog-
raphy [824]. For linear feedback shift registers, the maximum-length sequences are used in a
variety of engineering applications as well as in the construction of optimal error-correcting
codes [940]. For nonlinear cases, de Bruijn sequences, which are deemed the nonlinear
counterpart of m-sequences, are considered a model example of the interaction of discrete
mathematics and computer science [1558]. They have been used in building error-correcting
codes for storage systems [381], stream ciphers [933], and DNA sequence correction and
assembly [429]. Techniques from combinatorics, graph theory, and abstract algebra have
yielded a number of constructions and generations of de Bruijn sequences. The second part
of this chapter will introduce some approaches in these areas to efficiently generate de Bruijn
sequences.
The correlation of two sequences is the inner product of the first sequence in the
complex-valued version with a shifted version of the second sequence in the complex-valued
version. According to different shift operations, the correlation of two sequences is called
a periodic correlation if the shift is a cyclic shift; an odd-periodic correlation if the shift
is a cyclic shift with an additional sign; an aperiodic correlation if the shift is not cyclic,
and a partial-period correlation if the inner product involves only a partial segment of the
two sequences. Sequences based on the periodic correlation constraint have stronger ties to
coding theory. Partly for this reason and partly on account of our own research interests,
the emphasis in this chapter is on the design of sequences with low periodic correlation. We
will first introduce some definitions of correlation measures of sequences in this subsection,
and restrict our discussion to sequences with periodic correlation in later subsections.
Let {u(t)} and {v(t)} be two q-ary sequences of period N , not necessarily distinct. The
periodic correlation, sometimes also called even-periodic correlation, of the sequences
{u(t)} and {v(t)} is the collection {θu,v (τ ) | 0 ≤ τ ≤ N − 1} with
    θ_{u,v}(τ) = ∑_{t=0}^{N−1} ω_q^{u(t+τ)−v(t)},
where the sum t+τ is computed modulo N . The correlation is called in-phase correlation
when τ = 0 and out-of-phase correlation when τ ≠ 0. When the q-ary sequences {u(t)},
{v(t)} are the same, we speak of the autocorrelation function of the sequence {u(t)},
simply denoted as θu (τ ), and speak of the crosscorrelation function of them if they are
distinct.
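As a concrete illustration, the following Python snippet computes the periodic autocorrelation of a binary m-sequence of period 7 directly from the definition (with ω_2 = −1); all out-of-phase values equal −1.

```python
# Periodic autocorrelation of the binary m-sequence 1110100 of period N = 7
# (a standard example; q = 2, so omega_2 = -1 and each term is +-1).
s = [1, 1, 1, 0, 1, 0, 0]
N = len(s)

def theta(u, v, tau):
    return sum((-1) ** ((u[(t + tau) % N] - v[t]) % 2) for t in range(N))

acf = [theta(s, s, tau) for tau in range(N)]
print(acf)   # [7, -1, -1, -1, -1, -1, -1]: ideal two-level autocorrelation
```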
For a family F of M periodic sequences over Zq, we define the peak crosscorrelation
magnitude by

    θ_c = max{ |θ_{u,v}(τ)| : 0 ≤ τ < N, u, v ∈ F, and u ≠ v },

the peak out-of-phase autocorrelation magnitude by θ_a = max{ |θ_u(τ)| : 0 < τ < N, u ∈ F },
and the peak correlation magnitude θ_max = max{θ_a, θ_c}.
A sequence family is said to have low correlation if it has a relatively small magnitude
θmax when compared with other families of the same size and period [940].
From the definition of the periodic correlation function, it can be verified [1616, 1617]
that

    ∑_{τ=0}^{N−1} |θ_{u,v}(τ)|² = ∑_{τ=0}^{N−1} θ_u(τ) θ_v*(τ).

Summing this identity over all ordered pairs u, v ∈ F gives

    M²N² ≤ ∑_{u,v∈F} ∑_{τ=0}^{N−1} |θ_{u,v}(τ)|²
         = ∑_{u∈F} |θ_u(0)|² + ∑_{u∈F} ∑_{τ=1}^{N−1} |θ_u(τ)|² + ∑_{u≠v} ∑_{τ=0}^{N−1} |θ_{u,v}(τ)|²
         ≤ MN² + M(N − 1)θ_a² + M(M − 1)Nθ_c²,

which leads to the following lower bound on the peak correlation magnitude:

    θ_max² ≥ N²(M − 1)/(MN − 1).    (25.1)
This bound is known as the Welch Bound, and it is shown to be tight in [1884]. A sequence
family is commonly considered to have optimal correlation if it achieves the Welch Bound.
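A quick numeric illustration of (25.1): for period N = 31 and family size M = 33 (the parameters of a Gold family, used here only as a reference point), the bound forces θ_max to be at least about 5.49, while the Gold family is known to achieve θ_max = 9.

```python
# Evaluating the Welch Bound (25.1) for N = 31, M = 33 (Gold-family
# parameters, used as an illustrative reference point).
N, M = 31, 33
welch = (N ** 2 * (M - 1) / (M * N - 1)) ** 0.5
print(round(welch, 2))   # 5.49
```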
The Sidelńikov Bound [940] is asymptotic for q-ary sequence families with periodic cor-
relation. The Sidelńikov Bound is derived from a successive estimation of the ratio of even
moments and can be approximated by

    θ_max² ≥ (2s + 1)(N − s) + s(s + 1)/2 − 2^s N^{2s+1} / (M (2s)! binom(N, s))    for q = 2 and 0 ≤ s < 2N/5,

    θ_max² ≥ (1/2) [ (s + 1)(2N − s) − 2^s N^{2s+1} / (M (s!)² binom(2N, s)) ]    for q > 2 and s ≥ 0.    (25.2)
This bound is significantly tighter than the Welch Bound when q > 2.
In addition, by applying the Cauchy inequality to the sum on the right-hand side of the
equality

    ∑_{τ=0}^{N−1} |θ_{u,v}(τ)|² = ∑_{τ=0}^{N−1} θ_u(τ) θ_v*(τ) = N² + ∑_{τ=1}^{N−1} θ_u(τ) θ_v*(τ),

we have

    | ∑_{τ=1}^{N−1} θ_u(τ) θ_v*(τ) |² ≤ ( ∑_{τ=1}^{N−1} |θ_u(τ)|² ) ( ∑_{τ=1}^{N−1} |θ_v(τ)|² ).
Therefore, the upper and lower bounds on the mean square values of periodic corre-
lation functions can be given by

    | ∑_{τ=0}^{N−1} |θ_{u,v}(τ)|² − N² | ≤ ( ∑_{τ=1}^{N−1} |θ_u(τ)|² )^{1/2} ( ∑_{τ=1}^{N−1} |θ_v(τ)|² )^{1/2} ≤ (N − 1)θ_a².
Frequently in practice, there is greater interest in the mean square correlation distribution of
a sequence family than in the parameter θmax . Quite often in sequence design, the sequence
family is derived from a binary cyclic code of length N by picking a set of cyclically distinct
sequences of period N . The families of Gold and Kasami sequences are constructed in this
way [821, 1086]. In this case, the mean square correlation of the family can be shown to be
either optimum or close to optimum under certain easily satisfied conditions imposed on
the minimum distance of the dual code. A similar situation holds even when the sequence
family does not come from a linear cyclic code. In this sense, mean square correlation is not
a very discriminating measure of the correlation properties of a family of sequences.
For the case when there is approximate, but not perfect, synchronism present in a
CDMA communication system, there is interest in the design of families of sequences whose
correlations, both autocorrelation functions and crosscorrelation functions, are zero or small
for small values of relative time shift [1286]. In this context, a low correlation zone (LCZ)
sequence family with parameters (N, M, L, δ) is a family F of M sequences of period N
such that for each time shift τ in the low correlation zone (−L, L), the magnitude of the
periodic crosscorrelation function of any two distinct sequences at the shift τ and that of
the out-of-phase autocorrelation function of any sequence at the shift τ are bounded by δ,
i.e., for u, v ∈ F
|θ_{u,v}(τ)| ≤ δ for |τ| < L, u ≠ v, and for 0 < |τ| < L when u = v.
It is shown in [1782] that the parameters of an (N, M, L, δ) LCZ sequence family are mu-
tually constrained by the following inequality:

    δ² ≥ N(ML − N)/(ML − 1).
It is worth noting that if L is taken as N , the above lower bound on δ matches the Welch
Bound (25.1).
The odd-periodic correlation of two q-ary sequences {u(t)} and {v(t)} is the collec-
tion {θbu,v (τ ) | −(N − 1) ≤ τ ≤ N − 1} given by
    θ̂_{u,v}(τ) = ∑_{t=0}^{N−1} (−1)^{⌊(t+τ)/N⌋} ω_q^{u(t+τ)−v(t)}.
The odd-periodic correlation of the sequences {u(t)} and {v(t)} is termed odd-periodic
crosscorrelation when they are different, and termed odd-periodic autocorrelation
when they are identical, where θbu (τ ) is used for simplicity. Similar to the (even-)periodic
correlation, the odd-periodic correlation properties of sequences can play an equally impor-
tant role in applications for extracting desired information from the signals at the receiver.
Hence, sequences with low odd-periodic correlation magnitude are favorable. Nevertheless,
the odd-periodic correlation properties of sequences are little explored in general. Only a few
constructions of sequences with low autocorrelations have been proposed in the literature
[1302, 1921]. Lüke and Schotten in [1302] showed that there exists no binary sequence of
period greater than 2 that is perfect with respect to odd-periodic autocorrelation (namely,
all out-of-phase odd-periodic autocorrelations equal zero). They proposed a class of odd-
periodic almost perfect binary sequences with period q + 1 and with one position taking
a special element, where q is an odd prime power. As we shall see, these sequences can
be used to construct quaternary sequences with low periodic correlation. Yang and Tang
[1921] recently proposed several families of quaternary sequences with low odd-periodic
autocorrelation.
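The definition can be transcribed directly into code; the sketch below (reusing the period-7 binary sequence from earlier as an arbitrary example) evaluates θ̂ over the full shift range −(N − 1) ≤ τ ≤ N − 1.

```python
# Odd-periodic autocorrelation of a period-7 binary sequence (the sign
# flips whenever t + tau wraps past the period; a direct transcription
# of the definition, with an arbitrary example sequence).
s = [1, 1, 1, 0, 1, 0, 0]
N = len(s)

def theta_odd(u, v, tau):
    total = 0
    for t in range(N):
        sign = (-1) ** ((t + tau) // N)       # -1 on wrap-around
        total += sign * (-1) ** ((u[(t + tau) % N] - v[t]) % 2)
    return total

vals = [theta_odd(s, s, tau) for tau in range(-(N - 1), N)]
print(vals)   # the tau = 0 entry (index N - 1) equals N = 7
```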
The aperiodic correlation of two q-ary sequences {u(t)} and {v(t)} is the collection
{ρu,v (τ ) | −(N − 1) ≤ τ ≤ N − 1} given by
    ρ_{u,v}(τ) = ∑_{t=max{0,−τ}}^{min{N−1, N−1−τ}} ω_q^{u(t+τ)−v(t)},
where the range of t in the summation is the overlapped area of a shifted-version of {u(t)}
with {v(t)}. The aperiodic correlation of the sequences {u(t)} and {v(t)} is termed aperi-
odic crosscorrelation when they are different and termed aperiodic autocorrelation
when they are identical, where ρu (τ ) is used for simplicity. For a family of M sequences of
common period N , let ρmax be the maximum magnitude of the aperiodic crosscorrelation of
any two different sequences and the out-of-phase aperiodic autocorrelation of any sequence.
In practice sequences with small ρmax are desirable. A lower bound on ρmax due to Welch
[1884] is

    ρ_max² ≥ N²(M − 1)/(M(2N − 1) − 1).
For large families of sequences, the best bound on ρmax is due to Levenshtein [1232]. The
Levenshtein Bound is based on linear programming theory as well as the theory of associa-
tion schemes. When M ≥ 4 and N ≥ 2, the Levenshtein Bound is
    ρ_max² ≥ ((2N² + 1)M − 3N²)/(3(MN − 1)).
Aperiodic correlations are used more often than periodic correlations in a CDMA com-
munication system. However, the problem of designing sequence sets, particularly the cross-
correlation functions, with low aperiodic correlation is difficult. The conventional approach
has been to design sequences based on periodic correlation properties and to subsequently
analyze the aperiodic correlation properties of the resulting design.
For a sequence {u(t)} of period N, the merit factor is defined as

    F = ρ_u(0)² / ∑_{0<|τ|≤N−1} ρ_u(τ)².
The merit factor is the ratio of the square of the in-phase autocorrelation to the sum of
the squares of the out-of-phase aperiodic autocorrelation values. Hence it can be deemed
as a measure of the aperiodic autocorrelation properties of a sequence. It is also closely
connected with the signal to self-generated noise ratio of a communication system in which
coded pulses are transmitted and received. It is desirable that the merit factor of a sequence
be high, corresponding to the requirement that the sequence should have low aperiodic
autocorrelation magnitude. Due to the intractability of aperiodic correlation, progress on
the merit factor was only made for binary sequences in recent years [1040, 1093]. Let FN
denote the largest merit factor for any binary sequence of length N . Exhaustive computer
searches for N ≤ 40 have revealed that F11 = 12.1, F13 = 14.08 and 3.3 ≤ FN ≤ 9.85 for
other positive integers N up to 40. The values F11 and F13 are achieved by Barker sequences
of corresponding lengths. From partial searches, for lengths up to 117, the highest known
merit factor is between 8 and 9.56; for lengths from 118 to 200, the best-known factor is
close to 6. For lengths > 200, statistical search methods have failed to yield a sequence
having merit factor exceeding 5. Researchers have also looked into the asymptotic behavior
of the merit factor of binary sequences of infinitely large length N. It has been proved that
skew-symmetric sequences exhibit the best known asymptotic merit factor, as large as
6.342 · · · [1040, 1093].
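The value F_13 = 14.08 quoted above is easy to reproduce: the length-13 Barker sequence has aperiodic autocorrelation sidelobes of magnitude at most 1, and its merit factor is 169/12.

```python
# Merit factor of the length-13 Barker sequence, reproducing F_13 = 14.083...
# (binary +-1 values; rho is the aperiodic autocorrelation defined above).
b = [1, 1, 1, 1, 1, -1, -1, 1, 1, -1, 1, -1, 1]
N = len(b)

def rho(tau):
    tau = abs(tau)                   # aperiodic ACF is symmetric in tau
    return sum(b[t + tau] * b[t] for t in range(N - tau))

F = rho(0) ** 2 / sum(rho(tau) ** 2 for tau in range(-(N - 1), N) if tau != 0)
print(round(F, 2))   # 14.08
```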
The partial-period correlation of two sequences {u(t)} and {v(t)} is the collection
{∆_{u,v}(l, τ, t0) | 1 ≤ l ≤ N, 0 ≤ τ ≤ N − 1, 0 ≤ t0 ≤ N − 1} given by

    ∆_{u,v}(l, τ, t0) = ∑_{t=t0}^{t0+l−1} ω_q^{u(t+τ)−v(t)},
where l is the length of the partial-period and the sum t+τ is again computed modulo N . In
direct-sequence CDMA systems, the pseudo-random signature sequences used by the various
users are often very long for reasons of data security. In such situations, to minimize receiver
hardware complexity, correlation over a partial period of the signature sequence is often
used to demodulate data as well as to achieve synchronization [825, 940]. For this reason,
the partial-period correlation properties of a sequence are of interest. Researchers have
attempted to determine the moments of the partial-period correlation. Here, the main tool
is the application of the Pless Power Moments of coding theory; see [1513] and Section 1.15.
These identities often allow the first and second partial-period correlation moments to be
completely determined. For example, this is true in the case of m-sequences.
We have recalled basic correlation measures of sequences used in digital communica-
tion systems. Other correlation metrics related to transmitted signal energy, like peak-to-
sidelobes level (PSL) and peak-to-average power ratio (PAPR) of sequences can be found
in a recent survey [1850].
(iii) If N ≡ 2 (mod 4), then the smallest possible value of Θ is equal to 2. In this case, the
out-of-phase autocorrelation function θs (τ ) may take values from {−2, 2}. Jungnickel
and Pott in [1057] showed that there exist no binary sequences with the out-of-phase
autocorrelation function θs (τ ) always taking the value 2 or always taking the value
−2 with lengths up to 10^9, except for 4 length values; in addition, there exist binary
sequences with out-of-phase autocorrelation values {−2, 2}, which are said to have
optimal autocorrelation.
(iv) If N ≡ 3 (mod 4), then the smallest possible value of Θ is again equal to 1 since the
out-of-phase autocorrelation function may equal −1. As a matter of fact, there exist
binary sequences with out-of-phase autocorrelation function always taking the value
−1. These sequences are said to have ideal two-level autocorrelation.
Due to their intrinsic connection, many known families of binary sequences with opti-
mal autocorrelation are constructed from combinatorial objects [60]. Here we recall basic
definitions of difference sets and almost difference sets for completeness. Let (A, +) be
an abelian group of order v and C be a k-subset of A. Let dC (w) = |C ∩ (C + w)| be
the difference function at any nonzero element w in A. The subset C is called a (v, k, λ)
difference set (DS) in A if dC (w) = λ for any nonzero element w in A, and is called
a (v, k, λ, t) almost difference set (ADS) if dC (w) = λ for t nonzero elements w and
dC (w) = λ + 1 for v − 1 − t nonzero elements w. Difference sets and almost difference sets
in an abelian group A are said to be cyclic if the group itself is cyclic.
Let ZN be the set of integers modulo N . A binary sequence {s(t)} with a support Cs is
said to have optimal autocorrelation if one of the following is satisfied:
(i) If N ≡ 0 (mod 4), Cs is an (N, k, k − (N + 4)/4, Nk − k² − (N − 1)N/4) ADS in Z_N.

(ii) If N ≡ 1 (mod 4), Cs is an (N, k, k − (N + 3)/4, Nk − k² − (N − 1)²/4) ADS in Z_N.

(iii) If N ≡ 2 (mod 4), Cs is an (N, k, k − (N + 2)/4, Nk − k² − (N − 1)(N − 2)/4) ADS
in Z_N.

(iv) If N ≡ 3 (mod 4), Cs is an (N, (N + 1)/2, (N + 1)/4) or (N, (N − 1)/2, (N − 3)/4)
DS in Z_N.
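Case (ii) can be checked concretely: for N = p = 13, the set C of nonzero squares modulo 13 (the support used in the Legendre construction below) should be a (13, 6, 2, 6) ADS, i.e., the difference function d_C(w) takes the value 2 for six nonzero w and 3 for the remaining six. A short Python check:

```python
# Checking the almost difference set behind case (ii): the nonzero squares
# modulo p = 13 form a (13, 6, 2, 6) ADS in Z_13 (toy verification).
p = 13
C = {pow(y, 2, p) for y in range(1, p)}                 # {1, 3, 4, 9, 10, 12}
d = [sum(1 for c in C if (c + w) % p in C) for w in range(1, p)]
print(sorted(d))   # six 2s and six 3s
```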
Among the above four cases, the last one when N ≡ 3 (mod 4) is the family of Paley–
Hadamard difference sets with flexible parameters and ideal two-value autocorrelation. The
conditions for the first three cases are more restrictive. Here we shall introduce one example
for each of the first three cases and then discuss the last case in more depth.
The examples for cases (i)-(iii) we introduce here all pertain to square and non-square
elements in finite fields. A nonzero element x in the finite field Fq is called a square if
there exists an element y ∈ Fq satisfying x = y 2 . Equivalently, a nonzero element x ∈ Fq
with x = αk is a square if and only if the exponent k is even, where α is a primitive
element of Fq. For case (i), let q be a prime power with q ≡ 1 (mod 4); then the set {t |
α^t + 1 is a non-square in Fq, 0 ≤ t < q − 1} is a (q − 1, (q − 1)/2, (q − 5)/4, (q − 1)/4) ADS in
Z_{q−1}, of which the binary sequence has optimal autocorrelation values {0, −4}. For case (iii),
let q be a prime power with q ≡ 3 (mod 4); then the set {t | α^t + 1 is a non-square in Fq, 0 ≤
t < q − 1} is a (q − 1, (q − 1)/2, (q − 3)/4, (3q − 5)/4) ADS in Z_{q−1}, of which the binary
sequence has optimal autocorrelation values {0, −2, 2} [1702]. For case (ii), let q = p be
a prime with p ≡ 1 (mod 4); then the set of squares modulo p forms an ADS in Zp . The
binary sequence having this ADS as its support is the well-known Legendre sequence [704],
and it has optimal autocorrelation values {0, −3, 1}. Other families of binary sequences with
optimal autocorrelation for N (mod 4) ∈ {0, 1, 2} are mostly constructed from cyclotomic
classes. More instances of such sequences can be found in [328, 1163, 1783, 1934].
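The case (iii) construction is easy to verify for a small field; the following Python snippet uses q = 7 with α = 3 (an illustrative choice of primitive element) and checks that the resulting period-6 sequence has out-of-phase autocorrelation values in {0, −2, 2}.

```python
# Case (iii) with q = 7, alpha = 3: support C = {t : alpha^t + 1 is a
# nonzero non-square in F_7}; the binary sequence of period q - 1 = 6
# should have out-of-phase autocorrelation values in {0, -2, 2}.
q, alpha = 7, 3
squares = {pow(y, 2, q) for y in range(1, q)}          # {1, 2, 4}
N = q - 1
C = [t for t in range(N)
     if (v := (pow(alpha, t, q) + 1) % q) != 0 and v not in squares]
s = [1 if t in C else 0 for t in range(N)]
acf = [sum((-1) ** (s[(t + tau) % N] ^ s[t]) for t in range(N))
       for tau in range(1, N)]
print(C, sorted(set(acf)))
```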
For the last case of Paley–Hadamard difference sets, we cover one important case N =
2n − 1, for which n can be freely chosen and the corresponding binary sequences can be
converted to good binary cyclic codes of the same length. Known infinite families of Paley–
Hadamard difference sets fall into two classes:
(1) difference sets with Singer parameters, and

(2) difference sets constructed from cyclotomy.
Construction 25.2.1 (Trace Function) Define the trace function Tr_{q^r/q} : F_{q^r} → F_q by
Tr_{q^r/q}(x) = ∑_{i=0}^{r−1} x^{q^i}. Let n_0 be a divisor of n and R = {x ∈ F_{2^{n_0}} | Tr_{2^{n_0}/2}(x) = 1},
which is a difference set [1716]. Gordon, Mills, and Welch showed the set D = {x ∈ F_{2^n} |
Tr_{2^n/2^{n_0}}(x) ∈ R} is a difference set in F*_{2^n} with Singer parameters [839]. It is a generalization
of the original difference set {x ∈ F_{2^n} | Tr_{2^n/2}(x) = 1} by Singer [1716].
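The n = 3 instance of Singer's difference set can be verified directly: realizing F_8 as binary polynomials modulo x³ + x + 1 (an illustrative choice of defining polynomial), the trace-one elements form a (7, 4, 2) difference set in the multiplicative group F*_8.

```python
# Toy verification of the Singer difference set {x in F_8 : Tr(x) = 1},
# with F_8 realized as binary polynomials modulo x^3 + x + 1 (elements
# are the integers 1..7 in bit representation).
MOD = 0b1011                                   # x^3 + x + 1

def gmul(a, b):
    """Carry-less multiplication in F_8."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0b1000:
            a ^= MOD
        b >>= 1
    return r

def trace(x):
    x2 = gmul(x, x)
    return x ^ x2 ^ gmul(x2, x2)               # x + x^2 + x^4

D = [x for x in range(1, 8) if trace(x) == 1]
# d_D(w) = |D intersect wD| for every w != 1 in the multiplicative group
counts = {w: sum(1 for d in D if gmul(w, d) in D) for w in range(2, 8)}
print(D, set(counts.values()))   # a (7, 4, 2) difference set: every count is 2
```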
For the second class of Paley–Hadamard difference sets, cyclotomy plays a critical role
in the known constructions. Let q = df + 1 be a prime power and α a primitive element
of Fq. The cosets D_i^{(d,q)} = α^i ⟨α^d⟩, where ⟨α^d⟩ is the cyclic group generated by α^d in F*_q, are
called cyclotomic classes of order d with respect to Fq. Clearly F*_q = ∪_{i=0}^{d−1} D_i^{(d,q)}. Known
constructions of Paley–Hadamard difference sets are summarized as follows.
Construction 25.2.4 (Hall [887]) Let p be a prime of the form p = 4s² + 27 for some
integer s. The Hall difference set is given by D = D_0^{(6,p)} ∪ D_1^{(6,p)} ∪ D_3^{(6,p)}.
Construction 25.2.5 (Paley [1484]) Let p be a prime with p ≡ 3 (mod 4). Then the
cyclotomic class D_1^{(2,p)} is a (p, (p − 1)/2, (p − 3)/4) difference set in Z_p, whose
characteristic sequence is the well-known Legendre sequence [704].
Construction 25.2.6 (Two-Prime [142, Chapter 2]) Let N = p(p + 2) with p and
p + 2 being two primes. The two-prime difference set is given by

    (Z_p × {0}) ∪ (D_0^{(2,p)} × D_0^{(2,p+2)}) ∪ (D_1^{(2,p)} × D_1^{(2,p+2)}),
Let nj (s, τ ) denote the number of occurrences of j ∈ {0, 1, 2, 3} in the difference s(t+τ )−s(t)
(mod 4) as t ranges over ZN . The autocorrelation of {s(t)} can be rewritten as
θ_s(τ) = (n_0(s, τ) − n_2(s, τ)) + ω_4 (n_1(s, τ) − n_3(s, τ)).
The magnitude Θ could be as small as zero, which is achieved by perfect quaternary se-
quences. Nevertheless, perfect quaternary sequences have only been found for lengths 2, 4, 8,
and 16, and it is believed that there is no perfect quaternary sequence with lengths greater
than 16 [704]. Quaternary sequences of period N that have Θ = 1 for odd N or have Θ = 2
for even N were found, which are considered to be the best possible autocorrelation for
quaternary sequences. Hence quaternary sequences with Θ = 1 for odd N or Θ = 2 for even
N are said to have optimal autocorrelation. Lüke and Schotten in [1302] gave a com-
prehensive survey on both binary and quaternary sequences with optimal autocorrelation
properties. In addition to binary and quaternary sequences, they also introduced almost
binary and almost quaternary sequences, in which a single position takes a certain special value; these are used to construct optimal sequences. There are several constructions of optimal (almost) quaternary sequences. Below we recall the generalized Sidelńikov sequence
proposed in [1301] and related constructions of sequences with optimal autocorrelation.
Lüke, Schotten, and Mahram in [1301] generalized the binary Sidelńikov sequence [1702]
as follows. Let α be a primitive element in F_q. The quaternary Sidelńikov sequence of length q − 1 can be given by s(0) = c for certain c ∈ Z_4 and s(t) = log_α(α^t − 1) (mod 4) for t = 1, . . . , q − 2, where log_α(·) denotes the discrete logarithm function in F_q. The authors
in [1301] showed that the quaternary Sidelńikov sequences have Θ = 2 when N = q − 1 ≡ 0
(mod 4) and c = 0, 2, or when N = q − 1 ≡ 2 (mod 4) and c = 1, 3.
When N is odd, the optimal autocorrelation Θ = 1 is attained by a so-called
complementary-based method due to Schotten [1631]. The complementary-based method
utilizes odd-perfect binary sequences proposed in [1300] as building blocks. The construction
can be described as below, using the trace function defined in Construction 25.2.1.
In addition, when N ≡ 1 (mod 4), the binary Legendre sequences of odd prime length
can be modified to yield optimal quaternary sequences [1302]. Take N = p as an odd prime
with N ≡ 1 (mod 4). The quaternary sequence {s(t)} is given as follows: s(0) = 1, s(t) = 0 if
t is a square modulo p, and s(t) = 2 if t is a non-square modulo p. For instance, when p = 5,
the modified Legendre sequence is defined as 10220, which has periodic autocorrelation
spectrum θs (τ ) : (5, −1, −1, −1, −1).
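The stated spectrum is easy to confirm numerically. Below is a minimal Python check of the modified Legendre sequence for p = 5, computing the periodic autocorrelation θ_s(τ) with ω_4 = i; the helper names are our own.

```python
# Check the modified Legendre sequence 10220 (p = 5) and its autocorrelation
# spectrum (5, -1, -1, -1, -1). omega_4 = i is realized as the complex unit 1j.

def quaternary_autocorrelation(s, tau):
    """Periodic autocorrelation theta_s(tau) of a Z_4-valued sequence."""
    N = len(s)
    return sum(1j ** ((s[(t + tau) % N] - s[t]) % 4) for t in range(N))

p = 5
squares = {(x * x) % p for x in range(1, p)}               # {1, 4}
s = [1] + [0 if t in squares else 2 for t in range(1, p)]  # 1, 0, 2, 2, 0

spectrum = [quaternary_autocorrelation(s, tau) for tau in range(p)]
```

Every out-of-phase value comes out as −1 (with zero imaginary part), matching the spectrum quoted in the text.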
624 Concise Encyclopedia of Coding Theory
Using optimal binary and quaternary sequences with odd length N and Θ = 1, one can
further obtain more optimal quaternary sequences of length 2N and Θ = 2 by multiply-
ing them with the perfect quaternary sequence 01. The aforementioned optimal quaternary
sequences are constructed by slightly modifying binary sequences with optimal autocorre-
lation. Hence they cannot have good balance over Z4 . (Here by balance we mean the values
of Z4 are (almost) uniformly distributed in a quaternary sequence.) Tang and Ding [1781]
employed the interleaving technique and the inverse Gray mapping in constructing balanced
quaternary sequences for even length N ≡ 2 (mod 4).
where
s_i(t) = T_n((1 + 2η_i)β^t) for 1 ≤ i ≤ 2^n, and s_{2^n+1}(t) = T_n(2β^t).
for some a ∈ GR(4, n) depending upon i, j, and τ . The correlation distribution of Family
A was determined in [940], which indicates that
θ_max ≤ 1 + √(2^n).
Compared with the binary Gold sequences, Family A has the same family size and better
correlation.
With the same notation as in Family A, we define another sequence family L as follows:
when n is even. Each sequence in Family L can be verified to have period N = 2^n − 1. Clearly the family has size
|L| = 2^{2n} + 2^n + 1 when n is odd, and |L| = 2^{2n} + 2^n when n is even.
Furthermore, it can be shown [940] that θmax for Family L satisfies
θ_max ≤ 1 + 2√(2^n).
By the Sidelńikov Bound, Family L is superior to any possible family of binary sequences
having the same size. Moreover, when n is odd, L is simultaneously a superset of Family A
as well as of the family of binary Gold sequences.
limited [824, 935]. In this section, we will briefly introduce some basics of shift register
sequences and summarize common approaches to generating de Bruijn sequences. Differ-
ent from the previous section, we will denote a shift register sequence by {st } (instead of
{s(t)}) in this section for simpler presentation. In addition, binary n-tuples will be denoted
by upper-case letters in bold, e.g., X, Y, A, B.
[Diagram: an n-stage feedback shift register with stages x_0, x_1, . . . , x_{n−1} and feedback function f(x_0, . . . , x_{n−1})]
At every clock pulse, the content of x0 is output, the content of xi+1 is transferred into
xi for 0 ≤ i ≤ n − 2 and the last unit xn−1 is filled with f (x0 , x1 , . . . , xn−1 ). Namely, given
an initial state S0 = s0 s1 · · · sn−1 , a feedback shift register outputs a sequence s = {st } =
s0 s1 · · · sn−1 sn sn+1 · · · according to the following recurrence
sn+t = f (st , st+1 , . . . , st+n−1 ) for t = 0, 1, 2, . . . .
The sequence s = {st } is called an FSR sequence of order n. It exhibits a sequence of state
transitions S0 → S1 → S2 → · · · in the shift register, where St = st st+1 · · · st+n−1 is called
the tth state of s for t = 0, 1, 2, . . . . The states St−1 and St+1 are called the predecessor
and the successor of St , respectively.
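The recurrence above is straightforward to simulate; the following sketch (function names are our own) runs an arbitrary feedback function and reproduces the pure cycling register f = x_0 for n = 3, whose output just repeats the initial state.

```python
# A minimal FSR simulator following the recurrence s_{n+t} = f(s_t, ..., s_{t+n-1}).

def fsr_sequence(f, init, length):
    """Run an n-stage FSR with feedback f from the initial state `init` (list of bits)."""
    s = list(init)
    while len(s) < length:
        s.append(f(s[-len(init):]))  # f applied to the current state S_t
    return s[:length]

# Pure cycling register f(x0, x1, x2) = x0: the initial state cycles forever.
pcr = lambda x: x[0]
seq = fsr_sequence(pcr, [0, 0, 1], 9)  # 001 001 001
```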
A shift register is called a linear feedback shift register (LFSR) when the feedback
function is of linear form
f (x0 , x1 , . . . , xn−1 ) = c0 x0 + c1 x1 + · · · + cn−1 xn−1 ,
and is called a nonlinear feedback shift register (NLFSR) when the feedback function
is nonlinear.
In LFSRs, the all-zero state 0_n = 0···0 (n zeros) repeatedly produces the all-zero successor state. Hence, a nonzero binary LFSR sequence of order n has length at most 2^n − 1 since 0_n as an initial state will yield the all-zero sequence. LFSR sequences of order n with maximum possible length 2^n − 1 are called maximum-length sequences, or m-sequences for short. It is well-known that m-sequences of order n have a one-to-one correspondence with primitive polynomials of degree n. This indicates that there are a total of φ(2^n − 1)/n cyclically distinct binary m-sequences of order n, where φ(2^n − 1) is the Euler totient of 2^n − 1, namely, the number of positive integers that are less than and coprime to 2^n − 1.
Binary m-sequences possess many good random properties [824]:
• two-level ideal autocorrelation: the autocorrelation function of an m-sequence
takes the value −1 at all out-of-phase shifts;
• cycle-and-add property: the addition of an m-sequence and any of its cyclic shifts
is again an m-sequence;
• span-n property: any nonzero n-tuple binary string occurs exactly once in an m-
sequence of order n;
• run property: the runs of length i take up a 1/2^i portion of all runs in an m-sequence, where a run of length i is a block of i consecutive identical 1's or 0's.
The above properties of m-sequences are favorable in radar systems, synchronization of
data, Global Positioning Systems (GPS), coding theory, and code-division multiple-access
(CDMA) communication systems [824]. On the other hand, the regularities of m-sequences
are not desirable in cryptography. Nonlinearity and irregularity need to be introduced when
m-sequences are used in cryptographic applications.
For n-stage binary NLFSRs, there are a total of 2^{2^n} possible Boolean feedback functions. When the feedback function of an NLFSR has a nonzero constant term, the all-zero state 0_n produces the successor 0···01, instead of the all-zero state as in LFSRs. That is to say, an n-stage NLFSR could generate an FSR sequence of maximum period 2^n, which, under cyclic shifts, contains all possible 2^n states. Such an FSR sequence is called a binary de
Bruijn sequence of order n, in which every possible binary n-tuple occurs exactly once.
Binary de Bruijn sequences can be considered as the nonlinear counterpart of binary m-
sequences with respect to the span-n property. Historically, this type of sequence has been
‘rediscovered’ several times in different contexts. They became more generally known after
the work [496] of Nicolaas S. de Bruijn, whose name was later associated with this type of
sequence.
De Bruijn sequences showcase nice combinatorial and random properties. They have
found applications in a variety of areas, particularly in control mechanisms, coding and
communication systems, pseudo-random number generators, and cryptography [824, 1558].
It was shown that binary de Bruijn sequences exist for all positive integers n, and there are a total of 2^{2^{n−1}−n} cyclically distinct binary de Bruijn sequences of order n [496]. The number of de Bruijn sequences is huge compared to that of m-sequences, but it takes up only a small portion 2^{2^{n−1}−n}/2^{2^n} of the whole space of all possible binary FSR sequences
of order n. Therefore, efficient generation of de Bruijn sequences has been an interesting
and challenging topic. As we shall see, this problem can be approached from perspectives of
different mathematics branches, such as combinatorics, graph theory, and abstract algebra
[755, 824, 1558].
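For n = 3 the count 2^{2^{n−1}−n} = 2 is small enough to confirm by exhaustive search over all 2^8 binary strings; a throwaway sketch (names are our own):

```python
# Brute-force confirmation: exactly 2 cyclically distinct de Bruijn sequences
# of order 3, matching the formula 2^(2^(n-1) - n).

def is_de_bruijn(bits, n):
    N = len(bits)
    windows = {tuple(bits[(t + i) % N] for i in range(n)) for t in range(N)}
    return len(windows) == N            # all cyclic n-windows distinct

def canonical_rotation(bits):
    return min(tuple(bits[i:] + bits[:i]) for i in range(len(bits)))

n = 3
classes = {
    canonical_rotation([(x >> i) & 1 for i in range(2 ** n)])
    for x in range(2 ** (2 ** n))
    if is_de_bruijn([(x >> i) & 1 for i in range(2 ** n)], n)
}
```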
In some contexts, it suffices for simplicity to express the cycle structure of Ω(f) by the distribution of cycle lengths; we then denote the cycle structure by [d_i(λ_i)]_{i≥1}, where d_i(λ_i) means that there are d_i cycles of length λ_i in the cycle structure.
FIGURE 25.2: The state graph of the pure cycling register (PCR) for n = 3
[Diagram: the eight 3-bit states, forming the four disjoint cycles [000], [100 → 001 → 010], [110 → 101 → 011], and [111]]
FIGURE 25.3: The binary de Bruijn graphs G_n for n = 1, 2, 3
[Diagram: the de Bruijn graphs on 2, 4, and 8 vertices, respectively]
Graphically, the cycle structure of Ω(f ) exhibits a graph of state transitions in the FSR.
As an instance, Figure 25.2 displays the state graph for the simple pure cycling register
(PCR) with feedback function f (x0 , x1 , . . . , xn−1 ) = x0 for n = 3. This state graph has
four disjoint cycles and the cycle structure can be given by Ω(x0 ) = [0] ∪ [1] ∪ [001] ∪ [011].
There are 2^{2^n} possible feedback functions for n-stage binary FSRs. The superposition of the state graphs of all these FSRs yields the de Bruijn graph G_n of order n. It is a directed graph composed of 2^n vertices and 2^{n+1} edges, where each vertex denotes an
n-tuple state and one vertex x0 x1 · · · xn−1 is directed to another vertex y0 y1 · · · yn−1 if
and only if x1 · · · xn−1 = y0 · · · yn−2 . Figure 25.3 displays the binary de Bruijn graphs for
n = 1, 2, 3.
The de Bruijn graph provides an overview of state transitions in all possible FSRs.
Choosing a specific feedback function simply means that some edges in the graph will be
removed. Clearly, an FSR with any feedback function transits a state to a unique succes-
sor state, in which the last bit is uniquely determined by the feedback function. While a
state does not necessarily have a unique predecessor state, Golomb [824] gave a necessary
and sufficient condition on feedback functions that restricts each state to have a unique
[Figure 25.4: swapping the successors of a conjugate pair: the transitions x_0x_1···x_{n−1} → x_1···x_{n−1}x_n and y_0y_1···y_{n−1} → y_1···y_{n−1}y_n are exchanged to x_0x_1···x_{n−1} → y_1···y_{n−1}y_n and y_0y_1···y_{n−1} → x_1···x_{n−1}x_n]
predecessor. An FSR is said to be nonsingular if its feedback function has the form f(x_0, x_1, . . . , x_{n−1}) = x_0 + g(x_1, . . . , x_{n−1}).
The state graph of a nonsingular FSR consists of pure cycles that are disjoint from each
other (as in Figure 25.2). Such cycle structures are branchless and of more practical interest
[824]. We will restrict our discussion to nonsingular FSRs in the remainder of this section.
FIGURE 25.5: The adjacency graph of the pure cycling register (PCR) for n = 3
[Diagram: the path [0] —1— [001] —2— [011] —1— [1], with each edge labeled by the number of shared conjugate pairs]
The cycle [0] shares one conjugate pair (000, 100) with the cycle [001], the cycle [001] shares two conjugate pairs (001, 101), (010, 110) with the cycle [011], and the cycle [011] shares one conjugate pair (011, 111) with the cycle [1].
For an FSR with feedback function f , the cycle joining and splitting methods change
the cycle structure of Ω(f ), which results in changes to the values of f at the conjugate
pairs accordingly. When the successors of a conjugate pair a_0a_1···a_{n−1} and ā_0a_1···a_{n−1} are swapped in its state graph, the resulting feedback function is given by
f′(x_0, x_1, . . . , x_{n−1}) = x_0 + g(x_1, . . . , x_{n−1}) + ∏_{i=1}^{n−1}(x_i + a_i + 1) = x_0 + g′(x_1, . . . , x_{n−1}),
where the function g 0 takes the same value as g at all (n − 1)-tuples except (a1 , . . . , an−1 ).
In other words, each cycle joining or cycle splitting has a one-to-one correspondence to the
change of g(x1 , . . . , xn−1 ) at one position, causing the Hamming weight of the truth table
of g to change by one. It can be shown that the parity of the number of cycles is the same
as the parity of the weight of the truth table of g(x1 , . . . , xn−1 ) [755].
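The effect of one swap can be seen on the pure cycling register for n = 3: taking a = 000 and g = 0 in the formula above gives f′ = x_0 + (x_1 + 1)(x_2 + 1), and the number of cycles drops from four to three. A sketch with our own helper names:

```python
# One conjugate-pair swap joins two cycles: PCR (f = x0, n = 3) versus
# f' = x0 + (x1 + 1)(x2 + 1), which swaps the successors of (000, 100).

def cycles(feedback, n):
    """Cycle lengths of a nonsingular n-stage FSR (all states lie on pure cycles)."""
    seen, lengths = set(), []
    for start in range(2 ** n):
        if start in seen:
            continue
        state, length = start, 0
        while state not in seen:
            seen.add(state)
            length += 1
            bits = [(state >> i) & 1 for i in range(n)]          # bits[i] = x_i
            state = (state >> 1) | (feedback(bits) << (n - 1))   # shift in new bit
        lengths.append(length)
    return sorted(lengths)

f = lambda x: x[0]                                     # PCR: cycles [0],[1],[001],[011]
f_joined = lambda x: x[0] ^ ((x[1] ^ 1) & (x[2] ^ 1))  # product term added at a = 000
```

After the swap, the cycles [0] and [001] merge into a single cycle of length 4, so the cycle count decreases by exactly one, as stated.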
Theorem 25.3.3 With the notation of Theorem 25.3.1, let s = {s_t} have period e.
The identity in Theorem 25.3.3 is the key to studying the periodic properties of LFSRs.
To study the periods of the sequences in Ω(f ), it is very useful to define the period of the
characteristic polynomial f (x). Let f (x) be a polynomial with nonzero constant term. The
period of a polynomial f (x), denoted by per(f ), is defined as the smallest positive integer e
such that f (x) | (xe − 1). From the key identity in Theorem 25.3.3, the period of a sequence
s in Ω(f ) and the period of f (x) are closely connected.
Theorem 25.3.4 Let s = {st } be a nonzero sequence in Ω(f ). Then the period of s divides
the period of the characteristic polynomial f (x). Moreover, if f (x) is irreducible, then the
period of s equals the period of f (x).
Theorem 25.3.4 explains the essential reason that m-sequences of order n are generated
from primitive polynomials of degree n, which by definition have period 2n − 1.
Example 25.3.5 Let n = 4. Below we choose three different polynomials of degree 4 that
illustrate the result of Theorem 25.3.4.
• We first take a polynomial f(x) = x^4 + x^3 + x^2 + 1 = (x + 1)(x^3 + x + 1). It is easily seen (from Example 25.3.2) that f(x) divides x^7 − 1 and f(x) actually has per(f) = 7. Moreover, the cycle structure of Ω(f) is given by Ω(f) = [0] ∪ [1] ∪ [0010111] ∪ [1101000]; every nonzero cycle has length dividing per(f) = 7.
• Next take a primitive polynomial of degree 4, for instance f(x) = x^4 + x + 1, with per(f) = 15. The nonzero cycle has length 15, which is the same as per(f).
• Finally take the irreducible but non-primitive polynomial f(x) = x^4 + x^3 + x^2 + x + 1 with per(f) = 5. All three nonzero cycles have length 5, again equal to per(f).
By Theorems 25.3.3 and 25.3.4, we can rewrite Ω(f) with per(f) = e as follows:
Ω(f) = { ((x^e − 1)/f(x)) ϕ(x) | deg(ϕ(x)) < n }
since all sequences in Ω(f ) repeat with period e. Therefore, for two polynomials g(x) and
h(x), we have
Ω(gh) = Ω(g) + Ω(h) = {u + v | u ∈ Ω(g), v ∈ Ω(h)} when gcd(g, h) = 1, and
per(u + v) = lcm(per(u), per(v)).
Thus, the full cycle structure of Ω(gh) for co-prime polynomials g and h can be determined
in a relatively easy way.
Example 25.3.6 Take g(x) = x^2 + x + 1 and h(x) = x^3 + x + 1. We have Ω(g) = [0] ∪ [011] and Ω(h) = [0] ∪ [0010111]. The set Ω(gh) can be given as follows:
Ω(gh) = [0] ∪ [011] ∪ [0010111] ∪ [010000111110101001100],
where the sum of any cycle [u] and [0] is the cycle [u] itself, and the last cycle [010000111110101001100] is the sum of [011] and [0010111], formed by adding 7 replications of [011] to 3 replications of [0010111].
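The last cycle of the example can be reproduced directly by termwise addition over one period of length lcm(3, 7) = 21:

```python
# Adding the period-3 cycle [011] and the period-7 cycle [0010111] termwise
# (mod 2) over lcm(3, 7) = 21 positions, as in Example 25.3.6.
from math import lcm

u, v = [0, 1, 1], [0, 0, 1, 0, 1, 1, 1]
period = lcm(len(u), len(v))                      # 21
w = [u[t % 3] ^ v[t % 7] for t in range(period)]  # one period of the sum cycle
```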
Theorem 25.3.7 Let gcd(g, h) = 1. Let Ω(g) contain d1 cycles of length λ1 denoted as
[d1 (λ1 )], and let Ω(h) contain d2 cycles of length λ2 denoted as [d2 (λ2 )]. Combine by adding
in all possible ways the corresponding sequences. This leads to d1 λ1 d2 λ2 sequences, all of
period lcm(λ1 , λ2 ). The number of cycles obtained in this way is d1 d2 gcd(λ1 , λ2 ).
Note that both Ω(g) and Ω(h) may contain many cycles. By Theorem 25.3.7 we can repeat-
edly combine all sequences from Ω(g) and Ω(h).
Example 25.3.8 Let f (x) = (x2 + x + 1)(x + 1)2 and define g(x) = x2 + x + 1 and
h(x) = (x + 1)2 . Then the cycle structure of Ω(g) is given by [0] ∪ [011], denoted as
[1(1), 1(3)], and the cycle structure of Ω(h) is [0] ∪ [1] ∪ [01], denoted as [2(1), 1(2)]. The
cycle structure of Ω(f ) is therefore obtained by combining the two cycle structures. By
Theorem 25.3.7, the cycle structure of Ω(f) has the form [2(1), 1(2), 2(3), 1(6)].
The following result determines the cycle structure of polynomials having the form f(x)^k, where f(x) is an irreducible polynomial.
Theorem 25.3.9 Let f(x) be an irreducible polynomial of degree n and period e. For a positive integer k, determine κ such that 2^κ < k ≤ 2^{κ+1}. Then Ω(f^k) contains the following number of sequences with the following periods:

Period   1   e         2e             · · ·   2^κ e                       2^{κ+1} e
Number   1   2^n − 1   2^{2n} − 2^n   · · ·   2^{2^κ n} − 2^{2^{κ−1} n}   2^{kn} − 2^{2^κ n}
Example 25.3.10 Let f(x) = x^2 + x + 1, i.e., n = 2, e = 3. Let [d(λ)] denote that there are d cycles of period λ in Ω(f). Then Theorem 25.3.9 gives the cycle structure of Ω(f^2) as [1(1), 1(3), 2(6)]. In fact, the cycle structure of Ω(f^2) is [0] ∪ [011] ∪ [000101] ∪ [001111].
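The structure [1(1), 1(3), 2(6)] predicted by Theorem 25.3.9 can be confirmed by enumerating the state graph of f(x)^2 = x^4 + x^2 + 1, i.e., the recurrence s_{t+4} = s_{t+2} + s_t (a sketch; helper names are our own):

```python
# Enumerate the cycles of the LFSR with characteristic polynomial x^4 + x^2 + 1.

def cycle_lengths(n, feedback):
    """Cycle lengths of a nonsingular n-stage FSR."""
    seen, lengths = set(), []
    for start in range(2 ** n):
        if start in seen:
            continue
        state, length = start, 0
        while state not in seen:
            seen.add(state)
            length += 1
            bits = [(state >> i) & 1 for i in range(n)]
            state = (state >> 1) | (feedback(bits) << (n - 1))
        lengths.append(length)
    return sorted(lengths)

lengths = cycle_lengths(4, lambda x: x[0] ^ x[2])  # s_{t+4} = s_t + s_{t+2}
```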
Example 25.3.11 Let f(x) = (x^2 + x + 1)^2 (x + 1)^2. The cycle structure of (x^2 + x + 1)^2 is [1(1), 1(3), 2(6)] and the cycle structure of (x + 1)^2 is [2(1), 1(2)]. Hence, by Theorem 25.3.7, the cycle structure of Ω(f) is given by [2(1), 1(2), 2(3), 9(6)].
It can be shown [496] that if the de Bruijn graph G_n has N complete Hamiltonian cycles, then there exist 2^{2^{n−1}−1} · N complete Hamiltonian cycles in G_{n+1}. Starting from the unique Hamiltonian path through the de Bruijn graph G_1, by induction one can easily prove the fact that there are a total of 2^{2^{n−1}−n} de Bruijn sequences of order n.
The above result was proved by a method known as the BEST Theorem due to de Bruijn,
Ehrenfest, Smith, and Tutte [496]. For a given graph G without loops, the BEST Theorem
determines the number of subgraphs of G that are rooted trees. In graph theory, a rooted
tree is a connected graph without cycles, and there exists a vertex that is not the initial
vertex of any edge. The number of rooted trees of a graph without loops is determined by
the following BEST Theorem.
Theorem 25.4.1 (BEST [496]) Given a graph G of t nodes without loops, let A be its
associated matrix in which Ai,i equals the degree of vertex i, and Ai,j denotes (−1) times
the number of edges between vertex i and vertex j. Then the number of rooted trees of G is
equal to the minor of any element of A.
Denote by G∗n the modified de Bruijn graph, which is obtained by dropping the cycles at
all-zero and all-one vertices. We shall apply the BEST Theorem to a directed graph G∗n
with certain adjustments: we use Ai,i to denote the number of edges exiting the vertex
numbered i and Ai,j to denote (−1) times the number of edges from vertex i to vertex j.
For the directed graph G*_n, denote any integer x ∈ {0, 1, . . . , 2^n − 1} by a binary n-tuple x_0x_1···x_{n−1}. It is relatively easy to verify that the associated matrix of G*_n has the form:
• A_{0,0} = A_{2^n−1,2^n−1} = 1;
• A_{0,1} = A_{2^n−1,2^n−2} = −1;
• A_{i,i} = 2 for i = 1, 2, . . . , 2^n − 2;
• A_{i,j} = −1 for i = 1, 2, . . . , 2^n − 2, j = 2i, 2i + 1 modulo 2^n;
• A_{i,j} = 0 otherwise.
From the structure of the associated matrix of G*_n, it can be shown by induction that there are 2^{2^n−(n+1)} trees with root 0 by evaluating the determinant of the minor of the matrix A at (0, 0). Furthermore, each directed rooted tree of G*_n at vertex 0 can be used to find a Hamiltonian path of G_{n+1} by the following Graphical Algorithm.
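The determinant claim can be checked for n = 3, where the minor at (0, 0) should evaluate to 2^{2^3−4} = 16. The sketch below (names are our own) builds the matrix from the bullet list above and uses exact rational Gaussian elimination:

```python
# Build the associated matrix of G*_3 and evaluate the minor at (0, 0).
from fractions import Fraction

n = 3
size = 2 ** n
A = [[0] * size for _ in range(size)]
A[0][0] = A[size - 1][size - 1] = 1
A[0][1] = A[size - 1][size - 2] = -1
for i in range(1, size - 1):
    A[i][i] = 2
    for j in (2 * i % size, (2 * i + 1) % size):
        A[i][j] += -1

# Minor at (0, 0): delete row 0 and column 0, take the determinant exactly.
M = [[Fraction(A[i][j]) for j in range(1, size)] for i in range(1, size)]
det = Fraction(1)
for c in range(len(M)):
    pivot = next(r for r in range(c, len(M)) if M[r][c] != 0)
    if pivot != c:
        M[c], M[pivot] = M[pivot], M[c]
        det = -det
    det *= M[c][c]
    for r in range(c + 1, len(M)):
        factor = M[r][c] / M[c][c]
        M[r] = [M[r][k] - factor * M[c][k] for k in range(len(M))]
```

The value 16 equals the number of de Bruijn sequences of order 4, consistent with the BEST Theorem argument sketched in the text.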
It is shown in [757] that the Graphical Algorithm can generate all de Bruijn sequences of order n. As an illustration, we recall an example from [757].
Example 25.4.3 Let T be a rooted tree in G∗3 given by vertices 0, 1, . . . , 7 and edges
(1, 2), (2, 4), (3, 6), (4, 0), (5, 2), (6, 4), (7, 6),
where the label (i, j) denotes an edge from vertex i to vertex j. These edges together with
(0, 0) will be labeled in Step 2. Thus, the unlabeled edges in G3 are
(0, 1), (1, 3), (2, 5), (3, 7), (4, 1), (5, 3), (6, 5), (7, 7).
Starting from vertex 0, with (0, 0) labeled, (0, 1) satisfies Step 4. Following the steps in the
Graphical Algorithm, we get a sequence of edges
0 → 1 → 3 → 7 → 7 → 6 → 5 → 3 → 6 → 4 → 1 → 2 → 5 → 2 → 4 → 0 → 0.
In a similar way, all Hamiltonian paths of G4 can be generated by all trees rooted at vertex
0 in G∗3 .
The preceding process using a graphical approach can generate all de Bruijn sequences
of order n. However, we have to obtain all trees in Gn−1 . Another drawback of this approach
is that we need to store the tree and generated path in some way. These drawbacks can be
overcome by some approaches from a combinatorial or algebraic perspective.
Steps 1 - 2 of the Prefer-One Algorithm ignore the wraparound sequences. When Step
2 terminates, it outputs a de Bruijn sequence appended with (n − 1) extra bits. However,
Step 2 is sufficient to produce a de Bruijn sequence. This elegant algorithm, as noted in
Fredricksen’s survey [755], was proposed from different areas several times.
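A minimal sketch of the prefer-one greedy rule as we understand it (names are our own): starting from n zeros, append a 1 whenever the resulting n-window has not occurred, otherwise a 0, and stop when neither extension is new; dropping the final n − 1 bits leaves a de Bruijn sequence.

```python
# Greedy prefer-one generation of a binary de Bruijn sequence of order n.

def prefer_one(n):
    seq = [0] * n
    seen = {tuple(seq)}
    while True:
        for bit in (1, 0):                       # prefer 1 over 0
            window = tuple(seq[-(n - 1):] + [bit])
            if window not in seen:
                seen.add(window)
                seq.append(bit)
                break
        else:
            return seq[: 2 ** n]                 # drop the n-1 wraparound bits

db = prefer_one(3)
```

For n = 3 this produces 00011101, and all 8 cyclic windows of length 3 are distinct.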
In the Prefer-Same Algorithm, the condition Step 2(ii) on the run length is a necessary
requirement for producing de Bruijn sequences.
The necklace of a string x, denoted by Neck(x), is the set consisting of x along with all its circular shifts. The necklace representative of Neck(x) is the unique Lyndon word in Neck(x). We say x is a co-Lyndon word if xx̄ is a Lyndon word, where x̄ denotes the complement of x. Let coNeck(x) denote the set containing all length-n substrings of the circular string xx̄. The representative of coNeck(x) is the lexicographically smallest string in coNeck(x). For instance, let x = 01010; we have Neck(01010) = {01010, 10100, 01001, 10010, 00101} and its representative is 00101. For x = 00111, coNeck(00111) = {00111, 01111, 11111, 11110, 11100, 11000, 10000, 00000, 00001, 00011}; the co-necklace representative is 00000, and it is a co-Lyndon word.
The Lyndon word and co-Lyndon word are closely related to the pure cycling regis-
ter (PCR) and its complement. Recall that the feedback function of the PCR is given
by f (x0 , x1 , . . . , xn−1 ) = x0 . Its complement, called the complement cycling register
(CCR), has feedback function f_c(x_0, x_1, . . . , x_{n−1}) = x_0 + 1. It can be easily verified that the
state graph of the PCR actually consists of necklaces and the state graph of the CCR consists
of co-necklaces [781]. For instance, for the PCR with n = 3, the cycles [0], [1], [001], [011] as
in Figure 25.2 are necklaces; for the CCR with n = 3, the cycle [000111] forms a co-necklace
{000, 001, 011, 111, 110, 100} and the cycle [010] is a co-necklace {010, 101}.
We introduce some combinatorial algorithms for generating de Bruijn sequences by in-
vestigating Lyndon or co-Lyndon words. We start with an efficient necklace concatenation
algorithm, and then present several follow-up algorithms by setting rules on Lyndon words.
The name Granddaddy for the above algorithm originated in [1132], where Knuth referred to
the lexicographically least de Bruijn sequence for each positive integer n as the granddaddy.
Different from the preceding greedy algorithms, the Granddaddy Algorithm does not require
storing the previously generated sequence. Instead, it only requires checking whether the
n-tuple x∗ is a Lyndon word or not, hence reducing memory cost to O(n). The reverse
version was recently proposed in [642], where the authors termed it the Grandma Algorithm as a counterpart to the playful name Granddaddy Algorithm.
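The granddaddy sequence can be produced directly by the classical Lyndon-word concatenation (the FKM construction): concatenate, in lexicographic order, all Lyndon words whose length divides n. A sketch with our own helper names:

```python
# FKM construction: the lexicographically least de Bruijn sequence of order n
# is the concatenation, in lexicographic order, of all binary Lyndon words
# whose length divides n.

def is_lyndon(w):
    """w is a Lyndon word if it is strictly smaller than all its proper rotations."""
    return all(w < w[i:] + w[:i] for i in range(1, len(w)))

def granddaddy(n):
    lyndon = []
    for length in (d for d in range(1, n + 1) if n % d == 0):
        lyndon += [
            format(x, f"0{length}b")
            for x in range(2 ** length)
            if is_lyndon(format(x, f"0{length}b"))
        ]
    return "".join(sorted(lyndon))

db = granddaddy(3)   # Lyndon words 0, 001, 011, 1 in lexicographic order
```

For n = 3 this yields 00010111, the granddaddy sequence of order 3.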
From the above algorithms, one may already notice that the process essentially provides
ways of concatenating the necklaces in the state graph of PCRs. In fact, we can have even
simpler algorithms for creating possible Lyndon words [1036, 1625].
Since the necklace Neck(x) has a unique Lyndon word, the above two algorithms essentially
generate PCRs with the successor rule x_{t+n} = x_t and join the cycles by swapping the successors of that Lyndon word with x_{t+n} = x_t + 1.
PCRs can be applied to CCRs with moderate modifications [781].
The proofs of the preceding algorithms are given in the corresponding references [37, 641,
754, 755, 1036, 1333], and PCR1 and CCR3, despite different presentations, are essentially
the same as the ones in [1036]. Observe that these algorithms heavily depend on the simple
cycle structures of PCRs and CCRs. They have the advantages of simplicity and low memory
cost. Nevertheless, they only generate a few de Bruijn sequences. From an algebraic point
of view, they are among the simplest LFSRs with manageable cycle structures. After the
following example we will introduce some other LFSRs that have been used to generate de
Bruijn sequences.
Example 25.4.14 Table 25.1 illustrates the process of these algorithms for n = 5 starting
with all-zero string 00000. Note that in this example, the algorithms CCR1 and CCR2
produce the same de Bruijn sequence. This is not true for n > 5. For instance, if n = 6, the
algorithms CCR1 and CCR2 generate the following de Bruijn sequences, respectively:
0000001111110001101110011001010110101001000100111011000010111101;
0000001111110001001110110011000010111101001010110101000110111001.
FIGURE 25.6: The adjacency graph of LFSRs from a product of primitive polynomials p_1(x), p_2(x) with co-prime degrees
[Diagram: vertices [0], [s_1], [s_2], and [s_1 + s_2]; the cycle [s_1 + s_2] shares 1 conjugate pair with [0], 2^n − 2 conjugate pairs with [s_1], and 2^m − 2 conjugate pairs with [s_2]; [s_1] and [s_2] share 1 conjugate pair; and [s_1 + s_2] contains ½(2^n − 2)(2^m − 2) conjugate pairs within itself]
Beyond this simplest case, a natural step would be to consider FSRs with characteristic
polynomial the product of two primitive polynomials.
Proposition 25.4.15 ([1240, 1244]) Let p_1(x) = x^n + Σ_{i=0}^{n−1} c_i x^i and p_2(x) = x^m + Σ_{j=0}^{m−1} d_j x^j be two primitive polynomials with gcd(n, m) = 1. The cycle structure of an (n + m)-stage LFSR with characteristic polynomial f(x) = p_1(x)p_2(x) is given by
Ω(f) = [0] ∪ [s_1] ∪ [s_2] ∪ [s_1 + s_2],
where s_1 and s_2 are the m-sequences generated by p_1(x) and p_2(x), respectively.
From Proposition 25.4.15, one can derive a full cycle from Ω(f ) using the following
algorithm.
Step 1: Swap the successors of the conjugate pair (0 · · · 0, 10 · · · 0) shared by the cycles [0]
and [s1 + s2 ].
Step 2: Pick an (n + m)-tuple A = a_0a_1···a_{n+m−1} from the cycle [s_1], with its conjugate Â on [s_1 + s_2]. Swapping the successors of the conjugate pair (A, Â) joins the cycles [s_1] and [s_1 + s_2] (as in Figure 25.4).
Step 3: Pick an (n + m)-tuple B = b_0b_1···b_{n+m−1} from the cycle [s_2], with its conjugate B̂ on [s_1 + s_2]. Swapping the successors of the conjugate pair (B, B̂) joins the cycles [s_2] and [s_1 + s_2] (as in Figure 25.4).
Step 4: A new feedback function that generates de Bruijn sequences of order (n + m) is given by
f′(x_0, . . . , x_{n+m−1}) = Σ_{i=0}^{n−1} Σ_{j=0}^{m−1} c_i d_j x_{i+j} + Σ_{δ∈∆} ∏_{i=1}^{n+m−1} (x_i + δ_i + 1),
where c_i, d_j are the coefficients in p_1(x) and p_2(x), respectively, as given in Proposition 25.4.15, and ∆ is a set consisting of the three tuples 0···0 (n + m − 1 zeros), a_1···a_{n+m−1}, and b_1···b_{n+m−1}.
Given the initial state 00000, an FSR with feedback function f 0 generates the de Bruijn
sequence 00000111110110101110010100110001.
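A quick check that this output is indeed a de Bruijn sequence of order 5 — all 2^5 = 32 cyclic windows of length 5 must be distinct:

```python
# Verify the de Bruijn property of the length-32 sequence quoted in the text.

s = "00000111110110101110010100110001"
n = 5
windows = {(s + s)[t : t + n] for t in range(len(s))}  # cyclic n-windows
```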
The previous example shows the process of joining smaller cycles into a full cycle for
a relatively simple characteristic polynomial, and shows how the corresponding feedback
function is obtained. It is proven in [1239] that when different conjugate pairs are chosen,
the resulting de Bruijn sequences will be different. This indicates that the number of de
Bruijn sequences obtained in this way can be determined by explicitly counting the number
of conjugate pairs among adjacent cycles [1239].
Following the idea from Proposition 25.4.15, we can further consider LFSRs that have
the characteristic polynomial as a product of multiple primitive polynomials.
Proposition 25.4.18 ([1240]) For an n-stage LFSR with characteristic polynomial of the form f(x) = ∏_{i=0}^{k} p_i(x), where p_i(x) is a primitive polynomial of degree d_i such that d_i < d_j and gcd(d_i, d_j) = 1 for 0 ≤ i < j ≤ k, let s_i be the m-sequence of length 2^{d_i} − 1 generated by p_i(x). Then the cycle structure of Ω(f) is given by
Ω(f) = [0] ∪ ⋃_{l=1}^{k+1} ⋃_{I_l ⊆ Z_{k+1}} [s_{i_1} + s_{i_2} + · · · + s_{i_l}],
where Il = {i1 , i2 , . . . , il } and Zk+1 = {0, 1, . . . , k}. Furthermore, the adjacency graph G(f )
is given as follows.
(a) The cycle [0] is adjacent to the cycle [s0 + s1 + · · · + sk ].
(b) For any set I_l ⊆ {0, 1, . . . , k}, the cycle [Σ_{j∈I_l} s_j] is adjacent to all cycles [Σ_{j∈M} s_j] where M is any set satisfying I_l ∪ M = Z_{k+1}.
(c) For any set M given in (b), the cycles [Σ_{j∈I_l} s_j] and [Σ_{j∈M} s_j] share only one conjugate pair if M ∩ I_l = ∅, and exactly ∏_{i_j ∈ I_l ∩ M} (2^{d_{i_j}} − 2) conjugate pairs otherwise.
In particular, the cycle [s_0 + s_1 + · · · + s_k] contains ½ ∏_{i=0}^{k} (2^{d_i} − 2) conjugate pairs.
By the above proposition, an algorithm can be developed to join smaller cycles in Ω(f) into a full-length cycle [1240]. In particular, when p_0(x) = 1 + x, the adjacency graph G(f) for f(x) = ∏_{i=0}^{k} p_i(x) has no loop. In this case, the number of de Bruijn sequences from f(x) can be determined by the BEST Theorem and the theory of cyclotomy [1240].
A further extension for constructing de Bruijn sequences from LFSRs is to consider non-
primitive irreducible polynomials or even arbitrary characteristic polynomials. However, the
study of LFSRs with more flexible choices of characteristic polynomials also involves much
more complex investigation of adjacency graphs and heavier notation. (Readers may already
notice the heavy notation and complexity from Proposition 25.4.18 and Theorem 25.3.9.)
For a deep understanding of this topic, please refer to [382, 383, 1238, 1239, 1244] for recent
work in this direction. In the following, we will introduce a simple case for non-primitive
irreducible polynomials followed by an auxiliary example.
Note that a non-primitive irreducible polynomial f of degree n has its period e as a divisor of 2^n − 1. According to Theorem 25.3.4, all nonzero sequences in Ω(f) have period e. That is to say, the state diagram has the cycle structure Ω(f) = [0] ∪ [s_1] ∪ [s_2] ∪ · · · ∪ [s_{(2^n−1)/e}], consisting of the all-zero cycle and (2^n − 1)/e cycles of length e.
Observe that the mapping φ satisfies φ(x) + φ(y) = φ(x + y) for any x, y ∈ F2n . This
statement is clearly valid when at least one of x, y is the zero element. Indeed, for nonzero x, y in F_{2^n}, suppose x = α^k, y = α^l, and α^k + α^l = α^m. Then the ith coordinate of φ(x) + φ(y) is a_{k+id,0} + a_{l+id,0}. It is the starting coordinate of α^{k+id} + α^{l+id} = α^{m+id}, which is actually the ith coordinate of φ(α^m). It thus follows that φ(α^k) + φ(α^l) = φ(α^m) = φ(α^k + α^l). In addition, it is readily seen that φ(x) = 0 if and only if x = 0. Hence, φ is a bijective mapping from F_{2^n} to F_2^n, and φ(x), φ(x + 1) for any x ∈ F_{2^n} always form a conjugate pair [924].
The cyclotomic classes with respect to d are given by C_i = {α^i β^j | 0 ≤ j < e}, where 0 ≤ i < d. It was shown in [924, Theorem 3] that the mapping φ induces a one-to-one correspondence between cycles in Ω(f) and cyclotomic classes. As a result, the number of conjugate pairs between two cycles [s_i] and [s_j] is equal to the cyclotomic number (i, j)_d given by (i, j)_d = |C_i ∩ (C_j + 1)|. In cyclotomy theory, the cyclotomic numbers (i, j)_d for
small integers d are determined. Hence, the adjacency graph G(f ) can be explicitly given
for those integers d with known cyclotomic numbers.
Example 25.4.19 Let n = 4 and f(x) = x^4 + x^3 + x^2 + x + 1. For the finite field F_{2^4} = F_2[x]/(f(x)), we take a primitive element α that is a root of the primitive polynomial q(x) = x^4 + x + 1. Then β = α^3. The FSR with characteristic polynomial f(x) is the so-called pure summing register and has cycle structure Ω(f) = [0] ∪ [00011] ∪ [00101] ∪ [01111].
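The cycle structure of the pure summing register can be confirmed by enumeration: with feedback f = x_0 + x_1 + x_2 + x_3, every nonzero cycle should have length e = 5 (a sketch; helper names are our own).

```python
# Enumerate the cycles of the pure summing register for n = 4
# (characteristic polynomial x^4 + x^3 + x^2 + x + 1, period e = 5).

def cycle_lengths(n, feedback):
    """Cycle lengths of a nonsingular n-stage FSR."""
    seen, lengths = set(), []
    for start in range(2 ** n):
        if start in seen:
            continue
        state, length = start, 0
        while state not in seen:
            seen.add(state)
            length += 1
            bits = [(state >> i) & 1 for i in range(n)]
            state = (state >> 1) | (feedback(bits) << (n - 1))
        lengths.append(length)
    return sorted(lengths)

lengths = cycle_lengths(4, lambda x: (x[0] + x[1] + x[2] + x[3]) % 2)
```

The result [1, 5, 5, 5] matches the all-zero cycle plus (2^4 − 1)/5 = 3 cycles of length 5.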
In the above we introduced the generation of de Bruijn sequences from LFSRs with
simple characteristic polynomials. For LFSRs with more general characteristic polyno-
mials, such as a product of two different irreducible polynomials or even an arbitrary
polynomial, Chang et al. in [382, 384] intensively investigated the property of their
cycle structure and adjacency graph, and also developed a Python program to generate de Bruijn sequences from products of irreducible polynomials, which is available at
https://github.com/adamasstokhorst/deBruijn.
Chapter 26
Lattice Coding
Frédérique Oggier
Nanyang Technological University
26.1 Introduction
A lattice Λ in R^n is a set of points in R^n composed of all integer linear combinations of independent vectors b_1, b_2, . . . , b_m:
Λ = {a_1 b_1 + · · · + a_m b_m | a_1, . . . , a_m ∈ Z}.
[Figure: a lattice in R^n generated by basis vectors b_1, b_2, . . . , b_m.] Stacking the basis vectors b_1, . . . , b_m as the rows of an m × n generator matrix B, every lattice point x ∈ Λ can be written as

(x_1, . . . , x_n) = (a_1, . . . , a_m) B for a_1, . . . , a_m ∈ Z.
[FIGURE 26.1: Block diagram of the Gaussian channel. The source message u = u_1 · · · u_k is encoded into a codeword x; the channel adds a noise vector z; the decoder receives y = x + z and outputs a codeword estimate for the receiver.]
Lattices have been extensively studied; see, e.g., [442] for an encyclopedic reference, [661,
1335] for mathematical theories, [1938] for an information theoretic approach, [1384, 1497]
for applications of lattices to cryptography, and [455] for a recent tutorial on lattices and
some of their applications.
This chapter introduces some aspects of lattice coding, namely coding for the Gaussian
channel (see Section 26.2), modulation schemes for fading channels (in Section 26.3) and
constructions of lattices from linear codes (in Section 26.4). Some other aspects of lattice
coding will be reviewed in Section 26.5.
y = x + z ∈ Rn (26.1)
where z is a random vector whose components are independent Gaussian random variables
with mean 0 and variance σ 2 , as shown in Figure 26.1.
The Gaussian channel coding problem consists of constructing the set S such that x
can be decoded from y despite the presence of the noise z. If the transmitter had infinite
power, given σ 2 , it could just scale the set S, so that the distance between any two vectors
in S is much larger than 2σ 2 . Then points in S will be far enough apart that the decoder
can just choose the point x which is closest to the received point y. However, we do have a power constraint: all the points of S must lie within a sphere of radius √(nP) around the origin, where P defines an average power constraint.
Suppose now that S is a subset from a lattice Λ. The receiver will make a correct decision
to choose the closest lattice point x ∈ S from y as the decoded point exactly if the noise
The dominant terms in the sum correspond to vectors with minimum norm

λ = min_{0 ≠ x ∈ Λ} ‖x‖.
Therefore, dropping all terms except the ones of minimum norm, the upper bound can be approximated by

(κ/2) e^{−ρ²/(2σ²)},

where ρ = λ/2 is the packing radius of Λ, and κ is the kissing number of Λ, that is, the number of packing balls that touch a fixed one, which corresponds to the number of lattice points having the minimum nonzero norm.
This expression is minimized if ρ is maximized (once this is done, one may then minimize κ). Intuitively, we expect the number of points in S to be close to the ratio between the volume of the sphere B^n(√(nP)) of radius √(nP) and the volume V(Λ) of Λ, i.e.,

|S| ≈ vol B^n(1) (nP)^{n/2} / V(Λ) = Δ (nP)^{n/2} / ρ^n,

where Δ is the packing density of Λ defined by

Δ(Λ) = vol B^n(ρ) / V(Λ).

Then ρ² ≈ (nP) (Δ/|S|)^{2/n}, and for a fixed number of points, minimizing the probability of error can be achieved by maximizing the packing density of Λ.
We give some examples of famous lattices, with some of their parameters introduced
above; more are found in [442, Chapter 4].
• The cubic lattice Z^n: It has minimal norm λ(Z^n) = min_{x∈Z^n\{0}} ‖x‖ = 1, packing radius ρ(Z^n) = 1/2, and kissing number κ(Z^n) = 2n. Its packing density is Δ(Z^n) = vol B^n(1) / 2^n.
• The lattices E7 and E6: These can be obtained from the lattice E8 as

E7 = {x = (x_1, . . . , x_8) ∈ E8 | x_1 = x_2},
E6 = {x = (x_1, . . . , x_8) ∈ E8 | x_1 = x_2 = x_3}.
• The Barnes–Wall lattice Λ16: The so-called Barnes–Wall lattices BWn are defined in dimensions n = 2^k, k ≥ 2. In dimension 16, Λ16 = BW16 has the best known packing density.
[FIGURE: Block diagram of the fading channel. The message u = u_1 · · · u_k is encoded into a codeword x; the channel applies fading H and adds noise z; the decoder receives y = xH + z and outputs a codeword estimate for the receiver.]
The intuition follows the formal derivation of the error probability of decoding x̃ ≠ x when x is sent, given that H is known, and assuming that the decoder looks for the closest lattice point, in which case this error probability is bounded above by

(1/2) ∏_{x_i ≠ x̃_i} 1 / ((x_i − x̃_i)² / (8σ²)).
The actual lattice decoder used to find the closest lattice point is called the sphere decoder
[479, 922, 1855].
Algebraic constructions (see, e.g., [143, 1055]) provide a systematic method to obtain
full diversity, and also to compute the minimum distance. Such an algebraic construction
is explained below.
and the conditions σ1 (α) > 0 and σ2 (α) > 0 ensure that no complex value is introduced
when taking the square root.
The volume of this lattice is given by the square root of

det[ σ1(1) σ2(1) ; σ1(θ) σ2(θ) ]² · det[ √σ1(α) 0 ; 0 √σ2(α) ]² = σ1(α)σ2(α)(σ2(θ) − σ1(θ))².
We are left to exhibit full diversity. Recall from (26.2) that points in lattices obtained from quadratic fields are of the general form

(√σ1(α) σ1(x), √σ2(α) σ2(x)) and (√σ1(α) σ1(y), √σ2(α) σ2(y))
Here θ = (1+√5)/2 and α = 3 − θ, so that σ1(α) = (5−√5)/2 > 0 and σ2(α) = (5+√5)/2 > 0. The matrix of embeddings is

B = [ 1 1 ; σ1((1+√5)/2) σ2((1+√5)/2) ].

Then

B = [ √σ1(α) √σ2(α) ; √σ1(α)·(1+√5)/2 √σ2(α)·(1−√5)/2 ]

with corresponding Gram matrix G equalling

[ σ1(3 − (1+√5)/2) + σ2(3 − (1+√5)/2)   σ1(−1 + 2·(1+√5)/2) + σ2(−1 + 2·(1+√5)/2) ;
  σ1(−1 + 2·(1+√5)/2) + σ2(−1 + 2·(1+√5)/2)   σ1(2 + (1+√5)/2) + σ2(2 + (1+√5)/2) ],

yielding

G = [ 5 0 ; 0 5 ]

and volume

√|σ1(α)σ2(α)| · |σ2(θ) − σ1(θ)| = √5 · √5 = 5.

Up to a change of basis, this lattice is in fact a scaled version of Z².
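The Gram matrix and volume above can be checked numerically; the following Python sketch (floating-point, for illustration only, with θ = (1+√5)/2 and α = 3 − θ as in the example) rebuilds B and G = BBᵀ:

```python
# Illustrative numeric check of the Q(sqrt(5)) example: theta = (1+sqrt(5))/2,
# alpha = 3 - theta, basis vectors twisted by sqrt(sigma_i(alpha)).
import math

s5 = math.sqrt(5)
theta = [(1 + s5) / 2, (1 - s5) / 2]   # sigma_1(theta), sigma_2(theta)
alpha = [3 - theta[0], 3 - theta[1]]   # sigma_1(alpha), sigma_2(alpha) (both > 0)

# Rows of the generator matrix: (sqrt(sigma_j(alpha)) * sigma_j(x)) for x = 1, theta.
B = [[math.sqrt(alpha[j]) * 1 for j in range(2)],
     [math.sqrt(alpha[j]) * theta[j] for j in range(2)]]

# Gram matrix G = B B^T and lattice volume |det B|.
G = [[sum(B[i][k] * B[j][k] for k in range(2)) for j in range(2)] for i in range(2)]
vol = abs(B[0][0] * B[1][1] - B[0][1] * B[1][0])
# Numerically, G is 5*I_2 and vol is 5.
```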
26.4.1 Construction A
For q ≥ 2 a positive integer, consider the set Zq = {0, 1, . . . , q − 1} of integers modulo q.
We will use the notation Fp to denote Zp to emphasize the case q = p is prime. A linear code
C in Znq is by definition an additive subgroup of Znq , whose vectors are called codewords. If
q = p is prime, we have more structure, and a linear code is a subspace of dimension k of
the vector space Fnp (called an [n, k] code).
Next we establish a connection between linear codes in Z_q^n and lattices. Consider the map ρ from Z^n to Z_q^n given by taking the reduction modulo q componentwise:

ρ(x_1, . . . , x_n) = (x_1 mod q, . . . , x_n mod q).

One may take any arbitrary subset S of Z_q^n and compute ρ⁻¹(S), but we will start with a linear code S ⊂ Z_q^n, to understand how this code structure is carried over to ρ⁻¹(S).
The key result (see, e.g., [455] for a proof) is that for C ⊂ Z_q^n a linear code, the preimage ρ⁻¹(C) is a lattice, with qZ^n ⊆ ρ⁻¹(C) ⊆ Z^n. The lattice ΛC = ρ⁻¹(C) is said to have been obtained via
Construction A. Construction A generates good (capacity-achieving) codes for the Gaus-
sian channel, for some channels with side information [1937]. It is further used for wiretap
coding, as will be explained below. It also appears in lattice-based cryptographic schemes,
linked to the computational difficulty of the shortest and closest vector problems [1384].
• We have |ΛC / qZ^n| = q^n / V(ΛC) = |C|, where |C| is the number of codewords of C.
• If q is prime, a linear code C of dimension k ≤ n has q k codewords, and V (ΛC ) = q n−k .
A generator matrix for the lattice ΛC can be computed from that of C. If q = p is prime,
the linear code C of dimension k over Fp is a subspace and has a basis, formed by k vectors.
These k vectors are stacked as row vectors to form a k × n matrix with elements in Fp . In
this case, up to coordinate
permutation, a generator matrix may always be reduced to its
systematic form Ik A , where Ik is the k × k identity matrix, and A is a k × (n − k) matrix.
When q is a composite number, even though we still have a generator matrix, which
contains vectors vi = (vi1 , . . . , vin ) for i = 1, . . . , l (l = k when q = p) that generate C as its
rows, these vectors do not always form a basis and the generator matrix may not always be reduced to its systematic form. But each codeword a ∈ C can be written as a = ∑_{i=1}^{l} a_i v_i, so that

a = ∑_{i=1}^{l} a_i v_i ∈ C ⟺ ρ⁻¹(a) = ∑_{i=1}^{l} a_i v_i + ∑_{i=1}^{n} q h_i e_i ∈ R^n,

where e_i, i = 1, . . . , n, form the canonical basis of R^n and h_1, . . . , h_n ∈ Z.
Stacking the vectors as the rows of an (n + l) × n matrix yields an expanded generator matrix. Since we are working over Z, a generator matrix for the lattice ΛC is given by the n × n full rank matrix H obtained by computing the Hermite normal form (HNF) of the matrix with rows v_1, . . . , v_l, qe_1, . . . , qe_n.
Recall that being in Hermite normal form means that H = [h_ij] is a square matrix satisfying:

• h_ij = 0 for i < j, which means the matrix H is lower triangular, and
• 0 ≤ h_ij < h_ii for i > j, that is, entries are nonnegative and each row has its maximum entry on the diagonal.
If the generator matrix of C can be put in systematic form, then a generator matrix of ΛC is

[ I_l A ; 0_{(n−l)×l} qI_{n−l} ].
The code over F_3 has dimension 2, length n = 3, and has generator matrices

M = [ 2 0 1 ; 0 2 1 ] and R = [ 1 0 2 ; 0 1 2 ].
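As a quick illustration (a sketch with illustrative variable names, not from the text), the systematic generator R above yields the lattice generator matrix [ I_l A ; 0 qI_{n−l} ], whose determinant recovers the lattice volume q^{n−k} = 3:

```python
# Illustrative sketch: lattice generator for Construction A applied to the
# [3, 2] code over F_3 with systematic generator R = [[1,0,2],[0,1,2]].
q, n = 3, 3
R = [[1, 0, 2],
     [0, 1, 2]]
l = len(R)

# Stack [I_l | A] on top of [0 | q*I_{n-l}].
G_lattice = [row[:] for row in R] + \
            [[0] * l + [q * (j == i) for j in range(n - l)] for i in range(n - l)]

def det3(M):
    """Determinant of a 3x3 integer matrix (cofactor expansion)."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
          - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
          + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))
```

Here `det3(G_lattice)` equals 3 = q^{n−k}, matching V(ΛC) = q^{n−k} from the bullet above.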
Famous lattices have been shown to be obtained using Construction A. For example, the linear code

C = { (a_1, . . . , a_{n−1}, ∑_{i=1}^{n−1} a_i) | a_1, . . . , a_{n−1} ∈ F_2 }

of length n and dimension n − 1 gives the lattice ΛC with generator matrix

[ I_{n−1} 1 ; 0_{1×(n−1)} 2 ],

where 1 denotes the all-one column of length n − 1. Thus every vector in ΛC is such that the sum of its entries is even, and we have just constructed the lattices

D_n = { (x_1, . . . , x_n) ∈ Z^n | ∑_{i=1}^{n} x_i is even }.
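A small enumeration (an illustrative sketch, not from the text) confirms that integer combinations of the generator rows of ΛC always have even coordinate sum, e.g., for n = 3:

```python
# Illustrative check: Construction A on the [n, n-1] binary parity code yields D_n.
# For n = 3, every integer combination of the generator rows has even sum.
from itertools import product

gen = [[1, 0, 1],
       [0, 1, 1],
       [0, 0, 2]]  # rows: [I_{n-1} | all-one column] and (0, ..., 0, 2)

for coeffs in product(range(-2, 3), repeat=3):
    v = [sum(c * g[j] for c, g in zip(coeffs, gen)) for j in range(3)]
    assert sum(v) % 2 == 0  # every lattice point lies in D_3
```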
We note that the lattice construction from number fields from the previous section can
in turn be combined with Construction A to obtain further lattices; see, e.g., [1154] and
references therein.
Furthermore, a popular use of Construction A is the construction of lattices which are
particularly structured with respect to their dual; they are called modular lattices (see,
e.g., [115, 236]). We will not give more details on these since they have little application to
coding theory. We will however mention in Section 26.5.2 how the weight enumerator of C
is interpreted in the context of Construction A.
where α_j^{(i)} ∈ {0, 1} and l ∈ Z^n.
The minimum distance condition does not play a role in the following discussion. A
related construction is used in [913] to construct Barnes–Wall lattices from Reed–Muller
codes. It was called Construction D in [1153].
It was shown in [1153] that the set ΓD itself may not necessarily be a lattice. Let us denote by ΛD the smallest lattice that contains ΓD, and by ⋆ the componentwise multiplication (known also as the Schur product or Hadamard product): for x = (x_1, . . . , x_n), y = (y_1, . . . , y_n) ∈ F_2^n, we have

x ⋆ y := (x_1 y_1, . . . , x_n y_n) ∈ F_2^n.
yB = x + zB and yE = x + zE
for the respective received vectors at Bob and Eve. To obtain confidentiality, randomness
is introduced at the transmitter, and coset coding gives a practical way to handle this
randomness.
Let us look again at the lattice ΛC = ρ−1 (C) obtained from a linear code C ⊂ Znq
via Construction A geometrically. It is obtained by considering the lattice qZn and its
translations by the codewords of C. In other words, ρ−1 (C) is the union of cosets of qZn , and
codewords of C form coset representatives. This makes Construction A particularly suitable
for a coding strategy called coset coding, which is used in particular in the context of
wiretap coding.
The secret information is encoded into cosets, while x is then chosen randomly within
this coset. If we consider for example the code {(0, 0), (1, 1)} ⊂ Z22 , one bit of secret can be
transmitted using coset coding: to send 0, choose the coset 2Z2 , and to send 1, choose the
coset 2Z2 + (1, 1). The geometric intuition is that when Bob receives a noisy codeword, his
channel is such that in the radius around his received point, only the codeword that was
sent is present, while Eve, who is assumed to have a stronger noise than Bob, will find in
her radius points from different cosets, such that each coset is equally likely to have been
sent. See [1290] for a practical implementation of coset coding over a USRP testbed.
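The one-bit example above can be sketched in a few lines (illustrative function names; the random draw is a stand-in for the transmitter's random choice within the selected coset):

```python
# Illustrative sketch of coset coding with the code {(0,0), (1,1)} in Z_2^2:
# the secret bit selects the coset of 2Z^2, and the transmitted point is a
# random member of that coset.
import random

def coset_encode(secret_bit, spread=3):
    # Coset representative is (0,0) or (1,1); add a random element of 2Z^2.
    h = [random.randrange(-spread, spread) for _ in range(2)]
    return (secret_bit + 2 * h[0], secret_bit + 2 * h[1])

def coset_decode(y):
    # The secret is the coset, recovered by componentwise reduction mod 2.
    return y[0] % 2

for bit in (0, 1):
    assert coset_decode(coset_encode(bit)) == bit
```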
Coset encoding uses two nested lattices ΛE ⊂ ΛB , where ΛB is the lattice from which
a signal constellation is carved for transmission to Bob, while ΛE is the sublattice used
to partition ΛB . For Construction A, as explained above, ΛB is partitioned using ΛE =
qZn . More general versions of Construction A can be used for the same purpose (see, e.g.,
[661, 1154] and references therein), and one may ask for a design criterion to choose nested
pairs of lattices ΛE ⊂ ΛB that bring the most confidentiality. The (strong) secrecy gain χ_{Λ,strong} of an n-dimensional lattice Λ, defined by

χ_{Λ,strong} = sup_{y>0} Θ_{νZ^n}(y) / Θ_Λ(y),

was proposed as one design criterion in [1461], where Θ_Λ is the theta series of Λ [442] defined by

Θ_Λ(z) = ∑_{x∈Λ} q^{‖x‖²} for q = e^{iπz}, Im(z) > 0

(we set y = −iz). Note the normalization factor ν, which ensures that both Z^n and Λ have the same volume, for a fair comparison. The theta series of an integral lattice keeps track of the different norms of lattice points. The coefficient N(m) of q^m in this series tells how many points in the lattice are at squared distance m from the origin. This series always starts with 1, corresponding to the zero vector. The second term corresponds to the squared minimum norm λ², and thus the coefficient N(λ²) of q^{λ²} is the kissing number κ of the lattice.
The theta series of a lattice is not that easy to compute in full generality. If the lat-
tice is obtained using Construction A over F2 , then ΘΛC (z) = HweC (θ3 (2z), θ2 (2z)) where
HweC (x, y) is the Hamming weight enumerator of C and θ2 , θ3 are Jacobi theta functions
[442].
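The coefficient interpretation of the theta series can be checked by brute force for a small lattice; the sketch below (an illustration, not tied to any particular construction in the text) counts the points of Z² by squared norm:

```python
# Illustrative sketch: leading theta-series coefficients N(m) of Z^2, obtained by
# enumerating lattice points in a box. N(0) = 1 (the zero vector) and
# N(lambda^2) = N(1) = 4 recovers the kissing number of Z^2.
from collections import Counter

R = 5  # enumerate points with coordinates in [-R, R]; exact for small m
counts = Counter()
for x in range(-R, R + 1):
    for y in range(-R, R + 1):
        counts[x * x + y * y] += 1
```

Here `counts[1] == 4` (the four minimum-norm points (±1, 0), (0, ±1)) and `counts[2] == 4` (the points (±1, ±1)).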
The role of the theta series Θ_{ΛE} at the point y = 1/(2πσ_E²) has been independently confirmed in [1266], where it was shown for the mod-Λ Gaussian channel that the mutual information I(S; Z), an information theoretic measure of the amount of information that Eve gets about the secret message S by receiving Z, is bounded by a function that depends on the channel parameters and on Θ_{ΛE}(1/(2πσ_E²)).
Chapter 27
Quantum Error-Control Codes
27.1 Introduction
Information is physical. It is sensible to use quantum mechanics as a basis of computation
and information processing [724]. Here at the intersection of information theory, computing,
and physics, mathematicians and computer scientists must think in terms of the quantum
physical realizations of messages. The often philosophical debates among physicists over
the nature and interpretations of quantum mechanics shift to harnessing its power for
information processing and testing the theory for completeness.
One cannot directly access information stored and processed in massively entangled
quantum systems without destroying the content. Turning large-scale quantum computing
into practical reality is massively challenging. To start with, it requires techniques for error
control that are much more complex than those implemented effectively in classical systems.
As a quantum system grows in size and circuit depth, error control becomes ever more
important.
Quantum error-control is a set of methods to protect quantum information from
unwanted environmental interactions, known as decoherence. Classically, one encodes
information-carrying vectors into a larger space to allow for sufficient redundancy for error
detection and correction. In the quantum setup, information is stored in a subspace embed-
ded in a larger Hilbert space, which is a finite dimensional, normed, vector space over the
field of complex numbers C. Codewords are quantum states and errors are operators.
The good news is that noise, if it can be kept below a certain level, is not an obstacle to
resilient quantum computation. This crucial insight is arrived at based on seminal results
that form the so-called threshold theorems. Theoretical references include the exposition of
Knill et al. in [1131], the work of Preskill in [1542], the results shown by Steane in [1751], and
the paper of Aharonov and Ben-Or [24]. A comprehensive review on related experiments is
available in [334].
The possibility of correcting errors in quantum systems was shown, e.g., in the early
works of Shor [1695], Steane [1749], and Laflamme et al. [1187]. While the quantum codes
that these pioneers proposed may nowadays seem to be rather trivial in performance, their construction highlighted both the main obstacles and their respective workarounds. Measurement collapses the information contained in the state into something useless. One should measure the error, not the data. Since repetition is ruled out due to the No-Cloning Theorem [1907], we use redundancy from spreading the states to avoid repetition.
multiple types of errors, as we will soon see. The key is to start by correcting the phase
errors, and then use the Hadamard transform to exchange the bit flips and the phase er-
rors. Quantum errors are continuous. Controlling them seemed to be too daunting a task. It
turned out that handling a set of discrete error operators, represented by the tensor product
of Pauli matrices, allows for the control of every C-linear combination of these operators.
Advances continue to be made as effort intensifies to scale quantum computing up.
Research in quantum error-correcting codes (QECs) has attracted the sustained at-
tention of established researchers and students alike. Several excellent online lecture notes,
surveys, and books are available. Developments up to 2011 have been well-documented
in [1252]. It is impossible to describe the technical details of every important research direc-
tion in QECs. We focus on quantum stabilizer codes and their variants. The decidedly
biased take here is for the audience with more applied mathematics background, including
coding theorists, information theorists, and researchers in discrete mathematics and finite
geometries. No knowledge of quantum mechanics is required beyond the very basic. This
chapter is meant to serve as an entry point for those who want to understand and get involved in building upon this foundational aspect of quantum information processing and computation, which has been proven to be indispensable for future technologies.
A quantum stabilizer code is designed so that errors with a high probability of occurring transform information-carrying states to an error space which is orthogonal to the original space. The beauty lies in how naturally the location and type of each error in the system can be determined. Correction then becomes a simple application of that type of error at that very location.
27.2 Preliminaries
Consider the field extensions F_p ⊆ F_q ⊆ F_{q^m}, where q = p^r, for a prime p and positive integers r and m. For α ∈ F_{q^m}, the trace mapping from F_{q^m} to F_q is given by

Tr_{F_{q^m}/F_q}(α) = α + α^q + · · · + α^{q^{m−1}} ∈ F_q.
The trace of α is the sum of its conjugates. If the extension Fq of Fp is contextually clear,
the notation Tr is sufficient. Properties of the trace map can be found in standard textbooks,
e.g., [1254, Chapter 2]. The key idea is that TrFqm /Fq serves as a description for all linear
transformations from Fqm to Fq .
Let G be a finite abelian group, for now written multiplicatively, with identity 1G . Let U
be the multiplicative group of complex numbers of modulus 1, i.e., the unit circle of radius
1 on the complex plane C. A character χ : G → U is a homomorphism. For any g ∈ G, the images of χ are |G|th roots of unity since (χ(g))^{|G|} = χ(g^{|G|}) = χ(1_G) = 1. Let c̄ denote the complex conjugate of c ∈ C. Then χ(g⁻¹) = (χ(g))⁻¹, which is the complex conjugate of χ(g). The trivial character is χ_0 : g ↦ 1 for all g ∈ G. One can associate to χ its conjugate character χ̄ by setting χ̄(g) equal to the complex conjugate of χ(g). The set of all characters of G forms, under function multiplication, an abelian group Ĝ.
χ, Ψ ∈ Ĝ we have two orthogonality relations:

∑_{g∈G} χ(g) Ψ̄(g) = 0 if χ ≠ Ψ, and |G| if χ = Ψ;   ∑_{χ∈Ĝ} χ(g) χ(h⁻¹) = 0 if g ≠ h, and |G| if g = h.   (27.1)
Consider the additive abelian group (F_q, +). The additive character χ_1 : (F_q, +) → U given by χ_1 : c ↦ e^{(2πi/p) Tr(c)}, for all c ∈ F_q, with Tr = Tr_{F_q/F_p}, is called the canonical character of (F_q, +). For a chosen b ∈ F_q and for all c ∈ F_q,

χ_b : F_q → U, where c ↦ χ_1(b · c) = e^{(2πi/p) Tr(b·c)},

is a character of (F_q, +). Every character of (F_q, +) can, in fact, be expressed in this manner. The extension to (F_q^n, +) is straightforward.
Theorem 27.2.1 Let ζ := e^{2πi/p} and Tr = Tr_{F_q/F_p} be the trace map with q = p^m. Let a = (a_1, a_2, . . . , a_n) and b = (b_1, b_2, . . . , b_n) be vectors in F_q^n. For each a,
A qubit, a term coined by Schumacher in [1635], is the canonical quantum system consist-
ing of two distinct levels. The states of a qubit live in C2 and are defined by their continuous
amplitudes. A qudit refers to a system of q ≥ 3 distinct levels, with a qutrit commonly used
when q = 3. Physicists prefer the “bra” h·| and “ket” |·i notation to describe quantum
systems. A |ϕi is a (column) vector while hψ| is the vector dual of |ψi.
An arbitrary nonzero vector in C^{2^n} is written

|ψ⟩ = ∑_{a∈F_2^n} c_a |a⟩ with c_a ∈ C and (1/2^n) ∑_{a∈F_2^n} ‖c_a‖² = 1.

The normalization is optional since |ψ⟩ and α|ψ⟩ are considered the same state for nonzero α ∈ C. The inner product of |ψ⟩ := ∑_{a∈F_2^n} c_a |a⟩ and |ϕ⟩ := ∑_{a∈F_2^n} b_a |a⟩ is

⟨ψ|ϕ⟩ = ∑_{a∈F_2^n} c̄_a b_a.
Their (Kronecker) tensor product is written as |ϕi ⊗ |ψi and is often abbreviated to
|ϕψi. The states |ψi and |ϕi are orthogonal or distinguishable if hψ|ϕi = 0. Let A be
a 2n × 2n complex unitary matrix with conjugate transpose A† . The (Hermitian)
√ inner
product of |ϕi and A|ψi is equal to that of A† |ϕi and |ψi. Henceforth, i := −1.
Definition 27.2.3 A qubit error operator is a unitary C-linear operator acting on C^{2^n} qubit by qubit. It can be expressed by a unitary matrix with respect to the basis {|0⟩, |1⟩}. The three nontrivial errors acting on a qubit are known as the Pauli matrices:

σ_x = [ 0 1 ; 1 0 ], σ_z = [ 1 0 ; 0 −1 ], σ_y = i σ_x σ_z = [ 0 −i ; i 0 ].
The actions of the error operators on a qubit |ϕ⟩ = α|0⟩ + β|1⟩ ∈ C² can be considered based on their types. The trivial operator I_2 leaves the qubit unchanged. The bit-flip error σ_x exchanges the amplitudes:

σ_x|ϕ⟩ = β|0⟩ + α|1⟩, i.e., σ_x (α, β)ᵀ = (β, α)ᵀ.

The phase-flip error σ_z changes the relative phase:

σ_z|ϕ⟩ = α|0⟩ − β|1⟩, i.e., σ_z (α, β)ᵀ = (α, −β)ᵀ.

The combination error σ_y contains both a bit-flip and a phase-flip:

σ_y|ϕ⟩ = −iβ|0⟩ + iα|1⟩, i.e., σ_y (α, β)ᵀ = (−iβ, iα)ᵀ.
It is immediate to confirm that σx2 = σy2 = σz2 = I2 and σx σz = −σz σx . The Pauli matrices
generate a group of order 16. Each of its elements can be uniquely represented as iλ w, with
λ ∈ {0, 1, 2, 3} and w ∈ {I2 , σx , σz , σy }.
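These relations are easy to verify directly; here is a minimal sketch (added for illustration) using plain 2×2 complex matrices:

```python
# Illustrative check of sigma_x^2 = sigma_y^2 = sigma_z^2 = I_2 and the
# anticommutation relation sigma_x sigma_z = -sigma_z sigma_x.
def mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

I2 = [[1, 0], [0, 1]]
sx = [[0, 1], [1, 0]]
sz = [[1, 0], [0, -1]]
sy = [[1j * v for v in row] for row in mul(sx, sz)]  # sigma_y = i sigma_x sigma_z

assert mul(sx, sx) == I2 and mul(sz, sz) == I2 and mul(sy, sy) == I2
assert mul(sx, sz) == [[-v for v in row] for row in mul(sz, sx)]
```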
Example 27.3.2 Continuing from Example 27.3.1, we write E = X((0, 1))Z((0, 0)) and
E0 = iX((0, 1))Z((1, 1)). We choose the ordering (0, 0), (0, 1), (1, 0), (1, 1) of F22 and the
corresponding ordering for the basis of C4 . The matrix for X((0, 1)) agrees with I2 ⊗ σx ,
the matrix for Z((0, 0)) is I4 , and the matrix for Z((1, 1)) is diagonal with diagonal entries
1, −1, −1, 1. Multiplying matrices confirms that σz ⊗ σy is indeed iX((0, 1))Z((1, 1)).
The quantum weight of an error operator E = i^λ X(a)Z(b) ∈ E_n is

wt_Q(E) := wt_Q(Ē) = wt_Q(a|b) = |{i | a_i = 1 or b_i = 1, 1 ≤ i ≤ n}| = |{i | w_i ≠ I_2, 1 ≤ i ≤ n}|.

By definition, wt_Q(E E′) ≤ wt_Q(E) + wt_Q(E′), for any E, E′ ∈ E_n. We can define the set of all error operators of weight at most δ in E_n and determine its cardinality. Let

E_n(δ) := {E ∈ E_n | wt_Q(E) ≤ δ} and Ē_n(δ) := {Ē ∈ Ē_n | wt_Q(Ē) ≤ δ}.

Then |E_n(δ)| = 4 ∑_{j=0}^{δ} 3^j (n choose j) and |Ē_n(δ)| = ∑_{j=0}^{δ} 3^j (n choose j).
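The counting formula can be checked by enumeration (an illustrative sketch): words over {I, X, Z, Y} of length n grouped by weight, then multiplied by the 4 possible global phases i^λ.

```python
# Illustrative check of |E_n(delta)| = 4 * sum_{j<=delta} 3^j * C(n, j):
# enumerate all tensor products of {I, X, Z, Y} and count those of weight <= delta.
from itertools import product
from math import comb

n, delta = 5, 1
direct = sum(1 for w in product("IXZY", repeat=n)
             if sum(c != "I" for c in w) <= delta)
formula = sum(3**j * comb(n, j) for j in range(delta + 1))
assert direct == formula       # operators counted up to phase
assert 4 * formula == 64       # including the 4 global phases i^lambda
```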
In the classical setup, both errors and codewords are vectors over the same field. In
the quantum setup, errors are linear combinations of the tensor products of Pauli matrices. A qubit code Q ⊆ C^{2^n} has three parameters: its length n, dimension K over C, and minimum distance d = d(Q). We use
((n, K, d)) or Jn, k, dK with k = log2 K
to signify that Q describes the encoding of k logical qubits as n physical qubits, with d
being the smallest number of simultaneous errors that can transform a valid codeword into
another.
The third equality comes from collecting terms that contain only a and only g. By the first
orthogonality relation in (27.1), one arrives at
(
1 X 0 0 if χ 6= χ0 ,
Lχ Lχ0 = χ (a)a =
|G|
a∈G
Lχ if χ = χ0 .
We verify the second assertion by using the second orthogonality relation in (27.1). Since
χ(1) = 1, we obtain
X 1 XX 1 X X
Lχ = χ(g) g = g χ(g) χ(1) = 1.
|G| |G|
χ∈G
b b g∈G
χ∈G g∈G χ∈G
b
Using gh = a, the definition of L_χ, and the equality χ(g⁻¹) = χ̄(g), one gets

g L_χ = (1/|G|) ∑_{h∈G} χ(h) gh = (1/|G|) ∑_{a∈G} χ(ag⁻¹) a = χ(g⁻¹) (1/|G|) ∑_{a∈G} χ(a) a = χ̄(g) L_χ.
Proposition 27.3.5 For each χ ∈ Ĝ, let V(χ) := L_χ V = {L_χ(|v⟩) | |v⟩ ∈ V}. For |v⟩ ∈ V(χ) and g ∈ G, we have g|v⟩ = χ̄(g)|v⟩. Thus V(χ) is a common eigenspace of all operators in G. A direct decomposition V = ⊕_{χ∈Ĝ} V(χ) ensures that each |v⟩ ∈ V has a unique expression

|v⟩ = ∑_{χ∈Ĝ} |v⟩_χ where |v⟩_χ ∈ V(χ).
Proof: For |v⟩ ∈ V(χ) and g ∈ G, there exists |w⟩ ∈ V such that |v⟩ = L_χ(|w⟩). Then

g|v⟩ = g L_χ|w⟩ = χ̄(g) L_χ|w⟩ = χ̄(g)|v⟩,

confirming the first assertion, where the second equality follows from Proposition 27.3.4.
Using Proposition 27.3.4 again, if |v⟩ ∈ V, we write |v⟩ = 1|v⟩ = ∑_{χ∈Ĝ} L_χ|v⟩ = ∑_{χ∈Ĝ} |v⟩_χ, where |v⟩_χ := L_χ(|v⟩) ∈ V(χ). On the other hand, if |v⟩ = ∑_{χ∈Ĝ} |u⟩_χ for |u⟩_χ ∈ V(χ), then |u⟩_χ = L_χ(|w⟩_χ) for some |w⟩_χ ∈ V. Since {L_χ | χ ∈ Ĝ} has the orthogonality property, for every χ ∈ Ĝ we have

|v⟩_χ = L_χ(|v⟩) = L_χ( ∑_{χ′∈Ĝ} |u⟩_{χ′} ) = ∑_{χ′∈Ĝ} L_χ L_χ′(|w⟩_{χ′}) = L_χ(|w⟩_χ) = |u⟩_χ.

Thus, V = ⊕_{χ∈Ĝ} V(χ).
It is also a well-known fact that V(χ) and V(χ′) are Hermitian orthogonal for all χ ≠ χ′ ∈ Ĝ when all g ∈ G are unitary linear operators on V.
All the tools to connect qubit stabilizer codes to classical codes are now in place. We
choose G := hg1 , g2 , . . . , gk i to be an abelian subgroup of En with gj := iλj X(aj ) Z(bj )
for 1 ≤ j ≤ k, where aj , bj ∈ Fn2 and λj ≡ aj · bj (mod 2). Since σx and σz are Hermitian
unitary matrices, X(aj ) and Z(bj ) are also Hermitian matrices. The basis element gj is
Hermitian since

g_j† = ḡ_jᵀ = (−i)^{λ_j} Z(b_j)ᵀ X(a_j)ᵀ = (−i)^{λ_j} Z(b_j) X(a_j) = i^{λ_j} (−1)^{a_j·b_j} (−1)^{a_j·b_j} X(a_j) Z(b_j) = i^{λ_j} X(a_j) Z(b_j) = g_j.
This implies E|v⟩ ∈ Q(χ′) and E : Q(χ) → Q(χ′), where χ′ := χ_E χ. Since E_n is a group, E is a bijection, making dim_C Q(χ) = dim_C Q(χ′). As E runs through E_n, χ_E takes all characters of G, ensuring that dim_C Q(χ) is the same for all χ ∈ Ĝ. Thus, dim_C Q(χ) = 2^{n−(n−k)} = 2^k for any χ ∈ Ĝ, using Proposition 27.3.5.
We now show that, if E ∈ Ē_{d−1} and |v_1⟩, |v_2⟩ ∈ Q(χ) with ⟨v_1|v_2⟩ = 0, then ⟨v_1|E_1 E_2|v_2⟩ = 0, where E := E_1 E_2. If Ē ∈ C, then ⟨v_1|E|v_2⟩ = χ(E)⟨v_1|v_2⟩ = 0. Otherwise, Ē ∉ C. From wt_Q(E) = wt_Q(Ē) ≤ d − 1 and the assumption (C^{⊥s} \ C) ∩ Ē_{d−1} = ∅, we know Ē ∉ C^{⊥s}. Hence, there exists E′ ∈ G such that E E′ = −E′ E. Then, for |v_2⟩ ∈ Q(χ), we have

E′ E|v_2⟩ = −E E′|v_2⟩ = −χ(E′) E|v_2⟩ with −χ(E′) ≠ χ(E′).

Therefore, E|v_2⟩ ∈ Q(χ′) for some χ′ ≠ χ. Since |v_1⟩ ∈ Q(χ) and Q(χ) is orthogonal to Q(χ′), we confirm ⟨v_1|E|v_2⟩ = 0.
One reads v1 = (a|b) as having a = (1, 1, 0, 0, 0) and b = (0, 0, 1, 0, 1). The code C is
symplectic self-orthogonal, with dimF2 C = 4 and dimF2 C ⊥ = 6; i.e., the codimension
is 2. To extend the basis for C to a basis for C ⊥s we use (0, 0, 0, 0, 0, 1, 1, 1, 1, 1) and
(1, 1, 1, 1, 1, 0, 0, 0, 0, 0). Since wQ (C) = 4 and wQ (C ⊥s ) = 3, one obtains wQ (C ⊥s \ C) = 3.
We can write the J5, 1, 3K code Q = Q(χ0 ) explicitly by using G = hg1 , g2 , g3 , g4 i, with
g1 = σx ⊗ σx ⊗ σz ⊗ I2 ⊗ σz , g2 = σz ⊗ σx ⊗ σx ⊗ σz ⊗ I2 , g3 = I2 ⊗ σz ⊗ σx ⊗ σx ⊗ σz , and
g4 = σz ⊗ I2 ⊗ σz ⊗ σx ⊗ σx .
Since k = 1 and dim_C Q = 2^k = 2, two independent vectors in C^{32} form a basis of Q = { |v⟩ ∈ C^{32} | g|v⟩ = χ_0(g)|v⟩ for all g ∈ G }. Q consists of the vectors which are fixed by all g ∈ G. After some computation, we conclude that Q can be generated by |v_0⟩ = ∑_{g∈G} g|00000⟩ and |v_1⟩ = ∑_{g∈G} g|11111⟩.
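As a check (a sketch added here; the (a|b) pairs simply transcribe the tensor products above), the generators g_1, . . . , g_4 pairwise commute because their symplectic inner products vanish:

```python
# Illustrative sketch: generators of the J5,1,3K stabilizer as binary pairs
# (a|b) -- X-part a, Z-part b. Two such operators commute iff
# <(a|b),(a'|b')>_s = a.b' + a'.b = 0 mod 2.
gens = [
    ((1, 1, 0, 0, 0), (0, 0, 1, 0, 1)),  # g1 = sx sx sz I  sz
    ((0, 1, 1, 0, 0), (1, 0, 0, 1, 0)),  # g2 = sz sx sx sz I
    ((0, 0, 1, 1, 0), (0, 1, 0, 0, 1)),  # g3 = I  sz sx sx sz
    ((0, 0, 0, 1, 1), (1, 0, 1, 0, 0)),  # g4 = sz I  sz sx sx
]

def symp(u, v):
    (a, b), (ap, bp) = u, v
    return (sum(x * y for x, y in zip(a, bp))
            + sum(x * y for x, y in zip(ap, b))) % 2

assert all(symp(g, h) == 0 for g in gens for h in gens)  # abelian stabilizer
```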
With minor modifications, the qubit stabilizer formalism extends to the general qudit case. A complete treatment is available in [1109]. We outline the main steps here. An n-qudit system is a nonzero element in (C^q)^{⊗n} ≅ C^{q^n}. Let a = (a_1, . . . , a_n) ∈ F_q^n. The standard C-basis is {|a⟩ | a ∈ F_q^n}, and an arbitrary vector in C^{q^n} is written |ψ⟩ = ∑_{a∈F_q^n} c_a |a⟩ with c_a ∈ C such that q^{−n} ∑_{a∈F_q^n} ‖c_a‖² = 1.
Let a = (a_1, . . . , a_n), b = (b_1, . . . , b_n) ∈ F_q^n, and ω := e^{2πi/p}, where q = p^m with p a prime. The error operators form E_n := {ω^β X(a)Z(b) | a, b ∈ F_q^n, β ∈ F_p} of cardinality pq^{2n}. The respective actions of X(a) and Z(b) on |v⟩ ∈ C^{q^n} are X(a)|v⟩ = |a + v⟩ and Z(b)|v⟩ = ω^{Tr(b·v)}|v⟩, where Tr = Tr_{F_q/F_p}. Hence, for E := ω^β X(a)Z(b) and E′ := ω^{β′} X(a′)Z(b′) in E_n, one gets E E′ = ω^{Tr(b·a′−b′·a)} E′ E. The symplectic weight of (a|b) is the quantum weight of E.
The (trace) symplectic inner product of (a|b) and (a′|b′) in F_q^{2n}, generalizing (27.2), is

⟨(a|b), (a′|b′)⟩_s = Tr(b · a′ − b′ · a).

The symplectic dual of C ⊆ F_q^{2n} is C^{⊥s} = {u ∈ F_q^{2n} | ⟨u, c⟩_s = 0 for all c ∈ C}. As in the qubit case, in the general qudit setup, a subgroup G of E_n is abelian if and only if Ḡ is a symplectic self-orthogonal subspace of Ē_n ≅ F_q^{2n}. The analog of Theorem 27.3.6 follows.
code Q. The code is pure whenever min {wtH (C2 \ C1⊥E ), wtH (C1 \ C2⊥E )} = min {d1 , d2 }.
Proof: Let G_j and H_j be the generator and parity check matrices of C_j. Consider the linear code C ⊆ F_q^{2n} with generator matrix [ H_1 0 ; 0 H_2 ]. Since C_1^{⊥E} ⊆ C_2, we have H_1 H_2ᵀ = 0. Similarly, from C_2^{⊥E} ⊆ C_1, we know H_2 H_1ᵀ = 0. Define C^{⊥s} to be the code with parity check and generator matrices, respectively, [ H_2 0 ; 0 H_1 ] and [ G_2 0 ; 0 G_1 ]. We verify that C ⊆ C^{⊥s} with dim_{F_q} C = 2n − (k_1 + k_2). By Theorem 27.3.8, dim_C Q = n − dim_{F_q} C = k_1 + k_2 − n. The distance computation is clear.
A special case of the CSS Construction comes via a Euclidean dual-containing code
C ⊥E ⊆ C. From such an [n, k, d]q code C, one obtains an Jn, 2k − n, ≥ dKq code Q. The next
method allows for most qubit CSS codes to be enlarged while avoiding a significant drop
in the distance. The choice of the extra vectors in the generator matrix of C 0 is detailed
in [1750, Section III].
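A standard instance of this special case (a sketch; the particular generator below is one common choice of the [7, 4, 3] Hamming code, not taken from the text) is the Steane code: the Hamming code contains its Euclidean dual, so the construction yields a J7, 2·4 − 7, ≥3K = J7, 1, 3K code.

```python
# Illustrative sketch: a [7,4,3] Hamming code C with generator [I_4 | A] contains
# its Euclidean dual, which is generated by H = [A^T | I_3]. Dual containment
# C^{perp_E} subseteq C is equivalent to H H^T = 0 over F_2.
A = [[0, 1, 1],
     [1, 0, 1],
     [1, 1, 0],
     [1, 1, 1]]
H = [list(col) + [int(i == j) for j in range(3)]
     for i, col in enumerate(zip(*A))]  # H = [A^T | I_3]

HHt = [[sum(x * y for x, y in zip(r, s)) % 2 for s in H] for r in H]
assert all(v == 0 for row in HHt for v in row)  # dual-containing
n, k = 7, 4
assert 2 * k - n == 1  # one logical qubit for the resulting CSS code
```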
Theorem 27.4.3 (Steane Enlargement of CSS Codes) Let C be an [n, k, d]_2 code that contains its Euclidean dual C^{⊥E} ⊆ C. Suppose that C can be enlarged to an [n, k′ > k + 1, d′]_2 code C′. Then there is a pure qubit code with parameters Jn, k + k′ − n, min{d, ⌈3d′/2⌉}K.
A generalization to the qudit case was subsequently given in [1267], where the distance is min{d, ⌈((q+1)/q) d′⌉}. Comparing the minimum distances in the resulting codes, the enlargement offers a better chance of relative gain in the qubit case as compared with the q > 2 cases.
Lisoněk and Singh, inspired by the classical Construction X, proposed a modification to
qubit stabilizer codes in [1273]. The construction generalizes naturally to qudit codes. Here
C ⊥H is the Hermitian dual of C.
Theorem 27.4.4 (Quantum Construction X) For an [n, k]q2 linear code C, let e :=
k − dim(C ∩ C ⊥H ). Then there exists an Jn + e, n − 2k + e, dKq code Q, with d := d(Q) ≥
min {dH (C ⊥H ), dH (C + C ⊥H ) + 1}, where C + C ⊥H := {u + v | u ∈ C, v ∈ C ⊥H }.
The case e = 0 is the usual stabilizer construction. To prevent a sharp drop in d, we want
small e, i.e., large Hermitian hull C ∩ C ⊥H .
We shift our attention now to propagation rules and bounds. Most of them are direct
consequences of the propagation rules and bounds on the classical codes used as ingredients
in the above constructions.
Proposition 27.4.5 (see [331, Theorem 6] for the binary case) From an Jn, k, dKq
code, the following codes can be derived: an Jn, k − 1, ≥ dKq code by subcode construction
if k > 1 or if k = 1 and the initial code is pure, an Jn + 1, k, ≥ dKq code by lengthening if
k > 0, and an Jn − 1, k, ≥ d − 1Kq code by puncturing if n ≥ 2.
The analog of shortening is less straightforward. It requires the construction of an auxiliary
code and, then, a check on whether this code has codewords of a given length. The details
on how to shorten quantum codes are available in [849, Section 4], building upon the initial
idea of Rains in [1554].
How can we measure the goodness of a QEC? For stabilizer codes, given their classical
ingredients and constructions, there are numerous bounds.
Theorem 27.4.6 (Quantum Hamming Bound; see [331] for the binary case) Let Q be a pure Jn, k, dKq code with d ≥ 2ℓ + 1 and k > 0. Then

    q^{n−k} ≥ ∑_{j=0}^{ℓ} \binom{n}{j} (q² − 1)^j.

Q is perfect if it meets the bound. The proof comes from the observation that

    q^n ≥ dim_C( ∑_{E∈E_n(ℓ)} E Q ) = dim_C(Q) · |E_n(ℓ)| = q^k ∑_{j=0}^{ℓ} \binom{n}{j} (q² − 1)^j.
The code in Example 27.3.7 is perfect: it meets the bound with equality, as 2^{n−k} = 16 = ∑_{j=0}^{1} \binom{5}{j} 3^j = 1 + 15.
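The bound is easy to check mechanically; a small sketch (the function name is ours) that also confirms the perfect [[5,1,3]] code:

```python
from math import comb

def quantum_hamming_ok(n, k, d, q=2):
    """Check the Quantum Hamming Bound for a pure [[n,k,d]]_q code with k > 0.
    Returns (bound holds?, left-hand side, right-hand side)."""
    ell = (d - 1) // 2                     # code corrects ell errors
    rhs = sum(comb(n, j) * (q * q - 1) ** j for j in range(ell + 1))
    return q ** (n - k) >= rhs, q ** (n - k), rhs

ok, lhs, rhs = quantum_hamming_ok(5, 1, 3)   # the [[5,1,3]]_2 code
print(ok, lhs, rhs)   # True 16 16 -- met with equality, so the code is perfect
```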
Here is another bound, which has been established as a necessary condition for pure stabilizer codes.
An upper bound, well-suited for computer search, is the Quantum Linear Programming (LP) Bound; see Theorem 12.5.3. In the qubit case, the bound is explained in
detail in [331, Section VII]. The same routine adjusts immediately to the general qudit
case, as was shown in [1109, Section VI]. The main tool is the MacWilliams Identities of
Theorem 1.15.3 (and [1319]); these are linear relations between the weight distribution of a
classical code and its dual. They hold for all of the inner products we are concerned with
here and have been very useful in ruling out the existence of quantum codes of certain
ranges of parameters.
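To illustrate how the MacWilliams Identities tie the weight distribution of a classical code to that of its dual, here is a sketch of the standard binary (Euclidean) transform via Krawtchouk polynomials; the test codes below are our own illustrative choices, not from the chapter:

```python
from math import comb

def macwilliams_binary(A):
    """Weight distribution of the dual of a binary code, from that of the code.
    A[i] = number of codewords of weight i; n = len(A) - 1."""
    n = len(A) - 1
    size = sum(A)                        # |C|
    def kraw(j, i):                      # binary Krawtchouk polynomial K_j(i)
        return sum((-1) ** s * comb(i, s) * comb(n - i, j - s)
                   for s in range(j + 1))
    # B_j = |C|^{-1} sum_i A_i K_j(i); the division is always exact.
    return [sum(A[i] * kraw(j, i) for i in range(n + 1)) // size
            for j in range(n + 1)]

# C = {0000, 1100, 0011, 1111} is self-dual, so the transform fixes A:
print(macwilliams_binary([1, 0, 2, 0, 1]))   # [1, 0, 2, 0, 1]
# The length-4 repetition code {0000, 1111} has the even-weight code as dual:
print(macwilliams_binary([1, 0, 0, 0, 1]))   # [1, 0, 6, 0, 1]
```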
Rains supplied a nice proof for the next bound, which is a corollary to the Quantum LP
Bound, using the quantum weight enumerator in [1554]. A quantum code that reaches the
equality in the bound is said to be quantum MDS (QMDS).
Theorem 27.4.8 (Quantum Singleton Bound) An Jn, k, dKq code with k > 0 satisfies
k ≤ n − 2d + 2.
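Whether given parameters are QMDS is a one-line check of the Singleton defect n − 2d + 2 − k (the helper name is ours); the [[6,4,2]] example below is the m = 3 member of the J2m, 2m − 2, 2K family mentioned later in this section:

```python
def quantum_singleton_defect(n, k, d):
    """Singleton defect n - 2d + 2 - k of an [[n,k,d]]_q code with k > 0;
    the Quantum Singleton Bound says it is >= 0, with 0 exactly for QMDS."""
    return n - 2 * d + 2 - k

print(quantum_singleton_defect(5, 1, 3))   # 0: the [[5,1,3]] code is QMDS
print(quantum_singleton_defect(6, 4, 2))   # 0: [[6,4,2]] is QMDS
```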
Nearly all known families of classical codes over finite fields, especially those with well-
studied algebraic and combinatorial structures, have been used in each of the constructions
above. A partial list, compiled in mid 2005 as [1109, Table II], already showed a remarkable
breadth. The large family of cyclic-type codes, whose corresponding structures in the rings
of polynomials are ideals or modules, has been a rich source of ingredients for QECs with
excellent parameters. This includes the BCH, cyclic, constacyclic, quasi-cyclic, and quasi-
twisted codes; see, e.g., Chapter 9. In the family, the nestedness property, very useful in
the CSS construction, comes for free. A great deal is known about their dual codes under
numerous applicable inner products. For small q, the structures allow for extensive computer
algebra searches for reasonable lengths, aided by their minimum distance bounds.
The most comprehensive record for best-known qubit codes is Grassl’s online table [847].
Numerous entries have been certified optimal, while still more entries indicate gaps between
the best-possible and the best-known. It is a two-fold challenge to contribute meaningfully
to the table. First, for n ≤ 100, many researchers have attempted exhaustive searches. Better codes are unlikely to be found without additional clever strategies. Second, for n > 100, computing the actual distance d(Q) tends to be prohibitive. As the length and dimension grow, computing the minimum distances of the relevant classical codes to derive the quantum distance is hard [1843]. Improvements remain possible with targeted searches. Recent examples include the works of Galindo et al. on quasi-cyclic constructions of quantum codes [786], where Steane enlargement is deployed, the search reported in [1273] on cyclic codes over F4, where Construction X is used with e ∈ {1, 2, 3}, and similar random searches on quasi-cyclic codes done in [699] for qubit and qutrit codes.
Less attention has been given to record-holding qutrit codes, for which there is no
publicly available database of comparative extent. A table listing numerous qutrit codes
is kept by Y. Edel in [662] based on their explicit construction as quantum twisted codes
in [201]. Better codes than many of those in the table have since been found.
Attempts to derive new quantum codes by shortening good stabilizer codes motivate
closer studies on the weight distribution of the classical auxiliary codes, in particular when
the stabilizer codes are QMDS. Shortening is very effective in constructing qudit MDS codes
of lengths up to q 2 + 2 and minimum distances up to q + 1.
There has been a large literature on QMDS codes. All of the above constructions via
classical codes as well as the propagation rules have been applied to families of classi-
cal MDS codes, particularly Generalized Reed–Solomon (see Chapter 24) and constacyclic
MDS codes. Since the dual of an MDS code is MDS, the dual distance is evident, leaving
only the orthogonality property to investigate. While the theoretical advantages are clearly
abundant, there are practical limitations. The length of such codes is bounded above by
q 2 + 2, when q is even, and by q 2 + 1, when q is odd, assuming the MDS Conjecture holds;
see Conjecture 3.3.21 of Chapter 3.
For qubit codes, the only nontrivial QMDS codes are those with parameters J5, 1, 3K,
J6, 0, 4K, and J2m, 2m − 2, 2K. As q grows larger, the practical value of QMDS codes quickly
diminishes, since controlling qudit systems with q > 3 is currently prohibitive. A list for
q-ary QMDS codes, with 2 ≤ q ≤ 17, is available in [850]. Another list that covers families
of QMDS codes and their references can be found in [391, Table V]. More works on QMDS
codes continue to appear, with detailed analysis on the self-orthogonality conditions supplied
from number theoretic and combinatorial tools.
Taking algebraic geometry (AG) codes, examined in Chapter 15, as the classical ingre-
dients is another route. A wealth of favourable results had already been available prior to
the emergence of QECs. Curves with many rational points often lead, via the Goppa con-
struction, to codes with good parameters. Their duals are well-understood, via the residue
formula. Their designed distances can be computed from the Riemann–Roch Theorem.
Chen et al. showed how to combine Steane enlargement and concatenated AG codes to
derive excellent qubit codes in [395]. A quantum asymptotic bound was established in [714].
Construction of QECs from AG codes was initially a very active line of inquiry. It has somewhat lessened in the last decade, mostly due to the lack of practical value as q grows.
Using codes over rings to construct QECs has also been tried. This route, however, does
not usually lead to parameter improvements over QECs constructed from codes over fields.
The absence of a direct connection from codes over rings to QECs necessitates the use of a
Gray map, which often causes a significant drop in the minimum distance.
profile, which can be hardware-specific, is crucial. The study of asymmetric quantum codes (AQCs) gained traction when the ratio of how often σz errors occur to how often σx errors occur was discussed in [1025], with follow-up constructions offered soon after in [1615]. Wang et al. established a mathematical model of AQCs in the general qudit system in [1871].
As in the symmetric case, En := {ω β X(a)Z(b) | a, b ∈ Fnq , β ∈ Fp }. An error E :=
ω β X(a)Z(b) ∈ En has wtX (E) := wtH (a) and wtZ (E) := wtH (b).
Definition 27.5.1 Let dx and dz be positive integers. A qudit code Q of dimension K ≥ 1 is called an asymmetric quantum code (AQC) with parameters ((n, K, dz, dx))q or Jn, k, dz, dxKq, where k = logq K, if Q detects dx − 1 qudits of X-errors and, at the same
time, dz − 1 qudits of Z-errors. The code Q is pure if |ϕi and E|ψi are orthogonal for any
|ϕi, |ψi ∈ Q and any E ∈ En such that 1 ≤ wtX (E) ≤ dx − 1 or 1 ≤ wtZ (E) ≤ dz − 1. Any
code Q with K = 1 is assumed to be pure.
An Jn, k, d, dKq AQC is symmetric, with parameters Jn, k, dKq , but the converse is not
true since, for E ∈ En with wtX (E) ≤ d − 1 and wtZ (E) ≤ d − 1, wtQ (E) may be bigger than
d − 1.
To date, most known families of AQCs come from the asymmetric version of the CSS
Construction and its generalization in [697].
Theorem 27.5.2 (CSS-like Constructions for AQCs) Let Cj be an [n, kj , dj ]q code
for j ∈ {1, 2}. Let Cj⊥∗ be the dual of Cj under one of the Euclidean, trace Euclidean, Hermitian, or trace Hermitian inner products. Let C1⊥∗ ⊆ C2, with
Going forward, three general challenges can be identified. First, find better AQCs, par-
ticularly in qubit systems, than the currently best-known. More tools to determine or to
lower bound dz and dx remain to be explored if we are to improve on the parameters. Sec-
ond, construct codes with very high dz to dx ratio, since experimental results suggest that
this is typical in qubit channels. Third, find conditions on the nested classical codes that
yield impure codes.
codes are yet to be devised. Also currently unavailable are efficient encoding and decoding
algorithms.
The bridge between classical coding theory and quantum error-control was firmly put
in place via the stabilizer formalism. Various generalizations and modifications have been
studied since, benefitting both the classical and quantum sides of the error-control theory.
Well-researched tools and the wealth of results in classical coding theory translate almost
effortlessly to the design of good quantum codes, moving far beyond what is currently prac-
tical to implement in actual quantum devices. Research problems triggered by error-control
issues in the quantum setup revive and expand studies on specific aspects of classical codes,
which were previously overlooked or deemed not so interesting. This fruitful cross-pollination
between the classical and the quantum, in terms of error-control, is set to continue.
Acknowledgement
This work is supported by Nanyang Technological University Research Grant no.
M4080456.
Chapter 28
Space-Time Coding
Frédérique Oggier
Nanyang Technological University
28.1 Introduction
Space-time coding refers to an area of coding theory that appears in the context of
transmission of information over a wireless channel where both the transmitter and the
receiver are equipped with multiple antennas. The goal is to design space-time codebooks,
that is, families of codewords to be transmitted over this wireless channel, that achieve
high reliability, and high rate (to be formally defined below). When the wireless channel
is such that it may be assumed constant over a given period of time, the codewords are
transmitted via multiple antennas (“space”) over this time period (“time”), which explains
the terminology.
This chapter presents typical channel models for which space-time codes are designed
(see Section 28.2), out of which design criteria are extracted. Then some families of space-
time codes with examples are provided in Section 28.3. Finally different variations of space-
time coding problems are reviewed in Section 28.4.
FIGURE 28.1: Transmission over a MIMO channel: a message u = u1 · · · uk from the source is mapped by the encoder to a space-time codeword X; the channel applies fading and noise, Y = HX + Z, with channel matrix H and noise Z; from the received codeword Y, the decoder produces a message estimate for the receiver.
YN ×T = HN ×M XM ×T + ZN ×T ∈ CN ×T . (28.1)
The N × M fading matrix H models the wireless environment: its coefficient hij corre-
sponds to the channel coefficient between the j th transmit and the ith receive antenna and
is a complex Gaussian random variable with zero mean and unit variance. The N × T ma-
trix Z corresponds to the channel noise, whose independent entries are complex Gaussian
random variables with zero mean and variance N0 (see, e.g., [204] for more details). The
xil entry of the space-time codeword X corresponds to the signal transmitted from the
ith antenna during the lth symbol interval for 1 ≤ l ≤ T . It is a function of the message
symbols, and the mapping between the message symbols and the codeword is done by an
encoder, as usual (see Figure 28.1). A space-time block code is a finite set C of M × T
complex matrices X, whose cardinality is denoted by |C|.
The channel model (28.1) is sometimes referred to as a multiple-input multiple-
output (MIMO) channel. Multiple antenna systems have been extensively studied since
they are known to support high data rates [748, 1796].
Design criteria for space-time codes depend on the type of receiver that is considered.
Two major classes of receivers have been considered in the literature:
• coherent: it is assumed that the channel matrix H is known to the receiver.
• non-coherent: the channel matrix H is not known to the receiver, in which case
different solutions are available (see, e.g., [152, 967]), including a technique called
differential space-time coding, which we will explain below.
the pairwise error probability Prob(X → X̂), i.e., the probability that, when a codeword X is transmitted, the ML receiver decides erroneously in favor of another codeword X̂, assuming only X and X̂ are in the codebook, is bounded above by

    Prob(X → X̂) ≤ det( I_M + (X − X̂)(X − X̂)∗ / (4N0) )^{−N},

where I_M is the M × M identity matrix. Let r denote the rank of the codeword difference matrix X − X̂, with nonzero eigenvalues λj, j = 1, . . . , r. For small N0, we have

    Prob(X → X̂) ≤ ( (∏_{j=1}^{r} λj)^{1/r} / (4N0) )^{−rN}.

When X and X̂ vary through distinct codewords in the codebook, we can look at the minimum value of r, and we call min{rN} the diversity gain of the code. If r = M, we say that the code has full diversity. This means that we can exploit all the MN independent channels available in the MIMO system. Maximizing r is sometimes called the rank or diversity criterion. Furthermore, we want to maximize the coding gain [1793], given by min_{X≠X̂} det((X − X̂)(X − X̂)∗)^{1/M}.
When a space-time block code C is linear, meaning that

    X ± X′ ∈ C for all X, X′ ∈ C,

min_{X≠X̂} det((X − X̂)(X − X̂)∗) reduces to min_{X≠0_{M×T}} det(XX∗); maximizing this quantity is referred to as the determinant criterion.
Considering linear space-time codes not only simplifies the design problem, it also endows the code with a lattice structure and enables the application of the sphere decoder [479, 922, 1855] at the receiver.
• One may prefer full rate space-time codes, in which case it is required that k = N T.
• One may take into consideration the notion of shaping gain [747], which suggests
considering constellations with a cubic shaping, in order to have both a practical
way of labeling QAM/HEX symbols, and low average energy. We note that the best
shaping gain is obtained by a spherical shaping, whose labeling becomes complex
for large constellations. See also [1174] for a nonlinear mapping encoder with a good
shaping gain.
• Cubic shaping is related to the concept of information lossless. A space-time code is
information lossless if the mutual information of the equivalent channel obtained
by including the encoder in the channel is equal to the mutual information of the
MIMO channel (see, e.g., [1666]).
• Another space-time coding property which may be desirable is that of having a non-
vanishing determinant (NVD) [155]. Adaptive transmission schemes may require
the use of different size constellations, in which case it is important to ensure that
coding gain does not depend on the constellation size, even if the code is infinite. The
NVD property is a necessary and sufficient condition for a space-time code to achieve
the diversity-multiplexing tradeoff [671, 1945] (see also [1923]).
Yt = St Ht + Zt for t = 0, 1, 2, . . . . (28.3)
The main difference with the coherent case is that we immediately assume that the trans-
mitted matrix St is square, and the index t emphasizes a time dependency, which will be
needed in the differential unitary space-time modulation scheme [967, 1011] explained
next.
Assuming S0 = I, the transmitted signal is encoded differentially as St = Vut St−1, where the datum ut selects the unitary matrix Vut from the codebook. Then

    Yt = Vut St−1 H + Zt
       = Vut (Yt−1 − Zt−1) + Zt
       = Vut Yt−1 + Z′t.
Since the matrix H does not appear in the last equation, this means differential modulation
allows decoding without knowledge of the channel.
The maximum likelihood decoder is thus given by

    ût = arg min_{l=0,...,|C|−1} ‖Yt − Vl Yt−1‖.
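A toy sketch of the differential scheme for M = N = 1, where the codebook consists of the L = 4 unitary 1 × 1 matrices Vl = e^{2πil/L} (PSK phases); the noiseless setting and all names below are our own simplification:

```python
import cmath

# Toy differential unitary modulation for M = N = 1: codebook of L unitary
# 1x1 matrices (PSK phases). The receiver never learns the channel h.
L = 4
V = [cmath.exp(2j * cmath.pi * l / L) for l in range(L)]

h = 0.7 - 1.3j            # unknown channel coefficient, constant over time
msgs = [2, 0, 3, 1, 1]

# Differential encoding: s_t = V[u_t] * s_{t-1}, with s_0 = 1.
s, Y = 1.0, []
for u in msgs:
    s = V[u] * s
    Y.append(s * h)       # received signal (noiseless here)

# Decoding without h: u_hat = argmin_l |Y_t - V_l * Y_{t-1}|.
prev = 1.0 * h            # Y_0 = S_0 h = h is observed first
decoded = []
for y in Y:
    u_hat = min(range(L), key=lambda l: abs(y - V[l] * prev))
    decoded.append(u_hat)
    prev = y

print(decoded)   # [2, 0, 3, 1, 1]
```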
The pairwise block probability of error can then be bounded above [967, 1011], similarly to what was done in the coherent case, to obtain the so-called diversity product given by

    (1/2) min_{Vi≠Vj∈C} |det(Vi − Vj)|^{1/M},    (28.5)
The Alamouti code consists of the codewords

    X = [ u1  −ū2
          u2   ū1 ],

where u1, u2 are QAM symbols and ¯ denotes complex conjugation. We thus transmit 2 symbols, so that k = N T = 2 for N = 1; this code is thus full rate. It is also clearly linear.
The Alamouti code is fully diverse since det(X) = |u1|² + |u2|² > 0 for any (u1, u2) ≠ (0, 0). For arbitrary complex numbers, it does not have a non-vanishing determinant; however it does have it for QAM symbols (|u| ≥ 1 if u ∈ Z[i], u ≠ 0).
Alamouti code is one of the most well known and popular space-time codes, thanks to its
excellent performance and easy decoding.
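The full-diversity claim can be checked numerically: for the Alamouti matrix, det(X) = |u1|² + |u2|². A quick sketch in plain Python (helper names ours):

```python
# Full diversity of the Alamouti code: det(X) = |u1|^2 + |u2|^2, so every
# nonzero codeword (and, by linearity, every codeword difference) is
# invertible. 2x2 matrices are nested lists of complex numbers.

def alamouti(u1, u2):
    return [[u1, -u2.conjugate()], [u2, u1.conjugate()]]

def det2(X):
    return X[0][0] * X[1][1] - X[0][1] * X[1][0]

X = alamouti(1 + 1j, 2 - 1j)
assert abs(det2(X) - (abs(1 + 1j) ** 2 + abs(2 - 1j) ** 2)) < 1e-12

# Non-vanishing determinant over Z[i]: nonzero Gaussian-integer symbols
# give |det| >= 1; here det = 2 + 5 = 7.
print(det2(X))   # (7+0j)
```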
A linear dispersion code consists of codewords of the form

    X = ∑_{q=1}^{Q} (uq Cq + ūq Dq),

where u1, . . . , uQ are typically chosen from a QAM constellation, and where Cq, Dq are fixed complex matrices to be designed. Alternatively, one may consider the real and imaginary parts of uq, Re(uq) and Im(uq), respectively, and write

    X = ∑_{q=1}^{Q} (Re(uq) Aq + Im(uq) Bq).
In [921], the matrices {Aq , Bq } are optimized to yield space-time codes that approach ca-
pacity. They were proposed before the introduction of the properties of non-vanishing de-
terminant, or diversity-multiplexing trade-off.
    V = (I + iA)^{−1}(I − iA).

It is easy to check that V is unitary. The Cayley transform of iA is preferred since all the eigenvalues of iA are purely imaginary, thus different from −1, so that the inverse of I + iA always exists.
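That V is unitary can be verified directly; a sketch for a 2 × 2 Hermitian A (the matrix helpers and the sample A are ours):

```python
# Check that the Cayley transform V = (I + iA)^{-1}(I - iA) of a Hermitian
# matrix A is unitary, for a 2x2 example in plain complex arithmetic.

def mat_mul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inv2(P):                       # inverse of a 2x2 matrix
    d = P[0][0] * P[1][1] - P[0][1] * P[1][0]
    return [[P[1][1] / d, -P[0][1] / d], [-P[1][0] / d, P[0][0] / d]]

A = [[2.0, 1 - 1j], [1 + 1j, -1.0]]          # Hermitian: A equals its conjugate transpose
I = [[1.0, 0.0], [0.0, 1.0]]
iA = [[1j * a for a in row] for row in A]
P = [[I[i][j] + iA[i][j] for j in range(2)] for i in range(2)]   # I + iA
Q = [[I[i][j] - iA[i][j] for j in range(2)] for i in range(2)]   # I - iA
V = mat_mul(inv2(P), Q)

Vh = [[V[j][i].conjugate() for j in range(2)] for i in range(2)]
VVh = mat_mul(V, Vh)                         # should be the identity
assert all(abs(VVh[i][j] - I[i][j]) < 1e-12 for i in range(2) for j in range(2))
print("unitary")
```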
Here A is a Hermitian matrix of the form A = ∑_{q=1}^{Q} uq Aq, where u1, . . . , uQ ∈ R are the information symbols, chosen from a finite set, and where the Aq are fixed M × M complex Hermitian matrices. We remark that the encoding technique is similar to that of linear dispersion codes.
The performance of a Cayley code depends on Q, the Hermitian basis matrices {Aq },
and the set from which each uq is chosen. To choose the {Aq }, the approach of [920] consists
of optimizing a mutual information criterion, instead of the traditional diversity criterion
(28.5), since it is argued that at high rate, checking the diversity may become intractable.
However, one could combine the Cayley code approach explained above with that of division
algebras to ensure full diversity, as was tried in [1456].
ri = fi s + zi ,
where zi is the noise vector and fi is the fading at the ith relay. Now each relay transmits
ti = Ai ri ,
with
    S = [ A1 s · · · AR s ],   H = (f1 g1 , . . . , fR gR)^T ,   and   W = ∑_{i=1}^{R} gi Ai zi + w.
The matrix S is called a distributed space-time codeword. Analyzing the behavior of the
pairwise error probability shows [1047] that the diversity criterion still holds. Adaptations
of space-time codes have thus been used for wireless networks for example in [672, 673,
1123, 1457].
The diversity-multiplexing gain tradeoff (DMT) has naturally been studied as well in
the context of wireless networks; see [80] for the DMT of the amplify-and-forward protocol,
assuming a direct link from the transmitter to the receiver, unlike in the distributed space-
time code model. It was shown that in order to reach the tradeoff, the transmitter node
always transmits, which yields the so-called non-orthogonal amplify-and-forward protocol. In
[1919], the protocol has further been extended to the case of relays equipped with multiple
antennas.
to be maximized.
In [976], a concatenated scheme is considered, where the inner code is the golden code
(see Subsection 28.3.3) and the outer code is a trellis code. This can be viewed as a mul-
tidimensional trellis coded modulation, where the golden code serves as a signal set to be
partitioned, which is done based on lattice set partitioning to increase the minimum de-
terminant between codewords. A first attempt to design such a scheme was made in [379],
whose ad-hoc scheme suffered from a high trellis complexity. In [1407], the golden code is
serially concatenated with a convolutional code. In [1309], coset coding is used to design an
outer code for the golden code, and the coding problem becomes that of designing a suitable
outer code over the given ring of matrices (F2^{2×2} and F2[i]^{2×2}): the repetition code of length
2 and a construction using Reed–Solomon codes are proposed. In [1460], codes over matrix
rings were more generally considered for some space-time codes built over division algebras.
as was done for linear dispersion codes. Then (28.2) in Euclidean norm becomes
also obtained by vectorizing and separating the real and imaginary parts of HBi to obtain bi for i = 1, . . . , 4N².
From (28.6), a QR decomposition of B, B = QR, with Q unitary, reduces to computing
where R is an upper right triangular matrix. The number and position of nonzero elements
in the upper right part of R will determine the complexity of the real sphere decoding
process [922, 1855].
The worst case occurs when the matrix R is a full upper right triangular matrix, yielding the complexity of the exhaustive-search ML decoder, that is here O(|S|^{4N²}), where |S| is the number of PAM symbols in use. If the structure of the code is such that the decoding complexity has an exponent of |S| smaller than 4N², we say that the code is fast-decodable [203].
We list three shapes for the matrix R which have been shown [1048, 1417] to result in reduced decoding complexity. Suppose a space-time codeword has matrix R of the form

    R = [ Δ  ∗
          0  R′ ],
when their corresponding basis matrices Bk and Bl belong to disjoint subsets of the induced partition of the basis, yielding a decoding complexity of O(|S|^{max_{1≤i≤g} |Γi|}).
Finally, a code which is not g-group decodable but could become one assuming that a set
of symbols are already decoded gives another characterization of fast-decodability [1417].
A code is called conditionally g-group decodable if there exists a partition into g + 1
disjoint subsets Γ1 , . . . , Γg , ΓC such that
For example, the Alamouti codeword can be written as

    X = [ u1  −u2∗ ]  =  [ u11 + iu12   −u21 + iu22 ]
        [ u2   u1∗ ]     [ u21 + iu22    u11 − iu12 ],

where u1, u2 are QAM symbols and u = (u11, u12, u21, u22) is the PAM symbol vector. A decomposition into basis matrices B1, B2, B3, B4 is given by X = u11 B1 + u12 B2 + u21 B3 + u22 B4, where

    B1 = [ 1  0 ],   B2 = [ i   0 ],   B3 = [ 0  −1 ],   B4 = [ 0  i ].
         [ 0  1 ]         [ 0  −i ]         [ 1   0 ]         [ i  0 ]
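The decomposition can be verified directly: a sketch (with our own sample PAM values) checking that u11 B1 + u12 B2 + u21 B3 + u22 B4 reproduces the Alamouti matrix:

```python
# Verify the basis decomposition of the Alamouti codeword:
# X = u11*B1 + u12*B2 + u21*B3 + u22*B4 recovers
# [[u1, -conj(u2)], [u2, conj(u1)]] with u1 = u11 + i*u12, u2 = u21 + i*u22.

B = [
    [[1, 0], [0, 1]],     # B1
    [[1j, 0], [0, -1j]],  # B2
    [[0, -1], [1, 0]],    # B3
    [[0, 1j], [1j, 0]],   # B4
]

def combine(u):           # u = (u11, u12, u21, u22), real PAM symbols
    return [[sum(u[q] * B[q][i][j] for q in range(4)) for j in range(2)]
            for i in range(2)]

u11, u12, u21, u22 = 3, -1, 1, 2          # sample PAM values (ours)
X = combine((u11, u12, u21, u22))
u1, u2 = u11 + 1j * u12, u21 + 1j * u22
assert X == [[u1, -u2.conjugate()], [u2, u1.conjugate()]]
print(X)
```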
We are next interested in deriving a code design criterion for MIMO wiretap codes (there is a large literature on designing wiretap codes, in particular capacity-achieving codes from an information-theoretic viewpoint, which we voluntarily skip here). We know from (28.1) the channel to the honest user (which we will index by M for main); the channel to the eavesdropper is similar:
YM = HM X + ZM ∈ CnM ×T
YE = HE X + ZE ∈ CnE ×T ,
where the transmitter has M transmit antennas, while the receiver and the eavesdropper
have respectively nM and nE receive antennas. Ignoring the eavesdropper, we know how to
design space-time codes, as reviewed above. We are left with deriving a code design criterion
for confidentiality. The noise coefficients for both receivers have different variances and the
covariance matrices of the fading matrices are different (we do not normalize the covariance
matrices for deriving code design criteria).
To achieve confidentiality, we use an encoding scheme called coset coding, which is
a well accepted encoding to provide confusion in the presence of an eavesdropper, whose
original version is already present in Wyner’s work, where the notion of wiretap codes was
first proposed. The idea of coset encoding is to partition the set of codewords into subsets
(called cosets), label each coset by information symbols, and then pick a codeword uniformly
at random within a coset for the actual transmission. Concretely, the transmitter partitions
the group of linear space-time codewords into a union of disjoint cosets ΛE + C, where ΛE
is a subgroup and C encodes the data. The transmitter then chooses a random R ∈ ΛE so
that X = R + C ∈ ΛE + C.
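A toy sketch of coset encoding over F2^4; the subgroup ΛE, the coset labels, and the representatives below are our own illustrative choices:

```python
import random

# Toy Wyner-style coset encoding: Lambda_E = {0000, 1100, 0011, 1111} is a
# subgroup of F_2^4; its four cosets carry two data bits. The transmitter
# sends a uniformly random element of the selected coset.

LAMBDA_E = {0b0000, 0b1100, 0b0011, 0b1111}
REP = {0: 0b0000, 1: 0b1000, 2: 0b0010, 3: 0b1010}   # coset representatives

def encode(data):                        # data in {0, 1, 2, 3}
    r = random.choice(sorted(LAMBDA_E))  # random masking codeword R
    return REP[data] ^ r                 # X = R + C (addition in F_2^4 is XOR)

def decode(x):                           # the legitimate receiver recovers the coset label
    for data, c in REP.items():
        if x ^ c in LAMBDA_E:
            return data

for data in range(4):
    for _ in range(5):
        assert decode(encode(data)) == data
print("coset decoding consistent")
```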
We mimic the usual pairwise error probability computations to obtain Eve’s probability
of correctly obtaining the secret codeword, in an attempt to derive a confidentiality code
design criterion. It remains an open question whether the obtained criterion can be linked
to the information theoretical characterization of secrecy; see [1266] for a discussion in the
context of Gaussian channels.
We have that Eve's probability Pc,e of correctly decoding a sent space-time codeword X is bounded above by [153]

    Pc,e ≤ CMIMO γE^{TM} ∑_{X∈ΛE} det(IM + γE XX∗)^{−nE−T},    (28.7)

where γE is Eve's signal-to-noise ratio and CMIMO is a constant that depends on different channel parameters.
To derive a code design criterion from (28.7), similarly to the way diversity and coding
gain were obtained for reliability, we rewrite:
    Pc,e ≤ CMIMO γE^{TM} ( 1 + ∑_{X∈ΛE\{0}} det(IM + γE XX∗)^{−nE−T} ).
The space-time code used to transmit data to Bob is assumed to be fully diverse, since a
wiretap code needs to ensure reliability for Bob, namely, if X 6= 0 and T ≥ M then the
rank of X is M . To minimize Eve’s average probability of correct decoding, a first simplified
design criterion is then
    min_{ΛE} ∑_{X∈ΛE\{0}} 1 / det(XX∗)^{nE+T}.
For works on MIMO wiretap codes achieving the secrecy capacity and satisfying Wyner’s
original criterion for secrecy given in terms of mutual information, see [1111].
Chapter 29
Network Codes
Frank R. Kschischang
University of Toronto
FIGURE 29.1: (a) the butterfly network with bottleneck channel (x, y), in which source
node s wishes to send fixed-length packets as quickly as possible to two sink nodes t and
u; (b) a routing solution, achieving the transfer of three packets a, b, c over two time slots
(here − indicates an idle time slot); (c) a more efficient network coding solution, achieving
the transfer of two packets a, b in a single channel time slot, at the expense of a coding
operation at node x and decoding operations at the sink nodes
shown in Figure 29.1(b), where each edge is labelled with a packet pair having components
corresponding to time slots.
In the network coding solution shown in Figure 29.1(c), the contents of packets are
indeed permitted to be altered at the intermediate nodes, i.e., the intermediate nodes may
transmit packets that are a function of the packets that they receive, so that, in effect,
routers become coders. The benefit of this is that node x, which receives two packets a
and b, can transmit their sum a + b (here we view packets of n bits as binary vectors,
and add them as elements of Fn2 ). In a single time slot, node t now receives a and a + b,
and can recover b by forming a + (a + b) in Fn2 . Likewise, node u recovers packet a by
forming (a + b) + b. Thus at the expense of an encoding operation at intermediate node x
and decoding operations at the sinks, a transmission rate of 2 packets per time slot can be
achieved. Since the source can emit at most 2 distinct packets per time slot and the sinks
can receive at most 2 distinct packets per time slot, this rate is clearly the best possible.
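The recovery argument is a two-line computation with XOR; a sketch with n = 8 bit packets (the sample packet values are ours):

```python
# The butterfly network coding solution of Figure 29.1(c), with n-bit
# packets as integers and addition in F_2^n realized by XOR.

n = 8
a, b = 0b10110010, 0b01101100   # the two source packets

x_out = a ^ b                   # node x codes: transmits a + b
# sink t receives a (directly) and a + b (via x and y):
assert a ^ x_out == b           # t recovers b by forming a + (a + b)
# sink u receives b (directly) and a + b:
assert b ^ x_out == a           # u recovers a by forming (a + b) + b
print("both sinks recover (a, b) in one time slot: rate 2 packets per use")
```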
The butterfly network, introduced in the seminal paper [26] of Ahlswede, Cai, Li, and
Yeung (in which the field of network coding originated), illustrates the important conclusion
that routing alone is, in general, insufficient to achieve the full information-carrying capa-
bility of packet networks. Stated differently, information-flow is not the same as commodity-
flow: communication networks are not the same as highways, since, unlike cars on the high-
way, packets can in fact be combined to squeeze through network bottlenecks, provided that
sufficient side information is available at the sinks to be able to reverse such combinations.
The network of Figure 29.1, for example, has 7 vertices, 9 directed edges (but no parallel
edges), a single source vertex s, 2 sink vertices t and u, and a packet alphabet A = Fn2 .
At each vertex v ∈ V , we denote by I(v) the set of (incoming) edges incident to v and by
O(v) the set of (outgoing) edges incident from v. Possibly by introducing “virtual nodes”
in a given network we may, without loss of generality, assume that S ∩ T = ∅, that I(s) = ∅
for every source node s ∈ S, and that O(t) = ∅ for every sink node t ∈ T .
As usual in graph theory, a (directed) path in G from a vertex u ∈ V to a vertex v ∈ V
is a finite sequence of directed edges (u, v1 ), (v1 , v2 ), . . . , (v`−1 , v), where each element of the
sequence is an element of E. A cycle in G is a directed path from some vertex to itself. The
term acyclic in the definition of a packet network means that the multigraph G contains
no cycles. A vertex v ∈ V is said to be reachable from u ∈ V if v = u or if there is a
directed path in G from u to v.
As in combinational logic (hence the terminology combinational packet network),
it is assumed that packets transmitted on the non-idle edges of O(v) are functions of the
packets received on the non-idle edges of I(v), or, if v is a source, of packets internally gener-
ated at v. Acyclicity of G ensures that each packet sent on any edge is indeed a well-defined
function of the packets sent by the sources. In an actual network implementation, packets
must be received (and possibly buffered) at v before outgoing packets can be computed
and transmitted. Delays caused by transmission, buffering, and processing of packets are,
however, not explicitly modelled. We refer to the function applied by a particular node v
as the local encoding function at v. For example, if v were acting as a router, then each
packet sent on an edge of O(v) would be a copy of one of the packets received on an edge of
I(v). In Figure 29.1(c), node x receives two packets a and b on the two edges of I(x), and
transmits the packet a + b on the single edge in O(x).
By a single channel use of a combinational packet network N we mean a particular
assignment of packets drawn from A to each of the (non-idle) edges of E, consistent with
the particular local encoding functions implemented by the nodes of the network. In other
words, in a single channel use, the local encoding functions are fixed, and each edge of
the network is used at most once. Of course it is possible, as in the routing example of
Figure 29.1(b), that the local encoding functions may vary from channel use to channel use.
Transmission efficiency is measured in units of packets per channel use. For example,
the butterfly network routing solution of Fig. 29.1(b) achieves a multicast rate of 1.5 packets
per channel use, while the network coding solution of Fig. 29.1(c) achieves a multicast rate of
2 packets per channel use. It should be noted that the measure of information (“packets”)
used here depends on the alphabet size |A|. It is straightforward to convert a rate from
packets per channel use to bits per channel use, simply by multiplying by log2 |A| bits per
packet.
This combinational model is not the most general possible. For example, if the underly-
ing transmission channels were wireless, then the network model would need to be modified
to take into account the broadcast nature of the channel (i.e., the possibility that a single
transmission might be received simultaneously by multiple receivers) and the multiple-access
nature of the channel (i.e., that simultaneous transmissions from multiple transmitters may
interfere with each other at a given receiver). The model also does not explicitly incorporate
outage (the possibility that links may not be perfectly reliable), delay (or any notion of tim-
ing as in sequential logic), or the possibility of feedback (cycles in the graph). Nevertheless,
the combinational model given is sufficiently rich that it will well illustrate the main ideas
of network coding.
To achieve R(s, t) = mincut(s, t), it is necessary that, in each channel use, exactly mincut(s, t) distinct packets flow through each minimum cut, and that the mincut(s, t) information packets emitted by the source be recoverable from them.
It is well known from the edge-connectivity version of Menger’s Theorem [1374] or from
the theory of commodity flows [675, 738] that mincut(s, t) is equal to the maximum number
of mutually edge-disjoint paths from s to t. Such a collection of mutually edge-disjoint
paths can be found using, e.g., the Ford–Fulkerson method [444, Section 26.2] [738]. Thus
the solution of the single-source unicast problem, i.e., transmission of R(s, t) = mincut(s, t)
packets per channel use from the source s to a single sink t can be achieved by routing,
sending a distinct packet along each of the edge-disjoint paths between s and t in each
channel use.
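The edge-disjoint-path computation just described can be sketched with a small Edmonds–Karp routine (Ford–Fulkerson with BFS augmenting paths) on unit-capacity edges; the function name and the vertex numbering of the butterfly network are our own illustrative choices, not the labeling of Figure 29.1:

```python
from collections import deque

def max_flow_unit(num_vertices, edges, s, t):
    """Edmonds-Karp max flow with unit edge capacities. The value returned
    equals mincut(s, t), i.e., the number of mutually edge-disjoint s-t
    paths, by Menger's Theorem."""
    cap = {}
    adj = [[] for _ in range(num_vertices)]
    for u, v in edges:                 # sketch: assumes no antiparallel edge pairs
        adj[u].append(v)
        adj[v].append(u)               # residual direction
        cap[(u, v)] = 1
        cap.setdefault((v, u), 0)
    flow = 0
    while True:
        parent = {s: None}             # BFS for an augmenting path
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v in adj[u]:
                if v not in parent and cap[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow
        v = t                          # augment along the path found
        while parent[v] is not None:
            u = parent[v]
            cap[(u, v)] -= 1
            cap[(v, u)] += 1
            v = u
        flow += 1

# Butterfly network with our own vertex numbering: source s=0, relays 1..4,
# sinks t=5 and u=6 (cf. Figure 29.1).
edges = [(0, 1), (0, 2), (1, 3), (2, 3), (3, 4), (1, 5), (4, 5), (2, 6), (4, 6)]
print(max_flow_unit(7, edges, 0, 5), max_flow_unit(7, edges, 0, 6))  # 2 2
```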
For example, in the butterfly network of Figure 29.1, mincut(s, t) = 2, and indeed it is
easy to find a pair of edge-disjoint paths from s to t over which 2 packets may be routed
in each channel use. Note that mincut(s, u) = 2 also. However, as we have already seen, in
this network the paths over which packets are routed to t conflict with the paths over which
packets are routed to u, making it impossible to achieve, using routing alone, a transmission
rate of 2 packets per channel use to both t and u simultaneously.
The main theorem of network multicasting is that the upper bound (29.1) is achievable (with
equality) via network coding [26]. Moreover, the upper bound is achievable with linear
network coding, provided that the packet alphabet A is a vector space over a sufficiently
large finite field [1248]. The quantity mint∈T mincut(s, t) is referred to as the multicast
capacity of the given combinational packet network.
In linear network coding, we assume that the packet alphabet A is an n-dimensional
vector space F_q^n over the finite field F_q. We assume that the source will communicate at
some (integer) rate r to all of the sink nodes t ∈ T . This means that the source vertex s
has r information packets p1 , p2 , . . . , pr to send in any given channel use, which we view
as “incoming” to s. We will gather these information packets to form the rows of an r × n
matrix denoted as P . Each sink node t ∈ T must generate r output packets, which we view
as “outgoing” from t. These will be gathered into the rows of an r × n matrix denoted, with
slight abuse of notation, as XO(t) . We require, for all t ∈ T , that XO(t) = P . In other words,
the communication problem is to reproduce the source matrix P at the output XO(t) of
each sink node t ∈ T .
We will assume that no edges of the network are idle. This means that a vertex v ∈ V \{s}
will observe exactly |I(v)| incoming packets, forming the rows of an |I(v)|×n matrix denoted
as XI(v) . Vertex v (including, now, each sink node t ∈ T ) will then apply its local encoding
function to produce |O(v)| outgoing packets, forming the rows of an |O(v)| × n matrix
denoted as XO(v) .
In linear network coding, these local encoding functions are linear over Fq , i.e.,
XO(v) = Lv XI(v) ,
for some |O(v)| × |I(v)| matrix Lv with entries from Fq , called the local transfer matrix
at v. In other words, each outgoing packet (a row of XO(v) ) is an Fq -linear combination
of the incoming packets at node v (the rows of XI(v) ). The row of Lv corresponding to a
particular edge e ∈ O(v) is called the local encoding vector associated with e.
Note that, since only linear operations are performed in the network, every packet trans-
mitted along an edge is a linear combination of the source packets p1 , p2 , . . . , pr . In partic-
ular, for every vertex v ∈ V , we have
XI(v) = Gv P
where Gv is an |I(v)| × r matrix of coefficients from Fq , called the global transfer matrix
at v. The row of Gv corresponding to a particular edge e ∈ I(v) is called the global
encoding vector associated with e.
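The local/global transfer relations above can be illustrated on the butterfly network over F_2. The matrices below hard-code the global transfer matrices of the coding solution (sink t observing {p1, p1+p2} and sink u observing {p2, p1+p2}); the payload values and helper names are illustrative assumptions:

```python
import numpy as np

def gf2_inv(A):
    """Invert a square matrix over GF(2) by Gauss-Jordan elimination."""
    n = A.shape[0]
    M = np.concatenate([A.copy() % 2, np.eye(n, dtype=np.uint8)], axis=1)
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r, col])
        M[[col, pivot]] = M[[pivot, col]]
        for r in range(n):
            if r != col and M[r, col]:
                M[r] ^= M[col]
    return M[:, n:]

# Source packets p1, p2 as the rows of P (payload length n = 4; values illustrative).
P = np.array([[1, 0, 1, 1],
              [0, 1, 1, 0]], dtype=np.uint8)

# Global transfer matrices induced at the two butterfly sinks by the coding
# solution: sink t observes {p1, p1+p2}, sink u observes {p2, p1+p2}.
G_t = np.array([[1, 0], [1, 1]], dtype=np.uint8)
G_u = np.array([[0, 1], [1, 1]], dtype=np.uint8)

for G in (G_t, G_u):
    X = G.dot(P) % 2                     # packets observed at the sink: X = G P
    recovered = gf2_inv(G).dot(X) % 2    # G^{-1} X reproduces the source matrix
    assert (recovered == P).all()
```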
Now, in order for a sink node t to be able to recover P, it is necessary for the |I(t)| × r
global transfer matrix G_t at t to have rank r, since only then does G_t have a left inverse G_t^{-1}
satisfying G_t^{-1} G_t = I_r, where I_r is the r × r identity matrix. (Note that G_t does not need to
be square to have a left inverse, merely full column rank.) Forming G_t^{-1} X_{I(t)} at node t produces P,
as desired. The following theorem claims that if r ≤ min_{t∈T} mincut(s, t) and q is sufficiently
large, then it is possible to make G_t have rank r (i.e., contain an r × r invertible submatrix)
simultaneously at each sink node t, thereby achieving a multicast rate of r with linear network
coding. In particular, the multicast capacity min_{t∈T} mincut(s, t) is achievable, for sufficiently
large q, with linear network coding. In fact, q ≥ |T| suffices for (29.2) to hold.
This theorem was established in [1248], and an elegant algebraic proof was given in
[1157], in which the bound q > |T | was proved to be sufficient for the theorem to hold. That
q = |T | also suffices follows from the linear information flow algorithm of [1034], which
provides a polynomial-time construction of capacity-achieving network codes. See [750, 1169]
for tutorial introductions.
XO(s∗ ) = Ls∗ P,
where, as above, the rows of P consist of the r packets to be sent. In other words, the
packets transmitted on the edges incident from s∗ are linear combinations of the r super-
source packets. However, since O(s∗ ) is a cut separating s∗ from the rest of the network, the
supersource packets must be decodable from X_{O(s∗)}, which means that L_{s∗} must be invertible.
Premultiplying at s∗ by L_{s∗}^{-1} does not affect the ability of any of the sinks to decode,
but translates a general network coding solution into a solution where the global encoding
vectors on the edges of O(s∗ ) are unit vectors. Thus source si receives ri “uncoded” super-
source packets. By supposing that these ri supersource packets are actually generated at
si , we see that the given linear network coding solution has the effect of delivering packets
from si to each sink at a rate ri packets per channel use, thereby achieving the rate-tuple
(r1 , . . . , r|S| ).
Stated differently, and making the practical assumption that q = 2m (so that each field
element comprises m bits), the probability Prob[failure] that such a random choice of local
encoding coefficients fails to produce a feasible network code satisfies
Prob[failure] ≤ |T| η 2^{−m},
which, for any fixed network, decreases exponentially with the number of bits per field
element.
The random linear network coding approach has many practical benefits, as it can be
used not only to design a fixed network code, but also as a communication protocol
[407, 964]. In this approach, communication between source and sinks occurs in a series of
rounds or generations; during each generation, the source injects a number of fixed-length
packets into the network, each of which is a fixed length row vector over Fq . These packets
propagate throughout the network, possibly passing through a number of intermediate nodes
Y_t = G_t P, (29.4)

where G_t ∈ F_q^{ℓ×r} is the global transfer matrix induced at sink node t by the random
choice of local encoding coefficients in the network. If we assume that G_t is not known to
either the source or the sink t, then we refer to the network operation as non-coherent
network coding.
How can the sink recover P ? One obvious method is to endow the transmitted packets
with so-called headers that can be used to record the particular linear combination of the
components of the message present in each received packet. In wireless communication, the
transmission of known symbols for the purpose of determining the transfer characteristics
of an unknown channel is known as channel sounding. Here, we let packet p_i = (e_i, m_i),
where e_i is the i-th unit vector and m_i is the packet payload. We then have P = [I_r  M],
so that

Y_t = G_t P = G_t [I_r  M] = [G_t  G_t M].

The first r columns of Y_t thus yield G_t, and, assuming G_t has a left inverse G_t^{-1} (which, as
we have seen, occurs with high probability if the field size is sufficiently large), sink node t
can recover M = G_t^{-1}(G_t M).
Of course, the use of packet headers incurs a transmission overhead relative to a fixed
network (with the relevant global encoding vectors known a priori by all sinks). For many
practical transmission scenarios, where packet lengths can potentially be large, this overhead
can be held to a small percentage of the total information transferred in a generation. For
example, take q = 28 = 256 (field elements are bytes), a generation size r = 50, and packets
of length 211 = 2048 bytes. The packet header then must be 50 bytes, representing an
overhead of just 50/2048 = 2.4%.
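The overhead arithmetic can be checked directly (the helper name is ours):

```python
def header_overhead(generation_size, packet_len_symbols):
    """Fraction of each packet consumed by an r-symbol encoding-vector header."""
    return generation_size / packet_len_symbols

# The example from the text: q = 2^8 (one byte per symbol), r = 50,
# packets of 2048 bytes.
percent = 100 * header_overhead(50, 2048)
print(round(percent, 1))  # 2.4
```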
where the Grassmannian Gq (n, k), simply the set of all k-dimensional subspaces of Fnq , is
given as
Gq (n, k) = {V ∈ Pq (n) | dim(V ) = k}.
Let U and V be two subspaces of Fnq . Their intersection U ∩ V is then also a subspace
of Fnq , as is their sum
U + V = {u + v | u ∈ U, v ∈ V },
which is the smallest subspace of F_q^n containing both U and V. It is well known that

dim(U + V) = dim(U) + dim(V) − dim(U ∩ V).

When U ∩ V = {0}, i.e., when U and V intersect trivially, then their sum U + V is a direct
sum, denoted as U ⊕ V, and in this case dim(U ⊕ V) = dim(U) + dim(V).
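A sketch of these subspace operations over F_2, computing dim(U + V) by stacking generator matrices and dim(U ∩ V) from the modular dimension identity (the generator matrices are chosen for illustration):

```python
import numpy as np

def gf2_rank(A):
    """Rank over GF(2) by row reduction."""
    A = A.copy() % 2
    rank = 0
    for col in range(A.shape[1]):
        cand = [r for r in range(rank, A.shape[0]) if A[r, col]]
        if not cand:
            continue
        A[[rank, cand[0]]] = A[[cand[0], rank]]
        for r in range(A.shape[0]):
            if r != rank and A[r, col]:
                A[r] ^= A[rank]
        rank += 1
    return rank

# U = <e1, e2> and V = <e2, e3> as subspaces of F_2^4 (illustrative choice).
GU = np.array([[1, 0, 0, 0], [0, 1, 0, 0]], dtype=np.uint8)
GV = np.array([[0, 1, 0, 0], [0, 0, 1, 0]], dtype=np.uint8)

dim_sum = gf2_rank(np.concatenate([GU, GV]))        # dim(U + V)
dim_int = gf2_rank(GU) + gf2_rank(GV) - dim_sum     # modular identity
print(dim_sum, dim_int)  # 3 1
```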
Turning now to matrices, we let F_q^{m×n} denote the set of all m × n matrices over F_q.
The zero matrix is denoted as 0_{m×n}; however, the subscript is often omitted when the
dimensions are clear from context. As above, the r × r identity matrix is denoted as I_r.
The row space of a matrix X is denoted as ⟨X⟩, and the rank of X is denoted as rank(X);
of course, rank(X) = dim(⟨X⟩). If X ∈ F_q^{m1×n} and Y ∈ F_q^{m2×n}, then

⟨[X; Y]⟩ = ⟨X⟩ + ⟨Y⟩;

therefore

rank [X; Y] = dim(⟨X⟩ + ⟨Y⟩),

where [X; Y] denotes the (m1 + m2) × n matrix formed by stacking X on top of Y.
A vector in F_q^n will be considered as an element of F_q^{1×n}, i.e., as a row vector. For
v = (v1, . . . , vn) ∈ F_q^n, let supp(v) = {i ∈ {1, . . . , n} | vi ≠ 0} denote its support and let
wt_H(v) = |supp(v)| denote its Hamming weight.
Given two vectors u, v ∈ F_q^n, let u ⋆ v denote their Hadamard product, i.e., the vector
obtained by multiplying u and v componentwise, so that (u ⋆ v)_i = u_i v_i. The Hamming
distance between binary vectors u, v ∈ F_2^n can be expressed as

d_H(u, v) = wt_H(u) + wt_H(v) − 2 wt_H(u ⋆ v).

It will also be useful to define the asymmetric distance [1567] between two binary vectors
u, v ∈ F_2^n as

d_A(u, v) = max {wt_H(u), wt_H(v)} − wt_H(u ⋆ v).

In the special case when wt_H(u) = wt_H(v), we have d_H(u, v) = 2 d_A(u, v).
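These two distance measures are simple to compute from weights and the Hadamard product; the vectors below are illustrative:

```python
def wt(v):
    """Hamming weight."""
    return sum(v)

def hadamard(u, v):
    """Componentwise (Hadamard) product of binary vectors."""
    return [a & b for a, b in zip(u, v)]

def d_H(u, v):
    """Hamming distance, via weights and the Hadamard product."""
    return wt(u) + wt(v) - 2 * wt(hadamard(u, v))

def d_A(u, v):
    """Asymmetric distance."""
    return max(wt(u), wt(v)) - wt(hadamard(u, v))

u = (1, 1, 0, 1, 0)
v = (1, 0, 1, 1, 0)
print(d_H(u, v), d_A(u, v))  # 2 1; equal weights, so d_H = 2 d_A
```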
If X ∈ F_q^{m×n} is nonzero, then its reduced row echelon form is denoted as rref(X).
Associated with a nonzero X is a vector id(X) ∈ F_2^n, called the identifying vector of
X, in which supp(id(X)) is the set of column positions of the leading ones in the rows of
rref(X). For example, in F_2^{3×5},

if X = [1 1 0 1 0; 0 1 1 1 1; 1 0 1 1 0], then rref(X) = [1 0 1 0 1; 0 1 1 0 0; 0 0 0 1 1]

(rows separated by semicolons), and id(X) = (1, 1, 0, 1, 0), where the leading ones in rref(X)
occur in columns 1, 2, and 4. If X = 0, then we set id(X) to the zero vector. Note that
rank(X) = wt_H(id(X)).
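The identifying-vector computation can be verified on the example above (the helper names are ours):

```python
import numpy as np

def rref_gf2(X):
    """Reduced row echelon form over GF(2); also returns the pivot columns."""
    X = X.copy() % 2
    pivots, row = [], 0
    for col in range(X.shape[1]):
        cand = [r for r in range(row, X.shape[0]) if X[r, col]]
        if not cand:
            continue
        X[[row, cand[0]]] = X[[cand[0], row]]
        for r in range(X.shape[0]):
            if r != row and X[r, col]:
                X[r] ^= X[row]
        pivots.append(col)
        row += 1
    return X, pivots

def identifying_vector(X):
    """id(X): indicator of the pivot columns of rref(X)."""
    _, pivots = rref_gf2(X)
    idv = [0] * X.shape[1]
    for c in pivots:
        idv[c] = 1
    return tuple(idv)

X = np.array([[1, 1, 0, 1, 0],
              [0, 1, 1, 1, 1],
              [1, 0, 1, 1, 0]], dtype=np.uint8)
print(identifying_vector(X))  # (1, 1, 0, 1, 0), as in the text
```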
Associated with a vector space V ∈ G_q(n, k), k > 0, is a unique k × n matrix G_V in
reduced row echelon form (i.e., with G_V = rref(G_V)) having the property that ⟨G_V⟩ = V.
We call G_V the basis matrix for V. With a slight abuse of notation we extend the id
function to vector spaces by defining

id(V) = id(G_V),

where id(V) is the zero vector if dim(V) = 0. Clearly we have dim(V) = wt_H(id(V)).
The following bounds on the dimension of the sum and product of spaces in terms of
their identifying vectors will be useful.
Theorem 29.4.1 For any pair of spaces U, V ∈ Pq (n), let u = id(U ) and let v = id(V ) be
their respective identifying vectors. Then
dim(U + V ) ≥ wtH (u) + wtH (v) − wtH (u ? v),
dim(U ∩ V ) ≤ wtH (u ? v).
Proof: To show the first inequality, let P_U = supp(u) be the set of pivot positions in G_U
and similarly let P_V = supp(v). Let

X = [G_U; G_V]

be the matrix formed by stacking G_U on top of G_V. Since every pivot position of G_U is a
pivot position of X, and similarly for G_V, it follows that P_U ∪ P_V ⊆ supp(id(X)). Therefore,
since dim(U + V) = |supp(id(X))|, we get

dim(U + V) ≥ |P_U ∪ P_V| = |P_U| + |P_V| − |P_U ∩ P_V| = wt_H(u) + wt_H(v) − wt_H(u ⋆ v).
ways, since we can extend U by adjoining any of the q^n − q^k vectors not in V, then adjoining
any of the q^n − q^{k+1} vectors not in the resulting (k + 1)-space, etc., but any specific choice
is in an equivalence class of size (q^j − q^ℓ)(q^j − q^{ℓ+1}) · · · (q^j − q^{j−1}).
The quantity N_q(n, k, j, ℓ) is very useful. For example, N_q(n, n, k, k) = [n choose k]_q (the
Gaussian coefficient, i.e., the number of k-subspaces of an n-space, so that |G_q(n, k)| =
[n choose k]_q), N_q(n, k, j, k) = [n−k choose j−k]_q (the number of j-dimensional spaces
containing the k-space V), and N_q(n, k, k, k−i) = q^{i²} [k choose i]_q [n−k choose i]_q (the
number of k-spaces at graph distance i from the k-space V in the Grassmann graph), etc.
Let us also mention here two additional properties of the Gaussian coefficient. We have
[760]

[m choose n]_q [n choose t]_q = [m choose t]_q [m−t choose n−t]_q   for t ≤ n ≤ m, (29.7)

and [1156, Lemma 5]

q^{i(n−i)} ≤ [n choose i]_q ≤ h(q) q^{i(n−i)},   where   h(q) = ∏_{j=1}^∞ 1/(1 − q^{−j}). (29.8)
It is shown in [1156] that h(q) decreases monotonically with q, approaching q/(q − 1) for
large q. The infinite product for h(q) converges rapidly; the following table lists h(q) for
various values of q.

q      2     3     4     5     7     8     16    32    64    128   256
h(q)   3.46  1.79  1.45  1.32  1.20  1.16  1.07  1.03  1.02  1.01  1.004
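The Gaussian coefficient and the bounds (29.8) are easy to check numerically; the truncation depth of the infinite product is an implementation choice:

```python
def gaussian_binomial(n, k, q):
    """The Gaussian coefficient [n choose k]_q: number of k-subspaces of F_q^n."""
    if k < 0 or k > n:
        return 0
    num = den = 1
    for i in range(k):
        num *= q ** (n - i) - 1
        den *= q ** (i + 1) - 1
    return num // den   # the division is always exact

def h(q, terms=64):
    """h(q) = prod_{j>=1} 1/(1 - q^{-j}), truncated; the product converges rapidly."""
    prod = 1.0
    for j in range(1, terms + 1):
        prod /= 1.0 - q ** (-j)
    return prod

# Check the bounds (29.8) at an illustrative point:
n, i, q = 8, 3, 2
g = gaussian_binomial(n, i, q)
assert q ** (i * (n - i)) <= g <= h(q) * q ** (i * (n - i))
print(round(h(2), 2))  # 3.46, matching the table
```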
⟨Y_t⟩ = ⟨G_t P⟩ ⊆ ⟨P⟩,
i.e., even though the network linearly mixes the transmitted packets, the received packets
span a space contained in the space spanned by the transmitted packets. This motivates
a convenient abstraction—termed the operator channel —in which the input and output of
the channel are considered simply as subspaces of Fnq , and any more particular relationship
between the input and output matrices (induced, say, by a particular network topology) is
ignored.
For integer k ≥ 0, define a stochastic operator Hk , called an erasure operator, that
operates on Pq (n). If dim(U ) > k, then Hk (U ) returns a randomly chosen k-dimensional
subspace of U ; otherwise, Hk (U ) returns U . For the purposes of code constructions, the
actual probability distribution of Hk is unimportant; for example, it could be chosen to be
uniform. Given a channel input U and a channel output V , it is always possible to write V
as
V = Hk (U ) ⊕ E
for some subspace E ∈ Pq (n), assuming that k = dim(U ∩ V ), and that Hk (U ) = U ∩ V .
Definition 29.4.2 An operator channel associated with Fnq is a channel with input and
output alphabet Pq (n). The channel input U and channel output V are related as V =
Hk (U ) ⊕ E, where k = dim(U ∩ V ) and E is an error space satisfying E ∩ U = {0}.
In transforming U to V , we say that the channel commits ρ = dim(U ) − k erasures and
t = dim(E) errors.
It is important to note that the concept of an “error” differs from that in classical coding
theory: here it refers to the insertion into the network (by an adversary, say) of vectors not in
the span of the transmitted vectors, so that the received space may contain vectors outside
of that span. Errors (in the absence of erasures) will make the dimension of the received
space larger than that of the transmitted one. The concept of an “erasure” also differs from
that in classical coding theory: here it refers to the possibility that the intersection of the
received subspace with the span of the transmitted vectors may not include all of the vectors
in that span. Erasures (in the absence of errors) will make the dimension of the received
space smaller than the transmitted one.
Note that, according to this definition, the error space E is assumed to intersect trivially
with U; thus the choice of E is not independent of the choice of U. However, if one were to
model the received space as V = H_k(U) + E for an arbitrary error space E, then, since E
always decomposes for some space E′ as E = (E ∩ U) ⊕ E′, one would get

V = H_k(U) + ((E ∩ U) ⊕ E′) = H_{k′}(U) ⊕ E′

for some k′ ≥ k. In other words, components of an error space that intersect with the
transmitted space U would only be helpful, possibly decreasing the number of erasures seen
by the receiver.
In summary, for any sink node t, we consider the operator channel as a point-to-point
channel between s and t, with input and output alphabet equal to Pq (n). The channel takes
in a vector space U and puts out another vector space V , possibly with erasures (deletion
of vectors from U ) or errors (addition of vectors to U ). Of course, in actuality, the source
node s transmits a basis for U into the network, and the sink node t gathers a basis for V .
A code for the operator channel is known as a subspace code. Although one can
consider multi-shot codes [1413, 1445], we will mainly be interested in so-called one-shot
codes, which convey information via a single channel use, sending just a single subspace from
an appropriate codebook of subspaces.
where the latter equalities follow from (29.5). It can be shown [1156, Lemma 1] that this
distance measure does indeed satisfy all of the axioms (including the triangle inequality)
that make it a metric. That dS (U, V ) is a metric also follows from the fact that this quan-
tity represents a graph distance (the length of a geodesic) in the undirected Hasse graph
representing the lattice of subspaces of Fnq partially ordered by inclusion.
It is easy to show that

dS(U, V) = (dim(U) − dim(U ∩ V)) + (dim(V) − dim(U ∩ V))
         = (dim(U + V) − dim(U)) + (dim(U + V) − dim(V)).
The first equality can be interpreted as saying that the subspace distance between U and
V is equal to the number of dimensions that must be removed from U to produce U ∩ V ,
plus the number of additional dimensions that must be adjoined to U ∩ V to produce V .
The second equality says that dS (U, V ) is equal to the number of dimensions that must be
adjoined to U to produce U + V , plus the number of dimensions that must be removed from
U + V to produce V . Evidently, at least one geodesic between U and V in the Hasse graph
passes through U ∩ V while another passes through U + V .
The minimum subspace distance between distinct codewords in a code Ω is denoted
as dS(Ω), i.e.,

dS(Ω) = min {dS(U, V) | U, V ∈ Ω, U ≠ V}.
1 When speaking of subspace distance in this chapter, dimension is measured in the non-projective sense.
In Section 14.5, when speaking of subspace codes and subspace distance, dimension is measured in the
projective sense; fortunately the numerical values of subspace distance agree in both senses. See Footnote 1
of Section 14.5.
For example, the binary constant-dimension subspace code Ψ defined in (29.9) has dS (Ψ) =
4, since every pair of codewords intersects trivially.
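Subspace distances are readily computed from generator-matrix ranks over F_2. The example subspaces below are our own (not the code Ψ of (29.9)); the block also checks the identity 2dI = dS + |dim V − dim U| relating the subspace distance to the injection distance discussed later in this section:

```python
import numpy as np

def gf2_rank(A):
    """Rank over GF(2) by row reduction."""
    A = A.copy() % 2
    rank = 0
    for col in range(A.shape[1]):
        cand = [r for r in range(rank, A.shape[0]) if A[r, col]]
        if not cand:
            continue
        A[[rank, cand[0]]] = A[[cand[0], rank]]
        for r in range(A.shape[0]):
            if r != rank and A[r, col]:
                A[r] ^= A[rank]
        rank += 1
    return rank

def dims(GU, GV):
    """dim U, dim V, dim(U ∩ V) from generator matrices."""
    dU, dV = gf2_rank(GU), gf2_rank(GV)
    d_sum = gf2_rank(np.concatenate([GU, GV]))
    return dU, dV, dU + dV - d_sum

def d_S(GU, GV):
    dU, dV, d_int = dims(GU, GV)
    return dU + dV - 2 * d_int            # subspace distance

def d_I(GU, GV):
    dU, dV, d_int = dims(GU, GV)
    return max(dU, dV) - d_int            # injection distance

GU = np.array([[1, 0, 0, 0], [0, 1, 0, 0]], dtype=np.uint8)                 # dim 2
GV = np.array([[0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]], dtype=np.uint8)   # dim 3
dS, dI = d_S(GU, GV), d_I(GU, GV)
assert 2 * dI == dS + abs(gf2_rank(GV) - gf2_rank(GU))  # identity (29.10)
print(dS, dI)  # 3 2
```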
Consider now information transmission with a subspace code Ω of packet-length n over
Fq using an operator channel that takes a codeword U ∈ Ω to the received space
V = H_{dim(U)−ρ}(U) ⊕ E
where E ∩ U = {0} and ρ ≥ 0. As above, in transforming U to V , the channel commits ρ
erasures and t = dim(E) errors. Given V , a nearest-subspace-distance decoder for Ω returns
a codeword U ∈ Ω having smallest subspace distance to V .
Theorem 29.5.2 When U ∈ Ω is sent, and V is received with ρ erasures and t errors, then
a nearest-subspace-distance decoder for Ω is guaranteed to return U if 2(ρ + t) < dS (Ω).
Proof: Let U′ = H_{dim(U)−ρ}(U). From the triangle inequality we have dS(U, V) ≤
dS(U, U′) + dS(U′, V) = ρ + t. If T ≠ U is any other codeword in Ω, then dS(Ω) ≤
dS(U, T) ≤ dS(U, V) + dS(V, T), from which it follows that dS(V, T) ≥ dS(Ω) − dS(U, V) ≥
dS(Ω) − (ρ + t) > ρ + t ≥ dS(U, V). Since dS(V, T) > dS(U, V), a nearest-subspace-distance
decoder must produce U.

Conversely, consider two codewords U, T ∈ Ω with dS(U, T) = dS(Ω), where (without
loss of generality) dim(U) ≤ dim(T). Let ρ = dS(U, U ∩ T), and let t be the smallest integer
such that 2(t + ρ) ≥ dS(Ω). Note that dim(U) ≤ dim(T) implies that t is nonnegative.
Writing T = (U ∩ T) ⊕ T′, let V = (U ∩ T) ⊕ E, where E is any subspace of T′ of dimension
t. By construction, we then have dS(U, V) = ρ + t, while dS(T, V) ≤ dS(Ω) − (ρ + t) ≤ ρ + t.
Thus, if U is sent and V is received, since both U and T are within subspace distance ρ + t
of V, there is no guarantee that the receiver will produce U.
Another distance measure, introduced in the later paper [1710], is the so-called injection
distance
dI(U, V) = max {dim(U), dim(V)} − dim(U ∩ V)
         = dim(U + V) − min {dim(U), dim(V)}.

The minimum injection distance between distinct codewords in a code Ω is denoted as
dI(Ω), i.e.,

dI(Ω) = min {dI(U, V) | U, V ∈ Ω, U ≠ V}.
Again it can be shown [1710] that this distance measure is indeed a metric. That dI (U, V )
is a metric also follows from the fact that this quantity represents a graph distance in the
undirected Hasse graph representing the lattice of subspaces of Fnq partially ordered by
inclusion, adjoined with the edges of the Grassmann graphs in each dimension.
As explained in [1710], the injection distance is designed to measure adversarial effort in
networks, counting the minimum number of packets that an adversary would need to inject
into the network to transform a given input space U to a given output space V . Whereas the
subspace distance treats errors and erasures symmetrically, the injection distance reflects the
fact that the injection of a single error packet may cause the replacement of a dimension
(i.e., a simultaneous error and erasure); thus errors and erasures should not be treated
symmetrically in network coding applications of subspace codes.
The injection distance and the subspace distance are, however, closely related, as
2dI (U, V ) = dS (U, V ) + | dim(V ) − dim(U )|. (29.10)
In fact, the two metrics agree up to a factor of two when U and V have the same dimension,
i.e., if dim(U) = dim(V), then dS(U, V) = 2 dI(U, V). It follows from (29.10) that
2dI (Ω) ≥ dS (Ω),
Since balls of radius ⌊(d − 1)/2⌋ centered at the codewords of an (n, d, k)_q code must be
disjoint, the total number of points included in the union of such balls cannot exceed the
number of points |G_q(n, k)| = [n choose k]_q in the space as a whole. Thus we get the Sphere
Packing Bound [1156]. A puncturing argument yields the Singleton Bound [1156]:

A_q(n, d, k) ≤ |G_q(n − d + 1, k − d + 1)| = [n−d+1 choose k−d+1]_q = [n−d+1 choose n−k]_q.

The Anticode Bound of Theorem 29.6.3 states that

A_q(n, d, k) ≤ [n choose k−d+1]_q / [k choose k−d+1]_q = [n choose k]_q / [n−k+d−1 choose d−1]_q.
The Anticode Bound is also useful for bounding the parameters of rank-metric codes; see
Chapter 11.
It is easy to observe that the Anticode Bound also implies the Sphere Packing Bound as
a special case, since a ball B_V(n, k, q, ⌊(d − 1)/2⌋) is (by the triangle inequality) an anticode
of maximum distance d − 1. However, a ball is not an optimal anticode in G_q(n, k), and
therefore the bound of Theorem 29.6.3 is always tighter for nontrivial codes.
The bound in Theorem 29.6.3 was first obtained by Wang, Xing and Safavi-Naini in
[1870] using a different argument. The proof that Theorem 29.6.3 follows from Delsarte’s
Anticode Bound is due to Etzion and Vardy [695]. As observed in [1913], the Anticode
Bound is always stronger than the Singleton Bound for nontrivial codes in Gq (n, k).
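The Singleton and Anticode Bounds are easy to compare numerically. The parameter choice (n, d, k, q) = (10, 3, 4, 2) is illustrative, and the floor in the anticode bound uses the fact that A_q(n, d, k) is an integer:

```python
from fractions import Fraction
import math

def gaussian_binomial(n, k, q):
    """The Gaussian coefficient [n choose k]_q."""
    if k < 0 or k > n:
        return 0
    num = den = 1
    for i in range(k):
        num *= q ** (n - i) - 1
        den *= q ** (i + 1) - 1
    return num // den

def singleton_bound(n, d, k, q):
    """A_q(n,d,k) <= [n-d+1 choose k-d+1]_q."""
    return gaussian_binomial(n - d + 1, k - d + 1, q)

def anticode_bound(n, d, k, q):
    """A_q(n,d,k) <= [n choose k-d+1]_q / [k choose k-d+1]_q (not always an integer,
    so we may round down since code sizes are integers)."""
    frac = Fraction(gaussian_binomial(n, k - d + 1, q),
                    gaussian_binomial(k, k - d + 1, q))
    return math.floor(frac)

n, d, k, q = 10, 3, 4, 2
print(anticode_bound(n, d, k, q), singleton_bound(n, d, k, q))  # 4978 10795
assert anticode_bound(n, d, k, q) <= singleton_bound(n, d, k, q)
```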
where the first equality follows from the fact that each codeword of Ω will appear as a
codeword in exactly N_q(n, k, n − 1, k) = (q^{n−k} − 1)/(q − 1) of the Ω_U's. This argument
yields the following theorem [695].
Theorems 29.6.4 and 29.6.5 may be iterated to give an upper bound for Aq (n, d, k).
However, as in the classical case of the Johnson space, the order in which the two bounds
should be iterated in order to get the tightest bound is unclear. By iterating Theorem 29.6.5
with itself, the following bound is established in [695, 1913].
Lemma 29.6.7 ([25]) Let Ω_D ⊆ G_q(n, k) be a code with distances in a set D. Then, for a
nonempty subset S ⊆ G_q(n, k), there exists a code Ω*_D(S) ⊆ S with distances in D such that

|Ω*_D(S)| ≥ |Ω_D| · |S| / |G_q(n, k)|,

where, if |Ω*_D(S)| = 1, then Ω*_D(S) is a code with distances in D by convention.
It is shown in [25] that for t = 0 and m = n − 1, Theorem 29.6.8 gives Theorem 29.6.4.
Unlike the case of classical coding theory in the Hamming metric, the best lower bounds
on Aq (n, d, k) result from code constructions, the subject of the next section. Indeed, given
in Figure 29.2 are plots of the Singleton, Anticode, and Johnson upper bounds, together
with the Gilbert–Varshamov (GV) lower bound for (2048, d, k)256 constant-dimension codes,
with k = 50 and k = 512, expressed in terms of rate versus relative distance. The rate R of
an (n, d, k)_q code Ω is defined as

R = log_q(|Ω|) / (nk);

this is the number of q-ary information symbols conveyed per q-ary symbol sent in the
transmission of k packets of length n symbols from F_q using code Ω. The relative distance
is the ratio d/k;
since 1 ≤ d ≤ k, the relative distance is bounded between 1/k and 1. As can be seen in
Figure 29.2, the upper bounds all give essentially the same result (when expressed on a
logarithmic scale). Also shown in the figure are points achieved by the lifted maximum rank
distance (MRD) codes described in the next section. Although there is a gap between the
Gilbert–Varshamov Bound and the various upper bounds on Aq (n, d, k), the lifted MRD
codes come very close to the upper bounds, and hence are essentially optimal (at least for
the reasonably realistic case of packet lengths of 2K bytes, with network coding performed
on bytes, i.e., in F256 ).
FIGURE 29.2: The GV (lower) Bound and various upper bounds on A_q(n, d, k) expressed
in terms of rate versus relative distance for n = 2048, q = 256, and k ∈ {50, 512}. Also
shown are achievable rates for (n, d, k)_q lifted maximum rank distance codes, with d ∈
{1, 2, 3, . . . , 50} for k = 50 and d ∈ {1, 8, 15, . . . , 512} for k = 512.
29.7 Constructions
29.7.1 Lifted Rank-Metric Codes
In this section, we describe the simplest construction of asymptotically good subspace
codes, which uses rank-metric codes as building blocks. This construction was first proposed
in [1870], and then rediscovered in [1156] for the special case where the rank-metric code
is a Delsarte–Gabidulin code. The construction was later explained in [1708, 1711] in the
context of the subspace/injection distance. The latter description is reviewed below. Rank-
metric codes are described in more detail in Chapter 11; here we review only some basic
properties needed for the purposes of this chapter.
For matrices X, Y ∈ F_q^{n×m}, the rank distance is defined as

d_R(X, Y) = rank(Y − X).

A Delsarte–Gabidulin code is obtained by fixing linearly independent evaluation points
α_1, . . . , α_n ∈ F_{q^m} and a bijective F_q-linear expansion map θ : F_{q^m} → F_q^{1×m}, and
setting

C = { [θ(f(α_1)); θ(f(α_2)); . . . ; θ(f(α_n))] | f(x) a linearized polynomial over F_{q^m} of q-degree less than n − d + 1 },

where the n rows θ(f(α_i)) are stacked into an n × m matrix.
It is shown in [760] that such a code has d_R(C) = d and q^{m(n−d+1)} codewords, and so it is
indeed an MRD code.
Given a rank-metric code C ⊆ F_q^{n×m}, a minimum-rank-distance decoder for C takes a
matrix r ∈ F_q^{n×m} and returns a codeword c ∈ C that minimizes the rank distance d_R(c, r).
It is easy to see that, if dR (c, r) < dR (C)/2 for some c ∈ C, then c is the unique solution to
the above problem. A bounded-distance decoder for C returns c ∈ C if dR (c, r) < dR (C)/2,
or declares a failure if no such codeword can be found. For Delsarte–Gabidulin codes, very
efficient bounded-distance decoders exist; see, e.g., [760, 1711].
Let us return, now, to the construction of constant-dimension subspace codes. For a
matrix X ∈ F_q^{k×m}, let the subspace

Λ(X) = ⟨[I_k  X]⟩ ∈ G_q(k + m, k)

be called the lifting of X.
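The lifting operation, and the standard fact that the injection distance between liftings equals the rank distance between the lifted matrices, can be checked numerically; the matrices X and Y below are illustrative:

```python
import numpy as np

def gf2_rank(A):
    """Rank over GF(2) by row reduction."""
    A = A.copy() % 2
    rank = 0
    for col in range(A.shape[1]):
        cand = [r for r in range(rank, A.shape[0]) if A[r, col]]
        if not cand:
            continue
        A[[rank, cand[0]]] = A[[cand[0], rank]]
        for r in range(A.shape[0]):
            if r != rank and A[r, col]:
                A[r] ^= A[rank]
        rank += 1
    return rank

def lift(X):
    """Generator matrix [I_k | X] of the lifting Λ(X), a k-space in F_q^(k+m)."""
    return np.concatenate([np.eye(X.shape[0], dtype=X.dtype), X], axis=1)

X = np.array([[1, 0, 1], [0, 1, 1]], dtype=np.uint8)
Y = np.array([[1, 1, 0], [0, 1, 1]], dtype=np.uint8)
GX, GY = lift(X), lift(Y)

d_sum = gf2_rank(np.concatenate([GX, GY]))
d_int = gf2_rank(GX) + gf2_rank(GY) - d_sum
d_inj = max(gf2_rank(GX), gf2_rank(GY)) - d_int   # injection distance
d_rank = gf2_rank(X ^ Y)                          # over F_2, Y - X = X XOR Y
print(d_inj, d_rank)  # 1 1
```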
Proof: This follows from Theorem 29.4.1, from which we know that dim(U ∩ V ) ≤
wtH (id(U ) ? id(V )). If dim(U ) ≥ dim(V ), then wtH (id(U )) ≥ wtH (id(V )) and
This also follows from the work of Etzion and Silberstein [692], who give the subspace
distance counterpart of Theorem 29.7.2.
Theorem 29.7.3 ([692]) For any pair of spaces U, V ∈ Pq (n),
codes, however with a reduced row-echelon form conforming to matrices having identifying
vector a.
For example, the identifying vector a = (101010) corresponds to the 3×3 Ferrers diagram
• • •
F (a) = • • .
•
for some elements v11, v12, v13, v22, v23, v33 ∈ F_q. The codeword v is lifted to the subspace

Λ_a(v) = ⟨ [1 v11 0 v12 0 v13; 0 0 1 v22 0 v23; 0 0 0 0 1 v33] ⟩ ∈ Λ_a(C_a)

(rows separated by semicolons).
Theorem 29.7.4 Let A be a binary code of length n with constant codeword weight k and
minimum Hamming distance 2d, or equivalently, minimum asymmetric distance d. For each
codeword a ∈ A, let C_a be an FD code over F_q for a with minimum rank distance d. Finally,
let

Ω = ⋃_{a∈A} Λ_a(C_a).

Then Ω is an (n, d, k)_q code (in the injection metric) with |Ω| = Σ_{a∈A} |C_a|.
Proof: Distinct codewords of Ω in the same Schubert cell are separated by an injection
distance of at least d, as they are codewords of the same lifted FD code. Codewords of
Ω in different Schubert cells are separated by an injection distance of at least d due to
Theorems 29.7.2 and 29.7.3. The packet-length, codeword dimension, and cardinality of Ω
are obvious.
We refer to a code obtained by this multi-level construction as a union of lifted FD
codes.
The size of such constant-dimension codes depends on the size of the corresponding FD
codes. Bounds on the size of linear FD codes with prescribed minimum distance are given
in [692]; see also [1110]. The construction of maximal Ferrers diagram rank-metric codes is
considered in [51, 691].
It is interesting to observe that the padded codes of Section 29.7.2 are in fact a union of lifted
FD codes, corresponding to the constant-weight-k code A in which the codewords have k
consecutive nonzero positions starting in positions 1, k + 1, 2k + 1, . . . .
In this construction a naive choice for A would be one with a high information rate.
However, a high information rate would only result in a large number of selected Schubert
cells, and does not necessarily guarantee a high overall rate for the resulting (n, d, k)q
code. This is due to the fact that the size of a lifted FD code depends on the size of its
underlying Ca code, which in turn depends on a. In particular, the largest Schubert cell,
with the largest corresponding lifted FD code, is selected when a = 11 · · · 100 · · · 0. This
vector is actually the first to occur in reverse lexicographic order. In [692] constant-weight
lexicodes, always including 11 · · · 100 · · · 0, are used to select well-separated Schubert cells
in the Grassmannian.
This construction idea can be extended to obtain mixed-dimension subspace codes. If
A has minimum asymmetric distance d and each Ca code is designed to have minimum
rank distance d, then the resulting subspace code is guaranteed to have minimum injection
distance d. Similarly, if A has minimum Hamming distance 2d and each Ca code is designed
to have minimum rank distance d, then the resulting subspace code is guaranteed to have
minimum subspace distance 2d.
Further constructions of subspace codes based on Ferrers diagrams and related concepts
are given in [1707].
Every (n, δ, k)_q code Ω can be associated with a {0, 1}-valued column vector x with
[n choose k]_q rows corresponding to elements V ∈ G_q(n, k). The entry x_V in the row
corresponding to subspace V takes value 1 if and only if V ∈ Ω, so that |Ω| = Σ_{V ∈ G_q(n,k)} x_V.
Furthermore, M x is then a column vector with [n choose k−d+1]_q rows corresponding to
elements W ∈ G_q(n, k − d + 1), where the entry in the row corresponding to subspace W
takes value i if subspace W occurs exactly i times as a subspace of some codeword of Ω. In
order to ensure that δ ≥ d, no W can occur more than once (otherwise two codewords of Ω
would have a common (k − d + 1)-dimensional subspace); thus no entry of M x may exceed
unity.
The code construction problem then becomes the following integer optimization problem:

maximize:    Σ_{V ∈ G_q(n,k)} x_V
subject to:  M x ≤ 1 and x_V ∈ {0, 1} for all V ∈ G_q(n, k),

where 1 denotes the all-one column vector.
Grassmannian in which the subspace is contained; see, e.g., [158, 808, 1476, 1599, 1818].
A generalization of subspace codes to so-called “flag codes” is given in [1257]. Subspace
code constructions based on concepts of discrete geometry are given in [449, 450, 451, 452].
Constructions and bounds for mixed-dimension subspace codes are considered in [977].
be an (n, d, k)_q union of lifted FD codes, constructed as described in Section 29.7.3, and
suppose that A is given as {a_1, a_2, . . . , a_{|A|}}. Let c_1 = 0, and, for 2 ≤ i ≤ |A|, let
c_i = Σ_{j=1}^{i−1} |C_{a_j}|.
Codewords are numbered starting at zero. To map an integer m in the range {0, . . . , |Ω|−
1} to a codeword: (a) find the largest index i such that ci ≤ m, (b) map the integer j = m−ci
to a codeword of Cai (using an encoder for the corresponding rank-metric code), which can
then be lifted to the corresponding subspace. Note that $0 \le j < |C_{a_i}|$. Conversely, the jth
codeword of $C_{a_i}$ maps back to the message $m = c_i + j$. Assuming efficient encoding of the
underlying rank-metric codes, the main complexity of the encoding algorithm, given m, is
to determine the corresponding ci , which can be done using a binary search in time at worst
proportional to log |A|.
Several algorithms have been proposed for implementing the function dec(·). The first
such algorithm was proposed by Kötter and Kschischang in [1156] and is a version of Sudan’s
“list-of-1” decoding for Delsarte–Gabidulin codes. The time complexity of the algorithm is
O((k + m)2 m2 ) operations in Fq . A faster algorithm was proposed in [1711] which is a
generalization of the standard (“time-domain”) decoding algorithm for Delsarte–Gabidulin
codes. The complexity of this algorithm is O(dm3 ) operations in Fq . As shown in [1709], the
algorithm in [1711] can significantly benefit from the use of optimal (or low-complexity) nor-
mal bases, further reducing the decoding complexity to (11t2 +13t+m)m2 /2 multiplications
in Fq (and a similar number of additions). Finally, a transform-domain decoding algorithm
was proposed in [1708, 1709], which is slightly faster than that of [1711] for low-rate codes.
As we will see, a bounded-distance decoder for a lifted Delsarte–Gabidulin code can
be used as a black box for decoding many other subspace codes. For instance, consider
$C^r = C \times \cdots \times C$, the rth Cartesian power of a Delsarte–Gabidulin code $C \subseteq F_q^{k \times m}$. Recall
that $\Lambda(C^r)$ is a $(k + rm, d, k)_q$ code, where $d = d_R(C)$. Let $\mathrm{dec}(\cdot)$ be a bounded-distance
decoder for $\Lambda(C)$. Then a bounded-distance decoder for $\Lambda(C^r)$ can be obtained as the map
$\mathcal{P}_q(k + rm) \to C^r \cup \{\varepsilon\}$ given by
$$U \mapsto \begin{cases} \hat{c} = \hat{c}_1 \cdots \hat{c}_r & \text{if } \hat{c}_i \ne \varepsilon,\ i = 1, \ldots, r, \text{ and } d_I(\Lambda(\hat{c}), U) \le t, \\ \varepsilon & \text{otherwise,} \end{cases}$$
where $\hat{c}_i = \mathrm{dec}(\langle [A \;\; y_i] \rangle)$, $i = 1, \ldots, r$, and $A \in F_q^{\ell \times k}$ and $y_1, \ldots, y_r \in F_q^{\ell \times m}$ are such that
$\langle [A \;\; y_1 \; \cdots \; y_r] \rangle = U$. In other words, we can decode $\Lambda(C^r)$ by decoding each Cartesian
component individually (using the same matrix A on the left) and then checking whether
the resulting subspace codeword is within the bounded distance from U .
is satisfied. Note that P (a) simply rearranges the columns of a matrix so that an FD-
restricted lifting is transformed to a usual lifting. Since P (a) is a nonsingular linear trans-
formation, and therefore preserves dimensions, we have
$$d_A(a, a') \le d_I(U, V) \le t,$$
29.9 Conclusions
As we have seen in this chapter, the ideas that underlie network coding are simple,
yet potentially far-reaching. Allowing the nodes in a network to perform coding (not just
routing) permits the multicast capacity of a network to be achieved with linear network
coding. A careful network code design is not required: a completely random choice
of network coding coefficients will, with high probability when the field size is
sufficiently large, achieve capacity. In practice, the random linear network coding approach
results in multicast protocols that are robust to changes in the underlying network topology.
The random linear network coding approach also opens an intriguing set of problems
in coding theory, centered on the design of subspace codes, but strongly connected with
the design of codes in the rank metric, and problems in geometry. As we have seen in
this chapter, many bounds and constructions of classical coding theory have subspace-code
analogs. The book [852] collects together many contributions from experts in this area, going
well beyond the results presented in this chapter. A table of bounds and constructions for
subspace codes and constant-dimension codes is maintained at
http://subspacecodes.uni-bayreuth.de/.
From a practical standpoint, at least for applications in network coding, it seems that the
main problems are largely solved, as constant-dimension lifted rank-metric codes contain
close to the maximum possible number of codewords (at least on a logarithmic scale),
and efficient encoding and decoding algorithms have been developed. It is unlikely that
codes with marginally larger codebooks (even though they exist) will justify the additional
complexity needed to process them.
From a mathematical standpoint, however, much remains open and—as in coding theory
in general—finding new approaches for constructing subspace codes and decoding them
efficiently as well as finding new applications for such codes remains an active area of
research.
Chapter 30
Coding for Erasures and Fountain Codes
Ian F. Blake
University of British Columbia
30.1 Introduction
This chapter addresses the problem of the efficient correction of erasures for such appli-
cations as the dissemination of large downloads on the internet. Two aspects of the problem
are considered. In the next section a class of block codes designed for erasure correction,
the Tornado codes, is considered. The construction of these codes introduced several concepts, in
particular the notion of probabilistic construction of bipartite graphs, that play a prominent
role in the other type of code considered, the fountain codes examined in Sections 30.3 and
30.4. In both classes of codes, linear encoding and decoding complexity is achieved.
Remark 30.1.1 The term packet will be used interchangeably with symbol as a sequence
of bits, which in the sequel will also be identified with nodes of a code bipartite graph.
For uniformity, the left nodes of such a graph will be referred to as information symbols or
packets and the right nodes as coded or check symbols or packets. Since the only operation
on such symbols is an XOR with other symbols, sometimes referred to as addition (as in
the finite field F2 ), the length of the symbols is not an issue.
Example 30.1.2 For 0110, 1101 ∈ F42 the XOR is 0110 ⊕ 1101 = 1011.
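Since the only operation on packets is XOR, Example 30.1.2 extends bytewise to packets of any fixed length; a minimal sketch (the helper name is an assumption, not from the text):

```python
def xor_packets(p1: bytes, p2: bytes) -> bytes:
    """Bytewise XOR of two equal-length packets; length plays no other role."""
    assert len(p1) == len(p2)
    return bytes(a ^ b for a, b in zip(p1, p2))

# The bit-level case of Example 30.1.2: 0110 XOR 1101 = 1011.
print(format(0b0110 ^ 0b1101, "04b"))  # -> 1011
```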
Remark 30.1.3 There are two commonly used bipartite graphs to represent codes. In one,
the k left nodes are information nodes and the (n − k) right nodes are check nodes.
The alternative representation is for the left nodes of the graph to represent the n codeword
positions or parity check matrix columns and the right nodes the parity check matrix rows.
Thus left node j is connected to right node i if the element hij of the parity check matrix H
is 1. This second representation is usually referred to as the Tanner graph of the code and
when the left nodes represent a codeword, the right nodes are all zero. This representation
is commonly used in the analysis of LDPC codes where the left nodes are usually referred
to as variable nodes and the right nodes as check nodes; see Chapter 21 for the study of
LDPC codes. The first representation will be used in this work. The representations are
shown in Figure 30.1 for the [7, 4, 3]2 Hamming code.
FIGURE 30.1: (a) Code parity check matrix of the [7, 4, 3]2 Hamming code; (b) first
representation of the code; (c) Tanner graph of the code
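The two representations can be illustrated concretely; the parity check matrix below is one standard choice for the [7, 4, 3]2 Hamming code and may be arranged differently from the one in Figure 30.1:

```python
# Tanner-graph adjacency for a [7,4,3] Hamming code. This H is one standard
# choice; the matrix drawn in Figure 30.1 may be arranged differently.
H = [
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]

# Second representation: left node j is joined to right (check) node i
# whenever h_ij = 1.
edges = [(j, i) for i in range(3) for j in range(7) if H[i][j] == 1]

def checks(word):
    """Evaluate all parity checks; for a codeword every check is zero."""
    return [sum(H[i][j] * word[j] for j in range(7)) % 2 for i in range(3)]

print(checks([1, 0, 1, 0, 1, 0, 1]))  # a codeword -> [0, 0, 0]
print(len(edges))                     # 12 edges in the Tanner graph
```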
Remark 30.1.4 In fountain codes, coded packets are generated as random XORs of se-
lected information packets and transmitted on the channel. A client or receiver accepts
coded packets as they arrive until a sufficient number are received that allows for decoding
(solving a matrix equation) to retrieve the information packets. In this context an erasure is
simply a missing packet. All randomly generated coded packets are equivalent in this regard,
and such a system is ideal for a multicast environment in which feedback from the client
to the transmitter is disallowed, to prevent the server from being overwhelmed
by retransmission requests for particular missing packets.
FIGURE 30.2: The binary erasure channel: input 0 (respectively 1) is received as 0 (respectively 1) with probability 1 − p, and as the erasure symbol E with probability p
Remark 30.1.5 In preparation for the discussion of Tornado codes in the next section,
some elements of erasure-correction for linear block codes are recalled. The binary erasure
channel (BEC), shown in Figure 30.2, has a binary input {0, 1} and ternary output
{0, 1, E} where E represents an erasure, which is an error whose location is known. If the
erasure probability is p, the capacity of such a channel is CBEC = 1 − p. Consider the use
of a binary linear [n, k, d]2 code. Suppose the codeword c ∈ Fn2 is transmitted and the word
r received. This word may have up to d − 1 erasures in it which must be solved for. All
vectors will be row vectors and displayed in bold face. Suppose e ≤ d − 1 erasures are made
in transmission, and let H be the (n − k) × n parity check matrix of the code. Suppose the
columns of the parity check matrix are labelled $h_i^T$, $i = 1, 2, \ldots, n$. Assume the e erasures
occurred in positions $i_1, i_2, \ldots, i_e$ and that the received word is $r = r_1 r_2 \cdots r_n$.
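The erasure-solving step amounts to a linear solve over F2: the known coordinates determine a syndrome, and the columns of H at the erased positions give a system for the missing values. A minimal sketch (the function name and the Hamming-code example are illustrative choices, not from the text):

```python
def correct_erasures(H, r):
    """Solve for erased positions (None entries) of r using the columns of H
    at the erased coordinates, by Gaussian elimination over GF(2)."""
    n = len(r)
    E = [i for i in range(n) if r[i] is None]
    m = len(H)
    # Syndrome contribution of the known coordinates.
    s = [sum(H[row][i] * r[i] for i in range(n) if r[i] is not None) % 2
         for row in range(m)]
    # Augmented system [A | s], A = columns of H at the erased positions.
    aug = [[H[row][i] for i in E] + [s[row]] for row in range(m)]
    e, piv, rank = len(E), [], 0
    for col in range(e):
        p = next((row for row in range(rank, m) if aug[row][col]), None)
        if p is None:
            continue                     # dependent column: value stays free
        aug[rank], aug[p] = aug[p], aug[rank]
        for row in range(m):
            if row != rank and aug[row][col]:
                aug[row] = [(x + y) % 2 for x, y in zip(aug[row], aug[rank])]
        piv.append(col)
        rank += 1
    out = list(r)
    for row, col in enumerate(piv):
        out[E[col]] = aug[row][e]
    return out

H = [[1, 0, 1, 0, 1, 0, 1],
     [0, 1, 1, 0, 0, 1, 1],
     [0, 0, 0, 1, 1, 1, 1]]             # [7,4,3] Hamming: handles d - 1 = 2 erasures
print(correct_erasures(H, [1, 0, None, 0, None, 0, 1]))  # -> [1, 0, 1, 0, 1, 0, 1]
```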
Given the value of a check symbol and all but one of the information symbols on which it
depends, set the missing information symbol to be the XOR of the check symbol and its
known attached information symbols.
Remark 30.2.2 The success of this algorithm depends on there being a check node with
the desired property at each stage of the decoding process. The algorithm requires that all
check nodes be received without erasure. A cascade of codes is constructed as follows. At
the first stage consider a bipartite graph with k information nodes on the left and βk check
nodes on the right for some β < 1. Label this code C(B0 ). For the next stage, create another
bipartite graph using the check nodes of C(B0 ) as left nodes and add β 2 k check nodes on the
right—a code graph labelled C(B1 ). Continue in this manner, at each stage the check (right)
nodes of the code graph C(Bi ) serving as the input (left) nodes of C(Bi+1 ). The algorithm
applied to each of these stages requires all check nodes to be received without erasures. To
complete the construction a small erasure-correction code C is added with kβ m+2 /(1 − β)
check bits. The overall code is referred to as C(B0 , B1 , . . . , Bm , C). It has k information bits
and
$$\sum_{i=1}^{m+1} \beta^i k + \frac{\beta^{m+2} k}{1-\beta} = \frac{k\beta}{1-\beta}$$
check bits.
Remark 30.2.3 If the last code C is able to correct any fraction β(1 − ε) of erasures (in
either check or information bits, i.e., left or right nodes of C) with high probability, and if each
of the intermediate codes C(Bi ) is able to correct a fraction β(1 − ε) of left nodes with high
probability, given all check or right nodes are correct, then overall the code should be able
to correct a fraction β(1 − ε′) of erasures with high probability. The relationship between
ε and ε′ is complex and the reader is referred to the proof of Theorem 2 in [1299] for details.
The cascade construction of Remark 30.2.2 gives an overall structure of a code and its
decoding but has so far not specified how the bipartite graphs are to be constructed. The
construction of the graphs will each be randomized according to certain distributions. For a
given bipartite graph, an edge in the graph is said to be of left edge degree i if it is con-
nected to a node on the left of degree i, and similarly for right edge degree. Consider two
edge degree distributions, a left distribution (λ1 , λ2 , . . .) and a right distribution (ρ1 , ρ2 , . . .)
where λi (respectively ρi ) is the fraction of edges in the graph of left (respectively right)
degree i. Note that each stage of the cascade construction requires a bipartite graph and
each bipartite graph will be chosen randomly according to the (λ, ρ) distributions. Define
the polynomials $\lambda(x) = \sum_i \lambda_i x^{i-1}$ and $\rho(x) = \sum_i \rho_i x^{i-1}$. These will be referred to as a
(λ, ρ) edge degree distribution.
Several properties of a bipartite graph constructed according to these distributions follow
immediately. Let E be the total number of edges in the graph. Then the number of left
(respectively right) nodes of degree i is Eλi /i (respectively Eρi /i). If aL (respectively aR )
is the average degree of the left (respectively right) nodes, then
$$a_L = \frac{1}{\sum_i \lambda_i/i} = \frac{1}{\int_0^1 \lambda(x)\,dx} \qquad\text{and}\qquad a_R = \frac{1}{\sum_i \rho_i/i} = \frac{1}{\int_0^1 \rho(x)\,dx}.$$
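These node-count and average-degree identities are easy to verify exactly; the toy distribution below (half the edges of left degree 2, half of left degree 3, right regular of degree 6) is an illustrative choice, not one from the text:

```python
from fractions import Fraction

# Toy edge-degree distributions: half the edges have left degree 2, half have
# left degree 3; every edge has right degree 6 (right regular).
lam = {2: Fraction(1, 2), 3: Fraction(1, 2)}
rho = {6: Fraction(1, 1)}

def avg_degree(dist):
    # a = 1 / sum_i d_i/i; the integral form agrees, since
    # int_0^1 x^(i-1) dx = 1/i gives int poly = sum_i d_i/i.
    return 1 / sum(p / i for i, p in dist.items())

aL, aR = avg_degree(lam), avg_degree(rho)

E = 60                                                # total number of edges
left_nodes = sum(E * p / i for i, p in lam.items())   # sum of E*lambda_i/i
right_nodes = sum(E * p / i for i, p in rho.items())  # sum of E*rho_i/i

print(aL, aR, left_nodes, right_nodes)  # -> 12/5 6 25 10
assert left_nodes * aL == E and right_nodes * aR == E
```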
Remark 30.2.4 Given a left and a right edge degree distribution the following technique is
one possibility for constructing a bipartite graph with the required properties. It is assumed
all the quantities needed are integral; suitable approximations could be obtained for the
cases when this is not true. Consider four columns of nodes. Choose an appropriate number
of edges E for the graph. The number of left nodes in the first column of degree i will be
Eλi /i. These nodes have i edges emanating from each of them and these, for all i, terminate
in the second column of nodes. Similarly for the right nodes: create Eρj /j right nodes, with
j edges and these nodes form the fourth column with the nodes at the other end of their
edges forming the third column. So the second and third columns have E nodes each. Select
a random permutation on the integers 1, 2, . . . , E and connect the nodes of the second and
third columns, according to this permutation, to create the bipartite graph. The nodes in
the first column are the left nodes of the bipartite graph, and the nodes of the fourth column
are the right nodes; the nodes in columns two and three are absorbed into the edges in the
obvious way. The resulting bipartite graph has the required left and right edge degrees.
There may be multiple edges between two nodes which can be corrected by deleting edges.
Such a process will have little effect on the probabilities involved.
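The four-column construction of Remark 30.2.4 is, in effect, a random matching of edge "stubs". A minimal sketch (the function name and small parameters are illustrative; repeated edges are left in place rather than deleted):

```python
import random

def build_bipartite(E, lam, rho, seed=0):
    """Construct a bipartite graph whose edge-degree fractions follow lam/rho,
    via a random matching of edge stubs as in Remark 30.2.4. Assumes every
    E*lam[i]/i and E*rho[j]/j is integral; repeated edges are not removed."""
    rng = random.Random(seed)
    left_stubs, right_stubs = [], []
    node = 0
    for i, frac in lam.items():              # E*lam[i]/i left nodes of degree i
        for _ in range(int(E * frac / i)):
            left_stubs.extend([node] * i)    # column two: i stubs per node
            node += 1
    node = 0
    for j, frac in rho.items():              # E*rho[j]/j right nodes of degree j
        for _ in range(int(E * frac / j)):
            right_stubs.extend([node] * j)   # column three: j stubs per node
            node += 1
    rng.shuffle(right_stubs)                 # the random permutation on 1..E
    return list(zip(left_stubs, right_stubs))

edges = build_bipartite(12, {2: 0.5, 3: 0.5}, {6: 1.0})
print(len(edges))  # -> 12
```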
Algorithm 30.2.1 is equivalent to the next algorithm, which is in a slightly more conve-
nient form for the sequel. It is applied to each stage of the cascade of graphs.
Algorithm 30.2.5
Associate with the nodes of the code graph a register initially loaded with received values.
For each left information node that is not erased, add the associated value to all neighbor
registers on the right and erase all edges emanating from that node. If after this operation
there is a check node of degree one, replace the erased left neighbor with the value of
the check node register, XOR that value to all other attached nodes of the left neighbor
and erase all edges from the two nodes. The left information neighbor node is said to be
resolved. Continue the process until all left nodes are resolved or there are no check nodes
of degree one. If there is no check node of degree one and there are remaining unresolved
information nodes, the decoding fails.
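Algorithm 30.2.5 can be sketched as follows; the single-stage four-bit example in the usage line, and the function name and data layout, are illustrative assumptions, not taken from [1299]:

```python
def peel_decode(info, check_nbrs, check_vals):
    """info[i] is a received bit or None (erased); check_nbrs[c] lists the
    information nodes attached to check c; check_vals[c] is the received check
    bit (all checks assumed received, as Remark 30.2.2 requires)."""
    info, reg = list(info), list(check_vals)
    nbrs = [set(s) for s in check_nbrs]
    # Fold every known information bit into its neighboring check registers.
    for i, v in enumerate(info):
        if v is not None:
            for c in range(len(nbrs)):
                if i in nbrs[c]:
                    reg[c] ^= v
                    nbrs[c].discard(i)
    progress = True
    while progress:                      # resolve via degree-one checks
        progress = False
        for c in range(len(nbrs)):
            if len(nbrs[c]) == 1:
                i = nbrs[c].pop()
                info[i] = reg[c]         # the erased left neighbor is resolved
                for c2 in range(len(nbrs)):
                    if i in nbrs[c2]:
                        reg[c2] ^= info[i]
                        nbrs[c2].discard(i)
                progress = True
    return info if all(v is not None for v in info) else None

# Four information bits with checks x0^x1, x1^x2, x2^x3; bits x1, x2 erased.
print(peel_decode([1, None, None, 1], [[0, 1], [1, 2], [2, 3]], [1, 1, 0]))
# -> [1, 0, 1, 1]
```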
FIGURE 30.3: A small example of Algorithm 30.2.5 using the Hamming [7, 4, 3]2 code of
Figure 30.1
Example 30.2.6 The various steps of Algorithm 30.2.5 are shown in Figure 30.3. Notice
in Step (d) the two uppermost check digits are the same and give the corresponding infor-
mation bit the value 0. In this circumstance the two check bits could not differ if the set
of information and check bits corresponds to a valid codeword and the number of erasures
does not exceed 2, the erasure-correcting capability of the code.
With the probabilistic construction of the graphs according to the above edge distribu-
tions, one is able to set up differential equations for the number of left and right degree
edges as time evolves and solve them at each stage of the algorithm to give an analysis
of the decoding algorithm; see [1295, 1298, 1299]. It is desired to choose the left and right
edge degree distributions to maximize the probability that the process ends successfully,
i.e., with all information nodes resolved. The analysis is straightforward but involved. The
results are reported in the following two impressive theorems.
Theorem 30.2.7 ([1298, Theorem 2]) Let k be an integer and suppose that
C = C(B0 , B1 , . . . , Bm , C)
is a cascade of bipartite graphs from Remark 30.2.2 where B0 has k left (information) nodes.
Suppose that each Bi is chosen at random with edge degrees specified by λ(x) and ρ(x) such
that λ1 = λ2 = 0, and suppose δ is such that
$$\rho\bigl(1 - \delta\lambda(x)\bigr) > 1 - x \quad \text{for } 0 < x \le 1. \tag{30.1}$$
Theorem 30.2.8 ([1298, Theorem 3]) For any rate R with 0 < R < 1, any ε with
0 < ε < 1, and sufficiently large block length n, there is a linear code and a decoding algorithm
that, with probability $1 - O(n^{-3/4})$, is able to correct a random (1 − R)(1 − ε)-fraction of
erasures in time proportional to n ln(1/ε).
Remark 30.2.9 It is noted in [1298] that there is a dual version of the condition on the
distributions in Theorem 30.2.7:
$$\delta\lambda\bigl(1 - \rho(y)\bigr) < 1 - y \quad \text{for } 0 \le y < 1.$$
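Conditions such as (30.1) and the dual above are straightforward to test numerically on a grid. The degree pair below (λ(x) = x², ρ(x) = x⁵, i.e., left degree 3 and right degree 6) and the value δ = 0.40 are illustrative choices, not taken from [1298]:

```python
def condition_holds(lam, rho, delta, steps=1000):
    """Check rho(1 - delta*lam(x)) > 1 - x at x = 1/steps, 2/steps, ..., 1."""
    return all(rho(1 - delta * lam(s / steps)) > 1 - s / steps
               for s in range(1, steps + 1))

lam = lambda x: x ** 2   # every edge has left degree 3 (lambda_3 = 1)
rho = lambda x: x ** 5   # every edge has right degree 6 (rho_6 = 1)

print(condition_holds(lam, rho, 0.40))  # -> True: delta = 0.40 is admissible
print(condition_holds(lam, rho, 1.00))  # -> False: fails, e.g., at x = 1
```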
Remark 30.2.10 Theorem 30.2.7 establishes a probabilistic encoding and decoding algo-
rithm that is capacity achieving. Three properties of the construction are:
(i) [1684, Theorem 1] For λ(x), ρ(x) and δ satisfying the conditions of Theorem 30.2.7
above,
$$\delta \le \frac{a_L}{a_R}\bigl(1 - (1 - \delta)^{a_R}\bigr).$$
(ii) [1684, Corollary 1]
$$\delta \le \frac{a_L}{a_R}\bigl(1 - (1 - a_L/a_R)^{a_R}\bigr),$$
a result that follows from (i) since $\delta \le a_L/a_R$.
(iii) [1684, Lemma 2] If (λ(x), ρ(x)) satisfy the condition that δλ(1 − ρ(1 − x)) < x for all
x ∈ (0, δ], then $\delta \le \rho'(1)/\lambda'(0)$.
Remark 30.2.11 Given the theory developed for achieving capacity on the BEC with
linear decoding complexity, it is of interest to derive distributions that satisfy the conditions
of Theorem 30.2.7 and, equivalently, Remark 30.2.9.
Example 30.2.12 ([1684, 1685]) Two distributions, with parameters a, α, and N, are given by:
$$\rho_a(x) = x^{a-1} \qquad\text{and}\qquad \lambda_{\alpha,N}(x) = \frac{\sum_{k=1}^{N-1} \binom{\alpha}{k} (-1)^{k+1} x^k}{\alpha - N \binom{\alpha}{N} (-1)^{N+1}}, \quad\text{where}\quad \binom{\alpha}{N} = \frac{\alpha(\alpha-1)\cdots(\alpha-N+1)}{N!}.$$
The numerator of the right-hand side of the equation for $\lambda_{\alpha,N}(x)$ is a truncated version of
$1 - (1-x)^\alpha$ for real α, a version of the general binomial theorem. Notice that for
0 < α < 1 the terms in the numerator are positive (the binomials alternate in sign, as does
$(-1)^{k+1}$). With the distributions $(\lambda, \rho) = (\lambda_{\alpha,N}, \rho_a)$, the condition required for asymptotic
optimality reduces to ([1684, Theorem 2])
$$\delta\lambda\bigl(1 - \rho(1-x)\bigr) \le \frac{\delta\alpha}{\alpha - N \binom{\alpha}{N} (-1)^{N+1}}\, x.$$
So to satisfy the condition $\delta\lambda\bigl(1 - \rho(1-x)\bigr) < x$ (from Remark 30.2.9) it is required that
$$\delta \le \frac{\alpha - N \binom{\alpha}{N} (-1)^{N+1}}{\alpha},$$
and from this it can be shown [1684] that the distribution pair (λ, ρ) is asymp-
totically optimal. Notice the nodes on the right all have degree a, so the graph is
referred to as right regular.
Example 30.2.13 ([1685, Section 3.4]) Let D be a positive integer (that will be an
indicator for how close δ can be made to 1 − R for the sequences obtained). Let $H(D) =
\sum_{i=1}^{D} \frac{1}{i}$ denote the harmonic sum truncated at D and note that H(D) ≈ ln(D). Two
distributions parameterized by the positive integer D are
$$\lambda_D(x) = \frac{1}{H(D)} \sum_{i=1}^{D} \frac{x^i}{i} \qquad\text{and}\qquad \rho_D(x) = e^{\mu(x-1)} \;\text{(truncated)}$$
where $\lambda_D(x)$ is a truncated version of $-\ln(1-x)/H(D)$ and $\rho_D(x)$ is a truncated version of the Poisson
distribution, by which is meant that the sequence of probabilities
$$\rho_{D,i} = e^{-\mu} \frac{\mu^{i-1}}{(i-1)!}$$
is truncated at some high power so that the following development remains valid. The
parameter μ is the solution to the equation
$$\frac{1}{\mu}\bigl(1 - e^{-\mu}\bigr) = \frac{1-R}{H(D)} \Bigl(1 - \frac{1}{D+1}\Bigr). \tag{30.2}$$
Such a distribution gives a code of rate at least R. Note the average left degree is
$a_L = H(D)(D+1)/D$ and $\int_0^1 \lambda_D(x)\,dx = \frac{1}{H(D)}\bigl(1 - \frac{1}{D+1}\bigr)$. The right degree edge
distribution is approximated by the Poisson distribution. It can also be established that
$$\delta\lambda_D\bigl(1 - \rho_D(1-x)\bigr) = \delta\lambda_D\bigl(1 - e^{-\mu x}\bigr) \le \frac{-\delta}{H(D)} \ln\bigl(e^{-\mu x}\bigr) = \frac{\delta\mu x}{H(D)}.$$
For the right hand side of the last equation to be at most x it is required that
$$\delta \le H(D)/\mu.$$
By (30.2),
$$H(D)/\mu = (1-R)\bigl(1 - 1/(D+1)\bigr)\big/\bigl(1 - e^{-\mu}\bigr).$$
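Equation (30.2) has no closed form for μ, but since $(1 - e^{-\mu})/\mu$ decreases in μ it is easily solved by bisection; the sketch below (R = 1/2 and D = 10 are illustrative values, not from [1685]) also reports the resulting bound δ ≤ H(D)/μ:

```python
import math

def harmonic(D):
    return sum(1.0 / i for i in range(1, D + 1))

def solve_mu(R, D):
    """Bisection for mu in (30.2): (1 - e^-mu)/mu = (1-R)/H(D) * (1 - 1/(D+1)).
    The left side decreases from 1 (as mu -> 0) toward 0, so the root is
    bracketed once the right side lies below 1."""
    rhs = (1 - R) / harmonic(D) * (1 - 1 / (D + 1))
    lo, hi = 1e-9, 50.0
    for _ in range(200):
        mid = (lo + hi) / 2
        if (1 - math.exp(-mid)) / mid > rhs:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

R, D = 0.5, 10
mu = solve_mu(R, D)
print(round(mu, 3), round(harmonic(D) / mu, 4))  # delta bound H(D)/mu < 1 - R
```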
The technique involves, for a given λ(x) and δ, using the condition of Theorem 30.2.7
to generate linear constraints for the $\rho_i$ and then using linear programming to determine
whether suitable $\rho_i$ exist. The reader is referred to the numerous references on capacity-achieving
sequences, including [1297, 1298, 1299, 1475, 1589, 1684, 1690].
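A full LP solver is beyond this sketch, but the constraints it would enforce are easy to state: each grid point x contributes one linear inequality in the ρi. The toy search below restricts attention to right-regular candidates ρ(x) = x^(a−1) and finds the largest admissible degree for an illustrative λ and δ (a simplified stand-in for the LP, not the method of the references):

```python
def feasible(a, lam, delta, steps=500):
    """Do the grid constraints rho(1 - delta*lam(x)) > 1 - x hold for the
    right-regular candidate rho(x) = x^(a-1)?"""
    return all((1 - delta * lam(s / steps)) ** (a - 1) > 1 - s / steps
               for s in range(1, steps + 1))

lam = lambda x: x ** 2          # illustrative left distribution (lambda_3 = 1)
delta = 0.40

# Feasibility is monotone decreasing in a (the base lies in (0,1)), so the
# largest admissible right-regular degree can be found by direct search.
best = max(a for a in range(2, 40) if feasible(a, lam, delta))
print(best)  # -> 6 for these illustrative choices
```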
30.3 LT Codes
The random construction idea of bipartite graphs for erasure coding of the previous
section was contained in the original contribution of [321], which also mentioned the idea of
fountain codes. The first incarnation of fountain codes was the work of Luby [1292] called
LT codes (for Luby Transform) there. It seems a natural consequence of the work in [321],
and they are the precursors of the Raptor codes which feature in so many commercial
standards for big downloads. The original paper [1292] and the articles [1688, 1691] are
excellent references for the definition and analysis of these codes and were used for this presentation.
Indeed much of what is known about the construction and performance of fountain codes
derives from the work of Luby and Shokrollahi and their colleagues.
Consider k packets of fixed size (since, as noted previously, the only operation used is
that of XOR and the extension from bit operations to packet operations is trivial). The
notation used will be that of information packets and coded or check packets. The idea
of fountain codes is to create check packets by choosing at random in a carefully chosen
manner a subset of information packets and XOR’ing them together to form a check packet.
How this is done is discussed below. The receiver will collect from the channel a sufficient
number of check packets before starting the decoding algorithm. It will be assumed here
the receiver collects k(1 + ε) packets; ε is referred to as the overhead of the code, the
number (fraction) of coded packets beyond k, the minimum required to retrieve the original
k information packets with high probability.
With the k(1 + ε) collected coded packets, the receiver forms a bipartite graph with
the left nodes the unknown information packets (which are to be solved for) and the right
nodes the (received) coded packets, with edges between them designating which information
packets are included in the coded packet, information contained in a coded packet header.
Decoding will be unique only if the associated k × k(1 + ε) matrix is of full rank over F2 .
Algorithm 30.3.1
Initially the code graph has k unknown information (left) nodes corresponding to the in-
formation packets and k(1 + ε) right nodes corresponding to received coded packets. At
each stage of decoding, a coded node of degree one is sought. Suppose coded node u has
degree 1 and its (unique) information node neighbor is v. The packet associated with u is
then associated with attached information node v and XOR’ed to each of v’s other coded
neighbors. All edges involved with this process emanating from nodes u and v are removed
and the process is repeated. It is said the decoded information node v is recovered or
resolved. If at any stage before all information nodes are resolved there is no coded packet
of degree 1, decoding fails.
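Algorithm 30.3.1 can be sketched directly, with neighbor sets standing in for the header of each received coded packet (the function name and the four-packet example are illustrative, not the code of Figure 30.4):

```python
def lt_decode(k, coded):
    """Peeling decoder of Algorithm 30.3.1 (sketch). `coded` is a list of
    (neighbor_set, value) pairs: value is the XOR of the listed information
    packets (single bits here). Returns the k information bits, or None."""
    info = [None] * k
    coded = [[set(nbrs), val] for nbrs, val in coded]
    while any(b is None for b in info):
        node = next((c for c in coded if len(c[0]) == 1), None)
        if node is None:
            return None                  # no degree-one coded node: failure
        (v,) = node[0]
        info[v] = node[1]                # information node v is resolved
        for c in coded:                  # XOR its value into the other coded
            if v in c[0]:                # neighbors and remove the edges
                c[1] ^= info[v]
                c[0].remove(v)
    return info

# Three information bits x = (1, 0, 1) and four received coded packets:
received = [({0}, 1), ({0, 1}, 1), ({1, 2}, 1), ({0, 1, 2}, 0)]
print(lt_decode(3, received))  # -> [1, 0, 1]
```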
Example 30.3.2 The LT code of interest is in Figure 30.4 (a). At each stage of decoding
the chosen coded symbol of degree one is designated by +. At the first stage (a) the coded
symbols y2 and y4 have degree 1; choose symbol y4 and delete node x1 and all its
edges to give graph (b). Coded symbols y1 , y2 and y6 have degree 1; choose y1 and
repeat the process to give graph (c). Coded symbols y2 , y3 , y6 and y7 have degree 1; choose
y2 to give graph (d). Finally choose node y6 and then y5 to resolve all input symbols.
An LT code is formed in the following manner. A degree distribution $\Omega(x) =
\sum_{i \ge 1} \Omega_i x^i$ is sought which with high probability ensures a degree one coded packet will
be available at each stage to the completion of the process. To form a coded packet the
distribution is sampled for a value. If i is the result (with probability $\Omega_i$), i information
packets are chosen uniformly at random from the k information packets and XOR'ed to
form a coded packet. An LT code with k information symbols and distribution Ω(x) will be
denoted a (k, Ω(x)) LT code.
Given a (k, Ω(x)) LT code, the output ripple at step i of the decoding process is
defined as the set of coded symbols at that step of reduced degree 1, and we say that a
coded symbol is released at step i + 1 if its degree is larger than 1 at step i and 1 at step
i + 1. At least one coded symbol is released at each step of the algorithm. One is able to
set up differential discrete time equations to model the decoding process, and the analysis
[1688, 1691] shows that if the expected number of coded symbols in the output ripple is to
be 1 at step k of the algorithm, the final step, the distribution should satisfy the equation
$$(1 - x)\,\Omega''(x) = 1 \quad \text{for } 0 < x < 1,$$
which, with the constraint Ω(1) = 1 (the sum of all probabilities is to add to 1), has the
solution
$$\Omega(x) = \sum_{i \ge 2} \frac{x^i}{i(i-1)}.$$
This distribution is unsuitable as it has no nodes of degree 1. The modified result suggested
in [1292] is the soliton distribution given by
$$\Omega(x) = \frac{x}{k} + \frac{x^2}{1 \cdot 2} + \cdots + \frac{x^k}{(k-1) \cdot k}. \tag{30.3}$$
That this distribution sums to unity is easily established by induction on k. If r(L) denotes
the probability a coded symbol is released when L information symbols are unresolved (as
yet unknown), this soliton distribution has the interesting property ([1292, Proposition 10])
that r(L) = 1/k for all L = 1, 2, . . . , k, i.e., a uniform release probability.
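The soliton distribution (30.3) is easy to tabulate exactly; the sketch below (k = 100 is an illustrative choice) confirms that it sums to one and that its mean is roughly H(k) = O(ln k), matching the later remark on the average degree:

```python
from fractions import Fraction

def soliton(k):
    """Ideal soliton probabilities of (30.3): Omega_1 = 1/k and
    Omega_i = 1/((i-1)i) for i = 2, ..., k, tabulated exactly."""
    p = {1: Fraction(1, k)}
    for i in range(2, k + 1):
        p[i] = Fraction(1, (i - 1) * i)
    return p

p = soliton(100)
print(sum(p.values()))                          # -> 1 (the sum telescopes)
print(float(sum(i * q for i, q in p.items())))  # mean degree, roughly ln k
```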
Remark 30.3.3 The soliton distribution does not work well in practice as the variance of
the ripple size for finite k will be too large resulting in too high a probability the output
ripple will vanish. To remedy this, the following distribution is proposed.
Remark 30.3.5 The intuition of this distribution is that the probability the random walk
represented by the output ripple size deviates from its mean by more than $\ln(k/\delta)\sqrt{k}$ is at
most δ, resulting in a higher probability of complete decoding.
Theorem 30.3.6 ([1292, Theorems 12 and 13]) With the above notation the average
degree of a coded symbol is D = O(ln(k/δ)) and the number of coded symbols required to
achieve a decoding failure probability of at most δ is $K = k + O\bigl(\sqrt{k}\,\ln^2(k/\delta)\bigr)$.
Remark 30.3.7 For obvious reasons LT codes and their relatives are often referred to
as fountain codes and sometimes rateless codes since there is no notion of code rate
involved. While the parameter of packet loss rate on the internet is an important parameter
in the overall performance of the coded system, it does not figure in the code design problem.
Definition 30.3.8 ([1687]) A reliable decoding algorithm for a fountain code is one
which can recover the k input packets from n coded packets and errs (fails to complete
decoding) with a probability that is at most inversely polynomial in k (of the form $1/k^c$).
Proof: An informal overview of the proof is given. Consider the LT code (k, Ω(x)) and
consider the probability an information node u is not attached to any coded node, in which
case it is impossible for decoding to be successful. Consider a coded node v of degree d. The
probability that u is not a neighbor of v is
$$\binom{k-1}{d} \Big/ \binom{k}{d} = 1 - d/k.$$
If a = Ω0 (1) is the average degree of a coded node, then averaging this expression over d
gives the probability u is not a neighbor of v as 1 − a/k. Similarly the probability u is not a
neighbor of any of the n = k(1 + ε) output nodes is $(1 - a/k)^n$. For large k this expression
is approximated as
$$(1 - a/k)^n = \bigl((1 - a/k)^k\bigr)^{n/k} \approx \bigl(e^{-a}\bigr)^{n/k} = e^{-\alpha},$$
where α = an/k is the average degree of an information node. If the probability of failure
of the algorithm is assumed to be $1/k^c$ and the fact that an information symbol not being
attached to any coded node is only one way the decoding could fail is used, then
as required.
Remark 30.3.10 For very large downloads it will be important to have a linear (in the
number of information packets, k) decoding complexity and the above discussion shows that
this is not achievable for LT codes since the average node degree of the soliton distribution
is O(ln(k)). The modification of LT codes to achieve this linear complexity resulted in the
Raptor codes introduced in the next section. The important modification was to obtain a
distribution with a constant average degree that allowed recovery of almost all information
packets (not all), with an acceptable overhead, and to introduce a precoding stage to recover
all the information packets. The process is described in the next section.
be the generating function. The argument to derive these functions is combinatorial. The
reader is referred to the proof in [1688, Theorem 1], due to Karp et al. [1084], for details. A
recursion is given to compute $P_{u-1}(x, y)$ knowing $P_u(x, y)$ and $p_u$. For $u = k + 1, k, \ldots, 1$,
the recursion is given by
$$P_{u-1}(x, y) = \frac{P_u\bigl(x(1 - p_u) + y p_u,\; \tfrac{1}{u} + y(1 - \tfrac{1}{u})\bigr) - P_u\bigl(x(1 - p_u),\; \tfrac{1}{u}\bigr)}{y},$$
where a complicated expression for the quantity pu is given; as noted, it gives the probability
a random cloud element joins the ripple after the current transition. The initial conditions
include
$$P_{c,r,k} = \begin{cases} \binom{n}{r} \Omega_1^r (1 - \Omega_1)^c & \text{if } c + r = n, \\ 0 & \text{otherwise,} \end{cases}$$
$$P_k(x, y) = \sum_{r \ge 1,\, c} P_{c,r,k}\, x^c y^{r-1} = \frac{\bigl(x(1 - \Omega_1) + y\Omega_1\bigr)^n - \bigl(x(1 - \Omega_1)\bigr)^n}{y},$$
$$P_{k+1}(x, y) \overset{\text{def}}{=} x^n \qquad\text{and}\qquad p_{k+1} \overset{\text{def}}{=} \Omega_1.$$
These definitions allow Pk (x, y) to be computed from Pk+1 (x, y) and, subsequently, all
Pu (x, y), u = k − 1, . . . , 1.
These expressions are important, beyond their intrinsic interest, in that they allow the
probability of error (decoding failure for LT codes) to be computed (see, e.g., [1311]) in that
$$P_{\mathrm{err}}(u) = \sum_{c \ge 0} P_{c,0,u} = 1 - \sum_{c \ge 0,\, r \ge 1} P_{c,r,u} = 1 - P_u(1, 1).$$
Let $R_u(x, y) = \partial P_u/\partial y$. Then $R_u(1, 1) = R(u)$ is the expected ripple size, given the
decoding process has continued up to u undecoded information symbols. It can be shown
[1311, 1688] that
$$R(u) = u\,\Omega'(1 - u/k) + u \ln(u/k) + O(1).$$
An expression for the expected size of the cloud is also given [1311, Theorem 2], where it is
also shown that the standard deviation of the ripple size is $O(\sqrt{k})$.
Remark 30.3.11 An important consequence of the above analysis is the behavior of the
probability of decoding failure as the decoding process evolves. It is shown [1688, Figure 9]
that decoding failure typically occurs towards the end of the process when there are rela-
tively few information symbols unresolved, an observation that matches behavior observed
in simulations. This behavior has important implications for the formulation of the Raptor
codes of the next section.
Remark 30.3.12 It is shown in [1292, Theorem 12], and commented on in [1687], that in
order to have a reliable decoder for LT codes, it is necessary for the decoder to gather at
least $k + O\bigl(\sqrt{k}\,\ln^2(k/\delta)\bigr)$ coded symbols for a probability of error of less than δ, and hence
an overhead of $O\bigl(\ln^2(k/\delta)/\sqrt{k}\bigr)$. Many other aspects of the analysis of the performance of
LT codes and their decoding algorithm are given in [1294, 1295, 1298, 1299, 1327, 1475,
1684, 1686, 1688, 1691].
Remark 30.3.13 It should be noted that the decoding algorithm for the LT codes is an
instance of a belief propagation (BP) or message-passing decoding algorithm. It is
clear that it is not maximum likelihood (ML). An algorithm that is ML is Gaussian
Elimination (GE) where the decoder, on gathering N coded symbols from the fountain,
forms the N × k matrix G of coded symbols and information symbols and solves the matrix
equation
$$G x^T = r^T$$
using GE on the matrix G, where x is the k-tuple of information symbols and r the N -tuple
of received coded symbols. The complexity of such an algorithm is $O(k^3)$ and is far too
complex for the application to large downloads of interest here.
That BP is not ML is clear since one can imagine a situation where the matrix G is
of full rank (and hence the above equation has a unique solution), yet the arrangement of
checks is such that the decoding algorithm fails to complete.
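The GE decoder just described can be sketched in a few lines. The following toy illustration (variable names and the small example are my own, not from the text) solves Gx^T = r^T over F2 and reports failure when G is rank deficient:

```python
def gf2_solve(G, r, k):
    """Solve G x^T = r^T over GF(2); return x, or None if rank(G) < k."""
    rows = [row[:] + [bit] for row, bit in zip(G, r)]   # augment with r
    rank = 0
    for col in range(k):
        piv = next((i for i in range(rank, len(rows)) if rows[i][col]), None)
        if piv is None:
            return None                 # rank deficient: no unique solution
        rows[rank], rows[piv] = rows[piv], rows[rank]
        for i in range(len(rows)):
            if i != rank and rows[i][col]:
                rows[i] = [a ^ b for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return [rows[i][k] for i in range(k)]

# k = 4 information bits, N = 6 received coded symbols; the rows of G are the
# neighborhood indicator vectors and r holds the received XORs.
x = [1, 0, 1, 1]
G = [[1, 0, 0, 0], [1, 1, 0, 0], [0, 1, 1, 0],
     [0, 0, 1, 1], [1, 0, 1, 0], [0, 1, 0, 1]]
r = [sum(g * xi for g, xi in zip(row, x)) % 2 for row in G]
decoded = gf2_solve(G, r, 4)            # recovers x since G has full rank
```

Since G above has rank k, the solution is unique and equals the original information tuple, illustrating why a full-rank G guarantees ML success even when BP stalls.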
Coding for Erasures and Fountain Codes 727
Definition 30.4.1 For the construction of a Raptor code (k, Cn , Ω(x)) imagine three
vertical sequences of nodes. The first is the set of k information nodes. The second, referred to as the intermediate nodes, is the set of k information nodes with n − k parity nodes added above, generated according to the simple erasure code Cn. The third is a set of n(1 + ε)
coded nodes generated according to the (n, Ω(x)) LT code acting on the set of n intermediate
nodes. The process is illustrated in Figure 30.5.
Remark 30.4.2 There is an interesting trivial case for Raptor codes, that of the precode
only (PCO) codes [1687] where the distribution Ω(x) = x is used; i.e., for each coded
symbol, a random intermediate symbol is chosen and no XOR operations are used. Such a
code is equivalent to converting a block code (Cn ) into a fountain code. The performance of
such codes is actually quite good. The reader is referred to [1687] for more information.
The distribution that will be used for the LT code is a modified and truncated version
of the soliton distribution (see [1687]):
Ω_D(x) = (1/(1 + μ)) ( μx + Σ_{i=2}^{D} x^i/((i − 1)i) + x^{D+1}/D ),

where D = ⌈4(1 + ε)/ε⌉ for a positive real number ε and μ = (ε/2) + (ε/2)². That this distribution
sums to unity is easy to establish using the soliton distribution. Notice that the average
degree of a coded node is O(1) for D a constant. Recall the average degree of a coded node
for the soliton distribution was O(ln(k/δ)).
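Both claims are easy to check numerically. The short sketch below (illustrative code of my own) tabulates Ω_D and computes its total mass and mean degree:

```python
import math

def omega_D(eps):
    """Return {degree: probability} for the truncated soliton distribution."""
    D = math.ceil(4 * (1 + eps) / eps)
    mu = eps / 2 + (eps / 2) ** 2
    probs = {1: mu / (1 + mu)}                 # the mu*x term
    for i in range(2, D + 1):
        probs[i] = 1 / ((i - 1) * i * (1 + mu))
    probs[D + 1] = 1 / (D * (1 + mu))          # the truncation term x^(D+1)/D
    return probs

p = omega_D(0.1)
total = sum(p.values())                        # sums to one
avg_degree = sum(d * q for d, q in p.items())  # depends on eps only, not on k
```

Because the telescoping sum Σ 1/((i−1)i) equals 1 − 1/D, the total mass is exactly one, and the mean degree is a constant for fixed ε, in contrast to the O(ln(k/δ)) mean of the soliton distribution.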
The following result is crucial to the development of Raptor codes.
Lemma 30.4.3 ([1687, Lemma 4]) There exists a positive real number c (depending on ε) such that, with an error probability of at most e^{−cn}, any set of (1 + ε/2)n + 1 output symbols of the LT code (n, Ω_D(x)) is sufficient to recover at least (1 − δ)n of the intermediate symbols via BP decoding, where δ = (ε/4)/(1 + ε).
Remark 30.4.4 Suppose the LT code (n, Ω(x)) is used. The edge distribution of the check nodes of the LT code is then, as shown in the previous section, ω(x) = Ω′(x)/Ω′(1). Let a = Ω′(1) be the average degree of a coded node. The probability that an intermediate node is the neighbor of exactly ℓ of the N = n(1 + ε/2) + 1 coded nodes is given by the binomial distribution C(N, ℓ)(a/n)^ℓ(1 − a/n)^{N−ℓ}. The intermediate node degree distribution is then

ι(x) = (1 − a(1 − x)/n)^{(1+ε/2)n} ≈ e^{a(x−1)} = Σ_{j=0}^{∞} e^{−a}(a^j/j!) x^j,

where the last approximation is the standard Poisson approximation to the binomial distribution. Notice that asymptotically this is also the edge degree distribution of the intermediate nodes, since for the Poisson distribution ι(x) = ι′(x)/ι′(1).
Remark 30.4.5 As in Section 30.2, part of the proof of Lemma 30.4.3 includes the observation that if, for any constant δ, ι(1 − ω(1 − x)) < x for x ∈ [δ, 1], then the probability the decoder cannot recover at least (1 − δ)n or more of the input (intermediate) nodes is less than e^{−cn}. It is straightforward to show for the distributions noted [1687, proof of Lemma 4] that

ι(1 − ω(1 − x)) < e^{−(1+ε/2)Ω′(1−x)},

and it is shown that this is indeed less than x on the interval [δ, 1], a crucial part of the proof of Lemma 30.4.3.
Remark 30.4.6 The requirements of the code Cn of the Raptor code are quite minimal
and not specified strictly. It is assumed of such codes [1687] that, for a fixed real number ε and for every length n, there is a linear code Cn with the properties:
(i) The rate of Cn is R = (1 + ε/2)/(1 + ε).
(ii) The BP decoder can decode Cn on a BEC with erasure probability δ = (ε/4)/(1 + ε) = (1 − R)/2 with O(n log(1/ε)) arithmetic operations.
Since δ is only half the erasure probability 1 − R that a capacity-achieving code of rate R could tolerate on the BEC, the conditions on this code are not severe.
Examples of such codes are Tornado codes of Section 30.2.
Theorem 30.4.7 ([1687, Theorem 5]) Let ε be a positive real number, k an integer, D = ⌈4(1 + ε)/ε⌉, R = (1 + ε/2)/(1 + ε), n = ⌈k/(1 − R)⌉, and let Cn be a code with the above properties. Then the Raptor code with parameters (k, Cn, Ω_D(x)) has space consumption 1/R, overhead ε, and a cost of O(log(1/ε)) with respect to BP decoding of both the precode and the LT code.
Remark 30.4.8 It is noted that a similar construction to Raptor codes was discovered
independently by Maymounkov [1358].
Inherent in Theorem 30.4.7 is the notion of a reliable decoder and hence the probability
of error of the Raptor code is inversely polynomial in k. However, if the Cn used has an
exponentially small error probability in k, the Raptor code that uses it will also.
It is also noted that the codes of Theorem 30.4.7 have a constant overhead while the
LT codes have a vanishingly small overhead with k. This is a consequence of our insistence
on a linear decoding complexity. As noted earlier, the Raptor codes achieve linear decoding
complexity by requiring only a fraction of information symbols to be decoded, relying on
the linear code Cn to decode the remainder.
Definition 30.4.9 The input ripple at a given stage of decoding is defined as the set of
all input symbols that are released at that stage.
The expected size of the input ripple is of interest as the decoding algorithm develops. A
slight change of notation from Section 30.2 is adopted in that the BP messages sent on the
graph edges (from/to intermediate and coded nodes) are 0 and 1, a 0 from a coded symbol
to an intermediate neighbor if and only if the intermediate symbol cannot be recovered at
that point and from an intermediate symbol to a neighbor coded symbol if its value is not
yet determined.
An overview of the analysis is given [1687, Section VIIA]. It makes use of some arguments
of Tornado codes. Let
p_{i+1} = ω(1 − ι(1 − p_i))
denote the probability an edge of the graph carries a 1 value from a coded node to an
attached intermediate node, where p0 = δ, the design erasure probability of the code Cn . As
in Section 30.2, this assumes independent messages (a tree to the depth of interest) where
ω(x) and ι(x) are the edge distributions of the coded and intermediate edges, respectively.
The reference [1294] approaches this problem from a general point of view, from which the
above result follows easily.
Let ui be the probability that an intermediate symbol is recovered at round i (which is
the probability the corresponding edge from a coded symbol to that intermediate symbol
carried a 1). If the intermediate symbol node had a degree of d, the probability it is recovered
at round i, given the degree d, is 1 minus the probability all zeros were transmitted to it,
or 1 − (1 − p_i)^d; averaging over the distribution ι(x), this probability is u_i = 1 − ι(1 − p_i) = 1 − e^{−αp_i}. It follows that p_i = −ln(1 − u_i)/α, where α is the average degree of an intermediate symbol. Thus the fraction of symbols recovered by round i + 1 is

u_{i+1} = 1 − e^{−αω(u_i)}.

Requiring that this recursion make progress then yields, with a finite-length correction term, the design condition

Ω′(x) ≥ −ln( 1 − x − c√((1 − x)/k) ) / (1 + ε).     (30.4)
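The recursion above is easy to iterate numerically. The sketch below is my own illustration; in particular it takes the mean intermediate-node degree to be α = a(1 + ε/2), the Poisson parameter suggested by Remark 30.4.4, which is an assumption of this sketch rather than a statement from the text:

```python
import math

def ripple_evolution(eps, steps=400):
    """Iterate u_{i+1} = 1 - exp(-alpha * omega(u_i)) for Omega_D."""
    D = math.ceil(4 * (1 + eps) / eps)
    mu = eps / 2 + (eps / 2) ** 2
    def d_omega(x):                       # Omega_D'(x)
        s = mu + sum(x ** (i - 1) / (i - 1) for i in range(2, D + 1))
        s += (D + 1) * x ** D / D
        return s / (1 + mu)
    a = d_omega(1.0)                      # mean coded-node degree Omega'(1)
    alpha = a * (1 + eps / 2)             # assumed mean intermediate degree
    u = 0.0
    for _ in range(steps):
        u = 1 - math.exp(-alpha * d_omega(u) / a)   # omega(u) = Omega'(u)/a
    return u

u_final = ripple_evolution(0.1)           # converges near 1 for eps = 0.1
```

For ε = 0.1 the iteration climbs monotonically and settles close to 1, matching the lemma's guarantee that all but a δ-fraction of the intermediate symbols are recovered.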
Example 30.4.10 As with the Tornado codes of Section 30.2 a linear programming tech-
nique is suggested to design distributions that satisfy (30.4). Impressive results on Raptor
codes are given in [1687, Section VII A, Table 1]. One of these examples, for k = 100,000, has an overhead of 0.028 and an average coded node degree of 5.85. The δ value is 0.01, and it is noted that for small values of d the distribution is approximately soliton, i.e., Ω_d ≈ 1/(d(d − 1)).
Algorithm 30.4.11
It is assumed that each node is associated with a register which can be updated with information passed between nodes as the algorithm runs. The BP algorithm runs on the code bipartite graph until there is no coded node of degree 1. If at this stage there is a coded node u of degree 2, attached to information nodes v and w, and the register r_u associated with u contains the symbol b_u, then add a variable x_1 to the register r_v of v and x_1 ⊕ b_u to r_w. These operations are recorded in the registers in some suitable manner. The variable x_1 is referred to as an inactivation variable. Continue the algorithm by removing all edges incident with the visited nodes u, v, and w. Each time in the decoding process there is no degree-1 coded node available, a sufficient number of inactivation variables are introduced, the information node registers and attached coded node registers are updated, and the process continues. At the end of the process, some of the left nodes will contain inactivation variables in their 'solutions'. If ℓ inactivation variables have been introduced, a matrix equation can be established relating the left nodes containing inactivation variables and the originally received coded symbols. Solving this set of equations is achieved by using GE on an ℓ × ℓ matrix. The information symbols are then resolved. The complexity of this stage is O(ℓ³) and typically ℓ is quite small. The result is an ML algorithm with a complexity that is considerably reduced compared to ML on the originally received coded symbols.
The above algorithm may be viewed as an equivalent matrix reduction on the n×k matrix
G that relates the coded and information symbols. This matrix can first be subjected to
row and column permutations (no other arithmetic operations) to place as large an identity
matrix as possible in the upper left corner of the matrix. Suppose the process results in a
τ × τ identity matrix. The (n − τ ) × τ matrix under the identity Iτ matrix, say A, can be
zeroed using row and XOR operations. The (n − τ ) × (k − τ ) matrix to the right of A, say B,
which, due to the previous operations has higher density than the original relatively sparse
matrix, can then be solved by GE. This part relates to the inactivation variables in the
previous description. The matrix above B can then be used to give all information symbols.
It is clear that if the original matrix is of full rank, this algorithm yields the solution and is
ML.
There are numerous studies on inactivation decoding including [215, 216, 1488, 1690].
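The register mechanism described above can be imitated in a few dozen lines. The following is a toy reconstruction of my own (the data layout, names, and the policy of inactivating an arbitrary unresolved symbol are illustrative choices; production decoders such as RaptorQ are far more careful). An "expression" is an integer bitmask: bit 0 is the constant term, bit j the coefficient of inactivation variable x_j.

```python
def inactivation_decode(equations, k):
    """equations: list of (set_of_unknowns, rhs_bit); returns k decoded bits."""
    eqs = [[set(s), b] for s, b in equations]
    expr = {}                                  # unknown -> expression mask
    n_x = 0                                    # number of inactivation variables
    while len(expr) < k:
        for eq in eqs:                         # substitute resolved unknowns
            for v in [u for u in eq[0] if u in expr]:
                eq[0].discard(v)
                eq[1] ^= expr[v]
        deg1 = next((eq for eq in eqs if len(eq[0]) == 1), None)
        if deg1 is not None:                   # ordinary peeling step
            v = deg1[0].pop()
            expr[v] = deg1[1]
            eqs = [eq for eq in eqs if eq is not deg1]
        else:                                  # stuck: inactivate an unknown
            v = next(u for u in range(k) if u not in expr)
            n_x += 1
            expr[v] = 1 << n_x                 # the fresh variable x_{n_x}
    constraints = []                           # leftover equations pin the x_j
    for s, e in eqs:
        for v in s:
            e ^= expr[v]
        if e:
            constraints.append(e)
    pivots = {}                                # GE on the small l x l system
    for c in constraints:
        for j in range(n_x, 0, -1):
            if (c >> j) & 1:
                if j in pivots:
                    c ^= pivots[j]
                else:
                    pivots[j] = c
                    break
    xval = [0] * (n_x + 1)                     # back-substitute (free vars -> 0)
    for j in sorted(pivots):
        c = pivots[j]
        xval[j] = c & 1
        for i in range(1, j):
            if (c >> i) & 1:
                xval[j] ^= xval[i]
    def value(e):                              # evaluate an expression mask
        v = e & 1
        for j in range(1, n_x + 1):
            v ^= ((e >> j) & 1) & xval[j]
        return v
    return [value(expr[u]) for u in range(k)]

# Four unknowns, no degree-1 equation: the decoder must inactivate once.
bits = inactivation_decode(
    [({0, 1}, 0), ({1, 2}, 1), ({2, 3}, 1), ({0, 1, 2}, 0)], k=4)
```

In this small run a single inactivation suffices: peeling resumes immediately after unknown 0 is inactivated, and the one leftover equation pins down x_1 by GE, exactly the ℓ × ℓ clean-up step described above.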
Algorithm 30.4.12
Step 1: Sample the distribution Ω(x) N = k(1 + ε) times and produce an N × n binary matrix S with rows v_1, v_2, . . . , v_N, v_i ∈ F_2^n.
Step 2: Assuming the N × k binary matrix SG^T is of full rank over F2, find indices i_1, i_2, . . . , i_k such that the corresponding rows of SG^T form a nonsingular matrix R with inverse R^{−1}. Thus R = AG^T where A, a submatrix of S, is formed by rows v_{i_1}, . . . , v_{i_k}.
Step 3: Compute the k processed symbols y = y_1 · · · y_k by y^T = R^{−1}x^T.
It is clear that in Algorithm 30.4.12 the coded symbols satisfy z_{i_1} = x_1, z_{i_2} = x_2, . . . , z_{i_k} = x_k, since R = AG^T and Ry^T = RR^{−1}x^T = x^T. Thus among the LT encoded symbols are the information symbols.
Other Comments
The term online codes has been used [1358] synonymously with the term fountain
codes. More recently [368] the term has been introduced to mean a fountain code that
adapts its code distribution in response to learning of the ‘state’ of the decoder. Thus as
the decoding proceeds, such a decoder can offer superior performance. Of course this decoder
would require the characterization of the state and the transmission of this information back
to the encoder, in contrast to the multicast situation. Using the dynamics of random graph
processes, it is shown in [368] how such a strategy can result in a considerable lowering of
the overhead required for a given performance compared to the usual fountain codes.
The performance of fountain and in particular Raptor codes over the past decade has
been spectacular, essentially achieving capacity on a BEC with linear encoding and decoding
complexity. An essential feature of such codes is that it is unnecessary for the encoder to
learn the characteristics of the channel to achieve such performance. It is natural to consider
the performance of such codes on other channels such as the BSC and AWGN using an
appropriate form of BP decoding. The issue is addressed in [688] where in particular the
fraction of nodes of degree 2 is shown to be of importance for the use of Raptor codes
in error-correction on the BSC. Other studies are available in the literature but are not
commented on here.
Numerous other works on interesting questions related to these erasure-correcting codes
include [33, 42, 215, 216, 1293, 1389, 1488, 1589, 1684, 1685, 1686, 1690, 1692, 1744].
Chapter 31
Codes for Distributed Storage
Vinayak Ramkumar
Indian Institute of Science
Myna Vajha
Indian Institute of Science
S. B. Balaji
Qualcomm
M. Nikhil Krishnan
University of Toronto
Birenjith Sasidharan
Govt. Engineering College, Barton Hill
P. Vijay Kumar
Indian Institute of Science
The traditional means of ensuring reliability in data storage is to store multiple copies of
the same file in different storage units. Such a replication strategy is clearly inefficient in
terms of storage overhead. By storage overhead we mean the ratio of the amount of data
stored pertaining to a file to the size of the file itself. Modern-day data centers can store
several Exabytes of data, and there are enormous costs associated with such vast amounts
of storage not only in terms of hardware, software and maintenance, but also in terms of the
power and water consumed. For this reason, reduction in storage overhead is of paramount
importance. An efficient approach to reduce storage overhead, without compromising on
reliability, is to employ erasure codes instead of replication. When an [n, k] erasure code is
employed to store a file, the file is first broken into k fragments. To this, n − k redundant
fragments are then added, and the resultant n fragments are stored across n storage units.
Thus the storage overhead incurred is given by n/k. In terms of minimizing storage overhead, a
class of codes known as maximum distance separable (MDS) codes, of which Reed–Solomon
(RS) codes are the principal example, are the most efficient. MDS codes have the property
that the entire file can be obtained by connecting to any k storage units, which means that
an MDS code can tolerate the failure of n − k storage units without suffering data loss. RS codes are widely used in current-day distributed storage systems. Examples include the
[9, 6] RS code in the Hadoop Distributed File System with Erasure Coding (HDFS-EC), the
[14, 10] RS code employed in Facebook’s f4 BLOB storage, and the [12, 8] RS code employed
in Baidu’s Atlas cloud storage [489].
A second important consideration governing choice of the erasure code is the ability of
the code to efficiently handle the failure of an individual storage unit, as such failures are a
common occurrence [1568, 1624]. From here on, we will interchangeably refer to a storage
unit as a node, as we will, at times, employ a graphical representation of the code. We
use the term node failure to encompass not only the actual physical failure of a storage
unit, but also instances when a storage unit is down for maintenance or else is unavailable
on account of competing serving requests. Thus, the challenge is to design codes which
are efficient not only with respect to storage overhead, but which also have the ability to
efficiently handle the repair of a failed node. The nodes from which data is downloaded
for the repair of a failed node are termed helper nodes, and the number of helper nodes
contacted is termed the repair degree. The amount of data downloaded to a replacement
node from the helper nodes during node repair is called the repair bandwidth.
If the employed erasure code is an [n, k] MDS code, then the conventional approach
towards repair of a failed node is as follows. A set of k helper nodes would be contacted
and the entire data from these k nodes would then be downloaded. This would then permit
reconstruction of the entire data file and hence, in particular, enable repair of the failed
node. Thus under this approach, the repair bandwidth is k times the amount of data stored
in the replacement node, which is clearly wasteful of network bandwidth resources and can
end up clogging the storage network.
This is illustrated in Figure 31.1 in the case of Facebook's [14, 10] RS code. If each node stores 100 MB of data, then the total amount of data downloaded for the repair of a node storing 100 MB from the 10 helper nodes equals 10 × 100 MB = 1 GB.

FIGURE 31.1: Conventional repair of a failed node of the [14, 10] RS code: each of the 10 helper nodes contributes its entire 100 MB contents to repair a node storing 100 MB.

Thus in general, in the case of an [n, k] MDS code, the repair degree equals k and the repair bandwidth is the entire file size,
which is k times the amount of data stored in a node. As will be seen, under conventional
node repair, MDS codes are inefficient both in terms of repair degree and repair bandwidth.
Coding theorists responded to the challenge of node repair by coming up with two classes
of codes, known as ReGenerating Codes (RGCs) [541] and Locally Recoverable Codes (LRCs)
[830]. The goal of an RGC is to reduce the repair bandwidth, whereas with an LRC, one
aims to minimize the repair degree. Recently much progress has also been made in coming
up with more efficient approaches for the repair of RS codes [882, 1660]. This chapter will
primarily focus on RGCs and LRCs. We will also briefly discuss novel methods of RS repair,
as well as a class of codes known as Locally ReGenerating Codes (LRGCs) that combine the
desirable properties of RGCs and LRCs.
A brief overview of the topic of codes for distributed storage, from the perspective of the authors, is presented here. For further details, the readers are referred to surveys
such as can be found in [96, 488, 542, 1241, 1283]. Distributed systems related to algebraic
geometry codes are described in Section 15.9. We would like to thank the editors of Advanced
Computing and Communications for permitting the reuse of some material from [1165].
We begin with a brief primer on RS codes.
tion. Of these, the first k symbols are message symbols while the remaining are redundant
symbols.
The RS code derives its MDS property from the fact that the polynomial f can be deter-
mined from the knowledge of any k evaluations simply by solving the following nonsingular
set of k equations in the k unknown coefficients {bi }k−1
i=0 :
f (xi1 ) 1 x i1 ··· xk−1
i1 b0
f (xi2 ) 1 x i2 ··· xk−1 b1
i2
.. = .. .. .. .. .. ,
. . . . . .
f (xik ) 1 xik · · · xk−1
ik
bk−1
a Vandermonde matrix
and therefore invertible
where i1 , . . . , ik are any set of k distinct indices drawn from {0, . . . , n − 1}. From this it
follows that an RS code can recover from the erasure of any n − k code symbols. The
conventional approach of repairing a failed node corresponding to code symbol f (xj ) would
be to use the contents of any k nodes (i.e., any k code symbols) to recover the polynomial
f and then evaluate the polynomial f at xj to recover the value of f (xj ).
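The conventional repair just described amounts to interpolation. Here is a toy sketch over the prime field GF(13), with message coefficients and evaluation points chosen purely for illustration:

```python
P = 13                                  # a small prime field for illustration

def eval_poly(coeffs, x):
    """Evaluate f(x) = sum b_i x^i over GF(P)."""
    return sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P

def lagrange_eval(points, x):
    """Evaluate the unique polynomial of degree < len(points) through
    `points` at x, using Lagrange interpolation over GF(P)."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if j != i:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, P - 2, P)) % P   # Fermat inverse
    return total

b = [5, 2, 7]                           # k = 3 message coefficients
code = [(x, eval_poly(b, x)) for x in range(1, 7)]    # n = 6 storage nodes
failed = code[4]                        # the node storing f(5)
helpers = code[:3]                      # any k = 3 helper nodes suffice
repaired = lagrange_eval(helpers, failed[0])          # re-evaluate f at x = 5
```

Any k of the n evaluations would serve as helpers; this is exactly the MDS property, and also exactly why the conventional repair downloads k whole symbols to rebuild one.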
FIGURE 31.3: Showing how breaking up a single scalar symbol into two smaller symbols
helps improve the repair bandwidth. This breaking up of a symbol is referred to as sub-
packetization. The sub-packetization level equals 2 here.
(ii) Node Repair: If a node fails, then the replacement node can connect to any subset
of d helper nodes, where k ≤ d ≤ n − 1, download β symbols from each of these helper
nodes, and repair the failed node.
With respect to the definition above, there could be two interpretations as to what it
means to repair a failed node. One interpretation of repair is that the failed node is replaced
by a replacement node which, upon repair, stores precisely the same data as did the failed node. This is called Exact Repair (E-R). This is not, however, a requirement, and there is an
alternative, more general definition of node repair. Under this more general interpretation,
the failed node is replaced by a replacement node in such a way that the resultant collection
of n nodes once again satisfies the requirements of a regenerating code. Such a replacement of a failed node is termed Functional Repair (F-R). Clearly, E-R is a special case of F-R. E-R is preferred in practice as it simplifies management of the storage network.

FIGURE 31.4: Data collection and node repair properties of an {(n, k, d), (α, β), B, Fq} regenerating code. A data collector recovers the file by connecting to any k of the n nodes, each of capacity α; a replacement node downloads β symbols from each of d helper nodes.
It is easily verified that the storage overhead of a regenerating code equals nα/B, the
repair degree is d, and the repair bandwidth is dβ. The parameter β is typically much
smaller than α, resulting in savings in repair bandwidth. The parameter α is called the
sub-packetization level of the regenerating code, and having low sub-packetization level
is preferred from a practical perspective; see [1825]. A regenerating code is said to possess
the optimal-access property if no computation is required at the helper nodes during node
repair. If repair of a failed node can be done without any computation at either the helper
nodes or the replacement node, then the regenerating code is said to have the Repair-By-
Transfer (RBT) property.
B ≤ Σ_{i=0}^{k−1} min{α, (d − i)β},     (31.1)

which we will refer to as the Cut-Set Bound. This bound holds for F-R (and hence also
for E-R).
A regenerating code is said to be optimal if the Cut-Set Bound is satisfied with equality
and if, in addition, decreasing either α or β would cause the bound to be violated. There
are many flavors of optimality in the sense that for a fixed (B, k, d), inequality (31.1) can be
met with equality by several different pairs (α, β). The parameter α determines the storage overhead nα/B, whereas β is an indicator of the normalized repair bandwidth dβ/B. The various pairs of (α, β) which satisfy the Cut-Set Bound with equality present a tradeoff between storage
overhead and normalized repair bandwidth. The existence of codes achieving the Cut-Set
Bound for all possible parameters is known again from network coding in the F-R case.
Thus in the F-R case, the storage-repair bandwidth (S-RB) tradeoff is fully characterized.
An example of this tradeoff is presented for the case (B = 5400, k = 6, d = 10). The
discussion above suggests that the tradeoff is a discrete collection of optimal pairs (α, β).
However, in the plot, these discrete pairs are connected by straight lines, giving rise to
the piecewise linear graph in Figure 31.5. The straight line connections have a physical
interpretation and correspond to a space sharing solution to the problem of node repair;
the reader is referred to [1658, 1804] for details.
At the MBR point, the repair bandwidth dβ is first minimized, and then α is minimized by setting α = dβ. Regenerating codes having these parameters are termed Minimum Bandwidth Regenerating (MBR) codes. MBR codes have the minimum
possible repair bandwidth but are not MDS. The storage overhead of an MBR code can be
shown to be bounded below by 2, whereas MSR codes can have storage overhead arbitrarily
close to 1.
Several constructions of E-R MSR [832, 1243, 1569, 1578, 1619, 1621, 1764, 1774, 1824,
1925, 1926] and MBR [1164, 1263, 1569, 1570] codes can be found in the literature. A
selected subset of these constructions is discussed here.
Theorem 31.2.2 For any given (n, k ≥ 3, d), E-R codes do not exist for (α, β, B) corre-
sponding to an interior point on the F-R tradeoff, except possibly for a small region in the
neighborhood of the MSR point with α values in the range
(d − k + 1)β ≤ α ≤ (d − k + 2)β − ((d − k + 1)/(d − k + 2))β.
This theorem does not eliminate the possibility of E-R codes approaching the F-R trade-
off asymptotically, i.e., in the limit as B → ∞. The E-R tradeoff for the (n, k, d) = (4, 3, 3)
case was characterized in [1803] where the impossibility of approaching the F-R tradeoff un-
der E-R was established. The analogous result in the general case was established in [1620].
Examples of interior-point RGC constructions include layered codes [1804], improved layered
codes [1652], determinant codes [680], cascade codes [681], and multi-linear-algebra-based
codes [657].
The file to be stored consists of the 9 symbols {a1 , a2 , . . . , a9 }. We first generate a parity
symbol aP given by aP = a1 + a2 + · · · + a9 (mod 2). Next, set up a complete graph with n = 5 nodes, i.e., form a fully-connected pentagon. The pentagon has (5 choose 2) = 10 edges, and we assign each of the 10 code symbols {ai | 1 ≤ i ≤ 9} ∪ {aP} to a distinct edge; see Figure 31.6. Each node stores all the symbols appearing on edges incident on that node, giving α = 4.
Data Collection: The data collection property requires that the entire data file be recov-
erable by connecting to any k = 3 nodes. It can be easily seen that any collection of 3 nodes
contains 9 distinct symbols from {ai | 1 ≤ i ≤ 9} ∪ {aP }, which is sufficient to recover the
entire file.
FIGURE 31.6: The pentagon construction: the ten code symbols {a1, . . . , a9, aP} placed on the ten edges of the complete graph on five nodes; each node stores the four symbols on its incident edges.
Node Repair: Now suppose one of the nodes failed. The repair is accomplished by down-
loading, from each of the remaining d = 4 nodes, the code symbol it has in common with
the failed node. Since each helper node passes on a single symbol to aid in node repair, we
have β = 1.
This construction can be generalized by replacing the pentagon by a polygon with n
vertices. The parameters of the general construction are
( (n, k, d = n − 1), (α = n − 1, β = 1), B = k(n − 1) − (k choose 2), Fq ),

where q = O(n²). In the initial step of this construction, an [(n choose 2), B] MDS code is used to
generate the code symbols, and these code symbols are then assigned to distinct edges of a
complete graph on n vertices. Each node stores the symbols appearing on incident edges,
resulting in α = n − 1. The data collection and node-repair properties in the case of this
general construction are easily verified.
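The pentagon example can be checked mechanically. The sketch below (bits rather than general field symbols, and all names my own) verifies that any 3 nodes see 9 distinct symbols and that repair costs one symbol per helper:

```python
from itertools import combinations
import random

random.seed(0)
msg = [random.randint(0, 1) for _ in range(9)]          # a1..a9 (toy bits)
symbols = msg + [sum(msg) % 2]                          # append parity aP
edges = list(combinations(range(5), 2))                 # the 10 edges of K5
edge_val = dict(zip(edges, symbols))                    # one symbol per edge

def stored(node):
    """The alpha = 4 symbols held by a node: its incident edges."""
    return {e: edge_val[e] for e in edges if node in e}

# Data collection: any 3 nodes cover all edges except the one joining the
# remaining two nodes, i.e., 9 distinct symbols -- enough, via the parity aP,
# to recover the whole file.
covered = {e for node in (0, 1, 2) for e in stored(node)}

# Node repair: each of the d = 4 helpers sends the single symbol (beta = 1)
# it shares with the failed node.
failed = 4
repair = {e: edge_val[e] for h in range(4) for e in stored(h) if failed in e}
```

The repair dictionary is exactly the failed node's contents, transferred with no computation at either end, which is the Repair-By-Transfer property mentioned earlier.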
d > 2k − 2 through the mechanism of shortening a code [1569]. When d = 2k − 2, we have α = k − 1, d = 2α, and B = α(α + 1). The encoding matrix Ψ is then given by Ψ = [Φ | ΛΦ], where Φ is an n × α matrix and Λ is an n × n diagonal matrix. The entries of Ψ are chosen such that any d rows of Ψ are linearly independent, any α rows of Φ are linearly independent, and the diagonal entries of Λ are distinct. These conditions can be met by picking Ψ to be a Vandermonde matrix that is of the form Ψ = [Φ | ΛΦ] (this is not difficult).
Let S1 and S2 be two distinct α × α symmetric matrices, which together contain all the B = α(α + 1) elements of the data file. The message matrix M is then given by M = [S1^T S2^T]^T, i.e., S1 stacked on S2. Let φ_i^T denote the i-th row of Φ and λ_i the i-th diagonal element of the diagonal matrix Λ. Then the α symbols stored in node i are given by c_i^T = ψ_i^T M = φ_i^T S1 + λ_i φ_i^T S2. We will now show that the data collection and node repair properties hold for this construction.
Data Collection: Let Ψ_DC = [Φ_DC | Λ_DC Φ_DC] be the k × d submatrix of the n × d matrix Ψ corresponding to an arbitrary subset of k nodes drawn from the totality of n nodes. To establish the data collection property it suffices to show that one can recover S1 and S2 from the matrix

Ψ_DC M = Φ_DC S1 + Λ_DC Φ_DC S2,

given Φ_DC and Λ_DC. The first step is to multiply both sides of the equation on the right by the matrix Φ_DC^T to obtain

Ψ_DC M Φ_DC^T = Φ_DC S1 Φ_DC^T + Λ_DC Φ_DC S2 Φ_DC^T.

Set P = Φ_DC S1 Φ_DC^T and Q = Φ_DC S2 Φ_DC^T, so that Ψ_DC M Φ_DC^T = P + Λ_DC Q. It can be seen that P and Q are symmetric matrices. The (i, j)-th element of P + Λ_DC Q is P_ij + λ_i Q_ij, whereas the (j, i)-th element is P_ji + λ_j Q_ji = P_ij + λ_j Q_ij. Since all the {λ_i} are distinct, one can solve for P_ij and Q_ij for all i ≠ j, thus obtaining all the non-diagonal entries of both matrices P and Q. The i-th row of P, excluding the diagonal element, equals φ_i^T S1 [φ_1 · · · φ_{i−1} φ_{i+1} · · · φ_{α+1}], from which the vector φ_i^T S1 can be obtained, since the matrix on the right is invertible. Stacking these rows, one can form [φ_1 · · · φ_α]^T S1; since the matrix multiplying S1 on the left is invertible, we can then recover S1. In a similar manner, the entries of the matrix S2 can be recovered from the non-diagonal entries of Q.
Node Repair: Let f be the index of the failed node and let {h_j | j = 1, . . . , d} denote the arbitrary set of d helper nodes chosen. The helper node h_j computes ψ_{h_j}^T M φ_f and passes it on to the replacement node. Set Ψ_rep = [ψ_{h_1} ψ_{h_2} · · · ψ_{h_d}]^T. Then the d symbols obtained by the replacement node from the helper nodes can be aggregated into the form Ψ_rep M φ_f. From the properties of Ψ, it can be seen that Ψ_rep is invertible. Thus the replacement node can recover M φ_f = [(S1 φ_f)^T (S2 φ_f)^T]^T. Since S1 and S2 are symmetric matrices, φ_f^T S1 and φ_f^T S2 can be obtained by simply taking transposes. Now φ_f^T S1 + λ_f φ_f^T S2 = ψ_f^T M = c_f^T, completing the repair process.
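The data collection and repair arguments can be verified numerically. The sketch below works over the reals for readability (my own simplification; the construction itself lives over Fq), and chooses λ_i = x_i² so that Ψ becomes an n × 4 Vandermonde matrix satisfying all three conditions:

```python
import numpy as np

# Product-matrix parameters: alpha = 2, d = 2*alpha = 4, n = 5 (illustrative).
alpha, n = 2, 5
x = np.arange(1, n + 1, dtype=float)
Phi = np.vander(x, alpha, increasing=True)        # rows [1, x_i]
Lam = np.diag(x ** 2)                             # distinct lambda_i = x_i^2
Psi = np.hstack([Phi, Lam @ Phi])                 # rows [1, x_i, x_i^2, x_i^3]

S1 = np.array([[1.0, 2], [2, 5]])                 # symmetric message blocks
S2 = np.array([[3.0, 1], [1, 4]])
M = np.vstack([S1, S2])                           # 4 x 2 message matrix
C = Psi @ M                                       # row i = contents of node i

f, helpers = 0, [1, 2, 3, 4]                      # node 0 fails, d = 4 helpers
phi_f = Phi[f]
received = np.array([Psi[h] @ M @ phi_f for h in helpers])
Mphi = np.linalg.solve(Psi[helpers], received)    # = [S1 phi_f; S2 phi_f]
c_f = Mphi[:alpha] + Lam[f, f] * Mphi[alpha:]     # uses symmetry of S1, S2
```

Each helper transmits a single scalar (β = 1), so the repair bandwidth is d symbols instead of the B = α(α + 1) = 6 symbols a naive download of the file would cost.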
where q ≥ n, for some t > 1 and r ≥ 1. Each codeword in the Clay code is composed of a total of nα = r × t × r^t symbols over the finite field Fq. We will refer to these finite field symbols as code symbols of the codeword. Using the notation [a : b] = {a, a + 1, . . . , b}, these nα code symbols will be indexed by the three-tuple (x, y; z) with x ∈ [0 : r − 1], y ∈ [0 : t − 1], and z = (z_0, . . . , z_{t−1}) ∈ Z_r^t. Such an indexing allows us to identify each code symbol with an interior or exterior point of an r × t × r^t three-dimensional (3D) cube, and an example is shown in Figure 31.7. For a code symbol A(x, y; z) indexed by (x, y; z), the pair (x, y) indicates the node to which the code symbol belongs, while z is used to uniquely identify the specific code symbol within the set of α = r^t code symbols stored in that node.
Uncoupled code: As an intermediate step in describing the construction of the Clay code A, we introduce a second code B that has the same length n = rt, the same level α = r^t of sub-packetization, and which possesses a simpler description. For reasons that will shortly become clear, we shall refer to the code B as the uncoupled code. The uncoupled code B is simply described in terms of a set of rα parity check (p-c) equations. Let {B(x, y; z) | x ∈ [0 : r − 1], y ∈ [0 : t − 1], z ∈ Z_r^t} be the nα code symbols corresponding to the code B. The rα p-c equations satisfied by the code symbols that make up each codeword in code B are given by

Σ_{x=0}^{r−1} Σ_{y=0}^{t−1} θ_{x,y}^ℓ B(x, y; z) = 0  for all ℓ ∈ [0 : r − 1] and z ∈ Z_r^t,     (31.2)

where the {θ_{x,y}} are all distinct. Such a {θ_{x,y}} assignment can always be carried out with a field of size q ≥ n = rt. The uncoupled code is also an MDS code as it is formed by simply stacking α codewords, each belonging to the same [n, k]_q MDS code.
FIGURE 31.7: Illustrating the data cube associated with a Clay code having parameters (r = 2, t = 2). Hence n = rt = 4 and α = r^t = 4. We associate, with each codeword in this example Clay code, a 2 × 2 × 4 data cube. Each of the 16 points within the data cube is thus associated with a distinct code symbol of the codeword. The data cube is made up of r^t = 4 planes and each plane is represented by a vector z. The vector z associated with a plane in the cube is identified by the location of dots within the plane, which are placed at the coordinates (x, y; z) satisfying x = z_y.
Pairwise Coupling: Next, we introduce a pairing among the nα code symbols (see Figure 31.8) associated with an uncoupled codeword B(x, y; z). The pair of a code symbol B(x, y; z), for the case x ≠ z_y, is given by B(z_y, y; z(x → z_y)), where the notation z(x → z_y) denotes the vector z with the y-th component z_y replaced by x, i.e., z(x → z_y) = (z_0, . . . , z_{y−1}, x, z_{y+1}, . . . , z_{t−1}). The code symbols B(x, y; z) for the case x = z_y will remain unpaired. One can alternately view this subset of code symbols as being paired with themselves; i.e., the pair of B(x, y; z), for the case x = z_y, is B(x, y; z) itself.
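The companion map z(x → z_y) is easy to implement. The following illustrative helper (names my own) also confirms that applying the map twice returns the original symbol, so the pairing is a genuine involution:

```python
from itertools import product

r, t = 2, 3                                    # illustrative Clay parameters

def companion(x, y, z):
    """Return the paired symbol of (x, y; z); a fixed point when x == z_y."""
    if x == z[y]:
        return (x, y, z)                       # unpaired (paired with itself)
    z2 = z[:y] + (x,) + z[y + 1:]              # z with component y set to x
    return (z[y], y, z2)

# Check the involution over every symbol of the r x t x r^t data cube.
pairs_ok = all(
    companion(*companion(x, y, z)) == (x, y, z)
    for x in range(r) for y in range(t)
    for z in product(range(r), repeat=t)
)
```

The fixed points are exactly the dotted positions x = z_y of Figure 31.7; every other symbol is swapped with a symbol in a different plane.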
We now introduce a pairwise transformation, which we will refer to as the coupling transformation, which will lead from a codeword

{B(x, y; z) | x ∈ [0 : r − 1], y ∈ [0 : t − 1], z ∈ Z_r^t}

in the uncoupled code to the corresponding codeword {A(x, y; z)} in the coupled code. For x ≠ z_y the pairwise transformation takes on the form

[ A(x, y; z)            ]   [ 1  γ ]^{−1} [ B(x, y; z)            ]
[ A(z_y, y; z(x → z_y)) ] = [ γ  1 ]      [ B(z_y, y; z(x → z_y)) ],     (31.3)

where γ is selected such that γ² ≠ 1. This causes the 2 × 2 linear transformation above to be invertible. For the case x = z_y, we simply set A(x, y; z) = B(x, y; z).
FIGURE 31.8: Figure illustrating the data cube with Clay code symbols on the left and
the uncoupled code symbols on the right. A, A∗ and B, B ∗ indicate the paired symbols in
the respective cubes.
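To see the coupling transformation (31.3) in action, here is a small numeric sketch over the prime field F13 with the arbitrary choice γ = 2 (so γ² ≠ 1); both the field and γ are our own illustrative choices:

```python
P = 13          # illustrative prime field F_13
gamma = 2       # gamma^2 = 4 != 1, so det = 1 - gamma^2 = -3 != 0 (mod 13)

def inv(a):     # field inverse via Fermat's little theorem
    return pow(a, P - 2, P)

# M = [[1, gamma], [gamma, 1]]; M_inv = (1/det) * [[1, -gamma], [-gamma, 1]]
det = (1 - gamma * gamma) % P
d = inv(det)
M     = [[1, gamma], [gamma, 1]]
M_inv = [[d % P, (-gamma * d) % P], [(-gamma * d) % P, d % P]]

def apply(mat, v):
    return [(mat[0][0]*v[0] + mat[0][1]*v[1]) % P,
            (mat[1][0]*v[0] + mat[1][1]*v[1]) % P]

# couple a pair of uncoupled symbols (B, B*) into (A, A*) and invert back
for B_pair in [(0, 0), (5, 11), (7, 7), (12, 1)]:
    A_pair = apply(M_inv, list(B_pair))          # (31.3): [A, A*] = M^{-1} [B, B*]
    assert tuple(apply(M, A_pair)) == B_pair     # coupling round-trips
```

The determinant condition γ² ≠ 1 is exactly what makes the round trip possible; with γ² = 1 the matrix would be singular.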
Parity Check Equations: Combining equations (31.3) and (31.2) gives us the p-c equations satisfied by the code symbols of the Clay code A:

    Σ_{x=0}^{r−1} Σ_{y=0}^{t−1} θ_{x,y}^ℓ ( A(x, y; z) + γ 1_{x≠z_y} A(z_y, y; z(x → z_y)) ) = 0,   (31.4)

for all ℓ ∈ [0 : r − 1] and all z ∈ Z_r^t, where 1_{x≠z_y} is equal to 1 if x ≠ z_y and takes the
value 0 otherwise.
Optimal-Access Node Repair: We will show how repair of a single node (x0, y0) in the
Clay code is accomplished by downloading β symbols from each of the remaining n − 1
nodes. Since no computation is required at a helper node, this will also establish that the
Clay code possesses the optimal-access property. The β = r^{t−1} symbols passed on by a
helper node (x, y) ≠ (x0, y0) are precisely the subset {A(x, y; z) | z ∈ P} of the α = r^t
symbols contained in that node, in which

    P = {z ∈ Z_r^t | z_{y0} = x0}

identifies an r^{t−1}-sized subset of the totality of r^t planes in the cube. We will refer to P
as the set of repair planes. Pictorially, these are precisely the planes that have a dot in the
location of the failed node; see Figure 31.9.
Consider the r p-c equations given by (31.4) for a fixed repair plane z, i.e., a plane z
belonging to P. The code symbols appearing in these equations either belong to the plane
z itself or else are paired to a code symbol belonging to z. For y ≠ y0, the pair of a code
symbol A(x, y; z) lying in z is a second code symbol A(z_y, y; z(x → z_y)) that does not
belong to z but lies, however, in a different plane z′ that is also a member of P. It follows
that, for y ≠ y0, the replacement for failed node (x0, y0) has access to all the code symbols
with y ≠ y0 that appear in (31.4). For the case y = y0 with x ≠ x0, the replacement node
has access to code symbols that belong to the plane z but not to their pairs. Based on the
above, the p-c equations corresponding to the plane z ∈ P can be expressed in the form
    θ_{x0,y0}^ℓ A(x0, y0; z) + Σ_{x ≠ z_{y0}} γ θ_{x,y0}^ℓ A(x0, y0; z(x → z_{y0})) = κ*,   ℓ ∈ [0 : r − 1],
where κ∗ indicates a quantity that can be computed at a replacement node based on the
FIGURE 31.9: Figure illustrating node repair of (x0, y0) = (1, 0). The helper data sent
corresponds to repair planes z with a dot at the failed node (1, 0). The uncoupled code symbols
corresponding to the repair planes are recovered as described in the text. We can therefore
recover the failed symbols from the repair planes and also the remaining failed-node symbols
of the form A∗ from A, B as shown in the figure.
code symbols supplied by the n − 1 helper nodes. Thus we have a set of r equations in r
unknowns. From the theory of generalized RS codes, it is known that these equations suffice
to recover the symbols A(x0, y0; z) and {A(x0, y0; z(x → z_{y0})) | x ≠ z_{y0}}. Repeating this
over all r^{t−1} repair planes z ∈ P yields all α symbols of the failed node.
This completes the proof of node repair. We refer the reader to [1621] for a proof of the
data collection property.
Near Optimal Bandwidth MDS Codes are yet another variant of RGCs explored
in the literature. They are vector MDS codes that trade between sub-packetization level
and savings in repair bandwidth. As demonstrated in [1825], a large sub-packetization level
is not a desirable feature in a distributed storage system. Though MSR codes have opti-
mal repair bandwidth, they necessarily incur a high sub-packetization level as established
in [44, 98, 833, 1775]. The piggybacking framework [1571], the ε-MSR framework [1580], and
the transformation in [1242] are examples of construction methods for vector MDS codes
that have a small sub-packetization level while ensuring substantial reduction in repair
bandwidth.
Fractional-Repetition Codes, introduced in [1607], may be regarded as a general-
ization of the RBT polygon MBR code [1570] presented in Section 31.2.6. In a fractional-
repetition code, the symbols corresponding to a data file are first encoded using an MDS
code. Each code symbol is then replicated ρ times. These replicated code symbols are stored
across n nodes, with each node storing α symbols and each code symbol appearing in ex-
actly ρ distinct nodes. The definition of MBR codes requires that any set of d surviving
nodes can be called upon to serve as helper nodes for node repair. In contrast, in the case
of fractional-repetition codes, it is only guaranteed that there is at least one choice of d
helper nodes corresponding to which RBT is possible. A fractional-repetition code with ρ-
repetition allows repair without any computation for up to ρ−1 node failures. Constructions
of fractional-repetition codes can be found in [1465, 1494, 1607, 1706].
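As a concrete illustration (our own toy instance in the spirit of the cited constructions, not one of them verbatim), the edges of the complete graph K5 yield a fractional-repetition layout with n = 5, ρ = 2, and α = 4:

```python
from itertools import combinations

n, rho = 5, 2
symbols = list(combinations(range(n), 2))   # 10 MDS symbols, one per edge of K_5
store = {v: [s for s in symbols if v in s] for v in range(n)}   # node contents

alpha = len(store[0])
assert alpha == n - 1                        # each node stores alpha = 4 symbols
assert all(len(store[v]) == alpha for v in store)
# replication factor: every symbol is stored on exactly rho = 2 nodes
assert all(sum(s in store[v] for v in store) == rho for s in symbols)

# repair-by-transfer of failed node 0: each of the d = n - 1 = 4 helpers simply
# hands over the single symbol it shares with node 0; no computation is needed
failed = 0
recovered = [s for v in range(1, n) for s in store[v] if failed in s]
assert sorted(recovered) == sorted(store[failed])
```

Because every symbol of the failed node lives verbatim on exactly one other node, repair is pure transfer, matching the ρ − 1 = 1 computation-free failure tolerance noted above.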
Secure RGCs, introduced in [1495], are a variant of RGCs where a passive but curious
eavesdropper is present, who has access to the data stored in a subset A of size |A| = ` < k of
the storage nodes. The aim here is to prevent the eavesdropper from gaining any information
pertinent to the stored data. Here again, there is a notion of exact and functional repair and
there are corresponding storage-repair bandwidth tradeoffs. Codes that operate at extreme
ends of the tradeoff are called secure MBR and secure MSR codes, respectively. In [1495], an
upper bound on file size for the case of F-R secure RGC is provided along with a matching
construction corresponding to the MBR point for the case d = n − 1. In [1664], the authors
study the E-R storage-bandwidth tradeoff for this model.
This eavesdropper model was subsequently extended in [1572] to the case where the
passive eavesdropper also has access to data received during the repair of a subset of nodes
A1 ⊆ A of size `1 . In [1572], the authors provide a secure MBR construction that holds for
any value of the parameter d. The file size under this construction matches the upper bound
shown in [1495]. In [1576], for the extended model, the authors provide an upper bound on
the file size corresponding to the secure MSR point and a matching construction. In [1924],
the authors study the E-R storage-bandwidth tradeoff for the extended model.
a nontrivial LRC is necessarily larger than that of an MDS code. There are two broad classes
of LRCs. LRCs with Information Symbol Locality (ISL) are systematic linear codes in
which the repair degree is reduced only for the repair of nodes corresponding to message
symbols. In an LRC with All-Symbol Locality (ASL), the repair degree is reduced for
the repair of all n nodes, regardless of whether the node corresponds to a message or parity
symbol. Clearly, the class of LRCs with ASL is a sub-class of the set of all LRCs with ISL.
involving at most r other code symbols cj with j ∈ St . Thus the set St in the equation
above has size at most r. The minimum distance of an (n, k, r) LRC [830] must necessarily
satisfy the Singleton Bound for LRC:

    dmin ≤ (n − k + 1) − (⌈k/r⌉ − 1).   (31.5)
Thus, for the same values of [n, k], an LRC has a dmin that is smaller by an amount equal to
⌈k/r⌉ − 1 in comparison to an MDS code. The quantity ⌈k/r⌉ − 1 may thus be regarded
as the penalty associated with imposing the locality requirement. An LRC whose minimum
distance satisfies the above bound with equality is said to be optimal. The class of pyramid
codes [994] are an example of a class of optimal LRCs and are described below. Analysis of
nonlinear LRCs can be found in [737, 1489].
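The bound (31.5) is straightforward to evaluate; the sketch below checks it against parameter sets that appear later in this section (the Windows Azure LRC and the [9, 6] RS code):

```python
from math import ceil

def lrc_dmin_bound(n, k, r):
    """Singleton bound for an (n, k, r) LRC, eq. (31.5)."""
    return (n - k + 1) - (ceil(k / r) - 1)

# the locality penalty relative to the MDS bound n - k + 1 is ceil(k/r) - 1
assert lrc_dmin_bound(18, 14, 7) == 4     # Windows Azure LRC: dmin = 4
assert lrc_dmin_bound(9, 6, 6) == 4       # r >= k: no penalty, plain Singleton bound
assert lrc_dmin_bound(16, 12, 4) == 3     # penalty ceil(12/4) - 1 = 2
```

An LRC meeting the bound with equality, such as the Azure code above, is optimal in the sense defined in the text.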
It follows that the code is an optimal LRC. In the general case, if we start with an [n, k]
RS code and split a single parity column, we will obtain an optimal
pyramid LRC.
[Figure: pyramid code layout with message symbols X1, . . . , X7 forming the X-code with local parity PX, message symbols Y1, . . . , Y7 forming the Y-code with local parity PY, and additional parities P1, P2.]
In terms of reliability, the (n = 18, k = 14, r = 7) Windows Azure code and the [9, 6]
RS code are comparable as they both have the same minimum distance dmin = 4. In terms
of repair degree, the two codes are again comparable, having respective repair degrees of 7
(Windows Azure LRC) and 6 (RS). The major difference is in the storage overhead, which
stands at 18/14 ≈ 1.29 in the case of the Azure LRC and 9/6 = 1.5 in the case of the [9, 6] RS
code. This reduction in storage overhead has reportedly saved Microsoft millions of dollars
[1386].
where ua , ub , uc ∈ Fq . Thus the value of an example code symbol f (Pa ) can be recovered
from the values of two other code symbols, f (Pb ) and f (Pc ) in the present case. Thus this
construction represents an LRC with r = 2 which structurally is a subcode of an RS code.
We now present a more general form of the construction in [1772] of an (n, k, r) ASL
LRC, sometimes called a Tamo–Barg code, for the case n = q − 1 and (r + 1) | (q − 1). It
will be convenient to express k in the form

    k = ℓr + a, where ℓ = ⌈k/r⌉ − 1 and 1 ≤ a ≤ r.
In an RS code, we evaluate all polynomials of degree ≤ k − 1; here we restrict attention to
the subset Q of polynomials that can be expressed in the form

    f(x) = Σ_{i=0}^{ℓ} x^{(r+1)i} f_i(x),

where each f_i(x) with i ∈ [0 : ℓ − 1] has degree at most r − 1, while f_ℓ(x) has degree at
most a − 1. Clearly, by counting the number of coefficients b_{i,j} ∈ Fq, we see that
the number of polynomials in the set Q equals q^{ℓr+a} = q^k and hence this code has dimension
k. Let F∗q denote the set of q − 1 nonzero elements in Fq . Code symbols are obtained by
evaluating each polynomial in Q at all the elements in F∗q . Let C be the resultant code, i.e.,
    C = { (f(u))_{u ∈ F*_q} | f ∈ Q }.
We will next establish that C is an LRC. To see this, let H denote the set of (r + 1)-th roots
of unity contained in Fq. Then the (q − 1)/(r + 1) multiplicative cosets of H partition F*_q.
We first note that for any β ∈ H and any b ∈ F*_q, the product bβ is a zero of the polynomial
x^{r+1} − b^{r+1}. It follows then that for f ∈ Q,
    f(bβ) = f(x)|_{x=bβ} = ( f(x) (mod x^{r+1} − b^{r+1}) )|_{x=bβ}

          = ( Σ_{i=0}^{ℓ} x^{(r+1)i} f_i(x) (mod x^{r+1} − b^{r+1}) )|_{x=bβ}

          = ( Σ_{i=0}^{ℓ} b^{(r+1)i} f_i(x) )|_{x=bβ}.   (31.6)

Thus, on the coset bH, the codeword values of f agree with the polynomial Σ_{i=0}^{ℓ} b^{(r+1)i} f_i(x),
which has degree at most r − 1; any code symbol can therefore be recovered from the r other
code symbols whose evaluation points lie in the same coset, establishing all-symbol locality r.
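Equation (31.6) shows that on each coset bH the polynomial f agrees with Σ_i b^{(r+1)i} f_i(x), a polynomial of degree at most r − 1. A numeric sketch over the illustrative field F13 with r = 2 (our own choice of parameters and coefficients, so that (r + 1) | (q − 1)) confirms that each code symbol is recoverable from the r = 2 other symbols in its coset:

```python
q, r = 13, 2                      # (r+1) | (q-1): 3 | 12
H = {1, 3, 9}                     # the cube roots of unity in F_13 (3^3 = 27 = 1 mod 13)

# f(x) = f0(x) + x^3 * f1(x), deg f_i <= r - 1 = 1 (a member of Q, arbitrary coefficients)
f0 = [5, 2]                       # f0(x) = 5 + 2x
f1 = [7, 11]                      # f1(x) = 7 + 11x

def f(x):
    return (f0[0] + f0[1]*x + pow(x, 3, q) * (f1[0] + f1[1]*x)) % q

# partition F_13^* into the (q-1)/(r+1) = 4 multiplicative cosets of H
cosets, seen = [], set()
for b in range(1, q):
    if b not in seen:
        c = sorted((b * h) % q for h in H)
        cosets.append(c)
        seen.update(c)
assert len(cosets) == (q - 1) // 3

# within each coset, f is a degree-1 polynomial: any one of the 3 symbols
# is recoverable from the other r = 2 by linear interpolation
for c in cosets:
    for erased in c:
        (x1, x2) = [x for x in c if x != erased]
        v1, v2 = f(x1), f(x2)
        slope = (v2 - v1) * pow(x2 - x1, q - 2, q)
        assert (v1 + slope * (erased - x1)) % q == f(erased)
```

The key step is that u³ is constant on a coset (u³ = b³ for every u ∈ bH), so the cubic terms of f collapse into constants there.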
FIGURE 31.12: Classification of various approaches for LRCs for multiple erasures
FIGURE 31.13: An example code with sequential recovery which can recover from 2
erasures, with n = 8, k = 4, r = 2.
More formally, a code with sequential recovery from t erasures is an [n, k] linear code
over Fq such that an arbitrary set of t erased symbols c_{j_1}, . . . , c_{j_t} can be recovered as follows:

    c_{j_i} = Σ_{m ∈ S_i} a_m c_m   with a_m ∈ Fq,
where S_i ⊆ {1, 2, . . . , n}, |S_i| ≤ r, and {j_i, j_{i+1}, . . . , j_t} ∩ S_i = ∅. The example given in
Figure 31.13 corresponds to the parameter set (n = 8, k = 4, r = 2, t = 2). Sequential
recovery was introduced in [1535], where a detailed analysis for the t = 2 case can be found.
Characterization of the maximum possible rate for given n, r, and t = 3 can be found
in [1739]. The maximum possible rate of codes with sequential recovery for a given r, t
is characterized in [95]. Constructions of codes having high rate, though rate less than
that of the construction in [95], can be found in [1579, 1737]. The construction in [1737],
however, has block length O(r^{O(log t)}), lower than the O(r^{O(t)}) block length of the
construction in [95].
The sets {S_j^i} will be referred to as recovery sets. The example product code described
above corresponds to the parameter set (n = (r + 1)², k = r², t = 2), with the sets S_1^i, S_2^i
consisting of symbols lying within the same row and column, respectively.
Codes with t-availability can recover from t erasures. This can be seen as follows. If there
are t erased symbols including c_i, then apart from c_i there are t − 1 other erased symbols.
These, however, can be present in at most t − 1 out of the t disjoint recovery sets S_1^i, . . . , S_t^i.
FIGURE 31.14: The binary product code as an example of a code with availability. In
this example, the code symbol 5 can be recovered either directly from the node storing it
or else by computing either the row sum or the column sum.
Hence, there must exist at least one recovery set S_j^i in which none of the erased symbols is
present, and this recovery set can be used to recover c_i. It can be verified that a code with
t-availability is also a code that can recover from t erasures in parallel.
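The product-code example of Figure 31.14 can be checked directly; the sketch below builds the (r + 1) × (r + 1) binary array for r = 3 and verifies that every symbol has two disjoint recovery sets, its row and its column:

```python
import random
random.seed(1)

# 3x3 binary data block extended with row and column parities -> 4x4 array
r = 3
data = [[random.randint(0, 1) for _ in range(r)] for _ in range(r)]
arr = [row + [sum(row) % 2] for row in data]                      # row parities
arr.append([sum(arr[i][j] for i in range(r)) % 2
            for j in range(r + 1)])                               # column parities

# every symbol has t = 2 disjoint recovery sets: its row and its column
for i in range(r + 1):
    for j in range(r + 1):
        by_row = sum(arr[i][jj] for jj in range(r + 1) if jj != j) % 2
        by_col = sum(arr[ii][j] for ii in range(r + 1) if ii != i) % 2
        assert by_row == arr[i][j] == by_col
```

Note that the corner symbol (the "parity of parities") is consistent whether computed from the row-parity column or the column-parity row, which is why every row and every column of the array sums to zero.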
Upper bounds on the rate of codes with availability for a given (r, t) can be found in
[97, 1070, 1771]. A construction of high-rate codes with availability is presented in [1869].
An upper bound on the minimum distance of a code with availability that is independent of
field size q can be found in [97, 1771, 1866]. Field size dependent upper bounds on minimum
distance can be found in [97, 996]. Asymptotic lower bounds, i.e., lower bounds on the rate
k/n as a function of the relative minimum distance dmin/n as n → ∞, for fixed (r, t, q) can be
found in [128, 1771].
that involve a common set {cm | m ∈ S} of r unerased code symbols, where S ⊆ {1, 2, . . . , n}
and |S| ≤ r. Further details including constructions and performance bounds can be found
in [1579].
More formally, a code with (r, δ) locality C over Fq is an [n, k] linear code over Fq such
that for each code symbol ci there is an index set Si ⊆ {1, 2, . . . , n} such that dmin (C|Si ) ≥ δ
and |Si | ≤ r + δ − 1 where C|Si is the restriction of the code to the coordinates correspond-
ing to the set Si . Alternately, we may regard the code C|Si as being obtained from C by
puncturing C in the locations corresponding to index set {1, 2, . . . , n} \ Si . Note that an
LRC is an instance of an (r, δ) code with δ = 2.
The classification of this class of codes into information symbol and all symbol (r, δ)
locality codes follows in the same way as was carried out in the case of an LRC. There is
an analogous minimum distance bound [1534] given by
    dmin(C) ≤ (n − k + 1) − (⌈k/r⌉ − 1)(δ − 1).   (31.8)
A code with (r, δ) locality satisfying the above bound with equality is said to be optimal.
Optimal codes with (r, δ) information symbol locality can be obtained from pyramid codes
by extending the approach described in Section 31.3.1.1 and splitting a larger number δ − 1
of the parity columns in the generator matrix of a systematic RS code. Optimal codes with
(r, δ) ASL can be obtained by employing the construction in [1772] as described in Section
31.3.2 for the case when (r + δ − 1) | n and q = O(n). Optimal (r, δ) cyclic codes with
q = O(n) can be found in [392] for the case when (r + δ − 1) | n. A detailed analysis as to
when the upper bound on minimum distance appearing in (31.8) is achievable can be found
in [1738]. Characterization of binary codes achieving the bound in (31.8) with equality can
be found in [903]. A field size dependent upper bound on dimension k for fixed (r, δ, n, dmin )
appears in [14]. Asymptotic lower bounds, i.e., lower bounds on the rate k/n as a function of
the relative minimum distance dmin/n as n → ∞, for a fixed (r, δ, q) can be found in [128].
FIGURE 31.15: The code on the left is an LRC in which each code symbol is protected by
an [n = 4, k = 3, dmin = 2] local code and each local code is contained in an overall [24, 14, 7]
global code. In the hierarchical locality code appearing to the right, each local code is a
part of a [12, 8, 3] so-called middle code, and the middle codes, in turn, are contained in an
overall [24, 14, 6] global code.
recoverability finds particular application in the design of sector-disk codes [1512] that are
used in RAID storage systems to combat simultaneous sector and disk erasures.
FIGURE 31.16: An LRGC in which each of the three local codes is a pentagon MBR code.
The set of 30 scalar symbols that make up the LRGC form a scalar ASL LRC in which
there are three disjoint local codes, each of block length r + 1 = 10. The contents of each
of the three pentagons are obtained from the 10 scalar symbols making up the respective
local code by following the same procedure employed to construct a pentagon MBR code
from a set of 10 scalar symbols that satisfy an overall parity check.
    GRS_k(Θ, u)^⊥ = { (v_1 h(θ_1), v_2 h(θ_2), . . . , v_n h(θ_n)) | h ∈ H } = GRS_{n−k}(Θ, v).

Like the {u_i}, the {v_i}_{i=1}^{n} ⊆ F*_q are also a set of scaling coefficients, where v = (v_1, . . . , v_n).
The scaling coefficients {u_i}, {v_j} do not, however, play any role in determining the repair
bandwidth, and for this reason, in the text below, we assume all the scaling coefficients
u_i, v_j are equal to 1.
Trace Function and Trace-Dual Basis: Let q = p^t and let B = F_p denote the base field. The trace function Tr_{q/p} : Fq → B is given by

    Tr_{q/p}(x) = Σ_{i=0}^{t−1} x^{p^i},
where x ∈ Fq. For every basis Γ = {γ_1, γ_2, . . . , γ_t} of Fq over B, there exists a second basis
∆ = {δ_1, δ_2, . . . , δ_t}, termed the trace-dual basis, satisfying

    Tr_{q/p}(γ_i δ_j) = 1 if i = j, and 0 otherwise.
Thus, given {Tr_{q/p}(x γ_i)}_{i=1}^{t}, the element x can be uniquely recovered, namely as x = Σ_{i=1}^{t} Tr_{q/p}(x γ_i) δ_i.
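A minimal sketch of the trace and trace-dual machinery over GF(4) (our own toy field choice, with the likewise arbitrary basis Γ = {1, x}):

```python
# GF(4) = F_2[x]/(x^2 + x + 1); elements 0..3 encode polynomials by their bits
def gmul(a, b):
    r = 0
    for i in range(2):
        if (b >> i) & 1:
            r ^= a << i
    for i in (3, 2):                 # reduce modulo x^2 + x + 1 (0b111)
        if (r >> i) & 1:
            r ^= 0b111 << (i - 2)
    return r

def tr(x):                           # Tr_{4/2}(x) = x + x^2
    return x ^ gmul(x, x)

assert all(tr(x) in (0, 1) for x in range(4))    # the trace lands in B = F_2

gamma = [1, 2]                       # a basis of GF(4) over F_2: {1, x}
# brute-force the trace-dual basis: Tr(gamma_j * delta_i) = 1 iff i == j
delta = []
for i in range(2):
    for d in range(1, 4):
        if all(tr(gmul(g, d)) == int(j == i) for j, g in enumerate(gamma)):
            delta.append(d)
            break
assert len(delta) == 2

# recovery: x = sum_i Tr(x * gamma_i) * delta_i
for x in range(4):
    rec = 0
    for g, d in zip(gamma, delta):
        if tr(gmul(x, g)):
            rec ^= d
    assert rec == x
```

The last loop is exactly the recovery statement above: the t trace values of x against Γ determine x via the dual basis ∆.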
Node Repair via the Dual Code: Recall that GRS_k(Θ, u) and its dual GRS_{n−k}(Θ, v)
are scaled evaluations of polynomials of degree at most k − 1 and at most n − k − 1,
respectively. Hence for f ∈ F and h ∈ H, we have Σ_{i=1}^{n} f(θ_i) h(θ_i) = 0 (assuming
that each u_i and each v_j equals 1 for reasons explained earlier). Suppose the code symbol
f(θ_i) has been erased. We have

    f(θ_i) h(θ_i) = − Σ_{j=1, j≠i}^{n} f(θ_j) h(θ_j).
Thus,

    Tr_{q/p}( f(θ_i) h(θ_i) ) = − Σ_{j=1, j≠i}^{n} Tr_{q/p}( f(θ_j) h(θ_j) ).   (31.9)
Next, let us assume that it is possible to select a subset H_i of H in such a way that
{h(θ_i)}_{h∈H_i} forms a basis for Fq over B. It follows from (31.9) and the existence of a trace-dual basis that f(θ_i) can be recovered from the set

    { Σ_{j=1, j≠i}^{n} Tr_{q/p}( f(θ_j) h(θ_j) ) }_{h ∈ H_i}.

In [882], the authors carefully choose the subsets {H_i}_{i=1}^{n} so as to not only satisfy the above
basis requirement, but also reduce the repair bandwidth associated with the recovery of
f(θ_i) via (31.9).
The Repair Scheme in [882]: Let n − k ≥ p^{t−1} for a GRS code C = GRS_k(Θ, 1). Let
Γ = {γ_1, γ_2, . . . , γ_t} be a basis for Fq over B. Each codeword in C corresponds to the scaled
evaluation at the elements in Θ of a polynomial f ∈ F. With respect to the scheme for
failed-node recovery described above, consider the set

    H_i = { Tr_{q/p}( γ_j (x − θ_i) ) / (x − θ_i) }_{j=1}^{t}.
It is straightforward to verify that {h(θ_i)}_{h∈H_i} ≡ Γ and that, for j ≠ i, {h(θ_j)}_{h∈H_i} is a set
consisting of scalar multiples (over B) of 1/(θ_j − θ_i). Hence, from the B-linearity of the trace
function Tr_{q/p}, it is possible to compute all elements in the set {Tr_{q/p}( f(θ_j) h(θ_j) )}_{h∈H_i}
from Tr_{q/p}( f(θ_j)/(θ_j − θ_i) ). Clearly, in order for the replacement node to be able to compute

    { Σ_{j=1, j≠i}^{n} Tr_{q/p}( f(θ_j) h(θ_j) ) }_{h ∈ H_i},

each node j (j ≠ i) needs only provide the single symbol Tr_{q/p}( f(θ_j)/(θ_j − θ_i) ) ∈ B. This
results in a repair bandwidth of n − 1 symbols over B to recover each f(θ_i). In contrast, as
noted earlier, the traditional approach for recovering a code symbol incurs a repair bandwidth
of k symbols over Fq, or equivalently kt symbols over B.
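For a feel of the numbers, here is the bandwidth comparison for one illustrative full-length parameter set of our own choosing that satisfies the scheme's hypothesis n − k ≥ p^{t−1}:

```python
# illustrative parameters (not a deployed code): q = 2^8, B = F_2, full-length n = q
p, t = 2, 8
q = p**t
n, k = q, q // 2
assert n - k >= p**(t - 1)            # hypothesis of the repair scheme in [882]

traditional = k * t                    # download k symbols of F_q = k*t bits
trace_scheme = (n - 1) * 1             # one F_2 symbol from each of the n - 1 helpers
assert (traditional, trace_scheme) == (1024, 255)
```

So in this setting the trace-based scheme downloads 255 bits against 1024 for naive interpolation, roughly a 4x saving; codes with small n − k (such as the [14, 10] HDFS code mentioned below) do not meet the hypothesis with B = F_2, which is why dedicated schemes like [655] exist.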
There has been much subsequent work on the repair of codes dealing with issues such
as repairing RS codes in the presence of multiple erasures, achieving the Cut-Set Bound on
node repair, enabling optimal access, etc. The reader is referred to [399, 489, 1776] and the
references therein for details. RS repair schemes specific to the [n = 14, k = 10]q=256 RS
code employed by HDFS have been provided in [655].
Azure, Ceph, and Openstack have enabled support for erasure codes within their systems.
These erasure coding options were initially limited to RS codes. It was subsequently realized
that the frequent node-repair operations taking place in the background and the consequent
network traffic, and helper-node distraction, were hampering front-end operations. This mo-
tivated the development of the RGCs, LRCs, and the improved repair of RS codes. As noted
in Section 31.3.1.2, LRCs are very much a part of the Windows Azure cloud-storage system.
Hadoop EC has made Piggybacked RS codes available as an option. Both LRC and MSR
(Clay) codes are available as erasure coding options in Ceph.
Chapter 32
Polar Codes
Noam Presman
Tel Aviv University
Simon Litsyn
Tel Aviv University
32.1 Introduction
Introduced by Arıkan [63], polar codes are error-correcting codes (ECCs) that achieve
the symmetric capacity of discrete input memoryless channels (DMCs) with polynomial
encoding and decoding complexities. Arguably the most useful member of this family is
Arıkan’s (u + v, v) polar code. It was also the first one that was discovered; however gener-
alizations were soon to follow [1149, 1150, 1406, 1403].
In this chapter we attempt to provide a general presentation of polar codes and their
associated algorithms. At the same time, many of the examples in the chapter use Arıkan’s
original construction due to its simplicity and wide applicability. Our journey begins in
Section 32.2 by introducing a code construction based on a recursive concatenation of short
codes (a.k.a. kernels) and by predetermining some of their information input symbols (a.k.a.
symbol freezing). The theory of polar codes is based on such constructions and on associat-
ing with each information symbol of the ECC a synthetic channel. Such a synthetic channel
has as input the information symbol and as outputs the original communication channel
outputs and some of the other information symbols. In Section 32.3 we describe the notion
Notations
Throughout we use the following notations. For a natural number ℓ, we denote [ℓ] =
{1, 2, 3, . . . , ℓ} and [ℓ]− = {0, 1, 2, . . . , ℓ − 1}. Vectors are represented by bold lowercase
letters and matrices by bold uppercase letters. For i ≥ j, let u_j^i = [u_j u_{j+1} · · · u_i] be the
sub-vector of u of length i − j + 1 (if i < j we say that u_j^i = [ ], the empty vector, and its
length is 0). For a set of indices I ⊆ [n]−, the vector u_I is a sub-vector of u with indices
corresponding to I (the order of elements in u_I is the same as in u), where n is the length
of u.
Definition 32.2.1 Given an ℓ-dimensional kernel g(·), we construct a kernel based transformation g^(n)(·) of N = ℓ^n dimensions (i.e., g^(n)(·) : F^{ℓ^n} → F^{ℓ^n}) in the following recursive
fashion:

    g^(n)( u_0^{N−1} ) = [ g( γ_{0,j}, γ_{1,j}, . . . , γ_{ℓ−1,j} ) ]_{j=0}^{N/ℓ−1},

where

    [ γ_{i,j} ]_{j=0}^{N/ℓ−1} = g^{(n−1)}( u_{i·(N/ℓ)}^{(i+1)·(N/ℓ)−1} )   for i ∈ [ℓ]−,

and g^{(0)}(·) is the identity map.
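For the (u + v, v) kernel of Example 32.2.2 below, the recursion can be sketched as follows. The exact interleaving of the outputs is a convention (the definition fixes it only up to the column arrangement of Figure 32.1); the properties tested, bijectivity and linearity, hold regardless:

```python
def polar_transform(u):
    """Recursive (u+v, v) kernel transformation g^(n) over F_2 for N = 2^n."""
    N = len(u)
    if N == 1:
        return list(u)
    half = N // 2
    top = polar_transform(u[:half])    # outer codeword #0
    bot = polar_transform(u[half:])    # outer codeword #1
    out = []
    for j in range(half):              # inner kernel applied to column j
        out += [top[j] ^ bot[j], bot[j]]
    return out

N = 8
inputs = [[(m >> i) & 1 for i in range(N)] for m in range(2**N)]
images = {tuple(polar_transform(u)) for u in inputs}
assert len(images) == 2**N            # the transformation is a bijection on F_2^8

# linearity over F_2: g(u + v) = g(u) + g(v)
u, v = inputs[37], inputs[202]
s = [a ^ b for a, b in zip(u, v)]
assert polar_transform(s) == [a ^ b for a, b in
                              zip(polar_transform(u), polar_transform(v))]
```

Bijectivity is what lets the frozen-symbol construction carve codes out of the transformation: fixing some inputs selects a coset-structured subset of F_2^N.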
Example 32.2.2 (The (u + v, v) kernel) In his seminal paper [63] Arıkan proposed the
following linear binary kernel g(u, v) = [u + v, v] = [u, v] · G with

    G = [ 1 0
          1 1 ],

where u, v, G_{i,j} ∈ F2. Such a mapping is referred to as the (u + v, v) kernel. It can be seen
that the resulting transformation g^(n)(·) is linear, with a matrix G^(n) that is, up to the
ordering of the inputs, the n-fold Kronecker power of G (for example, the last row of G^(3)
is the all-ones vector [1 1 1 1 1 1 1 1]).
Example 32.2.4 (Reed–Muller codes) Consider g^(n)(·) from Example 32.2.2. Selecting
A to be the indices of the rows of G^(n) with Hamming weight at least 2^{n−r} results in the
rth order Reed–Muller (RM) code RM(r, n) of length N = 2^n. Note that row i of G^(n) has
weight 2^{wt_H(i)}, where wt_H(i) is the Hamming weight of the binary representation of i. For
a general indices selection A, the minimum distance of the code is dmin = min_{i∈A} 2^{wt_H(i)}
[1148, Chapter 6].
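A quick numeric check of the row-weight and minimum-distance claims for n = 3, r = 1, in which case the selection A yields RM(1, 3) = [8, 4, 4]:

```python
import itertools

def kron(A, B):
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

G = [[1, 0], [1, 1]]
n, r = 3, 1
Gn = G
for _ in range(n - 1):
    Gn = kron(Gn, G)                                  # G^(3): an 8 x 8 matrix

# row i of G^(n) has Hamming weight 2^{wt_H(i)}
assert all(sum(Gn[i]) == 2**bin(i).count("1") for i in range(2**n))

# keep rows of weight >= 2^{n-r} = 4: the indices {3, 5, 6, 7} generate RM(1, 3)
A = [i for i in range(2**n) if bin(i).count("1") >= n - r]
assert len(A) == 4
weights = [sum(sum(Gn[i][j] for i in c) % 2 for j in range(2**n))
           for m in range(1, len(A) + 1)
           for c in itertools.combinations(A, m)]
assert min(weights) == 4                              # dmin of RM(1, 3) = [8, 4, 4]
```

Brute-forcing all nonzero row combinations is feasible here because the selected generator has only 4 rows; the minimum weight 4 matches dmin = min_{i∈A} 2^{wt_H(i)} = 2².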
The above GCC formalization is illustrated in Figure 32.1. In this figure, we see the `
outer-code codewords of length `n−1 depicted as horizontal rectangles (similar to rows of a
matrix). The instances of the inner-codeword mapping are depicted as vertical rectangles
that are located on top of the outer-codes rows (resembling columns of a matrix). This is
appropriate, as this mapping operates on columns of the matrix whose rows are the outer-
code codewords. Note that for brevity we only drew three instances of the inner mapping, but
there should be ℓ^{n−1} instances of it, one for each column of this matrix. In the homogeneous
case, the outer-codes themselves are constructed in the same manner. Note, however, that
even though these outer-codes have the same structure, they form different codes in the
general case. The reason is that they may have different sets of frozen symbols.
Example 32.2.5 (GCC structure for Arıkan's Construction) Figure 32.2 depicts
the GCC block diagram for the Arıkan (u + v, v) Construction specified in Example 32.2.2.
Note that [γ_{0,j}]_{j=0}^{N/2−1} = g^{(n−1)}( u_0^{N/2−1} ) and [γ_{1,j}]_{j=0}^{N/2−1} = g^{(n−1)}( u_{N/2}^{N−1} ) are the two
outer-codes, each of length N/2 bits, and the kernel g(·) is the inner mapping.
The GCC structure of polar codes can be also represented by a layered1 Forney’s normal
factor graph (NFG) [744]. Layer #0 of this graph contains the inner mappings (represented
as sets of vertices), and therefore we refer to it as the inner layer. Layer #1 contains the
vertices of the inner layers of all the outer-codes that are concatenated by layer #0. We
may continue and generate layer #i by considering the outer-codes that are concatenated
by layer #(i − 1) and include in this layer all the vertices describing their inner mappings.
This recursive construction process may continue until we obtain outer-codes that cannot
be decomposed into nontrivial inner-codes and outer-codes. Edges (representing variables)
connect between outputs of the outer-codes to the input of the inner mappings. This rep-
resentation can be understood as viewing the GCC structure in Figure 32.1 from its side.
Example 32.2.6 (Layered NFG for Arıkan’s Construction) Figure 32.3 depicts a
layered factor graph representation for a polar code of length N = 2n symbols with kernel
of ` = 2 dimensions. We have two outer-codes (#0 and #1) each one of length N/2 bits that
are connected by the inner layer (note the similarities to the GCC block diagram in Figure
32.2). Half edges represent the inputs u_0^{N−1} and the outputs x_0^{N−1} of the transformation.
The edges, denoted by γi,j for j ∈ [N/2]− and i ∈ [2]− , connect the outputs of the two
outer-codes to the inputs of the inner mapping blocks, g(·). To demonstrate the recursive
1 In a layered graph, the vertices set can be partitioned into a sequence of sub-sets called layers and
denoted by L0 , L1 , · · · , Lk−1 . The edges of the graph connect only vertices within the layer or in successive
layers.
FIGURE 32.4: Normal factor graph representation of the g(·) block from Figure 32.3 for
Arıkan’s (u + v, v) Construction
structure of the code we also unfolded the interior organization of outer-codes #0 and
#1, thereby uncovering their inner-codes (g(·)) and outer-codes (each one of length N/4
symbols).
Strictly speaking, the vertical blocks that represent the g(·) inner mapping are them-
selves factor graphs (i.e., collections of vertices and edges). An example of a normal factor
graph specifying such a block is given in Figure 32.4 for Arıkan’s (u + v, v) Construction
(see Example 32.2.5). Vertex a0 represents a parity constraint and vertex e1 represents an
equivalence constraint. The half edges u0 , u1 represent the inputs of the mapping, and the
half edges x0 , x1 represent its outputs. This graphical structure is probably the most pop-
ular visual representation of polar codes (see, e.g., [63, Figure 12] and [1148, Figure 5.2])
and is also known as the “butterflies” graph because of the edges arrangement.
Note that it is also possible to employ several types of kernels in the same construction
and even kernels that have different types of input alphabets (non-homogeneous kernels). The
authors introduced such a mixed-kernels construction, which is advantageous over homogeneous
kernel constructions [1545].
The next step Arıkan takes is called channel splitting. As its name implies, the vector
channel W_N is split into a set of N F-symbol-input coordinate channels, denoted by
W_N^(i) : F → F^i × Y^N, having the corresponding transition probabilities

    W_N^(i)( y, u_0^{i−1} | u_i ) = (1 / q^{N−1}) Σ_{u_{i+1}^{N−1} ∈ F^{N−i−1}} W_N( y | u ),

where |F| = q. The outputs of a channel W_N^(i), i ∈ [N]−, consist of the output vector of the
physical channel, y, and the ith prefix of the input vector u, i.e., u_0^{i−1}. It is further asserted
that the symbols u_{i+1}^{N−1} are unknown and uniformly distributed. The set of coordinate
channels { W_N^(i) }_{i ∈ [N]−} is referred to as the synthetic channels generated by the transformation.
i∈[N ]−
The channel splitting induces the decoding method specified in Algorithm 32.3.1, referred
to as Successive-Cancellation (SC) decoding. According to this method the decoder decides
on the symbols of the input information sequentially according to (32.1). If i is a frozen index,
then it uses the fixed and predetermined frozen value u_i. Otherwise it applies maximum-likelihood decoding on the ith synthetic channel, in which it assumes the correctness of its previous
decisions û_0^{i−1}. The outputs of the algorithm are the estimated information vector û and
its corresponding codeword x̂.
The reader may raise two concerns with regards to the specification of Algorithm 32.3.1:
(i) The complexity of the method seems to be exponential in the code length. It turns
out that a recursive and efficient method exists for polar codes with time complexity of
O(q^ℓ · N · log N) and space complexity of O(q · N). (ii) Failure in step i of the algorithm causes
the entire algorithm to fail. While error propagation is indeed a problem for any sequential
method, it turns out that there exists a method to select A such that the code rate is
approaching the symmetric capacity of the channel W with vanishing error probability (in
the code length) in SC decoding.
Algorithm 32.3.1 (Successive-Cancellation (SC) Decoding)
Input: y.
// Initialization:
▷ Allocate decision vector û = 0.
// Sequential decision:
▷ For each i = 0, 1, 2, . . . , N − 1 do

    û_i = u_i                                                  if i ∈ A^c,
    û_i = argmax_{f ∈ F} W_N^(i)( y, û_0^{i−1} | u_i = f )     if i ∈ A.    (32.1)

Output: û and x̂ = g(û).
The symmetric capacity of the ith synthetic channel is given by

    I( W_N^(i) ) = Σ_{u_0^i ∈ F^{i+1}, y ∈ Y^N} ( Prob( Y = y | U_0^i = u_0^i ) / q^{i+1} ) · log_q ( W_N^(i)( y, u_0^{i−1} | u_i ) / ( (1/q) Σ_{u′ ∈ F} W_N^(i)( y, u_0^{i−1} | u′ ) ) ).

Note that I(W), I( W_N^(i) ) ∈ [0, 1] and by the information chain-rule we have that I(W) =
(1/N) · Σ_{i=0}^{N−1} I( W_N^(i) ). Finally, let us denote the genie-aided (GA) SC error probability on
step i as P_e( W_N^(i) ) = Prob( Û_i ≠ U_i ), where i ∈ [N]−.
Recall that Shannon measures the channel quality by the channel capacity, indicating
the maximum code rate that can be used to transmit data on the channel with vanishing
error probability. Similarly, I( W_N^(i) ) indicates the quality of the ith synthetic channel.
Specifically, as I( W_N^(i) ) → 0 the channel worsens and the probability of error in (32.1) under GA
SC grows to 1 (assuming that i ∈ A), i.e., P_e( W_N^(i) ) → 1. Conversely, as I( W_N^(i) ) → 1 the
channel improves and the probability of error in (32.1) tends to zero (P_e( W_N^(i) ) → 0). We
say that a kernel g(·) is polarizing if almost all the synthetic channels have their symmetric
capacities go to zero or to one as N grows. Let us formalize this.
For a length N transformation, let us induce a uniform probability distribution on the
synthetic channels and define I_n and P_{e,n} to be random variables standing, respectively, for
the capacity and error-probability of a synthetic channel selected uniformly at random.
In other words, we pick each one of the N synthetic channels with probability 1/N, and I_n
and P_{e,n} are the capacity and error-probability of this channel, respectively. For a subset of
channels Γ ⊆ { W_N^(i) }_{i ∈ [N]−}, we understand the probabilistic notation Prob(I_n ∈ Γ) as the
chance to select any channel in Γ uniformly at random; therefore Prob(I_n ∈ Γ) = |Γ|/N.
Definition 32.3.2 A kernel g(·) polarizes a channel W if and only if for every δ ∈ (0, 0.5)
we have
$$\lim_{n\to\infty} \mathrm{Prob}(\iota_0 \le I_n \le \iota_1) = \begin{cases} I(W) & \text{if } \iota_0 = 1-\delta \text{ and } \iota_1 = 1, \\ 1 - I(W) & \text{if } \iota_0 = 0 \text{ and } \iota_1 = \delta. \end{cases}$$
Example 32.3.3 Figure 32.5 depicts the polarization phenomenon of the bi-additive white
Gaussian noise (bi-AWGN) channel at Es /N0 = −1.54 dB (I(W) = 0.6 bits) by the (u+v, v)
kernel. On the left, we took $g^{(11)}(\cdot)$ of length 2048 bits and estimated, using density evolution, the capacities of the synthetic channels $I\big(W_{2048}^{(i)}\big)$. It is easy to see that most of the channels have capacity close to either 1 or 0, even for this intermediate length of 2048 bits.
The right figure indicates the trend of polarization: as the length of the transformation grows, the proportion of channels with capacity close to 1 tends to $I(W)$ and the proportion with capacity close to 0 tends to $1 - I(W)$.
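The trend in this example can be reproduced numerically in a simpler setting. For the binary erasure channel (BEC), the $(u+v, v)$ kernel transforms a synthetic-channel capacity $I$ exactly into the pair $(I^2,\; 2I - I^2)$; the following sketch (ours, not from the text) applies that recursion and counts the polarized channels:

```python
# Sketch (ours): observe polarization of the (u+v, v) kernel on a BEC with
# capacity 0.6, using the exact BEC recursion I- = I^2, I+ = 2I - I^2.

def bec_synthetic_capacities(capacity, n):
    """Capacities of the 2^n synthetic channels of a BEC with the given capacity."""
    caps = [capacity]
    for _ in range(n):
        caps = [new for c in caps for new in (c * c, 2 * c - c * c)]
    return caps

caps = bec_synthetic_capacities(0.6, 11)            # N = 2048 channels
frac_good = sum(c > 0.99 for c in caps) / len(caps)
frac_bad = sum(c < 0.01 for c in caps) / len(caps)
average = sum(caps) / len(caps)                     # chain rule: stays 0.6
```

As $n$ grows, `frac_good` tends to $I(W) = 0.6$ and `frac_bad` to $0.4$, while the average capacity is preserved at every step by the chain rule.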
A partial distance element $D_{\min}^{(i)}\big(u_0^{i-1}\big)$ corresponds to the $i$th step of SC, in which we have to decide on $u_i$ after we have already determined the prefix $u_0^{i-1}$ of the input $u$ to the kernel $g(\cdot)$. In this case the event $u_i = \omega$ corresponds to the subset $\big\{g\big(u_0^{i-1}, \omega, v_{i+1}^{\ell-1}\big) \mid v_{i+1}^{\ell-1} \in \mathbb{F}^{\ell-i-1}\big\}$, $\omega \in \mathbb{F}$. Generally speaking, the larger the gap between
any two codewords when each one of them corresponds to a different ui value, the smaller
the chance that a decoder will perform erroneous decisions. The minimum gap correspond-
ing to a certain prefix is defined in (32.2) and the minimum across all prefixes is defined in
(32.3).
Example 32.4.3 (RS kernels [1406]) Let the Reed–Solomon kernel over $\mathbb{F}_q$ be defined as $g\big(u_0^{q-1}\big) = u \cdot G_{RS}(q)$ where $G_{RS}(q)$ is a $q \times q$ matrix given by
$$G_{RS}(q) = \left[\begin{array}{cccccc}
1 & 1 & 1 & \cdots & 1 & 0 \\
1 & \alpha & \alpha^2 & \cdots & \alpha^{q-2} & 0 \\
1 & \alpha^2 & \alpha^4 & \cdots & \alpha^{2(q-2)} & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
1 & \alpha^{q-2} & \alpha^{2(q-2)} & \cdots & \alpha^{(q-2)(q-2)} & 0 \\
1 & 1 & 1 & \cdots & 1 & 1
\end{array}\right]$$
with $\alpha$ a primitive element of $\mathbb{F}_q$. We have that for each $i \in [q]$, the $i$ lowest rows of $G_{RS}(q)$ form a generating matrix of the $[q, i, q-i+1]$ extended RS code. As a consequence, we can easily observe that $D_{\min}^{(i)} = i + 1$ for all $i \in [q]^-$.
Arıkan defined the polarization term, and provided the first example of a polarizing
kernel (u + v, v) [63]. Korada et al. found conditions for polarizing binary and linear kernels
[1149, 1150]. Mori and Tanaka considered the general case of a mapping g(·) that is not
necessarily linear and binary [1406, 1403]. For brevity, we provide in this chapter two useful
necessary and sufficient conditions for polarization based on the latter works. The first
condition deals with binary kernels that are not necessarily linear.
Theorem 32.4.4 (Polarization of Binary Kernels [1403]) Let g(·) be a binary kernel
of $\ell$ dimensions. If there exists $i \in [\ell]^-$ and $u_0^{i-1} \in \{0, 1\}^i$ where $D_{\min}^{(i)}\big(u_0^{i-1}\big) \ge 2$, then $g(\cdot)$ is polarizing according to Definition 32.3.2.
The next theorem deals with linear kernels over Fq . Such kernels are specified by an ` × `
kernel generating matrix G. We say that two kernel matrices are polarization equiva-
lent if the synthetic channels induced by them have the same capacity sequences for any
symmetric channel W. Lemma 32.4.5 specifies an equivalence condition.
Definition 32.4.6 A lower triangular matrix with a unit diagonal that is equivalent to a
matrix G is called a standard form of G.
In general, a standard form of a matrix G is not necessarily unique. If there exists a standard
form of G that is the identity matrix, then it is unique.
$$\widetilde{G} = \left[\begin{array}{cccc}
1 & 0 & 0 & 0 \\
\alpha^2 & 1 & 0 & 0 \\
\alpha & \alpha & 1 & 0 \\
1 & 1 & 1 & 1
\end{array}\right] \quad\text{and}\quad V = \left[\begin{array}{cccc}
1 & 0 & 0 & 0 \\
0 & \alpha^2 & \alpha^2 & 0 \\
0 & 0 & \alpha & 0 \\
0 & 0 & 0 & 1
\end{array}\right].$$
So $\widetilde{G}$ is a standard form of $G_{RS}(4)$.
Theorem 32.4.8 (Polarization of Linear Kernels [1406]) For a linear kernel g(·)
over Fq specified by an invertible matrix G with non-identity standard form, the following
conditions are equivalent.
Example 32.4.11 The Arıkan kernel as well as the kernel in Example 32.4.2 have a polar-
ization exponent of 0.5. Linear kernels with GRS (q) have a polarization exponent logq (q!)/q.
Specifically, the kernel based on GRS (4) has the exponent 0.573120.
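The stated exponents are easy to check numerically; a small sketch (ours) of $E = \log_q(q!)/q$:

```python
# Sketch (ours): the polarization exponent of the length-q RS kernel is
# E = log_q(q!)/q; q = 2 recovers the Arikan-kernel exponent 0.5.
import math

def rs_exponent(q):
    return math.log(math.factorial(q), q) / q

e2 = rs_exponent(2)   # 0.5
e4 = rs_exponent(4)   # ~ 0.573120
```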
The following theorem relates the error-correction performance of the GA SC decoder to the exponent.
Theorem 32.4.12 (Polarization Rate [1406, 1403]) Let $g(\cdot)$ be an $\ell$ dimensions kernel over $\mathbb{F}$ that polarizes a channel $W$. Then for any $\epsilon > 0$
$$\lim_{n\to\infty} \mathrm{Prob}\Big(P_{e,n} < 2^{-\ell^{(E(g)-\epsilon)n}}\Big) = I(W), \tag{32.4}$$
and
$$\lim_{n\to\infty} \mathrm{Prob}\Big(P_{e,n} < 2^{-\ell^{(E(g)+\epsilon)n}}\Big) = 0. \tag{32.5}$$
Roughly speaking, (32.4) and (32.5) suggest that as $N$ grows the SC GA decoding error rate of almost all the good channels (the $N \cdot I(W)$ channels with capacity approaching 1) is in the range $\Big(2^{-N^{E(g)+\epsilon}},\ 2^{-N^{E(g)-\epsilon}}\Big)$.
Note that the exponent is always smaller than 1. As a consequence the error probability
decays sub-exponentially in the length of the code N . Korada et al. showed that as ` grows,
the maximum achievable polarization exponent of binary linear kernels of ` dimensions
approaches 1 [1150, Theorem 22]. The exponent of binary kernels is ≤ 0.5 for ` ≤ 13
[1259, 1546]. For ` = 16 dimensions the maximum exponents for binary linear and nonlinear
kernels are 0.51828 [1150, Example 28] and 0.52742 [1546], respectively. The RS kernels have the maximum exponent among kernels of the same number of dimensions.
Theorem 32.5.1 (Polar Codes Achieve Capacity with SC Decoding) Let $g(\cdot)$ be an $\ell$ dimensions kernel over $\mathbb{F}_q$ polarizing a channel $W$ with symmetric capacity $I(W)$. For each $\epsilon > 0$ there exists a sequence of codes $\big\{\mathcal{C}^{(i)}\big\}_{i=1}^{\infty}$, where $\mathcal{C}^{(n)}$ is of length $\ell^n$, rate $\ge I(W) - \epsilon$, constructed from $g^{(n)}(\cdot)$ and non-frozen symbols set $\mathcal{A}^{(n)}$, such that the frame error rate of the code under SC decoding is $P_{e,n} = o\big(2^{-\ell^{(E(g)-\epsilon)n}}\big)$.
Proof: Given $\epsilon > 0$, for each $n$ let us specify $\mathcal{A}^{(n)}$ to be the set of channels with GA SC error probability below $2^{-\ell^{(E(g)-\epsilon/2)n}}$. According to Theorem 32.4.12 there exists $n_0$ such that for each $n > n_0$ we have $\big|\mathcal{A}^{(n)}\big| \ge (I(W) - \epsilon) \cdot \ell^n$. In that case the probability of frame error under SC is bounded from above by
$$P_{e,n} < \sum_{i \in \mathcal{A}^{(n)}} P_e\big(W_N^{(i)}\big) \le (I(W) - \epsilon) \cdot \ell^n \cdot 2^{-\ell^{(E(g)-\epsilon/2)n}} = o\big(2^{-\ell^{(E(g)-\epsilon)n}}\big).$$
Remark 32.5.2 A more refined notion of polarization that links the gap to capacity and
the length of a polar code achieving it (with certain error probability) was studied by
Guruswami and Xia [883]. The scaling behavior of polar codes is a related phenomenon [823, 918, 919, 1151, 1397, 1398]. Şaşoğlu et al. showed how to achieve the Shannon
capacity of a non-symmetric channel by a suitable extension of the input alphabet that
depends on the capacity achieving input probability distribution [1622].
The proof of Theorem 32.5.1 suggests that in order to specify a good polar code, the
quality of the synthetic channels has to be assessed and graded. Bad channels are assigned to
be frozen while good channels are used to carry the information. This channel organization
operation is referred to as polar code design.
Simulating the synthetic channels based on the channel $W$ (using, e.g., Monte-Carlo simulation) and estimating properties of the channels (e.g., symbol-error-rate, Bhattacharyya parameter, capacity) may allow us to assess their ordering [63].
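For the binary erasure channel this grading is available in closed form, since the Bhattacharyya parameter $Z$ evolves exactly as $Z^- = 2Z - Z^2$ and $Z^+ = Z^2$ under the $(u+v, v)$ kernel. The following sketch (ours, not the chapter's Monte-Carlo or DE machinery) grades the synthetic channels and unfreezes the $K$ best:

```python
# Sketch (ours): polar code design for the BEC via the exact Bhattacharyya
# recursion Z- = 2Z - Z^2, Z+ = Z^2; unfreeze the K most reliable channels.

def design_bec_polar(erasure_prob, n, K):
    zs = [erasure_prob]
    for _ in range(n):
        zs = [z for y in zs for z in (2 * y - y * y, y * y)]
    order = sorted(range(len(zs)), key=lambda i: zs[i])   # best (smallest Z) first
    return sorted(order[:K])                              # the non-frozen set A

A = design_bec_polar(0.5, 3, 4)   # N = 8, rate 1/2: A == [3, 5, 6, 7]
```

The resulting set for $N = 8$ on a BEC with erasure probability $0.5$ is the familiar $\mathcal{A} = \{3, 5, 6, 7\}$.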
Alternatively, it may be possible to assess the qualities of the synthetic channels by
performing estimation of their log likelihood ratio (LLR) distribution or parameter thereof.
Mori and Tanaka followed that approach by employing the density evolution (DE) algorithm
for Arıkan’s Construction [1404, 1403]. This algorithm recursively computes the probability
density function of the LLR messages assuming that the zero codeword was transmitted.
However, as the block length grows the computation precision has to increase and thus
the complexity. An efficient approximation to DE is obtained by assuming that all computed distributions are Gaussian, so that only their expectations and standard deviations have to be tracked [1819].
Tal and Vardy developed a quantization method that allows performing DE analysis
while still maintaining manageable complexity [1769]. They devised two approximation
methods that bound the bit-channel capacity from above and from below by applying the
so-called channel-upgrading and channel-degrading methods, respectively. A fidelity
parameter µ controls the tightness of these bounds. As µ increases the bounds are tighter
while complexity also increases. Generalizations of this approach for multiple access channels
(MACs) and non-binary input alphabets were studied by Pereg and Tal [1503].
So far we have seen that polar codes achieve the symmetric capacity of channels. Because
of their kernel based recursive structure, they also have efficient implementations of their
encoding and decoding algorithms. This is explored in the next section.
//STEP 2·i:
▷ Using the previous steps, prepare the inputs to the decoder of outer-code $\mathcal{C}_i$.
//STEP 2·i+1:
▷ Run the decoder of code $\mathcal{C}_i$ on the inputs you prepared.
▷ Process the output of this decoder, together with the outputs of the previous steps.
The structure of Algorithm 32.7.1 is typical for decoding algorithms of GCCs. For ex-
ample, see the decoding algorithms in Dumer’s survey on GCCs [648] and his RM recursive
decoding algorithms [649, 651]. In fact, Dumer’s simplified decoding algorithm for RM codes
[649, Section IV] is the SC decoding for Arıkan’s structure we describe in Subsection 32.7.1.
32.7.1.1 SC for (u + v, v)
The inputs of the SC algorithm for Arıkan’s Construction are listed below.
• A vector of $N$ input LLRs, $\lambda$, such that $\lambda_j = \ln \frac{\mathrm{Prob}(Y_j = y_j \mid X_j = 0)}{\mathrm{Prob}(Y_j = y_j \mid X_j = 1)}$ for $j \in [N]^-$, where $Y_j$ is the observation of the $j$th realization of the channel $W$, $X_j \to Y_j$.
• Frozen indicator vector z ∈ {0, 1}N , defined in Section 32.6.
The algorithm outputs two binary vectors of length $N$ bits, $\widehat{u}$ and $\widehat{x}$, containing the estimated information word (including frozen bits placed at appropriate locations) and its corresponding codeword, respectively. The SC function signature is defined as $[\widehat{u}, \widehat{x}] = \mathrm{SCDecoder}(\lambda, z)$.
First, let us describe the decoding algorithm for a code of length N = 2 bits, i.e., for
the basic kernel $g^{(1)}(u, v) = [u + v, v]$. Note that this is the base case of the recursion. We
get as input λ = [λ0 , λ1 ] which are the LLRs of the output of the channel (λ0 corresponds
to the first output of the channel and λ1 corresponds to the second output). The procedure
has four steps as described in Algorithm 32.7.2.
Note that STEPs 1 and 3 may be done based on the LLRs computed in STEPs 0 and 2,
respectively (i.e., by their sign), or using additional side information (for example, if u is
frozen, then the decision is based on its known value). A decoder of a polar code of length
N bits is described in Algorithm 32.7.3.
The reader may observe the similarities of the computations in STEPs 0 and 2 in Al-
gorithms 32.7.2 and 32.7.3. Those steps correspond to the LLR calculation of u and v
respectively for the base case or the LLR preparations for outer-code #0 and #1 decoders
in the longer code. They both employ the constraint manifested by $g(\cdot)$. Specifically, for Algorithm 32.7.3 those computations are separately employed on columns of Figure 32.2 or the inner layer of Figure 32.3. The inputs to the computations are the pairs of LLRs corresponding to the outputs of $g(\cdot)$ (i.e., $\lambda_{2j}$ and $\lambda_{2j+1}$ corresponding to $x_{2j}$ and $x_{2j+1}$, respectively), and the results are LLRs corresponding to the inputs of $g(\cdot)$ (i.e., $\widehat{\lambda}_j$ corresponding to $\gamma_{0,j}$ and $\gamma_{1,j}$ in STEPs 0 and 2, respectively).
Algorithm 32.7.3 (SC Recursive Description for a (u+v, v) Polar Code of Length $N = 2^n$ bits)

Input: $\lambda$; $z$.

//STEP 0:
▷ Compute the LLR input vector, $\widehat{\lambda}_0^{N/2-1}$, for the first outer-code such that
$$\widehat{\lambda}_i = 2 \tanh^{-1}\big(\tanh(\lambda_{2i}/2) \cdot \tanh(\lambda_{2i+1}/2)\big) \text{ for all } i \in [N/2]^-.$$
//STEP 1:
▷ Give the vector $\widehat{\lambda}$ as an input to the polar code decoder of length $N/2$. Also provide to the decoding algorithm the indices of the frozen bits from the first half of the codeword (corresponding to the first outer-code); i.e., run
$$\big[\widehat{u}^{(0)}, \widehat{x}^{(0)}\big] = \mathrm{SCDecoder}\big(\widehat{\lambda}, z_0^{N/2-1}\big),$$
where $\widehat{u}^{(0)}$ is the information word estimation for the first outer-code, and $\widehat{x}^{(0)}$ is its corresponding codeword.
//STEP 2:
▷ Using $\lambda$ and $\widehat{x}^{(0)}$, prepare the LLR input vector, $\widehat{\lambda}_0^{N/2-1}$, for the second outer-code, such that
$$\widehat{\lambda}_i = (-1)^{\widehat{x}_i^{(0)}} \cdot \lambda_{2i} + \lambda_{2i+1} \text{ for all } i \in [N/2]^-.$$
//STEP 3:
▷ Give the vector $\widehat{\lambda}$ as an input to the polar code decoder of length $N/2$. In addition, provide the indices of the frozen bits from the second half of the codeword (corresponding to the second outer-code); i.e., run
$$\big[\widehat{u}^{(1)}, \widehat{x}^{(1)}\big] = \mathrm{SCDecoder}\big(\widehat{\lambda}, z_{N/2}^{N-1}\big),$$
where $\widehat{u}^{(1)}$ and $\widehat{x}^{(1)}$ are the estimations of the information word and its corresponding codeword of the second outer-code.

Output:
• $\widehat{u} = \big[\widehat{u}^{(0)}, \widehat{u}^{(1)}\big]$;
• $\widehat{x} = \big[\widehat{x}_i^{(0)} + \widehat{x}_i^{(1)},\ \widehat{x}_i^{(1)}\big]_{i=0}^{N/2-1}$.
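The recursion can be sketched in runnable form as follows (a minimal Python sketch of ours: STEP 0 uses the standard exact tanh-based combining rule, and the matching recursive encoder is included so the round trip can be checked; all names are our own):

```python
# Sketch (ours) of SC decoding for a (u+v, v) polar code, paired with the
# matching recursive encoder. frozen[i] = 1 means u_i is frozen to 0.
import math

def polar_encode(u):
    """(u+v, v) transform with the interleaved pairing used by the decoder."""
    N = len(u)
    if N == 1:
        return u[:]
    half = N // 2
    x0, x1 = polar_encode(u[:half]), polar_encode(u[half:])
    x = []
    for j in range(half):
        x.extend([(x0[j] + x1[j]) % 2, x1[j]])
    return x

def sc_decode(llr, frozen):
    """Return (u_hat, x_hat) from channel LLRs and the frozen-bit indicator."""
    N = len(llr)
    if N == 1:
        u = 0 if (frozen[0] or llr[0] >= 0) else 1
        return [u], [u]
    half = N // 2
    # STEP 0: LLRs for the first outer-code (exact combination of LLR pairs)
    l0 = [2 * math.atanh(math.tanh(llr[2 * j] / 2) * math.tanh(llr[2 * j + 1] / 2))
          for j in range(half)]
    u0, x0 = sc_decode(l0, frozen[:half])                  # STEP 1
    # STEP 2: LLRs for the second outer-code, reusing the decided codeword x0
    l1 = [(-1) ** x0[j] * llr[2 * j] + llr[2 * j + 1] for j in range(half)]
    u1, x1 = sc_decode(l1, frozen[half:])                  # STEP 3
    x = []
    for j in range(half):
        x.extend([(x0[j] + x1[j]) % 2, x1[j]])             # output combination
    return u0 + u1, x

# Noiseless round trip: map bit b to LLR +4/-4 and decode.
u = [0, 0, 1, 0]                       # frozen pattern [1, 1, 0, 0]
x = polar_encode(u)
u_hat, x_hat = sc_decode([4.0 if b == 0 else -4.0 for b in x], [1, 1, 0, 0])
```

On noiseless LLRs the decoder recovers `u` and `x` exactly, which is a convenient correctness check for any SC implementation.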
In the GCC structure of this polar code there are $\ell$ outer-codes $\{\mathcal{C}_i\}_{i=0}^{\ell-1}$, each one of
length N/` symbols. We assume that each outer-code has a decoding algorithm associated
with it. This decoding algorithm is assumed to receive as input the channel observations
on the outer-code symbols (usually formatted as LLR vectors). If the outer-code is a polar
code, then this algorithm should also receive the indices of the frozen symbols of the outer-
code. We require that the algorithm outputs its estimation on the information vector and
its corresponding outer-code codeword.
Let us first consider the recursion base-case of a code of length $\ell$ generated by a single application of the kernel, i.e., $x = g(u)$. Assuming that we already decided on symbols $u_0^{i-1}$ (denote this decision by $\widehat{u}_0^{i-1}$), computing the LLRs $\big\{\widehat{\lambda}^{(t)}\big\}_{t \in \mathbb{F}_q}$ corresponding to the $i$th input of the transformation (i.e., $u_i$) is done according to the following rule
$$\widehat{\lambda}^{(t)} = \ln \frac{\sum_{u_{i+1}^{\ell-1} \in \mathbb{F}_q^{\ell-i-1}} R_g\big(\widehat{u}_0^{i-1}, 0, u_{i+1}^{\ell-1}\big)}{\sum_{u_{i+1}^{\ell-1} \in \mathbb{F}_q^{\ell-i-1}} R_g\big(\widehat{u}_0^{i-1}, t, u_{i+1}^{\ell-1}\big)}, \tag{32.7}$$
where $R_g\big(u_0^{\ell-1}\big) = \exp\big({-\sum_{r=0}^{\ell-1} \lambda_r^{(x_r)}}\big)$, such that $x = g(u)$.
Consequently, SC decoding for a polar code of length $\ell$ includes sequential calculations of the likelihood values $\big\{\widehat{\lambda}^{(t)}\big\}_{t \in \mathbb{F}_q \setminus \{0\}}$ corresponding to non-frozen $u_i$ according to (32.7), followed by a decision on $u_i$ (denoted by $\widehat{u}_i$) for $i \in [\ell]^-$. If $u_i$ is frozen, we set $\widehat{u}_i$ to be equal to its predetermined value. Finally, in (32.6) we output $\widehat{u} = [\widehat{u}_0\, \widehat{u}_1 \cdots \widehat{u}_{\ell-1}]$ and $\widehat{x} = g(\widehat{u})$.
For longer codes ($N > \ell$) we follow the recipe of Algorithm 32.7.1, using the LLR preparation rule of (32.7) applied on the columns of the matrix in Figure 32.1. Before applying step $2 \cdot i$ we already decoded outer-codes $\{\mathcal{C}_m\}_{m=0}^{i-1}$ and stored the outer-codes' information words and codewords in $\{\widehat{u}^{(m)}\}_{m=0}^{i-1}$ and $\{\widehat{x}^{(m)}\}_{m=0}^{i-1}$, respectively. We use these outcomes to generate an LLR input vector to the decoder of $\mathcal{C}_i$, denoted as $\Big[\big\{\widehat{\lambda}_j^{(t)}\big\}_{t \in \mathbb{F}_q \setminus \{0\}}\Big]_{j=0}^{N/\ell-1}$. In order to compute $\widehat{\lambda}_j^{(t)}$ we use (32.7) with input LLRs corresponding to the $j$th column of the GCC matrix and known values $\widehat{u}_0^{i-1} = \big[\widehat{x}_j^{(m)}\big]_{m=0}^{i-1}$. Then we call the SC decoder for outer-code $\mathcal{C}_i$, using the computed LLRs and $z_{i \cdot N/\ell}^{(i+1) \cdot N/\ell - 1}$ as inputs to (32.6), and receive $\widehat{u}^{(i)}$ and $\widehat{x}^{(i)}$ as outputs. Finally, we use the $\ell$ decided outer-code codewords $\{\widehat{x}^{(m)}\}_{m=0}^{\ell-1}$ to generate $\mathcal{C}$'s estimated codeword (employing the inner mapping) and the outer-code information words $\{\widehat{u}^{(m)}\}_{m=0}^{\ell-1}$ to generate the information word (by serially concatenating them).
32.7.1.3 SC Complexity
The SC decoding steps can be classified into three categories: (SC.a) LLR calculations;
(SC.b) decision making based on these LLRs; (SC.c) partial encoding of the decided
symbols using the kernel. The time and space complexities of (SC.a) category operations
dominate the time complexity of the entire SC algorithm. This is our justification for re-
garding the number of operations of (SC.a) as a good measure of the SC decoder time
complexity.
For a kernel of $\ell$ dimensions over the field $\mathbb{F}_q$, the straightforward calculation of the likelihoods performed on the $i$th decoding step (32.7) ($i \in [\ell]^-$) requires $O(\ell \cdot q^{\ell-i})$ operations. For linear kernels it is possible to apply trellis decoding based on the zero-coset's parity check matrix, which yields an upper bound of $O(\ell \cdot q^i)$ operations. Let $\tau$ and $s$ be the total time and space complexities for LLR computations of a kernel $g(\cdot)$ (note that $\tau = O(\ell \cdot q^{\ell})$ and $\tau = O(\ell \cdot q^{\ell/2})$ for general and linear kernels respectively, and $s = O(\ell \cdot q)$). Inspection
of the recursive SC decoding algorithm yields total time complexity of O(τ · N/` · log` (N ))
and total space complexity of O(N · q + s).
Although advantageous because of its low complexity, SC suffers from two inherent deficiencies that result in sub-optimal error-correction performance: (i) decisions on information symbols are final and cannot be changed after being made; (ii) when the SC algorithm decides on $u_i$, it assumes that all the assignments to $u_{i+1}^{N-1}$ are possible, although some of the components may be frozen. The SCL decoder is a generalization of SC that aims to mitigate these shortcomings by allowing multiple decision options to exist at any given point in time.
meet the CRC constraints, significantly improves the performance, making it comparable
to the state of the art coding systems [1770, Section V].
In order to attach a CRC of b symbols we have to unfreeze b information symbols in the
inner polar code such that the total information rate of the concatenated system remains
the same. This technique is characterized by the following tradeoff. Increasing the CRC size
b causes the performance of the inner-code SCL to deteriorate. On the other hand, it may
improve the performance of the concatenated system as a larger proportion of the polar
code codewords is eliminated. Therefore, the CRC should be carefully selected based on the
SNR and the list size. As a general rule of thumb, as the list size increases, larger CRC
should be considered.
Example 32.7.4 Figure 32.6 provides FER curves of several polar codes of length N =
2048 bits and rate R = 0.5 simulated on a bi-AWGN channel. Two kernels are used here:
Arıkan’s (u + v, v) and the RS(4) kernel. The codes were designed using GA simulation at
Eb /N0 = 2 dB. We considered SC and SCL decoders with list size (L) in {2, 4, 8, 16, 32}.
For SCL with L = 16, 32 we also examined CRC of 16 bits concatenated to the information
part. The reader may notice the following phenomena: (i) SCL outperforms SC. Moreover,
as the list size grows the decoding performance improves; however it eventually saturates,
in the limit of ML decoding. (ii) Attaching CRC significantly lowers the FER. (iii) RS(4)
based polar codes outperform (u + v, v) polar codes due to their better partial distances
distribution.
FIGURE 32.6: FER curves of polar codes of length N = 2048 bits and rate R = 0.5
simulated on the bi-AWGN channel
Remark 32.7.5 Additional polar code decoding algorithms are the iterative belief-
propagation (BP) [62, 63, 1148, 1543] and the sequential SC Stack (SCS) [1442, 1443]. Many
useful modifications to SCL were proposed over the years; see, e.g., [234, 917, 1235, 1613].
For more detailed discussion on SC and SCL complexity the reader may refer to [1543,
Section 5.5].
Cunsheng Ding
The Hong Kong University of Science and Technology
where 1 ≤ t ≤ `, then the subset {Pi1 , Pi2 , . . . , Pit } is called an access set. The set Γ of
all access sets is called the access structure, which is an element of the power set 2P of
P. An access set is minimal if any proper subset of it is not an access set. The set of all
minimal access sets is called the minimal access structure.
A secret sharing scheme is said to be t-democratic if every group of t participants is
in the same number of minimal access sets, where t ≥ 1. A participant is called a dictator
if she/he is a member of every minimal access set.
A secret sharing scheme is said to be perfect if any subset of participants and their
shares either gives all information about the secret s or does not contain any information
about s. A secret sharing scheme is said to be ideal if each share has the same size as the
corresponding secret.
The following example illustrates the building blocks of secret sharing schemes and
related concepts.
Definition 33.1.3 A secret sharing scheme is called a (t, `)-threshold scheme, where
1 ≤ t ≤ `, if the following conditions are satisfied:
1. the number of participants is `;
2. any subset of t or more participants can obtain all information about the secret with
their shares; and
3. any subset of t − 1 or fewer participants cannot get any information about the secret
with their shares.
By definition, any (t, `)-threshold scheme is perfect and t-democratic. The scheme de-
scribed in Example 33.1.2 is an (`, `)-threshold scheme.
The first secret sharing schemes appeared in the public literature in 1979, and were
developed by Shamir [1659] and Blakley [214] independently. The Blakley scheme has a ge-
ometric description, while the Shamir scheme has an algebraic expression. The two schemes
are in fact equivalent. The following example describes the Shamir secret sharing scheme.2
2 The Shamir scheme is also presented in the context of algebraic geometry codes in Section 15.8.4.
Secret Sharing with Linear Codes 787
Example 33.1.4 Let q be a prime power. Let ` and t be positive integers with 2 ≤ t ≤
` < q. In the Shamir scheme, the secret space is S := Fq with the uniform probability
distribution, the set of participants is P := {P1 , P2 , . . . , P` }. The share spaces are the same
as the secret space. The sharing computing procedure is the following:
• The dealer randomly chooses an element s ∈ S as the secret to be shared among the
` participants.
• The dealer then randomly and independently chooses $t-1$ elements $a_1, \ldots, a_{t-1}$ from $\mathbb{F}_q$, and constructs the polynomial
$$a(x) = s + \sum_{i=1}^{t-1} a_i x^i \in \mathbb{F}_q[x].$$
• The dealer finally randomly and independently chooses ` pairwise distinct elements
u1 , . . . , u` from F∗q := Fq \ {0}, computes si = a(ui ) and distributes (ui , si ) to Pi as
the share for all i.
When $t$ participants $P_{i_1}, \ldots, P_{i_t}$ meet together with their shares, they can recover the secret $s$ by solving the system of $t$ linear equations $a(u_{i_j}) = s_{i_j}$, $1 \le j \le t$, in the unknown coefficients of $a(x)$ to obtain $a(x)$ and hence $s$. It can be easily proved that any set of $t-1$ or fewer participants gets no information about $s$ with their shares. Hence, the Shamir scheme is a $(t, \ell)$-threshold scheme.
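A runnable sketch of the scheme over a prime field (the prime $q = 257$ and all names are our toy choices; reconstruction evaluates the Lagrange interpolant at $x = 0$, which is equivalent to solving the linear system above):

```python
# Sketch (ours) of the Shamir (t, l)-threshold scheme over F_q with q = 257 prime.
import random

Q = 257  # a prime, so the integers mod Q form the field F_q

def share(secret, t, ell, rng=random.Random(1)):
    coeffs = [secret] + [rng.randrange(Q) for _ in range(t - 1)]   # a(x), deg <= t-1
    points = rng.sample(range(1, Q), ell)                          # distinct u_i in F_q*
    return [(u, sum(c * pow(u, i, Q) for i, c in enumerate(coeffs)) % Q)
            for u in points]

def reconstruct(shares):
    """Lagrange interpolation of a(x) at x = 0 from t (or more) shares."""
    secret = 0
    for j, (uj, sj) in enumerate(shares):
        num = den = 1
        for m, (um, _) in enumerate(shares):
            if m != j:
                num = num * -um % Q
                den = den * (uj - um) % Q
        secret = (secret + sj * num * pow(den, Q - 2, Q)) % Q      # Fermat inverse
    return secret

shares = share(123, t=3, ell=5)
```

Any 3 of the 5 shares recover the secret 123; the modular inverse uses Fermat's little theorem, which is why a prime modulus is assumed in this sketch.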
Construction 33.2.1 Let C be an [n, k, d]q code with generator matrix G, where k ≥ 2. In
the secret sharing scheme based on G with respect to the first construction, the
secret space and share spaces are Fq , and the set of participants is P = {P1 , P2 , . . . , Pn }.
The share computing procedure is the following:
• The dealer chooses randomly an element s1 from Fq as the secret to be shared among
the n participants.
• The dealer then randomly and independently chooses s2 , . . . , sk from Fq , and computes
t = (t1 , t2 , . . . , tn ) := sG,
where s = (s1 , s2 , . . . , sk ).
• The dealer gives ti to Participant Pi as the share.
It is easily seen that a set of shares {ti1 , ti2 , . . . , tim } determines the secret s1 if and only if
the vector e1 = (1, 0, . . . , 0)T is a linear combination of gi1 , gi2 , . . . , gim , where gi is the ith
column of G. Furthermore, the secret sharing scheme is perfect and ideal.
The secret recovering procedure goes as follows. Given a set of shares $\{t_{i_1}, \ldots, t_{i_m}\}$ which can determine the secret, the subgroup of $m$ participants solves the equation
$$e_1 = \sum_{j=1}^{m} x_j g_{i_j}$$
for $(x_1, \ldots, x_m)$ and then recovers the secret as $s_1 = \sum_{j=1}^{m} x_j t_{i_j}$.
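The construction and this recovery rule can be illustrated over $\mathbb{F}_2$ with a toy generator matrix of our choosing (the brute-force solver below stands in for Gaussian elimination, which is what one would use at realistic sizes):

```python
# Toy F_2 sketch (ours) of Construction 33.2.1: shares are t = sG, and a share
# set determines s_1 iff e_1 is a linear combination of its columns of G.
import itertools

G = [[1, 0, 0, 1, 0],   # k = 3, n = 5; column j is g_{j+1}
     [1, 1, 0, 1, 1],
     [0, 0, 1, 1, 1]]
k, n = 3, 5

def column(j):
    return tuple(G[r][j] for r in range(k))

def solve_e1(cols):
    """Brute-force x with e_1 = sum_j x_j * cols[j] over F_2 (fine at toy sizes)."""
    for x in itertools.product([0, 1], repeat=len(cols)):
        combo = tuple(sum(xj * c[r] for xj, c in zip(x, cols)) % 2 for r in range(k))
        if combo == (1, 0, 0):
            return x
    return None

s = [1, 0, 1]                                               # s[0] is the secret
t = [sum(s[r] * G[r][j] for r in range(k)) % 2 for j in range(n)]

group = [0, 1]                                              # holders of t_1 and t_2
x = solve_e1([column(j) for j in group])                    # here x == (1, 1)
recovered = sum(xj * t[j] for xj, j in zip(x, group)) % 2   # equals s[0]
```

Here $g_1 + g_2 = e_1$, so the pair $\{P_1, P_2\}$ is an access set, while $\{P_2, P_3\}$ is not.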
For a codeword $c = (c_0, c_1, \ldots, c_{n-1})$ of $\mathcal{C}$, the support of $c$ is $\mathrm{supp}(c) = \{i \mid 0 \le i \le n-1,\ c_i \neq 0\}$.
Theorem 33.2.3 Let $\mathcal{C}$ be an $[n, k, d]_q$ code with generator matrix $G = [g_1 \cdots g_n]$. The access structure of the secret sharing scheme based on $G$ with respect to the first construction is given by
$$\Gamma = \left\{ \{P_i \mid i \in \mathrm{supp}((x_1, x_2, \ldots, x_n))\} \;\middle|\; e_1 = \sum_{j=1}^{n} x_j g_j \right\}.$$
Remark 33.2.4 Theorem 33.2.3 indicates that the access structure of the secret sharing
scheme based on a linear code with respect to the first construction depends on the choice
of the underlying generator matrix G.
McEliece and Sarwate were the first to observe a connection between secret sharing
and error-correcting codes [1367]. The secret sharing schemes based on linear codes with
respect to the first construction were considered by McEliece and Sarwate [1367] where the
Shamir scheme formulated in terms of polynomial interpolation was generalized in terms
of Reed–Solomon codes. In terms of vector spaces, Brickell also studied this kind of secret
sharing schemes [302]. A generalization of this approach was given by Bertilsson according
to [1830] and also by van Dijk [1830].
Secret sharing schemes based on MDS codes over Fq with respect to the first construction
were investigated by Renvall and Ding in [1587]. Their access structures are known and
documented in the following theorems.
Theorem 33.2.6 Let G be a generator matrix of an [n, k, n − k + 1]q MDS code C such that
its ith column is a scalar multiple of e1 . In the secret sharing scheme based on G with respect
to the first construction, a set of shares determines the secret if and only if it contains the
ith share or its cardinality is no less than k.
The access structure of the secret sharing schemes described by Theorem 33.2.6 could
be interesting in applications where the president of some organization alone should know
the secret, and a set of other participants can determine the secret if and only if the number
of participants is no less than k.
A practical question concerning such a secret sharing scheme is whether each $[n, k, n-k+1]$ MDS code has such a generator matrix and how to find one if it exists. The answer is "yes", and such a generator matrix can be found as follows. First, take any generator matrix $G_0$. Then find a coordinate at which the first row of $G_0$ is nonzero, and add suitable multiples of this row to the other rows to make that coordinate of the other rows equal zero. The resulting matrix is the desired one.
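The procedure can be sketched over a toy prime field (the matrix, the field $\mathbb{Z}_7$, and all names are our choices; MDS-ness plays no role in the mechanics):

```python
# Sketch (ours, over Z_7): clear one coordinate from all rows below the first,
# so that column of the generator matrix becomes a scalar multiple of e_1.
P = 7

def make_e1_column(G0, j):
    inv = pow(G0[0][j], P - 2, P)          # requires G0[0][j] != 0 mod P
    G = [row[:] for row in G0]
    for r in range(1, len(G)):
        f = G[r][j] * inv % P
        G[r] = [(G[r][c] - f * G[0][c]) % P for c in range(len(G[r]))]
    return G

G0 = [[1, 2, 3, 4],
      [0, 1, 2, 6],
      [1, 0, 5, 2]]
G = make_e1_column(G0, 1)   # column 1 of G is now (2, 0, 0): a multiple of e_1
```

Since only row operations are used, `G` generates the same code as `G0`.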
Theorem 33.2.7 Let G be a generator matrix of an [n, k, n − k + 1]q MDS code C such
that e1 is a linear combination of g1 and g2 , but not a multiple of either of them, where gi
denotes the ith column vector of G. The access structure of the secret sharing scheme based
on G with respect to the first construction is then described as follows:
(a) any k shares determine the secret;
(b) a set of shares with cardinality k − 2 or less determines the secret if and only if the
set contains both t1 and t2 ;
(c) a set of k − 1 shares {ti1 , ti2 , . . . , tik−1 }
(i) determines the secret when it contains both t1 and t2 ;
(ii) cannot determine the secret when it contains one and only one of t1 and t2 ;
(iii) determines the secret if and only if $\mathrm{rank}(e_1, g_{i_1}, \ldots, g_{i_{k-1}}) = k - 1$ when it contains neither $t_1$ nor $t_2$.
Theorem 33.2.8 Let G be a generator matrix of an [n, k, n − k + 1]q MDS code C. The
secret sharing scheme based on G with respect to the first construction is a (k, n)-threshold
scheme if and only if e1 is not a linear combination of any k − 1 columns of G.
$$G = \left[\begin{array}{cccc}
1 & 1 & \cdots & 1 \\
u_1 & u_2 & \cdots & u_n \\
\vdots & \vdots & \ddots & \vdots \\
u_1^{t-1} & u_2^{t-1} & \cdots & u_n^{t-1}
\end{array}\right].$$
The minimum distance of a linear code is one of the important parameters of the code
since it determines the error-correcting capacity of the code. The minimum distance of the
dual code has an implication on the access structure of the secret sharing scheme based on
the code. The following is such a result developed in [1587].
Theorem 33.2.10 Let $G$ be a generator matrix of an $[n, k, d]_q$ code $\mathcal{C}$. In the secret sharing scheme based on $G$ with respect to the first construction, if each of two sets of shares $\{t_{i_1}, \ldots, t_{i_l}\}$ and $\{t_{j_1}, \ldots, t_{j_m}\}$ determines the secret, but no proper subset of either of them can do so, then
$$|\{i_1, \ldots, i_l\} \cup \{j_1, \ldots, j_m\}| - |\{i_1, \ldots, i_l\} \cap \{j_1, \ldots, j_m\}| \ge d^{\perp},$$
where $d^{\perp}$ is the minimum distance of the dual code $\mathcal{C}^{\perp}$.
The access structures of the secret sharing scheme based on linear codes with respect
to the first construction are very complex in general. Not much work in this direction has
been done.
Remark 33.2.11 Any secret s can be encoded into a binary string s1 s2 · · · su with an
encoding scheme. Then s can be shared bit by bit by a group of participants. Hence, binary
linear codes or linear codes over a small field Fq can be employed for sharing a secret with
large size. This remark applies also to the second construction to be treated later.
Definition 33.3.5 The covering problem of a linear code is to determine the set of all
minimal codewords of the code.
The covering problem of linear codes is very hard in general. Minimal linear codes are
special in the sense that all their codewords are minimal. It has been a hot topic in the
past decades to search for minimal linear codes. The following is a useful tool developed by
Ashikhmin and Barg [67].
Lemma 33.3.6 A linear code C over Fq is minimal if wmin /wmax > (q − 1)/q. Herein and
hereafter, wmin and wmax denote the minimum and maximum nonzero Hamming weights in
C, respectively.
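Lemma 33.3.6 can be sanity-checked on the binary $[7, 3, 4]$ simplex code, a one-weight code for which $w_{\min}/w_{\max} = 1 > (q-1)/q = 1/2$; the sketch below (ours) also verifies minimality directly by comparing supports:

```python
# Sanity check (ours) of Lemma 33.3.6 on the binary [7, 3, 4] simplex code.
import itertools

G = [[1, 0, 0, 1, 0, 1, 1],   # columns run over all nonzero vectors of F_2^3
     [0, 1, 0, 1, 1, 0, 1],
     [0, 0, 1, 0, 1, 1, 1]]

codewords = [tuple(sum(m[r] * G[r][j] for r in range(3)) % 2 for j in range(7))
             for m in itertools.product([0, 1], repeat=3)]
weights = sorted({sum(c) for c in codewords if any(c)})   # [4]: one-weight code

# Direct minimality check: no nonzero codeword's support strictly contains another's.
supports = [frozenset(j for j, b in enumerate(c) if b) for c in codewords if any(c)]
strictly_contained = any(a < b for a in supports for b in supports)
```

Every nonzero codeword has weight $2^{k-1} = 4$, so no support can strictly contain another, confirming that the code is minimal.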
Many families of minimal linear codes have been identified with Lemma 33.3.6. Most
of these codes are one-weight codes, two-weight codes [330], and three-weight codes [557].
These codes usually have few weights and have relatively small dimensions. Minimal codes
were also investigated in [28] and [69].
Lemma 33.3.6 presents a sufficient condition, which is not necessary. This was demon-
strated with two examples by Cohen, Mesnager, and Patey [420]. Recently, the first infinite
family of minimal binary linear codes with wmin /wmax ≤ 1/2 was discovered in [380].
The reader is informed that necessary and sufficient conditions for a linear code to be minimal have just been derived, and more infinite families of minimal linear codes with $w_{\min}/w_{\max} \le (q-1)/q$ have been constructed [556, 943].
are also presented in Chapter 18.
Construction 33.3.7 Let $\mathcal{C}$ be an $[n, k, d]_q$ code with generator matrix $G = [g_0\ g_1\ \cdots\ g_{n-1}]$, where $g_i \in \mathbb{F}_q^k$ denotes the $i$th column of $G$. The secret space $\mathcal{S}$ of the secret sharing scheme based on $\mathcal{C}$ with respect to the second construction is $\mathbb{F}_q$ with the uniform probability distribution. The participant set is $\mathcal{P} = \{P_1, P_2, \ldots, P_{n-1}\}$. The secret sharing procedure goes
as follows:
• The dealer chooses randomly u = (u0 , . . . , uk−1 ) such that s = u · g0 , where u · g0 is
the standard inner product of the vectors u and g0 .
• The dealer then treats u as an information vector and computes the corresponding
codeword
t = (t0 , t1 , . . . , tn−1 ) := uG.
• The dealer finally gives $t_i$ to participant $P_i$ as the share, for each $i$ with $1 \le i \le n-1$.
It is easily seen that a set of shares {ti1 , ti2 , . . . , tim } gives no information about s if g0 is
not a linear combination of gi1 , . . . , gim . Hence, the secret sharing scheme is perfect.
Suppose that
$$g_0 = \sum_{j=1}^{m} x_j g_{i_j}.$$
Then the secret $s$ is recovered by the group $\{P_{i_1}, \ldots, P_{i_m}\}$ of participants after computing
$$s = \sum_{j=1}^{m} x_j t_{i_j}.$$
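A toy $\mathbb{F}_2$ sketch (ours) of this second construction: participant $P_j$ holds $t_j$ for $1 \le j \le n-1$, the secret is $s = u \cdot g_0 = t_0$, and a group recovers $s$ exactly when $g_0$ lies in the span of its columns:

```python
# Toy F_2 sketch (ours) of Construction 33.3.7 and its recovery rule.
import itertools

G = [[1, 1, 0, 0, 1],   # k = 2, n = 5; column j is g_j (g_0 is the first)
     [1, 0, 1, 1, 1]]
k, n = 2, 5

def col(j):
    return tuple(G[r][j] for r in range(k))

def express_g0(cols):
    """Brute-force x with g_0 = sum_j x_j * cols[j] over F_2."""
    for x in itertools.product([0, 1], repeat=len(cols)):
        if tuple(sum(xj * c[r] for xj, c in zip(x, cols)) % 2
                 for r in range(k)) == col(0):
            return x
    return None

secret = 1
u = next(v for v in itertools.product([0, 1], repeat=k)
         if sum(vr * g for vr, g in zip(v, col(0))) % 2 == secret)  # s = u . g_0
t = [sum(u[r] * G[r][j] for r in range(k)) % 2 for j in range(n)]   # t = uG

group = [1, 2]                                      # P_1 and P_2 pool t_1, t_2
x = express_g0([col(j) for j in group])             # here (1, 1): g_1 + g_2 = g_0
recovered = sum(xj * t[j] for xj, j in zip(x, group)) % 2
```

Note that $t_0 = u \cdot g_0 = s$ is kept by the dealer and never distributed, which is what makes the recovery identity $s = \sum_j x_j t_{i_j}$ work.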
The second construction above was considered by Massey [1348, 1349]. Van Dijk pointed
out in [1830] that Massey’s approach is a special case of the construction introduced by
Bertilsson and Ingemarson in [184]. The following theorem was observed by Massey. It
documents the access structure of the secret sharing scheme based on C; see [1348, 1349].
Theorem 33.3.8 Let $\mathcal{C}$ be an $[n, k, d]_q$ code. Then the access structure of the secret sharing scheme based on $\mathcal{C}$ with respect to the second construction is given by
$$\Gamma = \left\{ \{P_{i_1}, \ldots, P_{i_m}\} \;\middle|\; \begin{array}{l} 1 \le i_1 < \cdots < i_m \le n-1 \text{ and} \\ g_0 \text{ is a linear combination of } g_{i_1}, \ldots, g_{i_m} \end{array} \right\}.$$
Remark 33.3.9 Theorem 33.3.8 shows that the access structure of the secret sharing
scheme based on a linear code with respect to the second construction is independent of the
choice of the underlying generator matrix G. This is a major difference between the first
and second constructions.
The minimum distance d⊥ of the dual code C ⊥ gives a lower bound on the size of any
minimal access set in the secret sharing scheme based on C with respect to the second
construction. Specifically, we have the following which was proved by Renvall and Ding in
[1587].
Theorem 33.3.10 Let C be an [n, k, d]q code and let d⊥ denote the minimum distance of the
dual code. In the secret sharing scheme based on C with respect to the second construction,
any set of d⊥ − 2 or fewer shares does not give any information on the secret, and there is
at least one set of d⊥ shares that determines the secret.
Example 33.3.12 Let C be the linear code with parameters [13, 10, 2]3 and generator matrix

    1 0 0 0 0 0 0 0 0 1 0 2 1
    0 1 0 0 0 0 0 0 0 0 0 0 1
    0 0 1 0 0 0 0 0 0 0 0 1 1
    0 0 0 1 0 0 0 0 0 2 0 1 0
    0 0 0 0 1 0 0 0 0 2 0 1 1
    0 0 0 0 0 1 0 0 0 0 0 2 1
    0 0 0 0 0 0 1 0 0 1 0 1 2
    0 0 0 0 0 0 0 1 0 2 0 1 2
    0 0 0 0 0 0 0 0 1 1 0 0 1
    0 0 0 0 0 0 0 0 0 0 1 1 2 .
Then C ⊥ is a [13, 3, 7]3 code. In the secret sharing scheme based on C with respect to the
second construction, the secret space is S = F3 , the participant set is P = {P1 , . . . , P12 },
and there are q n−k−1 = 32 = 9 minimal access sets which are the following:
Example 33.3.13 Let C ⊥ be the [12, 4, 6]2 code with generator matrix

    H = 0 0 0 0 1 1 1 1 1 1 1 1
        1 1 1 1 0 0 0 0 1 1 1 1
        0 0 1 1 0 0 1 1 0 0 1 1
        0 1 0 1 0 1 0 1 0 1 0 1 .
The secret sharing scheme based on C with respect to the second construction has the secret
space F2 , the participant set P = {P1 , . . . , P11 }, and the following minimal access sets:
Every participant is in 4 minimal access sets. Therefore, the secret sharing scheme is 1-
democratic.
Planar functions were employed to construct linear codes over Fq with three and five
weights by Carlet, Ding, and Yuan in [352], where the duals of these codes were used to
construct secret sharing schemes with special access structures.
The duals of some irreducible codes and a class of linear codes from quadratic forms
were used to construct secret sharing schemes by Yuan and Ding in [352]. Several families
of trace codes were utilized for secret sharing by Yuan and Ding in [1935]. A family of group
character codes was employed for secret sharing by Ding, Kohel, and Ling in [557]. Some
linear codes were also considered for secret sharing in [564, 566].
Secret sharing schemes based on additive codes over F4 and their connections with
t-designs were considered by Kim and Lee in [1120]. Self-dual codes were employed to
construct secret sharing schemes by Dougherty, Mesnager, and Solé in [633].
Entropy is a measure of the average uncertainty in a single random variable. One can also define the conditional entropy H(X | Y ), which is the entropy of a random variable X conditional on the knowledge of another random variable Y . The reduction in the uncertainty of X due to the knowledge of Y is called the mutual information. For two random variables X and Y this reduction is the mutual information
    I(X; Y ) = H(X) − H(X | Y ) = Σ_{x,y} p(x, y) log [ p(x, y) / (p(x)p(y)) ],
where p(x, y) is the joint probability distribution of X and Y . The mutual information I(X; Y ) is a measure of the dependence between the two random variables. It is symmetric in X and Y , always nonnegative, and equal to zero if and only if X and Y are independent.
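These identities are easy to check numerically; the joint distribution below is a made-up example, not from the text.

```python
from math import log2

# A made-up joint distribution p(x, y) of two dependent binary variables.
p = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

px = {x: sum(v for (a, _), v in p.items() if a == x) for x in (0, 1)}
py = {y: sum(v for (_, b), v in p.items() if b == y) for y in (0, 1)}

H_X = -sum(v * log2(v) for v in px.values())

# I(X;Y) = sum_{x,y} p(x,y) log2( p(x,y) / (p(x) p(y)) )
I = sum(v * log2(v / (px[x] * py[y])) for (x, y), v in p.items())

H_X_given_Y = H_X - I          # since I(X;Y) = H(X) - H(X|Y)
print(I > 0)                   # True: X and Y are dependent

# For an independent pair with the same marginals, I drops to zero.
p_ind = {(x, y): px[x] * py[y] for x in (0, 1) for y in (0, 1)}
I_indep = sum(v * log2(v / (px[x] * py[y])) for (x, y), v in p_ind.items())
print(abs(I_indep) < 1e-12)    # True
```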
In [1083] Karnin, Greene, and Hellman considered the situation where there are k secrets
s1 , s2 , . . . , sk from secret spaces Si with the uniform probability distribution to be shared.
The following are required for any 1 ≤ j ≤ k:
• Any set of m shares determines the secret sj ; i.e., for any set of m indices 1 ≤ i1 <
· · · < im ≤ n, H(sj | (ti1 , . . . , tim )) = 0. Here and hereafter ti ∈ Ti denotes the
share of Participant Pi and H(sj | (ti1 , . . . , tim )) denotes the uncertainty of sj given
(ti1 , . . . , tim ).
• Any set of m − 1 shares gives no information about the secret sj ; i.e., for any set of
m − 1 indices 1 ≤ i1 < · · · < im−1 ≤ n, H(sj | (ti1 , . . . , tim−1 )) = H(sj ), or in terms of
mutual information I(sj ; (ti1 , . . . , tim−1 )) = 0, where H(sj ) denotes the uncertainty
of sj , H(a | b) the uncertainty of a when event b happened, and I(a; b) the amount of
mutual information between a and b.
Such schemes are necessary in applications where a number of secrets should be shared
at the same time by a number of participants. We will refer to such systems as [k, m, n]
(multisecret sharing) threshold schemes. Another important fact is that each [k, m, n]
threshold scheme for multisecret sharing gives naturally k (m, n)-threshold schemes for
single secret sharing. Thus, the importance of multisecret sharing follows also from that of
single secret sharing.
Multisecret sharing schemes were also studied by Jackson, Martin, and O’Keefe [1027], who considered the case in which each subset of k participants is associated with a secret protected by a (t, k)-threshold access structure, and who derived lower bounds on the size of a participant’s share. Some information aspects of multisecret sharing schemes were also
studied by Blundo, De Santis, Di Crescenzo Gaggia, and Vaccaro [226], where they tried to
work out a general theory of multisecret sharing schemes and to establish some lower bounds
on the size of information held by each participant for various access structures. Ding,
Laihonen, and Renvall [558] have established some connections between linear multisecret
sharing schemes and linear codes, which will be introduced in the next subsections.
Theorem 33.4.1 A multisecret sharing scheme defined over the above secret and share spaces is linear if and only if its share function is of the form

    f (s) = sG,   s = (s1 , . . . , sk ) ∈ Fkq ,    (33.1)

where G is a k × n matrix over Fq .
Theorem 33.4.1 clearly shows that a linear multisecret sharing scheme gives an [n, k, d]q
code C with generator matrix G, and each generator matrix G of an [n, k, d]q linear code C
gives a linear multisecret sharing scheme. The share function is an encoding mapping of a
linear code.
For a linear multisecret sharing scheme with the share function f of (33.1), recovering
the original multisecret s is carried out as follows. Let G(i1 , . . . , iu ) denote the submatrix
consisting of columns i1 , i2 , . . . , iu of the matrix G, where 1 ≤ u ≤ n, and 1 ≤ i1 < · · · <
iu ≤ n. Suppose that the shares ti1 , ti2 , . . . , tiu are known; then recovering the multisecret becomes solving the linear equation

    sG(i1 , . . . , iu ) = (ti1 , . . . , tiu ).
The complexity of solving such a linear equation is O(u3 ) with a method like the Gaussian
elimination method. Thus, recovering the multisecret is much simpler than decoding linear
codes.
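Recovering the multisecret from k known shares can be sketched as follows; the field F5, the matrix G, and the helper name `solve` are illustrative choices of ours (and `pow(x, -1, p)` for modular inversion needs Python 3.8+).

```python
p = 5                                    # illustrative prime field F_5
G = [[1, 0, 1, 2],                       # a 2 x 4 generator matrix (made up)
     [0, 1, 3, 1]]
s = [3, 4]                               # the multisecret
t = [sum(s[i] * G[i][j] for i in range(2)) % p for j in range(4)]  # shares t = sG

def solve(idx, shares):
    """Solve s G(i1,...,ik) = (t_i1,...,t_ik) by Gaussian elimination mod p."""
    k = len(idx)
    # Augmented matrix of the transposed system: one equation per known share.
    M = [[G[i][idx[r]] for i in range(k)] + [shares[r]] for r in range(k)]
    for c in range(k):
        piv = next(r for r in range(c, k) if M[r][c] % p)
        M[c], M[piv] = M[piv], M[c]
        inv = pow(M[c][c], -1, p)        # modular inverse (Python 3.8+)
        M[c] = [v * inv % p for v in M[c]]
        for r in range(k):
            if r != c and M[r][c] % p:
                f = M[r][c]
                M[r] = [(v - f * w) % p for v, w in zip(M[r], M[c])]
    return [M[i][k] for i in range(k)]

print(solve([1, 3], [t[1], t[3]]))   # recovers [3, 4]
```

The cost is a single k × k elimination, O(k^3), as noted in the text.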
Theorem 33.4.2 Let a multisecret sharing scheme have the share function of (33.1), and let r = rank G(i1 , . . . , iu ). Then

    I(s; (ti1 , . . . , tiu )) = r log2 q   { < I(s) if r < k,  = I(s) if r = k },
    H(s | (ti1 , . . . , tiu )) = (k − r) log2 q   { > 0 if r < k,  = 0 if r = k }.
By Theorem 33.4.2 the amount of information about the multisecret s given by a set of
shares {ti1 , . . . , tiu } is completely determined by the rank of the submatrix G(i1 , . . . , iu ) of
G. Thus, each share gives information about the multisecret s; however, this need not be true for each individual secret sj , as the next theorem shows.
Theorem 33.4.3 Let ej denote the vector where the j th coordinate is 1 and the remaining coordinates are 0. Let a multisecret sharing scheme have the share function of (33.1). If r = rank G(i1 , . . . , iu ) = k, then I(sj ; (ti1 , . . . , tiu )) = log2 q = I(sj ) for every j.
If r < k, then I(sj ; (ti1 , . . . , tiu )) = log2 q = I(sj ) if and only if the vector ej is a linear com-
bination of the column vectors of the submatrix G(i1 , . . . , iu ); otherwise I(sj ; (ti1 , . . . , tiu )) =
0.
Theorem 33.4.4 A multisecret sharing scheme with the share function of (33.1) is a [k, k, n] threshold scheme if and only if
(a) the linear code C with generator matrix G is MDS, and
(b) every (k − 1) × (k − 1) submatrix of G is invertible.
By Theorem 33.4.4, a [k, k, n] threshold scheme for multisecret sharing gives one [n, k, n−
k + 1] MDS code and k MDS codes with parameters [k, k − 1, 2].
Theorem 33.4.5 For any linear [k, k, n] threshold scheme with the share function of (33.1), I(s; B) = |B| log2 q for every set B of at most k shares.
This theorem clearly shows the information hierarchy about the multisecret of [k, k, n]
threshold schemes, i.e., I(s; B) = |B| log2 q, where B is a set of shares. Thus, linear [k, k, n]
threshold schemes are the most democratic schemes in that each share contains the same
amount of information about the multisecret, and two sets of shares give the same amount
of information about the multisecret if and only if the numbers of shares in the two sets are
equal. However, the information hierarchy about each individual secret sj is quite different,
i.e., I(sj ; B) = 0 if |B| < k and I(sj ; B) = I(sj ) otherwise.
The relation between linear [k, k, n] threshold schemes and linear [n, k, n − k + 1] MDS codes is now clear. To construct such multisecret sharing schemes, we need to find linear
MDS codes. It is obvious that not every generator matrix of an MDS code satisfies Theo-
rem 33.4.4(b). So our task is first to find MDS codes, and then to find generator matrices
of those codes satisfying Theorem 33.4.4(b). Below we give an example.
Let α = (α1 , . . . , αn ) where the αi are distinct elements of Fq . The [n, k, n − k + 1] generalized Reed–Solomon code has generator matrix

    G = [ 1          1          · · ·  1
          α1         α2         · · ·  αn
          ...        ...               ...
          α1^{k−1}   α2^{k−1}   · · ·  αn^{k−1} ] .    (33.2)
To construct linear multisecret sharing [k, k, n] threshold schemes based on the Reed–
Solomon code, we need the following lemma.
The linear multisecret sharing scheme based on the Reed–Solomon codes is constructed
as follows. Choose the αi such that (33.3) holds. Then we use the matrix of (33.2) as the
one of (33.1) to construct the share function for the multisecret sharing scheme.
Theorem 33.4.7 The multisecret sharing scheme based on the Reed–Solomon code with
generator matrix G of (33.2) satisfying (33.3) is a linear [k, k, n] threshold scheme.
To illustrate the above linear multisecret sharing [k, k, n] threshold scheme based on the
Reed–Solomon codes, we take the following example.
Example 33.4.8 Consider the field F11 and k = 3. Let αi = i for i = 1, 2, 3, 4, 5. Then the matrix in (33.2) becomes

    G0 = [ 1 1 1 1 1
           1 2 3 4 5
           1 4 9 5 3 ] ,

which generates a [5, 3, 3] MDS code over F11 . It is easily checked that each 2 × 2 submatrix of G0 is invertible; so G0 gives a [3, 3, 5] threshold scheme for multisecret sharing.
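The invertibility claim in the example can be verified directly; this short sketch brute-forces every 2 × 2 submatrix of G0 modulo 11.

```python
from itertools import combinations

p = 11
G0 = [[1, 1, 1, 1, 1],
      [1, 2, 3, 4, 5],      # alpha_i = i
      [1, 4, 9, 5, 3]]      # alpha_i^2 mod 11 (16 -> 5, 25 -> 3)

def det2(rows, cols):
    m = [[G0[r][c] for c in cols] for r in rows]
    return (m[0][0] * m[1][1] - m[0][1] * m[1][0]) % p

ok = all(det2(rows, cols) != 0
         for rows in combinations(range(3), 2)
         for cols in combinations(range(5), 2))
print(ok)   # True: every 2 x 2 submatrix of G0 is invertible mod 11
```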
Connections between single-secret sharing and multisecret sharing can be found in [558].
Multisecret sharing schemes based on codes were investigated in [333, 1740].
Chapter 34
Code-Based Cryptography
Philippe Gaborit
University of Limoges
Jean-Christophe Deneuville
University of Toulouse
Introduction 800
34.1 Preliminaries 800
    34.1.1 Notation 800
    34.1.2 Background on Coding Theory 801
34.2 Difficult Problems for Code-Based Cryptography: The Syndrome Decoding Problem and Its Variations 803
34.3 Best-Known Attacks for the Syndrome Decoding Problem 804
34.4 Public-Key Encryption from Coding Theory with Hidden Structure 807
    34.4.1 The McEliece and Niederreiter Frameworks 807
    34.4.2 Group-Structured McEliece Framework 809
    34.4.3 Moderate-Density Parity Check (MDPC) Codes 810
34.5 PKE Schemes with Reduction to Decoding Random Codes without Hidden Structure 812
    34.5.1 Alekhnovich’s Approach 812
    34.5.2 HQC: Efficient Encryption from Random Quasi-Cyclic Codes 813
    34.5.3 Ouroboros Key-Exchange Protocol 814
34.6 Examples of Parameters for Code-Based Encryption and Key Exchange 815
34.7 Authentication: The Stern Zero-Knowledge Protocol 816
34.8 Digital Signatures from Coding Theory 817
    34.8.1 Signature from a Zero-Knowledge Authentication Scheme with the Fiat–Shamir Heuristic 818
    34.8.2 The CFS Signature Scheme 818
    34.8.3 The WAVE Signature 819
    34.8.4 Few-Times Signature Schemes and Variations 820
34.9 Other Primitives 820
34.10 Rank-Based Cryptography 820
Acknowledgement 821
Introduction
Code-based cryptography refers to asymmetric cryptography based on coding theory
assumptions where the underlying primitives can be reduced to solving decoding prob-
lems. This chapter focuses on the essentials of code-based cryptography and covers some
reminders on coding theory and the associated hard problems useful for cryptography, his-
torical constructions such as the McEliece [1361] and Niederreiter [1432] constructions, as
well as more recent proposals for public-key cryptography, key-exchange protocols, and
digital signatures.
With the anticipated downfall of number-theory-based approaches to providing secure public-key primitives against quantum adversaries [858, 1696], the field of post-quantum cryptography has received growing interest and a lot of research effort. Among the possible alternative tools, such as lattices, hash functions, multivariate polynomials, or isogenies of elliptic curves, coding theory stands as a credible, mature candidate for several reasons that will be discussed in the next sections: it allows one to design most of the usual (most used) primitives, the complexity of the best known attacks is well studied and rather stable, and it offers time/memory/security trade-offs.
In particular, in recent years a lot of progress has been made. In addition to the classi-
cal McEliece framework published in 1978, with its advantages and drawbacks, some new
approaches have introduced the possibility of having code-based cryptosystems with small
key sizes and with security reductions to generic problems as a way to avoid the classical
structural attacks associated to the McEliece framework.
The recent standardization process launched by the National Institute of Standards and
Technology (NIST) in 2017 [1441] has thrown code-based cryptography into a new era by
forcing it to consider real applications, which had not really been done before because of
the practical efficiency of number theory-based cryptosystems like RSA.
34.1 Preliminaries
34.1.1 Notation
Throughout this chapter, Z denotes the ring of integers and Fq denotes the finite field of q
elements for q a power of a prime. Let R = Fq [X]/(X n −1) denote the quotient ring of poly-
nomials modulo X n − 1 whose coefficients lie in Fq . Elements of R will be interchangeably
considered as row vectors or polynomials. Vectors and polynomials (respectively, matrices)
will be represented by lower-case (respectively, upper-case) bold letters.1
The typical case is q = 2 and n a prime number chosen such that 2 is primitive modulo
n; indeed in the typical case X n − 1 factors as the product of two irreducible polynomials
X n − 1 = (X + 1)(X n−1 + X n−2 + · · · + 1) [779]. Hence X n − 1 can be considered as almost
irreducible, which is considered to be better for security and makes it easy to find invertible elements of R: modulo X^n − 1, the invertible elements are exactly the polynomials with an odd number of terms (with the exception of the all-ones polynomial, which is a multiple of X^{n−1} + X^{n−2} + · · · + 1).
1 Vectors are usually considered in row representation in coding theory, in contrast to the column representation used in lattice-based cryptography.
For any two elements x, y ∈ R, their product, defined as the usual product of two polynomials modulo X^n − 1, is x · y = z ∈ R with

    zk = Σ_{i+j ≡ k (mod n)} xi yj ,   k ∈ {0, . . . , n − 1}.
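This product is a cyclic convolution and transcribes directly into code; the length n = 11 (for which 2 is primitive modulo n, as in the typical case above) and the sample polynomials are illustrative choices of ours.

```python
def cyclic_mul(x, y, n, q=2):
    """z_k = sum_{i+j = k (mod n)} x_i y_j in F_q[X]/(X^n - 1)."""
    z = [0] * n
    for i, xi in enumerate(x):
        if xi:
            for j, yj in enumerate(y):
                z[(i + j) % n] = (z[(i + j) % n] + xi * yj) % q
    return z

n = 11                              # 2 is primitive modulo 11
x = [1, 1, 0, 1] + [0] * 7          # 1 + X + X^3 (odd weight)
one = [1] + [0] * (n - 1)

# Multiplying by X is a cyclic shift:
X = [0, 1] + [0] * 9
print(cyclic_mul(x, X, n))          # [0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

# Exponents wrap around modulo n: X^10 * X^3 = X^13 = X^2
print(cyclic_mul([0] * 10 + [1], [0, 0, 0, 1] + [0] * 7, n))
```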
Definition 34.1.4 The dual code C ⊥ of a linear [n, k] code C is the linear [n, n − k] code
defined by
C ⊥ = {x ∈ Fnq | hx, ci = 0 for all c ∈ C}.
Definition 34.1.5 A generator matrix of an [n, k] code C over Fq is a k × n matrix G whose rows form a basis of C, so that C = {mG | m ∈ Fkq }.

Definition 34.1.6 Given an [n, k] code C, we say that H ∈ Fq^{(n−k)×n} is a parity check matrix for C if H is a generator matrix of the dual code C ⊥ , or more formally, if

    C = {x ∈ Fnq | HxT = 0}, or equivalently C ⊥ = {yH | y ∈ Fq^{n−k} }.
Remark 34.1.7 It follows from the previous definitions that GHT = 0k×(n−k) .
Code-based cryptography was originally proposed using the standard Hamming metric.
Other metrics, such as the rank metric, allow for the design of cryptographic primitives
with different properties. We focus on the Hamming metric for most of this chapter and
refer the reader to Section 34.10 for a brief overview of code-based cryptography with the
rank metric.
Definition 34.1.9 Let C be an [n, k] linear code over Fq . The minimum distance of C is d(C) = min{ dH (x, y) | x, y ∈ C, x ≠ y }.
Remark 34.1.10 Notice that by definition, the minimum distance of a linear code is ex-
actly the minimum weight of a nonzero codeword over all possible codewords.
Definition 34.1.11 ([1323, Chap. 16, §7]) View a vector c = (c0 , . . . , cs−1 ) of F2^{sn} as s successive blocks (n-tuples). An [sn, k, d] linear code C is quasi-cyclic (QC) of index s if
for any c = (c0 , . . . , cs−1 ) ∈ C, the vector obtained after applying a simultaneous circular
shift to every block c0 , . . . , cs−1 is also a codeword.
More formally, by considering each block ci as a polynomial in R = Fq [X]/(X n − 1),
the code C is QC of index s if for any c = (c0 , . . . , cs−1 ) ∈ C, then (X · c0 , . . . , X · cs−1 ) ∈ C.
Definition 34.1.12 A systematic quasi-cyclic [sn, n] code of index s and rate 1/s is a quasi-cyclic code with an (s − 1)n × sn parity check matrix of the form

    H = [ In     0n×n   · · ·  0n×n  A0
          0n×n   In            ..    A1
          ..             ..          ..
          0n×n   · · ·         In    As−2 ]

where A0 , . . . , As−2 are circulant n × n matrices, In is the n × n identity matrix, and 0n×n is the n × n zero matrix.
Remark 34.1.13 The definition of systematic quasi-cyclic codes of index s can of course
be generalized to all rates `/s, ` = 1, . . . , s − 1, but we shall only use systematic QC codes
of rates 1/2 and 1/3 and wish to lighten notation with the above definition. In the sequel,
referring to a systematic QC code will imply by default that it is of rate 1/s. Note that
arbitrary QC codes are not necessarily equivalent to a systematic QC code.
Definition 34.1.14 For an [n, k] code over Fq , the Gilbert–Varshamov distance is the smallest value of d which satisfies the inequality

    q^{n−k} ≤ Σ_{i=0}^{d} (n choose i) (q − 1)^i .
This distance corresponds to the average minimum distance of a random [n, k] code. Notice
that for quasi-cyclic codes, this value can be a little higher [779].
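The definition translates directly into a small search for the threshold d; the parameter choices below are illustrative (and `math.comb` needs Python 3.8+).

```python
from math import comb

def gv_distance(n, k, q=2):
    """Smallest d with q^(n-k) <= sum_{i=0}^{d} C(n,i) (q-1)^i."""
    target, total, d = q ** (n - k), 0, 0
    while True:
        total += comb(n, d) * (q - 1) ** d
        if total >= target:
            return d
        d += 1

# For random [n, n/2] binary codes the GV distance grows roughly like n/9:
for n in (100, 1000):
    print(n, gv_distance(n, n // 2))
```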
Definition 34.1.15 We denote by Sw^n (Fq ) the set of all words of weight w in Fnq .
The Ideal-Decision Syndrome Decoding problem was proven NP-complete in [171] under
the name of the Coset Weight problem; it is also sometimes called the Maximum Likeli-
hood Decoding problem. The computational version CSD is obviously a harder problem.
In practice the Ideal-Decision Syndrome Decoding problem assumes that the answer to the
question can only be “yes” or “no”; it corresponds to an ideal case, which does not fit very
well the usual attacker models. In practice, for security proofs in cryptography, it is easier to
consider a more precise notion of advantage and indistinguishability for an attacker, which
motivates the introduction of the Decision Syndrome Decoding problem. The DSD problem
is easier than the IDSD problem, but it was proven in [54, 733] that over F2 the DSD prob-
lem is harder than the CSD problem, so that overall for random binary codes these three
problems are equivalent. But in practice the DSD problem appears more naturally.
One can draw a parallel with the Diffie–Hellman problems, for which there is a computational version CDH (Computational Diffie–Hellman) and a decisional version DDH
(Decisional Diffie–Hellman); in practice proofs are often based on the DDH assumption. In
the case of the Diffie–Hellman problem, the CDH problem is harder than the DDH problem,
and it is not known whether or not the two problems are equivalent.
Another related problem is the Minimum Weight problem.
Problem 34.2.4 (Minimum Weight) Let C be a random [n, k] code over Fq and let w
be a positive integer. Does there exist a nonzero codeword x in C with wtH (x) ≤ w?
804 Concise Encyclopedia of Coding Theory
There is no known reduction for the QCSD problem, but the problem is considered hard
by the cryptographic community. In the typical case for n as described in Section 34.1.1,
there is no known attack using the quasi-cyclic structure which drastically reduces the
cost of the attack for the QCSD problem. The best improvement in that case, the DOOM approach [1649], only permits a gain of a factor √n. Concerning the reduction between the decision and the computational QCSD problems, unlike the previously mentioned case of random binary codes, it is not known whether or not the two problems are polynomially equivalent.
Problem 34.2.7 (Learning Parity with Noise (LPN)) Fix a secret s ∈ Fn2 and an
error probability p. A sample t is defined by the binary value t = hr, si + e for r a random
element in Fn2 and e an error value which is 1 with probability p and 0 otherwise. Given N
samples, is it possible to recover s?
The complexity of the LPN problem depends on the number of samples considered [1310].
For a fixed number of samples the LPN problem corresponds exactly to the SD problem for
a [N, n] binary code with an error weight depending on the probability p. To the best of
our knowledge, except for the case of the Hopper–Blum (HB) authentication scheme [983]
where the notion of unlimited samples appears naturally, for encryption schemes based on
LPN, one has to fix the number of samples. The fact that the weight of the error may vary
because of the probability p makes it hard to have efficient parameters, so that sometimes
one considers LPN problems with a fixed weight distribution. Hence, in practice, efficient
cryptosystems based on LPN or its variations are equivalent to cryptosystems based on the
classical SD (or QCSD) problems.
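A sample generator for the LPN problem is immediate; the dimension, sample count, and noise rate below are arbitrary illustrative choices.

```python
import random

def lpn_samples(s, N, p):
    """N LPN samples (r, t) with t = <r, s> + e over F_2, where e = 1 w.p. p."""
    n = len(s)
    samples = []
    for _ in range(N):
        r = [random.randrange(2) for _ in range(n)]
        e = 1 if random.random() < p else 0
        t = (sum(ri * si for ri, si in zip(r, s)) + e) % 2
        samples.append((r, t))
    return samples

secret = [1, 0, 1, 1, 0, 1]
noisy = lpn_samples(secret, 10, 0.125)

# With p = 0 every sample is a clean linear equation in s, and enough of
# them reveal s by Gaussian elimination; the noise is what makes LPN hard.
clean = lpn_samples(secret, 10, 0.0)
print(all(t == sum(a * b for a, b in zip(r, secret)) % 2 for r, t in clean))  # True
```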
The general idea of Information Set Decoding (ISD) is to find a set I of indexes for which ei = 0 for i ∈ I, i.e., an error-free subset of positions. The average running time is hence a function of n, k, and w. Let C be a code with generator matrix G
and parity check matrix H. Remember that, by multiplying by a parity check matrix H,
decoding a noisy codeword y = mG + e for a small weight error e and a codeword c = mG
is equivalent to solving eHT = yHT .
The first ISD algorithm dates back to Prange [1540] and is presented in Algorithm 34.3.1. The general idea consists in guessing k coordinate positions of a noisy vector and hoping that these positions do not contain errors. If the guess is correct, one can reconstruct the codeword and check that the error has the desired small weight. Otherwise the attacker considers a new set of coordinates. The algorithm essentially relies on the fact that, for an invertible matrix U ∈ F2^{(n−k)×(n−k)} and a permutation matrix P ∈ F2^{n×n}, if H′ = UHP, then solving eHT = s is equivalent to solving e′H′T = s′, with e′ = eP and s′ = sUT . Using Gaussian elimination, an information set can be identified in polynomial time by writing the resulting parity check matrix in systematic form. The algorithm succeeds when sUT has a low enough weight.
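A minimal sketch of Prange's idea in the syndrome formulation — repeatedly guess a set of n − k positions that contains the whole error support; the toy parameters and helper names below are ours, not the chapter's.

```python
import random

def solve_f2(A, b):
    """Solve A x = b over F_2 for a square matrix A; None if A is singular."""
    m = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(m):
        piv = next((r for r in range(c, m) if M[r][c]), None)
        if piv is None:
            return None
        M[c], M[piv] = M[piv], M[c]
        for r in range(m):
            if r != c and M[r][c]:
                M[r] = [u ^ v for u, v in zip(M[r], M[c])]
    return [M[r][m] for r in range(m)]

def prange(H, s, w, max_iter=20000):
    """Look for e with H e^T = s and wt(e) <= w by guessing error positions."""
    r, n = len(H), len(H[0])
    for _ in range(max_iter):
        J = random.sample(range(n), r)                 # guessed error support
        A = [[H[i][j] for j in J] for i in range(r)]   # columns of H at J
        x = solve_f2(A, s)
        if x is not None and sum(x) <= w:
            e = [0] * n
            for j, xj in zip(J, x):
                e[j] = xj
            return e
    return None

# Toy instance: H = [I | B] for a random B, so H has full rank.
n, k, w = 12, 6, 2
H = [[1 if j == i else (random.randrange(2) if j >= n - k else 0)
      for j in range(n)] for i in range(n - k)]
e_true = [0] * n
e_true[1] = e_true[7] = 1
s = [sum(H[i][j] * e_true[j] for j in range(n)) % 2 for i in range(n - k)]

e = prange(H, s, w)
print(e is not None and sum(e) <= w)   # True: a valid low-weight solution found
```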
For a random [n, n/2] binary code, the Gilbert–Varshamov distance is approximately n/9. For this special set of parameters, the exponent of the complexity depends linearly on the unique parameter n. In Table 34.1 we give the values of the exponent depending on the
attack. In the table, the last three attacks also use nearest neighbor techniques together
with the ISD approach; these approaches require extensive memory and are not necessarily
the most efficient for cryptographic parameters. All these attacks are probabilistic. There
also exist deterministic methods, but they are too expensive and in practice probabilistic
methods are used. The Prange algorithm can be adapted in a straightforward manner to the
case of finding minimum weight codewords of a code. In that case the syndrome is null, but
it is possible to use the Prange algorithm by guessing an error position of the small weight
vector and use the associated column of the parity check matrix as the new syndrome.
TABLE 34.1: Evolution of the value of the coefficient α in the exponential part 2^{αn} of the complexity of attacks for decoding a random [n, n/2, d] code for d the Gilbert–Varshamov distance of the code (approximately n/9).
Name Date α
Exhaustive search 0.386
Prange [1540] 1962 0.1207
Stern [1752] 1988 0.1164
Dumer [647] 1991 0.1162
May, Meurer, Thomas [1356] 2011 0.1114
Becker, Joux, May, Meurer [146] 2012 0.1019
May, Ozerov [1357] 2015 0.1019
Both, May [253] 2017 0.0953
Both, May [254] 2018 0.0885
In the case where w is very small compared to n and k (for instance, typically, k = n/2 and w = O(√n)), all these attacks have the same complexity, approximately 2^{−w log2 (1−k/n)(1+o(1))}, which is the complexity of the Prange-ISD Algorithm [339].
Remark 34.3.2 The best known quantum attack consists in using the Grover Algorithm within the Prange-ISD Algorithm and gives a complexity in O(√( (n choose w) / (n−k choose w) )) [858]; see also [1069] for small exponential improvements with other improved ISD attacks. This complexity roughly means dividing the exponent of the attack by a factor of 2, so that a 256-bit classical security level (the complexity needed for an adversary to attack a problem, that is, 2^256 binary operations for 256-bit security) would give 128 bits of quantum security. In contrast to number-theory-based problems like the factorization problem or the discrete logarithm problem, there is no known quantum attack with logarithmic gain for the Syndrome Decoding problem; the a priori reason the community does not believe in one is that the SD problem is NP-complete, and the general rationale is that this complexity class remains hard even against a quantum adversary. Meanwhile, whether there exists a better quantum attack for the QCSD problem is still an open question.
To gauge the gap between theory and practice, a challenge website was created in
2019 [58]. It gathers instances of generic problems such as the SD problem or the small
codeword finding problem, as well as instances more specific to the NIST post-quantum
cryptography standardization process.
Code-Based Cryptography 807
KeyGen
    Generate a (private) random [n, k] linear code C and its associated generator matrix G ∈ F2^{k×n}, along with an efficient decoding algorithm D. Sample uniformly at random an invertible matrix S ∈ F2^{k×k} and a permutation matrix P ∈ F2^{n×n}. Return (sk, pk) = ((S, G, P), G̃ = SGP).
Encrypt
    To encrypt m ∈ F2^k, sample uniformly e ∈ F2^n with wtH (e) = w and return c = mG̃ + e.
Decrypt
    To decrypt, using D, first decode d = cP−1 to retrieve mS, then m from mS.
Why decryption works
    d = (mS)G + eP−1 ; decoding d permits recovery of mS, then m = (mS)S−1 .
McEliece parameters: The size of the public key is (n − k)k; the size of the ciphertext is
n.
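A toy instantiation of the framework above, using the [7, 4] Hamming code (which corrects w = 1 error) as the secret decodable code; all helper names are ours, and a code this small is of course far too weak to be secure — it only illustrates the KeyGen/Encrypt/Decrypt flow.

```python
import random

# Hamming [7,4] code in systematic form: G = [I | A], H = [A^T | I].
A = [[1, 1, 0], [1, 0, 1], [0, 1, 1], [1, 1, 1]]
G = [[1 if j == i else 0 for j in range(4)] + A[i] for i in range(4)]
H = [[A[r][j] for r in range(4)] + [1 if c == j else 0 for c in range(3)]
     for j in range(3)]

def vecmat(v, M):                       # row vector times matrix, over F_2
    return [sum(vi * M[i][j] for i, vi in enumerate(v)) % 2
            for j in range(len(M[0]))]

def inverse_f2(M):                      # Gauss-Jordan inverse over F_2
    m = len(M)
    Aug = [row[:] + [1 if i == j else 0 for j in range(m)]
           for i, row in enumerate(M)]
    for c in range(m):
        piv = next((r for r in range(c, m) if Aug[r][c]), None)
        if piv is None:
            return None
        Aug[c], Aug[piv] = Aug[piv], Aug[c]
        for r in range(m):
            if r != c and Aug[r][c]:
                Aug[r] = [u ^ v for u, v in zip(Aug[r], Aug[c])]
    return [row[m:] for row in Aug]

def decode(y):
    """Syndrome decoding of one error; the code is systematic, so the
    message is the first 4 coordinates of the corrected word."""
    syn = [sum(H[i][j] * y[j] for j in range(7)) % 2 for i in range(3)]
    if any(syn):
        j = next(c for c in range(7) if [H[i][c] for i in range(3)] == syn)
        y = y[:]
        y[j] ^= 1
    return y[:4]

# KeyGen: secret (S, G, P); public Gpub = SGP (P stored as a permutation list).
while True:
    S = [[random.randrange(2) for _ in range(4)] for _ in range(4)]
    Sinv = inverse_f2(S)
    if Sinv is not None:
        break
perm = list(range(7))
random.shuffle(perm)
SG = [vecmat(row, G) for row in [[S[i][t] for t in range(4)] for i in range(4)]]
SG = [vecmat(S[i], G) for i in range(4)]          # rows of S times G
Gpub = [[SG[i][perm[j]] for j in range(7)] for i in range(4)]

def encrypt(m):
    c = vecmat(m, Gpub)
    c[random.randrange(7)] ^= 1         # error of weight w = 1
    return c

def decrypt(c):
    d = [0] * 7
    for j in range(7):
        d[perm[j]] = c[j]               # d = cP^{-1}: a codeword of C plus one error
    mS = decode(d)
    return vecmat(mS, Sinv)             # m = (mS) S^{-1}

m = [1, 0, 1, 1]
print(decrypt(encrypt(m)))   # [1, 0, 1, 1]
```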
Security of the McEliece framework: The OW-CPA (One-Wayness against Chosen
Plaintext Attack) security of the scheme relies on two problems. First, the hidden code
structure is indistinguishable from a random code, which assumes that the SGP matrix
cannot be distinguished from a random one. Second, based on this assumption, the security
is then the security of decoding random codes (hence the SD problem). From the OW-CPA
security one can obtain IND-CCA2 (INDistinguishability under adaptive Chosen Ciphertext Attacks)
security with a small ciphertext overhead using a general conversion as, for instance, in
[970].
Structural attacks of McEliece: The main security assumption is the indistinguishability
of the public matrix from random codes. A particular type of attack, called a structural attack, consists in trying to recover directly the private key G from the public key SGP. There exist two types of hidden families. In the first case, the attacker knows that the public
key SGP is obtained from a code belonging to a large family of codes but does not know
the specific code in the large family (for instance Goppa codes). Another possibility consists
2 Section 15.7 examines the McEliece scheme in the context of algebraic geometry codes.
808 Concise Encyclopedia of Coding Theory
in considering a unique code but one which has to be resistant to recovering the primary
structure from a permuted structure (for instance Reed–Muller or Reed–Solomon codes).
The best generic attacks to recover a permutation from a permuted code and the code are the Leon Algorithm [1217] (whose complexity is that of enumerating sufficiently many small weight codewords) and the Support Splitting Algorithm [1647] (with complexity Õ(2^{dim(C∩C⊥)})).
In the case of specific families of codes, for instance Reed–Solomon codes with monomial
transformations (permutation matrices where each column is moreover multiplied by a
nonzero scalar), it is possible to break the system in polynomial time [1705].
Niederreiter approach: Niederreiter’s approach, which can be seen as a dual of McEliece’s, is presented in Figure 34.2. The security is equivalent to the security of the McEliece framework. The main interest is that the ciphertext has size n − k (the dimension of
the dual) rather than n. The drawback is that the message has to be expressed in terms of a
small weight vector. In practice, with the KEM/DEM (Key Encapsulation Mechanism/Data
Encapsulation Mechanism) approach, the latter drawback becomes less important.
KeyGen
    Generate a (private) random [n, k] linear code C and its associated parity check matrix Hsec ∈ F2^{(n−k)×n}, along with an efficient decoding algorithm D. Sample uniformly at random an invertible matrix S ∈ F2^{(n−k)×(n−k)} and a permutation matrix P ∈ F2^{n×n}. Return (sk, pk) = ((S, Hsec , P), Hpub = SHsec P).
Encrypt
    To encrypt m ∈ F2^k, first encode it into a vector m̃ of length n and weight w, then return c = Hpub m̃ T.
Decrypt
    To decrypt, first decode d = D(S−1 c), then retrieve m from m̃ = dP−1 .
The threat of structural attacks (see [463, 464, 707]) lies like a sword of Damocles over the system, and finding a practical alternative cryptosystem based on the difficulty of decoding unstructured or random codes has always been a major issue in code-based cryptography.
Generalized McEliece framework: The original McEliece framework described in Fig-
ure 34.1 has a straightforward generalization in a scheme which can be roughly described
as:
• Secret Key: a decodable code C
• Public Key: a matrix G0 = TrapDoor(C), for TrapDoor(C) a transformation hiding
C in G0
• Encryption: c = mG0 + e, for e a small weight error
• Decryption: C.Decode(TrapDoor−1 (c)), for C.Decode a decoding algorithm of C
This more general point of view permits one to not only consider the original McEliece
permutation hiding, but also more general hiding as, for instance, the group-structured
McEliece or the MDPC cryptosystem variations of the next sections.
This generalized framework also contains other natural variations on the scheme, such as
the Wieschbrink variation in which one adds random columns to a code to hide it [1890], or
using a subcode of a code [165], or a mix of the two approaches. There also exist variations
[101] where the permutation is replaced by a matrix with a very small number of 1s in
each row and column or by considering a monomial transformation rather than a basic
permutation (typically when considering non-binary codes). But this type of system was
also attacked [465]. All these variations do not fit in the original McEliece framework, but
have in common the notion of TrapDoor and hiding a decodable code in a public matrix,
with a security relying on the indistinguishability of the public matrix from a random code.
Most of these variations were also broken [119, 459, 461, 1222].
While this approach should lead to a decrease of the size of the public key by some factor,
the underlying quasi-cyclic structure may now be seen as a weakness.
Security and parameters: Everything is similar to the McEliece framework except that
the size of the public key is smaller by a factor of roughly 10 and more structure is added
on the hidden code. The reduction is done with the QCSD problem and not with the SD
problem.
Alice                                                Bob
h0, h1 ←$ S_w^n(F_2), h ← h1 h0^{-1}
                      —— h ——→
                                                     e0, e1 ←$ F_2^n s.t. wt_H(e0) + wt_H(e1) = 2t
                      ←—— c1 ——                      c1 ← e0 + h e1
s ← h0 c1
(e0, e1) ← BitFlipping(h0, h1, s, t, w)
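The BitFlipping step Alice runs can be sketched generically: repeatedly count, for each bit, how many unsatisfied parity checks it touches, and flip the bits meeting the current threshold. The sketch below is a minimal Gallager-style variant under toy assumptions; the tiny parity check matrix in the usage is hypothetical and only corrects single errors, nothing like real MDPC densities:

```python
def bit_flipping(H_rows, syndrome, n, max_iter=50):
    """Toy bit-flipping syndrome decoder. H_rows[i] lists the column indices
    of the 1s in row i of a sparse parity check matrix; the goal is a small
    weight e with H e^T = syndrome. Returns e, or None on decoding failure."""
    e = [0] * n
    syn = list(syndrome)
    for _ in range(max_iter):
        if not any(syn):
            return e
        # for each bit, count the unsatisfied checks it participates in
        counts = [0] * n
        for row, unsat in zip(H_rows, syn):
            if unsat:
                for j in row:
                    counts[j] += 1
        # flip every bit reaching the max-count threshold, update the syndrome
        threshold = max(counts)
        for j in range(n):
            if counts[j] == threshold:
                e[j] ^= 1
                for i, row in enumerate(H_rows):
                    if j in row:
                        syn[i] ^= 1
    return None
```

Real MDPC decoders refine the threshold rule and iteration schedule precisely to drive the DFR down; with ties and heavier errors this naive version can oscillate, which is the phenomenon behind decryption failures.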
Parameters: For security level λ, the length of the code is in O(λ^2) and the weight of the
generating vector is in O(λ).
Security: The original presentation of the MDPC scheme had an OW-CPA security, based
not only on the QCSD problem but also on the indistinguishability between the public
generator matrix of a QC-MDPC code (generated by a small weight vector) and a random
quasi-cyclic code. The Niederreiter version presented in Figure 34.3 follows [15] and can be
proven IND-CCA2 (under the two previous hypotheses) with a generic transformation of
type [970], on the condition that the DFR is proven sufficiently low, in 2^{-λ} for λ the security
parameter in bits.
Attacks and DFR: The main attacks are related to the decryption failure rate [874]. The
MDPC scheme can be used for key exchange in a one-time process with a DFR as small
as 2^{-30}; using it for encryption necessitates a stronger analysis of the decoder to reach
a DFR of 2^{-128} [1651]. A strong drawback of the cryptosystem for one-time key
exchange is the cost of inverting a polynomial, which is drastically improved in [643].
Variations on the scheme: The original MDPC scheme [1392] is in a McEliece-like form.
It is possible to consider it in a Niederreiter-like form as in the BIKE-2 version of the NIST
standardization process, and also a version which does not necessitate a costly inversion
[132] (the BIKE-1 version of the NIST standardization process); see the Bike submission
[15] for details.
KeyGen
    Let A ←$ F_2^{k×n}, x ←$ F_2^k, w = O(√n), e ←$ S_w^n(F_2), y ← xA + e, and
    H ← (A^T | y^T)^T. Let C be the code with parity check matrix H and generator
    matrix G. Return (sk, pk) = (e, G).
Encrypt
    To encrypt m ∈ {0, 1}: if m = 0, c ← c0 + e0, for c0 a random codeword of C
    and e0 ←$ S_w^n(F_2); otherwise if m = 1, c ← u for u ←$ F_2^n. Return c.
Decrypt
    To decrypt, return b ← ⟨c, e⟩.
Why decryption works: If m = 0, ⟨c, e⟩ = ⟨c0 + e0, e⟩ = ⟨e0, e⟩ = 0 with overwhelming
probability (since w = O(√n)). If m = 1, decryption succeeds with probability 1/2.
The scheme is probabilistic.
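One piece of the algebra above is in fact exact and can be checked mechanically: since y = xA + e and every codeword c0 satisfies A c0^T = 0 and ⟨y, c0⟩ = 0, every codeword is orthogonal to the secret e. The toy sketch below, with illustrative parameters far below anything secure, builds the key pair and verifies this on a computed basis of C:

```python
import random

def nullspace_basis(H, n):
    # basis of the code C = {v in F_2^n : H v^T = 0}, by Gaussian elimination
    rows = [r[:] for r in H]
    pivots = []
    r = 0
    for c in range(n):
        piv = next((i for i in range(r, len(rows)) if rows[i][c]), None)
        if piv is None:
            continue
        rows[r], rows[piv] = rows[piv], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c]:
                rows[i] = [(a + b) % 2 for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
    basis = []
    for f in (c for c in range(n) if c not in pivots):
        v = [0] * n
        v[f] = 1
        for i, p in enumerate(pivots):
            v[p] = rows[i][f]
        basis.append(v)
    return basis

def sparse(n, w):
    v = [0] * n
    for j in random.sample(range(n), w):
        v[j] = 1
    return v

def keygen(n=20, k=8, w=3):
    A = [[random.randint(0, 1) for _ in range(n)] for _ in range(k)]
    x = [random.randint(0, 1) for _ in range(k)]
    e = sparse(n, w)
    y = [(sum(x[i] * A[i][j] for i in range(k)) + e[j]) % 2 for j in range(n)]
    return e, nullspace_basis(A + [y], n)  # sk = e, pk = a basis of C

def encrypt(basis, m, w=3):
    n = len(basis[0])
    if m == 1:
        return [random.randint(0, 1) for _ in range(n)]  # uniformly random word
    c0 = [0] * n  # random codeword: random F_2-combination of the basis
    for v in basis:
        if random.randint(0, 1):
            c0 = [(a + b) % 2 for a, b in zip(c0, v)]
    return [(a + b) % 2 for a, b in zip(c0, sparse(n, w))]

def decrypt(e, c):
    return sum(a * b for a, b in zip(c, e)) % 2  # b = <c, e>
```

Decryption of m = 0 then reduces to ⟨e0, e⟩, which is 0 whenever the two sparse supports meet evenly, hence the overwhelming-probability statement.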
Even if the system was not totally practical, the approach in itself was a breakthrough for
code-based cryptography. Its inspiration was provided in part by the Ajtai–Dwork cryptosys-
tem [27] which is based on solving hard lattice problems. The Ajtai–Dwork cryptosystem
also inspired the Learning With Errors (LWE) lattice-based cryptosystem by Regev [1583]
which generated a huge amount of research in lattice-based cryptography.
FIGURE 34.5: Description of the HQC cryptosystem. The multiplication xy of two ele-
ments x and y of R is a simplified notation for x · y of (34.1); the choice of n follows the
typical case of Section 34.1.1.
• KeyGen(param): Samples h ←$ R, a generator matrix G ∈ F_2^{k×n} of C which
  decodes O(n) errors, and sk = (x, y) ←$ R^2 such that wt_H(x) = wt_H(y) = O(√n).
  Sets pk = (h, s = x + hy), and returns (pk, sk).
• Encrypt(pk, m): Generates e ←$ R, r = (r1, r2) ←$ R^2 such that wt_H(e) = we
  and wt_H(r1) = wt_H(r2) = wr, for wr = we = O(√n). Sets u = r1 + hr2 and
  v = mG + sr2 + e, and returns c = (u, v).
• Decrypt(sk, c): Returns C.Decode(v − uy).
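Decryption works because v − uy = mG + x·r2 − r1·y + e, a codeword of C plus low weight noise. The toy sketch below runs this flow in R = F_2[X]/(X^N − 1) with the length-N repetition code as a stand-in for C; N and the weights are illustrative choices picked so that the noise weight provably stays below N/2, making majority-vote decoding always succeed:

```python
import random

N = 51  # odd toy length so the majority vote is never tied

def pmul(a, b):
    # product in R = F_2[X]/(X^N - 1), vectors as coefficient lists
    res = [0] * N
    for i, ai in enumerate(a):
        if ai:
            for j, bj in enumerate(b):
                if bj:
                    res[(i + j) % N] ^= 1
    return res

def padd(*vs):
    return [sum(bits) % 2 for bits in zip(*vs)]

def sparse(w):
    v = [0] * N
    for i in random.sample(range(N), w):
        v[i] = 1
    return v

def keygen(w=2):
    h = [random.randint(0, 1) for _ in range(N)]
    x, y = sparse(w), sparse(w)
    s = padd(x, pmul(h, y))
    return (x, y), (h, s)  # sk, pk

def encrypt(pk, m, w=2):
    # C is the length-N repetition code, so mG = (m, m, ..., m)
    h, s = pk
    e, r1, r2 = sparse(w), sparse(w), sparse(w)
    u = padd(r1, pmul(h, r2))
    v = padd([m] * N, pmul(s, r2), e)
    return u, v

def decrypt(sk, c):
    x, y = sk
    u, v = c
    noisy = padd(v, pmul(u, y))       # = mG + x*r2 + r1*y + e
    return int(sum(noisy) * 2 > N)    # C.Decode: majority vote
```

With w = 2 the noise weight is at most 2·2 + 2·2 + 2 = 10 < N/2, so this toy never has a decryption failure; real HQC parameters trade that certainty for efficiency and a carefully bounded DFR.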
Parameters: Similarly to MDPC codes, for λ the security level, the length of the code is
in O(λ2 ) and the weight of the small weight generating vectors is in O(λ). In practice the
parameters are slightly larger than for MDPC and two blocks are needed for the ciphertext.
Security: The IND-CPA security of HQC relies on the QCSD problem and its decision
version, the DQCSD problem; no indistinguishability hypothesis for a hidden code is needed.
The IND-CCA2 security can be reached through a generic and efficient transformation such
as [970], depending on the DFR of the system.
Instantiations: In the original HQC submission, the use of a tensor code obtained from
BCH codes and a repetition code was proposed for decoding. Recently a more efficient
concatenated code based on Reed–Muller and Reed–Solomon codes was proposed in [57].
Typically one needs to decode codes with a very low rate of order 1% and an error rate of
order 30%.
Decryption Failure Rate analysis: The main point to keep in mind for this system is
that since the decoding is probabilistic, one needs to have a precise DFR analysis for the
security proof, which requires a DFR in 2^{-λ} for λ the security level; this is the case with the
proposed decoding codes. One could consider more efficient iterative decoding algorithms
which exist in the literature, but in that case having a precise DFR analysis which is able
to go as low as 2^{-128} is trickier to obtain.
A weakness of the system is its relatively low encryption rate, but this is not a major
issue for classical applications of public-key encryption schemes such as authentication or
key exchange. In the NIST standardization process, it was possible to consider a message
size of only 256 bits.
Alice                                                Bob
seed_f1 ←$ {0, 1}^λ, f1 ←(seed_f1) F_2^n
h0, h1 ←$ S_w^n(F_2), f0 ← h1 + f1 h0
                      —— f0, f1 ——→
                                                     e ←$ S_t^n(F_2)
                                                     e0, e1 ←$ F_2^n s.t. wt_H(e0) + wt_H(e1) = 2t
s ← c0 − h0 c1        ←—— c0, c1 ——                  c0 ← f0 e1 + e, c1 ← e0 + f1 e1
(e0, e1) ← xBitFlipping(h0, h1, s, t, w)
Shared Secret:        Hash(e0, e1)                   Hash(e0, e1)
Why it works: s = −e0 h0 + e1 h1 + e, a syndrome for the small weight vector (e0, e1) associated to the
MDPC matrix derived from h0 and h1, to which one adds a small weight vector e. w and t are in O(√n).
The modified syndrome s can be decoded by the xBitFlipping Algorithm 34.5.1.
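The syndrome identity underlying "why it works" can be checked mechanically: over F_2, c0 − h0 c1 = (h1 + f1 h0)e1 + e − h0 e0 − h0 f1 e1 = h0 e0 + h1 e1 + e, since the f1 h0 e1 terms cancel. The sketch below verifies this with hypothetical toy parameters in R = F_2[X]/(X^N − 1):

```python
import random

N = 31  # toy ring length, far below real parameters

def pmul(a, b):
    # cyclic convolution: product in R = F_2[X]/(X^N - 1)
    res = [0] * N
    for i, ai in enumerate(a):
        if ai:
            for j, bj in enumerate(b):
                if bj:
                    res[(i + j) % N] ^= 1
    return res

def padd(*vs):
    return [sum(bits) % 2 for bits in zip(*vs)]

def sparse(w):
    v = [0] * N
    for i in random.sample(range(N), w):
        v[i] = 1
    return v

random.seed(5)
h0, h1 = sparse(3), sparse(3)                  # Alice's secret sparse key
f1 = [random.randint(0, 1) for _ in range(N)]  # expanded from a public seed
f0 = padd(h1, pmul(f1, h0))                    # Alice's public value

e, e0, e1 = sparse(2), sparse(2), sparse(2)    # Bob's ephemeral small weight vectors
c0 = padd(pmul(f0, e1), e)
c1 = padd(e0, pmul(f1, e1))

s = padd(c0, pmul(h0, c1))  # Alice computes s = c0 - h0 c1 (characteristic 2)
# s is exactly the noisy MDPC syndrome handed to xBitFlipping:
assert s == padd(pmul(h0, e0), pmul(h1, e1), e)
```

This is why the xBitFlipping decoder must tolerate the extra noise term e on top of an ordinary MDPC syndrome.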
Security and parameters: The Ouroboros protocol can be seen as intermediate to HQC
and MDPC schemes: it benefits from the security reduction of HQC with reduction to
attacking random quasi-cyclic instances on one side (the QCSD and DQCSD problems),
and on the other side, it uses an extended bitflip-like decoding algorithm, xBitFlipping of
Algorithm 34.5.1. As with the HQC protocol, the Ouroboros protocol necessitates sending
two blocks for the encryption. Similarly to MDPC and HQC, for λ the security level, the
length of the code is in O(λ^2), and the weight of the small weight generating vector is in
O(λ). In practice the parameters lie between those of HQC and MDPC. Notice that since
the value f1 is random, a seed to generate it is sufficient. As with MDPC, the Ouroboros
key-exchange protocol can be turned into an encryption scheme which can be proven IND-
CCA2 through a generic transformation like [970] depending on a low DFR, but without
an additional public matrix indistinguishability hypothesis.
Decryption Failure Rate analysis: The decoding of Ouroboros is very close to the
decoding of MDPC codes, and the properties of the decoder are similar in practice. Basically,
for a small DFR such as 2^{-30}, the Ouroboros protocol can, like MDPC, be used as a KEM
for key exchange. It can also be used for encryption, but this requires a lower DFR.
Key exchange and encryption: Similarly to MDPC the protocol is presented as a key-
exchange protocol but it can be easily turned into an encryption scheme for a message m
by adding a ciphertext c = m XOR Hash(e0 , e1 ).
TABLE 34.2: Comparison between several 1st and 2nd round code-based submissions to the
NIST Post-Quantum Cryptography (PQC) standardization process. All sizes are expressed
in bytes. The targeted security level is NIST Level-1, approximately equivalent to 128 bits
of classical security. The notation ct stands for ciphertext.
seeds r1 , r2 , and r3 . In that case the protocol is not zero-knowledge but only testable weak zero-knowledge
[30]; random seeds ri need to be added as in Figure 34.7.
by considering q-ary codes [372]. An interesting question would be to try to decrease the
cheating probability for one round below the best known cheating probability of 1/2.
Security: The Fiat–Shamir heuristic is proven secure in the random oracle model [32, 1528].
The security is based on the SD problem for the original Stern algorithm and under QCSD
for its quasi-cyclic improvement. The question of extending the proof to the quantum
random oracle model is actively considered by researchers.
Parameters: The main advantage of such signatures is that the public key is rather small
(in the quasi-cyclic version), a few hundred bits, and the security is really strong. However,
the signature length itself is rather large (a few hundred thousand bits).
Variations and applications: Because of the simplicity and the really strong security
of the scheme, the Fiat–Shamir heuristic-based signature has been used to introduce
code-based analogues of many of the classical signatures of number theory-based cryptography:
blind signatures [219], group signatures [698], undeniable signatures [18], ring signatures [20,
1944], identity-based signatures [371], etc.
check whether the hash value considered as a syndrome can be decoded, and repeat it until
one obtains a decodable hash value.
For a t-error-correcting binary Goppa code C of length n = 2^m over F_2, only (n choose t)
among the 2^{mt} syndromes are decodable, yielding an asymptotic density of
(n choose t)/2^{mt} ≈ 1/t!. The scheme is described in Figure 34.9; all details can be found
in the original paper [456].
KeyGen
Use the Niederreiter KeyGen Algorithm (see Figure 34.2) to generate a public
parity check matrix Hpub and a secret parity check matrix Hsec .
Sign
    Compute s_i ← h(h(m) | i) until s_i0 can be decoded into z using Hsec, where h is a
    hash function. The signature is (z, i0).
Verify
    Accept the signature if Hpub z^T = h(h(m) | i0). Reject otherwise.
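The counter-and-decode loop above can be sketched on a toy code. Note the caveat: the [7,4] Hamming code used below is perfect, so every nonzero syndrome decodes immediately and the counter loop is almost never exercised, unlike Goppa codes where it runs about t! times on average. All names, the 3-bit hash truncation, and the use of SHA-256 are illustrative assumptions:

```python
import hashlib

# parity check matrix of the [7,4] Hamming code: column j is binary of j + 1
H = [[((j + 1) >> b) & 1 for j in range(7)] for b in range(3)]

def hash_to_syndrome(msg, i):
    # s_i = h(h(m) | i), truncated to n - k = 3 bits
    inner = hashlib.sha256(msg).digest()
    outer = hashlib.sha256(inner + i.to_bytes(4, "big")).digest()
    return [(outer[0] >> b) & 1 for b in range(3)]

def decode(syn):
    """Syndrome decoding: weight-1 z with H z^T = syn, or None for syn = 0.
    (For Goppa codes most syndromes are NOT decodable, hence the counter loop.)"""
    val = sum(bit << b for b, bit in enumerate(syn))
    if val == 0:
        return None  # treat as a decoding failure to force the loop onward
    return [int(j == val - 1) for j in range(7)]

def sign(msg):
    i = 0
    while True:
        z = decode(hash_to_syndrome(msg, i))
        if z is not None:
            return z, i
        i += 1

def verify(msg, sig):
    z, i = sig
    syn = [sum(h * x for h, x in zip(row, z)) % 2 for row in H]
    return syn == hash_to_syndrome(msg, i)
```

A real CFS instantiation would additionally hide the code behind a Niederreiter key pair, signing with Hsec and verifying with Hpub as in the figure.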
Security: The security proof of the scheme relies on the indistinguishability of a dense
permuted Goppa code. It was proven in [705] that this assumption did not hold and that
it was possible to distinguish between dense Goppa codes and random codes; this property
holds for very dense Goppa codes but not for Goppa codes used in the classical encryption
scheme for which the rate of the code is often of order 1/2. It is interesting to notice that even
if it is possible to distinguish between dense Goppa codes and random codes, in practice
there is no known attack on the scheme from this distinguisher.
Parameters: The density of 1/t! obliges one to consider on average t! trials; hence t cannot
be too large. Moreover, decoding t errors for a random code has to be difficult. Overall these
two constraints lead to considering very long Goppa codes, so that the size of the public
key is super-polynomial in the security parameter. In practice, the scheme can be used for
levels of security that are not too high, such as 80 bits of security, in which case n = 2^16,
n − k = 144, and t = 9; higher security levels like 128 bits seem out of reach in practice.
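The density estimate and the expected number of signing trials can be sanity-checked numerically for these parameters (m = 16, t = 9); the short computation below is only an illustration of the approximation (n choose t)/2^{mt} ≈ 1/t!:

```python
from math import comb, factorial

m, t = 16, 9
n = 2 ** m
density = comb(n, t) / 2 ** (m * t)  # fraction of decodable syndromes
ratio = density * factorial(t)       # ≈ 1 if density ≈ 1/t!
expected_trials = 1 / density        # average signing attempts, ≈ t! = 362880
```

Since (n choose t) = n(n−1)···(n−t+1)/t! and 2^{mt} = n^t, the ratio is the product of (1 − i/n) for i < t, which is just below 1 for n = 2^16.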
Because the CFS scheme was the only existing hash-and-sign signature scheme for a long
time, it was used in many code-based constructions.
considered as the first hash-and-sign code-based signature scheme. For 128 bits of classical
security, signature sizes are on the order of 15 thousand bits, the public key size on the order
of 4 megabytes, and the rejection rate is limited to one rejection every 10 to 12 signatures.
Research Problem 34.9.1 Among primitives which are still not known but are of real
interest, one can cite homomorphic encryption. It is trivially possible to add O(n) ciphertexts
encrypted by a code-based cryptosystem and still be able to decrypt their sum, but it is
not known if this is possible with O(n^2) ciphertexts or even more. Also it is not known how
to obtain fully homomorphic encryption. Lastly, identity-based encryption is also an open
question.
vector v is the dimension of the Fq -space generated by its coordinates. There is a natural
dictionary between many notions in the Hamming metric and their rank metric counterparts.
For instance, the notion of permutation is turned into the notion of an invertible matrix
over the base field Fq , and the notion of support in the Hamming metric is turned into
the space generated by the coordinates in the rank metric. Also the SD problem has an
analog in the rank metric, namely the Rank Syndrome Decoding (RSD) problem, where the
Hamming distance is replaced by the rank distance. Changing the metric implies several
changes to the geometric properties of the codes. Overall some cryptographic schemes can
be easily adapted; for instance, the GPT (Gabidulin–Paramonov–Tretjakov) cryptosystem
[766] is an adaptation of the McEliece cryptosystem with Gabidulin codes replacing Goppa
codes. However, Gabidulin codes have a structure that is hard to mask, which leads to
many attacks on the scheme or its variations [1479]. There also exist equivalent systems to
MDPC, namely the LRPC (Low Rank Parity Check) cryptosystem [774]; to HQC, namely
the RQC (Rank Quasi-Cyclic) cryptosystem [19]; and to Ouroboros, namely the Ouroboros-
R scheme [16]. These schemes were presented in the ROLLO 2nd round submission of the
NIST standardization process [55].
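The rank weight itself is easy to compute: writing each coordinate of v ∈ F_{q^m}^n as its coordinate vector over F_q, the rank weight is the rank of the resulting m×n matrix. A small sketch for q = 2, representing each coordinate as an integer bitmask (an illustrative encoding, not tied to any particular scheme):

```python
def rank_weight(coords):
    """Rank weight of a vector over F_{2^m}: each entry is an int whose bits
    are the F_2-coordinates of that entry in a fixed basis of F_{2^m}.
    Computes the F_2-dimension of the span of the entries (an XOR basis)."""
    basis = {}  # leading bit position -> reduced basis element
    for x in coords:
        while x:
            h = x.bit_length() - 1
            if h not in basis:
                basis[h] = x
                break
            x ^= basis[h]  # reduce by the element with the same leading bit
    return len(basis)

# example in F_{2^3}: entries a, a^2, a + a^2, 0 for a primitive element a;
# the third entry is a combination of the first two, so the
# rank weight is 2 while the Hamming weight is 3
v = [0b010, 0b100, 0b110, 0b000]
```

The gap between the two weights is exactly what makes rank metric codes shorter for the same security level: a rank-metric "error" can touch every coordinate yet still be low weight.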
After a first attempt in 2014 at a signature scheme using the rank metric, namely the
RankSign signature [776] that was attacked in [515], the efficient signature scheme Durandal
was proposed in 2019 [56]. It is possible to adapt the Stern authentication scheme to the rank
metric [778]. Other primitives with rank metric formulations, e.g., [772], are also known.
But some others, e.g., hash functions, seem to resist adaptation because of properties of the metric.
In general the rank metric leads to cryptosystems with smaller key sizes. The hardness
of the underlying problem is not yet completely settled, even though, in recent years, a
better understanding has developed of the security of the general Rank Syndrome Decoding
problem, to which the Syndrome Decoding problem can be probabilistically reduced [118,
775, 780].
Acknowledgement
The first author thanks Olivier Blazy, Alain Couvreur, Thomas Debris-Alazard, and
Gilles Zémor for helpful comments and discussions.
Bibliography
[1] M. Aaltonen. Linear programming bounds for tree codes. IEEE Trans. Inform.
Theory, 25:85–90, 1979.
[2] M. Aaltonen. A new upper bound on nonbinary block codes. Discrete Math., 83:139–
160, 1990.
[3] E. Abbe. Randomness and dependencies extraction via polarization. In Proc. Infor-
mation Theory and Applications Workshop, La Jolla, California, 7 pages, February
2011.
[4] E. Abbe. Randomness and dependencies extraction via polarization, with applica-
tions to Slepian-Wolf coding and secrecy. IEEE Trans. Inform. Theory, 61:2388–
2398, 2015.
[5] E. Abbe and E. Telatar. Polar codes for the m-user MAC. arXiv:1002.0777 [cs.IT],
February 2010.
[6] E. Abbe and E. Telatar. Polar codes for the m-user multiple access channel. IEEE
Trans. Inform. Theory, 58:5437–5448, 2012.
[7] N. Abdelghany and N. Megahed. Code-checkable group rings. J. Algebra Comb.
Discrete Struct. Appl., 4:115–122, 2017.
[8] M. Abramowitz and I. A. Stegun. Handbook of Mathematical Functions with Formu-
las, Graphs, and Mathematical Tables, volume 55 of National Bureau of Standards
Applied Mathematics Series. For sale by the Superintendent of Documents, U.S. Gov-
ernment Printing Office, Washington, D.C.; 9th Dover printing, 10th GPO printing,
1964.
[9] T. Abualrub, A. Ghrayeb, and R. H. Oehmke. A mass formula and rank of Z4 cyclic
codes of length 2^e. IEEE Trans. Inform. Theory, 50:3306–3312, 2004.
[10] T. Abualrub and R. H. Oehmke. On the generators of Z4 cyclic codes of length 2^e.
IEEE Trans. Inform. Theory, 49:2126–2133, 2003.
[11] T. Abualrub and R. H. Oehmke. Correction to: “On the generators of Z4 cyclic
codes of length 2^e” [IEEE Trans. Inform. Theory 49:2126–2133, 2003]. IEEE Trans.
Inform. Theory, 51:3009, 2005.
[12] T. Abualrub and I. Şiap. Cyclic codes over the rings Z2 + uZ2 and Z2 + uZ2 + u^2 Z2 .
Des. Codes Cryptogr., 42:273–287, 2007.
[13] L. M. Adleman. Molecular computation of solutions to combinatorial problems.
Science, 266:1021–1024, 1994.
[36] M. Alekhnovich. More on average case vs. approximation complexity. In Proc. 44th
IEEE Symposium on Foundations of Computer Science, pages 298–307, Cambridge,
Massachusetts, October 2003.
[37] A. M. Alhakim. A simple combinatorial algorithm for de Bruijn sequences. Amer.
Math. Monthly, 117:728–732, 2010.
[38] P. Aliferis and J. Preskill. Fault-tolerant quantum computation against biased noise.
Physical Review A, 78(5):052331, November 2008.
[39] P. Almeida, D. Napp, and R. Pinto. A new class of superregular matrices and MDP
convolutional codes. Linear Algebra Appl., 439:2145–2157, 2013.
[40] N. Alon and F. R. K. Chung. Explicit construction of linear sized tolerant networks.
Discrete Math., 72:15–19, 1988.
[41] N. Alon, O. Goldreich, J. Håstad, and R. Peralta. Simple constructions of almost
k-wise independent random variables. Random Structures Algorithms, 3:289–304,
1992.
[42] N. Alon and M. G. Luby. A linear time erasure-resilient code with nearly optimal
recovery. IEEE Trans. Inform. Theory, 42:1732–1736, 1996.
[43] J. L. Alperin and R. B. Bell. Groups and Representations, volume 162 of Graduate
Texts in Mathematics. Springer-Verlag, New York, 1995.
[45] M. M. S. Alves, L. Panek, and M. Firer. Error-block codes and poset metrics. Adv.
Math. Commun., pages 95–111, 2008.
[46] S. A. Aly, A. Klappenecker, and P. K. Sarvepalli. Remarkable degenerate quantum
stabilizer codes derived from duadic codes. In Proc. IEEE International Symposium
on Information Theory, pages 1105–1108, Seattle, Washington, July 2006.
[88] C. Bachoc, A. Passuello, and F. Vallentin. Bounds for projective codes from semidef-
inite programming. Adv. Math. Comm., 7:127–145, 2013.
[89] C. Bachoc and F. Vallentin. New upper bounds for kissing numbers from semidefinite
programming. J. Amer. Math. Soc., 21:909–924, 2008.
[90] C. Bachoc and G. Zémor. Bounds for binary codes relative to pseudo-distances of k
points. Adv. Math. Commun., 4:547–565, 2010.
[91] B. Bagchi. On characterizing designs by their codes. In N. S. N. Sastry, editor,
Buildings, Finite Geometries and Groups, volume 10 of Springer Proc. Math., pages
1–14. Springer, New York, 2012.
[92] B. Bagchi and S. P. Inamdar. Projective geometric codes. J. Combin. Theory Ser.
A, 99:128–142, 2002.
[97] S. B. Balaji and P. V. Kumar. Bounds on the rate and minimum distance of codes
with availability. In Proc. IEEE International Symposium on Information Theory,
pages 3155–3159, Aachen, Germany, June 2017.
[98] S. B. Balaji and P. V. Kumar. A tight lower bound on the sub-packetization level
of optimal-access MSR and MDS codes. In Proc. IEEE International Symposium on
Information Theory, pages 2381–2385, Vail, Colorado, June 2018.
[99] A. Balatsoukas-Stimming, M. B. Parizi, and A. Burg. LLR-based successive can-
cellation list decoding of polar codes. In Proc. IEEE International Conference on
Acoustics, Speech and Signal Processing (ICASSP), pages 3903–3907, Florence, Italy,
May 2014.
[100] M. Baldi, A. Barenghi, F. Chiaraluce, G. Pelosi, and P. Santini. LEDAkem.
First round submission to the NIST post-quantum cryptography call:
https://csrc.nist.gov/CSRC/media/Projects/Post-Quantum-Cryptography/
documents/round-1/submissions/LEDAkem.zip, November 2017.
[101] M. Baldi, M. Bianchi, F. Chiaraluce, J. Rosenthal, and D. Schipani. Enhanced public
key security for the McEliece cryptosystem. J. Cryptology, 29:1–27, 2016.
[102] M. Baldi, M. Bodrato, and F. Chiaraluce. A new analysis of the McEliece cryp-
tosystem based on QC-LDPC codes. In R. Ostrovsky, R. De Prisco, and I. Visconti,
editors, Security and Cryptography for Networks, volume 5229 of Lecture Notes in
Comput. Sci., pages 246–262. Springer, Berlin, Heidelberg, 2008.
[137] V. Batagelj. Norms and distances over finite groups. Department of Mathematics,
University of Ljubljana, Yugoslavia, 1990.
[138] L. M. Batten and J. M. Dover. Some sets of type (m, n) in cubic order planes. Des.
Codes Cryptogr., 16:211–213, 1999.
[139] H. Bauer, B. Ganter, and E. Hergert. Algebraic techniques for nonlinear codes.
Combinatorica, 3:21–33, 1983.
[140] L. D. Baumert and R. J. McEliece. Weights of irreducible cyclic codes. Inform. and
Control, 20:158–175, 1972.
[141] L. D. Baumert, W. H. Mills, and R. L. Ward. Uniform cyclotomy. J. Number Theory,
14:67–82, 1982.
[142] Leonard D. Baumert. Cyclic Difference Sets, volume 182 of Lecture Notes in Math.
Springer-Verlag, Berlin-New York, 1971.
[143] E. Bayer-Fluckiger, F. Oggier, and E. Viterbo. New algebraic constructions of rotated
Z^n-lattice constellations for the Rayleigh fading channel. IEEE Trans. Inform.
Theory, 50:702–714, 2004.
[144] L. M. J. Bazzi, M. Mahdian, and D. Spielman. The minimum distance of turbo-like
codes. IEEE Trans. Inform. Theory, 55:6–15, 2009.
[145] L. M. J. Bazzi and S. K. Mitter. Some randomized code constructions from group
actions. IEEE Trans. Inform. Theory, 52:3210–3219, 2006.
[146] A. Becker, A. Joux, A. May, and A. Meurer. Decoding random binary linear codes
in 2^{n/20}: how 1 + 1 = 0 improves information set decoding. In D. Pointcheval and
T. Johansson, editors, Advances in Cryptology – EUROCRYPT 2012, volume 7237
of Lecture Notes in Comput. Sci., pages 520–536. Springer, Berlin, Heidelberg, 2012.
[147] P. Beelen. The order bound for general algebraic geometric codes. Finite Fields
Appl., 13:665–680, 2007.
[148] P. Beelen and T. Høholdt. The decoding of algebraic geometry codes. In Advances
in Algebraic Geometry Codes, volume 5 of Ser. Coding Theory Cryptol., pages 49–98.
World Sci. Publ., Hackensack, New Jersey, 2008.
[149] A. Beemer, S. Habib, C. A. Kelley, and J. Kliewer. A general approach to optimizing
SC-LDPC codes. In Proc. 55th Allerton Conference on Communication, Control, and
Computing, pages 672–679, Monticello, Illinois, October 2017.
[150] A. Beemer, K. Haymaker, and C. A. Kelley. Absorbing sets of codes from finite
geometries. Cryptogr. Commun., 11:1115–1131, 2019.
[151] A. Beemer and C. A. Kelley. Avoiding trapping sets in SC-LDPC codes under
windowed decoding. In Proc. IEEE International Symposium on Information Theory
and its Applications, pages 206–210, Monterey, California, October-November 2016.
[152] J.-C. Belfiore and A. M. Cipriano. Space-Time Coding for Non-Coherent Channels.
In H. Boelcskei, D. Gesbert, C. Papadias, and A. J. van der Veen, editors, Space-Time
Wireless Systems: From Array Processing to MIMO Communications, chapter 10.
Cambridge University Press, Cambridge, UK, 2006.
[153] J.-C. Belfiore and F. Oggier. An error probability approach to MIMO wiretap chan-
nels. IEEE Trans. Commun., 61:3396–3403, 2013.
[154] J.-C. Belfiore, F. Oggier, and E. Viterbo. Cyclic division algebras: a tool for space-
time coding. Found. Trends Commun. and Inform. Theory, 4:1–95, 2007.
[155] J.-C. Belfiore and G. Rekaya. Quaternionic lattices for space-time coding. In Proc.
IEEE Information Theory Workshop, pages 267–270, Paris, France, April 2003.
[156] J.-C. Belfiore, G. Rekaya, and E. Viterbo. The golden code: a 2×2 full-rate space-time
code with non-vanishing determinants. IEEE Trans. Inform. Theory, 51:1432–1436,
2005.
[157] B. I. Belov, V. N. Logačev, and V. P. Sandimirov. Construction of a class of lin-
ear binary codes that attain the Varšamov-Griesmer bound (in Russian). Prob-
lemy Peredači Informacii, 10(3):36–44, 1974. (English translation in Probl. Inform.
Transm., 10(3):211–217, 1974).
[158] E. Ben-Sasson, T. Etzion, A. Gabizon, and N. Raviv. Subspace polynomials and
cyclic subspace codes. IEEE Trans. Inform. Theory, 62:1157–1165, 2016.
[164] T. P. Berger and P. Loidreau. Designing an efficient and secure public-key cryptosys-
tem based on reducible rank codes. In A. Canteaut and K. Viswanathan, editors,
Progress in Cryptology – INDOCRYPT 2004, volume 3348 of Lecture Notes in Com-
put. Sci., pages 218–229. Springer, Berlin, Heidelberg, 2005.
[165] T. P. Berger and P. Loidreau. How to mask the structure of codes for a cryptographic
use. Des. Codes Cryptogr., 35:63–79, 2005.
[166] G. Berhuy and F. Oggier. On the existence of perfect space–time codes. IEEE Trans.
Inform. Theory, 55:2078–2082, 2009.
[167] E. R. Berlekamp. Negacyclic codes for the Lee metric. In R. C. Bose and T. A. Dowl-
ing, editors, Combinatorial Mathematics and its Applications (Proc. Conf., Univ.
North Carolina, Chapel Hill, NC, 1967), pages 298–316. Univ. North Carolina Press,
Chapel Hill, NC, 1969.
[168] E. R. Berlekamp. Goppa codes. IEEE Trans. Inform. Theory, 19:590–592, 1973.
[169] E. R. Berlekamp, editor. Key Papers in the Development of Coding Theory. IEEE
Press, New York, 1974.
[170] E. R. Berlekamp. Algebraic Coding Theory. World Scientific Publishing Co. Pte.
Ltd., Hackensack, New Jersey, revised edition, 2015.
[173] S. D. Berman. Semisimple cyclic and abelian codes. II (in Russian). Kibernetika
(Kiev), 3:21–30, 1967. (English translation in Cybernetics, 3:17–23, 1967).
[174] S. D. Berman. On the theory of group codes. Cybernetics, 3:25–31, 1969.
[175] J. J. Bernal, Á. del Río, and J. J. Simón. An intrinsical description of group codes.
Des. Codes Cryptogr., 51:289–300, 2009.
[176] F. Bernhardt, P. Landrock, and O. Manz. The extended Golay codes considered as
ideals. J. Combin. Theory Ser. A, 55:235–246, 1990.
[177] D. J. Bernstein, T. Chou, T. Lange, I. von Maurich, R. Misoczki, R. Niederhagen,
E. Persichetti, C. Peters, P. Schwabe, N. Sendrier, J. Szefer, and W. Wang. Classic
McEliece: conservative code-based cryptography. Second round submission to the
NIST post-quantum cryptography call. https://classic.mceliece.org, March
2019.
[178] D. J. Bernstein, T. Lange, and C. Peters. Attacking and defending the McEliece
cryptosystem. In J. Buchmann and J. Ding, editors, Post-Quantum Cryptography,
volume 5299 of Lecture Notes in Comput. Sci., pages 31–46. Springer, Berlin, Hei-
delberg, 2008.
[179] D. J. Bernstein, T. Lange, and C. Peters. Smaller decoding exponents: ball-collision
decoding. In P. Rogaway, editor, Advances in Cryptology – CRYPTO 2011, volume
6841 of Lecture Notes in Comput. Sci., pages 743–760. Springer, Berlin, Heidelberg,
2011.
[180] D. J. Bernstein, T. Lange, and C. Peters. Wild McEliece. In A. Biryukov, G. Gong,
and D. R. Stinson, editors, Selected Areas in Cryptography – SAC 2010, volume 6544
of Lecture Notes in Comput. Sci., pages 143–158. Springer, Berlin, Heidelberg, 2011.
[181] D. J. Bernstein, T. Lange, C. Peters, and P. Schwabe. Really fast syndrome-
based hashing. In A. Nitaj and D. Pointcheval, editors, Progress in Cryptology –
AFRICACRYPT 2011, volume 6737 of Lecture Notes in Comput. Sci., pages 134–
152. Springer, Berlin, Heidelberg, 2011.
[182] D. J. Bernstein, T. Lange, C. Peters, and H. C. A. van Tilborg. Explicit bounds for
generic decoding algorithms for code-based cryptography. In Pre-Proc. International
Workshop on Coding and Cryptography, pages 168–180, Ullensvang, Norway, May
2009.
[183] C. Berrou, A. Glavieux, and P. Thitimajshima. Near Shannon limit error correcting
coding and decoding. In Proc. International Communications Conference, pages
1064–1070, Geneva, Switzerland, May 1993.
[184] M. Bertilsson and I. Ingemarsson. A construction of practical secret sharing schemes
using linear block codes. In J. Seberry and Y. Zheng, editors, Advances in Cryptol-
ogy – AUSCRYPT ’92, volume 718 of Lecture Notes in Comput. Sci., pages 67–79.
Springer, Berlin, Heidelberg, 1993.
[185] M. R. Best. Perfect codes hardly exist. IEEE Trans. Inform. Theory, 29:349–351,
1983.
[186] T. Beth and C. Ding. On almost perfect nonlinear permutations. In T. Helleseth,
editor, Advances in Cryptology – EUROCRYPT ’93, volume 765 of Lecture Notes in
Comput. Sci., pages 65–76. Springer, Berlin, Heidelberg, 1994.
[187] T. Beth, D. Jungnickel, and H. Lenz. Design Theory. Cambridge University Press,
Cambridge, UK, 2nd edition, 1999.
[188] K. Betsumiya, T. A. Gulliver, and M. Harada. On binary extremal formally self-dual
even codes. Kyushu J. Math., 53:421–430, 1999.
[189] K. Betsumiya, M. Harada, and A. Munemasa. A complete classification of doubly
even self-dual codes of length 40. Electronic J. Combin., 19:#P18 (12 pages), 2012.
[190] A. Betten, M. Braun, H. Fripertinger, A. Kerber, A. Kohnert, and A. Wassermann.
Error-Correcting Linear Codes, volume 18 of Algorithms and Computation in Math-
ematics. Springer-Verlag, Berlin, 2006. Classification by isometry and applications,
with 1 CD-ROM (Windows and Linux).
[191] A. Betten, H. Fripertinger, A. Kerber, A. Wassermann, and K.-H. Zimmermann.
Codierungstheorie – Konstruktion und Anwendung linearer Codes. Springer-Verlag,
Heidelberg, 1998.
[192] A. Betten, R. Laue, and A. Wassermann. Simple 7-designs with small parameters.
J. Combin. Des., 7:79–94, 1999.
[193] A. Beutelspacher. Partial spreads in finite projective spaces and partial designs.
Math. Z., 145:211–229, 1975.
[194] A. Beutelspacher. Correction to: “Partial spreads in finite projective spaces and
partial designs” [Math. Z., 145:211–229, 1975]. Math. Z., 147:303, 1976.
[195] A. Beutelspacher. On t-covers in finite projective spaces. J. Geom., 12:10–16, 1979.
[196] J. Bierbrauer. The theory of cyclic codes and a generalization to additive codes.
Des. Codes Cryptogr., 25:189–206, 2002.
[197] J. Bierbrauer. Cyclic additive and quantum stabilizer codes. In C. Carlet and
B. Sunar, editors, Arithmetic of Finite Fields, volume 4547 of Lecture Notes in Com-
put. Sci., pages 276–283. Springer, Berlin, Heidelberg, 2007.
[198] J. Bierbrauer. Cyclic additive codes. J. Algebra, 372:661–672, 2012.
[199] J. Bierbrauer. Introduction to Coding Theory. Discrete Mathematics and its Appli-
cations. Chapman & Hall/CRC, Boca Raton, Florida, 2nd edition, 2017.
[216] F. L. Blasco, G. Liva, and G. Bauch. LT code design for inactivation decoding.
In Proc. IEEE Information Theory Workshop, pages 441–445, Hobart, Tasmania,
November 2014.
[217] M. Blaum, P. G. Farrell, and H. C. A. van Tilborg. Array Codes. In V. S. Pless and
W. C. Huffman, editors, Handbook of Coding Theory, Vol. I, II, chapter 22, pages
1855–1909. North-Holland, Amsterdam, 1998.
[218] M. Blaum, J. L. Hafner, and S. Hetzler. Partial-MDS codes and their application to
RAID type of architectures. IEEE Trans. Inform. Theory, 59:4510–4519, 2013.
[235] A. Bonnecaze, P. Solé, C. Bachoc, and B. Mourrain. Type II codes over Z4 . IEEE
Trans. Inform. Theory, 43:969–976, 1997.
[236] A. Bonnecaze, P. Solé, and A. R. Calderbank. Quaternary quadratic residue codes
and unimodular lattices. IEEE Trans. Inform. Theory, 41:366–377, 1995.
[237] A. Borel. Linear Algebraic Groups, volume 126 of Graduate Texts in Mathematics.
Springer-Verlag, New York, 2nd edition, 1991.
[238] M. Borello. The automorphism group of a self-dual [72, 36, 16] binary code does not
contain elements of order 6. IEEE Trans. Inform. Theory, 58:7240–7245, 2012.
[239] M. Borello. The automorphism group of a self-dual [72, 36, 16] code is not an ele-
mentary abelian group of order 8. Finite Fields Appl., 25:1–7, 2014.
[251] M. Bossert. On decoding using codewords of the dual code. arXiv:2001.02956 [cs.IT],
January 2020.
[252] M. Bossert and V. Sidorenko. Singleton-type bounds for blot-correcting codes. IEEE
Trans. Inform. Theory, 42:1021–1023, 1996.
[253] L. Both and A. May. Optimizing BJMM with nearest neighbors: full decoding in
2^(2/21)n and McEliece security. In Proc. 10th International Workshop on Coding and
Cryptography, Saint Petersburg, Russia, September 2017. Available at http://cits.
rub.de/imperia/md/content/may/paper/bjmm+.pdf.
[254] L. Both and A. May. Decoding linear codes with high error rate and its impact for
LPN security. In T. Lange and R. Steinwandt, editors, Post-Quantum Cryptogra-
phy, volume 10786 of Lecture Notes in Comput. Sci., pages 25–46. Springer, Berlin,
Heidelberg, 2018.
[255] D. Boucher. Construction and number of self-dual skew codes over Fp2. Adv. Math.
Commun., 10:765–795, 2016.
[256] D. Boucher, W. Geiselmann, and F. Ulmer. Skew-cyclic codes. Appl. Algebra Engrg.
Comm. Comput., 18:379–389, 2007.
[257] D. Boucher, P. Solé, and F. Ulmer. Skew constacyclic codes over Galois rings. Adv.
Math. Commun., 2:273–292, 2008.
[258] D. Boucher and F. Ulmer. Codes as modules over skew polynomial rings. In M. G.
Parker, editor, Cryptography and Coding, volume 5921 of Lecture Notes in Comput.
Sci., pages 38–55. Springer, Berlin, Heidelberg, 2009.
[259] D. Boucher and F. Ulmer. Coding with skew polynomial rings. J. Symbolic Comput.,
44:1644–1656, 2009.
[260] D. Boucher and F. Ulmer. Linear codes using skew polynomials with automorphisms
and derivations. Des. Codes Cryptogr., 70:405–431, 2014.
[261] D. Boucher and F. Ulmer. Self-dual skew codes and factorization of skew polynomials.
J. Symbolic Comput., 60:47–61, 2014.
[262] M. Boulagouaz and A. Leroy. (σ, δ)-codes. Adv. Math. Commun., 7:463–474, 2013.
[263] S. Boumova, P. G. Boyvalenkov, and D. P. Danev. Necessary conditions for existence
of some designs in polynomial metric spaces. European J. Combin., 20:213–225, 1999.
[264] J. Boutros, E. Viterbo, C. Rastello, and J.-C. Belfiore. Good lattice constellations for
both Rayleigh fading and Gaussian channels. IEEE Trans. Inform. Theory, 42:502–
518, 1996.
[265] I. Bouyukliev. About the code equivalence. In T. Shaska, W. C. Huffman, D. Joyner,
and V. Ustimenko, editors, Advances in Coding Theory and Cryptography, volume 3
of Ser. Coding Theory Cryptol., pages 126–151. World Sci. Publ., Hackensack, New
Jersey, 2007.
[266] I. Bouyukliev. What is Q-extension? Serdica J. Comput., 1:115–130, 2007.
[267] I. Bouyukliev. Classification of Griesmer codes and dual transform. Discrete Math.,
309:4049–4068, 2009.
[294] M. Braun. Designs over the binary field from the complete monomial group. Aus-
tralas. J. Combin., 67:470–475, 2017.
[295] M. Braun, T. Etzion, P. R. J. Östergård, A. Vardy, and A. Wassermann. Existence
of q-analogs of Steiner systems. Forum Math. Pi, 4:e7, 14, 2016.
[300] M. Braun, P. R. J. Östergård, and A. Wassermann. New lower bounds for binary
constant-dimension subspace codes. Experiment. Math., 27:179–183, 2018.
[301] S. Bravyi, M. Suchara, and A. Vargo. Efficient algorithms for maximum likelihood
decoding in the surface code. Physical Review A, 90(3):032326, September 2014.
[302] E. F. Brickell. Some ideal secret sharing schemes. In J.-J. Quisquater and J. Van-
dewalle, editors, Proc. Advances in Cryptology – EUROCRYPT ’89, volume 434 of
Lecture Notes in Comput. Sci., pages 468–475. Springer, Berlin, Heidelberg, 1990.
[303] J. D. Bridwell and J. K. Wolf. Burst distance and multiple-burst correction. Bell
System Tech. J., 49:889–909, 1970.
[304] J. Bringer, C. Carlet, H. Chabanne, S. Guilley, and H. Maghrebi. Orthogonal direct
sum masking: a smartcard friendly computation paradigm in a code, with builtin
protection against side-channel and fault attacks. In D. Naccache and D. Sauveron,
editors, Information Security Theory and Practice. Securing the Internet of Things,
volume 8501 of Lecture Notes in Comput. Sci., pages 40–56. Springer, Berlin, Hei-
delberg, 2014.
[305] A. E. Brouwer. Personal homepage. http://www.win.tue.nl//~aeb/. Accessed:
2018-08-31.
[306] A. E. Brouwer. Some unitals on 28 points and their embeddings in projective planes
of order 9. In M. Aigner and D. Jungnickel, editors, Geometries and Groups, volume
893 of Lecture Notes in Math., pages 183–188. Springer-Verlag, Berlin, Heidelberg,
1981.
[307] A. E. Brouwer. Some new two-weight codes and strongly regular graphs. Discrete
Appl. Math., 10:111–114, 1985.
[308] A. E. Brouwer. Strongly regular graphs. In C. J. Colbourn and J. H. Dinitz, edi-
tors, The CRC Handbook of Combinatorial Designs, chapter VII.11, pages 852–868.
Chapman & Hall/CRC Press, 2nd edition, 2007.
[309] A. E. Brouwer, A. M. Cohen, and A. Neumaier. Distance-Regular Graphs, volume 18
of Ergebnisse der Mathematik und ihrer Grenzgebiete (3) [Results in Mathematics
and Related Areas (3)]. Springer-Verlag, Berlin, 1989.
[310] A. E. Brouwer, A. V. Ivanov, and M. H. Klin. Some new strongly regular graphs.
Combinatorica, 9:339–344, 1989.
[311] A. E. Brouwer and S. C. Polak. Uniqueness of codes using semidefinite programming.
Des. Codes Cryptogr., 87:1881–1895, 2019.
[312] A. E. Brouwer and L. M. G. M. Tolhuizen. A sharpening of the Johnson bound for
binary linear codes and the nonexistence of linear codes with Preparata parameters.
Des. Codes Cryptogr., 3:95–98, 1993.
[313] A. E. Brouwer and M. van Eupen. The correspondence between projective codes
and 2-weight codes. Des. Codes Cryptogr., 11:261–266, 1997.
[314] A. E. Brouwer, R. M. Wilson, and Q. Xiang. Cyclotomy and strongly regular graphs.
J. Algebraic Combin., 10:25–28, 1999.
[315] R. A. Brualdi, J. S. Graves, and K. M. Lawrence. Codes with a poset metric. Discrete
Math., 147:57–72, 1995.
[316] A. A. Bruen and J. A. Thas. Hyperplane coverings and blocking sets. Math. Z.,
181:407–409, 1982.
[317] T. A. Brun, I. Devetak, and M.-H. Hsieh. Correcting quantum errors with entangle-
ment. Science, 314(5798):436–439, 2006.
[318] F. Buekenhout. Existence of unitals in finite translation planes of order q2 with a
kernel of order q. Geom. Dedicata, 5:189–194, 1976.
[319] K. A. Bush. Orthogonal arrays of index unity. Ann. Math. Stat., 23:279–290, 1952.
[320] S. Buzaglo and T. Etzion. Bounds on the size of permutation codes with the Kendall
τ-metric. IEEE Trans. Inform. Theory, 61:3241–3250, 2015.
[321] J. W. Byers, M. G. Luby, M. Mitzenmacher, and A. Rege. A digital fountain ap-
proach to reliable distribution of bulk data. In Proc. ACM SIGCOMM Conference
on Applications, Technologies, Architectures, and Protocols for Computer Communi-
cation, pages 56–67, Vancouver, Canada, August-September 1998. ACM Press, New
York.
[322] E. Byrne. Lifting decoding schemes over a Galois ring. In S. Boztaş and I. E. Shpar-
linski, editors, Applied Algebra, Algebraic Algorithms and Error-Correcting Codes,
volume 2227 of Lecture Notes in Comput. Sci., pages 323–332. Springer, Berlin, Hei-
delberg, 2001.
[323] E. Byrne. Decoding a class of Lee metric codes over a Galois ring. IEEE Trans.
Inform. Theory, 48:966–975, 2002.
[324] E. Byrne and P. Fitzpatrick. Gröbner bases over Galois rings with an application to
decoding alternant codes. J. Symbolic Comput., 31:565–584, 2001.
[325] E. Byrne and P. Fitzpatrick. Hamming metric decoding of alternant codes over
Galois rings. IEEE Trans. Inform. Theory, 48:683–694, 2002.
[326] V. R. Cadambe, S. A. Jafar, H. Maleki, K. Ramchandran, and C. Suh. Asymptotic
interference alignment for optimal repair of MDS codes in distributed storage. IEEE
Trans. Inform. Theory, 59:2974–2987, 2013.
[327] V. R. Cadambe and A. Mazumdar. Bounds on the size of locally recoverable codes.
IEEE Trans. Inform. Theory, 61:5787–5794, 2015.
[328] Y. Cai and C. Ding. Binary sequences with optimal autocorrelation. Theoret. Com-
put. Sci., 410:2316–2322, 2009.
[333] S. Calkavur and P. Solé. Multisecret-sharing schemes and bounded distance decoding
of linear codes. Internat. J. Compt. Math., 94:107–114, 2017.
[349] C. Carlet. Boolean Functions for Cryptography and Coding Theory. Cambridge
University Press, Cambridge, UK, 2020.
[350] C. Carlet, P. Charpin, and V. A. Zinoviev. Codes, bent functions and permutations
suitable for DES-like cryptosystems. Des. Codes Cryptogr., 15:125–156, 1998.
[351] C. Carlet and C. Ding. Highly nonlinear mappings. J. Complexity, 20:205–244, 2004.
[352] C. Carlet, C. Ding, and J. Yuan. Linear codes from perfect nonlinear mappings and
their secret sharing schemes. IEEE Trans. Inform. Theory, 51:2089–2102, 2005.
[353] C. Carlet and S. Guilley. Complementary dual codes for counter-measures to side-
channel attacks. In R. Pinto, P. R. Malonek, and P. Vettori, editors, Coding Theory
and Applications, volume 3 of CIM Ser. Math. Sci., pages 97–105. Springer, Cham,
2015.
[354] C. Carlet, C. Güneri, F. Özbudak, B. Özkaya, and P. Solé. On linear complementary
pairs of codes. IEEE Trans. Inform. Theory, 64:6583–6589, 2018.
[355] C. Carlet and S. Mesnager. Four decades of research on bent functions. Des. Codes
Cryptogr., 78:5–50, 2016.
[356] X. Caruso and J. Le Borgne. A new faster algorithm for factoring skew polynomials
over finite fields. J. Symbolic Comput., 79:411–443, 2017.
[357] I. Cascudo. On squares of cyclic codes. IEEE Trans. Inform. Theory, 65:1034–1047,
2019.
[358] I. Cascudo, H. Chen, R. Cramer, and C. Xing. Asymptotically good ideal linear
secret sharing with strong multiplication over any fixed finite field. In S. Halevi,
editor, Advances in Cryptology—CRYPTO 2009, volume 5677 of Lecture Notes in
Comput. Sci., pages 466–486. Springer, Berlin, Heidelberg, 2009.
[363] L. R. A. Casse and D. G. Glynn. The solution to Beniamino Segre’s problem Ir,q,
r = 3, q = 2h. Geom. Dedicata, 13:157–163, 1982.
[364] L. R. A. Casse and D. G. Glynn. On the uniqueness of (q + 1)4-arcs of PG(4, q),
q = 2h, h ≥ 3. Discrete Math., 48:173–186, 1984.
[365] Y. Cassuto and M. Blaum. Codes for symbol-pair read channels. In Proc. IEEE In-
ternational Symposium on Information Theory, pages 988–992, Austin, Texas, June
2010.
[366] Y. Cassuto and M. Blaum. Codes for symbol-pair read channels. IEEE Trans.
Inform. Theory, 57:8011–8020, 2011.
[367] Y. Cassuto and S. Litsyn. Symbol-pair codes: algebraic constructions and asymptotic
bounds. In A. Kuleshov, V. M. Blinovsky, and A. Ephremides, editors, Proc. IEEE
International Symposium on Information Theory, pages 2348–2352, St. Petersburg,
Russia, July-August 2011.
[368] Y. Cassuto and M. A. Shokrollahi. Online fountain codes with low overhead. IEEE
Trans. Inform. Theory, 61:3137–3149, June 2015.
[369] G. Castagnoli, J. L. Massey, P. A. Schoeller, and N. von Seemann. On repeated-root
cyclic codes. IEEE Trans. Inform. Theory, 37:337–342, 1991.
[380] S. Chang and J. Y. Hyun. Linear codes from simplicial complexes. Des. Codes
Cryptogr., 86:2167–2181, 2018.
[381] Z. Chang, J. Chrisnata, M. F. Ezerman, and H. M. Kiah. Rates of DNA sequence
profiles for practical values of read lengths. IEEE Trans. Inform. Theory, 63:7166–
7177, 2017.
[384] Z. Chang, M. F. Ezerman, S. Ling, and H. Wang. On binary de Bruijn sequences from
LFSRs with arbitrary characteristic polynomials. Des. Codes Cryptogr., 87:1137–
1160, 2019.
[385] P. Charpin. Codes Idéaux de Certaines Algèbres Modulaires, Thèse de 3ième cycle.
Université de Paris VII, 1982.
[390] B. Chen, H. Q. Dinh, H. Liu, and L. Wang. Constacyclic codes of length 2ps over
Fpm + uFpm . Finite Fields Appl., 37:108–130, 2016.
[391] B. Chen, S. Ling, and G. Zhang. Application of constacyclic codes to quantum MDS
codes. IEEE Trans. Inform. Theory, 61:1474–1484, 2015.
[392] B. Chen, S.-T. Xia, J. Hao, and F.-W. Fu. Constructions of optimal cyclic (r, δ)
locally repairable codes. IEEE Trans. Inform. Theory, 64:2499–2511, 2018.
[393] C. L. Chen, W. W. Peterson, and E. J. Weldon, Jr. Some results on quasi-cyclic
codes. Inform. and Control, 15:407–423, 1969.
[394] H. Chen and R. Cramer. Algebraic geometric secret sharing schemes and se-
cure multi-party computations over small fields. In C. Dwork, editor, Advances in
Cryptology—CRYPTO 2006, volume 4117 of Lecture Notes in Comput. Sci., pages
521–536. Springer, Berlin, Heidelberg, 2006.
[395] H. Chen, S. Ling, and C. Xing. Quantum codes from concatenated algebraic-
geometric codes. IEEE Trans. Inform. Theory, 51:2915–2920, 2005.
[396] J. Chen, A. Dholakia, E. Eleftheriou, M. P. C. Fossorier, and X. Hu. Reduced-
complexity decoding of LDPC codes. IEEE Trans. Commun., 53:1288–1299, 2005.
[415] J.-J. Climent, D. Napp, C. Perea, and R. Pinto. Maximum distance separable 2D
convolutional codes. IEEE Trans. Inform. Theory, 62:669–680, 2016.
[416] J.-J. Climent, D. Napp, R. Pinto, and R. Simões. Series concatenation of 2D con-
volutional codes. In Proc. IEEE 9th International Workshop on Multidimensional
(nD) Systems (nDS), Vila Real, Portugal, 6 pages, September 2015.
[417] J. T. Coffey and R. M. Goodman. The complexity of information set decoding. IEEE
Trans. Inform. Theory, 36:1031–1037, 1990.
[418] J. T. Coffey, R. M. Goodman, and P. G. Farrell. New approaches to reduced-
complexity decoding. Discrete Appl. Math., 33:43–60, 1991.
[419] G. D. Cohen and S. Encheva. Efficient constructions of frameproof codes. Electron.
Lett., 36:1840–1842, 2000.
[420] G. D. Cohen, S. Mesnager, and A. Patey. On minimal and quasi-minimal linear
codes. In M. Stam, editor, Cryptography and Coding, volume 8308 of Lecture Notes
in Comput. Sci., pages 85–98. Springer, Berlin, Heidelberg, 2013.
[421] H. Cohn, J. H. Conway, N. D. Elkies, and A. Kumar. The D4 root system is not
universally optimal. Experiment. Math., 16:313–320, 2007.
[422] H. Cohn and M. de Courcy-Ireland. The Gaussian core model in high dimensions.
Duke Math. J., 167:2417–2455, 2018.
[423] H. Cohn and N. D. Elkies. New upper bounds on sphere packings I. Ann. of Math.,
157:689–714, 2003.
[424] H. Cohn and A. Kumar. Universally optimal distribution of points on spheres. J.
Amer. Math. Soc., 20:99–148, 2007.
[425] H. Cohn and J. Woo. Three-point bounds for energy minimization. J. Amer. Math.
Soc., 25:929–958, 2012.
[426] H. Cohn and Y. Zhao. Energy-minimizing error-correcting codes. IEEE Trans.
Inform. Theory, 60:7442–7450, 2014.
[427] P. M. Cohn. Free Rings and Their Relations, volume 19 of London Math. Soc.
Monographs. Academic Press, Inc. [Harcourt Brace Jovanovich, Publishers], London,
2nd edition, 1985.
[428] C. J. Colbourn and J. H. Dinitz, editors. The CRC Handbook of Combinatorial
Designs. Chapman & Hall/CRC Press, 2nd edition, 2007.
[429] P. Compeau, P. Pevzner, and G. Tesler. How to apply de Bruijn graphs to genome
assembly. Nature Biotechnology, 29:987–991, 2011.
[430] I. Constantinescu. Lineare Codes über Restklassenringen ganzer Zahlen und ihre
Automorphismen bezüglich einer verallgemeinerten Hamming-Metrik. PhD thesis,
Technische Universität, München, Germany, 1995.
[431] I. Constantinescu and W. Heise. A metric for codes over residue class rings of integers
(in Russian). Problemy Peredači Informacii, 33(3):22–28, 1997. (English translation
in Probl. Inform. Transm., 33(3):208–213, 1997).
[438] J. H. Conway, V. S. Pless, and N. J. A. Sloane. Self-dual codes over GF(3) and
GF(4) of length not exceeding 16. IEEE Trans. Inform. Theory, 25:312–322, 1979.
[439] J. H. Conway, V. S. Pless, and N. J. A. Sloane. The binary self-dual codes of length
up to 32: a revised enumeration. J. Combin. Theory Ser. A, 60:183–195, 1992.
[440] J. H. Conway and N. J. A. Sloane. A new upper bound on the minimal distance of
self-dual codes. IEEE Trans. Inform. Theory, 36:1319–1333, 1990.
[441] J. H. Conway and N. J. A. Sloane. Self-dual codes over the integers modulo 4. J.
Combin. Theory Ser. A, 62:30–45, 1993.
[442] J. H. Conway and N. J. A. Sloane. Sphere Packings, Lattices and Groups, volume
290 of Grundlehren der Mathematischen Wissenschaften [Fundamental Principles
of Mathematical Sciences]. Springer-Verlag, New York, 3rd edition, 1999. With
additional contributions by Ei. Bannai, R. E. Borcherds, J. Leech, S. P. Norton, A.
M. Odlyzko, R. A. Parker, L. Queen and B. B. Venkov.
[443] D. Coppersmith and M. Sudan. Reconstructing curves in three (and higher) dimen-
sional spaces from noisy data. In Proc. 35th Annual ACM Symposium on the Theory
of Computing, pages 136–142, San Diego, California, June 2003. ACM Press, New
York.
[444] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein. Introduction to Algorithms.
MIT Press, Cambridge, Massachusetts; McGraw-Hill Book Co., Boston, 2nd edition,
2001.
[447] A. Cossidente and G. Marino. Veronese embedding and two-character sets. Des.
Codes Cryptogr., 42:103–107, 2007.
[448] A. Cossidente, G. Marino, and F. Pavese. Non-linear maximum rank distance codes.
Des. Codes Cryptogr., 79:597–609, 2016.
[449] A. Cossidente and F. Pavese. On subspace codes. Des. Codes Cryptogr., 78:527–531,
2016.
[450] A. Cossidente and F. Pavese. Veronese subspace codes. Des. Codes Cryptogr., 81:445–
457, 2016.
[451] A. Cossidente and F. Pavese. Subspace codes in PG(2N − 1, q). Combinatorica,
37:1073–1095, 2017.
[452] A. Cossidente, F. Pavese, and L. Storme. Geometrical aspects of subspace codes. In
M. Greferath, M. Pavčević, N. Silberstein, and M. Á. Vázquez-Castro, editors, Net-
work Coding and Subspace Designs. Signals and Communication Technology, Signals
Commun. Technol., pages 107–129. Springer, Cham, 2018.
[453] A. Cossidente and H. Van Maldeghem. The simple exceptional group G2(q), q even,
and two-character sets. J. Combin. Theory Ser. A, 114:964–969, 2007.
[454] S. I. R. Costa, M. M. S. Alves, E. Agustini, and R. Palazzo. Graphs, tessellations,
and perfect codes on flat tori. IEEE Trans. Inform. Theory, 50:2363–2377, 2004.
[455] S. I. R. Costa, F. Oggier, A. Campello, J.-C. Belfiore, and E. Viterbo. Lattices Applied
to Coding for Reliable and Secure Communications. SpringerBriefs in Mathematics.
Springer, Cham, 2017.
[461] A. Couvreur, M. Lequesne, and J.-P. Tillich. Recovering short secret keys of RLCE
in polynomial time. In J. Ding and R. Steinwandt, editors, Post-Quantum Cryp-
tography, volume 11505 of Lecture Notes in Comput. Sci., pages 133–152. Springer,
Berlin, Heidelberg, 2019.
[462] A. Couvreur, I. Márquez-Corbella, and R. Pellikaan. Cryptanalysis of McEliece
cryptosystem based on algebraic geometry codes and their subcodes. IEEE Trans.
Inform. Theory, 63:5404–5418, 2017.
[463] A. Couvreur, A. Otmani, and J.-P. Tillich. Polynomial time attack on wild McEliece
over quadratic extensions. In P. Q. Nguyen and E. Oswald, editors, Advances in
Cryptology – EUROCRYPT 2014, volume 8441 of Lecture Notes in Comput. Sci.,
pages 17–39. Springer, Berlin, Heidelberg, 2014.
[464] A. Couvreur, A. Otmani, and J.-P. Tillich. Polynomial time attack on wild McEliece
over quadratic extensions. IEEE Trans. Inform. Theory, 63:404–427, 2017.
[470] L. Creedon and K. Hughes. Derivations on group algebras with coding theory ap-
plications. Finite Fields Appl., 56:247–265, 2019.
[471] H. S. Cronie and S. B. Korada. Lossless source coding with polar codes. In
Proc. IEEE International Symposium on Information Theory, pages 904–908, Austin,
Texas, June 2010.
[472] E. Croot, V. F. Lev, and P. P. Pach. Progression-free sets in Z4n are exponentially
small. Ann. of Math. (2), 185:331–337, 2017.
[473] A. Cross, G. Smith, J. A. Smolin, and B. Zeng. Codeword stabilized quantum codes.
IEEE Trans. Inform. Theory, 55:433–438, 2009.
[474] B. Csajbók, G. Marino, O. Polverino, and F. Zullo. Maximum scattered linear sets
and MRD-codes. J. Algebraic Combin., 46:517–531, 2017.
[475] B. Csajbók and A. Siciliano. Puncturing maximum rank distance codes. J. Algebraic
Combin., 49:507–534, 2019.
[479] M. O. Damen, A. Chkeif, and J.-C. Belfiore. Lattice code decoder for space-time
codes. IEEE Commun. Letters, 4:161–163, 2000.
[480] M. O. Damen, A. Tewfik, and J.-C. Belfiore. A construction of a space-time code
based on number theory. IEEE Trans. Inform. Theory, 48:753–760, 2002.
[481] F. J. Damerau. A technique for computer detection and correction of spelling errors.
Commun. ACM, 7:171–176, 1964.
[482] I. B. Damgård and P. Landrock. Ideals and Codes in Group Algebras. Aarhus
Preprint Series, 1986.
[483] L. E. Danielsen and M. G. Parker. Edge local complementation and equivalence of
binary linear codes. Des. Codes Cryptogr., 49:161–170, 2008.
[486] R. N. Daskalov, T. A. Gulliver, and E. Metodieva. New ternary linear codes. IEEE
Trans. Inform. Theory, 45:1687–1688, 1999.
[487] B. K. Dass, N. Sharma, and R. Verma. Perfect codes in poset spaces and poset block
spaces. Finite Fields Appl., 46:90–106, 2017.
[488] A. Datta and F. Oggier. An overview of codes tailor-made for better repairability in
networked distributed storage systems. ACM SIGACT News, 44:89–105, 2013.
[489] H. Dau, I. M. Duursma, H. M. Kiah, and O. Milenkovic. Repairing Reed-Solomon
codes with multiple erasures. IEEE Trans. Inform. Theory, 64:6567–6582, 2018.
[490] M. C. Davey and D. J. C. MacKay. Low density parity check codes over GF(q).
IEEE Commun. Letters, 2:165–167, 1998.
[491] G. Davidoff, P. Sarnak, and A. Valette. Elementary Number Theory, Group Theory,
and Ramanujan Graphs, volume 55 of London Math. Soc. Student Texts. Cambridge
University Press, Cambridge, UK, 2003.
[492] P. J. Davis. Circulant Matrices. John Wiley & Sons, New York-Chichester-Brisbane,
1979.
[493] P. Dayal and M. K. Varanasi. An optimal two transmit antenna space-time code
and its stacked extensions. In Proc. 37th Asilomar Conference on Signals, Systems
and Computers, volume 1, pages 987–991, Pacific Grove, California, November 2003.
IEEE, Piscataway, New Jersey.
[494] P. Dayal and M. K. Varanasi. Distributed QAM-based space-time block codes for
efficient cooperative multiple-access communication. IEEE Trans. Inform. Theory,
54:4342–4354, 2008.
[495] J. De Beule, K. Metsch, and L. Storme. Characterization results on weighted mini-
hypers and on linear codes meeting the Griesmer bound. Adv. Math. Commun.,
2:261–272, 2008.
[514] T. Debris-Alazard, N. Sendrier, and J.-P. Tillich. WAVE: a new family of trapdoor
one-way preimage sampleable functions based on codes. In S. D. Galbraith and
S. Moriai, editors, Advances in Cryptology – ASIACRYPT 2019, Part 1, volume
11921 of Lecture Notes in Comput. Sci., pages 21–51. Springer, Berlin, Heidelberg,
2019.
[515] T. Debris-Alazard and J.-P. Tillich. Two attacks on rank metric code-based schemes:
RankSign and an IBE scheme. In T. Peyrin and S. D. Galbraith, editors, Advances in
Cryptology – ASIACRYPT 2018, Part 1, volume 11272 of Lecture Notes in Comput.
Sci., pages 62–92. Springer, Berlin, Heidelberg, 2018.
[516] L. Decreusefond and G. Zémor. On the error-correcting capabilities of cycle codes of
graphs. In Proc. IEEE International Symposium on Information Theory, page 307,
Trondheim, Norway, June-July 1994.
[517] P. Delsarte. Bounds for unrestricted codes, by linear programming. Philips Res.
Rep., 27:272–289, 1972.
[518] P. Delsarte. Weights of linear codes and strongly regular normed spaces. Discrete
Math., 3:47–64, 1972.
[519] P. Delsarte. An algebraic approach to the association schemes of coding theory.
Philips Res. Rep. Suppl., 10:vi+97, 1973.
[520] P. Delsarte. An Algebraic Approach to the Association Schemes of Coding Theory.
PhD thesis, Université Catholique de Louvain, 1973.
[521] P. Delsarte. Four fundamental parameters of a code and their combinatorial signifi-
cance. Inform. and Control, 23:407–438, 1973.
[522] P. Delsarte. On subfield subcodes of modified Reed-Solomon codes. IEEE Trans.
Inform. Theory, 21:575–576, 1975.
[523] P. Delsarte. Bilinear forms over a finite field, with applications to coding theory. J.
Combin. Theory Ser. A, 25:226–241, 1978.
[524] P. Delsarte and J.-M. Goethals. Alternating bilinear forms over GF(q). J. Combin.
Theory Ser. A, 19:26–50, 1975.
[525] P. Delsarte and J.-M. Goethals. Unrestricted codes with the Golay parameters are
unique. Discrete Math., 12:211–224, 1975.
[526] P. Delsarte, J.-M. Goethals, and F. J. MacWilliams. On generalized Reed-Muller
codes and their relatives. Inform. and Control, 16:403–442, 1970.
[527] P. Delsarte, J.-M. Goethals, and J. J. Seidel. Spherical codes and designs. Geom.
Dedicata, 6:363–388, 1977.
[528] P. Delsarte and V. I. Levenshtein. Association schemes and coding theory. IEEE
Trans. Inform. Theory, 44:2477–2504, 1998.
[529] J.-C. Deneuville and P. Gaborit. Cryptanalysis of a code-based one-time signature.
Des. Codes Cryptogr., 88:to appear, 2020.
[530] J.-C. Deneuville, P. Gaborit, and G. Zémor. Ouroboros: a simple, secure and efficient
key exchange protocol based on coding theory. In T. Lange and T. Takagi, editors,
Post-Quantum Cryptography, volume 10346 of Lecture Notes in Comput. Sci., pages
18–34. Springer, Berlin, Heidelberg, 2017.
[536] Q. Diao, Y. Y. Tai, S. Lin, and K. Abdel-Ghaffar. Trapping set structure of finite
geometry LDPC codes. In Proc. IEEE International Symposium on Information
Theory, pages 3088–3092, Cambridge, Massachusetts, July 2012.
[537] L. E. Dickson. The analytic representation of substitutions on a power of a prime
number of letters with a discussion of the linear group. Ann. of Math., 11:65–120,
161–183, 1896/97.
[538] J. F. Dillon. Elementary Hadamard Difference Sets. PhD thesis, University of Mary-
land, 1974.
[539] J. F. Dillon. Multiplicative difference sets via additive characters. Des. Codes Cryp-
togr., 17:225–235, 1999.
[540] J. F. Dillon and H. S. Dobbertin. New cyclic difference sets with Singer parameters.
Finite Fields Appl., 10:342–389, 2004.
[541] A. G. Dimakis, P. B. Godfrey, Y. Wu, M. J. Wainwright, and K. Ramchandran. Net-
work coding for distributed storage systems. IEEE Trans. Inform. Theory, 56:4539–
4551, 2010.
[542] A. G. Dimakis, K. Ramchandran, Y. Wu, and C. Suh. A survey on network codes
for distributed storage. Proc. IEEE, 99:476–489, 2011.
[543] B. K. Ding, T. Zhang, and G. N. Ge. Maximum distance separable codes for b-symbol
read channels. Finite Fields Appl., 49:180–197, 2018.
[544] C. Ding. Cyclic codes from APN and planar functions. arXiv:1206.4687 [cs.IT], June
2012.
[545] C. Ding. Cyclic codes from the two-prime sequences. IEEE Trans. Inform. Theory,
58:3881–3891, 2012.
[546] C. Ding. Cyclotomic constructions of cyclic codes with length being the product of
two primes. IEEE Trans. Inform. Theory, 58:2231–2236, 2012.
[547] C. Ding. Cyclic codes from cyclotomic sequences of order four. Finite Fields Appl.,
23:8–34, 2013.
[548] C. Ding. Cyclic codes from some monomials and trinomials. SIAM J. Discrete Math.,
27:1977–1994, 2013.
[549] C. Ding. Codes from Difference Sets. World Scientific Publishing Co. Pte. Ltd.,
Hackensack, New Jersey, 2015.
[550] C. Ding. Linear codes from some 2-designs. IEEE Trans. Inform. Theory, 61:3265–
3275, 2015.
[551] C. Ding. Parameters of several classes of BCH codes. IEEE Trans. Inform. Theory,
61:5322–5330, 2015.
[552] C. Ding. A construction of binary linear codes from Boolean functions. Discrete
Math., 339:2288–2303, 2016.
[553] C. Ding. Cyclic codes from Dickson polynomials. arXiv:1206.4370 [cs.IT], November
2016.
[554] C. Ding. A sequence construction of cyclic codes over finite fields. Cryptogr. Com-
mun., 10:319–341, 2018.
[555] C. Ding, C. Fan, and Z. Zhou. The dimension and minimum distance of two classes
of primitive BCH codes. Finite Fields Appl., 45:237–263, 2017.
[556] C. Ding, Z. Heng, and Z. Zhou. Minimal binary linear codes. IEEE Trans. Inform.
Theory, 64:6536–6545, 2018.
[557] C. Ding, D. Kohel, and S. Ling. Secret sharing with a class of ternary codes. Theoret.
Comput. Sci., 246:285–298, 2000.
[558] C. Ding, T. Laihonen, and A. Renvall. Linear multisecret-sharing schemes and error-
correcting codes. J. Universal Comput. Sci., 3:1023–1036, 1997.
[559] C. Ding, K. Y. Lam, and C. Xing. Construction and enumeration of all binary duadic
codes of length pm. Fund. Inform., 38:149–161, 1999.
[560] C. Ding, C. Li, N. Li, and Z. Zhou. Three-weight cyclic codes and their weight
distributions. Discrete Math., 339:415–427, 2016.
[561] C. Ding, C. Li, and Y. Xia. Another generalisation of the binary Reed-Muller codes
and its applications. Finite Fields Appl., 53:144–174, 2018.
[562] C. Ding and S. Ling. A q-polynomial approach to cyclic codes. Finite Fields Appl.,
20:1–14, 2013.
[563] C. Ding and H. Niederreiter. Cyclotomic linear codes of order 3. IEEE Trans.
Inform. Theory, 53:2274–2277, 2007.
[564] C. Ding, D. Pei, and A. Salomaa. Chinese Remainder Theorem: Applications in
Computing, Coding, Cryptography. World Scientific Publishing Co., Inc., River Edge,
New Jersey, 1996.
[565] C. Ding and V. S. Pless. Cyclotomy and duadic codes of prime lengths. IEEE Trans.
Inform. Theory, 45:453–466, 1999.
[566] C. Ding and A. Salomaa. Secret sharing schemes with nice access structures. Fund.
Inform., 73:51–62, 2006.
[567] C. Ding, G. Xiao, and W. Shan. The Stability Theory of Stream Ciphers, volume
561 of Lecture Notes in Comput. Sci. Springer, Berlin, Heidelberg, 1991.
[568] C. Ding and J. Yang. Hamming weights in irreducible cyclic codes. Discrete Math.,
313:434–446, 2013.
[569] C. Ding, Y. Yang, and X. Tang. Optimal sets of frequency hopping sequences from
linear cyclic codes. IEEE Trans. Inform. Theory, 56:3605–3612, 2010.
[570] C. Ding and J. Yuan. Covering and secret sharing with linear codes. In C. S. Calude,
M. J. Dinneen, and V. Vajnovszki, editors, Discrete Mathematics and Theoretical
Computer Science, volume 2731 of Lecture Notes in Comput. Sci., pages 11–25.
Springer, Berlin, Heidelberg, 2003.
[571] C. Ding and Z. Zhou. Binary cyclic codes from explicit polynomials over GF(2m).
Discrete Math., 321:76–89, 2014.
[572] K. Ding and C. Ding. A class of two-weight and three-weight codes and their appli-
cations in secret sharing. IEEE Trans. Inform. Theory, 61:5835–5842, 2015.
[573] H. Q. Dinh. Negacyclic codes of length 2s over Galois rings. IEEE Trans. Inform.
Theory, 51:4252–4262, 2005.
[574] H. Q. Dinh. Repeated-root constacyclic codes of length 2s over Z2a. In Algebra and
its Applications, volume 419 of Contemp. Math., pages 95–110. Amer. Math. Soc.,
Providence, Rhode Island, 2006.
[575] H. Q. Dinh. Structure of some classes of repeated-root constacyclic codes over inte-
gers modulo 2m . In A. Giambruno, C. P. Milies, and S. K. Sehgal, editors, Groups,
Rings and Group Rings, volume 248 of Lecture Notes Pure Appl. Math., pages 105–
117. Chapman & Hall/CRC, Boca Raton, Florida, 2006.
[576] H. Q. Dinh. Complete distances of all negacyclic codes of length 2s over Z2a. IEEE
Trans. Inform. Theory, 53:147–161, 2007.
[577] H. Q. Dinh. On the linear ordering of some classes of negacyclic and cyclic codes
and their distance distributions. Finite Fields Appl., 14:22–40, 2008.
[578] H. Q. Dinh. Constacyclic codes of length 2s over Galois extension rings of F2 + uF2.
IEEE Trans. Inform. Theory, 55:1730–1740, 2009.
[579] H. Q. Dinh. Constacyclic codes of length ps over Fpm + uFpm. J. Algebra, 324:940–
950, 2010.
[580] H. Q. Dinh. On some classes of repeated-root constacyclic codes of length a power
of 2 over Galois rings. In Advances in Ring Theory, Trends Math., pages 131–147.
Birkhäuser/Springer Basel AG, Basel, 2010.
[581] H. Q. Dinh. Repeated-root constacyclic codes of length 2ps. Finite Fields Appl.,
18:133–143, 2012.
[582] H. Q. Dinh, S. Dhompongsa, and S. Sriboonchitta. On constacyclic codes of length
4ps over Fpm + uFpm. Discrete Math., 340:832–849, 2017.
[583] H. Q. Dinh, H. Liu, X.-S. Liu, and S. Sriboonchitta. On structure and distances
of some classes of repeated-root constacyclic codes over Galois rings. Finite Fields
Appl., 43:86–105, 2017.
[584] H. Q. Dinh and S. R. López-Permouth. Cyclic and negacyclic codes over finite chain
rings. IEEE Trans. Inform. Theory, 50:1728–1744, 2004.
Bibliography 859
[588] H. Q. Dinh, B. T. Nguyen, and S. Sriboonchitta. Skew constacyclic codes over finite
fields and finite chain rings. Math. Probl. Eng., Art. ID 3965789, 17 pages, 2016.
[589] H. Q. Dinh, B. T. Nguyen, S. Sriboonchitta, and T. M. Vo. On a class of constacyclic
codes of length 4p^s over F_{p^m} + uF_{p^m}. J. Algebra Appl., 18(2):1950022, 2019.
[590] H. Q. Dinh, B. T. Nguyen, S. Sriboonchitta, and T. M. Vo. On (α + uβ)-constacyclic
codes of length 4p^s over F_{p^m} + uF_{p^m}. J. Algebra Appl., 18(2):1950023, 2019.
[591] H. Q. Dinh and H. D. T. Nguyen. On some classes of constacyclic codes over poly-
nomial residue rings. Adv. Math. Commun., 6:175–191, 2012.
[592] H. Q. Dinh, H. D. T. Nguyen, S. Sriboonchitta, and T. M. Vo. Repeated-root
constacyclic codes of prime power lengths over finite chain rings. Finite Fields Appl.,
43:22–41, 2017.
[593] H. Q. Dinh, A. Sharma, S. Rani, and S. Sriboonchitta. Cyclic and negacyclic codes
of length 4p^s over F_{p^m} + uF_{p^m}. J. Algebra Appl., 17(9):1850173, 22 pages, 2018.
[594] H. Q. Dinh, A. K. Singh, S. Pattanayak, and S. Sriboonchitta. On cyclic DNA codes
over the ring Z_4 + uZ_4. Unpublished.
[595] H. Q. Dinh, A. K. Singh, S. Pattanayak, and S. Sriboonchitta. Cyclic DNA codes over
the ring F_2 + uF_2 + vF_2 + uvF_2 + v^2F_2 + uv^2F_2. Des. Codes Cryptogr., 86:1451–1467,
2018.
[596] H. Q. Dinh, A. K. Singh, S. Pattanayak, and S. Sriboonchitta. DNA cyclic codes
over the ring F_2[u, v]/⟨u^2 − 1, v^3 − v, uv − vu⟩. Internat. J. Biomath., 11(3):1850042,
19 pages, 2018.
[597] H. Q. Dinh, A. K. Singh, S. Pattanayak, and S. Sriboonchitta. Construction of cyclic
DNA codes over the ring Z_4[u]/⟨u^2 − 1⟩ based on the deletion distance. Theoret.
Comput. Sci., 773:27–42, 2019.
[598] H. Q. Dinh, L. Wang, and S. Zhu. Negacyclic codes of length 2p^s over F_{p^m} + uF_{p^m}.
Finite Fields Appl., 31:178–201, 2015.
[599] H. Q. Dinh, X. Wang, H. Liu, and S. Sriboonchitta. On the b-distance of repeated-
root constacyclic codes of prime power lengths. Discrete Math., 343(4):111780, 2020.
[601] D. Divsalar, S. Dolinar, and C. Jones. Protograph LDPC codes over burst erasure
channels. In Proc. MILCOM ’06 – IEEE Military Communications Conference,
Washington, DC, 7 pages, October 2006.
[602] D. Divsalar, S. Dolinar, C. Jones, and J. Thorpe. Protograph LDPC codes with
minimum distance linearly growing with block size. In Proc. GLOBECOM ’05 –
IEEE Global Telecommunications Conference, volume 3, pages 1152–1156, St. Louis,
Missouri, November-December 2005.
[603] D. Divsalar and L. Dolecek. On the typical minimum distance of protograph-based
non-binary LDPC codes. In Proc. Information Theory and Applications Workshop,
pages 192–198, San Diego, California, February 2012.
[604] D. Divsalar, H. Jin, and R. J. McEliece. Coding theorems for turbo-like codes. In
Proc. 36th Allerton Conference on Communication, Control, and Computing, pages
201–210, Monticello, Illinois, September 1998.
[605] H. S. Dobbertin. One-to-one highly nonlinear power functions on GF(2^n). Appl.
Algebra Engrg. Comm. Comput., 9:139–152, 1998.
[606] H. S. Dobbertin. Almost perfect nonlinear power functions on GF(2^n): the Niho
case. Inform. and Comput., 151:57–72, 1999.
[607] H. S. Dobbertin. Almost perfect nonlinear power functions on GF(2^n): the Welch
case. IEEE Trans. Inform. Theory, 45:1271–1275, 1999.
[608] H. S. Dobbertin. Kasami power functions, permutation polynomials and cyclic dif-
ference sets. In Difference Sets, Sequences and Their Correlation Properties (Bad
Windsheim, 1998), volume 542 of NATO Adv. Sci. Inst. Ser. C Math. Phys. Sci.,
pages 133–158. Kluwer Acad. Publ., Dordrecht, Netherlands, 1999.
[609] H. S. Dobbertin. Almost perfect nonlinear power functions on GF(2^n): a new case
for n divisible by 5. In D. Jungnickel and H. Niederreiter, editors, Proc. 5th Interna-
tional Conference on Finite Fields and Applications (Augsburg 1999), pages 113–121.
Springer-Verlag, Berlin, Heidelberg, 2001.
[610] H. S. Dobbertin, P. Felke, T. Helleseth, and P. Rosendahl. Niho type cross-correlation
functions via Dickson polynomials and Kloosterman sums. IEEE Trans. Inform.
Theory, 52:613–627, 2006.
[611] S. Dodunekov and J. Simonis. Codes and projective multisets. Electron. J. Combin.,
5:Research Paper 37, 23 pages, 1998.
[612] L. Dolecek, Z. Zhang, V. Anantharam, M. J. Wainwright, and B. Nikolic. Analysis
of absorbing sets and fully absorbing sets of array-based LDPC codes. IEEE Trans.
Inform. Theory, 56:181–201, 2010.
[613] R. G. L. D’Oliveira and M. Firer. The packing radius of a code and partitioning
problems: the case for poset metrics on finite vector spaces. Discrete Math., 338:2143–
2167, 2015.
[614] R. G. L. D’Oliveira and M. Firer. Channel metrization. European J. Combin.,
80:107–119, 2019.
[615] R. G. L. D’Oliveira and M. Firer. A distance between channels: the average error of
mismatched channels. Des. Codes Cryptogr., 87:481–493, 2019.
[629] S. T. Dougherty, J.-L. Kim, B. Özkaya, L. Sok, and P. Solé. The combinatorics of
LCD codes: linear programming bound and orthogonal matrices. Internat. J. Inform.
Coding Theory, 4:116–128, 2017.
[630] S. T. Dougherty, J.-L. Kim, and P. Solé. Open problems in coding theory. In S. T.
Dougherty, A. Facchini, A. Leroy, E. Puczylowski, and P. Solé, editors, Noncommu-
tative Rings and Their Applications, volume 634 of Contemp. Math., pages 79–99.
Amer. Math. Soc., Providence, Rhode Island, 2015.
[631] S. T. Dougherty and S. Ling. Cyclic codes over Z_4 of even length. Des. Codes
Cryptogr., 39:127–153, 2006.
[632] S. T. Dougherty and H. Liu. Independence of vectors in codes over rings. Des. Codes
Cryptogr., 51:55–68, 2009.
[635] S. T. Dougherty and K. Shiromoto. Maximum distance codes in Mat_{n,s}(Z_k) with
a non-Hamming metric and uniform distributions. Des. Codes Cryptogr., 33:45–61,
2004.
[636] S. T. Dougherty and M. M. Skriganov. MacWilliams duality and the Rosenbloom-
Tsfasman metric. Mosc. Math. J., 2:81–97, 2002.
[639] S. T. Dougherty, B. Yildiz, and S. Karadeniz. Self-dual codes over Rk and binary
self-dual codes. Eur. J. Pure Appl. Math., 6:89–106, 2013.
[640] J. Doyen, X. Hubaut, and M. Vandensavel. Ranks of incidence matrices of Steiner
triple systems. Math. Z., 163:251–259, 1978.
[641] P. B. Dragon, O. I. Hernandez, J. Sawada, A. Williams, and D. Wong. Constructing
de Bruijn sequences with co-lexicographic order: the k-ary Grandmama sequence.
European J. Combin., 72:1–11, 2018.
[642] P. B. Dragon, O. I. Hernandez, and A. Williams. The Grandmama de Bruijn sequence
for binary strings. In E. Kranakis, G. Navarro, and E. Chávez, editors, LATIN 2016:
Theoretical Informatics, volume 9644 of Lecture Notes in Comput. Sci., pages 347–
361. Springer, Berlin, Heidelberg, 2016.
[643] N. Drucker, S. Gueron, and D. Kostic. Fast polynomial inversion for post quantum
QC-MDPC cryptography. In S. Dolev, V. Kolesnikov, S. Lodha, and G. Weiss,
editors, Cyber Security Cryptography and Machine Learning, volume 12161 of Lecture
Notes in Comput. Sci., pages 110–127. Springer, Berlin, Heidelberg, 2020.
[644] N. Drucker, S. Gueron, and D. Kostic. QC-MDPC decoders with several shades
of gray. In J. Ding and J.-P. Tillich, editors, Post-Quantum Cryptography, volume
12100 of Lecture Notes in Comput. Sci., pages 35–50. Springer, Berlin, Heidelberg,
2020.
[645] J. Ducoat. Generalized rank weights: a duality statement. In G. Kyureghyan,
G. Mullen, and A. Pott, editors, Topics in Finite Fields, volume 632 of Contemp.
Math., pages 101–109. Amer. Math. Soc., Providence, Rhode Island, 2015.
[646] I. Dumer. Two decoding algorithms for linear codes (in Russian). Problemy Peredači
Informacii, 25(1):24–32, 1989 (English translation in Probl. Inform. Transm.,
25(1):17–23, 1989).
[647] I. Dumer. On minimum distance decoding of linear codes. In Proc. 5th Joint Soviet-
Swedish Workshop on Information Theory, pages 50–52, Moscow, USSR, 1991.
[671] P. Elia, K. R. Kumar, S. A. Pawar, P. V. Kumar, and H.-F. Lu. Explicit space-
time codes achieving the diversity-multiplexing gain tradeoff. IEEE Trans. Inform.
Theory, 52:3869–3884, 2006.
[672] P. Elia, F. Oggier, and P. V. Kumar. Asymptotically optimal cooperative wireless
networks with reduced signaling complexity. IEEE J. Selected Areas Commun.,
25:258–267, 2007.
[673] P. Elia, B. A. Sethuraman, and P. V. Kumar. Perfect space-time codes with minimum
and non-minimum delay for any number of antennas. In International Conference
on Wireless Networks, Communications and Mobile Computing, volume 1, pages
722–727. IEEE, Piscataway, New Jersey, 2005.
[674] P. Elias. Coding for noisy channels. IRE Conv. Rec., 4:37–46, 1955.
[675] P. Elias, A. Feinstein, and C. E. Shannon. A note on the maximum flow through a
network. IRE Trans. Inform. Theory, 2:117–119, 1956.
[676] N. D. Elkies. Explicit modular towers. In T. Basar and A. Vardy, editors, Proc.
35th Allerton Conference on Communication, Control, and Computing, pages 23–32,
Monticello, Illinois, September 1997.
[677] N. D. Elkies. Explicit towers of Drinfeld modular curves. In C. Casacuberta,
R. M. Miró-Roig, J. Verdera, and S. Xambó-Descamps, editors, European Congress
of Mathematics, Vol. II (Barcelona, 2000), volume 202 of Progress in Mathematics,
pages 189–198. Birkhäuser, Basel, 2001.
[678] N. D. Elkies. Still better nonlinear codes from modular curves. arXiv:math/0308046
[math.NT], August 2003.
[679] J. S. Ellenberg and D. C. Gijswijt. On large subsets of F_q^n with no three-term
arithmetic progression. Ann. of Math. (2), 185:339–343, 2017.
[680] M. Elyasi and S. Mohajer. Determinant coding: a novel framework for exact-repair
regenerating codes. IEEE Trans. Inform. Theory, 62:6683–6697, 2016.
[681] M. Elyasi and S. Mohajer. Cascade codes for distributed storage systems.
arXiv:1901.00911 [cs.IT], January 2019.
[682] K. Engdahl and K. Sh. Zigangirov. To the theory of low-density convolutional codes I
(in Russian). Problemy Peredači Informacii, 35(4):12–28, 1999. (English translation
in Probl. Inform. Transm., 35(4):295–310, 1999).
[683] P. Erdős and Z. Füredi. The greatest angle among n points in the d-dimensional Eu-
clidean space. In Combinatorial Mathematics (Marseille-Luminy, 1981), volume 75
of North-Holland Math. Stud., pages 275–283. North-Holland, Amsterdam, 1983.
[684] T. Ericson and V. A. Zinov’ev. Codes on Euclidean Spheres. North-Holland Mathe-
matical Library. Elsevier Science, Amsterdam, 2001.
[690] T. Etzion, M. Firer, and R. A. Machado. Metrics based on finite directed graphs
and coding invariants. IEEE Trans. Inform. Theory, 64:2398–2409, 2018.
[691] T. Etzion, E. Gorla, A. Ravagnani, and A. Wachter-Zeh. Optimal Ferrers diagram
rank-metric codes. IEEE Trans. Inform. Theory, 62:1616–1630, 2016.
[692] T. Etzion and N. Silberstein. Error-correcting codes in projective spaces via rank-
metric codes and Ferrers diagrams. IEEE Trans. Inform. Theory, 55:2909–2919,
2009.
[693] T. Etzion, A. Trachtenberg, and A. Vardy. Which codes have cycle-free Tanner
graphs? IEEE Trans. Inform. Theory, 45:2173–2181, 1999.
[694] T. Etzion and A. Vardy. Perfect binary codes: constructions, properties and enu-
meration. IEEE Trans. Inform. Theory, 40:754–763, 1994.
[695] T. Etzion and A. Vardy. Error-correcting codes in projective space. IEEE Trans.
Inform. Theory, 57:1165–1173, 2011.
[696] M. F. Ezerman, S. Jitman, H. M. Kiah, and S. Ling. Pure asymmetric quantum MDS
codes from CSS construction: a complete characterization. Internat. J. Quantum Inf.,
11(3):1350027, 2013.
[699] M. F. Ezerman, S. Ling, B. Özkaya, and P. Solé. Good stabilizer codes from quasi-
cyclic codes over F_4 and F_9. In Proc. IEEE International Symposium on Information
Theory, pages 2898–2902, Paris, France, July 2019.
[700] M. F. Ezerman, S. Ling, B. Özkaya, and J. Tharnnukhroh. Spectral bounds for quasi-
twisted codes. In Proc. IEEE International Symposium on Information Theory, pages
1922–1926, Paris, France, July 2019.
[701] A. Faldum. Radikalcodes. Diploma Thesis, Mainz, 1993.
[702] G. Falkner, B. Kowol, W. Heise, and E. Zehendner. On the existence of cyclic optimal
codes. Atti Sem. Mat. Fis. Univ. Modena, 28(2):326–341 (1980), 1979.
[703] J. L. Fan. Array codes as low-density parity-check codes. In Proc. 2nd Interna-
tional Symposium on Turbo Codes and Related Topics, pages 543–546, Brest, France,
September 2000.
[704] P. Fan and M. Darnell. Sequence Design for Communications Applications. Commu-
nications Systems, Techniques, and Applications Series. J. Wiley, New York, 1996.
[705] J.-C. Faugère, V. Gauthier-Umaña, A. Otmani, L. Perret, and J.-P. Tillich. A distin-
guisher for high-rate McEliece cryptosystems. IEEE Trans. Inform. Theory, 59:6830–
6844, 2013.
[706] J.-C. Faugère, A. Otmani, L. Perret, and J.-P. Tillich. Algebraic cryptanalysis of
McEliece variants with compact keys. In H. Gilbert, editor, Advances in Cryptology –
EUROCRYPT 2010, volume 6110 of Lecture Notes in Comput. Sci., pages 279–298.
Springer, Berlin, Heidelberg, 2010.
[707] J.-C. Faugère, L. Perret, and F. de Portzamparc. Algebraic attack against variants
of McEliece with Goppa polynomial of a special form. In P. Sarkar and T. Iwata,
editors, Advances in Cryptology – ASIACRYPT 2014, Part 1, volume 8873 of Lecture
Notes in Comput. Sci., pages 21–41. Springer, Berlin, Heidelberg, 2014.
[708] C. Faure and L. Minder. Cryptanalysis of the McEliece cryptosystem over hyperel-
liptic codes. In Proc. 11th International Workshop on Algebraic and Combinatorial
Coding Theory, pages 99–107, Pamporovo, Bulgaria, June 2008.
[709] G. Fazekas and V. I. Levenshtein. On upper bounds for code distance and covering
radius of designs in polynomial metric spaces. J. Combin. Theory Ser. A, 70:267–288,
1995.
[710] J. Feldman. Decoding Error-Correcting Codes via Linear Programming. PhD thesis,
Massachusetts Institute of Technology, 2003.
[712] L. V. Felix and M. Firer. Canonical-systematic form for codes in hierarchical poset
metrics. Adv. Math. Commun., 6:315–328, 2012.
[713] G.-L. Feng and T. R. N. Rao. Decoding algebraic-geometric codes up to the designed
minimum distance. IEEE Trans. Inform. Theory, 39:37–45, 1993.
[714] K. Feng, S. Ling, and C. Xing. Asymptotic bounds on quantum codes from algebraic
geometry codes. IEEE Trans. Inform. Theory, 52:986–991, 2006.
[715] K. Feng and Z. Ma. A finite Gilbert-Varshamov bound for pure stabilizer quantum
codes. IEEE Trans. Inform. Theory, 50:3323–3325, 2004.
[716] K. Feng, L. Xu, and F. J. Hickernell. Linear error-block codes. Finite Fields Appl.,
12:638–652, 2006.
[717] T. Feng and Q. Xiang. Strongly regular graphs from unions of cyclotomic classes.
J. Combin. Theory Ser. B, 102:982–995, 2012.
[718] C. Fernández-Córdoba, J. Pujol, and M. Villanueva. On rank and kernel of Z_4-
linear codes. In Á. Barbero, editor, Coding Theory and Applications, volume 5228
of Lecture Notes in Comput. Sci., pages 46–55. Springer, Berlin, Heidelberg, 2008.
[723] T. Feulner and G. Nebe. The automorphism group of an extremal [72, 36, 16] code
does not contain Z_7, Z_3 × Z_3, or D_10. IEEE Trans. Inform. Theory, 58:6916–6924,
2012.
[724] R. P. Feynman. Simulating physics with computers. Int. J. Theor. Phys., 21:467–488,
1982.
[725] C. M. Fiduccia and Y. Zalcstein. Algebras having linear multiplicative complexities.
J. ACM, 24:311–331, 1977.
[726] F. Fiedler and M. Klin. A strongly regular graph with the parameters
(512, 73, 438, 12, 10) and its dual graph. Preprint MATH-AL-7-1998, Technische Uni-
versität Dresden, 23 pages, July 1998.
[727] J. E. Fields, P. Gaborit, W. C. Huffman, and V. S. Pless. On the classification of
extremal even formally self-dual codes. Des. Codes Cryptogr., 18:125–148, 1999.
[728] J. E. Fields, P. Gaborit, W. C. Huffman, and V. S. Pless. On the classification of
extremal even formally self-dual codes of lengths 20 and 22. Discrete Appl. Math.,
111:75–86, 2001.
[729] M. Finiasz, P. Gaborit, and N. Sendrier. Improved fast syndrome based crypto-
graphic hash functions. In V. Rijmen, editor, ECRYPT Hash Workshop, 2007.
[730] M. Finiasz and N. Sendrier. Security bounds for the design of code-based cryptosys-
tems. In M. Matsui, editor, Advances in Cryptology – ASIACRYPT 2009, volume
5912 of Lecture Notes in Comput. Sci., pages 88–105. Springer, Berlin, Heidelberg,
2009.
[731] M. Firer, M. M. S. Alves, J. A. Pinheiro, and L. Panek. Poset Codes: Partial Orders,
Metrics and Coding Theory. SpringerBriefs in Mathematics. Springer, Cham, 2018.
[732] M. Firer and J. L. Walker. Matched metrics and channels. IEEE Trans. Inform.
Theory, 62:1150–1156, 2016.
[733] J.-B. Fischer and J. Stern. An efficient pseudo-random generator provably as secure
as syndrome decoding. In U. Maurer, editor, Advances in Cryptology – EUROCRYPT
’96, volume 1070 of Lecture Notes in Comput. Sci., pages 245–255. Springer, Berlin,
Heidelberg, 1996.
[734] J. L. Fisher and S. K. Sehgal. Principal ideal group rings. Comm. Algebra, 4:319–325,
1976.
[735] N. L. Fogarty. On Skew-Constacyclic Codes. PhD thesis, University of Kentucky,
2016.
[736] N. L. Fogarty and H. Gluesing-Luerssen. A circulant approach to skew-constacyclic
codes. Finite Fields Appl., 35:92–114, 2015.
[737] M. A. Forbes and S. Yekhanin. On the locality of codeword symbols in non-linear
codes. Discrete Math., 324:78–84, 2014.
[738] L. R. Ford, Jr. and D. R. Fulkerson. Maximal flow through a network. Canad. J.
Math., 8:399–404, 1956.
[739] E. Fornasini and G. Marchesini. Structure and properties of two dimensional systems.
In S. G. Tzafestas, editor, Multidimensional Systems, Techniques and Applications,
pages 37–88. CRC Press, Boca Raton, Florida, 1986.
[740] G. D. Forney, Jr. Convolutional codes I: algebraic structure. IEEE Trans. Inform.
Theory, 16:720–738, 1970.
[741] G. D. Forney, Jr. Correction to: “Convolutional codes I: algebraic structure” [IEEE
Trans. Inform. Theory 16:720–738, 1970]. IEEE Trans. Inform. Theory, 17:360,
1971.
[742] G. D. Forney, Jr. Structural analysis of convolutional codes via dual codes. IEEE
Trans. Inform. Theory, 19:512–518, 1973.
[743] G. D. Forney, Jr. Convolutional codes II: maximum-likelihood decoding. Inform.
and Control, 25:222–266, 1974.
[744] G. D. Forney, Jr. Codes on graphs: normal realizations. IEEE Trans. Inform. Theory,
47:520–548, 2001.
[745] G. D. Forney, Jr., R. G. Gallager, G. R. Lang, F. M. Longstaff, and S. U. Qureshi.
Efficient modulation for band-limited channels. IEEE J. Selected Areas Commun.,
2:632–647, 1984.
[746] G. D. Forney, Jr., M. Grassl, and S. Guha. Convolutional and tail-biting quantum
error-correcting codes. IEEE Trans. Inform. Theory, 53:865–880, 2007.
[747] G. D. Forney, Jr. and L.-F. Wei. Multidimensional constellations I: Introduction, fig-
ures of merit, and generalized cross constellations. IEEE J. Selected Areas Commun.,
7:877–892, 1989.
[750] C. Fragouli and E. Soljanin. Network coding fundamentals. Found. Trends Network-
ing, 2:1–133, 2007.
[751] J. B. Fraleigh. A First Course in Abstract Algebra. Addison-Wesley, New York, 7th
edition, 2003.
[752] P. Frankl and N. Tokushige. Invitation to intersection problems for finite sets. J.
Combin. Theory Ser. A, 144:157–211, 2016.
[753] P. Frankl and R. M. Wilson. The Erdős-Ko-Rado theorem for vector spaces. J.
Combin. Theory Ser. A, 43:228–236, 1986.
[754] H. Fredricksen. Generation of the Ford sequence of length 2^n, n large. J. Combin.
Theory Ser. A, 12:153–154, 1972.
[755] H. Fredricksen. A survey of full length nonlinear shift register cycle algorithms.
SIAM Rev., 24:195–221, 1982.
[756] H. Fredricksen and I. J. Kessler. An algorithm for generating necklaces of beads in
two colors. Discrete Math., 61:181–188, 1986.
[757] H. Fredricksen and J. Maiorana. Necklaces of beads in k colors and k-ary de Bruijn
sequences. Discrete Math., 23:207–210, 1978.
[758] W. Fulton. Algebraic Curves. Advanced Book Classics. Addison-Wesley Publishing
Company, Advanced Book Program, Redwood City, California, 1989. An introduc-
tion to algebraic geometry. Notes written with the collaboration of Richard Weiss;
reprint of 1969 original.
[759] È. M. Gabidulin. Combinatorial metrics in coding theory. In B. N. Petrov and
F. Csáki, editors, Proc. 2nd IEEE International Symposium on Information The-
ory, Tsahkadsor, Armenia, USSR, September 2-8, 1971, pages 169–176. Akadémiai
Kiadó, Budapest, 1973.
[760] È. M. Gabidulin. Theory of codes with maximum rank distance (in Russian). Prob-
lemy Peredači Informacii, 21(1):3–16, 1985 (English translation in Probl. Inform.
Transm., 21(1):1–12, 1985).
[761] È. M. Gabidulin. Rank q-cyclic and pseudo-q-cyclic codes. In Proc. IEEE Interna-
tional Symposium on Information Theory, pages 2799–2802, Seoul, Korea, June-July
2009.
[762] È. M. Gabidulin. A brief survey of metrics in coding theory. Math. Distances and
Appl., 66:66–84, 2012.
[763] È. M. Gabidulin and M. Bossert. Hard and soft decision decoding of phase ro-
tation invariant block codes. In Proc. International Zurich Seminar on Broadband
Communications: Accessing, Transmission, Networking. (Cat. No.98TH8277), pages
249–251, February 1998.
[764] È. M. Gabidulin and M. Bossert. Codes for network coding. In F. R. Kschischang and
E.-H. Yang, editors, Proc. IEEE International Symposium on Information Theory,
pages 867–870, Toronto, Canada, July 2008.
[765] È. M. Gabidulin and V. A. Obernikhin. Codes in the Vandermonde F-metric and
their application (in Russian). Problemy Peredači Informacii, 39(2):3–14, 2003. (En-
glish translation in Probl. Inform. Transm., 39(2):159–169, 2003).
[766] È. M. Gabidulin, A. V. Paramonov, and O. V. Tretjakov. Ideals over a non-
commutative ring and their application in cryptology. In D. W. Davies, editor,
Advances in Cryptology – EUROCRYPT ’91, volume 547 of Lecture Notes in Com-
put. Sci., pages 482–489. Springer, Berlin, Heidelberg, 1991.
[767] È. M. Gabidulin and J. Simonis. Metrics generated by families of subspaces. IEEE
Trans. Inform. Theory, 44:1336–1341, 1998.
[768] È. M. Gabidulin and J. Simonis. Perfect codes for metrics generated by primitive
2-error-correcting binary BCH codes. In Proc. IEEE International Symposium on
Information Theory, page 68, Cambridge, Massachusetts, August 1998.
[769] P. Gaborit. Shorter keys for code based cryptography. In Proc. 4th International
Workshop on Coding and Cryptography, Book of Extended Abstracts, pages 81–91,
Bergen, Norway, March 2005. https://www.unilim.fr/pages_perso/philippe.
gaborit/shortIC.ps.
[770] P. Gaborit and C. Aguilar-Melchor. Cryptographic method for communicating
confidential information. https://patents.google.com/patent/EP2537284B1/en,
2010.
[771] P. Gaborit and M. Girault. Lightweight code-based identification and signature. In
Proc. IEEE International Symposium on Information Theory, pages 191–195, Nice,
France, June 2007.
[772] P. Gaborit, A. Hauteville, and J.-P. Tillich. RankSynd: a PRNG based on rank
metric. In T. Takagi, editor, Post-Quantum Cryptography, volume 9606 of Lecture
Notes in Comput. Sci., pages 18–28. Springer, Berlin, Heidelberg, 2016.
[773] P. Gaborit, C. Lauradoux, and N. Sendrier. SYND: a fast code-based stream cipher
with a security reduction. In Proc. IEEE International Symposium on Information
Theory, pages 186–190, Nice, France, June 2007.
[774] P. Gaborit, G. Murat, O. Ruatta, and G. Zémor. Low rank parity check codes and
their application to cryptography. In Proc. 8th International Workshop on Coding
and Cryptography, Bergen, Norway, April 2013.
[775] P. Gaborit, O. Ruatta, and J. Schrek. On the complexity of the rank syndrome
decoding problem. IEEE Trans. Inform. Theory, 62:1006–1019, 2016.
[811] D. G. Glynn. The nonclassical 10-arc of PG(4, 9). Discrete Math., 59:43–51, 1986.
[812] I. Gnana Sudha and R. S. Selvaraj. Codes with a pomset metric and constructions.
Des. Codes Cryptogr., 86:875–892, 2018.
[813] C. D. Godsil. Polynomial spaces. Discrete Math., 73:71–88, 1988.
[814] C. D. Godsil. Algebraic Combinatorics. Chapman and Hall Mathematics Series.
Chapman & Hall, New York, 1993.
[815] N. Goela, E. Abbe, and M. Gastpar. Polar codes for broadcast channels. IEEE
Trans. Inform. Theory, 61:758–782, 2015.
[816] J.-M. Goethals. Two dual families of nonlinear binary codes. Electron. Lett., 10:471–
472, 1974.
[817] J.-M. Goethals. The extended Nadler code is unique. IEEE Trans. Information
Theory, 23:132–135, 1977.
[818] J.-M. Goethals and P. Delsarte. On a class of majority-logic decodable cyclic codes.
IEEE Trans. Inform. Theory, 14:182–188, 1968.
[819] J.-M. Goethals and J. J. Seidel. Strongly regular graphs derived from combinatorial
designs. Canad. J. Math., 22:597–614, 1970.
[820] M. J. E. Golay. Notes on digital coding. Proc. IRE, 37:657, 1949. (Also reprinted in
[169] page 13 and [212] page 9).
[821] R. Gold. Optimal binary sequences for spread spectrum multiplexing. IEEE Trans.
Inform. Theory, 13:619–621, 1967.
[822] R. Gold. Maximal recursive sequences with 3-valued recursive cross-correlation func-
tions. IEEE Trans. Inform. Theory, 14:154–156, 1968.
[823] D. Goldin and D. Burshtein. Improved bounds on the finite length scaling of polar
codes. IEEE Trans. Inform. Theory, 60:6966–6978, 2014.
[824] S. W. Golomb. Shift Register Sequences. World Scientific, Hackensack, New Jersey,
3rd edition, 2017.
[825] S. W. Golomb and G. Gong. Signal Design for Good Correlation: For Wireless
Communication, Cryptography, and Radar. Cambridge University Press, Cambridge,
UK, 2005.
[826] S. W. Golomb and E. C. Posner. Rook domains, Latin squares, affine planes, and
error-distributing codes. IEEE Trans. Inform. Theory, 10:196–208, 1964.
[832] S. Goparaju, A. Fazeli, and A. Vardy. Minimum storage regenerating codes for all
parameters. IEEE Trans. Inform. Theory, 63:6318–6328, 2017.
[833] S. Goparaju, I. Tamo, and A. R. Calderbank. An improved sub-packetization bound
for minimum storage regenerating codes. IEEE Trans. Inform. Theory, 60:2770–
2779, 2014.
[834] V. D. Goppa. A new class of linear correcting codes (in Russian). Problemy
Peredači Informacii, 6(3):24–30, 1970 (English translation in Probl. Inform. Transm.,
6(3):207–212, 1970).
[835] V. D. Goppa. Rational representation of codes and (L, g)-codes (in Russian). Prob-
lemy Peredači Informacii, 7(3):41–49, 1971 (English translation in Probl. Inform.
Transm., 7(3):223–229, 1971).
[836] V. D. Goppa. Codes associated with divisors (in Russian). Problemy Peredači Infor-
macii, 13(1):33–39, 1977 (English translation in Probl. Inform. Transm., 13(1):22–27,
1977).
[837] V. D. Goppa. Codes on algebraic curves (in Russian). Dokl. Akad. Nauk SSSR,
259:1289–1290, 1981.
[838] V. D. Goppa. Algebraic-geometric codes (in Russian). Izv. Akad. Nauk SSSR Ser.
Mat., 46(4):762–781, 896, 1982 (English translation in Mathematics of the USSR-
Izvestiya, 21(1):75–91, 1983).
[839] B. Gordon, W. H. Mills, and L. R. Welch. Some new difference sets. Canadian J.
Math., 14:614–625, 1962.
[840] D. C. Gorenstein and N. Zierler. A class of error-correcting codes in p^m symbols. J.
Soc. Indust. Appl. Math., 9:207–214, 1961. (Also reprinted in [169] pages 87–89 and
[212] pages 194–201).
[846] M. Grassl. Code tables: Bounds on the parameters of various types of codes. http:
//www.codetables.de. Accessed on 2020-07-05.
[847] M. Grassl. Code tables: Bounds on the parameters of various types of codes: Quan-
tum error-correcting codes. http://www.codetables.de. Accessed on 2020-02-20.
[848] M. Grassl. Variations on encoding circuits for stabilizer quantum codes. In Y. M.
Chee, Z. Guo, S. Ling, F. Shao, Y. Tang, H. Wang, and C. Xing, editors, Coding and
Cryptology, volume 6639 of Lecture Notes in Comput. Sci., pages 142–158. Springer,
Berlin, Heidelberg, 2011.
[849] M. Grassl, T. Beth, and M. Rötteler. On optimal quantum codes. Int. J. Quantum
Inform., 2:55–64, 2004.
[850] M. Grassl and M. Rötteler. Quantum MDS codes over small fields. In Proc. IEEE In-
ternational Symposium on Information Theory, pages 1104–1108, Hong Kong, China,
June 2015.
[851] M. Greferath. Cyclic codes over finite rings. Discrete Math., 177:273–277, 1997.
[852] M. Greferath, M. Pavčević, N. Silberstein, and M. Á. Vázquez-Castro, editors.
Network Coding and Subspace Designs. Signals and Communication Technology.
Springer, Cham, 2018.
[853] M. Greferath and S. E. Schmidt. Gray isometries for finite chain rings and a nonlinear
ternary (36, 3^12, 15) code. IEEE Trans. Inform. Theory, 45:2522–2524, 1999.
[854] M. Greferath and S. E. Schmidt. Finite-ring combinatorics and MacWilliams’ equiv-
alence theorem. J. Combin. Theory Ser. A, 92:17–28, 2000.
[855] J. H. Griesmer. A bound for error-correcting codes. IBM J. Research Develop.,
4:532–542, 1960.
[856] J. L. Gross and T. W. Tucker. Topological Graph Theory. Dover Publications, Inc.,
Mineola, NY, 2001.
[857] M. Grötschel, L. Lovász, and A. Schrijver. Geometric Algorithms and Combinatorial
Optimization. Springer-Verlag, Berlin, 1988.
[858] L. K. Grover. A fast quantum mechanical algorithm for database search. In Proc. 28th
Annual ACM Symposium on the Theory of Computing, pages 212–219, Phildelphia,
Pennsylvania, May 1996.
[859] K. Guenda, T. A. Gulliver, S. Jitman, and S. Thipworawimon. Linear ℓ-intersection
pairs of codes and their applications. Des. Codes Cryptogr., 88:133–152, 2020.
[860] T. A. Gulliver. A new two-weight code and strongly regular graph. Appl. Math.
Lett., 9(2):17–20, 1996.
[861] T. A. Gulliver. Two new optimal ternary two-weight codes and strongly regular
graphs. Discrete Math., 149:83–92, 1996.
[862] T. A. Gulliver and V. K. Bhargava. Nine good rate (m − 1)/pm quasi-cyclic codes.
IEEE Trans. Inform. Theory, 38:1366–1369, 1992.
[863] T. A. Gulliver and V. K. Bhargava. New good rate (m − 1)/pm ternary and quater-
nary quasi-cyclic codes. Des. Codes Cryptogr., 7:223–233, 1996.
[869] C. Güneri and F. Özbudak. The concatenated structure of quasi-cyclic codes and an
improvement of Jensen’s bound. IEEE Trans. Inform. Theory, 59:979–985, 2013.
[870] C. Güneri, B. Özkaya, and S. Sayici. On linear complementary pair of nD cyclic
codes. IEEE Commun. Lett., 22:2404–2406, 2018.
[871] C. Güneri, B. Özkaya, and P. Solé. Quasi-cyclic complementary dual codes. Finite
Fields Appl., 42:67–80, 2016.
[872] C. Güneri, H. Stichtenoth, and İ. Taşkın. Further improvements on the designed
minimum distance of algebraic geometry codes. J. Pure Appl. Algebra, 213:87–97,
2009.
[873] A. Günther and G. Nebe. Automorphisms of doubly even self-dual binary codes.
Bull. Lond. Math. Soc., 41:769–778, 2009.
[874] Q. Guo, T. Johansson, and P. Stankovski. A key recovery attack on MDPC with CCA
security using decoding errors. In J. H. Cheon and T. Takagi, editors, Advances in
Cryptology – ASIACRYPT 2016, Part 1, volume 10031 of Lecture Notes in Comput.
Sci., pages 789–815. Springer, Berlin, Heidelberg, 2016.
[875] Gurobi Optimizer Reference Manual. gurobi.com.
[876] V. Guruswami. List Decoding of Error-Correcting Codes: Winning Thesis of the 2002
ACM Doctoral Dissertation Competition, volume 3282 of Lecture Notes in Comput.
Sci. Springer, Berlin, Heidelberg, 2005.
[877] V. Guruswami and W. Machmouchi. Explicit interleavers for a repeat accumulate
accumulate (RAA) code construction. In F. R. Kschischang and E.-H. Yang, edi-
tors, Proc. IEEE International Symposium on Information Theory, pages 1968–1972,
Toronto, Canada, July 2008.
[878] V. Guruswami and A. Rudra. Explicit codes achieving list decoding capacity: error-
correction with optimal redundancy. IEEE Trans. Inform. Theory, 54:135–150, 2008.
[879] V. Guruswami, A. Rudra, and M. Sudan. Essential Coding Theory. Draft
available at https://cse.buffalo.edu/faculty/atri/courses/coding-theory/
book/web-coding-book.pdf, 2012.
[881] V. Guruswami and C. Wang. Linear-algebraic list decoding for variants of Reed-
Solomon codes. IEEE Trans. Inform. Theory, 59:3257–3268, 2013.
[882] V. Guruswami and M. Wootters. Repairing Reed-Solomon codes. IEEE Trans.
Inform. Theory, 63:5684–5698, 2017.
[883] V. Guruswami and P. Xia. Polar codes: speed of polarization and polynomial gap to
capacity. IEEE Trans. Inform. Theory, 61:3–16, 2015.
[884] L. Guth and N. H. Katz. Algebraic methods in discrete analogs of the Kakeya
problem. Adv. Math., 225:2828–2839, 2010.
[885] L. Guth and N. H. Katz. On the Erdős distinct distances problem in the plane. Ann.
of Math. (2), 181:155–190, 2015.
[886] J. Hagenauer, E. Offer, and L. Papke. Iterative decoding of binary block and con-
volutional codes. IEEE Trans. Inform. Theory, 42:429–445, 1996.
[887] M. Hall, Jr. A survey of difference sets. Proc. Amer. Math. Soc., 7:975–986, 1956.
[888] M. Hall, Jr. Combinatorial Theory. Blaisdell Publishing Company, 1967.
[889] N. Hamada. On the p-rank of the incidence matrix of a balanced or partially balanced
incomplete block design and its application to error correcting codes. Hiroshima
Math. J., 3:154–226, 1973.
[890] N. Hamada and M. Deza. A survey of recent works with respect to a characterization
of an (n, k, d; q)-code meeting the Griesmer bound using a min-hyper in a finite
projective geometry. Discrete Math., 77:75–87, 1989.
[891] N. Hamada and T. Helleseth. Arcs, blocking sets, and minihypers. Comput. Math.
Appl., 39:159–168, 2000.
[892] N. Hamada and H. Ohmori. On the BIB-design having the minimum p-rank. J.
Combin. Theory Ser. A, 18:131–140, 1975.
[893] Y. Hamdaoui and N. Sendrier. A non asymptotic analysis of information set decoding. IACR Cryptology ePrint Archive, Report 2013/162, 2013. https://eprint.iacr.org/2013/162.
[894] N. Hamilton. Strongly regular graphs from differences of quadrics. Discrete Math.,
256:465–469, 2002.
[895] R. W. Hamming. Error detecting and error correcting codes. Bell System Tech. J.,
29:10–23, 1950. (Also reprinted in [169] pages 9–12 and [212] pages 10–23).
[896] R. W. Hamming. Coding and Information Theory. Prentice Hall, Englewood Cliffs,
New Jersey, 2nd edition, 1986.
[897] A. R. Hammons and H. El Gamal. On the theory of space-time codes for PSK
modulation. IEEE Trans. Inform. Theory, 46:524–542, 2000.
[898] A. R. Hammons, Jr., P. V. Kumar, A. R. Calderbank, N. J. A. Sloane, and P. Solé.
The Z4-linearity of Kerdock, Preparata, Goethals, and related codes. IEEE Trans.
Inform. Theory, 40:301–319, 1994.
878 Bibliography
[922] B. Hassibi and H. Vikalo. On the sphere decoding algorithm I. Expected complexity.
IEEE Trans. Signal Process., 53:2806–2818, 2005.
[923] J. Håstad. Clique is hard to approximate within n^{1−ε}. Acta Math., 182:105–142,
1999.
[924] E. R. Hauge and T. Helleseth. De Bruijn sequences, irreducible codes and cyclotomy.
Discrete Math., 159:143–154, 1996.
[925] E. R. Hauge and J. Mykkeltveit. On the classification of de Bruijn sequences. Discrete
Math., 148:65–83, 1996.
[928] A. S. Hedayat, N. J. A. Sloane, and J. Stufken. Orthogonal Arrays. Springer Nature,
New York, 1999.
[929] O. Heden. A survey of perfect codes. Adv. Math. Commun., 2:223–247, 2008.
[930] O. Heden. On perfect codes over non prime power alphabets. In A. A. Bruen and
D. L. Wehlau, editors, Error-Correcting Codes, Finite Geometries and Cryptography,
volume 523 of Contemp. Math., pages 173–184. Amer. Math. Soc., Providence, Rhode
Island, 2010.
[931] O. Heden and C. Roos. The non-existence of some perfect codes over non-prime
power alphabets. Discrete Math., 311:1344–1348, 2011.
[934] T. Helleseth. Some results about the cross-correlation function between two maximal
linear sequences. Discrete Math., 16:209–232, 1976.
[935] T. Helleseth. Nonlinear shift registers—a survey and challenges. In Algebraic Curves
and Finite Fields, volume 16 of Radon Ser. Comput. Appl. Math., pages 121–144.
De Gruyter, Berlin, 2014.
[936] T. Helleseth, L. Hu, A. Kholosha, X. Zeng, N. Li, and W. Jiang. Period-different m-
sequences with at most four-valued cross correlation. IEEE Trans. Inform. Theory,
55:3305–3311, 2009.
[937] T. Helleseth and A. Kholosha. Monomial and quadratic bent functions over the finite
fields of odd characteristic. IEEE Trans. Inform. Theory, 52:2018–2032, 2006.
[938] T. Helleseth and A. Kholosha. New binomial bent functions over the finite fields of
odd characteristic. IEEE Trans. Inform. Theory, 56:4646–4652, 2010.
[939] T. Helleseth, T. Kløve, and J. Mykkeltveit. The weight distribution of irreducible cyclic codes with block lengths n1((q^ℓ − 1)/N). Discrete Math., 18:179–211, 1977.
[940] T. Helleseth and P. V. Kumar. Sequences with Low Correlation. In V. S. Pless and
W. C. Huffman, editors, Handbook of Coding Theory, Vol. I, II, chapter 21, pages
1765–1853. North-Holland, Amsterdam, 1998.
[941] T. Helleseth and P. Rosendahl. New pairs of m-sequences with 4-level cross-
correlation. Finite Fields Appl., 11:674–683, 2005.
[942] Z. L. Heng and C. Ding. A construction of q-ary linear codes with irreducible cyclic
codes. Des. Codes Cryptogr., 87:1087–1108, 2019.
[943] Z. L. Heng, C. Ding, and Z. Zhou. Minimal linear codes over finite fields. Finite
Fields Appl., 54:176–196, 2018.
[944] Z. L. Heng and Q. Yue. A class of binary linear codes with at most three weights.
IEEE Commun. Lett., 19:1488–1491, 2015.
[945] Z. L. Heng and Q. Yue. A class of q-ary linear codes derived from irreducible cyclic
codes. arXiv:1511.09174 [cs.IT], April 2016.
[946] F. Hergert. The equivalence classes of the Vasil’ev codes of length 15. In D. Jung-
nickel and K. Vedder, editors, Combinatorial Theory, volume 969 of Lecture Notes
in Math., pages 176–186. Springer-Verlag, Berlin, Heidelberg, 1982.
[947] A. O. Hero. Secure space-time communication. IEEE Trans. Inform. Theory,
49:3235–3249, 2003.
[948] D. Hertel. A note on the Kasami power function. Cryptology ePrint Archive, Report
2005/436, 2005. https://eprint.iacr.org/2005/436.
[949] M. Herzog and J. Schönheim. Group partition, factorization and the vector covering
problem. Canad. Math. Bull., 15:207–214, 1972.
[950] F. Hess. Computing Riemann-Roch spaces in algebraic function fields and related
topics. J. Symbolic Comput., 33:425–445, 2002.
[951] R. Hill. On the largest size of cap in S5,3. Atti Accad. Naz. Lincei Rend. Cl. Sci.
Fis. Mat. Nat. (8), 54:378–384 (1974), 1973.
[952] R. Hill. Caps and groups. In Colloquio Internazionale sulle Teorie Combinatorie
(Rome, 1973), Tomo II, pages 389–394. Atti dei Convegni Lincei, No. 17. Accad.
Naz. Lincei, Rome, 1976.
[953] R. Hill. An extension theorem for linear codes. Des. Codes Cryptogr., 17:151–157,
1999.
[954] R. Hill and E. Kolev. A survey of recent results on optimal linear codes. In Combina-
torial Designs and their Applications (Milton Keynes, 1997), volume 403 of Chapman
& Hall/CRC Res. Notes Math., pages 127–152. Chapman & Hall/CRC, Boca Raton,
Florida, 1999.
[955] R. Hill and H. N. Ward. A geometric approach to classifying Griesmer codes. Des.
Codes Cryptogr., 44:169–196, 2007.
[956] M. Hirotomo, M. Takita, and M. Morii. Syndrome decoding of symbol-pair codes.
In Proc. IEEE Information Theory Workshop, pages 162–166, Hobart, Tasmania,
November 2014.
[957] J. W. P. Hirschfeld. Rational curves on quadrics over finite fields of characteristic
two. Rend. Mat. (6), 4:773–795, 1971.
[958] J. W. P. Hirschfeld. Finite Projective Spaces of Three Dimensions. Oxford Math-
ematical Monographs. The Clarendon Press, Oxford University Press, New York,
1985. Oxford Science Publications.
[959] J. W. P. Hirschfeld. Projective Geometries over Finite Fields. Oxford Mathematical
Monographs. The Clarendon Press, Oxford University Press, New York, 2nd edition,
1998. Oxford Science Publications.
[960] J. W. P. Hirschfeld, G. Korchmáros, and F. Torres. Algebraic Curves Over a Fi-
nite Field. Princeton Series in Applied Mathematics. Princeton University Press,
Princeton, New Jersey, 2008.
[961] J. W. P. Hirschfeld and L. Storme. The packing problem in statistics, coding theory
and finite projective spaces: Update 2001. In Finite Geometries, volume 3 of Dev.
Math., pages 201–246. Kluwer Acad. Publ., Dordrecht, Netherlands, 2001.
[962] J. W. P. Hirschfeld and J. A. Thas. Open problems in finite projective spaces. Finite
Fields Appl., 32:44–81, 2015.
[963] J. W. P. Hirschfeld and J. A. Thas. General Galois Geometries. Springer Monographs
in Mathematics. Springer, London, 2016.
[964] T. Ho, M. Médard, R. Kötter, D. R. Karger, M. Effros, J. Shi, and B. Leong. A
random linear network coding approach to multicast. IEEE Trans. Inform. Theory,
52:4413–4430, 2006.
[965] S. A. Hobart. Bounds on subsets of coherent configurations. Michigan Math. J.,
58:231–239, 2009.
[966] S. A. Hobart and J. Williford. The absolute bound for coherent configurations.
Linear Algebra Appl., 440:50–60, 2014.
[967] B. M. Hochwald and T. L. Marzetta. Unitary space-time modulation for multiple-
antenna communications in Rayleigh flat fading. IEEE Trans. Inform. Theory,
46:543–564, 2000.
[972] C. Hollanti, J. Lahtonen, and H.-F. Lu. Maximal orders in the design of dense
space-time lattice codes. IEEE Trans. Inform. Theory, 54:4493–4510, 2008.
[973] H. D. L. Hollmann. A relation between Levenshtein-type distances and insertion-and-
deletion correcting capabilities of codes. IEEE Trans. Inform. Theory, 39:1424–1427,
1993.
[974] H. D. L. Hollmann and Q. Xiang. A proof of the Welch and Niho conjectures on
cross-correlations of binary m-sequences. Finite Fields Appl., 7:253–286, 2001.
[975] Y. Hong. On the nonexistence of unknown perfect 6- and 8-codes in Hamming
schemes H(n, q) with q arbitrary. Osaka J. Math., 21:687–700, 1984.
[976] Y. Hong, E. Viterbo, and J.-C. Belfiore. Golden space-time trellis coded modulation.
IEEE Trans. Inform. Theory, 53:1689–1705, 2007.
[977] T. Honold, M. Kiermaier, and S. Kurz. Constructions and bounds for mixed-
dimension subspace codes. Adv. Math. Commun., 10:649–682, 2016.
[978] T. Honold and I. Landjev. Linear representable codes over chain rings. In Proc.
6th International Workshop on Algebraic and Combinatorial Coding Theory, pages
135–141, Pskov, Russia, September 1998.
[979] T. Honold and I. Landjev. Linearly representable codes over chain rings. Abh. Math.
Sem. Univ. Hamburg, 69:187–203, 1999.
[980] T. Honold and I. Landjev. Linear codes over finite chain rings. Electron. J. Combin.,
7:Research Paper 11, 22 pages, 2000.
[981] T. Honold and I. Landjev. Linear codes over finite chain rings and projective Hjelm-
slev geometries. In P. Solé, editor, Codes Over Rings, volume 6 of Series on Coding
Theory and Cryptology, pages 60–123. World Sci. Publ., Hackensack, New Jersey,
2009.
[982] S. Hoory, N. Linial, and A. Wigderson. Expander graphs and their applications.
Bull. Amer. Math. Soc. (N.S.), 43:439–561, 2006.
[983] N. J. Hopper and M. Blum. Secure human identification protocols. In C. Boyd,
editor, Advances in Cryptology – ASIACRYPT 2001, volume 2248 of Lecture Notes
in Comput. Sci., pages 52–66. Springer, Berlin, Heidelberg, 2001.
[1007] W. C. Huffman. Self-dual codes over F2 + uF2 with an automorphism of odd order.
Finite Fields Appl., 15:277–293, 2009.
[1008] W. C. Huffman and V. S. Pless. Fundamentals of Error-Correcting Codes. Cambridge
University Press, Cambridge, UK, 2003.
[1009] W. C. Huffman and V. D. Tonchev. The existence of extremal self-dual [50, 25, 10]
codes and quasi-symmetric 2-(49, 9, 6) designs. Des. Codes Cryptogr., 6:97–106, 1995.
[1010] W. C. Huffman and V. Y. Yorgov. A [72, 36, 16] doubly even code does not have an
automorphism of order 11. IEEE Trans. Inform. Theory, 33:749–752, 1987.
[1011] B. L. Hughes. Differential space-time modulation. IEEE Trans. Inform. Theory,
46:2567–2578, 2000.
[1012] G. Hughes. Structure theorems for group ring codes with an application to self-dual
codes. Des. Codes Cryptogr., 24:5–14, 2001.
[1013] B. Huppert and N. Blackburn. Finite Groups II, volume 242 of Grundlehren der
Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences].
Springer-Verlag, Berlin-New York, 1982.
[1014] R. Hutchinson. The existence of strongly MDS convolutional codes. SIAM J. Control
Optim., 47:2812–2826, 2008.
[1015] R. Hutchinson, J. Rosenthal, and R. Smarandache. Convolutional codes with maxi-
mum distance profile. Systems Control Lett., 54:53–63, 2005.
[1016] J. Y. Hyun and H. K. Kim. The poset structures admitting the extended binary
Hamming code to be a perfect code. Discrete Math., 288:37–47, 2004.
[1017] J. Y. Hyun, H. K. Kim, and J. R. Park. The weighted poset metrics and directed
graph metrics. arXiv:1703.00139 [math.CO], March 2017.
[1018] J. Y. Hyun, J. Lee, and Y. Lee. Explicit criteria for construction of plateaued
functions. IEEE Trans. Inform. Theory, 62:7555–7565, 2016.
[1019] J. Y. Hyun and Y. Lee. MDS poset-codes satisfying the asymptotic Gilbert-
Varshamov bound in Hamming weights. IEEE Trans. Inform. Theory, 57:8021–8026,
2011.
[1020] IBM ILOG CPLEX Optimization Studio. ibm.com.
[1021] Y. Ihara. Some remarks on the number of rational points of algebraic curves over
finite fields. J. Fac. Sci. Univ. Tokyo Sect. IA Math., 28:721–724 (1982), 1981.
[1022] T. Ikuta and A. Munemasa. A new example of non-amorphous association schemes.
Contrib. Discrete Math., 3(2):31–36, 2008.
[1023] T. Ikuta and A. Munemasa. Pseudocyclic association schemes and strongly regular
graphs. European J. Combin., 31:1513–1519, 2010.
[1024] S. P. Inamdar and N. S. Narasimha Sastry. Codes from Veronese and Segre embed-
dings and Hamada’s formula. J. Combin. Theory Ser. A, 96:20–30, 2001.
[1025] L. Ioffe and M. Mézard. Asymmetric quantum error-correcting codes. Physical
Review A, 75(3):032345, March 2007.
[1026] Y. Ishai, E. Kushilevitz, R. Ostrovsky, and A. Sahai. Zero-knowledge proofs from
secure multiparty computation. SIAM J. Comput., 39:1121–1152, 2009.
[1027] W. A. Jackson, K. M. Martin, and C. M. O’Keefe. Multi-secret sharing schemes.
In Y. G. Desmedt, editor, Advances in Cryptology – CRYPTO ’94, volume 839 of
Lecture Notes in Comput. Sci., pages 150–163. Springer, Berlin, Heidelberg, 1994.
[1028] N. Jacobson. The Theory of Rings, volume II of Amer. Math. Soc. Mathematical
Surveys. Amer. Math. Soc., New York, 1943.
[1029] N. Jacobson. Finite-Dimensional Division Algebras over Fields. Springer-Verlag,
Berlin, 1996.
[1030] H. Jafarkhani. A quasi-orthogonal space-time block code. IEEE Trans. Commun.,
49:1–4, 2001.
[1031] D. B. Jaffe. Optimal binary linear codes of length ≤ 30. Discrete Math., 223:135–155,
2000.
[1032] D. B. Jaffe and J. Simonis. New binary linear codes which are dual transforms of
good codes. IEEE Trans. Inform. Theory, 45:2136–2137, 1999.
[1033] D. B. Jaffe and V. D. Tonchev. Computing linear codes and unitals. Des. Codes
Cryptogr., 14:39–52, 1998.
[1034] S. Jaggi, P. Sanders, P. A. Chou, M. Effros, S. Egner, K. Jain, and L. M. G. M.
Tolhuizen. Polynomial time algorithms for multicast network code construction.
IEEE Trans. Inform. Theory, 51:1973–1982, 2005.
[1035] Y. Jang and J. Park. On a MacWilliams type identity and a perfectness for a binary
linear (n, n − 1, j)-poset code. Discrete Math., 265:85–104, 2003.
[1036] C. J. A. Jansen, W. G. Franx, and D. E. Boekee. An efficient algorithm for the
generation of de Bruijn cycles. IEEE Trans. Inform. Theory, 37:1475–1478, 1991.
[1037] H. L. Janwa and A. K. Lal. On expander graphs: parameters and applications.
arXiv:cs/0406048 [cs.IT], June 2004.
[1038] H. L. Janwa and O. Moreno. McEliece public key cryptosystems using algebraic-
geometric codes. Des. Codes Cryptogr., 8:293–307, 1996.
[1039] H. L. Janwa and R. M. Wilson. Hyperplane sections of Fermat varieties in P^3
in char. 2 and some applications to cyclic codes. In G. D. Cohen, T. Mora, and
O. Moreno, editors, Applied Algebra, Algebraic Algorithms and Error-Correcting
Codes, volume 673 of Lecture Notes in Comput. Sci., pages 180–194. Springer, Berlin,
Heidelberg, 1993.
[1040] J. Jedwab, D. J. Katz, and K.-U. Schmidt. Advances in the merit factor problem
for binary sequences. J. Combin. Theory Ser. A, 120:882–906, 2013.
[1041] S. A. Jennings. The structure of the group ring of a p-group over a modular field.
Trans. Amer. Math. Soc., 50:175–185, 1941.
[1042] J. M. Jensen. The concatenated structure of cyclic and abelian codes. IEEE Trans.
Inform. Theory, 31:788–793, 1985.
[1043] Y. Jia, S. Ling, and C. Xing. On self-dual cyclic codes over finite fields. IEEE Trans.
Inform. Theory, 57:2243–2251, 2011.
[1048] G. R. Jithamitra and B. Sundar Rajan. Minimizing the complexity of fast sphere
decoding of STBCs. In A. Kuleshov, V. M. Blinovsky, and A. Ephremides, editors,
Proc. IEEE International Symposium on Information Theory, pages 1846–1850, St.
Petersburg, Russia, July-August 2011.
[1049] S. Jitman, S. Ling, H. Liu, and X. Xie. Checkable codes from group rings.
arXiv:1012.5498 [cs.IT], December 2010.
[1050] S. Jitman, S. Ling, H. Liu, and X. Xie. Abelian codes in principal ideal group
algebras. IEEE Trans. Inform. Theory, 59:3046–3058, 2013.
[1051] R. Johannesson and K. S. Zigangirov. Fundamentals of Convolutional Coding. IEEE
Series on Digital & Mobile Communication. IEEE Press, New York, 1999.
[1052] S. J. Johnson and S. R. Weller. Regular low-density parity-check codes from combi-
natorial designs. In Proc. IEEE Information Theory Workshop, pages 90–92, Cairns,
Australia, September 2001.
[1053] S. J. Johnson and S. R. Weller. Codes for low-density parity check codes from Kirk-
man triple systems. In Proc. GLOBECOM ’02 – IEEE Global Telecommunications
Conference, volume 2, pages 970–974, Taipei, Taiwan, November 2002.
[1054] S. J. Johnson and S. R. Weller. Codes for iterative decoding from partial geometries.
IEEE Trans. Commun., 52:236–243, 2004.
[1055] G. C. Jorge, A. A. de Andrade, S. I. R. Costa, and J. E. Strapasson. Algebraic
constructions of densest lattices. J. Algebra, 429:218–235, 2015.
[1056] D. Jungnickel. Characterizing geometric designs, II. J. Combin. Theory Ser. A,
118:623–633, 2011.
[1057] D. Jungnickel and A. Pott. Perfect and almost perfect sequences. Discrete Appl.
Math., 95:331–359, 1999.
[1058] D. Jungnickel and V. D. Tonchev. Polarities, quasi-symmetric designs, and Hamada’s
conjecture. Des. Codes Cryptogr., 51:131–140, 2009.
[1059] D. Jungnickel and V. D. Tonchev. The number of designs with geometric parameters
grows exponentially. Des. Codes Cryptogr., 55:131–140, 2010.
[1060] D. Jungnickel and V. D. Tonchev. The classification of antipodal two-weight linear
codes. Finite Fields Appl., 50:372–381, 2018.
[1061] R. Jurrius and R. Pellikaan. On defining generalized rank weights. Adv. Math.
Commun., 11:225–235, 2017.
[1062] R. Jurrius and R. Pellikaan. Defining the q-analogue of a matroid. Electron. J.
Combin., 25(3):Paper 3.2, 32 pages, 2018.
[1063] J. Justesen. A class of constructive asymptotically good algebraic codes. IEEE
Trans. Inform. Theory, 18:652–656, 1972.
[1064] J. Justesen. An algebraic construction of rate 1/v convolutional codes. IEEE Trans.
Inform. Theory, 21:577–580, 1975.
[1065] J. Justesen, K. J. Larsen, H. E. Jensen, A. Havemose, and T. Høholdt. Construction
and decoding of a class of algebraic geometry codes. IEEE Trans. Inform. Theory,
35:811–821, 1989.
[1066] G. A. Kabatianskii, E. A. Krouk, and B. Smeets. A digital signature scheme based
on random error-correcting codes. In M. Darnell, editor, Cryptography and Coding,
volume 1355 of Lecture Notes in Comput. Sci., pages 161–167. Springer, Berlin,
Heidelberg, 1997.
[1067] G. A. Kabatianskii and V. I. Levenshtein. On bounds for packings on a sphere
and in space (in Russian). Problemy Peredači Informacii, 14(1):3–25, 1978. (English
translation in Probl. Inform. Transm., 14(1):1–17, 1978).
[1068] G. A. Kabatianskii and V. I. Panchenko. Packings and coverings of the Hamming
space by balls of unit radius (in Russian). Problemy Peredači Informacii, 24(4):3–16, 1988. (English translation in Probl. Inform. Transm., 24(4):261–272, 1988).
[1069] G. Kachigar and J.-P. Tillich. Quantum information set decoding algorithms. In
T. Lange and T. Takagi, editors, Post-Quantum Cryptography, volume 10346 of Lec-
ture Notes in Comput. Sci., pages 69–89. Springer, Berlin, Heidelberg, 2017.
[1070] S. Kadhe and A. R. Calderbank. Rate optimal binary linear locally repairable codes
with small availability. In Proc. IEEE International Symposium on Information
Theory, pages 166–170, Aachen, Germany, June 2017.
[1071] X. Kai, S. Zhu, and P. Li. A construction of new MDS symbol-pair codes. IEEE
Trans. Inform. Theory, 61:5828–5834, 2015.
[1072] X. Kai, S. Zhu, and L. Wang. A family of constacyclic codes over F2 + uF2 + vF2 +
uvF2. J. Syst. Sci. Complex., 25:1032–1040, 2012.
[1073] T. Kailath. Linear Systems. Prentice-Hall, Inc., Englewood Cliffs, New Jersey, 1980.
[1078] W. M. Kantor. Spreads, translation planes and Kerdock sets. II. SIAM J. Algebraic
Discrete Methods, 3:308–318, 1982.
[1079] W. M. Kantor. Exponential numbers of two-weight codes, difference sets and sym-
metric designs. Discrete Math., 46:95–98, 1983.
[1086] T. Kasami. Weight distribution formula for some class of cyclic codes. Technical
Report No. R-285, Coordinated Science Laboratory, University of Illinois at Urbana-
Champaign, 1966.
[1087] T. Kasami. The weight enumerators for several classes of subcodes of the 2nd order
binary Reed-Muller codes. Inform. and Control, 18:369–394, 1971.
[1088] T. Kasami. A Gilbert-Varshamov bound for quasi-cyclic codes of rate 1/2. IEEE
Trans. Inform. Theory, 20:679, 1974.
[1091] P. Kaski and O. Pottonen. Libexact User’s Guide: Version 1.0. Number 2008-1 in
HIIT technical reports. Helsinki Institute for Information Technology HIIT, Finland,
2008.
[1092] G. L. Katsman and M. A. Tsfasman. A remark on algebraic geometric codes. In
Representation Theory, Group Rings, and Coding Theory, volume 93 of Contemp.
Math., pages 197–199. Amer. Math. Soc., Providence, Rhode Island, 1989.
[1093] D. J. Katz. Sequences with low correlation. In L. Budaghyan and F. Rodrı́guez-
Henrı́quez, editors, Arithmetic of Finite Fields, volume 11321 of Lecture Notes in
Comput. Sci., pages 149–172. Springer, Berlin, Heidelberg, 2018.
[1094] M. L. Kaushik. Necessary and sufficient number of parity checks in codes correcting
random errors and bursts with weight constraints under a new metric. Information
Sciences, 19:81–90, 1979.
[1095] C. A. Kelley. On codes designed via algebraic lifts of graphs. In Proc. 46th Allerton
Conference on Communication, Control, and Computing, pages 1254–1261, Monti-
cello, Illinois, September 2008.
[1096] C. A. Kelley. Minimum distance and pseudodistance lower bound for generalized
LDPC codes. Internat. J. Inform. Coding Theory, 1:313–333, 2010.
[1097] C. A. Kelley. Algebraic design and implementation of protograph codes using non-
commuting permutation matrices. IEEE Trans. Commun., 61:910–918, 2013.
[1103] C. A. Kelley and J. Walker. LDPC codes from voltage graphs. In F. R. Kschischang
and E.-H. Yang, editors, Proc. IEEE International Symposium on Information The-
ory, pages 792–796, Toronto, Canada, July 2008.
[1122] T. Kiran and B. S. Rajan. STBC-schemes with non-vanishing determinant for certain
number of transmit antennas. IEEE Trans. Inform. Theory, 51:2984–2992, 2005.
[1123] T. Kiran and B. S. Rajan. Distributed space-time codes with reduced decoding
complexity. In Proc. IEEE International Symposium on Information Theory, pages
542–546, Seattle, Washington, July 2006.
[1124] C. Kirfel and R. Pellikaan. The minimum distance of codes in an array coming from
telescopic semigroups. IEEE Trans. Inform. Theory, 41:1720–1732, 1995.
[1125] A. Y. Kitaev. Fault-tolerant quantum computation by anyons. Ann. Physics, 303:2–
30, 2003.
[1126] B. Kitchens. Symbolic dynamics and convolutional codes. In Codes, Systems, and
Graphical Models (Minneapolis, MN, 1999), volume 123 of IMA Vol. Math. Appl.,
pages 347–360. Springer, New York, 2001.
[1127] E. Kleinfeld. Finite Hjelmslev planes. Illinois J. Math., 3:403–407, 1959.
[1128] T. Kløve. Error Correction Codes for the Asymmetric Channel. Department of
Informatics, University of Bergen, Norway, 1983, bibliography updated 1995.
[1129] W. Knapp and P. Schmid. Codes with prescribed permutation group. J. Algebra,
67:415–435, 1980.
[1130] E. Knill and R. Laflamme. Theory of quantum error-correcting codes. Physical
Review A, 55:900–911, February 1997.
[1131] E. Knill, R. Laflamme, and W. H. Zurek. Resilient quantum computation. Science,
279(5349):342–345, 1998.
[1132] D. E. Knuth. The Art of Computer Programming. Volume 4A: Combinatorial Algo-
rithms, Part 1. Addison-Wesley Professional, 2011.
[1133] D. E. Knuth. The Art of Computer Programming. Volume 4. Fascicle 5: Mathemat-
ical Preliminaries Redux; Introduction to Backtracking; Dancing Links. Addison-
Wesley, 2015.
[1134] D. E. Knuth. The Art of Computer Programming. Volume 4: Satisfiability. Addison-
Wesley Professional, 2015.
[1135] A. Kohnert. Update on the extension of good linear codes. In Combinatorics 2006,
volume 26 of Electron. Notes Discrete Math., pages 81–85. Elsevier Sci. B. V., Am-
sterdam, 2006.
[1136] A. Kohnert. Constructing two-weight codes with prescribed groups of automor-
phisms. Discrete Appl. Math., 155:1451–1457, 2007.
[1137] A. Kohnert. Construction of linear codes having prescribed primal-dual minimum
distance with applications in cryptography. Albanian J. Math., 2:221–227, 2008.
[1138] A. Kohnert. (l, s)-extension of linear codes. Discrete Math., 309:412–417, 2009.
[1139] A. Kohnert and S. Kurz. Construction of large constant dimension codes with a
prescribed minimum distance. In J. Calmet, W. Geiselmann, and J. Müller-Quade,
editors, Mathematical Methods in Computer Science: Essays in Memory of Thomas
Beth, volume 5393 of Lecture Notes in Comput. Sci., pages 31–42. Springer, Berlin,
Heidelberg, 2008.
[1158] R. Kötter and P. O. Vontobel. Graph covers and iterative decoding of finite-length
codes. In Proc. 3rd International Symposium on Turbo Codes and Related Topics,
pages 75–82, Brest, France, September 2003.
[1159] Y. Kou, S. Lin, and M. P. C. Fossorier. Low-density parity-check codes based on
finite geometries: a rediscovery and more. IEEE Trans. Inform. Theory, 47:2711–
2736, 2001.
[1160] I. Kra and S. R. Simanca. On circulant matrices. Notices Amer. Math. Soc.,
59(3):368–377, 2012.
[1161] E. S. Kramer and D. M. Mesner. t-designs on hypergraphs. Discrete Math., 15:263–
296, 1976.
[1162] D. L. Kreher and D. R. Stinson. Combinatorial Algorithms: Generation, Enumera-
tion, and Search. CRC Press, Boca Raton, Florida, 1999.
[1163] E. I. Krengel and P. V. Ivanov. Two constructions of binary sequences with optimal
autocorrelation magnitude. Electron. Lett., 52:1457–1459, 2016.
[1164] M. N. Krishnan and P. V. Kumar. On MBR codes with replication. In Proc. IEEE
International Symposium on Information Theory, pages 71–75, Barcelona, Spain,
July 2016.
[1165] M. N. Krishnan, M. Vajha, V. Ramkumar, B. Sasidharan, S. B. Balaji, and P. V.
Kumar. Erasure coding for big data. Advanced Computing and Communications
(ACCS), 3(1):6 pages, 2019.
[1166] D. S. Krotov, P. R. J. Östergård, and O. Pottonen. On optimal binary one-error-
correcting codes of lengths 2^m − 4 and 2^m − 3. IEEE Trans. Inform. Theory, 57:6771–
6779, 2011.
[1167] E. A. Krouk. Decoding complexity bound for linear block codes (in Russian). Prob-
lemy Peredači Informacii, 25(3):103–107, 1989. (English translation in Probl. Inform. Transm., 25(3):251–254, 1989).
[1168] V. Krčadinac. Steiner 2-designs S(2, 4, 28) with nontrivial automorphisms. Glasnik
Matematički, 37:259–268, 2002.
[1169] F. R. Kschischang. An introduction to network coding. In M. Médard and
A. Sprintson, editors, Network Coding: Fundamentals and Applications, chapter 1.
Academic Press, Waltham, Massachusetts, 2012.
[1170] F. R. Kschischang, B. Frey, and H.-A. Loeliger. Factor graphs and the sum-product
algorithm. IEEE Trans. Inform. Theory, 47:498–519, 2001.
[1171] A. Kshevetskiy and È. M. Gabidulin. The new construction of rank codes. In Proc.
IEEE International Symposium on Information Theory, pages 2105–2108, Adelaide,
Australia, September 2005.
[1172] S. Kudekar, T. J. Richardson, and R. L. Urbanke. Threshold saturation via spatial
coupling: why convolutional LDPC ensembles perform so well over the BEC. IEEE
Trans. Inform. Theory, 57:803–834, 2011.
[1173] S. Kudekar, T. J. Richardson, and R. L. Urbanke. Spatially coupled ensembles
universally achieve capacity under belief propagation. IEEE Trans. Inform. Theory,
59:7761–7783, 2013.
[1174] K. R. Kumar and G. Caire. Construction of structured LaST codes. In Proc. IEEE
International Symposium on Information Theory, pages 2834–2838, Seattle, Wash-
ington, July 2006.
[1175] N. Kumar and A. K. Singh. DNA computing over the ring Z4[v]/⟨v^2 − v⟩. Internat. J. Biomath., 11(7):1850090, 18 pages, 2018.
[1176] P. V. Kumar, R. A. Scholtz, and L. R. Welch. Generalized bent functions and their
properties. J. Combin. Theory Ser. A, 40:90–107, 1985.
[1177] S.-Y. Kung, B. C. Lévy, M. Morf, and T. Kailath. New results in 2-D systems
theory, part II: 2-D state-space models—realization and the notions of controllability,
observability, and minimality. Proc. IEEE, 65:945–961, 1977.
[1182] G. B. Kyung and C. C. Wang. Finding the exhaustive list of small fully absorbing
sets and designing the corresponding low error-floor decoder. IEEE Trans. Commun.,
60:1487–1498, 2012.
[1183] G. G. La Guardia. Asymmetric quantum codes: new codes from old. Quantum
Inform. Process., 12:2771–2790, 2013.
[1184] G. G. La Guardia. Erratum to: Asymmetric quantum codes: new codes from old.
Quantum Inform. Process., 12:2791, 2013.
[1185] H. Laakso. Nonexistence of nontrivial perfect codes in the case q = p1^a p2^b p3^c, e ≥ 3.
Ann. Univ. Turku. Ser. A I, 177, 43 pages, 1979. Dissertation, University of Turku,
Turku, 1979.
[1186] A. Laaksonen and P. R. J. Östergård. Constructing error-correcting binary codes
using transitive permutation groups. Discrete Appl. Math., 233:65–70, 2017.
[1187] R. Laflamme, C. Miquel, J. P. Paz, and W. H. Zurek. Perfect quantum error cor-
recting code. Physical Review Lett., 77:198–201, July 1996.
[1190] K. Lally. Algebraic lower bounds on the free distance of convolutional codes. IEEE
Trans. Inform. Theory, 52:2101–2110, 2006.
[1191] K. Lally and P. Fitzpatrick. Algebraic structure of quasi-cyclic codes. Discrete Appl.
Math., 111:157–175, 2001.
[1192] C. W. H. Lam and L. Thiel. Backtrack search with isomorph rejection and consis-
tency check. J. Symbolic Comput., 7:473–485, 1989.
[1193] T. Y. Lam. A general theory of Vandermonde matrices. Exposition. Math., 4:193–
215, 1986.
[1194] T. Y. Lam and A. Leroy. Vandermonde and Wronskian matrices over division rings.
J. Algebra, 119:308–336, 1988.
[1195] T. Y. Lam and A. Leroy. Wedderburn polynomials over division rings. I. J. Pure
Appl. Algebra, 186:43–76, 2004.
[1196] T. Y. Lam, A. Leroy, and A. Ozturk. Wedderburn polynomials over division rings.
II. In S. K. Jain and S. Parvathi, editors, Noncommutative Rings, Group Rings,
Diagram Algebras and Their Applications, volume 456 of Contemp. Math., pages
73–98. Amer. Math. Soc., Providence, Rhode Island, 2008.
[1197] J. Lambek. Lectures on Rings and Modules. Blaisdell Publishing Co. Ginn and Co.,
Waltham, Mass.-Toronto, Ont.-London, 1966.
[1198] I. N. Landjev. The geometric approach to linear codes. In Finite Geometries, vol-
ume 3 of Dev. Math., pages 247–256. Kluwer Acad. Publ., Dordrecht, Netherlands,
2001.
[1199] I. N. Landjev and P. Vandendriessche. A study of (xv_t, xv_{t−1})-minihypers in PG(t, q).
J. Combin. Theory Ser. A, 119:1123–1131, 2012.
[1200] P. Landrock. Finite Group Algebras and Their Modules, volume 84 of London Math.
Soc. Lecture Note Ser. Cambridge University Press, Cambridge, UK, 1983.
[1201] P. Landrock and O. Manz. Classical codes as ideals in group algebras. Des. Codes
Cryptogr., 2:273–285, 1992.
[1202] J. N. Laneman and G. W. Wornell. Distributed space-time-coded protocols for
exploiting cooperative diversity in wireless network. IEEE Trans. Inform. Theory,
49:2415–2425, 2003.
[1203] S. Lang. Algebra, volume 211 of Graduate Texts in Mathematics. Springer-Verlag,
New York, revised 3rd edition, 2002.
[1204] J. B. Lasserre. An explicit equivalent positive semidefinite program for nonlinear 0-1
programs. SIAM J. Optim., 12:756–769, 2002.
[1205] M. Laurent. A comparison of the Sherali-Adams, Lovász-Schrijver, and Lasserre
relaxations for 0-1 programming. Math. Oper. Res., 28:470–496, 2003.
[1206] M. Laurent. Strengthened semidefinite programming bounds for codes. Math. Pro-
gram. Ser. B, 109:239–261, 2007.
[1207] M. Laurent and F. Vallentin. A Course on Semidefinite Optimization. Cambridge
University Press, Cambridge, UK, in preparation.
896 Bibliography
[1208] M. Lavrauw, L. Storme, and G. Van de Voorde. Linear codes from projective spaces.
In A. A. Bruen and D. L. Wehlau, editors, Error-Correcting Codes, Finite Geometries
and Cryptography, volume 523 of Contemp. Math., pages 185–202. Amer. Math. Soc.,
Providence, Rhode Island, 2010.
[1209] F. Lazebnik and V. A. Ustimenko. Explicit construction of graphs with arbitrary
large girth and of large size. Discrete Appl. Math., 60:275–284, 1997.
[1229] V. I. Levenshtein. Krawtchouk polynomials and universal bounds for codes and
designs in Hamming spaces. IEEE Trans. Inform. Theory, 41:1303–1321, 1995.
[1230] V. I. Levenshtein. Universal Bounds for Codes and Designs. In V. S. Pless and W. C.
Huffman, editors, Handbook of Coding Theory, Vol. I, II, chapter 6, pages 499–648.
North-Holland, Amsterdam, 1998.
[1231] V. I. Levenshtein. Equivalence of Delsarte’s bounds for codes and designs in sym-
metric association schemes, and some applications. Discrete Math., 197/198:515–536,
1999.
[1232] V. I. Levenshtein. New lower bounds on aperiodic crosscorrelation of binary codes.
IEEE Trans. Inform. Theory, 45:284–288, 1999.
[1233] V. I. Levenshtein. Bounds for deletion/insertion correcting codes. In Proc. IEEE
International Symposium on Information Theory, page 370, Lausanne, Switzerland,
June-July 2002.
[1234] B. C. Lévy. 2-D Polynomial and Rational Matrices, and their Applications for the
Modeling of 2-D Dynamical Systems. PhD thesis, Stanford University, 1981.
[1235] B. Li, H. Shen, and D. Tse. An adaptive successive cancellation list decoder for polar
codes with cyclic redundancy check. IEEE Commun. Letters, 16:2044–2047, 2012.
[1236] C. Li, C. Ding, and S. Li. LCD cyclic codes over finite fields. IEEE Trans. Inform.
Theory, 63:4344–4356, 2017.
[1237] C. Li and Q. Yue. The Walsh transform of a class of monomial functions and cyclic
codes. Cryptogr. Commun., 7:217–228, 2015.
[1238] C. Li, X. Zeng, T. Helleseth, C. Li, and L. Hu. The properties of a class of linear
FSRs and their applications to the construction of nonlinear FSRs. IEEE Trans.
Inform. Theory, 60:3052–3061, 2014.
[1239] C. Li, X. Zeng, C. Li, and T. Helleseth. A class of de Bruijn sequences. IEEE Trans.
Inform. Theory, 60:7955–7969, 2014.
[1240] C. Li, X. Zeng, C. Li, T. Helleseth, and M. Li. Construction of de Bruijn sequences
from LFSRs with reducible characteristic polynomials. IEEE Trans. Inform. Theory,
62:610–624, 2016.
[1241] J. Li and B. Li. Erasure coding for cloud storage systems: a survey. Tsinghua Science
and Technology, 18:259–272, 2013.
[1242] J. Li and X. Tang. A systematic construction of MDS codes with small sub-
packetization level and near optimal repair bandwidth. arXiv:1901.08254 [cs.IT],
January 2019.
[1243] J. Li, X. Tang, and C. Tian. A generic transformation to enable optimal repair in
MDS codes for distributed storage systems. IEEE Trans. Inform. Theory, 64:6257–
6267, 2018.
[1244] M. Li and D. Lin. De Bruijn sequences, adjacency graphs, and cyclotomy. IEEE
Trans. Inform. Theory, 64:2941–2952, 2018.
[1244a] N. Li and S. Mesnager. Recent results and problems on constructions of linear codes
from cryptographic functions. Cryptogr. Commun., 12:965–986, 2020.
[1245] S. Li. The minimum distance of some narrow-sense primitive BCH codes. SIAM J.
Discrete Math., 31:2530–2569, 2017.
[1246] S. Li, C. Ding, M. Xiong, and G. Ge. Narrow-sense BCH codes over GF(q) with
length n = (q^m − 1)/(q − 1). IEEE Trans. Inform. Theory, 63:7219–7236, 2017.
[1247] S. Li, C. Li, C. Ding, and H. Liu. Two families of LCD BCH codes. IEEE Trans.
Inform. Theory, 63:5699–5717, 2017.
[1248] S.-Y. R. Li, R. W. Yeung, and N. Cai. Linear network coding. IEEE Trans. Inform.
Theory, 49:371–381, 2003.
[1249] X.-B. Liang. Orthogonal designs with maximal rates. IEEE Trans. Inform. Theory,
49:2468–2503, 2003.
[1250] X.-B. Liang and X.-G. Xia. Unitary signal constellations for differential space-time
modulation with two transmit antennas: parametric codes, optimal designs, and
bounds. IEEE Trans. Inform. Theory, 48:2291–2322, 2002.
[1251] H. Liao and X.-G. Xia. Some designs of full rate space-time codes with nonvanishing
determinant. IEEE Trans. Inform. Theory, 53:2898–2908, 2007.
[1252] D. A. Lidar and T. A. Brun, editors. Quantum Error Correction. Cambridge Uni-
versity Press, Cambridge, UK, 2013.
[1253] R. Lidl, G. L. Mullen, and G. Turnwald. Dickson Polynomials, volume 65 of Pitman
Monographs and Surveys in Pure and Applied Mathematics. Longman Scientific &
Technical, Harlow; copublished in the United States with John Wiley & Sons, Inc.,
New York, 1993.
[1256] M. W. Liebeck. The affine permutation groups of rank three. Proc. London Math.
Soc. (3), 54:477–516, 1987.
[1270] S. Ling and P. Solé. Good self-dual quasi-cyclic codes exist. IEEE Trans. Inform.
Theory, 49:1052–1053, 2003.
[1271] S. Ling and P. Solé. On the algebraic structure of quasi-cyclic codes II: chain rings.
Des. Codes Cryptogr., 30:113–130, 2003.
[1272] S. Ling and P. Solé. On the algebraic structure of quasi-cyclic codes III: generator
theory. IEEE Trans. Inform. Theory, 51:2692–2700, 2005.
[1273] P. Lisoněk and V. Singh. Quantum codes from nearly self-orthogonal quaternary
linear codes. Des. Codes Cryptogr., 73:417–424, 2014.
[1274] B. Litjens. Semidefinite bounds for mixed binary/ternary codes. Discrete Math.,
341:1740–1748, 2018.
[1275] B. Litjens, S. Polak, and A. Schrijver. Semidefinite bounds for nonbinary codes based
on quadruples. Des. Codes Cryptogr., 84:87–100, 2017.
[1276] J. B. Little. Algebraic geometry codes from higher dimensional varieties. In
E. Martínez-Moro, C. Munuera, and D. Ruano, editors, Advances in Algebraic Ge-
ometry Codes, volume 5 of Ser. Coding Theory Cryptol., pages 257–293. World Sci.
Publ., Hackensack, New Jersey, 2008.
[1277] H. Liu, C. Ding, and C. Li. Dimensions of three types of BCH codes over GF(q).
Discrete Math., 340:1910–1927, 2017.
[1278] H. Liu, Y. Li, L. Ma, and J. Chen. On the smallest absorbing sets of LDPC codes
from finite planes. IEEE Trans. Inform. Theory, 58:4014–4020, 2012.
[1279] H. Liu and Y. Maouche. Two or few-weight trace codes over F_q + uF_q. IEEE Trans.
Inform. Theory, 65:2696–2703, 2019.
[1280] J. Liu, S. Mesnager, and L. Chen. New constructions of optimal locally recoverable
codes via good polynomials. IEEE Trans. Inform. Theory, 64:889–899, 2018.
[1281] J. Liu, S. Mesnager, and D. Tang. Constructions of optimal locally recoverable codes
via Dickson polynomials. Des. Codes Cryptogr., 88:1759–1780, 2020.
[1282] S. Liu, F. Manganiello, and F. R. Kschischang. Construction and decoding of gen-
eralized skew-evaluation codes. In Proc. 14th Canadian Workshop on Information
Theory, pages 9–13, St. John’s, Canada, July 2015.
[1283] S. Liu and F. Oggier. An overview of coding for distributed storage systems. In
Network Coding and Subspace Designs, Signals Commun. Technol., pages 363–383.
Springer, Cham, 2018.
[1284] Z. Liu and D. A. Pados. LDPC codes from generalized polygons. IEEE Trans.
Inform. Theory, 51:3890–3898, 2005.
[1285] P. Loidreau. Étude et optimisation de cryptosystèmes à clé publique fondés sur la
théorie des codes correcteurs. PhD thesis, École Polytechnique, Paris, 2001.
[1286] B. Long, P. Zhang, and J. Hu. A generalized QS-CDMA system and the design of
new spreading codes. IEEE Trans. Vehicular Tech., 47:1268–1275, 1998.
[1287] S. R. López-Permouth, B. R. Parra-Avila, and S. Szabo. Dual generalizations of the
concept of cyclicity of codes. Adv. Math. Commun., 3:227–234, 2009.
[1288] L. Lovász. On the Shannon capacity of a graph. IEEE Trans. Inform. Theory,
25:1–7, 1979.
[1289] H.-F. Lu and P. V. Kumar. A unified construction of space-time codes with optimal
rate-diversity tradeoff. IEEE Trans. Inform. Theory, 51:1709–1730, 2005.
[1290] J. Lu, J. Harshan, and F. Oggier. Performance of lattice coset codes on Universal
Software Radio Peripherals. Physical Commun., 24:94–102, 2017.
[1308] G. Luo, X. Cao, G. Xu, and S. Xu. A new class of optimal linear codes with flexible
parameters. Discrete Appl. Math., 237:126–131, 2018.
[1309] L. Luzzi, G. R.-B. Othman, J.-C. Belfiore, and E. Viterbo. Golden space-time block
coded modulation. IEEE Trans. Inform. Theory, 55:584–597, 2009.
[1310] V. Lyubashevsky. Lattice signatures without trapdoors. In D. Pointcheval and
T. Johansson, editors, Advances in Cryptology – EUROCRYPT 2012, volume 7237
of Lecture Notes in Comput. Sci., pages 738–755. Springer, Berlin, Heidelberg, 2012.
[1311] G. Maatouk and M. A. Shokrollahi. Analysis of the second moment of the LT decoder.
In Proc. IEEE International Symposium on Information Theory, pages 2326–2330,
Seoul, Korea, June-July 2009.
[1312] J. E. MacDonald. Design methods for maximum minimum-distance error-correcting
codes. IBM J. Research Develop., 4:43–57, 1960.
[1313] R. A. Machado, J. A. Pinheiro, and M. Firer. Characterization of metrics induced
by hierarchical posets. IEEE Trans. Inform. Theory, 63:3630–3640, 2017.
[1314] D. J. C. MacKay. Information Theory, Inference, and Learning Algorithms. Cam-
bridge University Press, Cambridge, UK, 2003.
[1315] D. J. C. MacKay and M. C. Davey. Evaluation of Gallager codes for short block
length and high rate applications. In B. Marcus and J. Rosenthal, editors, Codes,
Systems, and Graphical Models (Minneapolis, MN, 1999), volume 123 of IMA Vol.
Math. Appl., pages 113–130. Springer, New York, 2001.
[1316] D. J. C. MacKay and R. M. Neal. Near Shannon limit performance of low density
parity check codes. Electron. Lett., 32:1645–1646, 1996.
[1317] F. J. MacWilliams. Error-correcting codes for multiple-level transmission. Bell Sys-
tem Tech. J., 40:281–308, 1961.
[1318] F. J. MacWilliams. Combinatorial Problems of Elementary Abelian Groups. PhD
thesis, Harvard University, 1962.
[1319] F. J. MacWilliams. A theorem on the distribution of weights in a systematic code.
Bell System Tech. J., 42:79–94, 1963. (Also reprinted in [169] pages 261–265 and
[212] pages 241–257).
[1320] F. J. MacWilliams. Codes and ideals in group algebras. In R. C. Bose and T. A. Dowl-
ing, editors, Combinatorial Mathematics and its Applications (Proc. Conf., Univ.
North Carolina, Chapel Hill, NC, 1967), pages 317–328. Univ. North Carolina Press,
Chapel Hill, NC, 1969.
[1321] F. J. MacWilliams. Binary codes which are ideals in the group algebra of an abelian
group. Bell System Tech. J., 49:987–1011, 1970.
[1322] F. J. MacWilliams and H. B. Mann. On the p-rank of the design matrix of a difference
set. Inform. and Control, 12:474–488, 1968.
[1323] F. J. MacWilliams and N. J. A. Sloane. The Theory of Error-Correcting Codes.
North-Holland Publishing Co., Amsterdam-New York-Oxford, 1977.
[1324] F. J. MacWilliams, N. J. A. Sloane, and J. G. Thompson. Good self-dual codes exist.
Discrete Math., 3:153–162, 1972.
[1358] P. Maymounkov. Online codes. New York University Technical Report TR2002–833,
2002.
[1359] B. R. McDonald. Finite Rings with Identity, volume 28 of Pure and Applied Math-
ematics. Marcel Dekker, Inc., New York, 1974.
[1360] R. J. McEliece. Weight congruences of p-ary cyclic codes. Discrete Math., 3:177–192,
1972.
[1361] R. J. McEliece. A public-key cryptosystem based on algebraic coding theory. DSN
Progress Report 42–44, pages 114–116, Jet Propulsion Laboratory, Pasadena, Cali-
fornia, 1978.
[1362] R. J. McEliece. Finite Fields for Computer Scientists and Engineers. Kluwer Acad.
Pub., Boston, 1987.
[1393] T. Mittelholzer and J. Lahtonen. Group codes generated by finite reflection groups.
IEEE Trans. Inform. Theory, 42:519–528, 1996.
[1394] K. Momihara. Strongly regular Cayley graphs, skew Hadamard difference sets, and
rationality of relative Gauss sums. European J. Combin., 34:706–723, 2013.
[1395] K. Momihara. Certain strongly regular Cayley graphs on F_{2^{2(2s+1)}} from cyclotomy.
Finite Fields Appl., 25:280–292, 2014.
[1396] M. Mondelli, S. H. Hassani, I. Sason, and R. L. Urbanke. Achieving Marton’s region
for broadcast channels using polar codes. IEEE Trans. Inform. Theory, 61:783–800,
2015.
[1397] M. Mondelli, S. H. Hassani, and R. L. Urbanke. Scaling exponent of list decoders
with applications to polar codes. In Proc. IEEE Information Theory Workshop,
Seville, Spain, 5 pages, September 2013.
[1398] M. Mondelli, S. H. Hassani, and R. L. Urbanke. Scaling exponent of list decoders
with applications to polar codes. arXiv:1304.5220 [cs.IT], September 2014.
[1399] J. V. S. Morales and H. Tanaka. An Assmus-Mattson theorem for codes over com-
mutative association schemes. Des. Codes Cryptogr., 86:1039–1062, 2018.
[1400] P. Moree. On the divisors of ak + bk . Acta Arith., 80(3):197–212, 1997.
[1401] C. Moreno. Algebraic Curves over Finite Fields, volume 97 of Cambridge Tracts in
Mathematics. Cambridge University Press, Cambridge, UK, 1991.
[1402] M. Morf, B. C. Lévy, and S.-Y. Kung. New results in 2-D systems theory, part I:
2-D polynomial matrices, factorization, and coprimeness. Proc. of the IEEE, 65:861–872,
1977.
[1403] R. Mori and T. Tanaka. Performance and construction of polar codes on symmet-
ric binary-input memoryless channels. In Proc. IEEE International Symposium on
Information Theory, pages 1496–1500, Seoul, Korea, June-July 2009.
[1404] R. Mori and T. Tanaka. Performance of polar codes with the construction using
density evolution. IEEE Commun. Letters, 13:519–521, 2009.
[1405] R. Mori and T. Tanaka. Channel polarization on q-ary discrete memoryless channels
by arbitrary kernels. In Proc. IEEE Information Theory Workshop, pages 894–898,
Austin, Texas, June 2010.
[1406] R. Mori and T. Tanaka. Source and channel polarization over finite fields and Reed-
Solomon matrices. IEEE Trans. Inform. Theory, 60:2720–2736, 2014.
[1407] L. Mroueh, S. Rouquette, and J.-C. Belfiore. Application of perfect space time codes:
PEP bounds and some practical insights. IEEE Trans. Commun., 60:747–755, 2012.
[1408] G. L. Mullen and D. Panario, editors. Handbook of Finite Fields. Discrete Mathe-
matics and its Applications. CRC Press, Boca Raton, Florida, 2013.
[1409] D. E. Muller. Application of Boolean algebra to switching circuit design and to error
detection. IEEE Trans. Comput., 3:6–12, 1954. (Also reprinted in [169] pages 20–26
and [212] pages 43–49).
[1419] L. P. Natarajan and B. S. Rajan. Fast-group-decodable STBCs via codes over GF(4):
further results. In Proc. IEEE International Conference on Communications, Kyoto,
Japan, 6 pages, June 2011.
[1420] I. P. Naydenova. Error Detection and Correction for Symmetric and Asymmetric
Channels. PhD thesis, University of Bergen, 2007.
[1423] G. Nebe, E. Rains, and N. J. A. Sloane. Self-Dual Codes and Invariant Theory.
Springer-Verlag, Berlin, Heidelberg, 2006.
[1424] G. Nebe and A. Schäfer. A nilpotent non abelian group code. Algebra Discrete
Math., 18:268–273, 2014.
[1425] A. A. Nechaev. Trace functions in Galois rings and noise-stable codes. In Proc. 5th
All-Union Symposium on the Theory of Rings, Algebras and Modules, page 97, 1982.
[1426] A. A. Nechaev. Kerdock’s code in cyclic form (in Russian). Diskret. Mat., 1(4):123–
139, 1989. (English translation in Discrete Math. Appl. 1(4):365–384, 1991).
[1427] A. A. Nechaev. Linear recurrence sequences over commutative rings (in Russian).
Diskret. Mat., 3(4):105–127, 1991. (English translation in Discrete Math. Appl.
2(6):659–683, 1992).
[1428] A. A. Nechaev and D. A. Mikhaĭlov. A canonical system of generators of a unitary
polynomial ideal over a commutative Artinian chain ring (in Russian). Diskret. Mat.,
13(4):3–42, 2001. (English translation in Discrete Math. Appl. 11(6):545–586, 2001).
[1429] A. Nemirovski. Advances in convex optimization: conic programming. In M. Sanz-
Solé, J. Soria, J. L. Varona, and J. Verdera, editors, Proc. International Congress of
Mathematicians (Madrid, 2006), pages 413–444. Eur. Math. Soc., Zürich, 2007.
[1430] S.-L. Ng and M. B. Paterson. Functional repair codes: a view from projective geom-
etry. Des. Codes Cryptogr., 87:2701–2722, 2019.
[1431] X. T. Ngo, S. Bhasin, J.-L. Danger, S. Guilley, and Z. Najm. Linear complementary
dual code improvement to strengthen encoded circuit against hardware Trojan
horses. In Proc. IEEE International Symposium on Hardware Oriented Security and
Trust (HOST), pages 82–87, McLean, Virginia, May 2015.
[1432] H. Niederreiter. Knapsack-type cryptosystems and algebraic coding theory. Problems
Control Inform. Theory/Problemy Upravlen. Teor. Inform., 15(2):159–166, 1986.
[1433] H. Niederreiter. Point sets and sequences with small discrepancy. Monatsh. Math.,
104:273–337, 1987.
[1434] H. Niederreiter. A combinatorial problem for vector spaces over finite fields. Discrete
Math., 96:221–228, 1991.
[1435] H. Niederreiter. Orthogonal arrays and other combinatorial aspects in the theory of
uniform point distributions in unit cubes. Discrete Math., 106/107:361–367, 1992.
[1436] H. Niederreiter and F. Özbudak. Constructive asymptotic codes with an improve-
ment on the Tsfasman-Vlăduţ-Zink and Xing bounds. In K. Feng, H. Niederreiter,
and C. Xing, editors, Coding, Cryptography and Combinatorics, volume 23 of Progr.
Comput. Sci. Appl. Logic, pages 259–275. Birkhäuser Verlag, Basel, 2004.
[1437] H. Niederreiter and C. Xing. Rational Points on Curves over Finite Fields: Theory
and Applications, volume 285 of London Math. Soc. Lecture Note Ser. Cambridge
University Press, Cambridge, UK, 2001.
[1438] Y. Niho. Multi-Valued Cross-Correlation Functions Between Two Maximal Linear
Recursive Sequences. PhD thesis, University of Southern California, 1972.
[1439] A. Nijenhuis and H. S. Wilf. Combinatorial Algorithms: For Computers and Calcu-
lators. Computer Science and Applied Mathematics. Academic Press, Inc. [Harcourt
Brace Jovanovich, Publishers], New York-London, 2nd edition, 1978.
[1440] S. Nishimura and T. Hiramatsu. A generalization of the Lee distance and error
correcting codes. Discrete Appl. Math., 156:588–595, 2008.
[1441] NIST. Post-quantum cryptography standardization. https://csrc.nist.gov/Projects/post-quantum-cryptography/Post-Quantum-Cryptography-Standardization, 2017.
[1442] K. Niu and K. Chen. CRC-aided decoding of polar codes. IEEE Commun. Letters,
16:1668–1671, 2012.
[1443] K. Niu and K. Chen. Stack decoding of polar codes. Electron. Lett., 48:695–697,
2012.
[1444] J.-S. No, S. W. Golomb, G. Gong, H.-K. Lee, and P. Gaal. Binary pseudorandom
sequences of period 2^n − 1 with ideal autocorrelation. IEEE Trans. Inform. Theory,
44:814–817, 1998.
[1445] R. W. Nóbrega and B. F. Uchôa-Filho. Multishot codes for network coding: bounds
and a multilevel construction. In Proc. IEEE International Symposium on Informa-
tion Theory, pages 428–432, Seoul, Korea, June-July 2009.
[1448] G. H. Norton and A. Sălăgean. On the structure of linear and cyclic codes over a
finite chain ring. Appl. Algebra Engrg. Comm. Comput., 10:489–506, 2000.
[1449] G. H. Norton and A. Sălăgean. Cyclic codes and minimal strong Gröbner bases over
a principal ideal ring. Finite Fields Appl., 9:237–249, 2003.
[1450] K. Nyberg. Constructions of bent functions and difference sets. In I. B. Damgård,
editor, Advances in Cryptology – EUROCRYPT ’90, volume 473 of Lecture Notes in
Comput. Sci., pages 151–160. Springer, Berlin, Heidelberg, 1991.
[1451] K. Nyberg. Perfect nonlinear S-boxes. In D. W. Davies, editor, Advances in Cryp-
tology – EUROCRYPT ’91, volume 547 of Lecture Notes in Comput. Sci., pages
378–386. Springer, Berlin, Heidelberg, 1991.
[1457] F. Oggier and B. Hassibi. An algebraic coding scheme for wireless relay networks
with multiple-antenna nodes. IEEE Trans. Signal Process., 56:2957–2966, 2008.
[1458] F. Oggier, G. Rekaya, J.-C. Belfiore, and E. Viterbo. Perfect space-time block codes.
IEEE Trans. Inform. Theory, 52:3885–3902, 2006.
[1459] F. Oggier and A. Sboui. On the existence of generalized rank weights. In Proc.
IEEE International Symposium on Information Theory and its Applications, pages
406–410, Honolulu, Hawaii, October 2012.
[1460] F. Oggier, P. Solé, and J.-C. Belfiore. Codes over matrix rings for space-time coded
modulations. IEEE Trans. Inform. Theory, 58:734–746, 2012.
[1461] F. Oggier, P. Solé, and J.-C. Belfiore. Lattice codes for the wiretap Gaussian channel:
construction and analysis. IEEE Trans. Inform. Theory, 62:5690–5708, 2016.
[1462] F. Oggier and E. Viterbo. Algebraic number theory and code design for Rayleigh
fading channels. Found. Trends Commun. and Inform. Theory, 1:333–415, 2004.
[1463] C. M. O’Keefe and T. Penttila. A new hyperoval in PG(2, 32). J. Geom., 44:117–139,
1992.
[1464] T. Okuyama and Y. Tsushima. On a conjecture of P. Landrock. J. Algebra, 104:203–
208, 1986.
[1465] O. Olmez and A. Ramamoorthy. Fractional repetition codes with flexible repair from
combinatorial designs. IEEE Trans. Inform. Theory, 62:1565–1591, 2016.
[1466] O. Ore. Theory of non-commutative polynomials. Ann. of Math. (2), 34:480–508,
1933.
[1467] P. R. J. Östergård. Classifying subspaces of Hamming spaces. Des. Codes Cryptogr.,
27:297–305, 2002.
[1468] P. R. J. Östergård. On the size of optimal three-error-correcting binary codes of
length 16. IEEE Trans. Inform. Theory, 57:6824–6826, 2011.
[1469] P. R. J. Östergård. The sextuply shortened binary Golay code is optimal. Des. Codes
Cryptogr., 87:341–347, 2019.
[1470] P. R. J. Östergård and M. K. Kaikkonen. New single-error-correcting codes. IEEE
Trans. Inform. Theory, 42:1261–1262, 1996.
[1471] P. R. J. Östergård and O. Pottonen. There exists no Steiner system S(4, 5, 17). J.
Combin. Theory Ser. A, 115:1570–1573, 2008.
[1472] P. R. J. Östergård and O. Pottonen. The perfect binary one-error-correcting codes of
length 15: Part I—Classification. IEEE Trans. Inform. Theory, 55:4657–4660, 2009.
[1473] P. R. J. Östergård, O. Pottonen, and K. T. Phelps. The perfect binary one-error-
correcting codes of length 15: Part II—Properties. IEEE Trans. Inform. Theory,
56:2571–2582, 2010.
[1474] M. E. O’Sullivan. Decoding of codes on surfaces. In Proc. IEEE Information Theory
Workshop, pages 33–34, Killarney, Ireland, June 1998.
[1475] P. Oswald and M. A. Shokrollahi. Capacity-achieving sequences for the erasure
channel. IEEE Trans. Inform. Theory, 48:3017–3028, 2002.
[1476] K. Otal and F. Özbudak. Cyclic subspace codes via subspace polynomials. Des.
Codes Cryptogr., 85:191–204, 2017.
[1477] A. Otmani and J.-P. Tillich. An efficient attack on all concrete KKS proposals. In
B.-Y. Yang, editor, Post-Quantum Cryptography, volume 7071 of Lecture Notes in
Comput. Sci., pages 98–116. Springer, Berlin, Heidelberg, 2011.
[1478] A. Otmani, J.-P. Tillich, and L. Dallot. Cryptanalysis of two McEliece cryptosystems
based on quasi-cyclic codes. Math. Comput. Sci., 3:129–140, 2010.
[1479] R. Overbeck. A new structural attack for GPT and variants. In E. Dawson and
S. Vaudenay, editors, Progress in Cryptology – Mycrypt 2005, volume 3715 of Lecture
Notes in Comput. Sci., pages 50–63. Springer, Berlin, Heidelberg, 2005.
[1480] M. Özen and I. Şiap. Linear codes over F_q[u]/(u^s) with respect to the Rosenbloom-
Tsfasman metric. Des. Codes Cryptogr., 38:17–29, 2006.
[1481] M. Özen and V. Şiap. The MacWilliams identity for m-spotty weight enumerators
of linear codes over finite fields. Comput. Math. Appl., 61:1000–1004, 2011.
[1482] M. Özen and V. Şiap. The MacWilliams identity for m-spotty Rosenbloom-Tsfasman
weight enumerator. J. Franklin Inst., 351:743–750, 2014.
[1483] N. Pace and A. Sonnino. On linear codes admitting large automorphism groups.
Des. Codes Cryptogr., 83:115–143, 2017.
[1484] R. E. A. C. Paley. On orthogonal matrices. J. Mathematics and Physics, 12:311–320,
1933.
[1485] L. Pamies-Juarez, H. D. L. Hollmann, and F. Oggier. Locally repairable codes with
multiple repair alternatives. In Proc. IEEE International Symposium on Information
Theory, pages 892–896, Istanbul, Turkey, July 2013.
[1486] L. Panek, M. Firer, and M. M. S. Alves. Classification of Niederreiter-Rosenbloom-
Tsfasman block codes. IEEE Trans. Inform. Theory, 56:5207–5216, 2010.
[1487] L. Panek, M. Firer, H. K. Kim, and J. Y. Hyun. Groups of linear isometries on poset
structures. Discrete Math., 308:4116–4123, 2008.
[1488] E. Paolini, G. Liva, B. Matuz, and M. Chiani. Maximum likelihood erasure decod-
ing of LDPC codes: pivoting algorithms and code design. IEEE Trans. Commun.,
60:3209–3220, 2012.
[1511] P. Piret. Structure and constructions of cyclic convolutional codes. IEEE Trans.
Inform. Theory, 22:147–155, 1976.
[1512] J. S. Plank and M. Blaum. Sector-disk (SD) erasure codes for mixed failure modes
in RAID systems. ACM Trans. on Storage, 10(1):4:1–4:17, 2014.
[1513] V. S. Pless. Power moment identities on weight distributions in error correcting
codes. Inform. and Control, 6:147–152, 1963. (Also reprinted in [169] pages 266–267
and [212] pages 257–262).
[1514] V. S. Pless. On the uniqueness of the Golay codes. J. Combin. Theory, 5:215–228,
1968.
[1515] V. S. Pless. A classification of self-orthogonal codes over GF(2). Discrete Math.,
3:209–246, 1972.
[1516] V. S. Pless. 23 does not divide the order of the group of a (72, 36, 16) doubly even
code. IEEE Trans. Inform. Theory, 28:113–117, 1982.
[1517] V. S. Pless. Q-codes. J. Combin. Theory Ser. A, 43:258–276, 1986.
[1518] V. S. Pless. More on the uniqueness of the Golay code. Discrete Math., 106/107:391–
398, 1992.
[1519] V. S. Pless. Duadic codes and generalizations. In P. Camion, P. Charpin, and
S. Harari, editors, Proc. International Symposium on Information Theory – EU-
ROCODE 1992, volume 339 of CISM Courses and Lect., pages 3–15. Springer, Vi-
enna, 1993.
[1520] V. S. Pless. Introduction to the Theory of Error-Correcting Codes. J. Wiley & Sons,
New York, 3rd edition, 1998.
[1521] V. S. Pless and W. C. Huffman, editors. Handbook of Coding Theory, Vol. I, II.
North-Holland, Amsterdam, 1998.
[1522] V. S. Pless and Z. Qian. Cyclic codes and quadratic residue codes over Z_4. IEEE
Trans. Inform. Theory, 42:1594–1600, 1996.
[1523] V. S. Pless and N. J. A. Sloane. On the classification and enumeration of self-dual
codes. J. Combin. Theory Ser. A, 18:313–335, 1975.
[1524] V. S. Pless, P. Solé, and Z. Qian. Cyclic self-dual Z_4-codes. Finite Fields Appl.,
3:48–69, 1997.
[1525] V. S. Pless and J. G. Thompson. 17 does not divide the order of the group of a
(72, 36, 16) doubly even code. IEEE Trans. Inform. Theory, 28:537–541, 1982.
[1526] M. Plotkin. Binary Codes with Specified Minimum Distance. Master’s thesis, Moore
School of Electrical Engineering, University of Pennsylvania, 1952.
[1527] M. Plotkin. Binary codes with specified minimum distance. IRE Trans. Inform.
Theory, 6:445–450, 1960. (Also reprinted in [169] pages 238–243).
[1528] D. Pointcheval and J. Stern. Security proofs for signature schemes. In U. Maurer,
editor, Advances in Cryptology – EUROCRYPT ’96, volume 1070 of Lecture Notes
in Comput. Sci., pages 387–398. Springer, Berlin, Heidelberg, 1996.
[1535] N. Prakash, V. Lalitha, S. B. Balaji, and P. V. Kumar. Codes with locality for two
erasures. IEEE Trans. Inform. Theory, 65:7771–7789, 2019.
[1536] E. Prange. Cyclic Error-Correcting Codes in Two Symbols. TN-57-103, Air Force
Cambridge Research Labs, Bedford, Massachusetts, September 1957.
[1537] E. Prange. Some Cyclic Error-Correcting Codes with Simple Decoding Algorithms.
TN-58-156, Air Force Cambridge Research Labs, Bedford, Massachusetts, April 1958.
[1538] E. Prange. An Algorithm for Factoring x^n − 1 over a Finite Field. TN-59-175, Air
Force Cambridge Research Labs, Bedford, Massachusetts, October 1959.
[1539] E. Prange. The Use of Coset Equivalence in the Analysis and Decoding of Group
Codes. TN-59-164, Air Force Cambridge Research Labs, Bedford, Massachusetts,
1959.
[1540] E. Prange. The use of information sets in decoding cyclic codes. IRE Trans. Inform.
Theory, 8:S5–S9, 1962.
[1541] F. P. Preparata. A class of optimum nonlinear double-error-correcting codes. Inform.
and Control, 13:378–400, 1968. (Also reprinted in [212] pages 366–388).
[1542] J. Preskill. Fault-tolerant quantum computation. In H.-K. Lo, S. Popescu, and
T. Spiller, editors, Introduction to Quantum Computation and Information, pages
213–269. World Sci. Publ., River Edge, New Jersey, 1998.
[1543] N. Presman. Methods in Polar Coding. PhD thesis, Tel Aviv University, 2015.
[1544] N. Presman and S. Litsyn. Recursive descriptions of polar codes. Adv. Math. Com-
mun., 11:1–65, 2017.
[1545] N. Presman, O. Shapira, and S. Litsyn. Mixed-kernels constructions of polar codes.
IEEE J. Selected Areas Commun., 34:239–253, 2016.
[1547] S. Pumplün and T. Unger. Space-time block codes from nonassociative division
algebras. Adv. Math. Commun., 5:449–471, 2011.
[1548] M. Putinar. Positive polynomials on compact semi-algebraic sets. Indiana Univ.
Math. J., 42:969–984, 1993.
[1549] L.-J. Qu, Y. Tan, and C. Li. On the Walsh spectrum of a family of quadratic APN
functions with five terms. Sci. China Inf. Sci., 57(2):028104, 7 pages, 2014.
[1550] J. Quistorff. On Rosenbloom and Tsfasman’s generalization of the Hamming space.
Discrete Math., 307:2514–2524, 2007.
[1551] B. Qvist. Some remarks concerning curves of the second degree in a finite plane.
Ann. Acad. Sci. Fenn., 134:1–27, 1952.
[1552] M. Rahman and I. F. Blake. Majority logic decoding using combinatorial designs.
IEEE Trans. Inform. Theory, 21:585–587, 1975.
[1553] E. M. Rains. Shadow bounds for self-dual codes. IEEE Trans. Inform. Theory,
44:134–139, 1998.
[1554] E. M. Rains. Nonbinary quantum codes. IEEE Trans. Inform. Theory, 45:1827–1832,
1999.
[1555] E. M. Rains and N. J. A. Sloane. Self-Dual Codes. In V. S. Pless and W. C.
Huffman, editors, Handbook of Coding Theory, Vol. I, II, chapter 3, pages 177–294.
North-Holland, Amsterdam, 1998.
[1556] B. S. Rajan and M. U. Siddiqi. A generalized DFT for abelian codes over Zm . IEEE
Trans. Inform. Theory, 40:2082–2090, 1994.
[1557] A. Ralston. A new memoryless algorithm for de Bruijn sequences. J. Algorithms,
2:50–62, 1981.
[1558] A. Ralston. De Bruijn sequences—a model example of the interaction of discrete
mathematics and computer science. Math. Mag., 55:131–143, 1982.
[1559] H. Randriambololona. Bilinear complexity of algebras and the Chudnovsky-
Chudnovsky interpolation method. J. Complexity, 28:489–517, 2012.
[1560] H. Randriambololona. (2, 1)-separating systems beyond the probabilistic bound.
Israel J. Math., 195:171–186, 2013.
[1561] H. Randriambololona. Asymptotically good binary linear codes with asymptotically
good self-intersection spans. IEEE Trans. Inform. Theory, 59:3038–3045, 2013.
[1562] H. Randriambololona. An upper bound of Singleton type for componentwise prod-
ucts of linear codes. IEEE Trans. Inform. Theory, 59:7936–7939, 2013.
[1563] H. Randriambololona. Linear independence of rank 1 matrices and the dimension
of ∗-products of codes. In Proc. IEEE International Symposium on Information
Theory, pages 196–200, Hong Kong, China, June 2015.
[1564] H. Randriambololona. On products and powers of linear codes under componentwise
multiplication. In S. Ballet, M. Perret, and A. Zaytsev, editors, Algorithmic Arith-
metic, Geometry, and Coding Theory, volume 637 of Contemp. Math., pages 3–77.
Amer. Math. Soc., Providence, Rhode Island, 2015.
[1565] A. Rao and N. Pinnawala. A family of two-weight irreducible cyclic codes. IEEE
Trans. Inform. Theory, 56:2568–2570, 2010.
[1566] C. R. Rao. Factorial experiments derivable from combinatorial arrangements of
arrays. Suppl. to J. Roy. Statist. Soc., 9:128–139, 1947.
[1567] T. R. N. Rao and A. S. Chawla. Asymmetric error codes for some LSI semiconductor
memories. In Proc. 7th Southeastern Symposium on System Theory, pages 170–171,
Auburn-Tuskegee, Alabama, March 1975.
[1568] K. V. Rashmi, N. B. Shah, D. Gu, H. Kuang, D. Borthakur, and K. Ramchandran.
A solution to the network challenges of data recovery in erasure-coded distributed
storage systems: a study on the Facebook Warehouse cluster. In Proc. 5th USENIX
Workshop on Hot Topics in Storage and File Systems, San Jose, California, 5 pages,
June 2013.
[1569] K. V. Rashmi, N. B. Shah, and P. V. Kumar. Optimal exact-regenerating codes for
distributed storage at the MSR and MBR points via a product-matrix construction.
IEEE Trans. Inform. Theory, 57:5227–5239, 2011.
[1570] K. V. Rashmi, N. B. Shah, P. V. Kumar, and K. Ramchandran. Explicit construction
of optimal exact regenerating codes for distributed storage. In Proc. 47th Allerton
Conference on Communication, Control, and Computing, pages 1243–1249, Monti-
cello, Illinois, September-October 2009.
[1571] K. V. Rashmi, N. B. Shah, and K. Ramchandran. A piggybacking design framework
for read-and download-efficient distributed storage codes. IEEE Trans. Inform. The-
ory, 63:5802–5820, 2017.
[1572] K. V. Rashmi, N. B. Shah, K. Ramchandran, and P. V. Kumar. Information-
theoretically secure erasure codes for distributed storage. IEEE Trans. Inform.
Theory, 64:1621–1646, 2018.
[1573] A. Ravagnani. Generalized weights: an anticode approach. J. Pure Appl. Algebra,
220:1946–1962, 2016.
[1574] A. Ravagnani. Rank-metric codes and their duality theory. Des. Codes Cryptogr.,
80:197–216, 2016.
[1575] A. Ravagnani. Duality of codes supported on regular lattices, with an application to
enumerative combinatorics. Des. Codes Cryptogr., 86:2035–2063, 2018.
[1576] A. S. Rawat. Secrecy capacity of minimum storage regenerating codes. In Proc.
IEEE International Symposium on Information Theory, pages 1406–1410, Aachen,
Germany, June 2017.
[1577] A. S. Rawat, O. O. Koyluoglu, N. Silberstein, and S. Vishwanath. Optimal locally
repairable and secure codes for distributed storage systems. IEEE Trans. Inform.
Theory, 60:212–236, 2014.
[1578] A. S. Rawat, O. O. Koyluoglu, and S. Vishwanath. Progress on high-rate MSR
codes: enabling arbitrary number of helper nodes. In Proc. Information Theory and
Applications Workshop, La Jolla, California, 6 pages, January-February 2016.
[1579] A. S. Rawat, A. Mazumdar, and S. Vishwanath. Cooperative local repair in dis-
tributed storage. EURASIP J. Adv. Signal Process., 2015:107, 2015.
[1582] I. S. Reed and G. Solomon. Polynomial codes over certain finite fields. J. Soc. Indust.
Appl. Math., 8:300–304, 1960. (Also reprinted in [169] pages 70–71 and [212] pages
189–193).
[1583] O. Regev. New lattice based cryptographic constructions. In Proc. 35th Annual ACM
Symposium on the Theory of Computing, pages 407–416, San Diego, California, June
2003. ACM Press, New York.
[1584] G. Regts. Upper Bounds for Ternary Constant Weight Codes from Semidefinite
Programming and Representation Theory. Master’s thesis, University of Amsterdam,
2009.
[1585] S. Reiger. Codes for the correction of “clustered” errors. IRE Trans. Inform. Theory,
6:16–21, 1960.
[1586] O. Reingold, S. Vadhan, and A. Wigderson. Entropy waves, the zig-zag graph prod-
uct, and new constant degree expanders and extractors. In Proc. 41st IEEE Sympo-
sium on Foundations of Computer Science, pages 3–13, Redondo Beach, California,
November 2000.
[1587] A. Renvall and C. Ding. The access structure of some secret-sharing schemes. In
J. Pieprzyk and J. Seberry, editors, Information Security and Privacy, volume 1172
of Lecture Notes in Comput. Sci., pages 67–78. Springer, Berlin, Heidelberg, 1996.
[1588] H. F. H. Reuvers. Some Non-existence Theorems for Perfect Codes over Arbitrary
Alphabets. Technische Hogeschool Eindhoven, Eindhoven, 1977. Written for confer-
ment of a Doctorate in Technical Sciences at the Technische Hogeschool, Eindhoven,
1977.
[1589] T. J. Richardson, M. A. Shokrollahi, and R. L. Urbanke. Design of capacity-
approaching irregular low-density parity-check codes. IEEE Trans. Inform. Theory,
47:619–637, 2001.
[1590] T. J. Richardson and R. L. Urbanke. The capacity of low-density parity-check codes
under message-passing decoding. IEEE Trans. Inform. Theory, 47:599–618, 2001.
[1591] T. J. Richardson and R. L. Urbanke. Modern Coding Theory. Cambridge University
Press, Cambridge, UK, 2008.
[1592] P. Rocha. Structure and Representation of 2-D Systems. PhD thesis, University of
Groningen, 1990.
[1593] B. G. Rodrigues. A projective two-weight code related to the simple group Co1 of
Conway. Graphs Combin., 34:509–521, 2018.
[1594] C. Roos. A new lower bound for the minimum distance of a cyclic code. IEEE Trans.
Inform. Theory, 29:330–332, 1983.
[1595] M. Y. Rosenbloom and M. A. Tsfasman. Codes for the m-metric (in Russian). Prob-
lemy Peredači Informacii, 33(1):55–63, 1997. (English translation in Probl. Inform.
Transm., 33(1):45–52, 1997).
[1596] J. Rosenthal. Connections between linear systems and convolutional codes. In
B. Marcus and J. Rosenthal, editors, Codes, Systems, and Graphical Models (Min-
neapolis, MN, 1999), volume 123 of IMA Vol. Math. Appl., pages 39–66. Springer,
New York, 2001.
[1597] J. Rosenthal, J. M. Schumacher, and E. V. York. On behaviors and convolutional
codes. IEEE Trans. Inform. Theory, 42:1881–1891, 1996.
[1598] J. Rosenthal and R. Smarandache. Maximum distance separable convolutional codes.
Appl. Algebra Engrg. Comm. Comput., 10:15–32, 1999.
[1611] A. Sălăgean. Repeated-root cyclic and negacyclic codes over a finite chain ring.
Discrete Appl. Math., 154:413–419, 2006.
[1612] S. Saraf and M. Sudan. An improved lower bound on the size of Kakeya sets over
finite fields. Anal. PDE, 1:375–379, 2008.
[1613] G. Sarkis, P. Giard, A. Vardy, C. Thibeault, and W. J. Gross. Fast polar decoders:
algorithm and implementation. IEEE J. Selected Areas Commun., 32:946–957, 2014.
[1614] G. Sarkis, I. Tal, P. Giard, A. Vardy, C. Thibeault, and W. J. Gross. Flexible
and low-complexity encoding and decoding of systematic polar codes. IEEE Trans.
Commun., 64:2732–2745, 2016.
[1615] P. K. Sarvepalli, A. Klappenecker, and M. Rötteler. Asymmetric quantum codes:
constructions, bounds and performance. Proc. Roy. Soc. London Ser. A, 465(2105):1645–
1672, 2009.
[1627] B. Schmidt and C. White. All two-weight irreducible cyclic codes? Finite Fields
Appl., 8:1–17, 2002.
[1628] W. M. Schmidt. Diophantine Approximation, volume 785 of Lecture Notes in Math.
Springer, Berlin, 1980.
[1629] W. M. Schmidt. Equations Over Finite Fields: An Elementary Approach. Kendrick
Press, Heber City, Utah, 2nd edition, 2004.
[1630] J. Schönheim. On linear and nonlinear single-error-correcting q-nary perfect codes.
Inform. and Control, 12:23–26, 1968.
[1631] H. D. Schotten. Optimum complementary sets and quadriphase sequences derived
from q-ary m-sequences. In Proc. IEEE International Symposium on Information
Theory, page 485, Ulm, Germany, June-July 1997.
[1632] A. Schrijver. A comparison of the Delsarte and Lovász bounds. IEEE Trans. Inform.
Theory, 25:425–429, 1979.
[1633] A. Schrijver. Theory of Linear and Integer Programming. Wiley-Interscience Series
in Discrete Mathematics. John Wiley & Sons, Ltd., Chichester, UK, 1986.
[1634] A. Schrijver. New code upper bounds from the Terwilliger algebra and semidefinite
programming. IEEE Trans. Inform. Theory, 51:2859–2866, 2005.
[1635] B. Schumacher. Quantum coding. Physical Review A, 51:2738–2747, April 1995.
[1636] M. Schwartz and A. Vardy. On the stopping distance and stopping redundancy
of codes. In Proc. IEEE International Symposium on Information Theory, pages
975–979, Adelaide, Australia, September 2005.
[1637] G. F. Seelinger, P. A. Sissokho, L. E. Spence, and C. Vanden Eynden. Partitions of
V (n, q) into 2- and s-dimensional subspaces. J. Combin. Des., 20:467–482, 2012.
[1638] B. Segre. Curve razionali normali e k-archi negli spazi finiti (in Italian). Ann. Mat. Pura Appl.
(4), 39:357–379, 1955.
[1639] B. Segre. Ovals in a finite projective plane. Canad. J. Math., 7:414–416, 1955.
[1640] B. Segre. Sui k-archi nei piani finiti di caratteristica due (in Italian). Rev. Math. Pures Appl.,
2:289–300, 1957.
[1641] B. Segre. Ovali e curve σ nei piani di Galois di caratteristica due (in Italian). Atti
Accad. Naz. Lincei Rend. Cl. Sci. Fis. Mat. Nat. (8), 32:785–790, 1962.
[1642] E. S. Selmer. Linear Recurrence Relations Over Finite Fields. Department of Infor-
matics, University of Bergen, 1966.
[1643] N. V. Semakov and V. A. Zinov’ev. Equidistant q-ary codes and resolved balanced
incomplete block designs (in Russian). Problemy Peredači Informacii, 4(2):3–10,
1968. (English translation in Probl. Inform. Transm., 4(2):1–7, 1968).
[1644] N. V. Semakov, V. A. Zinov’ev, and G. V. Zaitsev. Class of maximal equidistant codes
(in Russian). Problemy Peredači Informacii, 5(2):84–87, 1969. (English translation
in Probl. Inform. Transm., 5(2):65–68, 1969).
[1645] P. Semenov and P. Trifonov. Spectral method for quasi-cyclic code analysis. IEEE
Commun. Letters, 16:1840–1843, 2012.
[1670] M. Shi, Y. Guan, C. Wang, and P. Solé. Few-weight codes from trace codes over Rk.
Bull. Aust. Math. Soc., 98:167–174, 2018.
[1671] M. Shi, D. Huang, and P. Solé. Optimal ternary cubic two-weight codes. Chinese J.
Electronics, 27:734–738, 2018.
[1672] M. Shi, Y. Liu, and P. Solé. Optimal two-weight codes from trace codes over F2 + uF2.
IEEE Commun. Letters, 20:2346–2349, 2016.
[1673] M. Shi, Y. Liu, and P. Solé. Optimal binary codes from trace codes over a non-chain
ring. Discrete Appl. Math., 219:176–181, 2017.
[1674] M. Shi, Y. Liu, and P. Solé. Two-weight and three-weight codes from trace codes
over Fp + uFp + vFp + uvFp. Discrete Math., 341:350–357, 2018.
[1675] M. Shi, L. Q. Qian, and P. Solé. Few-weight codes from trace codes over a local ring.
Appl. Algebra Engrg. Comm. Comput., 29:335–350, 2018.
[1676] M. Shi, R. S. Wu, Y. Liu, and P. Solé. Two and three weight codes over Fp + uFp.
Cryptogr. Commun., 9:637–646, 2017.
[1677] M. Shi, R. S. Wu, L. Q. Qian, L. Sok, and P. Solé. New classes of p-ary few weight
codes. Bull. Malays. Math. Sci. Soc., 42:1393–1412, 2019.
[1678] M. Shi, R. S. Wu, and P. Solé. Asymptotically good additive cyclic codes exist. IEEE
Commun. Letters, 22:1980–1983, 2018.
[1679] M. Shi, H. Zhu, and P. Solé. Optimal three-weight cubic codes. Appl. Comput.
Math., 17:175–184, 2018.
[1680] M. Shi, S. Zhu, and S. Yang. A class of optimal p-ary codes from one-weight codes
over Fp[u]/(um). J. Franklin Inst., 350:929–937, 2013.
[1681] K. Shimizu. A parallel LSI architecture for LDPC decoder improving message-
passing schedule. In Proc. IEEE International Symposium on Circuits and Systems,
pages 5099–5102, Island of Kos, Greece, May 2006.
[1682] K. Shiromoto. Singleton bounds for codes over finite rings. J. Algebraic Combin.,
12:95–99, 2000.
[1683] K. Shiromoto. Matroids and codes with the rank metric. arXiv:1803.06041
[math.CO], March 2018.
[1684] M. A. Shokrollahi. New sequences of linear time erasure codes approaching the
channel capacity. In M. Fossorier, H. Imai, S. Lin, and A. Poli, editors, Applied
Algebra, Algebraic Algorithms and Error-Correcting Codes, volume 1719 of Lecture
Notes in Comput. Sci., pages 65–76. Springer, Berlin, Heidelberg, 1999.
[1685] M. A. Shokrollahi. Codes and graphs. In H. Reichel and S. Tison, editors, STACS
2000 – Symposium on Theoretical Aspects of Computer Science, volume 1770 of
Lecture Notes in Comput. Sci., pages 1–12. Springer, Berlin, Heidelberg, 2000.
[1686] M. A. Shokrollahi. Capacity-achieving sequences. In B. Marcus and J. Rosenthal,
editors, Codes, Systems, and Graphical Models (Minneapolis, MN, 1999), volume 123
of IMA Vol. Math. Appl., pages 153–166. Springer,
New York, 2001.
[1687] M. A. Shokrollahi. Raptor codes. IEEE Trans. Inform. Theory, 52:2551–2567, 2006.
[1688] M. A. Shokrollahi. Theory and applications of Raptor codes. In A. Quarteroni,
editor, MATHKNOW—Mathematics, Applied Sciences and Real Life, volume 3 of
MS&A. Model. Simul. Appl., pages 59–89. Springer-Verlag Italia, Milan, 2009.
[1689] M. A. Shokrollahi, B. Hassibi, B. M. Hochwald, and W. Sweldens. Representation
theory for high-rate multiple-antenna code design. IEEE Trans. Inform. Theory,
47:2335–2367, 2001.
[1690] M. A. Shokrollahi, S. Lassen, and R. Karp. Systems and processes for decoding
chain reaction codes through inactivation. http://www.freepatentsonline.com/
6856263.html, February 2005.
[1691] M. A. Shokrollahi and M. G. Luby. Raptor codes. Found. Trends Commun. and
Inform. Theory, 6:213–322, 2009.
[1692] M. A. Shokrollahi, T. Stockhammer, M. G. Luby, and M. Watson. Raptor Forward
Error Correction Scheme for Object Delivery. RFC 5053, October 2007.
[1693] M. A. Shokrollahi and H. Wasserman. List decoding of algebraic-geometric codes.
IEEE Trans. Inform. Theory, 45:432–437, 1999.
[1694] P. W. Shor. Algorithms for quantum computation: discrete logarithms and factoring.
In S. Goldwasser, editor, Proc. 35th IEEE Symposium on Foundations of Computer
Science, pages 124–134, Santa Fe, New Mexico, 1994. IEEE Comput. Soc. Press, Los
Alamitos, California.
[1695] P. W. Shor. Scheme for reducing decoherence in quantum computer memory. Physical
Review A, 52:R2493–R2496, October 1995.
[1696] P. W. Shor. Polynomial-time algorithms for prime factorization and discrete loga-
rithms on a quantum computer. SIAM J. Comput., 26:1484–1509, 1997.
[1697] I. E. Shparlinski, M. A. Tsfasman, and S. G. Vlăduţ. Curves with many points
and multiplication in finite fields. In H. Stichtenoth and M. A. Tsfasman, editors,
Coding Theory and Algebraic Geometry, volume 1518 of Lecture Notes in Math.,
pages 145–169. Springer, Berlin, Heidelberg, 1992.
[1698] K. W. Shum and Y. Hu. Cooperative regenerating codes. IEEE Trans. Inform.
Theory, 59:7229–7258, 2013.
[1699] I. Şiap. MacWilliams identity for m-spotty Lee weight enumerators. Appl. Math.
Lett., 23:13–16, 2010.
[1700] I. Şiap, T. Abualrub, and A. Ghrayeb. Cyclic DNA codes over the ring F2[u]/(u2 − 1)
based on the deletion distance. J. Franklin Inst., 346:731–740, 2009.
[1701] I. Şiap, N. Aydin, and D. K. Ray-Chaudhuri. New ternary quasi-cyclic codes with
better minimum distances. IEEE Trans. Inform. Theory, 46:1554–1558, 2000.
[1709] D. Silva and F. R. Kschischang. Fast encoding and decoding of Gabidulin codes.
In Proc. IEEE International Symposium on Information Theory, pages 2858–2862,
Seoul, Korea, June-July 2009.
[1710] D. Silva and F. R. Kschischang. On metrics for error correction in network coding.
IEEE Trans. Inform. Theory, 55:5479–5490, 2009.
[1712] F. G. Silva, V. Sousa, and L. Marinho. High coded data rate and multicodeword
WiMAX LDPC decoding on the Cell/BE. IET Electronics Letters, 44:1415–1417,
2008.
[1713] J. H. Silverman. The Arithmetic of Elliptic Curves, volume 106 of Graduate Texts
in Mathematics. Springer, Dordrecht, 2nd edition, 2009.
[1717] R. C. Singleton. Maximum distance q-ary codes. IEEE Trans. Inform. Theory,
10:116–118, 1964.
[1718] M. Sinnokrot, J. R. Barry, and V. Madisetti. Embedded Alamouti space-time
codes for high rate and low decoding complexity. In Proc. 42nd Asilomar Conference
on Signals, Systems and Computers, pages 1749–1753, Pacific Grove, California,
October 2008.
[1719] M. Sipser and D. Spielman. Expander codes. IEEE Trans. Inform. Theory, 42:1710–
1722, 1996.
[1720] A. N. Skorobogatov and S. G. Vlăduţ. On the decoding of algebraic-geometric codes.
IEEE Trans. Inform. Theory, 36:1051–1060, 1990.
[1721] M. M. Skriganov. Coding theory and uniform distributions. St. Petersburg Math.
J., 13:301–337, 2002.
[1722] D. Slepian. A class of binary signaling alphabets. Bell System Tech. J., 35:203–234,
1956. (Also reprinted in [169] pages 56–65 and [212] pages 83–114).
[1723] D. Slepian. A note on two binary signaling alphabets. IRE Trans. Inform. Theory,
2:84–86, 1956. (Also reprinted in [212] pages 115–117).
[1724] D. Slepian. Some further theory of group codes. Bell System Tech. J., 39:203–234,
1960. (Also reprinted in [212] pages 118–151).
[1725] D. Slepian. Group codes for the Gaussian channel. Bell System Tech. J., 47:575–602,
1968.
[1726] N. J. A. Sloane. Is there a (72, 36) d = 16 self-dual code? IEEE Trans. Inform.
Theory, 19:251, 1973.
[1727] N. J. A. Sloane and J. G. Thompson. Cyclic self-dual codes. IEEE Trans. Inform.
Theory, 29:364–366, 1983.
[1728] R. Smarandache, H. Gluesing-Luerssen, and J. Rosenthal. Constructions of MDS-
convolutional codes. IEEE Trans. Inform. Theory, 47:2045–2049, 2001.
[1729] R. Smarandache and J. Rosenthal. A state space approach for constructing MDS
rate 1/n convolutional codes. In Proc. IEEE Information Theory Workshop, pages
116–117, Killarney, Ireland, June 1998.
[1732] S. L. Snover. The Uniqueness of the Nordstrom–Robinson and the Golay Binary
Codes. PhD thesis, Michigan State University, 1973.
[1733] P. Solé. Asymptotic bounds on the covering radius of binary codes. IEEE Trans.
Inform. Theory, 36:1470–1472, 1990.
[1734] P. Solé. The covering radius of spherical designs. European J. Combin., 12:423–431,
1991.
[1735] P. Solé, editor. Codes Over Rings, volume 6 of Series on Coding Theory and Cryp-
tology. World Sci. Publ., Hackensack, New Jersey, 2009.
[1736] G. Solomon and H. C. A. van Tilborg. A connection between block and convolutional
codes. SIAM J. Appl. Math., 37:358–369, 1979.
[1737] W. Song, K. Cai, C. Yuen, K. Cai, and G. Han. On sequential locally repairable
codes. IEEE Trans. Inform. Theory, 64:3513–3527, 2018.
[1738] W. Song, S. H. Dau, C. Yuen, and T. J. Li. Optimal locally repairable linear codes.
IEEE J. Selected Areas Commun., 32:1019–1036, 2014.
[1739] W. Song and C. Yuen. Locally repairable codes with functional repair and multiple
erasure tolerance. arXiv:1507.02796 [cs.IT], 2015.
[1740] Y. Song, Y. Li, Z. Li, and J. Li. A new multi-use multi-secret sharing scheme based
on the duals of minimal linear codes. Security and Commun. Networks, 8:202–211,
2015.
[1741] A. B. Sørensen. Projective Reed-Muller codes. IEEE Trans. Inform. Theory,
37:1567–1576, 1991.
[1742] E. Spiegel. Codes over Zm. Inform. and Control, 35:48–51, 1977.
[1743] E. Spiegel. Codes over Zm, revisited. Inform. and Control, 37:100–104, 1978.
[1744] D. A. Spielman. Linear-time encodable and decodable error-correcting codes. IEEE
Trans. Inform. Theory, 42:1723–1731, 1996.
[1745] A. Sridharan, M. Lentmaier, D. J. Costello, Jr., and K. Sh. Zigangirov. Terminated
LDPC convolutional codes with thresholds close to capacity. In Proc. IEEE Inter-
national Symposium on Information Theory, pages 1372–1376, Adelaide, Australia,
September 2005.
[1746] R. Srikant and L. Ying. Communication Networks: An Optimization, Control, and
Stochastic Networks Perspective. Cambridge University Press, Cambridge, UK, 2014.
[1747] K. P. Srinath and B. S. Rajan. Low ML-decoding complexity, large coding gain,
full-diversity STBCs for 2 × 2 and 4 × 2 MIMO systems. IEEE J. Selected Topics
Signal Process., 3:916–927, 2009.
[1766] G. Szegö. Orthogonal Polynomials. Colloquium publications. Amer. Math. Soc., New
York, 1939.
[1767] T. Szőnyi and Z. Weiner. Stability of k mod p multisets and small weight codewords
of the code generated by the lines of PG(2, q). J. Combin. Theory Ser. A, 157:321–
333, 2018.
[1772] I. Tamo and A. Barg. A family of optimal locally recoverable codes. IEEE Trans.
Inform. Theory, 60:4661–4676, 2014.
[1773] I. Tamo, A. Barg, S. Goparaju, and A. R. Calderbank. Cyclic LRC codes, binary
LRC codes, and upper bounds on the distance of cyclic codes. Internat. J. Inform.
Coding Theory, 3:345–364, 2016.
[1774] I. Tamo, Z. Wang, and J. Bruck. Zigzag codes: MDS array codes with optimal
rebuilding. IEEE Trans. Inform. Theory, 59:1597–1616, 2013.
[1775] I. Tamo, Z. Wang, and J. Bruck. Access versus bandwidth in codes for storage. IEEE
Trans. Inform. Theory, 60:2028–2037, 2014.
[1776] I. Tamo, M. Ye, and A. Barg. The repair problem for Reed-Solomon codes: optimal
repair of single and multiple erasures with almost optimal node size. IEEE Trans.
Inform. Theory, 65:2673–2695, 2019.
[1777] H. Tanaka. New proofs of the Assmus-Mattson theorem based on the Terwilliger
algebra. European J. Combin., 30:736–746, 2009.
[1778] C. Tang, N. Li, Y. Qi, Z. Zhou, and T. Helleseth. Linear codes with two or three
weights from weakly regular bent functions. IEEE Trans. Inform. Theory, 62:1166–
1176, 2016.
[1779] C. Tang, Y. Qi, and M. Xu. A note on cyclic codes from APN functions. Appl.
Algebra Engrg. Comm. Comput., 25:21–37, 2014.
[1780] H. Tang, J. Xu, S. Lin, and K. Abdel-Ghaffar. Codes on finite geometries. IEEE
Trans. Inform. Theory, 51:572–596, 2005.
[1781] X. Tang and C. Ding. New classes of balanced quaternary and almost balanced
binary sequences with optimal autocorrelation value. IEEE Trans. Inform. Theory,
56:6398–6405, 2010.
[1782] X. Tang, P. Fan, and S. Matsufuji. Lower bounds on correlation of spreading sequence
set with low or zero correlation zone. Electron. Lett., 36:551–552, 2000.
[1783] X. Tang and G. Gong. New constructions of binary sequences with optimal autocor-
relation value/magnitude. IEEE Trans. Inform. Theory, 56:1278–1286, 2010.
[1784] R. M. Tanner. A recursive approach to low complexity codes. IEEE Trans. Inform.
Theory, 27:533–547, 1981.
[1785] R. M. Tanner. A transform theory for a class of group-invariant codes. IEEE Trans.
Inform. Theory, 34:752–775, 1988.
[1786] R. M. Tanner. On quasi-cyclic repeat accumulate codes. In Proc. 37th Allerton
Conference on Communication, Control, and Computing, pages 249–259, Monticello,
Illinois, September 1999.
[1787] R. M. Tanner. Minimum distance bounds by graph analysis. IEEE Trans. Inform.
Theory, 47:808–821, 2001.
[1788] R. M. Tanner, D. Sridhara, and T. E. Fuja. A class of group structured LDPC codes.
In Proc. 6th International Symposium on Communication Theory and Applications,
pages 365–370, Ambleside, UK, July 2001.
[1789] R. M. Tanner, D. Sridhara, A. Sridharan, D. J. Costello, Jr., and T. E. Fuja. LDPC
block and convolutional codes from circulant matrices. IEEE Trans. Inform. Theory,
50:2966–2984, 2004.
[1790] T. Tao and V. H. Vu. Additive Combinatorics, volume 105 of Cambridge Studies in
Advanced Mathematics. Cambridge University Press, Cambridge, UK, 2006.
[1791] L. F. Tapia Cuitiño and A. L. Tironi. Some properties of skew codes over finite
fields. Des. Codes Cryptogr., 85:359–380, 2017.
[1792] V. Tarokh, H. Jafarkhani, and A. R. Calderbank. Space-time block codes from
orthogonal design. IEEE Trans. Inform. Theory, 45:1456–1466, 1999.
[1793] V. Tarokh, N. Seshadri, and A. R. Calderbank. Space-time codes for high data rate
wireless communication: performance criterion and code construction. IEEE Trans.
Inform. Theory, 44:744–765, 1998.
[1794] O. Taussky and J. Todd. Covering theorems for groups. Ann. Soc. Polon. Math.,
21:303–305, 1948.
[1795] L. Teirlinck. On projective and affine hyperplanes. J. Combin. Theory Ser. A,
28:290–306, 1980.
[1796] E. Telatar. Capacity of multi-antenna Gaussian channels. European Trans. on
Telecommun., 10:585–596, 1999.
[1797] S. Ten Brink. Convergence of iterative decoding. Electron. Lett., 35:806–808,
1999.
[1798] S. Ten Brink. Convergence behavior of iteratively decoded parallel concatenated
codes. IEEE Trans. Commun., 49:1727–1737, 2001.
[1799] A. Thangaraj and S. W. McLaughlin. Quantum codes from cyclic codes over GF(4m).
IEEE Trans. Inform. Theory, 47:1176–1178, 2001.
[1800] J. G. Thompson. Weighted averages associated to some codes. Scripta Math., 29:449–
452, 1973.
[1803] C. Tian. Characterizing the rate region of the (4, 3, 3) exact-repair regenerating
codes. IEEE J. Selected Areas Commun., 32:967–975, 2014.
[1804] C. Tian, B. Sasidharan, V. Aggarwal, V. A. Vaishampayan, and P. V. Kumar. Lay-
ered exact-repair regenerating codes via embedded error correction and block designs.
IEEE Trans. Inform. Theory, 61:1933–1947, 2015.
[1805] A. Tietäväinen. On the nonexistence of perfect codes over finite fields. SIAM J.
Appl. Math., 24:88–96, 1973.
[1806] A. Tietäväinen. Nonexistence of nontrivial perfect codes in case q = p1^r p2^s, e ≥ 3.
Discrete Math., 17:199–205, 1977.
[1807] A. Tietäväinen. An upper bound on the covering radius as a function of the dual
distance. IEEE Trans. Inform. Theory, 36:1472–1474, 1990.
[1808] A. Tietäväinen. Covering radius and dual distance. Des. Codes Cryptogr., 1:31–46,
1991.
[1809] J. Tits. Ovoïdes à translations. Rend. Mat. e Appl. (5), 21:37–59, 1962.
[1810] V. Tomás, J. Rosenthal, and R. Smarandache. Decoding of convolutional codes over
the erasure channel. IEEE Trans. Inform. Theory, 58:90–108, 2012.
[1811] V. D. Tonchev. Quasi-symmetric 2-(31, 7, 7) designs and a revision of Hamada’s
conjecture. J. Combin. Theory Ser. A, 42:104–110, 1986.
[1812] V. D. Tonchev. Combinatorial Configurations, Designs, Codes, Graphs. John Wiley
& Sons, Inc. New York, 1988.
[1813] V. D. Tonchev. A class of Steiner 4-wise balanced designs derived from Preparata
codes. J. Combin. Des., 4:203–204, 1996.
[1814] V. D. Tonchev. Codes and Designs. In V. S. Pless and W. C. Huffman, editors,
Handbook of Coding Theory, Vol. I, II, chapter 15, pages 1229–1267. North-Holland,
Amsterdam, 1998.
[1815] V. D. Tonchev. Linear perfect codes and a characterization of the classical designs.
Des. Codes Cryptogr., 17:121–128, 1999.
[1816] V. D. Tonchev. Codes. In C. J. Colbourn and J. H. Dinitz, editors, The CRC Hand-
book of Combinatorial Designs, chapter VII.1, pages 677–702. Chapman & Hall/CRC
Press, 2nd edition, 2007.
[1817] V. D. Tonchev. On resolvable Steiner 2-designs and maximal arcs in projective
planes. Des. Codes Cryptogr., 84:165–172, 2017.
[1818] A.-L. Trautmann, F. Manganiello, M. Braun, and J. Rosenthal. Cyclic orbit codes.
IEEE Trans. Inform. Theory, 59:7386–7404, 2013.
[1819] P. Trifonov. Efficient design and decoding of polar codes. IEEE Trans. Commun.,
60:3221–3227, 2012.
[1820] H. Trinker. The triple distribution of codes and ordered codes. Discrete Math.,
311:2283–2294, 2011.
[1821] M. A. Tsfasman, S. G. Vlăduţ, and D. Nogin. Algebraic Geometric Codes: Basic
Notions, volume 139 of Mathematical Surveys and Monographs. Amer. Math. Soc.,
Providence, Rhode Island, 2007.
[1822] M. A. Tsfasman, S. G. Vlăduţ, and T. Zink. Modular curves, Shimura curves, and
Goppa codes, better than Varshamov-Gilbert bound. Math. Nachr., 109:21–28, 1982.
[1823] L. Tunçel. Polyhedral and Semidefinite Programming Methods in Combinatorial Op-
timization. Amer. Math. Soc., Providence, Rhode Island; Fields Institute for Re-
search in Mathematical Sciences, Toronto, Canada, 2010.
[1824] M. Vajha, S. B. Balaji, and P. V. Kumar. Explicit MSR codes with optimal access,
optimal sub-packetization and small field size for d = k+1, k+2, k+3. In Proc. IEEE
International Symposium on Information Theory, pages 2376–2380, Vail, Colorado,
June 2018.
[1825] M. Vajha, V. Ramkumar, B. Puranik, G. R. Kini, E. Lobo, B. Sasidharan, P. V.
Kumar, A. Barg, M. Ye, S. Narayanamurthy, S. Hussain, and S. Nandi. Clay codes:
moulding MDS codes to yield an MSR code. In Proc. 16th USENIX Conference on
File and Storage Technologies, pages 139–154, Oakland, California, February 2018.
[1832] J. H. van Lint. Recent results on perfect codes and related topics. In M. Hall,
Jr. and J. H. van Lint, editors, Combinatorics. Part 1: Theory of Designs, Finite
Geometry and Coding Theory, volume 55 of Math. Centre Tracts, pages 158–178.
Math. Centrum, Amsterdam, 1974.
[1833] J. H. van Lint. A survey on perfect codes. Rocky Mountain J. Math., 5:199–224,
1975.
[1834] J. H. van Lint. Kerdock codes and Preparata codes. Congr. Numer., 39:25–41, 1983.
[1835] J. H. van Lint. Repeated-root cyclic codes. IEEE Trans. Inform. Theory, 37:343–345,
1991.
[1836] J. H. van Lint. Introduction to Coding Theory. Springer-Verlag, Berlin, Heidelberg,
3rd edition, 1999.
[1837] J. H. van Lint and A. Schrijver. Construction of strongly regular graphs, two-weight
codes and partial geometries by finite fields. Combinatorica, 1:63–73, 1981.
[1838] J. H. van Lint and R. M. Wilson. On the minimum distance of cyclic codes. IEEE
Trans. Inform. Theory, 32:23–40, 1986.
[1839] J. H. van Lint and R. M. Wilson. A Course in Combinatorics. Cambridge University
Press, Cambridge, UK, 2nd edition, 2001.
[1840] J. van Tilburg. On the McEliece public-key cryptosystem. In S. Goldwasser, editor,
Advances in Cryptology – CRYPTO ’88, volume 403 of Lecture Notes in Comput.
Sci., pages 119–131. Springer, Berlin, Heidelberg, 1990.
[1841] J. van Tilburg. Security-Analysis of a Class of Cryptosystems Based on Linear
Error-Correcting Codes. PhD thesis, Technical University Eindhoven, 1994.
[1842] G. J. M. van Wee. On the nonexistence of certain perfect mixed codes. Discrete
Math., 87:323–326, 1991.
[1843] A. Vardy. The intractability of computing the minimum distance of a code. IEEE
Trans. Inform. Theory, 43:1757–1766, 1997.
[1844] R. R. Varshamov. The evaluation of signals in codes with correction of errors (in
Russian). Dokl. Akad. Nauk SSSR, 117:739–741, 1957. (English translation in [212]
pages 68–71).
[1845] R. R. Varshamov. Some features of linear codes that correct asymmetric errors.
Soviet Physics Doklady, 9:538, 1965.
[1846] B. Vasic, S. K. Chilappagari, D. V. Nguyen, and S. K. Planjery. Trapping set ontol-
ogy. In Proc. 47th Allerton Conference on Communication, Control, and Computing,
pages 1–7, Monticello, Illinois, September-October 2009.
[1847] J. L. Vasil’ev. On nongroup close-packed codes (in Russian). Probl. Kibernet., 8:337–
339, 1962. (English translation in [212] pages 351–357).
[1848] G. Vega. A critical review and some remarks about one- and two-weight irreducible
cyclic codes. Finite Fields Appl., 33:1–13, 2015.
[1849] R. Vehkalahti, C. Hollanti, and F. Oggier. Fast-decodable asymmetric space-time
codes from division algebras. IEEE Trans. Inform. Theory, 58:2362–2385, 2012.
[1850] J. M. Velazquez-Gutierrez and C. Vargas-Rosales. Sequence sets in wireless com-
munication systems: a survey. IEEE Commun. Surveys & Tutorials, 19:1225–1248,
2017.
[1851] P. Véron. A fast identification scheme. In Proc. IEEE International Symposium on
Information Theory, page 359, Whistler, British Columbia, Canada, September 1995.
[1852] M. G. Vides. Métricas Sobre Grupos y Anillos con Aplicaciones a la Teoría de
Códigos. PhD thesis, Universidad Nacional de Córdoba, 2018.
[1870] H. Wang, C. Xing, and R. Safavi-Naini. Linear authentication codes: bounds and
constructions. IEEE Trans. Inform. Theory, 49:866–872, 2003.
[1871] L. Wang, K. Feng, S. Ling, and C. Xing. Asymmetric quantum codes: characteriza-
tion and constructions. IEEE Trans. Inform. Theory, 56:2938–2945, 2010.
[1872] Q. Wang, K. Ding, and R. Xue. Binary linear codes with two weights. IEEE Com-
mun. Letters, 19:1097–1100, 2015.
[1873] H. N. Ward. Divisible codes. Arch. Math. (Basel), 36:485–494, 1981.
[1874] H. N. Ward. Multilinear forms and divisors of codeword weights. Quart. J. Math.
Oxford Ser. (2), 34(133):115–128, 1983.
[1888] D. B. West. Introduction to Graph Theory. Prentice Hall, Inc., Upper Saddle River,
New Jersey, 2nd edition, 2001.
[1889] N. Wiberg. Codes and Decoding on General Graphs. PhD thesis, Electrical Engi-
neering, Linköping University, Sweden, 1996.
[1890] C. Wieschebrink. Two NP-complete problems in coding theory with an application
in code based cryptography. In Proc. IEEE International Symposium on Information
Theory, pages 1733–1737, Seattle, Washington, July 2006.
[1891] M. M. Wilde and T. A. Brun. Optimal entanglement formulas for entanglement-
assisted quantum coding. Physical Review A, 77(6):064302, June 2008.
[1892] J. C. Willems. From time series to linear system I: finite-dimensional linear time
invariant systems. Automatica J. IFAC, 22:561–580, 1986.
[1893] W. Willems. A note on self-dual group codes. IEEE Trans. Inform. Theory, 48:3107–
3109, 2002.
[1894] R. A. Wilson. An Atlas of sporadic group representations. In R. T. Curtis and R. A.
Wilson, editors, The Atlas of Finite Groups: Ten Years On (Birmingham, 1995),
volume 249 of London Math. Soc. Lecture Note Ser., pages 261–273. Cambridge
University Press, Cambridge, UK, 1998.
[1895] S. Winograd. Some bilinear forms whose multiplicative complexity depends on the
field of constants. Mathematical Systems Theory, 10:169–180, 1977.
[1896] M. Wirtz. On the parameters of Goppa codes. IEEE Trans. Inform. Theory, 34:1341–
1343, 1988.
[1897] J. K. Wolf. Efficient maximum-likelihood decoding of linear block codes using a
trellis. IEEE Trans. Inform. Theory, 24:76–80, 1978.
[1898] J. Wolfmann. Codes projectifs à deux ou trois poids associés aux hyperquadriques
d’une géométrie finie. Discrete Math., 13:185–211, 1975.
[1899] J. Wolfmann. New bounds on cyclic codes from algebraic curves. In G. D. Cohen
and J. Wolfmann, editors, Coding Theory and Applications, volume 388 of Lecture
Notes in Comput. Sci., pages 47–62. Springer, Berlin, Heidelberg, 1989.
[1900] J. Wolfmann. Bent functions and coding theory. In A. Pott, P. V. Kumar, T. Helleseth,
and D. Jungnickel, editors, Difference Sets, Sequences and Their Correlation
Properties, volume 542 of NATO Adv. Sci. Inst. Ser. C Math. Phys. Sci., pages
393–418. Kluwer Acad. Publ., Dordrecht, Netherlands, 1999.
[1901] J. Wolfmann. Negacyclic and cyclic codes over Z4. IEEE Trans. Inform. Theory,
45:2527–2532, 1999.
[1902] J. Wolfmann. Binary images of cyclic codes over Z4. IEEE Trans. Inform. Theory,
47:1773–1779, 2001.
[1903] S. Wolfram. Idea Makers: Personal Perspectives on the Lives & Ideas of Some
Notable People. Wolfram Media, Inc., Champaign, Illinois, 2016.
[1904] H. Wolkowicz, R. Saigal, and L. Vandenberghe, editors. Handbook of Semidefinite
Programming. Kluwer Acad. Publ., Boston, 2000.
[1905] J. A. Wood. Extension theorems for linear codes over finite rings. In T. Mora
and H. F. Mattson, Jr., editors, Applied Algebra, Algebraic Algorithms and Error-
Correcting Codes, volume 1255 of Lecture Notes in Comput. Sci., pages 329–340.
Springer, Berlin, Heidelberg, 1997.
[1906] J. A. Wood. Duality for modules over finite rings and applications to coding theory.
Amer. J. Math., 121:555–575, 1999.
[1907] W. K. Wootters and W. H. Zurek. A single quantum cannot be cloned. Nature,
299:802–803, 1982.
[1908] S. J. Wright. Primal-dual Interior-point Methods. Society for Industrial and Applied
Mathematics (SIAM), Philadelphia, 1997.
[1909] B. Wu and Z. Liu. Linearized polynomials over finite fields revisited. Finite Fields
Appl., 22:79–100, 2013.
[1910] F. Wu. Constructions of strongly regular Cayley graphs using even index Gauss
sums. J. Combin. Des., 21:432–446, 2013.
[1911] Y. Wu, N. Li, and X. Zeng. Linear codes with few weights from cyclotomic classes
and weakly regular bent functions. Des. Codes Cryptogr., 88:1255–1272, 2020.
[1912] S.-T. Xia and F.-W. Fu. On the stopping distance of finite geometry LDPC codes.
IEEE Commun. Letters, 10:381–383, 2006.
[1913] S.-T. Xia and F.-W. Fu. Johnson type bounds on constant dimension codes. Des.
Codes Cryptogr., 50:163–172, 2009.
[1914] C. Xiang. It is indeed a fundamental construction of all linear codes.
arXiv:1610.06355 [cs.IT], October 2016.
[1915] C. Xing. Asymptotic bounds on frameproof codes. IEEE Trans. Inform. Theory,
48:2991–2995, 2002.
[1916] C. Xing. Nonlinear codes from algebraic curves improving the Tsfasman-Vlăduţ-Zink
bound. IEEE Trans. Inform. Theory, 49:1653–1657, 2003.
[1917] E. Yaakobi, J. Bruck, and P. H. Siegel. Decoding of cyclic codes over symbol-pair
read channels. In Proc. IEEE International Symposium on Information Theory,
pages 2891–2895, Cambridge, Massachusetts, July 2012.
[1918] E. Yaakobi, J. Bruck, and P. H. Siegel. Constructions and decoding of cyclic codes
over b-symbol read channels. IEEE Trans. Inform. Theory, 62:1541–1551, 2016.
[1919] S. Yang and J.-C. Belfiore. Optimal space-time codes for the MIMO amplify-and-
forward cooperative channel. IEEE Trans. Inform. Theory, 53:647–663, 2007.
[1920] X. Yang and J. L. Massey. The condition for a cyclic code to have a complementary
dual. Discrete Math., 126:391–393, 1994.
[1921] Y. Yang and X. Tang. Generic construction of binary sequences of period 2N with
optimal odd correlation magnitude based on quaternary sequences of odd period N.
IEEE Trans. Inform. Theory, 64:384–392, 2018.
[1922] N. Yankov. A putative doubly even [72, 36, 16] code does not have an automorphism
of order 9. IEEE Trans. Inform. Theory, 58:159–163, 2012.
[1923] H. Yao and G. W. Wornell. Achieving the full MIMO diversity-multiplexing frontier
with rotation-based space-time codes. In Proc. 41st Allerton Conference on Com-
munication, Control, and Computing, pages 400–409, Monticello, Illinois, October
2003.
[1924] F. Ye, K. W. Shum, and R. W. Yeung. The rate region for secure distributed storage
systems. IEEE Trans. Inform. Theory, 63:7038–7051, 2017.
[1925] M. Ye and A. Barg. Explicit constructions of high-rate MDS array codes with optimal
repair bandwidth. IEEE Trans. Inform. Theory, 63:2001–2014, 2017.
[1926] M. Ye and A. Barg. Explicit constructions of optimal-access MDS codes with nearly
optimal sub-packetization. IEEE Trans. Inform. Theory, 63:6307–6317, 2017.
[1927] M. Ye and A. Barg. Cooperative repair: constructions of optimal MDS codes for all
admissible parameters. IEEE Trans. Inform. Theory, 65:1639–1656, 2019.
[1928] V. Y. Yorgov. Binary self-dual codes with an automorphism of odd order (in Rus-
sian). Problemy Peredači Informacii, 19(4):11–24, 1983. (English translation in Probl.
Inform. Transm., 19(4):260–270, 1983).
[1929] V. Y. Yorgov. A method for constructing inequivalent self-dual codes with applica-
tions to length 56. IEEE Trans. Inform. Theory, 33:77–82, 1987.
[1930] V. Y. Yorgov. On the automorphism group of a putative code. IEEE Trans. Inform.
Theory, 52:1724–1726, 2006.
[1931] V. Y. Yorgov and D. Yorgov. The automorphism group of a self-dual [72, 36, 16] code
does not contain Z4. IEEE Trans. Inform. Theory, 60:3302–3307, 2014.
[1932] E. V. York. Algebraic Description and Construction of Error Correcting Codes: A
Linear Systems Point of View. PhD thesis, University of Notre Dame, 1997.
[1933] D. C. Youla and P. F. Pickel. The Quillen-Suslin theorem and the structure of n-
dimensional elementary polynomial matrices. IEEE Trans. Circuits and Systems,
31:513–518, 1984.
[1934] N. Y. Yu and G. Gong. New binary sequences with optimal autocorrelation magni-
tude. IEEE Trans. Inform. Theory, 54:4771–4779, 2008.
[1935] J. Yuan and C. Ding. Secret sharing schemes from three classes of linear codes. IEEE
Trans. Inform. Theory, 52:206–212, 2006.
[1936] V. A. Yudin. Minimum potential energy of a point system of charges (in Rus-
sian). Diskret. Mat., 4(2):115–121, 1992. (English translation in Discrete Math. Appl.,
3(1):75–81, 1993).
[1937] R. Zamir. Lattices are everywhere. In Proc. Information Theory and Applications
Workshop, pages 392–421, San Diego, California, February 2009.
[1938] R. Zamir. Lattice Coding for Signals and Networks: A Structured Coding Approach to
Quantization, Modulation, and Multiuser Information Theory. Cambridge University
Press, Cambridge, UK, 2014.
[1939] S. K. Zaremba. Covering problems concerning abelian groups. J. London Math. Soc.,
27:242–246, 1952.
[1940] A. Zeh and S. Ling. Decoding of quasi-cyclic codes up to a new lower bound for the
minimum distance. In Proc. IEEE International Symposium on Information Theory,
pages 2584–2588, Honolulu, Hawaii, June-July 2014.
[1941] G. Zémor. Minimum distance bounds by graph analysis. IEEE Trans. Inform.
Theory, 47:835–837, 2001.
[1942] J. Zhang, X. Wang, and G. Ge. Some improvements on locally repairable codes.
arXiv:1506.04822 [cs.IT], June 2015.
[1943] S. Zhang. On the nonexistence of extremal self-dual codes. Discrete Appl. Math.,
91:277–286, 1999.
[1944] D. Zheng, X. Li, and K. Chen. Code-based ring signature scheme. Internat. J.
Network Security, 5:154–157, 2007.
[1945] L. Zheng and D. N. C. Tse. Diversity and multiplexing: a fundamental tradeoff in
multiple-antenna channels. IEEE Trans. Inform. Theory, 49:1073–1096, 2003.
[1946] Z. Zhou and C. Ding. A class of three-weight cyclic codes. Finite Fields Appl.,
25:79–93, 2014.
[1947] Z. Zhou, N. Li, C. Fan, and T. Helleseth. Linear codes with two or three weights
from quadratic bent functions. Des. Codes Cryptogr., 81:283–295, 2016.
[1948] V. A. Zinov’ev. Generalized cascade codes (in Russian). Problemy Peredači Infor-
macii, 12(1):5–15, 1976. (English translation in Probl. Inform. Transm., 12(1):2–9,
1976).
[1949] V. A. Zinov’ev and V. K. Leont’ev. On non-existence of perfect codes over Galois
fields (in Russian). Probl. Control and Inform. Theory, 2:123–132, 1973.
942 Index
divisible, see divisible code
divisor, 80, 377
doubly-even, see doubly-even code
duadic, see duadic code
dual, see dual code
dual distance, 103
dually quasi-MRD, see dually quasi-MRD code
effective length, 461
encoding, see encoding
equivalence, see equivalence
error vector, 4
even, see even code
existence of, 61, 65, 69–71, 76
extended, see extended code
external distance, 103
extremal, see extremal code
Ferrers diagram, see Ferrers diagram code
few-weight, see few-weight code
flag, 711
formally self-dual, see formally self-dual code
fountain, see fountain code
Gabidulin, see Gabidulin code
generalized concatenated, see generalized concatenated code
generator matrix, 8, 575, 576, 801
  link to Galois geometry, 288
  standard form, 9, 14
Goethals, see Goethals code
Golay, see Golay code
Goppa, see Goppa code
graph representation, 715
Grassmannian, see constant-dimension code
group, see group code
Hamming, see Hamming code
hexacode, see hexacode
information rate, 20, 44
information set, 9, 595, 596
inner distribution, see inner distribution
isodual, see isodual code
Justesen, see Justesen code
Kerdock, see Kerdock code
LCD, see LCD code
length of, 7
lengthened, see lengthened code
linear, see linear code
linear over a ring, see linear code over a ring
locally recoverable, see locally recoverable code
locally regenerating, see locally regenerating code
low-density parity check, see low-density parity check (LDPC) code
LT, see LT code
MacDonald, see MacDonald code
maximum distance separable, see maximum distance separable (MDS) code
maximum rank distance, see maximum rank distance (MRD) code
message, 4
minimal, see minimal code
minimum distance, 11, 576, 595, 802
  dual, 587
minimum weight, 11
mixed, see mixed code
mixed-dimension, see mixed-dimension code
multi-shot network, see network coding, multi-shot
multivariable, see multivariable code
nearly perfect, see nearly perfect code
negacyclic, see negacyclic code
network, see network coding
nonexistence of, 61, 65, 68–71
nonlinear, 598
Nordstrom–Robinson, see Nordstrom–Robinson code
one-shot network, see network coding, one-shot
optimal, see optimal code
optimal anticode, see optimal anticode
optimal vector anticode, see optimal vector anticode
orbit, 710
orthogonal, see orthogonal code
packing radius, 557
padded, see padded code
parameters of, 7, 11
parity check matrix, 8, 64, 801
  link to Galois geometry, 288
perfect, see perfect code
polar, see polar code
polycyclic, see polycyclic code
potential energy, 255, 263
precode only, see precode only code
Preparata, see Preparata code
projective, see projective code
punctured, see punctured code
q-ary, 7
q-linear q^m-ary code, see additive code
quadratic residue, see quadratic residue code
quantum, see quantum code
quasi-cyclic, see quasi-cyclic code
quasi-twisted, see quasi-twisted code
rank-metric, see rank-metric code
Raptor, see Raptor code
rate, 20, 44, 198
rateless, see rateless code
redundancy set, 9
Reed–Muller, see Reed–Muller code
Reed–Solomon, see Reed–Solomon code
regenerating, see regenerating code
repetition, see repetition code
reversible, see reversible code
self-dual, see self-dual code
self-orthogonal, see self-orthogonal code
sequential, see sequential code
s-extremal, see s-extremal code
shortened, see shortened code
simplex, see simplex code
singly-even, see singly-even code
skew-BCH, see skew-BCH code
skew-constacyclic, see skew-constacyclic code
skew-cyclic, see skew-cyclic code
skew-RS, see skew-RS code
space-time, see space-time block code
spectrum, 563
spherical, see spherical code
subfield subcode, see subfield subcode
subspace, see subspace code
support a design, 98
symbol-pair, see symbol-pair code
t-EC-AUED, see t-EC-AUED code
ternary, 7
t-error-correcting, 40
tetracode, see tetracode
three-weight, see three-weight code
Tornado, see Tornado code
trace, see trace code
transpose of, 229
twisted, see twisted code
two-weight, see two-weight code
type I, see type I code
type II, see type II code
type III, see type III code
type IV, see type IV code
unrestricted, see unrestricted code
vector rank-metric, see vector rank-metric code
weight distribution, see weight distribution
weight enumerator, see weight enumerator
zero position, 450
code-division multiple-access (CDMA), 613, 617–619, 624, 628
codecan, 582
codeword, 4, 7
  even-like, 51
  minimal, 432, 520, 790
  odd-like, 51
coding, see network coding
coding gain, 675
coherent, 674
collineation, see projective geometry, collineation
column span of a matrix, 230
combinatorial design, see design
combinatorial metric, see metric, combinatorial
companion matrix, 156
completely monotone function, 258
completely reducible, 182, 184
complexity of a convolutional code, see convolutional code, complexity
cone, 269
  convex, 269
  dual, 269
  finitely generated, 269
  of positive semidefinite matrices, 271
  pointed, 269
  proper, 269
  psd cone, 271
conic program, 269
  complementary slackness, 270
  dual, 269
  feasible solution, 269
  linear program, 270
  optimal solution, 269
  optimality condition, 270
  optimization variable, 269
  primal, 269
  semidefinite program, 75, 271
residue code, see algebraic geometry code, residue code
residue field, 388
restriction of a code, 354, 757
reverse code, 206
reverse constraint, 425
reverse-complement constraint, 425
reversible code, 58
Riemann–Roch equations, 344, 348, 351, 353
Riemann–Roch space, 313
Riemann–Roch theorem, 315, 669
ring
  A_k, 119
  annihilator in, 391
  associate, 396
  chain, 117, 120, 387
  commutative with unity, 24
  F_{p^m} + uF_{p^m}, 407
  Frobenius, see Frobenius ring
  Galois, see Galois ring
  Gray map on, see Gray map
  ideal, 24, 117
  idempotent, 47
  index of nilpotency, 117, 120
  index of stability, 120
  local, 117, 120, 387
  maximal ideal of, 117
  of Boolean functions, 430
  of polynomials F_q[x], 25
    as a PID, 25
    unique coset representatives, 25
    unique factorization, 25
  principal ideal of, 24, 117, 387
    generator, 24
  principal ideal ring, 380, 387
  R(2, p, u^2 = u), 444
  R(3, 2, u^3 = 1), 446
  R(3, 3, u^3 = 1), 447
  R_k, 119, 430
  R_k(p), 430, 435
  R(k, p, u^k = a), 430, 441
  skew-polynomial, see skew-polynomial ring
ripple
  input, 729
  output, 723, 725
Roos bound, see bound, Roos
root of unity, see field, root of unity
row degree, see degree, of a row
row reduced, 200
row reduced echelon form, 579
row span of a matrix, 171, 230
SC decoding, see decoding, of polar codes, successive cancellation
SC GA decoding, see decoding, of polar codes, genie aided successive cancellation
Schubert cell, 696
SCL decoding, see decoding, of polar codes, list successive cancellation
secret sharing scheme, 352–354, 785–794
Segre mapping, 303
Segre variety, 303
  exterior set, 303
self-dual code, 10, 65, 66, 76, 79, 101, 117, 136, 372, 373, 596
self-orthogonal code, 10, 79, 101, 117, 192, 372, 576, 589–592
  shadow of, 83
  with minimal shadow, 84
semidefinite program, see conic program, semidefinite program
semifield, 302
  Dickson, 302
  spread set, 302
semiprimitive, 455
separating system, 349
sequence, 613, 614
  aperiodic correlation, see correlation, aperiodic
  autocorrelation, see correlation
  binary, 614
    with low correlation, 624–625
    with optimal autocorrelation, 619–622
  bound
    Levenshtein, 618
    Sarwate, 616
    Sidel'nikov, 616
    Welch, 616
  characteristic, 619
  correlation, see correlation
  crosscorrelation, see correlation
  de Bruijn, 628
    algorithms to generate, 635–640, 642
    Lyndon word, 637
  defining a cyclic code, see cyclic code, defining sequence
  even-periodic correlation, see correlation, periodic, even-periodic