of complementation groups by using cell fusion tests (Figure 13.10). If two cells, A and B,
lack function in different repair genes, then when the cells are fused the resulting hybrid
cell will contain functioning copies of both genes. Cell A, with gene A defective, will
provide a functioning copy of gene B, and cell B, in which gene B is defective, will provide
a functional copy of gene A. Thus, the hybrid should recover the wildtype resistance to
DNA damage. Using this technique, cells from patients with xeroderma pigmentosum,
caused by defects in nucleotide excision repair (see Box 13.3), have been divided into
seven different groups. Cells from any one group will complement (make good) the defect
in a cell from any other group. Fanconi anemia, caused by a defective cellular response to
DNA damage, has been divided into at least 12 groups. In general, a different gene is
mutated in each separate complementation group, although the clinical phenotypes overlap.
Details of these genetic diseases and their complementation groups can be found in the
OMIM database (http://www.ncbi.nlm.nih.gov/omim). Molecular studies of these various
groups have defined a large number of genes involved in human DNA repair. Sorting out
the individual pathways has been greatly aided by the very strong conservation of repair
mechanisms across the whole spectrum of life. Not only the reaction mechanisms but also
the protein structures and gene sequences are often conserved from E. coli to humans.
Generally, eukaryotes have multiple systems corresponding to each single system in
E. coli. For example, nucleotide excision repair requires six proteins in
E. coli but at least 30 in mammals. A downside of the conservation is a confusing gene
nomenclature, referring sometimes to human diseases (e.g. xeroderma pigmentosum type D,
or XPD), sometimes to yeast mutants (RAD genes), and sometimes to mammalian cell
complementation systems (ERCC, for excision repair cross-complementing). So, for example,
XPD, ERCC2, and RAD3 are the same gene in human, mouse, and yeast.
Not all the diseases that involve hypersensitivity to DNA-damaging agents are caused by
defects in the DNA repair systems themselves. Sometimes it is the broader cellular response
to damage that is defective. Normal cells react to DNA damage by stalling progress through
the cell cycle at a checkpoint until the damage has been repaired, or by triggering apoptosis if
the damage is irreparable. Patients with ataxia-telangiectasia and Fanconi anemia have intact
repair systems but are deficient in damage-sensing or response mechanisms. Defects in cell
cycle control and in the apoptotic response are central to the development of cancer, and they
will be discussed further in Chapter 17.
13.3 PATHOGENIC DNA VARIANTS
As mentioned above, the genomes of healthy individuals have huge numbers of sequence
variants. The great majority of these are completely harmless and have no known effect on the
phenotype. Even most of those that do affect the phenotype are part of the normal variation
that makes us all individual. Special interest, however, naturally attaches to those variants that
are pathogenic; that is, they either make us ill or make us susceptible to an illness.
Deciding whether a DNA sequence change is pathogenic can be difficult
Not every sequence variant seen in an affected person will be pathogenic. Just as perfectly
healthy people carry innumerable sequence variants, the same will be true of a person with a
genetic disease. How can we decide whether a sequence change we have discovered in such a
person is the cause of their disease or a harmless variant? Only a functional test can give a
definitive answer, but functional tests are often difficult to integrate into the work of a
diagnostic laboratory.
In any case, for many gene products no laboratory test is available that checks all aspects of
the gene's function in vivo. Some variants may be pathogenic only at times of environmental
stress, and others may have subtle effects that manifest as susceptibility to a disease, perhaps
only when in combination with certain other genetic variants.
In the absence of a definitive functional test, the nature of the sequence change often
provides a clue. First we can ask whether the variant affects a sequence that is known to be
functional. Such sequences would include the coding sequences of genes, sequences flanking
exon-intron junctions (splice sites), the promoter sequence immediately upstream of a gene,
and any other known regulatory sequences. The great majority of all known pathogenic
variants affect sequences that were already known to be functional, and these comprise only a
small percentage of our total DNA. However, it is always possible that a variant located
outside any known functional sequence might lie in a currently unidentified functional
element. As we saw in Chapter 11, the ENCODE project is revealing many previously
unsuspected functional elements in the human genome. Such elements are suspected to be
locations for variants that merely alter susceptibility to a disease, rather than directly causing
any disease.
If a variant does affect a known functional sequence, we must try to predict its effect. A
table of the genetic code (see Figure 1.25) can be used to identify the effect of a coding
sequence variant on the protein product of a gene. As described below, nonsense mutations,
frameshifts, and many deletions can be confidently predicted to wreck the protein. Similarly,
changes to the invariant GT...AG sequences at splice sites are highly likely to be pathogenic.
Changes that merely replace one amino acid with a different one (missense changes) are more
difficult to interpret.
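
The table-lookup step described above can be sketched in code. The following Python fragment is an illustration only: the function name `classify_substitution` and its category labels are hypothetical, and it assumes the standard nuclear genetic code with codons written as coding-strand DNA (T rather than U). It does not attempt to model frameshifts, deletions, or splice-site changes.

```python
from itertools import product

# Standard genetic code, laid out in the conventional T, C, A, G order.
# "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(ref_codon: str, alt_codon: str) -> str:
    """Classify a codon change by its predicted effect on the protein.

    Returns one of "synonymous", "missense", "nonsense", or "stop-loss"
    (hypothetical labels chosen for this sketch).
    """
    ref_codon, alt_codon = ref_codon.upper(), alt_codon.upper()
    for codon in (ref_codon, alt_codon):
        if codon not in CODON_TABLE:
            raise ValueError(f"not a valid DNA codon: {codon!r}")

    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]

    if ref_aa == alt_aa:
        return "synonymous"   # same amino acid (or stop to stop)
    if alt_aa == "*":
        return "nonsense"     # amino acid codon becomes a stop codon
    if ref_aa == "*":
        return "stop-loss"    # stop codon becomes an amino acid codon
    return "missense"         # one amino acid replaced by another
```

For instance, `classify_substitution("GAG", "GTG")` reports a missense change (glutamate to valine), whereas `classify_substitution("TGG", "TGA")` reports a nonsense change. A label of "missense" says nothing about pathogenicity, which is the harder question.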
Another approach is to look for precedents. Maybe a variant is already documented in
dbSNP, the database of single nucleotide polymorphisms (see above). Alternatively, it may be
documented in one of the databases of pathogenic mutations listed in Further Reading. A
different sort of precedent can be sought by checking the normal sequence of related genes.
These may be in humans (paralogs) or other species (orthologs). If the variant is present as the
normal, wildtype sequence of a related gene it is unlikely to be pathogenic.
Further aspects of this problem are considered in Chapter 18, where we discuss genetic
testing. In the rest of this section we consider some of the many ways in which a change in a
functional sequence can be pathogenic.
Single nucleotide and other small-scale changes are a common type of pathogenic change
Pathogenic changes are often caused by small-scale sequence changes in either the coding
sequence or the regulatory region of a gene.
Missense mutations
A single nucleotide substitution within the coding sequence of a gene may or may not alter the
sequence of the encoded protein. The genetic code is degenerate: of the 64 codons, 61 encode only
20 different amino acids and the remaining three are stop codons. Thus, some codon changes do not alter the
amino acid; they are silent or synonymous. When the codon change does result in a changed
amino acid (a nonsynonymous change), the effect depends partly on the chemical differences
between the old and new amino acids. As explained in Chapter 1, the 20 amino acids can be
classified into acidic, basic, uncharged polar, and uncharged nonpolar types. Replacing an
amino acid by one in the same class (a conservative substitution) has less effect on the protein
structure than a nonconservative substitution. Adding or removing cysteine alters the potential
for forming disulfide bridges, and so can cause major structural changes. Similarity matrices
have been constructed that give a quantitative score for the likely disruptive effect of any
substitution (see Further Reading).
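
The class-based comparison can be illustrated with a short Python sketch. The assignment of amino acids to the four classes below is an assumption made for this example (textbooks differ, particularly over cysteine, tyrosine, and histidine), and the function name `describe_substitution` is hypothetical. It is a much cruder measure than the similarity matrices mentioned above.

```python
# Assumed grouping of the 20 amino acids (one-letter codes) into the four
# classes named in the text. Other groupings are in common use.
_CLASSES = {
    "acidic": "DE",
    "basic": "KRH",
    "uncharged polar": "STNQY",
    "uncharged nonpolar": "GAVLIPFMWC",
}
AA_CLASS = {aa: name for name, members in _CLASSES.items() for aa in members}


def describe_substitution(ref: str, alt: str) -> dict:
    """Summarize a missense change using the simple class-based rule.

    ref and alt are one-letter amino acid codes.
    """
    ref, alt = ref.upper(), alt.upper()
    for aa in (ref, alt):
        if aa not in AA_CLASS:
            raise ValueError(f"unknown amino acid code: {aa!r}")
    return {
        "classes": (AA_CLASS[ref], AA_CLASS[alt]),
        # Conservative: old and new residues fall in the same class.
        "conservative": AA_CLASS[ref] == AA_CLASS[alt],
        # Gaining or losing a cysteine changes disulfide-bridge potential.
        "changes_cysteine": (ref == "C") != (alt == "C"),
    }
```

Under this grouping, a glutamate-to-valine change is flagged as nonconservative, whereas aspartate-to-glutamate is conservative. The rule ignores where the residue sits in the folded protein, which matters a great deal in practice.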
Some amino acids are crucial to the functioning of a particular protein, for example those
at the active site of an enzyme. Others may be important for maintaining the protein structure.
Globular proteins tend to have uncharged nonpolar amino acids in the interior and charged ones
on the outside; any substitution that changes this may disrupt the three-dimensional folding.
The sickle cell mutation is pathogenic because it replaces a polar amino acid with a nonpolar one on the
outside of the globin molecule (Figure 13.11). This makes the molecules tend to stick
together. Protein aggregation, the result of abnormal proteins having sticky external areas, has
emerged as a common pathogenic mechanism in a variety of diseases, especially progressive
neurodegenerative conditions, and is discussed further on p. 425.
It is seldom possible to predict these effects with much confidence. It helps if the three-dimensional
structure of the protein has been solved, so that one can model the likely
structural effect of a substitution. If amino acid sequences of related proteins (from humans or
other organisms) are known, we can see which amino acids are invariant and which seem free
to vary widely between species. Most amino acid substitutions probably have no effect on the
functioning of a protein.
Nonsense mutations
Three of the 64 codons in the genetic code are stop codons, and so it is quite common for a
nucleotide substitution to convert the codon for an internal amino acid of a protein into a stop
codon. When ribosomes encounter a stop codon they dissociate from the mRNA, and the
nascent polypeptide is released (Figure 13.12A). However, genes containing premature
termination codons seldom cause production of the truncated protein that might be predicted.
Cells have a mechanism, nonsense-mediated decay (NMD), that detects mRNAs containing
premature termination codons and degrades them. Thus, the usual result of a nonsense
mutation is to prevent any expression of the gene.
NMD works because the spliced mRNA that travels from the nucleus to the ribosomes retains
a memory of the positions of the introns. The splicing mechanism leaves proteins of the exon
junction complex (EJC) attached to splice sites. During the first (pioneer) round of
translation, as the ribosome passes each splice site it clears the EJC proteins attached to that
site. If there is a premature termination codon, the ribosome will not have traversed every
splice site before it detaches. Some EJC proteins will remain attached to the mRNA, and this
marks the mRNA for destruction (Figure 13.12B).
Nonsense-mediated decay is not always fully effective. It does not apply to premature
stop codons that are in the last exon of a gene, or less than about 50 nucleotides
upstream of the last splice junction. In some cases, some quantity of truncated protein
is produced even when the stop codon is not in this protected zone. Truncated proteins
are potentially more pathogenic than a simple absence of the protein (Figure 13.12C)
because they have the potential to interfere with the function of the normal product.
Such dominant-negative effects will be discussed later in this chapter (see p. 431). It is
assumed that NMD has arisen to protect against this problem.
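
The positional rule for NMD lends itself to a simple calculation. The sketch below is illustrative only: the function name `nmd_expected`, the use of 0-based spliced-mRNA coordinates, and the choice to measure distance from the first base of the stop codon are all assumptions of this example. The 50-nucleotide figure is the approximate value given above, and, as the text notes, the rule is not absolute.

```python
def nmd_expected(exon_lengths, ptc_position, threshold=50):
    """Predict whether a premature stop codon should trigger NMD.

    exon_lengths: lengths of the exons in order, so that their running sum
        gives the splice junction positions in the spliced mRNA.
    ptc_position: 0-based mRNA coordinate of the first base of the
        premature termination codon (an assumption of this sketch).
    threshold: approximate minimum distance (nt) upstream of the last
        splice junction needed for NMD to apply.
    """
    if len(exon_lengths) < 2:
        # A single-exon transcript has no splice junctions to mark.
        return False

    # Position of the last exon-exon junction in the spliced mRNA.
    last_junction = sum(exon_lengths[:-1])

    if ptc_position >= last_junction:
        # Stop codon lies in the last exon: protected from NMD.
        return False

    # Protected if less than ~50 nt upstream of the last junction.
    return (last_junction - ptc_position) >= threshold
```

With exons of 200, 150, and 300 nucleotides, the last junction lies at position 350. A premature stop at position 100 is predicted to trigger NMD, whereas one at position 320 (30 nucleotides upstream of the junction) or at position 400 (in the last exon) is predicted to escape it.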
Changes that affect splicing of the primary transcript
The positions of splice sites are marked
by the (almost) invariant canonical GT...AG sequence, embedded within a less tightly defined
consensus splice site recognition sequence (see Chapter 1). Mutations that change the
canonical GT or AG will always prevent recognition of the site by the spliceosome and so will
disrupt splicing at that site (Figure 13.13A), but a variety of other sequence changes may also
affect it. Splicing is not an all-or-nothing process. As mentioned in Chapter
11, splice sites can be strong or weak. Because of variable use of weak splice sites,
most human genes produce a variety of alternatively spliced transcripts. Splicing enhancer or
suppressor sequences modulate the strength of an adjacent splice site by binding proteins of
the SR (serine and arginine-rich) and hnRNP (heterogeneous nuclear ribonucleoprotein) families,
Whereas single nucleotide changes might affect any coding sequence, most short tandem
repeats are located in noncoding DNA. Those that do occur in coding sequences are seldom
polymorphic. However, tandem repeat variants located near promoters or splice sites of genes
can sometimes affect gene expression. For example, different alleles of a 14 bp minisatellite
near the promoter of the insulin gene on 11p15 are associated with differential risk of type 2
diabetes (see OMIM 176730). Another example occurs within the cystic fibrosis
transmembrane conductance regulator (CFTR) gene, where a run of T nucleotides near the 3'
end of intron 8 affects the efficiency of the adjacent splice site. Alleles with five, seven, or
nine T nucleotides are common. Whereas the splicing of 7-T or 9-T alleles is normal, 5-T
alleles are often mis-spliced and exon 9 is skipped. 5-T alleles on their own do not reduce the
output of correctly spliced mRNA so greatly as to be pathogenic, but in conjunction with
other low-functioning variants they can be a cause of cystic fibrosis. Their effect is enhanced
if a (TG)n repeat nearby in the intron has more than 11 repeats (Figure 13.18).
Tandem repeats within coding sequences are not normally polymorphic, but may be liable
to pathogenic mutations because of polymerase stutter. Somatic mutations of this type are a
major cause of disease in people with defects in the post-replicative mismatch repair
system (see Chapter 17). Expanded polyalanine runs in certain proteins are responsible for
several inherited diseases. Examples include the PHOX2B protein in people with
congenital central hypoventilation syndrome (OMIM 209880) and the HOXD13 protein in
patients with synpolydactyly 1 (OMIM 186000). These variants presumably originated
through polymerase stutter, but within a family they are stably transmitted, just like any
other STRP allele. In at least some cases, the expanded alanine run interferes with correct
localization of the protein within the cell.
Dynamic mutations: a special class of pathogenic microsatellite variants
So-called dynamic
mutations are STRPs that, above a certain size, become intensely unstable. The molecular
causes are not well understood, but they may be a consequence of the way in which, when
DNA is replicated, one strand (the lagging strand) is synthesized as a series of discontinuous
fragments, the Okazaki fragments (see Chapter 1). A special endonuclease, FEN1, cuts off the
overhangs in overlapping Okazaki fragments. One proposed mechanism for repeat expansion
is that FEN1 fails to make the cuts, and overlapping fragments end up being joined end-to-end.
Repeats up to a certain size are stable, and it may be significant that, in most cases, the
threshold of instability occurs when the repeat sequence reaches the typical size of an Okazaki
fragment. Not all dynamic mutations are pathogenic, but several are (Table 13.1). Others are
responsible for the nonpathogenic fragile sites seen by cytogeneticists when cells of some
people are subjected to replicative stress (see Figure 13.5). For example, the FRA16A fragile
site on chromosome 16 is due to an expanded (CCG)n repeat, whereas FRA16B on the same
chromosome is caused by an expanded 33 bp minisatellite. The diseases in Table 13.1 are
heterogeneous in many respects. There are different-sized repeat units, different degrees of
expansion, different locations with respect to the affected gene, and different pathogenic
mechanisms. Within these, the polyglutamine diseases form a well-defined group, of which
Huntington disease (HD; OMIM 143100) is the prototype. In these conditions, modest
expansions of a (CAG)n repeat in the coding sequence of a gene lead to an expanded
polyglutamine run in the encoded protein (Figure 13.19C). This, in turn, predisposes the
protein to form intracellular aggregates that are toxic to cells, especially neurons (Box 13.4).
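
Measuring the size of a tandem repeat in a sequence is a straightforward computation. The Python sketch below is illustrative: the function name `longest_repeat_run` is hypothetical, it counts only perfect, uninterrupted copies of the repeat unit, and it deliberately encodes no disease thresholds, since those differ between loci.

```python
import re


def longest_repeat_run(sequence: str, unit: str = "CAG") -> int:
    """Return the largest number of consecutive perfect copies of `unit`.

    Interrupted repeats are treated as separate runs, which is a
    simplification made for this sketch.
    """
    if not unit:
        raise ValueError("repeat unit must not be empty")
    # One or more back-to-back copies of the unit, e.g. (?:CAG)+
    pattern = re.compile(f"(?:{re.escape(unit.upper())})+")
    runs = (
        len(match.group(0)) // len(unit)
        for match in pattern.finditer(sequence.upper())
    )
    return max(runs, default=0)
```

For example, a sequence containing five consecutive CAG units followed, after an interruption, by two more gives a result of 5. The same helper could be pointed at other repeat units by changing the `unit` argument.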
scale variations in copy number between apparently normal people reinforces the message
that not all variants in copy number are harmful.
Chromosomal trisomies probably owe their characteristic phenotypes to just a few dosage-sensitive
genes. For example, the characteristic features of Down syndrome are thought to be
due largely to dosage effects of just two genes, DSCR1 and DYRK1A. It is to be expected that
more genes would produce phenotypic effects at half dosage than at 1.5-fold increased
dosage. Thus, large deletions or monosomy of a whole chromosome are less well tolerated
than duplications or trisomy in human development.
One common mechanism generating changes in gene dosage is non-allelic homologous
recombination (NAHR). Segmental duplications (often defined as sequences 1 kb or longer
with 95% or greater sequence identity) may misalign when homologous chromosomes pair in
meiosis. NAHR then produces deletions or duplications. The misaligned repeats have the
same sequence but not the same chromosomal location, so recombination is homologous but
the sequences are not alleles. Many (but by no means all) of the common nonpathogenic
variants in copy number seen in normal healthy people are generated by this mechanism.
α-Thalassemia provides a good example of NAHR producing a pathogenic variation in gene
dosage. Most people have four copies of the α-globin gene (αα/αα) as a result of an ancient
tandem duplication. As shown in Figure 13.20, NAHR between low-copy repeat sequences
flanking the α-globin genes can produce chromosomes carrying more or fewer α-globin genes.
Reduced copy numbers of α-globin genes produce successively more severe effects. People
with three copies (αα/α-) are healthy; those with two (whether the phase is α-/α- or αα/--) suffer
mild α-thalassemia; those with only one gene (α-/--) have severe disease; and lack of all α
genes (--/--) causes lethal hydrops fetalis (fluid accumulation in the fetus).
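
The dosage series just described amounts to a lookup on the total number of functional genes, regardless of how they are distributed between the two chromosomes. The Python sketch below restates it; the function name `alpha_globin_phenotype` and the wording of the labels are choices made for this illustration, and totals above four are reported as not covered because the passage does not describe them.

```python
# Outcome by total number of functional alpha-globin genes, as given in
# the passage above. Labels are paraphrased for this sketch.
PHENOTYPE_BY_COPIES = {
    4: "usual four-copy state",
    3: "healthy",
    2: "mild alpha-thalassemia",
    1: "severe disease",
    0: "lethal hydrops fetalis",
}


def alpha_globin_phenotype(copies_chrom1: int, copies_chrom2: int) -> str:
    """Map the gene count on each chromosome 16 homolog to an outcome.

    Only the total matters here: one gene on each homolog and two genes
    on one homolog give the same answer.
    """
    if copies_chrom1 < 0 or copies_chrom2 < 0:
        raise ValueError("copy numbers cannot be negative")
    total = copies_chrom1 + copies_chrom2
    return PHENOTYPE_BY_COPIES.get(total, "not covered by this passage")
```

Calling `alpha_globin_phenotype(1, 1)` and `alpha_globin_phenotype(2, 0)` returns the same result, reflecting the statement that the phase of the two remaining genes does not alter the phenotype.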
X-chromosome monosomy and trisomy are particularly interesting because X-inactivation
(inactivation of all except one of the X chromosomes in a cell; see Chapter 3) ought to render
them asymptomatic in somatic tissues. However, as noted in Chapter 11, a surprisingly large
number of genes on the X chromosome escape inactivation. Some of these have a counterpart
on the Y chromosome, but most do not. For those genes that escape X-inactivation but lack a
Y-linked counterpart, normal females would have two functional copies and males only one.
Turner (45,X) females would have the same single active copy as normal males, but perhaps in
the context of female development, a single copy is not sufficient. The skeletal abnormalities
of Turner syndrome are caused by haploinsufficiency for SHOX (50% of the normal gene
product is not sufficient to produce a normal phenotype). This is a homeobox gene that is
located in the Xp/Yp pseudoautosomal region, and so is present in two copies in both males
and normal females.
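
The copy-number reasoning in this paragraph can be written out as a small calculation. The following Python sketch is a deliberately simplified model: the function name `active_copies` and the three gene categories are hypothetical labels for this example, and the model assumes that every X chromosome beyond the first is fully inactivated for genes that do not escape.

```python
def active_copies(n_x: int, n_y: int, gene_type: str) -> int:
    """Count expressed copies of a sex-chromosome gene in a somatic cell.

    gene_type (labels assumed for this sketch):
      "inactivated"     - X-linked gene subject to X-inactivation
      "escapes_no_y"    - escapes X-inactivation, no Y-linked counterpart
      "pseudoautosomal" - escapes X-inactivation and has a copy on the Y
                          (for example SHOX)
    """
    if n_x < 0 or n_y < 0:
        raise ValueError("chromosome counts cannot be negative")
    if gene_type == "inactivated":
        # All X chromosomes except one are silenced.
        return 1 if n_x >= 1 else 0
    if gene_type == "escapes_no_y":
        # Every X chromosome contributes an active copy.
        return n_x
    if gene_type == "pseudoautosomal":
        # Every X and every Y contributes an active copy.
        return n_x + n_y
    raise ValueError(f"unknown gene type: {gene_type!r}")
```

For a pseudoautosomal gene, the model gives two copies for both 46,XX (`active_copies(2, 0, ...)`) and 46,XY (`active_copies(1, 1, ...)`), but only one for 45,X, which matches the haploinsufficiency argument made for SHOX.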
Below the level of conventional cytogenetic resolution but above the single gene level,
pathogenic variations in copy number are classified as microdeletions or
microduplications (Table 13.2). Among these, three different molecular pathologies can
be distinguished:
Single gene syndromes, in which all the phenotypic effects are due to the deletion (or
sometimes duplication) of a single gene. For example, Alagille syndrome (OMIM
118450) is seen in patients with a microdeletion at 20p11. However, 93% of Alagille
patients have no deletion but instead are heterozygous for point mutations in the JAG1
gene located at 20p12. The cause of the syndrome in all cases is a half dosage of the JAG1
gene product.
Contiguous gene syndromes are seen primarily in males with X-chromosome deletions
(Figure 13.21A). The classic case was a boy BB who had Duchenne muscular dystrophy
(DMD; OMIM 310200), chronic granulomatous disease (CGD; OMIM 306400), and retinitis
pigmentosa (OMIM 312600), together with mental retardation. He had a
chromosomal deletion in Xp21 that removed a contiguous set of genes and incidentally
provided investigators with the means to clone the genes whose absence caused two of his
diseases, DMD and CGD. Deletions of the tip of Xp are seen in another set of contiguous gene
syndromes. Successively larger deletions remove more genes and add more
diseases to the syndrome. Microdeletions are relatively frequent in some parts of the
X chromosome (such as Xp21 and proximal Xq) but are rare or unknown in others (such
as Xp22.1-22.2 and Xq28). No doubt the deletion of certain individual genes, and visible
deletions in gene-rich regions, would be lethal. Similar contiguous gene syndromes are
much less common with autosomes because of the presence of the balancing normal
chromosome (Figure 13.21B). Langer-Giedion syndrome (trichorhinophalangeal
syndrome, type II; OMIM 150230) is a rare example.
Segmental aneuploidy syndromes are a special type of contiguous gene syndrome that
regularly recur with a well-recognized phenotype. Examples include Williams-Beuren
(OMIM 194050), Prader-Willi (OMIM 176270), Angelman (OMIM 105830), Smith-Magenis
(OMIM 182290), and DiGeorge/velocardiofacial
(OMIM 188400/192430) syndromes (see Table 13.2). These syndromes
all have deletions produced by NAHR between low-copy repeats that flank the region in
question. NAHR will also produce duplications of these regions, although these may not
be pathogenic. The example of Prader-Willi and Angelman syndromes (see Figure 11.20, p.
367) happens to involve an imprinted region, which complicates the phenotype, but the
same mechanism produces the other syndromes mentioned above. As with other
contiguous gene syndromes, the phenotype usually depends on dosage effects of more
than one gene and is not seen in people with a point mutation in just one of the genes.
Williams-Beuren syndrome is typical. Patients are heterozygous for a 1.5 Mb deletion on
chromosome 7q11.23 that removes about 20 genes. Cases have been described who have
smaller deletions, but no typical case has been found with just one gene deleted or
mutated.
Some other recognizable recurrent syndromes are produced by independent random terminal
deletions of chromosomes in which a dosage-sensitive gene lies close to the telomere.
Examples are the Wolf-Hirschhorn (OMIM 194190) and cri-du-chat (OMIM 123450)
syndromes. In Miller-Dieker lissencephaly syndrome (OMIM 247200), random terminal
deletions of 17p can remove one or more dosage-sensitive genes, producing a contiguous
deletion syndrome.