La Chapelle 2010


Genetica (2011) 139:199–207

DOI 10.1007/s10709-010-9537-x

Inactivation dates of the human and guinea pig vitamin C genes


Marc Y. Lachapelle • Guy Drouin

Received: 23 July 2010 / Accepted: 26 November 2010 / Published online: 8 December 2010
© Springer Science+Business Media B.V. 2010

Abstract The capacity to biosynthesize ascorbic acid has been lost in a number of species including primates, guinea pigs, teleost fishes, bats, and birds. This inability results from mutations in the GLO gene coding for L-gulono-γ-lactone oxidase, the enzyme responsible for catalyzing the last step in the vitamin C biosynthetic pathway. We analyzed available primate and rodent GLO gene sequences to determine their evolutionary history. We used a method based on sequence comparisons of lineages with and without functional GLO genes to calculate inactivation dates of 61 and 14 MYA for the primate and guinea pig genes, respectively. These estimates are consistent with previous phylogeny-based estimates. An analysis of transposable element distribution in the primate and rodent GLO sequences did not reveal conclusive evidence that illegitimate recombination between repeats has contributed to the loss of exons in the primate and guinea pig genes.

Keywords Vitamin C · L-gulono-γ-lactone oxidase · GLO gene · Unitary pseudogene · Human · Guinea pig

Electronic supplementary material The online version of this article (doi:10.1007/s10709-010-9537-x) contains supplementary material, which is available to authorized users.

M. Y. Lachapelle · G. Drouin (✉)
Département de biologie et Centre de recherche avancée en génomique environnementale, Université d’Ottawa, 30 Marie Curie, Ottawa, ON K1N 6N5, Canada
e-mail: gdrouin@science.uottawa.ca

Introduction

Ascorbic acid (vitamin C) plays an important role in a diversity of physiological reactions, including collagen synthesis, iron metabolism, and immunity (Padh 1990). Given its importance, it is surprising that many species, such as primates (Burns 1957; Pauling 1970), guinea pigs (Zilva 1936), teleost fish (Dabrowski 1990), bats (Birney et al. 1976), and passeriform birds (Chaudhuri and Chatterjee 1969), have lost the capacity to synthesize it. In these species, lack of adequate vitamin C supplementation in their diet impairs many metabolic pathways and eventually leads to the development of scurvy. In humans and guinea pigs, the inability to synthesize vitamin C is due to a deficiency in L-gulono-γ-lactone oxidase (GLO), the liver enzyme responsible for catalyzing the last step of vitamin C biosynthesis (Burns 1957). The molecular basis for this deficiency was discovered in the early 1990s when it was shown that these species possess highly mutated versions of the GLO gene (Nishikimi et al. 1992, 1994). Having undergone inactivating mutations sometime in their history, the human and guinea pig single copy GLO sequences are therefore nonfunctional unitary pseudogenes (Graur and Li 2000). The genetic basis for a lack of vitamin C in other species is not yet known, but the observed cases of GLO inactivity in primates and guinea pigs offer the possibility that this occurrence may be widespread.

Here we use data currently available from the various genome sequencing projects to trace the molecular evolutionary history of the GLO gene in species unable to biosynthesize vitamin C. To this end, we retrieved all available GLO gene sequences and analyzed their substitution record and transposable element distribution. Our goals were to estimate dates of inactivation and to study whether repetitive elements have had a role in inactivating the GLO gene of these two species. Repetitive elements have been shown to have a significant effect in mediating chromosomal rearrangements (Szabó et al. 1999; Kolomietz et al. 2002; Coghlan et al. 2005; Nakayama and Ishida
2006; Uddin et al. 2006). Whether by illegitimate recombination or through each element’s integration mechanism, repeats are a potent source of genomic modification. Since it was observed that the human and guinea pig GLO genes lack some exons (Nishikimi et al. 1992, 1994), it could be that illegitimate recombinations between repetitive elements were responsible for the inactivation of the human and guinea pig GLO genes.

Materials and methods

Searches, sequences, and alignments

BLAST searches of the NCBI databases were done with several different orthologues of the gene encoding L-gulono-γ-lactone oxidase. The sequences obtained and accession numbers are provided in Table 1. A manual alignment of the protein coding regions was done using SequEdit version 0.912 (Drouin et al. 1999).

Inactivation date

Dating the time of inactivation of the GLO gene assumes a molecular clock and is based on the observed pattern of nucleotide substitution. Due to selective constraints, synonymous nucleotide substitutions (S) usually accumulate at a quicker rate than nonsynonymous (N) substitutions. This is to be expected since nonsynonymous substitutions produce changes in the translated amino acid sequence, which in most cases are deleterious to the organism. The number of synonymous substitutions per synonymous site (Ks) is thus expected to be far greater than the number of nonsynonymous substitutions per nonsynonymous site (Ka), $K_s \gg K_a$. A pseudogene, however, differs most prominently from its functional counterpart at the level of its nonsynonymous substitution rate. Whereas nonsynonymous substitutions are negatively selected against within a functional sequence, they are free to accumulate at a rate equal to the neutral (synonymous) mutation rate (k) within a pseudogene (Graur and Li 2000). Thus, for a pseudogene, we expect both kinds of changes to be just as frequent, $K_s = K_a$. By comparing the nucleotide substitutions between a pseudogene and its functional orthologues, it is possible to approximate the time when these rates became even (Echols et al. 2002).

To measure GLO gene inactivation dates, we used the method described by Chou et al. (2002) to measure the inactivation date of the human CMP-N-acetylneuraminic acid hydroxylase gene. For a pseudogene, the total number of nonsynonymous substitutions per site is equal to $k t_1 + f_N k (t - t_1)$. The first term is the number of nonsynonymous substitutions per nonsynonymous site since inactivation (t1) and the second is the number while the gene was still functional (t − t1). Here t represents the time when all species studied last shared a common ancestor. The neutral mutation rate k is estimated as the synonymous substitution rate, if we assume there are no functional constraints against them. The fN value is the fraction of neutral substitutions and is estimated as the ratio Ka/Ks for the functional gene. Since $K_s = K_a$ in a pseudogene, fN becomes 1. This relationship is inversely proportional to the gene’s level of functional constraint and serves as an approximation of the selective pressure against nonsynonymous substitutions.

The substitution record of the inactive GLO gene and its orthologues is inferred from the observed nucleotide changes by the maximum parsimony method. Individual nucleotide substitutions are characterized as either synonymous (S) or nonsynonymous (N) and preference is given to possibilities requiring the least number of evolutionary changes. When the order of substitution cannot be determined within a codon, the most likely change is calculated from weighted averages, which assumes that all paths leading to the observed mutation are equally likely.
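
The weighted-average step can be illustrated with a minimal Python sketch. It assumes the standard genetic code and treats every ordering of the differing codon positions as equally likely, as stated above. The function name `weighted_counts` and the example codons are illustrative and are not taken from the study; treating a stop codon as just another translation product is a simplifying assumption of this sketch.

```python
from itertools import permutations

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def weighted_counts(ancestral, derived):
    """Return (Sa, Na): synonymous and nonsynonymous changes between two
    codons, averaged over all orderings of the differing positions."""
    diff = [i for i in range(3) if ancestral[i] != derived[i]]
    if not diff:
        return 0.0, 0.0
    paths = list(permutations(diff))
    syn = nonsyn = 0
    for order in paths:
        codon = ancestral
        for pos in order:
            # Apply one substitution and classify it by its effect
            # on the encoded amino acid.
            nxt = codon[:pos] + derived[pos] + codon[pos + 1:]
            if CODON_TABLE[codon] == CODON_TABLE[nxt]:
                syn += 1
            else:
                nonsyn += 1
            codon = nxt
    return syn / len(paths), nonsyn / len(paths)


# Two differing positions, two possible orderings:
#   AAT -> ACT -> ACA : one nonsynonymous, one synonymous step
#   AAT -> AAA -> ACA : two nonsynonymous steps
print(weighted_counts("AAT", "ACA"))  # (0.5, 1.5)
```

In this hypothetical example the two paths disagree, so the codon contributes 0.5 synonymous and 1.5 nonsynonymous changes to the lineage totals.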

Table 1 Accession numbers for sequences used in this study

| Species (common name) | Accession number(s) | References |
|---|---|---|
| Mus musculus (mouse) | NM_178747 | Li et al. 2006 |
| Rattus norvegicus (rat) | NM_022220 | Nishikimi et al. 1992 |
| Spermophilus tridecemlineatus (squirrel) | AAQQ01615738, AAQQ01184860, AAQQ01615740 | – |
| Cavia porcellus (guinea pig) | D12762 | Nishikimi et al. 1992 |
| Otolemur garnettii (galago) | AAQR01573789, AAQR01573776, AAQR01573775, AAQR01177479, AAQR01573774 | – |
| Homo sapiens (human) | AADC01074580, AADC01074579 | – |
| Pan troglodytes (chimpanzee) | AACZ02093370, AACZ02093372 | – |
| Macaca mulatta (macaque) | AANU01121048, AANU01121047 | – |

Synonymous substitution averages are represented by Sa and nonsynonymous ones by Na. This analysis thus provides an inactivation date based entirely on the pattern of nucleotide substitutions.
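
The dating relationship described in this section can be rearranged to isolate the inactivation time. The derivation below writes $K_a$ for the nonsynonymous substitutions per site observed in the pseudogene lineage and uses the symbols of the text; the intermediate step is added here for clarity and is not spelled out in the original.

```latex
\begin{align}
K_a &= k\,t_1 + f_N\,k\,(t - t_1) \\
\frac{K_a}{k} &= t_1\,(1 - f_N) + f_N\,t \\
t_1 &= \frac{K_a/k - f_N\,t}{1 - f_N}
\end{align}
```

As a sanity check, when the functional gene tolerates no nonsynonymous change ($f_N = 0$) the expression reduces to $t_1 = K_a/k$, that is, every observed nonsynonymous substitution accumulated after inactivation.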

Repetitive element searches, alignments, and exon loss

The online program PlotRep (Toth et al. 2006) was used to search for repetitive elements along the primate sequences. The RepBase Update library of human and non-human primate repetitive elements was used for these searches. Areas with complete coverage from all four species were aligned using ClustalW (Thompson et al. 1994) and analyzed for evidence of illegitimate recombination.

Results

Sequences

BLAST searches of the NCBI databases with various orthologues of the L-gulono-γ-lactone oxidase gene returned a variety of hits. These included numerous cartilaginous fish sequences as well as at least one representative from most major mammalian clades (results not shown). Of the species unable to biosynthesize vitamin C, only the primate and guinea pig sequences could be retrieved. More specific searches aimed directly at teleost fish and bat genomic databases were fruitless. A schematic representation of the sequences used in this study is provided in Fig. 1. Only the mouse, rat, human, chimp, and macaque sequences were complete. Those of the squirrel and galago were obtained from unfinished genome sequencing projects. The coding regions of the guinea pig GLO gene were available from a previous study (Nishikimi et al. 1992) and these were used to infer the substitution record. Except for a few deletions, including all of exon 5 and the 3′-end of exon 6, this pseudogene sequence is complete. A full alignment of the coding regions for all species is provided in Supplementary Fig. 1.

A frameshift deletion was also observed at the 982nd bp of the galago sequence. This deletion produces a premature stop codon which would eliminate 53 amino acids from the translated protein. However, previous work has demonstrated that primates of the Lorisidae order, to which the small-eared galago belongs, are able to biosynthesize vitamin C (Pollock and Mullin 1987). Since the sequencing project is still ongoing and we cannot be certain about the actual state of the sequence, we simply inserted a gap to complete the alignment and eliminated it from the analysis. The squirrel sequence was assumed to be functional based on the available body of literature (Drew et al. 1999).

Fig. 1 Schematic representations (not to scale) of the sequences used in this study. Black boxes represent exons that are still present while unfilled boxes with an X represent deleted exons or exon parts. Dashed lines represent missing information

Inactivation date of the human GLO gene

A phylogenetic tree of the substitution record was constructed according to the maximum parsimony method (Fig. 2). A total of 556 sites were analyzed, 435 of which were nonsynonymous (lN) and 121 of which were synonymous (lS). Multiple substitutions within the same codon were frequently observed, and when the order of these could not be determined, weighted averages (half Na, half Sa) were used to calculate the most probable change. Some sites were ambiguous to classify (indicated by the symbol "?") due to uncertainties in the ancestral sequence. At these sites, the observed mutation could have resulted from two equally parsimonious changes. Nucleotides from functional sequences were preferentially considered as ancestral and, when absolutely no informative support was available, substitutions were evenly divided among the branches. The inactivation date was therefore calculated as a range, with ambiguous sites either included or excluded from the analysis.

[Figure 2: maximum parsimony tree of the primate GLO sequences, listing the substitutions inferred along each branch by nucleotide position, change, and type. Lineage totals: Human S = 16.5, N = 45.5; Chimp S = 17.5, N = 44.5; Macaque S = 17.5, N = 52.5; Galago S = 9, N = 13]

Fig. 2 Substitutions inferred in different primate lineages according to the maximum parsimony method. A total of 556 sites common to the human, chimpanzee, macaque, and galago were analyzed and the corresponding rodent sequences were used as outgroups. The observed substitution for each nucleotide is indicated along with an S for a synonymous change or an N for a nonsynonymous change. Asterisks indicate the presence of multiple substitutions within a codon for which the order of substitution could not be determined unambiguously and weighted averages were used to calculate the most likely mutation type (Na, Sa). Question marks indicate sites for which the ancestral sequence was not readily deducible and these were evenly divided among the branches. The totals for each substitution type are indicated for each species
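
The lineage totals in Fig. 2 are the inputs to the rate and date estimates. The following is a minimal Python sketch of that arithmetic, assuming the relation t1 = (Ka/k − fN·t)/(1 − fN) from the Chou et al. (2002) method and the 77.5 MYA divergence date used in this study. Function and variable names are illustrative only, and the guinea pig line simply plugs in the k, fN, and Ka values reported for that lineage rather than recomputing them.

```python
def neutral_rate(syn_counts, syn_sites, t_years):
    """Neutral rate k: mean synonymous substitutions per synonymous
    site across lineages, divided by the divergence time in years."""
    mean_syn = sum(syn_counts) / len(syn_counts)
    return (mean_syn / syn_sites) / t_years


def inactivation_time(ka, k, f_n, t_years):
    """Solve Ka = k*t1 + fN*k*(t - t1) for the inactivation time t1."""
    return (ka / k - f_n * t_years) / (1.0 - f_n)


# Primate lineage totals (Fig. 2) and site counts from the text.
S = {"human": 16.5, "chimp": 17.5, "macaque": 17.5, "galago": 9.0}
N = {"human": 45.5, "chimp": 44.5, "macaque": 52.5, "galago": 13.0}
SYN_SITES, NONSYN_SITES = 121, 435
T_PRIMATE = 77.5e6  # years since the galago split from other primates

k = neutral_rate(list(S.values()), SYN_SITES, T_PRIMATE)

# fN from the functional (galago) lineage: Ka/Ks.
f_n = (N["galago"] / NONSYN_SITES) / (S["galago"] / SYN_SITES)

# Ka averaged over the three lineages carrying the pseudogene.
pseudo = ("human", "chimp", "macaque")
ka = (sum(N[sp] for sp in pseudo) / len(pseudo)) / NONSYN_SITES

t1 = inactivation_time(ka, k, f_n, T_PRIMATE)
print(f"k = {k:.2e}, fN = {f_n:.2f}, Ka = {ka:.3f}, t1 = {t1 / 1e6:.0f} MYA")

# Guinea pig, using the reported estimates with all sites included.
t1_gp = inactivation_time(ka=0.0626, k=3.55e-9, f_n=0.066, t_years=72e6)
print(f"guinea pig t1 = {t1_gp / 1e6:.1f} MYA")
```

With these inputs the sketch gives about 61 MYA for the primate gene and about 13.8 MYA for the guinea pig gene, in line with the values reported in the text.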

The neutral mutation rate k was estimated from the average number of synonymous substitutions per site counted along each lineage and using a divergence date t of 77.5 MYA (and a range of 67.1–97.7 MYA based on young and old calibration estimates) between the galago and other primates (Steiper and Young 2006). This gives a value of $k = (((16.5 + 17.5 + 17.5 + 9)/4)/121)/(77.5 \times 10^{6}) = 1.61 \times 10^{-9}$ per site per year. This value is similar to the neutral nucleotide substitution rate of $1.5 \times 10^{-9}$ per site per year calculated by Cooper (1999; cited in Winter et al. 2001) and by Chou et al. (2002). Calculating the fN value from the observed mutations along the galago lineage gives $f_N = K_a/K_s = (13/435)/(9/121) = 0.40$. Combining these values with the average number of nonsynonymous substitutions per site incurred by the non-functional human, chimpanzee, and macaque genes ($K_a = ((45.5 + 44.5 + 52.5)/3)/435 = 0.109$) allows us to calculate an inactivation date of $t_1 = [(K_a/k) - f_N t]/(1 - f_N) = 61$ MYA, and a range of 48–68 MYA, when all sites are included. The calculations with the six questionable sites excluded from the analysis give $k = 1.43 \times 10^{-9}$, $f_N = 0.64$, an inactivation date of 74 MYA and a range of 38–92 MYA.

Inactivation date of the guinea pig GLO gene

The phylogenetic tree of the inferred guinea pig substitution record is provided in Fig. 3. Calculations were carried out in the same manner as for the primate lineage with the exception that the fN value was obtained from an average of the functional mouse and rat genes. A total of 730 sites were analyzed, 575 of which were nonsynonymous and 155 were synonymous. We used a divergence date of 72 MYA, with a confidence interval of 63.5–80 MYA, between the guinea pig and the common ancestor of the mouse and rat as t (Huchon et al. 2007). The estimates with all sites included give $k = 3.55 \times 10^{-9}$, $f_N = 0.066$ and $K_a = 0.0626$ for an inactivation date of 13.8 MYA, with a confidence interval of 13.2–14.4 MYA. Excluding the five questionable sites from the analysis gives $k = 3.41 \times 10^{-9}$ and $f_N = 0.069$ for an inactivation date of 14.4 MYA, with a confidence interval of 13.8–15.0 MYA.

[Figure 3: maximum parsimony tree of the rodent GLO sequences, listing the substitutions inferred along each branch by nucleotide position, change, and type. Lineage totals: Mouse S = 43, N = 10; Rat S = 48.5, N = 12.5; Guinea Pig S = 24, N = 36]

Fig. 3 Substitutions inferred in different rodent lineages according to the maximum parsimony method. A total of 730 sites common to the mouse, rat, and guinea pig were analyzed and the corresponding squirrel and primate sequences were used as outgroups. Symbols used are the same as for Fig. 2

Exon loss and repetitive elements in the human and guinea pig GLO genes

Analyses of the repetitive element distribution in the GLO gene sequences were carried out to see if illegitimate recombination between repeated sequences could have contributed to exon loss. A variety of elements were found in primate sequences, including the Alu and MIRb SINEs as well as several types of LINEs (e.g., MLT1B; results not shown). These were present at a frequency of about 1 per 1,000 bp for all lineages. Although much of the galago sequence is still unavailable in GenBank (due to an unfinished sequencing project), two regions of its GLO gene have been sequenced and are suitable for comparison. These two regions span exons 1 to 5 and exons 8 to 12. All sequences for these areas were aligned and annotated for various repetitive elements using PlotRep (Toth et al. 2006).

Figure 4a is a representation of the region upstream of exon 4 of the galago and human sequences. The chromosomal break that resulted in the deletion of exons 2 and 3 in the last common ancestor of humans, chimps, and macaques can be traced to a region about 90 bp upstream of a mammalian interspersed repetitive element b (MIRb) where only the 3′ end of the element is present, starting at the 123rd bp of a full length MIRb. Since one would expect that the full size element would straddle the observed breakpoint, this suggests that MIRb elements were not involved in the recombination event that led to the loss of exons 2 and 3. The area between exons 8 and 12 was highly fragmented and several lineage-specific insertions made it difficult to propose any mechanism that may have led to the loss of exons 8 and 11 (results not shown). Our analyses of the repetitive elements found in the GLO genes of primates therefore did not provide any conclusive evidence for their involvement in exon loss for this gene.

Analyses of the repetitive elements found in the mouse, rat and guinea pig GLO genes revealed that they contain 12, 11 and 1 repetitive elements, respectively (results not shown). The single repetitive element found in the guinea pig GLO gene is a PB1D10 repetitive element and is found in its fourth intron. Since the single PB1D10 repetitive element found in the rat GLO gene is in its second intron and the two PB1D10 repetitive elements found in the mouse GLO gene are in its second and third introns (results not shown), this element could not have been involved in the loss of the fifth exon and part of the sixth exon of the guinea pig GLO gene.

Discussion

With the ever increasing number of genomic sequencing projects being undertaken, a total of eight sequences could be retrieved for this study. At the time of writing, fifteen other primate sequencing projects had either been announced or initiated (GOLD database v.2.0, http://www.genomesonline.org/). Furthermore, of the other species unable to synthesize vitamin C, several teleost fish and three bat projects are underway. As more data become available, we will gain a clearer picture of the molecular evolution of the L-gulono-γ-lactone oxidase gene.

The inability of teleost fish, such as the rainbow trout, the Japanese medaka, and the common carp, to biosynthesize vitamin C has been well documented (Dabrowski 1990, 1994; Toyohara et al. 1996; Krasnov et al. 1998; Moreau and Dabrowski 2000; Hwang and Lin 2002). Since several ancestral actinopterygian fish species have been observed to retain this ability and all teleost fish tested so far cannot synthesize vitamin C, the inactivating mutation must have occurred shortly after the divergence of these two lineages, some 200 to 210 MYA (Dabrowski 1994; Moreau and Dabrowski 2000). Given this long
timescale, it is quite understandable that no GLO orthologues could be retrieved from the available teleost genomes. It is likely that the non-functional gene has acquired so many mutations that it is now unrecognizable or that it has been excised from the genome altogether. Furthermore, gene transfer experiments attempting to restore vitamin C biosynthesis in rainbow trout by adding the rat GLO gene proved ineffective (Krasnov et al. 1998). While cDNA transcripts were indeed observed in embryos, no GLO protein or activity were observed in adults. This suggests that the metabolic dysfunctions for vitamin C synthesis in teleost fish may not solely depend on GLO activity.

[Figure 4: panel a shows the galago region (exons 2, 3, 4; repeats MLT1B, MIRb, MIRb, MIRb, Alu) above the human region (exon 4; repeats MLT1B, MIRb, MIRb), scale bar 1000 bp; panel b shows the sequence alignment]

Fig. 4 a Repetitive element distribution in the area upstream of exon 4 of the galago and human sequences. Diagonal lines indicate the chromosomal breakpoint area that led to the loss of exons 2 and 3 in the last common ancestor of human, chimps, and macaques. b Sequence alignment highlighting the deleted region. Homology with the MIRb element begins only at the 123rd bp of this element. Hu human, Ch chimpanzee, Ma macaque, Ga galago

Early estimates for the inactivation of the L-gulono-γ-lactone oxidase gene in primates placed the date between
35 and 55 MYA (Nishikimi et al. 1994; Challem and Taylor 1998). These were based primarily on the postulated date of divergence between Anthropoidea (Simiiformes; new and old world monkeys, tarsiers) and prosimians (Strepsirrhini; lemurs, galagos and lorises). Since the latter retain the ability to endogenously synthesize vitamin C and the former do not, it was thought that the inactivation must have occurred sometime since their last common ancestor. However, recent findings place this divergence a little further up the evolutionary tree at about 77.5 MYA (Murphy et al. 2004; Steiper and Young 2006). In light of these new estimates, our calculated dates of between 61 and 74 MYA are consistent with this gene having been inactivated soon after the divergence of Anthropoidea from prosimians. While there is still some debate as to the phylogenetic relationship of the Tarsiiformes lineage to other primates, the fact that tarsiers also lack the ability to synthesize vitamin C suggests that they are more closely related to anthropoids than to prosimians (Pollock and Mullin 1987). The recent study of Zhang et al. (2010), who also used the Chou et al. (2002) method to calculate inactivation dates, found the inactivation date of the primate GLO gene to be between 30.5 and 42.9 MYA, rather than between 61 and 74 MYA as we did. This is probably due to the fact (as they point out themselves) that they used a smaller number of sequences to calculate this date.

For the guinea pig, the calculated range of GLO inactivation lies between 13 and 15 MYA, a date consistent with the work of Nishikimi et al. (1992) who proposed a date of less than 20 MYA. Limitations for both sets of calculations reside predominantly in the amount of available data. In our case, only four primate and four rodent sequences were used for the analysis. This limits the reliability of our estimated neutral mutation rates and fN values, even if these seemed to be within acceptable ranges. A larger number of sequences from related species would have provided us with a more accurate picture. Furthermore, only between 42 and 55% of the total length of the GLO gene protein coding regions are currently available (Fig. 1). The completion of the numerous genome sequencing projects should also allow further studies to use complete sequences.

Another difficulty caused by this lack of sequences concerns the sites which were difficult to characterize.
Since an ancestral sequence could not be determined often as it does. These species must consume a sufficient
unambiguously, the true direction of nucleotide change amount only to survive, so how could there be any beneficial
could not be concluded. In some instances, one might aspects to this loss? It has been hypothesized that since
choose to simply eliminate these sites from the analysis. vitamin C acts primarily as an antioxidant, a GLO gene
However, because we were estimating substitution rates, inactivation should theoretically bring about elevated levels
these could not simply be disregarded since a mutation had of free radicals. Due to the mutagenic nature of these, the
evidently occurred. The bottom end of the inactivation likelihood of genetic mutations would increase and essen-
ranges were calculated with the ambiguous sites included. tially propel the evolution of these species (Challem 1997).
Although ambiguous, the rates obtained when these are Another hypothesis suggests that formation of ascorbic acid
considered do involve all observable substitutions. may have negative impacts on the liver. Synthesis would
Our results show that the inactivation dating method we involve the formation of deleterious hydrogen peroxide
used gives the same results as a phylogenetic bracketing (H2O2) and the depletion of glutathione stores. Therefore,
approach when a particular gene has been lost in several for species with an ample supply of vitamin C, the selective
species and that the phylogeny (and divergence times) of pressure to preserve the ability of biosynthesis is lost
these species is known. On the other hand, this method is (Bánhegyi et al. 1996). Alternatively, as pointed out
useful when the phylogeny (and divergence times) of these repeatedly in the literature, loosing the capacity to synthe-
species is not known, such as in the case of the guinea pig size vitamin C might simply not be disadvantageous to
gene where a gene has been lost in a single lineage. species having a vitamin C-rich diet (e.g., Chatterjee 1973;
The analysis of repetitive element distribution in the Birney et al. 1976; Nishikimi et al. 1992). The fact that the
primate and rodent sequences did not reveal conclusive inability to synthesize vitamin C in both humans and guinea
evidence that illegitimate recombination between repeated pigs is due to the loss of the same (GLO) gene is likely a
sequences were involved in the loss of exons in the consequence of the fact that the enzymatic function of the
anthropoid and guinea pig GLO genes (Fig. 4 and results GLO protein is the last step in vitamin C production.
not shown). However, since the exon/intron structure is Whereas loosing this gene only affects the production of
conserved between the human, chimp, and macaque, we vitamin C, loosing genes for the other enzymes of this
must infer that the exon deletions found in this gene were synthesis pathway would affect the production of many
already fixed in their last common ancestor some 30 MYA other molecules (Linster and Van Schaftingen 2007).
(Steiper and Young 2006). The mechanisms which led to As more sequence information becomes available,
the deletion of exons 1, 2, 3, 5, 6, 8 and 11 must therefore molecular dating studies such as this one will become more
have occurred prior to that. Such a long time span is suf- accurate. Postulating dates for the inactivation of genes can
ficient to have rendered any footprint or chromosomal provide important information in regards to when major
breakpoint unrecognizable. It is of interest, however, that biochemical shifts occurred and their relation in the
these deletions have occurred earlier rather than later. broader evolutionary picture. These sequences will also
Many types of repetitive elements are thought to have been allow us to determine whether the inactivation of the same
intensely active in the early part of the mammalian and (GLO) gene in humans and guinea pigs is due to chance or
primate radiation (Pascale et al. 1990; Rowold and Herrera is a feature of all species unable to synthesize vitamin C.
2000; Zhang et al. 2003; Pace and Feschotte 2007). This
coincides with the fact that the common anthropoid GLO Acknowledgments We thank the two anonymous referees for their
useful and constructive comments on a previous version of this man-
gene has not sustained any gross modifications since at uscript. This work was supported by a Discovery Grant from the Nat-
least 30 MYA. A quick examination of the eight Alu ural Science and Engineering Research Council of Canada to G. D.
sequences present in the allele revealed that seven belon-
ged to the ancestral subfamilies AluJ and AluSx. These are
thought to have been active 60 and 44 MYA, respectively
(Price et al. 2004). Interestingly, it has been observed that References
the bat genus Myotis has had a recent bout of DNA
transposon activity (Ray et al. 2007). As bat GLO gene Bánhegyi G, Csala M, Braun L, Garzó T, Mandl J (1996) Ascorbate
synthesis-dependent glutathione consumption in mouse liver.
sequences become available, the nucleotide substitution FEBS Lett 381:39–41
pattern will tell us whether vitamin C biosynthetic capacity Birney EC, Jenness R, Ayaz KM (1976) Inability of bats to synthesise
was lost during the same time frame. L-ascorbic acid. Nature 260:626–628
The effects a nonfunctional GLO gene may have had on Burns JJ (1957) Missing step in man, monkey and guinea pig required
for the biosynthesis of L-ascorbic acid. Nature 180:553
anthropoid and guinea pig evolution is considerable. On Challem JJ (1997) Did the loss of endogenous ascorbate propel the
account of the various roles for vitamin C in metabolism, it evolution of anthropoidea and Homo sapiens? Med Hypotheses
is surprising that the inability to synthesize it appears as 48:387–392


Challem JJ, Taylor EW (1998) Retroviruses, ascorbate, and mutations, in the evolution of Homo sapiens. Free Radic Biol Med 25:130–132
Chatterjee IB (1973) Evolution and the biosynthesis of ascorbic acid. Science 182:1271–1272
Chaudhuri CR, Chatterjee IB (1969) L-ascorbic acid synthesis in birds: phylogenetic trend. Science 164:435–436
Chou H, Hayakawa T, Diaz S, Krings M, Indriati E, Leakey M, Paabo S, Satta Y, Takahata N, Varki A (2002) Inactivation of CMP-N-acetylneuraminic acid hydroxylase occurred prior to brain expansion during human evolution. Proc Natl Acad Sci USA 99:11736–11741
Coghlan A, Eichler EE, Oliver SG, Paterson AH, Stein L (2005) Chromosome evolution in eukaryotes: a multi-kingdom perspective. Trends Genet 21:673–682
Cooper DN (1999) Human gene evolution. BIOS Scientific, Oxford
Dabrowski K (1990) Gulonolactone oxidase is missing in teleost fish. The direct spectrophotometric assay. Biol Chem Hoppe Seyler 371:207–214
Dabrowski K (1994) Primitive actinopterygian fishes are capable of ascorbic acid synthesis. Experientia 50:745–748
Drew KL, Osborne PG, Frerichs KU, Hu Y, Koren RE, Hallenbeck JM, Rice ME (1999) Ascorbate and glutathione regulation in hibernating ground squirrels. Brain Res 851:1–8
Drouin G, Prat F, Ell M, Clarke G (1999) Detecting and characterizing gene conversions between multigene family members. Mol Biol Evol 16:1369–1390
Echols N, Harrison P, Balasubramanian S, Luscombe NM, Bertone P, Zhang Z, Gerstein M (2002) Comprehensive analysis of amino acid and nucleotide composition in eukaryotic genomes, comparing genes and pseudogenes. Nucleic Acids Res 30:2515–2523
Graur D, Li W-H (2000) Fundamentals of molecular evolution, 2nd edn. Sinauer Associates, Inc., Sunderland, Massachusetts
Huchon D, Chevret P, Jordan U, Kilpatrick CW, Ranwez V, Jenkins PD, Brosius J, Schmitz J (2007) Multiple molecular evidences for a living mammalian fossil. Proc Natl Acad Sci USA 104:7495–7499
Hwang D, Lin T (2002) Effect of temperature on dietary vitamin C requirement and lipid in common carp. Comp Biochem Physiol B Biochem Mol Biol 131:1–7
Kolomietz E, Meyn MS, Pandita A, Squire JA (2002) The role of Alu repeat clusters as mediators of recurrent chromosomal aberrations in tumors. Genes Chromosomes Cancer 35:97–112
Krasnov A, Reinisalo M, Pitkanen TI, Nishikimi M, Molsa H (1998) Expression of rat gene for L-gulono-gamma-lactone oxidase, the key enzyme of L-ascorbic acid biosynthesis, in guinea pig cells and in teleost fish rainbow trout (Oncorhynchus mykiss). Biochim Biophys Acta 1381:241–248
Li W, Maeda N, Beck MA (2006) Vitamin C deficiency increases the lung pathology of influenza virus-infected Gulo -/- mice. J Nutr 136:2611–2626
Linster CL, Van Schaftingen E (2007) Vitamin C biosynthesis, recycling and degradation in mammals. FEBS J 274:1–22
Moreau R, Dabrowski K (2000) Biosynthesis of ascorbic acid by extant actinopterygians. J Fish Biol 57:733–745
Murphy WJ, Pevzner PA, O’Brien SJ (2004) Mammalian phylogenomics comes of age. Trends Genet 20:631–639
Nakayama K, Ishida T (2006) Alu-mediated 100-kb deletion in the primate genome: the loss of the agouti signaling protein gene in the lesser apes. Genome Res 16:485–490
Nishikimi M, Kawai T, Yagi K (1992) Guinea pigs possess a highly mutated gene for L-gulono-gamma-lactone oxidase, the key enzyme for L-ascorbic acid biosynthesis missing in this species. J Biol Chem 267(30):21967–21972
Nishikimi M, Fukuyama R, Minoshima S, Shimizu N, Yagi K (1994) Cloning and chromosomal mapping of the human nonfunctional gene for L-gulono-gamma-lactone oxidase, the enzyme for L-ascorbic acid biosynthesis missing in man. J Biol Chem 269:13685–13688
Pace JK 2nd, Feschotte C (2007) The evolutionary history of human DNA transposons: evidence for intense activity in the primate lineage. Genome Res 17:422–432
Padh H (1990) Cellular functions of ascorbic acid. Biochem Cell Biol 68:1166–1173
Pascale E, Valle E, Furano A (1990) Amplification of an ancestral mammalian L1 family of long interspersed repeated DNA occurred just before the murine radiation. Proc Natl Acad Sci USA 87:9481–9485
Pauling L (1970) Evolution and the need for ascorbic acid. Proc Natl Acad Sci USA 67:1643–1648
Pollock JI, Mullin RJ (1987) Vitamin C biosynthesis in prosimians: evidence for the anthropoid affinity of Tarsius. Am J Phys Anthropol 73:65–70
Price AL, Eskin E, Pevzner PA (2004) Whole-genome analysis of Alu repeat elements reveals complex evolutionary history. Genome Res 14:2245–2252
Ray DA, Pagan HJT, Thompson ML, Stevens RD (2007) Bats with hATs: evidence for recent DNA transposon activity in genus Myotis. Mol Biol Evol 24:632–639
Rowold DJ, Herrera RJ (2000) Alu elements and the human genome. Genetica 108:57–72
Steiper ME, Young NM (2006) Primate molecular divergence dates. Mol Phylogenet Evol 41:384–394
Szabó Z, Levi-Minzi SA, Christiano AM, Struminger C, Stoneking M, Batzer MA, Boyd CD (1999) Sequential loss of two neighboring exons of the tropoelastin gene during primate evolution. J Mol Evol 49:664–671
Thompson JD, Higgins DG, Gibson TJ (1994) CLUSTALW: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 22:4673–4680
Toth G, Deak G, Barta E, Kiss GB (2006) PLOTREP: a web tool for defragmentation and visual analysis of dispersed genomic repeats. Nucleic Acids Res 34:W708–W713
Toyohara H, Nakata T, Touhata K, Hashimoto H, Kinoshita M, Sakaguchi M, Nishikimi M, Yagi K, Wakamatsu Y, Ozato K (1996) Transgenic expression of L-gulono-gamma-lactone oxidase in medaka (Oryzias latipes), a teleost fish that lacks this enzyme necessary for L-ascorbic acid biosynthesis. Biochem Biophys Res Commun 223:650–653
Uddin RK, Zhang Y, Siu VM, Fan Y, O’Reilly RL, Rao J, Singh SM (2006) Breakpoint associated with a novel 2.3 Mb deletion in the VCFS region of 22q11 and the role of Alu (SINE) in recurring microdeletions. BMC Med Genet 7:18
Winter H, Langbein L, Krawczak M, Cooper DN, Jave-Suarez LF, Rogers MA, Praetzel S, Heidt PJ, Schweizer J (2001) Human type I hair keratin pseudogene phihHaA has functional orthologs in the chimpanzee and gorilla: evidence for recent inactivation of the human gene after the Pan-Homo divergence. Hum Genet 108:37–42
Zhang Z, Harrison PM, Liu Y, Gerstein M (2003) Millions of years of evolution preserved: a comprehensive catalog of the processed pseudogenes in the human genome. Genome Res 13:2541–2558
Zhang ZD, Frankish A, Hunt T, Harrow J, Gerstein M (2010) Identification and analysis of unitary pseudogenes: historic and contemporary gene losses in humans and other primates. Genome Biol 11:R26
Zilva SS (1936) Vitamin C requirements of the guinea-pig. Biochem J 30:1419–1429
