Dispatches

Evolution: Reconstructing the Timeline of Eukaryogenesis
Andrew J. Roger1,2,*, Edward Susko1,3, and Michelle M. Leger4
1Centre for Comparative Genomics and Evolutionary Bioinformatics, Dalhousie University, Halifax, NS B3H 4R2, Canada
2Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, NS B3H 4R2, Canada
3Department of Mathematics and Statistics, Dalhousie University, Halifax, NS B3H 4R2, Canada
4Institute of Evolutionary Biology (CSIC-UPF), Barcelona 08003, Spain

*Correspondence: andrew.roger@dal.ca
https://doi.org/10.1016/j.cub.2020.12.035

Timing the events in the evolution of eukaryotic cells is crucial to understanding this major transition. A recent
study reconstructs the origins of thousands of gene families ancestral to eukaryotes and, using a
controversial approach, aims to order the events of eukaryogenesis.

More than half a century after Stanier and van Niel delineated the profound differences between prokaryotic and eukaryotic cells1, we still lack a clear understanding of how the latter evolved from the former2,3. Eukaryogenesis involved, at minimum, an endosymbiont related to Alphaproteobacteria that gave rise to mitochondria, and a host cell that was related to Archaea. But the origins of the myriad other characteristic eukaryotic features and the thousands of genes underpinning them remain obscure2. A new paper by Vosseberg and colleagues4 tackles this problem by extracting 'time of origin' information from thousands of phylogenies of proteins sampled from organisms across the prokaryote–eukaryote divide.

In the last few decades, investigations of diverse eukaryotes have clarified the nature of the last eukaryotic common ancestor (LECA), the 'endpoint' of eukaryogenesis (Figure 1). LECA appears to have had all of the traits of canonical eukaryotes including a complex dynamic cytoskeleton, a nucleus, an endomembrane system, and mitochondria5. Concepts of the first eukaryotic common ancestor (FECA) have also been recently sharpened by metagenomic discoveries of the asgard Archaea6. Asgards appear to be the closest sister lineages to the eukaryote nucleocytoplasm, and harbour genes previously thought to be eukaryote-specific. These include homologs of eukaryotic cytoskeleton and endomembrane system components, suggesting that some asgards may possess simple precursors of these eukaryotic cellular systems6–8. Despite these discoveries, a gulf persists between an asgard-like FECA and a fully complex LECA. Dozens of eukaryotic structures and processes had to evolve, involving the origination, duplication and functional specialization of thousands of genes (Figure 1). The details of how this happened, the origins of the genes involved, and the order of events are hotly debated.

Some have suggested that the mitochondrial symbiosis occurred early, triggering eukaryogenesis and contributing much more to the proto-eukaryote lineage than the asgard archaeal host9. Others maintain that the mitochondrial symbiosis occurred later, after phagocytosis and associated eukaryotic features evolved in the host lineage10,11. In such scenarios, the genetic contribution of the mitochondrial symbiont may not have been especially large or impactful. Other bacterial groups might also have contributed genes to the proto-eukaryote lineage by lateral gene transfer11,12. Unfortunately, our limited information about the FECA to LECA transition makes it difficult to choose between these and alternative eukaryogenesis scenarios.

To address this impasse, Vosseberg and colleagues4 first attempted to infer LECA's protein-domain content and constructed phylogenies for each one. They aligned homologs of conserved protein domains from proteomes of several hundred representatives of major eukaryote 'supergroups' and thousands of bacteria and archaea. After reducing the representation of closely related sequences, phylogenies were estimated for each protein and specific nodes on each tree were annotated as pre-LECA gene duplication nodes, LECA nodes, and post-LECA nodes (Figure 2). For trees with prokaryotic sequences, the closest prokaryotic sister group to the eukaryote clade was identified.
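
To make the last step concrete, here is a minimal Python sketch of what identifying the closest sister group of the eukaryote clade means on a rooted tree. It is a toy illustration only, not the pipeline of Vosseberg and colleagues: the `Node` class, the function names and the example tree are all hypothetical, and the sketch assumes that the eukaryote sequences form a single clade.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Rooted tree node; leaves carry a taxon-group label (hypothetical)."""
    label: str = ""
    children: List["Node"] = field(default_factory=list)


def leaf_labels(node: Node) -> List[str]:
    """Collect the labels of all leaves at or below this node."""
    if not node.children:
        return [node.label]
    labels: List[str] = []
    for child in node.children:
        labels.extend(leaf_labels(child))
    return labels


def sister_groups(root: Node, target: str = "eukaryote") -> Optional[List[str]]:
    """Labels in the sibling(s) of the smallest clade holding every `target` leaf.

    Returns None if there are no target leaves or if that clade is the root.
    """
    total = leaf_labels(root).count(target)
    if total == 0:
        return None
    parent, node = None, root
    while True:
        # Descend while a single child still contains every target leaf.
        holders = [c for c in node.children
                   if leaf_labels(c).count(target) == total]
        if not holders:
            break
        parent, node = node, holders[0]
    if parent is None:
        return None
    sisters: List[str] = []
    for child in parent.children:
        if child is not node:
            sisters.extend(leaf_labels(child))
    return sorted(set(sisters))


# Hypothetical protein tree: (((eukaryote, eukaryote), (asgard, asgard)), bacteria)
euk = Node(children=[Node("eukaryote"), Node("eukaryote")])
asg = Node(children=[Node("asgard"), Node("asgard")])
tree = Node(children=[Node(children=[euk, asg]), Node("bacteria")])

print(sister_groups(tree))
```

In this toy tree the eukaryote clade groups with the asgard sequences, so the family would be scored as having an asgard affinity. Real trees are larger, and poorly resolved trees may not yield a clear answer.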

Current Biology 31, R186–R214, February 22, 2021. Crown Copyright © 2020 Published by Elsevier Inc.

Figure 1. Eukaryogenesis.
The first eukaryotic common ancestor (FECA), which descended from the asgard Archaea, evolved into the last eukaryotic common ancestor (LECA) by the sequential acquisition of thousands of traits (box). Many intermediate lineages went extinct (represented by crosses). The order in which these traits arose, and the relative timing of the mitochondrial endosymbiosis, are unknown. Alternative 'mitochondria-early' (1), 'mitochondria-intermediate' (2) and 'mitochondria-late' (3) scenarios are depicted. The thumbnail illustrations of mitochondria and LECA were provided by Sergio Muñoz Gómez.
Figure labels: Asgard archaea, Alphaproteobacteria, FECA, mitochondrial symbiont, LECA, eukaryotes; gene duplications, novel genes, gene transfers from prokaryotes; cytoskeleton, microtubular flagellum, nucleus, mitosis & meiosis, endomembrane system.

Using these annotated trees, Vosseberg and colleagues estimated that LECA possessed 10,233 protein families (LECA families) and predicted that LECA had 7,447–21,840 genes. This range was somewhat robust to different possible eukaryote root positions and fits well with a relatively complex LECA. Next, they investigated the role of gene duplications in eukaryogenesis. The total number of LECA families was 1.8 times the number of novel or acquired genes, indicating that gene duplication played an important role in eukaryogenesis. Duplications were enriched in genes functioning in signalling and genes for information storage and processing but were depleted in metabolic genes. In terms of cellular localization, there was an overrepresentation of duplicated genes encoding cytoskeletal, endomembrane, plasma membrane, and nuclear proteins, and an underrepresentation of genes encoding cytosolic, peroxisomal, and mitochondrial proteins. Overall, eukaryogenesis seems mostly to have involved innovation in informational, endomembrane, signalling and cytoskeletal systems.

A much larger fraction of eukaryotic protein families had bacterial affinities (77%) than had archaeal affinities (16%), congruent with earlier analyses12. Crucially, however, only 7.2% had alphaproteobacterial affinities and 6.8% had asgard origin. Although the latter two numbers are likely underestimates because some families lack phylogenetic resolution, they suggest that the genetic contributions of the mitochondrion and the host lineages may have been similar in magnitude and that a substantial proportion of LECA genes originated from different bacterial donors. Importantly, asgard families were most likely to have been duplicated, whereas alphaproteobacterial families were the least. Genes of host origin appear to have contributed more to eukaryogenesis-related innovations and genome expansion than those of the symbiont.

Vosseberg and colleagues then attempted to order eukaryogenesis events using a method recently developed by Pittis and Gabaldón11. The latter suggested that for any protein $g$ and branch $v$ on its phylogeny, the branch length $L_v^g$ is related to the time span for the branch, $T_v^g$, and average substitution rate for the protein, $R_g$, through $L_v^g = R_g T_v^g$. Therefore, if a gene were acquired by the proto-eukaryote from a prokaryotic lineage shortly after the latter speciated from its closest sampled living sister lineage (Figure 2), the length of the 'stem' branch leading to LECA in this phylogenetic tree could be used to determine the relative time of origin of that protein after normalization to remove the effect of the protein-specific rate. The normalization factor employed was the median of the possible 'LECA-to-eukaryote path distances' (that is, the sum of consecutive branch lengths on a path from the LECA node to a eukaryote tip sequence) in each phylogenetic tree (Figure 2). If $T_e$ is the age of LECA, the normalized stem length is then $sl_g = L_s^g/\mathrm{med}(L_e^g) = R_g T_s^g/\mathrm{med}(R_g T_e)$ or just $sl_g = T_s^g/T_e$ (Figure 2). In this case, for two sets of LECA proteins of different origins A and B, values of $sl_g$ will tend to be smaller for group A if its genes have more recent origins; that is, group A's time span is less than group B's: $T_s^A < T_s^B$.

Vosseberg et al.4 extended this approach to gene duplication events by calculating normalized 'duplication lengths' for genes ($dl_g$) to evaluate the timing of duplication events (Figure 2) relative to the times of origin of genes from prokaryotic sources.

Vosseberg et al.4 confirmed Pittis and Gabaldón's earlier findings11 that the stem-length distributions of archaeal


origin proteins were systematically larger than those of bacterial proteins, with alphaproteobacterial proteins having the smallest stem lengths of all. Comparing duplication lengths revealed a similar pattern, with proteins of bacterial origin and eukaryote-specific inventions having the shortest duplication lengths and archaeal-origin proteins the largest. Grouping the duplicated proteins into functional classes revealed that cytoskeletal, translational, nucleolar and intracellular trafficking proteins had the longest duplication lengths, often longer than the median of alphaproteobacterial stem-length distribution. Other classes, such as the endomembrane system, had similar duplication-length distribution medians to the alphaproteobacterial stem lengths, whereas the shortest duplication lengths corresponded to mitochondrial, metabolic, signal transduction and transcriptional proteins. Overall, they interpret these analyses to support a 'mitochondria-intermediate' scenario (Figure 1), whereby expansions of the cytoskeletal, nucleolar and membrane trafficking systems occurred pre-symbiosis, with the remainder of the systems originating and complexifying during, or after, the origin of mitochondria.

Figure 2. Relating the geological 'time tree' (top) to branch lengths on a phylogeny of a protein (below) acquired from a prokaryote during eukaryogenesis.
Geological timespans for branches, or sets of sequential branches (paths), $v$ in the tree of protein $g$ are $T_v^g$, and branch lengths are $L_v^g$. s, stem branch or path; d, duplication branch or path; e, LECA-to-extant-eukaryote path. Trees are annotated with 'acquisition nodes' (green diamond), 'duplication nodes' (magenta square) and 'LECA nodes' (purple circles). LECA families (purple boxes) and their prokaryotic sister group (green box) are shown. Expressions for the stem length ($sl_g$) and duplication length ($dl_g$) are outlined in orange. Note that Vosseberg et al.4 use the smallest of the possible duplication branches to calculate stem lengths and duplication lengths. Also, because any gene transfer to the proto-eukaryote from a prokaryotic source must have occurred after speciation of the donor lineage from its closest sister group, the stem-branch timespan, $T_s^g$, is an upper bound on the timespan of the desired stem branch post-acquisition, $\tilde{T}_s^g$, as discussed by Susko, Steel, and Roger15. Vosseberg and colleagues4 assumed $\tilde{T}_s^g \approx T_s^g$.
Figure labels: time tree (asgards, FECA, LECA, eukaryotes; $T_s^g$, $T_d^g$, $T_e$); protein $g$ tree (prokaryotic sister group; $L_s^g$, $L_d^g$, $\mathrm{med}(L_e^g)$). Stem length: $sl_g = L_s^g/\mathrm{med}(L_e^g)$. Duplication length: $dl_g = L_d^g/\mathrm{med}(L_e^g)$.

But how robust are these inferences? Pittis and Gabaldón's method has been criticized by Martin and colleagues13 for assuming a molecular clock whereby substitution rates for stem and eukaryote branches are equal, $R_s^g = R_e^g$, or are a fixed ratio, $c = R_s^g/R_e^g$, for all genes. They pointed to extensive rate variation within and between branches of the trees in the original data set that violates both of these assumptions13.

Are these assumptions required for the approach to make sense? No. Following relaxed molecular clock theory14, suppose instead that rates vary throughout the tree according to a time-dependent stochastic process. Expressing the rate of substitution as an overall protein rate, $R_g$, multiplied by the relative rate, $r_v^g(t)$, for branch or path $v$, the length of stem or duplication branch $s$ in protein $g$ becomes15 $L_s^g = R_g \int_{T_e}^{T_e + T_s^g} r_s^g(t)\,dt$. A corresponding expression can be given for the eukaryotic path length with the complication that the median value, $\mathrm{med}(L_e^g)$, was selected from all possible such paths. For two groups of genes, A and B, with the same time of acquisition or duplication, evolving independently according to the same process, the probability that any gene $a$ in group A has a stem length greater than that of any gene $b$ in group B is $\Pr[sl_a > sl_b] = 1/2$. This is because stem lengths are independent realizations of random variables coming from the same process, with the same distribution. It is possible to show that, if $r_v^g(t)$ is bounded away from 0 and infinity, then $\Pr[sl_a > sl_b] > 1/2$ if and only if group A genes have an earlier time of origin than group B genes15. This result justifies the use of the Mann-Whitney U-test by Pittis and Gabaldón11 and Vosseberg et al.4 because its null and alternative hypotheses are $\Pr[sl_a > sl_b] = 1/2$ and $\Pr[sl_a > sl_b] > 1/2$, respectively. In this framework15, stem lengths are expected to show variation over


genes and no molecular clock is required.

However, the foregoing assumes all proteins from a given class have the same time of origin. For stem lengths of proteins of asgard or mitochondrial origin, this assumption is sensible. But for genes independently acquired from different prokaryotic groups, times of origin may show substantial within-class variability. Similarly, duplication times of proteins of any origin within functional or localization classes are not a priori expected to be clustered in time. In fact, since some genes in any specific functional class will end up in more than one localization class and vice versa, an assumption of similar times of origin within a class cannot hold in general. For example, the amino-acid-metabolism functional class contains proteins in the cytosol and the mitochondrion localization classes; these classes, in turn, contain many proteins not in the amino-acid-metabolism class (for example, translation proteins). Although the framework above can be extended to cases of varying origin times, rejection of the null hypothesis then only suggests 'on average' earlier origins for one group versus the other. Thus, the broad stem-length and duplication-length distributions in Vosseberg et al.4 may reflect not only stochastic variation in rates, but also large ranges of origin times.

A second problematic assumption is that proteins of different origins are all independently evolving according to the same probabilistic rate process, $r_v^g(t)$. However, eukaryogenesis must have involved significant shifts in evolutionary constraints on proteins as they adopted new functions. Functional divergence transiently elevates or permanently changes substitution rates16 and, if it differentially affected proteins of different origins, it will also be reflected in stem-length differences. To address this, Vosseberg et al.4 investigated duplicated proteins and concluded that functional divergence after gene duplication did not strongly affect their conclusions. However, this overlooks the possibility that functional divergence could affect non-duplicated proteins and can occur prior to gene duplication. For example, important changes in the translational system occurred during eukaryogenesis17,18 that likely induced functional divergence in translational proteins, as previously demonstrated for the translation elongation factor 1 alpha protein19. Similar functional shifts are likely to have affected many different proteins as they acquired new roles in the proto-eukaryotic lineage. However, groups of proteins of different origins may have differentially experienced functional divergence, shifting their relative stem-length distributions.

Despite these concerns, Vosseberg and colleagues have provided important new insights into the roles of gene duplication and gene invention in different cellular systems during eukaryogenesis, and clarified the relative contributions of host, symbiont and other prokaryotes to the genetic makeup of LECA4. This study, and ongoing discoveries of microbes that may be even closer relatives of the eukaryote host and mitochondrial lineages, will no doubt spur many future attempts to solve one of life's most fundamental evolutionary puzzles.

REFERENCES

1. Stanier, R.Y., and van Niel, C.B. (1962). The concept of a bacterium. Arch. Für Mikrobiol. 42, 17–35.
2. Dacks, J.B., Field, M.C., Buick, R., Eme, L., Gribaldo, S., Roger, A.J., Brochier-Armanet, C., and Devos, D.P. (2016). The changing view of eukaryogenesis - fossils, cells, lineages and how they all come together. J. Cell Sci. 129, 3695–3703.
3. Porter, S.M. (2020). Insights into eukaryogenesis from the fossil record. Interface Focus 10, 20190105.
4. Vosseberg, J., van Hooff, J.J.E., Marcet-Houben, M., van Vlimmeren, A., van Wijk, L.M., Gabaldón, T., and Snel, B. (2020). Timing the origin of eukaryotic cellular complexity with ancient duplications. Nat. Ecol. Evol. 5, 92–100.
5. Koumandou, V.L., Wickstead, B., Ginger, M.L., van der Giezen, M., Dacks, J.B., and Field, M.C. (2013). Molecular paleontology and complexity in the last eukaryotic common ancestor. Crit. Rev. Biochem. Mol. Biol. 48, 373–396.
6. Zaremba-Niedzwiedzka, K., Caceres, E.F., Saw, J.H., Bäckström, D., Juzokaite, L., Vancaester, E., Seitz, K.W., Anantharaman, K., Starnawski, P., Kjeldsen, K.U., et al. (2017). Asgard archaea illuminate the origin of eukaryotic cellular complexity. Nature 541, 353–358.
7. Eme, L., Spang, A., Lombard, J., Stairs, C.W., and Ettema, T.J.G. (2017). Archaea and the origin of eukaryotes. Nat. Rev. Microbiol. 15, 711–723.
8. Stairs, C.W., and Ettema, T.J.G. (2020). The archaeal roots of the eukaryotic dynamic actin cytoskeleton. Curr. Biol. 30, R521–R526.
9. Martin, W.F., Tielens, A.G.M., Mentel, M., Garg, S.G., and Gould, S.B. (2017). The physiology of phagocytosis in the context of mitochondrial origin. Microbiol. Mol. Biol. Rev. 81, e00008–17.
10. Cavalier-Smith, T., and Chao, E.E.-Y. (2020). Multidomain ribosomal protein trees and the planctobacterial origin of neomura (eukaryotes, archaebacteria). Protoplasma 257, 621–753.
11. Pittis, A.A., and Gabaldón, T. (2016). Late acquisition of mitochondria by a host with chimaeric prokaryotic ancestry. Nature 531, 101–104.
12. Rochette, N.C., Brochier-Armanet, C., and Gouy, M. (2014). Phylogenomic test of the hypotheses for the evolutionary origin of eukaryotes. Mol. Biol. Evol. 31, 832–845.
13. Martin, W.F., Roettger, M., Ku, C., Garg, S.G., Nelson-Sathi, S., and Landan, G. (2017). Late mitochondrial origin is an artifact. Genome Biol. Evol. 9, 373–379.
14. Bromham, L., Duchêne, S., Hua, X., Ritchie, A.M., Duchêne, D.A., and Ho, S.Y.W. (2018). Bayesian molecular dating: opening up the black box. Biol. Rev. Camb. Philos. Soc. 93, 1165–1191.
15. Susko, E., Steel, M., and Roger, A.J. (2021). Conditions under which distributions of edge length ratios on phylogenetic trees can be used to order evolutionary events. bioRxiv, https://doi.org/10.1101/2021.01.16.426961.
16. Studer, R.A., Dessailly, B.H., and Orengo, C.A. (2013). Residue mutations and their impact on protein structure and function: detecting beneficial and pathogenic changes. Biochem. J. 449, 581–594.
17. Greber, B.J., Boehringer, D., Godinic-Mikulcic, V., Crnkovic, A., Ibba, M., Weygand-Durasevic, I., and Ban, N. (2012). Cryo-EM structure of the archaeal 50S ribosomal subunit in complex with initiation factor 6 and implications for ribosome evolution. J. Mol. Biol. 418, 145–160.
18. Schmitt, E., Coureux, P.-D., Kazan, R., Bourgeois, G., Lazennec-Schurdevin, C., and Mechulam, Y. (2020). Recent advances in archaeal translation initiation. Front. Microbiol. 11, 584152.
19. Inagaki, Y., Blouin, C., Susko, E., and Roger, A.J. (2003). Assessing functional divergence in EF-1alpha and its paralogs in eukaryotes and archaebacteria. Nucleic Acids Res. 31, 4227–4237.
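
The stem-length normalization and the rank-based comparison discussed in the main text can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used in any of the cited studies: the function names are hypothetical, the branch lengths are invented toy values, and the sketch only estimates the probability that a group A stem length exceeds a group B stem length (the Mann-Whitney U statistic divided by the number of pairs) without computing a p-value.

```python
from statistics import median
from typing import List, Sequence, Tuple


def normalized_stem_length(stem: float, leca_to_tip_paths: Sequence[float]) -> float:
    """Stem length divided by the median LECA-to-tip path length of the same tree."""
    return stem / median(leca_to_tip_paths)


def prob_a_greater(sl_a: Sequence[float], sl_b: Sequence[float]) -> float:
    """Fraction of (a, b) pairs with a > b, counting ties as one half.

    This equals the Mann-Whitney U statistic for group A divided by
    len(sl_a) * len(sl_b).
    """
    wins = 0.0
    for a in sl_a:
        for b in sl_b:
            if a > b:
                wins += 1.0
            elif a == b:
                wins += 0.5
    return wins / (len(sl_a) * len(sl_b))


# Invented toy data: (stem branch length, LECA-to-tip path lengths) per protein tree.
Tree = Tuple[float, List[float]]
group_a: List[Tree] = [(0.9, [0.5, 0.6, 0.7]), (1.2, [0.8, 1.0, 1.2]), (0.6, [0.3, 0.4, 0.5])]
group_b: List[Tree] = [(0.3, [0.5, 0.6, 0.7]), (0.5, [0.9, 1.0, 1.1]), (0.2, [0.4, 0.4, 0.6])]

sl_a = [normalized_stem_length(stem, paths) for stem, paths in group_a]
sl_b = [normalized_stem_length(stem, paths) for stem, paths in group_b]

# Values well above 1/2 point to longer normalized stems in group A.
print(prob_a_greater(sl_a, sl_b))
```

Dividing by the median path length cancels the protein-specific rate only to the extent that the same rate applies along the stem and the LECA-to-tip paths, which is the assumption the commentary examines. A full analysis would pass the two lists of normalized stem lengths to an established Mann-Whitney U implementation to obtain a significance level.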
