To minimize the possible influence of long-branch attraction coupled with convergent compositional
signals, various strategies have been applied, such as the use of nucleus-encoded mitochondrial genes4,6,7,
site or gene exclusion8-10, protein recoding10 and the use of heterogeneity-tolerant models6,11. These
attempts have yielded contradictory hypotheses (Supplementary Discussion). Recently, Martijn et al.
revisited the topic and reported that when compositional heterogeneity of the protein sequence alignments
was reduced by excluding sites from the amino acid alignment, the entire alphaproteobacterial class
formed a sister group to mitochondria12. Their conclusion is at odds with the long-agreed phylogenetic
consensus that mitochondria originated from within the Alphaproteobacteria13. However, while excluding
possible noise in compositionally heterogeneous sites might mitigate systematic errors, it will also
necessarily lead to some loss of phylogenetic information. A priori, one cannot rule out the possibility that
excluded sites contain signals of a true evolutionary connection between mitochondria and
Alphaproteobacteria. A similar concern was voiced by Gawryluk14. Here, we examined the phylogenetic
affiliations of mitochondria by using several site-exclusion methods and demonstrated that these results
should be interpreted with utmost caution. To avoid arbitrary effects and model overfitting by site
exclusion, we then applied a straightforward approach to reduce compositional signals in the
dataset: we used GC-rich mitochondrial sequences that better fit the model in their natural state and kept
the native alignment intact.

To cross-validate the effects of site-exclusion approaches on mitochondrial and alphaproteobacterial
phylogeny, we implemented five metrics based on different principles, namely Stuart’s test,
Bowker’s test, χ²-score, ɀ-score and Fast-evolving (Supplementary Table 1). Site-excluded subsets of the
‘24-alphamitoCOGs’ dataset of Martijn et al. (2018) were generated by using the five methods with a
series of cutoff values (Supplementary Table 2). Trees of the subsets were compared to the tree of the
untreated dataset by calculating topological dissimilarity. Site-exclusion approaches led to substantial
changes in tree topology (Fig. 1). In general, the more sites were removed, the more the tree topology
changed. Among the methods, ɀ-score generally caused the least change in nearly all the
subsets of the alignment. These patterns were consistent when either simple or mixed phylogenetic models
were applied. Of these trees, nearly half supported mitochondria as sister to the entire
Alphaproteobacteria (‘mito-out’) and the other half supported mitochondria branching within
Alphaproteobacteria (‘mito-in’) (Fig. 1). The site-exclusion method, the number of sites excluded and the
tree model applied had mixed effects on the inferred relationship of mitochondria to
Alphaproteobacteria. One explanation for this observation is that sites strongly supporting either the
‘mito-in’ or the ‘mito-out’ topology were randomly excluded by these metrics. The absence of certain
topology-determining, ‘mito-in’-supporting sites could have shifted the tree from one topology to the other,
while the further loss of ‘mito-out’-supporting sites might have shifted the tree topology back. Notably,
while we reproduced the result of Martijn et al. (2018) that the tree topology shifted from ‘mito-in’ to
‘mito-out’ when 5% to 40% of sites were removed by using the χ²-score metric, exclusion of more sites
(60% in total here) changed the tree topology back to ‘mito-in’ under the simple model (Fig. 1a),
strongly suggesting that their conclusion depended on an arbitrary parameter setup.

Fig. 1 | Tree dissimilarity based on the Alignment metric between the untreated tree and trees
generated after applying site-exclusion approaches. All trees are rooted. Empty dots show trees
supporting the Alphaproteobacteria-sister topology and filled dots show trees supporting the
within-Alphaproteobacteria topology of mitochondria. a, ML trees under simple models. b, ML trees under the
mixed model (C60).

To counter compositional heterogeneity without arbitrarily compromising phylogenetic signals, we
replaced the mitochondrial and Rickettsiales sequences in the ‘24-alphamitoCOGs’ dataset with GC-rich
alternatives, which markedly reduced the heterogeneity in FYMINK/GARP ratio between
mitochondria and slowly-evolving alphaproteobacteria (Supplementary Fig. 1). A new dataset,
‘18-alphamitoCOGs’, was generated, comprising 61 non-redundant taxa and 18 marker proteins for
downstream phylogenetic inference (Supplementary Tables 3 and 4, and Supplementary Discussion).
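As an illustration of the FYMINK/GARP ratio mentioned above, the following minimal Python sketch counts the amino acids F, Y, M, I, N and K (typically encoded by AT-rich codons) against G, A, R and P (typically encoded by GC-rich codons) in one protein sequence. The function name `fymink_garp_ratio` is hypothetical and this is not the authors' script; it assumes one-letter amino acid codes and simply ignores gaps and any other characters.

```python
from collections import Counter

# Residue groups used for the ratio; all other characters (gaps, X, ...) are ignored.
FYMINK = "FYMINK"
GARP = "GARP"


def fymink_garp_ratio(sequence):
    """Return (count of F,Y,M,I,N,K) / (count of G,A,R,P) for one protein sequence.

    Hypothetical helper for illustration only.
    """
    counts = Counter(sequence.upper())
    fymink = sum(counts[aa] for aa in FYMINK)
    garp = sum(counts[aa] for aa in GARP)
    if garp == 0:
        raise ValueError("sequence contains no G, A, R or P residues")
    return fymink / garp


if __name__ == "__main__":
    # Toy sequences, not taken from the dataset.
    print(fymink_garp_ratio("FYMINKGARP"))
    print(fymink_garp_ratio("GGAA-KK"))
```

Comparing this ratio across taxa before and after sequence replacement is one simple way to see whether compositional heterogeneity has narrowed.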
The tree topology of Alphaproteobacteria themselves is an issue in its own right15. When fast-evolving taxa
were excluded, the alphaproteobacteria could be classified into four major clades: Alpha I, Alpha II,
Alpha III and GT (Fig. 2ab, Supplementary Fig. 2, 3, Supplementary Table 3, Supplementary
Discussion). We refer to these alphaproteobacteria as the ‘backbone’ taxa. Regardless of the addition or
removal of fast-evolving taxa, all four backbone clades remained monophyletic
(Fig. 2; tree files are deposited in Supplementary Data Files).

Fig. 2 | Schematic phylogenetic trees of subgroups of Alphaproteobacteria in the ‘18-alphamitoCOGs’
dataset. Alphaproteobacterial lineages are named according to Supplementary Table 3.
Taxa and taxonomic groups in black represent the backbone taxa. Filled dots show node support values
greater than 80% while empty dots show values greater than 50% but less than 80%. Node values show
posterior probability support values for Bayesian trees and bootstrapping support values based on 1000
iterations for ML trees. Trees are rooted. Outgroup taxa and Magnetococcus marinus MC-1 are not shown.
The Maxdiff values of Bayesian trees are shown beside the trees. a-l, Schematic trees of Supplementary
Fig. 2-13, respectively.

Each of the six fast-evolving groups was then added in turn, and phylogenetic trees were built for each.
Our approach based on this new dataset not only provided results congruent with the most
recent study15 but also yielded novel findings for the most difficult lineages (Supplementary Discussion).
Specifically, Holosporales were placed either in Alpha III, as sister to Azospirillaceae
and Acetobacteraceae (Alpha IIIb), or as sister to the entire Alpha III (Fig. 2cd, Supplementary
Fig. 4, 5). Pelagibacterales branched after Alpha Ia and before Alpha Ib (Fig. 2ef, Supplementary Fig. 6,
7). Alphaproteobacterium HIMB59 was placed in Alpha IIb as sister to MarineAlpha12
Bin1 (Fig. 2gh, Supplementary Fig. 8, 9). Rickettsiales branched as sister to the clade of Alpha II and
Alpha III in the ML tree, with weak basal node support, but within Alpha II, as sister to MarineAlpha9
Bin5, in the Bayesian tree, suggesting a possible connection between Rickettsiales and this newly discovered,
non-fast-evolving marine alphaproteobacterium (Fig. 2ij, Supplementary Fig. 10, 11). Moreover,
fast-evolving MAGs belonging to FEMAG I and FEMAG II were robustly placed within Alpha IIb (Fig. 2kl,
Supplementary Fig. 12, 13). Specifically, FEMAG I showed a strong connection to MarineAlpha9 Bin5,
while FEMAG II was linked to MarineAlpha12 Bin1 in the Bayesian tree.
We then added mitochondria to the trees of backbone taxa, alone or in combination with other
fast-evolving clades (tree files are deposited in Supplementary Data Files). Mitochondria by themselves were
placed within Alphaproteobacteria as sister to Alpha II and Alpha III in the ML tree, with weak node
support (Fig. 3a, Supplementary Fig. 14). However, the counterpart Bayesian tree could not resolve the
relationship of mitochondria to taxa of the four alphaproteobacterial backbone clades (Fig. 3b,
Supplementary Fig. 15). Similar results were observed in trees including mitochondria in combination
with Holosporales, Pelagibacterales or alphaproteobacterium HIMB59 (Fig. 3c-h,
Supplementary Fig. 16-21), suggesting that all these lineages have little evolutionary affinity to mitochondria.
In contrast, an apparent phylogenetic connection of mitochondria to Rickettsiales and to FEMAG II was
observed in both ML and Bayesian trees (Fig. 3i-l, Supplementary Fig. 22-25). Specifically, mitochondria
and Rickettsiales grouped together independently of the four backbone clades, while mitochondria and
FEMAG II were sisters inside the Alpha IIb clade.

Fig. 3 | Schematic phylogenetic trees of mitochondria and subgroups of Alphaproteobacteria in the
‘18-alphamitoCOGs’ dataset. Alphaproteobacterial lineages and mitochondria are named according to
Supplementary Table 3. Taxa and taxonomic groups in black represent the backbone taxa. Filled dots show
node support values greater than 80% while empty dots show values greater than 50% but less than 80%.
Node values show posterior probability support values for Bayesian trees and bootstrapping support values
based on 1000 iterations for ML trees. Trees are rooted. Outgroup taxa and Magnetococcus marinus MC-1
are not shown. The Maxdiff values of Bayesian trees are shown beside the trees. a-l, Schematic trees of
Supplementary Fig. 14-25, respectively.

Since Rickettsiales, alphaproteobacterium HIMB59, FEMAG I and FEMAG II individually showed
phylogenetic connections to backbone taxa of Alpha IIb in Bayesian trees, the evolutionary relationships
between these lineages were then investigated specifically by setting Alpha IIa as the outgroup.
MarineAlpha11 Bin1 and MarineAlpha12 Bin2 formed a monophyletic clade in both the ML and the
Bayesian trees (Fig. 4ab). MarineAlpha9 Bin5 either branched below all the fast-evolving taxa studied
here (Fig. 4a) or formed a monophyletic group with FEMAG I (Fig. 4b). The nodes before MarineAlpha9 Bin5 and
FEMAG I, respectively, had low support, suggesting that the clade comprising these two lineages in the ML tree
was unstable. Both trees agreed that alphaproteobacterium HIMB59 branched within
FEMAG II and that Rickettsiales was sister to FEMAG II.

Fig. 4 | Phylogenetic relationships of fast-evolving taxa and mitochondria to alphaproteobacteria of
Alpha IIb. Node values show posterior probability support values for Bayesian trees and bootstrapping
support values based on 1000 iterations for ML trees. mt, mitochondria. All trees are rooted and the
outgroup is not shown. a and b, ML and Bayesian trees, respectively, of fast-evolving alphaproteobacteria
and taxa of Alpha IIb. c and d, ML and Bayesian trees, respectively, of fast-evolving alphaproteobacteria,
mitochondria and taxa of Alpha IIb.

When mitochondria were present, the topology of all other taxa was preserved in both the ML tree and the
Bayesian tree (Fig. 4cd). Mitochondria were placed below the clade consisting of FEMAG II,
alphaproteobacterium HIMB59, Rickettsiales and FEMAG I in the ML tree, with node support of 71%. In
comparison, the relationship among mitochondria, the clade of FEMAG I and MarineAlpha9 Bin5,
and the clade of FEMAG II, alphaproteobacterium HIMB59 and Rickettsiales was unresolved by Bayesian
inference. Despite that, the placement of mitochondria within Alpha IIb was robust. Our result suggests
that mitochondria may have originated from the common ancestor of Rickettsiales and certain extant
marine planktonic alphaproteobacteria. This tree topology is robust to various parameters and unlikely to be
the result of phylogenetic artifact, as indicated by several lines of evidence (Supplementary Discussion).
In summary, we have demonstrated that site-exclusion methods can cause diverse topological shifts via
arbitrary cutoff selection. Specifically, the Alphaproteobacteria-sister topology reported by Martijn et al.
was the result of a very particular experimental setup and set of parameters12. Even in most
site-excluded datasets, mitochondria branch from within Alphaproteobacteria. Therefore, because objective
criteria for parameter setup are lacking, the jury is still out on site-exclusion methods. We then employed
alternative approaches to mitigate biases in datasets and found that mitochondria have a strong phylogenetic
connection to the common ancestor of Rickettsiales and several fast-evolving alphaproteobacteria derived
from marine surface metagenomes. While this result again supports a robust evolutionary association
between mitochondria and Alphaproteobacteria, it also provides important ecological insights into the
origin of both mitochondria and Rickettsiales. Based on our result, the common ancestor of mitochondria
and Rickettsiales was a free-living alphaproteobacterium. This is consistent with a recent report favoring
independent branching of Rickettsiales and mitochondria16 and also agrees with numerous previous
studies that suggested a phylogenetic connection between mitochondria and Rickettsiales5. Physiological
and geological modelling has suggested that mitochondrial acquisition possibly occurred in shallow
marine environments17 or in anaerobic syntrophy18. Proteomic study of Rickettsiales and the MarineAlpha bins
of Alpha II may provide hints about the metabolic nature of the common ancestor of mitochondria18,19.

1. Ku, C., Nelson-Sathi, S., Roettger, M., Sousa, F. L., et al. Endosymbiotic origin and differential loss of
   eukaryotic genes. Nature 524, 427-432 (2015).
2. Thiergart, T., Landan, G., Schenk, M., Dagan, T. & Martin, W. F. An evolutionary network of genes
   present in the eukaryote common ancestor polls genomes on eukaryotic and mitochondrial origin.
   Genome Biol Evol 4, 466-485 (2012).
3. Abhishek, A., Bavishi, A., Bavishi, A. & Choudhary, M. Bacterial genome chimaerism and the origin
   of mitochondria. Can J Microbiol 57, 49-61 (2011).
4. Atteia, A., Adrait, A., Brugière, S., Tardif, M., et al. A proteomic survey of Chlamydomonas reinhardtii
   mitochondria sheds new light on the metabolic plasticity of the organelle and on the nature of the
   alpha-proteobacterial mitochondrial ancestor. Mol Biol Evol 26, 1533-1548 (2009).
5. Roger, A. J., Muñoz-Gómez, S. A. & Kamikawa, R. The origin and diversification of mitochondria.
   Curr Biol 27, R1177-R1192 (2017).
6. Derelle, R. & Lang, B. F. Rooting the eukaryotic tree with mitochondrial and bacterial proteins. Mol
   Biol Evol 29, 1277-1289 (2012).
7. Wang, Z. & Wu, M. An integrated phylogenomic approach toward pinpointing the origin of
   mitochondria. Sci Rep 5, 7949 (2015).
8. Viklund, J., Ettema, T. J. & Andersson, S. G. Independent genome reduction and phylogenetic
   reclassification of the oceanic SAR11 clade. Mol Biol Evol 29, 599-615 (2012).
9. Esser, C., Ahmadinejad, N., Wiegand, C., Rotte, C., et al. A genome phylogeny for mitochondria
   among alpha-proteobacteria and a predominantly eubacterial ancestry of yeast nuclear genes. Mol
   Biol Evol 21, 1643-1660 (2004).
10. Fitzpatrick, D. A., Creevey, C. J. & McInerney, J. O. Genome phylogenies indicate a meaningful
    alpha-proteobacterial phylogeny and support a grouping of the mitochondria with the Rickettsiales.
    Mol Biol Evol 23, 74-85 (2006).
11. Rodríguez-Ezpeleta, N. & Embley, T. M. The SAR11 group of alpha-proteobacteria is not related to
    the origin of mitochondria. PLoS One 7, e30520 (2012).
12. Martijn, J., Vosseberg, J., Guy, L., Offre, P. & Ettema, T. J. G. Deep mitochondrial origin outside the
    sampled alphaproteobacteria. Nature 557, 101-105 (2018).
13. Gray, M. W., Burger, G. & Lang, B. F. Mitochondrial evolution. Science 283, 1476-1481 (1999).
14. Gawryluk, R. M. R. Evolutionary biology: A new home for the powerhouse? Curr Biol 28,
    R798-R800 (2018).
15. Muñoz-Gómez, S. A., Hess, S., Burger, G., Lang, B. F., et al. An updated phylogeny of the
    Alphaproteobacteria reveals that the parasitic Rickettsiales and Holosporales have independent
    origins. Elife 8, (2019).
16. Castelli, M., Sabaneyeva, E., Lanzoni, O., Lebedeva, N., et al. Deianiraea, an extracellular bacterium
    associated with the ciliate Paramecium, suggests an alternative scenario for the evolution of
    Rickettsiales. ISME J (2019).
17. Waldbauer, J. R., Newman, D. K. & Summons, R. E. Microaerobic steroid biosynthesis and the
    molecular fossil record of Archean life. Proc Natl Acad Sci U S A 108, 13409-13414 (2011).
18. Gould, S. B., Garg, S. G. & Martin, W. F. Bacterial vesicle secretion and the evolutionary origin of the
    eukaryotic endomembrane system. Trends Microbiol 24, 525-534 (2016).
19. Martin, W. F., Tielens, A. G. M., Mentel, M., Garg, S. G. & Gould, S. B. The physiology of
    phagocytosis in the context of mitochondrial origin. Microbiol Mol Biol Rev 81, (2017).

Methods
No statistical methods were used to predetermine sample size. The experiments were not randomized and
the investigators were not blinded to allocation during experiments and outcome assessment.
Implementation of site-exclusion metrics. To obtain the 24-alphamitoCOGs dataset of Martijn et al.
(2018), the file ‘alphaproteobacteria_mitochondria_untreated.aln’ was downloaded from
https://datadryad.org//resource/doi:10.5061/dryad.068d0d0. As the names of some MarineAlpha bins in
this file are not consistent with the phylogenetic trees in the original paper, we obtained the name mapping
file from Dr. Joran Martijn on 4 July 2018. On this dataset, χ²-score-based site exclusion was achieved by
applying the equation introduced by Viklund et al.8. The χ²-score metric was designed to test site
contribution to dataset compositional heterogeneity8 and was applied by Martijn et al. in their study of
mitochondrial phylogeny12. ɀ-score is a metric specifically designed to cope with strong GC content-related
amino acid compositional heterogeneity in datasets of alphaproteobacterial phylogeny15. ɀ-scores of sites were
calculated according to the method introduced by Muñoz-Gómez et al.15. A method implemented in
IQTREE for fast-evolving site selection was also included for comparison, since long-branch attraction
caused by fast-evolving species in Alphaproteobacteria and mitochondria is a potential issue20.
Fast-evolving site exclusion was based on conditional mean site rates estimated under the LG+C60+F+R6
model in IQTREE (v1.5.5) using the ‘-wsr’ flag20. Based on these three metrics, 5%, 10%, 20%, 40% and
60% of sites with the highest scores were excluded for downstream phylogenetic analyses. Stuart’s test and
Bowker’s test are two typical evaluation metrics of symmetry violation21. Site exclusion based on Stuart’s
test was conducted by using the stationary-trimming function in BMGE (v1.12)22.
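The percentage-based exclusion described above can be sketched as follows. This is a minimal illustration and not the authors' script (those are in the Supplementary Code Files): the function name `exclude_top_sites`, the dictionary representation of the alignment and the rounding of the number of removed sites are all assumptions. It takes any per-site score (χ²-score, ɀ-score or site rate) and drops the requested fraction of highest-scoring columns.

```python
from typing import Dict, List


def exclude_top_sites(alignment: Dict[str, str],
                      scores: List[float],
                      fraction: float) -> Dict[str, str]:
    """Remove the `fraction` of alignment columns with the highest scores.

    `alignment` maps taxon name to aligned sequence; `scores` holds one score
    per column. Hypothetical helper for illustration only.
    """
    n_sites = len(scores)
    if any(len(seq) != n_sites for seq in alignment.values()):
        raise ValueError("every sequence must have one residue per score")
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    n_remove = round(n_sites * fraction)  # assumption: round to the nearest column
    # Rank columns from highest to lowest score and mark the top ones for removal.
    ranked = sorted(range(n_sites), key=lambda i: scores[i], reverse=True)
    dropped = set(ranked[:n_remove])
    kept = [i for i in range(n_sites) if i not in dropped]
    return {name: "".join(seq[i] for i in kept) for name, seq in alignment.items()}


if __name__ == "__main__":
    toy = {"taxonA": "ABCDE", "taxonB": "VWXYZ"}
    toy_scores = [0.1, 0.9, 0.3, 0.8, 0.2]
    for cutoff in (0.2, 0.4, 0.6):
        print(cutoff, exclude_top_sites(toy, toy_scores, cutoff))
```

Running the same function over the series of cutoffs (5%, 10%, 20%, 40% and 60%) yields the nested subsets whose trees can then be compared with the untreated tree.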
Compared to Stuart’s test, Bowker’s test of symmetry was reported to be more comprehensive and sufficient
to assess the compliance of symmetry, reversibility and homogeneity in time-reversible model
assumptions21,23. We used Bowker’s test of symmetry to produce subsets of the ‘24-alphamitoCOGs’
dataset by meeting increasingly stringent p-value-based thresholds
(>0.005, >0.01, >0.05, >0.1, >0.2, >0.3, >0.4 and >0.5, respectively). Bowker’s test has long been
used as an overall test for symmetry24. The test assesses symmetry in an r × r contingency table with the
ij-th cell containing the observed frequency nij. The null hypothesis of symmetry is H0: nij = nji for i ≠ j,
i, j = 1,…,r, and the test value is computed as:

$$\chi^2 = \sum_{i<j} \frac{(n_{ij} - n_{ji})^2}{n_{ij} + n_{ji}} \qquad (1)$$

The test statistic follows a χ² distribution with the number of degrees of freedom equal to the number of
comparisons (nij vs nji) made.
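Bowker’s test for one pair of aligned sequences can be sketched in pure Python as below. This is an illustrative sketch and not the authors' implementation: the function names `bowker_test` and `chi2_sf`, the set of skipped characters and the choice to count only pairs with nij + nji > 0 as comparisons are assumptions. The p-value uses the standard recurrence for the upper tail of a χ² distribution with integer degrees of freedom.

```python
import math
from collections import Counter


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-squared variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    # Start from the closed forms for df = 2 (a = 1) or df = 1 (a = 1/2) ...
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    # ... and apply Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1).
    while a < df / 2.0:
        q += math.exp(a * math.log(y) - y - math.lgamma(a + 1.0))
        a += 1.0
    return q


def bowker_test(seq_a, seq_b, skip="-?X"):
    """Return (statistic, degrees of freedom, p-value) for two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # n[(i, j)] counts columns with state i in seq_a and state j in seq_b.
    n = Counter((a, b) for a, b in zip(seq_a, seq_b) if a not in skip and b not in skip)
    states = sorted({s for pair in n for s in pair})
    stat, df = 0.0, 0
    for idx, si in enumerate(states):
        for sj in states[idx + 1:]:
            total = n[(si, sj)] + n[(sj, si)]
            if total > 0:  # assumption: only informative pairs count as comparisons
                stat += (n[(si, sj)] - n[(sj, si)]) ** 2 / total
                df += 1
    p_value = chi2_sf(stat, df) if df > 0 else 1.0
    return stat, df, p_value


if __name__ == "__main__":
    print(bowker_test("AAAACCC", "CCCAACC"))
```

In the toy call, three A→C columns and one C→A column give a single comparison, so the statistic is (3 − 1)² / (3 + 1) with one degree of freedom.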
The scoring function (SF) used here for symmetry-based alignment trimming is a sum of
absolute values of natural logarithms of Bowker's test p-values, each raised to a certain power x (15 as the
default value). SF can be computed as a mean over the values in the upper or lower triangular part of a
square matrix whose rows and columns represent taxa, populated with |ln p|^x values for Bowker’s tests
among these taxa, e.g.:
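
$$SF = \frac{2}{h(h-1)} \sum_{a=1}^{h-1} \sum_{b=a+1}^{h} \left| \ln p_{ab} \right|^{x} \qquad (2)$$

where h is the number of taxa in the multiple sequence alignment (msa), and pab is the p-value for the sequences a and b.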
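The scoring function can be sketched in Python as a mean of |ln p|^x over all unordered taxon pairs. This is a sketch under stated assumptions rather than the authors' code: the function name `scoring_function`, the dictionary keyed by taxon pairs and the floor applied to p-values (to avoid taking the logarithm of zero) are hypothetical choices; only the default power of 15 comes from the text.

```python
import math
from itertools import combinations


def scoring_function(p_values, taxa, power=15, floor=1e-300):
    """Mean of |ln p_ab| ** power over all unordered pairs of taxa.

    `p_values` maps a pair (a, b) of taxon names, in either order, to the
    Bowker's test p-value for that pair. Hypothetical helper for illustration.
    """
    terms = []
    for a, b in combinations(taxa, 2):
        p = p_values[(a, b)] if (a, b) in p_values else p_values[(b, a)]
        p = max(p, floor)  # assumption: clamp so that ln(0) never occurs
        terms.append(abs(math.log(p)) ** power)
    if not terms:
        raise ValueError("at least two taxa are required")
    return sum(terms) / len(terms)


if __name__ == "__main__":
    taxa = ["t1", "t2", "t3"]
    p_values = {("t1", "t2"): math.exp(-1), ("t1", "t3"): math.exp(-2), ("t2", "t3"): 1.0}
    print(scoring_function(p_values, taxa, power=2))
```

A large exponent makes the score dominated by the most strongly asymmetric pairs, so a trimming procedure that minimizes SF targets the worst symmetry violations first.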

the genomes by using HMMER based on protein-specific e-values of the HMM models. The obtained
proteins were processed for ML tree reconstruction by using IQTREE under the model ‘LG+C60+F’.
Copies identified as paralogs, possible contaminants or events of lateral gene transfer in each gene tree
were removed. Candidatus Paracaedibacter symbiosus was excluded because multiple contaminant proteins
were detected in its genome, which suggests that the genome suffers from heavy contamination.
MitoCOG0003 and MitoCOG0133 were excluded as they were detected in few genomes. MitoCOG00052,
MitoCOG00060, MitoCOG00066 and MitoCOG00071 were excluded as they were absent in the reselected
mitochondrial genomes. Consequently, 18 marker proteins were selected. Except for outgroup species
(including Beta-, Gammaproteobacteria and Magnetococcales), only genomes containing 16 or more of
the 18 marker proteins were kept. Furthermore, we removed redundant MarineAlpha bins of the original
dataset based on pairwise similarity of marker proteins by using BLASTP (v2.6.0+, identity ⩾ 0.99 and
coverage ⩾ 0.95) to reduce computational time. As a result, 61 genomes were kept for downstream
analysis (individual and concatenated alignments of ‘18-alphamitoCOGs’ are deposited in Supplementary
Data Files).
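The completeness filter described above (at least 16 of the 18 marker proteins, with outgroup species exempt) can be sketched as follows. The function name `select_genomes` and the input format, a mapping from genome name to the marker proteins detected in it, are assumptions for illustration; the redundancy filtering with BLASTP is not covered here.

```python
def select_genomes(marker_hits, outgroup=frozenset(), min_markers=16):
    """Return the sorted names of genomes passing the marker-completeness filter.

    `marker_hits` maps genome name to an iterable of detected marker names.
    Genomes listed in `outgroup` are kept regardless of completeness.
    Hypothetical helper for illustration only.
    """
    kept = []
    for genome, markers in marker_hits.items():
        n_detected = len(set(markers))  # count each marker once
        if genome in outgroup or n_detected >= min_markers:
            kept.append(genome)
    return sorted(kept)


if __name__ == "__main__":
    # Toy data with made-up genome and marker names.
    all_markers = ["marker%02d" % i for i in range(1, 19)]
    hits = {
        "genome_complete": all_markers,
        "genome_16": all_markers[:16],
        "genome_15": all_markers[:15],
        "outgroup_10": all_markers[:10],
    }
    print(select_genomes(hits, outgroup={"outgroup_10"}))
```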
Before phylogenetic inference, the selected proteins were aligned individually by using MAFFT L-INS-i.
Low-quality columns were removed by BMGE (-m BLOSUM30) and the multiple sequence alignments
after quality control were concatenated.
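The final concatenation step can be sketched as below. This is a minimal illustration, not the authors' pipeline: the function name `concatenate_alignments` and the in-memory representation (one dictionary of taxon name to aligned sequence per marker) are assumptions, as is the convention of filling a taxon that lacks a marker with gap characters.

```python
def concatenate_alignments(alignments):
    """Concatenate per-marker alignments into one supermatrix.

    `alignments` is a list of dicts mapping taxon name to aligned sequence.
    Taxa missing from a marker are padded with '-' for that marker's length.
    Hypothetical helper for illustration only.
    """
    taxa = sorted({taxon for aln in alignments for taxon in aln})
    parts = {taxon: [] for taxon in taxa}
    for aln in alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("sequences within one alignment must share a length")
        width = lengths.pop()
        for taxon in taxa:
            parts[taxon].append(aln.get(taxon, "-" * width))
    return {taxon: "".join(chunks) for taxon, chunks in parts.items()}


if __name__ == "__main__":
    marker1 = {"taxonA": "MK", "taxonB": "ML"}
    marker2 = {"taxonA": "GA"}  # taxonB lacks this marker
    print(concatenate_alignments([marker1, marker2]))
```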

Data availability. The authors declare that data supporting the findings of this study are available in
Supplementary Data Files.

Code availability. Scripts of site-exclusion methods are available in Supplementary Code Files.

References

20. Nguyen, L. T., Schmidt, H. A., von Haeseler, A. & Minh, B. Q. IQ-TREE: a fast and effective
    stochastic algorithm for estimating maximum-likelihood phylogenies. Mol Biol Evol 32, 268-274
    (2015).
21. Jermiin, L. S., Jayaswal, V., Ababneh, F. M. & Robinson, J. Identifying optimal models of evolution.
    Methods Mol Biol 1525, 379-420 (2017).
22. Criscuolo, A. & Gribaldo, S. BMGE (Block Mapping and Gathering with Entropy): a new software
    for selection of phylogenetic informative regions from multiple sequence alignments. BMC Evol Biol
    10, 210 (2010).
23. Bowker, A. H. A test for symmetry in contingency tables. J Am Stat Assoc 43, 572-574 (1948).
24. Lartillot, N., Rodrigue, N., Stubbs, D. & Richer, J. PhyloBayes MPI: phylogenetic reconstruction with
    infinite mixtures of profiles in a parallel environment. Syst Biol 62, 611-615 (2013).
25. Nye, T. M., Liò, P. & Gilks, W. R. A novel algorithm and web-based tool for comparing two
    alternative phylogenetic trees. Bioinformatics 22, 117-119 (2006).
26. Kuhner, M. K. & Yamato, J. Practical performance of tree comparison metrics. Syst Biol 64, 205-214
    (2015).
27. Kannan, S., Rogozin, I. B. & Koonin, E. V. MitoCOGs: clusters of orthologous genes from
    mitochondria and implications for the evolution of eukaryotes. BMC Evol Biol 14, 237 (2014).
28. Katoh, K. & Standley, D. M. MAFFT multiple sequence alignment software version 7: improvements
    in performance and usability. Mol Biol Evol 30, 772-780 (2013).
29. Capella-Gutiérrez, S., Silla-Martínez, J. M. & Gabaldón, T. trimAl: a tool for automated alignment
    trimming in large-scale phylogenetic analyses. Bioinformatics 25, 1972-1973 (2009).
30. Eddy, S. R. Accelerated profile HMM searches. PLoS Comput Biol 7, e1002195 (2011).

Acknowledgements This work was financially supported by the National Natural Science Foundation of
China (91851210, 41530105 and 81774152), the European Research Council (ERC 666053), the Shenzhen
Key Laboratory of Marine Archaea Geo-Omics, Southern University of Science and Technology
(ZDSYS201802081843490), the Shenzhen Science and Technology Innovation Commission
(JCYJ20180305123458107), the VW foundation (93 046), and the Laboratory for Marine Geology,
Qingdao National Laboratory for Marine Science and Technology (MGQNLM-TD201810).

Author Contributions L.F., W.F.M. and R.Z. conceived this study. L.F., D.W., V.G., J.X., Y.X. and S.G.
were involved in data analysis. L.F., D.W., V.G., C.Z., W.F.M. and R.Z. interpreted the results and drafted
the manuscript. All authors participated in the critical revision of the manuscript.

Competing interests The authors declare no competing interests.