Professional Documents
Culture Documents
PLAZA 3.0: An Access Point For Plant Comparative Genomics
PLAZA 3.0: An Access Point For Plant Comparative Genomics
Received September 03, 2014; Revised October 02, 2014; Accepted October 03, 2014
* To whom correspondence should be addressed. Tel: + 32 9 3313822; Fax: + 32 9 33 13 809; Email: klaas.vandepoele@psb.vib-ugent.be
†
The authors wish it to be known that, in their opinion, the first two authors should be regarded as joint First Authors.
C The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which
permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
Nucleic Acids Research, 2015, Vol. 43, Database issue D975
multispecies context. Platforms focusing on gene families served gene content and order), which were pre-computed
rely on grouping homologous (derived from a common an- using i-ADHoRe 3.0 (24) and stored in the database.
cestor) genes (9) and within a family detailed phylogenetic On the PLAZA portal each data type has its own page,
reconstructions are possible (10). Less common are tools with an intuitive and consistent layout. The top of the page
that look at genes in their genomic context to study cross- highlights the most general information, with more specific
species genome evolution and WGDs (11). Finally, compre- and detailed information further down the page. Numerous
hensive platforms were created (12–16) which, in contrast hyperlinks are present to allow users to go from one type of
to genome browsers, integrate numerous types of informa- data to another (e.g. from a gene to its family or orthologs,
tion (e.g. gene families, phylogenetic trees and genomic ho- or from a gene family to a phylogenetic tree). Every page
mology) along with structural and functional annotation, also has its own toolbox, which provides links to additional
providing a versatile starting point for numerous types of analyses and detailed visualizations (Figure 1).
analyses, going from simple sequence retrieval over explor- Expert users can download all sequences, gene families,
ing genomic variation to tracing the effects of large-scale orthology information and functional annotation data in
duplications. bulk from an FTP server, while the PLAZA Workbench en-
In this manuscript, we present version 3.0 of PLAZA ables the efficient retrieval of sequence or functional infor-
(http://bioinformatics.psb.ugent.be/plaza/), an online re- mation for a set of genes. The latter also allows performing
source that offers comparative genomics data for 37 plant additional analyses, such as GO enrichment, which can be
Figure 1. Gene page in PLAZA 3.0. From top to bottom, (A) the header with the menu and search functions, (B) the general information (links to the gene
family, alternative gene names or identifiers) and Descriptions for gene AT1G01720, (C) the toolbox that allows different actions to be performed, (D)
tabs with detailed functional information and (E) the footer with links to general information. A similar layout is consistently used on all pages describing
different data types.
Nucleic Acids Research, 2015, Vol. 43, Database issue D977
The fraction of genes that have a description in five se- Genome evolution
lected species is shown in Figure 3. For Arabidopsis thaliana
Collinearity, defined as conservation of gene content and
extensive annotation efforts assigned descriptions to the
order, has been used in PLAZA to determine homologous
majority of the genes and while these efforts have been
regions between genomes and duplicated regions within
transferred to closely related Brassicaceae species, for more
a genome. The latter are usually remnants of large-scale
distant species proper descriptions are often lacking. In con-
duplication events and various studies have revealed that
trast with other platforms and earlier PLAZA releases, now
traces of WGDs are present in all plant genomes sequenced
a large fraction of genes have an AnnoMine gene descrip-
to date (49). However, as gene loss and rearrangements ac-
tion (63% of the dicot and monocot protein-coding genes),
cumulate after such an event, the detection of WGDs using
including many genes from non-model plants. Although
collinearity becomes increasingly difficult as their ages in-
this text-mining procedure cannot replace expert annota-
crease (50). Therefore, in some cases, collinearity is a sub-
tors, it provides a valuable functional indication in the ab-
optimal measure to detect remnants of ancient duplications.
sence of a curated description.
To overcome this limitation, PLAZA 3.0 now also includes
Nucleic Acids Research, 2015, Vol. 43, Database issue D979
information on syntenic duplicates, which are paralogs from as the implementation of new transfer methods, PLAZA
regions with conserved gene content regardless of the or- 3.0 now offers comprehensive functional annotation for all
der (51). As such, an additional 125 266 and 55 277 genes species included.
were found to be putatively derived from WGDs that were
not found by the default collinearity searches in the di-
SUPPLEMENTARY DATA
cots and monocots versions, respectively (Supplementary
Method 3). Supplementary Data are available at NAR Online.
Haeussler,M. et al. (2014) The UCSC Genome Browser database: Repeated polyploidization of Gossypium genomes and the evolution
2014 update. Nucleic Acids Res., 42, D764–D770. of spinnable cotton fibres. Nature, 492, 423–427.
8. Abeel,T., Van Parys,T., Saeys,Y., Galagan,J. and Van de Peer,Y. 28. Myburg,A.A., Grattapaglia,D., Tuskan,G.A., Hellsten,U.,
(2012) GenomeView: a next-generation genome browser. Nucleic Hayes,R.D., Grimwood,J., Jenkins,J., Lindquist,E., Tice,H., Bauer,D.
Acids Res., 40, e12. et al. (2014) The genome of Eucalyptus grandis. Nature, 510, 356–362.
9. Rouard,M., Guiga,V., Aluome,C., Laporte,M.A., Droc,G., Walde,C., 29. The Tomato Genome Consortium. (2012) The tomato genome
Zmasek,C.M., Perin,C. and Conte,M.G. (2011) GreenPhylDB v2.0: sequence provides insights into fleshy fruit evolution. Nature, 485,
comparative and functional genomics in plants. Nucleic Acids Res., 635–641.
39, D1095–D1102. 30. Xu,X., Pan,S., Cheng,S., Zhang,B., Mu,D., Ni,P., Zhang,G., Yang,S.,
10. Powell,S., Forslund,K., Szklarczyk,D., Trachana,K., Roth,A., Li,R., Wang,J. et al. (2011) Genome sequence and analysis of the
Huerta-Cepas,J., Gabaldon,T., Rattei,T., Creevey,C., Kuhn,M. et al. tuber crop potato. Nature, 475, 189–195.
(2014) eggNOG v4.0: nested orthology inference across 3686 31. Dohm,J.C., Minoche,A.E., Holtgrawe,D., Capella-Gutierrez,S.,
organisms. Nucleic Acids Res., 42, D231–D239. Zakrzewski,F., Tafer,H., Rupp,O., Sorensen,T.R., Stracke,R.,
11. Lee,T.H., Tang,H., Wang,X. and Paterson,A.H. (2013) PGDD: a Reinhardt,R. et al. (2014) The genome of the recently domesticated
database of gene and genome duplication in plants. Nucleic Acids crop plant sugar beet (Beta vulgaris). Nature, 505, 546–549.
Res., 41, D1152–D1158. 32. Peach Genome Initiative,International, Verde,I., Abbott,A.G.,
12. Flicek,P., Amode,M.R., Barrell,D., Beal,K., Billis,K., Brent,S., Scalabrin,S., Jung,S., Shu,S., Marroni,F., Zhebentyayeva,T.,
Carvalho-Silva,D., Clapham,P., Coates,G., Fitzgerald,S. et al. (2014) Dettori,M.T., Grimwood,J. et al. (2013) The high-quality draft
Ensembl 2014. Nucleic acids research, 42, D749–D755. genome of peach (Prunus persica) identifies unique patterns of
13. Lyons,E., Pedersen,B., Kane,J., Alam,M., Ming,R., Tang,H., genetic diversity, domestication and genome evolution. Nat. Genet.,
Wang,X., Bowers,J., Paterson,A., Lisch,D. et al. (2008) Finding and 45, 487–494.
48. UniProt Consortium. (2014) Activities at the Universal Protein tandem origin of angiosperm-specific MADS-box genes. Nat.
Resource (UniProt). Nucleic Acids Res., 42, D191–D198. Commun., 4, 2280.
49. Vanneste,K., Maere,S. and Van de Peer,Y. (2014) Tangled up in two: 52. Quevillon,E., Silventoinen,V., Pillai,S., Harte,N., Mulder,N.,
a burst of genome duplications at the end of the Cretaceous and the Apweiler,R. and Lopez,R. (2005) InterProScan: protein domains
consequences for plant evolution. Philos. Trans. R. Soc. Lond. B Biol. identifier. Nucleic Acids Res., 33, W116–W120.
Sci., 369. 53. Guindon,S. and Gascuel,O. (2003) A simple, fast, and accurate
50. Van de Peer,Y., Fawcett,J.A., Proost,S., Sterck,L. and Vandepoele,K. algorithm to estimate large phylogenies by maximum likelihood. Syst.
(2009) The flowering world: a tale of duplications. Trends Plant Sci., Biol., 52, 696–704.
14, 680–688.
51. Ruelens,P., de Maagd,R.A., Proost,S., Theissen,G., Geuten,K. and
Kaufmann,K. (2013) FLOWERING LOCUS C in monocots and the