
Available online at www.sciencedirect.com

Exploring the viral world through metagenomics


Karyna Rosario and Mya Breitbart

Viral metagenomics, or shotgun sequencing of purified viral particles, has revolutionized the field of environmental virology by allowing the exploration of viral communities in a variety of sample types throughout the biosphere. The introduction of viral metagenomics has demonstrated that dominant viruses in environmental communities are not well-represented by the cultured viruses in existing sequence databases. Viral metagenomic studies have provided insights into viral ecology by elucidating the genetic potential, community structure, and biogeography of environmental viruses. In addition, viral metagenomics has expanded current knowledge of virus–host interactions by uncovering genes that may allow viruses to manipulate their hosts in unexpected ways. The intrinsic potential for virus discovery through viral metagenomics can help advance a wide array of disciplines including evolutionary biology, pathogen surveillance, and biotechnology.

Address
University of South Florida, College of Marine Science, 140 7th Avenue South, Saint Petersburg, FL 33701, United States

Corresponding author: Breitbart, Mya (mya@marine.usf.edu)

Current Opinion in Virology 2011, 1:289–297
This review comes from a themed issue on Viral genomics
Edited by Elodie Ghedin and Christopher Upton
Available online 2nd July 2011
1879-6257/$ – see front matter
© 2011 Elsevier B.V. All rights reserved.
DOI 10.1016/j.coviro.2011.06.004

Introduction
The process of viral particle purification followed by shotgun sequencing, known as viral metagenomics, has enabled exploration of the genetic diversity contained within the most abundant biological entities in the biosphere. Viral metagenomics circumvents limitations associated with traditional virus characterization methods, such as PCR or microarrays, and the power of this method resides in targeting total viral nucleic acids without the need for ‘a priori’ knowledge of the viral types present [1,2]. In contrast to standard metagenomic approaches that sequence total DNA [3,4], the initial viral particle purification step in viral metagenomic projects ensures that most of the sequencing effort results in the detection of viral nucleic acids. This concept was first applied to discover viruses in serum samples from patients with diseases of unknown etiology by amplifying viral nucleic acids that had been digested with restriction enzymes before amplification and sequencing [5]. The first study investigating viral communities in an environmental sample through metagenomics improved these methods by mechanically shearing DNA to allow the random amplification of total viral nucleic acids regardless of restriction sites [6]. The introduction of next generation sequencing technologies has significantly increased the throughput of viral metagenomics in the past five years [7]. Almost a decade after the first environmental viral metagenome was sequenced, viral metagenomics has proven to be an indispensable tool for understanding viral community structure and discovering novel genes and viruses (Figure 1).

Metagenomics ‘dark matter’: the vast unknown
To date there are 24 published studies investigating viral communities in environmental samples through viral metagenomics (Table 1). It is clear that environmental viral communities are different from well-characterized viruses as studies consistently report more than 50% of the sequences as unknown. Environmental viral metagenomic studies have produced more than 810 Mbp of data for which there are no similarities in GenBank. These sequences, representing about 70% of the data that has been generated through environmental viral metagenomics, remain in an ever-growing pool of unknowns. Further analysis of this unknown pool is complicated by the lack of a centralized database for submitting and/or retrieving viral metagenomic data (Table 1). Moreover, many of the ‘known’ viral sequences in environmental viromes have low amino acid similarities (<50%) to known viral proteins and, thus, represent undescribed viral species [8,9]. The vast novelty of this genetic information has brought attention to viruses, especially phages (i.e. viruses that infect prokaryotes) [7,10], as the largest reservoir of untapped genetic information in the biosphere.

The high percentage of unknown sequences in environmental viromes is consistent with findings from cultured virus genomes. Approximately 30% of open reading frames (ORFs) in sequenced viral genomes are ORFans, which have no homologs in the known prokaryotic or viral world, compared to only 9% ORFans in bacterial genomes [11]. These observations are further supported by the high percentage of unknown sequences in environmental viromes (>50%) compared to microbial metagenomes (10%) [1], suggesting that viral diversity is significantly larger than that among bacteria. These findings highlight our lack of understanding of viral proteins and stress the


Figure 1

[Figure 1 diagram. Central element: VIRAL NUCLEIC ACIDS. Application areas: INDUSTRY [Functional Viral Metagenomics]; DISCOVERY; SURVEILLANCE [Vector-Enabled Metagenomics]; ECOLOGY. Other labels: Novel Enzymes; Annotation of Unknown Genes; Emerging Pathogens; New Viral Species; Evolutionary Gaps; Horizontal Gene Transfer; Virus-Host Interactions.]

Applications of viral metagenomics. Tapping into environmental viral nucleic acids has allowed the scientific community to gain a better understanding
of viral ecology (i.e. viral diversity, community structure, and biogeography); discover novel viral species that may help elucidate evolutionary gaps;
investigate virus–host interactions (i.e. antiviral responses and lysogeny); identify auxiliary metabolic genes obtained through horizontal gene transfer
that may affect host functionality; design virus surveillance strategies for emerging pathogens; and discover viral enzymes with commercial and/or
biomedical value.

need for biochemical data which overwhelmingly lag behind genomics in this sequencing era.

Despite the inability to identify most of the sequences produced through viral metagenomics, studies have taken advantage of this technique to catalogue viruses in environmental samples based on the identifiable sequences and investigate community composition through statistical analyses (see [12–14] for an overview of metagenomic data analysis). The complete genomes of novel viruses have been assembled from environmental viromes [15–17,18], and mathematical modeling based on viral metagenomic sequences has yielded insight into community structure and biogeography [6,7]. In addition, several studies are now starting to link metagenomic data from viruses with potential hosts based on genomic signatures and phylogenetic analyses [17,19–21]. Through these routes, metagenomic studies have re-fueled the field of viral ecology, providing unprecedented views into viral diversity and virus–host interactions, and leading to the discovery of novel viruses.

Metagenomics and viral ecology
Data generated through viral metagenomic studies have been used to tackle questions, such as viral biogeography, that could not be addressed with traditional methods. Amplification of signature genes for specific viral groups suggested that certain viruses are found globally in


Table 1

Published environmental viral metagenomes

Sample type a Location Sequencing b Target Database c Accession number


Coastal seawater [6] California S dsDNA GSS AY079522.1–AY079585.1AY080585.1 (SP)
BH898061.1–BH898933.1 (MB)
Coastal sediment [54] California S dsDNA GSS CC821301.1–CC822456.1
Open ocean and Sargasso Sea, British Columbia, P ssDNA, dsDNA CAMERA CAM_PROJ_MarineVirome
coastal waters [7] Gulf of Mexico, and Arctic
Coastal seawater [55] Canada S RNA GSS DX421099.1–DX421142.1 (SOG)
DX420985.1–DX421098.1 (JP)
Estuary [56] Chesapeake Bay S dsDNA CAMERA CAM_PROJ_CBVIRIO
Soil [25] Peru (Rainforest), California S dsDNA GSS ER781257–ER785833
(Desert), Kansas (Prairie)
Aquaculture system and Salton Sea and Chula Vista, P ssDNA, dsDNA SEED See footnote d
solar salterns [57] California
Saltern [58] Alicante, Spain S dsDNA GenBank DQ238866 (EHP-1); EF503711–EF503714 (EHP-2; EHP-3)
Rice paddy soil [59] Daejeon, Korea S ssDNA, dsDNA GenBank ABQX01000001–ABQX01000878
Hot springs [23] Yellowstone, USA S dsDNA CAMERA CAM_PROJ_ViralSpring
Marine and freshwater Bahamas and Mexico P ssDNA, dsDNA SEED 4440323.3 (Highborne Cay)
microbialites [26] 4440320.3 (PozasAzules)
4440321.3 (Rio Mesquites)
EBPR sludge [30] Madison, Wisconsin S dsDNA IMG/M Taxon ID: 2007300000
Estuary [34] Tampa Bay, Florida P ssDNA, dsDNA SEED 4440102.3
Coral (P. compressa) [60] Kane’ohe Bay, Hawaii P ssDNA, dsDNA SEED 4440376.3 (t = 0); 4440374.3 (control); 440370.3 (DOC);
440371.3 (pH); 44403777.3 (nutrient); 4440375.3 (temperature)

Coral (D. strigosa) [61] Mount Irvine Bay in S dsDNA WGS ABVU01000001–ABVU01001580 (DsH)
Buccoo, Tobago ABVT01000001–ABVT01000930 (DsB)
Hydrothermal vents [62] East Pacific Rise S dsDNA GSS ED017434–ED017695 (RAPD)
ED017696–ED017873 (LASL)
Reclaimed and Manatee County, Florida P ssDNA, dsDNA, RNA CAMERA CAM_PROJ_ReclaimedWaterViruses
potable water [9]
Man-made lake [8] Maryland S, P RNA ftp site ftp://ftp.jcvi.org/pub/data/lakemeta
Antarctic lake [63] Lake Limnopolar P ssDNA, dsDNA SEED 4441778.3 (Spring); 4441558.3 (Summer)
Saltern [21] Alicante, Spain S dsDNA GenBank GU735099–GU735367; GU735369–GU735406; HM030731–HM030733
Fermented food [64] Korea P dsDNA SRA SRP002583
Activated sludge [65] Auburn, Alabama S dsDNA TA 2251203077–2251204802
Hypersaline lake [24] Senegal S ssDNA, dsDNA GSS GS926706–GS927688
Hydrothermal vents [28] Juan de Fuca Ridge P dsDNA CAMERA CAM_SMPL_A0003
a References for each metagenomic study are listed with the sample type.
b Refers to the sequencing technology used, that is, Sanger (S) or Pyrosequencing (P).
c Indicates the database where metagenomic sequences can be found including GenBank’s Genome Survey Sequence (GSS), Whole Genome Sequences (WGS), Short Read Archive (SRA), and Trace Archive (TA), as well as platforms dedicated to analyze metagenomic datasets including CAMERA (http://camera.calit2.net/), SEED (ftp site: ftp://ftp.theseed.org), and Integrated Microbial Genomes (IMG/M: http://img.jgi.doe.gov/cgi-bin/m/main.cgi).
d The following libraries can be downloaded from SEED: 4440436.3 (low salinity; D–V), 4440432.3 (low salinity; E–V), 4440420.3 (low salinity; F–V), 4440427.3 (medium salinity; K–V), 4440428.3 (medium salinity; I–V), 4440431.3 (medium salinity; G–V), 4440417.3 (medium salinity; J–V), 4440421.3 (high salinity; L–V), 4440145.4 (high salinity; N–V), 4440144.4 (high salinity; M–V), 4440414.3 (freshwater; B2–V), and 4440424.3 (freshwater; A–V).

extremely different environments supporting the hypothesis that ‘everything is everywhere’ [22]. Metagenomics opened the opportunity to test this hypothesis using genetic information from the viral community as a whole rather than using specific, well-known viral groups. Statistical comparisons between different marine DNA viromes suggested that the vast majority of viral types are globally distributed in seawater but the relative abundance of viral types differs between locations [7]. Viral metagenomics has also shown that specific viral sequences are locally and globally conserved in hot springs with different physical and biogeochemical properties [23] as well as in geographically distant hypersaline environments [24]. However, this global distribution may not hold true for all environments. Comparisons between soil viromes indicated that the viral communities in different soil types are genetically unique with little or no phylogenetic overlap [25]. Likewise, metagenomic data indicated that freshwater and marine microbialites from different locations have distinct viral communities [26]. Initial data suggest that viral biogeography depends on the environmental matrix; however, deeper sequencing may reveal global distribution of viruses.

Metagenomics can also provide insight into virus–host interactions, which is critical for understanding the co-evolution of host and viral genomes. The clustered regularly interspaced short palindromic repeat (CRISPR) system is an antiviral mechanism found in archaea and bacteria in which genomic sequences from predatory viruses are integrated as CRISPR spacers in the host genome, rendering the host ‘immune’ to these viruses [27]. CRISPR spacer sequences therefore provide a direct link between viruses and their uncultivated prokaryotic hosts. Studies have taken advantage of this CRISPR feature to elucidate potential microbial hosts of viral assemblages [28,29] and determine if a given population has been previously infected by specific phages [30]. Comparison of environmental viral communities with their respective microbial host populations through metagenomics has suggested that silent mutations and extensive recombination in viral genomes may help viruses evade the CRISPR system [29,31].

Another important virus–host interaction explored through metagenomics is the stable symbiotic state known as lysogeny. Lysogeny also involves the integration of viral sequences into the prokaryotic host genome, but in this case the whole viral genome is integrated. This association may be beneficial to both host and virus and is thought to be more prevalent in oligotrophic and/or extreme environments [32]. In hydrothermal vents, metagenomic studies suggest that temperate phages (phages with the ability to integrate into host genomes) within a vent site are a less diverse subset of the virioplankton community [33]. Data gathered from temperate marine phage metagenomes have been used to design quantitative assays to examine the expression levels of specific lysogeny-related genes [34].

Cellular genes in viral metagenomes
Sequences similar to metabolic genes from cellular organisms are consistently recovered in environmental viromes. This finding is unexpected since viral particles are purified before processing, and initially raised concerns regarding possible host contamination in the viromes. These metabolic genes may also represent gene transfer agents (i.e. virus-like particles involved in bacterial gene transfer that package random fragments of host DNA) [35,36]. However, metabolic genes have been detected in the genomes of completely cultured phages and eukaryotic viruses, where they are known as auxiliary metabolic genes (AMGs) [37]. By investigating the function and genomic context of AMGs, bioinformatic studies have documented that there is little restriction to the types of AMGs carried by environmental viruses [10,38,39,40]. Data-mining studies are constantly expanding the suite of known viral AMGs and suggest that the acquisition of metabolic genes by viruses is not random, with viromes enriched in AMGs that are beneficial for survival in a given environment [33,39]. Metagenomic data have been used to design quantitative assays to detect environmentally relevant AMGs, such as genes related to photosynthesis and nutrient stress, revealing that some of these genes can be found on the order of 10^6 copies per liter in the marine environment [33].

Most studies of AMGs have focused on genes that are relevant for oceanic nutrient and biogeochemical cycles. However, viromes contain a diversity of AMGs, such as those involved in motility and chemotaxis, antioxidation, transcriptional repression, translation and post-translation modification, which suggests that viruses manipulate their hosts in unforeseen ways [39,40,41]. Recent discovery of a phage gene homologous to peptide deformylases and other genes involved in translation suggests that phages may control both phage and host protein expression at the translational level [39]. Furthermore, viral metagenomics resulted in the discovery of a phage-encoded transcriptional repressor that can potentially help the phage evade host antiviral mechanisms [18]. The diversity and widespread distribution of AMGs in different environmental viromes provide further evidence of rampant horizontal gene transfer between hosts and viruses [42,43] by shedding light on genes not commonly associated with the viral gene pool that may shape the evolution and functionality of both prokaryotic and eukaryotic hosts.

Metagenomics and virus discovery
In addition to the dsDNA phages typically thought to dominate environmental viral communities, viral metagenomics has uncovered the presence of viral types not previously described in certain environments (Table 2).
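
As an illustration of the CRISPR-based host assignment discussed above under virus–host interactions, the following minimal Python sketch links viral contigs to candidate hosts by searching for host spacer sequences in the contigs on either strand. It is not the method of any cited study: the function names, host labels, and sequences are hypothetical toy data, and exact string matching stands in for the tolerant alignment searches that a real analysis would more likely use.

```python
# Minimal sketch (assumption: exact matching; real studies would tolerate mismatches).
# All names and sequences below are hypothetical toy data, not taken from the review.

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))


def link_viruses_to_hosts(spacers_by_host, viral_contigs):
    """Map each viral contig ID to the sorted list of hosts whose spacers it contains."""
    links = {}
    for contig_id, contig in viral_contigs.items():
        hosts = set()
        for host, spacers in spacers_by_host.items():
            for spacer in spacers:
                # A spacer may match either strand of the viral contig.
                if spacer in contig or reverse_complement(spacer) in contig:
                    hosts.add(host)
                    break
        if hosts:
            links[contig_id] = sorted(hosts)
    return links


# Hypothetical CRISPR spacers recovered from two host genomes.
spacers_by_host = {
    "host_A": ["ACGTTGCAAGGCTA"],
    "host_B": ["TTTTGGGGCCCCAA"],
}

# Hypothetical viral contigs assembled from a virome.
viral_contigs = {
    "contig_1": "GGGACGTTGCAAGGCTATTT",  # carries the host_A spacer (forward strand)
    "contig_2": "CCTTGGGGCCCCAAAAGG",    # carries the host_B spacer (reverse strand)
    "contig_3": "ATATATATATATATAT",      # no spacer match
}

links = link_viruses_to_hosts(spacers_by_host, viral_contigs)
print(links)
```

A contig with no spacer match, such as contig_3 here, simply stays unassigned, which mirrors the situation for the many environmental viruses whose hosts carry no matching spacer in available genomes.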


Evolutionary biologists can use genetic information from novel viruses to investigate evolutionary links within viral families as well as between viruses and their hosts [35,41,44]. Single-stranded DNA (ssDNA) viruses are gaining interest as metagenomic studies incorporating multiple displacement amplification have recently expanded the known host range and environmental distribution of these viruses. Among ssDNA phages, Microviridae have been identified in a wide range of environments while Inoviridae have only been identified in low abundance in a few viromes. Eukaryotic circular ssDNA viruses were only known to infect plants, birds, and mammals but their wide distribution in environmental viromes suggests that these viruses may infect a broader range of hosts. This is supported by the recent identification of circoviruses in invertebrates through

Table 2

DNA viral families identified in environmental samples and organisms through viral metagenomic approaches a

Viral family Environment or organism Known host Refs


Single-stranded DNA
Anelloviridae Harbor seals, Humans, Mosquitoes, Sea lions Mammals [66,67,48,68]
Gyrovirus genus Sea turtles Birds [69]
Circoviridae Antarctic lake, Man-made lake, Reclaimed water, Mammals/Birds [63,8,9,55,16,59,70,47,45,48]
British Columbia coastal waters, Chesapeake Bay,
Sargasso Sea, Rice paddy soil, Bats, Chimpanzees,
Dragonflies, Mosquitoes
Microviridae Antarctic lake, Reclaimed water, British Columbia Bacteria [63,9,7,6,34,59,70,26]
and California coastal waters, Gulf of Mexico,
Sargasso Sea, Tampa Bay, Rice paddy soil, Bats,
Freshwater and marine microbialites
Inoviridae Antarctic lake, Reclaimed and potable water, Bacteria [63,9,7,54,64]
Arctic Ocean, Marine sediment, Fermented food
Nanoviridae Antarctic lake, Reclaimed water, Strait of Georgia, Plants [63,9,55,34,59,47,60,48]
Tampa Bay, Rice paddy soil, Chimpanzees, Corals,
Mosquitoes
Geminiviridae Antarctic lake, Reclaimed water, Rice paddy soil, Plants [63,9,59,47,60,48]
Chimpanzees, Corals, Mosquitoes
Parvoviridae Bats, Corals, Mosquitoes, Turkeys, Fermented food Insects/Mammals [70,60,48,71,64]
Double-stranded DNA
Siphoviridae All DNA metagenomes Bacteria/Archaea –
Podoviridae All DNA metagenomes Bacteria –
Myoviridae All DNA metagenomes Bacteria/Archaea –
Rudiviridae Antarctic lake, Hot springs, Fermented food Archaea [63,23,64]
Fuselloviridae Hot springs, Fermented food Archaea [23,64]
Herpesviridae Antarctic lake, Reclaimed water, Hot springs, Mammals [63,9,23,7,60,61,48,64]
Arctic Ocean, Gulf of Mexico, Corals, Mosquitoes,
Fermented food
Mimiviridae Antarctic lake, Reclaimed water, Arctic Ocean, Algae/Protists [63,9,7,28,60,64]
Hydrothermal vent, Gulf of Mexico, Corals,
Fermented food
Phycodnaviridae Antarctic lake, Reclaimed water, Arctic Ocean, Algae [63,9,7,56,28,60,61,64]
Chesapeake Bay, Hydrothermal vent, Corals,
Fermented food
Iridoviridae Antarctic lake, Reclaimed water, British Columbia Fish/Amphibians/Insects [63,9,7,28,60,48,64]
coastal waters, Hydrothermal vent, Corals,
Mosquitoes, Fermented food
Poxviridae Antarctic lake, Reclaimed water, Hydrothermal vent, Mammals/Insects [63,9,7,28,70,60,72,48]
Bats, Corals, Horses, Mosquitoes
Nimaviridae Antarctic lake, Arctic Ocean, Corals Crustaceans [63,7,60]
Adenoviridae Bats, Corals, Fermented food Mammals [70,60,7]
Asfarviridae Antarctic lake, Arctic Ocean, Corals Mammals [63,7,60]
Polydnaviridae Antarctic lake, Corals Insects [63,60]
Baculoviridae Antarctic lake, Arctic Ocean, Corals Insects/Crustaceans [63,7,60]
Papillomaviridae Arctic Ocean, Corals, Mosquitoes Mammals [7,60,48]
Lipothrixviridae Antarctic lake Archaea [63]
Plasmaviridae Antarctic lake Bacteria [63]
Ascoviridae Antarctic lake, Fermented food Insects [63,64]
Tectiviridae Rice paddy soil, Fermented food Bacteria [59,64]
Caulimoviridae Corals, Fermented food Plants [60,64]
a Table does not include viral families identified in tissues from humans exhibiting disease symptoms.


metagenomics ([45], Dunlap et al., unpublished data). The discovery of genomes exhibiting mixed characteristics from different ssDNA viral families [16,47,48] in metagenomic data suggests that the evolutionary history among eukaryotic ssDNA viruses may be more complicated than previously recognized [46].

Metagenomics has also resulted in the discovery of RNA viruses in the environment (Table 3). The few published RNA viral metagenomes (from coastal waters, a man-made lake, reclaimed water, humans, bats, and turkey guts) suggest that positive-sense eukaryotic ssRNA viruses dominate the RNA viral community regardless of the environment. This mirrors the nonuniform distribution of known RNA viruses where positive-sense ssRNA viruses are more diverse (encompassing 108 genera and 27 families) than the other types of RNA viruses [49]. The diversity of positive-sense RNA viruses, in particular the picorna-like viruses, has been explained by a potential early diversification of these viruses predating the divergence of eukaryotic supergroups [44].

Concluding remarks
Viral metagenomics has been instrumental for understanding viral ecology and exciting advances in this field will benefit numerous disciplines (Figure 1). Scientists are now developing functional viral metagenomics to discover novel viral enzymes that can be used for diagnostic and biotechnological purposes [50]. Although this field focuses on the discovery of enzymes with clinical or commercial importance, it will likely improve the annotation of viral genes with unknown functions since specific functions are identified directly from the virome without relying on similarities in existing databases [51]. Surveillance for viral pathogens of importance for public health and food security is another area that can take advantage of viral metagenomics. Targeting viruses in insect vectors known to transmit human and plant diseases through vector-enabled metagenomics allows for

Table 3

RNA viral families identified in environmental samples and organisms through viral metagenomic approaches a

Viral family Environment or organism Known host Refs


Positive (+) and negative (−) sense single-stranded RNA
+ Retroviridae Arctic Ocean, Corals, Fermented food Birds/Mammals [7,60,64]
Comovirinae Man-made lake, Reclaimed water, English Bay Plants [8,9,55]
Dicistroviridae Man-made lake, Reclaimed water, English Bay, Insects [8,9,55,70]
Strait of Georgia, Bats
Marnaviridae Man-made lake, English Bay Protists [8,55]
Picornaviridae Man-made lake, Reclaimed water, English Bay, Birds/Mammals [8,9,55,70,73,71]
Bats, Ringed seal, Turkeys
Sequiviridae Man-made lake, Reclaimed water Plants [8,9]
Iflavirus Man-made lake, Bats Insects [8,70]
Tombusviridae Man-made lake, Reclaimed water, Plants [8,9,55,74]
Strait of Georgia, Humans
Umbravirus Man-made lake, Strait of Georgia Plants [8,55]
Nodaviridae Man-made lake, Bats Fish/Insects [8,70]
Sobemovirus Man-made lake, Bats, Humans Plants [8,70,74]
Hepeviridae Man-made lake Birds/Mammals [8]
Flaviviridae Man-made lake Mammals [8]
Virgaviridae Man-made lake, Reclaimed water, Humans Plants [8,9,74]
Flexiviridae Man-made lake Plants [8]
Potyviridae Reclaimed water Plants [9]
Nora virus Man-made lake, Reclaimed water Insects [8,9]
Leviviridae Man-made lake, Turkeys Bacteria [8,71]
Tymoviridae Man-made lake, Bats, Humans, Turkeys Plants [8,70,74,71]
Tetraviridae Man-made lake, Bats Insects [8,70]
Astroviridae Bats, Turkeys Birds/Mammals [70,71]
Luteoviridae Man-made lake, Bats Plants [8,70]
Secoviridae Bats Plants [8]
Coronaviridae Arctic Ocean, Bats Birds/Mammals [7,70]
Bromoviridae Man-made lake, Humans Plants [8,74]
Orthomyxoviridae Man-made lake Birds/Mammals [8]
Double-stranded RNA
Picobirnavirus Man-made lake, Humans, Turkeys Mammals [8,74,71]
Reoviridae Man-made lake, Reclaimed water, English Bay Birds/Mammals/Insects/ [8,9,55]
Plants/Fungi
Partiviridae Man-made lake, Bats Plants/Protists/Fungi [8,70]
a Table does not include viral families identified in tissues from humans exhibiting disease symptoms.


surveillance of the viruses circulating in a given area and provides a preemptive strategy for identifying emerging pathogens [48,52]. Furthermore, metagenomic data from human feces have been used to identify novel bioindicators of fecal pollution with applications for monitoring water quality [9,53]. Overall, viral metagenomics has provided the scientific community with tools to tap into the massive environmental virome which holds the key for elucidating viral diversity, understanding viral evolution, and developing assays with important societal applications.

Acknowledgements
This research was supported by grants to MB from the National Science Foundation (DEB-1025915, OCE-1049670) and the Department of Energy Genomes to Life Program. The authors would like to thank Yahayra Rosario-Cora for the illustration in Figure 1.

References and recommended reading
Papers of particular interest, published within the period of review, have been highlighted as:
• of special interest
•• of outstanding interest

1. Edwards RA, Rohwer F: Viral metagenomics. Nat Rev Microbiol 2005, 3:504-510.
2. Delwart EL: Viral metagenomics. Rev Med Virol 2007, 17:115-131.
3. Roossinck MJ, Saha P, Wiley GB, Quan J, White JD, Lai H, Chavarria F, Shen GA, Roe BA: Ecogenomics: using massively parallel pyrosequencing to understand virus ecology. Mol Ecol 2010, 19:81-88.
4. Cox-Foster DL, Conlan S, Holmes EC, Palacios G, Evans JD, Moran NA, Quan PL, Briese T, Hornig M, Geiser DM et al.: A metagenomic survey of microbes in honey bee colony collapse disorder. Science 2007, 318:283-287.
5. Allander T, Emerson SU, Engle RE, Purcell RH, Bukh J: A virus discovery method incorporating DNase treatment and its application to the identification of two bovine parvovirus species. Proc Natl Acad Sci U S A 2001, 98:11609-11614.
6. Breitbart M, Salamon P, Andresen B, Mahaffy JM, Segall AM, Mead D, Azam F, Rohwer F: Genomic analysis of uncultured marine viral communities. Proc Natl Acad Sci U S A 2002, 99:14250-14255.
7. Angly FE, Felts B, Breitbart M, Salamon P, Edwards RA, Carlson C, Chan AM, Haynes M, Kelley S, Liu H et al.: The marine viromes of four oceanic regions. PLoS Biol 2006, 4:e368.
8. Djikeng A, Kuzmickas R, Anderson NG, Spiro DJ: Metagenomic analysis of RNA viruses in a fresh water lake. PLoS One 2009, 4:e7264.
9. Rosario K, Nilsson C, Lim YW, Yijun R, Breitbart M: Metagenomic analysis of viruses in reclaimed water. Environ Microbiol 2009, 11:2806-2820.
10. Comeau AM, Hatfull GF, Krisch HM, Lindell D, Mann NH, Prangishvili D: Exploring the prokaryotic virosphere. Res Microbiol 2008, 159:306-313.
11. Yin YB, Fischer D: Identification and investigation of ORFans in the viral world. BMC Genomics 2008, 9:24.
12. Wooley JC, Ye YZ: Metagenomics: facts and artifacts, and computational challenges. J Comput Sci Technol 2010, 25:71-81.
13. Meyer F, Paarmann D, D’Souza M, Olson R, Glass EM, Kubal M, Paczian T, Rodriguez A, Stevens R, Wilke A et al.: The metagenomics RAST server - a public resource for the automatic phylogenetic and functional analysis of metagenomes. BMC Bioinformatics 2008, 9:386.
14. Seshadri R, Kravitz SA, Smarr L, Gilna P, Frazier M: CAMERA: a community resource for metagenomics. PLoS Biol 2007, 5:394-397.
15. Tucker KP, Parsons R, Symonds EM, Breitbart M: Diversity and distribution of single-stranded DNA phages in the North Atlantic Ocean. ISME J 2011, 5:822-830.
16. Rosario K, Duffy S, Breitbart M: Diverse circovirus-like genome architectures revealed by environmental metagenomics. J Gen Virol 2009, 90:2418-2424.
17. Culley AI, Lang AS, Suttle CA: The complete genomes of three viruses assembled from shotgun libraries of marine RNA virus communities. Virol J 2007, 4:69.
18. Skennerton CT, Angly FE, Breitbart M, Bragg L, He S, McMahon KD, Hugenholtz P, Tyson GW: Phage encoded H-NS: a potential achilles heel in the bacterial defence system. PLoS One 2011, 6:e20095.
    Assembly of viral metagenomic data led to the discovery of a phage genome encoding a putative transcriptional repressor that may allow the phage to repress host proteins involved in antiviral defense.
19. Pride DT, Schoenfeld T: Genome signature analysis of thermal virus metagenomes reveals Archaea and thermophilic signatures. BMC Genomics 2008, 9:420.
20. Kapoor A, Simmonds P, Lipkin WI, Zaidi S, Delwart E: Use of nucleotide composition analysis to infer hosts for three novel picorna-like viruses. J Virol 2010, 84:10322-10328.
21. Santos F, Yarza P, Parro V, Briones C, Anton J: The metavirome of a hypersaline environment. Environ Microbiol 2010, 12:2965-2976.
22. Breitbart M, Rohwer F: Here a virus, there a virus, everywhere the same virus? Trends Microbiol 2005, 13:278-284.
23. Schoenfeld T, Patterson M, Richardson PM, Wommack KE, Young M, Mead D: Assembly of viral metagenomes from Yellowstone hot springs. Appl Environ Microbiol 2008, 74:4164-4174.
24. Sime-Ngando T, Lucas S, Robin A, Tucker KP, Colombet J, Bettarel Y, Desmond E, Gribaldo S, Forterre P, Breitbart M et al.: Diversity of virus–host systems in hypersaline Lake Retba, Senegal. Environ Microbiol 2010.
25. Fierer N, Breitbart M, Nulton J, Salamon P, Lozupone C, Jones R, Robeson M, Edwards RA, Felts B, Rayhawk S et al.: Metagenomic and small-subunit rRNA analyses reveal the genetic diversity of bacteria, archaea, fungi, and viruses in soil. Appl Environ Microbiol 2007, 73:7059-7066.
26. Desnues C, Rodriguez-Brito B, Rayhawk S, Kelley S, Tran T, Haynes M, Liu H, Furlan M, Wegley L, Chau B et al.: Biodiversity and biogeography of phages in modern stromatolites and thrombolites. Nature 2008, 452:340-343.
27. Horvath P, Barrangou R: CRISPR/Cas, the immune system of bacteria and archaea. Science 2010, 327:167-170.
28. Anderson RE, Brazelton WJ, Baross JA: Using CRISPRs as a metagenomic tool to identify microbial hosts of a diffuse flow hydrothermal vent viral assemblage. FEMS Microbiol Ecol 2011 doi: 10.1111/j.1574-6941.2011.01090.x.
    Used a database of CRISPR spacers from all available prokaryotic genomes to identify potential hosts for viruses sequenced from hydrothermal vents.
29. Andersson AF, Banfield JF: Virus population dynamics and acquired virus resistance in natural microbial communities. Science 2008, 320:1047-1050.
30. Kunin V, He S, Warnecke F, Peterson SB, Martin HG, Haynes M, Ivanova N, Blackall LL, Breitbart M, Rohwer F et al.: A bacterial metapopulation adapts locally to phage predation despite global dispersal. Genome Res 2008, 18:293-297.
31. Heidelberg JF, Nelson WC, Schoenfeld T, Bhaya D: Germ warfare in a microbial mat community: CRISPRs provide insights into the co-evolution of host and viral genomes. PLoS One 2009, 4:e4169.

www.sciencedirect.com Current Opinion in Virology 2011, 1:289–297


32. Paul JH: Prophages in marine bacteria: dangerous molecular time bombs or the key to survival in the seas? ISME J 2008, 2:579-589.

33. Williamson SJ, Rusch DB, Yooseph S, Halpern AL, Heidelberg KB, Glass JI, Andrews-Pfannkoch C, Fadrosh D, Miller CS, Sutton G et al.: The Sorcerer II Global Ocean Sampling Expedition: metagenomic characterization of viruses within aquatic microbial samples. PLoS One 2008, 3:e1456.

34. McDaniel L, Breitbart M, Mobberley J, Long A, Haynes M, Rohwer F, Paul JH: Metagenomic analysis of lysogeny in Tampa Bay: Implications for prophage gene expression. PLoS One 2008, 3:e3263.

35. Kristensen DM, Mushegian AR, Dolja VV, Koonin EV: New dimensions of the virus world discovered through metagenomics. Trends Microbiol 2010, 18:11-19.

36. Lang AS, Beatty JT: Importance of widespread gene transfer agent genes in alpha-proteobacteria. Trends Microbiol 2007, 15:54-62.

37. Breitbart M, Thompson LR, Suttle CA, Sullivan MB: Exploring the vast diversity of marine viruses. Oceanography 2007, 20:135-139.

38. Monier A, Pagarete A, de Vargas C, Allen MJ, Read B, Claverie JM, Ogata H: Horizontal gene transfer of an entire metabolic pathway between a eukaryotic alga and its DNA virus. Genome Res 2009, 19:1441-1449.
Demonstrated the horizontal gene transfer of seven genes that may be involved in sphingolipid biosynthesis, providing the first clear example of the transfer of an entire pathway between phytoplankton and viruses.

39. Sharon I, Battchikova N, Aro E-M, Giglione C, Meinnel T, Glaser F, Pinter RY, Breitbart M, Rohwer F, Beja O: Comparative metagenomics of microbial traits within oceanic viral communities. ISME J 2011, 5:1178-1190.
Developed a strategy to identify viral scaffolds containing microbial genes and detected a diversity of auxiliary metabolic genes amongst marine metagenomes.

40. Dinsdale EA, Edwards RA, Hall D, Angly F, Breitbart M, Brulc JM, Furlan M, Desnues C, Haynes M, Li L et al.: Functional metagenomic profiling of nine biomes. Nature 2008, 452:629-632.

41. Monier A, Claverie JM, Ogata H: Taxonomic distribution of large DNA viruses in the sea. Genome Biol 2008, 9:R106.

42. Liu H, Fu Y, Jiang D, Li G, Xie J, Cheng J, Peng Y, Ghabrial SA, Yi X: Widespread horizontal gene transfer from double-stranded RNA viruses to eukaryotic nuclear genomes. J Virol 2010, 84:11876-11887.

43. Moreira D: Multiple independent horizontal transfers of informational genes from bacteria to plasmids and phages: implications for the origin of bacterial replication machinery. Mol Microbiol 2000, 35:1-5.

44. Koonin EV, Wolf YI, Nagasaki K, Dolja VV: The Big Bang of picorna-like virus evolution antedates the radiation of eukaryotic supergroups. Nat Rev Microbiol 2008, 6:925-939.

45. Rosario K, Marinov M, Stainton D, Kraberger S, Wiltshire EJ, Collings DA, Walters M, Martin DP, Breitbart M, Varsani A: Dragonfly cyclovirus, a novel single-stranded DNA virus discovered in dragonflies (Odonata: Anisoptera). J Gen Virol 2011, 92:1302-1308.
First report of a single-stranded DNA circovirus in invertebrates, identified using a viral metagenomic approach.

46. Gibbs MJ, Weiller GF: Evidence that a plant virus switched hosts to infect a vertebrate and then recombined with a vertebrate-infecting virus. Proc Natl Acad Sci U S A 1999, 96:8022-8027.

47. Blinkova O, Victoria J, Li Y, Keele BF, Sanz C, Ndjango J-BN, Peeters M, Travis D, Lonsdorf EV, Wilson ML et al.: Novel circular DNA viruses in stool samples of wild-living chimpanzees. J Gen Virol 2010, 91:74-86.

48. Ng TFF, Willner DL, Lim YW, Schmieder R, Chau B, Nilsson C, Anthony S, Ruan Y, Rohwer F, Breitbart M: Broad surveys of DNA viral diversity obtained through viral metagenomics of mosquitoes. PLoS One 2011, 6:e20579.

49. Hulo C, de Castro E, Masson P, Bougueleret L, Bairoch A, Xenarios I, Le Mercier P: ViralZone: a knowledge resource to understand virus diversity. Nucleic Acids Res 2011, 39:D576-D582.

50. Schoenfeld T, Liles M, Wommack KE, Polson SW, Godiska R, Mead D: Functional viral metagenomics and the next generation of molecular tools. Trends Microbiol 2010, 18:20-29.

51. Schmitz JE, Schuch R, Fischetti VA: Identifying active phage lysins through functional viral metagenomics. Appl Environ Microbiol 2010, 76:7181-7187.
Represents one of the first studies screening a viral metagenome for a specific function resulting in the discovery of novel lysins.

52. Ng TFF, Duffy S, Polston JE, Bixby E, Vallad GE, Breitbart M: Exploring the diversity of plant DNA viruses and their satellites using vector-enabled metagenomics on whiteflies. PLoS One 2011, 6:e19050.
Proposed vector-enabled metagenomics as an approach to survey vector-borne viruses circulating in a given region, and provided proof-of-concept by examining plant viruses from whiteflies.

53. Rosario K, Symonds E, Sinigalliano C, Stewart J, Breitbart M: Pepper mild mottle virus as an indicator of fecal pollution. Appl Environ Microbiol 2009, 75:7261-7267.
Building upon a viral metagenomic study that identified a plant virus dominating the RNA viral community in human feces, this study determined that the plant virus can serve as a novel bioindicator of water quality.

54. Breitbart M, Felts B, Kelley S, Mahaffy JM, Nulton J, Salamon P, Rohwer F: Diversity and population structure of a near-shore marine-sediment viral community. Proc R Soc Lond Ser B-Biol Sci 2004, 271:565-574.

55. Culley AI, Lang AS, Suttle CA: Metagenomic analysis of coastal RNA virus communities. Science 2006, 312:1795-1798.

56. Bench SR, Hanson TE, Williamson KE, Ghosh D, Radosovich M, Wang K, Wommack KE: Metagenomic characterization of Chesapeake Bay virioplankton. Appl Environ Microbiol 2007, 73:7629-7641.

57. Rodriguez-Brito B, Li L, Wegley L, Furlan M, Angly F, Breitbart M, Buchanan J, Desnues C, Dinsdale E, Edwards R et al.: Viral and microbial community dynamics in four aquatic environments. ISME J 2010, 4:739-751.

58. Santos F, Meyerdierks A, Pena A, Rossello-Mora R, Amann R, Anton J: Metagenomic approach to the study of halophages: the environmental halophage 1. Environ Microbiol 2007, 9:1711-1723.

59. Kim KH, Chang HW, Nam YD, Roh SW, Kim MS, Sung Y, Jeon CO, Oh HM, Bae JW: Amplification of uncultured single-stranded DNA viruses from rice paddy soil. Appl Environ Microbiol 2008, 74:5975-5985.

60. Vega Thurber RL, Barott KL, Hall D, Liu H, Rodriguez-Mueller B, Desnues C, Edwards RA, Haynes M, Angly FE, Wegley L et al.: Metagenomic analysis indicates that stressors induce production of herpes-like viruses in the coral Porites compressa. Proc Natl Acad Sci U S A 2008, 105:18413-18418.

61. Marhaver KL, Edwards RA, Rohwer F: Viral communities associated with healthy and bleaching corals. Environ Microbiol 2008, 10:2277-2286.

62. Williamson SJ, Cary SC, Williamson KE, Helton RR, Bench SR, Winget D, Wommack KE: Lysogenic virus–host interactions predominate at deep-sea diffuse-flow hydrothermal vents. ISME J 2008, 2:1112-1121.

63. Lopez-Bueno A, Tamames J, Velazquez D, Moya A, Quesada A, Alcami A: High diversity of the viral community from an Antarctic lake. Science 2009, 326:858-861.

64. Park EJ, Kim KH, Abell GC, Kim MS, Roh SW, Bae JW: Metagenomic analysis of the viral communities in fermented foods. Appl Environ Microbiol 2011, 77:1284-1291.

65. Parsley LC, Consuegra EJ, Thomas SJ, Bhavsar J, Land AM, Bhuiyan NN, Mazher MA, Waters RJ, Wommack KE, Harper WF

Jr et al.: Census of the viral metagenome within an activated sludge microbial assemblage. Appl Environ Microbiol 2010, 76:2673-2677.

66. Ng TF, Wheeler E, Greig D, Waltzek TB, Gulland F, Breitbart M: Metagenomic identification of a novel anellovirus in Pacific harbor seal (Phoca vitulina richardsii) lung samples and its detection in samples from multiple years. J Gen Virol 2011, 92:1318-1323.

67. Breitbart M, Rohwer F: Method for discovering novel DNA viruses in blood using viral particle selection and shotgun sequencing. Biotechniques 2005, 39:729-736.

68. Ng TFF, Suedmeyer WK, Wheeler E, Gulland F, Breitbart M: Novel anellovirus discovered from a mortality event of captive California sea lions. J Gen Virol 2009, 90:1256-1261.

69. Ng TFF, Manire C, Borrowman K, Langer T, Ehrhart L, Breitbart M: Discovery of a novel single-stranded DNA virus from a sea turtle fibropapilloma by using viral metagenomics. J Virol 2009, 83:2500-2509.

70. Li L, Victoria JG, Wang C, Jones M, Fellers GM, Kunz TH, Delwart E: Bat guano virome: predominance of dietary viruses from insects and plants plus novel mammalian viruses. J Virol 2010, 84:6955-6965.

71. Day JM, Ballard LL, Duke MV, Scheffler BE, Zsak L: Metagenomic analysis of the turkey gut RNA virus community. Virol J 2010, 7:313.

72. Cann AJ, Fandrich SE, Heaphy S: Analysis of the virus population present in equine faeces indicates the presence of hundreds of uncharacterized virus genomes. Virus Genes 2005, 30:151-156.

73. Kapoor A, Victoria J, Simmonds P, Wang C, Shafer RW, Nims R, Nielsen O, Delwart E: A highly divergent picornavirus in a marine mammal. J Virol 2008, 82:311-320.

74. Zhang T, Breitbart M, Lee WH, Run JQ, Wei CL, Soh SWL, Hibberd ML, Liu ET, Rohwer F, Ruan YJ: RNA viral community in human feces: prevalence of plant pathogenic viruses. PLoS Biol 2006, 4:108-118.
