ASMNGS 2018 Abstracts
Background: Pistachios have been linked to two multistate outbreaks of Salmonella infections
in 2013 and 2016 and involved with product recalls in 2009 and 2013. Salmonella serovar
Senftenberg has been commonly isolated from pistachios since 2009. Methods: In this study,
whole-genome sequence (WGS) data from 75 Salmonella Senftenberg isolates were analyzed
to provide insight into evolutionary relationships and persistence among strains linked to
events of salmonellosis over the seven-year period. The sources of these isolates comprised
47 pistachio (2009-2016), 18 other food commodities (1941-2013), 3 clinical (2006-2017), 6
environmental (2011-2016) and 1 ATCC reference strain. Forty-two out of the 47 pistachio
isolates were obtained from the USA; three from Lebanon, and one each from Canada and
Turkey. Genomes of three strains isolated from pistachios were completely closed with long-
read sequencing technology, and used as reference to create a cgMLST scheme containing
2696 core genes using Ridom SeqSphere+. Sequence assemblies were imported and typed
using the cgMLST scheme. Results: The phylogeny illustrates that the 2016 outbreak involves
direct descendants from the 2009 and 2013 events rather than an independent
contamination event. These isolates showed minimal allele differences (0-12) between them
and were distinct from other food isolates and pistachio products from outside the United
States. Interestingly, pistachio isolates from a separate 2013 outbreak associated with
environmental contamination were genetically distinct from the clonal strain. Notably, the
outbreak strain harbored a sodA and a clpB gene that were not present in the ATCC
reference strain. Part of a stress-induced multi-chaperone system, clpB is involved in the
recovery of the cell from heat-induced damage and sodA has been documented to code for
biocide and heavy metal tolerance. Conclusion: The data suggest that a prominent clonal strain
of Salmonella Senftenberg persists in the pistachio production supply chain. Further
research aims to elucidate the mechanisms of resistance of this clonal strain in order to
institute better preventive controls and good manufacturing practices that will aid in the
reduction or possible elimination of contamination by this pathogen in pistachios.
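The allele differences reported above (0-12 between outbreak isolates) are pairwise counts of cgMLST loci at which two isolates carry different alleles. The following is a minimal sketch of that comparison, not the Ridom SeqSphere+ implementation; the isolate names, locus names, allele numbers, and the choice to skip loci missing in either isolate are all illustrative assumptions.

```python
from itertools import combinations


def allele_distance(profile_a, profile_b):
    """Count shared loci whose allele numbers differ.

    A locus with value None (assumed marker for "not called") in either
    profile is ignored rather than counted as a difference.
    """
    shared_loci = profile_a.keys() & profile_b.keys()
    return sum(
        1
        for locus in shared_loci
        if profile_a[locus] is not None
        and profile_b[locus] is not None
        and profile_a[locus] != profile_b[locus]
    )


def pairwise_distances(profiles):
    """Return {(name_a, name_b): distance} for every pair of isolates."""
    return {
        (a, b): allele_distance(profiles[a], profiles[b])
        for a, b in combinations(sorted(profiles), 2)
    }


# Hypothetical four-locus profiles (a real scheme here has 2696 core genes).
profiles = {
    "isolate_2009": {"locus1": 1, "locus2": 4, "locus3": 2, "locus4": 7},
    "isolate_2016": {"locus1": 1, "locus2": 4, "locus3": 3, "locus4": None},
    "isolate_other": {"locus1": 5, "locus2": 9, "locus3": 2, "locus4": 8},
}
distances = pairwise_distances(profiles)
for pair, dist in distances.items():
    print(pair, dist)
```

Whether missing loci are skipped or counted as differences changes the distances, so any threshold for calling isolates clonal should be read together with that choice.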
Session: Poster Session A
Time: Monday, September 24, 2018, 2:00 pm - 3:30 pm
Poster Board #: 51
Title: Molecular Genotyping of Hepatitis A Virus - California
Author Block: W. Probert1, C. Gonzalez2, A. Espinosa1, J. K. Hacker1;
1California Department of Public Health, Richmond, CA, 2Sonoma County Public Health
Laboratory, Santa Rosa, CA.
Abstract: Background: The United States has seen a resurgence of hepatitis A virus (HAV) infections in
recent years, particularly among persons experiencing homelessness and users of illicit drugs.
Several states have recently reported outbreaks of hepatitis A caused by HAV subgenotype IB.
In October 2017, California declared a public health emergency in response to a surge in
hepatitis A cases. As part of that response, we implemented molecular genotyping to identify
the HAV strains associated with the CA outbreak, support epidemiologic investigation of
disease transmission, and monitor the effectiveness of public health control and prevention
measures. Methods: Sera from symptomatic, HAV IgM+ case-patients were accepted for HAV
molecular genotyping. Extracted nucleic acids were amplified using nested RT-PCR targeting
the VP1-P2B region, and a 315-nt fragment was sequenced by Sanger sequencing. In addition,
representative specimens were selected for rRNA depletion library preparation and whole
genome sequencing (WGS) on the MiSeq platform. WGS data analysis was facilitated using
the Viral NGS Analysis Pipeline. Results: The HAV VP1-P2B region was successfully amplified
and sequenced for 161 serum specimens collected between August 2017 and May 2018.
Subgenotype classification by VP1-P2B sequence yielded 54 IA-, 104 IB-, and 3 IIIA- HAV
positive specimens. Among the IB specimens, phylogenetic analysis of the VP1-P2B sequences
resulted in a tight cluster of 102 specimens that represented the CA outbreak strain with
several closely related variants. The remaining two IB sequences matched outbreak strains
reported in Michigan. Comparison of nearly complete genome sequences indicated that the
CA HAV outbreak strain shared 95.6% nucleotide and 99.7% amino acid sequence identity
with HAV IB type strain. Significantly, nucleotide substitutions in the 5’-UTR and amino acid
substitutions in VP1 and P2C were noted. Conclusions: In response to a public health
emergency, we rapidly developed the capacity for detecting, genotyping, and WGS of HAV.
Molecular genotyping revealed a large cluster of closely related HAV subgenotype IB strains
as the source of the outbreak. Similar strains have been identified as the source of recent
outbreaks in other states including Utah, Kentucky, Indiana, and Arizona. A nearly complete
genome sequence was determined for the predominant HAV IB outbreak strain and several
closely related variants. Substitutions found in putative HAV virulence determinants may
warrant further investigation. Finally, a notable decline in HAV specimen submissions and in
the detection of the outbreak strain genotype indicates that public health measures have been
successful in controlling and preventing further spread of this strain in CA.
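The nucleotide and amino acid identities quoted above are percentages of matching positions between two aligned sequences. A minimal sketch of that calculation follows, assuming the two sequences have already been aligned to equal length and that gapped columns are excluded from the denominator; the toy sequences are invented and are not HAV data.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of ungapped alignment columns where both sequences agree."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must come from the same alignment")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # assumption: columns with a gap are not scored
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Invented 15-column toy alignment with one gap and two mismatches.
toy_a = "ATGGCT-ACGTTAGC"
toy_b = "ATGACTTACGTTGGC"
identity = percent_identity(toy_a, toy_b)
print(f"{identity:.1f}% identity")
```

The same function works on aligned protein sequences, which is how a nucleotide identity can be much lower than the corresponding amino acid identity when most substitutions are synonymous.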
Session: Poster Session A
Time: Monday, September 24, 2018, 2:00 pm - 3:30 pm
Poster Board #: 52
Title: Clinical and Research Experience with NGS-based Diagnostics
Author Block: J. E. Ellis; Fry Laboratories, LLC, Scottsdale, AZ.
Abstract: Next Generation DNA Sequencing (NGS) represents a compelling alternative to single-analyte
tests, syndromic panels, and culture-based diagnostic pipelines. The current research-based
NGS analysis tools and pipelines typically rely on specialized operators, are excessively time
consuming, or do not have the features necessary for routine clinical use. To address the
demands of clinical microbiology and basic science research we created the Rapid Infectious
Disease Identification (RIDI™) system, an infectious disease analysis system. We will present a
review of the instrumentation selection, sequencing strategy for “dirty” clinical samples, and
analysis pipelines in contrast to the RIDI analysis pipeline, including experience drawn from
5 years of routine clinical use. A summary of the greater than 600 replicates of more than 60
American Type Culture Collection (ATCC) reference standards individually and in various
combinations will also be presented. Further considerations regarding the CFU / genomic
equivalents, nomenclature, and reporting requirements will be reviewed. Curation of massive
public databases and algorithms targeting ambiguous DNA sequences have yielded significant
advances to the DNA-sequence-based identification strategy for pathogens. The RIDI™ system
accurately and rapidly identifies bacteria/archaea with more than 99% of the DNA sequence
reads to the genus level and greater than 75% to the species level of both reference
standards and simulated clinical samples. Additionally, fungi and protozoa are identified
accurately to the genus level more than 86% of the time and to the species level more than
67% of the time. These identification rates support meaningful clinical intervention and
significant research value simultaneously. This strategy has yielded the detection of unusual
microbial DNA signatures in a variety of acute and chronic disease states including
bacteremia, coronary artery debris, uncultivatable eye infections, and chronic inflammatory
illnesses. Ultimately, our experiences support the use of an infectious disease clinical
diagnostics and research pipeline utilizing an NGS-based approach.
Session: Poster Session A
Time: Monday, September 24, 2018, 2:00 pm - 3:30 pm
Poster Board #: 53
Title: Mosaic - a Cloud Platform for Microbiome Analysis
Author Block: S. T. Westreich; DNAnexus, Mountain View, CA.
Abstract: Exploration of the microbiome is an exciting new area of science, incorporating many “big
data” aspects to fully encompass the diversity and complexity of this environment. There has
been an explosion of new tools designed for the microbiome space; some are modeled after
existing genomic tools, while others are designed specifically for multi-organism “-omics”
analysis. Many of these tools, however, require considerable computational resources and
can prove challenging to install and run. In partnership with Janssen Human Microbiome
Institute (JHMI), DNAnexus has created Mosaic, a cloud platform designed for microbiome-
focused informatics. Mosaic allows for third-party bioinformatics tools to be developed as
applications, which can be run on cloud instances through a graphical user interface (GUI).
Mosaic provides a central platform for tool developers to share their implemented programs,
and for tool users to bring their data for analysis without needing extensive command line
experience. Mosaic also hosts microbiome challenges, similar to CAMI, designed to
encourage development of new methods and raise publicity for the most effective
microbiome tools. Mosaic offers significant scalability and processing power for handling
computational microbiome analysis tasks and is striving to bring together the microbiome
community to allow for increased communication and collaboration. It offers an avenue for
tool comparison and a method for tool usage without the need for a high-performance
computing cluster or extensive programming experience.
Session: Poster Session A
Time: Monday, September 24, 2018, 2:00 pm - 3:30 pm
Poster Board #: 54
Title: Metagenomic Sequencing Identification of Rickettsia felis in an American Patient
Author Block: Y. Chen, C. R. Icenhour; Aperiomics, Inc, Ashburn, VA.
Abstract: Background: Rickettsia felis is an intracellular gram-negative bacterium, first described in
1991 as a human pathogen from a patient in Texas. Although R. felis was first identified in the
USA, only one case (as reported in literature) has been reported in the USA over the past two
decades. Cat fleas (Ctenocephalides felis) are considered the primary vector and reservoir
for R. felis, although other vectors are likely. Current methods for R. felis diagnosis include a
real-time PCR assay and antigen-based test, the latter method showing a high failure rate.
Reported here is an American male patient testing positive for R. felis in peripheral blood.
This 34-year-old male lives in Alabama with a domestic cat. After suffering from rashes on his
left arm and trunk, he went to see a dermatologist in February 2017. No new personal care
products or medications had been used immediately before the rash appeared. On March 06,
2018 the patient was sent to an ER, having suffered from a stroke. Except for previously
diagnosed hypertension, all blood tests came back normal. Methods: One
peripheral blood plasma sample was collected from the male patient on May 11, 2018. DNA
was extracted from the sample using the Qiagen QIAamp DNA Microbiome Kit, and a sequencing
library was prepared with the KAPA HyperPrep Kit. Sequencing reads were produced with an
Illumina NextSeq500 platform at Technology Center for Genomics & Bioinformatics (UCLA).
Raw sequencing data was processed and analyzed with Xplore-PATHO℠, a fully automated
metagenomics sequencing data analysis platform developed and operated by Aperiomics,
Inc. Results: 6.36 million paired-end sequence reads of 75bp length were generated by
shotgun metagenomic sequencing. 656,660 sequence reads were filtered out because of low
sequencing quality. In the remaining clean sequence reads, 98.6% (5,627,427) were human
sequences and 0.3% (16,393) aligned to microbial reference genomes. 10,184 sequence reads
were unique for 99 microbial genomes, which belonged to 87 microbial species. R. felis was
identified as the microorganism with the highest relative abundance (14.2%) among all
identified microbial species. In total, 413 paired-end sequence reads (including 256 unique
sequence reads) were aligned to the R. felis genome, covering around 2.0% of the whole R.
felis genome. Based on the sequencing results, the patient’s doctor prescribed doxycycline,
and multiple symptoms the patient had experienced resolved after two weeks of
treatment. Conclusion: This case demonstrates the value of shotgun metagenomic
sequencing for the identification of rare pathogens. Xplore-PATHO℠ can identify over ten
thousand pathogens in one test, which is crucial for identifying pathogens that cannot be
identified through traditional testing. Billions of people worldwide suffer from emergent and
chronic infectious diseases and shotgun metagenomic sequencing is an important weapon in
the battle against infectious disease.
Session: Poster Session A
Time: Monday, September 24, 2018, 2:00 pm - 3:30 pm
Poster Board #: 55
Title: A Core Genome Approach That Enables Prospective and Near Real-time Monitoring of
Infectious Outbreaks
Author Block: H. van Aggelen1, R. Kolde1, H. Chamarthi1, J. T. Fallon2, J. J. Carmona3, M. M. Fortunado-Habib3,
B. D. Gross3;
1Philips Research, Cambridge, MA, 2New York Medical College, Valhalla, NY, 3Philips
Healthcare, Cambridge, MA.
Abstract: Background: Whole genome sequencing is increasingly being adopted in clinical settings to
confirm or rule out potential transmissions of infectious agents. Clinical infection monitoring
is most actionable when performed in a prospective manner, in which samples are
continuously added and compared to previous samples. To enable prospective pathogen
comparison, genomic relatedness metrics must be: i) consistent across time, ii) efficient to
compute, and iii) reliable across the large variety of samples typically seen in a clinical setting.
Appropriate selection of genomic regions to compare, i.e. a core genome, is critical to obtain
a consistent metric of pathogen relatedness via single nucleotide differences. Methods: We
propose a method that selects conserved nucleotides in a reference genome based on the
variation seen in publicly available RefSeq genome assemblies. The conserved nucleotides can
be computed efficiently from the k-mer occurrence frequencies in the genome assembly set.
The resulting core genome is sample set-independent and can be applied universally across
time and location, such that single nucleotide difference metrics remain constant over time.
Given the constancy of the metrics, previously analyzed samples do not need to be re-
analyzed when samples are added, which significantly reduces the computational
burden. Results: Using this method, we generated core genomes based on all 8274 RefSeq
assemblies for S. aureus, 2876 assemblies for K. pneumoniae and 782 assemblies for E.
faecium and tested them on large clinical data sets. We show that this method disambiguates
same-pathogen samples better than a core genome consisting of conserved genes, as
measured by ROC curves for same-patient versus different-patient samples for sets of 1362
East England S. aureus and 905 Houston K. pneumoniae samples. For these data sets, the
proposed method achieves a 0.981 area under the curve (AUC) for K. pneumoniae and 0.977
AUC for S. aureus, which is superior to the 0.935 and 0.955 AUC obtained with the conserved
gene approach and translates into a significant difference when monitoring large numbers of
samples. The proposed method is universally applicable, which we demonstrate by comparing
multiple geographically distinct cohorts. We illustrate that this method recovers previously
published confirmed outbreak samples with high accuracy in a large set of 1457 S.
aureus samples from the U.K.: all 45 samples that were part of the outbreak were recovered by the
proposed method, and 2 other samples were identified as similar to the outbreak, whereas
the conserved gene method confirmed only 36 of the 45 outbreak samples and identified 3
other samples as similar. Conclusions: The proposed core genome approach not only makes it
possible to perform prospective and near real-time genomic studies, it also provides a
universal framework to quantify pathogen relationships across geographical locations.
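The abstract states that conserved nucleotides are derived from k-mer occurrence frequencies across an assembly set but does not give the procedure. One way such a selection could work is sketched below, under stated assumptions: a reference k-mer counts as conserved when it occurs in at least a chosen fraction of assemblies, a reference position enters the core when any conserved k-mer covers it, and reverse complements are ignored. The function names, the value of k, and the threshold are hypothetical.

```python
from collections import Counter


def kmers(seq, k):
    """Yield every overlapping k-mer of seq, left to right."""
    return (seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_presence(assemblies, k):
    """For each k-mer, count the assemblies that contain it at least once."""
    presence = Counter()
    for seq in assemblies:
        presence.update(set(kmers(seq, k)))
    return presence


def conserved_positions(reference, assemblies, k, min_fraction):
    """Flag reference positions covered by at least one conserved k-mer."""
    presence = kmer_presence(assemblies, k)
    threshold = min_fraction * len(assemblies)
    core = [False] * len(reference)
    for start, kmer in enumerate(kmers(reference, k)):
        if presence[kmer] >= threshold:
            for pos in range(start, start + k):
                core[pos] = True
    return core


# Toy data: the third assembly carries a substitution at reference index 3.
reference = "ACGTTGCA"
assemblies = ["ACGTTGCA", "ACGTTGCA", "ACGATGCA"]
core = conserved_positions(reference, assemblies, k=3, min_fraction=1.0)
print(core)
```

Because the mask depends only on the reference and the public assembly set, it does not change when new clinical samples arrive, which is the property that lets earlier pairwise distances be reused.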
Session: Poster Session A
Time: Monday, September 24, 2018, 2:00 pm - 3:30 pm
Poster Board #: 56
Title: Staphylococcus aureus Viewed from the Perspective of 40,000+ Genomes
Author Block: R. A. Petit III, T. D. Read; Emory University, Atlanta, GA.
Abstract: We created Staphopia, an analysis pipeline, database and Application Programming Interface
for batch analysis of thousands of S. aureus Illumina whole genome shotgun projects. Written
in Python, Staphopia’s analysis pipeline consists of submodules running open-source tools. It
accepts raw FASTQ reads as an input, which undergo quality control filtration, error
correction and reduction to a maximum of approximately 100x chromosome coverage. This
downsampling substantially reduces total runtime without detrimentally affecting the results. The
pipeline performs de novo assembly-based and mapping-based analysis. Automated gene
calling and annotation are performed on the assembled contigs. Read-mapping is used to call
variants (single nucleotide polymorphisms and insertion/deletions) against a reference S.
aureus chromosome (N315, ST5). We ran the analysis pipeline on more than 43,000 S. aureus
shotgun Illumina genome projects in the public European Nucleotide Archive database in
November 2017. We found that only a quarter of known multi-locus sequence types (STs)
were represented but the top ten STs made up 70% of all genomes. Methicillin-resistant S.
aureus (MRSA) were 64% of all genomes. Using the Staphopia database we selected 380 high
quality genomes deposited with good metadata, each from a different multi-locus sequence
type, as a non-redundant diversity set for studying S. aureus evolution. In addition to
answering basic science questions, Staphopia could serve as a potential platform for rapid
clinical diagnostics of S. aureus isolates in the future. The system could also be adapted as a
template for other organism-specific databases.
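The reduction to roughly 100x chromosome coverage described above amounts to keeping a fraction of reads equal to the target coverage divided by the observed coverage. The sketch below illustrates that arithmetic only and is not Staphopia's code; the function names, the random per-read sampling, and the genome size of about 2.8 Mb used in the example are assumptions.

```python
import random


def downsample_fraction(total_bases, genome_size, target_coverage=100.0):
    """Fraction of reads to keep so that coverage is about target_coverage."""
    if total_bases == 0:
        return 1.0
    observed_coverage = total_bases / genome_size
    return min(1.0, target_coverage / observed_coverage)


def downsample_reads(reads, genome_size, target_coverage=100.0, seed=0):
    """Keep each read independently with the computed probability."""
    total_bases = sum(len(read) for read in reads)
    fraction = downsample_fraction(total_bases, genome_size, target_coverage)
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    return [read for read in reads if rng.random() < fraction]


# 560 Mb of sequence over an assumed 2.8 Mb chromosome is 200x, so keep half.
print(downsample_fraction(560_000_000, 2_800_000))
# A 50x data set is already below the cap and is kept whole.
print(downsample_fraction(140_000_000, 2_800_000))
```

Paired-end data would need both mates kept or dropped together, which this per-read sketch does not handle.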
Session: Poster Session A
Time: Monday, September 24, 2018, 2:00 pm - 3:30 pm
Poster Board #: 57
Title: Examination of Mycobacterium avium Complex Infections Through WGS Analysis
Author Block: D. J. Operario, A. F. Koeppel, S. D. Turner, A. J. Prorock, Y. Bao, K. Sol-Church, S. Pholwat, M.
H. Scheurenbrand, E. R. Houpt;
University of Virginia, Charlottesville, VA.
Abstract: Background: Disease attributed to Mycobacterium avium complex (MAC) is caused by a
number of closely related non-tuberculous mycobacteria including M. avium and M.
intracellulare. Neither current probe-based MAC diagnostics nor disease presentation lends
itself to easily distinguishing between the different MAC species. As a result, clinically
diagnosed stable disease, or relapsed/reinfected MAC disease may actually be caused by
distinct organisms. Methods: To elucidate this phenomenon, we identified 36 patients who
each had multiple AFB cultures where MAC was identified, a total of 97 isolates. Bacterial
genomic DNA from each isolate was subjected to whole genome sequencing on an Illumina
NextSeq platform. Data cleanup on the resulting sequences was performed using FastQC. To
rapidly compare sequences between isolates, sequencing controls, and reference sequences,
MinHash was employed thereby eliminating the need to pre-assemble the genomes. MinHash
distances were then used to construct a phylogenetic tree, rooted on a reference sequence
of M. tuberculosis H37Rv. Kraken/Bracken was used to estimate the relative species
abundance in samples with mixed infections. In addition, we performed both phenotypic and
genotypic drug susceptibility testing on each isolate. Phenotypic testing included
clarithromycin, rifampin, rifampicin, ethambutol, amikacin, moxifloxacin, linezolid, tedizolid,
clofazimine, bedaquiline, tigecycline, and ceftazidime/avibactam. Genotypic drug
susceptibility testing was achieved using a combination of SRST2, abricate, and
ariba. Results: Our phylogenetic analysis showed that the isolates formed four distinct
clusters: an M. avium cluster, an M. intracellulare cluster, a
mixed avium/intracellulare cluster, and a “MAC other/M. abscessus” cluster. Species level
abundance estimates from Bracken appear to correlate well with the genetic distances
estimated by MinHash. Grouping isolates by patient, analyzing by order of isolate collection,
and adding phenotypic susceptibility testing information to the phylogenetic tree revealed
that some subjects were truly stable in their MAC disease while others actually switched
between phylogenetic clusters. Conclusions: Our results suggest that for certain patients,
WGS confirms a very stable disease while for others MAC disease is actually being caused by
different organisms that were possibly acquired as superinfections.
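MinHash compares sequences through small sketches of hashed k-mers, which is why no assembly is needed before computing distances. The following is a minimal bottom-s sketch, assuming a SHA-1 based hash, no canonical (reverse-complement) k-mers, and the Mash-style conversion from Jaccard estimate to distance; the abstract does not state which MinHash tool or parameters were used, so k, the sketch size, and all names here are illustrative.

```python
import hashlib
import math


def kmer_hashes(seq, k):
    """Hash every k-mer of seq to a 64-bit integer."""
    return {
        int.from_bytes(hashlib.sha1(seq[i:i + k].encode()).digest()[:8], "big")
        for i in range(len(seq) - k + 1)
    }


def sketch(seq, k=5, size=50):
    """Bottom-s sketch: the `size` smallest k-mer hashes, sorted."""
    return sorted(kmer_hashes(seq, k))[:size]


def jaccard_estimate(sketch_a, sketch_b, size=50):
    """Estimate Jaccard similarity from the smallest hashes of the union."""
    merged = sorted(set(sketch_a) | set(sketch_b))[:size]
    if not merged:
        raise ValueError("both sketches are empty")
    shared = set(sketch_a) & set(sketch_b)
    return sum(1 for h in merged if h in shared) / len(merged)


def mash_distance(jaccard, k):
    """Mash-style distance: -(1/k) * ln(2j / (1 + j)), clamped to [0, 1]."""
    if jaccard == 0:
        return 1.0
    return max(0.0, min(1.0, -math.log(2 * jaccard / (1 + jaccard)) / k))


seq_a = "ACGTACGTTGCAACGTAGCTAGCTAGGATCCA"
seq_b = "ACGTACGTTGCAACGTAGCTAGCTAGGTTCCA"  # one substitution relative to seq_a
j = jaccard_estimate(sketch(seq_a), sketch(seq_b))
print(j, mash_distance(j, k=5))
```

A matrix of such distances can be passed to any distance-based tree method, which matches the workflow described above in which MinHash distances feed the phylogenetic tree.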
Session: Poster Session A
Time: Monday, September 24, 2018, 2:00 pm - 3:30 pm
Poster Board #: 58
Title: Genomic Characterization and Phylogenetic Analysis of Salmonella Newport Clinical Strains
from Tennessee, 2017-2018
Author Block: L. K. Hudson1, C. Moore2, L. Constantine-Renna3, J. K. Yackley3, X. Qian2, L. S. Thomas2, K. N.
Garman3, J. R. Dunn3, T. G. Denes1;
1Department of Food Science, University of Tennessee, Knoxville, TN, 2Tennessee Department
of Health, Division of Laboratory Services, Nashville, TN, 3Tennessee Department of Health,
Nashville, TN.
Abstract: Background: Salmonella Newport is the third most common Salmonella serovar sent to the
Tennessee (TN) State Public Health Laboratory. The objectives of this study were to
retrospectively examine the genomic population structure of Newport clinical isolates from
TN in 2017-18 and describe epidemiological features among clades of case-patients
identified. Methods: Biosample numbers and metadata for S. Newport (n=91) clinical isolates
from TN collected January 2017 through June 2018 were provided by the TN Dept. of Health.
Raw reads were downloaded from NCBI SRA, trimmed with Trimmomatic, and quality
checked using FastQC. A reference assembly was chosen (BioSample SAMN05172397).
hqSNPs were identified using the CFSAN SNP pipeline and the resulting matrix was used to
construct a neighbor-joining tree with Mega7. Additionally, trimmed reads were assembled
with SPAdes and contigs annotated with Prokka. Assembly statistics were reported by BBMap,
SAMtools, and QUAST. Serotype designations were confirmed with SeqSero. Results: Nine
distinct clades of interest were identified: four major clades (1, 2, 4, and 5) and five minor
clades (3, 6, 7, 8, and 9). Major clades were defined as containing five or more isolates and
minor clades as containing fewer than five. Clades may represent or contain epidemiological
clusters. Clade 1 consisted of 41 isolates, with most (n=28) from patients in the western
region of TN (11 from Shelby county), and were collected over a period of 11 months. Clade 2
consisted of nine isolates with source counties from all three regions and were collected over
13 months. Clade 4 consisted of nine isolates, with the majority (n=6) being isolated from the
western region (three from Obion county) and one each from counties in east and middle TN,
and one with an unknown source county. Clade 4 isolates were collected over eight months
and most were from male patients (78%) and adults (>17 yrs of age; 78%). Clade 5 contained
12 isolates, mostly isolated from counties in either middle (n=5) or east (n=5; three from Knox
county) TN. Clade 5 isolates were collected over one year and were mostly from female
patients (67%) and all from adults. The five minor clades all contained isolates collected from
middle or west TN. Conclusions: The clustering patterns of clades 1 and 4 with a majority of
isolates from western TN, together with the timeline of the isolation dates, may indicate that
these illnesses were from common or related exposures. In contrast, other clades (2 and 5)
are characterized by a widespread geographical distribution throughout the state. Further
investigation of epidemiological data and possible environmental sources may identify the
source of illness and possible preventive strategies. In addition, information gained about the
population structure of this serovar provides guidance for selecting SNP distance thresholds
used to identify clusters that may be of epidemiological significance.
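The closing point about SNP distance thresholds can be made concrete with single-linkage clustering: isolates are grouped whenever a chain of pairwise distances at or below the threshold connects them. The sketch below uses a small union-find structure; the isolate names, distances, and thresholds are invented for illustration and are not taken from the study.

```python
def single_linkage_clusters(names, distances, threshold):
    """Group isolates connected by pairwise SNP distances <= threshold.

    `distances` maps (name_a, name_b) to a SNP count; pairs not listed
    are treated as unlinked.
    """
    parent = {name: name for name in names}

    def find(x):
        # Follow parent links to the cluster root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), snps in distances.items():
        if snps <= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for name in names:
        clusters.setdefault(find(name), []).append(name)
    # Largest clusters first, ties broken alphabetically.
    return sorted((sorted(c) for c in clusters.values()), key=lambda c: (-len(c), c))


names = ["A", "B", "C", "D", "E"]
distances = {("A", "B"): 3, ("A", "C"): 10, ("B", "C"): 8, ("C", "D"): 40, ("D", "E"): 5}
clusters_at_10 = single_linkage_clusters(names, distances, threshold=10)
clusters_at_4 = single_linkage_clusters(names, distances, threshold=4)
print(clusters_at_10)
print(clusters_at_4)
```

Running the same data at several thresholds shows how cluster membership shifts, which is the kind of sensitivity a population-structure survey can inform.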
Session: Poster Session A
Time: Monday, September 24, 2018, 2:00 pm - 3:30 pm
Poster Board #: 59
Title: ATCC® Site-Specific Mock Community Standards for Human Microbiome Applications
Author Block: M. Hunter, S. Saha, S. King, M. Amselle, J. Lopera, B. Benton, D. Mittar; ATCC, Manassas, VA.
Abstract: Advancement and accessibility of next-generation sequencing technologies have influenced
microbiome analyses in tremendous ways, opening up numerous applications in the areas of
human health and disease. To date, a significant body of work has been performed on the
human gut microbiome to evaluate its species composition and influence on physiology; this
research has led to additional studies on microbiomes localized at other sites on the human
body (e.g., skin, oral, vaginal). However, a predominant limitation in these site-specific
microbiome studies is the lack of appropriate and relevant standards to control the technical
biases introduced throughout the metagenomics workflow. To address this, ATCC has
developed a set of genomic and whole cell mock microbial communities from fully sequenced
and characterized ATCC strains that represent species found in the oral, skin, gut, or vaginal
microbiome. To further enhance the use of these standards and eliminate the bias associated
with data analysis, we have also collaborated with One Codex to develop data analysis
modules that provide simple output in the form of true-positive, relative abundance, and
false-negative scores for 16S rRNA community profiling and shotgun metagenomics
sequencing. In this proof-of-concept study, we tested these mock communities via 16S rRNA
and shotgun metagenomics sequencing methods and analyzed the resulting data using the
One Codex data analysis platform. From this analysis, we found a strong correlation between
the expected and observed microbial compositions (less than 1 log difference between
relative abundance of individual genomes), indicating that these site-specific microbiome
standards can be used as tools for assay optimization and as daily run quality control
standards for microbiome assays.
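The "less than 1 log difference" criterion compares expected and observed relative abundances on a log10 scale. A minimal sketch of such a comparison is given below; it is not the One Codex scoring module, and the taxon names, abundances, and the simple presence-based definition of missed and unexpected taxa are assumptions made for illustration.

```python
import math


def compare_profiles(expected, observed):
    """Compare an observed profile against a mock community's expected one.

    Returns absolute log10 ratios for detected expected taxa, plus the
    expected taxa that were missed and the observed taxa not expected.
    """
    log10_diff = {
        taxon: abs(math.log10(observed[taxon] / expected[taxon]))
        for taxon in expected
        if observed.get(taxon, 0) > 0
    }
    false_negatives = sorted(t for t in expected if observed.get(t, 0) == 0)
    false_positives = sorted(t for t in observed if t not in expected and observed[t] > 0)
    return {
        "log10_diff": log10_diff,
        "false_negatives": false_negatives,
        "false_positives": false_positives,
    }


# Hypothetical even four-member mock community and an invented result.
expected = {"taxon_a": 0.25, "taxon_b": 0.25, "taxon_c": 0.25, "taxon_d": 0.25}
observed = {"taxon_a": 0.50, "taxon_b": 0.25, "taxon_c": 0.05, "taxon_x": 0.20}
report = compare_profiles(expected, observed)
print(report)
```

Used as a daily run control, the same report could be tracked over time so that a drift in any taxon's log ratio flags a change in extraction, library preparation, or classification.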
Session: Poster Session A
Time: Monday, September 24, 2018, 2:00 pm - 3:30 pm
Poster Board #: 60
Title: Assessing Quantitative Performance of Metagenomic Profiling Using Genomic DNA Reference
Material Mixtures
Author Block: J. Kralj1, D. Tourlousse2, S. Servetas1, S. Forry1, S. Jackson1;
1NIST, Gaithersburg, MD, 2AIST, Tsukuba, JAPAN.
Abstract: Background: Shotgun metagenomics is being employed in many microbial-related clinical,
environmental, and bio-security applications. However, development of materials and
methods to characterize relative performance metrics (e.g. sensitivity, specificity, accuracy)
has lagged. This presents a challenge to translating these new technologies into commercial
use, as regulatory agencies often require performance characterization of the entire sample
analysis pipeline. In response, NIST has undertaken development of a DNA-based reference
material (RM 8376) to enable characterization of sample processing and computational
procedures from sequencing to bioinformatics. RM 8376 consists of unmixed genomic DNA
(gDNA) from 19 different bacteria (16 total species, 3 pairs of related subspecies). We
evaluated the performance of a metagenomic profiling pipeline using mixtures of the RM
components. Experimental: Sample mixtures contained the 19 components combined into 5
pools of 3 or 4 components each, and mixed in roughly equigenomic and log10 dilution
mixtures (latin square) for 6 total samples. Each sample was prepared using the Nextera XT
DNA Library Prep Kit with AMPure XP beads, and sequenced on an Illumina MiSeq (2x300 bp
paired reads). In addition, in silico read data matching the sample pools were generated by
subsampling raw reads from each individual RM component. For both data sets, taxonomic
classification was performed using Centrifuge with the default p+h+v indices. Results &
Discussion: In silico mixtures showed up to approximately 3-fold difference between expected
and observed relative abundances for 14/16 species; 2 species were missed. Pure isolate read
data were correctly identified, suggesting that the presence of other species resulted in
inaccurate abundance estimates. The physical mixture relative abundances differed from
expected. Of note, GC content was correlated with fold-differences, with high-GC content
genomes being substantially underrepresented. Further study could show quantitation of
biases from different sample preparations. Conclusions: As a whole, these data verified that
the 19-component mixtures can sufficiently represent sample complexity to inform on
computational biases. The new RM is a promising tool for deconvolving and quantifying biases
due to sample preparation (as a result of e.g. GC-content) and computational methods, which
was a major goal of this material. The availability of RMs (such as described here) should be
useful in assessing the performance (accuracy, sensitivity, specificity, etc.) of each component
of the analysis pipeline, allowing the validation or optimization of a metagenomic workflow.
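A Latin square dilution design gives every pool every dilution level exactly once across the samples, so abundance effects can be separated from pool identity. The sketch below builds a cyclic 5 x 5 square of log10 dilution exponents as one possible layout; the actual assignment used for RM 8376 mixtures is not given in the abstract, and the function names are hypothetical.

```python
def latin_square_exponents(n_pools=5):
    """Cyclic Latin square: entry [sample][pool] is a log10 dilution step."""
    return [
        [(pool + sample) % n_pools for pool in range(n_pools)]
        for sample in range(n_pools)
    ]


def relative_abundances(exponents):
    """Convert log10 dilution steps for one sample into pool fractions."""
    weights = [10.0 ** -step for step in exponents]
    total = sum(weights)
    return [w / total for w in weights]


square = latin_square_exponents(5)
for sample_index, row in enumerate(square, start=1):
    fractions = ", ".join(f"{f:.4f}" for f in relative_abundances(row))
    print(f"sample {sample_index}: steps {row} -> fractions {fractions}")
```

Together with one roughly equigenomic mixture, five such samples would give the six samples mentioned above.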
Session: Poster Session A
Time: Monday, September 24, 2018, 2:00 pm - 3:30 pm
Poster Board #: 61
Title: Identification of a Novel Genetic Marker Associated with Pseudomonas fluorescens Blue
Discoloration Through a Genome-wide Association Strategy
Author Block: F. Chiesa1, S. Lomonaco2, M. Rossi1, S. Gallina3, A. Dalmasso1, L. Decastelli3, T. Civera1;
1University of Turin, Grugliasco, ITALY, 2Center for Food Safety and Applied Nutrition, U.S.
Food and Drug Administration, College Park, MD, 3Istituto Zooprofilattico Sperimentale del
Piemonte Liguria e Valle d’Aosta, Torino, ITALY.
Abstract: Background: The blue discoloration defect in different types of fresh cheeses has surged
worldwide during the present decade. In June 2010, Italian national health authorities notified
the European Rapid Alert System for Food and Feed about altered organoleptic characteristics
(blue color) and high numbers (5.1 × 10^6 CFU/g) of Pseudomonas fluorescens in mozzarella
cheese imported from Germany. Since then, the problem has persisted and, although infrequently, has
consistently affected several producers. The aim of this study was to identify a genetic marker
for the identification of pigmenting strains. Methods: The genomes of 3 pigmenting and 3
non-pigmenting P. fluorescens were sequenced on a MiSeq instrument after library
preparation with Nextera® XT DNA Library Preparation kit (Illumina). Reads were assembled
with SPAdes v3.2 and annotated using RAST (http://rast.nmpdr.org). For the genome-wide
association study (GWAS), 33 additional genomes were selected from GenBank (3 pigmenting
strains and 30 non-pigmenting strains), in addition to the 6 obtained herein. The pan-genome
pipeline Roary was used to calculate the pan-genome (https://sanger-pathogens.github.io/Roary/)
and its output was analyzed using Scoary to calculate the
associations between all accessory genes and the pigment production
(https://github.com/AdmiralenOla/Scoary). Only genes associated with an odds ratio (OR) > 1 and a pairwise
p < 0.05 were considered as possible markers. Primers for functionally annotated genes
identified as putative markers were designed using Primer3 plus. The selected target
gene (trpB) was then multiplexed with a previously designed primer set for the identification
of Pseudomonas genus (oprI). Results: WGS highlighted the presence of a 16 Kbp gene cluster,
present only in the pigmenting isolates, characterized by a set of accessory genes related to
tryptophan metabolism. This cluster is located on a 100 Kbp locus, which was highly
conserved (99-100%) in the pigmenting isolates included in our study. The Scoary analysis
confirmed the accessory genes of the pigmenting isolates gene cluster as significantly
associated with the pigment production. A total of 17 functionally annotated genes were
identified as putative markers. A primer pair amplifying a fragment of trpB was selected on
the basis of amplification results and an amplicon size suitable for use with the oprI primer pair.
The newly designed PCR multiplex assay demonstrated a 100% specificity for the pigmenting
isolates, producing the expected bands (trpB, 200 bp and oprI, 250 bp) in all the pigmenting
strains tested. Conclusions: This study highlights i) the suitability of the GWAS strategy for the
identification of genetic markers; ii) the suitability of trpB for the identification of P.
fluorescens pigmenting strains, and iii) the usefulness of a multiplex PCR assay for the
identification of sources of contamination in production facilities.
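The marker filter described in the Methods can be illustrated with a small sketch. It assumes the association threshold refers to the odds ratio that Scoary reports, and it computes a plain two-sided Fisher's exact test rather than Scoary's pairwise comparison test, which is not reproduced here. The function names and the example counts are hypothetical; only the totals (6 pigmenting and 33 non-pigmenting genomes) come from the abstract.

```python
from math import comb


def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table.

    a = pigmenting genomes with the gene,  b = non-pigmenting with the gene,
    c = pigmenting genomes without it,     d = non-pigmenting without it.
    """
    if b * c == 0:
        # Division by zero: infinite if there is any positive evidence, else undefined.
        return float("inf") if a * d > 0 else float("nan")
    return (a * d) / (b * c)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value (sum of tables no more likely than the observed one)."""
    row1, row2 = a + b, c + d          # gene present / gene absent
    col1 = a + c                       # pigmenting genomes
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x pigmenting genomes carrying the gene.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


# Illustrative table: a gene found in 4 of 6 pigmenting and 2 of 33 non-pigmenting genomes.
table = (4, 2, 2, 31)
is_candidate = odds_ratio(*table) > 1 and fisher_exact_two_sided(*table) < 0.05
print(odds_ratio(*table), fisher_exact_two_sided(*table), is_candidate)
```

With a table this small, an exact test is preferable to a chi-square approximation, which is one reason it is used in this sketch.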
Session: Poster Session A
Time: Monday, September 24, 2018, 2:00 pm - 3:30 pm
Poster
62
Board #:
Design, Analysis, and Validation of Error-Correcting Internal Spike-In Controls for
Title:
Metagenomics
N. Greenfield1, R. Bovee1, C. Smith1, S. A. Cunningham2, R. Patel2;
Author 1
One Codex, San Francisco, CA, 2Department of Laboratory Medicine and Pathology, Mayo
Block:
Clinic, Rochester, MN.
Robust next-generation sequencing (NGS) metagenomic assays require defined detection
limits and process traceability from sample collection to bioinformatic analysis. DNA
sequence spike-ins can serve as qualitative controls, barcode and track samples, and provide
absolute concentration data to address these challenges. Here we describe the design,
analysis, and validation of error-correcting internal spike-in controls for metagenomics. We
developed software to design synthetic sequences and use a Hamming(3, 1) code to encode a
sequence descriptor, barcode, and manufacturing lot information. Our error-correcting
Abstract: encoding makes the spike-ins’ detection robust to DNA base substitutions, insertions, and
deletions. These errors can occur either during manufacturing or as part of sequencing.
Designed sequences are not homologous to known reference genomes and contain no
homopolymer runs. Finally, we added support to the One Codex Platform for automatically
detecting and analyzing these spike-in sequences. We tested our software and spike-in
control design by generating synthetic sequences, synthesizing them, and adding them at
multiple concentrations as part of a clinical NGS workflow. Initial results across a range of
sample types and library preparations are presented.
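As a rough illustration of the error-correcting idea, the sketch below treats Hamming(3, 1) as the triple-repetition code: each payload bit is written three times and recovered by majority vote, so one substitution per triplet is corrected. This is an assumption-laden toy, not the authors' software: the function names are hypothetical, the mapping from bits to nucleotides is omitted, and repetition alone does not handle insertions or deletions, which would need further design choices not described in the abstract.

```python
def encode(bits):
    """Repeat every payload bit three times (Hamming(3, 1) / repetition code)."""
    return [b for b in bits for _ in range(3)]


def decode(coded):
    """Recover payload bits by majority vote over each consecutive triplet."""
    if len(coded) % 3 != 0:
        raise ValueError("coded length must be a multiple of 3")
    return [1 if sum(coded[i:i + 3]) >= 2 else 0 for i in range(0, len(coded), 3)]


# Hypothetical 4-bit payload, e.g. part of a barcode or lot identifier.
payload = [1, 0, 1, 1]
coded = encode(payload)

# Simulate two substitution errors that fall in different triplets.
corrupted = list(coded)
corrupted[1] ^= 1
corrupted[5] ^= 1

print(decode(corrupted) == payload)
```

The cost of this code is a threefold increase in encoded length, which is the trade-off for correcting one error in every group of three symbols.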
Session: Poster Session A
Time: Monday, September 24, 2018, 2:00 pm - 3:30 pm
Poster
63
Board #:
The Cluster Core Genome Size as a Metric to Improve Person-to-person Pathogen
Title:
Transmission Analysis
Author T. de Man, A. Laufer Halpin;
Block: US Centers for Disease Control and Prevention, Atlanta, GA.
Whole genome sequencing (WGS) provides a high resolution view of the microevolution of
bacterial pathogens causing infections. These data are used for tracing person-to-person
transmission events through phylogenetic analysis using single nucleotide polymorphisms
(SNPs) identified by read mapping to an outbreak reference genome. Pairwise distances
between sequenced isolates are usually measured in SNPs passing certain criteria (high
quality SNPs, hqSNPs) from the portion of the reference genome that is considered the core
genome of the isolate set (cluster core genome). Phylogenetically informative hqSNPs are
those that result from neutral point mutations rather than homologous recombination events
or phage DNA. Most phylogenetic tools only output SNPs, limiting inferences that can be
made for pathogen transmission events. However, the size of the cluster core genome
relative to the reference genome can inform a more accurate transmission analysis.
Therefore, we developed a Perl script that estimates the cluster core genome size across a
group of isolates, for a given depth of coverage, by assessing BAM files. An optional
parameter in which the user also provides a BED file of masked regions on the reference
genome will result in disregarding regions that do not harbor phylogenetically informative
hqSNPs (i.e., regions of homologous recombination). Our Perl script is available on GitHub
Abstract: and can be used with output from any SNP pipeline that generates BAM files. The script will
operate post-SNP calling to support isolate relatedness conclusions. However, challenges
remain in establishing transmission events with certainty; read coverage across a reference
genome is rarely uniform due to differences in genomic content, repetitive sequences, and
G+C content bias. Phylogenetically informative cluster core hqSNPs typically require support
of > 10 reads, meaning genomic areas with less read depth are unable to produce hqSNPs.
Increasing read depth stringency therefore leads to a smaller cluster core genome available
for hqSNP inclusion. If the majority of isolates do not meet the horizontal coverage threshold,
it could indicate a more closely related reference genome is needed or that those isolates are
unlikely to be related. Any sequenced isolate that does not meet the minimum coverage
threshold (e.g., 10X coverage across >80% of the reference genome) warrants further
assessment. In outbreak investigations where WGS is implemented, it is critical that cluster
core genome size and depth of coverage are always communicated with the number of
hqSNPs. A small cluster core genome, even with a low number of hqSNPs, can incorrectly
indicate the set of sequenced isolates is closely related. This might otherwise be missed if
only the number of hqSNPs is reported. Considering the cluster core genome size harboring
hqSNPs informs a more accurate transmission analysis to improve patient outcomes.
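The cluster core genome calculation can be sketched in a few lines. The example below is written in Python rather than Perl and is not the authors' script: it assumes per-position read depths have already been extracted from each isolate's BAM file, uses hypothetical function names, and treats masked regions as BED-style 0-based half-open intervals. The depth threshold is a parameter because the abstract mentions both ">10 reads" and "10X".

```python
def cluster_core_positions(depths_by_isolate, min_depth=10, masked=()):
    """Return reference positions covered at >= min_depth in every isolate.

    depths_by_isolate: dict of isolate name -> list of per-position depths,
                       all lists spanning the same reference.
    masked: iterable of (start, end) intervals to exclude (0-based, half-open).
    """
    lengths = {len(d) for d in depths_by_isolate.values()}
    if len(lengths) != 1:
        raise ValueError("all isolates must cover the same reference length")
    ref_len = lengths.pop()

    masked_positions = set()
    for start, end in masked:
        masked_positions.update(range(max(0, start), min(end, ref_len)))

    return [
        pos
        for pos in range(ref_len)
        if pos not in masked_positions
        and all(depths[pos] >= min_depth for depths in depths_by_isolate.values())
    ]


def core_fraction(depths_by_isolate, min_depth=10, masked=()):
    """Cluster core genome size as a fraction of the reference length."""
    core = cluster_core_positions(depths_by_isolate, min_depth, masked)
    ref_len = len(next(iter(depths_by_isolate.values())))
    return len(core) / ref_len


# Toy 6-bp reference with two isolates and one masked position.
depths = {"isolate_A": [12, 12, 3, 15, 20, 11], "isolate_B": [10, 9, 30, 15, 20, 11]}
print(cluster_core_positions(depths, min_depth=10, masked=[(4, 5)]))
print(core_fraction(depths, min_depth=10, masked=[(4, 5)]))
```

Reporting the resulting fraction alongside the hqSNP count makes it visible when a low SNP distance rests on only a small shared portion of the reference.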
Session: Poster Session A
Time: Monday, September 24, 2018, 2:00 pm - 3:30 pm
Poster
64
Board #:
WGS-based Characterisation of Listeria monocytogenes Isolated from the food-production
Title:
Chain and Humans in Italy
F. Chiesa1, S. Gallina2, L. Decastelli2, V. Filipello3, G. Kastanis4, M. Allard4, E. Brown4, S.
Lomonaco4;
Author 1University of Turin, Grugliasco, ITALY, 2Istituto Zooprofilattico Sperimentale del Piemonte
Block: Liguria e Valle d’Aosta, Torino, ITALY, 3Istituto Zooprofilattico Sperimentale della Lombardia,
Emilia Romagna, Brescia, ITALY, 4Center for Food Safety and Applied Nutrition, U.S. Food and
Drug Administration, College Park, MD.
Background: Listeria monocytogenes is an environmentally ubiquitous organism
contaminating food-processing environments. Consumption of contaminated food is thought
to be the cause of 99% of listeriosis cases. Whole genome sequencing (WGS) is a valuable
typing tool for L. monocytogenes isolates and the identification of virulence islands that may
influence infectivity. Methods: The 510 isolates were collected in Italy over 14 years (2002-
2016) and included 94 human clinical and 416 food/environmental isolates. Whole genome
sequences were obtained at the FDA-CFSAN sequencing facility. Available multi-locus
sequence typing schemes were used to assign sequence types (STs), clonal complexes (CCs)
and virulence types (VTs). The NCBI Pathogen Isolates Browser tool was used to determine
SNP clustering and antibiotic resistance (AMR) genotypes. Isolates were screened for the
presence of i) premature stop codons (PMSCs) in inlA, and ii) two Stress Survival Islets
(SSIs). Results: We observed 72 SNP clusters, 51 STs (4 novel), 38 CCs and 45 VTs. Ten CCs
accounted for more than 80% of the isolates. Most of the food/environmental isolates
belonged to CC9/VT11 (n=176) and CC121/VT94 (n=31), while clinical isolates were mostly
represented by CC1/VT20 (n=18). inlA PMSCs were found in 48% of isolates (n=246), and were
more frequent among environmental than clinical isolates (58% vs 10%, respectively).
Tetracycline-associated resistance gene tet(M) was observed in 5.3% of isolates (n=27). Genes
Abstract:
associated with benzalkonium chloride resistance were found in 16.8% (n=86
Tn6188 transposon) and 7.1% (n=37 bcrABC) of isolates. Tn6188 was found in 65% (n=21) of
CC121/VT94 isolates, while bcrABC was present in 77% (n=27) of CC31/VT113 isolates. Finally,
less than 1% of isolates harbored the efflux proteins emrE, qacA and qacC. SSI-1 was found in
63% of isolates (n=321), while SSI-2 in 7% (n=36). All CC9/VT11 isolates carried only SSI-1,
while all CC121/VT94 isolates only carried SSI-2. Conclusions: The overrepresentation of
CC9/VT11 could be related to multiple sampling of the same type/source. This could also
explain the high rate of tet(M), as 23 out of 27 tet(M) positive isolates belonged to
CC9/VT11. Among the diversity of L. monocytogenes strains from food production, some CCs
(e.g. CC121 and CC9) have been commonly reported as prevalent in processing environments.
This might be linked to the presence of SSIs or the ability to survive disinfectants. SSI-1
contributes to growth in suboptimal conditions, while SSI-2 is involved in stress response.
These SSIs can help L. monocytogenes adapt to specific niches in food processing
environments. Further studies are needed to characterize the impact of Tn6188 and
bcrABC on environmental persistence. Our findings could be helpful both in monitoring the
food production settings most at risk for L. monocytogenes and in supporting epidemiological
investigations of outbreaks.
Session: Poster Session A
Time: Monday, September 24, 2018, 2:00 pm - 3:30 pm
Poster
65
Board #:
Imipenem Resistance Mechanism Analysis in Pseudomonas aeruginosa Through Whole
Title:
Genome Sequencing Analysis
Author W. Chang, A. Saeed, V. Sapiro, T. Walker;
Block: Opgen, Gaithersburg, MD.
Background: Antibiotic resistance is accelerated by the misuse and overuse of antibiotics. For
better patient treatment and antibiotic stewardship, it is crucial to rapidly and accurately
determine a pathogen’s antibiotic resistance. P. aeruginosa has developed three
mechanisms to confer resistance to carbapenems: acquisition of carbapenem degradation or
modification enzyme genes; mutations of porin genes (e.g. oprD) to decrease outer
membrane permeability; and overexpression of efflux pump systems. In a previous study
presented at ASM Microbe 2018, we used whole genome sequencing (WGS) and
meropenem-resistance phenotypic data to show that all three mechanisms are involved in
resistance to meropenem, a carbapenem. In this study, we examined the relationship between
the relevant genes in P. aeruginosa and resistance to another carbapenem,
imipenem. Materials/methods: WGS sequencing reads or assemblies and phenotype data
describing imipenem resistance for 208 P. aeruginosa isolates were acquired from public
databases and analyzed with OpGen Acuitas® Whole Genome Sequencing Analysis pipelines.
Of these, 87 isolates were resistant to imipenem and 121 isolates were susceptible. All known
carbapenem degradation/modification enzyme genes and chromosomal gene mutations
conferring resistance to carbapenems were analyzed by comparison to reference genes from
the reference strain P. aeruginosa PAO1 (NCBI: NC_002516.2). Results: All three mechanisms
Abstract: were analyzed. 36 isolates possess at least one carbapenem degradation/modification
enzyme gene, 69 have loss of function (LOF) mutations in oprD genes and 13 have LOF
mutations in nalD (a transcription repressor for the efflux system, MexAB-OprM). In all, 88
isolates harbor at least one enzyme or mutation described above; of these, 78 isolates were
resistant to imipenem and 10 were susceptible. Using these as indicators of resistance to
imipenem, the prediction accuracy was 90.9%, sensitivity 89.7%, specificity
91.7%, positive predictive value 88.6% and negative predictive value 92.5%. As with
resistance to meropenem, isolates with a LOF mutation in oprD or with carbapenem
degradation/modification enzyme genes have a high positive predictive value for resistance to
imipenem: 91.7% and 92.8%, respectively. Although a LOF mutation in nalD is
related to imipenem resistance, mutations in mexR, a repressor of the same efflux pump
system, are not. However, unlike resistance to meropenem, OXA-2 and LOF mutations
in opdD do not confer resistance to imipenem. Conclusions: Similar to resistance to
meropenem, P. aeruginosa can become resistant to imipenem through acquisition of
carbapenem degradation enzymes, mutations in porin genes and efflux pump systems.
However, differences also exist between resistance to meropenem and resistance to
imipenem: for example, OXA-2 or a LOF mutation in opdD confers resistance to meropenem but not
to imipenem.
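The reported performance figures follow directly from the counts given in the Results: 87 resistant and 121 susceptible isolates, of which 88 carried at least one marker (78 resistant, 10 susceptible). The sketch below recomputes them; the function name is hypothetical and the false-negative and true-negative counts are derived by subtraction rather than stated in the abstract.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard confusion-matrix metrics for a binary resistance prediction."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # resistant isolates correctly flagged
        "specificity": tn / (tn + fp),   # susceptible isolates correctly cleared
        "ppv": tp / (tp + fp),           # flagged isolates that are resistant
        "npv": tn / (tn + fn),           # unflagged isolates that are susceptible
    }


resistant, susceptible = 87, 121
tp, fp = 78, 10                 # marker-positive isolates by phenotype
fn = resistant - tp             # resistant isolates without any marker
tn = susceptible - fp           # susceptible isolates without any marker

metrics = diagnostic_metrics(tp, fp, fn, tn)
for name, value in metrics.items():
    print(f"{name}: {value * 100:.1f}%")
```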
Session: Poster Session A
Time: Monday, September 24, 2018, 2:00 pm - 3:30 pm
Poster
66
Board #:
Whole Genome Sequencing and Bioinformatic Analysis of Two Foodborne Illness
Title:
Outbreaks: Campylobacter jejuni and Salmonella enterica
Author K. F. Oakeson;
Block: Utah Public Health Laboratory, Taylorsville, UT.
Whole genome sequencing (WGS) is rapidly becoming a powerful tool for determining the
relatedness of bacterial isolates in foodborne illness detection and outbreak investigation.
WGS has been applied to large national outbreaks and surveillance; however, it has rarely
been used in smaller local outbreaks. This work describes the retrospective application of
reference-free whole genome sequencing and bioinformatic analysis to a local outbreak of
Campylobacter jejuni associated with raw milk.
The current work demonstrates the superior resolution of genetic relatedness generated by
Abstract: WGS data analysis when compared to pulsed-field gel electrophoresis (PFGE). WGS is a
powerful alternative to PFGE for the determination of genetic relatedness between bacterial
isolates. WGS and bioinformatic analysis were applied to a Utah-specific
outbreak of Campylobacter jejuni associated with raw milk and to a national multi-state
outbreak of Salmonella enterica associated with rotisserie chicken to illustrate the flexibility
and scalability of the workflow. Together, these two analyses show how a reference-free
WGS workflow is superior to PFGE and to other WGS workflows that are based on single
nucleotide polymorphisms (SNPs).
Session: Poster Session A
Time: Monday, September 24, 2018, 2:00 pm - 3:30 pm
Poster
67
Board #:
Title: Maximal Viral Information Recovery from Sequence Data Using VirMAP
Author N. J. Ajami, M. C. Wong, M. C. Ross, R. E. Lloyd, J. F. Petrosino;
Block: Baylor College of Medicine, Houston, TX.
Accurate classification of the human virome is critical to a full understanding of the role
viruses play in health and disease. This implies the need for sensitive, specific, and practical
pipelines that return precise outputs while still enabling case-specific post hoc analysis. Viral
taxonomic characterization from metagenomic data suffers from high background noise and
signal crosstalk that confound current methods. Here we develop VirMAP, which overcomes
Abstract: these limitations using techniques that merge nucleotide and protein information to
taxonomically classify viral reconstructions independent of genome coverage or read overlap.
We validated VirMAP using published datasets and viral mock communities containing RNA
and DNA viruses and bacteriophages. VirMAP offers opportunities to enhance metagenomic
studies seeking to define virome-host interactions, improve biosurveillance capabilities, and
strengthen molecular epidemiology reporting.
Session: Poster Session A
Time: Monday, September 24, 2018, 2:00 pm - 3:30 pm
Poster
68
Board #:
Microbial Water Quality Assessment of Hurricane Harvey Floodwater Remnants Using
Title:
Shotgun Metagenomic Sequencing
J. Narayanan1, S. Chellam2, A. Kelley1, S. Christnacht2, H. M. Lavigne2, S. Das2, J. Murphy1, V.
Author Hill1;
1
Block: Centers for Disease Control and Prevention, Atlanta, GA, 2Zachry Department of Civil
Engineering Texas A&M University, College Station, TX.
It is estimated that culturable microbes represent less than one percent of the total microbial
population. With recent advances in next-generation sequencing (NGS), it is possible to obtain
a complete taxonomic and metabolic profile of microbial communities in water samples. In
the present study, we evaluated the two most commonly used NGS-based approaches -
targeted 16S rRNA amplicon sequencing, and shotgun sequencing - to analyze 11 floodwater
samples collected in the greater Houston area following Hurricane Harvey. The shotgun
sequencing data provides taxonomic and metabolic fingerprinting from all nucleic acids (DNA
and RNA) extracted, whereas the targeted 16S rRNA amplicon sequencing only provides
bacterial community profiles. The value of the shotgun metagenomic approach has been
successfully demonstrated by simultaneous detection of both opportunistic waterborne
Abstract:
bacterial pathogens (Legionella, Mycobacterium and Pseudomonas) and protozoan hosts
(Acanthamoeba and Naegleria). These groups of microorganisms typically co-exist in biofilms
associated with drinking water distribution systems or premise plumbing. A metagenomic
analysis of water samples also provided information on the presence of Vibrio. In addition,
fecal indicator bacteria, including Escherichia coli and Enterococcus faecalis, were detected in
samples. Several bacteriophages including those from Mycobacterium, Vibrio, Escherichia,
and Pseudomonas, were also identified along with their bacterial hosts. The shotgun
metagenomic sequencing approach developed herein provides reliable information on
microbial ecology and community level physiological profiles, which may aid in controlling
pathogens present in contaminated water through effective management practices.
Session: Poster Session A
Time: Monday, September 24, 2018, 2:00 pm - 3:30 pm
Poster
69
Board #:
Title: Multiplex NGS Detection of Conserved Regions of Bacterial Toxins in Environmental Water
A. Gonzalez-Revello1, R. Fort2, A. Iriarte3, P. Zunino4, C. Piccini4, J. Sotelo-Silveira1;
1
Genomics Dept., Instituto de Investigaciones Biológicas Clemente Estable, Montevideo,
Author URUGUAY, 2Sequencing Platform, Instituto de Investigaciones Biológicas Clemente Estable,
Block: Montevideo, URUGUAY, 3Dept. of Biotechnology, Instituto de Higiene, Fac. Medicina,
Montevideo, URUGUAY, 4Dept. Microbiology, Instituto de Investigaciones Biológicas
Clemente Estable, Montevideo, URUGUAY.
Controlling and microbiologically monitoring water with tools that identify
pathogens and their pathogenic or toxic capacities helps ensure adequate quality standards.
Currently, routine detection is based mainly on culture, which has clear limitations. We
aimed to introduce in our country, Uruguay, the detection of pathogens and their
pathogenicity or toxicity, particularly through new genomic techniques. Through the selection
of conserved regions in sequences of relevant bacterial toxins (focusing on cyanobacteria and
a series of pathogens relevant to human health) we designed Ampli-seq panels to multiplex
detection of these genes by NGS in water samples. Additionally, the community present in
the sample will be verified by metagenomics of 16S rRNA. A total of 274 primer pairs,
targeting 548 amplicons (85-340bp in length) were designed targeting 40 conserved regions
Abstract: and 32 full-length genes. Control samples yielded sequences of which 98% mapped to target regions,
with few off-target products, and results depended on template abundance. For full-length
genes, 80 to 90% of the sequence was recovered. Sequencing of environmental water
samples in which cyanobacterial blooms had previously been observed yielded amplicons
identifying the species and sequence variation. 16S metagenomics matched the profiles
of bacterial strains both in controls and environmental samples. Additionally, the system can
be used in combination with different sequencing platforms (Ion Torrent, Illumina, or Oxford
Nanopore). In general, the system proved to be sensitive and useful to detect a wide variety
of species yielding sequencing information that could be useful not only to detect the
presence of genes coding for known toxins but ones derived from species yet to be
characterized.
Session: Poster Session A
Time: Monday, September 24, 2018, 2:00 pm - 3:30 pm
Poster
70
Board #:
Comparison of Phenotypic and Genotypic Drug Susceptibility Test Results of 108 drug-
Title:
resistant Mycobacterium tuberculosis Isolates
A. Takaki1, Y. Murase1, A. Aono1, K. Chikamatsu1, Y. Igarashi1, H. Yoshida2, Y. Tamura2, T.
Author Nagai2, H. Yamada1, S. Mitarai1;
1
Block: The Research Institute of Tuberculosis, JATA, Tokyo, JAPAN, 2Osaka Habikino Medical Center,
Osaka Prefectural Hospital Organization, Osaka, JAPAN.
Background: Rapid diagnosis of drug-resistant tuberculosis (DR-TB), especially multidrug- and
extensively drug-resistant (M/XDR-) TB, is one of the important issues in current TB management
worldwide. In recent years, molecular diagnostics, including whole genome sequencing (WGS),
have become popular as a way to shorten the time needed for drug susceptibility testing (DST).
However, some discordance between genotypic and phenotypic
DST is known. The accuracy of such methods depends on reliable DST data. To detect
mutations and evaluate the corresponding DST, we compared WGS data with our DST results for
108 M. tuberculosis (MTB) isolates, including MDR and XDR isolates, using the 14 TB drug resistance
prediction tool, TGS-TB. Methods: A total of 108 MTB isolates were collected from DR-TB patients at
Osaka Habikino Medical Center (Osaka, Japan) from 1998 to 2016. WGS of the
isolates was performed with the QIAseq FX DNA Library Kit (QIAGEN) and MiSeq (Illumina) on cloned
strains. TGS-TB (https://gph.niid.go.jp/tgs-tb/) was used to predict mutations in
the responsible genes. The phenotypic DST results were obtained using the two kinds of
proportion method for 10 anti-tuberculosis drugs (Ogawa modified medium, Welpack-S,
Kyokuto) and 12 anti-tuberculosis drugs (Lowenstein-Jensen medium), and the minimum
Abstract:
inhibitory concentrations (MICs) for 16 drugs. DST for pyrazinamide (PZA) was conducted
using MGIT AST PZA (Becton Dickinson) and a simplified (modified) PZase test
(unpublished). Results: By phenotypic DST, the DR-TB isolates included 43 MDR and 35 XDR isolates
and belonged to lineage 2 (East-Asian, 88%) and lineage 4 (Euro-American, 12%). The
potential AMR prediction programme of TGS-TB predicted 81% (63/78) of MDR/XDR-TB
correctly. However, one susceptible isolate was predicted as MDR-TB. To date, we have analysed
DST for five main drugs: isoniazid (INH), rifampicin (RIF), kanamycin (KM), levofloxacin
(LVFX) and PZA. The sensitivities of predicted drug resistance for INH, RIF and LVFX were 93%,
93%, 91%, respectively. However, the specificity of INH was low (57%). In contrast, the
sensitivity and the specificity for KM were 68% and 100%, respectively. PZA showed high
sensitivity and specificity (>95%). Conclusion: To increase the accuracy of molecular DST, we
need to establish a high-quality genome database for DR-TB strains with reliable phenotypic
DST. Given the relatively low sensitivity/specificity of conventional DST for several drugs, a
continuous variable such as the MIC should show the relationship between mutations and drug
susceptibility in more detail.
Session: Poster Session A
Time: Monday, September 24, 2018, 2:00 pm - 3:30 pm
Poster
71
Board #:
Mosaic, a Cloud-Based Community Platform for the Acceleration of Translational Microbiome
Title:
Science
Author J. Didion;
Block: DNAnexus, Mountain View, CA.
Mosaic provides a collaborative space where researchers can implement and compare
microbiome methods through community challenges. The “Strains” series of challenges
encourages the improvement of strain-level performance of bioinformatic tools. “Standards”
Abstract:
addresses experimental and computational sources of variability in metagenomic analyses to
promote accurate and reproducible NGS-based microbiome profiling. Learn more and get
started at http://mosaicbiome.com/.
Session: Poster Session A
Time: Monday, September 24, 2018, 2:00 pm - 3:30 pm
Poster
72
Board #:
Title: Pathogen Detection Community Challenges on the Precision FDA Platform
Author J. Didion;
Block: DNAnexus, Mountain View, CA.
Next-generation sequencing (NGS) is revolutionizing microbial pathogen identification and
surveillance, and commercialization of metagenomics technologies is increasing at an
exponential rate. The U.S. Food and Drug Administration (FDA) seeks to promote the
development and improvement of bioinformatics pipelines for detecting pathogens in
samples sequenced using metagenomics by providing a cloud-based, community platform for
NGS assay evaluation and regulatory science exploration called precisionFDA. Over the past
Abstract: two years, precisionFDA has hosted and will continue to host a range of challenges, including
the Center for Food Safety and Applied Nutrition (CFSAN) Pathogen Detection Challenge, and the
Center for Devices and Radiological Health (CDRH) Biothreat Challenge. These research
activities have challenged the community to test the ability of their pipelines to detect
pathogens, while also revealing specific areas where improvement is needed. Current and
future challenges - hosted at https://precision.fda.gov/challenges - will continue to motivate
improvements in metagenomic pathogen surveillance.
Session: Poster Session A
Time: Monday, September 24, 2018, 2:00 pm - 3:30 pm
Poster
73
Board #:
Neisseria gonorrhoeae: Genomic Investigation of Azithromycin Resistant Strains, New South
Title:
Wales, 2017
C. R. George1, R. Rockett2, D. Whiley3, R. Enriquez1, R. Kundu1, J. El-Nasser1, V. Sintchenko4, M.
Author Lahra1;
1
Block: NSW Health Pathology, Randwick, AUSTRALIA, 2CIDM-PH, Westmead, AUSTRALIA, 3University
of Queensland, St Lucia, AUSTRALIA, 4NSW Health Pathology, Westmead, AUSTRALIA.
Background: Gonococcal antimicrobial resistance is a global concern, with treatment failure
now reported for every class of clinically assessed antimicrobial agent. Led by the United
Kingdom in 2011, and followed by many countries including the United States, China,
Australia, and since 2016 the World Health Organization, the recommended treatment
of Neisseria gonorrhoeae is dual therapy (azithromycin and ceftriaxone) to forestall resistance
to ceftriaxone. Gonococcal resistance continues to rise but there is a new paradigm, with
azithromycin resistance increasing globally, whilst rates of decreased susceptibility to
ceftriaxone (MIC 0.125 mg/L) are decreasing in Australia and overseas. In Australia, rates of
ceftriaxone decreased susceptibility (0.06-0.125 mg/L) fell from 8.8% in 2013 to 1.2% in 2017
(Quarter 1), whilst azithromycin resistance (MIC ≥ 1 mg/L) increased from 2.1% in 2013 to
Abstract: 10.3% in 2017 (Quarter 1). A major outbreak of azithromycin resistant N. gonorrhoeae was
first detected in South Australia in 2016, and secondary outbreaks have occurred in several
states including New South Wales. Aim: To investigate the outbreak of azithromycin
resistant N. gonorrhoeae in Australia by characterising strains and determining if the
resistance mechanisms were novel or known. Methods: We used Whole Genome Sequencing
and other molecular investigations to rapidly detect and characterise antimicrobial resistance
mechanisms in gonococcal isolates identified from New South Wales during 2017 (Quarter
1, n= 82). Results: We demonstrate that the use of molecular investigations including Whole
Genome Sequencing provides much needed solutions for investigating the characteristics of
outbreaks in this era of emerging antimicrobial resistance. We demonstrate relationships
between azithromycin resistance and epidemiological population groups.
Session: Poster Session A
Time: Monday, September 24, 2018, 2:00 pm - 3:30 pm
Poster
74
Board #:
Title: A Modular, Versatile WGS Data Analysis Pipeline for Bacterial Outbreak Investigations
Author W. Haas, P. Lapierre, K. Musser;
Block: NYSDOH - Wadsworth Center, Albany, NY.
Background: The pathogens Legionella pneumophila and Escherichia coli have in common
that they can cause life-threatening infections, give rise to outbreaks that affect large
numbers of people, and are prone to horizontal gene transfer (HGT). These factors make
isolate characterization a priority and a challenge for public health laboratories, especially in
the case of species with high genome plasticity where a single mobile genetic element can
drastically change the genetic makeup of an isolate. Here, we present a versatile
bioinformatic pipeline that takes HGT into consideration and that can be easily adapted to
any bacterial pathogen. Methods: Illumina sequencing reads are trimmed and subjected to
quality control. The query isolate's genome sequence is obtained through de novo assembly
and compared to other genomes to find the best possible reference for mapping and variant
calling. A phylogenetic tree and a minimum spanning tree are generated from all genomes
that have been mapped to the same reference to depict how the isolates within a cluster are
related to each other. If the percentage of unmapped reads or the number of variants are too
large, the query's sequence will be added to the list of candidate reference genomes to serve
as the nucleus for its own cluster. Several built-in controls ensure that the data are reliable and
Abstract:
alert the user to potential issues. Log, report, and summary files are generated automatically
to simplify data analysis and reporting. Results: Using WGS data from outbreaks of
Legionnaires’ disease in New York State as test samples, the pipeline was able to correctly
group related isolates into clusters and identify the sources of the outbreaks by variant
analysis. The pipeline was easily able to accommodate other species, such as E. coli, by
changing the reference database. New capabilities, such as predicting the presence of Shiga-
toxin genes, were easily added by appending modules to the program. The ability to select a
reference from several candidates and to automatically add new references produced a more
detailed genome comparison since mapping all isolates to a single reference ignores mobile
genetic elements. Conclusions: While WGS is replacing methods such as pulsed-field gel
electrophoresis as the standard for isolate characterization, the methods to analyze these
data are far from being standardized. Producing inaccurate results can have severe
consequences in a public health setting, potentially resulting in increased morbidity and
mortality. Here, we present a bioinformatic data analysis pipeline that is accurate, robust,
and versatile.
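The reference-selection step described in the Methods can be sketched as a simple decision rule. The example below is a simplified assumption, not the authors' pipeline: the function name, the way mapping statistics are passed in, and both thresholds are hypothetical, since the abstract does not state what counts as "too large".

```python
def choose_reference(query_id, mapping_stats, references,
                     max_unmapped_pct=10.0, max_variants=1000):
    """Pick a reference for a query isolate, or promote the query to a new reference.

    mapping_stats: dict of reference id -> (unmapped_read_pct, variant_count)
                   obtained by mapping the query to each candidate reference.
    Returns (chosen_reference, updated_reference_list).
    Both thresholds are illustrative placeholders.
    """
    candidates = [ref for ref in references if ref in mapping_stats]
    if candidates:
        # Prefer the lowest unmapped percentage, then the fewest variants.
        best = min(candidates, key=lambda ref: mapping_stats[ref])
        unmapped_pct, variant_count = mapping_stats[best]
        if unmapped_pct <= max_unmapped_pct and variant_count <= max_variants:
            return best, list(references)
    # No acceptable reference: the query seeds its own cluster.
    return query_id, list(references) + [query_id]


references = ["refA", "refB"]
close_match = {"refA": (2.0, 40), "refB": (15.0, 900)}
distant = {"refA": (25.0, 5000), "refB": (30.0, 7000)}

print(choose_reference("query1", close_match, references))
print(choose_reference("query2", distant, references))
```

Letting a poorly matching query become a new reference is what allows isolates with divergent accessory content to be compared against a closer genome instead of being forced onto a single one.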
Session: Poster Session A
Time: Monday, September 24, 2018, 2:00 pm - 3:30 pm
Poster Board #: 75
Title: Metagenomic and Bioinformatic Evaluation of Clinical Specimens for Culture-Independent
Source Attribution of Legionnaires’ Disease
Author Block: J. W. Mercante, J. A. Caravas, L. Lie, B. H. Raphael, J. M. Winchell;
Centers for Disease Control and Prevention, Atlanta, GA.
Abstract: Legionnaires’ disease (LD) is a severe and sometimes fatal pneumonia caused
by Legionella bacteria. Most Legionella species inhabit natural freshwater environments, but
they can also colonize man-made water networks, causing disease when a susceptible host is
exposed to contaminated aerosols. Identifying the source of exposure is important for
stopping ongoing disease transmission. Recently, NGS-based methods have allowed high-
resolution genetic matching of clinical and environmental Legionella isolates during
outbreaks. Yet, LD investigations often fail to recover clinical and/or environmental isolates,
creating uncertainty as to the source of infection. The aim of this study was to explore
metagenomic sequencing and bioinformatic analysis methods for rapid LD source attribution
when clinical isolates are not available. Previous results from our laboratory suggested low-
burden clinical samples may not yield sufficient Legionella sequence data for proper analysis.
Thus, a PCR-based strategy was developed, using Legionella spp. (ssrA) and human-specific
(RNaseP) targets, to prioritize clinical specimens with higher bacterial ratios for sequencing.
To evaluate this strategy, 29 culture-positive respiratory specimens were tested and 2 high
quality specimens were chosen for WGS using MiSeq V3 chemistry. Resulting sequencing
reads were taxonomically profiled and binned with Kraken using a modified v1 database.
Remaining, uncategorized reads were identified using the output of DIAMOND in MEGAN6.
Between 0.2% (~52,000 reads) and 0.6% (~76,000 reads) of the paired sequences were
categorized as Legionella and approximately 99.4% of all reads were of human origin.
Mapping reads to the strain Paris reference sequence revealed 6-10X coverage on average
across 85-87% of the reference genome. Sequences were then analyzed by a previously
constructed gene-by-gene Legionella typing scheme implemented in the Bionumerics
platform, and 301 to 994 complete genetic loci were called from the assembled
metagenomes. Importantly, hierarchical clustering showed both clinical-derived
metagenomes clustered tightly within clades (at 98.6-99.3% identity) that included
the Legionella isolate (>2500 loci identified) previously recovered from these specimens, as
well as with additional clinical isolates associated with the same LD outbreak. This study
demonstrates a metagenomic workflow strategy that may provide sufficient genomic
resolution for LD source attribution in the absence of a clinical isolate. Further studies are
underway to optimize the strategy using human-DNA depletion methods and environmental
microbiome enrichment.
The findings and conclusions in this presentation are those of the authors and do not
necessarily represent the official position of the Centers for Disease Control and Prevention.
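The specimen prioritization step described above can be illustrated with a minimal sketch. It assumes the PCR readout is a cycle threshold (Ct) per target and that amplification efficiency is close to 100%; the abstract does not state how the bacterial ratio was computed, so the delta-Ct formula, the field names, and the specimen values below are all hypothetical.

```python
# Hypothetical sketch: rank specimens by Legionella (ssrA) signal relative to
# human (RNaseP) signal using a simple delta-Ct estimate. Not the authors' method.

def relative_bacterial_signal(ct_legionella, ct_human, efficiency=2.0):
    """Estimate the Legionella-to-human template ratio from two Ct values.

    A lower Ct means more starting template, so the ratio grows as the
    Legionella Ct drops below the human Ct. efficiency=2.0 assumes perfect
    doubling per cycle, which is an idealization.
    """
    return efficiency ** (ct_human - ct_legionella)


def prioritize(specimens, top_n=2):
    """Return the IDs of the top_n specimens with the highest relative signal."""
    ranked = sorted(
        specimens,
        key=lambda s: relative_bacterial_signal(s["ct_ssrA"], s["ct_RNaseP"]),
        reverse=True,
    )
    return [s["id"] for s in ranked[:top_n]]


# Invented example values, for illustration only.
specimens = [
    {"id": "S1", "ct_ssrA": 32.0, "ct_RNaseP": 24.0},
    {"id": "S2", "ct_ssrA": 22.0, "ct_RNaseP": 26.0},
    {"id": "S3", "ct_ssrA": 27.0, "ct_RNaseP": 27.5},
]
selected = prioritize(specimens, top_n=2)
```

In practice the ranking threshold would need to be calibrated against sequencing yield, which this sketch does not attempt.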
Session: Poster Session A
Time: Monday, September 24, 2018, 2:00 pm - 3:30 pm
Poster Board #: 76
Title: Using Nanopore Sequencing to Understand the Widespread Antibiotic Resistance in the
Alaskan Soil Resistome
Author Block: T. Haan, M. C. McCarthy, A. Ducluzeau, D. M. Drown;
University of Alaska Fairbanks, Fairbanks, AK.
Abstract: Resistant microbes may have a significant negative impact on the health of Alaskans.
Identifying specific antibiotic resistant microbes is essential for quick and appropriate
treatment. Increased antibiotic resistance in the environment may limit treatment options for
infections. As the climate changes and permafrost thaws, antibiotic resistant microbes may
multiply at an increased rate. Multidrug resistance genes are often found on plasmids,
enabling rapid sharing via horizontal gene transfer. We also should consider transmission
via wildlife and humans across Alaska and the circumpolar Arctic. Here, using culture-based
methods, we found widespread antibiotic resistance along with heavy metal tolerance in local
boreal forest soils affected by thawing permafrost. Using novel metagenomic analysis of
long-read, Nanopore DNA sequence data, we identify individual resistance genes present in our
samples. mcr genes confer resistance to colistin, a drug of last resort due to its side effects. We
found ORFs homologous to mcr3, a plasmid-borne resistance gene discovered by a clinical
study of isolates from Asia and the United States. mcr3 is likely related to a
phosphoethanolamine transferase gene in Enterobacteriaceae and Aeromonas species,
common environmental microbes. The CDC now tracks the mcr gene superfamily. In our
samples, homologs of mcr3 were found in all soil samples and in 6 different genomes
tentatively identified as suspected environmental microbial reservoirs. Importantly, these
methods allow us to better understand the environmental reservoir of antibiotic resistance in
Alaska.
Session: Poster Session A
Time: Monday, September 24, 2018, 2:00 pm - 3:30 pm
Poster Board #: 77
Title: Single Chromosomal Genome Assemblies on the Sequel System with Circulomics High
Molecular Weight DNA Extraction for Microbes
Author Block: J. Wong1, C. Heiner1, M. Kim2, H. Ferrao1, V. J. Wallace2, K. Eng1, R. Fedak2, J. Wilson1, D.
Kilburn2, M. Ashby1, P. Baybayan1, J. M. Burke2, K. Bjornson1, K. J. Liu2;
1Pacific Biosciences, Menlo Park, CA, 2Circulomics, Baltimore, MD.
Abstract: Background: Recent developments with Nanobind technology from Circulomics provide an
elegant high molecular weight (HMW) DNA extraction solution for sequencing genomes from
Gram-positive and -negative microbes. Nanobind is a nanostructured magnetic disk that can
be used for rapid extraction of HMW DNA from diverse sample types including cultured cells,
blood, plant nuclei, and bacteria. Processing can be either automated using common
instruments or performed manually, and it can be completed in <1 hour for most sample
types. Methods: We have validated several critical workflow steps for generating high-quality
microbial genome assemblies in a high-throughput environment in a new streamlined
microbial multiplexing workflow. This workflow enables high-volume and cost-effective
sequencing of up to 16 microbes totaling 30 Mb in genome size on a single SMRT Cell 1M
using a target shear size of 10 kb. We also evaluated this method on a set of four “class 3”
microbes with >7 kb repeats. The fragment size was increased to ~14 kb, with some
fragments >30 kb. Results: Here we share data demonstrating these new capabilities using
isolates relevant to high-throughput sequencing applications, including Shigella, common
foodborne pathogens (Listeria, Salmonella), and species often seen in hospital settings
(Klebsiella, Staphylococcus). For nearly all microbes, including the difficult class 3 microbes,
we achieved complete de novo microbial assemblies of ≤5 chromosomal contigs with
minimum quality scores of 40 (99.99% accuracy) using data from multiplexed SMRTbell
libraries. Each library was sequenced on a single SMRT Cell 1M with the PacBio Sequel System
and analyzed with streamlined SMRT Analysis assembly methods. Conclusions: Using a
combination of Circulomics Nanobind extraction and PacBio SMRT Sequencing, we prepared
and sequenced a pool of microbes totaling ~30 Mb on one SMRT Cell 1M. With our
streamlined workflow, which includes automated demultiplexing and push-button assembly,
we achieved complete closed genomes for most microbes with quality values typically ranging
from 40 to over 50.
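The quality values quoted above follow the standard Phred scale, Q = -10 log10(P), where P is the per-base error probability. The short sketch below (function names are illustrative, not part of SMRT Analysis) shows why Q40 corresponds to 99.99% accuracy and Q50 to 99.999%.

```python
import math


def phred_to_accuracy(q):
    """Convert a Phred-scaled quality value to per-base accuracy."""
    error_probability = 10 ** (-q / 10)
    return 1.0 - error_probability


def accuracy_to_phred(accuracy):
    """Convert per-base accuracy (0 < accuracy < 1) to a Phred-scaled value."""
    error_probability = 1.0 - accuracy
    return -10 * math.log10(error_probability)


# Print the accuracy implied by the two quality values mentioned in the abstract.
for q in (40, 50):
    print(f"Q{q}: {phred_to_accuracy(q):.5%} accuracy")
```

Put differently, Q40 means roughly one expected error per 10,000 bases, so a 5 Mb genome at Q40 would be expected to carry on the order of 500 residual errors.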
Session: Poster Session A
Time: Monday, September 24, 2018, 2:00 pm - 3:30 pm
Poster Board #: 78
Title: rRNATagger: an Integrated Pipeline for Marker Gene Amplicon Sequence Data Processing
Geared for HPC Environments
Author Block: J. Tremblay1, E. Yergeau2, C. W. Greer1;
1National Research Council Canada, Montreal, QC, CANADA, 2INRS - Centre Armand-Frappier,
Laval, QC, CANADA.
Abstract: With the advent of high throughput sequencing, microbiology is increasingly becoming a data
intensive field of science. Because of its low cost, robust databases and established
bioinformatic workflows, sequencing of 16S/18S/ITS rRNA gene amplicons, which provides a
marker of choice for phylogenetic studies, has become ubiquitous and has grown into the
backbone of modern microbial ecology. Many established end-to-end bioinformatic pipelines
are available to perform short amplicon data analysis and have proven to be central for
advancing the field of microbial ecology. These pipelines have been partly written for a
general audience, which is arguably a main reason for their widespread adoption. However,
few options exist for a more specialized audience that is experienced in Linux-based systems
and high performance computing (HPC) environments. For such an audience, existing
pipelines can be limiting to fully leverage modern HPC capabilities and perform tweaking and
optimization operations. Moreover, a wealth of stand-alone software packages that perform
specific targeted bioinformatic tasks are increasingly accessible through code repositories and
scientific publications and finding a way to easily integrate these applications in a pipeline is
critical in the context of the fast-paced evolution in bioinformatic methodologies. Here we
describe rRNATagger, a short rRNA marker gene amplicon pipeline coded in a Python
framework that enables fine tuning and integration of virtually any potential rRNA gene
amplicon bioinformatic procedure. It is designed to work within an HPC environment,
supporting a complex network of job-dependencies with a smart-restart mechanism in case
of job failure or parameter modifications. As proof of concept, we present end results
obtained with rRNATagger using 16S, 18S, ITS and PacBio long read amplicon data types as
input. Using a selection of published algorithms for generating Operational Taxonomic Units
and Single Nucleotide Variants and for computing downstream taxonomic summaries and
diversity metrics, we demonstrate the performance and versatility of our pipeline for
systematic analyses of amplicon sequence data.
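The idea of a job-dependency network with a restart mechanism can be sketched as follows. This is not rRNATagger's code: the job names, the parameter-hash bookkeeping, and the `plan_restart` function are assumptions made for illustration. A job is rerun if it never completed, if its parameters changed since its last successful run, or if anything upstream of it must rerun.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+
import hashlib
import json


def param_hash(params):
    """Stable fingerprint of a job's parameters."""
    return hashlib.sha256(json.dumps(params, sort_keys=True).encode()).hexdigest()


def plan_restart(dependencies, params, completed):
    """Return the jobs that must (re)run, in dependency order.

    dependencies: job -> list of jobs it depends on
    params:       job -> current parameter dict
    completed:    job -> parameter hash recorded at its last successful run
    """
    to_run = []
    stale = set()
    for job in TopologicalSorter(dependencies).static_order():
        # Missing entry (failed or never run) or changed parameters.
        changed = completed.get(job) != param_hash(params.get(job, {}))
        # Any upstream job being rerun invalidates this job's outputs.
        upstream_stale = any(dep in stale for dep in dependencies.get(job, ()))
        if changed or upstream_stale:
            stale.add(job)
            to_run.append(job)
    return to_run


# Hypothetical amplicon workflow.
DEPENDENCIES = {
    "quality_filter": [],
    "cluster_otus": ["quality_filter"],
    "assign_taxonomy": ["cluster_otus"],
    "diversity_metrics": ["cluster_otus"],
}
PARAMS = {
    "quality_filter": {"min_q": 20},
    "cluster_otus": {"identity": 0.97},
    "assign_taxonomy": {},
    "diversity_metrics": {},
}
```

On a real HPC scheduler the returned list would be submitted with the scheduler's own dependency flags; the sketch only decides which jobs are stale.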
Session: Poster Session A
Time: Monday, September 24, 2018, 2:00 pm - 3:30 pm
Poster Board #: 79
Title: Tracking the Resistome in One Health Surveillance
Author Block: H. Tate;
FDA, Laurel, MD.
Abstract: The National Antimicrobial Resistance Monitoring System and others have shown that the
presence of known antimicrobial resistance determinants is highly correlated with clinical
resistance. Studies show that whole genome sequence data can be used to reliably predict
resistance in Salmonella, Campylobacter, E. coli and other bacteria, making it possible to infer
resistance from genomes without traditional antimicrobial susceptibility data. Drawing on
that conclusion, in November 2017 the NARMS program launched Resistome Tracker, an
online tool that provides easily accessible and visually informative interactive displays of
antibiotic resistance genes. Resistome Tracker harvests resistance gene information from
genomic data deposited at NCBI and allows users to customize visualizations by antibiotic
drug class, compare resistance genes across different sources, identify new resistance genes,
and map selected resistance genes to geographic region. The tool also provides alerts about
new resistance traits as they emerge in a region or source to provide early warning on
emergent trends. Because a variety of sources are represented in the NCBI dataset,
Resistome Tracker inherently employs a One Health approach to antimicrobial resistance
surveillance. By understanding potential reservoirs for the dissemination of resistant traits,
public health officials, academics, and researchers can develop interventions to stop or slow
the spread of antibiotic resistance.
Session: Poster Session A
Time: Monday, September 24, 2018, 2:00 pm - 3:30 pm
Poster Board #: 80
Title: Using Whole Genome Sequencing for Detection of Bacillus cereus Toxin Genes in Food
Author Block: A. T. Nguyen1, S. M. Tallent2;
1Merieux NutriSciences, Chicago, IL, 2US FDA/CFSAN/ORS, College Park, MD.
Abstract: Introduction: Bacillus cereus is among the top ten pathogens associated with foodborne
illness in the United States; it causes diarrheal disease (ingestion and production of hemolysin
BL (Hbl), nonhemolytic enterotoxin (Nhe), or cytotoxin (CytK) in the gut) or emetic disease
(ingestion of pre-formed cereulide (Ces)). Currently, toxin detection is only available for one
component of each tri-part toxin (HblC and NheB) or by mass spectrometry
(Ces). Purpose: Presently, there is no primary screening method for detection of toxin genes
before detection using a commercialized kit or mass spectrometry method. Development of a
genomic sequencing method and pipeline for initial screening of toxin-producing genes would
provide fast primary detection for potential toxins and other virulence factors. Methods: A
sensitive whole genome sequencing method and analysis pipeline using BTyper, a
computational tool that classifies and characterizes virulence potential of suspected strains,
was developed. DNA was extracted and sequenced from B. cereus culture or spiked food (rice,
gravy, whey powder, pancake mix, and infant formula), and BTyper was used to analyze
sequence data. Results: Our results show that BTyper can be used to detect toxin-producing
strains of B. cereus in culture and in six different food types. BTyper analysis confirmed a toxin
profile of 49/50 Hbl and 48/50 Nhe in inclusivity strains and 30/30 exclusivity strains
previously studied by PCR and examined for toxin production by commercialized protein
assays. Significance: Detection of B. cereus toxins is paramount to ensuring food safety;
however, toxin detection from food is laborious and time-consuming, and some toxins are not
directly detectable from food. Development of this method for detection of toxin genes is a
powerful primary screening method and allows for the acquisition of more in-depth data
about the suspected strain.
Session: Poster Session A
Time: Monday, September 24, 2018, 2:00 pm - 3:30 pm
Poster Board #: 81
Title: Agricultural Origins of a Highly-persistent Clone of Vancomycin-resistant Enterococcus
faecalis in New Zealand
Author Block: R. Rushton-Green1, G. Tairoa1, R. Darnell1, G. P. Carter2, D. Williamson2, G. Cook1, X. Morgan1;
1University of Otago, Dunedin, NEW ZEALAND, 2Doherty Institute, University of Melbourne,
Melbourne, AUSTRALIA.
Abstract: Background: Enterococcus faecalis and Enterococcus faecium are human and animal gut
commensals. Vancomycin-resistant enterococci (VRE) are important opportunistic pathogens
with limited treatment options. Historically, the glycopeptide antibiotics vancomycin and
avoparcin have cultivated vancomycin resistance in both human and animal isolates, resulting
in global cessation of avoparcin use between 1997 and 2000. To better understand human
and animal-associated VRE strains in the post-avoparcin era, we sequenced the genomes of
231 VRE isolates from New Zealand (NZ) (75 human clinical, 156 poultry) cultured between
1998 and 2009. Results: A clonal E. faecalis strain (MLST 108) was highly prevalent among
both poultry and human isolates in the three years following avoparcin discontinuation in
2000, consistent with previous molecular typing of NZ VRE strains. Metadata and
antimicrobial susceptibility information suggest an agricultural origin for this clone, and that
historic antimicrobial use led to evolution of clinically-relevant resistances. VRE isolate
resistomes were largely carried on multiple, heterogeneous plasmids containing diverse
resistance determinants and multiple linked selection mechanisms. In contrast to E. faecalis,
lineages of E. faecium delineate between agricultural and human
reservoirs. Conclusions: Historic use of antimicrobials in NZ agriculture has driven the
evolution of a clonal VRE strain carrying a range of clinically-relevant antimicrobial
resistances. The high genomic conservation and the near-universal presence of bacitracin
resistance genes suggest a common poultry origin with high bacitracin exposure. The
persistence of this clone in NZ for over a decade indicates that co-selection and other
stabilizing mechanisms may be important drivers for the persistence of this clone.
Session: Poster Session A
Time: Monday, September 24, 2018, 2:00 pm - 3:30 pm
Poster Board #: 82
Title: A Molecular Biology Cloud for the Microbial NGS Researchers
Author Block: V. Nagarajan;
MolBioCloud, Silver Spring, MD.
Abstract: Background: The combination of high-throughput NGS data technology and
democratized public cloud computing power has created a perfect environment for
biomedical discovery. Unfortunately, the complexity of the cloud ecosystem and the
mind-boggling software dependencies baked into the modern open source ecosystem pose a
serious threat to this wonderful opportunity that biomedical researchers are facing today.
In this work, we demonstrate a cloud computing platform that truly brings the power of the
cloud to the researcher's fingertips. Methods: We carefully reviewed the literature for all the
published software, pipelines and workflows that are related to applied microbial NGS data
analysis. The identified tools were then individually installed, configured and tested in
MolBioCloud (an AWS cloud computing marketplace DaaS product that is FedRAMP
compliant and user friendly, with a drag-and-drop Desktop interface). The tools
were then packaged (MicrobialMolBioCloudPackage) and tested as a single installer for use
with MolBioCloud. The platform was tested multiple times using several different instance
types within the Amazon Cloud Computing Infrastructure, for reliability and
reproducibility. Results: The MolBioCloud platform already comes prepackaged with
thousands of standard molecular biology and latest versions of NGS analysis tools. The
MicrobialMolBioCloudPackage that we developed for this work, is freely available at
https://molbiocloud.com/help/tiki-index.php?page=MicrobialMolBioCloudPackage . This
package consists of several tools for NGS based MLST, taxonomy classification, metagenomics,
community composition, GWAS and population genomics. Hundreds of popular and standard
tools including the latest versions of mothur, QIIME2, PathoScope, MashTree, Staphopia,
Snippy, Prokka, Roary, abricate, SalmID, Sixess, MEGAN, Krona, MetaPhlAn, SNP-sites etc. are
included in this package with thousands of dependencies resolved, configured and
tested. Conclusion: The MicrobialMolBioCloudPackage is a free one-step installer that helps
set up a complicated computing environment in a very easy manner. Along with a secure,
on-demand, pay-per-use public cloud model, this platform has great potential to help the
microbial researchers realize this amazing discovery opportunity, using the BigData and the
public Cloud combination. With this platform, the researchers can now easily set up a secure,
powerful, easy to use, biologist friendly, low-cost and scalable virtual private cloud based nice
and powerful Desktop system, for their personal research.
Session: Poster Session A
Time: Monday, September 24, 2018, 2:00 pm - 3:30 pm
Poster Board #: 83
Title: Next-generation Sequencing and Antibiotic Resistance Profiles of Salmonella Strains Isolated
from Stream Sediments and Poultry Litter in the Shenandoah Valley of Virginia
Author Block: C. Holmes, S. Jurgensen, N. Greenman, J. Herrick;
James Madison University, Harrisonburg, VA.
Abstract: Environmental reservoirs of Salmonella -- particularly those related to agricultural practices --
may contribute significantly to the dissemination of these potential human pathogens.
However, such reservoirs are not well characterized. Furthermore, the use of antibiotics in
animal agriculture has potentially expanded the transfer and recombination of antimicrobial
resistance (AMR) genes among Salmonella and other bacterial populations in environmental
ecosystems. Surveillance of Salmonella in ecosystems such as streams and soils is potentially
an important tool for understanding the overall distribution and epidemiology of this
pathogen. Stream sediments from seven sites and chicken litter from five poultry houses in
the Shenandoah Valley were sampled between October 2016 and January 2018. Modified
FDA Bacteriological Analytical Manual methods of pre-enrichment, enrichment, and isolation
were used to isolate 38 Salmonella strains. Thirty-three were isolated from stream sediments
and five from litter from a single commercial chicken house. Putative Salmonella were
confirmed by polymerase chain reaction amplification of the Salmonella-specific invA gene.
DNA of all isolate genomes was Illumina-sequenced and assembled and eight were sequenced
using Nanopore technology and hybrid-assembled. Fourteen different serotypes were
identified (using SeqSero, SISTR and SMART PCR) among the 38 isolates. AMR profiles were
determined using Sensititre MIC assays as well as surveyed in silico using KmerResistance and
ABRicate utilizing the ARG-Annot database. All isolates possessed the ampicillin resistance
gene ampH and at least one aminoglycoside resistance gene. Thirteen isolates also had
resistance phenotypes and/or genes encoding resistance to other clinically or agriculturally
relevant antibiotics. Populations of Salmonella in poultry litter and especially in stream
sediments impacted by agricultural runoff may constitute important environmental reservoirs
for antibiotic resistance genes.
Session: Poster Session A
Time: Monday, September 24, 2018, 2:00 pm - 3:30 pm
Poster Board #: 84
Title: Rapid Detection of Antimicrobial Resistance Markers in Bacillus anthracis by Nanopore Whole
Genome Sequencing
Author Block: A. S. Gargis1, B. Cherney1, A. Conley2, H. P. McLaughlin1, D. Sue1;
1Centers for Disease Control and Prevention, Atlanta, GA, 2IHRC Inc. Georgia Tech Applied
Bioinformatics Laboratory, Atlanta, GA.
Abstract: Background: Bacillus anthracis is a spore-forming bacterium that causes anthrax in
humans. Antibiotics, including tetracyclines, fluoroquinolones, and β-lactams, are typically
effective for anthrax treatment. During an emergency, the detection of antimicrobial
resistance in B. anthracis strains will be critical for an effective public health response. High
quality whole genome sequencing data may be useful to detect genetic engineering, plasmids
or antimicrobial resistance gene markers in implicated strain(s). Methods: Commercially
available DNA extraction kits, such as the MasterPure Complete DNA and RNA Purification Kit
(Epicentre) and QIAamp DNA Blood Mini Kit (Qiagen), can purify B. anthracis DNA, but have
not been evaluated for nanopore sequencing. Here, both kits were used to isolate gDNA from
agar-cultured cells of a select agent-excluded, attenuated B. anthracis strain (Sterne/pUTE29)
that is not susceptible to tetracycline. Sequencing libraries were prepared using the Rapid
Sequencing Kit (Oxford Nanopore Technologies). De novo assemblies were generated using a
custom pipeline (Albacore, Minimap/Miniasm, Racon, and Nanopolish). We assessed (1) DNA
quantity and quality, (2) if purified DNA could be used for library preparation, (3) sequence
quality of MinION versus Illumina and PacBio data, (4) whether the virulence plasmid, pXO1,
and an introduced plasmid, pUTE29, could be assembled, and (5) if the tetracycline resistance
gene (tetL) could be identified. Finally, retrospective down sampling of MinION reads was
performed to assess the time required for de novo assembly. Results: DNA of sufficient
quality and quantity for MinION sequencing was obtained from both kits, but an optimized
cell lysis with extended incubation (1 h) was required. Nanopore sequencing yielded 3-6 Gb of
data and each de novo assembly covered >99% of the reference sequence (Sterne). Large
(pXO1, 181.7 kb) and small (pUTE29, 7.3 kb) plasmids, as well as tetL were detected in
MinION, Illumina, and PacBio data. Mapping MinION reads to the Sterne reference revealed
numerous (4,000 to 5,000) small indels, especially in homopolymer regions, compared to
Illumina and PacBio assemblies. Down sampling analysis revealed that only minor
improvements in assembly quality were observed after analysis of 100,000 reads. DNA
extraction, library preparation, sequencing, and assembly were completed in ~10 h from a
pure B. anthracis isolate. Conclusions: With an optimized lysis step, both extraction methods
produced B. anthracis DNA of sufficient quality and quantity for the Rapid Sequencing Kit.
MinION reads were error-prone compared to Illumina and PacBio data, particularly in
homopolymer regions. However, plasmids and an introduced antimicrobial resistance
gene were assembled from MinION data. The ability to de novo assemble a B.
anthracis genome from a culture isolate in ~10 h using nanopore sequencing could contribute
to an expedited public health response.
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 85
Title: Characterization of Microbiota in Cerebrospinal Fluid (CSF) from Patients with CSF Shunt
Infections Using Shotgun Sequencing
Author Block: P. Hodor1, C. Pope2, K. Whitlock1, L. Hoffman2, T. Simon1;
1Seattle Children's Hospital, Seattle, WA, 2University of Washington, Seattle, WA.
Abstract: Background: Treatment of hydrocephalus consists of surgical placement of a CSF shunt.
Approximately 10% of patients develop a shunt infection within 1 year of CSF shunt
placement. Approximately 20% of patients with first infection develop reinfection. It is not
known whether reinfections are caused by an organism previously present in the host or are
independent infection events. Identification of microorganisms associated with CSF shunt
infections has traditionally relied on culture methods, but high throughput sequencing of 16S
ribosomal RNA has been adopted more recently to identify bacterial species present. Here we
present the results of a pilot study using whole genome shotgun sequencing and evaluate the
additional resolution this method provides to our understanding of CSF shunt
infection. Methods: CSF samples were obtained from 6 patients having 2 infections, with one
sample collected near the beginning and another near the end of each infection. The V4
region of 16S ribosomal RNA was amplified and sequenced. Alternatively, DNA was processed
in duplicate by whole genome amplification (WGA) followed by shotgun sequencing.
Taxonomic assignments of sequences obtained by 16S and WGA were compared against each
other and with microbiological culture results. Non-human sequences from WGA were
assembled and compared against known genomes from similar species. Results: Taxonomic
classification of bacteria observed by 16S and WGA was consistent with that obtained in the
CSF cultures at the beginning of each infection episode. However, taxa assigned by 16S
stopped at the genus level, and in one case (Klebsiella pneumoniae) 16S only identified the
family (Enterobacteriaceae). WGA was able to identify all species detected in culture.
Furthermore, WGA provided additional insights into the composition of the samples, such as
showing that human DNA constituted 76 to 99% of the reads, identifying outlier samples of
questionable quality, and detecting 2 cases of significant viral load. A few CSF samples
produced a sufficiently large number of bacterial reads to allow partial assembly of the
predominant species and comparison to known genomes to identify the closest matching
strain. Conclusion: This proof of concept study showed the value of shotgun sequencing in
studying the microbiota of CSF shunt infections. Not only were the results consistent with
culture-based methods, but additional insights could be gained regarding strain identity of
predominant bacteria and identification of viral loads. This approach opens the door to a
detailed understanding of the progression of infections and reinfections.
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 86
Title: GenEpiO and FoodOn: Enabling Data Interoperability for Infectious Disease Surveillance,
Investigation and Control
Author Block: E. Griffiths1, D. Dooley2, G. Gosal2, N. Alikhan3, M. Sanchez4, T. Matthews5, A. Pertkau5, J.
Adam5, R. Timme4, M. Graham5, G. Van Domselaar5, F. Brinkman1, W. Hsiao6;
1Simon Fraser University, Vancouver, BC, CANADA, 2University of British Columbia,
Vancouver, BC, CANADA, 3University of Warwick, Coventry, UNITED KINGDOM, 4US Food and
Drug Administration, College Park, MD, 5Public Health Agency of Canada, Winnipeg, MB,
CANADA, 6BC Centre for Disease Control Public Health Laboratory, Vancouver, BC, CANADA.
Abstract: Background: The ability to share data between organizations is crucial for global, real-time
infectious disease surveillance and investigation. Reliable capture and harmonization of whole
genome sequencing (WGS) contextual information (sample source, experimental and
bioinformatics methods, lab, clinical and epidemiological data) is critical for the interpretation
of WGS results used for decision making in health crises. This data is often recorded using free
text and institution-specific data dictionaries, requiring time-consuming and error-prone
transformation before it can be used in investigations. Ontologies provide hierarchies of well-
defined, standardized vocabulary enabling comparisons at different levels of granularity;
universal IDs for disambiguating terms; built-in logic enhancing querying power; and
synonyms that enable institutions to use preferred terms while linking to a standard,
improving interoperability. We have created two ontologies to better harmonize and
integrate genomics data into food microbiology and public health workflows, called the
Genomic Epidemiology Ontology (GenEpiO) and the Food Ontology (FoodOn). Here, we
describe the development of tools which facilitate ontology implementation within food
safety and public health communities. Methods: User engagement activities identified
vocabulary gaps, user needs, and use cases. Two ontology-driven tools were created to
enable mapping of food microbiology and public health data to standardized terms. LexMapr,
a Python-based, hybrid lexicon and rule-based system, was developed to address the many
challenges in processing short textual data. Test datasets of metadata were mapped to
GenEpiO and FoodOn to establish rules for natural language processing. Also, a Linux-based,
open source, Python-driven web portal called the Genomic Epidemiology Entity Mart (GEEM)
was developed to better enable the exchange of ontology-driven data specifications between
agencies. Results: These tools and resources are currently being tested and evaluated for use
in key databases and platforms for typing and tracking foodborne pathogens - Enterobase,
GenomeTrakr and IRIDA. LexMapr testing indicates that the software has a high level of
sensitivity in data clean-up, text matching and concept mapping. Furthermore, data
specifications were created using GEEM for different applications, including an International
Organization for Standardization (ISO) standard for the implementation of WGS for food
microbiology. The International GenEpiO Consortium (>80 members, from 15 countries) was
also established to create consensus and uptake. Conclusions: The improved inferencing and
computability of harmonized data provided by our resources and tools can enhance
communication and analyses, resulting in faster hypothesis generation during investigations,
and ultimately, better health outcomes.
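The way synonyms let institutions keep their preferred terms while still linking to one standard ID can be shown with a toy lexicon lookup. This is not LexMapr's implementation or API: the term IDs (`EX:0001`, `EX:0002`), the lexicon contents, and the function names are placeholders invented for illustration, and real FoodOn or GenEpiO identifiers are deliberately not used.

```python
import re

# Hypothetical mini-lexicon: standard term ID -> preferred label and synonyms.
LEXICON = {
    "EX:0001": {
        "label": "chicken breast",
        "synonyms": {"chicken breasts", "breast of chicken"},
    },
    "EX:0002": {
        "label": "pistachio",
        "synonyms": {"pistachios", "pistachio nut"},
    },
}


def normalize(text):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    return " ".join(text.split())


def build_index(lexicon):
    """Map every normalized label and synonym to its standard term ID."""
    index = {}
    for term_id, entry in lexicon.items():
        for name in {entry["label"], *entry["synonyms"]}:
            index[normalize(name)] = term_id
    return index


def map_term(raw, index):
    """Return the standard term ID for a free-text value, or None if unmatched."""
    return index.get(normalize(raw))


INDEX = build_index(LEXICON)
```

A rule-based layer (for example, stripping qualifiers such as "frozen" or "raw" before lookup) would sit on top of this exact-match step; unmatched values returning `None` are what would be flagged for manual curation.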
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 87
Title: Data Parsing: Efficiency, Organization, and Analysis of NGS Data for Public Health Labs
Author Block: C. Hanigan;
APHL, Silver Spring, MD.
Abstract: The APHL and CDC Influenza Division began using the APHL Informatics Messaging System
(AIMS) for transmission and analysis of next generation sequencing (NGS) data generated by
the National Influenza Surveillance Reference Centers on behalf of CDC. With the success of
this endeavor, laboratories have sought to leverage the data transmission and analysis
capabilities for additional pathogens beyond influenza. This expansion has created the need
for a “data parser” that can manage and route files based on data submitter, unique
pathogen, and specific project to the appropriate data bucket and pipeline for analysis. The
data parser, for its ability to slice AMD files and runs, is called Ninja. The Ninja software
organizes and directs incoming NGS data based on pathogens, submitters, and projects to
different services, e.g., placing the files into directories, rerouting files to another site, or
notifying other processes about the availability of NGS data. The Ninja is able to sort and
direct NGS data from multiple pathogens and projects into the appropriate data bucket and
data analysis pipeline, based on the information users put on the sample sheet that will cue
the Ninja to parse the NGS data into the appropriate places. Public health laboratories have a
variety of hurdles in their use of NGS for disease detection and surveillance. Limited
workforce and bioinformatics capacity are two of the biggest challenges. As public health
laboratories expand their use of NGS, tools like Ninja will be instrumental in allowing them to
use this technology for more pathogens. Ninja's technical capabilities allow public health labs to increase
their efficiency and expand their use of NGS into a variety of pathogens.
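The routing idea described in this abstract can be illustrated with a minimal Python sketch. It is not the Ninja software: the sample-sheet field names (`file`, `submitter`, `pathogen`, `project`), the routing table, and the bucket and pipeline names are all hypothetical and chosen only for illustration.

```python
from typing import Dict, Tuple

# Hypothetical routing table: (pathogen, project) -> (data bucket, pipeline).
# None of these names come from the abstract.
ROUTES: Dict[Tuple[str, str], Tuple[str, str]] = {
    ("influenza", "surveillance"): ("bucket-flu", "flu-assembly"),
    ("salmonella", "outbreak"): ("bucket-enteric", "enteric-typing"),
}

# Fallback for sample-sheet rows that match no known route.
DEFAULT_ROUTE = ("bucket-unsorted", "manual-review")


def route_sample(row: Dict[str, str]) -> Dict[str, str]:
    """Choose a destination and pipeline for one sample-sheet row."""
    key = (row["pathogen"].strip().lower(), row["project"].strip().lower())
    bucket, pipeline = ROUTES.get(key, DEFAULT_ROUTE)
    return {
        "file": row["file"],
        "destination": f"{bucket}/{row['submitter']}/{row['file']}",
        "pipeline": pipeline,
    }
```

In this sketch an unrecognized pathogen or project falls through to a manual-review route rather than being dropped, which is one possible design choice and not a documented behavior of Ninja.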
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 88
Title: Prediction of Antibiotic Minimum Inhibitory Concentration from Bacterial Whole Genome Sequence Data in Klebsiella pneumoniae
Author S. W. Long1, R. J. Olsen1, J. J. Davis2, M. Nguyen2, F. Xia2, T. Brettin2, J. M. Musser1;
Block: 1Houston Methodist Hospital, Houston, TX, 2University of Chicago, Chicago, IL.
Introduction: Antimicrobial resistance testing has been a mainstay of clinical microbiology
since the early 1970s. Phenotypic determination of minimum inhibitory concentration (MIC) is
culture-dependent, requiring hours of growth before rendering an actionable result. Multiple
studies have shown that decreasing the time between initial sample collection and actionable
clinically relevant susceptibility results has multiple patient benefits, including decreased
length of stay, decreased mortality, and decreased costs. Whole genome sequencing (WGS)
has continued to decrease in cost while delivering faster results, proving useful for molecular
microbiology. Recent advances in machine learning make it possible to develop classifiers that use bacterial
WGS data to predict MIC within one dilution for many antibiotics. Methods: We used the
whole genome sequences of 1,668 K. pneumoniae isolates from patients, all of which had
phenotypic antimicrobial susceptibility testing performed by BD Phoenix. We used these data
Abstract: to build classifiers using an XGBoost-based machine learning model to predict minimum
inhibitory concentrations (MICs) for 20 antibiotics. These predictions were validated against a
test set of isolates not included in the training set. Results: The overall accuracy of the model,
within ±1 two-fold dilution factor, is 92%. Individual accuracies are ≥90% for 15/20 antibiotics
tested. We show that the MICs predicted by the model correlate with known antimicrobial
resistance genes. Conclusion: Importantly, the genome-wide approach described offers a
method to predict MICs without knowledge of the underlying gene content. This study shows
that machine learning can be used to build a complete in silico MIC prediction panel for K.
pneumoniae and provides a framework for building MIC prediction models for other
pathogenic bacteria. The ability to rapidly sequence bacterial genomes and then predict an
MIC and resulting phenotype hours before culture-based methods have completed is a great
potential advance for patient care and guiding empiric therapy.
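The accuracy criterion used in this abstract, agreement within ±1 two-fold dilution, can be made concrete with a short Python sketch. It covers only the evaluation metric, not the XGBoost model; the function names are hypothetical, and how the authors handled censored MIC values (for example "≤0.5") is not stated, so the sketch assumes plain numeric MICs.

```python
import math
from typing import Sequence


def within_one_dilution(predicted_mic: float, measured_mic: float) -> bool:
    """True if two MICs differ by at most one two-fold dilution step."""
    steps = abs(math.log2(predicted_mic) - math.log2(measured_mic))
    return steps <= 1.0 + 1e-9  # small tolerance for floating-point error


def dilution_accuracy(predicted: Sequence[float], measured: Sequence[float]) -> float:
    """Fraction of predictions within one two-fold dilution of the measured MIC."""
    if len(predicted) != len(measured) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(within_one_dilution(p, m) for p, m in zip(predicted, measured))
    return hits / len(predicted)
```

For example, a predicted MIC of 1 against a measured MIC of 2 counts as correct, while a predicted 8 against a measured 2 (two dilution steps apart) does not.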
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 89
Title: Long-range Sequencing to Identify Multispecies blaKPC-harboring IncN Plasmid Carriage at a New York City Hospital
Author A. Gomez-Simmonds, M. K. Annavajhala, M. J. Giddins, S. L. Stump, A. Uhlemann;
Block: Columbia University Medical Center, New York, NY.
Introduction: Infections caused by carbapenem-resistant Enterobacteriaceae (CRE) are
associated with high mortality due to broad-spectrum antibiotic resistance. The plasmid-
encoded Klebsiella pneumoniae carbapenemase (KPC) is the dominant mechanism of
carbapenem resistance in the US. Both clonal expansion and horizontal transfer have been
implicated in the spread of CRE. However, challenges sequencing plasmids have limited the
ability to assign blaKPC to specific plasmid backbones to assess plasmid-
mediated blaKPC transmission. Focusing on broad-host range IncN plasmids, which we
previously detected in multiple strains of Enterobacter cloacae complex, we used MinION
long-range sequencing to characterize and compare blaKPC-harboring plasmids in CRE clinical
isolates collected at a tertiary care center where CRE are endemic. Methods: CRE isolates
collected between 2010 and 2017 were identified on the basis of phenotypic resistance to
meropenem (MIC ≥2 mcg/mL) and sequenced using Illumina (n=469). blaKPC subtypes,
multilocus sequence types, and plasmid replicon types were detected by SRST2 using the
ARG-ANNOT, PubMLST, and PlasmidFinder databases, respectively. A subset of isolates found
to have blaKPC-3 and a plasmid profile including an IncN replicon (n=15; 11 K. pneumoniae, 4 E.
cloacae) underwent plasmid DNA extraction (Qiagen) followed by long-range sequencing
using the MinION (Oxford). Hybrid plasmid assemblies were generated using SPAdes and
Abstract:
visually curated and compared using Geneious (Biomatters). Results: We successfully
localized both a plasmid replicon gene and blaKPC to a single contig for 8/15 isolates with
median Illumina housekeeping read depths of 27.7 and 4,203 curated MinION sequencing
reads (IQR 26.7-134.2 and 3,512.5-7,013.5, respectively). In 2 additional isolates, 2-3 large
contigs mapped closely to a local internal reference plasmid (pNR0276, NCBI accession
number PNXT00000000). blaKPC-3 was found on IncN plasmids in 6 isolates, including 3 K.
pneumoniae from 3 different STs and 3 E. cloacae from 2 STs, ranging in length from 48,506 to
76,249 bp. In 4 K. pneumoniae isolates, blaKPC-3 was found on IncFII plasmids. Alignment of
IncN plasmid sequences to pNR0276 indicated that 5/6 plasmids shared at least 90% pairwise
identity over the full length of pNR0276, while one isolate harbored a truncated plasmid
sharing an ~40 kb core region with pNR0276. Conclusions: Long-range sequencing enabled
identification of an established blaKPC-3-harboring IncN plasmid backbone in carbapenem-
resistant K. pneumoniae and E. cloacae at our hospital. Further study is needed to determine
the extent of dissemination of IncN and other blaKPC-harboring plasmids among
Enterobacteriaceae. Long-range sequencing has the potential to greatly facilitate
comprehensive plasmid sequencing and demonstrate the important contribution of plasmids
to the dissemination of CRE.
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 90
Title: Pan-Synteny Graphs: Understanding Rearrangements
Author A. Warren;
Block: Biocomplexity Institute at Virginia Tech, Blacksburg, VA.
We present Pan-synteny graphs, a multiple whole genome alignment model for
understanding genome rearrangements and a graphical interface for browsing and
understanding complex genome relationships. The visualization and interaction techniques
demonstrate a powerful new kind of genome browser capable of summarizing information
between hundreds of genomes. This effort touches on several different research fronts:
graph representation of genomes and their alignments, synteny block analysis, whole
genome sequence alignment, pan-genome analysis, multiple sequence alignment, and
genome rearrangement analysis. Pan-synteny graphs represent a fundamentally new strategy
to compare thousands of bacterial genomes in a scalable manner. Graph creation also
identifies relative evolutionary events such as inversion, translocation, deletion, and
insertion. Though this approach was originally developed from a pan-genome perspective for
Abstract:
prokaryotes, we are excited about its applicability to a wide range of topics. Algorithmically
novel elements include the contextualization of synteny analysis both between and within
multi-contig genomes. We also believe the algorithmic approach for discovering collision
points has great value in the recognition of evolutionary relationships between a group of
genomes. Pan-synteny graphs harness the information in pre-existing family databases, e.g.
COGs and others. We will demonstrate how this information is able to make model
construction more resilient to distant and complex evolutionary relationships as compared to
existing tools such as Mauve and Harvest. This comparative graphical model also serves as a
framework to analyze incomplete genomes. We hope to show that the graph abstraction and
layout algorithm not only make the resulting model approachable in terms of human
cognition but also represent a step forward in interactive comparative genomics.
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 91
Title: Whose Lab is it, Anyway? --- Teaching Lab-specific Biases to a Metagenomics Taxonomy Classifier
Author J. A. Russell1, A. Shteyman1, D. Yarmosh1, P. Davis1, P. Li2, K. Davenport2, P. Chain2, J. Bagnoli1;
Block: 1MRIGlobal, Gaithersburg, MD, 2Los Alamos National Laboratory, Los Alamos, NM.
Metagenomics is emerging as an important tool in biosurveillance, public health, and clinical
applications. However, ease-of-use for execution and data analysis remains a primary barrier-
of-entry to the full adoption of metagenomics in applied health and forensics settings. Here,
we present PanGIA (Pan-Genomics for Infectious Agents), a novel framework for hosting,
processing, analyzing, and reporting read-mapping data from metagenomics samples that can
be run on commodity computer hardware. PanGIA was developed to address existing gaps
that may preclude clinicians, medical technicians, forensics personnel, or other non-expert
end-users from routinely leveraging metagenomics data for their needs. PanGIA is primarily
meant for the detection and discovery of pathogenic microorganisms from clinical and
environmental metagenomics data. PanGIA provides two forms of confidence scoring: the
first pairs coverage data with ‘uniqueness’ information derived from each reference genome
for a stand-alone determination of confidence for each query sequence at each taxonomy
Abstract: level, and the second compares a known ‘negative control’ profile with the profile of an
unknown sample to determine significance in presence ‘above background’. Data can be
quickly summarized within the graphical user interface to rapidly detect specific organisms-of-
interest. PanGIA’s default parameters were optimized using a ROC-approach
(Receiver Operating Characteristic curve) from in-silico-generated microbial communities.
Recent work, leveraging a machine-learning approach, has explored the capacity of PanGIA to
learn what known false-positives look like (across confidence score, normalized read
abundance, reference genome linear coverage, depth-of-coverage, RPKM, and other metrics)
such that PanGIA can more accurately distinguish potential false-positives in real-world
laboratory sequencing data. In this way, over time and with initial user input, PanGIA can
‘learn’, recognize, and account for the contaminants and biases inherent to whichever
laboratory it is placed in. This feature adds a unique level of confidence in discerning
unambiguous detection events from low-confidence hits and false positives.
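Two of the per-taxon metrics named in this abstract, RPKM and linear coverage of the reference genome, have widely used definitions that can be written down in a few lines of Python. This is a sketch of those general definitions only; the abstract does not say exactly how PanGIA computes or normalizes them, and the function names are hypothetical.

```python
def rpkm(mapped_reads: int, reference_length_bp: int, total_reads: int) -> float:
    """Reads per kilobase of reference per million reads (common definition)."""
    if reference_length_bp <= 0 or total_reads <= 0:
        raise ValueError("reference length and total reads must be positive")
    return mapped_reads * 1e9 / (reference_length_bp * total_reads)


def linear_coverage(covered_bases: int, reference_length_bp: int) -> float:
    """Fraction of reference positions covered by at least one read."""
    if reference_length_bp <= 0:
        raise ValueError("reference length must be positive")
    return covered_bases / reference_length_bp
```

A hit with many reads but low linear coverage (reads piled onto a small region) is a typical pattern that such metrics are used to flag as a possible false positive.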
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 92
Title: Development and Application Of QuAISAR-H: A Bioinformatics Pipeline for Short Read Sequences of Healthcare-Associated Pathogens
Author R. A. Stanton, N. Vlachos, T. J. de Man, A. Lawsin, A. Laufer Halpin;
Block: Centers for Disease Control and Prevention, Decatur, GA.
The application of whole genome sequencing (WGS) to surveillance projects and outbreak
investigations of pathogens causing healthcare-associated infections (HAI) grants public
health microbiologists an unprecedented level of resolution towards understanding the
epidemiology of antimicrobial resistance and transmission dynamics. However, the technical
expertise required for processing and analyzing WGS data is often a major obstacle in public
health laboratories, limiting the feasibility of implementing WGS on a wide scale.
Furthermore, healthcare-associated pathogens are uniquely challenging because of their
diversity; our group alone has sequenced more than 50 different species causing HAIs. Finally,
the lack of established standards for performing sequence analysis has left a gap in public
health practice. To address these shortcomings, we have developed QuAISAR-H: a specialized
pipeline for Quality control, Assembly, species Identification, Sequence typing, Annotation,
and Resistance mechanisms for Healthcare-associated pathogens. QuAISAR-H currently runs
on the CDC’s high performance computing cluster, utilizing open source software and custom
scripts. It accepts and is optimized for raw reads generated by Illumina short read sequencers
Abstract:
and initially performs a variety of quality control assessments, including species identification
and contamination checks using Kraken and Gottcha. Genome assemblies are generated
using SPAdes, classified using MLST definitions and functionally annotated by Prokka.
Antimicrobial resistance genes are identified using multiple databases from both the raw
reads (using SRST2) and the assemblies (using c-SSTAR). The output assemblies and high-
quality, cleaned reads generated by QuAISAR-H can be used for downstream phylogenetic
analysis. The implementation of QuAISAR-H has allowed us to move towards a more
standardized approach of analyzing WGS data from HAI pathogens. We have iterated and
streamlined the pipeline through processing more than 3400 isolates sequenced internally
and externally, including those from 45 HAI outbreaks. A graphical user interface to provide
public health laboratories across the country with direct and easy access to QuAISAR-H is
currently under development and will be available through the CDC’s Office of Advanced
Molecular Detection online portal. This will enhance not only local capacity, but also national
efficiency in utilizing WGS data for HAI surveillance and investigation.
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 93
Title: GenomeTrakr Proficiency Testing for Foodborne Pathogen Surveillance
Author R. E. Timme, H. Rand, M. Leon, E. Strain, M. Allard, D. Roberson, J. Baugher;
Block: US Food and Drug Administration, College Park, MD.
Pathogen monitoring is becoming much more robust as sequencing technologies become
more affordable and accessible world-wide. This transition is especially apparent in the field
of food safety, which has demonstrated how whole genome sequencing (WGS) can be used
on a global scale to protect public health. GenomeTrakr coordinates the WGS performed by
public health agencies and other partners by providing a public database with real-time
cluster analysis for foodborne pathogen surveillance. As growing numbers of public health
labs use WGS technology to support enforcement decisions, it is essential to have confidence
in the quality of the data being used and the downstream data analyses which guide these
decisions. Routine proficiency tests, such as the one described here, have an important role in
ensuring the validity of both data and procedures. GenomeTrakr ran an annual internal
proficiency test through 2015 that is now harmonized with PulseNet. In 2015 the
Abstract: GenomeTrakr proficiency test consisted of 8 isolates of common foodborne pathogens;
participating laboratories were required to follow a protocol to culture these and perform
WGS. Resulting sequence data were evaluated for proper annotation, sequence quality, and
applicability to downstream bioinformatics analyses. Overall, this exercise revealed the
degree of variation which should be expected in sequence data produced across a diverse
network of laboratories. Illumina MiSeq sequence data collected for the same set of strains
across 21 different labs exhibited high reproducibility, while revealing a narrow range of
technical and biological variance. The numbers of SNPs reported for sequencing runs of the
same isolates across multiple labs support the robustness of our cluster analysis pipeline in
that individual isolates cultured and resequenced multiple times in multiple places are all
easily identifiable as originating from the same source. Subsequent proficiency tests confirm
these results.
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 94
Title: Impact of Antibiotic and Innate Immune Pressures on Enterococcal Adaptation in the Human Bloodstream
Author D. Van Tyne1, A. L. Manson2, M. M. Huycke3, J. Karanicolas4, A. M. Earl2, M. S. Gilmore5;
Block: 1University of Pittsburgh School of Medicine, Pittsburgh, PA, 2Broad Institute, Cambridge, MA, 3University of Oklahoma Health Sciences Center, Oklahoma City, OK, 4Fox Chase Cancer Center, Philadelphia, PA, 5Harvard Medical School, Massachusetts Eye and Ear Infirmary, Boston, MA.
Multidrug-resistant enterococci emerged in the early 1980s, and are now among the leading
causes of drug-resistant bacterial infection worldwide. We used functional genomics to study
one of the earliest outbreaks of multidrug-resistant Enterococcus faecalis bacteremia, to
determine how a clonal lineage adapted to grow and survive in the human bloodstream.
Genome sequence analysis of 62 closely related strains revealed a progression of increasingly
fixed mutations, as well as repeated independent occurrence of mutations in a relatively
Abstract: small set of genes. The most frequently encountered independent mutations we observed
occurred in a novel pathway that rendered E. faecalis better able to withstand antibiotic
pressure and innate defenses in the bloodstream, and were associated with changes in cell
surface-associated polysaccharides. A shift in mutation pattern then occurred, which
corresponded to the introduction of carbapenem antibiotics in 1987. This work uncovers new
pathways that allow enterococci to survive the transition from the gut into the bloodstream,
positioning them to cause infections associated with high mortality.
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 95
Title: Characterization of Tissue-associated Metagenomes Using Selective Nanopore Sequencing
Author J. Wang, C. Jones, T. Furey, S. Sheikh, O. Finkel, J. Dangl;
Block: University of North Carolina at Chapel Hill, Chapel Hill, NC.
While 16S rDNA profiling has been the standard approach to characterizing host-associated
microbiome communities, it produces taxonomic classifications practically limited to the
genus level and suffers from PCR and other biases. Whole metagenome sequencing produces
more specific taxonomic information and an estimate of genetic content describing the
functional capacity of a microbial community. However, metagenomic studies are expensive
and require high sequencing depth, especially in tissue-associated microbiomes, where host
DNA makes up the vast majority of the sequenced reads (90-99+%). We describe a real-time
sequencing and analysis approach using Oxford Nanopore sequencers that enables real-time
enrichment or depletion of specific sequences. Using the "read-until" functionality of the
MinION sequencer, we perform basecalling and alignment of partial read sequences in real
time on a distributed cloud computing platform and eject reads belonging to the host
genome, thereby increasing the relative and absolute abundance of microbial sequences. This
Abstract: approach is essentially unbiased compared to existing methods for preferential cell lysis and
DNA extraction, and produces an actual increase in sequenced microbial DNA unlike post-
sequencing filtering. We demonstrate the power of this approach by depleting host (human)
DNA in a mock host-microbial metagenome, and in a colon biopsy sample to describe the
composition and function of the mucosa-associated microbiome in the colon. We observed a
two- to six-fold increase in the relative abundance of microbial sequences relative to host,
depending on the initial proportion. This selection method produces no detectable false-
positive depletion (of microbial sequences) or selection bias in the retained reads. We
additionally propose a simple and effective method for accurately classifying observed long
reads as host, or to their appropriate species/strain-level taxa. These host-depleted
metagenome experiments - with novel methods to efficiently classify long, error-prone reads
- demonstrate the power of tightly coupled sequencing and informatics protocols to enable
efficient investigation of disease-relevant tissue-associated microbiota.
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 96
Title: Clinical Whole-Genome Sequencing of Mycobacterium tuberculosis Complex Isolates: 2½ Years of Experience Analyzing, Reporting and Improving TB Testing in New York State
Author K. Musser1, J. Shea1, P. Lapierre1, T. Halse1, J. Lemon2, J. Rakeman2, V. Escuyer1;
Block: 1Wadsworth Center, NYSDOH, Albany, NY, 2Public Health Laboratory, New York City Department of Health and Mental Hygiene, New York City, NY.
Background: Mycobacterium tuberculosis (MTB) is an important pathogen, infecting more
than a third of the world population; New York State (NYS) has the 3rd highest number of
cases by state in the US. The cost and time associated with diagnostic testing and treatment
of MTB can be considerable and weeks to months are required to identify, assess drug
susceptibility, and generate molecular genotypes. Our laboratory developed and validated a
comprehensive whole-genome sequencing (WGS) assay to characterize MTB complex (MTBC)
isolates, replacing seven molecular tests. We implemented this testing in March of 2016 and
have continually measured its performance, assessed turnaround time (TAT), success at
resistance prediction and high-resolution genotyping. Methods: The MTBC WGS assay is
comprised of a novel DNA extraction, optimized library preparation, paired-end WGS, and an
in-house developed bioinformatics pipeline; numerous quality control steps are incorporated
in this testing. Following DNA sequencing, the pipeline performs analysis using three principal
components: modules used for the phylogenetic analysis, modules used to perform
taxonomic identification of the samples, and modules used for SNP calling and resistance
profiling. The results from all three components generate a final comprehensive report for
each sample analyzed that is reported through our LIMS. Results: To date we have tested
Abstract: 1634 MTBC strains from unique NYS patients, including NYC. Of these, 5 members of MTBC
have been identified: 1560 M. tuberculosis, 27 M. bovis-BCG, 26 M. bovis, 17 M. africanum,
and 4 M. orygis. In-silico spoligotypes were generated for 96.5% of strains tested, and strains
found to be closely related (<20 SNPs genome wide) were reported for epidemiological
investigation. Resistance profiles of the MTBC strains showed 79.8% to be susceptible to eight
drugs, 7.8% resistant to at least isoniazid, 2.4% multidrug resistant (MDR), and 0.12%
extensively drug resistant (XDR) strains. When compared with conventional phenotypic drug
susceptibility testing (DST), our assay was found to have an overall resistance predictive value
of 94% and a susceptibility predictive value of 98% based on >8000 phenotype-genotype
comparisons. We have assessed TAT since implementation and reduced our initial 8-day TAT
to 5 days from MTBC DNA extraction to report. This TAT has resulted in genotypic resistance
predictions being reported an average of 8 days earlier than first-line phenotypic
DST. Conclusions: This TB WGS clinical assay is providing comprehensive detection of drug
resistance, identification to the MTBC member and typing for epidemiological investigations.
As a result of improvements as well as updates to analyze more samples at one time, an
improved TB WGS pipeline is in use. This assay continues to improve patient management
and is supporting epidemiological investigations in NYS and NYC.
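Two quantities reported in this abstract, the predictive values and the flagging of closely related strains (<20 SNPs genome wide), can be sketched in Python. This is an illustration and not the Wadsworth Center pipeline: reading "resistance predictive value" as TP/(TP+FP) and "susceptibility predictive value" as TN/(TN+FN) is an assumption, and the function names and the variant-call representation (position mapped to alternate base) are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Tuple


def predictive_values(tp: int, fp: int, tn: int, fn: int) -> Tuple[float, float]:
    """Return (resistance PV, susceptibility PV) from genotype-vs-phenotype counts."""
    return tp / (tp + fp), tn / (tn + fn)


def snp_distance(a: Dict[int, str], b: Dict[int, str]) -> int:
    """Count positions where two isolates differ; a missing position means reference base."""
    return sum(1 for pos in set(a) | set(b) if a.get(pos) != b.get(pos))


def closely_related_pairs(
    isolates: Dict[str, Dict[int, str]], threshold: int = 20
) -> List[Tuple[str, str, int]]:
    """List isolate pairs whose SNP distance is below the threshold."""
    pairs = []
    for name_a, name_b in combinations(sorted(isolates), 2):
        d = snp_distance(isolates[name_a], isolates[name_b])
        if d < threshold:
            pairs.append((name_a, name_b, d))
    return pairs
```

Comparing every pair is quadratic in the number of isolates, which is acceptable for a sketch but would need a smarter approach for collections the size of the one described here.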
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 97
Title: NGS Applied to the Epidemiology of Influenza A Virus Diversity in Brazil
Author A. B. Veiga1, T. Song2, T. G. Baccin1, T. S. Gregianini3, H. V. Bakel4, A. García-Sastre5, E. Ghedin2;
Block: 1Universidade Federal de Ciências da Saúde de Porto Alegre, Porto Alegre, BRAZIL, 2New York University, New York City, NY, 3Laboratório Central de Saúde Pública da Secretaria de Saúde do Estado do Rio Grande do Sul – LACEN/SES-RS, Porto Alegre, BRAZIL, 4Mount Sinai Hospital, New York City, NY, 5Icahn School of Medicine at Mount Sinai, New York City, NY.
Systematic surveillance of seasonal influenza A viruses using next generation sequencing
(NGS) has the potential to contribute to early detection of novel influenza strains in the
human population. In this study, we sequenced and analyzed clinical samples collected in
Brazil between 2009 and 2016 from 220 individuals infected with pandemic H1N1
(H1N1pdm09) or H3N2 influenza A virus. Phylogenetic analyses show persistence of strains
from one season to the next in Brazil, with introductions of new strains from global circulating
viruses. An analysis of single nucleotide variants (SNV) in the NGS data reveals mixed
infections with minor circulating strains that also appear to seed the next season. Some SNVs
Abstract: are located in antigenic sites of the hemagglutinin, leading to changes in antigenicity in recent
strains. For example, the non-synonymous mutation A538C in segment 4 of H1N1pdm09
(K180Q substitution in the HA antigenic site) appeared in strains during the 2013-2014
influenza season in the Northern Hemisphere, but the SNV analysis shows that minor variants
carrying this mutation had been circulating as early as 2011. In 10 of the 220 infected
individuals sampled we also detected mixed subtype infections, considered a rare occurrence
in the human population. NGS combined with minor variant analysis proves to be a powerful
surveillance tool to identify mixed infections and potential circulating strains in upcoming
seasons.
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 98
Title: NGS analysis methods for Illumina data while the sequencer is running
Author S. H. Tausch1, T. P. Loka2, M. S. Lindner2, P. W. Dabrowski2, B. Strauch2, J. M. Schulze2, A. Andrusch2, A. Radonic2, A. Nitsche2, B. Renard2;
Block: 1German Federal Institute for Risk Assessment, Berlin, GERMANY, 2Robert Koch Institute, Berlin, GERMANY.
Background: Food-induced infectious diseases remain a major cause of health problems
across the globe. With the continuously increasing use of next-generation sequencing (NGS) in
the field of infectious disease outbreak analysis and food safety controls, there is a strong
need for fast turnaround time from sample arrival to analysis results. While runtime of data
analysis software has significantly decreased, the overall turnaround time from sample arrival
to interpretable analysis results remained nearly the same due to the sequential paradigm of
data production and analysis. To overcome this limitation, we developed a collection of tools
for sequence analysis while the sequencer is still running. Methods: The presented methods
include software for read mapping (HiLive; Lindner et al., 2017,
doi:10.1093/bioinformatics/btw659), taxonomic classification (LiveKraken; Tausch et al.,
2018, doi:10.1093/bioinformatics/bty433), privacy preservation (PriLive; Loka et al., 2018,
doi:10.1093/bioinformatics/bty128), pathogen identification (PathoLive; Tausch et al.) and a
workflow for SNP/variant calling (Loka et al.) while the sequencer is running. Results: We are
able to show that each of our tools generates comparable or superior results to established
Abstract: tools in the named fields. HiLive’s accuracy (F1 = 0.761) is slightly higher than that of the other
tested approaches (BWA: 0.760, Bowtie2: 0.742) at the end of a sequencing run. LiveKraken
performs identically to Kraken at the end of a full MiSeq run, while reaching comparable
accuracy after less than half of a run (F1 = 0.96 at cycle 80 of 216). PriLive filters human reads
more accurately (F1 = 99.961) than BMTagger (99.956) and DeconSeq (99.941) and can
moreover mask sensitive data before it is completely produced. PathoLive combines a live
mapping approach with novel background masking techniques and thereby achieves highest
accuracy on a real HiSeq run (ROC-auc = 0.97 after 36h turnaround time) compared to Clinical
PathoScope (ROC-auc = 0.91 after 95h turnaround time) and Bracken (ROC-auc = 0.48 after
95h turnaround time). Conclusion: With each of these tools, we prove the ability to generate
meaningful results with or even before the end of a sequencing run. This allows minimizing
the turnaround time of a variety of tasks and can thereby increase the efficiency of high
throughput routine analyses. It could furthermore significantly reduce the response time in
urgent cases of infectious disease outbreaks. Since more and more institutions have their own
sequencers available, the parallelization of wet- and drylab is at hand.
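The F1 values quoted in this abstract are the harmonic mean of precision and recall. A minimal Python sketch of that standard definition follows; it is not taken from any of the tools named above, the function name is hypothetical, and the abstract does not state how true and false positives were counted for each tool.

```python
def f1_score(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Harmonic mean of precision and recall; 0.0 when there are no true positives."""
    if true_pos == 0:
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)
```

The function returns a value between 0 and 1; some of the figures above (for example 99.961) appear to be the same quantity expressed as a percentage.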
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 99
Title: Beaver fever: Whole Genome Characterization of Waterborne Giardia Isolates Revealed Mixed Assemblages and Zoonotic Transmission
Author K. Tsui1, R. Miller2, M. Uyaguari-Diaz2, P. Tang1, C. Chauve3, W. Hsiao2, J. Isaac-Renton2, N. Prystajecky2;
Block: 1Sidra Medicine, Doha, QATAR, 2University of British Columbia, Vancouver, BC, CANADA, 3Simon Fraser University, Vancouver, BC, CANADA.
Giardia causes the diarrheal disease known as giardiasis; transmission through contaminated
surface water is common. The protozoan parasite’s genetic diversity has major implications
for human health and epidemiology. To determine the extent of transmission from wildlife
through surface water, we performed whole-genome sequencing (WGS) to characterize
89 Giardia duodenalis isolates from both outbreak and sporadic infections: 29 isolates from
raw surface water, 38 from humans, and 22 from veterinary sources. Using single nucleotide
variants (SNVs), combined with epidemiological data, relationships contributing to zoonotic
transmission were described. Two assemblages, A and B, were identified in surface water,
human, and veterinary isolates. Mixes of zoonotic assemblages A and B were seen in all the
community waterborne outbreaks in British Columbia (BC), Canada, studied. Assemblage A
was further subdivided into assemblages A1 and A2 based on the genetic variation observed.
Abstract:
The A1 assemblage was highly clonal; isolates of surface water, human, and veterinary origins
from Canada, United States, and New Zealand clustered together with minor variation,
consistent with this being a panglobal zoonotic lineage. In contrast, assemblage B isolates
were variable and consisted of several clonal lineages relating to waterborne outbreaks and
geographic locations. Most human infection isolates in waterborne outbreaks clustered with
isolates from surface water and beavers implicated to be outbreak sources by public health.
In-depth outbreak analysis demonstrated that beavers can act as amplification hosts for
human infections and can act as sources of surface water contamination. It is also known that
other wild and domesticated animals, as well as humans, can be sources of waterborne
giardiasis. This study demonstrates the utility of WGS in furthering our understanding
of Giardia transmission dynamics at the water-human-animal interface.
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 100
Title: Refactoring the NCBI Prokaryotic Genome Annotation Pipeline into a Stand-alone Tool
Author F. Thibaud-Nissen1, D. Slotta1, A. Badretdin1, B. Kiryutin1, A. Gourianov1, B. Busby1, R. Cohen1, W. Hlavina1, M. Hsieh2, S. Turner2;
Block: 1NCBI/NLM/NIH, Bethesda, MD, 2Pacific Biosciences, Menlo Park, CA.
Abstract: The NCBI Prokaryotic Genome Annotation Pipeline (PGAP) has been used to annotate RefSeq
prokaryotic genomes since the early 2000s, increasing in quality and consistency over the
years. PGAP annotation, also offered as a service to researchers submitting genome
assemblies to GenBank, has become a reliable resource for the prokaryotic community. We
have refactored PGAP into a stand-alone pipeline that can be executed outside of NCBI on
individual computers or in a cloud environment. The pipeline is written in CWL, which
executes programs wrapped in Docker containers, to run on a variety of platforms. To ensure
conformance of the stand-alone results with results generated at NCBI, manually curated
evidence and other datasets used by the pipeline are bundled and distributed with the
pipeline. The goal is for stand-alone PGAP to produce annotation that is in line with internal
NCBI PGAP and that is submittable to GenBank. We expect that making PGAP portable will
accelerate research by providing scientists with a quality annotation of the genomes they
assemble prior to submission. It will also give users an opportunity to iterate over the
assembly process until the assembly quality is high enough to produce quality annotation. We
will describe the stand-alone PGAP prototype and the results of the annotation tests
performed on multiple platforms and by multiple users across several locations, with respect
to performance and conformance.
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 101
Title: Diagnosis and Characterization of Canine Distemper Virus Through Real Time Sequencing by
MinION Nanopore Technology
Author Block: A. Lorusso, A. Peserico, M. Marcacci, D. Malatesta, M. Di Domenico, I. Mangone, F. Pizzurro,
G. Zaccaria, C. Cammà;
Istituto Zooprofilattico dell'Abruzzo e del Molise, Teramo, ITALY.
Abstract: Rapid identification of the etiologic agent of an infectious disease is essential for setting up
treatment and preventive measures. In general, pathogen identification is performed by
direct diagnostic tests, which normally include amplification of target nucleic acid by
PCR-based assays. Although these approaches are highly specific and often validated, they suffer
from a number of limitations, including the difficulty of testing for the plethora of rare pathogens
that might be expected to cause a given pathology and their inability to identify new or
unexpected pathogens, possibly originating from cross-species jumps. Therefore, the
availability of more rapid, broad-range, and sensitive techniques has become increasingly
important in the laboratory diagnosis of infectious diseases. With this in mind,
nucleic acids purified from the brain tissue of a dog that died after showing severe
neurological signs were processed with the MinION (Oxford Nanopore Technologies,
Oxford, UK) sequencing technology. Canine distemper virus (CDV) infection was
diagnosed. The earliest detection of sequence reads belonging to CDV was accomplished
within the first 20 minutes of real time sequencing. Subsequently, a specific real time RT-PCR
assay and immunohistochemistry were used to confirm the presence of CDV RNA and
antigen, respectively, in tissues. This study supports the use of the MinION in veterinary
clinical practice with tremendous advantages in terms of rapidity and accuracy of molecular
diagnosis.
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 102
Title: Biohansel for Rapid Subtyping of Highly Clonal Pathogens Using Canonical SNPs
Author Block: G. Labbe1, P. Kruczkiewicz2, P. Mabon2, M. Rankin1, M. Gopez2, J. Robertson1, N. Knox2, A. R.
Reimer2, G. Tong2, H. J. Adam3, R. P. Johnson1, G. Van Domselaar2, J. H. Nash4;
1National Microbiology Laboratory, Public Health Agency of Canada, Guelph, ON,
CANADA, 2National Microbiology Laboratory, Public Health Agency of Canada, Winnipeg, MB,
CANADA, 3Department of Medical Microbiology, University of Manitoba, Winnipeg, MB,
CANADA, 4National Microbiology Laboratory, Public Health Agency of Canada, Toronto, ON,
CANADA.
Abstract: Background: Whole genome sequencing (WGS) is rapidly being adopted by public health
agencies in many jurisdictions, creating a need for rapid, robust analytical tools. Single nucleotide
polymorphism (SNP) genotyping panels have been developed for numerous organisms based
on canonical SNPs that are discriminatory for clonal populations. Using canonical SNP panels,
new isolates can rapidly be placed within the population structure without the need to
rebuild phylogenetic trees for the entire population. A canonical SNP-based nomenclature can
facilitate long-term surveillance by allowing numerous comparisons of isolates across time in
a context broader than typically considered for outbreak response. Methods: Biohansel
rapidly classifies WGS data into hierarchical subtypes without the need for assembly.
Canonical SNP schemas for two prevalent Salmonella serovars (S. Enteritidis and S.
Heidelberg) have been incorporated into biohansel, and user-defined schemas can also be
supplied at runtime for subtyping other pathogens. Biohansel identifies SNPs using the
Aho-Corasick algorithm (Ju et al., 2017, doi.org/10.1101/229708) according to defined k-mers
containing target SNPs. Results are evaluated using a quality assurance module which
identifies problematic samples according to the number of targets found, target coverage,
and concordance with the population structure defined by each schema. Possible mixed
samples are identified based on the presence of discordant sets of SNPs and presence of
multiple SNPs for each target. Biohansel is a Python 3 application available on PyPI and
Conda and as a Galaxy tool. Source code is available at https://github.com/phac-nml/biohansel.
Results: We demonstrate the utility of biohansel by rapidly analyzing
>23,000 S. Enteritidis and >3,000 S. Heidelberg WGS datasets from public repositories using
minimal computational resources, and by identifying subtype associations with commodities
and geography. Biohansel proved useful to rapidly identify closely related isolates and exclude
poor quality WGS datasets, enabling the creation of reference-mapped phylogenetic trees
with the high discriminatory power needed for traceback investigations. The tool was also
able to detect and subtype Salmonella in shotgun metagenomics datasets obtained from
clinical stool samples. The Aho-Corasick algorithm for k-mer searching is as fast as NCBI
BLAST+ against assembly contigs (~0.4s) and is 10 times faster than Jellyfish (~33s vs ~356s)
for typically sized Salmonella read sets (30-100X coverage). Conclusions: In a public health
context, biohansel enables rapid and high resolution classification of North American isolates,
providing a robust, stable framework for source attribution and supporting identification of
possible interventions to reduce contamination of food products.
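The k-mer matching step described in the abstract above can be illustrated with a minimal pure-Python sketch of the Aho-Corasick algorithm. This is not biohansel's own code: the schema below, its short 9-mers, and the subtype labels are hypothetical, and the sketch ignores reverse complements, negative (absence) k-mers, and the quality assurance module.

```python
from collections import deque


def build_automaton(patterns):
    """Build an Aho-Corasick automaton as (goto, fail, out) tables."""
    goto = [{}]   # goto[state][char] -> next state
    fail = [0]    # failure link for each state
    out = [[]]    # patterns that end at each state
    for pat in patterns:
        state = 0
        for ch in pat:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(pat)
    queue = deque(goto[0].values())  # depth-1 states keep fail = root
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out


def find_kmers(sequence, automaton):
    """Return (start, pattern) for every pattern occurrence in one pass."""
    goto, fail, out = automaton
    state, hits = 0, []
    for i, ch in enumerate(sequence):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for pat in out[state]:
            hits.append((i - len(pat) + 1, pat))
    return hits


# Hypothetical toy schema: k-mer containing a canonical SNP -> subtype label.
SCHEMA = {
    "ACGTACGTA": "1",
    "ACGTTCGTA": "2",
    "GGCATTACC": "2.1",
}


def subtype(reads, schema):
    """Report the most specific (deepest) subtype supported by the reads."""
    automaton = build_automaton(schema)
    found = set()
    for read in reads:
        for _, kmer in find_kmers(read, automaton):
            found.add(schema[kmer])
    return max(found, key=lambda s: s.count(".")) if found else None
```

All schema k-mers are searched in a single pass over each read, which is what allows raw reads to be typed without assembly.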
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 104
Title: IRIDA: A Platform for Genomic Epidemiology
Author Block: A. Petkau1, T. Matthews1, F. Bristow2, J. Adam1, P. Kruczkiewicz1, J. Cabral1, J. Thiessen1, E.
Griffiths3, D. Dooley4, D. Fornika4, G. Winsor3, M. Graham1, E. Taboada1, R. Beiko5, W. Hsiao4,
F. Brinkman3, G. Van Domselaar1;
1Public Health Agency of Canada, Winnipeg, MB, CANADA, 2University of Manitoba, Winnipeg,
MB, CANADA, 3Simon Fraser University, Burnaby, BC, CANADA, 4BC Public Health
Microbiology and Reference Laboratory, Vancouver, BC, CANADA, 5Dalhousie University,
Halifax, NS, CANADA.
Abstract: Background: Whole genome sequencing (WGS) is a powerful tool for public health infectious
disease investigations owing to its higher resolution, greater efficiency and cost-effectiveness
over traditional genotyping methods. However, implementation of WGS in routine public
health microbiology labs is impeded by the complexity of data management, the limited availability of
easy-to-use pipelines, the difficulty of integrating pipeline results with epidemiological metadata, and
restrictive jurisdictional data sharing policies. To address these issues, we developed the
Integrated Rapid Infectious Disease Analysis (IRIDA) platform, a user-friendly, decentralized,
open source bioinformatics and analytical web platform, to support real-time infectious
disease outbreak investigations using WGS data. Methods/Results: IRIDA stores and manages
WGS data alongside contextual metadata, providing a single system for processing and
generating reports on sequenced samples. WGS data is automatically uploaded to IRIDA using
a tool installable on a sequencing instrument. Data is then processed to evaluate quality,
assemble, perform in silico sequence typing, and save results into the epidemiological
metadata system. Typing of Salmonella genomes uses SISTR, a tool for Salmonella serovar
prediction and cgMLST analysis from WGS data. Additional k-mer based typing pipelines
include MentaLiST for cg/wgMLST and biohansel for SNP-based typing. SNVPhyl provides
whole genome phylogenetic analysis using SNV/SNPs; Mash provides rapid distance
estimation to existing genomes in RefSeq. Genomes may be sent to IslandViewer for genomic
island detection. Pipelines may be configured to trigger automatically on upload of new WGS
data, or users may select sets of samples for additional analysis through the IRIDA pipelines.
The IRIDA metadata system integrates data generated from a pipeline—such as sequence
type—with user-provided metadata into a single table. Users may toggle the display of
metadata fields and save specific views of the metadata for later use. These views of
metadata may also be visualized alongside a phylogenetic tree. The IRIDA REST API enables
secure exchange of genomic and epidemiological metadata, enabling construction of a
decentralized genome data sharing network. The IRIDA REST API may also be used to extend
IRIDA’s functionality, such as through additional tools for custom report generation or
integration with the phylogeographic software GenGIS. Conclusion: IRIDA is successfully
deployed as the official bioinformatics platform for public health genomics in the pan-
Canadian Public Health Laboratory Network (CPHLN). The storage, management, and analysis
of WGS data alongside contextual metadata has helped simplify surveillance and outbreak
investigation activities. IRIDA is open source and freely available at https://github.com/phac-nml/irida
and http://irida.ca.
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 105
Title: Interpreting Whole-Genome Sequence Analyses of Foodborne Bacteria for Regulatory
Applications and Outbreak Investigations
Author Block: A. Pightling, J. Pettengill, Y. Luo, J. Baugher, H. Rand, E. Strain;
U.S. Food and Drug Administration, College Park, MD.
Abstract: Whole-genome sequence (WGS) analysis has revolutionized the food safety industry by
enabling high-resolution typing of foodborne bacteria. Higher resolving power allows
investigators to identify origins of contamination during illness outbreaks and regulatory
activities quickly and accurately. Government agencies and industry stakeholders worldwide
are now analyzing WGS data routinely. Although researchers have published many studies
that assess the efficacy of WGS data analysis for source attribution, guidance for interpreting
WGS analyses is lacking. Here, we provide the framework for interpreting WGS analyses used
by the Food and Drug Administration's Center for Food Safety and Applied Nutrition (CFSAN).
We based this framework on the experiences of CFSAN investigators, collaborations and
interactions with government and industry partners, and evaluation of the published
literature. A fundamental question for investigators is whether two or more bacteria arose
from the same source of contamination. Analysts often count the numbers of nucleotide
differences (single-nucleotide polymorphisms [SNPs]) between two or more genome
sequences to measure genetic distances. However, using SNP thresholds alone to assess
whether bacteria originated from the same source can be misleading. Bacteria that are
isolated from food, environmental, or clinical samples are representatives of bacterial
populations. These populations are subject to evolutionary forces that can change genome
sequences. Therefore, interpreting WGS analyses of foodborne bacteria requires a more
sophisticated approach. We present a framework for interpreting WGS analyses that
combines SNP counts with phylogenetic tree topologies and bootstrap support. We also
elucidate the roles of WGS, epidemiological, traceback, and other evidence in forming the
conclusions of investigations, making clear that WGS data alone is insufficient for links
between bacterial isolates to be made. Finally, we present examples that illustrate the
application of this framework to real-world situations.
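The SNP-counting step mentioned in the abstract above can be sketched as follows. This is a minimal illustration, not CFSAN's pipeline: the function names are hypothetical, the input is assumed to be a set of already aligned sequences of equal length, and positions with a gap or an ambiguous base are skipped. In keeping with the abstract, the resulting counts are only one input and would be read together with tree topology, bootstrap support, and epidemiological and traceback evidence.

```python
from itertools import combinations


def snp_distance(a, b, ignore="N-"):
    """Count differing positions between two aligned sequences.

    Positions where either sequence has a gap or ambiguous base are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for x, y in zip(a, b)
        if x != y and x not in ignore and y not in ignore
    )


def pairwise_snp_matrix(alignment):
    """Map each pair of isolate names (sorted) to its SNP distance."""
    return {
        (i, j): snp_distance(alignment[i], alignment[j])
        for i, j in combinations(sorted(alignment), 2)
    }
```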
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 106
Title: GenomeTrakr Database and Network: WGS Network for Real-Time Characterization and
Source Tracking of Foodborne Pathogens
Author Block: M. W. Allard, R. Timme, M. Sanchez, E. Stevens, M. Hoffmann, K. Yao, G. Kastanis, G. Kastanis,
D. Miller, T. Muruvanda, S. Lomonaco1, E. Strain, J. Payne, A. Pightling, H. Rand, J. Pettengill,
Y. Luo, N. Gonzalez-Escalona, D. Melka, S. Lindley, Y. Chen, S. Tallent, E. Brown;
US FDA, College Park, MD.
Abstract: A national network of federal, state, academic and international laboratories has been using
WGS data to rapidly characterize pathogens. This mature GenomeTrakr network is part of
the NCBI Pathogen Detection website. Public health agencies (FDA, CDC and USDA-FSIS) collect
and share data in real time. This high-resolution, rapidly growing database is actively being
used in outbreak investigations at state, national, and international levels. The GenomeTrakr
database has demonstrated how a distributed network of desktop WGS sequencers can be
used in concert with traditional epidemiology and investigation for source tracking of
foodborne pathogens. This new “open data” model allows greater transparency between
federal/state agencies, industry partners, academia, and international collaborators. The
database has continued to grow and diversify, doubling in
the last year to ~207,000 draft genomes. Two new international surveillance efforts were
added to collect food, animal and environmental isolates including Campylobacter. NCBI has
released new data analysis tools that improve rapid interpretation and visualization. NCBI
is currently producing daily clustering results for 22 pathogens
including Salmonella, Listeria, E. coli and Campylobacter. The high-resolution WGS signal in
concert with epidemiological or inspection evidence has drastically enhanced our ability to
identify the food sources of current outbreaks for foodborne pathogens with ~200 regulatory
clusters examined in 2017. Results demonstrate global benefits of having an open data
model. Understanding root causes of foodborne contamination assists our academic, public
health and industry partners to develop preventative controls to make food safer globally.
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 108
Title: Ultra-Rapid Sample-to-Answer for Fieldable Genomic Sequencing-Based Biothreat Detection
Author Block: T. Reed1, M. Karavis2, S. Deshpande3, R. Lewandowski1, C. Anderson2, M. LaFrance1, P. Roth4,
A. Liem4, R. C. Bernhards5;
1CBRNE Analytical & Remediation Activity, 20th CBRNE Command, US Army, Aberdeen
Proving Ground, MD, 2Edgewood Chemical Biological Center, Aberdeen Proving Ground,
MD, 3Science & Technology Corp. support to Edgewood Chemical Biological Center, Aberdeen
Proving Ground, MD, 4DCS Corp. support to Edgewood Chemical Biological Center, Aberdeen
Proving Ground, MD, 5Defense Threat Reduction Agency, Ft. Belvoir, VA.
Abstract: Rapid and accurate detection technologies are critically needed in the field, especially for
unknown, emerging, and genetically modified biothreats. Next-generation sequencing (NGS)
technologies are superior in that the entire genome can be analyzed, which allows for
unbiased, conclusive identification, and the ability to detect new and synthetically modified
threats. However, most NGS technologies have substantial size, power, and
sample preparation requirements that severely limit their use in far-forward
environments, and current methodologies for sample-to-answer take multiple days to
complete. The MinION nanopore sequencer developed by Oxford Nanopore Technologies has
recently emerged as a portable NGS technology. Nanopore sequencing utilizes biological
proteins as nanopores for the passage and identification of DNA and RNA molecules.
Improvements in error rates combined with the high amount of read generation have made
nanopore sequencing comparable to existing NGS technologies, such as Illumina
and PacBio, without the need for large, expensive equipment. The MinION is able to fit in the
palm of your hand, offering the capability to conduct true field-deployable sequencing, which
can allow for rapid identification of unknown threats and disease monitoring in resource-limited
settings. This project aims to accelerate the time from sample to answer, simplify the
procedures, and reduce the equipment and power needed. An optimized workflow was
established using simple, fieldable, and rapid sample and library preparation procedures. The
workflow includes the use of the portable OmniLyse bead beating device capable of lysing
spores within two minutes, a rapid DNA purification protocol, and an eight-minute library
preparation. In addition, the utility of the VolTRAX automated sample/library preparation
device, the Flongle flow cell adapter, and the MinIT miniature processor are currently being
investigated for inclusion into the workflow. Within 10 minutes of sequencing on the MinION,
enough reads are generated to conclusively identify the organisms present in the sample.
Automatic offline live basecalling is used during the sequencing run, and after sequencing is
complete, the data is analyzed instantly using offline software developed at ECBC. Using this
workflow, raw sample-to-answer can be achieved in approximately one hour. Field
demonstrations are being conducted with DoD mobile lab operators for assessment. The goal
is to allow for genomic sequencing identification to be performed rapidly by minimally
trained personnel in low-resource environments and without the need for high-powered lab
equipment. Using this procedure, the MinION could be used by the warfighter to rapidly
identify unknown biothreats on the battlefield or in expeditionary analytic scenarios.
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 109
Title: Tracing the Origins of Hospital-onset Clostridioides difficile Infections
Author Block: J. Worley1, C. Cummins1, M. Delaney1, A. DuBois1, S. Men1, M. Klompas2, L. Bry1;
1Massachusetts Host-Microbiome Center, Department of Pathology, Brigham and Women's
Hospital, Boston, MA, 2Department of Population Medicine, Harvard Medical School and
Harvard Pilgrim Health Care Institute, Boston, MA.
Abstract: Background: Clostridioides difficile is a leading cause of health care-associated infection in the
United States and the leading cause of death from a gastrointestinal pathogen. The annual
costs associated with its treatment are frequently estimated to be in excess of $5 billion. We
designed an experiment to determine whether patients who develop hospital-onset C. difficile infection
(CDI) were infected by strains commonly found at the hospital, by strains from other patients, or
by strains carried asymptomatically upon admission. While some strains have been more common,
particularly sequence type 1/NAP1/PCR ribotype 027 (ST1), there is high genetic diversity
within disease-causing C. difficile. Whole-genome sequencing, which can identify clonally
related bacteria, was used to address this question by sequencing strains from patients
presenting with CDI and from an incoming patient screen. Methods: The study period was September 2017
through May 2018. Patients admitted to the intensive care units are screened for
vancomycin-resistant Enterococci by rectal swab (VRE swab) upon admission and weekly
thereafter. These swabs were screened for C. difficile to identify strains arriving at the
hospital. VRE swabs were collected from November through April 15th, while stool was
collected over the entire period. Stool collection was hospital wide. Isolates were sequenced
using the Illumina MiSeq platform. Single-nucleotide polymorphisms were identified de novo
using kSNP and through core-gene alignment. Sequence types and genetic features were
assessed using BLAST and MUSCLE. Sequence types were classified using
PubMLST. Results: 2418 swabs from over 1500 patients were screened for C. difficile, of
which 177 produced C. difficile isolates (7%). 179 stool samples were collected during this
period, of which over 90% produced isolates. In this dataset, 5 patients transitioned from
asymptomatic carriage to CDI, each time without changing sequence type. Additionally, 7
patients transitioned from non-carriage to CDI. While sequencing is not complete
(anticipated completion by September), a diverse set of isolates representing over 50
sequence types (ST) was found from 242 sequenced isolates. Of these, only 12 were from ST1
(5%). Strains from ST1 and related STs were more likely to be found in CDI than other strains,
and atoxigenic strains were less likely. Conclusions: Strains entering the hospital are highly
diverse and represent much of the genetic diversity within C. difficile. ST1 does not represent
the predominant strain in our samples, even though it is still more strongly linked to disease
than other STs. We find that, in all cases where a patient is asymptomatically colonized before
CDI onset, the same strain was isolated before and during CDI. Even with a small sample size,
this raises the possibility of being able to identify a subpopulation of patients at greater risk
for developing CDI and adjusting medical care appropriately.
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 110
Title: Metagenomic Strain Detection with Rainbow Sketching
Author Block: R. Bovee, C. Smith, N. Greenfield;
One Codex, San Francisco, CA.
Abstract: Identifying specific strains and mixtures of strains in complex metagenomic samples is a key
challenge in both epidemiology and environmental microbiology. Even sensitive, k-mer-based
metagenomic classification tools struggle with strain identification due to issues including
database quality, contamination, and genetic recombination and other shared homology. In
contrast, recent MinHash methods sketch the entire sample (which is inappropriate for
complex mixtures) or require a comparison for each available reference (limiting scalability).
We present Rainbow Sketching, an approach that leverages both k-mer-based taxonomic
classification and MinHash sketching for strain tracking and mixture modeling. Rainbow
Sketching first performs taxonomic classification of each individual k-mer (“coloring” each
k-mer) and uses these colors to build a rainbow of discrete sketches. Each taxon-specific sketch
may then be compared against a subset of relevant reference genomes - identifying present
strains and determining strain-reference novelty. Count data within these sketches also
provides a foundation for modeling strain mixtures.
We employed this method in the recent PrecisionFDA CFSAN Pathogen Detection Challenge -
achieving the highest overall score detecting Salmonella strains against a metagenomic
background. We present results from this challenge, several clinical and other real-world
datasets, and simulated data to demonstrate the sensitivity and specificity of this approach.
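The idea of coloring k-mers and building one sketch per taxon, as described in the abstract above, can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the `kmer_to_taxon` lookup stands in for a real k-mer classifier, SHA-1 is an arbitrary choice of hash, all function names are hypothetical, and the sketches are plain bottom-k MinHash sketches without the count data the abstract mentions.

```python
import hashlib
from collections import defaultdict


def kmers(seq, k):
    """Yield every overlapping k-mer of a sequence."""
    return (seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_hash(kmer):
    """Deterministic 64-bit hash of a k-mer."""
    return int.from_bytes(hashlib.sha1(kmer.encode()).digest()[:8], "big")


def bottom_sketch(kmer_iter, size):
    """Bottom-k MinHash sketch: the `size` smallest distinct hash values."""
    return sorted({kmer_hash(km) for km in kmer_iter})[:size]


def rainbow_sketches(reads, kmer_to_taxon, k, size):
    """Color each k-mer by taxon, then build one sketch per color."""
    by_taxon = defaultdict(set)
    for read in reads:
        for km in kmers(read, k):
            taxon = kmer_to_taxon.get(km)  # unclassified k-mers are dropped
            if taxon is not None:
                by_taxon[taxon].add(km)
    return {t: bottom_sketch(kms, size) for t, kms in by_taxon.items()}


def jaccard_estimate(sketch_a, sketch_b, size):
    """Estimate Jaccard similarity from two bottom-k sketches."""
    merged = sorted(set(sketch_a) | set(sketch_b))[:size]
    if not merged:
        return 0.0
    shared = set(sketch_a) & set(sketch_b)
    return sum(1 for h in merged if h in shared) / len(merged)
```

Each per-taxon sketch would then be compared only against reference sketches from the same taxon, rather than against every available reference.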
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 111
Title: Development of a Serotyping Pipeline Using Whole Genome Sequencing (WGS)
for Shigella Identification
Author Block: Y. Wu, H. Lau, T. Lee, D. K. Lau;
FDA, Alameda, CA.
Abstract: Shigella spp., comprising 4 species and >50 serotypes, cause shigellosis, a disease that
leads to significant morbidity, mortality, and economic loss worldwide. An estimated 500,000
annual shigellosis cases occur in the US, and the number of cases has been on the rise.
Shigellosis is transmitted through the fecal-oral route, and about one-third of these cases are
foodborne. Serotyping (speciation) is an important tool for epidemiological surveillance that
informs future policy making for outbreak control and vaccine development.
Classical Shigella serotyping based on serology is tedious, time-consuming, and limited by the
availability of sensitive and serotype-specific antibodies, and its interpretation is often
confounded by cross-reactivity. Modern molecular diagnostic assays are fast and sensitive but
do not distinguish Shigella at the species level or even from the closely related
enteroinvasive Escherichia coli (EIEC) strains. Due to its high discriminating power, whole
genome sequencing (WGS) holds the promise to replace the conventional Shigella serotyping
with a faster and more accurate in silico serotyping. However, analysts trained as laboratory
microbiologists do not usually possess sophisticated bioinformatics skills. Some serotypes
of Shigella are determined by both O-antigen biosynthetic genes and O-antigen modification
enzymes, which can be complicated to interpret. We have developed an automated workflow
that utilizes limited computational resources to accurately and rapidly
determine Shigella serotypes using WGS data from Shigella and EIEC strains available in the
laboratory and on NCBI SRA. To conserve time and computational resources, raw WGS reads
are subjected to alignment with an in-house curated reference sequence database composed
of Shigella serotype determinants and genus- and species-specific sequences as indicators to
exclude non-Shigella isolates. Serotype prediction is made based on sequence hits that pass
threshold levels of coverage and accuracy. Operators with minimal computer programming
skills and knowledge in Shigella genetics can obtain an unambiguous interpretation using this
pipeline. For paired-end FASTQ reads of < 1.7 GB, the turn-around time is under 5 minutes. This
pipeline will be further optimized and streamlined for accuracy, ease of use, and confidence
of predictive values before validation. We are also expanding the reference sequence
database by constantly updating it with newly available sequences from provisional
serotypes. This pipeline is the first step towards building a comprehensive WGS-based
analysis pipeline of Shigella spp. for outbreak investigation and control in a field laboratory
setting, where speed is essential.
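The thresholding logic described in the abstract above (keep sequence hits that pass coverage and accuracy cut-offs, exclude non-Shigella isolates, then call a serotype) might look roughly like the sketch below. Everything specific here is an assumption made for illustration: the threshold values, the determinant names, the rule table, and the function names are placeholders and do not come from the authors' pipeline or reference database.

```python
def passing_determinants(hits, min_coverage=0.9, min_identity=0.95):
    """Keep determinants whose (coverage, identity) pass both thresholds.

    The default thresholds are placeholders, not the pipeline's values.
    """
    return {
        name
        for name, (coverage, identity) in hits.items()
        if coverage >= min_coverage and identity >= min_identity
    }


def predict_serotype(hits, genus_marker, serotype_rules, **thresholds):
    """Predict a serotype from read-alignment hits.

    serotype_rules maps a frozenset of required determinants to a label;
    the rule requiring the most determinants is preferred.
    """
    present = passing_determinants(hits, **thresholds)
    if genus_marker not in present:
        return "not Shigella"
    ranked = sorted(serotype_rules.items(), key=lambda kv: len(kv[0]), reverse=True)
    for required, serotype in ranked:
        if required <= present:
            return serotype
    return "untypeable"
```

Preferring the rule with the most required determinants is one simple way to handle serotypes defined by an O-antigen gene cluster plus a modification enzyme, where a less specific rule would also match.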
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 112
Title: Quinolone Resistance Mechanisms Found in E. coli from Four Animal Species in Norway
Author Block: H. Kaspersen1, C. Sekse1, J. S. Slettemeås1, R. Simm2, M. Norström1, H. Sørum3, A. Urdahl1, K.
Lagesen1;
1Norwegian Veterinary Institute, Oslo, NORWAY, 2Department of Oral Biology, Faculty of
Dentistry, University of Oslo, Oslo, NORWAY, 3Institute of Food Safety and Infection Biology,
Norwegian University of Life Sciences, Oslo, NORWAY.
Abstract: Quinolones and fluoroquinolones are regarded as critically important for human health, but
increased use of these compounds has been linked to increased occurrence of resistance. In
Norway, fluoroquinolones are used in negligible amounts in livestock, and prophylactic use is
prohibited. Nevertheless, low levels of quinolone resistant E. coli (QREC) have been observed
in a high proportion of the samples analysed in the monitoring programme for antimicrobial
resistance in the veterinary and food production sectors (NORM-VET). To better understand
the occurrence of QREC, the resistance mechanisms present in selected isolates from the
NORM-VET programme are characterized. E. coli isolates were defined as QREC when they
grew in the presence of ciprofloxacin and/or nalidixic acid at concentrations above the
epidemiological cut-off values of 0.06 µg/ml and 16 µg/ml, respectively. QREC isolates were
randomly selected and grouped based on animal species of origin, minimum inhibitory
concentration (MIC) for ciprofloxacin and nalidixic acid, as well as the number of additional
resistant phenotypes, resulting in 285 isolates. The isolates originated from wild birds (n = 69),
red foxes (n = 53), pigs (n = 75), and broilers (n = 88). The MIC ranges of the isolates for
ciprofloxacin and nalidixic acid were 0.03 - 16 µg/ml and 4 - 256 µg/ml, respectively. Whole
genome sequencing on Illumina HiSeq2/3/4000 with Nextera Flex/XT library prep was
performed on the isolates. The resulting sequences were run through the Bifrost pipeline
(github.com/NorwegianVeterinaryInstitute/Bifrost) for quality control, antimicrobial
resistance gene identification, and multilocus sequence typing (MLST). Acquired resistance
genes and mutations in intrinsic genes are identified from reads by mapping to a reference
database, followed by local assemblies. Preliminary results suggest that over 80 % of the
isolates have at least one mutation in the gyrA gene, less than 30 % in the gyrB, less than 50 %
for parC and above 60 % for parE. Further analysis is being done and results will be presented.
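The QREC definition given in the abstract above can be written as a small check. The two cut-off values are taken from the abstract; the function and constant names are hypothetical, and the sketch assumes that an isolate growing at concentrations above a cut-off is equivalent to its MIC being greater than that cut-off.

```python
# Epidemiological cut-off values (µg/ml) as stated in the abstract.
CIP_ECOFF = 0.06   # ciprofloxacin
NAL_ECOFF = 16.0   # nalidixic acid


def is_qrec(cip_mic, nal_mic):
    """True if either MIC (µg/ml) exceeds its epidemiological cut-off."""
    return cip_mic > CIP_ECOFF or nal_mic > NAL_ECOFF
```

For example, `is_qrec(0.12, 4)` is true because the ciprofloxacin MIC exceeds 0.06 µg/ml even though the nalidixic acid MIC does not exceed 16 µg/ml.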
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 113
Title: Population Genomic Analysis of Pseudomonas aeruginosa Reveals Strain Sharing During
New-Onset Cystic Fibrosis Infections
Author Block: P. J. Stapleton1, C. Izydorczyk2, A. Blanchard3, P. W. Wang2, J. Diaz Caballero2, Y. Yau1, V.
Waters3, D. S. Guttman2;
1Dept. of Laboratory Medicine and Pathobiology, Univ. of Toronto, Toronto, ON,
CANADA, 2Dept. of Cell and Systems Biology, Univ. of Toronto, Toronto, ON, CANADA, 3Div. of
Infectious Diseases, Hospital for Sick Children, Toronto, ON, CANADA.
Abstract: Introduction: Sharing of Pseudomonas aeruginosa (Pa) strains between cystic fibrosis (CF)
patients with chronic infection is relatively common. It is unclear how frequent Pa strain
sharing is in new-onset infections occurring earlier in CF, when infections are treated with
antibiotic eradication therapy (AET) and epidemic strains are infrequently encountered. We
sequenced Pa isolated from sputum of children prior to initiation of inhaled AET, to
determine the frequency of mixed strain infection, strain sharing, and their association with
AET failure. Methods: We sequenced 342 Pa isolates using Illumina technology, collected
from 65 children with 75 distinct episodes of new-onset infection (episodes at least 1 year
apart, AET failure in 27% of episodes) between 2012 and 2016. Up to 10 isolates were
sequenced per episode. We performed first-pass analysis of population structure by building
phylogenies with 1) Assembly and Alignment Free (AAF), 2) a pairwise distance matrix
generated from assemblies using Mash and 3) a conventional mapping step and SNP
alignment. We further investigated clusters suggestive of strain sharing by mapping genomes
from each cluster to closely related references, using 3 pipelines: 1) Bacteria and Archaea
Genome Analyser (BAGA), 2) Snippy and 3) an in-house pipeline. Maximum likelihood
phylogenetic trees were generated from Single Nucleotide Polymorphisms (SNP) alignments
with IQ-TREE. Strain sharing, which could result from direct/indirect transmission or a
common environmental reservoir, was inferred based on detection of appropriate topological
signal in these trees (strains from different patients exhibiting close monophyletic, or
paraphyletic, relationships). Univariate logistic regressions were used to assess associations
between mixed infection, strain sharing and AET failure. All statistical analyses were done
using SAS 9.04.01. Results: Pairwise SNP differences between closely related isolates differed
depending on the pipeline used and how closely related the reference, but tree topologies
were broadly similar. A large number of patients shared Pa strains with other patients
(N=25/65, 40%). Mixed infection (two or more strains present in sputum concurrently)
occurred in 12/75 episodes (16%). Having a mixed infection was significantly associated with
sharing of Pa strains (unadjusted OR 10.7, 95% CI 2.2; 53.7, p <0.01) but was not associated
with AET failure. Furthermore, strain sharing was not associated with AET
failure. Conclusions: A large proportion of patients were infected with a Pa strain shared with
other patients; the reason for this requires further investigation. Mixed lineage Pa
infections were relatively frequently observed in new-onset episodes and were associated
with strain sharing between patients. Tree topologies for individual clusters were similar
regardless of SNP calling pipeline used despite variation in pairwise SNP difference between
isolates.
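The pairwise SNP differences discussed above can be illustrated with a minimal sketch. It assumes the isolates have already been placed in a multiple sequence alignment of equal-length strings; it is not the BAGA, Snippy, or in-house pipeline used in the study, and the isolate names and sequences are hypothetical toy data. Positions carrying an ambiguous base (N) or a gap are skipped, which is one common convention and an assumption here.

```python
from itertools import combinations


def snp_distance(seq_a, seq_b, ignore="N-"):
    """Count aligned positions where two sequences differ, skipping N and gap sites."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != b and a not in ignore and b not in ignore
    )


def pairwise_snp_matrix(alignment):
    """Return {(name1, name2): distance} for every pair in a dict of aligned sequences."""
    return {
        (n1, n2): snp_distance(alignment[n1], alignment[n2])
        for n1, n2 in combinations(sorted(alignment), 2)
    }


# Hypothetical toy alignment; names and bases are illustrative only.
toy_alignment = {
    "patientA_iso1": "ACGTACGTAC",
    "patientA_iso2": "ACGTACGTAT",
    "patientB_iso1": "ACGAACGNAT",
}
distances = pairwise_snp_matrix(toy_alignment)
for pair, dist in distances.items():
    print(pair, dist)
```

Because the count depends on which sites survive masking and on the chosen reference, the same pair of isolates can yield different distances under different pipelines, which is consistent with the variation reported above.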
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 114
Title: Antimicrobial Resistance Prediction by Whole Genome Sequencing in MRSA and VRE: A Real-World Application
Author A. Babiker, M. M. Mustapha, K. A. Shutt, C. D. Ezeonwuka, S. L. Ohm, M. P. Pacey, J. Marsh, V.
S. Cooper, Y. Doi, L. H. Harrison;
Block: University of Pittsburgh, Pittsburgh, PA.
Background: The antimicrobial resistance (AMR) crisis represents a serious threat to public
health and the healthcare economy and has resulted in concentrated efforts to increase
development of rapid molecular diagnostics for AMR. In combination with publicly-available
web-based AMR databases, whole genome sequencing (WGS) offers the capacity for rapid
detection of antibiotic resistance genes. Here we studied the concordance between WGS-based
resistance prediction and phenotypic susceptibility testing results for methicillin-resistant
Staphylococcus aureus (MRSA) and vancomycin-resistant Enterococcus (VRE) clinical
isolates using publicly-available tools. Methods: Clinical isolates prospectively collected at the
University of Pittsburgh Medical Center between December 2016 and December 2017
underwent WGS. Antibiotic-resistant gene content was assessed from assembled genomes by
BLASTn search of online databases ResFinder and the Comprehensive Antibiotic Resistance
Abstract:
Database (CARD). Concordance between WGS-predicted and phenotypic susceptibility as well
as sensitivity, specificity, positive and negative predictive values (PPV, NPV) were calculated
for each antibiotic/organism combination, using the phenotypic results as the gold
standard. Results: Phenotypic susceptibility testing and WGS results were available for 109
and 105 MRSA and VRE isolates, respectively. Out of a total of 1,058 isolate/antibiotic
combinations, overall concordance was 98.8%, with a sensitivity, specificity, PPV, and NPV of 98.0%
(95% CI, 0.97-0.99), 99.1% (95% CI, 0.98-0.99), 98.5% (95% CI, 0.97-0.99), and 99.0% (95% CI,
0.98-0.99), respectively. Identification of point mutations in housekeeping genes increased
the concordance to 99.3%, the sensitivity to 99.5% (95% CI, 0.98-0.99), and the NPV to 99.8%
(95% CI, 0.99-0.99). Conclusion: WGS can be used as a reliable predictor of phenotypic
resistance for both MRSA and VRE using readily available online tools.
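For readers less familiar with the performance measures reported above, the following is a minimal sketch of how concordance, sensitivity, specificity, PPV, and NPV follow from a 2x2 table in which the phenotypic result is the gold standard. The counts are hypothetical and are not the study's data; the function name is illustrative.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Compute agreement measures from a 2x2 table.

    tp: WGS predicts resistant, phenotype resistant
    fp: WGS predicts resistant, phenotype susceptible
    tn: WGS predicts susceptible, phenotype susceptible
    fn: WGS predicts susceptible, phenotype resistant
    """
    total = tp + fp + tn + fn
    return {
        "concordance": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Hypothetical counts for one antibiotic/organism combination.
metrics = diagnostic_metrics(tp=90, fp=5, tn=100, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```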
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 115
Title: Deep Sequencing of a Measles Vaccine Strain Reveals Complexity of Defective Interfering Genomes
Author A. Beck, M. Coughlin, P. Rota, B. Bankamp;
Block: Centers for Disease Control and Prevention, Atlanta, GA.
Defective interfering particles (DI) of measles virus (MeV) frequently arise in cell culture,
suppressing the replication of standard virus. Paramyxovirus DIs are immunostimulatory, so
elucidation of the formation and function of DIs is important to more fully understand MeV
replication. Many of the complex truncation jump points of DI RNAs arise from a “copy-back”
mechanism involving disassociation of the polymerase complex and reattachment in
opposing strand-orientation. DI are challenging to quantify with traditional molecular
techniques. We serially passaged the MeV vaccine strain, Moraten, in Vero-hSLAM and MRC5
cells, using conditions that promote DI formation. A MIQE-standard RT-qPCR assay using SYBR
chemistry was developed to measure ratios of MeV full-length genomic and DI RNA species
and to minimize experimental noise arising from viral mRNAs. RT-qPCR data were validated
by a two-step, end-point PCR procedure that detects a copy-back polarity switch in the DI
Abstract: RNA sequence. RNA extracted from DI-containing cell lysates was sequenced using stranded
Illumina chemistry and analyzed for DI jump points by various in silico methods. Gapped read
alignments were used in conjunction with R/Bioconductor ranged data processing to detect
the true diversity of DI content in the samples. A high diversity of truncation jump points was
observed and stranded sequencing data suggested replication of DI genomes. Cyclic
suppression of standard virus titers was observed along the passage series, and was inversely
correlated with concentrations of DI RNAs determined by RT-qPCR. DI were detected after
serial passage in Vero-hSLAM cells but not MRC5 cells. The findings represent novel evidence
of DI complexity in a laboratory passaged MeV vaccine strain; DIs were stably and
independently observed in replicate trials over 20 passages. These findings suggest that cell-
specific mechanisms affect DI formation, and that NGS methods are of utility for the discovery
of novel RNA populations in paramyxovirus-infected cells.
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 116
Title: Long-read Sequencing and Assembling of Bacillus Species in Error-free Simulations
Author J. Li;
Block: Changshu Institute of Technology, Suzhou, CHINA.
The generation of genomic data from microorganisms has revolutionized our abilities to
understand their biology, but it is still challenging to quickly and cheaply obtain the complete
genome sequence of microbes in an automated, high-throughput manner. While the advent
of second-generation sequencing technologies provided significantly higher data throughput,
their shorter read lengths and more pronounced sequence-context bias led to a shift towards
resequencing applications. Recently, single molecule real-time (SMRT) DNA sequencing has
been used to generate sequencing reads that are much longer than second-generation or
even Sanger sequencing reads, facilitating de novo genome assembly and genome finishing.
Abstract: Here we tried to develop a novel multiplex strategy to make full use of the capacity and
characteristics of SMRT sequencing in microbe genome assembly. We first used error-free
simulations to evaluate the practicability of assembling SMRT genomic sequencing data from
multiple microbes into finished genomes in a single run. We then compared the
influence of some key factors, including sequencing coverage and read length, on multiplex
assembling. Our results showed that long-read genomic sequencing inherently provided the
ability to assemble genomic sequencing data from multiple microbes into finished genomes
due to its long read length. This approach might be helpful for a variety of microbial
genome projects and for metagenomics research.
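As an illustration of what an error-free multiplex simulation could look like, the sketch below samples perfect reads of a fixed length from several genomes at a target coverage and pools them. It is not the author's simulation code: the uniform start positions, the circular-chromosome wrap-around, the fixed read length, and the toy genome names and sequences are all assumptions made for the example.

```python
import random


def simulate_error_free_reads(genome, read_length, coverage, seed=0):
    """Sample error-free reads from a circular genome at roughly the requested coverage."""
    if not 0 < read_length <= len(genome):
        raise ValueError("read_length must be between 1 and the genome length")
    rng = random.Random(seed)
    n_reads = max(1, round(coverage * len(genome) / read_length))
    doubled = genome + genome  # lets reads wrap around the origin
    starts = (rng.randrange(len(genome)) for _ in range(n_reads))
    return [doubled[start:start + read_length] for start in starts]


def pooled_reads(genomes, read_length, coverage, seed=0):
    """Pool reads from several genomes, keeping the source label for later evaluation."""
    pool = []
    for offset, (name, sequence) in enumerate(sorted(genomes.items())):
        reads = simulate_error_free_reads(sequence, read_length, coverage, seed + offset)
        pool.extend((name, read) for read in reads)
    return pool


# Hypothetical toy genomes; real bacterial chromosomes are several megabases long.
toy_genomes = {"toy_A": "ACGT" * 50, "toy_B": "GGCATT" * 40}
pool = pooled_reads(toy_genomes, read_length=50, coverage=10)
print(len(pool), "reads in the multiplexed pool")
```

Varying `read_length` and `coverage` in such a setup is one way to explore the key factors named above before committing to real sequencing runs.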
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 117
Title: Rapid Antibiotic Resistance Identification Using VOLTRAX Library Preparation and Nanopore Sequencing
Author J. Humphrey, T. Seitz, A. Ducluzeau, D. Drown;
Block: University of Alaska Fairbanks, Fairbanks, AK.
Antibiotic resistance is a growing health crisis in the US, accounting for over 20,000 deaths
annually. Environmental bacteria can act as a reservoir of opportunistic pathogens despite a
lack of exposure. Resistant microbes may have a significant negative impact on the health of
Alaskans. Identifying specific antibiotic resistant microbes is essential for quick and
appropriate treatment. Here we demonstrate a rapid, automated, and portable sequencing
platform. In this proof of concept, each library was constructed from a single cultured
Abstract: microbial isolate using the VOLTRAX, a rapid, automated library preparation device. Sequencing
was carried out using the portable Nanopore DNA sequencer, MinION. We demonstrate the
results of our long read bioinformatic pipeline for assembly, contig polishing, and annotation.
All of these steps were carried out by two undergraduates with introductory laboratory skills.
Importantly, these methods allow us to better understand the environmental reservoir of
antibiotic resistance in Alaska. These results demonstrate the potential application of the
portable library preparation and sequencing for a mobile biosurveillance laboratory.
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 118
Title: Metagenomics for Diagnosis of Sterile Site Infection: Balancing Automation with Expert Interpretation
Author C. Anscombe, A. Nguyen, N. Le, H. Nguyen, P. Ashton, T. Le;
Block: Oxford University Clinical Research Unit (OUCRU), Ho Chi Minh City, VIET NAM.
Background: A quick literature search will demonstrate the increasing popularity of
metagenomic sequencing methods to diagnose patients with suspected infections when other
methods have failed. However, there is a need to carry out these analyses on larger,
prospective cohorts, to determine sensitivity and specificity. At OUCRU, Vietnam, we are
investigating its use in central nervous system (CNS) infections and in patients with fever of
unknown aetiology. In order to examine this data effectively we have developed an analysis
pipeline which is rapid, requires low RAM, and is designed for clinicians or microbiologists to
use. Methods: The pipeline takes raw reads from Illumina sequencing, removes host reads
and classifies the remaining reads using CLARK in light mode. After classification, a
prediction of genome coverage is made for each organism identified based on number of
reads and the genome size of the organism. If a threshold is met, the reference for that taxon
ID is downloaded and sample reads mapped. Outputs include mapping statistics such as
genome coverage and number of reads mapped. A report on the frequency at which taxon
IDs are found across the run is automatically generated, allowing users to consider
contamination. Users can customize the classification database to fit with need, define the
relevant host genome, input contamination libraries and specify taxa to ignore in analysis
based on local knowledge. Results: The pipeline was used to analyze metagenomic
Abstract:
sequencing results from 71 CSF samples collected from patients presenting with CNS infection
in Vietnam. After pipeline completion, the number of reference mapping analyses was 104.
Different sub-types of torque teno virus accounted for 33 of these, and were removed from
the analysis. The results were then edited for clinical significance by a microbiologist. Results
identified pathogens in 17 samples: 8 Streptococcus suis, 4 enteroviruses, 2 cases of
mumps, and 1 each of S. pneumoniae, Japanese encephalitis and Varicella-zoster virus (VZV). In
addition, Hepatitis B was identified in 5 cases but was not considered a cause of CNS disease,
merely a reflection of the high incidence of Hepatitis B in Vietnam. Genome coverage of
these pathogens varied from 0.83% to 81.33%. All findings were confirmed with specific PCR,
with Ct values ranging from 27 to 40. Use of an arbitrary cutoff in reference genome coverage
led to missing VZV and produced 2 false positives (a polytropic provirus and Streptococcus
agalactiae), both of which were negative by PCR. This shows that, just as in a culture-based
diagnostic approach, there is no replacement for expert interpretation of
results. Conclusions: Bioinformatics can help to automate processing of metagenomic
sequencing, but interpretation remains the domain of human intelligence. Building
bioinformatics tools with this in mind will enable more rapid uptake of metagenomics for
diagnosis from sterile sites.
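The coverage-prediction step described in the Methods can be sketched as follows. The abstract states only that the prediction is based on the number of reads and the genome size; the formula used here (expected depth as reads times read length over genome size, and expected breadth from the Lander-Waterman approximation 1 - exp(-depth)) is an assumption, as are the read length, the threshold value, and the taxon names and counts.

```python
import math


def predicted_coverage(n_reads, read_length, genome_size):
    """Return (expected depth, expected breadth) for reads placed uniformly at random."""
    depth = n_reads * read_length / genome_size
    breadth = 1.0 - math.exp(-depth)  # Lander-Waterman approximation
    return depth, breadth


def taxa_to_map(read_counts, genome_sizes, read_length=150, min_breadth=0.01):
    """List taxa whose predicted breadth meets a (hypothetical) threshold for mapping."""
    selected = []
    for taxon, n_reads in sorted(read_counts.items()):
        _, breadth = predicted_coverage(n_reads, read_length, genome_sizes[taxon])
        if breadth >= min_breadth:
            selected.append(taxon)
    return selected


# Hypothetical classifier output: reads per taxon and reference genome sizes in bp.
read_counts = {"taxon_1": 2000, "taxon_2": 10}
genome_sizes = {"taxon_1": 2_000_000, "taxon_2": 5_000_000}
selected = taxa_to_map(read_counts, genome_sizes)
print(selected)
```

As the Results above caution, any such threshold is best treated as a triage aid for the reviewing microbiologist rather than as a reporting rule, since a fixed cutoff missed VZV and admitted false positives.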
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 119
Title: Microbial Dark Matter Analysis Using 16S rRNA Gene Metagenomics Sequences.
Author H. Barak, A. Sivan, A. Kushmaro;
Block: Ben-Gurion University, Beer Sheva, ISRAEL.
Microorganisms are the most diverse and abundant life forms on Earth and account for a
large portion of the Earth’s biomass and biodiversity. To date though, our knowledge
regarding microbial life is lacking, as it is based mainly on information from cultivated
organisms. Indeed, microbiologists have borrowed from astrophysics and termed the
‘uncultured microbial majority’ as ‘microbial dark matter’. The realization of how diverse and
unexplored microorganisms are actually stems from recent advances in molecular biology,
and in particular from novel methods for sequencing microbial small subunit ribosomal RNA
genes directly from environmental samples termed next generation sequencing (NGS). This
has led us to use NGS that generates several gigabases of sequencing data in a single
experimental run, to identify and classify environmental samples of microorganisms. In
metagenomics sequencing analysis (both 16S and shotgun), sequences are compared to
reference databases that contain only a small part of the existing microorganisms and
therefore their taxonomy assignment may reveal groups of unknown microorganisms or
origins. These unknowns, or the ‘microbial sequences dark matter’, are usually ignored in
Abstract:
spite of their great importance. The goal of this work was to develop an improved
bioinformatics method that enables more complete analyses of the microbial communities in
numerous environments. Therefore, NGS was used to identify previously unknown
microorganisms from three different environments (industrial wastewater, Negev Desert
rocks, and water wells in the Arava Valley). 16S rRNA gene metagenome analysis of the
microorganisms from those three environments produced ~4 million reads for 75
samples. Between 0.1% and 12% of the sequences in each sample were tagged as ‘Unassigned’.
Employing relatively simple methodology for resequencing of original gDNA samples through
Sanger or MiSeq Illumina with specific primers, this study demonstrates that the mysterious
‘Unassigned’ group apparently contains sequences of candidate phyla. Those unknown
sequences can be located on a phylogenetic tree and thus provide a better understanding of
the ‘sequences dark matter’ and its role in the research of microbial communities and
diversity. Studying this ‘dark matter’ will extend the existing databases and could reveal the
hidden potential of the ‘microbial dark matter’.
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 120
Title: Whole-Genome Sequencing of Zika Virus Directly from Clinical Samples
Author K. Kamelian1, A. Olmstead2, V. Montoya2, W. Dong2, M. Morshed3, P. R. Harrigan1, J. Joy2;
Block: 1University of British Columbia, Vancouver, BC, CANADA, 2BC Centre for Excellence in
HIV/AIDS, Vancouver, BC, CANADA, 3BC Centre for Disease Control, Vancouver, BC, CANADA.
Background: In 2016, the World Health Organization declared the Zika virus a Public Health
Emergency of International Concern due to the increasing prevalence of Zika virus infections
in the Americas. The Zika virus has been associated with increased incidence of the
neurological condition Guillain-Barré syndrome and the birth defect microcephaly. Routine
surveillance tools currently rely on PCR amplification, Sanger sequencing, and antibody-based
tests to identify new cases of Zika infections. However, whole-genome sequencing (WGS) of
the Zika virus may present certain advantages over other surveillance tools by providing more
detailed information on viral phylogenetic clustering, transmission, and
geography. Methods: Specimens from five subjects with travel-acquired Zika virus infection
(putatively from Belize, Mexico, an undisclosed Caribbean region, Barbados, and Panama)
were obtained from the British Columbia Centre for Disease Control Public Health Laboratory
(BCCDC) and had a range of cycle threshold (Ct) values (21 - 33). WGS of Zika virus was
performed on an Illumina MiSeq using a previously published procedure designed to
overcome some of the limitations of low viral load and partially degraded samples by
amplifying several short amplicons to create a tiling path across the Zika virus genome
(Quick et al., 2017). Sequences were analyzed for depth of coverage and total number of reads
Abstract:
including total quality trimmed reads, viral reads, and human reads. Phylogenetic analysis was
performed to investigate geographic clustering of travel-related cases. Results: Consensus
sequences ranging from 8 - 10.5 kb were obtained for the five samples. Higher Ct values were
correlated with lower coverage, lower number of viral reads, higher number of human reads,
and overall lower depth. Median depth of coverage of the samples was 24,000 (IQR: 17,000-
25,000). Although some contigs had low depth of coverage (less than 10 reads), they still
provided adequate genome coverage for the regions sequenced. Phylogenetic analysis of
sequencing data confirmed the suspected regions of Zika infection for two of five samples.
Three samples were missing reference genomes of suspected areas of infection. However,
they clustered within close geographical proximity to neighboring regions. Conclusions: Our
results highlight the usefulness of WGS using the tiling amplicon method in a clinical setting.
WGS of the Zika virus allows insight into the origins of infection, transmission patterns, and
the genetic diversity of travel related cases. However, our samples gave rise to five sources of
the Zika virus infection suggesting that the complexity and global movement of the Zika virus
epidemic is likely to limit precise interpretations of the origin of travel related cases and is
dependent on availability of reference sequences from regions of interest.
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 121
Title: Identifying Putative Transmission Clusters of The Multidrug-Resistant E. coli ST131-H30 Lineage Among U.S. Children Using Whole Genome Sequencing
Author A. Miles-Jay1, S. J. Weissman2, A. L. Adler2, J. G. Baseman1, D. M. Zerr2;
Block: 1University of Washington, Seattle, WA, 2Seattle Children's Hospital, Seattle, WA.
Introduction: E. coli ST131-H30 is a globally disseminated lineage that is implicated in rising
rates of multidrug resistance among extraintestinal E. coli infections. Despite the public
health significance of this pathogen, its transmission dynamics are poorly understood. This is
in part due to ST131-H30’s capacity for prolonged subclinical intestinal colonization, likely
resulting in a plethora of “silent” transmission events that are difficult to capture directly. We
assessed the ability to detect putative transmission clusters among E. coli ST131-H30 isolates
collected from U.S. children during routine clinical care. Methods: We applied whole genome
sequencing and a novel framework for transmission cluster detection to clinical E. coli ST131-
H30 isolates collected in a multicenter surveillance study that took place from 2009-2013 at 4
geographically diverse U.S. children’s hospitals. Isolates were sequenced on an Illumina
NextSeq platform. Quality filtered and trimmed sequencing reads were mapped to a high-
quality ST131-H30 reference genome and core genome single nucleotide variants were
identified. The R package transcluster—which probabilistically infers the number of
transmission events separating cases using pairwise genomic distance and sampling dates—
was used to identify and characterize putative transmission clusters where the implied
number of transmissions was less than 25 with a probability of 80% (the default
Abstract: settings). Results: A total of 126 E. coli ST131-H30 isolates were included. Twelve isolates
(9.5%) were placed into 6 putative transmission clusters; each cluster contained 2 isolates and
no clusters spanned multiple study sites. The time between sampling in a cluster ranged from
1 to 199 days. Five of the 6 clusters were composed of the CTX-M-15-type extended-spectrum
beta-lactamase-producing subclone of ST131-H30; 1 cluster was composed of non-ESBL
producing H30 isolates. The implied number of transmission events separating isolates in a
single cluster ranged from 1-18 events. The clusters contained a mix of hospital-associated (n
= 5), healthcare-associated community-onset (n = 3), and community-associated (n = 4)
infections. Two instances of plausible nosocomial transmission were
identified. Conclusions: The integration of whole genome sequencing data and a novel
framework for transmission cluster detection revealed putative transmission clusters
among E. coli ST131-H30 isolates collected during routine clinical care. Although geographic
location was not explicitly incorporated into the analysis, all clusters sorted by geographic
site, strengthening their epidemiologic plausibility. Whole genome sequencing of clinical
isolates could guide more detailed and resource-intensive sampling efforts designed to
elucidate transmission pathways of difficult-to-track and worrisome lineages like E.
coli ST131-H30.
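The grouping step of a threshold-based cluster analysis can be sketched generically. The code below is not the transcluster package and does not reproduce its probabilistic model; it assumes that some upstream method has already produced one estimated number of transmission events per pair of isolates, and it simply links pairs that fall below a threshold (single linkage). Isolate names and pairwise values are hypothetical.

```python
def single_linkage_clusters(pairwise, threshold):
    """Group items whose pairwise value is below the threshold, using union-find.

    pairwise: dict mapping (item_a, item_b) to an estimated number of transmissions.
    Returns a sorted list of clusters (each a sorted list) with at least two members.
    """
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path compression
            node = parent[node]
        return node

    for (a, b), value in pairwise.items():
        root_a, root_b = find(a), find(b)
        if value < threshold and root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for node in list(parent):
        groups.setdefault(find(node), []).append(node)
    return sorted(sorted(members) for members in groups.values() if len(members) > 1)


# Hypothetical pairwise estimates of transmission events between isolates.
pairwise = {
    ("iso1", "iso2"): 3,
    ("iso1", "iso3"): 40,
    ("iso2", "iso3"): 38,
    ("iso4", "iso5"): 12,
    ("iso3", "iso4"): 60,
}
clusters = single_linkage_clusters(pairwise, threshold=25)
print(clusters)
```

A point-estimate threshold is a simplification; the analysis described above instead asks whether the implied number of transmissions is below the threshold with a given probability.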
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 122
Title: Virulence Characteristics and an Action Mode of Antibiotic Resistance in Multidrug-Resistant Pseudomonas aeruginosa
Author W. Hwang;
Block: Department of Microbiology and Immunology, Yonsei University, Seoul, KOREA, REPUBLIC OF.
Pseudomonas aeruginosa displays intrinsic resistance to many antibiotics and is known to
actively acquire genetic mutations that confer further resistance. In this study, we attempted to
understand genomic and transcriptomic landscapes of P. aeruginosa clinical isolates that are
highly resistant to multiple antibiotics. We also aimed to reveal a mode of antibiotic
resistance by elucidating transcriptional response of genes conferring antibiotic resistance. To
this end, we sequenced the whole genomes and profiled genome-wide RNA transcripts of
three different multi-drug resistant (MDR) clinical isolates that are phylogenetically distant
from one another. Multi-layered genome comparisons with genomes of antibiotic-
susceptible P. aeruginosa strains and 70 other antibiotic-resistant strains revealed both well-
characterized conserved gene mutations and distinct distributions of antibiotic resistance genes
Abstract: (ARGs) among strains. Transcription of genes involved in quorum sensing and type VI
secretion systems was invariably downregulated in the MDR strains. Virulence-associated
phenotypes were further examined, and the results indicate that our MDR strains are clearly
avirulent. Transcription of 64 genes, logically selected as related to antibiotic resistance
in MDR strains, was active under normal growth conditions and remained unchanged during
antibiotic treatment. These results suggest that antibiotic resistance is achieved by a
“proactive” response scheme, where ARGs are constitutively expressed even in the absence
of antibiotic stress, rather than a “reactive” response. Bacterial responses explored at the
transcriptomic level in conjunction with their genome repertoires provided novel insights into
(i) the virulence-associated phenotypes and (ii) a mode of antibiotic resistance in MDR P.
aeruginosa strains.
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 123
Title: Metagenomics Analysis of Spent Water from a Livestock Farm
Author O. A. Aiyegoro1, A. A. Adegoke2, D. T. Babalola3, E. Adetiba4;
Block: 1Agricultural Research Council- Animal Production Institute, Pretoria, SOUTH
AFRICA, 2University of Uyo, Akwa-Ibom, Uyo, NIGERIA, 3Covenant University-Centre for
Systems and Information Services, Cannanland, Ota, NIGERIA, 4Covenant University-
Department of Electrical and Information Engineering, Cannanland, Ota, NIGERIA.
Microorganisms are ubiquitous and important to the proper functioning of ecosystems,
including aquatic milieus, where they play vital roles in the water cycle and the removal of
nutrients and toxins. Hence, studying these microbes is essential. Until recently, non-
culturable microbes were difficult to study; however, the advent of metagenomic analysis has
helped to solve this problem. In the present study, the gene profile of the microbial composition
of used water from an animal research farm was analysed in order to obtain a profile of
the resident microbiome in the water. To analyse the microbial communities as depicted in the
sequenced data, MetaPhlAn2 was used. The reads were analysed and combined to form a
merged abundance table. This table was edited and viewed using LibreOffice Calc. A heatmap
showing the abundance profiles of the microbes was generated using Hclust2. A cladogram
showing taxonomic relatedness was captured using GraPhlAn. This was done by rendering
trees and annotating them with microbial names and relative abundances. Pie-charts,
showing specific comparisons based on various clades, were generated using Krona. The
analysis of the microbial samples showed that the environment was dominated by Bacteria
(99.88%). The sample also showed that Archaea (0.07%) and Viruses (0.05%) were present, in
very small populations. Upon further analysis, it was shown that about 5 phyla were present
within the bacterial population namely: Actinobacteria (0.5%), Bacteroidetes (9%), Firmicutes
Abstract:
(3.3%), Proteobacteria (86.6%) and Spirochaetes (0.6%). A total of 40 different genera were
identified, with the genus Thauera making up 74% of the entire population. This comes as no
surprise, as Thauera is a denitrifying bacterium that plays a crucial role in the wastewater
ecosystem. Thauera plays an important role in the removal of nitrogen, nitrate, and
aromatic compounds. For this reason, Thauera is usually detected in most wastewater
treatment samples. Another notable genus among the bacterial population is
Thiomonas, making up about 5% of the bacterial population. The viruses present in
the effluent are composed mainly of the family Siphoviridae and the genus Gammaretrovirus. The
kingdom Archaea was found to consist only of
the genus Methanobrevibacter. Metagenomic analysis provides insight into the microbial
community present in the wastewater effluent. We were able to analyse the various
microorganisms present as well as their relative abundances. We were also able to use
various tools to provide graphical illustrations that aided our analysis. The results revealed
that a large percentage of the microorganisms present were bacteria and we were able to
view their diversity. A huge part of this bacterial presence was directly involved in the
wastewater ecology and had major roles in the breakdown of chemical compounds present in
the water.
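Rank-level summaries such as the phylum percentages reported above can be pulled from a merged abundance table with a few lines of code. The sketch assumes a tab-separated table whose first column holds pipe-delimited clade names with rank prefixes (for example `k__Bacteria|p__Proteobacteria`), which is the general layout of MetaPhlAn-style output; the exact header and the values shown are hypothetical and are not the study's data.

```python
import csv
import io


def abundances_at_rank(table_text, rank_prefix="p__"):
    """Return {sample: {clade: abundance}} for rows whose deepest rank matches rank_prefix."""
    reader = csv.reader(io.StringIO(table_text), delimiter="\t")
    header = next(reader)
    samples = header[1:]
    result = {sample: {} for sample in samples}
    for row in reader:
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        deepest = row[0].split("|")[-1]
        if deepest.startswith(rank_prefix):
            clade = deepest[len(rank_prefix):]
            for sample, value in zip(samples, row[1:]):
                result[sample][clade] = float(value)
    return result


# Hypothetical merged table with one sample; values are illustrative only.
toy_table = (
    "ID\tsample_1\n"
    "k__Bacteria\t99.0\n"
    "k__Bacteria|p__Proteobacteria\t80.0\n"
    "k__Bacteria|p__Firmicutes\t19.0\n"
    "k__Bacteria|p__Proteobacteria|c__Betaproteobacteria\t70.0\n"
    "k__Archaea\t1.0\n"
)
phyla = abundances_at_rank(toy_table)
print(phyla)
```

Passing `rank_prefix="g__"` would give the genus-level view in the same way, which avoids hand-editing the table in a spreadsheet.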
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 124
Title: Nanopore Sequencing for AMR Detection and Characterization
Author Y. Fan1, W. Timp1, T. Simner2, P. Tamma2, Y. Bergman2;
Block: 1Johns Hopkins University, Baltimore, MD, 2Johns Hopkins Hospital, Baltimore, MD.
Background: The continuing threat of antimicrobial resistance poses an urgent global public
health concern. Early and accurate detection of resistance mechanisms can both prevent the
dissemination of resistant organisms within the healthcare environment and ensure patients
are placed on early and effective antibiotic therapy. To this end, we leveraged the long reads,
low overhead, and real time analysis capabilities of nanopore sequencing in order to detect
both acquired resistance genes and chromosomal mutations that potentially confer
antimicrobial resistance. Methods: Forty clinical Klebsiella pneumoniae isolates with a variety
of resistance mechanisms from patients hospitalized at The Johns Hopkins Hospital were
sequenced on the Oxford MinION platform. Genomes were assembled using canu, and
corrected with signal level algorithms implemented in the nanopolish software package.
These isolates were also sequenced on the Illumina platform, and the more accurate short
read data were used to further correct the assemblies. Abricate was used to screen the
contigs for resistance genes, using several databases, including CARD, ResFinder, and
PlasmidFinder. Chromosomal mutations and their consequences in the amino acid domain
were identified using custom C++ code. Results: We found that disagreements between the
Abstract: nanopolished and Illumina-polished assemblies clustered near methylation motifs. By
examining these errors, and by using improved signal models for these motifs in
a methylation-aware version of nanopolish, we can build an assembly using only nanopore
data that achieves ~99.8% identity with Illumina-polished assemblies. We are continuing
to examine the locations and motifs of the remaining ~0.2% of errors to identify new and
better approaches to polish nanopore-only assemblies. This remaining ~0.2% is important
because it makes accurate prediction of protein translation difficult, so truncating or missense
mutations are hard to detect. Corrected assemblies allowed us to identify a variety of small
mutations noted in the literature to be responsible for resistance phenotypes. By limiting our
analysis pipeline to 52,000 reads with an average length of 10 kb, we are able to sequence
and build a high quality genome in under 8 hours using a machine with 36 cores and 72 GB
RAM. Conclusions: We are developing tools that apply nanopore sequencing for rapid and
accurate identification of antibiotic resistance mutations, which will aid clinicians in placing
critically-ill patients on early and effective antibiotic therapy. As we collect and sequence
more isolates, and accrue information about genetic features that give rise to antimicrobial
resistance, we will increase the utility of real time sequencing assays for diagnostic purposes.
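To make the link between residual assembly errors and mutation calling concrete, the sketch below translates a reference and an assembled coding sequence with the standard genetic code and labels each amino acid change as missense, nonsense (truncating), or stop-lost. It is written in Python for illustration and is not the custom C++ code mentioned in the Methods; it assumes equal-length, in-frame sequences with substitutions only, and the toy sequences are hypothetical.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built by pairing codons in TCAG order with their amino acids.
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(cds):
    """Translate an in-frame coding sequence; trailing partial codons are ignored."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def classify_substitutions(ref_cds, alt_cds):
    """Return (position, ref_aa, alt_aa, kind) for each amino acid difference."""
    if len(ref_cds) != len(alt_cds):
        raise ValueError("this sketch handles substitutions only (equal lengths)")
    calls = []
    pairs = zip(translate(ref_cds), translate(alt_cds))
    for position, (ref_aa, alt_aa) in enumerate(pairs, start=1):
        if ref_aa == alt_aa:
            continue  # identical or synonymous codon
        if alt_aa == "*":
            kind = "nonsense"
        elif ref_aa == "*":
            kind = "stop_lost"
        else:
            kind = "missense"
        calls.append((position, ref_aa, alt_aa, kind))
    return calls


# Hypothetical toy gene: ATG GCT TGG AAA TAA versus ATG GTT TGA AAA TAA.
calls = classify_substitutions("ATGGCTTGGAAATAA", "ATGGTTTGAAAATAA")
print(calls)
```

A single uncorrected base error in the assembly would appear in this output exactly like a true mutation, which is why the residual error rate matters for resistance prediction.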
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 125
Title: ASA³P: An Automatic and Highly Scalable Pipeline for Bacterial Genome Assembly, Annotation and Higher-level Analyses
Author O. Schwengers1, A. Hoek1, M. Schneider1, M. Fritzenwanker2, L. Falgenhauer2, J. Falgenhauer2,
T. Hain2, T. Chakraborty2, A. Goesmann1;
Block: 1Bioinformatics and Systems Biology, Justus-Liebig University Giessen, Giessen,
GERMANY, 2Institute of Medical Microbiology, Justus-Liebig University Giessen, Giessen,
GERMANY.
Background: Major technological advances and the dramatic decrease in costs of bacterial
whole genome sequencing are having an unprecedented effect on genome epidemiology and
metagenomics. These exciting developments require the establishment of effective, efficient
and scalable bioinformatics software tools for data processing and analysis of the high
throughput data obtained before scientific interpretation can take place. Methods: In order
to solve core bioinformatics tasks such as quality trimming, assembly and annotation, ASA³P
takes advantage of published and well performing third party tools and combines them with
comprehensive databases. It is a modular software pipeline comprising a core application
implemented in Java and Groovy together with cluster distributable scripts implemented in
Groovy. HTML reports take advantage of modern and interactive JavaScript libraries. For
massive scalability our pipeline integrates well with Sun Grid Engine compatible compute
clusters. Within cloud computing environments the software is able to setup complex
hardware and software infrastructures and thus is able to automatically create its own
compute clusters. Results: Here, we introduce ASA³P, a fully automatic and scalable
Abstract:
assembly, annotation and higher-level analysis pipeline for bacterial genomes. The pipeline
conducts all of the necessary data processing steps, i.e. quality clipping and assembly of
sequencing reads, scaffolding subsequent contigs and annotation of genome sequences.
Furthermore, ASA³P performs comprehensive genome characterizations and analyses, e.g. for
taxonomic classification, and detection of both AMR genes and virulence factors. Results are
presented via an HTML based user interface providing aggregated information, interactive
visualizations and access to intermediate results in standard bioinformatic file formats. ASA³P
is available in two versions: a local Docker container for small-scale projects and an
OpenStack cloud version able to automatically create and manage its own self-scaling
compute cluster. Discussion: ASA³P is a software tool enabling the automatic processing,
assembly, annotation and higher level analysis of bacterial NGS whole genome data in a
comfortable but high-throughput manner. The burden of technical complexity is overcome by
simple setup routines and the use of Docker and OpenStack images. Thus, automatic and
standardized analysis of hundreds of bacterial genomes is now feasible on a daily basis.
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 126
Title: Soil Microbiome Phenotypic Response to Nutrient and Moisture Perturbations
Author Block: T. Roy Chowdhury1, E. M. Bottos2, R. A. White III2, J. O. Brown2, L. M. Bramer2, J. D. Zucker2, C.
M. Brislawn2, S. J. Fansler3, L. McCue3, S. J. Callister3, J. K. Jansson3;
1University of Maryland, College Park, MD, 2Pacific Northwest Laboratory, Richland,
WA, 3Pacific Northwest National Laboratory, Richland, WA.
Abstract: Soil microbiome responses to changing environmental conditions are manifested as shifts in
community structure and/or modification of activity. However, molecular-level details
underlying functional responses of soil microbiomes to perturbation are largely unknown.
Here, we demonstrate a multi-omics approach to determine the impact of environmental
perturbations on the soil microbiome across taxonomic and functional levels. Kansas native
prairie soil samples from three field locations were either treated with glycine as a model root
exudate, or perturbed by changing moisture conditions. The microbiome response was
assessed using a suite of omics measurements: 16S rRNA amplicon sequencing,
metagenomics, metatranscriptomics, and metabolomics. The soil microbiome responded to
glycine at the functional level, but not at the community structure level. In contrast, soil
drying shifted both the microbiome composition and function. A major challenge in soil
microbial ecology is the extraordinary phylogenetic and functional diversity of the soil
microbiome in association with the physico-chemical complexity of the soil habitat. Here, by
using a multi-omics approach, we elucidated the phenotypic response of the soil microbiome
across different levels of expression, thus providing a proof-of-concept for use of this
approach to assess key physiological traits expressed by the soil microbiome.
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 127
Title: Microbial Diversity of New Orleans Groundwater Using Illumina Miseq
Author Block: S. Sherchan;
Tulane University, New Orleans, LA.
Abstract: Groundwater contamination can result in poor drinking water quality and loss of water supply,
and can put human health at great risk. Increased attention has been devoted to the direct
detection of pathogenic organisms in groundwater by using next-generation sequencing
(NGS). We investigated microbial biodiversity of groundwater using Illumina MiSeq. Water
samples were collected from 55 private wells in New Orleans, Louisiana. Our results indicate
the presence of twenty bacterial phyla. Proteobacteria was the most dominant phylum in most samples
(relative abundance: 71.1%), followed by Chlorobi (5.1%), Actinobacteria (4.2%), Chloroflexi
(3.3%), Cyanobacteria (2.2%), and Bacteroidetes (2.0%) (Fig. S5). At the genus level, five
genera were abundant (> 3%) in well water samples: Methylomonas (5.3%),
Methylosinus (3.7%), Mycobacterium (3.4%), Dechloromonas (3.3%), and Thiobacillus (3.1%).
The relative abundances of the classes Gammaproteobacteria and Actinobacteria were
positively associated with qPCR results of Legionella spp. and mycobacteria, respectively.
However, the regression analysis showed no significance (p > 0.05). Principal coordinates
analysis (PCoA) of unweighted UniFrac distances indicates that patterns of bacterial community
composition in groundwater reflect sampling locations.
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 128
Title: Third-Generation Sequencing of Microbes Isolated from Fermented Beverages
Author Block: J. H. Collins, E. Van Zyl, A. Cosio, G. Brogan, L. Chico, A. Rozen, E. Thomson, J. Coburn, E. M.
Young;
Worcester Polytechnic Institute, Worcester, MA.
Abstract: Fermented beverages represent a growing multibillion dollar industry. This is not limited to
beer and wine – other products like hard cider and kombucha are among the fastest growing
sectors in the industry. Understanding the microbial communities that generate – or spoil –
these products is essential to controlling flavor and consistency. Plummeting sequencing
costs and new long read sequencing technologies can enable genomics of individual isolates
as well as microbial consortia. To this end, we demonstrate the utility of nanopore sequencing
for genome and metagenome sequencing of beer, hard cider, and kombucha fermentations.
We present de novo genome assemblies of four isolates: a Saccharomyces cerevisiae strain
from homebrew, two novel Saccharomyces and Pichia yeast isolates from hard cider, and
a Gluconacetobacter xylinus strain from a home kombucha culture. We further attempt to
sequence the metagenome of the kombucha culture, and show genome assembly of the most
common Acetobacter strain is possible, although targeted metagenomes such as those based
on 16S rRNA are likely better suited to capturing all members of the community. To assemble
the isolate genomes, four de novo genome assemblers, MiniASM, Canu, Flye, and
SMARTdenovo, were evaluated at varying genome coverages, with SMARTdenovo performing
the best based on number of contigs, number of mismatches, and average coding sequence
(CDS) length. Quality of the assemblies can be greatly improved by scaffolding to a reference
genome with pyScaf and polishing using Nanopolish, increasing the average CDS length by
approximately 33% across all assemblies. Yet, the average CDS length of the polished
nanopore assemblies was approximately 75% of that of their reference strains, which could be
improved by complementary Illumina sequencing. Even without Illumina data, we were able to
place the homebrew S. cerevisiae strain in the “Beer 2” clade based on multiple mutations to
the MAL11 gene and a specific W497 mutation in the FDC1 gene, identify new yeast species
from hard cider, assemble a genome from a metagenome experiment, and assemble a
complete G. xylinus genome from an isolate with a contig longer than the existing reference
strain. This demonstrates that quality genome assemblies from fermented beverages using
nanopore sequencing are possible. The low cost and ease of use of nanopore technology
promise high quality genomic information for future strain breeding or engineering, as well
as assessing spoilage for process control. Thus, a workflow of nanopore sequencing coupled
with de novo assembly using SMARTdenovo, and optionally pyScaf and Nanopolish, can
facilitate quality genomics of microbes from fermented beverages and therefore has great
potential utility for producers of fermented beverages for the control of flavor and
consistency.
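As an illustration of one of the assembler comparison criteria named above (number of contigs), the following Python sketch counts contigs in a FASTA-formatted assembly. It is only a minimal example on invented toy sequences, not the authors' evaluation code; the function names are hypothetical, and N50 is included as a commonly used contiguity summary that the abstract itself does not report.

```python
def contig_lengths(fasta_text):
    """Return the length of every contig in a FASTA-formatted string."""
    lengths = []
    current = None  # length of the contig being read; None before the first header
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths):
    """Length L such that contigs of length >= L cover at least half the assembly."""
    half = sum(lengths) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half:
            return length
    return 0


# Toy assembly, invented for illustration only.
TOY_ASSEMBLY = """>contig_1
ACGTACGTAC
>contig_2
ACGTAC
>contig_3
ACGT
"""

lengths = contig_lengths(TOY_ASSEMBLY)
print("contigs:", len(lengths), "total bases:", sum(lengths), "N50:", n50(lengths))
```

Running the same two functions on the output of each assembler would give one comparable row per assembly; mismatch counts and CDS lengths need a reference alignment and gene prediction, which this sketch does not attempt.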
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 129
Title: Whole Genome Analysis of Clinical Multidrug-resistant Acinetobacter baumannii Strains in
Vietnam Hospital
Author Block: N. Si-Tuan1, C. Nguyen2, H. Nguyen Thuy3;
1Thongnhat General Hospital of Dongnai Province, 234 highway 1A, Tan Bien ward, Bien Hoa
City, VIET NAM, 2Department of Bioinformatics and Medical Statistics, Vinmec Research
Institute of Stem Cell and Biotechnology, Hanoi, VIET
NAM, 3Department of Biotechnology, Faculty of Chemical Engineering, Ho Chi Minh City
University of Technology, HCM National University, Ho Chi Minh City, VIET NAM.
Abstract: Background: Acinetobacter baumannii is an important nosocomial pathogen that can develop
multidrug resistance. In this study, we sought to explore the genomic properties,
phylogenetic relationships, and comparative genomics of this pathogen through strains
DMS06669 and DMS06670 (isolated from the sputum of two male patients with hospital-
acquired pneumonia). Methods: Whole genome analysis of A. baumannii DMS06669 and
DMS06670 included de novo assembly; gene prediction; functional annotation to public
databases; phylogenetic tree construction by average nucleotide identity; pan-genome
analysis; and antibiotic resistance gene identification. Antibiotic resistance genes were
isolated in vitro by PCR and reconfirmed by an improved Sanger method. Results: The data
showed that a total of 19 possible antibiotic resistance genes, conferring resistance to eight
distinct classes of antibiotics, were identified in two strains. Nine of these genes have not
previously been reported to occur in A. baumannii. Comparative analysis of 23 available
genomes of A. baumannii revealed an open pan-genome consisting of 15,883 genes. All
antibiotic resistance genes were isolated. Conclusions: Our results provide important
information regarding mechanisms that may contribute to antibiotic resistance in the
DMS06669 and DMS06670 strains and have implications for treatment of patients infected
with A. baumannii.
Keywords: Acinetobacter baumannii, multidrug resistance, pan-genome analysis, comparative
whole-genome analysis, next generation sequencing.
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 130
Title: Genomic Characterization of Carbapenem-resistant Escherichia coli Isolated from Clinical
Samples in Thailand
Author Block: K. R. Margulieux1, A. Srijan1, S. Ruekit1, E. Snesrud2, A. Ong2, O. Serichantalergs1, R.
Kormanee3, P. Sukhchat3, A. Jones2, P. McGann2, J. Crawford1, B. Swierczewski2;
1Armed Forces Research Institute of Medical Sciences, Bangkok, THAILAND, 2Walter Reed
Army Institute of Research, Silver Spring, MD, 3Queen Sirikit Naval Hospital, Chonburi,
THAILAND.
Abstract: Background: The spread of multidrug-resistant bacterial pathogens is one of the most
dangerous current public health threats, especially in regions such as Southeast Asia with
widely unregulated antibiotic usage. Whole genome sequencing (WGS) of clinical isolates can
provide additional insights about multidrug-resistant isolates and complement efforts of local
clinical laboratories. Methods: A total of 183 clinical Escherichia coli isolates were collected at
Queen Sirikit Naval Hospital in Chonburi, Thailand from October 2016 - January 2018 as part
of routine surveillance for multidrug-resistant pathogens. The isolates were verified and
underwent antimicrobial susceptibility testing using the BD Phoenix™ 50 and the NMIC/ID 95
panel to screen for carbapenem-resistant isolates. WGS of identified carbapenem-resistant
isolates was performed on an Illumina MiSeq Benchtop sequencer and subsequently
analyzed. Results: Of the 183 multidrug-resistant E. coli isolates collected, 168 (91.8%)
demonstrated resistance to 3rd generation cephalosporins and 167 (91.3%) to aztreonam.
Phenotypic resistance to at least one carbapenem tested was observed in 11/183 (6.0%)
isolates. 8/11 isolates were positive for carbapenemase production using a CarbaNP test.
Genomic characterization of E. coli isolates showed that the 11 isolates carried 5
carbapenem-resistance genes between them. Single isolates carried blaNDM-4, blaKPC-2,
and blaOXA-181, two isolates carried blaOXA-48, and six isolates carried blaNDM-5. The three isolates
that tested negative for carbapenemase production with the CarbaNP test
carried blaOXA genes. Isolate relatedness was shown through whole genome
comparison. Conclusions: A total of 11/183 (6%) multidrug-resistant E. coli isolates identified
over 16 months at Queen Sirikit Naval Hospital were shown to be carbapenem resistant. Five
carbapenem-resistance genes were identified in these 11 isolates. The most prevalent
carbapenem-resistance gene was blaNDM-5, a gene increasingly reported in Southeast Asia.
Notably, the isolates carrying blaOXA genes appeared to have lower phenotypic levels of
carbapenemase production compared to other gene types. This may lead to an under-
reporting of carbapenem-resistance in the region and treatment complications if not
detected during routine clinical screening. It is important to continue long-term surveillance
of hospitals in Thailand, and utilizing WGS for in-depth isolate analysis is important to fully
understand the current circulation of resistant pathogens and their evolution over time.
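The proportions and gene counts reported in this abstract can be re-derived from the stated numbers. The short Python check below is only an illustrative sketch: the variable names are invented here, and it assumes each of the 11 carbapenem-resistant isolates carries exactly one of the five listed genes, which the abstract's counts are consistent with but do not state outright.

```python
# Counts taken from the abstract; variable names are illustrative only.
TOTAL_ISOLATES = 183

resistant_counts = {
    "3rd-generation cephalosporins": 168,
    "aztreonam": 167,
    "at least one carbapenem": 11,
}

# Number of carbapenem-resistant isolates reported for each gene
# (assumption: one gene per isolate).
gene_carriage = {
    "blaNDM-4": 1,
    "blaKPC-2": 1,
    "blaOXA-181": 1,
    "blaOXA-48": 2,
    "blaNDM-5": 6,
}


def percent(count, total):
    """Return count/total as a percentage rounded to one decimal place."""
    return round(100.0 * count / total, 1)


percentages = {name: percent(n, TOTAL_ISOLATES) for name, n in resistant_counts.items()}
oxa_isolates = sum(n for gene, n in gene_carriage.items() if gene.startswith("blaOXA"))

for name, pct in percentages.items():
    print(f"{name}: {pct}%")
print("isolates with a listed carbapenem-resistance gene:", sum(gene_carriage.values()))
print("isolates carrying a blaOXA gene:", oxa_isolates)
```

Under that assumption the three blaOXA carriers match the three CarbaNP-negative isolates (11 resistant minus 8 CarbaNP-positive).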
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 131
Title: All 16S rRNA Gene Variable Regions Essential for Microbiome Survey
Author Block: Q. Yang, C. Franco, W. Zhang;
Flinders University, Adelaide, AUSTRALIA.
Abstract: Background: Marine sponges (phylum Porifera) are enriched by host-specific and
opportunistic microorganisms that make up to 60% of the mesohyl volume. The majority of
these microbes have not been identified. 16S rRNA gene based metagenomics sequencing has
become the method of choice to study sponge microbiomes; however, results from amplicon-
based analyses that employ only one pair of primers targeting specific variable regions have
been found to be grossly under-representative. Therefore, the aims of this study were to test
the hypothesis that primers targeting different variable regions of the 16S rRNA gene reveal
vastly different parts of the microbiome and to develop an improved approach to reveal a
more complete microbial profile. Methods: To test the hypothesis, five primer sets covering
all the variable 16S rRNA gene regions were validated to reveal the microbiomes of four
representative sponge species in different orders on the Illumina MiSeq platform. Results: A
significant increase in microbiome coverage was achieved. 29.5% of phylum-level OTUs and
35.5% of class-level OTUs generated from this developed approach could not be recovered by
the most commonly used single primer set targeting the V4 region only. In relation to the
microbial sequence recovery, this approach could increase the sequence coverage by 93.9 to
549.9% for each of four sponge species when compared to that using the V4 primer set. A
further indirect comparison with metagenomics-based microbiome survey demonstrated that
the multi-primer approach performed substantially better, especially in revealing unaffiliated
taxa that are either candidate or unassigned. Conclusions: Our study indicated that a
validated combination of variable region-specific primer sets covering the full length of 16S
rRNA gene is essential to analyze sponge microbial communities when using amplicon-based
analysis, so as to avoid incomplete and misleading microbiome profiling. This multi-primer
approach can be conveniently applied and represents a fundamental change from
conventional single primer set amplicon-based microbiome studies. It could contribute
significantly to any microbiome survey project, to achieve a more comprehensive
understanding of the microbial profile. The superior capacity for uncovering the unaffiliated
microbial OTUs allows for a greater potential to discover the taxonomic ‘blind spots’ within
the largely unknown microbial world.
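The comparison described above (OTUs recovered by the pooled primer sets but missed by the V4 set alone) amounts to a set union followed by a set difference. The Python sketch below shows that calculation on invented toy data; the primer-set names, the three-set layout (the study used five) and the phylum labels are assumptions made purely for illustration and do not reproduce the study's results.

```python
# Toy example: primer-set names and phylum-level OTU labels are invented.
otus_by_primer_set = {
    "V1-V3": {"Proteobacteria", "Chloroflexi", "Poribacteria"},
    "V4": {"Proteobacteria", "Chloroflexi", "Actinobacteria"},
    "V5-V8": {"Proteobacteria", "Cyanobacteria", "Actinobacteria"},
}


def missed_by_single_set(otus_by_set, reference_set):
    """Pool OTUs over all primer sets and report those absent from one set.

    Returns (pooled OTUs, OTUs missing from reference_set, missing share in %).
    """
    pooled = set().union(*otus_by_set.values())
    missed = pooled - otus_by_set[reference_set]
    share = 100.0 * len(missed) / len(pooled)
    return pooled, missed, share


pooled, missed, share = missed_by_single_set(otus_by_primer_set, "V4")
print(sorted(missed), f"{share:.1f}% of pooled OTUs not seen with V4 alone")
```

In this toy case two of the five pooled OTUs are invisible to the V4 set; the abstract's 29.5% (phylum level) and 35.5% (class level) are the corresponding shares in the real data.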
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 132
Title: Factors Associated with Surface Water Microbial Community Structure
Author Block: T. Chung1, D. L. Weller2, J. Kovac1;
1The Pennsylvania State University, State College, PA, 2Cornell University, Ithaca, NY.
Abstract: Background: According to the U.S. Geological Survey, surface-water sources accounted for
52% of all irrigation withdrawals in 2015. However, temporal variation in physical (e.g.,
turbidity) and chemical (e.g., pH) water quality, and spatiotemporal variation in
environmental factors (e.g., weather, proximity to upstream livestock operations) may affect
microbial community composition and the microbiological quality of surface water. While a
number of studies have investigated drinking water microbiomes, little is known about the
microbial communities in surface waters that are used for irrigation. Here we investigated
geospatial, weather and landscape factors associated with surface water micro- and
mycobiomes. Methods: We characterized the bacterial and fungal community composition of
68 samples collected using Moore swabs from six streams in upstate New York between May
and August 2017. Samples were separated into particulate matter (i.e., soil) and water
fractions. Total DNA was extracted from fractions using PowerSoil (n=68) or PowerWater
(n=46) kits, respectively. Data on physical and chemical water quality, upstream landscape
characteristics and weather data were also collected at sampling. Microbial community
composition of each sample was determined by Illumina sequencing of PCR-amplified 16S
rRNA gene V4 region and ITS sequences. Sequences were processed using Mothur. The
resulting OTUs were used to investigate sample biodiversity within and between streams
using permutational multivariate analysis of variance (PERMANOVA), clustering, ordination
and network analysis. Results: Significant differences in microbial community structure were
observed among samples collected from different streams. According to PERMANOVA, these
differences may be associated with differences in upstream activity (p<0.05). Three out of 18
samples collected immediately downstream of dairies had a relatively higher abundance of
Moraxellaceae or Enterobacteriaceae. Microbial communities also differed between water
and soil fractions of individual samples according to the UniFrac-based PCoA clustering, and
PERMANOVA (p<0.01). While the dominant families in soil fractions were Rhodocyclaceae,
Rubritaleaceae, and Sphingomonadaceae, Chitinophagaceae, Enterobacteriaceae, and
Moraxellaceae were more abundant in water fractions. Conclusions: The taxonomic compositions
of the soil and water fractions of the collected surface water samples differ. Further, the microbial and
especially Enterobacteriaceae content in water may be affected by the adjacent land use.
Thus, this study provides baseline data describing surface water microbiome structure that
can guide further studies focused on detection of microbiological safety hazards in water.
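For readers unfamiliar with PERMANOVA, the sketch below shows the general idea in plain Python: a pseudo-F statistic is computed from a sample-by-sample distance matrix and group labels, and its significance is estimated by shuffling the labels. This is not the authors' analysis; the toy one-dimensional "samples", the group names and the function names are invented for illustration, and a real analysis would use OTU-based dissimilarities (for example UniFrac) and an established statistics package.

```python
import random
from itertools import combinations


def pseudo_f(dist, labels):
    """Pseudo-F statistic from a square distance matrix and group labels."""
    n = len(labels)
    groups = sorted(set(labels))
    # Total sum of squares from all pairwise squared distances.
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    # Within-group sum of squares, accumulated group by group.
    ss_within = 0.0
    for g in groups:
        members = [i for i, lab in enumerate(labels) if lab == g]
        ss_within += sum(dist[i][j] ** 2 for i, j in combinations(members, 2)) / len(members)
    ss_among = ss_total - ss_within
    return (ss_among / (len(groups) - 1)) / (ss_within / (n - len(groups)))


def permanova(dist, labels, permutations=999, seed=0):
    """Return (observed pseudo-F, permutation p-value)."""
    rng = random.Random(seed)
    observed = pseudo_f(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if pseudo_f(dist, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Toy data: five hypothetical samples per stream, placed on a line so that
# the two streams are clearly separated.
positions = [0.0, 0.1, 0.2, 0.3, 0.4, 1.0, 1.1, 1.2, 1.3, 1.4]
labels = ["stream_A"] * 5 + ["stream_B"] * 5
dist = [[abs(a - b) for b in positions] for a in positions]

f_stat, p_value = permanova(dist, labels)
print(f"pseudo-F = {f_stat:.1f}, p = {p_value:.3f}")
```

A small p-value here only says that the grouping explains more of the between-sample distance than random relabelling would; as in the abstract, it indicates association rather than cause.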
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 133
Title: Whole Genome Sequencing Analysis Reveals That Air-conditioners Cooling Towers are
Reservoirs for Legionella pneumophila and Lead to Infections with the Same Strains Over
Years
Author Block: D. Wüthrich1, S. Gautsch2, P. Brodmann2, O. Dubuis3, R. Spieler-Denz4, S. Tschudin-Sutter1, V.
Gaia5, S. Fuchs4, A. Egli1;
1University Hospital of Basel, Basel, SWITZERLAND, 2Cantonal Laboratory City of Basel, Basel,
SWITZERLAND, 3Viollier, Allschwil, SWITZERLAND, 4Medical Services City of Basel, Basel,
SWITZERLAND, 5National Reference Center for Legionella, Bellinzona, SWITZERLAND.
Abstract: Background: Water supply and air-conditioner cooling towers (ACCT) are known potential
sources of Legionella pneumophila. Traditional typing methods have low resolution and may
not allow reliable identification of transmissions. The advent of whole genome sequencing
(WGS) allows high-resolution analysis, and the study of complexity within environmental
compartments. Materials/methods: In summer 2017, the health administration of the City of
Basel detected an increase of Legionella pneumophila infections compared to previous
months. An epidemiological and WGS-based microbiological investigation was performed,
involving isolates from the local water supply and two ACCTs (n=60), clinical outbreak and
non-outbreak related isolates from 2017 (n=8) and those collected between 2003-2016
(n=26). Finally, we also compared the sequenced strains to already published bacterial
genomes from 17 countries (n=539). Results: Phylogenetic analysis of the ACCT isolates
showed clustering into two groups separated by a few hundred allelic differences. Several
strains were found in both ACCTs. Furthermore, we found that isolates from the two ACCTs
were highly related to three clinical isolates from 2017. Five clinical isolates from the last
decade were also found to be closely related to the recent isolates from ACCTs. Finally, we found
several clinical isolates to be related to published genomes. Conclusions: Current outbreak-
related and historic isolates were linked to ACCTs. ACCTs form a complex environmental
habitat in which strains are conserved over years and are exchanged between locations. WGS-
based typing allows exploration of this complex network, which might have public health
implications on the tracing of potential sources and the interpretation of environmental
findings.
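The "allelic differences" used above to separate and link isolates are pairwise counts of loci at which two isolates carry different alleles. The Python sketch below shows that calculation, assuming a gene-by-gene (allele-based) typing scheme, which the abstract implies but does not name; the isolate names, loci and allele numbers are invented, and real schemes compare hundreds to thousands of loci.

```python
from itertools import combinations

# Hypothetical allele profiles: locus -> allele number (None = locus not called).
profiles = {
    "cooling_tower_1": {"locus1": 1, "locus2": 4, "locus3": 2, "locus4": 7},
    "cooling_tower_2": {"locus1": 1, "locus2": 4, "locus3": 2, "locus4": 9},
    "clinical_2017": {"locus1": 1, "locus2": 4, "locus3": None, "locus4": 7},
    "unrelated": {"locus1": 3, "locus2": 5, "locus3": 6, "locus4": 8},
}


def allele_distance(a, b):
    """Count loci with different alleles, ignoring loci missing in either profile."""
    shared = [
        locus for locus in a
        if locus in b and a[locus] is not None and b[locus] is not None
    ]
    return sum(1 for locus in shared if a[locus] != b[locus])


# Pairwise distances between all isolates, keyed by (name, name).
distances = {
    (x, y): allele_distance(profiles[x], profiles[y])
    for x, y in combinations(profiles, 2)
}

for pair, d in distances.items():
    print(pair, d)
```

Isolates separated by few or no allelic differences would be candidates for a shared source, while large distances (such as the "few hundred" between the two ACCT groups) argue against recent common origin; any threshold is a study-specific choice.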
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 134
Title: Transmission Analysis and Modelling Epidemiology of Tuberculosis (TAME-TB) Using
Population Based Whole Genome Sequencing in Slums and Slum Rehabilitation Dwellings in
Mumbai, India
Author Block: A. Chatterjee;
Indian Institute of Technology Bombay, Mumbai, Maharashtra, INDIA.
Abstract: A recent study using whole genome sequencing (WGS) demonstrated the presence of a clonal
outbreak of multidrug resistant (MDR) TB in Mumbai. Thus transmission of Mtb in the
community appears to be one of the most significant contributors to the current
epidemic-like situation in Mumbai and elsewhere, and emerges as a key intervention point for the
public health system. While nosocomial transmission and transmission in public spaces have
been identified for intervention, household transmission is overlooked. One of the reasons
for this neglect is limited data. Previous inability to ascertain the path of transmission
has been recently overcome by the use of whole genome sequencing (WGS) which has
successfully traced several outbreaks of TB. Here we determine the transmission of Mtb in
low socio-economic households, in a defined slum cluster and an adjacent slum rehabilitation
(SRA) cluster in Mumbai. In these low income settlements the air quality has been found to
be poor, which can significantly increase the risk of airborne infection. Additionally the
spatial adjacency of these slum settlements poses a potential risk of increased vulnerability.
Using WGS of Mtb isolated from TB patients in the two locations, the study demonstrates the
proportion of TB caused by household transmission. Using phylogenomics, Bayesian
estimation of risk of infection and GIS mapping, the study will conclusively trace the
transmission chains in the locale. By overlaying this information with modelling of the
household built environment, the study proposes to understand the potential contribution of
such layouts and the effect of spatial autocorrelation to the increased transmission. The study
will devise a novel public health tool for real-time monitoring and mapping of TB
transmission. Additionally, this study will contribute towards understanding the relationship
between TB transmission and slum-household clustering through a spatio-temporal analysis
route. This spatial analysis route adds to the novelty of this study.
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 135
Title: Illumina Based Analysis and Prediction of a Suitable Metagenomic Diversity of Endophytic
Bacteria to Improve Rice Crop Productivity
Author Block: A. Krishnamoorthy, M. K. Maiti;
Indian Institute of Technology Kharagpur, Kharagpur, INDIA.
Abstract: Background: To combat the declining supply of natural resources and to address the need of
feeding a steadily growing population, implementation of sustainable agricultural practices is
mandatory. For enhancing crop productivity, exploring the interactions between plant and
microbial populations could prove an effective strategy. Endophytes are microorganisms
that reside and colonize within the host plant tissues and are reported to aid in plant growth
promotion (PGP). Through the present study, we would like to address the following key
questions in the domain of rice endophyte research through Next Generation Sequencing
(NGS): (i) What are the core and temporal compositional profiles of endophytic bacterial
diversity in the different host tissues of rice plant? (ii) How do they affect the fitness of the
host plants? (iii) How do plants respond to inoculation with the endophytic bacterial strains
upon seed priming in relation to PGP characteristics? Methods: In the present study, using
Illumina sequencing of the V3-V4 hypervariable region of the 16S rRNA gene and
metagenomic library analysis, we have identified the endophytic bacterial population present
in the root and shoot tissues of the selected aromatic rice cultivar Kalonunia grown in
vitro aseptically. Further, we have isolated endophytic bacteria through the culturable
approach and investigated the PGP effect rendered by them on the host plant rice. Results: A
total of 40,600 reads were obtained from the two 16S rRNA gene samples sent for Illumina
sequencing after removal of reads corresponding to Cyanobacteria and other chimeras. In our
study, regardless of the plant tissue, members of phylum Firmicutes were the most abundant,
followed by those of Proteobacteria, Bacteroidetes, Actinobacteria,
Spirochaetes and Tenericutes. Analysis of the sequences revealed the presence of a core
bacterial endomicrobiome comprising mainly Acinetobacter sp., Heliorestis sp.
and Thiomonas sp. in the root and shoot metagenomic samples. The root metagenome of
rice, however, showed more abundance and diversity of bacterial endophytes than the shoot
sample, including species of Pseudomonas and Stenotrophomonas. All the bacterial
endophytes identified are previously reported to exhibit PGP in host plants through several
mechanisms, such as phytohormone production, nitrogen fixation, phosphate solubilization,
siderophore production, etc. Through the culturable approach, different species
of Pseudomonas and Methylobacterium were isolated from the callus cultures of Kalonunia
rice cultivar which are reported to have good PGP abilities. Conclusion: Our findings indicate
that a typical metagenomic diversity of endophytic bacteria could be predicted through NGS
for the development of suitable bacterial consortia using selected endophytic isolates as
bioinoculants to improve rice crop productivity.
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 136
Title: Molecular Evolution of HSV-1 and -2 Clinical Specimens During Storage, Transport and Single-
passage Culture
Author Block: P. Roychoudhury, A. Greninger, H. Xie, M. Huang, S. Selke, A. Wald, C. Johnston, K. Jerome;
University of Washington, Seattle, WA.
Abstract: The herpes simplex viruses HSV-1 and -2 are ubiquitous human pathogens responsible for a
large burden of disease worldwide, manifesting as oral and genital ulcers, neonatal disease,
encephalitis and keratitis. To date, most HSV genomics has been performed on culture
isolates, raising concerns that these genomes may not accurately represent the clinical
specimens from which they were derived. We have developed and validated an approach
that combines a DNA oligonucleotide hybridization panel with a bioinformatic pipeline that
allows the recovery of near-complete HSV genomes directly from clinical specimens with
Illumina sequencing. Our computational pipeline performs rapid assembly and annotation of
whole viral genomes starting from raw reads, allowing the recovery of near-full-length
genomes from specimens with as low as 10² HSV copies/ml and 100,000 reads. We applied
this approach to a set of HSV-1 clinical swabs and paired single-passage culture isolates and
saw limited sequence evolution with 14 out of 17 specimens being completely identical in the
UL-US regions. With a separate set of clinical samples, we compared HSV-2 sequences from
swab-derived specimens sequenced after different methods of storage (swab in viral
transport media or PCR buffer, single passage culture, extracted DNA) and again saw minimal
sequence evolution across different specimen types. Together, these results show that low-
passage clinical isolates are reflective of the viral sequences present in the lesion and can be
used for phylogenetic analyses. We have also used this method to detect superinfection by
unrelated HSV strains in single and temporal samples, illustrating the power of direct-from-
specimen sequencing of HSV.
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 137
Title: Population Structure, Pilus Island Distribution, and Antimicrobial Resistance Genes of Group
B Streptococcus Isolated from Infants with Invasive Disease
Author Block: S. T. Lukhele1, G. Kwatra1, A. Ismail2, M. Ali2, C. Cutland1, Z. Dangor1, S. Madhi1;
1University of the Witwatersrand, Johannesburg, SOUTH AFRICA, 2National Institute of
Communicable Diseases, Johannesburg, SOUTH AFRICA.
Abstract: Background: Group B Streptococcus is a leading cause of neonatal invasive disease; however,
there is limited information on the invasive disease genotypes from Africa. This study aimed
to investigate genotype diversity and antimicrobial resistance genes among
invasive GBS isolates collected over a 12-year period in Johannesburg, South
Africa. Methods: Whole genome sequencing was performed using an Illumina bio-sequencer and
the Nextera DNA kit. Whole genome multi-locus typing was used to determine the genetic
diversity of Group B Streptococcus isolates. Resistance genes and pilus islands
were identified using PubMLST. Results: Among 293 isolates, 17 genotypes were found, with
ST17 (36.51%) and ST23 (19.79%) being the dominant genotypes, followed by ST109 (16.3%),
ST1 (4.09%), ST28 (4.09%) and ST10 (3.41%). The invasive disease isolates were mostly
associated (90%) with cps-Y, -L, and -F capsular biosynthesis genes. Pilus islands (PI) identified
included PI-2b (24.5%), PI-2a (26.2%), and PI-1 (28.6%) and in combination PI-1+2a (9.5%) and
PI-1+2b (27.9%). Nearly 92.8% of invasive isolates had a Tet-M gene and 5.11% had an erm gene
in their genome. Conclusion: The dominant sequence types of invasive disease isolates were
ST17 and ST23. A tetracycline resistance gene and PI-1 were observed among
the most dominant genotypes of Group B Streptococcus.
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 138
Title: Elucidation of Major Environmental Factors That Govern the Microbial Community Structure
and Function in Acid Mine Drainage of Malanjkhand Copper Project, India
Author Block: A. Gupta, A. Dutta, J. Sarkar, M. K. Panigrahi, P. Sar;
Indian Institute of Technology Kharagpur, Kharagpur, INDIA.
Abstract: Background: Biological oxidation of sulfidic ores in mining regions generates highly
acidic mine drainage, which is a threat to ecosystems because it contains high concentrations of
heavy metals and sulfate and is therefore considered an extreme environment for life. The
present study was designed to understand the role of major environmental factors (pH, SO₄²⁻,
and DOC) in the assembly of microbial community structure and function. Methods: Acid mine
drainage (AMD; water and sediment) samples collected from the Malanjkhand copper project, India,
were used in the present study. The geochemistry of the samples was
thoroughly assessed, followed by 16S rRNA gene-based targeted sequencing and a shotgun
metagenomic approach to characterize microbial diversity and function. Statistical
analysis was used to assess the role of environmental factors in shaping the
microbial community composition. Results: The samples were distinct in their
geochemical parameters and were partitioned into two pH regimes [low (1.9 < pH < 4.0) and
high (4.0 < pH < 6.0)]. The low pH samples contained higher sulfate and heavy metal
concentrations than the high pH samples. The microbial communities of the low pH samples
were dominated by highly acidophilic, Fe/S-oxidizing taxa responsible for AMD generation,
whereas the high pH samples consisted of moderately acidophilic/neutrophilic microbial
groups involved in diverse biogeochemical cycling, including groups that could serve
as potent agents of AMD attenuation. Canonical correspondence analysis established
the role of pH, Fe, SO₄²⁻, DOC, etc. in shaping microbial community
composition. Spearman correlation of the OTUs with pH, Fe, and SO₄²⁻ revealed that highly acidophilic
and Fe/S-oxidizing groups were positively correlated with these parameters. To
understand the metabolic potential of the microbial community, one sample from each pH
regime was subjected to shotgun metagenomic sequencing, which revealed
genes involved in diverse biogeochemical cycling in both samples. Genes
involved in S oxidation and pH stress were highly abundant in the low pH sample,
whereas sulfate-reducing genes were more abundant in the high pH sample. Genes involved
in C fixation, nitrogen metabolism, and heavy metal stress were detected in both samples,
supporting the activity of these organisms under low organic carbon and heavy metal
stress. Conclusion: The present study provides deeper insight into the role of environmental
variables in shaping the microbial community structure and function in acid mine drainage.
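The Spearman correlation step mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the pH values and OTU abundances below are hypothetical, and the function names are chosen only for illustration. It ranks both variables (averaging ranks for ties) and computes the Pearson correlation of the ranks.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Hypothetical data: sample pH and relative abundance of one OTU.
ph = [2.1, 2.8, 3.5, 4.6, 5.2, 5.9]
otu_abundance = [0.41, 0.35, 0.22, 0.10, 0.04, 0.01]
rho = spearman(ph, otu_abundance)
print(f"Spearman rho = {rho:.3f}")
```

In this made-up example the OTU declines monotonically as pH rises, so rho is close to -1. A real analysis would repeat the calculation for each OTU and each geochemical parameter and would also assess significance.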
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 139
Title: Rapidly Accumulated Tobramycin Resistance by Pseudomonas aeruginosa in CF-like Acidic pH
Environment
Author Block: Q. Lin, Y. Di;
University of Pittsburgh, Pittsburgh, PA.
Abstract: Background: Cystic fibrosis (CF) is a genetic disease in which loss of cystic fibrosis
transmembrane conductance regulator (CFTR) function leads to impaired airway host
defense. Chronic infection and colonization by gram-negative Pseudomonas aeruginosa (P.
aeruginosa), an opportunistic pathogen, contribute to high mortality rates in CF. As the
airway surface liquid of CF patients becomes more acidic with age, the prevalence of P.
aeruginosa lung infection also gradually increases in CF patients from age 2 to 45,
and P. aeruginosa eventually becomes the dominant bacterial species colonizing the lungs of
CF sufferers. We previously demonstrated that the acidic CF lung microenvironment
promotes P. aeruginosa biofilm formation and multi-drug resistance. However, the effects of the acidic
CF lung microenvironment on tobramycin treatment-associated antibiotic resistance (AR)
remain unknown. In this study, we hypothesized that the acidic microenvironment promotes
faster and stronger tobramycin resistance compared with the physiologically neutral pH of the non-CF lung
microenvironment. Methods: Planktonic and bead-transfer biofilm models were used for P.
aeruginosa PA14 evolution study in pH 6.5 and 7.5 with or without tobramycin treatment.
Bacterial whole genome sequence data were acquired by Next-Generation Sequencing (NGS)
technology. Results: Our results indicated that PA14 exhibited a rapid morphological change
under acidic pH conditions. Acidic environment also stimulated faster and stronger PA14
tobramycin resistance compared to neutral pH conditions. NGS results showed that acidic
environments elicited several DNA mutations that were likely pH-dependent. Conclusions:
Our results indicated that PA14 developed AR quickly under tobramycin treatment and that the
acidic lung microenvironment promoted even faster tobramycin resistance in the biofilm
mode of growth. The pH-dependent DNA mutations are potential targets for future treatment
in CF patients to effectively eliminate P. aeruginosa infection.
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 140
Title: Genomic Characterization and Phylogenetic Analysis of Salmonella Javiana Clinical Strains
from Tennessee, 2017-2018
Author Block: L. K. Hudson1, C. Moore2, L. Constantine-Renna3, X. Qian2, L. S. Thomas2, K. N. Garman3, J. R.
Dunn3, T. G. Denes1;
1Department of Food Science, University of Tennessee, Knoxville, TN, 2Tennessee Department
of Health, Division of Laboratory Services, Nashville, TN, 3Tennessee Department of Health,
Nashville, TN.
Abstract: Background: Salmonella Javiana is the fourth most common serovar of Salmonella found to
cause illnesses in Tennessee (TN), but is geographically clustered in the western region.
Almost two-thirds (63%) of S. Javiana clinical isolates from January 2017 through June 2018
were from counties in west TN. The objectives of this study were to retrospectively examine
the genomic population structure of Javiana isolates from patients in TN in 2017-18 and
describe epidemiological features among clades of case-patients
identified. Methods: Biosample numbers and metadata for S. Javiana (n=61) clinical isolates
from TN collected January 2017 to June 2018 were provided by the Tennessee Department of
Health. Raw reads were downloaded from the NCBI SRA database, trimmed using
Trimmomatic, and quality checked using FastQC. An appropriate reference assembly was
chosen (BioSample SAMN01832085) and downloaded from the NCBI RefSeq database. High-quality SNPs (hqSNPs)
were identified using the CFSAN SNP pipeline and the resulting matrix was used to construct a
neighbor-joining tree with MEGA7. Additionally, trimmed reads were assembled using SPAdes
and contigs annotated with Prokka. Assembly statistics were generated by BBMap, SAMtools,
and QUAST. SeqSero was used to confirm serotype designations. Results: Two distinct major
clades of interest were identified (clades 1 and 4), each with geographical clustering, along
with 2 minor clades. Major clades were defined as containing five or more isolates and minor
clades as containing fewer than five. Clades may represent or contain epidemiological
clusters. Clade 1 consisted of 23 isolates, with almost all (n=21) being isolated from patients in
the western region of TN and the majority isolated in a single county (Shelby; n=13). Isolates
from Clade 1 were collected over a period of approximately nine months. Clade 4 consisted of
20 isolates, mostly isolated from counties in west TN (n=13), that were collected over a period
of approximately one year. There is a notable subclade of seven isolates within clade 4
(subclade 4A), all from four rural counties in one geographic location (Carroll, Gibson,
Crockett, and Madison counties). These were collected over a period of about five months
and the majority of isolates from this subclade were collected from patients that were male
(86%) and adults (86%). Conclusions: The clustering patterns of S. Javiana isolates with a large
portion originating in western TN, together with the timeline of the isolation dates and SNP
differences, may indicate that many of these are environmentally acquired. Further
investigation of epidemiological data and possible environmental sources may identify the
source of illness and possible preventive strategies. In addition, information gained about the
population structure of this serovar provides guidance for selecting SNP distance thresholds
used to identify clusters that may be of epidemiological significance.
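The idea of applying a SNP distance threshold to identify candidate clusters can be sketched in Python as follows. This is an illustrative sketch only, not the CFSAN SNP pipeline or the authors' workflow: the isolate names, sequences, and the threshold of 1 SNP are hypothetical, and single-linkage grouping is an assumed clustering rule.

```python
from itertools import combinations


def snp_distance(a, b):
    """Count aligned positions that differ, ignoring ambiguous bases and gaps."""
    return sum(1 for x, y in zip(a, b)
               if x != y and x in "ACGT" and y in "ACGT")


def single_linkage_clusters(seqs, threshold):
    """Group isolates linked by chains of pairwise distances <= threshold."""
    parent = {name: name for name in seqs}

    def find(n):
        # Follow parent links to the cluster root, shortening the path as we go.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in combinations(seqs, 2):
        if snp_distance(seqs[a], seqs[b]) <= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for name in seqs:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical SNP alignment (one string of variant sites per isolate).
alignment = {
    "iso_A": "ACGTACGTAC",
    "iso_B": "ACGTACGTAT",  # 1 SNP from iso_A
    "iso_C": "ACGAACGTAT",  # 1 SNP from iso_B, 2 SNPs from iso_A
    "iso_D": "TTGAACCTGT",  # distant from the others
}
clusters = single_linkage_clusters(alignment, threshold=1)
print(clusters)
```

With single linkage, iso_A and iso_C fall into the same cluster even though they differ by 2 SNPs, because each is within 1 SNP of iso_B. This chaining effect is one reason the choice of threshold depends on the population structure of the serovar.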
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 142
Title: Identifying Metabolically Active Bacteria in Tobacco Products with DNA Labeling and
Next-generation Sequencing
Author Block: S. Chattopadhyay1, L. Malayil1, E. Mongodin2, A. Sapkota1;
1University of Maryland, College Park, MD, 2University of Maryland, Baltimore, MD.
Abstract: The advent of the Family Smoking Prevention and Tobacco Control Act, implemented by the
U.S. Food and Drug Administration, has resulted in the need to improve our understanding of
the microbial constituents of tobacco products. 16S rRNA gene sequencing techniques have
enabled us to gain insights into identifying non-culturable bacteria present in tobacco
products. These sequencing techniques generate massive amounts of unbiased data but are
unable to determine what proportion of identified bacterial communities are live and active.
To bridge this knowledge gap, our study aimed to identify and quantify the metabolically
active bacterial communities in commercially available tobacco products. We tested 14
tobacco products: 4 brands of cigarettes, 4 brands of little cigars and 6 brands of hookah,
each with three distinct flavors. For each product, 0.2 g of tobacco was treated with either i)
propidium monoazide (PMA), allowing detection of viable bacteria by inactivating DNA that is
not contained within an intact cell membrane, or ii) 5-bromo-2’-deoxyuridine (BrdU),
allowing detection of proliferating cells, or left untreated (control samples), after which total
genomic DNA was extracted. BrdU samples were immunocaptured and all samples
underwent PCR of the 16S rRNA gene followed by sequencing on Illumina HiSeq. Downstream
analyses were performed using QIIME and R. Overall, 1,242 species-level operational
taxonomic units (OTUs) were identified from more than 11 million sequences across 88
samples. Alpha diversity analysis (Observed and Shannon indices) showed significant
(p<0.005) differences between bacterial communities among BrdU-treated, PMA-treated and
control samples across all tobacco products. In addition, flavoring of tobacco products also
showed significant effects on bacterial community composition. Beta diversity analysis
comparing products using Bray-Curtis dissimilarity also identified significant differences
between BrdU- and PMA-treated samples (ANOSIM R= 24.4%, p<0.001). Actinobacteria,
Bacteroidetes, Firmicutes, and Proteobacteria were the top phyla identified across all
products. Our data confirm that tobacco bacterial communities are diverse and differ across
brands and products. This study is the first to characterize the presence of a metabolically
active fraction of bacterial communities residing within these products, which are affected by
tobacco flavors.
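The two diversity measures named in this abstract, the Shannon index (alpha diversity) and Bray-Curtis dissimilarity (beta diversity), can be illustrated with a short Python sketch. The OTU counts are hypothetical and the code is not the authors' QIIME/R analysis; it only shows the standard formulas.

```python
from math import log


def shannon_index(counts):
    """Shannon diversity H = -sum(p * ln p) over OTUs with non-zero counts."""
    total = sum(counts)
    return -sum((c / total) * log(c / total) for c in counts if c > 0)


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two count vectors of equal length."""
    numerator = sum(abs(a - b) for a, b in zip(u, v))
    denominator = sum(u) + sum(v)
    return numerator / denominator


# Hypothetical OTU counts for one product under three treatments.
control = [40, 30, 20, 10]
pma_treated = [70, 20, 10, 0]
brdu_treated = [25, 25, 25, 25]

for label, sample in [("control", control),
                      ("PMA", pma_treated),
                      ("BrdU", brdu_treated)]:
    print(f"{label}: Shannon H = {shannon_index(sample):.3f}")

print(f"Bray-Curtis(control, PMA) = {bray_curtis(control, pma_treated):.3f}")
```

A perfectly even community of four OTUs gives H = ln 4, the maximum for four OTUs, and two identical samples give a Bray-Curtis dissimilarity of 0.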
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 143
Title: WGS Analysis of O157 and non-O157 Shiga Toxin-producing Escherichia coli Clinical Isolates
from Michigan, 2010-2014
Author Block: H. Blankenship, S. D. Manning;
Michigan State University, East Lansing, MI.
Abstract: Background. Shiga toxin-producing Escherichia coli (STEC) is a leading foodborne pathogen
with a diverse genetic background that contributes to variation in disease presentation and
severity. The use of WGS allows more comprehensive genomic profiles to be rapidly and
easily obtained for comparing isolates and for identifying genetic factors
associated with disease outcomes. Methods. STEC isolates were obtained from 2010-2014 as
part of an active surveillance system, which included four hospitals in Michigan. Wizard
Genomic DNA purification and Illumina Nextera XT kits were used, followed by sequencing
on the Illumina MiSeq platform. De novo genome assembly was performed with SPAdes
following trimming and quality checking with Trimmomatic and FastQC. Results. WGS data are
available for 477 STEC isolates (33 O157 and 444 non-O157) from Michigan. A workflow was
developed with bioinformatic scripts to extract the molecular serotype, virulence and
resistance gene profiles, and multilocus sequence type (ST) for each strain. SNPs specific for
one of nine clades were extracted from the 33 O157 strains as were CRISPR spacer regions to
demonstrate a high level of diversity with multiple unique gene profiles. 45 non-O157
serogroups were identified and the isolates were grouped into 54 STs with 4 new STs
identified. The O157 strains grouped into 5 clades, with the majority (n=14) belonging to
clade 8. Conclusion. Use of WGS to characterize 477 STEC isolates has demonstrated that
strains recovered from patients in Michigan are diverse and that specific gene profiles can be
associated with epidemiological data. Comparative genomic analyses of STEC and other
foodborne pathogens are important to identify key profiles that are most important for
severe infections, and to validate existing subtyping methods.
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 144
Title: Genome Sequences of Bacillus sporothermodurans Strains Isolated from Ultra High
Temperature (UHT) Milk
Author Block: R. Owusu-Darko1, M. Allam2, S. D. Oliveira3, C. A. Ferreira3, S. Grover4, S. Mtshali2, A. Ismail2,
E. M. Buys1;
1University of Pretoria, Pretoria, SOUTH AFRICA, 2National Institute for Communicable
Diseases, Johannesburg, SOUTH AFRICA, 3School of Sciences, Pontifícia Universidade Católica
do Rio Grande do Sul, Porto Alegre, BRAZIL, 4Dairy Microbiology Division, Molecular Biology
Unit, National Dairy Research Institute, Karnal, INDIA.
Abstract: Bacillus sporothermodurans, first isolated in ultra-high temperature (UHT) milk, is a
thermo-resistant, Gram-positive bacterium that can produce highly heat resistant endospores (HRS)
that may survive UHT heat treatments. We sequenced four genomes of B.
sporothermodurans, including, for the first time, both heat resistant and non-heat resistant
strains. The genomes range in size from 3.4 Mb to 3.9 Mb, with an average G + C content
of 36% and between 3768 and 4558 coding sequences. Our research also
shows that both heat resistant and non-heat resistant strains have a similar complement of heat
resistance genes, including the hrcA-dnaK-dnaJ-grpE operon, and of biofilm formation genes (TasA and
homologs). The whole genome sequences of three of the four sequenced B.
sporothermodurans strains contain the Listeria sp. pathogenicity island LIPI-1, presumably
obtained through horizontal gene transfer. Evolutionary trends of B.
sporothermodurans suggest a common ancestor originating from the gut of insects or
arachnids, like its closest phylogenetic neighbor, Bacillus oleronius. The draft genomes, sequenced
on the Illumina MiSeq system, will enhance our understanding of the genes and pathways
responsible for heat resistance and biofilm formation, which are of prime importance to the
dairy industry. They will also support ongoing pangenome studies and analysis of the
evolutionary relationships with other Bacillus species of concern to the food industry. Planned
PacBio sequencing will allow us to fill the gaps left in the genomes
by the MiSeq platform.
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 145
Title: Dissection of the Mobilome of Carbapenem-Resistant Klebsiella pneumoniae (CR-Kpn) Using
Short and Long Read Assemblies: A Prospective Study in Houston, TX
Author Block: W. C. Shropshire, A. Q. Dinh, H. Ecklund, A. Wanger, W. Miller, D. Panesso, T. T. Tran, C. A.
Arias, B. M. Hanson;
UTHealth, Houston, TX.
Abstract: Background: Klebsiella pneumoniae (Kpn) is a gram-negative pathogen that is responsible for
nosocomial infections leading to significant morbidity and mortality worldwide. Mobile
genetic elements (MGEs; e.g. plasmids and transposons) are of crucial importance for these
organisms to adapt and evolve. Exchange of accessory genes that confer resistance to
antimicrobials is particularly important in clinical settings. The application of short- and
long-read sequencing platforms with next generation sequencing (NGS) bioinformatic tools
permits high resolution of these complex resistance elements, which otherwise prove difficult
to resolve using one sequencing platform alone. Here, we describe the genomic profiles of 95 CR-Kpn isolates
belonging to clonal group (CG) 258 and non-CG258 collected in hospitals across Houston, TX
from May to December 2017. Methods: Libraries were prepared with Nextera XT DNA Library
Prep Kit (Illumina) and Rapid Sequencing Kit (SQK-RBK004, Oxford Nanopore Technologies,
ONT). Sequencing platforms used were the MiSeq and HiSeq 4000 (Illumina) and GridION X5
(ONT). A custom pipeline was developed for high-throughput data QC, processing, assembly,
and analysis of Illumina data. Oxford Nanopore sequencing data were assembled using Canu
v1.7.1 and polished with Illumina sequencing data using Pilon 1.22. Chromosomes and
plasmids were circularized using Circlator 1.5.5. Results: Phylogenetic and multi-locus
sequence typing (MLST) analysis on 95 Kpn samples revealed two predominant sequence
types (STs) with 38/95 (40%) ST258 and 35/95 (37%) ST307. Short-read alignment analysis
indicated that all ST307s had the extended spectrum beta lactamase (ESBL) CTX-M-15 gene
whereas it was only present in one ST258 isolate. Two non-ST258/ST307 isolates carried the
genes encoding NDM-1 carbapenemase. Two representative isolates from each dominant
clade and four non-ST258/ST307 isolates were sequenced using the GridION X5 platform with
their plasmid and MGE structures closed and resolved, respectively. For example, we were
able to resolve transposon Tn4401a linked with blaKPC-3 carriage within an ST307 isolate on a
single ONT read. This allowed us to identify multiple isoforms and SNPs of genes initially
identified through the ABRicate tool (using the CARD and PlasmidFinder databases) and to gain greater detail of
the MGE structures that carried them. Conclusions: Initial phylogenetic analysis revealed two
clades, ST258 and ST307, which appear to dominate the multifocal prevalence within our
Houston hospital setting. The application of two NGS sequencing platforms along with our
custom bioinformatics pipeline allowed a complete elucidation of the MGE structures of
interest as well as the resistance determinants that these MGEs carried.
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 146
Title: Hybrid Sequencing and Assembly Reveals Genomic Diversity of
Methicillin-susceptible Staphylococcus aureus (MSSA) from a Neonatal Intensive Care Unit (NICU)
Surveillance Effort
Author Block: M. K. Annavajhala, W. Geng, A. C. Hill-Ricciuti, S. Ferguson, S. L. Stump, M. J. Giddins, M.
Messina, P. Zachariah, D. A. Green, S. Whittier, L. Saiman, A. Uhlemann;
Columbia University Medical Center, New York, NY.
Abstract: Background: MSSA is a more prevalent NICU pathogen than methicillin-resistant S.
aureus (MRSA), yet optimal MSSA infection prevention and control strategies are unclear.
Given the ubiquity of MSSA as a human colonizer, neonatal acquisition likely occurs through
multiple routes during delivery and close contact with parents and healthcare providers.
However, the introduction and potential local spread of MSSA and the role of systematic
decolonization for infants colonized with MSSA remain incompletely understood. Here, we
used short- and long-read whole-genome sequencing (WGS) to define the diversity of MSSA
during an ongoing NICU surveillance effort and aimed to identify genomic features
potentiating the spread of prevalent MSSA clones. Methods: Infants hospitalized in a 75-bed
university-affiliated level III-IV NICU over an 18-month period were screened twice monthly
for MSSA-positive clinical and/or pooled four-site surveillance cultures. We spa typed isolates
using PCR and sequenced the most prevalent spa types using Illumina WGS (n=107). We used
SRST2 for in silico multilocus sequence typing and antibiotic resistance gene and plasmid
replicon typing. Oxford Nanopore sequencing and hybrid assembly for each MLST type were
used to create optimal reference genomes. We included previously published data in
phylogenetic analyses of core genome SNPs to identify the evolutionary history of major
clones. Results: We collected 466 MSSA isolates from 297 infants. MSSA spa types identified
(80 in total) included t279 (n=86), t1451 (n=21), and t571 (n=10). Of note, t1451 and t571
belong to ST398, a common clindamycin-resistant MSSA in our local community. In contrast,
t279 (CC15) has not been encountered in community surveillance efforts yet increased in the
NICU during the study period. We used nanopore sequencing of the oldest 2016 CC15 isolate
to generate a reference genome. Compared to publicly available genomes, our reference
greatly reduced pairwise SNP distances and allowed for more accurate phylogenetic
inferences. ST398 NICU isolates formed three clusters with closely related community
isolates, suggesting community members and NICU reservoirs as sources of acquisition in
neonates. CC15 comprised two clades of closely related isolates (< 100 SNPs) distinct from
known community MSSA, pointing to clonal expansion within the NICU. Almost all CC15 also
harbored mupA-encoding plasmids, including the reference isolate, indicating potential
proliferation due to decolonization efforts with mupirocin. Conclusions: MSSA in our NICU
exhibited substantial genetic heterogeneity. Comparative genomics indicate genotype-
specific pathways of introduction and spread of MSSA, including potential community-
(ST398) or healthcare- (CC15) associated sources. Antibiotic resistance may play an important
role in dissemination of CC15. Future surveillance efforts could benefit from routine
genotyping.
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 147
Title: Microfluidic NGS Sample Preparation for High-Throughput Epidemiology
Author Block: S. Kim, G. Lagoudas, P. Blainey;
The Broad Institute of MIT and Harvard, Cambridge, MA.
Abstract: While low-cost DNA sequencing is transforming biological research and discovery, preparing
large sample sets for sequencing with minuscule starting material is now the limiting factor in
many applications. Here we introduce a polydimethylsiloxane (PDMS) microfluidic device that
automates the key steps in whole genome NGS sample preparation, integrating
lysis, fragmentation, adapter tagging, purification, and size selection of 96 samples in parallel.
We applied our device to process about 5000 whole genome sequencing libraries
of Pseudomonas aeruginosa clinical isolates, methicillin-resistant Staphylococcus
aureus, Mycobacterium tuberculosis, soil microbes, and human gut microbes using
dramatically reduced sample input and reagent quantities. These microfluidic libraries
showed excellent coverage for variant calling, phylotyping, metabolic profiling, and
metagenomic analysis performance from only 10,000 cells (50 picograms of genomic DNA).
Our method will enable high-throughput processing of samples for shotgun sequencing with
broad application to basic science and clinical medicine.
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 148
Title: PathOGiST: Calibrated Multi-criterion Genomic Analysis for Public Health
Author Block: P. Feijao1, M. Katebi1, E. Lasalle2, H. Yao3, S. La1, M. Nguyen1, C. Chauve1, L. Chindelevich1;
1Simon Fraser University, Vancouver, BC, CANADA, 2ENS Paris-Saclay, Paris, FRANCE, 3École
Polytechnique, Paris, FRANCE.
Abstract: As public health organizations start to rely on whole-genome sequencing (WGS) data for
infectious disease surveillance and outbreak investigations, two main issues emerge from the
use of WGS data for genotyping. First, methods for differentiating outbreak-related strains
from sporadic strains are often based on a single type of genomic variation. This approach
captures only a limited amount of the genomic variability and tells only a partial story of the
organism's evolutionary history. Second, WGS-based sample clustering algorithms are often
not calibrated, meaning that the determination of clustering thresholds or subtyping cutoffs
is still mostly arbitrary. There are many forces driving pathogen evolution and as a result,
using the wrong set of variants or the wrong cutoffs may mislead the investigation of a
pathogen outbreak. We address these issues by developing PathOGiST, which implements and
integrates existing and novel genomic variant calling algorithms from WGS data (SNPs,
multi-locus sequence typing, and copy number variations), together with clustering algorithms
based on a multi-criterion genome dissimilarity measure using various kinds of genomic
variants. Final steps include the calibration of the statistical models and algorithms using
large reference sets of selected pathogen genomes from epidemiologically confirmed
outbreak strains. The PathOGiST pipeline will be implemented both as a standalone tool and as a
Galaxy tool, and will be part of the IRIDA platform, making it available to public health
workers in Canada and around the world.
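One simple way to picture a multi-criterion dissimilarity is a weighted average of normalized distance matrices, one per variant type. The Python sketch below is an assumption made for illustration and is not the PathOGiST algorithm, which the abstract does not specify: the matrices, the equal weights, and the function names are all hypothetical.

```python
def normalize(matrix):
    """Scale a square distance matrix so that its largest entry equals 1."""
    peak = max(max(row) for row in matrix)
    if peak == 0:
        return [row[:] for row in matrix]
    return [[d / peak for d in row] for row in matrix]


def combine(matrices, weights):
    """Weighted average of normalized distance matrices of equal shape."""
    total = sum(weights)
    scaled = [normalize(m) for m in matrices]
    n = len(scaled[0])
    return [[sum(w * m[i][j] for w, m in zip(weights, scaled)) / total
             for j in range(n)]
            for i in range(n)]


# Hypothetical pairwise distances among three isolates.
snp_distances = [[0, 2, 40],
                 [2, 0, 38],
                 [40, 38, 0]]
mlst_allele_differences = [[0, 0, 5],
                           [0, 0, 5],
                           [5, 5, 0]]

combined = combine([snp_distances, mlst_allele_differences], weights=[1.0, 1.0])
for row in combined:
    print([round(d, 3) for d in row])
```

In this made-up example the first two isolates stay close under the combined measure (0.025) while the third stays far from both. Choosing the weights and the clustering cutoff from confirmed outbreak data is the kind of calibration the abstract describes.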
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 149
Title: Workflows for detection in methylomes in ONT and PacBio
Author Block: L. A. Arteaga-Figueroa, V. Villegas-Escobar, J. Correa-Alvarez;
Universidad EAFIT, Medellín, COLOMBIA.
Abstract: Background: Third generation sequencing (TGS) technologies present many advantages in
comparison with next generation sequencing (NGS), mainly because of their capability to produce
long reads, detect base modifications, sequence RNA, and perform better on repeated
regions. Many comparisons of their performance are available, but few compare the
capability to detect DNA modifications, the performance of available software, and the final annotated
methylome. In this study, we provide workflows for the detection of 6mA and 5mC in
Nanopore (ONT) and SMRT sequencing data, and a comprehensive comparison between
them. Methods: Data obtained with ONT and PacBio from a Bacillus subtilis project
(EA-CB0015) were used to design workflows for the base detection analysis (considering most of the
tools available). These workflows intensively compare mappers and modified base detectors.
For 6mA, we implemented mCaller, Tombo, and ipdSummary (kineticsTools); for 5mC, we
implemented mCaller, Tombo, Nanopolish, and ipdSummary. In-house Python scripts were
used for output analysis and graphs. Results: In terms of mapping, for PacBio, so far, only
Blasr was able to output the .bam file necessary for the analysis. For ONT data, we tested Graphmap and
Minimap2; although Graphmap was more accurate than Minimap2 for high-error datasets,
Minimap2 performed better. We are currently testing lordFAST. For 5mC detection in ONT
data, we found that Tombo is the most sensitive, closely followed by Nanopolish. For 6mA,
Tombo was also the most sensitive. ipdSummary often failed to assign an identity to
modified bases, and reported far fewer bases than the ONT software for both 6mA and 5mC.
Additionally, modified bases at the same position (in dimers like TA, GC) but on
different strands were also observed. The misidentification of 4mC remains an issue
throughout the analysis; the later PacBio chemistry does not have appropriate software for
4mC detection. When we compared ONT for 4mC with 5mC ONT detections, some positions
matched. As mentioned above, a large part of the ipdSummary output does not assign an
identity to the modified bases; we are still working on the identification of 8oxoA, 8oxoG,
and other kinds of mC in order to identify possible false positives for 5mC in ONT
results. Conclusions: According to our results, Oxford Nanopore sequencing was more sensitive than PacBio
sequencing for 5mC and 6mA detection.
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 150
Title: Metagenomic MinION and Illumina Sequencing for Surveillance of Cholera and Other
Waterborne Pathogens in Haiti
Author Block: B. Stebbins, S. Hung, C. Martin, T. Ford;
UMass Amherst, Amherst, MA.
Abstract: The recent deadly cholera outbreak after the 2010 earthquake in Haiti prompted the need for
a portable system that will allow for rapid pathogen identification without requiring
expensive laboratory resources. To address the need, we are testing the field suitability of the
hand-held sequencer called the MinION (Oxford Nanopore Technologies) for the
metagenomic assessment and detection of waterborne pathogens such as Vibrio cholerae. In
the initial testing phase, we collected and tested water samples for coliforms from the Mill
and Fort Rivers in Amherst and Hadley, Massachusetts. DNA was then isolated, spiked with V.
cholerae DNA, and prepared via the DNA ligation method for sequencing. Initial MinION
sequencing followed by bioinformatic analysis of the data using the WIMP, OneCodex, and
CosmosID (Rockville, MD) platforms detected the V. cholerae spike-in but not the coliforms unless
the samples were enriched for these species. Further optimization of the portable system is
being undertaken to allow for future field metagenomic applicability. We will also be using
Illumina MiSeq sequencing for comparison of read accuracy.
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 151
Title: HAIviz: Visualizing Genomic Epidemiology Data of Healthcare Associated Infections
Author Block: B. Permana1, B. M. Forde1, L. Roberts1, P. Harris2, S. A. Beatson1;
1University of Queensland, St Lucia, AUSTRALIA, 2UQ Centre for Clinical Research, Brisbane
City, AUSTRALIA.
Abstract: Visualization is an essential aid for communicating the genomic epidemiology of infectious
disease, which often involves complex epidemiological and genomic information. Over the past
few years, several tools such as Microreact, PhyloGeoTool, and Nextstrain have been
developed and have demonstrated the benefit of data integration and interactive visualization.
However, these tools mainly focus on global epidemiology and phylogeography. Applications
that feature specific data related to Healthcare Associated Infection (HAI) such as the
patient's bed movements, hospital room layout, and infection transmission network remain
unavailable.
Here we present HAIviz, an interactive single web page application for visualizing genomic
epidemiology of HAIs. It was developed using popular web technologies to allow infection
control professionals, epidemiologists, clinicians, and public health decision-makers to explore
potential insights from Whole Genome Sequencing (WGS) data and epidemiological
information. HAIviz allows users to display a detailed hospital map, integrated with patient
metadata, phylogenetic trees and transmission networks. Users can create single or multiple
visualization windows by uploading their own dataset in the required format: metadata and
transmission files in Comma Separated Value (CSV), maps in GeoJSON, and trees in Newick.
Each generated window is independently arranged, giving users the freedom to display their
preferred information.
HAIviz is freely available at the URL http://haiviz.beatsonlab.com and can be accessed using
any modern browser. As a client-side application, HAIviz performs all computation on
the user's machine with no information posted to the server, making it inherently private,
secure, and scalable. Currently, HAIviz is accessible as a standalone visualization application
that works with the input files created by a separate workflow. In the future, HAIviz aims to
be integrated with WGS-based epidemiology and bioinformatics pipelines,
promoting a real-time HAI surveillance and investigation framework.
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 152
Title: The Role of Region-Specific Based on Single Nucleotide Polymorphisms in Virulence Genes
in Mycobacterium tuberculosis
Author Block: A. Mutshembele, L. Malinga, M. Van Der Walt;
South African Medical Research Council, PRETORIA, SOUTH AFRICA.
Abstract: Background: Countries are experiencing a serious public health threat and a major obstacle to
disease control due to excess antibiotic use and drug-resistant Mycobacterium
tuberculosis. Single nucleotide polymorphisms (SNPs) and groups of virulence genes will be
used for genotyping. SNPs could be the most valid markers because of
their very low level of homoplasy, and they are ideally suited for defining phylogenetic
groupings with very high confidence. Objectives: To analyse approximately 800 M.
tuberculosis whole genome sequencing (WGS) datasets deposited in PATRIC, a web-based comprehensive
information system, and to select drug-resistant and drug-susceptible genomes for
genotyping. To create a database of virulence gene mutation catalogues based on M.
tuberculosis genomes deposited from Brazil, China and South Africa. Research Methods: In
this work, a bioinformatics analysis of virulence genes from 303 M. tuberculosis whole genome
sequences was performed, from Brazil (n=2), China (n=23), India (n=238), Russia
(n=259) and South Africa (n=278), downloaded from the PATRIC database and analysed on CLC
Genomics Workbench 11. Results: A bioinformatics analysis of M. tuberculosis WGS showed
that out of 15 tested genes (mazF3; vapB17; vapC47; higA; vapC37; vapC38; vapC6; mazF8;
vapC3; mce3B; cyp125; vapC25; vapB34; mce3F; vapC46) only vapC3 did not have
mutations. Several genes were found to carry SNPs that correlate with specific genotypes.
Using vapC37 and vapC38 we observed Beijing lineage and its sublineages which are
associated with drug resistance and elevated virulence. The mutations obtained were specific at
the lineage and sublineage level. The vapC3, vapC38 and mazF8 genes were associated with the LAM1,
2, 9 and LAM4/F15/KZN sublineages. Conclusions: The constructed SNP reflected the
evolutionary relationship between lineages. In future we will need to establish a South
African M. tuberculosis catalogue of SNPs in virulence genes specific to the F15/LAM4/KZN
lineage; this will complement the diagnostic pipeline using WGS data for drug resistance
detection and lineage determination. We will determine a list of conserved genes that can be
used for future development of DNA vaccines.
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 153
Title: Rapid and Precise Alignment of Raw Reads Against Redundant Databases with KMA
Author Block: P. T. Clausen, F. M. Aarestrup, O. Lund;
Technical University of Denmark, Lyngby, DENMARK.
Abstract: Background: As the cost of sequencing has declined, clinical diagnostics based on next
generation sequencing (NGS) have become reality. Diagnostics based on sequencing will
require rapid and precise mapping against redundant databases because some of the most
important determinants, such as antimicrobial resistance genes and core genome multilocus
sequence typing (MLST) alleles, are highly similar to one another. In order to facilitate this, a
novel mapping method, KMA (k-mer alignment), was designed. KMA is able to map raw reads
directly against redundant databases, and it scales well to large redundant databases. KMA
uses k-mer seeding to speed up mapping and the Needleman-Wunsch algorithm to accurately
align extensions from k-mer seeds. Multi-mapping reads are resolved using a novel sorting
scheme (ConClave scheme), ensuring an accurate selection of templates. Results: The
functionality of KMA was compared with SRST2, MGmapper, BWA-MEM, Bowtie2, Minimap2
and Salmon, using both simulated data and a dataset of Escherichia coli mapped against
resistance genes and core genome MLST alleles. KMA outperforms current methods with
respect to both accuracy and speed, while using a comparable amount of
memory. Conclusion: With KMA, it was possible to map raw reads directly against redundant
databases with high accuracy, speed and memory efficiency. Availability: KMA is
implemented in C, and is freely available
at: https://bitbucket.org/genomicepidemiology/kma and as web-service
at: https://cge.cbs.dtu.dk/services/KMA/.
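The two ingredients named in the abstract, k-mer seeding and Needleman-Wunsch alignment, can be illustrated with a minimal Python sketch. This is not KMA's implementation: the function names, the scoring values, the toy alleles and the simple "count shared k-mers" seeding are assumptions made for illustration, and the ConClave scheme is not reproduced.

```python
from collections import Counter, defaultdict


def build_kmer_index(templates, k):
    """Map every k-mer to the set of template names that contain it."""
    index = defaultdict(set)
    for name, seq in templates.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(name)
    return index


def seed_votes(read, index, k):
    """Count, per template, how many k-mers of the read occur in it."""
    votes = Counter()
    for i in range(len(read) - k + 1):
        for name in index.get(read[i:i + k], ()):
            votes[name] += 1
    return votes


def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment score by dynamic programming (hypothetical scoring)."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


# Toy "redundant database": two alleles that differ at a single position.
templates = {
    "allele_1": "ACGTACGTGA",
    "allele_2": "ACGTACCTGA",
}
read = "ACGTACGTGA"

index = build_kmer_index(templates, k=4)
votes = seed_votes(read, index, k=4)      # cheap seeding step
scores = {name: needleman_wunsch(read, seq) for name, seq in templates.items()}
best = max(scores, key=scores.get)        # alignment decides between near-identical templates
print(votes, scores, best)
```

Both alleles receive seed hits, which is the multi-mapping situation the abstract describes for highly similar templates; the alignment score is what separates them in this toy case.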
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 154
Title: Population Genomics Identified Salmonella Newport ST45 as the Main Driver of Emerging
Multidrug Resistance
Author Block: M. Yue1, S. Rankin2, D. Schifferli2, W. Fang1, H. Pan1;
1Zhejiang University, Hangzhou, CHINA, 2University of Pennsylvania, Philadelphia, PA.
Abstract: Salmonella is one of the most important foodborne pathogens in the world, and the emergence of
multidrug-resistant Salmonella clones poses a significant threat to veterinary public health
and food safety. However, the genetic and/or evolutionary pressure for the selection of
antibiotic-resistant pathogens in food animals remains poorly understood. The aim of this
study was a global investigation of the population diversity of S. Newport isolates by studying
the MLST of 2,250 isolates. Three clades were identified that correlated with the
niches/origins of isolation (human, animal, and environment). Sequence analysis of
1,855 S. Newport genomes identified Sequence Type 45 (ST45) as the predominant clone
among the animal isolates (87%), but in only 9% of the isolates from human infections. ST45
isolates carried multiple plasmids; the majority (> 90%) had a unique IncA/C plasmid that
ranged in size from 80 to 200 kb. The plasmid carried genes responsible for multidrug
resistance, including floR, tetAR, strAB, sul, mer, and bla. Importantly, three Chinese strains
carried the mcr-1 gene, which confers plasmid-mediated resistance to colistin, one of a number
of last-resort antibiotics for treating Gram-negative bacterial infections. A genome-wide
association study (GWAS) correlated chromosome regions or genetic variations with
maintenance of an IncA/C plasmid in ST45 isolates. An additional investigation of the
minimum inhibitory concentration (MIC) of 27 antibiotics in 3,728 isolates from the
food-chain (food-animals, retail meats, and humans) suggested that antibiotic-resistant (AR) S. Newport
isolates from humans have multiple, distinct origins. Animal and retail-meat isolates are distinct from
> 92% of the human isolates by their antibiotic-resistance patterns. Taken together, our
findings suggest S. Newport ST45 is the dominant clone in food-animals in the world. The
GWAS data will serve to investigate genetic determinants that contribute to the maintenance
of this clone in food-animals.
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 155
Title: BacPipe: A Rapid, User-Friendly Whole Genome Sequencing Pipeline for Clinical Diagnostic
Bacteriology and Outbreak Detection
Author Block: B. Xavier, M. Mysara, M. Bolzan, C. Lammens, S. Kumar-Singh, H. Goossens, S. Malhotra-Kumar;
University of Antwerp, Wilrijk, BELGIUM.
Abstract: Despite rapid advances in whole genome sequencing (WGS) technologies, their integration
into routine microbiological diagnostics and infection control has been hampered by the need
for downstream bioinformatics analyses that require considerable expertise. We have
developed a comprehensive, rapid, and computationally low-resource bioinformatics pipeline
(BacPipe) that enables direct analyses of bacterial whole-genome sequences (raw reads,
contigs or scaffolds) obtained from second- and third-generation sequencing technologies.
BacPipe is an ensemble of state-of-the-art, open-access tools for quality verification, genome
assembly, annotation, and identification of the bacterial genotype (MLST, emm typing),
resistance genes, plasmids, virulence genes, and single nucleotide polymorphisms (SNPs). The
outbreak module (SNPs and patient metadata) can simultaneously analyse many strains to
identify evolutionary relationships and transmission routes. Importantly, parallelization of
tools in BacPipe considerably reduces the time-to-result. Validation of BacPipe using previously
published WGS datasets from hospital, community and food-borne outbreaks and from
transmission studies of important pathogens demonstrated the speed and simplicity of the
pipeline that reconstructed the same analyses and conclusions within a few hours. We
believe this fully automated pipeline will contribute to overcoming one of the primary hurdles
to WGS data analysis and interpretation, facilitating its application for routine patient-care in
hospitals and public-health and infection-control monitoring.
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 156
Title: Rapid Extraction of Single-copy Core Genes for Species Delimitation
Author Block: S. Wittouck1, S. Wuyts1, C. Meehan2, V. van Noort3, S. Lebeer1;
1University of Antwerp, Antwerp, BELGIUM, 2Institute of Tropical Medicine Antwerp,
Antwerp, BELGIUM, 3KULeuven, Leuven, BELGIUM.
Abstract: Background: Many analyses in comparative genomics and phylogenetics rely on single-copy
core genes (SCGs): genes present in all genomes of interest in exactly one copy. Current
strategies to obtain SCGs are either slow or rely on pre-computed marker genes, either
universal or lineage-specific. Methods: We developed a tool for the rapid extraction of SCGs
from large sets of genomes in linear time. The tool works by first identifying candidate SCGs
on a random subset of “seed” genomes with OrthoFinder, which uses an approach based on
all-vs-all blastp and MCL. This is followed by a search for those candidate SCGs in all genomes
using HMMER. Finally, a score cutoff is determined per candidate SCG to optimize for
single-copy presence, and only candidate SCGs present in nearly all genomes are retained. We apply
our tool to all 2,110 publicly available genomes that belong to the Lactobacillus Genus
Complex (LGC). We show the applicability of the obtained SCGs by using them for 1) quality
control of all genomes, 2) species delimitation based on pairwise single-copy core nucleotide
identities (SCNIs) and 3) phylogeny inference using one representative genome per species. In
addition, we compare our SCNI-based species delimitation with ANI- and TETRA-based species
delimitations. Results: On a subset of 200 LGC genomes, we show that our tool identifies
SCGs similar to those from full gene family clustering, but is faster. In the full dataset of 2,110 genomes,
we identify 422 SCGs sensu lato. Using those, we find that 1,980 genomes are of high quality
based on filters of < 5% missingness and < 5% contamination. The pairwise SCNI and ANI
similarities are strongly correlated above and slightly below the species threshold, while,
surprisingly, they are very weakly correlated for more distantly related genomes. Species
delimitation of the high-quality genomes results in the identification of thirteen “new”
species, in the sense that it is not yet known that genomes of those species are publicly
available. Some of those genomes are annotated as other species but are too distant from
their type strain to be classified as such, while others are annotated as unclassified on the
species level. The phylogeny of the species shows that the new species are spread across the
LGC tree, with some being closely related to known species, while others are more
distant. Conclusions: Our tool for rapid extraction of SCGs yields results similar to those of current
methods and is much faster. We suggest the SCNI similarity as an alternative for ANI since it
can be determined rapidly for large datasets, results in very similar species boundaries and
might more accurately represent genome distances for more distantly related genomes.
Finally, we identify thirteen new species among publicly available LGC genomes.
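The per-candidate score cutoff described in the Methods can be sketched as follows. This is a minimal illustration, not the authors' tool: the data layout (hit scores per genome for one candidate gene), the rule "pick the cutoff that maximizes the number of genomes with exactly one hit at or above it", the tie-breaking, and the `min_presence` value are all assumptions.

```python
def best_cutoff(hits_per_genome):
    """Return (cutoff, n_single_copy) for one candidate gene.

    hits_per_genome maps a genome name to the list of profile-search scores
    of its hits. Every observed score is tried as a cutoff; the one giving
    the most genomes with exactly one hit >= cutoff wins (lowest on ties).
    """
    candidates = sorted({s for scores in hits_per_genome.values() for s in scores})
    best = (None, -1)
    for cutoff in candidates:
        n_single = sum(
            1
            for scores in hits_per_genome.values()
            if sum(s >= cutoff for s in scores) == 1
        )
        if n_single > best[1]:
            best = (cutoff, n_single)
    return best


# Toy scores (hypothetical): genomes A and C each have a weak second hit,
# genome D has only a weak hit.
hits = {
    "genome_A": [310.0, 45.0],
    "genome_B": [295.0],
    "genome_C": [305.0, 60.0],
    "genome_D": [40.0],
}
cutoff, n_single = best_cutoff(hits)

# Retain the candidate only if it is single-copy in nearly all genomes.
min_presence = 0.95  # hypothetical value for "nearly all"
keep = n_single / len(hits) >= min_presence
print(cutoff, n_single, keep)
```

In this toy case the cutoff removes the weak secondary hits, but the candidate is still rejected because one of four genomes lacks a confident copy.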
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 157
Title: Genomic Epidemiology of Vibrio cholerae O1 in Haiti: A Switch from the Ogawa to Inaba
Serotype
Author Block: T. K. Paisie, C. Mavian, T. Azarian, M. Cash, D. J. Nolan, A. Ali, M. T. Alam, J. Morris Jr., M.
Salemi;
University of Florida, Gainesville, FL.
Abstract: Vibrio cholerae is the causative agent of the disease cholera. This bacterium is ubiquitous in
aquatic environments and toxigenic V. cholerae O1 may serve as a source for recurrent
cholera epidemics around the globe. In January 2010, a massive earthquake struck Haiti,
causing severe damage to the public health infrastructure. Then in October 2010, cholera
appeared in Haiti for the first time in over 150 years. Previous studies show that the early
cases of cholera in Haiti are consistent with a single-source introduction of V.
cholerae O1 from Nepalese U.N. peacekeeping troops sent after the earthquake. After the
initial epidemic waves, cholera may now be endemic in Haiti, showing seasonal outbreak
patterns associated with the rainy season. This clonal, single-source introduction of V.
cholerae O1 presents a unique opportunity to study the evolution and selective pressures
acting on this microorganism. By performing phylodynamic analysis with genome-wide single
nucleotide polymorphisms (SNPs), we are able to investigate the ongoing cholera epidemic
occurring in Haiti and the underlying evolutionary processes and selective pressures at a
remarkable resolution. Since the start of the cholera outbreaks in 2010, the dominant
serotype of V. cholerae O1 circulating in Haiti was the Ogawa serotype. Then in 2015, Inaba
became the dominant serotype in Haiti. The main driver of the switch from the Ogawa
to the Inaba serotype is a nucleotide substitution in the wbeT gene. Though the switch
from the Ogawa to the Inaba serotype is a common phenomenon in V.
cholerae O1, if the Ogawa serotype had remained dominant in the population and an outbreak
of the Inaba serotype had occurred, this could have been caused by a separate introduction into
the population. Previous studies have shown that the Inaba serotype has been present in
Haiti since 2012 but had never propagated and become established as the dominant
serotype circulating in Haiti. By using genome-wide SNPs to perform our analysis, we are able
to assess potential evolutionary changes and selective pressures that are occurring in the V.
cholerae O1 genome to generate this switch in serotype. Our results suggest that the V.
cholerae O1 strains currently circulating in Haiti have evolved from their initial clonal, single-
source outbreak of the Ogawa serotype to the new, unintroduced Inaba serotype.
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 158
Title: LINbase: A Fast and Precise Whole Genome-Based Web Tool for Bacterial Pathogen
Identification and Tracking
Author Block: L. Tian, L. S. Heath, B. A. Vinatzer;
Virginia Tech, Blacksburg, VA.
Abstract: The current pragmatic approach to bacterial taxonomy provides clear classification guidelines
to determine if a bacterial isolate belongs to an already named species. It also provides clear
nomenclature rules on how to name new species. However, species descriptions do not
reveal the extent of genetic and phenotypic diversity within species and current taxonomy
does not provide any general guidelines or rules for intraspecific classification. This is highly
problematic in the case of bacterial pathogens, including foodborne pathogens, since most
pathogen species contain non-pathogenic strains and even pathogenic strains can be
separated into different intraspecific groups based on genomic and phenotypic features.
Whole genome sequencing (WGS) has shown considerable potential to facilitate the
detection of foodborne disease outbreaks and origin tracking by increasing the discriminatory
power compared to molecular methods such as pulsed-field gel electrophoresis (PFGE),
multiple-locus variable number tandem repeat analysis (MLVA) and multi-locus sequence
typing (MLST). Life Identification Numbers (LINs) have been shown to reflect phylogenetic
relationships and to provide a system for classification at - and below - the species level. At
the same time, LINs greatly improve identification of bacterial isolates because LINs can
identify bacterial isolates as members of species or members of intraspecific groups or
members of any other defined groups, e.g. isolates simply belonging to the same disease
outbreak. LINbase is a Web tool that implements LINs for classification and precise
identification of bacteria. In combination with fast algorithms and a user-friendly Web
interface, LINbase will not only provide users with the ability to precisely identify any
bacterial isolate based on its genome sequence within minutes, but also to determine
outbreak-association.
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 159
Title: Next Generation Sequencing to Investigate Nosocomial Transmission of Influenza
Author Block: D. Frampton1, R. Blackburn1, C. Houlihan1, C. Smith1, Z. Kozlakidis1, S. Hue2, A. Hayward1, E.
Nastouli1;
1UCL / Farr Institute for Health Informatics Research, London, UNITED KINGDOM, 2London
School of Hygiene and Tropical Medicine, London, UNITED KINGDOM.
Abstract: Overview: Evidence-based infection control of nosocomial influenza has the potential to offer
substantial human health improvements and financial cost-savings. However, few studies
have examined nosocomial influenza transmission outside the narrow context of suspected
outbreaks. This study is one of the first to apply full genome sequencing to examine influenza
transmission in hospital settings and to compare genomic clusters defined at the level of the
full genome with cases that were epidemiologically linked in time and space (hospital ward or
clinic). Our findings exemplify the use of full genome sequencing for hospital surveillance of
transmission, showing the technique can identify distinct transmission chains with
substantially greater resolution than can be achieved through classical epidemiological
investigation. We show that an important proportion of hospitalized influenza cases (at least
one in eleven) lead to onward transmission, usually involving short chains of transmission
(average length of 3 cases). Many transmission chains cannot be explained by known contact
between individuals, suggesting “missing links” in the chain due to under-ascertainment of
influenza cases in patients and a potential role for staff and/or visitors (who were not
sampled) in transmission. Methods: All influenza samples from inpatients, outpatients and
A&E attenders at a single hospital were included between September 2012 and March 2014.
Clinical records were used to define patients with suspected nosocomial infection with
possible transmission inferred from timing of first positive sample (relative to admission) and
spatio-temporal links to other infected patients. Sequencing was by Illumina
MiSeq. Results: 50 of 214 cases were part of genetically defined transmission chains amongst
hospitalised patients. The proportion in genetic transmission chains was substantially higher
for patients testing positive after 2 days of admission than those diagnosed soon after
admission (p<0.001), and for those with spatio-temporal links compared to those without
(p<0.001). The genetic distance between pairs of cases with spatio-temporal links was lower
than that for pairs with no spatial links (p<0.001). Assuming each genetically identified cluster
includes one community-acquired index case we estimate that 16% of hospital cases were
due to nosocomial transmission. 1 in 11 cases seeded a new transmission chain comprising an
average of 3 cases. Conclusions: Nosocomial influenza contributes significantly to hospital
burden during outbreak seasons. Routine whole genome sequencing will support outbreak
investigations and monitor the impact of infection and control measures.
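The 16% estimate in the Results follows from subtracting one assumed community-acquired index case per cluster from the cases found in transmission chains. The short Python sketch below works through that arithmetic. The number of clusters is not reported in the abstract; the value of 16 used here is an assumption, chosen only because it is consistent with the stated average chain length of about 3 cases and reproduces the reported percentage.

```python
total_cases = 214       # reported: all sequenced cases
cases_in_chains = 50    # reported: cases in genetically defined transmission chains
n_clusters = 16         # ASSUMPTION: not stated in the abstract

# One index case per cluster is treated as community-acquired,
# so only the remaining chain members count as nosocomial.
nosocomial_cases = cases_in_chains - n_clusters
nosocomial_fraction = nosocomial_cases / total_cases
mean_chain_length = cases_in_chains / n_clusters

print(f"{nosocomial_fraction:.1%} nosocomial, mean chain length {mean_chain_length:.1f}")
```

A different cluster count would shift the estimate, so the figure should be read as conditional on the one-index-case-per-cluster assumption stated in the abstract.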
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 160
Title: PiReT: Pipeline for Reference-based Transcriptomics
Author Block: M. Shakya, S. Feng, C. Lo, K. W. Davenport, B. Hu, P. S. Chain;
Los Alamos National Laboratory, Los Alamos, NM.
Abstract: Transcriptomics enables identifying genes and pathways that are differentially expressed in
one condition over another, discovering small RNAs (sRNA), annotating transcribed genes,
and characterizing alternative splicing. With the rapid advancement in sequencing
technologies providing unprecedented throughput at an acceptable cost, many research
laboratories have shown interest in applying transcriptomics to their research. However,
most of the laboratories have found themselves continuously challenged by the lack of
bioinformatics and statistical expertise needed to design, implement, and maintain
computational workflows capable of analyzing transcriptomics data. Here, we
present Pipeline for Reference-based Transcriptomics or PiReT, a one-of-a-kind
reference-based transcriptomics solution that adopts an open architecture and is built upon the web-based
analysis platform of EDGE Bioinformatics to enable biologists with little or no computational
knowledge to analyze their data. A typical transcriptomics workflow requires implementing an
array of bioinformatics tools, each of which addresses a particular step in the analysis, e.g.
quality control, alignment, fragment counting, statistical hypothesis testing, etc. PiReT
effectively weaves together open source bioinformatics tools such as FaQCs, HISAT2,
featureCounts, edgeR, DESeq2, etc. and presents them in an interactive web Graphical User
Interface (GUI) where users can upload their raw data (fastq), customize steps of analysis, and
produce biologist-friendly results (e.g. RPKM/FPKM/TPM, read counts, lists of differentially
expressed genes and pathways, etc.) and data visualizations within the GUI. It can perform
metatranscriptome analysis such as host and pathogen responses, detect sRNAs, and perform
gene set enrichment or pathway analysis. PiReT can be used as a stand-alone workflow on the
command line and is also integrated into EDGE Bioinformatics.
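Two of the expression measures listed above, RPKM and TPM, are standard length- and depth-normalized transformations of read counts. The Python sketch below shows the usual textbook definitions; it is not PiReT's code, and the gene names, counts and lengths are made-up values for illustration.

```python
def rpkm(counts, lengths_bp):
    """Reads per kilobase of transcript per million mapped reads."""
    total = sum(counts.values())
    return {g: counts[g] * 1e9 / (lengths_bp[g] * total) for g in counts}


def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalize first, then scale to 1e6."""
    rate = {g: counts[g] / lengths_bp[g] for g in counts}
    scale = sum(rate.values())
    return {g: r * 1e6 / scale for g, r in rate.items()}


# Hypothetical read counts and gene lengths (bp).
counts = {"geneA": 100, "geneB": 300, "geneC": 600}
lengths_bp = {"geneA": 1000, "geneB": 3000, "geneC": 2000}

rpkm_values = rpkm(counts, lengths_bp)
tpm_values = tpm(counts, lengths_bp)
print(rpkm_values)
print(tpm_values)
```

geneA and geneB receive the same value under both measures because their counts scale with their lengths; TPM values always sum to one million within a sample, which makes them easier to compare across samples than RPKM.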
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 161
Title: Chronic Campylobacteriosis Outbreak Investigation in Great Apes Using Next-Generation
Sequencing
Author Block: D. Bandoy1, E. Crook2, N. Kong1, C. Huang1, B. Weimer1;
1University of California, Davis, Davis, CA, 2Hogle Zoo, Salt Lake City, UT.
Abstract: Background: Campylobacteriosis is one of the leading causes of diarrhea globally. While more
than a thousand genomes have been published for Campylobacter jejuni, genomes from
other Campylobacter species, like C. hyointestinalis, have been reported infrequently. To date
only 18 C. hyointestinalis complete and draft genomes have been published, primarily from
domestic and wild ruminants. Methods: Campylobacter was isolated and identified from
feces collected under a longitudinal surveillance strategy, using classical microbiological methods. Whole
genome sequencing was done using the previously described protocol of the 100K Foodborne
Pathogen Project using Illumina HiSeq X Ten instruments. Raw reads were assembled using
CLC Genomics. Genome distance was computed using GGDC and the distance matrix values
were used to generate a phylogenomic tree. Annotation was done using Prokka followed by
pangenome analysis using Roary that was visualized using Phandango. Whole genome
sequences were analyzed using ABRIcate and the online Comprehensive Antibiotic Resistance
Database (CARD) for antimicrobial resistance genes and virulence factors. Genomic islands
were predicted using the Island Viewer online tool and manual curation was performed using
Mauve alignment. Results: In silico genome distance placed the isolates into distinct groups by
host species of origin, indicating host species adaptation. This clustering corresponds to the
host-species-specific set of genes as demonstrated by presence-absence variation (PAV)
analysis using the pangenome output. Only one isolate (BCW 9279) within the primate
outbreak showed phylogenetic incongruence by clustering with the New Zealand deer
isolates. The apparent phylogenetic incongruence has been determined to be due to
horizontal gene transfer with genomic islands acquired from Clostridioides difficile. Further
analysis revealed the existence of arsenical resistance genes, which were eventually no longer
identified in the post-antibiotic treatment genome sequences. This finding indicates a
mechanism of genomic divergence with the acquisition of genomic islands that were
negatively selected within the context of an ongoing outbreak and therapeutic intervention.
Surprisingly, despite the great apes’ exposure to tetracycline treatment, C. hyointestinalis ssp.
hyointestinalis resistome profiling revealed absence of any known antibiotic resistance genes.
rRNA copy number comparison (one copy in all the primates versus three copies in the
reference) suggests a possible mechanism of a carrier state with reduced metabolic
activity. Conclusion: Whole genome sequencing revealed unprecedented resolution of
longitudinal infection dynamics, revealing acute genomic island gain and loss due to negative
selection pressure with antibiotic exposure. These findings highlight the genomic flexibility
of Campylobacter hyointestinalis ssp. hyointestinalis in chronic diarrhea of great apes.
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 162
Title: Sunbeam: An Extensible Pipeline for Analyzing Metagenomic Sequencing Experiments
Author Block: E. L. Clarke1, L. J. Taylor1, C. Zhao2, A. Connell1, F. D. Bushman1, K. Bittinger2;
1Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 2Children's
Hospital of Philadelphia, Philadelphia, PA.
Abstract: Background: Shotgun metagenomic sequencing experiments provide functional and
compositional insight into complex microbial communities. To analyze such data, a number of
preprocessing and analytical steps must be performed. Many of these steps, such as quality
control, adapter trimming, and phylogenetic classification, are common to many sequencing
experiments. Other analyses are specific to each study. Methods: Here we introduce
Sunbeam, a modular and user-extensible pipeline designed to process metagenomic
sequencing data in a consistent and reproducible fashion. Sunbeam performs multiple
processing steps common to many metagenomic sequencing experiments including quality
control, adapter trimming, host read removal, low-complexity filtering, metagenomic
classification, read assembly, and reference genome alignments. Sunbeam also includes a
powerful extension framework that enables users to incorporate new analysis or processing
steps easily. Results: Sunbeam installs in a single step, has no dependencies other than Linux,
doesn't require administrative access, and works on most cluster computing frameworks.
Sunbeam is inherently modular and will restart where it left off in case of error. To quickly
and accurately filter problematic low-complexity reads in metagenomic data, we also
introduce Komplexity, a rapid sequence complexity analysis tool, which identifies low
complexity sequences to allow removal. The Sunbeam pipeline is well-documented, regularly
updated and in routine use. We also provide a number of pre-built extensions
(github.com/sunbeam-labs/). Conclusions: Sunbeam provides an easy-to-use, extensible
framework for in-depth analysis of metagenomic sequencing experiments. Sunbeam ensures
reproducible and consistent analyses by standardizing post-processing, analytical, and custom
steps, and robust removal of problematic, low-complexity reads. Sunbeam is written in
Python using the Snakemake workflow management software and is freely available at
github.com/sunbeam-labs/sunbeam under the GPLv3.
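Low-complexity filtering of the kind described above can be illustrated with a simple k-mer-based score. The abstract does not describe how Komplexity scores a sequence, so the metric below (unique k-mers divided by read length), the value of k, the threshold and the function names are all assumptions chosen for illustration rather than a description of the tool.

```python
def kmer_complexity(seq, k=4):
    """Number of distinct k-mers divided by sequence length (0.0 if too short)."""
    n_kmers = len(seq) - k + 1
    if n_kmers <= 0:
        return 0.0
    unique = {seq[i:i + k] for i in range(n_kmers)}
    return len(unique) / len(seq)


def filter_low_complexity(reads, k=4, threshold=0.55):
    """Keep reads whose complexity score is at or above the (hypothetical) threshold."""
    return [r for r in reads if kmer_complexity(r, k) >= threshold]


reads = [
    "AAAAAAAAAAAAAAAAAAAA",   # homopolymer: a single distinct 4-mer
    "ACGTTGCAAGCTAGGCTTAC",   # mixed sequence: every 4-mer is distinct
]
kept = filter_low_complexity(reads)
print([round(kmer_complexity(r), 2) for r in reads], kept)
```

A score of this form is cheap to compute per read, which matters when it is applied to every read in a metagenomic dataset before classification or assembly.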
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 163
Title: A Carbon Nanotube Platform for Virus Enrichment
Author Block: Y. Yeh, M. Terrones;
The Pennsylvania State University, University Park, PA.
Abstract: Viral pathogens evolve rapidly and unpredictably, challenging the effectiveness of existing
studies of viral evolution. Deep sequencing techniques detect viral mutations and diversity
by sequencing a genome region multiple times. Higher coverage along a consensus
sequence allows for reliable identification of mutations within a viral population. Starting
with a sufficient amount and a high purity of genomic material is key to obtaining a
high-coverage consensus. One current challenge is the low virus count in most samples, leading to
sequence reads that are dominated by hosts rather than by viral pathogens. Extant
enrichment methods, including virus culture and genome amplification, often introduce
artificial variants or bias among sequence reads. Size-tunable-enrichment-platform (STEP),
our recently developed portable technology, is constructed from aligned and functionalized
carbon nanotube (CNT) forests to enrich different viruses based on their sizes while removing host
contaminants, e.g. host cell debris, DNA, mRNA, etc. The CNT-STEP significantly improves
detection limits and virus isolation rates by at least 100 times. We integrate CNT-STEP with
NGS in order to sequence unknown viruses directly from field samples after enrichment. After
enrichment, NGS viral reads increased from 2.9% (37,627 reads) to 90.6% (1,175,537 reads),
thus corresponding to an enrichment factor of ~600, and indicating that the CNT-STEP
removed most of the contamination from the host. In order to validate our new approach for
real field samples, we applied it to a cloacal swab pool collected from five ducks during a 2012 AIV
surveillance in Pennsylvania. Without any virus purification and propagation, the duck swab
sample was enriched and concentrated by a CNT-STEP of 95 nm inter-tubular distance. No
clogging was observed under scanning electron microscopy (SEM). NGS and de novo
sequence assembly yielded 8 AIV contigs in complete lengths, but no AIV related contig was
discovered in the sample without CNT-STEP enrichment. We named it
“A/duck/PA/02099/2012 (H11N9)”. The H11N9 strain was further confirmed by the U.S.
Department of Agriculture (USDA) through serological tests. This enrichment increases
sequencing coverage by two orders of magnitude, which dramatically enhances the sensitivity in
identifying mutations. An outcome of this collaboration is the establishment of a unique
method that enables close monitoring of viral evolution and a cost-effective sample preparation
platform to allow for efficiency in viral deep sequencing.
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 164
Title: Whole Genome Sequencing to Track the Origin and Spread of Tuberculosis in Low Prevalence
Setting of Australia
Author Block: S. Gautam1, M. Aogáin2, L. Cooley3, G. Haug4, J. Fyfe5, M. Globan5, R. O’Toole1;
1University of Tasmania, Hobart, AUSTRALIA, 2Trinity College, Dublin, Dublin, IRELAND, 3Royal
Hobart Hospital, Hobart, AUSTRALIA, 4Launceston General Hospital, Launceston,
AUSTRALIA, 5Victoria Infectious Diseases Reference Laboratory, Melbourne, AUSTRALIA.
Abstract: Background: Tasmania is a small island state in Australia with an annual tuberculosis (TB)
incidence rate of 1.7/100,000 population in 2014. A 60% drop in the current rate of TB by
2035 and a 95% drop by 2050 are required in Tasmania to meet the World Health Organization’s
international target of TB eradication by 2050. This study was designed to identify the source
and track the transmission of TB in Tasmania, which are largely unknown. Methods: Whole genome
sequence (WGS) analysis of cultured isolates of Mycobacterium tuberculosis obtained from
2014 to 2016 in Tasmania was performed using Illumina MiSeq at the University of Tasmania. The
genomic data were analyzed for single locus variation to determine phylogeny and drug
resistance-conferring mutations. Genomic information was also analyzed in reference to
public health surveillance records. Furthermore, in silico spoligotyping was performed to
relate Tasmanian TB cases to publicly available isolates of international spoligotypes.
Household contacts of TB cases were traced and their isolates analyzed. A cut-off of ≤5 single
nucleotide polymorphism (SNP) differences between the isolates was used to define the
recent transmission. Results: More than 80% of TB cases in Tasmania were detected in non-
Australian born individuals. Two clusters of TB were detected, one belonging to individuals
originating from Nepal and other from New Zealand. Based on WGS data, isolates belonging
to the largest cluster of TB in Tasmania were related to those prevalent in patient’s country of
origin, Nepal. Furthermore, SNP analyses revealed Vietnam as the origin of the first case of
multi-drug resistant TB in Tasmania. In addition, a human case of bovine TB reported after 40
years of its eradication from cattle in Tasmania was linked to M. bovis previously reported in
mainland Australia. Conclusion: Majority of TB cases in Tasmania have been reported in
foreign-born individuals. Geographically, TB in the state had a foreign origin. Transmission of
TB occurred within the members of the close community but not in a wider population.
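The ≤5 SNP cut-off for recent transmission described above can be illustrated with a small sketch. This is not the authors' pipeline: the isolate names and toy aligned sequences are hypothetical, and single-linkage grouping is an assumption made here for illustration, since the abstract states only the threshold.

```python
from itertools import combinations

SNP_CUTOFF = 5  # threshold stated in the abstract (<=5 SNP differences)


def snp_distance(a: str, b: str) -> int:
    """Count differing positions in two aligned sequences, ignoring non-ACGT calls."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x != y and x in "ACGT" and y in "ACGT")


def transmission_clusters(seqs: dict[str, str], cutoff: int = SNP_CUTOFF) -> list[set[str]]:
    """Group isolates by single linkage: any pair within the cutoff joins one cluster."""
    parent = {name: name for name in seqs}

    def find(x: str) -> str:
        # Walk up to the cluster representative, compressing the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(seqs, 2):
        if snp_distance(seqs[a], seqs[b]) <= cutoff:
            parent[find(a)] = find(b)

    groups: dict[str, set[str]] = {}
    for name in seqs:
        groups.setdefault(find(name), set()).add(name)
    # Singletons are not evidence of recent transmission, so drop them.
    return [g for g in groups.values() if len(g) > 1]


# Hypothetical toy alignment (real analyses compare genome-wide variant sites).
isolates = {
    "iso_A": "ACGTACGTACGTACGTACGT",
    "iso_B": "ACGTACGTACGTACGTACGA",  # 1 SNP from iso_A
    "iso_C": "ACGTACGAACGAACGTACGA",  # 3 SNPs from iso_A, 2 from iso_B
    "iso_D": "TGCATGCATGCATGCATGCA",  # far from all others
}
clusters = transmission_clusters(isolates)
print(clusters)
```

Single linkage is a permissive choice: a chain of closely related isolates ends up in one cluster even when its two ends differ by more than the cut-off.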
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 165
Title: Understanding genomic landscapes in EnteroBase with cgMLST & GrapeTree.
Author Block: N. Alikhan1, Z. Zhou1, N. Luhmann1, C. Vaz2, A. P. Francisco2, J. A. Carriço3, M. Achtman1;
1University of Warwick, Coventry, UNITED KINGDOM, 2Instituto de Engenharia de Sistemas e
Computadores: Investigação e Desenvolvimento, Lisbon, PORTUGAL, 3Universidade de Lisboa,
Lisbon, PORTUGAL.
Abstract: Sequenced raw reads are available in ENA for more than 647,000 bacterial genomes.
Important goals for such data may include identifying groups of genetically related bacteria in
order to facilitate epidemiological tracking or in-depth analyses. However, even these simple
goals are difficult unless the raw data is codified.
We have developed an online tool, EnteroBase (http://enterobase.warwick.ac.uk), which
provides access to genomic assemblies, genotypes and analytical tools to biologists, clinicians
and epidemiologists. EnteroBase includes consistent high-resolution genotyping by core
genome multi-locus sequence typing (cgMLST) schemes
for Salmonella, Escherichia, Yersinia & Clostridioides and their intuitive visualization by
GrapeTree (https://github.com/achtman-lab/GrapeTree) (1). Phylogenetic analyses via single
nucleotide polymorphisms (SNPs) of up to 1,000 genomes are also available on demand. An
initial impression of the benefits of this approach can be found in a recent review article (2).
We are already implementing the combination of data from modern genomes with ancient
DNA. EnteroBase contains more than 150,000 genomes from Salmonella and 70,000 from
Escherichia. These are unprecedented troves of data on the diversity within these two genera,
and the size of these databases will continue to increase dramatically over the next few years.
All read data are checked for quality, assembled and genotyped with a versioned pipeline,
ensuring consistency. EnteroBase supports sharing of data within private groups of
researchers as well as publishing graphical analyses and datasets for the entire global
community. We are also establishing facilities to allow free download of all genomes in
EnteroBase via a dedicated server. MSTree V2 and RapidNJ are implemented within
GrapeTree, and can identify important clusters of related organisms among 100,000 genomes
based on cgMLST. However, we are already preparing for the future that will encompass
orders of magnitude more genomes, by developing hierarchical clustering, which will provide
persistent and scalable designations, as a general tool for microbial genomics.
Reference List
1. Z. Zhou et al., BioRxiv doi: https://doi.org/10.1101/216788 (2017).
2. N.-F. Alikhan, Z. Zhou, M. J. Sergeant, M. Achtman, PLoS Genet 14, e1007261 (2018).
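As a rough illustration of how cgMLST profiles can be turned into a tree, the sketch below computes pairwise allele distances and builds a minimum spanning tree with Prim's algorithm. It is not MSTree V2 or RapidNJ as implemented in GrapeTree; the profile names, the six-locus profiles and the use of 0 for a missing allele call are all assumptions made for illustration.

```python
Profile = tuple[int, ...]  # one allele number per core-genome locus; 0 = missing call


def allele_distance(p: Profile, q: Profile) -> int:
    """Count loci where both profiles have a call and the allele numbers differ."""
    return sum(1 for a, b in zip(p, q) if a and b and a != b)


def minimum_spanning_tree(profiles: dict[str, Profile]) -> list[tuple[int, str, str]]:
    """Prim's algorithm: repeatedly attach the closest profile not yet in the tree."""
    names = list(profiles)
    in_tree = {names[0]}
    edges: list[tuple[int, str, str]] = []
    while len(in_tree) < len(names):
        best = min(
            (allele_distance(profiles[u], profiles[v]), u, v)
            for u in sorted(in_tree)
            for v in names
            if v not in in_tree
        )
        edges.append(best)   # (distance, node already in tree, newly attached node)
        in_tree.add(best[2])
    return edges


# Hypothetical profiles over six loci (real schemes cover thousands of loci).
profiles = {
    "ST_a": (1, 1, 1, 1, 1, 1),
    "ST_b": (1, 1, 1, 1, 1, 2),
    "ST_c": (1, 1, 1, 1, 3, 2),
    "ST_d": (4, 4, 4, 0, 3, 2),  # locus 4 of 6 is missing and is skipped
}
tree = minimum_spanning_tree(profiles)
for distance, u, v in tree:
    print(f"{u} -- {v}: {distance} allele differences")
```

This naive version compares every pair and would not scale to the 100,000 genomes mentioned above; it only shows the distance definition and the tree-building idea.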
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 166
Title: Comparison of Genomic Analysis Methods for Investigation of a Legionellosis Cluster in New
York City, October 2017
Author Block: C. Kretz1, P. Lapierre2, J. Mercante1, J. Novak3, E. Omoregie3, E. Gonzalez3, J. Wang3, B.
Raphael1, S. Hughes3, K. Musser4, J. Rakeman3;
1CDC, Atlanta, GA, 2NYS Wadsworth Center, Albany, NY, 3NYC PHL, NYC, NY, 4NYS Wadsworth,
Albany, NY.
Abstract: Legionellosis is caused by exposure to Legionella species found in water; symptoms range
from a mild influenza-like illness to a serious and sometimes fatal form of pneumonia. In the
United States, ~79% of cases are associated with Legionella pneumophila serogroup 1
(Lp1). Legionella is a growing public health concern in the country; disease incidence has
nearly quadrupled since 2000, with several large high-profile outbreaks in recent years,
including in New York City. During October 1-14, 2017, 15 cases of legionellosis were
confirmed within a <0.75 km radius in Flushing, Queens, NY. We used 2 comparative genomic
methods to characterize environmental isolates collected during the investigation and
compare them with circulating strains to assess the diversity of Lp1. During the environmental
investigation, 55 epidemiologically linked cooling towers and water fountains were sampled
and tested by culture-based methods and by real-time PCR for the presence
of Legionella species, Legionella pneumophila, and Legionella pneumophila serogroup 1 (Lp1).
No clinical isolates were recovered from 6 sputum specimens obtained from patients, but
13 Legionella species isolates and 5 Lp1 isolates were recovered from environmental sources.
Whole-genome sequencing (WGS) was performed on all 5 Lp1 isolates, and single nucleotide
polymorphism (SNP) and whole-genome multilocus sequence typing (wgMLST) analyses were carried
out for in-depth molecular characterization of circulating strains. SNP analysis was used to
compare isolates recovered during the investigation with historical Lp isolates from New York
State (NYS). One isolate matched two unrelated clinical isolates from NYS with 24-25 SNP
differences, whereas the remaining isolates were closely related to each other and to
environmental isolates previously recovered during the 2015 South Bronx outbreak, which indicates
persistence of this strain in NYC. In silico sequence-based typing revealed that 4 isolates were
sequence type (ST) 1400, which has been found only in NY, and 2 of these isolates shared a high
degree of similarity (>99% allele identity) with a clinical isolate from 2009 recovered in NYC.
WGS has provided additional resolution to outbreak investigations. In this investigation, two
different genomic sequence analysis methods were used with comparable results. Both methods were
able to distinguish and separate isolates based on their relatedness. We conclude that
particular Legionella strains could be endemic and persistent in NYC based on similarity of
strains and an ST unique to NY. Additionally, we detected diversity of potential disease-causing
strains in cooling towers when compared with strains commonly found in the United States.
Our investigation showcases how WGS is crucial in outbreak investigations and highlights the
need to obtain clinical isolates from patients with legionellosis to identify disease sources
and prevent additional exposures.
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 167
Title: Prevalence and Serovar Diversity of Salmonella spp. in Primary Agriculturalhorticultural Fruit
Production Environments
Author Block: L. Chidamba, L. L. Korsten, A. Gomba;
University of Pretoria, Pretoria, SOUTH AFRICA.
Abstract: Increases in foodborne disease outbreaks associated with fresh produce have made it
necessary to identify potential sources of microbial contamination in produce and agricultural
environments. The present study evaluated Salmonella prevalence and serovar diversity in
fruit (225), water (140) and surface (126) samples from three commercial farms and
associated packhouses, located in different farming regions in South Africa. Fruit and water
samples were collected from both orchards and packhouses, while surface samples were
collected from conveyor belts and hands of packhouse employees. Salmonella was detected
in 26 of the 491 (5.3%) samples. Environmental samples (water and surfaces) recorded a
slightly higher proportion (3.1%; 15/491) of positive samples compared to fruit samples
(2.2%; 11/491). Salmonella was not detected in employee hand or river water samples. A
total of 263 Salmonella cultures were isolated from the 26 positive samples by standard
culture methods, preliminarily identified through matrix-assisted laser desorption
ionisation-time of flight mass spectrometry (MALDI-TOF MS) and API 20E, and confirmed by
detection of the invA gene. Of the 39 representative isolates serotyped, the serovars Muenchen
(33.3%), Typhimurium (30.8%), Heidelberg (20.5%), Bsilla (7.7%), Salmonella subspecies
IIb: 17: r: z (5.1%) and one untypable strain were identified. Most samples had multiple
serovars, with orchard water from one site recording the highest serovar diversity (4 serovars).
Our findings show the potential of agricultural fruit production environments to act as
reservoirs of clinically important Salmonella serovars.
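The percentages reported above can be checked with a short calculation. The counts are taken from the abstract; the within-category rates at the end are derived here for comparison and are not figures reported by the authors.

```python
# Sample counts and positive counts as reported in the abstract.
samples = {"fruit": 225, "water": 140, "surface": 126}
positives = {"fruit": 11, "environmental": 15}  # environmental = water + surfaces


def percent(part: int, whole: int) -> float:
    """Percentage rounded to one decimal place, as in the abstract."""
    return round(100 * part / whole, 1)


total_samples = sum(samples.values())
total_positive = sum(positives.values())

# Reported figures: every proportion uses all 491 samples as the denominator.
overall = percent(total_positive, total_samples)
share_of_all = {k: percent(v, total_samples) for k, v in positives.items()}

# Derived here (not reported): positives relative to each category's own sample count.
environmental_samples = samples["water"] + samples["surface"]
within_category = {
    "fruit": percent(positives["fruit"], samples["fruit"]),
    "environmental": percent(positives["environmental"], environmental_samples),
}

print(total_samples, overall, share_of_all, within_category)
```

Because the reported 3.1% and 2.2% share a denominator of 491, they describe each category's contribution to all samples rather than the positivity rate within that category.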
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 168
Title: Applying Bioinformatics Pipelines to Reconstruct Bacterial Genomes from Human Faecal
Metagenomes
Author Block: F. M. Mobegi1, L. E. Leong1, B. Ramadass2, E. Mortimer3, M. J. Manary4, D. H. Alpers5, G. P.
Young3, B. S. Ramakrishna6, G. B. Rogers1;
1SAHMRI, Adelaide, AUSTRALIA, 2All India Institute of Medical Sciences, Odisha,
INDIA, 3Flinders University, Bedford Park, AUSTRALIA, 4Washington University in St. Louis, St.
Louis, MO, 5Washington University in St. Louis, St. Louis, MO, 6SRM Institutes for Medical
Science, Chennai, INDIA.
Abstract: Introduction: Advances in metagenomics and computational methods, together with
reductions in sequencing costs, have aided culture-independent studies of complex microbial
systems. Metagenomics-based analysis has primarily focused on assessments of microbiome
diversity and functional capacities. However, deep metagenomic sequencing also allows the
reconstruction and exploration of draft microbial genomes. This approach is particularly
powerful in relation to culture-refractory species. We aimed to retrieve high-quality draft
bacterial genomes from faecal metagenomes, generated from pre-school children in Tamil
Nadu, India. Methods: Shotgun metagenomic sequencing was performed on longitudinal
stool sample collections from three stunted and three non-stunted adolescents who were
enrolled in a starch supplementation study. On average, 261 million high-quality reads were
obtained from each sample. The reads were de novo assembled into contigs using IDBA, and the
contigs were dereplicated using CD-HIT to remove redundant sequences. Individual sample
reads were then mapped to the non-redundant contigs using the Burrows-Wheeler Aligner (BWA),
and the resulting BAM files were sorted and indexed using SAMtools. MetaBAT, run with all five
parameter presets and a depth file for each BAM file, was used to bin contigs, as previously
described (1). The resulting bins, which represent draft genomes, were assessed for completeness
and purity and refined using CheckM and RefineM. Taxonomic assignments for the drafts
were confirmed using Kraken. Results: Based on deep metagenomic sequencing, we
reconstructed near-complete genomes of 114 bacterial taxa with high genome quality
(completeness ≥70%, contamination ≤10%). Approximately 93% of all recovered genomes
represented fermentative commensal species. Although the reconstructed genomes
displayed notable consistency with their type strains in core genome composition, some
selected culture-refractory anaerobic bacteria revealed significant differences from their
type-strain counterparts in the accessory genome. In monosaccharide metabolism, for
example, the reconstructed A. muciniphila has the genes needed to utilise d-galacturonate and
d-glucuronate, which are absent in the type strain A. muciniphila (ATCC BAA-835). In contrast,
the ATCC strain has genes for fructose utilisation that are absent in our genome. These
differences might reflect characteristics of the local diet. Conclusion: We demonstrated the
successful recovery of draft bacterial genomes from faecal metagenomes and their
comparison to type strains. This ability to construct bacterial genomes directly from
metagenomes is valuable in allowing the analysis of culture-refractory taxa, and is likely to be
particularly important in contexts where advanced culture techniques are unavailable. It also
provides a means to mine existing published datasets.
Literature
1. Parks, D.H. et al. Nature Microbiology 2, 1533-1542 (2017).
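The quality criteria quoted in the Results (completeness ≥70%, contamination ≤10%) amount to a simple filter over per-bin estimates. The sketch below applies those two thresholds to a hypothetical list of bins; the bin names and values are invented for illustration, and in practice the estimates would come from a tool such as CheckM rather than being typed in by hand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenomeBin:
    name: str
    completeness: float   # estimated percent of expected marker genes found
    contamination: float  # estimated percent of duplicated marker genes


def passes_quality(
    genome_bin: GenomeBin,
    min_completeness: float = 70.0,   # threshold from the abstract
    max_contamination: float = 10.0,  # threshold from the abstract
) -> bool:
    """Keep a bin only if it meets both thresholds."""
    return (
        genome_bin.completeness >= min_completeness
        and genome_bin.contamination <= max_contamination
    )


# Hypothetical bins for illustration only.
bins = [
    GenomeBin("bin.001", completeness=96.4, contamination=1.2),
    GenomeBin("bin.002", completeness=70.0, contamination=10.0),  # exactly on both limits
    GenomeBin("bin.003", completeness=88.1, contamination=14.7),  # too contaminated
    GenomeBin("bin.004", completeness=52.3, contamination=0.8),   # too incomplete
]
kept = [b.name for b in bins if passes_quality(b)]
print(kept)
```

Both comparisons are inclusive, matching the ≥ and ≤ signs in the abstract, so a bin sitting exactly on either limit is retained.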
Session: Poster Session B
Time: Tuesday, September 25, 2018, 2:00 pm - 3:30 pm
Poster Board #: 169
Title: Building Bioinformatics Infrastructure in New England State Public Health Laboratories
Author Block: T. Hsu1, G. Gallagher1, N. L. Yozwiak2, D. J. Park2, C. Tomkins-Tinch2, T. Fink1, N. Consortium3,
P. Sabeti2, S. Smole1;
1Massachusetts State Public Health Laboratory, Jamaica Plain, MA, 2Broad Institute of MIT and
Harvard, Cambridge, MA, 3Northeast Environmental and Public Health Laboratory Directors,
NA, MA.
Abstract: Background: Public health laboratory surveillance systems have historically relied on two
types of assays to detect pathogens: i) profiling the organism of interest via
morphological traits (microscopy), metabolic capability (culture), and/or molecular subtyping,
or ii) surveying the environment of the organism of interest through serology. With
decreasing costs, next generation sequencing (NGS) has emerged as a surveillance tool that
potentially provides a standardized protocol across groups of microorganisms and finer
resolution than subtyping. Unfortunately, the implementation of NGS and bioinformatics
analyses in state laboratories has remained challenging. Methods: Massachusetts State Public
Health Laboratory (MA SPHL) was funded as the bioinformatics leader laboratory in 2018 for
the New England area, which includes CT, MA, ME, NH, NY, RI, VT, NYC, and NJ. In order to
determine the status of bioinformatics for these laboratories, MA SPHL hosted a series of 6
calls in partnership with the Broad Institute of MIT and Harvard. The first 3 calls reviewed
each state’s sequencing and bioinformatics infrastructure, while the latter 3 calls consisted of
demonstrations of cloud computing through Amazon Web Services (AWS) and Google
Compute Engine. Results: Surveys and calls revealed that most states had MiSeq sequencing
capability and participated in CDC programs such as PulseNet, National Antimicrobial
Resistance Monitoring System (NARMS), Global Health Outbreak and Surveillance Technology
(GHOST), and CaliciNet. Wadsworth NY was an outlier with both sequencing and
bioinformatics cores, along with active assay and pipeline development for organisms outside
those projects (e.g., adenovirus, mumps, and Zika). Difficulties in setting up bioinformatics
infrastructure stemmed from information technology (IT) resistance to Linux or cloud
support, little funding for bioinformatics staff, and a lack of data policies for sequencing
data. Conclusions: Most New England states have obtained the ability to sequence by
participating in CDC programs, but rely on the CDC or outside collaborators for bioinformatics
analyses. Each state will likely require its own unique solution due to differing state laws and
governance, which in turn shape state IT departments and prevent them from emulating
CDC’s compute model. To provide future guidance, we are currently drafting a
“Bioinformatics Implementation Guide”, which will include considerations for hiring
bioinformaticians, finding compute hardware, and working with IT.