Multilaboratory Assessment of Metagenomic Next-Generation Sequencing For Unbiased Microbe Detection
Highlights
We developed a set of well-defined reference materials, which is very beneficial to monitor problems in mNGS workflows and identify optimal protocols.
The high interlaboratory variability in

Graphical abstract
[Workflow figure not recoverable from the extracted text. Figure labels: Sampling; Nucleic acid extraction; Library construction; High-throughput sequencing; Genome fragmentation; End repair, add A tail, ligate adapters.]
Article info
Article history: Received 26 August 2021; Revised 24 September 2021; Accepted 24 September 2021; Available online 27 September 2021
Keywords: Metagenomic next generation sequencing; mNGS; Respiratory tract infection; Reference material; Infectious disease
D. Han, Z. Diao, H. Lai et al. Journal of Advanced Research 38 (2022) 213–222
https://doi.org/10.1016/j.jare.2021.09.011
2090-1232/© 2022 The Authors. Published by Elsevier B.V. on behalf of Cairo University.
This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

Abstract
Introduction: Metagenomic next-generation sequencing (mNGS) assay for detecting infectious agents is now in the stage of being translated into clinical practice. With no approved approaches or guidelines available, laboratories adopt customized mNGS assays to detect clinical samples. However, the accuracy, reliability, and problems of these routinely implemented assays are not clear.
Objectives: To evaluate the performance of 90 mNGS laboratories under routine testing conditions through analyzing identical samples.
Methods: Eleven microbial communities were generated using 15 quantitative microbial suspensions. They were used as reference materials to evaluate the false negatives and false positives of participating mNGS protocols, as well as the ability to distinguish genetically similar organisms and to identify true pathogens from other microbes based on fictitious case reports.
Results: High interlaboratory variability was found in the identification and the quantitative reads per million reads (RPM) values of each microbe in the samples, especially when testing microbes present at low concentrations (1 × 10³ cell/ml or less). 42.2% (38/90) of the laboratories reported unexpected microbes (i.e. false positive problem). Only 56.7% (51/90) to 83.3% (75/90) of the laboratories showed a sufficient ability to obtain clear etiological diagnoses for three simulated cases combined with patient information. The analysis of the performance of mNGS in distinguishing genetically similar organisms in three samples revealed that only 56.6% to 63.0% of the laboratories recovered RPM ratios (RPM S. aureus/RPM S. epidermidis) within the range of a 2-fold change of the initial input ratios (indicating a relatively low level of bias).
Conclusion: The high interlaboratory variability found in both identifying microbes and distinguishing true pathogens emphasizes the urgent need for improving the accuracy and comparability of the results generated across different mNGS laboratories, especially in the detection of low-microbial-biomass samples.
Table 1
Characteristics of the mock communities in this study.
Sample ID | Microbial composition | Concentration (cell/ml) | Results to be reported by the laboratory to NCCL | Use in this study
S1 | 14 microbes at different concentrations | 1 × 10² to 1 × 10⁷ | All microbes above the cut-off value set by the laboratory, with their specific reads and RPM values | Sensitivity of mNGS for detecting microbes
S2-S4 | 9 microbes at the same concentration | 5 × 10² (S2), 1 × 10² (S3), 1 × 10 (S4) | All microbes above the cut-off value set by the laboratory, with their specific reads and RPM values | Performance in detecting low-microbial-biomass samples
S5-S7 | Staphylococcus aureus : Staphylococcus epidermidis | 1 × 10⁶ : 1 × 10⁵ (10:1), 1 × 10⁶ : 1 × 10⁶ (1:1), 1 × 10⁵ : 1 × 10⁶ (1:10) | Staphylococcus aureus and Staphylococcus epidermidis, with their specific reads and RPM values | Performance in distinguishing genetically similar organisms
S8 | Aspergillus fumigatus and 5 other bacteria | 1 × 10⁴ (Aspergillus fumigatus) | Microbes considered to be the true pathogens of the simulated cases, with their specific reads and RPM values | Performance in identifying true pathogens for case report 1
S9 | Candida albicans and 6 other bacteria | 1 × 10⁵ (Candida albicans) | Microbes considered to be the true pathogens of the simulated cases, with their specific reads and RPM values | Performance in identifying true pathogens for case report 2
S10 | Streptococcus pneumoniae and 6 other bacteria | 1 × 10⁵ (Streptococcus pneumoniae) | Microbes considered to be the true pathogens of the simulated cases, with their specific reads and RPM values | Performance in identifying true pathogens for case report 3
S11 | Negative sample | – | All microbes above the cut-off value set by the laboratory, with their specific reads and RPM values | Negative control
Performance in detecting microbes with different concentrations

Sample S1 is a complex mock community consisting of 14 microbes at different concentrations and is used to evaluate the sensitivity of mNGS in microbial identification. The reported data showed that no laboratory could detect all 14 microbes. 40% (36/90) of the laboratories detected all the microbes with concentrations in the range of 1 × 10⁴ to 1 × 10⁷ cell/ml (Fig. 1a). By contrast, each of the six microbes at concentrations of 1 × 10³ cell/ml or less was detected by fewer than 50% of the laboratories (Fig. 1a). Taking the reported data of all laboratories as a whole, we found that the higher the microbial load in the sample was, the higher the median RPM of the corresponding microbe obtained by sequencing (see supplementary dataset 2, Fig. 1b). However, for each microbe, the RPM values reported by each laboratory varied greatly (see supplementary dataset 2). For example, the median RPMs of H. influenzae and S. pneumoniae were 17556.9 (IQR: 4358.9–41154.8; range: 0.3 (lab_34) to 330310 (lab_82)) and 24977.3 (IQR: 5236.4–58704.8; range: 0.5 (lab_34) to 962186.2 (lab_8)), respectively.
The presence of unexpected microbes (i.e. false positives) is a matter of concern. In this study, 42.2% (38/90) of the laboratories reported 306 unexpected microbes (Fig. 1c). Among the 306 unexpected microbes, the ones with the highest frequency were Pasteurella multocida (in 15 labs), Haemophilus aegyptius (in 13 labs) and Haemophilus haemolyticus (in 12 labs). They are not common microorganisms in the laboratory environment and might originate from mismatches in sequence alignment and assignment steps. Other bacteria, such as Streptococcus mitis, Haemophilus parainfluenzae, Streptococcus oralis, Acinetobacter pittii and Klebsiella oxytoca, are common microorganisms in respiratory tract samples. They might be reported as a result of laboratory contamination or cross-contamination with other clinical respiratory specimens. The reported Streptococcus pseudopneumoniae and Shigella flexneri are genetically similar microorganisms to S. pneumoniae and E. coli in sample S1, respectively. Their misreporting might be caused by an inadequate resolution of the bioinformatics algorithms in the relevant laboratories.

Performance in detecting low-microbial-biomass samples

We compared the performance of the mNGS assay in testing three low-microbial-biomass samples (S2, S3 and S4) with concentrations of 500 cell/ml, 100 cell/ml and 10 cell/ml, respectively. For most microbes, the number of laboratories with positive results decreased as the microbial concentration decreased (Fig. 2a). When we compared the effects of various approaches used in different laboratories on low-microbial-biomass samples, we found that the detection rate of every microbe was lower in the laboratories using a host depletion approach (depletion group) than in those without host depletion (nondepletion group) (supplementary Table S2). Fig. 2b shows that the detection rate of each microbe in sample S2 decreased in the depletion group, with S. pneumoniae, H. influenzae, P. aeruginosa and A. fumigatus decreasing significantly (all p < 0.05). These results indicated that adding a host DNA depletion process in an mNGS protocol might increase the risk of false negatives in testing low-microbial-biomass samples. Similarly, we performed the same analysis on the data of sample S1 and found that the detection rate of most microbes (S. maltophilia, S. epidermidis, S. marcescens and B. fragilis) at low concentrations (1 × 10²–1 × 10³ cell/ml) dropped significantly in the depletion group (supplementary Table S3). However, host DNA depletion did not have a significant effect on the detection rate of four bacteria (H. influenzae, S. pneumoniae, A. baumannii and K. pneumoniae) at high concentrations (1 × 10⁶–1 × 10⁷ cell/ml) (supplementary Table S3, all p > 0.05).
In addition, the data showed that the detection rate of microbes in the three samples was higher in the laboratories with a bead-beating process (bead-beating group) (supplementary Table S4). Fig. 2c shows that the detection rate of H. influenzae, P. aeruginosa and A. fumigatus in sample S2 was significantly increased in the bead-beating group (all p < 0.05). The detection rate of A. fumigatus, a fungus with a hard cell wall, improved the most when a bead-beating process was added. For example, in sample S2, the detection rate of A. fumigatus in the bead-beating group was 62.9% (44/70), while that in the no bead-beating group was only 30% (6/20). When the concentration of A. fumigatus was reduced to 10 cell/ml (sample S4), all the test results of the laboratories in the no bead-beating group were negative, but 37.1% (26/70) of the laboratories in the bead-beating group could still detect this microbe, with a relatively low median RPM value (0.3, IQR 0.1–0.8) (see supplementary dataset 3).
However, when we compared the effects of other variables (extraction strategy, sequencer, alignment/taxonomy assignment tool and database) on the detection rate of microbes in samples, we found that no single approach was better than the others for every microbe (see supplementary dataset 6).

Performance in distinguishing genetically similar organisms

Whether genetically closely related species can be accurately distinguished is an important aspect to evaluate the analytical
Table 2
Methodological variability reported by the participating laboratories.
Item | Number of laboratories
1. Host DNA depletion |
Yes: differential lysis | 24
Yes: magnetic bead-based method that selectively removes host DNA containing CpG methylation | 2
Yes: CRISPR–Cas9-based approaches that deplete abundant sequences | 1
No | 63
2. Bead-beating included in microbial cell disruption? |
Yes | 70
No | 20
3. DNA extraction kit |
TIANGEN | 32
QIAGEN | 9
MGI™ | 8
MAPMI™ | 8
Zymo Research | 4
MagMAX™ | 2
Promega | 1
Other custom kits (16 kinds) | 26
4. Extraction strategy |
Automatic | 7
Manual | 83
5. Sequencer |
Illumina NextSeq 500/550 | 39
MGISEQ-2000/200/50 | 18
Illumina NextSeq CN500 | 11
BioelectronSeq 4000 | 8
Illumina NovaSeq 6000 | 4
Others | 10
6. Sequence alignment or taxonomy assignment |
BWA | 23
SNAP | 13
Kraken2 | 12
Kraken | 10
bowtie | 10
NCBI BLAST | 4
Centrifuge | 2
MetaPhlAn | 1
Not reported | 15
7. Database |
Public database: NCBI nr/nt database | 6
Public database: NCBI RefSeq database | 22
Custom database: Genseq-PDB V0.20.0 | 11
Custom database: PMseqDB v5.3.0 | 11
Custom database: CBPD V3.2 | 8
Custom database: IDseqDB V2.0 | 7
Custom database: Others | 25
8. Method validation |
Done | 51
Not yet | 39

specificity of an mNGS protocol [16,31]. In this study, we analyzed three samples (S5, S6 and S7) that contained two related species in the same genus (S. aureus and S. epidermidis) at concentrations to yield RPM ratios (RPM S. aureus/RPM S. epidermidis) of approximately 10:1, 1:1, and 1:10 (see supplementary dataset 1). The results revealed that, although the two bacteria were both present at a high concentration of 1 × 10⁵ or 1 × 10⁶ cell/ml, not all laboratories successfully detected them. For example, S. aureus and S. epidermidis both reached a concentration of 1 × 10⁶ cell/ml in sample S6, but they were missed by 3 (lab_34, 54 and 80) and 5 (lab_34, 46, 48, 54 and 80) laboratories, respectively (see supplementary dataset 4). In addition, the calculated RPM ratio (RPM S. aureus/RPM S. epidermidis) varied greatly, from 0.02 (lab_7) to 64.4 (lab_49) in sample S5, from 0.002 (lab_7) to 9.8 (lab_50) in sample S6, and from 0.0001 (lab_62) to 5.9 (lab_50) in sample S7 (see supplementary dataset 4). A high level of bias (over- or underestimating > 2-fold) from the initial input ratios (10:1 in sample S5, 1:1 in sample S6, 1:10 in sample S7) was observed in many laboratories. Only 56.6% (43/76) of the laboratories tested for sample S5 had an RPM ratio within the range of a 2-fold change of the initial input ratio (10:1) (Fig. 3a), lower than that for sample S6 (65.9%, 56/85) and that for sample S7 (63.0%, 51/81) (Fig. 3b and 3c). Taken together, these results indicate that the sample processing or data analysis processes in some laboratories need further optimization to improve the accurate discrimination and quantification of similar species in samples.

Performance in identifying true pathogens

At present, there is no guideline as a standard for interpreting mNGS sequencing data. Laboratories give reportable results based on their own independent interpreting criteria [9]. In this study, we designed three fictitious case reports (case reports 1, 2 and 3) and the corresponding simulation samples (samples S8, S9 and S10) to assess the laboratory's ability to identify true pathogens from multimicrobial samples. The reported RPM for every microbe varied greatly from laboratory to laboratory (see supplementary dataset 5, Fig. 4). Only 56.7% (51/90) to 83.3% (75/90) of the participating laboratories made a definite diagnosis for these case reports (Table 3). 32.2% (29/90), 15.6% (14/90) and 12.2% (11/90) of the laboratories had detected the true pathogen but did not give a definitive diagnosis for samples S8, S9, and S10, respectively (Table 3), indicating that although some laboratories can detect the microbes present in the samples, their data analysis teams are not sufficiently able to make a clear etiological diagnosis from the given case information (clinical symptoms, medication history, outcome, etc.). In addition, a few laboratories mistakenly reported microbes that did not exist in the original samples, which will mislead the clinical diagnosis and treatment of the cases.

Discussion

During the initial phase of the survey, we received applications from as many as 90 laboratories to participate in this multicenter quality assessment study, which showed that the translation of mNGS technology into clinical practice to aid infection diagnosis has been advancing rapidly. By designing a sample set containing 15 microbes related to respiratory tract infections, we evaluated the testing performance of all the participating laboratories that have established what they believe is the best way to perform mNGS testing. Analysis of the reported data indicated several issues with mNGS tests under local conditions. A variety of mNGS protocols with different operating procedures, from nucleic acid extraction to result interpretation, were reported by the participating laboratories. However, approximately half (43.3%, 39/90) of the laboratories have not yet completed method validation; high interlaboratory variability was found in the identification and reported RPM value of each microbe in the samples, especially when testing microbes with low concentrations; the data analysis teams of several laboratories did not show sufficient ability to make a clear etiological diagnosis of a case combined with the given patient information; and the analysis of the performance of mNGS in distinguishing genetically similar organisms in three samples revealed that only 56.6% to 63.0% of the laboratories recovered RPM ratios (S. aureus and S. epidermidis) within the range of a 2-fold change in the initial input ratios (indicating a relatively low level of bias).
For mNGS, an important issue at present is the lack of standardized guidelines for method validation. The general requirements for validation of laboratory-developed molecular assays for infec-
[Figure 1 graphic not recoverable from the extracted text. Recoverable labels:
(a) Heat map of detection per laboratory (Lab_1 to Lab_90); legend with the number of laboratories detecting each microbe in sample S1: Haemophilus influenzae (88/90), Streptococcus pneumoniae (85/90), Acinetobacter baumannii (83/90), Klebsiella pneumoniae (85/90), Pseudomonas aeruginosa (69/90), Candida albicans (83/90), Escherichia coli (43/90), Staphylococcus aureus (66/90), Stenotrophomonas maltophilia (36/90), Staphylococcus epidermidis (29/90), Serratia marcescens (23/90), Moraxella catarrhalis (2/90), Fusobacterium nucleatum (43/90), Bacteroides fragilis (38/90).
(b) y-axis: Log2 (RPM value); x-axis: microbes; concentration legend: 1×10⁷, 1×10⁶, 1×10⁵, 1×10⁴, 1×10³, 5×10² and 1×10² cell/ml.
(c) Unexpected microbes in Sample S1, by laboratory.
(d) x-axis: No. of labs (0 to 20); microbes listed: Streptococcus cristatus, Streptococcus parasanguinis, Raoultella planticola, Ralstonia pickettii, Moraxella osloensis, Klebsiella variicola, Klebsiella aerogenes, Citrobacter koseri, Candida tropicalis, Streptococcus suis, Staphylococcus hominis, Shigella flexneri, Enterobacter cloacae, Salmonella enterica, Klebsiella oxytoca, Streptococcus oralis, Haemophilus parainfluenzae, Acinetobacter pittii, Streptococcus pseudopneumoniae, Streptococcus mitis, Haemophilus haemolyticus, Haemophilus aegyptius, Pasteurella multocida.]
Fig. 1. The results of Sample S1 detected by the 90 participating laboratories. (a) The heat map shows the detection of each microbe in sample S1 by each laboratory. In general, mNGS performed poorly in the detection of microbes at low concentrations (S. maltophilia, S. epidermidis, S. marcescens, M. catarrhalis, F. nucleatum, and B. fragilis). The yellow square indicates "detected", and the red square indicates "not detected". (b) The scatter plot shows that the RPM value (converted to log2 value) of each microbe from each laboratory varies widely. Each dot represents a laboratory. This figure also shows that the higher the concentration of a microbe, the higher the median RPM value detected by mNGS. (c) Unexpected microbes were reported by 38 laboratories. (d) The unexpected microbes that occur most frequently in the participating laboratories.
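The summary in Fig. 1b (per-laboratory RPM values on a log2 scale, compared by their medians) can be sketched in a few lines of Python. This is a minimal illustration, assuming RPM is computed as microbe-specific reads divided by total reads and scaled to one million, which is the usual reading of "reads per million reads". The function names and all read counts below are hypothetical and are not data from this study.

```python
import math
import statistics


def rpm(specific_reads, total_reads):
    """Reads per million: microbe-specific reads scaled to 1,000,000 total reads."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return specific_reads * 1_000_000 / total_reads


def summarize(rpm_values):
    """Median and quartiles (Q1, Q3) of the RPM values reported by several labs."""
    q1, _, q3 = statistics.quantiles(rpm_values, n=4)
    return {"median": statistics.median(rpm_values), "q1": q1, "q3": q3}


# Hypothetical (specific reads, total reads) pairs for one microbe in five labs.
lab_counts = [
    (40, 20_000_000),
    (500, 10_000_000),
    (120, 15_000_000),
    (8, 16_000_000),
    (2_000, 25_000_000),
]

rpm_values = [rpm(specific, total) for specific, total in lab_counts]
log2_values = [math.log2(value) for value in rpm_values]  # scale used in Fig. 1b
summary = summarize(rpm_values)
```

Because a laboratory that misses a microbe reports zero specific reads, and log2 of zero is undefined, such laboratories would have to be excluded or handled separately before plotting; how the authors treated them is not stated here.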
tious diseases were not entirely suitable for this new technique, which enables unbiased detection of all potential pathogens in a single test [32]. In this study, approximately 56.7% (51/90) of the laboratories validated their workflows using self-designed mock communities containing a small number of microbes or collected clinical samples. Part or all of the performance metrics, such as analytical sensitivity (limit of detection), precision (repeatability and reproducibility), analytical specificity and accuracy, were evaluated. However, the validation strategies used in most laboratories can hardly be said to be standardized because of the great differences in the selected sample types (sputum, blood, nasopharyngeal swab and bronchoalveolar lavage fluid, etc.), sample sizes (from a few to hundreds), the numbers and type of microbes and the interpretation criteria. To our knowledge, scientists from ARUP Laboratories (Salt Lake City, Utah), IDbyDNA Inc. (Sunnyvale, California), and the University of California, San Francisco, are making efforts to carry out independent validation for the mNGS assay [15]. In 2017, Schlaberg et al. provided two example solutions for validating mNGS workflows that were used for pathogen detection from respiratory secretions and cerebrospinal fluid [15]. They discussed the challenges and provided solutions about assay design, validation of "wet lab" and bioinformatics pipelines, and various quality control (QC) metrics. In 2019, Miller et al. and Blauwkamp et al. first validated the performance of two clinical mNGS assays for testing cerebrospinal fluid (CSF) and plasma samples, respectively, in Clinical Laboratory Improvement Amendments (CLIA)-certified laboratories [16,31]. They thoroughly described the strategies used to evaluate the performance characteristics, including limits of detection, precision, accuracy, interference and stability. In 2020, a study presented the validation strategies of an mNGS protocol enabling simultaneous detection of both DNA and RNA viruses on Illumina HiSeq 4000 and NextSeq 500 sequencing systems using a selection of 25 respiratory pediatric samples [14]. These practices are important to ensure that the developed mNGS protocols are ready for implementation in the clinical laboratory and to provide feasible examples for other laboratories to validate the performance of their in-house mNGS pipelines.
Previous studies have shown that the analytical sensitivity of mNGS for different organism types can be as low as hundreds or fewer copies per test [14,31]. In this study, the detection rate decreased significantly for microbes at a concentration of 1 × 10⁴ cell/ml or less. Less than half of the participating laboratories were able to detect common respiratory microbes at concentrations below 1 × 10³ cell/ml. The validation of a molecular assay is often completed when the testing system is in the best state, and the whole process is subject to the least external interference. Thus, the reported performance metrics, such as the limit of detection, just represent the optimal values of the validated protocol. mNGS is a technique that requires complex multistep processing. The results observed from real-world applications have difficulty reaching the level claimed by method validation because they are easily affected by various variation factors associated with samples, laboratory environment, reagents, consumables and human operation [7]. Increasing studies have made efforts to enhance the detection sensitivity of mNGS for different specimen types by optimizing the procedures of sample preparation [17],
[Figure 2 graphic not recoverable from the extracted text. Panels (a), (b) and (c); y-axis of panel (a): Number of labs with microbe detected.]
Fig. 2. The performance of mNGS in detecting low-microbial-biomass samples. (a) For most microbes, with the decrease of microbial concentration, the number of laboratories with positive results decreased. The impact of host depletion (b) and bead-beating processes (c) on the low-microbial-biomass sample S2 with a concentration of 5 × 10² cell/ml for each microbe.
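The kind of group comparison summarized in Fig. 2c can be illustrated with the A. fumigatus counts given in the text for sample S2: 44 of 70 laboratories with bead-beating detected it, against 6 of 20 without. The sketch below computes both detection rates and a two-sided Fisher's exact test in plain Python. Fisher's exact test is an assumption chosen for illustration, since this section does not say which statistical test produced the reported p values, and the function names are hypothetical.

```python
from math import comb


def detection_rate(detected, total):
    """Fraction of laboratories in a group that detected the microbe."""
    return detected / total


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are groups, columns are detected / not detected. The p value sums
    the probabilities of all tables with the same margins that are no more
    likely than the observed one.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x detections in the first group.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)  # tolerance for float ties
    )


# A. fumigatus in sample S2: bead-beating 44/70, no bead-beating 6/20.
rate_bead = detection_rate(44, 70)
rate_no_bead = detection_rate(6, 20)
p_value = fisher_exact_two_sided(44, 70 - 44, 6, 20 - 6)
```

With these counts the two rates reproduce the 62.9% and 30% quoted in the text, and the test falls below the 0.05 threshold, in line with the significance reported for this comparison.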
[Figure graphic not recoverable from the extracted text. Panels (a), (b) and (c); y-axis: Log2 (RPM value); x-axis: microbe names.]
Fig. 3. The performance of mNGS in detecting three samples (S5, S6 and S7) that contain two genetically similar microbes (S. aureus and S. epidermidis). Scatter plots (a), (b) and (c) show the RPM ratios (RPM S. aureus/RPM S. epidermidis) reported by each laboratory in samples S5, S6, and S7, respectively. Shaded zones indicate a low level of bias (over- or underestimating < 2-fold) from the initial input ratios (10:1 in sample S5, 1:1 in sample S6, 1:10 in sample S7). The violin charts on the right show the quartile distribution of RPM ratios (RPM S. aureus/RPM S. epidermidis) detected by laboratories in each sample.
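The shaded zones in Fig. 3 correspond to a simple rule: an observed RPM ratio counts as low-bias when it lies within a 2-fold change of the input ratio. A minimal sketch of that rule follows, assuming "within 2-fold" means input/2 ≤ observed ≤ input × 2 with inclusive boundaries (the text does not say whether the boundaries are inclusive). The function name is hypothetical; the example values are the sample S5 extremes and median quoted in this article.

```python
import math


def within_fold_change(observed_ratio, input_ratio, fold=2.0):
    """True if observed_ratio is within `fold`-fold of input_ratio.

    The comparison is made on the log2 scale so that over- and
    underestimation are treated symmetrically.
    """
    if observed_ratio <= 0 or input_ratio <= 0:
        return False  # an undetected species gives no usable ratio
    return abs(math.log2(observed_ratio / input_ratio)) <= math.log2(fold)


# Sample S5 (input ratio 10:1): lowest, highest and median reported ratios.
s5_ratios = {"lab_7": 0.02, "lab_49": 64.4, "median": 8.0}
s5_flags = {
    name: within_fold_change(ratio, 10.0) for name, ratio in s5_ratios.items()
}

# Share of laboratories inside the zone for sample S5, as reported: 43 of 76.
s5_share_percent = round(43 / 76 * 100, 1)
```

Under this rule the median S5 ratio of 8.0 falls inside the zone, while both extremes (0.02 and 64.4) fall outside it.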
[Figure graphic not recoverable from the extracted text. Recoverable labels: panel (a) Sample S5, S. aureus : S. epidermidis = 10:1, median = 8.0; panel (b) Sample S6, S. aureus : S. epidermidis = 1:1, median = 0.9; panel (c) Sample S7, S. aureus : S. epidermidis = 1:10, median = 0.1. y-axis: RPM ratio (RPM S. aureus / RPM S. epidermidis); x-axis: Labs.]
Fig. 4. The performance of mNGS in distinguishing true pathogens from multimicrobial samples. Scatter plots (a), (b) and (c) show the RPM value (converted to log2
value) of each microbe reported by laboratories in samples S2, S3, and S4, respectively. Each black dot represents a laboratory.
nucleic acid extraction and purification [33], library preparation [34], sequencing amount/depth [35] and bioinformatics tools [20,21,36]. The interference of host DNA in clinical samples and the different efficiency of nucleic acid extraction kits to different microbial DNA are considered to be two critical obstacles that affect the sequencing results. Adding the processes of host DNA depletion [28,37] and bead-beating before nucleic acid extraction are two conventional methods to effectively increase the yield of microbial DNA in samples. Contrary to expectations, in this study, we found that introducing a host DNA depletion process reduced the detection rate of low-microbial-biomass samples. We believe that there are two possible reasons for the negative effects of host DNA depletion on low-microbial-biomass samples. The first one is that the PBS wash, centrifugation, resuspension, and recovery operations related to the host DNA depletion approach will cause an absolute loss of microbe cells. A greater number of operations
Table 3 that mechanical wall breaking can increase the yield of microbes at
The result interpretation for three case reports by each participating laboratory. low concentrations, especially for fungi with hard cell walls, such
Labs Unexpected microbes& as A. fumigatus.
Case Report 1- the result How to interpret and report the sequencing results is a topic
interpretation of Sample S8 with inconclusive answers. Facing clinical samples with complex
1. Definitely diagnosed as A. 56.7% backgrounds and large biological variations, it is very challenging
fumigatus infection. (51/ or even impossible to set a general standard for interpreting the
90)
2. Did not give a conclusive
results of mNGS sequencing data. Not only comprehensive techni-
diagnosis: cal skills but also bioinformatic, biological and medical knowledge
is of paramount importance for the proper analysis of mNGS data and the identification of true pathogens. As we reviewed in a previous report, different laboratories have developed their own criteria, drawing on metrics such as the specific read ranking, normalized read numbers (such as RPM), genome coverage, scoring algorithms, and the pathogenicity and clinical relationships of the detected microbes [9]. In this study, 84.4% (76/90) of the participants were independent third-party testing laboratories. These laboratories often have strong bioinformatics teams but lack front-line physicians and microbiologists with rich clinical experience. When reporting of the sequencing results was left to the discretion of their data analysis teams, 32.2% (29/90), 15.6% (14/90) and 12.2% (11/90) of the participating laboratories detected the true pathogen but did not give a definitive diagnosis for the three simulated case-related samples (S8, S9, and S10), respectively (Table 3). Thus, experienced clinical microbiologists and clinicians should lead the interpretation of mNGS data.

Well-defined reference materials are a prerequisite for multicenter quality assessments [19,38,39]. However, no accredited or certified fit-for-purpose reference materials are available to enable researchers to compare results generated across different mNGS laboratories. In this study, by designing 11 mock communities containing 15 clinical microbes and a background matrix (human cells), we carried out a large-scale multicenter evaluation in the field. Although DNA viruses and RNA viruses were not included, these mock communities of bacteria and fungi did reflect, from several aspects, the issues in the pathogen DNA mNGS protocols performed by routine laboratories. We have included the evaluation of RNA virus mNGS in our follow-up research plan. In addition, the analysis of how well mNGS distinguishes genetically related microbes showed that only 56.6% to 63.0% of the laboratories achieved a relatively low level of bias (<2-fold change) from the original input ratios of S. aureus and S. epidermidis. However, we cannot tell whether the biases come from the wet lab (from sampling to DNA extraction and library preparation to sequencing) or the dry lab (bioinformatics analysis). This question can be further addressed by developing DNA standards for assessing the performance of the processes after nucleic acid extraction, or in silico data sets for assessing bioinformatic performance [15,38].

Table 3 (continued)
Sample S8, 2. Did not give a conclusive diagnosis (continued):
a. Negative report, no pathogen was detected and reported. | 10% (9/90) |
b. A. fumigatus was not detected, unexpected microbes were reported. | 1.1% (1/90) | C. albicans (lab_11)
c. In addition to A. fumigatus, unexpected microbes were also reported. | 3.3% (3/90) | C. albicans (lab_56), P. aeruginosa (lab_75), S. pneumoniae (lab_38)
d. Both A. fumigatus and other microbes initially added to the sample S8 were reported. | 28.9% (26/90) |
Case Report 2 - the result interpretation of Sample S9
1. Definitely diagnosed as C. albicans infection. | 78.9% (71/90) |
2. Did not give a conclusive diagnosis:
a. Negative report, no pathogen was detected and reported. | 4.4% (4/90) |
b. C. albicans was not detected, unexpected microbes were reported. | 1.1% (1/90) | Candida tropicalis (lab_48)
c. In addition to C. albicans, unexpected microbes were also reported. | 2.2% (2/90) | S. epidermidis (lab_46 and lab_40)
d. Both C. albicans and other microbes initially added to the sample S9 were reported. | 13.3% (12/90) |
Case Report 3 - the result interpretation of Sample S10
1. Definitely diagnosed as S. pneumoniae infection. | 83.3% (75/90) |
2. Did not give a conclusive diagnosis:
a. Negative report, no pathogen was detected and reported. | 2.2% (2/90) |
b. S. pneumoniae was not detected, unexpected microbes were reported. | 2.2% (2/90) | S. epidermidis (lab_68), Oral Streptococcus and Streptococcus mutans (lab_72)
c. In addition to S. pneumoniae, unexpected microbes were also reported. | 3.3% (3/90) | H. influenzae (lab_32), Staphylococcus hominis and Acinetobacter lwoffii (lab_40), Citrobacter koseri and Enterobacter cloacae (lab_78)
d. Both S. pneumoniae and other microbes initially added to the sample S10 were reported. | 8.9% (8/90) |
& Unexpected microbes represent microbes that were not initially spiked into the corresponding sample (S8, S9, or S10). If these microbes are reported, the diagnosis will be considered wrong.
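The proportions quoted in the Discussion for laboratories that detected the true pathogen without giving a definitive diagnosis (32.2%, 15.6% and 12.2% for S8, S9 and S10) can be recovered from the Table 3 counts. The short Python sketch below illustrates that arithmetic. It assumes that these laboratories correspond to rows c and d of each case report, which is our reading of the table rather than a definition stated by the authors; the variable and function names are illustrative only.

```python
# Laboratory counts (out of 90) for the "did not give a conclusive diagnosis"
# rows of Table 3. Row labels a-d follow the table.
TOTAL_LABS = 90

not_conclusive = {
    "S8": {"a": 9, "b": 1, "c": 3, "d": 26},
    "S9": {"a": 4, "b": 1, "c": 2, "d": 12},
    "S10": {"a": 2, "b": 2, "c": 3, "d": 8},
}


def detected_but_not_definitive(counts):
    """Assumption: rows c and d are the labs that reported the true pathogen
    together with other microbes, and so gave no definitive diagnosis."""
    return counts["c"] + counts["d"]


def as_percent(n, total=TOTAL_LABS):
    """Percentage of laboratories, rounded to one decimal place."""
    return round(100 * n / total, 1)


summary = {}
for sample, counts in not_conclusive.items():
    n = detected_but_not_definitive(counts)
    summary[sample] = (n, as_percent(n))
    print(f"{sample}: {n}/{TOTAL_LABS} = {summary[sample][1]}%")
```

Under this assumption the sums are 29, 14 and 11 laboratories, which matches the counts given in the text.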
Conclusions
Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

We thank the 90 laboratories for participating actively in our study and for reporting the test results to NCCL on time.
This work was supported by the “AIDS and Hepatitis, and Other Major Infectious Disease Control and Prevention” Program of China under Grant [No. 2018ZX10102001] and the National Natural Science Foundation of China under Grant [No. 81703276].

Appendix A. Supplementary material

Supplementary data to this article can be found online at https://doi.org/10.1016/j.jare.2021.09.011.

References

[1] El Bcheraoui C, Mokdad AH, Dwyer-Lindgren L, Bertozzi-Villa A, Stubbs RW, Morozoff C, et al. Trends and patterns of differences in infectious disease mortality among US counties, 1980–2014. JAMA 2018;319:1248. doi: https://doi.org/10.1001/jama.2018.2089.
[2] Troeger C, Blacker B, Khalil IA, Rao PC, Cao J, Zimsen SRM, et al. Estimates of the global, regional, and national morbidity, mortality, and aetiologies of lower respiratory infections in 195 countries, 1990–2016: A systematic analysis for the Global Burden of Disease Study 2016. Lancet Infect Dis 2018;18(11):1191–210. doi: https://doi.org/10.1016/S1473-3099(18)30310-4.
[3] Zinter MS, Dvorak CC, Mayday MY, Iwanaga K, Ly NP, Mcgarry ME, et al. Pulmonary metagenomic sequencing suggests missed infections in immunocompromised children. Clin Infect Dis 2018. doi: https://doi.org/10.1093/cid/ciy802.
[4] Dubourg G, Raoult D, Fenollar F. Emerging methodologies for pathogen identification in bloodstream infections: An update. Expert Rev Mol Diagn 2019;19(2):161–73. doi: https://doi.org/10.1080/14737159.2019.1568241.
[5] Gabaldón T, Opathy C. Recent trends in molecular diagnostics of yeast infections: From PCR to NGS. Fems Microbiol Rev 2019. doi: https://doi.org/10.1093/femsre/fuz015.
[6] Jain S, Self WH, Wunderink RG, Fakhran S, Balk R, Bramley AM, et al. Community-acquired pneumonia requiring hospitalization among U.S. adults. N Engl J Med 2015;373(5):415–27. doi: https://doi.org/10.1056/NEJMoa1500245.
[7] Bryson AL, Miller SA, Filkins LM, Mitchell SL. Navigating clinical utilization of direct-from-specimen metagenomic pathogen detection: Clinical applications, limitations, and testing recommendations. Clin Chem (Baltimore, Md.) 2020;66:1381–95. doi: https://doi.org/10.1093/clinchem/hvaa183.
[8] Muller WJ, Chaudhury S. Utility of metagenomic next generation sequencing (mNGS) of plasma for infectious pathogens. Clin Infect Dis 2020. doi: https://doi.org/10.1093/cid/ciaa218.
[9] Han D, Li Z, Li R, Tan P, Zhang R, Li J. MNGS in clinical microbiology laboratories: On the road to maturity. Crit Rev Microbiol 2019:1–18. doi: https://doi.org/10.1080/1040841X.2019.1681933.
[10] Charalampous T, Kay GL, Richardson H, Aydin A, Baldan R, Jeanes C, et al. Nanopore metagenomics enables rapid clinical diagnosis of bacterial lower respiratory infection. Nat Biotechnol 2019;37(7):783–92. doi: https://doi.org/10.1038/s41587-019-0156-5.
[11] Xie Y, Du J, Jin W, Teng X, Cheng R, Huang P, et al. Next generation sequencing for diagnosis of severe pneumonia: China, 2010–2018. J Infection 2019;78(2):158–69. doi: https://doi.org/10.1016/j.jinf.2018.09.004.
[12] Zhang Yi, Ai J-W, Cui P, Zhang W-H, Wu H-L, Ye M-Z. A cluster of cases of pneumocystis pneumonia identified by shotgun metagenomics approach. J Infection 2019;78(2):158–69. doi: https://doi.org/10.1016/j.jinf.2018.08.013.
[13] Graf EH, Simmon KE, Tardif KD, Hymas W, Flygare S, Eilbeck K, et al. Unbiased detection of respiratory viruses by use of RNA Sequencing-Based metagenomics: A systematic comparison to a commercial PCR panel. J Clin Microbiol 2016;54(4):1000–7. doi: https://doi.org/10.1128/JCM.03060-15.
[14] van Boheemen S, van Rijn AL, Pappas N, Carbo EC, Vorderman RHP, Sidorov I, et al. Retrospective validation of a metagenomic sequencing protocol for combined detection of RNA and DNA viruses using respiratory samples from pediatric patients. J Mol Diagn 2020;22(2):196–207. doi: https://doi.org/10.1016/j.jmoldx.2019.10.007.
[15] Schlaberg R, Chiu CY, Miller S, Procop GW, Weinstock G. Validation of metagenomic Next-Generation sequencing tests for universal pathogen detection. Arch Pathol Lab Med 2017;141(6):776–86. doi: https://doi.org/10.5858/arpa.2016-0539-RA.
[16] Blauwkamp TA, Thair S, Rosen MJ, Blair L, Lindner MS, Vilfan ID, et al. Analytical and clinical validation of a microbial cell-free DNA sequencing test for infectious disease. Nat Microbiol 2019;4(4):663–74. doi: https://doi.org/10.1038/s41564-018-0349-6.
[17] Bachmann NL, Rockett RJ, Timms VJ, Sintchenko V. Advances in clinical sample preparation for identification and characterization of bacterial pathogens using metagenomics. Front Public Health 2018;6. doi: https://doi.org/10.3389/fpubh.2018.00363.
[18] Quince C, Walker AW, Simpson JT, Loman NJ, Segata N. Shotgun metagenomics, from sampling to analysis. Nat Biotechnol 2017;35:833.
[19] Han D, Gao P, Li R, Tan P, Xie J, Zhang R, et al. Multicenter assessment of microbial community profiling using 16S rRNA gene sequencing and shotgun metagenomic sequencing. J Adv Res 2020;26:111–21. doi: https://doi.org/10.1016/j.jare.2020.07.010.
[20] Yan Q, Wi YM, Thoendel MJ, Raval YS, Greenwood-Quaintance KE, Abdel MP, et al. Evaluation of the CosmosID bioinformatics platform for prosthetic Joint-Associated sonicate fluid shotgun metagenomic data analysis. J Clin Microbiol 2019;57(2). doi: https://doi.org/10.1128/JCM.01182-18.
[21] Thoendel M, Jeraldo P, Greenwood-Quaintance KE, Yao J, Chia N, Hanssen AD, et al. Comparison of three commercial tools for metagenomic shotgun sequencing analysis. J Clin Microbiol 2020;58(3). doi: https://doi.org/10.1128/JCM.00981-19.
[22] Miao Q, Ma Y, Wang Q, Pan J, Zhang Y, Jin W, Yao Y, Su Y, Huang Y, Wang M, Li B, Li H, Zhou C, Li C, Ye M, Xu X, Li Y, Hu B. Microbiological diagnostic performance of metagenomic next-generation sequencing when applied to clinical practice. Clin Infect Dis 2018;67:S231–S240. doi: https://doi.org/10.1093/cid/ciy693.
[23] Han D, Li R, Shi J, Tan P, Zhang R, Li J. Liquid biopsy for infectious diseases: A focus on microbial cell-free DNA sequencing. Theranostics 2020;10(12):5501–13. doi: https://doi.org/10.7150/thno.45554.
[24] Chiu CY, Miller SA. Clinical metagenomics. Nat Rev Genet 2019;20(6):341–55. doi: https://doi.org/10.1038/s41576-019-0113-7.
[25] Nicholls SM, Quick JC, Tang S, Loman NJ. Ultra-deep, long-read nanopore sequencing of mock microbial community standards. bioRxiv 2018:487033. doi: https://doi.org/10.1101/487033.
[26] Singer E, Andreopoulos B, Bowers RM, Lee J, Deshpande S, Chiniquy J, et al. Next generation sequencing data of a defined microbial mock community. Sci Data 2016;3(1). doi: https://doi.org/10.1038/sdata.2016.81.
[27] Hornung BVH, Zwittink RD, Kuijper EJ. Issues and current standards of controls in microbiome research. Fems Microbiol Ecol 2019;95. doi: https://doi.org/10.1093/femsec/fiz045.
[28] Oechslin CP, Lenz N, Liechti N, Ryter S, Agyeman P, Bruggmann R, et al. Limited correlation of shotgun metagenomics following host depletion and routine diagnostics for viruses and bacteria in low concentrated surrogate and clinical samples. Front Cell Infect Mi 2018;8. doi: https://doi.org/10.3389/fcimb.2018.00375.
[29] Bruggeling C, Garza DR, Achouiti S, Mes W, Dutilh BE, Boleij A. Optimized DNA isolation method for microbiome analysis of human tissues. bioRxiv 2020. doi: https://doi.org/10.1101/2020.08.25.267641.
[30] Wei MP, Qiu JD, Li L, Xie YF, Guo YH, Yu H, Cheng YL, Qian H, Yao WR. The chemical profile and biological activity of different extracts of Sapindus mukorossi Gaertn. against Cutibacterium acnes. Nat Prod Res 2020:1–6. doi: https://doi.org/10.1080/14786419.2020.1715399.
[31] Miller S, Naccache SN, Samayoa E, Messacar K, Arevalo S, Federman S, et al. Laboratory validation of a clinical metagenomic sequencing assay for pathogen detection in cerebrospinal fluid. Genome Res 2019;29(5):831–42. doi: https://doi.org/10.1101/gr.238170.118.
[32] Burd EM. Validation of Laboratory-Developed molecular assays for infectious diseases. Clin Microbiol Rev 2010;23(3):550–76. doi: https://doi.org/10.1128/CMR.00074-09.
[33] Yang F, Sun J, Luo H, Ren H, Zhou H, Lin Y, et al. Assessment of fecal DNA extraction protocols for metagenomic studies. Gigascience 2020;9. doi: https://doi.org/10.1093/gigascience/giaa071.
[34] Xu Y, Lewandowski K, Lumley S, Pullan S, Vipond R, Carroll M, et al. Detection of viral pathogens with multiplex nanopore MinION sequencing: Be careful with cross-talk. Front Microbiol 2018;9. doi: https://doi.org/10.3389/fmicb.2018.02225.
[35] Ni J, Yan Q, Yu Y. How much metagenomic sequencing is enough to achieve a given goal? Sci Rep-Uk 2013;3:1968. doi: https://doi.org/10.1038/srep01968.
[36] Brinkmann A, Andrusch A, Belka A, Wylezich C, Höper D, Pohlmann A, et al. Proficiency testing of virus diagnostics based on bioinformatics analysis of simulated in silico high-throughput sequencing data sets. J Clin Microbiol 2019;57(8). doi: https://doi.org/10.1128/JCM.00466-19.
[37] Hasan MR, Rawat A, Tang P, Jithesh PV, Thomas E, Tan R, et al. Depletion of human DNA in spiked clinical specimens for improvement of sensitivity of pathogen detection by next-generation sequencing. J Clin Microbiol 2016;54(4):919–27. doi: https://doi.org/10.1128/JCM.03050-15.
[38] Hardwick SA, Deveson IW, Mercer TR. Reference standards for next-generation sequencing. Nat Rev Genet 2017;18(8):473–84. doi: https://doi.org/10.1038/nrg.2017.44.
[39] Amos GCA, Logan A, Anwar S, Fritzsche M, Mate R, Bleazard T, et al. Developing standards for the microbiome field. Microbiome 2020;8(1). doi: https://doi.org/10.1186/s40168-020-00856-3.