NIHMS1745813 Supplement SI - 01

Supporting Information
Expanding Urinary Metabolite Annotation through Integrated Mass Spectral Similarity Networking
Fausto Carnevale Neto*†, Daniel Raftery*†,‡
† Northwest Metabolomics Research Center, Department of Anesthesiology and Pain
Medicine, University of Washington, 850 Republican Street, Seattle, Washington 98109, United States.
‡ Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington 98109,
United States.
* Corresponding Author
Fausto Carnevale Neto, PhD.

Northwest Metabolomics Research Center
Department of Anesthesiology and Pain Medicine
University of Washington
850 Republican St.
Seattle, WA 98109
Tel: +1 (206) 685-4753
Fax: 206-616-4819
Email: fcneto@uw.edu
Daniel Raftery, PhD.

Northwest Metabolomics Research Center
Department of Anesthesiology and Pain Medicine
University of Washington
850 Republican St.
Seattle, WA 98109
Tel: +1 (206) 543-9707
Fax: 206-616-4819
Email: draftery@uw.edu
Table of Contents
Data availability
Supporting Methods
Sample preparation
Q-ToF/MS parameters
Data processing on Progenesis QI
Illustration
Supporting Results and Discussion
Supporting References
Supporting Figures
Data availability
The default parameters of the molecular networks can be accessed at:
https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=40b27881a05d46c5a5ec5fa2cb40883a
https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=db2f91898a10405b8fafca9e18b49d42
https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=12eddda60e8c486fa52123d0d4c672db
https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=7df34d835dcf4d4cac38675cf84f187d
https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=641425d3178849808489563681ce2cdc
https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=d107a73c22e142c6bdfc77e60209a701
https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=692236f015914501a9b0f2286c1df47f
https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=39ed9b19f2e348f1b45bb8c78e11c16d
https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=a0566b225de640e8a89c63579debf9c2
https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=2e7be69345654eb0ad7e3d15a9766698
https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=b86d90fc64754e1abdb5c4a3b30b1561
https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=421cc3d8538445c785150e74687fd829
Supporting Methods
Sample preparation. Stored urine samples were thawed, and 50 µL of each sample was aliquoted into 2
mL Eppendorf tubes, mixed with 250 µL methanol, and vortexed for 10 s. The solutions were kept at -20
°C for 20 min and centrifuged at 14,000 g and 4 °C for 15 min (Microfuge 20R, Beckman Coulter Inc.,
Indianapolis, IN). Then, 200 µL of each supernatant was transferred into a new 2 mL Eppendorf tube, dried
under vacuum (~1.5 h) at 30 °C using an Eppendorf Vacufuge (Brinkmann Instruments, Westbury, NY), and
reconstituted with 200 µL ACN/H2O (6:4, v/v). The samples were vortexed for 10 s and centrifuged at
14,000 g for 5 min, and then 160 µL of each supernatant was transferred into an LC vial for LC-MS analysis.
Q-ToF/MS parameters. The ESI conditions were as follows: ESI positive and negative modes; voltage
3.8 kV; cone voltage 40 V; source temperature 120 °C; desolvation temperature 325 °C; cone gas flow 20
L h–1; nebulizer pressure 45 psi; desolvation gas flow 600 L h–1. Nitrogen was used as both the drying and
collision gas, with collision energy 15 eV, MS scan rate 5.0 spectra s–1, and MS/MS scan rate 3.0 spectra
s–1 across m/z 50–1000. Two precursor ions were selected per cycle (relative threshold >200 counts),
excluding ions previously detected in blank injections (initial mobile phase). The MS was calibrated
using purine (2 μM) and Agilent’s HP-921 calibration solution. Data were acquired using Agilent’s
MassHunter Data Acquisition Workstation v. B.06.01.6157 software, and stored in both profile and
centroid formats.
Data processing on Progenesis QI. Peak alignment was carried out using the pooled QCs as the
reference. Peak picking was performed using sensitivity = 1 and a chromatographic peak width of 0.05
min. The retention time window was set to 1.2–14.8 min. The resulting data were filtered for the
+1 charge state in positive ion mode (-1 in negative ion mode). Possible adduct ions were defined based on
previous knowledge of untargeted LC-MS analysis of urine, as follows: [M + H]+, [M + Na]+, [M + NH4]+,
[2M + H]+, [2M + Na]+, [2M + NH4]+, [M + H – H2O]+, [M + H – 2H2O]+, in (+)-ESI mode, and [M – H]–,
[M + CH3COO – H]–, [M – H2O – H]–, [M + Cl]–, [2M – H]–, [M + Na – 2H]–, [M + K – 2H]–, [2M +
CH3COO – H]–, and [2M + CH3COO – H]– in (-)-ESI mode. The adduct ions were grouped into mass
features through peak deconvolution. For peak annotation using full scan (MS1) data, we set the m/z
tolerance to 10 ppm, required 90% isotopic similarity, and used an elemental composition filter consisting
of C, H, O, N, P, S, Cl, and F. For DDA-MS2 annotation, we used a precursor ion tolerance of 15 ppm, 90%
isotopic similarity, and a fragment tolerance of 30 ppm.
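To illustrate how a ppm tolerance acts on an adduct assignment, the following minimal Python sketch computes the theoretical m/z of a few of the adducts listed above and the ppm error of an observed ion. It is not the Progenesis QI implementation: the function names and the adduct table are assumptions made for this example, and the mass shifts are standard monoisotopic values. The observed m/z is the isopropylmalic acid value quoted in the Supporting Results and Discussion.

```python
# Mass shifts (Da) added to n*M for singly charged adducts.
# Standard monoisotopic values; the table itself is illustrative.
ADDUCT_SHIFTS = {
    "[M+H]+": 1.007276,
    "[M+Na]+": 22.989218,
    "[M-H]-": -1.007276,
}


def adduct_mz(neutral_mass, shift, n_mol=1):
    """Theoretical m/z of a singly charged adduct of n_mol molecules."""
    return n_mol * neutral_mass + shift


def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz, theoretical_mz, tol_ppm=10.0):
    """True if the absolute ppm error does not exceed tol_ppm."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


# Isopropylmalic acid, C7H12O5, monoisotopic neutral mass 176.068473 Da.
theoretical = adduct_mz(176.068473, ADDUCT_SHIFTS["[M-H]-"])
observed = 175.0615  # value quoted in the Supporting Results
print(f"theoretical m/z = {theoretical:.4f}")
print(f"error = {ppm_error(observed, theoretical):.1f} ppm")
print("passes 10 ppm MS1 filter:", within_tolerance(observed, theoretical))
```

Passing `n_mol=2` gives the dimer forms such as [2M + H]+; the 15 ppm precursor and 30 ppm fragment tolerances would be applied the same way by changing `tol_ppm`.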
Illustration. All figures were prepared on a Dell Latitude 7400 (Intel(R) Core(TM) i5-8265U
CPU, 1.60 GHz, 1800 MHz, 4 cores, 8 logical processors) running Microsoft Windows 10 Enterprise
ver. 10.0.19042. The MNs were formatted and exported from Cytoscape (ver. 3.2.1). The network was
organized with Cytoscape’s organic layout, node colors were mapped to the source files of the
MS2 data, and edge thickness was mapped to the cosine similarity score, with thicker
lines indicating higher similarity. Cytoscape style parameters and the session files (.cys) can be provided
by the authors upon request. The Venn diagrams were plotted with InteractiVenn (1). The doughnut and pie
charts were built in Microsoft Excel 2016 MSO ver. 16.0.4266.1001. The chemical structures were
drawn in ChemDraw ver. 19.0.0.22. The figures were finalized in Inkscape 0.92.4 and exported
in .svg and .pdf formats.
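The idea of mapping a cosine similarity score to edge thickness can be sketched in a few lines of Python. This is an illustration only: the mapping in this work was configured in Cytoscape's style panel, and GNPS molecular networking uses a modified cosine that also accounts for precursor mass differences, whereas the code below computes a plain cosine on binned spectra. The bin width, the width range, and the 0.7 lower score bound are assumed values chosen for the example.

```python
import math


def binned_vector(peaks, bin_width=0.02):
    """Sum intensities of (m/z, intensity) peaks into fixed-width m/z bins."""
    vec = {}
    for mz, intensity in peaks:
        key = round(mz / bin_width)
        vec[key] = vec.get(key, 0.0) + intensity
    return vec


def cosine_similarity(peaks_a, peaks_b, bin_width=0.02):
    """Plain cosine similarity between two binned MS2 spectra (0 to 1)."""
    a = binned_vector(peaks_a, bin_width)
    b = binned_vector(peaks_b, bin_width)
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def edge_width(score, min_score=0.7, min_width=1.0, max_width=8.0):
    """Linearly map a similarity score in [min_score, 1] to a line width."""
    clipped = max(min_score, min(1.0, score))
    fraction = (clipped - min_score) / (1.0 - min_score)
    return min_width + fraction * (max_width - min_width)


# Hypothetical spectra: lists of (m/z, intensity) pairs.
spectrum_a = [(56.05, 20.0), (84.04, 100.0), (130.05, 60.0)]
spectrum_b = [(84.04, 90.0), (130.05, 70.0), (147.08, 15.0)]
score = cosine_similarity(spectrum_a, spectrum_b)
print(f"cosine = {score:.2f}, edge width = {edge_width(score):.1f}")
```

Fixed-width binning can split a peak that falls near a bin boundary; tolerance-based peak matching avoids this at the cost of a slightly longer implementation.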
S5
Supporting Results and Discussion
A large cluster in the (+)-ESI MN, shown in Figure S-11A, provided only one candidate hit by GNPS
MS2 spectral matching, phenylacetylglutamine. Metabolite annotation by Progenesis QI of the mass features
corresponding to the nodes in this cluster led to three correct assignments based on MS2 library
matching out of 12 candidate hits: serine, asparagine, and the dipeptide glutamylarginine (Figure S-11A-B).
None of the eight candidate hits suggested using the full scan MS1 data was a correct identification.
Using MolNetEnhancer, we were able to visualize a Mass2Motif related to glutamine propagated
throughout the cluster. MotifDB characterized this substructure pattern based on the main ions m/z 147,
130, 84, and 56. Peripheral nodes forming distinct regions within the MN cluster also resulted in
additional uncharacterized Mass2Motifs. NAP-Fusion in silico structure annotation was re-ranked
according to the candidate hit phenylacetylglutamine and suggested the presence of di- and tri-glutamyl
peptides, guiding the annotation of 13 nodes after manual curation. ClassyFire assisted the metabolite
annotation of other nodes by predicting “amino acids and derivatives” as the most frequent class among the
nodes in this cluster. At the direct parent level, these nodes were classified as “N-acyl-alpha amino acids”,
i.e., compounds containing an alpha amino acid which bears an acyl group at its terminal nitrogen atom.
After manual curation, it was possible to identify 30 di- and tri-peptides previously detected in urine
samples, but not necessarily available in the MS2 spectral databases (2).
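As a rough illustration of what the glutamine-related motif implies for a single spectrum, the Python sketch below checks how many of the motif's main ions (m/z 147, 130, 84, and 56, as given above) are present in a fragment list. This is a simple presence check at nominal mass, not the MS2LDA topic model that actually extracts Mass2Motifs; the function name, the 0.5 Da tolerance, and the example fragment values are assumptions made for this example.

```python
# Nominal m/z values of the main motif ions quoted in the text.
MOTIF_FRAGMENTS = (147, 130, 84, 56)


def motif_coverage(fragment_mzs, motif=MOTIF_FRAGMENTS, tol_da=0.5):
    """Return (fraction of motif ions found, list of the ions found)."""
    hits = [
        ion for ion in motif
        if any(abs(mz - ion) <= tol_da for mz in fragment_mzs)
    ]
    return len(hits) / len(motif), hits


# Hypothetical fragment list for a node in the cluster.
fragments = [56.05, 84.04, 130.05, 147.08, 265.12]
fraction, found = motif_coverage(fragments)
print(f"{fraction:.0%} of motif ions present: {found}")
```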
In one cluster formed by seven nodes in the (-)-ESI MN, as illustrated in Figure S-12A-B, the GNPS spectral
library led to the annotation of two nodes as hydroxyhippuric acid with two distinct retention times,
suggesting hydroxylation at different positions of the aromatic ring. MS2LDA extracted one motif related
to the neutral elimination of CO2. This fragmentation pattern is characteristic of organic molecules
containing carboxylic acids (Figure S-12A). The NAP annotation tool predicted the chemical structures of
two additional nodes as methoxyhippuric acid, after using MetFusion to re-rank the scoring. A third node
was tentatively annotated by NAP prediction as isopropylmalic acid; however, since this node was
directly connected only to nodes without GNPS candidate hits or NAP predictions, it was not possible to
use Fusion scoring to recalculate the predictions based on the network propagation. Manual
correspondence between GNPS nodes and Progenesis QI mass features indicated that only one compound
was annotated by this conventional approach (Figure S-12B). Similar to the NAP prediction, Progenesis QI
suggested the metabolite isopropylmalic acid using both the full scan MS1 data, based on high-resolution m/z
and isotopic pattern, and the MS2 spectrum (MS1 [M-H]- m/z 175.0615, 1.8 ppm, isotopic similarity
96.7; MS2 frag. matching score 14.7). After manual inspection of the raw data, we verified that the
precursor ion m/z 175 was a contaminant not removed by the filtering steps; its MS2 spectrum also contained
the main fragments of the neighboring node m/z 178, i.e., m/z 134 (elimination of CO2) and m/z 77. The two
remaining nodes were identified as hippuric acid based on manual curation, which was supported by the
motif.
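The CO2 neutral loss that links m/z 178 to m/z 134 can be checked with simple arithmetic, sketched here in Python. The function name and the 0.01 Da tolerance are assumptions made for this example, and the exact m/z values are theoretical values for deprotonated hippuric acid (C9H9NO3) and its fragments, used for illustration rather than measured data.

```python
CO2_MASS = 43.989829  # monoisotopic mass of CO2 (Da)


def has_neutral_loss(precursor_mz, fragment_mzs, loss=CO2_MASS, tol_da=0.01):
    """True if any fragment lies at precursor_mz - loss within tol_da."""
    target = precursor_mz - loss
    return any(abs(mz - target) <= tol_da for mz in fragment_mzs)


# Theoretical values: hippuric acid [M-H]- and two fragments (illustrative).
precursor = 178.0510
fragments = [134.0612, 77.0397]
print("expected fragment m/z:", round(precursor - CO2_MASS, 4))
print("CO2 loss observed:", has_neutral_loss(precursor, fragments))
```

The same check with a different `loss` value (for example 18.010565 for H2O) covers other common neutral eliminations.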
Supporting References
(1) Heberle, H.; Meirelles, G. V.; da Silva, F. R.; Telles, G. P.; Minghim, R. InteractiVenn: A Web-
Based Tool for the Analysis of Sets through Venn Diagrams. BMC Bioinformatics 2015, 16 (1),
169. https://doi.org/10.1186/s12859-015-0611-3.
(2) Jandke, J.; Spiteller, G. Dipeptide Analysis in Human Urine. J. Chromatogr. B Biomed. Sci. Appl.
1986, 382, 39–45.
Supporting Figures
Figure S-1. Metabolite annotation through Progenesis QI and GNPS workflows. A) Overlapped
putative annotation proposed by Feature-based Molecular Network (FBMN), Classical molecular network
(MN), and Progenesis QI in (-)-ESI and B) (+)-ESI modes. Annotation not yet validated based on manual
curation.
Figure S-2. Global mass spectral molecular network of urine samples acquired in ESI negative mode. Molecular network colored by motif
interaction between nodes, retrieved through the MolNetEnhancer workflow and MS2LDA. Each node represents a group of similar MS2 spectra.
The connection indicates the spectral similarity between the nodes.
Figure S-3. Global mass spectral molecular network of urine samples acquired in ESI positive mode. Molecular network colored by motif
interaction between nodes, retrieved through the MolNetEnhancer workflow and MS2LDA. Each node represents a group of similar MS2 spectra.
The connection indicates the spectral similarity between the nodes.
Figure S-4. Global mass spectral molecular network of urine samples acquired in ESI negative mode.
Molecular network colored by putative chemical “superclass level” retrieved through the
MolNetEnhancer workflow and ClassyFire.
Figure S-5. Global mass spectral molecular network of urine samples acquired in ESI positive mode.
Molecular network colored by putative chemical “superclass level” retrieved through the
MolNetEnhancer workflow and ClassyFire.
Figure S-6. Global mass spectral molecular network of urine samples acquired in ESI negative mode.
Molecular network colored by putative chemical “direct parent” level retrieved through the
MolNetEnhancer workflow and ClassyFire.
Figure S-7. Global mass spectral molecular network of urine samples acquired in ESI positive mode.
Molecular network colored by putative chemical “direct parent” level retrieved through the
MolNetEnhancer workflow and ClassyFire.
Figure S-8. Metabolite annotation through MolNetEnhancer workflow. A) Overlapped annotation of
NAP-Fusion, MetFrag and Consensus in silico tools. B) Categorization of the metabolites annotated by
GNPS MS2 library search. Metabolites are classified according to the “Superclass level” from the
ClassyFire ranking system.
Figure S-9. Metabolic annotation of ten urine samples analyzed by HILIC-(-)-ESI-MS/MS (DDA),
according to “direct parent level” of the ClassyFire ranking system. The internal pie chart provides a
better view of the chemical classes that compose the “others” group.
Figure S-10. Metabolic annotation of ten urine samples analyzed by HILIC-(+)-ESI-MS/MS (DDA),
according to “direct parent level” of the ClassyFire ranking system. The internal pie chart provides a
better view of the chemical classes that compose the “others” group.
Figure S-11. Spectral cluster observed in ESI positive mode and formed by amino acids and peptides. A)
MN layout representing the Mass2Motif interactions that group the nodes. B) Bar plot with the number
of annotated nodes according to the different approaches. Additional MN layouts showing GNPS library
search (blue node), nodes equivalent to mass features annotated by Progenesis QI using fullscan MS1 (red
node) and MS2 (green node), and the chemical classification at the superclass, subclass, and direct parent
levels. ClassyFire also suggested “carboxylic acids and derivatives” at the class level.
Figure S-12. Spectral cluster observed in ESI negative mode and formed by hippuric acid derivatives. A)
MN layout representing the Mass2Motif interactions that group the nodes. B) Bar plot with the number
of annotated nodes according to the different approaches. Additional MN layouts showing GNPS library
search (blue node), nodes equivalent to mass features annotated by Progenesis QI using fullscan MS1 (red
node) and MS2 (green node), and the chemical classification at the superclass and direct parent levels.
ClassyFire suggested “benzene and substituted derivatives” and “benzoic acids and derivatives” at class
and subclass levels, respectively.
