
Supporting Information

Expanding Urinary Metabolite Annotation through Integrated Mass Spectral Similarity Networking

Fausto Carnevale Neto*†, Daniel Raftery*†,‡


† Northwest Metabolomics Research Center, Department of Anesthesiology and Pain
Medicine, University of Washington, 850 Republican Street, Seattle, Washington 98109, United States.

‡ Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington 98109,
United States.

* Corresponding Authors

Fausto Carnevale Neto, PhD.


Northwest Metabolomics Research Center
Department of Anesthesiology and Pain Medicine
University of Washington
850 Republican St.
Seattle, WA 98109
Tel: +1 (206) 685-4753
Fax: 206-616-4819
Email: fcneto@uw.edu

Daniel Raftery, PhD.


Northwest Metabolomics Research Center
Department of Anesthesiology and Pain Medicine
University of Washington
850 Republican St.
Seattle, WA 98109
Tel: +1 (206) 543-9707
Fax: 206-616-4819
Email: draftery@uw.edu

Table of Contents
Data availability
Supporting Methods
Sample preparation
Q-ToF/MS parameters
Data processing on Progenesis QI
Illustration
Supporting Results and Discussion
Supporting References
Supporting Figures

Data availability

The default parameters of the molecular networks can be accessed at:

https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=40b27881a05d46c5a5ec5fa2cb40883a

https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=db2f91898a10405b8fafca9e18b49d42

https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=12eddda60e8c486fa52123d0d4c672db

https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=7df34d835dcf4d4cac38675cf84f187d

https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=641425d3178849808489563681ce2cdc

https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=d107a73c22e142c6bdfc77e60209a701

https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=692236f015914501a9b0f2286c1df47f

https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=39ed9b19f2e348f1b45bb8c78e11c16d

https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=a0566b225de640e8a89c63579debf9c2

https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=2e7be69345654eb0ad7e3d15a9766698

https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=b86d90fc64754e1abdb5c4a3b30b1561

https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=421cc3d8538445c785150e74687fd829

Supporting Methods

Sample preparation. Stored urine samples were thawed, and 50 µL of each sample was aliquoted into 2
mL Eppendorf tubes, mixed with 250 µL methanol, and vortexed for 10 s. The solutions were kept at −20
°C for 20 min and centrifuged at 14,000 × g and 4 °C for 15 min (Microfuge 20R, Beckman Coulter Inc.,
Indianapolis, IN). Then, 200 µL of each supernatant was transferred into a new 2 mL Eppendorf tube, dried
under vacuum (~1.5 h) at 30 °C using an Eppendorf Vacufuge (Brinkmann Instruments, Westbury, NY), and
reconstituted with 200 µL ACN/H2O (6:4, v/v). The samples were vortexed for 10 s and centrifuged at
14,000 × g for 5 min, and 160 µL of each supernatant was transferred into an LC vial for LC-MS analysis.

Q-ToF/MS parameters. The ESI conditions were as follows: ESI positive and negative modes; voltage
3.8 kV; cone voltage 40 V; source temperature 120 °C; desolvation temperature 325 °C; cone gas flow 20
L h⁻¹; nebulizer pressure 45 psi; desolvation gas flow 600 L h⁻¹. Nitrogen was used as both the drying and
collision gas, with a collision energy of 15 eV, an MS scan rate of 5.0 spectra s⁻¹, and an MS/MS scan rate
of 3.0 spectra s⁻¹ across m/z 50–1000. Two precursor ions were selected per cycle (relative threshold >200
counts), excluding ions previously detected in blank injections (initial mobile phase). The MS was
calibrated using purine (2 μM) and Agilent’s HP-921 calibration solution. Data were acquired using
Agilent’s MassHunter Data Acquisition Workstation v. B.06.01.6157 software and stored in both profile
and centroid formats.

Data processing on Progenesis QI. Peak alignment was carried out using the pooled QCs as the
reference. Peak picking was performed using sensitivity = 1 and a chromatographic peak width of 0.05
min. The retention time window was set to 1.2–14.8 min. The resulting data were filtered using a
+1 charge state in positive ion mode (−1 in negative ion mode). Possible adduct ions were defined based
on previous knowledge of untargeted LC-MS analysis of urine, as follows: [M + H]+, [M + Na]+, [M + NH4]+,
[2M + H]+, [2M + Na]+, [2M + NH4]+, [M + H – H2O]+, and [M + H – 2H2O]+ in (+)-ESI mode, and [M – H]–,
[M + CH3COO – H]–, [M – H2O – H]–, [M + Cl]–, [2M – H]–, [M + Na – 2H]–, [M + K – 2H]–, and
[2M + CH3COO – H]– in (−)-ESI mode. The adduct ions were grouped into mass
features through peak deconvolution. For peak annotation using full scan (MS1) data, we used an m/z
tolerance of 10 ppm, an isotopic similarity threshold of 90%, and an elemental composition filter
consisting of C, H, O, N, P, S, Cl, and F. For DDA-MS2 annotation, we used a precursor ion tolerance of
15 ppm, an isotopic similarity threshold of 90%, and a fragment tolerance of 30 ppm.
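
The adduct definitions and ppm tolerances described above can be illustrated with a minimal Python sketch. It is not the Progenesis QI implementation: the adduct table covers only a subset of the adducts listed, the function names are hypothetical, and hippuric acid (C9H9NO3, monoisotopic mass 179.058243 Da) is used only as a convenient example compound.

```python
# Illustrative sketch only; function names and the adduct subset are assumptions.
PROTON = 1.007276  # mass of H+ in Da

# adduct label -> (number of molecules M, mass shift in Da)
ADDUCTS = {
    "[M+H]+": (1, PROTON),
    "[M+Na]+": (1, 22.989221),
    "[M+NH4]+": (1, 18.033826),
    "[2M+H]+": (2, PROTON),
    "[M+H-H2O]+": (1, PROTON - 18.010565),
    "[M-H]-": (1, -PROTON),
    "[M+Cl]-": (1, 34.969401),
    "[2M-H]-": (2, -PROTON),
}


def adduct_mz(neutral_mass, adduct):
    """Theoretical m/z of a singly charged adduct of a neutral molecule."""
    n_molecules, shift = ADDUCTS[adduct]
    return n_molecules * neutral_mass + shift


def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz, theoretical_mz, tol_ppm=10.0):
    """True if the observed m/z lies within +/- tol_ppm of the theoretical m/z."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


HIPPURIC_ACID = 179.058243  # C9H9NO3, monoisotopic neutral mass (Da)

if __name__ == "__main__":
    for label in ADDUCTS:
        print(f"{label:12s} {adduct_mz(HIPPURIC_ACID, label):.4f}")
```

With the 10 ppm MS1 tolerance, a feature observed near m/z 180.066 would be accepted as a candidate [M + H]+ ion of hippuric acid, whereas one near m/z 180.070 would not. The 15 ppm precursor and 30 ppm fragment tolerances for MS2 annotation can be tested the same way by changing `tol_ppm`.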

Illustration. All figures were prepared on a Dell Latitude 7400 with an Intel(R) Core(TM) i5-8265U
CPU (1.60 GHz, 1800 MHz, 4 cores, 8 logical processors) running Microsoft Windows 10 Enterprise
ver. 10.0.19042. The MNs were formatted and exported from Cytoscape (ver. 3.2.1). The network was
organized with Cytoscape’s organic layout, node colors were mapped based on the source files of the
MS2 data, and the edge thickness attribute was defined to reflect cosine similarity scores, with thicker
lines indicating higher similarity. Cytoscape style parameters and the session files (.cys) are available
from the authors upon request. The Venn diagrams were plotted with InteractiVenn (1). The doughnut and pie
charts were built in Microsoft Excel 2016 MSO ver. 16.0.4266.1001. The chemical structures were
drawn in ChemDraw ver. 19.0.0.22. The figures were finalized in Inkscape 0.92.4 and exported
in .svg and .pdf formats.
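
The mapping of cosine similarity scores to edge thickness can be sketched as follows. This is a simplified illustration, not the GNPS or Cytoscape implementation: it computes a plain cosine score with greedy peak matching (GNPS uses a modified cosine that also allows peaks shifted by the precursor mass difference), and the fragment tolerance of 0.02 Da and the line-width range are assumed values.

```python
import math


def cosine_score(spec_a, spec_b, tol=0.02):
    """Plain cosine score between two centroided spectra.

    Each spectrum is a list of (mz, intensity) tuples. Peaks are paired
    greedily, highest intensity product first, when their m/z values
    agree within `tol` Da (an assumed tolerance).
    """
    candidates = [
        (int_a * int_b, i, j)
        for i, (mz_a, int_a) in enumerate(spec_a)
        for j, (mz_b, int_b) in enumerate(spec_b)
        if abs(mz_a - mz_b) <= tol
    ]
    candidates.sort(reverse=True)

    used_a, used_b, dot = set(), set(), 0.0
    for product, i, j in candidates:
        if i not in used_a and j not in used_b:
            used_a.add(i)
            used_b.add(j)
            dot += product

    norm_a = math.sqrt(sum(intensity ** 2 for _, intensity in spec_a))
    norm_b = math.sqrt(sum(intensity ** 2 for _, intensity in spec_b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def edge_width(score, min_width=1.0, max_width=8.0):
    """Linearly map a cosine score in [0, 1] to a line width (assumed range)."""
    score = min(max(score, 0.0), 1.0)
    return min_width + score * (max_width - min_width)
```

A continuous mapping of this kind makes edges between highly similar spectra visually dominant, so tightly related nodes stand out within a cluster.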

Supporting Results and Discussion

A large cluster in the (+)-ESI MN, shown in Figure S-11A, provided only one candidate hit by GNPS
MS2 spectral matching, phenylacetylglutamine. Metabolite annotation by Progenesis QI of the mass features
corresponding to the nodes that form this cluster led to three correct assignments based on MS2 library
matching out of 12 candidate hits: serine, asparagine, and the dipeptide glutamylarginine (Figure S-11A-B).
None of the eight candidate hits suggested using the full scan MS1 data was a correct identification.
Using MolNetEnhancer, we were able to visualize a Mass2Motif related to glutamine propagated
throughout the cluster. MotifDB characterized this substructure pattern based on the main ions m/z 147,
130, 84, and 56. Peripheral nodes forming distinct regions within the MN cluster also resulted in
additional uncharacterized Mass2Motifs. The NAP-Fusion in silico structure annotation was re-ranked
according to the candidate hit phenylacetylglutamine and suggested the presence of di- and tri-glutamyl
peptides, guiding the annotation of 13 nodes after manual curation. ClassyFire assisted the metabolite
annotation of other nodes by predicting “amino acids and derivatives” as the most frequent class among
the nodes in this cluster. At the direct parent level, these nodes were classified as “N-acyl-alpha amino
acids”, i.e., compounds containing an alpha amino acid which bears an acyl group at its terminal nitrogen
atom. After manual curation, it was possible to identify 30 di- and tri-peptides previously detected in
urine samples, but not necessarily available in the MS2 spectral databases (2).
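
As a rough illustration of how the glutamine-related fragment pattern can be recognized in an MS2 spectrum, the sketch below counts how many of the ions m/z 147, 130, 84, and 56 are present. It is a deliberate simplification: MS2LDA infers Mass2Motifs probabilistically from co-occurring fragments and neutral losses, whereas this hypothetical helper only performs a presence check against nominal m/z values with an assumed tolerance of 0.5 Da.

```python
# Nominal m/z values of the glutamine-related pattern reported in the text.
GLUTAMINE_MOTIF = (147.0, 130.0, 84.0, 56.0)


def motif_overlap(fragment_mzs, motif=GLUTAMINE_MOTIF, tol=0.5):
    """Return (fraction of motif ions found, list of matched motif ions).

    `fragment_mzs` is an iterable of fragment m/z values from one MS2
    spectrum; `tol` is an assumed nominal-mass tolerance in Da.
    """
    fragments = list(fragment_mzs)
    matched = [
        ion for ion in motif
        if any(abs(mz - ion) <= tol for mz in fragments)
    ]
    return len(matched) / len(motif), matched
```

A node whose spectrum contains most of these ions is a reasonable candidate for carrying the glutamine substructure, although such a check cannot replace the motif inference or manual curation.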

In one cluster formed by seven nodes in the (−)-ESI MN, as illustrated in Figure S-12A-B, the GNPS spectral
library led to the annotation of two nodes as hydroxyhippuric acid with two distinct retention times,
suggesting hydroxylation at different positions of the aromatic ring. MS2LDA extracted one motif related
to the neutral elimination of CO2. This fragmentation pattern is characteristic of organic molecules
containing carboxylic acids (Figure S-12A). The NAP annotation tool predicted the chemical structures of
two additional nodes as methoxyhippuric acid after using MetFusion to re-rank the scoring. A third node
was tentatively annotated by NAP prediction as isopropylmalic acid; however, since this node was
directly connected only to nodes without GNPS candidate hits or NAP predictions, it was not possible to
use Fusion scoring to recalculate the predictions based on the network propagation. Manual
correspondence between GNPS nodes and Progenesis QI mass features indicated that only one compound
was annotated by this conventional approach (Figure S-12B). Similar to the NAP prediction, Progenesis QI
suggested the metabolite isopropylmalic acid using both the full scan MS1 data, based on high-resolution
m/z and isotopic pattern, and the MS2 spectrum (MS1 [M – H]– m/z 175.0615, 1.8 ppm, isotopic similarity
96.7; MS2 fragment matching score 14.7). After manual inspection of the raw data, we verified that the
precursor ion m/z 175 was a contaminant not removed by the filtering steps; its MS2 spectrum also contained
the main fragments of the neighboring node m/z 178, i.e., m/z 134 (elimination of CO2) and m/z 77. The two
remaining nodes were identified as hippuric acid based on manual curation, which was supported by the
motif.
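
The two checks discussed in this paragraph, the neutral loss of CO2 and the MS1 mass error, can be reproduced with a short sketch. The helper names and the 0.01 Da fragment tolerance are assumptions, and the fragment m/z values for hippurate are theoretical values calculated from elemental formulas rather than measured data.

```python
CO2 = 43.989829     # monoisotopic mass of CO2 (Da)
PROTON = 1.007276   # mass of H+ (Da)

# C7H12O5, monoisotopic neutral mass of isopropylmalic acid (Da)
ISOPROPYLMALIC_ACID = 176.068473


def shows_neutral_loss(precursor_mz, fragment_mzs, loss=CO2, tol=0.01):
    """True if any fragment lies at precursor_mz - loss within `tol` Da."""
    target = precursor_mz - loss
    return any(abs(mz - target) <= tol for mz in fragment_mzs)


def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


if __name__ == "__main__":
    # Hippurate [M - H]- with calculated fragments at m/z 134 and 77.
    print(shows_neutral_loss(178.0510, [134.0611, 77.0397]))
    # Mass error of the observed m/z 175.0615 against isopropylmalate [M - H]-.
    print(round(ppm_error(175.0615, ISOPROPYLMALIC_ACID - PROTON), 1))
```

The mass error obtained for m/z 175.0615 is below 2 ppm, close to the 1.8 ppm reported by Progenesis QI. This shows why an accurate MS1 match alone was not sufficient here: the ion satisfied the tolerance yet proved to be a contaminant after inspection of the raw data.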

Supporting References

(1) Heberle, H.; Meirelles, G. V.; da Silva, F. R.; Telles, G. P.; Minghim, R. InteractiVenn: A Web-
Based Tool for the Analysis of Sets through Venn Diagrams. BMC Bioinformatics 2015, 16 (1),
169. https://doi.org/10.1186/s12859-015-0611-3.
(2) Jandke, J.; Spiteller, G. Dipeptide Analysis in Human Urine. J. Chromatogr. B Biomed. Sci. Appl.
1986, 382, 39–45.

Supporting Figures

Figure S-1. Metabolite annotation through Progenesis QI and GNPS workflows. Overlap of the
putative annotations proposed by Feature-based Molecular Network (FBMN), Classical molecular network
(MN), and Progenesis QI in A) (−)-ESI and B) (+)-ESI modes. The annotations shown had not yet been
validated by manual curation.

Figure S-2. Global mass spectral molecular network of urine samples acquired in ESI negative mode. Molecular network colored by motif
interaction between nodes, retrieved through the MolNetEnhancer workflow and MS2LDA. Each node represents a group of similar MS2 spectra.
The connection indicates the spectral similarity between the nodes.

Figure S-3. Global mass spectral molecular network of urine samples acquired in ESI positive mode. Molecular network colored by motif
interaction between nodes, retrieved through the MolNetEnhancer workflow and MS2LDA. Each node represents a group of similar MS2 spectra.
The connection indicates the spectral similarity between the nodes.

Figure S-4. Global mass spectral molecular network of urine samples acquired in ESI negative mode.
Molecular network colored by putative chemical “superclass level” retrieved through the
MolNetEnhancer workflow and ClassyFire.

Figure S-5. Global mass spectral molecular network of urine samples acquired in ESI positive mode.
Molecular network colored by putative chemical “superclass level” retrieved through the
MolNetEnhancer workflow and ClassyFire.

Figure S-6. Global mass spectral molecular network of urine samples acquired in ESI negative mode.
Molecular network colored by putative chemical “direct parent” level retrieved through the
MolNetEnhancer workflow and ClassyFire.

Figure S-7. Global mass spectral molecular network of urine samples acquired in ESI positive mode.
Molecular network colored by putative chemical “direct parent” level retrieved through the
MolNetEnhancer workflow and ClassyFire.

Figure S-8. Metabolite annotation through MolNetEnhancer workflow. A) Overlapped annotation of
NAP-Fusion, MetFrag and Consensus in silico tools. B) Categorization of the metabolites annotated by
GNPS MS2 library search. Metabolites are classified according to the “Superclass level” from the
ClassyFire ranking system.

Figure S-9. Metabolic annotation of ten urine samples analyzed by HILIC-(-)-ESI-MS/MS (DDA),
according to “direct parent level” of the ClassyFire ranking system. The internal pie chart provides a
better view of the chemical classes that compose the “others” group.

Figure S-10. Metabolic annotation of ten urine samples analyzed by HILIC-(+)-ESI-MS/MS (DDA),
according to “direct parent level” of the ClassyFire ranking system. The internal pie chart provides a
better view of the chemical classes that compose the “others” group.

Figure S-11. Spectral cluster observed in ESI positive mode and formed by amino acids and peptides. A)
MN layout representing the Mass2motifs interactions that group the nodes. B) Bar plot with the number
of annotated nodes according to the different approaches. Additional MN layouts showing GNPS library
search (blue node), nodes equivalent to mass features annotated by Progenesis QI using fullscan MS1 (red
node) and MS2 (green node), and the chemical classification at the superclass, subclass, and direct parent
levels. ClassyFire also suggested “carboxylic acids and derivatives” at the class level.

Figure S-12. Spectral cluster observed in ESI negative mode and formed by hippuric acid derivatives. A)
MN layout representing the Mass2motifs interactions that group the nodes. B) Bar plot with the number
of annotated nodes according to the different approaches. Additional MN layouts showing GNPS library
search (blue node), nodes equivalent to mass features annotated by Progenesis QI using fullscan MS1 (red
node) and MS2 (green node), and the chemical classification at the superclass and direct parent levels.
ClassyFire suggested “benzene and substituted derivatives” and “benzoic acids and derivatives” at class
and subclass levels, respectively.
