Acs Analchem 6b03116
pubs.acs.org/ac
(MCR-ALS).7−9 MCR-ALS has been widely used for second-order data arising from LC with ultraviolet−visible detection (UV−vis),10−14 LC with fluorescence spectroscopy,12,15,16 and with low-resolution mass spectrometry.17,18 MCR-ALS has also recently been utilized for LC-HRMS analyses, including metabolomics;19 however, the advantages that the high-resolution data provide have not been fully realized. In almost all cases, the LC-HRMS data analyzed by MCR-ALS are subjected to binning, a process of grouping the mass intensities into bins within a specific range.3 This is done to reduce the size of the LC-HRMS data, which is very large due to the number of masses in the data set. For example, a mass spectrum with a range from 50 to 1000 amu at intervals of 0.001 amu contains roughly $9.5 \times 10^{5}$ possible mass-to-charge (m/z) values. Binning to intervals of 0.1 amu, for example, reduces the number of possible m/z values to roughly $9.5 \times 10^{3}$.
Even though unmanageable in its raw form, the information contained in HRMS data is often necessary to differentiate between compounds with very similar masses. In a lipidomic study of placental cells by Gorrochategui et al., MCR-ALS was performed on binned LC-HRMS data. After the data were resolved at unit mass resolution, the authors examined the raw, high-resolution data at masses found to correlate to potential biomarkers. From the raw data, masses at 0.0001 amu precision were assigned.20 This approach can be problematic if chromatographically overlapped species contain spectral peaks which share the same mass at lower precision because MCR-ALS would be unable to resolve these peaks in low precision data.13,21,22 An alternative approach to binning is the use of wavelet transforms; however, when MCR-ALS is to be used, the Haar wavelet must be selected to retain non-negativity in the data. The effect of the Haar wavelet transformation is a pairwise averaging effect which is practically equivalent to the binning process, with a loss of resolution accompanying the compression. Recently, Tauler et al. have published a new protocol outlining a different approach to LC-HRMS data analysis using MCR-ALS.23 Their approach defines “regions of interest” (ROI) prior to MCR-ALS analysis, allowing for data compression to take place.24 These regions of interest are chosen on the basis of several parameters including a signal-to-noise threshold, which if set incorrectly may lead to the exclusion of compounds at low intensities, particularly if a low-level analyte is present in the vicinity of a much higher concentration analyte.
The work described in the current paper presents a new strategy which can analyze LC-HRMS data by finding relevant regions in the binned mass spectra using MCR-ALS and discarding all other masses. These regions are used for a second round of MCR-ALS at a 0.1 amu bin level. This process is repeated until data at 0.001 amu precision are analyzed. This allows for the resolution of compounds which overlap in the chromatographic mode and share masses at even 0.01 amu precision, while greatly reducing the size of the data. In contrast to the ROI approach, the current approach does not make use of any thresholds prior to MCR-ALS analysis, allowing for low-level analytes to be captured without risk of mistakenly eliminating them due to incorrect thresholding.
■ THEORY
Multivariate Curve Resolution-Alternating Least Squares (MCR-ALS). MCR-ALS is an iterative optimization method that mathematically resolves signals arising from chemical species and background without needing prior information about the spectra or concentrations of the compounds present in the sample.19,8,22 MCR-ALS can be viewed as a multicomponent Beer’s law relationship given as the equation below.
$$\mathbf{X} = \mathbf{C}\mathbf{S}^{T} \qquad (1)$$
In this relationship, $\mathbf{X}$ is the raw second-order data resulting from an LC-MS run, $\mathbf{C}$ is a matrix consisting of vectors representing the pure chromatographic profiles, and $\mathbf{S}^{T}$ is the corresponding matrix of pure mass spectra.3,8 An initial guess for either the spectral or chromatographic profiles allows for the solution of eq 1 for either $\mathbf{C}$ or $\mathbf{S}^{T}$ using alternating least-squares algorithms.4,25 Most often, spectral initial guesses are used, which can be obtained through methods such as SIMPLISMA (ACD/Laboratories, Toronto, Canada)26 or iterative orthogonal projection analysis (IOPA),27−29 both of which aim to extract the most dissimilar spectra from the raw data. In this work, IOPA was used. Briefly, IOPA extracts a certain number of spectra, defined by the user, from the raw data that are the most orthogonal to one another.
Constraints. The defining step in MCR-ALS is the application of constraints to drive the solution toward the correct, chemically relevant answer for the pure, resolved components. Commonly employed constraints include non-negativity, selectivity, unimodality (one maximum per component), closure (mass balance), smoothness,30 component correspondence,31 area correlation,32 and hard modeling constraints.8,9 Of the constraints listed above, non-negativity and selectivity were applied in this procedure. Non-negativity ensures that the chromatographic and mass spectral intensities do not go below zero, as negative values are not physically reasonable. Selectivity notifies the algorithm of prior known values, such as regions of time known to contain no compounds or regions of the mass spectra known to contain no signals.25,33 These constraints were applied only to those components believed to correspond to true chemical species. Additional constraints beyond these provided no improvement to the resolution results.
Extended MCR-ALS. As shown in eq 1, MCR-ALS operates on second-order data (i.e., a matrix); however, MCR-ALS is also capable of analyzing multiway (e.g., multiple samples, multidimensional chromatography,34,35 etc.) and multiset data (e.g., data fusion36). Making use of this higher-order data can greatly reduce the amount of rotational ambiguity often associated with MCR-ALS.37 For example, we often want to include multiple samples, creating a multiway array (i.e., a cube) of data. In order to analyze all samples simultaneously, augmentation of the third-order array into a second-order array is necessary. This process is illustrated in Figure 1. Essentially, every sample is concatenated along the time mode to create a single augmented time mode containing all samples, while conserving the spectral mode. When this augmented data
Figure 1. Graphical representation of data rearrangement process for reshaping a third-order data array to a second-order data array.
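The alternating least-squares step and the sample augmentation described in the Theory section can be written compactly. The following is a sketch in the notation of eq 1, not necessarily the authors' exact implementation. The iteration index $k$, the number of samples $N$, the pseudoinverse symbol $(\cdot)^{+}$, and the element-wise clipping used to impose non-negativity are assumptions introduced here for illustration; clipping is one simple way to apply the constraint, and non-negative least squares is a common alternative.

```latex
% Given the current spectral estimate, solve eq 1 for C; then solve for S^T.
% max(., 0) denotes element-wise clipping to enforce non-negativity (assumed).
\begin{aligned}
\mathbf{C}^{(k+1)} &= \max\!\left(\mathbf{X}\,\bigl(\mathbf{S}^{T(k)}\bigr)^{+},\; 0\right), \\
\mathbf{S}^{T(k+1)} &= \max\!\left(\bigl(\mathbf{C}^{(k+1)}\bigr)^{+}\,\mathbf{X},\; 0\right).
\end{aligned}

% Augmentation along the time mode for N samples sharing one spectral mode:
\mathbf{X}_{\mathrm{aug}} =
\begin{bmatrix} \mathbf{X}_{1} \\ \mathbf{X}_{2} \\ \vdots \\ \mathbf{X}_{N} \end{bmatrix}
=
\begin{bmatrix} \mathbf{C}_{1} \\ \mathbf{C}_{2} \\ \vdots \\ \mathbf{C}_{N} \end{bmatrix}
\mathbf{S}^{T}
```

In the augmented form each sample keeps its own block of chromatographic profiles while all samples share a single set of spectra, which is what restricts the range of feasible solutions.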
■ STRATEGY
The overall strategy proposed for the analysis of LC-HRMS
data is outlined in Figure 2. The approach includes selection of
greater than a set threshold percentage of the spectral peak intensity within each analyte’s resolved spectral profile are chosen as significant. In this work, a threshold of 5% of the maximum was used. It is important to note that the threshold is applied to each component individually, negating any possibility of excluding low-intensity compounds. Compounds at low intensities should resolve into their own component and their spectra will be normalized to the maximum intensity, and thus, masses significant to those compounds should dominate their corresponding spectral profile regardless of their intensity in the raw, unresolved data. The significant masses are then expanded to the next bin level (e.g., from 0.1 to 0.01 amu), and signals corresponding to those masses are then extracted from the raw data and subsequently binned to the new bin size, similar to the binning procedure in step 1. For example, if a component contains a significant mass at 163 amu, for the next step of the analysis, the masses selected for binning cover a range from 162.6 to 163.5 at 0.1 amu intervals. From the raw data, signals at masses 162.5500−162.6499 amu are binned to 162.6 amu and 162.6500−162.7499 are binned to 162.7 and so on. This process is portrayed graphically in Figure 4. Only data at these
■ EXPERIMENTAL SECTION
Software. All programs were written in-house using MATLAB (MathWorks, Inc., Natick, MA) version R2013a on a Dell Precision T3600 with an Intel Xeon E5-1620 CPU at 3.60 GHz and 32.0 GB of RAM and version R2015b on a Dell Optiplex 9020 with an Intel Core i7-4790 CPU at 3.60 GHz and 32.0 GB of RAM. Both systems were running Windows 7 Enterprise. Most of the calculations were run on the latter computer while the former was used for data translation and program development. The Bioinformatics Toolbox by MathWorks was used for importing data into MATLAB.
Data Collection. Two data sets were analyzed with the strategy described above to demonstrate its applicability to both targeted analyses and untargeted, discovery type analyses. Both sample sets were analyzed with a Shimadzu LC system (Nexera series, Kyoto, Japan) coupled to an AB Sciex TripleTOF 5600 mass spectrometer (Concord, Ontario, Canada). The chromatographic conditions for each sample set are listed in their respective sections below. The data were collected in profile mode, rather than being centroided. Data were converted from AB Sciex .wiff files to mzXML files using msConvert, which is contained in the ProteoWizard suite.9 These data were then imported into MATLAB using the Bioinformatics Toolbox mzxmlread function, and peak lists were extracted. These were then ready to be processed using the in-house binning program.
Figure 5. Structures and abbreviations of amphetamines contained in the amphetamine standard solutions analyzed.
Amphetamine Samples. Amphetamine standards were purchased from Grace Discovery Services (Columbia, Maryland). The names, abbreviations, and structures of the amphetamines used are listed in Figure 5. The compounds were divided into three groups. For each group, four calibration mixtures and two test mixtures were created. The concentrations and further sample information are given in Supporting Information Table S1. The data from all three groups were analyzed as a single data set. This analysis represents an ideal
analytical experiment and is used to demonstrate the feasibility of our strategy.
For the amphetamine analysis, an Accucore C18 column (2.1 × 100 mm, 2.6 μm; Thermo Scientific, Waltham, MA) was used. Acetonitrile was used as mobile phase B, and 10 mM formic acid was used as mobile phase A. Gradient elution was used starting at 2.5% B, followed by an increase to 35% B over 10 min.
Bacterial Lipid Analysis. To demonstrate the utility of our strategy to complex analyses, five replicates from three different strains of bacteria were analyzed. To prepare for analysis, the samples were freeze−thawed three times in 200 μL of phosphate-buffered saline followed by probe sonication. Then, 1 mL of methanol was added followed by bath sonication. Another round of sonication was performed with 0.5 mL of chloroform. After a 2 h incubation at 48 °C, 1 mL of chloroform and 3 mL of water were added followed by vortexing and centrifugation. The organic layer was extracted. A second extraction was performed with an additional 2 mL of chloroform. The organic extract was dried via vacuum centrifugation and resuspended in 100 μL of methanol for analysis.
These samples were analyzed using an Acquity HSS T3 C18 column (2.1 × 150 mm, 1.8 μm; Waters, Milford, MA) at 55 °C. Gradient elution was used with mobile phase A consisting of 60:40 water:methanol with 10 mM ammonium formate and 0.1% formic acid and mobile phase B consisting of 90:10 isopropanol:acetonitrile with 10 mM ammonium formate and 0.1% formic acid. The analysis began with 100% A and increased to 100% B from 1 to 21 min and held at 100% B for 4 min.
MCR-ALS Analysis. For this work, spectral initial estimates were obtained using IOPA to initiate MCR-ALS. This was performed using an in-house MATLAB program. MCR-ALS was also performed using an in-house MATLAB program, which was based on previously described programs by Allen and Rutan28 and Bezemer and Rutan.39 The raw three-way LC-MS data, including the sample mode, are input along with the initial estimates of the component spectra. The sample augmentation is performed within the program. The maximum number of iterations and the convergence criterion are defined and the constraints are set for the selected components. For the current work, non-negativity was used in both chromatographic and spectral modes while selectivity was used in the chromatographic mode. This selectivity set certain regions in the chromatographic profiles to zero intensity. Specifically, this was used at the edges of the retention windows where no peaks were present to ensure resolution of background signals. A smoothing constraint based on Eilers’ perfect smoother40 was investigated for use in the chromatographic mode, but showed no obvious improvement for these data.
An important consideration when using MCR-ALS is the degree of rotational ambiguity present in the results. For this work, the application of constraints along with the selectivity provided by mass spectrometry minimized rotational ambiguity. This is supported by the observation that using additional constraints provided no significant differences in the resolved analyte profiles.
■ RESULTS AND DISCUSSION
Known Amphetamine Data. To demonstrate the ability of the proposed algorithm to analyze LC-HRMS data with MCR-ALS, an analysis of amphetamines (Figure 3) was carried out. As shown in Figure 3, the peaks between 150 and 190 s were not selected for analysis. This was due to these analytes, ephedrine and pseudoephedrine, being diastereomers with identical mass spectra. MCR-ALS requires different spectra to be able to resolve analyte signals. Otherwise, all of the windows shown in Figure 3 were analyzed using the methods described here; however, this discussion primarily focuses on the analysis of the second window. First, the data were binned to unit mass, and MCR-ALS was performed. Two components were observed to contain realistic chromatographic peak shapes. This agreed with the two known compounds in this retention time window, PEA and PPA. The resolved mass spectral profiles are shown in Figure 6A. Three masses were found above the 5% intensity threshold in this window for both of these components. These masses are listed in Table 1. These masses were expanded as described in the Strategy section and at the 0.1 amu bin level, 3 masses were again determined to be significant using a 5% intensity threshold.
Figure 6. Resolved mass spectral profiles for components in window 2 at bin levels of 1 amu (A) and 0.001 amu (B). Below each spectral plot, the masses corresponding to components 1 (PEA; blue) and 2 (PPA; red) are listed.
Table 1. All Extracted Masses for Compounds Resolved within Window 2 Collected above the 5% Intensity Threshold at Each Bin Level
bin level   PPA                                              PEA
unit        134, 135, 152                                    105, 106, 122
0.1         134.1, 135.1, 152.1                              105.1, 106.1, 122.1
0.01        134.09, 134.10, 134.11, 135.10, 152.10, 152.11   105.07, 105.08, 106.07, 122.09, 122.10,
0.001 (a)   134.096, 135.099, 152.106                        105.070, 106.074, 122.096
(a) Masses at this level represent the maxima of the spectral peaks.
The masses at this level are more precise than those at the unit mass bin level. This process was repeated until the 0.001 amu bin level was reached. It is important to note that at the unit mass and 0.1 amu bin levels each of the masses was represented by a single data point (i.e., a spike) in the spectrum, whereas at the 0.01 and 0.001 amu bin levels, the peaks are represented by several data points creating a spectral peak shape; thus, more masses were found as significant. At the final level of resolution in Table 1, only the masses corresponding to the maximum intensity of each peak are listed. Because the m/z axis was irregular with intervals at approximately 0.0015 amu, at the final level of binning (0.001 amu), some bins contained no m/z values. In several instances, this caused spectral peaks to contain false zero intensities, as determined by a discontinuous peak. To account for this, a cubic spline interpolation was performed subsequent to binning
11096 DOI: 10.1021/acs.analchem.6b03116
Anal. Chem. 2016, 88, 11092−11099
Analytical Chemistry Article
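The bin-expansion step described in the Strategy section (for example, a significant mass at 163 amu expanding to bins from 162.6 to 163.5 amu) can be sketched in a few lines. This is a minimal illustration and not the authors' in-house MATLAB program. The function names are hypothetical, summing intensities within a bin is an assumption, the half-up rounding is inferred from the 162.5500−162.6499 example, and the −4 to +5 window is a generalization of the single worked example in the text.

```python
from collections import defaultdict
from decimal import Decimal, ROUND_HALF_UP


def bin_center(mz, width):
    """Round an m/z value to the nearest multiple of `width`; exact halves round up."""
    mz, width = Decimal(str(mz)), Decimal(str(width))
    steps = (mz / width).quantize(Decimal("1"), rounding=ROUND_HALF_UP)
    return steps * width


def expand_bins(mass, coarse_width):
    """Fine-level bin centers to keep around one significant coarse-level mass.

    Assumed rule, taken from the text's example:
    163 at the unit level -> 162.6, 162.7, ..., 163.5 at the 0.1 level.
    """
    mass = Decimal(str(mass))
    fine = Decimal(str(coarse_width)) / 10
    return [mass + k * fine for k in range(-4, 6)]


def bin_spectrum(peaks, width, keep=None):
    """Sum intensities of (mz, intensity) pairs into bins of `width`.

    If `keep` is given, signals falling outside those bins are discarded.
    Summation within a bin is an assumption of this sketch.
    """
    binned = defaultdict(float)
    for mz, intensity in peaks:
        center = bin_center(mz, width)
        if keep is None or center in keep:
            binned[center] += intensity
    return dict(binned)


def significant_masses(profile, fraction=0.05):
    """Masses whose intensity exceeds `fraction` of the profile's own maximum."""
    top = max(profile.values())
    return sorted(m for m, v in profile.items() if v > fraction * top)


# Made-up raw signals, used only to exercise the functions.
raw = [(162.5500, 10.0), (162.6499, 5.0), (162.6500, 2.0), (163.2, 1.0), (170.0, 9.0)]
keep = set(expand_bins(163, 1))
fine = bin_spectrum(raw, 0.1, keep)
```

In the strategy itself the 5% threshold is applied to each MCR-ALS resolved spectral profile rather than to a raw spectrum, so `significant_masses` would be called once per resolved component before the next expansion.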
Figure 9. Resolved mass spectral profiles (left column) and chromatographic profiles (right column) of the found components, labeled 1−4, of the analyzed window at the final bin level. Background components were found in the analysis but are not shown here for clarity. Component 1 required combination of two components as described in the text.
Table 3. Found Masses in the Bacterial Dataset and Their Associated Molecular Formula
peak   average retention time (s)   associated masses (m/z)   corresponding molecular formula (within ±0.005 m/z tolerance) (a)
1      588                          211.168                   C13H23O2
                                    212.172
2      589                          199.169                   C12H23O2
                                    200.173
3      592                          293.210                   C18H29O3
                                    294.213
                                    447.130                   C22H23O10
                                    448.135
4      595                          313.214                   C21H29O2
                                    314.219
                                    315.193
(a) Calculated from LIPID MAPS structure search.43
components can also be submitted to pattern recognition algorithms such as principal components analysis (PCA) to aid in distinguishing differences between the bacterial strains.
■ CONCLUSIONS
A novel method for resolving analyte signals in LC-HRMS data, which conserves the information from LC-HRMS data, was developed. In the targeted amphetamine analysis, all known amphetamine components in each specified window were recovered using MCR-ALS at every resolution level from unit mass to 0.001 amu, allowing for the facile quantitation of each compound. The utility of this procedure is clearly demonstrated by the finding that the final MCR-ALS step was completed with as little as 0.55% of the original high-resolution data, with all relevant high-resolution information being preserved for total analysis of amphetamine data. The application of this procedure to unknown, discovery type analyses was also demonstrated through the analysis of a bacterial lipid data set. The samples analyzed here were chosen to demonstrate the feasibility of the proposed strategy; however, this general strategy can easily be applied to many types of analyses utilizing LC-HRMS.
While a comprehensive comparison between the strategy outlined in the paper and the ROI approach described previously was not carried out,23 a brief comparison was performed in which prediction errors were similar to those found in Table 2. In order to make a meaningful comparison, a more comprehensive study should be undertaken with varying sample conditions. It is our belief that the present strategy will be more useful in cases of signals with low signal-to-noise. This is due to the ROI approach requiring a threshold prior to MCR-ALS analysis, whereas the present strategy only requires a threshold that is relative to each resolved compound’s most intense spectral peak. The present strategy also includes fewer tunable parameters, which may allow for more robust operation despite requiring significant user interaction. In its current form, this strategy took up to several minutes per window for complete analysis, including loading data, with user-interaction throughout. While it was not a specific goal of this work, several steps of this work would lend themselves well to automation requiring minimal input by the user. Further optimization of the code may also provide significant reduction in analysis time. These further refinements will allow this strategy to be easily implemented by analysts with limited chemometrics training.
■ ASSOCIATED CONTENT
Supporting Information
The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.analchem.6b03116.
Amphetamine sample table; calibration and fit statistics for MCR-ALS (PDF)
■ AUTHOR INFORMATION
Corresponding Author
*E-mail: srutan@vcu.edu.
Notes
The authors declare no competing financial interest.
■ ACKNOWLEDGMENTS
The authors would like to thank the Lipidomics/Metabolomics Core Facility at Virginia Commonwealth University for the LC-MS analysis of amphetamines used in this work. The authors acknowledge financial support from NSF CHE-1507332. D.W.C. is supported by an Altria Graduate Student Fellowship.
■ REFERENCES
(1) Wei, X.; Shi, X.; Kim, S.; Zhang, L.; Patrick, J. S.; Binkley, J.; McClain, C.; Zhang, X. Anal. Chem. 2012, 84 (18), 7963−7971.
(2) Tautenhahn, R.; Böttcher, C.; Neumann, S. BMC Bioinf. 2008, 9, 504.
(3) Danielsson, R.; Bylund, D.; Markides, K. E. Anal. Chim. Acta 2002, 454 (2), 167−184.
(4) Cook, D. W.; Rutan, S. C. J. Chemom. 2014, 28 (9), 681−687.
(5) Smith, C. A.; Want, E. J.; O’Maille, G.; Abagyan, R.; Siuzdak, G. Anal. Chem. 2006, 78 (3), 779−787.
(6) Tsugawa, H.; Cajka, T.; Kind, T.; Ma, Y.; Higgins, B.; Ikeda, K.; Kanazawa, M.; VanderGheynst, J.; Fiehn, O.; Arita, M. Nat. Methods 2015, 12 (6), 523−526.
(7) Tauler, R. Chemom. Intell. Lab. Syst. 1995, 30 (1), 133−146.
(8) Rutan, S. C.; de Juan, A.; Tauler, R. In Comprehensive Chemometrics; Brown, S. D., Tauler, R., Walczak, B., Eds.; Elsevier: Amsterdam, 2009; Vol. 2, pp 249−259.
(9) de Juan, A.; Jaumot, J.; Tauler, R. Anal. Methods 2014, 6 (14), 4964.
(10) Peré-Trepat, E.; Hildebrandt, A.; Barceló, D.; Lacorte, S.; Tauler, R. Chemom. Intell. Lab. Syst. 2004, 74 (2), 293−303.
(11) Gargallo, R.; Tauler, R.; Cuesta-Sánchez, F.; Massart, D. L. TrAC, Trends Anal. Chem. 1996, 15 (7), 279−286.
(12) Bortolato, S. A.; Olivieri, A. C. Anal. Chim. Acta 2014, 842, 11−19.
(13) Peré-Trepat, E.; Tauler, R. J. Chromatogr. A 2006, 1131 (1−2), 85−96.
(14) Pérez, R. L.; Escandar, G. M. Anal. Chim. Acta 2014, 835, 19−28.
(15) De Llanos, A. M.; De Zan, M. M.; Culzoni, M. J.; Espinosa-Mansilla, A.; Cañada-Cañada, F.; De La Peña, A. M.; Goicoechea, H. C. Anal. Bioanal. Chem. 2011, 399 (6), 2123−2135.
(16) Bortolato, S. A.; Arancibia, J. A.; Escandar, G. M. Anal. Chem. 2009, 81 (19), 8074−8084.
(17) Dantas, C.; Tauler, R.; Ferreira, M. M. C. Anal. Bioanal. Chem. 2013, 405 (4), 1293−1302.
(18) Peré-Trepat, E.; Lacorte, S.; Tauler, R. J. Chromatogr. A 2005, 1096 (1−2), 111−122.
(19) Navarro-Reig, M.; Jaumot, J.; García-Reiriz, A.; Tauler, R. Anal. Bioanal. Chem. 2015, 407, 8835.
(20) Gorrochategui, E.; Casas, J.; Porte, C.; Lacorte, S.; Tauler, R. Anal. Chim. Acta 2015, 854, 20−33.
(21) Peré-Trepat, E.; Lacorte, S.; Tauler, R. Anal. Chim. Acta 2007, 595 (1−2), 228−237.
(22) Sánchez Pérez, I.; Culzoni, M. J.; Siano, G. G.; Gil García, M. D.; Goicoechea, H. C.; Martínez Galera, M. Anal. Chem. 2009, 81 (20), 8335−8346.
(23) Tauler, R.; Gorrochategui, E.; Jaumot, J.; Tauler, R. Protoc. Exch. 2015, http://dx.doi.org/10.1038/protex.2015.102.
(24) Bedia, C.; Tauler, R.; Jaumot, J. J. Chemom. 2016, 30, 575−588.
(25) Tauler, R.; Maeder, M.; de Juan, A. In Comprehensive Chemometrics; Brown, S. D., Tauler, R., Walczak, B., Eds.; Elsevier: Amsterdam, 2009; Vol. 2, pp 473−505.
(26) Sánchez, F. C.; Massart, D. L. Anal. Chim. Acta 1994, 298 (3), 331−339.
(27) Cook, D. W.; Rutan, S. C.; Stoll, D. R.; Carr, P. W. Anal. Chim. Acta 2015, 859, 87−95.
(28) Allen, R.; Rutan, S. Anal. Chim. Acta 2012, 723, 7−17.
(29) Sánchez, F. C.; Toft, J.; van den Bogaert, B.; Massart, D. L. Anal. Chem. 1996, 68 (1), 79−85.
(30) Hugelier, S.; Devos, O.; Ruckebusch, C. J. Chemom. 2015, 29, 448−456.
(31) Parastar, H.; Radović, J. R.; Bayona, J. M.; Tauler, R. Anal. Bioanal. Chem. 2013, 405 (19), 6235−6249.
(32) Neves, A. C. de O.; Tauler, R.; de Lima, K. M. G. Anal. Chim. Acta 2016, 937, 21−28.
(33) van Stokkum, I. H. M.; Mullen, K. M.; Mihaleva, V. V. Chemom. Intell. Lab. Syst. 2009, 95 (2), 150−163.
(34) Porter, S. E. G.; Stoll, D. R.; Rutan, S. C.; Carr, P. W.; Cohen, J. D. Anal. Chem. 2006, 78 (15), 5559−5569.
(35) Omar, J.; Olivares, M.; Amigo, J. M.; Etxebarria, N. Talanta 2014, 121, 273−280.
(36) Mas, S.; Tauler, R.; de Juan, A. J. Chromatogr. A 2011, 1218 (51), 9260−9268.
(37) Golshan, A.; Abdollahi, H.; Beyramysoltan, S.; Maeder, M.; Neymeyr, K.; Rajkó, R.; Sawall, M.; Tauler, R. Anal. Chim. Acta 2016, 911, 1.
(38) Otto, M. Chemometrics, 2nd ed.; Wiley-VCH: Weinheim, 2007.
(39) Bezemer, E.; Rutan, S. C. S. Chemom. Intell. Lab. Syst. 2006, 81 (1), 82−93.
(40) Eilers, P. H. C. Anal. Chem. 2003, 75 (14), 3631−3636.
(41) Hsu, S.-H.; Raglione, T.; Tomellini, S. A.; Floyd, T. R.; Sagliano, N.; Hartwick, R. A. J. Chromatogr. A 1986, 367, 293−300.
(42) Jeong, L. N.; Sajulga, R.; Forte, S. G.; Stoll, D. R.; Rutan, S. C. J. Chromatogr. A 2016, 1457, 41−49.
(43) Sud, M.; Fahy, E.; Cotter, D.; Brown, A.; Dennis, E. A.; Glass, C. K.; Merrill, A. H.; Murphy, R. C.; Raetz, C. R. H.; Russell, D. W.; Subramaniam, S. Nucleic Acids Res. 2007, 35, D527−D532.