MetaboliticsDB IEEE TCBB-2
Abstract—Web-based metabolomics databases enable researchers to disseminate metabolite concentration datasets measured
under different physiological conditions. In addition, many of these databases offer a number of tools to process (e.g., normalization,
outlier elimination, etc.) and analyze (e.g., clustering, enrichment studies, etc.) users’ metabolomics data sets. Nevertheless, none of
the existing metabolomics databases offer infrastructure and tools to store, manage, compare, and search metabolomics analysis
results. Besides, their pathway-level analysis capabilities are mostly limited to superimposing the measurements onto the pathways of
the measured metabolites. In this paper, we present MetaboliticsDB that features a database of metabolomics analyses and a set of
associated analytics tools. It enables users to store their metabolomics analysis results, and compare them against their own or other
publicly available analysis results to study, for instance, the progression of a disease, the effect of a drug, similarities between
well-known physiological conditions and the currently studied data, etc. Besides, MetaboliticsDB allows querying the metabolomics
analysis results database with flexible criteria, such as listing all analyses where a certain pathway experiences a major
increase/decrease in activity, to help researchers identify conditions sharing a similar metabolic mechanism. Moreover, MetaboliticsDB
offers a genome-scale metabolic network-based analysis tool that significantly extends the capabilities of the existing databases.
Finally, MetaboliticsDB employs AI-based as well as distance-based methods to associate the studied metabolomics data with
diseases stored in its database. To this end, it automatically trains, manages, and updates machine learning models for each disease
based on the metabolomics analysis data stored in its database. We demonstrate the use of MetaboliticsDB with a case
study on Hepatocellular Carcinoma. Our results show that MetaboliticsDB provides biologically relevant metabolic network-level
analysis results, disease association with high accuracy, and a scalable architecture supporting hundreds of simultaneous users.
Availability: MetaboliticsDB is available online at http://metabolitics.itu.edu.tr/.
Web interface source code is available at https://github.com/itu-bioinformatics-database-lab/metabolitics-client.
Web API source code is available at https://github.com/itu-bioinformatics-database-lab/metabolitics-api.
Source code of the Metabolitics data analysis algorithm is available at
https://github.com/itu-bioinformatics-database-lab/metabolitics.
1 INTRODUCTION

resources. In addition to the basic searching and visualization capabilities provided by the metabolic data resources, this category of tools also allows users to upload their own metabolomics data, and then the analysis results are provided to the user in tabular and/or graphical form.

One of the most comprehensive analysis resources in this category is MetaboAnalyst 5.0 [28]. In addition to basic statistical significance and discrimination analysis tools at the metabolite level, it also features pathway-level analysis in the form of pathway enrichment and topology-based assessment. Moreover, MetaboAnalyst 5.0 allows integrated analysis of transcriptomics and metabolomics datasets at the pathway level. Metabolomics Workbench [13] is another comprehensive metabolomics data repository and analysis resource. It offers a wide array of statistical analysis tools at the metabolite level, ranging from running ANOVA analysis to hierarchical clustering analysis. It also allows creation of various forms of plots on top of user data, such as boxplots, volcano plots, bar graphs, etc. However, it does not provide any pathway-level analysis. MeltDB 2.0 [14] offers a number of features to annotate the metabolites, detect peaks, eliminate noise, etc. in user-submitted raw data. Once the data is preprocessed, users may perform statistical significance tests, classification and clustering, and various forms of visualization on their data. One differentiating feature of MeltDB 2.0 is that it provides built-in support to form project groups, share particular data files among group members or with different project groups, and create and manage different experiments. MetabolomeExpress [15] offers similar capabilities to MeltDB 2.0 in the categories of raw data processing and statistical analysis of the processed data. XCMS Online [16] takes multi-omics integration one step further than MetaboAnalyst 5.0, and accommodates proteomics (in addition to transcriptomics) along with metabolomics measurements. It also features other common metabolite- and pathway-level analysis features (e.g., raw data processing, statistical analysis, pathway enrichment analysis, etc.) similar to the above tools. Caleydo [17] allows mapping omics data on pathways and nicely visualizes them so that users can see which pathways are being covered by the uploaded omics data, and to what degree their activities change based on the concentration change of metabolites in a metabolomics dataset, or the change in mRNA levels in a gene expression dataset. WebSpecmine [29] is a web-based metabolomics data analysis and mining tool built on the specmine R package. WebSpecmine enables users to upload datasets to process, visualize, and perform various analyses. WebSpecmine also trains machine learning models and predicts classes for future samples. 3Omics [30] is a web-based human metabolomics data analysis tool that offers visualization and analysis capabilities on the uploaded datasets with an emphasis on combining different omics data. Workflow4Metabolomics [31] offers data preprocessing with retention time alignment and peak extraction, and univariate analysis with nonparametric and parametric tests. POMAShiny [32] provides data preprocessing with missing value imputation, normalization, and outlier detection, univariate analysis with t-test, ANOVA, Mann-Whitney U-test, and Kruskal-Wallis test. It also offers clustering with the k-means algorithm and classification with the random forest algorithm. In addition to the above web tools, several others offer somewhat similar metabolomics analysis features, but are available only as stand-alone desktop applications, such as Pathway Tools [18], SIMCA-P+ (Umetrics, Umea, Sweden), etc.

Even though the above database-enabled metabolomics analysis resources are quite extensive in the number and variety of the raw data processing and statistical analysis features that they offer, we note the following gaps: (i) Although some of the existing resources allow users to store their raw metabolomics data, none of them enable users to store, manage, compare, and search metabolomics analysis results. (ii) Their pathway-level analysis capabilities are limited to superimposing measured metabolite changes onto their corresponding pathways. Moreover, all the above resources consider each pathway independently from the metabolic network that they belong to. Even though MetaboAnalyst computes network-level measures, such as betweenness, centrality, etc., it only uses them for ranking pathways. Hence, the assessment in the above resources is limited to only those pathways whose metabolites overlap with the user-submitted measurements to some extent. Even then, the included pathways are evaluated individually by ignoring the production/consumption/regulation relationships among them. (iii) Existing resources do not attempt to associate metabolomics analysis results with known diseases or physiological conditions.

In this paper, we present a novel database-enabled web resource, MetaboliticsDB, that performs metabolic activity analysis of user-provided metabolomics data in a holistic manner considering interconnections between pathways with mass-balances preserved. To this end, it employs a state of the art systems-level algorithm [2] under the hood. MetaboliticsDB stores analysis results in its database, and users may compare current analysis results with previously stored analysis results of their own or other publicly available analysis results shared by other users. Furthermore, MetaboliticsDB allows users to make a comparison between different analysis methods (e.g., Metabolitics vs. pathway enrichment) on the same dataset. Users may also flexibly search the stored analysis results to list those where certain pathways experience significant activity increase/decrease. As another novel feature, MetaboliticsDB enables users to associate their metabolomics datasets with diseases based on AI models that it creates and maintains.

We evaluate the features of MetaboliticsDB on a real metabolomics data set obtained from individuals with Hepatocellular Carcinoma (HCC). Our results demonstrate that MetaboliticsDB provides (i) biologically relevant metabolic network-level analysis results along with markings through a metabolic graph, (ii) disease association to analysis results with high accuracy, and (iii) a scalable architecture supporting hundreds of simultaneous users. This paper is organized as follows. The next section summarizes data management, metabolomics analysis features, and architecture of MetaboliticsDB. Then, we discuss our results from the evaluation of MetaboliticsDB on an HCC dataset as well as the performance and accuracy of different features. Finally, we conclude with a discussion on how MetaboliticsDB may be employed in a wider scope with the proposed features.
pathways (see Fig. 3). In order to quantify the similarity, MetaboliticsDB turns both the current analysis results and disease analysis results stored in the database into vectors of numbers where each number represents the diff value for a particular pathway. Then, Pearson correlation is computed between each disease vector and the current analysis result vector. Finally, the diseases in the database are sorted according to their correlation values, and the top 5 are presented to the user.

2.5 Machine Learning-based Disease Prediction

As an alternative to similarity-based disease association, MetaboliticsDB also offers machine learning-based disease status prediction for each individual. To this end, MetaboliticsDB periodically trains machine learning models for each disease based on previously stored metabolomics data analysis results. More specifically, for each disease, four different types of models based on Logistic Regression, Random Forest, Support Vector Machines, and XGBoost are trained. 10-fold cross validation is used to tune the parameters of each model and evaluate its classification performance based on F1 scores. Then, the best performing model is chosen and stored in the database as binary files with the pickle Python package. The models employ the computed reaction metabolic differentiation scores as features. The reaction diff values of future metabolomics data analysis results are then used to predict potential diseases associated with an individual. Next to each disease, a value between 0 and 1 is provided, which reflects the probability that the underlying AI model assigns to its disease prediction for that individual.

2.6 Comparison of Analysis Results

MetaboliticsDB allows users to compare and contrast their metabolomics data analysis results to (i) their previous analysis results and (ii) other users’ publicly available analysis results stored in its database. From the analysis results page, the user may choose any number of analysis results that they are interested in comparing by clicking on the checkbox next to each study. Then, clicking on the “compare” button at the top leads to the comparison page. The comparison interface features a heatmap at the top where the rows represent pathways with the highest variance in terms of their diff values among the compared analysis results, and the columns represent the compared analysis results (see Fig. 4). Each cell in the heatmap is colored according to the corresponding pathway diff values. The bottom part of the comparison interface includes a table that lists all the pathways with their computed diff values for each selected study (similar to the bottom part of Fig. 3).

2.7 Advanced Search Interface

Unlike similar tools, MetaboliticsDB allows users to search the metabolomics analysis results in the database in terms of the metabolic activity changes
of involved pathways. For instance, users may search for metabolomics analysis results where Urea Cycle experiences a decreased activity, whereas Fatty Acid Synthesis experiences increased activity. Optionally, the user may also specify the magnitude of increase and decrease. They can flexibly add and remove pathways from consideration, leading to multiple conditions which are connected via SQL AND semantics.

Fig. 3. Analysis results page: top-20 pathways with the highest absolute diff values.

2.8 Basic Searching, Browsing, and Other Visualization Features

Similar to many other metabolic databases, MetaboliticsDB includes a search interface to locate metabolites, reactions, and pathways that users are interested in. The search interface features auto-completion (similar to Google Search) that automatically suggests names from the database as the user types in their search terms. The suggestions are not a flat list of mixed names, but are categorized into different groups based on the matching entity types (i.e., metabolites, reactions, etc.). The search results are also presented in a similar manner where the matching items are categorized based on their entity types. Clicking on any search result entry takes the user to the details and visualizations of the clicked entity.

Users may also choose to browse the database pathway by pathway. The browsing page lists all the pathways in the database on the left, and clicking on each pathway displays the reactions in the pathway and a graphical visualization that shows a network view of the pathway (see Fig. 6 for an example). The pathway visualizations in MetaboliticsDB are drawn by using the Escher library [39] and saved in the database.

2.9 Architecture

The architecture of MetaboliticsDB (Fig. 5) is carefully designed to meet demanding computational and data management requirements of various features in an efficient and flexible manner. All frontend interfaces are implemented in Angular, which is a JavaScript framework that allows the development of sophisticated single page web applications. Most of the rendering and application logic is developed at client side to enhance the performance and user experience. The relational database stores analysis results and user accounts, and it is hosted by a Postgres instance. The frontend communicates with the database via the RESTful API that is developed in Flask, a micro web framework in Python. One advantage of adopting a RESTful API implementation is that it offers programmatic access to other researchers who may want to utilize the analysis algorithms and services of MetaboliticsDB in their project implementations. This feature provides another interface to MetaboliticsDB for users with programming skills. Documentation for MetaboliticsDB’s RESTful API interfaces is available at http://metabolitics.itu.edu.tr/api/spec in OpenAPI specification. In addition, MetaboliticsDB has a sophisticated system to manage the analysis requests, as each analysis task is computationally expensive and cannot be handled in the regular life cycle of HTTP requests.
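
The point about expensive analyses not fitting into a single HTTP request can be illustrated with a generic background-job pattern: the request handler queues the work and returns an identifier at once, and the client polls for the result later. The sketch below uses only the Python standard library. All names in it (`run_analysis`, `submit_analysis`, `get_status`, the in-memory `jobs` table) are hypothetical, and it is not a description of how MetaboliticsDB actually schedules its analyses.

```python
import uuid
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sketch of a submit-then-poll job pattern.
# None of these names come from the MetaboliticsDB code base.
executor = ThreadPoolExecutor(max_workers=2)
jobs = {}  # job id -> Future holding the pending or finished analysis


def run_analysis(fold_changes):
    """Stand-in for an expensive analysis; here it only averages the inputs."""
    values = list(fold_changes.values())
    return {"mean_fold_change": sum(values) / len(values)}


def submit_analysis(fold_changes):
    """Queue the work and return immediately with an id the client can poll."""
    job_id = str(uuid.uuid4())
    jobs[job_id] = executor.submit(run_analysis, fold_changes)
    return job_id


def get_status(job_id):
    """Report the state of a job without blocking the caller."""
    future = jobs[job_id]
    if not future.done():
        return {"status": "running"}
    if future.exception() is not None:
        return {"status": "failed", "error": str(future.exception())}
    return {"status": "done", "result": future.result()}
```

In a web setting, `submit_analysis` would sit behind the endpoint that accepts the upload and `get_status` behind a status endpoint, so no request has to stay open for the duration of the analysis.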
to be less than that in control cases, and the activation levels of Pyridoxal kinase enzyme have been reported to help disease progression [49]. Increased activity of Pyridoxal kinase reaction in the Vitamin B6 metabolism seen in HCC analysis results supports these statements.

Another pathway with a large diff value in the positive direction is Limonene and pinene degradation. Limonene has been reported to inhibit the progression of HCC by suppressing cell proliferation [50]. Pinene also has been reported to inhibit cancer cell development in vitro and in vivo [51]. Decreased levels of Limonene and Pinene due to activities of Limonene and pinene degradation might contribute to HCC progression.

Cytochrome metabolism is another pathway with a large diff value in the positive direction. Intrinsic clearance values indicating activity levels show an activity growth for CYP2E1, CYP2D6, and CYP2C9 cytochrome P450 types in HCC samples [52]. Increased activity of Cytochrome P450 2E1, Cytochrome P450 2C9, and Cytochrome P450 2D6 reactions in the Cytochrome metabolism pathway seen in HCC analysis results supports these observations.

Another pathway with a large diff value in the positive direction is Thiamine metabolism. The activity levels of enzymes that rely on Thiamine have been reported to increase in cancer cases [53]. Increased activity levels of Thiamine diphosphokinase, Thiamine diphosphate kinase, and Thiamine-triphosphatase reactions in the Thiamine metabolism pathway seen in HCC analysis results support these findings.

Finally, Fructose and mannose metabolism is the last among the top 10 pathways with a large diff score in the positive direction. The development of HCC is increased with diets rich in fructose since it enhances activity levels of the lipogenic pathway and lipid accumulation [54].

The above brief discussion illustrates that MetaboliticsDB is useful and effective in analyzing metabolomics datasets with insights on the underlying metabolic mechanisms.

3.2 Disease Association Evaluation

In this section, we evaluate the disease association feature of MetaboliticsDB in two distinct dimensions.

3.2.1 Similarity-based Disease Association

MetaboliticsDB reports diseases and physiological conditions that are most similar to the analyzed metabolomics dataset based on the correlation between the current analysis results and the previously computed disease metabolomics data analysis results. In order to evaluate the relevance of the results provided by the proposed scheme, we cluster all diseases in the database using the same proposed vector representation and similarity measure (i.e., using agglomerative clustering with similarity measure: Pearson correlation, linkage: complete). Then, we compare the resulting disease clustering to a “ground-truth clustering”. The ground-truth clustering that we employ in this evaluation includes one cluster per distinct disease. That is, patient characteristics such as gender and age are ignored; metabolomics samples are assigned to the same cluster as long as the corresponding patients have the same disease. Presently, there are 40 distinct diseases stored in MetaboliticsDB. Hence, the ground-truth clustering includes 40 clusters. To compare the clusterings, we employ two intuitive comparison metrics, i.e., homogeneity and completeness [55]. Briefly, homogeneity checks if each cluster contains patients with the same disease, and completeness deals with whether all patients with the same disease are assigned to the same cluster. Both measures take a value between 0 and 1, where 1 represents the best score, and 0 represents the worst score. In our evaluation, both homogeneity and completeness are measured as 0.94, which indicates that cluster assignments are mostly accurate.

3.2.2 Machine Learning-based Disease Association

In this section, we present the prediction performance of the machine learning models that MetaboliticsDB creates and manages in its database. We employ k-fold cross validation (with k = 10 or k = 5 depending on sample size) to test the classification accuracy. Table 2 summarizes precision, recall, and F1 scores for the disease prediction models stored in the database. The F1 score is calculated as the harmonic mean of precision and recall values. The results show that healthy and tumor samples are classified with high accuracy.

Disease                              Precision  Recall  F1    Alg.  K
Hepatocellular Carcinoma             0.89       0.91    0.90  LR    10
Colon Carcinoma                      0.96       0.99    0.98  RF    5
Breast Cancer                        0.88       0.94    0.91  LR    10
Stomach Cancer                       0.94       0.99    0.96  LR    10
Ovarian Cancer                       0.94       0.92    0.91  RF    10
Crohn’s Disease                      0.81       0.91    0.84  RF    10
Asthma                               0.99       0.99    0.99  RF    10
Rheumatoid Arthritis                 0.89       0.85    0.85  RF    10
Steatotic Liver Disease              0.82       0.95    0.87  RF    10
Type 2 Diabetes Mellitus             0.86       0.96    0.91  LR    10
Wilson Disease                       0.80       0.90    0.83  RF    5
Adult Respiratory Distress Syndrome  0.88       1.00    0.93  LR    5
Androgenic Alopecia                  0.81       0.94    0.86  LR    10
Ankylosing Spondylitis               0.83       1.00    0.89  RF    5
Autistic Disorder                    0.75       0.88    0.79  RF    5
Chronic Fatigue Syndrome             0.73       0.74    0.72  LR    10
Cystic Fibrosis                      0.75       1.00    0.83  RF    5
Intermediate Coronary Syndrome       0.77       1.00    0.85  RF    5
Peanut Allergy                       0.83       0.92    0.83  RF    10
Placental Abruption                  0.88       1.00    0.92  RF    5
Pre-eclampsia                        0.75       0.88    0.75  RF    5
Sarcoidosis                          0.81       0.86    0.78  SVM   10
Schizophrenia                        0.90       1.00    0.95  SVM   10
TABLE 1
Average results of k-fold cross validation

We further evaluate the prediction performance of the models by relaxing the true positive definition slightly. In particular, since MetaboliticsDB provides a list of possible associated diseases sorted by their predicted likelihoods, we consider a disease association as true positive if the true disease appears in the top 3 suggested diseases for patients, and it does not appear at all for healthy individuals. The prediction is said to be accurate if the disease is listed in the top-3 predictions for patients or the disease is not listed in the predictions for healthy samples. The number of samples predicted accurately divided by all samples is given in the precision column. The number of accurate predictions for disease samples divided by all disease samples is given in the recall column.
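
The relaxed scoring rule described above can be written down in a few lines. The following is a minimal sketch under stated assumptions: the function name `relaxed_scores` and the input layout (a list of `(is_patient, ranked_diseases)` pairs) are hypothetical, and the F1 value is assumed to be the harmonic mean of the two columns, as stated for the cross-validation results.

```python
def relaxed_scores(samples, disease, k=3):
    """Score top-k disease association for a single disease.

    samples: list of (is_patient, ranked_diseases) pairs, where
    ranked_diseases lists the suggested diseases sorted by predicted
    likelihood (hypothetical input layout, chosen for illustration).
    """
    accurate = 0           # accurate predictions over all samples
    accurate_patients = 0  # accurate predictions among patients only
    n_patients = 0
    for is_patient, ranked in samples:
        if is_patient:
            n_patients += 1
            hit = disease in ranked[:k]   # true disease within the top k
            accurate_patients += hit
        else:
            hit = disease not in ranked   # disease never suggested for a healthy sample
        accurate += hit
    # "Precision" as defined in the text: accurate samples over all samples.
    precision = accurate / len(samples) if samples else 0.0
    # "Recall" as defined in the text: accurate patients over all patients.
    recall = accurate_patients / n_patients if n_patients else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0  # assumed harmonic mean
    return precision, recall, f1
```

For example, with three patients of whom two have the disease in their top 3, and two healthy samples of whom one is wrongly associated with it, the function returns a precision of 3/5 and a recall of 2/3.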
Disease                              Precision  Recall  F1
Hepatocellular Carcinoma             0.89       0.84    0.87
Colon Carcinoma                      1.00       1.00    1.00
Breast Cancer                        0.81       0.75    0.78
Stomach Cancer                       0.98       0.98    0.98
Ovarian Cancer                       0.92       0.88    0.90
Crohn’s Disease                      0.88       0.76    0.82
Asthma                               0.99       0.99    0.99
Rheumatoid Arthritis                 0.88       0.76    0.82
Steatotic Liver Disease              0.91       0.84    0.88
Type 2 Diabetes Mellitus             0.90       0.87    0.88
Wilson Disease                       0.92       0.83    0.87
Adult Respiratory Distress Syndrome  0.91       0.90    0.91
Androgenic Alopecia                  0.88       0.77    0.82
Ankylosing Spondylitis               0.75       0.50    0.60
Autistic Disorder                    0.92       0.83    0.87
Chronic Fatigue Syndrome             0.80       0.63    0.71
Cystic Fibrosis                      0.89       0.67    0.76
Intermediate Coronary Syndrome       1.00       1.00    1.00
Peanut Allergy                       0.91       0.82    0.86
Placental Abruption                  0.92       0.83    0.87
Pre-eclampsia                        1.00       1.00    1.00
Sarcoidosis                          0.75       0.69    0.72
Schizophrenia                        0.93       0.99    0.96
TABLE 2
Disease prediction results

Moreover, in order to test the effect of the metabolomics data size, random metabolites are selected from each network and random fold-change values are assigned to them. The number of metabolite measurements included in the tests varied between 5 and 150 (incremented by 5 leading to 30 different metabolomics data sets). Then, the analysis is run with those artificial metabolomics data sets on each evaluated metabolic network. Fig. 7 charts the average analysis running time (in seconds) over all metabolomics data sets for each metabolic network.
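
The construction of the artificial data sets can be sketched as follows. This is an illustration only: the function names, the fixed random seed, and the fold-change range of -5 to 5 are assumptions, since the text does not state how the random fold-change values are drawn, and `analyze` stands for whatever analysis routine is being timed.

```python
import random
import time


def make_synthetic_datasets(metabolite_ids, sizes=range(5, 151, 5), seed=0):
    """Build one artificial data set per size: 5, 10, ..., 150 metabolites."""
    rng = random.Random(seed)
    datasets = []
    for n in sizes:
        chosen = rng.sample(metabolite_ids, n)  # random metabolites of the network
        # Fold-change range is an assumption made for this sketch.
        datasets.append({m: rng.uniform(-5.0, 5.0) for m in chosen})
    return datasets


def average_runtime(analyze, datasets):
    """Average wall-clock seconds that `analyze` takes per data set."""
    start = time.perf_counter()
    for data in datasets:
        analyze(data)
    return (time.perf_counter() - start) / len(datasets)
```

With the default sizes, `make_synthetic_datasets` yields the 30 data sets mentioned above, provided the network lists at least 150 metabolites.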
offered by MetaboAnalyst 5.0 and MeltDB 2.0. Volcano plots and ANOVA analysis are available in Metabolomics Workbench. t-tests and fold-change analysis are provided by MetabolomeExpress, but volcano plots are not supported. Fold-change analysis is also offered by XCMS Online. ANOVA analysis is also accessible along with t-tests and fold-change analysis in WebSpecmine. Non-parametric and parametric tests and ANOVA analysis are provided by Workflow4Metabolomics and POMAShiny.

MetaboliticsDB offers automatically managed machine learning classification models of type Logistic Regression, Support Vector Machines, Random Forest, and XGBoost. MetaboAnalyst 5.0 provides Support Vector Machine, Random Forest, and Partial Least Squares Discriminant Analysis classification methods. Random Forest and Orthogonal Partial Least Squares Discriminant Analysis classification methods are provided by Metabolomics Workbench. Support Vector Machine and Random Forest classification methods are also supported by MeltDB 2.0. Linear Discriminant Analysis and Support Vector Machine methods are provided by WebSpecmine. The Random Forest algorithm is also available in POMAShiny for classification.

Both enrichment analysis and pathway analysis are available in MetaboliticsDB. Enrichment analysis is supported by MetaboAnalyst 5.0, Metabolomics Workbench, MeltDB 2.0, XCMS Online, and 3Omics. MetaboAnalyst 5.0, MeltDB 2.0, MetabolomeExpress, XCMS Online, Caleydo, WebSpecmine, and 3Omics provide pathway analysis.

MetaboliticsDB, MetaboAnalyst 5.0, Metabolomics Workbench, MeltDB 2.0, XCMS Online, Caleydo, 3Omics, and POMAShiny support genome-scale metabolic networks. Python or R packages are available for MetaboliticsDB, MetaboAnalyst 5.0, Metabolomics Workbench, XCMS Online, WebSpecmine, and POMAShiny. MetaboliticsDB, MetaboAnalyst 5.0, WebSpecmine, and POMAShiny train machine learning models and use these models for sample prediction. MetaboliticsDB periodically trains and stores predictive models on analysis results for disease prediction.

MetaboliticsDB, Metabolomics Workbench, MeltDB 2.0, MetabolomeExpress, XCMS Online, WebSpecmine, and 3Omics enable users to compare analysis results. An advanced analysis search interface is available in MetaboliticsDB, Metabolomics Workbench, MetabolomeExpress, and XCMS Online. Metabolic flux change prediction is only available in MetaboliticsDB.

5 DISCUSSION

MetaboliticsDB is novel in a number of aspects in comparison to the existing similar works. In particular, it offers a data management platform for metabolomics analysis results along with a set of associated web-based tools that allow users to effectively query, visualize, and study the analysis results at network level. Considering metabolomics analysis results as the main object, and designing tools around it facilitates the offering of, in particular, three useful features:

(i) With the comparison feature, MetaboliticsDB enables researchers to compare their datasets to the known diseases or other users’ public analysis results representing different physiological conditions. In particular, the comparison feature allows researchers to make connections between seemingly different conditions, and have some insights about what kind of condition the current metabolomics data set may belong to. In the former case, the recognition of common mechanisms between two different conditions may pave the way for sharing known/existing therapies designed for each condition. In the latter case, it may provide researchers with pointers on where to look in the existing knowledge while interpreting a new and possibly complex case. Besides, the comparison interface can be utilized to understand the differences between sub-types of a disease, progression of disease stages, and the effect of possible drugs through before and after comparison. Finally, existing or possible common patterns across the same class of diseases may be observed (e.g., Fig. 4 compares different cancers). For instance, Warburg Effect in different types of cancers may be studied in depth.

(ii) With the disease and physiological condition association tools, MetaboliticsDB may help clinicians to get
[6] E. Lee, H.-Y. Chuang, J.-W. Kim, T. Ideker, and D. Lee, “Inferring pathway activity toward precise disease classification,” PLoS computational biology, vol. 4, no. 11, p. e1000217, 2008.
[7] P. Khatri, S. Sellamuthu, P. Malhotra, K. Amin, A. Done, and S. Draghici, “Recent additions and improvements to the onto-tools,” Nucleic Acids Research, vol. 33, no. suppl 2, pp. W762–W765, 2005.
[8] S. Draghici, P. Khatri, A. L. Tarca, K. Amin, A. Done, C. Voichita, C. Georgescu, and R. Romero, “A systems biology approach for pathway level analysis,” Genome research, vol. 17, no. 10, pp. 1537–1545, 2007.
[9] A. L. Tarca, S. Draghici, P. Khatri, S. S. Hassan, P. Mittal, J.-s. Kim, C. J. Kim, J. P. Kusanovic, and R. Romero, “A novel signaling pathway impact analysis,” Bioinformatics, vol. 25, no. 1, pp. 75–82, 2008.
[10] C. J. Vaske, S. C. Benz, J. Z. Sanborn, D. Earl, C. Szeto, J. Zhu, D. Haussler, and J. M. Stuart, “Inference of patient-specific pathway activities from multi-dimensional cancer genomics data using paradigm,” Bioinformatics, vol. 26, no. 12, pp. i237–i245, 2010.
[11] L. M. Heiser, A. Sadanandam, W.-L. Kuo, S. C. Benz, T. C. Goldstein, S. Ng, W. J. Gibb, N. J. Wang, S. Ziyad, F. Tong et al., “Subtype and pathway specific responses to anticancer compounds in breast cancer,” Proceedings of the National Academy of Sciences, vol. 109, no. 8, pp. 2724–2729, 2012.
[12] J. Xia, I. V. Sinelnikov, B. Han, and D. S. Wishart, “Metaboanalyst 3.0—making metabolomics more meaningful,” Nucleic acids research, vol. 43, no. W1, pp. W251–W257, 2015.
[13] M. Sud, E. Fahy, D. Cotter, K. Azam, I. Vadivelu, C. Burant, A. Edison, O. Fiehn, R. Higashi, K. S. Nair et al., “Metabolomics workbench: An international repository for metabolomics data and metadata, metabolite standards, protocols, tutorials and training, and analysis tools,” Nucleic acids research, vol. 44, no. D1, pp. D463–D470, 2015.
[14] N. Kessler, H. Neuweger, A. Bonte, G. Langenkämper, K. Niehaus, T. W. Nattkemper, and A. Goesmann, “Meltdb 2.0–advances of the metabolomics software system,” Bioinformatics, vol. 29, no. 19, pp. 2452–2459, 2013.
[15] A. J. Carroll, M. R. Badger, and A. H. Millar, “The metabolomeexpress project: enabling web-based processing, analysis and transparent dissemination of gc/ms metabolomics datasets,” BMC bioinformatics, vol. 11, no. 1, p. 376, 2010.
[16] R. Tautenhahn, G. J. Patti, D. Rinehart, and G. Siuzdak, “Xcms online: a web-based platform to process untargeted metabolomic data,” Analytical chemistry, vol. 84, no. 11, pp. 5035–5039, 2012.
[17] M. Streit, A. Lex, M. Kalkusch, K. Zatloukal, and D. Schmalstieg, “Caleydo: connecting pathways and gene expression,” Bioinformatics, vol. 25, no. 20, pp. 2760–2761, 2009.
[18] P. D. Karp, S. M. Paley, M. Krummenacker, M. Latendresse, J. M. Dale, T. J. Lee, P. Kaipa, F. Gilham, A. Spaulding, L. Popescu et al., “Pathway tools version 13.0: integrated software for pathway/genome informatics and systems biology,” Briefings in bioinformatics, vol. 11, no. 1, pp. 40–79, 2009.
[19] M. Kanehisa, M. Furumichi, M. Tanabe, Y. Sato, and K. Morishima, “Kegg: new perspectives on genomes, pathways, diseases and drugs,” Nucleic acids research, vol. 45, no. D1, pp. D353–D361, 2017.
[20] R. Caspi, R. Billington, L. Ferrer, H. Foerster, C. A. Fulcher, I. M. Keseler, A. Kothari, M. Krummenacker, M. Latendresse, L. A. Mueller et al., “The metacyc database of metabolic path-
models,” Nucleic acids research, vol. 44, no. D1, pp. D515–D522, 2015.
[25] S. Kim, P. A. Thiessen, E. E. Bolton, J. Chen, G. Fu, A. Gindulyte, L. Han, J. He, S. He, B. A. Shoemaker et al., “Pubchem substance and compound databases,” Nucleic acids research, vol. 44, no. D1, pp. D1202–D1213, 2015.
[26] B. Elliott, M. Kirac, A. Cakmak, G. Yavas, S. Mayes, E. Cheng, Y. Wang, C. Gupta, G. Ozsoyoglu, and Z. Meral Ozsoyoglu, “Pathcase: pathways database system,” Bioinformatics, vol. 24, no. 21, pp. 2526–2533, 2008.
[27] A. E. Cicek, X. Qi, A. Cakmak, S. R. Johnson, X. Han, S. Alshalwi, Z. M. Ozsoyoglu, and G. Ozsoyoglu, “An online system for metabolic network analysis,” Database, vol. 2014, p. bau091, 2014.
[28] Z. Pang, J. Chong, G. Zhou, D. A. de Lima Morais, L. Chang, M. Barrette, C. Gauthier, P.-É. Jacques, S. Li, and J. Xia, “Metaboanalyst 5.0: narrowing the gap between raw spectra and functional insights,” Nucleic acids research, vol. 49, no. W1, pp. W388–W396, 2021.
[29] S. Cardoso, T. Afonso, M. Maraschin, and M. Rocha, “Webspecmine: A website for metabolomics data analysis and mining,” Metabolites, vol. 9, no. 10, 2019. [Online]. Available: https://www.mdpi.com/2218-1989/9/10/237
[30] T.-C. Kuo, T.-F. Tian, and Y. J. Tseng, “3omics: a web-based systems biology tool for analysis, integration and visualization of human transcriptomic, proteomic and metabolomic data,” BMC systems biology, vol. 7, pp. 1–15, 2013.
[31] F. Giacomoni, G. Le Corguille, M. Monsoor, M. Landi, P. Pericard, M. Pétéra, C. Duperier, M. Tremblay-Franco, J.-F. Martin, D. Jacob et al., “Workflow4metabolomics: a collaborative research infrastructure for computational metabolomics,” Bioinformatics, vol. 31, no. 9, pp. 1493–1495, 2015.
[32] P. Castellano-Escuder, R. González-Domínguez, F. Carmona-Pontaque, C. Andrés-Lacueva, and A. Sánchez-Pla, “Pomashiny: A user-friendly web-based workflow for metabolomics and proteomics data analysis,” PLOS Computational Biology, vol. 17, no. 7, p. e1009148, 2021.
[33] E. Brunk, S. Sahoo, D. C. Zielinski, A. Altunkaya, A. Dräger, N. Mih, F. Gatto, A. Nilsson, G. A. Preciat Gonzalez, M. K. Aurich et al., “Recon3d enables a three-dimensional view of gene variation in human metabolism,” Nature biotechnology, vol. 36, no. 3, pp. 272–281, 2018.
[34] J. A. Baron, C. S.-B. Johnson, M. A. Schor, D. Olley, L. Nickel, V. Felix, J. B. Munro, S. M. Bello, C. Bearer, R. Lichenstein et al., “The do-kb knowledgebase: a 20-year journey developing the disease open science ecosystem,” Nucleic acids research, vol. 52, no. D1, pp. D1305–D1314, 2024.
[35] Z. A. King, J. Lu, A. Dräger, P. Miller, S. Federowicz, J. A. Lerman, A. Ebrahim, B. O. Palsson, and N. E. Lewis, “Bigg models: A platform for integrating, standardizing and sharing genome-scale models,” Nucleic acids research, vol. 44, no. D1, pp. D515–D522, 2016.
[36] E. Fahy and S. Subramaniam, “Refmet: a reference nomenclature for metabolomics,” Nature methods, vol. 17, no. 12, pp. 1173–1174, 2020.
[37] J. D. Orth, I. Thiele, and B. Ø. Palsson, “What is flux balance analysis?” Nature biotechnology, vol. 28, no. 3, pp. 245–248, 2010.
[38] A. C. Müller and A. Bockmayr, “Fast thermodynamically con-
ways and enzymes and the biocyc collection of pathway/genome strained flux variability analysis,” Bioinformatics, vol. 29, no. 7, pp.
databases,” Nucleic acids research, vol. 44, no. D1, pp. D471–D480, 903–909, 2013.
2015. [39] Z. A. King, A. Dräger, A. Ebrahim, N. Sonnenschein, N. E. Lewis,
[21] A. Fabregat, K. Sidiropoulos, P. Garapati, M. Gillespie, K. Haus- and B. O. Palsson, “Escher: a web application for building, sharing,
mann, R. Haw, B. Jassal, S. Jupe, F. Korninger, S. McKay et al., “The and embedding data-rich visualizations of biological pathways,”
reactome pathway knowledgebase,” Nucleic acids research, vol. 44, PLoS computational biology, vol. 11, no. 8, p. e1004321, 2015.
no. D1, pp. D481–D487, 2015. [40] T. Chen, G. Xie, X. Wang, J. Fan, Y. Qiu, X. Zheng, X. Qi, Y. Cao,
[22] D. S. Wishart, T. Jewison, A. C. Guo, M. Wilson, C. Knox, Y. Liu, M. Su, X. Wang et al., “Serum and urine metabolite profiling
Y. Djoumbou, R. Mandal, F. Aziat, E. Dong et al., “Hmdb 3.0—the reveals potential biomarkers of human hepatocellular carcinoma,”
human metabolome database in 2013,” Nucleic acids research, Molecular & Cellular Proteomics, vol. 10, no. 7, pp. M110–004 945,
vol. 41, no. D1, pp. D801–D807, 2012. 2011.
[23] J. Hastings, P. de Matos, A. Dekker, M. Ennis, B. Harsha, N. Kale, [41] S. Sahoo, H. S. Haraldsdóttir, R. M. Fleming, and I. Thiele, “Mod-
V. Muthukrishnan, G. Owen, S. Turner, M. Williams et al., “The eling the effects of commonly used drugs on human metabolism,”
chebi reference database and ontology for biologically relevant The FEBS journal, vol. 282, no. 2, pp. 297–317, 2015.
chemistry: enhancements for 2013,” Nucleic acids research, vol. 41, [42] H. Knauf and E. Mutschler, “Clinical pharmacokinetics and phar-
no. D1, pp. D456–D463, 2012. macodynamics of torasemide,” Clinical pharmacokinetics, vol. 34,
[24] Z. A. King, J. Lu, A. Dräger, P. Miller, S. Federowicz, J. A. Lerman, pp. 1–24, 1998.
A. Ebrahim, B. O. Palsson, and N. E. Lewis, “Bigg models: A [43] L. Che, P. Paliogiannis, A. Cigliano, M. G. Pilo, X. Chen, and D. F.
platform for integrating, standardizing and sharing genome-scale Calvisi, “Pathogenetic, prognostic, and therapeutic role of fatty