Integration of Multi-Omics Data in GSMs
Transcriptomic data are added to a GEM to build a context-specific GEM, but they report gene
expression levels only. A more accurate GEM also requires protein translation levels and
metabolic fluxes. These two factors are addressed by including proteomic data and expression
levels measured under different conditions. Metabolomic data add quantitative information on
small cellular molecules: amino acids, fatty acids, pathway intermediates and secondary
metabolites. The choice of algorithm for integrating multi-omics data sets into GEMs depends
on the data type and on the optimization problem (e.g., linear programming (LP) or mixed
integer linear programming (MILP)) that the algorithm solves. Omics integration algorithms
fall into four categories: objective function-required (OFR), expression data-compatible
(EDC), metabolic task-derived (MTD) and core reaction-required (CRR) algorithms.
OFR algorithms integrate gene expression data with the genome-scale model [44,45]. They aim
to minimize the inconsistency between the metabolic fluxes (measured and calculated) and the
expression data. Because the objective function must still be satisfied to obtain the optimal
fluxes, the inconsistencies between the gene expression data and the GEM are not fully
resolved.
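The trade-off described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: a hypothetical toy network (uptake to A, A to B through either R2 or R3, B to biomass), expression values used directly as target fluxes, and an integer grid search standing in for the LP that a real OFR method would solve. It is not the method of any cited algorithm.

```python
# Hypothetical toy network: uptake -> A, A -> B via R2 or R3, B -> biomass.
# Steady state forces v_uptake = v2 + v3 = v_biomass.
UPTAKE_MAX = 10  # assumed upper bound on the uptake flux


def ofr_fit(expr_r2, expr_r3, uptake_max=UPTAKE_MAX):
    """Return (v2, v3, gap) for the best flux split at optimal biomass."""
    # Step 1: satisfy the growth objective, which fixes the total flux.
    biomass = uptake_max
    best = None
    # Step 2: among growth-optimal splits, minimize the flux/expression gap.
    for v2 in range(biomass + 1):
        v3 = biomass - v2
        gap = abs(v2 - expr_r2) + abs(v3 - expr_r3)
        if best is None or gap < best[2]:
            best = (v2, v3, gap)
    return best


v2, v3, gap = ofr_fit(expr_r2=2, expr_r3=3)
print(v2, v3, gap)
```

With expression targets of 2 and 3, the total flux is still pinned at 10 by the growth objective, so a residual gap of 5 remains no matter how the flux is split. This is the sense in which the inconsistency is not fully resolved.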
EDC algorithms maximize the similarity between fluxes and gene expression levels without a
specific objective function [46,47]. A drawback is that the resulting model may be
non-functional, because no objective function checks that it can sustain positive growth.
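A minimal sketch of this failure mode follows, under stated assumptions: a hypothetical toy network (uptake to A, A to B through either R2 or R3, B to biomass, so biomass flux equals v2 + v3), expression values used directly as target fluxes, and a brute-force integer search in place of a real solver. None of these names or values come from the cited methods.

```python
from itertools import product

UPTAKE_MAX = 10  # assumed upper bound on the total uptake flux


def edc_fit(expr_r2, expr_r3, uptake_max=UPTAKE_MAX):
    """Pick the fluxes closest to expression, with no growth objective."""
    best = None
    for v2, v3 in product(range(uptake_max + 1), repeat=2):
        if v2 + v3 > uptake_max:  # respect the uptake bound
            continue
        gap = abs(v2 - expr_r2) + abs(v3 - expr_r3)
        if best is None or gap < best[2]:
            best = (v2, v3, gap)
    v2, v3, gap = best
    return {"v2": v2, "v3": v3, "biomass": v2 + v3, "gap": gap}


silent = edc_fit(expr_r2=0, expr_r3=0)  # both routes appear unexpressed
active = edc_fit(expr_r2=2, expr_r3=3)
print(silent["biomass"], active["biomass"])
```

When both routes look unexpressed, the best fit is the all-zero flux vector. It agrees with the data perfectly, yet the model produces no biomass, and nothing in the fit flags that.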
MTD algorithms compensate for the inconsistencies of the OFR and EDC algorithms [47,48].
They work within the OFR framework but incorporate an additional objective function for
growth or other metabolic results.
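The effect of adding a growth requirement can be sketched on a hypothetical toy network (uptake to A, A to B through either R2 or R3, B to biomass, so biomass flux equals v2 + v3). As a simplifying assumption, the growth requirement is modelled as a hard lower bound `min_growth` rather than as a second objective, and a brute-force integer search replaces the optimization solver. All names are illustrative.

```python
from itertools import product


def mtd_fit(expr_r2, expr_r3, min_growth, uptake_max=10):
    """Fit fluxes to expression while requiring a minimum biomass flux."""
    best = None
    for v2, v3 in product(range(uptake_max + 1), repeat=2):
        biomass = v2 + v3
        # Keep only flux splits that respect uptake and the growth task.
        if biomass > uptake_max or biomass < min_growth:
            continue
        gap = abs(v2 - expr_r2) + abs(v3 - expr_r3)
        if best is None or gap < best[2]:
            best = (v2, v3, gap)
    if best is None:  # the growth task cannot be met within the bounds
        return None
    v2, v3, gap = best
    return {"v2": v2, "v3": v3, "biomass": v2 + v3, "gap": gap}


result = mtd_fit(expr_r2=0, expr_r3=0, min_growth=1)
print(result)
```

Even when the expression data suggest that both routes are silent, the fit now carries just enough flux to meet the growth task, at the cost of a small disagreement with the data.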
Another type of data integrated into GEMs is 13C labelling data. In this type of
integration, a large set of reactions is derived from the genome annotation. This large
reaction set brings its own challenges, such as the sheer number of reactions and the atom
transitions that must be specified for every reaction in the network. Several groups have
attempted to build such models to understand the metabolic flux distribution, a valuable
tool in metabolic engineering [49–51].
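To make the atom-transition bookkeeping concrete, here is a minimal sketch for a single hypothetical reaction, A (three carbons) to B (two carbons) plus C (one carbon). The reaction, the `ATOM_MAP` layout and the `propagate` helper are assumptions for illustration only. A genome-scale model would need a map like this for every reaction, which is where the difficulty lies.

```python
# Hypothetical atom transition map for one reaction: A -> B + C.
# Each product carbon records the (substrate, position) it came from.
ATOM_MAP = {
    "B": [("A", 0), ("A", 1)],
    "C": [("A", 2)],
}


def propagate(labels, atom_map):
    """Carry 13C labels from substrates to products.

    labels maps each substrate to a tuple with one entry per carbon,
    where 1 marks a 13C atom and 0 marks a 12C atom.
    """
    return {
        prod: tuple(labels[src][pos] for src, pos in sources)
        for prod, sources in atom_map.items()
    }


# Substrate A labelled at its first carbon only.
out = propagate({"A": (1, 0, 0)}, ATOM_MAP)
print(out)
```

Here the label on the first carbon of A ends up on the first carbon of B, and C stays unlabelled.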