
Systems Biology: An approach to decoding life

Systems biology is the study of an organism viewed as an integrated and interacting network of
genes, proteins and biochemical reactions which give rise to life. Instead of focusing on
individual parts, it focuses on a complete system made up of different parts interacting with
each other, based on the philosophy that the whole is greater than the sum of its parts. For
example, the immune system is not made up of one single component but of a multitude of
genes, proteins and external influences. A systems biology approach means investigating the
components of cellular networks and their interactions, applying experimental high-throughput
and whole-genome techniques, and integrating computational and theoretical methods with
experimental efforts.

The key properties of systems biology

System structures: These include the network of gene interactions and biochemical pathways, as
well as the mechanisms by which such interactions modulate the physical properties of
intracellular and multicellular structures.

System dynamics: How a system behaves over time under various conditions can be understood
through metabolic analysis, dynamic analysis methods such as phase portrait and bifurcation
analysis, and by identifying essential mechanisms underlying specific behaviors.

The control method: Mechanisms that systematically control the state of the cell can be
modulated to minimize malfunctions and provide potential therapeutic targets for treatment of
disease.

The design method: Strategies to modify and construct biological systems having desired
properties can be devised based on definite design principles and simulations, instead of blind
trial-and-error.

Framework for systems biology

1. Identify all of the components of the system (i.e., define all genes in the genome, all mRNAs
and proteins expressed in a particular condition, or all protein-protein interactions). Use
these components, along with prior biochemical and genetic knowledge, to formulate an
initial model.
2. Systematically perturb and monitor components of the system. Specific perturbations may be
genetic (e.g., gene deletions, gene overexpression, or undirected mutations) or environmental
(e.g., changes in growth conditions, temperature, or stimulation by hormones or drugs).
3. Reconcile the experimentally observed responses with those predicted by the model. Refine
the model such that its predictions most closely agree with experimental observations.
Agreement between the observed and predicted responses is evaluated qualitatively and/or
quantitatively using a goodness-of-fit measure.
4. Design and perform new perturbation experiments to distinguish between multiple or
competing model hypotheses.
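
The reconciliation step in the framework above can be illustrated with a small sketch. It
assumes the sum of squared errors as the goodness-of-fit measure (the text does not name a
specific one), and the model names and numbers are hypothetical values chosen only for
illustration.

```python
def sum_squared_error(observed, predicted):
    """Goodness-of-fit (assumed measure): smaller values mean closer agreement."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    return sum((o - p) ** 2 for o, p in zip(observed, predicted))


def best_model(observed, predictions_by_model):
    """Return the name of the model whose predictions fit the observations best."""
    return min(
        predictions_by_model,
        key=lambda name: sum_squared_error(observed, predictions_by_model[name]),
    )


# Hypothetical responses of three genes to one perturbation (arbitrary units).
observed = [1.0, 0.2, 0.8]

# Hypothetical predictions from two competing models of the same system.
candidates = {
    "model_A": [0.9, 0.3, 0.7],
    "model_B": [0.1, 0.9, 0.2],
}

chosen = best_model(observed, candidates)
print(chosen, sum_squared_error(observed, candidates[chosen]))
```

When two candidate models fit about equally well, that is the situation in which a new
perturbation experiment is needed to tell them apart.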

Plant Systems Biology


Plant structure and function form a complex biological system determined by molecular
constituents such as DNA, RNA, proteins, metabolites, and macro- and microelements. Plants
have been extensively investigated from many angles, including molecular genetics, breeding,
genomics, and proteomics; however, we still have limited knowledge of plant genetic
architecture and functioning mechanisms. Various single and combined approaches developed in
the past decades have enhanced our knowledge of the role of genetics in plants. A holistic
research approach, however, requires a complete understanding of plant structure and function
at the molecular level, which needs substantial computational resources, different kinds of
data generation, and algorithmic approaches to integration. The most widely identified
challenges are the integration and management of large datasets from various sources such as
genomic sequences, phenotype images, protein 3D structures, and -omics data.

Recent advances in computational techniques, resources, and high-throughput sample processing
technologies have broken the technical and implementation barriers and facilitated the
evolution and integration of “-omics” fields such as genomics, transcriptomics, proteomics,
metabolomics, and phenomics for implementing the systems biology paradigm in plant science. A
large number of comprehensive and quantitative datasets have been generated in numerous
targeted and system-wide studies, facilitating the development of databases, software, data
formats, and multivariate approaches for the integration of multi-omics data. Systems biology
is the use of high-throughput genetic and molecular techniques to generate -omics data that can
be analyzed and used in mathematical and computational models, on a common platform, to reveal
networks on a global scale. Systems biology approaches have recently succeeded in categorizing
key molecules and their roles in complex biological events. Revealing the large, complex set of
associations among transcription factors, proteins, primary metabolites, and secondary
metabolites that regulate physiology, growth and development, and responses to the environment
requires the identification of networks on a genome and proteome scale. These interactions can
be either physical or functional and can often be inferred from available data.

Plant systems biology is the study of the interactions and dynamic behaviors of the constitutive
components of a plant system under different conditions, and the establishment of methods and
models to monitor and control cellular responses to developmental stages, genetic
perturbations, and environmental changes. Thus, plant systems biology is all about networks—
how the components of the system (genes, transcripts, proteins, etc.) interact with one another to
make a plant phenotype.

High-Throughput Experimental Techniques in Plant Systems Biology

Metabolomics

The metabolome is the collection of all metabolites in a biological cell, tissue, organ, or
organism, the end products of cellular processes. Metabolomic profiling is gaining popularity
because it gives an instant snapshot of cell physiology, whereas gene expression and proteomic
analyses are limited to genes and the products being made from them in the cell. The cellular
metabolome is a cohesive network of metabolite and biochemical interactions whose products,
reactants, intermediate steps, and regulatory molecules have not yet been fully characterized.
Metabolite profiling and metabolic fingerprinting are the major approaches used in
metabolomics. Metabolite profiling is used to identify and quantify the metabolites of a plant
cell. Metabolic fingerprinting is a high-throughput approach used for tissue comparison and
discrimination analysis. Metabolomics is also used to study the metabolic response of
organisms to physiological stimuli or genetic modification.

Bioinformatics

Bioinformatics integrates engineering, mathematics, and statistics with computer science in
order to interpret and understand biological data. It can be a powerful tool for the
development of biotechnology and industrial processes. Moreover, bioinformatics can provide a
better understanding of, and faster solutions to, problems in pharmaceutical, medical,
agricultural, and environmental fields, among others. Likewise, it is a reliable,
cost-effective alternative to expensive laboratory processes insofar as it can predict outcomes
through mathematical and statistical modeling. Bioinformatics facilitates the integration of
different molecular techniques with high-production processes in reduced time, thereby making
engineering and industrial processes more feasible. Thus, one advantage of using bioinformatics
is that it makes a process more reliable and predictable. Its usefulness in a given
application, however, depends on the types of biological tools and approaches used.

DNA Microarray Technology

Microarrays allow the expression of thousands of genes to be studied simultaneously through the
hybridization of probe sequences to nucleic acid sequences in a mixture. In microarrays, probe
sequences are fixed on a solid surface and hybridization is detected through the fluorescent
signal of labeled samples. Microarray technology was developed in the late 1970s and was
revolutionized at the beginning of the new century by the rapid growth in genomic sequences,
genome sequencing projects, and the availability of public curated and non-curated databases.
Glass spotted arrays, in situ synthesized arrays, and self-assembled arrays were the basic
types of arrays used in that period. Microarrays were used to measure gene expression levels
and for differential gene expression studies, comparison of expression patterns across samples,
trait associations, etc. These studies improved our understanding of cellular physiology and
dynamics, of the interconnection of gene networks, and of the products involved in processing
environmental input and regulating phenotype, which facilitated global gene expression studies
at the systems level.

Proteomics

Proteomics is the quantitative study of the proteins expressed by a genome, used to
characterize an organism or biological process and to explain the mechanisms of gene expression
control. Protein research has evolved continuously since individual proteins of E. coli were
first isolated from protein complexes by two-dimensional polyacrylamide gel electrophoresis
(2-DE). Later, mass spectrometry (MS) was coupled with 2-DE gels to identify large numbers of
proteins, an approach that evolved into proteomics. Proteomics studies have three main
components: expression proteomics, bioinformatics analysis, and functional proteomics.
Expression proteomics covers sample preparation through gel-based or gel-free methods and
protein identification analysis. Quantitative proteomics has been used to identify the proteins
expressed in a specific cell or tissue and to compare protein expression profiles across
experimental conditions or disease states in order to explore physiology and pathogenic
mechanisms. Functional proteomics is a protein characterization approach that aims to
understand the role of targeted proteins in cellular functions; it requires high-throughput,
comprehensive analyses of protein-protein interactions, protein complexes, and transmembrane
proteins of the organism.

Biological Network
A network is used to represent a system of elements that interact. A network has the following
two basic parts: Nodes (or vertices) represent the active biological molecules present inside a
cell (e.g. proteins, RNAs and/or metabolites). Edges (or links) between nodes represent their
biological relationships (e.g. physical interactions, regulatory connections, metabolic
reactions). Multi-partite graphs contain different classes of node (such as mRNA and protein).

Directed edges run from a source (starting node) to a sink (ending node) and represent a
unidirectional flow of material or information. Non-directed edges are used to represent mutual
interactions, for example, physical interaction between two proteins of unknown function, or
interactions where the directional flow of information is not known. Source nodes: Nodes with
only outgoing edges. Sink nodes: Nodes with only incoming edges.

The degree of a node is the number of edges pointing towards or emanating from that node. A
node’s total degree is the sum of its in-degree and out-degree, which respectively quantify the
number of incoming and outgoing edges of the node. The degree distribution P(k) quantifies the
fraction of nodes with degree k. It is calculated by dividing the number of nodes that have
degree k by the total number of nodes. Path length: The path length or distance between any two
nodes in a network is the number of edges in the shortest path connecting those nodes.
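
The definitions above can be made concrete with a minimal sketch. The five-node directed
network below is hypothetical and chosen only for illustration; the path length is computed by
breadth-first search, which is one standard way to find a shortest path measured in edges.

```python
from collections import Counter, deque

# Hypothetical directed network given as (source, target) edges.
edges = [("L", "K"), ("K", "M"), ("M", "L"), ("I", "J")]
nodes = sorted({n for edge in edges for n in edge})

out_degree = Counter(src for src, _ in edges)
in_degree = Counter(dst for _, dst in edges)
total_degree = {n: in_degree[n] + out_degree[n] for n in nodes}

# P(k): number of nodes with total degree k divided by the total number of nodes.
degree_counts = Counter(total_degree.values())
P = {k: count / len(nodes) for k, count in degree_counts.items()}

# Source nodes have only outgoing edges; sink nodes have only incoming edges.
sources = [n for n in nodes if in_degree[n] == 0 and out_degree[n] > 0]
sinks = [n for n in nodes if out_degree[n] == 0 and in_degree[n] > 0]


def path_length(start, goal):
    """Number of edges on the shortest directed path (infinity if none exists)."""
    successors = {n: [] for n in nodes}
    for src, dst in edges:
        successors[src].append(dst)
    distance = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current == goal:
            return distance[current]
        for nxt in successors[current]:
            if nxt not in distance:
                distance[nxt] = distance[current] + 1
                queue.append(nxt)
    return float("inf")


print(P, sources, sinks, path_length("K", "L"))
```

Because the edges are directed, the distance from one node to another need not equal the
distance in the opposite direction, and it is infinite when no directed path exists.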

Biological Network Topology and Characteristics

Nodes and edges are the basic elements of a network. In biology, genes and proteins serve as
nodes, and the functional or physical relationships between them serve as edges. In general,
network topology is defined as the arrangement of nodes and edges that determines the
functional aspects of the network. Node degree (i.e., the number of edges connected to a node)
and degree distribution (i.e., the overall distribution of node degrees in a network) determine
the nature of a network. The degree distribution is often used to differentiate classes of
networks, whereas the number of edges is used to measure distances between nodes. The
navigability of a network is measured through the minimum path length (the minimum number of
edges between two nodes) or the mean path length (the average of the shortest paths between
all pairs of nodes). Node centrality, used to identify hub nodes, is measured through the
shortest paths between all pairs of nodes in a network. In contrast to a hub node, a bottleneck
node does not necessarily have many interactions but works as a linker between different
subnetworks. A node can be both a hub and a bottleneck. Network robustness can also be
characterized through network redundancy and degeneracy. Redundant connectivity of nodes
through multiple paths is important for maintaining network integrity when other connections
are absent, whereas network degeneracy is a special type of redundancy which leads to both
overlapping and separate effects in the network.

[Figure: example directed network illustrating source nodes, sink nodes, degree distribution,
and path length.]

In the example network, the distance from node L to node K is 1, while the distance from node K
to node L is 2 (along the path K, M, L). The distance from node J to node I is infinite because
no path starting from J and ending in I exists.

Network properties
Random network: To construct a random network, two nodes are chosen randomly from a pool of N
nodes and a link is established between them. In random networks, few nodes are very lowly or
very highly connected, and most nodes have roughly the same number of links.
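
The construction described above can be sketched as follows. This is a minimal illustration
under stated assumptions: the function names, the choice to forbid duplicate links and
self-links, and the sizes used (50 nodes, 100 links) are not from the text.

```python
import random
from collections import Counter


def random_network(n_nodes, n_links, seed=None):
    """Build an undirected random network by repeatedly linking two randomly chosen nodes."""
    max_links = n_nodes * (n_nodes - 1) // 2
    if n_links > max_links:
        raise ValueError("too many links requested for this number of nodes")
    rng = random.Random(seed)
    links = set()
    while len(links) < n_links:
        a, b = rng.sample(range(n_nodes), 2)   # two distinct nodes, so no self-links
        links.add((min(a, b), max(a, b)))      # store each pair once, so no duplicates
    return links


def degrees(n_nodes, links):
    """Degree of every node, including nodes that received no links."""
    counts = Counter()
    for a, b in links:
        counts[a] += 1
        counts[b] += 1
    return [counts[i] for i in range(n_nodes)]


links = random_network(n_nodes=50, n_links=100, seed=1)
degree_list = degrees(50, links)
print(sorted(Counter(degree_list).items()))   # how many nodes have each degree
```

Printing the degree counts for a few different seeds gives a quick feel for how node degrees
cluster around the average, which here is 2 x 100 / 50 = 4 links per node.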

In scale-free networks there are many nodes that are poorly connected and few nodes that are
very highly connected. The highest-degree nodes (those with the highest number of edges) are
typically referred to as ‘hubs’ and are important for the architecture and function of the
network. Removing a hub from a biological network has a high probability of resulting in a
lethal phenotype, which supports the fundamental role of hubs. An important consequence of the
scale-free topology of naturally occurring networks is robustness. Scale-free networks show an
extraordinary tolerance to perturbations as compared to random networks of equivalent size.

The importance of hubs in protein network

Proteins and their interactions form a protein–protein interaction network, where the proteins are
the nodes and the interactions are the edges. Genomic studies show that deleting a highly
connected protein node (hub) is more likely to be lethal to an organism than deleting a lowly
connected node (non-hub), a phenomenon known as the centrality-lethality rule. Because hubs
are more important than non-hubs in organizing the global network structure, the centrality-
lethality rule is widely believed to reflect the significance of network architecture in determining
network function, a key notion of systems biology.

Modules in networks: The large, complex biological networks are organized into smaller sub-
networks consisting of directly interacting, or ‘connected’ molecular components. These sub-
networks correspond to biologically functional units or ‘modules’. Importance: Comparative
analyses of structurally similar modules across different species may identify mutually shared
functions, associate a modular structure with a new function, and provide insight into the
evolution of various network structures.

Network motifs

‘Network motifs’ are patterns of interconnections that recur in many different parts of a network
at frequencies much higher than those found in randomized networks.

Feed forward loop (FFL): The motif, termed ‘feed forward loop’, is defined by a transcription
factor X that regulates a second transcription factor Y, such that both X and Y jointly
regulate an operon Z. X is the ‘general transcription factor’, Y the ‘specific transcription
factor’, and Z the ‘effector operon(s)’.

Example of a FFL: Crp is the general transcription factor and AraC the specific transcription
factor.

A feed forward loop motif is ‘coherent’ if the direct effect of the general transcription factor on
the effector operons has the same sign (negative or positive) as its net indirect effect through the
specific transcription factor. For example, if X and Y both positively regulate Z, and X positively
regulates Y, the feed forward loop is coherent. If, on the other hand, X represses Y, then the
motif is incoherent. We find that most (85%) of the feed forward loop motifs are coherent.
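
The coherence rule above reduces to a comparison of signs, which the following minimal sketch
makes explicit. Encoding activation as +1 and repression as -1, and the function name, are
assumptions made for illustration.

```python
def ffl_is_coherent(x_to_y, y_to_z, x_to_z):
    """Classify a feed forward loop; each argument is +1 (activation) or -1 (repression)."""
    for sign in (x_to_y, y_to_z, x_to_z):
        if sign not in (1, -1):
            raise ValueError("signs must be +1 or -1")
    indirect = x_to_y * y_to_z      # net sign of the indirect path X -> Y -> Z
    return indirect == x_to_z       # coherent when direct and indirect signs agree


# X activates Y, and both X and Y activate Z: coherent.
print(ffl_is_coherent(x_to_y=1, y_to_z=1, x_to_z=1))

# X represses Y, while both X and Y activate Z: incoherent.
print(ffl_is_coherent(x_to_y=-1, y_to_z=1, x_to_z=1))
```

Multiplying the two signs along the indirect path is what gives the "net indirect effect"
named in the definition.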

Single input motif (SIM): The motif, termed single-input module (SIM), is defined by a set of
operons that are controlled by a single transcription factor. All of the operons are under control
of the same sign (all positive or all negative) and have no additional transcriptional regulation.
The transcription factors controlling SIM motifs are usually autoregulatory.

Example of a SIM is the arginine biosynthesis pathway, where the transcription factor ArgR
uniquely controls five operons that encode arginine biosynthesis genes.

Dense overlapping regulons (DOR): The motif, termed ‘dense overlapping regulons’ (DOR), is
a layer of overlapping interactions between operons and a group of input transcription factors
that is much more dense than corresponding structures in randomized networks.

Network motifs: autoregulation

Negative autoregulation: Negative autoregulation (NAR) occurs when a transcription factor
represses the transcription of its own gene. NAR has been shown to display two important
functions: (1) NAR speeds up the response time of gene circuits. (2) NAR can reduce cell–cell
variation in protein levels.

NAR speeds up the response time of gene circuits: This occurs when NAR uses a strong
promoter to obtain a rapid initial rise in the concentration of protein X. When X concentration
reaches the repression threshold for its own promoter, the production rate of new X decreases.
Thus, the concentration of X locks into a steady-state level that is close to its repression
threshold. By contrast, a simply regulated gene that is designed to reach the same steady-state
level must use a weaker promoter. As a result, an NAR system reaches 50% of its steady state
faster than a simply regulated gene. The dynamics of NAR show a rapid initial rise followed by a
sudden locking into the steady state.
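
The speed-up described above can be checked numerically with a minimal sketch. It assumes a
simple rate equation dX/dt = production(X) - alpha*X, an all-or-none repression step at the
threshold, and Euler integration; the rate constants (1 and 10), the threshold, and the
function names are hypothetical choices made for illustration, picked so that both circuits
settle near the same steady-state level of 1.

```python
import math


def time_to_reach(target, production_rate, alpha=1.0, dt=0.001, t_max=20.0):
    """Euler-integrate dX/dt = production_rate(X) - alpha*X from X = 0.

    Returns the first simulated time at which X >= target (infinity if never).
    """
    x, t = 0.0, 0.0
    while t < t_max:
        if x >= target:
            return t
        x += dt * (production_rate(x) - alpha * x)
        t += dt
    return math.inf


def simple_production(x):
    """Simply regulated gene: weak promoter, constant production (assumed rate 1)."""
    return 1.0


def nar_production(x, strong_rate=10.0, threshold=1.0):
    """NAR gene: strong promoter that shuts off once X reaches its repression threshold."""
    return strong_rate if x < threshold else 0.0


# Both circuits have a steady state near 1, so half of the steady state is 0.5.
t_simple = time_to_reach(0.5, simple_production)
t_nar = time_to_reach(0.5, nar_production)
print(t_simple, t_nar)
```

For the simply regulated gene the half-rise time is close to ln(2)/alpha, whereas the NAR
circuit gets there much sooner because its strong promoter drives the initial rise.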

NAR can reduce cell–cell variation in protein levels: Unavoidable variations are due to an
inherent source of noise: the production rates of proteins fluctuate by tens of percent. This noise
results in cell–cell variation in protein level. NAR can, in many cases, reduce these variations.
High concentrations of X reduce its own rate of production, whereas low concentrations cause an
increased production rate. The result is a narrower distribution of protein levels than would be
expected in simply regulated genes.

Positive autoregulation (PAR) occurs when a transcription factor enhances its own rate of
production.

PAR slows the response time: At early stages, when levels of X are low, production is slow.
Production picks up only when X concentration approaches the activation threshold for its own
promoter. The response time is longer than in a corresponding simple-regulation system.

PAR tends to increase cell–cell variability: If PAR is weak (that is, X moderately enhances its
own production rate), the cell–cell distribution of X concentration is expected to be broader than
in the case of a simply regulated gene. Strong PAR can lead to bimodal distributions, whereby
the concentration of X is low in some cells but high in others. In cells in which the concentration
is high, X activates its own production and keeps it high indefinitely. Strong PAR can therefore
lead to a differentiation-like partitioning of cells into two populations.
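
The two populations described above correspond to two stable steady states of a strongly
self-activating gene. The sketch below is a deterministic illustration under stated
assumptions: a Hill-type activation term, a small basal production rate, and all parameter
values are hypothetical, and noise (which would spread real cells between the two states) is
left out.

```python
def settle(x0, basal=0.05, beta=1.0, K=0.5, n=4, alpha=1.0, dt=0.01, t_end=50.0):
    """Euler-integrate dX/dt = basal + beta*X^n/(K^n + X^n) - alpha*X starting from x0.

    Returns the concentration of X at the end of the simulation.
    """
    x = x0
    for _ in range(int(t_end / dt)):
        activation = x**n / (K**n + x**n)   # assumed Hill-type self-activation
        x += dt * (basal + beta * activation - alpha * x)
    return x


# A cell that starts below the activation threshold stays low ...
low = settle(0.0)

# ... while a cell that starts above it switches on and stays high.
high = settle(0.6)

print(low, high)
```

The same gene with the same parameters ends in a different state depending only on where it
starts, which is why strong PAR can split a population of cells in two.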

Application of Network System in Plant Biology


Gene-to-Metabolite Network

This network captures the correlation, and its significance, between differentially expressed
genes and the metabolites associated with metabolic regulation under a given set of
conditions. In this interaction network, genes and metabolites are the nodes, and edges
represent the associations between them. The interactions are interpreted depending on the
distance between the genes and the metabolites. This type of network is highly complex and
difficult to study in plants, owing to the enormous diversity and number of metabolites that
plant cells produce as a consequence of the sessile lifestyle of plants. In plant science,
gene-to-metabolite networks elucidate the interrelations among biological processes, support
gene functional annotation, and aid the discovery of new genes involved in the regulation of
biosynthesis and the transport of metabolites. For the various biotic and abiotic stresses in
plants, researchers use gene-to-metabolite networks to reveal how genes regulate cellular
pathways as well as the synthesis of primary and secondary metabolites to protect plants.

Importance:

1. The gene-to-metabolite network clarifies how biological processes are interrelated, with
better prediction of outcomes for system perturbations.
2. The systemic view enables the discovery of key regulatory components in a biological system
and thus improves gene function annotation.
3. The association of genes with metabolites enables the discovery of new genes involved in
metabolite biosynthesis, transport and regulation.
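
A correlation-based gene-to-metabolite network of the kind discussed in this section can be
sketched as follows. This is a minimal illustration under stated assumptions: Pearson
correlation is used as the association measure, a fixed cut-off of 0.9 stands in for a proper
significance test, and the gene names, metabolite names, and measurements are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long, non-constant series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical measurements across four conditions (arbitrary units).
genes = {"geneA": [1, 2, 3, 4], "geneB": [4, 3, 2, 1]}
metabolites = {"met1": [2, 4, 6, 8], "met2": [1, 3, 2, 3]}

CUTOFF = 0.9  # assumed threshold standing in for a significance test

# Keep a gene-metabolite edge when the absolute correlation passes the cut-off.
network_edges = []
for gene, expression in genes.items():
    for metabolite, level in metabolites.items():
        r = pearson(expression, level)
        if abs(r) >= CUTOFF:
            network_edges.append((gene, metabolite, round(r, 3)))

print(network_edges)
```

With real data one would replace the fixed cut-off by a significance test corrected for the
large number of gene-metabolite pairs being compared.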

Protein-Protein Interaction Network

Protein-protein interactions (PPIs) are one of the most significant components of biological
networks. In PPI networks, the nodes are proteins, connected by directed edges if the
direction of information flow during their interaction is known, or by undirected edges if
there is strong evidence of their physical interaction or association but no evidence for its
directionality. Two types of interaction are possible: genetic or physical. In genetic
approaches, a network of genes is characterized on the basis of genetic interactions to explain
gene function within physiological processes. This method is difficult to implement in plants,
however, owing to ploidy levels and perennial species. In physical approaches, interaction maps
have been experimentally elucidated for homo- and hetero-dimerization within two large classes
of transcription factors, for example, networks between the MADS box transcription factors and
the MYB transcription factor family.

Transcriptional Regulatory Network

A transcriptional regulatory network elucidates the regulatory interactions between
transcription factors and downstream genes. To understand cellular dynamics, thorough knowledge
of each regulatory network is required. In this network, nodes represent transcription factors
and regulatory genes, whereas edges represent transcriptional regulation. Other regulatory
network models are evaluated based on promoter co-occupancy by pairs of transcription factors
and computational prediction of cis-elements.

Gene Regulatory Network

A gene regulatory network reveals the role of genes in the physiological processes of life,
including cell differentiation, metabolism, the cell cycle, and signal transduction. In this
network, the nodes correspond to genes and messenger RNAs or proteins, and the edges represent
regulatory interactions such as activation, inhibition, and repression between the components
of the network. Generally, it is a collective network of genes, noncoding RNAs, proteins,
metabolites, and signaling components. A gene regulatory network incorporates the regulation of
DNA transcription, RNA translation, and posttranscriptional RNA processing, as well as
posttranslational modifications such as protein targeting and covalent protein modifications.
Gene regulatory networks display the dynamics of plant systems.

Modelling in systems biology


Discrete deterministic models usually characterize network nodes by two binary states
corresponding to, for example, an expressed or not expressed gene, an open or closed ion
channel, or above-threshold or below-threshold concentration of a molecule. The change in state
of each regulated node is generally described by a logical function using the Boolean operators
“and,” “or,” and “not”. Boolean models can predict dynamic trends in the absence of detailed
kinetic parameters.

Boolean network models are a special case of discrete dynamic models. A Boolean network
consists of a set of nodes whose state is binary and is determined by other nodes in the network
through Boolean functions. In terms of complexity, Boolean networks lie between static network
models and continuous dynamic models, making them a tractable and powerful approach to
modeling large-scale biological systems.
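
A Boolean network of the kind described above can be sketched in a few lines. The three-node
network and its update rules are hypothetical, and synchronous updating (all nodes updated at
once from the previous state) is an assumption; other update schemes exist.

```python
def step(state):
    """Synchronously update every node from the current state using Boolean rules."""
    return {
        "A": not state["C"],              # C represses A
        "B": state["A"],                  # A activates B
        "C": state["A"] and state["B"],   # C needs both A and B
    }


def trajectory(state, n_steps):
    """Return the list of states visited, starting with the initial state."""
    states = [state]
    for _ in range(n_steps):
        state = step(state)
        states.append(state)
    return states


start = {"A": True, "B": False, "C": False}
path = trajectory(start, 5)
for s in path:
    print(s)
```

No kinetic parameters are needed: the rules alone determine that this network cycles through
five states and returns to where it started, which is the kind of dynamic trend a Boolean model
can predict.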

Deterministic models of genetic regulatory networks

A deterministic model of a genetic regulatory network may involve a number of different
mechanisms that capture the collective behavior of the elements constituting the network. The
models can differ in numerous ways, such as in the nature of the physical elements that are
represented in the model (i.e., genes, proteins, and other factors); the resolution or scale at
which the behavior of the network elements is captured (e.g., are genes discretized, such as
being either on or off, or do they take on continuous values?); and how the network elements
interact (e.g., interactions can either be present or absent or they may have a quantitative
nature). The common aspect of deterministic models is the inherent lack of randomness in the
model.

Software and Tools for Network Analysis

Researchers develop and use bioinformatics software and databases for the comprehensive study
of plant systems biology. The tools, databases, and other resources used in the analysis of the
individual -omics platforms include network visualization tools, modeling environments, pathway
construction and visualization tools, systems biology platforms, and model repositories.
Visualization is a means of investigative data analysis and a key method for network analysis.
The purpose of visualizing large -omics datasets should be to create clear, meaningful, and
integrated resources without being overwhelmed by the inherent complexity of the data
(Gehlenborg et al. 2010). Pathway databases are used for modeling systems, since their
annotated reaction systems offer a clear-cut way of building network topologies. Some of the
tools and databases widely used by the plant research community are listed in the table below.

Table: Systems biology tools and database resources (serial number, tool, description and URL,
grouped by network type).

Genetic interaction network

1. GeneOrienteer: GeneOrienteer is a database that predicts gene-gene interactions. It does
this based on correlations between genetic traits, orthology to known interacting genes, and
public two-hybrid data. Inputting a gene name for a specific organism can result in a list of
predicted interactions, information about the inputted gene, or details about an inputted pair
of genes [http://geneorienteer.org/]

2. AraNet: AraNet is a probabilistic functional gene network of Arabidopsis thaliana,
constructed by a modified Bayesian integration of 24 types of “-omics” data from multiple
organisms, with each data type weighted according to how well it links genes that are known to
function together in Arabidopsis thaliana. Each interaction in AraNet has an associated
log-likelihood score (LLS) that measures the probability of an interaction representing a true
functional linkage between two genes [http://www.inetbio.org/aranet/]

Protein-protein interaction

3. AtPID: The AtPID (Arabidopsis thaliana protein interactome database) represents a
centralized platform to depict and integrate the information pertaining to protein-protein
interaction networks, domain architecture, ortholog information, and GO annotation in the
Arabidopsis thaliana proteome [http://www.megabionet.org/atpid/webfile/]

4. MitoInteractome: MitoInteractome is a web-based portal containing information relevant to
mitochondrial proteins. It also serves as a research tool for finding interacting partners and
studying mitochondrial diseases. It has a comprehensive collection and organization of
organelle-specific data. The data is primarily obtained by keyword search at Swissprot, MitoP,
and MitoProteome [http://mitointeractome.kobic.kr/]

Metabolic pathways

5. MetExplore: MetExplore is a web server that offers the possibility to link the metabolites
identified in untargeted metabolomic experiments within the context of genome-scale
reconstructed metabolic networks. The analysis pipeline comprises mapping metabolomics data
(from masses or identifiers) onto the specific metabolic network of an organism, then applying
graph-based methods and advanced visualization tools to enhance data analysis. MetExplore
stores metabolic networks and information about metabolites from about 60 organisms in a
relational database. Various filters can be applied in MetExplore to restrict the scope of the
study, for example, by selecting only particular pathways or by restricting the network to the
small-molecule metabolism. [http://metexplore.toulouse.inra.fr/joomla3/]

6. Plant reactome: The plant reactome is a free, open-source, curated, and peer-reviewed
database of plant metabolic and regulatory pathways. Its goal is to provide intuitive
bioinformatics tools for the visualization, interpretation, and analysis of pathway knowledge
to support basic research, genome analysis, modeling, systems biology, and education.
[http://plantreactome.gramene.org/]

Signaling pathways

7. DNAtraffic: The DNAtraffic database is intended to be a unique, comprehensive, and richly
annotated database of genome dynamics during the cell life. DNAtraffic contains extensive data
on the nomenclature, ontology, structure, and function of proteins related to control of the
DNA integrity mechanisms such as chromatin remodeling, DNA repair, and damage response pathways
[http://dnatraffic.ibb.waw.pl/]

8. Arabidopsis reactome: The aim of Arabidopsis reactome is to develop a curated resource of
core pathways and reactions in plant biology. The information in this database is authored by
biological researchers with expertise in their field and maintained by the Arabidopsis reactome
editorial staff. Contents are cross-referenced with the following external databases: PubMed,
GO, ATIDB, TAIR, MIPS, UniProt, ChEBI, and KEGG COMPOUND. In addition to curated events (center
of reaction map), imported Arabidopsis events from KEGG and AraCyc databases are also provided.
Moreover, inferred orthologous events in five other plants including rice, grape, poplar, and
moss are also available. [http://www.arabidopsisreactome.org]

Systems biology approach to abiotic stress

In the post-genomic era, comprehensive analyses using three systematic approaches, or omics,
have increased our understanding of the complex molecular regulatory networks associated with
stress adaptation and tolerance. Integration of the different ‘omics’ analyses facilitates
abiotic stress signaling studies, allowing for more robust identification of molecular targets
for future biotechnological applications in crops and trees. The significance of systems
biology in exploring abiotic stress tolerance is as follows:

1. Co-expression analyses identify regulatory hubs
2. Time-series analyses reveal multiple phases in stress responses
3. Integration of omics analysis identifies molecular networks functioning in abiotic stress
responses
4. Systematic application of ‘omics’ technologies has contributed to the development of
stress-tolerant crops in the field
5. Mapping stress responses has provided new insights and identified gaps in our knowledge
of abiotic stress responses

Systems approaches to studying root transcriptional regulatory networks regulating root development

Transcription factors (TFs) of interest are first identified by cell-type-specific and development-
specific transcriptional profiling. TFs are analyzed and transcriptionally profiled to detect
potential downstream targets of the initial TFs selected. One can validate physical interaction
between the TFs and their potential targets first by determining that the TFs and their potential
targets are coexpressed, using their transcriptional profile, then by different in vitro or in vivo
tests. High-confidence TF targets are validated by multiple methods of identification. Subsequent
rounds of TF/target identification can allow one to generate a model of a TF network. Other
aspects of the network, such as the metabolic output, signaling, and posttranscriptional gene
silencing, should also be characterized in a high-resolution, systematic manner. Combining all
these results can allow researchers to generate a detailed, high-resolution, gene regulatory
network model of root development. This model should then be tested experimentally to
determine its robustness and its ability to recapitulate experimental observations. The
dynamics of these networks can then be monitored by measuring the expression of fluorescently
tagged genes within the network and by testing the network under a range of conditions.

Metabolic network analysis of plant systems

Metabolic engineering of plants for enhanced crop yield and value-added compositional traits is
particularly challenging, as plants probably exhibit the highest metabolic network complexity of all
living organisms. Therefore, approaches to plant metabolic network analysis, which can provide
systems-level understanding of plant physiology, appear valuable as guidance for plant metabolic
engineers. Strongly supported by the sequencing of plant genomes, a number of different
experimental and computational methods have emerged in recent years to study plant systems at
various levels: from heterotrophic cell cultures to autotrophic entire plants. The current
toolbox for plant metabolic network analysis includes different in silico modeling techniques,
such as flux balance analysis, elementary flux mode analysis and kinetic flux profiling, as well
as different variants of experiments with plant systems which use radioactive and stable isotopes
to determine in vivo plant metabolic fluxes. The fundamental principles of these techniques, the
required data input and the flux information obtained are complemented by technical advice
specific to plants. In addition, pioneering and high-impact findings of plant metabolic network
analysis highlight the potential of the field.

Diverse studies have greatly enhanced our understanding of plant functioning and strongly
contributed to discoveries in plant biochemistry and physiology. The increasing knowledge
derived from these studies directly feeds the continuous curation, validation, and extension of
plant metabolic network models. Additional value of plant metabolic network analysis still lies in
the future: the full integration of fluxes and network properties as guidance for metabolic
engineering of superior plant lines, which has so far shown success only very rarely. In such a
way, rational metabolic engineering could assist in establishing plants as green factories for
industrial applications.

At the sub-cellular scale, the biological components of interest, and thus the nodes of the
networks, are genes, the products of their expression (mRNA, proteins) and other molecules
(e.g. sugars, lipids, ions, etc.) interacting with them. Depending on the starting point and the
objective of the studies at this scale, two distinct approaches have been developed: bottom-up
(local) and top-down (global) approaches.
‘Bottom-up’ approaches
The first type of approach that can be used to model subcellular systems corresponds to local
‘bottom-up’ studies. Their focus is on being able to predict global behaviours from a set of local
mechanisms. Bottom-up studies are idea driven, meaning that networks considered in bottom-up
studies are built according to the specification of the modellers. At the sub-cellular scale, this
implies that the modellers need at least a partial knowledge of the considered biological system
to be able to define the corresponding network according to logical rules. By nature, bottom-up
studies thus deal with small- to medium-scale networks.
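As a minimal illustration of a bottom-up model built from logical rules, the sketch below encodes a three-gene Boolean network and follows its state over time. The genes A, B and C and their rules are purely hypothetical, chosen only to show how a global behaviour (here, an oscillation) can emerge from a few local mechanisms.

```python
def step(state):
    """Synchronous update of a hypothetical three-gene Boolean network."""
    a, b, c = state
    return (
        not c,     # A is expressed unless its repressor C is present
        a,         # B is activated by A
        a and b,   # C requires both A and B
    )


def trajectory(state, max_steps=20):
    """Follow the network until a state repeats; return visited states and the repeat."""
    seen = []
    while state not in seen and len(seen) < max_steps:
        seen.append(state)
        state = step(state)
    return seen, state


# Hypothetical starting point: only gene A is on.
visited, repeated = trajectory((True, False, False))
for s in visited:
    print(s)
print("returns to:", repeated)
```

Because C represses A while A indirectly activates C, the network contains a negative feedback loop, and the trajectory cycles instead of settling into a fixed state.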
‘Top-down’ networks
The second type of network study widely used at the subcellular scale is often described as a
global ‘top-down’ approach. It focuses on analysing the behaviour of a system to extract
information on the underlying mechanisms. Top-down studies are data driven, and at the
sub-cellular scale are dependent on the existence of knowledge databases generated by ‘omics’
techniques such as transcriptomics, proteomics or metabolomics. Top-down approaches have
arisen from the need to extract meaningful information from those databases, and consequently
deal with larger-scale networks. Unlike in bottom-up studies, the size of the interaction
networks means that edges cannot be defined manually by the modellers. Instead, their
construction relies on data-mining, regression analysis and statistical network inference methods
to determine node interconnections and the functions associated with edges.
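One of the simplest statistical ways to infer edges from data is to connect genes whose expression profiles are strongly correlated. The sketch below assumes a made-up expression table with three genes measured in five samples and an arbitrary correlation threshold of 0.9. Real top-down studies typically use far larger data sets and more elaborate inference methods.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equally long profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # a constant profile carries no correlation information
    return cov / (norm_x * norm_y)


def coexpression_edges(expression, threshold=0.9):
    """Return (gene1, gene2, r) for every pair with |r| at or above the threshold."""
    edges = []
    for g1, g2 in combinations(sorted(expression), 2):
        r = pearson(expression[g1], expression[g2])
        if abs(r) >= threshold:
            edges.append((g1, g2, r))
    return edges


# Hypothetical expression values (one list per gene, one entry per sample).
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.0, 4.0, 6.0, 8.0, 10.5],
    "geneC": [5.0, 1.0, 4.0, 2.0, 3.0],
}
edges = coexpression_edges(expression)
for g1, g2, r in edges:
    print(f"{g1} -- {g2}  r = {r:.3f}")
```

In this toy data set only geneA and geneB pass the threshold, so the inferred network has a single edge. A correlation edge indicates co-expression, not a direct physical or regulatory interaction.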
Stochastic Modeling of Biochemical Reactions
On a molecular level, chemical reactions are random events and cause molecule numbers to
fluctuate. To study the microscopic dynamics in biochemical systems, we can describe chemical
reaction systems as stochastic processes, compute the resulting fluctuations of substance
amounts, and trace them across cellular networks. The random processes used in such
models can describe individual reaction events (calculated with the chemical master equation or
by direct simulation), their frequencies in time (calculated by the τ-leaping method), or randomly
drifting substance concentrations (chemical Langevin equation), and they contain deterministic
models as a limiting case. The temporal fluctuations in substance amounts can be characterized
by autocorrelations and spectral densities.
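Direct simulation of individual reaction events is commonly done with Gillespie's direct method. The sketch below applies it to a hypothetical birth-death system, in which a molecule X is produced at a constant rate and degraded at a rate proportional to its copy number. The rate constants, time span and random seed are illustrative assumptions.

```python
import random


def gillespie_birth_death(k_prod, k_deg, x0, t_end, seed=0):
    """Direct-method simulation of 0 -> X (rate k_prod) and X -> 0 (rate k_deg * X)."""
    rng = random.Random(seed)
    t, x = 0.0, x0
    times, counts = [t], [x]
    while True:
        a_prod = k_prod        # propensity of the production reaction
        a_deg = k_deg * x      # propensity of the degradation reaction
        a_total = a_prod + a_deg
        if a_total == 0.0:
            break              # no reaction can fire any more
        # The waiting time to the next event is exponentially distributed.
        t += rng.expovariate(a_total)
        if t > t_end:
            break
        # Pick the reaction with probability proportional to its propensity.
        if rng.random() * a_total < a_prod:
            x += 1
        else:
            x -= 1
        times.append(t)
        counts.append(x)
    return times, counts


# Hypothetical parameters; the deterministic steady state is k_prod / k_deg = 100.
times, counts = gillespie_birth_death(k_prod=10.0, k_deg=0.1, x0=0, t_end=50.0, seed=1)
print("events simulated:", len(times) - 1)
print("final copy number:", counts[-1])
```

Each run with a different seed gives a different trajectory that fluctuates around the deterministic solution. Averaging over many runs approaches the behaviour of the corresponding ODE model.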
ODE Systems for Biochemical Networks
Systems of ordinary differential equations (ODE) are the most frequently used approach to
model the static and dynamic behaviors of biochemical networks. They employ continuous
variable values (mostly concentrations) and continuous time.
Basic Components of ODE Models
To formulate an ODE model for a dynamic biochemical reaction network, we need the following
information and steps:
1. Identify the basic building blocks: all compounds and all reactions converting these
compounds into each other.
2. Set the boundary of the system.
3. Assign kinetic laws to all reactions that are part of the model.
4. Determine the values of the kinetic parameters used in the kinetic laws. They can be
taken from databases or the literature, or they can be fitted to experimental data.
5. Simulate the time course for a given set of parameter values and initial conditions.
6. Analyze the effect of perturbations. The impact of small changes in parameter values is
studied by sensitivity analysis and, for biochemical networks, by metabolic control
analysis.
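The steps above can be sketched for a deliberately small example: a single reaction converting a substrate S into a product P with Michaelis-Menten kinetics. The system, the parameter values (vmax, km), the initial concentrations and the fixed-step Runge-Kutta integrator are all illustrative assumptions. The sensitivity shown is a simple finite-difference estimate, not a full metabolic control analysis.

```python
def rates(state, params):
    """Right-hand side of the ODEs for S -> P with Michaelis-Menten kinetics."""
    s, p = state
    v = params["vmax"] * s / (params["km"] + s)
    return (-v, v)  # dS/dt, dP/dt


def simulate(state, params, t_end, dt=0.01):
    """Integrate the ODEs with the classical fourth-order Runge-Kutta scheme."""
    def shifted(base, slope, h):
        return tuple(x + h * k for x, k in zip(base, slope))

    for _ in range(round(t_end / dt)):
        k1 = rates(state, params)
        k2 = rates(shifted(state, k1, dt / 2), params)
        k3 = rates(shifted(state, k2, dt / 2), params)
        k4 = rates(shifted(state, k3, dt), params)
        state = tuple(
            x + dt / 6 * (a + 2 * b + 2 * c + d)
            for x, a, b, c, d in zip(state, k1, k2, k3, k4)
        )
    return state


def scaled_sensitivity(state, params, name, t_end, rel_step=0.01):
    """Finite-difference estimate of the relative change in P per relative parameter change."""
    base = simulate(state, params, t_end)[1]
    perturbed = dict(params, **{name: params[name] * (1 + rel_step)})
    changed = simulate(state, perturbed, t_end)[1]
    return (changed - base) / (base * rel_step)


initial = (10.0, 0.0)               # hypothetical concentrations of S and P
params = {"vmax": 1.0, "km": 0.5}   # hypothetical kinetic parameters
s_end, p_end = simulate(initial, params, t_end=5.0)
sens_vmax = scaled_sensitivity(initial, params, "vmax", t_end=5.0)
sens_km = scaled_sensitivity(initial, params, "km", t_end=5.0)
print(f"S = {s_end:.3f}, P = {p_end:.3f}")
print(f"sensitivity of P to vmax = {sens_vmax:.3f}, to km = {sens_km:.3f}")
```

The system boundary here is the closed pair S and P, so their sum stays constant. Raising vmax increases the amount of product formed by the end of the simulation, while raising km decreases it.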
Constraint-Based Flux Optimization
Flux balance analysis, an optimality-based method for flux prediction, is one of the most
popular modeling approaches for metabolic systems. Flux optimization methods do not
describe how a certain flux distribution is realized (by kinetics or enzyme regulation), but
which flux distribution is optimal for the cell, for example by providing the highest rate of
biomass production at a limited inflow of external nutrients. This allows us to predict flux
distributions without the need for a kinetic description. From a species’ genome sequence,
the metabolic network can be roughly predicted. Even if we do not know anything about the
enzyme kinetics, we can infer which metabolites the network can produce and which
precursors are needed to produce biomass. Given a number of nutrients and a hypothesized
optimality requirement, for example, for fast biomass production, we can try to predict an
optimal flux distribution in the network.
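Flux balance analysis can be written as a linear program: maximize an objective flux subject to the steady-state condition (stoichiometric matrix times flux vector equals zero) and to flux bounds. The sketch below solves a hypothetical four-reaction toy network with `scipy.optimize.linprog`. The network, the bounds and the choice of solver are assumptions made for illustration only.

```python
from scipy.optimize import linprog

# Hypothetical toy network (not taken from the text):
#   R1: (nutrient uptake) -> A      R2: A -> B
#   R3: B -> biomass                R4: A -> (secreted by-product)
# Rows are the internal metabolites A and B, columns are the reactions R1..R4.
stoichiometry = [
    [1, -1, 0, -1],  # A
    [0, 1, -1, 0],   # B
]

# Flux bounds: all reactions irreversible, nutrient inflow limited to 10 units.
bounds = [(0, 10), (0, 1000), (0, 1000), (0, 1000)]

# linprog minimizes, so maximizing the biomass flux R3 means minimizing -v3.
objective = [0, 0, -1, 0]

# Steady state: production and consumption of each internal metabolite balance.
result = linprog(objective, A_eq=stoichiometry, b_eq=[0, 0], bounds=bounds)
fluxes = list(result.x)

print("optimal biomass flux:", round(-result.fun, 6))
print("flux distribution:", [round(v, 6) for v in fluxes])
```

In this toy case the biomass flux is limited only by the nutrient inflow, so the optimum routes all of the uptake through R2 and R3 and leaves the by-product reaction R4 unused. No kinetic parameters are needed at any point.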
