
Supplementary material

CHRONOS: A time-varying method for microRNA-mediated subpathway enrichment analysis
Aristidis G. Vrahatis, Konstantina Dimitrakopoulou, Panos Balomenos,
Athanasios K. Tsakalidis and Anastasios Bezerianos
Contents
CHRONOS FRAMEWORK
Supplementary text 1. User input
Supplementary text 2. Data retrieval
Supplementary text 3. Converting pathway to gene-gene network
Figure S1. Example KGML file.
Figure S2. Group node conversion.
Figure S3. Chemical compound removal.
Figure S4. Converting chemical reactions in metabolic pathways.
Table S1. Categorization of interaction types.
Figure S5. Converting pathway to gene-gene network.
Supplementary text 4. Linear Subpathway extraction
Figure S6. Linear subpathway extraction.
Figure S7. Linear subpathway expansion.
Figure S8. Optimal linear subpathway length.
Supplementary text 5. Non-Linear subpathway extraction
Figure S9. Non-linear subpathway extraction
Figure S10. Optimal non-linear subpathway length.
Figure S11. Linear and non-linear subpathway length distribution
Interactivity scoring schemes
Figure S12. Fold Change Interactivity (FCI) score.
Supplementary text 6. Pseudocode of TVI score
Supplementary text 7. Structural and functional measures
Figure S13. Subpathway-based structural measures.
Figure S14. Evaluation of subpathway topological measures relative to subpathway length.
Figure S15. Computational time performance comparison
CHRONOS Output
Figure S16. Example output ranked list of enriched miRNA-mediated subpathways.
Figure S17. Example KEGG visualization.
Figure S18. Example circular plots
Figure S19. Schematic description of CHRONOS output files.
APPLICATION TO SYNTHETIC DATASET
Figure S20. Simulation analysis results I.
Figure S21. Simulation analysis results II.
Figure S22. Visualization of the top 21 inter-connected subpathways of H. sapiens
Table S2. Topological and functional features of human pathway map.
APPLICATION TO REAL EXPRESSION DATA
Figure S23. Real analysis results I.
Figure S24. Real analysis results II.
Figure S25. Real analysis results III.
Table S3. Real analysis results IV. Comparison of CHRONOS, TEAK, Subpathway-GM, timeClip and DEAP
REFERENCES

CHRONOS Framework

Supplementary text 1. User input


Input
CHRONOS requires mRNA and, if available, microRNA (miRNA) time series expression data along with their labels. The
expression data must be formatted as matrices with dimensions (N,E), where N is the number of mRNAs/miRNAs
and E is the number of time points (i.e. X(i,j) is the expression value of mRNA/miRNA i at time point j). If
miRNA expression data are not available, CHRONOS can be run without processing miRNAs or exporting them
in the final subpathways.

Nomenclature
CHRONOS uses EntrezGene IDs as its gene label system, in accordance with KEGG pathway maps.
However, twelve common gene label systems are supported as input through the web-based data mining tool
BioMart (Durinck et al., 2005), namely: EntrezGene ID, Ensembl Gene ID, Ensembl Transcript ID, Ensembl
Protein ID, HGNC ID, HGNC Symbol, HGNC Transcript name, Refseq mRNA ID, Refseq Protein ID,
UniProt/Swissprot Accession, UniProt/Swissprot ID, UniGene ID and UniProt Genename ID. These labels
are converted to Entrez IDs for downstream analysis. The output is returned both in the user's input label
system and in Entrez IDs. Previous and updated miRNA label lists are downloaded from miRBase
(http://www.mirbase.org/) in order to address miRNA nomenclature issues. The user is informed of the
re-annotated miRNAs via a list.

Data preprocessing
CHRONOS accepts time series expression profile data and operates more effectively if data are normalized and
log2-fold change differences relative to an initial condition (control state – 0h) are computed. Multiple biological
replicates should be summarized so that one sample per time point is provided as input (Welsh et al., 2013).
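
The preprocessing described above can be illustrated with a minimal sketch. It assumes that replicates are summarized by their mean and that the control (0h) column is dropped after the ratio is taken; neither choice is stated in the text, and the function names are hypothetical.

```python
import math
from statistics import mean

def summarize_replicates(replicates):
    """Collapse replicates to one value per time point (assumption: arithmetic mean)."""
    return [mean(values) for values in replicates]

def log2_fold_change(profile):
    """log2 ratio of every later time point relative to the first (control, 0h) value."""
    control = profile[0]
    return [math.log2(value / control) for value in profile[1:]]

# Toy expression values for one gene: four time points, two replicates each.
gene_replicates = [[10.0, 10.0], [18.0, 22.0], [40.0, 40.0], [5.0, 5.0]]
profile = summarize_replicates(gene_replicates)  # one sample per time point
lfc = log2_fold_change(profile)                  # values relative to 0h
```

Here `lfc` holds one log2-fold change per non-control time point, which is the form of input the text recommends.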

Supplementary text 2. Data retrieval


Organism pathway maps
KGML files are downloaded for all available organisms and both metabolic and non-metabolic KEGG pathway
maps are considered (Kanehisa et al., 2002). The user can also select specific pathway maps for analysis.

miRNA targets
Plausible mRNA-miRNA pairs are obtained from the miRecords database (Xiao et al., 2009), which combines
validated and predicted targets from 11 established miRNA target prediction programs (DIANA-microT,
MicroInspector, miRanda, MirTarget2, miTarget, NBmiRTar, PicTar, PITA, RNA22, RNAhybrid, and
TargetScan/TargetScanS). Predicted targets are downloaded using the web interface of miRecords. Only targets
predicted by at least 4 out of 11 prediction programs are considered, as proposed in (Dimitrakopoulou et al., 2015).
The user can change this default threshold, or even specify which prediction programs will be used (e.g. predicted
targets from PicTar and miTarget only). In addition, validated mRNA-miRNA pairs are obtained from miRecords and TarBase
(Vlachos et al., 2015).
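
As an illustration of the threshold rule above, the following sketch keeps a predicted miRNA-gene pair only when enough programs agree on it. The input layout (a dictionary from pair to the set of predicting programs), the function name and the toy pairs are assumptions made for the example; they are not the miRecords export format and not real predictions.

```python
def filter_predicted_targets(predictions, min_programs=4, allowed_programs=None):
    """Keep (miRNA, gene) pairs predicted by at least `min_programs` programs.

    predictions: dict mapping (miRNA, gene) -> set of program names.
    allowed_programs: optional subset of programs to take into account.
    """
    kept = set()
    for pair, programs in predictions.items():
        if allowed_programs is not None:
            programs = programs & set(allowed_programs)
        if len(programs) >= min_programs:
            kept.add(pair)
    return kept

# Toy input, for illustration only.
predictions = {
    ("miR-a", "gene1"): {"PicTar", "PITA", "miRanda", "TargetScan"},
    ("miR-a", "gene2"): {"PicTar", "PITA"},
    ("miR-b", "gene3"): {"miTarget", "PicTar"},
}
default_pairs = filter_predicted_targets(predictions)
custom_pairs = filter_predicted_targets(
    predictions, min_programs=2, allowed_programs={"PicTar", "miTarget"}
)
```

With the default threshold only the first pair survives; restricting the programs to PicTar and miTarget with a threshold of 2 keeps only the third.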

Supplementary text 3. Converting pathway to gene-gene network


KGML files
Information for each KEGG pathway map is stored in KEGG Markup Language (KGML) files, which include entry,
relation and reaction elements. Entry elements such as genes, compounds and enzymes are represented as nodes, and the
relation and reaction elements as connections among nodes (edges). In non-metabolic pathways, the relation elements
indicate the relationship between gene products, or between gene products and compounds, through the relation ‘type’
attribute and the ‘subtype’ subelement (Figure S1, Case A). There are four main relation types: (i)
protein-protein relations (type=‘PPrel’), (ii) gene expression relations (type=‘GErel’), (iii) protein-compound
relations (type=‘PCrel’) and (iv) links to other maps (type=‘maplink’). Many subtypes arise from these
elements, as shown in Table S1. Metabolic pathways have metabolites and enzymes as nodes, and the main
relation type among enzymes is the enzyme-enzyme relation (type=‘ECrel’). Metabolites mainly participate
in chemical reactions, which have substrates and products as subelements (Figure S1, Case B). Detailed information is
provided at http://www.genome.jp/kegg/xml/docs/.
Figure S1. Example KGML file. In case A, a relation in a non-metabolic pathway is illustrated, as depicted in a KGML
file, between nodes with entry ids ‘45’ and ‘63’. Relation type ‘PPrel’ indicates a protein-protein interaction between proteins
and subtype ‘activation’ indicates that node ‘entry1’ activates node ‘entry2’. In case B, a reaction in a metabolic pathway is
illustrated with the respective substrate and product.

Node conversion
KEGG pathway entry elements representing genes (type=‘gene’) can correspond to one or multiple gene products,
most likely gene families or genes with similar function. In the latter case the node is expanded to
multiple separate nodes by rewiring the incoming and outgoing links of each entry. Optionally (user-defined), these
entries can be kept as single nodes by unifying the corresponding gene products (Judeh et al., 2013). In either
case, these nodes are treated as single nodes in the conversion stage; the user-defined expansion takes place in the
subpathway extraction stage, which reduces the execution time of the approach.
Entry elements representing groups of genes (type=‘group’) usually characterize protein
complexes and are expanded to separate nodes, each keeping the interactions of the original entry (Li et al., 2013).
Optionally (user-defined), these group entries can be kept as single nodes by unifying the corresponding
gene products. In Figure S2, node A (type=‘group’) is a protein complex of three proteins; three new
nodes are constructed and each of them is linked to node B. Entry elements representing KEGG
orthologs (type=‘ortholog’) and maps (type=‘map’) are ignored since we examine each organism and pathway
separately (Figure S5). Chemical compounds are removed from the topology. Compound-mediated
interactions are interactions in which a compound acts as a bridge between two elements (usually genes). Thus,
if element A is linked to compound c and compound c is linked to element B, the compound
is removed and replaced with an interaction connecting A and B (Sales et al., 2012). Chains of compounds
are handled in the same manner (Figure S3).
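
The compound removal rule, including chains of compounds, can be sketched as a small graph rewrite. This is an illustrative version under stated assumptions: edges are directed (source, target) pairs, self-loops created by the rewiring are dropped, and the function name is hypothetical.

```python
def remove_compounds(edges, compounds):
    """Replace gene -> compound(s) -> gene chains with direct gene -> gene edges."""
    successors = {}
    for source, target in edges:
        successors.setdefault(source, set()).add(target)

    rewired = set()
    for source, target in edges:
        if source in compounds:
            continue  # edges leaving a compound are reached through the walk below
        if target not in compounds:
            rewired.add((source, target))  # ordinary edge, kept unchanged
            continue
        # Walk through the compound chain until non-compound nodes are reached.
        stack, seen = [target], {target}
        while stack:
            compound = stack.pop()
            for nxt in successors.get(compound, ()):
                if nxt in compounds:
                    if nxt not in seen:
                        seen.add(nxt)
                        stack.append(nxt)
                elif nxt != source:  # assumption: do not create self-loops
                    rewired.add((source, nxt))
    return rewired

edges = {("A", "c1"), ("c1", "B"), ("C", "c2"), ("c2", "c3"), ("c3", "D"), ("E", "F")}
gene_edges = remove_compounds(edges, compounds={"c1", "c2", "c3"})
```

In the toy input, A reaches B through one compound and C reaches D through a chain of two, so both become direct edges while E-F is left untouched.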

Figure S2. Group node conversion. Nodes representing a group of k nodes are expanded into k new nodes and the incoming
and outgoing links are rewired to the expanded nodes. Here, node A represents a group of three nodes (N1, N2, N3) and is
expanded into three new nodes by rewiring its links.
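
A minimal sketch of the group node expansion shown in Figure S2, assuming the network is held as a collection of directed (source, target) pairs and the group membership as a dictionary (both representations and the function name are hypothetical):

```python
def expand_group_nodes(edges, groups):
    """Replace every group node by its members, rewiring all incident edges.

    edges: iterable of (source, target) pairs.
    groups: dict mapping a group node to the list of its member nodes.
    """
    expanded = set()
    for source, target in edges:
        for s in groups.get(source, [source]):
            for t in groups.get(target, [target]):
                expanded.add((s, t))
    return expanded

# Node A groups three members and is linked to node B, as in Figure S2.
expanded_edges = expand_group_nodes([("A", "B")], {"A": ["N1", "N2", "N3"]})
```
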
Figure S3. Chemical compound removal. The chemical compounds are removed from the network without losing the
connectivity semantics among genes. As usual, compound-mediated interactions are interactions for which a compound acts as a
bridge between two elements (genes). So we established that, if element A is linked to compound c and compound c is linked to
element B (Case I), then the compound is removed and replaced with an interaction connecting A and B. The same logic is
followed with chains of compounds (Case II).

Edge conversion
The relation subtypes define the directionality and the operation of each relation. For example, in a KEGG pathway
map a relation with subtype ‘inhibition’ between ‘entry1’ and ‘entry2’ means that ‘entry1’ inhibits
‘entry2’ (Figure S1). However, many relation subtypes are ambiguous in directionality, in operation, or in both.
For example, the subtype ‘binding/association’ has no specific directionality and we
assume that this relation is bidirectional, while ‘activation/inhibition’ has ambiguous operation. Thus,
we established an in-house categorization with three kinds of relations (activation, inhibition, unknown):
activation/expression is translated as enhancement of gene regulation, inhibition/repression as suppression, and
unknown covers relations with unclear molecular context. A fourth type of relation, ‘no-interaction’, is provided
as an option for users who wish to focus only on specific interaction types by setting all the irrelevant
interaction types to ‘no-interaction’. Our default translation, based on (Judeh et al., 2013;
Wrzodek et al., 2013), is presented in Table S1.
The main difference in metabolic pathways is that the edges are mainly constructed from chemical reactions
(type=‘reaction’), which involve enzymes (gene products) and substrates. For each reaction we examined the
substrate, the product and the reaction type (reversible or irreversible). More specifically, for enzyme e and a
reaction with substrate id=‘s’, product id=‘p’ and type=‘irreversible’, edges (s, e) and (e, p) are created. For
type=‘reversible’, edges (e, s) and (p, e) are created additionally, giving edges (s, e), (e, p), (e, s) and (p, e)
(Figure S4). The directionality assigned to the two reaction types is also user-defined. Reactions and relations in
metabolic pathways are considered to have ambiguous operation (Büchel et al., 2013; Cicek et al., 2014) and we
defined them as the ‘unknown’ relation type. An indicative example of the conversion of a pathway to a gene-gene
network is provided in Figure S5.
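
The default edge construction for reactions can be written down directly from the rule above. The function name and the tuple representation of edges are assumptions made for the sketch.

```python
def reaction_to_edges(enzyme, substrate, product, reversible):
    """Edges created for one reaction catalysed by `enzyme` (default directionality)."""
    edges = [(substrate, enzyme), (enzyme, product)]
    if reversible:
        # Reversible reactions additionally get the two opposite edges.
        edges += [(enzyme, substrate), (product, enzyme)]
    return edges

irreversible_edges = reaction_to_edges("e", "s", "p", reversible=False)
reversible_edges = reaction_to_edges("e", "s", "p", reversible=True)
```
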

Figure S4. Converting chemical reactions in metabolic pathways. For enzyme e reacting with substrate s and product
p, edges (s, e) and (e, p) are created. This happens if the reaction is irreversible. In case of reversible reactions, edges (e, s) and
(p, e) are created additionally. The directionality of reaction types is also user-defined.
Table S1. Categorization of interaction types.

Type of Interaction | Directionality | Type of effect
expression | Unidirectional | Activation
activation + phosphorylation | Unidirectional | Activation
activation + dephosphorylation | Unidirectional | Activation
activation + ubiquitination | Unidirectional | Activation
activation + indirect effect | Unidirectional | Activation
activation + binding/association | Unidirectional | Activation
activation + phosphorylation + binding/association | Unidirectional | Activation
activation + phosphorylation + indirect effect | Unidirectional | Activation
compound + expression | Unidirectional | Activation
compound + activation | Unidirectional | Activation
compound + activation + indirect effect | Unidirectional | Activation
compound + activation + phosphorylation | Unidirectional | Activation
activation + methylation | Unidirectional | Activation
expression + indirect effect | Unidirectional | Activation
inhibition | Unidirectional | Inhibition
repression | Unidirectional | Inhibition
compound + inhibition | Unidirectional | Inhibition
inhibition + phosphorylation | Unidirectional | Inhibition
inhibition + dephosphorylation | Unidirectional | Inhibition
inhibition + ubiquitination | Unidirectional | Inhibition
inhibition + indirect effect | Unidirectional | Inhibition
inhibition + binding/association | Unidirectional | Inhibition
inhibition + expression | Unidirectional | Inhibition
inhibition + methylation | Unidirectional | Inhibition
repression + indirect effect | Unidirectional | Inhibition
ubiquitination + inhibition | Unidirectional | Inhibition
phosphorylation | Unidirectional | Unknown
indirect effect | Unidirectional | Unknown
dephosphorylation | Unidirectional | Unknown
ubiquitination | Unidirectional | Unknown
activation + inhibition | Unidirectional | Unknown
phosphorylation + indirect effect | Unidirectional | Unknown
phosphorylation + binding/association | Unidirectional | Unknown
phosphorylation + dissociation | Unidirectional | Unknown
dephosphorylation + indirect effect | Unidirectional | Unknown
binding/association + indirect effect | Unidirectional | Unknown
binding/association | Bidirectional | Unknown
compound | Bidirectional | Unknown
dissociation | Bidirectional | Unknown
hidden compound | Bidirectional | Unknown
dissociation + missing interaction | Bidirectional | Unknown
state change | Bidirectional | Unknown
binding/association + missing interaction | Bidirectional | Unknown
missing interaction | Bidirectional | Unknown
reversible reaction* | Bidirectional | Unknown
irreversible reaction* | Unidirectional | Unknown
All possible interaction types in KEGG pathway maps (left) and our default directionality classification and type of relation
categorization in three classes (right) based on (Judeh et al., 2013; Wrzodek et al., 2013). The user has the option to change both
the directionality and type of relation or even set some relations as ‘no-interaction’ and focus only on specific interactions.
*These types of reaction refer to metabolic pathways.
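
A lookup of this kind could be sketched as follows. Only a handful of rows from Table S1 are included, and the dictionary layout, the function name and the handling of unlisted subtypes (no edge) are assumptions for illustration rather than the tool's actual interface.

```python
# Excerpt of Table S1: interaction type -> (directionality, type of effect).
DEFAULT_CATEGORIES = {
    "expression": ("unidirectional", "activation"),
    "activation + phosphorylation": ("unidirectional", "activation"),
    "inhibition": ("unidirectional", "inhibition"),
    "repression": ("unidirectional", "inhibition"),
    "phosphorylation": ("unidirectional", "unknown"),
    "binding/association": ("bidirectional", "unknown"),
}

def to_directed_edges(source, target, subtype, overrides=None):
    """Translate one KEGG relation into directed, labelled edges."""
    table = {**DEFAULT_CATEGORIES, **(overrides or {})}
    if subtype not in table:
        return []  # assumption: subtypes missing from the table yield no edge
    directionality, effect = table[subtype]
    if effect == "no-interaction":
        return []  # user chose to ignore this interaction type
    edges = [(source, target, effect)]
    if directionality == "bidirectional":
        edges.append((target, source, effect))
    return edges

inhibition_edges = to_directed_edges("A", "B", "inhibition")
binding_edges = to_directed_edges("A", "B", "binding/association")
ignored = to_directed_edges(
    "A", "B", "phosphorylation",
    overrides={"phosphorylation": ("unidirectional", "no-interaction")},
)
```

The `overrides` argument mirrors the user option described in the table note: any entry can be redirected or switched off by mapping it to ‘no-interaction’.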
Figure S5. Converting pathway to gene-gene network. KEGG pathway map (left) conversion to gene-gene network
(right). Nodes representing a map (orange) are discarded since we are interested in the analysis of each pathway independently.
Interactions B-C, G-C are considered as ‘unknown/unidirectional’ and C-D as ‘unknown/bidirectional’. Interactions A-F, A-E are
considered as ‘activation/unidirectional’ and A-B, G-H as ‘inhibition/unidirectional’. Compounds c1 and c2 are removed by
rewiring their corresponding nodes as described in Figure S3. Nodes representing KEGG orthologs (grey node) and KEGG map
(yellow node) are removed since we examine each organism and pathway separately.

Supplementary text 4. Linear Subpathway extraction

Let G be a pathway graph, Sn a node with zero in-degree and Dm a node with zero out-degree, where 0 ≤ n ≤ N and
0 ≤ m ≤ M. A subpathway is defined as a path starting from start-node Sn and terminating at end-node Dm. Any (Sn,
Dm) pair with at least one path connecting them can yield one or more subpathways, since multiple subpathways
may share a start and/or an end node (Figure S6). Thus, by traversing G between each start and end node, a set S
of subpathways will be extracted. If no end node is visited within a specific number of steps, the algorithm
backtracks and selects another possible path which may lead to an end node.
However, the number of subpathways in some complex KEGG pathway graphs can reach the order of billions,
increasing the time complexity of both the extraction process and meta-processing analysis. To avoid this, we
employ the following approaches: (i) We exploit the default grouping of genes in KEGG pathway maps: A typical
KEGG pathway graph depicts interactions between nodes corresponding to groups of genes, which frequently may
be quite large. Thus subpathway extraction happens in two phases. First, a set Sc of compact subpathways is
extracted from G, whose nodes also consist of groups of genes. Second, each such subpathway is expanded to a set
S of subpathways each consisting of one gene per node (see an example in Figure S7). (ii) We avoid the extraction
of extremely long subpathways. Extracted subpathways may be arbitrarily long. Thus, expanding long compact
subpathways with oversized group nodes may result in an exponential increase of the number of extracted
subpathways, which in turn hinders any statistical meta-processing used to attribute biological significance to the
subpathways. In this regard, multiple subpathway ranges have been considered, varying from a handful of genes to
the border of experimental practicality, for the three most common organisms: Homo sapiens, Mus musculus and
Rattus norvegicus. The results (Figure S8) indicate that the vast majority of the organism's genes are present in
extracted subpathways ranging from three to ten genes; this range is sufficient to tackle the inherent complexity of
pathways without loss of valuable information.
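
The extraction step can be sketched as a depth-first enumeration of paths from start-nodes (zero in-degree) to end-nodes (zero out-degree) that backtracks once a length budget is exhausted. This is a minimal illustration, not the tool's implementation: the function name, the default bounds of 3 and 10 nodes (taken from the range discussed above) and the rule of never revisiting a node within a path are assumptions.

```python
def linear_subpathways(edges, min_len=3, max_len=10):
    """Enumerate start-to-end paths with between min_len and max_len nodes."""
    successors, nodes, has_incoming = {}, set(), set()
    for source, target in edges:
        successors.setdefault(source, []).append(target)
        nodes.update((source, target))
        has_incoming.add(target)
    starts = sorted(n for n in nodes if n not in has_incoming)
    ends = {n for n in nodes if n not in successors}

    paths = []

    def walk(path):
        node = path[-1]
        if node in ends:
            if len(path) >= min_len:
                paths.append(tuple(path))
            return
        if len(path) == max_len:
            return  # budget exhausted without reaching an end-node: backtrack
        for nxt in successors[node]:
            if nxt not in path:  # assumption: a node appears once per path
                walk(path + [nxt])

    for start in starts:
        walk([start])
    return paths

# Toy graph with two start-nodes (A, B) and two end-nodes (D, F).
toy_edges = [("A", "C"), ("B", "C"), ("C", "D"), ("C", "F")]
toy_paths = linear_subpathways(toy_edges)
```

On the toy graph every start-node reaches every end-node through C, so four subpathways are returned.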
Figure S6. Linear subpathway extraction. Start-nodes (A, B) and end-nodes (D, F) are identified and all four possible
paths connecting them are extracted.

Figure S7. Linear subpathway expansion. The initial subpathway has two nodes that represent multiple genes (three and
two respectively) and is expanded to six distinct subpathways. Note that the user has the option to avoid expansion and
treat these group nodes or nodes with multiple gene products as single entities during subpathway extraction.
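
The expansion of a compact subpathway amounts to a Cartesian product over the genes of its nodes. A minimal sketch, with hypothetical gene labels and function name:

```python
from itertools import product

def expand_subpathway(compact):
    """Expand a compact subpathway (one list of genes per node) to single-gene paths."""
    return [tuple(combination) for combination in product(*compact)]

# Two nodes hold several genes (three and two), as in Figure S7.
compact = [["g1", "g2", "g3"], ["g4"], ["g5", "g6"]]
expanded = expand_subpathway(compact)
```

The number of expanded subpathways is the product of the node sizes (here 3 x 1 x 2 = 6), which is why long compact subpathways with large group nodes grow exponentially.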

Figure S8. Optimal linear subpathway length. Cumulative distribution function (CDF) plots of the node coverage for
three organism pathway maps in relation to the subpathway length show that lengths ranging from 3 to 10 nodes cover almost the
entire pathway map (>97% ).
Supplementary text 5. Non-Linear subpathway extraction

Non-linear subpathways are extracted by applying the k-clique algorithm (Figure S9) to each gene-gene pathway
graph separately, after converting it to an undirected graph. The k-clique algorithm is outlined (ref) as follows: (1) let D
be a matrix where D(i,j) is the length of the shortest path from node i to node j; Johnson's algorithm is used to fill D; let N =
max(D[i][j]) for all i, j; (2) each edge is a 1-clique by itself; (3) for k = 2,...,N, try to expand each (k-1)-clique to a
k-clique: (3.1) consider a (k-1)-clique as the current k-clique KC; (3.2) repeat the following: if D[v][j] <= k for all
nodes j in KC, add node v to KC; (3.3) eliminate duplicates; (4) the whole graph is an N-clique.
The optimal subpathway length range in non-linear cascades is obtained with the approach used for linear subpathways (see
Supplementary text 4). The results show (Figure S10) that subpathways with length up to ten members cover more
than 70% of each organism pathway map. The entire pathway maps are fully covered by subpathways with length
up to 20 members. However, subpathways of that length contain dozens of interactions, and thus we chose to limit the
subpathway length to 10 genes. In our default setting, k is equal to 2, but the user has the flexibility to
adjust k.
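
The outlined steps can be sketched as follows for an unweighted, undirected graph. Two simplifications are assumptions of this sketch: breadth-first search replaces Johnson's algorithm (both give shortest path lengths when all edges have unit weight), and candidate nodes are tried in sorted order, so the result is one greedy expansion per seed rather than an exhaustive enumeration. The length cap of 10 genes is not applied here.

```python
from collections import deque

def shortest_path_lengths(adjacency):
    """All-pairs shortest path lengths by breadth-first search (unit edge weights)."""
    dist = {}
    for source in adjacency:
        seen = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adjacency[u]:
                if v not in seen:
                    seen[v] = seen[u] + 1
                    queue.append(v)
        dist[source] = seen
    return dist

def k_cliques(edges, k=2):
    """Grow every edge (a 1-clique) into node sets with pairwise distance <= k."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    dist = shortest_path_lengths(adjacency)
    cliques = {frozenset(edge) for edge in edges}
    for level in range(2, k + 1):
        grown = set()
        for clique in cliques:
            members = set(clique)
            for v in sorted(adjacency):
                if v not in members and all(
                    dist[v].get(j, float("inf")) <= level for j in members
                ):
                    members.add(v)
            grown.add(frozenset(members))  # the set removes duplicates
        cliques = grown
    return cliques

chain = [("A", "B"), ("B", "C"), ("C", "D")]
two_cliques = k_cliques(chain, k=2)
three_cliques = k_cliques(chain, k=3)
```

On the four-node chain, k=2 yields the two overlapping triples, and k=3 merges them into the whole chain because A and D are three steps apart.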

Figure S9. Non-linear subpathway extraction with the k-clique method for k=2 (left) and k=3 (right). The distance between
any two nodes is no greater than 2 or 3 respectively.

Figure S10. Optimal non-linear subpathway length. Cumulative distribution function (CDF) plots of the node coverage
for three organism pathway maps with regard to subpathway length show that non-linear subpathways of length ranging from 3
to 10 nodes cover a large proportion of the entire pathway map (70%).
Figure S11. Linear and non-linear subpathway length distribution in hsa pathway map.

Interactivity scoring schemes

Figure S12. Fold Change Interactivity (FCI) score. For edges connecting nodes where both have high absolute log2-
fold change values and have high positive (A-B, D-E) or negative (A-D) correlation, the FCI score is close to one or minus one
respectively. In any other case (such as A-C), FCI values are closer to zero. The FCI results were obtained with parameters
C=1, K=1 and T=0.3.
Supplementary text 6. Pseudocode of TVI score

Require: $X \in \mathbb{R}^{N \times T}$ {gene expressions for N genes and T time points}
Require: $G = (V, E)$ {interaction network graph}
Require: $W \in \{-2, -1, 0, 1, 2\}$ {weight range}
Initialization of probability at time t=1
for all e in E do
    $P(X^{1} = x^{1} \mid W^{1} = w^{1}) = \frac{1}{Z(w^{1})} \exp\Big( \sum_{e=(i,j) \in E} w_e^{1} x_i^{1} x_j^{1} \Big)$
for t = 2 to T do
    Calculate transition probability:
    $Q(w^{t-1}, w^{t}) = \frac{1}{Z(w^{t})} \exp\Big( \sum_{e=(i,j) \in E} w_e^{t} x_i^{t} x_j^{t} \Big) \cdot P(X^{t-1} = x^{t-1} \mid W^{t-1} = w^{t-1})$
    Choose TVI score: $TVI_e^{t} = \max_{w} Q(w^{t-1}, w^{t})$
Output: TVI(E,T)

Supplementary text 7. Structural and functional measures

Figure S13. Subpathway-based structural measures. Illustration of subpathways represented as meta-nodes (light blue
nodes) with high subDEG (dashed green line) and high subBC (dashed red line) values. These two measures capture local and
global topological aspects respectively based on the average degree and betweenness centrality values of the involved nodes.

Functional measure (subPathness)

CHRONOS bridges structural and functional aspects with an in-house measure, called subPathness, which
measures the degree up to which a subpathway serves as a bridge (Kovács et al., 200) among different pathways in
an organism pathway map. The Pathness of a gene/node i in P pathway maps is:

$$\mathrm{Pathness}(i) = \sum_{a=1}^{P} \sum_{b=1,\, b \neq a}^{P} T(a, b, i),$$

where T(a, b, i) is the area-overlap, or common area between pathway maps a and b including node i. Moving
forward under the perspective of subpathways, we considered as subPathness the mean of Pathness of all
genes/nodes that belong to the same subpathway. subPathness is expressed as:

$$\mathrm{subPathness}(i) = \frac{\sum_{j=1}^{N} \mathrm{Pathness}(j)}{N},$$

where N is the number of genes/nodes included in subpathway i. Subpathways with high subPathness value
pinpoint subpathways acting as bridges among pathways.
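
The two formulas can be illustrated with a small sketch. The text does not give an explicit formula for the area-overlap T(a, b, i), so the sketch assumes that T(a, b, i) is the number of nodes shared by pathways a and b when node i belongs to both, and zero otherwise. That definition, the function names and the toy pathways are assumptions, and no normalization of the scores is attempted.

```python
def pathness(node, pathways):
    """Sum of T(a, b, node) over all ordered pairs of distinct pathways (a, b)."""
    total = 0
    for a, nodes_a in enumerate(pathways):
        for b, nodes_b in enumerate(pathways):
            if a != b and node in nodes_a and node in nodes_b:
                total += len(nodes_a & nodes_b)  # assumed form of the area-overlap
    return total

def sub_pathness(subpathway, pathways):
    """Mean Pathness over the genes/nodes of one subpathway."""
    return sum(pathness(gene, pathways) for gene in subpathway) / len(subpathway)

# Toy pathway maps given as sets of gene labels.
pathways = [{"g1", "g2", "g3"}, {"g2", "g3", "g4"}, {"g5"}]
score_g2 = pathness("g2", pathways)
score_sub = sub_pathness(["g1", "g2"], pathways)
```

Gene g2 lies in the overlap of the first two pathways and is counted once for each ordered pair, whereas g1 belongs to a single pathway and contributes nothing, so the subpathway score is the average of the two.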

Figure S14. Evaluation of subpathway topological measures relative to subpathway length. With respect to
subPathness we observed that linear subpathways with length equal to 7 and 9 and non-linear equal to 8 achieved better scores.
Accordingly, with respect to subBC the linear subpathways with length equal to 4, 6 and 7 and non-linear with length equal to 8
achieved better scores. Regarding subDEG, the linear subpathways with length equal to 4 and 6 and non-linear with length equal
to 7 and 8 achieved better scores.
Figure S15. Computational time performance comparison of TEAK with
CHRONOS (two scenarios were run: with miRNAs included in processing and without miRNAs) based on human KEGG
pathway maps and random gene sets of varying size. We chose to compare CHRONOS only against TEAK since this tool
includes all the related time-consuming subpathway processes: (a) online download of both metabolic and non-metabolic KEGG
pathway maps, (b) gene-gene network construction, (c) subpathway extraction and evaluation. As shown, CHRONOS (with and
without miRNAs) outperforms TEAK.

CHRONOS Output

Figure S16. Example output ranked list of enriched miRNA-mediated subpathways. The enriched (miRNA-mediated)
subpathways are exported in ranked order according to their subscore (Equation 5) and FDR-corrected P-values
(q-values). Subpathways are exported separately for each time point, each type of extraction (linear and non-linear) and
each type of pathway (metabolic and non-metabolic), and are accompanied by the graph-based metric values (subDEG, subBC, subPathness).
Figure S17. Example KEGG visualization. Snapshot of a KEGG pathway map where a six-member linear subpathway is
highlighted.

Figure S18. Example circular plots illustrating the relations of miRNAs and the targeted subpathway members (genes) per
time point.
Figure S19. Schematic description of CHRONOS output files.

Application to synthetic dataset

Figure S20. Simulation analysis results I. Boxplots depicting the precision and recall of CHRONOS, TEAK,
Subpathway-GM and DEAP based on 100 independent synthetic network models for each duration period of regulation (in all
time points, in early/late time points, in one time point). CHRONOS outperforms the remaining tools in all cases with statistical
significance (two-sided Wilcoxon signed rank test, p-value<0.01).
Figure S21. Simulation analysis results II. Boxplots indicating the node coverage during subpathway extraction of
CHRONOS, TEAK and Subpathway-GM in 100 generated random graphs based on the Erdős–Rényi and Edgar Gilbert models.
CHRONOS outperforms the remaining tools in both models with statistical significance (two-sided Wilcoxon signed rank test, p-
value<0.01).

Figure S22. Visualization of the top 21 inter-connected subpathways of H. sapiens (hsa) based on three
subpathway-adapted topological measures: subDEG, subBC and subPathness. The subpathways are represented by nodes whose
size is analogous to the subBC score, the color scale analogous to subPathness score and border size to subDEG score.
Table S2. Topological and functional features of human pathway map.

sub Id | KEGG Pathway | Subpathway members (HGNC symbol) | subPathness | subBC | subDEG
04010_1 | MAPK signaling pathway | AKT3,MAP2K4,MAPK8,MAPK8IP3,MAP3K1,MAP2K1,LAMTOR3,MAP2K2,MAPK1,MYC | 0.74 | 0.64 | 71.5
04022_1 | cGMP-PKG signaling pathway | OPRD1,GNAI2,PIK3CG,AKT1,NOS3,GUCY1A3,PRKG2,RAF1,MAP2K1,MAPK1 | 0.84 | 0.62 | 103.9
04062_1 | Chemokine signaling pathway | ADRBK2,CCR4,GNAI2,GNG10,PIK3CB,KRAS,RAF1,MAP2K1,MAPK1 | 0.82 | 0.63 | 146.5
04068_1 | FoxO signaling pathway | EGF,EGFR,GRB2,SOS1,NRAS,BRAF,MAP2K1,MAPK3,FOXO3,CCND1 | 0.72 | 0.60 | 87.4
04370_1 | VEGF signaling pathway | VEGFA,KDR,PLCG1,PRKCG,SPHK1,KRAS,RAF1,MAP2K2,MAPK1,PLA2G4B | 0.75 | 0.65 | 109.7
04510_1 | Focal adhesion | PTEN,PTK2,PIK3R5,VAV3,RAC1,PAK1,MAP2K1,MAPK1,ELK1,BIRC2 | 0.71 | 0.60 | 99.5
04611_1 | Platelet activation | P2RY12,GNAI1,PIK3CA,AKT3,NOS3,GUCY1A2,PRKG1,MAPK14,MAPK1,PLA2G4B | 0.70 | 0.64 | 115
04725_1 | Cholinergic synapse | CHRM2,GNAI1,PIK3R5,HRAS,MAP2K1,MAPK1,CREB1 | 0.76 | 0.60 | 93.2
04912_1 | GnRH signaling pathway | GNRH1,GNRHR,GNAQ,PLCB1,PRKCA,RAF1,MAP2K1,MAPK1,ELK1 | 0.78 | 0.61 | 86.7
04915_1 | Estrogen signaling pathway | GPER1,GNAS,ADCY1,PRKACA,RAF1,MAP2K1,MAPK1,CREB3 | 0.77 | 0.64 | 87
04915_2 | Estrogen signaling pathway | GRM1,GNAQ,PLCB4,PRKCD,ADCY6,PRKACG,RAF1,MAP2K2,MAPK3,ATF4 | 0.73 | 0.71 | 115.8
04919_1 | Thyroid hormone signaling | ITGB3,PLCB1,PRKCA,MAPK1,THRB,PIK3R5,AKT3,GSK3B,CTNNB1 | 0.88 | 0.89 | 133.2
04919_2 | Thyroid hormone signaling | ITGB3,PLCD1,PRKCB,MAPK3,THRB,PIK3CG,THRA,CTNNB1 | 0.70 | 0.83 | 118.7
04921_1 | Oxytocin signaling pathway | OXT,OXTR,GNAQ,PLCB2,PRKCA,RAF1,MAP2K1,MAPK1,PLA2G4E | 0.81 | 0.69 | 95.1
04921_2 | Oxytocin signaling pathway | OXT,OXTR,GNAI2,PLCB2,PRKCB,RAF1,MAP2K2,MAPK3,PLA2G4B | 0.77 | 0.68 | 100.7
05161_1 | Hepatitis B | ATP6AP1,PRKCB,RAF1,MAP2K1,MAPK1,FOS,MMP9 | 0.76 | 0.67 | 62.5
05161_2 | Hepatitis B | ATP6AP1,PRKCA,RAF1,MAP2K1,MAPK1,CREB1,CXCL8 | 0.78 | 0.69 | 67.6
05200_1 | Pathways in cancer | EGF,EGFR,PLCG1,PRKCB,KRAS,RAF1,MAP2K1,MAPK1,JUN,PGF | 0.97 | 0.69 | 115.3
05200_2 | Pathways in cancer | PDGFA,PDGFRA,PLCG1,PRKCB,KRAS,RAF1,MAP2K1,MAPK1,FOS,VEGFA | 0.90 | 0.71 | 111.9
05214_1 | Glioma | PDGFB,PDGFRB,PLCG2,PRKCB,NRAS,RAF1,MAP2K2,MAPK3 | 0.77 | 0.60 | 95.4
05214_2 | Glioma | IGF1,IGF1R,PLCG2,PRKCB,NRAS,BRAF,MAP2K2,MAPK3 | 0.78 | 0.64 | 97.8
List of the top 21 subpathways of Homo sapiens (hsa) based on three subpathway-based measures: subPathness, subDEG and
subBC. Note: subpathways belonging to the same pathway are indexed with a number after the “_” symbol.
Application to real expression data
Figure S23. Real analysis results I. Circular plots visualizing the rewiring of the miRNA effect upon non-metabolic
subpathways at three time points (12h, 24h and 72h). The width of each category is proportional to the number of human
subpathways (denoted with KEGG pathway IDs) targeted by a miRNA at the specific time point. The miRNA-mediated subpathways
(linear and non-linear) had subscore > 0.4 with q-value < 0.05. For better readability, non-miRNA-mediated subpathways are
not included (the circular plots for 3h and 48h are shown in Figure 3 of the Manuscript).
Figure S24. Real analysis results II. Distribution of enriched (miRNA-mediated) subpathways across time (subscore > 0.4
at the respective time point and q-value < 0.05); all subpathways (left) and metabolic subpathways (right). Blue rhombi
denote all significantly enriched subpathways and red squares denote the subset that is regulated by at least one miRNA.
The maximum number of (miRNA-mediated) subpathways was observed at 48h. Owing to the large number of metabolic
subpathways (> 60,000), we applied a stricter threshold to them (q-value < 0.01).
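The per-time-point counts plotted in this figure amount to a simple thresholding step, which can be sketched as follows. This is a minimal illustration and not the CHRONOS implementation: the record layout (`time`, `subscore`, `q_value`, `mirnas`), the function name `count_enriched` and all toy values are assumptions made for the example; only the cutoffs (subscore > 0.4, q-value < 0.05, and q-value < 0.01 for metabolic subpathways) are taken from the caption.

```python
from collections import Counter

# Hypothetical toy records, one per (subpathway, time point); all values are invented.
records = [
    {"id": "sp1", "time": "24h", "subscore": 0.55, "q_value": 0.01, "mirnas": ["miR-x"]},
    {"id": "sp2", "time": "24h", "subscore": 0.62, "q_value": 0.03, "mirnas": []},
    {"id": "sp3", "time": "48h", "subscore": 0.35, "q_value": 0.001, "mirnas": ["miR-y"]},
    {"id": "sp4", "time": "48h", "subscore": 0.71, "q_value": 0.20, "mirnas": ["miR-y"]},
    {"id": "sp5", "time": "48h", "subscore": 0.80, "q_value": 0.004, "mirnas": ["miR-x", "miR-y"]},
]


def count_enriched(records, min_subscore=0.4, max_q=0.05):
    """Count enriched subpathways per time point, and the miRNA-mediated subset."""
    enriched = Counter()
    mirna_mediated = Counter()
    for rec in records:
        # A subpathway counts as enriched only if it passes both cutoffs.
        if rec["subscore"] > min_subscore and rec["q_value"] < max_q:
            enriched[rec["time"]] += 1
            # miRNA-mediated: regulated by at least one miRNA.
            if rec["mirnas"]:
                mirna_mediated[rec["time"]] += 1
    return enriched, mirna_mediated


enriched, mirna_mediated = count_enriched(records)
print(dict(enriched), dict(mirna_mediated))

# Stricter cutoff, as used for the metabolic subpathways.
strict_enriched, strict_mirna = count_enriched(records, max_q=0.01)
print(dict(strict_enriched), dict(strict_mirna))
```

Passing `max_q` as a parameter keeps the same routine usable for both panels, since only the q-value cutoff differs between them.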

Figure S25. Real analysis results III. Bar plots visualizing, per time point, the percentage of all (left) and metabolic
(right) enriched subpathways that belong to the top-scoring 25% (of the total pool of subpathways) with regard to the
structural and functional measures (subDEG, subBC, subPathness). As shown, the majority of enriched subpathways displayed
their highest values in all three metrics at 48h.
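The percentage shown by each bar can be sketched as below for a single measure and a single time point. This is an illustrative sketch under stated assumptions, not the authors' code: the function name `top_quartile_share`, the rounding of the top-25% set size (integer division, at least one subpathway) and the toy scores are all hypothetical.

```python
def top_quartile_share(all_scores, enriched_ids):
    """Percentage of enriched subpathways that fall in the top-scoring 25% of the pool.

    all_scores:   dict mapping every subpathway in the pool to its score for one
                  measure (e.g. subBC).
    enriched_ids: subpathways enriched at the time point of interest.
    """
    # Rank the whole pool from highest to lowest score.
    ranked = sorted(all_scores, key=all_scores.get, reverse=True)
    # Assumption: the top 25% is the first len(pool) // 4 entries (at least one).
    n_top = max(1, len(ranked) // 4)
    top = set(ranked[:n_top])
    if not enriched_ids:
        return 0.0
    hits = sum(1 for sp in enriched_ids if sp in top)
    return 100.0 * hits / len(enriched_ids)


# Hypothetical pool of eight subpathways with invented scores.
pool = {"a": 133.2, "b": 118.7, "c": 115.8, "d": 115.3,
        "e": 100.7, "f": 95.4, "g": 87.0, "h": 62.5}
share = top_quartile_share(pool, ["a", "c", "e", "b"])
print(share)
```

Repeating the call for each measure and each time point yields one bar per combination.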
Table S3. Real analysis results IV. Comparison of CHRONOS, TEAK, Subpathway-GM, timeClip and DEAP

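Pathways | CHRONOS | TEAK | Subpathway-GM | timeClip | DEAP | (Nazarov et al., 2013)
Jak-STAT signaling pathway | YES (3h, 12h) | YES | - | YES | NO | (3h, 12h)
Hypertrophic cardiomyopathy (HCM) | YES (3h, 12h, 48h) | YES | - | NO | NO | Cardiovascular diseases (24h, 48h, 72h)
Dilated cardiomyopathy | YES (24h, 48h, 72h) | YES | - | NO | NO | Cardiovascular diseases (24h, 48h, 72h)
Viral myocarditis | YES (3h, 72h) | NO | - | NO | NO | Cardiovascular diseases (24h, 48h, 72h)
Inflammatory bowel disease | YES (3h) | NO | - | YES | YES | Immunological Disease (3h, 12h, 24h)
Cell Cycle | YES (24h, 48h) | NO | - | NO | NO | Cell Cycle (24h, 72h)
Apoptosis | NO | NO | - | YES | YES | Cell Death and Survival (3h, 12h, 24h)
p53 signaling pathway | YES (3h, 12h, 48h) | NO | - | YES | NO | Cell Death and Survival (3h, 12h, 24h)
Choline metabolism in cancer | YES (3h, 48h) | NO | - | NO | NO | Cancer (24h, 48h, 72h)
Proteoglycans in cancer | YES (3h, 12h, 48h) | NO | - | NO | YES | Cancer (24h, 48h, 72h)
Chemical carcinogenesis | YES (3h, 48h) | NO | - | NO | NO | Cancer (24h, 48h, 72h)
Thyroid cancer | YES (48h) | NO | - | NO | NO | Cancer (24h, 48h, 72h)
Prostate cancer | YES (3h, 24h, 48h) | NO | - | NO | YES | Cancer (24h, 48h, 72h)
Non-small cell lung cancer | YES (48h, 72h) | NO | - | NO | NO | Cancer (24h, 48h, 72h)
Ether lipid metabolism | YES (24h, 48h) | NO | YES | NO | NO | Lipid Metabolism (24h, 72h)
Glycosphingolipid biosynthesis | YES (24h, 48h) | YES | YES | NO | NO | Lipid Metabolism (24h, 72h)
Fatty acid degradation | YES (48h) | NO | NO | NO | NO | Lipid Metabolism (24h, 72h)
Pathogenic Escherichia coli infection | YES (12h) | NO | - | NO | NO | Infectious diseases (3h, 12h, 24h)
Salmonella infection | YES (3h, 48h) | YES | - | NO | NO | Infectious diseases (3h, 12h, 24h)
Shigellosis | YES (3h, 12h, 48h) | YES | - | NO | NO | Infectious diseases (3h, 12h, 24h)
Pertussis | YES (3h, 12h, 24h, 48h) | NO | - | NO | NO | Infectious diseases (3h, 12h, 24h)
Tuberculosis | YES (3h, 24h, 48h) | YES | - | NO | YES | Infectious diseases (3h, 12h, 24h)
Staphylococcus aureus infection | YES (24h) | NO | - | NO | NO | Infectious diseases (3h, 12h, 24h)
HTLV-I infection | YES (3h, 12h, 24h, 48h, 72h) | YES | - | YES | YES | Infectious diseases (3h, 12h, 24h)
Measles | YES (48h) | NO | - | YES | NO | Infectious diseases (3h, 12h, 24h)
Influenza A | YES (3h, 24h, 48h) | NO | - | YES | NO | Infectious diseases (3h, 12h, 24h)
Hepatitis B | YES (3h, 48h) | YES | - | YES | YES | Infectious diseases (3h, 12h, 24h)
Hepatitis C | YES (3h, 12h) | YES | - | NO | NO | Infectious diseases (3h, 12h, 24h)
Epstein-Barr virus infection | YES (3h, 12h, 24h) | NO | - | NO | NO | Infectious diseases (3h, 12h, 24h)
Amoebiasis | YES (3h, 12h, 24h) | YES | - | NO | NO | Infectious diseases (3h, 12h, 24h)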
Comparison of CHRONOS, TEAK, Subpathway-GM, timeClip and DEAP based on the results of (Nazarov et al., 2013). For TEAK, the top-scoring 5.9% of
linear and non-linear, metabolic and non-metabolic subpathways were evaluated. The 5.9% cutoff was defined so that the ratio of subpathways relative to
the whole pool of subpathways is comparable to the CHRONOS results (subscore > 0.4). Note that Subpathway-GM is not applicable to non-metabolic
pathways, thus no result is provided in the respective field; also, the identified DEGs were used as the list of ‘interesting genes’ for Subpathway-GM. timeClip
and DEAP require a user-defined graph as input, thus they were run on the CHRONOS output pathway network in order to obtain comparable results.
For Subpathway-GM, timeClip and DEAP, all resulting subpathways were evaluated. miRNAs were not included since their expression analysis is not
applicable in all methods. All methods were evaluated based on the significantly enriched pathway and Gene Ontology (GO) terms (those that can be related to
KEGG pathways) reported in (Nazarov et al., 2013) with p-value < 0.01. ‘YES’/‘NO’ means presence/absence of the respective pathway/ontology term in the
output of each method. CHRONOS, in contrast to the other four tools, detects almost all GO terms (via related KEGG pathway terms) reported in the original
study.
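The ‘YES’/‘NO’ entries are a presence/absence check of each reference term against the output of each method, which can be sketched as follows. This is a minimal sketch under stated assumptions: the function name `presence_table` and the representation of each method's output as a set of pathway names are hypothetical, and the toy sets below only mirror three rows of Table S3 (Subpathway-GM is left out because it is not applicable to these non-metabolic pathways).

```python
def presence_table(reference_pathways, method_outputs):
    """Mark each reference pathway as 'YES' or 'NO' for every method.

    reference_pathways: pathway names reported in the reference study.
    method_outputs:     dict mapping a method name to the set of pathway
                        names present in that method's output.
    """
    table = {}
    for pathway in reference_pathways:
        table[pathway] = {
            method: "YES" if pathway in found else "NO"
            for method, found in method_outputs.items()
        }
    return table


# Toy sets mirroring three rows of Table S3.
method_outputs = {
    "CHRONOS": {"Hepatitis B", "Tuberculosis"},
    "TEAK": {"Hepatitis B", "Tuberculosis"},
    "timeClip": {"Hepatitis B", "Apoptosis"},
    "DEAP": {"Hepatitis B", "Apoptosis", "Tuberculosis"},
}
reference = ["Apoptosis", "Hepatitis B", "Tuberculosis"]

for pathway, row in presence_table(reference, method_outputs).items():
    print(pathway, row)
```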
References
Büchel,F. et al. (2013) Path2Models: large-scale generation of computational models from biochemical pathway maps. BMC systems biology, 7(1),
116.
Cicek,A.E. et al. (2014) An online system for metabolic network analysis. Database, 2014, bau091.
Dimitrakopoulou,K. et al. (2015) Integromics network meta-analysis on cardiac aging offers robust multi-layer modular signatures and reveals
micronome synergism. BMC genomics, 16(1), 147.
Durinck,S. et al. (2005) BioMart and Bioconductor: a powerful link between biological databases and microarray data analysis. Bioinformatics,
21(16), 3439-3440.
Judeh,T. et al. (2013) TEAK: topology enrichment analysis framework for detecting activated biological subpathways. Nucleic Acids Res., 41,
1425-37.
Kanehisa,M. et al. (2002) The KEGG databases at GenomeNet. Nucleic acids research, 30(1), 42-46.
Kovács,I.A. et al. (2010) Community landscapes: an integrative approach to determine overlapping network module hierarchy, identify key nodes
and predict network dynamics. PloS one, 5(9), e12528.
Li,C. et al. (2013) Subpathway-GM: identification of metabolic subpathways via joint power of interesting genes and metabolites and their
topologies within pathways. Nucleic Acids Res., 41, e101.
Sales,G. et al. (2012) graphite - a Bioconductor package to convert pathway topology to gene network. BMC bioinformatics, 13(1), 20.
Vlachos,I.S. et al. (2015) DIANA-TarBase v7.0: indexing more than half a million experimentally supported miRNA:mRNA interactions. Nucleic
acids research, 43(D1), D153-D159.
Xiao,F. et al. (2009) miRecords: an integrated resource for microRNA–target interactions. Nucleic acids research, 37(suppl 1), D105-D110.
Welsh,E.A. et al. (2013) Iterative rank-order normalization of gene expression microarray data. BMC bioinformatics, 14(1), 153.
Wrzodek,C. et al. (2013) Precise generation of systems biology models from KEGG pathways. BMC systems biology, 7(1), 15.
