
Protein network analysis

Network motifs
Network clusters / modules
Co-clustering networks & expression
Network comparison
(species, conditions)
Integration of genetic & physical nets
Network visualization

Network motifs

Network Motifs (Milo, Alon et al.)


Motifs are patterns of interconnections occurring in
complex networks.
That is, connected subgraphs of a particular isomorphic
topology
The approach queries the network for small motifs (e.g.,
of < 5 nodes) that occur much more frequently than
would be expected in random networks
Significant motifs have been found in a variety of
biological networks and, for instance, correspond to
feed-forward and feed-back loops that are well known in
circuit design and other engineering fields.
Pioneered by Uri Alon and colleagues

Motif searches in 3 different contexts

How many motifs (connected subgraph topologies) exist involving three nodes?
If the graph is undirected?
If the graph is directed?
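
The counting question above can be checked by brute force. The following is a minimal sketch, not part of the original slides: it enumerates every edge set on three labeled nodes, keeps the connected ones, and groups them by isomorphism. The function names `canonical`, `weakly_connected`, and `count_types` are hypothetical.

```python
from itertools import permutations, product

def canonical(edges, n, directed):
    """Smallest relabeled edge list over all node permutations (isomorphism key)."""
    forms = []
    for p in permutations(range(n)):
        relabeled = [(p[u], p[v]) for u, v in edges]
        if not directed:
            relabeled = [tuple(sorted(e)) for e in relabeled]
        forms.append(tuple(sorted(relabeled)))
    return min(forms)

def weakly_connected(edges, n):
    """True if all n nodes are reachable from node 0, ignoring edge direction."""
    seen, stack = {0}, [0]
    while stack:
        x = stack.pop()
        for u, v in edges:
            for a, b in ((u, v), (v, u)):
                if a == x and b not in seen:
                    seen.add(b)
                    stack.append(b)
    return len(seen) == n

def count_types(directed, n=3):
    """Count connected n-node subgraph topologies up to isomorphism."""
    if directed:
        slots = [(u, v) for u in range(n) for v in range(n) if u != v]
    else:
        slots = [(u, v) for u in range(n) for v in range(u + 1, n)]
    classes = set()
    for mask in product([0, 1], repeat=len(slots)):
        edges = [e for e, keep in zip(slots, mask) if keep]
        if weakly_connected(edges, n):
            classes.add(canonical(edges, n, directed))
    return len(classes)

print(count_types(directed=False), count_types(directed=True))
```

Under these assumptions (no self-loops, at most one edge per direction) the sketch reports 2 undirected and 13 directed topologies for three nodes.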

All 3-node directed subgraphs

What is the frequency of each in the network?

Outline of the Approach


Search network to identify all possible n-node connected
subgraphs (here n=3 or 4)
Get # occurrences of each subgraph type
The significance of each type is determined using
permutation testing, in which the above process is
repeated for many randomized networks (preserving
node degrees; why?)
Use random distributions to compute a p-value for each
subgraph type. The network motifs are subgraphs with
p < 0.001
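
The outline above can be sketched for a single subgraph type. This is an illustrative simplification, not the authors' implementation: it counts only feed-forward loops (non-induced occurrences), randomizes by swapping edge endpoints so that every node keeps its in- and out-degree, and reports an empirical p-value. The names `count_ffl`, `rewire`, and `empirical_p`, and the choice of 10 swaps per edge, are assumptions.

```python
import random

def count_ffl(edges):
    """Count feed-forward loops a->b, b->c, a->c (assumes no self-loops)."""
    succ = {}
    for u, v in edges:
        succ.setdefault(u, set()).add(v)
    total = 0
    for a, outs in succ.items():
        for b in outs:
            for c in succ.get(b, ()):
                if c != a and c in outs:
                    total += 1
    return total

def rewire(edges, n_swaps, rng):
    """Degree-preserving randomization: swap targets of two random edges."""
    edges = list(edges)
    present = set(edges)
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        # Skip swaps that would create a self-loop or a duplicate edge.
        if a == d or c == b or (a, d) in present or (c, b) in present:
            continue
        present -= {(a, b), (c, d)}
        present |= {(a, d), (c, b)}
        edges[i], edges[j] = (a, d), (c, b)
    return edges

def empirical_p(edges, n_random=200, seed=0):
    """Fraction of randomized networks with at least as many FFLs as observed."""
    rng = random.Random(seed)
    observed = count_ffl(edges)
    hits = 0
    for _ in range(n_random):
        if count_ffl(rewire(edges, 10 * len(edges), rng)) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (hits + 1) / (n_random + 1)
```

Note that resolving a threshold such as p < 0.001 empirically needs at least on the order of 1,000 randomized networks.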

Schematic view of network motif detection

Networks are randomized preserving node degree

Concentration of feedforward motif:


(Num. appearances of motif divided by
all 3 node connected subgraphs)
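
As a rough illustration of the ratio defined above, the sketch below counts node triples whose induced subgraph is exactly a feed-forward loop and divides by the number of connected node triples. The function name `ffl_concentration` is hypothetical, and the cubic-time enumeration is only suitable for small networks.

```python
from itertools import combinations

def ffl_concentration(edges):
    """FFL count divided by the number of connected 3-node subgraphs."""
    edge_set = set(edges)
    nodes = sorted({x for e in edge_set for x in e})
    nbrs = {x: set() for x in nodes}
    for u, v in edge_set:
        nbrs[u].add(v)
        nbrs[v].add(u)
    connected = ffl = 0
    for triple in combinations(nodes, 3):
        a, b, c = triple
        # A triple is connected if at least two of its three pairs are linked.
        linked = sum(1 for x, y in ((a, b), (a, c), (b, c)) if y in nbrs[x])
        if linked < 2:
            continue
        connected += 1
        induced = [(x, y) for x in triple for y in triple
                   if x != y and (x, y) in edge_set]
        out_deg = sorted(sum(1 for s, _ in induced if s == x) for x in triple)
        # Three edges on three linked pairs with out-degrees 0, 1, 2 is an FFL.
        if linked == 3 and len(induced) == 3 and out_deg == [0, 1, 2]:
            ffl += 1
    return ffl / connected if connected else 0.0
```

For the edge list `[(0, 1), (1, 2), (0, 2), (2, 3)]` there are three connected triples, one of which is a feed-forward loop.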

Mean+/-SD of 400 subnetworks

Transcriptional
network results

Neural networks

Food webs

World Wide Web

Electronic circuits

Interesting questions
Which networks have motifs in common?
Which networks have completely distinct motifs versus
the others?
Does this tell us anything about the design constraints
on each network?
E.g., the feedforward loop may function to activate
output only if the input signal is persistent (i.e., reject
noisy or transient signals) and to allow rapid deactivation
when the input turns off
E.g., food webs evolve to allow flow of energy from top
to bottom (?!**!???), whereas transcriptional networks
evolve to process information

Identifying modules in the network


Rives/Galitski PNAS paper 2003
Define distance between each pair of
proteins in the interaction network
E.g., d = shortest path length
To compute shortest path length, use
Dijkstra's algorithm
Cluster w/ pairwise node similarity = 1/d^2
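
A minimal sketch of the distance and similarity step, assuming an unweighted interaction network stored as an adjacency dictionary. With unit edge weights, breadth-first search returns the same shortest path lengths as Dijkstra's algorithm. Assigning similarity 0 to unreachable pairs is an assumption of this sketch, and the clustering step itself is not shown.

```python
from collections import deque

def shortest_path_lengths(adj, source):
    """Breadth-first search: hop counts from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist

def similarity(adj):
    """Pairwise similarity 1/d^2 for distinct nodes; 0.0 if unreachable."""
    sim = {}
    for u in adj:
        dist = shortest_path_lengths(adj, u)
        for v in adj:
            if u != v:
                d = dist.get(v)
                sim[(u, v)] = 1.0 / d ** 2 if d else 0.0
    return sim

adj = {"A": ["B"], "B": ["A", "C"], "C": ["B"], "D": []}
sim = similarity(adj)
print(sim[("A", "B")], sim[("A", "C")], sim[("A", "D")])
```

The resulting similarity table could then be passed to any standard clustering routine.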

Integration of
networks and expression

Querying biological networks for Active Modules


Color network nodes (genes/proteins) with:
Patient expression profile
Protein states
Patient genotype (SNP state)
Enzyme activity
RNAi phenotype

Active Modules
Ideker et al. Bioinformatics (2002)

Interaction Database
Dump, aka Hairball

A scoring system for expression activity


[Figure: example score tables, panels A and B, over perturbations/conditions]

Scoring over multiple perturbations/conditions

Searching for active pathways in a large network


Score subnetworks according to their overall amount of
activity
Finding the highest-scoring subnetworks is NP-hard, so we
use heuristic search algorithms to identify a collection of high-scoring subnetworks (local optima)
Simulated annealing and/or greedy search starting from an
initial subnetwork seed
During the search we must also worry about issues such as
local topology and whether a subnetwork's score is higher
than would be expected at random
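
The search described above can be illustrated with the greedy variant. This is a sketch under stated assumptions, not the published procedure: each gene is assumed to carry a z-score of expression activity, a subnetwork of k genes is scored as the sum of its z-scores divided by sqrt(k), and the search grows a seed by adding whichever neighbor raises the score most. The names `aggregate_score` and `greedy_module` are hypothetical. Simulated annealing would additionally accept some score-lowering moves in order to escape local optima.

```python
import math

def aggregate_score(nodes, z):
    """Assumed subnetwork score: sum of per-gene z-scores over sqrt(size)."""
    return sum(z[n] for n in nodes) / math.sqrt(len(nodes))

def greedy_module(adj, z, seed):
    """Grow a connected subnetwork from seed while the score keeps improving."""
    module = {seed}
    score = aggregate_score(module, z)
    while True:
        frontier = {y for x in module for y in adj[x]} - module
        best = None
        for cand in frontier:
            s = aggregate_score(module | {cand}, z)
            if s > score and (best is None or s > best[0]):
                best = (s, cand)
        if best is None:
            return module, score  # local optimum: no neighbor improves the score
        score, cand = best
        module.add(cand)

adj = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
z = {"a": 2.0, "b": 1.5, "c": -3.0, "d": 4.0}
module, score = greedy_module(adj, z, "a")
```

In this toy network the search starting at `a` stops at `{a, b}` and never reaches the highly active node `d`, which shows why the result is only a local optimum and why several seeds or annealing are used.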

Simulated Annealing
Algorithm

Network
regions whose
genes change
on/off or off/on
after knocking
out different
genes

Initial Application to Toxicity:


Networks responding to DNA damage in yeast

Tom Begley and Leona Samson; MIT Dept. of Bioengineering
Systematic phenotyping of gene knockout
strains in yeast
Evaluation of growth of each strain in the
presence of MMS (and other DNA damaging
agents)
Sensitive
Not sensitive
Not tested
MMS sensitivity in ~25% of strains
Screening against a network of protein
interactions

Begley et al., Mol Cancer Res,

Networks responding to DNA damage as revealed by
high-throughput phenotypic assays

Begley et al., Mol Cancer Res,

Host-pathogen interactions regulating early stage HIV-1 infection

Genome-wide RNAi screens for genes required for infection utilizing a
single cycle HIV-1 reporter virus engineered to encode luciferase and
bearing the Vesicular Stomatitis Virus Glycoprotein (VSV-G) on its
surface to facilitate efficient infection
Sumit Chanda

Project onto a large network of human-human and human-HIV protein
interactions

Network modules associated with infection

König et al. Cell 2008

Network-based
classification


Disease aggression
(Time from Sample Collection SC
to Treatment TX)

Chuang et al. MSB 2007


Lee et al. PLoS Comp Bio 2008
Ravasi et al. Cell 2010

The Mammalian Cell Fate Map:


Can we classify tissue type using expression, networks,
etc?
Gilbert Developmental
Biology 4th Edition

Interaction coherence within a tissue class
[Figure: interaction coherence by tissue class (Endoderm, Mesoderm, Ectoderm incl. CNS); panel correlations shown: r = 0.9, r = 0.0, r = 0.2]
Taylor et al. Nature Biotech 2009

Protein interactions, not levels, dictate tissue specification

Functional Enrichment

::: Introduction.
Gene Set Enrichment Analysis - GSEA -

GSEA
MIT
Broad Institute
v 2.0 available since Jan 2007
Version 2.0 includes Biocarta, Broad Institute,
GenMAPP, KEGG annotations and more...
Platforms: Affymetrix, Agilent, CodeLink, custom...
(Subramanian et al. PNAS. 2005.)


GSEA applies a Kolmogorov-Smirnov-style test to find asymmetrical distributions of defined
blocks of genes within the dataset's whole distribution.

Is this particular Gene Set enriched in my experiment?

Gene sets: genes selected by the researcher, Biocarta pathways, GenMAPP sets,
genes sharing a cytoband, genes targeted by common miRNAs...
It's up to you


::: K-S test

The Kolmogorov-Smirnov test is used to determine whether two underlying one-dimensional probability distributions differ, or whether
an underlying probability distribution differs from a hypothesized distribution, in either case based on finite samples.

The one-sample KS test compares the empirical distribution function with the cumulative distribution function specified by the null hypothesis.
The main applications are testing goodness of fit with the normal and uniform distributions.

The two-sample KS test is one of the most useful and general nonparametric methods for comparing two samples, as it is sensitive to
differences in both location and shape of the empirical cumulative distribution functions of the two samples.
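
To make the two-sample statistic concrete, here is a minimal sketch that computes the largest vertical gap between two empirical cumulative distribution functions. The function name `ks_statistic` is hypothetical, and turning the statistic into a p-value is not shown.

```python
from bisect import bisect_right

def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: maximum absolute difference between the ECDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    # The ECDFs only change at observed values, so checking those is enough.
    for x in sorted(set(a) | set(b)):
        fa = bisect_right(a, x) / len(a)  # fraction of sample_a <= x
        fb = bisect_right(b, x) / len(b)  # fraction of sample_b <= x
        d = max(d, abs(fa - fb))
    return d

print(ks_statistic([1, 2, 3, 4], [3, 4, 5, 6]))
```

Identical samples give 0, and samples that do not overlap at all give 1.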

[Figure: dataset distribution compared with the gene set 1 and gene set 2 distributions; x-axis: gene expression level, y-axis: number of genes]


...testing genes independently...
[Figure: Class A vs. Class B, per-gene t-test with cut-offs at FDR < 0.05]

Biological meaning?
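
One common way to apply an FDR cut-off to per-gene p-values is the Benjamini-Hochberg procedure. The sketch below assumes the p-values have already been computed (for example from per-gene t-tests) and returns adjusted values that can be compared against 0.05. The function name `benjamini_hochberg` is hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

adj_p = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
significant = [q < 0.05 for q in adj_p]
```

In this toy input only the first gene passes the cut-off, even though three raw p-values are below 0.05.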


[Figure: genes ranked by correlation with class (Class A vs. Class B, t-test cut-offs marked), with the positions of Gene Set 1, Gene Set 2, and Gene Set 3 along the ranking]

Gene set 2
enriched in Class A

Gene set 3
enriched in Class B

Subramanian et al., PNAS 2005


The Enrichment Score:

NES
pval
FDR (Benjamini-Hochberg)
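
A simplified sketch of an enrichment score as a running sum over a ranked gene list. It uses the unweighted form (each hit adds 1/hits, each miss subtracts 1/misses), which is an assumption of this sketch; the published GSEA statistic also weights hits by their correlation with the phenotype. The function name `enrichment_score` is hypothetical.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum score: maximum deviation from zero along the ranking."""
    hits = [g in gene_set for g in ranked_genes]
    n_hit = sum(hits)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must contain some, but not all, ranked genes")
    running = 0.0
    es = 0.0
    for is_hit in hits:
        running += 1.0 / n_hit if is_hit else -1.0 / n_miss
        if abs(running) > abs(es):
            es = running
    return es

ranked = ["A", "B", "C", "D", "E", "F"]  # most to least correlated with the class
top = enrichment_score(ranked, {"A", "B"})
bottom = enrichment_score(ranked, {"E", "F"})
```

A set concentrated at the top of the ranking scores near +1 and one concentrated at the bottom scores near -1. NES and the p-value would come from comparing this score against scores from permuted data, which is not shown here.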

Network Alignment
Species 1 vs. species 2
Physical vs. genetic

Cross-comparison of networks:
(1) Conserved regions in the presence vs. absence of
stimulus
(2) Conserved regions across different species

Kelley et al. PNAS 2003
Suthram et al. Nature 2005
Ideker & Sharan Gen Res
Sharan & Ideker Nat.
Sharan et al. RECOMB 2004
Scott et al. RECOMB

Plasmodium: a network apart?

Plasmodium-specific
protein complexes
Conserved Plasmodium /
Saccharomyces protein

Suthram et al. Nature 2005


LaCount et al. Nature 2005

Human vs. Mouse TF-TF Networks in Brain

Tim Ravasi, RIKEN Consortium et al. Cell

Finding physical pathways to explain genetic interactions


Genetic Interactions:

Classical method used to map pathways in model species

Highly analogous to multi-genic interaction in human disease and combination therapy

Thousands are being uncovered through systematic studies

Thus, as with other types, the number of known genetic interactions is exponentially increasing
Adapted from Tong et al., Science 2001

Integration of genetic and physical interactions

160 between-pathway models
101 within-pathway models
Num interactions:
1,102 genetic
933 physical

Kelley and Ideker Nature Biotechnology (2005)

Systematic identification of
parallel pathway relationships in yeast

Unified Whole
Cell Model of
Genetic and
Physical
interactions

A dynamic DNA damage module map

Bandyopadhyay et al. Science
