
Structural Bioinformatics 2004

Prof. Haim Wolfson


Flexible Docking - general
methodology
Major approaches :
Rigid subpart docking (place and join):
Split the flexible molecule into rigid subparts.
Dock independently each subpart.
Pair the top hypotheses for each subpart to detect
hinge consistency.
Example : DesJarlais, Sheridan, Dixon, Kuntz,
Venkataraghavan (1986).
Incremental construction method :
Position a preferred anchor fragment.
Rotate sequentially the flexible bonds to position the
other fragments.
Example: Leach & Kuntz (1992); Lengauer et al. - FLEXX.
Hinge scoring method:
Incorporate bond information already in the initial
filtering steps by accumulating information at the hinges.
No preference for specific parts. Resembles the place and
join method yet exploits the consistency of neighboring
part placement in the initial stages.
Example : Sandak, Nussinov, Wolfson (1995).
Search in multi-dimensional degrees of freedom
(torsion angle) space :
Evolutionary/Genetic Algorithms :
Represent degrees of freedom as strings.
Create offspring by (genetic) combination of parents.
Re-evaluate fitness of each string and prune weak
hypotheses.
Jones et al., J. Mol. Biol., vol. 245 (1995), pp. 43-
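The three steps above can be sketched as a toy genetic algorithm. This is only an illustration, not the method of Jones et al.: the function name `evolve`, the one-point crossover, the Gaussian mutation and all parameter values are assumptions, and the fitness function (lower is better) is supplied by the caller.

```python
import random

def evolve(fitness, n_torsions, pop_size=30, generations=40, keep=10, seed=0):
    """Toy genetic algorithm over torsion-angle strings (degrees)."""
    rng = random.Random(seed)
    # Each individual is a "string" of torsion angles, one per rotatable bond.
    pop = [[rng.uniform(-180, 180) for _ in range(n_torsions)]
           for _ in range(pop_size)]
    for _ in range(generations):
        # Re-evaluate fitness and prune weak hypotheses (lower is better).
        pop.sort(key=fitness)
        parents = pop[:keep]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_torsions) if n_torsions > 1 else 0
            child = a[:cut] + b[cut:]            # one-point crossover
            i = rng.randrange(n_torsions)        # mutate one angle slightly
            child[i] = (child[i] + rng.gauss(0, 15) + 180) % 360 - 180
            children.append(child)
        pop = parents + children                 # parents survive (elitism)
    return min(pop, key=fitness)
```

Because the best parents are carried over unchanged, the best fitness never gets worse from one generation to the next.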
Simulated Annealing :
AutoDock: Goodsell et al., Proteins (1990).
GGH based flexible docking
Applies either to flexible ligands or to flexible receptors.
General Algorithm outline
Can be applied either to a dataset of ligands vs a
receptor or a dataset of receptors vs. a ligand.
Calculate the molecular surface of the receptor and
the ligands and their interest points (+ normals).
Match the interest points and recover candidate
multi-transformations.
Check for inter-molecule and intra-molecule
penetrations and score the amount of contact.
Rank by energies.
Point Matching algorithm - preprocessing
For each database molecule :
Define reference frames (r.f.) at every hinge.
For each minimal feature (e.g. triplet) compute an r.f.
and shape signature.
For each (triplet based) reference frame compute the
transformation between that frame and the hinge based
frame and store (molecule, part, r.f., transformation) in a hash
(lookup) table at an entry addressed by the r.f. shape
signature.
Point Matching algorithm - recognition
For the target molecule :
For each minimal feature compute an r.f. and shape
signature.
Access the table by the shape signature, and for
each transformation appearing there :
transform the r.f. to hypothesized hinge position;
advance the counter of that hinge location for the appropriate
molecule and part.
Check highest scoring hinges .
Verify the resulting transformations .
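The preprocessing and recognition stages can be sketched together in a much reduced setting. The assumptions here are loud ones: points are 2D, the minimal feature is an ordered point pair instead of a triplet, the shape signature is just the binned pair length, and the stored "transformation" is the hinge position expressed in the pair's frame. All function names and the bin size are hypothetical.

```python
import math
from collections import defaultdict
from itertools import permutations

def frame(p, q):
    """2D reference frame: origin p, unit x-axis along p->q, plus length."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    d = math.hypot(dx, dy)
    return p, (dx / d, dy / d), d

def to_local(fr, pt):
    (ox, oy), (ux, uy), _ = fr
    vx, vy = pt[0] - ox, pt[1] - oy
    return (vx * ux + vy * uy, -vx * uy + vy * ux)

def to_world(fr, loc):
    (ox, oy), (ux, uy), _ = fr
    return (ox + loc[0] * ux - loc[1] * uy, oy + loc[0] * uy + loc[1] * ux)

def preprocess(parts, bin_size=0.5):
    """parts: {(molecule, part): (points, hinge)} -> hash table."""
    table = defaultdict(list)
    for key, (points, hinge) in parts.items():
        for p, q in permutations(points, 2):
            fr = frame(p, q)
            sig = round(fr[2] / bin_size)        # signature: binned length
            table[sig].append((key, to_local(fr, hinge)))
    return table

def recognize(table, target_points, bin_size=0.5):
    """Vote for hinge locations; return ((key, bin_x, bin_y), votes)."""
    votes = defaultdict(int)
    for p, q in permutations(target_points, 2):
        fr = frame(p, q)
        sig = round(fr[2] / bin_size)
        for key, hinge_local in table.get(sig, []):
            hx, hy = to_world(fr, hinge_local)   # hypothesized hinge position
            votes[(key, round(hx / bin_size), round(hy / bin_size))] += 1
    return max(votes.items(), key=lambda kv: kv[1])
```

The highest-scoring hinge bin is only a hypothesis; as the slide says, the resulting transformations still have to be verified.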
Flexible Docking
Calmodulin with M13 ligand
Flexible Docking
HIV Protease Inhibitor
The FlexX Algorithm
Rarey, ..., Lengauer. J. Mol. Biol., vol. 261,
(1996), pp. 470-
An incremental construction algorithm
The general schema
Incremental construction
Scoring function
Receptor-ligand interactions
Ligand conformational flexibility
Modeling
Algorithm
Base selection
Base placement
The Ligand conformational
flexibility
Approximated by a discrete set of
conformations.
rotatable single bond - modeled by a
discrete set of preferred torsion angles from
the MIMUMBA DB.
Ring system - A set of ring conformations is
computed with the program CORINA.
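A minimal sketch of what a discrete conformation set means in practice: if every rotatable bond has a short list of preferred torsion angles, the ligand conformations are the Cartesian product of those lists. The bond names and angle values are made up for illustration and are not taken from MIMUMBA.

```python
from itertools import product

def enumerate_conformations(torsion_options):
    """torsion_options: {bond_name: [allowed angles in degrees]}."""
    bonds = sorted(torsion_options)
    for angles in product(*(torsion_options[b] for b in bonds)):
        yield dict(zip(bonds, angles))
```

The count grows as the product of the list lengths, which is one reason for building the ligand fragment by fragment instead of enumerating everything up front.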
The model of receptor-ligand
interactions
Modeled by a few special types of
interactions
hydrogen bonds
metal-acceptor bonds
hydrophobic contacts
The model of protein-ligand
interactions Cont.
To each interaction group, we assign:
Interaction types
Interaction geometry ( center + surface)
Two groups interact if :
The centers of the groups lie approximately on the
surface of the counter group.
The interaction types are compatible
The intermolecular interactions can be classified by the
strength of their geometric constraints
Scoring function
Estimates the free binding energy in the complex
The function is additive in the ligand atoms.
match score
contact score
Overall docking algorithm
1. Ligand fragmentation
2. Select & Place a set of base fragments
3. Construct the ligand by linking the
remaining fragments.
Ligand fragmentation
The ligand is decomposed into
components by cutting at each acyclic
bond.
Fragmentation is a partition of the
components of the molecule, such that
every part, called fragment, is connected
in the component tree.
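The decomposition into components can be sketched on a plain bond graph. In this sketch an acyclic bond is taken to be a bond whose removal disconnects its two atoms; that is a simplification (for instance, bonds to terminal atoms would also be cut here), and the function names are hypothetical.

```python
from collections import defaultdict

def _reachable(adj, start, banned=frozenset()):
    """Atoms reachable from start without crossing a banned bond."""
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if frozenset((u, v)) not in banned and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def fragment(bonds):
    """Return (components, component-tree edges) for a list of bonds."""
    adj = defaultdict(set)
    for a, b in bonds:
        adj[a].add(b)
        adj[b].add(a)
    # A bond is acyclic if removing it disconnects its two atoms.
    acyclic = {frozenset((a, b)) for a, b in bonds
               if b not in _reachable(adj, a, {frozenset((a, b))})}
    components, owner = [], {}
    for atom in sorted(adj):
        if atom not in owner:
            comp = _reachable(adj, atom, acyclic)
            for x in comp:
                owner[x] = len(components)
            components.append(sorted(comp))
    # Each cut bond becomes an edge between two components.
    tree = sorted({tuple(sorted((owner[a], owner[b])))
                   for a, b in map(tuple, acyclic)})
    return components, tree
```

For two three-membered rings joined by a single bond, this yields two components and one tree edge between them.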
Ligand fragmentation
Good results are produced if the added fragments
are small
Every fragment, except for the base fragment,
consists of only one component.
Selecting a base fragment
The problem: Find a fragment which leads
to low energy docking solution.
Good base fragment properties:
Placeability
Specificity
Selecting a base fragment Cont.
We look for fragments maximizing the
function:
Rules for selecting a set of
fragments
No base fragment is fully contained in
another base fragment
Each component occurs in at most two
base fragments
Each component in a base fragment must
be either necessary for the connectivity of
the fragment or it must have interaction
centers.
The base placement algorithm
Goal: find positions of the base fragment
in the active site such that sufficient
number of favorable interactions between
the fragment and the protein can occur
simultaneously.
Solution: pose clustering.
The base placement algorithm
Cont.
Preparation: Store all triangles of
interaction points (IP) of the protein in a
hash table.
Find all compatible triangles of fragment
interaction points.
Clustering of the legal transformations
The incremental construction
algorithm
Input: solution set - set of partial
placements with the ligands constructed
up to and including fragment i-1
Output: set of partial placements with the
ligands constructed up to and including
fragment i
The complex construction algorithm
cont.
Adding the next fragment in all the possible
conformations
Reject extended placements that have strong
overlap with the receptor or internal overlap with
the ligand.
Searching for new interactions
Optimizing the positions of the partial ligand
Selecting a new solution set
Clustering the solution set
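The construction loop can be sketched as a beam search over partial placements. This is a skeleton under stated assumptions, not FlexX itself: `extend`, `clashes` and `score` are hypothetical callbacks supplied by the caller, lower scores are better, and the interaction search, position optimization and clustering steps are omitted.

```python
def incremental_construction(base_placements, fragments, extend, clashes,
                             score, k=3):
    """Grow partial placements one fragment at a time, keeping the k best."""
    solutions = list(base_placements)
    for frag in fragments:
        candidates = []
        for partial in solutions:
            # Add the next fragment in all of its possible conformations.
            for placement in extend(partial, frag):
                # Reject placements that overlap the receptor or the ligand.
                if not clashes(placement):
                    candidates.append(placement)
        candidates.sort(key=score)       # lower score = better placement
        solutions = candidates[:k]       # the new solution set
    return solutions
```

In a toy use, a placement can be a tuple of torsion angles, with `extend` appending one angle per fragment.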
Optimizing the positions of the
partial ligand
The placement is optimized when:
New interactions are found.
The placement contains slightly overlapping
atoms between the receptor and the ligand.
$\sum_i w_i \, (r_i - l_i)^2$

Selecting a new solution set
Select k best-scoring solutions
Problem: the scoring values cannot be
compared directly when different
fragments are involved.
Solution: estimate the score of the whole
ligand, given a partial placement.
Clustering partial solutions
If no placement contains the other, the
distance is infinity
Otherwise, the distance is defined to be
the RMSD of the intersecting atoms.
A cluster is reduced to a single placement.
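The distance rule above can be written down directly. As an assumption for this sketch, a partial placement is a dictionary from atom name to 3D coordinates, and "contains" means that one placement's atom set is a subset of the other's.

```python
import math

def placement_distance(p, q):
    """Distance between two partial placements ({atom: (x, y, z)})."""
    # If neither placement contains the other, they are not comparable.
    if not (p.keys() <= q.keys() or q.keys() <= p.keys()):
        return math.inf
    shared = p.keys() & q.keys()
    if not shared:
        return math.inf
    # RMSD over the atoms present in both placements.
    sq = sum(math.dist(p[a], q[a]) ** 2 for a in shared)
    return math.sqrt(sq / len(shared))
```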
Exploring receptor Flexibility
Protein flexibility - motivation
Induced fit: side chain or even backbone
adjustments upon docking of different ligands to
the same protein.
Even small conformational changes are critical
for docking applications e.g. if a rotatable bond
prevents a ligand from binding in the correct
position.
Protein flexibility
Main idea: describe the protein structure
variations with a set of protein structures
representing the flexibility, mutations or
alternative models of a protein.
The variability considered by FlexE is defined by
the differences within the given input structures.
United protein description
Data structure that handles the protein
structure variations.
Contains an ensemble of up to 30 possible
conformations of the protein.
Most of them are low energy
conformations of the same protein.
United protein description -
construction
Superposition
Clustering
Notation
Component : all the
atoms which belong to the
same amino acid or
mutation of the amino
acid. Contains a backbone
part and a side chain part
Part : set of instances
Instance : one of the
alternative conformations.
United protein description -
clustering
The superimposed structures are
combined by clustering each part
separately
Complete linkage hierarchical clustering
The clustered instances can be
recombined to form new valid protein
structures.
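A naive sketch of complete linkage hierarchical clustering, to make the merge rule concrete. The distance function and the stopping threshold are supplied by the caller and are assumptions here; for instances of a part one would presumably use an RMSD, but the slides do not state the cutoff.

```python
def complete_linkage(items, dist, threshold):
    """Agglomerative clustering; stop when the closest pair exceeds threshold."""
    clusters = [[x] for x in items]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Complete linkage: cluster distance = largest pairwise distance.
                d = max(dist(a, b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > threshold:
            break
        merged = clusters.pop(j)         # j > i, so index i stays valid
        clusters[i].extend(merged)
    return clusters
```

Using the largest pairwise distance guarantees that all members of a cluster lie within the threshold of each other, so one representative instance stands in for the rest.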
Incompatibility
Two instances of the united
protein description are
incompatible if they cannot
be realized simultaneously.
Logical: two instances are
alternative to each other
Geometric: two logically
compatible instances overlap
Structural: two instances of
the same chain are
unconnected
Incompatibility graph
$V = \{\text{instances}\}$
$E = \{\, e_{ij} \mid v_i \text{ and } v_j \text{ are incompatible} \,\}$
Incompatibility graph
The incompatibility is
internally represented as
a graph with the
instances as nodes and
an edge connecting each
pair of incompatible
nodes.
Valid protein structures
correspond to
independent sets in the
graph.
Selection of instances
The ligand is placed fragment by fragment
into the active site by the incremental
construction algorithm.
After each construction step, all possible
interactions are determined.
Apply the scoring function for each
instance.
We choose the independent set (IS) with the highest score.
The IS can be assembled from IS of the
connected components.
Apply a modified version of the Bron-
Kerbosch algorithm.
Select the optimal IS
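Selecting the optimal IS can be sketched with the basic Bron-Kerbosch recursion, run on the compatibility relation (the complement of the incompatibility graph), so that maximal cliques there are maximal independent sets here. This is the textbook version without pivoting, not the modified version used by FlexE; it assumes non-negative scores where higher is better, and it skips the per-component decomposition mentioned above.

```python
def best_independent_set(nodes, incompatible, score):
    """Return (total score, set of instances) for the best independent set."""
    nodes = list(nodes)
    bad = {frozenset(e) for e in incompatible}
    # compatible[v]: instances that can be realized together with v.
    compatible = {v: {u for u in nodes
                      if u != v and frozenset((u, v)) not in bad}
                  for v in nodes}
    best = (float("-inf"), set())

    def bron_kerbosch(r, p, x):
        nonlocal best
        if not p and not x:              # r is a maximal independent set
            total = sum(score[v] for v in r)
            if total > best[0]:
                best = (total, set(r))
            return
        for v in list(p):
            bron_kerbosch(r | {v}, p & compatible[v], x & compatible[v])
            p = p - {v}
            x = x | {v}

    bron_kerbosch(set(), set(nodes), set())
    return best
```

With two alternative instances for each of two side chains and one geometric clash between them, the search returns the highest-scoring pair that can coexist.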
Evaluation
FlexE was evaluated with ten protein
structure ensembles containing 105
crystal structures from the PDB.
The structures within each ensemble have:
highly similar backbone traces
different conformations for several side
chains.
Evaluation Cont.
FlexE finds a ligand position with RMSD
below 2 Å in 67% of the cases.
Average CPU time for the incremental
construction algorithm is 5.5 minutes.
Discussion
The ensemble approach is able to cope
with several side-chain conformations
and even movements of loops.
Motions of larger backbone segments or
even domain movements are not covered
by this approach.
FlexDock: Algorithm Stages
Rigid Parts Docking via Geometric Hashing
Assembly of partial dockings into a flexible result
[Figure: molecules labeled A and B]
Flexible Assembly Stage
NODE: transformation, score
[Figure: node columns for Part 1 results, Part 2 results, Part 3 results]
Results Compatibility
[Figure: parts B1 and B2 docked to A]
Two docking results are compatible if
and only if:
(1) Their transformations superimpose
the hinge point into the same
location (approximately).
(2) The parts are not penetrating.
Note: compatible results may have some
shape complementarity
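The two compatibility conditions can be sketched as a simple geometric test. Everything concrete here is an assumption: a transformation is a (3x3 rotation matrix, translation) pair, the tolerance values are placeholders, and the atom lists are expected to exclude atoms right next to the hinge, which are bonded and therefore legitimately close.

```python
import math

def apply(transform, point):
    """Apply a rigid transformation (rotation matrix, translation) to a point."""
    rot, trans = transform
    return tuple(sum(rot[i][j] * point[j] for j in range(3)) + trans[i]
                 for i in range(3))

def compatible(t1, t2, hinge, atoms1, atoms2, hinge_tol=1.0, min_dist=2.0):
    """Check two part dockings for hinge agreement and no penetration."""
    # (1) Both transformations must place the hinge at about the same spot.
    if math.dist(apply(t1, hinge), apply(t2, hinge)) > hinge_tol:
        return False
    # (2) The transformed parts must not penetrate each other.
    placed1 = [apply(t1, a) for a in atoms1]
    placed2 = [apply(t2, a) for a in atoms2]
    return all(math.dist(a, b) >= min_dist for a in placed1 for b in placed2)
```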
Flexible Assembly
NODE: transformation, score
EDGE: parts docking score
[Figure: graph from s to t through Part 1 results, Part 2 results, Part 3 results]
Flexible Assembly Graph
DAG: Directed Acyclic Graph.
NODE: part transformation, score.
EDGE: connects compatible parts, score of
docking between the parts.
DOCKING PATH: a path between s and t.
PATH SCORE: sum of nodes and edges scores.
Goal: find K best paths in the assembly graph.
Solution: dynamic programming.
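A minimal sketch of the dynamic programming step, assuming the parts form a chain so that the graph is layered (Part 1 results, then Part 2 results, and so on), with s and t implicit and scoring zero. Higher scores are taken to be better, and `edge_score` is a hypothetical callback that returns None for incompatible pairs.

```python
import heapq

def k_best_paths(layers, node_score, edge_score, k):
    """layers: one list of node ids per part, in chain order.
    Returns up to k (score, path) pairs, best first."""
    # best[v]: up to k (score, path) pairs for paths from s ending at v.
    best = {v: [(node_score[v], [v])] for v in layers[0]}
    for prev, cur in zip(layers, layers[1:]):
        nxt = {}
        for v in cur:
            cands = []
            for u in prev:
                e = edge_score(u, v)
                if e is None:            # no edge: results are incompatible
                    continue
                for s, path in best.get(u, []):
                    cands.append((s + e + node_score[v], path + [v]))
            nxt[v] = heapq.nlargest(k, cands, key=lambda c: c[0])
        best = nxt
    finals = [c for v in layers[-1] for c in best.get(v, [])]
    return heapq.nlargest(k, finals, key=lambda c: c[0])
```

Keeping only the k best partial paths per node is enough, because any of the k best complete paths must extend one of the k best paths into each node it visits.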
