AutoDock
By Susan McClatchy, Milind Misra, Chandreyee Mukherjee, Indu Shrivastava
Introduction
Chandreyee Mukherjee
Automated Docking: Importance
What is docking?
Prediction of the optimal physical configuration and energy between two molecules
The docking problem optimizes:
Categories of docking
1. Protein-Protein Docking:
Both molecules are rigid
Interaction produces no change in conformation
Similar to the lock-and-key model
2. Protein-Ligand Docking:
Ligand is flexible but the receptor protein is rigid
Interaction produces conformational changes in ligand
optimized molecular properties.
Exploration of the configuration space available for interaction between ligand and receptor.
Evaluation and ranking of configurations using a scoring system, in this case the binding energy.
$\Delta G_{bind} = \Delta G_{vdW} + \Delta G_{hbond} + \Delta G_{elec} + \Delta G_{conform} + \Delta G_{tor} + \Delta G_{sol}$
Algorithms Overview
Simulated Annealing
Genetic Algorithm
Charles Darwin's Theory of Evolution
Genotype → Phenotype
Lamarckian Algorithm (Jean-Baptiste de Lamarck)
Phenotype → Genotype
Project Goal
Study the algorithms used to perform the searches and to calculate the minimum energy
The Algorithms
Sue McClatchy
Simulated Annealing
The SA Algorithm
Pseudocode for SA
Compute a random initial state s
n = 0, x*_n = s                      // initialize state counter n to 0 and best solution x*_n to s
Repeat for i = 1, 2, ...             // temperatures T_i to try
  Repeat for j = 1, 2, ..., m_i      // no. of steps to perform for each temp. T_i
    Compute a neighbor s' = N(s)     // s' = new solution from the neighborhood N(s)
    if (f(s') <= f(s)) then          // if energy of s' <= energy of s
      s = s'                         // accept new solution s'
      if (f(s) < f(x*_n)) then       // if energy of new solution < energy of best solution of state n,
        x*_{n+1} = s                 // replace best with new
        n = n + 1
      endif
    else                             // otherwise replace s with s' using the Boltzmann dist.
      s = s' with probability e^(-(f(s') - f(s))/T_i)
    endif
  EndRepeat
EndRepeat
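The pseudocode above can be sketched in Python as follows. This is a minimal illustration, not AutoDock's implementation: the function names, the one-dimensional toy energy, the neighbor move and the temperature list are all assumptions made for the example.

```python
import math
import random

def simulated_annealing(f, neighbor, s0, temps, steps_per_temp, rng):
    """Minimize f following the SA pseudocode; returns (best_state, best_energy)."""
    s = s0
    best, best_e = s0, f(s0)
    for T in temps:                          # outer loop over temperatures T_i
        for _ in range(steps_per_temp):      # m_i steps at each temperature
            cand = neighbor(s, rng)          # s' = N(s)
            delta = f(cand) - f(s)
            if delta <= 0:
                s = cand                     # downhill or equal: always accept
                e = f(s)
                if e < best_e:               # keep track of the best solution so far
                    best, best_e = s, e
            elif rng.random() < math.exp(-delta / T):
                s = cand                     # uphill: accept with Boltzmann probability
    return best, best_e

# Toy problem (assumption): a parabola with its minimum at x = 3.
def energy(x):
    return (x - 3.0) ** 2

def step(x, rng):
    return x + rng.uniform(-1.0, 1.0)

# Hypothetical schedule: 50 temperatures, each 0.95 times the previous one.
temps = [10.0 * 0.95 ** i for i in range(50)]
best_x, best_e = simulated_annealing(energy, step, 20.0, temps, 100, random.Random(0))
```

Uphill moves are accepted often while T is large and rarely once it has dropped, which is what lets the search leave local minima early on.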
Initial population of binary creatures having 6 genes
Each gene has two different alleles, either a 0 or a 1
Three operators: crossover, mutation and selection
Selection based on a fitness function f(x)
This operator chooses those individuals with the lowest values
Those with higher values are chosen with a very low probability
[Figure: worked GA example on 6-gene binary individuals, showing the Selection, Crossover, Mutation and Replacement steps with the score of each individual]
Pseudocode for GA
Select an initial population set x^0 = {x_1^0, x_2^0, ..., x_M^0}
Determine fitness values f(x_i^0) for each individual
Repeat for g = 1, 2, ..., # of generations
  Perform selection
  Perform crossover with the crossover probability
  Perform mutation with the mutation probability
  Determine fitness f(x_i^g) for new individuals
  x^g* = argmin_{i=1,...,M} f(x_i^g) and y^g* = f(x^g*)
  Perform replacement
Until stopping criterion (# of generations) is reached
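A compact Python sketch of this loop on 6-gene binary individuals is given here. It rests on several assumptions that are not in the slides: two-way tournament selection stands in for the selection operator, crossover is one-point, replacement keeps one elite individual, and `run_ga`, `count_ones` and the default parameter values are hypothetical names and settings chosen for the example.

```python
import random

def run_ga(fitness, n_genes=6, pop_size=20, generations=40,
           p_cross=0.8, p_mut=0.02, seed=0):
    """Minimize `fitness` over bit lists; returns (best_individual, best_score)."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_genes)] for _ in range(pop_size)]
    best = min(pop, key=fitness)[:]

    def select(population):
        # Tournament of two: the individual with the lower value wins.
        a, b = rng.sample(population, 2)
        return a if fitness(a) <= fitness(b) else b

    for _ in range(generations):
        children = []
        while len(children) < pop_size:
            p1, p2 = select(pop)[:], select(pop)[:]
            if rng.random() < p_cross:               # one-point crossover
                cut = rng.randint(1, n_genes - 1)
                p1, p2 = p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]
            for child in (p1, p2):
                for i in range(n_genes):
                    if rng.random() < p_mut:         # bit-flip mutation
                        child[i] ^= 1
                children.append(child)
        children = children[:pop_size]
        gen_best = min(children, key=fitness)        # x^g* = argmin f(x_i^g)
        if fitness(gen_best) < fitness(best):
            best = gen_best[:]
        children[0] = best[:]                        # replacement with one elite
        pop = children
    return best, fitness(best)

# Toy fitness (assumption): number of 1-alleles, so the optimum is all zeros.
def count_ones(genes):
    return sum(genes)

best, score = run_ga(count_ones)
```

Because the elite individual is copied into every new generation, the best score can never get worse from one generation to the next.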
How GA works in AutoDock
Mapping
The mutation operator changes a coordinate or angle value by adding a random real number drawn from a Cauchy distribution, which is similar to a Gaussian but has thicker tails
Replacement
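The Cauchy mutation described above can be illustrated with a short Python sketch. The deviate is drawn by inverting the Cauchy cumulative distribution, with location `alpha` and scale `beta`; the function names and the tail comparison at the end are assumptions made for illustration, not AutoDock code.

```python
import math
import random

def cauchy_deviate(alpha, beta, rng):
    """Cauchy-distributed random number with location alpha and scale beta."""
    u = rng.random()
    return alpha + beta * math.tan(math.pi * (u - 0.5))

def mutate(genes, rate, alpha=0.0, beta=1.0, rng=None):
    """Add a Cauchy deviate to each real-valued gene with probability `rate`."""
    rng = rng or random.Random()
    return [g + cauchy_deviate(alpha, beta, rng) if rng.random() < rate else g
            for g in genes]

# Compare how often each distribution produces a step larger than 3 units.
rng = random.Random(42)
n = 20000
cauchy_tail = sum(abs(cauchy_deviate(0.0, 1.0, rng)) > 3.0 for _ in range(n)) / n
gauss_tail = sum(abs(rng.gauss(0.0, 1.0)) > 3.0 for _ in range(n)) / n
```

The thicker tails mean that most mutations are small adjustments while an occasional one is a long jump to a distant part of the search space.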
Lamarckian Genetic
Algorithm
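The Lamarckian idea (phenotype written back to genotype) can be sketched as one extra step inside a GA generation: some individuals are improved by a local search and the improved values replace their genes. The sketch below is an assumption-laden illustration: `hill_climb` is a simple stand-in for the local search, and `lamarckian_step`, `ls_freq` and the toy energy are hypothetical names.

```python
import random

def hill_climb(f, x, step, iters, rng):
    """Stand-in local search: keep random Gaussian perturbations that lower f."""
    best, best_e = list(x), f(x)
    for _ in range(iters):
        cand = [xi + rng.gauss(0.0, step) for xi in best]
        e = f(cand)
        if e < best_e:
            best, best_e = cand, e
    return best

def lamarckian_step(pop, f, ls_freq, rng, step=0.1, iters=50):
    """Run local search on each individual with probability ls_freq and
    write the improved result back into its genes (phenotype -> genotype)."""
    out = []
    for genes in pop:
        if rng.random() < ls_freq:
            genes = hill_climb(f, genes, step, iters, rng)
        out.append(list(genes))
    return out

# Toy energy (assumption): sum of squares, minimum at the origin.
def sphere(x):
    return sum(xi * xi for xi in x)

rng = random.Random(3)
pop = [[rng.uniform(-5.0, 5.0) for _ in range(4)] for _ in range(10)]
new_pop = lamarckian_step(pop, sphere, ls_freq=1.0, rng=rng)
```

In a plain (Darwinian) GA the improved value would only affect the individual's score; here the improved coordinates themselves are inherited by the offspring.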
The Application
Milind Misra
Ligand
(AHA006)
(Source: PDB)
(Rasmol)
Initial X-ray crystallographic positions of protein and ligand (SYBYL)
Docking Preparation: Ligand
Assign charges
Define rotatable bonds
Rename aromatic carbons
Merge non-polar hydrogens
Write .pdbq ligand file
Docking Preparation: Protein
Add essential hydrogens
Load charges
Merge lone-pairs
Add solvation parameters
Write .pdbqs protein file
AutoDock uses grid-based docking
Ligand-protein interaction energies are precalculated and then used as a look-up table
(AutoDockTools)
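The look-up idea can be sketched in Python: energies are computed once on a regular 3D grid, and the energy at an arbitrary atom position is then read off by trilinear interpolation between the eight surrounding grid points. Everything named here is an assumption for illustration: `build_grid`, `lookup`, the grid size and spacing, and the linear `toy_energy`, which stands in for a real interaction energy only because interpolation of a linear function can be checked exactly.

```python
import math

def build_grid(energy_fn, origin, spacing, n_points):
    """Precompute energy_fn on an n_points^3 regular grid starting at origin."""
    ox, oy, oz = origin
    return [[[energy_fn(ox + i * spacing, oy + j * spacing, oz + k * spacing)
              for k in range(n_points)]
             for j in range(n_points)]
            for i in range(n_points)]

def lookup(grid, origin, spacing, point):
    """Trilinear interpolation of the precomputed grid at `point`."""
    n = len(grid)
    cells, fracs = [], []
    for c, o in zip(point, origin):
        t = (c - o) / spacing
        i = min(max(int(math.floor(t)), 0), n - 2)   # clamp to a valid cell
        cells.append(i)
        fracs.append(t - i)
    (i, j, k), (fx, fy, fz) = cells, fracs
    energy = 0.0
    for di, wx in ((0, 1.0 - fx), (1, fx)):
        for dj, wy in ((0, 1.0 - fy), (1, fy)):
            for dk, wz in ((0, 1.0 - fz), (1, fz)):
                energy += wx * wy * wz * grid[i + di][j + dj][k + dk]
    return energy

# Toy energy (assumption): linear in x, y, z.
def toy_energy(x, y, z):
    return x + 2.0 * y - z

origin, spacing = (0.0, 0.0, 0.0), 0.375
grid = build_grid(toy_energy, origin, spacing, 9)
e = lookup(grid, origin, spacing, (1.0, 2.0, 0.5))
```

The expensive energy function is evaluated only while the grid is built; each later evaluation during the search costs eight table reads and a few multiplications.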
Docking: Simulated Annealing
Runs = 100
Cycles = 50
Initial Temp (RT) = 1,000
Temp reduction factor = 0.95
Linear temperature reduction
Translation reduction factor = 1
Quaternion reduction factor = 1
Torsional reduction factor = 1
# rotatable bonds = 12
Initial coordinates = Random
Initial quaternion = Random
Initial dihedrals = Random
Translation step = 2.0
Quaternion step = 50 deg
Torsion step = 50 deg
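The random initial state listed above (coordinates, quaternion and dihedrals for a ligand with 12 rotatable bonds) could be generated along the following lines. This is only a sketch under stated assumptions: `random_state` and the search-box bounds are hypothetical, and the unit quaternion is obtained by normalizing four Gaussian numbers, a standard way to draw a uniformly random orientation.

```python
import math
import random

def random_state(n_torsions, box_min, box_max, rng):
    """Return (translation, unit quaternion, torsion angles in degrees)."""
    translation = [rng.uniform(lo, hi) for lo, hi in zip(box_min, box_max)]
    q = [rng.gauss(0.0, 1.0) for _ in range(4)]
    norm = math.sqrt(sum(c * c for c in q))
    quaternion = [c / norm for c in q]               # uniformly random orientation
    torsions = [rng.uniform(-180.0, 180.0) for _ in range(n_torsions)]
    return translation, quaternion, torsions

# Hypothetical 20 x 20 x 20 search box centered on the origin.
rng = random.Random(1)
translation, quaternion, torsions = random_state(
    12, (-10.0, -10.0, -10.0), (10.0, 10.0, 10.0), rng)
```

Each search method then perturbs these three groups of variables, with the translation, quaternion and torsion step sizes limiting how far a single move can go.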
Results:
100 different clusters
Energy range: -0.63 to +64,000
Conformation #81: -0.63
Conformation #67: +20.02
Conformation #68: +10.74
(SYBYL)
Close-up of previous
(SYBYL)
(SYBYL)
100 Clustered SA Conformations (gOpenMol)
Docking: Genetic Algorithm
Runs = 50
# Evaluations = 250,000
Population size = 50
Elitism count = 1
Mutation rate = 0.02
Crossover rate = 0.8
Window size = 10
Cauchy alpha = 0
Cauchy beta = 1
# rotatable bonds = 12
Initial coordinates = Random
Initial quaternion = Random
Initial dihedrals = Random
Translation step = 2.0
Quaternion step = 50 deg
Torsion step = 50 deg
Results:
50 different clusters
Energy range: -18.66 to +86.28
Conformation #39: -18.66
Conformation #9: -10.60
Docking: Local Search
Runs = 50
Solis-Wets iterations = 300
Consecutive successes = 4
Consecutive failures = 4
Rho = 1
Lower bound on rho = 0.01
LS frequency = 0.06
# rotatable bonds = 12
Initial coordinates = Random
Initial quaternion = Random
Initial dihedrals = Random
Translation step = 2.0
Quaternion step = 50 deg
Torsion step = 50 deg
Results:
18 different clusters
Energy range: +35.92 to +215,200
Confs #20, 21, 22, 23: +35.92
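The parameters above (iterations, consecutive successes and failures, rho and its lower bound) belong to a Solis-Wets style local search, which can be sketched as follows. This is an illustrative version under assumptions, not AutoDock's code: the steps are Gaussian with standard deviation rho, rho is doubled after enough consecutive successes and halved after enough consecutive failures, and the bias update constants are the commonly quoted ones.

```python
import random

def solis_wets(f, x0, max_its=300, max_succ=4, max_fail=4,
               rho=1.0, rho_lo=0.01, rng=None):
    """Adaptive random local search; returns (x, f(x)) with f(x) <= f(x0)."""
    rng = rng or random.Random()
    x, fx = list(x0), f(x0)
    bias = [0.0] * len(x)
    succ = fail = 0
    for _ in range(max_its):
        if rho < rho_lo:                       # step size too small: stop
            break
        dev = [rng.gauss(0.0, rho) + b for b in bias]    # biased random step
        fwd = [xi + d for xi, d in zip(x, dev)]
        f_fwd = f(fwd)
        if f_fwd < fx:                         # forward step improves
            x, fx = fwd, f_fwd
            bias = [0.2 * b + 0.4 * d for b, d in zip(bias, dev)]
            succ, fail = succ + 1, 0
        else:
            bwd = [xi - d for xi, d in zip(x, dev)]
            f_bwd = f(bwd)
            if f_bwd < fx:                     # opposite step improves
                x, fx = bwd, f_bwd
                bias = [b - 0.4 * d for b, d in zip(bias, dev)]
                succ, fail = succ + 1, 0
            else:                              # neither direction helps
                bias = [0.5 * b for b in bias]
                succ, fail = 0, fail + 1
        if succ >= max_succ:
            rho, succ = rho * 2.0, 0           # expand the step size
        elif fail >= max_fail:
            rho, fail = rho * 0.5, 0           # contract the step size
    return x, fx

# Toy energy (assumption): sum of squares, minimum at the origin.
def sphere(x):
    return sum(xi * xi for xi in x)

start = [3.0, -2.0]
x_end, f_end = solis_wets(sphere, start, rng=random.Random(7))
```

Since a step is kept only when it lowers the energy, the search can refine a nearby minimum but cannot climb out of it, which fits the high energies reported for local search on its own.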
Docking: Lamarckian GA
Runs = 10
Max # Evaluations = 250,000
Max # Generations = 27,000
Population size = 50
Elitism count = 1
Mutation rate = 0.02
Crossover rate = 0.8
Window size = 10
Cauchy alpha = 0
Cauchy beta = 1
Solis-Wets iterations = 300
Consecutive successes = 4
Consecutive failures = 4
Rho = 1
Lower bound on rho = 0.01
LS frequency = 0.06
* Gray options *
Results:
10 different clusters
Energy range: -18.10 to 8.38
Conformation #7: -18.10
(SYBYL)
(SYBYL)
References
http://cmgm.stanford.edu/biochem218/Projects%201998/Apaydin.pdf
http://www.biz.uiowa.edu/class/6K299_menczer/PPT/Hart/sld018.html
http://cs.felk.cvut.cz/~xobitko/ga/
http://www.bch.msu.edu/labs/kuhn/web/projects/screening/solvation.html
http://wwwcmc.pharm.uu.nl/gillies/thesis/
http://www.chem.uidaho.edu/~honors/boltz.html
S. Kumar et al. Protein Flexibility and Electrostatic Interactions. IBM Journal of Research and Development, Vol. 45 (2001).
G. Morris et al. Automated Docking Using a Lamarckian Genetic Algorithm and an Empirical Binding Free Energy Function. Journal of Computational Chemistry, Vol. 19, No. 14, 1639-1662 (1998).
C. Rosin et al. A Comparison of Global and Local Search Methods in Drug Docking. UCSD CSE Technical Report #CS97-522 (1997).
C. A. Sotriffer et al. Automated Docking of Ligands to Antibodies: Methods and Applications. Methods 20, 280-291 (2000).
M. Vieth et al. Assessing Search Strategies for Flexible Docking.
Practical Handbook of Genetic Algorithms. Edited by Lance Chambers.
An Introduction to Genetic Algorithms. Melanie Mitchell.
Goodsell and Olson. Prot. Struct. Func. Genet. 8, 195 (1990).
Principles of Biochemistry. Lehninger.
R. Durbin, S. Eddy, A. Krogh, G. Mitchison. Biological Sequence Analysis.
Wm. E. Hart. A Theoretical Comparison of Genetic Algorithms and Simulated Annealing. Sandia National Laboratories, www.cs.sandia.gov/~wehart.