
BIOINFORMATICS WORKSHOP FIESTA 2018

Practical: Molecular dynamics (MD) simulation on thymidylate kinase (TMPK) and its ligand (5HU)

This hands-on manual is an introductory practical on minimization and molecular dynamics
simulation using AmberTools. LEaP is used to set up the input files required to run
minimization and molecular dynamics simulation (using sander). In this practical, we
will run an MD simulation of thymidylate kinase (TMPK) from Mycobacterium tuberculosis and
its ligand, 5-hydroxymethyluridine-2'-deoxy-5'-monophosphate (5HU), as a control MD study.

Start this exercise by creating a folder named “TMPK” in your tutorial directory
(/home/osboxes/tutorial/TMPK).

Section 1: File preparation


This section shows the steps for creating the input files (prmtop and inpcrd files) with
xleap for an explicit-solvent MD simulation.

1. Before we begin, download the starting structure (PDB ID 1MRS) from the Protein Data
Bank (https://www.rcsb.org/) and save it as 1mrs.pdb, the filename used in the commands below.

2. Clean the PDB file using pdb4amber. At the same time, remove all crystallographic
water molecules (the --dry option).
> pdb4amber -i 1mrs.pdb -o 1mrs_new.pdb --dry

3. Next, edit 1mrs_new.pdb (using vim or gedit) to remove the SO4 and MG entries
(heteroatoms).

4. Copy the 5HU entries from 1mrs_new.pdb into a new file (5hu.pdb).
> grep 5HU 1mrs_new.pdb > 5hu.pdb

5. Add hydrogen atoms to 5HU by using the reduce command:
> reduce 5hu.pdb > 5hu_H.pdb

6. Run antechamber to generate parameter and coordinate files for 5HU. An explanation
of each parameter used can be found in the Amber manual.
> antechamber -i 5hu_H.pdb -fi pdb -o 5hu_H.mol2 -fo mol2 -c bcc -s 2

7. Check whether all parameters for 5HU are available; any missing ones are written to
the frcmod file:
> parmchk2 -i 5hu_H.mol2 -f mol2 -o 5hu_H.frcmod

8. Prepare its library in xleap:
> xleap
> source leaprc.gaff
> 5HU=loadmol2 5hu_H.mol2
> loadamberparams 5hu_H.frcmod
> check 5HU
> saveoff 5HU 5hu_H.lib
> saveamberparm 5HU 5hu_H.prmtop 5hu_H.inpcrd
> quit

9. Once we have prepared the ligand parameter file, we are ready to prepare the MD input
files. Open a terminal window and type “xleap” to start the xleap program.


10. Next, load the Amber ff14SB force field, the GAFF force field (Wang et al., 2004) and
the TIP3P water model in xleap, followed by the ligand library, the ligand parameters and
the protein structure, by typing the following:
> source leaprc.protein.ff14SB
> source leaprc.gaff
> source leaprc.water.tip3p
> loadoff 5hu_H.lib
> loadamberparams 5hu_H.frcmod
> TMPK=loadpdb 1mrs_new.pdb
> edit TMPK

This should cause the unit editor window to pop up. Right-click and drag to translate
the structure; hold down the Control key and move the mouse to rotate the structure;
hold down the Control key and right-click to zoom in and out. To exit the unit editor,
go to Unit > Close.

11. Next, check the system for close contacts or any major errors. As a rule of thumb
when using xleap, keep checking the structure until the program stops complaining.
This can be done simply by:
> check TMPK

12. Next, add counterions to neutralize the system using the following commands. The
“addions” command works by constructing a Coulombic potential on a 1 Å grid and placing
counterions one at a time at the most negative/positive position around the structure.
Putting “0” at the end of the command simply means “neutralize the system”.
> charge TMPK
> addions TMPK Cl- 0


This should add one chloride anion to counteract the +1 charge of the complex.

13. Before we continue with solvation, save the parameter/topology and coordinate files
(containing the ion) for later use.
> saveamberparm TMPK TMPK_ion.prmtop TMPK_ion.inpcrd

14. Next, solvate the entire complex in a truncated octahedral box of TIP3P water, with
a buffer of 10.0 Å around the complex:
> solvateoct TMPK TIP3PBOX 10.0

xleap then reports details of the solvated system. You can also view the complex in its
octahedral water box by using the “edit” command.

15. Finally, save the AMBER parameter/topology and coordinate files of the solvated system:
> saveamberparm TMPK TMPK_wat.prmtop TMPK_wat.inpcrd

Now that we have our input files, we can proceed to the next section, which covers
minimization and molecular dynamics simulation.
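
Step 3 above removes the SO4 and MG entries by hand in a text editor. For readers who prefer to script it, the following is a minimal Python sketch of the same filtering. It is not part of the original practical: the function name drop_residues is hypothetical, the sample records are made-up lines in PDB column layout (not taken from 1MRS), and it assumes the standard fixed-column PDB format in which the residue name sits in columns 18-20.

```python
def drop_residues(pdb_lines, unwanted=("SO4", "MG")):
    """Return PDB lines without ATOM/HETATM records of the unwanted residues."""
    kept = []
    for line in pdb_lines:
        record = line[:6].strip()
        # Fixed-column PDB format: the residue name occupies columns 18-20.
        resname = line[17:20].strip()
        if record in ("ATOM", "HETATM") and resname in unwanted:
            continue
        kept.append(line)
    return kept


# Made-up records for illustration only (coordinates are not from 1MRS).
sample = [
    "ATOM      1  N   MET A   1      11.104  13.207   2.100  1.00 20.00           N",
    "HETATM 1650 MG    MG A 301      20.000  15.000   9.000  1.00 30.00          MG",
    "HETATM 1651  S   SO4 A 302      25.000  18.000   4.000  1.00 40.00           S",
    "HETATM 1656  P   5HU A 303      18.000  12.000   7.000  1.00 25.00           P",
    "END",
]

for line in drop_residues(sample):
    print(line)
```

On a real file, the same function could be applied to the lines read from 1mrs_new.pdb and the result written back out; the ligand (5HU) records are kept because only the listed residue names are dropped.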


Section 2: Minimization
The next stage is to minimize the system prior to performing MD simulation, in order to
remove bad steric clashes and contacts introduced by solvation. We will run two
minimization stages in this section: the first with restraints on the solute and the
second without restraints.

1. Minimization stage 1: restrained complex (solute)

This stage runs 500 steps of steepest descent followed by 500 steps of conjugate gradient
minimization, with a restraint force constant of 300 kcal/(mol·Å²) on the complex (solute).
You are advised to refer to the AMBER manual for further explanation of the parameters used.

min1.in
complex: initial minimization: fixed solute, free solvent and ion
&cntrl
imin=1, maxcyc=1000, ncyc=500, ntb=1, ntr=1, cut=10
/
Hold complex (protein+ion+ligand) fixed
300.0
RES 1 210
END
END

Perform minimization using the following command:

> sander -O -i min1.in -o min1.out -c TMPK_wat.inpcrd -p TMPK_wat.prmtop -r min1.rst -ref TMPK_wat.inpcrd &

Putting “&” at the end of the command runs the job in the background of the terminal.

Input file: min1.in, TMPK_wat.inpcrd, TMPK_wat.prmtop


Output file: min1.out, min1.rst

2. Minimization stage 2: minimize the entire system

Now that we have minimized the water and neutralizing ions in the system, we proceed to
the next stage, which minimizes the entire system. In this case, we will run 1000 steps of
steepest descent followed by 1000 steps of conjugate gradient minimization without any
restraints. The input file is as follows:
min2.in
complex: second stage minimization : free all
&cntrl
imin=1, maxcyc=2000, ncyc=1000, ntb=1, ntr=0, cut=10
/

Perform minimization using the following command:

> sander -O -i min2.in -o min2.out -c min1.rst -p TMPK_wat.prmtop -r min2.rst &

Input file: min2.in, min1.rst, TMPK_wat.prmtop


Output file: min2.out, min2.rst
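
The relation between maxcyc, ncyc and the step counts quoted above is easy to misread: maxcyc is the total number of minimization cycles, and ncyc is the cycle at which sander switches from steepest descent to conjugate gradient. The short Python sketch below spells out that arithmetic for the two input files. It assumes sander's default minimization method (ntmin=1), which neither input file overrides; the function name is hypothetical.

```python
def minimization_schedule(maxcyc, ncyc):
    """Split a minimization into steepest-descent and conjugate-gradient steps.

    Assumes the default ntmin=1: steepest descent for the first ncyc cycles,
    then conjugate gradient for the remaining maxcyc - ncyc cycles.
    """
    if not 0 <= ncyc <= maxcyc:
        raise ValueError("ncyc must lie between 0 and maxcyc")
    return {"steepest_descent": ncyc, "conjugate_gradient": maxcyc - ncyc}


print("min1.in:", minimization_schedule(maxcyc=1000, ncyc=500))
print("min2.in:", minimization_schedule(maxcyc=2000, ncyc=1000))
```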


Section 3: MD simulation: Heating, equilibration, production


In this section we start the MD simulation. We begin with a heating stage (initial
temperature of 0 K and final temperature of 300 K), continue with an equilibration stage
and finish with an NPT production stage. A cut-off of 10 Å and the Particle Mesh Ewald
(PME) method will be used to treat short- and long-range interactions, respectively. The
SHAKE algorithm will also be turned on throughout the simulation to constrain bonds
involving hydrogen atoms, which allows the equations of motion to be integrated with a
2 fs time step. An explanation of each parameter used here can be found in the AMBER manual.

1. Prepare input file for heating as follows:

md1.in
complex: heating stage (20ps)
&cntrl
imin=0, irest=0, ntx=1, ntb=1, cut=10, ntr=1, ntc=2, ntf=2, tempi=0.0, temp0=300.0, ntt=3,
gamma_ln=1.0, nstlim=10000, dt=0.002, ntpr=500, ntwx=500, ntwr=500
/
Hold complex fixed
100.0
RES 1 210
END
END

Perform heating using the following command:

> sander -O -i md1.in -o md1.out -c min2.rst -p TMPK_wat.prmtop -r md1.rst -x md1.mdcrd -ref min2.rst &

input file: md1.in, min2.rst, TMPK_wat.prmtop


output file: md1.out, md1.rst, md1.mdcrd

2. Prepare input file for NVT equilibration.

md2.in
complex: equilibration stage (60ps)
&cntrl
imin=0, irest=1, ntx=7, ntb=1, cut=10, ntr=0, ntc=2, ntf=2, tempi=300.0, temp0=300.0, ntt=3,
gamma_ln=1.0, nstlim=30000, dt=0.002, ntpr=500, ntwx=500, ntwr=500
/

Perform equilibration using the following command:

> sander -O -i md2.in -o md2.out -c md1.rst -p TMPK_wat.prmtop -r md2.rst -x md2.mdcrd &

input file: md2.in, md1.rst, TMPK_wat.prmtop


output file: md2.out, md2.rst, md2.mdcrd

3. Prepare input file for NPT production.

md3.in
complex: production stage (200ps)
&cntrl
imin=0, irest=1, ntx=7, ntb=2, cut=10, ntr=0, ntc=2, ntf=2, tempi=300.0, temp0=300.0, ntt=3,
gamma_ln=1.0, ntp=1, pres0=1.0, nstlim=100000, dt=0.002, ntpr=500, ntwx=500, ntwr=500
/


We will run 5 production simulations one after another, using the restart file from the
previous run as the input for the next run. You can run each of the jobs manually from the
command line as below (with the relevant filenames incremented by one each time):

> sander -O -i md3.in -o md3.out -c md2.rst -p TMPK_wat.prmtop -r md3.rst -x md3.mdcrd &

However, it is probably best to write a simple script that runs all of these jobs, so that
we can leave it running overnight:

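run.x
#!/bin/csh
# Run production jobs md3 to md7 in sequence; each job starts from the
# restart file written by the previous job.
set AMBERHOME="/usr/local/amber18"
set START=3
set END=7
set CURRENT=$START
set INPUT=0

echo -n "Starting script at: "
date
echo ""

while ( $CURRENT <= $END )
    echo -n "Job $CURRENT started at: "
    date
    # The previous job number gives the name of the input restart file
    @ INPUT = $CURRENT - 1
    $AMBERHOME/bin/sander -O -i md3.in -o md$CURRENT.out -p TMPK_wat.prmtop \
        -c md$INPUT.rst -r md$CURRENT.rst -x md$CURRENT.mdcrd
    # Compress the finished trajectory to save disk space
    gzip -9 -v md$CURRENT.mdcrd
    echo -n "Job $CURRENT finished at: "
    date
    @ CURRENT = $CURRENT + 1
end
echo "JOBS DONE"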

To be able to run this script, we first need to make the file executable:
> chmod +x run.x
To execute the script in the background and collect its messages in run.log, simply type:
> ./run.x >& run.log &
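
For readers more comfortable with Python than csh, the chaining logic of run.x can be sketched as follows. This is an illustrative sketch rather than part of the original practical: the helper name sander_command is hypothetical, the file names follow the convention used above, and the code only builds and prints the argument lists, it does not launch sander.

```python
def sander_command(job, sander="sander", mdin="md3.in", prmtop="TMPK_wat.prmtop"):
    """Build the sander argument list for production job number `job`.

    Job N reads the restart file of job N-1 and writes its own output,
    restart and trajectory files, mirroring the loop in run.x.
    """
    previous = job - 1
    return [
        sander, "-O",
        "-i", mdin,
        "-o", f"md{job}.out",
        "-p", prmtop,
        "-c", f"md{previous}.rst",
        "-r", f"md{job}.rst",
        "-x", f"md{job}.mdcrd",
    ]


# Jobs 3 to 7, as in run.x (START=3, END=7).
for job in range(3, 8):
    print(" ".join(sander_command(job)))
```

To actually run the jobs, each list could be passed to subprocess.run(command, check=True) inside the loop, so that a failed job stops the chain instead of feeding a missing restart file to the next one.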

Here are the output files:

Simulation time (ps)   mdout     restart   mdcrd
0-20                   md1.out   md1.rst   md1.mdcrd
20-80                  md2.out   md2.rst   md2.mdcrd
80-280                 md3.out   md3.rst   md3.mdcrd
280-480                md4.out   md4.rst   md4.mdcrd
480-680                md5.out   md5.rst   md5.mdcrd
680-880                md6.out   md6.rst   md6.mdcrd
880-1080               md7.out   md7.rst   md7.mdcrd
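
The time ranges in this table follow directly from the input files: each stage lasts nstlim × dt picoseconds, and the stages are run back to back. The Python sketch below reproduces the arithmetic using the nstlim and dt values from md1.in, md2.in and md3.in; the function name timeline is hypothetical.

```python
def timeline(stages):
    """Return (name, start_ps, end_ps) for consecutive MD stages.

    `stages` is a list of (name, nstlim, dt_in_ps) tuples; each stage
    starts where the previous one ended.
    """
    rows = []
    start = 0.0
    for name, nstlim, dt in stages:
        end = start + nstlim * dt
        rows.append((name, round(start), round(end)))
        start = end
    return rows


stages = [("md1", 10000, 0.002), ("md2", 30000, 0.002)]
# The production input md3.in (nstlim=100000) is reused for jobs md3 to md7.
stages += [(f"md{job}", 100000, 0.002) for job in range(3, 8)]

for name, start, end in timeline(stages):
    print(f"{start}-{end} ps  {name}.out")
```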


Section 4: Analysis
This section covers basic analysis of the MD simulations. We monitor the properties of the
system to check the quality of its equilibration.

1. Let’s start by analyzing our output (mdout) files.

> mkdir output
> cd output
> $AMBERHOME/bin/process_mdout.perl ../md1.out ../md2.out ../md3.out ../md4.out ../md5.out ../md6.out ../md7.out

This will give you a series of summary files in the output folder. Let’s plot some of these
files to see whether the system has reached equilibrium.

> xmgrace summary.EPTOT summary.EKTOT summary.ETOT

The plotting program xmgrace is used here to plot the potential energy, kinetic energy
and total energy.

a. Based on the graph, what can you observe regarding the energy profile of your
system?
b. Why do the energies rise at the beginning of the simulation, and why do they
decrease after some time?
c. Do you observe a plateau in your graph? Why does that happen?

Next we plot the temperature, pressure, volume and density graphs:


> xmgrace summary.TEMP
> xmgrace summary.PRES


The volume and density files cannot be used directly because the first 80 ps contain no
volume or density information: during constant-volume simulations, the volume of the box
is not written to the output file. Therefore, we need to modify the files by removing the
first 81 lines of the summary.VOLUME and summary.DENSITY files. You can use any text
editor to do this; in this case, we will use vi:

> vi summary.VOLUME
d80 (enter) (this will remove the current (first) line and the next 80 lines of the file)
<Esc>:wq summary.VOLUME_modified (enter)
> xmgrace summary.VOLUME_modified

(Repeat the same method for the summary.DENSITY file.)
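
If you would rather not edit the files interactively, the same trimming can be scripted. The following is a small Python sketch, not part of the original practical, that drops a given number of leading lines from a text; the function name is hypothetical and the demo string stands in for the contents of a summary file.

```python
def drop_leading_lines(text, n):
    """Return `text` without its first n lines (line endings are preserved)."""
    lines = text.splitlines(keepends=True)
    return "".join(lines[n:])


# Tiny stand-in for a summary file: three lines, of which the first two are dropped.
demo = "line 1\nline 2\nline 3\n"
print(drop_leading_lines(demo, 2), end="")
```

For the files in this practical, the text read from summary.VOLUME (or summary.DENSITY) would be passed in with n=81 and the result written to summary.VOLUME_modified, leaving the original file untouched.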

a. Based on the temperature and pressure graphs, what can you observe regarding their
profiles?
b. In the pressure graph, do you observe zero pressure at the beginning? Why is that so?
c. Do you observe wild fluctuations in your pressure graph? When does the mean
pressure (1 atm) start to stabilize?

d. What can you observe regarding the volume and density profiles at the beginning of
the simulation?
e. At what simulation time do your volume and density reach a plateau?

2. Analyse trajectory
One way to check whether our system is behaving reasonably is to monitor the root mean
square deviation (RMSD) from the starting structure. We can use cpptraj to calculate the
RMSD values as a function of simulation time. In this case, we are interested in the
backbone of our protein structure, and therefore we will only consider the backbone atoms
C, CA and N.

Before we can do the RMSD analysis, we need to prepare an input file for cpptraj. The same
input also re-images the whole trajectory and removes the water molecules from the
trajectory file (we are not interested in water molecules in this practical). We will also
combine our trajectories into one file so that we can view it.

image_rmsd.in
trajin md1.mdcrd
trajin md2.mdcrd
trajin md3.mdcrd.gz
trajin md4.mdcrd.gz
trajin md5.mdcrd.gz
trajin md6.mdcrd.gz
trajin md7.mdcrd.gz
trajout md1-7_nowat.mdcrd
rms first out rmsd.out @C,CA,N time 1.0
center :1-210
image familiar
strip :WAT

> cpptraj ../TMPK_wat.prmtop < image_rmsd.in > log &

output file: md1-7_nowat.mdcrd, rmsd.out, log

You can view the RMSD using xmgrace:


> xmgrace rmsd.out
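
Besides inspecting the plot by eye, a quick way to judge where the RMSD levels off is to average it over consecutive blocks of frames and see when the block means stop changing. The Python sketch below illustrates the idea on made-up numbers. It assumes that rmsd.out is a whitespace-separated two-column file (time, RMSD) whose header line starts with “#”; check your own file before relying on this. The function names are hypothetical.

```python
def read_xy(text):
    """Parse two-column data, skipping blank lines and '#' header lines."""
    points = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        columns = line.split()
        points.append((float(columns[0]), float(columns[1])))
    return points


def block_means(points, block):
    """Mean of the second column over consecutive blocks of `block` rows."""
    values = [y for _, y in points]
    means = []
    for i in range(0, len(values), block):
        chunk = values[i:i + block]
        means.append(sum(chunk) / len(chunk))
    return means


# Made-up values for illustration only, not simulation results.
demo = "#Frame RMSD\n1.0 0.0\n2.0 1.0\n3.0 2.0\n4.0 2.0\n"
print(block_means(read_xy(demo), block=2))
```

With one frame per picosecond (ntwx=500 and dt=0.002), a block of 100 rows corresponds to 100 ps of simulation.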


a. Based on the RMSD graph, what can you observe regarding the profile?
b. At what simulation time does the RMSD of the system reach a plateau?
c. What can you conclude based on the RMSD graph?

3. Viewing trajectory
We can view our trajectory using any compatible viewer. In this practical, we will be using
vmd. To load vmd, simply type vmd at the terminal window:
> vmd

Before we can load the trajectory, we need to load the parameter file first:
- File > New Molecule > Browse for TMPK_ion.prmtop, make sure the file type is
“AMBER7 Parm”, then click Load.

Next, load the trajectory file (make sure it is loaded into the “TMPK_ion.prmtop” molecule)
by browsing for md1-7_nowat.mdcrd, make sure the file type is “AMBER Coordinates with
Periodic Box”, then click Load.

And this is it; you can see the trajectory in the vmd window. You can view the trajectory and
try to understand what is happening during the simulation time. You can even change the
representation mode and render its colour to make it more interesting.
