
SIX1008 BIOCOMPUTING

SEMESTER I, ACADEMIC SESSION 2019/2020

SIX1008 BIOCOMPUTING
GROUP PROJECT

GROUP: 1

SEMESTER: 1

ACADEMIC SESSION: 2019/2020

COURSE INSTRUCTOR: ASSOCIATE PROFESSOR DR. TEOH TEOW CHONG

DEADLINE: 27th of December, 2019

TEAM MEMBERS

NAME MATRIC NUMBER

1. NUR ATIKAH BINTI AZALI SIF160058

2. AMIRAH BINTI BASIR SIF170004

3. MAISARAH BINTI SAIHUN SIF170017

4. LIM CHI CHING SIQ170012

5. PUA WEI YI SIP180017

1. Compute the pair sequence alignment using generalGap.py and localAlign.py scripts for the
following sequences: [10 marks]

a) RNASEQUENCE

CONSEQUENCE

generalGap.py

localAlign.py

b) CONSIGNMENT

ASSIGNMENT

generalGap.py

localAlign.py

c) SIMILAR

SINGULAR

generalGap.py

localAlign.py

d) THISISTHETHESIS

HYPOTHESIS

generalGap.py

localAlign.py

e) AFFINITY

IDENTITY

generalGap.py

localAlign.py
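
The generalGap.py and localAlign.py scripts themselves are not reproduced in this report, so the following is only a minimal sketch of local (Smith-Waterman) alignment for the word pairs in Question 1. The function name local_align, the scoring values (match +2, mismatch -1) and the linear gap penalty (-2) are assumptions chosen for illustration; the course scripts may use a different scoring scheme or gap model, so their output need not match this sketch.

```python
def local_align(a, b, match=2, mismatch=-1, gap=-2):
    """Smith-Waterman local alignment with a linear gap penalty.

    Scoring values are illustrative assumptions, not those of the
    course scripts. Returns (best_score, aligned_a, aligned_b).
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the score matrix; cells are floored at zero (local alignment).
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                0,
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until a zero-score cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


print(local_align("RNASEQUENCE", "CONSEQUENCE"))
# → (16, 'SEQUENCE', 'SEQUENCE')
```

With these assumed scores, the best local alignment for pair (a) is the shared suffix SEQUENCE. The other pairs can be run the same way, for example local_align("CONSIGNMENT", "ASSIGNMENT").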

2. By using the Protein BLAST at https://blast.ncbi.nlm.nih.gov/Blast.cgi, perform the local
sequence alignment for the amino acid sequences below, and identify the proteins. [10 marks]

>Q1

ADLELERAADVRWEEQAEISGSSPTLSITISEDGSMSIKNEEEEQTLGGGGTGGGGAGVLWDVP
SPPPVGKAELEDGAYRIKQKGILGYSQIGAGVYKEGTFHTMWHVTRGAVLMHKGKRIEPSWADK
KDLISYGGGWKLEGEWKEGEEVQVLALEPGKNPRAVQTKPGLFKTNTGTIGAVSLDFSPGTSGS
PIVDKKGKVVGLYGNGVVTRSGTYVSAIAQTERSIEDNPEIEDDIFRKRKL

A) Protein Identified: Chain A, FLAVIVIRUS_NS2B/Peptidase S7 [Dengue virus 2]

Score 466 bits (1199)

Expect 2e-165

Method Compositional matrix adjust.

Identities 235/241 (98%)

Positives 237/241 (98%)

Gaps 1/241 (0%)

B) Protein Identified: Chain A, Ns2b-ns3 Protease [Dengue virus 2]

Score 466 bits (1198)

Expect 2e-165

Method Compositional matrix adjust.

Identities 235/241 (98%)

Positives 237/241 (98%)

Gaps 1/241 (0%)

C) Protein Identified: Chain A, Ns2b-ns3 Protease [Dengue virus 2]

Score 465 bits (1196)

Expect 6e-165

Method Compositional matrix adjust.

Identities 234/241 (97%)

Positives 236/241 (97%)

Gaps 1/241 (0%)
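
The Identities, Positives and Gaps lines in each hit give a count over the alignment length, followed by a whole-number percentage as displayed by BLAST. As a small sketch, the exact ratio can be recomputed from a copied line. The helper name parse_fraction and the regular expression are assumptions made for illustration and are not part of BLAST.

```python
import re

# Matches lines such as "Identities 235/241 (98%)" or "Gaps 26/262(9%)".
HIT_LINE = re.compile(r"^(Identities|Positives|Gaps)\s+(\d+)/(\d+)\s*\((\d+)%\)")


def parse_fraction(line):
    """Return (label, count, alignment_length, exact_percent, shown_percent)."""
    m = HIT_LINE.match(line.strip())
    if m is None:
        raise ValueError(f"not an Identities/Positives/Gaps line: {line!r}")
    label = m.group(1)
    count, length, shown = int(m.group(2)), int(m.group(3)), int(m.group(4))
    exact = 100.0 * count / length  # exact percentage, not BLAST's displayed integer
    return label, count, length, exact, shown


label, count, length, exact, shown = parse_fraction("Identities 235/241 (98%)")
print(f"{label}: {count}/{length} = {exact:.2f}% (shown as {shown}%)")
```

For the top Q1 hit, 235 identical positions over an alignment length of 241 is about 97.51%, which BLAST displays as 98%.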

>Q2

GSHMVDMYIERAGDITWEKDAEVTGNSPRLDVALDESGDFSLVEDDGPPMAGGGGSGGGGSGAL
WDVPAPKEVKKGETTDGVYRVMTRGLLGSTQVGVGVMQEGVFHTMWHVTKGSALRSGEGRLDPY
WGDVKQDLVSYCGPWKLDAAWDGHSEVQLLAVPPGERARNIQTLPGIFKTKDGDIGAVALDYPA
GTSGSPILDKCGRVIGLYGNGVVIKNGSYVSAITQGRR

A) Protein Identified: Chain A, Ns2b-ns3 Protease, ns2b-ns3 Protease [Zika virus]

Score 461 bits (1187)

Expect 4e-164

Method Compositional matrix adjust.

Identities 230/230 (100%)

Positives 230/230 (100%)

Gaps 0/230 (0%)

B) Protein Identified: Chain A, Ns2b-ns3 Protease Chimera, genome Polyprotein [Zika virus]

Score 450 bits (1158)

Expect 2e-159

Method Compositional matrix adjust.

Identities 224/226 (99%)

Positives 225/226 (99%)

Gaps 0/226 (0%)

>Q3

GSVVIVGRIILGSGSAPITAYAQQTRGLFGTIVTSLTGRDKNVVTGEVQVLSTATQTFLGTTVG
GVMWTVYHGAGSRTLAGTKHPALQMYTNVDQDLVGWPAPPGAKSLEPCTCGSADLYLITRDADV
IPARRRGDSTASLLSPRPLACLKGSSGGPVMCPAGHVAGIFRAAVCTRGVAKALQFIPVETLST
QARS

A) Protein Identified: protease-helicase, partial [Hepacivirus C]


Score 363 bits (931)

Expect 2e-126

Method Compositional matrix adjust.

Identities 178/181 (98%)

Positives 180/181 (99%)

Gaps 0/181 (0%)

B) Protein Identified: protease-helicase, partial [Hepacivirus C]


Score 362 bits (930)

Expect 3e-126

Method Compositional matrix adjust.

Identities 177/181 (98%)

Positives 180/181 (99%)

Gaps 0/181 (0%)

C) Protein Identified: NS3 protease, partial [Hepacivirus C]

Score 362 bits (930)

Expect 3e-126

Method Compositional matrix adjust.

Identities 177/181 (98%)

Positives 180/181 (99%)

Gaps 0/181 (0%)

>Q4

TDMWLERAADISWEMDAAITGSSRRLDVKLDDDGDFHLIDDPGVPWKGGGGSGGGGGGVFWDTP
SPKPCSKGDTTTGVYRIMARGILGTYQAGVGVMYENVFHTLWHTTRGAAIMSGEGKLTPYWGSV
KEDRIAYGGPWRFDRKWNGTDDVQVIVVEPGKAAVNIQTKPGVFRTPFGEVGAVSLDYPRGTSG
SPILDSNGDIIGLYGNGVELGDGSYVSAIVQGDRQEEPVPEAYT

A) Protein Identified: polyprotein, partial [Japanese encephalitis virus]

Score 447 bits (1151)

Expect 5e-151

Method Compositional matrix adjust.

Identities 227/262 (87%)

Positives 228/262 (87%)

Gaps 26/262 (9%)

B) Protein Identified: polyprotein [Japanese encephalitis virus]

Score 456 bits (1173)

Expect 3e-144

Method Compositional matrix adjust.

Identities 229/262 (87%)

Positives 229/262 (87%)

Gaps 26/262 (9%)

C) Protein Identified: polyprotein [Japanese encephalitis virus]

Score 454 bits (1169)

Expect 1e-143

Method Compositional matrix adjust.

Identities 228/262 (87%)

Positives 228/262 (87%)

Gaps 26/262 (9%)

>Q5

TDMWIERTADITWESDAEITGSSERVDVRLDDDGNFQLMNDPGAPWKGGGGSGGGGGGVLWDTP
SPKEYKKGDTTTGVYRIMTRGLLGSYQAGAGVMVEGVFHTLWHTTKGAALMSGEGRLDPYWGSV
KEDRLCYGGPWKLQHKWNGHDEVQMIVVEPGKNVKNVQTKPGVFKTPEGEIGAVTLDYPTGTSG
SPIVDKNGDVIGLYGNGVIMPNGSYISAIVQGERMEEPAPAGFEPEMLR

A) Protein Identified: CF40GlyNS3pro protein [synthetic construct]

Score 475 bits (1223)

Expect 5e-169

Method Compositional matrix adjust.

Identities 234/240 (98%)

Positives 238/240 (99%)

Gaps 0/240 (0%)

B) Protein Identified: Chain A, Serine Protease Subunit Ns2b, Serine Protease Ns3 [West Nile virus]

Score 454 bits (1169)

Expect 4e-161

Method Compositional matrix adjust.

Identities 225/226 (99%)

Positives 225/226 (99%)

Gaps 0/226 (0%)

C) Protein Identified: NS2-NS3 protein [West Nile virus]

Score 469 bits (1206)

Expect 2e-159

Method Compositional matrix adjust.

Identities 232/267 (87%)

Positives 232/267 (86%)

Gaps 26/267 (9%)

3. Download a protein databank (pdb) file for HIV-1 gp120-CD4 protein structure at
https://www.rcsb.org/, by using molecular viewer of RasWin, print the molecular rendering for:
[5 marks]

a) Wireframe

b) Sticks

c) Spacefill

d) Ball & Stick

e) Cartoons
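
The report does not state which PDB entry was used for the renderings above. As a hedged sketch of the download step, the snippet below builds the RCSB file URL for a given four-character PDB ID and saves the file locally. The entry 1GC1 (HIV-1 gp120 core in complex with CD4 and an antibody fragment) is used purely as an assumed example, and the helper names rcsb_pdb_url and download_pdb are hypothetical.

```python
from urllib.request import urlretrieve


def rcsb_pdb_url(pdb_id):
    """Build the RCSB download URL for a four-character PDB ID."""
    pdb_id = pdb_id.strip().upper()
    if len(pdb_id) != 4 or not pdb_id.isalnum():
        raise ValueError(f"not a four-character PDB ID: {pdb_id!r}")
    return f"https://files.rcsb.org/download/{pdb_id}.pdb"


def download_pdb(pdb_id, dest=None):
    """Download the PDB file (requires network access) and return its path."""
    url = rcsb_pdb_url(pdb_id)
    dest = dest or f"{pdb_id.strip().upper()}.pdb"
    urlretrieve(url, dest)
    return dest


# Assumed example entry; the report does not name the one it used.
print(rcsb_pdb_url("1gc1"))
# → https://files.rcsb.org/download/1GC1.pdb
```

Calling download_pdb("1gc1") would save 1GC1.pdb in the working directory, which can then be opened in RasWin to produce the five renderings.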

Due date: 27/12/2019, Friday

Format: Hardcopy submitted to Ms. Devi.
