BMTech-BioInformatics-Revision Session
ACTCTTCTGGTCCCCACAGACTCAGAGAGAACCCACCATGGTGCTGTCTCCTGCCGACAAGACCAACGTCAAGG
CCGCCTGGGGTAAGGTCGGCGCGCACGCTGGCGAGTATGGTGCGGAGGCCCTGGAGAGGATGTTCCTGTCCT
TCCCCACCACCAAGACCTACTTCCCGCACTTCGACCTGAGCCACGGCTCTGCCCAGGTTAAGGGCCACGGCAAG
AAGGTGGCCGACGCGCTGACCAACGCCGTGGCGCACGTGGACGACATGCCCAACGCGCTGTCCGCCCTGAGC
GACCTGCACGCGCACAAGCTTCGGGTGGACCCGGTCAACTTCAAGCTCCTAAGCCACTGCCTGCTGGTGACCCT
GGCCGCCCACCTCCCCGCCGAGTTCACCCCTGCGGTGCACGCCTCCCTGGACAAGTTCCTGGCTTCTGTGAGCA
CCGTGCTGACCTCCAAATACCGTTAAGCTGGAGCCTCGGTGGCCATGCTTCTTGCCCCTTGGGCCTCCCCCCAGC
CCCTCCTCCCCTTCCTGCACCCGTACCCCCGTGGTCTTTGAATAAAGTCTGAGTGGGCGGC
1. Search the sample sequence given below against the protein database (nr) using the
BLASTX program. Answer the questions that follow.
2. For database hit NM_001042626.1, in which frame of the query sequence does the
alignment begin?
+2/+2
3. At which nucleotide of the query sequence does the frame change?
No frame change
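As a rough cross-check of the reading frame, the forward frame of the first ATG in the query can be computed and the sequence translated from there. This is only a minimal sketch: `SAMPLE_START` is the first line of the sample sequence, the function names are illustrative, and the frame of the first ATG is a heuristic that need not coincide with where a BLASTX alignment actually begins.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# First line of the sample sequence from this worksheet.
SAMPLE_START = (
    "ACTCTTCTGGTCCCCACAGACTCAGAGAGAACCCACCATGGTGCTGTCTCCTGCCGACAAGACCAACGTCAAGG"
)


def forward_frame_of_first_atg(seq):
    """Return the forward frame (1, 2 or 3) holding the first ATG, or None."""
    index = seq.upper().find("ATG")
    if index == -1:
        return None
    return index % 3 + 1


def translate(seq):
    """Translate complete codons from the start of seq; unknown codons become X."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


frame = forward_frame_of_first_atg(SAMPLE_START)
peptide = translate(SAMPLE_START[SAMPLE_START.find("ATG"):])
print("first ATG is in frame +%d" % frame)
print(peptide)
```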
4. Using what you learnt about navigating through the NCBI database, find the
nucleotide sequence corresponding to the protein with the accession
NM_001042626.1. Describe the steps you took to find it.
Go to NCBI, click blastx and then tblastx, copy the sample sequence, paste it into the BLASTX
search box, and run BLAST. Click on the correct accession number, check the frame (+ or -),
and paste the sequence with the frameshift.
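The same record can also be retrieved without the web interface, through the NCBI E-utilities `efetch` endpoint. The sketch below is one possible way to do this, not the method the worksheet asks for; the function names are illustrative, and the `nuccore` database is assumed because NM_ accessions are RefSeq mRNA records.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_fasta_url(accession, db="nuccore"):
    """Build the efetch URL that returns a record as FASTA text."""
    params = {"db": db, "id": accession, "rettype": "fasta", "retmode": "text"}
    return EFETCH_URL + "?" + urlencode(params)


def fetch_fasta(accession, db="nuccore"):
    """Download the FASTA record (needs network access)."""
    with urlopen(efetch_fasta_url(accession, db)) as response:
        return response.read().decode("utf-8")


print(efetch_fasta_url("NM_001042626.1"))
# To download the record itself: fasta_text = fetch_fasta("NM_001042626.1")
```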
5. Make a local alignment of the nucleotide sequence from 4. with the sample
sequence. Download your alignment and paste it below.
>NM_001042626.1 Pan troglodytes hemoglobin subunit alpha 1 (HBA1), mRNA
ACTCTTCTGGTCCCCACAGACTCAGAAAGAACCCACCATGGTGCTGTCTCCTGCCGACAAGACCAACGTCAAGGCCGCCT
GGGGTAAGGTCGGCGCGCACGCTGGCGAGTATGGTGCGGAGGCCCTGGAGAGGATGTTCCTGTCCTTCCCCACCACCAAG
ACCTACTTCCCCCACTTCGACCTGAGCCACGGCTCTGCCCAGGTTAAGGGTCACGGCAAGAAGGTGGCCGACGCGCTGAC
CAACGCCGTGGCGCACGTGGACGACATGCCCAACGCGCTGTCCGCCCTGAGTGACCTGCACGCGCACAAGCTTCGGGTGG
ACCCGGTCAACTTCAAGCTCCTAAGCCACTGCCTGCTGGTGACCCTGGCCGCCCACCTCCCCGCCGAGTTCACCCCTGCG
GTGCACGCCTCCCTGGACAAGTTCCTGGCTTCTGTGAGCACCGTGCTGACCTCCAAATACCGTTAAGCTGGAGCCTCGGT
GGCCATGCTTCTTGCCCCTTGGGCCTCTCGCCAGGCCCTCCTCTCCTTCCTGCACCTGTACCCCCCCTGGTCTTTGAATA
AAGTCTGAGTGGGCGGC
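To illustrate what a local alignment does, here is a minimal Smith-Waterman sketch in plain Python. It is not the tool used to produce the alignment for this question; the scoring values (match 2, mismatch -1, linear gap -2) are assumptions chosen for illustration, and the two short inputs are 14-nucleotide excerpts taken from the start of the sample sequence and of NM_001042626.1.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best score, aligned a, aligned b) for the best local alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the scoring matrix; cells never drop below zero (local alignment).
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                0,
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until a zero cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


# Excerpts (positions 20-33) of the sample sequence and of NM_001042626.1.
sample_part = "ACTCAGAGAGAACC"
hit_part = "ACTCAGAAAGAACC"
best_score, aligned_sample, aligned_hit = smith_waterman(sample_part, hit_part)
print(best_score)
print(aligned_sample)
print(aligned_hit)
```

For full-length sequences a dedicated aligner is the practical choice; this quadratic-time version is only meant to show how the score matrix and traceback work.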
6. Highlight the nucleotide that is bringing about the frameshift in the alignment you
have pasted in 5.
No frame shift
7. What other differences are there between the two nucleotide sequences?
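The two sequences as pasted here are not identical. They differ at several single nucleotides,
for example at position 27 (G in the sample sequence, A in NM_001042626.1).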
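A quick way to check an answer like this is to compare the two sequences position by position. The sketch below does so for the first 40 nucleotides of each sequence as they appear in this worksheet; the function name is illustrative, and the simple comparison assumes the two stretches are already aligned without gaps.

```python
def list_mismatches(a, b):
    """Return (1-based position, base in a, base in b) for every mismatch."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return [
        (pos, x, y)
        for pos, (x, y) in enumerate(zip(a, b), start=1)
        if x != y
    ]


# First 40 nucleotides of the sample sequence and of NM_001042626.1.
SAMPLE_40 = "ACTCTTCTGGTCCCCACAGACTCAGAGAGAACCCACCATG"
HIT_40 = "ACTCTTCTGGTCCCCACAGACTCAGAAAGAACCCACCATG"

for pos, sample_base, hit_base in list_mismatches(SAMPLE_40, HIT_40):
    print("position %d: sample %s, hit %s" % (pos, sample_base, hit_base))
```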
Triple QxxK/R motif-containing protein and protein kinase C-binding protein 1 are
structurally similar. Given structural similarity, you would expect to find sequence similarity.
However, the two share little sequence identity. We will use PSI-BLAST to find the sequence.
1. Using a sequence database of your choice retrieve the protein sequence for
Triple QxxK/R motif-containing protein.
 1 mgrkdssntk lpvdqyrkqi gkqdykktkp ilratklkae akktaigike vglmlaaila
61 lllafyaffy lrlstnidsd ldlded
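The numbered, space-separated layout above is convenient to read but cannot be pasted into every tool. A small sketch of converting it to FASTA follows; the helper names are illustrative, and the header text is simply the protein name used in this worksheet.

```python
import re

GENPEPT_TEXT = """
 1 mgrkdssntk lpvdqyrkqi gkqdykktkp ilratklkae akktaigike vglmlaaila
61 lllafyaffy lrlstnidsd ldlded
"""


def genpept_to_plain(text):
    """Drop position numbers and whitespace, keep letters, return upper case."""
    return re.sub(r"[^A-Za-z]", "", text).upper()


def to_fasta(header, sequence, width=60):
    """Format a sequence as FASTA with fixed-width lines."""
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return ">" + header + "\n" + "\n".join(lines) + "\n"


protein = genpept_to_plain(GENPEPT_TEXT)
print(len(protein))
print(to_fasta("Triple QxxK/R motif-containing protein", protein))
```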
2. Open the BLASTP webpage on NCBI and paste the sequence you found in 1.
What is the UniProt accession number for the corresponding protein? Back on the
entry page, save the sequence in FASTA format. Do not save it in MS Word; save it as
a text or .fsa file.
Now go to http://molbiol-tools.ca/, which provides a set of freely available online
tools for sequence analysis. Select “Composition” and then the
“Genomics %A~T Content Calculator”. Calculate the nucleotide composition of your
sequence.
2. What are the numbers of each of the bases in your sequence, and what is the AT
content? Is it GC-rich or AT-rich?
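The same calculation can be reproduced offline as a check on the web tool's output. This is a minimal sketch with an illustrative function name; it counts only A, C, G and T (ambiguity codes such as N are ignored), and the 50% cut-off used to call a sequence AT-rich or GC-rich is an assumption made here for illustration.

```python
from collections import Counter


def base_composition(sequence):
    """Return (counts of A/C/G/T, AT content in percent) for a nucleotide sequence."""
    counts = Counter(base for base in sequence.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no A, C, G or T found in the sequence")
    at_percent = 100.0 * (counts["A"] + counts["T"]) / total
    return counts, at_percent


# Toy input; replace with the sequence saved from the entry page.
counts, at_percent = base_composition("AATTGC")
for base in "ACGT":
    print(base, counts[base])
print("AT content: %.1f%%" % at_percent)
print("AT-rich" if at_percent > 50 else "GC-rich or balanced")
```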
8. Go to the protein database, select a protein of your choice, and find its crystal
structure image.
5Y1Z: Crystal structure of ZMYND8 PHD-BROMO-PWWP tandem in complex with
Drebrin ADF-H domain
Biological Unit for 5Y1Z: dimeric; determined by author
and by software (PISA)
Molecular Graphic