BMTech-BioInformatics-Revision Session

Biomedical Techniques

BioInformatics Revision Session

ACTCTTCTGGTCCCCACAGACTCAGAGAGAACCCACCATGGTGCTGTCTCCTGCCGACAAGACCAACGTCAAGG
CCGCCTGGGGTAAGGTCGGCGCGCACGCTGGCGAGTATGGTGCGGAGGCCCTGGAGAGGATGTTCCTGTCCT
TCCCCACCACCAAGACCTACTTCCCGCACTTCGACCTGAGCCACGGCTCTGCCCAGGTTAAGGGCCACGGCAAG
AAGGTGGCCGACGCGCTGACCAACGCCGTGGCGCACGTGGACGACATGCCCAACGCGCTGTCCGCCCTGAGC
GACCTGCACGCGCACAAGCTTCGGGTGGACCCGGTCAACTTCAAGCTCCTAAGCCACTGCCTGCTGGTGACCCT
GGCCGCCCACCTCCCCGCCGAGTTCACCCCTGCGGTGCACGCCTCCCTGGACAAGTTCCTGGCTTCTGTGAGCA
CCGTGCTGACCTCCAAATACCGTTAAGCTGGAGCCTCGGTGGCCATGCTTCTTGCCCCTTGGGCCTCCCCCCAGC
CCCTCCTCCCCTTCCTGCACCCGTACCCCCGTGGTCTTTGAATAAAGTCTGAGTGGGCGGC
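
BLASTX translates a nucleotide query in all six reading frames before comparing it with protein sequences. As a rough illustration of what a "+2" frame means, the sketch below translates the first line of the sample sequence in the three forward frames using the standard genetic code. The function names are made up for this example, the reverse frames are left out, and the frame of the first ATG is only a quick sanity check, not how BLASTX itself picks a frame.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate_frame(seq, frame):
    """Translate seq in forward frame 1, 2 or 3; '*' marks a stop codon."""
    if frame not in (1, 2, 3):
        raise ValueError("frame must be 1, 2 or 3")
    seq = seq.upper()
    start = frame - 1  # frame +1 starts at the first base, +2 at the second, ...
    codons = (seq[i:i + 3] for i in range(start, len(seq) - 2, 3))
    # Codons containing anything other than A, C, G, T become 'X'.
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def frame_of_first_atg(seq):
    """Return the forward frame (1-3) that contains the first ATG, or None."""
    pos = seq.upper().find("ATG")
    return None if pos < 0 else pos % 3 + 1


# First line of the sample sequence from this worksheet.
sample_start = (
    "ACTCTTCTGGTCCCCACAGACTCAGAGAGAACCCACCATGGTGCTGTCTCCTGCCGACAAGACCAACGTCAAGG"
)

for frame in (1, 2, 3):
    print(f"+{frame}: {translate_frame(sample_start, frame)}")
print("first ATG is in frame", frame_of_first_atg(sample_start))
```

In frame +2 the translation contains MVLSPADKTNVK, which is where the protein-coding part of this excerpt begins.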

1. Search the sample sequence given above against the protein database (nr) using the
BLASTX program. Answer the questions that follow.
2. For database hit NM_001042626.1, which frame of the query sequence does the alignment
begin in?
+2/+2
3. At which nucleotide of the query sequence does the frame change?
No frame change
4. Using what you learnt about navigating through the NCBI database, find the
nucleotide sequence corresponding to the protein with the accession
NM_001042626.1. Describe the steps you took to find it.
Go to NCBI, open BLAST and select blastx (then tblastx), copy the sample sequence, paste it
into the search box and run the search. Click on the correct accession number, check for the
frame shift (+ or -), and paste the sequence with the frame shift.
5. Make a local alignment of the nucleotide sequence from 4. with the sample
sequence. Download your alignment and paste it below.
>NM_001042626.1 Pan troglodytes hemoglobin subunit alpha 1 (HBA1), mRNA
ACTCTTCTGGTCCCCACAGACTCAGAAAGAACCCACCATGGTGCTGTCTCCTGCCGACAAGACCAACGTCAAGGCCGCCT
GGGGTAAGGTCGGCGCGCACGCTGGCGAGTATGGTGCGGAGGCCCTGGAGAGGATGTTCCTGTCCTTCCCCACCACCAAG
ACCTACTTCCCCCACTTCGACCTGAGCCACGGCTCTGCCCAGGTTAAGGGTCACGGCAAGAAGGTGGCCGACGCGCTGAC
CAACGCCGTGGCGCACGTGGACGACATGCCCAACGCGCTGTCCGCCCTGAGTGACCTGCACGCGCACAAGCTTCGGGTGG
ACCCGGTCAACTTCAAGCTCCTAAGCCACTGCCTGCTGGTGACCCTGGCCGCCCACCTCCCCGCCGAGTTCACCCCTGCG
GTGCACGCCTCCCTGGACAAGTTCCTGGCTTCTGTGAGCACCGTGCTGACCTCCAAATACCGTTAAGCTGGAGCCTCGGT
GGCCATGCTTCTTGCCCCTTGGGCCTCTCGCCAGGCCCTCCTCTCCTTCCTGCACCTGTACCCCCCCTGGTCTTTGAATA
AAGTCTGAGTGGGCGGC
6. Highlight the nucleotide that is bringing about the frameshift in the alignment you
have pasted in 5.
No frame shift
7. What other differences are there between the two nucleotide sequences?
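The two sequences are not identical. Comparing them as printed above shows
single-nucleotide substitutions, for example at position 27 (G in the sample sequence,
A in NM_001042626.1), and further differences in the 3' untranslated region after the
TAA stop codon.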
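
Question 7 can be checked directly by comparing the two sequences column by column. The sketch below does this for the first 40 nucleotides of the sample sequence and of NM_001042626.1 as printed in this worksheet. It assumes the two strings are already aligned without gaps, so it is not a substitute for a real local alignment, and the function names are illustrative only.

```python
def mismatches(a, b):
    """List (1-based position, base in a, base in b) for each differing column.

    Assumes a and b are already aligned without gaps, so they must be
    the same length.
    """
    if len(a) != len(b):
        raise ValueError("sequences must have equal length (ungapped comparison)")
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b), start=1) if x != y]


def percent_identity(a, b):
    """Percentage of identical columns in an ungapped comparison."""
    if not a:
        raise ValueError("sequences must not be empty")
    return 100.0 * (len(a) - len(mismatches(a, b))) / len(a)


# First 40 nucleotides of each sequence, copied from this worksheet.
sample_40 = "ACTCTTCTGGTCCCCACAGACTCAGAGAGAACCCACCATG"
chimp_40 = "ACTCTTCTGGTCCCCACAGACTCAGAAAGAACCCACCATG"

for position, sample_base, chimp_base in mismatches(sample_40, chimp_40):
    print(f"position {position}: sample {sample_base}, NM_001042626.1 {chimp_base}")
print(f"identity over this window: {percent_identity(sample_40, chimp_40):.1f}%")
```

Over this window the only difference is at position 27. Sequences of unequal length, or with insertions and deletions, need a gapped alignment tool instead.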

Triple QxxK/R motif-containing protein and protein kinase C-binding protein 1 are
structurally similar. Given structural similarity you would expect to find sequence similarity.
However, the two share little sequence identity. We will use PSI-BLAST to find the sequence.
1. Using a sequence database of your choice retrieve the protein sequence for
Triple QxxK/R motif-containing protein.
 1 mgrkdssntk lpvdqyrkqi gkqdykktkp ilratklkae akktaigike vglmlaaila
61 lllafyaffy lrlstnidsd ldlded

2. Open the BLASTP webpage on NCBI and paste the sequence you found in 1.

3. Submit a PSI-BLAST search against the SwissProt database, narrowing your
search organism to Orang-utan. Keep the PSI-BLAST threshold at 0.005.
4. In your search results the description section will be split in two tables. The top
table will contain alignments with E-values below the cut-off and the lower table
will show the hits above the E-value cut-off.
a. What range of the % identities do you see in the alignments in both the
tables?
Table 1: only 92.31% (no range)
Table 2: 23.53%-53.85%
b. Do you get hits to proteins other than Triple QxxK/R motif-containing
protein in the first table?
No
c. Take a look at the Taxonomy report for this search. You will find a link to it
just above the Graphic Summary section. Which organism has the lowest
alignment score?
5. Run the 2nd iteration of PSI-BLAST using hits in the first table. This may take a
while to run so please be patient.
Do you get hits to Triple QxxK/R motif-containing protein on the 2nd
iteration?
No. The 2nd iteration did not display any results.
Using the Taxonomy report for the 2nd iteration determine:
a. if you get a hit to Triple QxxK/R motif-containing protein
b. which is the evolutionarily closest species to Orang-utan for which you get a Triple
QxxK/R motif-containing protein hit? Hint: look at the number of hits
in the lineage report.
c. how long is the Triple QxxK/R motif-containing protein for this species?
6. Run a 3rd iteration this time including the Triple QxxK/R motif-containing
protein hits.
a. Do you get a hit to Triple QxxK/R motif-containing protein?
b. If yes, what is the accession number for it?
c. With what % identity does the query sequence align with it?
7. Go to the European Nucleotide Archive: http://www.ebi.ac.uk/ena/ and search for the
accession number “P40243.1”. Choose the sequence result with the correct accession
number. First view the text entry to get more details about the gene.

What is the UniProt accession number for the corresponding protein? Back on the
entry page, save the sequence in FASTA format. Do not save it in MS Word; save it as
a text or .fsa file. Now go to http://molbiol-tools.ca/ which provides a set of freely
available online tools for sequence analysis. Select “Composition” and then the
“Genomics %A~T Content Calculator”. Calculate the nucleotide composition of your
sequence. What are the numbers of each of the bases in your sequence and the AT
content? Is it GC or AT rich?
8. Go to the protein database, select a protein of your choice and find the crystal
structure image.
5Y1Z: Crystal structure of ZMYND8 PHD-BROMO-PWWP tandem in complex with
Drebrin ADF-H domain
Biological Unit for 5Y1Z: dimeric; determined by author and by software (PISA)
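
The base-composition step in question 7 can also be checked locally once the sequence has been saved as FASTA text. The sketch below is one possible way to do it with the Python standard library. The function names and the toy record are invented for illustration, and the toy record is not the sequence asked for in the exercise.

```python
from collections import Counter


def sequence_from_fasta(text):
    """Return the sequence of a single-record FASTA string, header removed."""
    lines = [line.strip() for line in text.splitlines()]
    return "".join(line for line in lines if line and not line.startswith(">"))


def base_composition(seq):
    """Count A, C, G and T in seq and return (counts, AT percent, GC percent).

    Characters other than A, C, G and T (for example N) are ignored.
    """
    counts = Counter("".join(seq.split()).upper())
    acgt = {base: counts.get(base, 0) for base in "ACGT"}
    total = sum(acgt.values())
    if total == 0:
        raise ValueError("no A, C, G or T characters found")
    at_percent = 100.0 * (acgt["A"] + acgt["T"]) / total
    return acgt, at_percent, 100.0 - at_percent


# Toy FASTA record, used only to show the calling pattern.
toy_fasta = ">toy_record illustrative only\nAATTG\nGCCAT\n"

counts, at_percent, gc_percent = base_composition(sequence_from_fasta(toy_fasta))
print(counts)
print(f"AT content: {at_percent:.1f}%  GC content: {gc_percent:.1f}%")
print("AT rich" if at_percent > 50 else "GC rich or balanced")
```

To use it on a saved file, read the file contents into a string and pass that string to sequence_from_fasta in place of the toy record.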

Molecular Graphic
