Adrae Anderson BIOT3113 Lab 7 Bioinformatics and Disease


Bioinformatics and disease

Name :

ID number:

Course name: Biotechnology

Course Code: BIOT3113

Title: Bioinformatics and Disease

Date: November 18, 2022

BIOT3113: BIOTECHNOLOGY I

Lab 8 – Bioinformatics & Disease

This lab is designed to introduce you to basic Bioinformatics

software and tasks one performs when analyzing a sequence.

All answers should be added and submitted in this worksheet.

Scenario:

You’ve come across an infected plant in the field. A sample is

collected, and a DNA extraction protocol is carried out.

Following a cloning experiment, you’ve sent the unknown plant

pathogen sample for sequencing. The following sequence is what

you received from the lab

(Download your assigned sequence document)

State which sequences you have been assigned: Sequence 3


Questions:

1. Go to the National Center for Biotechnology Information

website (https://www.ncbi.nlm.nih.gov/ ). Select the tab for

‘Data and Software’ then ‘Tools.’ Locate the ‘Basic Local

Alignment Search Tool’ and select ‘Nucleotide BLAST.’ Copy

the DNA sequence to the ‘Query Sequence’ box and select

BLAST.

a. Which sequence does your sample share the highest

percentage identity with? (1 mark)

Ans. Tobacco leaf curl Cuba virus

b. What is the size of your sequence? (1 mark)

Ans. 2610 bp

c. What is the Accession number of your sequence’s closest

match and what is the purpose of the Accession number? (2

marks)

Ans. KU562963.1. An accession number is a unique identifier
assigned to a sequence record when it is entered into a
database. It allows the record to be retrieved and cited
unambiguously, and the version suffix (the “.1”) tracks the
different versions of the sequence record.
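
The two parts of an identifier such as KU562963.1 can be separated mechanically. The snippet below is a minimal sketch, assuming the usual "ACCESSION.VERSION" layout; the function name `split_accession` is hypothetical and not part of any NCBI tool.

```python
from typing import Optional, Tuple


def split_accession(accession_version: str) -> Tuple[str, Optional[int]]:
    """Split 'ACCESSION.VERSION' into the stable accession and the version.

    Assumption: the version, when present, is the integer after the dot.
    """
    base, sep, version = accession_version.strip().partition(".")
    return base, (int(version) if sep else None)


# The closest match reported in the worksheet.
print(split_accession("KU562963.1"))
# An accession written without a version suffix.
print(split_accession("AP018036"))
```

The accession stays the same for the life of the record, while the version number increases whenever the sequence itself is revised.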

d. List your sample’s five closest matches and the

corresponding percentage identity (10 marks)

Ans.

Description                                                % Identity

Tobacco leaf curl Cuba virus segment DNA-A,                100.00%
complete sequence

Tobacco leaf curl Cuba virus AV1 gene, AC1 gene,           97.05%
AC2 gene, AC3 gene

Tobacco leaf curl Cuba virus isolate CU/frijol 8/2014      96.82%
segment DNA-A, complete sequence

Tobacco leaf curl virus strain Haiti 2014 segment A,       96.09%
complete sequence

Tobacco leaf curl Cuba virus Dominican Republic            95.83%
Juan Gomez 2015 segment DNA-A, complete sequence

e. What does “percentage identity” mean? (1 mark)

Ans. The percentage of positions in the alignment at which the
query sequence and the target sequence have identical bases.
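
The calculation can be illustrated on a pair of already-aligned sequences. This is a minimal sketch, assuming identity is counted as identical columns divided by the alignment length (gap columns included, marked with "-"); the sequences are toy examples, not the assigned sample.

```python
def percent_identity(aligned_query: str, aligned_subject: str) -> float:
    """Percentage of alignment columns where both sequences carry the same base."""
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_query:
        raise ValueError("alignment is empty")
    # A column counts as a match only if the bases are equal and not a gap.
    matches = sum(
        q == s and q != "-"
        for q, s in zip(aligned_query.upper(), aligned_subject.upper())
    )
    return 100.0 * matches / len(aligned_query)


# Toy alignment: one mismatch in ten columns.
print(percent_identity("ACGTACGTAC", "ACGTTCGTAC"))  # 90.0
# Toy alignment with one gap column out of five.
print(percent_identity("ACG-T", "ACGAT"))  # 80.0
```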


2. Create a phylogenetic tree by selecting the first ten

similar sequences and choosing the ‘Distance tree of

results.’

a. Insert a screen grab of tree. (3 marks)

Diagram: Phylogenetic tree of the Tobacco leaf curl Cuba virus sample sequence and
its 10 closest relatives, traced back to their common ancestor

b. What is the purpose of a phylogenetic tree? (1 mark)

Ans. A phylogenetic tree shows the evolutionary relationships
between biological entities, in this case between the sequences.


c. What does the phylogenetic tree tell you about the

sequence you isolated from the infected plant? (5 marks)

Ans. The phylogenetic tree gives information about evolutionary
relationships. It shows how closely related organisms descended
from common ancestors: the more recent the common ancestor of two
organisms, the more similar their sequences are. The tree shows
that the isolated sequence shares a common ancestor with the other
sequences and is closely related to them. It places the Tobacco
leaf curl Cuba virus alongside other viruses, such as a Sida
yellow mottle virus strain and Wissadula golden mosaic St. Thomas
virus, with which it shares a common ancestor.
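
Distance-based trees are built from pairwise distances between aligned sequences, with the smallest distances joined first. The sketch below computes a simple p-distance (fraction of differing positions) for three toy sequences. The names and sequences are hypothetical, and the distance model NCBI applies for its tree may differ.

```python
from itertools import combinations


def p_distance(seq_a: str, seq_b: str) -> float:
    """Fraction of aligned positions at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    differences = sum(a != b for a, b in zip(seq_a, seq_b))
    return differences / len(seq_a)


def distance_matrix(sequences: dict) -> dict:
    """Pairwise p-distances keyed by (name_1, name_2)."""
    return {
        (n1, n2): p_distance(sequences[n1], sequences[n2])
        for n1, n2 in combinations(sequences, 2)
    }


# Hypothetical, pre-aligned toy sequences.
toy = {
    "sample": "ACGTACGTAC",
    "relative_1": "ACGTACGTAT",
    "relative_2": "ACGAACGGAT",
}
matrix = distance_matrix(toy)
closest_pair = min(matrix, key=matrix.get)
print(matrix)
print("closest pair:", closest_pair)
```

The pair with the smallest distance would sit on neighbouring branches of the tree, which is how the closest relatives of the sample sequence are read off.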

This portion aims to introduce you to the initial steps in

protein analysis. You will be required to do basic research on

the structure of Begomoviruses.

3. Return to the National Center for Biotechnology Information

homepage. Select the tab for ‘Data and Software’ then

‘Tools.’ Locate the ‘Open Reading Frame Finder’ and select.

Copy the DNA sequence to the ‘Query Sequence’ box and

submit.

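a. How does this BLAST function identify the ORFs (proteins)?
(1 mark)

Ans. The ORF Finder scans the nucleotide sequence in all six
reading frames for stretches that begin with a start codon and
end with a stop codon. Each predicted ORF can then be compared
against known nucleotide or protein sequences.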

b. What is the shortcoming of this methodology? (2 marks)

Ans. The shortcoming is that only one sequence can be searched
at a time, not multiple sequences.
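
As a rough illustration of how an ORF search works in principle, the sketch below scans both strands in three frames each. It assumes ATG as the only start codon and TAA, TAG and TGA as stop codons; the real ORF Finder offers more options (alternative start codons, minimum length, genetic code), and the toy sequence is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def orfs_in_strand(seq: str, min_codons: int = 1) -> list:
    """Find ATG...stop ORFs in the three frames of one strand.

    Returns (start, end) pairs as 0-based, end-exclusive indices;
    the end includes the stop codon.
    """
    found = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    found.append((start, i + 3))
                start = None
    return found


def find_orfs(seq: str, min_codons: int = 1) -> list:
    """Scan all six reading frames.

    Minus-strand coordinates refer to the reverse-complemented sequence.
    """
    seq = seq.upper()
    plus = [("+", s, e) for s, e in orfs_in_strand(seq, min_codons)]
    minus = [("-", s, e) for s, e in orfs_in_strand(reverse_complement(seq), min_codons)]
    return plus + minus


print(find_orfs("CCATGAAATGACC"))  # [('+', 2, 11)]
```

A scan like this only reports candidate reading frames. It cannot tell whether a candidate is actually expressed as a protein, which is why the predicted ORFs are checked against the annotated sequence record.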

c. Return to the NCBI homepage and search for the Accession

number of your sequence. Click on the result to examine

the sequence record. What information here can you use to

fill in the table below? What does the abbreviation mean?

_____________________________________________________(2

marks)

d. Fill in the table below with the functional proteins (15
marks)

Description                       Start   Stop   nt size   aa size

Coat protein                        182    937       756       251

Replication enhancer protein        934   1332       399       132

Transactivating protein            1079   1468       390       129

Replication-associated protein     1380   2465      1086       361

AC4 protein                        2051   2308       258        85

nt – nucleotide, aa – amino acids
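
The sizes can be checked against the coordinates. The sketch below assumes that Start and Stop are 1-based, inclusive positions and that each range ends on a stop codon; neither assumption is stated in the worksheet. Under them the nucleotide length is stop − start + 1 and the amino acid count is one less than the number of codons, which reproduces the aa sizes in the table.

```python
def orf_sizes(start: int, stop: int) -> tuple:
    """Return (nt size, aa size) for an inclusive, 1-based coordinate range."""
    nt = stop - start + 1
    if nt % 3 != 0:
        raise ValueError("range is not a whole number of codons")
    aa = nt // 3 - 1  # the final codon is the stop codon and codes for no residue
    return nt, aa


# Coordinates copied from the table above.
features = {
    "Coat protein": (182, 937),
    "Replication enhancer protein": (934, 1332),
    "Transactivating protein": (1079, 1468),
    "Replication-associated protein": (1380, 2465),
    "AC4 protein": (2051, 2308),
}

for name, (start, stop) in features.items():
    nt, aa = orf_sizes(start, stop)
    print(f"{name}: {nt} nt, {aa} aa")
```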

4. Now that you know the location of your CP (coat protein),

copy the sequence and use the Primer-Blast tool (

https://www.ncbi.nlm.nih.gov/tools/primer-blast/ ) to create

primers that will amplify the CP. (3 marks)

Primer            Sequence                Size

Forward primer    TTCTCACTCGCGCTATCGTC    20 nt

Reverse primer    TAAACCGCGCAACAACTTGG    20 nt

a. State your reason for choosing the above primer pair (1

mark)

Ans. This primer pair is unlikely to form dimers, as it had
very low self-complementarity values.
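
The idea behind a self-complementarity check can be sketched as follows. This is a simplified stand-in, not the score Primer-BLAST reports: it counts the largest number of Watson-Crick pairs in any ungapped antiparallel alignment of two primers, and the function name `max_pairing` is hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def max_pairing(primer_a: str, primer_b: str) -> int:
    """Largest number of base pairs over all ungapped antiparallel alignments.

    Pass the same primer twice to look at self-dimers, or the forward and
    reverse primers to look at cross-dimers.
    """
    a = primer_a.upper()
    b_rc = reverse_complement(primer_b.upper())
    best = 0
    # Slide b's reverse complement along a; equal bases mean a pairs with b.
    for offset in range(-(len(b_rc) - 1), len(a)):
        pairs = sum(
            1
            for i, base in enumerate(a)
            if 0 <= i - offset < len(b_rc) and base == b_rc[i - offset]
        )
        best = max(best, pairs)
    return best


forward = "TTCTCACTCGCGCTATCGTC"
reverse = "TAAACCGCGCAACAACTTGG"
print("forward self-pairing:", max_pairing(forward, forward))
print("reverse self-pairing:", max_pairing(reverse, reverse))
print("forward/reverse pairing:", max_pairing(forward, reverse))
```

Lower counts suggest fewer opportunities for the primers to anneal to each other instead of the template.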

5. What are three characteristics of a good primer? (3 marks)


Ans. A good primer has a melting temperature between 50 °C
and 65 °C, does not form dimers, and has no secondary priming
sites on the template.
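
The first characteristic can be estimated by hand for the primer pair chosen in question 4. The sketch below uses the Wallace rule, Tm = 2(A+T) + 4(G+C), which is only a rough approximation for short oligonucleotides; Primer-BLAST uses a more detailed thermodynamic model, so its reported values will differ.

```python
def gc_percent(primer: str) -> float:
    """Percentage of G and C bases in the primer."""
    p = primer.upper()
    return 100.0 * (p.count("G") + p.count("C")) / len(p)


def wallace_tm(primer: str) -> int:
    """Rough melting temperature in degrees C from the Wallace rule."""
    p = primer.upper()
    at = p.count("A") + p.count("T")
    gc = p.count("G") + p.count("C")
    return 2 * at + 4 * gc


forward = "TTCTCACTCGCGCTATCGTC"
reverse = "TAAACCGCGCAACAACTTGG"

for label, primer in (("forward", forward), ("reverse", reverse)):
    print(label, len(primer), "nt", gc_percent(primer), "% GC", wallace_tm(primer), "C")
# forward 20 nt 55.0 % GC 62 C
# reverse 20 nt 50.0 % GC 60 C
```

Both estimates fall inside the 50 °C to 65 °C range, and the two primers differ by only 2 °C, which helps them anneal under the same conditions.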

The NCBI GenBank is an important tool used by scientists all
around the world. This section will introduce you to the most
basic function of GenBank.

6. Go to the NCBI GenBank home page (

https://www.ncbi.nlm.nih.gov/genbank/ )

a. What is the GenBank? (1 mark)

Ans. GenBank is a comprehensive database of publicly
available nucleotide sequences, built from submissions by
individual laboratories and large-scale sequencing projects.

b. Use the List of Accession numbers given below to

locate/identify the name of the sequences. (20 marks)

Accession    Sequence Name                                      Size (bp)
Number

AP018036     Mycobacterium tuberculosis                         4413362

FJ407052     Euphorbia mosaic virus -                           2609
             [Jamaica:Wissadula:Euphorbia:2004] segment
             DNA-A, complete sequence

EF585288     Macroptilium yellow mosaic virus isolate           2630
             Hope Pastures segment DNA-A, complete
             sequence

NC_046154    Rattus rattus isolate New Zealand                  272949890
             chromosome 1, Rrattus_CSIRO_V1, whole
             genome shotgun sequence

AF035224     Tomato dwarf leaf curl geminivirus                 1319
             replication gene (AC1) and coat protein
             (AV1) genes, partial cds

AE014075     Escherichia coli CFT073, complete genome           5231428

FJ601917     Malvastrum yellow mosaic Jamaica virus             2612
             isolate Ma179A73 segment DNA-A, complete
             sequence

NC_004162    Chikungunya virus, complete genome                 11826

NC_001802    Human immunodeficiency virus 1, complete           9181
             genome

AY391777     Human coronavirus OC43, complete genome            30738
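
The same records can also be requested programmatically. The sketch below only builds request URLs for the NCBI E-utilities `efetch` endpoint (database `nuccore`, GenBank flat-file format) and does not contact the server; the endpoint and parameters are not part of the worksheet, and the helper name `genbank_record_url` is hypothetical.

```python
from urllib.parse import urlencode

# NCBI E-utilities endpoint for fetching records (not mentioned in the worksheet).
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def genbank_record_url(accession: str) -> str:
    """Build a URL that requests one nucleotide record as GenBank flat-file text."""
    params = {"db": "nuccore", "id": accession, "rettype": "gb", "retmode": "text"}
    return EFETCH + "?" + urlencode(params)


# Accession numbers from the list above.
accessions = [
    "AP018036", "FJ407052", "EF585288", "NC_046154", "AF035224",
    "AE014075", "FJ601917", "NC_004162", "NC_001802", "AY391777",
]

for accession in accessions:
    print(genbank_record_url(accession))
```

Opening one of these URLs returns the text record, whose DEFINITION line gives the sequence name and whose LOCUS line gives the length in bp.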

In this section you’ll be asked to develop a hypothetical DNA

vaccine.

Scenario:

In recent times, since the discovery of Anelloviruses, they have

been increasingly linked to various other diseases as some kind

of ‘helper-molecule’, exacerbating the development of chronic

illnesses. A vaccine could provide protection against these

viruses, and in turn slow down the progress of other illnesses.


7. What are the different types of vaccines? (6 marks)

Ans.

• Inactivated vaccines - These use a killed form of the
disease-causing germ. The immunity they give is usually not as
strong as that from live vaccines, so booster shots may be
required.

• Live-attenuated vaccines - These use a weakened form of the
pathogen. Because they closely resemble a natural infection, they
give long-lasting protection.

• Messenger RNA (mRNA) vaccines - These deliver mRNA that
instructs the body's cells to make a protein from the germ, and
that protein triggers the immune response.

• Subunit, recombinant, polysaccharide, and conjugate vaccines -
These use components of the germ, such as its capsid, sugar, or
protein. Because they focus on essential parts of the germ, they
produce a very strong immune response. They can also be given to
people with compromised immune systems and chronic medical
conditions.

• Toxoid vaccines - These use an inactivated toxin made by the
germ instead of the germ itself. The resulting immunity is
directed at the part of the germ that causes the disease, so the
toxin is the focus of the immune response rather than the entire
germ.

• Viral vector vaccines - These give protection by using a
modified form of a different virus as a vector. The influenza
virus, the vesicular stomatitis virus (VSV), the measles virus,
and the adenovirus that causes the common cold have all been used
as vectors.

8. How is immunity achieved through vaccination? (5 marks)

Ans. Vaccines achieve immunity by imitating an infection. A
weakened or killed pathogen, or components of it, is
introduced into the body, which triggers the production of
T-lymphocytes and antibodies. After the immune system has
dealt with this imitation infection, it keeps a memory of
it, so the body can respond more easily if it is ever
infected by the same pathogen.

9. What are DNA vaccines? (2 marks)

Ans. DNA vaccines, also called third-generation vaccines,
use engineered DNA that encodes an antigen from a pathogen.
Cells that take up the DNA produce the antigen, which
triggers an immune response against threats such as viruses
and bacteria.

10. Now that you know how to BLAST a nucleotide sequence and

locate ORFs, use this knowledge to design a cloning


experiment to create a DNA vaccine for the sequence located

in the document named “Question 10 DNA Vaccine”

(Resources you may need: http://nc2.neb.com/NEBcutter2/ and

the provided PDF document ‘Human anelloviruses and the

central nervous system’)

(20 marks)

Ans.

1. The viral DNA sequence will be copied and submitted to
BLAST on the NCBI website.

2. Using the Open Reading Frame Finder, we will look for
ORF1 (start position 598, stop position 2841) and ORF2
(start position 356, stop position 724).

3. Using the NEBcutter program, we will visualize the viral
DNA and identify the restriction sites and their matching
restriction enzymes, given that ORF1 is expected to have
replication-related protein activity and ORF2 is thought to
have phosphatase activity. The enzymes identified were BsiEI
(at 154 bp) and FspI (at 2982 bp).

4. Digesting the viral DNA with these restriction enzymes
will allow us to isolate the functional genes, for which we
will then design primers using the NCBI Primer-BLAST
program. Amplification will cover the region between the
first site at 356 bp and the site cut by FspI at 2982 bp.

5. It is critical to design PCR primers with low
self-complementarity to avoid the formation of primer
dimers, which would result in a poor yield of amplified
viral genes. The plasmid will be digested with the required
restriction enzymes, which cut at its cloning sites, so that
the amplified genes can be inserted.

6. Using DNA ligase, we will then re-circularize the plasmid
with the PCR product inserted.

7. We will transform the plasmid carrying the target gene
into a bacterium, such as E. coli, and grow it on a
selective agar medium. As the bacteria grow and multiply,
the plasmid carrying the target gene replicates along with
them.

8. We can use a range of purification techniques, such as
column chromatography and ultra/diafiltration, to purify the
plasmid.

9. In the initial experiment, which is carried out on
animals before moving on to human trials, the purified
plasmid is suspended in a suitable medium and delivered by
injection.
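
The restriction-site step can be illustrated with a small search routine. This is a minimal sketch, assuming the FspI recognition sequence is TGCGCA and treating the sequence as linear; the toy sequence is hypothetical, and the actual sites should be taken from the NEBcutter output for the assigned sequence.

```python
import re


def find_sites(sequence: str, recognition_site: str) -> list:
    """Return 1-based start positions of a recognition site, including overlaps."""
    seq = sequence.upper()
    site = recognition_site.upper()
    # A lookahead match has zero width, so overlapping sites are all reported.
    return [m.start() + 1 for m in re.finditer(f"(?={re.escape(site)})", seq)]


FSPI_SITE = "TGCGCA"  # assumed FspI recognition sequence

# Hypothetical toy sequence with two FspI sites.
toy_sequence = "AATGCGCATTTTGCGCAGG"
print(find_sites(toy_sequence, FSPI_SITE))  # [3, 12]
```

An enzyme that cuts inside ORF1 or ORF2 would destroy the gene being cloned, so the positions returned here are compared with the ORF coordinates before an enzyme is chosen.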

TOTAL 105 MARKS
