Bioinformatics by MHN
BIO INFORMATICS
COMPUTER APPLICATION
𝐦𝐇𝐧 𝐀𝐬𝐢𝐟 𝐀𝐥𝐢
𝟐𝐊𝟐𝟏/ 𝐌𝐋𝐓/𝟎𝟗
MLT
CONTENTS
1. UCSC GENOME BROWSER
2. NCBI
3. BLAST
4. BLAT
5. FASTA
6. Differences
7. ENSEMBL Genome browser
8. STRING
9. CLUSTAL OMEGA
10. CLUSTALW
11. SWISS MODEL
12. Polymerase Chain Reaction (PCR)
13. REFERENCE
Medical laboratory Technology, 2nd Year 4th Semester. 2
NCBI:
• NCBI stands for the National Center for Biotechnology Information.
Established in 1988 as a national resource for molecular biology information,
NCBI creates public databases, conducts research in computational biology,
develops software tools for analyzing genome data, and disseminates
biomedical information.
1. Entrez
Entrez is the search and retrieval system for all NCBI databases. The name is
French for "enter". Entrez allows you to search all of the NCBI databases,
including PubMed, nucleotide, protein, structure, etc.
2. NCBI Gene
NCBI Gene is a gene-centered (locus-based) database. Each gene record links
to every key NCBI resource for that gene.
3. GenBank (NCBI Data Model)
GenBank is the NIH genetic sequence database, an annotated collection of
all publicly available DNA sequences.
4. BLAST
BLAST (Basic Local Alignment Search Tool) is a set of similarity search
programs, designed to explore all of the available sequence databases
regardless of whether the query is protein or DNA (or soon RNA).
• BLAST Types
1. blastn - compares a nucleotide query against a nucleotide database
2. blastp - compares a protein query against a protein database
3. blastx - compares a nucleotide query, translated into hypothetical
proteins in all six reading frames, against a protein database (such as nr)
4. tblastn - compares a protein query against a nucleotide database
translated into hypothetical proteins in all six reading frames
5. tblastx - compares a nucleotide query translated in all six reading
frames against a nucleotide database translated in all six reading
frames.
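The "six reading frames" used by blastx, tblastn and tblastx can be illustrated with a minimal Python sketch. It assumes the standard genetic code and plain A/C/G/T input; the function names are hypothetical and this is not how BLAST itself is implemented.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate a DNA string codon by codon; unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3))


def six_frame_translation(dna):
    """Translate three frames on the given strand and three on its reverse complement."""
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


if __name__ == "__main__":
    for name, protein in six_frame_translation("ATGGCCTAA").items():
        print(name, protein)
```

Each of the six hypothetical proteins is then compared with the protein sequences, which is why translated searches take longer than blastn or blastp.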
BLAST:
BLAST stands for Basic Local Alignment Search Tool. It searches for
similarity between a query sequence and the sequences deposited in the databases
of the National Center for Biotechnology Information (NCBI). Putative genes in the
query sequence can be detected from their sequence homology to the deposited
sequences. BLAST is popular as a bioinformatics tool because it quickly identifies
regions of local similarity between two sequences. For each match, BLAST calculates
an expectation value (E-value), which estimates the number of matches with an equal
or better score that would be expected by chance in a database of that size. It uses
local alignment of sequences.
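How the expectation value depends on the alignment score and the search space can be sketched with the Karlin-Altschul formulas: a bit score S' = (lambda * S - ln K) / ln 2 and an expected count E = m * n * 2^(-S'), where m is the query length and n the database length. The lambda and K values below are placeholder assumptions (in practice they depend on the scoring system), and the function names are hypothetical.

```python
import math


def bit_score(raw_score, lam, k):
    """Convert a raw alignment score to a bit score (Karlin-Altschul normalisation)."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def e_value(bits, query_len, db_len):
    """Expected number of chance matches with at least this bit score."""
    return query_len * db_len * 2.0 ** (-bits)


if __name__ == "__main__":
    # Placeholder statistical parameters, chosen only for illustration.
    lam, k = 0.3, 0.1
    bits = bit_score(raw_score=100, lam=lam, k=k)
    print(e_value(bits, query_len=300, db_len=1_000_000))
```

A higher score gives a smaller expectation value, and a larger database gives a larger one for the same score.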
BLAT:
The BLAST-Like Alignment Tool (BLAT) is used to find genomic sequences
that match a protein or DNA sequence submitted by the user. BLAT is typically
used for searching similar sequences within the same or closely related species.
Uses:
BLAT’s speed is one of its main advantages. It is useful for quickly finding
the genome location of a genomic, mRNA or protein sequence.
FASTA
• What is FASTA
FASTA is another sequence alignment tool, used to search for
similarities between DNA or protein sequences. The query sequence is
broken down into short sequence patterns or words known as k-tuples, and the
target sequences are searched for these k-tuples in order to find the
similarities between the two. FASTA is a fine tool for similarity searches.
When looking for sequence similarities, a common approach is to first
perform a BLAST search and then follow up with FASTA. The FASTA file format
is widely used as the input format for other sequence alignment tools such as
BLAST.
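The k-tuple idea can be sketched in a few lines of Python: index every word of the query, look up each word of the target, and count the hits that fall on the same diagonal (target position minus query position). This is a simplified illustration with hypothetical function names, not the FASTA program's actual code.

```python
from collections import Counter, defaultdict


def index_ktuples(sequence, k):
    """Map every k-tuple (word of length k) to the positions where it occurs."""
    positions = defaultdict(list)
    for i in range(len(sequence) - k + 1):
        positions[sequence[i:i + k]].append(i)
    return positions


def diagonal_hits(query, target, k=2):
    """Count identical k-tuple matches per diagonal (target index minus query index)."""
    lookup = index_ktuples(query, k)
    hits = Counter()
    for j in range(len(target) - k + 1):
        for i in lookup.get(target[j:j + k], ()):
            hits[j - i] += 1
    return hits


if __name__ == "__main__":
    hits = diagonal_hits("ACGT", "TTACGT", k=2)
    print(hits.most_common(1))
```

A diagonal with many hits marks a region where the two sequences match without gaps, which is where a more careful alignment would then be attempted.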
BLAST and FASTA are two similarity searching programs that identify
homologous DNA sequences and proteins based on excess sequence
similarity. Excess similarity between two DNA or amino acid
sequences arises from common ancestry (homology). Similarity searching is
most effective when comparing the amino acid sequences of proteins
rather than DNA sequences. Both BLAST and FASTA use a scoring strategy to
compare two sequences and provide accurate statistical
estimates of the similarity between them. The main
difference between BLAST and FASTA is that BLAST mostly
finds ungapped, locally optimal sequence alignments, whereas FASTA
finds similarities between less similar sequences.
STRING
STRING is a database of known and predicted protein-protein interactions. The database
contains information from numerous sources, including experimental
repositories, computational prediction methods and public text collections.
STRING is regularly updated and gives a comprehensive view on protein-protein
interactions currently available.
Uses:
STRING allows for the searching of one or multiple proteins at a time with the ability to additionally
limit the search to the desired species.
Clustal Omega
Clustal Omega is a multiple sequence alignment program for aligning three or more
sequences together in a computationally efficient and accurate manner. It produces
biologically meaningful multiple sequence alignments of divergent sequences.
Evolutionary relationships can be viewed as cladograms or phylograms.
Uses:
Clustal Omega is a widely used package for carrying out multiple sequence alignment.
CLUSTAL W
Introduction:
ClustalW is a tool for aligning multiple protein or nucleotide sequences. The
alignment is achieved in three steps: pairwise alignment, guide-tree generation
and progressive alignment. ClustalW-MPI is a distributed and parallel
implementation of ClustalW.
Uses:
• It is a general-purpose multiple alignment program for DNA or proteins.
• It calculates all possible pairwise alignments.
• It records the score for each pair.
• It calculates a guide tree based on the pairwise distances.
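The first steps listed above (align every pair, record the scores, pick the closest sequences to start the guide tree) can be sketched as follows. The scoring values (match +1, mismatch -1, gap -1) and the function names are assumptions made for illustration; the real program uses more elaborate scoring and tree building.

```python
from itertools import combinations


def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment score, keeping only one matrix row in memory."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def pairwise_scores(sequences):
    """Score every pair of named sequences (dict of name -> sequence)."""
    return {
        (x, y): global_alignment_score(sequences[x], sequences[y])
        for x, y in combinations(sequences, 2)
    }


def closest_pair(scores):
    """The highest-scoring pair, which would be joined first in a guide tree."""
    return max(scores, key=scores.get)


if __name__ == "__main__":
    seqs = {"a": "ACGT", "b": "ACGT", "c": "TTTT"}
    scores = pairwise_scores(seqs)
    print(scores, closest_pair(scores))
```

The progressive alignment step would then align sequences in the order given by the guide tree, starting from the closest pair.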
SWISS-MODEL
Introduction:
SWISS-MODEL is a server for automated comparative modeling of
three-dimensional (3D) protein structures. It pioneered the field of
automated modeling starting in 1993 and is the most widely used free
web-based automated modeling facility today.
Uses:
SWISS-MODEL is a web-based integrated service dedicated
to protein structure homology modelling. It guides the user in building
protein homology models at different levels of complexity.
1. Components of PCR:
The PCR reaction requires the following components:
1. DNA Template : The double stranded DNA (dsDNA) of interest, separated from the
sample.
2. DNA Polymerase : Usually a thermostable Taq polymerase that does not rapidly
denature at high temperatures (98°C), and can function at a temperature optimum
of about 70°C.
3. Oligonucleotide primers : Short pieces of single stranded DNA (often 20-30
nucleotides) which are complementary to the 3’ ends of the sense and anti-sense
strands of the target sequence.
4. Deoxynucleotide triphosphates : Single units of the bases A, T, G, and C (dATP,
dTTP, dGTP, dCTP) provide the energy for polymerization and the building blocks
for DNA synthesis.
5. Buffer system : Includes magnesium and potassium to provide the optimal
conditions for DNA denaturation and renaturation; also important for polymerase
activity, stability and fidelity.
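How the two primers relate to the target can be shown with a small Python sketch. It assumes the target is given as the sense strand written 5' to 3', and it only picks the end sequences; real primer design also checks melting temperature, GC content and self-complementarity. The function names are hypothetical.

```python
def reverse_complement(dna):
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def design_primers(target, length=20):
    """Pick a forward and a reverse primer from the ends of the target (sense strand, 5'->3').

    The forward primer copies the 5' end of the sense strand, so it anneals to the
    3' end of the anti-sense strand. The reverse primer is the reverse complement
    of the 3' end of the sense strand, so it anneals to that 3' end.
    """
    target = target.upper()
    if len(target) < 2 * length:
        raise ValueError("target is too short for two non-overlapping primers")
    forward = target[:length]
    reverse = reverse_complement(target[-length:])
    return forward, reverse


if __name__ == "__main__":
    print(design_primers("ATGCCCGGGTTTAAA", length=4))
```

Both primers are written 5' to 3', which is the direction in which the polymerase extends them.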
2. PCR procedure
All the PCR components are mixed together and taken through a series of three
major cyclic steps conducted in an automated, self-contained thermocycler machine.
1. Denaturation :
This step involves heating the reaction mixture to 94°C for 15-30 seconds. During
this step, the double stranded DNA is denatured to single strands because the
weak hydrogen bonds between the strands break.
2. Annealing :
The reaction temperature is rapidly lowered to 54-60°C for 20-40 seconds. This
allows the primers to bind (anneal) to their complementary sequence in the
template DNA.
3. Elongation :
Also known as extension, this step usually occurs at 72-80°C (most commonly
72°C). In this step, the polymerase enzyme sequentially adds bases to the 3′ end of
each primer, extending the DNA sequence in the 5′ to 3′ direction. Under optimal
conditions, DNA polymerase will add about 1,000 bp/minute.
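The extension rate above gives a simple way to estimate how long the elongation step should last. The sketch below also shows the theoretical copy number after a given number of cycles, under the assumption (not stated above) that every cycle doubles the target perfectly; real reactions are less efficient. The function names are hypothetical.

```python
def extension_seconds(amplicon_bp, rate_bp_per_min=1000):
    """Elongation time needed for one amplicon at the given polymerase rate."""
    return amplicon_bp / rate_bp_per_min * 60


def ideal_copies(start_copies, cycles):
    """Copy number after the given cycles, ASSUMING perfect doubling in each cycle."""
    return start_copies * 2 ** cycles


if __name__ == "__main__":
    print(extension_seconds(1500))   # seconds needed for a 1,500 bp target
    print(ideal_copies(1, 30))       # theoretical maximum from one template
```

For example, a 1,500 bp target needs about 90 seconds of elongation per cycle at 1,000 bp/minute.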
This set of Bioinformatics Multiple Choice Questions & Answers (MCQs) focuses on “Comparison of
FASTA and BLAST”.
1. BLAST uses a _______ to find matching words, whereas FASTA identifies identical matching
words using the _____
a) substitution matrix, hashing procedure
b) substitution matrix, blocks
c) hashing procedure, substitution matrix
d) ktups, substitution matrix
Answer: a
Explanation: BLAST and FASTA have been shown to perform almost equally well in regular
database searching; however, there are some notable differences between the two approaches.
The major difference is in the seeding step: BLAST uses a substitution matrix to find matching
words, whereas FASTA identifies identical matching words using the hashing procedure.
3. The use of low-complexity masking in the BLAST procedure means that it may have higher
specificity than FASTA because potential false positives are reduced.
a) True
b) False
Answer: a
Explanation: In addition to the given statement, BLAST sometimes gives multiple best-scoring
alignments from the same sequence. FASTA returns only one final alignment.
Answer: a
Explanation: In addition to this, user friendly UI of BLAST is also one of its benefits. However, it
does not handle gaps well. In that case gapped BLAST is better.
6. BLAST often produces several short HSPs rather than a single aligned region.
a) True
b) False
Answer: a
Explanation: The results of the word matching and the attempts to extend the alignment are
segments called HSPs (High-Scoring Segment Pairs). BLAST often produces several
short HSPs rather than a single aligned region.
7. FASTA is derived from logic of the dot plot.
a) True
b) False
Answer: a
Explanation: Because of this, it computes best diagonals from all frames of alignment. The
method looks for exact matches between words in query and test sequence.
8. The gapped portion in the diagonals represents matches in FASTA.
a) True
b) False
Answer: b
Explanation: The diagonals, not the gaps, indicate where the sequences match. After all
diagonals are found, FASTA tries to join them by adding gaps. It then computes alignments
in the regions of the best diagonals.
Reference :
• From different websites and articles, by using internet.