Bioinformatics by MHN

Medical laboratory Technology, 2nd Year 4th Semester.
BIO INFORMATICS
COMPUTER APPLICATION
mHn Asif Ali
2K21/MLT/09
MLT
CONTENTS
1. UCSC GENOME BROWSER
2. NCBI
3. BLAST
4. BLAT
5. FASTA
6. Differences
7. ENSEMBL Genome browser
8. STRING
9. CLUSTAL OMEGA
10. CLUSTALW
11. SWISS MODEL
12. Polymerase Chain Reaction (PCR)
13. REFERENCE
UCSC GENOME BROWSER

• Introduction:
The University of California Santa Cruz (UCSC) Genome Browser
(genome.ucsc.edu) is a popular Web-based tool for quickly displaying a requested
portion of a genome at any scale, accompanied by a series of aligned annotation
“tracks”.
• Uses
The annotations—generated by the UCSC Genome Bioinformatics Group
and external collaborators—display gene predictions, mRNA and expressed
sequence tag alignments, simple nucleotide polymorphisms, expression and
regulatory data, phenotype and variation data, and pairwise and multiple-species
comparative genomics data. All information relevant to a region is presented in
one window, facilitating biological analysis and interpretation.
Tools of UCSC genome browser include:

• Genome Browser—graphical view of genes, gene structure, and annotation
tracks.
• BLAT—aligning DNA sequence with a reference genomic assembly.
• Custom Tracks—displaying your data in conjunction with existing browser
data.
• Table Browser—bulk data manipulation and downloads, intersections and
joins between data sets.
• Session—sharing your data with others.
• PCR—getting DNA bracketed by a pair of primers.
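The PCR tool in the list above can be pictured with a small sketch. This is only an illustration that assumes exact primer matches on one strand. The function names and the example template are made up for these notes, and the real UCSC tool is more flexible than this.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A<->T, C<->G)."""
    pairs = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def in_silico_pcr(template: str, forward: str, reverse: str):
    """Return the DNA bracketed by the two primers, or None if there is no product."""
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    # The reverse primer anneals to the opposite strand, so on the given
    # strand we search for its reverse complement downstream of the forward primer.
    site = reverse_complement(reverse)
    end = template.find(site, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(site)]


# Hypothetical template and primers, chosen only for demonstration.
template = "GGATCCATGAAACCCGGGTTTTAGCTAGCTAGGATCC"
product = in_silico_pcr(template, "ATGAAACC", "CTAGCTAGCTA")
print(product, len(product))
```

The product starts with the forward primer and ends with the reverse complement of the reverse primer, which is what "bracketed by a pair of primers" means.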
NATIONAL CENTER FOR BIOTECHNOLOGY INFORMATION (NCBI)

• NCBI is the abbreviation of National Center for Biotechnology Information.
Established in 1988 as a national resource for molecular biology information,
NCBI creates public databases, conducts research in computational biology,
develops software tools for analyzing genome data, and disseminates
biomedical information.
1. Entrez
Entrez is the search and retrieval tool for all of NCBI. It is French for enter.
Entrez allows you to search all of the NCBI databases, including PubMed,
nucleotide, protein, structure, etc.
2. NCBI Gene
NCBI Gene places the gene (locus) at the center of the NCBI databases and
links each gene record to every key NCBI resource.
3. GenBank (NCBI Data Model)
GenBank is the NIH genetic sequence database, an annotated collection of
all publicly available DNA sequences.
4. BLAST
BLAST (Basic Local Alignment Search Tool) is a set of similarity search
programs, designed to explore all of the available sequence databases
regardless of whether the query is protein or DNA.
• BLAST Types
1. blastn - compares a nucleotide query against a nucleotide database
2. blastp - compares a protein query against a protein database
3. blastx - compares a nucleotide query, translated into hypothetical
proteins in all six reading frames, against a protein database (such as nr)
4. tblastn - compares a protein query against a nucleotide database
translated into hypothetical proteins in all six reading frames
5. tblastx - compares a nucleotide query translated in all six
reading frames against a nucleotide database translated in all six reading
frames.
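The "six reading frames" used by blastx, tblastn and tblastx can be shown with a short sketch. It assumes the standard genetic code and uses function names invented for these notes. It is not how BLAST itself is implemented.

```python
from itertools import product

# Standard genetic code, with codons ordered by the bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq: str) -> str:
    """Translate complete codons; unknown codons become 'X', stops become '*'."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(seq: str) -> dict:
    """Return the three forward (+1..+3) and three reverse (-1..-3) translations."""
    frames = {}
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(strand_seq[offset:])
    return frames


for name, protein in six_frame_translations("ATGGCCTAA").items():
    print(name, protein)
```

Three frames come from the given strand and three from its reverse complement, so one nucleotide sequence gives six hypothetical proteins to compare.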
BLAST:
BLAST stands for Basic Local Alignment Search Tool. It searches for
similarity between a query sequence and the sequences deposited in the National
Center for Biotechnology Information (NCBI) databases. Putative genes in the
query sequence can be detected based on their sequence homology to the deposited
sequences. BLAST is popular as a bioinformatics tool due to its ability to identify
regions of local similarity between two sequences quickly. BLAST calculates an
expectation value (E-value), which estimates the number of matches with an equal
or better score that would be expected by chance in a database of that size. The
lower the E-value, the more significant the match. BLAST uses local alignment of
sequences.
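The expectation value mentioned above is usually described with the Karlin-Altschul formula E = K * m * n * e^(-lambda * S), where S is the raw alignment score, m is the query length and n is the database length. The sketch below only illustrates that formula. The lambda and K values are assumed example numbers, since the real values depend on the scoring matrix and gap penalties.

```python
import math


def expect_value(raw_score, query_len, db_len, lam, k):
    """Expected number of chance hits scoring at least raw_score."""
    return k * query_len * db_len * math.exp(-lam * raw_score)


def bit_score(raw_score, lam, k):
    """Normalised score, so that E = m * n * 2 ** (-bit_score)."""
    return (lam * raw_score - math.log(k)) / math.log(2)


# Illustrative parameters only (assumed, not taken from these notes).
LAMBDA, K = 0.318, 0.134

for score in (30, 50, 70):
    e = expect_value(score, query_len=100, db_len=1_000_000, lam=LAMBDA, k=K)
    print(score, round(bit_score(score, LAMBDA, K), 1), e)
```

A higher score gives a smaller E-value, and a larger database gives a larger E-value for the same score.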
BLAT:
The BLAST-Like Alignment Tool (BLAT) is used to find genomic sequences
that match a protein or DNA sequence submitted by the user. BLAT is typically
used for searching similar sequences within the same or closely related species.
Uses:
BLAT’s speed is one of its main advantages. It is useful for quickly finding
the genome location of a genomic, mRNA or protein sequence.
What are the differences between BLAT and BLAST?

BLAT is an alignment tool like BLAST, but it is structured differently. On
DNA, BLAT works by keeping an index of an entire genome in memory. Thus, the
target database of BLAT is not a set of GenBank sequences, but instead an index
derived from the assembly of the entire genome. By default, the index consists of
all non-overlapping 11-mers except for those heavily involved in repeats, and it
uses less than a gigabyte of RAM. This smaller size means that BLAT is far more
easily mirrored than BLAST. BLAT on DNA is designed to quickly find sequences of
95% and greater similarity of length 40 bases or more. It may miss more divergent
or shorter sequence alignments.
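The idea of keeping an index of non-overlapping k-mers of the genome in memory can be sketched as follows. This is a simplified illustration with invented function names and a toy genome. It uses k = 4 so the example stays readable, and it does not model the masking of repeats or the later extension of hits into alignments.

```python
from collections import defaultdict


def build_index(genome: str, k: int = 11) -> dict:
    """Index the start positions of non-overlapping k-mers of the genome."""
    index = defaultdict(list)
    for pos in range(0, len(genome) - k + 1, k):  # step by k: no overlap
        index[genome[pos:pos + k]].append(pos)
    return index


def find_seed_hits(query: str, index: dict, k: int = 11) -> list:
    """Look up every (overlapping) k-mer of the query in the genome index."""
    hits = []
    for qpos in range(len(query) - k + 1):
        for gpos in index.get(query[qpos:qpos + k], []):
            hits.append((qpos, gpos))
    return hits


# Toy data for demonstration only.
genome = "ACGTTGCAAGGCTTAACCGG"
index = build_index(genome, k=4)
hits = find_seed_hits("CAAGGCTT", index, k=4)
print(hits)
```

Each hit is a pair (position in query, position in genome). The difference between the two positions suggests where the query lies in the genome, which is why the search is fast but can miss short or divergent matches.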
From a practical standpoint, BLAT has several advantages over BLAST:
• Speed (no queues, response in seconds) at the price of lesser homology
depth
• The ability to submit a long list of simultaneous queries in FASTA format
• Five convenient output sort options
• A direct link into the UCSC browser
• Alignment block details in natural genomic order
• An option to launch the alignment later as part of a custom track.
FASTA
• What is FASTA
FASTA is another sequence alignment tool which is used to search for
similarities between DNA or protein sequences. The query sequence is
broken down into short sequence patterns or words known as k-tuples, and the
target sequences are searched for these k-tuples in order to find the
similarities between the two. FASTA is a sensitive tool for similarity searches.
When finding sequence similarities, a common approach is
to first perform a BLAST search and then go to FASTA. The FASTA file format,
named after the program, is widely used as the input format in other sequence
alignment tools like BLAST.
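The k-tuple search described above can be sketched in a few lines. This shows only the first stage (finding shared words and the diagonal they fall on). It assumes exact word matches, and the function names are invented for these notes. The real FASTA program goes on to rescore and join the best diagonals.

```python
from collections import Counter, defaultdict


def ktuple_positions(seq: str, k: int) -> dict:
    """Map each k-tuple (word of length k) to its start positions in seq."""
    table = defaultdict(list)
    for i in range(len(seq) - k + 1):
        table[seq[i:i + k]].append(i)
    return table


def best_diagonals(query: str, target: str, k: int = 2, top: int = 3) -> list:
    """Count shared k-tuples per diagonal (target position minus query position)."""
    table = ktuple_positions(query, k)
    diagonals = Counter()
    for j in range(len(target) - k + 1):
        for i in table.get(target[j:j + k], []):
            diagonals[j - i] += 1
    return diagonals.most_common(top)


print(best_diagonals("ACGTAC", "TTACGTACGG", k=3))
```

A diagonal with many shared k-tuples marks a region where the two sequences line up without gaps, so it is a good candidate for a fuller alignment.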
Main Difference – BLAST vs FASTA
BLAST and FASTA are two similarity searching programs that identify
homologous DNA sequences and proteins based on excess sequence
similarity. Excess similarity between two DNA or amino acid
sequences arises from common ancestry (homology). The most effective
similarity searching compares the amino acid sequences of proteins
rather than DNA sequences. Both BLAST and FASTA use a scoring strategy
to compare two sequences and provide highly accurate statistical
estimates of the similarities between sequences. The main
difference between BLAST and FASTA is that BLAST mostly finds
ungapped, locally optimal sequence alignments, whereas FASTA is
better at finding similarities between less similar sequences.
Ensembl Genome Browser

INTRODUCTION
The Ensembl project creates evidence-based annotation of genome sequences
and integrates these data with other biological information.
Ensembl was established in 1999, towards the end of the Human Genome
Project, in response to a recognition that understanding the genetic code of
organisms is as important as reading it.
The project provides an expanding wealth of information for a diverse list
of species, including:
• Intron and exon structure for protein-coding and non-coding genes
• Genomic variations and somatic mutations and their consequences on
genes and genotypes in populations and individuals
• Cross-species gene trees and whole genome alignments
• Functional genomic data - including regulatory region annotation.
STRING
STRING is a database of known and predicted protein-protein interactions. The database
contains information from numerous sources, including experimental
repositories, computational prediction methods and public text collections.
STRING is regularly updated and gives a comprehensive view of the protein-protein
interactions currently available.
Uses:
STRING allows for the searching of one or multiple proteins at a time with the ability to additionally
limit the search to the desired species.
Clustal Omega
Clustal Omega is a multiple sequence alignment program for aligning three or more
sequences together in a computationally efficient and accurate manner. It produces
biologically meaningful multiple sequence alignments of divergent sequences.
Evolutionary relationships can be seen via viewing Cladograms or Phylograms.
Uses:
Clustal Omega is a widely used package for carrying out multiple sequence alignment.
CLUSTAL W
Introduction:
ClustalW is a tool for aligning multiple protein or nucleotide sequences. The
alignment is achieved via three steps: pairwise alignment, guide-tree generation
and progressive alignment. ClustalW-MPI is a distributed and parallel
implementation of ClustalW.
Uses:
• It is a general purpose multiple alignment program for DNA or proteins.
• It calculates all possible pairwise alignments.
• It records the score for each pair.
• It calculates a guide tree based on the pairwise distances.
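The first two steps in the list above (align every pair, record the score) can be sketched as follows. This is a simplified illustration, not ClustalW itself. It assumes a plain global alignment score with made-up match, mismatch and gap values, and it only picks the closest pair, which is where building a guide tree would begin.

```python
from itertools import combinations


def global_alignment_score(a: str, b: str, match=1, mismatch=-1, gap=-2) -> int:
    """Needleman-Wunsch score of the best global alignment of a and b."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def pairwise_scores(seqs: dict) -> dict:
    """Score every pair of named sequences."""
    return {
        (x, y): global_alignment_score(seqs[x], seqs[y])
        for x, y in combinations(seqs, 2)
    }


def first_pair_to_join(scores: dict) -> tuple:
    """The most similar pair is the first one a guide tree would group."""
    return max(scores, key=scores.get)


# Toy sequences for demonstration only.
seqs = {"s1": "GATTACA", "s2": "GATTTCA", "s3": "CCCGGGC"}
scores = pairwise_scores(seqs)
print(scores)
print(first_pair_to_join(scores))
```

In the progressive alignment step, the sequences are then added in the order given by the guide tree, starting from the most similar pair.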
SWISS-MODEL
Introduction:
SWISS-MODEL is a server for automated comparative modeling of
three-dimensional (3D) protein structures. It pioneered the field of
automated modeling starting in 1993 and is the most widely used free
web-based automated modeling facility today.
Uses:
SWISS-MODEL is a web-based integrated service dedicated
to protein structure homology modelling. It guides the user in building
protein homology models at different levels of complexity.
POLYMERASE CHAIN REACTION (PCR)

Introduction:
Polymerase chain reaction (PCR) is a versatile gene amplification method that
has brought tremendous progress in molecular biology and genetics. It is an in vitro
method of amplifying a desired DNA sequence of any origin hundreds of millions of times
within hours. Typically, the goal of PCR is to make enough of the target DNA region that it can
be analyzed or used in some other way. For instance, DNA amplified by PCR may be sent
for sequencing, visualized by gel electrophoresis, or cloned into a plasmid for further
experiments. PCR is used in many areas of biology and medicine, including molecular
biology research, medical diagnostics, and even some branches of ecology.
1. Components of PCR:
The PCR reaction requires the following components:
1. DNA Template : The double stranded DNA (dsDNA) of interest, separated from the
sample.
2. DNA Polymerase : Usually a thermostable Taq polymerase that does not rapidly
denature at high temperatures (98°C), and can function at a temperature optimum
of about 70°C.
3. Oligonucleotide primers : Short pieces of single stranded DNA (often 20-30
nucleotides) which are complementary to the 3’ ends of the sense and anti-sense strands
of the target sequence.
4. Deoxynucleotide triphosphates : Single units of the bases A, T, G, and C (dATP,
dTTP, dGTP, dCTP) provide the energy for polymerization and the building blocks
for DNA synthesis.
5. Buffer system : Includes magnesium and potassium to provide the optimal
conditions for DNA denaturation and renaturation; also important for polymerase
activity, stability and fidelity.
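For the primers listed above, a rough melting temperature (Tm) is often estimated with the Wallace rule, Tm = 2 * (A + T) + 4 * (G + C) in °C. The sketch below is only a quick estimate under that rule, with an example primer invented for these notes. Primer design software normally uses more detailed thermodynamic models.

```python
def gc_content(primer: str) -> float:
    """Fraction of the primer's bases that are G or C."""
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)


def wallace_tm(primer: str) -> int:
    """Rough melting temperature in degrees C by the Wallace rule."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


# Hypothetical 20-nucleotide primer, for demonstration only.
primer = "ATGCATGCATGCATGCATGC"
print(len(primer), gc_content(primer), wallace_tm(primer))
```

Because G-C pairs have three hydrogen bonds and A-T pairs have two, a primer with more G and C melts at a higher temperature.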
2. PCR procedure
All the PCR components are mixed together and are taken through a series of 3
major cyclic reactions conducted in an automated, self-contained thermocycler machine.
1. Denaturation :
This step involves heating the reaction mixture to 94°C for 15-30 seconds. During
this, the double stranded DNA is denatured to single strands due to breakage in
weak hydrogen bonds.
2. Annealing :
The reaction temperature is rapidly lowered to 54-60°C for 20-40 seconds. This
allows the primers to bind (anneal) to their complementary sequence in the
template DNA.
3. Elongation :
Also known as extension, this step usually occurs at 72-80°C (most commonly
72°C). In this step, the polymerase enzyme sequentially adds bases to the 3′ end of each
primer, extending the DNA sequence in the 5′ to 3′ direction. Under optimal
conditions, DNA polymerase will add about 1,000 bp/minute.
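The numbers in the procedure above can be turned into simple arithmetic. The sketch assumes ideal doubling in every cycle (efficiency = 1.0), which real reactions do not fully reach, and uses the extension rate of about 1,000 bp/minute stated above. The function names are invented for these notes.

```python
def copies_after(cycles: int, start_copies: int = 1, efficiency: float = 1.0) -> float:
    """Copies of target DNA after a number of cycles; efficiency 1.0 means doubling."""
    return start_copies * (1 + efficiency) ** cycles


def extension_seconds(amplicon_bp: int, rate_bp_per_min: int = 1000) -> float:
    """Approximate extension time needed for one amplicon."""
    return amplicon_bp / rate_bp_per_min * 60


for n in (10, 20, 30):
    print(n, "cycles:", copies_after(n), "copies")
print("500 bp needs about", extension_seconds(500), "seconds of extension")
```

With ideal doubling, 30 cycles give about one billion copies from a single template, which matches the "hundreds of millions of times" stated in the introduction.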
COMPARISON OF FASTA AND BLAST

This set of Bioinformatics Multiple Choice Questions & Answers (MCQs) focuses on “Comparison of
FASTA and BLAST”.
1. BLAST uses a _______ to find matching words, whereas FASTA identifies identical matching
words using the _____
a) substitution matrix, hashing procedure
b) substitution matrix, blocks
c) hashing procedure, substitution matrix
d) ktups, substitution matrix
Answer: a
Explanation: BLAST and FASTA have been shown to perform almost equally well in regular
database searching; however, there are some notable differences between the two approaches.
The major difference is in the seeding step– BLAST uses a substitution matrix to find matching
words, whereas FASTA identifies identical matching words using the hashing procedure.
2. Which of the following is not a benefit or a fact of FASTA over BLAST?

a) FASTA scans smaller window sizes
b) It gives more sensitive results
c) It gives less sensitive results
d) It gives results with a better coverage rate for homologs
Answer: c
Explanation: By default, FASTA scans smaller window sizes. Thus, it gives more sensitive results
than BLAST, with a better coverage rate for homologs. However, it is usually slower than
BLAST.
3. The use of low-complexity masking in the BLAST procedure means that it may have higher
specificity than FASTA because potential false positives are reduced.
a) True
b) False
Answer: a
Explanation: In addition to the given statement, BLAST sometimes gives multiple best-scoring
alignments from the same sequence. FASTA returns only one final alignment.
4. Which of the following is not a benefit of BLAST?

a) Handling of gaps
b) Speed
c) More sensitive
d) Statistical rigor
Answer: a
Explanation: In addition to speed, sensitivity and statistical rigor, the user friendly UI of
BLAST is also one of its benefits. However, it does not handle gaps well. In that case gapped
BLAST is better.
5. BLAST might not find matches for very short sequences.

a) True
b) False
Answer: a
Explanation: In BLAST, similarity matching of words is involved. If no words are found similar,
then no alignment is detected and hence it might not find matches for very short sequences.
6. BLAST often produces several short HSPs rather than a single aligned region.
a) True
b) False
Answer: a
Explanation: The results of the word matching and attempts to extend the alignment are
segments. They are called HSPs (High-Scoring Segment Pairs). BLAST often produces several
short HSPs rather than a single aligned region.
7. FASTA is derived from logic of the dot plot.
a) True
b) False
Answer: a
Explanation: Because of this, it computes best diagonals from all frames of alignment. The
method looks for exact matches between words in query and test sequence.
8. The gapped portion in the diagonals represents matches in FASTA.
a) True
b) False
Answer: b
Explanation: The diagonal’s nature indicates the matching of the sequences. After all diagonals
are found, it tries to join diagonals by adding gaps. Further, it computes alignments in regions of
best diagonals.
Only for Knowledge.

Reference:
• Compiled from different websites and articles on the internet.