Genome Browser Exercise

LSM2241 Practical 8

UCSC Genome Browsers

Objectives:

1. Genome Browser: graphical view of genes, gene structure, and annotation
tracks.
2. BLAT: aligning DNA sequence with a reference genomic assembly.
3. Custom Tracks: displaying your data in conjunction with existing browser
data.
4. Table Browser: bulk data manipulation and downloads, intersections and joins
between data sets.
5. In-Silico PCR: retrieving the DNA sequence bracketed by a pair of primers.

Part 1: Using Genome Browsers

For more information, see:


https://www-sciencedirect-com.libproxy1.nus.edu.sg/science/article/pii/S0888754308000451

Exercise 1: Genome Browser


The UCSC Genome Browser allows users to visualise multiple genome annotation datasets
(such as gene annotations, histone modifications, and sequence conservation) from multiple
databases against a genome assembly.

Use the genome browser to look at the Human genome (Assembly: hg38; Position:
chr5:66500000-67200000)
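
The assembly and position above can also be opened through a direct link. The sketch below is a minimal example, assuming the public UCSC `hgTracks` URL with `db` and `position` query parameters; the helper names `parse_position` and `browser_url` are made up for illustration. It also works out how many bases the region covers, treating the position as 1-based and inclusive.

```python
from urllib.parse import urlencode


def parse_position(position):
    """Split 'chr5:66500000-67200000' into (chrom, start, end)."""
    chrom, span = position.split(":")
    start_text, end_text = span.replace(",", "").split("-")
    start, end = int(start_text), int(end_text)
    if start < 1 or end < start:
        raise ValueError(f"invalid range: {position}")
    return chrom, start, end


def browser_url(db, position):
    """Build a Genome Browser link (assumed URL layout, see lead-in)."""
    query = urlencode({"db": db, "position": position})
    return f"https://genome.ucsc.edu/cgi-bin/hgTracks?{query}"


position = "chr5:66500000-67200000"
chrom, start, end = parse_position(position)
region_length = end - start + 1  # both ends are included
url = browser_url("hg38", position)

print(chrom, region_length)
print(url)
```

Pasting the printed link into a web browser should be equivalent to typing the position into the Genome Browser search box.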

Questions:
A. Does this region code for any protein according to the GENCODE transcript set? If yes,
which one?
B. Are there any single nucleotide polymorphisms (SNPs) in this region of the
chromosome?

Exercise 2: BLAT
BLAT stands for “BLAST-Like Alignment Tool”. It rapidly aligns highly similar sequences
(>95% similarity) to a selected genome assembly.

Use the BLAT tool with the following parameters:


Genome: Chimp
Assembly: panTro6
Query Type: DNA
Sort Output: query, score
Output Type: Hyperlink

The following DNA sequence was obtained from the human hg18 assembly. However, in this
exercise we will align it to the Chimp assembly with BLAT.

GTCCGGGCAGCCCCCGGCGCAGCGCGGCCGCAGCAGCCTCCGCCCCCCGCACGGTGTGAGCGCCCGACGCGGCC
GAGGCGGCCGGAGTCCCGAGCTAGCCCCGGCGGCCGCCGCCGCCCAGACCGGACGACAGGCCACCTCGTCGGCG
TCCGCCCGAGTCCCCGCCTCGCCGCCAACGCCACAACCACCGCGCACGGCCCCCTGACTCCGTCCAGTATTGAT
CGGGAGAGCCGGAGCGAGCTCTTCGGGGAGCAGCGATGCGACCCTCCGGGACGGCCGGGGCAGCGCTCCTGGCG
CTGCTGGCTGCGCTCTGCCCGGCGAGTCGGGCTCTGGAGGAAAAGAAAGGTAAGGGCGTGTCTCGCCGGCTCCC
GCGCCGCCCCCGGATCGCGCCCCGGACCCCGCAGCCCGCCCAACCGCGCACCGGCGCACCGGCTCGGCGCCCGC
GCCCCCGCCCGTCCTTTCCTGTTTCCTTGAGATCAGCTGCGCCGCCGACCGGGACCGCGGGAGGAACGGGACGT
TTCGTTCTTCGGCCGGGAGAGTCTGGGGCGGGCGGAGGAGGAGACGCGTGGGACACCGGGCTGCAGGCCAGGCG
GGGAACGGCCGCCGGGACCTCCGGCGCCCCGAACCGCTCCCAACTTTCTTCCCTCACTTTCCCCGCCCAGCTGC
GCAGGATCGGCGTCAGTGGGCGAAAGCCGGGTGCTGGTGGGCGCCTGGGGCCGGGGTCCCGCACGTGCGCCCCG
CGCTGTCTTCCCAGGGCGCGACGGGGTCCTGGCGCGCACCCGAGGGGCGGGCGCTGCCCACCCGCCGAGACTGC
ACTGTTTAGGGAAGCTGAGGAAGGAACCCAAAAATACAGCCTCCCCTCGGACCCCGCGGGACAGGCGGCTTTCT
GAGAGGACCTCCCCGCCTCCGCCCTCCGCGCAGGTCTCAAACTGAAGCCGGCGCCCGCCAGCCTGGCCCCGGCC
CCTCTCCAGGTCCCCGCGATCCTCGTTCCCCAGTGTGGAGTCGCAGCCTCGACCTGGGAGCTGGGAGAACTCGT
CTACCACCACCTGCGGCTCCCGGGGAGGGGTGGTGCTGGCGGCGGTTAGTTTCCTCGTTGGCAAAAGGCAGGTG
GGGTCCGACCCGCCCCTTGGGCGCAGACCCCGGCCGCTCGCCTCGCCCGGTGCGCCCTCGTCTTGCCTATCCAA
GAGTGCCCCCCACCTCCCGGGGACCCCAGCTCCCTCCTGGGCGCCCGCGCCGAAAGCCCCAGGCTCTCCTTCGA
TGGCCGCCTCGCGGAGACGTCCGGGTCTGCTCCACCTGCAGCCCTTCGGTCGCGCCTGGGCTTCGCGGTGGAGC
GGGACGCGGCTGTCCGGCCACTGCAGGGGGGGATCGCGGGACTCTTGAGCGGAAGCCCCGGAAGCAGAGCTCAT
CCTGGCCAACACCATGGTGTTTCAAAATGG
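
Because the query is wrapped over many lines, it can help to check it before pasting it into BLAT. The following is a minimal sketch, not part of the practical itself: the helper names `clean_sequence` and `gc_fraction` are hypothetical, and the short `wrapped` string is a toy stand-in for the full sequence above.

```python
def clean_sequence(raw_text):
    """Join wrapped lines, drop whitespace, and return one uppercase string."""
    sequence = "".join(raw_text.split()).upper()
    unexpected = set(sequence) - set("ACGTN")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    return sequence


def gc_fraction(sequence):
    """Fraction of bases that are G or C."""
    if not sequence:
        raise ValueError("empty sequence")
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


# Toy input: replace with the full query copied from the handout.
wrapped = """
acgt ACGT
GGCC
"""

query = clean_sequence(wrapped)
print(len(query), round(gc_fraction(query), 3))
```

A `ValueError` here usually means stray characters were copied along with the sequence.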

Questions:
A. Which chromosome(s) does this sequence belong to? What are the start and end
positions?
B. How identical is/are the aligned Chimp sequence(s) to the given Human sequence?
C. Does it code for any protein? If yes, which one? (Hint: Play around with the tracks in
the browser)

Exercise 3: Table Browser


The Table Browser formats data from specified genomes and tracks into plain text, allowing
users to download large datasets conveniently. This tool can be used after identifying
regions and tracks of interest with the Genome Browser.

Use the Table Browser tool to find all the genes present on chromosome 1 in the region
1-100,000. Set the parameters to:
Clade: Mammal
Genome: Human
Assembly: hg38
Group: Genes and Gene Predictions
Track: NCBI RefSeq
Region: Position chr1:1-100,000
Table: UCSC RefSeq (refGene)
Leave everything else as default

Click on “get output” to obtain the list of items, or “summary/statistics” for a summary of the results.
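
Once the output is downloaded, it can be summarised with a short script. The sketch below assumes tab-separated text whose first line is a header starting with `#` and that includes columns called `name` and `name2`; check this against your own download. The rows in `SAMPLE_OUTPUT` are invented placeholders, not real results, and only a reduced set of columns is shown. RefSeq accession prefixes indicate the molecule type (for example, `NM_` for mRNA and `NR_` for non-coding RNA), so the script counts entries by prefix.

```python
import csv
from collections import Counter

# Invented placeholder rows (NOT real Table Browser results).
SAMPLE_OUTPUT = (
    "#name\tchrom\tstrand\ttxStart\ttxEnd\tname2\n"
    "NR_FAKE01\tchr1\t+\t1000\t5000\tEXAMPLE1\n"
    "NR_FAKE02\tchr1\t-\t20000\t30000\tEXAMPLE2\n"
    "NM_FAKE03\tchr1\t+\t60000\t70000\tEXAMPLE3\n"
)


def read_rows(text):
    """Parse tab-separated output whose header line starts with '#'."""
    lines = text.splitlines()
    header = lines[0].lstrip("#").split("\t")
    return list(csv.DictReader(lines[1:], fieldnames=header, delimiter="\t"))


def summarise(rows):
    """Count accession prefixes and collect the distinct gene symbols."""
    prefixes = Counter(row["name"].split("_")[0] for row in rows)
    genes = sorted({row["name2"] for row in rows})
    return prefixes, genes


rows = read_rows(SAMPLE_OUTPUT)
prefixes, genes = summarise(rows)
print(len(rows), dict(prefixes), genes)
```

To use it on real data, read the saved file into a string and pass that to `read_rows` in place of `SAMPLE_OUTPUT`.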

Questions:
A. How many items did you find in this genomic region? What can you infer about these
result entries based on their names?

Part 2: In-Silico PCR


Polymerase chain reaction (PCR) is a very common wet-lab experiment for amplifying DNA
using two short nucleotide sequences (known as primers). With PCR, we can extract DNA
sequences of interest for further study. An example of this is shown in Fig. 1 below.

Each single-stranded DNA (ssDNA) is polar (i.e. has a direction), and DNA sequences are
conventionally presented in the 5’ (“five prime”) to 3’ (“three prime”) direction. The numbers
refer to carbon positions in each nucleotide’s sugar ring. Because of the asymmetry
of the sugar backbone, certain enzymatic reactions can only occur in one direction.
Polynucleotide synthesis by DNA polymerase is one such reaction, and it occurs in the 5’ to
3’ direction. The template strand is read from 3’ to 5’ during replication while the newly
synthesised strand is extended from 5’ to 3’ to generate antiparallel double-stranded DNA
(dsDNA).
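
Since the two strands are antiparallel, the partner of a strand written 5’ to 3’ is its reverse complement, also written 5’ to 3’. A minimal sketch of that conversion follows; the function name `reverse_complement` and the toy sequence are illustrative choices, not part of the practical.

```python
# Map each base to its Watson-Crick partner (both cases handled).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sequence):
    """Return the opposite strand, written 5' to 3'."""
    return sequence.translate(COMPLEMENT)[::-1]


top_strand = "ATGGCA"                           # 5'-ATGGCA-3'
bottom_strand = reverse_complement(top_strand)  # 5'-TGCCAT-3'
print(top_strand, bottom_strand)
```

Applying the function twice returns the original sequence, which is a quick sanity check.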

PCR is commonly used to amplify specific stretches of sequences from genomic DNA.
However, as primers are typically only 20-30 bases long, off-target amplification may
occur elsewhere in the genome. To avoid large-scale trial and error
of multiple primer pairs, UCSC’s In-Silico PCR tool allows users to select a genomic
assembly and simulate this process in-silico to check the specificity of designed primers and
identify any potential off-target amplicons.

Fig. 1. PCR primers bind to ssDNA to initiate extension
(https://www.khanacademy.org/science/biology/biotech-dna-technology/dna-sequencing-pcr-electrophoresis/a/polymerase-chain-reaction-pcr)

For more information on PCR:


https://www.khanacademy.org/science/biology/biotech-dna-technology/dna-sequencing-pcr-electrophoresis/a/polymerase-chain-reaction-pcr
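
The idea behind an in-silico PCR search can be sketched on a toy template. This is a simplified illustration under stated assumptions, not how the UCSC tool is implemented: it requires exact primer matches, searches only the strand given, and uses a hypothetical `max_size` cutoff. The `flip_reverse` option reflects one reading of the “Flip reverse primer” box, namely that the reverse primer was supplied on the same strand as the forward primer. All names and sequences below are made up.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the opposite strand, written 5' to 3'."""
    return sequence.translate(COMPLEMENT)[::-1]


def find_all(text, pattern):
    """Return every 0-based start position of pattern in text."""
    positions = []
    start = text.find(pattern)
    while start != -1:
        positions.append(start)
        start = text.find(pattern, start + 1)
    return positions


def predict_amplicons(template, forward, reverse, flip_reverse=False, max_size=4000):
    """Return (start, end, length) tuples; start is 0-based, end is exclusive."""
    template, forward, reverse = template.upper(), forward.upper(), reverse.upper()
    # Sequence the reverse primer corresponds to on the template strand as given.
    reverse_site = reverse if flip_reverse else reverse_complement(reverse)
    amplicons = []
    for start in find_all(template, forward):
        for site in find_all(template, reverse_site):
            end = site + len(reverse_site)
            # The reverse site must lie downstream of the forward primer.
            if site >= start + len(forward) and end - start <= max_size:
                amplicons.append((start, end, end - start))
    return amplicons


# Toy template: forward site, spacer, reverse site, with flanking bases.
template = "GG" + "ATGGCATTC" + "AAAAAA" + "CCGTTAGCA" + "TT"
hits = predict_amplicons(template, "ATGGCATTC", "TGCTAACGG")
flipped_hits = predict_amplicons(template, "ATGGCATTC", "CCGTTAGCA", flip_reverse=True)
print(hits, flipped_hits)
```

More than one tuple in the result would correspond to a primer pair that is not unique for the template.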

Exercise 4: In-Silico PCR


Set up the UCSC In-Silico PCR tool with the following parameters:
Genome: Mouse (Assembly: mm10)
Check the “Flip reverse primer” box
Forward Primer: TACTTCCCTTTCCTAGTTTTACAG
Reverse Primer: AGTGTCTAGTCTCAGTGTGTATCA
Leave everything else as default

Questions:
A. Is there a unique match for this primer combination?
B. Where is this match located and how long is the predicted target sequence
(amplicon)?

References:
This practical was adapted from:
Zweig, A.S., Karolchik, D., Kuhn, R.M., Haussler, D., Kent, W.J., 2008. UCSC
genome browser tutorial. Genomics 92, 75–84.
https://doi.org/10.1016/j.ygeno.2008.02.003
