Lec 25 Nov


Human genome project

Goal: to sequence the whole of human DNA

Human genome sequencing overview
DNA -> partial restriction digestion -> cloning into BACs -> shotgun sequencing approach
The BAC clones were then screened for sequence-tagged sites (STSs): DNA fragments, usually less than 500 base pairs in length, of known
sequence and chromosomal location, which can be amplified using the polymerase chain reaction (PCR).
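STS screening can be pictured as an in silico PCR: a clone carries the STS if both primers find their sites in the right orientation. The sketch below is only an illustration under simplifying assumptions (exact primer matches, no mismatches, a single product); the clone sequence and both primers are invented toy data, not real STS primers.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq))


def sts_product_length(clone_seq, forward_primer, reverse_primer):
    """Return the predicted PCR product length, or None if the STS is absent.

    The forward primer must match the given strand; the reverse primer must
    match the opposite strand downstream of it (exact matches only).
    """
    start = clone_seq.find(forward_primer)
    if start == -1:
        return None
    site = reverse_complement(reverse_primer)
    end = clone_seq.find(site, start + len(forward_primer))
    if end == -1:
        return None
    return end + len(site) - start


# Hypothetical toy clone and primers, invented for illustration.
clone = "TTGACCGTAAGCTTGGCATCCAGTTACGGATCCAA"
print(sts_product_length(clone, "GACCGTAA", "GGATCCGT"))  # 31
print(sts_product_length(clone, "AAAAAAAA", "GGATCCGT"))  # None
```

A clone that returns a product length is scored as positive for that STS; two clones positive for the same STS must overlap.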

Library clones were also digested with a restriction enzyme, and the sizes of the resulting DNA fragments were determined
using agarose gel electrophoresis. Each library clone exhibited a DNA fragment "fingerprint," which could be compared to that
of all other library clones in order to identify overlapping clones.
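The fingerprint comparison can be sketched as counting bands of similar size shared by two clones. This is a minimal sketch, not the method actually used in the project: the 2% size tolerance, the 50% threshold, and the fragment sizes are all assumptions chosen for illustration.

```python
def shared_bands(fingerprint_a, fingerprint_b, tolerance=0.02):
    """Count bands of clone A that match a band of clone B.

    Two bands match if their sizes differ by at most `tolerance`
    (relative). Each band of clone B is used at most once.
    """
    remaining = sorted(fingerprint_b)
    shared = 0
    for size in sorted(fingerprint_a):
        for i, other in enumerate(remaining):
            if abs(size - other) <= tolerance * max(size, other):
                shared += 1
                del remaining[i]
                break
    return shared


def likely_overlap(fingerprint_a, fingerprint_b, min_fraction=0.5):
    """Call two clones overlapping if enough of the smaller fingerprint is shared."""
    shared = shared_bands(fingerprint_a, fingerprint_b)
    return shared / min(len(fingerprint_a), len(fingerprint_b)) >= min_fraction


# Hypothetical fragment sizes in base pairs, as read from a gel.
clone_1 = [1200, 3400, 5100, 7600, 9800]
clone_2 = [3410, 5100, 7590, 11000, 2500]
clone_3 = [800, 1500, 4300, 6200]

print(shared_bands(clone_1, clone_2))    # 3
print(likely_overlap(clone_1, clone_2))  # True
print(likely_overlap(clone_1, clone_3))  # False
```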

Fluorescence in situ hybridization (FISH) was also used to map library clones to specific chromosomal regions. Collectively, the
STS, DNA fingerprint, and FISH data allowed the scientists to generate contigs, which consisted of multiple overlapping
bacterial artificial chromosome (BAC) library clones spanning each of the 24 different human chromosomes (i.e., 22 autosomes
and the X and Y chromosomes).
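Once pairwise overlaps between clones are known (from STS content or fingerprints), grouping clones into contigs amounts to finding connected groups in an overlap graph. The sketch below shows only that grouping step; it does not order clones along the chromosome, and the clone names and overlap pairs are hypothetical.

```python
def build_contigs(clones, overlaps):
    """Group clones into contigs: connected components of the overlap graph."""
    neighbours = {clone: set() for clone in clones}
    for a, b in overlaps:
        neighbours[a].add(b)
        neighbours[b].add(a)

    contigs = []
    seen = set()
    for clone in clones:
        if clone in seen:
            continue
        # Depth-first walk collecting every clone reachable through overlaps.
        stack = [clone]
        members = []
        seen.add(clone)
        while stack:
            current = stack.pop()
            members.append(current)
            for other in neighbours[current]:
                if other not in seen:
                    seen.add(other)
                    stack.append(other)
        contigs.append(sorted(members))
    return contigs


# Hypothetical clones and overlap calls.
clones = ["BAC1", "BAC2", "BAC3", "BAC4", "BAC5"]
overlaps = [("BAC1", "BAC2"), ("BAC2", "BAC3"), ("BAC4", "BAC5")]
print(build_contigs(clones, overlaps))
# [['BAC1', 'BAC2', 'BAC3'], ['BAC4', 'BAC5']]
```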

Next, individual BAC clones selected for DNA sequence analysis were further fragmented, and the smaller genomic DNA
fragments were subcloned into vectors to generate a BAC-derived shotgun library. The inserts were sequenced using primers
matching the vector sequence flanking the genomic DNA insert, and overlapping shotgun clones were used to generate a DNA
sequence spanning the entire BAC clone. The term "shotgun" comes from the fact that the original BAC clone was randomly
fragmented and sequenced, and the raw DNA sequence data was then subjected to computational analyses to generate an
ordered set of DNA sequences that spanned the BAC clone.
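The computational step can be illustrated with a toy greedy assembler that repeatedly merges the two reads with the longest suffix-prefix overlap. This is a teaching sketch under strong assumptions (error-free reads, one strand, no repeats longer than the overlap), not the assembly software used for the human genome; the reads are invented.

```python
def overlap_length(left, right, min_overlap):
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    for size in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def greedy_assemble(reads, min_overlap=3):
    """Repeatedly merge the pair of reads with the longest overlap."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_size, best_i, best_j = 0, None, None
        for i, left in enumerate(reads):
            for j, right in enumerate(reads):
                if i == j:
                    continue
                size = overlap_length(left, right, min_overlap)
                if size > best_size:
                    best_size, best_i, best_j = size, i, j
        if best_size == 0:
            break  # no overlaps left: the remaining reads stay separate
        merged = reads[best_i] + reads[best_j][best_size:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return reads


# Hypothetical shotgun reads from the toy sequence ATGCGTACGTTAGC.
reads = ["GTTAGC", "ATGCGTAC", "GTACGTTA"]
print(greedy_assemble(reads))  # ['ATGCGTACGTTAGC']
```

Greedy merging can misassemble repeated sequence, which is one reason the BAC-by-BAC strategy first restricts each assembly to a single clone.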
Genome sequence

13 years
$2.7 billion
130 volumes

Was it worth it?
What information do you get?
Why was so much money and time invested in
sequencing the human genome?
Look for the genes
• After genome sequencing, scientists look for the positions of genes on
chromosomes
• First they made a rough estimate of the number of genes from the genome sizes of
different organisms; then they determined the actual number and location
of genes.
Estimation of number of genes
• Assumptions
• The entirety of the genome codes for genes of interest
• A typical protein contains roughly 300 amino acids (approx. 1000 bases)
• Naïve estimate
Number of genes = genome size / 1000
Example: 10,000 bases / 1000 = 10 genes
For eukaryotes it was hard to make a correct guess because a major
part of the genome is non-coding.

This strategy fails spectacularly when applied to eukaryotic genomes, resulting, for example, in the estimate that the
human genome should contain 3,000,000 genes, a gross overestimate.
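The arithmetic behind the naive estimate is easy to reproduce. The sketch below applies the slide's rule (one gene per 1000 bases) to the 10,000-base example and to a human genome of about 3 billion base pairs, the size implied by the 3,000,000-gene figure; the function name is made up for illustration.

```python
def naive_gene_count(genome_size_bp, bases_per_gene=1000):
    """Naive estimate: assume the whole genome is coding, ~1000 bases per gene."""
    return genome_size_bp // bases_per_gene


print(naive_gene_count(10_000))         # 10
print(naive_gene_count(3_000_000_000))  # 3000000
```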
How to locate genes on sequence map
• Genetic mapping and physical mapping had already provided the
sequences of some genes. Short sequenced fragments of expressed genes
are known as expressed sequence tags (ESTs).
• These ESTs are located on chromosomes using fluorescence in situ
hybridization (FISH).
Fluorescence in situ hybridization (FISH)
FISH is used to locate a specific DNA sequence on a chromosome. In this approach, a fluorescent
dye is attached to a purified piece of DNA, and then that DNA is incubated with the full
set of chromosomes from the originating genome, which have been attached to a glass
microscope slide. The fluorescently labeled DNA finds its matching segment on one of the
chromosomes, where it sticks. By looking at the chromosomes under a microscope, a
researcher can find the region where the DNA is bound because of the fluorescent dye
attached to it. This information thus reveals the location of that piece of DNA in the
starting genome.
How to locate genes on sequence map
• Bioinformatics was used to predict exons
• Software tools are available to identify exons
• Homology searches help in gene prediction
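As a much simplified picture of what coding-region software looks for, the sketch below scans the forward strand for open reading frames (an ATG followed in frame by a stop codon). It is an assumption-laden toy: it ignores introns, the reverse strand, and statistical signals, all of which real eukaryotic gene predictors must handle. The sequence is invented.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3):
    """Return (start, end) pairs of forward-strand ORFs, end exclusive.

    An ORF here runs from an ATG to the first in-frame stop codon and must
    contain at least `min_codons` codons before the stop.
    """
    orfs = []
    for frame in range(3):
        start = None
        for pos in range(frame, len(seq) - 2, 3):
            codon = seq[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos
            elif start is not None and codon in STOP_CODONS:
                if (pos - start) // 3 >= min_codons:
                    orfs.append((start, pos + 3))
                start = None
    return sorted(orfs)


# Hypothetical toy sequence.
sequence = "CCATGGCTGAAACCTAGGG"
for start, end in find_orfs(sequence):
    print(start, end, sequence[start:end])  # 2 17 ATGGCTGAAACCTAG
```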
Software for homology search
• BLAST (Basic Local Alignment Search Tool)
Software for gene prediction

FragGeneScan: predicts genes in complete genomes and sequencing reads
ATGpr: identifies translational initiation sites in cDNA sequences
AUGUSTUS: eukaryotic gene predictor
DIOGENES: fast detection of coding regions in short genome sequences
Dragon Promoter Finder: recognizes vertebrate RNA polymerase II promoters
How to create genomic library
• Bacterial artificial chromosomes (BACs)
• Yeast artificial chromosomes (YACs)
BACs
• A bacterial artificial chromosome (BAC) is
an engineered DNA molecule used to
clone DNA sequences in bacterial cells
• A short piece of the organism's DNA is
amplified as an insert in a BAC and then
sequenced. Finally, the sequenced parts
are assembled in silico, resulting in the
genomic sequence of the organism
• Segments of an organism's DNA, ranging
from 100,000 to about 300,000 base
pairs, can be inserted into BACs.
• The BACs, with their inserted DNA, are
then taken up by bacterial cells. As the
bacterial cells grow and divide, they
amplify the BAC DNA, which can then be
isolated and used in sequencing DNA
• BACs are based on a functional fertility plasmid (F-plasmid)
• F-plasmids play a crucial role because they contain partition genes
that promote the even distribution of plasmids after bacterial cell
division
Common gene components
• repE
For plasmid replication and regulation of copy
number.
• parA and parB
For partitioning F-plasmid DNA to daughter
cells during division, which ensures stable
maintenance of the BAC.
• A selectable marker
For antibiotic resistance; some BACs also have
lacZ at the cloning site for blue/white selection.
• T7 and SP6
Phage promoters for transcription of inserted
genes.
How to create BAC library
In general BAC clones carry inserts of DNA up to 200,000 base pairs
The DNA fragments in BACs are much simpler to sequence than trying to tackle an entire genome
Making a BAC library
Isolate the cells containing the DNA you want to store. For animal and human BAC libraries, the DNA normally comes from white
blood cells.
The cells are embedded in agarose gel and then treated with enzymes that dissolve their cell membranes and release the DNA into the gel.
A DNA-cutting enzyme is used to chop the DNA into pieces around 200,000 base pairs in length.
The DNA fragments are extracted from the gel.
These DNA fragments are inserted into a BAC vector using an enzyme called ligase to join the two bits of DNA together. They are
now called BAC clones.
The BAC clones are added to bacterial cells, usually E. coli.
The bacteria are then spread on nutrient-rich plates that allow only the bacteria that carry BAC clones to grow.
The bacteria grow rapidly, resulting in lots of bacterial cells, each containing a copy of the BAC clone.
After they have grown, the bacteria are 'picked' into plates of 96 or 384 wells so that each well contains a single BAC clone.
The bacteria can also be copied or frozen and kept until researchers are ready to use the DNA for sequencing.
A BAC library has been created.
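A quick calculation gives a feel for the size of such a library. The sketch below assumes a genome of about 3 billion base pairs and inserts of 200,000 base pairs, and computes the minimum number of clones if they tiled the genome end to end with no overlap, plus the number of 96-well or 384-well plates needed to hold them. Real libraries need several times more clones because fragments are random and must overlap; the function names are invented for illustration.

```python
import math


def minimum_clones(genome_size_bp, insert_size_bp):
    """Fewest clones that could cover the genome with no overlap at all."""
    return math.ceil(genome_size_bp / insert_size_bp)


def plates_needed(clone_count, wells_per_plate):
    """Number of plates needed when each well holds one clone."""
    return math.ceil(clone_count / wells_per_plate)


clones = minimum_clones(3_000_000_000, 200_000)
print(clones)                      # 15000
print(plates_needed(clones, 96))   # 157
print(plates_needed(clones, 384))  # 40
```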
Yeast artificial chromosomes (YACs)
These are genetically engineered chromosomes derived from the DNA of the yeast Saccharomyces cerevisiae, which is
then ligated into a bacterial plasmid.
By inserting large fragments of DNA, from 100–1000 kb, the inserted sequences can be cloned and physically mapped
using a process called chromosome walking.
Components of YACs:
• Ligation of selectable marker into plasmid vector: this allows for the differential selection of colonies with, or
without the marker gene. An antibiotic resistance gene allows the YAC vector to be amplified and selected for in E.
coli by rescuing the ability of mutant E. coli to synthesize leucine in the presence of the necessary components
within the growth medium. TRP1 and URA3 genes are other selectable markers. The YAC vector cloning site for
foreign DNA is located within the SUP4 gene. This gene compensates for a mutation in the yeast host cell that
causes the accumulation of red pigment. The host cells are normally red, and those transformed with YAC only, will
form colorless colonies. Cloning of a foreign DNA fragment into the YAC causes insertional inactivation of the gene,
restoring the red color. Therefore, the colonies that contain the foreign DNA fragment are red.
• A yeast centromere that ensures chromosome partitioning between the two daughter cells
• Ligation of Autonomously Replicating Sequences (ARS) providing an origin of replication to undergo mitotic
replication. This allows the plasmid to replicate extrachromosomally.
• Ligation of artificial telomeric sequences to convert circular plasmid into a linear piece of DNA
• Insertion of DNA sequence to be amplified (up to 1000kb)
Cloning in YACs
• Human gene mapping began in 1911, when color blindness was found to lie on the X chromosome.
• In the late 1960s, two technical developments occurred.
• First, somatic cell hybridization became a mapping strategy. This method mixed chromosomes by
fusing together cells from humans and other organisms. The mixed chromosomes fragment and
reorganize into metastable cell lines that retain various amounts of human DNA. It turned out
that rodent-human cell lines, after a few generations, generally retained mainly rodent and only a
small amount of human DNA and were relatively stable over time. By assembling large numbers
of such cell lines, and devising clever ways to select only those cells that contained functional
genes of interest, it became possible to map genes.
• During this period, it also became possible to differentiate the 24 distinct human chromosomes
under the light microscope by staining them with DNA-binding dyes, producing a karyotype
(normally 22 pairs of autosomes and either a pair of Xs, in females, or an X and Y, in males). In a
photograph of the nucleus of a cell, the chromosomes could be directly seen and large-scale
deletions, rearrangements, and duplications detected.
• In the mid-1970s, recombinant DNA technology led to the isolation and cloning
of hundreds of human genes, and then to mapping by linkage to DNA
markers. Once located, the markers could be used to trace the
inheritance of bits of chromosomes through families, so that the
inheritance of markers could be compared with the inheritance of
diseases or other traits.
DNA is fragmented using restriction enzymes (RE)
Fragments are ligated into BACs
Each BAC: DNA isolation -> PCR -> sequencing
STSs (about 500 bp of known sequence, e.g. ESTs) are identified in the BACs
BAC DNA was also cut with RE and run on gel electrophoresis, and fragment sizes were
determined (DNA fingerprinting)
FISH was used to locate specific STSs
Alignment
Analysis
Scientists look for the ESTs
