human genetic variation

HUMAN GENETIC VARIATION

Why do we care about genetic variations?

1. Genetic variations underlie phenotypic differences among
different individuals.

2. Genetic variations determine our predisposition to complex diseases
and responses to drugs and environmental factors.

3. Genetic variations reveal clues about ancestral human migration
history.

Copy-number variations (CNVs)
A form of structural variation: alterations of the DNA of a genome that
result in the cell having an abnormal number of copies of one or more
sections of the DNA.

CNVs correspond to relatively large regions of the genome that have been
deleted (fewer than the normal number) or duplicated (more than the normal
number) on certain chromosomes.

CNVs can be caused by structural rearrangements of the genome such as
deletions, duplications, inversions, and translocations. Low copy repeats
(LCRs), which are region-specific repeat sequences, are susceptible to such
genomic rearrangements resulting in CNVs. Factors such as size, orientation,
percentage similarity and the distance between the copies influence the
susceptibility of LCRs to genomic rearrangement.

Main Types of Genetic Variations
A. Single nucleotide mutation
 Results in single nucleotide polymorphisms (SNPs)
 Accounts for up to 90% of human genetic variation
 The majority of SNPs do NOT directly or significantly contribute to any phenotype

B. Insertion or deletion of one or more nucleotide(s)

1. Tandem repeat polymorphisms
 Tandem repeats are genomic regions in which a sequence motif of variable length
is repeated in tandem with a variable copy number.
 Used as genetic markers for DNA fingerprinting (forensics, parentage testing)
 Many cause genetic diseases
 Microsatellites (Short Tandem Repeats): repeat unit 1-6 bases long
 Minisatellites: repeat unit 11-100 bases long

2. Insertion/Deletion (INDEL or DIPS) polymorphisms
 Often result from localized rearrangements between homologous tandem repeats.

C. Gross chromosomal aberration
 Deletions, inversions, or translocations of large DNA fragments
 Rare, but often cause serious genetic diseases

VNTR
A variable number tandem repeat (VNTR) is a location in a genome where a
short nucleotide sequence is organized as a tandem repeat.

These can be found on many chromosomes, and often show variations in length
between individuals.

Each variant acts as an inherited allele, allowing them to be used for personal or
parental identification. Their analysis is useful in genetics and biology
research, forensics, and DNA fingerprinting.

There are two principal families of VNTRs: microsatellites and minisatellites.
The former are repeats of sequences less than about 5 base pairs in length, while
the latter involve longer blocks.
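
As a rough illustration of how a VNTR allele can be described by its copy number, the sketch below counts the longest run of back-to-back copies of a motif and labels the motif by repeat-unit length. The function names, the GATA motif and the two toy sequences are made up for illustration; the length ranges (1-6 bases and 11-100 bases) are the ones given on the "Main Types of Genetic Variations" slide, and motifs outside both ranges are simply left unclassified.

```python
import re


def longest_tandem_run(sequence: str, motif: str) -> int:
    """Return the largest number of back-to-back copies of `motif` in `sequence`."""
    pattern = re.compile(f"(?:{re.escape(motif)})+")
    runs = [len(m.group(0)) // len(motif) for m in pattern.finditer(sequence)]
    return max(runs, default=0)


def repeat_family(motif: str) -> str:
    """Label a repeat unit by its length (ranges taken from the slides)."""
    n = len(motif)
    if 1 <= n <= 6:
        return "microsatellite"
    if 11 <= n <= 100:
        return "minisatellite"
    return "unclassified"


# Two hypothetical alleles of the same locus that differ only in copy number.
allele_a = "TTAG" + "GATA" * 7 + "CCT"
allele_b = "TTAG" + "GATA" * 10 + "CCT"

print(longest_tandem_run(allele_a, "GATA"), longest_tandem_run(allele_b, "GATA"))
print(repeat_family("GATA"))
```

Real genotyping works from fragment lengths or sequencing reads rather than exact string matching, so this only shows the idea that alleles differ in the number of repeat copies.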

In analysing VNTR data, two basic genetic principles can be used:

Identity matching: both VNTR alleles from a specific location must match. If two
samples are from the same individual, they must show the same allele pattern.

Inheritance matching: the VNTR alleles must follow the rules of inheritance. In
matching an individual with his or her parents or children, the person must have an
allele that matches one from each parent. If the relationship is more distant, such as a
grandparent or sibling, then matches must be consistent with the degree of relatedness.
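
The two principles above can be sketched in a few lines of Python. This is a minimal illustration, assuming each genotype is recorded as a pair of repeat counts per locus; the locus name `D1` and all allele values are hypothetical, and real casework also has to allow for mutation, null alleles and measurement error, which this sketch ignores.

```python
Genotype = tuple[int, int]  # repeat counts of the two alleles at one VNTR locus


def identity_match(sample_a: dict[str, Genotype], sample_b: dict[str, Genotype]) -> bool:
    """True if both samples show the same allele pair at every shared locus."""
    shared = sample_a.keys() & sample_b.keys()
    return bool(shared) and all(
        sorted(sample_a[locus]) == sorted(sample_b[locus]) for locus in shared
    )


def consistent_with_parents(child: Genotype, mother: Genotype, father: Genotype) -> bool:
    """True if one child allele can come from the mother and the other from the father."""
    first, second = child
    return (first in mother and second in father) or (second in mother and first in father)


# Hypothetical data: allele order within a pair does not matter.
print(identity_match({"D1": (12, 15)}, {"D1": (15, 12)}))
print(consistent_with_parents((12, 15), mother=(12, 14), father=(15, 16)))
print(consistent_with_parents((12, 15), mother=(12, 14), father=(13, 16)))
```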

The Human Gene Mutation Database (HGMD)

The HGMD was established in April 1996 to collate published germline
mutations responsible for human inherited disease. In October 2001,
HGMD contained 26,637 mutations.

The scope of HGMD is limited to mutations leading to a defined inherited
phenotype, including a broad range of mechanisms, such as point
mutations, insertions/deletions, duplications and repeat expansions within the
coding regions of genes.

 The HGMD search interface is primarily text based, and targeted
searching tends to rely on knowledge of the correct HUGO nomenclature
for a gene.

The Protein Mutation Database (PMD)
 The Protein Mutation Database (PMD) is unique among genetic variation databases
as it contains both natural and artificial mutation data derived from human proteins.

 The database gives detailed description of the functional or structural effects of the
mutations if known and provides links to the original publications. Relative
differences in activity and/or stability, in comparison with the wild-type protein,
are also indicated.

 A complete report on the mutated protein sequence is displayed, which
allows the user to see the position of altered amino acids. Where 3D structures
have been experimentally determined, PMD displays mutated residues in a different
colour on the 3D structure.

 PMD contains 119,190 natural and artificial mutations (January 2002), and these can be
searched by keyword or sequence similarity.

EXAMPLE - Hemoglobin

Database of Genomic Variants archive (DGVa)

DGVa is a central archive that receives data from, and distributes data to, a number of
resources:

 The DGVa accepts direct submissions from researchers, and accession numbers for
data objects included in these are given the prefix 'e'.

 The DGVa also exchanges data on a regular basis with dbVar. Data objects
accessioned by dbVar have the prefix 'n'.

 You can retrieve DGVa data from the data download page, search the DGVa using
BioMart, and view the data in a genomic context using Ensembl.

The DGVa also supplies data to DGV (Database of Genomic Variants, hosted by The
Centre for Applied Genomics in Canada), where further curation and interpretation are
carried out.

The data stored in the DGVa are organised according to the DGVa's data model and
are centred around three types of object:
 the study
 the genomic region in which the variation occurs
 the particular variation observed in an individual sample (call)

Study
'Study' is the placeholder for all data objects and related information for a genomic
structural variation study. The accession number has the prefix estd (or, if the study has
been curated by dbVar, nstd). Study-related information includes details about the study
authors and their affiliation, the type of study and the publication that describes the
study.
Region
'Region' denotes the genomic location where structural variation is asserted to exist. The
accession number has the prefix esv (or, if the study has been curated by dbVar, the
prefix is nsv). Study authors assert the presence of a structural variant region on the basis
of individual variation in samples. Region-related information includes the assertion
method, which describes how the variation in samples has been merged to define the
region (for example, sample calls overlap by at least 80%).
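
The 80% overlap rule mentioned as an example assertion method can be illustrated as follows. The slides do not say how the overlap is measured, so this sketch assumes a reciprocal definition (shared length divided by the longer of the two calls) and half-open coordinates; both choices, the function names and the coordinates are assumptions made for illustration, not the DGVa's own procedure.

```python
Interval = tuple[int, int]  # (start, end) on one chromosome, end exclusive


def overlap_fraction(a: Interval, b: Interval) -> float:
    """Shared length divided by the longer call (an assumed, reciprocal definition)."""
    (a_start, a_end), (b_start, b_end) = a, b
    shared = min(a_end, b_end) - max(a_start, b_start)
    if shared <= 0:
        return 0.0
    return shared / max(a_end - a_start, b_end - b_start)


def same_region(a: Interval, b: Interval, threshold: float = 0.8) -> bool:
    """True if two sample calls overlap enough to be merged into one region."""
    return overlap_fraction(a, b) >= threshold


# Hypothetical calls from two samples.
print(same_region((1000, 2000), (1100, 2050)))  # 900 shared bases out of 1000
print(same_region((1000, 2000), (1500, 2500)))  # 500 shared bases out of 1000
```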
Call
'Call' describes the individual variation observed in a sample. The accession number has
the prefix essv (or, if the study has been curated by dbVar, the prefix is nssv). Call-related
information includes the name of the sample, the experimental procedure that generated
the call (e.g. sequencing or array), the type of variation (e.g. deletion or insertion) and
placement (location) in the sample's genome.
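
The accession prefixes described for studies, regions and calls lend themselves to a small lookup. The sketch below assumes that an accession is one of the six prefixes followed by digits only; that numeric-suffix format, the function name and the example accessions are assumptions for illustration.

```python
import re
from typing import NamedTuple


class Accession(NamedTuple):
    archive: str      # "DGVa" or "dbVar"
    object_type: str  # "study", "region" or "call"
    number: int


_ARCHIVES = {"e": "DGVa", "n": "dbVar"}
_OBJECT_TYPES = {"std": "study", "sv": "region", "ssv": "call"}
# Assumed layout: archive letter, object-type code, then digits.
_PATTERN = re.compile(r"^([en])(std|ssv|sv)(\d+)$")


def parse_accession(accession: str) -> Accession:
    """Split an accession into the archive that issued it, its object type and number."""
    match = _PATTERN.match(accession.strip())
    if match is None:
        raise ValueError(f"unrecognised accession: {accession!r}")
    prefix, kind, digits = match.groups()
    return Accession(_ARCHIVES[prefix], _OBJECT_TYPES[kind], int(digits))


# Made-up accession numbers, used only to show the prefixes.
for example in ("estd1", "nsv100", "essv42"):
    print(example, parse_accession(example))
```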

Database of Genomic Variants
The objective of the Database of Genomic Variants is to provide a comprehensive
summary of structural variation in the human genome.

Structural variation is defined as genomic alterations that involve
segments of DNA that are larger than 1 kb. InDels in the 100 bp-1 kb
range are also annotated.

The Database of Genomic Variants provides a useful catalog of
control data for studies aiming to correlate genomic variation with
phenotypic data.
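
The size thresholds above can be written as a simple classifier. This is only a sketch: the function name and labels are made up, and treating the 100 bp and 1 kb boundaries as inclusive for the indel range is an assumption, since the slide does not say how the boundary values are handled.

```python
def dgv_size_class(length_bp: int) -> str:
    """Place a variant into the size classes described for DGV."""
    if length_bp > 1000:
        return "structural variant (>1 kb)"
    if 100 <= length_bp <= 1000:  # boundary handling is an assumption
        return "annotated indel (100 bp-1 kb)"
    return "below the annotated range"


for size in (50, 100, 1000, 1001, 250_000):
    print(size, dgv_size_class(size))
```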

EXAMPLE - NM_030798
This gene encodes an RCC1-like G-exchanging factor. It is deleted in Williams
syndrome, a multisystem developmental disorder caused by the deletion of contiguous
genes at 7q11.23.

Online Mendelian Inheritance in Man (OMIM):
A Brief Overview
 URL: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=OMIM

 OMIM is a database of human genetic disorders, built and curated using results
from published studies.

 Each OMIM record provides a summary of the current state of knowledge of the
genetic basis of a disorder, and contains the following information:

 description and clinical features of a disorder or a gene involved in genetic
disorders;
 biochemical and other features;
 cytogenetics and mapping;
 molecular and population genetics;
 diagnosis and clinical management;
 animal models for the disorder;
 allelic variants.

 OMIM is searchable via NCBI Entrez, and its records are cross-linked to other
NCBI resources.

Online Mendelian Inheritance in Man Stats

OMIM: Allelic Variants
 The OMIM database includes genetic disorders caused by various types of
mutation/variation, from SNPs to large-scale chromosomal abnormalities.

 The listed allelic variants are searchable through the "Allelic Variants" field. They include:
 single nucleotide substitutions (SNPs);
 small insertions and deletions (INDEL/DIPS);
 frame shifts caused by these INDELs.

 Allelic variants are represented by a 10-digit OMIM number (the 6-digit MIM number
of the parent entry followed by a decimal point and a 4-digit variant number), and can be
searched in two ways:
 Search for a gene or a disease and, once it is retrieved, view its allelic variants.
 Use the Limits to narrow your search to:
-- retrieve only records that contain allelic variant information;
-- search for particular terms within the allelic variants field.
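
To make the 10-digit allelic variant number concrete, the sketch below splits one into its two parts, assuming it is written as a 6-digit entry number, a decimal point and a 4-digit variant number. The function name is made up, and the identifier in the example is there only to show the format.

```python
import re

# Assumed layout: six digits, a decimal point, four digits (ten digits in total).
_AV_PATTERN = re.compile(r"^(\d{6})\.(\d{4})$")


def split_allelic_variant(av_id: str) -> tuple[str, str]:
    """Return (entry number, variant number) for an OMIM allelic variant identifier."""
    match = _AV_PATTERN.match(av_id.strip())
    if match is None:
        raise ValueError(f"not a 10-digit allelic variant number: {av_id!r}")
    return match.group(1), match.group(2)


entry, variant = split_allelic_variant("141900.0243")
print(entry, variant)
```

Both parts are kept as strings so that leading zeros in the variant number are not lost.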

 For most genes, only selected mutations are included.
Criteria for inclusion include: the first mutation to be discovered, high population
frequency, distinctive phenotype, historic significance, unusual mechanism of
mutation, unusual pathogenetic mechanism, and distinctive inheritance.

 Most of the allelic variants represent disease-producing mutations, NOT
polymorphisms.

 A few polymorphisms are included, many of which show a positive statistical
correlation with particular common disorders.

 Few neutral polymorphisms are included in OMIM.

 Some SNPs in the dbSNP records are not linked to the corresponding OMIM
records.

dbVar
dbVar is a database of genomic structural variation that allows you to
search, view, and download variant data from studies submitted for any
organism. In general, variants are ≥ 50 nucleotides, but are occasionally
smaller. dbVar provides access to the raw data (when available) and links
to other NCBI and external resources.

The browser can be sorted by:
- Study accession, Organism, Study Type, Method, Number of Variant
Regions, Number of Variant Calls, or Publication

This is followed by a Detailed Information section where you can download
variant data for the current study (Variant Regions, Variant Calls, Both, or
everything via FTP) or browse details about Variant Summary, Samplesets,
Experimental Details, or Validations in a tabbed format.

GENETIC MARKER AND MICROSATELLITE
DATABASES
dbSTS

dbSTS is an NCBI resource that contains sequence data for short genomic landmark
sequences, or Sequence Tagged Sites (STSs). These STSs can include polymorphic sequences
such as short tandem repeats (STRs) or non-polymorphic sequences.

The dbSTS database maintains complete records for 133,202 STS markers,
including 18,000 STR markers, and gives key information for each record such as
primer sequences, map location and marker aliases.

dbSTS can be searched in several ways. The UniSTS interface allows
direct searches by keyword, the NCBI Map View application allows searching by
genomic location or locus, and dbSTS is also available for sequence searching with
NCBI BLAST.

This array of search options makes the dbSTS database a very reliable source for
retrieval of both genetic and physical STS map markers.
NON-NUCLEAR AND SOMATIC MUTATION
DATABASES
The mitochondrial genome consists of a 16,569-bp closed circular molecule in
the mitochondrion. Each of the several thousand mtDNAs per cell contains a
control region encompassing a replication origin and the promoters, and encodes
a large (16S) and small (12S) rRNA, 22 tRNAs, and 13 polypeptides.

Maternally inherited mtDNA has a very high mutation rate: mtDNA mutates
10–20 times faster than nuclear DNA as a result of inadequate proofreading by
mitochondrial DNA polymerases and limited mtDNA repair capability.

 As a result, mtDNA mutations might be expected to be relatively common. This
is supported by the relative abundance of mitochondrial disorders described so
far, although it is also important to note that such mutations, being
comparatively easy to identify by sequencing, are likely to have been among the
first to be characterized.

The MITOMAP database integrates information on all known mtDNA
mutations and polymorphisms with the broad spectrum of available molecular,
genetic, functional and clinical data into an integrated resource which can be
queried from a variety of different perspectives.

MITOMAP

Somatic Mutations
A completely distinct category of human mutations arises somatically during
the process of tumourigenesis.

 These mutations may take many forms; the most commonly characterized are
somatic point mutations identified during the screening of candidate genes in
tumour tissues.

 Cytogenetic studies of human neoplasias have also identified a number of
chromosomal aberrations involving large deletions and duplications.

As somatic mutations are not inherited, it is important to avoid
mixing somatic point mutation data with inherited human polymorphism and mutation
data.

COSMIC

EXAMPLE – T315I mutation
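
A label such as T315I follows the common shorthand for an amino acid substitution: the one-letter code of the original residue, its position in the protein, and the one-letter code of the replacement. The sketch below unpacks that shorthand; the function name is made up, and it handles only simple single-residue substitutions written in exactly this form.

```python
import re

# Standard one-letter to three-letter amino acid codes.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def parse_substitution(label: str) -> tuple[str, int, str]:
    """Return (original residue, position, new residue) for a label like 'T315I'."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label.strip())
    if match is None:
        raise ValueError(f"not a simple substitution label: {label!r}")
    original, position, replacement = match.groups()
    if original not in THREE_LETTER or replacement not in THREE_LETTER:
        raise ValueError(f"unknown amino acid code in {label!r}")
    return THREE_LETTER[original], int(position), THREE_LETTER[replacement]


print(parse_substitution("T315I"))  # threonine at position 315 replaced by isoleucine
```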

THANK YOU
