Module8 ComparGenomics
Julie Horvath, PhD
Evolutionary Anthropology Department, Primate Genomics Initiative, Institute for Genome Sciences & Policy, Duke University
Comparative Genomics:
A tool for decoding genomic information stored in DNA
Functional sequences evolve more slowly than nonfunctional sequences
Sequences that remain conserved throughout evolution may perform a biological function
Evolutionary reconstructions of genes and genomes
Apply knowledge to medicine, biotechnology, agriculture, conservation
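The idea that conserved sequence may mark function can be illustrated with a sliding-window percent-identity scan over two sequences that have already been aligned. This is a minimal sketch, not a tool from the slides: the function names, window size, step, and threshold are all illustrative assumptions, and gaps are written as "-".

```python
def window_identity(seq_a, seq_b, window=10, step=5):
    """Percent identity in sliding windows over two pre-aligned sequences.

    Assumes both strings come from the same alignment (equal length,
    gaps written as '-'). Gap columns never count as matches.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    results = []
    for start in range(0, len(seq_a) - window + 1, step):
        a = seq_a[start:start + window]
        b = seq_b[start:start + window]
        matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
        results.append((start, 100.0 * matches / window))
    return results


def conserved_windows(seq_a, seq_b, window=10, step=5, threshold=70.0):
    """Keep only windows at or above the (illustrative) identity threshold."""
    return [(start, pid)
            for start, pid in window_identity(seq_a, seq_b, window, step)
            if pid >= threshold]


# Toy aligned sequences (invented): identical first half, divergent second half.
toy_a = "ACGTACGTACGTACGTACGT"
toy_b = "ACGTACGTACTTTTTTTTTT"
print(conserved_windows(toy_a, toy_b))
```

Windows that stay above the threshold across distantly related species are candidates for functional elements; they are hypotheses to test, not proof of function.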
Genotype Phenotype
Humans and Chimpanzees have different disease susceptibilities:
A common sialic acid (Neu5Gc) is missing in humans because the gene that produces it (CMAH) is inactivated; this may affect cell surface binding of various pathogens (Chou et al. PNAS 2002)
Humans have smaller masticatory muscles than other apes:
Humans have a loss-of-function mutation in the MYH16 myosin gene
Did this myosin mutation lead to gracilization of the human skull? (Stedman et al. Nature 2004 and McCollum et al. J Hum Evol 2006)
**Use caution when linking genes to phenotype!**
See Varki and Altheide Genome Research 2005
Where To Start?
To identify conserved regions, you must:
Decide which species you would like to compare
Identify and extract the relevant genome sequences
Annotate genes and other features found in the genome sequences
| Human vs. | Chimpanzee | Mouse | Opossum | Pufferfish |
|---|---|---|---|---|
| Size (Gbp) | 3.0 | 2.5 | 4.2 | 0.4 |
| Time since divergence | ~6 MYA | >90 MYA | ~150 MYA | ~450 MYA |
| Sequence conservation (in coding regions) | >99% | ~80% | ~70-75% | ~65% |
| Aids identification of | Recently changed sequences and genomic rearrangements | Both coding and noncoding sequences | Both coding and noncoding sequences | Primarily coding sequences |
| Background noise | High | Moderate | Low | Lower |
Orthologues: Genes in different species that are derived from the same gene in the last common ancestor
Paralogues: Gene families that have diverged within a single species, often by duplication
[Figure: gene duplication diagram; labels: Orthologues, Duplication, Paralogues, Novel function, Gene 1, Gene 2, Gene 3]
Functional Orthologues
Links to the closest putative orthologous genes in other species
Hyperlinks to view alignments & positional information
Contains a wealth of information about homologous genes and links to other resources
Trace archives
Be Cautious!
BLAST searches can identify related but non-orthologous genes
[Figure: Human chr1 (~220 kb; BMP8a, BMP8b) vs. Mouse chr4 (~170 kb; Bmp8b)]
Confirm that no other paralogous genes are present in your species of interest (BLAST, self-chain track @ UCSC, Segmental Dup track @ UCSC, Ensembl paralogous genes)
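One common heuristic for guarding against this pitfall is a reciprocal-best-hit check: a pair is kept as a putative orthologue only if each gene is the other's top-scoring hit. The slides do not prescribe this method; it is shown here only as a sketch. The gene names and scores below are invented for illustration, and the input dictionaries are a hypothetical stand-in for parsed similarity-search results.

```python
def best_hits(scores):
    """Map each query gene to its single highest-scoring hit."""
    return {query: max(hits, key=hits.get)
            for query, hits in scores.items() if hits}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return (gene_a, gene_b) pairs that are each other's best hit."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Invented scores: species A carries a duplicated gene, species B has one copy.
a_vs_b = {
    "A_gene1": {"B_gene1": 950},
    "A_gene1_dup": {"B_gene1": 900},  # related, but not the reciprocal best hit
}
b_vs_a = {
    "B_gene1": {"A_gene1": 950, "A_gene1_dup": 900},
}
print(reciprocal_best_hits(a_vs_b, b_vs_a))
```

Here the duplicated copy finds a strong hit in the other species yet fails the reciprocal test, which is the kind of related but non-orthologous match a one-directional search would report. Reciprocal best hits can still be misled by gene loss or unequal rates, so paralogue checks and phylogenetic analysis remain necessary.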
Phylogenetic analysis
2. Phylogenetic Analysis
Clusters homologous genes into a gene genealogy (evolutionary tree)
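As a rough illustration of how homologous sequences can be clustered into a tree, the sketch below builds a UPGMA tree from pairwise p-distances. UPGMA is chosen here only because it is short; the slides do not name a tree-building method, and real analyses usually rely on neighbor-joining, maximum likelihood, or Bayesian approaches. The sequences and labels are invented toy data, and all function names are assumptions.

```python
from itertools import combinations


def p_distance(a, b):
    """Fraction of differing sites between two equal-length aligned sequences."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def upgma(seqs):
    """Cluster sequences by average pairwise distance; return a Newick-style string."""
    clusters = {name: [name] for name in seqs}  # cluster label -> member names
    dist = {frozenset(pair): p_distance(seqs[pair[0]], seqs[pair[1]])
            for pair in combinations(seqs, 2)}

    def average_distance(c1, c2):
        members1, members2 = clusters[c1], clusters[c2]
        total = sum(dist[frozenset((m, n))] for m in members1 for n in members2)
        return total / (len(members1) * len(members2))

    while len(clusters) > 1:
        # Merge the two closest clusters and label the result with nested parentheses.
        c1, c2 = min(combinations(sorted(clusters), 2),
                     key=lambda pair: average_distance(*pair))
        merged = f"({c1},{c2})"
        clusters[merged] = clusters.pop(c1) + clusters.pop(c2)
    return next(iter(clusters)) + ";"


# Toy aligned sequences (invented, not real genes).
toy = {
    "human": "ACGTACGTAC",
    "chimp": "ACGTACGTAT",
    "mouse": "ACGAACTTAT",
    "fish":  "TCGAATTTGG",
}
print(upgma(toy))
```

UPGMA assumes roughly constant rates across lineages, so the topology it returns should be treated as a first look rather than a definitive genealogy.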
Vista - http://www-gsd.lbl.gov/vista
Requires annotation files, repeat masks for you
Global alignment, AVID
Semi-automated
zPicture - http://www.dcode.org
Local alignment, BLASTZ
Automatic
Genome Browsers, e.g. UCSC and Ensembl
ECRbrowser - http://www.dcode.org
BLAT, BLAST and BLASTZ
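The global versus local distinction among these tools can be sketched with textbook dynamic programming: Needleman-Wunsch scores an end-to-end alignment, while Smith-Waterman scores the best-matching subregion. AVID and BLASTZ use more elaborate heuristics than this; the function name and the scoring values (match +1, mismatch -1, gap -1) are assumptions chosen for simplicity.

```python
def align_score(a, b, local=False, match=1, mismatch=-1, gap=-1):
    """Best alignment score by dynamic programming.

    local=False: global (Needleman-Wunsch), both sequences aligned end to end.
    local=True:  local (Smith-Waterman), best-scoring pair of subsequences.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    if not local:
        # Global alignment pays for leading gaps; local alignment does not.
        for i in range(1, rows):
            score[i][0] = i * gap
        for j in range(1, cols):
            score[0][j] = j * gap
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cell = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            if local:
                cell = max(cell, 0)  # a local alignment may restart anywhere
                best = max(best, cell)
            score[i][j] = cell
    return best if local else score[-1][-1]


# Toy sequences (invented): a shared core flanked by unrelated sequence.
x = "GGGGACGTACGT"
y = "ACGTACGTCCCC"
print("global:", align_score(x, y))
print("local: ", align_score(x, y, local=True))
```

When only part of two sequences is related, the local score stays high while the global score is dragged down by the unrelated flanks, which is one reason local methods suit comparisons with rearrangements and global methods suit collinear regions.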