Sequence homology

Sequence homology is the biological homology between DNA, RNA, or protein
sequences, defined in terms of shared ancestry in the evolutionary history of life.
Two segments of DNA can have shared ancestry because of one of three phenomena:
a speciation event (orthologs), a duplication event (paralogs), or a horizontal
(or lateral) gene transfer event (xenologs).[1]

Homology among DNA, RNA, or proteins is typically inferred from their nucleotide or
amino acid sequence similarity. Significant similarity is strong evidence that two
sequences are related by evolutionary changes from a common ancestral sequence.
Alignments of multiple sequences are used to indicate which regions of each sequence are
homologous.

Figure: Gene phylogeny as red and blue branches within grey species phylogeny. Top: An ancestral
gene duplication produces two paralogs (histone H1.1 and 1.2). A speciation event produces
orthologs in the two daughter species (human and chimpanzee). Bottom: in a separate species
(E. coli), a gene has a similar function (histone-like nucleoid-structuring protein) but has a
separate evolutionary origin and so is an analog.

Contents
Identity, similarity, and conservation
Orthology
Databases of orthologous genes
Paralogy
Regulation
Paralogous chromosomal regions
Ohnology
Xenology
Homoeology
Gametology
See also
References

Identity, similarity, and conservation


The term "percent homology" is often used to mean "sequence similarity." The percentage of identical residues (percent identity)
or the percentage of residues conserved with similar physicochemical properties (percent similarity), e.g. leucine and isoleucine,
is usually used to "quantify the homology." Based on the definition of homology given above, this terminology is incorrect:
sequence similarity is the observation, whereas homology is the conclusion. Sequences are either homologous or not.

Figure: A sequence alignment of mammalian histone proteins. Sequences are the middle
120-180 amino acid residues of the proteins. Residues that are conserved across all
sequences are highlighted in grey. The key below denotes conserved sequence (*),
conservative mutations (:), semi-conservative mutations (.), and non-conservative
mutations ( ).[2]
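
The distinction between percent identity and percent similarity can be made concrete with a small sketch. The similarity groups below are a simplified, hypothetical grouping of amino acids chosen only for illustration (they are not the groups used by Clustal or by any substitution matrix), and skipping gapped columns is one convention among several.

```python
# Hypothetical similarity groups: a simplified grouping of amino acids by
# physicochemical properties, chosen for illustration only.
SIMILAR_GROUPS = [
    set("ILVM"),  # aliphatic / hydrophobic
    set("FYW"),   # aromatic
    set("KRH"),   # basic
    set("DE"),    # acidic
    set("STNQ"),  # polar, uncharged
]


def are_similar(a: str, b: str) -> bool:
    """True if two residues are identical or share a similarity group."""
    return a == b or any(a in group and b in group for group in SIMILAR_GROUPS)


def percent_identity_and_similarity(seq1: str, seq2: str, gap: str = "-"):
    """Return (percent identity, percent similarity) for two aligned sequences.

    Columns with a gap in either sequence are skipped (an assumed convention).
    """
    if len(seq1) != len(seq2):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(seq1.upper(), seq2.upper())
               if a != gap and b != gap]
    if not columns:
        return 0.0, 0.0
    identical = sum(a == b for a, b in columns)
    similar = sum(are_similar(a, b) for a, b in columns)
    n = len(columns)
    return 100.0 * identical / n, 100.0 * similar / n


# Toy aligned sequences (made up): 3 of 6 ungapped columns are identical,
# and all 6 are identical or similar under the groups above.
identity, similarity = percent_identity_and_similarity("MKLV-DE", "MRIVAEE")
print(identity, similarity)
```

Either number describes only the observed similarity of the alignment; whether the sequences are homologous remains a separate inference.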

As with morphological and anatomical structures, sequence similarity might occur because of convergent evolution, or, as with
shorter sequences, by chance, meaning that they are not homologous. Homologous sequence regions are also called conserved.
This is not to be confused with conservation in amino acid sequences, where the amino acid at a specific position has been
substituted with a different one that has functionally equivalent physicochemical properties.

Partial homology can occur where a segment of the compared sequences has a shared origin, while the rest does not. Such partial
homology may result from a gene fusion event.

Orthology
Homologous sequences are orthologous if they are inferred to be
descended from the same ancestral sequence separated by a
speciation event: when a species diverges into two separate species,
the copies of a single gene in the two resulting species are said to
be orthologous. Orthologs, or orthologous genes, are genes in
different species that originated by vertical descent from a single
gene of the last common ancestor. The term "ortholog" was coined
in 1970 by the molecular evolutionist Walter Fitch.[3]

For instance, the plant Flu regulatory protein is present both in
Arabidopsis (multicellular higher plant) and Chlamydomonas
(single cell green algae). The Chlamydomonas version is more
complex: it crosses the membrane twice rather than once, contains
additional domains and undergoes alternative splicing. However, it
can fully substitute the much simpler Arabidopsis protein, if
transferred from algae to plant genome by means of genetic
engineering. Significant sequence similarity and shared functional
domains indicate that these two genes are orthologous genes,[4] inherited from the shared ancestor.

Figure: Top: An ancestral gene duplicates to produce two paralogs (Genes A and B). A speciation event
produces orthologs in the two daughter species. Bottom: in a separate species, an unrelated gene
has a similar function (Gene C) but has a separate evolutionary origin and so is an analog.

Orthology is strictly defined in terms of ancestry. Given that the exact ancestry of genes in different organisms is difficult to
ascertain due to gene duplication and genome rearrangement events, the strongest evidence that two similar genes are orthologous
is usually found by carrying out phylogenetic analysis of the gene lineage. Orthologs often, but not always, have the same
function.[5]

Orthologous sequences provide useful information in taxonomic classification and phylogenetic studies of organisms. The pattern
of genetic divergence can be used to trace the relatedness of organisms. Two organisms that are very closely related are likely to
display very similar DNA sequences between two orthologs. Conversely, an organism that is further removed evolutionarily from
another organism is likely to display a greater divergence in the sequence of the orthologs being studied.
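
As a rough illustration of how divergence between orthologs can be quantified, the sketch below computes the proportion of differing sites (the p-distance) between two aligned sequences and applies the Jukes-Cantor (JC69) correction for multiple substitutions at the same site. The sequences are made up for the example, and JC69 is only one of many substitution models; it is used here as an assumption, not as the method of any particular study.

```python
import math


def p_distance(seq1: str, seq2: str, gap: str = "-") -> float:
    """Proportion of aligned, ungapped sites at which two sequences differ."""
    if len(seq1) != len(seq2):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(seq1.upper(), seq2.upper())
             if a != gap and b != gap]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def jukes_cantor_distance(p: float) -> float:
    """JC69 corrected distance (expected substitutions per site)."""
    if not 0.0 <= p < 0.75:
        raise ValueError("JC69 is defined only for 0 <= p < 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Hypothetical aligned orthologs from a reference species, a closely related
# species, and a more distantly related species.
reference = "ATGGCCAAGTTCGACCTGAA"
close = "ATGGCCAAGTTCGACCTGAG"
distant = "ATGACCAAATTTGATCTCAA"

d_close = jukes_cantor_distance(p_distance(reference, close))
d_distant = jukes_cantor_distance(p_distance(reference, distant))
print(d_close < d_distant)  # the closer relative shows less divergence
```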

Databases of orthologous genes


Given their tremendous importance for biology and bioinformatics, orthologous genes have been organized in several specialized
databases that provide tools to identify and analyze orthologous gene sequences. These resources employ approaches that can be
generally classified into those that use heuristic analysis of all pairwise sequence comparisons, and those that use phylogenetic
methods. Sequence comparison methods were pioneered in the COGs database in 1997.[6] These methods have been
extended and automated in the following databases:

eggNOG[7]
GreenPhylDB[8] for plants
InParanoid[9] focuses on pairwise ortholog relationships
OHNOLOGS (http://ohnologs.curie.fr/)[10][11] is a repository of the genes retained from whole genome
duplications in the vertebrate genomes including human and mouse.
OMA
OrthoDB[12] appreciates that the orthology concept is relative to different speciation points by providing a
hierarchy of orthologs along the species tree.
OrthoInspector (http://lbgi.igbmc.fr/orthoinspectorv3/)[13] is a repository of orthologous genes for 4753 organisms
covering the three domains of life
OrthologID[14]
OrthoMaM[15] for mammals
OrthoMCL[16]
Roundup[17]
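
The heuristic pairwise-comparison approach can be illustrated with reciprocal best hits, a common simplification in which two genes from different genomes are treated as putative orthologs when each is the other's highest-scoring match. This is a minimal sketch on made-up scores; the gene names and score values are hypothetical, and the databases listed above use more elaborate procedures.

```python
def best_hits(scores):
    """Map each query gene to its highest-scoring subject gene.

    `scores` maps (query, subject) pairs to a similarity score, for example
    an alignment bit score (the scoring scheme is an assumption here).
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (gene_a, gene_b) pairs that are each other's best hit."""
    forward = best_hits(a_vs_b)
    backward = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if backward.get(b) == a)


# Hypothetical similarity scores between genes of genome A and genome B.
a_vs_b = {("a1", "b1"): 250, ("a1", "b2"): 90,
          ("a2", "b2"): 180, ("a3", "b2"): 120}
b_vs_a = {("b1", "a1"): 245, ("b2", "a2"): 175, ("b2", "a3"): 110}

pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
print(pairs)
```

Here gene a3 has b2 as its best hit, but b2's best hit is a2, so a3 is left without a putative ortholog. Such cases are one reason why pairwise heuristics are complemented by phylogenetic methods.
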
Tree-based phylogenetic approaches aim to distinguish speciation from gene duplication events by comparing gene trees with
species trees, as implemented in databases such as:

LOFT[18]
TreeFam[19]
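
A simple way to see how a gene tree can separate speciation from duplication is the species-overlap rule: an internal node is treated as a duplication if the same species appears on both sides of it, and as a speciation otherwise. The sketch below is an illustration under stated assumptions, not the algorithm of any database listed here. It assumes a strictly binary gene tree written as nested tuples, with hypothetical leaf names of the form "species|gene".

```python
def species_of(leaf: str) -> str:
    """Extract the species from a leaf label of the form 'species|gene'."""
    return leaf.split("|")[0]


def classify_pairs(tree):
    """Label every pair of genes in a binary gene tree.

    A pair is 'ortholog' if its last common node is inferred to be a
    speciation, and 'paralog' if that node is inferred to be a duplication.
    """
    relations = {}

    def visit(node):
        # A leaf is a string; an internal node is a tuple of two children.
        if isinstance(node, str):
            return [node]
        left, right = (visit(child) for child in node)
        left_species = {species_of(leaf) for leaf in left}
        right_species = {species_of(leaf) for leaf in right}
        # Species overlap between the two subtrees implies a duplication.
        label = "paralog" if left_species & right_species else "ortholog"
        for a in left:
            for b in right:
                relations[frozenset((a, b))] = label
        return left + right

    visit(tree)
    return relations


# Gene tree for the histone example: a duplication (H1.1 / H1.2) followed by
# the human-chimpanzee speciation in each copy.
gene_tree = (("human|H1.1", "chimp|H1.1"), ("human|H1.2", "chimp|H1.2"))
relations = classify_pairs(gene_tree)
print(relations[frozenset(("human|H1.1", "chimp|H1.1"))])
print(relations[frozenset(("human|H1.1", "chimp|H1.2"))])
```

The rule cannot recognise horizontal transfer, and it can mislabel duplications that were followed by complementary gene losses, so it should be read as a teaching aid rather than a complete reconciliation method.
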
A third category of hybrid approaches uses both heuristic and phylogenetic methods to construct clusters and determine trees, for
example:

EnsemblCompara GeneTrees[20]
HomoloGene[21]
Ortholuge[22]

Paralogy
Paralogous genes are genes that are related via duplication events in the last common ancestor (LCA) of the species being
compared. They result from the mutation of duplicated genes during separate speciation events. When descendants from the LCA
share mutated homologs of the original duplicated genes then those genes are considered paralogs.[1]

As an example, in the LCA, one gene (gene A) may get duplicated to make a separate similar gene (gene B); those two genes will
continue to get passed to subsequent generations. During speciation, one environment will favor a mutation in gene A (gene A1),
producing a new species with genes A1 and B. Then in a separate speciation event, one environment will favor a mutation in gene
B (gene B1) giving rise to a new species with genes A and B1. The descendants’ genes A1 and B1 are paralogous to each other
because they are homologs that are related via a duplication event in the last common ancestor of the two species.[1]

Additional classifications of paralogs include alloparalogs (out-paralogs) and symparalogs (in-paralogs). Alloparalogs are
paralogs that evolved from gene duplications that preceded the given speciation event. In other words, alloparalogs are paralogs
that evolved from duplication events that happened in the LCA of the organisms being compared. The example above is an
example of alloparalogy. Symparalogs are paralogs that evolved from gene duplication of paralogous genes in subsequent speciation
events. From the example above, if the descendant with genes A1 and B underwent another speciation event where gene A1
duplicated, the new species would have genes B, A1a, and A1b. In this example, genes A1a and A1b are symparalogs.[1]
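
The difference between alloparalogs and symparalogs comes down to whether the duplication happened before or after the speciation event used as the reference point. A minimal sketch of that rule follows; the ages are hypothetical values in arbitrary units, measured backwards from the present, and are assumed only to make the example above concrete.

```python
def classify_paralogs(duplication_age: float, speciation_age: float) -> str:
    """Classify a pair of paralogs relative to one speciation event.

    Ages are measured backwards from the present (larger means older),
    in any consistent unit.
    """
    if duplication_age == speciation_age:
        raise ValueError("order of duplication and speciation is unresolved")
    if duplication_age > speciation_age:
        # Duplication preceded the speciation event.
        return "alloparalog (out-paralog)"
    # Duplication followed the speciation event.
    return "symparalog (in-paralog)"


# Hypothetical ages, for illustration only.
LCA_DUPLICATION = 100.0   # gene A duplicates to give gene B in the LCA
SPECIATION = 60.0         # lineages carrying A1/B and A/B1 separate
LATER_DUPLICATION = 20.0  # gene A1 duplicates into A1a and A1b

print(classify_paralogs(LCA_DUPLICATION, SPECIATION))    # genes A1 and B1
print(classify_paralogs(LATER_DUPLICATION, SPECIATION))  # genes A1a and A1b
```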

Paralogous genes can shape the structure of whole genomes and
thus explain genome evolution to a large extent. Examples include
the Homeobox (Hox) genes in animals. These genes not only
underwent gene duplications within chromosomes but also whole
genome duplications. As a result Hox genes in most vertebrates are
clustered across multiple chromosomes with the HoxA-D clusters
being the best studied.[23]

Another example is the globin genes, which encode myoglobin
and hemoglobin and are considered to be ancient paralogs.
Similarly, the four known classes of hemoglobins (hemoglobin A,
hemoglobin A2, hemoglobin B, and hemoglobin F) are paralogs of
each other. While each of these proteins serves the same basic
function of oxygen transport, they have already diverged slightly in
function: fetal hemoglobin (hemoglobin F) has a higher affinity for
oxygen than adult hemoglobin. Function is not always conserved,
however. Human angiogenin diverged from ribonuclease, for
example, and while the two paralogs remain similar in tertiary
structure, their functions within the cell are now quite different.

It is often asserted that orthologs are more functionally similar than
paralogs of similar divergence, but several papers have challenged
this notion.[24][25][26]

Figure: Vertebrate Hox genes are organized in sets of paralogs. Each Hox cluster (HoxA, HoxB, etc.) is
on a different chromosome. For instance, the human HoxA cluster is on chromosome 7. The
mouse HoxA cluster shown here has 11 paralogous genes (2 are missing).[23]

Regulation
Paralogs are often regulated differently, e.g. by having different tissue-specific expression patterns (see Hox genes). However,
they can also be regulated differently on the protein level. For instance, Bacillus subtilis encodes two paralogues of glutamate
dehydrogenase: GudB is constitutively transcribed whereas RocG is tightly regulated. In their active, oligomeric states, both
enzymes show similar enzymatic rates. However, swaps of enzymes and promoters cause severe fitness losses, thus indicating
promoter–enzyme coevolution. Characterization of the proteins shows that, compared to RocG, GudB's enzymatic activity is
highly dependent on glutamate and pH.[27]

Paralogous chromosomal regions


Sometimes, large regions of chromosomes share gene content similar to other chromosomal regions within the same genome.[28]
They are well characterised in the human genome, where they have been used as evidence to support the 2R hypothesis. Sets of
duplicated, triplicated and quadruplicated genes, with the related genes on different chromosomes, are deduced to be remnants
from genome or chromosomal duplications. A set of paralogy regions is together called a paralogon.[29] Well-studied sets of
paralogy regions include regions of human chromosomes 2, 7, 12 and 17 containing Hox gene clusters, collagen genes, keratin
genes and other duplicated genes,[30] regions of human chromosomes 4, 5, 8 and 10 containing neuropeptide receptor genes, NK
class homeobox genes and many more gene families,[31][32][33] and parts of human chromosomes 13, 4, 5 and X containing the
ParaHox genes and their neighbors.[34] The Major histocompatibility complex (MHC) on human chromosome 6 has paralogy
regions on chromosomes 1, 9 and 19.[35] Much of the human genome seems to be assignable to paralogy regions.[36]

Ohnology
Ohnologous genes are paralogous genes that have originated by a
process of whole-genome duplication. The name was first given in
honour of Susumu Ohno by Ken Wolfe.[37] Ohnologues are useful
for evolutionary analysis because all ohnologues in a genome have
been diverging for the same length of time (since their common
origin in the whole genome duplication). Ohnologues are also
known to show greater association with cancers, dominant genetic
disorders, and pathogenic copy number
variations.[38][39][40][41][42]

Figure: A whole genome duplication event produces a
genome with two ohnolog copies of each gene.

Xenology
Homologs resulting from horizontal gene transfer between two
organisms are termed xenologs. Xenologs can have different
functions, if the new environment is vastly different for the
horizontally moving gene. In general, though, xenologs typically
have similar function in both organisms. The term was coined by
Walter Fitch.[3]

Figure: A speciation event produces orthologs of a gene
in the two daughter species. A horizontal gene
transfer event from one species to another adds
a xenolog of the gene to its genome.

Homoeology
Homoeologous (also spelled homeologous) chromosomes or parts
of chromosomes are those brought together following inter-species
hybridization and allopolyploidization, and whose relationship was
completely homologous in an ancestral species. In allopolyploids,
the homologous chromosomes within each parental sub-genome
should pair faithfully during meiosis, leading to disomic
inheritance; however in some allopolyploids, the homoeologous
chromosomes of the parental genomes may be nearly as similar to
one another as the homologous chromosomes, leading to
tetrasomic inheritance (four chromosomes pairing at meiosis),
intergenomic recombination, and reduced fertility.

Figure: A speciation event produces orthologs of a gene
in the two daughter species. Subsequent
hybridisation of those species generates a hybrid
genome with a homoeolog copy of each gene
from both species.

Gametology
Gametology denotes the relationship between homologous genes on non-recombining, opposite sex chromosomes. The term was
coined by García-Moreno and Mindell in 2000.[43] Gametologs result from the origination of genetic sex determination and barriers
to recombination between sex chromosomes. Examples of gametologs include CHDW and CHDZ in birds.[43]

See also
Deep homology
EggNOG (database)
OrthoDB
Orthologous MAtrix (OMA)
Protein family
Protein superfamily
TreeFam
Syntelog

References
1. Koonin EV (2005). "Orthologs, paralogs, and evolutionary genomics" (https://zenodo.org/record/1234975).
Annual Review of Genetics. 39: 309–338. doi:10.1146/annurev.genet.39.073003.114725 (https://doi.org/10.114
6%2Fannurev.genet.39.073003.114725). PMID 16285863 (https://www.ncbi.nlm.nih.gov/pubmed/16285863).
2. "Clustal FAQ #Symbols" (http://www.ebi.ac.uk/Tools/msa/clustalw2/help/faq.html#23). Clustal. Retrieved
8 December 2014.
3. Fitch WM (June 1970). "Distinguishing homologous from analogous proteins". Systematic Zoology. 19 (2): 99–
113. doi:10.2307/2412448 (https://doi.org/10.2307%2F2412448). JSTOR 2412448 (https://www.jstor.org/stable/2
412448). PMID 5449325 (https://www.ncbi.nlm.nih.gov/pubmed/5449325).
4. Falciatore A, Merendino L, Barneche F, Ceol M, Meskauskiene R, Apel K, Rochaix JD (January 2005). "The FLP
proteins act as regulators of chlorophyll synthesis in response to light and plastid signals in Chlamydomonas" (htt
ps://www.ncbi.nlm.nih.gov/pmc/articles/PMC540235). Genes & Development. 19 (1): 176–187.
doi:10.1101/gad.321305 (https://doi.org/10.1101%2Fgad.321305). PMC 540235 (https://www.ncbi.nlm.nih.gov/p
mc/articles/PMC540235). PMID 15630026 (https://www.ncbi.nlm.nih.gov/pubmed/15630026).
5. Fang G, Bhardwaj N, Robilotto R, Gerstein MB (March 2010). "Getting started in gene orthology and functional
analysis" (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2845645). PLoS Computational Biology. 6 (3):
e1000703. doi:10.1371/journal.pcbi.1000703 (https://doi.org/10.1371%2Fjournal.pcbi.1000703). PMC 2845645
(https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2845645). PMID 20361041 (https://www.ncbi.nlm.nih.gov/pubme
d/20361041).
6. COGs: Clusters of Orthologous Groups of proteins (https://www.ncbi.nlm.nih.gov/COG/)
Tatusov RL, Koonin EV, Lipman DJ (October 1997). "A genomic perspective on protein families" (https://zenodo.o
rg/record/1231126). Science. 278 (5338): 631–637. doi:10.1126/science.278.5338.631 (https://doi.org/10.1126%
2Fscience.278.5338.631). PMID 9381173 (https://www.ncbi.nlm.nih.gov/pubmed/9381173).
7. eggNOG: evolutionary genealogy of genes: Non-supervised Orthologous Groups (http://eggnog.embl.de/)
Muller J, Szklarczyk D, Julien P, Letunic I, Roth A, Kuhn M, Powell S, von Mering C, Doerks T, Jensen LJ, Bork P
(January 2010). "eggNOG v2.0: extending the evolutionary genealogy of genes with enhanced non-supervised
orthologous groups, species and functional annotations" (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC280893
2). Nucleic Acids Research. 38 (Database issue): D190–5. doi:10.1093/nar/gkp951 (https://doi.org/10.1093%2Fn
ar%2Fgkp951). PMC 2808932 (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2808932). PMID 19900971 (http
s://www.ncbi.nlm.nih.gov/pubmed/19900971).
8. GreenPhylDB (http://www.greenphyl.org)
Conte MG, Gaillard S, Lanau N, Rouard M, Périn C (January 2008). "GreenPhylDB: a database for plant
comparative genomics" (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2238940). Nucleic Acids Research. 36
(Database issue): D991–8. doi:10.1093/nar/gkm934 (https://doi.org/10.1093%2Fnar%2Fgkm934). PMC 2238940
(https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2238940). PMID 17986457 (https://www.ncbi.nlm.nih.gov/pubme
d/17986457).
9. Inparanoid: Eukaryotic Ortholog Groups (http://inparanoid.sbc.su.se/cgi-bin/index.cgi)
Ostlund G, Schmitt T, Forslund K, Köstler T, Messina DN, Roopra S, Frings O, Sonnhammer EL (January 2010).
"InParanoid 7: new algorithms and tools for eukaryotic orthology analysis" (https://www.ncbi.nlm.nih.gov/pmc/arti
cles/PMC2808972). Nucleic Acids Research. 38 (Database issue): D196–203. doi:10.1093/nar/gkp931 (https://do
i.org/10.1093%2Fnar%2Fgkp931). PMC 2808972 (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2808972).
PMID 19892828 (https://www.ncbi.nlm.nih.gov/pubmed/19892828).
10. Singh PP, Arora J, Isambert H (July 2015). "Identification of Ohnolog Genes Originating from Whole Genome
Duplication in Early Vertebrates, Based on Synteny Comparison across Multiple Genomes" (https://www.ncbi.nl
m.nih.gov/pmc/articles/PMC4504502). PLoS Computational Biology. 11 (7): e1004394.
doi:10.1371/journal.pcbi.1004394 (https://doi.org/10.1371%2Fjournal.pcbi.1004394). PMC 4504502 (https://www.
ncbi.nlm.nih.gov/pmc/articles/PMC4504502). PMID 26181593 (https://www.ncbi.nlm.nih.gov/pubmed/26181593).
11. "Vertebrate Ohnologs" (http://ohnologs.curie.fr/). ohnologs.curie.fr. Retrieved 2018-10-12.
12. Zdobnov EM, Tegenfeldt F, Kuznetsov D, Waterhouse RM, Simão FA, Ioannidis P, Seppey M, Loetscher A,
Kriventseva EV (January 2017). "OrthoDB v9.1: cataloging evolutionary and functional annotations for animal,
fungal, plant, archaeal, bacterial and viral orthologs" (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5210582).
Nucleic Acids Research. 45 (D1): D744–D749. doi:10.1093/nar/gkw1119 (https://doi.org/10.1093%2Fnar%2Fgkw
1119). PMC 5210582 (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5210582). PMID 27899580 (https://www.nc
bi.nlm.nih.gov/pubmed/27899580).
13. Linard B, Allot A, Schneider R, Morel C, Ripp R, Bigler M, Thompson JD, Poch O, Lecompte O (February
2015). "OrthoInspector 2.0: Software and database updates". Bioinformatics. 31 (3): 447–8.
doi:10.1093/bioinformatics/btu642 (https://doi.org/10.1093%2Fbioinformatics%2Fbtu642). PMID 25273105 (http
s://www.ncbi.nlm.nih.gov/pubmed/25273105).
14. OrthologID (http://nypg.bio.nyu.edu/orthologid/)
Chiu JC, Lee EK, Egan MG, Sarkar IN, Coruzzi GM, DeSalle R (March 2006). "OrthologID: automation of
genome-scale ortholog identification within a parsimony framework". Bioinformatics. 22 (6): 699–707.
doi:10.1093/bioinformatics/btk040 (https://doi.org/10.1093%2Fbioinformatics%2Fbtk040). PMID 16410324 (http
s://www.ncbi.nlm.nih.gov/pubmed/16410324).
15. OrthoMaM (http://www.orthomam.univ-montp2.fr)
Ranwez V, Delsuc F, Ranwez S, Belkhir K, Tilak MK, Douzery EJ (November 2007). "OrthoMaM: a database of
orthologous genomic markers for placental mammal phylogenetics" (https://www.ncbi.nlm.nih.gov/pmc/articles/P
MC2249597). BMC Evolutionary Biology. 7: 241. doi:10.1186/1471-2148-7-241 (https://doi.org/10.1186%2F1471
-2148-7-241). PMC 2249597 (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2249597). PMID 18053139 (https://
www.ncbi.nlm.nih.gov/pubmed/18053139).
16. OrthoMCL: Identification of Ortholog Groups for Eukaryotic Genomes (http://www.orthomcl.org)
Chen F, Mackey AJ, Stoeckert CJ, Roos DS (January 2006). "OrthoMCL-DB: querying a comprehensive multi-
species collection of ortholog groups" (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1347485). Nucleic Acids
Research. 34 (Database issue): D363–8. doi:10.1093/nar/gkj123 (https://doi.org/10.1093%2Fnar%2Fgkj123).
PMC 1347485 (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1347485). PMID 16381887 (https://www.ncbi.nlm.
nih.gov/pubmed/16381887).
17. Roundup (http://roundup.hms.harvard.edu/)
Deluca TF, Wu IH, Pu J, Monaghan T, Peshkin L, Singh S, Wall DP (August 2006). "Roundup: a multi-genome
repository of orthologs and evolutionary distances". Bioinformatics. 22 (16): 2044–2046.
doi:10.1093/bioinformatics/btl286 (https://doi.org/10.1093%2Fbioinformatics%2Fbtl286). PMID 16777906 (https://
www.ncbi.nlm.nih.gov/pubmed/16777906).
18. TreeFam: Tree families database (http://www.treefam.org/)
van der Heijden RT, Snel B, van Noort V, Huynen MA (March 2007). "Orthology prediction at scalable resolution
by phylogenetic tree analysis" (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1838432). BMC Bioinformatics. 8:
83. doi:10.1186/1471-2105-8-83 (https://doi.org/10.1186%2F1471-2105-8-83). PMC 1838432 (https://www.ncbi.n
lm.nih.gov/pmc/articles/PMC1838432). PMID 17346331 (https://www.ncbi.nlm.nih.gov/pubmed/17346331).
19. TreeFam: Tree families database (http://www.treefam.org/)
Ruan J, Li H, Chen Z, Coghlan A, Coin LJ, Guo Y, Hériché JK, Hu Y, Kristiansen K, Li R, Liu T, Moses A, Qin J,
Vang S, Vilella AJ, Ureta-Vidal A, Bolund L, Wang J, Durbin R (January 2008). "TreeFam: 2008 Update" (https://
www.ncbi.nlm.nih.gov/pmc/articles/PMC2238856). Nucleic Acids Research. 36 (Database issue): D735–40.
doi:10.1093/nar/gkm1005 (https://doi.org/10.1093%2Fnar%2Fgkm1005). PMC 2238856 (https://www.ncbi.nlm.ni
h.gov/pmc/articles/PMC2238856). PMID 18056084 (https://www.ncbi.nlm.nih.gov/pubmed/18056084).
20. Vilella AJ, Severin J, Ureta-Vidal A, Heng L, Durbin R, Birney E (February 2009). "EnsemblCompara GeneTrees:
Complete, duplication-aware phylogenetic trees in vertebrates" (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2
652215). Genome Research. 19 (2): 327–335. doi:10.1101/gr.073585.107 (https://doi.org/10.1101%2Fgr.073585.
107). PMC 2652215 (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2652215). PMID 19029536 (https://www.ncb
i.nlm.nih.gov/pubmed/19029536).
21. Sayers EW, Barrett T, Benson DA, Bolton E, Bryant SH, Canese K, Chetvernin V, Church DM, DiCuccio M,
Federhen S, Feolo M, Fingerman IM, Geer LY, Helmberg W, Kapustin Y, Landsman D, Lipman DJ, Lu Z, Madden
TL, Madej T, Maglott DR, Marchler-Bauer A, Miller V, Mizrachi I, Ostell J, Panchenko A, Phan L, Pruitt KD,
Schuler GD, Sequeira E, Sherry ST, Shumway M, Sirotkin K, Slotta D, Souvorov A, Starchenko G, Tatusova TA,
Wagner L, Wang Y, Wilbur WJ, Yaschenko E, Ye J (January 2011). "Database resources of the National Center
for Biotechnology Information" (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3013733). Nucleic Acids
Research. 39 (Database issue): D38–51. doi:10.1093/nar/gkq1172 (https://doi.org/10.1093%2Fnar%2Fgkq1172).
PMC 3013733 (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3013733). PMID 21097890 (https://www.ncbi.nlm.
nih.gov/pubmed/21097890).
22. Fulton DL, Li YY, Laird MR, Horsman BG, Roche FM, Brinkman FS (May 2006). "Improving the specificity of
high-throughput ortholog prediction" (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1524997). BMC
Bioinformatics. 7: 270. doi:10.1186/1471-2105-7-270 (https://doi.org/10.1186%2F1471-2105-7-270).
PMC 1524997 (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1524997). PMID 16729895 (https://www.ncbi.nlm.
nih.gov/pubmed/16729895).
23. Zakany J, Duboule D (August 2007). "The role of Hox genes during vertebrate limb development". Current
Opinion in Genetics & Development. 17 (4): 359–366. doi:10.1016/j.gde.2007.05.011 (https://doi.org/10.1016%2
Fj.gde.2007.05.011). PMID 17644373 (https://www.ncbi.nlm.nih.gov/pubmed/17644373).
24. Studer RA, Robinson-Rechavi M (May 2009). "How confident can we be that orthologs are similar, but paralogs
differ?" (https://serval.unil.ch/notice/serval:BIB_39F8106EE698). Trends in Genetics. 25 (5): 210–216.
doi:10.1016/j.tig.2009.03.004 (https://doi.org/10.1016%2Fj.tig.2009.03.004). PMID 19368988 (https://www.ncbi.nl
m.nih.gov/pubmed/19368988).
25. Nehrt NL, Clark WT, Radivojac P, Hahn MW (June 2011). "Testing the ortholog conjecture with comparative
functional genomic data from mammals" (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3111532). PLoS
Computational Biology. 7 (6): e1002073. doi:10.1371/journal.pcbi.1002073 (https://doi.org/10.1371%2Fjournal.pc
bi.1002073). PMC 3111532 (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3111532). PMID 21695233 (https://w
ww.ncbi.nlm.nih.gov/pubmed/21695233).
26. Eisen, Jonathan. "Special Guest Post & Discussion Invitation from Matthew Hahn on Ortholog Conjecture Paper"
(http://phylogenomics.blogspot.com/2011/09/special-guest-post-discussion.html).
27. Noda-Garcia L, Romero Romero ML, Longo LM, Kolodkin-Gal I, Tawfik DS (July 2017). "Bacilli glutamate
dehydrogenases diverged via coevolution of transcription and enzyme regulation" (http://embor.embopress.org/c
ontent/18/7/1139). EMBO Reports. 18 (7): 1139–1149. doi:10.15252/embr.201743990 (https://doi.org/10.15252%
2Fembr.201743990). PMC 5494520 (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5494520). PMID 28468957
(https://www.ncbi.nlm.nih.gov/pubmed/28468957).
28. Lundin LG (April 1993). "Evolution of the vertebrate genome as reflected in paralogous chromosomal regions in
man and the house mouse". Genomics. 16 (1): 1–19. doi:10.1006/geno.1993.1133 (https://doi.org/10.1006%2Fg
eno.1993.1133). PMID 8486346 (https://www.ncbi.nlm.nih.gov/pubmed/8486346).
29. Coulier F, Popovici C, Villet R, Birnbaum D (December 2000). "MetaHox gene clusters". The Journal of
Experimental Zoology. 288 (4): 345–351. doi:10.1002/1097-010X(20001215)288:4<345::AID-JEZ7>3.0.CO;2-Y
(https://doi.org/10.1002%2F1097-010X%2820001215%29288%3A4%3C345%3A%3AAID-JEZ7%3E3.0.CO%3B
2-Y). PMID 11144283 (https://www.ncbi.nlm.nih.gov/pubmed/11144283).
30. Ruddle FH, Bentley KL, Murtha MT, Risch N (1994). "Gene loss and gain in the evolution of the vertebrates".
Development: 155–161. PMID 7579516 (https://www.ncbi.nlm.nih.gov/pubmed/7579516).
31. Pébusque MJ, Coulier F, Birnbaum D, Pontarotti P (September 1998). "Ancient large-scale genome duplications:
phylogenetic and linkage analyses shed light on chordate genome evolution". Molecular Biology and Evolution.
15 (9): 1145–1159. doi:10.1093/oxfordjournals.molbev.a026022 (https://doi.org/10.1093%2Foxfordjournals.molbe
v.a026022). PMID 9729879 (https://www.ncbi.nlm.nih.gov/pubmed/9729879).
32. Larsson TA, Olsson F, Sundstrom G, Lundin LG, Brenner S, Venkatesh B, Larhammar D (June 2008). "Early
vertebrate chromosome duplications and the evolution of the neuropeptide Y receptor gene regions" (https://ww
w.ncbi.nlm.nih.gov/pmc/articles/PMC2453138). BMC Evolutionary Biology. 8: 184. doi:10.1186/1471-2148-8-184
(https://doi.org/10.1186%2F1471-2148-8-184). PMC 2453138 (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC24
53138). PMID 18578868 (https://www.ncbi.nlm.nih.gov/pubmed/18578868).
33. Pollard SL, Holland PW (September 2000). "Evidence for 14 homeobox gene clusters in human genome
ancestry". Current Biology. 10 (17): 1059–1062. doi:10.1016/S0960-9822(00)00676-X (https://doi.org/10.1016%2
FS0960-9822%2800%2900676-X). PMID 10996074 (https://www.ncbi.nlm.nih.gov/pubmed/10996074).
34. Mulley JF, Chiu CH, Holland PW (July 2006). "Breakup of a homeobox cluster after genome duplication in
teleosts" (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1502464). Proceedings of the National Academy of
Sciences of the United States of America. 103 (27): 10369–10372. doi:10.1073/pnas.0600341103 (https://doi.or
g/10.1073%2Fpnas.0600341103). PMC 1502464 (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1502464).
PMID 16801555 (https://www.ncbi.nlm.nih.gov/pubmed/16801555).
35. Flajnik MF, Kasahara M (September 2001). "Comparative genomics of the MHC: glimpses into the evolution of
the adaptive immune system". Immunity. 15 (3): 351–362. doi:10.1016/S1074-7613(01)00198-4 (https://doi.org/1
0.1016%2FS1074-7613%2801%2900198-4). PMID 11567626 (https://www.ncbi.nlm.nih.gov/pubmed/11567626).
36. McLysaght A, Hokamp K, Wolfe KH (June 2002). "Extensive genomic duplication during early chordate
evolution". Nature Genetics. 31 (2): 200–204. doi:10.1038/ng884 (https://doi.org/10.1038%2Fng884).
PMID 12032567 (https://www.ncbi.nlm.nih.gov/pubmed/12032567).
37. Wolfe K (May 2000). "Robustness--it's not where you think it is". Nature Genetics. 25 (1): 3–4. doi:10.1038/75560
(https://doi.org/10.1038%2F75560). PMID 10802639 (https://www.ncbi.nlm.nih.gov/pubmed/10802639).
38. Singh PP, Affeldt S, Cascone I, Selimoglu R, Camonis J, Isambert H (November 2012). "On the expansion of
"dangerous" gene repertoires by whole-genome duplications in early vertebrates". Cell Reports. 2 (5): 1387–
1398. doi:10.1016/j.celrep.2012.09.034 (https://doi.org/10.1016%2Fj.celrep.2012.09.034). PMID 23168259 (http
s://www.ncbi.nlm.nih.gov/pubmed/23168259).
39. Malaguti G, Singh PP, Isambert H (May 2014). "On the retention of gene duplicates prone to dominant
deleterious mutations". Theoretical Population Biology. 93: 38–51. doi:10.1016/j.tpb.2014.01.004 (https://doi.org/
10.1016%2Fj.tpb.2014.01.004). PMID 24530892 (https://www.ncbi.nlm.nih.gov/pubmed/24530892).
40. Singh PP, Affeldt S, Malaguti G, Isambert H (July 2014). "Human dominant disease genes are enriched in
paralogs originating from whole genome duplication" (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4117431).
PLoS Computational Biology. 10 (7): e1003754. doi:10.1371/journal.pcbi.1003754 (https://doi.org/10.1371%2Fjo
urnal.pcbi.1003754). PMC 4117431 (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4117431). PMID 25080083
(https://www.ncbi.nlm.nih.gov/pubmed/25080083).
41. McLysaght A, Makino T, Grayton HM, Tropeano M, Mitchell KJ, Vassos E, Collier DA (January 2014). "Ohnologs
are overrepresented in pathogenic copy number mutations" (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3890
797). Proceedings of the National Academy of Sciences of the United States of America. 111 (1): 361–366.
doi:10.1073/pnas.1309324111 (https://doi.org/10.1073%2Fpnas.1309324111). PMC 3890797 (https://www.ncbi.n
lm.nih.gov/pmc/articles/PMC3890797). PMID 24368850 (https://www.ncbi.nlm.nih.gov/pubmed/24368850).
42. Makino T, McLysaght A (May 2010). "Ohnologs in the human genome are dosage balanced and frequently
associated with disease" (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2889102). Proceedings of the National
Academy of Sciences of the United States of America. 107 (20): 9270–9274. doi:10.1073/pnas.0914697107 (http
s://doi.org/10.1073%2Fpnas.0914697107). PMC 2889102 (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC28891
02). PMID 20439718 (https://www.ncbi.nlm.nih.gov/pubmed/20439718).
43. García-Moreno J, Mindell DP (December 2000). "Rooting a phylogeny with homologous genes on opposite sex
chromosomes (gametologs): a case study using avian CHD". Molecular Biology and Evolution. 17 (12): 1826–
1832. doi:10.1093/oxfordjournals.molbev.a026283 (https://doi.org/10.1093%2Foxfordjournals.molbev.a026283).
PMID 11110898 (https://www.ncbi.nlm.nih.gov/pubmed/11110898).
Retrieved from "https://en.wikipedia.org/w/index.php?title=Sequence_homology&oldid=903749549"

This page was last edited on 27 June 2019, at 18:30 (UTC).

Text is available under the Creative Commons Attribution-ShareAlike License; additional terms may apply. By using
this site, you agree to the Terms of Use and Privacy Policy. Wikipedia® is a registered trademark of the Wikimedia
Foundation, Inc., a non-profit organization.
