Genomics is the study of the whole genomes of organisms and incorporates elements from genetics. It uses a combination of recombinant DNA, DNA sequencing methods, and bioinformatics to sequence, assemble, and analyse the structure and function of genomes. Genomics also covers the study of genes: their structure, function, and expression. The number of genes in a genome depends on the species, but the exact number is very difficult to determine.

Structural genomics aims to determine the structure of every protein encoded by the genome. Functional genomics aims to collect and use sequencing data to describe gene and protein functions. Comparative genomics aims to compare genomic features between different species.

Genome resource banking (GRB) is defined as the methodical collection, storage, and distribution of genes from threatened and endangered species. Genetic material from rare and endangered species can be collected and preserved indefinitely in liquid nitrogen (-196°C).

PlantGDB (http://www.plantgdb.org/) is a database of molecular sequence data for all plant species with significant sequencing efforts. The database organizes expressed sequence tag (EST) sequences into contigs that represent tentative unique genes. ESTs are fragments of mRNA sequences derived through single sequencing reactions performed on randomly selected clones from cDNA libraries. To date, over 45 million ESTs have been generated from over 1400 different species of eukaryotes. Contigs are annotated and, whenever possible, linked to their respective genomic DNA. GDB entries are highly cross-linked to each other, to literature citations, and to entries in other databases, including the sequence databases, OMIM, and the Mouse Genome Database.

The largest gene bank in the world is situated in Norway.
The National Genebank, which was set up in 1996, has been upgraded with the latest technology and now has the capacity to conserve approximately one million germplasm accessions. The main intention of genebanks is to conserve collections of plant genetic resources for posterity. This means that the documentation of the material must also be ensured across generations.

Genome Browsers

Genome browsers integrate genomic sequence and annotation data from different sources and provide a platform to search, browse, retrieve, and analyze genomic data. Among the best known are the UCSC Genome Browser, the Ensembl Genome Browser, JBrowse, and NCBI's Genome Data Viewer. Some genome browsers support multiple genomes, whereas others are specific to particular species.

Genome Sequencing

Genome sequencing is a laboratory method used to determine the entire genetic makeup of a specific organism or cell type. It can be used to find changes in areas of the genome, and these changes may help scientists understand how specific diseases, such as cancer, form. Scientists use genomic sequencing to decipher the genetic material found in an organism or virus. Sequences from specimens can be compared to help scientists track the spread of a virus, how it is changing, and how those changes may affect plant health.

Sequencing based on bacterial artificial chromosome (BAC) clones can generate high-density maps, which makes genome assembly easier. It generally includes four steps: preparation of a BAC clone library, preparation of clone fingerprints, BAC clone sequencing, and sequence assembly. A BAC is an engineered DNA molecule used to clone DNA sequences in bacterial cells (for example, E. coli). BACs are often used in connection with DNA sequencing. Segments of an organism's DNA, ranging from 100,000 to about 300,000 base pairs, can be inserted into BACs.

Genetic Mapping

Genetic mapping, also called linkage mapping, can be used to study hereditary diseases in plants.
Mapping also provides clues about which chromosome contains the gene and precisely where the gene lies on that chromosome. Genetic mapping is based on the principle that genes (markers or loci) segregate via chromosome recombination during meiosis (i.e. sexual reproduction), which allows them to be analysed in the progeny. Genetic maps have been used successfully to find the gene responsible for relatively rare, single-gene inherited diseases. Gene mapping is the sequential allocation of loci to a relative position on a chromosome. Genetic maps are species-specific and are composed of genomic markers and/or genes and the genetic distance between each marker.

A genome consists of nucleotide sequences of DNA (or RNA in RNA viruses). The nuclear genome includes protein-coding genes and non-coding genes, other functional regions of the genome such as regulatory sequences, and often a substantial fraction of 'junk' DNA with no evident function. Genomic DNA contains genes, discrete regions that encode a protein or RNA. A gene comprises the coding DNA sequence as well as the associated regulatory elements that control gene expression. Nuclear eukaryotic genes also contain noncoding regions called introns. Many gene types and genetic variations exist; five major types of genes are commonly distinguished: complementary genes, supplementary genes, duplicate genes, polymeric genes, and sex-linked genes.

Genome organisation

The genome is a complex hierarchical structure, and its spatial organization plays an important role in its function. Chromatin loops and topological domains form the basic structural units of this multiscale organization and are essential to orchestrate complex regulatory networks and transcription mechanisms. In eukaryotes, genomic DNA exists as single linear pieces of DNA associated with proteins; the combination is called a nucleoprotein complex.
The primary function of the genome is to store, propagate, and express the genetic information that gives rise to a cell's architectural and functional machinery. However, the genome is also a major structural component of the cell. Genomes are organized into complex higher-order structures by folding of the DNA into chromatin fibers, chromosome domains, and ultimately chromosomes. The higher-order organization of genomes is functionally important for gene regulation and control of gene expression programs.

Most of the well-characterized prokaryotic genomes consist of double-stranded DNA organized as a single circular chromosome 0.6-10 Mb in length and one or more circular plasmid species of 2 kb-1.7 Mb. The past few years, however, have revealed some major variations in genome organization.

Nucleotide Substitution

Substitution, as related to genomics, is a type of mutation in which one nucleotide is replaced by a different nucleotide. Substitution mutations can be beneficial, harmful, or neutral. They cause three specific types of point mutation: silent, missense, and nonsense mutations. A silent mutation leaves the encoded amino acid, and therefore the protein, unchanged. A missense mutation changes a codon so that it specifies a different amino acid, which may alter the protein. A nonsense mutation converts a codon into a stop codon and truncates the protein.

Causes of mutation

Mutations can arise in cells of all types as a result of a variety of factors, including chance. Some mutations result from spontaneous events during replication and are known as spontaneous mutations. Slippage of the DNA template strand and subsequent insertion of an extra nucleotide is one example of a spontaneous mutation; excess flexibility of the DNA strand and the subsequent mispairing of bases is another. Environmental exposure to certain chemicals, ultraviolet radiation, or other external factors can also cause DNA to change. These external agents of genetic change are called mutagens.
Exposure to mutagens often causes alterations in the molecular structure of nucleotides, ultimately causing substitutions, insertions, and deletions in the DNA sequence. Mutations are a source of genetic diversity in populations, and they can have widely varying individual effects. In some cases, mutations prove beneficial to an organism by making it better able to adapt to environmental factors. In other situations, mutations are harmful to an organism; for instance, they might lead to increased susceptibility to diseases and pests. In still other circumstances, mutations are neutral, proving neither beneficial nor detrimental to an organism.

Nucleotide substitution model

Substitution models attempt to predict the rate of substitution for nucleotides or amino acids at a given site, and also the distribution of substitutions across the entire sequence. The differential rate of substitutions across the sequence is called rate heterogeneity. The simplest substitution model for nucleotides is the Jukes–Cantor (JC) one-parameter model, which assumes that all nucleotides occur with equal frequency (25%) and are substituted with equal probability. This model requires a single parameter denoting the rate.

A simple method has been proposed for estimating the average number of nucleotide substitutions per site within and between populations for the case where a large number of individuals are examined with many restriction enzymes. This method gives essentially the same results as those obtained by Nei and Li's method but saves a large amount of computer time. The variances of the estimated quantities can be obtained by the jackknife method, and these variances are very similar to those obtained by Nei and Jin's more sophisticated method. A similar simple method can also be applied to DNA sequence data.
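
As a rough illustration of the Jukes–Cantor model described above, the sketch below estimates the number of substitutions per site between two aligned sequences. It uses the standard JC correction d = -(3/4) ln(1 - (4/3) p), where p is the observed proportion of differing sites. The function names and the two toy sequences are hypothetical and chosen only for illustration; this is not the restriction-enzyme method of Nei and Li mentioned above.

```python
import math


def p_distance(seq_a, seq_b):
    """Observed proportion of differing sites between two aligned sequences.

    Only positions where both sequences carry A, C, G, or T are compared;
    gaps and ambiguity codes are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "ACGT" and b in "ACGT":
            compared += 1
            if a != b:
                differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def jukes_cantor_distance(p):
    """Jukes-Cantor estimate of substitutions per site from the p-distance."""
    if not 0.0 <= p < 0.75:
        # The correction is undefined once p reaches the saturation value 3/4.
        raise ValueError("p must satisfy 0 <= p < 0.75")
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * p)


# Hypothetical toy alignment: 20 sites, 2 of which differ.
seq_a = "ACGTACGTACGTACGTACGT"
seq_b = "ACGTACGAACGTACGTACCT"

p = p_distance(seq_a, seq_b)
d = jukes_cantor_distance(p)
print(f"p-distance: {p:.3f}")
print(f"JC distance: {d:.4f}")
```

For small p the corrected distance stays close to p. As p approaches 0.75 the correction grows without bound, which is why the function rejects values at or above that limit.
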

DNA Structure variations

Structural variation (SV) is generally defined as a region of DNA approximately 1 kb and larger in size and can include inversions and balanced translocations or genomic imbalances (insertions and deletions), commonly referred to as copy number variants (CNVs). Copy number variation refers to a genetic trait involving the number of copies of a particular gene present in the genome of an individual. Genetic variants, including insertions, deletions, and duplications of segments of DNA, are also collectively referred to as copy number variants. Structural variants such as deletions, insertions, duplications, inversions, and translocations are widespread in genomes and are often associated with gene expression changes and severe phenotypes. CNV disorders arise from the dosage imbalance of one or more genes, resulting from deletions, duplications, or other genomic rearrangements that lead to the loss or gain of genetic material.

Comparative genomics

Comparative genomics is the direct comparison of the complete genetic material of one organism against that of another to gain a better understanding of how species evolved and to determine the function of genes and noncoding regions in genomes. For example, a March 2000 study comparing the fruit fly genome with the human genome discovered that about 60 percent of genes are conserved between fly and human. Put simply, the two organisms appear to share a core set of genes. Comparative genomics pinpoints genes that are essential to life and highlights genomic signals that control gene function across many species. It helps us to understand which genes relate to various biological systems, which in turn may translate into innovative approaches for treating human disease and improving human health.
Comparative genomics also provides a powerful tool for studying evolutionary changes among organisms, helping to identify genes that are conserved or common among species, as well as genes that give each organism its unique characteristics. The major principle of comparative genomics is to compare basic biological similarities or differences in genomic features resulting from DNA sequences between organisms at the genetic level. Genomic features include DNA sequences, gene content, gene order, regulatory sequences, and other genomic structures. The structure of different genomes can be compared at three levels: overall nucleotide statistics, genome structure at the DNA level, and genome structure at the gene level.

The main difference between genomics and metagenomics is the number of organisms evaluated in an assay or sample. Genomics studies the genome of a single organism, while metagenomics studies the collection of different organisms' genomes within a sample.

Comparative genomics is the field of bioinformatics that involves comparing the genomes of two different species, or of two different strains of the same species. One of the first questions to ask when comparing the genomes of two species is: do the two species have the same number of genes?
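
The first of the three comparison levels named above, overall nucleotide statistics, can be sketched in a few lines. The example below computes sequence length, base frequencies, and GC content for two sequences and reports the difference in GC content. The function name `nucleotide_stats` and both toy sequences are hypothetical placeholders; a real comparison would read complete genome sequences from files rather than short literals.

```python
from collections import Counter


def nucleotide_stats(seq):
    """Return length, base frequencies, and GC content for a DNA sequence.

    Characters other than A, C, G, and T (for example N or gaps) are ignored.
    """
    counts = Counter(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G, or T")
    frequencies = {base: counts[base] / total for base in "ACGT"}
    return {
        "length": total,
        "frequencies": frequencies,
        "gc_content": frequencies["G"] + frequencies["C"],
    }


# Hypothetical toy sequences standing in for two genomes.
genome_a = "ATGCGCGTATATGCGCGGCC"
genome_b = "ATATATTTAAGCATATATTA"

stats_a = nucleotide_stats(genome_a)
stats_b = nucleotide_stats(genome_b)

for name, stats in (("A", stats_a), ("B", stats_b)):
    print(f"genome {name}: length={stats['length']}, GC={stats['gc_content']:.2f}")

gc_difference = abs(stats_a["gc_content"] - stats_b["gc_content"])
print(f"difference in GC content: {gc_difference:.2f}")
```

Statistics of this kind give only a coarse first impression. Comparing gene content or gene order, the other two levels, requires annotated genomes and is beyond the scope of this sketch.
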