This document provides information about the Bioinformatics (BIOT 305) course taught by Simon Kumar Shrestha at KU. The course is worth 3 credits and consists of 50% theory and 50% practical work. Students will be evaluated based on internal exams, lab work presentations, paper presentations, and attendance/class performance. The course covers topics such as biological databases, sequence alignment, gene/promoter prediction, molecular phylogenetics, and structural bioinformatics. The intended learning outcomes are for students to gain knowledge of bioinformatics applications and tools, and skills in using software to analyze genomic and post-genomic data.
Simon.Shrestha@ku.edu.np

Course
• Total Credit: 3 (equivalent to 48 hours)
• Theory: 50%
• Practical: 50%
• Evaluation:
  - Internal exams
  - Lab work presentation
  - Paper presentations
  - Attendance & class performance
• Text Books:
  - Bioinformatics: Sequence and Genome Analysis, David W. Mount
  - Essential Bioinformatics, Jin Xiong
• References:
  - Proteins: Structures and Molecular Properties, Thomas E. Creighton
  - Discovering Genomics, Proteomics and Bioinformatics, A. Malcolm Campbell and Laurie J. Heyer

Syllabus

UNIT I: Introduction & Biological Databases
What is Bioinformatics? Goal, Scope, Applications, Limitations
DNA Sequencing, Genomic sequencing, Sequencing cDNA libraries of expressed genes, Submission of sequences to the databases, Sequence accuracy, Computer storage of sequences, Sequence formats, Conversion of one sequence format to another, Multiple sequence formats, Storage of information in a sequence database, Using the database access program ENTREZ

UNIT II: Sequence Alignment
Pairwise Sequence Alignment: Definition of sequence alignment, Evolutionary basis, Sequence homology versus Sequence similarity, Sequence similarity versus Sequence identity, Methods, Scoring matrices, Significance of Sequence Alignment
Database Similarity Searching: Requirements of Database Searching, Heuristic Database Searching, Basic Local Alignment Search Tool (BLAST), FASTA, Comparison of FASTA and BLAST
Multiple Sequence Alignment: Scoring functions, Exhaustive algorithms, Heuristic algorithms, Practical Issues
Profiles and Hidden Markov Models: Position Specific Scoring Matrices, Profiles, Markov models and Hidden Markov Models

UNIT III: Gene and Promoter Prediction
Gene Prediction: Gene Prediction in Prokaryotes and Eukaryotes, Gene prediction programs
Promoter and Regulatory Element Prediction: Promoter and Regulatory Element Prediction in Prokaryotes and Eukaryotes, Prediction algorithms

UNIT IV: Molecular Phylogenetics
Molecular Evolution and Molecular Phylogenetics, Terminology, Gene phylogeny vs. Species phylogeny, Forms of Tree representation, Procedure, Phylogenetic Tree Construction Methods and Programs (Distance-based and Character-based Methods), Tree Evaluation

UNIT V: Structural Bioinformatics
Protein Structure Basics: Amino acids, Peptide Formation, Dihedral Angles, Hierarchy, Secondary Structures, Tertiary Structures, Determination of Protein 3-D Structures
Protein Structure Visualization, Comparison and Classification
Protein Secondary Structure Prediction: Secondary Structure prediction for Globular and Transmembrane proteins
Protein Tertiary Structure Prediction: Methods, Homology modeling, Threading and Fold recognition, Ab Initio Protein Structure Prediction
RNA Structure Prediction: Introduction, Types of RNA structures, Methods for RNA Secondary Structure prediction, Ab initio Approach, Comparative Approach, Performance Evaluation

Intended learning outcome
• Knowledge:
  - Knowledge of the most widely used bioinformatics applications and technology
  - An appreciation of the advantages and shortcomings of various bioinformatics software tools
  - An understanding of the appropriate application of a range of bioinformatics software
• Skills:
  - A familiarity with the use of much of the existing software for the analysis of genomic and post-genomic data
  - An ability to select the most appropriate bioinformatics tools for a given analysis
  - An ability to synthesise information

What is Bioinformatics?
• An interdisciplinary research area at the interface between computer science and biological science
• The discipline of quantitative analysis of information relating to biological macromolecules with the aid of computers
• The mathematical, statistical and computing methods that aim to solve biological problems using DNA and amino acid sequences and related information
• Biology & Information Technology
• Involves technology that uses computers for storage, retrieval, manipulation and distribution of information related to biological macromolecules such as RNA, DNA and proteins

Need for Bioinformatics
• A large amount of sequence and supplementary information is generated every year
• What should be done with this information?
• It is stored in databases so that it can be retrieved and manipulated when needed

Data Explosion

DNA sequences as information
• DNA sequences can code for amino acid sequences (via mRNAs)
• DNA can also code for stable RNA sequences:
  - tRNA, rRNA, snRNA, siRNA, lncRNA
• DNA sequences act as protein binding sites
• DNA codes for architectural information:
  - Intrinsic DNA curvature
  - Nucleosome positioning
• DNA codes for architectural information:
  - Transcriptional initiation
  - Origin of replication
  - Mutational hot spots

RNA sequences as information
• The mRNAs contain several levels of information:
  - Specifies the amino acid sequence for proteins
  - Localization signals
  - Stability signals
  - Splice signals
  - Editing signals
• The tRNAs code for the genetic code
• The rRNAs code for the structure of ribosomes

Protein sequences as information
• The protein sequence can code for an "active site" for enzymes
• The protein sequence can code for structural roles: microtubules, myosin, collagen, etc.
• The protein sequence can code for ion channels/pumps
• The protein sequence can code for localization information
• The protein sequence can code for modification sites

What is this?
>Hello Find me
ATGGGACTACCCTGGTACCGCGTACATACAGTAGTTCTGAACGATCCAGGACGGCTGATTTCTGTACACCTAATGCACACTGC
TCTTGTCGCAGGTTGGGCGGGCTCTATGGCCCTGTACGAATTGGCAGTTTTTGACCCATCAGACCCAGTTCTCAATCCCATGT
GGCGTCAAGGTATGTTTGTCATGCCTTTTATGGCTCGTTTGGGTGTAACTCAATCCTGGGGTGGCTGGAGTCTAACTGGTGA
AGTAGCCGATAATCCCGGAATTTGGTCTTTTGAAGGGGTAGCCGCTACCCATATCATCTTGTCAGGTCTATTATTCCTGGCAGC
AGTTTGGCACTGGGTTTACTGGGATCTGGAACTGTTTACCGATCCTCGGACTGGTGAACCAGCCCTAGACCTACCCAAAATG
TTCGGAATTCATTTATTCCTATCTGGTTTGCTTTGTTTTGGCTTCGGAGCCTTCCACCTCACGGGACTATTCGGACCGGGAATG
TGGGTTTCTGACCCCTATGGATTGACGGGAAGTATACAACCTGTCGCTCCTTCCTGGGGGCCTGAAGGATTTAACCCCTTCAA
TGCTGGCGGTATTGCGGCTCACCATATTGCGGCCGGAATTGTTGGCATTATTGCCGGACTATTCCACCCGTCCGTCAGACCAC
CTCAGCGCCTATACAAAGCCCTGCGTATGGGAAATATCGAAACTGTACTATCTAGTAGTATCGCGGCGGTATTCTTTGCGGCTT
TTGTGGTAGCTGGAACTATGTGGTATGGTTCGGCTGCAACTCCGATTGAACTGTTTGGACCTACCCGCTATCAGTGGGATCAG
GGATATTTCCAACAGGAAATTCAGCGCCGGGTACAAAGCAGTATTGCTCAGGGTGACAGCCCCTCAGAAGCATGGTCTAAG
ATTCCTGAAAAACTGGCATTTTATGACTATGTTGGTAACAGTCCCGCTAAAGGCGGTTTGTTCCGCGTCGGTCCGATGAACAA
GGGCGATGGTATTGCTCAAGGTTGGCTCGGACACCCAGTATTCACTGATGCAGAAGGTCGCGAATTAACTGTTCGTCGTCTT
CCTAACTTCTTTGAAACCTTCCCCGTCATTCTGACTGATGCTGATGGCGTAATTCGCGCTGACGTTCCTTTCCGTCGCGCGGA
GTCTCGCTACAGCTTTGAGCAAACTGGGGTGACTGTTTCTTTATATGGTGGTGAACTCAATGGTAAAACCTTCACCGATCCCG
CCTCTGTGAAGAAATATGCCCGCTTTGCTCAACAGGGTGAACCATTTGCCTTTGACCGGGAAACTCTCGGCTCTGATGGGGT
ATTTCGTACCAGTACCCGTGGCTGGTTTACTTTCGGTCACGCTTGCTTTGCTCTGCTTTTCTTCTTTGGTCATATTTGGCACGGT
TCCCGCACCATCTTCCGAGATGTATTTGCTGGGGTGGAAGCTGACCTAGAAGAACAAGTTGAGTGGGGTAACTTCCAGAAA
GTTGGAGACCAAACAACTCGTGTTCAAAAGACCGTCTAA

Goals of Bioinformatics
• To better understand:
  - A living cell
  - How it functions at the molecular level
• Cellular functions are ultimately controlled by the proteins produced through the central dogma of biology
• The specificity and capabilities of proteins are determined by their sequences
• To generate new insights and provide a "global" perspective of the cell

Bioinformatics to Systems Biology
• When we are able to integrate the ~omics data sets into comprehensive virtual biological correlation networks, we can build a complete description of the complex biological processes of interest, which we can subsequently model or emulate in silico.
• This modelling will allow us to elucidate specific and pleiotropic gene functions and relationships. In other words, we will be able to understand the behaviour and (in)stability of complex phenotypes.

Scope of Bioinformatics
• 3 major aspects of bioinformatics:
  - Structure analysis: structure prediction of nucleic acids and proteins, protein structure classification, protein structure comparison
  - Sequence analysis: genome comparison, gene and promoter prediction, sequence alignment, sequence database searching
  - Function analysis: metabolic pathway modeling, gene expression profiling, protein interaction prediction, protein subcellular localization prediction
• These major aspects are accomplished by 2 subfields of bioinformatics:
  - Development of computational tools and databases
  - Application of these tools and databases to generate biological knowledge and better understand living systems

Techniques frequently used
• Bioinformatics employs a wide range of computational techniques, including:
  - sequence and structural alignment
  - database design and data mining
  - macromolecular geometry
  - phylogenetic tree construction
  - prediction of protein structure and function
  - gene finding
  - expression data clustering

Distinction: Bioinformatics & Computational Biology
• Bioinformatics is limited to sequence, structural, and functional analysis of genes and genomes
• Computational biology encompasses all biological areas that involve computation
• For example, mathematical modeling of ecosystems, population dynamics, application of game theory in behavioral studies, and phylogenetic construction using fossil records all employ computational tools, but do not necessarily involve biological macromolecules

Why do we need Bioinformatics?
• Sequence Analysis
  - Processing of DNA and protein sequences to understand their function, structure and other features
  - Comparison of sequences to find similarity with existing sequences
• Gene Expression Profiling
  - Measurement of the activity of thousands of genes at once
  - DNA microarray technology, SuperSAGE and RNA-seq are used for profiling
  - Statistical analysis is performed on the results
• Comparative Genomics
  - Study of the relationship between the genomic structures of different species
  - Helps to understand evolutionary processes
• Drug Discovery
  - Process of discovering and designing drugs
  - Includes target identification, validation, optimization and trials
  - Specific databases and bioinformatics tools (e.g. for ADMET) are available

Applications of Bioinformatics
• Molecular Medicine
  - Genetic diseases, e.g. cystic fibrosis
  - Alterations of the genome due to the body's response to environmental stresses, e.g. heart disease, cancer
  - The Human Genome Project helps to understand these types of diseases
• Clinical Medicine
  - Pharmacogenomics (the study of how an individual's genetic inheritance affects the body's response to drugs)
  - Detailed knowledge of an individual's genetic profile helps doctors prescribe the right therapy from the beginning
• Gene Therapy
  - Treatment of diseases on the basis of the expression of the genes causing them
  - This technique is not yet in routine use, but that may not be far off
• Drug Design
  - Current drugs target roughly 500 proteins
  - Understanding disease mechanisms can help to find new drugs that act on the target proteins
• Microbial genome applications
  - Microbial genome projects can help to use a variety of microbes for useful purposes, e.g. waste clean-up (Deinococcus radiodurans, a radiation-resistant bacterium), industrial processing, and energy production (Chlorobium tepidum, which has a large capacity for generating energy from light)
• Biotechnology
  - Corynebacterium glutamicum is used by the chemical industry for the biotechnological production of lysine (an important amino acid in animal nutrition)
  - Xanthomonas campestris is used to produce the exopolysaccharide xanthan gum
  - Lactococcus lactis is used in the dairy industry
• Antibiotic Resistance
  - Enterococcus faecalis is a leading cause of bacterial infection among hospital patients
  - Discovery of the antibiotic-resistance and virulence regions of the bacterium may provide useful markers for detecting pathogenic strains and help establish controls to prevent the spread of infection in wards
• Forensic analysis of microbes
  - Genomic tools were used to distinguish strains of Bacillus anthracis in the analysis of the 2001 anthrax attack in Florida
• Evolutionary studies
  - Genomes from the three domains of life (Archaea, Bacteria and Eukaryota) can be compared to study the universal common ancestor
• Polymerase Chain Reaction (PCR)
  - Primer design
  - Assessment of PCR accuracy and efficiency using PCR testing tools and software
• Veterinary Science and Comparative studies
  - Understanding animal genomes can help to understand the biology of animals
  - Genome sequencing of different organisms makes it possible to compare their genomes

Limitations of Bioinformatics
• Bioinformatics results are not always accurate
• Bioinformatics predictions based on the interpretation of experimental data are not formal proofs of any concept
• Bioinformatics does not replace traditional experimental research methods
• Over-reliance on poor-quality data can lead to misleading conclusions, since bioinformatics predictions depend entirely on the quality of the data and the algorithms used
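
Worked example: reading a FASTA record
The record shown under "What is this?" is in FASTA format: a header line beginning with ">" followed by sequence lines, here starting with ATG and ending with TAA. As a rough illustration of the first steps of sequence analysis, the sketch below parses FASTA text, computes GC content, and checks whether a sequence has the outward shape of an open reading frame. The function names (parse_fasta, gc_content, looks_like_orf) and the short toy record are hypothetical and chosen for illustration only; they are not part of the course material or of any particular library.

```python
# Minimal sketch, assuming plain FASTA text held in a string.
# All names and the toy record below are hypothetical illustrations.

STOP_CODONS = {"TAA", "TAG", "TGA"}


def parse_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is not None:
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def gc_content(seq):
    """Fraction of bases in seq that are G or C (0.0 for an empty sequence)."""
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def looks_like_orf(seq):
    """True if seq starts with ATG, ends with a stop codon, and has length divisible by 3."""
    return len(seq) % 3 == 0 and seq.startswith("ATG") and seq[-3:] in STOP_CODONS


# Hypothetical toy record, much shorter than the one in the notes.
toy_fasta = """>toy_record hypothetical example
ATGGGACTACCC
TGGTACTAA
"""

records = parse_fasta(toy_fasta)
for name, seq in records:
    print(name, len(seq), round(gc_content(seq), 3), looks_like_orf(seq))
```

The same functions could be applied to the full record from the notes by pasting it into a string. Passing the shape check is only a hint that a sequence may be coding; as the limitations above stress, such a prediction is not proof and would still need database searching and experimental support.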