
Application of Bioinformatics in agriculture

Nhu V.H1*, Sang C1*, Trinh L.T.D1* and Xuan N.U.Y1*


1School of Biotechnology, International University-Vietnam National University Hochiminh city

Quarter 6, Linh Trung Ward, Thu Duc District, HCMC


*the authors contributed equally to this work.

ABSTRACT
Bioinformatics plays an important role in agricultural science. As an interdisciplinary
scientific field, it combines biology, computer science, information engineering and
statistics for data analysis and interpretation. The collection and storage of plant
genetic resources are essential for producing stronger, disease- and insect-resistant crops. It
also helps make livestock healthier, more resistant to disease and more
productive. Biological data are growing exponentially, so the ability to
capture, manage, process, analyze and interpret data is more important than ever.
Bioinformatics and modern computers allow scientists to do this with great speed and
accuracy. This review describes some key software packages, methods and databases
used in bioinformatics. Web tools, bioinformatics resources and applications in agriculture
are also reviewed.

INTRODUCTION
Bioinformatics is the application of computer technology to the processing
and management of data from biological experiments. It deals with genes, DNA, RNA and
proteins, and is especially helpful for comparing genes and protein sequences within an
organism or between organisms, examining evolutionary
relationships between organisms, and using the patterns that exist across DNA and protein
sequences to determine their function[1]. It can be divided into two classes:
management of biological data and computational biology. This sea of data is far too large
for the human brain to process. Comparing the roughly 25,000 genes of Arabidopsis with the
genes of a similar plant, or keeping track of the roughly 20,000 protein-coding genes of a
human being, would be unimaginable by hand and would take many years[2]. Demand is
therefore growing for computational methods to process and contextualize these data. By
reusing information stored from earlier discoveries, computers ease the immense job of
genome mapping[3].

Instructor: Nguyen Minh Thanh Group 21

Computers can analyze and store information and can also be used to construct models. In
this review, we examine how various bioinformatics tools are applied in agricultural science
to enable storage, retrieval, analysis, annotation and visualization of results and to
promote a fuller understanding of biological systems.

METHOD
Important biological databases for agriculture
When the "genomic revolution" took place, the first task of bioinformatics was to set up and
maintain databases to store large amounts of biological data such as nucleotide and
amino acid sequences. Database development includes not only the design and storage of data
but also the development of a user-friendly graphical user interface (GUI) so that
researchers can access existing information and submit new or revised information, for
instance through NCBI and Ensembl. As technology advances, more and more useful
databases are appearing where information on agricultural plant species can be found.
Another important resource is the Plant Genome Database (PlantGDB), a catalog of genomic
sequences for all plant species, built to support comparative genomics. This database
also organizes EST sequences into putative groups and identifies unique
genes.

There are three primary sequence databases, GenBank (NCBI), the Nucleotide Sequence
Database (EMBL) and the DNA Data Bank of Japan (DDBJ), which store raw
sequence data, including those of plants. Similarly, SWISS-PROT and TrEMBL are the primary
databases for storing plant protein sequences. There are also secondary databases such as
PROSITE, PRINTS and BLOCKS; the entries they contain are not raw data but
are derived from the data in the primary databases.

The first bioinformatics databases emphasized data collection. At the start
of this decade, the focus shifted from data capture to data synthesis and integration. The
model organism database (MOD), an integrated repository of all electronic information sources
related to a specific species of experimental plant or animal, has become a leading choice
in the bioinformatics world. Newer resources integrate a variety of biological data across
several species, allowing researchers to make discoveries that would not be possible with a
single-species analysis. These systems integrate data about many organisms and use
comparative analysis to discover patterns in the genome that might otherwise be missed.


Currently, different bioinformatics approaches are applied when studying plant genome
information. A few of the most popular are:

Sequence alignment methods and applications for comparing genome sequences: Advances in
technologies for the large-scale quantification and identification of biological
molecules, combined with progress in computing and the web, have made it easier
to deliver large volumes of biological data to researchers. Higher
throughput was achieved through automation, miniaturization and integration of technologies,
and applying this approach to assays of other biological molecules, including mRNA,
proteins and metabolites, has resulted in a large increase in the generation of biological
information.

The core of bioinformatics strategies for sequence alignment is often the
comparison and annotation of cDNA/EST and genomic sequences. In addition to whole-genome
sequencing, plant sequence information has been collected from three main
sources: sample sequencing of bacterial artificial chromosomes (BACs), genome survey
sequencing (GSS) and sequencing of expressed sequence tags (ESTs).

Sequence alignment: This is the arrangement of two or more amino acid or nucleotide
sequences from one or more organisms in such a way as to align the regions of the sequences
that share common properties. Well-known algorithms for pairwise alignment are the
Smith-Waterman algorithm for local alignment and the Needleman-Wunsch algorithm for
global alignment.
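
To make the difference between global and local alignment concrete, the following is a
minimal Python sketch of the scoring step of both algorithms. The scoring values
(match = 1, mismatch = -1, gap = -1) and the function names are illustrative assumptions,
not values taken from any of the tools reviewed here, and the sketch returns only the
optimal score rather than the alignment itself.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-1):
    """Optimal GLOBAL alignment score of a and b (simple linear gap penalty)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against the empty string costs one gap per residue.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap      # gap in b
            left = score[i][j - 1] + gap    # gap in a
            score[i][j] = max(diag, up, left)
    return score[-1][-1]


def smith_waterman_score(a, b, match=1, mismatch=-1, gap=-1):
    """Optimal LOCAL alignment score: cells never drop below zero, best cell wins."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best
```

For example, needleman_wunsch_score("GATTACA", "GATACA") scores the two sequences end to
end, whereas smith_waterman_score("TTTGATTACATTT", "CCGATTACACC") scores only the best
matching region shared by the two sequences.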

Multiple sequence alignment: Multiple alignment demonstrates relationships between three
or more sequences. When the sequences involved are divergent, the residues that remain
conserved are often key residues related to the maintenance of structural stability or
biological function. Multiple alignments can reveal many clues about protein structure and
function. One of the most commonly used alignment programs is the ClustalW package.
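
As a small illustration of how conserved residues can be read off a multiple alignment,
the sketch below scans an alignment that has already been computed (for example by a tool
such as ClustalW) and reports the columns in which every sequence carries the same residue.
The function name and the toy sequences are hypothetical; the code does not perform the
alignment itself.

```python
def conserved_columns(aligned):
    """Return 0-based column indices where all aligned sequences share one residue.

    `aligned` is a list of equal-length strings in which '-' marks a gap.
    Columns containing a gap are never reported as conserved.
    """
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must all have the same length")
    columns = []
    for i in range(length):
        residue = aligned[0][i]
        if residue != "-" and all(seq[i] == residue for seq in aligned):
            columns.append(i)
    return columns


# Toy, made-up alignment of three short protein fragments.
example = ["MKT-LV", "MRTALV", "MKTGLI"]
print(conserved_columns(example))
```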

Sequence Similarity Searching Algorithms: Possibly the most widely used of these
are FASTA and BLAST. Both tools provide very fast searches of sequence databases.


Genome Comparison Tools: MegaBLAST is an NCBI BLAST-based algorithm for similarity
searches on large sequences. It can be used, for example, to compare raw genomic sequences
against a database of contaminant sequences.

Expressed sequence tags (ESTs): ESTs are partial gene sequences that have been
produced, or are being produced, in several laboratories using different
species and cultivars as well as diverse tissues and developmental stages. ESTs are now
widely used throughout the genomics and molecular biology community for gene discovery,
mapping, polymorphism analysis, expression studies and gene prediction.[4]

Genome Sequencing
DNA is randomly sheared into fragments, which are cloned and sequenced in parallel. Software
then pieces together the overlapping sequences that were sequenced separately (Myers,
1995). Many software packages exist for sequence assembly (Gibbs et al., 2003), such as
Phred/Phrap/Consed (http://www.phrap.org), Arachne (http://www.broad.mit.edu/wga/) and
GAP4 (http://staden.sourceforge.net/overview.html). AMOS
(http://www.tigr.org/software/AMOS/) is a modular, open-source package developed by TIGR
that can be used for comparative genome assembly (Pop et al., 2004).
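
The idea of joining separately sequenced fragments through their overlaps can be sketched
as a toy greedy merger. This is only an illustration under simplifying assumptions
(error-free reads, a single strand, no repeats, no read contained inside another); it is
not how Phrap, Arachne, GAP4 or AMOS are implemented, and the function names are
hypothetical.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b (0 if < min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the longest suffix/prefix overlap."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_len:
                        best_len, best_i, best_j = k, i, j
        if best_len == 0:
            break  # no remaining overlaps: leave the reads as separate contigs
        merged = reads[best_i] + reads[best_j][best_len:]
        reads = [r for idx, r in enumerate(reads) if idx not in (best_i, best_j)]
        reads.append(merged)
    return reads


# Three made-up reads that tile a 13-base sequence.
print(greedy_assemble(["ATGGCGT", "GCGTACC", "TACCTGA"]))
```

Real assemblers must additionally handle sequencing errors, reverse-complement reads and
repeated regions, which is why dedicated packages are used in practice.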

Data Representation and Storage

Notable relational database software that is freely available and popular in
bioinformatics includes MySQL (http://www.mysql.com/) and PostgreSQL
(http://www.postgresql.org/). In relational databases, data are represented as entities,
attributes and relationships between the entities. This type of representation is called
Entity-Relationship (ER), and database schemas are described using ER diagrams (see, for
example, the TAIR schema at http://arabidopsis.org/search/schemas.html). In the physical
database implementation, entities become tables and attributes become columns. Data are the
values stored in the fields of the tables. Relational databases are a powerful way to store
large quantities of data.
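
The mapping from entities and attributes to tables and columns can be sketched with
Python's built-in sqlite3 module. SQLite is used here only as a convenient stand-in for
MySQL or PostgreSQL, and the two-table schema (species and gene) and the gene symbols are
made up for illustration; they are not the TAIR schema.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two entities: species and gene."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE species (
            species_id INTEGER PRIMARY KEY,
            name       TEXT NOT NULL UNIQUE
        );
        CREATE TABLE gene (
            gene_id    INTEGER PRIMARY KEY,
            symbol     TEXT NOT NULL,
            -- the relationship "a gene belongs to one species"
            species_id INTEGER NOT NULL REFERENCES species(species_id)
        );
        """
    )
    conn.executemany(
        "INSERT INTO species (species_id, name) VALUES (?, ?)",
        [(1, "Arabidopsis thaliana"), (2, "Oryza sativa")],
    )
    conn.executemany(
        "INSERT INTO gene (symbol, species_id) VALUES (?, ?)",
        [("GENE_A", 1), ("GENE_B", 1), ("GENE_C", 2)],  # hypothetical symbols
    )
    conn.commit()
    return conn


def genes_per_species(conn):
    """Follow the gene-to-species relationship and count genes per species."""
    query = """
        SELECT s.name, COUNT(g.gene_id)
        FROM species AS s
        JOIN gene AS g ON g.species_id = s.species_id
        GROUP BY s.name
        ORDER BY s.name
    """
    return conn.execute(query).fetchall()


print(genes_per_species(build_demo_db()))
```

Each entity becomes a table, each attribute a column, and the relationship between the
entities is expressed through the species_id column shared by both tables.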

Data Access and Exchange


Extensible Markup Language (XML) is a standard for exchanging data and information via the
web. It allows information providers to define new tag and attribute names
at will and to nest document structures to any level of complexity, among other features.


The document that defines the tags of an XML document and how they may be nested is called a
Document Type Definition (DTD). Using a common DTD allows different users and applications
to exchange data in XML.[5]
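
As a brief illustration of nested, self-describing XML records, the sketch below parses a
small document with Python's standard xml.etree.ElementTree module. The tag and attribute
names (sequences, sequence, residues, id, organism) are invented for this example and do
not follow any particular bioinformatics DTD. Note that ElementTree only checks that the
document is well formed; it does not validate against a DTD.

```python
import xml.etree.ElementTree as ET

# A made-up XML document with nested elements and attributes.
XML_TEXT = """<sequences>
  <sequence id="seq1" organism="Oryza sativa">
    <description>hypothetical example record</description>
    <residues>ATGGCGTACC</residues>
  </sequence>
  <sequence id="seq2" organism="Arabidopsis thaliana">
    <description>another hypothetical record</description>
    <residues>ATGTTTGA</residues>
  </sequence>
</sequences>"""


def parse_sequences(xml_text):
    """Turn each <sequence> element into a plain dictionary."""
    root = ET.fromstring(xml_text)
    records = []
    for node in root.findall("sequence"):
        records.append(
            {
                "id": node.get("id"),
                "organism": node.get("organism"),
                "residues": node.findtext("residues", default="").strip(),
            }
        )
    return records


for record in parse_sequences(XML_TEXT):
    print(record["id"], record["organism"], len(record["residues"]))
```

Because both sides agree on the tag names in advance, a program written by one group can
read records produced by another, which is the role a shared DTD plays.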

RESULT
Bioinformatics tools play an important role in providing information about the genes
present in the genomes of agricultural species. These tools also help predict the function
of different genes and the factors that influence them. The genetic information they provide
allows scientists to create plants with enhanced resistance to drought, herbicides and
pesticides. Similarly, certain genes can be modified to improve meat and milk yield, and
animals can be made more resistant to disease through specific changes to their genomes[6].
The following are some of the practical uses of bioinformatics in agriculture:

Improved nutritional quality

When plant genomes are modified, the nutritional value of the plants can also be increased.
Golden rice is an important achievement in this effort. Here, genes were inserted into the
rice genome to increase the level of the vitamin A precursor beta-carotene in the grain.
Vitamin A is essential for vision, and its deficiency in the body can lead to blindness.
Through this work, scientists are introducing genetically modified rice to help reduce
blindness worldwide. [7]

Figure 1: Transfer of genes into rice to increase the level of vitamin A[4]


Crops
Although plant genomes are largely conserved through evolutionary change, they yielded
little information until bioinformatics tools were introduced; since then it has been
possible to extract the information needed from the genome of a particular plant. Two
plants whose genomes have been fully sequenced are Arabidopsis (Arabidopsis thaliana)
and rice (Oryza sativa), known in English as thale cress and rice respectively.
Arabidopsis is a small plant in the mustard family (Brassicaceae) with a small
genome, so researchers took an interest in it as a model for studying plant development. Its
genome consists of 5 chromosomes with about 125 Mbp of DNA in total. It produces a new
generation in about 5 weeks. Understanding its genes and their expression provides
information about proteins and their expression in other plants. Knowing the genome of
A. thaliana has many uses, but the main one is to help increase crop yield. [7]

Insect resistance
Many plants have had a desired gene incorporated into their genome, making them resistant to
insects. Bacillus thuringiensis (Bt) is a soil bacterium that produces proteins toxic to
certain insect pests. By identifying the genes responsible and incorporating them into
plants, researchers have created crops that resist insects. Corn, cotton and potatoes, for
example, have been made resistant to pests in this way. Because the bacterial gene is part
of the plant genome, an insect that eats the plant also ingests the bacterial toxin, which
enters its body so that the insect stops feeding, starves and eventually dies. Bt corn is
one edible plant that has been modified by inserting a bacterial gene that is effective
against insects. The use of Bt genes in plant genomes has allowed farmers to use far less
pesticide. As a result, crop yield and nutritional value can increase, to the benefit of
human health. [7]

Figure 2: Bacillus thuringiensis gene


Poorer soils and drought resistance

Some cereal varieties have been developed that grow on poor soils and are resistant to
drought and flooding. In this way, land with low soil fertility, or land prone to drought
and flooding, can also be used for cultivation.
Great advances in molecular biology over the past decades, together with advances in genetic
engineering, have led to explosive growth in the biological information generated by the
scientific community, including information about genomes. As a result, computer databases
are needed to store, organize and index these data, along with specialized tools to view
and analyze them. With the publication of the completed Arabidopsis thaliana genome sequence
(Arabidopsis Genome Initiative, 2000) and the draft sequence of the rice genome (Goff et al
2002), plant research and the industry crossed the threshold into the genomic era. The many
applications of genomic information have opened up opportunities for rich rewards in systems
biology, integrative biology and large-scale systematic functional genomics projects. The
accumulation of these various freely available data deepens understanding of the genome.
Based on this understanding, methods can be developed for determining the magnitude and
direction of changes in gene or protein expression levels, evaluating interactions with
other genes and proteins, and finally profiling the metabolites in a given tissue. Reaching
this horizon is a great scientific undertaking, and many aspects of it will inevitably rely
on bioinformatics (Rudd, 2003), given the power hidden in these data.

Genome scaffolds or partial transcriptome data are now available for even more
plants. It is reasonable to say that bioinformatics has become an integral
part of modern genome research. The field has matured through its involvement in
analyzing a variety of complete genome sequences (Claverie and Notredame, 2003). As a young
discipline within information technology, concerned with scientific data management in
genomics and proteomics, bioinformatics has developed very rapidly over the past 10 years.
Bioinformatics methods are applied worldwide to databases and information
exchange for the comparison, verification, storage and analysis of biological data. Several
databases currently hold protein-related data for humans, animals, plants, bacteria and
other living organisms (Vassilev et al 2005).

Bioinformatics in plant science


Plants are the basis of life on Earth and produce the life-sustaining oxygen we breathe.
They are essential to our nutrition and health and provide a living environment for the vast
biodiversity of the planet. For centuries, humans have selected plants to develop the most
suitable crop varieties with multiple advantages. These varieties differ from wild plants in
quality, yield and cultivation methods. The multigenic nature of resistance and quality
traits has made them very difficult to combine and improve. The revolution in the life
sciences is marked by the genome-wide scale and scope of investigation and its application
to plant science. The scale and high resolving power of genomics allow an extensive and
detailed genetic understanding of the different yield levels of crops. The complex
biological processes that underlie the mechanisms of pathogen resistance and plant quality
are now open to systematic functional analysis. Performing this analysis with specialized
software on large amounts of data is the field of plant bioinformatics (Neerincx and
Leunissen, 2005; Meyer and Mewes, 2002).

Genome initiatives are underway for more than 60 different plant species. The most
important from an economic point of view are the main food and feed crops: the grasses
maize, rice, wheat, sorghum and barley, and the legumes beans and alfalfa, used for fodder.
Some of these genomes are so large that sequencing the entire genome is impractical, and
effort has been focused on comparative genomics methods. Both rice and maize, however, are
central to the agricultural economies of developed countries, and obtaining their complete
genome sequences has been given priority. [7]

1. Rice
Some desirable traits of future improved rice varieties are high grain yield, higher
quality, ease of harvesting, enhanced resistance to insects and diseases, and the ability to
tolerate abiotic stresses such as drought, flooding, frost and nutrient deficiencies.

2. Maize
Maize products generate about 30 billion dollars annually and are used in food, rubber,
plastic, fuel and clothing. The maize genome is about 20 times larger than that of
Arabidopsis. This makes it comparable in size to the human genome, but its organization is
more complicated than that of most organisms sequenced to date. Genes occur in clusters
throughout the genome, separated by stretches of repetitive sequence. The gene-containing
regions account for about 15% of the total genome. Another important feature of the maize
genome is the presence of multiple copies of most genes, and jumping genes (transposable
elements) make up a large part of the genome.


3. Tomato
The overall goals of the Tomato Genomics Project include the development of an integrated
laboratory toolkit for use in tomato functional genomics. The resources developed will be
used to further expand our understanding of the molecular genetic events that underpin fruit
development and responses to pathogens, and will provide the research community with tools
for analyzing various plant biological phenomena. [7]

4. Wheat
Recent advances in plant genetics and genomics offer an unprecedented opportunity to explore
gene function and the potential of gene manipulation for crop improvement. Because of the
size of the wheat genome, its full base-pair sequence was not expected to be determined in
the near future. The project therefore adopted an alternative strategy to realize the
benefits of new technology for discovering genes and learning their functions (functional
genomics). It aims to determine the physical positions of 10,000 wheat ESTs (expressed
sequence tags) on the wheat chromosomes. This process exploits a characteristic unique to
wheat chromosomes: the plant can tolerate the loss of some chromosome fragments and still
be viable. The mapping logic is direct: if an EST is found in plants with complete
chromosomes but not in plants lacking a known segment of a single chromosome, then the DNA
sequence corresponding to that EST lies in that segment of the chromosome. When the mapping
is completed, the project will have produced a unique resource of 10,000 DNA sequences, each
corresponding to a gene whose physical location on a wheat chromosome is known. This sets
the stage for the next step of the project, which is to determine the functions of this set
of mapped ESTs.[15]
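
The mapping logic described for the wheat ESTs can be written as a short Python sketch. The
plant line names, segment labels and function name below are hypothetical and serve only to
illustrate the inference; they are not data or software from the project.

```python
def map_est_to_segments(present_in_complete_line,
                        presence_by_deletion_line,
                        missing_segment_by_line):
    """Infer which chromosome segment(s) may carry an EST.

    present_in_complete_line: True if the EST is detected in a plant whose
        chromosomes are complete (the control).
    presence_by_deletion_line: dict of line name -> True/False (EST detected?).
    missing_segment_by_line: dict of line name -> segment that line lacks.

    If the EST is seen in the control but not in a deletion line, it is assigned
    to the segment that this deletion line is missing.
    """
    if not present_in_complete_line:
        return set()  # without a positive control the EST cannot be placed
    return {
        missing_segment_by_line[line]
        for line, present in presence_by_deletion_line.items()
        if not present
    }


# Made-up example: the EST disappears only in the line lacking segment_B.
segments = map_est_to_segments(
    True,
    {"deletion_line_1": True, "deletion_line_2": False},
    {"deletion_line_1": "segment_A", "deletion_line_2": "segment_B"},
)
print(segments)
```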

5. Other flowering plants


More than 90 different vascular seed plant genome projects around the world are listed on
the United States Department of Agriculture web site. The list includes projects on beans,
corn and fungal pathogens; Australian projects on cotton, wheat, pine, sugarcane and 9
others; at least 24 European projects on commercial vegetables such as cabbage, cucumbers
and peas and fruits such as apples, peaches and plums; and more than 50 different North
American projects on species including chrysanthemum, almond, papaya, grapes and poplar.
The common denominator among all these projects is a collection of genetic maps (and in
some cases physical maps) and the location of plant genes on them. For some species,
large-scale EST sequencing projects support comparative genomic analysis (particularly in
regions of synteny) and Quantitative Trait Loci (QTL) mapping.


DISCUSSION
Although industrialization is a global trend, the agricultural sector still plays an
important role in the society and economy of countries around the world. Food crops in
particular are the most important plants for humans. World population growth is a challenge
for modern plant biotechnology. Crop yields have increased over the last century and will
continue to improve as agronomy refines advanced breeding and develops new
biotechnological strategies.[2]
With the development of science and technology, bioinformatics is gradually becoming a
discipline that plays an important role in agriculture, industry, the utilization of
agricultural products and environmental management. Thanks to the rise in sequencing
projects in recent years, bioinformatics continues to drive tremendous advances in
biology by providing genomic information. Genome sequences of many important plant
species have helped researchers to recognize 'chromosome' and 'difference' factors in
sequences, which in turn have been used to define valuable traits for crop improvement[16].
In addition, the high degree of synteny between diverse plant species, the commonality of
traits, and the availability of expression and functional information for sequences have
helped to uncover the causes of diseases. This could lead to prevention and targeted
treatment of diseases and to improved food production.
To conclude, bioinformatics can now be leveraged to accelerate the translation of
fundamental discoveries into agriculture. Predictive manipulation of plant growth would have
an effect on agriculture at a time when food security, the shrinking area of land available
for agricultural use, environmental stewardship and climate change are all topics of
increasing public concern.[2]

REFERENCES
1. Christopher P. Austin, M.D. (2019). Bioinformatics. URL: https://bit.ly/33AtA6T
2. Xue J., Zhao S., Liang Y., Hou C., Wang J. (2008) Bioinformatics and its Applications in
Agriculture. In: Li D. (eds) Computer And Computing Technologies In Agriculture, Volume
II. CCTA 2007. The International Federation for Information Processing, vol 259. Springer,
Boston, MA. DOI: https://doi.org/10.1007/978-0-387-77253-0_29
3. SINGH V.K., SINGH A.K., CHAND R., KUSHWAHA C. (2011). Role of bioinformatics
in Agriculture and sustainable development. International Journal of Bioinformatics Research.
3 (2), 221-226. URL: http://dx.doi.org/10.9735/0975-3087.3.2.221-226
4. The role of bioinformatics in agriculture. (2020). URL: https://bit.ly/3lF69zw


5. Iquebal, M. A., Jaiswal, S., Mukhopadhyay, C. S., Sarkar, C., Rai, A., & Kumar, D.
(2015). Applications of Bioinformatics in Plant and Agriculture. PlantOmics: The Omics of
Plant Science, 755–789. DOI: https://doi.org/10.1007/978-81-322-2172-2_27
6. Bioinformatics Its Role & Applications In Agriculture & Other Discipline. (2017).
URL: https://www.technologytimes.pk/2017/12/19/bioinformatics-applications-agriculture/
7. Rajamanickam, Elanchezhian. (2012). Application of Bioinformatics in Agriculture. ICT
for agricultural development in changing climate (163-179). URL: https://bit.ly/33G0T8z
8. Arabidopsis Genome Initiative (2000) Analysis of the genome sequence of the flowering
plant Arabidopsis thaliana. Nature, 408 (6814).
9. Goff SA, Ricke D, Lan TH, Presting G, Wang R, Dunn M et al. (2002) A Draft Sequence
of the Rice Genome (Oryza sativa L. ssp. japonica). Science, 296 (5565).
10. Rudd S. (2003) Expressed sequence tags: alternative or complement to whole genome
sequences? Trends in Plant Science.
11. Claverie JM and Notredame C (2003) Bioinformatics for Dummies. Wiley Publ. Inc.
N.Y., USA, p.452.
12. Vassilev D, Leunissen J, Atanassov A, Nenov A and Dimov G (2005) Application of
bioinformatics in plant breeding. Biotechnology & Biotechnological Equipment.
13. Neerincx P and Leunissen J (2005) Evolution of web services in Bioinformatics.
Briefings in Bioinformatics.
14. Meyer K and Mewes HW (2002) How can we deliver the large plant genomes?
Strategies and perspectives. Current Opinion in Plant Biology.
15. The structure and function of the expressed portion of the wheat genomes. (2001).
URL: https://wheat.pw.usda.gov/NSF/
16. Ujjawal, K.S.K., Indra, D., Jai, P.J., & Birendra, J. (2017). Role of Bioinformatics in
Crop Improvement. Global Journal of Science Frontier Research: D Agriculture and
Veterinary. URL: https://bit.ly/3lzfx7K
