Bioinformatics Tools For Pharmaceutical Drug Product Development 1St Edition Edition Vivek Chavda Full Chapter PDF
Bioinformatics Tools
for Pharmaceutical Drug
Product Development
Scrivener Publishing
100 Cummings Center, Suite 541J
Beverly, MA 01915-6106
Publishers at Scrivener
Martin Scrivener (martin@scrivenerpublishing.com)
Phillip Carmical (pcarmical@scrivenerpublishing.com)
Bioinformatics Tools
for Pharmaceutical Drug
Product Development
Edited by
Vivek Chavda
Department of Pharmaceutics and Pharmaceutical Technology,
L. M. College of Pharmacy, Ahmedabad, India
Krishnan Anand
Department of Chemical Pathology, School of Pathology, University of the Free
State, Bloemfontein, South Africa
and
Vasso Apostolopoulos
Institute for Health and Sport, Immunology and Translational Research Group,
Victoria University, Melbourne, Australia
This edition first published 2023 by John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA
and Scrivener Publishing LLC, 100 Cummings Center, Suite 541J, Beverly, MA 01915, USA
© 2023 Scrivener Publishing LLC
For more information about Scrivener publications please visit www.scrivenerpublishing.com.
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or
transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or other-
wise, except as permitted by law. Advice on how to obtain permission to reuse material from this title
is available at http://www.wiley.com/go/permissions.
For details of our global editorial offices, customer services, and more information about Wiley prod-
ucts visit us at www.wiley.com.
ISBN 978-1-119-86511-7
Set in size of 11pt and Minion Pro by Manila Typesetting Company, Makati, Philippines
Contents
Preface
2.7.2 AI in Nanomedicine
2.7.3 Role of AI in Market Prediction
2.8 Discussion and Future Perspectives
2.9 Conclusion
References
3 Role of Bioinformatics in Peptide-Based Drug Design and Its Serum Stability
Vivek Chavda, Prashant Kshirsagar and Nildip Chauhan
3.1 Introduction
3.2 Points to be Considered for Peptide-Based Delivery
3.3 Overview of Peptide-Based Drug Delivery System
3.4 Tools for Screening of Peptide Drug Candidate
3.5 Various Strategies to Increase Serum Stability of Peptide
3.5.1 Cyclization of Peptide
3.5.2 Incorporation of D Form of Amino Acid
3.5.3 Terminal Modification
3.5.4 Substitution of Amino Acid Which is Not Natural
3.5.5 Stapled Peptides
3.5.6 Synthesis of Stapled Peptides
3.6 Method/Tools for Serum Stability Evaluation
3.7 Conclusion
3.8 Future Prospects
References
4 Data Analytics and Data Visualization for the Pharmaceutical Industry
Shalin Parikh, Ravi Patel, Dignesh Khunt, Vivek P. Chavda and Lalitkumar Vora
4.1 Introduction
4.2 Data Analytics
4.3 Data Visualization
4.4 Data Analytics and Data Visualization for Formulation Development
4.5 Data Analytics and Data Visualization for Drug Product Development
4.6 Data Analytics and Data Visualization for Drug Product Life Cycle Management
4.7 Conclusion and Future Prospects
References
Acknowledgement
References
7 Clinical Applications of “Omics” Technology as a Bioinformatic Tool
Vivek Chavda, Rajashri Bezbaruah, Disha Valu, Sanjay Desai, Nildip Chauhan, Swati Marwadi, Gitima Deka and Zhiyong Ding
Abbreviations
7.1 Introduction
7.2 Execution Method
7.3 Overview of Omics Technology
7.4 Genomics
7.5 Nutrigenomics
7.6 Transcriptomics
7.7 Proteomics
7.8 Metabolomics
7.9 Lipomics or Lipidomics
7.10 Ayurgenomics
7.11 Pharmacogenomics
7.12 Toxicogenomic
7.13 Conclusion and Future Prospects
Acknowledgement
References
Preface
The Editors
Vivek Chavda
K. Anand
Vasso Apostolopoulos
December 2022
Part I
BIOINFORMATICS TOOLS
1
Introduction to Bioinformatics, AI,
and ML for Pharmaceuticals
Vivek P. Chavda1*, Disha Vihol2, Aayushi Patel3, Elrashdy M. Redwan4,5 and Vladimir N. Uversky6†
1 Department of Pharmaceutics and Pharmaceutical Technology, L. M. College of Pharmacy, Ahmedabad, Gujarat, India
2 Department of Phytopharmacy and Phytomedicine, School of Pharmacy, Gujarat Technological University, Ahmedabad, Gujarat, India
3 Pharmacy Section, L. M. College of Pharmacy, Ahmedabad, Gujarat, India
4 Department of Biological Sciences, Faculty of Sciences, King Abdulaziz University, Jeddah, Saudi Arabia
5 Therapeutic and Protective Proteins Laboratory, Protein Research Department, Genetic Engineering and Biotechnology Research Institute, City of Scientific Research and Technological Applications, New Borg EL-Arab, Alexandria, Egypt
6 Department of Molecular Medicine and Byrd Alzheimer’s Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL, USA
Abstract
Bioinformatics is a rapidly growing field. Its groundwork was laid in the early
1960s, when computational methods were first applied to protein sequence
analysis. In parallel, advances in molecular biology techniques made DNA easier
to manipulate and sequence, while advances in computer science delivered more
powerful computers and innovative software for performing bioinformatics tasks.
Biological Big Data, when analyzed with bioinformatics tools, yields powerful
and repeatable predictive results. As computer science and biology continue to
merge, subdisciplines such as synthetic biology, systems biology, and
whole-cell modeling are emerging rapidly.
Vivek Chavda, Krishnan Anand and Vasso Apostolopoulos (eds.) Bioinformatics Tools for Pharmaceutical
Drug Product Development, (3–18) © 2023 Scrivener Publishing LLC
1.1 Introduction
The healthcare industry is re-examining the drug research process in the
context of Artificial Intelligence (AI), Machine Learning (ML), and Big Data,
evaluating how these emerging technologies can enhance efficacy [1]. AI and ML
are seen as the way of the future in a variety of fields and sectors, including
pharmaceuticals. In a world where a single approved medicine costs millions of
dollars and requires years of rigorous testing before being licensed, saving
money and time is a priority.
Producing new pharmacological compounds to combat any disease is an expensive
and time-consuming procedure, yet this cost and delay often go unchecked. A
central task in drug design is to exploit the data already collected in the
search for new and distinctive leads. Once the drug target has been identified,
multiple disciplines collaborate to develop improved pharmaceuticals using AI
and ML technologies [2]. These technologies are used at every phase of the
computer-aided drug discovery process, and their combination has a proven track
record in finding hit molecules. Furthermore, combining AI and ML with
high-dimensional data has enhanced the capabilities of computer-aided drug
discovery and design [3]. Predicting clinical trial outcomes with
AI/ML-integrated models might decrease the cost of clinical trials while
increasing their success rates. This chapter covers the potential of AI and ML
technologies in enabling computer-aided drug discovery, along with the
challenges and opportunities for the pharmaceutical sector.
1.2 Bioinformatics
Bioinformatics is the analysis of biological data, including genetic
information, using computer technology to obtain mathematically and
statistically validated results. Computational methods are used to address
data-intensive, large-scale biological challenges [4]. The field includes the
development and application of databases, algorithms, computational and
statistical tools, and theory to tackle formal and practical issues arising
from the management and analysis of biological data [5, 6].
Table 1.1 Various bioinformatics and AI-driven tools applied in the pharmacy department and industry.
Computational tool | Application | Reference
BLAST | The Basic Local Alignment Search Tool (BLAST) finds regions of local similarity between sequences. It compares sequences against the available databases and calculates the statistical significance of matches. The matches can be used to infer functional and evolutionary relationships between sequences and to identify members of genetically related families. | [15]
ChEMBL | ChEMBL is a manually curated database of bioactive molecules. It correlates genomic data with chemical structure and bioactivity for the development of new drugs. | [16]
geWorkbench | The genomics Workbench (geWorkbench) comprises tools for the management, analysis, visualization, and annotation of biomedical data. It supports data for microarray gene expression, DNA and protein sequences, pathways, molecular structure prediction, sequence patterns, Gene Ontology, and regulatory networks. | [11]
GROMACS | Software for high-performance molecular dynamics and output analysis, especially for proteins and lipids. | [7]
IGV | Integrative Genomics Viewer (IGV) is a high-performance, user-friendly, interactive tool for the visualization and exploration of genomic data. | [17]
MODELLER | A tool for homology (comparative) modeling of protein three-dimensional structures. | [18]
SwissDrugDesign | SwissDrugDesign provides a collection of tools required for Computer-Aided Drug Design (CADD). | [19]
UCSF Chimera | UCSF Chimera is an interactive tool for the visualization and analysis of molecular structures. | [20]
AlphaFold | An AI system developed to computationally predict protein structures with unprecedented accuracy and speed. | [21]
Cyclica | An ML tool that addresses challenges faced across the drug discovery life cycle by correlating biophysics, medicinal chemistry, and systems biology. | [21]
DeepChem | A deep learning framework for drug discovery. | [18]
DeltaVina | Gives docking scores for protein-ligand binding affinity. | [17]
Exscientia | An AI platform that engineers precision medicines more rapidly and efficiently, accelerating pre-clinical drug discovery phases through monitoring and analysis of drug design and experiments. | [13]
Hit Dexter | Estimates how likely a small molecule is to trigger a positive response in biochemical and biological assays. | [4]
ORGANIC | Objective-Reinforced Generative Adversarial Network for Inverse-design Chemistry (ORGANIC) is a tool for creating molecules with desired properties. | [8]
Somatix | A real-time gesture detection technology that enables passive data collection from indoor and outdoor patients to improve medication adherence rates, data reliability, and cost-effectiveness. | [22]
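The BLAST entry in Table 1.1 refers to regions of local similarity between sequences. As a minimal sketch of what a local alignment score is, the following implements the Smith-Waterman dynamic program, the exact method that BLAST approximates heuristically; it is not BLAST itself and does not compute statistical significance. The function name and the scoring values (match +2, mismatch -1, gap -2) are illustrative assumptions, not taken from the chapter.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b.

    Scoring values are illustrative assumptions, not BLAST defaults.
    """
    # prev is the previous row of the dynamic-programming table.
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Extend the diagonal with a match or mismatch.
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Clamping at 0 lets an alignment restart anywhere (local alignment).
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    # The shared fragment "ACG" drives the score for these two toy sequences.
    print(smith_waterman_score("TTACGTTT", "GGACGGG"))
```

Only two rows of the table are kept in memory because the sketch returns the score alone; recovering the aligned region would require storing the full table and tracing back from the best cell.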
1.3.1 Applications of ML
a) Research and development of new drugs:
ML supports a feedback-driven drug development process by interpreting
existing results from sources such as computational modeling data, literature
surveys, and high-throughput screening. This helps identify lead compounds
efficiently, with less randomness, fewer errors, and less lost time. In
approaches such as de novo design, the input is a compound library obtained
through in silico methods and virtual screening applications that model
bioactivity and toxicity [53]. Drug discovery proceeds through a series of
steps, starting with the identification of novel bioactive compounds through
docking studies and molecular dynamics. A hit compound can be found by
screening chemical libraries, by computer simulation, or by screening materials
isolated from natural sources such as plants, bacteria, and fungi. The hits are
then screened in cell-based assays and in animal models of disease to assess
efficacy and safety. Once the activity of the lead molecule is confirmed,
chemical modifications can be made in search of a novel compound with maximal
therapeutic benefit and minimal harm [54, 55]. Hence, combining algorithms and
datasets with chemical structures and targets is used to optimize new leads,
and is preferable to laborious target-specific lead identification in the R&D
sector of a pharma company.
[Figure: labels recovered from the original graphic: Relates genetic information to human functions; High-throughput screening (HTS); Pharmacogenomics; Drug activity; ADMET studies; Bioactivity; Formulation development; Scale-up.]
pathways and dosage regimens [52]. In this sector, machine learning is used as
a means of avoiding data clustering in scoring models: time-series data
correlated with treatment pathways are created for faster processing and
improved efficiency, using tools such as the kdb+ database and TensorFlow.
1.3.2 Limitations of ML
The major issue when using ML is obtaining accurate diagnostic values. In ML, a
predefined set of algorithms is configured for a particular disease, but the
result is generally not quantitative, because a human disease state arises from
innumerable complex internal pathways. Diagnosis and dose calculation therefore
still rely on experience and expertise rather than on purely quantitative
assessment when identifying a disease and designing a formulation for it.
Nevertheless, ML can be used to build large datasets that provide reasonably
well-quantified data for further use [52].
References
1. Quazi, S., Role of Artificial Intelligence and Machine Learning in Bioinformatics:
Drug Discovery and Drug Repurposing, 2021. https://www.preprints.org/
manuscript/202105.0346/v1
2. Selvaraj, C., Chandra, I., Singh, S.K., Artificial intelligence and machine
learning approaches for drug design: Challenges and opportunities for the
pharmaceutical industries. Mol. Divers., 3, 1893–1913, 2021.
3. Henstock, P.V., Artificial intelligence for pharma: Time for internal
investment. Trends Pharmacol. Sci., 40, 8, 543–546, 2019.
4. Chen, C., Hou, J., Tanner, J.J., Cheng, J., Bioinformatics methods for mass
spectrometry-based proteomics data analysis. Int. J. Mol. Sci., 21, 8, 2873,
2020.
5. Can, T., Introduction to bioinformatics. Methods Mol. Biol., 1107, 51–71,
2014.
6. Najarian, K., Deriche, R., Kon, M.A., Hirata, N.S.T., Bioinformatics and
biomedical informatics. Sci. World J., 2013, 591976, 2013.
7. Chakraborty, C., Doss, C.G.P., Zhu, H., Agoramoorthy, G., Rising strengths
Hong Kong SAR in bioinformatics. Interdiscip. Sci., 9, 2, 224–236, 2017.
8. Pallen, M.J., Microbial bioinformatics 2020. Microb. Biotechnol., 9, 5, 681–
686, 2016.
9. Wooller, S.K., Benstead-Hume, G., Chen, X., Ali, Y., Pearl, F.M.G.,
Bioinformatics in translational drug discovery. Biosci. Rep., 37, 4,
BSR20160180, 2017.
10. Pop, M. and Salzberg, S.L., Bioinformatics challenges of new sequencing
technology. Trends Genet., 24, 3, 142–149, 2008.
11. Tillett, R.L., Sevinsky, J.R., Hartley, P.D. et al., Genomic evidence for
reinfection with SARS-CoV-2: A case study. Lancet Infect. Dis., 21, 1, 52–58, 2021.
12. Rothberg, J., Merriman, B., Higgs, G., Bioinformatics. Introduction. Yale J.
Biol. Med., 85, 3, 305–308, 2012.
13. Mbah, C.J. and Okorie, N.H., Pharmaceutical bioinformatics: Its relevance
to drug metabolism. Madridge J. Bioinform. Syst. Biol., 1, 1, 19–26, 2019.
[Internet] Available from: https://madridge.org/journal-of-bioinformatics-
and-systems-biology/mjbsb-1000104.php.
14. Bayat, A., Science, medicine, and the future: Bioinformatics. BMJ, 324, 7344,
1018–1022, 2002.
15. Altschul, S.F., Gish, W., Miller, W., Myers, E.W., Lipman, D.J., Basic local
alignment search tool. J. Mol. Biol., 215, 3, 403–410, 1990.
16. Davies, M., Nowotka, M., Papadatos, G. et al., ChEMBL web services:
Streamlining access to drug discovery data and utilities. Nucleic Acids Res.,
43, W1, W612–W620, 2015. [Internet] Available from:
https://www.ebi.ac.uk/chembl/.
17. Thorvaldsdottir, H., Robinson, J.T., Mesirov, J.P., Integrative genomics viewer
(IGV): High-performance genomics data visualization and exploration.
Brief. Bioinform., 14, 2, 178–192, 2013. [Internet] Available from: https://
academic.oup.com/bib/article-lookup/doi/10.1093/bib/bbs017.
18. Nascimento, A.C.A., Prudêncio, R.B.C., Costa, I.G., A multiple kernel
learning algorithm for drug-target interaction prediction. BMC Bioinf.,
17, 1, 46, 2016. [Internet] Available from: http://www.biomedcentral.
com/1471-2105/17/46.
19. Daina, A. and Zoete, V., Application of the SwissDrugDesign online resources
in virtual screening. Int. J. Mol. Sci., 20, 18, 4612, 2019.
20. UCSF Chimera. [Internet] Available from: https://www.rbvi.ucsf.edu/
chimera/.
21. Jumper, J., Evans, R., Pritzel, A. et al., Highly accurate protein structure
prediction with AlphaFold. Nature, 596, 7873, 583–589, 2021. [Internet]
Available from: https://www.nature.com/articles/s41586-021-03819-2.
22. Snijder, E.J., Decroly, E., Ziebuhr, J., The nonstructural proteins directing
coronavirus RNA synthesis and processing. Adv. Virus Res., 96, 59–126,
2016.
23. Mamoshina, P., Vieira, A., Putin, E., Zhavoronkov, A., Applications of
deep learning in biomedicine. Mol. Pharm., 13, 5, 1445–1454, 2016.
[Internet] Available from:
https://pubs.acs.org/doi/10.1021/acs.molpharmaceut.5b00982.
24. Schneider, P., Walters, W.P., Plowright, A.T., Sieroka, N., Listgarten, J.,
Goodnow, R.A. Jr, Fisher, J., Jansen, J.M., Duca, J.S., Rush, T.S., Zentgraf, M.,
Hill, J.E., Krutoholow, E., Kohler, M., Blaney, J., Funatsu, K., Luebkemann,
C., Schneider, G., Rethinking drug design in the artificial intelligence era.
Nat. Rev. Drug Discov., 19, 5, 353–364, 2020.
25. Agatonovic-Kustrin, S. and Beresford, R., Basic concepts of artificial neural
network (ANN) modeling and its application in pharmaceutical research.
J. Pharm. Biomed. Anal., 22, 5, 717–727, 2000.
26. Sakiyama, Y., The use of machine learning and nonlinear statistical tools for
ADME prediction. Expert Opin. Drug Metab. Toxicol., 5, 2, 149–169, 2009.
27. Gobburu, J.V.S. and Chen, E.P., Artificial neural networks as a novel approach
to integrated pharmacokinetic—Pharmacodynamic analysis. J. Pharm. Sci.,
85, 5, 505–510, 1996.
28. Merk, D., Friedrich, L., Grisoni, F., Schneider, G., De novo design of
bioactive small molecules by artificial intelligence. Mol. Inform., 37, 1–2,
1700153, 2018. [Internet] Available from:
https://onlinelibrary.wiley.com/doi/10.1002/minf.201700153.
29. Klopman, G., Chakravarti, S.K., Zhu, H., Ivanov, J.M., Saiakhov, R.D., ESP:
A method to predict toxicity and pharmacological properties of chemicals
using multiple MCASE databases. J. Chem. Inf. Comput. Sci., 44, 2, 704–715,
2004. [Internet] Available from: https://pubs.acs.org/doi/10.1021/ci030298n.
44. Gunčar, G., Kukar, M., Notar, M. et al., An application of machine learning to
haematological diagnosis. Sci. Rep., 8, 1, 411, 2018. [Internet] Available from:
http://www.nature.com/articles/s41598-017-18564-8.
45. Koohy, H., The rise and fall of machine learning methods in biomedical
research. F1000Res., 6, 2012, 2018. [Internet] Available from:
https://f1000research.com/articles/6-2012/v2.
46. Young, J.D., Cai, C., Lu, X., Unsupervised deep learning reveals
prognostically relevant subtypes of glioblastoma. BMC Bioinf., 18, S11, 381,
2017. [Internet] Available from:
http://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-017-1798-2.
47. Chen, H., Engkvist, O., Wang, Y., Olivecrona, M., Blaschke, T., The rise of
deep learning in drug discovery. Drug Discovery Today, 23, 6, 1241–1250,
2018. [Internet] Available from: https://linkinghub.elsevier.com/retrieve/pii/
S1359644617303598.
48. Beneke, F. and Mackenrodt, M.-O., Artificial intelligence and collusion.
IIC-Int. Rev. Intellect. Prop. Compet. Law, 50, 1, 109–134, 2019. [Internet]
Available from: http://link.springer.com/10.1007/s40319-018-00773-x.
49. Tkachenko, N., Machine learning in healthcare: 12 real-world use cases to
know, in IEEE Access, 2021. [Internet] Available from: https://nix-united.
com/blog/machine-learning-in-healthcare-12-real-world-use-cases-to-
know/; Bharadwaj, H.K. et al., A review on the role of machine learning in
enabling IoT based healthcare applications. IEEE Access, 9, 38859–38890,
2021.
50. Vamathevan, J., Clark, D., Czodrowski, P. et al., Applications of machine
learning in drug discovery and development. Nat. Rev. Drug Discovery, 18, 6,
463–477, 2019.
51. Mamoshina, P., Volosnikova, M., Ozerov, I.V. et al., Machine learning on
human muscle transcriptomic data for biomarker discovery and tissue-
specific drug target identification. Front. Genet., 242, 9, 2018.
52. Dasgupta, N., Real-world use cases for AI & ML in pharma. Access date
03/05/2022; https://www.rxdatascience.com/blog/top-use-cases-for-machine-
learning-in-pharma
53. Yuan, Y., Pei, J., Lai, L., Ligbuilder 2: A practical de novo drug design
approach. J. Chem. Inf. Model., 51, 5, 1083–1091, 2011. [Internet] Available
from: https://pubs.acs.org/doi/10.1021/ci100350u.
54. Zhu, T., Cao, S., Su, P.-C. et al., Hit identification and optimization in virtual
screening: Practical recommendations based on a critical literature analysis.
J. Med. Chem., 56, 17, 6560–6572, 2013. [Internet] Available from: https://
pubs.acs.org/doi/10.1021/jm301916b.
55. Anderson, A.C., Structure-based functional design of drugs: From target
to lead compound. Methods Mol. Biol., 2012, 359–366, 2012. [Internet]
Available from: http://link.springer.com/10.1007/978-1-60327-216-2_23.
2
Artificial Intelligence and Machine
Learning-Based New Drug Discovery
Process with Molecular Modelling
Isha Rani1, Kavita Munjal2, Rajeev K. Singla3,4 and Rupesh K. Gautam5*
2 MM College of Pharmacy, MM (Deemed to be) University-Mullana, Ambala, Haryana, India
3 Institutes for Systems Genetics, Frontiers Science Center for Disease-Related Molecular Network, West China Hospital, Sichuan University, Chengdu, China
4 iGlobal Research and Publishing Foundation, New Delhi, India
5 Department of Pharmacology, Indore Institute of Pharmacy, IIST Campus, Rau, Indore (M.P.), India
Abstract
Drug development is a time-consuming, expensive, and extremely risky procedure.
Up to 90% of drug concepts are discarded because of problems with safety,
efficacy, or toxicity, resulting in significant losses for the investor. One
approach that has emerged in recent years is the use of artificial intelligence
(AI), namely machine learning and deep learning algorithms, to improve the drug
discovery process. AI has been used effectively in drug discovery and design.
This chapter covers these machine learning approaches in depth, as well as
their applications in medicinal chemistry. After introducing the basic
principles of the various machine learning algorithms, along with some
application notes, it discusses the current state of the art of AI-supported
pharmaceutical discovery, including applications in structure- and ligand-based
virtual screening, de novo drug design, drug repurposing, and related factors.
Finally, obstacles and limitations are outlined, with an eye towards possible
future avenues for AI-supported drug discovery and design.
Vivek Chavda, Krishnan Anand and Vasso Apostolopoulos (eds.) Bioinformatics Tools for Pharmaceutical
Drug Product Development, (19–36) © 2023 Scrivener Publishing LLC
Abbreviations
AI Artificial intelligence
RNA Ribonucleic acid
R & D Research and development
ML Machine Learning
SBVS Structure-based virtual screening
VS Virtual screening
CADD Computer aided drug design
PDB Protein data bank
SAS Synthetic accessibility score
2.1 Introduction
The development of pharmaceutical drugs is a time-consuming and costly
process. Pharmaceutical and biotechnology companies often spend over
$1 billion to bring a drug to market, and the process can take anywhere from
10 to 20 years. It is also extremely risky: up to 90% of new drug concepts are
discarded because of problems such as safety and efficacy, resulting in
significant losses for the investor [1, 2]. Traditional drug discovery methods
are target-driven: a known target is used to screen for small molecules that
either interact with it or affect its function in cells. These approaches work
well for easily druggable targets with well-defined structures and
well-understood cellular interactions. However, because cellular interactions
are complex and knowledge of their intricacies is limited, these methods are
severely limited [1, 3].
The term “artificial intelligence” (AI) refers to intelligence displayed by
computers. It is used when a computer exhibits cognitive behavior similar to
that of humans, such as learning or problem solving [4]. AI makes use of
systems and software that can read and learn from data in order to make
independent decisions toward specific goals. Machine learning, for example, is
a well-established technology for learning and predicting novel features [5].
By finding novel relationships and inferring the functional importance of
distinct components of a biological pathway, AI can overcome the obstacles
faced by traditional methods. AI employs complex algorithms and machine
learning to extract useful information from enormous datasets. For example, an
RNA sequencing dataset can be used to discover genes whose expression
correlates with a specific biological condition. AI can also be used to
discover compounds that could bind to ‘undruggable targets,’ or proteins with
unknown structures. A predicted collection of compounds may be easily
identified in a relatively short length of time
[Figure: Building of an AI planner. Recoverable stages: Problem recognition; AI architecture design; Analysis of data; Model assessment and evaluation.]
data, such as binary values for binary classification tasks, numeric values
for cluster analysis, and real numbers for regression problems, which
typically represent actual biological outcomes [16]. Table 2.1 presents
various data patterns used as input and output for the development of AI
algorithms in drug discovery.
[Figure: flowchart fragment. Decision: “Is performance on training set better?” with branches Yes and No; one branch leads to “Need to improve learning algorithm”.]
screening (VS) can help with hit discovery (finding active drug candidates)
and lead optimization (transforming biologically active compounds into
suitable pharmaceuticals by improving their physicochemical properties).
Finally, these improved leads go through preclinical and clinical testing
before being approved by regulatory agencies [28]. In VS, large databases of
known 3D structures are analyzed automatically using computational approaches.
Initially, thousands of possible active ligands are screened against a target.
The probable ligands are selected using VS with AutoDock Vina. DOCK 6 is then
used to carry out molecular dynamics simulations. In this way, active
compounds with promising results are selected for further experimental
testing, which leads to the most promising molecules being synthesized. VS
techniques can also indicate whether a compound is likely to be toxic,
providing an ADMET (absorption, distribution, metabolism, excretion, and
toxicity) analysis of the tested molecule. At the binding site, search
algorithms are used to systematically evaluate ligand orientations and
conformations. In rigid docking, the search algorithms use translational and
rotational degrees of freedom to explore alternative positions of ligands at
the active binding site, whereas in flexible docking, conformational degrees
of freedom are added to the translations and rotations of the ligands. Search
algorithms use a variety of strategies to predict the correct conformation of
ligands, including checking the chemistry and geometry of the atoms involved
(DOCK 6, FlexX, and genetic algorithms) [29–31]. Further, SBVS uses scoring
functions to evaluate the strength of non-covalent interactions between a
ligand and a molecular target, and tries to predict the optimal interaction
mode with which the two molecules form a stable complex [32].
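The rigid-docking search described above, in which translations and rotations of the ligand are sampled and each pose is ranked by a scoring function, can be illustrated with a deliberately simplified sketch. Everything here is an assumption made for illustration: the model is two-dimensional, atoms are bare points, the scoring function simply rewards contacts and penalizes clashes, and the function names and distance thresholds are hypothetical. It does not reproduce the algorithms or scoring functions of DOCK 6, FlexX, or AutoDock Vina.

```python
import math
from itertools import product


def transform(points, angle, dx, dy):
    """Rotate 2D points by `angle` (radians) about the origin, then translate."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y + dx, s * x + c * y + dy) for x, y in points]


def score(ligand, receptor, clash=0.8, contact=1.5):
    """Toy scoring function (lower is better); thresholds are assumptions.

    Every ligand-receptor pair closer than `clash` is penalized, and every
    pair between `clash` and `contact` is rewarded.
    """
    total = 0.0
    for (lx, ly), (rx, ry) in product(ligand, receptor):
        d = math.hypot(lx - rx, ly - ry)
        if d < clash:
            total += 10.0
        elif d <= contact:
            total -= 1.0
    return total


def rigid_dock(ligand, receptor, shifts, n_angles=12):
    """Exhaustively sample rotations and translations of a rigid ligand.

    Returns (best_score, (angle, dx, dy)). The ligand's internal geometry
    never changes, which is what makes the search "rigid".
    """
    best_score, best_pose = float("inf"), None
    for k, dx, dy in product(range(n_angles), shifts, shifts):
        angle = 2 * math.pi * k / n_angles
        s = score(transform(ligand, angle, dx, dy), receptor)
        if s < best_score:
            best_score, best_pose = s, (angle, dx, dy)
    return best_score, best_pose


if __name__ == "__main__":
    receptor = [(0.0, 0.0), (2.0, 0.0)]
    ligand = [(0.0, 0.0)]
    print(rigid_dock(ligand, receptor, shifts=[-1.0, 0.0, 1.0, 2.0, 3.0]))
```

A flexible-docking variant would add the ligand's conformational degrees of freedom (for example, torsion angles) to the sampled variables, at the price of a much larger search space, which is why practical programs use heuristic search in place of exhaustive enumeration.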
Executing VS requires a 3D structure of the target protein and of the ligands
to be docked [33, 34]. Several databases have been built to store 3D molecular
structures, such as the Protein Data Bank (PDB), PubChem, ChemSpider, Brazilian
Malaria Molecular Targets (BraMMT), and DrugBank [35].
synthetically difficult to access. The field has seen some revival recently
due to developments in artificial intelligence. An interesting approach is the
variational autoencoder, which consists of two neural networks, an encoder
network and a decoder network [38]. The encoder network converts the chemical
structure, given as a SMILES string, into a real-valued continuous vector. The
decoder converts vectors from this latent space back into chemical structures.
An in silico model used this representation to search for optimal solutions in
latent space, and the decoder network translated the resulting vectors back
into actual molecules. For the majority of back-translations one molecule
predominates, while minor structural variants occur with lower probability.
Using the latent representation, the researchers trained a model on the QED
drug-likeness score [39] and the synthetic accessibility score (SAS) [40]. In
this way it might be possible to create a path of compounds with specific
targeted properties. In the adversarial approach, a generative model produces
novel chemical structures. A second, discriminative adversarial model is
trained to distinguish real molecules from generated ones, while the
generative model tries to mislead it. In generation mode, the adversarial
model produced considerably more valid structures than the variational
autoencoder. Novel structures predicted to be active against the dopamine
receptor type 2 might be produced using an in silico approach.
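The encoder described above takes a SMILES string as its input. One common way to present a string to a neural network is a one-hot matrix with one row per character, sketched below. This is an assumption made for illustration; the chapter does not state which input encoding was used. Character-level tokenization is also a simplification, since two-letter atoms such as Cl or Br are split into two tokens. The function names and the padding symbol are hypothetical.

```python
def build_vocab(smiles_list, pad="_"):
    """Map each character in the SMILES strings to an index; `pad` gets index 0."""
    chars = sorted({ch for s in smiles_list for ch in s})
    return {ch: i for i, ch in enumerate([pad] + chars)}


def one_hot(smiles, vocab, max_len, pad="_"):
    """Return a max_len x len(vocab) matrix of 0/1 rows, right-padded with `pad`."""
    if len(smiles) > max_len:
        raise ValueError("SMILES string is longer than max_len")
    padded = smiles + pad * (max_len - len(smiles))
    matrix = []
    for ch in padded:
        row = [0] * len(vocab)
        row[vocab[ch]] = 1  # exactly one position is set per character
        matrix.append(row)
    return matrix


def decode(matrix, vocab, pad="_"):
    """Invert one_hot: read the set position in each row and strip the padding."""
    inverse = {i: ch for ch, i in vocab.items()}
    return "".join(inverse[row.index(1)] for row in matrix).rstrip(pad)


if __name__ == "__main__":
    vocab = build_vocab(["CCO", "c1ccccc1"])  # ethanol and benzene
    encoded = one_hot("CCO", vocab, max_len=8)
    print(len(encoded), len(encoded[0]))
    print(decode(encoded, vocab))
```

In an autoencoder setting, a matrix like this would be the encoder's input and the decoder's reconstruction target; the continuous latent vector sits between the two and is what an optimizer searches over.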
Kadurin et al. used a generative adversarial network (GAN) to suggest
compounds with putative anticancer properties [41]. Several other
architectures have since been developed that can produce valid and useful new
structures; one of them is the recurrent neural network (RNN) [42]. These
approaches can be used to explore novel chemical space, with the property
distributions of the generated molecules being similar to those of the
training space. The first prospective application of the methodology was
successful, with four out of five molecules exhibiting the predicted activity.
However, more experience is needed regarding the size of the chemical space
sampled and the chemical feasibility of the proposed compounds.