
International Journal of Quantitative Structure-Property Relationships

Volume 3 • Issue 1 • January-June 2018

Applications of Chemoinformatics in
Predictive Toxicology for Regulatory
Purposes, Especially in the Context
of the EU REACH Legislation
Rafael Gozalbes, ProtoQSAR SL, València, Spain
Jesús Vicente de Julián-Ortiz, Department of Physical Chemistry, Faculty of Chemistry, University of Valencia

ABSTRACT

Chemoinformatics methodologies such as QSAR/QSPR have been used for decades in drug discovery
projects, especially for finding new compounds with therapeutic properties and for optimizing
ADME properties in chemical series. The application of computational techniques in predictive
toxicology is much more recent, and these techniques are attracting increasing interest because of the new
legal requirements imposed by national and international regulations. In the pharmaceutical field,
the US Food and Drug Administration (FDA) supports the use of predictive models for regulatory
decision-making when assessing the genotoxic and carcinogenic potential of drug impurities. In
Europe, the REACH legislation promotes the use of QSAR to reduce the huge amount of
animal testing needed to demonstrate the safety of new chemical entities subject to registration,
provided the models meet specific conditions that ensure their quality and predictive power. In this review,
the authors summarize the state of the art of in silico methods for regulatory purposes, with special
emphasis on QSAR models.

Keywords
Computational Toxicology, Docking, QSAR, REACH, Read-Across, Virtual Screening

INTRODUCTION

The REACH Regulation


The European regulation 1907/2006 for Registration, Evaluation, Authorisation and Restriction of
Chemicals (abbreviated as “REACH”) entered into force in 2007 (European Commission, 2006). The
main objectives of REACH are the protection of human health and the environment from the risks
that can be posed by chemicals, and the enhancement of the competitiveness of the EU chemicals
industry. To achieve these objectives, REACH regulates both the production and use of chemicals
when they are produced or imported into Europe in an amount greater than one tonne per year.
According to this regulation, manufacturers and importers of chemicals in the European Union (EU)
are required to register these substances and communicate the information necessary to ensure their
safe use (as such, in mixtures or as part of the composition of items), by submitting a registration
dossier to the European Chemicals Agency (ECHA, www.echa.europa.eu/). The degree of information
required depends on the level of concern about the substance; it is more extensive, for example, for

DOI: 10.4018/IJQSPR.2018010101

Copyright © 2018, IGI Global. Copying or distributing in print or electronic forms without written permission of IGI Global is prohibited.



carcinogenic, mutagenic and/or toxic for reproduction (CMR) substances, or for chemicals that are
highly toxic to aquatic organisms. Goods producers, distributors and downstream users should also be
rigorous in identifying these substances, and should update and communicate information about them upon registration.
REACH has brought a revolution to the regulation of chemicals, since for the first
time industry must assume responsibility for the potential risks of the products it generates and their
impact on both human health and the ecosystem. Remaining on the market requires the adoption
of new obligations under REACH, since manufactured or imported substances that have not been
previously submitted for registration to the ECHA cannot be commercialized, nor can they be used
for purposes other than those recorded. REACH is the paradigm of an international shift towards
responsible use of chemicals, and indeed other non-EU countries are studying or adopting similar
legislation (Van Heerden, 2012).
Despite its importance and positive impact, the REACH regulation has also raised strong criticism,
especially from the industries that must comply with it. The most controversial
issues are:

1. The registration process is very expensive, due to the high degree of experimental and
administrative work required. These costs directly affect the competitiveness of companies that
have to apply the regulation, a particularly sensitive issue at present because of the global
economic situation. Small and medium-sized enterprises are particularly exposed to this risk, to the
extent that in some cases companies may be unable to continue producing some
of their products.
2. At the social level, REACH raises the ethical problem represented by the huge amount of animal
testing necessary to meet its information requirements (Rovida & Hartung, 2009). It is
estimated that around twelve million vertebrate animals are used annually in the EU in experiments
performed for different purposes (scientific, toxicological, and regulatory) (Scholz et al., 2013;
Taylor & Rego, 2016) (Figure 1). Estimates of the large increase in the number of such experiments
due to the REACH implementation have alerted and mobilized many representative
animal welfare organizations and broad social sectors.

Thus, the need for alternative methods to reduce or replace animal testing is stronger than ever. In
fact, the use of such methods instead of animal testing is clearly stimulated by the REACH regulation
itself, which states that “every effort must be made so that testing chemicals on animals is
a last resort – when there is no other scientifically reliable way of showing the impact on humans or
the environment”. This institutional encouragement of alternative methods is not new: in 1991 the European
Commission launched the European Centre for the Validation of Alternative Methods
(ECVAM), with the objective of validating techniques able to reduce, refine or replace animal testing
of chemicals, biological products or vaccines. ECVAM was also in charge of promoting
the development and dissemination of alternative methods, their application at the industrial level and
their acceptance by the regulatory authorities. This centre is still active, since 2011 under the name
“European Union Reference Laboratory for Alternatives to Animal Testing” (EURL-ECVAM, https://
eurl-ecvam.jrc.ec.europa.eu/).

Computational Toxicology
Currently, the toxic potential of a large number of industrial chemicals (including pharmaceuticals,
cosmetics, pesticides and other synthetic or semi-synthetic chemicals) is determined by using
standardized animal models. These studies are required for approval of any chemical to be registered
as a product that can be released to the market.
There are several alternative methods for replacing animal testing, such as in vitro techniques (which
use portions of tissues, perfused organs, or cellular/subcellular cultures) or the use of lower organisms such
as bacteria, algae or fungi (Vinardell, 2007). Amongst such alternatives there are also the so-called in
silico methods, which allow the simulation of the mechanisms of action of chemicals and the prediction
of human/environmental toxicity values by means of computer models (Figure 2). Computational
toxicology has taken advantage of three significant technological advances: the increasing availability
of chemical and biological information (for example, from microarrays or from high-throughput
screening -HTS- in vitro experiments), the increased computing power available to analyze these data,
and the development of novel biostatistical methods. The ancestors of computational toxicology are
chemoinformatics and molecular modelling, disciplines at the interface between chemistry,
biology and computer science that have been used for years in drug discovery and drug
design (Nicolotti et al., 2014).

Figure 1. Animals used in experiments in the European Union in 2014 (data not available for Portugal or Sweden)

In vivo models are acceptable when other alternatives are not possible or reliable, and they are
completely banned in the EU in specific cases such as cosmetics.

In vivo experiments require much time for preparation and execution, and are expensive and
ethically questionable. Animal testing is based on the assumption that adverse effects (AEs) observed
in one animal species reproduce those that will also occur in humans. By contrast, computer
models can predict the physico-chemical or biological properties of compounds
without their chemical synthesis necessarily being carried out in the laboratory. Therefore, the use of in
silico approaches represents important savings in time, resources and money (Modi, Hughes, Garrow
& White, 2012), and their applicability to new chemical structures is easy and almost immediate. Despite
these advantages, computational methods are still underused at the regulatory level by industry, for
several reasons (Clippinger, Hill, Curren & Bishop, 2016) (Figure 2).


Figure 2. Advantages and disadvantages of alternative predictive toxicology approaches and classical animal assays (*AEs =
Adverse Effects; ** 1) In silico models are completely acceptable if they follow the OECD principles; 2) A limited list of in vitro
assays has been developed and accepted by regulatory authorities)

In the pharmaceutical industry, virtual screening (VS) of large collections of structures is very
common, and is typically a high-throughput, low-cost process. VS provides a
rapid indication of the effectiveness and potential risks of compounds, thus facilitating prioritization:
as biological assays are not required, these studies can be performed in the early stages of discovery,
even for non-synthesized compounds, to select products with better properties and lower toxicity
(Modi, Hughes, Garrow & White, 2012; Merlot, 2010). In addition, computational tools can in some
cases also provide a mechanistic understanding of biological processes, for example to
explain why a compound is expected to show a certain type of toxicity.
There are various computational toxicology techniques, and the choice among them depends
on the complexity of the type of toxicity studied and on the baseline information available.
The most powerful and effective tool is the development and application of predictive mathematical
QSAR models, and they are therefore examined here in more detail. Other notable predictive approaches
are molecular docking, used to analyze the interactions of chemical compounds with
biological receptors; the systematic study of structure-activity relationships based on knowledge
and previous experience (“knowledge-based SAR systems”); the extrapolation of properties between
structurally similar compounds (read-across); and physiologically based pharmacokinetic (PBPK) modelling.

QSAR/QSTR MODELLING

One of the first applications of computational chemistry was the development of Quantitative Structure-
Activity Relationships (QSARs) for the prediction of biological activity based on chemical structure.
This technique involves the construction of a mathematical model relating the chemical structure
of a series of molecules with a physicochemical property or biological activity. The development
of QSARs requires the previous characterization of the molecules by a set of numerical descriptors,
and the application of statistical tools providing regression or classification models. Once a QSAR
has been developed and validated, it can be used to predict the property/activity of new molecules
whose chemical structure is known. Due to the high cost and time required for research and drug
development, QSAR models have been used for decades in “drug discovery” to save time, resources
and money. QSARs are part of the standard protocols for drug discovery and subsequent optimization
of leads, regarding both the improvement of therapeutic effects of drug candidates (Modi, Hughes,
Garrow & White, 2012) and the reduction of toxicological and adverse effects (Merlot, 2010; Valerio,
2013). Many QSAR models have been developed primarily to provide an accurate prediction of
pharmacokinetic properties (ADME) in early stages of the process (van de Waterbeemd & Gifford,
2003; Dickins & Modi, 2002; Duart et al., 2002; Gozalbes & Pineda-Lucena, 2010; Gozalbes,
Jacewicz, Annand, Tsaioun & Pineda-Lucena, 2011). When the object of the study is to model a
toxicity-related parameter, it is quite common to use the term QSTR (for “Quantitative Structure-
Toxicity Relationships”) instead of QSAR.


The use of QSTRs has two main advantages over other methodologies: 1) once a model has
been developed, the toxicity of a compound can be predicted from knowledge of its
chemical structure alone, and 2) models can be easily automated, thus providing an extremely rapid means
of evaluating a large number of chemical structures.
There are many potential uses of QSTRs at the industrial level (Modi, Hughes, Garrow
& White, 2012; Merlot, 2010; Valerio, 2013). For example, in the pharmaceutical industry, QSTRs
increase the likelihood of identifying the toxicity of drug candidates at clinical doses early.
This identification of risky candidates can be done long before huge amounts of time and
resources are invested, thereby reducing the dropout rate of drug development (Merlot, 2008; Boyer, 2009). In
addition, toxicity prediction can be applied to environmental risk assessments for common
pollutants. However, the use of QSTRs may be affected by certain limitations, among them the lack of
adequate toxicity data in certain cases, and overly simplistic models that are inadequate for
predicting toxicity values for parameters that depend on complex or multiple mechanisms of action.
Moreover, unlike animal or in vitro tests, QSTRs should be regularly reviewed and refined as new
data become available, so that models validated at a given time must be re-validated periodically
in order not to become obsolete.

QSTR Models in Scientific Literature


A large number of QSTR models are available to estimate different toxicological parameters of
chemicals, either related to the effects on human health or the environmental impact. Overall, published
models can be classified according to the toxicity parameters to be evaluated:

1. Systemic Toxicity in Humans: Mainly predictions of carcinogenicity, mutagenicity,
reproductive toxicity and acute toxicity. Furthermore, there is considerable interest in predicting
pharmacokinetic parameters that may influence the bioavailability of the compounds, particularly
drugs.
2. Local Toxicity in Humans: Prediction of toxicity on the skin, respiratory sensitization, skin
and eye irritation, photo-toxicity, etc.
3. Environmental Distribution: It refers to the dispersion and ultimate fate of chemicals with toxic
effects on the environment, which can be modeled in terms of persistence (i.e., biodegradation,
hydrolysis, etc.), distribution and bioaccumulation.
4. Ecotoxicity: Predictions of toxic effects on plants, aquatic and terrestrial organisms (invertebrates
and vertebrates) and birds.

Early QSTRs were based on the premise that toxicity could be correlated with certain molecular
features of chemicals (García-Domenech et al., 2001). These early models were limited by the
number of parameters that could be modeled, and their predictive ability was generally low, especially
for complex toxicities that can occur through different mechanisms of action. Currently, the list
of published predictive studies is very broad, including QSTRs to predict the effects of drugs and
chemicals on humans and the environment. Some relevant studies are:

• Fourches et al. studied the relationships between chemical structures and drug-induced liver
injury (DILI) (Fourches et al., 2010). A combination of rigorous text mining and cheminformatics
data analysis yielded a curated database of 951 compounds from the scientific literature for
developing chemoinformatics models predicting the toxicity of chemicals in the liver. Cluster
analysis of these compounds using 2D fragment descriptors allowed the identification of multiple
clusters of compounds belonging to structurally congeneric series. Binary QSTR models of liver
toxicity were derived, and the mean external prediction accuracy in a 5-fold external validation
study was found to be 65%.


• The results of some QSAR-based programs (BioEpisteme, MC4PC, MDL-QSAR and Leadscope
Predictive Data Miner) were compared when predicting different toxicological parameters, such
as carcinogenesis in rodents (Matthews et al., 2008), early detection of drug-induced hepatobiliary
and urinary tract toxicities (Matthews et al., 2009), and drug-related cardiac adverse effects (Frid
& Matthews, 2010). More than 1,500 compounds were used in each of the three studies for the
respective modelled endpoint, and identical training data sets were configured for comparative
purposes. The conclusions were quite similar in the three studies: model performance was affected
by the ratio of the number of active to inactive drugs, and enhanced performance was obtained
by combining predictions from at least two programs (consensus predictions resulted in better
performance, as demonstrated by either internal or external validation).
• Kar & Roy developed a global QSAR model for carcinogenesis, by using a data set of 1,464
compounds including many marketed drugs for their carcinogenesis potential (Kar & Roy, 2011).
Other than the statistical quality of this model, its importance lies in the very interesting
structural inferences that were extracted, which help to reduce the carcinogenic potential of
new molecules: 1) branching, size and shape were found to be crucial features for drug-induced
carcinogenicity; 2) higher lipophilicity values and conjugated ring systems, thio-keto and nitro
groups were found to contribute positively towards drug carcinogenicity; 3) secondary and tertiary
nitrogens, phenolic, enolic and carboxylic OH fragments, and presence of three-membered rings
were found to reduce carcinogenicity.
• Valerio & Cross developed a statistical QSAR approach on the Ames assay to predict the
mutagenic potential of drug impurities, and they described the results in terms of their content,
structural features and chemical space overlap in over a thousand drug impurities from FDA/
CDER drug applications and public sources (Valerio & Cross, 2012). This QSAR-mutagenicity
model was tested using an external validation with 2368 chemicals for which the Ames assay
outcome was known, and the results showed high sensitivity (81%). Furthermore, the model
outperformed the human expert alerts (based on experimental evidence) using the set of public
alerts assembled and coded into the Leadscope Enterprise computational software.
• Specific QSTR models were developed by Sangion & Gramatica to predict the acute toxicity of
Active Pharmaceutical Ingredients (APIs) at three aquatic trophic levels, i.e., algae, Daphnia and
two species of fish (Sangion & Gramatica, 2016). Multiple Linear Regression - Ordinary Least
Squares (MLR-OLS) models were developed using the QSARINS software, based on theoretical
molecular descriptors calculated by the PaDEL-Descriptor software (http://www.yapcwsoft.com/
dd/padeldescriptor/) and selected by Genetic Algorithms. The models proved to be statistically
robust and externally predictive, and they were applied to predict acute toxicity for a large set
of APIs without experimental data. Predictions were then processed by Principal Component
Analysis (PCA) and a trend driven by the combination of models was highlighted. This trend,
named Aquatic Toxicity Index (ATI), can be used for the ranking of chemicals according to their
potential toxicity on the whole aquatic environment.
• A recent study was published to provide scientific evidence supporting EFSA’s recent
“Guidance on tiered risk assessment for plant protection products for aquatic organisms in
edge-of-field surface waters”, which outlines the opportunity to apply non-testing methods
such as QSARs. Experimental fish LC50 values for 150 metabolites were extracted from the
Pesticide Properties Database (http://sitem.herts.ac.uk/aeru/ppdb/en/atoz.htm), and consequent
QSTR calculations were performed to predict fish acute toxicity using the US EPA’s ECOSAR
software (Burden et al., 2016). The results showed a significant correlation between predicted
and experimental fish LC50 values, therefore confirming the applicability of QSAR models in
the metabolite assessment scheme recommended by EFSA.
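The external validation scheme used in several of the studies above (e.g., the 5-fold external validation of the binary liver-toxicity models) can be sketched as follows. The binary labels and the deliberately trivial majority-class "model" are invented placeholders for illustration; a real study would train an actual classifier on molecular descriptors within each fold.

```python
# Sketch of k-fold external validation: the data set is split into k
# disjoint folds; each fold is held out once as an external test set while
# a "model" is built on the remaining folds, and the fold accuracies are
# averaged. Labels below are invented.
import random

def kfold_accuracy(labels, k=5, seed=0):
    """Mean external accuracy over k folds, using a majority-class model."""
    idx = list(range(len(labels)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint test folds
    accuracies = []
    for i in range(k):
        test = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        # "Training": simply pick the majority class of the training labels
        train_labels = [labels[j] for j in train]
        majority = max(set(train_labels), key=train_labels.count)
        correct = sum(labels[j] == majority for j in test)
        accuracies.append(correct / len(test))
    return sum(accuracies) / k

# Invented binary toxicity labels (1 = toxic, 0 = non-toxic)
toxic_labels = [1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 0, 1, 0, 0, 1, 1, 0, 1, 1, 0]
print(f"Mean external accuracy: {kfold_accuracy(toxic_labels):.2f}")
```

Because every compound appears in exactly one test fold, the averaged accuracy estimates performance on chemicals the model has never seen, which is the point of "external" validation.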


QSTRs in International Regulatory Norms


Various international organizations allow the use of computational methods (and in particular QSAR/
QSTR models) for various purposes related to the regulation of existing and new chemical entities
(Kar & Roy, 2010). In fact, chemoinformatic approaches along with other predictive in silico tools are
employed by Australian, Canadian, European, Japanese and US Government organizations (Fjodorova
et al., 2008). The toxicity/ecotoxicity of pharmaceuticals is one of the domains where computational
approaches for regulatory purposes are more recognized (Roy & Kar, 2016). A non-exhaustive list
of regulatory bodies that can be cited here could include the European Agency for the Evaluation of
Medicinal Products (EU-EMEA), the Food and Drug Administration (US-FDA), the Pharmaceutical
and Medical Devices Agency (Japan-PMDA), or the Australian Environment Agency (AEA) (Roy &
Kar, 2016). Also, the REACH regulation plays an inspirational role in the implementation of chemical
legislation across the globe. Countries such as Canada, China, India, Japan, Malaysia, South Korea,
Taiwan and Turkey have adopted, or are currently drafting, registration/authorization laws along similar
lines, which therefore include the application and development of QSTRs as valid tools for regulatory
purposes (Van Heerden, 2012).
The use of QSTRs to help in the regulation of chemicals can be divided into three main areas:
classification and labeling, risk assessment and prioritization. Historically, North American regulatory
agencies were the first to take the lead in the development and use of computational methods. For
example, the US Environmental Protection Agency (EPA, https://www3.epa.gov/) regulated the use of
QSAR models in the notification process prior to the manufacture of new chemicals, especially when
no previous toxicity data were available (Cronin, 2002). Another example was the Canadian law on
environmental protection (mostly known as “Canadian Environmental Protection Act”, CEPA), under
which some 23,000 substances from a list of household products had to be screened and grouped into
categories according to their values of persistence, bioaccumulation and inherent toxicity (Cronin,
2002). Given the lack of experimental data for a large majority of these substances, the advantages
and disadvantages of using QSTRs were analyzed (Cronin, 2002; MacDonald, Breton, Sutcliffe &
Walker, 2002). More recently, US regulatory agencies such as the Food and Drug Administration
(FDA, www.fda.gov) have invested time and resources in evaluating the usefulness of computational
methods aimed at detecting signs of toxicological risk, including QSTRs (Yang, Valerio & Arvidson,
2009; Valerio, 2011). At the FDA, these efforts materialized in the form of models for toxicological
parameters that cannot be tested in humans, including QSTRs on genetic toxicity, reproductive toxicity
and carcinogenicity (Matthews & Contrera, 2007; Contrera, Matthews, Kruhlak & Benz, 2005;
Matthews et al., 2007; Contrera, Matthews & Benz, 2003).
Despite the relatively limited use of computational methods in Europe at this time, the Danish
Environmental Protection Agency (http://eng.mst.dk/) used a variety of QSAR methods to prioritize
some 166,000 chemicals according to their potential human health effects (Cronin,
2002). Finally, the incentive for a greater use of these methods has come from EU regulations,
initially driven by a White Paper published by the European Commission setting out a strategy for a
future policy on chemicals (European Parliament, 2001) that was finally incorporated into the REACH
regulation (European Commission, 2006). In short, government policies, in both the
EU and North America, have encouraged the use of computational techniques to predict toxicity,
to the point that in some cases they have been incorporated into law. Among these techniques, QSTR
models are those with the greatest validity from a regulatory point of view.
A key issue raised at the legal level is the validity of QSTR models when making a
decision on the authorization of a chemical compound. Logically, the decision to apply
QSTRs to substances whose effects are still unknown is only sustainable when a model has a solid
foundation and is not the result of theoretical speculation. However, several retrospective studies
have demonstrated the lack of quality and predictive ability of some previously developed QSTR
models, thereby opening the debate on the suitability of their conventional use (Tropsha,
Gramatica & Gombar, 2003; Gramatica, 2013). A critical and impartial rethinking to assess the


reliability of computational model results seemed necessary. According to the Organisation for
Economic Co-operation and Development (OECD, www.oecd.org/), a set of general rules can be
defined to help determine whether a QSTR model is suitable for regulatory use (OECD, 2007).
They are known as the “rules of Setubal” because they were derived at the “International Workshop on
the Regulatory Acceptance of QSARs for Human Health and Environment Endpoints”, held in this
Portuguese city in March 2002. This event was organized by two of the most relevant professional
organizations representing the chemical industry: the European Chemical Industry Council (CEFIC,
an acronym for its original French name “Conseil Européen des Fédérations de l’Industrie Chimique”)
and the International Council of Chemical Associations (ICCA). These principles establish that, to
facilitate the consideration of a (Q)SAR model for regulatory purposes, it should be associated with
the following information (OECD, 2007) (Figure 3):

1. Principle 1: A defined endpoint (meaning any physicochemical, biological or environmental
effect that can be measured and therefore modelled). The intent of this principle is to ensure
transparency in the endpoint being predicted by a given model, since a given endpoint could
be determined by different experimental protocols and under different experimental conditions.
Ideally, (Q)SARs should be developed from homogeneous datasets in which the experimental
data have been generated by a single protocol.
2. Principle 2: An unambiguous algorithm, to ensure transparency in the description of the model
algorithm (especially in the case of commercially-developed models, since this information is
not always made publicly available).
3. Principle 3: A defined domain of applicability, since (Q)SARs are limited in terms of the types
of chemical structures, physicochemical properties and mechanisms of action for which the
models can generate reliable predictions.
4. Principle 4: Appropriate measures of goodness-of-fit, robustness and predictivity. This principle
expresses the need to provide two types of information: a) the internal performance of a model
(as represented by goodness-of-fit and robustness), determined by using a training set; and b)
the predictivity of a model, determined by using an appropriate test set.
5. Principle 5: A mechanistic interpretation, if possible. It is well known that a mechanistic
interpretation of a given (Q)SAR is frequently not possible. Nevertheless, the intent of this
principle is to ensure that, when possible, there is an assessment of the mechanistic associations
between the descriptors used in a model and the endpoint being predicted, and that any association
is documented.
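As an illustration of Principle 3, one of the simplest ways to define an applicability domain is a descriptor-range ("bounding box") check: a query molecule is considered inside the domain only if each of its descriptor values falls within the range covered by the training set. The descriptor names and values below are invented, and real implementations often use more refined criteria (e.g., leverage or distance-to-model measures).

```python
# Bounding-box applicability domain sketch (Principle 3). A prediction is
# trusted only for queries whose descriptors lie inside the training ranges.
# Descriptor names (logP, MW) and all values are invented placeholders.

def domain_ranges(training_descriptors):
    """Per-descriptor (min, max) ranges observed in the training set."""
    return {
        name: (min(m[name] for m in training_descriptors),
               max(m[name] for m in training_descriptors))
        for name in training_descriptors[0]
    }

def in_domain(query, ranges):
    """True only if every descriptor of the query is within its range."""
    return all(lo <= query[n] <= hi for n, (lo, hi) in ranges.items())

train = [
    {"logP": 1.2, "MW": 180.0}, {"logP": 3.4, "MW": 320.0},
    {"logP": 2.1, "MW": 250.0}, {"logP": 4.0, "MW": 410.0},
]
ranges = domain_ranges(train)
print(in_domain({"logP": 2.5, "MW": 300.0}, ranges))  # inside -> True
print(in_domain({"logP": 6.8, "MW": 900.0}, ranges))  # outside -> False
```

A prediction for the second query would be flagged as outside the domain and therefore unreliable, which is exactly the kind of transparency Principle 3 asks model developers to provide.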

These rules were finally adopted by all member countries of the OECD at the “37th Joint Meeting
of the Chemicals Committee and Working Party on Chemicals, Pesticides and Biotechnology” in
November 2004. The European Union has also included these rules, virtually unchanged, in Annex XI
of the REACH Regulation (European Commission, 2006) and in Annex IV of the Biocidal Products
Regulation (BPR) (European Parliament and The Council, 2012).
Considering this shift towards a more demanding use of QSTR techniques, many of the best
models developed for drug discovery could not meet these conditions (i.e., in theory they could
not be accepted from a regulatory perspective). This seems quite logical, because the main
objective of drug discovery is to derive predictive models (relegating transparency and adequacy of
the results to a secondary role), and validation is often based on internal data that are not available
for regulatory purposes.

Sources of Toxicological Data


A key factor in the good predictive efficacy of computational models is the amount and quality of the
data available for their development. These data may come from in vivo or in vitro assays (e.g.,
animal carcinogenicity assays vs. Ames tests on bacterial mutation), and their interlinkages have


Figure 3. Computational methods in predictive toxicology and acceptance of QSTRs from a regulatory perspective

to be considered, as usually the ultimate goal is to predict in vivo effects in humans. One of the
major constraints faced by QSAR practitioners in developing efficient models is the lack of adequate
data (in quantity and quality), which are essential for development and validation. So far, the available data have
been very limited in terms of chemical and biological space for the vast majority of toxicological
parameters. The need for more data to increase the coverage of models should not be seen as a need
for further animal testing, but rather as a call for a larger effort to use existing data (e.g., corporate
databases). In the past, voices were raised calling for the compilation of such databases (Cronin, 2002), and there are several
examples of proprietary data being published for the development of in silico models. These include
the “Cooperative Research and Development Agreement” (CRADA) between the US FDA and the
company Multicase, Inc. This collaboration, which involved the release of regulatory data (although
chemical structures were not made available to the public) and the co-development of software for
the automated evaluation of chemical structures, meant a great improvement in the coverage and capacity
of carcinogenicity prediction models (Matthews & Contrera, 1998). Problems of commercial sensitivity
and confidentiality should be considered in the collection and use of this kind of data.
More recently, advances in high-throughput techniques (HTS) and “omics” methods have allowed
the generation of multidimensional toxicity data in large chemical libraries, thus representing an
interesting avenue for future developments in computational toxicology. Currently there is a greater
amount of public and private resources available for the development of toxicity models, among which
several cases of data integration from collaborative research projects can be highlighted, such as
DSSTox (from "Distributed Structure-Searchable Toxicity"), the CPDB (from "Carcinogenic
Potency Database", with reports of animal cancer tests for more than 1,500 chemicals,
freely accessible via the Toxicology Data Network website, TOXNET, http://toxnet.nlm.nih.gov), or
PubChem (a public repository of chemical structures and their associated biological properties, most of
whose data come from the Molecular Libraries Screening program of the NIH, http://
pubchem.ncbi.nlm.nih.gov) (Richard, Gold & Nicklaus, 2006).
One problem posed by the existence of different sources of information is the quality and validity
of the data. Before starting the modeling, the QSAR practitioner has to ensure the quality of the
data sets, which in many cases come from different sources and in different formats, and he/she
also has to plan normalization procedures (Fourches, Muratov & Tropsha, 2010). Despite the efforts
of a number of international organizations, harmonizing this information remains a critical
need. Table 1 shows some useful electronic resources containing data suitable for building toxicity
models, all of them being public and accessible for free.
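As an illustration of the curation and normalization work mentioned above, the following sketch (not taken from any of the cited tools; the compound records, field names, units and conflict rule are all invented) merges toxicity records from heterogeneous sources, harmonizes units, drops incomplete entries and resolves duplicates:

```python
# Illustrative curation sketch (not from the article): merge toxicity records
# from heterogeneous sources, normalize units, drop incomplete entries and
# resolve duplicates. Field names, units and the conflict rule are hypothetical.

def curate(records):
    """Return {smiles: LD50 in mg/kg} after normalization and deduplication."""
    by_structure = {}
    for rec in records:
        if not rec.get("smiles") or rec.get("ld50") is None:
            continue  # incomplete entry: unusable for modeling
        value = rec["ld50"]
        if rec.get("unit") == "g/kg":  # normalize everything to mg/kg
            value *= 1000.0
        by_structure.setdefault(rec["smiles"].strip(), []).append(value)
    curated = {}
    for smiles, values in by_structure.items():
        if max(values) > 2 * min(values):
            continue  # strongly conflicting measurements: exclude rather than guess
        curated[smiles] = sum(values) / len(values)  # keep the mean of replicates
    return curated

records = [
    {"smiles": "CCO", "ld50": 7.06, "unit": "g/kg"},
    {"smiles": "CCO", "ld50": 7000.0, "unit": "mg/kg"},   # duplicate, consistent
    {"smiles": "c1ccccc1", "ld50": 930.0, "unit": "mg/kg"},
    {"smiles": "CCN", "ld50": None, "unit": "mg/kg"},     # incomplete record
]
print(curate(records))
```

Real curation pipelines add many more steps (structure standardization, salt stripping, tautomer handling), but the same logic of normalizing, filtering and reconciling applies.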

Specialized Software for QSTR Development


The steps of a QSAR/QSTR model development are always quite similar:


Table 1. Selection of freely available public toxicological databases

Database | Short description | Website access
ACToR (Aggregated Computational Toxicology Resource) | Collection by the "US EPA Computational Toxicology Research Program" from public information sources. It provides in vitro and in vivo environmental toxicological data for over 500,000 chemicals, especially pesticides and land- and water-polluting agents. | www.epa.gov/actor
CCRIS (Chemical Carcinogenesis Research Information System) | Results of carcinogenicity and mutagenicity tests for ca. 8,000 chemical compounds. This database was developed by the National Cancer Institute (NCI), mainly from studies cited in scientific publications and reports from the NCI journals, and was reviewed by experts in carcinogenesis and mutagenesis. Not updated since 2011. | http://toxnet.nlm.nih.gov/newtoxnet/ccris.htm
CPDB (Carcinogenic Potency Database) | Long-term carcinogenicity test results in animals (mice, rats, dogs, primates) for more than 1,500 chemical compounds. Not updated since 2005. | http://toxnet.nlm.nih.gov/cpdb/
DART (Developmental and Reproductive Toxicology Database) | Gathered by the National Library of Medicine (NLM); provides specific information on reproductive and developmental toxicology published since 1965. | http://www.nlm.nih.gov/pubs/factsheets/dartfs.html
DSSTox (Distributed Structure-Searchable Toxicity Database) | EPA initiative that proposes a decentralized set of toxicity databases, searchable by chemical structure and downloadable in standard formats such as SDF. | http://www.epa.gov/ncct/dsstox/
GENE-TOX (Genetic Toxicology Data Bank) | Mutagenicity data for about 3,000 EPA chemicals, reviewed by a panel of experts from the scientific literature. Not updated since 1998. | http://toxnet.nlm.nih.gov/newtoxnet/genetox.htm
PAN (Pesticide Action Network) Database | Pesticide information from many different sources, including (acute and chronic) human toxicity, ecotoxicity and regulatory information for about 6,400 pesticide active ingredients and their transformation products. | http://www.pesticideinfo.org/
RepDose (Repeated Dose Toxicity) | Database financed by CEFIC, with information on acute and chronic toxicity for ca. 1,200 chemical compounds in mice, rats and dogs. It includes values in specific organs, and the available studies are ordered by their credibility, according to their degree of compliance with OECD rules. | http://fraunhofer-repdose.de/
TETRATOX | Collection of the Institute of Agriculture at the University of Tennessee (US), composed of aquatic toxicity data for more than 2,400 organic compounds of industrial origin, measured on the ciliate Tetrahymena pyriformis. | http://www.vet.utk.edu/TETRATOX/index.php
ToxRefDB (Toxicity Reference Database) | Information on chronic, subchronic, reproductive and developmental toxicity for hundreds of chemicals, many of them pesticide active ingredients, and other products that affect the environment. | http://www.epa.gov/ncct/toxrefdb/

a. The initial database with precise structural and biological information of each of the compounds
is randomly decomposed into two groups, “training” and “validation”. The first will serve to
develop the model, and the second to verify its predictive power.
b. All compounds have to be characterized by a series of numerical molecular descriptors, so that
each molecule is unambiguously represented.
c. Application of appropriate mathematical techniques to the training group of compounds for the
generation of statistical models.


d. Validation of models by assessing the standardized statistical parameters obtained, and checking
the quality of fit between experimental and predicted values for the validation group.
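The four steps above can be sketched end-to-end in a few lines. This is a deliberately minimal illustration with synthetic data, a single descriptor per "molecule" and ordinary least squares, not a realistic QSTR workflow:

```python
# Minimal end-to-end sketch of steps (a)-(d) with synthetic data and a single
# descriptor per "molecule"; real QSTR work uses many descriptors and stronger
# statistical methods. Everything here is invented for illustration.
import random

random.seed(0)

# (a) data set of (descriptor, measured activity) pairs, randomly split into
#     training and validation groups
data = [(float(x), 0.8 * x + 1.0 + random.uniform(-0.1, 0.1)) for x in range(20)]
random.shuffle(data)
train, valid = data[:15], data[15:]

# (b) each molecule is represented here by one numeric descriptor; in practice
#     a whole vector of descriptors is computed from the structure

# (c) fit a least-squares line on the training group
n = len(train)
sx = sum(x for x, _ in train); sy = sum(y for _, y in train)
sxx = sum(x * x for x, _ in train); sxy = sum(x * y for x, y in train)
slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
intercept = (sy - slope * sx) / n

# (d) validate on the held-out group via the coefficient of determination
ybar = sum(y for _, y in valid) / len(valid)
ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in valid)
ss_tot = sum((y - ybar) ** 2 for _, y in valid)
r2 = 1.0 - ss_res / ss_tot
print(slope, intercept, r2)  # slope near 0.8, intercept near 1.0, r2 near 1
```

The held-out validation in step (d) is what distinguishes a predictive model from a mere fit, which is why REACH-oriented guidance insists on external validation sets.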

Specialized QSAR software must therefore cover both the calculation of the numerical descriptors
needed to characterize the molecules and the statistical techniques for the development of mathematical
models. The types of descriptors can be very different, from simple structural features (e.g., number
of heteroatoms, aromatic rings, etc.) or counts of certain fragments or substructures (e.g., number of
carbonyl groups, number of carboxyl groups, etc.), to physicochemical properties (molecular weight,
LogP, solubility, etc.), topological indices (which consider exclusively the molecular graph) or
three-dimensional, conformation-dependent indices. There are hundreds of descriptors, and there is
no clear consensus on which are the best ones, since different descriptors encode different information,
and comparative studies until now have provided conflicting results (Livingstone, 2000). Nevertheless,
the simplest descriptors are arguably the most practical ones, since they allow faster computation while
maintaining a comparable predictive capacity, which is particularly useful when large databases of
thousands of chemical structures have to be screened (Gozalbes, Doucet & Derouin, 2002).
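As a toy illustration of the simplest kind of descriptors mentioned above, rough heteroatom, carbon and aromatic-atom counts can be read directly off a SMILES string. This character-level counting ignores most of the real SMILES grammar; an actual implementation would use a chemoinformatics toolkit:

```python
# Toy structural-descriptor calculation working directly on a SMILES string.
# This character-level counting ignores most of the real SMILES grammar
# (brackets, charges, two-letter elements other than Cl); a chemoinformatics
# toolkit would be used in practice.

def simple_descriptors(smiles):
    upper = smiles.upper()
    hetero = sum(upper.count(a) for a in ("N", "O", "S", "P", "F"))
    carbons = upper.count("C") - upper.count("CL")  # do not count chlorine as carbon
    aromatic = sum(smiles.count(a) for a in ("c", "n", "o", "s"))  # lowercase = aromatic
    return {"heteroatoms": hetero, "carbons": carbons, "aromatic_atoms": aromatic}

print(simple_descriptors("CCO"))        # ethanol
print(simple_descriptors("c1ccccc1N"))  # aniline
```

Even counts this crude can separate broad chemical classes, which is why simple descriptors remain competitive for large-scale screening.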
The number of algorithms and statistical techniques useful for carrying out QSAR/QSTR
modeling is also very broad (e.g., linear regression, clustering, kernel methods, discriminant analysis,
etc.), and their selection depends on the nature of the data set and the type of prediction to be obtained
(binary "active/inactive" classification or quantitative estimation of toxicity, such as an LD50)
(Nantasenamat, Isarankura-Na-Ayudhya & Prachayasittikul, 2010; Gedeck, Kramer & Ertl, 2010).
There is a very broad panel of specialized programs, either for calculating descriptors or for
the application of statistical methods, as well as packages that integrate both aspects of QSAR
model development (Toropov, Toropova, Raska, Leszczynska & Leszczynski, 2014). An important
aspect is that most of these tools are available for free, and in a number of cases through publicly
accessible websites (Singla et al., 2013). Table 2 shows a selection of free programs and applications
for the development of QSAR models or the application of previously developed and validated models.

OTHER COMPUTATIONAL TOXICOLOGY METHODS

Docking
There are several intracellular receptors known to be important in mediating toxicological responses.
In many cases, the toxicity of chemical compounds originates from their binding affinity for one of
these specific biological receptors, resulting in dysfunction of processes such as biosynthesis, signal
transduction, transport, metabolism, etc. Examples of such receptors are the glucocorticoid
receptor (whose inhibition can damage the immune system and various bodily functions) or
the hERG channel (whose inhibition produces cardiac dysfunction and arrhythmias).
In this context, docking studies can be of great predictive help, since they allow determination of
the optimal conformation and orientation preferred by a molecule to bind one of these receptors,
generating a stable complex in which the free energy of the entire system is minimized. Docking
studies usually consist of four parts:

1. First, the structures of the potential ligands are energy-minimized;
2. a conformational sampling is carried out, so as to obtain a predetermined number of possible
conformations for each ligand;
3. the ligands are placed in a given pocket of the receptor, and different translations and rotations
of the different conformations are performed so they can be energetically evaluated; and
4. a "score" is assigned to each pose, in order to rank all possible hypotheses as a function of their
interactions and energies (Kitchen, Decornez, Furr & Bajorath, 2004).
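The placement-and-scoring idea of steps 3 and 4 can be caricatured in two dimensions. The pocket geometry, the grid of translations and rotations, and the Lennard-Jones-like scoring function below are all invented for illustration:

```python
# Two-dimensional caricature of steps 3 and 4: a rigid two-atom "ligand" is
# translated and rotated over a grid of poses inside a three-atom "pocket",
# each pose is scored with a Lennard-Jones-like energy, and the poses are
# ranked. Geometry, grid and scoring function are all invented for illustration.
import math

pocket = [(0.0, 0.0), (2.0, 0.0), (1.0, 2.0)]   # receptor pocket atoms (toy)
ligand = [(0.0, 0.0), (1.0, 0.0)]               # rigid two-atom ligand (toy)

def score(pose):
    """Lower is better: repulsion at short range, attraction near d = 1."""
    e = 0.0
    for lx, ly in pose:
        for px, py in pocket:
            d = math.hypot(lx - px, ly - py) + 1e-9  # guard against d = 0
            e += (1.0 / d) ** 12 - 2.0 * (1.0 / d) ** 6
    return e

poses = []
for tx in (0.0, 0.5, 1.0):                      # candidate translations
    for ty in (0.75, 1.0):
        for theta in (0.0, math.pi / 2):        # candidate rotations
            c, s = math.cos(theta), math.sin(theta)
            pose = [(tx + c * x - s * y, ty + s * x + c * y) for x, y in ligand]
            poses.append((score(pose), (tx, ty, theta)))

poses.sort()                                    # step 4: rank poses by score
best_score, best_pose = poses[0]
print(best_pose, best_score)
```

Real docking engines search a vastly larger 3D pose space and use force-field or knowledge-based scoring, but the search-then-rank structure is the same.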


Table 2. Selection of free software for development and/or application of QSAR/QSTR models

Software | Short description | Web page
CAESAR-VEGA | CAESAR was a project funded by the EU with the specific aim of developing QSAR models adapted to the REACH legislation. Five predictive models were implemented for five properties of high relevance in REACH: BCF, skin sensitization, carcinogenicity, mutagenicity and developmental toxicity. These models were developed according to the OECD principles, applying various statistical techniques as well as external validation sets to certify their predictivity. The CAESAR tools are currently implemented in the VEGA platform. | http://www.caesar-project.eu/ http://www.vega-qsar.eu/
ChemProp | Consists of several modules, depending on the calculation methods to be used. The implemented techniques are mainly based on the 2D chemical structure (fragments, topological indices, etc.). Several models from internal databases are incorporated. The models are accompanied by tools to characterize the applicability domain and to provide uncertainty estimates. | http://www.ufz.de/index.php?en=6738
CORAL-SEA | Simple interface able to generate regression models or binary classifications; it has the inconvenience of being too slow for excessively large sets of compounds (maximum 5,000 chemical structures). | http://www.insilico.eu/coral/CORALSEA.html
DTC Lab. Software Tools | Complete ensemble of chemoinformatic tools developed by the Drug Theoretics and Cheminformatics laboratory (DTC lab), Department of Pharmaceutical Technology, Jadavpur University. Different pieces of software are available for different aspects of QSAR, such as normalization, data pre-treatment, dataset division and clustering, applicability domain analyses, and QSAR model development and validation. A NanoProfiler is also included, to predict different properties of nanoparticles by using nano-QSAR models reported in the scientific literature. | http://teqip.jdvu.ac.in/QSAR_Tools/
QSAR-Toolbox | The official program of the OECD; it facilitates compliance with REACH standards. It is characterized mainly by the possibility of grouping chemicals based on their structural similarity, allowing the use of existing experimental data to fill information gaps. | http://www.oecd.org/chemicalsafety/risk-assessment/theoecdqsartoolbox.htm http://www.qsartoolbox.org
Lazar | Web interface through which various toxicity predictions can be generated from the chemical structure. It generates a rational report that includes the predictions, the applicability domain and validation results. | http://lazar.in-silico.ch/predict
RmSquare | Web tool to estimate the predictive ability of a model from experimental, predicted and reserved-for-validation data values. | http://aptsoftware.co.in/rmsquare/
TEST | Program developed by the EPA to estimate acute toxicity based only on the structure of the chemical compound. It uses different QSAR techniques and includes various prediction models: fathead minnow or Daphnia magna LC50, oral LD50 in rats, BCF, Ames mutagenicity, developmental toxicity, etc. It also contains models for predicting various physical properties (boiling point, viscosity, density, water solubility, melting point, etc.). | http://www.epa.gov/nrmrl/std/qsar/qsar.html#TEST
Tox-Comp | Modular system for early assessment of the cardiotoxicity of new chemical entities. The platform consists of several interconnected elements, including a calculator of hERG inhibition potential. | http://tox-comp.net/
Toxtree | Open-source application developed by the company Ideaconsult Ltd., commissioned by the Joint Research Centre (JRC) of the European Commission. It groups chemical structures into categories and predicts several types of toxic effects by applying decision trees. Among the models available: skin irritation, eye irritation, in vivo and in vitro mutagenicity (Ames test), carcinogenicity, biodegradation and persistence, DNA binding and protein binding. | http://toxtree.sourceforge.net/
Virtual Computational Chemistry Laboratory | Free access via the web to a set of QSAR programs, ranging from calculators of numerical molecular descriptors (PCLIENT or E-DRAGON), through ALOGPS (the reference software for the prediction of lipophilicity and aqueous solubility of molecules) and statistical tools (ASNN, PNN or PLS), to previously developed QSAR models ready for application. | http://www.vcclab.org/
Virtual ToxLab | Online tool for the prediction of potential toxicity mediated by 16 proteins known for their ability to trigger adverse effects, using a combination of computational methods (docking and QSAR). | http://www.chemie.unibas.ch/~vbc/molmod/virtualtox/index.html

In industrial practice, this technique is commonly used for the computational screening of large
databases containing hundreds or thousands of compounds ("virtual screening").
A significant number of docking programs exist, among which are DOCK, one of the first
programs described (http://dock.compbio.ucsf.edu/) (Kuntz, Blaney, Oatley, Langridge & Ferrin,
1982), and AutoDock, which is probably the most cited docking program (http://autodock.scripps.
edu/) (Goodsell, Morris & Olson, 1996). Another important tool is the Virtual ToxLab, a platform
designed in the "Biographics Laboratory 3R", a Swiss research organization for replacing animal
testing by computational methods (http://www.biograf.ch/), which has developed a combined docking
and QSAR strategy for the study of 16 receptors whose inhibition may induce adverse effects (Vedani,
Smiesko, Spreafico, Peristera & Dobler, 2009).

READ-ACROSS - EXTRAPOLATION BY STRUCTURAL SIMILARITY

The read-across of risk data (extrapolation by structural similarity) is a well-known method to predict
the hazard profile of a substance by linking it to structurally similar compounds for which experimental
data on an effect are known (Nicolotti et al., 2014; Vink, Mikkers, Bouwman, Marquart & Kroese, 2010).
This methodology is based on the well-established assumption in medicinal chemistry that
common structural features imply similar properties (Patlewicz et al., 2013). A read-across prediction
can be derived from a property of a set of structurally similar compounds. For example, if a particular
substance is known to be carcinogenic, a structurally very similar compound can also be considered
a probable carcinogen. Predictions become more reliable when a greater number of similar
chemical structures are grouped in a category. Analogously to QSAR/QSTR, one of the criteria for a
prediction by read-across is that the candidate chemical falls within the applicability domain of the
model, i.e., within the range of descriptors and/or classes of chemicals associated with the proposed
mechanism.
Unlike QSAR/QSTR, read-across outcomes are essentially qualitative, and their cornerstone
is the correct identification of the similar chemical(s) with the known property. In this regard, the
approach gains significance if other aspects are considered in addition to chemical similarity, such as
a common mode of action and similar metabolic pathways. Read-across has the advantage of being
a transparent methodology that can be easily understood, which favours its acceptability by end users
and regulators. REACH accepts read-across predictions, i.e., the extrapolation of the toxicological
data of a chemical to predict the toxicity of another, similar one.
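A minimal read-across sketch, with invented compounds, fragment "fingerprints" and hazard labels, might look as follows; the similarity threshold plays the role of a crude applicability-domain check:

```python
# Minimal read-across sketch: compounds are reduced to sets of structural
# fragments ("fingerprints"), similarity is Tanimoto on those sets, and the
# hazard label of the most similar analogue is transferred. All compounds,
# fragments and labels are invented for illustration.

def tanimoto(a, b):
    return len(a & b) / len(a | b)

known = {
    "cmpd_A": ({"nitro", "aromatic_ring", "amine"}, "carcinogenic"),
    "cmpd_B": ({"nitro", "aromatic_ring"}, "carcinogenic"),
    "cmpd_C": ({"hydroxyl", "aliphatic_chain"}, "non-carcinogenic"),
}

def read_across(query, threshold=0.5):
    """Label of the most similar analogue, or None when no analogue is
    similar enough (query outside the applicability domain)."""
    sims = [(tanimoto(query, fp), label) for fp, label in known.values()]
    best_sim, best_label = max(sims)
    return best_label if best_sim >= threshold else None

print(read_across({"nitro", "aromatic_ring", "halogen"}))  # close to cmpd_B
print(read_across({"ester"}))                              # no close analogue
```

Regulatory read-across additionally demands mechanistic justification (mode of action, metabolism), which no similarity score alone can supply.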
A useful program for generating read-across predictions is the QSAR-Toolbox, a publicly
available tool that allows the selection of analogues and the building of mechanism-based categories
to infer trends and perform read-across analyses and predictions. It is part of the OECD QSAR Project
and was co-developed with ECHA (www.oecd.org/chemicalsafety/testing/oecdquantitativestructure-
activityrelationshipsprojectqsars.htm), with the overall objective of increasing regulatory acceptance
of in silico predictions.

Knowledge-Based Structure-Activity Relationships ("Knowledge-Based SAR Systems")
This approach is associated with the reactivity of the functional groups of chemicals. These groups
can show steric and electronic features responsible for their interaction with specific biological targets,
thus inducing their subsequent toxicological effects; such groups are termed "toxicophores". Typical
toxicophore characteristics are hydrophobicity, aromaticity, cationic or anionic character, and the
ability to form hydrogen bonds.
The fundamental aspect of this methodology is a base of human knowledge used to establish a set of
codified rules to predict whether a chemical can show a certain toxicity. These systems detect structural
alerts when toxicophores with significant scientific evidence are identified in the input molecular
structures. The predictions are based on literature references that can be consulted to check their relevance.
The development of the knowledge base or rule base requires a careful analysis of large databases,
and essentially represents a consensus-based approach. An inherent drawback of this approach is
that human experts can be wrong or even inadvertently biased in their determinations (Guzelian,
Victoroff, Halmes, James & Guzelian, 2005).
The main advantage of these systems is that the knowledge generating the rules is based on
empirical evidence, usually linked to a mechanistic understanding of the toxicity, into which human
judgment and reasoning are integrated (Valerio & Long, 2010). A disadvantage is that a
substructural feature is only part of a broader set of features of a molecule, and therefore predictions
may tend to overestimate the potential toxicity because they do not consider mitigating or modulating
features. Another interesting point is that, strictly speaking, there is no true negative prediction:
"no structural alert" simply means that no rule encoded in the software flags this molecule,
perhaps because the sub-structure is not sufficiently represented in the available literature
to be characterized as a toxicophore.
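A toy version of such a rule base (patterns here are plain SMILES substrings rather than real SMARTS, and the rules and reference texts are purely illustrative) shows both the alert mechanism and why "no alert" is not a negative prediction:

```python
# Toy rule base for structural alerts: each rule pairs a substructure pattern
# (here a plain SMILES substring, not real SMARTS) with the kind of literature
# reference a knowledge-based system would attach. Rules and reference texts
# are purely illustrative.

ALERT_RULES = [
    ("[N+](=O)[O-]", "aromatic nitro group", "illustrative mutagenicity reference"),
    ("N=N",          "azo group",            "illustrative carcinogenicity reference"),
]

def structural_alerts(smiles):
    """Return the (name, reference) of every rule whose pattern is found.
    An empty list is NOT evidence of safety: no encoded rule fired."""
    return [(name, ref) for pattern, name, ref in ALERT_RULES if pattern in smiles]

print(structural_alerts("c1ccccc1[N+](=O)[O-]"))  # nitrobenzene: one alert fires
print(structural_alerts("CCO"))                   # ethanol: no rule fires
```

Production systems match genuine substructure queries against the molecular graph and attach curated, citable evidence to each rule; the substring matching above only illustrates the control flow.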
A common misunderstanding is to believe that knowledge-based SAR systems merely
contain and manage a database of toxicity information. These programs are not simple database
management systems, but highly complex computer software that integrates knowledge and expert rules
derived from the analysis of chemical and toxicological data. Various computer programs generate
preventive structural alerts for the presence of toxicophore groups (Kazius, McGuire & Bursi, 2005;
Judson, 2006). The most popular and widely used is Derek Nexus, by Lhasa Ltd. (www.lhasalimited.org/
products/derek-nexus.htm) (Marchant, Briggs & Long, 2008). Most of these programs offer flexible
configuration, including functionality for writing custom rules so that users can derive their
own rules according to their internal data.


PBPK
Adverse and toxic effects of compounds (environmental pollutants, drugs, etc.) in living beings
depend largely on their pharmacokinetic (ADME) properties, i.e., how they are absorbed, distributed
throughout the body, metabolized and eventually excreted. PBPK ("physiologically-based
pharmacokinetic") modeling is a technique of mathematical simulation of the pharmacokinetics of
compounds for predicting their ADME properties, and it is used both in pharmaceutical research and
development and in evaluating health risks (Bartels et al., 2012; Chen, Yarmush & Maguire, 2012). It
is a methodology that aims to account for the pharmacokinetic differences between species in order
to estimate human risk from animal data, and for intra-species individual differences (age, sex, race,
etc.) in order to assess the impact of pharmacokinetic variability on individual risks. In short, PBPK
modeling attempts to describe the relationship between external measures of applied dose (e.g., the
amount of food or drug supplied, or the concentration of pollutants in water or air) and internal
measures of dose (for example, the amount metabolized or the concentration in the tissue sample that
shows a toxic response), using as realistic a description as possible of the physiology and biochemistry
of the animal/human (Caldwell, Evans & Krishnan, 2012).
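The external-dose/internal-dose relationship can be illustrated with a drastically simplified one-compartment model; real PBPK models use many physiologically parameterized compartments, and all parameter values below are invented:

```python
# Drastically simplified illustration of the external-dose/internal-dose idea:
# a one-compartment model with first-order absorption and elimination,
# integrated with explicit Euler steps. Real PBPK models use many
# physiologically parameterized compartments; all values here are invented.

def peak_concentration(dose_mg, ka, ke, vd_l, hours, dt=0.01):
    """Peak plasma concentration (mg/L) after a single oral dose."""
    gut, conc, peak, t = dose_mg, 0.0, 0.0, 0.0
    while t < hours:
        absorbed = ka * gut * dt                    # drug leaving the gut
        gut -= absorbed
        conc += absorbed / vd_l - ke * conc * dt    # distribution minus elimination
        peak = max(peak, conc)
        t += dt
    return peak

# assumed parameters: 100 mg dose, ka = 1/h, ke = 0.2/h, Vd = 40 L
cmax = peak_concentration(100.0, ka=1.0, ke=0.2, vd_l=40.0, hours=24.0)
print(cmax)  # well below dose/Vd = 2.5 mg/L, since elimination starts at once
```

Even this toy model makes the key point: the internal dose that drives toxicity is not the administered dose, but a quantity shaped by absorption and elimination rates.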
Applications of PBPK modeling in toxicology have increased in recent years, including the
prediction of exposure to environmental pollutants and their metabolites, of concentrations of
compounds with potential adverse effects in particular organs, or the calculation of the appropriate
administration dose for a drug considering its mechanism of action. Most of these models have been
developed and validated with experimental data from animal tests and then extrapolated to humans.
Given the tendency to limit animal testing, the information obtained in this way is now much scarcer,
and it is therefore being replaced by in vitro techniques and data from gene-expression microarrays
(Kleinstreuer et al., 2011).
There is great interest in combining in vitro toxicity data with PBPK models to calculate
the human-equivalent doses for a given concentration, and various publications have reported on
this in vitro-in vivo extrapolation for risk assessment (Wetmore et al., 2012; Clewell & Andersen,
2004). Although PBPK models are designed to predict tissue or blood concentrations, new
dose-response models that use biologically based estimates of internal dose to predict toxicity through
statistical correlations are needed. This type of study provides a unique opportunity to incorporate
information on mechanisms of action.
Concerning the software used for these studies, the reference in this field is certainly GastroPlus,
a simulation package capable of predicting the absorption, pharmacokinetics, pharmacodynamics and
drug-drug interactions of compounds administered to animals and humans by different routes (oral,
ocular, intravenous, pulmonary) (http://www.simulations-plus.com) (Kuentz, Nick, Parrott &
Röthlisberger, 2006).

OUTLOOK: PREDICTIVE NANOTOXICOLOGY

In recent years, the use of nanomaterials in different industrial sectors has experienced a dramatic
increase, and their applications in real life are increasingly diverse. However, there is a huge knowledge
gap in the understanding of the toxic effects of these nanoparticles, and concern about their safety is
consequently an important issue. Nanoparticles have unique properties compared to traditional
chemicals, such as a greater surface-to-volume ratio and a greater ability to cross biological
membranes. These physico-chemical properties have an important influence on the interactions with
biological systems: since nanomaterials have a size comparable to biomacromolecules, they could
be involved in interactions not usually observed with traditional chemicals, and produce unknown
and potentially toxic effects on cells (Zhu et al., 2013).
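The surface-to-volume argument can be made concrete for an idealized spherical particle, where S/V simplifies to 3/r and therefore grows steeply as the radius shrinks toward the nanometre scale (the radii below are chosen arbitrarily):

```python
# For an idealized spherical particle the surface-to-volume ratio is
# S/V = 3/r, so it grows steeply as the radius shrinks toward the
# nanometre scale (radii below are chosen arbitrarily, in nm).
import math

def surface_to_volume(radius_nm):
    surface = 4.0 * math.pi * radius_nm ** 2
    volume = (4.0 / 3.0) * math.pi * radius_nm ** 3
    return surface / volume  # simplifies to 3 / radius_nm

for r in (1000.0, 100.0, 10.0):   # 1 um down to 10 nm
    print(r, surface_to_volume(r))
```

A hundredfold reduction in radius yields a hundredfold larger relative surface, which is one reason nanoparticle reactivity and biological interactions differ from those of bulk material.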
Computational methodologies can help to understand the intrinsic characteristics of nanomaterials
and their mechanisms of action, but their development has so far been based on classical chemistry,
and needs adaptation to the parameters of nanomaterials, to quantitatively assess their risks and, if
necessary, to propose replacement solutions with safer and less toxic effects. There are examples
of predictive work using docking / molecular dynamics to investigate the binding of fullerenes to
potassium channels (Kraszewski, Tarek, Treptow & Ramseyer, 2010) or evaluating drug affinity to
carbon nanotubes in an aqueous medium (Liu, Yang & Hopfinger, 2009).
Statistical computational techniques such as read-across or QSAR are the ones that have mostly
been used with nanomaterials. The greatest difficulty of such studies is that the molecular structures
have to be characterized beforehand by descriptors, which in this case is quite complicated because,
until recently, few or no simple numerical descriptors adapted to nanometric particles existed
(Fourches et al., 2010). The development and validation of statistically reliable computer models
is difficult; so far, only a few geometric, structural, physical, chemical and biological studies of
nanomaterials are available. Nevertheless, the number of relevant "Nano-QSAR" models is growing
significantly, and among them the following should be highlighted:

1. Several models have treated the prediction of the solubility of nanomaterials in different solvents,
since this is one of the properties most influential on their behaviour in the environment. Here
we may cite the work on solubility prediction of fullerenes with simple topological indices by
Petrova et al. (Petrova, Rasulev, Toropov, Leszczynska & Leszczynski, 2011).
2. Liu and Hopfinger studied the structural changes of cell membranes after the insertion of carbon
nanotubes, and developed predictive QSAR models of the influence of these nanotubes on
toxicity (Liu & Hopfinger, 2008).
3. Fourches et al. developed different models with two sets of nanoparticles and various in vitro
cellular assays: 1) 51 nanoparticles containing various metallic nuclei, and 2) 109 nanoparticles
sharing the same metallic nucleus but having different surface modifiers. Nano-QSAR models
developed from these compounds were highly predictive, and can be used to prioritize the design
and manufacturing of safer nanomaterials (Fourches et al., 2010).

Efforts for better nanoparticles’ representation by new nano-descriptors have been recently
reviewed (Sizochenko & Leszczynski, 2016), and the contribution of computational models to
prediction of nanoparticles’ toxicity towards bacteria, cell lines and microorganisms has been clearly
stated. The finding of new nano-descriptors opens new ways of improving Nano-QSARs, as some
recent examples demonstrate: 1) the identification of properties of metal-based colloidal materials has
served for grouping purposes (Sayes, Smith, & Ivanov, 2013); 2) optimal descriptors calculated by
Monte Carlo with correlation weights of various concentrations and different exposure times allowed
a model to be built for cell membrane damage caused by nano metal-oxides (Toropova et al., 2015);
3) the grouping provided by the PCA approach after calculation of 35 individual nanodescriptors
based on the surface-core model was found to be in good accordance with the algal growth inhibition
data (Támm et al., 2016); 4) simplex-informational descriptors have been used for the prediction of
solubility of fullerene derivatives (Sizochenko, Kuz’min, Ognichenko & Leszczynski, 2016).
In short, Nano-QSAR models are still at an early stage, given the lack of data available for their
generation and the need to develop new descriptors able to clearly capture their structural properties.
However, the cited references are good examples of their potential, and scientific, industrial and
national institutions should harmonize their regulatory efforts for their development (Puzyn,
Leszczynska & Leszczynski, 2009; Winkler et al., 2013).

FUTURE ASPECTS OF CHEMOINFORMATICS IN PREDICTIVE TOXICOLOGY

New regulations such as the EU-REACH exert significant pressure to minimize animal testing,
and clearly a great opportunity exists for replacing animals with in silico tools (Nicolotti et al., 2014).
Nevertheless, a completely alternative testing framework has not yet been definitively adopted by
companies, which cite several reasons, such as the lack of worldwide regulatory acceptance of
alternative methods, the possible overestimation of the hazard of products, or the time and costs
involved if a company has to do both non-animal and animal tests (Clippinger, Hill, Curren & Bishop,
2016) (Figure 2).
Therefore, more efforts in computational toxicology are necessary to guarantee the progressive
replacement of in vivo assays.
As explained before, the lack of available toxicological data is one of the main points here. It is
well known that many data are produced in industry, in many cases from projects that have been
abandoned and are no longer of interest. Obviously, the release of these data could serve for building
chemoinformatics models, and therefore increase the performance of the predictions (Gasteiger,
2016). The quality of the data is also crucial for the quality of the models, and unfortunately most
public data sets on chemicals and biological properties are riddled with errors. Consequently, data
curation is an essential aspect for modelers to take into consideration (Young, 2008), and very good
examples of the importance of refining compound annotations have been reported (Fourches
et al., 2010). Another issue is the lack of standardization of the available information in both
databases and software, characterized not only by a "methodological dispersion" but also by a
geographical one, since the criteria for estimating data can vary significantly between countries.
Therefore, harmonization, systematization and standardization of the criteria and methods used
internationally are crucial for the near future. Finally, another obstacle regarding the availability of
data is that published models are relatively difficult to retrieve and reuse, thus preventing their broader
adoption. In this respect, initiatives such as the QSAR DataBank (QsarDB) repository (Ruusmann,
Sild & Maran, 2015), which aims to make the processes and outcomes of in silico modeling work
transparent, reproducible and accessible, are very welcome.
Another issue with in silico predictions is the limited explanation of mechanisms of action for
complex parameters provided by some overly simplistic models. Toxicity is multidimensional, in
that many toxicological effects are the result of changes in multiple physiological processes. The
reliability of the predictions is often not sufficiently documented to make safe decisions and justify
waiving animal testing. Consequently, new models and a more critical use of them are required to
obtain solid predictions that are also acceptable from a regulatory point of view. Predictive toxicology
is refocusing on the identification and modelling of biologically significant perturbations of key
toxicity pathways at the molecular level. Great emphasis is being placed on in-depth explorations
of the mechanisms of toxicological action, and on integrating multiple types of data from diverse
experimental systems into unified risk-assessment paradigms such as the Adverse Outcome Pathway
(AOP) concept (Benigni, 2016).
Finally, another important aspect of the future of chemoinformatics from a regulatory point of view is its applicability to fields in which it has not yet been extensively used. Probably one of the most relevant areas is nanotoxicology, as discussed before, but there are two other significant emerging domains: the development of models for biological molecules and for chemical mixtures. In the first case, probably the most studied molecules so far are antimicrobial peptides, which are seen as a prospective class of antimicrobial therapeutics with significant advantages such as a broad range of activity, low toxicity, and minimal development of resistance in target organisms (Cherkasov et al., 2014). Concerning toxicology, pioneering bioinformatics/chemoinformatics studies from Raghava's group have led to the development of a web server (ToxinPred, www.imtech.res.in/raghava/toxinpred/) that helps to predict the toxicity of peptides, the minimum mutations needed to increase or decrease their toxicity, and toxic regions in proteins (Gupta et al., 2013).
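Sequence-based predictors of this kind typically start from simple compositional features of the peptide. The following is a purely illustrative sketch of such a descriptor (it is not ToxinPred's actual implementation, and the example peptide is hypothetical): the fraction of each standard amino acid in the sequence.

```python
# Illustrative sketch: amino-acid composition features for a peptide.
# NOT ToxinPred's actual implementation; it only shows the kind of
# compositional descriptors sequence-based toxicity predictors use.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues

def aa_composition(peptide):
    """Return the fraction of each of the 20 standard amino acids."""
    peptide = peptide.upper()
    n = len(peptide)
    if n == 0:
        raise ValueError("empty peptide")
    return {aa: peptide.count(aa) / n for aa in AMINO_ACIDS}

# Hypothetical short peptide, for illustration only
features = aa_composition("GIGKFLHSAK")
print(features["K"])  # → 0.2 (2 lysines out of 10 residues)
```

A real predictor would feed such a feature vector (often extended to dipeptide composition) into a trained statistical model rather than interpret it directly.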
Chemical mixtures have very broad uses and applications in commercial, industrial, and pharmaceutical products, but chemoinformatics approaches have traditionally targeted individual compounds. This field is very new and under active development, and its current challenges are similar to those faced by classical QSAR years ago: appropriate descriptors have to be developed for mixtures, and specific QSAR methods and validation procedures have to be defined (Cherkasov et al., 2014).
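One common starting point in the mixture-QSAR literature is to derive mixture descriptors from those of the pure components, for example as mole-fraction-weighted averages. A minimal sketch follows, assuming each component already has a descriptor vector; the component descriptor values are hypothetical and for illustration only.

```python
# Minimal sketch: mole-fraction-weighted mixture descriptors.
# Component descriptor values are hypothetical, for illustration only.

def mixture_descriptors(components):
    """components: list of (mole_fraction, descriptor_dict) pairs.
    Returns the mole-fraction-weighted average of each descriptor."""
    total = sum(x for x, _ in components)
    if abs(total - 1.0) > 1e-6:
        raise ValueError("mole fractions must sum to 1")
    mixed = {}
    for x, desc in components:
        for name, value in desc.items():
            mixed[name] = mixed.get(name, 0.0) + x * value
    return mixed

# Hypothetical binary mixture: 70% component A, 30% component B
mix = mixture_descriptors([
    (0.7, {"logP": 2.0, "MW": 100.0}),
    (0.3, {"logP": 4.0, "MW": 200.0}),
])
# mix["logP"] ≈ 2.6, mix["MW"] ≈ 130.0
```

More elaborate schemes in the literature also add deviation terms (e.g., absolute differences between component descriptors) to capture non-additive mixture behavior.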


CONCLUSION

There is growing interest in the use of chemoinformatics to predict toxicity. In silico predictive models are experiencing a boom following the restrictions on animal testing in specific areas such as the development of cosmetics. In industry, these methods allow new compounds to be evaluated at the earliest stages of their development. The careful use of chemoinformatics techniques by properly trained specialists could lead to a very significant reduction in animal testing. Among the wide variety of potential uses are those related to the regulatory environment, such as the classification and labeling of chemicals, their risk assessment, and prioritization based on predicted toxicological values.
The implementation of REACH strongly encourages the minimization of animal testing, and has therefore provided an impetus to employ in silico methods for the safety assessment of chemicals (Jacobs et al., 2016). Furthermore, since 11th September 2015, ECHA requires registrants to show that they have considered alternative testing methods before submitting new testing proposals, to further ensure that testing on animals is done only as a last resort, when no valid alternatives exist (https://echa.europa.eu/view-article/-/journal_content/56/10162/22022250). Acceptance of these techniques under international norms such as REACH requires compliance with more stringent rules than those applied in the internal processes of specialized industries, such as "classical" drug discovery, because it is necessary to ensure the adequacy of the results for regulatory purposes (e.g., avoiding unnecessary risks to human health and the environment). A critical, case-by-case assessment and discussion of the reliability of the results is essential to give credibility to these methodologies.
Computational toxicology is undoubtedly still in a developmental phase, but there are already many examples of its successful application. Many scientists and regulators believe that new and better ways to assess human toxicity are needed, and technological advances are making it possible to implement computational methods even at the regulatory level. This new role of chemoinformatics is accompanied by a growing number of large databases and specialized programs, and it will depend on the harmonious integration of the different existing computational techniques. A combination of chemoinformatics methods (especially QSAR and read-across) in the context of a Weight of Evidence strategy may provide greater reliability for decision making (Jacobs et al., 2016). Some existing platforms, such as VEGA (http://www.vega-qsar.eu/), explicitly support this kind of combined approach.
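The underlying idea can be illustrated with a toy consensus scheme in which each line of evidence (a QSAR prediction, a read-across estimate, an in vitro result) contributes a toxicity call weighted by its assessed reliability. The weights and calls below are hypothetical, and real Weight of Evidence assessments involve expert judgment rather than a fixed formula.

```python
# Toy Weight of Evidence consensus: each line of evidence gives a
# binary toxicity call (True = toxic) and a reliability weight in [0, 1].
# Weights and calls are hypothetical; real WoE assessments rely on
# expert judgment, not a fixed formula.

def weight_of_evidence(evidence):
    """evidence: list of (call, weight) pairs. Returns the weighted
    fraction of evidence supporting a 'toxic' conclusion."""
    total = sum(w for _, w in evidence)
    if total == 0:
        raise ValueError("no usable evidence")
    return sum(w for call, w in evidence if call) / total

score = weight_of_evidence([
    (True, 0.8),   # QSAR model prediction, high reliability
    (True, 0.5),   # read-across from a structural analogue
    (False, 0.3),  # borderline in vitro result
])
# score ≈ 0.81: the weighted evidence leans toward 'toxic'
```

In practice, platforms such as VEGA report per-prediction reliability (applicability domain) indices that can play the role of the weights in such a scheme.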
In regulatory practice, the use of computational methods such as QSAR is already possible in the EU. The registration tool IUCLID (International Uniform Chemical Information Database, http://iuclid.eu/), which allows users to capture, manage and exchange data on the properties of chemicals, prepare dossiers, and send them to the competent authorities of the different EU countries, includes the ability to submit QSAR predictions as long as they follow the OECD guidelines and are presented in a specific format called the "QSAR Model Reporting Format" (QMRF). Chemoinformatics techniques should therefore play a key role at the regulatory level in reducing large-scale testing of chemicals on animals.


REFERENCES

Bartels, M., Rick, D., Lowe, E., Loizou, G., Price, P., Spendiff, M., & Ball, N. et al. (2012). Development of
PK- and PBPK-based modeling tools for derivation of biomonitoring guidance values. Computer Methods and
Programs in Biomedicine, 108(2), 773–788. doi:10.1016/j.cmpb.2012.04.014 PMID:22704290
Benigni, R. (2016). Predictive toxicology today: The transition from biological knowledge to practicable models.
Expert Opinion on Drug Metabolism & Toxicology, 12(9), 989–992. doi:10.1080/17425255.2016.1206889
PMID:27351633
Boyer, S. (2009). The use of computer models in pharmaceutical safety evaluation. Alternatives to Laboratory
Animals, 37, 467–475. PMID:20017577
Burden, N., Maynard, S. K., Weltje, L., & Wheeler, J. R. (2016). The utility of QSARs in predicting acute fish
toxicity of pesticide metabolites: A retrospective validation approach. Regulatory Toxicology and Pharmacology,
80, 241–246. doi:10.1016/j.yrtph.2016.05.032 PMID:27235557
Caldwell, J. C., Evans, M. V., & Krishnan, K. (2012). Cutting edge PBPK models and analyses: Providing
the basis for future modeling efforts and bridges to emerging toxicology paradigms. Journal of Toxicology.
PMID:22899915
Chen, A., Yarmush, M. L., & Maguire, T. (2012). Physiologically based pharmacokinetic models: Integration
of in silico approaches with micro cell culture analogues. Current Drug Metabolism, 13, 863–880.
doi:10.2174/138920012800840419 PMID:22571482
Cherkasov, A., Muratov, E. N., Fourches, D., Varnek, A., Baskin, I. I., Cronin, M., & Tropsha, A. et al. (2014).
QSAR modeling: Where have you been? Where are you going to? Journal of Medicinal Chemistry, 57(12),
4977–5010. doi:10.1021/jm4004285 PMID:24351051
Clewell, H. J., & Andersen, M. E. (2004). Applying mode-of-action and pharmacokinetic considerations in
contemporary cancer risk assessments: An example with trichloroethylene. Critical Reviews in Toxicology,
34(5), 385–445. doi:10.1080/10408440490500795 PMID:15560567
Clippinger, A. J., Hill, E., Curren, R., & Bishop, P. (2016). Bridging the gap between regulatory acceptance and
industry use of non-animal methods. ALTEX, 33, 453–458. PMID:27254273
Contrera, J. F., Matthews, E. J., & Benz, R. D. (2003). Predicting the carcinogenic potential of pharmaceuticals
in rodents using molecular structural similarity and E-state indices. Regulatory Toxicology and Pharmacology,
38(3), 243–259. doi:10.1016/S0273-2300(03)00071-0 PMID:14623477
Contrera, J. F., Matthews, E. J., Kruhlak, N. L., & Benz, R. D. (2005). In silico screening of chemicals for bacterial
mutagenicity using electrotopological E-state indices and MDL QSAR software. Regulatory Toxicology and
Pharmacology, 43(3), 313–323. doi:10.1016/j.yrtph.2005.09.001 PMID:16242226
Cronin, M. T. (2002). The current status and future applicability of quantitative structure-activity relationships
(QSARs) in predicting toxicity. Alternatives to Laboratory Animals, 30 (Suppl. 2), 81–84. PMID:12513655
Dickins, M., & Modi, S. (2002). Importance of predictive ADME simulation. Drug Discovery Today, 7(14),
755–756. doi:10.1016/S1359-6446(02)02357-7 PMID:12547029
Duart, M. J., Antón-Fos, G. M., de Julián-Ortiz, J. V., Gozalbes, R., Gálvez, J., & García-Domenech, R. (2002).
Use of molecular topology for the prediction of physicochemical, pharmacokinetic and toxicological properties
of a group of antihistaminic drugs. International Journal of Pharmaceutics, 246(1-2), 111–119. doi:10.1016/
S0378-5173(02)00352-6 PMID:12270614
European Commission. (2006). Regulation (EC) No 1907/2006 of The European Parliament and The Council
of 18 December 2006. Off. J. Eur. Union Lett., 396, 1–849.
European Parliament. (2001). Resolution on the Commission White Paper on Strategy for a future Chemicals
Policy, COM (2001) 88-C5-0258/2001-2001/2118 (COS) in OJ C140E of 13.6.2002.
European Union. (2012). Regulation (EU) No 528/2012 of The European Parliament and of The Council of 22 May
2012 concerning the making available on the market and use of biocidal products. Off. J. Eur. Union Lett., 167.


Fjodorova, N., Novich, M., Vrachko, M., Smirnov, V., Kharchevnikova, N., Zholdakova, Z., & Benfenati, E.
et al. (2008). Directions in QSAR modeling for regulatory uses in OECD member countries, EU and in Russia.
Journal of Environmental Science and Health. Part C: Environmental Health Sciences, 26(2), 201–236.
doi:10.1080/10590500802135578 PMID:18569330
Fourches, D., Barnes, J. C., Day, N. C., Bradley, P., Reed, J. Z., & Tropsha, A. (2010). Cheminformatics analysis
of assertions mined from literature that describe drug-induced liver injury in different species. Chemical Research
in Toxicology, 23(1), 171–183. doi:10.1021/tx900326k PMID:20014752
Fourches, D., Muratov, E., & Tropsha, A. (2010). Trust, but verify: On the importance of chemical structure
curation in cheminformatics and QSAR modeling research. Journal of Chemical Information and Modeling,
50(7), 1189–1204. doi:10.1021/ci100176x PMID:20572635
Fourches, D., Pu, D., Tassa, C., Weissleder, R., Shaw, S. Y., Mumper, R. J., & Tropsha, A. (2010). Quantitative
nanostructure–activity relationship modeling. ACS Nano, 4(10), 5703–5712. doi:10.1021/nn1013484
PMID:20857979
Frid, A. A., & Matthews, E. J. (2010). Prediction of drug-related cardiac adverse effects in humans-B: Use of
QSAR programs for early detection of drug-induced cardiac toxicities. Regulatory Toxicology and Pharmacology,
56(3), 276–289. doi:10.1016/j.yrtph.2009.11.005 PMID:19941924
García-Domenech, R., de Julián-Ortiz, J. V., Duart, M. J., García-Torrecillas, J. M., Antón-Fos, G. M., Ríos-
Santamarina, I., & Gálvez, J. et al. (2001). Search of a topological pattern to evaluate toxicity of heterogeneous
compounds. SAR and QSAR in Environmental Research, 12(1-2), 237–254. doi:10.1080/10629360108035380
PMID:11697058
Gasteiger, J. (2016). Chemoinformatics: Achievements and challenges, a personal view. Molecules (Basel,
Switzerland), 21(2), 151. doi:10.3390/molecules21020151 PMID:26828468
Gedeck, P., Kramer, C., & Ertl, P. (2010). Computational analysis of structure-activity relationships. Progress
in Medicinal Chemistry, 49, 113–160. doi:10.1016/S0079-6468(10)49004-9 PMID:20855040
Goodsell, D. S., Morris, G. M., & Olson, A. J. (1996). Automated docking of flexible ligands: Applications of
AutoDock. Journal of Molecular Recognition, 9(1), 1–5. doi:10.1002/(SICI)1099-1352(199601)9:1<1::AID-
JMR241>3.0.CO;2-6 PMID:8723313
Gozalbes, R., Doucet, J. P., & Derouin, F. (2002). Application of topological descriptors in QSAR and drug design:
History and new trends. Current Drug Targets. Infectious Disorders, 2(1), 93–102. doi:10.2174/1568005024605909
PMID:12462157
Gozalbes, R., Jacewicz, M., Annand, R., Tsaioun, K., & Pineda-Lucena, A. (2011). QSAR-based permeability
model for drug-like compounds. Bioorganic & Medicinal Chemistry, 19(8), 2615–2624. doi:10.1016/j.
bmc.2011.03.011 PMID:21458999
Gozalbes, R., & Pineda-Lucena, A. (2010). QSAR-based solubility model for drug-like compounds. Bioorganic
& Medicinal Chemistry, 18(19), 7078–7084. doi:10.1016/j.bmc.2010.08.003 PMID:20810286
Gramatica, P. (2013). On the development and validation of QSAR models. Methods in Molecular Biology
(Clifton, N.J.), 930, 499–526. doi:10.1007/978-1-62703-059-5_21 PMID:23086855
Gupta, S., Kapoor, P., Chaudhary, K., Gautam, A., Kumar, R., Raghava, G. P. S., & Open Source Drug Discovery Consortium. (2013). In silico approach for predicting toxicity of peptides and proteins. PLoS ONE, 8(9), e73957. doi:10.1371/journal.pone.0073957 PMID:24058508
Guzelian, P. S., Victoroff, M. S., Halmes, N. C., James, R. C., & Guzelian, C. P. (2005). Evidence-based
toxicology: A comprehensive framework for causation. Human and Experimental Toxicology, 24(4), 161–201.
doi:10.1191/0960327105ht517oa PMID:15957536
Jacobs, M. N., Colacci, A., Louekari, K., Luijten, M., Hakkert, B. C., Paparella, M., & Vasseur, P. (2016).
International regulatory needs for development of an IATA for non-genotoxic carcinogenic chemical substances.
ALTEX, 33, 359–392. PMID:27120445


Judson, P. N. (2006). Using computer reasoning about qualitative and quantitative information to predict metabolism and toxicity. In B. Testa, S. D. Krämer, H. Wunderli-Allenspach, & G. Folkers (Eds.), Pharmacokinetic profiling in drug research: Biological, physicochemical, and computational strategies (pp. 183–215). Weinheim, Germany: Wiley-VCH Verlag GmbH & Co. KGaA. doi:10.1002/9783906390468.ch24
Kar, S., & Roy, K. (2010). Predictive toxicology using QSAR: A perspective. Journal of the Indian Chemical
Society, 87, 1455–1515.
Kar, S., & Roy, K. (2011). Development and validation of a robust model for prediction of carcinogenicity of
drugs. Indian Journal of Biochemistry & Biophysics, 48, 111–122. PMID:21682143
Kazius, J., McGuire, R., & Bursi, R. (2005). Derivation and validation of toxicophores for mutagenicity prediction.
Journal of Medicinal Chemistry, 48(1), 312–320. doi:10.1021/jm040835a PMID:15634026
Kitchen, D. B., Decornez, H., Furr, J. R., & Bajorath, J. (2004). Docking and scoring in virtual screening for
drug discovery: Methods and applications. Nature Reviews. Drug Discovery, 3(11), 935–949. doi:10.1038/
nrd1549 PMID:15520816
Kleinstreuer, N. C., Judson, R. S., Reif, D. M., Sipes, N. S., Singh, A. V., Chandler, K. J., & Knudsen, T. B. et al.
(2011). Environmental impact on vascular development predicted by high throughput screening. Environmental
Health Perspectives, 119(11), 1596–1603. doi:10.1289/ehp.1103412 PMID:21788198
Kraszewski, S., Tarek, M., Treptow, W., & Ramseyer, C. (2010). Affinity of C60 neat fullerenes with membrane
proteins: A computational study on potassium channels. ACS Nano, 4(7), 4158–4164. doi:10.1021/nn100723r
PMID:20568711
Kuentz, M., Nick, S., Parrott, N., & Röthlisberger, D. (2006). A strategy for preclinical formulation development
using GastroPlus as pharmacokinetic simulation tool and a statistical screening design applied to a dog study.
European Journal of Pharmaceutical Sciences, 27(1), 91–99. doi:10.1016/j.ejps.2005.08.011 PMID:16219449
Kuntz, I. D., Blaney, J. M., Oatley, S. J., Langridge, R., & Ferrin, T. E. (1982). A geometric approach to
macromolecule-ligand interactions. Journal of Molecular Biology, 161(2), 269–288. doi:10.1016/0022-
2836(82)90153-X PMID:7154081
Liu, J., & Hopfinger, A. J. (2008). Identification of possible sources of nanotoxicity from carbon nanotubes
inserted into membrane bilayers using membrane interaction quantitative structure–activity relationship analysis.
Chemical Research in Toxicology, 21(2), 459–466. doi:10.1021/tx700392b PMID:18189365
Liu, J., Yang, L., & Hopfinger, A. J. (2009). Affinity of drugs and small biologically active molecules to carbon
nanotubes: A pharmacodynamics and nanotoxicity factor? Molecular Pharmaceutics, 6(3), 873–882. doi:10.1021/
mp800197v PMID:19281188
Livingstone, D. J. (2000). The characterization of chemical structures using molecular properties. A survey. Journal
of Chemical Information and Computer Sciences, 40(2), 195–209. doi:10.1021/ci990162i PMID:10761119
MacDonald, D., Breton, R., Sutcliffe, R., & Walker, J. D. (2002). Uses and limitations of quantitative structure-
activity relationships (QSARs) to categorize substances on the Canadian Domestic Substance List as persistent
and/or bioaccumulative, and inherently toxic to non-human organisms. SAR and QSAR in Environmental Research,
13(1), 43–55. doi:10.1080/10629360290002082 PMID:12074391
Marchant, C. A., Briggs, K. A., & Long, A. (2008). In silico tools for sharing data and knowledge on toxicity and
metabolism: Derek for Windows, Meteor, and Vitic. Toxicology Mechanisms and Methods, 18(2-3), 177–187.
doi:10.1080/15376510701857320 PMID:20020913
Matthews, E. J., & Contrera, J. F. (1998). A new highly specific method for predicting the carcinogenic
potential of pharmaceuticals in rodents using enhanced MCASE QSAR-ES software. Regulatory Toxicology
and Pharmacology, 28(3), 242–264. doi:10.1006/rtph.1998.1259 PMID:10049796
Matthews, E. J., & Contrera, J. F. (2007). In silico approaches to explore toxicity end points: Issues and concerns
for estimating human health effects. Expert Opinion on Drug Metabolism & Toxicology, 3(1), 125–134.
doi:10.1517/17425255.3.1.125 PMID:17269899


Matthews, E. J., Kruhlak, N. L., Benz, R. D., Contrera, J. F., Marchant, C. A., & Yang, C. (2008). Combined
use of MC4PC, MDL-QSAR, BioEpisteme, Leadscope PDM, and Derek for Windows software to achieve
high-performance, high-confidence, mode of action-based predictions of chemical carcinogenesis in rodents.
Toxicology Mechanisms and Methods, 18(2-3), 189–206. doi:10.1080/15376510701857379 PMID:20020914
Matthews, E. J., Kruhlak, N. L., Benz, R. D., Ivanov, J., Klopman, G., & Contrera, J. F. (2007). A comprehensive
model for reproductive and developmental toxicity hazard identification: II. Construction of QSAR models to
predict activities of untested chemicals. Regulatory Toxicology and Pharmacology, 47(2), 136–155. doi:10.1016/j.
yrtph.2006.10.001 PMID:17175082
Matthews, E. J., Ursem, C. J., Kruhlak, N. L., Benz, R. D., Sabaté, D. A., Yang, C., & Contrera, J. F. et al. (2009).
Identification of structure-activity relationships for adverse effects of pharmaceuticals in humans: Part B. Use
of (Q)SAR systems for early detection of drug-induced hepatobiliary and urinary tract toxicities. Regulatory
Toxicology and Pharmacology, 54(1), 23–42. doi:10.1016/j.yrtph.2009.01.009 PMID:19422098
Merlot, C. (2008). In silico methods for early toxicity assessment. Current Opinion in Drug Discovery &
Development, 11, 80–85. PMID:18175270
Merlot, C. (2010). Computational toxicology - a tool for early safety evaluation. Drug Discovery Today, 15(1-2),
16–22. doi:10.1016/j.drudis.2009.09.010 PMID:19835978
Modi, S., Hughes, M., Garrow, A., & White, A. (2012). The value of in silico chemistry in the safety assessment
of chemicals in the consumer goods and pharmaceutical industries. Drug Discovery Today, 17(3-4), 135–142.
doi:10.1016/j.drudis.2011.10.022 PMID:22063083
Nantasenamat, C., Isarankura-Na-Ayudhya, C., & Prachayasittikul, V. (2010). Advances in computational methods to predict the biological activity of compounds. Expert Opinion on Drug Discovery, 5(7), 633–654. doi:10.1517/17460441.2010.492827 PMID:22823204
Nicolotti, O., Benfenati, E., Carotti, A., Gadaleta, D., Gissi, A., Mangiatordi, G. F., & Novellino, E. (2014).
REACH and in silico methods: An attractive opportunity for medicinal chemists. Drug Discovery Today, 19(11),
1757–1768. doi:10.1016/j.drudis.2014.06.027 PMID:24998783
Patlewicz, G., Ball, N., Booth, E. D., Hulzebos, E., Zvinavashe, E., & Hennes, C. (2013). Use of category
approaches, read-across and (Q)SAR: General considerations. Regulatory Toxicology and Pharmacology, 67(1),
1–12. doi:10.1016/j.yrtph.2013.06.002 PMID:23764304
Petrova, T., Rasulev, B. F., Toropov, A. A., Leszczynska, D., & Leszczynski, J. (2011). Improved model for
fullerene C60 solubility in organic solvents based on quantum-chemical and topological descriptors. Journal of
Nanoparticle Research, 13(8), 3235–3247. doi:10.1007/s11051-011-0238-x
Puzyn, T., Leszczynska, D., & Leszczynski, J. (2009). Toward the development of nano-QSARs: Advances and
challenges. Small, 5(22), 2494–2509. doi:10.1002/smll.200900179 PMID:19787675
Richard, A. M., Gold, L. S., & Nicklaus, M. C. (2006). Chemical structure indexing of toxicity data on the Internet:
Moving toward a flat world. Current Opinion in Drug Discovery & Development, 9, 314–325. PMID:16729727
Rovida, C., & Hartung, T. (2009). Re-evaluation of animal numbers and costs for in vivo tests to accomplish
REACH legislation requirements for chemicals: A report by the Transatlantic Think Tank for Toxicology (T4).
ALTEX, 26, 187–208. doi:10.14573/altex.2009.3.187 PMID:19907906
Roy, K., & Kar, S. (2016). In silico models for ecotoxicity of pharmaceuticals. In E. Benfenati (Ed.), In silico
methods for predicting drug toxicity, MIMB (Vol. 1425, pp. 237-304). New York: Springer Science+Business
Media. doi:10.1007/978-1-4939-3609-0_12
Ruusmann, V., Sild, S., & Maran, U. (2015). QSAR DataBank repository: Open and linked qualitative and quantitative structure-activity relationship models. Journal of Cheminformatics, 7(1), 32. doi:10.1186/s13321-015-0082-6 PMID:26110025
Sangion, A., & Gramatica, P. (2016). Hazard of pharmaceuticals for aquatic environment: Prioritization by structural approaches and prediction of ecotoxicity. Environment International, 95, 131–143. doi:10.1016/j.envint.2016.08.008 PMID:27568576
Sayes, C. M., Smith, P. A., & Ivanov, I. V. (2013). A framework for grouping nanoparticles based on their
measurable characteristics. International Journal of Nanomedicine, 8(Suppl. 1), 45–56. doi:10.2147/IJN.S40521
PMID:24098078

Scholz, S., Sela, E., Blaha, L., Braunbeck, T., Galay-Burgos, M., García-Franco, M., & Winter, M. J. et al.
(2013). A European perspective on alternatives to animal testing for environmental hazard identification and
risk assessment. Regulatory Toxicology and Pharmacology, 67(3), 506–530. doi:10.1016/j.yrtph.2013.10.003
PMID:24161465
Singla, D., Dhanda, S. K., Chauhan, J. S., Bhardwaj, A., Brahmachari, S. K., Raghava, G. P. S., & Open Source Drug Discovery Consortium. (2013). Open source software and web services for designing therapeutic molecules. Current Topics in Medicinal Chemistry, 13(10), 1172–1191. doi:10.2174/1568026611313100005 PMID:23647540
Sizochenko, N., Kuzmin, V., Ognichenko, L., & Leszczynski, J. (2016). Introduction of simplex-informational descriptors for QSPR analysis of fullerene derivatives. Journal of Mathematical Chemistry, 54(3), 698–706. doi:10.1007/s10910-015-0581-8
Sizochenko, N., & Leszczynski, J. (2016). Review of current and emerging approaches for Quantitative
Nanostructure-Activity Relationship modeling: The case of inorganic nanoparticles. Journal of Nanotoxicology
and Nanomedicine, 1(1), 1–16. doi:10.4018/JNN.2016010101
Tämm, K., Sikk, L., Burk, J., Rallo, R., Pokhrel, S., Mädler, L., & Tamm, T. et al. (2016). Parametrization of
nanoparticles: Development of full-particle nanodescriptors. Nanoscale, 8(36), 16243–16250. doi:10.1039/
C6NR04376C PMID:27714136
Taylor, K., & Rego, L. (2016). EU statistics on animal experiments for 2014. ALTEX, 33, 465–468. doi:10.14573/
altex.1609291 PMID:27806180
The Organisation for Economic Co-operation and Development (OECD). (2007). Guidance document on the validation of (Quantitative) Structure-Activity Relationships [(Q)SAR] models. OECD Environment Health and Safety Publications. Retrieved from www.oecd.org/ehs/
Toropov, A. A., Toropova, A. P., Raska, I. Jr, Leszczynska, D., & Leszczynski, J. (2014). Comprehension
of drug toxicity: Software and databases. Computers in Biology and Medicine, 45, 20–25. doi:10.1016/j.
compbiomed.2013.11.013 PMID:24480159
Toropova, A. P., Toropov, A. A., Benfenati, E., Korenstein, R., Leszczynska, D., & Leszczynski, J. (2015).
Optimal nano-descriptors as translators of eclectic data into prediction of the cell membrane damage by means of
nano metal-oxides. Environmental Science and Pollution Research International, 22(1), 745–757. doi:10.1007/
s11356-014-3566-4 PMID:25223357
Tropsha, A., Gramatica, P., & Gombar, V. K. (2003). The importance of being earnest: Validation is the absolute
essential for successful application and interpretation of QSPR models. QSAR & Combinatorial Science, 22(1),
69–77. doi:10.1002/qsar.200390007
Valerio, L. G. Jr. (2011). In silico toxicology models and databases as FDA Critical Path Initiative toolkits.
Human Genomics, 5(3), 200–207. doi:10.1186/1479-7364-5-3-200 PMID:21504870
Valerio, L. G. Jr. (2013). Predictive computational toxicology to support drug safety assessment. Methods in
Molecular Biology (Clifton, N.J.), 930, 341–354. doi:10.1007/978-1-62703-059-5_15 PMID:23086849
Valerio, L. G. Jr, & Cross, K. P. (2012). Characterization and validation of an in silico toxicology model to
predict the mutagenic potential of drug impurities. Toxicology and Applied Pharmacology, 260(3), 209–221.
doi:10.1016/j.taap.2012.03.001 PMID:22426359
Valerio, L. G. Jr, & Long, A. (2010). The in silico prediction of human-specific metabolites from hepatotoxic
drugs. Current Drug Discovery Technologies, 7, 170–187. doi:10.2174/157016310793180567 PMID:20843294
Van de Waterbeemd, H., & Gifford, E. (2003). ADMET in silico modelling: Towards prediction paradise? Nature
Reviews Drug Discovery, 2(3), 192–204. doi:10.1038/nrd1032 PMID:12612645
Van Heerden, S. (2012). Recent developments in global regulatory framework in the chemical industry. Popul.
Plast. Packag., 57, 46–50.
Vedani, A., Smiesko, M., Spreafico, M., Peristera, O., & Dobler, M. (2009). VirtualToxLabTM - in silico
prediction of the toxic (endocrine-disrupting) potential of drugs, chemicals and natural products. Two years
and 2,000 compounds of experience: A progress report. ALTEX, 26, 167–176. doi:10.14573/altex.2009.3.167
PMID:19907904


Vinardell, M. P. (2007). Alternatives to animal experimentation in Toxicology: Present situation. Acta Bioeth.,
13, 41–52.
Vink, S. R., Mikkers, J., Bouwman, T., Marquart, H., & Kroese, E. D. (2010). Use of read-across and tiered
exposure assessment in risk assessment under REACH - a case study on a phase-in substance. Regulatory
Toxicology and Pharmacology, 58(1), 64–71. doi:10.1016/j.yrtph.2010.04.004 PMID:20394791
Wetmore, B. A., Wambaugh, J. F., Ferguson, S. S., Sochaski, M. A., Rotroff, D. M., Freeman, K., & Thomas,
R. S. et al. (2012). Integration of dosimetry, exposure, and high-throughput screening data in chemical toxicity
assessment. Toxicological Sciences, 125(1), 157–174. doi:10.1093/toxsci/kfr254 PMID:21948869
Winkler, D. A., Mombelli, E., Pietroiusti, A., Tran, L., Worth, A., Fadeel, B., & McCall, M. J. (2013). Applying
quantitative structure-activity relationship approaches to nanotoxicology: Current status and future potential.
Toxicology, 313(1), 15–23. doi:10.1016/j.tox.2012.11.005 PMID:23165187
Yang, C., Valerio, L. G. Jr, & Arvidson, K. B. (2009). Computational toxicology approaches at the US Food and
Drug Administration. Alternatives to Laboratory Animals, 37, 523–531. PMID:20017581
Young, D., Martin, T., Venkatapathy, R., & Harten, P. (2008). Are the chemical structures in your QSAR correct?
QSAR & Combinatorial Science, 27(11-12), 1337–1345. doi:10.1002/qsar.200810084
Zhu, M., Nie, G., Meng, H., Xia, T., Nel, A., & Zhao, Y. (2013). Physicochemical properties determine
nanomaterial cellular uptake, transport, and fate. Accounts of Chemical Research, 46(3), 622–631. doi:10.1021/
ar300031y PMID:22891796

Rafael Gozalbes graduated in Pharmacy from the University of Valencia, Spain, and obtained his PhD in Pharmacy in 1998 in the field of QSAR applied to the selection of new drugs with potential antiprotozoal activity. Between 1998 and 2001, he held postdoctoral scholarships from the Spanish Ministry of Foreign Affairs and the French foundation "Ensemble contre le SIDA", at the "Groupe de Chimie Informatique et Modélisation" (ITODYS – CNRS) and the Faculté de Médecine (Université Paris VII). Between 2001 and 2007 he worked as a senior scientist in the Molecular Modeling group of the biotech company CEREP (Paris, France), and then as a research scientist at the Laboratory of Structural Biochemistry, Centro de Investigación Príncipe Felipe (CIPF) (Valencia, Spain), until 2010. In 2012 Dr. Gozalbes founded the ProtoQSAR company.

Jesús Vicente de Julián-Ortiz graduated in Chemical Sciences, specializing in Biochemistry, from the University of Valencia, Spain. He completed the PhD program in Organic Synthesis and Fine Chemicals and became a Doctor of Pharmacy in 1997. He held a postdoctoral scholarship from the Ministry of Education and Science at the Institute of Computational Chemistry of the University of Girona, Spain (2002-2003), in the Molecular Engineering group directed by Prof. Ramon Carbó-Dorca. He worked as a researcher for the "Red de Investigación de Centros de Enfermedades Tropicales" (Network of Research Centers on Tropical Diseases) at the Faculty of Pharmacy, University of Valencia, Spain (2003-2006). He is currently a part-time professor in the Department of Physical Chemistry, Faculty of Chemistry, at the University of Valencia, and acts as Scientific Director at ProtoQSAR SL, a company providing (Q)SAR and computational medicinal chemistry services.
