2691–2697, 2016
© 2016 SETAC
Printed in the USA
(Submitted 29 March 2016; Returned for Revision 18 April 2016; Accepted 21 April 2016)
Abstract: Quantitative structure–activity relationships (QSARs) for the toxicity of a large set of 758 organic compounds to Daphnia magna were developed. The simplified molecular input-line entry system (SMILES) was used to represent the molecular structure. The Correlation and Logic (CORAL) software was used to develop the QSAR models. These models are built using the Monte Carlo method and according to the principle “QSAR is a random event” if one checks a group of random distributions in the visible training set and the invisible validation set. Three distributions of the data into the visible training, calibration, and invisible validation sets are examined. The predictive potential (i.e., the statistical characteristics for the invisible validation set of the best model) is as follows: n = 87, r² = 0.8377, root mean square error = 0.564. The mechanistic interpretations and the domain of applicability of the built models are suggested and discussed. Environ Toxicol Chem 2016;35:2691–2697. © 2016 SETAC
Keywords: Computational toxicology; Ecological risk assessment; Environmental toxicology; Aquatic toxicology; Organic contaminant
of T* and N*, a predictive model can be calculated using the training set:
potential of a model can be estimated via a defect of Fk, defect(Fk) [31,32]:
Table 1. Statistical characteristics of quantitative structure–activity relationship models for 3 distributions of data into the training, invisible training, calibration, and validation sets^a
RESULTS
QSAR models
The QSAR models for toxicity to D. magna, which are
calculated with the balance of correlations for 3 distributions
of data into the training, invisible training, calibration, and
validation sets, are as follows:
However, if one does not use the balance of correlation approach, the developed models are different. The QSAR models for toxicity to D. magna, which are calculated with the traditional scheme (without the invisible training set) for 3 distributions of the data into the training, calibration, and validation sets, are as follows:

$$\mathrm{pLC50} = 2.087\,(\pm 0.0028) + 0.08818\,(\pm 0.0001) \cdot \mathrm{DCW}(3,10) \tag{10}$$

$$\mathrm{pLC50} = 2.301\,(\pm 0.0026) + 0.08368\,(\pm 0.0001) \cdot \mathrm{DCW}(3,10) \tag{11}$$

$$\mathrm{pLC50} = 2.538\,(\pm 0.0023) + 0.09843\,(\pm 0.0001) \cdot \mathrm{DCW}(3,10) \tag{12}$$

Figure 2. Graphical representation of correlations between experimental and calculated –log 50% lethal concentration (pLC50) values.

3, respectively. This changes when the traditional scheme of distribution is used. In such a case the numbers of outliers are 14, 12, and 5 for distributions 1, 2, and 3, respectively. Thus, distribution 3 seems to be preferable for both versions of the Monte Carlo optimization (balance of correlations and traditional scheme). In fact, the criterion expressed as inequality 6 is an indicator of compounds, which have rare, untypical molecular features. In other words, these compounds are suspected to be outliers. However, the data on the number of these suspected compounds give an additional possibility to estimate different distributions of the data into the training, invisible training, calibration, and validation sets: “if the number of the above-mentioned outliers is smaller, then distribution is better.”
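Equations 10 to 12 share one linear form, so applying them amounts to looking up an intercept and a slope and performing one multiplication. The sketch below is illustrative only: the coefficients and the outlier counts (14, 12, and 5) are transcribed from the text, the standard errors are ignored, and the function names and the example DCW value are hypothetical. It also encodes the quoted rule that fewer suspected outliers indicates a better distribution.

```python
# Intercepts and slopes transcribed from Equations 10-12 (traditional scheme).
# Keys are the distribution numbers used in the text.
TRADITIONAL_MODELS = {
    1: (2.087, 0.08818),
    2: (2.301, 0.08368),
    3: (2.538, 0.09843),
}

# Numbers of suspected outliers reported for the traditional scheme.
TRADITIONAL_OUTLIERS = {1: 14, 2: 12, 3: 5}


def predict_plc50(dcw: float, distribution: int) -> float:
    """Return pLC50 = intercept + slope * DCW(3,10) for one distribution."""
    intercept, slope = TRADITIONAL_MODELS[distribution]
    return intercept + slope * dcw


def preferred_distribution(outliers: dict[int, int]) -> int:
    """Pick the distribution with the fewest suspected outliers."""
    return min(outliers, key=outliers.get)


if __name__ == "__main__":
    example_dcw = 20.0  # hypothetical descriptor value, not taken from the paper
    for k in sorted(TRADITIONAL_MODELS):
        print(k, round(predict_plc50(example_dcw, k), 3))
    print("preferred:", preferred_distribution(TRADITIONAL_OUTLIERS))
```

With the counts reported for the traditional scheme, the selection returns distribution 3, which matches the preference stated in the text.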
Table 2. List of molecular features extracted from simplified molecular input-line entry system (SMILES), which are promoters of increase (or decrease) for
–log 50% lethal concentration (pLC50)
provides a good summary of the already executed approaches. One can compare the previously published data with the statistical characteristics of the models developed in the present study. The data presented in Table 4 clearly indicate that the QSAR models calculated with Equations 7 to 9 are comparable with models suggested in the literature.

Advantages of the CORAL models
There are various benefits of applying the CORAL models. The approach described in the present study builds the QSAR model using solely experimental values of pLC50 together with data on the molecular structure. There is no need to use information on physicochemical parameters, 3-dimensional representations of the molecular systems, or quantum mechanics descriptors for the considered compounds. The method applied in the present study delivers QSAR models in accordance with Organisation for Economic Co-operation and Development principles [38].

Disadvantages of the CORAL models
Though CORAL models are efficient and reliable, there are also some drawbacks of such approaches: 1) there are SMILES attributes (this is related to SSk and SSSk involved in Equation 1) which have no physical interpretation, and 2) the Monte Carlo calculations take considerable time to execute, especially if the number of compounds is large (e.g., n > 500).

Availability of examined data
The Supplemental Data contain technical details: 1) SM1 contains correlation weights for calculations with Equations 7 to 9; 2) SM2 contains experimental and predicted pLC50 values calculated with Equations 7 to 9; 3) SM3 contains the correlation weights obtained in 3 probes of the Monte Carlo optimization where one can find the stable promoters of the pLC50 increase together with promoters of the pLC50 decrease; 4) SM4 contains 3 distributions into the training (i.e., the training, invisible training, and calibration sets) and external validation sets which were examined in the present study (these distributions can be checked with the CORAL software); and 5) SM5 contains the CORAL method used to build the described models.

CONCLUSIONS
Using the CORAL approach, predictions of toxicity of 758 organic compounds to D. magna were carried out. All QSAR
Table 3. Correlation weights and prevalence of molecular features Fk extracted from simplified molecular input-line entry system (SMILES), which are
promoters of increase (or decrease) for –log 50% lethal concentration
Feature, Fk Distribution CW(Fk) in probe 1 CW(Fk) in probe 2 CW(Fk) in probe 3 NT NIT NC Defect(Fk)
CW = correlation weight; defect(Fk) = defect of a feature, Fk, calculated with Equation 3; NT, NIT, and NC = numbers of SMILES attribute x in the training, invisible training, and calibration sets, respectively.
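As a hedged illustration of how correlation weights such as the CW(Fk) values of Table 3 could enter a descriptor, the sketch below assumes that the descriptor is the plain sum of the weights of single SMILES symbols and of adjacent pairs and triples of symbols (the Sk, SSk, and SSSk attributes). This is a simplification and not the CORAL implementation: the tokenizer, the function names, and all weight values are hypothetical, and global SMILES attributes and the threshold T* are left out.

```python
import re

# Simplified SMILES tokenizer (an assumption for illustration): bracket atoms,
# two-letter halogens, %nn ring closures, then any other single character.
TOKEN_RE = re.compile(r"\[[^\]]+\]|Br|Cl|%\d{2}|.")


def smiles_attributes(smiles: str) -> list[tuple[str, ...]]:
    """Return single tokens, adjacent pairs, and adjacent triples as tuples."""
    tokens = TOKEN_RE.findall(smiles)
    return [
        tuple(tokens[i:i + size])
        for size in (1, 2, 3)
        for i in range(len(tokens) - size + 1)
    ]


def dcw(smiles: str, weights: dict[tuple[str, ...], float]) -> float:
    """Sum the weights of all attributes; unknown attributes contribute 0."""
    return sum(weights.get(attr, 0.0) for attr in smiles_attributes(smiles))


if __name__ == "__main__":
    # Hypothetical correlation weights, not values from the Supplemental Data.
    toy_weights = {
        ("C",): 0.5,
        ("O",): 1.0,
        ("C", "C"): 0.25,
        ("C", "C", "O"): -0.5,
    }
    print(dcw("CCO", toy_weights))
```

Keying the weights by tuples of tokens rather than by concatenated strings avoids confusing, for example, the single token Cl with a pair of one-character tokens.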
2696 Environ Toxicol Chem 35, 2016 A.P. Toropova et al.
Table 4. Comparison of the statistical quality of quantitative structure–activity relationship models for toxicity to Daphnia magna

| No. | n | r² | s | n | r² | s | Reference |
|---|---|---|---|---|---|---|---|
| 1 | - | 0.740–0.768 | 0.79–1.00 | - | - | - | [34] |
| 2 | 222 | 0.738 | | 75 | 0.721 | | [35] |
| 3 | 97 | 0.77 | 0.39 | - | - | 0.34–0.44 | [36] |
| 4 | 149 | 0.70 | 1.04 | 89 | 0.768 | 0.88 | [37] |
| 5 | 307 | 0.739 | 0.80 | 87 | 0.838 | 0.564 | Present study^a |

^a Equation 9.
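The r2 and s columns of Table 4 can in principle be recomputed from paired experimental and predicted pLC50 values. The minimal sketch below assumes that r2 is the squared Pearson correlation coefficient and that s is the root mean square error (the abstract reports the value 0.564 under that name); the other studies listed in the table may define s differently. The function names and the toy numbers are hypothetical.

```python
import math


def r_squared(observed: list[float], predicted: list[float]) -> float:
    """Squared Pearson correlation between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov ** 2 / (var_o * var_p)


def rmse(observed: list[float], predicted: list[float]) -> float:
    """Root mean square error of the predictions."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


if __name__ == "__main__":
    # Toy values for illustration only; not data from the paper.
    experimental = [1.0, 2.0, 3.0, 4.0]
    calculated = [1.5, 2.5, 3.5, 4.5]
    print(r_squared(experimental, calculated), rmse(experimental, calculated))
```

A constant offset between experimental and calculated values leaves r2 untouched but shows up in the error term, which is one reason the table reports both.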
models are based on SMILES notation optimal descriptors and were developed with application of the Monte Carlo method. The predictive potential of the applied approach was tested with 3 random splits into the subtraining, calibration, test, and validation sets and with different statistical methods. All models considered in the present study are characterized by the following features: 1) every time, the best statistical characteristics for the calibration set are accompanied by satisfactory predictive potential (the statistical characteristics for the external validation set), and 2) the balance of correlation approach gives better predictions than the traditional scheme. Both features demonstrate that the Monte Carlo method–based modeling incorporated in the CORAL software is a very promising computational method in QSAR studies for risk assessment related to toxicity of organic chemicals to D. magna. The SMILES attributes (both local and global) that are promoters of toxicity increase or decrease were identified and defined. These structural features are related to organic chemical toxicity, and their identification helped to improve the understanding of organic chemical toxicity toward D. magna. The Monte Carlo calculations described in the present study can be reproduced using the CORAL software.

Supplemental Data: The Supplemental Data are available on the Wiley Online Library at DOI:10.1002/etc.3466.

Acknowledgment: A.A. Toropov and A.P. Toropova thank the European Commission project PeptiCAPS (project 686141). A.M. Veselinovic and J.B. Veselinovic acknowledge support from the Ministry of Education and Science, the Republic of Serbia (project 43012). D. Leszczynska and J. Leszczynski acknowledge support from the National Science Foundation (NSF/CREST HRD-0833178) and EPSCoR (362492-190200-01/NSFEPS-090378).

Data availability: Data were taken from Zhang et al. [27]. In addition, the Supplemental Data contain the data as Excel files.

REFERENCES
1. Mackay D, Hubbarde J, Webster E. 2003. The role of QSARs and fate models in chemical hazard and risk assessment. QSAR Comb Sci 22:106–112.
2. Furtula B, Gutman I. 2011. Relation between second and third geometric-arithmetic indices of trees. J Chemom 25:87–91.
3. Afantitis A, Melagraki G, Koutentis PA, Sarimveis H, Kollias G. 2011. Ligand-based virtual screening procedure for the prediction and the identification of novel β-amyloid aggregation inhibitors using Kohonen maps and counterpropagation artificial neural networks. Eur J Med Chem 46:497–508.
4. Afantitis A, Melagraki G, Sarimveis H, Koutentis PA, Igglessi-Markopoulou O, Kollias G. 2010. A combined LS-SVM & MLR QSAR workflow for predicting the inhibition of CXCR3 receptor by quinazolinone analogs. Mol Divers 14:225–235.
5. Duchowicz PR, Comelli NC, Ortiz EV, Castro EA. 2012. QSAR study for carcinogenicity in a large set of organic compounds. Curr Drug Saf 7:282–288.
6. Ghaedi A. 2015. Predicting the cytotoxicity of ionic liquids using QSAR model based on SMILES optimal descriptors. J Mol Liq 208:269–279.
7. Li Q, Ding X, Si H, Gao H. 2014. QSAR model based on SMILES of inhibitory rate of 2,3-diarylpropenoic acids on AKR1C3. Chemometr Intell Lab Syst 139:132–138.
8. Masand VH, Toropov AA, Toropova AP, Mahajan DT. 2014. QSAR models for anti-malarial activity of 4-aminoquinolines. Curr Comput Aided Drug Des 10:75–82.
9. Scotti L, Lima EO, da Silva MS, Ishiki H, Lima IO, Pereira FO, Mendonça FJB Jr, Scotti MT. 2014. Docking and PLS studies on a set of thiophenes RNA polymerase inhibitors against Staphylococcus aureus. Curr Top Med Chem 14:64–80.
10. Scotti L, Ishiki H, Ferreira MJP, Francisco JBM Jr, De P Emerenciano V, Silva MS, Scotti MT. 2012. In silico methods applied in food chemistry: A short review with bitter and mutagenic compounds. Lett Drug Des Discov 9:527–534.
11. Speck-Planche A, Kleandrova VV, Luan F, Cordeiro MNDS. 2015. Computational modeling in nanomedicine: Prediction of multiple antibacterial profiles of nanoparticles using a quantitative structure–activity relationship perturbation model. Nanomedicine 10:193–204.
12. Torrens F, Castellano G. 2014. QSPR prediction of chromatographic retention times of pesticides: Partition and fractal indices. J Environ Sci Health B 49:400–407.
13. Torrens F, Castellano G. 2012. QSPR prediction of retention times of phenylurea herbicides by biological plastic evolution. Curr Drug Saf 7:262–268.
14. van der Jagt K, Munn S, Torslov J, de Bruijn J, eds. 2004. Alternative approaches can reduce the use of test animals under REACH. Addendum to: Assessment of additional testing needs under REACH effects of (Q)SARS, risk based testing and voluntary industry initiatives. IHCP report EUR 21405 EN. Joint Research Centre Institute for Health and Consumer Protection, European Commission, Ispra, Italy.
15. Ivanciuc O. 2013. Chemical graphs, molecular matrices and topological indices in chemoinformatics and quantitative structure-activity relationships. Curr Comput Aided Drug Des 9:153–163.
16. Weininger D. 1988. SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules. J Chem Inf Comput Sci 28:31–36.
17. Weininger D, Weininger A, Weininger JL. 1989. SMILES. 2. Algorithm for generation of unique SMILES notation. J Chem Inf Comput Sci 29:97–101.
18. Weininger D. 1990. SMILES. 3. Depict. Graphical depiction of chemical structures. J Chem Inf Comput Sci 30:237–243.
19. Zivkovic JV, Trutic NV, Veselinovic JB, Nikolic GM, Veselinovic AM. 2015. Monte Carlo method based QSAR modeling of maleimide derivatives as glycogen synthase kinase-3β inhibitors. Comput Biol Med 64:276–282.
20. Veselinovic JB, Nikolic GM, Trutic NV, Zivkovic JV, Veselinovic AM. 2015. Monte Carlo QSAR models for predicting organophosphate inhibition of acetylcholinesterase. SAR QSAR Environ Res 26:449–460.
21. Veselinovic AM, Veselinovic JB, Zivkovic JV, Nikolic GM. 2015. Application of smiles notation based optimal descriptors in drug discovery and design. Curr Top Med Chem 15:1768–1779.
22. Achary PGR. 2014. QSPR modelling of dielectric constants of π-conjugated organic compounds by means of the CORAL software. SAR QSAR Environ Res 25:507–526.
23. Worachartcheewan A, Nantasenamat C, Isarankura-Na-Ayudhya C, Prachayasittikul V. 2014. QSAR study of H1N1 neuraminidase inhibitors from influenza A virus. Lett Drug Des Discov 11:420–427.
QSAR models for toxicity to Daphnia magna Environ Toxicol Chem 35, 2016 2697
24. Achary PGR. 2014. Simplified molecular input line entry system-based optimal descriptors: QSAR modelling for voltage-gated potassium channel subunit Kv7.2. SAR QSAR Environ Res 25:73–90.
25. García J, Duchowicz PR, Rozas MF, Caram JA, Mirífico MV, Fernandez FM, Castro EA. 2011. A comparative QSAR on 1,2,5-thiadiazolidin-3-one 1,1-dioxide compounds as selective inhibitors of human serine proteinases. J Mol Graph Model 31:10–19.
26. Mullen LMA, Duchowicz PR, Castro EA. 2011. QSAR treatment on a new class of triphenylmethyl-containing compounds as potent anticancer agents. Chemometr Intell Lab Syst 107:269–275.
27. Zhang X, Qin W, He J, Wen Y, Su L, Sheng L, Zhao Y. 2013. Discrimination of excess toxicity from narcotic effect: Comparison of toxicity of class-based organic chemicals to Daphnia magna and Tetrahymena pyriformis. Chemosphere 93:397–407.
28. Toropova AP, Toropov AA, Benfenati E, Leszczynska D, Leszczynski J. 2015. QSAR model as a random event: A case of rat toxicity. Bioorg Med Chem 23:1223–1230.
29. Toropova AP, Toropov AA, Benfenati E, Gini G, Leszczynska D, Leszczynski J. 2011. CORAL: Quantitative structure-activity relationship models for estimating toxicity of organic compounds in rats. J Comput Chem 32:2727–2733.
30. Toropova AP, Toropov AA, Veselinovic JB, Veselinovic AM. 2015. QSAR as a random event: A case of NOAEL. Environ Sci Pollut Res Int 22:8264–8271.
31. Toropova AP, Toropov AA, Rallo R, Leszczynska D, Leszczynski J. 2015. Optimal descriptor as a translator of eclectic data into prediction of cytotoxicity for metal oxide nanoparticles under different conditions. Ecotoxicol Environ Saf 112:39–45.
32. Toropova MA, Toropov AA, Raska I, Raskova M. 2015. Searching therapeutic agents for treatment of Alzheimer disease using the Monte Carlo method. Comput Biol Med 64:148–154.
33. Ojha PK, Mitra I, Das RN, Roy K. 2011. Further exploring rm2 metrics for validation of QSPR models. Chemometr Intell Lab Syst 107:194–205.
34. Vikas R. 2015. Exploring the role of quantum chemical descriptors in modeling acute toxicity of diverse chemicals to Daphnia magna. J Mol Graph Model 61:89–101.
35. Kar S, Roy K. 2010. QSAR modeling of toxicity of diverse organic chemicals to Daphnia magna using 2D and 3D descriptors. J Hazard Mater 177:344–351.
36. Cassani S, Kovarich S, Papa E, Roy PP, van der Wal L, Gramatica P. 2013. Daphnia and fish toxicity of (benzo)triazoles: Validated QSAR models, and interspecies quantitative activity–activity modeling. J Hazard Mater 258–259:50–60.
37. Toropova AP, Toropov AA, Martyanov SE, Benfenati E, Gini G, Leszczynska D, Leszczynski J. 2012. CORAL: QSAR modeling of toxicity of organic chemicals toward Daphnia magna. Chemometr Intell Lab Syst 110:177–181.
38. Organisation for Economic Co-operation and Development. 2007. Guidance document on the validation of (quantitative) structure–activity relationship [(Q)SAR] models. Series on Testing and Assessment, No. 69. ENV/JM/MONO(2007)2. Paris, France.