4 Ethyl 2 Toxicity Report


Molecule TOPKAT_Aerobic_Biodegradability

C20H13BrCl2F3N2[?]
Molecular Weight: 489.13582
ALogP: 7.747
Rotatable Bonds: 4
Acceptors: 2
Donors: 1

Structural Similar Compounds

Name: Phenol,_4,4'-(1-methylethylidene)bis_2,6-dibromo- | Dicofol | Phenol,_2,2'-methylenebis_3,4,6-trichloro-
Actual Endpoint: Non-Degradable | Non-Degradable | Non-Degradable
Predicted Endpoint: Non-Degradable | Non-Degradable | Non-Degradable
Distance: 0.737 | 0.743 | 0.757
Reference: Environmental Toxicology & Chemistry 18(9), 1763-1768, 1999. | Environmental Toxicology & Chemistry 18(9), 1763-1768, 1999. | Environmental Toxicology & Chemistry 18(9), 1763-1768, 1999.

Model Prediction

Prediction: Non-Degradable
Probability: 0.00624
Enrichment: 0.0143
Bayesian Score: -27.3
Mahalanobis Distance: 17.8
Mahalanobis Distance p-value: 3.61e-022

Prediction: Positive if the Bayesian score is above the estimated best cutoff value from minimizing the false positive and false negative rate.
Probability: The estimated probability that the sample is in the positive category. This assumes that the Bayesian score follows a normal distribution and is different from the prediction using a cutoff.
Enrichment: An estimate of enrichment, that is, the increased likelihood (versus random) of this sample being in the category.
Bayesian Score: The standard Laplacian-modified Bayesian score.
Mahalanobis Distance: The Mahalanobis distance (MD) is the distance to the center of the training data. The larger the MD, the less trustworthy the prediction.
Mahalanobis Distance p-value: The p-value gives the fraction of training data with an MD greater than or equal to the one for the given sample, assuming normally distributed data. The smaller the p-value, the less trustworthy the prediction. For highly non-normal X properties (e.g., fingerprints), the MD p-value is wildly inaccurate.

Model Applicability

Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. All properties and OPS components are within expected ranges.

Feature Contribution

Top features for positive contribution

Fingerprint | Bit/Smiles | Score | Degradable in training set
SCFP_12 | 136597326 | 0.36 | 179 out of 307
SCFP_12 | 0 | 0.223 | 328 out of 646

Top features for negative contribution

Fingerprint | Bit/Smiles | Score | Degradable in training set
SCFP_12 | -52074512 | -1.87 | 5 out of 93
SCFP_12 | -601571304 | -1.84 | 5 out of 91
SCFP_12 | 26 | -1.62 | 0 out of 10
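
The report calls the Bayesian score "the standard Laplacian-modified Bayesian score" but does not print the formula. The sketch below uses the commonly published per-feature form, ln((A + 1) / (T * P + 1)), where A is the number of positive training samples containing the feature, T is the number of all training samples containing it, and P is the overall fraction of positives in the training set. The function name and the value of P are assumptions: the report gives no base rate, and 0.405 is a hypothetical value picked so that the "0 out of 10" row lands near its reported score of -1.62.

```python
import math

def laplacian_score(hits: int, total: int, base_rate: float) -> float:
    """Log of the Laplacian-corrected relative estimate for one fingerprint feature.

    hits      -- training samples with the feature that are in the positive class
    total     -- all training samples with the feature
    base_rate -- overall fraction of positives in the training set (assumed here)
    """
    return math.log((hits + 1) / (total * base_rate + 1))

# Hypothetical base rate; the report does not state it.
BASE_RATE = 0.405

# (hits, total) pairs copied from the feature contribution tables of this section.
for hits, total in [(179, 307), (5, 93), (0, 10)]:
    score = laplacian_score(hits, total, BASE_RATE)
    print(f"{hits} out of {total}: {score:+.2f}")
```

Under this form a feature seen in few training samples is pulled toward a score of zero, which is why "0 out of 10" scores less negatively than "5 out of 93" even though none of its ten samples was degradable.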


Molecule TOPKAT_Ames_Mutagenicity

C20H13BrCl2F3N2[?]
Molecular Weight: 489.13582
ALogP: 7.747
Rotatable Bonds: 4
Acceptors: 2
Donors: 1

Structural Similar Compounds

Name: CLOFAZIMINE | 809-73-4 | 79-94-7
Actual Endpoint: Non-Mutagen | Non-Mutagen | Non-Mutagen
Predicted Endpoint: Non-Mutagen | Non-Mutagen | Non-Mutagen
Distance: 0.619 | 0.637 | 0.652
Reference: PDR 1994 | Kazius et al., J. Med. Chem. (2005) 48, 312-320 | Kazius et al., J. Med. Chem. (2005) 48, 312-320

Model Prediction

Prediction: Non-Mutagen
Probability: 0.309
Enrichment: 0.554
Bayesian Score: -11.6
Mahalanobis Distance: 12.4
Mahalanobis Distance p-value: 0.000136

Model Applicability

Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. All properties and OPS components are within expected ranges.

Feature Contribution

Top features for positive contribution

Fingerprint | Bit/Smiles | Score | Mutagen in training set
SCFP_12 | -1798344807 | 0.437 | 81 out of 91
SCFP_12 | 1137425890 | 0.337 | 2 out of 2
SCFP_12 | 668219294 | 0.328 | 21 out of 26

Top features for negative contribution

Fingerprint | Bit/Smiles | Score | Mutagen in training set
SCFP_12 | -300280774 | -1.51 | 3 out of 30
SCFP_12 | -1903175541 | -1.51 | 3 out of 30
SCFP_12 | -1211133908 | -1.51 | 2 out of 22
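
The report defines a prediction as positive when the Bayesian score lies above a best cutoff found by "minimizing the false positive and false negative rate", but it lists neither the cutoff nor how the two rates are combined. The sketch below is one plausible reading, not the tool's documented procedure: it scans midpoints between sorted training scores and keeps the cutoff with the smallest sum of the two rates. The function name, the sum criterion, and the training scores and labels are all hypothetical; only the query score of -11.6 comes from this section.

```python
def best_cutoff(scores, labels):
    """Return the cutoff that minimizes false positive rate + false negative rate.

    A sample is called positive when its score is strictly above the cutoff.
    Assumes both classes are present in `labels`.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    ordered = sorted(set(scores))
    # Candidates: one value below every score, then midpoints between neighbours.
    candidates = [ordered[0] - 1.0]
    candidates += [(a + b) / 2 for a, b in zip(ordered, ordered[1:])]
    best, best_error = None, float("inf")
    for cutoff in candidates:
        false_pos = sum(1 for s, y in zip(scores, labels) if s > cutoff and not y)
        false_neg = sum(1 for s, y in zip(scores, labels) if s <= cutoff and y)
        error = false_pos / negatives + false_neg / positives
        if error < best_error:  # ties keep the lower cutoff
            best, best_error = cutoff, error
    return best

# Hypothetical training scores and labels (True = positive class, e.g. mutagen).
train_scores = [-12.0, -8.5, -3.0, -1.0, 0.5, 2.0, 4.5, 6.0]
train_labels = [False, False, False, True, False, True, True, True]

cutoff = best_cutoff(train_scores, train_labels)
query_score = -11.6  # Bayesian score reported in this section
print("positive" if query_score > cutoff else "negative")
```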


Molecule TOPKAT_Developmental_Toxicity_Potential

C20H13BrCl2F3N2[?]
Molecular Weight: 489.13582
ALogP: 7.747
Rotatable Bonds: 4
Acceptors: 2
Donors: 1

Structural Similar Compounds

Name: Hexachlorophene | Benzbromarone | Pimozide
Actual Endpoint: Toxic | Toxic | Non-Toxic
Predicted Endpoint: Toxic | Toxic | Non-Toxic
Distance: 0.667 | 0.699 | 0.744
Reference: Teratology 12:83-88; 1975 | Shinryo to Shinaku 16:1521-1545; 1979 | Kiso to Rinsho 14:2163-2170; 1980

Model Prediction

Prediction: Non-Toxic
Probability: 0.495
Enrichment: 0.94
Bayesian Score: -1.45
Mahalanobis Distance: 10.6
Mahalanobis Distance p-value: 0.00428

Model Applicability

Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. All properties and OPS components are within expected ranges.

Feature Contribution

Top features for positive contribution

Fingerprint | Bit/Smiles | Score | Toxic in training set
SCFP_6 | -1505150337 | 0.369 | 5 out of 6
SCFP_6 | 1690735823 | 0.298 | 6 out of 8
SCFP_6 | -1211866396 | 0.21 | 8 out of 12

Top features for negative contribution

Fingerprint | Bit/Smiles | Score | Toxic in training set
SCFP_6 | -1794974220 | -0.55 | 2 out of 8
SCFP_6 | -1798553344 | -0.358 | 3 out of 9
SCFP_6 | 384920865 | -0.33 | 5 out of 14


Molecule TOPKAT_Mouse_Female_FDA_None_vs_Carcinogen

C20H13BrCl2F3N2[?]
Molecular Weight: 489.13582
ALogP: 7.747
Rotatable Bonds: 4
Acceptors: 2
Donors: 1

Structural Similar Compounds

Name: Hexachlorophene | Quazepam | Pimozide
Actual Endpoint: Non-Carcinogen | Non-Carcinogen | Carcinogen
Predicted Endpoint: Non-Carcinogen | Non-Carcinogen | Carcinogen
Distance: 0.700 | 0.743 | 0.760
Reference: US FDA (Centre for Drug Eval.& Res./Off. Testing & Res.) Sept. 1997 | US FDA (Centre for Drug Eval.& Res./Off. Testing & Res.) Sept. 1997 | US FDA (Centre for Drug Eval.& Res./Off. Testing & Res.) Sept. 1997

Model Prediction

Prediction: Non-Carcinogen
Probability: 0.206
Enrichment: 0.642
Bayesian Score: -7.17
Mahalanobis Distance: 15.2
Mahalanobis Distance p-value: 1.06e-007

Model Applicability

Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. All properties and OPS components are within expected ranges.
2. Unknown ECFP_2 feature: -1305021906: [*]['?']
3. Unknown ECFP_2 feature: -1659020767: [*][c](:[*]):[c](['?']):[c]([*]):[*]
4. Unknown ECFP_2 feature: 432684389: ['?'][c](:[*]):[*]
5. Unknown ECFP_2 feature: 1334250623: [*][c](:[*]):[c](Br):[cH]:[*]

Feature Contribution

Top features for positive contribution

Fingerprint | Bit/Smiles | Score | Carcinogen in training set
ECFP_6 | -1046436026 | 0.104 | 8 out of 23
ECFP_6 | -938530932 | 0.0661 | 8 out of 24

Top features for negative contribution

Fingerprint | Bit/Smiles | Score | Carcinogen in training set
ECFP_6 | 279095090 | -0.805 | 0 out of 4
ECFP_6 | -302078100 | -0.805 | 0 out of 4
ECFP_6 | 1335691903 | -0.669 | 3 out of 22

Molecule TOPKAT_Mouse_Female_NTP

C20H13BrCl2F3N2[?]
Molecular Weight: 489.13582
ALogP: 7.747
Rotatable Bonds: 4
Acceptors: 2
Donors: 1

Structural Similar Compounds

Name: Dicofol | 4;4'-Thiobis-(6-tert-butyl-m-cresol) | 1-trans-delta(sup 9)-tetrahydrocannabinol
Actual Endpoint: Non-Carcinogen | Non-Carcinogen | Carcinogen
Predicted Endpoint: Non-Carcinogen | Non-Carcinogen | Carcinogen
Distance: 0.740 | 0.745 | 0.782
Reference: NTP/TR-090 | NTP/TR-435 | NTP446

Model Prediction

Prediction: Carcinogen
Probability: 0.699
Enrichment: 1.78
Bayesian Score: 3
Mahalanobis Distance: 12.7
Mahalanobis Distance p-value: 1.47e-008

Model Applicability

Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. All properties and OPS components are within expected ranges.
2. Unknown ECFP_2 feature: -1305021906: [*]['?']
3. Unknown ECFP_2 feature: -428002189: [*]:[cH]:[c](:n:[*])[c](:[*]):[*]
4. Unknown ECFP_2 feature: -1659020767: [*][c](:[*]):[c](['?']):[c]([*]):[*]
5. Unknown ECFP_2 feature: 432684389: ['?'][c](:[*]):[*]

Feature Contribution

Top features for positive contribution

Fingerprint | Bit/Smiles | Score | Carcinogen in training set
ECFP_8 | -302078100 | 0.575 | 10 out of 14
ECFP_8 | -1734834311 | 0.544 | 2 out of 2
ECFP_8 | 1334250623 | 0.544 | 2 out of 2

Top features for negative contribution

Fingerprint | Bit/Smiles | Score | Carcinogen in training set
ECFP_8 | -813997308 | -0.279 | 2 out of 8
ECFP_8 | -1085223908 | -0.268 | 6 out of 22
ECFP_8 | 279095090 | -0.216 | 1 out of 4


Molecule TOPKAT_Mouse_Male_FDA_None_vs_Carcinogen

C20H13BrCl2F3N2[?]
Molecular Weight: 489.13582
ALogP: 7.747
Rotatable Bonds: 4
Acceptors: 2
Donors: 1

Structural Similar Compounds

Name: Hexachlorophene | Pimozide | Quazepam
Actual Endpoint: Non-Carcinogen | Non-Carcinogen | Non-Carcinogen
Predicted Endpoint: Non-Carcinogen | Non-Carcinogen | Non-Carcinogen
Distance: 0.691 | 0.732 | 0.736
Reference: US FDA (Centre for Drug Eval.& Res./Off. Testing & Res.) Sept. 1997 | US FDA (Centre for Drug Eval.& Res./Off. Testing & Res.) Sept. 1997 | US FDA (Centre for Drug Eval.& Res./Off. Testing & Res.) Sept. 1997

Model Prediction

Prediction: Non-Carcinogen
Probability: 0.252
Enrichment: 0.858
Bayesian Score: -2.18
Mahalanobis Distance: 22.4
Mahalanobis Distance p-value: 1.37e-024

Model Applicability

Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. All properties and OPS components are within expected ranges.

Feature Contribution

Top features for positive contribution

Fingerprint | Bit/Smiles | Score | Carcinogen in training set
FCFP_6 | 71953198 | 0.612 | 12 out of 23
FCFP_6 | -1151884458 | 0.348 | 6 out of 15
FCFP_6 | 690511177 | 0.342 | 3 out of 7

Top features for negative contribution

Fingerprint | Bit/Smiles | Score | Carcinogen in training set
FCFP_6 | -497728148 | -0.96 | 2 out of 26
FCFP_6 | 551850122 | -0.433 | 8 out of 49
FCFP_6 | -307477426 | -0.423 | 0 out of 2
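
The Mahalanobis distance and its p-value are reported without the underlying calculation. As a rough illustration of why a distance such as 22.4 can produce a p-value as small as 1.37e-024, the sketch below makes two assumptions that the report does not confirm: the descriptor components are uncorrelated (so the distance reduces to a norm of z-scores), and the squared distance follows a chi-square distribution under the normality assumption the report mentions. The function names, the number of components, and all sample values are hypothetical.

```python
import math

def mahalanobis_diagonal(sample, means, std_devs):
    """Mahalanobis distance for uncorrelated components.

    With a diagonal covariance matrix the distance is the Euclidean norm
    of the per-component z-scores.
    """
    z_squared = [((x - m) / s) ** 2 for x, m, s in zip(sample, means, std_devs)]
    return math.sqrt(sum(z_squared))

def chi2_survival_even(md_squared, dof):
    """P(X >= md_squared) for a chi-square variable with an even dof.

    Uses the closed form exp(-x/2) * sum_{i < dof/2} (x/2)^i / i!.
    """
    if dof <= 0 or dof % 2:
        raise ValueError("this closed form needs a positive, even dof")
    half = md_squared / 2.0
    term, total = 1.0, 1.0
    for i in range(1, dof // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total

# Hypothetical 4-component query and training statistics.
sample = [2.0, -1.0, 0.5, 3.0]
means = [0.0, 0.0, 0.0, 0.0]
std_devs = [1.0, 2.0, 0.5, 1.5]

md = mahalanobis_diagonal(sample, means, std_devs)
print(md, chi2_survival_even(md ** 2, dof=len(sample)))
```

Because the tail probability decays exponentially in the squared distance, modest increases in MD shrink the p-value by many orders of magnitude, which fits the report's warning that the p-value is unreliable for non-normal properties.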


Molecule TOPKAT_Mouse_Male_NTP

C20H13BrCl2F3N2[?]
Molecular Weight: 489.13582
ALogP: 7.747
Rotatable Bonds: 4
Acceptors: 2
Donors: 1

Structural Similar Compounds

Name: DICOFOL | 4;4'-THIOBIS(6-t-BUTYL-m-CRESOL) | 1-trans-delta(sup 9)-tetrahydrocannabinol
Actual Endpoint: Carcinogen | Non-Carcinogen | Carcinogen
Predicted Endpoint: Carcinogen | Non-Carcinogen | Carcinogen
Distance: 0.725 | 0.739 | 0.757
Reference: NTP/TR-90 | NTP/TR-435 | NTP446

Model Prediction

Prediction: Carcinogen
Probability: 0.745
Enrichment: 1.89
Bayesian Score: 3.48
Mahalanobis Distance: 13.2
Mahalanobis Distance p-value: 4.96e-009

Model Applicability

Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. All properties and OPS components are within expected ranges.

Feature Contribution

Top features for positive contribution

Fingerprint | Bit/Smiles | Score | Carcinogen in training set
SCFP_12 | 28 | 0.661 | 11 out of 14
SCFP_12 | 1808857543 | 0.638 | 3 out of 3
SCFP_12 | -1505150337 | 0.638 | 3 out of 3

Top features for negative contribution

Fingerprint | Bit/Smiles | Score | Carcinogen in training set
SCFP_12 | -1794974220 | -0.555 | 0 out of 2
SCFP_12 | -300280774 | -0.316 | 0 out of 1
SCFP_12 | 1498989769 | -0.316 | 0 out of 1


Molecule TOPKAT_Ocular_Irritancy_Mild_vs_Moderate_Severe

C20H13BrCl2F3N2[?]
Molecular Weight: 489.13582
ALogP: 7.747
Rotatable Bonds: 4
Acceptors: 2
Donors: 1

Structural Similar Compounds

Name: 2;2'DIBENZANTHRONYL | 2-(1'-ANTHRAQUINONYL)-AMINOBENZANTHRONE | BORIC ACID; TRI-o-CHLOROPHENYL ESTER
Actual Endpoint: Mild | Mild | Moderate_Severe
Predicted Endpoint: Mild | Mild | Moderate_Severe
Distance: 0.724 | 0.740 | 0.768
Reference: 28ZPAK-;60;72 | 28ZPAK-;126;72 | 14KTAK -;706;64

Model Prediction

Prediction: Mild
Probability: 0.781
Enrichment: 1.13
Bayesian Score: -1.67
Mahalanobis Distance: 8.59
Mahalanobis Distance p-value: 0.722

Model Applicability

Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. All properties and OPS components are within expected ranges.
2. Unknown FCFP_2 feature: -1151884458: ['?'][c](:[*]):[c](N):n:[*]

Feature Contribution

Top features for positive contribution

Fingerprint | Bit/Smiles | Score | Moderate_Severe in training set
FCFP_10 | -497728148 | 0.356 | 24 out of 25
FCFP_10 | 136120670 | 0.206 | 53 out of 65
FCFP_10 | 3 | 0.165 | 383 out of 491

Top features for negative contribution

Fingerprint | Bit/Smiles | Score | Moderate_Severe in training set
FCFP_10 | 900733322 | -0.874 | 3 out of 13
FCFP_10 | -1861645784 | -0.598 | 9 out of 26
FCFP_10 | -617729047 | -0.507 | 0 out of 1


Molecule TOPKAT_Ocular_Irritancy_None_vs_Irritant

C20H13BrCl2F3N2[?]
Molecular Weight: 489.13582
ALogP: 7.747
Rotatable Bonds: 4
Acceptors: 2
Donors: 1

Structural Similar Compounds

Name: 2;2'DIBENZANTHRONYL | 2-(1'-ANTHRAQUINONYL)-AMINOBENZANTHRONE | ANTHRA(2;1;9-mna)NAPHTH(2;3-h)ACRIDINE-5;10;15-TRIONE
Actual Endpoint: Irritant | Irritant | Non-Irritant
Predicted Endpoint: Irritant | Irritant | Non-Irritant
Distance: 0.722 | 0.726 | 0.761
Reference: 28ZPAK-;60;72 | 28ZPAK-;126;72 | 28ZPAK -;248;72

Model Prediction

Prediction: Irritant
Probability: 1
Enrichment: 1.18
Bayesian Score: 1.79
Mahalanobis Distance: 7.43
Mahalanobis Distance p-value: 0.989

Model Applicability

Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. All properties and OPS components are within expected ranges.
2. Unknown FCFP_2 feature: -1151884458: ['?'][c](:[*]):[c](N):n:[*]

Feature Contribution

Top features for positive contribution

Fingerprint | Bit/Smiles | Score | Irritant in training set
FCFP_12 | 1747237384 | 0.208 | 44 out of 44
FCFP_12 | 17 | 0.189 | 48 out of 49
FCFP_12 | 71476542 | 0.175 | 81 out of 84

Top features for negative contribution

Fingerprint | Bit/Smiles | Score | Irritant in training set
FCFP_12 | 690511177 | -0.268 | 1 out of 2
FCFP_12 | 1069584379 | 0 | 85 out of 101
FCFP_12 | -453677277 | 0 | 264 out of 323


Molecule TOPKAT_Rat_Female_FDA_None_vs_Carcinogen

C20H13BrCl2F3N2[?]
Molecular Weight: 489.13582
ALogP: 7.747
Rotatable Bonds: 4
Acceptors: 2
Donors: 1

Structural Similar Compounds

Name: Hexachlorophene | Pimozide | Dronabinol
Actual Endpoint: Non-Carcinogen | Non-Carcinogen | Non-Carcinogen
Predicted Endpoint: Non-Carcinogen | Non-Carcinogen | Non-Carcinogen
Distance: 0.712 | 0.722 | 0.822
Reference: US FDA (Centre for Drug Eval.& Res./Off. Testing & Res.) Sept. 1997 | US FDA (Centre for Drug Eval.& Res./Off. Testing & Res.) Sept. 1997 | US FDA (Centre for Drug Eval.& Res./Off. Testing & Res.) Sept. 1997

Model Prediction

Prediction: Non-Carcinogen
Probability: 0.21
Enrichment: 0.653
Bayesian Score: -6.06
Mahalanobis Distance: 14.2
Mahalanobis Distance p-value: 8.27e-007

Model Applicability

Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. OPS PC19 out of range. Value: 3.5762. Training min, max, SD, explained variance: -4.1681, 3.4693, 1.279, 0.0172.
2. Unknown ECFP_2 feature: -1305021906: [*]['?']
3. Unknown ECFP_2 feature: -1659020767: [*][c](:[*]):[c](['?']):[c]([*]):[*]
4. Unknown ECFP_2 feature: 432684389: ['?'][c](:[*]):[*]
5. Unknown ECFP_2 feature: 1334250623: [*][c](:[*]):[c](Br):[cH]:[*]

Feature Contribution

Top features for positive contribution

Fingerprint | Bit/Smiles | Score | Carcinogen in training set
ECFP_12 | 459826767 | 0.613 | 2 out of 2
ECFP_12 | -302078100 | 0.575 | 3 out of 4
ECFP_12 | -428002189 | 0.208 | 1 out of 2

Top features for negative contribution

Fingerprint | Bit/Smiles | Score | Carcinogen in training set
ECFP_12 | -175507738 | -1.56 | 0 out of 12
ECFP_12 | 1335691903 | -1.11 | 2 out of 26
ECFP_12 | 279095090 | -0.941 | 0 out of 5
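
The applicability note for this model flags OPS PC19 as out of range. A minimal sketch of that kind of check follows, using the value, training minimum, maximum, and SD quoted in the note. The function names are hypothetical, and whether the tool treats the bounds as inclusive is an assumption; the report only states the outcome.

```python
def ops_in_range(value: float, train_min: float, train_max: float) -> bool:
    """True when the component lies inside the training range (bounds inclusive)."""
    return train_min <= value <= train_max

def excess_in_sd(value: float, train_min: float, train_max: float, train_sd: float) -> float:
    """How far outside the training range the value lies, in units of training SD."""
    if value > train_max:
        return (value - train_max) / train_sd
    if value < train_min:
        return (train_min - value) / train_sd
    return 0.0

# Figures quoted in the applicability note for OPS PC19.
value, train_min, train_max, train_sd = 3.5762, -4.1681, 3.4693, 1.279

print(ops_in_range(value, train_min, train_max))
print(round(excess_in_sd(value, train_min, train_max, train_sd), 3))
```

Expressing the overshoot in SD units gives a sense of scale: here the component exceeds the training maximum by well under one tenth of a standard deviation.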

Molecule TOPKAT_Rat_Female_NTP

C20H13BrCl2F3N2[?]
Molecular Weight: 489.13582
ALogP: 7.747
Rotatable Bonds: 4
Acceptors: 2
Donors: 1

Structural Similar Compounds

Name: HEXACHLOROPHENE | 4;4'-THIOBIS(6-t-BUTYL-m-CRESOL) | 1-TRANS-DELTA(9)-TETRAHYDROCANNABINOL
Actual Endpoint: Non-Carcinogen | Non-Carcinogen | Non-Carcinogen
Predicted Endpoint: Non-Carcinogen | Non-Carcinogen | Non-Carcinogen
Distance: 0.682 | 0.723 | 0.763
Reference: TR-40 | TR-435 | TR-446

Model Prediction

Prediction: Non-Carcinogen
Probability: 0.519
Enrichment: 1.14
Bayesian Score: -0.0904
Mahalanobis Distance: 17.5
Mahalanobis Distance p-value: 1.89e-020

Model Applicability

Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. OPS PC11 out of range. Value: 3.7949. Training min, max, SD, explained variance: -2.3897, 3.1905, 1.314, 0.0302.
2. Unknown FCFP_2 feature: -1029533685: [*]:[c](:[*])C(F)(F)F

Feature Contribution

Top features for positive contribution

Fingerprint | Bit/Smiles | Score | Carcinogen in training set
FCFP_12 | -617729047 | 0.575 | 3 out of 3
FCFP_12 | 71953198 | 0.423 | 26 out of 40
FCFP_12 | -1861645784 | 0.387 | 6 out of 9

Top features for negative contribution

Fingerprint | Bit/Smiles | Score | Carcinogen in training set
FCFP_12 | 551850122 | -0.824 | 2 out of 14
FCFP_12 | 907007053 | -0.487 | 5 out of 21
FCFP_12 | 690511177 | -0.349 | 0 out of 1


Molecule TOPKAT_Rat_Male_FDA_None_vs_Carcinogen

C20H13BrCl2F3N2[?]
Molecular Weight: 489.13582
ALogP: 7.747
Rotatable Bonds: 4
Acceptors: 2
Donors: 1

Structural Similar Compounds

Name: Pimozide | Dronabinol | Astemizole
Actual Endpoint: Non-Carcinogen | Non-Carcinogen | Non-Carcinogen
Predicted Endpoint: Non-Carcinogen | Non-Carcinogen | Non-Carcinogen
Distance: 0.755 | 0.812 | 0.814
Reference: US FDA (Centre for Drug Eval.& Res./Off. Testing & Res.) Sept. 1997 | US FDA (Centre for Drug Eval.& Res./Off. Testing & Res.) Sept. 1997 | US FDA (Centre for Drug Eval.& Res./Off. Testing & Res.) Sept. 1997

Model Prediction

Prediction: Non-Carcinogen
Probability: 0.321
Enrichment: 0.961
Bayesian Score: -1.29
Mahalanobis Distance: 16
Mahalanobis Distance p-value: 6.56e-009

Model Applicability

Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. All properties and OPS components are within expected ranges.

Feature Contribution

Top features for positive contribution

Fingerprint | Bit/Smiles | Score | Carcinogen in training set
SCFP_6 | 1334878018 | 0.333 | 8 out of 17
SCFP_6 | 384920865 | 0.322 | 11 out of 24
SCFP_6 | -1798344807 | 0.313 | 3 out of 6

Top features for negative contribution

Fingerprint | Bit/Smiles | Score | Carcinogen in training set
SCFP_6 | -1211866396 | -1.1 | 2 out of 25
SCFP_6 | -1211133908 | -0.496 | 0 out of 2
SCFP_6 | -1272709286 | -0.459 | 12 out of 61


Molecule TOPKAT_Rat_Male_NTP

C20H13BrCl2F3N2[?]
Molecular Weight: 489.13582
ALogP: 7.747
Rotatable Bonds: 4
Acceptors: 2
Donors: 1

Structural Similar Compounds
Name | 4;4'-Thiobis-(6-tert-butyl-m-cresol) | 1-trans-.delta.-9-Tetrahydrocannabinol | 1-trans-delta(sup 9)-tetrahydrocannabinol
Actual Endpoint | Non-Carcinogen | Non-Carcinogen | Non-Carcinogen
Predicted Endpoint | Non-Carcinogen | Non-Carcinogen | Non-Carcinogen
Distance | 0.742 | 0.780 | 0.780
Reference | NTP/TR-435 | NTP/TR-446 | NTP446

Model Prediction
Prediction: Non-Carcinogen
Probability: 0.533
Enrichment: 1.05
Bayesian Score: -2.01
Mahalanobis Distance: 9.42
Mahalanobis Distance p-value: 0.00699

Prediction: Positive if the Bayesian score is above the estimated best cutoff value from minimizing the false positive and false negative rate.
Probability: The estimated probability that the sample is in the positive category. This assumes that the Bayesian score follows a normal distribution and is different from the prediction using a cutoff.
Enrichment: An estimate of enrichment, that is, the increased likelihood (versus random) of this sample being in the category.
Bayesian Score: The standard Laplacian-modified Bayesian score.
Mahalanobis Distance: The Mahalanobis distance (MD) is the distance to the center of the training data. The larger the MD, the less trustworthy the prediction.
Mahalanobis Distance p-value: The p-value gives the fraction of training data with an MD greater than or equal to the one for the given sample, assuming normally distributed data. The smaller the p-value, the less trustworthy the prediction. For highly non-normal X properties (e.g., fingerprints), the MD p-value is wildly inaccurate.

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. OPS PC8 out of range. Value: 5.1581. Training min, max, SD, explained variance: -3.1888, 3.6693, 1.511, 0.0414.
2. Unknown ECFP_2 feature: -1305021906: [*]['?']
3. Unknown ECFP_2 feature: -428002189: [*]:[cH]:[c](:n:[*])[c](:[*]):[*]
4. Unknown ECFP_2 feature: -1659020767: [*][c](:[*]):[c](['?']):[c]([*]):[*]
5. Unknown ECFP_2 feature: 432684389: ['?'][c](:[*]):[*]

Feature Contribution
Top features for positive contribution
Fingerprint Bit/Smiles Feature Structure Score Carcinogen in training set
ECFP_12 459826767 0.47 3 out of 3
ECFP_12 -302078100 0.47 11 out of 13
ECFP_12 1334250623 0.405 2 out of 2

Top Features for negative contribution


Fingerprint Bit/Smiles Feature Structure Score Carcinogen in
training set
ECFP_12 -175507738 -0.693 0 out of 2

ECFP_12 279095090 -0.406 0 out of 1

ECFP_12 226796801 -0.406 0 out of 1
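
The per-feature scores in the two tables above look consistent with the commonly published Laplacian-corrected estimate, ln((A + 1) / (N * P + 1)), where A is the number of carcinogens among the N training compounds that contain the feature and P is the baseline fraction of carcinogens in the training set. The sketch below is only an illustration of that formula: the function name is hypothetical, and the baseline P = 0.5 is an assumption inferred from the tabulated values, not a figure stated in this report.

```python
import math


def laplacian_feature_score(actives: int, total: int, base_rate: float) -> float:
    """Laplacian-corrected log contribution of a single fingerprint feature.

    actives   -- training compounds with the feature that are in the positive class
    total     -- all training compounds with the feature
    base_rate -- overall fraction of positives in the training set (assumed here)
    """
    return math.log((actives + 1) / (total * base_rate + 1))


# (bit, carcinogens with feature, compounds with feature, score printed in the report)
rows = [
    (459826767, 3, 3, 0.47),
    (-302078100, 11, 13, 0.47),
    (1334250623, 2, 2, 0.405),
    (-175507738, 0, 2, -0.693),
    (279095090, 0, 1, -0.406),
]

ASSUMED_BASE_RATE = 0.5  # assumption: inferred from the table, not given in the report

for bit, actives, total, reported in rows:
    computed = laplacian_feature_score(actives, total, ASSUMED_BASE_RATE)
    print(f"{bit:>12}  computed={computed:+.3f}  reported={reported:+.3f}")
```

A feature seen only a few times is pulled toward a score of zero by the "+ 1" terms, which is why "3 out of 3" scores no higher than "11 out of 13" here.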


Molecule TOPKAT_Skin_Irritancy_None_vs_Irritant

C20H13BrCl2F3N2[?]
Molecular Weight: 489.13582
ALogP: 7.747
Rotatable Bonds: 4
Acceptors: 2
Donors: 1

Structural Similar Compounds
Name | N-o-tolylmaleidide | Propane, 2,2-bis(3'-t-octyl-4'-hydroxyphenyl)- | Phenol, 4,4'-isopropylidenebis(2,6-dichloro-
Actual Endpoint | Irritant | Irritant | Irritant
Predicted Endpoint | Irritant | Irritant | Irritant
Distance | 0.704 | 0.790 | 0.814
Reference | US ARMY | 85JCAE "Prehled Prumyslove Toxikologie; Organicke Latky," Marhold, J., Prague, Czechoslovakia, Avicenum, 1986 Volume(issue)/page/year: -,241,1986 | 85JCAE "Prehled Prumyslove Toxikologie; Organicke Latky," Marhold, J., Prague, Czechoslovakia, Avicenum, 1986 Volume(issue)/page/year: -,536,1986

Model Prediction
Prediction: Non-Irritant
Probability: 0.967
Enrichment: 1.05
Bayesian Score: -1.19
Mahalanobis Distance: 8.46
Mahalanobis Distance p-value: 0.708

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. All properties and OPS components are within expected ranges.

Feature Contribution
Top features for positive contribution
Fingerprint Bit/Smiles Feature Structure Score Irritant in training set
FCFP_12 -1029533685 0.0756 6 out of 6
FCFP_12 192331578 0.0756 6 out of 6
FCFP_12 366010441 0.0703 4 out of 4

Top Features for negative contribution


Fingerprint Bit/Smiles Feature Structure Score Irritant in training
set
FCFP_12 1069584379 -0.439 38 out of 65

FCFP_12 900733322 -0.153 3 out of 4


FCFP_12 367998008 -0.129 61 out of 76
Molecule TOPKAT_Skin_Sensitization_None_vs_Sensitizer

C20H13BrCl2F3N2[?]
Molecular Weight: 489.13582
ALogP: 7.747
Rotatable Bonds: 4
Acceptors: 2
Donors: 1

Structural Similar Compounds
Name | 4;4'-Isopropylidene diphenol | Dehydroabietic Acid | 2-Hydroxy-2-n-octoxybenzophenone
Actual Endpoint | Sensitizer | Sensitizer | Sensitizer
Predicted Endpoint | Sensitizer | Sensitizer | Sensitizer
Distance | 0.886 | 0.897 | 0.913
Reference | Howard I Maibach (priv comm) | Contact Dermatitis (1989) 20:41 | SAR and QSAR in Env Res (1994) 2:159

Model Prediction
Prediction: Sensitizer
Probability: 0.854
Enrichment: 1.24
Bayesian Score: 1.5
Mahalanobis Distance: 8.91
Mahalanobis Distance p-value: 0.00209

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. All properties and OPS components are within expected ranges.
2. Unknown FCFP_2 feature: -1861645784: ['?'][c](:[*]):[c](:[cH]:[*])[c](:[*]):[*]
3. Unknown FCFP_2 feature: 690511177: [*]:[cH]:[c](:n:[*])[c](:[*]):[*]

Feature Contribution
Top features for positive contribution
Fingerprint Bit/Smiles Feature Structure Score Sensitizer in training set
FCFP_12 -497728148 0.304 15 out of 15
FCFP_12 71953198 0.258 18 out of 19
FCFP_12 907007053 0.254 17 out of 18

Top Features for negative contribution


Fingerprint Bit/Smiles Feature Structure Score Sensitizer in
training set
FCFP_12 136120670 -0.245 15 out of 27

FCFP_12 554276193 -0.199 1 out of 2

FCFP_12 3 -0.0947 89 out of 136


Molecule TOPKAT_Skin_Sensitization_Weak_vs_Strong

C20H13BrCl2F3N2[?]
Molecular Weight: 489.13582
ALogP: 7.747
Rotatable Bonds: 4
Acceptors: 2
Donors: 1

Structural Similar Compounds
Name | 4;4'-Isopropylidene diphenol | Dehydroabietic Acid | Methyl abietate
Actual Endpoint | Strong-Sensitizer | Strong-Sensitizer | Strong-Sensitizer
Predicted Endpoint | Strong-Sensitizer | Strong-Sensitizer | Weak-Sensitizer
Distance | 0.928 | 0.939 | 0.952
Reference | Howard I Maibach (priv comm) | Contact Dermatitis (1989) 20:41 | Contact Dermatitis (1989) 20:41

Model Prediction
Prediction: Strong-Sensitizer
Probability: 0.985
Enrichment: 1.27
Bayesian Score: 2.99
Mahalanobis Distance: 7.84
Mahalanobis Distance p-value: 0.0107

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. All properties and OPS components are within expected ranges.
2. Unknown FCFP_2 feature: -1861645784: ['?'][c](:[*]):[c](:[cH]:[*])[c](:[*]):[*]
3. Unknown FCFP_2 feature: 690511177: [*]:[cH]:[c](:n:[*])[c](:[*]):[*]

Feature Contribution
Top features for positive contribution
Fingerprint Bit/Smiles Feature Structure Score Strong-Sensitizer in training set
FCFP_12 16 0.232 165 out of 165
FCFP_12 1618154665 0.232 164 out of 164
FCFP_12 203677720 0.232 139 out of 139

Top Features for negative contribution


Fingerprint Bit/Smiles Feature Structure Score Strong-Sensitizer
in training set
FCFP_12 136597326 -0.239 74 out of 119

FCFP_12 3 -0.131 61 out of 88

FCFP_12 0 0 186 out of 244


Molecule TOPKAT_Weight_of_Evidence_Rodent_Carcinogenicity

C20H13BrCl2F3N2[?]
Molecular Weight: 489.13582
ALogP: 7.747
Rotatable Bonds: 4
Acceptors: 2
Donors: 1

Structural Similar Compounds
Name | Hexachlorophene | Pimozide | Dicofol
Actual Endpoint | Non-Carcinogen | Carcinogen | Non-Carcinogen
Predicted Endpoint | Non-Carcinogen | Carcinogen | Non-Carcinogen
Distance | 0.711 | 0.756 | 0.764
Reference | US FDA (Centre for Drug Eval.& Res./Off. Testing & Res.) Sept. 1997 | US FDA (Centre for Drug Eval.& Res./Off. Testing & Res.) Sept. 1997 | NCI/NTP TR-90

Model Prediction
Prediction: Non-Carcinogen
Probability: 0.493
Enrichment: 0.958
Bayesian Score: -1.18
Mahalanobis Distance: 5.72
Mahalanobis Distance p-value: 0.959

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. All properties and OPS components are within expected ranges.

Feature Contribution
Top features for positive contribution
Fingerprint Bit/Smiles Feature Structure Score Carcinogen in training set
SCFP_8 28 0.549 9 out of 10
SCFP_8 1690735823 0.428 2 out of 2
SCFP_8 384920865 0.332 37 out of 55

Top Features for negative contribution


Fingerprint Bit/Smiles Feature Structure Score Carcinogen in
training set
SCFP_8 -1211866396 -0.685 6 out of 27

SCFP_8 -1211133908 -0.67 0 out of 2

SCFP_8 -1850396224 -0.67 0 out of 2


Molecule TOPKAT_Carcinogenic_Potency_TD50_Mouse

C20H13BrCl2F3N2[?]
Molecular Weight: 489.13582
ALogP: 7.747
Rotatable Bonds: 4
Acceptors: 2
Donors: 1

Structural Similar Compounds
Name | Dicofol | 1112 | Chlorobenzilate
Actual Endpoint (-log C) | 4.05158 | 4.92751 | 3.53947
Predicted Endpoint (-log C) | 3.80707 | 4.64089 | 3.34564
Distance | 0.757 | 0.799 | 0.813
Reference | CPDB | CPDB | CPDB

Model Prediction
Prediction: 2.13
Unit: mg/kg_body_weight/day
Mahalanobis Distance: 13.2
Mahalanobis Distance p-value: 3.76e-008

Mahalanobis Distance: The Mahalanobis distance (MD) is a generalization of the Euclidean distance that accounts for correlations among the X properties. It is calculated as the distance to the center of the training data. The larger the MD, the less trustworthy the prediction.
Mahalanobis Distance p-value: The p-value gives the fraction of training data with an MD greater than or equal to the one for the given sample, assuming normally distributed data. The smaller the p-value, the less trustworthy the prediction. For highly non-normal X properties (e.g., fingerprints), the MD p-value is wildly inaccurate.

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. All properties and OPS components are within expected ranges.
2. Unknown ECFP_2 feature: -1305021906: [*]['?']
3. Unknown ECFP_2 feature: -1659020767: [*][c](:[*]):[c](['?']):[c]([*]):[*]
4. Unknown ECFP_2 feature: 432684389: ['?'][c](:[*]):[*]

Feature Contribution
Top features for positive contribution
Fingerprint Bit/Smiles Feature Structure Score
ECFP_6 655739385 0.229
ECFP_6 1572579716 0.225
ECFP_6 1559650422 0.203

Top Features for negative contribution


Fingerprint Bit/Smiles Feature Structure Score
ECFP_6 1996767644 -0.251

ECFP_6 642810091 -0.247

ECFP_6 -182236392 -0.232
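
The Mahalanobis distance and its p-value, as defined in this section, can be illustrated with a small two-descriptor example. This is a sketch under stated assumptions and not TOPKAT's implementation: the training values are made up, the function names are hypothetical, and the p-value uses the chi-square survival function for two degrees of freedom, exp(-d^2 / 2), which is valid only under the normality assumption the report itself warns about.

```python
import math


def mahalanobis_2d(point, data):
    """Mahalanobis distance from `point` to the center of 2-D `data`."""
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    # Sample covariance matrix [[sxx, sxy], [sxy, syy]]
    sxx = sum((x - mean_x) ** 2 for x, _ in data) / (n - 1)
    syy = sum((y - mean_y) ** 2 for _, y in data) / (n - 1)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in data) / (n - 1)
    det = sxx * syy - sxy * sxy
    dx, dy = point[0] - mean_x, point[1] - mean_y
    # d^2 = v^T S^-1 v, written out with the closed-form 2x2 inverse
    d2 = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(max(d2, 0.0))


def md_p_value_2d(md):
    """Chi-square (2 degrees of freedom) tail probability for a given MD."""
    return math.exp(-md * md / 2.0)


# Made-up training descriptors (illustrative only, not taken from the report)
training = [(1.0, 2.0), (2.0, 2.5), (3.0, 3.9), (4.0, 4.1), (2.5, 3.0)]

near = mahalanobis_2d((2.5, 3.1), training)  # at the training center
far = mahalanobis_2d((7.7, 4.9), training)   # well outside the training data
print(near, md_p_value_2d(near))
print(far, md_p_value_2d(far))
```

A query far from the training center gets a large distance and a p-value near zero, which matches the report's reading that small p-values mean less trustworthy predictions.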


Molecule TOPKAT_Carcinogenic_Potency_TD50_Rat

C20H13BrCl2F3N2[?]
Molecular Weight: 489.13582
ALogP: 7.747
Rotatable Bonds: 4
Acceptors: 2
Donors: 1

Structural Similar Compounds
Name | 1112 | Nafenopin s | Indomethacin
Actual Endpoint (-log C) | 5.0006 | 4.45051 | 5.49293
Predicted Endpoint (-log C) | 4.54653 | 3.8403 | 4.9569
Distance | 0.755 | 0.773 | 0.774
Reference | CPDB | CPDB | CPDB

Model Prediction
Prediction: 1.17
Unit: mg/kg_body_weight/day
Mahalanobis Distance: 16.8
Mahalanobis Distance p-value: 7.55e-017

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. All properties and OPS components are within expected ranges.
2. Unknown FCFP_2 feature: -1029533685: [*]:[c](:[*])C(F)(F)F

Feature Contribution
Top features for positive contribution
Fingerprint Bit/Smiles Feature Structure Score
FCFP_6 -1861645784 0.359
FCFP_6 690511177 0.293
FCFP_6 32 0.154

Top Features for negative contribution


Fingerprint Bit/Smiles Feature Structure Score
FCFP_6 16 -0.354

FCFP_6 17 -0.149

FCFP_6 0 -0.115
Molecule TOPKAT_Chronic_LOAEL

C20H13BrCl2F3N2[?]
Molecular Weight: 489.13582
ALogP: 7.747
Rotatable Bonds: 4
Acceptors: 2
Donors: 1

Structural Similar Compounds
Name | CLOTRIMAZOLE | FLUVALINATE | BAYTHROID
Actual Endpoint (-log C) | 4.53762 | 5.30356 | 4.76272
Predicted Endpoint (-log C) | 4.55826 | 4.89944 | 5.1129
Distance | 0.764 | 0.782 | 0.782
Reference | NDA-18813 | EPA COVER SHEET 0281;880630;(1) | EPA COVER SHEET 0132;891101;(1)

Model Prediction
Prediction: 0.00174
Unit: g/kg_body_weight
Mahalanobis Distance: 25
Mahalanobis Distance p-value: 1.51e-015

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. All properties and OPS components are within expected ranges.
2. Unknown ECFP_6 feature: -1305021906: [*]['?']
3. Unknown ECFP_6 feature: -302078100: [*]Br
4. Unknown ECFP_6 feature: -1046436026: [*]F
5. Unknown ECFP_6 feature: -428002189: [*]:[cH]:[c](:n:[*])[c](:[*]):[*]
6. Unknown ECFP_6 feature: -1734834311: ['?'][c](:[*]):[c](N):n:[*]
7. Unknown ECFP_6 feature: -1659020767: [*][c](:[*]):[c](['?']):[c]([*]):[*]
8. Unknown ECFP_6 feature: 432684389: ['?'][c](:[*]):[*]
9. Unknown ECFP_6 feature: -938530932: [*]:[c](:[*])N
10. Unknown ECFP_6 feature: -813997308: [*][c](:[*]):[c](:[c]([*]):[*])[c](:[*]):[*]
11. Unknown ECFP_6 feature: -175507738: [*]C([*])([*])[c](:[cH]:[*]):[cH]:[*]
12. Unknown ECFP_6 feature: 1334250623: [*][c](:[*]):[c](Br):[cH]:[*]
13. Unknown ECFP_6 feature: -1952889961: [*]:[c](:[*])C(F)(F)F
14. Unknown ECFP_6 feature: 459826767: [*]:[c](:[*])Br
15. Unknown ECFP_6 feature: 99947387: [*]:[c](:[*])Cl
16. Unknown ECFP_6 feature: 767488533: [*]:[c](:[*])CC
17. Unknown ECFP_6 feature: 226796801: [*]C([*])([*])F

Feature Contribution
Top features for positive contribution
Fingerprint Bit/Smiles Feature Structure Score
ECFP_6 1559650422 0.129

FCFP_6 32 0.101

FCFP_6 3 0.0924

Top Features for negative contribution


Fingerprint Bit/Smiles Feature Structure Score
FCFP_6 -453677277 -0.0906
FCFP_6 136597326 -0.0815

FCFP_6 203677720 -0.0713
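
When a Model Applicability list is as long as the one in this section, it can help to tally the unknown features programmatically. The following is a minimal sketch that assumes each entry has the textual form shown in this report ("N. Unknown <fingerprint> feature: <bit>: <pattern>"); the function name and the regular expression are illustrative and not part of any TOPKAT tooling.

```python
import re
from collections import Counter

# One numbered applicability entry describing an unknown fingerprint feature
UNKNOWN_FEATURE = re.compile(
    r"^\s*(\d+)\.\s+Unknown\s+(\w+)\s+feature:\s+(-?\d+):\s+(\S+)\s*$"
)


def parse_unknown_features(lines):
    """Return (fingerprint, bit, pattern) tuples for every unknown-feature line."""
    found = []
    for line in lines:
        match = UNKNOWN_FEATURE.match(line)
        if match:
            _, fingerprint, bit, pattern = match.groups()
            found.append((fingerprint, int(bit), pattern))
    return found


# A few entries copied from the Chronic LOAEL applicability list
report_lines = [
    "1. All properties and OPS components are within expected ranges.",
    "2. Unknown ECFP_6 feature: -1305021906: [*]['?']",
    "3. Unknown ECFP_6 feature: -302078100: [*]Br",
    "13. Unknown ECFP_6 feature: -1952889961: [*]:[c](:[*])C(F)(F)F",
]

features = parse_unknown_features(report_lines)
counts = Counter(fingerprint for fingerprint, _, _ in features)
print(features)
print(counts)
```

Lines that do not describe an unknown feature, such as the "within expected ranges" entry, are simply skipped.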


Molecule TOPKAT_Daphnia_EC50

C20H13BrCl2F3N2[?]
Molecular Weight: 489.13582
ALogP: 7.747
Rotatable Bonds: 4
Acceptors: 2
Donors: 1

Structural Similar Compounds
Name | Bifenthrin | Fenvalerate | Chlorophacinone
Actual Endpoint (-log C) | 8.422 | 8.3991 | 5.944
Predicted Endpoint (-log C) | 6.66616 | 7.4875 | 5.89478
Distance | 0.697 | 0.738 | 0.741
Reference | Toropov and Benfenati, 2006, Bioorganic & Medicinal Chemistry, 14(8), 2779-2788 | Arch. Environ. Contam. Toxicol. 16(4):423-432;1987B | Toropov and Benfenati, 2006, Bioorganic & Medicinal Chemistry, 14(8), 2779-2788

Model Prediction
Prediction: 0.114
Unit: mg/l
Mahalanobis Distance: 31.9
Mahalanobis Distance p-value: 2.33e-038

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. All properties and OPS components are within expected ranges.
2. Unknown ECFP_6 feature: -1305021906: [*]['?']
3. Unknown ECFP_6 feature: -427397688: ['?'][c](:[*]):[c](:[cH]:[*])[c](:[*]):[*]
4. Unknown ECFP_6 feature: -428002189: [*]:[cH]:[c](:n:[*])[c](:[*]):[*]
5. Unknown ECFP_6 feature: -1734834311: ['?'][c](:[*]):[c](N):n:[*]
6. Unknown ECFP_6 feature: -1659020767: [*][c](:[*]):[c](['?']):[c]([*]):[*]
7. Unknown ECFP_6 feature: 432684389: ['?'][c](:[*]):[*]
8. Unknown ECFP_6 feature: -813997308: [*][c](:[*]):[c](:[c]([*]):[*])[c](:[*]):[*]
9. Unknown ECFP_6 feature: -175507738: [*]C([*])([*])[c](:[cH]:[*]):[cH]:[*]
10. Unknown ECFP_6 feature: 1334250623: [*][c](:[*]):[c](Br):[cH]:[*]
11. Unknown ECFP_6 feature: -1952889961: [*]:[c](:[*])C(F)(F)F
12. Unknown ECFP_6 feature: 459826767: [*]:[c](:[*])Br
13. Unknown ECFP_6 feature: 767488533: [*]:[c](:[*])CC
14. Unknown ECFP_6 feature: 226796801: [*]C([*])([*])F

Feature Contribution
Top features for positive contribution
Fingerprint Bit/Smiles Feature Structure Score
ECFP_6 642810091 0.148

ECFP_6 1572579716 0.114

FCFP_6 1069584379 0.0966

Top Features for negative contribution


Fingerprint Bit/Smiles Feature Structure Score
FCFP_6 0 -0.202
FCFP_6 136597326 -0.193

FCFP_6 17 -0.189
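
The similar-compound endpoints in this section are listed as -log C, while the model prediction is given in mass units (mg/l). Assuming C is a molar concentration in mol/l, which the report does not state explicitly, the two scales can be related through the molecular weight. The helper below is a hypothetical sketch of that conversion and nothing more.

```python
import math

MOLECULAR_WEIGHT = 489.13582  # g/mol, from the molecule summary in this report


def neg_log_molar(mass_conc_g_per_l: float, mol_weight: float = MOLECULAR_WEIGHT) -> float:
    """Convert a mass concentration (g/l) to -log10 of the molar concentration.

    Assumption: "-log C" in the report means -log10(mol/l).
    """
    return -math.log10(mass_conc_g_per_l / mol_weight)


daphnia_ec50_mg_per_l = 0.114  # Model Prediction for TOPKAT_Daphnia_EC50
daphnia_neg_log_c = neg_log_molar(daphnia_ec50_mg_per_l / 1000.0)  # mg/l -> g/l
print(round(daphnia_neg_log_c, 2))
```

Under that assumption the 0.114 mg/l prediction corresponds to a -log C of roughly 6.63, which falls inside the range of predicted endpoints listed for the three similar compounds (5.89 to 7.49).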
Molecule TOPKAT_Fathead_Minnow_LC50

C20H13BrCl2F3N2[?]
Molecular Weight: 489.13582
ALogP: 7.747
Rotatable Bonds: 4
Acceptors: 2
Donors: 1

Structural Similar Compounds
Name | 2;2'-Methylene-bis-(3;4;6-trichlorophenol) | Dicofol | 4;4'-Isopropylidene-bis-(2;6-dichlorophenol)
Actual Endpoint (-log C) | 7.287 | 5.788 | 5.44
Predicted Endpoint (-log C) | 7.45687 | 6.23295 | 6.63381
Distance | 0.786 | 0.799 | 0.833
Reference | ATOCFM Volume 1 | ATOCFM Volume 4 | ATOCFM Volume 4

Model Prediction
Prediction: 5.86e-006
Unit: g/l
Mahalanobis Distance: 19.7
Mahalanobis Distance p-value: 1.33e-039

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. ALogP out of range. Value: 7.747. Training min, max, mean, SD: -3.709, 7.307, 2.0523, 1.462.
2. OPS PC12 out of range. Value: 3.6428. Training min, max, SD, explained variance: -3.3233, 3.4374, 1.268, 0.0277.

Feature Contribution
Top features for positive contribution
Fingerprint Bit/Smiles Feature Structure Score
FCFP_2 1069584379 0.105
FCFP_2 71953198 0.0871
FCFP_2 16 0.0139

Top Features for negative contribution


Fingerprint Bit/Smiles Feature Structure Score
FCFP_2 0 -0.275

FCFP_2 136120670 -0.214

FCFP_2 3 -0.198
Molecule TOPKAT_Rat_Inhalational_LC50

C20H13BrCl2F3N2[?]
Molecular Weight: 489.13582
ALogP: 7.747
Rotatable Bonds: 4
Acceptors: 2
Donors: 1

Structural Similar Compounds
Name | 1H-Benzimidazole; 5-chloro-6-(2;3-dichlorophenoxy)-2-(methylthio)- | Ethanone; 2-((4-(2;4-dichloro-3-methylbenzoyl)-1;3-dimethyl-1H-pyrazol-5-yl)oxy)-1-(4-methylphenyl)- | Benzhydrol; 4;4'-dichloro-alpha-(trichloromethyl)-
Actual Endpoint (-log C) | 2.2548 | 1.7472 | 1.2677
Predicted Endpoint (-log C) | 1.69815 | 2.15944 | 2.08555
Distance | 0.724 | 0.813 | 0.825
Reference | MDACAP Medicamentos de Actualidad. (J.R. Prous; S.A.; Apartado de Correos 540; 08080 Barcelona; Spain) V.1- 1965- Volume(issue)/page/year: 21;227;1985 | NNGADV Nippon Noyaku Gakkaishi. Journal of the Pesticide Science Society of Japan. (Nippon Noyaku Gakkai; 1-43-11; Komagome; Toshima-ku; Tokyo 170; Japan) V.1- 1976- Volume(issue)/page/year: 15;125;1990 | PEMNDP Pesticide Manual. (The British Crop Protection Council; 20 Bridport Rd.; Thornton Heath CR4 7QG; UK) V.1- 1968- Volume(issue)/page/year: 9;267;1991

Model Prediction
Prediction: 5.61e+004
Unit: mg/m3/h
Mahalanobis Distance: 14.2
Mahalanobis Distance p-value: 3.45e-012

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. OPS PC6 out of range. Value: -2.7436. Training min, max, SD, explained variance: -2.4569, 8.1177, 1.698, 0.0400.
2. OPS PC21 out of range. Value: -3.8531. Training min, max, SD, explained variance: -3.0247, 4.4972, 1.058, 0.0155.
3. Unknown ECFP_2 feature: -1305021906: [*]['?']
4. Unknown ECFP_2 feature: -302078100: [*]Br
5. Unknown ECFP_2 feature: -428002189: [*]:[cH]:[c](:n:[*])[c](:[*]):[*]
6. Unknown ECFP_2 feature: -1734834311: ['?'][c](:[*]):[c](N):n:[*]
7. Unknown ECFP_2 feature: -1659020767: [*][c](:[*]):[c](['?']):[c]([*]):[*]
8. Unknown ECFP_2 feature: 432684389: ['?'][c](:[*]):[*]
9. Unknown ECFP_2 feature: -813997308: [*][c](:[*]):[c](:[c]([*]):[*])[c](:[*]):[*]
10. Unknown ECFP_2 feature: 1334250623: [*][c](:[*]):[c](Br):[cH]:[*]
11. Unknown ECFP_2 feature: 459826767: [*]:[c](:[*])Br

Feature Contribution
Top features for positive contribution
Fingerprint Bit/Smiles Feature Structure Score
ECFP_2 642810091 0.214

ECFP_2 1572579716 0.159

ECFP_2 -817402818 0.129

Top Features for negative contribution


Fingerprint Bit/Smiles Feature Structure Score
ECFP_2 863188371 -0.338

ECFP_2 734603939 -0.302

ECFP_2 -1046436026 -0.26


Molecule TOPKAT_Rat_Maximum_Tolerated_Dose_Feed

C20H13BrCl2F3N2[?]
Molecular Weight: 489.13582
ALogP: 7.747
Rotatable Bonds: 4
Acceptors: 2
Donors: 1

Structural Similar Compounds
Name | HEXACHLOROPHENE | DICOFOL | 4,4'-THIOBIS(6-t-BUTYL-m-CRESOL)
Actual Endpoint (-log C) | 4.78017 | 3.9415 | 3.55454
Predicted Endpoint (-log C) | 3.20776 | 3.81186 | 3.06707
Distance | 0.661 | 0.673 | 0.762
Reference | NCI/NTP TR-40 | NCI/NTP TR-90 | NCI/NTP TR-435

Model Prediction
Prediction: 0.0317
Unit: g/kg_body_weight
Mahalanobis Distance: 10.6
Mahalanobis Distance p-value: 4.41e-006

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. ALogP out of range. Value: 7.747. Training min, max, mean, SD: -4.271, 7.574, 2.3494, 1.981.

Feature Contribution
Top features for positive contribution
Fingerprint Bit/Smiles Feature Structure Score
FCFP_2 3 0.0737
FCFP_2 136120670 0.064
FCFP_2 71953198 0.058

Top Features for negative contribution


Fingerprint Bit/Smiles Feature Structure Score
FCFP_2 71476542 -0.134

FCFP_2 203677720 -0.0829

FCFP_2 16 -0.0512
Molecule TOPKAT_Rat_Maximum_Tolerated_Dose_Gavage

C20H13BrCl2F3N2[?]
Molecular Weight: 489.13582
ALogP: 7.747
Rotatable Bonds: 4
Acceptors: 2
Donors: 1

Structural Similar Compounds
Name | MIX OF 1,2,3,6,7,8- AND 1,2,3,7,8,9-HCDD | 2,3,7,8-TCDD | 1-TRANS-DELTA(9)-TETRAHYDROCANNABINOL
Actual Endpoint (-log C) | 7.89304 | 9.66271 | 3.79861
Predicted Endpoint (-log C) | 7.3873 | 7.00828 | 4.44032
Distance | 0.903 | 1.009 | 1.064
Reference | NCI/NTP TR-198 | NCI/NTP TR-201 | NCI/NTP TR-446

Model Prediction
Prediction: 0.0145
Unit: g/kg_body_weight
Mahalanobis Distance: 9.27
Mahalanobis Distance p-value: 5.48e-005

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. Molecular_Weight out of range. Value: 489.14. Training min, max, mean, SD: 68.074, 434.63, 171.13, 85.06.
2. Num_AromaticRings out of range. Value: 3. Training min, max, mean, SD: 0, 2, 0.5625, 0.693.
3. OPS PC7 out of range. Value: -3.0783. Training min, max, SD, explained variance: -2.8003, 2.9332, 1.16, 0.0416.
4. Unknown FCFP_2 feature: -1861645784: ['?'][c](:[*]):[c](:[cH]:[*])[c](:[*]):[*]
5. Unknown FCFP_2 feature: 690511177: [*]:[cH]:[c](:n:[*])[c](:[*]):[*]
6. Unknown FCFP_2 feature: -1029533685: [*]:[c](:[*])C(F)(F)F

Feature Contribution
Top features for positive contribution
Fingerprint Bit/Smiles Feature Structure Score
FCFP_2 32 0.526

FCFP_2 367998008 0.413

FCFP_2 71953198 0.113

Top Features for negative contribution


Fingerprint Bit/Smiles Feature Structure Score
FCFP_2 136597326 -0.489

FCFP_2 203677720 -0.406


FCFP_2 0 -0.29
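
The "out of range" entries in the Model Applicability list for this model amount to comparing each query descriptor with the minimum and maximum seen in training. A minimal sketch of that check follows; the class and function names are hypothetical, and only the two descriptor ranges quoted in this section are used.

```python
from dataclasses import dataclass


@dataclass
class TrainingRange:
    name: str
    minimum: float
    maximum: float


def out_of_range(query: dict, ranges: list) -> list:
    """Return one message per descriptor whose query value lies outside training."""
    flagged = []
    for r in ranges:
        value = query[r.name]
        if value < r.minimum or value > r.maximum:
            flagged.append(
                f"{r.name} out of range. Value: {value}. "
                f"Training min, max: {r.minimum}, {r.maximum}."
            )
    return flagged


# Training ranges quoted for TOPKAT_Rat_Maximum_Tolerated_Dose_Gavage
ranges = [
    TrainingRange("Molecular_Weight", 68.074, 434.63),
    TrainingRange("Num_AromaticRings", 0, 2),
]
query = {"Molecular_Weight": 489.14, "Num_AromaticRings": 3}

for message in out_of_range(query, ranges):
    print(message)
```

Both descriptors exceed the training maximum for this model, so the prediction is an extrapolation on those axes.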
Molecule TOPKAT_Rat_Oral_LD50

C20H13BrCl2F3N2[?]
Molecular Weight: 489.13582
ALogP: 7.747
Rotatable Bonds: 4
Acceptors: 2
Donors: 1

Structural Similar Compounds
Name | CLOFAZIMINE | BENZBROMARONE | PIMOZIDE
Actual Endpoint (-log C) | 1.751 | 3.233 | 2.623
Predicted Endpoint (-log C) | 2.73718 | 2.84957 | 2.83946
Distance | 0.621 | 0.660 | 0.702
Reference | ARZNAD 20;794;70 | IYKEDH 10;232;79 | NIIRDN 6;639;82

Model Prediction
Prediction: 0.379
Unit: g/kg_body_weight
Mahalanobis Distance: 18.6
Mahalanobis Distance p-value: 4.46e-005

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. All properties and OPS components are within expected ranges.
2. Unknown ECFP_2 feature: -1305021906: [*]['?']
3. Unknown ECFP_2 feature: -1659020767: [*][c](:[*]):[c](['?']):[c]([*]):[*]
4. Unknown ECFP_2 feature: 432684389: ['?'][c](:[*]):[*]
5. Unknown FCFP_6 feature: 16: [*][c](:[*]):[*]
6. Unknown FCFP_6 feature: -1861645784: ['?'][c](:[*]):[c](:[cH]:[*])[c](:[*]):[*]
7. Unknown FCFP_6 feature: 1618154665: [*][c](:[*]):[cH]:[c]([*]):[*]
8. Unknown FCFP_6 feature: 690511177: [*]:[cH]:[c](:n:[*])[c](:[*]):[*]
9. Unknown FCFP_6 feature: 1747237384: [*][c](:[*]):n:[c]([*]):[*]
10. Unknown FCFP_6 feature: -1151884458: ['?'][c](:[*]):[c](N):n:[*]
11. Unknown FCFP_6 feature: 1069584379: [*]:[c](:[*])N
12. Unknown FCFP_6 feature: 71476542: [*]:[c](:[*])Br

Feature Contribution
Top features for positive contribution
Fingerprint Bit/Smiles Feature Structure Score
FCFP_6 71953198 0.392

ECFP_6 -1046436026 0.349

ECFP_6 642810091 0.281

Top Features for negative contribution


Fingerprint Bit/Smiles Feature Structure Score
ECFP_6 226796801 -0.32

ECFP_6 -817402818 -0.263


ECFP_6 -175507738 -0.262
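
One rough way to gauge how well this model behaves near the query molecule is to compare the actual and predicted endpoints of the three similar compounds listed for it. The sketch below computes their absolute errors in -log C units. It is an illustrative sanity check, not a procedure described in this report, and three compounds are far too few for a real validation.

```python
# Similar compounds for TOPKAT_Rat_Oral_LD50: name -> (actual, predicted), in -log C
similar = {
    "CLOFAZIMINE": (1.751, 2.73718),
    "BENZBROMARONE": (3.233, 2.84957),
    "PIMOZIDE": (2.623, 2.83946),
}

abs_errors = {
    name: abs(actual - predicted) for name, (actual, predicted) in similar.items()
}
mean_abs_error = sum(abs_errors.values()) / len(abs_errors)

for name, err in abs_errors.items():
    print(f"{name}: |actual - predicted| = {err:.3f}")
print(f"mean absolute error = {mean_abs_error:.3f}")
```

Because the endpoint is logarithmic, an error of about one unit (as for the nearest neighbor here) corresponds to roughly a tenfold difference in dose.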
