
TOPKAT_Aerobic_Biodegradability

Molecule
C27H21N2[?]
Molecular Weight: 373.46903
ALogP: 6.545
Rotatable Bonds: 3
Acceptors: 2
Donors: 1

Model Prediction
Prediction: Non-Degradable
Probability: 0.0589
Enrichment: 0.135
Bayesian Score: -14.8
Mahalanobis Distance: 16.3
Mahalanobis Distance p-value: 1.45e-016

Prediction: Positive if the Bayesian score is above the estimated best cutoff value from minimizing the false positive and false negative rate.
Probability: The estimated probability that the sample is in the positive category. This assumes that the Bayesian score follows a normal distribution and is different from the prediction using a cutoff.
Enrichment: An estimate of enrichment, that is, the increased likelihood (versus random) of this sample being in the category.
Bayesian Score: The standard Laplacian-modified Bayesian score.
Mahalanobis Distance: The Mahalanobis distance (MD) is the distance to the center of the training data. The larger the MD, the less trustworthy the prediction.
Mahalanobis Distance p-value: The p-value gives the fraction of training data with an MD greater than or equal to the one for the given sample, assuming normally distributed data. The smaller the p-value, the less trustworthy the prediction. For highly non-normal X properties (e.g., fingerprints), the MD p-value is wildly inaccurate.

Structural Similar Compounds
Name | Dicofol | Triphenyltin_hydroxide | 1,1'-Biphenyl_-4-ol,_3,5-bis(1,1-dimethylethyl)-
Actual Endpoint | Non-Degradable | Non-Degradable | Non-Degradable
Predicted Endpoint | Non-Degradable | Non-Degradable | Non-Degradable
Distance | 0.616 | 0.656 | 0.668
Reference | Environmental Toxicology & Chemistry 18(9), 1763-1768, 1999. | Environmental Toxicology & Chemistry 18(9), 1763-1768, 1999. | Environmental Toxicology & Chemistry 18(9), 1763-1768, 1999.

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. OPS PC13 out of range. Value: -4.0709. Training min, max, SD, explained variance: -3.5916, 5.7035, 1.413, 0.0229.

Feature Contribution
Top features for positive contribution
Fingerprint | Bit/Smiles | Score | Degradable in training set
SCFP_12 | 136597326 | 0.36 | 179 out of 307
SCFP_12 | 0 | 0.223 | 328 out of 646

Top features for negative contribution
Fingerprint | Bit/Smiles | Score | Degradable in training set
SCFP_12 | 10 | -1.61 | 11 out of 145
SCFP_12 | -1850396224 | -1.45 | 0 out of 8
SCFP_12 | 384920865 | -1.36 | 6 out of 65
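
The Mahalanobis distance and its p-value, as defined in the legend of this report, can be illustrated with a minimal sketch. The two-descriptor training set and all function names below are hypothetical; the report does not state which descriptors, or how many dimensions, the model uses. For two normally distributed descriptors the squared distance follows a chi-square distribution with two degrees of freedom, so the p-value reduces to exp(-MD^2 / 2).

```python
import math

def mean_and_cov_2d(points):
    """Return the mean and the sample covariance (sxx, sxy, syy) of 2-D points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)

def mahalanobis_2d(sample, mean, cov):
    """Distance of `sample` to the training center, scaled by the covariance."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy  # determinant of the 2x2 covariance matrix
    dx, dy = sample[0] - mean[0], sample[1] - mean[1]
    # Quadratic form d^T * inverse(cov) * d, written out for the 2x2 case.
    d2 = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(d2)

def md_p_value_2d(md):
    """Chi-square survival function for 2 degrees of freedom (normality assumed)."""
    return math.exp(-0.5 * md * md)

# Hypothetical training descriptors, not taken from the report.
training = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
mean, cov = mean_and_cov_2d(training)
md = mahalanobis_2d((3.0, 1.0), mean, cov)
print("MD:", md, "p-value:", md_p_value_2d(md))
```

A sample at the training center gives MD = 0 and a p-value of 1; the p-value shrinks as the sample moves away from the center, which is the sense in which a small p-value marks a less trustworthy prediction.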


TOPKAT_Ames_Mutagenicity

Molecule
C27H21N2[?]
Molecular Weight: 373.46903
ALogP: 6.545
Rotatable Bonds: 3
Acceptors: 2
Donors: 1

Model Prediction
Prediction: Mutagen
Probability: 0.904
Enrichment: 1.62
Bayesian Score: 5.17
Mahalanobis Distance: 5.12
Mahalanobis Distance p-value: 1

Structural Similar Compounds
Name | 24225-71-6 | 21232-47-3 | delta 9-Tetrahydrocannabinol
Actual Endpoint | Mutagen | Non-Mutagen | Non-Mutagen
Predicted Endpoint | Mutagen | Non-Mutagen | Non-Mutagen
Distance | 0.530 | 0.571 | 0.580
Reference | Kazius et al., J. Med. Chem. (2005) 48, 312-320 | Kazius et al., J. Med. Chem. (2005) 48, 312-320 | EMIC

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. All properties and OPS components are within expected ranges.

Feature Contribution
Top features for positive contribution
Fingerprint | Bit/Smiles | Score | Mutagen in training set
SCFP_12 | -1637856718 | 0.511 | 14 out of 14
SCFP_12 | 2095694678 | 0.508 | 13 out of 13
SCFP_12 | 469760330 | 0.495 | 175 out of 186

Top features for negative contribution
Fingerprint | Bit/Smiles | Score | Mutagen in training set
SCFP_12 | 1498989769 | -0.998 | 0 out of 3
SCFP_12 | 42 | -0.531 | 10 out of 31
SCFP_12 | 136597326 | -0.439 | 584 out of 1586
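
The per-feature scores in the Ames mutagenicity tables are consistent with the usual Laplacian-corrected Bayesian estimator, ln((A + 1) / (N * P + 1)), where A is the number of positive training samples containing the feature, N is the number of training samples containing it, and P is the overall fraction of positives. The sketch below is an illustration, not the vendor's code: the function name is hypothetical, and the base rate of 0.571 is an assumption back-calculated from the tabulated scores, since the report does not state it.

```python
import math

def laplacian_feature_score(positives_with_feature, samples_with_feature, base_rate):
    """Laplacian-corrected log-odds contribution of one fingerprint feature."""
    return math.log((positives_with_feature + 1.0)
                    / (samples_with_feature * base_rate + 1.0))

# Assumption: fraction of mutagens in the training set, inferred from the table.
ASSUMED_BASE_RATE = 0.571

# (mutagens with feature, samples with feature, score printed in the report)
reported_rows = [
    (14, 14, 0.511),
    (13, 13, 0.508),
    (175, 186, 0.495),
    (0, 3, -0.998),
    (10, 31, -0.531),
    (584, 1586, -0.439),
]

for positives, total, reported in reported_rows:
    computed = laplacian_feature_score(positives, total, ASSUMED_BASE_RATE)
    print(f"{positives} out of {total}: computed {computed:.3f}, reported {reported}")
```

The correction explains why "14 out of 14" scores only slightly higher than "175 out of 186": rarely seen features are pulled toward zero, so a feature needs many supporting samples before it contributes strongly.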


TOPKAT_Developmental_Toxicity_Potential

Molecule
C27H21N2[?]
Molecular Weight: 373.46903
ALogP: 6.545
Rotatable Bonds: 3
Acceptors: 2
Donors: 1

Model Prediction
Prediction: Non-Toxic
Probability: 0.442
Enrichment: 0.841
Bayesian Score: -3.01
Mahalanobis Distance: 8.5
Mahalanobis Distance p-value: 0.369

Structural Similar Compounds
Name | Benzbromarone | Triclabendazole | Hexachlorophene
Actual Endpoint | Toxic | Toxic | Toxic
Predicted Endpoint | Toxic | Toxic | Toxic
Distance | 0.586 | 0.628 | 0.639
Reference | Shinryo to Shinaku 16:1521-1545; 1979 | Toxicology 43(3):283-287; 1987 | Teratology 12:83-88; 1975

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. All properties and OPS components are within expected ranges.

Feature Contribution
Top features for positive contribution
Fingerprint | Bit/Smiles | Score | Toxic in training set
SCFP_6 | -1211866396 | 0.21 | 8 out of 12
SCFP_6 | -1272709286 | 0.0607 | 14 out of 25

Top features for negative contribution
Fingerprint | Bit/Smiles | Score | Toxic in training set
SCFP_6 | 1684592399 | -0.718 | 0 out of 2
SCFP_6 | 2010506287 | -0.422 | 0 out of 1
SCFP_6 | -1798553344 | -0.358 | 3 out of 9
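
The report calls a sample positive when its Bayesian score lies above a "best cutoff" chosen to minimize the false positive and false negative rates. One way such a cutoff could be chosen is sketched below. The objective (sum of the two rates), the midpoint candidates, and the toy scores are assumptions for illustration; the report lists neither the cutoff nor the training scores.

```python
def best_cutoff(scores, labels):
    """Return (cutoff, error), where error = false positive rate + false negative rate.

    labels use 1 for the positive category and 0 for the negative category.
    A sample is called positive when its score is strictly above the cutoff.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    distinct = sorted(set(scores))
    # Candidate cutoffs: midpoints between neighbouring distinct scores.
    candidates = [(a + b) / 2.0 for a, b in zip(distinct, distinct[1:])]
    best = None
    for cutoff in candidates:
        fp = sum(1 for s, y in zip(scores, labels) if s > cutoff and y == 0)
        fn = sum(1 for s, y in zip(scores, labels) if s <= cutoff and y == 1)
        error = fp / n_neg + fn / n_pos
        if best is None or error < best[1]:
            best = (cutoff, error)
    return best

# Hypothetical training scores and categories, not taken from the report.
toy_scores = [-5.0, -3.0, -1.0, 0.5, 2.0, 4.0]
toy_labels = [0, 0, 0, 1, 0, 1]
cutoff, error = best_cutoff(toy_scores, toy_labels)
print("cutoff:", cutoff, "error:", error)
```

Because the call depends on this cutoff while the reported probability comes from a separate normal-distribution estimate, the two need not agree, as the legend of the report points out.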


TOPKAT_Mouse_Female_FDA_None_vs_Carcinogen

Molecule
C27H21N2[?]
Molecular Weight: 373.46903
ALogP: 6.545
Rotatable Bonds: 3
Acceptors: 2
Donors: 1

Model Prediction
Prediction: Non-Carcinogen
Probability: 0.228
Enrichment: 0.71
Bayesian Score: -2.26
Mahalanobis Distance: 12.7
Mahalanobis Distance p-value: 0.0013

Structural Similar Compounds
Name | Dronabinol | Quazepam | Hexachlorophene
Actual Endpoint | Non-Carcinogen | Non-Carcinogen | Non-Carcinogen
Predicted Endpoint | Non-Carcinogen | Non-Carcinogen | Non-Carcinogen
Distance | 0.596 | 0.622 | 0.657
Reference | US FDA (Centre for Drug Eval. & Res./Off. Testing & Res.) Sept. 1997 | US FDA (Centre for Drug Eval. & Res./Off. Testing & Res.) Sept. 1997 | US FDA (Centre for Drug Eval. & Res./Off. Testing & Res.) Sept. 1997

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. All properties and OPS components are within expected ranges.
2. Unknown ECFP_2 feature: -1305021906: [*]['?']
3. Unknown ECFP_2 feature: -1659020767: [*][c](:[*]):[c](['?']):[c]([*]):[*]
4. Unknown ECFP_2 feature: 432684389: ['?'][c](:[*]):[*]

Feature Contribution
Top features for positive contribution
Fingerprint | Bit/Smiles | Score | Carcinogen in training set
ECFP_6 | -81428579 | 0.581 | 3 out of 4
ECFP_6 | -178525456 | 0.457 | 4 out of 7
ECFP_6 | 717474525 | 0.451 | 3 out of 5

Top features for negative contribution
Fingerprint | Bit/Smiles | Score | Carcinogen in training set
ECFP_6 | -219423964 | -0.935 | 0 out of 5
ECFP_6 | -427397688 | -0.476 | 5 out of 28
ECFP_6 | -813997308 | -0.452 | 2 out of 12


TOPKAT_Mouse_Female_NTP

Molecule
C27H21N2[?]
Molecular Weight: 373.46903
ALogP: 6.545
Rotatable Bonds: 3
Acceptors: 2
Donors: 1

Model Prediction
Prediction: Non-Carcinogen
Probability: 0.545
Enrichment: 1.38
Bayesian Score: -1.04
Mahalanobis Distance: 10.3
Mahalanobis Distance p-value: 0.00086

Structural Similar Compounds
Name | 1-trans-delta(sup 9)-tetrahydrocannabinol | Dicofol | 4;4'-Thiobis-(6-tert-butyl-m-cresol)
Actual Endpoint | Carcinogen | Non-Carcinogen | Non-Carcinogen
Predicted Endpoint | Carcinogen | Non-Carcinogen | Non-Carcinogen
Distance | 0.590 | 0.601 | 0.673
Reference | NTP446 | NTP/TR-090 | NTP/TR-435

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. All properties and OPS components are within expected ranges.
2. Unknown ECFP_2 feature: -1305021906: [*]['?']
3. Unknown ECFP_2 feature: -428002189: [*]:[cH]:[c](:n:[*])[c](:[*]):[*]
4. Unknown ECFP_2 feature: -1659020767: [*][c](:[*]):[c](['?']):[c]([*]):[*]
5. Unknown ECFP_2 feature: 432684389: ['?'][c](:[*]):[*]

Feature Contribution
Top features for positive contribution
Fingerprint | Bit/Smiles | Score | Carcinogen in training set
ECFP_8 | -1734834311 | 0.544 | 2 out of 2
ECFP_8 | -181568884 | 0.33 | 4 out of 7
ECFP_8 | -938530932 | 0.287 | 27 out of 54

Top features for negative contribution
Fingerprint | Bit/Smiles | Score | Carcinogen in training set
ECFP_8 | 1306977740 | -0.586 | 1 out of 7
ECFP_8 | -1162217637 | -0.555 | 0 out of 2
ECFP_8 | 120853561 | -0.315 | 0 out of 1


TOPKAT_Mouse_Male_FDA_None_vs_Carcinogen

Molecule
C27H21N2[?]
Molecular Weight: 373.46903
ALogP: 6.545
Rotatable Bonds: 3
Acceptors: 2
Donors: 1

Model Prediction
Prediction: Carcinogen
Probability: 0.356
Enrichment: 1.21
Bayesian Score: 1.71
Mahalanobis Distance: 13.1
Mahalanobis Distance p-value: 0.000192

Structural Similar Compounds
Name | Dronabinol | Quazepam | Hexachlorophene
Actual Endpoint | Non-Carcinogen | Non-Carcinogen | Non-Carcinogen
Predicted Endpoint | Non-Carcinogen | Non-Carcinogen | Non-Carcinogen
Distance | 0.589 | 0.631 | 0.653
Reference | US FDA (Centre for Drug Eval. & Res./Off. Testing & Res.) Sept. 1997 | US FDA (Centre for Drug Eval. & Res./Off. Testing & Res.) Sept. 1997 | US FDA (Centre for Drug Eval. & Res./Off. Testing & Res.) Sept. 1997

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. All properties and OPS components are within expected ranges.

Feature Contribution
Top features for positive contribution
Fingerprint | Bit/Smiles | Score | Carcinogen in training set
FCFP_6 | -1373488931 | 0.517 | 2 out of 3
FCFP_6 | -387072142 | 0.477 | 4 out of 8
FCFP_6 | -1319503346 | 0.46 | 1 out of 1

Top features for negative contribution
Fingerprint | Bit/Smiles | Score | Carcinogen in training set
FCFP_6 | -497728148 | -0.96 | 2 out of 26
FCFP_6 | 1152389402 | -0.423 | 0 out of 2
FCFP_6 | 907007053 | -0.366 | 11 out of 62


TOPKAT_Mouse_Male_FDA_Single_vs_Multiple

Molecule
C27H21N2[?]
Molecular Weight: 373.46903
ALogP: 6.545
Rotatable Bonds: 3
Acceptors: 2
Donors: 1

Model Prediction
Prediction: Single-Carcinogen
Probability: 0.147
Enrichment: 0.489
Bayesian Score: -8.23
Mahalanobis Distance: 13.1
Mahalanobis Distance p-value: 8.61e-005

Structural Similar Compounds
Name | Mestranol | Nafenopin | Sertraline
Actual Endpoint | Multiple-Carcinogen | Single-Carcinogen | Single-Carcinogen
Predicted Endpoint | Multiple-Carcinogen | Single-Carcinogen | Single-Carcinogen
Distance | 0.688 | 0.692 | 0.695
Reference | US FDA (Centre for Drug Eval. & Res./Off. Testing & Res.) Sept. 1997 | US FDA (Centre for Drug Eval. & Res./Off. Testing & Res.) Sept. 1997 | US FDA (Centre for Drug Eval. & Res./Off. Testing & Res.) Sept. 1997

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. All properties and OPS components are within expected ranges.

Feature Contribution
Top features for positive contribution
Fingerprint | Bit/Smiles | Score | Multiple-Carcinogen in training set
FCFP_12 | 907007053 | 0.235 | 5 out of 11
FCFP_12 | -105186863 | 0.174 | 1 out of 2
FCFP_12 | -1373488931 | 0.174 | 1 out of 2

Top features for negative contribution
Fingerprint | Bit/Smiles | Score | Multiple-Carcinogen in training set
FCFP_12 | 1069584379 | -1.11 | 0 out of 6
FCFP_12 | -1151884458 | -1.11 | 0 out of 6
FCFP_12 | 136120670 | -0.709 | 1 out of 9


TOPKAT_Mouse_Male_NTP

Molecule
C27H21N2[?]
Molecular Weight: 373.46903
ALogP: 6.545
Rotatable Bonds: 3
Acceptors: 2
Donors: 1

Model Prediction
Prediction: Carcinogen
Probability: 0.606
Enrichment: 1.54
Bayesian Score: 0.119
Mahalanobis Distance: 11.2
Mahalanobis Distance p-value: 5.48e-005

Structural Similar Compounds
Name | 1-trans-delta(sup 9)-tetrahydrocannabinol | DICOFOL | 4;4'-THIOBIS(6-t-BUTYL-m-CRESOL)
Actual Endpoint | Carcinogen | Carcinogen | Non-Carcinogen
Predicted Endpoint | Carcinogen | Carcinogen | Non-Carcinogen
Distance | 0.587 | 0.599 | 0.674
Reference | NTP446 | NTP/TR-90 | NTP/TR-435

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. All properties and OPS components are within expected ranges.

Feature Contribution
Top features for positive contribution
Fingerprint | Bit/Smiles | Score | Carcinogen in training set
SCFP_12 | -1798344807 | 0.377 | 1 out of 1
SCFP_12 | 1684592399 | 0.377 | 1 out of 1
SCFP_12 | 1334878018 | 0.35 | 2 out of 3

Top features for negative contribution
Fingerprint | Bit/Smiles | Score | Carcinogen in training set
SCFP_12 | -534368618 | -0.316 | 0 out of 1
SCFP_12 | 1498989769 | -0.316 | 0 out of 1
SCFP_12 | 2098257782 | -0.28 | 2 out of 8


TOPKAT_Ocular_Irritancy_Mild_vs_Moderate_Severe

Molecule
C27H21N2[?]
Molecular Weight: 373.46903
ALogP: 6.545
Rotatable Bonds: 3
Acceptors: 2
Donors: 1

Model Prediction
Prediction: Mild
Probability: 0.654
Enrichment: 0.949
Bayesian Score: -4.27
Mahalanobis Distance: 6.59
Mahalanobis Distance p-value: 1

Structural Similar Compounds
Name | ANTHRAQUINONE;1-(2;4;6-TRIMETHYLPHENYLAMINO)- | N;N'-(DI-2-NAPHTHYL)-P-PHENYLENEDIAMINE | Benzophenone; 4-(2-ethylhexyloxy)-2-hydroxy-
Actual Endpoint | Moderate_Severe | Mild | Mild
Predicted Endpoint | Mild | Mild | Mild
Distance | 0.609 | 0.688 | 0.697
Reference | 28ZPAK-;242;72 | 28ZPAK-;74;72 | Prehled Prumyslove Toxikologie; Organicke Latky; Marhold; J. pp 648;86

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. All properties and OPS components are within expected ranges.
2. Unknown FCFP_2 feature: -1151884458: ['?'][c](:[*]):[c](N):n:[*]

Feature Contribution
Top features for positive contribution
Fingerprint | Bit/Smiles | Score | Moderate_Severe in training set
FCFP_10 | -497728148 | 0.356 | 24 out of 25
FCFP_10 | 136120670 | 0.206 | 53 out of 65
FCFP_10 | -1331054323 | 0.186 | 1 out of 1

Top features for negative contribution
Fingerprint | Bit/Smiles | Score | Moderate_Severe in training set
FCFP_10 | 900733322 | -0.874 | 3 out of 13
FCFP_10 | -620155118 | -0.598 | 9 out of 26
FCFP_10 | -1861645784 | -0.598 | 9 out of 26


TOPKAT_Ocular_Irritancy_None_vs_Irritant

Molecule
C27H21N2[?]
Molecular Weight: 373.46903
ALogP: 6.545
Rotatable Bonds: 3
Acceptors: 2
Donors: 1

Model Prediction
Prediction: Irritant
Probability: 1
Enrichment: 1.18
Bayesian Score: 0.94
Mahalanobis Distance: 5.81
Mahalanobis Distance p-value: 1

Structural Similar Compounds
Name | ANTHRAQUINONE;1-(2;4;6-TRIMETHYLPHENYLAMINO)- | BENZILIC ACID; 4;4'-DICHLORO-; ISOPROPYL ESTER | Benzophenone; 4-(2-ethylhexyloxy)-2-hydroxy-
Actual Endpoint | Irritant | Irritant | Irritant
Predicted Endpoint | Irritant | Irritant | Non-Irritant
Distance | 0.603 | 0.680 | 0.681
Reference | 28ZPAK-;242;72 | CIGET* -;-;77 | Prehled Prumyslove Toxikologie; Organicke Latky; Marhold; J. pp 648;86

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. All properties and OPS components are within expected ranges.
2. Unknown FCFP_2 feature: -1151884458: ['?'][c](:[*]):[c](N):n:[*]

Feature Contribution
Top features for positive contribution
Fingerprint | Bit/Smiles | Score | Irritant in training set
FCFP_12 | 1747237384 | 0.208 | 44 out of 44
FCFP_12 | 17 | 0.189 | 48 out of 49
FCFP_12 | 136120670 | 0.151 | 65 out of 69

Top features for negative contribution
Fingerprint | Bit/Smiles | Score | Irritant in training set
FCFP_12 | 690511177 | -0.268 | 1 out of 2
FCFP_12 | -620155118 | 0 | 26 out of 31
FCFP_12 | 307419094 | 0 | 43 out of 52


TOPKAT_Rat_Female_FDA_None_vs_Carcinogen

Molecule
C27H21N2[?]
Molecular Weight: 373.46903
ALogP: 6.545
Rotatable Bonds: 3
Acceptors: 2
Donors: 1

Model Prediction
Prediction: Non-Carcinogen
Probability: 0.258
Enrichment: 0.8
Bayesian Score: -2.28
Mahalanobis Distance: 11.7
Mahalanobis Distance p-value: 0.0121

Structural Similar Compounds
Name | Dronabinol | Hexachlorophene | Desogen
Actual Endpoint | Non-Carcinogen | Non-Carcinogen | Carcinogen
Predicted Endpoint | Non-Carcinogen | Non-Carcinogen | Carcinogen
Distance | 0.606 | 0.665 | 0.681
Reference | US FDA (Centre for Drug Eval. & Res./Off. Testing & Res.) Sept. 1997 | US FDA (Centre for Drug Eval. & Res./Off. Testing & Res.) Sept. 1997 | US FDA (Centre for Drug Eval. & Res./Off. Testing & Res.) Sept. 1997

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. All properties and OPS components are within expected ranges.
2. Unknown ECFP_2 feature: -1305021906: [*]['?']
3. Unknown ECFP_2 feature: -1659020767: [*][c](:[*]):[c](['?']):[c]([*]):[*]
4. Unknown ECFP_2 feature: 432684389: ['?'][c](:[*]):[*]

Feature Contribution
Top features for positive contribution
Fingerprint | Bit/Smiles | Score | Carcinogen in training set
ECFP_12 | -81428579 | 0.288 | 2 out of 4
ECFP_12 | 1306977740 | 0.271 | 4 out of 9
ECFP_12 | -428002189 | 0.208 | 1 out of 2

Top features for negative contribution
Fingerprint | Bit/Smiles | Score | Carcinogen in training set
ECFP_12 | 767488533 | -0.941 | 0 out of 5
ECFP_12 | -1734834311 | -0.56 | 1 out of 8
ECFP_12 | -181568884 | -0.505 | 3 out of 18


TOPKAT_Rat_Female_NTP

Molecule
C27H21N2[?]
Molecular Weight: 373.46903
ALogP: 6.545
Rotatable Bonds: 3
Acceptors: 2
Donors: 1

Model Prediction
Prediction: Carcinogen
Probability: 0.534
Enrichment: 1.17
Bayesian Score: 0.507
Mahalanobis Distance: 10.8
Mahalanobis Distance p-value: 9.93e-005

Structural Similar Compounds
Name | 1-TRANS-DELTA(9)-TETRAHYDROCANNABINOL | 1-trans-delta(sup 9)-tetrahydrocannabinol | HEXACHLOROPHENE
Actual Endpoint | Non-Carcinogen | Non-Carcinogen | Non-Carcinogen
Predicted Endpoint | Non-Carcinogen | Non-Carcinogen | Non-Carcinogen
Distance | 0.584 | 0.584 | 0.650
Reference | TR-446 | NTP446 | TR-40

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. OPS PC11 out of range. Value: 3.7268. Training min, max, SD, explained variance: -2.3897, 3.1905, 1.314, 0.0302.

Feature Contribution
Top features for positive contribution
Fingerprint | Bit/Smiles | Score | Carcinogen in training set
FCFP_12 | -1320007763 | 0.475 | 10 out of 14
FCFP_12 | -387072142 | 0.436 | 7 out of 10
FCFP_12 | 307419094 | 0.394 | 11 out of 17

Top features for negative contribution
Fingerprint | Bit/Smiles | Score | Carcinogen in training set
FCFP_12 | 907007053 | -0.487 | 5 out of 21
FCFP_12 | 1115242110 | -0.349 | 0 out of 1
FCFP_12 | 690511177 | -0.349 | 0 out of 1
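
The applicability warning for this model ("OPS PC11 out of range") amounts to comparing a component of the query molecule with the minimum and maximum seen in training. The sketch below reproduces that check with the values printed in the report. The function name is hypothetical, and expressing the overshoot in units of the training standard deviation is an illustrative choice, not something the report itself does.

```python
def range_overshoot_in_sd(value, training_min, training_max, training_sd):
    """Return 0.0 if value is inside [training_min, training_max].

    Otherwise return how far outside the range it lies, in units of the
    training standard deviation (always positive).
    """
    if value > training_max:
        return (value - training_max) / training_sd
    if value < training_min:
        return (training_min - value) / training_sd
    return 0.0

# Values printed in the Model Applicability entry for OPS PC11.
overshoot = range_overshoot_in_sd(3.7268, -2.3897, 3.1905, 1.314)
status = "out of range" if overshoot > 0.0 else "within expected range"
print(f"OPS PC11 {status}; overshoot = {overshoot:.3f} SD")
```

Here the component exceeds the training maximum by well under one standard deviation, which gives a rough sense of how far the query molecule sits outside the region covered by the training data along this component.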


Molecule TOPKAT_Rat_Male_FDA_None_vs_Carcinogen
Structural Similar Compounds
Name Dronabinol Clotrimazole Nafenopin
Structure

Actual Endpoint Non-Carcinogen Non-Carcinogen Carcinogen


C27H21N2[?] Predicted Endpoint Non-Carcinogen Non-Carcinogen Carcinogen
Molecular Weight: 373.46903 Distance 0.592 0.677 0.682
ALogP: 6.545 Reference US FDA (Centre for Drug US FDA (Centre for Drug US FDA (Centre for Drug
Rotatable Bonds: 3 Eval.& Res./Off. Testing & Eval.& Res./Off. Testing & Eval.& Res./Off. Testing &
Res.) Sept. 1997 Res.) Sept. 1997 Res.) Sept. 1997
Acceptors: 2
Donors: 1
Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently
Model Prediction in the training set.
Prediction: Carcinogen
1. All properties and OPS components are within expected ranges.
Probability: 0.425
Enrichment: 1.27
Bayesian Score: 2.2 Feature Contribution
Mahalanobis Distance: 12.7 Top features for positive contribution
Mahalanobis Distance p-value: 0.0042 Fingerprint Bit/Smiles Feature Structure Score Carcinogen in
Prediction: Positive if the Bayesian score is above the estimated training set
best cutoff value from minimizing the false positive and false
negative rate. SCFP_6 1651620003 0.643 7 out of 10
Probability: The esimated probability that the sample is in the
positive category. This assumes that the Bayesian score follows
a normal distribution and is different from the prediction using a
cutoff.
Enrichment: An estimate of enrichment, that is, the increased
likelihood (versus random) of this sample being in the category.
Bayesian Score: The standard Laplacian-modified Bayesian
score.
Mahalanobis Distance: The Mahalanobis distance (MD) is the
distance to the center of the training data. The larger the MD, the
less trustworthy the prediction.
Mahalanobis Distance p-value: The p-value gives the fraction of
training data with an MD greater than or equal to the one for the
given sample, assuming normally distributed data. The smaller
the p-value, the less trustworthy the prediciton. For highly non-
normal X properties (e.g., fingerprints), the MD p-value is wildly
inaccurate.
SCFP_6 -1379673609 0.526 11 out of 19

SCFP_6 2098257782 0.425 2 out of 3

Top Features for negative contribution


Fingerprint Bit/Smiles Feature Structure Score Carcinogen in training set
SCFP_6 -1211866396 -1.1 2 out of 25

SCFP_6 -1272709286 -0.459 12 out of 61

SCFP_6 -1850396224 -0.278 0 out of 1
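
The feature scores in the two tables above are consistent with the Laplacian-corrected estimate often used for Bayesian fingerprint models, score = ln((A + 1) / (N * P + 1)), where A is the number of carcinogens among the N training compounds that contain the feature and P is the baseline fraction of carcinogens in the training set. The report does not print P, so the sketch below back-solves it from the first row (an assumption) and then checks the remaining rows against it. The function names are illustrative and are not part of TOPKAT.

```python
import math

def laplacian_score(active, total, base_rate):
    """Laplacian-corrected feature score: ln((A + 1) / (N * P + 1))."""
    return math.log((active + 1) / (total * base_rate + 1))

def infer_base_rate(active, total, score):
    """Back-solve the baseline rate P from one reported row (assumption)."""
    return ((active + 1) / math.exp(score) - 1) / total

# (carcinogens with feature, training compounds with feature, reported score)
rows = [
    (7, 10, 0.643),
    (11, 19, 0.526),
    (2, 3, 0.425),
    (2, 25, -1.1),
    (12, 61, -0.459),
    (0, 1, -0.278),
]

base_rate = infer_base_rate(*rows[0])
recomputed = [laplacian_score(a, n, base_rate) for a, n, _ in rows]

print(f"inferred baseline rate: {base_rate:.4f}")
for (a, n, reported), value in zip(rows, recomputed):
    print(f"{a:>2} of {n:>2}: reported {reported:+.3f}, recomputed {value:+.3f}")
```

If the recomputed scores match the reported ones to within rounding, the inferred baseline rate is a plausible reading of the table. It remains an inference, not a value stated in the report.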


Molecule TOPKAT_Rat_Male_FDA_Single_vs_Multiple
Structural Similar Compounds
Name | Tretinoin | Nafenopin | Mestranol
Actual Endpoint | Single-Carcinogen | Multiple-Carcinogen | Single-Carcinogen
Predicted Endpoint | Multiple-Carcinogen | Multiple-Carcinogen | Multiple-Carcinogen
Distance | 0.704 | 0.705 | 0.706
Reference | US FDA (Centre for Drug Eval.& Res./Off. Testing & Res.) Sept. 1997 | US FDA (Centre for Drug Eval.& Res./Off. Testing & Res.) Sept. 1997 | US FDA (Centre for Drug Eval.& Res./Off. Testing & Res.) Sept. 1997

C27H21N2[?]
Molecular Weight: 373.46903
ALogP: 6.545
Rotatable Bonds: 3
Acceptors: 2
Donors: 1

Model Prediction
Prediction: Single-Carcinogen
Probability: 0.514
Enrichment: 1.24
Bayesian Score: -2.91
Mahalanobis Distance: 13.2
Mahalanobis Distance p-value: 0.000439
Prediction: Positive if the Bayesian score is above the estimated best cutoff value from minimizing the false positive and false negative rate.
Probability: The estimated probability that the sample is in the positive category. This assumes that the Bayesian score follows a normal distribution and is different from the prediction using a cutoff.
Enrichment: An estimate of enrichment, that is, the increased likelihood (versus random) of this sample being in the category.
Bayesian Score: The standard Laplacian-modified Bayesian score.
Mahalanobis Distance: The Mahalanobis distance (MD) is the distance to the center of the training data. The larger the MD, the less trustworthy the prediction.
Mahalanobis Distance p-value: The p-value gives the fraction of training data with an MD greater than or equal to the one for the given sample, assuming normally distributed data. The smaller the p-value, the less trustworthy the prediction. For highly non-normal X properties (e.g., fingerprints), the MD p-value is wildly inaccurate.

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. All properties and OPS components are within expected ranges.

Feature Contribution
Top features for positive contribution
Fingerprint Bit/Smiles Feature Structure Score Multiple-Carcinogen in training set
SCFP_8 10 0.226 18 out of 39
SCFP_8 136597326 0.215 25 out of 55

SCFP_8 2109165795 0.207 19 out of 42

Top Features for negative contribution


Fingerprint Bit/Smiles Feature Structure Score Multiple-Carcinogen in training set
SCFP_8 -1798344807 -0.737 0 out of 3

SCFP_8 -1798553344 -0.546 0 out of 2

SCFP_8 667776369 -0.546 0 out of 2
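
As a minimal illustration of the Mahalanobis distance defined in the legend, the sketch below computes the distance from a query point to the center of a small two-descriptor training set. The data points are made up for illustration and are not TOPKAT training data; the real models use many more X properties, and the function name is hypothetical.

```python
import math

def mahalanobis_2d(train, query):
    """Distance from query to the training-set center, scaled by the covariance."""
    n = len(train)
    mx = sum(p[0] for p in train) / n
    my = sum(p[1] for p in train) / n
    # Sample covariance matrix [[sxx, sxy], [sxy, syy]]
    sxx = sum((p[0] - mx) ** 2 for p in train) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in train) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in train) / (n - 1)
    det = sxx * syy - sxy * sxy
    dx, dy = query[0] - mx, query[1] - my
    # d^2 = [dx dy] * inverse(covariance) * [dx dy]^T, written out for the 2x2 case
    d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(d2)

# Two positively correlated descriptors (illustrative values only)
train = [(1, 1), (2, 2.1), (3, 2.9), (4, 4.2), (5, 4.8)]

along_trend = mahalanobis_2d(train, (5, 5))    # follows the correlation
against_trend = mahalanobis_2d(train, (5, 1))  # same Euclidean offset, against it

print(f"along trend:   {along_trend:.2f}")
print(f"against trend: {against_trend:.2f}")
```

Both query points lie at the same Euclidean distance from the center, but the one that breaks the correlation between the descriptors gets a much larger Mahalanobis distance. This is the sense in which a large MD marks a query as unlike the training data.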


Molecule TOPKAT_Rat_Male_NTP
Structural Similar Compounds
Name | 1-trans-delta(sup 9)-tetrahydrocannabinol | 1-trans-.delta.-9-Tetrahydrocannabinol | 4;4'-Thiobis-(6-tert-butyl-m-cresol)
Actual Endpoint | Non-Carcinogen | Non-Carcinogen | Non-Carcinogen
Predicted Endpoint | Non-Carcinogen | Non-Carcinogen | Non-Carcinogen
Distance | 0.592 | 0.592 | 0.673
Reference | NTP446 | NTP/TR-446 | NTP/TR-435

C27H21N2[?]
Molecular Weight: 373.46903
ALogP: 6.545
Rotatable Bonds: 3
Acceptors: 2
Donors: 1

Model Prediction
Prediction: Carcinogen
Probability: 0.703
Enrichment: 1.38
Bayesian Score: 2.29
Mahalanobis Distance: 9.47
Mahalanobis Distance p-value: 0.00592
Prediction: Positive if the Bayesian score is above the estimated best cutoff value from minimizing the false positive and false negative rate.
Probability: The estimated probability that the sample is in the positive category. This assumes that the Bayesian score follows a normal distribution and is different from the prediction using a cutoff.
Enrichment: An estimate of enrichment, that is, the increased likelihood (versus random) of this sample being in the category.
Bayesian Score: The standard Laplacian-modified Bayesian score.
Mahalanobis Distance: The Mahalanobis distance (MD) is the distance to the center of the training data. The larger the MD, the less trustworthy the prediction.
Mahalanobis Distance p-value: The p-value gives the fraction of training data with an MD greater than or equal to the one for the given sample, assuming normally distributed data. The smaller the p-value, the less trustworthy the prediction. For highly non-normal X properties (e.g., fingerprints), the MD p-value is wildly inaccurate.

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. All properties and OPS components are within expected ranges.
2. Unknown ECFP_2 feature: -1305021906: [*]['?']
3. Unknown ECFP_2 feature: -428002189: [*]:[cH]:[c](:n:[*])[c](:[*]):[*]
4. Unknown ECFP_2 feature: -1659020767: [*][c](:[*]):[c](['?']):[c]([*]):[*]
5. Unknown ECFP_2 feature: 432684389: ['?'][c](:[*]):[*]

Feature Contribution
Top features for positive contribution
Fingerprint Bit/Smiles Feature Structure Score Carcinogen in training set
ECFP_12 -219423964 0.575 7 out of 7
ECFP_12 -181568884 0.511 9 out of 10

ECFP_12 1639858918 0.47 7 out of 8

Top Features for negative contribution


Fingerprint Bit/Smiles Feature Structure Score Carcinogen in training set
ECFP_12 -1960682721 -0.406 0 out of 1

ECFP_12 -2024255407 -0.329 8 out of 23

ECFP_12 1088861418 -0.319 3 out of 9


Molecule TOPKAT_Skin_Irritancy_None_vs_Irritant
Structural Similar Compounds
Name | Anthraquinone, 1-(2,4,6-trimethylphenylamino)- | p-Phenylenediamine, N,N'-(di-2-naphthyl)- | Phenol, 4,4'-isopropylidenebis(2,6-dichloro-
Actual Endpoint | Irritant | Irritant | Irritant
Predicted Endpoint | Non-Irritant | Non-Irritant | Irritant
Distance | 0.624 | 0.684 | 0.685
Reference | 28ZPAK "Sbornik Vysledku Toxixologickeho Vysetreni Latek A Pripravku," Marhold, J.V., Institut Pro Vychovu Vedoucicn Pracovniku Chemickeho Prumyclu Praha, Czechoslovakia, 1972 Volume(issue)/page/year: -,242,1 | 85JCAE "Prehled Prumyslove Toxikologie; Organicke Latky," Marhold, J., Prague, Czechoslovakia, Avicenum, 1986 Volume(issue)/page/year: -,486,1986 | 85JCAE "Prehled Prumyslove Toxikologie; Organicke Latky," Marhold, J., Prague, Czechoslovakia, Avicenum, 1986 Volume(issue)/page/year: -,536,1986

C27H21N2[?]
Molecular Weight: 373.46903
ALogP: 6.545
Rotatable Bonds: 3
Acceptors: 2
Donors: 1

Model Prediction
Prediction: Non-Irritant
Probability: 0.96
Enrichment: 1.04
Bayesian Score: -1.54
Mahalanobis Distance: 6.7
Mahalanobis Distance p-value: 0.999
Prediction: Positive if the Bayesian score is above the estimated best cutoff value from minimizing the false positive and false negative rate.
Probability: The estimated probability that the sample is in the positive category. This assumes that the Bayesian score follows a normal distribution and is different from the prediction using a cutoff.
Enrichment: An estimate of enrichment, that is, the increased likelihood (versus random) of this sample being in the category.
Bayesian Score: The standard Laplacian-modified Bayesian score.
Mahalanobis Distance: The Mahalanobis distance (MD) is the distance to the center of the training data. The larger the MD, the less trustworthy the prediction.
Mahalanobis Distance p-value: The p-value gives the fraction of training data with an MD greater than or equal to the one for the given sample, assuming normally distributed data. The smaller the p-value, the less trustworthy the prediction. For highly non-normal X properties (e.g., fingerprints), the MD p-value is wildly inaccurate.

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. All properties and OPS components are within expected ranges.

Feature Contribution
Top Features for negative contribution
Fingerprint Bit/Smiles Feature Structure Score Irritant in training set
FCFP_12 1069584379 -0.439 38 out of 65

FCFP_12 900733322 -0.153 3 out of 4

FCFP_12 -1861645784 -0.125 12 out of 15


Molecule TOPKAT_Skin_Sensitization_None_vs_Sensitizer
Structural Similar Compounds
Name | Dehydroabietic Acid | Palustric acid | Abietic acid
Actual Endpoint | Sensitizer | Sensitizer | Sensitizer
Predicted Endpoint | Sensitizer | Sensitizer | Sensitizer
Distance | 0.650 | 0.672 | 0.682
Reference | Contact Dermatitis (1989) 20:41 | Contact Dermatitis (1990) 23:90 | Contact Dermatitis (1989) 20:41

C27H21N2[?]
Molecular Weight: 373.46903
ALogP: 6.545
Rotatable Bonds: 3
Acceptors: 2
Donors: 1

Model Prediction
Prediction: Non-Sensitizer
Probability: 0.717
Enrichment: 1.04
Bayesian Score: -1.49
Mahalanobis Distance: 6.71
Mahalanobis Distance p-value: 0.435
Prediction: Positive if the Bayesian score is above the estimated best cutoff value from minimizing the false positive and false negative rate.
Probability: The estimated probability that the sample is in the positive category. This assumes that the Bayesian score follows a normal distribution and is different from the prediction using a cutoff.
Enrichment: An estimate of enrichment, that is, the increased likelihood (versus random) of this sample being in the category.
Bayesian Score: The standard Laplacian-modified Bayesian score.
Mahalanobis Distance: The Mahalanobis distance (MD) is the distance to the center of the training data. The larger the MD, the less trustworthy the prediction.
Mahalanobis Distance p-value: The p-value gives the fraction of training data with an MD greater than or equal to the one for the given sample, assuming normally distributed data. The smaller the p-value, the less trustworthy the prediction. For highly non-normal X properties (e.g., fingerprints), the MD p-value is wildly inaccurate.

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. All properties and OPS components are within expected ranges.
2. Unknown FCFP_2 feature: -1861645784: [*]:[cH]:[c](:[cH]:[*])[c](:[*]):[*]
3. Unknown FCFP_2 feature: 690511177: [*]:[cH]:[c](:n:[*])[c](:[*]):[*]

Feature Contribution
Top features for positive contribution
Fingerprint Bit/Smiles Feature Structure Score Sensitizer in training set
FCFP_12 -497728148 0.304 15 out of 15
FCFP_12 907007053 0.254 17 out of 18

FCFP_12 1069584379 0.229 22 out of 24

Top Features for negative contribution


Fingerprint Bit/Smiles Feature Structure Score Sensitizer in training set
FCFP_12 -1373488931 -0.412 3 out of 7

FCFP_12 -387072142 -0.403 4 out of 9

FCFP_12 307419094 -0.397 5 out of 11


Molecule TOPKAT_Weight_of_Evidence_Rodent_Carcinogenicity
Structural Similar Compounds
Name | Dronabinol | Dicofol | Hexachlorophene
Actual Endpoint | Non-Carcinogen | Non-Carcinogen | Non-Carcinogen
Predicted Endpoint | Non-Carcinogen | Non-Carcinogen | Non-Carcinogen
Distance | 0.598 | 0.600 | 0.669
Reference | US FDA (Centre for Drug Eval.& Res./Off. Testing & Res.) Sept. 1997 | NCI/NTP TR-90 | US FDA (Centre for Drug Eval.& Res./Off. Testing & Res.) Sept. 1997

C27H21N2[?]
Molecular Weight: 373.46903
ALogP: 6.545
Rotatable Bonds: 3
Acceptors: 2
Donors: 1

Model Prediction
Prediction: Non-Carcinogen
Probability: 0.513
Enrichment: 0.997
Bayesian Score: -0.679
Mahalanobis Distance: 4.91
Mahalanobis Distance p-value: 0.999
Prediction: Positive if the Bayesian score is above the estimated best cutoff value from minimizing the false positive and false negative rate.
Probability: The estimated probability that the sample is in the positive category. This assumes that the Bayesian score follows a normal distribution and is different from the prediction using a cutoff.
Enrichment: An estimate of enrichment, that is, the increased likelihood (versus random) of this sample being in the category.
Bayesian Score: The standard Laplacian-modified Bayesian score.
Mahalanobis Distance: The Mahalanobis distance (MD) is the distance to the center of the training data. The larger the MD, the less trustworthy the prediction.
Mahalanobis Distance p-value: The p-value gives the fraction of training data with an MD greater than or equal to the one for the given sample, assuming normally distributed data. The smaller the p-value, the less trustworthy the prediction. For highly non-normal X properties (e.g., fingerprints), the MD p-value is wildly inaccurate.

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. All properties and OPS components are within expected ranges.

Feature Contribution
Top features for positive contribution
Fingerprint Bit/Smiles Feature Structure Score Carcinogen in training set
SCFP_8 2098257782 0.39 4 out of 5
SCFP_8 1651620003 0.386 11 out of 15

SCFP_8 384920865 0.332 37 out of 55

Top Features for negative contribution


Fingerprint Bit/Smiles Feature Structure Score Carcinogen in training set
SCFP_8 -1211866396 -0.685 6 out of 27

SCFP_8 -1850396224 -0.67 0 out of 2

SCFP_8 -534368618 -0.39 0 out of 1


Molecule TOPKAT_Carcinogenic_Potency_TD50_Mouse
Structural Similar Compounds
Name | Dibenz(a,h)anthracene | C.I. vat yellow 4 | Benzo(a)pyrene
Actual Endpoint (-log C) | 4.67521 | 1.48417 | 4.8616
Predicted Endpoint (-log C) | 4.44723 | 4.45029 | 4.33455
Distance | 0.756 | 0.761 | 0.778
Reference | CPDB | CPDB | CPDB

C27H21N2[?]
Molecular Weight: 373.46903
ALogP: 6.545
Rotatable Bonds: 3
Acceptors: 2
Donors: 1

Model Prediction
Prediction: 1.14
Unit: mg/kg_body_weight/day
Mahalanobis Distance: 11.9
Mahalanobis Distance p-value: 3.61e-005
Mahalanobis Distance: The Mahalanobis distance (MD) is a generalization of the Euclidean distance that accounts for correlations among the X properties. It is calculated as the distance to the center of the training data. The larger the MD, the less trustworthy the prediction.
Mahalanobis Distance p-value: The p-value gives the fraction of training data with an MD greater than or equal to the one for the given sample, assuming normally distributed data. The smaller the p-value, the less trustworthy the prediction. For highly non-normal X properties (e.g., fingerprints), the MD p-value is wildly inaccurate.

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. OPS PC16 out of range. Value: 4.2452. Training min, max, SD, explained variance: -3.1026, 4.016, 1.245, 0.0193.
2. Unknown ECFP_2 feature: -1305021906: [*]['?']
3. Unknown ECFP_2 feature: -1659020767: [*][c](:[*]):[c](['?']):[c]([*]):[*]
4. Unknown ECFP_2 feature: 432684389: ['?'][c](:[*]):[*]

Feature Contribution
Top features for positive contribution
Fingerprint Bit/Smiles Feature Structure Score
ECFP_6 655739385 0.229
ECFP_6 1572579716 0.225

ECFP_6 1559650422 0.203

Top Features for negative contribution


Fingerprint Bit/Smiles Feature Structure Score
ECFP_6 1996767644 -0.251

ECFP_6 642810091 -0.247

ECFP_6 -182236392 -0.232
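
The "out of range" entry under Model Applicability for this model compares a descriptor value with the minimum and maximum seen in the training set. A minimal sketch of that comparison follows, using the OPS PC16 numbers reported for this model. Expressing the excess in training standard deviations is an added illustration and is not something the report itself computes, and the function name is hypothetical.

```python
def range_violation(value, train_min, train_max, train_sd):
    """Return how far value lies outside [train_min, train_max], in SD units.

    Returns 0.0 when the value is inside the training range.
    """
    if value > train_max:
        return (value - train_max) / train_sd
    if value < train_min:
        return (train_min - value) / train_sd
    return 0.0

# OPS PC16: value, training min, training max, training SD (from the report)
excess = range_violation(4.2452, -3.1026, 4.016, 1.245)
print(f"OPS PC16 lies {excess:.2f} SD outside the training range")
```

A small excess means the query sits just beyond the edge of the training data. It is still flagged, since the model is extrapolating on that component.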


Molecule TOPKAT_Carcinogenic_Potency_TD50_Rat
Structural Similar Compounds
Name | 3-Methylcholanthrene s | Benzo(a)pyrene | 1-Nitropyrene
Actual Endpoint (-log C) | 5.73762 | 5.42148 | 4.87069
Predicted Endpoint (-log C) | 5.03041 | 5.03532 | 5.08823
Distance | 0.764 | 0.780 | 0.806
Reference | CPDB | CPDB | CPDB

C27H21N2[?]
Molecular Weight: 373.46903
ALogP: 6.545
Rotatable Bonds: 3
Acceptors: 2
Donors: 1

Model Prediction
Prediction: 0.633
Unit: mg/kg_body_weight/day
Mahalanobis Distance: 14.2
Mahalanobis Distance p-value: 6.63e-009
Mahalanobis Distance: The Mahalanobis distance (MD) is a generalization of the Euclidean distance that accounts for correlations among the X properties. It is calculated as the distance to the center of the training data. The larger the MD, the less trustworthy the prediction.
Mahalanobis Distance p-value: The p-value gives the fraction of training data with an MD greater than or equal to the one for the given sample, assuming normally distributed data. The smaller the p-value, the less trustworthy the prediction. For highly non-normal X properties (e.g., fingerprints), the MD p-value is wildly inaccurate.

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. OPS PC19 out of range. Value: -3.578. Training min, max, SD, explained variance: -2.9709, 5.6065, 1.282, 0.0158.

Feature Contribution
Top features for positive contribution
Fingerprint Bit/Smiles Feature Structure Score
FCFP_6 -1861645784 0.359
FCFP_6 690511177 0.293

FCFP_6 203677720 0.137

Top Features for negative contribution


Fingerprint Bit/Smiles Feature Structure Score
FCFP_6 991735244 -0.422

FCFP_6 16 -0.354

FCFP_6 17 -0.149
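
The similar-compound endpoints for the TD50 models are listed as -log C, while the query prediction is given in mg/kg_body_weight/day. The report does not define C. The sketch below assumes C is the same dose expressed in mol per kg body weight per day, so that the prediction can be put on the same scale as the neighbors using the reported molecular weight. Treat the conversion as an assumption to be checked against the TOPKAT documentation; the function name is illustrative.

```python
import math

MOLECULAR_WEIGHT = 373.46903  # g/mol, as reported for the query molecule

def neg_log_molar(amount_mg, mol_weight=MOLECULAR_WEIGHT):
    """-log10 of a dose in mol, given the dose in mg.

    Assumption: C in "-log C" is the molar form of the reported dose.
    """
    moles = amount_mg / 1000.0 / mol_weight
    return -math.log10(moles)

td50_rat = neg_log_molar(0.633)   # reported rat TD50, mg/kg_body_weight/day
td50_mouse = neg_log_molar(1.14)  # reported mouse TD50, mg/kg_body_weight/day

print(f"rat   -log C: {td50_rat:.2f}")
print(f"mouse -log C: {td50_mouse:.2f}")
```

Under this assumption the converted values can be read next to the Actual and Predicted Endpoint rows of the similar compounds, which makes it easier to judge whether the query prediction is in line with its nearest neighbors.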
Molecule TOPKAT_Chronic_LOAEL
Structural Similar Compounds
Name | CLOTRIMAZOLE | C.I.YELLOW 4 | MIDAZOLAM.HCL
Actual Endpoint (-log C) | 4.53762 | 2.9775 | 4.55867
Predicted Endpoint (-log C) | 4.55826 | 3.32125 | 3.94765
Distance | 0.702 | 0.727 | 0.854
Reference | NDA-18813 | NTP 134 69 | NDA-18654

C27H21N2[?]
Molecular Weight: 373.46903
ALogP: 6.545
Rotatable Bonds: 3
Acceptors: 2
Donors: 1

Model Prediction
Prediction: 0.0121
Unit: g/kg_body_weight
Mahalanobis Distance: 26.5
Mahalanobis Distance p-value: 2.81e-018
Mahalanobis Distance: The Mahalanobis distance (MD) is a generalization of the Euclidean distance that accounts for correlations among the X properties. It is calculated as the distance to the center of the training data. The larger the MD, the less trustworthy the prediction.
Mahalanobis Distance p-value: The p-value gives the fraction of training data with an MD greater than or equal to the one for the given sample, assuming normally distributed data. The smaller the p-value, the less trustworthy the prediction. For highly non-normal X properties (e.g., fingerprints), the MD p-value is wildly inaccurate.

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. All properties and OPS components are within expected ranges.
2. Unknown ECFP_6 feature: -1305021906: [*]['?']
3. Unknown ECFP_6 feature: -181568884: [*]:[cH]:[c](:[cH]:[*])[c](:[*]):[*]
4. Unknown ECFP_6 feature: -428002189: [*]:[cH]:[c](:n:[*])[c](:[*]):[*]
5. Unknown ECFP_6 feature: -1734834311: ['?'][c](:[*]):[c](N):n:[*]
6. Unknown ECFP_6 feature: -1659020767: [*][c](:[*]):[c](['?']):[c]([*]):[*]
7. Unknown ECFP_6 feature: 432684389: ['?'][c](:[*]):[*]
8. Unknown ECFP_6 feature: -938530932: [*]:[c](:[*])N
9. Unknown ECFP_6 feature: 767488533: [*]:[c](:[*])CC
10. Unknown ECFP_6 feature: -813997308: [*]:[c](:[*])[c](:[c](:[*]):[*]):[c](:[*]):[*]
11. Unknown ECFP_6 feature: -178525456: [*]:[cH]:[c](:[cH]:[*]):[c](:[*]):[*]
12. Unknown ECFP_6 feature: 1997021792: [*]:[cH]:[cH]:[cH]:[*]
13. Unknown ECFP_6 feature: 1333660716: [*][c](:[*]):[c](:[cH]:[*]):[c](:[*]):[*]

Feature Contribution
Top features for positive contribution
Fingerprint Bit/Smiles Feature Structure Score
ECFP_6 1559650422 0.129

FCFP_6 3 0.0924

FCFP_6 1069584379 0.0717

Top Features for negative contribution


Fingerprint Bit/Smiles Feature Structure Score
FCFP_6 991735244 -0.134

ECFP_6 1564392544 -0.133


FCFP_6 -453677277 -0.0906
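
The "Unknown ... feature" entries under Model Applicability follow a fixed pattern: fingerprint type, integer bit identifier, then a substructure pattern. When several models are compared, it can help to pull these out of the report text. The sketch below does so with a regular expression that was written by inspecting the lines in this report; it is an assumption and not a documented TOPKAT output format.

```python
import re
from collections import Counter

# Pattern inferred from the report lines (assumption, not a documented format)
UNKNOWN_FEATURE = re.compile(r"Unknown (\w+) feature: (-?\d+): (\S+)")

def parse_unknown_features(lines):
    """Return (fingerprint, bit, pattern) tuples for every matching line."""
    found = []
    for line in lines:
        match = UNKNOWN_FEATURE.search(line)
        if match:
            found.append((match.group(1), int(match.group(2)), match.group(3)))
    return found

report_lines = [
    "1. All properties and OPS components are within expected ranges.",
    "2. Unknown ECFP_6 feature: -1305021906: [*]['?']",
    "8. Unknown ECFP_6 feature: -938530932: [*]:[c](:[*])N",
    "9. Unknown ECFP_6 feature: 767488533: [*]:[c](:[*])CC",
]

parsed = parse_unknown_features(report_lines)
counts = Counter(fingerprint for fingerprint, _, _ in parsed)

for fingerprint, bit, pattern in parsed:
    print(fingerprint, bit, pattern)
print(dict(counts))
```

Counting unknown features per model gives a rough, comparable indication of how much of the query structure each training set fails to cover.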
Molecule TOPKAT_Daphnia_EC50
Structural Similar Compounds
Name | Brodifacoum | OBPA | Difethialone
Actual Endpoint (-log C) | 5.728 | 8.02 | 8.088
Predicted Endpoint (-log C) | 6.96982 | 7.58014 | 7.23363
Distance | 0.761 | 0.769 | 0.794
Reference | Toropov and Benfenati, 2006, Bioorganic & Medicinal Chemistry, 14(8), 2779-2788 | Toropov and Benfenati, 2006, Bioorganic & Medicinal Chemistry, 14(8), 2779-2788 | Toropov and Benfenati, 2006, Bioorganic & Medicinal Chemistry, 14(8), 2779-2788

C27H21N2[?]
Molecular Weight: 373.46903
ALogP: 6.545
Rotatable Bonds: 3
Acceptors: 2
Donors: 1

Model Prediction
Prediction: 0.398
Unit: mg/l
Mahalanobis Distance: 29.7
Mahalanobis Distance p-value: 1.37e-031
Mahalanobis Distance: The Mahalanobis distance (MD) is a generalization of the Euclidean distance that accounts for correlations among the X properties. It is calculated as the distance to the center of the training data. The larger the MD, the less trustworthy the prediction.
Mahalanobis Distance p-value: The p-value gives the fraction of training data with an MD greater than or equal to the one for the given sample, assuming normally distributed data. The smaller the p-value, the less trustworthy the prediction. For highly non-normal X properties (e.g., fingerprints), the MD p-value is wildly inaccurate.

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. Num_AromaticRings out of range. Value: 5. Training min, max, mean, SD: 0, 4, 0.90826, 0.856.
2. Unknown ECFP_6 feature: -1305021906: [*]['?']
3. Unknown ECFP_6 feature: -181568884: [*]:[cH]:[c](:[cH]:[*])[c](:[*]):[*]
4. Unknown ECFP_6 feature: -428002189: [*]:[cH]:[c](:n:[*])[c](:[*]):[*]
5. Unknown ECFP_6 feature: -427397688: ['?'][c](:[*]):[c](:[cH]:[*])[c](:[*]):[*]
6. Unknown ECFP_6 feature: -1734834311: ['?'][c](:[*]):[c](N):n:[*]
7. Unknown ECFP_6 feature: -1659020767: [*][c](:[*]):[c](['?']):[c]([*]):[*]
8. Unknown ECFP_6 feature: 432684389: ['?'][c](:[*]):[*]
9. Unknown ECFP_6 feature: 767488533: [*]:[c](:[*])CC
10. Unknown ECFP_6 feature: -813997308: [*]:[c](:[*])[c](:[c](:[*]):[*]):[c](:[*]):[*]
11. Unknown ECFP_6 feature: -178525456: [*]:[cH]:[c](:[cH]:[*]):[c](:[*]):[*]
12. Unknown ECFP_6 feature: 1333660716: [*][c](:[*]):[c](:[cH]:[*]):[c](:[*]):[*]

Feature Contribution
Top features for positive contribution
Fingerprint Bit/Smiles Feature Structure Score
ECFP_6 642810091 0.148

ECFP_6 1572579716 0.114

FCFP_6 1069584379 0.0966

Top Features for negative contribution


Fingerprint Bit/Smiles Feature Structure Score
FCFP_6 0 -0.202

FCFP_6 136597326 -0.193


FCFP_6 17 -0.189
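
One simple way to read the Structural Similar Compounds table of a regression model is to compare the Actual and Predicted Endpoint rows for the three neighbors. The sketch below does this for the Daphnia EC50 neighbors listed above and reports the residuals and their mean absolute value, in -log C units. This is an illustrative check on local model fit and is not a statistic produced by the report.

```python
# Values copied from the Daphnia EC50 similar-compound table (-log C)
names = ["Brodifacoum", "OBPA", "Difethialone"]
actual = [5.728, 8.02, 8.088]
predicted = [6.96982, 7.58014, 7.23363]

# Residual = actual minus predicted, per neighbor
residuals = [a - p for a, p in zip(actual, predicted)]
mean_abs_error = sum(abs(r) for r in residuals) / len(residuals)

for name, residual in zip(names, residuals):
    print(f"{name:<13} residual {residual:+.3f}")
print(f"mean absolute error: {mean_abs_error:.3f}")
```

Large residuals on the nearest training compounds suggest the model fits this region of chemical space loosely, which is worth weighing alongside the Mahalanobis distance and the applicability warnings.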
Molecule TOPKAT_Fathead_Minnow_LC50
Structural Similar Compounds
Name | 2,6-Diphenylpyridine | Triphenylphosphine oxide | Diphenylphthalate
Actual Endpoint (-log C) | 6.04191 | 3.71444 | 6.6
Predicted Endpoint (-log C) | 5.89449 | 6.08828 | 6.4642
Distance | 1.066 | 1.072 | 1.114
Reference | DSSTox/EPAFHM | DSSTox/EPAFHM | ATOCFM Volume 2

C27H21N2[?]
Molecular Weight: 373.46903
ALogP: 6.545
Rotatable Bonds: 3
Acceptors: 2
Donors: 1

Model Prediction
Prediction: 5.85e-005
Unit: g/l
Mahalanobis Distance: 19.4
Mahalanobis Distance p-value: 1.95e-038
Mahalanobis Distance: The Mahalanobis distance (MD) is a generalization of the Euclidean distance that accounts for correlations among the X properties. It is calculated as the distance to the center of the training data. The larger the MD, the less trustworthy the prediction.
Mahalanobis Distance p-value: The p-value gives the fraction of training data with an MD greater than or equal to the one for the given sample, assuming normally distributed data. The smaller the p-value, the less trustworthy the prediction. For highly non-normal X properties (e.g., fingerprints), the MD p-value is wildly inaccurate.

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. Num_AromaticRings out of range. Value: 5. Training min, max, mean, SD: 0, 3, 0.64948, 0.679.
2. OPS PC12 out of range. Value: 3.8075. Training min, max, SD, explained variance: -3.3233, 3.4374, 1.268, 0.0277.

Feature Contribution
Top features for positive contribution
Fingerprint Bit/Smiles Feature Structure Score
FCFP_2 1069584379 0.105
FCFP_2 16 0.0139

FCFP_2 907007053 0.00893

Top Features for negative contribution


Fingerprint Bit/Smiles Feature Structure Score
FCFP_2 0 -0.275

FCFP_2 136120670 -0.214

FCFP_2 3 -0.198
Molecule TOPKAT_Rat_Inhalational_LC50
Structural Similar Compounds
Name | 1H-Benzimidazole; 5-chloro-6-(2;3-dichlorophenoxy)-2-(methylthio)- | 1H-1;2;4-Triazole; 1-((bis(4-fluorophenyl)methylsilyl)methyl)- | Propanamide; 2-(2-naphthalenyloxy)-N-phenyl-
Actual Endpoint (-log C) | 2.2548 | 1.5166 | 1.6396
Predicted Endpoint (-log C) | 1.69815 | 1.21248 | 0.942087
Distance | 0.871 | 0.957 | 0.962
Reference | MDACAP Medicamentos de Actualidad. (J.R. Prous; S.A.; Apartado de Correos 540; 08080 Barcelona; Spain) V.1- 1965- Volume(issue)/page/year: 21;227;1985 | NTIS** National Technical Information Service. (Springfield; VA 22161) Formerly U.S. Clearinghouse for Scientific & Technical Information. Volume(issue)/page/year: OTS0543806 | NNGADV Nippon Noyaku Gakkaishi. Journal of the Pesticide Science Society of Japan. (Nippon Noyaku Gakkai; 1-43-11; Komagome; Toshima-ku; Tokyo 170; Japan) V.1- 1976- Volume(issue)/page/year: 12;563;1987

C27H21N2[?]
Molecular Weight: 373.46903
ALogP: 6.545
Rotatable Bonds: 3
Acceptors: 2
Donors: 1

Model Prediction
Prediction: 4.72e+004
Unit: mg/m3/h
Mahalanobis Distance: 14.9
Mahalanobis Distance p-value: 1.35e-014
Mahalanobis Distance: The Mahalanobis distance (MD) is a generalization of the Euclidean distance that accounts for correlations among the X properties. It is calculated as the distance to the center of the training data. The larger the MD, the less trustworthy the prediction.
Mahalanobis Distance p-value: The p-value gives the fraction of training data with an MD greater than or equal to the one for the given sample, assuming normally distributed data. The smaller the p-value, the less trustworthy the prediction. For highly non-normal X properties (e.g., fingerprints), the MD p-value is wildly inaccurate.

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. Num_AromaticRings out of range. Value: 5. Training min, max, mean, SD: 0, 3, 0.57958, 0.795.
2. OPS PC21 out of range. Value: -3.1422. Training min, max, SD, explained variance: -3.0247, 4.4972, 1.058, 0.0155.
3. Unknown ECFP_2 feature: -1305021906: [*]['?']
4. Unknown ECFP_2 feature: -428002189: [*]:[cH]:[c](:n:[*])[c](:[*]):[*]
5. Unknown ECFP_2 feature: -1734834311: ['?'][c](:[*]):[c](N):n:[*]
6. Unknown ECFP_2 feature: -1659020767: [*][c](:[*]):[c](['?']):[c]([*]):[*]
7. Unknown ECFP_2 feature: 432684389: ['?'][c](:[*]):[*]
8. Unknown ECFP_2 feature: -813997308: [*]:[c](:[*])[c](:[c](:[*]):[*]):[c](:[*]):[*]

Feature Contribution
Top features for positive contribution
Fingerprint Bit/Smiles Feature Structure Score
ECFP_2 642810091 0.214

ECFP_2 1572579716 0.159

ECFP_2 1996767644 0.127

Top Features for negative contribution


Fingerprint Bit/Smiles Feature Structure Score
ECFP_2 863188371 -0.338
ECFP_2 734603939 -0.302

ECFP_2 655739385 -0.217


Molecule TOPKAT_Rat_Maximum_Tolerated_Dose_Feed
Structural Similar Compounds
Name | D&C YELLOW NO. 11 | C.I. SOLVENT YELLOW 14 | C.I.PIGMENT RED 3
Actual Endpoint (-log C) | 4.03869 | 4.04277 | 2.65635
Predicted Endpoint (-log C) | 3.54593 | 2.8989 | 2.97957
Distance | 0.901 | 0.931 | 0.950
Reference | NCI/NTP TR-463 | NCI/NTP TR-226 | NCI/NTP TR-407

C27H21N2[?]
Molecular Weight: 373.46903
ALogP: 6.545
Rotatable Bonds: 3
Acceptors: 2
Donors: 1

Model Prediction
Prediction: 0.0889
Unit: g/kg_body_weight
Mahalanobis Distance: 11.4
Mahalanobis Distance p-value: 1.42e-007
Mahalanobis Distance: The Mahalanobis distance (MD) is a generalization of the Euclidean distance that accounts for correlations among the X properties. It is calculated as the distance to the center of the training data. The larger the MD, the less trustworthy the prediction.
Mahalanobis Distance p-value: The p-value gives the fraction of training data with an MD greater than or equal to the one for the given sample, assuming normally distributed data. The smaller the p-value, the less trustworthy the prediction. For highly non-normal X properties (e.g., fingerprints), the MD p-value is wildly inaccurate.

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. Num_AromaticRings out of range. Value: 5. Training min, max, mean, SD: 0, 4, 1.1685, 0.8469.
2. OPS PC8 out of range. Value: 4.0879. Training min, max, SD, explained variance: -3.8548, 3.9137, 1.331, 0.0400.

Feature Contribution
Top features for positive contribution
Fingerprint Bit/Smiles Feature Structure Score
FCFP_2 3 0.0737
FCFP_2 136120670 0.064

FCFP_2 17 0.0441

Top Features for negative contribution


Fingerprint Bit/Smiles Feature Structure Score
FCFP_2 203677720 -0.0829

FCFP_2 16 -0.0512

FCFP_2 0 -0.0314
Molecule TOPKAT_Rat_Maximum_Tolerated_Dose_Gavage
Structural Similar Compounds
Name | PHENYLBUTAZONE | PROMETHAZINE.HCL | o-BENZYL-p-CHLOROPHENOL
Actual Endpoint (-log C) | 3.48909 | 3.93152 | 3.26063
Predicted Endpoint (-log C) | 3.17333 | 4.72433 | 3.64448
Distance | 1.279 | 1.291 | 1.301
Reference | NCI/NTP TR-367 | NCI/NTP TR-425 | NCI/NTP TR-424

C27H21N2[?]
Molecular Weight: 373.46903
ALogP: 6.545
Rotatable Bonds: 3
Acceptors: 2
Donors: 1

Model Prediction
Prediction: 0.169
Unit: g/kg_body_weight
Mahalanobis Distance: 15.4
Mahalanobis Distance p-value: 1.61e-013
Mahalanobis Distance: The Mahalanobis distance (MD) is a generalization of the Euclidean distance that accounts for correlations among the X properties. It is calculated as the distance to the center of the training data. The larger the MD, the less trustworthy the prediction.
Mahalanobis Distance p-value: The p-value gives the fraction of training data with an MD greater than or equal to the one for the given sample, assuming normally distributed data. The smaller the p-value, the less trustworthy the prediction. For highly non-normal X properties (e.g., fingerprints), the MD p-value is wildly inaccurate.

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. Num_AromaticRings out of range. Value: 5. Training min, max, mean, SD: 0, 2, 0.5625, 0.693.
2. OPS PC6 out of range. Value: -3.1476. Training min, max, SD, explained variance: -2.4321, 2.9885, 1.256, 0.0488.
3. Unknown FCFP_2 feature: -1861645784: [*]:[cH]:[c](:[cH]:[*])[c](:[*]):[*]
4. Unknown FCFP_2 feature: 690511177: [*]:[cH]:[c](:n:[*])[c](:[*]):[*]

Feature Contribution
Top features for positive contribution
Fingerprint Bit/Smiles Feature Structure Score
FCFP_2 3 0.104
Top Features for negative contribution
Fingerprint Bit/Smiles Feature Structure Score
FCFP_2 136597326 -0.489

FCFP_2 203677720 -0.406

FCFP_2 0 -0.29
Molecule TOPKAT_Rat_Oral_LD50
Structural Similar Compounds
Name | sym-DI-.beta.-NAPHTHYL-p-PHENYLENEDIAMINE | FENDOSAL | ANTHRAQUINONE; 1;4-bis-(p-TOLYLAMINO)-
Actual Endpoint (-log C) | 1.904 | 2.928 | 2.058
Predicted Endpoint (-log C) | 1.87002 | 2.59 | 1.57464
Distance | 0.599 | 0.751 | 0.752
Reference | IPSTB3 3;93;76 | AGACBH 8;209;78 | 85JCAE -;1330;86

C27H21N2[?]
Molecular Weight: 373.46903
ALogP: 6.545
Rotatable Bonds: 3
Acceptors: 2
Donors: 1

Model Prediction
Prediction: 1.51
Unit: g/kg_body_weight
Mahalanobis Distance: 20.5
Mahalanobis Distance p-value: 1.5e-010
Mahalanobis Distance: The Mahalanobis distance (MD) is a generalization of the Euclidean distance that accounts for correlations among the X properties. It is calculated as the distance to the center of the training data. The larger the MD, the less trustworthy the prediction.
Mahalanobis Distance p-value: The p-value gives the fraction of training data with an MD greater than or equal to the one for the given sample, assuming normally distributed data. The smaller the p-value, the less trustworthy the prediction. For highly non-normal X properties (e.g., fingerprints), the MD p-value is wildly inaccurate.

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. All properties and OPS components are within expected ranges.
2. Unknown ECFP_2 feature: -1305021906: [*]['?']
3. Unknown ECFP_2 feature: -1659020767: [*][c](:[*]):[c](['?']):[c]([*]):[*]
4. Unknown ECFP_2 feature: 432684389: ['?'][c](:[*]):[*]
5. Unknown FCFP_6 feature: 16: [*]:[cH]:[*]
6. Unknown FCFP_6 feature: 1618154665: [*][c](:[*]):[cH]:[cH]:[*]
7. Unknown FCFP_6 feature: -1861645784: [*]:[cH]:[c](:[cH]:[*])[c](:[*]):[*]
8. Unknown FCFP_6 feature: 690511177: [*]:[cH]:[c](:n:[*])[c](:[*]):[*]
9. Unknown FCFP_6 feature: 1747237384: [*][c](:[*]):n:[c]([*]):[*]
10. Unknown FCFP_6 feature: -1151884458: ['?'][c](:[*]):[c](N):n:[*]
11. Unknown FCFP_6 feature: 1069584379: [*]:[c](:[*])N

Feature Contribution
Top features for positive contribution
Fingerprint Bit/Smiles Feature Structure Score
ECFP_6 642810091 0.281

ECFP_6 1333660716 0.115

FCFP_6 136120670 0.103

Top Features for negative contribution


Fingerprint Bit/Smiles Feature Structure Score
ECFP_6 655739385 -0.239

ECFP_6 734603939 -0.201


ECFP_6 -178525456 -0.157
