Smilesdrawer: Parsing and Drawing Smiles-Encoded Molecular Structures Using Client-Side Javascript


Application Note

Cite This: J. Chem. Inf. Model. 2018, 58, 1−7 pubs.acs.org/jcim

SmilesDrawer: Parsing and Drawing SMILES-Encoded Molecular Structures Using Client-Side JavaScript
Daniel Probst* and Jean-Louis Reymond*
Department of Chemistry and Biochemistry, National Center for Competence in Research NCCR TransCure, University of Berne,
Freiestrasse 3, 3012 Berne, Switzerland
Supporting Information

ABSTRACT: Here we present SmilesDrawer, a dependency-free JavaScript component capable of both parsing and drawing SMILES-encoded molecular structures client-side, developed to be easily integrated into web projects and to display organic molecules in large numbers and fast succession. SmilesDrawer can draw structurally and stereochemically complex structures such as maitotoxin and C60 without using templates, yet has an exceptionally small computational footprint and low memory usage without the requirement for loading images or any other form of client-server communication, making it easy to integrate even in secure (intranet, firewalled) or offline applications. These features allow the rendering of thousands of molecular structure drawings on a single web page within seconds on a wide range of hardware supporting modern browsers. The source code as well as the most recent build of SmilesDrawer is available on GitHub (http://doc.gdb.tools/smilesDrawer/). Both yarn and npm packages are also available.

Received: July 18, 2017
Published: December 19, 2017

■ INTRODUCTION
The past decade has seen the releases of a myriad of web-based applications in the fields of bio- and cheminformatics. A major advantage of these browser-rendered frontends is the availability of a large variety of JavaScript libraries and components available through package managers like yarn or npm. Among these libraries, many components deal with the display and transmission of molecular structure information, in particular creating SMILES (simplified molecular-input line-entry system) from molecular structures drawn by the user,1,2 such that the molecular information can be transmitted and processed rapidly. Indeed, together with InChI,3 SMILES is the de facto standard for encoding chemical species as short, single-line ASCII strings.4

A SMILES string is created from a molecular structure by computing a spanning tree of the undirected graph representing the molecule (atoms as vertices, bonds as edges), retaining the broken cycles by indexing the removed edges (bonds) on both participating vertices (atoms) and identifying the longest path in the resulting spanning tree. Next, the SMILES is generated by following the longest path, writing out the current chemical element's symbol followed by a bond and the index of a broken cycle if available. In case of branching vertices (atoms), each branch is written enclosed by parentheses.5

Here we address the lack of the corresponding easy-to-use, small-footprint JavaScript components to perform the inverse task, that is, draw drug-like molecular structures from SMILES, which can help in applications dealing with the display of molecular structures from very large databases such as the GDB and GDB-derived databases published by our group.6,7

Currently, most web applications and database frontends (PubChem Sketcher, Marvin JS by ChemAxon Inc.) rely on a server-side backend for providing pregenerated images, dynamically generated images or atom coordinates.8–14 The drawbacks of loading information from a server are the following: (I) Pregenerating images requires the creation, possible update as well as the storage of many images in persistent memory. Serving these images negatively influences loading times depending on the server hardware and the image size, especially on high-latency mobile networks.15 Pregenerated images are also only available in certain resolutions, colors and drawing styles, possibly requiring the creation of a new set of images depending on the front-end. (II) Dynamically generating images using GET requests requires either the provisioning of a web service capable of doing so or the reliance on a service provided by a third party, thus sending potentially confidential information to an external server. (III) Calculating atom coordinates server-side, as implemented in Marvin JS by ChemAxon, with subsequent drawing of the structure client-side resolves performance issues when implemented correctly, but would still require the provisioning of a web service resulting in infrastructure overhead and potential security issues.

Rendering a molecular structure from SMILES directly is challenging since the SMILES notation only records topology but no spatial information, in contrast to other formats such as PDB or CML,16,17 which explains the use of server-side

© 2017 American Chemical Society. DOI: 10.1021/acs.jcim.7b00425

Figure 1. Molecular structures drawn by SmilesDrawer. SmilesDrawer applies a dynamic system simulation based on Kamada and Kawai's algorithm to determine the position of atoms when it encounters a bridged ring in a molecule. This enables SmilesDrawer to depict a wide range of molecules, such as cubane (drawing time: 3.5 ms) (1), trinorbornane (4.3 ms) (2), heptacyclo[6.4.0.0^2,7.0^3,6.0^4,11.0^5,10.0^9,12]dodecane (10.5 ms) (3), dodecahedrane (11.7 ms) (4), buckminsterfullerene (85.7 ms) (5), vitamin B12 (cyanocobalamin) (55.5 ms) (6), hydroxymethylglutaryl-coenzyme A (23 ms) (7), cholesterol (7.2 ms) (8), arachidonic acid (9.2 ms) (9), prostaglandin E1 (2.6 ms) (10), strychnine (11.9 ms) (11), quinine (5.1 ms) (12), vancomycin (115 ms) (13), calicheamicin γ1 (31.4 ms) (14), cyclosporin A (9.8 ms) (15), and maitotoxin (140 ms) (16). SMILES of all molecules shown are available in Table S1.
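The spring model named in the Figure 1 caption can be illustrated with a short sketch. It evaluates the Kamada and Kawai energy functional with spring strength K/d² and relaxed length L × d as defined in Results and Discussion, and performs one two-dimensional Newton–Raphson step for a single vertex. All function names (`kkEnergy`, `kkDerivatives`, `kkDelta`, `kkStep`) are hypothetical and chosen for illustration; this is a minimal sketch under those assumptions, not SmilesDrawer's actual implementation.

```javascript
// Illustrative sketch only; function names are hypothetical and this is not
// SmilesDrawer's actual code.

// Energy functional for positions `pos` ([[x, y], ...]) and a symmetric matrix
// `dist` of topological distances; K and L are the constants from the text.
function kkEnergy(pos, dist, K = 1, L = 1) {
  let energy = 0;
  for (let i = 0; i < pos.length - 1; i++) {
    for (let j = i + 1; j < pos.length; j++) {
      const k = K / (dist[i][j] * dist[i][j]); // spring strength k_ij
      const l = L * dist[i][j];                // relaxed spring length l_ij
      const dx = pos[i][0] - pos[j][0];
      const dy = pos[i][1] - pos[j][1];
      const d = Math.sqrt(dx * dx + dy * dy);
      energy += 0.5 * k * (dx * dx + dy * dy + l * l - 2 * l * d);
    }
  }
  return energy;
}

// First and second partial derivatives of E with respect to vertex m.
function kkDerivatives(pos, dist, m, K = 1, L = 1) {
  let ex = 0, ey = 0, exx = 0, exy = 0, eyy = 0;
  for (let i = 0; i < pos.length; i++) {
    if (i === m) continue;
    const k = K / (dist[m][i] * dist[m][i]);
    const l = L * dist[m][i];
    const dx = pos[m][0] - pos[i][0];
    const dy = pos[m][1] - pos[i][1];
    const d = Math.sqrt(dx * dx + dy * dy);
    const d3 = d * d * d;
    ex += k * (dx - (l * dx) / d);
    ey += k * (dy - (l * dy) / d);
    exx += k * (1 - (l * dy * dy) / d3);
    exy += k * ((l * dx * dy) / d3);
    eyy += k * (1 - (l * dx * dx) / d3);
  }
  return { ex, ey, exx, exy, eyy };
}

// Gradient magnitude of vertex m, used to pick the vertex to move next.
function kkDelta(pos, dist, m, K = 1, L = 1) {
  const { ex, ey } = kkDerivatives(pos, dist, m, K, L);
  return Math.sqrt(ex * ex + ey * ey);
}

// One Newton-Raphson step for vertex m: solve the 2x2 linear system for
// (deltaX, deltaY) by Cramer's rule and return the moved position.
function kkStep(pos, dist, m, K = 1, L = 1) {
  const { ex, ey, exx, exy, eyy } = kkDerivatives(pos, dist, m, K, L);
  const det = exx * eyy - exy * exy;
  const deltaX = (exy * ey - eyy * ex) / det;
  const deltaY = (exy * ex - exx * ey) / det;
  return [pos[m][0] + deltaX, pos[m][1] + deltaY];
}

// Two bonded atoms placed too far apart (distance 2 instead of 1).
const examplePos = [[0, 0], [2, 0]];
const exampleDist = [[0, 1], [1, 0]];
console.log(kkEnergy(examplePos, exampleDist)); // 0.5
examplePos[1] = kkStep(examplePos, exampleDist, 1);
console.log(examplePos[1], kkEnergy(examplePos, exampleDist));
```

In this two-atom example a single step moves the second atom back to the relaxed bond length. A full layout would repeat the step for the vertex with the largest gradient magnitude until that magnitude falls below a threshold; the sketch omits the loop and any safeguard for a singular 2 × 2 system.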

computation to circumvent this limitation. The only currently available JavaScript component to convert SMILES to molecular structure drawings without any code server side is OpenChemLib-JS, a feature applied by JSME.2 OpenChemLib-JS is maintained by the Cheminformatics Department of the Swiss Federal Institute of Technology and is cross-compiled from Java to JavaScript with OpenChemLib as the codebase, which is part of Actelion's DataWarrior.18 This implementation, however, has two major disadvantages: (I) The conversionless structure drawing from SMILES is implemented using SVG (scalable vector graphics) implying retained mode rendering resulting in each element of the drawing (letters, lines, ...) being added to the DOM and thus generating object management overhead for the web browser. This approach leads to unpredictable and generally lower performance across different devices and browsers while complicating development given the lack of an API. A canvas depicter option for OpenChemLib-JS is available but not well documented or customizable. (II) The codebase of OpenChemLib-JS being written in Java, and thus, its reliance on being cross-compiled by GWT makes it virtually impossible to customize, optimize, or adapt for integration into a web application without considerable development overhead.

Here, we introduce SmilesDrawer, a small (97 KB minified), dependency-free JavaScript component capable of drawing molecular structures from SMILES strings client-side, which is much smaller and faster and overcomes many of the limitations of OpenChemLib-JS. SmilesDrawer can be used by developers of web applications as well as JavaScript-based mobile and desktop applications to render molecular structures from SMILES strings without the need of additional libraries or communication with a server, which often presents a major drawback when processing sensitive information. The component is fully customizable, well-documented and its source code is available under the MIT license. SmilesDrawer is written in JavaScript ES6, transpiled adhering to the current ES6 implementation status using Babel and then packaged for both yarn and npm. SmilesDrawer is useful for web-based tools that need to display thousands of molecules because it generates the drawing from SMILES, which reduces the

amount of data transmitted. SmilesDrawer has been utilized for the 3D visualization of a multitude of chemical spaces populated by SureChEMBL data.19

■ RESULTS AND DISCUSSION
SmilesDrawer Components. The SmilesDrawer JavaScript component achieves the conversion of a SMILES to a two-dimensional structure drawing by combining two modules, a SMILES parser to convert the SMILES back to its parent spanning tree, and a SMILES drawer to convert this spanning tree to a two-dimensional structure drawing. Both are written in JavaScript and while the drawer relies on the output of the parser, the parser is usable as a standalone function and can be applied in other projects. The parser accounts for approximately 1/10 of the source code and is not directly customizable, as it was generated by a parser generator.

The parser module generates a parse tree from the input SMILES, in which each atom is encoded by a node object in a linked tree data structure. The topology of the parse tree is identical to the spanning tree used to generate the SMILES string. The parser was generated using PEG.js, a parser generator for JavaScript, and by translating the grammar provided by the OpenSMILES specification into an unambiguous parsing expression grammar (PEG).20,21 PEG was chosen over a context free grammar (CFG) to avoid ambiguities (reduce-shift conflicts).22 Additionally, the generated parsers are an implementation of the packrat parsing algorithm and thus express a linear time complexity through memoization, resulting in increased space complexity.23,24 In practice, parsing expression grammars simplify syntax definitions, conform closely to syntax practices (prioritized choices, unlimited lookahead), and allow linear time parsing for any TDPL grammar. The higher average space complexity of packrat, which is directly proportional to input length, is well compensated for by the generally short length of SMILES strings. In addition to generating the parse tree, the parser can identify the location of an erroneous symbol.

The SMILES drawer module converts the parse tree obtained from the SMILES to a 2D-structure drawing. The module positions acyclic atoms, atoms in fused rings and atoms in spiros based on Euclidean and molecular geometry according to the VSEPR model.25 The placement of bridged ring-systems with $n_{\mathrm{rings}} \geq 2$ is treated as a two-dimensional graph embedding problem and solved based on graph theoretic distances as described by Kamada and Kawai.26 The algorithm sets up a virtual dynamic system, where weighted topological distances between all vertices are modeled as springs, whereas other spring embedders such as the Eades and Fruchterman–Reingold algorithms, which have been adapted to depict molecular structures, model edges as springs and introduce repulsive electrical forces between non connected vertices to keep them apart.27–29 The formula for the energy functional of the dynamic system according to Kamada and Kawai is shown in eq 1, where $k_{i,j} = K/d_{i,j}^2$ is the spring strength between vertices $i$ and $j$ based on the topological distance $d_{i,j}$ and the constant $K$; and $l_{i,j} = L \times d_{i,j}$ is the relaxed spring length based on the topological distance and the desired edge length $L$.

$$E = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \frac{1}{2} k_{i,j} \left( (x_i - x_j)^2 + (y_i - y_j)^2 + l_{i,j}^2 - 2 l_{i,j} \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2} \right) \tag{1}$$

$$\frac{\partial^2 E}{\partial x_m^2}\left(x_m^{(t)}, y_m^{(t)}\right)\delta x + \frac{\partial^2 E}{\partial x_m \partial y_m}\left(x_m^{(t)}, y_m^{(t)}\right)\delta y = -\frac{\partial E}{\partial x_m}\left(x_m^{(t)}, y_m^{(t)}\right) \tag{2}$$

$$\frac{\partial^2 E}{\partial x_m \partial y_m}\left(x_m^{(t)}, y_m^{(t)}\right)\delta x + \frac{\partial^2 E}{\partial y_m^2}\left(x_m^{(t)}, y_m^{(t)}\right)\delta y = -\frac{\partial E}{\partial y_m}\left(x_m^{(t)}, y_m^{(t)}\right) \tag{3}$$

The functional $E$ of the system is then minimized by iterative local minimizations one vertex $m$ at a time, where vertex $m$ is the vertex with the largest value of $\Delta_m = \sqrt{(\partial E/\partial x_m)^2 + (\partial E/\partial y_m)^2}$. Vertex $m$ is then moved by $\delta x$ and $\delta y$ until $\Delta_m$ reaches a lower threshold. $\delta x$ and $\delta y$ are computed by applying a two-dimensional Newton–Raphson method to solve eqs 2 and 3. Our implementation of the algorithm by Kamada and Kawai enables the SMILES drawer module to depict highly complex ring systems such as a buckminsterfullerene without the need for templates (Figure 1). A drawback in implementing Kamada and Kawai's algorithm for structure drawing arises when depicting macrocycles and bridged ring-systems including macrocycles where rings might be distorted, ignore stereochemistry around double bonds or produce overlaps (Figure 1 compounds 6, 13, 15).

Smallest set of smallest ring discovery is implemented applying a robust algorithm based on path-included distance matrices.30 Once atoms have been positioned, overlaps are resolved by rotating rotatable bonds where the resulting positions yield a lower overlap score. If overlaps are still present after the first step of overlap resolution, a second step rotates overlapping terminal vertices away from the overlap-causing position. The overlap score for each vector (atom) $v_i$ is defined as $\mathrm{overlap}_i = \sum_j \frac{l - \lVert v_i - v_j \rVert}{l}$ for $l - \lVert v_i - v_j \rVert > 0$, where $l$ is the optimal bond length.

Chirality determination was implemented based on the algorithm developed by Teixeira et al. and is based on the parity of permutation of ligands after ordering according to CIP rules compared to their index of occurrence in the SMILES string as defined in the OpenSMILES specification.31 For the depiction of wedges, we developed an algorithm which chooses the bond to be flipped up, respectively down, based on the following simple set of rules (ordered by priority from highest to lowest): (1) Wedges are not to be drawn between two stereocenters. (2) If possible, wedges are not to be drawn inside a ring. (3) Drawing wedges toward heteroatoms takes priority, and (4) Wedges are drawn in the direction of the shortest subtree.

The resulting structures are drawn using the Canvas API supported by all modern browsers. Unlike the commonly used SVG (scalable vector graphics) format, Canvas implements immediate mode rendering, thus abolishing the need for the performance impacting object model kept in memory. After the HTML 5 standard specifying the Canvas API became the stable W3C recommendation in October 2014, the novel technology has been applied by several web-based cheminformatics and bioinformatics applications, asserting its increased performance and reduced code complexity compared to scalable vector graphics.32–34

The SMILES drawer module implements the complete OpenSMILES specification except for square planar, trigonal bipyramidal and octahedral chirality. These types of chirality

Figure 2. Analysis of the pooled test sets. Test sets include Drugbank (n = 7238) and samples from ChEMBL, FDB-17, GDB-17, and SureChEMBL (each n = 7238). The sets were pooled into a super set containing all data ($n_{\mathrm{total}}$ = 36 190). Subplots a and b show the distribution of ring count and length of SMILES, respectively. The range from 0 to 15 rings covers 99.981% (36 082), and the range from 0 to 150 characters covers 98.933% (35 704) of all molecules in the pooled set. SMILES length was chosen as a measurement variable as it correlates best with both parse and render time (c, d). The ρ values yielded by Pearson's and Spearman's methods suggest a strong nonlinear, monotonic relationship between SMILES length and performance.
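The reading of the two ρ values in the Figure 2 caption can be illustrated with a small sketch: for a monotonic but nonlinear relationship, Spearman's rank correlation stays at 1 while Pearson's coefficient drops below it. The helper names (`pearson`, `ranks`, `spearman`) and the exponential toy data are assumptions made for illustration and are unrelated to the benchmark data of this work.

```javascript
// Illustrative sketch; helper names and data are hypothetical.

// Pearson's product-moment correlation coefficient.
function pearson(x, y) {
  const n = x.length;
  const meanX = x.reduce((a, b) => a + b, 0) / n;
  const meanY = y.reduce((a, b) => a + b, 0) / n;
  let cov = 0, varX = 0, varY = 0;
  for (let i = 0; i < n; i++) {
    cov += (x[i] - meanX) * (y[i] - meanY);
    varX += (x[i] - meanX) ** 2;
    varY += (y[i] - meanY) ** 2;
  }
  return cov / Math.sqrt(varX * varY);
}

// 1-based ranks; tied values receive the average of their ranks.
function ranks(values) {
  const order = values.map((v, i) => [v, i]).sort((a, b) => a[0] - b[0]);
  const result = new Array(values.length);
  let start = 0;
  while (start < order.length) {
    let end = start;
    while (end + 1 < order.length && order[end + 1][0] === order[start][0]) end++;
    const averageRank = (start + end) / 2 + 1;
    for (let k = start; k <= end; k++) result[order[k][1]] = averageRank;
    start = end + 1;
  }
  return result;
}

// Spearman's rho is Pearson's coefficient computed on the ranks.
function spearman(x, y) {
  return pearson(ranks(x), ranks(y));
}

// Toy data: y grows monotonically but nonlinearly with x.
const lengths = [1, 2, 3, 4, 5];
const times = lengths.map((v) => Math.exp(v));
console.log("Pearson:", pearson(lengths, times));
console.log("Spearman:", spearman(lengths, times));
```

A high Spearman value combined with a clearly lower Pearson value, as reported for SMILES length versus parse and render time, is what such a monotonic nonlinear dependence produces.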

are, according to the specification, only implemented by very few SMILES systems and we did not encounter them in any of the organic molecule databases known to us. In addition, the proposed extensions, including external R-groups, polymers and crystals, atom-based double bond configuration, radical centers and twisted SMILES, provided by the OpenSMILES specification are not supported.

Assessing the Performance of SmilesDrawer. The aesthetic performance of SmilesDrawer was assessed by visual inspection. SmilesDrawer performs extremely well for rendering structures of a wide range of molecules (Figure 1). Excellent drawings are produced for polycyclic hydrocarbons such as cubane (1), trinorbornane (2),35 heptacyclo[6.4.0.0^2,7.0^3,6.0^4,11.0^5,10.0^9,12]dodecane (3), dodecahedrane (4), or buckminsterfullerene (5) without using any template. Depictions of complex biomolecules are also very good, for example the essential coenzyme vitamin B12 (6), the steroid precursor hydroxymethylglutaryl-coenzyme A (7), its biosynthetic endproduct cholesterol (8), the polyunsaturated omega-6 fatty acid arachidonic acid (9), and the immune modulator prostaglandin E1 (10). Complex natural products are also satisfactorily rendered such as the alkaloids strychnine (11) and quinine (12), the antibiotic vancomycin (13) and the complex cytotoxic natural product calicheamicin γ1 (14), the immunosuppressor cyclic peptide cyclosporin A (15), and the complex polycyclic toxin maitotoxin (16), which possesses 98 chiral centers. In addition to drawing small to medium-sized molecules, SmilesDrawer also excels at drawing large and topologically complex peptides such as peptide dendrimers36,37 with minimal overlap in contrast to OpenChemLib-JS and ChemAxon Marvin JS (Figure S1).

To evaluate the runtime performance of SmilesDrawer, a test set including Drugbank (n = 7238) and samples from ChEMBL, FDB-17, GDB-17, and SureChEMBL (each n = 7238) was assembled. This pooled set containing all compounds ($n_{\mathrm{total}}$ = 36 190) was analyzed for SMILES length and number of rings, as these two features intuitively have the highest impact on drawing speed. By running preliminary benchmarks, this assessment was confirmed through correlation analysis as shown in Figure 2c, d. While parsing time primarily correlates with the SMILES length of a molecule ($\rho_{\mathrm{Pearson}}$ = 0.28, $\rho_{\mathrm{Spearman}}$ = 0.68), rendering time correlates well with both the length of the SMILES ($\rho_{\mathrm{Pearson}}$ = 0.26, $\rho_{\mathrm{Spearman}}$ = 0.73) and the number of rings ($\rho_{\mathrm{Pearson}}$ = 0.15, $\rho_{\mathrm{Spearman}}$ = 0.67). SMILES length was chosen as the measurement variable for ensuing performance benchmarks, as it correlates well with both parsing and rendering time. Surprisingly, whereas monotonic relationships were expected between rendering time versus the SMILES length and the number of rings respectively, the nonlinear relationship between SMILES length and parsing time is unexpected due to the theoretically linear runtime of the parser. We suspect this behavior to be caused by current JavaScript implementations not supporting tail call optimization and the generated parser heavily relying on recursion.

The theoretical time complexities are $O(n)$ and $O(n^3)$ for the parser and drawer, respectively. Benchmarks were conducted using the Drugbank data set, containing 7238 compounds.8 In addition, random subsets of equal size were extracted from the

ChEMBL,9 FDB-17,38 GDB-17,6 and SureChEMBL13 databases. The performance was assessed using desktop as well as mobile hardware and software (Intel Core i7-7700 3.60 GHz, 16.0 GB DDR4 RAM, Windows 10.0.16299, Chrome Version 63.0.3239.84 64-bit; Samsung Exynos 8895 0.455–2.314 GHz, 4 GB LPDDR4X, Android 7.0, Linux version 4.4.13-12401979, Chrome Version 63.0.3239.71).

SmilesDrawer shows excellent performance for both parsing ($\bar{t}_{\mathrm{parsing}}$ = 0.04 ± 0.085 ms) and drawing ($\bar{t}_{\mathrm{drawing}}$ = 2.445 ± 14.144 ms). The rendering time for molecules containing, in terms of depiction using our proposed approach, complex bridged ring systems (n = 5473 with an average of 4.033 ± 1.453 rings) is still low ($\bar{t}_{\mathrm{drawing,bridged}}$ = 4.079 ± 17.731 ms). Drawing speed is excellent even on mobile hardware with $\bar{t}_{\mathrm{parsing}}$ = 0.169 ± 0.557 ms, $\bar{t}_{\mathrm{drawing}}$ = 7.481 ± 23.471 ms, and $\bar{t}_{\mathrm{drawing,bridged}}$ = 13.692 ± 42.451 ms. Per set performance is shown in Figure 3.

Comparison of SmilesDrawer on Different Devices and with OpenChemLib-JS. To compare the total depiction time (parsing + rendering time) of SmilesDrawer with the JavaScript port of OpenChemLib, we ran the benchmarks on the latest version of OpenChemLib-JS using its undocumented canvas depicter. The performance of OpenChemLib-JS was assessed using the same desktop test setting as for the SmilesDrawer test case. The results are shown in Figure 3. The total depiction time values show that the performance of SmilesDrawer on a mobile phone is comparable to that of OpenChemLib-JS on a desktop computer (Figure 3a). While the render times for SmilesDrawer and OpenChemLib-JS are close, SmilesDrawer shows generally lower variance on the desktop system (Figure 3b). SmilesDrawer's parse time exhibits a runtime performance which is orders of magnitude faster than that of OpenChemLib-JS on both the desktop and the mobile system (Figure 3c).

Figure 3. Performance comparison between SmilesDrawer and OpenChemLib-JS. Performance was established for three different test cases: SmilesDrawer on a desktop computer (blue), OpenChemLib-JS on a desktop computer (green), and SmilesDrawer on a mobile phone (red). The total depiction time (parsing + rendering) values show that the performance of SmilesDrawer on a mobile phone is comparable to that of OpenChemLib-JS on a desktop computer (a). While the render times for SmilesDrawer and OpenChemLib-JS are close, SmilesDrawer shows generally lower variance on the desktop system (b). SmilesDrawer's parse time exhibits a runtime performance which is orders of magnitude faster than that of OpenChemLib-JS on both the desktop and the mobile system (c).

To further assess the comparative runtime performance, we analyzed the data using two-dimensional KDE plots, which show a detailed comparison of the drawing (parsing + rendering) performance of SmilesDrawer with that of OpenChemLib-JS for the test sets (Figure 4a) GDB-17, (Figure 4b) FDB-17, (Figure 4c) SureChEMBL, (Figure 4d) ChEMBL, and (Figure 4e) Drugbank. GDB-17 and FDB-17 are constrained to a relatively low maximum atom count of 17, causing the parser to take up a significant part of the drawing time. This fact is reflected in Figure 4a, b, where bimodal distributions are caused by the significantly slower parser of OpenChemLib-JS.

The analysis of these benchmarks has shown that our JavaScript module generally performs better throughout the test sets compared to the transcompiled version of OpenChemLib-JS and that Kamada and Kawai's algorithm is indeed suited for placing the atoms of bridged ring systems without any negative impact on overall rendering performance (Figure S2).

For mobile applications, the overall performance of SmilesDrawer measured during benchmarking matches the latency (depending on carrier and network generation: 5–100 ms) of mobile networks, facilitating application-scale performance improvements over loading structures as image files from a web server on such networks.15

■ CONCLUSION
SmilesDrawer is a highly customizable, easy-to-use and performant JavaScript component consisting of both a SMILES parser and a Canvas API drawing module. It is tailored to be used in modern web applications in need of a method to display molecular structures. SmilesDrawer differentiates itself from other previously reported JavaScript components for SMILES drawing in that it does not require any third-party libraries, has a codebase written entirely in JavaScript, does not require the deployment of web services, and applies the algorithm proposed by Kamada and Kawai for positioning atoms in bridged rings while applying simple Euclidean geometry for the placement of other atoms. Although SmilesDrawer was implemented and optimized for the limited use on SureChEMBL data sets, its performance carries over well to the Drugbank, ChEMBL, FDB-17, and GDB-17 data sets and it even depicts complex molecules. SmilesDrawer should

Figure 4. Comparison to OpenChemLib-JS. The two-dimensional KDE plots show a detailed comparison for the drawing (parsing + rendering)
performance of SmilesDrawer to that of OpenChemLib-JS for the test sets (a) GDB-17, (b) FDB-17, (c) SureChEMBL, (d) ChEMBL, and (e)
Drugbank. GDB-17 and FDB-17 are constrained to a relatively low maximum atom count of 17, causing the parser to take up a significant part of the
drawing time. This fact reflects in subplots a and b, where bimodal distributions are caused by the significantly slower parser of OpenChemLib-JS.
456 (1.262%) and 509 (1.409%) compounds were removed from the SmilesDrawer and OpenChemLib-JS set respectively, as they interfered with
the readability of these plots.

be generally useful to display molecules from SMILES in web applications.
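The overlap score defined in Results and Discussion, which drives the overlap resolution step after atom placement, can be sketched in a few lines. The function name `overlapScores` and the decision to skip the term j = i are assumptions made for this illustration; the sketch is not taken from the SmilesDrawer source code.

```javascript
// Illustrative sketch; `overlapScores` is a hypothetical name.
// positions: array of [x, y] atom coordinates; bondLength: optimal bond length l.
function overlapScores(positions, bondLength) {
  return positions.map((vi, i) => {
    let score = 0;
    positions.forEach((vj, j) => {
      if (i === j) return; // assumption: an atom is not compared with itself
      const distance = Math.sqrt((vi[0] - vj[0]) ** 2 + (vi[1] - vj[1]) ** 2);
      const penalty = bondLength - distance;
      // Only pairs closer than the optimal bond length contribute.
      if (penalty > 0) score += penalty / bondLength;
    });
    return score;
  });
}

// Two atoms only half a bond length apart, and one atom far away.
console.log(overlapScores([[0, 0], [0.5, 0], [3, 0]], 1)); // [ 0.5, 0.5, 0 ]
```

A rotation of a rotatable bond would then be accepted when the summed scores of the affected atoms decrease, in line with the two-step overlap resolution described above.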
■ ASSOCIATED CONTENT
Supporting Information
The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.jcim.7b00425.
Additional benchmarking of SmilesDrawer (Figures S1 and S2) and SMILES of all molecules shown in Figure 1 and S1 (Table S1) (PDF)

■ AUTHOR INFORMATION
Corresponding Authors
*E-mail: daniel.probst@dcb.unibe.ch (D.P.).
*E-mail: jean-louis.reymond@dcb.unibe.ch (J.-L.R.).
ORCID
Jean-Louis Reymond: 0000-0003-2724-2942
Author Contributions
D.P. designed and developed both modules of SmilesDrawer and wrote the paper. J.-L.R. codesigned, supervised the project, and wrote the paper.
Notes
The authors declare no competing financial interest.

■ ACKNOWLEDGMENTS
This work was supported financially by the University of Berne, the Swiss National Science Foundation, and the NCCR TransCure. We thank ChemAxon for providing access to Marvin JS.

■ REFERENCES
(1) Ihlenfeldt, W. D.; Bolton, E. E.; Bryant, S. H. The PubChem Chemical Structure Sketcher. J. Cheminf. 2009, 1, 20.
(2) Bienfait, B.; Ertl, P. JSME: A Free Molecule Editor in JavaScript. J. Cheminf. 2013, 5, 24.
(3) Heller, S. R.; McNaught, A.; Pletnev, I.; Stein, S.; Tchekhovskoi, D. InChI, the IUPAC International Chemical Identifier. J. Cheminf. 2015, 7, 23.
(4) Weininger, D. SMILES, a Chemical Language and Information System. 1. Introduction to Methodology and Encoding Rules. J. Chem. Inf. Model. 1988, 28, 31–36.
(5) Weininger, D.; Weininger, A.; Weininger, J. L. SMILES. 2. Algorithm for Generation of Unique SMILES Notation. J. Chem. Inf. Model. 1989, 29, 97–101.
(6) Ruddigkeit, L.; van Deursen, R.; Blum, L. C.; Reymond, J. L. Enumeration of 166 Billion Organic Small Molecules in the Chemical Universe Database GDB-17. J. Chem. Inf. Model. 2012, 52, 2864–2875.
(7) Awale, M.; Visini, R.; Probst, D.; Arus-Pous, J.; Reymond, J. L. Chemical Space: Big Data Challenge for Molecular Diversity. Chimia 2017, 71, 661–666.


(8) Wishart, D. S.; Knox, C.; Guo, A. C.; Shrivastava, S.; Hassanali, M.; Stothard, P.; Chang, Z.; Woolsey, J. DrugBank: A Comprehensive Resource for in Silico Drug Discovery and Exploration. Nucleic Acids Res. 2006, 34, D668–D672.
(9) Gaulton, A.; Bellis, L. J.; Bento, A. P.; Chambers, J.; Davies, M.; Hersey, A.; Light, Y.; McGlinchey, S.; Michalovich, D.; Al-Lazikani, B.; Overington, J. P. ChEMBL: A Large-Scale Bioactivity Database for Drug Discovery. Nucleic Acids Res. 2012, 40, D1100–D1107.
(10) Awale, M.; van Deursen, R.; Reymond, J. L. MQN-Mapplet: Visualization of Chemical Space with Interactive Maps of DrugBank, ChEMBL, PubChem, GDB-11, and GDB-13. J. Chem. Inf. Model. 2013, 53, 509–518.
(11) Awale, M.; Reymond, J. L. Similarity Mapplet: Interactive Visualization of the Directory of Useful Decoys and ChEMBL in High Dimensional Chemical Spaces. J. Chem. Inf. Model. 2015, 55, 1509–1516.
(12) Awale, M.; Reymond, J. L. Web-Based 3D-Visualization of the DrugBank Chemical Space. J. Cheminf. 2016, 8, 25.
(13) Papadatos, G.; Davies, M.; Dedman, N.; Chambers, J.; Gaulton, A.; Siddle, J.; Koks, R.; Irvine, S. A.; Pettersson, J.; Goncharoff, N.; Hersey, A.; Overington, J. P. SureChEMBL: A Large-Scale, Chemically Annotated Patent Document Database. Nucleic Acids Res. 2016, 44, D1220–D1228.
(14) Awale, M.; Probst, D.; Reymond, J. L. WebMolCS: A Web-Based Interface for Visualizing Molecules in Three-Dimensional Chemical Spaces. J. Chem. Inf. Model. 2017, 57, 643–649.
(15) Nikravesh, A.; Choffnes, D. R.; Katz-Bassett, E.; Mao, Z. M.; Welsh, M. Mobile Network Performance from User Devices: A Longitudinal, Multidimensional Analysis. In Passive and Active Measurement. PAM 2014. Lecture Notes in Computer Science, Vol 8362; Faloutsos, M. K. A., Ed.; Springer: Cham, 2014.
(16) Berman, H. M.; Westbrook, J.; Feng, Z.; Gilliland, G.; Bhat, T. N.; Weissig, H.; Shindyalov, I. N.; Bourne, P. E. The Protein Data Bank. Nucleic Acids Res. 2000, 28, 235–242.
(17) Murray-Rust, P.; Rzepa, H. S.; Wright, M. Development of Chemical Markup Language (CML) as a System for Handling Complex Chemical Content. New J. Chem. 2001, 25, 618–634.
(18) Sander, T.; Freyss, J.; von Korff, M.; Rufener, C. DataWarrior: An Open-Source Program for Chemistry Aware Data Visualization and Analysis. J. Chem. Inf. Model. 2015, 55, 460–473.
(19) Probst, D.; Reymond, J. L. FUn: A Framework for Interactive Visualizations of Large, High Dimensional Datasets on the Web. Bioinformatics 2017, DOI: 10.1093/bioinformatics/btx760.
(20) www.opensmiles.org (accessed December 12, 2017).
(21) Ford, B. Parsing Expression Grammars. In Proceedings of the 31st ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL '04; ACM Press: New York, USA, 2004; pp 111–112.
(22) Parikh, R. J. On Context-Free Languages. J. Assoc. Comput. Mach. 1966, 13, 570–581.
(23) Ford, B. Packrat Parsing. ACM SIGPLAN Not. 2002, 37, 36–47.
(24) Mizushima, K.; Maeda, A.; Yamaguchi, Y. Packrat Parsers Can Handle Practical Grammars in Mostly Constant Space. In Proceedings of the 9th ACM SIGPLAN-SIGSOFT Workshop on Program Analysis for Software Tools and Engineering, PASTE; ACM: New York, 2010; pp 29–36.
(25) Hargittai, I.; Chamberland, B. The VSEPR Model of Molecular Geometry. Comput. Math. Appl. 1986, 12, 1021–1038.
(26) Kamada, T.; Kawai, S. An Algorithm for Drawing General Undirected Graphs. Inf. Process. Lett. 1989, 31, 7–15.
(27) Eades, P. A Heuristic for Graph Drawing. Congr. Numer. 1984, 42, 149–160.
(28) Fruchterman, T. M. J.; Reingold, E. M. Graph Drawing by Force-Directed Placement. Softw. Pract. Exp. 1991, 21, 1129–1164.
(29) Fraczek, T. Simulation-Based Algorithm for Two-Dimensional Chemical Structure Diagram Generation of Complex Molecules and Ligand-Protein Interactions. J. Chem. Inf. Model. 2016, 56, 2320–2335.
(30) Lee, C. J.; Kang, Y. M.; Cho, K. H.; No, K. T. A Robust Method for Searching the Smallest Set of Smallest Rings with a Path-Included Distance Matrix. Proc. Natl. Acad. Sci. U. S. A. 2009, 106, 17355–17358.
(31) Teixeira, A. L.; Leal, J. P.; Falcao, A. O. Automated Identification and Classification of Stereochemistry: Chirality and Double Bond Stereoisomerism. arXiv.org 2013, No. arXiv:1303.1724.
(32) Miller, C. A.; Anthony, J.; Meyer, M. M.; Marth, G. Scribl: An HTML5 Canvas-Based Graphics Library for Visualizing Genomic Data over the Web. Bioinformatics 2013, 29, 381–383.
(33) Taylor, S.; Noble, R. HTML5 PivotViewer: High-Throughput Visualization and Querying of Image Data on the Web. Bioinformatics 2014, 30, 2691–2692.
(34) Vanderkam, D.; Aksoy, B. A.; Hodes, I.; Perrone, J.; Hammerbacher, J. Pileup.js: A JavaScript Library for Interactive and in-Browser Visualization of Genomic Data. Bioinformatics 2016, 32, 2378–2379.
(35) Delarue Bizzini, L.; Muntener, T.; Haussinger, D.; Neuburger, M.; Mayor, M. Synthesis of Trinorbornane. Chem. Commun. 2017, 53, 11399–11402.
(36) Stach, M.; Siriwardena, T. N.; Kohler, T.; van Delden, C.; Darbre, T.; Reymond, J. L. Combining Topology and Sequence Design for the Discovery of Potent Antimicrobial Peptide Dendrimers against Multidrug-Resistant Pseudomonas aeruginosa. Angew. Chem., Int. Ed. 2014, 53, 12827–12831.
(37) Bergmann, M.; Michaud, G.; Visini, R.; Jin, X.; Gillon, E.; Stocker, A.; Imberty, A.; Darbre, T.; Reymond, J. L. Multivalency Effects on Pseudomonas aeruginosa Biofilm Inhibition and Dispersal by Glycopeptide Dendrimers Targeting Lectin LecA. Org. Biomol. Chem. 2016, 14, 138–148.
(38) Visini, R.; Awale, M.; Reymond, J.-L. Fragment Database FDB-17. J. Chem. Inf. Model. 2017, 57, 700–709.
