Balloon Manual

Balloon version 1.5.0.1143 64-bit build Feb 23 2014 16:01:34


Described in http://dx.doi.org/10.1021/ci6005646
Copyright (C) 2006-2014 Mikko J. Vainio and J. Santeri Puranen.
Copyright (C) 2010 Visipoint Ltd. www.visipoint.fi
All rights reserved.
THIS SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND.

Usage examples
Balloon lists available options when called with '-h' or '--help'. In order to perform
energy minimization using the MMFF94-like force field, Balloon needs to know the
location of the force field parameter file (named 'MMFF94.mff' in the distribution).
The location of the file is given using the '--forcefield' ('-f' for short) switch or via
the environment variable BALLOON_FORCEFIELD. If the parameter file is not
found, Balloon uses distance geometry only. The conformer sampling genetic
algorithm always requires the force field parameters.
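
The lookup order described above (the command-line switch first, then the environment variable, otherwise no parameter file) can be sketched in Python. This is an illustration only, not Balloon's own code: the function name resolve_forcefield is hypothetical, and the sketch does not check whether the file actually exists.

```python
import os
from typing import Mapping, Optional


def resolve_forcefield(cli_value: Optional[str],
                       env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the parameter file path to use, or None if none was named.

    Hypothetical helper mirroring the documented precedence only.
    """
    # The '--forcefield' / '-f' value overrides the environment variable.
    if cli_value:
        return cli_value
    # Fall back to BALLOON_FORCEFIELD. None means no parameter file was
    # named; per the text, Balloon then uses distance geometry only.
    return env.get("BALLOON_FORCEFIELD") or None
```

A caller could use the result to decide whether to pass '-f' at all, or to warn that energy minimization and the genetic algorithm will not be available.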
• Generate an ensemble of conformations
    balloon -f /path/to/MMFF94.mff --nconfs 20 --nGenerations 300 in.sdf out.sdf
  Note that although the initial ensemble size is 20 conformations, the final
  ensemble may contain more or fewer, depending on the flexibility of the
  structure.
• Generate one random conformation from SMILES
  This example uses only distance geometry and no energy minimization:
    balloon --nconfs 1 --noGA "C[n]1cnc2N(C)C(=O)N(C)C(=O)c12" caffeine.sdf
  The same as above, but with energy minimization (just provide the
  force field parameter file):
    balloon -f /path/to/MMFF94.mff --nconfs 1 --noGA "C[n]1cnc2N(C)C(=O)N(C)C(=O)c12" caffeine_min.sdf
• Assign MMFF94-like partial atomic charges
  The output is in MOL2 format, geometry untouched:
    balloon -f /path/to/MMFF94.mff --onlycharge in.sdf out.mol2
  (Note that charge models other than MMFF94 are not intended to be
  used while performing geometry optimization, only when assigning partial
  charges; see options 'onlycharge' and 'chargemodel'.)
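
When such commands are launched from a script, building the argument list explicitly avoids shell-quoting problems with SMILES strings. The following Python sketch assembles the argument lists for two of the examples above. The helper balloon_command is hypothetical, it assumes the executable is called 'balloon' and is on the PATH, and it uses only switches shown in this manual.

```python
import shlex
from typing import List, Optional


def balloon_command(inputs: List[str], output: str,
                    forcefield: Optional[str] = None,
                    nconfs: int = 1,
                    n_generations: Optional[int] = None,
                    no_ga: bool = False) -> List[str]:
    """Build an argument list for Balloon (hypothetical helper)."""
    cmd = ["balloon"]  # assumption: the executable is on the PATH
    if forcefield is not None:
        cmd += ["-f", forcefield]
    cmd += ["--nconfs", str(nconfs)]
    if no_ga:
        # Distance geometry only, no genetic algorithm.
        cmd.append("--noGA")
    elif n_generations is not None:
        cmd += ["--nGenerations", str(n_generations)]
    # Input files or SMILES strings come first, the output file last.
    return cmd + list(inputs) + [output]


ensemble = balloon_command(["in.sdf"], "out.sdf",
                           forcefield="/path/to/MMFF94.mff",
                           nconfs=20, n_generations=300)
caffeine = balloon_command(["C[n]1cnc2N(C)C(=O)N(C)C(=O)c12"],
                           "caffeine.sdf", no_ga=True)

# shlex.join renders a copy-pasteable command line with safe quoting.
print(shlex.join(ensemble))
print(shlex.join(caffeine))
```

Passing the list directly to subprocess.run(cmd, check=True) would run the program without going through a shell, so the brackets and parentheses in the SMILES string need no escaping.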

Usage: /DATI/SOFTWARE/BALLOON-SHAEP/balloon [options] <inputfile | SMILES> ... [<outputfile>]

General options:
-h [ --help ] Print a help message.
-v [ --verbose ] arg Verbosity level: 0 = normal (default), 1 =
verbose, 2 or greater = very verbose.
--config arg Name of a configuration file to read.
Command-line options override the ones in the
config file. A config file may contain lines
that look like
'long_option_name = value'
and comment lines that begin with '#'.
--writeMOL2 Force output of structures in MOL2 format (can
store partial atomic charges). See also
'output-format'.
--onlycharge Only assign partial atomic charges to the
input structures and output in MOL2 format. A
shortcut option for -c0 -k --writeMOL2. See
also 'chargemodel'.
-b [ --nobadmodels ] Do not write bad models to
'<output-file>_bad<.suffix>'.
--strict Skip structures that cannot be handled by the
used force field.
-c [ --nconfs ] arg (=1) Number of conformers to generate or the
initial population size if using GA. Zero will
cause the input structure to be written out
with partial atomic charges and energy as
calculated by the MMFF94-like force field.
--randomSeed arg Seed the pseudo-random number generator. Range
[1, 4294967295). Default value taken from
clock.
--sdfnamefieldheader arg When input is in MDL SD file format, use the
contents of the data field with the given
header as the molecule name. Only the first
line of the data is considered.
--output-file arg Name for the output file. If not given, the
last filename in the input list is taken as
the output file name. The file format is
deduced from the file extension if not
explicitly forced using 'output-format'.
Recognized extensions are
sdf
sdf:v3
mol2
smi
vbf
xyz

-o [ --output-format ] arg Force output in the given format. Format is
one of
sdf Accelrys mol/sdf format
sdf:v3 Accelrys mol/sdf format (V3000)
mol2 Tripos Mol2 format
smi SMILES format
vbf Visipoint Binary Format (experimental)
xyz XMol XYZ format (experimental code)

--input-file arg Input file name.
File lists from multiple occurrences of the
option are concatenated. If no output file is
provided, the last name in the input file list
is used as the output file name.
Input file format is recognized automatically
from the file contents. Recognized formats are
sdf Accelrys mol/sdf format
sdf:v3 Accelrys mol/sdf format (V3000)
mol2 Tripos Mol2 format
smi SMILES format
log Gaussian98/03 output log format
(experimental)
pdb Protein Data Bank format (experimental
code)
vbf Visipoint Binary Format (experimental)
xyz XMol XYZ format (experimental code)

--testrun GA: Test the generated conformer ensemble
against the input structure(s). The input
geometries are used as reference for RMSD
calculation, and ensemble statistics are
calculated. Conformers are generated as if the
input was 2D (connectivity only; see option
'rebuildGeometry'). Typically, ligands from
X-ray crystal structures are used as input for
a testrun.
--noGA Do not use a genetic algorithm (GA) to
generate conformations. Conformations will be
generated via distance geometry.
--addConformerNumberToName Add the number of a conformer as a suffix to
its name. The suffix is '_X', where X is the
number. The name of the first conformer is not
altered.
--listSymmetryClasses List the symmetry class identifiers for each
atom of each input structure.
--singleconf Output only the lowest-energy conformation,
regardless of the population size.
--fullforce GA: Use the full force field in the post-GA
structure optimization. The default is to
ignore torsion _gradient_ and all of
electrostatics.
--nosymmetry GA: Disable the use of symmetry in calculation
of RMSD. Symmetry perception can be very
costly for large structures, e.g., proteins.
--useRingAtomsForRMSD Consider ring atoms for RMSD calculation
instead of non-hydrogen atoms. Note that
acyclic structures will not be processed at
all.
--rebuildGeometry GA: Do not use the input geometry as template
for conformer generation, but always rebuild
the geometry. See also the 'testrun' option.
--maxPostprocessIter arg GA: Maximum allowed number of iterations for
conjugate gradient structure optimization in
the post-processing phase. Default = 100
--maxShapeIterations arg GA: Maximum allowed number of iterations for
overlap optimization in shape-matching.
Default = 300
The truncated Newton optimizer is adapted from
the TNPACK code of Schlick and co-workers:
Schlick T and Fogelson A (1992) ACM TOMS 18,
46-70 & 71-111; Xie D and Schlick T (1999)
SIAM J. Opt. 10, 132-154; Xie D and Schlick T
(1999) ACM TOMS 25, 108-122.
--maxtime arg GA: Maximum time [s] used per compound to
evolve the GA. Default = 120
--ftol arg Tolerance for the change in the objective
function value for terminating conjugate
gradient structure optimization. Pass a
negative value to omit this convergence
criterion. Default is to omit.
-t [ --gradientTolerance ] arg Tolerance for the gradient root-mean-square
(RMS) value for terminating conjugate gradient
structure optimization. Pass a negative value
to omit this convergence criterion. Default = 0.05
--nicheRadius arg GA: Interconformer distance limit (RMSD) for
the calculation of niche count. Default = 1.5
-R [ --RMSDtol ] arg GA: Interconformer distance limit (RMSD) for
the final pruning of conformers. If two
conformers are closer than RMSDtol, the
higher-energy conformer will be discarded.
Default = 0.5
--energyConstant arg GA: Constant term [kcal/mol] for the linear
function of rotatable bonds used to calculate
the potential energy window within which the
final conformers must reside. Must be >= 0.
Default = 10
--energySlope arg GA: Slope [kcal/mol/rotatable bond] for the
linear function of rotatable bonds used to
calculate the potential energy window within
which the final conformers must reside. Must
be >= 0. Default = 0.5
--nGenerations arg GA: The maximum number of generations. Default
= 20
--tournamentSize arg GA: The number of individuals to involve in
the tournament selection. The higher the
number, the stronger the evolutionary pressure
towards geometrically dissimilar results.
Default = 2
--pTorsionMutation arg GA: Mutation probability for changing a
torsion angle value. Default = 0.05
--pStereoMutation arg GA: Mutation probability for inverting a
stereochemical center. Default = 0.05
--pPyramidMutation arg GA: Mutation probability for inverting a
pyramidal center. Default = 0.05
--pRingFlipMutation arg GA: Mutation probability for a ring flip.
Default = 0.05
--parallelityThreshold arg GA: Parallelity threshold [deg] for a ring
flip to take effect. Valid values within the
half-open interval [0,180). Default = 30
--pUniXO arg GA: Uniform cross-over: probability for
crossing. Tested at each locus. Default = 0.2
--allowcrowded GA: Do not remove crowded solutions from the
population during the evolutionary cycle.
--noPopulationGrowth GA: Do not allow the population to grow in
order to accommodate the Pareto front.
--evolveOverlap GA: Evolve the shape-density overlap with a
template structure together with the
conformational energy. Requires a template
structure to be specified.
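
The 'energyConstant' and 'energySlope' options above define the energy window as a linear function of the number of rotatable bonds. The Python sketch below evaluates that function with the documented defaults. The second helper assumes the window is measured from the lowest-energy conformer, which this manual does not state explicitly. Both function names are hypothetical.

```python
from typing import List


def energy_window(n_rotatable_bonds: int,
                  energy_constant: float = 10.0,
                  energy_slope: float = 0.5) -> float:
    """Width of the energy window in kcal/mol (documented defaults)."""
    if n_rotatable_bonds < 0:
        raise ValueError("number of rotatable bonds must be >= 0")
    if energy_constant < 0 or energy_slope < 0:
        raise ValueError("energyConstant and energySlope must be >= 0")
    return energy_constant + energy_slope * n_rotatable_bonds


def keep_within_window(energies: List[float],
                       n_rotatable_bonds: int) -> List[float]:
    """Keep energies within the window above the minimum.

    Assumption: the window is anchored at the lowest-energy conformer.
    """
    if not energies:
        return []
    limit = min(energies) + energy_window(n_rotatable_bonds)
    return [e for e in energies if e <= limit]
```

With the defaults, a structure with six rotatable bonds gets a window of 10 + 0.5 * 6 = 13 kcal/mol.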

Initial geometry:
-H [ --adjusthydrogens ] Add hydrogens to the structures according to
the octet rule. Hydrogens are always added to
structures parsed from SMILES.
-n [ --neutralize ] Try to add or remove hydrogen atoms in order
to make the input molecule neutral.
--stripSalts Remove all but the largest (highest number of
atoms) connected component of the input
structure.
-E [ --expand ] Distance geometry: Create expanded conformers.
On each iteration, update the lower bounds of
nonbonded atoms from the previously generated
conformer.
-C [ --contract ] Distance geometry: Create contracted
conformers. On each iteration, update the
upper bounds of nonbonded atoms.
-s [ --stereo ] Always sample stereoconfigurations. Ignores
any stereochemistry specifications in the
input.
-k [ --keepInitial ] Output the initial geometry into the generated
set of conformers.
-i [ --maxiter ] arg Maximum allowed number of iterations for
conjugate gradient structure optimization in
the template geometry generation. Default =
100
--useSimplex Use simplex downhill structure optimization
prior to conjugate gradient.
--maxSimplexIterations arg Maximum allowed number of iterations for
downhill simplex structure optimization.
Default = 500
--simplexStepLength arg The simplex edge length for downhill simplex
structure optimization. Default = 1
--simplexFunctionTolerance arg Tolerance for the relative change in the
objective function value for terminating
downhill simplex structure optimization.
Default = 0.1
--nInitialDimensions arg The number of dimensions in which to perform
the initial geometry optimization after
distance geometry. Increase if you encounter
incorrect stereochemistry. Default = 4

Force field:
-f [ --forcefield ] arg A filename to read for MMFF94 force field
parameters. Alternatively, you can set the
environment variable BALLOON_FORCEFIELD to
point to the parameter file. The command-line
option overrides the environment variable.
--noVdWcutoff Do not use a cutoff distance for van der Waals
(steric) energy evaluation.
--vdWCutoffOn arg Distance at which the smoothing function for van
der Waals energy cutoff is turned on.
--vdWCutoffOff arg Distance at which the smoothing function for van
der Waals energy cutoff is turned off.
--noEcutoff Do not use a cutoff distance for electrostatic
energy evaluation.
--ECutoffOn arg Distance at which the smoothing function for
electrostatic energy cutoff is turned on.
--ECutoffOff arg Distance at which the smoothing function for
electrostatic energy cutoff is turned off.
--chargemodel arg Specify the partial atomic charge model to be
used.
Alternatives are:
EEM: Puranen et al. (2010) J. Comput. Chem. 31,
1722-1732. doi:10.1002/jcc.21460; Mortier
W.J. et al. (1986) J. Am. Chem. Soc. 108,
4315-4320. doi:10.1021/ja00275a013
MMFF94: Halgren TA (1996) J. Comput. Chem. 17,
490-519.
SFKEEM: Puranen et al. (2010) J. Comput. Chem.
31, 1722-1732. doi:10.1002/jcc.21460;
Chaves et al. (2006) J. Chem. Inf. Model.
46, 1657-1665. doi:10.1021/ci050505e

-d [ --distanceDependent ] Use a distance dependent dielectric model in
optimizing the initial geometry.
--dielectric arg Value of the dielectric constant aka relative
static permittivity used in the calculation of
electrostatic energy according to Coulomb's
equation. Value must be > 1e-6 in order to avoid
division by zero. For water, the value is 80 at
room temperature. Defaults to one for vacuum.
--listAtomTypes Output a list of assigned force field atom types
for each atom.
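
To show how the 'dielectric' and 'distanceDependent' options enter the electrostatic term, here is a minimal Python sketch of a plain Coulomb pair energy. It is not Balloon's implementation. The conversion factor 332.0716 kcal Å/(mol e²) is the value commonly quoted for MMFF94 and is an assumption here, and MMFF94 itself uses a buffered form of the distance that this sketch omits. The function name is hypothetical.

```python
COULOMB_KCAL = 332.0716  # assumed conversion factor, kcal*Angstrom/(mol*e^2)


def coulomb_energy(q_i: float, q_j: float, r: float,
                   dielectric: float = 1.0,
                   distance_dependent: bool = False) -> float:
    """Pairwise Coulomb energy in kcal/mol (illustrative sketch only).

    q_i, q_j: partial charges in units of e; r: distance in Angstrom.
    """
    if dielectric <= 1e-6:
        # Same lower bound as documented for '--dielectric'.
        raise ValueError("dielectric must be > 1e-6")
    if r <= 0:
        raise ValueError("distance must be positive")
    # A distance-dependent dielectric scales the permittivity with r,
    # so the energy falls off as 1/r^2 instead of 1/r.
    eps = dielectric * r if distance_dependent else dielectric
    return COULOMB_KCAL * q_i * q_j / (eps * r)
```

With a dielectric of 80 (water at room temperature) the same pair interaction is 80 times weaker than in vacuum.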

Genetic algorithm:
--maxRingSize arg Maximum size for rings (#atoms) whose flexibility is to
be handled. Negative values impose no limit (default).
--maxFlipDistance arg A maximum allowed number of bonds between the pairs of
ring atoms that define a flip-of-fragment operation.
Negative values impose no limit. Default = 20
-q [ --query ] arg GA: File name for a query structure upon which the
compounds are overlaid based on shape-density overlap.
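
The configuration file format described under '--config' (lines of the form 'long_option_name = value' plus comment lines starting with '#') can be generated from a script. The Python sketch below is one way to do it. The function name render_config and the file name balloon.cfg are hypothetical, and only options that take a value are covered, since this manual does not say how value-less switches are written in a config file.

```python
from typing import Mapping, Union

Value = Union[str, int, float]


def render_config(options: Mapping[str, Value], comment: str = "") -> str:
    """Render long option names and values as config file text."""
    lines = []
    if comment:
        # Comment lines begin with '#'.
        lines.append("# " + comment)
    for name, value in options.items():
        lines.append(f"{name} = {value}")
    return "\n".join(lines) + "\n"


text = render_config(
    {"forcefield": "/path/to/MMFF94.mff", "nconfs": 20, "nGenerations": 300},
    comment="ensemble generation settings",
)
print(text)
```

The text could be written to a file such as balloon.cfg and passed with '--config balloon.cfg'. Options given on the command line still override the ones in the file.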
