The serial molecule generator creates molecules from the given input molecules by applying sequences of predefined reactions. The reactions are specified in a configuration file that is an extension of the Reactor configuration file.
Other molecule generator algorithms can be added to this scheme,
which currently contains only one serial algorithm. This serial algorithm
takes one-reactant one-product reactions and creates product molecules by combining
reactions in all possible ways. The maximum number of reactions in a sequence
can be specified (StepCount configuration attribute or
--step-count command line parameter).
If this parameter is omitted then reaction sequences are extended
as long as new products can be generated. The reaction sequence together with the
index or ID of the starting input molecule is stored in an SDF tag.
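The enumeration described above can be pictured with a toy model. The following Python sketch is not JChem code and does not reflect how the serial algorithm is actually implemented: molecules are plain strings, reactions are hypothetical string-rewriting functions, and the names generate_serial and replace_first are invented for illustration. It also assumes that input molecules are numbered from 1 and that a product counts as new only if it has not yet been generated from the same input.

```python
from collections import deque


def generate_serial(inputs, reactions, step_count=None):
    """Toy model: combine one-reactant one-product reactions in all possible ways.

    inputs     -- list of starting "molecules" (plain strings in this sketch)
    reactions  -- dict mapping a reaction ID to a function molecule -> list of products
    step_count -- maximum number of reactions per sequence, None for unlimited
    Returns a list of (product, sequence data) pairs, where the sequence data is
    the input index followed by the reaction IDs, separated by semicolons.
    """
    results = []
    for index, start in enumerate(inputs, start=1):  # 1-based index: an assumption
        seen = {start}                    # molecules already generated from this input
        queue = deque([(start, [])])      # (molecule, reaction IDs applied so far)
        while queue:
            molecule, sequence = queue.popleft()
            if step_count is not None and len(sequence) >= step_count:
                continue                  # sequence already has the maximum length
            for reaction_id, react in reactions.items():
                for product in react(molecule):
                    if product in seen:
                        continue          # not a new product, do not extend further
                    seen.add(product)
                    new_sequence = sequence + [reaction_id]
                    data = ";".join([str(index)] + new_sequence)
                    results.append((product, data))
                    queue.append((product, new_sequence))
    return results


def replace_first(old, new):
    """Hypothetical 'reaction': rewrite the first occurrence of old to new."""
    def react(molecule):
        return [molecule.replace(old, new, 1)] if old in molecule else []
    return react


reactions = {"R1": replace_first("A", "B"), "R2": replace_first("B", "C")}

for product, data in generate_serial(["AA"], reactions, step_count=2):
    print(product, data)
```

Without a step_count the loop stops on its own here, because the toy reactions eventually stop producing new strings. A reaction that keeps creating new reaction centers would never terminate, which is why a limit is useful.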
Example
Take the following two reactions:
[Figure: first reaction]
[Figure: second reaction]
with the input molecule:
[Figure: input molecule]
Then the following molecules will be generated:
[Figure: generated molecules]
Notes:
REACTIONS tag stores the reaction sequence data:
the input molecule index or ID is followed by the reaction IDs making up the reaction sequence,
items are separated by semicolons.
For a description of reaction mapping, see the Reaction mapping section of the Reactor Manual.
--step-count command line parameter.
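The layout of the reaction sequence data described in the notes (input molecule index or ID first, then the reaction IDs, separated by semicolons) can be read back with a few lines of code. This is a minimal Python sketch; the function name parse_reaction_sequence and the sample values are hypothetical.

```python
def parse_reaction_sequence(tag_value):
    """Split reaction sequence data into (input reference, list of reaction IDs).

    The first item is the input molecule index or ID, the remaining items are
    the reaction IDs in the order in which they were applied.
    """
    items = [item.strip() for item in tag_value.split(";") if item.strip()]
    if not items:
        raise ValueError("empty reaction sequence data")
    return items[0], items[1:]


# Hypothetical tag value: input molecule 3, reaction R1 followed by R2.
input_ref, reaction_ids = parse_reaction_sequence("3;R1;R2")
print(input_ref, reaction_ids)
```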
molgen -c <config file> [<options>] [<input files/strings>]
Set up the molgen script or batch file
as described in Preparing the Usage of JChem
Batch Files and Shell Scripts.
Alternatively, the MoleculeGenerator class which is the common base class
for the molecule generator algorithm implementor classes can be directly invoked:
Win32 / Java 2 (assuming that JChem is installed in c:\jchem):
java -cp "c:\jchem\lib\jchem.jar;%CLASSPATH%" ^
chemaxon.reaction.MoleculeGenerator ^
-c <config file> [<options>] ^
[<input files/strings>]
Unix / Java 2 (assuming that JChem is installed in /usr/local/jchem):
java -cp "/usr/local/jchem/lib/jchem.jar:$CLASSPATH" \
chemaxon.reaction.MoleculeGenerator \
-c <config file> [<options>] \
[<input files/strings>]
Options:
  -h, --help                 this help message
  -c, --config <filepath>    configuration XML file
  -a, --algorithm            algorithm ID in the Algorithms section in the
                             configuration XML (default: serial)
  -o, --output <filepath>    output file path (default: stdout)
  -i, --id                   SDFile tag that stores the molecule ID
                             (default ID: the molecule index)
  -t, --tag                  SDFile tag that will store the
                             reaction sequence data (default: REACTIONS)
  -s, --step-count           the number of algorithm steps to be run
                             (default: infinity / unlimited)
  -g, --ignore-error         continue with next molecule on error
The command line parameter --config is mandatory. This
specifies the path and filename of a configuration file without which the
program cannot operate. A detailed description of the format of this
configuration file is given below.
The command line parameter --algorithm
specifies the molecule generator algorithm by the name of its section
in the configuration XML, which contains one section for each configured
algorithm (section names are compared case-insensitively).
Currently only the serial algorithm is implemented, and it is also
the default algorithm, so this parameter is mainly for future use.
The command line parameter --id specifies the SDF tag storing
the molecule ID to be written to the output SDF as reference to the input
molecule that the product molecule has been generated from.
The command line parameter --tag
specifies the SDF tag storing the reaction sequence data.
The command line parameter --step-count
specifies the maximum number of reaction processing steps to be performed in a reaction sequence.
This parameter may also be specified in the StepCount
configuration attribute; if it is given in both places, the command line
parameter is used.
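The precedence rule can be stated compactly in code. The following Python sketch is only an illustration of the rule as described here, not part of JChem; the function name effective_step_count is hypothetical, and None stands for "unlimited".

```python
def effective_step_count(cli_value=None, config_value=None):
    """Return the maximum number of reactions per sequence, or None for unlimited.

    cli_value    -- value of --step-count, or None if the parameter was not given
    config_value -- value of the StepCount attribute, or None if it is absent
    """
    if cli_value is not None:
        return int(cli_value)      # the command line parameter wins
    if config_value is not None:
        return int(config_value)   # otherwise fall back to the configuration
    return None                    # neither given: sequences are not limited


print(effective_step_count(cli_value="3", config_value="2"))
print(effective_step_count(config_value="2"))
print(effective_step_count())
```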
If the command line parameter --ignore-error is specified, import/export errors
do not stop the processing: the error is written to the console and the molecule is skipped.
By default, the program exits on a molecule import/export error.
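The difference between the two behaviours can be sketched generically. The Python example below is an assumption-laden illustration, not JChem code: process_all is a hypothetical name, and ValueError merely stands in for a molecule import/export error.

```python
import sys


def process_all(records, process, ignore_error=False):
    """Apply process to every record.

    With ignore_error=False the first failure stops the run (the exception
    propagates). With ignore_error=True the error is reported on stderr and
    the offending record is skipped.
    """
    results = []
    for position, record in enumerate(records, start=1):
        try:
            results.append(process(record))
        except ValueError as error:   # stand-in for an import/export error
            if not ignore_error:
                raise
            print(f"skipping record {position}: {error}", file=sys.stderr)
    return results


# int() plays the role of the molecule importer; "x" is the broken record.
print(process_all(["1", "x", "3"], int, ignore_error=True))
```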
Most molecular file formats are accepted (MDL molfile, Compressed molfile, SDfile, Compressed SDfile, SMILES, etc.).
If no input file name or input string is specified in the command line then input is taken from the standard input.
MoleculeGenerator writes output molecules in SDF format. If the --output parameter is omitted,
results are written to the standard output.
The configuration XML is an extension of the Reactor configuration XML. However, instead of taking only one of the specified reactions, MoleculeGenerator uses all reactions to make up the reaction sequences.
The MoleculeGenerator-specific part of the configuration is given in the
MoleculeGenerator section.
The Params subsection specifies some common molecule
generator parameters in attributes that can be overridden by command line parameters:
Reactions attribute: the SDF tag storing the
reaction sequence data (parameter --tag)
Algorithm attribute: the section name of the
molecule generator algorithm to be used
(parameter --algorithm)
StepCount attribute: the maximum number of
reactions making up a reaction sequence
(parameter --step-count)
The Algorithms subsection contains the sections for
the molecule generator algorithms. The Algorithm
attribute or the --algorithm command line parameter refers to
one of these algorithms by its section name. Currently only the
Serial algorithm is available.
The corresponding Java class (that should be a subclass of
chemaxon.reaction.MoleculeGenerator) is specified in the mandatory
Class attribute. The serial algorithm has two specific parameters
that can be specified in Params subsection attributes:
Multiple attribute: "true" or "false" -
if set to "true" then all reaction centers are processed, otherwise only
one reaction center is processed at a time (default: "true")
HitCount attribute: specifies the maximum
number of reaction centers processed in one step.
If the Multiple attribute is "true" then this is
the maximum number of reaction centers to be processed per product.
The HitCount attribute can be used to avoid infinite
loops when the reaction produces one or more new reaction centers of the same type (see
Example note 3 in the Introduction).
Example
<ReactorConfiguration Version="0.1" schemaLocation="react.xsd">
    <Reactions>
        <Reaction ID="R1" Structure="r1.rxn"/>
        <Reaction ID="R2" Structure="r2.rxn"/>
    </Reactions>
    <MoleculeGenerator>
        <Params Reactions="REACTIONS" Algorithm="Serial" StepCount="2"/>
        <Algorithms>
            <Serial Class="chemaxon.reaction.SerialMoleculeGenerator">
                <Params Multiple="true" HitCount="5"/>
            </Serial>
        </Algorithms>
    </MoleculeGenerator>
</ReactorConfiguration>
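To make the structure of the example configuration easier to follow, it can be inspected with a generic XML parser. The Python sketch below uses only the standard library and is not how MoleculeGenerator itself reads the file; the function name read_molgen_config and the shape of the returned dictionary are hypothetical. The algorithm section is looked up by name without regard to case, as described for the --algorithm parameter.

```python
import xml.etree.ElementTree as ET

# The example configuration from this section, embedded for a self-contained demo.
CONFIG_XML = """<ReactorConfiguration Version="0.1" schemaLocation="react.xsd">
  <Reactions>
    <Reaction ID="R1" Structure="r1.rxn"/>
    <Reaction ID="R2" Structure="r2.rxn"/>
  </Reactions>
  <MoleculeGenerator>
    <Params Reactions="REACTIONS" Algorithm="Serial" StepCount="2"/>
    <Algorithms>
      <Serial Class="chemaxon.reaction.SerialMoleculeGenerator">
        <Params Multiple="true" HitCount="5"/>
      </Serial>
    </Algorithms>
  </MoleculeGenerator>
</ReactorConfiguration>"""


def read_molgen_config(xml_text):
    """Collect reaction IDs, common parameters and the selected algorithm section."""
    root = ET.fromstring(xml_text)
    reaction_ids = [r.get("ID") for r in root.findall("./Reactions/Reaction")]
    generator = root.find("MoleculeGenerator")
    params = dict(generator.find("Params").attrib)

    # Find the algorithm section by name, ignoring case; "serial" is the default.
    wanted = params.get("Algorithm", "serial").lower()
    algorithm = None
    for section in generator.find("Algorithms"):
        if section.tag.lower() == wanted:
            algorithm = section
            break
    if algorithm is None:
        raise ValueError(f"no algorithm section named {wanted!r}")

    algorithm_params = algorithm.find("Params")
    return {
        "reactions": reaction_ids,
        "params": params,
        "algorithm_class": algorithm.get("Class"),
        "algorithm_params": dict(algorithm_params.attrib) if algorithm_params is not None else {},
    }


config = read_molgen_config(CONFIG_XML)
print(config["reactions"], config["params"]["StepCount"], config["algorithm_class"])
```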
Generate molecules with the configuration file MolGen.xml
from the input molecules mols.sdf, writing the result to the standard output:
molgen -c MolGen.xml mols.sdf
The same with the input molecules given as SMILES strings on the command line:
molgen -c MolGen.xml "CCC(CC(O)=O)C(O)=O" "CP\C=C\C(C(NC=C)NC=C)N(CP(C)C=C)\C=C\C(CC(O)=O)C(O)=O"
Allow at most 3 reactions per reaction sequence and write the result to
out.sdf with reaction sequence data stored in the RDATA SDF tag:
molgen -c MolGen.xml -s 3 -t RDATA mols.sdf -o out.sdf
As above, but identify the input molecules by the ID SDF tag
instead of molecule indices and display the result in MarvinView:
molgen -c MolGen.xml -s 3 -t RDATA -i ID mols.sdf -o out.sdf
mview out.sdf
Alternatively, pipe the output directly to MarvinView:
molgen -c MolGen.xml -s 3 -t RDATA -i ID mols.sdf | mview -
Note that such piping does not work in Windows.
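When molgen is driven from a script, the command line can be assembled programmatically. The Python sketch below only builds the argument list and uses only the options listed in the Options table; the helper name build_molgen_command and its keyword arguments are hypothetical, and it follows the order of the usage line (options first, then input files or strings).

```python
def build_molgen_command(config, inputs=(), output=None, step_count=None,
                         tag=None, id_tag=None, ignore_error=False):
    """Return the molgen command line as a list of arguments."""
    command = ["molgen", "-c", config]        # --config is mandatory
    if step_count is not None:
        command += ["-s", str(step_count)]    # maximum reactions per sequence
    if tag:
        command += ["-t", tag]                # SDF tag for reaction sequence data
    if id_tag:
        command += ["-i", id_tag]             # SDF tag holding the molecule ID
    if ignore_error:
        command.append("-g")                  # skip molecules that fail import/export
    if output:
        command += ["-o", output]             # omit to write to standard output
    command += list(inputs)                   # input files or molecule strings
    return command


command = build_molgen_command("MolGen.xml", ["mols.sdf"], output="out.sdf",
                               step_count=3, tag="RDATA", id_tag="ID")
print(" ".join(command))
```

The resulting list could be passed to subprocess.run, assuming the molgen script is set up and on the PATH.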