chemaxon.descriptors
Class DescriptorGenerator

java.lang.Object
  extended by chemaxon.descriptors.DescriptorGenerator

public class DescriptorGenerator
extends java.lang.Object

Simple class for generating molecular descriptors (fingerprints). The main purpose of this class is to provide a lightweight common interface for creating various molecular descriptors and obtaining them in different formats.

Typical usage

      DescriptorGenerator gen = new DescriptorGenerator("ECFP");
      gen.setParameter("Length", "512");
      Molecule mol = getFirstMoleculeFromSomewhere();
      while (mol != null) {
          gen.generate(mol);
          doSomethingWith(gen.getAsString());
          doSomethingWith(gen.getAsBitSet());
          mol = getNextMoleculeFromSomewhere();
      }
 

Since:
JChem 5.4
Author:
Peter Kovacs (pkovacs84)

Constructor Summary
DescriptorGenerator(java.lang.String descrType)
          Creates a new instance using the given descriptor type with its default configuration parameters.
DescriptorGenerator(java.lang.String descrType, java.lang.String configString)
          Creates a new instance using the given descriptor type with the given XML configuration.
 
Method Summary
 void generate(Molecule mol)
          Generates the descriptor for the given molecule.
 void generate(Molecule mol, int[] atoms)
          Generates a partial descriptor for the given molecule.
 java.util.BitSet getAsBitSet()
          Returns the generated descriptor in a BitSet representation if it is available.
 float[] getAsFloatArray()
          Returns the generated descriptor in a float array representation if it is available.
 int[] getAsIntArray()
          Returns the generated descriptor in an int array representation if it is available.
 java.lang.String getAsString()
          Returns the generated descriptor in its native string representation.
static java.lang.String getDescriptorLongName(java.lang.String descrType)
          Returns the long name for the given molecular descriptor type.
static java.lang.String[] getDescriptorTypes()
          Returns the list of the built-in molecular descriptor types.
 void setParameter(java.lang.String paramName, java.lang.String paramValue)
          Sets a parameter of the current descriptor configuration.
 void setStandardizer(Standardizer standardizer)
          Sets the standardizer object to be used during descriptor generation.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

DescriptorGenerator

public DescriptorGenerator(java.lang.String descrType)
Creates a new instance using the given descriptor type with its default configuration parameters.

Parameters:
descrType - Predefined type name or class name of the desired molecular descriptor type. The list of available descriptor types can be obtained using getDescriptorTypes(). If the given string does not match any of the predefined names, it is assumed to be a class name.
Throws:
java.lang.RuntimeException - if the given name matches no predefined descriptor type and no derived class of MolecularDescriptor with that name can be initialized.

DescriptorGenerator

public DescriptorGenerator(java.lang.String descrType,
                           java.lang.String configString)
                    throws MDParametersException
Creates a new instance using the given descriptor type with the given XML configuration.

Parameters:
descrType - Predefined type name or class name of the desired molecular descriptor type. The list of available descriptor types can be obtained using getDescriptorTypes(). If the given string does not match any of the predefined names, it is assumed to be a class name.
configString - XML configuration string for the selected descriptor type.
Throws:
java.lang.RuntimeException - if the given name matches no predefined descriptor type and no derived class of MolecularDescriptor with that name can be initialized.
MDParametersException - if the XML configuration is invalid.
Method Detail

getDescriptorTypes

public static java.lang.String[] getDescriptorTypes()
Returns the list of the built-in molecular descriptor types. The returned array contains the short names of the descriptors. The long names can be obtained using getDescriptorLongName(String).


getDescriptorLongName

public static java.lang.String getDescriptorLongName(java.lang.String descrType)
Returns the long name for the given molecular descriptor type.

Parameters:
descrType - Predefined short name of a descriptor type. The list of available short names can be obtained using getDescriptorTypes().
Throws:
java.lang.IllegalArgumentException - if the given parameter is not an available descriptor type.

setParameter

public void setParameter(java.lang.String paramName,
                         java.lang.String paramValue)
Sets a parameter of the current descriptor configuration. Only a few main parameters can be set for each descriptor type: those stored as attributes of a designated element in the XML configuration. To specify other parameters, pass a full XML configuration to the constructor of the class.

Parameters:
paramName - the name of the parameter, which must be the same as the attribute name in the XML configuration.
paramValue - the new value of the parameter.
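
Because both arguments are strings, several parameters can be applied in one pass by converting each value with String.valueOf. The following is a minimal sketch, not part of this API: ParameterBatch and applyAll are hypothetical helpers, "Length" is the only parameter name taken from the usage example above, and "HypotheticalFlag" is a made-up name used purely for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;

// Hypothetical helper; not part of chemaxon.descriptors.
final class ParameterBatch {

    /** Passes each entry to the setter as (name, value-as-string), in the map's iteration order. */
    static void applyAll(Map<String, ?> params, BiConsumer<String, String> setter) {
        for (Map.Entry<String, ?> e : params.entrySet()) {
            setter.accept(e.getKey(), String.valueOf(e.getValue()));
        }
    }

    public static void main(String[] args) {
        Map<String, Object> params = new LinkedHashMap<>();
        params.put("Length", 512);             // name taken from the usage example
        params.put("HypotheticalFlag", true);  // made-up name, for illustration only

        // A recording setter stands in for a real generator here.
        List<String> calls = new ArrayList<>();
        applyAll(params, (name, value) -> calls.add(name + "=" + value));
        System.out.println(calls);
    }
}
```

With a real generator, the second argument could be the method reference gen::setParameter, since setParameter(String, String) declares no checked exceptions.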

setStandardizer

public void setStandardizer(Standardizer standardizer)
Sets the standardizer object to be used during descriptor generation. This method replaces any standardizer defined earlier, whether by a previous call to this method or by the configuration parameters of the descriptor.

Parameters:
standardizer - the standardizer object

generate

public void generate(Molecule mol)
              throws MDGeneratorException
Generates the descriptor for the given molecule.

Parameters:
mol - the molecule.
Throws:
MDGeneratorException - if descriptor generation fails.

generate

public void generate(Molecule mol,
                     int[] atoms)
              throws MDGeneratorException
Generates a partial descriptor for the given molecule. The generated descriptor will contain only those features that are related to the given atoms of the input molecule.

Currently, only ChemicalFingerprint supports this kind of partial descriptor generation. UnsupportedOperationException is thrown for all other descriptor types.

Parameters:
mol - the molecule.
atoms - indexes of the selected atoms.
Throws:
MDGeneratorException - if descriptor generation fails.
java.lang.UnsupportedOperationException - if the selected descriptor type does not support partial generation.
Since:
JChem 5.4.1

getAsString

public java.lang.String getAsString()
Returns the generated descriptor in its native string representation. This representation is available for all descriptor types.


getAsFloatArray

public float[] getAsFloatArray()
                        throws java.lang.UnsupportedOperationException
Returns the generated descriptor in a float array representation if it is available.

Throws:
java.lang.UnsupportedOperationException - if no appropriate conversion can be applied for the selected descriptor type.

getAsIntArray

public int[] getAsIntArray()
                    throws java.lang.UnsupportedOperationException
Returns the generated descriptor in an int array representation if it is available.

Throws:
java.lang.UnsupportedOperationException - if this representation is not supported by the selected descriptor type.
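
Since the array and BitSet getters throw UnsupportedOperationException when a representation is unavailable, while getAsString() applies to all descriptor types, callers that handle several descriptor types may want a fallback chain. The sketch below shows one possible ordering. DescriptorSource is a hypothetical stand-in interface that only mirrors the four getter signatures of this class so that the example compiles without JChem; the order in which representations are tried is an arbitrary choice, not something this API prescribes.

```java
import java.util.BitSet;

// Hypothetical helper; not part of chemaxon.descriptors.
final class DescriptorFormats {

    /** Stand-in mirroring the getter signatures of DescriptorGenerator (assumption for the demo). */
    interface DescriptorSource {
        String getAsString();
        float[] getAsFloatArray();
        int[] getAsIntArray();
        BitSet getAsBitSet();
    }

    /** Tries BitSet, then int[], then float[], and finally falls back to the string form. */
    static String summarize(DescriptorSource src) {
        try {
            return "bits set: " + src.getAsBitSet().cardinality();
        } catch (UnsupportedOperationException e) {
            // no BitSet representation; try the next one
        }
        try {
            return "int values: " + src.getAsIntArray().length;
        } catch (UnsupportedOperationException e) {
            // no int[] representation; try the next one
        }
        try {
            return "float values: " + src.getAsFloatArray().length;
        } catch (UnsupportedOperationException e) {
            // no float[] representation; use the string form
        }
        return "string: " + src.getAsString();
    }

    public static void main(String[] args) {
        // A fake source that only supports the float[] and string forms.
        DescriptorSource floatsOnly = new DescriptorSource() {
            public String getAsString() { return "0.5 1.5"; }
            public float[] getAsFloatArray() { return new float[] {0.5f, 1.5f}; }
            public int[] getAsIntArray() { throw new UnsupportedOperationException(); }
            public BitSet getAsBitSet() { throw new UnsupportedOperationException(); }
        };
        System.out.println(summarize(floatsOnly)); // prints "float values: 2"
    }
}
```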

getAsBitSet

public java.util.BitSet getAsBitSet()
                             throws java.lang.UnsupportedOperationException
Returns the generated descriptor in a BitSet representation if it is available.

Throws:
java.lang.UnsupportedOperationException - if this representation is not supported by the selected descriptor type.
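
One common use of the BitSet representation is comparing two binary fingerprints. The sketch below computes the Tanimoto coefficient with plain java.util.BitSet operations. FingerprintSimilarity is a hypothetical helper, not part of this API, and the hand-built BitSets stand in for values that would come from getAsBitSet(). This page does not state whether getAsBitSet() returns a fresh copy on each call, so the helper works on clones and leaves its inputs untouched; when reusing one generator for several molecules, cloning the returned BitSet before the next generate call is a cautious choice.

```java
import java.util.BitSet;

// Hypothetical helper; not part of chemaxon.descriptors.
final class FingerprintSimilarity {

    /**
     * Tanimoto coefficient: (bits set in both) / (bits set in either).
     * Returns 1.0 when both inputs are empty (a convention chosen for this sketch).
     */
    static double tanimoto(BitSet a, BitSet b) {
        BitSet both = (BitSet) a.clone();
        both.and(b);
        BitSet either = (BitSet) a.clone();
        either.or(b);
        int union = either.cardinality();
        return union == 0 ? 1.0 : (double) both.cardinality() / union;
    }

    public static void main(String[] args) {
        // Stand-ins for two fingerprints obtained via getAsBitSet().
        BitSet first = new BitSet(512);
        first.set(3);
        first.set(17);
        first.set(200);

        BitSet second = new BitSet(512);
        second.set(3);
        second.set(200);
        second.set(411);

        // 2 shared bits out of 4 distinct bits
        System.out.println(tanimoto(first, second)); // prints 0.5
    }
}
```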