chemaxon.descriptors
Class MolecularDescriptor

java.lang.Object
  extended by chemaxon.descriptors.MolecularDescriptor
All Implemented Interfaces:
chemaxon.stat.Diffable, java.lang.Cloneable
Direct Known Subclasses:
BCUT, ChemicalFingerprint, CustomDescriptor, PharmacophoreFingerprint, ReactionFingerprint, ScalarDescriptor

public abstract class MolecularDescriptor
extends java.lang.Object
implements java.lang.Cloneable, chemaxon.stat.Diffable

Generic definition of molecular descriptors. The MolecularDescriptor class models all kinds of structural keys, fingerprints (hashed, pharmacophoric), MDL keys and many others, which can be implemented in derived classes; some of these are implemented in JChem. For the sake of generality the MolecularDescriptor class does not introduce operations that manipulate descriptors at the "atomic" level, that is, cells (or bins) of descriptors cannot be accessed either for reading or for writing. This is because cells in various chemical descriptors can have different types (for example bit, integer or floating point value).
Operations between different MolecularDescriptor derivatives are not supported, though for the sake of efficiency no extra type checking is introduced (other than provided by the language itself).
Derived classes should define their own dissimilarity metrics for the sake of efficiency. This is one reason why this class does not define any dissimilarity metric; the other is that different MolecularDescriptor subclasses may have different representations (for instance integer array vs. float array).
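
The design described above can be illustrated with a minimal sketch. The classes ToyDescriptor, ToyBitFingerprint and ToyFloatVector are hypothetical stand-ins, not JChem classes. They only show why the base class leaves cell access and the metric to subclasses: a bit-based and a float-based descriptor store different cell types and need different distance computations. The Tanimoto and Euclidean metrics are picked for illustration only.

```java
import java.util.BitSet;

// Hypothetical stand-ins, NOT part of JChem.
abstract class ToyDescriptor {
    // Same shape as getDissimilarity(Object other); no cell accessors here.
    abstract float getDissimilarity(Object other);
}

class ToyBitFingerprint extends ToyDescriptor {
    private final BitSet bits = new BitSet();

    ToyBitFingerprint(int... onBits) {
        for (int b : onBits) bits.set(b);
    }

    @Override
    float getDissimilarity(Object other) {
        // Only the language-level cast checks the type: passing a
        // ToyFloatVector fails with ClassCastException.
        BitSet o = ((ToyBitFingerprint) other).bits;
        BitSet common = (BitSet) bits.clone();
        common.and(o);
        BitSet union = (BitSet) bits.clone();
        union.or(o);
        if (union.isEmpty()) return 0f; // two empty fingerprints are identical
        return 1f - (float) common.cardinality() / union.cardinality();
    }
}

class ToyFloatVector extends ToyDescriptor {
    private final float[] cells;

    ToyFloatVector(float... cells) {
        this.cells = cells.clone();
    }

    @Override
    float getDissimilarity(Object other) {
        // Assumes both vectors have the same length.
        float[] o = ((ToyFloatVector) other).cells;
        double sum = 0;
        for (int i = 0; i < cells.length; i++) {
            double d = cells[i] - o[i];
            sum += d * d;
        }
        return (float) Math.sqrt(sum);
    }
}
```

Comparing a ToyBitFingerprint with a ToyFloatVector compiles but fails at run time, which mirrors the remark that operations between different derivatives are not supported and not explicitly checked.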

Since:
JChem 2.0
Author:
Miklos Vargyas, Zsuzsanna Szabo

Field Summary
protected  MDParameters params
          Parameter settings related to the descriptor.
 
Constructor Summary
MolecularDescriptor()
          Default constructor, creates an empty object.
MolecularDescriptor(MDParameters parameters)
          Creates a new MolecularDescriptor with the given parameters.
MolecularDescriptor(MolecularDescriptor c)
          Copy constructor, creates an identical copy of the MolecularDescriptor passed as a parameter.
 
Method Summary
abstract  java.lang.Object clone()
          Creates a new instance with identical internal state.
abstract  void fromData(byte[] dbRepr)
          Builds a MolecularDescriptor object from its external (database) representation.
abstract  void fromFloatArray(float[] descr)
          Builds a molecular descriptor from its float array representation.
abstract  void fromString(java.lang.String descr)
          Builds a molecular descriptor from its string representation.
 java.lang.String[] generate(Molecule m)
          Creates the descriptor for the given Molecule.
 java.awt.Color[] getAtomSetColors()
          Determines the coloring of atoms.
 int[] getAtomSetIndexes(Molecule m)
          Gets the individual atom color indexes.
 java.lang.String[] getAtomSetNames()
           
abstract  float[] getDefaultDissimilarityMetricThresholds()
          Gets the default dissimilarity threshold values for all dissimilarity metrics defined.
 int getDefaultMetricIndex()
          Gets the index of the default metric.
 float getDefaultThreshold(int metricIndex)
          Gets a metric dependent default threshold value.
abstract  float getDissimilarity(java.lang.Object other)
          Calculates the dissimilarity ratio between two MolecularDescriptor objects using the default metric.
abstract  float getDissimilarity(java.lang.Object other, int parametrizedMetricIndex)
          Calculates the dissimilarity between two MolecularDescriptor objects using the specified metric; otherwise it behaves the same as getDissimilarity(Object other).
 int getDissimilarityMetricIndex(java.lang.String metricName)
          Gets the internal index of the given metric.
abstract  java.lang.String[] getDissimilarityMetrics()
          Gets the dissimilarity metric names in an array.
 float getLowerBound(java.lang.Object other)
          Calculates an estimate for the minimum value of the distance using the default distance metric.
 int getMetricIndex(java.lang.String metricName)
          Gets the index of the given parametrized metric.
 java.lang.String getMetricName()
          Gets the name of the current parametrized metric.
 java.lang.String getMetricName(int metricIndex)
          Gets the name of the parametrized metric specified by its index.
 java.lang.String getName()
          Gets the name of the descriptor.
 int getNumberOfMetrics()
          Gets the number of parametrized metrics available for the particular descriptor.
 int getNumberOfWeights(java.lang.String dissimilarityMetricName)
          Gets the number of weight factors used by the specified metric.
 MDParameters getParameters()
          Gets the parameters associated with the object.
 java.lang.String getParametersClassName()
          Gets the name of the parameters class corresponding to the descriptor (prefixed with the package name as getClass().getName() would return).
 java.lang.String getShortName()
          Gets the short name of the descriptor.
 float getThreshold()
          Gets threshold value of the current parameterized metric.
 float getThreshold(int metricIndex)
          Gets a metric dependent default threshold value.
static void main(java.lang.String[] args)
           
 boolean needsConfig()
          Indicates if class takes parameters from configuration file.
static MolecularDescriptor newInstance(java.lang.String descriptorTypeName)
          Creates a MolecularDescriptor specified by its name.
static MolecularDescriptor newInstance(java.lang.String descriptorTypeName, java.lang.String parameters)
          Creates a MolecularDescriptor specified by its name and XML parameters.
static MolecularDescriptor newInstanceFromXML(java.lang.String parameters)
          Creates a new MolecularDescriptor instance according to the given parameter string.
 void setParameters(MDParameters parameters)
          Sets the parameters for an already created MolecularDescriptor.
abstract  void setParameters(java.lang.String parameters)
          Sets the parameters for an already created MolecularDescriptor.
 void setScreeningConfiguration(java.lang.String config)
          Sets the screening configuration.
 java.lang.String toBinaryString()
          Creates the binary string representation of a MolecularDescriptor object.
abstract  byte[] toData()
          Converts the internal (memory) representation of a MolecularDescriptor instance into an external format that can be stored in a database.
abstract  java.lang.String toDecimalString()
          Creates the string representation of a MolecularDescriptor object.
abstract  float[] toFloatArray()
          Creates the float array representation of a MolecularDescriptor object.
abstract  java.lang.String toString()
          Creates the string representation of a MolecularDescriptor object.
 
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

params

protected MDParameters params
Parameter settings related to the descriptor. Instances of the MolecularDescriptor class with the same parameter settings share a common MDParameters object; however, MolecularDescriptor objects having different parameters are also allowed in the same application. MDParameters can be specialized by inheritance, so classes derived from MolecularDescriptor are recommended to derive their own MDParameters subclasses.

Constructor Detail

MolecularDescriptor

public MolecularDescriptor(MDParameters parameters)
Creates a new MolecularDescriptor with the given parameters.

Parameters:
parameters - parameter settings of the descriptor to be created
Since:
JChem 2.2

MolecularDescriptor

public MolecularDescriptor()
Default constructor, creates an empty object.


MolecularDescriptor

public MolecularDescriptor(MolecularDescriptor c)
Copy constructor, creates an identical copy of the MolecularDescriptor passed as a parameter.

Parameters:
c - a MolecularDescriptor to be copied
Method Detail

newInstance

public static final MolecularDescriptor newInstance(java.lang.String descriptorTypeName,
                                                    java.lang.String parameters)
Creates a MolecularDescriptor specified by its name and XML parameters.

Parameters:
descriptorTypeName - predefined type name or class name. If the name doesn't match any of the predefined names, it is assumed to be a class name
parameters - XML parameter configuration
Returns:
a new instance of the appropriate MolecularDescriptor

newInstance

public static final MolecularDescriptor newInstance(java.lang.String descriptorTypeName)
Creates a MolecularDescriptor specified by its name. The descriptor is created with the default parameter settings.

Parameters:
descriptorTypeName - predefined type name or class name. If the name doesn't match any of the predefined names, it is assumed to be a class name
Returns:
a new instance of the appropriate MolecularDescriptor
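
The lookup rule described for descriptorTypeName (predefined name first, class name as fallback) can be sketched as follows. This is a hypothetical illustration, not the JChem implementation: the class ToyDescriptorFactory, its nested types and the predefined names "FP" and "SC" are all invented for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical sketch, NOT the JChem implementation.
class ToyDescriptorFactory {
    interface Descriptor {
        String getShortName();
    }

    static class ToyFingerprint implements Descriptor {
        public String getShortName() { return "FP"; }
    }

    static class ToyScalar implements Descriptor {
        public String getShortName() { return "SC"; }
    }

    // Predefined type names (invented for this sketch).
    private static final Map<String, Supplier<Descriptor>> PREDEFINED = new HashMap<>();
    static {
        PREDEFINED.put("FP", ToyFingerprint::new);
        PREDEFINED.put("SC", ToyScalar::new);
    }

    static Descriptor newInstance(String descriptorTypeName) {
        Supplier<Descriptor> predefined = PREDEFINED.get(descriptorTypeName);
        if (predefined != null) {
            return predefined.get();
        }
        try {
            // Not a predefined name: treat it as a fully qualified class name.
            return (Descriptor) Class.forName(descriptorTypeName)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalArgumentException(
                    "Unknown descriptor type: " + descriptorTypeName, e);
        }
    }
}
```

In this sketch, ToyDescriptorFactory.newInstance("FP") takes the predefined branch, while passing ToyDescriptorFactory.ToyScalar.class.getName() takes the reflective branch.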

newInstanceFromXML

public static final MolecularDescriptor newInstanceFromXML(java.lang.String parameters)
Creates a new MolecularDescriptor instance according to the given parameter string. The parameter string is an XML configuration that specifies the type of the MolecularDescriptor as well as its parameter settings.

Parameters:
parameters - XML configuration string
Returns:
a new instance of the appropriate MolecularDescriptor

clone

public abstract java.lang.Object clone()
Creates a new instance with identical internal state.

Overrides:
clone in class java.lang.Object
Returns:
the newly copied object

getName

public java.lang.String getName()
Gets the name of the descriptor. The name is not the same as the class name; it is more readable and meaningful for end users.

Returns:
the descriptor's name depending on its actual type

getShortName

public java.lang.String getShortName()
Gets the short name of the descriptor.

Returns:
the short name used in text outputs (tables etc.)

getParametersClassName

public java.lang.String getParametersClassName()
Gets the name of the parameters class corresponding to the descriptor (prefixed with the package name as getClass().getName() would return).

Returns:
the name of the parameters class

setParameters

public void setParameters(MDParameters parameters)
Sets the parameters for an already created MolecularDescriptor.

Parameters:
parameters - parameter settings of the descriptor

setParameters

public abstract void setParameters(java.lang.String parameters)
                            throws chemaxon.descriptors.MDParametersException
Sets the parameters for an already created MolecularDescriptor.

Parameters:
parameters - parameter settings of the descriptor
Throws:
chemaxon.descriptors.MDParametersException

getParameters

public MDParameters getParameters()
Gets the parameters associated with the object.

Returns:
parameters settings

needsConfig

public boolean needsConfig()
Indicates if class takes parameters from configuration file. Derived classes have to override this method appropriately.

Returns:
true, most descriptor classes must have a configuration (file)
Since:
JChem 2.2

setScreeningConfiguration

public void setScreeningConfiguration(java.lang.String config)
                               throws chemaxon.descriptors.MDParametersException
Sets the screening configuration. Overwrites old parameters with the new ones; parameters not affected by the screening configuration remain unchanged.

Parameters:
config - screening configuration string
Throws:
chemaxon.descriptors.MDParametersException

toData

public abstract byte[] toData()
Converts the internal (memory) representation of a MolecularDescriptor instance into an external format that can be stored in a database.

Returns:
binary representation of the descriptor

fromData

public abstract void fromData(byte[] dbRepr)
Builds a MolecularDescriptor object from its external (database) representation.

Parameters:
dbRepr - an array generated by toData()
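
The toData()/fromData() pair is meant to round-trip: the bytes written by one are read back by the other. A minimal sketch of that contract follows, using a hypothetical ToyStoredDescriptor class that is not part of JChem. The byte layout (consecutive 4-byte floats) is an assumption made for the example; the format actually used by JChem descriptors is not stated here.

```java
import java.nio.ByteBuffer;

// Hypothetical stand-in, NOT a JChem class; the byte layout is an assumption.
class ToyStoredDescriptor {
    private float[] cells;

    ToyStoredDescriptor(float... cells) {
        this.cells = cells.clone();
    }

    // Internal (memory) representation -> external (database) representation.
    byte[] toData() {
        ByteBuffer buffer = ByteBuffer.allocate(cells.length * Float.BYTES);
        for (float cell : cells) {
            buffer.putFloat(cell);
        }
        return buffer.array();
    }

    // External representation -> internal representation.
    void fromData(byte[] dbRepr) {
        ByteBuffer buffer = ByteBuffer.wrap(dbRepr);
        float[] restored = new float[dbRepr.length / Float.BYTES];
        for (int i = 0; i < restored.length; i++) {
            restored[i] = buffer.getFloat();
        }
        cells = restored;
    }

    float[] toFloatArray() {
        return cells.clone();
    }
}
```

Calling fromData(original.toData()) on an empty ToyStoredDescriptor yields the same float cells as the original.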

toString

public abstract java.lang.String toString()
Creates the string representation of a MolecularDescriptor object. This string value is stored in SDfiles, though the use of this string is not limited to this purpose. Typically, this string is compact, for instance zero values are not necessarily printed.

Overrides:
toString in class java.lang.Object
Returns:
a formatted string of the descriptor

toDecimalString

public abstract java.lang.String toDecimalString()
Creates the string representation of a MolecularDescriptor object. This string contains all values of the descriptor (including all zeros); values are separated by tabs.

Returns:
a formatted string of the descriptor
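
The difference between the compact toString() form and the full toDecimalString() form can be sketched with a hypothetical ToyCountDescriptor class (not part of JChem). The tab-separated full form follows the description above. The compact index:value layout is invented for the example, since the text only says that zero values are not necessarily printed.

```java
// Hypothetical stand-in, NOT a JChem class.
class ToyCountDescriptor {
    private final int[] cells;

    ToyCountDescriptor(int... cells) {
        this.cells = cells.clone();
    }

    // Compact form: only non-zero cells, written as index:value pairs.
    // This particular layout is an assumption made for the sketch.
    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < cells.length; i++) {
            if (cells[i] == 0) continue;
            if (sb.length() > 0) sb.append(' ');
            sb.append(i).append(':').append(cells[i]);
        }
        return sb.toString();
    }

    // Full form: every cell, zeros included, separated by tabs.
    String toDecimalString() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < cells.length; i++) {
            if (i > 0) sb.append('\t');
            sb.append(cells[i]);
        }
        return sb.toString();
    }
}
```

For the cells 0, 3, 0, 1 the full form lists all four values, while the compact form of this sketch keeps only the two non-zero cells.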

toBinaryString

public java.lang.String toBinaryString()
Creates the binary string representation of a MolecularDescriptor object.

Returns:
a 0,1 string of the descriptor
Since:
JChem 2.3

fromString

public abstract void fromString(java.lang.String descr)
                         throws java.text.ParseException
Builds a molecular descriptor from its string representation. Typically used when SDfile is read.

Parameters:
descr - descriptor string, previously generated by toString()
Throws:
java.text.ParseException

toFloatArray

public abstract float[] toFloatArray()
Creates the float array representation of a MolecularDescriptor object. This array contains all values of the descriptor (including all zeros) in the elements of the array.

Returns:
a formatted float array of the descriptor
Since:
JChem 2.0.1

fromFloatArray

public abstract void fromFloatArray(float[] descr)
Builds a molecular descriptor from its float array representation. Typically used when a hypothesis is created.

Parameters:
descr - descriptor represented in a float array (e.g. generated by toFloatArray())
Since:
JChem 2.0.1

generate

public java.lang.String[] generate(Molecule m)
                            throws chemaxon.descriptors.MDGeneratorException
Creates the descriptor for the given Molecule.

Returns:
property names set in the molecule passed during generation
Throws:
chemaxon.descriptors.MDGeneratorException - when failed to generate descriptor

getAtomSetColors

public java.awt.Color[] getAtomSetColors()
Determines the coloring of atoms. This coloring does not reflect element types; instead, other categories related to the specific descriptor are considered. Therefore, whenever the parameters are changed, it is advisable to call getAtomSetColors() to obtain the current coloring scheme. Typically, the coloring scheme is defined in the MDParameters object associated with the MolecularDescriptor object.

Returns:
array of colors of different atom categories

getAtomSetNames

public java.lang.String[] getAtomSetNames()

getAtomSetIndexes

public int[] getAtomSetIndexes(Molecule m)
Gets the individual atom color indexes. This allows color mapping and visualisation of various properties encoded into the MolecularDescriptor. Prior to this method, getAtomSetColors() has to be called to obtain a color array. This method returns a per-atom color index array, and the returned indexes refer to the color array returned by getAtomSetColors().

Parameters:
m - a molecule to assign atom colors to
Returns:
array of color indexes (indexed by atom indexes)
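
The relationship between the two arrays can be sketched as a plain lookup. The helper class AtomColoringSketch below is hypothetical and works on arrays of the shapes returned by getAtomSetColors() and getAtomSetIndexes(Molecule). It assumes that every index is a valid position in the color array; how atoms that belong to no category are reported is not specified here.

```java
import java.awt.Color;

// Hypothetical helper, NOT part of JChem.
class AtomColoringSketch {
    // setColors:      one color per atom category, as from getAtomSetColors()
    // atomSetIndexes: one category index per atom, as from getAtomSetIndexes(m)
    static Color[] colorPerAtom(Color[] setColors, int[] atomSetIndexes) {
        Color[] perAtom = new Color[atomSetIndexes.length];
        for (int atom = 0; atom < atomSetIndexes.length; atom++) {
            // Assumption: each index points into setColors.
            perAtom[atom] = setColors[atomSetIndexes[atom]];
        }
        return perAtom;
    }
}
```

With two category colors and the index array {1, 0, 1}, the first and third atoms receive the second color and the middle atom receives the first.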

getDissimilarityMetrics

public abstract java.lang.String[] getDissimilarityMetrics()
Gets the dissimilarity metric names in an array.
This method must be overridden by derived classes in order to get the metrics array depending on the dynamic type. (This is needed because the metrics[] array is a class variable, but class variables are shared among all derived classes.)

Returns:
the metrics array

getDefaultDissimilarityMetricThresholds

public abstract float[] getDefaultDissimilarityMetricThresholds()
Gets the default dissimilarity threshold values for all dissimilarity metrics defined.

Returns:
array of dissimilarity threshold values

getDissimilarityMetricIndex

public int getDissimilarityMetricIndex(java.lang.String metricName)
                                throws java.lang.IllegalArgumentException
Gets the internal index of the given metric.

Parameters:
metricName - name of a metric
Returns:
index of the specified metric
Throws:
java.lang.IllegalArgumentException
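
A name-to-index lookup of this kind can be sketched over an array of metric names such as the one returned by getDissimilarityMetrics(). The helper MetricLookupSketch is hypothetical. Exact, case-sensitive matching and throwing for unknown names are assumptions of the sketch, and the metric names used with it are sample data only.

```java
// Hypothetical helper, NOT part of JChem.
class MetricLookupSketch {
    static int metricIndex(String[] metricNames, String metricName) {
        for (int i = 0; i < metricNames.length; i++) {
            // Assumption: names are compared exactly (case-sensitive).
            if (metricNames[i].equals(metricName)) {
                return i;
            }
        }
        throw new IllegalArgumentException("Unknown metric: " + metricName);
    }
}
```

Given the sample names {"Tanimoto", "Euclidean"}, looking up "Euclidean" returns 1 and looking up any other name throws IllegalArgumentException.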

getNumberOfWeights

public int getNumberOfWeights(java.lang.String dissimilarityMetricName)
                       throws java.lang.IllegalArgumentException
Gets the number of weight factors used by the specified metric. This method can be applied to the dissimilarity metrics provided by the MolecularDescriptor class or its derived classes, but not to parametrized metrics.

Parameters:
dissimilarityMetricName - name as returned by getDissimilarityMetrics()
Returns:
number of weights the metric uses
Throws:
java.lang.IllegalArgumentException - if the given parameter is not a valid metric name

getNumberOfMetrics

public int getNumberOfMetrics()
Gets the number of parametrized metrics available for the particular descriptor.

Returns:
number of metrics implemented in this class

getMetricIndex

public int getMetricIndex(java.lang.String metricName)
                   throws java.lang.IllegalArgumentException
Gets the index of the given parametrized metric.

Parameters:
metricName - name of a metric
Returns:
index of the specified metric
Throws:
java.lang.IllegalArgumentException - when given metric name is not valid

getDefaultThreshold

public float getDefaultThreshold(int metricIndex)
Gets a metric dependent default threshold value. The actual value of this hard-wired parameter is not important, since it is only used in user interfaces to simplify the use of applications for beginners. Note: this method is kept for compatibility reasons.

Parameters:
metricIndex - index of a parametrized metric

getThreshold

public float getThreshold(int metricIndex)
Gets a metric dependent default threshold value. Ideally, this value should be based on statistics, though the actual value is not too critical, since it is only used in user interfaces to simplify the use of applications for beginners. Note: this method is kept for compatibility reasons.

Parameters:
metricIndex - index of a parametrized metric

getThreshold

public float getThreshold()
Gets threshold value of the current parameterized metric.

Returns:
threshold value

getMetricName

public java.lang.String getMetricName()
Gets the name of the current parametrized metric.

Returns:
name of the current metric

getMetricName

public java.lang.String getMetricName(int metricIndex)
Gets the name of the parametrized metric specified by its index. Note: this method is kept for backward compatibility.

Parameters:
metricIndex - metric index
Returns:
name of the given metric

getDefaultMetricIndex

public int getDefaultMetricIndex()
Gets the index of the default metric. The default metric is one of the available ones that is the most commonly used for the given MolecularDescriptor.

Returns:
metric index of the default metric

getDissimilarity

public abstract float getDissimilarity(java.lang.Object other)
Calculates the dissimilarity ratio between two MolecularDescriptor objects using the default metric. The default metric is set in the corresponding MDParameters object. In the case of asymmetric distances, swapping the two descriptors can make a big difference.

Specified by:
getDissimilarity in interface chemaxon.stat.Diffable
Parameters:
other - a descriptor, to which the dissimilarity ratio is measured
Returns:
dissimilarity ratio
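
The remark about asymmetric distances can be illustrated with a Tversky-style dissimilarity on bit sets. This is a sketch only: the class AsymmetricMetricSketch is hypothetical, and Tversky is chosen here as a well-known asymmetric measure, not because the text above names it. The dissimilarity is 1 - |A∩B| / (|A∩B| + alpha·|A\B| + beta·|B\A|), which is asymmetric whenever alpha differs from beta.

```java
import java.util.BitSet;

// Hypothetical helper, NOT part of JChem.
class AsymmetricMetricSketch {
    static BitSet bits(int... on) {
        BitSet set = new BitSet();
        for (int b : on) set.set(b);
        return set;
    }

    static float tverskyDissimilarity(BitSet a, BitSet b, float alpha, float beta) {
        BitSet common = (BitSet) a.clone();
        common.and(b);          // A ∩ B
        BitSet onlyA = (BitSet) a.clone();
        onlyA.andNot(b);        // A \ B
        BitSet onlyB = (BitSet) b.clone();
        onlyB.andNot(a);        // B \ A
        float denominator = common.cardinality()
                + alpha * onlyA.cardinality()
                + beta * onlyB.cardinality();
        // Nothing to compare after weighting: treated as identical by
        // convention in this sketch.
        if (denominator == 0f) return 0f;
        return 1f - common.cardinality() / denominator;
    }
}
```

With A = {0, 1, 2, 3}, B = {0, 1}, alpha = 1 and beta = 0, the dissimilarity from A to B is 0.5 while the dissimilarity from B to A is 0, so the argument order matters. Setting alpha = beta = 1 gives the symmetric Tanimoto case.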

getDissimilarity

public abstract float getDissimilarity(java.lang.Object other,
                                       int parametrizedMetricIndex)
Calculates the dissimilarity between two MolecularDescriptor objects using the specified metric; otherwise it behaves the same as getDissimilarity(Object other).

Parameters:
other - a descriptor, to which the dissimilarity ratio is measured
parametrizedMetricIndex - the index of the parametrized metric to use
Returns:
dissimilarity ratio
See Also:
MDParameters, PFParameters

getLowerBound

public float getLowerBound(java.lang.Object other)
Calculates an estimate for the minimum value of the distance using the default distance metric. This is needed in clustering; see chemaxon.stat.Diffable for further explanation.

Specified by:
getLowerBound in interface chemaxon.stat.Diffable
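
One common use of a cheap lower bound is to skip the full distance computation when the bound already exceeds a threshold. The sketch below illustrates that idea for Tanimoto dissimilarity on bit sets. It is an assumption about how such a bound can be used, not a description of how JChem computes getLowerBound(), and the class LowerBoundSketch is hypothetical. The bound follows from |A∩B| ≤ min(|A|, |B|) and |A∪B| ≥ max(|A|, |B|), so the dissimilarity is at least 1 - min/max.

```java
import java.util.BitSet;

// Hypothetical helper, NOT part of JChem.
class LowerBoundSketch {
    static BitSet bits(int... on) {
        BitSet set = new BitSet();
        for (int b : on) set.set(b);
        return set;
    }

    static float tanimotoDissimilarity(BitSet a, BitSet b) {
        BitSet common = (BitSet) a.clone();
        common.and(b);
        BitSet union = (BitSet) a.clone();
        union.or(b);
        if (union.isEmpty()) return 0f;
        return 1f - (float) common.cardinality() / union.cardinality();
    }

    // Needs only the two bit counts, not the bit patterns.
    static float lowerBound(int cardinalityA, int cardinalityB) {
        int max = Math.max(cardinalityA, cardinalityB);
        if (max == 0) return 0f;
        return 1f - (float) Math.min(cardinalityA, cardinalityB) / max;
    }

    static boolean withinThreshold(BitSet a, BitSet b, float threshold) {
        // Cheap rejection: the real distance cannot be below the bound.
        if (lowerBound(a.cardinality(), b.cardinality()) > threshold) {
            return false;
        }
        return tanimotoDissimilarity(a, b) <= threshold;
    }
}
```

The bound never exceeds the exact dissimilarity, so pairs rejected by it would also be rejected by the full computation.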

main

public static void main(java.lang.String[] args)