chemaxon.descriptors
Class CFGenerator

java.lang.Object
  extended by chemaxon.descriptors.MDGenerator
      extended by chemaxon.descriptors.CFGenerator

public class CFGenerator
extends MDGenerator

The CFGenerator class generates topological fingerprints of molecular graphs.

Basic concepts
A binary string (a series of 0s and 1s) is constructed based on the local connectivity of atoms and on atom types. The length of the series is a predefined constant parameter. Two further parameters influence the fingerprints generated: the number of bonds, which determines the size of the local neighborhood of atoms taken into account by providing an upper limit for the length of paths starting from each atom; and the number of bits to be set in the fingerprint for each property identified.
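
The interplay of the three parameters can be illustrated with a small, self-contained sketch. It is not the ChemAxon implementation: the class name PathFingerprintSketch, the adjacency-list input and the hash-seeded bit selection are all assumptions made for illustration, and paths are not canonicalized (C-O and O-C are treated as different patterns).

```java
import java.util.BitSet;
import java.util.Random;

/** Toy path-based fingerprint on a hand-built graph; not the ChemAxon implementation. */
final class PathFingerprintSketch {

    /**
     * @param atomTypes   element symbol of each atom, indexed by atom number
     * @param neighbors   neighbors[i] lists the atoms bonded to atom i
     * @param length      fingerprint length in bits
     * @param maxBonds    upper limit on the number of bonds in a path
     * @param bitsPerPath number of bit positions drawn for each path found
     */
    static BitSet fingerprint(String[] atomTypes, int[][] neighbors,
                              int length, int maxBonds, int bitsPerPath) {
        BitSet fp = new BitSet(length);
        boolean[] onPath = new boolean[atomTypes.length];
        for (int start = 0; start < atomTypes.length; start++) {
            walk(start, atomTypes[start], 0, atomTypes, neighbors,
                 onPath, fp, length, maxBonds, bitsPerPath);
        }
        return fp;
    }

    /** Depth-first enumeration of simple paths that begin at the start atom. */
    private static void walk(int atom, String path, int bonds,
                             String[] atomTypes, int[][] neighbors, boolean[] onPath,
                             BitSet fp, int length, int maxBonds, int bitsPerPath) {
        setBits(path, fp, length, bitsPerPath);
        if (bonds == maxBonds) {
            return; // the path has reached the bond limit
        }
        onPath[atom] = true;
        for (int next : neighbors[atom]) {
            if (!onPath[next]) {
                walk(next, path + "-" + atomTypes[next], bonds + 1, atomTypes,
                     neighbors, onPath, fp, length, maxBonds, bitsPerPath);
            }
        }
        onPath[atom] = false;
    }

    /** Maps a path to bit positions using a generator seeded by the path's hash. */
    private static void setBits(String path, BitSet fp, int length, int bitsPerPath) {
        Random rng = new Random(path.hashCode());
        for (int i = 0; i < bitsPerPath; i++) {
            fp.set(rng.nextInt(length)); // positions may collide, so fewer bits can result
        }
    }

    public static void main(String[] args) {
        // Heavy atoms of ethanol: C0-C1-O2.
        String[] types = {"C", "C", "O"};
        int[][] bonds = {{1}, {0, 2}, {1}};
        System.out.println(fingerprint(types, bonds, 64, 2, 2));
    }
}
```

In this sketch, raising maxBonds adds longer paths and can only add bits, while raising bitsPerPath makes each pattern more distinctive at the cost of a denser fingerprint.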

Typical usage
To optimize memory usage, one instance of this class can generate fingerprints for a series of molecular graphs through consecutive calls to the generate() method.
In most cases the generator is not intended to be used directly. When molecules are taken from files or databases, the corresponding MolecularDescriptors can be generated by the appropriate MDReader object. Alternatively, MolecularDescriptor.generate( final Molecule ) is the simplest way to obtain a descriptor corresponding to a molecular structure.

Example of the direct use of the class within an application:
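      // One generator and one fingerprint object are reused for every molecule.
      CFGenerator gen = new CFGenerator();
      ChemicalFingerprint fp = new ChemicalFingerprint( new CFParameters() );
      // getFirstMoleculeFromSomewhere(), getNextMoleculeFromSomewhere() and
      // doSomethingWith() are placeholders for application code.
      Molecule mol = getFirstMoleculeFromSomewhere();
      while ( mol != null ) {
          // Updates fp in place; may throw MDGeneratorException.
          gen.generate( mol, fp );
          doSomethingWith( fp );
          mol = getNextMoleculeFromSomewhere();
      }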
 

Since:
JChem 2.0
Author:
Miklos Vargyas, Peter Kovacs (pkovacs84)

Field Summary
 
Fields inherited from class chemaxon.descriptors.MDGenerator
createStatistics, density, freqCount, maxNonEmptyId, maxNonEmptyPercent, minNonEmptyId, minNonEmptyPercent, molCount, sumNonEmptyPercent
 
Constructor Summary
CFGenerator()
          Creates a new instance of CFGenerator which can be used to generate chemical fingerprints for an arbitrary number of molecules.
CFGenerator(int length)
          Deprecated. since 5.4
CFGenerator(Standardizer s, int length)
          Deprecated. since 2.2
 
Method Summary
protected  int calcFreqCount(MolecularDescriptor d)
          Updates the statistics gathered on the generated fingerprints and gets the number of non-zero cells.
 java.lang.String[] generate(Molecule m, int[] aidxs, MolecularDescriptor d)
          Generates the partial chemical fingerprint for the given molecule.
 java.lang.String[] generate(Molecule m, MolecularDescriptor d)
          Generates the chemical fingerprint for the given molecule.
 
Methods inherited from class chemaxon.descriptors.MDGenerator
getAverageNonZeroRatio, getBrightestMolId, getDarkestMolId, getDensityCounts, getFrequencyCounts, getMaximumBitRatio, getMinimumBitRatio, getMoleculeCount, setCreateStatistics, updateStatistics
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

CFGenerator

public CFGenerator()
Creates a new instance of CFGenerator which can be used to generate chemical fingerprints for an arbitrary number of molecules.

Since:
JChem 5.8

CFGenerator

public CFGenerator(Standardizer s,
                   int length)
Deprecated. since 2.2

Creates a new instance of CFGenerator which can be used to generate chemical fingerprints for an arbitrary number of molecules.

Parameters:
s - Standardizer object to transform input molecules into a standard form
length - length of the chemical fingerprint in bits

CFGenerator

public CFGenerator(int length)
Deprecated. since 5.4

Creates a new instance of CFGenerator which can be used to generate chemical fingerprints for an arbitrary number of molecules.

Parameters:
length - length of the chemical fingerprint in bits
Since:
JChem 2.2
Method Detail

generate

public java.lang.String[] generate(Molecule m,
                                   MolecularDescriptor d)
                            throws MDGeneratorException
Generates the chemical fingerprint for the given molecule. A new ChemicalFingerprint instance is not allocated; instead, the MolecularDescriptor passed as a method parameter is updated, so it must be allocated and initialized by the client of this class.

Specified by:
generate in class MDGenerator
Parameters:
m - molecule for which the fingerprint is created
d - the chemical fingerprint generated
Returns:
names of tags (properties) set (added) during fingerprint generation; always null in the case of ChemicalFingerprint
Throws:
MDGeneratorException - in the case of any failures to generate the descriptor

generate

public java.lang.String[] generate(Molecule m,
                                   int[] aidxs,
                                   MolecularDescriptor d)
                            throws MDGeneratorException
Generates the partial chemical fingerprint for the given molecule. A new ChemicalFingerprint instance is not allocated; instead, the MolecularDescriptor passed as a method parameter is updated, so it must be allocated and initialized by the client of this class.

A partial fingerprint is the fingerprint restricted to paths that contain the given atoms. The algorithm performs the full path enumeration over the molecule, but sets bits in the resulting fingerprint only for paths containing the given atoms.
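
The idea of enumerating every path but recording only those that touch the selected atoms can be sketched as follows. This is a toy model and not the ChemAxon algorithm: the class PartialFingerprintSketch, the adjacency-list input and the single hash-derived bit per path are assumptions made for illustration.

```java
import java.util.BitSet;
import java.util.Random;

/** Toy partial fingerprint; illustrates the filtering idea only. */
final class PartialFingerprintSketch {

    /**
     * @param types    element symbol of each atom
     * @param nbrs     nbrs[i] lists the atoms bonded to atom i
     * @param aidxs    indexes of the atoms that a path must contain to be recorded
     * @param length   fingerprint length in bits
     * @param maxBonds upper limit on the number of bonds in a path
     */
    static BitSet partial(String[] types, int[][] nbrs, int[] aidxs,
                          int length, int maxBonds) {
        boolean[] selected = new boolean[types.length];
        for (int a : aidxs) {
            selected[a] = true;
        }
        BitSet fp = new BitSet(length);
        boolean[] onPath = new boolean[types.length];
        // Every atom is still used as a starting point: the enumeration is complete.
        for (int start = 0; start < types.length; start++) {
            walk(start, types[start], 0, selected[start],
                 types, nbrs, selected, onPath, fp, length, maxBonds);
        }
        return fp;
    }

    private static void walk(int atom, String path, int bonds, boolean hit,
                             String[] types, int[][] nbrs, boolean[] selected,
                             boolean[] onPath, BitSet fp, int length, int maxBonds) {
        if (hit) {
            // Only paths containing a selected atom contribute a bit.
            fp.set(new Random(path.hashCode()).nextInt(length));
        }
        if (bonds == maxBonds) {
            return;
        }
        onPath[atom] = true;
        for (int next : nbrs[atom]) {
            if (!onPath[next]) {
                walk(next, path + "-" + types[next], bonds + 1, hit || selected[next],
                     types, nbrs, selected, onPath, fp, length, maxBonds);
            }
        }
        onPath[atom] = false;
    }

    public static void main(String[] args) {
        String[] types = {"C", "C", "O"};      // C0-C1-O2
        int[][] nbrs = {{1}, {0, 2}, {1}};
        System.out.println(partial(types, nbrs, new int[] {2}, 64, 2));
    }
}
```

In this sketch, selecting every atom reproduces the full fingerprint, and selecting none yields an empty one.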

Parameters:
m - molecule for which the fingerprint is created
aidxs - atom indexes that define the partial fingerprint generation
d - the chemical fingerprint generated
Returns:
names of tags (properties) set (added) during fingerprint generation; always null in the case of ChemicalFingerprint
Throws:
MDGeneratorException - in the case of any failures to generate the descriptor

calcFreqCount

protected int calcFreqCount(MolecularDescriptor d)
Updates the statistics gathered on the generated fingerprints and gets the number of non-zero cells.

Overrides:
calcFreqCount in class MDGenerator
Parameters:
d - newly generated MolecularDescriptor
Returns:
brightness of the fingerprint (the number of non-zero cells)
Since:
JChem 2.1
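
For a binary fingerprint, the brightness described above amounts to counting the bits that are set. The sketch below shows that computation on a plain java.util.BitSet; the class BrightnessSketch and its methods are hypothetical helpers, not part of the CFGenerator API.

```java
import java.util.BitSet;

/** Hypothetical helper illustrating brightness as a count of set bits. */
final class BrightnessSketch {

    /** Number of non-zero cells in the fingerprint. */
    static int brightness(BitSet fp) {
        return fp.cardinality();
    }

    /** Fraction of cells that are non-zero, given the fingerprint length in bits. */
    static double nonZeroRatio(BitSet fp, int length) {
        return (double) fp.cardinality() / length;
    }

    public static void main(String[] args) {
        BitSet fp = new BitSet(16);
        fp.set(1);
        fp.set(5);
        fp.set(9);
        System.out.println(brightness(fp));        // prints 3
        System.out.println(nonZeroRatio(fp, 16));  // prints 0.1875
    }
}
```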