MolFragment - the CCQ method

Introduction

The characteristics of a molecule depend largely on its functional groups. To preserve the groups that are present, the CCQ method does not cleave bonds between heteroatoms and the carbons attached to them. Cleavage occurs only between two carbon atoms, at least one of which is connected to a heteroatom*. This simple rule guarantees:

  • functional groups are not destroyed
  • no aromatic ring cleavage
  • heteroatoms connected to aromatic rings are not cleaved by default
  • aliphatic rings are fragmented
  • fast fragment generation without substructure search
  • reduced risk of combinatorial explosion, because hydrocarbon chains are not fragmented
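
The cleavage rule above can be illustrated with a minimal Python sketch. It is not MolFragment's implementation: the toy molecule representation (a dictionary of element symbols plus a bond list with an aromatic flag) and the function name `ccq_bonds` are hypothetical, and skipping aromatic bonds is an assumption made here so that the sketch matches the "no aromatic ring cleavage" guarantee.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy representation: (atom_a, atom_b, is_aromatic).
# Hydrogens are left implicit.
Bond = Tuple[int, int, bool]


def ccq_bonds(elements: Dict[int, str], bonds: List[Bond]) -> List[Tuple[int, int]]:
    """Return the carbon-carbon bonds that the CCQ rule allows to be cleaved."""
    # Neighbour table so the atoms attached to each atom can be looked up.
    neighbours: Dict[int, Set[int]] = {i: set() for i in elements}
    for a, b, _ in bonds:
        neighbours[a].add(b)
        neighbours[b].add(a)

    def has_hetero_neighbour(atom: int) -> bool:
        # A heteroatom is any atom other than hydrogen and carbon.
        return any(elements[n] not in ("C", "H") for n in neighbours[atom])

    cleavable = []
    for a, b, aromatic in bonds:
        if aromatic:
            continue  # assumption: aromatic bonds are never cleaved
        if elements[a] != "C" or elements[b] != "C":
            continue  # carbon-heteroatom bonds are kept intact
        if has_hetero_neighbour(a) or has_hetero_neighbour(b):
            cleavable.append((a, b))
    return cleavable


# 1-butanol as a heavy-atom chain: C0-C1-C2-C3-O4
butanol_elements = {0: "C", 1: "C", 2: "C", 3: "C", 4: "O"}
butanol_bonds = [(0, 1, False), (1, 2, False), (2, 3, False), (3, 4, False)]
print(ccq_bonds(butanol_elements, butanol_bonds))  # [(2, 3)]
```

Only the C2-C3 bond qualifies, because C3 carries the oxygen. The C3-O4 bond is never a candidate, and the two bonds in the plain hydrocarbon part of the chain are left alone.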

Features

  • fragments are generated by all cleavage combinations
  • when the fragments are numerous and complex, a maximum bond count can limit the number of CCQ fragments; larger fragments are discarded from the results
  • the minimal fragments are generated
  • no labelling is needed: substitution (connection) points are marked by implicit hydrogens if the initial molecules were saturated with explicit hydrogens
  • duplicate filtering

*Heteroatom means any atom other than hydrogen and carbon.
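
The features listed above (all cleavage combinations, the maximum bond count, and duplicate filtering) can be sketched in Python as follows. This is an illustration under stated assumptions, not the tool's algorithm: the names `components` and `enumerate_fragments` are hypothetical, the unfragmented molecule is assumed not to be reported, and duplicates are detected by atom and bond indices as a stand-in for whatever structural comparison MolFragment actually performs.

```python
from itertools import combinations
from typing import FrozenSet, Iterable, List, Optional, Set, Tuple

Edge = Tuple[int, int]
Fragment = Tuple[FrozenSet[int], FrozenSet[Edge]]


def components(atoms: List[int], edges: Set[Edge]) -> List[FrozenSet[int]]:
    """Split the atoms into connected components using the given bonds."""
    adjacency = {a: set() for a in atoms}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen: Set[int] = set()
    result = []
    for start in atoms:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend(adjacency[node] - comp)
        seen |= comp
        result.append(frozenset(comp))
    return result


def enumerate_fragments(atoms: Iterable[int], edges: Iterable[Edge],
                        cleavable: Iterable[Edge],
                        max_ccq_bonds: Optional[int] = None) -> Set[Fragment]:
    """Cut every non-empty combination of cleavable bonds and collect fragments."""
    atoms = list(atoms)
    all_edges = {tuple(sorted(e)) for e in edges}
    ccq = [tuple(sorted(e)) for e in cleavable]
    ccq_set = set(ccq)
    fragments: Set[Fragment] = set()  # the set itself filters duplicates
    for k in range(1, len(ccq) + 1):
        for cut in combinations(ccq, k):
            kept = all_edges - set(cut)
            for comp in components(atoms, kept):
                inner = frozenset(e for e in kept if e[0] in comp)
                # Count the CCQ bonds still present inside this fragment.
                uncut = len(inner & ccq_set)
                if max_ccq_bonds is not None and uncut > max_ccq_bonds:
                    continue  # fragment is too large, discard it
                fragments.add((comp, inner))
    return fragments


# Abstract six-atom chain with two bonds marked as cleavable.
chain_atoms = range(6)
chain_edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
chain_ccq = [(1, 2), (3, 4)]
all_fragments = enumerate_fragments(chain_atoms, chain_edges, chain_ccq)
minimal = enumerate_fragments(chain_atoms, chain_edges, chain_ccq, max_ccq_bonds=0)
print(len(all_fragments), len(minimal))  # 5 3
```

With no limit, the chain gives five distinct fragments. With a limit of zero, only the three minimal fragments remain, since every larger fragment still contains an uncut CCQ bond.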

The fragmentation of the input molecule yields a set of molecules without labels:

[Figure: input molecule and the resulting fragments]

Usage

    molfragment [<options>] [<input files/strings>] 

Options

      -h,--help             this help message
      -o,--output           output file path (if not set, standard output is used)
      -f,--format           output format (default is smiles)
      -m,                   maximum count of the CCQ bonds in the generated
                            fragments (default is unlimited)
      -e,                   add explicit hydrogens to the fragments
      -l,                   log file path 
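
When MolFragment is driven from a script, the options above can be assembled into an argument list. The Python helper below is a hypothetical convenience, not part of the tool. It uses only the short options listed above and assumes that each option with a value takes that value as the next argument, as `-f sdf` does in the examples.

```python
from typing import List, Optional, Sequence


def molfragment_argv(inputs: Sequence[str],
                     output: Optional[str] = None,
                     fmt: Optional[str] = None,
                     max_bonds: Optional[int] = None,
                     explicit_h: bool = False,
                     log: Optional[str] = None) -> List[str]:
    """Build a molfragment command line: options first, then input files/strings."""
    argv = ["molfragment"]
    if output is not None:
        argv += ["-o", output]          # output file path
    if fmt is not None:
        argv += ["-f", fmt]             # output format
    if max_bonds is not None:
        argv += ["-m", str(max_bonds)]  # maximum count of CCQ bonds per fragment
    if explicit_h:
        argv.append("-e")               # add explicit hydrogens
    if log is not None:
        argv += ["-l", log]             # log file path
    argv.extend(inputs)
    return argv


print(molfragment_argv(["mols.sdf"], fmt="sdf", max_bonds=2))
```

The resulting list can be passed to `subprocess.run(argv, check=True)` on a machine where `molfragment` is on the path. Passing a list rather than a single string avoids shell quoting problems with SMILES input strings.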

Input

Most molecular file formats are accepted (MDL molfile, Compressed molfile, SDfile, Compressed SDfile, SMILES, etc.).

If no input file name or input string is specified on the command line, input is read from the standard input.

Output

By default, MolFragment writes output molecules in SMILES format. Structures are written in their de-aromatized form.
Other output formats can be specified with the --format parameter.

The --output parameter specifies the output file path. If omitted, results are written to the standard output.

Examples

The following examples show how MolFragment is used:

  1. Fragments the structures in the mols.sdf file and writes the molecule fragments to the standard output in the default SMILES format:
    molfragment mols.sdf
    
  2. The same with SMILES string input:
    molfragment "CC(CCN(C)COCCC1=CC=CC=C1C2=CC=CC=C2)COCN" "CCCCN(C)C(C(=O)C1CCCC(Cl)C1)C(C)C(Cl)Cl"
    
  3. Performs fragmentation and directly pipes output to MarvinView:
    molfragment -f sdf mols.sdf | mview -
    

    Note that such piping does not work on Windows.
