chemaxon.marvin.io
Class MRecordImporter

java.lang.Object
  extended by chemaxon.marvin.io.MRecordImporter

public class MRecordImporter
extends java.lang.Object

Marvin molecule file reader. By default, processing is concurrent on machines with multiple processors and single-threaded otherwise. The default number of worker threads equals the number of processors; it can be changed with setThreadCount(int).

Since:
Marvin 5.0, 05/07/2007
Version:
5.3, 09/17/2009
Author:
Peter Csizmadia, Szilveszter Juhos

Constructor Summary
MRecordImporter(MolInputStream in, java.lang.String opts)
          Creates a reader for the specified molecule input stream in concurrent mode.
 
Method Summary
 void close()
          Closes the input stream.
static MolImportModule createImportMod(MolInputStream mis)
          Creates an importer for the specified molecule input stream.
static MolImportModule createImportMod(java.lang.String fmt)
          Creates an importer for the specified molecule format.
 Molecule createMol()
          Creates an empty target molecule for import.
 Molecule createMolIfNeeded()
          Creates an empty target molecule for import in non-concurrent mode; returns null in concurrent mode.
 long getFilePointer()
          Gets the current position in the input file.
 java.lang.String getFormat()
          Gets the file format.
 MPropertyContainer getGlobalProperties()
          Gets the global properties.
 int getLineCount()
          Gets the current line number in the input file.
 java.lang.String getMoleculeString()
          Gets the last molecule as a string.
 MolImportModule getMolImportModule()
           
 java.lang.String getOptions()
          Gets the options for the import module.
 boolean getQueryMode()
          Gets the query mode.
 MRecordReader getRecordReader()
          Gets the record reader.
 boolean isSeekable()
          Tests whether the record reader is seekable.
 long length()
          Gets the length of the input file.
 MDocument readDoc()
          Reads the next document.
 Molecule readMol(Molecule mol)
          Reads the next molecule.
 MDocument readMolMovie(MDocument doc)
          Reads molecules as a movie.
 Molecule readMultiSet(Molecule m)
          Reads molecules as one multi-set molecule.
 java.lang.String readRecordAsText()
          Reads the next record.
 void seek(long p, int lcount, int k)
          Sets the file-pointer offset, measured from the beginning of this file, at which the next read or write occurs.
 void setOptions(java.lang.String opts)
          Sets the options for the import module.
 void setProgressMonitor(chemaxon.common.util.MProgressMonitor pmon)
          Sets the progress monitor.
 void setQueryMode(boolean q)
          Sets the query mode.
 void setThreadCount(int threadCount)
          Sets the number of threads for concurrent processing.
 MRecord skipRecord()
          Skips the next document.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

MRecordImporter

public MRecordImporter(MolInputStream in,
                       java.lang.String opts)
                throws MolFormatException,
                       java.io.IOException
Creates a reader for the specified molecule input stream in concurrent mode.

Parameters:
in - the molecule input stream
opts - input options or null
Throws:
MolFormatException - If the format is not recognizable
java.io.IOException - If an I/O error occurred
java.nio.charset.IllegalCharsetNameException - if illegal encoding is used
java.nio.charset.UnsupportedCharsetException - if unsupported encoding is used
Method Detail

setThreadCount

public void setThreadCount(int threadCount)
                    throws java.lang.IllegalStateException
Sets the number of threads for concurrent processing. Default: the number of CPUs; processing is single-threaded if there is only one CPU.

Parameters:
threadCount - the number of threads; 0 means the number of CPUs, 1 means single-threaded mode
Throws:
java.lang.IllegalStateException - if called after isSeekable() or readMol(chemaxon.struc.Molecule) or readDoc().
Since:
Marvin 5.3
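
The mapping from the threadCount argument to the number of worker threads can be modelled in a few lines. The following is a minimal sketch of the documented rule only, not ChemAxon code: the class ThreadCountSketch and its methods are hypothetical names, and rejecting negative values is an assumption, since the documentation does not say how they are handled.

```java
// Illustrative model only; not part of the Marvin API.
final class ThreadCountSketch {
    private ThreadCountSketch() {}

    /**
     * Maps a requested thread count to the number of worker threads,
     * following the documented rule: 0 means "one thread per CPU".
     */
    static int effectiveThreadCount(int requested, int cpuCount) {
        if (requested < 0) {
            // Assumption: the documentation does not cover negative values.
            throw new IllegalArgumentException("threadCount must be >= 0");
        }
        return requested == 0 ? cpuCount : requested;
    }

    /** Single-threaded mode corresponds to exactly one effective thread. */
    static boolean isSingleThreaded(int requested, int cpuCount) {
        return effectiveThreadCount(requested, cpuCount) == 1;
    }

    public static void main(String[] args) {
        int cpus = Runtime.getRuntime().availableProcessors();
        System.out.println("default workers: " + effectiveThreadCount(0, cpus));
    }
}
```

Under this model a single-CPU machine with the default setting ends up in single-threaded mode, which matches the default described above.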

setProgressMonitor

public void setProgressMonitor(chemaxon.common.util.MProgressMonitor pmon)
Sets the progress monitor.

Parameters:
pmon - the progress monitor

createImportMod

public static MolImportModule createImportMod(MolInputStream mis)
                                       throws java.io.IOException,
                                              MolFormatException
Creates an importer for the specified molecule input stream.

Parameters:
mis - the molecule input stream
Throws:
MolFormatException - If the format is not recognizable
java.io.IOException - If an I/O error occurred

createImportMod

public static MolImportModule createImportMod(java.lang.String fmt)
                                       throws java.io.IOException,
                                              MolFormatException
Creates an importer for the specified molecule format.

Parameters:
fmt - the molecule format
Throws:
MolFormatException - If the format is not recognizable
java.io.IOException - If an I/O error occurred
Since:
Marvin 5.2

getOptions

public java.lang.String getOptions()
Gets the options for the import module.

Returns:
the import options

setOptions

public void setOptions(java.lang.String opts)
Sets the options for the import module.

Parameters:
opts - options passed to the import module or null
Throws:
java.lang.IllegalStateException - if called after readMol(chemaxon.struc.Molecule) or readDoc().
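
The "configure before the first read" rule can be illustrated with a small state model. This is a sketch under stated assumptions, not the library's implementation: ConfigureBeforeReadSketch and its read() method are hypothetical stand-ins for the importer and for readMol/readDoc.

```java
// Illustrative model of the call-order rule; not part of the Marvin API.
final class ConfigureBeforeReadSketch {
    private boolean readingStarted;
    private String options;

    /** Allowed only until the first read, mirroring the documented exception. */
    void setOptions(String opts) {
        if (readingStarted) {
            throw new IllegalStateException("options must be set before the first read");
        }
        this.options = opts;
    }

    String getOptions() {
        return options;
    }

    /** Stand-in for readMol/readDoc: the first call freezes the configuration. */
    void read() {
        readingStarted = true;
    }
}
```

In caller code this suggests setting all options immediately after construction, before any read call.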

readDoc

public MDocument readDoc()
                  throws MRecordParseException,
                         MolFormatException,
                         java.io.IOException
Reads the next document.

Returns:
the document or null at end of file
Throws:
MRecordParseException - If the record could not be parsed
MolFormatException - If the file format is invalid
java.io.IOException - If an I/O error occurred

readMol

public Molecule readMol(Molecule mol)
                 throws MRecordParseException,
                        MolFormatException,
                        java.io.IOException
Reads the next molecule. If the 'mol' parameter is not null then processing is single-threaded.

Parameters:
mol - target object or null
Returns:
the molecule or null at end of file
Throws:
MRecordParseException - If the record could not be parsed
MolFormatException - If the file format is invalid
java.io.IOException - If an I/O error occurred

readMultiSet

public Molecule readMultiSet(Molecule m)
                      throws MRecordParseException,
                             MolFormatException,
                             java.io.IOException
Reads molecules as one multi-set molecule. Processing is single-threaded.

Parameters:
m - the output molecule object
Returns:
the molecule in case of success, null otherwise
Throws:
java.io.IOException - cannot read molecule (bad format or I/O error)
MRecordParseException
MolFormatException

readMolMovie

public MDocument readMolMovie(MDocument doc)
                       throws MolFormatException,
                              java.io.IOException
Reads molecules as a movie. Processing is single-threaded.

Parameters:
doc - the output document object or null
Returns:
the document on success, null otherwise
Throws:
MolFormatException - invalid molecule file
java.io.IOException - cannot read molecule (bad format or I/O error)
Since:
Marvin 5.2, 02/12/2009

readRecordAsText

public java.lang.String readRecordAsText()
                                  throws MRecordParseException,
                                         java.io.IOException
Reads the next record. Processing is single-threaded.

Returns:
the record or null at end of file
Throws:
MRecordParseException - If the record could not be parsed
java.io.IOException - If an I/O error occurred
java.lang.IllegalStateException - if the concurrent processor is already running

getGlobalProperties

public MPropertyContainer getGlobalProperties()
Gets the global properties.

Returns:
the global properties or null

isSeekable

public boolean isSeekable()
Tests whether the record reader is seekable. In concurrent mode this method always returns false. Because setThreadCount(int) throws IllegalStateException once isSeekable() has been called, set the thread count before calling this method.

Returns:
true if it is seekable, false otherwise
See Also:
setThreadCount(int)

getFormat

public java.lang.String getFormat()
Gets the file format.

Returns:
the format

getQueryMode

public boolean getQueryMode()
Gets the query mode. SMILES strings are imported as SMARTS if query mode is set.

Returns:
query mode

setQueryMode

public void setQueryMode(boolean q)
Sets the query mode. SMILES strings are imported as SMARTS if query mode is set. IMPORTANT: call this before any call to readDoc() or readMol(chemaxon.struc.Molecule).

Parameters:
q - query mode
Throws:
java.lang.IllegalStateException - if the concurrent processor is already running

getFilePointer

public long getFilePointer()
Gets the current position in the input file.

Returns:
the position
Throws:
java.lang.IllegalStateException - if the concurrent processor is running

length

public long length()
            throws java.io.IOException
Gets the length of the input file.

Returns:
the length
Throws:
java.io.IOException - if the length cannot be determined

getLineCount

public int getLineCount()
Gets the current line number in the input file.

Returns:
the current line number

seek

public void seek(long p,
                 int lcount,
                 int k)
          throws java.io.IOException
Sets the file-pointer offset, measured from the beginning of this file, at which the next read or write occurs.

Parameters:
p - the file pointer
lcount - the line count at the specified position
k - the record count at the specified position
Throws:
java.io.IOException - if p is less than 0 or if an I/O error occurs.
java.lang.UnsupportedOperationException - if the concurrent processor is running
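
Because seek takes three values that describe one position, it can help to store them together. The sketch below is a hypothetical helper, not part of the Marvin API. It assumes the caller takes p from getFilePointer() and lcount from getLineCount(), and keeps track of the record count k itself, since the method summary lists no getter for it.

```java
// Hypothetical helper; the class and its field names are not part of the Marvin API.
final class SeekBookmark {
    final long filePointer;  // value for seek's p
    final int lineCount;     // value for seek's lcount
    final int recordCount;   // value for seek's k

    SeekBookmark(long filePointer, int lineCount, int recordCount) {
        if (filePointer < 0) {
            // seek is documented to fail for a negative file pointer.
            throw new IllegalArgumentException("filePointer must be >= 0");
        }
        this.filePointer = filePointer;
        this.lineCount = lineCount;
        this.recordCount = recordCount;
    }

    @Override
    public String toString() {
        return "SeekBookmark[p=" + filePointer
                + ", lcount=" + lineCount
                + ", k=" + recordCount + "]";
    }
}
```

A bookmark captured before reading a record could later be passed back field by field to seek, provided the importer is running in single-threaded mode.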

skipRecord

public MRecord skipRecord()
                   throws MRecordParseException,
                          MolFormatException,
                          java.io.IOException
Skips the next document. Skipping is not implemented for concurrent processing; in that case the next complete record is returned.

Returns:
incomplete record info containing only the start and end positions, or null if there are no more records
Throws:
java.io.IOException - If an I/O error occurred
MRecordParseException
MolFormatException

close

public void close()
           throws java.io.IOException
Closes the input stream. IMPORTANT: call this after reading molecules to close concurrent processing properly.

Throws:
java.io.IOException - If an I/O error occurred
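
Since readMol and readDoc return null at end of file and close() must always be called afterwards, a typical read loop pairs a null check with a finally block. The following is a minimal sketch of that pattern using only the JDK. NullTerminatedReader, ReadLoopSketch and InMemoryReader are hypothetical stand-ins, with readNext() playing the role of readMol(null).

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a reader whose read method returns null at end of file.
interface NullTerminatedReader<T> {
    T readNext() throws IOException;   // plays the role of readMol(null)
    void close() throws IOException;   // plays the role of close()
}

final class ReadLoopSketch {
    private ReadLoopSketch() {}

    /** Reads every record until null, always closing the reader afterwards. */
    static <T> List<T> readAll(NullTerminatedReader<T> reader) throws IOException {
        List<T> records = new ArrayList<>();
        try {
            T record;
            while ((record = reader.readNext()) != null) {
                records.add(record);
            }
        } finally {
            // Runs even if a read fails, so worker threads are not left behind.
            reader.close();
        }
        return records;
    }

    /** In-memory reader used only to exercise the loop. */
    static final class InMemoryReader implements NullTerminatedReader<String> {
        private final Iterator<String> it;
        boolean closed;

        InMemoryReader(String... items) {
            this.it = Arrays.asList(items).iterator();
        }

        @Override
        public String readNext() {
            return it.hasNext() ? it.next() : null;
        }

        @Override
        public void close() {
            closed = true;
        }
    }
}
```

With the real importer the same shape would apply: construct it, loop on the read method until it returns null, and call close() in the finally block.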

getMolImportModule

public MolImportModule getMolImportModule()
Gets the molecule import module.

getMoleculeString

public java.lang.String getMoleculeString()
Gets the last molecule as a string.

Returns:
the molecule as a string

createMolIfNeeded

public Molecule createMolIfNeeded()
Creates an empty target molecule for import in non-concurrent mode; returns null in concurrent mode.

Returns:
an empty molecule or null
Since:
Marvin 5.3

createMol

public Molecule createMol()
Creates an empty target molecule for import.

Returns:
an empty molecule

getRecordReader

public MRecordReader getRecordReader()
Gets the record reader.

Returns:
the record reader