chemaxon.naming
Class DocumentToStructure

java.lang.Object
  extended by chemaxon.naming.DocumentToStructure

public class DocumentToStructure
extends java.lang.Object

Since:
5.9
Author:
Daniel Bonniot

Field Summary
static java.lang.String BYTE
           
static java.lang.String CHARACTER
          The character offset from the beginning of the document, for text formats (html, xml, txt).
static java.lang.String CONFIDENCE
          The confidence that the structure is correct. 0 or less means very little confidence. 1 or more means high confidence.
static java.lang.String DOC_AUTHOR
           
static java.lang.String DOC_CREATION_DATE
           
static java.lang.String DOC_LAST_AUTHOR
           
static java.lang.String DOC_TITLE
           
static java.lang.String DOCUMENT
          The file name of the source document.
static java.lang.String DOCUMENT_METADATA
           
static java.lang.String PAGE
          The page number, if applicable (e.g. for a PDF document).
static java.lang.String SOURCE_TEXT
          The source text, as it appears in the original document.
static java.lang.String TYPE
          The type of source for the structure.
static java.lang.String TYPE_CAS
          CAS number.
static java.lang.String TYPE_CDX
          Embedded ChemDraw structure.
static java.lang.String TYPE_COMMON
          Common name.
static java.lang.String TYPE_GENERIC
          Generic name, for instance "C1-C4 alkyl".
static java.lang.String TYPE_INCHI
          InChI string.
static java.lang.String TYPE_ION
          Ion abbreviation, for instance K+ or Ca2+.
static java.lang.String TYPE_MRV
          Embedded ChemAxon MRV structure.
static java.lang.String TYPE_OSR
          Structure image recognized by Optical Structure Recognition.
static java.lang.String TYPE_PEPTIDE
          Peptide notation, for instance Val-Gly-Ser-Ala.
static java.lang.String TYPE_SMILES
          SMILES string.
static java.lang.String TYPE_SYMYX
          Embedded Symyx/ISIS Draw structure.
static java.lang.String TYPE_SYSTEMATIC
          Systematic name.
 
Constructor Summary
DocumentToStructure()
           
 
Method Summary
static MolImporter process(java.lang.String text)
           
static MolImporter process(java.lang.String text, java.lang.String options)
          Creates a MolImporter instance to import structures for a given text.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

SOURCE_TEXT

public static final java.lang.String SOURCE_TEXT
The source text, as it appears in the original document.

See Also:
Constant Field Values

DOCUMENT

public static final java.lang.String DOCUMENT
The file name of the source document.

See Also:
Constant Field Values

PAGE

public static final java.lang.String PAGE
The page number, if applicable (e.g. for a PDF document).

See Also:
Constant Field Values

CHARACTER

public static final java.lang.String CHARACTER
The character offset from the beginning of the document, for text formats (html, xml, txt).

See Also:
Constant Field Values

BYTE

public static final java.lang.String BYTE
See Also:
Constant Field Values

DOC_AUTHOR

public static final java.lang.String DOC_AUTHOR
See Also:
Constant Field Values

DOC_LAST_AUTHOR

public static final java.lang.String DOC_LAST_AUTHOR
See Also:
Constant Field Values

DOC_TITLE

public static final java.lang.String DOC_TITLE
See Also:
Constant Field Values

DOC_CREATION_DATE

public static final java.lang.String DOC_CREATION_DATE
See Also:
Constant Field Values

DOCUMENT_METADATA

public static final java.lang.String DOCUMENT_METADATA
See Also:
Constant Field Values

CONFIDENCE

public static final java.lang.String CONFIDENCE
The confidence that the structure is correct. 0 or less means very little confidence. 1 or more means high confidence. This value is currently set for structures obtained by image recognition, that is, Optical Structure Recognition (OSR), also known as "chemical OCR".

See Also:
Constant Field Values
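
The thresholds described for CONFIDENCE can be illustrated with a small helper. This is only a sketch: the names ConfidenceSketch, Level and classify are hypothetical and not part of chemaxon.naming, the label used for values strictly between 0 and 1 is an assumption, and the helper takes an already parsed number because this page does not say how the property value is stored.

```java
// Sketch only: ConfidenceSketch, Level and classify are hypothetical names,
// not part of the chemaxon.naming API.
final class ConfidenceSketch {

    // INTERMEDIATE is an assumed label; the documentation only names the two ends.
    enum Level { LOW, INTERMEDIATE, HIGH }

    // Buckets a confidence value using the documented thresholds:
    // 0 or less means very little confidence, 1 or more means high confidence.
    static Level classify(double confidence) {
        if (confidence <= 0.0) {
            return Level.LOW;
        }
        if (confidence >= 1.0) {
            return Level.HIGH;
        }
        return Level.INTERMEDIATE;
    }

    public static void main(String[] args) {
        double[] samples = {-0.5, 0.0, 0.4, 1.0, 2.3};
        for (double c : samples) {
            System.out.println(c + " -> " + classify(c));
        }
    }
}
```

A caller could, for instance, keep only structures whose level is HIGH and queue the others for manual review.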

TYPE

public static final java.lang.String TYPE
The type of source for the structure.

See Also:
TYPE_SYSTEMATIC, TYPE_COMMON, TYPE_GENERIC, TYPE_SMILES, TYPE_INCHI, TYPE_CAS, TYPE_ION, TYPE_PEPTIDE, TYPE_CDX, TYPE_MRV, TYPE_SYMYX, TYPE_OSR, Constant Field Values

TYPE_SYSTEMATIC

public static final java.lang.String TYPE_SYSTEMATIC
Systematic name.

See Also:
Constant Field Values

TYPE_COMMON

public static final java.lang.String TYPE_COMMON
Common name.

See Also:
Constant Field Values

TYPE_GENERIC

public static final java.lang.String TYPE_GENERIC
Generic name, for instance "C1-C4 alkyl".

See Also:
Constant Field Values

TYPE_SMILES

public static final java.lang.String TYPE_SMILES
SMILES string.

See Also:
Constant Field Values

TYPE_INCHI

public static final java.lang.String TYPE_INCHI
InChI string.

See Also:
Constant Field Values

TYPE_CAS

public static final java.lang.String TYPE_CAS
CAS number.

See Also:
Constant Field Values

TYPE_ION

public static final java.lang.String TYPE_ION
Ion abbreviation, for instance K+ or Ca2+.

See Also:
Constant Field Values

TYPE_PEPTIDE

public static final java.lang.String TYPE_PEPTIDE
Peptide notation, for instance Val-Gly-Ser-Ala.

See Also:
Constant Field Values

TYPE_CDX

public static final java.lang.String TYPE_CDX
Embedded ChemDraw structure.

See Also:
Constant Field Values

TYPE_MRV

public static final java.lang.String TYPE_MRV
Embedded ChemAxon MRV structure.

See Also:
Constant Field Values

TYPE_SYMYX

public static final java.lang.String TYPE_SYMYX
Embedded Symyx/ISIS Draw structure.

See Also:
Constant Field Values

TYPE_OSR

public static final java.lang.String TYPE_OSR
Structure image recognized by Optical Structure Recognition.

See Also:
Constant Field Values

Constructor Detail

DocumentToStructure

public DocumentToStructure()

Method Detail

process

public static MolImporter process(java.lang.String text)

process

public static MolImporter process(java.lang.String text,
                                  java.lang.String options)
Creates a MolImporter instance to import structures for a given text. The text is generally treated as plain text. As a convenience, text that starts immediately with an XML or HTML prologue is treated as XML or HTML instead. For complete documents, calling a MolImporter constructor directly is often more appropriate than loading the whole document into a String object.

Returns:
a MolImporter that can be used to read the structures found in the text.
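
The prologue rule described for process can be pictured with the following sketch. It is not the library's implementation: the class PrologueSketch, the enum Kind, the method guessKind and the exact prefixes tested ("<?xml", "<!DOCTYPE html", "<html") are all assumptions chosen for illustration. The one point taken from the description above is that the prologue must start immediately, so leading whitespace is not skipped.

```java
import java.util.Locale;

// Sketch only: PrologueSketch, Kind and guessKind are hypothetical names, and the
// prefixes checked here are assumptions, not the rules used by DocumentToStructure.
final class PrologueSketch {

    enum Kind { XML, HTML, PLAIN }

    // Guesses how a text would be treated, based only on its very first characters.
    static Kind guessKind(String text) {
        // "Starts immediately": no leading whitespace is skipped before the check.
        if (text.startsWith("<?xml")) {
            return Kind.XML;
        }
        // Compare the HTML prefixes case-insensitively on a short head of the text.
        String head = text.substring(0, Math.min(text.length(), 15)).toLowerCase(Locale.ROOT);
        if (head.startsWith("<!doctype html") || head.startsWith("<html")) {
            return Kind.HTML;
        }
        return Kind.PLAIN;
    }

    public static void main(String[] args) {
        System.out.println(guessKind("<?xml version=\"1.0\"?><doc>aspirin</doc>"));
        System.out.println(guessKind("<html><body>caffeine</body></html>"));
        System.out.println(guessKind("A mixture of aspirin and caffeine."));
    }
}
```

Under these assumptions, a text with even one leading space before the prologue falls back to plain text, which is one reason to pass complete documents to a MolImporter constructor rather than to process.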