chemaxon.marvin.modules
Class AutoMapper

java.lang.Object
  extended by chemaxon.marvin.util.MarvinModule
      extended by chemaxon.marvin.modules.AutoMapper
All Implemented Interfaces:
java.io.Serializable

public class AutoMapper
extends chemaxon.marvin.util.MarvinModule

AutoMapper finds the best mapping from reactant side atoms to product side atoms of a reaction. The term mapping refers to the association of reactant side atoms to product side atoms.
AutoMapper supports various mapping styles that are compatible with other vendors' mapping approaches. By default only changing atoms are mapped; this mapping style is the CHEMAXON style.

The reaction to be mapped is passed as an RxnMolecule and it may contain initial atom maps. These predefined atom maps are preserved by AutoMapper.

Simple usage example:

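      import chemaxon.formats.MolImporter;
      import chemaxon.marvin.modules.AutoMapper;
      import chemaxon.struc.Molecule;
      import chemaxon.struc.RxnMolecule;
      // Import a reaction SMILES, map it in place, and print the mapped result.
      AutoMapper mapper = new AutoMapper();
      Molecule mol = MolImporter.importMol( "C1CCCCC1=O>>C1COC2(CCCCC2)O1" );
      RxnMolecule rm = RxnMolecule.getReaction( mol );
      mapper.map( rm );
      System.out.println( mol.toFormat( "smiles" ) );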
 

Since:
Marvin 3.4.1
Version:
5.0
Author:
Vargyas Miklos
See Also:
Serialized Form

Field Summary
static int BEST
          slowest mapping, but better quality
static int CHANGING
          Only those atoms are mapped that have a changing bond.
static int CHEMAXON
          Reaction is mapped according to ChemAxon's style.
static int COMPLETE
          all atoms in the reaction are mapped
static int DAYLIGHT
          Daylight style mapping, orphan/widow atoms are not mapped.
static double DEFAULT_COMPLEXITY_THRESHOLD
          maximum number of steps allowed in non-MCS matching
static int DEFAULT_MAPPING_STRATEGY
          golden middle
static int DEFAULT_MAPPING_STYLE
          By default, changing atoms are mapped.
static long DEFAULT_STEP_LIMIT
          by default there is no limitation on the number of steps performed
static int EITHER
          Mapping style of the input reaction is ambiguous
static int FASTEST
          heuristic mode, fast but less accurate
static int MATCHING
          Only matching atoms are mapped.
static int ORPHANS
          missing orphan/widow atom maps are added only
static int STANDARD
          good balance between speed and accuracy
static int STOP_BADARGSINMODFUNC
          bad parameters passed to modfunc()
static int STOP_FOUND
          mapping stopped with an optimal solution
static int STOP_NOTFOUND
          no solution found
static int STOP_STEPLIMIT
          mapping stopped because maximum allowed step count reached, no optimal solution found
static int STOP_TIMELIMIT
          maximum allowed time exceeded, no optimum solution has been found
static int STOP_UNKONW
          mapping stopped for an unknown reason
static int UNKNOWN
          Mapping style of a pre-mapped reaction cannot be determined
static int UNMAPPED
          Reaction is not mapped
 
Fields inherited from class chemaxon.marvin.util.MarvinModule
moduleLoadingCounterLock
 
Constructor Summary
AutoMapper()
          Creates a new instance of AutoMapper.
 
Method Summary
static void clearMaps(RxnMolecule mol)
          Clears atom maps.
protected  void dump()
          Engineering function.
 java.lang.String getDiagnosticMessage(int diagLevel)
          Returns a short text message that describes the outcome of the last search.
 int getlastStopCause()
          Returns the code of the last termination status.
 int getMapCount()
          Get number of solution maps found.
 int guessMappingStyle(RxnMolecule reaction)
          Guesses mapping style of the input reaction.
static void main(java.lang.String[] args)
          For engineering purposes only.
static void map(Molecule mol, int mappingStyle)
          Maps the input molecule according to the given mapping style.
 int map(RxnMolecule reaction)
          Convenience function that unifies setReaction() and setMap() in one method.
 int map(RxnMolecule reaction, boolean mapAlways)
          Convenience function that unifies setReaction() and setMap() in one method.
static void mapReaction(RxnMolecule reaction)
          Convenience function that unifies setReaction() and setMap() in one method.
 java.lang.Object modfunc(java.lang.Object arg)
          Mandatory method to be implemented by Marvin modules.
 void setComplexityThreshold(float newThreshold)
          Sets the complexity threshold.
 void setForbiddenMap(int mapId)
          The given atom map id should not be assigned to any atom.
 void setIgnoreH(boolean ignoreH)
          Turns hydrogen mapping on or off.
 void setMap(int mapId)
          Sets atom-atom maps in the RxnMolecule passed in setReaction( final RxnMolecule rm ) according to the mapId map.
 void setMappingMode(int mappingMode)
          Deprecated. Use setMappingStyle(int) instead.
 void setMappingStrategy(int newStrategy)
          Sets the mapping strategy.
 void setMappingStyle(int mappingStyle)
          Sets the mapping style to be used in subsequent reaction mappings.
 boolean setOption(java.lang.String parameterName, java.lang.String parameterValue)
          Sets any options using string parameter names and string values.
 void setReaction(RxnMolecule rm)
          Sets the current reaction to be mapped.
 void setStepCountLimit(long maxNumberOfSteps)
          Sets the maximum number of atomic search steps allowed.
 void setTimeLimit(long maxMilliseconds)
          Sets the maximum allowed total search time.
 
Methods inherited from class chemaxon.marvin.util.MarvinModule
getSomething, isModuleLoadingInProgress, load, load, load, loadClass, loadClass, shutdownLoader
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

ORPHANS

public static final int ORPHANS
missing orphan/widow atom maps are added only

See Also:
Constant Field Values

COMPLETE

public static final int COMPLETE
all atoms in the reaction are mapped

See Also:
Constant Field Values

DAYLIGHT

public static final int DAYLIGHT
Daylight style mapping, orphan/widow atoms are not mapped.

See Also:
Constant Field Values

CHANGING

public static final int CHANGING
Only those atoms are mapped that have a changing bond: either the bond order changes, or a new bond is created, or a bond is deleted. Orphan and widow atoms are included.

See Also:
Constant Field Values
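
The CHANGING criterion can be illustrated with a small self-contained sketch in plain Java that uses no Marvin classes. It assumes that atoms on both sides already share a common id, and it marks an atom as changing when any bond touching it is created, deleted, or changes order. The class name ChangingAtomsSketch, the bond-key encoding and the input maps are hypothetical and are not part of the AutoMapper API; orphan and widow atoms are not treated separately here.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only; not how AutoMapper is implemented.
final class ChangingAtomsSketch {

    /** Encodes an undirected bond between atoms a and b as "min-max". */
    static String bond(int a, int b) {
        return Math.min(a, b) + "-" + Math.max(a, b);
    }

    /**
     * Returns the ids of atoms that have at least one bond which is created,
     * deleted, or whose order differs between the two sides.
     * Each map goes from a bond key (see bond()) to the bond order.
     */
    static Set<Integer> changingAtoms(Map<String, Integer> reactantBonds,
                                      Map<String, Integer> productBonds) {
        Set<String> allBonds = new HashSet<>(reactantBonds.keySet());
        allBonds.addAll(productBonds.keySet());

        Set<Integer> changing = new TreeSet<>();
        for (String key : allBonds) {
            Integer before = reactantBonds.get(key); // null: bond absent on reactant side
            Integer after = productBonds.get(key);   // null: bond absent on product side
            if (before == null || !before.equals(after)) {
                String[] ends = key.split("-");
                changing.add(Integer.parseInt(ends[0]));
                changing.add(Integer.parseInt(ends[1]));
            }
        }
        return changing;
    }
}
```

Under these assumptions, a bond whose order is the same on both sides contributes nothing, so atoms with only unchanged bonds are left out of the result.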

UNKNOWN

public static final int UNKNOWN
Mapping style of a pre-mapped reaction cannot be determined

See Also:
Constant Field Values

UNMAPPED

public static final int UNMAPPED
Reaction is not mapped

See Also:
Constant Field Values

EITHER

public static final int EITHER
Mapping style of the input reaction is ambiguous

See Also:
Constant Field Values

CHEMAXON

public static final int CHEMAXON
Reaction is mapped according to ChemAxon's style. Same as CHANGING.

See Also:
Constant Field Values

MATCHING

public static final int MATCHING
Only matching atoms are mapped.

See Also:
Constant Field Values

DEFAULT_MAPPING_STYLE

public static final int DEFAULT_MAPPING_STYLE
By default, changing atoms are mapped.

See Also:
Constant Field Values

FASTEST

public static final int FASTEST
heuristic mode, fast but less accurate

See Also:
Constant Field Values

STANDARD

public static final int STANDARD
good balance between speed and accuracy

See Also:
Constant Field Values

BEST

public static final int BEST
slowest mapping, but better quality

See Also:
Constant Field Values

DEFAULT_MAPPING_STRATEGY

public static final int DEFAULT_MAPPING_STRATEGY
golden middle

See Also:
Constant Field Values

DEFAULT_COMPLEXITY_THRESHOLD

public static final double DEFAULT_COMPLEXITY_THRESHOLD
maximum number of steps allowed in non-MCS matching

See Also:
Constant Field Values

DEFAULT_STEP_LIMIT

public static final long DEFAULT_STEP_LIMIT
by default there is no limitation on the number of steps performed

See Also:
Constant Field Values

STOP_UNKONW

public static final int STOP_UNKONW
mapping stopped for an unknown reason

See Also:
Constant Field Values

STOP_FOUND

public static final int STOP_FOUND
mapping stopped with an optimal solution

See Also:
Constant Field Values

STOP_STEPLIMIT

public static final int STOP_STEPLIMIT
mapping stopped because maximum allowed step count reached, no optimal solution found

See Also:
Constant Field Values

STOP_NOTFOUND

public static final int STOP_NOTFOUND
no solution found

See Also:
Constant Field Values

STOP_TIMELIMIT

public static final int STOP_TIMELIMIT
maximum allowed time exceeded, no optimum solution has been found

See Also:
Constant Field Values

STOP_BADARGSINMODFUNC

public static final int STOP_BADARGSINMODFUNC
bad parameters passed to modfunc()

See Also:
Constant Field Values
Constructor Detail

AutoMapper

public AutoMapper()
Creates a new instance of AutoMapper. One instance is capable of mapping multiple reactions sequentially, so there is no need to create a new AutoMapper object for each reaction to be mapped. Reusing a previously created AutoMapper object is beneficial in terms of memory allocation, and thus memory footprint and running time.

Method Detail

map

public static void map(Molecule mol,
                       int mappingStyle)
                throws chemaxon.marvin.modules.AutoMapperException
Maps the input molecule according to the given mapping style. This method provides a simple all-in-one entry point to the services provided by the AutoMapper class. It maps the input molecule which can be a simple molecule, a reaction, a molecule/reaction with S-groups as well as R-groups.
Mapping assigns a unique value to each atom or, more precisely, a value pair to each [...]. All existing map indices are preserved. Molecules are mapped consecutively from 1 to the number of atoms; the mapping style is not considered in this case, as it does not make any sense. Reactions are mapped with respect to the specified mapping style. Atoms in S-groups can be handled, but R-groups (elements of R-groups) are not mapped.

Parameters:
mol - molecule to be mapped
mappingStyle - specifies which kind of atoms are mapped and how (all, changing only etc.)
Throws:
chemaxon.marvin.modules.AutoMapperException
See Also:
MATCHING, CHANGING, ORPHANS, COMPLETE

setMappingStyle

public void setMappingStyle(int mappingStyle)
Sets the mapping style to be used in subsequent reaction mappings. Supported styles are defined as constant values in this class.

Parameters:
mappingStyle - specifies which kind of atoms are mapped (all, changing only etc.)
See Also:
MATCHING, CHANGING, ORPHANS, COMPLETE

setMappingMode

public void setMappingMode(int mappingMode)
Deprecated. Use setMappingStyle(int) instead.

Sets the mapping style to be used in subsequent reaction mappings. Supported styles are defined as constant values in this class.

Parameters:
mappingMode - the mapping style
See Also:
MATCHING, CHANGING, ORPHANS, COMPLETE

setMappingStrategy

public void setMappingStrategy(int newStrategy)
Sets the mapping strategy. Strategies allow various heuristic decisions to speed up search, though with the possible loss of optimum solution.

Parameters:
newStrategy - mapping strategy to be applied in subsequent mappings

setIgnoreH

public void setIgnoreH(boolean ignoreH)
Turns hydrogen mapping on or off. By default all atoms including hydrogen atoms are mapped.

Parameters:
ignoreH - if true, hydrogen atoms are not mapped.

setComplexityThreshold

public void setComplexityThreshold(float newThreshold)
Sets the complexity threshold. The complexity of the automapping problem is defined as the number of unique states that need to be searched without an MCS mapping. (MCS mapping is faster, so it is used to accelerate reaction auto-mapping, though not all parts of the reaction equation can be MCS mapped.)

Parameters:
newThreshold - new complexity threshold value
Since:
4.0

setStepCountLimit

public void setStepCountLimit(long maxNumberOfSteps)
Sets the maximum number of atomic search steps allowed. If this limit is exceeded, the search stops immediately. This method complements setTimeLimit(), with the added guarantee of deterministic, reproducible behaviour. This makes it better suited to testing, validation and batch usage.

Parameters:
maxNumberOfSteps - maximum number of allowed elementary search steps

setTimeLimit

public void setTimeLimit(long maxMilliseconds)
Sets the maximum allowed total search time. If this limit is exceeded, the search algorithm stops immediately. This method complements setStepCountLimit(): it gives no guarantee of deterministic, reproducible behaviour, but offers easy and direct control of the maximum running time. This better serves interactive auto-mapping applications, e.g. in MarvinSketch.

Parameters:
maxMilliseconds - maximum allowed mapping time in milliseconds
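
The difference between the two limits can be sketched with a toy search loop in plain Java. This is only an illustration of the general idea and not the AutoMapper implementation: the class BoundedSearchSketch, the enum StopCause (a stand-in for the STOP_* codes) and the parameter totalStepsNeeded are all hypothetical.

```java
// Hypothetical illustration only; not how AutoMapper is implemented.
final class BoundedSearchSketch {

    /** Stand-ins for STOP_FOUND, STOP_STEPLIMIT and STOP_TIMELIMIT. */
    enum StopCause { FOUND, STEP_LIMIT, TIME_LIMIT }

    /**
     * Pretends to run a search that needs totalStepsNeeded elementary steps,
     * giving up when either the step budget or the time budget is used up.
     */
    static StopCause search(long totalStepsNeeded, long maxSteps, long maxMillis) {
        long start = System.nanoTime();
        for (long step = 0; step < totalStepsNeeded; step++) {
            // Step budget: depends only on the input, so it is reproducible.
            if (step >= maxSteps) {
                return StopCause.STEP_LIMIT;
            }
            // Time budget: depends on machine speed and load, so it is not.
            long elapsedMillis = (System.nanoTime() - start) / 1_000_000L;
            if (elapsedMillis > maxMillis) {
                return StopCause.TIME_LIMIT;
            }
            // One elementary search step would be performed here.
        }
        return StopCause.FOUND;
    }
}
```

In this sketch the same input and step budget always yield the same stop cause, whereas the point at which the time budget triggers can vary from run to run.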

setOption

public boolean setOption(java.lang.String parameterName,
                         java.lang.String parameterValue)
Sets any options using string parameter names and string values. This is a convenience method that collects all other set*() methods in one method.
Note that in the present implementation only setMappingStyle and setTimeLimit are available via this convenience method.

Parameters:
parameterName - name of the parameter to be changed, identical to the original set* method name without the "set" prefix
parameterValue - new value of the specified parameter
Returns:
flag indicating whether the given option parameter was changed successfully
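
The naming rule for parameterName can be sketched with a small stand-alone dispatcher in plain Java. The class OptionDispatchSketch is hypothetical and only mimics the two options named above. How AutoMapper itself parses parameterValue is not documented here, so the decimal-number parsing and the case-sensitive name matching below are assumptions.

```java
// Hypothetical illustration only; not the AutoMapper implementation.
final class OptionDispatchSketch {
    private int mappingStyle;
    private long timeLimitMillis;

    void setMappingStyle(int mappingStyle) { this.mappingStyle = mappingStyle; }
    void setTimeLimit(long maxMilliseconds) { this.timeLimitMillis = maxMilliseconds; }

    int getMappingStyle() { return mappingStyle; }
    long getTimeLimit() { return timeLimitMillis; }

    /**
     * Dispatches on the setter name without its "set" prefix.
     * Returns false for unknown names or unparsable values (assumed behaviour).
     */
    boolean setOption(String parameterName, String parameterValue) {
        try {
            switch (parameterName) {
                case "MappingStyle":
                    setMappingStyle(Integer.parseInt(parameterValue));
                    return true;
                case "TimeLimit":
                    setTimeLimit(Long.parseLong(parameterValue));
                    return true;
                default:
                    return false; // option not reachable through setOption
            }
        } catch (NumberFormatException e) {
            return false; // value could not be parsed as a number
        }
    }
}
```

With this sketch, "TimeLimit" reaches setTimeLimit, while a name such as "StepCountLimit" is rejected because it is not one of the two supported options.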

setReaction

public void setReaction(RxnMolecule rm)
                 throws chemaxon.marvin.modules.AutoMapperException
Sets the current reaction to be mapped. In the present implementation this method performs the entire mapping procedure: it initializes all internals, completes the search for potential solutions and stores a predefined number (MAX_MAP_STORED) of solutions. Note that the reaction to be automapped may already have some atoms mapped before this method is called. These are preserved during automapping. However, this mapping has to be idempotent (which means that values should range from 1 to n on both sides and each and every value has to be mapped to one value only).

Parameters:
rm - reaction to be automapped, not cloned
Throws:
chemaxon.marvin.modules.AutoMapperException - when the input reaction is too big (has more than MAX_FRAGMENT reactant or product fragments)
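
One possible reading of the requirement on pre-existing maps can be sketched as a stand-alone check in plain Java. The class PreMappingCheckSketch and its methods are hypothetical and not part of the AutoMapper API. The sketch assumes that each side is given as an array of atom map numbers with 0 meaning "unmapped", and that the same values 1..n must each occur exactly once on both sides.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only; not part of the AutoMapper API.
final class PreMappingCheckSketch {

    /** Counts atoms that carry a map number (0 means "unmapped" in this sketch). */
    static int countMapped(int[] atomMaps) {
        int count = 0;
        for (int m : atomMaps) {
            if (m != 0) {
                count++;
            }
        }
        return count;
    }

    /** True if the non-zero map numbers are exactly 1..n, each used once. */
    static boolean usesOneToN(int[] atomMaps) {
        Set<Integer> seen = new HashSet<>();
        int max = 0;
        for (int m : atomMaps) {
            if (m == 0) {
                continue;                 // unmapped atom
            }
            if (m < 0 || !seen.add(m)) {
                return false;             // negative or duplicated map number
            }
            max = Math.max(max, m);
        }
        // Distinct positive values form 1..n exactly when the largest equals their count.
        return max == seen.size();
    }

    /** Both sides use 1..n once each, with the same n. */
    static boolean isValidPreMapping(int[] reactantMaps, int[] productMaps) {
        return usesOneToN(reactantMaps)
                && usesOneToN(productMaps)
                && countMapped(reactantMaps) == countMapped(productMaps);
    }
}
```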

getlastStopCause

public int getlastStopCause()
Returns the code of the last termination status. This can be important in finding out whether the mappings found are the best that the algorithm can produce or whether, for some reason, the search could not be completed (e.g. a timeout was encountered).

Returns:
code value for the last stop cause
Since:
5.1.2
See Also:
STOP_UNKONW, STOP_FOUND, STOP_NOTFOUND, STOP_STEPLIMIT, STOP_TIMELIMIT

getDiagnosticMessage

public java.lang.String getDiagnosticMessage(int diagLevel)
Returns a short text message that describes the outcome of the last search. Not the result, which is a mapping, but a more general explanation, such as whether a solution was found or not and, if not, why.

Parameters:
diagLevel - level of detail of the explanation: 0, no explanation; 1, running time; greater than 1, running time and "last stop cause"
Returns:
explanation about the last stop cause
Since:
5.1.2

guessMappingStyle

public int guessMappingStyle(RxnMolecule reaction)
Guesses mapping style of the input reaction. It recognizes ChemAxon and Daylight mapping, as well as unmapped reactions. The method is based on the algorithm described below:
 1. Is the input molecule mapped?
 1.1 No -> UNMAPPED
 1.2 Yes (go to 2.)
 2. Remove existing maps from a clone of the input molecule
 3. Automap the clone (full)
 4. Compare the original mapping to the mapping of the clone (pairwise):
 4.1 There is at least a non matching pair -> UNKNOWN mapping
 4.2 All pairs are matching (go to 5.)
 5. Is there an orphan atom in the clone (atom without pair)?
 5.1 Yes. Are they mapped?
 5.1.1 All mapped in the original molecule -> CHEMAXON mapping
 5.1.2 None of them are mapped in the original molecule -> DAYLIGHT mapping
 5.1.3 Some -> UNKNOWN mapping
 5.2 No. Are there any unmapped atoms in the original molecule?
 5.2.1 No -> EITHER mapping
 5.2.2 Yes -> CHEMAXON mapping
 

Parameters:
reaction - input reaction
Returns:
mapping style identifiers: CHEMAXON, DAYLIGHT, UNKNOWN, COMPLETE, UNMAPPED
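
The decision steps listed above can be sketched as a pure function in plain Java. This is only a restatement of the listed branches and not the actual implementation: the class GuessStyleSketch, the enum Style (a stand-in for the int constants of this class) and the pre-computed inputs are hypothetical, and the cloning, unmapping and automapping of steps 2 to 4 are assumed to have been done beforehand.

```java
// Hypothetical illustration only; the real method works on an RxnMolecule.
final class GuessStyleSketch {

    /** Stand-ins for the int constants UNMAPPED, UNKNOWN, CHEMAXON, DAYLIGHT, EITHER. */
    enum Style { UNMAPPED, UNKNOWN, CHEMAXON, DAYLIGHT, EITHER }

    /**
     * @param mapped             the input reaction carries atom maps (step 1)
     * @param allPairsMatch      original maps agree with the automapped clone (step 4)
     * @param orphans            number of orphan atoms found in the clone (step 5)
     * @param orphansMapped      how many of those orphans are mapped in the original
     * @param hasUnmappedAtoms   the original reaction has unmapped atoms (step 5.2)
     */
    static Style classify(boolean mapped, boolean allPairsMatch,
                          int orphans, int orphansMapped, boolean hasUnmappedAtoms) {
        if (!mapped) {
            return Style.UNMAPPED;                 // step 1.1
        }
        if (!allPairsMatch) {
            return Style.UNKNOWN;                  // step 4.1
        }
        if (orphans > 0) {                         // step 5.1
            if (orphansMapped == orphans) {
                return Style.CHEMAXON;             // all orphans mapped
            }
            if (orphansMapped == 0) {
                return Style.DAYLIGHT;             // no orphan mapped
            }
            return Style.UNKNOWN;                  // only some orphans mapped
        }
        // step 5.2: no orphan atoms at all
        return hasUnmappedAtoms ? Style.CHEMAXON : Style.EITHER;
    }
}
```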

modfunc

public java.lang.Object modfunc(java.lang.Object arg)
Mandatory method to be implemented by Marvin modules. Services provided by this class via the Marvin dynamically loaded module interface are accessible via this method.
At present three different functions are available:
    1. mapping a reaction
    2. specifying atoms to be excluded from auto-mapping
    3. setting some option parameters (as implemented by setOption(String,String))
Actions can be accessed by these names: map, setForbiddenMap, setOption. Corresponding arguments: map takes an RxnMolecule, setForbiddenMap takes one atom index (referring to the reaction molecule), setOption uses various string parameters as described in setOption(String,String). Action strings and their corresponding arguments are passed in an Object array, see the parameter description below.

      Specified by:
      modfunc in class chemaxon.marvin.util.MarvinModule
      Parameters:
      arg - it is interpreted as an array of Objects, the first element of this array is the name of the command (see description above), succeeding array elements are the arguments of the command
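
The layout of the argument array can be sketched with a stand-alone shape check in plain Java. The class ModfuncArgsSketch is hypothetical and does not call AutoMapper. The argument counts are assumptions drawn from the description above: one argument for map and setForbiddenMap, and two strings for setOption, following the setOption(String,String) signature.

```java
// Hypothetical illustration only; it checks the array shape and maps nothing.
final class ModfuncArgsSketch {

    /** True if arg looks like { commandName, argument... } for a known command. */
    static boolean isWellFormed(Object arg) {
        if (!(arg instanceof Object[])) {
            return false;                       // must be an array of Objects
        }
        Object[] args = (Object[]) arg;
        if (args.length == 0 || !(args[0] instanceof String)) {
            return false;                       // first element names the command
        }
        int argumentCount = args.length - 1;    // remaining elements are arguments
        switch ((String) args[0]) {
            case "map":                         // one reaction (assumed count)
            case "setForbiddenMap":             // one index (assumed count)
                return argumentCount == 1;
            case "setOption":                   // parameter name and value
                return argumentCount == 2
                        && args[1] instanceof String
                        && args[2] instanceof String;
            default:
                return false;                   // unknown command name
        }
    }
}
```

For example, an array such as new Object[] { "setOption", "TimeLimit", "5000" } passes this check, while a bare String does not.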

    setForbiddenMap

    public void setForbiddenMap(int mapId)
    The given atom map id should not be assigned to any atom. Arbitrary map indices can be disabled by subsequent calls of this method. This method must be called before setReaction(), map and mapReaction in order to take effect.

    Parameters:
    mapId - map index not to be assigned to any atom

    map

    public int map(RxnMolecule reaction)
            throws chemaxon.marvin.modules.AutoMapperException
    Convenience function that unifies setReaction() and setMap() in one method. The reaction parameter is both an input and an output variable: on output all of its atoms are mapped. Atom maps that exist on input are preserved. This method does not clear forbidden map indices set by setForbiddenMap().

    Parameters:
    reaction - the reaction molecule to be mapped
    Returns:
    stop cause why the mapping terminated
    Throws:
    chemaxon.marvin.modules.AutoMapperException - when no valid map was found or the input reaction consists of too many fragments (see setReaction( final RxnMolecule rm ))

    map

    public int map(RxnMolecule reaction,
                   boolean mapAlways)
            throws chemaxon.marvin.modules.AutoMapperException
    Convenience function that unifies setReaction() and setMap() in one method. The reaction parameter is both an input and an output variable: on output all of its atoms are mapped. Atom maps that exist on input are preserved. This method does not clear forbidden map indices set by setForbiddenMap().

    Parameters:
    reaction - the reaction molecule to be mapped
    mapAlways - if set to false, maps are only added to original reaction if an optimal solution was found; otherwise maps are always added (if any solution is found at all)
    Returns:
    stop cause why the mapping terminated
    Throws:
    chemaxon.marvin.modules.AutoMapperException - when no valid map was found or the input reaction consists of too many fragments (see setReaction( final RxnMolecule rm ))
    Since:
    5.1.2

    mapReaction

    public static void mapReaction(RxnMolecule reaction)
                            throws chemaxon.marvin.modules.AutoMapperException
    Convenience function that unifies setReaction() and setMap() in one method. The reaction parameter is both an input and an output variable: on output all of its atoms are mapped. Atom maps that exist on input are preserved. This method does not clear forbidden map indices set by setForbiddenMap().

    Parameters:
    reaction - the reaction molecule to be mapped
    Throws:
    chemaxon.marvin.modules.AutoMapperException - when no valid map was found or the input reaction consists of too many fragments (see setReaction( final RxnMolecule rm ))

    getMapCount

    public int getMapCount()
    Gets the number of solution maps found. Note that not all solutions are stored, only the MAX_MAP_STORED best mappings.

    Returns:
    number of mappings found

    setMap

    public void setMap(int mapId)
    Sets atom-atom maps in the RxnMolecule passed in setReaction( final RxnMolecule rm ) according to the mapId map.

    Parameters:
    mapId - index of a map, between 0 and getMapCount() - 1

    dump

    protected void dump()
    Engineering function. Writes all internal structures to standard error.


    clearMaps

    public static void clearMaps(RxnMolecule mol)
    Clears atom maps.

    Parameters:
    mol - is the reaction whose atom maps are cleared

    main

    public static void main(java.lang.String[] args)
    For engineering purposes only. This method should not be included in released versions.

    Parameters:
    args - either one filename of reaction molecules or empty