Typifier¶
SMARTS-based atom typing and force field parameter assignment.
Quick reference¶
| Symbol | Summary | Preferred for |
|---|---|---|
| OplsTypifier | Full OPLS-AA typing pipeline | OPLS force fields |
| GaffTypifier | Full GAFF typing pipeline | GAFF / GAFF2 force fields |
| ClpTypifier | Full CL&P typing pipeline (OPLS engine + built-in clp.xml) | Ionic-liquid force fields |
| PairTypifier | Pair (LJ) parameter assignment | Standalone nonbonded typing |
| LayeredTypingEngine | Dependency-aware SMARTS matching engine | Custom typing engines |
| DependencyAnalyzer | Computes SMARTS pattern dependency levels | Custom typing engines |
| .typify(struct) | Assign all types (atom → pair → bond → angle → dihedral) | One-call complete typing |
Each force-field typifier is a single orchestrator class: one typify() call
runs atom typing, then pair parameters, then bond/angle/dihedral types derived
from the atom assignments. Individual stages can be disabled with the
skip_*_typing constructor flags — there are no separate public
atom-only or bond/angle/dihedral typifier classes.
Canonical example¶
```python
import molpy as mp
from molpy.typifier import OplsTypifier

# mol is assumed to be an existing Atomistic structure
ff = mp.io.read_xml_forcefield("oplsaa.xml")
typifier = OplsTypifier(ff, strict_typing=True)
typed_mol = typifier.typify(mol)  # returns NEW Atomistic
```
Key behavior¶
- typify() returns a new Atomistic; the original is not modified
- strict_typing=True raises on untyped atoms; False silently skips them
- Atom typing uses SMARTS pattern matching with priority/override resolution
- Bonded types are derived from atom type assignments (CT-OH → bond type CT-OH)
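The two strict_typing modes can be sketched in plain Python. This is an illustration only, not molpy's implementation: the function name check_typed, the use of None to mark an untyped atom, and the ValueError exception type are all assumptions made for the example.

```python
from __future__ import annotations


def check_typed(atom_types: dict[int, str | None], strict_typing: bool = True) -> dict[int, str]:
    """Return the typed atoms; in strict mode, raise if any atom is untyped.

    Hypothetical helper: None stands for "no SMARTS pattern matched".
    """
    untyped = sorted(i for i, t in atom_types.items() if t is None)
    if untyped and strict_typing:
        # The real exception class is not documented here; ValueError is a stand-in.
        raise ValueError(f"untyped atoms: {untyped}")
    # Non-strict mode: untyped atoms are dropped without any error.
    return {i: t for i, t in atom_types.items() if t is not None}


lenient = check_typed({0: "CT", 1: None}, strict_typing=False)
```

In this sketch the lenient call keeps only atom 0, while the same input with strict_typing=True raises.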
Full API¶
Force-Field Typifiers (OPLS, base orchestrator, pair)¶
atomistic ¶
ForceFieldAngleTypifier ¶
ForceFieldAtomTypifier ¶
Bases: TypifierBase['Atomistic']
Base class for SMARTS-based atom typifiers.
ForceFieldBondTypifier ¶
Bases: TypifierBase[Bond]
Match bond type based on atom types at both ends of the bond.
typify ¶
Assign the most specific (and highest-layer) matching bond type.
Matches by resolving each atom's type to its class and comparing against the bond type's class pair (stored as the component names). Among all matches the most specific wins; ties break toward the higher overlay layer so CL&P/CL&Pol bonds override OPLS-AA.
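The selection rule described above might look roughly like the following. It is a self-contained sketch under stated assumptions, not the library's code: BondTypeSketch, pick_bond_type, the "*" wildcard, and the type-to-class table are hypothetical names introduced for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class BondTypeSketch:
    name: str
    classes: tuple[str, str]  # class pair; "*" is an assumed wildcard token
    layer: int = 0            # overlay layer, higher overrides lower


def _matches(pattern: tuple[str, str], actual: tuple[str, str]) -> bool:
    return all(p == "*" or p == a for p, a in zip(pattern, actual))


def pick_bond_type(candidates, type_to_class, type_i, type_j):
    """Resolve both atom types to classes, then pick the best matching bond type."""
    pair = (type_to_class[type_i], type_to_class[type_j])
    # A bond has no direction, so try both orientations of the pair.
    hits = [bt for bt in candidates
            if _matches(bt.classes, pair) or _matches(bt.classes, pair[::-1])]
    if not hits:
        return None
    # Most specific first (fewest wildcards); ties go to the higher layer.
    return max(hits, key=lambda bt: (sum(c != "*" for c in bt.classes), bt.layer))


type_to_class = {"opls_135": "CT", "opls_154": "OH"}  # illustrative mapping
candidates = [
    BondTypeSketch("CT-OH", ("CT", "OH"), layer=0),  # base layer
    BondTypeSketch("CT-OH", ("CT", "OH"), layer=1),  # overlay layer
]
best = pick_bond_type(candidates, type_to_class, "opls_154", "opls_135")
```

Both candidates are equally specific here, so the layer tiebreak selects the layer-1 entry.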
ForceFieldDihedralTypifier ¶
Bases: TypifierBase[Dihedral]
Match a dihedral type based on the atom types of the dihedral's four atoms.
typify ¶
Assign the most specific (highest-layer) matching dihedral type.
OPLS dihedrals routinely use wildcard end atoms (X-CT-CT-X); the
specificity score makes a fully-resolved pattern win over a partially
wildcard one, and the layer tiebreak lets CL&P/CL&Pol override OPLS-AA.
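A minimal sketch of this scoring, assuming dihedral patterns are four-element class tuples with "X" as the wildcard. The function and variable names are hypothetical, and the real typifier may compute specificity differently.

```python
from __future__ import annotations

WILDCARD = "X"  # wildcard token as written in X-CT-CT-X


def _matches(pattern, classes) -> bool:
    return all(p == WILDCARD or p == c for p, c in zip(pattern, classes))


def pick_dihedral(candidates, classes):
    """candidates: list of (name, pattern, layer); classes: the four atom classes."""
    hits = [
        # Sort key: number of non-wildcard positions, then overlay layer.
        (sum(p != WILDCARD for p in pattern), layer, name)
        for name, pattern, layer in candidates
        if _matches(pattern, classes) or _matches(pattern, classes[::-1])
    ]
    return max(hits)[2] if hits else None


candidates = [
    ("X-CT-CT-X", ("X", "CT", "CT", "X"), 0),
    ("HC-CT-CT-HC", ("HC", "CT", "CT", "HC"), 0),
]
specific = pick_dihedral(candidates, ("HC", "CT", "CT", "HC"))
fallback = pick_dihedral(candidates, ("OH", "CT", "CT", "HC"))
```

The fully resolved pattern wins when both match, and the wildcard pattern remains the fallback for end atoms that no specific entry covers.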
ForceFieldTypifier ¶
OplsTypifier ¶
PairTypifier ¶
atomtype_matches ¶
Check if an AtomType matches a given type string.
Matching rules:
1. If atomtype has a specific type (not "*"), compare by type
2. If type doesn't match, compare by class
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| atomtype | AtomType | AtomType instance | required |
| type_str | str | Type string to match (from Atom.data["type"] or class name) | required |
Returns:
| Type | Description |
|---|---|
| bool | True if matches, False otherwise |
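The two matching rules can be illustrated with a stand-in class. AtomTypeSketch and its name/class_ fields are assumptions for this example; the real AtomType class may store these values differently, and wildcard handling for the class field is not covered here.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class AtomTypeSketch:
    name: str = "*"    # specific type name, or "*" if unspecified
    class_: str = "*"  # force-field class


def atomtype_matches_sketch(atomtype: AtomTypeSketch, type_str: str) -> bool:
    # Rule 1: a specific (non-wildcard) type is compared by type name first.
    if atomtype.name != "*" and atomtype.name == type_str:
        return True
    # Rule 2: otherwise fall back to comparing by class.
    return atomtype.class_ == type_str


methyl_c = AtomTypeSketch(name="opls_135", class_="CT")
class_only = AtomTypeSketch(class_="CT")
```

With these definitions, methyl_c matches both "opls_135" and "CT", while class_only matches only "CT".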
GAFF Typifier¶
gaff ¶
GAFF (Generalized Amber Force Field) typifier implementation.
GaffTypifier ¶
GaffTypifier(forcefield, skip_atom_typing=False, skip_pair_typing=False, skip_bond_typing=False, skip_angle_typing=False, skip_dihedral_typing=False, strict_typing=True)
Bases: ForceFieldTypifier
GAFF full typing orchestrator.
Runs the full typing pipeline: atom typing -> pair typing -> bond typing -> angle typing -> dihedral typing
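As a rough illustration of how the skip flags shape the pipeline, the sketch below computes which stages would run for a given set of flags. Only the flag names come from the signature above; planned_stages is a hypothetical helper and does not call molpy.

```python
from __future__ import annotations

STAGES = ("atom", "pair", "bond", "angle", "dihedral")  # documented pipeline order


def planned_stages(**skip_flags: bool) -> list[str]:
    """Return the stages that would run, in order, for the given skip_* flags."""
    allowed = {f"skip_{stage}_typing" for stage in STAGES}
    unknown = set(skip_flags) - allowed
    if unknown:
        raise TypeError(f"unknown flags: {sorted(unknown)}")
    return [s for s in STAGES if not skip_flags.get(f"skip_{s}_typing", False)]


no_dihedrals = planned_stages(skip_dihedral_typing=True)
```

Skipping a stage leaves the order of the remaining stages unchanged.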
CL&P Typifier¶
clp ¶
CL&P ionic-liquid force field typifier.
CL&P (Canongia Lopes & Padua, J. Phys. Chem. B 2004, 108, 2038,
DOI 10.1021/jp0362133) is an all-atom fixed-charge force field for ionic
liquids whose functional form is fully OPLS-AA compatible (harmonic bonds and
angles, OPLS cosine-series dihedrals, LJ 12-6 with geometric combining and
0.5/0.5 1-4 scaling). It therefore reuses the entire OPLS typing engine; only
the force-field data differs, shipped as the built-in clp.xml.
ClpTypifier ¶
Bases: OplsTypifier
Full CL&P typing pipeline: atom -> pair -> bond -> angle -> dihedral.
Inherits the OPLS-AA SMARTS/overrides matching engine unchanged. When no
force field is supplied, the built-in clp.xml is loaded through the
OPLS-AA reader (CL&P shares OPLS units and combining rules).
load_forcefield staticmethod ¶
Load the built-in CL&P force field as an OPLS-AA overlay.
CL&P extends OPLS-AA, so the base oplsaa.xml is read first and
clp.xml is layered on top (layer=1). CL&P atom types therefore
override OPLS-AA wherever their SMARTS match (ionic-liquid atoms), while
OPLS-AA stays the fallback for any atom CL&P does not specifically cover
(e.g. molecular co-solvents in a mixed electrolyte).
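The overlay behaviour can be pictured as a per-atom choice among matching types, where the highest layer wins. The sketch below is an assumption-laden simplification: the type names are made up, and the real engine also applies priority and override resolution that is not modelled here.

```python
from __future__ import annotations


def resolve_layers(matches: dict[int, list[tuple[str, int]]]) -> dict[int, str]:
    """For each atom, pick the matching (type_name, layer) with the highest layer."""
    return {
        atom: max(cands, key=lambda c: c[1])[0]
        for atom, cands in matches.items()
        if cands  # atoms with no match stay untyped
    }


# Hypothetical match results: atom index -> [(type name, layer), ...]
matches = {
    0: [("oplsaa_generic_N", 0), ("clp_ring_N", 1)],  # ionic-liquid atom
    1: [("oplsaa_solvent_C", 0)],                     # co-solvent atom, base layer only
    2: [],                                            # nothing matched
}
resolved = resolve_layers(matches)
```

Atom 0 takes the layer-1 type, atom 1 falls back to the base layer, and atom 2 is left out of the result.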
Layered Typing Engine¶
layered_engine ¶
Layered typing engine for dependency-aware SMARTS matching.
LayeredTypingEngine ¶
Orchestrates level-by-level atom typing with dependency resolution.
This engine handles:
1. Dependency analysis and topological sorting
2. Layered matching (level 0 first, then level 1, etc.)
3. Conflict resolution within each level
4. Fixed-point iteration for circular dependencies
Attributes:
| Name | Type | Description |
|---|---|---|
| patterns | | Dictionary mapping atom type names to SMARTSGraph patterns |
| matcher | | SmartsMatcher instance for pattern matching |
| analyzer | | DependencyAnalyzer for computing levels |
Initialize layered typing engine.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| patterns | dict[str, SMARTSGraph] | Dictionary of {atom_type_name: SMARTSGraph} | required |
get_explain_data ¶
Generate detailed explanation of typing process.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| mol_graph | Graph | Molecule graph | required |
| vs_to_atomid | dict[int, int] | Vertex to atom ID mapping | required |
Returns:
| Type | Description |
|---|---|
| dict | Dictionary with detailed typing information |
typify ¶
Perform layered atom typing with dependency resolution.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| mol_graph | Graph | Molecule graph with vertex/edge attributes | required |
| vs_to_atomid | dict[int, int] | Mapping from vertex index to atom ID | required |
| max_iterations | int | Maximum iterations for circular dependency resolution | 10 |
Returns:
| Type | Description |
|---|---|
| dict[int, str] | Dictionary mapping atom_id -> atom_type |
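Level-by-level typing can be sketched without SMARTS at all by replacing each pattern with a plain predicate. Everything below is hypothetical (the rule tuples, the predicate signature, the toy molecule), and the fixed-point iteration for circular dependencies is deliberately left out.

```python
from __future__ import annotations


def layered_typify(elements, neighbors, rules_by_level):
    """rules_by_level: {level: [(type_name, priority, predicate), ...]}."""
    types: dict[int, str] = {}
    for level in sorted(rules_by_level):
        snapshot = dict(types)  # rules at this level see only lower-level results
        for atom in elements:
            hits = [(prio, name)
                    for name, prio, pred in rules_by_level[level]
                    if pred(atom, elements, neighbors, snapshot)]
            if hits:
                # Conflict resolution within a level: highest priority wins.
                types[atom] = max(hits)[1]
    return types


rules = {
    0: [
        ("CT", 0, lambda a, el, nb, t: el[a] == "C"),
        ("OH", 0, lambda a, el, nb, t: el[a] == "O"),
        ("HC", 0, lambda a, el, nb, t: el[a] == "H"),
    ],
    1: [
        # Depends on a level-0 result: hydrogen bonded to an atom typed "OH".
        ("HO", 0, lambda a, el, nb, t: el[a] == "H"
         and any(t.get(n) == "OH" for n in nb[a])),
    ],
}
elements = {0: "C", 1: "O", 2: "H", 3: "H"}
neighbors = {0: [1, 3], 1: [0, 2], 2: [1], 3: [0]}
result = layered_typify(elements, neighbors, rules)
```

The hydroxyl hydrogen (atom 2) is first typed generically at level 0 and then refined at level 1, once its oxygen neighbour has a type to refer to.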
Dependency Analyzer¶
dependency_analyzer ¶
Dependency analysis for SMARTS patterns with type references.
DependencyAnalyzer ¶
Analyzes dependencies between SMARTS patterns and computes matching levels.
Attributes:
| Name | Type | Description |
|---|---|---|
| patterns | | Dictionary mapping atom type names to their SMARTSGraph patterns |
| dependency_graph | dict[str, set[str]] | Adjacency list of dependencies (type -> depends_on) |
| levels | dict[str, int] | Dictionary mapping atom type names to their topological levels |
| circular_groups | list[set[str]] | List of sets containing types with circular dependencies |
Initialize dependency analyzer.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| patterns | dict[str, SMARTSGraph] | Dictionary of {atom_type_name: SMARTSGraph} | required |
get_max_level ¶
Get the maximum level number.
Returns:
| Type | Description |
|---|---|
| int | Maximum level, or -1 if no patterns |
get_patterns_by_level ¶
Get all patterns at a specific level.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| level | int | Topological level number | required |
Returns:
| Type | Description |
|---|---|
| list[SMARTSGraph] | List of SMARTSGraph patterns at that level |
has_circular_dependencies ¶
Check if there are any circular dependencies.
Returns:
| Type | Description |
|---|---|
| bool | True if circular dependencies exist |
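One way to derive levels from a dependency graph of the documented shape (type -> depends_on) is a round-based topological pass. This is a sketch of the general technique, not the analyzer's actual algorithm; compute_levels and the example type names are assumptions, and it reports all unresolved types together rather than as separate circular groups.

```python
from __future__ import annotations


def compute_levels(deps: dict[str, set[str]]) -> tuple[dict[str, int], set[str]]:
    """Return (levels, unresolved); unresolved types are in or downstream of a cycle."""
    levels: dict[str, int] = {}
    remaining = set(deps)
    while remaining:
        # A type is ready once every known dependency already has a level.
        ready = {t for t in remaining
                 if all(d in levels for d in deps[t] if d in deps)}
        if not ready:
            break  # only circular (or cycle-dependent) types are left
        for t in ready:
            known = [levels[d] for d in deps[t] if d in deps]
            levels[t] = 1 + max(known) if known else 0
        remaining -= ready
    return levels, remaining


deps = {
    "opls_154": set(),
    "opls_155": {"opls_154"},  # illustrative: references %opls_154
    "type_a": {"type_b"},      # hypothetical circular pair
    "type_b": {"type_a"},
}
levels, unresolved = compute_levels(deps)
max_level = max(levels.values(), default=-1)
```

The default of -1 in the last line mirrors the documented get_max_level behaviour for an empty pattern set.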
Adapter¶
adapter ¶
Adapter utilities for converting molecules to graphs for matching.
This module provides functions to convert Atomistic structures into igraph.Graph representations suitable for SMARTS pattern matching.
build_mol_graph ¶
Convert Atomistic structure to igraph.Graph for matching.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| structure | Atomistic | Atomistic structure with atoms and bonds | required |
Returns:
| Type | Description |
|---|---|
| tuple[Graph, dict[int, int], dict[int, int]] | Tuple of (graph, vs_to_atomid, atomid_to_vs), where vs_to_atomid maps vertex index to atom ID and atomid_to_vs is the inverse mapping |
Vertex attributes set
- element: str (e.g., "C", "N", "O")
- number: int (atomic number)
- is_aromatic: bool
- formal_charge: int
- degree: int (number of bonds)
- hyb: int | None (1=sp, 2=sp2, 3=sp3)
- in_ring: bool
- cycles: set of tuples (ring membership)
Edge attributes set
- order: int | str (1, 2, 3, or ":")
- is_aromatic: bool
- is_in_ring: bool
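The index bookkeeping behind such a conversion can be sketched in plain Python. This example does not use igraph or molpy: build_index_maps is a hypothetical helper that only derives the two mappings, a vertex-indexed edge list, and the degree attribute from atom IDs and bonds.

```python
from __future__ import annotations


def build_index_maps(atom_ids: list[int], bonds: list[tuple[int, int]]):
    """Map arbitrary atom IDs onto contiguous vertex indices 0..n-1."""
    vs_to_atomid = dict(enumerate(atom_ids))
    atomid_to_vs = {aid: v for v, aid in vs_to_atomid.items()}
    # Re-express each bond in vertex indices.
    edges = [(atomid_to_vs[a], atomid_to_vs[b]) for a, b in bonds]
    # degree: number of bonds per vertex.
    degree = [0] * len(atom_ids)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return edges, degree, vs_to_atomid, atomid_to_vs


edges, degree, vs_to_atomid, atomid_to_vs = build_index_maps(
    atom_ids=[101, 205, 307],
    bonds=[(101, 205), (205, 307)],
)
```

An edge list in this form is what igraph.Graph(n=..., edges=...) accepts, so the remaining work would be attaching the vertex and edge attributes listed above.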
Graph¶
graph ¶
Module for SMARTSGraph and SMARTS matching logic.
SMARTSGraph ¶
SMARTSGraph(smarts_string=None, parser=None, name=None, atomtype_name=None, priority=0, target_vertices=None, source='', overrides=None, *args, **kwargs)
Bases: Graph
A graph representation of a SMARTS pattern.
This class supports two modes of construction:
1. From SMARTS string (legacy mode)
2. From predicates (new predicate-based mode)
Attributes¶
- atomtype_name: str. The atom type this pattern assigns.
- priority: int. Priority for conflict resolution (higher wins).
- target_vertices: list[int]. Which pattern vertices should receive the atom type (empty = all).
- source: str. Source identifier for debugging.
- smarts_string: str | None. The SMARTS string (if constructed from string).
- ir: SmartsIR | None. The intermediate representation (if constructed from string).
Notes¶
SMARTSGraph inherits from igraph.Graph
Vertex attributes
- preds: list[Callable] - list of predicates that must all pass
Edge attributes
- preds: list[Callable] - list of predicates that must all pass
Graph attributes
- atomtype_name: str
- priority: int
- target_vertices: list[int]
- source: str
- specificity_score: int (computed)
extract_dependencies ¶
Extract type references from SMARTS IR.
Finds all has_label primitives that reference atom types (e.g., %opls_154). These are parsed by Lark as AtomPrimitiveIR(type="has_label", value="%opls_154").
Returns:
| Type | Description |
|---|---|
| set[str] | Set of referenced atom type names (e.g., {'opls_154', 'opls_135'}) |
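For intuition, the same kind of extraction can be approximated on a raw SMARTS string with a regular expression. This swaps in a different technique from the one described above, which walks the parsed IR; extract_type_refs is a hypothetical name, and the pattern assumes type names start with a letter or underscore so that ring-closure labels such as %10 are not picked up.

```python
import re

# "%" followed by an identifier; a leading digit (ring closure like %10) is excluded.
_TYPE_REF = re.compile(r"%([A-Za-z_]\w*)")


def extract_type_refs(smarts: str) -> set:
    """Return the atom type names referenced as %name in a SMARTS string."""
    return set(_TYPE_REF.findall(smarts))


refs = extract_type_refs("[C;%opls_135][O;%opls_154]")
```

A regex pass like this is only a rough approximation; walking the parsed IR avoids false positives that text matching cannot rule out.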
find_matches ¶
Return sets of atoms that match this SMARTS pattern in a topology.
Parameters¶
- structure: TopologyGraph. The topology that we are trying to atomtype.
- typemap: dict. The target typemap being used/edited.
Notes¶
When this function gets used in atomtyper.py, we actively modify the
white- and blacklists of the atoms in topology after finding a match.
This means that between every successive call of
subgraph_isomorphisms_iter(), the topology against which we are
matching may have actually changed. Currently, we take advantage of this
behavior in some edge cases (e.g. see test_hexa_coordinated in
test_smarts.py).
from_igraph classmethod ¶
Create a SMARTSGraph from an existing igraph.Graph.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| graph | Graph | igraph.Graph with vertex/edge predicates | required |
| atomtype_name | str | Atom type this pattern assigns | required |
| priority | int | Priority for conflict resolution | 0 |
| target_vertices | list[int] \| None | Which vertices should be typed (empty = all) | None |
| source | str | Source identifier | '' |
Returns:
| Type | Description |
|---|---|
| SMARTSGraph | SMARTSGraph instance |
get_specificity_score ¶
Compute specificity score for this pattern.
Scoring heuristic
- +0 per element predicate (baseline)
- +1 per charge/degree/hyb constraint
- +2 per aromatic/in_ring constraint
- +3 per bond order predicate
- +4 per custom predicate
Returns:
| Type | Description |
|---|---|
| int | Specificity score (higher = more specific) |
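The heuristic amounts to a weighted count of constraints. In the sketch below the weights come from the list above, while the string labels for each constraint kind and the function name are assumptions; the real method inspects the predicates attached to the graph rather than a list of labels.

```python
from __future__ import annotations

# Weights taken from the scoring heuristic; the keys are illustrative labels.
WEIGHTS = {
    "element": 0,
    "charge": 1,
    "degree": 1,
    "hyb": 1,
    "aromatic": 2,
    "in_ring": 2,
    "bond_order": 3,
    "custom": 4,
}


def specificity_score(constraint_kinds: list[str]) -> int:
    """Sum the weight of every constraint present in a pattern."""
    return sum(WEIGHTS[kind] for kind in constraint_kinds)


generic = specificity_score(["element"])
ring_carbon = specificity_score(["element", "degree", "in_ring", "bond_order"])
```

Here the element-only pattern scores 0 and the ring pattern scores 0 + 1 + 2 + 3 = 6, so the latter would be treated as more specific.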
SMARTSMatcher ¶
Inherits and implements VF2 for a SMARTSGraph.