Typifier

SMARTS-based atom typing and force field parameter assignment.

Quick reference

Symbol               Summary                                                     Preferred for
OplsTypifier         Full OPLS-AA typing pipeline                                OPLS force fields
GaffTypifier         Full GAFF typing pipeline                                   GAFF / GAFF2 force fields
ClpTypifier          Full CL&P typing pipeline (OPLS engine + built-in clp.xml)  Ionic-liquid force fields
PairTypifier         Pair (LJ) parameter assignment                              Standalone nonbonded typing
LayeredTypingEngine  Dependency-aware SMARTS matching engine                     Custom typing engines
DependencyAnalyzer   Computes SMARTS pattern dependency levels                   Custom typing engines
.typify(struct)      Assign all types (atom → pair → bond → angle → dihedral)    One-call complete typing

Each force-field typifier is a single orchestrator class: one typify() call runs atom typing, then pair parameters, then bond/angle/dihedral types derived from the atom assignments. Individual stages can be disabled with the skip_*_typing constructor flags — there are no separate public atom-only or bond/angle/dihedral typifier classes.
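The stage ordering and the skip_*_typing flags described above can be pictured with a small stand-in class. This is only a sketch: ToyOrchestrator, STAGES, and the dict-based structure are hypothetical and do not come from molpy. Only the flag names and the stage order follow the constructor signatures documented on this page.

```python
from copy import deepcopy

# Stage names in the documented order: atom, pair, bond, angle, dihedral.
STAGES = ("atom", "pair", "bond", "angle", "dihedral")


class ToyOrchestrator:
    """Illustrative stand-in for a force-field typifier; not molpy code."""

    def __init__(self, **skip_flags):
        # Accept flags shaped like skip_bond_typing=True.
        self.enabled = [
            stage for stage in STAGES
            if not skip_flags.get(f"skip_{stage}_typing", False)
        ]

    def typify(self, struct):
        typed = deepcopy(struct)  # work on a copy so the input stays untouched
        for stage in self.enabled:
            typed.setdefault("stages_run", []).append(stage)
        return typed


mol = {"atoms": ["C", "O", "H"]}
typed = ToyOrchestrator(skip_dihedral_typing=True).typify(mol)
print(typed["stages_run"])  # ['atom', 'pair', 'bond', 'angle']
```

The input dictionary has no "stages_run" key afterwards, which mirrors the documented behavior that typify() returns a new object.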

Canonical example

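import molpy as mp
from molpy.typifier import OplsTypifier

# mol is assumed to be an existing Atomistic structure, built or loaded beforehand
ff = mp.io.read_xml_forcefield("oplsaa.xml")
typifier = OplsTypifier(ff, strict_typing=True)  # raise if any atom stays untyped
typed_mol = typifier.typify(mol)  # returns a NEW Atomistic; mol is left unchanged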

Key behavior

  • typify() returns a new Atomistic — the original is not modified
  • strict_typing=True raises on untyped atoms; False silently skips them
  • Atom typing uses SMARTS pattern matching with priority/override resolution
  • Bonded types are derived from atom type assignments (CT-OH → bond type CT-OH)
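To make the last point concrete, here is a minimal sketch of deriving a bond type label from two atom types. The function name and the type-to-class table are assumptions chosen for illustration; in practice the mapping comes from the loaded force field.

```python
def bond_type_name(type_i, type_j, type_to_class):
    """Join the classes of the two bonded atoms into a bond type label."""
    return f"{type_to_class[type_i]}-{type_to_class[type_j]}"


# Illustrative atom type -> class table (assumed values, not read from a file).
TYPE_TO_CLASS = {"opls_157": "CT", "opls_154": "OH"}

label = bond_type_name("opls_157", "opls_154", TYPE_TO_CLASS)
print(label)  # CT-OH
```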

Full API

Force-Field Typifiers (OPLS, base orchestrator, pair)

atomistic

ForceFieldAngleTypifier

ForceFieldAngleTypifier(forcefield, strict=True)

Bases: TypifierBase[Angle]

Match an angle type based on the atom types of the three atoms in an Angle.

typify
typify(angle)

Assign the most specific (highest-layer) matching angle type.

ForceFieldAtomTypifier

ForceFieldAtomTypifier(forcefield, strict=False)

Bases: TypifierBase['Atomistic']

Base class for SMARTS-based atom typifiers.

typify
typify(struct)

Return a new Atomistic with atom types assigned; input is not mutated.

ForceFieldBondTypifier

ForceFieldBondTypifier(forcefield, strict=True)

Bases: TypifierBase[Bond]

Match bond type based on atom types at both ends of the bond.

typify
typify(bond)

Assign the most specific (and highest-layer) matching bond type.

Matches by resolving each atom's type to its class and comparing against the bond type's class pair (stored as the component names). Among all matches the most specific wins; ties break toward the higher overlay layer so CL&P/CL&Pol bonds override OPLS-AA.
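The selection rule described above (most specific match wins, higher layer breaks ties) can be sketched as follows. ToyBondType, the "*" wildcard, and the candidate names are hypothetical; the sketch only illustrates the ordering logic and is not the library's implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyBondType:
    name: str
    classes: tuple  # (class_i, class_j); "*" stands for a wildcard
    layer: int = 0  # 0 = base force field, higher = overlay


def _matches(pattern, classes):
    """True if the pattern fits the class pair in either direction."""
    def fits(candidate):
        return all(p in ("*", c) for p, c in zip(pattern, candidate))
    return fits(classes) or fits(classes[::-1])


def pick_bond_type(classes, candidates):
    hits = [bt for bt in candidates if _matches(bt.classes, classes)]
    if not hits:
        return None
    # Specificity is compared first; the layer only breaks ties.
    return max(hits, key=lambda bt: (sum(c != "*" for c in bt.classes), bt.layer))


candidates = [
    ToyBondType("base CT-NA", ("CT", "NA"), layer=0),
    ToyBondType("overlay CT-NA", ("CT", "NA"), layer=1),
    ToyBondType("base CT-HC", ("CT", "HC"), layer=0),
]
print(pick_bond_type(("NA", "CT"), candidates).name)  # overlay CT-NA
```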

ForceFieldDihedralTypifier

ForceFieldDihedralTypifier(forcefield, strict=True)

Bases: TypifierBase[Dihedral]

Match a dihedral type based on the atom types of the four atoms in a Dihedral.

typify
typify(dihedral)

Assign the most specific (highest-layer) matching dihedral type.

OPLS dihedrals routinely use wildcard end atoms (X-CT-CT-X); the specificity score makes a fully-resolved pattern win over a partially wildcard one, and the layer tiebreak lets CL&P/CL&Pol override OPLS-AA.
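A small sketch of how wildcard end atoms interact with specificity and layers, assuming "X" is the wildcard and candidates are plain (name, pattern, layer) tuples. All names are illustrative, and the scoring here simply counts non-wildcard positions, which may differ from the real specificity score.

```python
WILDCARD = "X"


def dihedral_matches(pattern, classes):
    """True if the 4-class pattern fits in forward or reverse order."""
    def fits(candidate):
        return all(p == WILDCARD or p == c for p, c in zip(pattern, candidate))
    return fits(classes) or fits(classes[::-1])


def pick_dihedral(classes, candidates):
    """candidates is a list of (name, pattern, layer) tuples."""
    hits = [c for c in candidates if dihedral_matches(c[1], classes)]
    if not hits:
        return None
    # Count resolved positions first, then prefer the higher layer.
    best = max(hits, key=lambda c: (sum(p != WILDCARD for p in c[1]), c[2]))
    return best[0]


CANDIDATES = [
    ("base X-CT-CT-X", ("X", "CT", "CT", "X"), 0),
    ("base HC-CT-CT-HC", ("HC", "CT", "CT", "HC"), 0),
    ("overlay X-CT-CT-X", ("X", "CT", "CT", "X"), 1),
]
print(pick_dihedral(("HC", "CT", "CT", "HC"), CANDIDATES))  # base HC-CT-CT-HC
print(pick_dihedral(("HC", "CT", "CT", "OH"), CANDIDATES))  # overlay X-CT-CT-X
```

In the first call the fully resolved pattern wins on specificity even though an overlay wildcard pattern also matches; in the second call only the wildcard patterns match, so the layer decides.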

ForceFieldTypifier

ForceFieldTypifier(forcefield, skip_atom_typing=False, skip_pair_typing=False, skip_bond_typing=False, skip_angle_typing=False, skip_dihedral_typing=False, strict_typing=True)

Bases: TypifierBase[Atomistic]

Base orchestrator that runs the full typing pipeline.

Subclasses can override to use different atom typifiers or add additional typing steps (e.g., improper typing).

typify
typify(struct)

Return a new Atomistic with types assigned; input is not mutated.

OplsTypifier

OplsTypifier(forcefield, skip_atom_typing=False, skip_pair_typing=False, skip_bond_typing=False, skip_angle_typing=False, skip_dihedral_typing=False, strict_typing=True)

Bases: ForceFieldTypifier

OPLS-AA full typing orchestrator: atom → pair → bond → angle → dihedral.

PairTypifier

PairTypifier(forcefield, strict=True)

Bases: TypifierBase[Atom]

Assign nonbonded parameters (charge, sigma, epsilon) to atoms based on their types.

typify
typify(atom)

Assign nonbonded parameters to an atom based on its type.

atomtype_matches

atomtype_matches(atomtype, type_str)

Check if an AtomType matches a given type string.

Matching rules:

  1. If atomtype has a specific type (not "*"), compare by type
  2. If type doesn't match, compare by class

Parameters:

Name      Type      Description                                                  Default
atomtype  AtomType  AtomType instance                                            required
type_str  str       Type string to match (from Atom.data["type"] or class name)  required

Returns:

Type  Description
bool  True if matches, False otherwise
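The two matching rules can be sketched with a stand-in data class. ToyAtomType and its field names (type, cls) are assumptions made for this example and are not the real AtomType interface.

```python
from dataclasses import dataclass


@dataclass
class ToyAtomType:
    type: str  # specific type name, or "*" for a wildcard
    cls: str   # class name


def toy_atomtype_matches(atomtype, type_str):
    # Rule 1: a specific (non-wildcard) type is compared by type.
    if atomtype.type != "*" and atomtype.type == type_str:
        return True
    # Rule 2: otherwise fall back to comparing by class.
    return atomtype.cls == type_str


alkyl_carbon = ToyAtomType(type="opls_135", cls="CT")
print(toy_atomtype_matches(alkyl_carbon, "opls_135"))  # True
print(toy_atomtype_matches(alkyl_carbon, "CT"))        # True
print(toy_atomtype_matches(alkyl_carbon, "HC"))        # False
```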

GAFF Typifier

gaff

GAFF (Generalized Amber Force Field) typifier implementation.

GaffTypifier

GaffTypifier(forcefield, skip_atom_typing=False, skip_pair_typing=False, skip_bond_typing=False, skip_angle_typing=False, skip_dihedral_typing=False, strict_typing=True)

Bases: ForceFieldTypifier

GAFF full typing orchestrator.

Runs the full typing pipeline: atom typing -> pair typing -> bond typing -> angle typing -> dihedral typing

CL&P Typifier

clp

CL&P ionic-liquid force field typifier.

CL&P (Canongia Lopes & Padua, J. Phys. Chem. B 2004, 108, 2038, DOI 10.1021/jp0362133) is an all-atom fixed-charge force field for ionic liquids whose functional form is fully OPLS-AA compatible (harmonic bonds and angles, OPLS cosine-series dihedrals, LJ 12-6 with geometric combining and 0.5/0.5 1-4 scaling). It therefore reuses the entire OPLS typing engine; only the force-field data differs, shipped as the built-in clp.xml.

ClpTypifier

ClpTypifier(forcefield=None, **kwargs)

Bases: OplsTypifier

Full CL&P typing pipeline: atom -> pair -> bond -> angle -> dihedral.

Inherits the OPLS-AA SMARTS/overrides matching engine unchanged. When no force field is supplied, the built-in clp.xml is loaded through the OPLS-AA reader (CL&P shares OPLS units and combining rules).

load_forcefield staticmethod
load_forcefield()

Load the built-in CL&P force field as an OPLS-AA overlay.

CL&P extends OPLS-AA, so the base oplsaa.xml is read first and clp.xml is layered on top (layer=1). CL&P atom types therefore override OPLS-AA wherever their SMARTS match (ionic-liquid atoms), while OPLS-AA stays the fallback for any atom CL&P does not specifically cover (e.g. molecular co-solvents in a mixed electrolyte).
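The overlay idea (base layer as fallback, higher layer wins where it matches) can be illustrated with plain predicate rules. The rule lists, type names, and atom dictionaries below are hypothetical; the real layers hold SMARTS patterns read from oplsaa.xml and clp.xml.

```python
def layered_lookup(atom, layers):
    """Return the type from the highest layer whose rule matches, else None."""
    chosen = None
    for rules in layers:  # layers[0] is the base, later entries are overlays
        for predicate, type_name in rules:
            if predicate(atom):
                chosen = type_name  # a higher layer replaces a lower-layer hit
                break
    return chosen


base_rules = [
    (lambda a: a["element"] == "N", "base_N"),
    (lambda a: a["element"] == "O", "base_O"),
]
overlay_rules = [
    (lambda a: a["element"] == "N" and a["in_ion"], "overlay_N_ion"),
]
layers = [base_rules, overlay_rules]

print(layered_lookup({"element": "N", "in_ion": True}, layers))   # overlay_N_ion
print(layered_lookup({"element": "O", "in_ion": False}, layers))  # base_O
```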

Layered Typing Engine

layered_engine

Layered typing engine for dependency-aware SMARTS matching.

LayeredTypingEngine

LayeredTypingEngine(patterns)

Orchestrates level-by-level atom typing with dependency resolution.

This engine handles:

  1. Dependency analysis and topological sorting
  2. Layered matching (level 0 first, then level 1, etc.)
  3. Conflict resolution within each level
  4. Fixed-point iteration for circular dependencies

Attributes:

Name      Description
patterns  Dictionary mapping atom type names to SMARTSGraph patterns
matcher   SmartsMatcher instance for pattern matching
analyzer  DependencyAnalyzer for computing levels

Initialize layered typing engine.

Parameters:

Name      Type                    Description                                  Default
patterns  dict[str, SMARTSGraph]  Dictionary of {atom_type_name: SMARTSGraph}  required

get_explain_data
get_explain_data(mol_graph, vs_to_atomid)

Generate detailed explanation of typing process.

Parameters:

Name          Type            Description                Default
mol_graph     Graph           Molecule graph             required
vs_to_atomid  dict[int, int]  Vertex to atom ID mapping  required

Returns:

Type  Description
dict  Dictionary with detailed typing information

typify
typify(mol_graph, vs_to_atomid, max_iterations=10)

Perform layered atom typing with dependency resolution.

Parameters:

Name            Type            Description                                            Default
mol_graph       Graph           Molecule graph with vertex/edge attributes             required
vs_to_atomid    dict[int, int]  Mapping from vertex index to atom ID                   required
max_iterations  int             Maximum iterations for circular dependency resolution  10

Returns:

Type            Description
dict[int, str]  Dictionary mapping atom_id -> atom_type
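As a rough picture of level-by-level typing with priority-based conflict resolution, the sketch below replaces SMARTS patterns with plain predicates and takes the levels as given. Everything in it (layered_typify, the pattern tuples, the type names) is hypothetical, and it leaves out the fixed-point iteration for circular dependencies.

```python
def layered_typify(elements, neighbors, patterns):
    """Assign types level by level; a higher priority wins on conflicts.

    patterns maps a type name to (level, priority, predicate), where
    predicate(atom_id, elements, neighbors, assigned) returns a bool.
    """
    assigned = {}          # atom_id -> type name
    winning_priority = {}  # atom_id -> priority of the current assignment
    max_level = max((level for level, _, _ in patterns.values()), default=-1)
    for level in range(max_level + 1):
        visible = dict(assigned)  # patterns only see types fixed by lower levels
        for name, (pattern_level, priority, predicate) in patterns.items():
            if pattern_level != level:
                continue
            for atom_id in elements:
                if not predicate(atom_id, elements, neighbors, visible):
                    continue
                if priority > winning_priority.get(atom_id, float("-inf")):
                    assigned[atom_id] = name
                    winning_priority[atom_id] = priority
    return assigned


# Methanol-like toy molecule: C(1)-O(2), O(2)-H(3), C(1)-H(4).
elements = {1: "C", 2: "O", 3: "H", 4: "H"}
neighbors = {1: [2, 4], 2: [1, 3], 3: [2], 4: [1]}
patterns = {
    "C_sp3": (0, 0, lambda i, el, nb, typed: el[i] == "C"),
    "O_h": (0, 0, lambda i, el, nb, typed: el[i] == "O"),
    "H_generic": (0, 0, lambda i, el, nb, typed: el[i] == "H"),
    # Level 1: depends on the neighbor already being typed as O_h.
    "H_on_O": (1, 1, lambda i, el, nb, typed: el[i] == "H"
               and any(typed.get(j) == "O_h" for j in nb[i])),
}
types = layered_typify(elements, neighbors, patterns)
print(types)  # {1: 'C_sp3', 2: 'O_h', 3: 'H_on_O', 4: 'H_generic'}
```

The hydroxyl hydrogen first receives the generic type at level 0 and is then re-typed at level 1, because the level-1 pattern can see the oxygen's type and carries a higher priority.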

Dependency Analyzer

dependency_analyzer

Dependency analysis for SMARTS patterns with type references.

DependencyAnalyzer

DependencyAnalyzer(patterns)

Analyzes dependencies between SMARTS patterns and computes matching levels.

Attributes:

Name              Type                 Description
patterns                               Dictionary mapping atom type names to their SMARTSGraph patterns
dependency_graph  dict[str, set[str]]  Adjacency list of dependencies (type -> depends_on)
levels            dict[str, int]       Dictionary mapping atom type names to their topological levels
circular_groups   list[set[str]]       List of sets containing types with circular dependencies

Initialize dependency analyzer.

Parameters:

Name      Type                    Description                                  Default
patterns  dict[str, SMARTSGraph]  Dictionary of {atom_type_name: SMARTSGraph}  required

get_max_level
get_max_level()

Get the maximum level number.

Returns:

Type  Description
int   Maximum level, or -1 if no patterns

get_patterns_by_level
get_patterns_by_level(level)

Get all patterns at a specific level.

Parameters:

Name   Type  Description               Default
level  int   Topological level number  required

Returns:

Type               Description
list[SMARTSGraph]  List of SMARTSGraph patterns at that level

has_circular_dependencies
has_circular_dependencies()

Check if there are any circular dependencies.

Returns:

Type  Description
bool  True if circular dependencies exist
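One way to picture the level computation is a plain topological layering over a {type: depends_on} mapping. The sketch below is an assumption about the general technique, not the analyzer's actual code: compute_levels is a hypothetical helper, and it reports every type it cannot resolve (cycle members plus anything that depends on them) in one set rather than grouping cycles.

```python
def compute_levels(dependency_graph):
    """Return (levels, unresolved) for a {type: set_of_dependencies} mapping."""
    levels = {}
    remaining = set(dependency_graph)
    while remaining:
        # A type is ready once every dependency already has a level.
        ready = {
            t for t in remaining
            if all(dep in levels for dep in dependency_graph[t])
        }
        if not ready:  # everything left waits on something else that is left
            break
        for t in ready:
            deps = dependency_graph[t]
            levels[t] = 1 + max((levels[d] for d in deps), default=-1)
        remaining -= ready
    return levels, remaining


deps = {
    "opls_135": set(),
    "opls_154": set(),
    "opls_155": {"opls_154"},  # illustrative: hydrogen pattern refers to %opls_154
    "type_a": {"type_b"},      # hypothetical circular pair
    "type_b": {"type_a"},
}
levels, unresolved = compute_levels(deps)
print(levels["opls_155"])                 # 1
print(sorted(unresolved))                 # ['type_a', 'type_b']
print(max(levels.values(), default=-1))   # 1
```

The default of -1 in the last line mirrors the documented get_max_level() behavior for an empty pattern set.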

Adapter

adapter

Adapter utilities for converting molecules to graphs for matching.

This module provides functions to convert Atomistic structures into igraph.Graph representations suitable for SMARTS pattern matching.

build_mol_graph

build_mol_graph(structure)

Convert Atomistic structure to igraph.Graph for matching.

Parameters:

Name       Type       Description                               Default
structure  Atomistic  Atomistic structure with atoms and bonds  required

Returns:

Type                                          Description
tuple[Graph, dict[int, int], dict[int, int]]  Tuple of (graph, vs_to_atomid, atomid_to_vs) where:

  • graph: igraph.Graph with vertex/edge attributes
  • vs_to_atomid: mapping from vertex index to atom ID
  • atomid_to_vs: mapping from atom ID to vertex index

Vertex attributes set
  • element: str (e.g., "C", "N", "O")
  • number: int (atomic number)
  • is_aromatic: bool
  • formal_charge: int
  • degree: int (number of bonds)
  • hyb: int | None (1=sp, 2=sp2, 3=sp3)
  • in_ring: bool
  • cycles: set of tuples (ring membership)

Edge attributes set
  • order: int | str (1, 2, 3, or ":")
  • is_aromatic: bool
  • is_in_ring: bool
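The two index mappings exist because graph vertices are numbered contiguously from 0 while atom IDs can be arbitrary. A plain-Python sketch of that bookkeeping, without igraph and with a hypothetical helper name, might look like this:

```python
def build_index_maps(atom_ids, bonds):
    """Map arbitrary atom IDs onto contiguous vertex indices and back."""
    vs_to_atomid = dict(enumerate(atom_ids))
    atomid_to_vs = {atom_id: v for v, atom_id in vs_to_atomid.items()}
    # Re-express bonds (given as atom ID pairs) as vertex index pairs.
    edges = [(atomid_to_vs[i], atomid_to_vs[j]) for i, j in bonds]
    # Degree = number of bonds touching each vertex.
    degree = [0] * len(atom_ids)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return vs_to_atomid, atomid_to_vs, edges, degree


vs_to_atomid, atomid_to_vs, edges, degree = build_index_maps(
    [10, 11, 15], [(10, 11), (11, 15)]
)
print(edges)   # [(0, 1), (1, 2)]
print(degree)  # [1, 2, 1]
```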

Graph

graph

Module for SMARTSGraph and SMARTS matching logic.

SMARTSGraph

SMARTSGraph(smarts_string=None, parser=None, name=None, atomtype_name=None, priority=0, target_vertices=None, source='', overrides=None, *args, **kwargs)

Bases: Graph

A graph representation of a SMARTS pattern.

This class supports two modes of construction:

  1. From SMARTS string (legacy mode)
  2. From predicates (new predicate-based mode)

Attributes

  • atomtype_name: str - The atom type this pattern assigns
  • priority: int - Priority for conflict resolution (higher wins)
  • target_vertices: list[int] - Which pattern vertices should receive the atom type (empty = all)
  • source: str - Source identifier for debugging
  • smarts_string: str | None - The SMARTS string (if constructed from string)
  • ir: SmartsIR | None - The intermediate representation (if constructed from string)

Notes

SMARTSGraph inherits from igraph.Graph

Vertex attributes
  • preds: list[Callable] - list of predicates that must all pass
Edge attributes
  • preds: list[Callable] - list of predicates that must all pass
Graph attributes
  • atomtype_name: str
  • priority: int
  • target_vertices: list[int]
  • source: str
  • specificity_score: int (computed)

priority property
priority

Get priority for conflict resolution (higher wins).

calc_signature
calc_signature(graph)

Calculate graph signatures for pattern matching.

extract_dependencies
extract_dependencies()

Extract type references from SMARTS IR.

Finds all has_label primitives that reference atom types (e.g., %opls_154). These are parsed by Lark as AtomPrimitiveIR(type="has_label", value="%opls_154").

Returns:

Type      Description
set[str]  Set of referenced atom type names (e.g., {'opls_154', 'opls_135'})
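For intuition, type references can also be pulled out of a raw SMARTS string with a regular expression. This is a swapped-in technique for illustration only: the method documented here walks the parsed IR, whereas the hypothetical helper below scans text. It requires the name after % to start with a letter or underscore so that numeric ring closures such as %12 are not mistaken for type references.

```python
import re

# "%" followed by an identifier; "%12"-style ring closures do not match.
_TYPE_REF = re.compile(r"%([A-Za-z_][A-Za-z0-9_]*)")


def toy_extract_dependencies(smarts):
    """Return the set of atom type names referenced as %name in a SMARTS string."""
    return set(_TYPE_REF.findall(smarts))


refs = toy_extract_dependencies("[C;X4;%opls_135]([H;%opls_140])")
print(sorted(refs))  # ['opls_135', 'opls_140']
```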

find_matches
find_matches(graph)

Return sets of atoms that match this SMARTS pattern in a topology.

Parameters

  • structure: TopologyGraph - The topology that we are trying to atomtype.
  • typemap: dict - The target typemap being used/edited

Notes

When this function is used in atomtyper.py, we actively modify the white- and blacklists of the atoms in the topology after finding a match. This means that between successive calls of subgraph_isomorphisms_iter(), the topology against which we are matching may have actually changed. Currently, we take advantage of this behavior in some edge cases (e.g. see test_hexa_coordinated in test_smarts.py).

from_igraph classmethod
from_igraph(graph, atomtype_name, priority=0, target_vertices=None, source='')

Create a SMARTSGraph from an existing igraph.Graph.

Parameters:

Name             Type              Description                                   Default
graph            Graph             igraph.Graph with vertex/edge predicates      required
atomtype_name    str               Atom type this pattern assigns                required
priority         int               Priority for conflict resolution              0
target_vertices  list[int] | None  Which vertices should be typed (empty = all)  None
source           str               Source identifier                             ''

Returns:

Type         Description
SMARTSGraph  SMARTSGraph instance

get_priority
get_priority()

Get priority value (supports both new and legacy modes).

get_specificity_score
get_specificity_score()

Compute specificity score for this pattern.

Scoring heuristic

  • +0 per element predicate (baseline)
  • +1 per charge/degree/hyb constraint
  • +2 per aromatic/in_ring constraint
  • +3 per bond order predicate
  • +4 per custom predicate

Returns:

Type  Description
int   Specificity score (higher = more specific)
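The heuristic amounts to a weighted sum over the predicates in a pattern. A minimal sketch, assuming each predicate is tagged with a kind string (the tags and the helper name are hypothetical; only the weights come from the list above):

```python
# Weights taken from the scoring heuristic; the kind labels are assumed.
WEIGHTS = {
    "element": 0,
    "charge": 1,
    "degree": 1,
    "hyb": 1,
    "aromatic": 2,
    "in_ring": 2,
    "bond_order": 3,
    "custom": 4,
}


def toy_specificity(predicate_kinds):
    """Sum the weight of every predicate kind present in a pattern."""
    return sum(WEIGHTS[kind] for kind in predicate_kinds)


generic = toy_specificity(["element"])
ring_carbon = toy_specificity(["element", "degree", "in_ring", "bond_order"])
print(generic, ring_carbon)  # 0 6
```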

override
override(overrides)

Set the priority of this SMARTS pattern.

plot
plot(*args, **kwargs)

Plot the SMARTS graph.

SMARTSMatcher

SMARTSMatcher(G1, G2, node_match_fn, edge_match_fn=None)

Inherits and implements VF2 for a SMARTSGraph.

is_isomorphic property
is_isomorphic

Return True if the two graphs are isomorphic.

candidate_pairs_iter
candidate_pairs_iter()

Iterate over candidate pairs of nodes in G1 and G2.

subgraph_isomorphisms
subgraph_isomorphisms()

Iterate over all subgraph isomorphisms between G1 and G2.
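To show what a subgraph isomorphism search returns, the sketch below enumerates matches by brute force over vertex permutations. This is deliberately not VF2: it is a much slower stand-in that is easy to read, it checks only that every pattern edge exists in the target (extra target edges are allowed), and all names in it are hypothetical.

```python
from itertools import permutations


def brute_force_subgraph_matches(pattern_nodes, pattern_edges,
                                 target_nodes, target_edges, node_match):
    """Yield {pattern_index: target_index} mappings that preserve pattern edges."""
    target_edge_set = {frozenset(edge) for edge in target_edges}
    for combo in permutations(range(len(target_nodes)), len(pattern_nodes)):
        # Every pattern node must be compatible with its target node.
        if not all(node_match(pattern_nodes[p], target_nodes[t])
                   for p, t in enumerate(combo)):
            continue
        # Every pattern edge must map onto an existing target edge.
        if all(frozenset((combo[u], combo[v])) in target_edge_set
               for u, v in pattern_edges):
            yield dict(enumerate(combo))


# Pattern O-H searched in a toy molecule: C(0)-O(1), O(1)-H(2), C(0)-H(3).
matches = list(brute_force_subgraph_matches(
    ["O", "H"], [(0, 1)],
    ["C", "O", "H", "H"], [(0, 1), (1, 2), (0, 3)],
    node_match=lambda p, t: p == t,
))
print(matches)  # [{0: 1, 1: 2}]
```

Only the hydrogen bonded to oxygen is matched; the hydrogen on carbon passes the node check but fails the edge check.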