MolPy
A composable, strongly typed toolkit for computational molecular modeling — from single-molecule parameterization to polydisperse polymer system construction.
Representative Workflows
Parameterize a small organic molecule from a SMILES string using the bundled OPLS-AA force field and export complete LAMMPS input files.
Specify a poly(ethylene oxide) chain via G-BigSMILES notation. MolPy generates three-dimensional coordinates and exports a simulation-ready topology.
Sample a Schulz–Zimm molecular-weight distribution, construct each chain atomistically, and pack the ensemble into a periodic simulation box.
import molpy as mp
# Mn = 1500 Da, Mw = 3000 Da, target total mass ≈ 500 kDa
chains = mp.tool.polymer_system(
    "{[<]CCOCC[>]}|schulz_zimm(1500,3000)||5e5|",
    random_seed=42,
)
print(f"Built {len(chains)} chains")
frames = [c.to_frame() for c in chains]
packed = mp.pack.pack(frames, box=[80, 80, 80])
# NOTE: ff must be a ForceField object created beforehand; it is not defined in this snippet.
mp.io.write_lammps_system("peo_bulk/", packed, ff)
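The sampling step in this workflow can be illustrated independently of MolPy. The sketch below is not MolPy's implementation. It assumes the common parameterization in which the Schulz–Zimm number distribution of chain mass is a gamma distribution with shape k = Mn / (Mw - Mn) and scale Mn / k, and it draws chains until the accumulated mass reaches the target. The function name sample_schulz_zimm is hypothetical, and how MolPy itself treats the final chain relative to the target mass is not stated here.

```python
import random


def sample_schulz_zimm(mn, mw, total_mass, seed=None):
    """Draw chain masses (Da) until their sum reaches total_mass.

    Assumption: the number distribution is Gamma(shape=k, scale=mn/k)
    with k = mn / (mw - mn), which gives a mean of mn and a
    weight-average of mw.
    """
    if mw <= mn:
        raise ValueError("mw must be strictly greater than mn")
    rng = random.Random(seed)      # fixed seed -> reproducible population
    shape = mn / (mw - mn)         # k = 1 / (PDI - 1)
    scale = mn / shape             # mean of the gamma is shape * scale = mn
    masses = []
    total = 0.0
    while total < total_mass:
        mass = rng.gammavariate(shape, scale)
        masses.append(mass)
        total += mass
    return masses


# Same inputs as the workflow above: Mn = 1500 Da, Mw = 3000 Da, 500 kDa total.
masses = sample_schulz_zimm(1500.0, 3000.0, 5e5, seed=42)
print(f"Sampled {len(masses)} chain masses")
```

With Mn = 1500 Da and Mw = 3000 Da the shape parameter is 1, so the gamma distribution reduces to an exponential distribution with mean 1500 Da.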
Prepare a monomer with partial charges via antechamber, assemble a chain with GAFF2 parameters via tleap, and retrieve AMBER topology files programmatically.
import molpy as mp
# BigSMILES → three-dimensional structure with port annotation
eo = mp.tool.PrepareMonomer().run("{[<]CCOCC[>]}")
# Assemble DP = 20 chain via AmberTools
result = mp.tool.polymer(
    "{[#EO]|20}",
    library={"EO": eo},
    backend="amber",
)
# Output file paths: result.prmtop_path, result.inpcrd_path, result.pdb_path
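For orientation, the AmberTools steps named in this workflow correspond to command-line invocations along the following lines. This is a sketch of the underlying tools, not of MolPy's wrapper: the helper name ambertools_commands, the file names, and the choice of AM1-BCC charges (-c bcc) are assumptions made for illustration. The commands are only assembled here, not executed, and the contents of the tleap script are not shown.

```python
def ambertools_commands(mol2_in, stem, net_charge=0):
    """Build argument lists for antechamber, parmchk2, and tleap.

    All file names are derived from the hypothetical `stem` argument.
    """
    charged = f"{stem}_gaff2.mol2"
    frcmod = f"{stem}.frcmod"
    antechamber = [
        "antechamber",
        "-i", mol2_in, "-fi", "mol2",     # input file and format
        "-o", charged, "-fo", "mol2",     # output file and format
        "-c", "bcc",                      # AM1-BCC partial charges (assumed choice)
        "-at", "gaff2",                   # GAFF2 atom types
        "-nc", str(net_charge),           # net molecular charge
    ]
    parmchk2 = [
        "parmchk2",
        "-i", charged, "-f", "mol2",
        "-o", frcmod,
        "-s", "gaff2",                    # estimate missing GAFF2 parameters
    ]
    tleap = ["tleap", "-f", f"{stem}.leap.in"]  # script file written separately
    return [antechamber, parmchk2, tleap]


for cmd in ambertools_commands("eo.mol2", "eo"):
    print(" ".join(cmd))
```

Each list can be passed to subprocess.run(cmd, check=True) once AmberTools is available on PATH.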
Design Principles
- Explicit representational hierarchy: Molecular graphs (Atomistic), numerical snapshots (Frame), and force field parameters (ForceField) occupy distinct layers with explicit conversion boundaries.
- Native support for polymer chemistry notations: SMILES, BigSMILES, CGSmiles, and G-BigSMILES are parsed directly. A monomer, an architecture, or a polydisperse ensemble can each be expressed as a single string.
- Statistical molecular-weight distributions: Schulz–Zimm, Poisson, Flory–Schulz, and uniform distributions are implemented natively. Target number- and weight-average molecular weights are specified directly; reproducible chain populations are generated from a fixed random seed.
- Force fields as queryable data structures: A ForceField object is an inspectable typed dictionary. Parameter completeness and type consistency are verifiable at the Python level before any file export occurs.
- Programmatic reaction framework: Chemical reactions are expressed through composable anchor selectors and leaving-group selectors. Pre- and post-reaction topology templates for LAMMPS fix bond/react are generated automatically.
- Modular, independently composable packages: The parser, builder, typifier, packer, and I/O subsystems share no hidden coupling. Each may be used independently or assembled into composite pipelines through explicit function calls.
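The idea of verifying parameter completeness before export can be illustrated with plain Python data structures. The sketch below does not use MolPy's ForceField class, whose interface is not shown on this page. BondParams, missing_bond_types, the atom-type labels, and the numeric values are all hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BondParams:
    k: float    # force constant (placeholder value)
    r0: float   # equilibrium length (placeholder value)


def missing_bond_types(bonds, bond_params):
    """Return bond-type keys used by the topology but absent from the table.

    Keys are order-independent: ("OS", "CT") and ("CT", "OS") are the same type.
    """
    needed = {tuple(sorted(pair)) for pair in bonds}
    return sorted(needed - set(bond_params))


# Hypothetical parameter table keyed by sorted atom-type pairs.
bond_params = {
    ("CT", "CT"): BondParams(k=1.0, r0=1.5),
    ("CT", "OS"): BondParams(k=1.0, r0=1.4),
}

# Atom-type pairs of the bonds in a hypothetical topology.
bonds = [("CT", "OS"), ("OS", "CT"), ("CT", "HC")]

print(missing_bond_types(bonds, bond_params))  # [('CT', 'HC')]
```

Running such a check before writing any files turns a missing parameter into an explicit Python-level result rather than an error reported later by the simulation engine.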
External Integrations
- AmberTools: Antechamber (partial charge assignment), parmchk2 (missing parameter estimation), and tleap (topology assembly) are invoked programmatically with structured Python interfaces.
- RDKit: RDKitAdapter provides bidirectional conversion between Atomistic and RDKit Mol objects, enabling three-dimensional embedding, conformer generation, and SMILES export.
- Packmol: Molecule packing into periodic simulation boxes is managed through a typed constraint interface wrapping the Packmol executable.
- LAMMPS · CP2K: Complete input decks are generated from MolPy data objects. The engine abstraction layer decouples system description from simulation-code-specific syntax.
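To make the Packmol integration concrete, the sketch below renders a Packmol input file from a small typed description. It illustrates the general pattern and is not MolPy's interface: the Structure dataclass, the render_packmol_input function, and the file names are hypothetical, while the emitted keywords (tolerance, filetype, output, structure, number, inside box) follow Packmol's input format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Structure:
    path: str    # coordinate file of one molecule type
    count: int   # number of copies to place


def render_packmol_input(structures, box, output="packed.pdb", tolerance=2.0):
    """Return Packmol input text placing every structure inside one box.

    `box` holds the three edge lengths; the lower corner is the origin.
    """
    lx, ly, lz = box
    lines = [f"tolerance {tolerance}", "filetype pdb", f"output {output}", ""]
    for s in structures:
        if s.count < 1:
            raise ValueError(f"count must be positive for {s.path}")
        lines += [
            f"structure {s.path}",
            f"  number {s.count}",
            f"  inside box 0. 0. 0. {lx} {ly} {lz}",
            "end structure",
            "",
        ]
    return "\n".join(lines)


print(render_packmol_input([Structure("chain_a.pdb", 10)], box=(80.0, 80.0, 80.0)))
```

The resulting text is what the Packmol executable reads on standard input. For periodic systems it is common to make the packing region slightly smaller than the simulation box so that molecules do not overlap across the boundaries.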
Documentation Structure
- Getting Started: Installation, environment verification, and a five-minute end-to-end example establishing the Atomistic → Frame → export pipeline.
- Concepts: Systematic exposition of the core data model: Atomistic, Block, Frame, Box, Trajectory, ForceField, and their inter-relationships.
- Guides: Task-oriented executable notebooks covering chemistry parsing, polymer construction, force field typification, and simulation file generation.
- Developer Guide: Conventions, extension patterns, and internal architecture for contributors and library developers.