Automatic Polymer Builder¶
"How do I connect monomers using an explicit reaction, but still build the polymer automatically?"
This guide shows how to plug a chemistry step (Reacter) into the automated PolymerBuilder.
What We Will Do¶
- Load an OPLS-AA force field
- Build and type two monomers from BigSMILES (EO2, PS)
- Define a dehydration reaction and connect it via
ReacterConnector - Build three polymers (linear, ring, branched)
- Export each structure to LAMMPS
.data
Requirements & Outputs¶
- RDKit is required for 3D coordinate generation (
RDKitAdapter/Generate3D). - Outputs are written under
user-guide-output/03_polymer_smiles/(safe to delete).
Setup: Imports and Output Directory¶
We keep all generated artifacts in a single predictable folder so the notebook is safe to re-run and easy to clean up.
from pathlib import Path
import numpy as np
import molpy as mp
from molpy.builder.polymer import PolymerBuilder
from molpy.builder.polymer.connectors import ReacterConnector
from molpy.builder.polymer.placer import CovalentSeparator, LinearOrienter, Placer
from molpy.builder.polymer.port_utils import get_all_port_info
from molpy.core.atomistic import Atomistic
from molpy.adapter.rdkit import RDKitAdapter
from molpy.compute.rdkit import OptimizeGeometry, Generate3D
from molpy.io.data.lammps import LammpsDataWriter
from molpy.parser.smiles import bigsmilesir_to_polymerspec, parse_bigsmiles
from molpy.reacter import (
Reacter,
form_single_bond,
select_dehydration_left,
select_dehydration_right,
select_hydroxyl_group,
select_hydroxyl_h_only,
)
from molpy.typifier.atomistic import OplsAtomisticTypifier
OUTPUT_DIR = Path("user-guide-output") / "03_polymer_smiles"
OUTPUT_DIR.mkdir(parents=True, exist_ok=True)
print(f"Output directory: {OUTPUT_DIR.resolve()}")
Output directory: /opt/buildhome/repo/docs/user-guide/user-guide-output/03_polymer_smiles
Helper: Export LAMMPS Data¶
For documentation and debugging, we export each built polymer to a .data file that can be opened in OVITO/VMD.
This helper writes a structure-only LAMMPS data file (atoms/bonds/angles/dihedrals).
def export_to_lammps(structure: Atomistic, filepath: Path) -> None:
"""Export structure to LAMMPS data format (structure-only)."""
frame = structure.to_frame()
atoms = frame["atoms"]
n_atoms = atoms.nrows
# Normalize type fields to pure strings to avoid mixed None/str issues
for block_name in ("atoms", "bonds", "angles", "dihedrals"):
if block_name in frame and "type" in frame[block_name]:
types = frame[block_name]["type"]
frame[block_name]["type"] = np.array(
[str(t) if t is not None else "" for t in types],
dtype=str,
)
# Handle charge field: map "charge" to "q" if needed
if "charge" in atoms and "q" not in atoms:
atoms["q"] = atoms["charge"]
elif "charge" in atoms and "q" in atoms:
# If both exist, prefer "q" but warn if they differ
if not np.allclose(atoms["charge"], atoms["q"], equal_nan=True):
print("Warning: Both 'charge' and 'q' fields exist with different values. Using 'q'.")
# Add charge if missing -> zero charges
if "q" not in atoms:
atoms["q"] = np.zeros(n_atoms, dtype=float)
# Add mol ID if missing
if "mol" not in atoms:
atoms["mol"] = np.ones(n_atoms, dtype=int)
writer = LammpsDataWriter(filepath)
writer.write(frame)
Step 1: Load Force Field¶
We load the force field once and reuse it for typing all monomers.
In this notebook, monomers are typed before building; the resulting polymer carries those types forward.
ff = mp.io.read_xml_forcefield("oplsaa.xml")
typifier = OplsAtomisticTypifier(ff, strict_typing=True)
print("Force field loaded")
2025-12-30 16:08:45,763 - molpy.potential.dihedral.opls - WARNING - RB coefficients do not lie on the ideal 4-term OPLS manifold (C0+C1+C2+C3+C4 = 10.041600, expected ≈ 0). Conversion will preserve forces and relative energies exactly, but will introduce a constant energy offset of ΔE = 10.041600 kJ/mol. This does not affect MD simulations.
Force field loaded
Step 2: Build Typed Monomers¶
We define two monomers using BigSMILES.
The $ markers represent polymerization ports used by the builder. In this tutorial we use $-to-$ connections.
Monomers used:
- EO2:
{[$]OCCO[$]} - EO3
{[$]OCC(CO[$])(CO[$])} - PS:
{[$]OCC(c1ccccc1)CO[$]}
def build_monomer_from_bigsmiles(bigsmiles: str, typifier: OplsAtomisticTypifier) -> Atomistic:
"""Build a monomer from BigSMILES with 3D coordinates and OPLS typing."""
ir = parse_bigsmiles(bigsmiles)
polymerspec = bigsmilesir_to_polymerspec(ir)
monomers = polymerspec.all_monomers()
if len(monomers) != 1:
raise ValueError(f"Expected 1 monomer, got {len(monomers)}")
monomer = monomers[0]
adapter = RDKitAdapter(internal=monomer)
generate_3d = Generate3D(
add_hydrogens=True,
embed=True,
optimize=False, # Only generate coordinates, optimize later
update_internal=True,
)
adapter = generate_3d(adapter)
monomer = adapter.get_internal()
monomer.get_topo(gen_angle=True, gen_dihe=True)
for idx, atom in enumerate(monomer.atoms):
atom["id"] = idx + 1
typifier.typify(monomer)
return monomer
eo2 = build_monomer_from_bigsmiles("{[][$]OCCO[$][]}", typifier)
eo3 = build_monomer_from_bigsmiles("{[][$]OCC(CO[$])(CO[$])[]}", typifier)
ps = build_monomer_from_bigsmiles("{[][$]OCC(c1ccccc1)CO[$][]}", typifier)
print("Monomers built")
print(f" EO2: {len(eo2.atoms)} atoms, ports: {list(get_all_port_info(eo2).keys())}")
print(f" EO3: {len(eo3.atoms)} atoms, ports: {list(get_all_port_info(eo3).keys())}")
print(f" PS: {len(ps.atoms)} atoms, ports: {list(get_all_port_info(ps).keys())}")
library: dict[str, Atomistic] = {"EO2": eo2, "PS": ps, "EO3": eo3}
Monomers built EO2: 10 atoms, ports: ['$'] EO3: 17 atoms, ports: ['$'] PS: 29 atoms, ports: ['$']
Step 3: Configure Builder (Reacter + Connector + Placer)¶
We connect a dehydration Reacter to PolymerBuilder via ReacterConnector.
- The connector decides which port atoms participate in each step.
- The reacter selects anchors/leaving groups and forms a bond.
- The placer orients/positions monomers before connecting them.
For this tutorial, we connect all monomer pairs using $-to-$ ports.
dehydration_reacter = Reacter(
name="dehydration_ether_formation",
anchor_selector_left=select_dehydration_left,
anchor_selector_right=select_dehydration_right,
leaving_selector_left=select_hydroxyl_group,
leaving_selector_right=select_hydroxyl_h_only,
bond_former=form_single_bond,
)
# Map all monomer pairs to $-$ ports
port_map: dict[tuple[str, str], tuple[str, str]] = {}
for left_label in library.keys():
for right_label in library.keys():
port_map[(left_label, right_label)] = ("$", "$")
connector = ReacterConnector(default=dehydration_reacter, port_map=port_map)
placer = Placer(separator=CovalentSeparator(), orienter=LinearOrienter())
builder = PolymerBuilder(library=library, connector=connector, placer=placer, typifier=typifier)
print("Builder configured")
print(f" Library: {list(library.keys())}")
Builder configured Library: ['EO2', 'PS', 'EO3']
Step 4: Build Example Polymers and Export¶
We build three examples to show what the automated builder can do:
- Linear chain
- Ring polymer
- Simple branched structure
Each result is exported as a .data file under the output directory.
# Create optimizer (reusable)
optimizer = OptimizeGeometry(
max_opt_iters=200,
forcefield="UFF",
update_internal=True, # Coordinates will be synced to internal automatically
raise_on_failure=False,
)
# Linear chain
cgsmiles_linear = "{[#EO2]|4[#PS]}"
print(f"Building linear: {cgsmiles_linear}")
linear_result = builder.build(cgsmiles_linear)
linear_chain = linear_result.polymer
adapter = RDKitAdapter(internal=linear_chain)
adapter = optimizer(adapter) # update_internal=True syncs coordinates automatically
linear_chain = adapter.get_internal()
export_to_lammps(linear_chain, OUTPUT_DIR / "linear.data")
print(f" Atoms: {len(linear_chain.atoms)} | Bonds: {len(linear_chain.bonds)} | Steps: {linear_result.total_steps}")
print(f" Exported: {OUTPUT_DIR / "linear.data"}")
# Cyclic (ring) polymer
cgsmiles_ring = "{[#EO2]1[#PS][#EO2][#PS][#EO2]1}"
print(f"Building ring: {cgsmiles_ring}")
ring_result = builder.build(cgsmiles_ring)
ring_chain = ring_result.polymer
adapter = RDKitAdapter(internal=ring_chain)
adapter = optimizer(adapter) # update_internal=True syncs coordinates automatically
ring_chain = adapter.get_internal()
export_to_lammps(ring_chain, OUTPUT_DIR / "ring.data")
print(f" Atoms: {len(ring_chain.atoms)} | Bonds: {len(ring_chain.bonds)} | Steps: {ring_result.total_steps}")
print(f" Exported: {OUTPUT_DIR / "ring.data"}")
# Branched polymer (simple example)
cgsmiles_branch = "{[#PS][#EO3]([#PS])[#PS]}"
print(f"Building branch: {cgsmiles_branch}")
branch_result = builder.build(cgsmiles_branch)
branch_chain = branch_result.polymer
adapter = RDKitAdapter(internal=branch_chain)
adapter = optimizer(adapter) # update_internal=True syncs coordinates automatically
branch_chain = adapter.get_internal()
export_to_lammps(branch_chain, OUTPUT_DIR / "branch.data")
print(f" Atoms: {len(branch_chain.atoms)} | Bonds: {len(branch_chain.bonds)} | Steps: {branch_result.total_steps}")
print(f" Exported: {OUTPUT_DIR / "branch.data"}")
print("\nTip: open the exported files in OVITO/VMD to inspect the geometry.")
Building linear: {[#EO2]|4[#PS]}
/opt/buildhome/.asdf/installs/python/3.13.3/lib/python3.13/site-packages/molpy/compute/rdkit.py:158: UserWarning: UFF optimization returned code 1. Code 1 typically means convergence not reached within 200 iterations. The structure may still be improved. warnings.warn(msg, UserWarning)
Atoms: 57 | Bonds: 57 | Steps: 4
Exported: user-guide-output/03_polymer_smiles/linear.data
Building ring: {[#EO2]1[#PS][#EO2][#PS][#EO2]1}
Atoms: 73 | Bonds: 75 | Steps: 5
Exported: user-guide-output/03_polymer_smiles/ring.data
Building branch: {[#PS][#EO3]([#PS])[#PS]}
Atoms: 95 | Bonds: 97 | Steps: 3 Exported: user-guide-output/03_polymer_smiles/branch.data Tip: open the exported files in OVITO/VMD to inspect the geometry.
Summary¶
You built and exported polymer structures using an automated builder wired to an explicit reaction:
- Loaded OPLS-AA and created a typifier
- Built two typed monomers from BigSMILES (EO2, PS)
- Connected a dehydration
ReacterviaReacterConnector - Built linear/ring/branched examples
- Exported LAMMPS
.datafiles underuser-guide-output/03_polymer_smiles/
Next: try different CGSmiles strings and increase chain length to stress-test the workflow.