Polydisperse Systems¶

After reading this page you will be able to sample chain lengths from a target distribution, build typed copolymer chains with incremental junction re-typification, pack them into a periodic box, and export LAMMPS files.

!!! note "Prerequisites" This guide requires RDKit, Packmol, and the oplsaa.xml force field. Familiarity with Stepwise Polymer Construction is assumed.

From distribution to simulation box¶

Real polymer systems are polydisperse — chains have varying lengths. MolPy provides distribution samplers, a system planner, and packing tools to go from a target molecular weight distribution to a simulation-ready box in one continuous workflow.

The system in this guide is a statistical copolymer of styrene (Sty, 80 mol%) and methyl acrylate (MA, 20 mol%), produced by controlled radical polymerization. The corresponding GBigSMILES is:

CCOC(=O)C(C)(C){[>][<|8|]CC([>|8|])c1ccccc1, [<|2|]CC([>|2|])C(=O)OC [<]}|schulz_zimm(1400,1500)|[Br].|5e5|

The three summary statistics that describe polydispersity are the number-average molecular weight Mn, the weight-average molecular weight Mw, and the dispersity PDI = Mw/Mn, which equals 1.0 for a perfectly monodisperse sample.

Typification happens per monomer, not per chain¶

Each monomer is parsed, expanded to 3D with hydrogens, and assigned force field types individually. The builder then handles incremental re-typification at each junction during chain growth, so there is no need to re-typify the entire chain after assembly.

In [1]:

Copied!





import molpy as mp
from molpy.core.element import Element
from molpy.typifier import OplsAtomisticTypifier

ff = mp.io.read_xml_forcefield("oplsaa.xml")
typifier = OplsAtomisticTypifier(ff, strict_typing=False)

BIGSMILES = {
    "Sty": "{[][<]CC(c1ccccc1)[>][]}",
    "MA": "{[][<]CC(C(=O)OC)[>][]}",
}


def build_typed_monomer(bigsmiles, typifier):
    monomer = mp.parser.parse_monomer(bigsmiles)
    monomer = mp.tool.generate_3d(monomer, add_hydrogens=True, optimize=False)
    monomer = monomer.get_topo(gen_angle=True, gen_dihe=True)
    monomer = typifier.typify(monomer)
    return monomer


library = {label: build_typed_monomer(bs, typifier) for label, bs in BIGSMILES.items()}

monomer_mass = {}
for label, mon in library.items():
    mass = sum(Element(a.get("element")).mass for a in mon.atoms)
    ports = [a.get("port") for a in mon.atoms if a.get("port")]
    monomer_mass[label] = mass
    print(f"{label}: atoms={len(mon.atoms)}, mass={mass:.1f}, ports={ports}")
import molpy as mp
from molpy.core.element import Element
from molpy.typifier import OplsAtomisticTypifier

ff = mp.io.read_xml_forcefield("oplsaa.xml")
typifier = OplsAtomisticTypifier(ff, strict_typing=False)

BIGSMILES = {
    "Sty": "{[][<]CC(c1ccccc1)[>][]}",
    "MA": "{[][<]CC(C(=O)OC)[>][]}",
}


def build_typed_monomer(bigsmiles, typifier):
    monomer = mp.parser.parse_monomer(bigsmiles)
    monomer = mp.tool.generate_3d(monomer, add_hydrogens=True, optimize=False)
    monomer = monomer.get_topo(gen_angle=True, gen_dihe=True)
    monomer = typifier.typify(monomer)
    return monomer


library = {label: build_typed_monomer(bs, typifier) for label, bs in BIGSMILES.items()}

monomer_mass = {}
for label, mon in library.items():
    mass = sum(Element(a.get("element")).mass for a in mon.atoms)
    ports = [a.get("port") for a in mon.atoms if a.get("port")]
    monomer_mass[label] = mass
    print(f"{label}: atoms={len(mon.atoms)}, mass={mass:.1f}, ports={ports}")

2026-04-10 07:27:00,951 - molpy.potential.dihedral.opls - WARNING - RB coefficients do not lie on the ideal 4-term OPLS manifold (C0+C1+C2+C3+C4 = 10.041600, expected ≈ 0). Conversion will preserve forces and relative energies exactly, but will introduce a constant energy offset of ΔE = 10.041600 kJ/mol. This does not affect MD simulations.

Sty: atoms=24, mass=112.2, ports=['<', '>']
MA: atoms=14, mass=88.1, ports=['<', '>']

Sampling draws chain lengths from a statistical distribution¶

The sampling layer has three components that compose cleanly. WeightedSequenceGenerator controls the monomer mole ratio (80:20 here). PolydisperseChainGenerator draws a degree of polymerization or mass from the chosen distribution for each chain. SystemPlanner accumulates chains until a target total mass is reached, stopping when the accumulated mass is within max_rel_error of the target. Four distributions are demonstrated below so that their shape differences become visible in the next section.

In [2]:

Copied!





import numpy as np
from molpy.builder.polymer import (
    SchulzZimmPolydisperse,
    UniformPolydisperse,
    PoissonPolydisperse,
    FlorySchulzPolydisperse,
    WeightedSequenceGenerator,
    PolydisperseChainGenerator,
    SystemPlanner,
)

distributions = {
    "Schulz-Zimm": SchulzZimmPolydisperse(Mn=1400, Mw=1500, random_seed=42),
    "Uniform": UniformPolydisperse(min_dp=8, max_dp=22, random_seed=42),
    "Poisson": PoissonPolydisperse(lambda_param=14, random_seed=42),
    "Flory-Schulz": FlorySchulzPolydisperse(a=0.08, random_seed=42),
}

seq_gen = WeightedSequenceGenerator(monomer_weights={"Sty": 8.0, "MA": 2.0})
target_total_mass = 5e5

results = {}
for name, dist in distributions.items():
    chain_gen = PolydisperseChainGenerator(
        seq_generator=seq_gen,
        monomer_mass=monomer_mass,
        end_group_mass=0.0,
        distribution=dist,
    )
    planner = SystemPlanner(
        chain_generator=chain_gen,
        target_total_mass=target_total_mass,
        max_rel_error=0.02,
    )
    plan = planner.plan_system(np.random.default_rng(42))
    results[name] = plan.chains

for name, chains in results.items():
    mw = np.array([c.mass for c in chains])
    Mn = float(np.mean(mw))
    Mw = float(np.sum(mw**2) / np.sum(mw))
    print(f"{name:15s}: {len(chains):4d} chains, Mn={Mn:.0f}, PDI={Mw / Mn:.3f}")
import numpy as np
from molpy.builder.polymer import (
    SchulzZimmPolydisperse,
    UniformPolydisperse,
    PoissonPolydisperse,
    FlorySchulzPolydisperse,
    WeightedSequenceGenerator,
    PolydisperseChainGenerator,
    SystemPlanner,
)

distributions = {
    "Schulz-Zimm": SchulzZimmPolydisperse(Mn=1400, Mw=1500, random_seed=42),
    "Uniform": UniformPolydisperse(min_dp=8, max_dp=22, random_seed=42),
    "Poisson": PoissonPolydisperse(lambda_param=14, random_seed=42),
    "Flory-Schulz": FlorySchulzPolydisperse(a=0.08, random_seed=42),
}

seq_gen = WeightedSequenceGenerator(monomer_weights={"Sty": 8.0, "MA": 2.0})
target_total_mass = 5e5

results = {}
for name, dist in distributions.items():
    chain_gen = PolydisperseChainGenerator(
        seq_generator=seq_gen,
        monomer_mass=monomer_mass,
        end_group_mass=0.0,
        distribution=dist,
    )
    planner = SystemPlanner(
        chain_generator=chain_gen,
        target_total_mass=target_total_mass,
        max_rel_error=0.02,
    )
    plan = planner.plan_system(np.random.default_rng(42))
    results[name] = plan.chains

for name, chains in results.items():
    mw = np.array([c.mass for c in chains])
    Mn = float(np.mean(mw))
    Mw = float(np.sum(mw**2) / np.sum(mw))
    print(f"{name:15s}: {len(chains):4d} chains, Mn={Mn:.0f}, PDI={Mw / Mn:.3f}")

Schulz-Zimm    :  374 chains, Mn=1338, PDI=1.076
Uniform        :  312 chains, Mn=1609, PDI=1.086
Poisson        :  334 chains, Mn=1500, PDI=1.073
Flory-Schulz   :  193 chains, Mn=2597, PDI=1.619

Visualising the sampled ensembles reveals the distribution shape¶

The four panels below overlay sampled histograms against their theoretical curves. Schulz-Zimm is plotted as a continuous probability density; the other three are plotted as probability mass functions over degree of polymerization. Vertical dashed lines mark Mn and Mw for each ensemble.

In [3]:

Copied!





import matplotlib.pyplot as plt
from matplotlib.ticker import MaxNLocator

# ── colour palette ──
CLR_HIST = "#6baed6"  # steel blue  – sampled histogram
CLR_EDGE = "#3182bd"  # darker blue – histogram edge
CLR_THEO = "#e6550d"  # orange      – theoretical curve
CLR_MN = "#31a354"  # green       – Mn line
CLR_MW = "#de2d26"  # red         – Mw line
CLR_BOX = "#f7f7f7"  # near-white  – annotation box


def annotate_stats(ax, Mn, Mw, PDI, n_chains):
    txt = "\n".join(
        [
            rf"$M_n = {Mn:.0f}$  g/mol",
            rf"$M_w = {Mw:.0f}$  g/mol",
            rf"PDI $= {PDI:.3f}$",
            rf"$N = {n_chains}$",
        ]
    )
    ax.text(
        0.97,
        0.97,
        txt,
        transform=ax.transAxes,
        ha="right",
        va="top",
        fontsize=6.5,
        linespacing=1.4,
        family="monospace",
        bbox=dict(
            boxstyle="round,pad=0.35",
            facecolor=CLR_BOX,
            edgecolor="0.75",
            alpha=0.95,
            linewidth=0.6,
        ),
    )


fig, axes = plt.subplots(2, 2, figsize=(7.5, 6), constrained_layout=True)

for idx, (ax, (name, chains)) in enumerate(zip(axes.flatten(), results.items())):
    dist_obj = distributions[name]
    mw = np.array([c.mass for c in chains])
    dps = np.array([c.dp for c in chains])
    Mn = float(np.mean(mw))
    Mw = float(np.sum(mw**2) / np.sum(mw))
    PDI = Mw / Mn

    if isinstance(dist_obj, SchulzZimmPolydisperse):
        # ── continuous: Freedman–Diaconis bins ──
        iqr = np.subtract(*np.percentile(mw, [75, 25]))
        bw = max(2.0 * iqr / len(mw) ** (1 / 3), 20)
        bins = np.arange(mw.min() - bw, mw.max() + 2 * bw, bw)

        ax.hist(
            mw,
            bins=bins,
            density=True,
            color=CLR_HIST,
            edgecolor=CLR_EDGE,
            linewidth=0.5,
            alpha=0.55,
        )
        M_grid = np.linspace(max(0, mw.min() * 0.3), mw.max() * 1.3, 500)
        ax.plot(
            M_grid,
            dist_obj.mass_pdf(M_grid),
            color=CLR_THEO,
            linewidth=1.6,
            label="Theory",
        )
        ax.axvline(Mn, color=CLR_MN, ls="--", lw=1, label=r"$M_n$")
        ax.axvline(Mw, color=CLR_MW, ls="--", lw=1, label=r"$M_w$")
        ax.set_xlabel(r"Molecular weight $M$ (g mol$^{-1}$)", fontsize=8)
        ax.set_ylabel("Probability density", fontsize=8)
    else:
        # ── discrete: unit-width bars centred on integers ──
        dp_min, dp_max = int(dps.min()), int(dps.max())
        counts = np.bincount(dps)[dp_min:]
        freq = counts / (counts.sum() or 1)
        x_bar = np.arange(dp_min, dp_min + len(counts))

        ax.bar(
            x_bar,
            freq,
            width=1.0,
            align="center",
            color=CLR_HIST,
            edgecolor=CLR_EDGE,
            linewidth=0.5,
            alpha=0.55,
        )

        # Theory curve extends to 99th-percentile of the sample
        if isinstance(dist_obj, UniformPolydisperse):
            support = np.arange(dp_min, dp_max + 1)
        else:
            hi = max(dp_max, int(np.percentile(dps, 99) * 1.15))
            support = np.arange(max(1, dp_min), hi + 1)
        pmf = dist_obj.dp_pmf(support)
        ax.plot(
            support,
            pmf,
            "-o",
            color=CLR_THEO,
            markersize=2.2,
            linewidth=1.3,
            markeredgewidth=0,
            label="Theory",
        )

        avg_mass = float(np.mean(mw / dps))
        ax.axvline(Mn / avg_mass, color=CLR_MN, ls="--", lw=1, label=r"$M_n$")
        ax.axvline(Mw / avg_mass, color=CLR_MW, ls="--", lw=1, label=r"$M_w$")

        # Clip x-axis so long tails don't crush the peak
        x_hi = int(np.percentile(dps, 99.5)) + 2
        ax.set_xlim(max(0, dp_min - 1), x_hi)
        ax.xaxis.set_major_locator(MaxNLocator(integer=True))
        ax.set_xlabel(r"Degree of polymerization $n$", fontsize=8)
        ax.set_ylabel("Probability", fontsize=8)

    ax.set_title(name, fontsize=9.5, fontweight="semibold", pad=6)
    ax.tick_params(labelsize=7)
    ax.spines[["top", "right"]].set_visible(False)
    annotate_stats(ax, Mn, Mw, PDI, len(chains))
    if idx == 0:
        ax.legend(fontsize=7, loc="upper left", framealpha=0.85)

plt.savefig("05_polydisperse_distributions.png", dpi=200, bbox_inches="tight")
plt.show()
import matplotlib.pyplot as plt
from matplotlib.ticker import MaxNLocator

# ── colour palette ──
CLR_HIST = "#6baed6"  # steel blue  – sampled histogram
CLR_EDGE = "#3182bd"  # darker blue – histogram edge
CLR_THEO = "#e6550d"  # orange      – theoretical curve
CLR_MN = "#31a354"  # green       – Mn line
CLR_MW = "#de2d26"  # red         – Mw line
CLR_BOX = "#f7f7f7"  # near-white  – annotation box


def annotate_stats(ax, Mn, Mw, PDI, n_chains):
    txt = "\n".join(
        [
            rf"$M_n = {Mn:.0f}$  g/mol",
            rf"$M_w = {Mw:.0f}$  g/mol",
            rf"PDI $= {PDI:.3f}$",
            rf"$N = {n_chains}$",
        ]
    )
    ax.text(
        0.97,
        0.97,
        txt,
        transform=ax.transAxes,
        ha="right",
        va="top",
        fontsize=6.5,
        linespacing=1.4,
        family="monospace",
        bbox=dict(
            boxstyle="round,pad=0.35",
            facecolor=CLR_BOX,
            edgecolor="0.75",
            alpha=0.95,
            linewidth=0.6,
        ),
    )


fig, axes = plt.subplots(2, 2, figsize=(7.5, 6), constrained_layout=True)

for idx, (ax, (name, chains)) in enumerate(zip(axes.flatten(), results.items())):
    dist_obj = distributions[name]
    mw = np.array([c.mass for c in chains])
    dps = np.array([c.dp for c in chains])
    Mn = float(np.mean(mw))
    Mw = float(np.sum(mw**2) / np.sum(mw))
    PDI = Mw / Mn

    if isinstance(dist_obj, SchulzZimmPolydisperse):
        # ── continuous: Freedman–Diaconis bins ──
        iqr = np.subtract(*np.percentile(mw, [75, 25]))
        bw = max(2.0 * iqr / len(mw) ** (1 / 3), 20)
        bins = np.arange(mw.min() - bw, mw.max() + 2 * bw, bw)

        ax.hist(
            mw,
            bins=bins,
            density=True,
            color=CLR_HIST,
            edgecolor=CLR_EDGE,
            linewidth=0.5,
            alpha=0.55,
        )
        M_grid = np.linspace(max(0, mw.min() * 0.3), mw.max() * 1.3, 500)
        ax.plot(
            M_grid,
            dist_obj.mass_pdf(M_grid),
            color=CLR_THEO,
            linewidth=1.6,
            label="Theory",
        )
        ax.axvline(Mn, color=CLR_MN, ls="--", lw=1, label=r"$M_n$")
        ax.axvline(Mw, color=CLR_MW, ls="--", lw=1, label=r"$M_w$")
        ax.set_xlabel(r"Molecular weight $M$ (g mol$^{-1}$)", fontsize=8)
        ax.set_ylabel("Probability density", fontsize=8)
    else:
        # ── discrete: unit-width bars centred on integers ──
        dp_min, dp_max = int(dps.min()), int(dps.max())
        counts = np.bincount(dps)[dp_min:]
        freq = counts / (counts.sum() or 1)
        x_bar = np.arange(dp_min, dp_min + len(counts))

        ax.bar(
            x_bar,
            freq,
            width=1.0,
            align="center",
            color=CLR_HIST,
            edgecolor=CLR_EDGE,
            linewidth=0.5,
            alpha=0.55,
        )

        # Theory curve extends to 99th-percentile of the sample
        if isinstance(dist_obj, UniformPolydisperse):
            support = np.arange(dp_min, dp_max + 1)
        else:
            hi = max(dp_max, int(np.percentile(dps, 99) * 1.15))
            support = np.arange(max(1, dp_min), hi + 1)
        pmf = dist_obj.dp_pmf(support)
        ax.plot(
            support,
            pmf,
            "-o",
            color=CLR_THEO,
            markersize=2.2,
            linewidth=1.3,
            markeredgewidth=0,
            label="Theory",
        )

        avg_mass = float(np.mean(mw / dps))
        ax.axvline(Mn / avg_mass, color=CLR_MN, ls="--", lw=1, label=r"$M_n$")
        ax.axvline(Mw / avg_mass, color=CLR_MW, ls="--", lw=1, label=r"$M_w$")

        # Clip x-axis so long tails don't crush the peak
        x_hi = int(np.percentile(dps, 99.5)) + 2
        ax.set_xlim(max(0, dp_min - 1), x_hi)
        ax.xaxis.set_major_locator(MaxNLocator(integer=True))
        ax.set_xlabel(r"Degree of polymerization $n$", fontsize=8)
        ax.set_ylabel("Probability", fontsize=8)

    ax.set_title(name, fontsize=9.5, fontweight="semibold", pad=6)
    ax.tick_params(labelsize=7)
    ax.spines[["top", "right"]].set_visible(False)
    annotate_stats(ax, Mn, Mw, PDI, len(chains))
    if idx == 0:
        ax.legend(fontsize=7, loc="upper left", framealpha=0.85)

plt.savefig("05_polydisperse_distributions.png", dpi=200, bbox_inches="tight")
plt.show()

No description has been provided for this image

Radical addition couples monomers without leaving groups¶

The reaction here is radical addition: each connection removes one hydrogen from each backbone carbon and forms a new C–C bond. This differs from the dehydration condensation used in earlier guides — there are no hydroxyl leaving groups, only hydrogen removal from both sides.

The typifier is passed to PolymerBuilder so that incremental re-typification handles the junction atoms at each coupling step. There is no whole-chain typification after building; only the atoms immediately flanking each new bond are re-typed.

In [4]:

Copied!





from molpy.builder.polymer import (
    Connector,
    CovalentSeparator,
    LinearOrienter,
    Placer,
    PolymerBuilder,
)
from molpy.reacter import (
    Reacter,
    form_single_bond,
    select_hydrogens,
    select_self,
)

rxn = Reacter(
    name="addition",
    anchor_selector_left=select_self,
    anchor_selector_right=select_self,
    leaving_selector_left=select_hydrogens(1),
    leaving_selector_right=select_hydrogens(1),
    bond_former=form_single_bond,
)

rules = {(l, r): (">", "<") for l in library for r in library}
connector = Connector(port_map=rules, reacter=rxn)
placer = Placer(
    separator=CovalentSeparator(buffer=-0.1),
    orienter=LinearOrienter(),
)
builder = PolymerBuilder(
    library=library,
    connector=connector,
    placer=placer,
    typifier=typifier,  # incremental re-typification at junctions
)

sz_chains = results["Schulz-Zimm"]
atomistic_chains = []
n_chains = 10  # truncated for this tutorial; use len(sz_chains) for a production run
for i, chain in enumerate(sz_chains[:n_chains]):
    labels = " ".join(f"[#{m}]" for m in chain.monomers)
    cgsmiles = "{" + labels + "}"
    result = builder.build(cgsmiles)
    atomistic_chains.append(result.polymer)
    if (i + 1) % 5 == 0:
        print(f"  built {i + 1}/{len(sz_chains)} chains ...")

total_atoms = sum(len(c.atoms) for c in atomistic_chains)
print(f"built {len(atomistic_chains)} chains, total atoms: {total_atoms}")
from molpy.builder.polymer import (
    Connector,
    CovalentSeparator,
    LinearOrienter,
    Placer,
    PolymerBuilder,
)
from molpy.reacter import (
    Reacter,
    form_single_bond,
    select_hydrogens,
    select_self,
)

rxn = Reacter(
    name="addition",
    anchor_selector_left=select_self,
    anchor_selector_right=select_self,
    leaving_selector_left=select_hydrogens(1),
    leaving_selector_right=select_hydrogens(1),
    bond_former=form_single_bond,
)

rules = {(l, r): (">", "<") for l in library for r in library}
connector = Connector(port_map=rules, reacter=rxn)
placer = Placer(
    separator=CovalentSeparator(buffer=-0.1),
    orienter=LinearOrienter(),
)
builder = PolymerBuilder(
    library=library,
    connector=connector,
    placer=placer,
    typifier=typifier,  # incremental re-typification at junctions
)

sz_chains = results["Schulz-Zimm"]
atomistic_chains = []
n_chains = 10  # truncated for this tutorial; use len(sz_chains) for a production run
for i, chain in enumerate(sz_chains[:n_chains]):
    labels = " ".join(f"[#{m}]" for m in chain.monomers)
    cgsmiles = "{" + labels + "}"
    result = builder.build(cgsmiles)
    atomistic_chains.append(result.polymer)
    if (i + 1) % 5 == 0:
        print(f"  built {i + 1}/{len(sz_chains)} chains ...")

total_atoms = sum(len(c.atoms) for c in atomistic_chains)
print(f"built {len(atomistic_chains)} chains, total atoms: {total_atoms}")

  built 5/374 chains ...

  built 10/374 chains ...
built 10 chains, total atoms: 2080

Packing and exporting follow the same pattern as earlier guides¶

The box size follows from total molecular weight and target density. Each chain is added to the packer as an individual target with count 1, and the packed frame is written as a LAMMPS data file together with the force field.

In [5]:

Copied!





from pathlib import Path
from molpy.pack import InsideBoxConstraint, Molpack

total_mw = sum(
    sum(Element(a.get("element")).mass for a in c.atoms) for c in atomistic_chains
)
target_density = 0.05  # g/cm^3 (use ~1.0 for production)
volume = (total_mw / 6.022e23) / target_density * 1e24
box_length = volume ** (1 / 3)

packer = Molpack(workdir=Path("05_output/packmol"))
constraint = InsideBoxConstraint(
    length=np.array([box_length] * 3),
    origin=np.zeros(3),
)
for chain in atomistic_chains:
    packer.add_target(chain.to_frame(), number=1, constraint=constraint)

packed = packer.optimize(max_steps=10000, seed=42)
packed.box = mp.Box.cubic(length=box_length)

mp.io.write_lammps_system("05_output/lammps", packed, ff)
print(f"packed: {packed['atoms'].nrows} atoms, box: {box_length:.1f} A")
from pathlib import Path
from molpy.pack import InsideBoxConstraint, Molpack

total_mw = sum(
    sum(Element(a.get("element")).mass for a in c.atoms) for c in atomistic_chains
)
target_density = 0.05  # g/cm^3 (use ~1.0 for production)
volume = (total_mw / 6.022e23) / target_density * 1e24
box_length = volume ** (1 / 3)

packer = Molpack(workdir=Path("05_output/packmol"))
constraint = InsideBoxConstraint(
    length=np.array([box_length] * 3),
    origin=np.zeros(3),
)
for chain in atomistic_chains:
    packer.add_target(chain.to_frame(), number=1, constraint=constraint)

packed = packer.optimize(max_steps=10000, seed=42)
packed.box = mp.Box.cubic(length=box_length)

mp.io.write_lammps_system("05_output/lammps", packed, ff)
print(f"packed: {packed['atoms'].nrows} atoms, box: {box_length:.1f} A")

packed: 2080 atoms, box: 71.5 A

The engine assembles a runnable input script from the exported data¶

Writing the data file is only half the story. To actually run the simulation, LAMMPS needs an input script that says how to read that file, which force field styles to activate, and what protocol to follow. MolPy models this through LAMMPSEngine, which pairs a Script object with subprocess management.

A Script is an editable, ordered list of lines that can be built programmatically and saved to disk without executing anything. This separation matters: you can inspect, modify, and version-control the script before committing to a run. When you are ready, engine.run() writes the script to the working directory and launches lmp -in input.lmp -log log.lammps -screen none.

The code below builds a minimal OPLS-AA equilibration protocol for the packed system. The force field styles must match those written by write_lammps_system—harmonic bonds and angles, opls dihedrals, lj/cut/coul/long non-bonds—because LAMMPS validates style consistency when it reads the data file.

In [6]:

Copied!





from molpy.core.script import Script
from molpy.engine import LAMMPSEngine

# Build the LAMMPS input script line-by-line.
# Script.from_text() dedents and normalises the block.
lmp_script = Script.from_text(
    name="input",
    language="other",
    text="""
        # Polydisperse PS/PMA system — generated by MolPy
        units           real
        atom_style      full

        read_data       lammps.data
        include         lammps.ff

        pair_style      lj/cut/coul/long 12.0
        pair_modify     mix arithmetic tail yes
        kspace_style    pppm 1e-4

        bond_style      harmonic
        angle_style     harmonic
        dihedral_style  opls
        improper_style  cvff

        # Energy minimisation before dynamics
        minimize        1.0e-4 1.0e-6 10000 100000

        timestep        1.0
        thermo          1000
        thermo_style    custom step temp press etotal

        # NVT equilibration at 300 K
        fix             nvt all nvt temp 300.0 300.0 100.0
        run             100000
    """,
)

# Save the script alongside the data files without launching LAMMPS.
# check_executable=False lets the call succeed in notebooks where lmp
# may not be on PATH.
engine = LAMMPSEngine("lmp", check_executable=False)
script_path = lmp_script.save("05_output/lammps/input.lmp")
print("Input script written to:", script_path)
print(lmp_script.preview(max_lines=12))

# To run the simulation, replace the two lines above with:
#   result = engine.run(lmp_script, workdir="05_output/lammps")
#   print("Exit code:", result.returncode)
from molpy.core.script import Script
from molpy.engine import LAMMPSEngine

# Build the LAMMPS input script line-by-line.
# Script.from_text() dedents and normalises the block.
lmp_script = Script.from_text(
    name="input",
    language="other",
    text="""
        # Polydisperse PS/PMA system — generated by MolPy
        units           real
        atom_style      full

        read_data       lammps.data
        include         lammps.ff

        pair_style      lj/cut/coul/long 12.0
        pair_modify     mix arithmetic tail yes
        kspace_style    pppm 1e-4

        bond_style      harmonic
        angle_style     harmonic
        dihedral_style  opls
        improper_style  cvff

        # Energy minimisation before dynamics
        minimize        1.0e-4 1.0e-6 10000 100000

        timestep        1.0
        thermo          1000
        thermo_style    custom step temp press etotal

        # NVT equilibration at 300 K
        fix             nvt all nvt temp 300.0 300.0 100.0
        run             100000
    """,
)

# Save the script alongside the data files without launching LAMMPS.
# check_executable=False lets the call succeed in notebooks where lmp
# may not be on PATH.
engine = LAMMPSEngine("lmp", check_executable=False)
script_path = lmp_script.save("05_output/lammps/input.lmp")
print("Input script written to:", script_path)
print(lmp_script.preview(max_lines=12))

# To run the simulation, replace the two lines above with:
#   result = engine.run(lmp_script, workdir="05_output/lammps")
#   print("Exit code:", result.returncode)

Input script written to: 05_output/lammps/input.lmp
 1 | 
 2 | # Polydisperse PS/PMA system — generated by MolPy
 3 | units           real
 4 | atom_style      full
 5 | 
 6 | read_data       lammps.data
 7 | include         lammps.ff
 8 | 
 9 | pair_style      lj/cut/coul/long 12.0
10 | pair_modify     mix arithmetic tail yes
11 | kspace_style    pppm 1e-4
12 | 
... (15 more lines)

GBigSMILES encodes the whole recipe in one string¶

If the repeat units, their weights, the distribution, and the target mass are already known, the entire specification can be written as a single GBigSMILES string. The parser extracts the monomer identities, stochastic weights, distribution parameters, and system mass directly from the notation, making the string self-documenting and portable.

In [7]:

Copied!





gbigsmiles = (
    "CCOC(=O)C(C)(C)"  # initiator
    "{[>]"
    "[<|8|]CC([>|8|])c1ccccc1,"  # styrene (80 mol%)
    " [<|2|]CC([>|2|])C(=O)OC"  # methyl acrylate (20 mol%)
    " [<]}"
    "|schulz_zimm(1400,1500)|"  # Mn=1400, Mw=1500
    "[Br]"  # end group
    ".|5e5|"  # total system mass
)
ir = mp.parser.parse_gbigsmiles(gbigsmiles)
print(f"molecules: {len(ir.molecules)}, total_mass: {ir.total_mass}")
gbigsmiles = (
    "CCOC(=O)C(C)(C)"  # initiator
    "{[>]"
    "[<|8|]CC([>|8|])c1ccccc1,"  # styrene (80 mol%)
    " [<|2|]CC([>|2|])C(=O)OC"  # methyl acrylate (20 mol%)
    " [<]}"
    "|schulz_zimm(1400,1500)|"  # Mn=1400, Mw=1500
    "[Br]"  # end group
    ".|5e5|"  # total system mass
)
ir = mp.parser.parse_gbigsmiles(gbigsmiles)
print(f"molecules: {len(ir.molecules)}, total_mass: {ir.total_mass}")

molecules: 1, total_mass: 500000.0

Troubleshooting¶

Step	Check
Monomer mass wrong	Verify monomer has explicit hydrogens before mass calculation
SystemPlanner total mass off	Check `max_rel_error` setting
Chain topology missing	Call `get_topo(gen_angle=True, gen_dihe=True)` before typification
Untyped atoms at junction	Pass `typifier` to `PolymerBuilder` for incremental re-typification
Packing fails	Lower target density or increase `max_steps`