Builder

System assembly: polymer chain construction from CGSmiles topology and monomer libraries.

Quick reference

| Symbol | Summary | Preferred for |
| --- | --- | --- |
| PolymerBuilder | Build chains from CGSmiles + library + connector + placer | Full control over assembly |
| polymer(cgsmiles, ...) | Tool: CGSmiles → chain in one call | Quick prototyping |
| Connector | Port selection rules + reaction binding | Defining which ports react |
| Placer | Geometric placement (separator + orienter) | Controlling inter-monomer geometry |
| CovalentSeparator | Covalent radii-based distance | Default monomer spacing |
| LinearOrienter | Linear chain orientation | Default growth direction |

Canonical example

from molpy.builder.polymer import (
    PolymerBuilder, Connector, Placer,
    CovalentSeparator, LinearOrienter,
)
from molpy.tool import polymer

# eo_template (an Atomistic EO monomer) and rxn (a Reacter) must be defined beforehand.
builder = PolymerBuilder(
    library={"EO": eo_template},
    connector=Connector(port_map={("EO","EO"): (">","<")}, reacter=rxn),
    placer=Placer(separator=CovalentSeparator(buffer=-0.1),
                  orienter=LinearOrienter()),
)
result = builder.build("{[#EO]|10}")
chain = result.polymer

# Or use the tool function:
result = polymer("{[#EO]|10}", library={"EO": eo_template}, reacter=rxn)
chain = result.polymer

Full API

Crystal

crystal

Crystal lattice builder module - LAMMPS-style crystal structure generator.

This module provides tools for creating crystal structures:
  • Define Bravais lattices with basis sites
  • Predefined common lattice types (SC, BCC, FCC, rocksalt)
  • Define regions in lattice or Cartesian coordinates
  • Efficient vectorized unit cell tiling and atom generation

Example

lat = Lattice.cubic_fcc(a=3.52, species="Ni")
region = BlockRegion(0, 10, 0, 10, 0, 10, coord_system="lattice")
builder = CrystalBuilder(lat)
structure = builder.build_block(region)

BlockRegion

BlockRegion(xmin, xmax, ymin, ymax, zmin, zmax, coord_system='lattice')

Bases: Region

Axis-aligned box region

Define a box region specified by x, y, z ranges.

Parameters

xmin, xmax : float
    x-direction range [xmin, xmax]
ymin, ymax : float
    y-direction range [ymin, ymax]
zmin, zmax : float
    z-direction range [zmin, zmax]
coord_system : CoordSystem, optional
    Coordinate system, default is "lattice"

Examples
Region in lattice coordinates

region = BlockRegion(0, 10, 0, 10, 0, 10, coord_system="lattice")

Region in Cartesian coordinates

region = BlockRegion(0, 30, 0, 30, 0, 30, coord_system="cartesian")

Initialize box region

Parameters

xmin, xmax : float
    x-direction range
ymin, ymax : float
    y-direction range
zmin, zmax : float
    z-direction range
coord_system : CoordSystem, optional
    Coordinate system

contains_mask
contains_mask(points)

Check if points are in the box (vectorized)

Parameters

points : np.ndarray Point coordinates array of shape (N, 3)

Returns

np.ndarray Boolean array of shape (N,)

Examples

region = BlockRegion(0, 10, 0, 10, 0, 10)
points = np.array([[5, 5, 5], [15, 5, 5]])
mask = region.contains_mask(points)
print(mask)  # [ True False]

CrystalBuilder

CrystalBuilder(lattice)

Crystal structure builder

Efficiently generate crystal structures using NumPy vectorized operations. Supports tiling lattices and creating atoms in specified regions.

Parameters

lattice : Lattice Lattice definition to use

Examples
Create a simple FCC structure

lat = Lattice.cubic_fcc(a=3.52, species="Ni")
region = BlockRegion(0, 10, 0, 10, 0, 10, coord_system="lattice")
builder = CrystalBuilder(lat)
structure = builder.build_block(region)
print(len(structure.atoms))

Initialize crystal builder

Parameters

lattice : Lattice Lattice definition

build_block
build_block(region, *, i_range=None, j_range=None, k_range=None)

Build crystal structure within a box region

This method efficiently generates crystal structures using vectorized operations:
1. Determine cell index ranges to tile
2. Use NumPy meshgrid and broadcasting to generate all atom positions
3. Apply region filtering
4. Create and return Atomistic structure

Parameters

region : BlockRegion
    Box defining the region for atom generation
i_range, j_range, k_range : range | None, optional
    Explicitly specify cell index ranges. If not provided:
    • For "lattice" coordinate system: inferred from region boundaries
    • For "cartesian" coordinate system: must be provided, otherwise raises error

Returns

Atomistic Generated crystal structure containing atoms and box information

Raises

ValueError If coord_system == "cartesian" and explicit ranges are not provided

Examples
Using lattice coordinates (auto-infer ranges)

lat = Lattice.cubic_sc(a=2.0, species="Cu")
region = BlockRegion(0, 10, 0, 10, 0, 10, coord_system="lattice")
builder = CrystalBuilder(lat)
structure = builder.build_block(region)

Using explicit ranges

structure = builder.build_block(
    region,
    i_range=range(0, 5),
    j_range=range(0, 5),
    k_range=range(0, 5),
)

Cartesian coordinates (must provide ranges)

region_cart = BlockRegion(0, 20, 0, 20, 0, 20, coord_system="cartesian")
structure = builder.build_block(
    region_cart,
    i_range=range(0, 10),
    j_range=range(0, 10),
    k_range=range(0, 10),
)

Notes
  • This method uses no Python loops; it relies entirely on NumPy vectorized operations
  • The generated structure contains:
      • Atom positions (Cartesian coordinates)
      • Atom species
      • Box information (lattice vectors)
  • For an empty basis (no basis sites), an empty Atomistic structure is returned
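
The tiling steps listed for build_block can be sketched in plain Python. This is only an illustration under stated assumptions: it uses explicit loops for readability (the method itself is described as fully vectorized), and `tile_block`, `FCC_BASIS`, and the tuple-based `bounds` argument are hypothetical names that are not part of the library.

```python
from itertools import product

# Fractional basis of a conventional FCC cell (corner + three face centers).
FCC_BASIS = [(0.0, 0.0, 0.0), (0.5, 0.5, 0.0), (0.5, 0.0, 0.5), (0.0, 0.5, 0.5)]


def tile_block(cell, basis, i_range, j_range, k_range, bounds):
    """Tile `basis` over the cell index ranges and keep sites inside `bounds`.

    cell   : three lattice vectors (rows), Cartesian
    bounds : (xmin, xmax, ymin, ymax, zmin, zmax) in lattice units, inclusive
    """
    xmin, xmax, ymin, ymax, zmin, zmax = bounds
    positions = []
    for i, j, k in product(i_range, j_range, k_range):  # step 1: cell indices
        for u, v, w in basis:                           # step 2: offset + basis site
            fu, fv, fw = i + u, j + v, k + w
            # step 3: region filter, here in lattice coordinates
            if not (xmin <= fu <= xmax and ymin <= fv <= ymax and zmin <= fw <= zmax):
                continue
            # fractional -> Cartesian (frac @ cell, lattice vectors as rows)
            positions.append(tuple(
                fu * cell[0][c] + fv * cell[1][c] + fw * cell[2][c] for c in range(3)
            ))
    return positions


a = 3.52
cell = [(a, 0.0, 0.0), (0.0, a, 0.0), (0.0, 0.0, a)]
atoms = tile_block(cell, FCC_BASIS, range(2), range(2), range(2), (0, 2, 0, 2, 0, 2))
print(len(atoms))
```

A 2 × 2 × 2 block of a four-site basis gives 32 positions here because every site falls inside the inclusive bounds; a tighter region would drop some of them in step 3.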

Lattice

Lattice(a1, a2, a3, basis)

Bravais lattice with basis sites.

This class defines a crystal lattice structure, including lattice vectors and basis sites. Lattice vectors define the shape and size of the unit cell, while basis sites define the positions of atoms within the cell (in fractional coordinates).

Parameters

a1, a2, a3 : np.ndarray
    Lattice vectors, each a NumPy array of shape (3,)
basis : list[Site]
    List of basis sites in fractional coordinates

Attributes

a1, a2, a3 : np.ndarray
    Lattice vectors
basis : list[Site]
    List of basis sites

Examples
Create simple cubic lattice

lat = Lattice.cubic_sc(a=2.0, species="Cu")

Create face-centered cubic lattice

lat = Lattice.cubic_fcc(a=3.52, species="Ni")

Create rocksalt structure

lat = Lattice.rocksalt(a=5.64, species_a="Na", species_b="Cl")

Initialize lattice

Parameters

a1, a2, a3 : np.ndarray
    Lattice vectors of shape (3,)
basis : list[Site]
    List of basis sites

cell property
cell

Return 3×3 cell matrix with lattice vectors as rows

Returns

np.ndarray Matrix of shape (3, 3), each row is a lattice vector [a1; a2; a3]

add_site
add_site(site)

Add a basis site

Parameters

site : Site Basis site to add

cubic_bcc classmethod
cubic_bcc(a, species)

Create a body-centered cubic (BCC) lattice

The body-centered cubic lattice has two atoms per unit cell: one at the corner and one at the body center.

Parameters

a : float
    Lattice constant (in Å)
species : str
    Atomic species (e.g., "Fe", "W")

Returns

Lattice
    Body-centered cubic lattice

Examples

lat = Lattice.cubic_bcc(a=3.0, species="Fe")
print(len(lat.basis))  # 2

cubic_fcc classmethod
cubic_fcc(a, species)

Create a face-centered cubic (FCC) lattice

The face-centered cubic lattice has four atoms per unit cell: one at the corner and three at face centers (each of the six faces is shared between two cells).

Parameters

a : float
    Lattice constant (in Å)
species : str
    Atomic species (e.g., "Ni", "Cu", "Al")

Returns

Lattice
    Face-centered cubic lattice

Examples

lat = Lattice.cubic_fcc(a=3.52, species="Ni")
print(len(lat.basis))  # 4

cubic_sc classmethod
cubic_sc(a, species)

Create a simple cubic (SC) lattice

The simple cubic lattice is the simplest lattice type, with one atom per unit cell.

Parameters

a : float
    Lattice constant (in Å)
species : str
    Atomic species (e.g., "Cu", "Fe")

Returns

Lattice
    Simple cubic lattice

Examples

lat = Lattice.cubic_sc(a=2.0, species="Cu")
print(len(lat.basis))  # 1

frac_to_cart
frac_to_cart(frac)

Convert fractional coordinates to Cartesian coordinates

Fractional coordinates (u, v, w) represent position relative to lattice vectors: cart = u*a1 + v*a2 + w*a3 = frac @ cell

Parameters

frac : np.ndarray Fractional coordinates of shape (N, 3) or (3,), containing (u, v, w) values

Returns

np.ndarray Cartesian coordinates, same shape as input

Examples

lat = Lattice.cubic_sc(a=2.0, species="Cu")
frac = np.array([0.5, 0.5, 0.5])
cart = lat.frac_to_cart(frac)
print(cart)  # [1. 1. 1.]
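
The conversion can be written out by hand to make the row convention of the cell matrix explicit. The following is a minimal sketch, assuming plain tuples in place of NumPy arrays; the free function `frac_to_cart(frac, cell)` below is a hypothetical stand-in, not the library method.

```python
def frac_to_cart(frac, cell):
    """Return u*a1 + v*a2 + w*a3 for one fractional point (u, v, w).

    `cell` holds the lattice vectors as rows: [a1, a2, a3].
    """
    u, v, w = frac
    a1, a2, a3 = cell
    return tuple(u * a1[c] + v * a2[c] + w * a3[c] for c in range(3))


# Simple cubic cell with a = 2.0
cubic = [(2.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 2.0)]
center = frac_to_cart((0.5, 0.5, 0.5), cubic)

# In a sheared (non-orthogonal) cell, one step along v follows a2, not the y axis
sheared = [(2.0, 0.0, 0.0), (1.0, 2.0, 0.0), (0.0, 0.0, 3.0)]
corner = frac_to_cart((0.0, 1.0, 0.0), sheared)
```

For the cubic cell the body center lands at (1.0, 1.0, 1.0); in the sheared cell the fractional point (0, 1, 0) maps onto a2 itself.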

rocksalt classmethod
rocksalt(a, species_a, species_b)

Create rocksalt (NaCl) structure

Rocksalt structure consists of two interpenetrating FCC sublattices. Each unit cell contains 4 A atoms and 4 B atoms.

Parameters

a : float
    Lattice constant (in Å)
species_a : str
    First atomic species (e.g., "Na")
species_b : str
    Second atomic species (e.g., "Cl")

Returns

Lattice
    Rocksalt structure lattice

Examples

lat = Lattice.rocksalt(a=5.64, species_a="Na", species_b="Cl")
print(len(lat.basis))  # 8

Region

Region(coord_system='lattice')

Bases: ABC

Abstract geometric region class

Define a spatial region that can be represented in lattice or Cartesian coordinates.

Parameters

coord_system : CoordSystem
    Coordinate system, "lattice" or "cartesian"
    • "lattice": point coordinates in lattice units
    • "cartesian": point coordinates in Cartesian coordinates (Å)

Notes

Subclasses must implement contains_mask method using NumPy vectorized operations to efficiently check if multiple points are in the region.

Initialize region

Parameters

coord_system : CoordSystem, optional Coordinate system, default is "lattice"

contains_mask abstractmethod
contains_mask(points)

Check if points are in the region (vectorized)

Parameters

points : np.ndarray Point coordinates array of shape (N, 3)

Returns

np.ndarray Boolean array of shape (N,), True indicates point is in region

Notes
  • If coord_system == "lattice": points are in lattice units
  • If coord_system == "cartesian": points are in Cartesian coordinates
  • Must use vectorized operations, no Python loops

Site dataclass

Site(label, species, frac, charge=0.0, meta=None)

Lattice basis site in fractional coordinates.

Attributes

label : str
    Site identifier or name (e.g., "A", "B1")
species : str
    Chemical species or type name (e.g., "Ni", "Na", "Cl")
frac : tuple[float, float, float]
    Fractional coordinates (u, v, w) relative to the Bravais cell, typically in [0, 1)
charge : float, optional
    Charge, default is 0.0
meta : dict[str, Any] | None, optional
    Optional metadata dictionary

Examples

site = Site(label="A", species="Cu", frac=(0.0, 0.0, 0.0))
site_charged = Site(label="Na", species="Na", frac=(0.0, 0.0, 0.0), charge=1.0)

Polymer

polymer

Polymer assembly module.

Provides linear polymer assembly with both topology-only and chemical reaction connectors, plus optional geometric placement via Placer strategies.

Chain dataclass

Chain(dp, monomers, mass)

Represents a single polymer chain.

Attributes:

| Name | Type | Description |
| --- | --- | --- |
| dp | int | Degree of polymerization (number of monomers) |
| monomers | list[str] | List of monomer identifiers in the chain |
| mass | float | Total mass of the chain (g/mol) |

Connector

Connector(reacter, *, port_map=None, overrides=None)

Select ports and execute reactions between adjacent monomers.

Port selection strategy (applied in order):
1. Explicit port_map lookup for (left_label, right_label)
2. Compatibility: > on left pairs with < on right
3. Single-port: each side has exactly one unconsumed port
4. Common name: both sides share a port name (for $ ports)
5. Raise AmbiguousPortsError
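
The priority order above can be illustrated with a small stand-alone function. This is a sketch under stated assumptions, not the Connector implementation: ports are reduced to sets of names, `select_port_names` is a hypothetical helper, the local `AmbiguousPortsError` is a stand-in definition, and rule 4 is assumed to apply only when exactly one name is shared.

```python
class AmbiguousPortsError(Exception):
    """Stand-in for the error named in the docs (hypothetical definition)."""


def select_port_names(left_label, right_label, left_ports, right_ports, port_map=None):
    """Return (left_port_name, right_port_name) following the documented priority."""
    # 1. Explicit port_map lookup
    if port_map and (left_label, right_label) in port_map:
        return port_map[(left_label, right_label)]
    # 2. Compatibility: ">" on the left pairs with "<" on the right
    if ">" in left_ports and "<" in right_ports:
        return (">", "<")
    # 3. Single-port: exactly one port available on each side
    if len(left_ports) == 1 and len(right_ports) == 1:
        return (next(iter(left_ports)), next(iter(right_ports)))
    # 4. Common name: both sides share exactly one port name (e.g. "$")
    shared = sorted(set(left_ports) & set(right_ports))
    if len(shared) == 1:
        return (shared[0], shared[0])
    # 5. Nothing matched unambiguously
    raise AmbiguousPortsError(f"cannot pair ports for {left_label!r} -> {right_label!r}")


pair = select_port_names("EO", "EO", {"<", ">"}, {"<", ">"})
```

Putting the explicit map first lets a caller override the directional default for one monomer pair without touching the rest of the chain.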

connect
connect(left, right, left_type, right_type, port_atom_L, port_atom_R, typifier=None)

Execute the chemical reaction between two structures.

get_reacter
get_reacter(left_type, right_type)

Get the appropriate Reacter for a structure pair.

select_ports
select_ports(left, right, left_ports, right_ports, ctx)

Select which ports to connect.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| left | Atomistic | Left Atomistic structure. | required |
| right | Atomistic | Right Atomistic structure. | required |
| left_ports | Mapping[str, list[Atom]] | Available ports on left (name -> list[Atom]). | required |
| right_ports | Mapping[str, list[Atom]] | Available ports on right (name -> list[Atom]). | required |
| ctx | ConnectorContext | Context with step info and labels. | required |

Returns:

| Type | Description |
| --- | --- |
| tuple[str, int, str, int, None] | (left_port_name, left_idx, right_port_name, right_idx, None) |

ConnectorContext

Bases: dict[str, Any]

Shared context passed to the connector during linear build.

Keys:
  • step: int (current connection step index)
  • left_label: str (label of left monomer)
  • right_label: str (label of right monomer)
  • sequence: list[str] (full sequence being built)

CovalentSeparator

CovalentSeparator(buffer=0.0)

Separator based on typical bond lengths (for bonded atoms).

Uses realistic bond lengths based on element types. Typical bond lengths:
  • C-C: 1.54 Å (single), 1.34 Å (double)
  • C-O: 1.43 Å (single), 1.23 Å (double)
  • C-N: 1.47 Å (single)
  • O-H: 0.96 Å
  • N-H: 1.01 Å

Initialize covalent separator.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| buffer | float | Additional buffer distance in Angstroms (default: 0.0). Can be negative to account for slight compression. | 0.0 |

get_separation
get_separation(left_struct, right_struct, left_port, right_port)

Calculate separation based on typical bond lengths.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| left_struct | Atomistic | Previous structure in sequence | required |
| right_struct | Atomistic | Next structure to place | required |
| left_port | Atom | Connection port on left structure | required |
| right_port | Atom | Connection port on right structure | required |

Returns:

| Type | Description |
| --- | --- |
| float | Separation distance = typical_bond_length + buffer |
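
The rule "typical bond length plus buffer" can be illustrated with a lookup over the single-bond values listed for CovalentSeparator. This is a hypothetical sketch: `SINGLE_BOND_LENGTHS` and `covalent_separation` are not library names, only element symbols are used instead of Atom objects, and how the real class treats unlisted element pairs or bond orders is not documented here, so the sketch simply raises KeyError.

```python
# Typical single-bond lengths in Å, copied from the list in this section.
SINGLE_BOND_LENGTHS = {
    frozenset(("C",)): 1.54,       # C-C
    frozenset(("C", "O")): 1.43,   # C-O
    frozenset(("C", "N")): 1.47,   # C-N
    frozenset(("O", "H")): 0.96,   # O-H
    frozenset(("N", "H")): 1.01,   # N-H
}


def covalent_separation(element_left, element_right, buffer=0.0):
    """Separation in Å for two port elements; the order of the pair does not matter."""
    key = frozenset((element_left, element_right))
    return SINGLE_BOND_LENGTHS[key] + buffer


# A slightly compressed C-C contact, as in CovalentSeparator(buffer=-0.1)
cc = covalent_separation("C", "C", buffer=-0.1)
co = covalent_separation("O", "C")
```

Keying the table on a frozenset makes C-O and O-C resolve to the same entry, so the caller does not need to order the ports.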

DPDistribution

Bases: Protocol

Protocol for distributions that sample degree of polymerization directly.

Distributions implementing this protocol can sample DP values without requiring monomer mass information. This is suitable for distributions defined in DP space (e.g., Poisson, Uniform).

dp_pmf
dp_pmf(dp_array)

Probability mass function for DP values.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| dp_array | ndarray | Array of DP values | required |

Returns:

| Type | Description |
| --- | --- |
| ndarray | Array of probability mass values |

sample_dp
sample_dp(rng)

Sample degree of polymerization from distribution.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| rng | Generator | NumPy random number generator | required |

Returns:

| Type | Description |
| --- | --- |
| int | Degree of polymerization (>= 1) |

FlorySchulzPolydisperse

FlorySchulzPolydisperse(a, random_seed=None)

Flory-Schulz (geometric) distribution for degree of polymerization.

PMF: P(N = k) = a^2 * k * (1 - a)^(k-1), k = 1, 2, ...

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| a | float | Probability parameter (0 < a < 1), related to extent of reaction. | required |
| random_seed | int \| None | Optional random seed. | None |

dp_pmf
dp_pmf(dp_array)

Flory-Schulz PMF.

sample_dp
sample_dp(rng)

Sample DP from Flory-Schulz distribution (>= 1).
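
The PMF quoted for FlorySchulzPolydisperse can be checked numerically. The sketch below evaluates it and draws samples using the fact that a variable with this PMF equals the sum of two independent geometric variables (each on 1, 2, ...) minus one. It uses the standard-library `random.Random` rather than a NumPy Generator, and all function names are hypothetical; the class may sample differently.

```python
import math
import random


def flory_schulz_pmf(k, a):
    """P(N = k) = a^2 * k * (1 - a)^(k - 1) for k = 1, 2, ..."""
    return a * a * k * (1.0 - a) ** (k - 1)


def sample_geometric(a, rng):
    """Geometric variate on {1, 2, ...} with success probability a (inverse CDF)."""
    u = 1.0 - rng.random()  # u in (0, 1], avoids log(0)
    return int(math.log(u) / math.log(1.0 - a)) + 1


def sample_flory_schulz(a, rng):
    """Sum of two independent geometric variates, shifted down by one."""
    return sample_geometric(a, rng) + sample_geometric(a, rng) - 1


a = 0.2
total = sum(flory_schulz_pmf(k, a) for k in range(1, 2000))      # should be ~1
mean = sum(k * flory_schulz_pmf(k, a) for k in range(1, 2000))   # (2 - a) / a
rng = random.Random(0)
draws = [sample_flory_schulz(a, rng) for _ in range(20000)]
```

The analytic mean of this PMF is (2 - a) / a, so smaller values of a shift the distribution toward longer chains.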

GrowthKernel

Bases: Protocol

Protocol for local transition function in port-level stochastic growth.

A GrowthKernel decides which monomer (if any) to add next for a given reactive port on the growing polymer. This encapsulates the reaction probability logic from G-BigSMILES notation.

choose_next_for_port
choose_next_for_port(polymer, port, candidates, rng=None)

Choose next monomer for a given port.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| polymer | Atomistic | Current polymer structure | required |
| port | Atom | Port to extend from | required |
| candidates | Sequence[MonomerTemplate] | Available monomer templates | required |
| rng | Generator \| None | Random number generator for sampling | None |

Returns:

| Name | Type | Description |
| --- | --- | --- |
| MonomerPlacement | MonomerPlacement \| None | Add this template at target port |
| None | MonomerPlacement \| None | Terminate this port (implicit end-group) |

LinearOrienter

Orienter for linear polymer arrangement.

Aligns the next monomer so that:
1. The two port atoms are separated by the specified distance
2. The port connection axis of the next monomer aligns with the port connection axis of the previous monomer
3. The monomer extends in a linear fashion

get_orientation
get_orientation(left_struct, right_struct, left_port, right_port, separation)

Calculate linear alignment transformation.

Strategy:
1. Get direction vector from left port anchor (outward)
2. Place right structure so its port anchor is at the target position
3. Align right structure's port direction with left port direction

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| left_struct | Atomistic | Previous structure in sequence | required |
| right_struct | Atomistic | Next structure to place | required |
| left_port | Atom | Connection port on left structure | required |
| right_port | Atom | Connection port on right structure | required |
| separation | float | Distance between port anchors | required |

Returns:

| Type | Description |
| --- | --- |
| tuple[ndarray, ndarray] | Tuple of (translation_vector, rotation_matrix) |
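
The two geometric ingredients of the strategy, a target point along the outward port direction and a rotation that aligns one direction with another, can be sketched without NumPy. This is an illustration only: `target_position`, `rotation_aligning`, and `rotate` are hypothetical helpers, the rotation uses the Rodrigues form, and how LinearOrienter derives port directions or handles opposite directions is not specified here.

```python
import math


def unit(v):
    """Normalize a 3-vector."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)


def target_position(left_anchor, left_direction, separation):
    """Point at `separation` along the outward direction of the left port."""
    d = unit(left_direction)
    return tuple(p + separation * c for p, c in zip(left_anchor, d))


def rotation_aligning(src, dst):
    """3x3 rotation matrix taking direction `src` onto direction `dst`.

    Uses R = I + K + K^2 / (1 + c) with K the cross-product matrix of src x dst.
    Undefined for exactly opposite directions; this sketch raises in that case.
    """
    a, b = unit(src), unit(dst)
    v = (a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0])
    c = sum(x * y for x, y in zip(a, b))
    if abs(1.0 + c) < 1e-12:
        raise ValueError("src and dst are antiparallel")
    k = [[0.0, -v[2], v[1]], [v[2], 0.0, -v[0]], [-v[1], v[0], 0.0]]
    k2 = [[sum(k[i][m] * k[m][j] for m in range(3)) for j in range(3)] for i in range(3)]
    return [[(1.0 if i == j else 0.0) + k[i][j] + k2[i][j] / (1.0 + c) for j in range(3)]
            for i in range(3)]


def rotate(rot, v):
    """Apply a 3x3 matrix to a 3-vector."""
    return tuple(sum(rot[i][j] * v[j] for j in range(3)) for i in range(3))


target = target_position((0.0, 0.0, 0.0), (2.0, 0.0, 0.0), 1.5)
rot = rotation_aligning((1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
rotated = rotate(rot, (1.0, 0.0, 0.0))
```

A placer would then rotate the incoming monomer about its port anchor and translate the anchor onto the target point.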

MassDistribution

Bases: Protocol

Protocol for distributions that sample molecular weight directly.

Distributions implementing this protocol sample mass values directly from the distribution without converting through DP. This is suitable for distributions defined in mass space (e.g., Schulz-Zimm).

mass_pdf
mass_pdf(mass_array)

Probability density function for mass values.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| mass_array | ndarray | Array of mass values (g/mol) | required |

Returns:

| Type | Description |
| --- | --- |
| ndarray | Array of probability density values |

sample_mass
sample_mass(rng)

Sample molecular weight from distribution.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| rng | Generator | NumPy random number generator | required |

Returns:

| Type | Description |
| --- | --- |
| float | Molecular weight (g/mol, > 0) |

MonomerPlacement dataclass

MonomerPlacement(template, target_descriptor_id)

Decision for next monomer placement during stochastic growth.

Represents the output of a GrowthKernel's decision: which template to add and which port on that template to connect.

Attributes:

| Name | Type | Description |
| --- | --- | --- |
| template | MonomerTemplate | MonomerTemplate to add |
| target_descriptor_id | int | Which port descriptor on the new monomer to connect |

Example

placement = MonomerPlacement(
    template=eo_template,
    target_descriptor_id=1,  # Connect via port descriptor 1
)
print(f"Add {placement.template.label} at port {placement.target_descriptor_id}")

MonomerTemplate dataclass

MonomerTemplate(label, structure, port_descriptors, mass, metadata=dict())

Template for a monomer with port descriptors and metadata.

This represents a monomer type that can be instantiated multiple times during stochastic growth. Each instantiation creates a fresh copy of the structure.

Attributes:

| Name | Type | Description |
| --- | --- | --- |
| label | str | Monomer label (e.g., "EO2", "PS") |
| structure | Atomistic | Base Atomistic structure (will be copied on instantiation) |
| port_descriptors | dict[int, PortDescriptor] | Mapping from descriptor_id to PortDescriptor |
| mass | float | Molecular weight (g/mol) |
| metadata | dict[str, Any] | Additional metadata (optional) |

Example

template = MonomerTemplate(
    label="EO",
    structure=eo_monomer,
    port_descriptors={
        0: PortDescriptor(0, "<", role="left"),
        1: PortDescriptor(1, ">", role="right"),
    },
    mass=44.05,
)
fresh_copy = template.instantiate()
print(f"Template: {template.label}, mass={template.mass} g/mol")

get_all_descriptors
get_all_descriptors()

Get all port descriptors for this template.

Returns:

| Type | Description |
| --- | --- |
| list[PortDescriptor] | List of all PortDescriptor objects sorted by descriptor_id |

Example

template = MonomerTemplate(...)
descriptors = template.get_all_descriptors()
for desc in descriptors:
    print(f"Port {desc.descriptor_id}: {desc.port_name}")

get_port_by_descriptor
get_port_by_descriptor(descriptor_id)

Get port descriptor for a specific descriptor ID.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| descriptor_id | int | Descriptor ID to look up | required |

Returns:

| Type | Description |
| --- | --- |
| PortDescriptor \| None | PortDescriptor if found, None otherwise |

Example

template = MonomerTemplate(...)
left_port = template.get_port_by_descriptor(0)
if left_port:
    print(f"Port: {left_port.port_name}, role: {left_port.role}")

instantiate
instantiate()

Create a fresh copy of the structure.

Each instantiation is independent with separate atoms and bonds, allowing the same template to be used multiple times in a polymer.

Returns:

| Type | Description |
| --- | --- |
| Atomistic | New Atomistic instance with independent atoms and bonds |

Example

template = MonomerTemplate(label="EO", structure=eo_monomer, ...)
copy1 = template.instantiate()
copy2 = template.instantiate()
copy1 is not copy2  # True: different objects

Placer

Placer(separator, orienter)

Combined placer for positioning structures during assembly.

Uses a Separator to determine distance and an Orienter to determine orientation.

Initialize placer.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| separator | Separator | Separator for calculating distance | required |
| orienter | LinearOrienter | Orienter for calculating orientation | required |

place_monomer
place_monomer(left_struct, right_struct, left_port, right_port)

Position right_struct relative to left_struct.

Modifies right_struct's atomic coordinates in-place.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| left_struct | Atomistic | Previous structure in sequence | required |
| right_struct | Atomistic | Next structure to place | required |
| left_port | Atom | Connection port on left structure | required |
| right_port | Atom | Connection port on right structure | required |

PoissonPolydisperse

PoissonPolydisperse(lambda_param, random_seed=None)

Poisson distribution for the degree of polymerization (DP).

Zero-truncated: sampled k=0 is mapped to k=1.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| lambda_param | float | Mean of the Poisson distribution (> 0). | required |
| random_seed | int \| None | Optional random seed. | None |

dp_pmf
dp_pmf(dp_array)

Zero-truncated Poisson PMF.

sample_dp
sample_dp(rng)

Sample DP from zero-truncated Poisson distribution (>= 1).
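
The rule "sampled k=0 is mapped to k=1" can be sketched with a plain Poisson sampler followed by a clamp. This is a hypothetical illustration using Knuth's multiplication method and the standard-library `random.Random`; the class presumably draws from a NumPy Generator instead. Note that mapping 0 to 1 moves the probability of k = 0 onto k = 1, which is not the same as renormalizing the Poisson PMF over k >= 1.

```python
import math
import random


def sample_poisson(lam, rng):
    """Knuth's multiplication method; adequate for small to moderate lam."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def sample_dp(lam, rng):
    """Poisson draw with k = 0 mapped to k = 1."""
    return max(1, sample_poisson(lam, rng))


lam = 0.5
rng = random.Random(1)
draws = [sample_dp(lam, rng) for _ in range(5000)]
fraction_ones = draws.count(1) / len(draws)
# Under the clamp, P(DP = 1) = P(k = 0) + P(k = 1) = exp(-lam) * (1 + lam)
expected_ones = math.exp(-lam) * (1.0 + lam)
```

The effect of the clamp is only noticeable for small means; for lambda_param well above 1 the probability of drawing zero is negligible.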

PolydisperseChainGenerator

PolydisperseChainGenerator(seq_generator, monomer_mass, end_group_mass=0.0, distribution=None)

Middle layer: Chain-level generator.

Responsible for:
  • Sampling chain size:
      • Either in DP-space via a DPDistribution (sample_dp)
      • Or in mass-space via a MassDistribution (sample_mass)
  • Using a SequenceGenerator to build the chain sequence
  • Computing the mass of a chain using monomer mass table and optional end-group mass

Does NOT know anything about total system mass. Only returns one chain at a time.

Initialize polydisperse chain generator.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| seq_generator | SequenceGenerator | Sequence generator for generating monomer sequences | required |
| monomer_mass | dict[str, float] | Dictionary mapping monomer identifiers to their masses (g/mol) | required |
| end_group_mass | float | Mass of end groups (g/mol), default 0.0 | 0.0 |
| distribution | DPDistribution \| MassDistribution \| None | Distribution implementing DPDistribution or MassDistribution protocol | None |

build_chain
build_chain(rng)

Sample DP, generate monomer sequence, and compute mass.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| rng | Generator | NumPy random number generator (np.random.Generator) | required |

Returns:

| Type | Description |
| --- | --- |
| Chain | Chain object with dp, monomers, and mass |

sample_dp
sample_dp(rng)

Sample a degree of polymerization from the distribution.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| rng | Generator | NumPy random number generator (np.random.Generator) | required |

Returns:

| Type | Description |
| --- | --- |
| int | Degree of polymerization (>= 1) |

sample_mass
sample_mass(rng)

Sample a target chain mass from a mass-based distribution.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| rng | Generator | NumPy random number generator (np.random.Generator) | required |

Returns:

| Type | Description |
| --- | --- |
| float | Target chain mass in g/mol (>= 0) |

PolymerBuildResult dataclass

PolymerBuildResult(polymer, connection_history=list(), total_steps=0)

Result of building a polymer.

PolymerBuilder

PolymerBuilder(library, connector, typifier=None, placer=None)

Build polymers from CGSmiles notation with support for arbitrary topologies.

This builder parses CGSmiles strings and constructs polymers using a graph-based approach, supporting:
  • Linear chains: {[#A][#B][#C]}
  • Branched structures: {[#A]([#B])[#C]}
  • Cyclic structures: {[#A]1[#B][#C]1}
  • Repeat operators: {[#A]|10}

Example

builder = PolymerBuilder(
    library={"EO2": eo2_monomer, "PS": ps_monomer},
    connector=connector,
    typifier=typifier,
)
result = builder.build("{[#EO2]|8[#PS]}")

Initialize the polymer builder.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| library | Mapping[str, Atomistic] | Mapping from CGSmiles labels to Atomistic monomer structures | required |
| connector | Connector | Connector for port selection and chemical reactions | required |
| typifier | TypifierBase \| None | Optional typifier for automatic retypification | None |
| placer | Placer \| None | Optional Placer for positioning structures before connection | None |

build
build(cgsmiles)

Build a polymer from a CGSmiles string.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| cgsmiles | str | CGSmiles notation string (e.g., "{[#EO2]|8[#PS]}") | required |

Returns:

| Type | Description |
| --- | --- |
| PolymerBuildResult | PolymerBuildResult containing the assembled polymer and metadata |

Raises:

| Type | Description |
| --- | --- |
| ValueError | If CGSmiles is invalid |
| SequenceError | If labels in CGSmiles are not found in library |
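
To make the repeat operator and the library lookup concrete, the sketch below expands a linear CGSmiles string into a flat label sequence and reports labels that a library lacks. It is a deliberately restricted, hypothetical helper (`expand_linear`, `check_labels`): branches and ring closures are rejected, and the builder's own parser is not shown in this reference.

```python
import re

# One node "[#LABEL]" optionally followed by a repeat operator "|N".
_NODE = re.compile(r"\[#([A-Za-z0-9_]+)\](?:\|(\d+))?")


def expand_linear(cgsmiles):
    """Expand a linear CGSmiles string such as "{[#EO2]|8[#PS]}" into labels.

    Branches "(...)" and ring-closure digits are deliberately not handled.
    """
    body = cgsmiles.strip()
    if not (body.startswith("{") and body.endswith("}")):
        raise ValueError("expected a string wrapped in { }")
    body = body[1:-1]
    labels, pos = [], 0
    while pos < len(body):
        match = _NODE.match(body, pos)
        if match is None:
            raise ValueError(f"unsupported syntax at position {pos}: {body[pos:]!r}")
        labels.extend([match.group(1)] * int(match.group(2) or 1))
        pos = match.end()
    return labels


def check_labels(labels, library):
    """Return the labels that are missing from the monomer library, sorted."""
    return sorted(set(labels) - set(library))


sequence = expand_linear("{[#EO2]|8[#PS]}")
missing = check_labels(sequence, {"EO2": object()})
```

Here `sequence` holds eight "EO2" labels followed by one "PS", and `missing` lists "PS", which corresponds to the situation in which build is documented to raise SequenceError.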

PortDescriptor dataclass

PortDescriptor(descriptor_id, port_name, role=None, bond_kind=None, compat=None)

Descriptor for a reactive port on a monomer template.

Port descriptors identify ports with unique IDs and store metadata about port behavior (role, bond type, compatibility).

Attributes:

| Name | Type | Description |
| --- | --- | --- |
| descriptor_id | int | Unique ID within template (e.g., 0, 1, 2) |
| port_name | str | Port name on atom (e.g., "<", ">", "branch") |
| role | str \| None | Port role (e.g., "left", "right", "branch") |
| bond_kind | str \| None | Bond type (e.g., "-", "=", "#") |
| compat | set[str] \| None | Compatibility set for port matching |

Example

desc = PortDescriptor(
    descriptor_id=0,
    port_name="<",
    role="left",
    bond_kind="-",
    compat={"donor"},
)
print(f"Descriptor {desc.descriptor_id}: port '{desc.port_name}' ({desc.role})")

ProbabilityTableKernel

ProbabilityTableKernel(probability_tables, end_group_templates=None)

GrowthKernel based on G-BigSMILES probability tables.

This kernel uses pre-computed probability tables that map each port descriptor to weighted choices over (template, target_descriptor_id) pairs. Weights are integers that are normalized to probabilities during sampling.

Initialize probability table kernel.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| probability_tables | dict[int, list[tuple[MonomerTemplate, int, int]]] | Maps descriptor_id -> [(template, target_desc, integer_weight)]. Integer weights are normalized to probabilities during sampling. | required |
| end_group_templates | dict[int, MonomerTemplate] \| None | Maps descriptor_id -> end-group template (no ports) | None |

choose_next_for_port
choose_next_for_port(polymer, port, candidates, rng=None)

Choose next monomer based on probability table.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| polymer | Atomistic | Current polymer structure | required |
| port | Atom | Port to extend from | required |
| candidates | Sequence[MonomerTemplate] | Available monomer templates | required |
| rng | Generator \| None | Random number generator (uses default if None) | None |

Returns:

| Type | Description |
| --- | --- |
| MonomerPlacement \| None | MonomerPlacement or None (terminate) |

SchulzZimmPolydisperse

SchulzZimmPolydisperse(Mn, Mw, random_seed=None)

Schulz-Zimm molecular weight distribution for polydisperse polymer chains.

Implements MassDistribution: sampling is done directly in molecular-weight space.

The probability density is:

$$
f(M) = \frac{z^{z+1}}{\Gamma(z+1)} \, \frac{M^{z-1}}{M_n^{z}} \, \exp\left(-\frac{z M}{M_n}\right),
$$

where z = Mn / (Mw - Mn). This is equivalent to a Gamma distribution with shape z and scale theta = Mw - Mn.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| Mn | float | Number-average molecular weight (g/mol). | required |
| Mw | float | Weight-average molecular weight (g/mol), must satisfy Mw > Mn. | required |
| random_seed | int \| None | Optional random seed. | None |

mass_pdf
mass_pdf(mass_array)

Probability density function for mass values.

sample_mass
sample_mass(rng)

Sample molecular weight from Schulz-Zimm (Gamma) distribution.
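
The stated equivalence with a Gamma distribution (shape z, scale Mw - Mn) can be verified numerically. The following sketch evaluates the density in the form quoted for this class, compares it with the standard Gamma density, and draws samples with the standard-library `random.gammavariate`; the function names are hypothetical and the class itself presumably samples from a NumPy Generator.

```python
import math
import random


def schulz_zimm_params(Mn, Mw):
    """Return (z, theta): Gamma shape and scale."""
    if Mw <= Mn:
        raise ValueError("Mw must be greater than Mn")
    theta = Mw - Mn
    return Mn / theta, theta


def mass_pdf(M, Mn, Mw):
    """Density in the form quoted for SchulzZimmPolydisperse."""
    z, _ = schulz_zimm_params(Mn, Mw)
    return (z ** (z + 1) / math.gamma(z + 1)) * (M ** (z - 1) / Mn ** z) * math.exp(-z * M / Mn)


def gamma_pdf(M, shape, scale):
    """Standard Gamma density, used to cross-check mass_pdf."""
    return M ** (shape - 1) * math.exp(-M / scale) / (math.gamma(shape) * scale ** shape)


def sample_mass(Mn, Mw, rng):
    z, theta = schulz_zimm_params(Mn, Mw)
    return rng.gammavariate(z, theta)  # alpha = shape, beta = scale


Mn, Mw = 1000.0, 1500.0
z, theta = schulz_zimm_params(Mn, Mw)
rng = random.Random(42)
masses = [sample_mass(Mn, Mw, rng) for _ in range(20000)]
number_average = sum(masses) / len(masses)
weight_average = sum(m * m for m in masses) / sum(masses)
```

With these inputs z = 2 and theta = 500, and the sample number average and weight average should land close to the requested Mn and Mw.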

SequenceGenerator

Bases: Protocol

Protocol for sequence generators.

A sequence generator controls how monomers are arranged in a single chain.

expected_composition
expected_composition()

Return expected long-chain monomer fractions.

Returns:

| Type | Description |
| --- | --- |
| dict[str, float] | Dictionary mapping monomer identifiers to expected fractions |

generate_sequence
generate_sequence(dp, rng)

Generate a monomer sequence of specified degree of polymerization.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| dp | int | Degree of polymerization (number of monomers) | required |
| rng | Generator | NumPy random Generator | required |

Returns:

| Type | Description |
| --- | --- |
| list[str] | List of monomer identifiers (strings) |

StochasticChain dataclass

StochasticChain(polymer, dp, mass, growth_history=list())

Result of stochastic BFS growth.

Contains the assembled polymer structure along with metadata about the growth process.

Attributes:

| Name | Type | Description |
| --- | --- | --- |
| polymer | Atomistic | The assembled Atomistic structure |
| dp | int | Degree of polymerization (number of monomers added) |
| mass | float | Total molecular weight (g/mol) |
| growth_history | list[dict[str, Any]] | Metadata for each monomer addition step |

Example

chain = StochasticChain(
    polymer=final_structure,
    dp=25,
    mass=1101.25,
    growth_history=[...],
)
print(f"Built polymer: DP={chain.dp}, mass={chain.mass:.1f} g/mol")

SystemPlan dataclass

SystemPlan(chains, total_mass, target_mass)

Represents a complete system plan with all chains.

Attributes:

| Name | Type | Description |
| --- | --- | --- |
| chains | list[Chain] | List of all chains in the system |
| total_mass | float | Total mass of all chains (g/mol) |
| target_mass | float | Target total mass that was requested (g/mol) |

SystemPlanner

SystemPlanner(chain_generator, target_total_mass, max_rel_error=0.02, max_chains=None, enable_trimming=True)

Top layer: System-level planner.

Responsible for:
  • Enforcing a target total mass for the overall system
  • Iteratively requesting chains from PolydisperseChainGenerator
  • Maintaining a running sum of total mass
  • Stopping when mass reaches target window, and optionally trimming the final chain

Does NOT micromanage sequence probabilities or DP distribution; only orchestrates at the ensemble level.

Initialize system planner.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| chain_generator | PolydisperseChainGenerator | Chain generator for building chains | required |
| target_total_mass | float | Target total system mass (g/mol) | required |
| max_rel_error | float | Maximum relative error allowed (default 0.02 = 2%) | 0.02 |
| max_chains | int \| None | Maximum number of chains to generate (None = no limit) | None |
| enable_trimming | bool | Whether to enable chain trimming to better hit target mass | True |

plan_system
plan_system(rng)

Repeatedly ask chain_generator for new chains until accumulated mass reaches target_total_mass within max_rel_error.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| rng | Generator | NumPy random number generator (np.random.Generator) | required |

Returns:

| Type | Description |
| --- | --- |
| SystemPlan | SystemPlan with all chains and total mass |
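
The accumulation loop described for plan_system, together with the chain-mass rule from PolydisperseChainGenerator, can be sketched as follows. Everything here is a hypothetical illustration: `sample_chain` stands in for the chain generator, the stopping condition is assumed to be "running mass has reached the lower edge of the target window", and trimming of the final chain is not modelled, so the total may overshoot the window.

```python
import random


def chain_mass(monomers, monomer_mass, end_group_mass=0.0):
    """Mass of one chain: sum of monomer masses plus the end-group mass."""
    return sum(monomer_mass[m] for m in monomers) + end_group_mass


def plan_system(sample_chain, target_total_mass, max_rel_error=0.02, max_chains=None):
    """Collect chains until the running mass enters the target window.

    `sample_chain` is a hypothetical callable returning (monomers, mass).
    """
    lower = target_total_mass * (1.0 - max_rel_error)
    chains, total = [], 0.0
    while total < lower and (max_chains is None or len(chains) < max_chains):
        monomers, mass = sample_chain()
        chains.append(monomers)
        total += mass
    return chains, total


rng = random.Random(7)
masses = {"EO": 44.05}


def sample_chain():
    dp = rng.randint(5, 15)  # uniform DP, inclusive bounds
    monomers = ["EO"] * dp
    return monomers, chain_mass(monomers, masses, end_group_mass=18.02)


chains, total = plan_system(sample_chain, target_total_mass=10000.0)
```

Because whole chains are added, the last chain decides how far the total ends up from the target, which is the gap that the documented enable_trimming option is meant to close.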

UniformPolydisperse

UniformPolydisperse(min_dp, max_dp, random_seed=None)

Uniform distribution over degree of polymerization (DP).

All integer DP values between min_dp and max_dp (inclusive) are equally likely.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| min_dp | int | Lower bound (>= 1). | required |
| max_dp | int | Upper bound (>= min_dp). | required |
| random_seed | int \| None | Optional random seed. | None |

dp_pmf
dp_pmf(dp_array)

PMF: equal probability for all integer DP in [min_dp, max_dp].

sample_dp
sample_dp(rng)

Sample DP uniformly from [min_dp, max_dp].

VdWSeparator

VdWSeparator(buffer=0.0)

Separator based on van der Waals radii.

Calculates separation as sum of VdW radii of the two port anchor atoms, plus an optional buffer distance.

NOTE: VdW radii are designed for non-bonded contacts (~3-4 Å). For bonded atoms, use CovalentSeparator instead.

Initialize VdW separator.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| buffer | float | Additional buffer distance in Angstroms (default: 0.0) | 0.0 |

get_separation
get_separation(left_struct, right_struct, left_port, right_port)

Calculate separation based on VdW radii.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| left_struct | Atomistic | Previous structure in sequence | required |
| right_struct | Atomistic | Next structure to place | required |
| left_port | Atom | Connection port on left structure | required |
| right_port | Atom | Connection port on right structure | required |

Returns:

| Type | Description |
| --- | --- |
| float | Separation distance = vdw_left + vdw_right + buffer |

WeightedSequenceGenerator

WeightedSequenceGenerator(monomer_weights)

Sequence generator based on monomer weights/proportions.

Each selection is independent (no memory of previous selections).

expected_composition
expected_composition()

Return expected long-chain monomer fractions.

generate_sequence
generate_sequence(dp, rng)

Generate a sequence of specified degree of polymerization.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| dp | int | Degree of polymerization (number of monomers) | required |
| rng | Generator | NumPy random Generator | required |

Returns:

| Type | Description |
| --- | --- |
| list[str] | List of monomer identifiers |
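
A memoryless weighted generator of the kind described for WeightedSequenceGenerator can be sketched in a few lines. This is a stand-in under stated assumptions: `WeightedSequenceSketch` is a hypothetical class, it takes a standard-library `random.Random` where the documented methods take a NumPy Generator, and the expected composition is assumed to be the normalized weights.

```python
import random


class WeightedSequenceSketch:
    """Independent weighted draws, one per chain position."""

    def __init__(self, monomer_weights):
        self.labels = list(monomer_weights)
        self.weights = [float(monomer_weights[m]) for m in self.labels]

    def expected_composition(self):
        # Long-chain fractions are the normalized weights.
        total = sum(self.weights)
        return {m: w / total for m, w in zip(self.labels, self.weights)}

    def generate_sequence(self, dp, rng):
        # Each position is drawn independently of the previous ones.
        return rng.choices(self.labels, weights=self.weights, k=dp)


gen = WeightedSequenceSketch({"EO": 3, "PO": 1})
sequence = gen.generate_sequence(2000, random.Random(3))
fraction_eo = sequence.count("EO") / len(sequence)
```

For a long chain the observed EO fraction should approach the expected 0.75; short chains fluctuate around it because the draws are independent.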