Builder¶

System assembly: polymer chain construction from CGSmiles topology and monomer libraries, plus crystal lattice generation.

Quick reference¶

Symbol	Summary	Preferred for
`PolymerBuilder`	Build chains from CGSmiles + library + connector + placer	Full control over assembly
`polymer(cgsmiles, ...)`	Tool: CGSmiles → chain in one call	Quick prototyping
`Connector`	Port selection rules + reaction binding	Defining which ports react
`Placer`	Geometric placement (separator + orienter)	Controlling inter-monomer geometry
`CovalentSeparator`	Covalent radii-based distance	Default monomer spacing
`LinearOrienter`	Linear chain orientation	Default growth direction

Canonical example¶

from molpy.builder.polymer import (
    PolymerBuilder, Connector, Placer,
    CovalentSeparator, LinearOrienter,
)
from molpy.tool import polymer

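# Assumes `eo_template` (the EO monomer structure) and `rxn` (a reacter)
# have been defined beforehand.
builder = PolymerBuilder(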
    library={"EO": eo_template},
    connector=Connector(port_map={("EO","EO"): (">","<")}, reacter=rxn),
    placer=Placer(separator=CovalentSeparator(buffer=-0.1),
                  orienter=LinearOrienter()),
)
result = builder.build("{[#EO]|10}")
chain = result.polymer

# Or use the tool function:
result = polymer("{[#EO]|10}", library={"EO": eo_template}, reacter=rxn)
chain = result.polymer

Full API¶

Crystal¶

crystal ¶

Crystal lattice builder module - LAMMPS-style crystal structure generator.

This module provides tools for creating crystal structures:

- Define Bravais lattices with basis sites
- Predefined common lattice types (SC, BCC, FCC, rocksalt)
- Define regions in lattice or Cartesian coordinates
- Efficient vectorized unit cell tiling and atom generation

Example

lat = Lattice.cubic_fcc(a=3.52, species="Ni")
region = BlockRegion(0, 10, 0, 10, 0, 10, coord_system="lattice")
builder = CrystalBuilder(lat)
structure = builder.build_block(region)

BlockRegion ¶

BlockRegion(xmin, xmax, ymin, ymax, zmin, zmax, coord_system='lattice')

Bases: Region

Axis-aligned box region

Define a box region specified by x, y, z ranges.

Parameters¶

xmin, xmax : float
    x-direction range [xmin, xmax]
ymin, ymax : float
    y-direction range [ymin, ymax]
zmin, zmax : float
    z-direction range [zmin, zmax]
coord_system : CoordSystem, optional
    Coordinate system, default is "lattice"

Examples¶

Region in lattice coordinates¶

region = BlockRegion(0, 10, 0, 10, 0, 10, coord_system="lattice")

Region in Cartesian coordinates¶

region = BlockRegion(0, 30, 0, 30, 0, 30, coord_system="cartesian")

Initialize box region

Parameters¶

xmin, xmax : float
    x-direction range
ymin, ymax : float
    y-direction range
zmin, zmax : float
    z-direction range
coord_system : CoordSystem, optional
    Coordinate system

contains_mask ¶

contains_mask(points)

Check if points are in the box (vectorized)

Parameters¶

points : np.ndarray Point coordinates array of shape (N, 3)

Returns¶

np.ndarray Boolean array of shape (N,)

Examples¶

import numpy as np

region = BlockRegion(0, 10, 0, 10, 0, 10)
points = np.array([[5, 5, 5], [15, 5, 5]])
mask = region.contains_mask(points)
print(mask)  # [ True False]
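A minimal sketch of a vectorized box test, assuming only NumPy. The helper `box_mask` is hypothetical and is not part of the library. Treating both bounds as inclusive follows the [xmin, xmax] notation above and is an assumption about the real implementation.

```python
import numpy as np

def box_mask(points, lo, hi):
    """Boolean mask of shape (N,) for points inside the closed box [lo, hi]."""
    points = np.asarray(points, dtype=float)
    lo = np.asarray(lo, dtype=float)
    hi = np.asarray(hi, dtype=float)
    # Compare every coordinate at once, then require all three axes to pass.
    return np.all((points >= lo) & (points <= hi), axis=1)

mask = box_mask([[5, 5, 5], [15, 5, 5]], lo=(0, 0, 0), hi=(10, 10, 10))
```

The comparison broadcasts the (3,) bounds against the (N, 3) point array, so no Python-level loop over points is needed.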

CrystalBuilder ¶

CrystalBuilder(lattice)

Crystal structure builder

Efficiently generate crystal structures using NumPy vectorized operations. Supports tiling lattices and creating atoms in specified regions.

Parameters¶

lattice : Lattice Lattice definition to use

Examples¶

Create a simple FCC structure¶

lat = Lattice.cubic_fcc(a=3.52, species="Ni")
region = BlockRegion(0, 10, 0, 10, 0, 10, coord_system="lattice")
builder = CrystalBuilder(lat)
structure = builder.build_block(region)
print(len(structure.atoms))

Initialize crystal builder

Parameters¶

lattice : Lattice Lattice definition

build_block ¶

build_block(region, *, i_range=None, j_range=None, k_range=None)

Build crystal structure within a box region

This method efficiently generates crystal structures using vectorized operations:

1. Determine cell index ranges to tile
2. Use NumPy meshgrid and broadcasting to generate all atom positions
3. Apply region filtering
4. Create and return Atomistic structure

Parameters¶

region : BlockRegion
    Box defining the region for atom generation
i_range, j_range, k_range : range | None, optional
    Explicitly specify cell index ranges. If not provided:
    - For "lattice" coordinate system: inferred from region boundaries
    - For "cartesian" coordinate system: must be provided, otherwise raises error

Returns¶

Atomistic Generated crystal structure containing atoms and box information

Raises¶

ValueError If coord_system == "cartesian" and explicit ranges are not provided

Examples¶

Using lattice coordinates (auto-infer ranges)¶

lat = Lattice.cubic_sc(a=2.0, species="Cu")
region = BlockRegion(0, 10, 0, 10, 0, 10, coord_system="lattice")
builder = CrystalBuilder(lat)
structure = builder.build_block(region)

Using explicit ranges¶

structure = builder.build_block(
    region,
    i_range=range(0, 5),
    j_range=range(0, 5),
    k_range=range(0, 5),
)

Cartesian coordinates (must provide ranges)¶

region_cart = BlockRegion(0, 20, 0, 20, 0, 20, coord_system="cartesian")
structure = builder.build_block(
    region_cart,
    i_range=range(0, 10),
    j_range=range(0, 10),
    k_range=range(0, 10),
)

Notes¶

- This method uses no Python loops; it relies entirely on NumPy vectorized operations.
- The generated structure contains:
    - Atom positions (Cartesian coordinates)
    - Atom species
    - Box information (lattice vectors)
- For an empty basis (no basis sites), an empty Atomistic structure is returned.
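The tiling steps described for build_block can be sketched as follows, assuming only NumPy. The function `tile_positions` is a hypothetical stand-in, not the library's code, and it omits region filtering and the empty-basis case.

```python
import numpy as np

def tile_positions(cell, basis_frac, i_range, j_range, k_range):
    """Cartesian positions for every basis site in every tiled cell."""
    cell = np.asarray(cell, dtype=float)         # (3, 3), rows are a1, a2, a3
    basis = np.asarray(basis_frac, dtype=float)  # (B, 3) fractional coordinates
    # All integer cell offsets (i, j, k) as an (M, 3) array.
    i, j, k = np.meshgrid(list(i_range), list(j_range), list(k_range), indexing="ij")
    offsets = np.stack([i.ravel(), j.ravel(), k.ravel()], axis=1)
    # Broadcast offsets against basis sites: (M, 1, 3) + (1, B, 3) -> (M, B, 3).
    frac = offsets[:, None, :] + basis[None, :, :]
    return frac.reshape(-1, 3) @ cell

# Conventional FCC basis: one corner site and three face-centre sites.
fcc_basis = [(0, 0, 0), (0.5, 0.5, 0), (0.5, 0, 0.5), (0, 0.5, 0.5)]
positions = tile_positions(3.52 * np.eye(3), fcc_basis, range(2), range(2), range(2))
```

A 2 x 2 x 2 tiling of a four-site basis yields 32 positions. A region mask could then be applied to `positions` to keep only the atoms inside the requested box.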

Lattice ¶

Lattice(a1, a2, a3, basis)

Bravais lattice with basis sites.

This class defines a crystal lattice structure, including lattice vectors and basis sites. Lattice vectors define the shape and size of the unit cell, while basis sites define the positions of atoms within the cell (in fractional coordinates).

Parameters¶

a1, a2, a3 : np.ndarray
    Lattice vectors, each a NumPy array of shape (3,)
basis : list[Site]
    List of basis sites in fractional coordinates

Attributes¶

a1, a2, a3 : np.ndarray
    Lattice vectors
basis : list[Site]
    List of basis sites

Examples¶

Create simple cubic lattice¶

lat = Lattice.cubic_sc(a=2.0, species="Cu")

Create face-centered cubic lattice¶

lat = Lattice.cubic_fcc(a=3.52, species="Ni")

Create rocksalt structure¶

lat = Lattice.rocksalt(a=5.64, species_a="Na", species_b="Cl")

Initialize lattice

Parameters¶

a1, a2, a3 : np.ndarray
    Lattice vectors of shape (3,)
basis : list[Site]
    List of basis sites

cell `property` ¶

cell

Return 3×3 cell matrix with lattice vectors as rows

Returns¶

np.ndarray Matrix of shape (3, 3), each row is a lattice vector [a1; a2; a3]

add_site ¶

add_site(site)

Add a basis site

Parameters¶

site : Site Basis site to add

cubic_bcc `classmethod` ¶

cubic_bcc(a, species)

Create a body-centered cubic (BCC) lattice

A body-centered cubic lattice has two atoms per unit cell: one at the corner and one at the body center.

Parameters¶

a : float
    Lattice constant (in Å)
species : str
    Atomic species (e.g., "Fe", "W")

Returns¶

Lattice Body-centered cubic lattice

Examples¶

lat = Lattice.cubic_bcc(a=3.0, species="Fe")
print(len(lat.basis))  # 2

cubic_fcc `classmethod` ¶

cubic_fcc(a, species)

Create a face-centered cubic (FCC) lattice

A face-centered cubic lattice has four atoms per unit cell: one at the corner and three at the face centers.

Parameters¶

a : float
    Lattice constant (in Å)
species : str
    Atomic species (e.g., "Ni", "Cu", "Al")

Returns¶

Lattice Face-centered cubic lattice

Examples¶

lat = Lattice.cubic_fcc(a=3.52, species="Ni")
print(len(lat.basis))  # 4

cubic_sc `classmethod` ¶

cubic_sc(a, species)

Create a simple cubic (SC) lattice

The simple cubic lattice is the simplest lattice type, with one atom per unit cell.

Parameters¶

a : float
    Lattice constant (in Å)
species : str
    Atomic species (e.g., "Cu", "Fe")

Returns¶

Lattice Simple cubic lattice

Examples¶

lat = Lattice.cubic_sc(a=2.0, species="Cu")
print(len(lat.basis))  # 1

frac_to_cart ¶

frac_to_cart(frac)

Convert fractional coordinates to Cartesian coordinates

Fractional coordinates (u, v, w) represent position relative to lattice vectors: cart = u*a1 + v*a2 + w*a3 = frac @ cell

Parameters¶

frac : np.ndarray Fractional coordinates of shape (N, 3) or (3,), containing (u, v, w) values

Returns¶

np.ndarray Cartesian coordinates, same shape as input

Examples¶

import numpy as np

lat = Lattice.cubic_sc(a=2.0, species="Cu")
frac = np.array([0.5, 0.5, 0.5])
cart = lat.frac_to_cart(frac)
print(cart)  # [1. 1. 1.]
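For a non-orthogonal cell the conversion is less obvious than in the cubic case. The sketch below spells out cart = u*a1 + v*a2 + w*a3 in plain Python. The standalone function and the hexagonal-style cell values are illustrative assumptions, not library code.

```python
def frac_to_cart(frac, a1, a2, a3):
    """cart = u*a1 + v*a2 + w*a3 for one fractional coordinate (u, v, w)."""
    u, v, w = frac
    return tuple(u * x + v * y + w * z for x, y, z in zip(a1, a2, a3))

# Hexagonal-style cell with a = 2.0 and c = 3.0 (illustrative values).
a1 = (2.0, 0.0, 0.0)
a2 = (-1.0, 3 ** 0.5, 0.0)
a3 = (0.0, 0.0, 3.0)
cart = frac_to_cart((0.5, 0.5, 0.5), a1, a2, a3)
```

Each Cartesian component mixes contributions from all three lattice vectors, which is exactly what the row-vector product `frac @ cell` computes.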

rocksalt `classmethod` ¶

rocksalt(a, species_a, species_b)

Create rocksalt (NaCl) structure

Rocksalt structure consists of two interpenetrating FCC sublattices. Each unit cell contains 4 A atoms and 4 B atoms.

Parameters¶

a : float
    Lattice constant (in Å)
species_a : str
    First atomic species (e.g., "Na")
species_b : str
    Second atomic species (e.g., "Cl")

Returns¶

Lattice Rocksalt structure lattice

Examples¶

lat = Lattice.rocksalt(a=5.64, species_a="Na", species_b="Cl")
print(len(lat.basis))  # 8
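The "two interpenetrating FCC sublattices" picture can be sketched in a few lines. The helper `rocksalt_basis` and the half-cell shift along x are assumptions for illustration; the library may store its sites in a different order or use an equivalent shift.

```python
# Conventional FCC fractional coordinates: one corner and three face centres.
FCC = [(0.0, 0.0, 0.0), (0.5, 0.5, 0.0), (0.5, 0.0, 0.5), (0.0, 0.5, 0.5)]

def rocksalt_basis(species_a, species_b, shift=(0.5, 0.0, 0.0)):
    """Return (species, frac) pairs: an FCC sublattice of A plus a shifted copy for B."""
    sites = [(species_a, f) for f in FCC]
    for f in FCC:
        # Wrap shifted coordinates back into [0, 1).
        shifted = tuple((c + s) % 1.0 for c, s in zip(f, shift))
        sites.append((species_b, shifted))
    return sites

sites = rocksalt_basis("Na", "Cl")
```

The result has four A sites and four B sites with no overlapping positions, matching the eight-site basis reported above.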

Region ¶

Region(coord_system='lattice')

Bases: ABC

Abstract geometric region class

Define a spatial region that can be represented in lattice or Cartesian coordinates.

Parameters¶

coord_system : CoordSystem
    Coordinate system, "lattice" or "cartesian"
    - "lattice": point coordinates in lattice units
    - "cartesian": point coordinates in Cartesian coordinates (Å)

Notes¶

Subclasses must implement the contains_mask method using NumPy vectorized operations so that many points can be checked against the region efficiently.

Initialize region

Parameters¶

coord_system : CoordSystem, optional Coordinate system, default is "lattice"

contains_mask `abstractmethod` ¶

contains_mask(points)

Check if points are in the region (vectorized)

Parameters¶

points : np.ndarray Point coordinates array of shape (N, 3)

Returns¶

np.ndarray Boolean array of shape (N,), True indicates point is in region

Notes¶

If coord_system == "lattice": points are in lattice units
If coord_system == "cartesian": points are in Cartesian coordinates
Must use vectorized operations, no Python loops

Site `dataclass` ¶

Site(label, species, frac, charge=0.0, meta=None)

Lattice basis site in fractional coordinates.

Attributes¶

label : str
    Site identifier or name (e.g., "A", "B1")
species : str
    Chemical species or type name (e.g., "Ni", "Na", "Cl")
frac : tuple[float, float, float]
    Fractional coordinates (u, v, w) relative to the Bravais cell, typically in [0, 1)
charge : float, optional
    Charge, default is 0.0
meta : dict[str, Any] | None, optional
    Optional metadata dictionary

Examples¶

site = Site(label="A", species="Cu", frac=(0.0, 0.0, 0.0))
site_charged = Site(label="Na", species="Na", frac=(0.0, 0.0, 0.0), charge=1.0)

Polymer¶

polymer ¶

Polymer assembly module.

Provides linear polymer assembly with both topology-only and chemical reaction connectors, plus optional geometric placement via Placer strategies.

Chain `dataclass` ¶

Chain(dp, monomers, mass)

Represents a single polymer chain.

Attributes:

Name	Type	Description
`dp`	`int`	Degree of polymerization (number of monomers)
`monomers`	`list[str]`	List of monomer identifiers in the chain
`mass`	`float`	Total mass of the chain (g/mol)

Connector ¶

Connector(reacter, *, port_map=None, overrides=None)

Select ports and execute reactions between adjacent monomers.

Port selection strategy (applied in order):

1. Explicit port_map lookup for (left_label, right_label)
2. Compatibility: > on left pairs with < on right
3. Single-port: each side has exactly one unconsumed port
4. Common name: both sides share a port name (for $ ports)
5. Raise AmbiguousPortsError
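The ordered rules above can be sketched as a plain function. Everything here is a hypothetical stand-in: `pick_ports` works on sets of port names rather than Atomistic structures, the local `AmbiguousPortsError` class only mimics the named error, and requiring exactly one shared name in rule 4 is an assumption.

```python
class AmbiguousPortsError(Exception):
    """Stand-in for the error named in the strategy above."""

def pick_ports(left_label, right_label, left_ports, right_ports, port_map=None):
    """Return (left_port_name, right_port_name), trying the five rules in order."""
    # 1. Explicit mapping for this pair of labels.
    if port_map and (left_label, right_label) in port_map:
        return port_map[(left_label, right_label)]
    # 2. Directional compatibility: ">" on the left pairs with "<" on the right.
    if ">" in left_ports and "<" in right_ports:
        return (">", "<")
    # 3. Exactly one available port on each side.
    if len(left_ports) == 1 and len(right_ports) == 1:
        return (next(iter(left_ports)), next(iter(right_ports)))
    # 4. Exactly one shared port name, e.g. "$" (assumed tie-break rule).
    common = sorted(set(left_ports) & set(right_ports))
    if len(common) == 1:
        return (common[0], common[0])
    # 5. Nothing matched unambiguously.
    raise AmbiguousPortsError(f"cannot choose ports for {left_label}-{right_label}")

head_to_tail = pick_ports("EO", "EO", {"<", ">"}, {"<", ">"})
```

Because the rules are tried in order, an explicit `port_map` entry always wins over the directional default.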

connect ¶

connect(left, right, left_type, right_type, port_atom_L, port_atom_R, typifier=None)

Execute the chemical reaction between two structures.

get_reacter ¶

get_reacter(left_type, right_type)

Get the appropriate Reacter for a structure pair.

select_ports ¶

select_ports(left, right, left_ports, right_ports, ctx)

Select which ports to connect.

Parameters:

Name	Type	Description	Default
`left`	`Atomistic`	Left Atomistic structure.	required
`right`	`Atomistic`	Right Atomistic structure.	required
`left_ports`	`Mapping[str, list[Atom]]`	Available ports on left (name -> list[Atom]).	required
`right_ports`	`Mapping[str, list[Atom]]`	Available ports on right (name -> list[Atom]).	required
`ctx`	`ConnectorContext`	Context with step info and labels.	required

Returns:

Type	Description
`tuple[str, int, str, int, None]`	(left_port_name, left_idx, right_port_name, right_idx, None)

ConnectorContext ¶

Bases: dict[str, Any]

Shared context passed to the connector during linear build.

Keys:

- step: int (current connection step index)
- left_label: str (label of left monomer)
- right_label: str (label of right monomer)
- sequence: list[str] (full sequence being built)

CovalentSeparator ¶

CovalentSeparator(buffer=0.0)

Separator based on typical bond lengths (for bonded atoms).

Uses realistic bond lengths based on element types. Typical bond lengths:

- C-C: 1.54 Å (single), 1.34 Å (double)
- C-O: 1.43 Å (single), 1.23 Å (double)
- C-N: 1.47 Å (single)
- O-H: 0.96 Å
- N-H: 1.01 Å

Initialize covalent separator.

Parameters:

Name	Type	Description	Default
`buffer`	`float`	Additional buffer distance in Angstroms (default: 0.0). Can be negative to account for slight compression.	`0.0`

get_separation ¶

get_separation(left_struct, right_struct, left_port, right_port)

Calculate separation based on typical bond lengths.

Parameters:

Name	Type	Description	Default
`left_struct`	`Atomistic`	Previous structure in sequence	required
`right_struct`	`Atomistic`	Next structure to place	required
`left_port`	`Atom`	Connection port on left structure	required
`right_port`	`Atom`	Connection port on right structure	required

Returns:

Type	Description
`float`	Separation distance = typical_bond_length + buffer
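A minimal sketch of "typical bond length + buffer", using only the single-bond values listed for this class. The lookup table layout, the function name, and the 1.5 Å fallback for unlisted pairs are assumptions, not the library's internals.

```python
# Single-bond lengths in angstroms, copied from the list in this section.
BOND_LENGTHS = {
    frozenset(("C", "C")): 1.54,
    frozenset(("C", "O")): 1.43,
    frozenset(("C", "N")): 1.47,
    frozenset(("O", "H")): 0.96,
    frozenset(("N", "H")): 1.01,
}

def covalent_separation(elem_left, elem_right, buffer=0.0, default=1.5):
    """Typical bond length for the element pair plus a signed buffer."""
    # frozenset keys make the lookup independent of argument order.
    key = frozenset((elem_left, elem_right))
    return BOND_LENGTHS.get(key, default) + buffer

ether_gap = covalent_separation("O", "C", buffer=-0.1)
```

With `buffer=-0.1`, as in the canonical example, a C-O contact is placed at about 1.33 Å instead of 1.43 Å.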

DPDistribution ¶

Bases: Protocol

Protocol for distributions that sample degree of polymerization directly.

Distributions implementing this protocol can sample DP values without requiring monomer mass information. This is suitable for distributions defined in DP space (e.g., Poisson, Uniform).

dp_pmf ¶

dp_pmf(dp_array)

Probability mass function for DP values.

Parameters:

Name	Type	Description	Default
`dp_array`	`ndarray`	Array of DP values	required

Returns:

Type	Description
`ndarray`	Array of probability mass values

sample_dp ¶

sample_dp(rng)

Sample degree of polymerization from distribution.

Parameters:

Name	Type	Description	Default
`rng`	`Generator`	NumPy random number generator	required

Returns:

Type	Description
`int`	Degree of polymerization (>= 1)

FlorySchulzPolydisperse ¶

FlorySchulzPolydisperse(a, random_seed=None)

Flory-Schulz (geometric) distribution for degree of polymerization.

PMF: P(N = k) = a^2 * k * (1 - a)^(k-1), k = 1, 2, ...

Parameters:

Name	Type	Description	Default
`a`	`float`	Probability parameter (0 < a < 1), related to extent of reaction.	required
`random_seed`	`int | None`	Optional random seed.	`None`
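The PMF above can be checked numerically, and one way to draw from it is as the sum of two independent geometric variables minus one, which has exactly this distribution. The sketch uses the standard-library `random.Random` in place of a NumPy generator to stay dependency-free; the function names are hypothetical and the library may sample differently.

```python
import math
import random

def flory_schulz_pmf(k, a):
    """P(N = k) = a^2 * k * (1 - a)^(k - 1) for k = 1, 2, ..."""
    return a * a * k * (1.0 - a) ** (k - 1)

def sample_flory_schulz(a, rng):
    """Draw one DP as G1 + G2 - 1, where G1 and G2 are geometric on {1, 2, ...}."""
    def geometric():
        u = 1.0 - rng.random()  # uniform in (0, 1]
        return int(math.log(u) / math.log(1.0 - a)) + 1
    return geometric() + geometric() - 1

rng = random.Random(0)
a = 0.1
total = sum(flory_schulz_pmf(k, a) for k in range(1, 2000))
samples = [sample_flory_schulz(a, rng) for _ in range(20000)]
mean_dp = sum(samples) / len(samples)
```

The mean of this PMF is 2/a - 1, so `a = 0.1` gives an average DP close to 19.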

dp_pmf ¶

dp_pmf(dp_array)

Flory-Schulz PMF.

sample_dp ¶

sample_dp(rng)

Sample DP from Flory-Schulz distribution (>= 1).

GrowthKernel ¶

Bases: Protocol

Protocol for local transition function in port-level stochastic growth.

A GrowthKernel decides which monomer (if any) to add next for a given reactive port on the growing polymer. This encapsulates the reaction probability logic from G-BigSMILES notation.

choose_next_for_port ¶

choose_next_for_port(polymer, port, candidates, rng=None)

Choose next monomer for a given port.

Parameters:

Name	Type	Description	Default
`polymer`	`Atomistic`	Current polymer structure	required
`port`	`Atom`	Port to extend from	required
`candidates`	`Sequence[MonomerTemplate]`	Available monomer templates	required
`rng`	`Generator | None`	Random number generator for sampling	`None`

Returns:

Name	Type	Description
`MonomerPlacement`	`MonomerPlacement | None`	Add this template at target port
`None`	`MonomerPlacement | None`	Terminate this port (implicit end-group)

LinearOrienter ¶

Orienter for linear polymer arrangement.

Aligns the next monomer so that:

1. The two port atoms are separated by the specified distance
2. The port connection axis of the next monomer aligns with the port connection axis of the previous monomer
3. The monomer extends in a linear fashion

get_orientation ¶

get_orientation(left_struct, right_struct, left_port, right_port, separation)

Calculate linear alignment transformation.

Strategy:

1. Get direction vector from left port anchor (outward)
2. Place right structure so its port anchor is at the target position
3. Align right structure's port direction with left port direction

Parameters:

Name	Type	Description	Default
`left_struct`	`Atomistic`	Previous structure in sequence	required
`right_struct`	`Atomistic`	Next structure to place	required
`left_port`	`Atom`	Connection port on left structure	required
`right_port`	`Atom`	Connection port on right structure	required
`separation`	`float`	Distance between port anchors	required

Returns:

Type	Description
`tuple[ndarray, ndarray]`	Tuple of (translation_vector, rotation_matrix)
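The translation part of this strategy (steps 1 and 2) can be sketched with plain tuples. The function names are hypothetical, and the rotation in step 3 is deliberately left out; this is not the library's implementation.

```python
import math

def unit(vector):
    """Normalize a 3-vector, rejecting the zero vector."""
    norm = math.sqrt(sum(c * c for c in vector))
    if norm == 0.0:
        raise ValueError("direction vector must be non-zero")
    return tuple(c / norm for c in vector)

def translation_for_port(left_anchor, outward_dir, right_anchor, separation):
    """Shift that moves right_anchor to left_anchor + separation * unit(outward_dir)."""
    direction = unit(outward_dir)
    target = tuple(p + separation * d for p, d in zip(left_anchor, direction))
    return tuple(t - r for t, r in zip(target, right_anchor))

shift = translation_for_port(
    left_anchor=(1.0, 0.0, 0.0),
    outward_dir=(2.0, 0.0, 0.0),
    right_anchor=(0.0, 0.0, 0.0),
    separation=1.5,
)
```

Applying `shift` to every atom of the right structure puts its port anchor 1.5 Å beyond the left anchor along the outward direction.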

MassDistribution ¶

Bases: Protocol

Protocol for distributions that sample molecular weight directly.

Distributions implementing this protocol sample mass values directly from the distribution without converting through DP. This is suitable for distributions defined in mass space (e.g., Schulz-Zimm).

mass_pdf ¶

mass_pdf(mass_array)

Probability density function for mass values.

Parameters:

Name	Type	Description	Default
`mass_array`	`ndarray`	Array of mass values (g/mol)	required

Returns:

Type	Description
`ndarray`	Array of probability density values

sample_mass ¶

sample_mass(rng)

Sample molecular weight from distribution.

Parameters:

Name	Type	Description	Default
`rng`	`Generator`	NumPy random number generator	required

Returns:

Type	Description
`float`	Molecular weight (g/mol, > 0)

MonomerPlacement `dataclass` ¶

MonomerPlacement(template, target_descriptor_id)

Decision for next monomer placement during stochastic growth.

Represents the output of a GrowthKernel's decision: which template to add and which port on that template to connect.

Attributes:

Name	Type	Description
`template`	`MonomerTemplate`	MonomerTemplate to add
`target_descriptor_id`	`int`	Which port descriptor on the new monomer to connect

Example

placement = MonomerPlacement(
    template=eo_template,
    target_descriptor_id=1,  # Connect via port descriptor 1
)
print(f"Add {placement.template.label} at port {placement.target_descriptor_id}")

MonomerTemplate `dataclass` ¶

MonomerTemplate(label, structure, port_descriptors, mass, metadata=dict())

Template for a monomer with port descriptors and metadata.

This represents a monomer type that can be instantiated multiple times during stochastic growth. Each instantiation creates a fresh copy of the structure.

Attributes:

Name	Type	Description
`label`	`str`	Monomer label (e.g., "EO2", "PS")
`structure`	`Atomistic`	Base Atomistic structure (will be copied on instantiation)
`port_descriptors`	`dict[int, PortDescriptor]`	Mapping from descriptor_id to PortDescriptor
`mass`	`float`	Molecular weight (g/mol)
`metadata`	`dict[str, Any]`	Additional metadata (optional)

Example

template = MonomerTemplate(
    label="EO",
    structure=eo_monomer,
    port_descriptors={
        0: PortDescriptor(0, "<", role="left"),
        1: PortDescriptor(1, ">", role="right"),
    },
    mass=44.05,
)
fresh_copy = template.instantiate()
print(f"Template: {template.label}, mass={template.mass} g/mol")

get_all_descriptors ¶

get_all_descriptors()

Get all port descriptors for this template.

Returns:

Type	Description
`list[PortDescriptor]`	List of all PortDescriptor objects sorted by descriptor_id

Example

template = MonomerTemplate(...)
descriptors = template.get_all_descriptors()
for desc in descriptors:
    print(f"Port {desc.descriptor_id}: {desc.port_name}")

get_port_by_descriptor ¶

get_port_by_descriptor(descriptor_id)

Get port descriptor for a specific descriptor ID.

Parameters:

Name	Type	Description	Default
`descriptor_id`	`int`	Descriptor ID to look up	required

Returns:

Type	Description
`PortDescriptor | None`	PortDescriptor if found, None otherwise

Example

template = MonomerTemplate(...)
left_port = template.get_port_by_descriptor(0)
if left_port:
    print(f"Port: {left_port.port_name}, role: {left_port.role}")

instantiate ¶

instantiate()

Create a fresh copy of the structure.

Each instantiation is independent with separate atoms and bonds, allowing the same template to be used multiple times in a polymer.

Returns:

Type	Description
`Atomistic`	New Atomistic instance with independent atoms and bonds

Example

template = MonomerTemplate(label="EO", structure=eo_monomer, ...)
copy1 = template.instantiate()
copy2 = template.instantiate()
copy1 is not copy2  # True: different objects

Placer ¶

Placer(separator, orienter)

Combined placer for positioning structures during assembly.

Uses a Separator to determine distance and an Orienter to determine orientation.

Initialize placer.

Parameters:

Name	Type	Description	Default
`separator`	`Separator`	Separator for calculating distance	required
`orienter`	`LinearOrienter`	Orienter for calculating orientation	required

place_monomer ¶

place_monomer(left_struct, right_struct, left_port, right_port)

Position right_struct relative to left_struct.

Modifies right_struct's atomic coordinates in-place.

Parameters:

Name	Type	Description	Default
`left_struct`	`Atomistic`	Previous structure in sequence	required
`right_struct`	`Atomistic`	Next structure to place	required
`left_port`	`Atom`	Connection port on left structure	required
`right_port`	`Atom`	Connection port on right structure	required

PoissonPolydisperse ¶

PoissonPolydisperse(lambda_param, random_seed=None)

Poisson distribution for the degree of polymerization (DP).

Zero-truncated: sampled k=0 is mapped to k=1.

Parameters:

Name	Type	Description	Default
`lambda_param`	`float`	Mean of the Poisson distribution (> 0).	required
`random_seed`	`int | None`	Optional random seed.	`None`
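Read literally, "sampled k=0 is mapped to k=1" folds the probability of zero into k = 1 rather than renormalizing the rest. The sketch below implements that literal reading with the standard library only. Whether the library's `dp_pmf` uses this form or the conditional (renormalized) zero-truncated form is not stated here, so treat this as an assumption; all names are hypothetical.

```python
import math
import random

def mapped_poisson_pmf(k, lam):
    """Poisson PMF with the mass of k = 0 folded into k = 1."""
    if k < 1:
        return 0.0
    p = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 1:
        p += math.exp(-lam)  # probability of the unmapped k = 0 outcome
    return p

def sample_mapped_poisson(lam, rng):
    """Knuth's multiplication method for a Poisson draw, then map 0 to 1."""
    limit, k, prod = math.exp(-lam), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return max(k, 1)

rng = random.Random(1)
lam = 3.0
total = sum(mapped_poisson_pmf(k, lam) for k in range(1, 60))
samples = [sample_mapped_poisson(lam, rng) for _ in range(20000)]
mean_dp = sum(samples) / len(samples)
```

Under this mapping the mean shifts slightly above lambda, to lambda + exp(-lambda).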

dp_pmf ¶

dp_pmf(dp_array)

Zero-truncated Poisson PMF.

sample_dp ¶

sample_dp(rng)

Sample DP from zero-truncated Poisson distribution (>= 1).

PolydisperseChainGenerator ¶

PolydisperseChainGenerator(seq_generator, monomer_mass, end_group_mass=0.0, distribution=None)

Middle layer: Chain-level generator.

Responsible for:

- Sampling chain size:
    - Either in DP-space via a DPDistribution (sample_dp)
    - Or in mass-space via a MassDistribution (sample_mass)
- Using a SequenceGenerator to build the chain sequence
- Computing the mass of a chain using monomer mass table and optional end-group mass

Does NOT know anything about total system mass. Only returns one chain at a time.

Initialize polydisperse chain generator.

Parameters:

Name	Type	Description	Default
`seq_generator`	`SequenceGenerator`	Sequence generator for generating monomer sequences	required
`monomer_mass`	`dict[str, float]`	Dictionary mapping monomer identifiers to their masses (g/mol)	required
`end_group_mass`	`float`	Mass of end groups (g/mol), default 0.0	`0.0`
`distribution`	`DPDistribution | MassDistribution | None`	Distribution implementing DPDistribution or MassDistribution protocol	`None`
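The mass bookkeeping described for this class can be illustrated in a few lines. The helper `chain_mass` is hypothetical, the 18.02 g/mol end-group value is purely illustrative, and adding the end-group mass once per chain is an assumption about how the parameter is applied.

```python
def chain_mass(monomers, monomer_mass, end_group_mass=0.0):
    """Sum per-monomer masses from the mass table, then add the end-group mass once."""
    return sum(monomer_mass[m] for m in monomers) + end_group_mass

# Ten EO units at 44.05 g/mol each plus an illustrative end-group mass.
mass = chain_mass(["EO"] * 10, {"EO": 44.05}, end_group_mass=18.02)
```

A monomer label missing from the mass table raises `KeyError` in this sketch, which surfaces mismatches between the sequence generator and the table early.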

build_chain ¶

build_chain(rng)

Sample DP, generate monomer sequence, and compute mass.

Parameters:

Name	Type	Description	Default
`rng`	`Generator`	NumPy random number generator (np.random.Generator)	required

Returns:

Type	Description
`Chain`	Chain object with dp, monomers, and mass

sample_dp ¶

sample_dp(rng)

Sample a degree of polymerization from the distribution.

Parameters:

Name	Type	Description	Default
`rng`	`Generator`	NumPy random number generator (np.random.Generator)	required

Returns:

Type	Description
`int`	Degree of polymerization (>= 1)

sample_mass ¶

sample_mass(rng)

Sample a target chain mass from a mass-based distribution.

Parameters:

Name	Type	Description	Default
`rng`	`Generator`	NumPy random number generator (np.random.Generator)	required

Returns:

Type	Description
`float`	Target chain mass in g/mol (>= 0)

PolymerBuildResult `dataclass` ¶

PolymerBuildResult(polymer, connection_history=list(), total_steps=0)

Result of building a polymer.

PolymerBuilder ¶

PolymerBuilder(library, connector, typifier=None, placer=None)

Build polymers from CGSmiles notation with support for arbitrary topologies.

This builder parses CGSmiles strings and constructs polymers using a graph-based approach, supporting:

- Linear chains: {[#A][#B][#C]}
- Branched structures: {[#A]([#B])[#C]}
- Cyclic structures: {[#A]1[#B][#C]1}
- Repeat operators: {[#A]|10}

Example

builder = PolymerBuilder(
    library={"EO2": eo2_monomer, "PS": ps_monomer},
    connector=connector,
    typifier=typifier,
)
result = builder.build("{[#EO2]|8[#PS]}")
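To make the repeat operator concrete, the sketch below expands a linear CGSmiles string into a flat label sequence. It is a toy regular-expression reader written for this page, not the library's parser: `expand_linear` is a hypothetical name, and branches and ring closures are rejected rather than handled.

```python
import re

# A node is [#LABEL], optionally followed by a repeat count |n.
NODE = re.compile(r"\[#([^\]]+)\](?:\|(\d+))?")

def expand_linear(cgsmiles):
    """Expand e.g. "{[#EO2]|8[#PS]}" into a list of monomer labels."""
    body = cgsmiles.strip()
    if not (body.startswith("{") and body.endswith("}")):
        raise ValueError("expected a string wrapped in { }")
    body = body[1:-1]
    # Anything left after removing nodes is syntax this sketch does not cover.
    if NODE.sub("", body):
        raise ValueError("only linear [#LABEL] nodes with optional |n are handled")
    sequence = []
    for label, count in NODE.findall(body):
        sequence.extend([label] * (int(count) if count else 1))
    return sequence

sequence = expand_linear("{[#EO2]|8[#PS]}")
```

Here `sequence` holds eight "EO2" labels followed by one "PS", which is the order in which a linear build would visit the library entries.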

Initialize the polymer builder.

Parameters:

Name	Type	Description	Default
`library`	`Mapping[str, Atomistic]`	Mapping from CGSmiles labels to Atomistic monomer structures	required
`connector`	`Connector`	Connector for port selection and chemical reactions	required
`typifier`	`TypifierBase | None`	Optional typifier for automatic retypification	`None`
`placer`	`Placer | None`	Optional Placer for positioning structures before connection	`None`

build ¶

build(cgsmiles)

Build a polymer from a CGSmiles string.

Parameters:

Name	Type	Description	Default
`cgsmiles`	`str`	CGSmiles notation string (e.g., "{[#EO2]|8[#PS]}")	required

Returns:

Type	Description
`PolymerBuildResult`	PolymerBuildResult containing the assembled polymer and metadata

Raises:

Type	Description
`ValueError`	If CGSmiles is invalid
`SequenceError`	If labels in CGSmiles are not found in library

PortDescriptor `dataclass` ¶

PortDescriptor(descriptor_id, port_name, role=None, bond_kind=None, compat=None)

Descriptor for a reactive port on a monomer template.

Port descriptors identify ports with unique IDs and store metadata about port behavior (role, bond type, compatibility).

Attributes:

Name	Type	Description
`descriptor_id`	`int`	Unique ID within template (e.g., 0, 1, 2)
`port_name`	`str`	Port name on atom (e.g., "<", ">", "branch")
`role`	`str | None`	Port role (e.g., "left", "right", "branch")
`bond_kind`	`str | None`	Bond type (e.g., "-", "=", "#")
`compat`	`set[str] | None`	Compatibility set for port matching

Example

desc = PortDescriptor(
    descriptor_id=0,
    port_name="<",
    role="left",
    bond_kind="-",
    compat={"donor"},
)
print(f"Descriptor {desc.descriptor_id}: port '{desc.port_name}' ({desc.role})")

ProbabilityTableKernel ¶

ProbabilityTableKernel(probability_tables, end_group_templates=None)

GrowthKernel based on G-BigSMILES probability tables.

This kernel uses pre-computed probability tables that map each port descriptor to weighted choices over (template, target_descriptor_id) pairs. Weights are integers that are normalized to probabilities during sampling.

Initialize probability table kernel.

Parameters:

Name	Type	Description	Default
`probability_tables`	`dict[int, list[tuple[MonomerTemplate, int, int]]]`	Maps descriptor_id -> [(template, target_desc, integer_weight)] Integer weights are normalized to probabilities during sampling.	required
`end_group_templates`	`dict[int, MonomerTemplate] | None`	Maps descriptor_id -> end-group template (no ports)	`None`
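Sampling from integer-weighted tables of this shape can be sketched with the standard library. Template objects are replaced by plain strings, `choose_next` is a hypothetical name, and returning None when a descriptor has no table entry is an assumption about termination behaviour.

```python
import random

def choose_next(probability_tables, descriptor_id, rng):
    """Pick (template, target_descriptor_id) for a port, or None to terminate."""
    entries = probability_tables.get(descriptor_id)
    if not entries:
        return None  # assumption: no table entry means the port terminates
    # random.choices normalizes the integer weights internally.
    weights = [weight for _, _, weight in entries]
    template, target, _ = rng.choices(entries, weights=weights, k=1)[0]
    return template, target

# Descriptor 1 extends with "EO" three times as often as with "PO".
tables = {1: [("EO", 0, 3), ("PO", 0, 1)]}
rng = random.Random(42)
picks = [choose_next(tables, 1, rng) for _ in range(10000)]
eo_fraction = sum(1 for p in picks if p == ("EO", 0)) / len(picks)
```

Weights of 3 and 1 correspond to probabilities of 0.75 and 0.25, so `eo_fraction` should land near 0.75.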

choose_next_for_port ¶

choose_next_for_port(polymer, port, candidates, rng=None)

Choose next monomer based on probability table.

Parameters:

Name	Type	Description	Default
`polymer`	`Atomistic`	Current polymer structure	required
`port`	`Atom`	Port to extend from	required
`candidates`	`Sequence[MonomerTemplate]`	Available monomer templates	required
`rng`	`Generator | None`	Random number generator (uses default if None)	`None`

Returns:

Type	Description
`MonomerPlacement | None`	MonomerPlacement or None (terminate)

SchulzZimmPolydisperse ¶

SchulzZimmPolydisperse(Mn, Mw, random_seed=None)

Schulz-Zimm molecular weight distribution for polydisperse polymer chains.

Implements MassDistribution: sampling is done directly in molecular-weight space.

The probability density is:

.. math::

f(M) = \frac{z^{z+1}}{\Gamma(z+1)}
       \frac{M^{z-1}}{M_n^{z}}
       \exp\left(-\frac{z M}{M_n}\right),

where z = Mn / (Mw - Mn). This is equivalent to a Gamma distribution with shape z and scale theta = Mw - Mn.

Parameters:

Name	Type	Description	Default
`Mn`	`float`	Number-average molecular weight (g/mol).	required
`Mw`	`float`	Weight-average molecular weight (g/mol), must satisfy Mw > Mn.	required
`random_seed`	`int | None`	Optional random seed.	`None`
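The Gamma equivalence stated above (shape z = Mn / (Mw - Mn), scale Mw - Mn) can be checked by sampling. This sketch uses the standard-library `random.gammavariate` rather than a NumPy generator; the function name and the Mn and Mw values are illustrative assumptions.

```python
import random

def sample_schulz_zimm(Mn, Mw, rng):
    """Draw one molecular weight from Gamma(shape=z, scale=Mw - Mn)."""
    if Mw <= Mn:
        raise ValueError("Mw must be greater than Mn")
    z = Mn / (Mw - Mn)   # shape
    theta = Mw - Mn      # scale, so the mean is z * theta = Mn
    return rng.gammavariate(z, theta)

rng = random.Random(7)
masses = [sample_schulz_zimm(10000.0, 15000.0, rng) for _ in range(50000)]
# Number average: plain mean. Weight average: second moment over first moment.
mn_est = sum(masses) / len(masses)
mw_est = sum(m * m for m in masses) / sum(masses)
```

The estimated number and weight averages should recover the requested Mn and Mw to within sampling noise.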

mass_pdf ¶

mass_pdf(mass_array)

Probability density function for mass values.

sample_mass ¶

sample_mass(rng)

Sample molecular weight from Schulz-Zimm (Gamma) distribution.

SequenceGenerator ¶

Bases: Protocol

Protocol for sequence generators.

A sequence generator controls how monomers are arranged in a single chain.

expected_composition ¶

expected_composition()

Return expected long-chain monomer fractions.

Returns:

Type	Description
`dict[str, float]`	Dictionary mapping monomer identifiers to expected fractions

generate_sequence ¶

generate_sequence(dp, rng)

Generate a monomer sequence of specified degree of polymerization.

Parameters:

Name	Type	Description	Default
`dp`	`int`	Degree of polymerization (number of monomers)	required
`rng`	`Generator`	numpy random Generator	required

Returns:

Type	Description
`list[str]`	List of monomer identifiers (strings)

StochasticChain `dataclass` ¶

StochasticChain(polymer, dp, mass, growth_history=list())

Result of stochastic BFS growth.

Contains the assembled polymer structure along with metadata about the growth process.

Attributes:

Name	Type	Description
`polymer`	`Atomistic`	The assembled Atomistic structure
`dp`	`int`	Degree of polymerization (number of monomers added)
`mass`	`float`	Total molecular weight (g/mol)
`growth_history`	`list[dict[str, Any]]`	Metadata for each monomer addition step

Example

chain = StochasticChain(
    polymer=final_structure,
    dp=25,
    mass=1101.25,
    growth_history=[...],
)
print(f"Built polymer: DP={chain.dp}, mass={chain.mass:.1f} g/mol")

SystemPlan `dataclass` ¶

SystemPlan(chains, total_mass, target_mass)

Represents a complete system plan with all chains.

Attributes:

Name	Type	Description
`chains`	`list[Chain]`	List of all chains in the system
`total_mass`	`float`	Total mass of all chains (g/mol)
`target_mass`	`float`	Target total mass that was requested (g/mol)

SystemPlanner ¶

SystemPlanner(chain_generator, target_total_mass, max_rel_error=0.02, max_chains=None, enable_trimming=True)

Top layer: System-level planner.

Responsible for:

- Enforcing a target total mass for the overall system
- Iteratively requesting chains from PolydisperseChainGenerator
- Maintaining a running sum of total mass
- Stopping when mass reaches target window, and optionally trimming the final chain

Does NOT micromanage sequence probabilities or DP distribution; only orchestrates at the ensemble level.

Initialize system planner.

Parameters:

Name	Type	Description	Default
`chain_generator`	`PolydisperseChainGenerator`	Chain generator for building chains	required
`target_total_mass`	`float`	Target total system mass (g/mol)	required
`max_rel_error`	`float`	Maximum relative error allowed (default 0.02 = 2%)	`0.02`
`max_chains`	`int | None`	Maximum number of chains to generate (None = no limit)	`None`
`enable_trimming`	`bool`	Whether to enable chain trimming to better hit target mass	`True`
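The accumulate-until-target loop can be sketched as below. This is a simplified stand-in, not the library's algorithm: chains are reduced to bare masses, `sample_chain_mass` is a hypothetical callable, and the trimming rule (shrink the last chain so the total lands on the target) is an assumption. A real planner would trim whole monomers.

```python
import random

def plan_system(sample_chain_mass, target_total_mass, max_rel_error=0.02,
                max_chains=None, rng=None):
    """Collect chain masses until the total enters the target window."""
    rng = rng or random.Random()
    masses, total = [], 0.0
    lower = target_total_mass * (1.0 - max_rel_error)
    upper = target_total_mass * (1.0 + max_rel_error)
    while total < lower:
        if max_chains is not None and len(masses) >= max_chains:
            break
        mass = sample_chain_mass(rng)
        if total + mass > upper:
            # Hypothetical trimming rule: shrink the final chain to hit the target.
            mass = target_total_mass - total
        masses.append(mass)
        total += mass
    return masses, total

masses, total = plan_system(lambda r: r.uniform(500.0, 1500.0), 20000.0,
                            rng=random.Random(1))
```

Because trimming only triggers when a chain would overshoot the upper bound, most chains keep the mass drawn from the distribution and only the last one is adjusted.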

plan_system ¶

plan_system(rng)

Repeatedly ask chain_generator for new chains until accumulated mass reaches target_total_mass within max_rel_error.

Parameters:

Name	Type	Description	Default
`rng`	`Generator`	NumPy random number generator (np.random.Generator)	required

Returns:

Type	Description
`SystemPlan`	SystemPlan with all chains and total mass

UniformPolydisperse ¶

UniformPolydisperse(min_dp, max_dp, random_seed=None)

Uniform distribution over degree of polymerization (DP).

All integer DP values between min_dp and max_dp (inclusive) are equally likely.

Parameters:

Name	Type	Description	Default
`min_dp`	`int`	Lower bound (>= 1).	required
`max_dp`	`int`	Upper bound (>= min_dp).	required
`random_seed`	`int | None`	Optional random seed.	`None`

dp_pmf ¶

dp_pmf(dp_array)

PMF: equal probability for all integer DP in [min_dp, max_dp].

sample_dp ¶

sample_dp(rng)

Sample DP uniformly from [min_dp, max_dp].

VdWSeparator ¶

VdWSeparator(buffer=0.0)

Separator based on van der Waals radii.

Calculates separation as sum of VdW radii of the two port anchor atoms, plus an optional buffer distance.

NOTE: VdW radii are designed for non-bonded contacts (~3-4 Å). For bonded atoms, use CovalentSeparator instead.

Initialize VdW separator.

Parameters:

Name	Type	Description	Default
`buffer`	`float`	Additional buffer distance in Angstroms (default: 0.0)	`0.0`

get_separation ¶

get_separation(left_struct, right_struct, left_port, right_port)

Calculate separation based on VdW radii.

Parameters:

Name	Type	Description	Default
`left_struct`	`Atomistic`	Previous structure in sequence	required
`right_struct`	`Atomistic`	Next structure to place	required
`left_port`	`Atom`	Connection port on left structure	required
`right_port`	`Atom`	Connection port on right structure	required

Returns:

Type	Description
`float`	Separation distance = vdw_left + vdw_right + buffer

WeightedSequenceGenerator ¶

WeightedSequenceGenerator(monomer_weights)

Sequence generator based on monomer weights/proportions.

Each selection is independent (no memory of previous selections).
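Independent weighted selection can be sketched with the standard library. The function names mirror the methods of this class but are standalone, hypothetical helpers, and the weights shown are illustrative.

```python
import random

def expected_composition(monomer_weights):
    """Normalize weights to long-chain monomer fractions."""
    total = sum(monomer_weights.values())
    return {m: w / total for m, w in monomer_weights.items()}

def generate_sequence(monomer_weights, dp, rng):
    """Draw dp labels independently, each with probability proportional to its weight."""
    labels = list(monomer_weights)
    weights = [monomer_weights[m] for m in labels]
    return rng.choices(labels, weights=weights, k=dp)

weights = {"A": 2.0, "B": 1.0, "C": 1.0}
sequence = generate_sequence(weights, 12, random.Random(3))
fractions = expected_composition(weights)
```

Since each draw ignores the previous one, short chains can deviate noticeably from `fractions`; the expected composition is only approached for long chains.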

expected_composition ¶

expected_composition()

Return expected long-chain monomer fractions.

generate_sequence ¶

generate_sequence(dp, rng)

Generate a sequence of specified degree of polymerization.

Parameters:

Name	Type	Description	Default
`dp`	`int`	Degree of polymerization (number of monomers)	required
`rng`	`Generator`	numpy random Generator	required

Returns:

Type	Description
`list[str]`	List of monomer identifiers
