Builder¶
System assembly: polymer chain construction from CGSmiles topology and monomer libraries.
Quick reference¶
| Symbol | Summary | Preferred for |
|---|---|---|
| PolymerBuilder | Build chains from CGSmiles + library + connector + placer | Full control over assembly |
| polymer(cgsmiles, ...) | Tool: CGSmiles → chain in one call | Quick prototyping |
| Connector | Port selection rules + reaction binding | Defining which ports react |
| Placer | Geometric placement (separator + orienter) | Controlling inter-monomer geometry |
| CovalentSeparator | Covalent radii-based distance | Default monomer spacing |
| LinearOrienter | Linear chain orientation | Default growth direction |
Canonical example¶
from molpy.builder.polymer import (
    PolymerBuilder, Connector, Placer,
    CovalentSeparator, LinearOrienter,
)
from molpy.tool import polymer

# eo_template and rxn are placeholders assumed to be defined beforehand.
builder = PolymerBuilder(
    library={"EO": eo_template},
    connector=Connector(port_map={("EO", "EO"): (">", "<")}, reacter=rxn),
    placer=Placer(
        separator=CovalentSeparator(buffer=-0.1),
        orienter=LinearOrienter(),
    ),
)
result = builder.build("{[#EO]|10}")
chain = result.polymer

# Or use the tool function:
result = polymer("{[#EO]|10}", library={"EO": eo_template}, reacter=rxn)
chain = result.polymer
Full API¶
Crystal¶
crystal ¶
Crystal lattice builder module - LAMMPS-style crystal structure generator.
This module provides tools for creating crystal structures:
- Define Bravais lattices with basis sites
- Predefined common lattice types (SC, BCC, FCC, rocksalt)
- Define regions in lattice or Cartesian coordinates
- Efficient vectorized unit cell tiling and atom generation
Example
lat = Lattice.cubic_fcc(a=3.52, species="Ni")
region = BlockRegion(0, 10, 0, 10, 0, 10, coord_system="lattice")
builder = CrystalBuilder(lat)
structure = builder.build_block(region)
BlockRegion ¶
Bases: Region
Axis-aligned box region
Define a box region specified by x, y, z ranges.
Parameters¶
xmin, xmax : float
    x-direction range [xmin, xmax]
ymin, ymax : float
    y-direction range [ymin, ymax]
zmin, zmax : float
    z-direction range [zmin, zmax]
coord_system : CoordSystem, optional
    Coordinate system, default is "lattice"
Examples¶
Region in lattice coordinates¶
region = BlockRegion(0, 10, 0, 10, 0, 10, coord_system="lattice")
Region in Cartesian coordinates¶
region = BlockRegion(0, 30, 0, 30, 0, 30, coord_system="cartesian")
Initialize box region
Parameters¶
xmin, xmax : float
    x-direction range
ymin, ymax : float
    y-direction range
zmin, zmax : float
    z-direction range
coord_system : CoordSystem, optional
    Coordinate system
contains_mask ¶
Check if points are in the box (vectorized)
Parameters¶
points : np.ndarray
    Point coordinates array of shape (N, 3)
Returns¶
np.ndarray
    Boolean array of shape (N,)
Examples¶
region = BlockRegion(0, 10, 0, 10, 0, 10)
points = np.array([[5, 5, 5], [15, 5, 5]])
mask = region.contains_mask(points)
print(mask)  # [ True False]
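As a rough illustration of what a vectorized box test can look like, the sketch below reimplements the check with plain NumPy. The function `box_contains_mask` is hypothetical and not part of the library, and treating the bounds as inclusive is an assumption, since the description above does not say whether points on the box faces count as inside.

```python
import numpy as np

def box_contains_mask(points, lo, hi):
    """Return a boolean mask of shape (N,) for points inside [lo, hi] on every axis."""
    points = np.asarray(points, dtype=float)
    lo = np.asarray(lo, dtype=float)
    hi = np.asarray(hi, dtype=float)
    # Compare all N points against both bounds at once,
    # then require that all three axes pass (inclusive bounds are an assumption).
    return np.all((points >= lo) & (points <= hi), axis=1)

points = np.array([[5, 5, 5], [15, 5, 5]])
mask = box_contains_mask(points, lo=(0, 0, 0), hi=(10, 10, 10))
print(mask)  # [ True False]
```

The comparison broadcasts the (3,) bounds against the (N, 3) point array, so no Python loop over points is needed.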
CrystalBuilder ¶
Crystal structure builder
Efficiently generate crystal structures using NumPy vectorized operations. Supports tiling lattices and creating atoms in specified regions.
Parameters¶
lattice : Lattice
    Lattice definition to use
Examples¶
Create a simple FCC structure¶
lat = Lattice.cubic_fcc(a=3.52, species="Ni")
region = BlockRegion(0, 10, 0, 10, 0, 10, coord_system="lattice")
builder = CrystalBuilder(lat)
structure = builder.build_block(region)
print(len(structure.atoms))
Initialize crystal builder
Parameters¶
lattice : Lattice
    Lattice definition
build_block ¶
Build crystal structure within a box region
This method efficiently generates crystal structures using vectorized operations:
1. Determine cell index ranges to tile
2. Use NumPy meshgrid and broadcasting to generate all atom positions
3. Apply region filtering
4. Create and return Atomistic structure
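The four steps above can be sketched with plain NumPy. This is only an illustration under stated assumptions, not the library's implementation: `tile_block` is a hypothetical helper, the region is given in lattice units, and the half-open interval [lo, hi) used for filtering is an assumption.

```python
import numpy as np

def tile_block(cell, basis_frac, n_cells, lo, hi):
    """Tile a unit cell and keep sites whose lattice coordinates fall in [lo, hi)."""
    cell = np.asarray(cell, dtype=float)              # (3, 3), rows are a1, a2, a3
    basis_frac = np.asarray(basis_frac, dtype=float)  # (B, 3) fractional basis sites
    # Step 1: cell index ranges to tile.
    i, j, k = (np.arange(n) for n in n_cells)
    # Step 2: meshgrid enumerates every (i, j, k); broadcasting adds every basis site.
    grid = np.stack(np.meshgrid(i, j, k, indexing="ij"), axis=-1).reshape(-1, 1, 3)
    frac = (grid + basis_frac[None, :, :]).reshape(-1, 3)
    # Step 3: region filter in lattice units (half-open bounds are an assumption).
    keep = np.all((frac >= lo) & (frac < hi), axis=1)
    # Step 4: convert the surviving sites to Cartesian coordinates.
    return frac[keep] @ cell

fcc_basis = [(0, 0, 0), (0.5, 0.5, 0), (0.5, 0, 0.5), (0, 0.5, 0.5)]
positions = tile_block(3.52 * np.eye(3), fcc_basis, n_cells=(2, 2, 2), lo=0.0, hi=2.0)
print(positions.shape)  # (32, 3)
```

A 2×2×2 tiling of a four-site FCC basis gives 8 × 4 = 32 positions, all of which pass the filter here.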
Parameters¶
region : BlockRegion
    Box defining the region for atom generation
i_range, j_range, k_range : range | None, optional
    Explicitly specify cell index ranges. If not provided:
    - For "lattice" coordinate system: inferred from region boundaries
    - For "cartesian" coordinate system: must be provided, otherwise raises error
Returns¶
Atomistic
    Generated crystal structure containing atoms and box information
Raises¶
ValueError
    If coord_system == "cartesian" and explicit ranges are not provided
Examples¶
Using lattice coordinates (auto-infer ranges)¶
lat = Lattice.cubic_sc(a=2.0, species="Cu")
region = BlockRegion(0, 10, 0, 10, 0, 10, coord_system="lattice")
builder = CrystalBuilder(lat)
structure = builder.build_block(region)
Using explicit ranges¶
structure = builder.build_block(
    region,
    i_range=range(0, 5),
    j_range=range(0, 5),
    k_range=range(0, 5),
)
Cartesian coordinates (must provide ranges)¶
region_cart = BlockRegion(0, 20, 0, 20, 0, 20, coord_system="cartesian")
structure = builder.build_block(
    region_cart,
    i_range=range(0, 10),
    j_range=range(0, 10),
    k_range=range(0, 10),
)
Notes¶
- This method uses no Python loops; it relies entirely on NumPy vectorized operations
- Generated structure contains:
    - Atom positions (Cartesian coordinates)
    - Atom species
    - Box information (lattice vectors)
- For an empty basis (no basis sites), returns an empty Atomistic structure
Lattice ¶
Bravais lattice with basis sites.
This class defines a crystal lattice structure, including lattice vectors and basis sites. Lattice vectors define the shape and size of the unit cell, while basis sites define the positions of atoms within the cell (in fractional coordinates).
Parameters¶
a1, a2, a3 : np.ndarray
    Lattice vectors, each is a NumPy array of shape (3,)
basis : list[Site]
    List of basis sites in fractional coordinates
Attributes¶
a1, a2, a3 : np.ndarray
    Lattice vectors
basis : list[Site]
    List of basis sites
Examples¶
Create simple cubic lattice¶
lat = Lattice.cubic_sc(a=2.0, species="Cu")
Create face-centered cubic lattice¶
lat = Lattice.cubic_fcc(a=3.52, species="Ni")
Create rocksalt structure¶
lat = Lattice.rocksalt(a=5.64, species_a="Na", species_b="Cl")
Initialize lattice
Parameters¶
a1, a2, a3 : np.ndarray
    Lattice vectors of shape (3,)
basis : list[Site]
    List of basis sites
cell (property) ¶
Return 3×3 cell matrix with lattice vectors as rows
Returns¶
np.ndarray
    Matrix of shape (3, 3), each row is a lattice vector [a1; a2; a3]
cubic_bcc (classmethod) ¶
Create a body-centered cubic (BCC) lattice.
The BCC lattice has two atoms per unit cell: one at the corner and one at the body center.
Parameters¶
a : float
    Lattice constant (in Å)
species : str
    Atomic species (e.g., "Fe", "W")
Returns¶
Lattice
    Body-centered cubic lattice
Examples¶
lat = Lattice.cubic_bcc(a=3.0, species="Fe")
print(len(lat.basis))  # 2
cubic_fcc (classmethod) ¶
Create a face-centered cubic (FCC) lattice.
The conventional FCC cell has four atoms: one at the corner and three at face centers (each of the six face sites is shared between two neighboring cells).
Parameters¶
a : float
    Lattice constant (in Å)
species : str
    Atomic species (e.g., "Ni", "Cu", "Al")
Returns¶
Lattice
    Face-centered cubic lattice
Examples¶
lat = Lattice.cubic_fcc(a=3.52, species="Ni")
print(len(lat.basis))  # 4
cubic_sc (classmethod) ¶
Create a simple cubic (SC) lattice.
The simple cubic lattice is the simplest lattice type, with one atom per unit cell.
Parameters¶
a : float
    Lattice constant (in Å)
species : str
    Atomic species (e.g., "Cu", "Fe")
Returns¶
Lattice
    Simple cubic lattice
Examples¶
lat = Lattice.cubic_sc(a=2.0, species="Cu")
print(len(lat.basis))  # 1
frac_to_cart ¶
Convert fractional coordinates to Cartesian coordinates
Fractional coordinates (u, v, w) represent position relative to lattice vectors: cart = u*a1 + v*a2 + w*a3 = frac @ cell
Parameters¶
frac : np.ndarray
    Fractional coordinates of shape (N, 3) or (3,), containing (u, v, w) values
Returns¶
np.ndarray
    Cartesian coordinates, same shape as input
Examples¶
lat = Lattice.cubic_sc(a=2.0, species="Cu")
frac = np.array([0.5, 0.5, 0.5])
cart = lat.frac_to_cart(frac)
print(cart)  # [1. 1. 1.]
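To make the row-vector convention concrete, here is a small NumPy-only sketch that does not use the library. The non-cubic cell matrix is an arbitrary example chosen for illustration, not one of the predefined lattices.

```python
import numpy as np

# Rows are the lattice vectors a1, a2, a3 (an arbitrary non-cubic cell).
cell = np.array([
    [2.0, 0.0, 0.0],
    [1.0, 2.0, 0.0],
    [0.0, 0.0, 3.0],
])

frac = np.array([[0.5, 0.5, 0.5], [0.25, 0.0, 1.0]])
cart = frac @ cell  # each row becomes u*a1 + v*a2 + w*a3

# The same result written out term by term for the first point.
u, v, w = frac[0]
explicit = u * cell[0] + v * cell[1] + w * cell[2]
print(cart[0], explicit)
```

Because the lattice vectors are stored as rows, fractional coordinates multiply from the left; with vectors stored as columns the product would be `cell @ frac` instead.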
rocksalt (classmethod) ¶
Create rocksalt (NaCl) structure
Rocksalt structure consists of two interpenetrating FCC sublattices. Each unit cell contains 4 A atoms and 4 B atoms.
Parameters¶
a : float
    Lattice constant (in Å)
species_a : str
    First atomic species (e.g., "Na")
species_b : str
    Second atomic species (e.g., "Cl")
Returns¶
Lattice
    Rocksalt structure lattice
Examples¶
lat = Lattice.rocksalt(a=5.64, species_a="Na", species_b="Cl")
print(len(lat.basis))  # 8
Region ¶
Bases: ABC
Abstract geometric region class
Define a spatial region that can be represented in lattice or Cartesian coordinates.
Parameters¶
coord_system : CoordSystem
    Coordinate system, "lattice" or "cartesian"
    - "lattice": point coordinates in lattice units
    - "cartesian": point coordinates in Cartesian coordinates (Å)
Notes¶
Subclasses must implement contains_mask method using NumPy
vectorized operations to efficiently check if multiple points are in the region.
Initialize region
Parameters¶
coord_system : CoordSystem, optional
    Coordinate system, default is "lattice"
contains_mask (abstractmethod) ¶
Check if points are in the region (vectorized)
Parameters¶
points : np.ndarray
    Point coordinates array of shape (N, 3)
Returns¶
np.ndarray
    Boolean array of shape (N,), True indicates point is in region
Notes¶
- If coord_system == "lattice": points are in lattice units
- If coord_system == "cartesian": points are in Cartesian coordinates
- Must use vectorized operations, no Python loops
Site (dataclass) ¶
Lattice basis site in fractional coordinates.
Attributes¶
label : str
    Site identifier or name (e.g., "A", "B1")
species : str
    Chemical species or type name (e.g., "Ni", "Na", "Cl")
frac : tuple[float, float, float]
    Fractional coordinates (u, v, w) relative to the Bravais cell, typically in [0, 1)
charge : float, optional
    Charge, default is 0.0
meta : dict[str, Any] | None, optional
    Optional metadata dictionary
Examples¶
site = Site(label="A", species="Cu", frac=(0.0, 0.0, 0.0))
site_charged = Site(label="Na", species="Na", frac=(0.0, 0.0, 0.0), charge=1.0)
Polymer¶
polymer ¶
Polymer assembly module.
Provides linear polymer assembly with both topology-only and chemical reaction connectors, plus optional geometric placement via Placer strategies.
Chain (dataclass) ¶
Represents a single polymer chain.
Attributes:
| Name | Type | Description |
|---|---|---|
| dp | int | Degree of polymerization (number of monomers) |
| monomers | list[str] | List of monomer identifiers in the chain |
| mass | float | Total mass of the chain (g/mol) |
Connector ¶
Select ports and execute reactions between adjacent monomers.
Port selection strategy (applied in order):
1. Explicit port_map lookup for (left_label, right_label)
2. Compatibility: > on left pairs with < on right
3. Single-port: each side has exactly one unconsumed port
4. Common name: both sides share a port name (for $ ports)
5. Raise AmbiguousPortsError
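The precedence of these rules can be sketched in a few lines of plain Python. This is a hedged illustration, not the library's code: `select_port_names` is a hypothetical function that works on port names only, `ValueError` stands in for AmbiguousPortsError, and requiring exactly one shared name in rule 4 is an assumption.

```python
def select_port_names(left_label, right_label, left_ports, right_ports, port_map=None):
    """Pick (left_port_name, right_port_name) following the documented precedence."""
    port_map = port_map or {}
    # 1. An explicit mapping for this label pair wins.
    if (left_label, right_label) in port_map:
        return port_map[(left_label, right_label)]
    # 2. Directional compatibility: ">" on the left pairs with "<" on the right.
    if ">" in left_ports and "<" in right_ports:
        return (">", "<")
    # 3. Exactly one available port on each side.
    if len(left_ports) == 1 and len(right_ports) == 1:
        return (next(iter(left_ports)), next(iter(right_ports)))
    # 4. A port name shared by both sides, e.g. "$" (uniqueness is an assumption).
    common = sorted(set(left_ports) & set(right_ports))
    if len(common) == 1:
        return (common[0], common[0])
    # 5. Nothing matched unambiguously (stand-in for AmbiguousPortsError).
    raise ValueError(f"ambiguous ports between {left_label!r} and {right_label!r}")

print(select_port_names("EO", "EO", ["<", ">"], ["<", ">"]))  # ('>', '<')
print(select_port_names("A", "B", ["$", "x"], ["$", "y"]))    # ('$', '$')
```

Ordering the checks from most to least specific means an explicit `port_map` entry always overrides the heuristics.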
connect ¶
Execute the chemical reaction between two structures.
select_ports ¶
Select which ports to connect.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| left | Atomistic | Left Atomistic structure. | required |
| right | Atomistic | Right Atomistic structure. | required |
| left_ports | Mapping[str, list[Atom]] | Available ports on left (name -> list[Atom]). | required |
| right_ports | Mapping[str, list[Atom]] | Available ports on right (name -> list[Atom]). | required |
| ctx | ConnectorContext | Context with step info and labels. | required |
Returns:
| Type | Description |
|---|---|
| tuple[str, int, str, int, None] | (left_port_name, left_idx, right_port_name, right_idx, None) |
ConnectorContext ¶
Bases: dict[str, Any]
Shared context passed to the connector during linear build.
Keys:
- step: int (current connection step index)
- left_label: str (label of left monomer)
- right_label: str (label of right monomer)
- sequence: list[str] (full sequence being built)
CovalentSeparator ¶
Separator based on typical bond lengths (for bonded atoms).
Uses realistic bond lengths based on element types. Typical bond lengths:
- C-C: 1.54 Å (single), 1.34 Å (double)
- C-O: 1.43 Å (single), 1.23 Å (double)
- C-N: 1.47 Å (single)
- O-H: 0.96 Å
- N-H: 1.01 Å
Initialize covalent separator.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| buffer | float | Additional buffer distance in Angstroms (default: 0.0). Can be negative to account for slight compression. | 0.0 |
get_separation ¶
Calculate separation based on typical bond lengths.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| left_struct | Atomistic | Previous structure in sequence | required |
| right_struct | Atomistic | Next structure to place | required |
| left_port | Atom | Connection port on left structure | required |
| right_port | Atom | Connection port on right structure | required |
Returns:
| Type | Description |
|---|---|
| float | Separation distance = typical_bond_length + buffer |
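A minimal sketch of the lookup-plus-buffer idea, using only the single-bond lengths listed in the class description. The function `covalent_separation`, the element-pair table layout, and the 1.5 Å fallback for unlisted pairs are all assumptions made for illustration; the library may resolve bond lengths differently.

```python
# Single-bond lengths in Å, copied from the list in the class description.
BOND_LENGTHS = {
    frozenset(("C",)): 1.54,      # C-C (frozenset collapses the repeated element)
    frozenset(("C", "O")): 1.43,  # C-O
    frozenset(("C", "N")): 1.47,  # C-N
    frozenset(("O", "H")): 0.96,  # O-H
    frozenset(("N", "H")): 1.01,  # N-H
}

def covalent_separation(left_element, right_element, buffer=0.0, default=1.5):
    """Return typical_bond_length + buffer; `default` is a hypothetical fallback."""
    key = frozenset((left_element, right_element))  # order-independent lookup
    return BOND_LENGTHS.get(key, default) + buffer

d_cc = covalent_separation("C", "C")
d_co = covalent_separation("O", "C", buffer=-0.1)
print(round(d_cc, 2), round(d_co, 2))
```

Keying the table on a `frozenset` makes the lookup symmetric, so C-O and O-C give the same distance.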
DPDistribution ¶
Bases: Protocol
Protocol for distributions that sample degree of polymerization directly.
Distributions implementing this protocol can sample DP values without requiring monomer mass information. This is suitable for distributions defined in DP space (e.g., Poisson, Uniform).
dp_pmf ¶
Probability mass function for DP values.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| dp_array | ndarray | Array of DP values | required |
Returns:
| Type | Description |
|---|---|
| ndarray | Array of probability mass values |
sample_dp ¶
Sample degree of polymerization from distribution.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| rng | Generator | NumPy random number generator | required |
Returns:
| Type | Description |
|---|---|
| int | Degree of polymerization (>= 1) |
FlorySchulzPolydisperse ¶
Flory-Schulz (geometric) distribution for degree of polymerization.
PMF: P(N = k) = a^2 * k * (1 - a)^(k-1), k = 1, 2, ...
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| a | float | Probability parameter (0 < a < 1), related to extent of reaction. | required |
| random_seed | int \| None | Optional random seed. | None |
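The PMF above can be checked numerically with NumPy. The sketch below is independent of the library: `flory_schulz_pmf` is a hypothetical helper, and the sampling trick (two geometric draws) is one possible way to draw from this PMF, not necessarily what FlorySchulzPolydisperse does internally.

```python
import numpy as np

def flory_schulz_pmf(k, a):
    """P(N = k) = a^2 * k * (1 - a)^(k - 1) for k = 1, 2, ..."""
    k = np.asarray(k, dtype=float)
    return a**2 * k * (1.0 - a) ** (k - 1.0)

a = 0.1
k = np.arange(1, 2001)
pmf = flory_schulz_pmf(k, a)
total = pmf.sum()          # close to 1 once the tail is negligible
mean_dp = (k * pmf).sum()  # analytically (2 - a) / a

# One way to sample: the sum of two independent geometric variables
# (support 1, 2, ...) minus 1 follows exactly this PMF.
rng = np.random.default_rng(0)
samples = rng.geometric(a, size=5) + rng.geometric(a, size=5) - 1
print(total, mean_dp, samples)
```

For a = 0.1 the mean evaluates to (2 - 0.1) / 0.1 = 19 monomers per chain.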
GrowthKernel ¶
Bases: Protocol
Protocol for local transition function in port-level stochastic growth.
A GrowthKernel decides which monomer (if any) to add next for a given reactive port on the growing polymer. This encapsulates the reaction probability logic from G-BigSMILES notation.
choose_next_for_port ¶
Choose next monomer for a given port.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| polymer | Atomistic | Current polymer structure | required |
| port | Atom | Port to extend from | required |
| candidates | Sequence[MonomerTemplate] | Available monomer templates | required |
| rng | Generator \| None | Random number generator for sampling | None |
Returns:
| Name | Type | Description |
|---|---|---|
| MonomerPlacement | MonomerPlacement \| None | Add this template at target port |
| None | MonomerPlacement \| None | Terminate this port (implicit end-group) |
LinearOrienter ¶
Orienter for linear polymer arrangement.
Aligns the next monomer so that:
1. The two port atoms are separated by the specified distance
2. The port connection axis of the next monomer aligns with the port connection axis of the previous monomer
3. The monomer extends in a linear fashion
get_orientation ¶
Calculate linear alignment transformation.
Strategy:
1. Get direction vector from left port anchor (outward)
2. Place right structure so its port anchor is at the target position
3. Align right structure's port direction with left port direction
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| left_struct | Atomistic | Previous structure in sequence | required |
| right_struct | Atomistic | Next structure to place | required |
| left_port | Atom | Connection port on left structure | required |
| right_port | Atom | Connection port on right structure | required |
| separation | float | Distance between port anchors | required |
Returns:
| Type | Description |
|---|---|
| tuple[ndarray, ndarray] | Tuple of (translation_vector, rotation_matrix) |
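One way to compute such a (translation, rotation) pair is Rodrigues' rotation formula. The sketch below is an assumption-laden illustration, not the library's algorithm: `rotation_between` and `linear_placement` are hypothetical names, directions are passed as plain vectors rather than read from Atom objects, and the convention that the right port's outward direction ends up antiparallel to the left port's (so the ports face each other) is an assumption.

```python
import numpy as np

def rotation_between(u, v):
    """Rotation matrix R such that R @ u is parallel to v (Rodrigues' formula)."""
    u = np.asarray(u, dtype=float) / np.linalg.norm(u)
    v = np.asarray(v, dtype=float) / np.linalg.norm(v)
    axis = np.cross(u, v)
    c = float(np.dot(u, v))
    if np.isclose(c, -1.0):
        # Antiparallel: rotate by 180 degrees about any axis perpendicular to u.
        helper = np.array([1.0, 0.0, 0.0]) if abs(u[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
        n = np.cross(u, helper)
        n /= np.linalg.norm(n)
        return 2.0 * np.outer(n, n) - np.eye(3)
    K = np.array([[0.0, -axis[2], axis[1]],
                  [axis[2], 0.0, -axis[0]],
                  [-axis[1], axis[0], 0.0]])
    return np.eye(3) + K + K @ K / (1.0 + c)

def linear_placement(left_anchor, left_dir, right_anchor, right_dir, separation):
    """Return (translation, rotation); new coordinates are R @ x + t."""
    left_dir = np.asarray(left_dir, dtype=float) / np.linalg.norm(left_dir)
    # Assumption: the right port should point back toward the left monomer.
    R = rotation_between(right_dir, -left_dir)
    target = np.asarray(left_anchor, dtype=float) + separation * left_dir
    t = target - R @ np.asarray(right_anchor, dtype=float)
    return t, R

t, R = linear_placement([0, 0, 0], [1, 0, 0], [1, 1, 0], [0, 1, 0], separation=1.5)
new_anchor = R @ np.array([1.0, 1.0, 0.0]) + t
print(new_anchor)
```

Applying the rotation first and the translation second places the right port anchor exactly `separation` along the left port's outward direction.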
MassDistribution ¶
Bases: Protocol
Protocol for distributions that sample molecular weight directly.
Distributions implementing this protocol sample mass values directly from the distribution without converting through DP. This is suitable for distributions defined in mass space (e.g., Schulz-Zimm).
mass_pdf ¶
Probability density function for mass values.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| mass_array | ndarray | Array of mass values (g/mol) | required |
Returns:
| Type | Description |
|---|---|
| ndarray | Array of probability density values |
sample_mass ¶
Sample molecular weight from distribution.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| rng | Generator | NumPy random number generator | required |
Returns:
| Type | Description |
|---|---|
| float | Molecular weight (g/mol, > 0) |
MonomerPlacement (dataclass) ¶
Decision for next monomer placement during stochastic growth.
Represents the output of a GrowthKernel's decision: which template to add and which port on that template to connect.
Attributes:
| Name | Type | Description |
|---|---|---|
| template | MonomerTemplate | MonomerTemplate to add |
| target_descriptor_id | int | Which port descriptor on the new monomer to connect |
Example
placement = MonomerPlacement(
    template=eo_template,
    target_descriptor_id=1,  # Connect via port descriptor 1
)
print(f"Add {placement.template.label} at port {placement.target_descriptor_id}")
MonomerTemplate (dataclass) ¶
Template for a monomer with port descriptors and metadata.
This represents a monomer type that can be instantiated multiple times during stochastic growth. Each instantiation creates a fresh copy of the structure.
Attributes:
| Name | Type | Description |
|---|---|---|
| label | str | Monomer label (e.g., "EO2", "PS") |
| structure | Atomistic | Base Atomistic structure (will be copied on instantiation) |
| port_descriptors | dict[int, PortDescriptor] | Mapping from descriptor_id to PortDescriptor |
| mass | float | Molecular weight (g/mol) |
| metadata | dict[str, Any] | Additional metadata (optional) |
Example
template = MonomerTemplate(
    label="EO",
    structure=eo_monomer,
    port_descriptors={
        0: PortDescriptor(0, "<", role="left"),
        1: PortDescriptor(1, ">", role="right"),
    },
    mass=44.05,
)
fresh_copy = template.instantiate()
print(f"Template: {template.label}, mass={template.mass} g/mol")
get_all_descriptors ¶
Get all port descriptors for this template.
Returns:
| Type | Description |
|---|---|
| list[PortDescriptor] | List of all PortDescriptor objects sorted by descriptor_id |
Example
template = MonomerTemplate(...)
descriptors = template.get_all_descriptors()
for desc in descriptors:
    print(f"Port {desc.descriptor_id}: {desc.port_name}")
get_port_by_descriptor ¶
Get port descriptor for a specific descriptor ID.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| descriptor_id | int | Descriptor ID to look up | required |
Returns:
| Type | Description |
|---|---|
| PortDescriptor \| None | PortDescriptor if found, None otherwise |
Example
template = MonomerTemplate(...)
left_port = template.get_port_by_descriptor(0)
if left_port:
    print(f"Port: {left_port.port_name}, role: {left_port.role}")
instantiate ¶
Create a fresh copy of the structure.
Each instantiation is independent with separate atoms and bonds, allowing the same template to be used multiple times in a polymer.
Returns:
| Type | Description |
|---|---|
| Atomistic | New Atomistic instance with independent atoms and bonds |
Example
template = MonomerTemplate(label="EO", structure=eo_monomer, ...)
copy1 = template.instantiate()
copy2 = template.instantiate()
copy1 is not copy2  # True: different objects
Placer ¶
Combined placer for positioning structures during assembly.
Uses a Separator to determine distance and an Orienter to determine orientation.
Initialize placer.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| separator | Separator | Separator for calculating distance | required |
| orienter | LinearOrienter | Orienter for calculating orientation | required |
place_monomer ¶
Position right_struct relative to left_struct.
Modifies right_struct's atomic coordinates in-place.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| left_struct | Atomistic | Previous structure in sequence | required |
| right_struct | Atomistic | Next structure to place | required |
| left_port | Atom | Connection port on left structure | required |
| right_port | Atom | Connection port on right structure | required |
PoissonPolydisperse ¶
Poisson distribution for the degree of polymerization (DP).
Zero-truncated: sampled k=0 is mapped to k=1.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| lambda_param | float | Mean of the Poisson distribution (> 0). | required |
| random_seed | int \| None | Optional random seed. | None |
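The zero handling described above can be sketched with NumPy's Poisson sampler. `sample_dp_poisson` is a hypothetical stand-in written for illustration, not the class's actual method.

```python
import numpy as np

def sample_dp_poisson(lambda_param, rng):
    """Draw a Poisson variate and map a sampled 0 to 1, as described above."""
    k = int(rng.poisson(lambda_param))
    return max(k, 1)

rng = np.random.default_rng(42)
dps = [sample_dp_poisson(0.5, rng) for _ in range(1000)]
print(min(dps), max(dps))
```

Note that mapping 0 to 1 moves the probability mass of k = 0 onto k = 1, which is not the same as renormalizing over k >= 1; the difference is only noticeable for small lambda_param.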
PolydisperseChainGenerator ¶
Middle layer: Chain-level generator.
Responsible for:
- Sampling chain size:
    - Either in DP-space via a DPDistribution (sample_dp)
    - Or in mass-space via a MassDistribution (sample_mass)
- Using a SequenceGenerator to build the chain sequence
- Computing the mass of a chain using monomer mass table and optional end-group mass
Does NOT know anything about total system mass. Only returns one chain at a time.
Initialize polydisperse chain generator.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| seq_generator | SequenceGenerator | Sequence generator for generating monomer sequences | required |
| monomer_mass | dict[str, float] | Dictionary mapping monomer identifiers to their masses (g/mol) | required |
| end_group_mass | float | Mass of end groups (g/mol), default 0.0 | 0.0 |
| distribution | DPDistribution \| MassDistribution \| None | Distribution implementing DPDistribution or MassDistribution protocol | None |
build_chain ¶
Sample DP, generate monomer sequence, and compute mass.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| rng | Generator | NumPy random number generator (np.random.Generator) | required |
Returns:
| Type | Description |
|---|---|
| Chain | Chain object with dp, monomers, and mass |
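The three stages (sample DP, generate sequence, compute mass) can be sketched as follows for the DP-space case. `build_chain_sketch` and its callable arguments are hypothetical; the sketch returns a plain dict instead of a Chain and does not cover how a mass-space distribution would be converted to a chain length, which the text above does not specify.

```python
import numpy as np

def build_chain_sketch(sample_dp, generate_sequence, monomer_mass, rng, end_group_mass=0.0):
    """Sample a DP, build a sequence of that length, and sum the masses."""
    dp = sample_dp(rng)
    monomers = generate_sequence(dp, rng)
    # Chain mass = sum of monomer masses + optional end-group mass.
    mass = sum(monomer_mass[m] for m in monomers) + end_group_mass
    return {"dp": dp, "monomers": monomers, "mass": mass}

rng = np.random.default_rng(1)
chain = build_chain_sketch(
    sample_dp=lambda r: int(r.integers(5, 11)),   # DP uniformly in 5..10
    generate_sequence=lambda dp, r: ["EO"] * dp,  # homopolymer for simplicity
    monomer_mass={"EO": 44.05},
    rng=rng,
    end_group_mass=18.02,
)
print(chain["dp"], round(chain["mass"], 2))
```

Passing the sampler and sequence generator as callables mirrors the layering described above: the chain generator does not need to know which distribution or sequence rule is in use.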
sample_dp ¶
Sample a degree of polymerization from the distribution.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| rng | Generator | NumPy random number generator (np.random.Generator) | required |
Returns:
| Type | Description |
|---|---|
| int | Degree of polymerization (>= 1) |
sample_mass ¶
Sample a target chain mass from a mass-based distribution.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| rng | Generator | NumPy random number generator (np.random.Generator) | required |
Returns:
| Type | Description |
|---|---|
| float | Target chain mass in g/mol (>= 0) |
PolymerBuildResult (dataclass) ¶
Result of building a polymer.
PolymerBuilder ¶
Build polymers from CGSmiles notation with support for arbitrary topologies.
This builder parses CGSmiles strings and constructs polymers using a graph-based
approach, supporting:
- Linear chains: {[#A][#B][#C]}
- Branched structures: {[#A]([#B])[#C]}
- Cyclic structures: {[#A]1[#B][#C]1}
- Repeat operators: {[#A]|10}
Example
builder = PolymerBuilder(
    library={"EO2": eo2_monomer, "PS": ps_monomer},
    connector=connector,
    typifier=typifier,
)
result = builder.build("{[#EO2]|8[#PS]}")
Initialize the polymer builder.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| library | Mapping[str, Atomistic] | Mapping from CGSmiles labels to Atomistic monomer structures | required |
| connector | Connector | Connector for port selection and chemical reactions | required |
| typifier | TypifierBase \| None | Optional typifier for automatic retypification | None |
| placer | Placer \| None | Optional Placer for positioning structures before connection | None |
build ¶
Build a polymer from a CGSmiles string.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| cgsmiles | str | CGSmiles notation string (e.g., "{[#EO2]\|8[#PS]}") | required |
Returns:
| Type | Description |
|---|---|
| PolymerBuildResult | PolymerBuildResult containing the assembled polymer and metadata |
Raises:
| Type | Description |
|---|---|
| ValueError | If CGSmiles is invalid |
| SequenceError | If labels in CGSmiles are not found in library |
PortDescriptor (dataclass) ¶
Descriptor for a reactive port on a monomer template.
Port descriptors identify ports with unique IDs and store metadata about port behavior (role, bond type, compatibility).
Attributes:
| Name | Type | Description |
|---|---|---|
| descriptor_id | int | Unique ID within template (e.g., 0, 1, 2) |
| port_name | str | Port name on atom (e.g., "<", ">", "branch") |
| role | str \| None | Port role (e.g., "left", "right", "branch") |
| bond_kind | str \| None | Bond type (e.g., "-", "=", "#") |
| compat | set[str] \| None | Compatibility set for port matching |
Example
desc = PortDescriptor(
    descriptor_id=0,
    port_name="<",
    role="left",
    bond_kind="-",
    compat={"donor"},
)
print(f"Descriptor {desc.descriptor_id}: port '{desc.port_name}' ({desc.role})")
ProbabilityTableKernel ¶
GrowthKernel based on G-BigSMILES probability tables.
This kernel uses pre-computed probability tables that map each port descriptor to weighted choices over (template, target_descriptor_id) pairs. Weights are integers that are normalized to probabilities during sampling.
Initialize probability table kernel.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| probability_tables | dict[int, list[tuple[MonomerTemplate, int, int]]] | Maps descriptor_id -> [(template, target_desc, integer_weight)]. Integer weights are normalized to probabilities during sampling. | required |
| end_group_templates | dict[int, MonomerTemplate] \| None | Maps descriptor_id -> end-group template (no ports) | None |
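The weight normalization can be illustrated with a short NumPy sketch. Here `choose_weighted` is a hypothetical helper, templates are replaced by plain string labels, and returning None when all weights are zero is an assumption about how termination might be signaled.

```python
import numpy as np

def choose_weighted(entries, rng):
    """entries: list of (template_label, target_descriptor_id, integer_weight)."""
    weights = np.array([w for _, _, w in entries], dtype=float)
    if weights.sum() <= 0:
        return None  # nothing to add: treated here as termination (assumption)
    probs = weights / weights.sum()  # normalize integer weights to probabilities
    idx = rng.choice(len(entries), p=probs)
    label, target_desc, _ = entries[idx]
    return label, target_desc

# Descriptor 1 extends with EO three times as often as with PS.
table = {1: [("EO", 0, 3), ("PS", 0, 1)]}
rng = np.random.default_rng(0)
picks = [choose_weighted(table[1], rng)[0] for _ in range(4000)]
frac_eo = picks.count("EO") / len(picks)
print(frac_eo)
```

With weights 3 and 1 the expected EO fraction is 3 / (3 + 1) = 0.75, and the sampled fraction should land close to that value.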
choose_next_for_port ¶
Choose next monomer based on probability table.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| polymer | Atomistic | Current polymer structure | required |
| port | Atom | Port to extend from | required |
| candidates | Sequence[MonomerTemplate] | Available monomer templates | required |
| rng | Generator \| None | Random number generator (uses default if None) | None |
Returns:
| Type | Description |
|---|---|
| MonomerPlacement \| None | MonomerPlacement or None (terminate) |
SchulzZimmPolydisperse ¶
Schulz-Zimm molecular weight distribution for polydisperse polymer chains.
Implements MassDistribution: sampling is done directly in molecular-weight space.
The probability density is:
$$
f(M) = \frac{z^{z+1}}{\Gamma(z+1)} \, \frac{M^{z-1}}{M_n^{z}} \, \exp\left(-\frac{z M}{M_n}\right),
$$
where z = Mn / (Mw - Mn). This is equivalent to a Gamma distribution with shape z and scale theta = Mw - Mn.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| Mn | float | Number-average molecular weight (g/mol). | required |
| Mw | float | Weight-average molecular weight (g/mol), must satisfy Mw > Mn. | required |
| random_seed | int \| None | Optional random seed. | None |
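The Gamma equivalence stated above suggests a direct sampling sketch with NumPy. `sample_schulz_zimm` is a hypothetical function written for illustration; it is not claimed to match the class's internals.

```python
import numpy as np

def sample_schulz_zimm(Mn, Mw, size, rng):
    """Draw masses from a Gamma distribution with shape z and scale Mw - Mn."""
    if Mw <= Mn:
        raise ValueError("Mw must be greater than Mn")
    z = Mn / (Mw - Mn)  # shape
    theta = Mw - Mn     # scale, so that z * theta == Mn
    return rng.gamma(shape=z, scale=theta, size=size)

rng = np.random.default_rng(0)
masses = sample_schulz_zimm(Mn=10_000.0, Mw=15_000.0, size=200_000, rng=rng)
mn_est = masses.mean()                         # number-average estimate
mw_est = (masses**2).sum() / masses.sum()      # weight-average estimate
print(mn_est, mw_est)
```

For a Gamma distribution the mean is z·theta = Mn and the ratio E[M²]/E[M] is (z + 1)·theta = Mw, so both input averages should be recovered from a large sample.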
SequenceGenerator ¶
Bases: Protocol
Protocol for sequence generators.
A sequence generator controls how monomers are arranged in a single chain.
expected_composition ¶
Return expected long-chain monomer fractions.
Returns:
| Type | Description |
|---|---|
| dict[str, float] | Dictionary mapping monomer identifiers to expected fractions |
generate_sequence ¶
Generate a monomer sequence of specified degree of polymerization.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| dp | int | Degree of polymerization (number of monomers) | required |
| rng | Generator | NumPy random Generator | required |
Returns:
| Type | Description |
|---|---|
| list[str] | List of monomer identifiers (strings) |
StochasticChain (dataclass) ¶
Result of stochastic BFS growth.
Contains the assembled polymer structure along with metadata about the growth process.
Attributes:
| Name | Type | Description |
|---|---|---|
| polymer | Atomistic | The assembled Atomistic structure |
| dp | int | Degree of polymerization (number of monomers added) |
| mass | float | Total molecular weight (g/mol) |
| growth_history | list[dict[str, Any]] | Metadata for each monomer addition step |
Example
chain = StochasticChain(
    polymer=final_structure,
    dp=25,
    mass=1101.25,
    growth_history=[...],
)
print(f"Built polymer: DP={chain.dp}, mass={chain.mass:.1f} g/mol")
SystemPlan (dataclass) ¶
Represents a complete system plan with all chains.
Attributes:
| Name | Type | Description |
|---|---|---|
| chains | list[Chain] | List of all chains in the system |
| total_mass | float | Total mass of all chains (g/mol) |
| target_mass | float | Target total mass that was requested (g/mol) |
SystemPlanner ¶
SystemPlanner(chain_generator, target_total_mass, max_rel_error=0.02, max_chains=None, enable_trimming=True)
Top layer: System-level planner.
Responsible for:
- Enforcing a target total mass for the overall system
- Iteratively requesting chains from PolydisperseChainGenerator
- Maintaining a running sum of total mass
- Stopping when mass reaches target window, and optionally trimming the final chain
Does NOT micromanage sequence probabilities or DP distribution; only orchestrates at the ensemble level.
Initialize system planner.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| chain_generator | PolydisperseChainGenerator | Chain generator for building chains | required |
| target_total_mass | float | Target total system mass (g/mol) | required |
| max_rel_error | float | Maximum relative error allowed (default 0.02 = 2%) | 0.02 |
| max_chains | int \| None | Maximum number of chains to generate (None = no limit) | None |
| enable_trimming | bool | Whether to enable chain trimming to better hit target mass | True |
plan_system ¶
Repeatedly ask chain_generator for new chains until accumulated mass reaches target_total_mass within max_rel_error.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| rng | Generator | NumPy random number generator (np.random.Generator) | required |
Returns:
| Type | Description |
|---|---|
| SystemPlan | SystemPlan with all chains and total mass |
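The accumulate-until-target loop can be sketched as below. This is a simplified illustration under assumptions: `plan_system_sketch` is hypothetical, chains are reduced to their masses, and "trimming" is modeled by shrinking the final mass to hit the target exactly, which may differ from how the library trims a real chain.

```python
import numpy as np

def plan_system_sketch(next_chain_mass, target_total_mass, rng,
                       max_rel_error=0.02, max_chains=None):
    """Collect chain masses until the running total enters the target window."""
    masses, total = [], 0.0
    lower = target_total_mass * (1.0 - max_rel_error)
    upper = target_total_mass * (1.0 + max_rel_error)
    while total < lower:
        if max_chains is not None and len(masses) >= max_chains:
            break
        m = next_chain_mass(rng)
        if total + m > upper:
            # Stand-in for trimming: shrink the final chain to land on the target.
            m = target_total_mass - total
        masses.append(m)
        total += m
    return masses, total

rng = np.random.default_rng(0)
masses, total = plan_system_sketch(lambda r: r.gamma(2.0, 5000.0), 1_000_000.0, rng)
print(len(masses), total)
```

The loop stops as soon as the running total is within max_rel_error of the target, so the planner never needs to know anything about the underlying chain-length distribution.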
UniformPolydisperse ¶
Uniform distribution over degree of polymerization (DP).
All integer DP values between min_dp and max_dp (inclusive) are equally likely.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| min_dp | int | Lower bound (>= 1). | required |
| max_dp | int | Upper bound (>= min_dp). | required |
| random_seed | int \| None | Optional random seed. | None |
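Sampling with both bounds inclusive is a common off-by-one pitfall, so here is a short NumPy sketch. `sample_dp_uniform` is a hypothetical helper, not the class's method.

```python
import numpy as np

def sample_dp_uniform(min_dp, max_dp, rng):
    """Draw an integer DP uniformly from min_dp..max_dp, both inclusive."""
    if min_dp < 1 or max_dp < min_dp:
        raise ValueError("require 1 <= min_dp <= max_dp")
    # endpoint=True makes max_dp itself a possible outcome.
    return int(rng.integers(min_dp, max_dp, endpoint=True))

rng = np.random.default_rng(0)
draws = {sample_dp_uniform(3, 5, rng) for _ in range(500)}
print(sorted(draws))
```

Without `endpoint=True`, `Generator.integers` excludes the upper bound, so max_dp would never be sampled.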
VdWSeparator ¶
Separator based on van der Waals radii.
Calculates separation as sum of VdW radii of the two port anchor atoms, plus an optional buffer distance.
NOTE: VdW radii are designed for non-bonded contacts (~3-4 Å). For bonded atoms, use CovalentSeparator instead.
Initialize VdW separator.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| buffer | float | Additional buffer distance in Angstroms (default: 0.0) | 0.0 |
get_separation ¶
Calculate separation based on VdW radii.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| left_struct | Atomistic | Previous structure in sequence | required |
| right_struct | Atomistic | Next structure to place | required |
| left_port | Atom | Connection port on left structure | required |
| right_port | Atom | Connection port on right structure | required |
Returns:
| Type | Description |
|---|---|
| float | Separation distance = vdw_left + vdw_right + buffer |
WeightedSequenceGenerator ¶
Sequence generator based on monomer weights/proportions.
Each selection is independent (no memory of previous selections).
generate_sequence ¶
Generate a sequence of specified degree of polymerization.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| dp | int | Degree of polymerization (number of monomers) | required |
| rng | Generator | NumPy random Generator | required |
Returns:
| Type | Description |
|---|---|
| list[str] | List of monomer identifiers |
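Independent weighted selection can be sketched with `Generator.choice`. The two functions below are hypothetical illustrations that take a plain weight dictionary; the real class's constructor arguments are not shown in this reference, so none are assumed here.

```python
import numpy as np

def generate_weighted_sequence(dp, weights, rng):
    """Draw dp monomer labels independently, with probability proportional to weight."""
    labels = list(weights)
    p = np.array([weights[m] for m in labels], dtype=float)
    p /= p.sum()
    return [str(m) for m in rng.choice(labels, size=dp, p=p)]

def expected_composition(weights):
    """Long-chain monomer fractions implied by the weights."""
    total = float(sum(weights.values()))
    return {m: w / total for m, w in weights.items()}

rng = np.random.default_rng(0)
seq = generate_weighted_sequence(10, {"EO": 3, "PO": 1}, rng)
print(seq, expected_composition({"EO": 3, "PO": 1}))
```

Because each draw ignores the previous one, the sequence is a statistical (random) copolymer whose long-chain composition converges to the normalized weights.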