Block
A Block is a small columnar table: it maps each column name to a NumPy array, and all columns describe the same set of rows (atoms, bonds, …).
This page shows the common operations you’ll use in practice: creating blocks, reading columns/rows, selecting rows, mutating columns, and round-tripping through serialization.
import molpy as mp
import numpy as np
1. Creating a Block
You can construct a Block from a plain dict of array-like values. Each column becomes a NumPy array.
atoms = mp.Block({
"id": [1, 2, 3],
"element": ["O", "H", "H"],
"x": [0.000, 0.957, -0.239],
"y": [0.000, 0.000, 0.927],
"z": [0.000, 0.000, 0.000],
})
print("nrows:", atoms.nrows)
print("shape:", atoms.shape)
print("columns:", list(atoms.keys()))
atoms
nrows: 3
shape: (3, 5)
columns: ['id', 'element', 'x', 'y', 'z']
Block(id: shape=(3,), element: shape=(3,), x: shape=(3,), y: shape=(3,), z: shape=(3,))
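The idea that every column covers the same rows can be sketched in plain NumPy. This is not molpy's implementation; `as_columns` is a hypothetical helper that only illustrates the conversion to arrays and the equal-length requirement that a columnar table implies.

```python
import numpy as np

def as_columns(data):
    """Hypothetical helper: convert array-likes to NumPy arrays and check lengths."""
    columns = {name: np.asarray(values) for name, values in data.items()}
    lengths = {len(arr) for arr in columns.values()}
    if len(lengths) > 1:
        raise ValueError(f"columns have different lengths: {sorted(lengths)}")
    return columns

cols = as_columns({"id": [1, 2, 3], "element": ["O", "H", "H"]})

# (rows, columns), analogous to the shape printed above
nrows = len(next(iter(cols.values())))
shape = (nrows, len(cols))
print(shape)
```

Whether Block itself rejects ragged input is not shown on this page, so treat the length check as an assumption.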
2. Columns Are NumPy Arrays
Reading a column returns an np.ndarray. NumPy infers each dtype from the input values.
If you need a specific dtype, cast explicitly (e.g. atoms["id"].astype(int)).
print("id dtype:", atoms["id"].dtype)
print("x dtype:", atoms["x"].dtype)
print("el dtype:", atoms["element"].dtype)
id dtype: int64
x dtype: float64
el dtype: <U1
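The dtypes above come from NumPy's own inference, which can be reproduced without molpy. The sketch below uses only `np.asarray` and `astype`; note that the default integer width can depend on the platform and NumPy version, which is one reason to cast explicitly.

```python
import numpy as np

# Plain NumPy: how array-like inputs pick up their dtypes.
ids = np.asarray([1, 2, 3])            # integers -> default integer dtype
xs = np.asarray([0.0, 0.957, -0.239])  # floats -> float64
els = np.asarray(["O", "H", "H"])      # strings -> fixed-width unicode
mixed = np.asarray([1, 2.5, 3])        # int mixed with float promotes to float64

# Explicit casts remove any dependence on inference.
ids64 = ids.astype(np.int64)
xs32 = xs.astype(np.float32)

print(ids.dtype, xs.dtype, els.dtype, mixed.dtype, ids64.dtype, xs32.dtype)
```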
3. Reading Columns and Computing Derived Arrays
A common workflow is: get numeric columns, stack them, and run NumPy operations.
block[["x", "y", "z"]] stacks columns into a 2D array shaped (nrows, 3) (good for numeric kernels).
xyz = atoms[["x", "y", "z"]]
r = np.linalg.norm(xyz, axis=1)
print("xyz shape:", xyz.shape)
print("r:", r)
xyz shape: (3, 3)
r: [0. 0.957 0.95731395]
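Once the coordinates are in an (nrows, 3) array, any NumPy kernel applies. As a sketch in plain NumPy (no molpy calls), the same layout can be built with `np.column_stack` and extended to pairwise distances with broadcasting:

```python
import numpy as np

x = np.array([0.000, 0.957, -0.239])
y = np.array([0.000, 0.000, 0.927])
z = np.array([0.000, 0.000, 0.000])

xyz = np.column_stack([x, y, z])   # shape (3, 3): one row per atom
r = np.linalg.norm(xyz, axis=1)    # distance of each atom from the origin

# Pairwise distances: (n, 1, 3) - (1, n, 3) broadcasts to (n, n, 3)
diff = xyz[:, None, :] - xyz[None, :, :]
dist = np.linalg.norm(diff, axis=-1)
print(dist.shape)
```

The broadcasted difference uses O(n²) memory, so this form suits small systems; larger ones usually need a neighbor-list approach.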
4. Selecting Rows
Row selection returns a new Block (still columnar).
Examples:
- atoms[0:2] (slice)
- atoms[mask] (boolean mask)
- atoms[np.array([...])] (pick/reorder)
If you need a scalar, index the column first: atoms["x"][0].
print("slice 0:2 ->", atoms[0:2])
mask = atoms["element"] == "H"
print("mask ->", mask)
print("atoms[mask] ->", atoms[mask])
print("reorder [2,0] ->", atoms[np.array([2, 0])])
print("scalar atoms['x'][0] ->", atoms["x"][0])
slice 0:2 -> Block(id: shape=(2,), element: shape=(2,), x: shape=(2,), y: shape=(2,), z: shape=(2,))
mask -> [False True True]
atoms[mask] -> Block(id: shape=(2,), element: shape=(2,), x: shape=(2,), y: shape=(2,), z: shape=(2,))
reorder [2,0] -> Block(id: shape=(2,), element: shape=(2,), x: shape=(2,), y: shape=(2,), z: shape=(2,))
scalar atoms['x'][0] -> 0.0
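Conceptually, row selection applies one index to every column. The sketch below models that with a plain dict of NumPy arrays; `take_rows` is a hypothetical helper, not part of molpy, and it also shows how two masks combine.

```python
import numpy as np

cols = {
    "element": np.array(["O", "H", "H"]),
    "x": np.array([0.000, 0.957, -0.239]),
}

def take_rows(columns, index):
    """Apply the same row index (slice, mask, or integer array) to every column."""
    return {name: values[index] for name, values in columns.items()}

first_two = take_rows(cols, slice(0, 2))
hydrogens = take_rows(cols, cols["element"] == "H")
reordered = take_rows(cols, np.array([2, 0]))

# Masks combine with & and |; each comparison needs its own parentheses.
h_right = take_rows(cols, (cols["element"] == "H") & (cols["x"] > 0))
print(h_right)
```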
5. Updating Columns
Setting a key inserts or overwrites a column. Deleting is explicit.
atoms2 = atoms.copy() # shallow
atoms2["r"] = np.linalg.norm(atoms2[["x", "y", "z"]], axis=1)
print("with r:", list(atoms2.keys()))
del atoms2["r"]
print("after del:", list(atoms2.keys()))
with r: ['id', 'element', 'x', 'y', 'z', 'r']
after del: ['id', 'element', 'x', 'y', 'z']
6. Copy Semantics
Block.copy() is shallow:
- the mapping is copied
- underlying NumPy arrays are not copied
So in-place mutation of an array (e.g. atoms_shallow["x"][0] = ...) affects the original too.
If you need a deep copy, copy the arrays explicitly.
atoms_shallow = atoms.copy()
atoms_shallow["x"][0] = 123.0
print("original x:", atoms["x"])
print("shallow x:", atoms_shallow["x"])
original x: [123. 0.957 -0.239]
shallow x: [123. 0.957 -0.239]
# Rebuild fresh for the rest of the tutorial
atoms = mp.Block({
"id": [1, 2, 3],
"element": ["O", "H", "H"],
"x": [0.000, 0.957, -0.239],
"y": [0.000, 0.000, 0.927],
"z": [0.000, 0.000, 0.000],
})
# If you need a deep copy of a specific array, copy it explicitly
atoms_deep = atoms.copy()
atoms_deep["x"] = atoms_deep["x"].copy()
atoms_deep["x"][0] = 999.0
print("original x:", atoms["x"])
print("deep x:", atoms_deep["x"])
original x: [ 0. 0.957 -0.239]
deep x: [ 9.99e+02 9.57e-01 -2.39e-01]
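To copy every column rather than one, the same pattern can be applied across the whole mapping. The sketch below uses a plain dict of arrays as a stand-in for a Block and checks sharing with `np.shares_memory`; it makes no assumption about a built-in deep-copy method in molpy.

```python
import numpy as np

cols = {"x": np.array([0.0, 0.957, -0.239]), "id": np.array([1, 2, 3])}

shallow = dict(cols)                                      # new mapping, same arrays
deep = {name: arr.copy() for name, arr in cols.items()}   # new mapping, new arrays

print(np.shares_memory(cols["x"], shallow["x"]))  # arrays are shared
print(np.shares_memory(cols["x"], deep["x"]))     # arrays are independent

deep["x"][0] = 999.0  # leaves cols["x"] untouched
```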
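7. Sorting
The example below uses sort(key), which returns a new Block ordered by the given column, and sort_(key), the in-place variant (note the trailing underscore).
unsorted = mp.Block({"id": [3, 1, 2], "x": [30.0, 10.0, 20.0]})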
sorted_new = unsorted.sort("id")
_ = unsorted.sort_("id")
print("unsorted (after sort_):", unsorted)
print("sorted_new:", sorted_new)
unsorted (after sort_): Block(id: shape=(3,), x: shape=(3,))
sorted_new: Block(id: shape=(3,), x: shape=(3,))
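Because the printed Blocks show only shapes, the effect of sorting by a key is easier to see on raw arrays. A minimal NumPy sketch, assuming that sorting a Block means reordering every column by the permutation that sorts the key column:

```python
import numpy as np

cols = {"id": np.array([3, 1, 2]), "x": np.array([30.0, 10.0, 20.0])}

# Permutation that sorts the key column; "stable" keeps ties in input order.
order = np.argsort(cols["id"], kind="stable")

# Apply the same permutation to every column so rows stay aligned.
sorted_cols = {name: arr[order] for name, arr in cols.items()}
print(sorted_cols["id"], sorted_cols["x"])
```

Fancy indexing returns new arrays, so `cols` itself is left in its original order here.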
8. Serialize / Deserialize
to_dict() produces a JSON-friendly representation.
Block.from_dict(...) reconstructs a new Block.
payload = atoms.to_dict()
restored = mp.Block.from_dict(payload)
print("columns:", list(restored.keys()))
print("nrows:", restored.nrows)
print("xyz equal?", np.allclose(restored[["x", "y", "z"]], atoms[["x", "y", "z"]]))
columns: ['id', 'element', 'x', 'y', 'z']
nrows: 3
xyz equal? True
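"JSON-friendly" matters because NumPy arrays cannot be passed to `json.dumps` directly. The exact layout of the to_dict() payload is not shown on this page, so the sketch below uses a plain dict of lists as a stand-in to illustrate a full round trip through JSON text.

```python
import json
import numpy as np

cols = {
    "id": np.array([1, 2, 3]),
    "element": np.array(["O", "H", "H"]),
    "x": np.array([0.0, 0.957, -0.239]),
}

# Stand-in payload: arrays become plain Python lists, which json can encode.
payload = {name: arr.tolist() for name, arr in cols.items()}
text = json.dumps(payload)

# Decode and rebuild the arrays; dtypes are inferred again by NumPy.
restored = {name: np.asarray(values) for name, values in json.loads(text).items()}
print(list(restored))
```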
9. Load from CSV
Block.from_csv reads CSV from a file path or a StringIO. It performs basic type inference, so numeric columns come back as numeric arrays.
from io import StringIO
csv_data = StringIO("id,element,x\n1,O,0.0\n2,H,0.957\n3,H,-0.239\n")
blk = mp.Block.from_csv(csv_data)
print("blk:", blk)
print("id dtype:", blk["id"].dtype)
print("x dtype:", blk["x"].dtype)
blk: Block(id: shape=(3,), element: shape=(3,), x: shape=(3,))
id dtype: int64
x dtype: float64
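To make "basic type inference" concrete, here is one common rule (try int, then float, then fall back to strings) written with the standard csv module. This is an illustrative assumption, not molpy's actual logic; `infer_column` is a hypothetical helper.

```python
import csv
from io import StringIO

import numpy as np

def infer_column(values):
    """Hypothetical rule: try int, then float, otherwise keep strings."""
    for caster in (int, float):
        try:
            return np.array([caster(v) for v in values])
        except ValueError:
            continue
    return np.array(values)

text = "id,element,x\n1,O,0.0\n2,H,0.957\n3,H,-0.239\n"
rows = list(csv.reader(StringIO(text)))
header, body = rows[0], rows[1:]

# Transpose the rows into columns and infer a dtype for each one.
columns = {
    name: infer_column([row[i] for row in body])
    for i, name in enumerate(header)
}
print({name: arr.dtype.kind for name, arr in columns.items()})
```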
Practical Notes
- Prefer column operations (NumPy vectorization) over Python loops for performance.
- Be conscious of shallow copies when mutating arrays in-place.
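As an illustration of the first note, the two computations below produce the same radii; the vectorized form does the per-row work inside NumPy instead of the Python interpreter. The random coordinates are made up for the example.

```python
import math

import numpy as np

rng = np.random.default_rng(0)
xyz = rng.normal(size=(1000, 3))  # made-up coordinates for illustration

# Python loop: one interpreter iteration per row.
r_loop = np.array([math.sqrt(px * px + py * py + pz * pz) for px, py, pz in xyz])

# Vectorized: a single NumPy call over the whole array.
r_vec = np.linalg.norm(xyz, axis=1)

print(np.allclose(r_loop, r_vec))
```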