Trajectory Tutorial
Learn how to work with Trajectory objects in MolPy! Trajectories represent sequences of molecular frames - perfect for analyzing simulation data.
What is a Trajectory?
A Trajectory is a sequence of Frame objects representing time evolution:
- Lazy Loading: Frames are loaded on-demand, not all at once
- Memory Efficient: Supports memory-mapped file reading
- Iterable: Loop through frames like any Python sequence
- Sliceable: Get subsets of frames with slicing
- Mappable: Apply functions to all frames
Perfect for analyzing large MD simulations!
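The properties listed above can be illustrated with a small stand-alone sketch. This is not MolPy's implementation: `LazyFrames`, `load_frame`, and the dict-based frames are hypothetical stand-ins, used only to show how lazy loading, slicing, and mapping can work together without loading every frame.

```python
from typing import Callable, Iterator, Sequence


class LazyFrames:
    """Hypothetical lazy sequence: frame i is built only when it is requested."""

    def __init__(self, load: Callable[[int], dict], indices: Sequence[int]):
        self._load = load        # function that reads or builds one frame
        self._indices = indices  # which source frames this view exposes

    def __len__(self) -> int:
        return len(self._indices)

    def __getitem__(self, key):
        if isinstance(key, slice):
            # Slicing only narrows the index list; no frame is loaded here.
            return LazyFrames(self._load, self._indices[key])
        return self._load(self._indices[key])

    def __iter__(self) -> Iterator[dict]:
        return (self._load(i) for i in self._indices)

    def map(self, func: Callable[[dict], dict]) -> "LazyFrames":
        # Compose func with the loader so it runs only when a frame is accessed.
        load = self._load
        return LazyFrames(lambda i: func(load(i)), self._indices)


loaded = []  # records which frames were actually materialized


def load_frame(i: int) -> dict:
    loaded.append(i)
    return {"time": i * 0.5, "x": [float(i)]}


frames = LazyFrames(load_frame, range(1000))
subset = frames[::250]   # a view of frames 0, 250, 500, 750; nothing loaded yet
shifted = subset.map(lambda f: {**f, "x": [v + 1.0 for v in f["x"]]})
last = shifted[-1]       # loads and transforms a single frame
```

After this runs, `loaded` holds a single index, which shows that slicing and mapping did not touch the other 999 frames.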
In [1]:
import numpy as np

import molpy as mp
from molpy.core.trajectory import FrameGenerator, Trajectory
---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
Cell In[1], line 4
      1 import numpy as np
      3 import molpy as mp
----> 4 from molpy.core.trajectory import FrameGenerator, Trajectory

ImportError: cannot import name 'FrameGenerator' from 'molpy.core.trajectory' (/opt/buildhome/.asdf/installs/python/3.13.3/lib/python3.13/site-packages/molpy/core/trajectory.py)
Creating a Trajectory
You can create trajectories from lists, generators, or file readers:
In [2]:
# Create some frames
frames = []
for i in range(5):
    frame = mp.Frame()
    frame["atoms"] = mp.Block({"x": [0.0 + i * 0.1], "y": [0.0], "z": [0.0]})
    frame.metadata["time"] = i * 0.1
    frames.append(frame)

# Create trajectory from list
traj = Trajectory(frames)
print(f"Trajectory length: {len(traj)}")
Iterating Over Frames
In [3]:
# Iterate through all frames
for i, frame in enumerate(traj):
    time = frame.metadata.get("time", 0.0)
    n_atoms = frame["atoms"].nrows
    print(f"Frame {i}: time={time:.2f}, atoms={n_atoms}")
Manual Iteration with next()
You can manually iterate through frames using next():
In [4]:
# Manually get next frames
frame1 = next(traj)
print(f"First frame time: {frame1.metadata.get('time', 0.0)}")

frame2 = next(traj)
print(f"Second frame time: {frame2.metadata.get('time', 0.0)}")

# You can also use try/except to handle StopIteration
try:
    frame3 = next(traj)
    print(f"Third frame time: {frame3.metadata.get('time', 0.0)}")
except StopIteration:
    print("No more frames")
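Calling next() advances an iterator, so frames that have been consumed are not seen again by a later loop over the same iterator. The sketch below shows this with a plain Python iterator over dict stand-ins for frames. Whether a MolPy Trajectory keeps its position in exactly this way is an assumption here, not something confirmed by the tutorial.

```python
# Plain dicts stand in for frames; this uses no MolPy API.
frames = [{"time": round(i * 0.1, 1)} for i in range(3)]
it = iter(frames)  # a fresh iterator over the frames

first = next(it)
second = next(it)

# A for loop (or comprehension) on the same iterator resumes where next() stopped.
remaining = [f["time"] for f in it]

# Passing a default to next() avoids the try/except around StopIteration.
after_end = next(it, None)
```

The two-argument form `next(it, default)` is part of standard Python and is often shorter than catching StopIteration when "no more frames" is an expected outcome.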
Accessing Individual Frames
You can also access frames by index:
In [5]:
# Get a single frame by index
frame0 = traj[0]
print(f"First frame time: {frame0.metadata.get('time', 0.0)}")

# Get last frame (if trajectory has known length)
if traj.has_length():
    frame_last = traj[-1]
    print(f"Last frame time: {frame_last.metadata.get('time', 0.0)}")
Slicing Trajectories
In [6]:
# Get first 3 frames
first_three = traj[0:3]
print(f"First 3 frames: {len(first_three)} frames")

# Get every other frame
every_other = traj[::2]
print(f"Every other frame: {len(every_other)} frames")

# Get last 2 frames
last_two = traj[-2:]
print(f"Last 2 frames: {len(last_two)} frames")
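Slices with negative indices need a known length. For a frame source of unknown length, the standard library offers equivalents that work on any iterator. The sketch below uses `itertools.islice` for forward slices and `collections.deque` for "the last N". The `frame_times` generator is a hypothetical stand-in that yields integers instead of frames.

```python
from collections import deque
from itertools import islice


def frame_times():
    """Stand-in frame source of unknown (here: unbounded) length."""
    t = 0
    while True:
        yield t
        t += 1


# Equivalent of [0:3] and [0:10:2] on an iterator; only the needed items are produced.
first_three = list(islice(frame_times(), 0, 3))
every_other = list(islice(frame_times(), 0, 10, 2))

# islice does not accept negative indices. A bounded deque keeps the last N items
# of a finite stream, at the cost of reading the whole stream once.
last_two = list(deque(islice(frame_times(), 5), maxlen=2))
```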
Mapping Functions Over Frames
The .map() method applies a function to each frame lazily:
In [7]:
# Define a function to process frames
def center_frame(frame):
    """Center coordinates at origin."""
    atoms = frame["atoms"]
    xyz = atoms[["x", "y", "z"]]
    center = xyz.mean(axis=0)
    atoms["x"] = atoms["x"] - center[0]
    atoms["y"] = atoms["y"] - center[1]
    atoms["z"] = atoms["z"] - center[2]
    return frame


# Apply to all frames (lazy evaluation)
centered_traj = traj.map(center_frame)
print(f"Centered trajectory: {len(centered_traj)} frames")

# The mapping is lazy - frames are processed on-demand
# You can use next() on the mapped trajectory too
if centered_traj.has_length():
    first_centered = next(centered_traj)
    print("First centered frame processed")
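The idea of lazy mapping can be checked with NumPy arrays and Python's built-in `map`, independent of MolPy. In this sketch each frame is assumed to be a plain `(n_atoms, 3)` coordinate array, and the `calls` list is an illustrative counter that records when the function really runs. Unlike `center_frame` above, which modifies the frame it receives, this `center` returns a new array.

```python
import numpy as np

calls = []  # one entry is appended each time center() actually executes


def center(xyz: np.ndarray) -> np.ndarray:
    """Return a copy of an (n_atoms, 3) array shifted so its centroid is at the origin."""
    calls.append(1)
    return xyz - xyz.mean(axis=0)


# Three small "frames" of two atoms each, shifted by i along every axis.
coords = [np.array([[0.0, 0.0, 0.0], [2.0, 4.0, 6.0]]) + i for i in range(3)]

lazy = map(center, coords)   # builds the pipeline; center() has not run yet
calls_before = len(calls)

first = next(lazy)           # processes exactly one frame
calls_after = len(calls)
```

Subtracting `xyz.mean(axis=0)` relies on NumPy broadcasting: the length-3 centroid is subtracted from every row.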
Loading from Files
In [8]:
# Example: Load trajectory from files
# Note: These examples require actual trajectory files

# Load trajectory from XYZ file
from pathlib import Path

from molpy.io.trajectory import XYZTrajectoryReader

xyz_file = Path("trajectory.xyz")
if xyz_file.exists():
    reader = XYZTrajectoryReader(xyz_file)
    traj = Trajectory(reader)
    print(f"Loaded trajectory: {len(traj)} frames")
else:
    print("XYZ file not found. Using in-memory trajectory instead.")

# Load from LAMMPS trajectory
from molpy.io.trajectory import LammpsTrajectoryReader

lammps_file = Path("dump.lammpstrj")
if lammps_file.exists():
    reader = LammpsTrajectoryReader(lammps_file)
    traj = Trajectory(reader)
    print(f"Loaded LAMMPS trajectory: {len(traj)} frames")
else:
    print("LAMMPS trajectory file not found.")
XYZ file not found. Using in-memory trajectory instead.
LAMMPS trajectory file not found.
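To make the idea of a lazy file reader concrete, here is a minimal pure-Python sketch that yields one frame at a time from a multi-frame XYZ file (atom count, comment line, then one `symbol x y z` line per atom). It is not MolPy's XYZTrajectoryReader. The function name `iter_xyz_frames` and the tuple layout it yields are illustrative assumptions, and error handling for malformed files is omitted.

```python
import io


def iter_xyz_frames(handle):
    """Yield (comment, symbols, coords) for each frame of an open XYZ file."""
    while True:
        header = handle.readline()
        if not header.strip():
            return  # end of file (or a trailing blank line)
        n_atoms = int(header)
        comment = handle.readline().rstrip("\n")
        symbols, coords = [], []
        for _ in range(n_atoms):
            sym, x, y, z = handle.readline().split()[:4]
            symbols.append(sym)
            coords.append((float(x), float(y), float(z)))
        yield comment, symbols, coords


# An in-memory two-frame file, so the example runs without a real trajectory.
text = (
    "2\nframe 0\nH 0.0 0.0 0.0\nH 0.0 0.0 0.74\n"
    "2\nframe 1\nH 0.0 0.0 0.0\nH 0.0 0.0 0.80\n"
)
frames = list(iter_xyz_frames(io.StringIO(text)))
```

Because the function is a generator, passing it a real file handle reads only as many frames as the caller consumes.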
Lazy Loading with Generators
In [9]:
# Create a generator function
def frame_generator():
    for i in range(10):
        frame = mp.Frame()
        frame["atoms"] = mp.Block({"x": [i], "y": [0.0], "z": [0.0]})
        frame.metadata["time"] = i * 0.1
        yield frame


# Create trajectory from generator
gen_traj = Trajectory(FrameGenerator(frame_generator()))

# Check if length is available
if gen_traj.has_length():
    print(f"Trajectory length: {len(gen_traj)}")
else:
    print("Length not available (generator-based)")

# Still can iterate
count = 0
for frame in gen_traj:
    count += 1
    if count >= 3:
        break
print(f"Iterated through {count} frames")
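The reason a generator-backed source has no length can be seen with plain Python. In the sketch below, `has_length` is a hypothetical helper that only mirrors the idea of the method used above (it is not MolPy's implementation), and `frame_source` yields dicts instead of Frame objects.

```python
def frame_source(n):
    """Generator stand-in: produces frames one by one and stores none of them."""
    for i in range(n):
        yield {"time": i * 0.1}


def has_length(obj) -> bool:
    """Hypothetical helper: True if len(obj) is supported."""
    return hasattr(obj, "__len__")


gen = frame_source(10)
gen_has_len = has_length(gen)          # generators do not define __len__
list_has_len = has_length([1, 2, 3])   # lists do

# The only way to count a generator is to consume it.
n_frames = sum(1 for _ in gen)
left_over = list(gen)                  # nothing remains after counting
```

This is the trade-off of generator-based sources: memory use stays flat, but length, negative indexing, and a second pass are unavailable unless the frames are stored or the generator is recreated.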
Splitting Trajectories
In [10]:
from molpy.core.trajectory import TrajectorySplitter

# Split by frame interval
splitter = TrajectorySplitter(traj)
segments = splitter.split_frames(interval=2)  # Every 2 frames
print(f"Split into {len(segments)} segments")

for i, seg in enumerate(segments):
    print(f" Segment {i}: {len(seg)} frames")
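One plausible reading of splitting by frame interval is cutting the sequence into consecutive blocks of a fixed number of frames. The sketch below implements that reading with `itertools.islice`. The `chunks` function is illustrative, and whether `split_frames(interval=2)` produces exactly these segments is an assumption.

```python
from itertools import islice


def chunks(frames, size):
    """Yield consecutive lists of up to `size` frames from any iterable."""
    it = iter(frames)
    while True:
        block = list(islice(it, size))
        if not block:
            return
        yield block


# Integers stand in for the five frames built earlier in this tutorial.
segments = list(chunks(range(5), 2))
sizes = [len(seg) for seg in segments]
```

With five frames and a block size of two, the last segment is shorter than the others, so downstream code should not assume equal segment lengths.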
Time-Based Splitting
In [11]:
# Split by time interval (requires time metadata)
# Note: This requires frames to have "time" in metadata
segments = splitter.split_time(interval=0.5)  # Every 0.5 time units
print(f"Time-based split: {len(segments)} segments")

for i, seg in enumerate(segments):
    if seg.has_length():
        print(f" Segment {i}: {len(seg)} frames")
    else:
        print(f" Segment {i}: variable length")
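A time-based split can be sketched by assigning each frame to the bin `floor(time / interval)` and grouping consecutive frames that share a bin. This is only an illustration of the idea with dict stand-ins for frames. `split_by_time` is a hypothetical function, and the binning rule MolPy actually uses is not stated in this tutorial.

```python
import math
from itertools import groupby


def split_by_time(frames, interval):
    """Group consecutive frames whose time falls into the same interval-wide bin.

    Assumes frames are already ordered by time.
    """
    def bin_index(frame):
        return math.floor(frame["time"] / interval)

    return [list(group) for _, group in groupby(frames, key=bin_index)]


frames = [{"time": t} for t in (0.0, 0.2, 0.4, 0.6, 0.8, 1.0, 1.2)]
segments = split_by_time(frames, 0.5)
sizes = [len(seg) for seg in segments]
```

Times that sit exactly on a bin edge are sensitive to floating-point rounding (for example, `3 * 0.1` is not exactly `0.3`), so integer step counters are a safer key when they are available.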
Analysis Example
Here's a practical example of analyzing a trajectory:
In [12]:
# Calculate mean position over trajectory
positions = []
for frame in traj:
    atoms = frame["atoms"]
    xyz = atoms[["x", "y", "z"]]
    positions.append(xyz.mean(axis=0))

mean_pos = np.array(positions).mean(axis=0)
print(f"Mean position over trajectory: {mean_pos}")
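For long trajectories, the per-frame values do not need to be collected in a list first. The sketch below keeps a running mean that is updated one frame at a time, so memory use does not grow with the number of frames. It uses plain NumPy arrays as stand-ins for per-frame centroids, and `running_mean` is an illustrative helper, not a MolPy function.

```python
import numpy as np


def running_mean(arrays):
    """Return the element-wise mean of a stream of equally shaped arrays."""
    mean = None
    for n, arr in enumerate(arrays, start=1):
        arr = np.asarray(arr, dtype=float)
        # Incremental update: new_mean = old_mean + (value - old_mean) / n
        mean = arr.copy() if mean is None else mean + (arr - mean) / n
    return mean


# A generator of per-frame centroids; nothing is stored besides the current mean.
centroids = (np.array([float(i), 0.0, 0.0]) for i in range(5))
mean_pos = running_mean(centroids)
```

Note that averaging per-frame centroids equals the average over all atom positions only when every frame contains the same number of atoms.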