Trajectory API

Public container

The TrajectoryCollection class is the single public entry point of SFI.trajectory. It represents one or more experiments, each with a rectangular tensor \(X(t, n, d)\) plus masks and extras, and is used both as:

  • the output of Langevin simulators (synthetic benchmarks), and

  • the input to inference pipelines (experimental data).

It is the canonical first argument of every inference engine — OverdampedLangevinInference(collection) and UnderdampedLangevinInference(collection).

TrajectoryCollection

Container for one or more trajectories plus per-dataset weights.

How to build one

Pick the constructor that matches your data shape:

| You have… | Use this | One-liner |
|---|---|---|
| A dense in-memory tensor with shape (T, N, d) or (T, d) and a fixed N | TrajectoryCollection.from_arrays() | TrajectoryCollection.from_arrays(X=X, dt=0.01) |
| A flat table with one row per (particle, time) observation, where particles enter and leave at different times | TrajectoryCollection.from_columns() | TrajectoryCollection.from_columns(particle_idx, time_idx, state_vectors, dt=0.01) |
| One or more files on disk (CSV, Parquet, HDF5) | TrajectoryCollection.load() | TrajectoryCollection.load("trajectory.parquet") |
| Several existing collections (e.g. multi-experiment) | TrajectoryCollection.concat() | c1 & c2 or c1.concat([c2]) |
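
As an illustration of the dense-tensor case above, the sketch below generates a (T, N, d) array with a plain NumPy Euler-Maruyama integrator for an Ornstein-Uhlenbeck process and then passes it to from_arrays() using the one-liner already shown. The helper simulate_ou and its parameters are hypothetical and are not SFI's own simulator; the import path is an assumption inferred from the module name SFI.trajectory, so the call is guarded and the NumPy part runs on its own.

```python
import numpy as np


def simulate_ou(T=1000, N=5, d=2, dt=0.01, k=1.0, D=0.5, seed=0):
    """Euler-Maruyama for dx = -k x dt + sqrt(2 D) dW.

    Hypothetical helper (not part of SFI). Returns X with shape (T, N, d).
    """
    rng = np.random.default_rng(seed)
    X = np.empty((T, N, d))
    X[0] = rng.normal(size=(N, d))  # random initial positions
    for t in range(1, T):
        noise = rng.normal(size=(N, d)) * np.sqrt(2.0 * D * dt)
        X[t] = X[t - 1] - k * X[t - 1] * dt + noise
    return X


dt = 0.01
X = simulate_ou(dt=dt)  # dense tensor with a fixed number of particles N

try:
    # Assumed import path, based on "SFI.trajectory" in the text above.
    from SFI.trajectory import TrajectoryCollection

    collection = TrajectoryCollection.from_arrays(X=X, dt=dt)
except ImportError:
    collection = None  # SFI not installed; X is still a valid (T, N, d) array
```

A (T, d) array for a single particle is also listed as an accepted shape, so X[:, 0, :] would be an equally valid input under the same assumptions.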

Note

For tracked-particle data where the number of particles changes over time, from_columns() is the canonical path: do not pre-pad missing rows with NaN just so that you can call from_arrays().
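
If your data already arrives as a NaN-padded dense array, one way to follow the note above is to convert it to flat columns first. The NumPy sketch below does that conversion. The column shapes (one entry per observed (particle, time) pair, with state_vectors of shape (M, d)) are an assumption about what from_columns() expects; the argument order in the commented call is copied from the one-liner listed earlier.

```python
import numpy as np

# Toy tracked-particle data: T=4 frames, up to N=3 particles, d=2 coordinates.
T, N, d = 4, 3, 2
X_padded = np.arange(T * N * d, dtype=float).reshape(T, N, d)
X_padded[0, 2] = np.nan  # particle 2 is not yet tracked at frame 0
X_padded[3, 0] = np.nan  # particle 0 is lost at frame 3

# A (time, particle) pair counts as observed only if all d coordinates are finite.
observed = np.isfinite(X_padded).all(axis=-1)  # boolean, shape (T, N)

# np.nonzero returns the time (row) indices first, then the particle (column) indices.
time_idx, particle_idx = np.nonzero(observed)  # each of shape (M,)
state_vectors = X_padded[observed]             # shape (M, d), same ordering

# Assumed usage, mirroring the one-liner above (requires SFI):
# collection = TrajectoryCollection.from_columns(
#     particle_idx, time_idx, state_vectors, dt=0.01
# )
```

The three arrays stay aligned because boolean-mask indexing and np.nonzero both walk the mask in the same row-major order.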

The on-disk formats accepted by TrajectoryCollection.load() and written by TrajectoryCollection.save() are specified in Trajectory file formats.

Core constructors and I/O

High-level constructors and round-trip helpers. These are the usual “front door” for getting data in and out of SFI:

TrajectoryCollection.from_arrays

Build a single-dataset collection from array-likes.

TrajectoryCollection.from_columns

Build a single-dataset collection from flat (particle, time) columns.

TrajectoryCollection.load

Load a collection from a single file or a directory.

TrajectoryCollection.save

Save the collection.

TrajectoryCollection.to_arrays

Convenience helper: materialize one dataset as dense arrays.

Collection operations

Operations that change how experiments are grouped or weighted:

TrajectoryCollection.concat

Concatenate this collection with other collections or datasets.

TrajectoryCollection.with_weights

Set the per-dataset weights (an unnormalised multiplier).

TrajectoryCollection.degrade

Return a new degraded collection; the original is not modified.
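
The options accepted by degrade() are not listed in this section. As a rough illustration of the idea only, the sketch below applies two common degradations to a dense array (temporal subsampling and additive Gaussian measurement noise) and returns a new array while leaving the input untouched, mirroring the non-mutating behaviour described above. The function degrade_array and its parameters stride and noise_std are hypothetical and are not part of SFI.

```python
import numpy as np


def degrade_array(X, dt, stride=2, noise_std=0.05, seed=0):
    """Hypothetical stand-in for a degradation step (not SFI's degrade()).

    Keeps every `stride`-th frame and adds Gaussian noise of standard
    deviation `noise_std`. Returns a new array and the coarser time step.
    """
    rng = np.random.default_rng(seed)
    X_sub = X[::stride]  # a view; X itself is not modified
    X_noisy = X_sub + rng.normal(scale=noise_std, size=X_sub.shape)  # new array
    return X_noisy, dt * stride


X = np.zeros((100, 3, 2))
X_before = X.copy()
X_deg, dt_deg = degrade_array(X, dt=0.01, stride=4, noise_std=0.1)
```

Returning a fresh array keeps the clean synthetic data available for comparison against results obtained from the degraded copy.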

Streaming interface

The streaming API exposes one-row-at-a-time access to the underlying datasets. It is what SFI.integrate uses internally to build increments and masks with controlled memory usage. Most users will only need these methods when implementing custom inference loops:

TrajectoryCollection.iter_slices

Yield chunks as (producer, t_idx) pairs for vmapped integration.

TrajectoryCollection.peek_row

Return a sample row at a single time index t from the first dataset with valid indices.
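
For a custom loop, the general pattern is to consume the data chunk by chunk instead of materializing every increment at once. The sketch below imitates that pattern in plain NumPy: a generator yields (producer, t_idx) pairs, where producer maps one time index to a row and t_idx lists the valid indices of the chunk. Everything here (iter_slices_sketch, the producer signature, chunk_size) is a hypothetical stand-in; the real signature of iter_slices() is not specified in this section, and a plain Python loop replaces the vmapped evaluation.

```python
import numpy as np


def iter_slices_sketch(X, chunk_size=256):
    """Yield (producer, t_idx) pairs; hypothetical, not SFI's iter_slices()."""
    T = X.shape[0]

    def producer(t):
        # One row: the state at time t and the increment to time t + 1.
        return X[t], X[t + 1] - X[t]

    valid = np.arange(T - 1)  # the last frame has no successor, so it is excluded
    for start in range(0, len(valid), chunk_size):
        yield producer, valid[start:start + chunk_size]


rng = np.random.default_rng(0)
X = np.cumsum(rng.normal(size=(1000, 4, 2)), axis=0)  # toy random walk, shape (T, N, d)

# Accumulate the mean squared increment without storing all increments.
total, count = 0.0, 0
for producer, t_idx in iter_slices_sketch(X, chunk_size=128):
    for t in t_idx:
        _, dx = producer(t)
        total += float(np.sum(dx ** 2))
        count += dx.size
msd_per_step = total / count
```

Only one row is held at a time, so peak memory stays independent of the trajectory length.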

Advanced: low-level helpers

The following modules are used internally by TrajectoryCollection to implement columnar I/O, degradation of synthetic data, and the dataset streaming contract. They are documented for advanced users who need direct access to these layers (for example, custom file formats or standalone benchmark scripts):

io

SFI.trajectory.io

degrade

SFI.trajectory.degrade

dataset

Trajectory dataset: single-index producer with explicit valid index window.