Trajectory API
Public container
The TrajectoryCollection class is the single public entry point of
SFI.trajectory. It represents one or more experiments, each with a
rectangular tensor \(X(t, n, d)\) plus masks and extras, and is used both as:
- the output of Langevin simulators (synthetic benchmarks), and
- the input to inference pipelines (experimental data).
It is the canonical first argument of every inference engine —
OverdampedLangevinInference(collection) and
UnderdampedLangevinInference(collection).
| TrajectoryCollection | Container for one or more trajectories plus per-dataset weights. |
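To make the data layout concrete, the following NumPy sketch builds a rectangular tensor \(X(t, n, d)\) together with a boolean mask marking which (time, particle) slots are observed. It does not call SFI. The array names and the mask convention (True means valid) are assumptions chosen for illustration only.

```python
import numpy as np

T, N, D = 5, 3, 2  # time steps, particles, spatial dimensions
rng = np.random.default_rng(0)

# Rectangular position tensor: X[t, n, :] is the position of particle n at step t.
X = rng.normal(size=(T, N, D))

# Assumed mask convention: True where particle n was observed at step t.
mask = np.ones((T, N), dtype=bool)
mask[3:, 2] = False  # particle 2 is not observed from t = 3 onwards

# A displacement between consecutive frames is only meaningful when
# both of its endpoints are observed.
dX = X[1:] - X[:-1]
valid_increment = mask[1:] & mask[:-1]

n_valid = int(valid_increment.sum())
print(dX.shape, n_valid)
```

Keeping the tensor rectangular and recording absence in a separate mask lets array code stay vectorised while still ignoring the unobserved slots.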
How to build one
Pick the constructor that matches your data shape:
| You have… | Use this | One-liner |
|---|---|---|
| A dense in-memory tensor with shape | | |
| A tabular table, one row per | | |
| One or more files on disk (CSV, Parquet, HDF5) | | |
| Several existing collections (e.g. multi-experiment) | | |
Note
For tracked-particle data where the number of particles changes over
time, from_columns() is the canonical path — do not pre-pad
missing rows with NaN and call from_arrays().
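The idea behind the note can be illustrated without SFI: flat (particle, time) rows are scattered into a rectangular tensor, and a mask records which slots were actually filled, so no NaN padding is needed. This is a minimal NumPy sketch under assumed column names (`particle`, `frame`, `x`, `y`). It is not the implementation of from_columns().

```python
import numpy as np

# Flat table: one row per (particle, time) observation.
# Particle 1 appears at frame 1 and particle 2 only at frame 2.
particle = np.array([0, 0, 0, 1, 1, 2])
frame = np.array([0, 1, 2, 1, 2, 2])
x = np.array([0.0, 0.1, 0.2, 1.0, 1.1, 2.0])
y = np.array([0.0, 0.0, 0.1, 1.0, 0.9, 2.0])

# Map arbitrary particle IDs and frame numbers to contiguous indices.
ids, n_idx = np.unique(particle, return_inverse=True)
frames, t_idx = np.unique(frame, return_inverse=True)

T, N, D = len(frames), len(ids), 2
X = np.zeros((T, N, D))             # unobserved slots keep a placeholder value
mask = np.zeros((T, N), dtype=bool)  # False means "no observation here"

# Scatter each row into its (time, particle) slot and mark it as valid.
X[t_idx, n_idx] = np.stack([x, y], axis=1)
mask[t_idx, n_idx] = True

print(mask.astype(int))
```

Because absence lives in the mask, the placeholder values in `X` never have to be interpreted, which is the practical reason to prefer columnar input over NaN-padded arrays when the particle count varies.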
The on-disk formats accepted by TrajectoryCollection.load() and
written by TrajectoryCollection.save() are specified in
Trajectory file formats.
Core constructors and I/O
High-level constructors and round-trip helpers. These are the usual “front door” for getting data in and out of SFI:
- Build a single-dataset collection from array-likes.
- Build a single-dataset collection from flat (particle, time) columns.
- Load a collection from a single file or a directory.
- Save the collection.
- Convenience helper: materialize one dataset as dense arrays.
Collection operations
Operations that change how experiments are grouped or weighted:
- Concatenate this collection with other collections or datasets.
- Set the per-dataset weights (an unnormalised multiplier).
- Return a new degraded collection; the original is not modified.
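One way to read "unnormalised multiplier" is that only the ratios between weights matter once they are normalised at the point of use. The sketch below illustrates that reading with a plain weighted mean. How SFI actually consumes the weights is not specified here, so the function `weighted_mean` and the example values are hypothetical.

```python
import numpy as np

# Hypothetical per-dataset estimates of some scalar quantity.
estimates = np.array([1.0, 2.0, 4.0])


def weighted_mean(values, weights):
    """Weighted mean that normalises the weights at the point of use."""
    weights = np.asarray(weights, dtype=float)
    return float(np.sum(weights * values) / np.sum(weights))


# Rescaling every weight by the same factor leaves the result unchanged.
a = weighted_mean(estimates, [1.0, 1.0, 2.0])
b = weighted_mean(estimates, [10.0, 10.0, 20.0])
print(a, b)
```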
Streaming interface
The streaming API exposes one-row-at-a-time access to the underlying datasets.
It is what SFI.integrate uses internally to build increments and masks
with controlled memory usage. Most users will only need these methods when
implementing custom inference loops:
- Yield chunks as (producer, t_idx) pairs for vmapped integration.
- Return a single-t sample row from the first dataset with valid indices.
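The chunked pattern described above can be sketched with a plain Python generator that yields (producer, t_idx) pairs, so that only one chunk of rows is materialised at a time. The names `iter_chunks` and `producer`, and the choice of forward increments, are hypothetical stand-ins for illustration. They are not SFI's actual signatures.

```python
import numpy as np


def iter_chunks(X, chunk_size):
    """Yield (producer, t_idx) pairs covering the time axis in chunks.

    `producer` maps an array of time indices to the rows needed to
    form forward increments at those times.
    """
    n_steps = X.shape[0] - 1  # the last frame has no forward increment

    def producer(t_idx):
        # Return the rows at t and t + 1 for the requested indices only.
        return X[t_idx], X[t_idx + 1]

    for start in range(0, n_steps, chunk_size):
        t_idx = np.arange(start, min(start + chunk_size, n_steps))
        yield producer, t_idx


# Toy trajectory: 10 frames, 1 particle, 1 dimension, with x(t) = t.
X = np.arange(10, dtype=float).reshape(10, 1, 1)

total = 0.0
n_chunks = 0
for producer, t_idx in iter_chunks(X, chunk_size=4):
    x_now, x_next = producer(t_idx)
    total += float(np.sum(x_next - x_now))  # accumulate per chunk
    n_chunks += 1

print(n_chunks, total)
```

A custom inference loop would follow the same shape: accumulate a statistic chunk by chunk rather than building the full array of increments in memory.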
Advanced: low-level helpers
The following modules are used internally by TrajectoryCollection
to implement columnar I/O, degradation of synthetic data, and the dataset
streaming contract. They are documented for advanced users who need direct
access to these layers (for example, custom file formats or standalone
benchmark scripts):