SFI.trajectory.io module

SFI.trajectory.io

CSV I/O utilities and columnar ↔ tensor conversion for trajectory data.

File format

A trajectory is stored as a single CSV file with an optional YAML header. The numerical columns are:

particle_id (optional): if absent, it is a single-trajectory file.
time_step : integer time index t (0-based after relabel).
x0, x1, ..., x{d-1} : state vector components.

Extras are stored either in the header (YAML) or as extra numeric columns:

Prefixes (numeric columns)

TG_ : time-dependent globals — values depend on t only.
P_ : per-particle constants — values depend on particle only.
TP_ : time-dependent per-particle — values depend on (t, n).
G_ : global scalars — constants stored in the header via averaging.

Note: TG_/TP_ columns are parsed as time series and wrapped into TimeSeriesExtra. Header extras_global entries are treated as static unless explicitly wrapped when building the dataset.
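To make the prefix convention concrete, here is a minimal sketch of how column names could be sorted into the four extras categories. The helper classify_column and the category labels are hypothetical and are not part of SFI.trajectory.io; only the prefix strings come from this page.

```python
# Default prefixes, as listed in the load_trajectory signature on this page.
# The category labels on the right are illustrative names, not library terms.
PREFIXES = {
    "TG_": "time_global",      # depends on t only
    "TP_": "time_particle",    # depends on (t, n)
    "P_": "particle_const",    # depends on particle only
    "G_": "global_scalar",     # constant
}

def classify_column(name):
    """Return (kind, stripped_name); kind is 'state_or_id' if no prefix matches."""
    # Try longer prefixes first so the check stays safe if prefixes are customized.
    for prefix in sorted(PREFIXES, key=len, reverse=True):
        if name.startswith(prefix):
            return PREFIXES[prefix], name[len(prefix):]
    return "state_or_id", name

columns = ["particle_id", "time_step", "x0", "x1",
           "TG_temperature", "P_radius", "TP_force"]
kinds = {c: classify_column(c) for c in columns}
```

Columns that match no prefix are left for the loader to interpret as IDs or state components.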

Round-trip helpers

flatten_X_to_columns() / assemble_X_from_columns() convert between structured tensors (T,N,d) and flat columns.
save_trajectory_csv_with_extras() / load_trajectory_csv_with_extras() handle extras and header metadata.
columns_and_extras_to_dataset() builds a TrajectoryDataset ready for inference.

All functions are NumPy-based. JAX is optional and is used only for basic dtype detection.
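The conversion between a structured tensor of shape (T, N, d) and flat columns can be illustrated with plain NumPy. The sketch below is not the library's flatten_X_to_columns() / assemble_X_from_columns(); the helper names are hypothetical, and it assumes that missing observations are marked with NaN in the tensor.

```python
import numpy as np

def flatten_sketch(X):
    """(T, N, d) tensor -> particle index, time index, and (M, d) rows.

    Assumption: entries that contain NaN are unobserved and are skipped.
    """
    T, N, _ = X.shape
    t_idx, p_idx = np.meshgrid(np.arange(T), np.arange(N), indexing="ij")
    valid = ~np.isnan(X).any(axis=-1)      # (T, N) mask of observed points
    return p_idx[valid], t_idx[valid], X[valid]

def assemble_sketch(p_idx, t_idx, rows, fill=np.nan):
    """Inverse of flatten_sketch: scatter the rows back into a (T, N, d) tensor."""
    T, N = t_idx.max() + 1, p_idx.max() + 1
    X = np.full((T, N, rows.shape[1]), fill)
    X[t_idx, p_idx] = rows
    return X

X = np.arange(12, dtype=float).reshape(3, 2, 2)
X[1, 0] = np.nan                           # particle 0 is unobserved at t = 1
p, t, rows = flatten_sketch(X)
X_back = assemble_sketch(p, t, rows)
```

The flat form has one row per observed (time, particle) pair, which is the layout of the CSV columns described above.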

SFI.trajectory.io.columns_and_extras_to_dataset(particle_idx, time_idx, state_vectors, *, extras_global=None, extras_local=None, dt=None, t=None, mask_fill_value=nan, relabel=True, compress_particles=False, meta=None)

Build a TrajectoryDataset from columns and parsed extras.

Preference order for the time axis: 1) explicit t argument, 2) extras_global['t'] (from header or TG_t), 3) fallback to scalar dt.
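The preference order can be written as a short fallback chain. This is only a sketch: resolve_time_axis is a hypothetical helper, and building a uniform grid from dt in the last step is an assumption (the library may keep dt as a scalar instead).

```python
import numpy as np

def resolve_time_axis(n_steps, t=None, extras_global=None, dt=None):
    """Pick the time axis following the documented preference order."""
    if t is not None:                                  # 1) explicit t argument
        return np.asarray(t, dtype=float)
    if extras_global is not None and "t" in extras_global:
        return np.asarray(extras_global["t"], dtype=float)   # 2) header or TG_t
    if dt is not None:                                 # 3) scalar dt fallback
        return dt * np.arange(n_steps)                 # assumed uniform grid
    raise ValueError("no time information: pass t, extras_global['t'], or dt")

t_axis = resolve_time_axis(3, extras_global={"t": [0.0, 0.1, 0.3]}, dt=1.0)
```

In this call the header value wins over dt because no explicit t is given.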

Parameters:

compress_particles (bool) – If True, apply greedy interval packing so that particles with non-overlapping time supports share the same column index. This can dramatically reduce N for open-boundary systems where particles enter and leave the field of view over time. Per-particle extras are automatically reindexed to the compressed column layout. The mapping is stored as meta['particle_column_map']. When False (default) and relabel=True, the original particle IDs are recorded as extras_local['original_particle_id'].
particle_idx (ndarray)
time_idx (ndarray)
state_vectors (ndarray)
extras_global (Mapping[str, Any] | None)
extras_local (Mapping[str, Any] | None)
dt (float | None)
t (ndarray | None)
mask_fill_value (float)
relabel (bool)
meta (Dict[str, Any] | None)

Return type:

TrajectoryDataset
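The compress_particles option is described as greedy interval packing. One standard greedy scheme is sketched below: particles are processed by first appearance, and each reuses the column that became free earliest. The function pack_intervals and its input format are hypothetical; the source does not specify the library's exact strategy or tie-breaking.

```python
import heapq

def pack_intervals(supports):
    """Assign particles with non-overlapping time supports to shared columns.

    supports: {particle_id: (first_step, last_step)}, both ends inclusive.
    Returns {particle_id: column_index}.
    """
    column_map = {}
    free_at = []       # min-heap of (last_step_used, column)
    n_columns = 0
    for pid, (start, end) in sorted(supports.items(), key=lambda kv: kv[1][0]):
        if free_at and free_at[0][0] < start:
            _, col = heapq.heappop(free_at)   # reuse the earliest-freed column
        else:
            col = n_columns                   # every column is busy: open a new one
            n_columns += 1
        column_map[pid] = col
        heapq.heappush(free_at, (end, col))
    return column_map

# Four particles, but at most two are present at any time step.
supports = {10: (0, 4), 11: (2, 6), 12: (5, 9), 13: (7, 8)}
column_map = pack_intervals(supports)   # {10: 0, 11: 1, 12: 0, 13: 1}
```

Here N drops from 4 to 2, which is the kind of reduction the parameter description refers to for open-boundary systems.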

SFI.trajectory.io.load_trajectory(filename, *, format=None, particle_column=0, time_column=1, state_columns=None, relabel=True, prefix_G='G_', prefix_TG='TG_', prefix_P='P_', prefix_TP='TP_')

Unified loader for {'csv', 'parquet', 'h5'} (inferred from filename if format=None).

Parameters:

particle_column (int | str | None) – Column holding the particle ID. Accepts a column name (str) for any format, or a positional index (int) for CSV files only. The CSV default is positional (column 0). Parquet and HDF5 files default to the canonical name "particle_id" and ignore int values, because the int default cannot be distinguished from an unspecified argument. particle_column=None marks a single-trajectory file.
time_column (int | str) – Column holding the time index. Accepts a column name (str) for any format, or a positional index (int) for CSV files only. The CSV default is positional (column 1). Parquet and HDF5 files default to the canonical name "time_step" and ignore int values, because the int default cannot be distinguished from an unspecified argument.
state_columns (Sequence[int | str] | None) – Optional explicit selection of the state-vector columns (names, or indices for CSV), in order. When given, every other non-extras column is dropped. Default: every column that is not an ID and does not carry an extras prefix is a state component.
filename (str)
format (str | None)
relabel (bool)
prefix_G (str)
prefix_TG (str)
prefix_P (str)
prefix_TP (str)

Returns:

The standard tuple (metadata, column_headers, particle_indices, time_indices, state_vectors, extras_global, extras_local).
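The default rule for state_columns (every column that is neither an ID nor prefixed as an extra) can be sketched in a few lines. default_state_columns is a hypothetical helper written for illustration. It assumes the canonical ID names "particle_id" and "time_step" and the default prefixes from the signature.

```python
def default_state_columns(headers,
                          id_columns=("particle_id", "time_step"),
                          prefixes=("G_", "TG_", "P_", "TP_")):
    """Keep headers that are not IDs and carry no extras prefix, in file order."""
    return [h for h in headers
            if h not in id_columns and not h.startswith(tuple(prefixes))]

headers = ["particle_id", "time_step", "x0", "x1", "P_radius", "TG_field"]
state = default_state_columns(headers)   # ["x0", "x1"]
```

Passing an explicit state_columns sequence replaces this rule and fixes the order of the state components.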

SFI.trajectory.io.save_trajectory(filename, *, particle_idx, time_idx, state_vectors, extras_global=None, extras_local=None, metadata=None, format=None, float_fmt='%.8f', compression='snappy', prefix_G='G_', prefix_TG='TG_', prefix_P='P_', prefix_TP='TP_')

Unified saver for {'csv', 'parquet', 'h5'} (inferred from filename if format=None).

Parameters:

filename (str)
particle_idx (ndarray | None)
time_idx (ndarray)
state_vectors (ndarray)
extras_global (Dict[str, Any] | None)
extras_local (Dict[str, Any] | None)
metadata (Dict[str, Any] | None)
format (str | None)
float_fmt (str)
compression (str)
prefix_G (str)
prefix_TG (str)
prefix_P (str)
prefix_TP (str)

Return type:

None
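Both the loader and the saver infer the format from the file name when format=None. A minimal sketch of such an inference is shown below. The helper infer_format and the extension table are assumptions made for illustration (in particular the ".hdf5" alias); the source only lists the three format names.

```python
from pathlib import Path

# Assumed extension table; only the format names come from this page.
_EXT_TO_FORMAT = {".csv": "csv", ".parquet": "parquet", ".h5": "h5", ".hdf5": "h5"}

def infer_format(filename, format=None):
    """Return the explicit format if given, else guess it from the extension."""
    if format is not None:
        return format
    ext = Path(filename).suffix.lower()
    try:
        return _EXT_TO_FORMAT[ext]
    except KeyError:
        raise ValueError(f"cannot infer format from extension {ext!r}") from None

fmt = infer_format("run_01.parquet")   # "parquet"
```

An explicit format argument always takes precedence, which is useful for files with nonstandard extensions.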
