SFI.trajectory.io module

CSV I/O utilities and columnar ↔ tensor conversion for trajectory data.

File format

Trajectories are stored in a single CSV file with an optional YAML header. The numeric columns are:

  • particle_id (optional): if absent, the file is treated as a single trajectory.

  • time_step : integer time index t (0-based after relabel).

  • x0, x1, ..., x{d-1} : state vector components.

Extras are stored either in the header (YAML) or as extra numeric columns:

Prefixes (numeric columns)

  • TG_ : time-dependent globals — values depend on t only.

  • P_ : per-particle constants — values depend on particle only.

  • TP_ : time-dependent per-particle — values depend on (t, n).

  • G_ : global scalars — constants stored in the header via averaging.

Note: TG_/TP_ columns are parsed as time series and wrapped into TimeSeriesExtra. Header extras_global entries are treated as static unless explicitly wrapped when building the dataset.
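As an illustration of the prefix convention above, the following sketch sorts a list of column names into ID, state, and extras groups. The helper classify_columns and the example column names are hypothetical and not part of SFI; the sketch assumes the default prefixes and the canonical ID column names listed above.

```python
# Hypothetical helper (not part of SFI): group column names by the
# default prefixes described in the "File format" section.
PREFIXES = {
    "TG_": "time_global",    # depends on t only
    "TP_": "time_particle",  # depends on (t, n)
    "P_": "particle",        # depends on particle only
    "G_": "global",          # global scalar
}
ID_COLUMNS = ("particle_id", "time_step")


def classify_columns(names):
    """Return a dict mapping each category to the column names in it."""
    groups = {"ids": [], "state": []}
    groups.update({kind: [] for kind in PREFIXES.values()})
    for name in names:
        if name in ID_COLUMNS:
            groups["ids"].append(name)
            continue
        for prefix, kind in PREFIXES.items():
            if name.startswith(prefix):
                groups[kind].append(name)
                break
        else:
            # No ID name and no extras prefix: treat as a state component.
            groups["state"].append(name)
    return groups


columns = ["particle_id", "time_step", "x0", "x1",
           "TG_temperature", "P_radius", "TP_force"]
groups = classify_columns(columns)
```

Here groups["state"] holds x0 and x1, while each prefixed column lands in its own extras category.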

Round-trip helpers

  • flatten_X_to_columns() / assemble_X_from_columns() convert between structured tensors (T,N,d) and flat columns.

  • save_trajectory_csv_with_extras() / load_trajectory_csv_with_extras() handle extras and header metadata.

  • columns_and_extras_to_dataset() builds a TrajectoryDataset ready for inference.
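The conversion between a structured tensor (T,N,d) and flat columns can be pictured with plain NumPy. The sketch below is not the implementation of flatten_X_to_columns() or assemble_X_from_columns(); the function names, the choice to drop rows containing NaN, and the NaN fill for missing entries are assumptions made for illustration.

```python
import numpy as np


def flatten_to_columns(X):
    """(T, N, d) tensor -> (particle_idx, time_idx, states); NaN rows dropped."""
    T, N, d = X.shape
    # time_grid[t, n] == t and particle_grid[t, n] == n
    time_grid, particle_grid = np.meshgrid(
        np.arange(T), np.arange(N), indexing="ij"
    )
    states = X.reshape(T * N, d)
    keep = ~np.isnan(states).any(axis=1)  # assumed: NaN marks a missing entry
    return particle_grid.ravel()[keep], time_grid.ravel()[keep], states[keep]


def assemble_from_columns(particle_idx, time_idx, states, fill_value=np.nan):
    """Flat columns -> (T, N, d) tensor, missing entries set to fill_value."""
    T = int(time_idx.max()) + 1
    N = int(particle_idx.max()) + 1
    X = np.full((T, N, states.shape[1]), fill_value, dtype=float)
    X[time_idx, particle_idx] = states
    return X


X = np.arange(12, dtype=float).reshape(3, 2, 2)
X[1, 1] = np.nan  # particle 1 is unobserved at time step 1
p, t, s = flatten_to_columns(X)
X_back = assemble_from_columns(p, t, s)
```

The round trip recovers the original tensor, including the masked entry, as long as the last time step and the highest particle index each have at least one observed row.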

All functions are NumPy-based; JAX is optional and is used only for basic dtype detection.

SFI.trajectory.io.columns_and_extras_to_dataset(particle_idx, time_idx, state_vectors, *, extras_global=None, extras_local=None, dt=None, t=None, mask_fill_value=nan, relabel=True, compress_particles=False, meta=None)

Build a TrajectoryDataset from columns and parsed extras.

Preference order for the time axis: 1) explicit t argument, 2) extras_global['t'] (from header or TG_t), 3) fallback to scalar dt.
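The preference order can be written out as a small resolver. This is a sketch only: resolve_time_axis is a hypothetical name, and both the uniform grid starting at 0 for the dt fallback and the error raised when no time information is available are assumptions, not documented behaviour.

```python
import numpy as np


def resolve_time_axis(n_steps, t=None, extras_global=None, dt=None):
    """Pick the time axis following the documented preference order."""
    if t is not None:
        # 1) an explicit t argument wins
        return np.asarray(t, dtype=float)
    if extras_global is not None and "t" in extras_global:
        # 2) then extras_global['t'] (from the header or a TG_t column)
        return np.asarray(extras_global["t"], dtype=float)
    if dt is not None:
        # 3) finally a scalar dt; assumed here to mean a uniform grid from 0
        return dt * np.arange(n_steps)
    raise ValueError("no time information: pass t, extras_global['t'], or dt")


explicit = resolve_time_axis(3, t=[0.0, 1.0, 4.0],
                             extras_global={"t": [9, 9, 9]}, dt=0.5)
from_extras = resolve_time_axis(3, extras_global={"t": [0, 2, 4]}, dt=0.5)
from_dt = resolve_time_axis(3, dt=0.5)
```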

Parameters:
  • compress_particles (bool) – If True, apply greedy interval packing so that particles with non-overlapping time supports share the same column index. This can dramatically reduce N for open-boundary systems where particles enter and leave the field of view over time. Per-particle extras are automatically reindexed to the compressed column layout. The mapping is stored as meta['particle_column_map']. When False (default) and relabel=True, the original particle IDs are recorded as extras_local['original_particle_id'].

  • particle_idx (ndarray)

  • time_idx (ndarray)

  • state_vectors (ndarray)

  • extras_global (Mapping[str, Any] | None)

  • extras_local (Mapping[str, Any] | None)

  • dt (float | None)

  • t (ndarray | None)

  • mask_fill_value (float)

  • relabel (bool)

  • meta (Dict[str, Any] | None)

Return type:

TrajectoryDataset
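The greedy interval packing described for compress_particles can be illustrated as follows. This is a minimal sketch of one common greedy scheme (sort particles by first appearance, then reuse the first column that is already free), not necessarily the one used by the library; pack_particles and the dict layout of the returned map are hypothetical.

```python
import numpy as np


def pack_particles(particle_idx, time_idx):
    """Map particle IDs to column indices so that particles whose time
    supports do not overlap can share a column."""
    particle_idx = np.asarray(particle_idx)
    time_idx = np.asarray(time_idx)

    # Time support (first step, last step) of each particle.
    supports = []
    for pid in np.unique(particle_idx):
        times = time_idx[particle_idx == pid]
        supports.append((int(times.min()), int(times.max()), int(pid)))
    supports.sort()  # process particles in order of first appearance

    column_last = []  # last occupied time step of each column
    column_map = {}
    for first, last, pid in supports:
        for col, end in enumerate(column_last):
            if end < first:  # column is free before this particle appears
                column_last[col] = last
                column_map[pid] = col
                break
        else:
            column_map[pid] = len(column_last)  # open a new column
            column_last.append(last)
    return column_map


# Four particles entering and leaving the field of view.
particle_idx = [10, 10, 10, 11, 11, 11, 12, 12, 12, 13, 13, 13]
time_idx = [0, 1, 2, 1, 2, 3, 3, 4, 5, 4, 5, 6]
column_map = pack_particles(particle_idx, time_idx)
n_columns = len(set(column_map.values()))
```

In this example four particles fit into two columns, because particles 10 and 12, and likewise 11 and 13, are never present at the same time step.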

SFI.trajectory.io.load_trajectory(filename, *, format=None, particle_column=0, time_column=1, state_columns=None, relabel=True, prefix_G='G_', prefix_TG='TG_', prefix_P='P_', prefix_TP='TP_')

Unified loader for {'csv', 'parquet', 'h5'} files; the format is inferred from the filename if format=None.

Parameters:
  • particle_column (int | str | None) – Column holding the particle ID. Accepts a column name (str) for any format, or a positional index (int) for CSV files only. The CSV default is positional (column 0); parquet and HDF5 default to the canonical name "particle_id" and ignore int values, because an int default cannot be distinguished from "unspecified". particle_column=None marks a single-trajectory file.

  • time_column (int | str) – Column holding the time index. Accepts a column name (str) for any format, or a positional index (int) for CSV files only. The CSV default is positional (column 1); parquet and HDF5 default to the canonical name "time_step" and ignore int values, because an int default cannot be distinguished from "unspecified".

  • state_columns (Sequence[int | str] | None) – Optional explicit selection of the state-vector columns (names, or indices for CSV), in order. When given, every other non-extras column is dropped. Default: every column that is not an ID and does not carry an extras prefix is a state component.

  • filename (str)

  • format (str | None)

  • relabel (bool)

  • prefix_G (str)

  • prefix_TG (str)

  • prefix_P (str)

  • prefix_TP (str)

Returns:

The standard tuple (metadata, column_headers, particle_indices, time_indices, state_vectors, extras_global, extras_local).
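Inferring the format from the filename when format=None can be sketched as below. The helper infer_format and the extension table (in particular accepting both .h5 and .hdf5) are assumptions for illustration; the loader's actual rules may differ.

```python
from pathlib import Path

# Assumed mapping from file extension to format name.
_EXTENSIONS = {
    ".csv": "csv",
    ".parquet": "parquet",
    ".h5": "h5",
    ".hdf5": "h5",
}


def infer_format(filename, format=None):
    """Return the explicit format if given, else guess from the extension."""
    if format is not None:
        return format
    suffix = Path(filename).suffix.lower()
    if suffix not in _EXTENSIONS:
        raise ValueError(f"cannot infer format from extension {suffix!r}")
    return _EXTENSIONS[suffix]


fmt_csv = infer_format("run_01.csv")
fmt_h5 = infer_format("run_01.H5")
fmt_forced = infer_format("run_01.bin", format="parquet")
```

Passing format explicitly bypasses the guess, which is useful for files with non-standard extensions.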

SFI.trajectory.io.save_trajectory(filename, *, particle_idx, time_idx, state_vectors, extras_global=None, extras_local=None, metadata=None, format=None, float_fmt='%.8f', compression='snappy', prefix_G='G_', prefix_TG='TG_', prefix_P='P_', prefix_TP='TP_')

Unified saver for {'csv', 'parquet', 'h5'} files; the format is inferred from the filename if format=None.

Parameters:
  • filename (str)

  • particle_idx (ndarray | None)

  • time_idx (ndarray)

  • state_vectors (ndarray)

  • extras_global (Dict[str, Any] | None)

  • extras_local (Dict[str, Any] | None)

  • metadata (Dict[str, Any] | None)

  • format (str | None)

  • float_fmt (str)

  • compression (str)

  • prefix_G (str)

  • prefix_TG (str)

  • prefix_P (str)

  • prefix_TP (str)

Return type:

None