SFI.trajectory.degrade module
SFI.trajectory.degrade
Degrade synthetic trajectories to mimic real data:
- motion blur (temporal window average)
- downsampling
- additive measurement noise
- ROI filtering (mask points outside a region)
- random data loss
Two front doors:
- Dataset/Collection API (recommended for internal use)
  - degrade_dataset(ds, …)
  - degrade_collection(coll, …)
- Columns API (back-compat for I/O scripts)
  - degrade_columns(meta, particle_idx, time_idx, state_vectors, …)
Why two? Column flow is convenient for simple scripts and file round-trips; dataset/collection flow keeps everything rectangular so we can blur/downsample time-dependent extras cleanly without flatten/unflatten gymnastics.
Extras semantics
- extras_global:
arrays with leading shape (T, …) are blurred/downsampled along time like X; other entries are passed through unchanged.
- extras_local:
arrays with shape (N, …): per-particle constants → unchanged
arrays with shape (T, N, …): blurred/downsampled along time like X
Noise/ROI/data-loss are applied on the mask (not by deleting rows), so tensor shapes remain intact. Flattening to columns (if needed) happens last.
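To illustrate what applying data loss on the mask (rather than deleting rows) can look like, here is a minimal NumPy sketch. It is not the library's implementation: the helper name `drop_valid_entries` and the choice to drop an exact, rounded count of entries are assumptions made for the example.

```python
import numpy as np

def drop_valid_entries(mask, fraction, rng):
    """Clear a random `fraction` of the currently valid mask entries.

    Hypothetical helper: shapes are preserved, only mask values change.
    """
    out = mask.copy()
    valid = np.flatnonzero(out)                # flat indices of valid entries
    n_drop = int(round(fraction * valid.size))
    drop = rng.choice(valid, size=n_drop, replace=False)
    out.flat[drop] = False                     # invalidate, do not delete
    return out

rng = np.random.default_rng(0)
mask = np.ones((6, 4), dtype=bool)             # (T, N) validity mask
mask[0, :] = False                             # first frame already invalid
degraded = drop_valid_entries(mask, 0.25, rng)
```

Because only mask values change, `X` and any time-dependent extras keep their original `(T, N, …)` shapes.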
Cache-only extras (auto-generated structural tables)
Keys starting with _cache/ are considered auto-generated structural extras
(e.g. CSR neighbor lists, stencil hyper tables). They are not degraded and
are dropped from outputs, because any degradation/context change invalidates
such cached structural objects. They can be regenerated on demand by calling
the appropriate host-side preparation routine.
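The dropping rule can be sketched as a simple key filter. This is an illustration only: `strip_cache_extras` and the example key names are hypothetical and not part of the documented API.

```python
def strip_cache_extras(extras):
    """Return a copy of `extras` without keys starting with '_cache/'."""
    return {k: v for k, v in extras.items() if not k.startswith("_cache/")}

# Hypothetical extras dictionary; the key names are made up for the example.
extras = {
    "box/dx": 0.1,
    "_cache/neighbors": [0, 1, 2],
    "temperature": 300.0,
}
cleaned = strip_cache_extras(extras)
```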
- SFI.trajectory.degrade.degrade_collection(coll, *, downsample=1, motion_blur=0, data_loss_fraction=0.0, noise=None, ROI=None, seed=None, reweight='pool')
Degrade all datasets in a collection and optionally recompute weights.
- Parameters:
coll (TrajectoryCollection) – Input collection to degrade.
downsample (int) – Same semantics as in degrade_dataset().
motion_blur (int) – Same semantics as in degrade_dataset().
data_loss_fraction (float) – Same semantics as in degrade_dataset().
noise (None | float | ndarray) – Same semantics as in degrade_dataset().
ROI (None | float | ndarray | Callable[[ndarray], bool]) – Same semantics as in degrade_dataset().
seed (int | None) – Same semantics as in degrade_dataset().
reweight (Literal['pool', 'keep']) – Policy for updating collection-level weights after degradation:
- "pool": recompute weights via with_weights("pool").
- "keep": preserve the relative weights from coll.weights.
- Returns:
New collection whose datasets have been degraded in the same way.
- Return type:
Notes
This function is purely functional: the input collection is not modified.
- SFI.trajectory.degrade.degrade_dataset(ds, *, downsample=1, motion_blur=0, data_loss_fraction=0.0, noise=None, ROI=None, seed=None)
Degrade a single TrajectoryDataset.
The function operates in tensor space; it returns a new dataset where:
- X is motion-blurred over motion_blur + 1 frames and downsampled by downsample,
- the mask is AND-reduced over the blur window, then modified by ROI and random data loss,
- t (if present) is averaged over the blur window and downsampled, otherwise scalar dt is multiplied by downsample,
- extras are processed consistently (see module docstring).
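The blur and downsample steps listed above can be sketched in NumPy as follows. This is a sketch under stated assumptions, not the library's code: the function name `blur_and_downsample` is hypothetical, each blur window is assumed to start at a retained frame `k * downsample`, and incomplete windows at the end of the trajectory are assumed to be discarded.

```python
import numpy as np

def blur_and_downsample(X, mask, downsample=1, motion_blur=0):
    """Average X over windows of motion_blur + 1 frames, one per `downsample` frames.

    X    : (T, N, d) array of states
    mask : (T, N) boolean validity mask
    """
    if downsample < 1 or not (0 <= motion_blur < downsample):
        raise ValueError("need downsample >= 1 and 0 <= motion_blur < downsample")
    T = X.shape[0]
    window = motion_blur + 1
    # Assumed alignment: windows start at frames 0, downsample, 2*downsample, ...
    starts = np.arange(0, T - window + 1, downsample)
    idx = starts[:, None] + np.arange(window)[None, :]   # (T_out, window)
    X_out = X[idx].mean(axis=1)        # temporal window average
    mask_out = mask[idx].all(axis=1)   # AND-reduce the mask over the window
    return X_out, mask_out

X = np.arange(20, dtype=float).reshape(10, 2, 1)   # T=10, N=2, d=1
mask = np.ones((10, 2), dtype=bool)
mask[4, 1] = False                                 # one invalid observation
X_out, mask_out = blur_and_downsample(X, mask, downsample=3, motion_blur=1)
```

With `downsample=3` and `motion_blur=1`, the output has three frames, and the invalid observation at frame 4 invalidates the second output frame for that particle because it falls inside the window covering frames 3 and 4.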
- Parameters:
ds (TrajectoryDataset) – Input dataset to degrade.
downsample (int) – Integer downsampling factor along the time axis (must be >= 1).
motion_blur (int) – Temporal averaging window size minus one. The actual blur window is motion_blur + 1 frames; motion_blur must satisfy 0 <= motion_blur < downsample.
data_loss_fraction (float) – Fraction of currently valid entries to drop uniformly at random after ROI filtering (in [0, 1)).
noise (None | float | ndarray) – Additive Gaussian noise scale. If a float, isotropic noise with standard deviation noise is applied. If an array, it is broadcast to the state dimension.
ROI (None | float | ndarray | Callable[[ndarray], bool]) – Region-of-interest predicate or mask. Can be:
- float: radial cutoff that keeps positions with ‖x‖₂ ≤ ROI,
- (2, d) ndarray: axis-aligned box (row 0 = lower bound, row 1 = upper bound),
- Callable[[np.ndarray], bool]: predicate evaluated on each observed position.
seed (int | None) – Optional RNG seed for the noise and data-loss generators.
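The three ROI forms and the noise scale can be read as in the following sketch. It is illustrative only: `roi_keep` and `add_noise` are hypothetical helpers, and details such as inclusive box bounds are assumptions rather than documented behaviour.

```python
import numpy as np

def roi_keep(X, ROI):
    """Boolean array of shape X.shape[:-1]: True where a position is inside the ROI."""
    if ROI is None:
        return np.ones(X.shape[:-1], dtype=bool)
    if callable(ROI):
        # Predicate evaluated on each position vector of length d.
        flat = X.reshape(-1, X.shape[-1])
        keep = np.array([bool(ROI(x)) for x in flat], dtype=bool)
        return keep.reshape(X.shape[:-1])
    roi = np.asarray(ROI, dtype=float)
    if roi.ndim == 0:
        # Radial cutoff on the Euclidean norm.
        return np.linalg.norm(X, axis=-1) <= roi
    if roi.shape == (2, X.shape[-1]):
        # Axis-aligned box; bounds assumed inclusive.
        return np.all((X >= roi[0]) & (X <= roi[1]), axis=-1)
    raise ValueError("ROI must be None, a float, a (2, d) array, or a callable")

def add_noise(X, noise, rng):
    """Add Gaussian noise; a float is isotropic, a (d,) array is per component."""
    if noise is None:
        return X
    return X + rng.normal(size=X.shape) * np.asarray(noise, dtype=float)

X = np.array([[[0.0, 0.0], [3.0, 4.0]],
              [[1.0, 1.0], [-2.0, 0.5]]])          # (T=2, N=2, d=2)
keep_radial = roi_keep(X, 2.0)
keep_box = roi_keep(X, np.array([[0.0, 0.0], [2.0, 2.0]]))
keep_pred = roi_keep(X, lambda x: x[0] > 0)
noisy = add_noise(X, np.array([0.1, 0.0]), np.random.default_rng(1))
```

In this reading the ROI result is combined with the existing mask (for example `mask & keep_radial`) instead of removing positions, which matches the statement that tensor shapes remain intact.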
- Returns:
Degraded dataset with the same number of particles and, when downsample > 1, fewer time steps.
- Return type:
- SFI.trajectory.degrade.degrade_spatial_data(coll, *, downscale=2, method='mean', blur_radius=0, data_loss_fraction=0.0, noise=None, seed=None, mask_threshold=0.5, bc='noflux', prefix='box', order='C')
Degrade an SPDE-style collection in space (blur/coarsen/pixel-loss/noise).
- Assumes the standard SPDE convention:
particle axis N is a flattened grid of shape grid_shape,
state dim d is #fields per site.
dx is read from extras_global['{prefix}/dx'] and updated automatically; it does not need to be supplied here.
Also updates ‘box/’ box parameters and erases structural outputs starting with _cache (regenerated on next use).
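The flattened-grid convention can be made concrete with a small NumPy sketch. It assumes arrays of shape `(T, N, d)` with `N = prod(grid_shape)`; the helpers `unflatten_sites` and `flatten_sites` are hypothetical names for illustration, not functions of this module.

```python
import numpy as np

def unflatten_sites(X, grid_shape, order="C"):
    """Reshape the particle axis of X (T, N, d) to the grid: (T, *grid_shape, d)."""
    T, N, d = X.shape
    if N != int(np.prod(grid_shape)):
        raise ValueError("N must equal the number of grid sites")
    # site_index[i, j, ...] is the flat particle index of grid site (i, j, ...).
    site_index = np.arange(N).reshape(grid_shape, order=order)
    return X[:, site_index, :], site_index

def flatten_sites(grid, site_index):
    """Inverse of unflatten_sites for the same site_index."""
    T, d = grid.shape[0], grid.shape[-1]
    X = np.empty((T, site_index.size, d), dtype=grid.dtype)
    X[:, site_index, :] = grid
    return X

X = np.arange(12, dtype=float).reshape(2, 6, 1)    # T=2, N=6, d=1
grid_c, idx_c = unflatten_sites(X, (2, 3), order="C")
grid_f, idx_f = unflatten_sites(X, (2, 3), order="F")
X_back = flatten_sites(grid_f, idx_f)
```

Going through an explicit index table keeps the time and field axes untouched whichever `order` is used, so the same code covers both conventions.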
- Parameters:
coll (TrajectoryCollection)
downscale (int | Tuple[int, ...])
method (Literal['mean', 'subsample'])
blur_radius (int)
data_loss_fraction (float)
noise (None | float | ndarray)
seed (int | None)
mask_threshold (float)
bc (Literal['noflux', 'pbc'])
prefix (str)
order (Literal['C', 'F'])
- Return type:
- SFI.trajectory.degrade.degrade_spatial_dataset(ds, *, downscale=1, method='mean', blur_radius=0, data_loss_fraction=0.0, noise=None, rng, mask_threshold=0.5, bc='noflux', prefix='box', order='C')
Spatial degradation of a single SPDE-style dataset.
Key invariants ensured by this routine
- The flattening convention is preserved (order="C" or "F").
- Box metadata (grid_shape, dx) is updated consistently after coarsening.
- Any prepared structural stencil payload is dropped so it is rebuilt for the new grid.
- Mask handling is conservative: a coarse cell is valid only if enough fine pixels are valid.
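The conservative mask rule can be illustrated with a block-mean coarsening of a single 2D field. This is a sketch under assumptions, not the routine's implementation: `coarsen_2d` is a hypothetical helper, the mean is taken over valid fine pixels only, and a coarse cell is treated as valid when its valid fraction is at least `mask_threshold` (whether the comparison is strict is not stated in the source).

```python
import numpy as np

def coarsen_2d(field, mask, downscale=2, mask_threshold=0.5):
    """Block-average a (ny, nx) field over downscale x downscale blocks."""
    ny, nx = field.shape
    s = downscale
    if ny % s or nx % s:
        raise ValueError("grid shape must be divisible by downscale")
    f = field.reshape(ny // s, s, nx // s, s)
    m = mask.reshape(ny // s, s, nx // s, s)
    n_valid = m.sum(axis=(1, 3))                     # valid pixels per block
    total = np.where(m, f, 0.0).sum(axis=(1, 3))     # sum over valid pixels only
    coarse = total / np.maximum(n_valid, 1)          # guard against empty blocks
    coarse_mask = (n_valid / (s * s)) >= mask_threshold
    return coarse, coarse_mask

field = np.arange(16, dtype=float).reshape(4, 4)
mask = np.ones((4, 4), dtype=bool)
mask[0, 0] = mask[0, 1] = mask[1, 0] = False         # 3 of 4 pixels invalid
coarse, coarse_mask = coarsen_2d(field, mask, downscale=2, mask_threshold=0.5)
```

In this example the top-left coarse cell has only one valid fine pixel out of four, so it is marked invalid, while the other three cells remain valid.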
- Parameters:
ds (TrajectoryDataset)
downscale (int | Tuple[int, ...])
method (Literal['mean', 'subsample'])
blur_radius (int)
data_loss_fraction (float)
noise (None | float | ndarray)
rng (Generator)
mask_threshold (float)
bc (Literal['noflux', 'pbc'])
prefix (str)
order (Literal['C', 'F'])
- Return type: