Glossary¶
Short definitions of the jargon that appears across the SFI docs.
Cross-link from any reference page with :term:`PASTIS` (etc.).
- ABP¶
Active Brownian Particle — a self-propelled particle carrying a position and a heading angle, the canonical active-matter model. See the ABP gallery demos.
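As a rough illustration of the ABP definition above, the sketch below integrates the textbook 2D active Brownian particle with a plain Euler-Maruyama step. It does not use SFI; the function name `simulate_abp` and the parameters `v0`, `D_t`, `D_r` are illustrative assumptions, not library API.

```python
import math
import random


def simulate_abp(n_steps, dt, v0=1.0, D_t=0.1, D_r=0.5, seed=0):
    """Euler-Maruyama trajectory of a 2D active Brownian particle.

    Hypothetical helper (not SFI API). Returns a list of
    (x, y, theta) tuples of length n_steps + 1.
    """
    rng = random.Random(seed)
    x, y, theta = 0.0, 0.0, 0.0
    traj = [(x, y, theta)]
    s_t = math.sqrt(2.0 * D_t * dt)  # translational noise amplitude per step
    s_r = math.sqrt(2.0 * D_r * dt)  # rotational noise amplitude per step
    for _ in range(n_steps):
        # Self-propulsion along the current heading plus translational noise.
        x += v0 * math.cos(theta) * dt + s_t * rng.gauss(0.0, 1.0)
        y += v0 * math.sin(theta) * dt + s_t * rng.gauss(0.0, 1.0)
        # The heading angle diffuses freely.
        theta += s_r * rng.gauss(0.0, 1.0)
        traj.append((x, y, theta))
    return traj
```

With both noise levels set to zero the particle moves in a straight line at speed `v0`, which is a quick sanity check on the update.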
- AIC¶
Akaike Information Criterion. Penalises support cardinality by 2k; the classical “thin” prior.
- Basis¶
A parameter-free dictionary of state functions (Basis): the model class of the linear estimators, and the linear-in-θ fast path of the parametric estimators. See Building bases.
- beam search¶
The default sparse-search strategy of sparsify_force() (the PASTIS original): a beam of candidate supports is grown and pruned by the information criterion.
- BIC¶
Bayesian Information Criterion. Penalises support cardinality by k log N; stricter than AIC at large sample sizes (as soon as log N > 2, i.e. N ≥ 8).
- bootstrapped trajectory¶
A trajectory simulated from the inferred force and diffusion (simulate_bootstrapped_trajectory()), used as a qualitative validation and for error propagation.
- conditional NLL¶
The negative log-likelihood seen as a function of \((\mathbf{D}, \Lambda)\) with the model parameters \(\theta\) held at their fitted values — minimised once to refine the profiled noise levels.
- degradation¶
Standardised synthetic data imperfections (added measurement noise, downsampling, frame loss, motion blur) applied via SFI.trajectory.degrade to quantify estimator sensitivity.
- errors-in-variables¶
Regression bias arising when the regressors themselves carry noise. In SFI, localization noise enters both the finite-difference velocities and the basis evaluations at measured positions, biasing the linear estimators on nonlinear systems; the parametric estimators correct it via the skip-trick instrument.
- Extras¶
User-defined fields attached to a TrajectoryCollection and passed to state functions at evaluation time: extras_global (per experiment) and extras_local (per particle). Used for box sizes, species labels, neighbour lists, trap centres, and other contextual data.
- G_mode¶
The Gram-matrix construction mode of the linear estimators: "rectangle", "trapeze", "shift" (overdamped), plus "doubleshift" (underdamped).
- Gauss–Newton¶
Linearisation-then-least-squares method for parametric inference, the fast path for linear-in-θ bases (inner="gn"). Replaces the Hessian of the loss with \(J^\top J\) of the test-function Jacobian.
- Gram matrix¶
\(G_{\alpha\beta} = \langle \phi_\alpha, \phi_\beta \rangle\), the normal-equation matrix assembled by SFI.integrate from time-averaged basis evaluations.
- held-out NMSE¶
The residual-based normalised mean-square error of a fitted force on an independent test collection (inf.holdout_score(test) after coll.split_time(...)), with the diffusion noise floor subtracted. A side feature for data-abundant scenarios: SFI’s default validation (force_predicted_MSE + diagnostics) costs no data; the held-out score is a bias detector whose resolution is set by χ² fluctuations.
- Heun¶
Stochastic Heun predictor–corrector integrator (weak order 2); the default scheme of OverdampedProcess (method="heun"). method="euler" selects the classical Euler–Maruyama integrator (weak order 1).
- information criterion¶
A penalised-likelihood score used to compare sparse supports: PASTIS (recommended), AIC, BIC.
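To make the AIC and BIC penalties quoted in this glossary concrete, here is a small sketch that ranks two candidate supports. It assumes the usual convention that the score is minus twice the log-likelihood plus the penalty, with lower being better; the function names and the numbers are illustrative, and the PASTIS penalty is deliberately left out because its exact form is not given here.

```python
import math


def aic_penalty(k):
    """AIC penalty for a support of cardinality k."""
    return 2.0 * k


def bic_penalty(k, n_samples):
    """BIC penalty for a support of cardinality k and n_samples data points."""
    return k * math.log(n_samples)


def penalised_score(neg2_log_lik, penalty):
    """Assumed convention: -2 log L + penalty, lower is better."""
    return neg2_log_lik + penalty


# Two hypothetical candidate supports: (-2 log L, cardinality k).
candidates = {"small": (1010.0, 2), "large": (1000.0, 5)}
n_samples = 1000

aic = {name: penalised_score(nll, aic_penalty(k))
       for name, (nll, k) in candidates.items()}
bic = {name: penalised_score(nll, bic_penalty(k, n_samples))
       for name, (nll, k) in candidates.items()}

best_aic = min(aic, key=aic.get)
best_bic = min(bic, key=bic.get)
```

In this made-up example AIC prefers the larger support while BIC prefers the smaller one, which is the sense in which BIC is the stricter criterion at large N.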
- instrument¶
In errors-in-variables regression, a quantity correlated with the true regressor but uncorrelated with its measurement noise, used to build an unbiased estimating equation.
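The generic idea of an instrument can be sketched on a scalar regression. This toy example is not SFI's skip-trick: it only assumes a true regressor observed twice with independent noise, so that the second observation can serve as the instrument for the first. All names are illustrative.

```python
import random


def ols_slope(x, y):
    """Least-squares slope through the origin; biased when x carries noise."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


def iv_slope(z, x, y):
    """Instrumental-variable slope: z is correlated with the true regressor
    but independent of the measurement noise in x."""
    return sum(c * b for c, b in zip(z, y)) / sum(c * a for c, a in zip(z, x))


def demo(n=200_000, beta=2.0, seed=0):
    rng = random.Random(seed)
    x_true = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x_meas = [x + rng.gauss(0.0, 1.0) for x in x_true]  # noisy regressor
    z = [x + rng.gauss(0.0, 1.0) for x in x_true]       # independent noisy copy
    y = [beta * x + rng.gauss(0.0, 0.1) for x in x_true]
    return ols_slope(x_meas, y), iv_slope(z, x_meas, y)
```

With unit signal and unit noise variance, the naive slope is attenuated towards roughly half the true value, while the instrumented estimate stays close to `beta`.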
- Interactor¶
A local K-body interaction rule (Interactor) that is dispatched over a neighbour graph to build a global multi-particle Basis/PSF/SF. See Particle systems.
- Itô convention¶
SDE interpretation where the stochastic increment \(\sqrt{2D(x_t)}\,dW_t\) is evaluated at the left endpoint of the time step. See Physics Reference.
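The left-endpoint rule can be written as a single update in one dimension. This is a minimal sketch of an Itô (Euler-Maruyama) step, not SFI's integrator; `ito_step` and `simulate_ito` are hypothetical names, and `drift` and `diffusion` are user-supplied callables.

```python
import math
import random


def ito_step(x, drift, diffusion, dt, dW):
    """One Ito update: drift and noise amplitude are both evaluated at x_t,
    the left endpoint of the time step."""
    return x + drift(x) * dt + math.sqrt(2.0 * diffusion(x)) * dW


def simulate_ito(x0, drift, diffusion, dt, n_steps, seed=0):
    """Chain Ito steps with Wiener increments dW ~ N(0, dt)."""
    rng = random.Random(seed)
    path = [x0]
    for _ in range(n_steps):
        dW = rng.gauss(0.0, math.sqrt(dt))
        path.append(ito_step(path[-1], drift, diffusion, dt, dW))
    return path
```

For example, `simulate_ito(0.0, lambda x: -x, lambda x: 1.0 + x * x, 0.01, 50)` returns a 51-point path for a state-dependent diffusion coefficient.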
- JAX persistent cache¶
On-disk cache of compiled JAX traces, opt-in via SFI_JAX_CACHE_DIR=~/.cache/sfi/jax_cache. Saves seconds to minutes per session on repeated runs.
- L-BFGS¶
Limited-memory BFGS, a quasi-Newton optimiser; the inner solver of the parametric estimator for nonlinear-in-θ PSF families (inner="lbfgs") and for the \((\mathbf{D}, \Lambda)\) profile.
- LASSO¶
L1-penalised least-squares for sparse model selection. Implemented as LassoStrategy.
- Layout¶
The grid declaration of the experimental SPDE toolbox (GridLayout): named field sectors on a regular grid with boundary conditions, providing differential operators and symmetry-aware embedding. See Structured fields: Layout, Sectors, and Embed.
- linear estimators¶
The closed-form estimator family: infer_force_linear(), infer_diffusion_linear(), compute_diffusion_constant(). A projection onto a basis (no initial guess, no iterations), exact in the fine-sampling, low-noise limit, biased outside it. See Choosing an estimator.
- local-precision NLL¶
Negative log-likelihood weighted by the inverse of the locally estimated noise covariance, used in the parametric path to handle heteroscedastic measurement noise. See Parametric windowed estimators — concepts.
- M_mode¶
The moment/kinematics convention of the linear estimators. Overdamped: "auto" (noise-aware selection), "Ito", "Ito-shift", "Strato". Underdamped: "symmetric" (the "auto" resolution), "early", "anticipated".
- mask¶
The boolean validity array (shape (T, N)) attached to each dataset, encoding missing frames and particles entering or leaving. Honoured automatically by state functions and estimators.
- measurement-noise covariance¶
The covariance \(\Lambda\) of the localization error on each recorded position. Estimated jointly with the diffusion by the Vestergaard method (linear estimators, exposed as inf.Lambda) or profiled natively (parametric estimators).
- moment estimator¶
A closed-form estimator built from low-order moments of the increments — used to initialise the parametric \((\mathbf{D}, \Lambda)\) profile, and, in the linear estimators, selected by the M_mode convention.
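A textbook instance of a moment estimator is the covariance-based estimator for free 1D diffusion with localization noise: the mean squared increment and the covariance of consecutive increments together give both the diffusion constant and the noise variance. The sketch below assumes no motion blur and constant diffusion; it illustrates the idea only and is not SFI's implementation. The function names are hypothetical.

```python
import math
import random


def covariance_estimator(x, dt):
    """Estimate (D, sigma^2) from a noisy 1D free-diffusion trajectory.

    Uses <dx^2> = 2 D dt + 2 sigma^2 and <dx_n dx_{n+1}> = -sigma^2,
    valid without motion blur.
    """
    dx = [b - a for a, b in zip(x[:-1], x[1:])]
    msd = sum(d * d for d in dx) / len(dx)
    cov = sum(a * b for a, b in zip(dx[:-1], dx[1:])) / (len(dx) - 1)
    D_hat = msd / (2.0 * dt) + cov / dt
    sigma2_hat = -cov
    return D_hat, sigma2_hat


def noisy_brownian_path(n, D, sigma, dt, seed=0):
    """Free Brownian motion recorded with Gaussian localization error."""
    rng = random.Random(seed)
    step = math.sqrt(2.0 * D * dt)
    x, path = 0.0, []
    for _ in range(n):
        path.append(x + sigma * rng.gauss(0.0, 1.0))  # noisy recorded position
        x += step * rng.gauss(0.0, 1.0)               # true position advances
    return path
```

Calling `covariance_estimator(noisy_brownian_path(200_000, 1.0, 0.5, 1.0), 1.0)` should return values close to the simulated `D = 1` and `sigma**2 = 0.25`, up to sampling error.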
- neighbour list¶
CSR-encoded list of neighbour indices for each particle, used by pair-interaction bases. Built host-side via SFI.utils.neighbors.build_neighbor_csr() between JIT chunks; see AGENTS.md §4.8.
- NMSE¶
Normalised Mean Square Error, the canonical force/diffusion accuracy metric: mean squared error of the inferred field divided by the mean square of the true field. Available as inf.NMSE_force after compare_to_exact(); inf.force_predicted_MSE is the a-priori estimate that needs no ground truth.
- parametric estimators¶
The likelihood-based estimator family: infer_force(), infer_diffusion(). One or more RK4 flow steps per observation interval, windowed-precision NLL, native \((\mathbf{D}, \Lambda)\) profiling; robust to measurement noise and coarse sampling, accepts nonlinear-in-θ models. See Choosing an estimator.
- Pareto front¶
The error-vs-sparsity frontier explored by sparsify_force(); the returned SparsityResult stores it and can be re-queried under any criterion without re-running the search.
- particles¶
The N axis of a trajectory’s (T, N, d) state array: the independent or interacting bodies tracked over time (cells, colloids, agents, …). State functions declare how they consume this axis through pdepth: pdepth=0 evaluates one particle at a time (the same law applied independently to each), while pdepth=1 sees all particles together for interactions. The particle count may vary over time; the mask records entries and exits. See Particle systems.
- PASTIS¶
Parsimonious Stochastic Inference, the canonical information criterion used by sparsify_force(). Penalises support cardinality with a Bayes-factor-like prior set by p. Gerardos & Ronceray, Phys. Rev. Lett. 135, 167401 (2025).
- PBC¶
Periodic Boundary Conditions: wrap-around boundaries on a box or grid. Minimum-image inter-particle displacements are computed by SFI.bases.pairs.pbc_displacement().
- per-dataset parameter¶
A model parameter taking an independent inferred value per dataset of a pooled multi-experiment collection, selected through the reserved dataset_index extra: per_dataset_scalar() (parametric estimators) or dataset_indicator() one-hot features (linear estimators). The per-particle analogue lives inside make_interactor() kernels via the reserved particle_index extra. To reproduce a single experiment, fold the model at one index with specialize(), which removes the dataset_index dependence (see specialize).
- profiling¶
Internal estimation of nuisance parameters — in SFI, the diffusion level \(\mathbf{D}\) and measurement-noise covariance \(\Lambda\) during a parametric fit — so the user does not have to supply them. Skipped entirely when both are passed explicitly.
- PSF¶
Parametric State Function (PSF): a model family \(F(x;\theta)\) with a named parameter tree; the model class of nonlinear parametric inference. See Models and state functions.
- rank¶
The tensor rank of a state-function output: 0 = scalar, 1 = vector (forces), 2 = matrix (diffusion tensors).
- RK4¶
Classical fourth-order Runge–Kutta scheme; used by the parametric estimator to integrate the deterministic drift flow over each observation interval.
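For reference, the classical RK4 update for a deterministic flow dx/dt = f(x) looks as follows. This is the standard textbook scheme in scalar form, not SFI's internal code; `rk4_step` and `flow` are illustrative names, and `n_substeps` stands in for taking one or more steps per observation interval.

```python
def rk4_step(f, x, h):
    """One classical fourth-order Runge-Kutta step of size h for dx/dt = f(x)."""
    k1 = f(x)
    k2 = f(x + 0.5 * h * k1)
    k3 = f(x + 0.5 * h * k2)
    k4 = f(x + h * k3)
    return x + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)


def flow(f, x, dt, n_substeps=1):
    """Advance x over one observation interval dt using n_substeps RK4 steps."""
    h = dt / n_substeps
    for _ in range(n_substeps):
        x = rk4_step(f, x, h)
    return x
```

For the linear drift `f = lambda x: -x`, a single step over `dt = 0.1` already agrees with the exact solution `exp(-0.1)` to better than one part in a million.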
- secant velocity¶
Centred finite-difference velocity \(v_t = (x_{t+1} - x_{t-1})/(2\Delta t)\) used by the underdamped diagnostics and the ULI residual. See Diagnostics.
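The formula above translates directly into code. The helper below is a plain-Python sketch for a 1D position series (the name `secant_velocity` is illustrative); note that the first and last frames have no centred difference, so the output is two entries shorter than the input.

```python
def secant_velocity(x, dt):
    """Centred finite-difference velocities v_t = (x[t+1] - x[t-1]) / (2 dt).

    Returns one value per interior frame t = 1 .. len(x) - 2.
    """
    return [(x[t + 1] - x[t - 1]) / (2.0 * dt) for t in range(1, len(x) - 1)]
```

For a parabolic track `x = t**2` sampled at unit intervals, the centred difference reproduces the exact velocity `2 t` at every interior frame.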
- Sector¶
A named component group within a Layout (e.g. a scalar field U, a Q-tensor), addressed when building SPDE bases.
- SF¶
State Function with frozen parameters (SF): the evaluable object produced by a fit, ready for Langevin simulation.
- skip-trick¶
The errors-in-variables instrument of the parametric Gauss–Newton path: test functions are evaluated at temporally separated (skipped) observations, decorrelating the instrument from the measurement noise of the residual and restoring consistency. On by default (eiv="auto").
- SPDE¶
Stochastic Partial Differential Equation — field dynamics on a regular grid, where the drift is a spatial-operator functional of the field. SFI infers them via composable stencil operators (experimental toolbox); see Spatial field inference (SPDE).
- specialize¶
Collapse a pooled model to one experiment’s standalone single-condition form: specialize() folds every per-dataset parameter at a chosen dataset_index (per-dataset arrays reduce to that index’s slice; one-hot indicators become constant) so the result does not read dataset_index. Used by simulate_bootstrapped_trajectory() to export a clean single-trajectory model.
- STLSQ¶
Sequential Thresholded Least Squares, the SINDy-style strategy. Implemented as STLSQStrategy.
- Stratonovich convention¶
Mid-point evaluation of the stochastic increment. Required for state-dependent D. See Physics Reference.
- trapeze¶
The trapezoidal Gram construction (G_mode="trapeze"), which symmetrises basis evaluations across each interval and removes the leading finite-Δt bias of the rectangle rule. Amri et al., Phys. Rev. Research 6, 043030 (2024).
- ULI¶
Underdamped Langevin Inference, the position-only inference scheme for inertial systems (Brückner, Ronceray & Broedersz, Phys. Rev. Lett. 125, 058103 (2020)), implemented by UnderdampedLangevinInference.
- velocity reconstruction¶
The underdamped engine’s internal estimation of unobserved velocities from positions (secant differences with bias-corrected moments). You never supply velocities; see Underdamped systems.
- Vestergaard¶
The covariance-based constant-diffusion estimator (after Vestergaard et al.), which fits the diffusion and the localization-error covariance jointly; selected by compute_diffusion_constant(method="noisy"), the noise-robust choice of compute_diffusion_constant() and its "auto" selection when noise is detected. Vestergaard, C. L., Blainey, P. C. & Flyvbjerg, H., Optimal estimation of diffusion coefficients from single-particle trajectories, Phys. Rev. E 89, 022726 (2014).
- WeakNoise¶
The clean-data constant-diffusion estimator of compute_diffusion_constant(); assumes negligible localization error.
- weights (multi-experiment)¶
Per-dataset unnormalised multipliers of a TrajectoryCollection, applied to every estimator (force, diffusion, parametric): "pool" (default; pool all increments on equal footing), "per_dataset" (each experiment counts equally), or an explicit array. Within-dataset weighting is intrinsic to each estimator (force per-Δt, diffusion per-point). See Trajectory data.
- windowed precision¶
The banded inverse covariance of the parametric residuals over a short time window, providing the weights of the parametric NLL. Captures the correlations that measurement noise induces between consecutive residuals.