.. _glossary: Glossary ======== Short definitions of the jargon that appears across the SFI docs. Cross-link from any reference page with ``:term:\`PASTIS\``` (etc.). .. glossary:: :sorted: PASTIS Parsimonious Stochastic Inference — the canonical information criterion used by :meth:`~SFI.OverdampedLangevinInference.sparsify_force`. Penalises support cardinality with a Bayes-factor-like prior set by ``p``. Gerardos & Ronceray, *Phys. Rev. Lett.* **135**, 167401 (2025). AIC Akaike Information Criterion. Penalises support cardinality by ``2k``; the classical "thin" prior. BIC Bayesian Information Criterion. Penalises support cardinality by ``k log N``; stricter than AIC at large sample sizes. LASSO L\ :sub:`1`-penalised least-squares for sparse model selection. Implemented as :class:`~SFI.inference.LassoStrategy`. STLSQ Sequential Thresholded Least Squares — the SINDy-style strategy. Implemented as :class:`~SFI.inference.STLSQStrategy`. L-BFGS Limited-memory BFGS, a quasi-Newton optimiser; the inner solver of the parametric estimator for nonlinear-in-θ :class:`~SFI.statefunc.PSF` families (``inner="lbfgs"``) and for the :math:`(D, \Lambda)` profile. RK4 Classical fourth-order Runge–Kutta scheme; used by the parametric estimator to integrate the deterministic drift flow over each observation interval. Heun Stochastic Heun predictor–corrector integrator (weak order 2); the **default** scheme of :class:`~SFI.langevin.OverdampedProcess` (``method="heun"``). ``method="euler"`` selects the classical Euler–Maruyama integrator (weak order 1). Gauss–Newton Linearisation-then-least-squares method for parametric inference, the fast path for linear-in-θ bases (``inner="gn"``). Replaces the Hessian of the loss with :math:`J^\top J` of the test-function Jacobian. Gram matrix :math:`G_{\alpha\beta} = \langle \phi_\alpha, \phi_\beta \rangle`, the normal-equation matrix assembled by :mod:`SFI.integrate` from time-averaged basis evaluations. Itô convention SDE interpretation where the stochastic increment :math:`\sqrt{2D(x_t)}\,dW_t` is evaluated at the *left* endpoint of the time step. See :doc:`/physics_reference`. Stratonovich convention Mid-point evaluation of the stochastic increment. Required for state-dependent ``D``. See :doc:`/physics_reference`. secant velocity Centred finite-difference velocity :math:`v_t = (x_{t+1} - x_{t-1})/(2\Delta t)` used by the underdamped diagnostics and the ULI residual. See :doc:`/diagnostics`. local-precision NLL Negative log-likelihood weighted by the inverse of the *locally* estimated noise covariance, used in the parametric path to handle heteroscedastic measurement noise. See :doc:`/inference/parametric_concept`. neighbour list CSR-encoded list of neighbour indices for each particle, used by pair-interaction bases. Built host-side via :func:`SFI.utils.neighbors.build_neighbor_csr` between JIT chunks; see AGENTS.md §4.8. JAX persistent cache On-disk cache of compiled JAX traces, opt-in via ``SFI_JAX_CACHE_DIR=~/.cache/sfi/jax_cache``. Saves seconds to minutes per session on repeated runs. NMSE Normalised Mean Square Error — the canonical force/diffusion accuracy metric: mean squared error of the inferred field divided by the mean square of the true field. Available as ``inf.NMSE_force`` after :meth:`~SFI.OverdampedLangevinInference.compare_to_exact`; ``inf.force_predicted_MSE`` is the a-priori estimate that needs no ground truth. SPDE Stochastic Partial Differential Equation — field dynamics on a regular grid, where the drift is a spatial-operator functional of the field. SFI infers them via composable stencil operators (experimental toolbox); see :doc:`/spde/user_guide`. ABP Active Brownian Particle — a self-propelled particle carrying a position and a heading angle, the canonical active-matter model. See the ABP gallery demos. PBC Periodic Boundary Conditions — wrap-around boundaries on a box or grid. Minimum-image inter-particle displacements are computed by :func:`SFI.bases.pairs.pbc_displacement`. Extras User-defined fields attached to a :class:`~SFI.trajectory.TrajectoryCollection` and passed to state functions at evaluation time — ``extras_global`` (per experiment) and ``extras_local`` (per particle). Used for box sizes, species labels, neighbour lists, trap centres, and other contextual data. Interactor A local K-body interaction rule (:class:`~SFI.statefunc.Interactor`) that is dispatched over a neighbour graph to build a global multi-particle Basis/PSF/SF. See :doc:`/particles/user_guide`. particles The ``N`` axis of a trajectory's ``(T, N, d)`` state array — the independent or interacting bodies tracked over time (cells, colloids, agents, …). State functions declare how they consume this axis through ``pdepth``: ``pdepth=0`` evaluates one particle at a time (the same law applied independently to each), while ``pdepth=1`` sees all particles together for interactions. The particle count may vary over time; the :term:`mask` records entries and exits. See :doc:`/particles/user_guide`. linear estimators The closed-form estimator family: :meth:`~SFI.OverdampedLangevinInference.infer_force_linear`, :meth:`~SFI.OverdampedLangevinInference.infer_diffusion_linear`, :meth:`~SFI.OverdampedLangevinInference.compute_diffusion_constant`. A projection onto a basis — no initial guess, no iterations — exact in the fine-sampling, low-noise limit, biased outside it. See :ref:`choosing-an-estimator`. parametric estimators The likelihood-based estimator family: :meth:`~SFI.OverdampedLangevinInference.infer_force`, :meth:`~SFI.OverdampedLangevinInference.infer_diffusion`. One or more RK4 flow steps per observation interval, windowed-precision NLL, native :math:`(\mathbf{D}, \Lambda)` profiling; robust to measurement noise and coarse sampling, accepts nonlinear-in-θ models. See :ref:`choosing-an-estimator`. errors-in-variables Regression bias arising when the regressors themselves carry noise. In SFI, localization noise enters both the finite-difference velocities and the basis evaluations at measured positions, biasing the linear estimators on nonlinear systems; the parametric estimators correct it via the :term:`skip-trick` instrument. skip-trick The errors-in-variables :term:`instrument` of the parametric Gauss–Newton path: test functions are evaluated at temporally separated (skipped) observations, decorrelating the instrument from the measurement noise of the residual and restoring consistency. On by default (``eiv="auto"``). instrument In errors-in-variables regression, a quantity correlated with the true regressor but uncorrelated with its measurement noise, used to build an unbiased estimating equation. windowed precision The banded inverse covariance of the parametric residuals over a short time window, providing the weights of the parametric NLL. Captures the correlations that measurement noise induces between consecutive residuals. conditional NLL The negative log-likelihood seen as a function of :math:`(\mathbf{D}, \Lambda)` with the model parameters :math:`\theta` held at their fitted values — minimised once to refine the profiled noise levels. profiling Internal estimation of nuisance parameters — in SFI, the diffusion level :math:`\mathbf{D}` and measurement-noise covariance :math:`\Lambda` during a parametric fit — so the user does not have to supply them. Skipped entirely when both are passed explicitly. measurement-noise covariance The covariance :math:`\Lambda` of the localization error on each recorded position. Estimated jointly with the diffusion by the :term:`Vestergaard` method (linear estimators, exposed as ``inf.Lambda``) or profiled natively (parametric estimators). moment estimator A closed-form estimator built from low-order moments of the increments — used to initialise the parametric :math:`(\mathbf{D}, \Lambda)` profile, and, in the linear estimators, selected by the :term:`M_mode` convention. Vestergaard The covariance-based constant-diffusion estimator (after Vestergaard *et al.*), which fits the diffusion and the localization-error covariance jointly; selected by ``compute_diffusion_constant(method="noisy")`` — the noise-robust choice of :meth:`~SFI.inference.OverdampedLangevinInference.compute_diffusion_constant` and its ``"auto"`` selection when noise is detected. Vestergaard, C. L., Blainey, P. C. & Flyvbjerg, H., *Optimal estimation of diffusion coefficients from single-particle trajectories*, **Phys. Rev. E** 89, 022726 (2014). WeakNoise The clean-data constant-diffusion estimator of ``compute_diffusion_constant``; assumes negligible localization error. M_mode The moment/kinematics convention of the linear estimators. Overdamped: ``"auto"`` (noise-aware selection), ``"Ito"``, ``"Ito-shift"``, ``"Strato"``. Underdamped: ``"symmetric"`` (the ``"auto"`` resolution), ``"early"``, ``"anticipated"``. G_mode The Gram-matrix construction mode of the linear estimators: ``"rectangle"``, ``"trapeze"``, ``"shift"`` (overdamped), plus ``"doubleshift"`` (underdamped). trapeze The trapezoidal Gram construction (``G_mode="trapeze"``), which symmetrises basis evaluations across each interval and removes the leading finite-Δt bias of the rectangle rule. Amri *et al.*, *Phys. Rev. Research* **6**, 043030 (2024). Basis A parameter-free dictionary of state functions (:class:`~SFI.statefunc.Basis`) — the model class of the linear estimators, and the linear-in-θ fast path of the parametric estimators. See :doc:`/bases/user_guide`. PSF Parametric State Function (:class:`~SFI.statefunc.PSF`) — a model family :math:`F(x;\theta)` with a named parameter tree; the model class of nonlinear parametric inference. See :doc:`/statefunc/user_guide`. SF State Function with frozen parameters (:class:`~SFI.statefunc.SF`) — the evaluable object produced by a fit, ready for Langevin simulation. rank The tensor rank of a state-function output: 0 = scalar, 1 = vector (forces), 2 = matrix (diffusion tensors). mask The boolean validity array (shape ``(T, N)``) attached to each dataset, encoding missing frames and particles entering or leaving. Honoured automatically by state functions and estimators. degradation Standardised synthetic data imperfections — added measurement noise, downsampling, frame loss, motion blur — applied via :mod:`SFI.trajectory.degrade` to quantify estimator sensitivity. bootstrapped trajectory A trajectory simulated from the *inferred* force and diffusion (:meth:`simulate_bootstrapped_trajectory`), used as a qualitative validation and for error propagation. held-out NMSE The residual-based normalised mean-square error of a fitted force on an independent test collection (``inf.holdout_score(test)`` after ``coll.split_time(...)``), with the diffusion noise floor subtracted. A side feature for data-abundant scenarios — SFI's default validation (``force_predicted_MSE`` + diagnostics) costs no data; the held-out score is a bias detector whose resolution is set by χ² fluctuations. Pareto front The error-vs-sparsity frontier explored by :meth:`sparsify_force`; the returned :class:`~SFI.inference.sparse.SparsityResult` stores it and can be re-queried under any criterion without re-running the search. beam search The default sparse-search strategy of :meth:`sparsify_force` (the PASTIS original): a beam of candidate supports is grown and pruned by the information criterion. information criterion A penalised-likelihood score used to compare sparse supports: :term:`PASTIS` (recommended), :term:`AIC`, :term:`BIC`. velocity reconstruction The underdamped engine's internal estimation of unobserved velocities from positions (secant differences with bias-corrected moments). You never supply velocities; see :doc:`/inference/underdamped`. ULI Underdamped Langevin Inference — the position-only inference scheme for inertial systems (Brückner, Ronceray & Broedersz, *Phys. Rev. Lett.* **125**, 058103 (2020)), implemented by :class:`~SFI.inference.UnderdampedLangevinInference`. Layout The grid declaration of the experimental SPDE toolbox (``GridLayout``): named field sectors on a regular grid with boundary conditions, providing differential operators and symmetry-aware embedding. See :doc:`/spde/layout_guide`. Sector A named component group within a :term:`Layout` (e.g. a scalar field ``U``, a Q-tensor), addressed when building SPDE bases. per-dataset parameter A model parameter taking an independent inferred value per dataset of a pooled multi-experiment collection, selected through the reserved ``dataset_index`` extra: :func:`~SFI.bases.per_dataset_scalar` (parametric estimators) or :func:`~SFI.bases.dataset_indicator` one-hot features (linear estimators). The per-particle analogue lives inside :func:`~SFI.statefunc.make_interactor` kernels via the reserved ``particle_index`` extra. To reproduce a single experiment, fold the model at one index with :meth:`~SFI.statefunc.StateExpr.specialize`, which removes the ``dataset_index`` dependence (see :term:`specialize`). specialize Collapse a pooled model to one experiment's standalone single-condition form: :meth:`~SFI.statefunc.StateExpr.specialize` folds every :term:`per-dataset parameter` at a chosen ``dataset_index`` (per-dataset arrays reduce to that index's slice; one-hot indicators become constant) so the result does not read ``dataset_index``. Used by :meth:`~SFI.inference.OverdampedLangevinInference.simulate_bootstrapped_trajectory` to export a clean single-trajectory model. weights (multi-experiment) Per-dataset **unnormalised multipliers** of a :class:`~SFI.trajectory.TrajectoryCollection`, applied to every estimator (force, diffusion, parametric): ``"pool"`` (default — pool all increments on equal footing), ``"per_dataset"`` (each experiment counts equally), or an explicit array. Within-dataset weighting is intrinsic to each estimator (force per-Δt, diffusion per-point). See :doc:`/trajectory/user_guide`.