SFI.diagnostics.residuals module¶
Per-backend residual builders.
Each builder takes a fitted inference object and returns a
ResidualBundle containing pooled standardized residuals
\(z = \Sigma^{-1/2} r\) ready to feed into the statistical tests.
Measurement-noise-aware, banded whitening¶
Both residuals carry two correlation sources that a single-residual whitening ignores:
- Measurement noise \(\Sigma_\eta\). The diagnostic residual covariance is \(C = \text{(thermal)} + c\,\Sigma_\eta\), not the thermal part alone. The estimator’s profiled \(\Sigma_\eta\) (inferer.Lambda) is folded into \(C\) so that a well-recovered but noisy fit still whitens to unit variance instead of tripping every flag. On clean data \(\Sigma_\eta\approx 0\) and this reduces to the thermal whitening.
- Serial correlation. Localisation error is shared between neighbouring residuals, so the residual series is a moving-average process: the overdamped increment is MA(1) with lag-1 block \(-\Sigma_\eta\), and the kept underdamped acceleration series is MA(1) with lag-1 block \(\Sigma_\eta/\Delta t^4\). A banded whitening, namely the sequential block-Cholesky innovations of the block-tridiagonal residual covariance (_sequential_innovations()), decorrelates the stream, exactly paralleling the parametric core’s banded precision. On clean data the off-diagonal block vanishes and the innovations coincide with the marginal whitening.
The whitened stream z (moments / normality / autocorrelation) uses
the banded innovations; the per-row Mahalanobis norms z_squared_norms
(the chi-square / MSE-consistency bias check) keep the marginal
noise-aware form, which faithfully preserves a slowly-varying force bias
that the innovations would partly difference out.
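The banded whitening described above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the library's _sequential_innovations(): the function name and signature are hypothetical, and the lag-0 block C0 and the lag-1 block C1 (taken here as \(\mathrm{Cov}(r_t, r_{t-1})\)) are assumed constant in time.

```python
import numpy as np

def sequential_innovations_sketch(r, C0, C1):
    """Whiten a series r of shape (T, d) with block-tridiagonal covariance.

    C0 is the lag-0 block Cov(r_t, r_t); C1 is the lag-1 block Cov(r_t, r_{t-1}).
    Hypothetical helper for illustration; both blocks are assumed time-constant.
    """
    r = np.asarray(r, dtype=float)
    C0 = np.asarray(C0, dtype=float)
    C1 = np.asarray(C1, dtype=float)
    z = np.empty_like(r)
    S = C0                      # innovation covariance of the first sample
    e = r[0]                    # the first innovation is the residual itself
    z[0] = np.linalg.solve(np.linalg.cholesky(S), e)
    for t in range(1, r.shape[0]):
        G = np.linalg.solve(S.T, C1.T).T   # gain C1 @ inv(S_{t-1})
        e = r[t] - G @ e                   # remove the part predicted by the past
        S = C0 - G @ C1.T                  # Schur complement: new innovation covariance
        z[t] = np.linalg.solve(np.linalg.cholesky(S), e)
    return z
```

With C1 set to zero the gain vanishes and each row is whitened by the Cholesky factor of C0 alone, which is the marginal whitening mentioned for clean data.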
Residual conventions¶
Overdamped: the Euler–Maruyama increment residual, whose measurement-noise part has lag-1 covariance \(-\Sigma_\eta\). For the linear path the thermal part is the exact ML residual; for the parametric path it is an approximation that is nevertheless consistent (whitened residuals should have unit variance and no autocorrelation if the model is well specified).
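Underdamped: symmetric acceleration \(\hat a_t = (X_{t+1} - 2X_t + X_{t-1})/\Delta t^2\), with residual \(r_t = \hat a_t - F(\hat x_t, \hat v_t)\), thermal covariance \(\tfrac23 A/\Delta t\), and a measurement-noise contribution \(6\Sigma_\eta/\Delta t^4\) on the diagonal block.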
For both regimes residuals are pooled across time, particles, and
spatial components, applying the dataset’s dynamic_mask (for
overdamped) or its 1-step erosion (for underdamped, which needs three
consecutive valid observations).
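The underdamped convention (symmetric acceleration plus a 1-step mask erosion) can be illustrated with a short NumPy sketch. The function names, the time-major (K, N, d) layout of the positions, and the boolean (K, N) mask are assumptions made for this example; the library itself routes data through its own streaming layer.

```python
import numpy as np

def symmetric_acceleration(X, mask, dt):
    """Return (a_hat, valid) for positions X of shape (K, N, d).

    a_hat has shape (K-2, N, d), centred on frames 1..K-2.
    valid has shape (K-2, N): True only where frames t-1, t, t+1 are all valid.
    Hypothetical helper; dt is assumed to be a constant scalar step.
    """
    X = np.asarray(X, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    a_hat = (X[2:] - 2.0 * X[1:-1] + X[:-2]) / dt**2
    valid = mask[2:] & mask[1:-1] & mask[:-2]   # 1-step erosion of the mask
    return a_hat, valid

def pool_rows(a_hat, valid):
    """Pool valid rows across time and particles into shape (n_valid, d)."""
    return a_hat[valid]
```

A single masked frame therefore invalidates three centred accelerations: the one at that frame and its two neighbours.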
- class SFI.diagnostics.residuals.ResidualBundle(z, z_components, z_squared_norms, force_quadratic_form, mean_dt, n_obs, d, regime, backend, n_particles, nmse_excess_factor=1.0, whitened=<factory>)[source]¶
Bases: object
Standardised residuals + metadata.
- Variables:
z (np.ndarray) – Whitened residuals, shape (K,). Pooled across time, particles and spatial components after masking.
z_components (np.ndarray) – Whitened residuals organised by spatial component, shape (K_per_component, d). Used for per-axis statistics.
z_squared_norms (np.ndarray) – Per-row squared Mahalanobis norm \(r_t^\top \Sigma_t^{-1} r_t\), shape (K_per_row,). Used for the diffusion / “chi-square” check.
force_quadratic_form (np.ndarray) – Per-row quadratic form \(F^\top A^{-1} F\) evaluated on the same valid samples used to build z. Pre-computing it here avoids a second evaluation of F in the MSE-consistency check downstream.
mean_dt (float) – Average step size used in the residual construction.
n_obs (int) – Number of valid (un-masked) observations used to build z.
d (int) – Spatial dimension.
regime (str) – "OD" or "UD".
backend (str) – Coarse tag of the inference path ("linear", "parametric", "nonlinear"). For diagnostic display only.
n_particles (int) – Maximum number of particles in any dataset.
nmse_excess_factor (float) – Conversion factor from the chi-square excess to the force NMSE in mse_consistency(). 1.0 for the overdamped increment residual; KAPPA_UD for the underdamped acceleration residual (see that constant for the derivation).
whitened (list of (np.ndarray, np.ndarray)) – Per-dataset (z_full, mask) pairs with z_full of shape (K, N, d) (time-major) and mask of shape (K, N). Kept so that autocorrelation can be measured strictly along time, per particle and per component; pooling the flattened z stream would mix particles and components at short lags.
- Parameters:
z (ndarray)
z_components (ndarray)
z_squared_norms (ndarray)
force_quadratic_form (ndarray)
mean_dt (float)
n_obs (int)
d (int)
regime (str)
backend (str)
n_particles (int)
nmse_excess_factor (float)
whitened (list)
- backend: str¶
- d: int¶
- force_quadratic_form: ndarray¶
- mean_dt: float¶
- n_obs: int¶
- n_particles: int¶
- nmse_excess_factor: float = 1.0¶
- regime: str¶
- whitened: list¶
- z: ndarray¶
- z_components: ndarray¶
- z_squared_norms: ndarray¶
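To illustrate why the per-dataset (z_full, mask) pairs are kept, the sketch below measures a lagged autocorrelation strictly along the time axis, per particle and per component. The function name is hypothetical, and the estimate assumes the whitened residuals already have zero mean and unit variance, so the mean lagged product is used directly as the correlation.

```python
import numpy as np

def lagged_autocorrelation(whitened, lag=1):
    """Mean of z[t] * z[t - lag] over valid pairs, taken along time only.

    whitened: list of (z_full, mask) with z_full (K, N, d) and mask (K, N).
    Hypothetical helper; assumes zero-mean, unit-variance whitened residuals.
    """
    total, count = 0.0, 0
    for z_full, mask in whitened:
        z_full = np.asarray(z_full, dtype=float)
        mask = np.asarray(mask, dtype=bool)
        if z_full.shape[0] <= lag:
            continue                              # series too short for this lag
        pair_ok = mask[lag:] & mask[:-lag]        # both ends of the pair valid
        prod = z_full[lag:] * z_full[:-lag]       # same particle, same component
        total += prod[pair_ok].sum()              # masked entries are never read
        count += int(pair_ok.sum()) * z_full.shape[-1]
    return total / count if count else float("nan")
```

Because the products are formed before any flattening, a pair never straddles two particles or two components, which is the failure mode of pooling the flat z stream.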
- SFI.diagnostics.residuals.build_overdamped_residuals(inferer, data=None)[source]¶
Build standardised Euler–Maruyama residuals for an OD inferer.
Routes data access through TrajectoryDataset.make_batch_producer, the same low-level streaming layer used by SFI.integrate, so multi-particle, masked, and multi-dataset trajectories are handled transparently.
Works for any overdamped inference path (linear, parametric, nonlinear) as long as inferer.force_inferred is callable and inferer.A_inv is available.
- Return type:
ResidualBundle
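As a rough illustration of the per-row Mahalanobis norms stored in z_squared_norms, the sketch below evaluates \(r_t^\top C^{-1} r_t\) for a caller-supplied marginal covariance C. The function name is hypothetical, and how C is assembled from the thermal part and \(\Sigma_\eta\) is left to the caller rather than asserted here.

```python
import numpy as np

def squared_mahalanobis(r, C):
    """Per-row r_t^T C^{-1} r_t for residuals r of shape (K, d).

    C is a (d, d) symmetric positive-definite covariance shared by all rows.
    Hypothetical helper for illustration only.
    """
    r = np.atleast_2d(np.asarray(r, dtype=float))
    L = np.linalg.cholesky(np.asarray(C, dtype=float))   # C = L @ L.T
    y = np.linalg.solve(L, r.T)                          # whitened rows, shape (d, K)
    return np.sum(y**2, axis=0)
```

For a well-specified model each norm is expected to average to the spatial dimension d, which is what a chi-square style check compares against.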
- SFI.diagnostics.residuals.build_residuals(inferer, data=None)[source]¶
Dispatch to the OD / UD residual builder based on the engine class.
data (optional) evaluates the residuals on an independent TrajectoryCollection instead of the training data; this is the held-out path used by holdout_score.
- Return type:
ResidualBundle
- SFI.diagnostics.residuals.build_underdamped_residuals(inferer, data=None)[source]¶
Build standardised innovations for a UD inferer from the symmetric acceleration residual.
Uses the symmetric ULI kinematics that the underdamped force estimator itself fits (see SFI.inference.underdamped):
\[\hat x = \tfrac13(X_{t-1}+X_t+X_{t+1}), \quad \hat v = \frac{X_{t+1}-X_{t-1}}{2\Delta t}, \quad \hat a = \frac{X_{t+1}-2X_t+X_{t-1}}{\Delta t^2},\]
and forms the residual \(r_t = \hat a - F(\hat x, \hat v)\). Its thermal noise covariance is \(\tfrac23 A/\Delta t\) (see KAPPA_UD); with measurement noise the diagonal block gains \(6\Sigma_\eta/\Delta t^4\). The thermal residual is MA(1), so only every second valid index is kept (removing the thermal lag-1); the remaining measurement-noise correlation (lag-1 block \(\Sigma_\eta/\Delta t^4\)) is removed by the banded innovations whitening, leaving a serially independent stream.
Like build_overdamped_residuals(), all data access uses TrajectoryDataset.make_batch_producer so masking and multi-dataset / multi-particle pooling are handled by the same streaming layer that powers SFI.integrate.
- Return type:
ResidualBundle
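The measurement-noise coefficients quoted above can be checked with a few lines of arithmetic. The sketch assumes localisation errors that are independent from frame to frame with covariance \(\Sigma_\eta\); under that assumption the noise part of \(\hat a\,\Delta t^2\) is the stencil (1, -2, 1) applied to the errors, and its autocovariance in units of \(\Sigma_\eta\) is the autocorrelation of the stencil weights.

```python
import numpy as np

# Weights of the three frames entering a_hat * dt**2.
stencil = np.array([1.0, -2.0, 1.0])

# Autocovariance of the stencil output in units of Sigma_eta, for lags -2..2.
acov = np.correlate(stencil, stencil, mode="full")

# Index 2 of the full correlation is lag 0.
lag0, lag1, lag2 = acov[2], acov[3], acov[4]
print(lag0, lag1, lag2)   # prints 6.0 -4.0 1.0
```

Lag 0 gives the factor 6 on the diagonal block. Keeping every second index skips the lag-1 term, so the kept series sees the original lag-2 value as its lag-1 block, which is \(+\Sigma_\eta/\Delta t^4\) after dividing by \(\Delta t^4\); nothing remains at longer lags, so the kept series is MA(1).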