Measurement noise and coarse sampling
Real trajectories are rarely clean: localization error blurs every position, and the camera frame rate fixes a sampling interval \(\Delta t\) that may be coarse compared to the dynamics. Both imperfections bias the linear estimators in ways that more data will not fix — they call for the parametric estimators instead. This page shows how to recognise the symptoms and what to run.
Recognising the symptoms
You likely have a measurement-noise or sampling problem when:
Diagnostics flag it. After a linear fit, SFI.diagnostics.assess() reports [mse_consistency] (realised error well above the predicted, sampling-noise value), residual [autocorr] flags, or a whitened residual standard deviation far from 1. On experimental data these flags usually mean noise or coarse sampling, not a wrong basis.
The diffusion estimators disagree. compute_diffusion_constant(method="noisy") (noise-aware) and method="WeakNoise" (clean-data) give clearly different values, or inf.Lambda (the estimated measurement-noise covariance) is comparable to \(2 D \Delta t\).
The error plateaus. Adding more data keeps shrinking the predicted error (inf.force_predicted_MSE) while the realised error against held-out data stalls: you have hit a bias floor.
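The second symptom can be reproduced on synthetic data. The sketch below is not SFI's implementation: it assumes a 1D, drift-free Brownian track with white Gaussian localization noise, and the function names (simulate_noisy_brownian, diffusion_estimates) are hypothetical. It compares a naive mean-squared-displacement estimate, which absorbs the noise as an extra \(\Lambda/\Delta t\), with a covariance-corrected estimate that uses the negative correlation between successive displacements to separate \(\Lambda\) from \(D\).

```python
import math
import random


def simulate_noisy_brownian(n_steps, D, dt, noise_sd, seed=0):
    """1D Brownian motion observed with Gaussian localization noise."""
    rng = random.Random(seed)
    step_sd = math.sqrt(2.0 * D * dt)
    x = 0.0
    observed = []
    for _ in range(n_steps + 1):
        observed.append(x + rng.gauss(0.0, noise_sd))  # measured = true + noise
        x += rng.gauss(0.0, step_sd)                   # true diffusive step
    return observed


def diffusion_estimates(y, dt):
    """Return (D_naive, D_corrected, Lambda_hat) from a noisy 1D track y."""
    dx = [b - a for a, b in zip(y[:-1], y[1:])]
    # <dx^2> = 2 D dt + 2 Lambda
    msd = sum(d * d for d in dx) / len(dx)
    # <dx_t dx_{t+1}> = -Lambda (the shared noise term enters with opposite signs)
    lag1 = sum(a * b for a, b in zip(dx[:-1], dx[1:])) / (len(dx) - 1)
    D_naive = msd / (2.0 * dt)
    D_corrected = (msd + 2.0 * lag1) / (2.0 * dt)
    return D_naive, D_corrected, -lag1


# Illustrative values: Lambda = noise_sd**2 = 0.01, while 2 D dt = 0.02.
y = simulate_noisy_brownian(n_steps=200_000, D=1.0, dt=0.01, noise_sd=0.1)
D_naive, D_corrected, Lambda_hat = diffusion_estimates(y, dt=0.01)
print(f"naive D = {D_naive:.2f}, corrected D = {D_corrected:.2f}, "
      f"Lambda = {Lambda_hat:.4f}")
```

With these settings \(\Lambda\) is half of \(2 D \Delta t\), so the naive estimate comes out close to twice the true value while the corrected one stays near \(D = 1\). A gap of this kind between a clean-data and a noise-aware estimate is what the symptom describes.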
Why the linear estimators acquire a bias
Two distinct mechanisms are at work, both stemming from the finite-difference construction:
Errors-in-variables (measurement noise). The linear estimators regress finite-difference velocities on basis functions evaluated at the measured positions. Localization noise \(\eta\) of covariance \(\Lambda\) enters both sides: it inflates the velocity estimate (variance \(\sim 2\Lambda/\Delta t^2\)) and perturbs the regressors. On nonlinear systems the resulting errors-in-variables bias is proportional to the noise level and does not average away with longer trajectories.
Euler secant (coarse sampling). The linear estimators approximate the drift over one interval by the straight-line secant \((\mathbf{x}_{t+\Delta t} - \mathbf{x}_t)/\Delta t\). When \(\Delta t\) is no longer small compared to the dynamical timescales, the secant mis-tracks the curved true flow and the estimate acquires an \(O(\Delta t)\) bias.
See Parametric windowed estimators — concepts for the quantitative treatment of both effects.
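Both mechanisms can be seen in a toy experiment. The sketch below assumes a 1D Ornstein-Uhlenbeck process \(\dot{x} = -k x + \sqrt{2D}\,\xi\), simulated with its exact one-step propagator, and fits \(k\) by plain least squares of the finite-difference velocity on the measured position. This is a bare regression written for illustration, not SFI's linear estimator, so the size of the biases need not match what SFI reports; the function name fit_relaxation_rate is hypothetical. A linear force is used only because the biases can then be written down in closed form.

```python
import math
import random


def fit_relaxation_rate(k, D, dt, noise_sd, n_steps, seed=0):
    """Least-squares fit of dx/dt = -k x to a sampled, noisy OU track."""
    rng = random.Random(seed)
    decay = math.exp(-k * dt)                           # exact one-step propagator
    kick_sd = math.sqrt(D / k * (1.0 - decay * decay))  # exact one-step noise
    x = rng.gauss(0.0, math.sqrt(D / k))                # start in the stationary state
    y_prev = x + rng.gauss(0.0, noise_sd)               # first measured position
    sum_yv = 0.0
    sum_yy = 0.0
    for _ in range(n_steps):
        x = decay * x + rng.gauss(0.0, kick_sd)
        y = x + rng.gauss(0.0, noise_sd)
        v = (y - y_prev) / dt                           # finite-difference velocity
        sum_yv += y_prev * v
        sum_yy += y_prev * y_prev
        y_prev = y
    return -sum_yv / sum_yy                             # regression through the origin


# Coarse sampling, no noise: the secant gives (1 - exp(-k dt)) / dt instead of k.
k_coarse = fit_relaxation_rate(k=1.0, D=1.0, dt=0.5, noise_sd=0.0, n_steps=200_000)
# Fine sampling with localization noise: the same noise sits in v and in y.
k_noisy = fit_relaxation_rate(k=1.0, D=1.0, dt=0.01, noise_sd=0.1, n_steps=200_000)
print(f"true k = 1.0, coarse fit = {k_coarse:.3f}, noisy fit = {k_noisy:.3f}")
```

In this toy model the coarse fit converges to \((1 - e^{-k\Delta t})/\Delta t \approx k\,(1 - k\Delta t/2)\), an \(O(\Delta t)\) underestimate, while the noisy fit is inflated by a term of order \(\Lambda/\Delta t\). Neither offset shrinks when n_steps is increased.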
The parametric workflow
The parametric estimators address both mechanisms natively: a single RK4 flow step per observation interval replaces the Euler secant, the measurement noise \(\Lambda\) is part of the observation model, and the skip-trick errors-in-variables instrument keeps the estimating equation consistent under noise.
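To give a feel for what replacing the Euler secant by an RK4 flow step buys, the sketch below compares the two one-step predictions on the deterministic flow \(\dot{x} = -k x\), whose exact solution is known. It is a textbook RK4 step written for illustration; how SFI organises its own integrator is not shown here, and the names euler_step, rk4_step and force are hypothetical.

```python
import math


def euler_step(f, x, dt):
    """Straight-line (secant) prediction over one interval."""
    return x + dt * f(x)


def rk4_step(f, x, dt):
    """Classical fourth-order Runge-Kutta prediction over one interval."""
    k1 = f(x)
    k2 = f(x + 0.5 * dt * k1)
    k3 = f(x + 0.5 * dt * k2)
    k4 = f(x + dt * k3)
    return x + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0


k, dt, x0 = 1.0, 0.5, 1.0        # dt is half the relaxation time: coarse sampling


def force(x):
    return -k * x


exact = x0 * math.exp(-k * dt)   # exact flow of dx/dt = -k x
euler_error = abs(euler_step(force, x0, dt) - exact)
rk4_error = abs(rk4_step(force, x0, dt) - exact)
print(f"Euler error = {euler_error:.4f}, RK4 error = {rk4_error:.6f}")
```

At this sampling interval the Euler prediction misses the exact flow by roughly 0.1, whereas the RK4 prediction is off by less than 0.001, which is why a single RK4 step per interval is enough to remove most of the secant bias in this example.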
from SFI import OverdampedLangevinInference
from SFI.bases import monomials_up_to

# coll: the trajectory collection holding your data, assumed to be built beforehand
inf = OverdampedLangevinInference(coll)
B = monomials_up_to(order=3, dim=2, rank='vector')  # vector monomial basis, order 3, 2D
inf.infer_force(B)         # profiles (D, Λ) automatically
inf.infer_diffusion()      # optional: defaults to a symmetric-matrix basis
inf.compute_force_error()
inf.print_report()
Notes:
F can be a Basis (fast Gauss–Newton path, PASTIS sparsification wired) or any differentiable PSF; see Choosing an estimator.
The noise and diffusion levels \((\mathbf{D}, \Lambda)\) are profiled automatically: closed-form moment estimators initialise them, and one conditional-NLL refinement updates them at the fitted parameters. Nothing to tune.
If you know the noise from calibration (e.g. the localization precision of your microscope), pass it explicitly — and pass the diffusion too if known, which skips profiling entirely:
inf.infer_force(B, D=D_known, Lambda=Sigma_known) # fast path
The errors-in-variables instrument is on by default (eiv="auto"); you should not need to touch it.
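Why an instrument helps can be illustrated with a generic instrumental-variable regression. The sketch below is an assumption-laden toy, not SFI's skip-trick: it uses a 1D Ornstein-Uhlenbeck track with localization noise and takes the previous measured position as the instrument, because its noise is independent of the noise entering the current velocity. The function name ols_and_lagged_iv is hypothetical.

```python
import math
import random


def ols_and_lagged_iv(k, D, dt, noise_sd, n_steps, seed=0):
    """Fit dx/dt = -k x by ordinary least squares and with a lagged instrument."""
    rng = random.Random(seed)
    decay = math.exp(-k * dt)
    kick_sd = math.sqrt(D / k * (1.0 - decay * decay))
    x = rng.gauss(0.0, math.sqrt(D / k))
    y = []
    for _ in range(n_steps + 2):
        y.append(x + rng.gauss(0.0, noise_sd))   # measured position
        x = decay * x + rng.gauss(0.0, kick_sd)  # exact OU update
    ols_num = ols_den = iv_num = iv_den = 0.0
    for t in range(1, n_steps + 1):
        v = (y[t + 1] - y[t]) / dt               # velocity shares noise with y[t]
        ols_num += y[t] * v                      # regressor and velocity noise correlate
        ols_den += y[t] * y[t]
        iv_num += y[t - 1] * v                   # instrument: noise independent of v
        iv_den += y[t - 1] * y[t]
    return -ols_num / ols_den, -iv_num / iv_den


k_ols, k_iv = ols_and_lagged_iv(k=1.0, D=1.0, dt=0.01, noise_sd=0.1, n_steps=200_000)
print(f"true k = 1.0, least squares = {k_ols:.2f}, lagged instrument = {k_iv:.2f}")
```

In this toy model the least-squares estimate is roughly doubled by the noise, while the instrumented estimate stays close to the true value at the price of a somewhat larger statistical scatter.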
Runtime expectations. The parametric fit is iterative: expect minutes where the linear estimators take seconds on large problems, though on moderate data the gap is small (an underdamped solve at \(T \approx 10^4\), \(n = 8\) runs in ~20 s on a laptop CPU core, vs. ~10 s for the linear estimator — see Parametric windowed estimators — algorithm and parameters for scaling).
Cross-checking. Running both estimator families on the same basis is itself a diagnostic: if they agree, noise and sampling effects are under control and you can keep the cheaper linear workflow; if they disagree, trust the parametric fit — the discrepancy measures the linear bias.
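One simple way to quantify agreement is the relative distance between the two fitted coefficient vectors. The helper below is a hypothetical sketch: how the coefficients are read out of each SFI fit is not covered here, so the inputs are plain lists of numbers given in the same basis order, and the example values are made up.

```python
import math


def relative_discrepancy(coeffs_linear, coeffs_parametric):
    """Euclidean distance between two fits, relative to the parametric one."""
    if len(coeffs_linear) != len(coeffs_parametric):
        raise ValueError("both fits must use the same basis")
    diff = math.sqrt(sum((a - b) ** 2
                         for a, b in zip(coeffs_linear, coeffs_parametric)))
    norm = math.sqrt(sum(b * b for b in coeffs_parametric))
    return diff / norm


# Made-up coefficients for illustration only.
c_linear = [-1.20, 0.05, 0.31]
c_parametric = [-1.00, 0.02, 0.25]
print(f"relative discrepancy = {relative_discrepancy(c_linear, c_parametric):.2f}")
```

A value well above the relative error predicted for the fit suggests that the linear estimate carries a bias, whereas a value within that error suggests the cheaper linear workflow is adequate.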
Worked examples and validation
Experimental-data workflow template — an end-to-end experimental pipeline where the diagnostics flag localization noise and the parametric estimator removes the bias.
See also
Choosing an estimator — the regime table.
Parametric windowed estimators — concepts — the observation model and estimator theory.
Parametric windowed estimators — algorithm and parameters — algorithm details and the full parameter reference.
Underdamped systems — noise is doubly harmful for inertial systems; the underdamped page covers the specifics.