Parametric windowed estimators — concepts

This page presents the mathematical foundations of the parametric windowed estimators used by infer_force() and infer_force().

Throughout we use the notation established in the Physics Reference.

Motivation

The linear estimators (infer_force_linear()) solve a linear regression:

\[M_a = \sum_b G_{ab}\, c_b\]

where the force moments \(M_a\) and Gram matrix \(G_{ab}\) are computed at each time step from finite-difference velocities and user-chosen basis functions. This estimator is fast, non-iterative, and unbiased at leading order in \(\Delta t\). However:

  1. Finite-:math:`Delta t` bias: the linear regression is exact only in the \(\Delta t \to 0\) limit. At finite sampling interval the force moments carry systematic \(O(\Delta t)\) errors, especially in regions of high curvature.

  2. Measurement noise: when positions are corrupted by localization error \(\Lambda\), the finite-difference velocity \(v = \Delta x / \Delta t\) is dominated by \(\Lambda / \Delta t\) rather than by the true velocity.

  3. Nonlinear models: the regression is restricted to forces that are linear in their parameters. Parametric models such as neural networks or rational functions cannot be fitted directly.

The parametric windowed estimator addresses all three limitations. It replaces the finite-difference velocity by an exact numerical flow, accounts for measurement noise through a joint Gaussian likelihood, and iterates via Gauss–Newton to handle nonlinear parameter dependence.

The observation model

We assume the observed trajectory \(y_0, y_1, \dots, y_{T-1}\) is related to the true latent state \(x_t\) by

(1)\[y_t = x_t + \eta_t, \qquad \eta_t \sim \mathcal{N}(0, \Lambda)\]

where \(\Lambda\) is the measurement-noise covariance (localization error). The latent state evolves according to the Langevin SDE.

Overdamped (position only):

\[\mathrm{d}x = F(x;\,\theta)\,\mathrm{d}t + \sqrt{2\,D(x)}\;\mathrm{d}W_t .\]

Underdamped (position + velocity):

\[\dot{x} = v, \qquad \mathrm{d}v = F(x, v;\,\theta)\,\mathrm{d}t + \sqrt{2\,D}\;\mathrm{d}W_t .\]

Deterministic-flow linearisation

Instead of the Euler–Maruyama secant, we linearise each observation interval around the deterministic flow of the drift. Given the current state \(x_t\), we integrate the noise-free ODE \(\dot{z} = F(z;\theta)\), \(z(0) = x_t\), over a full step \(\Delta t\) — solved with \(n_{\text{sub}}\) RK4 micro-steps (default \(n_{\text{sub}} = 1\), a single 4th-order step per observation interval) — and model the stochastic part as a single Gaussian increment accumulated over that interval. There is no operator splitting: one deterministic flow step and one noise increment per observation.

The resulting flow map \(\Phi(x;\,\theta)\) is the RK4-integrated displacement:

(2)\[\Phi(x;\,\theta) = z(\Delta t) - x, \qquad \dot{z} = F(z;\,\theta), \quad z(0) = x.\]

Under this linearisation the transition distribution over one step is

(3)\[x_{t+1} \;\big|\; x_t \;\sim\; \mathcal{N}\!\Big( x_t + \Phi(x_t;\,\theta), \; V_t \Big)\]

where the process-noise covariance \(V_t\) depends on the diffusion and on the flow Jacobian \(J_t = \partial \Phi / \partial x \big|_{x_t} + I\):

(4)\[V_t = \Delta t \bigl( J_t \, D(x_t) \, J_t^{\!\top} + D(x_t + \Phi_t) \bigr) .\]

In the leading-order (LO) approximation used by the Gauss–Newton solver, the log-determinant contribution of \(V_t\) is treated as \(\theta\)-independent. This is justified because the \(\theta\)-dependent part of \(\log |V_t|\) enters only at \(O(\Delta t^3)\).

Residuals and the joint Gaussian

Combining (1) and (3) yields residuals

(5)\[r_t = y_{t+1} - y_t - \Phi(x_t;\,\theta)\]

that follow a block-tridiagonal multivariate Gaussian. The diagonal blocks of the joint covariance are

\[A_t = V_t + J_t \, \Lambda \, J_t^{\!\top} + \Lambda\]

and the off-diagonal blocks (from measurement noise coupling consecutive residuals) are

\[C_t = -\Lambda \, J_t^{\!\top} .\]

The pseudo-NLL (negative log-likelihood up to constants) is:

(6)\[-\log p(y \mid \theta) = \tfrac{1}{2} \bigl[ \log\det \Sigma + r^{\!\top} \Sigma^{-1} r \bigr]\]

where \(\Sigma\) is the block-tridiagonal matrix with blocks \((A_t, C_t)\). Both the log-determinant and the quadratic form can be evaluated in \(O(T d^3)\) time via a block-Cholesky factorisation in a single jax.lax.scan pass.

Gauss–Newton iteration

The force \(F(x;\,\theta) = \sum_\alpha \theta_\alpha \, \phi_\alpha(x)\) (or any differentiable parametric model) enters nonlinearly through the flow map \(\Phi\). Rather than optimising the NLL directly (which would require second-order derivatives of the flow), the parametric estimator uses a Gauss–Newton (GN) linearisation.

Define the test functions (parameter sensitivities):

\[\psi_\alpha(x_t) = \frac{\partial \Phi(x_t;\,\theta)}{\partial \theta_\alpha} ,\]

computed exactly through the flow with jax.jacfwd.

Errors-in-variables: the skip-trick instrument. Under measurement noise the regressor \(\psi(y_t)\) is evaluated at a noisy point, which correlates it with the residual and biases the symmetric normal equations (the classical errors-in-variables effect). The estimator therefore replaces the left factor of the normal equations by an η-clean instrument: the sensitivity evaluated at a lagged point whose noise is independent of the residual’s,

\[\psi_{\rm inst} = \frac{\partial \Phi}{\partial \theta} \Big|_{\text{stop-grad clean point}} ,\]

giving an asymmetric estimating equation whose root remains consistent under noise (eiv=True, the default on the Gauss–Newton path; eiv=False recovers the plain MLE).

The whitened Gram matrix and whitened right-hand side are

(7)\[\begin{split}G_{\alpha\beta} &= \sum_t \psi_\alpha^{\!\top} \, \Sigma_{\rm loc}^{-1} \, \psi_\beta, \\ f_\alpha &= \sum_t \psi_\alpha^{\!\top} \, \Sigma_{\rm loc}^{-1} \, \rho_t,\end{split}\]

where \(\rho_t = r_t + \psi \cdot \theta\) is the augmented residual and \(\Sigma_{\rm loc}\) is the local-precision covariance window (see Parametric windowed estimators — algorithm and parameters for details).

One GN step updates

\[\theta^{(k+1)} = G^{-1} f\]

(not an increment — \(f\) already contains the shifted residual). The iteration converges in 3–10 steps for typical problems.

Regularisation. When \(\operatorname{cond}(G) > \texttt{gram\_cond\_max}\) the system is Tikhonov-damped: \(G_{\rm reg} = G + \lambda I\) with \(\lambda = \sigma_1(G) / \texttt{gram\_cond\_max}\).

Comparison with previous SFI estimators

Property

infer_force_linear()

infer_force() (parametric)

Discretisation

Euler (\(O(\Delta t)\))

RK4 parametric (\(O(\Delta t^2)\))

Measurement noise

Heuristic subtraction

Joint likelihood (6)

Nonlinear models

No (linear in \(\theta\))

Yes (Gauss–Newton)

(D, Λ) estimation

Pre-computed constant

Moment init + profiled conditional NLL

Cost per iteration

\(O(T n^2 d)\)

\(O(T n d^3)\) (windowed)

Iterations needed

1 (exact solve)

2–5 GN steps

The parametric windowed approach is more accurate at larger \(\Delta t\), handles measurement noise natively, and can fit nonlinear parametric models. The cost is a modest iteration loop. For small \(\Delta t\) and clean data, the linear regression is often sufficient.

Overdamped vs. underdamped specifics

Overdamped parametric

The overdamped estimator uses a tridiagonal covariance structure (bandwidth 1): each residual couples only to its immediate neighbours through the off-diagonal block \(C_t = -\Lambda J_t^\top\).

The residuals are:

\[r_t = y_{t+1} - y_t - \Phi(y_t;\,\theta) .\]

The solve proceeds in three steps:

  1. Moment init: \((\mathbf{D}, \Lambda)\) from closed-form noise-robust moment estimators (Vestergaard diffusion + lag-1 increment anticorrelation) — one trajectory pass, no optimiser.

  2. Gauss–Newton loop: iterated whitened least-squares at the fixed noise levels.

  3. Reprofile: one windowed conditional-NLL refinement of \((\mathbf{D}, \Lambda)\) at the fitted \(\theta\), followed by a warm re-solve (the iteratively-reweighted fixed point for linear-in-\(\theta\) models).

Underdamped parametric (ULI)

The underdamped estimator works in \((x, v)\) phase space — only positions are observed, so the velocity at each window is resolved by shooting: one Newton step finds the initial velocity whose flow hits the next observed position. The 3-point residuals have a pentadiagonal covariance structure (bandwidth 2), and the process noise enters the position at \(O(\Delta t^3)\).

The skip-trick instrument is built from the clean lagged position pair and is particularly effective here: the shooting velocity divides the position noise by \(\Delta t\), so the velocity errors-in-variables bias of the plain MLE is strong.

Note

A single Euler step cannot support the underdamped skip-trick: the Euler position update \(x + v\,\Delta t\) does not depend on the force, so the instrument is identically zero. The solver detects integrator="euler", n_substeps=1 with eiv enabled, warns, and falls back to the plain MLE. The RK4 default does not have this degeneracy.

State-dependent diffusion inference

After force inference, infer_diffusion() fits a state-dependent diffusion tensor \(D(x;\,\theta_D)\) by optimising the windowed local-precision NLL with the force held fixed.

Two parametrisations are available:

  • a rank-2 Basis: \(D(x) = \sum_j \theta_j \, d_j(x)\) (linear in \(\theta_D\); the default symmetric_matrix_basis(d) spans all constant symmetric matrices);

  • a PSF: \(D(x) = \text{PSF}(x;\,\theta_D)\) directly — in the underdamped case the model may be velocity-dependent, \(D(x, v)\), evaluated at the shooting velocity.

The optimisation uses L-BFGS with exact JAX gradients through the integrate framework, initialised by projecting the moment-estimated constant \(\hat{\mathbf{D}}\) onto the model (a zero start is a singular covariance and cannot be escaped).

Key references

  1. Frishman, A. & Ronceray, P., Learning force fields from stochastic trajectories, Phys. Rev. X 10, 021009 (2020). — The original SFI linear regression estimator.

  2. Vestergaard, C. L., Blainey, P. C. & Flyvbjerg, H., Optimal estimation of diffusion coefficients from single-particle trajectories, Phys. Rev. E 89, 022726 (2014). — Measurement-noise-aware diffusion estimation.

  3. Brückner, D. B., Ronceray, P. & Broedersz, C. P., Inferring the dynamics of underdamped stochastic systems, Phys. Rev. Lett. 125, 058103 (2020). — Underdamped Langevin Inference (ULI), extended here to the parametric windowed estimator.

  4. Amri, S. et al., Inferring geometrical dynamics of cell nucleus translocation, Phys. Rev. Res. 6, 043030 (2024). — Trapezoidal integration scheme.

See also

Parametric windowed estimators — algorithm and parameters — algorithmic implementation details (parameters, stages, data flow).

Running inference — how to use the inference methods in practice.