Kalman Filter

Recall that the key steps in implementing a filtering model are:

  • represent the probability distribution (analytically or using sampling)

  • model the motion (or state transition): $p(x_t|u_t, x_{t-1})$

  • model the measurement: $p(z_t|x_t)$

Kalman Filter

In the standard Kalman Filter algorithm, the state transition is modeled as:

\[ x_t = A_t x_{t-1} + B_t u_t + \epsilon_t \]

where $\epsilon_t \sim \mathcal{N}(0, R_t)$. The covariance matrix $R_t$ represents the uncertainty or noise. To obtain the analytic form of $p(x_t|u_t, x_{t-1})$, we just need to observe that $x_t - A_t x_{t-1} - B_t u_t$ follows the normal distribution $\mathcal{N}(0, R_t)$.

Similarly, the measurement in the standard Kalman filter is modeled as:

\[ z_t = C_t x_t + \delta_t \]

where $\delta_t \sim \mathcal{N}(0, Q_t)$ describes the measurement noise.

It can be shown that in the standard Kalman filter, the state $x_t$ follows a normal distribution, i.e.

\[ x_t \sim \mathcal{N}(\mu_t, \Sigma_t) \]

and the algorithm is given as:

\begin{algorithm}
    \renewcommand{\thealgorithm}{}
    \begin{algorithmic}[1]
    \Function{Kalman\_Filter}{$\mu_{t-1}, \Sigma_{t-1}, u_t, z_t$}
    \State $\bar{\mu}_t = A_t \mu_{t-1} + B_t u_t$
    \State $\bar{\Sigma}_t = A_t\Sigma_{t-1}A_t^T + R_t$
    
    \State $K_t = \bar{\Sigma}_t C_t^T (C_t \bar{\Sigma}_t  C_t^T + Q_t)^{-1}$  \Comment{This is called Kalman gain}
    
    \State $\mu_t = \bar{\mu}_t + K_t(z_t - C_t \bar{\mu}_t)$
    
    \State $\Sigma_t = (I - K_tC_t) \bar{\Sigma}_t$
    
    \State $\textbf{return} \;\;\; \mu_t, \Sigma_t$
    \EndFunction
    
\end{algorithmic}
\end{algorithm}
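As a concrete illustration, the algorithm above can be sketched in Python/NumPy. This is a minimal, illustrative implementation; the function signature and variable names are my own choices, not from the text.

```python
import numpy as np

def kalman_filter(mu_prev, Sigma_prev, u, z, A, B, C, R, Q):
    """One step of the standard Kalman filter.

    mu_prev, Sigma_prev: previous belief N(mu_{t-1}, Sigma_{t-1})
    u: control u_t, z: measurement z_t
    A, B, C: state-transition, control, and measurement matrices
    R, Q: process and measurement noise covariances
    """
    # Prediction: propagate the belief through the motion model
    mu_bar = A @ mu_prev + B @ u
    Sigma_bar = A @ Sigma_prev @ A.T + R

    # Kalman gain
    K = Sigma_bar @ C.T @ np.linalg.inv(C @ Sigma_bar @ C.T + Q)

    # Correction: incorporate the measurement z_t
    mu = mu_bar + K @ (z - C @ mu_bar)
    Sigma = (np.eye(len(mu)) - K @ C) @ Sigma_bar
    return mu, Sigma
```

Each line maps one-to-one onto a line of the pseudocode: the first two statements are the prediction step, and the remaining three compute the Kalman gain and the corrected belief $(\mu_t, \Sigma_t)$.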

Extended Kalman Filter

In the EKF, the state transition and measurement models take a more general form:

\begin{align*} x_t & = g(u_t, x_{t-1}) + \epsilon_t \\ z_t & = h(x_t) + \delta_t \end{align*}

The problem is that for arbitrary functions $g$ and $h$, it may not be possible to obtain an analytic form of the distribution of the state variable $x_t$. One way to get around this problem is linearization via first-order Taylor expansion.

Before we look into the details of the Taylor expansion, let's take one step back and review what we already have. One important point is not to confuse parameters with known values.

Recall that the ultimate goal is to calculate $p(x_t|u_t, x_{t-1})$ and $p(z_t|x_t)$. Although these are conditional probabilities, with $(u_t, x_{t-1})$ and $x_t$ as the conditioning variables in the two expressions respectively, all of these quantities are parameters (or function arguments). If we forget about the probability context for a moment, it is quite obvious that $p(x_t|u_t, x_{t-1})$ is a mapping from $(x_t, u_t, x_{t-1})$ to a value.

We also recall that in the standard Kalman filter, the distribution of the states is tracked by $\mathcal{N}(\mu_t, \Sigma_t)$, so at time $t$, $\mu_1, \mu_2, \ldots, \mu_{t-1}$ are known values.

Now, we can get back to the Taylor expansion. For the motion model, we perform the linearization around $\mu_{t-1}$, because this is our estimate of the state at $t-1$ and it should be close to $x_{t-1}$. Therefore, we have

\begin{align*} g(u_t, x_{t-1}) & \approx g(u_t, \mu_{t-1}) + \frac{\partial g}{\partial x_{t-1}}(u_t, \mu_{t-1})(x_{t-1} - \mu_{t-1}) \\ & = g(u_t, \mu_{t-1}) + G_t(x_{t-1} - \mu_{t-1}) \end{align*}

where $\frac{\partial g}{\partial x_{t-1}}(u_t, \mu_{t-1})$ means the partial derivative with respect to the second variable, evaluated at $(u_t, \mu_{t-1})$.

Similarly, we can write

\[ h(x_t) \approx h(\bar{\mu}_t) + H_t(x_t - \bar{\mu}_t) \]

where $\bar{\mu}_t = g(u_t, \mu_{t-1})$ and $H_t = \frac{\partial h}{\partial x_t}(\bar{\mu}_t)$ is the Jacobian of $h$ evaluated at $\bar{\mu}_t$.
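When an analytic Jacobian is inconvenient, the linearization matrices $G_t$ and $H_t$ can be approximated by finite differences. The sketch below does this for a hypothetical unicycle-style motion model (the model itself is my own example, not from the text):

```python
import numpy as np

def numerical_jacobian(f, x, eps=1e-6):
    """Forward-difference Jacobian of f at x."""
    fx = np.asarray(f(x))
    J = np.zeros((fx.size, x.size))
    for i in range(x.size):
        dx = np.zeros_like(x)
        dx[i] = eps
        J[:, i] = (np.asarray(f(x + dx)) - fx) / eps
    return J

# Hypothetical motion model g(u_t, x_{t-1}):
# state (x, y, theta), control (v, w)
def g(u, x_prev):
    v, w = u
    x, y, theta = x_prev
    return np.array([x + v * np.cos(theta),
                     y + v * np.sin(theta),
                     theta + w])

mu_prev = np.array([0.0, 0.0, np.pi / 2])
u = np.array([1.0, 0.1])
# G_t: Jacobian of g with respect to the state, evaluated at (u_t, mu_{t-1})
G_t = numerical_jacobian(lambda x: g(u, x), mu_prev)
```

Note that, per the discussion above, the derivative is taken with respect to the second argument of $g$ (the state) while the control $u_t$ is held fixed.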

The extended Kalman filter algorithm is given as:

\begin{algorithm}
    \renewcommand{\thealgorithm}{}
    \begin{algorithmic}[1]
        \Function{EKF}{$\mu_{t-1}, \Sigma_{t-1}, u_t, z_t$}
        \State $\bar{\mu}_t = g(u_t, \mu_{t-1})$
        \State $\bar{\Sigma}_t = G_t\Sigma_{t-1}G_t^T + R_t$
        
        \State $K_t = \bar{\Sigma}_t H_t^T (H_t \bar{\Sigma}_t  H_t^T + Q_t)^{-1}$ 
        
        \State $\mu_t = \bar{\mu}_t + K_t(z_t - h(\bar{\mu}_t))$
        
        \State $\Sigma_t = (I - K_tH_t) \bar{\Sigma}_t$
        
        \State $\textbf{return} \;\;\; \mu_t, \Sigma_t$
        \EndFunction
        
    \end{algorithmic}
\end{algorithm}
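The EKF pseudocode above can likewise be sketched in Python/NumPy. As before this is a minimal illustration with my own signature: $g$ and $h$ are the nonlinear models, and `G`/`H` are callables returning their Jacobians at the given point.

```python
import numpy as np

def extended_kalman_filter(mu_prev, Sigma_prev, u, z, g, h, G, H, R, Q):
    """One EKF step. g, h: motion and measurement functions;
    G(u, x), H(x): their Jacobians evaluated at the given point."""
    # Prediction: the mean goes through the nonlinear model,
    # the covariance through the linearization G_t
    mu_bar = g(u, mu_prev)
    G_t = G(u, mu_prev)
    Sigma_bar = G_t @ Sigma_prev @ G_t.T + R

    # Correction: linearize h around the predicted mean mu_bar
    H_t = H(mu_bar)
    K = Sigma_bar @ H_t.T @ np.linalg.inv(H_t @ Sigma_bar @ H_t.T + Q)
    mu = mu_bar + K @ (z - h(mu_bar))
    Sigma = (np.eye(len(mu)) - K @ H_t) @ Sigma_bar
    return mu, Sigma
```

With linear $g$ and $h$ (and constant Jacobians), this reduces exactly to the standard Kalman filter step, which is a useful sanity check when testing an implementation.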
