Recall that the key steps in implementing a model are:
represent the probability distribution (analytically or using sampling)
model the motion (or state transition): $p(x_t \mid u_t, x_{t-1})$
model the measurement: $p(z_t \mid x_t)$
Kalman Filter
In the standard Kalman Filter algorithm, the state transition is modeled as:
$$x_t = A_t x_{t-1} + B_t u_t + \epsilon_t$$
where $\epsilon_t \sim \mathcal{N}(0, R_t)$. The covariance matrix $R_t$ represents the uncertainty, or noise, of the motion. To obtain the analytic form of $p(x_t \mid u_t, x_{t-1})$, we just need to observe that $x_t - A_t x_{t-1} - B_t u_t$ follows the normal distribution $\mathcal{N}(0, R_t)$.
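As a concrete illustration of how this linear-Gaussian motion model propagates the belief, here is a minimal NumPy sketch of the prediction step; the constant-velocity matrices and noise level below are assumptions made only for this example, not values from the text.

```python
import numpy as np

def kf_predict(mu_prev, Sigma_prev, u, A, B, R):
    """Prediction step for x_t = A_t x_{t-1} + B_t u_t + eps_t, eps_t ~ N(0, R_t)."""
    mu_bar = A @ mu_prev + B @ u            # propagate the mean through the linear model
    Sigma_bar = A @ Sigma_prev @ A.T + R    # propagate the covariance and add the motion noise
    return mu_bar, Sigma_bar

# Illustrative constant-velocity example: state = [position, velocity], control = acceleration.
dt = 0.1
A = np.array([[1.0, dt],
              [0.0, 1.0]])
B = np.array([[0.5 * dt**2],
              [dt]])
R = 0.01 * np.eye(2)

mu, Sigma = kf_predict(np.zeros(2), np.eye(2), np.array([1.0]), A, B, R)
```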
Similarly, the measurement in the standard Kalman filter is modeled as:
$$z_t = C_t x_t + \delta_t$$
where $\delta_t \sim \mathcal{N}(0, Q_t)$ describes the measurement noise.
It can be shown that in the standard Kalman filter, the state $x_t$ follows a normal distribution, i.e. $x_t \sim \mathcal{N}(\mu_t, \Sigma_t)$.
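A minimal sketch of the corresponding correction (measurement update) step, which keeps the belief in the $\mathcal{N}(\mu_t, \Sigma_t)$ form; the standard Kalman-gain formulation is assumed here.

```python
import numpy as np

def kf_update(mu_bar, Sigma_bar, z, C, Q):
    """Correction step for z_t = C_t x_t + delta_t, delta_t ~ N(0, Q_t)."""
    S = C @ Sigma_bar @ C.T + Q                         # innovation covariance
    K = Sigma_bar @ C.T @ np.linalg.inv(S)              # Kalman gain
    mu = mu_bar + K @ (z - C @ mu_bar)                  # correct the mean with the innovation
    Sigma = (np.eye(len(mu_bar)) - K @ C) @ Sigma_bar   # reduce the covariance
    return mu, Sigma
```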
Extended Kalman Filter
In the Extended Kalman Filter (EKF), the state transition and measurement models take a more general form:
$$x_t = g(u_t, x_{t-1}) + \epsilon_t$$
$$z_t = h(x_t) + \delta_t$$
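To make the general form concrete, here is a hypothetical pair of nonlinear models (a unicycle-style motion model and a range-bearing measurement of a landmark at the origin); these particular functions are assumptions chosen for illustration, not part of the original derivation.

```python
import numpy as np

DT = 0.1  # assumed time step for this example

def g(u, x_prev):
    """Nonlinear motion model: state (x, y, theta), control (v, omega)."""
    x, y, theta = x_prev
    v, omega = u
    return np.array([x + v * DT * np.cos(theta),
                     y + v * DT * np.sin(theta),
                     theta + omega * DT])

def h(x):
    """Nonlinear measurement model: range and bearing to a landmark assumed at the origin."""
    px, py, theta = x
    return np.array([np.hypot(px, py),
                     np.arctan2(-py, -px) - theta])
```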
The problem is that for arbitrary functions $g$ and $h$, it may not be possible to obtain an analytic form of the distribution of the state variable $x_t$. One way to get around this problem is linearization using a first-order Taylor expansion.
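As a quick reminder (added here, not part of the original text), the first-order Taylor expansion of a function $f$ around a point $a$ is:
$$f(x) \approx f(a) + f'(a)\,(x - a)$$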
Before we look into the details of the Taylor expansion, let's take one step back and review what we already have. One important thing is not to confuse parameters with known values.
Recall that the ultimate goal is to calculate $p(x_t \mid u_t, x_{t-1})$ and $p(z_t \mid x_t)$. Although they are conditional probabilities, and $(u_t, x_{t-1})$ and $x_t$ are the conditioning variables in the two expressions respectively, all of them are parameters (or function arguments). If we forget about the probability context for a moment, it's quite obvious that $p(x_t \mid u_t, x_{t-1})$ is a mapping from $(x_t, u_t, x_{t-1})$ to a value.
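To emphasize this point, here is a small sketch in which the conditional density is just an ordinary function mapping $(x_t, u_t, x_{t-1})$ to a number; the linear-Gaussian motion model from the Kalman filter section is assumed.

```python
import numpy as np

def motion_density(x_t, u_t, x_prev, A, B, R):
    """Evaluate p(x_t | u_t, x_{t-1}) for the linear-Gaussian motion model."""
    diff = x_t - (A @ x_prev + B @ u_t)                  # deviation from the predicted mean
    d = len(x_t)
    norm = 1.0 / np.sqrt((2 * np.pi) ** d * np.linalg.det(R))
    return norm * np.exp(-0.5 * diff @ np.linalg.solve(R, diff))
```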
We also recall that in the standard Kalman filter, the distribution of the state is tracked by $\mathcal{N}(\mu_t, \Sigma_t)$, so at time $t$, $\mu_1, \mu_2, \ldots, \mu_{t-1}$ are known values.
Now we can get back to the Taylor expansion. For the motion model, we perform the linearization around $\mu_{t-1}$, because this is our estimate of the state at time $t-1$ and it should be close to $x_{t-1}$. Therefore, we have