where the normalization constant in the denominator is obtained as:
(4.9)
The recursive propagation of the state posterior density according to equations (4.7) and (4.8) provides the basis of the Bayesian solution. Having the posterior, different state estimators can be obtained depending on the chosen optimality criterion:
Minimum mean‐square error (MMSE) estimator (4.10): This is equivalent to minimizing the trace (sum of the diagonal elements) of the estimation‐error covariance matrix. The MMSE estimate is the conditional mean of the state (4.11), where the expectation is taken with respect to the posterior.
Risk‐sensitive (RS) estimator (4.12): Compared to the MMSE estimator, the RS estimator is less sensitive to uncertainties. In other words, it is a more robust estimator [49].
Maximum a posteriori (MAP) estimator (4.13).
Minimax estimator (4.14): The minimax estimate is the median of the posterior. The minimax technique is used to achieve optimal performance under the worst‐case condition [50].
Most probable (MP) estimator (4.15): The MP estimate is the mode of the posterior. For a uniform prior, this estimate is identical to the maximum likelihood (ML) estimate (4.16).
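To make the difference between these point estimates concrete, the following sketch computes the mean, mode, and median of a posterior represented on a grid. The bimodal example posterior and all helper names are assumptions chosen only for illustration. The RS estimator is left out because its cost function is not restated here.

```python
import math

def normalize(weights):
    """Scale non-negative weights so that they sum to one."""
    total = sum(weights)
    return [w / total for w in weights]

def posterior_mean(grid, pmf):
    """MMSE estimate: conditional mean of the state."""
    return sum(x * p for x, p in zip(grid, pmf))

def posterior_mode(grid, pmf):
    """MAP / MP estimate: grid point carrying the largest posterior mass."""
    best = max(range(len(grid)), key=lambda i: pmf[i])
    return grid[best]

def posterior_median(grid, pmf):
    """Median: first grid point where the cumulative mass reaches 0.5."""
    cumulative = 0.0
    for x, p in zip(grid, pmf):
        cumulative += p
        if cumulative >= 0.5:
            return x
    return grid[-1]

# Assumed example: a skewed, bimodal posterior built from two Gaussian bumps.
grid = [i * 0.01 for i in range(-400, 801)]
pmf = normalize([0.7 * math.exp(-0.5 * (x / 0.5) ** 2)
                 + 0.3 * math.exp(-0.5 * ((x - 3.0) / 1.0) ** 2)
                 for x in grid])

x_mmse = posterior_mean(grid, pmf)
x_map = posterior_mode(grid, pmf)
x_median = posterior_median(grid, pmf)
```

For a symmetric unimodal posterior the three values coincide. For a skewed posterior such as this one they separate, so the choice of criterion affects the reported estimate.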
In general, there may not exist simple analytic forms for the corresponding PDFs. Without an analytic form, the PDF of even a single variable is equivalent to an infinite‐dimensional vector that must be stored to perform the required computations. In such cases, obtaining the Bayesian solution is computationally intractable. In other words, except for special cases, the Bayesian solution is a conceptual solution that generally cannot be determined analytically. In many practical situations, we have to resort to some form of approximation and therefore settle for a suboptimal Bayesian solution [46]. Different approximation methods lead to different filtering algorithms.
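One of the simplest such approximations restricts the state to a finite grid, so that the integrals of the recursion become sums. The following is a minimal sketch of one prediction and update cycle under that assumption. The scalar random-walk transition model, the Gaussian likelihood, the measurement values, and all function names are illustrative assumptions, not the models of this chapter.

```python
import math

def gaussian(x, mean, std):
    """Gaussian density, used here as an assumed noise model."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def predict(grid, posterior, process_std):
    """Prediction step: sum over the previous state on the grid."""
    predicted = []
    for x_next in grid:
        total = sum(gaussian(x_next, x_prev, process_std) * p
                    for x_prev, p in zip(grid, posterior))
        predicted.append(total)
    norm = sum(predicted)  # renormalize because the grid is truncated
    return [p / norm for p in predicted]

def update(grid, predicted, measurement, meas_std):
    """Update step: Bayes' rule with the normalization constant as denominator."""
    unnormalized = [gaussian(measurement, x, meas_std) * p
                    for x, p in zip(grid, predicted)]
    evidence = sum(unnormalized)  # discrete analogue of the normalization constant
    return [u / evidence for u in unnormalized], evidence

grid = [i * 0.1 for i in range(-50, 51)]       # states from -5 to 5
posterior = [1.0 / len(grid)] * len(grid)      # flat initial belief
for y in [1.0, 1.2, 0.9]:                      # made-up measurements
    predicted = predict(grid, posterior, process_std=0.3)
    posterior, evidence = update(grid, predicted, y, meas_std=0.5)
```

The cost of the prediction step grows with the square of the number of grid points, and the number of grid points grows exponentially with the state dimension. This is one way to see why such a direct approximation is limited to low-dimensional problems.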
4.4 Fisher Information
The relevant portion of the data obtained by measurement can be interpreted as information. In this line of thinking, a summary of the amount of information with regard to the variables of interest is provided by the Fisher information matrix [51]. To be more specific, Fisher information plays two basic roles:
1 It is a measure of the ability to estimate a quantity of interest.
2 It is a measure of the state of disorder in a system or phenomenon of interest.
The first role implies that the Fisher information matrix has a close connection to the estimation‐error covariance matrix and can be used to calculate the confidence region of estimates. The second role implies that the Fisher information has a close connection to Shannon's entropy.
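The first role is usually made precise through the Cramér-Rao lower bound. As a reminder in assumed notation, with $\hat{\boldsymbol{\theta}}$ denoting an unbiased estimator of the parameter vector $\boldsymbol{\theta}$ and $\mathbf{F}(\boldsymbol{\theta})$ denoting the Fisher information matrix, the bound can be written as:

```latex
\operatorname{cov}\left(\hat{\boldsymbol{\theta}}\right) \succeq \mathbf{F}^{-1}(\boldsymbol{\theta})
```

Here $\succeq$ means that the difference between the two sides is positive semidefinite, so a larger Fisher information permits a tighter confidence region around the estimate.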
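Let us consider the PDF $p_{\boldsymbol{\theta}}(\mathbf{x})$ of the random vector $\mathbf{x}$, which is parameterized by the vector $\boldsymbol{\theta}$. The Fisher information matrix is defined as:
$$\mathbf{F}(\boldsymbol{\theta}) = \mathbb{E}\left[\left(\nabla_{\boldsymbol{\theta}} \log p_{\boldsymbol{\theta}}(\mathbf{x})\right)\left(\nabla_{\boldsymbol{\theta}} \log p_{\boldsymbol{\theta}}(\mathbf{x})\right)^{T}\right] \tag{4.17}$$
This definition is based on the outer product of the gradient of $\log p_{\boldsymbol{\theta}}(\mathbf{x})$ with itself, where the gradient is the column vector of partial derivatives with respect to the elements of $\boldsymbol{\theta}$:
$$\nabla_{\boldsymbol{\theta}} = \begin{bmatrix} \dfrac{\partial}{\partial \theta_1} & \dfrac{\partial}{\partial \theta_2} & \cdots & \dfrac{\partial}{\partial \theta_n} \end{bmatrix}^{T} \tag{4.18}$$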
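As a rough numerical check of the outer-product form, the sketch below estimates the Fisher information matrix of a scalar Gaussian with respect to its mean and standard deviation by averaging the outer product of the log-PDF gradient over random samples. The Gaussian model, the sample size, and the function names are assumptions made for illustration. For this particular model the exact matrix is diag(1/σ², 2/σ²), which gives a reference to compare against.

```python
import random

def score(x, mu, sigma):
    """Gradient of log N(x; mu, sigma^2) with respect to (mu, sigma)."""
    d_mu = (x - mu) / sigma**2
    d_sigma = -1.0 / sigma + (x - mu) ** 2 / sigma**3
    return [d_mu, d_sigma]

def fisher_monte_carlo(mu, sigma, num_samples, seed=0):
    """Average the outer product of the score over samples drawn from the PDF."""
    rng = random.Random(seed)
    fim = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(num_samples):
        g = score(rng.gauss(mu, sigma), mu, sigma)
        for i in range(2):
            for j in range(2):
                fim[i][j] += g[i] * g[j] / num_samples
    return fim

mu, sigma = 1.0, 2.0
fim = fisher_monte_carlo(mu, sigma, num_samples=200_000)
exact = [[1.0 / sigma**2, 0.0], [0.0, 2.0 / sigma**2]]  # known closed form
```

The Monte Carlo estimate `fim` should approach `exact` as the number of samples grows. The off-diagonal entries tend to zero because the mean and the standard deviation of a Gaussian are informationally orthogonal.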
From the definition of
A rearrangement of the tuples (the points on the PDF curve) may change the shape of the curve significantly, but it does not affect the value of the summation in (2.95) or the integration in (2.96), because the summation and integration can be calculated in any order. Since entropy is not affected by local changes in the PDF curve, it can be considered as a global measure of the behavior of the corresponding PDF.
On the other hand, such a rearrangement of points changes the slope, and therefore the gradient, of the PDF curve, which, in turn, changes the Fisher information significantly. Hence, the Fisher information is sensitive to local rearrangement of points and can be considered as a local measure of the behavior of the corresponding PDF.
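This contrast can be illustrated numerically. In the sketch below, the probabilities of a discretized PDF are shuffled among the grid points. Shannon entropy, a sum over the same set of values, is unchanged, while a finite-difference surrogate for Fisher information, which depends on differences between neighboring points, grows substantially. The surrogate (a discretization of the integral of $(p')^2/p$, the Fisher information with respect to a location parameter), the Gaussian-shaped curve, and the function names are assumptions chosen for illustration.

```python
import math
import random

def entropy(pmf):
    """Shannon entropy of a discrete distribution (in nats)."""
    return -sum(p * math.log(p) for p in pmf if p > 0.0)

def fisher_surrogate(pmf, dx):
    """Finite-difference version of the integral of (p')^2 / p, with p = pmf / dx."""
    total = 0.0
    for p_here, p_next in zip(pmf[:-1], pmf[1:]):
        slope = (p_next - p_here) / dx**2      # derivative of the density
        total += slope**2 / (p_here / dx) * dx
    return total

dx = 0.1
grid = [i * dx for i in range(-40, 41)]
weights = [math.exp(-0.5 * x * x) for x in grid]   # smooth, Gaussian-shaped curve
pmf = [w / sum(weights) for w in weights]

shuffled = pmf[:]
random.Random(0).shuffle(shuffled)                 # rearrange the points

h_smooth, h_shuffled = entropy(pmf), entropy(shuffled)
f_smooth, f_shuffled = fisher_surrogate(pmf, dx), fisher_surrogate(shuffled, dx)
```

The two entropy values agree up to floating-point rounding, whereas `f_shuffled` is far larger than `f_smooth`, because the shuffled curve has steep jumps between neighboring points.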
Both entropy (as a global measure of smoothness in the PDF) and Fisher information (as a local measure of smoothness in the PDF) can be used in a variational principle to infer the PDF that describes the phenomenon under consideration. However, the local measure may be preferred in general [27]. This leads