
      3.5 Bayesian Inference for Normal Distribution

      Let D = {x₁, x₂, …, xₙ} denote the observed data set. In maximum likelihood estimation, the distribution parameters are considered fixed, and the estimation error is assessed by considering the random distribution of possible data sets D. By contrast, Bayesian inference treats the observed data set D as the only data set. The uncertainty in the parameters is characterized through a probability distribution over the parameters.

      In this subsection, we focus on Bayesian inference for the normal distribution when the mean μ is unknown and the covariance matrix Σ is assumed known. Bayesian inference is based on Bayes' theorem. In general, Bayes' theorem gives the conditional probability of an event A given that an event B occurs:

$$\Pr(A \mid B) = \frac{\Pr(B \mid A)\,\Pr(A)}{\Pr(B)}.$$
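      As a quick numeric illustration, Bayes' theorem can be applied directly, with the denominator obtained from the total probability rule. All probabilities below are hypothetical values chosen for the example:

```python
# Hypothetical probabilities for illustrating Bayes' theorem.
p_A = 0.01             # Pr(A): prior probability of event A
p_B_given_A = 0.95     # Pr(B|A)
p_B_given_notA = 0.05  # Pr(B|not A)

# Total probability: Pr(B) = Pr(B|A)Pr(A) + Pr(B|not A)Pr(not A)
p_B = p_B_given_A * p_A + p_B_given_notA * (1 - p_A)

# Bayes' theorem: Pr(A|B) = Pr(B|A)Pr(A) / Pr(B)
p_A_given_B = p_B_given_A * p_A / p_B
print(p_A_given_B)  # about 0.161: B is strong evidence for A, yet A remains unlikely
```

      Note how the small prior Pr(A) = 0.01 keeps the posterior modest even though Pr(B|A) is large; this interplay between prior and likelihood is exactly what the posterior distribution formalizes below.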

      Applying Bayes’ theorem to the inference of μ, we have

$$f(\boldsymbol{\mu} \mid D) = \frac{f(D \mid \boldsymbol{\mu})\, g(\boldsymbol{\mu})}{f(D)}, \qquad (3.25)$$

      where g(μ) is the prior distribution of μ, which is the distribution before observing the data, and f(μ|D) is called the posterior distribution, which is the distribution after we have observed D. The function f(D|μ) on the right-hand side of (3.25) is the density function for the observed data set D. Viewed as a function of the unknown parameter μ, f(D|μ) is exactly the likelihood function of μ. Therefore, Bayes’ theorem can be stated in words as

$$\text{posterior} \propto \text{likelihood} \times \text{prior}. \qquad (3.26)$$

      The denominator f(D) in (3.25) is the normalization constant, which does not depend on μ and can be written as

$$f(D) = \int f(D \mid \boldsymbol{\mu})\, g(\boldsymbol{\mu})\, \mathrm{d}\boldsymbol{\mu}.$$

      A point estimate of μ can be obtained by maximizing the posterior distribution; this method is called maximum a posteriori (MAP) estimation. Because f(D) does not depend on μ, the MAP estimate of μ can be written as

$$\hat{\boldsymbol{\mu}}_{\text{MAP}} = \arg\max_{\boldsymbol{\mu}}\, f(\boldsymbol{\mu} \mid D) = \arg\max_{\boldsymbol{\mu}}\, f(D \mid \boldsymbol{\mu})\, g(\boldsymbol{\mu}). \qquad (3.27)$$

      From (3.27), it can be seen that the MAP estimate is closely related to the MLE: the two objective functions differ only by the prior g(μ). If the prior is a uniform distribution, the MAP estimate and the MLE are equivalent. Following this argument, if the prior distribution has a relatively flat shape, we expect the MAP estimate and the MLE to be similar.
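      In practice, the MAP estimate can be found numerically by maximizing the log-posterior. The following is a minimal sketch for the univariate normal case developed next, using simulated data and a hypothetical prior; the variance σ² is treated as known:

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Hypothetical settings chosen purely for illustration.
rng = np.random.default_rng(0)
sigma = 2.0                       # known standard deviation of the data
x = rng.normal(5.0, sigma, 20)    # simulated data set D

mu0, sigma0 = 0.0, 1.0            # prior mean and standard deviation: N(mu0, sigma0^2)

def neg_log_posterior(mu):
    # Negative of log f(D|mu) + log g(mu), dropping additive constants
    # that do not affect the maximizer.
    log_lik = -np.sum((x - mu) ** 2) / (2 * sigma ** 2)
    log_prior = -((mu - mu0) ** 2) / (2 * sigma0 ** 2)
    return -(log_lik + log_prior)

mu_map = minimize_scalar(neg_log_posterior).x
print(f"MAP estimate: {mu_map:.3f}, MLE (sample mean): {x.mean():.3f}")
```

      Working with the log-posterior avoids numerical underflow from multiplying many small density values and leaves the maximizer unchanged. The MAP estimate lands between the prior mean and the sample mean, as the conjugate analysis below makes explicit.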

      We first consider a simple case where the data follow a univariate normal distribution with unknown mean μ and known variance σ². The likelihood function based on a random sample of independent observations D = {x₁, x₂, …, xₙ} is given by

$$f(D \mid \mu) = \prod_{i=1}^{n} f(x_i \mid \mu) = \frac{1}{(2\pi\sigma^2)^{n/2}} \exp\left\{-\frac{1}{2\sigma^2} \sum_{i=1}^{n} (x_i - \mu)^2\right\}.$$

      Based on (3.26), we have

$$f(\mu \mid D) \propto f(D \mid \mu)\, g(\mu),$$

      where g(μ) is the probability density function of the prior distribution. We choose a normal distribution N(μ₀, σ₀²) as the prior for μ. This prior is a conjugate prior because the resulting posterior distribution is also normal. By completing the square in the exponent of the product of the likelihood and the prior, the posterior distribution can be obtained as

$$\mu \mid D \sim N(\mu_n, \sigma_n^2),$$
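      where the posterior parameters follow from collecting the quadratic and linear terms in μ. This is the standard conjugate-normal result, stated here in terms of the sample mean x̄ = (x₁ + ⋯ + xₙ)/n:

$$\frac{1}{\sigma_n^2} = \frac{1}{\sigma_0^2} + \frac{n}{\sigma^2}, \qquad \mu_n = \frac{\sigma^2}{n\sigma_0^2 + \sigma^2}\,\mu_0 + \frac{n\sigma_0^2}{n\sigma_0^2 + \sigma^2}\,\bar{x}.$$

      The posterior mean μ_n is a precision-weighted average of the prior mean μ₀ and the sample mean x̄; as n grows, the weight shifts toward x̄. Since a normal density is maximized at its mean, μ_n is also the MAP estimate, which accordingly approaches the MLE x̄ for large samples.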