$$L(\boldsymbol{\mu}, \boldsymbol{\Sigma}; \mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n) = \prod_{i=1}^{n} f(\mathbf{x}_i; \boldsymbol{\mu}, \boldsymbol{\Sigma}) = \prod_{i=1}^{n} \frac{1}{(2\pi)^{p/2} |\boldsymbol{\Sigma}|^{1/2}} \, e^{-(\mathbf{x}_i - \boldsymbol{\mu})^T \boldsymbol{\Sigma}^{-1} (\mathbf{x}_i - \boldsymbol{\mu})/2} = \frac{1}{(2\pi)^{np/2} |\boldsymbol{\Sigma}|^{n/2}} \, e^{-\sum_{i=1}^{n} (\mathbf{x}_i - \boldsymbol{\mu})^T \boldsymbol{\Sigma}^{-1} (\mathbf{x}_i - \boldsymbol{\mu})/2}. \quad (3.13)$$
It is often easier to find the MLE by minimizing the negative log likelihood function, which is given by
$$-\ln L(\boldsymbol{\mu}, \boldsymbol{\Sigma}) = \frac{np}{2} \ln(2\pi) + \frac{n}{2} \ln|\boldsymbol{\Sigma}| + \frac{1}{2} \sum_{i=1}^{n} (\mathbf{x}_i - \boldsymbol{\mu})^T \boldsymbol{\Sigma}^{-1} (\mathbf{x}_i - \boldsymbol{\mu}). \quad (3.14)$$
Taking the derivative of (3.14) with respect to μ, we have
$$\frac{\partial \left(-\ln L\right)}{\partial \boldsymbol{\mu}} = -\boldsymbol{\Sigma}^{-1} \sum_{i=1}^{n} (\mathbf{x}_i - \boldsymbol{\mu}). \quad (3.15)$$
Setting the partial derivative in (3.15) to zero, the MLE of μ is obtained as
$$\hat{\boldsymbol{\mu}} = \frac{1}{n} \sum_{i=1}^{n} \mathbf{x}_i = \bar{\mathbf{x}}, \quad (3.16)$$
which is the sample mean vector of the data set x1, x2,…, xn. The derivation of the MLE of Σ is more involved and beyond the scope of this book. The result is given by
$$\hat{\boldsymbol{\Sigma}} = \frac{1}{n} \sum_{i=1}^{n} (\mathbf{x}_i - \bar{\mathbf{x}})(\mathbf{x}_i - \bar{\mathbf{x}})^T = \frac{n-1}{n} \mathbf{S}, \quad (3.17)$$
where S is the sample covariance matrix as given in (2.6). Since the MLE uses n instead of n – 1 in the denominator, it is a biased estimator. So the sample covariance matrix S is more commonly used to estimate Σ, especially when n is small.
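As a quick numerical check of (3.16) and (3.17), here is a minimal Python sketch (not from the book; it assumes NumPy and uses a made-up data set) that computes the MLEs and verifies that the MLE of Σ equals (n − 1)/n times the sample covariance matrix S.

```python
import numpy as np

# Hypothetical data set: n = 10 observations of a p = 3 dimensional random vector.
rng = np.random.default_rng(0)
X = rng.normal(size=(10, 3))
n, p = X.shape

# MLE of the mean vector, (3.16): the sample mean vector.
mu_hat = X.mean(axis=0)

# MLE of the covariance matrix, (3.17): note the n in the denominator.
centered = X - mu_hat
Sigma_hat = centered.T @ centered / n

# Sample covariance matrix S, which uses n - 1 in the denominator.
S = np.cov(X, rowvar=False)

# The MLE equals (n - 1) / n times S, so it is biased (slightly too small).
print(np.allclose(Sigma_hat, (n - 1) / n * S))   # True
```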
One useful property of the MLE is the invariance property. In general, let $\hat{\boldsymbol{\theta}}$ denote the MLE of the parameter vector $\boldsymbol{\theta}$. Then the MLE of a function of $\boldsymbol{\theta}$, denoted by $h(\boldsymbol{\theta})$, is given by $h(\hat{\boldsymbol{\theta}})$. This result makes it very convenient to find the MLE of any function of a parameter once the MLE of the parameter itself is available. For example, based on (3.17), it is easy to see that the MLE of the variance of $X_j$, the $j$th element of $\mathbf{X}$, is the $j$th diagonal element of $\hat{\boldsymbol{\Sigma}}$, given by
$$\hat{\sigma}_{jj} = \frac{1}{n} \sum_{i=1}^{n} (x_{ij} - \bar{x}_j)^2.$$
Then, based on the invariance property, the MLE of the standard deviation $\sqrt{\sigma_{jj}}$ is $\sqrt{\hat{\sigma}_{jj}}$.
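The invariance property in this example can be checked with a short sketch along the same lines (again a hypothetical illustration, assuming NumPy):

```python
import numpy as np

# Hypothetical data set, same setup as the previous sketch.
rng = np.random.default_rng(0)
X = rng.normal(size=(10, 3))

# MLE of Sigma: ddof=0 makes np.cov divide by n instead of n - 1.
Sigma_hat = np.cov(X, rowvar=False, ddof=0)

# MLE of the variance of X_j is the j-th diagonal element of Sigma_hat.
j = 0
sigma_jj_hat = Sigma_hat[j, j]

# By the invariance property, the MLE of the standard deviation of X_j
# is obtained by applying the square root to the MLE of the variance.
sd_j_hat = np.sqrt(sigma_jj_hat)
print(sd_j_hat)
```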
The MLE has good asymptotic properties and usually performs well for data sets with large sample sizes. For example, under mild regularity conditions, the MLE satisfies the property of consistency, which guarantees that the estimator converges to the true value of the parameter as the sample size goes to infinity. In addition, under certain regularity conditions, the MLE is asymptotically normal and efficient: as the sample size goes to infinity, the distribution of the MLE converges to a normal distribution with variance equal to the optimal asymptotic variance. The details of the regularity conditions are beyond the scope of this book, but these conditions are quite general and are often satisfied in common circumstances.
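Consistency can be seen in a small simulation (a hypothetical sketch, not part of the text): as n grows, the MLE of the mean vector gets closer to the true mean.

```python
import numpy as np

# Hypothetical simulation illustrating consistency of the MLE of the mean:
# the estimation error shrinks as the sample size n grows.
rng = np.random.default_rng(1)
true_mu = np.array([1.0, -2.0])
cov = np.eye(2)

for n in (10, 100, 1_000, 10_000, 100_000):
    X = rng.multivariate_normal(true_mu, cov, size=n)
    mu_hat = X.mean(axis=0)
    print(n, np.linalg.norm(mu_hat - true_mu))   # error decreases with n
```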
3.4 Hypothesis Testing on Mean Vectors
In this section, we study how to determine if the population mean μ is equal to a specific value μ0 when the observations follow a normal distribution. We start by reviewing the hypothesis testing results for univariate data. Suppose X1, X2,…, Xn are a random sample of independent univariate observations following the normal distribution N(μ, σ2). The test on μ is formulated as
$$H_0: \mu = \mu_0 \quad \text{versus} \quad H_1: \mu \neq \mu_0,$$
where H0 is the null hypothesis and H1 is the (two-sided) alternative hypothesis. For this test, we use the following test statistic: