Data Science in Theory and Practice. Maria Cristina Mariani.

Author: Maria Cristina Mariani
Publisher: John Wiley & Sons Limited
Genre: Mathematics
ISBN: 9781119674733
[…] $(Y, X_{r+1}, \ldots, X_k)$, or equivalently $(n - X_{r+1} - \cdots - X_k, X_{r+1}, \ldots, X_k)$, will have a multinomial distribution with associated probabilities $(p_Y, p_{r+1}, \ldots, p_k) = (p_1 + \cdots + p_r, p_{r+1}, \ldots, p_k)$.
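      The aggregation property can be checked numerically. The following is a minimal sketch, assuming illustrative values ($k = 4$, $r = 2$, and the probabilities shown) that do not come from the text; the helper names `multinomial_pmf` and `aggregated_pmf` are hypothetical. It sums the joint multinomial probabilities over all values of the first $r$ components with a fixed total and compares the result with the multinomial probability of the aggregated vector.

```python
from itertools import product
from math import factorial, isclose, prod


def multinomial_pmf(counts, probs):
    """P(X_1 = counts[0], ..., X_k = counts[-1]) with n = sum(counts) trials."""
    coef = factorial(sum(counts))
    for c in counts:
        coef //= factorial(c)  # exact: each partial quotient is an integer
    return coef * prod(p ** c for p, c in zip(probs, counts))


def aggregated_pmf(y, tail_counts, probs, r):
    """P(X_1 + ... + X_r = y, X_{r+1} = tail_counts[0], ...) by brute-force summation."""
    total = 0.0
    for head in product(range(y + 1), repeat=r):
        if sum(head) == y:
            total += multinomial_pmf(list(head) + list(tail_counts), probs)
    return total


# Illustrative values (assumptions, not taken from the text).
probs = [0.1, 0.2, 0.3, 0.4]   # p_1, ..., p_4
r = 2                          # aggregate the first two categories into Y
y, tail_counts = 3, [2, 1]     # Y = 3, X_3 = 2, X_4 = 1, so n = 6

lhs = aggregated_pmf(y, tail_counts, probs, r)
rhs = multinomial_pmf([y] + tail_counts, [sum(probs[:r])] + probs[r:])
print(lhs, rhs, isclose(lhs, rhs))
```

      The brute-force sum is only practical for small $n$ and $r$; it is used here because it mirrors the definition directly.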

      Next consider the conditional distribution of the first $r$ components given the last $k - r$ components. That is, the distribution of

$$(X_1, \ldots, X_r) \mid X_{r+1} = n_{r+1}, \ldots, X_k = n_k.$$
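      This excerpt does not show the resulting distribution. A standard fact about the multinomial is that the conditional distribution is again multinomial, with $n - n_{r+1} - \cdots - n_k$ trials and renormalized probabilities $p_i / (p_1 + \cdots + p_r)$. The sketch below checks that fact numerically under assumed illustrative values; the helper names are hypothetical, and the marginal of the last $k - r$ components is computed from the aggregated vector described above.

```python
from math import factorial, isclose, prod


def multinomial_pmf(counts, probs):
    """P(X_1 = counts[0], ..., X_k = counts[-1]) with n = sum(counts) trials."""
    coef = factorial(sum(counts))
    for c in counts:
        coef //= factorial(c)  # exact: each partial quotient is an integer
    return coef * prod(p ** c for p, c in zip(probs, counts))


def conditional_pmf(head, tail, probs):
    """P(X_1..X_r = head | X_{r+1}..X_k = tail), computed as joint / marginal."""
    r = len(head)
    joint = multinomial_pmf(list(head) + list(tail), probs)
    # The event {tail} equals {Y = sum(head), tail}, where Y = X_1 + ... + X_r,
    # and (Y, X_{r+1}, ..., X_k) is multinomial with probabilities (p_Y, p_{r+1}, ...).
    p_y = sum(probs[:r])
    marginal = multinomial_pmf([sum(head)] + list(tail), [p_y] + list(probs[r:]))
    return joint / marginal


# Illustrative values (assumptions, not taken from the text).
probs = [0.1, 0.2, 0.3, 0.4]
head, tail = [1, 2], [2, 1]    # r = 2, k = 4, n = 6

p_y = sum(probs[: len(head)])
direct = conditional_pmf(head, tail, probs)
renormalized = multinomial_pmf(head, [p / p_y for p in probs[: len(head)]])
print(direct, renormalized, isclose(direct, renormalized))
```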

      2.3.3 Multivariate Normal Distribution

      A vector $\mathbf{X}$ is said to have a $k$‐dimensional multivariate normal distribution (denoted $MVN_k(\mu, \Sigma)$, where $N_k$ stands for the $k$‐dimensional multivariate normal distribution) with mean vector $\mu = (\mu_1, \ldots, \mu_k)$ and covariance matrix $\Sigma = (\sigma_{ij})_{i,j \in \{1, \ldots, k\}}$ if its density can be written as

$$f(\mathbf{x}) = \frac{1}{(2\pi)^{k/2} \det(\Sigma)^{1/2}} \, e^{-\frac{1}{2} (\mathbf{x} - \mu)^{T} \Sigma^{-1} (\mathbf{x} - \mu)},$$

      where we used the usual notations for the determinant, transpose, and inverse of a matrix. The vector of means $\mu$ may have any elements in $\mathbb{R}$, but, just as in the one‐dimensional case, the standard deviation has to be positive. In the multivariate case, the covariance matrix $\Sigma$ has to be symmetric and positive definite.
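      The density formula can be evaluated directly. The following is a minimal sketch in plain Python, not an implementation from the book: it uses a Cholesky factorization $\Sigma = L L^{T}$ (a technique swapped in here for convenience) so that $\det(\Sigma)^{1/2}$ is the product of the diagonal of $L$ and the quadratic form is the squared length of the solution $z$ of $L z = \mathbf{x} - \mu$. The factorization also fails exactly when the matrix is not symmetric positive definite, which matches the requirement above. The function names and the numerical values are assumptions for illustration.

```python
from math import exp, isclose, pi, sqrt


def cholesky(sigma):
    """Return lower-triangular L with L L^T = sigma.

    Raises ValueError if sigma is not symmetric positive definite.
    """
    k = len(sigma)
    L = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1):
            if not isclose(sigma[i][j], sigma[j][i]):
                raise ValueError("covariance matrix must be symmetric")
            s = sigma[i][j] - sum(L[i][m] * L[j][m] for m in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("covariance matrix must be positive definite")
                L[i][j] = sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return L


def mvn_pdf(x, mu, sigma):
    """Density of MVN_k(mu, sigma) at the point x."""
    k = len(mu)
    L = cholesky(sigma)
    # Forward substitution: solve L z = x - mu, so z.z = (x-mu)^T sigma^{-1} (x-mu).
    z = []
    for i in range(k):
        z.append((x[i] - mu[i] - sum(L[i][m] * z[m] for m in range(i))) / L[i][i])
    quad = sum(v * v for v in z)
    # det(sigma)^{1/2} is the product of the diagonal entries of L.
    det_sqrt = 1.0
    for i in range(k):
        det_sqrt *= L[i][i]
    return exp(-0.5 * quad) / ((2 * pi) ** (k / 2) * det_sqrt)


# Illustrative values (assumptions, not taken from the text).
mu = [0.0, 1.0]
sigma = [[2.0, 1.0], [1.0, 2.0]]
print(mvn_pdf([0.5, 0.5], mu, sigma))
```

      For $k = 1$ the function reduces to the familiar one‐dimensional normal density with variance $\sigma_{11}$.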

      The multivariate normal defined thus has many nice properties. The basic one is that the one‐dimensional distributions are all normal, that is, $X_i \sim N(\mu_i, \sigma_{ii})$ and $\mathrm{Cov}(X_i, X_j) = \sigma_{ij}$. This is also true for any marginal. For example, if $(X_r, \ldots, X_k)$ are the last coordinates, then

$$
\begin{pmatrix} X_r \\ X_{r+1} \\ \vdots \\ X_k \end{pmatrix}
\sim MVN_{k-r+1}\left(
\begin{pmatrix} \mu_r \\ \mu_{r+1} \\ \vdots \\ \mu_k \end{pmatrix},
\begin{pmatrix}
\sigma_{r,r} & \sigma_{r,r+1} & \ldots & \sigma_{r,k} \\
\sigma_{r+1,r} & \sigma_{r+1,r+1} & \ldots & \sigma_{r+1,k} \\
\vdots & \vdots & \ddots & \vdots \\
\sigma_{k,r} & \sigma_{k,r+1} & \ldots & \sigma_{k,k}
\end{pmatrix}
\right).
$$

      So any particular vector of components is normal.
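      In computational terms, taking a marginal amounts to selecting the matching entries of the mean vector and the matching rows and columns of the covariance matrix. A minimal sketch, assuming a hypothetical helper `marginal_params` and made-up numbers; note that Python indices start at 0, so the coordinates $X_r, \ldots, X_k$ correspond to indices $r - 1, \ldots, k - 1$.

```python
def marginal_params(mu, sigma, idx):
    """Mean vector and covariance matrix of the sub-vector (X_i for i in idx)."""
    idx = list(idx)
    sub_mu = [mu[i] for i in idx]
    sub_sigma = [[sigma[i][j] for j in idx] for i in idx]
    return sub_mu, sub_sigma


# Illustrative values (assumptions, not taken from the text): k = 4, r = 2.
mu = [1.0, 2.0, 3.0, 4.0]
sigma = [
    [4.0, 1.0, 0.5, 0.0],
    [1.0, 3.0, 0.2, 0.1],
    [0.5, 0.2, 2.0, 0.3],
    [0.0, 0.1, 0.3, 1.0],
]
k, r = 4, 2

# Parameters of (X_r, ..., X_k), which is MVN with dimension k - r + 1 = 3.
sub_mu, sub_sigma = marginal_params(mu, sigma, range(r - 1, k))
print(sub_mu)
print(sub_sigma)
```

      The same helper works for any subset of coordinates, not only the last ones, since it only relies on the index list.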

      The conditional distribution of a multivariate normal is also multivariate normal. Suppose that $\mathbf{X}$ is $MVN_k(\mu, \Sigma)$ and, using the vector notation above, let $\mathbf{X}_1 = (X_1, \ldots, X_r)$ and $\mathbf{X}_2 = (X_{r+1}, \ldots, X_k)$. Then we can write the vector $\mu$ and matrix $\Sigma$ as

$$\mu = \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix} \quad \text{and} \quad \Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix},$$

      where the dimensions are accordingly chosen to match the two vectors ($r$ and $k - r$). Thus, the conditional distribution