Data Science in Theory and Practice. Maria Cristina Mariani. Читать онлайн. Newlib. NEWLIB.NET

Информация о произведении:

Автор:	Maria Cristina Mariani
Издательство:	John Wiley & Sons Limited
Серия:
Жанр произведения:	Математика
Год издания:	0
isbn:	9781119674733

Скачать книгу

p period"/>

Example 3.2 Consider the following data matrix introduced in Example 3.1:

bold upper X equals Start 3 By 2 Matrix 1st Row 1st Column 48 2nd Column 3 2nd Row 1st Column 22 2nd Column 1 3rd Row 1st Column 50 2nd Column 2 EndMatrix period

Each receipt yields a pair of measurements, total dollar sales, and number of movies sold. To find the sample mean x overbar , we calculate the average of each column as follows:

StartLayout 1st Row 1st Column x overbar Subscript 1 2nd Column equals one third sigma-summation Underscript j equals 1 Overscript 3 Endscripts x Subscript j Baseline 1 Baseline equals one third left-parenthesis 48 plus 22 plus 50 right-parenthesis equals 40 comma 2nd Row 1st Column x overbar Subscript 2 2nd Column equals one third sigma-summation Underscript j equals 1 Overscript 3 Endscripts x Subscript j Baseline 2 Baseline equals one third left-parenthesis 3 plus 1 plus 2 right-parenthesis equals 2 period EndLayout

Therefore,

bold upper X overbar equals StartBinomialOrMatrix x overbar Subscript 1 Baseline Choose x overbar Subscript 2 Baseline EndBinomialOrMatrix equals StartBinomialOrMatrix 40 Choose 2 EndBinomialOrMatrix period

This implies that the average dollar sales for two movies is $40.00. Therefore, the average amount of dollars that cost a movie is 20 dollars.

3.4 Variance–Covariance Matrices

The variance–covariance matrix (or simply the covariance matrix) of a random vector bold upper X is given by

Cov left-parenthesis bold upper X right-parenthesis equals upper E left-bracket left-parenthesis bold upper X minus upper E left-parenthesis bold upper X right-parenthesis right-parenthesis left-parenthesis bold upper X minus upper E left-parenthesis bold upper X right-parenthesis right-parenthesis Superscript upper T Baseline right-bracket comma

where upper E left-parenthesis bold upper X right-parenthesis is the mean vector.

The sample covariance matrix upper S equals left-parenthesis s Subscript j k Baseline right-parenthesis is the matrix of sample variances and covariances of the variables.

(3.1)

In (3.1), the covariance matrix consists of the variances of the variables along the main diagonal and the covariances between each pair of variables in the other matrix positions. The sample covariance of the th and th variables, s Subscript i k , is calculated using the th and th columns of bold upper X :

(3.2)

where is the number of measurements.

For example if i equals 1 comma k equals 2 :

(3.3) s 12 equals StartFraction 1 Over n minus 1 EndFraction sigma-summation Underscript j equals 1 Overscript n Endscripts left-parenthesis x Subscript j Baseline 1 Baseline minus x overbar Subscript 1 Baseline right-parenthesis left-parenthesis x Subscript j Baseline 2 Baseline minus x overbar Subscript 2 Baseline right-parenthesis

and if i equals 1 comma k equals 1 :

(3.4) s 11 equals StartFraction 1 Over n minus 1 EndFraction sigma-summation Underscript j equals 1 Overscript n Endscripts left-parenthesis x Subscript j Baseline 1 Baseline minus x overbar Subscript 1 Baseline right-parenthesis squared

we have the sample variance.

The sample covariance measures the association between the th and th variables. The sample covariance reduces to the sample variance when i equals k as observed in (3.4). We note that the sample covariance matrix (3.1) is symmetric, i.e. s Subscript i k Baseline equals s Subscript k i for all and because of its definition. Other names used for the covariance matrix are variance matrix, variance–covariance matrix, and dispersion matrix. In finance the concept of covariance is applied in portfolio theory, in the diversification method, that reduces the risk by choosing assets that do not present a high positive covariance with each other.

If bold upper X is a random vector taking on any possible value in a multivariate population, the population covariance matrix is defined as

(3.5) bold sigma-summation equals cov left-parenthesis bold upper X right-parenthesis equals Start 4 By 4 Matrix 1st Row 1st Column sigma

<div class= Скачать книгу