Statistics and Probability with Applications for Engineers and Scientists Using MINITAB, R and JMP. Bhisham C. Gupta.

Author: Bhisham C. Gupta
Publisher: John Wiley & Sons Limited
Genre: Mathematics
ISBN: 9781119516620

      This section introduces measures of relative position, which divide the data into percentage-based parts so that any data value can be located within the whole data set. The most commonly used measures of relative position are percentiles and quartiles: percentiles divide the data into one hundred parts, each containing at most 1% of the data, and quartiles divide the data into four parts, each containing at most 25% of the data. From the quartiles we can derive another measure, the interquartile range (IQR), which gives the range of the middle 50% of the data values. It is obtained by first arranging the data in ascending order and then trimming 25% of the data values from each of the lower and upper ends. A quantile is a value that divides a distribution or an ordered sample such that a specified proportion of observations fall below that value. Percentiles and quartiles are thus specific kinds of quantiles.

      2.7.1 Percentiles

      Step 1. Write the data values in ascending order and rank them from 1 to \(n\).

      Step 2. Find the rank of the \(p\)th percentile (\(p = 1, 2, \ldots, 99\)), which is given by

      \[ \text{Rank of the } p\text{th percentile} = p \times \frac{n+1}{100} \qquad (2.7.1) \]

      Step 3. Find the data value that corresponds to the rank of the \(p\)th percentile.

      We illustrate this procedure with the following example.

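      Example 2.7.1 (Engineers' salaries) The following data give the salaries (in thousands of dollars) of 15 engineers:

       62 48 52 63 85 51 95 76 72 51 69 73 58 55 54

      (a) Find the 70th percentile for these data.

      (b) Find the percentile corresponding to the salary of $60,000.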

      Solution: (a) We proceed as follows:

      Step 1. Write the data values in ascending order and rank them from 1 to 15.

      Salaries  48  51  51  52  54  55  58  62  63  69  72  73  76  85  95
      Ranks      1   2   3   4   5   6   7   8   9  10  11  12  13  14  15

      Step 2. Find the rank of the 70th percentile, which from (2.7.1) is given by

      \[ 70 \times \frac{15+1}{100} = 11.2 \]

      Step 3. Find the data values that correspond to the ranks 11 and 12, which in this example are 72 and 73, respectively. Then, the 70th percentile is given by

      \[ 72 + 0.2\,(73 - 72) = 72.2 \]

      Thus, the 70th percentile of the salary data is $72,200. That is, at most 70% of the engineers are making less than $72,200 and at most 30% of the engineers are making more than $72,200.
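
      The three steps can be sketched in a few lines of Python. This is only an illustrative sketch, not the book's software: the function name `percentile` is hypothetical, and the handling of ranks that fall outside 1 to n (clamping to the smallest or largest value) is an assumption that the text does not address.

```python
import math

def percentile(data, p):
    """Return the p-th percentile using the rank rule p * (n + 1) / 100."""
    values = sorted(data)                 # Step 1: ascending order
    n = len(values)
    rank = p * (n + 1) / 100              # Step 2: possibly fractional rank
    lower = math.floor(rank)              # whole part of the rank
    fraction = rank - lower               # fractional part of the rank
    if lower < 1:                         # assumption: clamp below the first rank
        return values[0]
    if lower >= n:                        # assumption: clamp at the last rank
        return values[-1]
    # Step 3: interpolate between the values at ranks `lower` and `lower + 1`
    below, above = values[lower - 1], values[lower]
    return below + fraction * (above - below)

# Salaries in thousands of dollars, from the example above
salaries = [62, 48, 52, 63, 85, 51, 95, 76, 72, 51, 69, 73, 58, 55, 54]
p70 = percentile(salaries, 70)
print(round(p70, 1))
```

      For the salary data the rank is 11.2, so the function interpolates two-tenths of the way from 72 to 73, in agreement with the hand calculation.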

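      (b) Now we want to find the percentile \(p\) corresponding to a given value \(x\). This can be done by using the following formula:

      \[ p = \frac{\text{number of data values} \le x}{n+1} \times 100 \qquad (2.7.2) \]

      In this example, 7 of the 15 salaries are less than or equal to 60, so

      \[ p = \frac{7}{15+1} \times 100 \approx 44 \]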

      Hence, the engineer who makes a salary of $60,000 is at the 44th percentile. In other words, at most 44% of the engineers are making less than $60,000, or at most 56% are making more than $60,000.
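
      The calculation in part (b) can be checked with a short Python sketch. It assumes that the formula counts the data values less than or equal to the given value and divides by n + 1, which reproduces the 44th percentile found above; the function name `percentile_of_value` is hypothetical.

```python
def percentile_of_value(data, x):
    """Percentile of x: 100 * (number of values <= x) / (n + 1)."""
    n = len(data)
    count = sum(1 for value in data if value <= x)  # values at or below x
    return 100 * count / (n + 1)

# Salaries in thousands of dollars, from the example above
salaries = [62, 48, 52, 63, 85, 51, 95, 76, 72, 51, 69, 73, 58, 55, 54]
p = percentile_of_value(salaries, 60)
print(p)         # 43.75
print(round(p))  # 44
```

      Seven salaries are at or below 60, so the unrounded result is 43.75, which rounds to the 44th percentile.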

      2.7.2 Quartiles

      [Figure: Quartiles and percentiles. A horizontal bar is divided by three vertical lines, labeled \(Q_1\) (25th percentile), \(Q_2\) (50th percentile), and \(Q_3\) (75th percentile), into four segments, each labeled 25%.]
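
      Since the quartiles are the 25th, 50th, and 75th percentiles, they can be computed with any percentile routine. The sketch below uses `statistics.quantiles` from the Python standard library, not the book's software. Its `method="exclusive"` option places the cut points at positions i(n + 1)/4 in the ordered data, which is assumed here to match the rank rule used for percentiles in Section 2.7.1.

```python
import statistics

# Salaries in thousands of dollars, from Example 2.7.1
salaries = [62, 48, 52, 63, 85, 51, 95, 76, 72, 51, 69, 73, 58, 55, 54]

# n=4 requests the three cut points that split the data into four parts
q1, q2, q3 = statistics.quantiles(salaries, n=4, method="exclusive")
print(q1, q2, q3)
```

      With 15 observations the cut points fall exactly on ranks 4, 8, and 12, so no interpolation is needed for these data.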

      2.7.3 Interquartile Range (IQR)

      Often we are more interested in finding information about the middle 50% of a population. A measure of dispersion relative to the middle 50% of the population or sample data is known as the IQR. This range is obtained by trimming 25% of the values from the bottom and 25% of the values from the top, which is equivalent to finding the spread between the first quartile and the third quartile. The IQR is therefore defined as

      \[ \text{IQR} = Q_3 - Q_1 \qquad (2.7.3) \]

      Example 2.7.2 (Engineers' salaries) Find the IQR for the salary data in Example 2.7.1:

       Salaries: 48, 51, 51, 52, 54, 55, 58, 62, 63, 69, 72, 73, 76, 85, 95

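      Solution: To find the IQR, we need the quartiles \(Q_1\) and \(Q_3\), or equivalently the 25th percentile and the 75th percentile. From (2.7.1), the ranks of the 25th and 75th percentiles are

      \[ \text{Rank of the 25th percentile} = 25 \times \frac{15+1}{100} = 4 \]

      \[ \text{Rank of the 75th percentile} = 75 \times \frac{15+1}{100} = 12 \]

      Consequently, \(Q_1 = 52\) and \(Q_3 = 73\), so that

      \[ \text{IQR} = Q_3 - Q_1 = 73 - 52 = 21 \]

      That is, the middle 50% of the engineers' salaries span a range of $21,000.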
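
      The arithmetic of this example can be traced step by step in Python. This is a minimal sketch that assumes the rank rule p(n + 1)/100 from Section 2.7.1; the variable names are illustrative only.

```python
# Salaries in thousands of dollars, already in ascending order
salaries = [48, 51, 51, 52, 54, 55, 58, 62, 63, 69, 72, 73, 76, 85, 95]
n = len(salaries)

rank_q1 = 25 * (n + 1) / 100   # rank of the 25th percentile
rank_q3 = 75 * (n + 1) / 100   # rank of the 75th percentile

# Both ranks are whole numbers for these data, so the quartiles are
# simply the values at those ranks and no interpolation is needed.
if not (rank_q1.is_integer() and rank_q3.is_integer()):
    raise ValueError("fractional rank: interpolation would be required")

q1 = salaries[int(rank_q1) - 1]   # ranks start at 1, list indices at 0
q3 = salaries[int(rank_q3) - 1]
iqr = q3 - q1
print(q1, q3, iqr)  # 52 73 21
```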

      Notes:

      1. The IQR gives an estimate of the range of the middle 50% of the population.

      2. The IQR is potentially a more meaningful measure of dispersion than the range, as it is not affected by the extreme values that may be present in the data. By trimming 25% of the data from the bottom and 25% from the top, we eliminate the extreme values that may be present in the data set. Thus, the IQR is often used as a measure of comparison between two or more data sets on similar studies.
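
      The second note can be illustrated with a small experiment in Python. In this sketch the largest salary is replaced by a hypothetical extreme value of 950 (not part of the book's data), and the range and the IQR are compared before and after. The quartiles come from the standard library's `statistics.quantiles` with `method="exclusive"`, assumed here to match the (n + 1) rank rule; the helper names are illustrative.

```python
import statistics

def data_range(values):
    """Range: largest value minus smallest value."""
    return max(values) - min(values)

def iqr(values):
    """Interquartile range Q3 - Q1 using (n + 1)-based quartile positions."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="exclusive")
    return q3 - q1

# Salaries in thousands of dollars, in ascending order
salaries = [48, 51, 51, 52, 54, 55, 58, 62, 63, 69, 72, 73, 76, 85, 95]

# Hypothetical data set: the top salary of 95 is replaced by 950
with_outlier = salaries[:-1] + [950]

print(data_range(salaries), iqr(salaries))
print(data_range(with_outlier), iqr(with_outlier))
```

      The range jumps from 47 to 902 when the extreme value is introduced, while the IQR stays at 21 because the quartiles depend only on the values near ranks 4 and 12.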

      2.7.4 Coefficient of Variation

      The