(a) Find the mean, mode, and median for these data.
(b) Prepare the box plot for the data.
(c) Using the results of parts (a) and (b), determine whether the data are symmetric or skewed. Examine whether the two methods, that is, the results of parts (a) and (b), lead to the same conclusion about the shape of the distribution.
(d) Using the box plot, check if the data contain any outliers.
(e) If in part (c) the conclusion is that the data are symmetric, then find the standard deviation and verify whether the empirical rule holds.
Solution: The sample size in this problem is n = 40. Thus, we have
(a) Mean, mode, and median
(b) To prepare the box plot, we first find the ranks of the quartiles $Q_1$, $Q_2$, and $Q_3$. Since the data presented in this problem are already in ascending order, the quartiles $Q_1$, $Q_2$, and $Q_3$ can be read off directly from these ranks, and the interquartile range is IQR $= Q_3 - Q_1$. The box plot for the data is shown in Figure 2.8.4.
Figure 2.8.4 Box plot for the data in Example 2.8.2.
(c) Both parts (a) and (b) lead to the same conclusion; that is, the data are symmetric.
(d) From the box plot in Figure 2.8.4, it is clear that the data do not contain any outliers.
(e) In part (c), we concluded that the data are symmetric, so we can proceed to calculate the standard deviation and then determine whether or not the empirical rule holds. It can be seen that the interval within one standard deviation of the mean contains 72.5% of the data and the interval within two standard deviations of the mean contains 100% of the data.
The data are slightly more clustered around the mean than the empirical rule predicts (about 68% and 95%, respectively). But for all practical purposes, we can say that the empirical rule holds.
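The check in part (e) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `empirical_rule_coverage` and the list `sample` are hypothetical, and `sample` is made-up data, not the data of Example 2.8.2. The sketch computes the mean, median, and mode, then the fraction of observations lying within one, two, and three sample standard deviations of the mean, which can be compared with the approximate 68%, 95%, and 99.7% of the empirical rule.

```python
import statistics


def empirical_rule_coverage(data):
    """Return the fraction of observations within 1, 2, and 3 sample
    standard deviations of the sample mean (hypothetical helper)."""
    mean = statistics.mean(data)
    sd = statistics.stdev(data)  # sample standard deviation (divisor n - 1)
    coverage = {}
    for k in (1, 2, 3):
        low, high = mean - k * sd, mean + k * sd
        inside = sum(1 for x in data if low <= x <= high)
        coverage[k] = inside / len(data)
    return coverage


# Hypothetical, roughly symmetric sample (NOT the data of Example 2.8.2).
sample = [48, 50, 51, 52, 52, 53, 53, 53, 54, 54, 55, 56, 58]

# For symmetric data the three measures of centrality nearly coincide.
print("mean  :", statistics.mean(sample))
print("median:", statistics.median(sample))
print("mode  :", statistics.multimode(sample))

# Compare these fractions with about 0.68, 0.95, and 0.997.
print(empirical_rule_coverage(sample))
```

Coverage fractions somewhat above 0.68 and 0.95, as in the example above, indicate data slightly more concentrated around the mean than the empirical rule suggests.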
PRACTICE PROBLEMS FOR SECTIONS 2.7 AND 2.8
1 The following data give the amount of a beverage in 12 oz cans:
11.38 11.03 11.87 11.98 12.36 11.80 12.32 12.06 11.38 11.07
12.12 12.11 12.24 12.37 11.75 12.25 13.60 11.93 13.11 11.76
12.34 12.08 11.85 11.37 12.32 11.74 12.75 12.76 12.16 11.72
10.97 12.09 12.53 11.88 12.11 11.28 12.01 11.80 12.47 12.32
(a) Find the mean, variance, and standard deviation of these data.
(b) Find the three quartiles and the IQR for these data.
(c) Prepare a box plot for these data and determine if there are any outliers present in these data.
2 The following data give the reaction time (in minutes) of a chemical experiment conducted by 36 chemistry majors:
55 58 46 58 49 46 41 60 59 41 59 42
40 44 42 58 46 58 58 40 51 59 48 46
42 43 56 48 41 54 56 57 48 43 49 43
(a) Find the mean, mode, and median for these data.
(b) Prepare a box plot for these data and check whether this data set contains any outliers.
3 The following data give the physics lab scores of 24 randomly selected physics majors:
21 18 21 18 20 18 18 59 19 20 20 20
19 18 21 58 19 22 19 18 22 18 22 56
Construct a box plot for these data and examine whether this data set contains any outliers.
4 The following data provide the number of Six Sigma Black Belt engineers in 36 randomly selected manufacturing companies in the United States:
73 64 80 67 73 78 66 78 59 79 74 75
73 66 63 62 61 58 65 76 60 79 62 63
71 75 56 78 73 75 63 66 71 74 64 43
(a) Find the 60th percentile of these data.
(b) Find the 75th percentile of the data.
(c) Determine the number of data points that fall between the 60th and 75th percentiles you found in parts (a) and (b).
(d) Prepare the box plot for these data and comment on the shape of the data.
5 Consider the following two sets of data:
Set I
29 24 25 26 23 24 29 29 24 28
23 27 26 21 20 25 24 30 28 28
29 28 22 26 30 21 26 27 25 23
Set II
46 48 60 43 57 47 42 57 58 59 52 53
41 58 43 50 49 56 57 54 51 46 60 44
55 43 60 50 51 54 50 43 44 53 51 58
(a) Find the mean and standard deviation for the two data sets.
(b) Find the coefficient of variation for these data sets.
(c) Determine whether one of these data sets has higher variation than the other.
6 Reconsider the data in Problem 4 of Section 2.6, and do the following:
(a) Find the mean, variance, and standard deviation of these data.
(b) Find the three quartiles and the IQR for these data.
(c) Prepare a box plot for these data and determine if there are any outliers present in these data.
7 Reconsider the data in Problem 5 of Section 2.6, and do the following:
(a) Find the mean, variance, and standard deviation of these data.
(b) Find the three quartiles and the IQR for these data.
(c) Prepare a box plot for these data and determine if there are any outliers present in these data.
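The quartile and outlier computations asked for in these problems can be sketched in Python. The sketch below makes two assumptions that should be checked against the definitions used in the text: the rank of the $p$th percentile is taken as $p(n+1)/100$ with linear interpolation between neighboring order statistics (this is what `method="exclusive"` of `statistics.quantiles` computes), and an observation is flagged as an outlier when it lies more than $1.5 \times \text{IQR}$ beyond $Q_1$ or $Q_3$. The function name `box_plot_summary` is hypothetical; the input is the lab score data of Problem 3.

```python
import statistics


def box_plot_summary(data):
    """Quartiles, IQR, 1.5*IQR fences, and flagged outliers
    (hypothetical helper; conventions are assumptions, see lead-in)."""
    # Quartiles from ranks p(n + 1)/100 with linear interpolation.
    q1, q2, q3 = statistics.quantiles(data, n=4, method="exclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = sorted(x for x in data if x < lower_fence or x > upper_fence)
    return {
        "q1": q1,
        "q2": q2,
        "q3": q3,
        "iqr": iqr,
        "lower_fence": lower_fence,
        "upper_fence": upper_fence,
        "outliers": outliers,
    }


# Physics lab scores from Problem 3.
scores = [21, 18, 21, 18, 20, 18, 18, 59, 19, 20, 20, 20,
          19, 18, 21, 58, 19, 22, 19, 18, 22, 18, 22, 56]

summary = box_plot_summary(scores)
for name, value in summary.items():
    print(f"{name}: {value}")
```

A different quartile convention, for example ranks based on $n$ rather than $n+1$, can shift $Q_1$ and $Q_3$ slightly and therefore move the fences, so the convention should match the one used when drawing the box plot by hand.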
2.9 Measures of Association
So far in this chapter, the discussion has focused only on univariate statistics because we were interested in studying a single characteristic of a subject. In all the examples we considered, the variable of interest was either qualitative or quantitative. We now study cases involving two variables; this means examining two characteristics of a subject. The two variables of interest could be either qualitative or quantitative, but here we will consider only quantitative variables.
For the consideration of two variables simultaneously, the data obtained are known as bivariate data. In the examination of bivariate data, the first question is whether there is any association of interest between the two variables. One effective way to determine whether there is such an association is to prepare a graph by plotting one variable along the horizontal scale (x‐axis) and the second variable along the vertical scale (y‐axis). Each pair of observations
Example 2.9.1 (Cholesterol level and systolic blood pressure) The cholesterol level and the systolic blood pressure of 10 randomly selected US males in the age group 40–50 years are given in Table 2.1. Construct a scatter plot of these data and determine if there is any association between the cholesterol levels and systolic blood pressures.
Solution: Figure 2.9.1 shows the scatter plot of the data in Table 2.1. This scatter plot clearly indicates that there is a fairly strong upward linear trend. Also, if a straight line is drawn