15 The following data give the consumption of electricity in kilowatt‐hours during a given month in 30 rural households in Maine:
260 290 280 240 250 230 310 305 264 286
262 241 209 226 278 206 217 247 268 207
226 247 250 260 264 233 213 265 206 225
Construct, using technology, a stem‐and‐leaf diagram for these data. Comment on what you learn from these data.
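One possible way to carry out the "using technology" step is sketched below in Python. This is only an illustration, not the method the text prescribes: the function name `stem_and_leaf` and the choice of tens as stems and units as leaves are assumptions made here.

```python
from collections import defaultdict

# Electricity consumption (kWh) for the 30 households in the exercise.
consumption = [
    260, 290, 280, 240, 250, 230, 310, 305, 264, 286,
    262, 241, 209, 226, 278, 206, 217, 247, 268, 207,
    226, 247, 250, 260, 264, 233, 213, 265, 206, 225,
]

def stem_and_leaf(values):
    """Hypothetical helper: map each stem (value // 10) to its sorted leaves (value % 10)."""
    groups = defaultdict(list)
    for v in sorted(values):
        groups[v // 10].append(v % 10)
    lo, hi = min(groups), max(groups)
    # Include every stem between the smallest and largest, even if it has no leaves.
    return {stem: groups.get(stem, []) for stem in range(lo, hi + 1)}

diagram = stem_and_leaf(consumption)
for stem, leaves in diagram.items():
    print(f"{stem:>3} | {''.join(str(leaf) for leaf in leaves)}")
```

Each printed row shows a stem (hundreds and tens digits) followed by the units digits of the observations that share it, so a reading of 264 appears as leaf 4 on stem 26.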
2.5 Numerical Measures of Quantitative Data
Methods used to derive numerical measures for sample data as well as population data are known as numerical methods.
Definition 2.5.1
Numerical measures computed by using data of the entire population are referred to as parameters.
Definition 2.5.2
Numerical measures computed by using sample data are referred to as statistics.
In the field of statistics, it is standard practice to denote parameters by letters of the Greek alphabet and statistics by letters of the Roman alphabet.
We divide numerical measures into three categories: (i) measures of centrality, (ii) measures of dispersion, and (iii) measures of relative position. Measures of centrality give us information about the center of the data, measures of dispersion give information about the variation around the center of the data, and measures of relative position tell us what percentage of the data falls below or above a given measure.
2.5.1 Measures of Centrality
Measures of centrality are also known as measures of central tendency. Whether referring to measures of centrality or central tendency, the following measures are of primary importance:
1 Mean
2 Median
3 Mode
The mean, also sometimes referred to as the arithmetic mean, is the most useful and most commonly used measure of centrality. The median is the second most used, and the mode is the least used measure of centrality.
Mean
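The mean of a sample or a population is calculated by dividing the sum of the data measurements by the number of measurements in the data. The sample mean is also known as the sample average and is denoted by $\bar{X}$:
$$\bar{X} = \frac{X_1 + X_2 + \cdots + X_n}{n} = \frac{1}{n}\sum_{i=1}^{n} X_i \tag{2.5.1}$$
In (2.5.1), $X_i$ denotes the $i$th measurement in the sample and $n$ is the number of measurements in the sample. The population mean, denoted by $\mu$, is computed in the same way from all $N$ measurements in the population.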
Example 2.5.1 (Workers' hourly wages) The data in this example give the hourly wages (in dollars) of randomly selected workers in a manufacturing company:
8, 6, 9, 10, 8, 7, 11, 9, 8
Find the sample average and thereby estimate the mean hourly wage of these workers.
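Solution: Since wages listed in these data are for only some of the workers in the company, the data represent a sample. Thus, we have
$$n = 9, \qquad \sum_{i=1}^{n} X_i = 8 + 6 + 9 + 10 + 8 + 7 + 11 + 9 + 8 = 76$$
Thus, the sample average is observed to be
$$\bar{X} = \frac{76}{9} \approx 8.44$$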
In this example, the average hourly wage of these employees is $8.44 an hour.
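The arithmetic in Example 2.5.1 can be checked with a few lines of Python. This is a minimal sketch; the function name `sample_mean` is chosen here for illustration and does not come from the text.

```python
# Hourly wages (in dollars) from Example 2.5.1.
wages = [8, 6, 9, 10, 8, 7, 11, 9, 8]

def sample_mean(data):
    """Sum of the measurements divided by the number of measurements."""
    if not data:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(data) / len(data)

x_bar = sample_mean(wages)
print(f"n = {len(wages)}, sum = {sum(wages)}, mean = {x_bar:.2f}")
# prints: n = 9, sum = 76, mean = 8.44
```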
Example 2.5.2 (Ages of employees) The following data give the ages of all the employees in a city hardware store:
22, 25, 26, 36, 26, 29, 26, 26
Find the mean age of the employees in that hardware store.
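Solution: Since the data give the ages of all the employees of the hardware store, we are dealing with a population. Thus, we have
$$N = 8, \qquad \sum_{i=1}^{N} X_i = 22 + 25 + 26 + 36 + 26 + 29 + 26 + 26 = 216$$
so that the population mean is
$$\mu = \frac{216}{8} = 27$$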
In this example, the mean age of the employees in the hardware store is 27 years.
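The same check works for a population. As a brief sketch, the standard-library function `statistics.fmean` (available in Python 3.8 and later) performs the division for the ages in Example 2.5.2; the variable names `ages` and `mu` are chosen here for illustration. The computation does not distinguish a sample from a population: that distinction lies in what the data represent, not in the arithmetic.

```python
from statistics import fmean

# Ages of all employees in the hardware store (Example 2.5.2): a full population.
ages = [22, 25, 26, 36, 26, 29, 26, 26]

mu = fmean(ages)  # sum of the ages divided by the number of employees
print(f"N = {len(ages)}, sum = {sum(ages)}, population mean = {mu}")
# prints: N = 8, sum = 216, population mean = 27.0
```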
Even though the formulas for calculating sample average and population mean are very similar, it is important to make a clear distinction between the sample mean or sample average $\bar{X}$ and the population mean $\mu$.
Sometimes, a data set may include a few observations that are very small or very large. For example, the salaries of a group of engineers in a big corporation may include the salary of its CEO, who also happens to be an engineer and whose salary is much larger than that of the other engineers in the group. In such cases, where there are some very small and/or very large observations, these values are referred to as extreme values or outliers. If extreme