Figure 2.4.10 The cumulative frequency histogram for the data in Example 2.4.5.
USING R
We can use the built-in 'hist()' function in R to generate histograms. Extra arguments such as 'breaks', 'main', 'xlab', 'ylab', and 'col' can be used to define the break points, graph heading, x-axis label, y-axis label, and fill color, respectively.
    SurvTime = c(60,100,130,100,115,30,60,145,75,80,89,57,64,92,87,110,
                 180,195,175,179,159,155,
                 146,157,167,174,87,67,73,109,123,135,129,
                 141,154,166,179,37,49,68,74,89,87,109,119,125,56,39,49,190)

    #To plot the histogram
    hist(SurvTime, breaks=seq(30,198, by=24), main='Histogram of Survival Time',
         xlab='Survival Time', ylab='Frequency', col='grey', right = FALSE)

    #To obtain the cumulative histogram, we replace cell frequencies by their cumulative frequencies
    h = hist(SurvTime, breaks=seq(30,198, by=24), right = FALSE)
    h$counts = cumsum(h$counts)

    #To plot the cumulative histogram
    plot(h, main='Cumulative Histogram', xlab='Survival Time',
         ylab='Cumulative Frequency', col='grey')

Below, we show the histograms obtained by using the above R code.
Another graph, called the ogive curve, represents the cumulative frequency distribution (c.d.f.). It is obtained by plotting a cumulative frequency of zero at the lower limit of the first bin and the cumulative frequency of each bin at its upper limit, including the last bin, and then joining these points. Thus, the ogive curve for the data in Example 2.4.5 is as shown in Figure 2.4.11.
Figure 2.4.11 Ogive curve using MINITAB for the data in Example 2.4.5.
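The ogive in Figure 2.4.11 was produced with MINITAB, but a similar curve can be sketched in R. The code below is a minimal illustration, not the text's own method: it reuses the 'SurvTime' data and the bin limits from the histogram code above, and the names 'bin_limits' and 'cum_freq' are introduced here only for the example.

```r
# Survival-time data from Example 2.4.5 (same vector as in the histogram code)
SurvTime <- c(60,100,130,100,115,30,60,145,75,80,89,57,64,92,87,110,
              180,195,175,179,159,155,
              146,157,167,174,87,67,73,109,123,135,129,
              141,154,166,179,37,49,68,74,89,87,109,119,125,56,39,49,190)

# Bin limits: lower limit of the first bin, then the upper limit of every bin
bin_limits <- seq(30, 198, by = 24)

# Bin frequencies without drawing the histogram (left-closed bins, as before)
h <- hist(SurvTime, breaks = bin_limits, right = FALSE, plot = FALSE)

# Cumulative frequency is 0 at the first lower limit, then accumulates
# at each successive upper limit
cum_freq <- c(0, cumsum(h$counts))

# Join the points (bin limit, cumulative frequency) to draw the ogive
plot(bin_limits, cum_freq, type = "o",
     main = "Ogive Curve", xlab = "Survival Time",
     ylab = "Cumulative Frequency")
```

The last point of the curve should sit at the total number of observations, which gives a quick check that no data value fell outside the bins.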
2.4.5 Line Graph
A line graph, also known as a time‐series graph, is commonly used to study any trends in the variable of interest that might occur over time. In a line graph, time is marked on the horizontal axis (the
Example 2.4.6 (Lawn mowers) The data in Table 2.4.4 give the number of lawn mowers sold by a garden shop over a period of 12 months of a given year. Prepare a line graph for these data.
Table 2.4.4 Lawn mowers sold by a garden shop over a period of 12 months of a given year.
Month | January | February | March | April | May | June | July | August | September | October | November | December |
Lawn mowers sold | 2 | 1 | 4 | 10 | 57 | 62 | 64 | 68 | 40 | 15 | 10 | 5 |
Solution: To prepare the line graph, plot the data in Table 2.4.4 using the
From the line graph in Figure 2.4.12, we can see that sales of lawn mowers are seasonal, since more mowers were sold in the summer months. Another point worth noting is that a good number of lawn mowers were sold in September, when summer is winding down. This may be explained by the fact that many stores want to clear out such items as the mowing season is about to end, and many customers take advantage of clearance sales. Mower sales during the winter months may result from discounted prices, or perhaps the store is located where winters are very mild and there is still a need for mowers, although at a much lower rate.
Figure 2.4.12 Line graph for the data on lawn mowers in Example 2.4.6.
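A line graph like the one in Figure 2.4.12 can also be drawn in R. The following is a minimal sketch using the data in Table 2.4.4; the variable name 'sold' and the axis labels are choices made for this illustration, and 'month.abb' is R's built-in vector of abbreviated month names.

```r
# Lawn mowers sold per month, January through December (Table 2.4.4)
sold <- c(2, 1, 4, 10, 57, 62, 64, 68, 40, 15, 10, 5)

# Plot sales against month number; type = "o" draws points joined by lines.
# xaxt = "n" suppresses the default numeric x-axis so months can be labeled.
plot(seq_along(sold), sold, type = "o", xaxt = "n",
     main = "Line Graph of Lawn Mower Sales",
     xlab = "Month", ylab = "Lawn Mowers Sold")

# Label the horizontal axis with abbreviated month names
axis(1, at = seq_along(sold), labels = month.abb)
```

Time runs along the horizontal axis and the number of mowers sold along the vertical axis, so the summer peak and the drop after September are visible at a glance.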
2.4.6 Stem‐and‐Leaf Plot
Before discussing this plot, we need the concept of the median of a set of data. The median is the value, say
Suppose