Some statistical tables
JMP files
R exhibits
Students
Chapters 20 and 21
Data sets
Partial solutions manual
Certain proofs and derivations
Some statistical tables
JMP files
R exhibits
Chapter 1 Introduction
Statistics, the discipline, is the study of the scientific method. In pursuing this discipline, statisticians have developed a set of techniques that are used extensively to solve problems in every field of scientific endeavor, including the engineering, biological, chemical, pharmaceutical, and social sciences.
This book discusses these techniques and their applications in certain experimental situations. It begins at a level suitable for those with no previous exposure to probability and statistics and carries the reader through to a level of proficiency in various statistical techniques.
In all scientific areas, whether engineering, biology, medicine, chemistry, pharmaceuticals, or the social sciences, scientists are inevitably confronted with problems that need to be investigated. Consider some examples:
An engineer wants to determine the role of an electronic component needed to detect the malfunction of the engine of a plane.
A biologist wants to study various aspects of wildlife, the origin of a disease, or the genetic aspects of a wild animal.
A medical researcher is interested in determining the cause of a certain type of cancer.
A manufacturer of lenses wants to study the quality of the finishing on intraocular lenses.
A chemist is interested in determining the effect of a catalyst in the production of low‐density polyethylene.
A pharmaceutical company is interested in developing a vaccine for swine flu.
A social scientist is interested in exploring a particular aspect of human society.
In all of these examples, the first and foremost task is to define the objective of the study clearly and to formulate the problem precisely. The next important step is to gather information that helps determine which key factors affect the problem. Remember that to determine these factors successfully, you need not only statistical methodology but also relevant nonstatistical (subject-matter) knowledge. Once the problem is formulated and its key factors are identified, the next step is to collect the data. There are various methods of data collection. The four basic methods are as follows:
A designed experiment
A survey
An observational study
A set of historical data, that is, data collected by an organization or an individual in an earlier study
1.1 Designed Experiment
We discuss the concept of a designed experiment with an example, “Development of Screening Facility for Storm Water Overflows” (taken from Box et al., 1978, and used with permission). The example illustrates how a sequence of experiments can enable scientists to gain knowledge of the various important factors affecting the problem and give insight into the objectives of the investigation. It also indicates how unexpected features of the problem can become dominant, and how experimental difficulties can occur so that certain planned experiments cannot be run at all. Most of all, this example shows the importance of common sense in the conduct of any experimental investigation. The reader may rightly conclude from this example that the course of a real investigation, like that of true love, seldom runs smoothly, although the eventual outcome may be satisfactory.
1.1.1 Motivation for the Study
During heavy rainstorms, the total flow coming to a sewage treatment plant may exceed its capacity, making it necessary to bypass the excess flow around the treatment plant, as shown in Figure 1.1.1a. Unfortunately, the storm overflow of untreated sewage causes pollution of the receiving body of water. A possible alternative, sketched in Figure 1.1.1b, is to screen most of the solids out of the overflow in some way and return them to the plant for treatment. Only the less objectionable screened overflow is discharged directly to the river.
To determine whether it was economical to construct and operate such a screening facility, the Federal Water Pollution Control Administration of the Department of the Interior sponsored a research project at the Sullivan Gulch pump station in Portland, Oregon. Usually, the flow to the pump station was 20 million gallons per day (mgd), but during a storm, the flow could exceed 50 mgd.
Figure 1.1.2 shows the original version of the experimental screening unit, which could handle approximately 1000 gallons per minute (gpm). Figure 1.1.2a is a perspective view, and Figure 1.1.2b is a simplified schematic diagram. A single unit was about 7 ft high and 7 ft in diameter. The flow of raw sewage struck a rotating collar screen at a velocity of 5 to 15 ft/s. This speed was a function of the flow rate into the unit and hence a function of the diameter of the influent pipe. Depending on the speed of rotation of this screen and its fineness, up to 90% of the feed penetrated the collar screen. The rest of the feed dropped to the horizontal screen, which vibrated to remove excess water. The solids concentrate, which passed through neither screen, was sent to the sewage treatment plant. Unfortunately, during operation, the screens became clogged with solid matter: not only sewage but also oil, paint, and fish‐packing wastes. Backwash sprays were therefore installed for both screens to permit cleaning during operation.
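The flows above are quoted in two different units: million gallons per day (mgd) for the pump station and gallons per minute (gpm) for the screening unit. The short Python sketch below puts them on a common scale. It is only an illustration of the arithmetic, assuming a 24-hour day of 1440 minutes; the function names `mgd_to_gpm` and `gpm_to_mgd` are hypothetical helpers introduced here, not part of the original study.

```python
# Illustrative unit conversion (assumption: 1 day = 24 * 60 = 1440 minutes).
MINUTES_PER_DAY = 24 * 60


def mgd_to_gpm(mgd):
    """Convert million gallons per day to gallons per minute."""
    return mgd * 1_000_000 / MINUTES_PER_DAY


def gpm_to_mgd(gpm):
    """Convert gallons per minute to million gallons per day."""
    return gpm * MINUTES_PER_DAY / 1_000_000


usual_gpm = mgd_to_gpm(20)   # usual flow to the pump station
storm_gpm = mgd_to_gpm(50)   # flow that could be exceeded during a storm
unit_mgd = gpm_to_mgd(1000)  # approximate capacity of one experimental unit

print(f"Usual flow:        {usual_gpm:,.0f} gpm")
print(f"Storm flow:        {storm_gpm:,.0f} gpm")
print(f"Experimental unit: {unit_mgd:.2f} mgd")
```

On this scale, the experimental unit handled roughly 1.44 mgd, a small fraction of the 20 mgd usual flow, which is consistent with its role as a research-scale installation.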
Figure 1.1.1 Operation of the sewage treatment plant: (a) standard mode of operation and (b) modified mode of operation, with screening facility. F = flow; S = settleable solids.
1.1.2 Investigation
The objective of the investigation was to determine good operating conditions.
1.1.3 Changing Criteria
What are good operating conditions? Initially, it was believed they were those resulting in the highest possible removal of solids. Referring to Figures 1.1.1b and 1.1.2a, settleable solids in the influent are denoted by