This is an introductory textbook in applied statistics and probability for undergraduate students in engineering and the natural sciences. It begins at a level suitable for those with no previous exposure to probability and statistics and carries the reader through to a level of proficiency in various techniques of statistics. This text is divided into two parts: Part I discusses descriptive statistics, concepts of probability, probability distributions, sampling distributions, estimation, and testing of hypotheses, and Part II discusses various topics of applied statistics, including some reliability theory, data mining, cluster analysis, some nonparametric techniques, categorical data analysis, simple and multiple linear regression analysis, design and analysis of variance with emphasis on factorial designs, response surface methodology, and statistical quality control charts of phase I and phase II.
This text is suitable for a one‐ or two‐semester undergraduate course sequence. The presentation of material gives instructors considerable flexibility to choose the topics they feel their courses should cover. However, we feel that a first course for engineering and science majors may cover Chapters 1 and 2, a brief discussion of probability in Chapter 3, selected discrete and continuous distributions from Chapters 4 and 5 with more emphasis on the normal distribution, Chapters 7–9, and a couple of topics from Part II that meet the needs and interests of the particular group of students. For example, some discussion of the material on regression analysis and design of experiments in Chapters 15 and 17 may serve well. Chapters 11 and 12 may be adequate to motivate students' interest in data science and data analytics. A two‐semester course may cover the entire book. The only prerequisite is a first course in calculus, which all engineering and science students are required to take. Because of space considerations, some proofs and derivations and certain advanced‐level topics of interest, including Chapters 20 and 21 on statistical quality control charts of phase I and phase II, are not included in the text but are available for download via the book's website: www.wiley.com/college/gupta/statistics2e.
MOTIVATION
Students encounter data‐analysis problems in many areas of engineering or natural science curricula. Engineers and scientists in their professional lives often encounter situations requiring analysis of data arising from their areas of practice. Very often, they have to plan the investigation that generates data (an activity euphemistically called the design of experiments), analyze the data obtained, and interpret the results. Other problems and investigations may pertain to the maintenance of quality of existing products or the development of new products or to a desired outcome in an investigation of the underlying mechanisms governing a certain process. Knowing how to “design” a particular investigation to obtain reliable data must be coupled with knowledge of descriptive and inferential statistical tools to properly analyze and interpret such data. The intent of this textbook is to expose the uninitiated to statistical methods that deal with the generation of data for different (but frequently met) types of investigations and to discuss how to analyze and interpret the generated data.
HISTORY
This text has its roots in the three editions of Introductory Engineering Statistics, first co‐authored by Irwin Guttman and the late, great Samuel Wilks. Professor J. Stuart Hunter (Princeton University), one of the finest expositors in the statistics profession, a noted researcher, and a colleague of Professor Wilks, joined Professor Guttman to produce editions two and three. All editions were published by John Wiley & Sons, with the third edition appearing in 1982. The first edition of the current text was published in 2013.
APPROACH
In this text, we emphasize both descriptive and inferential statistics. We first give details of descriptive statistics and then continue with an elementary discussion of the fundamentals of probability theory underlying many of the statistical techniques discussed in this text. We next cover a wide range of statistical techniques, such as statistical estimation, regression methods, nonparametric methods, elements of reliability theory, statistical quality control (with emphasis on phase I and phase II control charts), process capability indices, and the like. A feature of these discussions is that all statistical concepts are supported by a large number of examples using data encountered in real‐life situations. We also illustrate how the statistical packages MINITAB® Version 18, R® Version 3.5.1, and JMP® Version 9 may be used to aid in the analysis of various data sets.
Another feature of this text is the coverage at an adequate and understandable level of the design of experiments. This includes a discussion of randomized block designs, one‐ and two‐way designs, Latin square designs, factorial designs, and response surface designs, among others. The latest version of this text covers materials on supervised and unsupervised learning techniques used in data mining and cluster analysis, with extensive exposure to statistical computing using R software and MINITAB. As previously indicated, all this is illustrated with real‐life situations and accompanying data sets, supported by MINITAB, R, and JMP. We know of no other book on the market that covers all these software packages.
WHAT IS NEW IN THIS EDITION
After a careful review of current advancements in statistical software and related applications, as well as the feedback received from current users of the text, we have incorporated many changes in this new edition:
R software exhibits along with their R code are included.
Additional R software help for beginners is included in Appendix D.
MINITAB software instructions and contents are updated to the latest version of the software.
JMP software instructions and contents are updated to the latest version of the software.
New chapters on Data Mining and Cluster Analysis are included.
An improved chapter on Response Surface Design has been brought back to the printed copy from the book website.
The p‐value approach is emphasized, and related practical interpretations are included.
The theorems and definitions are more visible and better formatted.
Graphical exhibits are provided to improve the visualizations.
HALLMARK FEATURES
Software Integration
As previously indicated, we incorporate MINITAB and R throughout the text. Complete R exhibits with their outputs (Appendix D) and associated JMP exhibits are available on the book's website: www.wiley.com/college/gupta/statistics2e. Our step‐by‐step approach to the use of the software packages means no prior knowledge of their use is required. After completing a course that uses this text, students will be able to use these software packages to analyze statistical data in their fields of interest.