Statistics and Probability with Applications for Engineers and Scientists Using MINITAB, R and JMP. Bhisham C. Gupta.

Author: Bhisham C. Gupta
Publisher: John Wiley & Sons Limited
Genre: Mathematics
ISBN: 9781119516620
use of advanced statistical techniques has increased exponentially. The collection and analysis of various kinds of data have become essential in agriculture, pharmaceuticals, business, medicine, engineering, manufacturing, and product distribution, as well as in government and nongovernment agencies. In a typical field, there is often a need to collect quantitative information on all elements of interest, which together are usually referred to as the population. The problem with collecting all conceivable values of interest on all elements, however, is that populations are usually so large that examining each element is not feasible. For instance, suppose that we are interested in determining the breaking strength of the filament in a type of electric bulb manufactured by a particular company. Clearly, in this case, examining each and every bulb means that we have to wait until each bulb dies, so it is unreasonable to collect data on all the elements of interest. In other cases, we cannot examine all the elements because doing so would be quite expensive, time‐consuming, or both. Thus, we usually end up examining only a small portion of a population, which is referred to as a sample. More formally, we may define population and sample as follows:

      Definition 2.1.1

      A population is a collection of all elements that possess a characteristic of interest.

      Populations can be finite or infinite. A population where all the elements are easily countable may be considered as finite, and a population where all the elements are not easily countable as infinite. For example, a production batch of ball bearings may be considered a finite population, whereas all the ball bearings that may be produced from a certain manufacturing line are considered conceptually as being infinite.

      Definition 2.1.2

      A portion of a population selected for study is called a sample.

      Definition 2.1.3

      The target population is the population about which we want to make inferences based on the information contained in a sample.

      Definition 2.1.4

      The population from which a sample is being selected is called a sampled population.

      Definitions 2.1.3 and 2.1.4 distinguish the target population, about which we want to draw conclusions, from the sampled population, from which the sample is actually selected. Usually, these two populations coincide, since every effort should be made to ensure that the sampled population is the same as the target population. However, financial reasons, time constraints, part of the population not being easily accessible, the unexpected loss of part of the population, and so forth may create situations where the sampled population is not equivalent to the whole target population. In such cases, conclusions drawn about the sampled population do not usually apply to the target population.

      Definition 2.1.5

      A sample is called a simple random sample if each element of the population has the same chance of being included in the sample.

      There are several techniques for selecting a random sample, among them simple random sampling, systematic random sampling, stratified random sampling, and cluster random sampling. The concept that each element of the population has the same chance of being included in a sample forms the basis of all of them. These four sampling schemes are usually referred to as sample designs.
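To make the four designs concrete, the following is a minimal Python sketch on a made-up population of 20 numbered units split into four groups of five. The population, the group structure, the sample sizes, and the fixed seed are all assumptions chosen for illustration; they do not come from the text, and real designs involve choices (such as how many units to draw per stratum) that this toy setup glosses over.

```python
import random

rng = random.Random(7)  # fixed seed only so the run is repeatable

# Hypothetical population: 20 units, split into 4 groups of 5.
population = list(range(1, 21))
groups = [population[i:i + 5] for i in range(0, 20, 5)]

# Simple random sampling: any 4 units, all subsets equally likely.
simple = rng.sample(population, 4)

# Systematic random sampling: random start, then every k-th unit.
k = 5
start = rng.randrange(k)
systematic = population[start::k]

# Stratified random sampling: treat each group as a stratum and
# draw one unit at random from every stratum.
stratified = [rng.choice(stratum) for stratum in groups]

# Cluster random sampling: treat each group as a cluster, pick one
# cluster at random, and keep all of its units.
cluster = list(rng.choice(groups))

for name, s in [("simple", simple), ("systematic", systematic),
                ("stratified", stratified), ("cluster", cluster)]:
    print(name, sorted(s))
```

In this toy setup every unit has the same chance of inclusion within a given design (1/5 for the first three, 1/4 for the cluster design), which is the common idea the paragraph above describes.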

      Since collecting each data point costs time and money, it is important that in taking a sample, some balance be kept between the sample size and the resources available. Too small a sample may not provide much useful information, but too large a sample may result in a waste of resources. Thus, it is very important that in any sampling procedure, an appropriate sample design is selected. In this section, we will review, very briefly, the four sample designs mentioned previously.

      Before taking any sample, we need to divide the target population into nonoverlapping units, usually known as sampling units. It is important to recognize that the sampling units in a given population may not always be the same. Sampling units are in fact determined by the sample design chosen. For example, in sampling voters in a metropolitan area, the sampling units might be individual voters, all voters in a family, all voters living in a town block, or all voters in a town. Similarly, in sampling parts from a manufacturing plant, the sampling units might be an individual part or a box containing several parts.

      Definition 2.1.6

      A list of all sampling units is called the sampling frame.
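As a small illustration of how the sampling frame depends on the choice of sampling unit, the sketch below builds two frames for the same set of parts. The part labels, the box size of four, and the variable names are hypothetical and are used only to mirror the manufacturing example above.

```python
# Hypothetical plant output: 12 parts packed 4 to a box.
parts = [f"part-{i:02d}" for i in range(1, 13)]
boxes = {f"box-{b}": parts[4 * (b - 1): 4 * b] for b in range(1, 4)}

# Design A: the sampling unit is an individual part,
# so the frame lists every part.
frame_parts = list(parts)

# Design B: the sampling unit is a box of parts,
# so the frame lists every box.
frame_boxes = list(boxes)

print(len(frame_parts))  # 12
print(len(frame_boxes))  # 3
```

Both frames cover the same parts without overlap; they differ only in what counts as one unit when the sample is drawn.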

      The most commonly used sample design is the simple random sampling design, which consists of selecting $n$ (sample size) sampling units in such a way that each sampling unit has the same chance of being selected. If, however, the population is finite of size $N$, say, then the simple random sampling design may be defined as selecting $n$ sampling units in such a way that each possible sample of size $n$ has the same chance of being selected. The number of such samples of size $n$ that may be formed from a finite population of size $N$ is discussed in Section 3.4.3.
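A short Python sketch of simple random sampling from a finite population follows. The frame of ten numbered units, the sample size of three, and the seed are assumptions for illustration. The count of possible samples is computed here as the binomial coefficient, on the assumption that samples are drawn without replacement and that order does not matter.

```python
import math
import random

# Hypothetical sampling frame: N = 10 units labeled 1..10.
frame = list(range(1, 11))
N = len(frame)
n = 3  # sample size

# Number of distinct samples of size n from N units (order ignored,
# no unit chosen twice): the binomial coefficient "N choose n".
num_samples = math.comb(N, n)

# Draw one simple random sample without replacement; every subset of
# size n is equally likely to be the one returned.
rng = random.Random(2024)  # fixed seed only so the run is repeatable
sample = rng.sample(frame, n)

print(num_samples)  # 120
print(sorted(sample))
```

Since each of the 120 possible samples is equally likely, any particular sample of three units is selected with probability 1/120.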