Sampling and Estimation from Finite Populations. Yves Tille

Book information:

Author:	Yves Tille
Publisher:	John Wiley & Sons Limited
Genre:	Mathematics
ISBN:	9781119071273


I would like to thank all the people who, in one way or another, helped me to make this book: Laurence Broze, who entrusted me with my first sampling course at the University Lille 3, Carl Särndal, who encouraged me on several occasions, and Yves Berger, with whom I shared an office at the Université Libre de Bruxelles for several years and who gave me a multitude of relevant remarks. My thanks also go to Antonio Canedo, who taught me to use LaTeX, to Lydia Zaïd, who corrected the manuscript several times, and to Jean Dumais for his many constructive comments.

I wrote most of this book at the École Nationale de la Statistique et de l'Analyse de l'Information. The warm atmosphere that prevailed in the statistics department gave me a lot of support. I especially thank my colleagues Fabienne Gaude, Camelia Goga, and Sylvie Rousseau, who meticulously reread the manuscript, and Germaine Razé, who reproduced the proofs. Several exercises are due to Pascal Ardilly, Jean‐Claude Deville, and Laurent Wilms. I want to thank them for allowing me to reproduce them. My gratitude goes particularly to Jean‐Claude Deville for our fruitful collaboration within the Laboratory of Survey Statistics of the Center for Research in Economics and Statistics. The chapters on the splitting method and balanced sampling also reflect the research that we have done together.

Yves Tillé

Bruz, 2001

Chapter 1 A History of Ideas in Survey Sampling Theory

1.1 Introduction

Looking back, the debates that animated a scientific discipline often appear futile. However, the history of sampling theory is particularly instructive. It is one of the specializations of statistics which itself has a somewhat special position, since it is used in almost all scientific disciplines. Statistics is inseparable from its fields of application since it determines how data should be processed. Statistics is the cornerstone of quantitative scientific methods. It is not possible to determine the relevance of the applications of a statistical technique without referring to the scientific methods of the disciplines in which it is applied.

Scientific truth is often presented as the consensus of a scientific community at a specific point in time. The history of a scientific discipline is the story of these consensuses and especially of their changes. Since the work of Thomas Samuel Kuhn (1970), we have considered that science develops around paradigms that are, according to Kuhn (1970, p. 10), “models from which spring particular coherent traditions of scientific research.” These models have two characteristics: “Their achievement was sufficiently unprecedented to attract an enduring group of adherents away from competing modes of scientific activity. Simultaneously, it was sufficiently open‐ended to leave all sorts of problems for the redefined group of practitioners to resolve.” (Kuhn, 1970, p. 10).

Many authors have proposed a chronology of discoveries in survey theory that reflect the major controversies that have marked its development (see among others Hansen & Madow, 1974; Hansen et al., 1983; Owen & Cochran, 1976; Sheynin, 1986; Stigler, 1986). Bellhouse (1988a) interprets this timeline as a story of the great ideas that contributed to the development of survey sampling theory. Statistics is a peculiar science. With mathematics for tools, it allows the methodology of the other disciplines to be finalized. Because of the close correlation between a method and the multiplicity of its fields of action, statistics is based on a multitude of different ideas from the various disciplines in which it is applied.

The theory of survey sampling plays a preponderant role in the development of statistics. However, the use of sampling techniques has been accepted only very recently. Among the controversies that have animated this theory, we find some of the classical debates of mathematical statistics, such as the role of modeling and a discussion of estimation techniques. Sampling theory was torn between the major currents of statistics and gave rise to multiple approaches: design‐based, model‐based, model‐assisted, predictive, and Bayesian.

1.2 Enumerative Statistics During the 19th Century

In the Middle Ages, several attempts to extrapolate partial data to an entire population can be found in Droesbeke et al. (1987). In 1783, in France, Pierre Simon de Laplace (see 1847) presented to the Academy of Sciences a method to determine the number of inhabitants from birth registers using a sample of regions. He proposed to calculate, from this sample of regions, the ratio of the number of inhabitants to the number of births and then to multiply it by the total number of births, which could be obtained with precision for the whole population. Laplace even suggested estimating “the error to be feared” by referring to the central limit theorem. In addition, he recommended the use of a ratio estimator using the total number of births as auxiliary information. Survey methodology as well as probabilistic tools were known before the 19th century. However, never during this period was there a consensus about their validity.
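The calculation Laplace proposed can be illustrated with a short sketch. The function name and all figures below are hypothetical, chosen only to make the arithmetic visible; they are not Laplace's data. The sketch assumes that inhabitants and births are both counted in each sampled region and that the total number of births is known for the whole country.

```python
def laplace_ratio_estimate(sample_inhabitants, sample_births, total_births):
    """Estimate a population total with a ratio estimator.

    sample_inhabitants: inhabitants counted in each sampled region
    sample_births:      births registered in the same regions
    total_births:       births registered in the whole country (auxiliary total)
    """
    if len(sample_inhabitants) != len(sample_births):
        raise ValueError("each sampled region needs both counts")
    if sum(sample_births) == 0:
        raise ValueError("the sample must contain at least one birth")
    # Ratio of inhabitants to births observed in the sample of regions.
    ratio = sum(sample_inhabitants) / sum(sample_births)
    # Scale the known total number of births by that ratio.
    return ratio * total_births


# Hypothetical figures for three sampled regions (not historical data).
sample_inhabitants = [52_000, 31_000, 47_000]
sample_births = [2_000, 1_200, 1_800]
total_births = 1_000_000

estimate = laplace_ratio_estimate(sample_inhabitants, sample_births, total_births)
print(estimate)
```

With these invented numbers the sample ratio is 130,000 / 5,000 = 26 inhabitants per birth, so the estimated population is 26 times the total number of births.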

The development of statistics (etymologically, from German: analysis of data about the state) is inseparable from the emergence of modern states in the 19th century. One of the most outstanding personalities in the official statistics of the 19th century is the Belgian Adolphe Quételet (1796–1874). He knew of Laplace's method and maintained a correspondence with him. According to Stigler (1986, pp. 164–165), Quételet was initially attracted to the idea of using partial data. He even tried to apply Laplace's method to estimate the population of the Netherlands in 1824 (which Belgium was a part of until 1830). However, it seems that he then rallied to a note from Keverberg (1827) which severely criticized the use of partial data in the name of precision and accuracy:

In my opinion, there is only one way to arrive at an exact knowledge of the population and the elements of which it is composed: it is that of an actual and detailed enumeration; that is to say, the formation of nominative states of all the inhabitants, with indication of their age and occupation. Only by this mode of operation can reliable documents be obtained on the actual number of inhabitants of a country, and at the same time on the statistics of the ages of which the population is composed, and the branches of industry in which it finds the means of comfort and prosperity.¹

In one of his letters to the Duke of Saxe‐Coburg Gotha, Quételet (1846, p. 293) also advocates for an exhaustive statement:

La Place had proposed to substitute for the census of a large country, such as France, some special censuses in selected departments where this kind of operation might have more chances of success, and then to carefully determine the ratio of the population either at birth or at death. By means of these ratios of the births and deaths of all the other departments, figures which can be ascertained with sufficient accuracy, it is then easy to determine the population of the whole kingdom. This way of operating is very expeditious, but it supposes an invariable ratio passing from one department to another. […] This indirect method must be avoided as much as possible, although it may be useful in some cases, where the administration would have to proceed quickly; it can also be used with advantage as a means of control.²

It is interesting to examine the argument used by Quételet (1846, p. 293) to justify his position.

To not obtain the faculty of verifying the documents that are collected is to fail in one of the principal rules of science. Statistics is valuable only by its accuracy; without this essential quality, it becomes null, dangerous even, since it leads to error.³

Again, accuracy is considered a basic principle of statistical science. Despite the existence of probabilistic tools and despite various applications of sampling techniques, the use of partial data was perceived as a dubious and unscientific method. Quételet had a great influence on the development of official statistics. He participated in the creation of a section for statistics within the British Association for the Advancement of Science in 1833 with Thomas Malthus and Charles Babbage (see Horvàth, 1974). One of its objectives was to harmonize the production of official statistics. He organized the International Congress of Statistics in Brussels in 1853. Quételet was well acquainted with the administrative systems of France, the United Kingdom, the Netherlands, and Belgium. He has probably
