2 Disclosure Limitation and Confidentiality Protection in Linked Data
John M. Abowd¹, Ian M. Schmutte², and Lars Vilhuber³
¹ U.S. Census Bureau and Cornell University, Suitland, MD, USA
² University of Georgia, Athens, GA, USA
³ Department of Economics and Executive Director of Labor Dynamics Institute (LDI) at Cornell University, Ithaca, NY, USA
2.1 Introduction
The use of administrative data has long been a part of the procedures at national statistical offices (NSOs), as evidenced in the various chapters in this book. The censuses and surveys conducted by NSOs may use sampling frames built at least partially from administrative data. For instance, the U.S. Census Bureau has used a business register – a list of all domestic businesses – derived from administrative tax filings since at least 1968. This register is the frame for its quinquennial censuses and annual surveys of business activity (DeSalvo, Limehouse, and Klimek 2016). It is also used to link businesses across surveys, to link surveyed businesses to other administrative record data, and as a direct source of statistical information on the levels and growth of business activity, published as the County Business Patterns (CBP) and Business Dynamics Statistics (BDS).1 Similar examples can be found in most countries that maintain some kind of registry for their businesses. In many countries, similar centrally maintained registers are used as frames for censuses and surveys of a country’s inhabitants and workers. Chapter 17 illustrates the Swedish approach to this problem for a national population census.2 The Institute for Employment Research (IAB), the research institute of the German Employment Agency, uses social security notifications filed by firms, and data generated from the administration of its mandated programs, to sample firms and workers. McMaster University and later Statistics Canada used administrative job termination notifications (“record of employment”) filed by employers to survey departing employees for the Canadian Out-of-Employment Panel (COEP) (Browning, Jones, and Kuhn 1995). Other uses of administrative data in NSOs include linkage for quality purposes (Chapters 8, 14, and 15), and data augmentation (Chapter 12 for the National Center for Health Statistics [NCHS] approach).
In addition, the increasing computerization of administrative records has facilitated more extensive linking of previously disconnected administrative databases, creating more comprehensive information. Methods to link databases within administrative units based on common identifiers are easy to implement (see Chapter 9 for more details). In the United States, which does not have a legal national identifier or ID document, the increased use of the Social Security Number (SSN) has facilitated linkage of government databases and linkage among commercial data providers. In many European countries, individuals have national identifiers, and efforts are underway to allow for cross-border linkages within the European Union, in order to improve statistics on the workforce and the businesses of its common economic area. However, even when common identifiers are not available, linkage is possible (see Chapter 15).
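As a rough illustration of why linkage on a common identifier is straightforward, the following minimal sketch performs an exact-match join of two small record sets. It is not the procedure described in Chapter 9; the function name link_on_identifier, the field names (person_id, earnings, benefit_weeks), and the toy records are all hypothetical and chosen only for this example.

```python
from typing import Any, Dict, List

Record = Dict[str, Any]


def link_on_identifier(left: List[Record], right: List[Record], key: str) -> List[Record]:
    """Exact-match (deterministic) linkage of two record lists on a shared identifier.

    Hypothetical helper for illustration only. Records whose identifier
    appears in just one source are dropped (an inner join).
    """
    # Index the right-hand source by identifier so each lookup is constant time.
    index: Dict[Any, List[Record]] = {}
    for rec in right:
        index.setdefault(rec[key], []).append(rec)

    linked: List[Record] = []
    for rec in left:
        for match in index.get(rec[key], []):
            # Combine the fields of both sources into one linked record.
            linked.append({**match, **rec})
    return linked


# Toy data (assumed, not from any real register).
tax_records = [
    {"person_id": "A001", "earnings": 52000},
    {"person_id": "A002", "earnings": 31000},
    {"person_id": "A004", "earnings": 47000},
]
benefit_records = [
    {"person_id": "A001", "benefit_weeks": 0},
    {"person_id": "A002", "benefit_weeks": 12},
    {"person_id": "A003", "benefit_weeks": 4},
]

linked = link_on_identifier(tax_records, benefit_records, "person_id")
for row in linked:
    print(row)
```

Only the two identifiers present in both sources produce linked records. The same ease of combination is what makes the linked data richer and, at the same time, more sensitive from a confidentiality standpoint.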
The result has been that data on individuals, households, and businesses have become richer, collected from an increasing variety of sources, including designed surveys and censuses as well as organically created “administrative” data. The desire to allow policy makers and researchers to leverage the rich linked data has been held back, however, by the concerns of citizens and businesses about privacy. In the 1960s in the United States, researchers proposed a “National Data Bank” with the goal of combining survey and administrative data for use by researchers. Congress held hearings on the matter, and ultimately the project did not go forward (Kraus 2013). Instead, and partially as a consequence, privacy laws were formalized in the 1970s. The U.S. “Privacy Act” (Public Law 93-579, 5 U.S.C. § 552a), passed in 1974,