The second edition of Multilevel Modeling has been improved and expanded in ways too numerous to list in detail. However, the major changes in this new version are as follows:
Longitudinal methods are expanded and get their own new chapter.
Diagnostic procedures are expanded with an emphasis on influence statistics.
Coverage of models of counts (Poisson) has been added.
A short new section on power analysis has been added.
Cross-classified models are now discussed.
The coverage of centering has been updated to reflect current statistical knowledge and practices.
A new section has been added that makes recommendations for presenting modeling results.
A new support website has been developed for the book that provides the data and the statistical code (both R and Stata) used for all of the presented analyses.
I hope that with these changes, this book will remain useful and relevant for students and researchers for many years to come.
For their help in developing this second edition, I would particularly like to thank the following people, all of whom provided extremely detailed and helpful reviews of the earlier edition and of drafts of this second edition.
Edward Brent, Department of Sociology, University of Missouri
Brian V. Carolan, Department of Educational Foundations, Montclair State University
Timothy Ford, Department of Curriculum, Instruction, and Learning, University of Louisiana
Jennifer Hayes Clark, Department of Political Science, University of Houston
Changjoo Kim, Department of Geography, University of Cincinnati
David LaHuis, Department of Psychology, Wright State University
I dedicated the first edition of this book to my parents. I would like to dedicate this updated version to my daughter, Alina Luke. Alina was in first grade when I started work on the original volume. As time passes, daughters grow up. She recently received her MPH with a concentration in biostatistics and epidemiology, and is herself a skilled analyst with training in mixed-effects models. In fact, she has helped with this new edition by providing the Stata mixed-effects modeling code and by helping to develop the support website for the book. For all of this, I get to thank her for her professional skills and for the joy she brings to my life.
Chapter 1. The Need for Multilevel Modeling
Background and Rationale
When one considers almost any phenomenon of interest to social and health scientists, it is hard to overestimate the importance of context. For example, we know that the likelihood of developing depression is influenced by social and environmental stressors. The psychoactive effects of drugs can vary based on the social frame of the user. Early childhood development is strongly influenced by a whole host of environmental conditions: diet, amount of stimulation in the environment, presence of environmental pollutants, quality of relationship with mother, and so on. Physical activity is shaped by neighborhood environment; people who live in neighborhoods with sidewalks are much more likely to walk. The probability of teenagers engaging in risky behavior is related to being involved in structured activities with adult involvement. A child’s educational achievement is strongly affected by classroom, school, and school system characteristics.
These examples can be extended beyond situations in which individuals are influenced by their contexts. The likelihood of couples avoiding divorce is strongly related to certain types of religious and cultural backgrounds. Group decision-making processes can be influenced by organizational climate. Hospital profitability is strongly affected by reimbursement policies set by government and insurance companies.
What all these examples have in common is that characteristics or processes occurring at a higher level of analysis are influencing characteristics or processes at a lower level. Constructs are defined at different levels, and the hypothesized relations between these constructs operate across different levels. Table 1.1 illustrates the interdependence among levels of analysis, using an example from the area of tobacco control. Research programs on tobacco control exist at all levels of analysis, from the genetic up to the sociocultural and political (i.e., “from cells to society”). Moreover, although research can occur strictly within any of these levels, much of the most important research will look at the links between the levels. For example, as we learn more about the genetic basis of nicotine dependence, we may be able to tailor specific preventive interventions to particular genotypes.
Table 1.1
These types of multilevel theoretical constructs require specialized analytic tools to properly evaluate. These multilevel tools are the subject of this book.
Despite the importance of context, throughout much of the history of the health and social sciences, investigators have tended to use analytic tools that could not handle these types of multilevel data and theories. In earlier years, this was due to the lack of such tools. However, even after the advent of more sophisticated multilevel modeling approaches, practitioners have continued to use more simplistic single-level techniques (Luke, 2005).
Theoretical Reasons for Multilevel Models
The simplest argument, then, for multilevel modeling techniques is this: Because so much of what we study is multilevel in nature, we should use theories and analytic techniques that are also multilevel. If we do not do this, we can run into serious problems, including making incorrect causal claims.
For example, it is very common to collect and analyze health and behavioral data at the aggregate level. Epidemiologic studies have shown that in countries where fat is a larger component of the diet, the death rate from breast cancer is also higher (Carroll, 1975). It might seem reasonable to then assume that women who eat a lot of fat would be more likely to get breast cancer. However, this interpretation is an example of the ecological fallacy, where relationships observed in groups are assumed to hold for individuals (Freedman, 1999). Recent health studies, in fact, have suggested that the link between fat intake and breast cancer is not very strong at the individual level (Holmes et al., 1999).
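A small simulation can make the ecological fallacy concrete. The sketch below is written in Python and uses entirely invented data, not the data from the studies cited above. The number of groups, the sample sizes, the coefficients, and all variable names are arbitrary assumptions chosen for illustration. The outcome is constructed to depend only on a group-level context value and not on each individual's own exposure, so any strong relationship between the group means says nothing about individuals within a group.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


rng = random.Random(2024)  # fixed seed so the run is repeatable

N_GROUPS = 20      # hypothetical groups (e.g., countries)
N_PER_GROUP = 200  # hypothetical individuals sampled per group

group_mean_x, group_mean_y = [], []
within_x, within_y = [], []

for g in range(N_GROUPS):
    context = 20 + 2 * g  # invented group-level exposure

    # Individual exposures scatter around the group's context value.
    xs = [rng.gauss(context, 5) for _ in range(N_PER_GROUP)]

    # Assumption: the outcome depends on the group context only,
    # not on the individual's own exposure x.
    ys = [0.5 * context + rng.gauss(0, 5) for _ in range(N_PER_GROUP)]

    mx = sum(xs) / N_PER_GROUP
    my = sum(ys) / N_PER_GROUP
    group_mean_x.append(mx)
    group_mean_y.append(my)

    # Group-mean-centered values isolate the within-group relationship.
    within_x.extend(x - mx for x in xs)
    within_y.extend(y - my for y in ys)

group_r = pearson(group_mean_x, group_mean_y)
within_r = pearson(within_x, within_y)

print(f"correlation of group means:      {group_r:.2f}")
print(f"pooled within-group correlation: {within_r:.2f}")
```

Under these assumptions, the correlation between the group means should be close to 1, while the pooled within-group correlation should be close to 0. An analyst who saw only the aggregate data and concluded that individuals with higher exposure have higher outcomes would be committing the ecological fallacy.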
This type of problem can also work the other way. It is very common in the behavioral sciences to collect data from individuals and then aggregate the data to gain insight into the groups to which those individuals belong. This can lead to the atomistic fallacy, where inferences about groups are incorrectly drawn from individual-level information (Diez-Roux, 1998). It is possible to assess ecological characteristics successfully from individual-level data; for example, see Moos’s (1996) work on social climates. However, as Shinn and Rapkin (2000) have argued, this approach is fraught with danger, and a much more valid approach is to assess group and ecological characteristics using group-level measures and analytic tools.
It is useful here to consider the sociological distinction between properties of collectives and members (Lazarsfeld & Menzel, 1961). Members belong to collectives, but various properties (variables) of both collectives and their members may be measured and analyzed at the same time. Lazarsfeld and Menzel identify analytical, structural, and global properties of collectives. Analytical