About the Companion Site
This book is accompanied by a companion website:
www.wiley.com/go/evidencebasedstatistics
The website includes materials for students (open access):
R statistical code for likelihood ratio and support calculations
Answers
Introduction
Likelihood is the central concept in statistical inference. Not only does it lead to inferential techniques in its own right, but it is as fundamental to the repeated-sampling theories of estimation advanced by the ‘classical’ statistician as it is to the probabilistic reasoning advanced by the Bayesian.
Thus begins Edwards's remarkable book on Likelihood [1].
Fisher was responsible for much of the fundamental theory underlying the modern use of statistics. He developed methods of estimation and significance testing but also, according to Edwards [1, p. 3] ‘quietly and persistently espoused an alternative measure by which he claimed rival hypotheses could be weighed. He called it likelihood …’. Neyman and Pearson were drawn to the use of the likelihood ratio, stating ‘…there is little doubt that the criterion of likelihood is one which will assist the investigator in reaching his final judgement’ [2]. Eventually they turned away from using it, when they realized that it would not allow them to estimate the Type I error probability necessary for frequentist statistics. Edwards is not alone when he laments in his 1992 preface ‘Nevertheless, likelihood continues to be curiously neglected by mathematical statisticians’ [1].
Richard Dawkins (biologist and author) once said ‘Evidence is the only good reason to believe anything’. However, ‘evidence’ has become an overused buzzword, appropriated in expressions like ‘evidence-based education’. Attached to statements of policy or practice, it is no doubt intended to enhance or validate those endeavours. Often, ‘evidence-based’ statements appear to refer to statistics as providing the evidence. However, we are in the curious situation where the two most popular statistical approaches do not actually quantify evidence: Bayesian and frequentist statistics provide probabilities rather than any weight of evidence. The lesser-known likelihood approach is alone in providing objective statistical evidence. All three approaches were developed in Britain (specifically England), yet only the likelihood approach provides admissible evidence in British courts of law.
Many excellent texts in applied statistics mention likelihood, since it is a key concept in statistical inference. Despite this, few give practical examples to demonstrate its use, and none is available at the introductory level to explain, step by step, how to perform likelihood ratio calculations for the many different types of statistical analysis: comparisons of means, associations between variables, categorical data analyses, and nonparametric analyses. The current text is an attempt to fill this gap. It is assumed that the reader has some basic knowledge of statistics, perhaps from an introductory university or school course. Otherwise, the reader can consult any one of a large number of excellent texts and online resources.
John Tukey, a mathematician who made huge contributions to statistical methodology, once said: ‘Far better an approximate answer to the right question, which is often vague, than an exact answer to the wrong question, which can always be made precise’ [3]. A p value provides an exact answer, but often to the wrong question.
For historical reasons, likelihoods and their ratios will probably not replace analyses using other approaches, especially the well-entrenched p value. However, the likelihood approach can supplement or complement other approaches. For some, it will add another instrument to their statistical bag of tricks.
References
1 Edwards AWF. Likelihood. Baltimore: Johns Hopkins University Press; 1992.
2 Neyman J, Pearson ES. On the use and interpretation of certain test criteria for purposes of statistical inference: part I. Biometrika. 1928; 20A(1/2):175–240.
3 Tukey JW. The future of data analysis. The Annals of Mathematical Statistics. 1962; 33(1):1–67.
1 The Evidence is the Evidence
It is the simple suggestion that the only valid reason for rejecting a statistical hypothesis is that some alternative hypothesis explains the observed events with a greater degree of probability.
—E.S. Pearson on receiving a letter from W.S. Gosset [2, p. 242]
1.1 Evidence-Based Statistics
Science advances from evidence, and scientific evidence guides decision-making, practice, and policy. Evidence-based practice encompasses numerous fields: policy, design, management, medicine, education, etc. In medicine, practitioners and patients alike rightly demand and expect that treatments used are evidence-based. To say that the use of a particular therapy is evidence-based means that it has sufficient evidence to support the benefit of its use compared with other possible treatments.
In science, data is obtained in many different ways, depending on the methodology, which is often dictated by constraints peculiar to the research area. Data can provide evidence at a number of different levels: it may be anecdotal, or it may come from observational or experimental studies. Anecdotal evidence is regarded as the weakest, although it may be the starting point for more systematic research. At the next level, multiple observations provide observational evidence, which is usually correlational in nature. A carefully designed study, such as a randomized controlled trial, can provide causal evidence for the effectiveness of a treatment. Finally, evidence from many research studies may be combined by carrying out meta-analyses and systematic reviews. Each level in the pyramid of evidence has its advantages and drawbacks.
Appropriate statistical practice is fundamental to doing good science. This book is different from most statistical texts: it is an introduction to the likelihood approach and provides practical instructions on how to convert data into statistical evidence. The likelihood approach is fully objective, producing statistical results that depend only on the observed data. As Taper and Lele put it, ‘…the use of the likelihood ratio as an evidence measure is that only the models and the actual data are involved. This is quite different from the classical frequentist and error-statistical approaches, where the strength of evidence is the probability of making an error, calculated over all possible configurations of potential data’ [1, p. 538].
The likelihood approach encompasses a range of techniques grounded in established statistical theory. These techniques allow us to express relative evidence as a ratio of likelihoods. The phrases evidential approach and likelihood approach will be used interchangeably. Using the evidential approach frees us from dependence on the subjective considerations that bedevil other approaches. Based only upon observed evidence, it always informs us correctly about the relative strength of evidence for one hypothesis versus another.
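To make the idea of relative evidence concrete, here is a minimal sketch of a likelihood ratio calculation. It uses Python rather than the companion site's R code, and the data (7 successes in 10 trials) and the two hypothesized proportions are made-up illustrative values, not an example from this book.

```python
from math import comb, log

def binom_lik(p, k, n):
    """Binomial likelihood of k successes in n trials when the success probability is p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Hypothetical data: 7 successes observed in 10 trials.
k, n = 7, 10

# Relative evidence for H1: p = 0.7 versus H2: p = 0.5 is their ratio of likelihoods.
lr = binom_lik(0.7, k, n) / binom_lik(0.5, k, n)

# The natural logarithm of the likelihood ratio gives the support for H1 over H2.
support = log(lr)

print(round(lr, 3), round(support, 3))
```

Note that the binomial coefficient comb(n, k) is the same under both hypotheses, so it cancels in the ratio: the evidence comparison depends only on the two likelihoods evaluated at the observed data.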
A fuller discussion of the difficulties