If we survey a pond in order to look at the animals and their relationships with several physical, chemical, and/or biological factors, then no matter how many replicates we take, we are merely describing what happens in a single entity (i.e. this one pond). Such a study does not tell us anything about pond ecology in general, and the use of such replicates is termed pseudoreplication and should be avoided (Hurlbert 1984; van Belle 2002). In order to broaden our approach and gain more of an understanding of ponds in general, we would need to study a large number of separate ponds. Thus, studies of single sites or small parts of sites may not reveal information applicable to the wider ecological context.
In some situations, the data collected are linked to each other by design. For example, we might be interested in comparisons of matched data (e.g. examining the animals found on cabbages before and after the application of fertiliser or pesticide, or the numbers of mayfly larvae found above and below storm drain outflows into a series of streams). These designs can be perfectly sound, but because the data are matched (by cabbage or by stream) we require a slightly different approach to the resulting analysis (see Chapter 5).
When designing your sampling strategy, it is important to consider how variable the system is, and whether the timing or order of sampling might bias the result by capturing only part of that variation. For example, sampling the insects present on thistle flower heads will be biased if all the data are collected in the early morning, since this will miss any animals that are active later in the day. If two areas are being compared, sampling one site early and the other later will introduce another variable into the comparison: we would not just be looking at the two sites, but also at two times of day. Since it would be impossible to separate the two variables, it would be difficult to draw conclusions from such a survey design. In this example we would say that the findings were ‘biased by time of day’. It is in managing some of this variability that experiments come into their own, because they standardise as far as possible the conditions under which the subjects are examined, thus reducing bias. In an experiment it is much easier to manipulate only one factor (also known as the treatment) whilst holding all others constant. However, if we wished to survey a real‐life situation (as opposed to examining a rather more artificial experimental design) then we would need to take the time of day into account. We could do this by alternating the measurements or observations that we took from our two sites, sampling first one then the other, then back to the first, and so on, to get a spread of measurements for each site over the day. Alternatively, we could sample on successive days, reversing the order in which we sampled the sites on each day, or make the case for additional fieldwork assistance, which would make a balanced study easier to implement.
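The two scheduling options described above can be sketched in a few lines of Python. This is only an illustration: the function names and the site labels "A" and "B" are hypothetical, and a real schedule would also need to allow for travel time between sites.

```python
def alternating_schedule(sites, visits_per_site):
    """Interleave visits (A, B, A, B, ...) so that each site is
    sampled across the whole day rather than in one block."""
    total_visits = visits_per_site * len(sites)
    return [sites[i % len(sites)] for i in range(total_visits)]


def reversed_order_by_day(sites, n_days):
    """Give the visiting order for each day, reversing it on
    every successive day so no site is always sampled first."""
    schedule = {}
    for day in range(1, n_days + 1):
        if day % 2 == 1:
            schedule[day] = list(sites)            # odd days: original order
        else:
            schedule[day] = list(reversed(sites))  # even days: reversed order
    return schedule


# Hypothetical two-site survey
print(alternating_schedule(["A", "B"], visits_per_site=3))
print(reversed_order_by_day(["A", "B"], n_days=4))
```

Either layout spreads each site's measurements over the range of sampling times, so that a difference between sites is less likely to be a difference between times of day.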
Table 1.2 Random numbers. Coordinates can be extracted simply by taking pairs of random numbers in sequence from the table (e.g. 23, 85 – shaded values – provides the position within a sampling area where we would take the first measurement of a series).
23 | 85 | 56 | 84 | 92 | 4 |
62 | 51 | 27 | 74 | 83 | 84 |
56 | 32 | 87 | 75 | 95 | 5 |
87 | 7 | 20 | 30 | 25 | 12 |
99 | 86 | 29 | 41 | 29 | 39 |
31 | 73 | 30 | 73 | 27 | 97 |
24 | 38 | 91 | 16 | 17 | 66 |
94 | 59 | 12 | 17 | 37 | 39 |
41 | 67 | 25 | 42 | 2 | 84 |
32 | 67 | 48 | 99 | 74 | 3 |
68 | 1 | 59 | 20 | 25 | 7 |
There are several sampling layouts that help us to avoid bias. One commonly used approach is random sampling. Here, a random sequence is used to determine the order in which to sample plants, or the coordinates at which to sample experimental plots or survey sites. Hence, if we wanted to randomly sample 1 m × 1 m quadrats in a field, we could position them using pairs of random numbers as coordinates (Figure 1.4a), generated with a calculator or computer or obtained from a table (see Table 1.2). For example, given the pair 23 and 85 in a sampling grid that is 10 m by 10 m, we would place our quadrat 2.3 m along the base and 8.5 m up the vertical axis. In our example above, of insects on thistle flowers, random sampling may also be used to determine which site is visited first: here sites would be allocated number codes that are then selected randomly from the table.
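Where a computer is used instead of a printed table, the same procedure might look like the following Python sketch. It assumes, as in the worked example, that the random numbers run from 0 to 99 and that the grid is 10 m by 10 m. The function names are hypothetical, and the first row of Table 1.2 is used as input.

```python
import random


def pairs_to_coordinates(numbers, grid_size_m=10.0, max_value=100):
    """Convert a sequence of random numbers (0 to max_value - 1) into
    (x, y) quadrat positions in metres, taking the numbers in pairs."""
    coords = []
    # Step through two at a time: the first number gives the distance
    # along the base, the second the distance up the vertical axis.
    for i in range(0, len(numbers) - 1, 2):
        x = numbers[i] * grid_size_m / max_value
        y = numbers[i + 1] * grid_size_m / max_value
        coords.append((x, y))
    return coords


def random_quadrat_positions(n, grid_size_m=10.0, seed=None):
    """Generate n random quadrat positions without a printed table.
    Passing a seed makes the sequence repeatable."""
    rng = random.Random(seed)
    numbers = [rng.randrange(100) for _ in range(2 * n)]
    return pairs_to_coordinates(numbers, grid_size_m)


# First row of Table 1.2: the pair (23, 85) gives the first position
first_row = [23, 85, 56, 84, 92, 4]
for x, y in pairs_to_coordinates(first_row):
    print(f"{x:.1f} m along the base, {y:.1f} m up the vertical axis")
```

Recording the seed alongside the field notes means that the same set of positions can be regenerated later if the layout needs to be checked.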
Figure 1.4 Examples of sampling designs. (a) Random sampling; (b) systematic sampling; (c) stratified random sampling.
Although random sampling is often appropriate for selecting sampling points, where there is a great deal of variation across a sampling unit such as a site, by chance the coverage may not include all of the heterogeneity present. For example, in Figure 1.4a, the two squares in the lower right of the sampling site have no sampling points. If the site was reasonably homogeneous, then this would not