Figure 2.7. Effect of decoupling biogeographic inference from time in DEC+J. Two phylogenies with identical topology and tip distributions, but with internal branches elongated or shortened by half. The software BioGeoBEARS was used to infer ancestral ranges and rates of dispersal (d) and extinction (e) under the DEC and DEC+J models; the latter includes a parameter, j, for cladogenetic jump dispersal (“founder speciation”). a) DEC (short branches): d=0.0542, e=0.0436, j=0; LnL=−7.75. b) DEC (long branches): d=0.0289, e=0.0375, j=0; LnL=−8.94. c) DEC+J (short branches): d=0, e=0, j=0.4265; LnL=−3.99. d) DEC+J (long branches): d=0, e=0, j=0.4265; LnL=−3.99. Note that the DEC reconstruction for the most basal nodes changes with the branch lengths, whereas the DEC+J reconstruction does not. Only the range with the highest relative likelihood is shown; the maximum number of areas in widespread ranges was constrained to two. LnL: model log-likelihood. For a color version of this figure, see www.iste.co.uk/guilbert/biogeography.zip
Regarding the BIB model, extensions have either introduced species-specific rates of geographic movement or implemented procedures for reducing the size of the Q matrix. The original BIB model was used to infer patterns of colonization in oceanic (Sanmartín et al. 2008) or continental (Sanmartín et al. 2010) islands. It implemented a hierarchical Bayesian approach in which relative dispersal rates between islands and island carrying capacities were estimated from phylogenetic and distribution data from multiple, co-distributed island lineages. Phylogenetic and biogeographic parameters were estimated simultaneously from species DNA sequence data and their associated geographic distributions, while allowing each species to have its own rates of molecular and biogeographic (dispersal) evolution. This hierarchical, species-partitioned approach allows researchers to infer general, broad-scale patterns of island colonization while accounting for (marginalizing over) organism-specific differences in rates of molecular evolution, age of origin or dispersal ability. The BIB model was subsequently implemented in an epidemiological context to study patterns of viral spread (Lemey et al. 2009). These authors also extended the BIB model to include a stepwise regression approach, Bayesian stochastic search variable selection, to identify those transition rates or dispersal pathways in the CTMC Q matrix that are better supported by the data (Lemey et al. 2009). This BIB extension has also been used to infer migration patterns at the population, within-species level (Mairal et al. 2015). Other extensions of BIB have made dispersal rates dependent on external factors or predictors (Faria et al. 2013), or have allowed the inferred dispersal pathways to differ across taxa (Cybis et al. 2013).
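The way relative dispersal rates and carrying capacities combine into a CTMC Q matrix can be illustrated with a minimal sketch. It assumes a GTR-like parameterization, in which the rate of movement from island i to island j is the product of an overall dispersal rate, a symmetric relative dispersal rate and the relative carrying capacity of the destination island. The function name, the three-island setup and all numerical values are hypothetical and serve only as an illustration; this is not the code of any published implementation.

```python
def build_bib_q(carrying_capacity, rel_dispersal, overall_rate=1.0):
    """Build a Q matrix with q_ij = overall_rate * r_ij * pi_j for i != j.

    Assumption: GTR-like form, where pi_j is the relative carrying capacity
    of island j and r_ij is a symmetric relative dispersal rate.
    """
    n = len(carrying_capacity)
    total = float(sum(carrying_capacity))
    # Normalize carrying capacities so that they sum to one.
    pi = [c / total for c in carrying_capacity]
    q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                q[i][j] = overall_rate * rel_dispersal[i][j] * pi[j]
        # Diagonal entries make each row sum to zero.
        q[i][i] = -sum(q[i])
    return q, pi


# Hypothetical three-island example (illustrative values only).
capacities = [5.0, 3.0, 2.0]
rel_rates = [[0.0, 1.0, 2.0],
             [1.0, 0.0, 0.5],
             [2.0, 0.5, 0.0]]
q_matrix, pi = build_bib_q(capacities, rel_rates, overall_rate=0.1)
for row in q_matrix:
    print(["%.4f" % x for x in row])
```

Under this parameterization, the normalized carrying capacities are the stationary frequencies of the process, which is why they can be read as the expected proportion of lineages on each island at equilibrium.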
The applications of BIB in epidemiology and phylogeography are probably among the most popular uses of the model at present. In these fields, BIB is termed discrete trait analysis (DTA) or the “mugration” model because it equates migration with mutation events (De Maio et al. 2015). Although treating migration events as instantaneous mutations in a sequence might be acceptable at geological time scales and at the species level, as was done in the original BIB (Sanmartín et al. 2008), it can be more problematic under the coalescent process, a model used at short time scales and at the population level for building phylogenetic relationships (De Maio et al. 2015). Subsequent authors have extended the BIB-DTA model to account for geographically structured populations under the coalescent process (De Maio et al. 2015; Muller et al. 2017).
2.5. Expanding parametric models
2.5.1. Time-heterogeneous models
BIB and DEC models assume constant rates of dispersal and extinction as part of the CTMC process governing range evolution. As with molecular evolutionary models, these assumptions have been relaxed by allowing rates to vary over time and across lineages, giving rise to the so-called time-heterogeneous CTMC models. In the case of BIB, Bielejec et al. (2014) extended the DTA model to allow the overall dispersal rate to vary across time slices in a stratified phylogeny; they used a piecewise-constant stochastic process in which rates of migration are constant within a given time slice but change between time slices. The temporal boundary (breakpoint) between two time slices may be estimated from the phylogenetic and distribution data alongside the biogeographic parameters.
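The piecewise-constant idea can be sketched numerically: when a branch crosses a breakpoint, its transition probability matrix is the product of the matrices for each segment, each computed with the rate of its own time slice. The sketch below is a minimal illustration in plain Python and is not taken from any published implementation. The function names, the two-area symmetric model and the rate values are hypothetical, and the matrix exponential is approximated with a simple scaling-and-squaring Taylor series that is adequate only for small matrices.

```python
import math


def mat_mul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def mat_exp(m, squarings=10, terms=12):
    """Approximate exp(m) by a Taylor series with scaling and squaring."""
    n = len(m)
    scale = 2.0 ** squarings
    scaled = [[x / scale for x in row] for row in m]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in mat_mul(term, scaled)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


def piecewise_transition_probs(q_unit, segments):
    """Transition probabilities along a branch crossing time slices.

    segments: list of (rate, duration) pairs ordered from oldest to
    youngest; the rate is constant within each segment.
    """
    n = len(q_unit)
    p = [[float(i == j) for j in range(n)] for i in range(n)]
    for rate, duration in segments:
        qt = [[x * rate * duration for x in row] for row in q_unit]
        p = mat_mul(p, mat_exp(qt))
    return p


# Hypothetical two-area symmetric model; the branch spends 3 time units in
# an old slice (rate 0.1) and 2 time units in a young slice (rate 0.4).
q_unit = [[-1.0, 1.0], [1.0, -1.0]]
segments = [(0.1, 3.0), (0.4, 2.0)]
p = piecewise_transition_probs(q_unit, segments)

# For this symmetric model the result has a closed form to compare against.
closed_form = 0.5 + 0.5 * math.exp(-2.0 * (0.1 * 3.0 + 0.4 * 2.0))
print(p[0][0], closed_form)
```

In practice, a numerical library routine for the matrix exponential would normally replace the hand-written approximation used here.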
A similar approach was implemented in the time-stratified, “epoch” DEC model (Ree and Smith 2008; Landis 2017): the phylogeny is divided into time intervals, and each interval is assigned a different set of values that scale the baseline dispersal rate according to paleogeographic information, for example, the availability of temporary land bridges facilitating migration between continents (Buerki et al. 2011). Rather than assuming a single CTMC process over time, DEC is allowed to shift between different Q matrices at discrete time points, based on paleogeographic evidence. Time-stratified DEC models can also be used in biogeographic dating (Landis 2017): phylogeny, divergence times and biogeographic parameters are jointly estimated using hierarchical BI. Paleogeographic data, that is, the formation of dispersal corridors and barriers over time, are used to inform the rates of a piecewise-constant epoch DEC model, and the resulting time-dependent CTMC probabilities are used in turn to inform estimates of species divergence times in the phylogeny; for example, species can only diverge in allopatry if a paleogeographic barrier is present (Landis 2017).
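The scaling of the baseline dispersal rate by epoch can be made concrete with a small sketch of the anagenetic DEC rate matrix. It assumes the standard DEC parameterization, in which a range gains an unoccupied area at a rate summed over all occupied source areas and loses an occupied area at the extinction rate. The function name, the two-area setup, the epoch labels and the multiplier values are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def dec_q_matrix(areas, d, e, multipliers):
    """Anagenetic DEC rate matrix over all subsets of `areas`.

    multipliers[(src, dst)] scales the baseline dispersal rate d for
    movement from src to dst; missing pairs default to 1.0.
    """
    ranges = [frozenset(c) for k in range(len(areas) + 1)
              for c in combinations(areas, k)]
    index = {r: i for i, r in enumerate(ranges)}
    n = len(ranges)
    q = [[0.0] * n for _ in range(n)]
    for r in ranges:
        i = index[r]
        for a in areas:
            if a in r:
                # Local extinction: the range loses area a.
                q[i][index[r - {a}]] += e
            elif r:
                # Range expansion into a from every occupied source area.
                q[i][index[r | {a}]] += sum(
                    d * multipliers.get((src, a), 1.0) for src in r)
        # The empty range is absorbing, so its row stays at zero.
        q[i][i] = -sum(q[i])
    return ranges, q


# Hypothetical epochs: dispersal is unscaled while a land bridge exists and
# reduced tenfold once it disappears.
areas = ["A", "B"]
epochs = [
    ("land bridge present", {("A", "B"): 1.0, ("B", "A"): 1.0}),
    ("land bridge absent", {("A", "B"): 0.1, ("B", "A"): 0.1}),
]
epoch_q = {}
for label, mult in epochs:
    ranges, q = dec_q_matrix(areas, d=0.05, e=0.02, multipliers=mult)
    epoch_q[label] = q
    print(label, [["%.3f" % x for x in row] for row in q])
```

Each epoch thus has its own Q matrix built from the same baseline rates, and only the multipliers carry the paleogeographic information.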
Another exciting approach is the use of non-stationary CTMC models, in which the equilibrium frequencies are allowed to change at discrete time points between time slices (Sanmartín 2020). Changes in area carrying capacities could result from a global extinction event that wipes out the biota of an island, decreasing its standing carrying capacity and thus changing the stationary properties of the CTMC dispersal process. Both the point in time at which the equilibrium frequencies change and the intensity of the extinction event (which might vary between areas) can be estimated by BI (Sanmartín 2020). Alternatively, the CTMC process may never attain equilibrium, or may start with different values at the root, as in a directional CTMC process (Klopfstein et al. 2015).
2.5.2. Diversification-dependent models
The most recent developments in parametric biogeography have focused on implementing “state-dependent speciation and extinction (SSE) models”, in which there is a causal relationship between range evolution and lineage diversification (Maddison et al. 2007; Goldberg et al. 2011; FitzJohn 2012). As explained above, BIB and DEC do not include a speciation parameter in the stochastic CTMC process that governs geographic evolution. This is unrealistic, since diversification and range evolution clearly interact: for example, the dispersal of a species into a new region may result in increased speciation rates due to lower competition or access to novel environmental resources (Moore and Donoghue 2007). Moreover, unlike the DEC model, SSE models provide a complete parametric description of biogeographic evolution, since speciation is a rate parameter in the CTMC process. In the geographic state speciation and extinction model (GeoSSE; Goldberg et al. 2011), the Q matrix includes parameters for anagenetic range expansion and range contraction or extinction, as well as parameters for lineage speciation within single areas (SA, SB) or within a widespread range (SAB). There is also a parameter for lineage extinction within single areas (EA, EB): for widespread ranges, this is modeled as the sum of extinction events in single areas. All these parameters are time-dependent. The SSE counterpart of DEC+J is the ClaSSE model (Goldberg and Igic 2012), which allows changes in states to occur not only along branches (anagenetic) but also at speciation nodes (cladogenetic): this “founder-speciation” event is governed by its own time-dependent rate parameter in the Q matrix.
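The events available to a lineage in each GeoSSE state can be listed in a short sketch. It follows one reading of the two-area parameterization described above, using the speciation (SA, SB, SAB) and extinction (EA, EB) parameters from the text; the range expansion parameters d_a and d_b, the function names and all numerical values are hypothetical. The sketch only tabulates per-lineage event rates and does not compute likelihoods.

```python
def geosse_event_rates(state, s_a, s_b, s_ab, e_a, e_b, d_a, d_b):
    """Events and rates for a lineage in state 'A', 'B' or 'AB'.

    Assumption: in the widespread state, extinction acts as range
    contraction (loss of one area at a time).
    """
    if state == "A":
        return {
            "speciation within A (A -> A + A)": s_a,
            "extinction in A (lineage lost)": e_a,
            "range expansion (A -> AB)": d_a,
        }
    if state == "B":
        return {
            "speciation within B (B -> B + B)": s_b,
            "extinction in B (lineage lost)": e_b,
            "range expansion (B -> AB)": d_b,
        }
    if state == "AB":
        return {
            "speciation within A (AB -> AB + A)": s_a,
            "speciation within B (AB -> AB + B)": s_b,
            "speciation between areas (AB -> A + B)": s_ab,
            "local extinction in A (AB -> B)": e_a,
            "local extinction in B (AB -> A)": e_b,
        }
    raise ValueError("state must be 'A', 'B' or 'AB'")


def next_event_probs(rates):
    """Probability that each event is the next one to occur."""
    total = sum(rates.values())
    return {event: rate / total for event, rate in rates.items()}


# Hypothetical parameter values, for illustration only.
params = dict(s_a=0.2, s_b=0.1, s_ab=0.05, e_a=0.03, e_b=0.02,
              d_a=0.04, d_b=0.01)
rates_ab = geosse_event_rates("AB", **params)
probs_ab = next_event_probs(rates_ab)
for event, prob in probs_ab.items():
    print("%-40s %.3f" % (event, prob))
```

Listing the events this way makes explicit that speciation and range evolution share one stochastic process, which is the feature that distinguishes SSE models from BIB and DEC.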
Coupling