In the next section, we will describe two advanced methods for the supervised classification of multisource satellite image time series. These methods have the advantage of being applicable to series of two or more images taken by single or multiple sensors, operating at the same or different spatial resolutions, and with the same or different radar frequencies and spectral bands. In general, the available images in the series are temporally and spatially correlated. Indeed, temporal and spatial contextual constraints are unavoidable in multitemporal data interpretation. Within this framework, Markov models provide a convenient and consistent way of modeling context-dependent spatio-temporal entities originating from multiple information sources, such as images in a multitemporal, multisensor, multiresolution and multimission context.
1.2. Methodology
1.2.1. Overview of the proposed approaches
Let us consider a time series $\mathcal{Y}^1, \mathcal{Y}^2, \ldots, \mathcal{Y}^K$ composed of K images, acquired over the same area on K acquisition dates, by up to K different optical and SAR sensors. Each image in the series is generally composed of multiple features (i.e. it is vector-valued), possibly corresponding to distinct spectral bands or radar polarizations. Specifically, $y^k_{pq}$ indicates the feature vector of pixel (p, q) in the k-th image $\mathcal{Y}^k$ in the series. In general, each sensor may operate at a distinct spatial resolution; hence, a multisensor and multiresolution time series is being considered. The acquisition times of the images in the series are assumed to be close enough that no significant changes occur in the land cover of the observed area. In particular, we assume that no abrupt changes (e.g. due to natural disasters such as floods or earthquakes) occur within the overall time span of the series. This assumption makes it possible to use the whole time series to classify the land cover in the scene, benefiting from the complementary properties of the images acquired by different sensors and at different spatial resolutions. Furthermore, this assumption may be especially relevant when the temporal dynamic of the ground scene per se is an indicator of land cover membership, such as in the case of forested (e.g. deciduous vs. evergreen) or agricultural areas.
We denote as $\omega_1, \omega_2, \ldots, \omega_M$ the land cover classes in the scene and as $\Omega = \{\omega_1, \omega_2, \ldots, \omega_M\}$ their set. We operate in a supervised framework; hence, we assume that training samples are available for all of these classes. The overall formulation introduced in Hedhli et al. (2016) to address multitemporal fusion in the case of single-sensor imagery, based on multiple quad-trees in cascade, is generalized here to benefit from the images acquired by different sensors and from their mutual synergy. The multiscale topology of quad-trees and of hierarchical MRFs defined on quad-trees allows multiresolution and multisensor data to be naturally fused in the land cover mapping process.
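As an illustration of the data layout just described, the following is a minimal sketch of a container for a multisensor, multiresolution time series. The class name, field names and toy values are purely hypothetical and are not part of the methods described in this chapter; they only make the indexing $y^k_{pq}$ concrete.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class SeriesImage:
    """One image of the series (hypothetical container, for illustration).

    `data` has shape (height, width, n_features): the features may be
    spectral bands (optical sensor) or polarization channels (SAR).
    """
    data: np.ndarray       # (H_k, W_k, C_k) array of per-pixel feature vectors
    sensor: str            # sensor label, e.g. optical VNIR or SAR
    resolution_m: float    # spatial resolution (ground sample distance, meters)


# A toy series of K = 2 co-registered images at different resolutions
# (all-zero arrays stand in for real acquisitions):
series = [
    SeriesImage(np.zeros((128, 128, 4)), "optical-VNIR", 2.0),
    SeriesImage(np.zeros((64, 64, 2)), "SAR", 4.0),
]

# Feature vector y^k_pq of pixel (p, q) in the k-th image:
k, p, q = 0, 10, 20
y_kpq = series[k].data[p, q]
```

Note that the two images deliberately have different shapes and channel counts: the methods described here fuse such heterogeneous inputs rather than requiring them to be resampled to a common grid beforehand.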
In this framework, two specific algorithms are defined. In the first one, the k-th image in the series is assigned to a separate quad-tree based on its own spatial resolution. A hierarchical MRF is defined on this quad-tree topology, and inference on the resulting probabilistic graphical model is addressed using the Bayesian marginal posterior mode (MPM) criterion (Kato and Zerubia 2012). In the second proposed algorithm, the focus is on a specific case of multimission, multifrequency and multiresolution time series: multifrequency X-band COSMO-SkyMed and C-band RADARSAT-2 SAR images are used alongside optical visible and near-infrared (VNIR) Pléiades data. This scenario is of special current interest, both because of the potential of exploiting the synergy among these missions and especially in view of the recent COSMO-SkyMed Second Generation and RADARSAT Constellation programs. In the case of the second method, different quad-trees are also used, but both optical and SAR data are associated with each quad-tree in order to benefit from the finest resolution available from the considered sensors. Both approaches exploit the potential of hierarchical probabilistic graphical models (Kato and Zerubia 2012) to address challenging problems of multimodal classification of an image time series.
1.2.2. Hierarchical model associated with the first proposed method
Let us first define the multiple quad-tree structure associated with the first proposed method. The K images $\mathcal{Y}^1, \mathcal{Y}^2, \ldots, \mathcal{Y}^K$ in the series are included in the finest-scale layers (i.e. the leaves) of K distinct quad-trees. The coarser-scale layers of each quad-tree are filled in by applying wavelet transforms to the image on the finest-scale layer (Mallat 2008). The roots of the K quad-trees are assumed to correspond to the same spatial resolution. The rationale of this hierarchical structure is that each image in the input series originates from a separate multiscale quad-tree, generally with a different number of layers and with the input image on the leaves, and that the roots of these quad-trees share a common spatial resolution (see Figure 1.4). This graph topology implicitly requires that the spatial resolutions of the input images in the series be in a power-of-2 mutual relation. In general terms, this is a restriction, but for current high-resolution satellite missions this condition is easily met, up to possible minor resampling.
Let $\mathcal{Y}^k_\ell$ be the image associated with the $\ell$-th layer of the k-th quad-tree in the series. We will index the common root layer with $\ell = 0$ and the leaves of the k-th quad-tree with $\ell = L_k$, so that $\mathcal{Y}^k_{L_k}$ coincides with the original input image $\mathcal{Y}^k$. The images in the other layers have been obtained through wavelets from $\mathcal{Y}^k_{L_k}$. The whole time series of multiscale images, either acquired by the considered sensors or obtained through wavelets, will be denoted as $\mathcal{Y}$. We will also indicate as $S^k_\ell$ the set of pixel sites in the $\ell$-th layer of the k-th quad-tree. If a site $s$ belongs to the $\ell$-th layer of the k-th quad-tree and is not on the root layer, then it has a unique parent site $s^-$ in the $(\ell - 1)$-th layer of the same quad-tree.
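To make the multiscale structure concrete, the sketch below builds the coarser layers of one quad-tree from a toy finest-scale image and exposes the parent relation between consecutive layers. Recursive 2x2 averaging is used here as a simple stand-in for the wavelet approximation coefficients (the actual method applies wavelet transforms in the sense of Mallat 2008); the function names are illustrative only.

```python
import numpy as np


def build_quadtree(image: np.ndarray, n_layers: int) -> list:
    """Return the layers [root, ..., leaves] of one quad-tree.

    Layer 0 is the coarsest (root); the last layer holds the input
    image (leaves). Each coarser pixel is the mean of the 2x2 block
    of its four children, a mean-based stand-in for the wavelet
    approximation coefficients used in the actual method.
    """
    layers = [image]
    for _ in range(n_layers - 1):
        finer = layers[0]
        h, w = finer.shape[:2]
        # Group pixels into 2x2 blocks and average each block per channel.
        coarser = finer.reshape(h // 2, 2, w // 2, 2, -1).mean(axis=(1, 3))
        layers.insert(0, coarser)
    return layers


def parent(p: int, q: int):
    """Parent site of pixel (p, q), located in the next coarser layer."""
    return p // 2, q // 2


# Toy finest-scale image (16x16, 3 features) and its 3-layer quad-tree;
# note the power-of-2 relation between consecutive layer sizes.
leaves = np.random.rand(16, 16, 3)
tree = build_quadtree(leaves, n_layers=3)
# Layer sizes: 4x4 (root), 8x8, 16x16 (leaves).
```

With this convention, every non-root site (p, q) has exactly one parent (p // 2, q // 2) in the layer above and four children in the layer below, which is the tree structure exploited by the hierarchical MRF inference described next.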