This modeling choice is motivated by the remarkable flexibility that FMMs offer in characterizing data with heterogeneous statistics – a highly desirable property in the application to high spatial resolution remote sensing imagery (Hedhli et al. 2016). In the proposed methods, for each layer of each quad-tree whose data are multispectral, g is chosen to be a multivariate Gaussian for all class-conditional pdfs, i.e. a Gaussian mixture model is used. In this case, the parameter vector ψn of each n-th component includes the corresponding mean vector and covariance matrix (n = 1, 2,..., N) (Landgrebe 2003). This model is also extended to the layers populated by wavelet transforms of optical data, consistently with the linearity of the wavelet operators.
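As a minimal sketch of the Gaussian-mixture case, the class-conditional pdf of a multispectral pixel can be evaluated as a weighted sum of multivariate Gaussian components. The function name and all parameter values below are illustrative, not part of the chapter's implementation:

```python
import numpy as np
from scipy.stats import multivariate_normal

def gaussian_mixture_pdf(x, weights, means, covs):
    """Class-conditional pdf p(x | class) as a finite Gaussian mixture.

    weights : (N,) mixture proportions summing to 1
    means   : (N, d) component mean vectors
    covs    : (N, d, d) component covariance matrices
    """
    return sum(w * multivariate_normal.pdf(x, mean=m, cov=c)
               for w, m, c in zip(weights, means, covs))

# Hypothetical two-component mixture for a 3-band multispectral pixel
weights = np.array([0.6, 0.4])
means = np.array([[0.2, 0.3, 0.1],
                  [0.7, 0.6, 0.8]])
covs = np.stack([0.01 * np.eye(3), 0.02 * np.eye(3)])
p = gaussian_mixture_pdf([0.25, 0.35, 0.15], weights, means, covs)
```

Each component contributes its own mean vector and covariance matrix, matching the parameter vector ψn described above.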
In contrast, for each layer populated by SAR data, all class-conditional pdfs are modeled using FMMs in which g is a generalized Gamma distribution, i.e. generalized Gamma mixtures are used. In this case, the parameter vector θn of each n-th component includes a scale parameter and two shape parameters (n = 1, 2,..., N). The choice of the generalized Gamma mixture is explained by its accuracy in the application to high spatial resolution SAR imagery (Li et al. 2011; Krylov et al. 2013). Here, we also generalize it – albeit empirically – to the layers populated with wavelet transforms of SAR imagery.
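The generalized Gamma mixture for a SAR layer can be sketched analogously. SciPy's `scipy.stats.gengamma` parameterizes the generalized Gamma distribution with two shape parameters (`a`, `c`) and one scale parameter, matching the θn described above; the parameter values here are only illustrative:

```python
import numpy as np
from scipy.stats import gengamma

def ggamma_mixture_pdf(x, weights, shapes_a, shapes_c, scales):
    """Class-conditional pdf of a SAR amplitude as a finite mixture of
    generalized Gamma components, each with one scale parameter and two
    shape parameters (scipy.stats.gengamma's (a, c, scale))."""
    return sum(w * gengamma.pdf(x, a, c, scale=s)
               for w, a, c, s in zip(weights, shapes_a, shapes_c, scales))

# Hypothetical two-component mixture for a single-look SAR amplitude
weights = np.array([0.5, 0.5])
shapes_a = np.array([2.0, 3.0])
shapes_c = np.array([1.5, 2.0])
scales = np.array([1.0, 2.0])
p = ggamma_mixture_pdf(1.0, weights, shapes_a, shapes_c, scales)
```

Since amplitudes are non-negative, the support of each component is [0, ∞), unlike the Gaussian components used for the optical layers.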
In all of these cases, the FMM parameters are estimated through the stochastic expectation maximization (SEM) algorithm. SEM is an iterative stochastic parameter estimation technique, introduced for problems characterized by data incompleteness, that approaches maximum likelihood estimates under suitable assumptions (Celeux et al. 1996). It is applied separately to the training set of each class ωm in each layer of each k-th quad-tree, to model the corresponding class-conditional pdf. In the case of the generalized Gamma mixtures for the SAR layers, it is also integrated with the method of log-cumulants (Krylov et al. 2013). Details on this combination can be found in (Moser and Serpico 2009). We recall that SEM also automatically determines the number N of mixture components, for which only an upper bound has to be provided by the operator. This upper bound was set to 10 in all of our experiments.
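To make the SEM iteration concrete, the following is a simplified one-dimensional Gaussian-mixture sketch: the stochastic step draws a hard label for each sample from the current posterior, the M-step re-estimates each component from its assigned samples, and components whose share of samples becomes too small are pruned, which is how the number of components is selected below the operator-supplied upper bound. This is an illustrative stand-in, not the chapter's implementation (which also handles generalized Gamma components via the method of log-cumulants):

```python
import numpy as np

rng = np.random.default_rng(0)

def sem_gaussian_mixture(x, n_max=10, n_iter=30, min_frac=0.02):
    """Stochastic EM for a 1D Gaussian mixture (illustrative sketch).

    Returns (weights, means, variances); the number of surviving
    components is determined automatically by pruning components that
    attract fewer than min_frac of the samples."""
    n = len(x)
    mu = rng.choice(x, n_max, replace=False).astype(float)
    var = np.full(n_max, x.var())
    w = np.full(n_max, 1.0 / n_max)
    for _ in range(n_iter):
        # E-step: posterior probability of each component per sample
        dens = (w / np.sqrt(2 * np.pi * var)
                * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var))
        dens = np.maximum(dens, 1e-300)  # numerical safeguard
        post = dens / dens.sum(axis=1, keepdims=True)
        # S-step: draw one hard label per sample from its posterior
        u = rng.random((n, 1))
        labels = (post.cumsum(axis=1) > u).argmax(axis=1)
        # Prune starved components, then M-step on the survivors
        counts = np.bincount(labels, minlength=len(w))
        keep = np.flatnonzero(counts >= min_frac * n)
        mu = np.array([x[labels == k].mean() for k in keep])
        var = np.array([x[labels == k].var() + 1e-6 for k in keep])
        w = counts[keep] / counts[keep].sum()
    return w, mu, var

# Synthetic bimodal training sample (two well-separated modes)
data = np.concatenate([rng.normal(0.0, 1.0, 500),
                       rng.normal(10.0, 1.0, 500)])
w, mu, var = sem_gaussian_mixture(data)
```

The random label draw is what distinguishes SEM from plain EM: it keeps the algorithm from stalling in degenerate configurations and naturally empties out superfluous components so they can be pruned.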
1.3. Examples of experimental results
1.3.1. Results of the first method
To experimentally validate the first method, a time series of two high-resolution images acquired in 2010 over Port-au-Prince, Haiti, has been used. The series consisted of an HH-polarized single-look COSMO-SkyMed stripmap image with a 2.5 m pixel spacing (325 × 400 pixels; see Figure 1.6(a)) and of a GeoEye-1 image with a 2.5 m spatial resolution (see Figure 1.6(b)) and three channels in the visible wavelength range. The time lag between the two acquisitions was a few days. Five main land cover classes were present in the scene: urban, water, vegetation, soil and containers. These classes were defined by an expert photointerpreter, who also annotated their training and test samples.
In the approach taken by the first proposed method, the quad-trees are ordered and, consistently with the cascade approach, the output classification map is the result obtained on the leaves of the second quad-tree. Here, the GeoEye-1 and COSMO-SkyMed images were associated with the first and second quad-trees, respectively. The rationale of this choice is to initialize the land cover mapping result using the optical data and to finalize it through the fusion with SAR imagery. For both quad-trees, the empty levels were filled in by applying 2D Daubechies wavelets of order 10 (Mallat 2008).
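The level-filling step above can be sketched as successive wavelet-style approximations of the finest-resolution image, each level having half the width and height of the one below it. The chapter uses order-10 Daubechies wavelets (in PyWavelets, the `'db10'` filter); to keep this sketch dependency-free, a Haar-like 2 × 2 average stands in for the approximation filter, and the function name is our own:

```python
import numpy as np

def fill_quadtree_levels(image, n_levels):
    """Build the coarser levels of a quad-tree from a finest-resolution
    image. Each step halves both dimensions, so every pixel at level r
    has four children at level r + 1. A Haar-like 2x2 average replaces
    the db10 approximation filter used in the chapter."""
    levels = [np.asarray(image, dtype=float)]
    for _ in range(n_levels - 1):
        a = levels[-1]
        h, w = (a.shape[0] // 2) * 2, (a.shape[1] // 2) * 2
        a = a[:h, :w]  # trim odd rows/columns before 2x2 pooling
        coarse = 0.25 * (a[0::2, 0::2] + a[1::2, 0::2]
                         + a[0::2, 1::2] + a[1::2, 1::2])
        levels.append(coarse)
    return levels[::-1]  # coarsest (root) level first

levels = fill_quadtree_levels(np.ones((8, 8)), 3)
```

In the actual method, a single quad-tree level at the native resolution of each sensor holds the observed data, and the wavelet approximations populate the remaining levels so that every scale of the hierarchical MRF has an associated observation.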
The classification result obtained by the proposed technique (see Figure 1.6(g)) was compared to those generated by several previous approaches to multisensor and/or multiresolution classification. First, to compare with the result of a multiscale but single-sensor approach, the hierarchical MRF on a single quad-tree of Laferté et al. (2000) was applied to classify the image collected by each sensor. In this case, the MPM criterion was also used and the class-conditional pdfs were estimated using SEM together with multivariate Gaussian or generalized Gamma mixtures in the case of optical and SAR data, respectively (see Figure 1.6(c) and (d)). Then, to compare with a multisensor multiscale approach, the algorithm in Voisin et al. (2014) was used. It makes use of a hierarchical MRF on a single quad-tree, whose layers are filled in with both optical and SAR data in a stacked vector fashion. Multisensor fusion is accomplished using multivariate copula functions (see Figure 1.6(e)). Finally, to compare with a multisensor but single-scale approach, the technique in Storvik et al. (2009) was considered after upsampling all of the data to the pixel lattice at the finest resolution. In Storvik et al. (2009), the joint class-conditional distributions of multisensor data are estimated using meta-Gaussian density functions (essentially equivalent to Gaussian copulas) and the maximum likelihood decision rule is applied to generate the output classification map (see Figure 1.6(f)).
Figure 1.6. First proposed method. (a) COSMO-SkyMed (©ASI 2010) and (b) GeoEye-1 (©GeoEye 2010) images of the input series. The former is shown after histogram equalization. The R-band of the latter is displayed. Classification maps obtained by separately classifying (c) the GeoEye-1 and (d) the COSMO-SkyMed images through a hierarchical MRF on a single quad-tree. Classification maps generated by (e) the multisensor multiscale method in (Voisin et al. 2014), (f) the multisensor single-scale technique in (Storvik et al. 2009) and (g) the proposed algorithm.
Color legend: water, urban, vegetation, bare soil, containers.
For a color version of this figure, see www.iste.co.uk/atto/change2.zip
First, a qualitative visual inspection of the classification maps generated by the proposed and benchmark techniques suggests that the first proposed algorithm yielded quite accurate results and improved on the previous methods, in particular on the separate multiscale classifications of the individual COSMO-SkyMed and GeoEye-1 images. Specifically, in the result achieved using only the SAR image, roads were discriminated quite accurately, but most other classes were not. In the result obtained using only the optical image, spatially homogeneous classes were discriminated more effectively. The proposed technique benefits from both satellite data sources and produces a classification output in which most classes in the high-resolution data set are visually well detected. Furthermore, compared to multisensor but single-scale classification through the algorithm of Storvik et al. (2009), the proposed method improved the spatial regularity of the classification map. This result is interpreted as a consequence of the contextual modeling components integrated in the proposed approach through MRF modeling over a quad-tree and wavelet transformation.
Table 1.1. First proposed method: classification accuracies and computation times of the proposed technique and of the previous algorithms in Storvik et al. (2009) and Voisin et al. (2014) on the test set of the time series composed of COSMO-SkyMed and GeoEye-1 images. Computation times refer to an Intel i7 quad-core, 2.40