Markov-based approaches for ternary change detection between two high resolution synthetic aperture sonar tracks

Change detection methods, consisting in detecting potential changes between two images, representing the same geographical area but acquired at different times, have been widely used in sonar imagery. Such methods are very useful to accurately monitor, potentially low, variations in complex environments. Coherent methods, relying on both amplitude and phase of the backscattered signal, are limited because of the signals correlation that plummets with both frequency and temporal baseline. In this paper, we propose an incoherent method, only using the amplitude images, to solve the change detection problem. This method relies on a robust mathematical expression for the class conditional probability density functions of the log-ratio image along with various Markov-based approaches to provide a ternary change map, thus allowing to better understand the changes that have occured on the seafloor.


I. INTRODUCTION
The detection of changes on the Earth surface is of high interest for many institutions, in order to monitor land ressources ( [1], [2]), marine ones [3] or underwater ones [4].
Change detection techniques have thus been widely used in the remote sensing field ranging from satellite images [5] to SAR data [6] through underwater acoustic ones [7].
In our specific context, Mine Counter Measures (MCM) operations classically consist in detecting, locating, classifying, identifying and neutralizing potential threats such as minelike objects lying on the seabed. The most common way to detect (and classify) such objects is to process the surveyed area by means of an Automatic Target Recognition chain [8], [9], especially in case of newly surveyed areas. However, if an area has already been imaged during a previous mission, it can be of interest to compare this reference sonar track with the one recently acquired in order to detect potential changes.
We can split such change detection methods into two groups: coherent and incoherent-based ones. Coherent-based methods rather rely on complex images to detect seabed changes using both the amplitude and the phase of the signal. The complex coherence [10]- [12] or the canonical correlation analysis (CCA) [13] have been used in this context. The pro of the use of the phase component is its ability to detect tiny changes not visible in the amplitude of the backscattered energy. However, to ensure a sufficient temporal correlation between the reference and the repeated track, Lyons [14] suggests not to work at a frequency higher than 30kHz and to limit the time interval between the acquisition of both tracks to few days.
Incoherent change detection methods only use the amplitude images (amplitude of the backscattered energy) to find changes. While symbolic or object-based approaches rely on the extraction of features of interest (ATR detection [15], [16], echo-shadow pairs [17], [18]) in both images before matching them by considering their local arrangement, imagebased methods [7], [19] only consider pixels intensity to detect potential changes by means of the computation of a difference image.
To detect changes between the reference and repeated tracks, Quidu [19] proposes either to threshold the absolute value of the log-ratio image while getting rid of false alarms through morphological mathematics and geometric constraints, or to detect non-speckle areas within both tracks before computing a difference image from those detections.
As it can elapse several months between the acquisition of our reference and repeated tracks, coherence-based methods are prohibited and we thus focus on incoherent-based ones.
The main contributions of this paper is the derivation of an analytical expression for the class-conditional probability density function (PDF) of the log-ratio image, based on physical considerations, to further provide a ternary change map, along with a novel method, combining multiple Hidden Markov Chains, to yield a robust change detection mask.
In section II and III, we respectively detail our approach to the change detection problem and derive a mathematical formulation for the probability density functions of the log-ratio image under different hypotheses. Markov Random Fields and Hidden Markov Chains are then considered in section IV. Results on real synthetic aperture sonar images are provided in section V while conclusion and future work are drawn in section VI.

A. Image superimposition
In the context of image registration and change detection, the most widely used way of representing the result is the superimposition of both images ( [7], [10], [20]), one being said the reference image while the other is named the repeated one. Thus, in this paper, we follow the same approach, especially when we deal with sonar tracks registration.

B. Change detection approach
When we perform change detection between two highresolution synthetic aperture sonar tracks, as we consider only two classes (bottom reverberation and shadow), we can face four different cases regarding both images at a given location: • The area corresponds to bottom-reverberation (speckle noise) in both images, thus the area remains unchanged (i.e. no change label). • An object exists on both tracks at the same location, leading to the unchanged class (i.e. no change label). • An object has appeared between tracks acquisition, and a change of the first kind is considered (i.e. object appearance label). • An object has disappeared between tracks acquisition, and a change of the second kind is considered (i.e. object disappearance label). To assign a label ("no change", "object appearance" or "object disappearance") to each pixel of the surveyed area, we rely on the computation of the pixel-wise log-ratio operator LR, between the reference image X 1 and the repeated one X 2 (1).
The log-ratio image has been chosen over the difference one as it only consider relative change in pixels intensity. Moreover, the ratio operator is more robust to calibration error than the difference one [21].
The reference and repeated passes have previously been projected into the Earth frame and co-registered [20].

C. Bottom-reverberation model
Areas corresponding to bottom-reverberation are corrupted with high frequency noise called speckle. Such a phenomenon comes from the fact that, after interacting with the seafloor, the acoustic waves are no longer in phase, hence causing constructive or destructive interferences [22]. The speckle is often modelled by means of a Rayleigh distribution [23] or a more general Weibull one [24] (Fig.1(a)).
However, as our sonar tracks are beforehand projected into the Earth frame at a coarser resolution, we can expect bottom reverberation areas not to follow such a Rayleigh law. In fact as shown in Fig.1(b), the bottom reverberation class seems to derive from this hypothesis. This can be well explained by the mosaicking process. Indeed, as it consists in averaging neighboring pixels, the central limit theorem states that the resulting distribution will tend towards a normal law. Thus, such a model is retained (2) for the bottom reverberation class.
where σ r and µ r respectively stand for the mean and standard deviation.

D. Shadow model
Shadows, corresponding to uninsonified areas, use to be model thanks to a Weibull [24] or Rayleigh [25] laws although the Gaussian model has also been employed [23]. For the same reasons as those previously formulated, a normal model is adopted for the shadow conditional probability density function (3).
Although normal models have been retained for both classes, as the Shapiro test [26] yields a very small p-value (p < 0.001), we can state that our experimental data are likely to deviate from such a probability model. However, experimental results demonstrate that such an approximation perform well in practise.
III. LOG-RATIO OPERATOR As all cases have been listed in section II-B and normal models are now assumed for both conditional-class PDFs (II-C, II-D), we can now derive a mathematical formulation of the log-ratio operator.
The log-ratio image can be rewritten, Thus, once the PDF of the random variables Y i = f (X i ) = log X i and Z = Y 1 − Y 2 have been computed, the classconditional probability density function of the difference image can be expressed as, with In case of ternary change detection, the parameters associated to the three classes are: r , µ r , σ 2 r } or {µ s , σ 2 s , µ s , σ 2 s } In case of normal distributions, the parameters µ r , σ 2 r , µ s , σ 2 s can be estimated by means of the well known maximum-likelihood estimators (9).
For each class, an unique area is manually selected and the previous parameters can be estimated. Such estimated parameters can be kept fixed during the processing, as our images intensity are normalized, or updated in an unsupervised manner. However, as our three probability density functions will share the same parameters, the latter is made difficult.

A. Experimental validation
To assess the robustness of our approach, we have firstly estimated the parameters corresponding to the bottomreverberation and shadow classes in a fully supervised manner i.e. by manually selecting a snippet of interest in sonar tracks. Then, the previous theoretical distributions of the potential cases (cf. section II-B) have been superimposed with histograms computed from the previous snippets (reference and repeated tracks). Fig.2 illustrates the match between histograms (computed on snippets) and our theoretical model.
The different cases which can be encountered are thus restrained to three. Indeed, the two cases yielding the no change label, i.e. when both areas are made up of shadow (resp. reverberation), are indistinguishable.

IV. MARKOV-BASED APPROACHES
In order to obtain a smooth and robust change detection mask, a regularization penalty has to be added to the maximum likelihood term by taking into account the neighborhood of each pixel. Different Markov-based approaches are thus introduced in this section.
To be a Markov random field, our model must obey the following Markov property: Fig. 3: First and second-order cliques considered in our MRF model. We distinguish between horizontal, vertical, first kind diagonal and second kind diagonal interaction. The MRF model is thus fully described by four parameters β 1 , β 2 , β 3 and β 4 .
which means that a label is conditionally independent of all others variables given its neighborhood N s .
The goal of the MRF model is to find the optimal labelinĝ w ∈ Ω to maximize the posterior probability P (w|x). Ω is the set of all the possible labelings. Under the pixels independence assumption, we have: As we only consider up to pairwise interactions in our MRFs, the log-prior term can be replaced by its associated pairwise potential: We define the set of cliques C = C 1 ∪C 2 where C 1 and C 2 are respectively the singletons and doubletons cliques (Fig.3), with their associated potentials ψ 1 (13) and ψ 2 (14) respectively representing the data and regularization terms.
The singleton potential, as its name suggests, do not rely on any contextual information but only on the observed variable (x) to assign a label to a given site. Thus, it is defined as the negative log-likelihood function (13) of (6): According to Ising's model, the potentials associated to doubletons rather rely on the comparison between labels of two neighboring sites s and t.
β s,t is the penalty term to ensure a smooth final change map, and depends on the relative position of node s and t (Fig 3). The set of parameters regarding the prior model is defined as β = {β 1 , β 2 , β 3 , β 4 } but can be restricted to a single parameter β in case of an isotropic model.
Finally, the change detection problem aims at solving

B. Hidden Markov Models
Markov grid models such as Markov Random Fields, are well adapted to model computer vision problems. However, such models being made of multiple loops where inference methods are iterative ones, the inference algorithm can take time to converge and we often cannot ensure to reach the global optimum. In this section, we thus introduce the use of HMM to perform change detection.
To model our change detection through Hidden Markov Chains (HMC)( [27], [28]), we must reshape our images to one dimensional array. Instead of basically performing a rasterization, we rather rely on a Hilbert-Peano scanning strategy [29] (Fig.4). Indeed, a pro of such a scan is that two spatially close pixels will also be close in the provided chain. Under the chain model, the joint probability can be expressed as The transition matrix (p(w s |w s−1 )), denoted as C and whose the unnormalized version is given (17), has been designed following certain guidelines: • p(w s = i|w s−1 = j) = p(w s = j|w s−1 = i) • Both kinds of change are equiprobable. • The next most expected state is to stay in the current one. • The most expensive transition is to switch from a given type of change to the other one.
It is also possible to refine such a transition matrix, still in an unsupervised manner, by means of the Baum-Welch (or forward-backward) [30] algorithm. Moreover, as considering a unique neighbor can lack robustness, we rather rely on the fusion of multiple HMCs. Indeed, the previous scanning strategy is applied on four rotated versions (0 • , 90 • , 180 • and 270 • ) of the log-ratio image. Then, four different inferences are performed thanks to the Viterbi's algorithm thus yielding four change detection maps (w 1 , w 2 , w 3 and w 4 ). Such computations can be easily parallelized through multithreading or multiprocessing techniques thus not adding an extra computation time. The final change detection map w f us is obtained by fusing the four previous results according to the following rule, along with a morphological opening whose the structural element is a disk of radius 1 pixel.

C. Inference
In our experiments, we have used a graph-cut based method, the alpha-expansion move algorithm [31] to perform inference on our MRFs. Regarding the inference within HMCs, the maximization of (16) can thus be performed by means of dynamic programming (Viterbi's algorithm) [32] to yield the global optimum through a single travel along the chain.

A. Real data
The available data consist in two pairs of synthetic aperture sonar tracks. Such data have been acquired with a THALES sonar sensor, where we rely only on the broadside aspect to perform change detection. The carrier is also equipped with an inertial navigation system whose data are fused with a Doppler velocity log (DVL). For a given pair, once both sonar tracks have been projected into the Earth frame, our nonrigid multi-resolution registration algorithm described in [20] is applied to geometrically align such tracks. For both pairs, about three weeks elapsed between the reference and repeated tracks sensing.

B. Results
The registration results of the first sonar tracks pair are provided through three snippets pairs (Fig.5) as, because of the seabed changes, the change detection area lacks some distinctive features to show the efficiency of the registration step.
Thus, snippets corresponding to the change detection area are presented in (Fig.6 (a), (b)). Fig.7 shows change detection results obtained with both MRF and HMC approaches.
Both approaches are able to detect the four mine-like objects that have been removed from the seabed between both surveys. The HMC model, due to its limited spatial context (single neighbor) still provides a huge number of false alarms (Fig.6  (c)). However, our fusion rule is able to output a smoother change mask where all object removals have been detected (Fig.6 (d)), although two of the detected objects are split into two distinct parts.
However, as we can see from the second sonar tracks pair (Fig.8), the grazing angle difference causes false alarms as a part of the longest shadow can be considered as an object. Such an issue could be solved by comparing the ternary change detection mask, with one of the sonar tracks, to detect the connexity between objects (shadows) and thus remove such false alarms. The change detection mask could also be weighted according to the absolute difference between the reference and repeated grazing angle maps.
Regarding the proposed methods, both MRF and HMC could be applied, the former yielding a smoother change detection mask than the latter but being slower to infer.

VI. CONCLUSION
Under the normal hypothesis to model both bottomreverberation and shadow classes in sonar tracks, we have derived a mathematical formulation to model the (absolute) log-ratio image. Such a formula has then been integrated in different Markov frameworks (MRF and HMM) to finally yield a ternary change detection mask allowing to provide more information regarding the kind of change. This method has been demonstrated to perform well on real data, to detect changes between two high-resolution sonar tracks. We have also shown that, as expected, the results are very sensitive to re-navigation accuracy, a grazing angle difference between both tracks leading to false alarms. In such cases, as only relying on a single pixel to detect a potential change can be too restrictive, we could rather rely on Conditional Random Fields [33], where singletons potentials would be derived from classifiers such as convolutional neural nets [34] in order to learn a grazing/aspect angle invariance.