Semi-Huber quadratic function and comparative study of some MRFs for Bayesian image restoration

The present work introduces an alternative method to deal with digital image restoration into a Bayesian framework, particularly, the use of a new half-quadratic function is proposed which performance is satisfactory compared with respect to some other functions in existing literature. The bayesian methodology is based on the prior knowledge of some information that allows an efﬁcient modelling of the image acquisition process. The edge preservation of objects into the image while smoothing noise is necessary in an adequate model. Thus, we use a convexity criteria given by a semi-Huber function to obtain adequate weighting of the cost functions (half-quadratic) to be minimized. The principal objective when using Bayesian methods based on the Markov Random Fields (MRF) in the context of image processing is to eliminate those effects caused by the excessive smoothness on the reconstruction process of image which are rich in contours or edges. A comparison between the new introduced scheme and other three existing schemes, for the cases of noise ﬁltering and image deblurring, is presented. This collection of implemented methods is inspired of course on the use of MRFs such as the semi-Huber, the generalized Gaussian, the Welch, and Tukey potential functions with granularity control. The obtained results showed a satisfactory performance and the effectiveness of the proposed estimator with respect to other three estimators.


INTRODUCTION
The use of powerful methods proposed in the seventies under the name of Bayesian estimation [1]- [3], are nowadays essential at least in the cases of image filtering, segmentation and restoration (e.g.image deblurring) [4].The basic idea of these methods is to construct a Maximum a posteriori (MAP) estimate of the modes or so called estimator of true images by using Markov Random Fields (MRF) in a Bayesian framework.The idea is based on a robust scheme which could be adapted to reject outliers, tackling situations where noise is present in different forms during the acquisition process [5]- [12].Some image analysis and processing tasks involve the filtering or image recovery x (restoration) after it passes by a degradation process giving the observed image y (see Eqs. (1) and ( 3)).Such degradation covers the noise perturbation, blurring effects due to focus of the adquisition systems lenses or to the bandwidth capacity, and other artifacts that may be added to the correct data.The image restoration or recuperation approaches of an image to its original condition given a degraded image, pass by reverting the effects caused by a distortion function.In fact, the degradation characteristics given by F(x) and n in Eq. ( 1) are crucial information and they must be known or estimated during the inversion procedure.Typically, F(x) is related to a point spread function H which can be linked with the probability distribution of the noise contamination n.In the case of MAP filters, usually the additive Gaussian noise is considered.A global image formulation model could be: where, F(x) is a functional that could take for instance, two forms: F(x) = x and F(x) = Hx, being H a linear opera-tor which models the image degradation.All variables presented along the text are, x = (x s ) s∈S , where S is a set of pixels: which represents a realization of a Markov random field X (or image to be estimated), y: represents the observed image with additive noise n and/or distorted by H, and x: is the estimator of x with respect to data y.There is another source of information which imposes a key rule in the image processing context, this is the spatial information that represents the likelihood or correlation between the intensity values of a neighborhood of pixels well specified.The modellig, when using MRF, takes into account such spatial interaction and it was introduced and formalized by Besag [1], where the powerfulness of these statistical tools is shown (as well as in pioneering works [2,3,13,14]).Combining both kinds of information in an statistical framework, the restoration is led by an estimation procedure given the maximum a posteriori of the true images when the distortion functionals are known.
The algorithms implemented in this work were developed considering a degraded signal, where the resulting non-linear recursive filters show excellent characteristics to preserve all the details contained in the image, and on the other hand, they smooth the noise components.Particularly four estimation schemes are implemented using, the semi-Huber potential function which is proposed as an extension of some previous works [15,16] (see in Section 3), and the generalized MRF introduced in the work of Bouman [13], the Welch and Tukey potential functions as used in the works of Rivera [17]- [19] (see in Section 4).
Section 2 describes the general definition of MAP estimator, gives an introduction to MRFs and describes briefly alternative similar tools called Hidden Markov Fields (HMF).
The potential functions compared in this paper must be obtained or proposed to conduct adequately the inversion process.Such functions are described in Section 3 and 4 where the convexity is the key to formulate an adequate criterion to be minimized.In Sections 5 and 6, the MAP estimators resulting from different MRF structures and some illustrative results are briefly discussed.Finally, in Section 7 some partial conclusions and comments are given.

MAP estimation, MRFs and Hidden Markov Fields (HMF)
The problem of image estimation (e.g.filtering or restoration) into a Bayesian framework deals with the solution of an inverse problem, where the estimation process is carried out in a whole stochastic environment.The most popular estimators used nowadays are: Maximum Likelihood (ML) estimator: where p(•) is the probability function of y given x.This estimator is a classical approach in estimation theory, but under certain circumstances, in image processing restoration, it results in an ill-posed problem or produces excessive noise and also causes smooth of edges.The regularization of the ML estimator leads to a Bayesian approach, where it is important to exploit all known information or so called prior information about any process under study, which gives a better statistical estimator called Maximum A Posteriori (MAP) estimator: in this case, the estimator is regularized by using a Markov random field function g(x) which models all prior information as a whole probability distribution, where X is the set of possible values of x capable to maximize the posterior law p(x|y), and p(y|x) is the likelihood probability function from y given x.
The Markov random fields (MRF) can be represented in a general way by using the following equation: where Z is a normalization constant, C is a set of "cliques" c, and V c (x c ) is a potential function given over the local group of points c.Generally, the "cliques" correspond to the sets of neighborhoods of pixels if ∀k, r ∈ c, k and r are neighbors, and one can construct a neighborhood system called ∂k; for the 8 closest neighbors ∂k = {r : |k − r| < 2}.A theorem introduced by Hammersley-Clifford [1,3] probes the equivalence between the Gibbs distribution and the MRFs.The Markov random fields have the capacity to represent various image sources.
There is a variety of cost functions also known as potential functions that can be used into the MRF [13], [17]- [19].Each potential function characterizes the interactions between pixels in the same local group.As an example, the following family represents convex functions: where λ is a constant parameter to be chosen, and p takes constant values such as p ≥ 1 accordingly to the theorem 2 in [13].
On the other hand, if one considers an equivalent problem of segmentation, the reconstruction modeling problem can be seen as Hidden Markov Fields (HMF), in this case x = (x s ) s∈S and y = (y s ) s∈S are considered as a pairwise Markov random fields where x is a realization of a hidden field X given the realization y of the observed field Y [20].The statistical assumptions and theoretic framework are more general and well developed.The Markovianity of X is not necessary, but the Markovianity of X conditionally to Y must be assured.In the classical assumptions of Markovianity for the hidden X, the law holds according to Eq. ( 4).However, for the pairwise Markovianity (X, Y) a more general model law is given by in this case the Markov fields (X, Y) are coupled assuming that the posterior law p(x|y) is Markovian.In recent works, these coupled Markov random fields have been extended in at least two directions: 1) Conditional Markov Fields (CMF) where it is directly assumed the Markovianity of p(x|y), even if the pairwise (X, Y) is not Markovian [21]- [23], 2) Triplet Markov Fields (TMF), here a third finite field u = (u s ) s∈S is introduced assuming that the triplet field (X, U, Y) is Markovian [24]- [26].It is possible to estimate x from y as shown for image segmentation tasks [24,27].
These extensions of HMF are promising for segmentation, and could also be an alternative for the equivalent problems of image filtering and restoration.

Semi-Huber (SH) proposed potential function
The principal apport of this work is the proposition of the semi-Huber potential function for image restoration, which performance is comparable with respect to half-quadratic functional performance.In order to assure completely the robustness into the edge preserving image filtering, diminishing at the same time the convergence speed, the Huber-like norm or semi-Huber (SH) potential function is proposed as a halfquadratic (HQ) function.Such functional has been used in one dimensional robust estimation as described in [15] for the case of non-linear regression.This function is adjusted in this work in two dimensions according to the following equation: where and where ∆ 0 > 0 is a constant value, b kr is a constant that depends on the distance between the r and k pixels, ct is a constant term, and ϕ 1 (x) = e 2 where e = (x k − x r ).The potential function ρ 1 (x) must fulfill the following conditions Notice that there is not necessary a scale parameter and that the potential function meets all requirements imposed by conditions (8).

Generalized Gaussian MRF and other half-quadratic functions
In some works [28]- [32] a variety of new potential functions were introduced, such proposed functions are semi-quadratic functionals or half-quadratic and they characterize certain convexity into the regularization term [33,34] (e.g.extension of penalization) which permits to build efficient and robust estimators in the sense of data preservation which is linked to the original or source image.Also, the necessary time of computation decreases with respect to other proposed schemes as shown by M. Nikolova [6]- [11], [35] and Labat [36,37].On the other hand, a way to obtain the posterior distribution of images has been proposed in previous works from A. Gibbs [38], in this case, it is necessary to use sophisticated stochastic simulation techniques based on the Markov Chain Monte Carlo (MCMC) methods [39,40].If it is possible to obtain the posterior distribution of any image, then, it is also possible to sample from such posterior distribution and obtain the MAP estimator, or other estimators such as the median estimator.The MAP and the median estimators search the principal mode of the posterior distribution.
In the present paper some potential functions are compared.The proposed semi-Huber is compared with respect to the generalized Gaussian MRF introduced in [13,14], the Welch, and Tukey potential functions with granularity control.These two last functions were proposed and used in the works of Rivera [17]- [19] providing excellent performance.

Generalized Gaussian MRF (GGMRF)
If one considers to generalize the Gaussian MRF (when p = q = 2 one has a GMRF, see Eq. ( 18)) as proposed in [13], then the generalized potential functions can be limited such as obtaining the GGMRF where theoretically a k > 0 and b kr > 0, k is the site or pixel of interest and S is the set of sites into the whole MRF, and r corresponds to the local neighbors.In practice it is recommended to take a k = 0 thus, the unicity of x MAP , can be assured given that the likelihood term is quadratic q = 2, then and from Eq. (3), log p(y|x) is strictly convex and so x MAP is continuous in y, and in p.The choice of the power p is capital, since it constrains the convergence speed of the local or global estimator, and the quality of the restored image.Small values for p allows abrupt discontinuities modeling while large values smooth them.the potential function meets all requirements imposed by conditions (8).

Welch potential function
Known as a hard redescender potential function with granularity control given by µ, and proposed in [17] log g(x) where k is a positive scale parameter and This function is also half-quadratic such as the Tukey function presented in the following subsection.

Tukey potential function
This is another hard redescender potential function, proposed in [17] that fulfills all requirements imposed by conditions (8) log g(x) where and where k g is also a scale parameter.On the other hand, ϕ 1 (x) can be the quadratic function which together with µ induces the granularity control.

Image filtering
In this section some estimators are deduced, the single problem of filtering noise to restore an observed signal y leads to establish the estimators.The observation equation could be where I is the identity matrix, and the general MAP estimator for this case is deduced from the next minimization process Thus, the MAP estimators for this particular problem under hypothesis of centered Gaussian noise with variance σ 2 n are given using the four previous potential functions.The first MAP estimator can be obtained when using our proposed semi-Huber (SH) MRF function.The global estimator can be described by the equation and, the local estimator leads to the following expression for the first local MAP estimator On the other hand, the minimization problem leads to consider various methods [6,9]- [11,35,41]: • global iterative techniques such as: the descendent gradient [35], conjugate gradient [17] (for recent propositions one can consult the work of Labat [36,37]), Gauss-Seidel, Over-relaxed methods, etc.
• local minimization techniques: minimization at each pixel x k (which generally needs more time, but from our point of view are more precise), where also some of the above methods can be used.
In this work the local techniques were used (the expectation maximization (EM) could also be implemented, or the complete half-quadratic scheme as proposed by Geman and Reinolds [33], and Geman and Yang [34]), since all hyperparameters included into the potential functions were chosen heuristically or according to values proposed in references.
Only, the step of minimization with respect to x was implemented to probe convergence of estimators.The second MAP estimator can be obtained when using the GGMRF function.Its global estimator is described by the following equation where b kr = b k−r = b r−k assuming homogeneity of the MRFs.In this case, the local MAP estimator for the GGMRF is given by where, according to the value of parameters p and q, the performance of such estimator varies.For example, if p = q = 2, one has the Gaussian condition of the potential function where the obtained estimator is similar to the least-squares one (L 2 norm) since the likelihood function is quadratic, with an additional quadratic term of penalization which degrades (e.g.large smoothing) the estimated image On the other hand, in the case of p = q = 1, the criterion is absolute (L 1 norm), and the estimator converges to the median estimator, which in practice, it is difficult to implement in a precise way This criterion is not differentiable at zero and this fact causes instability in the minimization procedure.For intermediate values of p and q the estimators become sub-optimal, and the iterated methods can be used to minimize the obtained criterions.Some iterative methods are the sub-gradient, or the Levenberg-Marquardt method of MATLAB 7, the last was used in this work.For cases where q = p, for example q = 2 and 1 < p < 2, some studies and different prior functions have been proposed in [7]- [11,35], particularly in [7]- [11] where non-convex regularized least-squares schemes are deduced and its convergence is analyzed (where 0 < p < 1) with very good times of convergences as presented in [42].The local or global condition of the estimator depends thus on: 1) if one has values of 1 < p < 2: the estimator x min.loc→ x min.glob, which means that a local minimum would coincide with a global minimum, 2) moreover, if p = q, the criterion is not homogeneous, but: x(αy, λ) = α x(y, α 1−q/p λ), assuring the convergence and existence of the estimator which is continuous with respect to p.
Moreover, since the noise is Gaussian, the value of q = 2 is a good choice.However, if the noise is non-Gaussian, and if the structure of noise is known, then the likelihood term changes to give a particular estimator (such as that proposed by Bertaux [5]) with some properties of robustness to the minimization procedure.If the structure of noise is assumed unknown, still one could reconsider the use of the GGMRF with values for q ∈ (1, 2], or consider also another type of potential functions as those described in some works of Idier [28]- [31] and Nikolova [9]- [11,35].
The third MAP estimator is obtained when using the Welch potential function with granularity control, that is, and its local version is And finally, the fourth MAP estimator is deduced from the Tukey potential function with granularity control, deriving the following global estimator where the local estimator is given by The use of a prior distribution function based on the logarithm, with any degree of convexity and quasi-homogeneous, permits to consider a variety of possible choices of potential functions.Maybe, the most important challenges that must be well solved are: the adequate selection of hyper-parameters from potential functions, where different versions of the EM algorithms try to tackle this problem [11,28,31,32], another is the minimization procedure which in any sense will regulate the convergence speed as proposed in [9,10,17,33,34,36,37,41].

Image deconvolution
On the other hand, for the problem of image deblurring to restore an observed signal y, the observation equation used is given by y for the four MAP estimators the likelihood term changes, such that, where the matrix H is known and given by the following truncated Gaussian blurring function, as used also in [42], with σ b = 1.5, and m = 1, 2, 3, 4 according to the four SH, GGMRF, Welsh and Tukey potential functions.
Here, the results were improved combining ideas introduced in a similar Bayesian way by Levin [43] adding a Sparse prior (SP) for filtering and then reconstructing the image.

Some experiments
Results presented in this section were concerned experimenting extensively with five images: synthetic, Lena, Cameraman, Boat and fringe pattern, to probe the performance of the presented estimators.

Image filtering
Continuing with the problem of filtering noise, some estimation results are presented when images are contaminated by Gaussian noise, and there are not other type of distortions.The first experiment was made considering σ n = 5, 10, 15.In the work by De la Rosa [16] some results were previously presented based on the analysis of a synthetic image and the standard image of "Lena", different levels of noise were added to the images: n ∼ N (0, Iσ 2 n ), the values of σ n are given such that the obtained degradation is perceptible and difficult to eliminate.The results were compared using different values for ∆ 0 and λ = 1 (MAP1), different values for p and λ preserving q = 2 (MAP2), and different values for k, µ and λ (MAP3 and MAP4).Generally, with the four estimators the filtering task gives good visual results (see Figures 2, 3, 4 and  5), but the time of computation is different between them.The faster estimator is the MAP3, while the slowest is the MAP2 with p = 1.2 which results correspond to the Cameraman in Figure 5(d).In the case of the Welsh and Tukey functionals the tuning problems must be solved implementing in correct ways more sophisticated algorithms based, for example, on the expectation maximization method.Figure 6 shows a synthetic generated fringe image, which was used to probe performance of estimators.In this case, it is known how the noise structure that contaminates data is, but the signal-to-noise is unknown.
Once again the obtained results coincide with the previous results for other images, but with an increase of computation time, which has a relation with the image dimensions (as   shown in Table 1).Some interesting applications of robust estimation are particulary focused on phase recovery from fringe patterns as presented in work [44], phase unwrapping, and some other problems in optical instrumentation.In this sense some filtering results were thus obtained using the presented MAP estimators.
Finally, Table 1 shows the performance of the four MAP estimators for the problem of filtering Gaussian noise, where an objective evaluation is conducted accordingly to the peak signal to noise ratio (PSNR).Also the computation times in MATLAB are shown in Table 1.Such comparative evaluation shows that our proposed approach MAP1, gives better or similar performance with respect to MAP2, MAP3, and MAP4.On the other hand, the use of half-quadratic potential functions permits flexibility on the computation times [7,8,11], but still it is a challenge to tune correctly the hyper-parameters to obtain a better performance in the sense of restoration.Perhaps the most simple potential function to tune is the semi-Huber (MAP1).Also making the correct hypothesis over the noise could help to improve the performance of the estimator.This could be directly reflected by proposing a more adapted likelihood function, as proposed by Bertaux [5] and some other recent works [9] (in cases of non-Gaussian noise), where a connection with variational and partial differential equations is illustrated evoking the famous work of Perona and Malik [45], and some recent related works.

Image deconvolution
Now, for the problem of image deconvolution some estimation results are presented when images are contaminated by Gaussian noise, and Gaussian distortion (with σ b = 1.5)Blur- ring the image.This second experiment was made considering thus σ n = 3, 5, 7. The results were compared using different values for ∆ 0 and with λ = 1 (MAP1), different values for p and λ preserving q = 2 (MAP2), and different values for k, µ and λ (MAP3 and MAP4).Figures 7 and 9 show a comparison of results obtained for the Cameraman and Boat images accordingly to the four MAP estimators.One can notice that preserving values of hyper-parameters near those used for the filtering case, the estimators smooth the noise but does not made good deblurring or recuperation of the image (the PSNR for this case is depicted in Table 2).One must change the hyper-parameters values searching a trade of between the granularity of the noise and the sharpness of the image to make a good deblurring task (decompositing in two steps the image reconstruction).Figure 8 and 10 show the obtained results on the Cameraman and Boat images using a combination of proposed estimators together with a Sparse prior (SP) deconvolution technique introduced in [43].The improvement in the restoration is visible; here the recuperation was obtained in two steps; first the noise was smoothed and then, the deblurring was obtained using SP deconvolution technique (our approach can be used in both steps, tuning the hyper-parameters two times).On the hands, in Table 2 the performance of the four MAP estimators and the SP de-   from potential functions, where different versions of the EM algorithms try to tackle this problem, another is the minimization procedure which in any sense will regulate the convergence speed as proposed by Allain [41], German [33,34], Rivera [17], Labat [36,37], Nikolova [9,10,35].
In the case of the semi-Huber potential function (MAP1), the tuning is less complicated and of course, the estimator manipulation is far simpler than Welsh (MAP3) and Tukey (MAP4).
On the other hand, this problem can be solved as argued by Idier [28] and Rivera [17] by implementing more sophisticated algorithms with the compromise to reduce time of computation and better quality in restoration as recently exposed by Ruimin Pan [12,28,42].Some advantages on the use of GGMRF as prior information into the Bayesian estimation (MAP2) are: the continuity of the estimator is assured as a function of the data values when 1 < p ≤ 2 even for Gaussian and non-Gaussian noise asumption.The edge preserving is also assured, over all when p → 1 and obviously it depends on the choice between the interval 1 < p < 2. The final objective of this work has been to contribute with a series of software tools for image analysis focused for instance to optical instrumentation tasks such as those treated in the works [44] and [18,19] obtaining competitive results in filtering and reconstruction.More over, the extensions of HMF will be considered in future work as an alternative for solving the problems of image filtering and restoration.

Figure 1 (
Figure 1(a) shows the behavior of the semi-Huber proposed function for different ∆ 0 values, in the range of x ∈ [−8, 8].Notice that there is not necessary a scale parameter and that the potential function meets all requirements imposed by conditions(8).

FIG. 1
FIG. 1 The four convex potential functions used: (a) The Semi-Huber potential function for ∆ 0 with different values; (b) The Generalized Gaussian potential function for p different values, while q = 2; (c) The granularity control Welch potential function for k g different values, while µ = 0.01; (d) The granularity control Tukey potential function for k g different values, while µ = 0.01.

Figure 1 (
c) shows the behavior of the Welch function with granularity control for different k g threshold values, in the range of x ∈ [−8, 8], µ = 0.01.Also, this potential function fulfills all requirements imposed by conditions(8).

Figure 1 (
d) shows the behavior of the Tukey function with granularity control for different k g values, in the range of x ∈ [−8, 8], µ = 0.01.

TABLE 2
Results obtained in evaluating the deconvolution capacity of the different MAP estimators.