Abstract
In this work, a non-subsampled shearlet transform (NSST) based anisotropic diffusion method is proposed. In the proposed method, the NSST transform is firstly applied to the noisy image to provide several scale and directional components. Then, the NSST coefficients are classified into textured regions and noise-related ones by using Sparse Un-mixing by the variable Splitting and Augmented Lagrangian (SUnSAL) classifier. Subsequently, an energy function is formed by the noise-related coefficients to be minimized by diffusion equations. The noisy free image is approximated from the denoised coefficients obtained by the anisotropic diffusion method and textured coefficients which are remained unchanged. Visual and quantitative assessments demonstrate that the proposed method outperforms the state-of-art denoising methods in terms of noise removal and detail preservation.
Keywords
Introduction
A digital image is inevitably corrupted by noise in the process of transmission, coding and receiving. Image denoising methods aim at recovering the unknown initial image from a contaminated observation or noisy measurement. Up to now, numerous image denoising methods including, Gaussian smoothing filtering [1], anisotropic diffusion [2], bilateral filtering [3], total variation minimization [4] and multiresolution wavelet/contourlet/shearlet thresholding method [5–7] have been explored for recovering the “true” image and removing the noise. These methods adopt different strategies to use the local information of the noisy image for removing the noise.
Nowadays, multiresolution approaches lead to convenient tools in signal processing applications such as, data compression and signal denoising. To find an efficient representation of objects with discontinuities along curves, several researchers have introduced different multiresolution approaches namely, the wavelet transform [8], contourlet transform [9] and shearlet transform [10]. But these transforms are not shift-invariant, which can be a problem in signal analysis, pattern recognition or, as in our case, image denoising [11]. The main drawback of these transforms is that additional edges or structures called artifacts are produced by these methods. To tackle this problem, the shift-invariant version of these transforms i.e., non-subsampled wavelet transform, non-subsampled contourlet transform (NSCT) and non-subsampled shearlet transform (NSST) have been proposed in Refs. [11–13].
The wavelet method is suitable for one dimensional objects, because this transform does not provide directional components. Among the multiresolution approaches providing directional components, NSST has several advantages. It has more sparse coefficients and a lower computational complexity compared with NSCT [14]. There is no limitation on the number of directions in the NSST to which images are decomposed. In addition, unlike the NSCT, NSST guarantees sparse approximations for piecewise smooth images [15]. Therefore, the NSST has been used to decompose the noisy image in this paper.
Since white noise is distributed evenly over all the wavelet/contourlet/shearlet coefficients, removing the coefficients with small magnitude decreases most of the noise energy while preserving most of the image energy. So, it is understandable why as simple an operation as thresholding in the wavelet/contourlet/shearlet domain can reduce the effect of the noise while preserving image information [16].
Because of its simplicity, the thresholding method is widely used for image denoising in the multiresolution based methods. But the thresholding method presents oscillations in the vicinity of the image’s discontinuity which is similar to the Gibbs phenomena exhibited by the Fourier thresholding [17]. Because of this similarity, they are called pseudo-Gibbs phenomena. Although using the soft thresholding method instead of hard one enables us to partially reduce the pseudo-Gibbs phenomena, this method lowers all of the coefficients and consequently the local average is not preserved leading edges to be eroded [17].
Anisotropic diffusion method follows a completely different approach. To keep edges and textures, the anisotropic diffusion method can be performed using I t = ∇ . (c (x, y, t) ∇ I), where c (x, y, t) = g (|| ∇ I (x, y, t) ||), in which g is a monotonically decreasing function. As a consequence, for the textured regions with high gradient pixels, c (x, y, t) is small and therefore gets less diffused. On the contrary, for smooth regions with low gradient pixels, c (x, y, t) has a higher value, and these pixels get blurred with neighboring pixels [18]. So, the anisotropic diffusion method smoothes out noise while preserving the informative features.
In some recent papers, several researchers use learning methods e.g., support vector machines (SVMs) to classify the coefficients provided by the multiresolution transforms, into two classes, i.e. clean and noisy [6, 19]. After determining the noisy coefficients, they are denoised by the soft thresholding approach. Then, the inverse transform is applied to the clean coefficients and the denoised version of the noisy coefficients to get the noise free image. The main advantage of the SVM classifier over other conventional classifiers is that it is very successful when there are a limited number of training samples. To cope with the challenging task of classification in the small sample size situation, several classifiers have been proposed to improve the SVM based classifiers. The SUnSAL classifier is one of the state of the art classifiers aiming at representing most new observations with linear combinations of atoms (training samples), which are selected from an over-complete training dictionary [20]. It has been experimentally shown that, the SUnSAL classifier is more successful than the SVMs based classifiers even in the small sample size satiation. Therefore, we have made use of this classifier in our methodology.
The denoising algorithm presented in this paper combines the NSST and the anisotropic diffusion method by a new strategy. Although, a hybrid method combining the shearlet and anisotropic diffusion, has been proposed for denoising in Ref. [21], we improve it in this paper by classifying the NSST coefficients and forming a new cost function to be minimized. The proposed method takes advantage of multiresolution approach while the artifacts have been reduced because of the anisotropic diffusion used in our method.
This article is organized into five sections.Section 2 reviews the NSST, the SUnSAL classifier and the anisotropic diffusion method. In Section 3, the proposed method is presented. The experimental results and discussions are described in Section 4. Finally, Section 5 is devoted to the conclusions.
A review of methods
This section reviews the NSST, the SUnSAL classifier and the anisotropic diffusion method as presented in Refs. [13, 22].
NSST
The NSST is a shift-invariant, multi-scale and multi-directional expansion demonstrating low computational complexity compared with NSCT. Moreover, unlike other multiresolution transforms, NSST transform can be decomposed into any number of directions. Therefore, using NSST in the image denoising application provides more sparse coefficients by which the denoising algorithm has a higher performance. Before describing NSST, the discrete shearlet transform should be explained. So, in this subsection the discrete shearlet transform and NSST are reviewed.
A) Discrete shearlet transform
When dimension n is 2, the affine systems with composite dilations are defined as following:
The first matrix, i.e. A is an anisotropic dilation matrix which controls the ‘scale’ of the shearlets and guarantees that the frequency support of the shearlets becomes increasingly elongated at finer scales, while the second matrix, i.e. S is a shearing matrix controlling the orientation of the shearlets. It is usually assumed a = 4 and s = 1, such that and .
For assume is:
For any (ξ1, ξ2) ∈ C0, where called horizontal cone, the functions form a tiling of C0 (see Fig. 1). This property implies that the collection
Suppose , and . Then, the Parseval frame for L2 (C1) ∨ is as follows,
B) Non-subsampled shearlet transform (NSST)
The shift invariant version of the shearlet transform was proposed by Easley in 2008 [13]. The NSST eliminates the down-sampler and up-sampler operators in the structure of the shearlet transform. This transform uses the non-subsamples Laplacian pyramid (NSLP) with several combinations of shearing filters. The process by which the scales are produced is as follows:
Given an N × N image and the number of direction D, the steps of NSST in the scale j are described as follows: Use the low-pass and high-pass filters of NSLP to decompose into a low-pass image and a high-pass image . Note that removing the downsampler results in and with size N × N. Perform FFT to to obtain in pseudo polar grid, then compute . To get , apply a band-pass filtering to . Performing inverse FFT to , gives NSST coefficients in pseudo polar grid.
Figure 2 shows the results of applying the two level NSST on the Zoneplate image, yielding a low-pass subband (Fig. 2(b)) and several highpass subbands. In this figure, the numbers of shearing directions are chosen to be 4 and 8 from coarser to finer scale.
The sparse representation is based on the idea that all the test samples can be well-approximated by a (sparse) linear combination of atoms selected from an overcomplete training dictionary [20]. Given a training dictionary, denoted as A = {x1, …, xn} ∈ Rn×m, in which x i ∈ R1×m is a training sample, n is the number of training samples and m denotes the dimension of each training sample (i.e., the number of features), we want to approximate each new observation by this dictionary. Let the dictionary contain a total of c different classes. So, the dictionary can be organized as A = [A1, …, Ac], where A k = {x k 1 , …, x k n k } and A k includes the training samples of class k in its column, n k is the number of training samples in class k, and . In supervised classification of data, a conventional scenario is that a limited set of labeled training samples for each class exists, and we use this information to train a classifier, which is then used to classify unlabeled samples. Given a new observation (an unlabeled sample) denoted as x i , sparse representation aims at representing it by linear combination of the training samples (atoms) in the dictionary A as follows:
Note that Equation (12) is a convex problem. Therefore, it can be solved using linear programming solvers. The Lagrange multiplier denoted by τ1 creates a tradeoff between the remained error and sparse solution. Analogously to the results of Ref. [20], we have experimentally observed that the nonnegativity constraint (NC) i.e.,
The idea of image denoising by the nonlinear diffusion method can be explained as follows [22]. Let I be the noisy image. This image is known to be the sum of the noise free image and some Gaussian noise n (x, y):
The most well-known diffusivity, is Perona–Malik diffusivity g (x) =1/(1 + x2/γ2) which is used in the proposed method.
The goal of this section is to propose a model combining NSST and nonlinear diffusion for discontinuity-preserving denoising. Let I (x, y) be an observed discrete noisy image on the square Ω ⊂ R2 known to be the sum of the original image and noise. We first apply NSST to this noisy image. In the conventional hybrid methods, such as curvelet–diffusion [22] and shearlet-diffusion [21], the coefficients of detail subbands of the transform are divided into two sets by a threshold value. In these methods, coefficients with values more than the threshold are preserved while the coefficients with values lower than the threshold are denoised by the diffusion filtering. This idea preserves the coefficients with higher values which are likely to correspond to the location of textured regions. The reason is that these coefficients contain informative features of the image and denoising these coefficients can erode the edges of the image. However, determining the coefficients belonging to the textured regions cannot be done with only a simple thresholding operation. So, to improve the quality of the denoised image, we adopt a different strategy in which the coefficients belonging to the textured regions and noise-related ones, are determined by the SUnSAL classifier. Then, the coefficients of the second class, i.e., the noise-related class, are denoised by the diffusion filtering.
In the NSST, pixels of the original image spatially relate to the coefficients in the transform domain. The spatial relationships in the NSST domain represent the edges and textures of the image which should be preserved in the denoising process. A natural image is formed by regular features and performing amultiresolution transform such as NSST on the image results in sparse, spatially joined coefficients, representing edges and textures of the original image. The spatial dependency/contiguity of the multiresolution transform coefficients is known as spatial regularity [7]. Some researchers have used the similar concept (i.e., signal regularity) instead of spatial regularity for determining the important coefficients in their proposed method [19]. Although the spatial dependency of the features is represented by statistical models such as Markov random sequences in some methods [25], these forms of representation have two defects. First, the computational complexity of these methods is high. Second, their models cannot describe the geometry of the features truly. To remove these defects, the spatial dependency of the features is described by the connectivity of NSST coefficients in our strategy. Due to the fact that the NSST realizes that spatial dependency between its coefficients, the subbands of the NSST are generally formed by the connected coefficients. This concept is utilized in our methodology to determine the textured regions and noise-related ones. After determining the noise-related coefficients, they are denoised by the diffusion filtering in our method to prevent the artifacts usually produced by the soft thresholding method. The steps of the proposed hybrid method are asfollowing:
(1) Perform the NSST on the noisy image I to obtain L levels and O orientations. One approximate NSST subband denoted by W0 and several detail subbands denoted by Wi,j (i = 1, 2, …, L, j = 1, 2, …, O) are provided here.
(2) Select the training samples to be used as atoms, and find the feature vector of these samples for detail subbands of the NSST i.e., Wi,j. For this purpose, firstly obtain the preliminary binary map Bi,j [x, y]. Then compute the support value Si,j [x, y] from the binary map, in which x and y determine the position of coefficient. The binary map Bi,j [x, y] is obtained by the following formula:
(3) Use the SUnSAL classifier to classify all the coefficients of detail subbands Wi,j (i = 1, 2, …, L, j = 1, 2, … , O), where the training samples are used as atoms.
(4) Now, the following idea is used. We want to keep the low frequency coefficients and the important NSST coefficients (labeled as textured regions) almost untouched, but we want to denoise the NSST coefficients which have been labeled as noisy coefficients, by the diffusion procedure in such a way, that the image is smoothed. Therefore, we will not apply the nonlinear diffusion process directly to I, but only to the difference image I
d
= I - I0, where I0 is obtained by applying the inverse NSST to the set of coefficients formed by low frequency coefficients and textured coefficients. After some iteration steps of the diffusion scheme, the smoothed difference image is added to I0 in order to obtain the final result. The proposed model can be formulated as follows:
We have conducted some experiments to assess the performance of the proposed method quantitatively and visually. In the quantitative comparison, noisy images were simulated by adding white Gaussian noise with various standard deviations (20, 30, 40, 50, 60, 70) to some standard test images namely, Lena, Boat, Barbara, Cameraman, and a set of ordinary images from the Berkeley dataset including Castle, Man and Children (see Fig. 4). The noise was then removed from these images using several algorithms and the PSNR results were calculated. Four-level nonsubsampled shearlet transform (NSST) with 2, 2, 3, 3 directions, respectively, was used to decompose the noisy images.
In the first step, we should determine the free parameters used in our algorithm. Our experiments showed that the optimal regularization parameter is τ1 = 3 ×10-5. In addition, the number of training samples k (see step 2 of the proposed algorithm) is set to be 24, because the improvement is negligible for the larger number of k. In the next step, we should determine the threshold value T, to be used in the proposed method. For this purpose, the idea called oracle [19] is usually used in the image processing field, in which several images are corrupted by noise and subsequently, the noise is removed by a series of T and the optimal threshold T maximizing the PSNR value is obtained to be used for noise reduction in general. Here, five well-known training images, namely, Stream and bridge, Fruits, Goldhill, Couple and Baboon have been used. The obtained optimal threshold T is 11. In addition the time step used to iteratively solve Equation (19), was set to be 0.05.
In the implementations, we have used the publicly codes available for BLS-GSM [26], ProbShrink [27] and NL-means [28], but quantitative results of ShearletSVM have been reported from Ref. [7] just in the table.
Visual inspections are necessary to evaluate the performance of image denoising algorithms, because artificial edges cannot be well-quantified by the PSNR value. The results of the denoising algorithms for the Lena data have been represented in Fig. 5. Zoomed areas of Fig. 5 have also been shown in Fig. 6. In addition, the visual performances of the denoised images for the Cameraman data have been represented in Fig. 7. The standard deviation of noise is 20 in this experiment. From these figures (specifically Fig. 6(c)), one can easily observe that, in the BLS-GSM method, the details and edges have been eroded, but the noise has been reduced satisfactorily. As it is obvious, the result of the ProbShrink method has preserved the edges better than that of the BLS-GSM method, but the noise is still remained in the result of this method. In addition, a considerable amount of artifacts has been produced by this method (see Figs. 5, 6, 7(d)). As can be seen from these figures, the result of NL-means method seems pretty noisy, such that the level of the noise in this method is higher than that in the ProbShrink method. Moreover, the edges and textures have been blurred in this method. One can observe that the proposed denoising method preserves the edges and textures better than other methods, while removing the noise almost completely (see Figs. 5, 6, 7(f)).
Finally, the performance of these methods was assessed quantitatively. The values of PSNR index for the proposed method and other mentioned methods have been reported in Table 1. This table demonstrates that the proposed method outperforms other powerful denoising methods quantitatively.
Conclusion
A new image denoising method was proposed combining NSST and anisotropic diffusion. In order to preserve features of the image, after applying the NSST to the noisy image, the obtained coefficients are classified into two classes i.e., edge-related coefficients and noise-related ones by using the SUnSAL classifier, and only the coefficients of the second class are used in the denoising procedure. To prevent artifacts, we denoised the noise-related coefficients by the anisotropic diffusion method. Some standard grayscale images and several ordinary images selected from the Berkeley dataset were denoised by the proposed method and five powerful algorithms. The comparison of the denoising results obtained with the proposed method, and with the several powerful methods, demonstrates the efficiency of our new image denoising approach. The visual quality of denoised images obtained by the proposed method is moreover characterized by fewer artifacts than the other methods.
