Abstract
Breast ultrasound (BUS) imaging has become a crucial modality, especially for providing a complementary view when other modalities (e.g., mammography) are not conclusive in the task of assessing lesions. However, the specificity of cancer detection using BUS imaging is low, and the resulting false-positive findings often lead to unnecessary biopsies. In addition, increasing sensitivity is also challenging, given that artifacts in B-mode ultrasound (US) images can interfere with lesion detection. To deal with these problems and improve diagnostic accuracy, ultrasound elastography was introduced. This paper validates a novel lesion segmentation framework that takes intensity (B-mode) and strain information into account using a Markov Random Field (MRF) and a Maximum a Posteriori (MAP) approach, by applying it to clinical data. A total of 33 images from two different hospitals are used, comprising 14 cancerous and 19 benign lesions. Results show that combining the B-mode and strain data in a single framework improves segmentation results for cancerous lesions (Dice Similarity Coefficient of 0.49 using B-mode alone versus 0.70 when strain data are included), which are difficult cases in which the lesions appear with blurred, poorly defined boundaries.
Introduction
Digital mammography (DM) is the most used technique for screening breast cancer. However, performed in dense breasts, it may yield false-positive results. 1 Ultrasonography (US) is used as a complementary technique for screening due to its higher sensitivity in cases of dense breast, but its specificity in cancer detection is low because most of the detected solid lesions are benign. This leads to an increase of unnecessary biopsies causing discomfort to the patients and increasing costs. 2
To overcome the limitations of conventional US and obtain a more accurate characterization of breast lesions, US elastography was introduced. Breast US (BUS) elastography is a recent diagnostic technique based on imaging the difference in stiffness between cancerous and benign tissues. Elastography allows regions with different stiffness values to be differentiated, and usually cancerous tissue is stiffer than normal tissue. Hence, cancerous tissues are more easily identifiable in an elastogram than in conventional US techniques. 3
Compression (also referred to as conventional) elastography is based on applying a compressive force to the breast using a conventional transducer and measuring the resulting deformation of the tissues. As the external compression force is unknown, this technique only allows the calculation of the deformability ratio (strain), obtained by measuring variations in the radio-frequency (RF) ultrasound signals before and after compression, 4 and not the absolute elasticity value. 2 When these images are displayed, the tissue strain is encoded in a color map shown as an overlay on the B-mode image, where the different colors represent different strain values. Clinical studies support the use of this technique to enrich current screening methods.5-7 Thus, it is reasonable to assume that lesion segmentation methods could also benefit from this complementary information.
Previous works also used strain information in the segmentation process. Regarding lesion segmentation in BUS imaging, von Lavante and Noble 8 used a graph-cut segmentation network that incorporated strain values estimated from the RF signals. More recently, Nedevschi et al. 9 proposed the segmentation of strain images using the Expectation-Maximization (EM) algorithm. The method automatically initializes the EM by analyzing the peaks in the strain histogram. However, no quantitative results were reported. Zhou et al. 10 proposed the inclusion of strain data in a level-set segmentation framework, but they only evaluated their approach on images taken in phantoms. Finally, Selvan et al. 11 presented a nonlinear fuzzy inference system applied for classifying the breast lesions, which involves a stage of strain segmentation, but no segmentation results were reported.
Strain data can also be useful in the segmentation of other organs, such as the liver or the prostate. This approach was used by Techavipoo et al., 12 who proposed a semi-automatic algorithm to segment strain images of the liver. In their work, the user interaction is needed to place two regions of interest (ROIs) to initialize a histogram thresholding segmentation process. The segmentation is then refined with a morphological operation to remove artifacts. Liu et al. 13 also proposed the segmentation of strain images of the liver. They presented an Active Contour Model method, where the contour is initialized with a coarse-to-fine transformation (Gaussian pyramid). In the field of prostate segmentation, we find the work of Mahdavi et al., 14 where they presented a method that combines B-mode and strain information in an Active Shape Model.
An early version of this paper was presented in Pons et al., 15 where we briefly described the novel lesion segmentation framework that takes intensity information (B-mode) and strain information into account using a Markov Random Field (MRF) and a Maximum a Posteriori (MAP) approach, modeling both types of data with a bivariate Gaussian Probability Density Function (PDF). However, that paper showed only qualitative results for two cases.
This paper validates this approach by applying it to clinical data, demonstrating that the combined use of B-mode and strain information provides better results in difficult US images (where the lesion is not well defined) than B-mode alone, using images collected from two different datasets.
Materials and Method
Elastography can be seen as a complementary source of information to the conventional B-mode. The lesion segmentation method used in this study combines intensity information from the B-mode with complementary stiffness information obtained with elastography. Both sets of information are modeled with a bivariate Gaussian PDF. The method employs an MRF together with a MAP approach, as proposed by Xiao et al., 16 and it is assessed using two datasets of both strain and US images acquired in different clinical institutions, with a total of 33 images.
Bivariate MRF-MAP Segmentation
Xiao et al. 16 assumed that US images present intensity inhomogeneities and that these inhomogeneities are described by a multiplicative field
Given the class label
Experimental results corroborate the validity of the assumption of a bivariate Gaussian PDF for both lesion and background information in B-mode and strain, as illustrated in Figure 1, where the intensity distribution for lesion and background are shown for strain and B-mode data. The intensity distribution for lesion is different from the background distribution, which allows the methodology proposed to distinguish between the two different classes.

Figure 1. Example of a B-mode image of a carcinoma (a), where the lesion location is highlighted with a rectangle, and its strain map represented by a color overlay (b). Note that red represents high values of stiffness (hard tissue) and blue low values (soft tissue). The second row shows the bivariate histograms of the combined B-mode and strain information of the lesion (c) and the background (d) (relative values).
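The bivariate Gaussian assumption can be illustrated with a minimal sketch. It is not the authors' Matlab implementation: the function names and the sample (B-mode intensity, strain) pairs are hypothetical, and no spatial (MRF) prior is included. It only shows how one Gaussian per class could be fitted and how a pixel could be assigned to the class with the higher likelihood.

```python
import math

def fit_bivariate_gaussian(samples):
    """Estimate the mean and the 2x2 covariance of (intensity, strain) pairs."""
    n = len(samples)
    mx = sum(x for x, _ in samples) / n
    my = sum(y for _, y in samples) / n
    sxx = sum((x - mx) ** 2 for x, _ in samples) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in samples) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in samples) / (n - 1)
    return (mx, my), (sxx, sxy, syy)

def log_pdf(point, mean, cov):
    """Log-density of a bivariate Gaussian evaluated at point."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2  # must be positive (non-degenerate covariance)
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    # Mahalanobis distance using the closed-form inverse of a 2x2 matrix.
    maha = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return -0.5 * maha - math.log(2.0 * math.pi) - 0.5 * math.log(det)

def classify(point, lesion_model, background_model):
    """Maximum-likelihood label for one pixel (no neighborhood term)."""
    if log_pdf(point, *lesion_model) > log_pdf(point, *background_model):
        return "lesion"
    return "background"

# Hypothetical training pixels: (B-mode intensity, strain value).
lesion_model = fit_bivariate_gaussian(
    [(20, 0.10), (25, 0.20), (30, 0.15), (22, 0.12)])
background_model = fit_bivariate_gaussian(
    [(90, 0.80), (100, 0.70), (110, 0.90), (95, 0.85)])
```

For example, `classify((24, 0.14), lesion_model, background_model)` assigns the pixel to the lesion class because it lies close to the lesion mean in both features.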
The complete framework is described as follows (see Xiao et al. 16 for a more detailed explanation):
1.
2.
where Ni denotes the set of neighbor pixels of i.
3.
Where
4.
5.
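As a generic illustration of how an MRF prior interacts with per-pixel likelihoods in a MAP labeling, the sketch below uses Iterated Conditional Modes (ICM) with a Potts-style neighborhood term over 4-connected neighbors. This is an assumption made for illustration only: the optimizer, the neighborhood system, the weight `beta`, and the function name `icm_segment` are not taken from the paper or from Xiao et al., and the bias-field estimation is omitted.

```python
def icm_segment(loglik, beta=1.0, iterations=5):
    """Approximate MAP labeling of a 2D grid with ICM.

    loglik[r][c] is a pair (log p(x | background), log p(x | lesion)).
    Returns a grid of labels: 0 = background, 1 = lesion.
    """
    rows, cols = len(loglik), len(loglik[0])
    neighbors = ((-1, 0), (1, 0), (0, -1), (0, 1))
    # Start from the maximum-likelihood label at each pixel.
    labels = [[0 if ll[0] >= ll[1] else 1 for ll in row] for row in loglik]
    for _ in range(iterations):
        changed = False
        for r in range(rows):
            for c in range(cols):
                best_label, best_score = labels[r][c], None
                for k in (0, 1):
                    # Count the neighbors that currently carry label k.
                    agree = 0
                    for dr, dc in neighbors:
                        rr, cc = r + dr, c + dc
                        if 0 <= rr < rows and 0 <= cc < cols and labels[rr][cc] == k:
                            agree += 1
                    # Data term plus smoothness reward for agreeing neighbors.
                    score = loglik[r][c][k] + beta * agree
                    if best_score is None or score > best_score:
                        best_label, best_score = k, score
                if best_label != labels[r][c]:
                    labels[r][c] = best_label
                    changed = True
        if not changed:
            break  # converged: no label changed during this sweep
    return labels

# Toy 3x3 example: every pixel favors "lesion" except a weakly
# background-leaning center pixel, which the prior is expected to flip.
toy = [[(-5.0, -1.0)] * 3 for _ in range(3)]
toy[1] = [(-5.0, -1.0), (-1.0, -1.5), (-5.0, -1.0)]
smoothed = icm_segment(toy, beta=1.0)
unsmoothed = icm_segment(toy, beta=0.0)
```

With `beta=0.0` the result reduces to the per-pixel maximum-likelihood labeling, so comparing both outputs shows the effect of the neighborhood term in isolation.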
All the tests were performed on a PC (Intel® Core™ 2 Quad 2.83GHz 8GB RAM) using the Matlab® programming environment.
Clinical Study
The clinical study in this paper was performed using two datasets provided by different research institutions containing B-mode and strain information. These datasets were used to assess the MRF-MAP segmentation method, and they are described below:
Dataset A is composed of 12 images, all from different patients, acquired in the Churchill Hospital (Oxford, England) with a z.one (Zonare Medical Systems LTD., Mountain View, CA, USA) system and an L10-5 linear array transducer (8.5 MHz), and the mean size of all the images is 245 × 290 pixels. Within these images, 10 present a cancerous lesion while the other 2 images present a benign lesion. Among the cancerous lesions, 6 were diagnosed Ductal Carcinoma In Situ (DCIS) + Invasive Ductal Carcinoma (IDC), 2 IDC, 1 Invasive Lobular Carcinoma, and 1 DCIS. The 2 benign lesions were diagnosed Fibroadenoma (FA). The strain information was obtained based on tracking the displacement of the RF signal. 18
Dataset B is composed of 21 images, all from different patients, and it was provided by the Medical Imaging Group of the Cambridge University Engineering Department. The scans were obtained with a Diasus (Dynamic Imaging LTD., Livingston, UK) ultrasound machine with a 5 to 10 MHz linear array transducer, and the mean size of all the images is 384 × 294 pixels. Within these images, 4 present a cancerous lesion while the other 17 images present a benign lesion. Among the cancerous lesions, 3 were diagnosed IDC and 1 Necrosis. The 17 benign lesions were diagnosed 6 FA and 16 cysts. The strain information was generated using a tissue displacement tracking algorithm proposed by the same research group. 19 Table 1 summarizes the dataset composition.
Table 1. B-Mode Ultrasound Dataset Composition.
IDC = Invasive Ductal Carcinoma; DCIS = Ductal Carcinoma In Situ
Manual delineations of the tumors were performed by an expert radiologist using both B-mode and strain information in a dedicated workstation. For verifying the results using only B-mode information, the radiologist delineated the Ground Truth (GT) images using only B-mode, whereas for the combined B-mode and strain data, they used both sources of information overlaid. All of the images involved in this work were previously anonymized to preserve the confidentiality of the patients. Informed consent was obtained from all patients in this study, and authorizations were issued by the Ethics Committees of both hospitals. The diagnoses were supported by a subsequent biopsy/pathological examination after the acquisition.
Statistical Analysis
To evaluate our results quantitatively, we need measures that compare our segmentations with the GT provided by experienced radiologists. Figure 2 shows the graphical representation of the four areas that appear when comparing two different delineations. In terms of object segmentation, these four areas correspond to the true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN).

Figure 2. Graphic representation of the evaluation in terms of area measures. TN = true negatives; FP = false positives.
Expressing the results as the number of pixels belonging to each of these classes is not clear enough to determine how good the results are. For that reason, area metrics relating the four regions are commonly used. Most of these indexes are defined within the interval [0, 1], where 1 indicates perfect overlap and 0 means no overlap at all, although some works report them as percentages.
The Dice Similarity Coefficient (DSC) 20 is a well-known measure and is the most commonly used in image segmentation works to assess the segmentation results. This measure is expressed as follows:
\[ \mathrm{DSC} = \frac{2\,TP}{2\,TP + FP + FN} \]
where TP is the number of true positives, TN is the number of true negatives, FP is the number of false positives, and FN is the number of false negatives.
Specificity measures the proportion of negatives correctly identified. Specificity is described as follows:
\[ \mathrm{Specificity} = \frac{TN}{TN + FP} \]
Sensitivity measures the number of pixels correctly labeled as lesion with respect to the area of the lesion reference:
\[ \mathrm{Sensitivity} = \frac{TP}{TP + FN} \]
These measurements are calculated individually for each analyzed image. Then, the average value for each data subset is reported in the “Quantitative Results” section.
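The per-image measures described above can be sketched as follows, using the standard definitions of DSC, sensitivity, and specificity in terms of TP, TN, FP, and FN. The function names and the toy masks are illustrative assumptions, with masks given as nested lists of 0/1 values (1 = lesion).

```python
def confusion_counts(segmentation, ground_truth):
    """Count TP, TN, FP, FN pixels between two binary masks of equal size."""
    tp = tn = fp = fn = 0
    for seg_row, gt_row in zip(segmentation, ground_truth):
        for s, g in zip(seg_row, gt_row):
            if s and g:
                tp += 1  # lesion in both the result and the GT
            elif s and not g:
                fp += 1  # lesion in the result only
            elif g:
                fn += 1  # lesion in the GT only
            else:
                tn += 1  # background in both
    return tp, tn, fp, fn

def dice(tp, tn, fp, fn):
    # Undefined (division by zero) when both masks are empty.
    return 2 * tp / (2 * tp + fp + fn)

def sensitivity(tp, tn, fp, fn):
    # Undefined when the GT contains no lesion pixels.
    return tp / (tp + fn)

def specificity(tp, tn, fp, fn):
    # Undefined when the GT contains no background pixels.
    return tn / (tn + fp)

# Toy 3x3 masks (hypothetical data).
seg = [[1, 1, 0],
       [0, 1, 0],
       [0, 0, 0]]
gt = [[1, 1, 0],
      [1, 0, 0],
      [0, 0, 0]]
counts = confusion_counts(seg, gt)
```

Here `counts` is `(2, 5, 1, 1)`, so `dice(*counts)` is 4/6, `sensitivity(*counts)` is 2/3, and `specificity(*counts)` is 5/6. Averaging these values over the images of a subset gives the kind of summary reported per dataset.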
To determine whether there are significant differences in performance between the different values of the parameters in comparison with the default configuration, a hypothesis test was performed. Initially, the Kolmogorov–Smirnov test 21 was used to confirm that the values we compared were normally distributed. Then, a paired two-sample Student’s t test 22 was applied. In experiments where the number of samples is less than 25, the nonparametric Wilcoxon rank sum test is used instead. The null hypothesis specifies that there are no significant differences between the mean values.
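As a small worked illustration of the paired comparison, the sketch below computes the paired Student's t statistic for per-image DSC values obtained with two configurations. The DSC values are hypothetical, and only the statistic and its degrees of freedom are computed; obtaining a p-value requires the t distribution, which a statistics package would provide.

```python
import math
import statistics

def paired_t_statistic(first, second):
    """Paired t statistic for two equally long lists of per-image scores.

    Returns (t, degrees_of_freedom), where t = mean(d) / (sd(d) / sqrt(n))
    and d holds the per-image differences second - first.
    """
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    return mean_diff / (sd_diff / math.sqrt(n)), n - 1

# Hypothetical per-image DSC values for the same four images.
dsc_bmode = [0.5, 0.6, 0.7, 0.4]
dsc_bmode_strain = [0.7, 0.7, 0.9, 0.5]
t_value, dof = paired_t_statistic(dsc_bmode, dsc_bmode_strain)
```

With these toy values the differences are 0.2, 0.1, 0.2, and 0.1, which gives a t statistic of about 5.2 with 3 degrees of freedom.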
Results
Qualitative Results
A qualitative evaluation is first presented to show the behavior of the algorithm on illustrative cases: two cases in which the lesion is easily segmented using B-mode information alone (see Figure 3i and ii) and two cases where this modality is clearly insufficient (see Figure 3iii and iv). First, the images are segmented using B-mode information alone. Subsequently, they are segmented including both B-mode and strain information. We compared the different results to determine the benefits of combining US B-mode and strain information in the lesion segmentation problem.

Figure 3. Segmentation results using B-mode information alone (column a) and combining B-mode and strain (column c). Columns (a) and (c) plot the original image, and (b) and (d) the overlap between the segmentation result and the GT, where the light gray color denotes TP pixels, dark gray represents FN, black denotes TN, and white denotes FP. GT = ground truth; TP = true positives; FN = false negatives; TN = true negatives; FP = false positives.
Figure 3(i) and (ii) show the segmentation results for two different images, where the tumor is well defined in the B-mode images and the elastograms do not provide essential additional information. Note that the results are similar for both cases. Although it is not clearly appreciated in this example, the inclusion of strain information might yield an over-segmented result because the peritumoral tissues are also stiff.
However, Figure 3(iii and iv) shows the segmentation results of B-mode images that do not provide enough information to clearly segment the tumor, and the method fails in both cases. In images such as these, strain data provide better information on the location of the tumor, as shown in the second and fourth row of Figure 3. Using this information, the segmentation results are clearly improved.
Quantitative Results
As in the qualitative results, we compared the results obtained by using B-mode information alone and by combining it with strain information in the segmentation framework. The results are presented in Table 2 for all the images, in Table 3 for images with cancerous lesions, and in Table 4 for images with benign lesions. All the results are reported separately for each dataset.
Table 2. Quantitative Results Using B-Mode Alone or Including Elastography Information, in Terms of Sensitivity, Specificity, AO and DSC in %.
Results expressed in mean ± standard deviation. AO = area overlap; DSC = Dice Similarity Coefficient.
Table 3. Quantitative Results for Cancerous Lesions Using B-Mode Alone or Including Strain Information, in Terms of Sensitivity, Specificity, AO and DSC in %.
Results expressed in mean ± standard deviation. AO = area overlap; DSC = Dice Similarity Coefficient.
Table 4. Quantitative Results for Benign Lesions Using B-Mode Alone or Including Strain Information, in Terms of Sensitivity, Specificity, AO and DSC in %.
Results expressed in mean ± standard deviation. AO = area overlap; DSC = Dice Similarity Coefficient.
Figure 4 shows a set of box plot charts comparing the DSC values obtained regarding the type of lesion.

Figure 4. Box plot charts comparing the DSC values for Datasets A and B and the combined dataset regarding the type of lesion. The charts with statistically significant differences are denoted by *. DSC = Dice Similarity Coefficient.
The first row shows the box plots of the results obtained using images with cancerous lesions, the second row depicts the results obtained using images of benign lesions, and the last row shows the results obtained using all the images. The first column shows the results using images from Dataset A, the second column those from Dataset B, and the last column those from both datasets combined.
Ground Truth Generation
All the segmentation results were compared with the manual delineations of the tumors performed by an expert radiologist. For verifying the results using only B-mode information, the radiologist delineated the GT images using only B-mode, whereas for the combined B-mode and strain data, they used both sources of information overlaid. In this subsection, the GT generated using only B-mode and the GT generated including strain are compared, which quantifies the similarity of the two delineations.
Table 5 shows how the radiologist's delineations differ when using only B-mode images and when using strain and B-mode images overlaid.
Table 5. Ground Truth Comparison.
Results are reported as the mean DSC value (in %) ± standard deviation. DSC values are computed between the radiologist’s manual segmentation from B-mode alone and manual segmentation from B-mode with elastography. DSC = Dice Similarity Coefficient.
Image Quality Comparison
In this subsection, we relate the obtained segmentation results to the quality of the images. Image quality is a subjective notion. However, we can find some indicators that help determine whether an image is difficult for an automatic algorithm to segment. Here, we analyzed the segmentation results with respect to the contrast of the lesion boundaries. To obtain this value, we compared the inner border of the lesion with the outer one. The chosen distance was 10 pixels from the boundary. Then, the mean intensity values of the inner and outer areas were compared by subtraction. Lesions with good contrast yield a higher difference than lesions with blurry boundaries. Figure 5 shows the comparison between the difference of lesion contrast and the obtained DSC for (a) Dataset A, (b) Dataset B, and (c) the combined dataset, showing a trend where the DSC increases when the contrast between lesion and background is higher. Note that this comparison was made using the segmentation with only B-mode images.

Figure 5. Comparison between the difference of lesion contrast and the obtained DSC for (a) Dataset A, (b) Dataset B, and (c) the combined dataset. DSC = Dice Similarity Coefficient.
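The boundary contrast indicator can be sketched as below. Several details are assumptions, since the text does not specify them: the inner and outer areas are taken as bands of pixels lying within `width` steps of the lesion boundary, distances are measured with 4-connected (city-block) steps, and the absolute difference of the band means is returned. The function name and the toy image are hypothetical.

```python
from collections import deque

def boundary_contrast(image, mask, width=10):
    """Absolute difference between mean intensities of the outer and inner
    bands around the lesion boundary (mask: 1 = lesion, 0 = background)."""
    rows, cols = len(mask), len(mask[0])
    steps = ((-1, 0), (1, 0), (0, -1), (0, 1))
    dist = [[None] * cols for _ in range(rows)]
    queue = deque()
    # Seed with pixels that touch a pixel of the opposite label (distance 1).
    for r in range(rows):
        for c in range(cols):
            for dr, dc in steps:
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols and mask[rr][cc] != mask[r][c]:
                    dist[r][c] = 1
                    queue.append((r, c))
                    break
    # Breadth-first growth of each band, staying on the same side of the
    # boundary and stopping once the band is `width` pixels thick.
    while queue:
        r, c = queue.popleft()
        if dist[r][c] >= width:
            continue
        for dr, dc in steps:
            rr, cc = r + dr, c + dc
            if (0 <= rr < rows and 0 <= cc < cols
                    and dist[rr][cc] is None and mask[rr][cc] == mask[r][c]):
                dist[rr][cc] = dist[r][c] + 1
                queue.append((rr, cc))
    inner = [image[r][c] for r in range(rows) for c in range(cols)
             if dist[r][c] is not None and mask[r][c]]
    outer = [image[r][c] for r in range(rows) for c in range(cols)
             if dist[r][c] is not None and not mask[r][c]]
    # Raises ZeroDivisionError if the mask is all lesion or all background.
    return abs(sum(outer) / len(outer) - sum(inner) / len(inner))

# Toy 7x7 image: a dark 3x3 lesion (intensity 20) on a bright background (100).
toy_mask = [[1 if 2 <= r <= 4 and 2 <= c <= 4 else 0 for c in range(7)]
            for r in range(7)]
toy_image = [[20 if toy_mask[r][c] else 100 for c in range(7)] for r in range(7)]
contrast = boundary_contrast(toy_image, toy_mask, width=10)
```

For this sharply defined toy lesion the contrast is 80; a lesion with blurred boundaries would produce inner and outer band means that are closer together and therefore a smaller value.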
To determine the improvement of the results due to the inclusion of strain imaging in the segmentation algorithm, an expert radiologist labeled the images as good quality or bad quality. Table 6 shows the results. For Dataset A and the combined dataset, the inclusion of strain imaging significantly improved the performance of the segmentation algorithm in B-mode images rated as low quality by the expert radiologist.
Table 6. Quantitative Results for All the Types of Images Grouped According to the Quality of the Images.
Results are reported as the mean DSC value (in %) ± standard deviation. DSC = Dice Similarity Coefficient.
Discussion
Analysis of Table 2 shows that the inclusion of strain in the segmentation framework significantly improves the segmentation results in Dataset A, where the DSC increased from 50.1% using B-mode information only to 72.8% when including strain. For Dataset B and the combined dataset, the DSC values obtained were also higher using strain information, from 78.0% to 78.9% and from 67.9% to 76.7%, respectively.
Similar results are shown in Table 3, in which only the images with cancerous lesions are considered. For cancerous lesions, the inclusion of strain information in the segmentation process improves the performance significantly for images of Dataset A, and slightly for images of Dataset B. Note that only four images of Dataset B contain cancerous lesions, which allows us to extract trends but not final conclusions.
However, the analysis of Table 4, where only images with benign lesions are studied, shows that the inclusion of strain slightly decreases the performance of the segmentation for those images. As in the examples shown in Figure 3(i) and (ii), benign lesions (cysts in particular) appear in the US images with well-defined boundaries. Therefore, the inclusion of strain can be an element of distortion because some benign lesions are not stiffer than the healthy tissues. For images such as these, the B-mode information is enough to obtain an accurate segmentation result, and as the obtained results show, the inclusion of strain can decrease the performance.
These results are complemented by the set of box plot charts in Figure 4, which present graphically the quantitative results reported in Tables 2, 3, and 4.
Regarding the image quality study, Figure 5 shows a correlation between the DSC and the contrast between the lesion and the surrounding tissues. Thus, images where the lesion has a higher difference of intensity values between inner and outer pixels (the lesion is well defined) tend to obtain a higher DSC value. In terms of the qualitative assessment of the images, Table 6 shows that for Dataset A and the combined dataset, the inclusion of strain data significantly improves the segmentation in bad quality images.
In summary, we have shown that the inclusion of strain in the segmentation framework considerably improves (significantly for one of the datasets) the segmentation results for cancerous cases. However, when segmenting images with benign lesions, the use of strain can slightly decrease the performance of the segmentation method, although this difference is not significant. It is worth mentioning that the most challenging problem for the B-mode segmentation methods in the literature is dealing with cancerous lesions, which appear with blurred and poorly defined boundaries. 23 Thus, the inclusion of strain is a valuable option to improve the results in such difficult cases.
Conclusion
In this paper, a study of a unified framework for segmenting lesions in BUS using both B-mode and strain data, evaluated on two different image sources, was presented. Qualitative experiments were performed using different illustrative examples, some where the B-mode images show a well-defined lesion and others where elastograms provide more meaningful information. The accuracy of segmentation obtained by combining B-mode information with strain data was also compared with the results obtained by using B-mode only. The results show that combining both B-mode and strain data improves the segmentation results, particularly when the B-mode images are not conclusive or are of lower quality, as is often the case in noncystic lesions.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was partially supported by the Spanish R+D+I Grant TIN2012-37171-C02-01 and by a predoctoral grant FI program of Generalitat de Catalunya. Dataset B courtesy of the Medical Imaging Group, Cambridge University Engineering Department.
