Segmentation method for foreign fibers in cotton based on improved simple linear iterative clustering algorithm and SVM

Abstract

In response to the challenge of detecting small foreign fibers mixed in cotton during picking and transportation, a segmentation method for identifying foreign fibers in cotton based on superpixel features and a support vector machine (SVM) is proposed. The method involves integrating the local multidirectional anti-noise binary pattern with the simple linear iterative clustering algorithm to segment the image into superpixel units with similar color features. Color features and grayscale histogram texture features of each superpixel block are extracted to form a multidimensional feature vector. The particle swarm optimization–SVM classification model is then utilized to distinguish between background superpixels and foreign fiber superpixels. Finally, superpixel correction is performed based on confidence and regional adjacency relations, with the introduction of the Bhattacharyya coefficient, to obtain a completely segmented image. The superpixel classification accuracy is reported to be 98.65%. Compared with traditional image segmentation algorithms, this proposed algorithm shows superior segmentation performance.

Keywords

Cotton foreign fiber, local binary pattern, superpixel segmentation, texture features, support vector machine, category correction

With the rapid development of the cotton industry, China has become a major global producer of cotton. Over the past decade, China’s annual average cotton planting area has been 3302.34 thousand hectares, with an average annual cotton output of 5.8434 million tons. The mixing of foreign fibers in cotton significantly affects cotton quality, causing such issues as yarn breakage, uneven dyeing, and coarse texture during textile processing, severely damaging the textile economy. Foreign fibers mainly come from the picking and transportation stage of cotton, and there are many kinds, such as chemical fibers, hair, silk, hemp, plastic ropes, and dyed threads (ropes, fabric pieces).¹ The detection step is used in the cleaning process of cotton processing. Manual removal methods are insufficient to meet the needs of efficient enterprise development, so the accurate identification of foreign fibers using automated equipment is the main focus of current development in the textile industry.² Methods for removing foreign fibers from cotton are evolving toward intelligence and high efficiency. At present, the main automated detection methods for foreign fibers include photoelectric detection,³ ultrasonic detection, and machine vision detection, all of which can improve detection efficiency. The classic Shirley analyzer analyzes the content of foreign fibers by mechanically separating waste components,⁴ while the high-volume instrument uses spectroscopy and imaging techniques in the infrared and near infrared regions to identify or quantify foreign fibers.⁵

Owing to the variety of foreign fibers and significant differences in the characteristics between different types, from the perspective of optical technology, such methods as infrared light,⁶ hyperspectral imaging,⁷,⁸ and polarized light effectively enhance the contrast between foreign fibers and the cotton background.⁹ However, these methods are costly and difficult to implement in actual production. Additionally, in terms of detection algorithms, Guo et al.¹⁰ utilized three-channel fused image information and employed the Otsu method for image segmentation, achieving a detection error rate of less than 5% in cotton impurity detection. Edge detection algorithms, such as the Canny¹¹ and Sobel¹² algorithms, are also widely used in foreign fiber recognition. However, these algorithms are prone to broken boundary contours, owing to background noise and complex textures. Additionally, machine learning algorithms, such as support vector machines (SVMs), have been used to classify manually selected pixel neighborhoods to obtain segmented images of foreign fibers and for classification purposes.¹³ In recent years, deep learning algorithms have also been applied in the field of foreign fiber detection. Wei et al.¹⁴ improved the U-Net (a network structure with a U shape), achieving real-time segmentation of foreign fiber images and calculating the actual size of the foreign fibers. Subsequently, a cotton layer foreign fiber classification and recognition method based on near infrared spectroscopy and a combined convolutional neural network (CNN) and temporal convolutional network (TCN) was proposed, effectively identifying foreign fibers in cotton layers.¹⁵ Shi et al.¹⁶ designed a foreign fiber detection network model algorithm based on a residual structure, addressing the issue of foreign fiber edge positions.
Additionally, the You Only Look Once (YOLO) series of detection models, such as version 3,¹⁷ version 4,¹⁸ and version 5,¹⁹ have been employed for foreign fiber detection, each performing detection tasks with varying degrees of success. These algorithms, however, have high requirements for computer hardware. In the application of lightweight networks, the MobileNet network enables end-to-end segmentation of fabric defects. Moreover, it introduces depthwise separable convolution, which significantly reduces the complexity and model size of the network.²⁰

The aforementioned algorithms only consider the characteristics of individual pixels during segmentation, without fully utilizing the relational information between pixels. Superpixels use such information as color and spatial structure between pixels to divide an image into irregular superpixel blocks with similar color and texture features. Common superpixel segmentation methods can be divided into graph-based methods, such as Graph Cuts,²¹ and gradient ascent-based methods, such as Superpixels Extracted via Energy-Driven Sampling (SEEDS)²² and simple linear iterative clustering (SLIC).²³ The SLIC segmentation algorithm generates superpixels that are compact and uniform in size, effectively reducing the complexity of the image and exhibiting strong robustness. However, it only considers the effect of color and spatial features on the clustering process, resulting in poor segmentation performance for linear foreign fibers and weak-edge areas. To address this issue, in this work, linear transformation and the local multidirectional noise robustness binary mode (LMNRBM) operator are combined for texture feature extraction and integrated with the SLIC algorithm to enhance segmentation performance. Superpixels are obtained using the improved SLIC segmentation algorithm, and their color and texture information are extracted to form multidimensional feature vectors. A particle swarm optimization (PSO)-SVM classification model is used for the preliminary classification of superpixels. Based on confidence and adjacency relationships, a classification correction algorithm is designed, and superpixels are merged according to the correction results, achieving complete segmentation of foreign fibers.

Data collection

Experimental materials and samples were obtained from cotton textile enterprises in the Xinjiang Uygur Autonomous Region, China. To ensure the diversity of the dataset, for this study, eight types of common foreign fibers in cotton were selected as segmentation targets, including cloth strips, polymer fibers, hemp ropes, and nylon ropes. The image acquisition tool selected for the experiment was a Baumer industrial camera, equipped with a complementary metal oxide semiconductor (CMOS) image sensor. The image resolution was 3000 × 4000 pixels, and the images were saved in .jpg format. A light-emitting diode light source was used, fixed on an experimental stand with nuts and bolts, to allow for adjustment of the distance between the light source and the cotton to provide evenly distributed illumination. The camera was fixed above the cotton and the light source with a cantilever bracket. Static image capture was completed by adjusting the distance and angle of the light source. Detailed images of some of the foreign fibers found in the cotton are shown in Figure 1. The foreign fibers can be classified as linear or particulate, with the length of linear foreign fibers ranging from 0.2 cm to 1 cm, and the area of particulate foreign fibers ranging from 0.16 cm² to 0.81 cm².

Figure 1.

Foreign fibers found in cotton.

Research methodology

The segmentation method for foreign fibers in cotton proposed in this study can be divided into three main parts. First, the preprocessing stage involves dividing the entire image into blocks and using the gray-level difference method to determine whether the sub-image contains foreign fibers. For images containing foreign fibers, superpixel segmentation incorporating texture features is used to obtain superpixel images. Finally, based on the superpixel features, the complete foreign fibers are segmented by merging the superpixels. The main process is illustrated in Figure 2.

Figure 2.

Algorithm flow chart of this study. LMNRBM: local multidirectional noise robustness binary mode; PSO: particle swarm optimization; SLIC: simple linear iterative clustering; SVM: support vector machine.

Improved local binary patterns operator

The local binary patterns (LBP) operator,²⁴ proposed by Ojala et al., is a powerful texture feature descriptor, and is widely used in such fields as face recognition and image classification. The traditional LBP operator compares the grayscale value of the central pixel with that of eight neighboring pixels and encodes them to generate a binary number. This binary number is then converted into the LBP value of the central pixel according to a specific weighting rule, producing a texture feature description image. This operator is simple, efficient, and robust to changes in illumination. However, it is sensitive to noise and it is difficult to resist interference using this operator because only the grayscale value differences between local pixels are considered. Therefore, in this work, an improved LMNRBM is proposed, as shown in Figure 3.

Figure 3.

Local multidirectional noise robustness binary mode operator diagram.

This method improves on the circular LBP operator by evenly distributing neighboring points along the circumference of a circle with radius R. To eliminate noise, the four vertically and horizontally neighboring points (V₁, V₃, V₅, V₇) are replaced by the average grayscale values of their four nearby pixels. For the diagonally neighboring points (V₂, V₄, V₆, V₈), which do not fall on pixel centers, bilinear interpolation is used to obtain their approximate grayscale values. Moreover, since the traditional LBP operator only considers the relationship between the central pixel and its neighboring pixels, ignoring the relationships between the neighboring pixels themselves, in this method, texture variation information is also extracted in two dimensions: horizontally and vertically, as well as diagonally between the neighboring pixels. This further enhances the texture feature description. The binary string code obtained is then converted into the LMNRBM feature value of the central pixel, according to the weighting rule for subsequent segmentation.

The texture feature value L₁ between the central pixel and the neighboring pixels is calculated as

L_{1} {(i, j)}_{r, p, R} = \sum_{n = 1}^{p} s (V_{n}, V_{c}) 2^{n - 1}

(1)

where s is the threshold function:

s (V_{n}, V_{c}) = \begin{cases} 1, & V_{n} - V_{c} \geq 0 \\ 0, & V_{n} - V_{c} < 0 \end{cases}

(2)

where (i, j) represents the position of the central pixel; R is the radius of the sampling circle; r is the radius of the small neighborhood over which each vertical or horizontal sampling point is averaged; p is the number of sampling points in the neighborhood; n denotes the position of the neighboring points; V_n is the approximate value of the neighboring pixel; V_c is the value of the central pixel; and s(V_n, V_c) thresholds the difference between the value of the neighboring pixel and that of the central pixel, as defined in equation (2).
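Equations (1) and (2) can be illustrated with a minimal sketch in Python. The neighbor values and the function names `threshold` and `texture_l1` are hypothetical, chosen only for illustration; the neighbor values are assumed to have been smoothed or interpolated already.

```python
def threshold(v_n, v_c):
    """Eq. (2): 1 if the neighbor is at least as bright as the center, else 0."""
    return 1 if v_n - v_c >= 0 else 0


def texture_l1(neighbors, v_c):
    """Eq. (1): sum of thresholded differences weighted by 2^(n-1), n = 1..p."""
    # enumerate starts at 0, which matches the exponent n - 1 for n = 1..p.
    return sum(threshold(v_n, v_c) << n for n, v_n in enumerate(neighbors))


# Hypothetical neighbor values V1..V8 and central value Vc.
neighbors = [120, 95, 130, 101, 90, 100, 140, 80]
v_c = 100
l1 = texture_l1(neighbors, v_c)  # bits set for V1, V3, V4, V6, V7
```

With p = 8 the resulting value always lies between 0 and 255, as in the conventional LBP code.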

For the vertical and horizontal neighbors (V₁, V₃, V₅, V₇), the pixel values are the averages of the four pixels on the circular neighborhood with radius (r). Taking V₁ as an example:

V_{1} = \frac{1}{4} \sum_{m = 1}^{4} V_{1 m}

(3)

For the diagonal neighborhood pixels (V₂, V₄, V₆, V₈), the approximate grayscale value of the neighborhood pixels is obtained using bilinear interpolation. Taking V₄ as an example:

V_{4} = \sum_{m = 1}^{4} w_{m} \cdot V_{4 m}

(4)

where V_{4m} are the values of the four pixels surrounding the diagonal sampling point and w_m are the corresponding weight coefficients in the interpolation, which satisfy

\sum_{m = 1}^{4} w_{m} = 1
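As an illustration of equation (4), the following sketch computes bilinear weights for a diagonal sampling point. It assumes the point sits at 45° on a circle of radius R = 1 and that the four surrounding pixels are ordered top-left, top-right, bottom-left, bottom-right; these choices and the pixel values are assumptions, not taken from the text.

```python
import math


def bilinear_weights(fx, fy):
    """Weights w_1..w_4 for fractional offsets fx, fy in [0, 1)."""
    return [(1 - fx) * (1 - fy), fx * (1 - fy), (1 - fx) * fy, fx * fy]


def interpolate(values, fx, fy):
    """Eq. (4): weighted sum of the four surrounding pixel values."""
    return sum(w * v for w, v in zip(bilinear_weights(fx, fy), values))


R = 1.0                                  # assumed sampling radius
offset = R * math.cos(math.pi / 4)       # about 0.7071 along each axis
fx = fy = offset - math.floor(offset)    # fractional part inside the pixel cell
weights = bilinear_weights(fx, fy)
v4 = interpolate([100, 110, 120, 130], fx, fy)  # hypothetical gray values
```

The weights are products of complementary fractions, so they sum to 1 for any offsets.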

The texture feature value L₂ between vertical and horizontal neighborhoods is calculated as

L_{2} (i, j) = s (V_{3}, V_{1}) 2^{3} + s (V_{5}, V_{1}) 2^{5} + s (V_{7}, V_{1}) 2^{7}

(5)

The texture feature value L₃ between diagonal neighborhoods is calculated as

L_{3} (i, j) = s (V_{4}, V_{2}) 2^{4} + s (V_{6}, V_{2}) 2^{6} + s (V_{8}, V_{2}) 2^{8}

(6)

These three texture feature values are concatenated to obtain a three-dimensional multidirectional noise-resistant LBP feature vector (the LMNRBM operator), as

LMNRBM (i, j) = [\begin{matrix} L_{1}, & L_{2}, & L_{3} \end{matrix}]

(7)
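The three components in equations (1), (5), (6), and (7) can be combined in a short sketch. The function name `lmnrbm` and the sample values are hypothetical; the weights 2^3, 2^5, 2^7 and 2^4, 2^6, 2^8 are reproduced exactly as printed in equations (5) and (6).

```python
def threshold(a, b):
    """Eq. (2): 1 if a - b >= 0, else 0."""
    return 1 if a - b >= 0 else 0


def lmnrbm(v, v_c):
    """Return [L1, L2, L3] for neighbor values v = [V1, ..., V8] and center v_c."""
    # Eq. (1): center versus each neighbor, weights 2^(n-1).
    l1 = sum(threshold(v[n], v_c) << n for n in range(8))
    # Eq. (5): V3, V5, V7 compared with V1 (index k - 1 holds V_k).
    l2 = sum(threshold(v[k - 1], v[0]) << k for k in (3, 5, 7))
    # Eq. (6): V4, V6, V8 compared with V2.
    l3 = sum(threshold(v[k - 1], v[1]) << k for k in (4, 6, 8))
    return [l1, l2, l3]


# Hypothetical smoothed neighbor values and central value.
feature = lmnrbm([120, 95, 130, 101, 90, 100, 140, 80], 100)
```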

To highlight the differences between foreign fibers and the cotton background in the texture feature map, the Sobel edge detection operator is used to extract edge information in different directions, generating a linearly enhanced image. This image is then subjected to a linear transformation before applying the LMNRBM operator for feature extraction, as shown in Figure 4. The comparison clearly shows that the texture features extracted by the LMNRBM operator, when applied to the Sobel-linear transformed image, exhibit a higher contrast between foreign fibers and the cotton background. This is particularly beneficial for detecting linear and weak-edge foreign fibers, facilitating subsequent segmentation tasks.

Figure 4.

Contrast of texture feature images of foreign fibers: (a) original image of cotton foreign fibers; (b) original local binary patterns texture image; (c) Sobel-local multidirectional noise robustness binary mode (LMNRBM) texture image and (d) Sobel-linear transformation-LMNRBM texture image.

Improved SLIC superpixel segmentation

The SLIC superpixel segmentation algorithm, proposed by Achanta et al.,²³ utilizes both color and spatial information of an image to perform iterative clustering, thereby segmenting the image into several irregular pixel blocks with similar characteristics, such as color and brightness. The algorithm transforms the image from the red-green-blue (RGB) color space to the CIELAB color space, where each pixel is represented by a five-dimensional feature vector that includes its spatial coordinates. Local clustering is then performed based on a similarity metric. If an image contains N pixels and it is expected that K superpixels will be generated, then the side length of each superpixel is approximately $S = \sqrt{N / K}$. To avoid clustering centers being located at the image edges or on noise, the clustering centers are moved to the lowest gradient position within their 3 × 3 neighborhood. K-means iterative clustering is performed within a 2S × 2S region until the maximum number of iterations is reached. Finally, a connected components algorithm is used to eliminate isolated points.
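As a worked illustration of the step size S, the sketch below places initial cluster centers on a regular grid. Pairing a 450 × 300 pixel image with K = 400 is an assumption made only for this example, and the move to the lowest-gradient position in the 3 × 3 neighborhood is omitted for brevity.

```python
import math


def slic_grid(width, height, k):
    """Return the side length S = sqrt(N / K) and grid-spaced initial centers."""
    n = width * height
    s = math.sqrt(n / k)
    step = max(1, round(s))          # integer grid spacing used for seeding
    centers = [(x, y)
               for y in range(step // 2, height, step)
               for x in range(step // 2, width, step)]
    return s, centers


s, centers = slic_grid(450, 300, 400)   # S is about 18.37 pixels
```

Because the spacing is rounded to an integer, the number of seeded centers is close to, but not necessarily equal to, the requested K.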

During the clustering process, the distance (D(i, c_k)) between a point i within the region and the clustering center c_k is defined as

D (i, c_{k}) = \sqrt{d_{c} {(i, c_{k})}^{2} + {(\frac{d_{s} (i, c_{k})}{S})}^{2} m^{2}}

(8)

d_{c} (i, c_{k}) = \sqrt{{(l_{i} - l_{c_{k}})}^{2} + {(a_{i} - a_{c_{k}})}^{2} + {(b_{i} - b_{c_{k}})}^{2}}

(9)

d_{s} (i, c_{k}) = \sqrt{{(x_{i} - x_{c_{k}})}^{2} + {(y_{i} - y_{c_{k}})}^{2}}

(10)

where d_c(i, c_k) is the color distance metric; l_i, a_i, and b_i are the values of point i in the region in the CIELAB color space; l_{c_k}, a_{c_k}, and b_{c_k} are the values of the clustering center c_k in the CIELAB color space; d_s(i, c_k) is the spatial distance metric; x_i and y_i are the spatial coordinates of point i in the image; x_{c_k} and y_{c_k} are the spatial coordinates of the clustering center in the image; S is the initial side length of the superpixel; and m is the balancing parameter between color similarity and spatial proximity.

Although the SLIC algorithm described above makes effective use of color and spatial information, it does not utilize image texture features. When the foreign fibers in cotton are similar in color to the background but differ in texture, the segmentation accuracy is therefore low. To address this, the LMNRBM operator is used to calculate texture features, and a texture distance is introduced into the original SLIC similarity metric to make the segmentation results more accurate. The improved SLIC similarity metric is defined as

D_{IM} (i, c_{k}) = \sqrt{d_{c} {(i, c_{k})}^{2} + {(\frac{d_{s} (i, c_{k})}{S})}^{2} \alpha^{2} + d_{tex} {(i, c_{k})}^{2} \beta^{2}}

(11)

d_{tex} (i, c_{k}) = \frac{\sum_{n = 1}^{3} L_{i n} L_{c_{k} n}}{\sqrt{\sum_{n = 1}^{3} L_{i n}^{2}} \times \sqrt{\sum_{n = 1}^{3} L_{c_{k} n}^{2}}}

(12)

where d_c is the color feature distance; d_s is the spatial feature distance; d_tex is the texture feature distance; α is the spatial distance weight factor; β is the texture feature distance weight factor; S is the step size; L_i = [L_{i1}, L_{i2}, L_{i3}] is the LMNRBM texture feature vector of point i within the 2S × 2S region; and L_{c_k} = [L_{c_k 1}, L_{c_k 2}, L_{c_k 3}] is the LMNRBM texture feature vector of the clustering center c_k. Compared with the traditional SLIC algorithm, the improved SLIC algorithm incorporates LMNRBM texture features, resulting in a higher conformity of the superpixel segmentation boundaries for foreign fibers.
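The improved similarity metric can be sketched as follows. This is an illustration under stated assumptions: the spatial term is normalized as d_s / S, as in standard SLIC; the texture term reproduces the cosine form of equation (12); and the dictionary layout, the zero-vector guard, and the values of α and β are hypothetical.

```python
import math


def color_distance(lab_i, lab_c):
    """Eq. (9): Euclidean distance in CIELAB."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(lab_i, lab_c)))


def spatial_distance(xy_i, xy_c):
    """Eq. (10): Euclidean distance in image coordinates."""
    return math.hypot(xy_i[0] - xy_c[0], xy_i[1] - xy_c[1])


def texture_term(l_i, l_c):
    """Eq. (12): cosine of the angle between two LMNRBM vectors."""
    norm = math.sqrt(sum(a * a for a in l_i)) * math.sqrt(sum(b * b for b in l_c))
    if norm == 0:
        return 0.0  # assumption: an all-zero texture vector contributes nothing
    return sum(a * b for a, b in zip(l_i, l_c)) / norm


def improved_distance(pixel, center, s, alpha, beta):
    """Eq. (11), with the spatial term normalized by the step size S."""
    d_c = color_distance(pixel["lab"], center["lab"])
    d_s = spatial_distance(pixel["xy"], center["xy"])
    d_tex = texture_term(pixel["tex"], center["tex"])
    return math.sqrt(d_c ** 2 + (d_s / s) ** 2 * alpha ** 2 + d_tex ** 2 * beta ** 2)


pixel = {"lab": (50, 0, 0), "xy": (0, 0), "tex": [1, 0, 0]}
center = {"lab": (53, 4, 0), "xy": (6, 8), "tex": [0, 1, 0]}
d = improved_distance(pixel, center, s=10, alpha=2, beta=1)
```

In a full implementation each pixel would be assigned to the center with the smallest value of this metric within its 2S × 2S search window.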

Superpixel classification and merging

The image segmented by the improved SLIC algorithm does not directly yield a complete image of the foreign fibers but instead divides the image of cotton foreign fibers into several irregular superpixel blocks, as shown in part in Figure 5. Therefore, to solve the over-segmentation problem, it is necessary to merge superpixels of the same class, based on the similarity between adjacent superpixels. By extracting color and texture features from the superpixel blocks to form multidimensional feature vectors, an SVM classification model can be used to classify the cotton background superpixels and foreign fiber superpixels to achieve image segmentation.

Figure 5.

Superpixel sample diagram.

Feature extraction

Color features, as the most intuitive image characteristic distinguishing the target from the background, are widely used in image segmentation, owing to their strong robustness. In this study, the mean values of the color components R, G, B, L, a, and b in the RGB and CIELAB color models are extracted from the superpixels to form a six-dimensional color feature vector. Additionally, texture extraction is performed on the variations and patterns of homogeneous pixels or local regions within the image. Since the contours of superpixels are irregular, the gray-level histogram method is used to describe their texture features. The gray-level histogram of the superpixel is obtained, and five texture descriptors—mean (m), standard deviation (σ), uniformity (U), contrast (c), and second-order moment (V_a)—are extracted sequentially to form a five-dimensional texture feature vector. The color features and texture features together form an 11-dimensional feature vector F for the superpixel, as

F = [R, G, B, L, a, b, m, \sigma, U, c, V_{a}]

(13)

This feature vector is then normalized to accomplish the feature description and classification tasks for the superpixels.
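Part of the feature extraction can be sketched in a few lines. The sketch covers only the channel means and three histogram descriptors (mean, standard deviation, and uniformity, taken here as the sum of squared histogram probabilities, which is an assumed definition). The contrast and second-order moment are left out because their formulas are not given in the text; the function names are hypothetical.

```python
import math


def mean_color(pixels):
    """Mean of each channel over the pixels of one superpixel."""
    n = len(pixels)
    return [sum(px[ch] for px in pixels) / n for ch in range(len(pixels[0]))]


def histogram_descriptors(gray_values, levels=256):
    """Return (mean, standard deviation, uniformity) of a gray-level histogram."""
    n = len(gray_values)
    hist = [0] * levels
    for g in gray_values:
        hist[g] += 1
    p = [h / n for h in hist]                       # normalized histogram
    mean = sum(z * pz for z, pz in enumerate(p))
    variance = sum((z - mean) ** 2 * pz for z, pz in enumerate(p))
    uniformity = sum(pz * pz for pz in p)           # assumed definition of U
    return mean, math.sqrt(variance), uniformity


m, sigma, u = histogram_descriptors([10, 10, 20, 20])
rgb_mean = mean_color([(0, 10, 20), (10, 20, 30)])
```

Working from the histogram rather than from a rectangular window is what makes these descriptors usable on irregularly shaped superpixels.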

PSO-SVM classification of superpixels

The experiment uses a multi-class SVM model with a radial basis function kernel to classify the superpixel feature vectors of cotton foreign fibers. The labels for cotton background samples are denoted “0”, while the labels for foreign fiber superpixels are denoted “1” to “8”. During the classification process, because the grayscale values of the plastic film and the cotton background are not significantly different, classifier I cannot effectively distinguish between the two. Therefore, a secondary classification is performed for the background superpixels and plastic film, as shown in Figure 6. In block segmentation, the number of initially classified foreign fiber labels X is counted. If X > 2, the classification result is directly output; otherwise, classifier II is used for secondary classification before outputting the classification result.

Figure 6.

Particle swarm optimization (PSO)-optimized support vector machine (SVM) classification model based on radial basis function kernel.

The penalty factor c and kernel function parameter g are important parameters in the SVM classification model, and their selection significantly affects the model’s performance. In this study, the PSO algorithm is used to optimize the SVM classification model. This algorithm performs a global search with moving particles, adjusting the search strategy according to the current situation, demonstrating clear advantages. The maximum number of evolutionary iterations is set to 70, and the population size is set to 50 particles. The 11-dimensional feature vector F is used as the model input variable, and the superpixel categories are used as the model output variables to build the classification model. The model performance is evaluated using classification accuracy.
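The parameter search can be pictured with a minimal particle swarm sketch. It uses the stated population size (50) and iteration count (70), but the inertia weight, acceleration coefficients, search bounds, and the objective `toy_error` are all assumptions: in the actual model the objective would be the classification error of an SVM trained with the candidate (c, g), which is replaced here by a simple quadratic so that the example is self-contained.

```python
import random


def pso(objective, bounds, n_particles=50, n_iter=70, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize objective over box bounds; returns (best position, best value, history)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    history = [gbest_val]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: inertia + pull toward personal and global bests.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))  # clamp to bounds
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
        history.append(gbest_val)
    return gbest, gbest_val, history


def toy_error(params):
    """Hypothetical stand-in for SVM validation error as a function of (c, g)."""
    c, g = params
    return (c - 3.0) ** 2 + (g - 0.5) ** 2


best, best_val, history = pso(toy_error, [(0.0, 10.0), (0.0, 10.0)])
```

The `history` list plays the role of a fitness curve: the best value found so far never gets worse from one iteration to the next.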

Before training the classifier, the feature data are normalized using

x_{i}^{*} = \frac{x_{i} - x_{\min}}{x_{\max} - x_{\min}} (n_{\max} - n_{\min}) + n_{\min}

(14)

where x_i^* is the normalized feature data value; x_i is the original feature data value; x_max and x_min are the maximum and minimum values of the current feature sample; and n_max and n_min are the maximum and minimum values of the normalized data range.
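Equation (14) amounts to a min-max rescaling of each feature column, sketched below. The handling of a constant feature (mapping every value to n_min) is an assumption added to avoid division by zero.

```python
def min_max_normalize(values, n_min=0.0, n_max=1.0):
    """Eq. (14): rescale one feature column to the range [n_min, n_max]."""
    x_min, x_max = min(values), max(values)
    if x_max == x_min:
        return [n_min for _ in values]  # assumption: constant feature maps to n_min
    scale = (n_max - n_min) / (x_max - x_min)
    return [(x - x_min) * scale + n_min for x in values]


unit_range = min_max_normalize([2, 4, 6])                 # [0, 1] target range
symmetric = min_max_normalize([2, 4, 6], -1.0, 1.0)       # [-1, 1] target range
```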

Category correction

Owing to the misclassification present in the trained classification model, the following correction method is proposed to compensate for classification errors and achieve better segmentation results.

Confidence adjustment for predicted labels

After using the SVM classification model, the labels predicted as foreign fibers are uniformly set to positive samples “1”, while the labels predicted as cotton background remain as negative samples “0”. The confidence level for each predicted sample is obtained during the classification process. By comparison, it was found that in the predicted samples of classifier I, most deviations, which cause under-segmentation, result from foreign fiber superpixels being misclassified as cotton background superpixels, with confidence levels in the range [−0.3, 0.3]. Superpixel labels predicted as cotton background within this threshold are corrected to “1”. In the predicted samples of classifier II, those with confidence levels in the range [−0.3, 0] are cotton background superpixels misclassified as foreign fiber superpixels, and their labels are corrected to “0”. Confidence levels in the range [0, 0.15] correspond to foreign fiber superpixels misclassified as cotton background superpixels, and their labels are corrected to “1”. Some of the superpixel classification confidences are given in Table 1. The predicted sample confidence thresholds obtained are used to set the correction rules for preliminary label correction.
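The correction rules described above can be written as a small function. This is a sketch of the rules as stated, using binary labels (1 for foreign fiber, 0 for cotton background); the function name and the string identifiers for the two classifiers are hypothetical.

```python
def correct_label(classifier, predicted, confidence):
    """Return the corrected binary label (1 = foreign fiber, 0 = cotton background)."""
    if classifier == "I":
        # Background predictions with low-magnitude confidence become foreign fiber.
        if predicted == 0 and -0.3 <= confidence <= 0.3:
            return 1
    elif classifier == "II":
        # Foreign fiber predictions in [-0.3, 0] become background.
        if predicted == 1 and -0.3 <= confidence <= 0:
            return 0
        # Background predictions in [0, 0.15] become foreign fiber.
        if predicted == 0 and 0 <= confidence <= 0.15:
            return 1
    return predicted


corrected = correct_label("I", predicted=0, confidence=-0.2305)
```

Predictions outside the listed confidence ranges are left unchanged, so errors made with high confidence are not addressed by this step.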

Table 1.

Confidence information of partial superpixel classification

Classifier I				Classifier II
Superpixel	True label	Predicted label	Confidence	Superpixel	True label	Predicted label	Confidence
	1	0	−0.2305		1	0	0.0755
	1	0	0.1979		1	0	0.0899
	0	1	0.4499		0	1	−0.1999
	0	1	2.2860		0	1	−0.1307
	1	1	3.5116		1	1	−4.3157
	0	0	−1.1902		0	0	0.9004

Bhattacharyya coefficient label correction

Confidence correction in the SVM classification model rectifies classification errors, further improving classification accuracy, though some misclassified labels still exist. To better describe the spatial relationships of each superpixel block in a two-dimensional space, a region adjacency graph (RAG) was constructed and its generation process is shown in Figure 7.

Figure 7.

Region adjacency map generation: (a) simple linear iterative clustering segmentation image and (b) generated region adjacency graph.
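A region adjacency graph can be derived directly from a superpixel label map, as in the minimal sketch below. It assumes 4-connectivity (two regions are adjacent if they share a horizontal or vertical pixel border) and uses a tiny hypothetical label map.

```python
def region_adjacency(labels):
    """Map each superpixel label to the set of labels it touches (4-connectivity)."""
    rows, cols = len(labels), len(labels[0])
    graph = {}
    for r in range(rows):
        for c in range(cols):
            a = labels[r][c]
            graph.setdefault(a, set())
            # Checking only the right and lower neighbors visits every border once.
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < rows and cc < cols:
                    b = labels[rr][cc]
                    if a != b:
                        graph[a].add(b)
                        graph.setdefault(b, set()).add(a)
    return graph


labels = [
    [0, 0, 1, 1],
    [0, 2, 2, 1],
    [3, 3, 2, 1],
]
rag = region_adjacency(labels)
```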

On statistical analysis of several superpixel segmented images of cotton foreign fibers, it was found that each superpixel region is adjacent to approximately one to seven other superpixels. In regions where foreign fiber superpixels are misclassified as cotton background, there are always two or more adjacent foreign fiber superpixels. Similarly, in regions where cotton background superpixels are misclassified as foreign fiber superpixels, there are always two or more adjacent cotton background superpixels. Based on these statistical results, the classification results of superpixels are corrected using the region adjacency relationships and by introducing the Bhattacharyya coefficient. The specific steps are as follows.

The grayscale histograms of all superpixels classified as foreign fibers and cotton background are statistically analyzed. The grayscale histogram of cotton background superpixels adjacent to foreign fiber superpixels is obtained. The Bhattacharyya coefficient between these histograms and those of foreign fiber and cotton background superpixels is calculated to measure similarity. If the similarity between a superpixel and the foreign fiber is greater than that between the superpixel and the cotton background, its classification label is reclassified as a foreign fiber superpixel.

The grayscale histogram of foreign fiber superpixels adjacent to cotton background superpixels is obtained. The Bhattacharyya coefficient between these histograms and those of foreign fiber and cotton background superpixels is calculated to measure similarity. If the similarity between a superpixel and the cotton background is greater than that between the superpixel and the foreign fiber, its classification label is reclassified as a cotton background superpixel.

These steps are repeated until the classification labels no longer change.

The Bhattacharyya coefficient can be used to measure the similarity between two discrete probability distributions. The formula for calculating it is

BC (p_{A}, p_{B}) = \sum_{x = 0}^{L - 1} \sqrt{p_{A} (x) p_{B} (x)}

(15)

where L is the number of gray levels in the image; p_A(x) is the gray-level probability distribution of the superpixel to be corrected; p_B(x) is the gray-level probability distribution of all superpixels already classified as cotton background or all superpixels already classified as foreign fibers; and x = 0, 1, …, L − 1. The value of BC ranges from 0 to 1. The closer the value is to 1, the greater the similarity between the two adjacent regions, and the greater the likelihood that the two regions belong to the same class.
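The Bhattacharyya comparison behind the relabeling steps can be sketched as follows. The histograms are assumed to be normalized so that each sums to 1, the function names are hypothetical, and keeping the current label on a tie is an assumption not stated in the text.

```python
import math


def normalized_histogram(gray_values, levels=256):
    """Gray-level probability distribution of a set of pixels."""
    hist = [0] * levels
    for g in gray_values:
        hist[g] += 1
    total = len(gray_values)
    return [h / total for h in hist]


def bhattacharyya(p_a, p_b):
    """Eq. (15): sum over gray levels of sqrt(p_A(x) * p_B(x))."""
    return sum(math.sqrt(a * b) for a, b in zip(p_a, p_b))


def relabel(p_candidate, p_fiber, p_background, current_label):
    """Choose the class whose pooled histogram is more similar to the candidate."""
    bc_fiber = bhattacharyya(p_candidate, p_fiber)
    bc_background = bhattacharyya(p_candidate, p_background)
    if bc_fiber > bc_background:
        return 1
    if bc_background > bc_fiber:
        return 0
    return current_label  # assumption: ties leave the label unchanged


# Hypothetical four-level histograms for illustration.
p = normalized_histogram([0, 0, 1, 1], levels=4)
q = normalized_histogram([2, 3, 3, 3], levels=4)
new_label = relabel(p, p_fiber=p, p_background=q, current_label=0)
```

In the procedure described above, this decision would be applied only to superpixels that border a superpixel of the other class, and repeated until no label changes.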

Model evaluation

To describe the segmentation performance quantitatively, four evaluation metrics are used to analyze the segmented images: precision, recall, F₁ score, and accuracy. The precision is the proportion of pixels predicted by the model as belonging to the target region that actually belong to it. The recall is the proportion of pixels in the target region that the model predicts correctly. The F₁ score combines precision and recall into a single measure, providing a more comprehensive evaluation of the model’s performance in segmenting foreign fiber samples. A higher F₁ score indicates better model performance. The corresponding formulas are:

Precision = \frac{T_{P}}{T_{P} + F_{P}}

(16)

Recall = \frac{T_{P}}{T_{P} + F_{N}}

(17)

F_{1} = {(\frac{{Recall}^{- 1} + {Precision}^{- 1}}{2})}^{- 1} = \frac{2 Recall \times Precision}{Recall + Precision}

(18)

Accuracy = \frac{T_{P} + T_{N}}{T_{P} + T_{N} + F_{P} + F_{N}}

(19)

where T_P is the number of foreign fiber pixels correctly classified as foreign fiber pixels; T_N is the number of background pixels correctly classified as cotton background pixels; F_P is the number of background pixels incorrectly classified as foreign fiber pixels; and F_N is the number of foreign fiber pixels incorrectly classified as background pixels.
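Equations (16) to (19) translate directly into code. The pixel counts in the example are hypothetical, and the sketch does not guard against zero denominators.

```python
def segmentation_metrics(tp, tn, fp, fn):
    """Return (precision, recall, F1, accuracy) from pixel-level counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * recall * precision / (recall + precision)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return precision, recall, f1, accuracy


# Hypothetical counts: 1000 pixels, 100 of which are foreign fiber.
precision, recall, f1, accuracy = segmentation_metrics(tp=80, tn=880, fp=20, fn=20)
```

Because foreign fibers occupy a small share of the image, accuracy is dominated by background pixels, which is why precision, recall, and F₁ are reported alongside it.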

Results and analysis

The algorithm in this paper is implemented using VC++ and OpenCV 4.5.0. The computer used is a Lenovo Legion Y7000 2020, configured with an Intel Core i7 10750H processor with a clock speed of 2.6 GHz, 16 GB of random-access memory, and the 64-bit Windows 10 operating system. The ground-truth segmentation is annotated at the pixel level using the Image Labeler tool in MATLAB.

Improved SLIC segmentation effect

To verify the segmentation performance of the improved SLIC algorithm, we compared the superpixel segmentation effects of cotton foreign fibers using both the original SLIC algorithm and the improved SLIC algorithm, with the number of superpixels set to K = 400. As shown in Figure 8(b), the original SLIC algorithm exhibits several cases of under-segmentation, with the yellow and green boxes corresponding to under-segmentation phenomena of linear and weakly edged foreign fibers, respectively. After incorporating texture information, the improved SLIC segmentation effect is shown in Figure 8(c), where the boundary segmentation performance of linear and weakly edged foreign fibers is significantly better than that of the original SLIC algorithm.

Figure 8.

Comparison of effect before and after improvement of simple linear iterative clustering (SLIC) algorithm: (a) original image; (b) SLIC superpixel segmentation algorithm and (c) improved SLIC superpixel segmentation algorithm.

Establishment and analysis of superpixel classification model

The improved SLIC algorithm segments images of foreign fibers mixed in cotton into several irregular superpixels. We extract color and texture features from these superpixels to form an 11-dimensional feature vector. Using the PSO-SVM classification model, we classify the superpixels of the cotton background and foreign fibers to achieve image segmentation. In the experiment, 100 images were selected for superpixel sample collection, with 20 superpixels of each type used to calculate the mean feature values, as shown in Table 2. To ensure segmentation speed, the resolution of the collected images was reduced to 300 pixels × 450 pixels. A total of 3928 superpixels were collected, including 2657 foreign fiber superpixels and 1271 cotton background superpixels. Of these, 3416 superpixels were randomly selected as training samples for the SVM classification model, including 2233 foreign fiber superpixels and 1183 cotton background superpixels.

Table 2.

Comparison of characteristic values of different kinds of superpixel

Feature	Superpixel type
Feature	Purple cloth	Polyester fiber	Plastic block	Hemp rope	Nylon rope	Polyethylene	Polypropylene	Plastic film	Cotton background
R	142.056	64.4357	64.8362	43.9608	41.4061	83.5604	232.347	224.16	187.207
G	108.844	93.5643	112.305	91.3678	36.2969	85.6035	142.645	180.544	189.371
B	110.952	74.9728	236.755	118.925	36.5902	190.854	92.8738	202.984	189.305
L	47.2428	37.2891	61.8345	40.5684	14.4411	50.1571	59.6225	76.6271	76.6809
a	8.37311	−13.1698	44.8943	5.54035	1.43153	41.7974	10.4063	16.1454	−0.41654
b	−17.8737	14.4906	48.1414	31.1189	−3.2766	22.5272	−50.3727	−18.9117	1.07878
m	113.179	84.7023	144.111	94.2047	36.9232	116.857	137.984	192.261	189.164
σ	11.8927	23.3849	14.8008	15.8495	15.8583	12.7819	13.0085	11.2445	5.15692
U	0.03324	0.01817	0.03684	0.03853	0.03647	0.04162	0.04474	0.03915	−0.001166
c	282.838	1093.71	438.128	502.417	502.969	326.7555	338.445	252.855	53.1877
V_a	0.023738	0.012541	0.020909	0.020502	0.021531	0.025101	0.024076	0.025938	0.055676
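
As a rough illustration of how histogram-based texture entries such as m, σ, and U in Table 2 could be obtained for one superpixel, the following Python sketch computes the mean, standard deviation, and uniformity of a normalized gray-level histogram. The function name `histogram_texture_features` is hypothetical, and treating U as the textbook histogram uniformity is an assumption; the exact definitions behind Table 2 may differ.

```python
import math

def histogram_texture_features(gray_values, levels=256):
    """Return (mean, std, uniformity) for the gray levels of one superpixel.

    gray_values: iterable of integer gray levels in [0, levels).
    These are textbook histogram statistics, assumed here for illustration.
    """
    gray_values = list(gray_values)
    n = len(gray_values)
    if n == 0:
        raise ValueError("superpixel contains no pixels")

    # Build the normalized gray-level histogram p(i).
    hist = [0] * levels
    for v in gray_values:
        hist[v] += 1
    p = [h / n for h in hist]

    mean = sum(i * pi for i, pi in enumerate(p))                 # m
    variance = sum((i - mean) ** 2 * pi for i, pi in enumerate(p))
    uniformity = sum(pi * pi for pi in p)                        # assumed U
    return mean, math.sqrt(variance), uniformity

# Toy superpixel with two equally frequent gray levels.
m, sigma, u = histogram_texture_features([10, 10, 20, 20])
```

A perfectly flat region gives a uniformity of 1, and the value falls as the gray levels spread over more histogram bins.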

The training samples are normalized to the range [0, 1], and the classifier is first trained using the LIBSVM library. The trained SVM classifier is then applied to 412 test samples, and the confidence scores of the samples are obtained. The overall classification accuracy of this initial model is 79.85%. Because hemp rope and nylon rope have similar feature values, as do plastic film and the cotton background, the classification accuracy for hemp rope and plastic film is low. To address this, the PSO algorithm is used to optimize the parameters c and g of the classification model, with the number of particles set to 50 and the number of iterations set to 70. The fitness curves for the two stages of classifier optimization are shown in Figure 9(a). After optimization, the parameters for classifier I are c = 14.67 and g = 0.1, with a classification accuracy of 95.98%. For classifier II, the parameters are c = 4.89 and g = 4.79, with a classification accuracy of 96.55%. The overall classification accuracy is 96.10%. The classification results before and after optimization are compared in Figure 9(b) and (c), which indicate a significant improvement in classification performance.
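
The parameter search described above can be sketched with a minimal global-best PSO in Python. The particle count (50) and iteration count (70) follow the text. The inertia weight `w`, the acceleration coefficients `c1` and `c2`, the search bounds, and the stand-in `toy_fitness` function are all assumptions made for illustration. In an actual experiment the fitness would be the validation accuracy of an SVM trained with the candidate (c, g) pair.

```python
import random

def pso_maximize(fitness, bounds, n_particles=50, n_iters=70,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Global-best PSO that maximizes `fitness` inside box `bounds`."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g_idx = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g_idx][:], pbest_fit[g_idx]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            f = fitness(pos[i])
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit

def toy_fitness(params):
    """Hypothetical stand-in for SVM validation accuracy, peaked at (10, 1)."""
    c, g = params
    return -((c - 10.0) ** 2 + (g - 1.0) ** 2)

# Assumed search box for (c, g); the source does not state the bounds used.
search_bounds = [(0.1, 100.0), (0.01, 10.0)]
best_params, best_fit = pso_maximize(toy_fitness, search_bounds)
```

Replacing `toy_fitness` with a function that trains and validates an SVM for each (c, g) pair would turn this sketch into a PSO-SVM search, at the cost of one training run per particle per iteration.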

Figure 9.

Comparison of support vector machine (SVM) model classification results before and after particle swarm optimization (PSO): (a) fitness curve of PSO-SVM; (b) SVM model classification results before PSO optimization and (c) classification results of SVM model after PSO.

Algorithm performance analysis

Superpixel merging

After superpixel segmentation, the irregular superpixel blocks are classified and merged according to their class. Superpixels classified as cotton background are turned black, while those classified as foreign fibers remain unchanged. Because foreign fibers are sparsely distributed and occupy a small area of the image, and because the image size significantly affects the speed of SLIC superpixel segmentation, cotton images are divided into blocks that are first checked for the presence of foreign fibers. Images with a resolution of 300 pixels × 450 pixels are divided into six sub-images with a side length of 150 pixels each. A threshold on the grayscale extreme difference is used to determine whether each sub-image contains foreign fibers. Sub-images without foreign fibers are turned black, while those with foreign fibers are segmented and classified. The initial classification and merging of images reveal significant over-segmentation (as seen in Figure 10(c) II, III, V, VII, IX) and under-segmentation (as seen in Figure 10(c) VI, VII, X), owing to classification errors. After confidence correction, some misclassified superpixels are corrected, and the segmentation of foreign fibers is mostly completed, as shown in Figure 10(d), although some misclassified superpixels still exist. On this basis, by utilizing the region adjacency relationships with the introduction of the Bhattacharyya coefficient, more complete segmented images can be obtained, as shown in Figure 10(e). The superpixel classification accuracy is improved to 98.65%.
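
The block-wise pre-check can be illustrated with a short Python sketch that splits a grayscale image into 150-pixel square sub-images and flags those whose gray-level range (maximum minus minimum) exceeds a threshold. The function names, the row-major image layout, and the threshold value of 60 are assumptions; the source does not state the threshold used.

```python
def split_into_blocks(image, block=150):
    """Split a 2D grayscale image (list of rows) into block x block tiles,
    returned in row-major order."""
    rows, cols = len(image), len(image[0])
    blocks = []
    for r in range(0, rows, block):
        for c in range(0, cols, block):
            blocks.append([row[c:c + block] for row in image[r:r + block]])
    return blocks

def contains_foreign_fiber(block, threshold):
    """Flag a tile when its gray-level extreme difference exceeds threshold."""
    values = [v for row in block for v in row]
    return max(values) - min(values) > threshold

# Toy 300 x 450 image: uniform bright background with one dark pixel.
image = [[200] * 450 for _ in range(300)]
image[10][200] = 50

blocks = split_into_blocks(image)
# The threshold of 60 is a hypothetical value chosen for this toy example.
flags = [contains_foreign_fiber(b, threshold=60) for b in blocks]
```

Only the flagged tiles would then be passed to the slower superpixel segmentation and classification stages.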

Figure 10.

Sub-image segmentation effect comparison: (a) original image; (b) improved simple linear iterative clustering superpixel segmentation; (c) classification model, initial merged image; (d) image corrected using confidence and (e) image corrected using Bhattacharyya coefficient.
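
To make the role of the Bhattacharyya coefficient concrete, the following Python sketch computes it for two histograms and picks the adjacent superpixel most similar to a target. The coefficient is the standard definition, the sum over bins of sqrt(p_i q_i) for normalized histograms. The helper `most_similar_neighbor` and the idea of comparing a superpixel with its neighbors in the region adjacency graph are a simplified, hypothetical reading of the correction step, not the exact rule used in this study.

```python
import math

def bhattacharyya_coefficient(hist_p, hist_q):
    """Similarity in [0, 1] between two histograms with the same bins."""
    if len(hist_p) != len(hist_q):
        raise ValueError("histograms must have the same number of bins")
    sum_p, sum_q = sum(hist_p), sum(hist_q)
    if sum_p == 0 or sum_q == 0:
        raise ValueError("histograms must not be empty")
    # Normalize each histogram, then sum sqrt(p_i * q_i) over the bins.
    return sum(math.sqrt((p / sum_p) * (q / sum_q))
               for p, q in zip(hist_p, hist_q))

def most_similar_neighbor(target_hist, neighbor_hists):
    """Return (neighbor_id, coefficient) for the most similar neighbor.

    neighbor_hists maps a neighbor id to its histogram (hypothetical layout).
    """
    scores = {k: bhattacharyya_coefficient(target_hist, h)
              for k, h in neighbor_hists.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]

# Toy example: the target matches neighbor "a" and is disjoint from "b".
neighbor_id, score = most_similar_neighbor(
    [4, 4, 0, 0], {"a": [2, 2, 0, 0], "b": [0, 0, 3, 3]})
```

A coefficient near 1 indicates nearly identical distributions, so a doubtful superpixel could reasonably take the label of a neighbor with a high coefficient.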

Timing measurements show that the average segmentation time for 50 sub-images is 0.061 s, whereas the average segmentation time for 50 whole images, with a preset value of K = 400, is 0.171 s. Processing three sub-images separately would therefore take about 0.183 s, longer than processing the whole image once. Accordingly, when more than two sub-images contain foreign fibers, segmentation of the entire image is performed. An example of the segmentation effect on the entire image is shown in Figure 11 IV, V.
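
The switching rule follows from simple arithmetic on the two reported timings, which the Python sketch below makes explicit. It assumes that the cost of processing several sub-images grows linearly with their number. The function and constant names are hypothetical.

```python
SUB_IMAGE_TIME_S = 0.061    # reported average time per sub-image
WHOLE_IMAGE_TIME_S = 0.171  # reported average time per whole image (K = 400)

def estimated_sub_image_cost(n_flagged):
    """Estimated time to segment n flagged sub-images one by one
    (linear scaling is an assumption)."""
    return n_flagged * SUB_IMAGE_TIME_S

def choose_strategy(n_flagged):
    """Pick the cheaper option for a given number of flagged sub-images."""
    if n_flagged == 0:
        return "skip"
    if estimated_sub_image_cost(n_flagged) > WHOLE_IMAGE_TIME_S:
        return "whole image"
    return "sub-images"

strategies = {n: choose_strategy(n) for n in range(0, 7)}
```

Under this assumption, two flagged sub-images cost about 0.122 s and three cost about 0.183 s, so the break-even point lies between two and three sub-images.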

Figure 11.

Whole image segmentation effect: (a) original image; (b) improved simple linear iterative clustering superpixel segmentation and (c) final segmentation effect.

Comparative analysis of different segmentation algorithms

In this study, we selected typical images from the collected images of foreign fibers mixed in cotton for verification. We compared our proposed method with two algorithms commonly employed in foreign fiber detection: the Otsu adaptive threshold segmentation algorithm and the K-means clustering segmentation algorithm.

The experiments were conducted on both the entire image and the sub-images, with some segmentation results shown in Figure 12. Figure 12(c) shows the images segmented using the Otsu algorithm. This algorithm is greatly affected by uneven illumination. Because it can only adaptively find a single optimal segmentation threshold, it is not effective in segmenting foreign fibers of various types, especially those with gray levels similar to the cotton background, resulting in poor segmentation performance. Figure 12(d) shows the results of the K-means segmentation algorithm, which performs clustering on the entire image and requires an appropriate value of K to be selected before segmentation. The choice of K greatly affects the final clustering result. To ensure segmentation quality, we chose K = 5 for the whole image segmentation, which yielded the best results. For sub-image clustering segmentation, K was uniformly set to 2. Comparative analysis revealed that our proposed method, by fully utilizing various visual features of the image, achieved overall better segmentation performance than the other two methods, accurately segmenting most of the foreign fiber pixels.
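
The single-threshold limitation of the Otsu baseline is easier to see from the algorithm itself. The Python sketch below is a plain textbook implementation that selects the gray level maximizing the between-class variance of the histogram. It is not the code used in the experiments, and the function name is hypothetical.

```python
def otsu_threshold(gray_values, levels=256):
    """Return the threshold t maximizing between-class variance.

    Pixels with gray level <= t form one class, the rest form the other.
    """
    gray_values = list(gray_values)
    total = len(gray_values)
    hist = [0] * levels
    for v in gray_values:
        hist[v] += 1
    sum_all = sum(i * h for i, h in enumerate(hist))

    weight0 = 0      # number of pixels at or below the candidate threshold
    sum0 = 0.0       # sum of their gray levels
    best_t, best_var = 0, -1.0
    for t in range(levels):
        weight0 += hist[t]
        sum0 += t * hist[t]
        weight1 = total - weight0
        if weight0 == 0:
            continue
        if weight1 == 0:
            break
        mean0 = sum0 / weight0
        mean1 = (sum_all - sum0) / weight1
        # Proportional to the between-class variance.
        var_between = weight0 * weight1 * (mean0 - mean1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

# Toy bimodal data: dark fiber pixels (10) and bright cotton pixels (200).
t = otsu_threshold([10] * 50 + [200] * 50)
```

Because only one global threshold is returned, fibers whose gray levels overlap those of the cotton background cannot be separated by this method alone.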

Figure 12.

Different segmentation algorithm renderings: (a) original images; (b) manually calibrated images; (c) Otsu threshold segmentation; (d) K-means segmentation and (e) segmentation method of this study.

Thirty-five images containing foreign fibers were selected, of which 15 full images (as shown in Figure 12(a) I) and 20 sub-images (as shown in Figure 12(a) II, III) were manually annotated. These images were segmented using the Otsu algorithm, the K-means clustering algorithm, and our proposed segmentation algorithm, and then compared with the manually annotated images. The precision, recall, F₁ score, and accuracy for each image, as well as the segmentation time, were calculated, and their averages are given in Table 3.
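
For reference, the pixel-level metrics can be computed from a predicted mask and a manually annotated mask as in the following Python sketch. It uses the standard definitions based on true and false positives and negatives. The function name and the flattened 0/1 mask representation are assumptions for illustration.

```python
def segmentation_metrics(predicted, annotated):
    """Return (precision, recall, f1, accuracy) for two flat binary masks.

    A value of 1 marks a foreign fiber pixel and 0 marks cotton background.
    """
    if len(predicted) != len(annotated):
        raise ValueError("masks must have the same number of pixels")
    tp = fp = fn = tn = 0
    for p, a in zip(predicted, annotated):
        if p == 1 and a == 1:
            tp += 1
        elif p == 1 and a == 0:
            fp += 1
        elif p == 0 and a == 1:
            fn += 1
        else:
            tn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / len(predicted) if predicted else 0.0
    return precision, recall, f1, accuracy

# Toy masks: 2 true positives, 1 false positive, 1 false negative, 2 true negatives.
precision, recall, f1, accuracy = segmentation_metrics(
    [1, 1, 0, 0, 1, 0], [1, 0, 0, 0, 1, 1])
```

Averaging these per-image values over a test set gives figures of the kind reported for each algorithm and image type.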

Table 3.

Segmentation performance metrics of different algorithms

Algorithm	Image type	Precision	Recall	F₁ score	Accuracy	Segmentation time (s)
Otsu algorithm	Full image	0.6915	0.7124	0.7018	0.7555	0.343
Otsu algorithm	Sub-image	0.7241	0.7354	0.7398	0.6129	0.076
K-means clustering algorithm	Full image	0.7952	0.8034	0.7992	0.8605	0.751
K-means clustering algorithm	Sub-image	0.8297	0.8751	0.8518	0.6956	0.216
Proposed algorithm	Full image	0.9124	0.9258	0.9190	0.9770	0.171
Proposed algorithm	Sub-image	0.9562	0.9583	0.9572	0.9425	0.061

Table 3 shows that the Otsu algorithm has the lowest precision, recall, and F₁ score of the three algorithms. It is highly sensitive to uneven illumination and performs poorly in segmenting foreign fibers under natural lighting conditions. This is especially evident when segmenting more transparent plastic films, where the algorithm struggles to determine an adaptive threshold. The K-means clustering segmentation algorithm is more sensitive to color information, and its segmentation performance indicators are higher than those of the Otsu thresholding algorithm. However, its results vary with the value of K, and it requires a predetermined number of clusters, which is difficult to ascertain in practical foreign fiber segmentation scenarios.

Our proposed superpixel-based segmentation algorithm leverages both the color and texture features of the image. After an initial classification of superpixel units using an SVM, the classification results are corrected using confidence and region adjacency relations, which significantly enhances segmentation performance. This algorithm effectively segments various types of foreign fiber, particularly plastic films and filamentous foreign fibers with grayscale values similar to the cotton background. The segmented foreign fiber regions exhibit high precision, recall, F₁ score, and accuracy, and the segmentation time (0.171 s per full image and 0.061 s per sub-image) meets the basic requirements for real-time applications.

Conclusion

Addressing the poor segmentation performance of the SLIC superpixel algorithm on filamentous and weak-edge foreign fibers, in this paper, we propose a segmentation method for identifying foreign fibers in cotton based on superpixel features. By utilizing the designed LMNRBM operator to fully extract image texture information and improve the SLIC algorithm, the superpixels generated by the improved SLIC algorithm can better fit the boundaries of filamentous and weak-edge foreign fibers, effectively enhancing segmentation performance.

By extracting the color and texture features of superpixel blocks to form an 11-dimensional feature vector, and using a PSO-SVM classification model to classify and merge the superpixels, the initial segmentation of foreign fibers can be achieved. A confidence correction method and a correction method based on the RAG and the Bhattacharyya coefficient were designed to perform two corrections on the superpixels classified by the SVM. This further improved the classification accuracy to 98.65%, effectively segmenting the foreign fibers.

Compared with classical segmentation algorithms, the proposed algorithm demonstrates superior performance in terms of segmentation accuracy, recall, and F₁ score. The average segmentation accuracy per image reaches 95.73%. However, there are still some superpixel classification errors that reduce the segmentation precision. Therefore, future work could be focused on enhancing the feature distinction between the foreground and background during the image preprocessing stage or on designing more refined correction methods to improve image segmentation accuracy.

Footnotes

Data availability statement

All relevant data are within the paper.

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Youth Foundation of China University of Petroleum-Beijing at Karamay (grant number XQZX20230038); the Basic Scientific Research Business Expenses Projects of Autonomous Region Universities (grant number XJEDU2024Z008); and the Xinjiang Uygur Autonomous Region “Tianshan Talent” Training Program Projects (grant number 2023TSYCJC0036).

Ethical approval

The authors confirm that all the research meets ethical guidelines and adheres to the legal requirements of the study country. The study does not involve human research participants.

ORCID iD

Shuhao Sun
